perf: avoid rebuilding the vector store on every startup by using Milvus #115

@yjing86

perf: persist the vector index with Milvus instead of rebuilding retrieval state on every startup

While looking through the current startup flow, I noticed that Sugar-AI rebuilds its vector store every time the app starts.

Right now the app loads the source documents, recomputes embeddings, and recreates the FAISS index during startup. This works for a small local setup, but it does not scale well as the document set grows.

From the current code path:

  • main.py initializes the agent on startup and calls setup_vectorstore()
  • app/ai.py loads PDFs/text files, generates embeddings, and builds a FAISS index from scratch

Why this is a problem

Rebuilding the retrieval index on every startup has a few drawbacks:

  • slower startup time
  • unnecessary repeated embedding work
  • harder to support larger document collections
  • no clean path for incremental document refresh
  • limited metadata support for future retrieval improvements

This is manageable for a very small local prototype, but it becomes a bottleneck if Sugar-AI keeps growing.

Proposed improvement

Move the retrieval layer from startup-time FAISS rebuilding to a persistent Milvus-backed vector store.

Instead of rebuilding the entire index every time the app starts, Sugar-AI should:

  • persist document embeddings in Milvus
  • load an existing collection at startup
  • only embed and insert new or changed documents when needed
  • query Milvus directly during retrieval

Why Milvus

I think Milvus would be a better long-term fit than rebuilding local FAISS indexes because it gives:

  • persistent vector storage
  • better support for larger collections
  • richer metadata fields
  • easier document updates and reindexing
  • a stronger foundation for future retrieval improvements

In particular, metadata support would make it easier later to add things like:

  • source-aware filtering
  • document type filtering
  • parent-child retrieval
  • keyword / tag-based secondary filtering
  • chunking strategies that preserve document structure

Suggested design

A possible migration path could look like this:

  1. Add a document ingestion step separate from application startup

    • parse source docs
    • chunk them
    • generate embeddings
    • upsert them into Milvus

  2. Change startup behavior

    • instead of rebuilding the index, connect to an existing Milvus collection
    • fail gracefully if the collection is missing or empty

  3. Store metadata for each chunk

    • source path
    • document title
    • section / page
    • chunk id
    • optional tags or keywords

  4. Add a refresh/reindex workflow

    • full rebuild when needed
    • incremental sync for newly added or updated docs
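
To make steps 1 and 3 more concrete, here is a rough sketch of a per-chunk record carrying the metadata fields listed above, plus a deterministic chunk id. Everything in it is an assumption for discussion: `ChunkRecord`, `chunk_id_for`, `make_chunks`, and the fixed-size character chunking are hypothetical, and the real schema would depend on how the Milvus collection is defined.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRecord:
    """One chunk plus the metadata proposed in step 3 (hypothetical schema)."""
    chunk_id: str
    source_path: str
    title: str
    section: str
    text: str
    tags: tuple[str, ...] = ()


def chunk_id_for(source_path: str, index: int) -> str:
    """Derive a stable id from the source path and the chunk position."""
    raw = f"{source_path}#{index}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:32]


def make_chunks(
    text: str,
    source_path: str,
    title: str,
    section: str = "",
    size: int = 500,
    tags: tuple[str, ...] = (),
) -> list[ChunkRecord]:
    """Split text into fixed-size character chunks and attach metadata."""
    pieces = [text[i:i + size] for i in range(0, len(text), size)]
    return [
        ChunkRecord(
            chunk_id=chunk_id_for(source_path, i),
            source_path=source_path,
            title=title,
            section=section,
            text=piece,
            tags=tags,
        )
        for i, piece in enumerate(pieces)
    ]
```

Because the id depends only on the source path and chunk position, re-ingesting the same file produces the same ids. If the upsert is keyed on that id, a refresh would overwrite existing rows rather than add duplicates. If a document shrinks, stale trailing chunks would still need to be deleted separately.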

Expected benefits

  • much faster startup
  • no repeated embedding computation on every restart
  • better scalability as documentation grows
  • easier document refresh workflows
  • better retrieval architecture for future RAG improvements

Notes

This does add some deployment complexity compared to pure local FAISS, so I do not think FAISS needs to be removed immediately.

A practical path could be:

  • keep FAISS as a simple local/dev fallback
  • add Milvus as the persistent production-oriented backend

That would preserve the lightweight developer experience while giving Sugar-AI a more scalable retrieval architecture.
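
As a sketch of how that fallback could be chosen at startup, the function below picks a backend from environment-style settings. The variable names `SUGAR_AI_VECTOR_BACKEND` and `MILVUS_URI` are made up for this example, and the project may well prefer a config file instead.

```python
from __future__ import annotations

from typing import Mapping

# Hypothetical setting names, used here only for illustration.
BACKEND_VAR = "SUGAR_AI_VECTOR_BACKEND"
MILVUS_URI_VAR = "MILVUS_URI"


def select_backend(env: Mapping[str, str]) -> str:
    """Return "milvus" or "faiss" for the given settings mapping."""
    explicit = env.get(BACKEND_VAR, "").strip().lower()
    has_uri = bool(env.get(MILVUS_URI_VAR, "").strip())
    if explicit in {"milvus", "faiss"}:
        # An explicit Milvus choice without a URI is a configuration error.
        if explicit == "milvus" and not has_uri:
            raise ValueError(f"{BACKEND_VAR}=milvus requires {MILVUS_URI_VAR} to be set")
        return explicit
    if explicit:
        raise ValueError(f"unknown vector backend: {explicit!r}")
    # No explicit choice: use Milvus only when a URI is configured.
    return "milvus" if has_uri else "faiss"
```

Calling `select_backend(os.environ)` once during startup would keep a plain local checkout on FAISS with no extra setup, while a deployment that sets a Milvus URI would get the persistent backend.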

Acceptance criteria

  • startup no longer rebuilds the vector index by default
  • retrieval can load from a persistent Milvus collection
  • document chunks and embeddings are stored persistently
  • metadata is stored with each chunk
  • there is a documented workflow for initial indexing and refresh/reindex
  • local development still has a reasonable fallback path if Milvus is not configured

If maintainers think this direction makes sense, I’d be happy to work on it.