perf: persist the vector index with Milvus instead of rebuilding retrieval state on every startup
While looking through the current startup flow, I noticed that Sugar-AI rebuilds its vector store every time the app starts.
Right now the app loads the source documents, recomputes embeddings, and recreates the FAISS index during startup. This works for a small local setup, but it does not scale well as the document set grows.
From the current code path:
- `main.py` initializes the agent on startup and calls `setup_vectorstore()`
- `app/ai.py` loads PDFs/text files, generates embeddings, and builds a FAISS index from scratch
Why this is a problem
Rebuilding the retrieval index on every startup has a few drawbacks:
- slower startup time
- unnecessary repeated embedding work
- harder to support larger document collections
- no clean path for incremental document refresh
- limited metadata support for future retrieval improvements
This is manageable for a very small local prototype, but it becomes a bottleneck if Sugar-AI keeps growing.
Proposed improvement
Move the retrieval layer from startup-time FAISS rebuilding to a persistent Milvus-backed vector store.
Instead of rebuilding the entire index every time the app starts, Sugar-AI should:
- persist document embeddings in Milvus
- load an existing collection at startup
- only embed and insert new or changed documents when needed
- query Milvus directly during retrieval
Why Milvus
I think Milvus would be a better long-term fit than rebuilding local FAISS indexes because it gives:
- persistent vector storage
- better support for larger collections
- richer metadata fields
- easier document updates and reindexing
- a stronger foundation for future retrieval improvements
In particular, metadata support would make it easier later to add things like:
- source-aware filtering
- document type filtering
- parent-child retrieval
- keyword / tag-based secondary filtering
- chunking strategies that preserve document structure
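To make the filtering idea concrete, here is a minimal sketch of metadata-filtered retrieval. The collection name (`sugar_docs`), field names (`doc_type`, `tags`, `source`, `title`, `chunk_id`), and the exact filter syntax are assumptions for illustration, not an existing schema; the `client` is expected to be a pymilvus `MilvusClient` supplied by the caller.

```python
def build_filter(doc_type=None, tag=None):
    """Build a Milvus boolean filter expression from optional metadata
    constraints. Field names (doc_type, tags) are hypothetical and would
    need to match whatever collection schema is actually defined."""
    clauses = []
    if doc_type:
        clauses.append(f'doc_type == "{doc_type}"')
    if tag:
        # ARRAY_CONTAINS filters on Milvus array-typed fields
        clauses.append(f'ARRAY_CONTAINS(tags, "{tag}")')
    return " and ".join(clauses)


def search_chunks(client, query_vector, doc_type=None, tag=None, k=5):
    """Run a filtered vector search against an existing collection using a
    pymilvus MilvusClient passed in by the caller."""
    return client.search(
        collection_name="sugar_docs",  # placeholder collection name
        data=[query_vector],
        filter=build_filter(doc_type=doc_type, tag=tag),
        limit=k,
        output_fields=["source", "title", "chunk_id"],
    )
```

The point is that the filter is applied inside the vector store rather than by post-filtering retrieved chunks in Python.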
Suggested design
A possible migration path could look like this:
1. Add a document ingestion step separate from application startup
- parse source docs
- chunk them
- generate embeddings
- upsert them into Milvus
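The ingestion step above could look roughly like this. The chunker is deliberately naive (fixed-size with overlap); `embed` stands in for whatever embedding function Sugar-AI already uses, `client` for a pymilvus `MilvusClient`, and the collection name is a placeholder.

```python
import hashlib


def chunk_text(text, chunk_size=800, overlap=100):
    """Naive fixed-size chunker with overlap; a real ingestion step would
    likely chunk along document structure instead."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


def ingest(client, embed, docs):
    """docs: iterable of (source_path, text) pairs. Builds one row per
    chunk, including a content hash to support incremental refresh later,
    and upserts everything into Milvus in one call."""
    rows = []
    for source, text in docs:
        for i, chunk in enumerate(chunk_text(text)):
            rows.append({
                "chunk_id": f"{source}:{i}",
                "source": source,
                "content_hash": hashlib.sha256(chunk.encode()).hexdigest(),
                "text": chunk,
                "vector": embed(chunk),
            })
    client.upsert(collection_name="sugar_docs", data=rows)  # placeholder name
    return len(rows)
```

Because this runs as a separate step (CLI command, admin endpoint, or CI job), startup no longer depends on it.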
2. Change startup behavior
- instead of rebuilding the index, connect to an existing Milvus collection
- fail gracefully if the collection is missing or empty
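A sketch of the startup check, assuming a pymilvus `MilvusClient` and a placeholder collection name. Returning `None` instead of raising lets the caller decide whether to fall back (e.g. to the existing FAISS path) or surface a clear error:

```python
def load_retrieval_collection(client, name="sugar_docs"):
    """Connect to an existing Milvus collection at startup instead of
    rebuilding. Returns the collection name on success, or None when the
    collection is missing or empty so the caller can fail gracefully."""
    if not client.has_collection(name):
        return None
    stats = client.get_collection_stats(name)
    if stats.get("row_count", 0) == 0:
        return None  # collection exists but holds no chunks: treat as missing
    client.load_collection(name)  # make the collection queryable
    return name
```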
3. Store metadata for each chunk
- source path
- document title
- section / page
- chunk id
- optional tags or keywords
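As a concrete shape for that per-chunk payload (field names are suggestions, not an existing schema):

```python
def chunk_metadata(source, title, page, chunk_index, tags=None):
    """Metadata stored alongside each chunk's vector. A stable chunk_id
    derived from source + position makes upserts idempotent."""
    return {
        "chunk_id": f"{source}#p{page}#{chunk_index}",
        "source": source,
        "title": title,
        "page": page,
        "tags": tags or [],
    }
```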
4. Add a refresh/reindex workflow
- full rebuild when needed
- incremental sync for newly added or updated docs
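The incremental sync can be driven by content hashes, assuming each stored chunk (or document) carries a `content_hash` field as sketched in the ingestion step. A pure planning function keeps the decision logic testable without a live Milvus instance:

```python
import hashlib


def content_hash(text):
    return hashlib.sha256(text.encode()).hexdigest()


def plan_sync(current_docs, indexed_hashes):
    """Decide which documents need work. current_docs maps source path ->
    current text; indexed_hashes maps source path -> the hash recorded in
    Milvus at last ingestion. Returns (paths to re-embed and upsert,
    paths whose chunks should be deleted)."""
    to_upsert = [
        path for path, text in current_docs.items()
        if indexed_hashes.get(path) != content_hash(text)
    ]
    to_delete = [path for path in indexed_hashes if path not in current_docs]
    return to_upsert, to_delete
```

A full rebuild is then just `plan_sync` with an empty `indexed_hashes`, which re-upserts everything.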
Expected benefits
- much faster startup
- no repeated embedding computation on every restart
- better scalability as documentation grows
- easier document refresh workflows
- better retrieval architecture for future RAG improvements
Notes
This does add some deployment complexity compared to pure local FAISS, so I do not think FAISS needs to be removed immediately.
A practical path could be:
- keep FAISS as a simple local/dev fallback
- add Milvus as the persistent production-oriented backend
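Backend selection could stay as simple as one configuration check; the environment variable name here is made up for the sketch:

```python
import os


def choose_backend():
    """Pick the retrieval backend: FAISS remains the zero-config local/dev
    default, Milvus is opt-in via configuration."""
    uri = os.environ.get("SUGAR_AI_MILVUS_URI", "").strip()
    return ("milvus", uri) if uri else ("faiss", None)
```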
That would preserve the lightweight developer experience while giving Sugar-AI a more scalable retrieval architecture.
Acceptance criteria
- startup no longer rebuilds the vector index by default
- retrieval can load from a persistent Milvus collection
- document chunks and embeddings are stored persistently
- metadata is stored with each chunk
- there is a documented workflow for initial indexing and refresh/reindex
- local development still has a reasonable fallback path if Milvus is not configured
If maintainers think this direction makes sense, I’d be happy to work on it.