This would be very interesting. Full semantic search with vector embeddings would be so useful for literature review.
The approach would be to generate embeddings for paper content (titles, abstracts, full text), store them locally with something like FAISS as you mentioned, then run a hybrid search that combines keyword matching with semantic similarity.
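To make the hybrid part concrete, here's a rough sketch of how the two scores could be blended. Everything in it is an assumption for illustration: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, the similarity is brute-force rather than a FAISS index, and `hybrid_search` and the `alpha` weight are hypothetical names, not anything that exists in the project.

```python
import math
import re
import zlib


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_embed(text, dim=64):
    """Stand-in for a real embedding model (assumption): hashed
    bag-of-words vector, L2-normalized so dot product = cosine."""
    vec = [0.0] * dim
    for tok in tokenize(text):
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def keyword_score(query, text):
    """Fraction of distinct query terms that occur in the text."""
    q_terms = set(tokenize(query))
    if not q_terms:
        return 0.0
    return len(q_terms & set(tokenize(text))) / len(q_terms)


def semantic_score(query_vec, doc_vec):
    """Cosine similarity of two already-normalized vectors."""
    return sum(q * d for q, d in zip(query_vec, doc_vec))


def hybrid_search(query, papers, alpha=0.5, top_k=3):
    """Rank papers by alpha * keyword + (1 - alpha) * semantic.

    `papers` maps a paper id to its text. Returns a list of
    (paper_id, score) tuples, best match first.
    """
    query_vec = toy_embed(query)
    scored = []
    for paper_id, text in papers.items():
        kw = keyword_score(query, text)
        sem = semantic_score(query_vec, toy_embed(text))
        scored.append((paper_id, alpha * kw + (1 - alpha) * sem))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


papers = {
    "p1": "Pruning neural network weights for efficient inference",
    "p2": "A survey of medieval trade routes",
    "p3": "Graph algorithms for shortest paths",
}
results = hybrid_search("neural network pruning", papers)
```

In a real version the per-paper `toy_embed` call would be replaced by a lookup into the stored vectors, and `alpha` would probably need tuning (or a user setting) depending on how well plain keyword search already works for a given library.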
The main challenge would be keeping it performant - you'd need incremental indexing, and it should probably be opt-in given the computational overhead.
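For the incremental indexing, one possible shape (just a sketch, under assumptions) is to keep a content hash per paper and only re-embed entries whose hash changed. `IncrementalIndex`, its `enabled` flag, and `embed_fn` are hypothetical names; the vectors live in a plain dict here, where a real version would write them to whatever local store ends up being used.

```python
import hashlib


class IncrementalIndex:
    """Re-embeds only new or changed papers (illustrative sketch)."""

    def __init__(self, embed_fn, enabled=False):
        self.embed_fn = embed_fn  # any callable: text -> vector
        self.enabled = enabled    # opt-in: off unless the user turns it on
        self._hashes = {}         # paper id -> content hash at last indexing
        self.vectors = {}         # paper id -> embedding

    def update(self, papers):
        """Sync the index with `papers` (id -> text).

        Returns the ids that were (re-)embedded on this call.
        """
        if not self.enabled:
            return []
        changed = []
        for paper_id, text in papers.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self._hashes.get(paper_id) != digest:
                self.vectors[paper_id] = self.embed_fn(text)
                self._hashes[paper_id] = digest
                changed.append(paper_id)
        # Drop entries for papers that no longer exist in the library.
        for paper_id in set(self._hashes) - set(papers):
            del self._hashes[paper_id]
            del self.vectors[paper_id]
        return changed


embed_calls = []


def fake_embed(text):
    """Placeholder embedding that records how often it is called."""
    embed_calls.append(text)
    return [float(len(text))]


index = IncrementalIndex(fake_embed, enabled=True)
first = index.update({"a": "abstract one", "b": "abstract two"})
second = index.update({"a": "abstract one", "b": "abstract two, revised"})
```

Here the second `update` only re-embeds `"b"`, since the text of `"a"` is unchanged. With `enabled=False` (the default in this sketch) nothing is embedded at all, which keeps the overhead away from users who don't want it.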