API chat endpoints
Upgraded from a long-outdated langchain release and added langserve to serve the langchain chains (langserve creates several endpoints per chain for invoking or streaming results, plus a test playground).
- FastAPI + langserve endpoints secured with an API key (/chat, /rag_chat), alongside the "Proposal Insights" 1st-step /similarity_search
- moved from ChromaDB to PGVector
- added PostgresChatMessageHistory
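The API-key check behind /chat and /rag_chat can be sketched in a framework-agnostic way. This is only an illustration: the header name `x-api-key` and the helper `is_authorized` are assumptions, not taken from the actual code, and in the real service the check would be wired in as a FastAPI dependency.

```python
import hmac
from typing import Mapping, Optional

# Assumed header name; the real service may use a different one.
API_KEY_HEADER = "x-api-key"


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Return True when the request headers carry the expected API key."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    supplied: Optional[str] = normalised.get(API_KEY_HEADER)
    # Reject missing or empty keys, including an unset server-side key.
    if not supplied or not expected_key:
        return False
    # compare_digest avoids leaking the key through timing differences.
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

A request handler would call `is_authorized(request_headers, configured_key)` and answer with HTTP 401 or 403 when it returns False.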
TODO where UX discussion is needed:
- extend PostgresChatMessageHistory with extra columns in the db table: user id, a UI grouping/namespace, and other metadata (user feedback, TruLens/RAGAS evaluation of the response)
- add chat history retrieval endpoints, per user, per collection?/widget?/namespace?
- add chat history delete/soft-delete/archive endpoint?
- add an LLM option in chats (currently it defaults to OpenAI, which can be redirected via an LLM proxy to a LocalAI endpoint - see LLM-infra)
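To make the history items above concrete for the UX discussion, here is a rough sketch of what an extended history table with per-user retrieval and soft-delete might look like. It uses an in-memory SQLite database as a stand-in for Postgres so it runs without any setup; the table name `chat_history`, every column name, and the three helper functions are hypothetical and are not part of PostgresChatMessageHistory.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: user_id, namespace, metadata and deleted_at are the
# proposed extra columns; SQLite stands in for Postgres here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS chat_history (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id  TEXT NOT NULL,
    user_id     TEXT NOT NULL,
    namespace   TEXT NOT NULL,
    role        TEXT NOT NULL,
    content     TEXT NOT NULL,
    metadata    TEXT NOT NULL DEFAULT '{}',
    created_at  TEXT NOT NULL,
    deleted_at  TEXT
)
"""


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def add_message(conn, session_id, user_id, namespace, role, content, metadata=None):
    """Store one message; metadata (feedback, evaluation scores) is kept as JSON."""
    cur = conn.execute(
        "INSERT INTO chat_history "
        "(session_id, user_id, namespace, role, content, metadata, created_at) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (session_id, user_id, namespace, role, content,
         json.dumps(metadata or {}), _now()),
    )
    conn.commit()
    return cur.lastrowid


def get_history(conn, user_id, namespace):
    """Return the non-deleted messages of one user within one namespace."""
    rows = conn.execute(
        "SELECT id, session_id, role, content, metadata FROM chat_history "
        "WHERE user_id = ? AND namespace = ? AND deleted_at IS NULL ORDER BY id",
        (user_id, namespace),
    ).fetchall()
    return [
        {"id": r[0], "session_id": r[1], "role": r[2],
         "content": r[3], "metadata": json.loads(r[4])}
        for r in rows
    ]


def soft_delete_session(conn, user_id, session_id):
    """Mark a session as deleted without removing rows; returns rows affected."""
    cur = conn.execute(
        "UPDATE chat_history SET deleted_at = ? "
        "WHERE user_id = ? AND session_id = ? AND deleted_at IS NULL",
        (_now(), user_id, session_id),
    )
    conn.commit()
    return cur.rowcount
```

Keeping `deleted_at` as a nullable timestamp leaves room for both the soft-delete and archive variants; a hard delete would simply remove the rows instead.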
TODO:
- control context window size (depends on LLMs used/selected by user?) with context compression
- "duplicate" /rag_chat to a "Proposal Insights"-like 2nd-step document handling (map/reduce or rolling context compression)
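The rolling variant of context compression mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: tokens are approximated by whitespace-separated words, `rolling_compress` and `truncate_summary` are hypothetical names, and the summariser is a placeholder where the real pipeline would call an LLM.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude proxy: whitespace-separated words. A real implementation would
    # use the tokenizer of the selected LLM.
    return len(text.split())


def rolling_compress(
    chunks: List[str],
    summarize: Callable[[str], str],
    budget: int,
) -> str:
    """Fold chunks into a running context, summarising when it exceeds budget."""
    context = ""
    for chunk in chunks:
        candidate = f"{context}\n{chunk}".strip()
        if count_tokens(candidate) > budget:
            # Compress the accumulated context together with the new chunk.
            candidate = summarize(candidate)
        context = candidate
    return context


def truncate_summary(text: str, keep: int = 5) -> str:
    """Placeholder summariser: keeps only the last `keep` words."""
    return " ".join(text.split()[-keep:])
```

The result stays within the budget only if the summariser itself returns text shorter than the budget, so the budget would need to be chosen per LLM. A map/reduce variant would instead summarise each chunk independently and then combine the summaries.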
OPTIONAL:
- add Langfuse instrumentation (to log queries, prompts, and even later control/swap prompts)