Compute search embeddings locally
This MR addresses ots/llm/meta#63 by computing search embeddings locally on the app server, which improves search reliability.
Laddered on !3 (merged)
Steps to Test
- Install dependencies
- Add SEMANTIC_SEARCH_EMBEDDING_MODEL = "Snowflake/snowflake-arctic-embed-m-v1.5" in settings.py
- Restart Torque (the app will download and cache some very large files)
- Search
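
For reviewers who want a feel for what "computed locally" could look like, here is a minimal sketch, not the code in this MR. It assumes the sentence-transformers package is installed and that the model name from settings.py can be loaded with it. The helper names (`load_model`, `embed_query`, `cosine_similarity`) are hypothetical and chosen for illustration only.

```python
import math
from functools import lru_cache
from typing import Callable, List, Optional, Sequence

# In the app this value would be read from settings.py (see the steps above).
SEMANTIC_SEARCH_EMBEDDING_MODEL = "Snowflake/snowflake-arctic-embed-m-v1.5"


@lru_cache(maxsize=1)
def load_model(name: str = SEMANTIC_SEARCH_EMBEDDING_MODEL):
    """Load and cache the model; the first call may download large files."""
    # Assumption: sentence-transformers is available. The import is lazy so
    # the rest of this module can be used (and tested) without it.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(name)


def embed_query(
    text: str,
    encode: Optional[Callable[[str], Sequence[float]]] = None,
) -> List[float]:
    """Return the embedding for a search query as a list of floats.

    `encode` is injectable so a stub can replace the real model in tests.
    """
    if encode is None:
        encode = load_model().encode
    return [float(x) for x in encode(text)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Passing a stub `encode` callable exercises the search path without downloading the model, which may be useful when testing on a machine that has not cached it yet.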
Edited by Chris Zubak-Skees