Externalize embedding generation (!62) · Merge requests · ots / LLM / llm-api

Chris Zubak-Skees requested to merge improve-embedding-speed into main Dec 16, 2024

This MR helps address meta#61 by generating embeddings through an external API hosted on RunPod, which has access to GPUs and so speeds up embedding generation. Companion MR to ots/mediawiki/semantic-search!3.

Steps to Test

Define the following in .env.local, setting EMBEDDING_API_BASE and EMBEDDING_API_KEY to the base URL and key of an OpenAI-compatible API:

EMBEDDING_MODEL=Snowflake/snowflake-arctic-embed-m-v1.5
EMBEDDING_API_BASE=
EMBEDDING_API_KEY=

make run

Use the API console to make an embedding request.
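To sanity-check the values in .env.local before going through the API console, the external endpoint can be called directly. This is a minimal sketch, not the code in this MR. It assumes the three variables are exported in the shell, and that EMBEDDING_API_BASE already includes any version prefix (such as /v1), so the request goes to the OpenAI-style /embeddings path under it. The function names build_embedding_request, fetch_embeddings, and main are hypothetical.

```python
import json
import os
import urllib.request


def build_embedding_request(api_base, api_key, model, texts):
    """Build a POST request for an OpenAI-compatible /embeddings endpoint.

    Assumption: api_base already carries the version prefix (e.g. ".../v1").
    """
    url = api_base.rstrip("/") + "/embeddings"
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


def fetch_embeddings(request, timeout=30):
    """Send the request and return one embedding vector per input text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses list results under "data", each with an "embedding".
    return [item["embedding"] for item in body["data"]]


def main():
    # Reads the same variable names as .env.local; they must be set in the environment.
    request = build_embedding_request(
        os.environ["EMBEDDING_API_BASE"],
        os.environ["EMBEDDING_API_KEY"],
        os.environ["EMBEDDING_MODEL"],
        ["hello world"],
    )
    vectors = fetch_embeddings(request)
    print(f"received {len(vectors)} vector(s) of dimension {len(vectors[0])}")
```

Calling main() with the variables set should print the number of vectors returned and their dimension. A failure here points at the external API or its credentials rather than at llm-api itself.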
