Add /embeddings endpoints

Chris Zubak-Skees requested to merge add-embeddings-endpoints into main

This MR adds an endpoint to generate embeddings to support meta#54 (closed) and ots/mediawiki/torque!107 (closed).

This is designed so future versions can incorporate the semantic chunking @gridinoc is experimenting with, in which a set of documents can be further split into smaller documents.

Another possible future enhancement would be an option to truncate and normalize embeddings from models that support it, like the Matryoshka-enabled model we're currently using. That is out of scope for this MR.
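For reference, a minimal sketch of what that truncate-then-normalize step could look like, assuming the usual Matryoshka convention of keeping the leading dimensions and rescaling to unit length. The function name and the `dims` parameter are hypothetical and not part of this MR.

```python
import math


def truncate_and_normalize(embedding, dims):
    """Keep the first `dims` components, then rescale to unit L2 length.

    Hypothetical helper: illustrates the idea only, not code from this MR.
    """
    if dims <= 0 or dims > len(embedding):
        raise ValueError("dims must be between 1 and len(embedding)")
    truncated = list(embedding[:dims])
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        # A zero vector has no direction, so return it unchanged.
        return truncated
    return [x / norm for x in truncated]


# Made-up 4-dimensional vector, shortened to 2 dimensions.
short = truncate_and_normalize([3.0, 4.0, 12.0, 5.0], 2)
print(len(short), short)
```

Normalizing after truncation matters because the shortened vector no longer has unit length, which would skew dot-product similarity.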

The /embeddings endpoint is designed to be similar to the OpenAI endpoint of the same name, though it currently has fewer options and response fields. It also adds the ability to specify the type of embedding, which is passed as the prompt name to the SentenceTransformers model that generates the embeddings.
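A rough sketch of how the type could flow through to the model, assuming the handler forwards it as the `prompt_name` argument of SentenceTransformers' `encode`. The `embed` function and `StubModel` class are hypothetical stand-ins so the example runs without downloading a model. Accepting a list as well as a string is an assumption borrowed from the OpenAI endpoint, and only the `data` and `embedding` fields are taken from the expected result in this description.

```python
def embed(model, texts, embedding_type=None):
    """Encode texts, using the request's type as the prompt name.

    Hypothetical handler logic: a sketch, not the code in this MR.
    """
    if isinstance(texts, str):
        texts = [texts]
    # SentenceTransformer.encode accepts a prompt_name keyword that selects
    # one of the prompts configured on the model.
    vectors = model.encode(texts, prompt_name=embedding_type)
    return {"data": [{"embedding": [float(x) for x in v]} for v in vectors]}


class StubModel:
    """Stand-in for a SentenceTransformer so the sketch runs offline."""

    def encode(self, sentences, prompt_name=None):
        # Fake 2-dimensional "embeddings" that depend on the prompt name.
        flag = 1.0 if prompt_name == "query" else 0.0
        return [[float(len(s)), flag] for s in sentences]


print(embed(StubModel(), "Hello, world!", "query"))
```

With a real model, a type that is not among the model's configured prompts would be rejected by `encode`, so a real handler would likely validate it first.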

Steps to test

export API_KEY=<API KEY from .env.local>
make run
curl -X 'POST' \
  'http://localhost:8889/embeddings' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer $API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{
  "input": "Hello, world!",
  "type": "query"
}'

Expected result:

{"data":[{"embedding":[-0.005389418452978134,0.027321476489305496, ...
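The same request can be sketched from Python using only the standard library. The URL, headers, and body mirror the curl command above; the helper names are hypothetical, and the response parsing assumes only the `data` and `embedding` fields visible in the expected result.

```python
import json
import urllib.request


def build_embeddings_request(text, embedding_type, api_key,
                             base_url="http://localhost:8889"):
    """Build the POST request that the curl command above sends."""
    body = json.dumps({"input": text, "type": embedding_type}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def first_embedding(response_body):
    """Pull the first vector out of a {"data": [{"embedding": [...]}]} body."""
    return json.loads(response_body)["data"][0]["embedding"]


def fetch_embedding(text, embedding_type, api_key):
    """Send the request to a locally running server and return the vector."""
    request = build_embeddings_request(text, embedding_type, api_key)
    with urllib.request.urlopen(request) as response:
        return first_embedding(response.read())
```

With the server started by `make run`, calling `fetch_embedding("Hello, world!", "query", os.environ["API_KEY"])` (after `import os`) should return a list of floats like the one above.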
Edited by Chris Zubak-Skees
