Draft: Add /embeddings endpoints

Chris Zubak-Skees requested to merge add-embeddings-endpoints into main

This draft MR adds endpoints to generate document and query embeddings to support meta#54 and ots/mediawiki/torque!107. It probably needs its own ticket.

This is designed so that future versions can incorporate the semantic chunking @gridinoc is experimenting with, in which a set of documents is further split into smaller documents that are still identified by the IDs and metadata provided.
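
To illustrate what "still identified by the IDs and metadata provided" could look like, here is a minimal sketch. The `Document` shape, the `chunk_index` field, and `split_documents` are hypothetical names, not the schema in this MR, and the fixed-size word window is only a stand-in for semantic chunking.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    # Hypothetical shape for illustration; not the request schema in this MR.
    id: str
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)
    chunk_index: int = 0


def split_documents(documents: list[Document], max_words: int = 100) -> list[Document]:
    """Split each document into word-window chunks that keep the parent ID and metadata.

    A fixed word window stands in for semantic chunking here.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    chunks: list[Document] = []
    for doc in documents:
        words = doc.text.split()
        # Keep an empty document as a single chunk so its ID does not disappear.
        windows = [words[i:i + max_words] for i in range(0, len(words), max_words)] or [[]]
        for index, window in enumerate(windows):
            chunks.append(
                Document(
                    id=doc.id,
                    text=" ".join(window),
                    metadata=dict(doc.metadata),  # copy so chunks do not share state
                    chunk_index=index,
                )
            )
    return chunks
```

Because every chunk carries the caller's original ID, a client could group the resulting embeddings back to the documents it submitted without knowing how they were split.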

Another possible enhancement would be an option to truncate and normalize embeddings for models that support it, like the Matryoshka-enabled model we're currently using. That is also out of scope for this MR.
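
For reference, truncating a Matryoshka-style embedding usually means keeping the leading dimensions and re-normalizing to unit length. The sketch below assumes plain Python lists and a hypothetical `truncate_and_normalize` helper; it is not code from this MR.

```python
import math
from typing import Sequence


def truncate_and_normalize(embedding: Sequence[float], dims: int) -> list[float]:
    """Keep the first `dims` values of an embedding and rescale them to unit L2 norm."""
    if not 0 < dims <= len(embedding):
        raise ValueError("dims must be between 1 and the embedding length")
    truncated = list(embedding[:dims])
    norm = math.sqrt(sum(value * value for value in truncated))
    if norm == 0.0:
        # An all-zero vector cannot be normalized; return it unchanged.
        return truncated
    return [value / norm for value in truncated]
```

Re-normalizing after truncation matters when downstream similarity search assumes unit-length vectors, for example when a dot product is used in place of cosine similarity.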

The endpoints are split into /embeddings/document and /embeddings/query so that we can use a different prompt for each, which we do now, and possibly different processing in the future. Perhaps these should instead be named /embed/documents and /embed/queries.
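
The reason for the split can be sketched as a per-endpoint prompt lookup. The prompt strings and the `prepare_inputs` helper below are placeholders for illustration; the actual prompts depend on the embedding model and are not specified in this description.

```python
# Placeholder prompts: assumed values, not the ones used by this MR.
PROMPTS = {
    "document": "search_document: ",
    "query": "search_query: ",
}


def prepare_inputs(kind: str, texts: list[str]) -> list[str]:
    """Prefix each text with the prompt for its endpoint kind before embedding."""
    try:
        prompt = PROMPTS[kind]
    except KeyError:
        raise ValueError(f"unknown embedding kind: {kind!r}") from None
    return [prompt + text for text in texts]
```

With this shape, a hypothetical /embeddings/document handler would call `prepare_inputs("document", texts)` and /embeddings/query would call `prepare_inputs("query", texts)`, leaving room for each path to diverge further later.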

I have upgraded several of the LangChain dependencies here but have not tested all the endpoints. I still need to write tests for the new endpoints, as well as steps a reviewer can follow to test them.

Edited by Chris Zubak-Skees
