Draft: Add /embeddings endpoints
This draft MR adds endpoints to generate document and query embeddings to support meta#54 and ots/mediawiki/torque!107. It probably needs its own ticket.
This is designed so future versions can incorporate the semantic chunking @gridinoc is experimenting with, in which a set of documents can be further split into smaller documents which are still identified by the IDs and metadata provided.
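To make the ID and metadata contract concrete, here is a minimal sketch of how a split could keep each chunk tied to its parent. The `Document` class, the `split_document` helper, and the `chunk_index` metadata key are all hypothetical names for illustration, and the fixed-size splitter only stands in for whatever semantic chunker is eventually used.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical shape: an ID, the text to embed, and caller-supplied metadata.
    id: str
    text: str
    metadata: dict = field(default_factory=dict)


def split_document(doc: Document, max_chars: int = 200) -> list[Document]:
    """Split one document into chunks that keep the parent's ID and metadata.

    A naive fixed-size split is used here as a stand-in for semantic chunking.
    """
    pieces = [
        doc.text[start:start + max_chars]
        for start in range(0, len(doc.text), max_chars)
    ] or [""]  # an empty document still yields one (empty) chunk
    return [
        # Copy the metadata so chunks do not share one mutable dict.
        Document(id=doc.id, text=piece, metadata={**doc.metadata, "chunk_index": n})
        for n, piece in enumerate(pieces)
    ]
```

Because every chunk carries the original ID, a caller can group the resulting embeddings back to the document they came from without knowing how the split was done.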
Another possible future enhancement is an option to truncate and normalize embeddings from models that support it, like the Matryoshka-enabled model we're currently using.
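For reference, truncation and normalization of a Matryoshka-style embedding usually means keeping the leading dimensions and rescaling to unit length. The sketch below is not part of this MR; `truncate_and_normalize` and its `dimensions` parameter are hypothetical names, and it assumes the embedding arrives as a plain list of floats.

```python
import math


def truncate_and_normalize(embedding: list[float], dimensions: int) -> list[float]:
    """Keep the first `dimensions` values, then L2-normalize the result."""
    if not 0 < dimensions <= len(embedding):
        raise ValueError("dimensions must be between 1 and len(embedding)")
    truncated = embedding[:dimensions]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        # A zero vector cannot be normalized; return it unchanged.
        return list(truncated)
    return [x / norm for x in truncated]
```

Renormalizing after truncation matters because a truncated vector is no longer unit length, which would skew dot-product similarity scores.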
The endpoints are split into /embeddings/document and /embeddings/query so that we can provide a different prompt, which we do now, or perhaps do different processing in the future. Perhaps these should be /embed/documents and /embed/queries.
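As a rough illustration of why the two paths are separate, the sketch below applies a different prompt to documents and to queries before they would reach the embedding model. The prompt strings and function names are placeholders, not the ones used in this MR, and the actual model call is omitted.

```python
# Placeholder prompts; the real values depend on the embedding model in use.
DOCUMENT_PROMPT = "search_document: "
QUERY_PROMPT = "search_query: "


def prepare_documents(texts: list[str]) -> list[str]:
    """Apply the document-side prompt to each text before embedding."""
    return [DOCUMENT_PROMPT + text for text in texts]


def prepare_query(text: str) -> str:
    """Apply the query-side prompt to a single query before embedding."""
    return QUERY_PROMPT + text
```

Keeping the two preparation steps apart leaves room for the processing to diverge later without changing the request shape of either endpoint.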
I have upgraded a number of the LangChain dependencies here, but have not tested all the endpoints. I still need to write tests for these, as well as steps for a reviewer to test.