Improve indexing speed
This MR attempts to improve the speed at which Semantic Search generates embeddings and indexes documents by:
- Adding an in-memory cache for embeddings with an expiry time of one hour, as suggested by @frankduncan.
- Refactoring to support asynchronous concurrency and generating up to two documents in parallel when rebuilding the entire search index.
- Adding HTTP connection pooling so connections are reused when rebuilding the entire search index.
- Using orjson for JSON parsing and dumping where possible.
- Enabling some advanced features in the HTTP client.
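For reviewers unfamiliar with the pattern, the "up to two in parallel" limit from the list above can be sketched roughly as follows. This is a minimal illustration rather than the code in this MR: `embed_document`, `embed_all`, and `MAX_CONCURRENCY` are hypothetical names, and the embeddings call is simulated with a short sleep.

```python
import asyncio
from typing import Awaitable, Callable, List

# Assumption: mirrors the "up to two in parallel" limit described above.
MAX_CONCURRENCY = 2


async def embed_document(doc: str) -> List[float]:
    """Hypothetical stand-in for the request to the embeddings API."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [float(len(doc))]


async def embed_all(
    docs: List[str],
    embed: Callable[[str], Awaitable[List[float]]] = embed_document,
    limit: int = MAX_CONCURRENCY,
) -> List[List[float]]:
    """Run `embed` over every document with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(doc: str) -> List[float]:
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await embed(doc)

    # gather() returns results in the same order as the input documents.
    return await asyncio.gather(*(bounded(doc) for doc in docs))


if __name__ == "__main__":
    print(asyncio.run(embed_all(["first", "second", "third"])))
```

A semaphore keeps the cap in one place, so raising the limit later would be a one-line change if the embeddings backend can handle more concurrent requests.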
This addresses ots/llm/meta#61. The caching may also address ots/llm/meta#64.
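As a rough picture of the caching behaviour, an in-memory cache with a one-hour expiry might look like the sketch below. It is illustrative only and not necessarily how this MR implements it: `TTLCache`, `cached_embedding`, and the injectable `clock` are hypothetical names, and the cache is assumed to be keyed on the query text.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,  # one hour, as described above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[float]]] = {}

    def get(self, key: str) -> Optional[List[float]]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: List[float]) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def cached_embedding(
    text: str,
    cache: TTLCache,
    embed: Callable[[str], List[float]],
) -> List[float]:
    """Return the cached embedding for `text`, calling `embed` only on a miss."""
    hit = cache.get(text)
    if hit is not None:
        return hit
    value = embed(text)
    cache.set(text, value)
    return value
```

Under this sketch, a repeated query within the hour never reaches `embed`, which is the behaviour the paginated request in the test steps is meant to exercise.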
Steps to Test
git clone https://code.librehq.com/ots/llm/llm-api.git
cd llm-api
make run &
cd ../semantic-search
git fetch
git checkout improve-semantic-search-indexing-speed
cd ../torque/torque-django
pipenv install -e ../semantic-search/django-torque-semantic-search
pipenv run python manage.py shell
> from torque import models
# this should be faster
> models.WikiConfig.objects.get(collection__name="DemoView", group="TorqueAdmin").rebuild_search_index()
> exit()
sudo -u postgres psql torque -c "update torque_searchcachedocument set dirty = true;"
# this should also be faster, but less noticeably so
pipenv run python manage.py run_cache_rebuilder
# this should produce relevant results
curl "http://localhost:5000/api/collections/DemoView/explore?filter_only=&with_ids=true&f=%7B%22admin_review%22%3A%5B%22Valid%22%5D%2C%22competition_status%22%3A%5B%22Active%22%5D%7D&qs=%5B%22water+in+india%22%5D&wiki_key=DemoView&group=TorqueAdmin&"
# this shouldn't hit the embeddings API a second time
curl "http://localhost:5000/api/collections/DemoView/explore?filter_only=&with_ids=&f=%7B%22admin_review%22%3A%5B%22Valid%22%5D%2C%22competition_status%22%3A%5B%22Active%22%5D%7D&offset=100&qs=%5B%22water+in+india%22%5D&wiki_key=DemoView&group=TorqueAdmin&"
Edited by Chris Zubak-Skees