Improve indexing speed

Chris Zubak-Skees requested to merge improve-semantic-search-indexing-speed into main

This MR attempts to improve the speed at which Semantic Search generates embeddings and indexes documents by:

  1. Adding an in-memory cache for embeddings with an expiry time of one hour, as suggested by @frankduncan.
  2. Refactoring to support asynchronous concurrency and generating embeddings for up to two documents in parallel when rebuilding the entire search index.
  3. Adding HTTP pooling for reusing connections when rebuilding the entire search index.
  4. Using orjson for JSON parsing and dumping where possible.
  5. Enabling some advanced features in the HTTP client.
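
For reviewers less familiar with the pattern in item 2, here is a minimal sketch of bounded concurrency using only the standard library. It is not the code in this MR: `embed_document` is a hypothetical stand-in for the real embeddings API call, and `embed_all` and `MAX_CONCURRENT` are illustrative names.

```python
import asyncio

# Assumption: mirrors the "up to two documents in parallel" limit described above.
MAX_CONCURRENT = 2


async def embed_document(doc: str) -> list[float]:
    """Hypothetical stand-in for a request to the embeddings API."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [float(len(doc))]


async def embed_all(docs: list[str], limit: int = MAX_CONCURRENT) -> list[list[float]]:
    """Embed every document, with at most `limit` requests in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(doc: str) -> list[float]:
        # The semaphore blocks here once `limit` calls are already running.
        async with semaphore:
            return await embed_document(doc)

    # gather() returns results in the same order as the input documents.
    return await asyncio.gather(*(bounded(doc) for doc in docs))


if __name__ == "__main__":
    print(asyncio.run(embed_all(["a", "bb", "ccc"])))
```

A semaphore keeps the limit in one place, so raising or lowering the parallelism would only mean changing `limit`.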

This addresses ots/llm/meta#61. The caching may also address ots/llm/meta#64.
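
As a rough illustration of the caching idea, an in-memory cache with a one-hour expiry could look something like the sketch below. The class name `TTLCache`, its methods, and the injectable `clock` parameter are assumptions made for this example, not the implementation in this MR.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Illustrative in-memory cache whose entries expire after a fixed time."""

    def __init__(
        self,
        ttl_seconds: float = 3600.0,  # one hour, as described in this MR
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, list[float]]] = {}

    def get(self, key: str) -> Optional[list[float]]:
        """Return the cached embedding, or None if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Drop stale entries lazily, on lookup.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: list[float]) -> None:
        """Store an embedding along with its expiry timestamp."""
        self._entries[key] = (self._clock() + self._ttl, value)
```

With a cache like this, a repeated query string within the hour would be served from memory instead of triggering another embeddings request.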

Steps to Test

git clone
cd llm-api
make run &
cd ../semantic-search
git fetch
git checkout improve-semantic-search-indexing-speed
cd ../torque/torque-django
pipenv install -e ../semantic-search/django-torque-semantic-search
pipenv run python manage.py shell
> from torque import models
# this should be faster
> models.WikiConfig.objects.get(collection__name="DemoView", group="TorqueAdmin").rebuild_search_index()
> exit()
sudo -u postgres psql torque -c "update torque_searchcachedocument set dirty = true;"
# this should also be faster, but less so
pipenv run python manage.py run_cache_rebuilder
# this should produce relevant results
curl "http://localhost:5000/api/collections/DemoView/explore?filter_only=&with_ids=true&f=%7B%22admin_review%22%3A%5B%22Valid%22%5D%2C%22competition_status%22%3A%5B%22Active%22%5D%7D&qs=%5B%22water+in+india%22%5D&wiki_key=DemoView&group=TorqueAdmin&"
# this shouldn't hit the embeddings API a second time
curl "http://localhost:5000/api/collections/DemoView/explore?filter_only=&with_ids=&f=%7B%22admin_review%22%3A%5B%22Valid%22%5D%2C%22competition_status%22%3A%5B%22Active%22%5D%7D&offset=100&qs=%5B%22water+in+india%22%5D&wiki_key=DemoView&group=TorqueAdmin&"
Edited by Chris Zubak-Skees
