Resolve "expand filter results with semantic similar values"
Closes #48
Offline: python ./api/filtersets/create_sim_map.py
reads filterset_request_20240913_trimmed.json
and creates semantic_map.json,
in which every value of every filter maps to a list of tuples ('similar value', cosine) for its semantically similar values.
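The shape of the map can be illustrated with a minimal sketch. It is not the code in create_sim_map.py: the function names, the input shape (one precomputed vector per filter value) and the toy two-dimensional vectors are all assumptions made for illustration, and the embedding model used by the real script is not shown here.

```python
import json
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_similarity_map(embeddings_by_filter):
    """Build {filter: {value: [(similar value, cosine), ...]}}.

    embeddings_by_filter is a hypothetical input of the form
    {filter_name: {value: vector}}; the real script derives its
    values from filterset_request_20240913_trimmed.json.
    """
    sim_map = {}
    for filter_name, vectors in embeddings_by_filter.items():
        sim_map[filter_name] = {}
        for value, vec in vectors.items():
            neighbours = [
                (other, round(cosine(vec, other_vec), 4))
                for other, other_vec in vectors.items()
                if other != value
            ]
            # Most similar values first.
            neighbours.sort(key=lambda pair: pair[1], reverse=True)
            sim_map[filter_name][value] = neighbours
    return sim_map


# Toy vectors, made up for this sketch (not real embeddings).
toy_embeddings = {
    "current_work_locations": {
        "Pennsylvania": [1.0, 0.0],
        "Delaware": [1.0, 1.0],
        "Nevada": [0.0, 1.0],
    }
}

semantic_map = build_similarity_map(toy_embeddings)
# json.dumps writes each tuple as a two-element list.
print(json.dumps(semantic_map, indent=2))
```

Sorting each list by descending cosine at build time means the runtime step can simply take candidates from the front of the list.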
At runtime: api/filterset.py
loads the map and expands the results with similar values. The defaults are
min_cosine of 0.5 and max_expansions of 5;
both can be overridden via query parameters.
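The expansion step could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in api/filterset.py: the function name expand_values is hypothetical, the similarity scores are made up, and max_expansions is assumed to cap the number of values added per original value (the description does not say whether the cap is per value or per filter).

```python
def expand_values(values, similar, min_cosine=0.5, max_expansions=5):
    """Return values plus their similar values, keeping the original order first.

    similar maps a value to a list of (similar value, cosine) pairs,
    as in one filter's entry of semantic_map.json.
    """
    expanded = list(values)
    seen = set(values)
    for value in values:
        # Sort defensively so the best candidates are considered first.
        candidates = sorted(similar.get(value, []), key=lambda p: p[1], reverse=True)
        added = 0
        for candidate, score in candidates:
            if added >= max_expansions or score < min_cosine:
                break  # cap reached, or all remaining scores are too low
            if candidate in seen:
                continue  # skip duplicates without counting them
            expanded.append(candidate)
            seen.add(candidate)
            added += 1
    return expanded


# Made-up scores, for illustration only.
similar_locations = {
    "Pennsylvania": [("Delaware", 0.9), ("Maryland", 0.8), ("Nevada", 0.4)],
}

default_result = expand_values(["Pennsylvania"], similar_locations)
capped_result = expand_values(["Pennsylvania"], similar_locations, max_expansions=1)
print(default_result)
print(capped_result)
```

With these toy scores, "Nevada" falls below the default min_cosine of 0.5 and is dropped, while max_expansions=1 keeps only the closest match.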
Example result without expansion:
{
  "filters": {
    "current_work_locations": [
      "Pennsylvania"
    ],
    "future_work_locations": [
      "This is a national project, based in Philadelphia, PA" // <-- this is a bad value from Torque
    ],
    "key_words_and_phrases": []
  },
  "id": "b294345a0dd6bd4f7160f9fbdbbf2e7f",
  "keywords": [
    "Philadelphia",
    "literacy projects"
  ],
  "query": "literacy projects in philidelphia"
}
Example result with semantic expansion:
{
  "filters": {
    "current_work_locations": [
      "Pennsylvania",
      "Connecticut",
      "Delaware",
      "Kentucky",
      "Maryland",
      "Nevada"
    ],
    "future_work_locations": [
      "This is a national project, based in Philadelphia, PA", // <-- this is a bad value from Torque
      "United States: Pennsylvania",
      "Pennsylvania"
    ],
    "key_words_and_phrases": []
  },
  "id": "b294345a0dd6bd4f7160f9fbdbbf2e7f",
  "keywords": [
    "Philadelphia",
    "literacy projects"
  ],
  "query": "literacy projects in philidelphia"
}