Elasticsearch Search API parameters and grounding accuracy. #162
jvwong started this conversation in Show and tell
The grounding-search system ranks the set of results from an initial call to the Elasticsearch Search API. There are several relevant parameters:
fuzziness: (Optional, string) Maximum edit distance allowed for matching. See Fuzziness for valid values and more information. See Fuzziness in the match query for an example. Configuration variable: MAX_FUZZ_ES (default: 2)
min_score: (Optional, float) Minimum _score for matching documents. Documents with a lower _score are not included in the search results. Configuration variable: ES_MIN_SCORE (default: 0)
Given the introduction of test cases where the target entities do not, in fact, exist ('out of dictionary') (#160), there is a desire to reduce spurious matches (#161). One way to achieve this is to apply stricter criteria to ES results, such as filtering out hits with a low ES _score or reducing fuzziness.
Test Configurations
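As a rough illustration of how these two parameters might appear in a Search API request body, here is a minimal Python sketch. It is not the actual grounding-search query, which is not shown in this post: the function name `build_search_body`, the field name `name`, and the idea of reading `MAX_FUZZ_ES` and `ES_MIN_SCORE` from the environment are all assumptions made for the example.

```python
import os

def build_search_body(term, max_fuzz=None, min_score=None, field="name"):
    """Build an Elasticsearch search body with fuzziness and min_score.

    `field` is a hypothetical index field; the real grounding-search
    mapping may differ.
    """
    # Assumed: fall back to the config variables named in this post,
    # using the defaults stated there (2 and 0).
    if max_fuzz is None:
        max_fuzz = int(os.environ.get("MAX_FUZZ_ES", "2"))
    if min_score is None:
        min_score = float(os.environ.get("ES_MIN_SCORE", "0"))

    return {
        # Hits scoring below this value are dropped from the response.
        "min_score": min_score,
        "query": {
            "match": {
                field: {
                    "query": term,
                    # Maximum edit distance allowed for a match.
                    "fuzziness": max_fuzz,
                }
            }
        },
    }

# A stricter configuration: fewer allowed edits, higher score floor.
strict = build_search_body("tp53", max_fuzz=1, min_score=5.0)
```

Lower `max_fuzz` and higher `min_score` both shrink the candidate set, which is the lever for reducing spurious 'out of dictionary' hits.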
The following analysis examines grounding search test results with these parameters altered alone or in combination:
Test Results
Figure 1. Search accuracy over different configurations (N=868). A test case fails when the expected ground is not the top search result returned from the grounding-search.
Figure 2. Search errors grouped into different classes based on rank. Runner up is second hit; OOD is 'out of dictionary' meaning a ground does not exist but a (non-empty) search hit is returned.
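The rank-based grouping described in the caption could be sketched roughly as follows. Only 'runner up' and 'OOD' are defined in this post; the other labels (`lower_rank`, `not_found`) and the function name `classify_result` are assumptions for illustration and may not match the classes actually plotted.

```python
from typing import Optional, Sequence

def classify_result(expected: Optional[str], hits: Sequence[str]) -> str:
    """Classify one test case from its expected ground and ranked hit ids.

    `expected` is None for an 'out of dictionary' case, i.e. no ground exists.
    """
    if expected is None:
        # OOD error: no ground exists, yet a non-empty result came back.
        return "ood" if hits else "pass"
    if hits and hits[0] == expected:
        return "pass"  # expected ground is the top search result
    if len(hits) > 1 and hits[1] == expected:
        return "runner_up"  # expected ground is the second hit
    if expected in hits:
        return "lower_rank"  # assumed label: found, but below second place
    return "not_found"  # assumed label: expected ground absent from hits

print(classify_result("A", ["B", "A", "C"]))
print(classify_result(None, ["B"]))
```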