Implement a new Extractor subtype, called WordEmbeddingExtractor, that extracts natural-language words as their embeddings (possibly using Embeddings.jl and WordTokenizers.jl?).
A rough sketch of a possible implementation can be found here, but it targets the old version of JsonGrinder.
A good starting point is the NGramExtractor implementation; the design should be very similar.
We might also want to update suggestextractor with a new kwarg governing when Strings are extracted as ngrams and when they are tokenized.
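
To make the idea concrete, here is a minimal, dependency-free sketch of the core lookup such an extractor might perform. Everything in it is an assumption for illustration: `WordEmbeddingSketch`, `tokenize_simple` and `embed` are hypothetical names, the whitespace tokenizer stands in for WordTokenizers.jl, the hand-written matrix stands in for vectors loaded via Embeddings.jl, and averaging the word vectors is only one possible pooling choice. It deliberately does not subtype Extractor or use any JsonGrinder API, since that interface should be taken from the current NGramExtractor code.

```julia
using Statistics: mean

# Hypothetical stand-in for the proposed extractor; not part of JsonGrinder.
struct WordEmbeddingSketch
    vocab::Dict{String, Int}    # word -> column index in `vectors`
    vectors::Matrix{Float32}    # embedding dimension × vocabulary size
end

# Naive lowercase + whitespace tokenizer (assumption; a real tokenizer would replace it).
tokenize_simple(s::AbstractString) = String.(split(lowercase(s)))

# Map a string to the mean of its known word vectors; all zeros if no word is known.
function embed(e::WordEmbeddingSketch, s::AbstractString)
    idxs = [e.vocab[t] for t in tokenize_simple(s) if haskey(e.vocab, t)]
    isempty(idxs) && return zeros(Float32, size(e.vectors, 1))
    return vec(mean(e.vectors[:, idxs]; dims = 2))
end

# Toy vocabulary with two 2-dimensional embeddings.
e = WordEmbeddingSketch(Dict("hello" => 1, "world" => 2), Float32[1 0; 0 1])
embed(e, "Hello world")
```

Out-of-vocabulary words are silently skipped here; whether the real extractor should skip them, map them to a dedicated unknown-word vector, or fall back to ngrams is an open design question.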