Noting down some areas where significant speedups may be achieved:
- vcat in ProductNodes leads to a lot of copying
- data deduplication in leaves may lower memory requirements and also save some compute (e.g. computing ngrams only once for identical strings in NGramMatrix multiplication)
- deduplicating instances in BagNodes in a similar fashion
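
The deduplication idea in the last two items can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not code from the library: the helper names `deduplicate`, `ngram_counts` and `ngram_features` are hypothetical, and a `Counter` of character trigrams stands in for whatever per-string work the real NGramMatrix multiplication does. The assumption is that the per-item computation depends only on the item's value, so it can be done once per distinct value and the results gathered back through an index.

```python
from collections import Counter


def ngram_counts(s, n=3):
    """Count the character n-grams of one string (stand-in for the expensive per-string work)."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def deduplicate(items):
    """Return (unique items in first-seen order, index of every original item into that list)."""
    position = {}   # item -> position in `unique`
    unique = []
    index = []
    for item in items:
        if item not in position:
            position[item] = len(unique)
            unique.append(item)
        index.append(position[item])
    return unique, index


def ngram_features(strings, n=3):
    """Compute n-gram counts once per distinct string, then gather them back per observation."""
    unique, index = deduplicate(strings)
    computed = [ngram_counts(s, n) for s in unique]   # one computation per distinct string
    # Duplicated observations share the same result object, so nothing is copied here.
    return [computed[i] for i in index]


strings = ["hello", "world", "hello"]
unique, index = deduplicate(strings)
features = ngram_features(strings)
```

The same (unique values, index) pair could in principle be applied to instances inside a bag: only the distinct instances are processed, and the index restores the original layout before aggregation. Whether this pays off depends on how many duplicates the data actually contains, since building the index has its own cost.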