Add a mixture of results #529
This is for multiple PRs, e.g. BRIGHT v1.1, running gemini 2, and a few others that I noted were missing.
Model Results Comparison
Reference models:
Results for
| task_name | KaLM-Embedding/KaLM-embedding-multilingual-mini-instruct-v2.5 | intfloat/multilingual-e5-large | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|
| TweetSentimentClassification | 0.4935 | 0.503 | 0.6570 | codefuse-ai/F2LLM-v2-14B | False |
| Average | 0.4935 | 0.503 | 0.6570 | nan | - |
Training datasets: ATEC, AmazonCounterfactualClassification, AmazonCounterfactualVNClassification, AmazonPolarityClassification, AmazonPolarityClassification.v2, AmazonPolarityVNClassification, AmazonReviewsClassification, AmazonReviewsVNClassification, ArXivHierarchicalClusteringP2P, ArXivHierarchicalClusteringS2S, ArxivClusteringP2P, ArxivClusteringP2P.v2, ArxivClusteringS2S, BQ, Banking77Classification, Banking77Classification.v2, Banking77VNClassification, BiorxivClusteringP2P, BiorxivClusteringP2P.v2, BiorxivClusteringS2S, BiorxivClusteringS2S.v2, CQADupstack, CodeFeedbackMT, CodeFeedbackST, ContractNLIConfidentialityOfAgreementLegalBenchClassification, ContractNLIExplicitIdentificationLegalBenchClassification, ContractNLIInclusionOfVerballyConveyedInformationLegalBenchClassification, ContractNLILimitedUseLegalBenchClassification, ContractNLINoLicensingLegalBenchClassification, ContractNLINoticeOnCompelledDisclosureLegalBenchClassification, ContractNLIPermissibleAcquirementOfSimilarInformationLegalBenchClassification, ContractNLIPermissibleCopyLegalBenchClassification, ContractNLIPermissibleDevelopmentOfSimilarInformationLegalBenchClassification, ContractNLIPermissiblePostAgreementPossessionLegalBenchClassification, ContractNLIReturnOfConfidentialInformationLegalBenchClassification, ContractNLISharingWithEmployeesLegalBenchClassification, ContractNLISharingWithThirdPartiesLegalBenchClassification, ContractNLISurvivalOfObligationsLegalBenchClassification, DBPedia, DBPedia-Fa, DBPedia-NL, DBPedia-PL, DBPedia-PLHardNegatives, DBPedia-VN, DBPediaHardNegatives, DBPediaHardNegatives.v2, ESCIReranking, EmotionClassification, EmotionClassification.v2, EmotionVNClassification, FEVER, FEVER-FaHardNegatives, FEVER-NL, FEVER-VN, FEVERHardNegatives, FEVERHardNegatives.v2, FiQA-PL, FiQA2018, FiQA2018-Fa, FiQA2018-Fa.v2, FiQA2018-NL, FiQA2018-VN, HUMEArxivClusteringP2P, HUMEEmotionClassification, HUMEToxicConversationsClassification, 
HUMETweetSentimentExtractionClassification, HotpotQA, HotpotQA-Fa, HotpotQA-FaHardNegatives, HotpotQA-NL, HotpotQA-PL, HotpotQA-PLHardNegatives, HotpotQA-VN, HotpotQAHardNegatives, HotpotQAHardNegatives.v2, ImdbClassification, ImdbClassification.v2, ImdbVNClassification, MIRACLJaRetrievalLite, MIRACLReranking, MIRACLRetrieval, MIRACLRetrievalHardNegatives, MIRACLRetrievalHardNegatives.v2, MSMARCO, MSMARCO-Fa, MSMARCO-FaHardNegatives, MSMARCO-PL, MSMARCO-PLHardNegatives, MSMARCO-VN, MSMARCOHardNegatives, MSMARCOv2, MTOPDomainClassification, MTOPDomainVNClassification, MTOPIntentClassification, MTOPIntentVNClassification, MassiveIntentClassification, MassiveIntentVNClassification, MassiveScenarioClassification, MassiveScenarioVNClassification, MedrxivClusteringP2P, MedrxivClusteringP2P.v2, MedrxivClusteringS2S, MedrxivClusteringS2S.v2, MrTidyRetrieval, MrTyDiJaRetrievalLite, MultiLongDocReranking, MultiLongDocRetrieval, MultilingualSentiment, MultilingualSentiment.v2, NFCorpus, NFCorpus-Fa, NFCorpus-NL, NFCorpus-NL.v2, NFCorpus-PL, NFCorpus-VN, NQ, NQ-Fa, NQ-FaHardNegatives, NQ-NL, NQ-PL, NQ-PLHardNegatives, NQ-VN, NQHardNegatives, NanoDBPedia-VN, NanoDBPediaRetrieval, NanoFEVER-VN, NanoFEVERRetrieval, NanoFiQA2018Retrieval, NanoHotpotQA-VN, NanoHotpotQARetrieval, NanoMSMARCO-VN, NanoMSMARCORetrieval, NanoNFCorpusRetrieval, NanoNQ-VN, NanoNQRetrieval, NanoQuoraRetrieval, NanoSciFactRetrieval, PawsXPairClassification, Quora-NL, Quora-PL, Quora-PLHardNegatives, QuoraRetrieval, QuoraRetrieval-Fa, QuoraRetrieval-Fa.v2, QuoraRetrievalHardNegatives, QuoraRetrievalHardNegatives.v2, Reddit-Clustering, Reddit-Clustering-P2P, SciFact, SciFact-Fa, SciFact-Fa.v2, SciFact-NL, SciFact-NL.v2, SciFact-PL, SciFact-VN, Stackexchange-Clustering, Stackexchange-Clustering-P2P, TRECCOVID, TRECCOVID-Fa, TRECCOVID-Fa.v2, TRECCOVID-NL, TRECCOVID-PL, TRECCOVID-VN, ToxicConversationsClassification, ToxicConversationsClassification.v2, ToxicConversationsVNClassification, 
TweetSentimentExtractionClassification, TweetSentimentExtractionClassification.v2, TweetSentimentExtractionVNClassification, TwentyNewsgroups-Clustering, YahooAnswersTopicsClassification, YahooAnswersTopicsClassification.v2, mMARCO-NL
Results for Qwen/Qwen3-Embedding-0.6B
| task_name | Qwen/Qwen3-Embedding-0.6B | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|---|
| FEVERHardNegatives | 0.8726 | 0.8898 | 0.8379 | 0.9453 | ByteDance-Seed/Seed1.5-Embedding | True |
| TweetSentimentClassification | 0.4813 | nan | 0.503 | 0.6570 | codefuse-ai/F2LLM-v2-14B | False |
| Average | 0.677 | 0.8898 | 0.6704 | 0.8012 | nan | - |
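The Average row looks like a per-column mean that skips missing (`nan`) scores: google/gemini-embedding-001 has only one score in this table, and its average equals that score. A minimal Python sketch under that assumption (the helper name `nan_skipping_mean` is hypothetical and not taken from the report tooling):

```python
import math


def nan_skipping_mean(scores):
    """Mean of the scores that are not nan; nan if no score remains."""
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid) if valid else math.nan


# Scores copied from the table above
# (rows: FEVERHardNegatives, TweetSentimentClassification).
columns = {
    "Qwen/Qwen3-Embedding-0.6B": [0.8726, 0.4813],
    "google/gemini-embedding-001": [0.8898, math.nan],
    "intfloat/multilingual-e5-large": [0.8379, 0.503],
    "Max result": [0.9453, 0.6570],
}

averages = {name: nan_skipping_mean(vals) for name, vals in columns.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.4f}")
```

The computed means agree with the Average row to within rounding of the last digit, which is consistent with (but does not prove) nan-skipping averaging.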
Training datasets: CMedQAv2-reranking, CmedqaRetrieval, CodeSearchNet, DuRetrieval, FEVER, FEVER-FaHardNegatives, FEVER-NL, FEVER-VN, FEVERHardNegatives, FEVERHardNegatives.v2, HotpotQA, HotpotQA-Fa, HotpotQA-FaHardNegatives, HotpotQA-NL, HotpotQA-PL, HotpotQA-PLHardNegatives, HotpotQA-VN, HotpotQAHardNegatives, HotpotQAHardNegatives.v2, MIRACLJaRetrievalLite, MIRACLReranking, MIRACLRetrieval, MIRACLRetrievalHardNegatives, MIRACLRetrievalHardNegatives.v2, MMarcoReranking, MSMARCO, MSMARCO-Fa, MSMARCO-FaHardNegatives, MSMARCO-PL, MSMARCO-PLHardNegatives, MSMARCO-VN, MSMARCOHardNegatives, MSMARCOv2, MrTidyRetrieval, MrTyDiJaRetrievalLite, NQ, NQ-Fa, NQ-FaHardNegatives, NQ-NL, NQ-PL, NQ-PLHardNegatives, NQ-VN, NQHardNegatives, NanoFEVER-VN, NanoFEVERRetrieval, NanoHotpotQA-VN, NanoHotpotQARetrieval, NanoMSMARCO-VN, NanoMSMARCORetrieval, NanoNQ-VN, NanoNQRetrieval, T2Retrieval
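The "In Training Data" column appears to be a membership check of the task name against the training-dataset list. A small Python sketch of that reading, using only an excerpt of the list above (the variable names are illustrative, and the exact rule used by the report is an assumption):

```python
# Excerpt of the training-dataset list above (not the full list).
training_datasets = {
    "FEVER",
    "FEVERHardNegatives",
    "FEVERHardNegatives.v2",
    "HotpotQA",
    "MSMARCO",
    "NQ",
    "T2Retrieval",
}

# Tasks evaluated in the Qwen/Qwen3-Embedding-0.6B table.
evaluated_tasks = ["FEVERHardNegatives", "TweetSentimentClassification"]

# Assumed rule: a task is flagged when its name appears in the list.
in_training_data = {task: task in training_datasets for task in evaluated_tasks}

print(in_training_data)
```

Under this reading, FEVERHardNegatives is flagged and TweetSentimentClassification is not, matching the table.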
Results for google/gemini-embedding-2-preview
| task_name | google/gemini-embedding-001 | google/gemini-embedding-2-preview | intfloat/multilingual-e5-large | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|---|
| BUCC.v2 | 0.9899 | 1.0000 | 0.9878 | 0.9905 | codefuse-ai/F2LLM-v2-8B | False |
| BibleNLPBitextMining | 0.2072 | 0.5298 | 0.1665 | 0.9899 | deepvk/USER-bge-m3 | False |
| BornholmBitextMining | 0.5169 | 0.9167 | 0.4416 | 0.7798 | jinaai/jina-embeddings-v5-text-small | False |
| BulgarianStoreReviewSentimentClassfication | 0.7813 | 0.7319 | 0.6385 | 0.8159 | microsoft/harrier-oss-v1-27b | False |
| CzechProductReviewSentimentClassification | 0.6816 | 0.6435 | 0.5714 | 0.7667 | Bytedance/Seed1.6-embedding-1215 | False |
| DBpediaClassification | 0.9476 | 0.8865 | 0.8828 | 0.9926 | Qwen/Qwen3-Embedding-4B | False |
| DiaBlaBitextMining | 0.8723 | 0.9963 | 0.8483 | 0.8882 | codefuse-ai/F2LLM-v2-14B | False |
| EstonianValenceClassification | 0.5352 | 0.4581 | 0.4289 | 0.6764 | microsoft/harrier-oss-v1-27b | False |
| FilipinoShopeeReviewsClassification | 0.4845 | 0.4243 | 0.3527 | 0.5279 | microsoft/harrier-oss-v1-27b | False |
| FinancialPhrasebankClassification | 0.8864 | 0.8447 | 0.8394 | 0.9519 | microsoft/harrier-oss-v1-0.6b | False |
| FloresBitextMining | 0.8371 | 0.9824 | 0.8108 | 0.9087 | SamilPwC-AXNode-GenAI/PwC-Embedding_expr | False |
| GreekLegalCodeClassification | 0.4376 | 0.3386 | 0.3713 | 0.8052 | Bytedance/Seed1.6-embedding-1215 | False |
| GujaratiNewsClassification | 0.9205 | 0.9010 | 0.7674 | 0.9343 | Bytedance/Seed1.6-embedding-1215 | False |
| IN22GenBitextMining | 0.9375 | 0.9953 | 0.7675 | 0.9375 | google/gemini-embedding-001 | False |
| IndicGenBenchFloresBitextMining | 0.9677 | 0.9690 | 0.8875 | 0.9881 | Sailesh97/Hinvec | False |
| IndonesianIdClickbaitClassification | 0.67 | 0.6256 | 0.6122 | 0.7560 | nvidia/llama-embed-nemotron-8b | False |
| LccSentimentClassification | 0.6993 | 0.5933 | 0.594 | 0.7687 | Alibaba-NLP/gte-Qwen2-7B-instruct | False |
| NTREXBitextMining | 0.9364 | 0.9881 | 0.914 | 0.9592 | microsoft/harrier-oss-v1-27b | False |
| NollySentiBitextMining | 0.6871 | 0.7837 | 0.675 | 0.8376 | microsoft/harrier-oss-v1-27b | False |
| NorwegianCourtsBitextMining | 0.9342 | 1.0000 | 0.9404 | 0.9481 | jinaai/jina-embeddings-v5-text-nano | False |
| NusaTranslationBitextMining | 0.7752 | 0.9316 | 0.672 | 0.9222 | Qwen/Qwen3-Embedding-8B | False |
| NusaXBitextMining | 0.8252 | 0.8393 | 0.7267 | 0.9056 | Bytedance/Seed1.6-embedding-1215 | False |
| PoemSentimentClassification | 0.5966 | 0.4756 | 0.5067 | 0.8642 | Bytedance/Seed1.6-embedding-1215 | False |
| SentimentAnalysisHindi | 0.7606 | 0.5818 | 0.642 | 0.8070 | microsoft/harrier-oss-v1-27b | False |
| Tatoeba | 0.8197 | 0.9947 | 0.7573 | 0.9659 | SamilPwC-AXNode-GenAI/PwC-Embedding_expr | False |
| ToxicConversationsClassification | 0.8875 | 0.7352 | 0.6601 | 0.9759 | voyageai/voyage-3-m-exp | False |
| TweetTopicSingleClassification | 0.7111 | 0.6699 | 0.6532 | 0.8631 | jinaai/jina-embeddings-v5-text-small | False |
| Average | 0.7521 | 0.7717 | 0.671 | 0.8714 | nan | - |
Model has high performance on these tasks: BUCC.v2, Tatoeba, NTREXBitextMining, NorwegianCourtsBitextMining, IN22GenBitextMining, NusaTranslationBitextMining, FloresBitextMining, DiaBlaBitextMining, BornholmBitextMining
Training datasets: FEVER, FEVER-FaHardNegatives, FEVER-NL, FEVER-VN, FEVERHardNegatives, FEVERHardNegatives.v2, HotpotQA, HotpotQA-Fa, HotpotQA-FaHardNegatives, HotpotQA-NL, HotpotQA-PL, HotpotQA-PLHardNegatives, HotpotQA-VN, HotpotQAHardNegatives, HotpotQAHardNegatives.v2, MIRACLJaRetrievalLite, MIRACLReranking, MIRACLRetrieval, MIRACLRetrievalHardNegatives, MIRACLRetrievalHardNegatives.v2, NQ, NQ-Fa, NQ-FaHardNegatives, NQ-NL, NQ-PL, NQ-PLHardNegatives, NQ-VN, NQHardNegatives, NanoFEVER-VN, NanoFEVERRetrieval, NanoHotpotQA-VN, NanoHotpotQARetrieval, NanoNQ-VN, NanoNQRetrieval
Results for google/siglip-base-patch16-224
| task_name | google/gemini-embedding-001 | google/siglip-base-patch16-224 | intfloat/multilingual-e5-large | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|---|
| AfriSentiClassification | 0.5356 | 0.5124 | 0.455 | 0.5688 | tencent/KaLM-Embedding-Gemma3-12B-2511 | False |
| BUCC.v2 | 0.9899 | 1.0000 | 0.9878 | 0.9905 | codefuse-ai/F2LLM-v2-8B | False |
| BibleNLPBitextMining | 0.2072 | 0.5298 | 0.1665 | 0.9899 | deepvk/USER-bge-m3 | False |
| BornholmBitextMining | 0.5169 | 0.9167 | 0.4416 | 0.7798 | jinaai/jina-embeddings-v5-text-small | False |
| BulgarianStoreReviewSentimentClassfication | 0.7813 | 0.7319 | 0.6385 | 0.8159 | microsoft/harrier-oss-v1-27b | False |
| CzechProductReviewSentimentClassification | 0.6816 | 0.6435 | 0.5714 | 0.7667 | Bytedance/Seed1.6-embedding-1215 | False |
| DBpediaClassification | 0.9476 | 0.8865 | 0.8828 | 0.9926 | Qwen/Qwen3-Embedding-4B | False |
| DiaBlaBitextMining | 0.8723 | 0.9963 | 0.8483 | 0.8882 | codefuse-ai/F2LLM-v2-14B | False |
| EstonianValenceClassification | 0.5352 | 0.4581 | 0.4289 | 0.6764 | microsoft/harrier-oss-v1-27b | False |
| FilipinoShopeeReviewsClassification | 0.4845 | 0.4243 | 0.3527 | 0.5279 | microsoft/harrier-oss-v1-27b | False |
| FinancialPhrasebankClassification | 0.8864 | 0.8447 | 0.8394 | 0.9519 | microsoft/harrier-oss-v1-0.6b | False |
| FloresBitextMining | 0.8371 | 0.9824 | 0.8108 | 0.9087 | SamilPwC-AXNode-GenAI/PwC-Embedding_expr | False |
| GreekLegalCodeClassification | 0.4376 | 0.3386 | 0.3713 | 0.8052 | Bytedance/Seed1.6-embedding-1215 | False |
| GujaratiNewsClassification | 0.9205 | 0.9010 | 0.7674 | 0.9343 | Bytedance/Seed1.6-embedding-1215 | False |
| IN22GenBitextMining | 0.9375 | 0.9953 | 0.7675 | 0.9375 | google/gemini-embedding-001 | False |
| IndicGenBenchFloresBitextMining | 0.9677 | 0.9690 | 0.8875 | 0.9881 | Sailesh97/Hinvec | False |
| IndonesianIdClickbaitClassification | 0.67 | 0.6256 | 0.6122 | 0.7560 | nvidia/llama-embed-nemotron-8b | False |
| ItaCaseholdClassification | 0.733 | 0.5127 | 0.6679 | 0.9439 | bigscience/sgpt-bloom-7b1-msmarco | False |
| KorSarcasmClassification | 0.6051 | 0.6358 | 0.5679 | 0.8190 | ICT-TIME-and-Querit/BOOM_4B_v1 | False |
| KurdishSentimentClassification | 0.8639 | 0.7964 | 0.7708 | 0.9403 | Bytedance/Seed1.6-embedding-1215 | False |
| LccSentimentClassification | 0.6993 | 0.5933 | 0.594 | 0.7687 | Alibaba-NLP/gte-Qwen2-7B-instruct | False |
| MacedonianTweetSentimentClassification | 0.7183 | 0.6325 | 0.6192 | 0.7547 | Qwen/Qwen3-Embedding-4B | False |
| NTREXBitextMining | 0.9364 | 0.9881 | 0.914 | 0.9592 | microsoft/harrier-oss-v1-27b | False |
| NollySentiBitextMining | 0.6871 | 0.7837 | 0.675 | 0.8376 | microsoft/harrier-oss-v1-27b | False |
| NorwegianCourtsBitextMining | 0.9342 | 1.0000 | 0.9404 | 0.9481 | jinaai/jina-embeddings-v5-text-nano | False |
| NusaTranslationBitextMining | 0.7752 | 0.9316 | 0.672 | 0.9222 | Qwen/Qwen3-Embedding-8B | False |
| NusaXBitextMining | 0.8252 | 0.8393 | 0.7267 | 0.9056 | Bytedance/Seed1.6-embedding-1215 | False |
| PoemSentimentClassification | 0.5966 | 0.4756 | 0.5067 | 0.8642 | Bytedance/Seed1.6-embedding-1215 | False |
| SentimentAnalysisHindi | 0.7606 | 0.5818 | 0.642 | 0.8070 | microsoft/harrier-oss-v1-27b | False |
| Tatoeba | 0.8197 | 0.9947 | 0.7573 | 0.9659 | SamilPwC-AXNode-GenAI/PwC-Embedding_expr | False |
| ToxicConversationsClassification | 0.8875 | 0.7352 | 0.6601 | 0.9759 | voyageai/voyage-3-m-exp | False |
| TweetTopicSingleClassification | 0.7111 | 0.6699 | 0.6532 | 0.8631 | jinaai/jina-embeddings-v5-text-small | False |
| Average | 0.7426 | 0.7477 | 0.6624 | 0.8611 | nan | - |
Model has high performance on these tasks: BUCC.v2, Tatoeba, NTREXBitextMining, NorwegianCourtsBitextMining, IN22GenBitextMining, NusaTranslationBitextMining, FloresBitextMining, DiaBlaBitextMining, BornholmBitextMining
Results for manveertamber/cadet-embed-base-v1
| task_name | intfloat/multilingual-e5-large | manveertamber/cadet-embed-base-v1 | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|
| BrightAopsRetrieval | 0.0722 | 0.0755 | 0.0825 | lightonai/Reason-ModernColBERT | False |
| BrightBiologyLongRetrieval | 0.0194 | 0.2532 | 0.2557 | sentence-transformers/all-mpnet-base-v2 | False |
| BrightBiologyRetrieval | 0.0174 | 0.2129 | 0.3387 | lightonai/Reason-ModernColBERT | False |
| BrightEarthScienceLongRetrieval | 0.2155 | 0.3348 | 0.3405 | sentence-transformers/all-mpnet-base-v2 | False |
| BrightEarthScienceRetrieval | 0.1506 | 0.3452 | 0.4170 | lightonai/Reason-ModernColBERT | False |
| BrightEconomicsLongRetrieval | 0.1359 | 0.1408 | 0.2087 | BAAI/bge-large-en-v1.5 | False |
| BrightEconomicsRetrieval | 0.0706 | 0.1912 | 0.2455 | lightonai/Reason-ModernColBERT | False |
| BrightLeetcodeRetrieval | 0.2787 | 0.2793 | 0.3086 | lightonai/Reason-ModernColBERT | False |
| BrightPonyLongRetrieval | 0.0234 | 0.0284 | 0.0338 | minishlab/potion-multilingual-128M | False |
| BrightPonyRetrieval | 0.1302 | 0.0747 | 0.1517 | BAAI/bge-m3 | False |
| BrightPsychologyLongRetrieval | 0.0594 | 0.1555 | 0.1931 | BAAI/bge-m3 | False |
| BrightPsychologyRetrieval | 0.0879 | 0.2123 | 0.3104 | lightonai/Reason-ModernColBERT | False |
| BrightRoboticsLongRetrieval | 0.0792 | 0.1287 | 0.1238 | BAAI/bge-m3 | False |
| BrightRoboticsRetrieval | 0.1112 | 0.1547 | 0.2181 | lightonai/Reason-ModernColBERT | False |
| BrightStackoverflowLongRetrieval | 0.1581 | 0.1282 | 0.2350 | mteb/baseline-bm25s | False |
| BrightStackoverflowRetrieval | 0.0694 | 0.1365 | 0.2425 | lightonai/Reason-ModernColBERT | False |
| BrightSustainableLivingLongRetrieval | 0.081 | 0.1847 | 0.1852 | mteb/baseline-bm25s | False |
| BrightSustainableLivingRetrieval | 0.0961 | 0.1515 | 0.2021 | lightonai/Reason-ModernColBERT | False |
| BrightTheoremQAQuestionsRetrieval | 0.1296 | 0.1526 | 0.2004 | sentence-transformers/all-mpnet-base-v2 | False |
| BrightTheoremQATheoremsRetrieval | 0.0549 | 0.0762 | 0.1078 | sentence-transformers/all-mpnet-base-v2 | False |
| Average | 0.102 | 0.1708 | 0.2201 | nan | - |
Model has high performance on these tasks: BrightRoboticsLongRetrieval
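In the tables here, the flagged tasks are exactly those where the new model's score exceeds the "Max result" column. A Python sketch of that check on three rows from the table above (the rule is inferred from the data and the function name is hypothetical; the actual report logic may differ):

```python
# (task_name, manveertamber/cadet-embed-base-v1 score, Max result),
# copied from three rows of the table above.
rows = [
    ("BrightBiologyLongRetrieval", 0.2532, 0.2557),
    ("BrightPonyRetrieval", 0.0747, 0.1517),
    ("BrightRoboticsLongRetrieval", 0.1287, 0.1238),
]


def tasks_above_previous_max(rows):
    """Return the tasks where the model's score is strictly above the max."""
    return [task for task, score, max_result in rows if score > max_result]


flagged = tasks_above_previous_max(rows)
print(flagged)
```

Scores above the previous maximum are worth a second look, since they can indicate either a genuinely strong model or an evaluation problem.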
Training datasets: ArguAna, ArguAna-Fa, ArguAna-Fa.v2, ArguAna-NL, ArguAna-NL.v2, ArguAna-PL, ArguAna-VN, CMedQAv1-reranking, CMedQAv2-reranking, CmedqaRetrieval, CodeSearchNet, DBPedia, DBPedia-Fa, DBPedia-NL, DBPedia-PL, DBPedia-PLHardNegatives, DBPedia-VN, DBPediaHardNegatives, DBPediaHardNegatives.v2, DuRetrieval, FEVER, FEVER-FaHardNegatives, FEVER-NL, FEVER-VN, FEVERHardNegatives, FEVERHardNegatives.v2, HotpotQA, HotpotQA-Fa, HotpotQA-FaHardNegatives, HotpotQA-NL, HotpotQA-PL, HotpotQA-PLHardNegatives, HotpotQA-VN, HotpotQAHardNegatives, HotpotQAHardNegatives.v2, LeCaRDv2, MIRACLJaRetrievalLite, MIRACLReranking, MIRACLRetrieval, MIRACLRetrievalHardNegatives, MIRACLRetrievalHardNegatives.v2, MMarcoReranking, MSMARCO, MSMARCO-Fa, MSMARCO-FaHardNegatives, MSMARCO-PL, MSMARCO-PLHardNegatives, MSMARCO-VN, MSMARCOHardNegatives, MSMARCOv2, MrTidyRetrieval, MrTyDiJaRetrievalLite, NQ, NQ-Fa, NQ-FaHardNegatives, NQ-NL, NQ-PL, NQ-PLHardNegatives, NQ-VN, NQHardNegatives, NanoArguAnaRetrieval, NanoDBPedia-VN, NanoDBPediaRetrieval, NanoFEVER-VN, NanoFEVERRetrieval, NanoHotpotQA-VN, NanoHotpotQARetrieval, NanoMSMARCO-VN, NanoMSMARCORetrieval, NanoNQ-VN, NanoNQRetrieval, T2Reranking, T2Retrieval, mMARCO-NL
Note: Content truncated due to GitHub API limits. See the full report in the workflow artifacts.
This PR adds various results.
The target is
The goal is not to finish everything, simply add a large part of the results in one go.
Continuation of #528
Checklist
mteb/models/model_implementations/, this can be an API. Instructions on how to add a model can be found here