From b94aab56f20c49f3a26aa345a3564cb07b9ee00e Mon Sep 17 00:00:00 2001 From: voorhs Date: Tue, 22 Jul 2025 11:11:51 +0300 Subject: [PATCH 1/7] upd main page --- docs/source/index.rst | 45 +++++++++++++++++++++++-------------------- 1 file changed, 24 insertions(+), 21 deletions(-) diff --git a/docs/source/index.rst b/docs/source/index.rst index 976c5bb3a..39bfa078e 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -1,13 +1,15 @@ AutoIntent documentation ======================== -**AutoIntent** is an open source tool for automatic configuration of a text classification pipeline for intent prediction. +**AutoIntent** is an open source tool for automatic configuration of text classification pipelines, with specialized support for intent prediction. .. note:: This project is under active development. -The task of intent detection is one of the main subtasks in creating task-oriented dialogue systems, along with scriptwriting and slot filling. AutoIntent project offers users the following: +The task of intent detection is one of the main subtasks in creating task-oriented dialogue systems, along with scriptwriting and slot filling. While AutoIntent is particularly well-suited for intent detection, it can be applied to any text classification problem, including sentiment analysis, topic classification, document categorization, and other NLP tasks. + +AutoIntent project offers users the following: - A convenient library of methods for intent classification that can be used in a sklearn-like "fit-predict" format. - An AutoML approach to creating classifiers, where the only thing needed is to upload a set of labeled data. @@ -36,33 +38,34 @@ Example of building an intent classifier in a couple of lines of code: for match in glob("vector_db*"): shutil.rmtree(match) -Documentation Contents ----------------------- - -:doc:`Quickstart ` -.............................. - -It is recommended to begin with the :doc:`quickstart` page. 
It contains overview of our capabilities and basic instructions for working with our library. +Documentation Guide +------------------- -:doc:`Key Concepts ` -.............................. +Getting Started +............... -Key terms and concepts we use throughout our documentation. +:doc:`🚀 Quickstart ` + Jump right in! Install AutoIntent and build your first text classifier in minutes. Perfect for users who want to get up and running quickly with practical examples. -:doc:`User Guides` -................................ +:doc:`📚 Key Concepts ` + Essential terminology and concepts used throughout AutoIntent. Understanding these will help you navigate the documentation and make the most of the library's features. -A series of notebooks that demonstrate in detail and comprehensively the capabilities of our library and how to use it. +In-Depth Learning +................. -:doc:`API Reference ` -............................................... +:doc:`📖 User Guides ` + Comprehensive tutorials and examples that walk you through AutoIntent's capabilities step-by-step. These hands-on guides cover everything from basic usage to advanced techniques. -Pay special attention to the sections :doc:`autoapi/autointent/modules/index` and :doc:`autoapi/autointent/metrics/index`. +:doc:`🎓 Learn AutoIntent ` + Dive deeper into the theory behind AutoIntent. Learn about dialogue systems, AutoML principles, and the science that powers intelligent text classification. -:doc:`Learn AutoIntent` -.................................... +Reference +......... -Some theoretical background on dialogue systems and auto ML. +:doc:`🔧 API Reference ` + Complete technical documentation for all classes, methods, and functions. Essential reference for developers integrating AutoIntent into their applications. + + Key sections: :doc:`Modules ` | :doc:`Metrics ` ..
toctree:: From 8f7ae75ffb6015fb8e73ad5b909c174a3d4651eb Mon Sep 17 00:00:00 2001 From: voorhs Date: Tue, 22 Jul 2025 12:22:29 +0300 Subject: [PATCH 2/7] upd quickstart --- docs/source/conf.py | 1 + docs/source/quickstart.rst | 245 +++++++++++++++++++++++++++++-------- pyproject.toml | 1 + 3 files changed, 198 insertions(+), 49 deletions(-) diff --git a/docs/source/conf.py b/docs/source/conf.py index 5a5df0f7b..11c9d5764 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -50,6 +50,7 @@ "sphinx.ext.intersphinx", "sphinx_multiversion", "sphinx.ext.napoleon", + "sphinx_toolbox.collapse" ] templates_path = ["_templates"] diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst index 59654b0b3..dd737595b 100644 --- a/docs/source/quickstart.rst +++ b/docs/source/quickstart.rst @@ -1,10 +1,46 @@ Quickstart -=========== +========== + +Welcome to AutoIntent! This guide will get you up and running with intent classification in just a few minutes. + +What is AutoIntent? +------------------- + +AutoIntent is a powerful AutoML library for intent classification that automatically finds the best model architecture and hyperparameters for your text classification tasks. Whether you're building chatbots or text analysis pipelines, AutoIntent simplifies the process of creating high-performance intent classifiers. + +Key Features +------------ + +* ✨ **AutoML Pipeline**: Automated model selection and hyperparameter optimization +* 🔧 **Modular Design**: Use individual components or the full pipeline +* 📊 **Multiple Algorithms**: Support for classical neural networks, transformers, and traditional ML methods +* 📈 **Experiment Tracking**: Built-in support for Weights & Biases, TensorBoard and CodeCarbon Installation ------------ -AutoIntent can be installed using the package manager ``pip``: +Basic Installation +.................. + +AutoIntent is compatible with Python 3.10+. For core functionality: + +..
code-block:: bash + + pip install autointent + +With Experiment Tracking +........................ + +To include experiment tracking capabilities: + +.. code-block:: bash + + pip install autointent[wandb,codecarbon] + +Development Installation +........................ + +To install the latest development version: .. code-block:: bash @@ -12,95 +48,206 @@ AutoIntent can be installed using the package manager ``pip``: cd AutoIntent pip install . -The library is compatible with Python 3.10+. +Quick Example +------------- + +Here's a complete example that demonstrates AutoIntent's capabilities: + +.. testcode:: python + + from autointent import Dataset, PipelineOptimizer + + # Prepare your data + data = { + "train": [ + {"utterance": "I want to check my account balance", "label": 0}, + {"utterance": "How do I transfer money?", "label": 1}, + {"utterance": "What's my current balance?", "label": 0}, + {"utterance": "I need to send money to my friend", "label": 1}, + {"utterance": "Can you help me make a payment?", "label": 1}, + {"utterance": "Show me my transaction history", "label": 0} + ], + "validation": [ + {"utterance": "Display my account info", "label": 0}, + {"utterance": "I want to transfer funds", "label": 1} + ] + } + + # Load data into AutoIntent + dataset = Dataset.from_dict(data) + + # Initialize and train the AutoML pipeline + pipeline = PipelineOptimizer.from_preset("classic-light") + pipeline.fit(dataset) + + # Make predictions on new data + predictions = pipeline.predict([ + "What is my available balance?", + "Transfer money to John" + ]) + +That's it! AutoIntent will automatically find the best model for your data. Data Format ----------- -To work with AutoIntent, you need to format your training data in a specific way. You need to provide a training split containing samples with utterances and labels, as shown below: +AutoIntent expects your data in a simple dictionary format with train/validation/test splits: -.. 
code-block:: json +Single-Label Classification +........................... - { +.. code-block:: python + + data = { "train": [ - { - "utterance": "Hello!", - "label": 0 - } + {"utterance": "Hello, how are you?", "label": 0}, + {"utterance": "Book a flight to Paris", "label": 1}, + {"utterance": "What's the weather like?", "label": 2} + ], + "validation": [ # Optional + {"utterance": "Hi there!", "label": 0} + ], + "test": [ # Optional but highly recommended + {"utterance": "Good morning", "label": 0} ] } -For a multilabel dataset, the ``label`` field should be a list of integers representing the corresponding class labels. +Multi-Label Classification +.......................... -To use it with our Python API, you can use our :class:`autointent.Dataset` object. +For multi-label tasks, use lists of 0s and 1s: .. code-block:: python - from autointent import Dataset - - dataset = Dataset.from_dict({"train": [...]}) + data = { + "train": [ + {"utterance": "Book urgent flight to Paris", "label": [1, 0, 1]}, # booking=1, weather=0, urgent=1 + {"utterance": "What's the weather?", "label": [0, 1, 0]} + ] + } -To load a dataset from the file system into Python, the :meth:`autointent.Dataset.from_json` method exists: +Loading Data +............ .. code-block:: python - dataset = Dataset.from_json("/path/to/json") + from autointent import Dataset -AutoML goes brrr... -------------------- + # From dictionary + dataset = Dataset.from_dict(data) + + # From JSON file + dataset = Dataset.from_json("/path/to/your/data.json") + + # From Hugging Face Hub + dataset = Dataset.from_hub("your-username/your-dataset") + +AutoML Training +--------------- -Once the data is ready, you can start building the optimal classifier: +AutoIntent provides several preset configurations optimized for different scenarios: .. 
code-block:: python from autointent import PipelineOptimizer - pipeline_optimizer = PipelineOptimizer.from_preset("classic-light") - pipeline_optimizer.fit(dataset) + # Our quick and accurate SoTA + pipeline = PipelineOptimizer.from_preset("classic-light") -This code starts the hyperparameter search with preset :ref:`search space `. + # If you have more training time + pipeline = PipelineOptimizer.from_preset("classic-heavy") -As a result, ``runs`` folder will be created in the current working directory, which will save the selected classifier ready for inference. + # Experimental preset with fine-tuning methods + pipeline = PipelineOptimizer.from_preset("transformers-light") + # Train the pipeline + pipeline.fit(dataset) -Inference ---------- +Available Presets +................. + +- ``classic-light``: Fast training with traditional ML methods +- ``classic-heavy``: Comprehensive search with traditional methods +- ``nn-medium``: Classic neural network-based approaches (RNN, CNN) +- ``nn-heavy``: Comprehensive neural network optimization +- ``transformers-light``: Transformer models with limited search +- ``transformers-no-hpo``: Transformer models without hyperparameter optimization +- ``zero-shot-openai``: Zero-shot classification using OpenAI models +- ``zero-shot-transformers``: Zero-shot classification using transformer models + +Making Predictions +------------------- -To apply the built classifier to new data, you can use our Python API: +Once trained, use your pipeline for inference: .. 
code-block:: python - from autointent import Pipeline + # Single prediction + result = pipeline.predict(["I want to check my balance"]) + print(result) # [0] - pipeline = Pipeline.load("path/to/run/directory") - utterances = ["123", "hello world"] - prediction = pipeline.predict(utterances) + # Batch predictions + results = pipeline.predict([ + "What's my account balance?", + "Transfer $100 to John", + "Show me recent transactions" + ]) + print(results) # [0, 1, 0] -Modular Approach ----------------- -If there is no need to iterate over pipelines and hyperparameters, you can import classification methods directly. +Direct Module Usage +------------------- -.. code-block:: python +For more control, use individual components without AutoML: + +.. testcode:: python from autointent.modules import KNNScorer - scorer = KNNScorer(embedder_name="sergeyzh/rubert-tiny-turbo", k=1) + # Initialize a specific scorer + scorer = KNNScorer( + embedder_config="sentence-transformers/all-MiniLM-L6-v2", + k=3 + ) + + # Train on your data train_utterances = [ - "why is there a hold on my american saving bank account", - "i am not sure why my account is blocked", - "why is there a hold on my capital one checking account", + "Check my account balance", + "Transfer money to account", + "Show transaction history" ] - train_labels = [0, 2, 1] + train_labels = [0, 1, 0] + scorer.fit(train_utterances, train_labels) - test_utterances = [ - "i think my account is blocked but i do not know the reason", - "can you tell me why is my bank account frozen", - ] - scorer.predict(test_utterances) -Further Reading ---------------- + # Make predictions + predictions = scorer.predict([ + "What's my current balance?", + "Send money to my friend" + ]) + +Available Modules +................. 
+ +- **Scoring**: :class:`autointent.modules.KNNScorer`, :class:`autointent.modules.BertScorer`, :class:`autointent.modules.SklearnScorer`, :class:`autointent.modules.CatBoostScorer` +- **Decision**: :class:`autointent.modules.ArgmaxDecision`, :class:`autointent.modules.TunableDecision`, :class:`autointent.modules.AdaptiveDecision` + +See more at API reference :doc:`Modules `. + +Next Steps +---------- + +🚀 **Ready to dive deeper?** + +- **Concepts**: Learn about :doc:`concepts` and AutoIntent's architecture +- **Tutorials**: Follow our step-by-step guides in :doc:`user_guides` +- **Advanced Usage**: Explore custom configurations and advanced features +- **Examples**: Check out real-world examples in our `GitHub repository `_ + +🛠️ **Need Help?** + +- Report issues on our `GitHub Issues `_ +- Join our community discussions +- Check out the full API reference -- Get familiar with :doc:`concepts`. -- Check out the guide on basic Python API Usage: :doc:`user_guides/index_basic_usage` \ No newline at end of file +Happy intent classification! 🎯 \ No newline at end of file diff --git a/pyproject.toml b/pyproject.toml index 84c9468fb..213166af3 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -86,6 +86,7 @@ docs = [ "ipykernel (>=6.29.5,<7.0.0)", "tensorboardx (>=2.6.2.2,<3.0.0)", "sphinx-multiversion (>=0.2.4,<1.0.0)", + "sphinx-toolbox (>=4.0.0, <5.0.0)" ] dspy = [ "dspy (>=2.6.5,<3.0.0)", From 661576e88520d77f9dc6bf0ca01e4832ec3bcd02 Mon Sep 17 00:00:00 2001 From: voorhs Date: Tue, 22 Jul 2025 16:32:39 +0300 Subject: [PATCH 3/7] fix quickstart doctests --- docs/source/quickstart.rst | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst index dd737595b..441ce2a09 100644 --- a/docs/source/quickstart.rst +++ b/docs/source/quickstart.rst @@ -55,7 +55,7 @@ Here's a complete example that demonstrates AutoIntent's capabilities: ..
testcode:: python - from autointent import Dataset, PipelineOptimizer + from autointent import Dataset, Pipeline # Prepare your data data = { @@ -65,7 +65,13 @@ Here's a complete example that demonstrates AutoIntent's capabilities: {"utterance": "What's my current balance?", "label": 0}, {"utterance": "I need to send money to my friend", "label": 1}, {"utterance": "Can you help me make a payment?", "label": 1}, - {"utterance": "Show me my transaction history", "label": 0} + {"utterance": "Show me my transaction history", "label": 0}, + {"utterance": "Can you show me my account details?", "label": 0}, + {"utterance": "I want to send funds to someone", "label": 1}, + {"utterance": "What is my available balance?", "label": 0}, + {"utterance": "How can I make a transfer?", "label": 1}, + {"utterance": "Please help me with a payment", "label": 1}, + {"utterance": "I need to view my recent transactions", "label": 0} ], "validation": [ {"utterance": "Display my account info", "label": 0}, @@ -77,7 +83,7 @@ Here's a complete example that demonstrates AutoIntent's capabilities: dataset = Dataset.from_dict(data) # Initialize and train the AutoML pipeline - pipeline = PipelineOptimizer.from_preset("classic-light") + pipeline = Pipeline.from_preset("classic-light") pipeline.fit(dataset) # Make predictions on new data @@ -149,16 +155,16 @@ AutoIntent provides several preset configurations optimized for different scenar .. 
code-block:: python - from autointent import PipelineOptimizer + from autointent import Pipeline # Our quick and accurate SoTA - pipeline = PipelineOptimizer.from_preset("classic-light") + pipeline = Pipeline.from_preset("classic-light") # If you have more training time - pipeline = PipelineOptimizer.from_preset("classic-heavy") + pipeline = Pipeline.from_preset("classic-heavy") # Experimental preset with fine-tuning methods - pipeline = PipelineOptimizer.from_preset("transformers-light") + pipeline = Pipeline.from_preset("transformers-light") # Train the pipeline pipeline.fit(dataset) @@ -182,17 +188,12 @@ Once trained, use your pipeline for inference: .. code-block:: python - # Single prediction - result = pipeline.predict(["I want to check my balance"]) - print(result) # [0] - # Batch predictions results = pipeline.predict([ "What's my account balance?", "Transfer $100 to John", "Show me recent transactions" ]) - print(results) # [0, 1, 0] Direct Module Usage From e509f53099ce28c7dd62ce0d5304f417bebd54ff Mon Sep 17 00:00:00 2001 From: voorhs Date: Tue, 22 Jul 2025 16:49:19 +0300 Subject: [PATCH 4/7] upd concepts --- docs/source/concepts.rst | 125 ++++++++++++++++++++++++++++++++++----- 1 file changed, 109 insertions(+), 16 deletions(-) diff --git a/docs/source/concepts.rst b/docs/source/concepts.rst index 11e69eb9b..4dee46534 100644 --- a/docs/source/concepts.rst +++ b/docs/source/concepts.rst @@ -1,30 +1,123 @@ +============ Key Concepts ============ -.. _key-search-space: +This page introduces the fundamental concepts that underpin AutoIntent's design and functionality. Understanding these concepts will help you effectively use the framework and make informed decisions about your text classification projects. + +.. 
_concepts-pipeline: + +Three-Stage Pipeline Architecture +================================= + +AutoIntent organizes text classification into a modular three-stage pipeline, providing clear separation of concerns and flexibility in optimization: + +**🔤 Embedding Stage** + Transforms raw text into dense vector representations using pre-trained transformer models. This stage handles the computationally intensive text encoding and can be optimized independently from downstream classification tasks. + +**📊 Scoring Stage** + Processes embeddings to predict class probabilities. This stage supports diverse approaches from classical machine learning (KNN, logistic regression) to deep learning models (BERT fine-tuning, CNNs). All models operate on pre-computed embeddings for efficiency. + +**⚖️ Decision Stage** + Converts predicted probabilities into final classifications by applying thresholds and decision rules. This stage is crucial for multi-label classification and out-of-scope detection scenarios. + +This modular design enables efficient experimentation, allows reusing expensive embedding computations across different models, and supports deployment on CPU-only systems. + +.. _concepts-automl: + +AutoML Optimization Strategy +============================ + +AutoIntent employs a hierarchical optimization approach that balances exploration with computational efficiency: + +**🔧 Module-Level Optimization** + Components are optimized sequentially: embedding → scoring → decision. Each stage builds upon the best model from the previous stage, creating a cohesive pipeline while preventing combinatorial explosion. + +**🤖 Model-Level Optimization** + Within each module, both model architectures and hyperparameters are jointly optimized using Optuna's Tree-structured Parzen Estimators and random sampling.
+ +**🗺️ Search Space Configuration** + Optimization behavior is controlled through dictionary-like search spaces that define: + + - Available model types and their hyperparameter ranges + - Optimization budget and resource constraints + - Cross-validation and evaluation strategies + +.. _concepts-embedding-centric: + +Embedding-Centric Design +======================== + +AutoIntent's architecture centers around transformer-based text embeddings, providing several key advantages: + +**⚡ Pre-computed Embeddings** + Text is encoded once and reused across all scoring models, dramatically reducing computational overhead during hyperparameter optimization and enabling efficient experimentation. + +**🤗 Model Repository Integration** + Seamless access to thousands of pre-trained models from Hugging Face Hub, with intelligent selection strategies based on retrieval metrics or downstream task performance. + +**🚀 Deployment Flexibility** + Separation of embedding generation from classification enables deploying lightweight classifiers on resource-constrained systems while leveraging powerful transformer representations. + +.. _concepts-multiclass-multilabel: + +Classification Paradigms +======================== + +AutoIntent supports various classification scenarios through its flexible decision module: + +**🏷️ Multi-Class Classification** + Traditional single-label classification where each input belongs to exactly one class. Uses argmax or threshold-based decisions on predicted probabilities. + +**🔖 Multi-Label Classification** + Each input can belong to multiple classes simultaneously. Employs adaptive thresholding strategies that can be sample-specific or learned globally across the dataset. + + +..
_concepts-oos: + +Out-of-Scope Detection +====================== + +A critical capability for production text classification systems, especially in conversational AI: + +**📏 Confidence Thresholding** + Uses predicted probability scores to identify inputs that don't belong to any known class. Threshold values can be tuned automatically to balance precision and recall. + +**🔗 Integration with Multi-Label** + OOS detection works seamlessly with multi-label scenarios, enabling detection of completely unknown inputs vs. partial matches to known classes. + +.. _concepts-presets: + +Optimization Presets +==================== + +AutoIntent provides predefined optimization strategies that balance quality, speed, and resource consumption: -Optimization Search Space -------------------------- +**⚡ Zero-Shot Presets** + Leverage class descriptions and large language models for classification without training data. Ideal for rapid prototyping and cold-start scenarios. -The automatic selection of a classifier occurs through the iteration of hyperparameters within a certain *search space*. Conceptually, this search space is a dictionary where the keys are the names of the hyperparameters, and the values are lists. The hyperparameters act as the coordinate "axes" of the search space, and the values in the lists act as points on this axis. +**📈 Classic Presets** + Focus on traditional ML approaches (KNN, linear models, tree-based methods) operating on transformer embeddings. Offer excellent balance of performance and efficiency. -.. _key-stages: +**🧠 Neural Network Presets** + Include deep learning approaches like CNN, RNN, and transformer fine-tuning. Provide highest potential performance at increased computational cost. -Classification Stages ---------------------- +**🪜 Computational Tiers** + Each preset family offers light, medium, and heavy variants that trade optimization time for potential performance improvements.
-Intent classification can be divided into two stages: scoring and decision. Scoring involves predicting the probabilities of the presence of each intent in a given utterance. Prediction involves forming the final decision based on the provided probabilities. +.. _concepts-modularity: -.. _key-oos: +Modular Architecture +==================== -Out-of-domain utterances ------------------------- +AutoIntent's design emphasizes modularity and extensibility: -If we want to detect out-of-domain examples, it is necessary to set a probability threshold during the decision stage, at which the presence of some known intent can be asserted. +**🧩 Plugin Architecture** + Each component (embedding models, scoring methods, decision strategies) implements a common interface, enabling easy addition of new approaches without modifying core framework code. -.. _key-nodes-modules: +**βš™οΈ Configuration-Driven** + All aspects of optimization can be controlled through declarative configuration files, supporting reproducible experiments and easy sharing of optimization strategies. -Nodes and Modules ------------------ +**πŸ”§ Extensibility** + Framework can be extended with custom embedding models, scoring algorithms, and decision strategies while maintaining compatibility with the AutoML optimization pipeline. -The scoring or decision model, along with its hyperparameters that need to be iterated, is called an *optimization module*. A set of modules related to one optimization stage (scoring or decision) is called an *optimization node*. +This modular design ensures that AutoIntent can evolve with advances in NLP research while maintaining stability and backward compatibility for existing users. 
From 7d029db62eef28b3b5e3c5a066ce2b838dfc5485 Mon Sep 17 00:00:00 2001 From: voorhs Date: Tue, 22 Jul 2025 17:35:27 +0300 Subject: [PATCH 5/7] add page about text embeddings --- docs/source/learn/index.rst | 4 +- docs/source/learn/text_embeddings.rst | 104 ++++++++++++++++++++++++++ 2 files changed, 106 insertions(+), 2 deletions(-) create mode 100644 docs/source/learn/text_embeddings.rst diff --git a/docs/source/learn/index.rst b/docs/source/learn/index.rst index 68c2fee5d..722a44a1b 100644 --- a/docs/source/learn/index.rst +++ b/docs/source/learn/index.rst @@ -1,5 +1,5 @@ -Learn AutoIntent -================ +Learn +===== .. toctree:: :glob: diff --git a/docs/source/learn/text_embeddings.rst b/docs/source/learn/text_embeddings.rst new file mode 100644 index 000000000..f3b92b817 --- /dev/null +++ b/docs/source/learn/text_embeddings.rst @@ -0,0 +1,104 @@ +Text Embeddings and Representation Learning +=========================================== + +In this section, you will learn about the theoretical foundations of text embeddings and how AutoIntent leverages them for efficient intent classification. + +What are Text Embeddings? +-------------------------- + +Text embeddings are dense vector representations of text that capture semantic meaning in a continuous vector space. Unlike traditional bag-of-words approaches that treat words as discrete tokens, embeddings map text to points in a high-dimensional space where semantically similar texts are located close to each other. + +**Mathematical Foundation** + +An embedding function :math:`f: \mathcal{T} \rightarrow \mathbb{R}^d` maps text :math:`t \in \mathcal{T}` to a dense vector :math:`\mathbf{e} \in \mathbb{R}^d`, where :math:`d` is the embedding dimension (typically 384, 768, or 1024). The key property is that semantic similarity in text space translates to geometric proximity in embedding space: + +.. 
math:: + \text{semantic_similarity}(t_1, t_2) \approx \cos(\mathbf{e}_1, \mathbf{e}_2) + +where :math:`\cos(\mathbf{e}_1, \mathbf{e}_2) = \frac{\mathbf{e}_1 \cdot \mathbf{e}_2}{||\mathbf{e}_1|| \cdot ||\mathbf{e}_2||}` + +Transformer-Based Embeddings +----------------------------- + +AutoIntent primarily uses transformer-based embedding models, which have revolutionized natural language processing through their attention mechanisms and contextual representations. + +**Sentence Transformers** + +The library leverages the `sentence-transformers `_ framework, which provides pre-trained models specifically optimized for semantic similarity tasks. These models are fine-tuned versions of BERT, RoBERTa, or other transformer architectures that produce high-quality sentence-level embeddings. + +**Key Advantages:** + +1. **Contextual Understanding**: Unlike word2vec or GloVe, transformer embeddings understand context. The word "bank" will have different representations in "river bank" vs. "money bank." + +2. **Cross-lingual Capabilities**: Many models support multiple languages, crucial for dialog systems serving diverse users. + +3. **Task Adaptation**: Models can be fine-tuned for specific domains or similarity tasks. + +**Model Types in AutoIntent:** + +- **Bi-encoders**: Encode texts independently, enabling efficient pre-computation and caching +- **Cross-encoders**: Process text pairs jointly for higher accuracy but at computational cost + + +Task-Specific Prompting +----------------------- + +AutoIntent supports task-specific prompts to optimize embedding quality for different use cases. + +Different tasks may benefit from different prompting strategies: + +.. 
code-block:: python + + # Query prompt for search + query_embeddings = embedder.embed(queries, TaskTypeEnum.query) + + # Passage prompt for documents + doc_embeddings = embedder.embed(documents, TaskTypeEnum.passage) + + # Classification prompt for intents + intent_embeddings = embedder.embed(utterances, TaskTypeEnum.classification) + +Embedding Quality and Evaluation +--------------------------------- + +AutoIntent evaluates embedding quality using retrieval metrics: + +- **NDCG** (Normalized Discounted Cumulative Gain) +- **Hit Rate** (Proportion of relevant items in top-k results) +- **Precision@k** and **Recall@k** + + +Practical Applications in Dialog Systems +----------------------------------------- + +**Intent Classification Pipeline** + +1. **User utterance**: "I want to book a flight to Paris" +2. **Embedding**: Convert to 768-dimensional vector +3. **Similarity search**: Find nearest training examples +4. **Classification**: Use embedding-based classifier (KNN, linear, etc.) +5. **Decision**: Apply confidence thresholds for final prediction + +**Zero-Shot Classification** + +Using intent descriptions for classification without training data: + +.. code-block:: python + + from autointent.modules.scoring import BiEncoderDescriptionScorer + + scorer = BiEncoderDescriptionScorer() + + # Intent descriptions instead of training data + descriptions = [ + "User wants to book a flight", + "User wants to cancel a reservation", + "User asks about flight status" + ] + + scorer.fit([], [], descriptions) + predictions = scorer.predict(["I want to fly to London"]) + +**Few-Shot Learning** + +Embeddings excel in few-shot scenarios where limited training data is available. AutoIntent's k-NN based methods are particularly effective. 
\ No newline at end of file From 92992d52e4ee7e7971e98c361c5f9634e85b8c89 Mon Sep 17 00:00:00 2001 From: voorhs Date: Tue, 22 Jul 2025 17:35:44 +0300 Subject: [PATCH 6/7] add page on automl theory --- docs/source/learn/automl_theory.rst | 243 ++++++++++++++++++++++++++++ docs/source/learn/optimization.rst | 1 - 2 files changed, 243 insertions(+), 1 deletion(-) create mode 100644 docs/source/learn/automl_theory.rst diff --git a/docs/source/learn/automl_theory.rst b/docs/source/learn/automl_theory.rst new file mode 100644 index 000000000..3f3dac809 --- /dev/null +++ b/docs/source/learn/automl_theory.rst @@ -0,0 +1,243 @@ +AutoML and Hyperparameter Optimization +====================================== + +This section provides a deep dive into the theoretical foundations of automated machine learning (AutoML) and hyperparameter optimization as implemented in AutoIntent. + +The Hyperparameter Optimization Problem +--------------------------------------- + +**The Core Problem** + +Hyperparameter optimization is about finding the best configuration of settings that maximizes model performance. Think of it as searching through all possible combinations of hyperparameters (like learning rates, model sizes, regularization strengths) to find the combination that gives the best results on validation data. + +The performance metric is typically estimated through cross-validation to avoid overfitting - we want configurations that work well on unseen data, not just the training data. + +**The Challenge of Combinatorial Explosion** + +In AutoIntent's three-stage pipeline, the total search space grows multiplicatively across all stages. If we have: + +- 10 different embedding models to choose from +- 20 different scoring configurations +- 5 different decision strategies + +Then we have 10 Γ— 20 Γ— 5 = 1,000 total combinations. 
In realistic scenarios, this can easily exceed 1,000,000 configurations, making it impossible to test every combination within reasonable time and computational budgets. + +Hierarchical Optimization Strategy +---------------------------------- + +AutoIntent addresses combinatorial explosion through a **hierarchical greedy optimization** approach that optimizes modules sequentially. + +**Sequential Module Optimization** + +The optimization proceeds in three stages, where each stage builds on the results of the previous one: + +1. **Embedding Optimization**: First, find the best embedding model configuration by testing different models and settings, evaluating them using retrieval or classification metrics. + +2. **Scoring Optimization**: Using the best embedding model from step 1, now optimize the scoring module by testing different classifiers (KNN, linear, neural networks, etc.) with various hyperparameters. + +3. **Decision Optimization**: Using the best embedding and scoring combination from steps 1-2, optimize the decision module by finding optimal thresholds and decision strategies for final predictions. + +**Proxy Metrics** + +Each stage uses specialized proxy metrics that correlate with final performance: + +- **Embedding Stage**: Retrieval metrics (NDCG, hit rate) or lightweight classification accuracy +- **Scoring Stage**: Classification metrics (F1, ROC-AUC) on validation data +- **Decision Stage**: Threshold-specific metrics for multi-label/OOS scenarios + +**Trade-offs** + +- ✅ **Computational Efficiency**: Instead of testing all possible combinations (whose number is the product of the per-stage counts), we only test the options within each stage separately, so the cost is the sum of the per-stage counts (10 + 20 + 5 = 35 evaluations rather than 1,000 in the example above). This makes optimization much faster and more manageable. +- ✅ **Parallelization**: Each stage can be parallelized independently, allowing multiple configurations to be tested simultaneously. 
+- ⚠️ **Local Optimality**: May miss globally optimal combinations due to greedy choices - the best embedding might work better with a different scorer than the one we pick, but we won't discover this combination. + +Tree-Structured Parzen Estimators (TPE) +---------------------------------------- + +AutoIntent uses Optuna's TPE algorithm for sophisticated hyperparameter optimization within each module. This is a form of Bayesian optimization that learns from previous trials to make smarter choices about which hyperparameters to try next. + +**How TPE Works** + +TPE builds two separate models: + +- **Good Configuration Model**: Learns the distribution of hyperparameters that led to good performance (typically the top 25% of trials) +- **Bad Configuration Model**: Learns the distribution of hyperparameters that led to poor performance (the remaining 75% of trials) + +The algorithm then suggests new hyperparameters by finding configurations that are likely under the "good" model but unlikely under the "bad" model. This naturally balances exploration (trying untested areas) with exploitation (focusing on promising regions). 
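The density-ratio idea described above can be illustrated with a toy one-dimensional sampler. This is a simplified sketch for intuition only, not Optuna's actual implementation: the function names (`parzen_density`, `tpe_suggest`), the fixed Gaussian bandwidth, the number of candidates, and the 25% split are all assumptions made for the example.

```python
import math
import random


def parzen_density(x, points, bandwidth):
    """Average of Gaussian kernels centred on `points` (a simple Parzen estimator)."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
    return total / (len(points) * norm)


def tpe_suggest(trials, low, high, gamma=0.25, n_candidates=24, bandwidth=0.1):
    """Suggest the next value to try, given `trials` as (value, score) pairs.

    Higher scores are better. Hypothetical helper, for illustration only.
    """
    if len(trials) < 2:
        # Not enough history to build two models: fall back to random search.
        return random.uniform(low, high)

    # Split the history into "good" (top gamma fraction) and "bad" (the rest).
    ranked = sorted(trials, key=lambda t: t[1], reverse=True)
    n_good = max(1, math.ceil(gamma * len(ranked)))
    good = [x for x, _ in ranked[:n_good]]
    bad = [x for x, _ in ranked[n_good:]]

    # Draw candidates from the "good" model: pick a good point, add kernel noise.
    candidates = []
    for _ in range(n_candidates):
        value = random.gauss(random.choice(good), bandwidth)
        candidates.append(min(high, max(low, value)))

    # Keep the candidate that is most likely under "good" relative to "bad".
    def ratio(x):
        return parzen_density(x, good, bandwidth) / (parzen_density(x, bad, bandwidth) + 1e-12)

    return max(candidates, key=ratio)
```

Calling `tpe_suggest` in a loop, evaluating each suggestion, and appending the `(value, score)` pair to `trials` gives a minimal optimization loop. Real implementations additionally handle categorical and conditional parameters and adapt the kernel bandwidths to the data.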
+ +**Benefits of TPE** + +- **Smart Sampling**: After initial random trials, TPE makes increasingly informed decisions about which hyperparameters to try +- **Handles Different Parameter Types**: Works well with categorical, continuous, and integer parameters +- **Robust to Noisy Evaluations**: Can handle situations where the same hyperparameters might give slightly different results due to randomness +- **No Prior Knowledge Required**: Works without needing to specify complex relationships between parameters + +Search Space Design +------------------- + +**Parameter Types** + +AutoIntent supports several types of hyperparameters, each with an appropriate sampling strategy: + +**Categorical Parameters**: These are discrete choices from a fixed set of options, like choosing between different model types ("knn", "linear", "bert") or activation functions ("relu", "tanh", "sigmoid"). The optimizer samples uniformly from the available choices. + +**Continuous Parameters**: These are real-valued parameters like learning rates, regularization strengths, or temperature values. The optimizer can sample from uniform distributions (for parameters like dropout rates between 0.0 and 1.0) or log-uniform distributions (for parameters like learning rates that work better on logarithmic scales). + +**Integer Parameters**: These are whole number parameters like the number of neighbors in KNN, hidden dimensions in neural networks, or batch sizes. The optimizer can specify step sizes and bounds to ensure valid configurations. + +**Conditional Parameters**: Some parameters only make sense when certain other parameters have specific values. For example, LoRA-specific parameters (like lora_alpha and lora_r) only apply when the model type is "lora". AutoIntent handles these dependencies automatically in the search space configuration. + + +**Search Space Configuration** + +.. 
code-block:: yaml + + search_space: + - node_type: scoring + target_metric: scoring_f1 + search_space: + - module_name: knn + k: + low: 1 + high: 20 + weights: [uniform, distance, closest] + - module_name: linear + cv: [3, 5, 10] + +Cross-Validation and Data Splitting +----------------------------------- + +**Validation Schemes** + +AutoIntent supports multiple validation strategies to ensure robust hyperparameter selection: + +**Hold-out Validation (HO)** + +Split data into training and validation sets once. Train the model on the training set and evaluate performance on the validation set. This gives a single performance score for each hyperparameter configuration. + +**Cross-Validation (CV)** + +Split data into K folds (typically 3-5). For each fold, train on the remaining folds and validate on the current fold. Average the performance scores across all K folds to get a more robust estimate of how well the hyperparameters work. + +**Stratified Splitting** + +For imbalanced datasets, AutoIntent uses stratified sampling to maintain class distributions: + +.. code-block:: python + + from autointent.configs import DataConfig + + data_config = DataConfig( + scheme="cv", # Cross-validation + n_folds=5, # 5-fold CV + validation_size=0.2, # 20% for validation in HO + separation_ratio=0.5 # Prevent data leakage between modules + ) + +**Data Leakage Prevention** + +The ``separation_ratio`` parameter prevents information leakage between scoring and decision modules by using different data subsets for each stage. + +**Hyperparameter Bounds** + +Search spaces include reasonable bounds to prevent extreme configurations: + +.. 
code-block:: yaml + + learning_rate: + low: 1.0e-5 # Prevent too slow learning + high: 1.0e-2 # Prevent instability + log: true # Log-uniform sampling + +Multi-Objective Optimization Considerations +-------------------------------------------- + +While AutoIntent primarily optimizes single metrics, it considers multiple objectives implicitly: + +**Performance vs. Efficiency Trade-offs** + +- **Model size**: Smaller models for deployment efficiency +- **Training time**: Faster models for rapid iteration +- **Inference speed**: Optimized for production latency + +**Presets as Multi-Objective Solutions** + +AutoIntent provides presets that balance different objectives: + +.. code-block:: python + + # Different computational budgets + pipeline_light = Pipeline.from_preset("classic-light") # Speed-focused + pipeline_heavy = Pipeline.from_preset("classic-heavy") # Performance-focused + + # Different model types + pipeline_zero_shot = Pipeline.from_preset("zero-shot-transformers") # No training data + +Bayesian Optimization Theory +----------------------------- + +**Gaussian Process Surrogate Models** + +While TPE uses tree-structured models, the general Bayesian optimization framework uses Gaussian Processes as surrogate models. These are probabilistic models that learn to predict performance based on previous trials, including uncertainty estimates about unexplored regions of the hyperparameter space. + +**Exploration vs. Exploitation** + +Bayesian optimization balances: + +- **Exploitation**: Sampling near known good configurations +- **Exploration**: Sampling in uncertain regions of the space + +The acquisition function mathematically encodes this trade-off. 
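To make the trade-off concrete, one widely used acquisition function in Gaussian-process-based Bayesian optimization is expected improvement (EI). The sketch below is illustrative only; the text above does not say which acquisition function is used, and TPE itself works with a density ratio rather than this formula. The function name and the `xi` margin parameter are assumptions for the example.

```python
import math


def expected_improvement(mean, std, best_so_far, xi=0.0):
    """Expected improvement for maximization.

    `mean` and `std` describe the surrogate's Gaussian prediction at a
    candidate point; `best_so_far` is the best score observed so far.
    """
    improvement = mean - best_so_far - xi
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(improvement, 0.0)
    z = improvement / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    # First term rewards a high predicted mean (exploitation),
    # second term rewards high uncertainty (exploration).
    return improvement * cdf + std * pdf
```

A candidate whose predicted mean only matches the current best still receives a positive score when its uncertainty is large, which is what drives the optimizer toward unexplored regions.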
+ +**Convergence Properties** + +TPE and related algorithms have theoretical guarantees for convergence to global optima under certain conditions, though practical performance depends on: + +- Search space dimensionality +- Function smoothness +- Available computational budget + +Practical Optimization Strategies +---------------------------------- + +**Budget Allocation** + +.. code-block:: python + + hpo_config = HPOConfig( + sampler="tpe", + n_trials=50, # Total optimization budget + n_startup_trials=10, # Random initialization + timeout=3600, # 1-hour time limit + n_jobs=4 # Parallel trials + ) + +**Warm Starting** + +AutoIntent can resume interrupted optimization. This is approximately the code we use for creating Optuna studies: + +.. code-block:: python + + # Optimization state is automatically saved + study = optuna.create_study( + study_name="intent_classification", + storage="sqlite:///optuna.db", + load_if_exists=True + ) + +Advanced Topics +--------------- + +**Meta-Learning** + +AutoIntent's presets can be viewed as meta-learning solutions - configurations that work well across diverse datasets based on empirical analysis. + +**Neural Architecture Search (NAS)** + +While not fully implemented, AutoIntent's modular design supports architecture search within model families (e.g., different CNN configurations). + +**Automated Feature Engineering** + +AutoIntent's embedding-centric design can be seen as automated feature engineering - the system automatically obtains relevant representations by selecting the best-fitting embedding model. diff --git a/docs/source/learn/optimization.rst b/docs/source/learn/optimization.rst index 158d21f59..2b9f24b4c 100644 --- a/docs/source/learn/optimization.rst +++ b/docs/source/learn/optimization.rst @@ -43,4 +43,3 @@ This is similar to random search over a subset, but during the search, we attemp This approach is more sophisticated and can lead to better results by intelligently exploring the hyperparameter space. 
-The implementation of Bayesian optimization is planned for release v0.1.0. From b95ee6284f2fa560763fafd79b1e379014ce0a5f Mon Sep 17 00:00:00 2001 From: voorhs Date: Tue, 22 Jul 2025 19:09:52 +0300 Subject: [PATCH 7/7] update dialogue systems page --- docs/source/learn/dialogue_systems.rst | 409 +++++++++++++++++++++++-- docs/source/learn/optimization.rst | 2 +- 2 files changed, 387 insertions(+), 24 deletions(-) diff --git a/docs/source/learn/dialogue_systems.rst b/docs/source/learn/dialogue_systems.rst index cf581234d..7e0ab2d63 100644 --- a/docs/source/learn/dialogue_systems.rst +++ b/docs/source/learn/dialogue_systems.rst @@ -1,36 +1,399 @@ -Dialogue Systems -================ +Dialogue Systems Theory and Practice +==================================== -In this section, you will get acquainted with the basics of building dialogue systems. +In this section, you will learn about the theoretical foundations and practical challenges of building dialogue systems, with a focus on how AutoIntent addresses these challenges. -Intents -------- +What are Dialogue Systems? +-------------------------- -A dialogue system, in a broad sense, is a textual interface for interacting with a service (be it a food ordering service or a service for obtaining information about a bank account). Typically, the service supports a finite number of API methods that are invoked during the dialogue with the user. To determine which method is needed at a given moment in the dialogue, intent classifiers are used. If we reason in terms of machine learning, this is a text classification task. +A dialogue system is a computational framework that enables natural language interaction between humans and machines. These systems serve as intelligent interfaces that can understand user requests, maintain conversation context, and provide appropriate responses or actions. 
-A good intent classifier should consider the specifics of the dialogue system creation task: +**πŸ“‹ Types of Dialogue Systems** -- Domain multiplicity. The number of API methods can be large enough to train the classifier yourself. -- Detection of out-of-domain examples. It is necessary to handle cases where the user expresses unsupported intents. -- Intent multiplicity. At one point in the dialogue, for some tasks, several complementary intents may arise at once, and then, in terms of machine learning, the task reduces to multilabel classification. -- A vast set of existing methods and their hyperparameters. As they say, "for this task, you can go through many hyperparameters, and you will be going through these hyperparameters." -- Scarcity of the training sample. Collecting a diverse sample of examples of replicas and even more so of entire dialogues is quite difficult. -- Using ML classifiers together with a rule-based approach. +**🎯 Task-Oriented Systems**: Designed to help users accomplish specific tasks like booking flights, making restaurant reservations, or accessing bank account information. These systems typically have well-defined goals and operate within limited domains. -Four out of five problems listed in this list are solved by using the AutoIntent library! +**πŸ’¬ Open-Domain Chatbots**: Designed for general conversation without specific task constraints. Examples include social chatbots and virtual companions that can discuss various topics. -Slots ------ +**❓ Question-Answering Systems**: Focused on providing factual answers to user questions, often by retrieving information from knowledge bases or documents. -.. todo:: +**πŸ”€ Hybrid Systems**: Combine multiple approaches, supporting both task-oriented interactions and general conversation. 
- someday +Core Components of Dialogue Systems +----------------------------------- +Modern dialogue systems typically consist of several interconnected components: -Script ------- +**🧠 Natural Language Understanding (NLU)** -.. todo:: +The NLU component processes user input and extracts structured meaning, typically including: - someday - \ No newline at end of file +- **🎯 Intent Classification**: Determining what the user wants to do +- **🏷️ Entity Extraction**: Identifying specific pieces of information (names, dates, locations) +- **😊 Sentiment Analysis**: Understanding the user's emotional state or attitude + +**πŸŽ›οΈ Dialogue Management** + +This component maintains conversation state and decides what action to take next: + +- **πŸ“Š State Tracking**: Keeping track of what has been discussed and what information is needed +- **🧭 Policy Learning**: Deciding what response or action is most appropriate given the current state +- **πŸ’­ Context Management**: Handling multi-turn conversations and maintaining dialogue history + +**✍️ Natural Language Generation (NLG)** + +Converts system decisions into natural language responses that users can understand. + +**πŸ”Œ Backend Integration** + +Connects to external services, databases, or APIs to fulfill user requests. + +Intent Classification: The Heart of NLU +--------------------------------------- + +Intent classification is arguably the most critical component of task-oriented dialogue systems. It determines which service or action the user wants to invoke. + +**🎯 What is an Intent?** + +An intent represents the user's goal or purpose behind an utterance. For example: + +- "Book a flight to Paris" β†’ `book_flight` intent +- "What's my account balance?" 
β†’ `check_balance` intent +- "Cancel my reservation" β†’ `cancel_booking` intent + +**πŸ€– Intent Classification as Machine Learning** + +From a technical perspective, intent classification is a text classification problem where: + +- **πŸ“₯ Input**: User utterance (text) +- **πŸ“€ Output**: Intent class (category) +- **πŸ“š Training Data**: Examples of utterances paired with their corresponding intents + +**⚠️ Unique Challenges in Dialogue Systems** + +Intent classification in dialogue systems faces several challenges that distinguish it from general text classification: + +**1️⃣ Domain Complexity and Scale** + +Real-world dialogue systems often need to handle dozens or hundreds of different intents. A banking chatbot might support intents like `transfer_money`, `check_balance`, `report_fraud`, `apply_for_loan`, `find_atm`, `update_personal_info`, and many others. This scale makes manual rule-based approaches impractical. + +**2️⃣ Out-of-Scope Detection** + +Users don't always stay within the system's intended domain. They might ask questions the system wasn't designed to handle: + +- User: "What's the weather like?" (to a banking bot) +- User: "Tell me a joke" (to a flight booking system) + +The system must recognize these out-of-scope (OOS) utterances and handle them gracefully, rather than misclassifying them as valid intents. + +**3️⃣ Multi-Intent Utterances** + +Users sometimes express multiple intentions in a single utterance: + +- "I want to book a flight to London and also check if I have enough points for an upgrade" +- "Transfer $500 to John's account and send me a confirmation email" + +This requires multi-label classification where an utterance can belong to multiple intent categories simultaneously. 
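The out-of-scope and multi-intent cases described above can both be expressed as thresholding over per-intent scores. The following is a minimal sketch under stated assumptions: the function name `select_intents`, the score dictionary, and the single global threshold are hypothetical and are not part of AutoIntent's API.

```python
from typing import Dict, List


def select_intents(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return every intent whose score reaches the threshold, best first.

    An empty result means no intent was confident enough, which the caller
    can treat as an out-of-scope utterance.
    """
    selected = [(name, score) for name, score in scores.items() if score >= threshold]
    selected.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in selected]


# Hypothetical scores for "book a flight to London and check my points".
multi = select_intents({"book_flight": 0.9, "check_points": 0.7, "cancel_booking": 0.1})

# Hypothetical scores for "Tell me a joke" sent to a flight booking system.
oos = select_intents({"book_flight": 0.2, "check_points": 0.1, "cancel_booking": 0.05})
```

Here `multi` contains two intents and `oos` is empty. In practice a separate threshold per intent often works better than one global value, because score distributions differ between classes.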
+ +**4️⃣ Limited Training Data** + +Collecting comprehensive training data for dialogue systems is challenging: + +- **πŸš€ Cold Start Problem**: New domains or intents may have little or no training data +- **πŸ“‰ Long Tail Distribution**: Some intents occur much less frequently than others +- **πŸ’¬ Conversation Context**: Training data should ideally capture how intents appear in real conversations, not just isolated utterances + +**5️⃣ Linguistic Variation** + +Users express the same intent in many different ways: + +- "Book me a flight" / "I need to fly somewhere" / "Can you help me travel to NYC?" +- "What's my balance?" / "How much money do I have?" / "Show me my account status" + +The system must handle this linguistic variation while maintaining accuracy. + +**6️⃣ Contextual Dependencies** + +In multi-turn dialogues, the same utterance can have different meanings depending on context: + +- User: "Book it" (could mean book a flight, hotel, or restaurant depending on previous conversation) +- User: "Yes" (confirmation, but confirmation of what?) + +How AutoIntent Addresses Dialogue System Challenges +---------------------------------------------------- + +AutoIntent specifically addresses the key challenges faced by dialogue system developers: + +**πŸ”„ Automated Model Selection** + +Instead of manually trying different approaches, AutoIntent automatically tests and compares multiple classification methods (KNN, neural networks, transformer models, etc.) to find the best approach for your specific dataset and use case. 
+ +**🚫 Out-of-Scope Detection** + +AutoIntent provides built-in support for OOS detection through: + +- **πŸ“Š Confidence Thresholding**: Rejecting predictions below a certain confidence level +- **🎯 Specialized Decision Modules**: Like `JinoosDecision` and `TunableDecision` that are designed for OOS scenarios +- **βš–οΈ Threshold Optimization**: Automatically finding the best confidence thresholds that balance precision and recall + +**🏷️ Multi-Label Classification** + +AutoIntent natively supports multi-label scenarios through: + +- **πŸ€– Multi-Label Aware Algorithms**: Methods like `MLKnnScorer` designed specifically for multi-label tasks +- **πŸ“ˆ Adaptive Thresholding**: The `AdaptiveDecision` module can set different thresholds for different intent classes + +**🎯 Few-Shot Learning** + +AutoIntent excels in scenarios with limited training data through: + +- **πŸ” Embedding-Based Methods**: KNN and similarity-based approaches that work well with few examples +- **⚑ Zero-Shot Capabilities**: Using intent descriptions instead of training examples +- **πŸ”„ Transfer Learning**: Leveraging pre-trained models and embeddings + +**βš™οΈ Hyperparameter Optimization** + +AutoIntent eliminates the need for manual hyperparameter tuning through automated optimization, saving significant development time. + +Multi-Turn Dialogue Considerations +----------------------------------- + +While AutoIntent focuses primarily on single-utterance intent classification, real dialogue systems must handle multi-turn conversations. 
+ +**πŸ”— Context Propagation** + +In multi-turn scenarios, context from previous turns affects intent classification: + +- **Turn 1**: "I want to book a flight" +- **Turn 2**: "Make it for tomorrow" (context: still talking about flight booking) +- **Turn 3**: "Actually, change that to next week" (context: modifying the previous request) + +**πŸ“‹ Session Management** + +Dialogue systems must maintain session state across multiple interactions: + +- **πŸ‘€ User Identity**: Who is the user? +- **πŸ“œ Conversation History**: What has been discussed? +- **πŸ’Ύ Slot Values**: What information has been collected? +- **🎯 Current Goal**: What is the user trying to accomplish? + +**πŸ”Œ Integration with AutoIntent** + +AutoIntent cannot directly be integrated into multi-turn system for now, but here are a few things to bridge the gap: + +1. **πŸ” Processing Each Turn**: Using AutoIntent to classify each user utterance +2. **πŸ“ Context Enrichment**: Adding conversation context as text features to improve classification + +Practical Applications and Use Cases +------------------------------------ + +**πŸ“ž Customer Service Chatbots** + +- **🏦 Banking**: Account inquiries, transactions, fraud reporting +- **πŸ›’ E-commerce**: Order tracking, returns, product recommendations +- **πŸ“‘ Telecommunications**: Bill payments, service upgrades, technical support +- **✈️ Travel**: Flight, hotel, and car rental bookings +- **πŸ₯ Healthcare**: Appointment scheduling, prescription refills +- **πŸ• Food Services**: Restaurant reservations, food delivery + + + +**πŸŽ™οΈ Voice Assistants** + +- **🏠 Smart Home**: Device control, automation setup +- **🎡 Entertainment**: Music playback, content search +- **πŸ“… Productivity**: Calendar management, reminders, note-taking + +**πŸ“… Booking Systems** + + +**πŸ€– LLM Agents and Tool Selection** + +Modern AI systems increasingly rely on Large Language Models (LLMs) that can use external tools and APIs to accomplish complex tasks. 
These systems, often called "AI agents", need to determine which tools to use for specific user requests. + +- **⚡ Function Calling**: LLMs like GPT-4, Claude, or Llama can be equipped with function-calling capabilities to use external APIs, databases, or computational tools +- **🔧 Tool Orchestration**: Complex agents that combine multiple tools (web search, calculator, database queries, file operations) based on user needs +- **⚙️ Workflow Automation**: Systems that can execute multi-step processes by selecting appropriate tools in sequence + +**🚀 Performance Advantages of AutoIntent for LLM Systems** + +Even when using powerful LLMs, AutoIntent can provide significant advantages: + +**⚡ Latency Optimization**: API calls to distant LLM servers typically take 500-4000 ms, while local ML model predictions with AutoIntent can complete much faster. For tool selection in real-time applications, this speed difference is crucial. + +**💰 Cost Efficiency**: Local intent classification reduces the number of expensive LLM API calls by pre-filtering and routing requests to appropriate tools without requiring LLM reasoning. + +**🔒 Reliability**: Local models provide consistent performance without dependency on external API availability, rate limits, or network connectivity issues. + +**🛡️ Privacy**: Sensitive user requests can be classified locally without sending data to external LLM providers. 
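The pre-filtering and routing idea behind these advantages can be sketched as a small dispatcher that only calls the LLM when the local classifier is unsure. Everything here is hypothetical: `route_request`, `keyword_classifier`, the tool table, and the 0.8 confidence cutoff are stand-ins for illustration. In a real system the classifier would be a trained intent model rather than keyword matching.

```python
from typing import Callable, Dict, Optional, Tuple

# A classifier maps an utterance to (intent_name, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def route_request(
    utterance: str,
    classify: Classifier,
    tools: Dict[str, Callable[[str], str]],
    min_confidence: float = 0.8,
) -> Tuple[str, Optional[str]]:
    """Run a local tool when the classifier is confident, else defer to an LLM."""
    intent, confidence = classify(utterance)
    handler = tools.get(intent)
    if handler is not None and confidence >= min_confidence:
        # Confident local prediction: no LLM call is needed.
        return "local", handler(utterance)
    # Unknown intent or low confidence: hand the request to the LLM.
    return "llm", None


def keyword_classifier(utterance: str) -> Tuple[str, float]:
    """Toy stand-in for a trained intent classifier."""
    text = utterance.lower()
    if "balance" in text:
        return "check_balance", 0.95
    if "weather" in text:
        return "check_weather", 0.55
    return "unknown", 0.0
```

The cutoff controls the balance between cost and accuracy: a higher value sends more traffic to the LLM, while a lower value handles more requests locally at the risk of misrouting.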
+ +**πŸ”„ Hybrid Architecture Benefits** + +- **⚑ Fast Intent Routing**: Use AutoIntent to quickly classify user requests and route them to appropriate specialized tools or LLM prompts +- **🎯 Tool Pre-selection**: Narrow down the set of available tools before presenting options to the LLM, improving accuracy and reducing hallucination +- **πŸ”„ Fallback Strategies**: When local classification is confident, execute actions directly; when uncertain, escalate to LLM for more sophisticated reasoning +- **πŸ€– Multi-Agent Coordination**: Route different types of requests to specialized LLM agents based on local intent classification + +Slots and Entity Extraction +---------------------------- + +While AutoIntent focuses primarily on intent classification, understanding slots (also called entities) is crucial for building complete dialogue systems. + +**❓ What are Slots?** + +Slots are specific pieces of information that the system needs to extract from user utterances to fulfill their requests. They represent the parameters or arguments required by the intended action. + +**πŸ’‘ Examples of Slots** + +For a flight booking intent, relevant slots might include: + +- **πŸ›« Departure City**: "I want to fly from New York" +- **πŸ›¬ Destination City**: "to London" +- **πŸ“… Date**: "on March 15th" +- **πŸ‘₯ Number of Passengers**: "for two people" +- **πŸ’Ί Class**: "in business class" + +**πŸ“‹ Types of Slots** + +**πŸ“ Categorical Slots**: Fixed set of possible values + +- Seat class: economy, business, first +- Payment method: credit card, debit card, PayPal + +**πŸ”’ Numerical Slots**: Numeric values + +- Number of passengers: 1, 2, 3, 4... 
+- Amount to transfer: $100, $500, $1,250 + +**⏰ Temporal Slots**: Date and time information + +- Departure date: "tomorrow", "March 15th", "next Friday" +- Time: "morning", "3 PM", "around noon" + +**πŸ“ Location Slots**: Geographic information + +- Cities: "New York", "London", "Tokyo" +- Addresses: "123 Main Street", "downtown area" + +**πŸ‘€ Named Entity Slots**: Proper nouns + +- Person names: "John Smith", "Maria Garcia" +- Organization names: "Chase Bank", "Delta Airlines" + +**πŸ”— Relationship Between Intents and Slots** + +Different intents require different slots: + +- `book_flight` intent needs: departure_city, destination_city, date, passengers +- `transfer_money` intent needs: amount, recipient, account_type +- `check_weather` intent needs: location, date/time + +**πŸ”Œ Integration with AutoIntent** + +While AutoIntent doesn't directly handle slot extraction, it can be integrated with slot filling systems: + +1. **🎯 Intent Classification First**: Use AutoIntent to determine the user's intent +2. **🏷️ Slot Extraction**: Based on the predicted intent, apply appropriate slot extraction models +3. **🀝 Joint Training**: Use intent predictions as features for slot extraction models + +**πŸ› οΈ Popular Slot Extraction Approaches** + +- **πŸ“ Rule-Based**: Regular expressions and pattern matching +- **πŸ“Š Classical ML**: CRF (Conditional Random Fields) +- **🧠 Neural Approaches**: BERT-based NER models, BiLSTM-CRF +- **πŸ€– Joint Models**: Models that predict intents and slots simultaneously (encoders like BERT or LLMs like GPT) + +Dialogue Management and Flow Control +------------------------------------- + +Dialogue management orchestrates the conversation flow and determines system actions based on the current state. 
+ +**πŸ“Š State Representation** + +The dialogue state typically includes: + +- **🎯 Current Intent**: What the user wants to do +- **πŸ’Ύ Slot Values**: Information collected so far +- **πŸ“œ Dialogue History**: Previous turns and actions +- **πŸ‘€ User Profile**: Persistent information about the user +- **🌍 Context**: External information (time, location, etc.) + +**🌊 Dialogue Flow Patterns** + +**πŸ“ Linear Flows**: Predetermined sequence of steps + +1. Collect departure city +2. Collect destination city +3. Collect travel date +4. Confirm booking +5. Process payment + +**πŸ”€ Branching Flows**: Different paths based on conditions + +- If user is premium member β†’ offer upgrade options +- If destination requires visa β†’ provide visa information +- If date is invalid β†’ ask for alternative dates + +**🀝 Mixed-Initiative**: Both user and system can drive the conversation + +- System can ask clarifying questions +- User can provide unrequested information +- System can make proactive suggestions + +**πŸ› οΈ Error Handling and Recovery** + +**❌ Recognition Errors**: When the system misunderstands user input + +- Confidence scoring to detect uncertain predictions +- Confirmation strategies ("Did you say London?") +- Graceful fallback to human agents + +**πŸ’₯ Task Completion Failures**: When the system cannot fulfill the request + +- Alternative suggestions +- Partial completion with explanation +- Escalation procedures + +**πŸ˜• User Confusion**: When users don't understand the system + +- Help messages and tutorials +- Progressive disclosure of capabilities +- Context-sensitive guidance + +**🧭 Dialogue Policies** + +**πŸ“‹ Rule-Based Policies**: Hand-crafted decision trees + +- Simple and predictable +- Easy to debug and modify +- Limited scalability and flexibility + +**πŸ€– Machine Learning Policies**: Learned from data + +- Reinforcement learning approaches +- Supervised learning from conversation logs +- Better handling of complex scenarios + +**🧠 LLM-Based 
Policies**: Leveraging large language models for dialogue management + +- Use LLMs (e.g., GPT, Llama) to generate system responses dynamically +- Few-shot or zero-shot prompting for intent recognition and slot filling +- Can handle open-domain and complex, unanticipated user inputs +- Requires careful prompt engineering and safety controls +- May be combined with retrieval-augmented generation for factual accuracy + +**πŸ”„ Hybrid Approaches**: Combining rules and learning + +- Rules for critical paths and constraints +- ML for optimization and personalization +- Best of both worlds approach + +Production Considerations for Dialogue Systems +---------------------------------------------- + +There are lots of considerations to think about, but the one where AutoIntent can help is **πŸ”„ Automated Model Update**. AutoIntent's AutoML capabilities enable periodic retraining and updating of classifiers with new data, ensuring the system stays accurate and up-to-date with minimal manual intervention. + + +Conclusion +---------- + +This comprehensive understanding of dialogue systems provides the context for how AutoIntent fits into the broader ecosystem. While AutoIntent specifically excels at the intent classification component, understanding the full picture helps developers build more effective and robust conversational systems. diff --git a/docs/source/learn/optimization.rst b/docs/source/learn/optimization.rst index 2b9f24b4c..2384377b8 100644 --- a/docs/source/learn/optimization.rst +++ b/docs/source/learn/optimization.rst @@ -6,7 +6,7 @@ In this section, you will learn how hyperparameter optimization works in our lib Pipeline -------- -The entire process of configuring a classifier in our library is divided into sequential steps (:ref:`and that's why `): +The entire process of configuring a classifier in our library is divided into sequential steps (:ref:`and that's why `): 1. Selecting an embedder (EmbeddingNode) 2. Selecting a classifier (ScoringNode)