This directory contains comprehensive examples demonstrating how to use the torchTextClassifiers package for various text classification tasks.
`basic_classification.py` is a simple binary sentiment classification example that covers:
- Creating a FastText classifier
- Preparing training and validation data
- Building and training the model
- Making predictions and evaluating performance
- Saving model configuration
Run the example:

```bash
cd examples
uv run --extra huggingface python basic_classification.py
```

What you'll learn:
- Basic API usage
- Binary classification workflow
- Model evaluation
- Configuration persistence
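Evaluation in this example comes down to comparing predicted labels against true labels. A minimal, standalone sketch with NumPy (it does not touch the package's API, only reproduces the accuracy computation):

```python
import numpy as np

# Predicted and true labels as integer class ids (0 = Negative, 1 = Positive),
# matching the test set of three samples in the example
predictions = np.array([1, 0, 1])
true_labels = np.array([1, 0, 1])

# Accuracy is the fraction of exact matches
accuracy = np.mean(predictions == true_labels)
print(f"Test accuracy: {accuracy:.3f}")  # Test accuracy: 1.000
```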
`multiclass_classification.py` demonstrates 3-class sentiment analysis (positive, negative, neutral):
- Multi-class data preparation
- Class distribution handling
- Detailed result analysis
- Configuration loading and validation
Run the example:

```bash
cd examples
uv run --extra huggingface python multiclass_classification.py
```

What you'll learn:
- Multi-class classification setup
- Class imbalance considerations
- Advanced result interpretation
- Model serialization/deserialization
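Checking the class distribution before training is a cheap way to spot imbalance. A small plain-Python sketch (the threshold used to flag imbalance is an arbitrary illustration, not part of the package):

```python
from collections import Counter

# Integer labels for a 3-class problem: 0 = Negative, 1 = Neutral, 2 = Positive,
# mirroring the balanced 15-sample dataset in the example
y_train = [0] * 5 + [1] * 5 + [2] * 5

counts = Counter(y_train)
print(counts)  # Counter({0: 5, 1: 5, 2: 5})

# Flag imbalance: warn when the largest class is more than twice the smallest
largest, smallest = max(counts.values()), min(counts.values())
balanced = largest <= 2 * smallest
print("balanced" if balanced else "imbalanced")  # balanced
```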
`Using_additional_features.py` shows how to combine text and categorical features:
- Text + categorical data preparation
- Feature engineering for categorical variables
- Comparing mixed vs. text-only models
- Performance analysis with different feature types
Run the example:

```bash
cd examples
uv run --extra huggingface python Using_additional_features.py
```

What you'll learn:
- Mixed feature classification
- Categorical feature configuration
- Feature importance analysis
- Model comparison techniques
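One common piece of feature engineering for categorical variables is mapping category strings to stable integer codes before stacking them next to the text column. A sketch with NumPy (the sample data and the encoding scheme are illustrative; the package may expect its own encoding):

```python
import numpy as np

# Hypothetical text samples with one categorical feature each
texts = np.array(["great phone", "bad battery", "decent camera"])
categories = ["electronics", "electronics", "photo"]

# Map category strings to integer codes in first-seen order
vocab = {cat: idx for idx, cat in enumerate(dict.fromkeys(categories))}
codes = np.array([vocab[c] for c in categories])
print(codes)  # [0 0 1]

# Combine text and categorical columns, as the mixed-features example does
X_mixed = np.column_stack([texts, codes.astype(str)])
print(X_mixed.shape)  # (3, 2)
```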
`advanced_training.py` explores advanced training configurations:
- Custom PyTorch Lightning trainer parameters
- Different hardware configurations (CPU/GPU)
- Training optimization techniques
- Model comparison and selection
Run the example:

```bash
cd examples
uv run --extra huggingface python advanced_training.py
```

What you'll learn:
- Advanced training configurations
- Hardware-specific optimizations
- Training parameter tuning
- Model performance comparison
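Most of these knobs map onto PyTorch Lightning `Trainer` keyword arguments. A sketch of such a configuration (the keyword names follow Lightning's `Trainer` API; whether this package forwards them verbatim is an assumption); note how gradient accumulation scales the effective batch size:

```python
# Candidate PyTorch Lightning Trainer settings (illustrative values)
trainer_params = {
    "accelerator": "cpu",          # or "gpu" when CUDA is available
    "max_epochs": 50,
    "precision": 16,               # mixed precision training
    "accumulate_grad_batches": 4,  # gradient accumulation
}

batch_size = 16
# Gradient accumulation multiplies the effective batch size without
# increasing per-step memory use
effective_batch_size = batch_size * trainer_params["accumulate_grad_batches"]
print(effective_batch_size)  # 64
```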
`simple_explainability_example.py` demonstrates model explainability with ASCII histogram visualizations:
- Training a FastText classifier with enhanced data
- Word-level contribution analysis
- ASCII histogram visualization in terminal
- Interactive mode for custom text analysis
- Real-time prediction explanations
Run the example:

```bash
cd examples

# Regular mode - analyze predefined examples
uv run --extra huggingface python simple_explainability_example.py

# Interactive mode - analyze your own text
uv run --extra huggingface python simple_explainability_example.py --interactive
```

What you'll learn:
- Model explainability and interpretation
- Word importance analysis
- Interactive prediction tools
- ASCII-based data visualization
- Real-time model analysis
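The bar lengths in the example's histograms follow a simple scaling: each word's contribution score is divided by the maximum score and multiplied by a fixed width. A standalone sketch of that rendering (the real script's internals may differ):

```python
def render_histogram(scores, width=30):
    """Render word contributions as ASCII bars; the top-scoring word
    gets a full-width bar, the others are scaled proportionally."""
    max_score = max(scores.values())
    pad = max(len(word) for word in scores)
    lines = []
    for word, score in scores.items():
        bar = "█" * int(score / max_score * width)
        lines.append(f"{word:<{pad}} | {bar} {score:.4f}")
    return lines

# Scores taken from the first example in the expected output below
for line in render_histogram(
    {"This": 0.3549, "product": 0.1651, "is": 0.2844, "amazing!": 0.1956}
):
    print(line)
```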
To run any example:

1. Install dependencies:

   ```bash
   uv sync
   ```

2. Navigate to the examples directory:

   ```bash
   cd examples
   ```

3. Run an example:

   ```bash
   uv run --extra huggingface python basic_classification.py
   ```

Expected output for `basic_classification.py`:

```text
Basic Text Classification Example
==================================================
Creating sample data...
Training samples: 10
Validation samples: 2
Test samples: 3
Creating FastText classifier...
Building model...
Model built successfully!
Training model...
Training completed!
Making predictions...
Predictions: [1 0 1]
True labels: [1 0 1]
Test accuracy: 1.000
Detailed Results:
----------------------------------------
1. Predicted: Positive
   Text: This is an amazing product with great features!...
2. Predicted: Negative
   Text: Completely disappointed with this purchase...
3. Predicted: Positive
   Text: Excellent build quality and works as expected...
Saving model configuration...
Configuration saved to 'basic_classifier_config.json'
Example completed successfully!
```
Expected output for `multiclass_classification.py`:

```text
Multi-class Text Classification Example
==================================================
Creating multi-class sentiment data...
Training samples: 15
Class distribution: Negative=5, Neutral=5, Positive=5
Creating multi-class FastText classifier...
Building model...
Model built successfully!
Training model...
Training completed!
Detailed Results:
------------------------------------------------------------
1. Predicted: Negative, True: Negative
   Text: This is absolutely horrible!
2. Predicted: Neutral, True: Neutral
   Text: It's an average product, nothing more.
3. Predicted: Positive, True: Positive
   Text: Fantastic! Love every aspect of it!
Final Accuracy: 3/3 = 1.000
```
Expected output for `simple_explainability_example.py`:

```text
Simple Explainability Example

Testing explainability on 5 examples:
============================================================

Example 1:
Text: 'This product is amazing!'
Prediction: Positive

Word Contribution Histogram:
--------------------------------------------------
This     | ██████████████████████████████ 0.3549
product  | █████████████ 0.1651
is       | ████████████████████████ 0.2844
amazing! | ████████████████ 0.1956
--------------------------------------------------
Analysis completed for example 1

Example 2:
Text: 'Poor quality and terrible service'
Prediction: Negative
Explainability failed:
Analysis completed for example 2

Example 3:
Text: 'Great value for money'
Prediction: Positive

Word Contribution Histogram:
--------------------------------------------------
Great | ██████████████████████████████ 0.3287
value | ████████████████████ 0.2220
for   | ██████████████████████████ 0.2929
money | ██████████████ 0.1564
--------------------------------------------------
Analysis completed for example 3

Example 4:
Text: 'Completely disappointing and awful experience'
Prediction: Negative

Word Contribution Histogram:
--------------------------------------------------
Completely    | ██████████ 0.1673
disappointing | ██████████████████████████████ 0.4676
and           | █████ 0.0910
awful         | ███████ 0.1225
experience    | █████████ 0.1516
--------------------------------------------------
Analysis completed for example 4

Example 5:
Text: 'Love this excellent design'
Prediction: Positive

Word Contribution Histogram:
--------------------------------------------------
Love      | ██████████████████ 0.2330
this      | ████████████████████ 0.2525
excellent | ██████████████████████████████ 0.3698
design    | ███████████ 0.1447
--------------------------------------------------
Analysis completed for example 5

Explainability analysis completed for 5 examples!

Tip: Use --interactive flag to enter interactive mode for custom text analysis!
Example: uv run python examples/simple_explainability_example.py --interactive
============================================================
Interactive Explainability Mode
============================================================
Enter your own text to see predictions and explanations!
Type 'quit' or 'exit' to end the session.

Enter text: Amazing product quality!
Analyzing: 'Amazing product quality!'
Prediction: Positive

Word Contribution Histogram:
--------------------------------------------------
Amazing  | ██████████████████████████████ 0.5429
product  | ██████████████ 0.2685
quality! | ██████████ 0.1886
--------------------------------------------------
Most influential word: 'Amazing' (score: 0.5429)
--------------------------------------------------

Enter text: Terrible customer support
Analyzing: 'Terrible customer support'
Prediction: Negative

Word Contribution Histogram:
--------------------------------------------------
Terrible | ██████████████████████████████ 0.5238
customer | ███████████ 0.1988
support  | ███████████████ 0.2774
--------------------------------------------------
Most influential word: 'Terrible' (score: 0.5238)
--------------------------------------------------

Enter text: quit
Thanks for using the explainability tool!
```
You can easily adapt the examples to your own data:

```python
# Replace the example data with your own
X_train = np.array([
    "Your text sample 1",
    "Your text sample 2",
    # ... more samples
])
y_train = np.array([0, 1, ...])  # Your labels
```

Experiment with different model parameters:
```python
classifier = create_fasttext(
    embedding_dim=200,  # Increase for better representations
    num_tokens=20000,   # Increase for larger vocabularies
    min_count=3,        # Increase to filter rare words
    num_epochs=100,     # Increase for more training
    batch_size=64,      # Adjust based on your hardware
)
```

Extend examples with your own categorical features:
```python
# Add your categorical features
categorical_features = np.array([
    [category1, category2, category3],
    # ... more feature vectors
])
X_mixed = np.column_stack([text_data, categorical_features])
```

- Increase embedding dimensions for complex tasks
- Use more training data when available
- Tune n-gram parameters (min_n, max_n) for your domain
- Experiment with batch sizes and learning rates
- Consider mixed features if you have structured data
- Use sparse embeddings for large vocabularies
- Increase batch size (if memory allows)
- Reduce embedding dimensions for faster convergence
- Use CPU training for small datasets
- Adjust num_workers for optimal data loading
- Use gradient accumulation with small batch sizes
- Enable mixed precision training (precision=16)
- Implement data streaming for very large datasets
- Use multiple GPUs if available
- Memory errors:
  - Reduce batch_size
  - Use sparse=True
  - Reduce embedding_dim
- Slow training:
  - Increase batch_size
  - Reduce num_workers
  - Use CPU for small datasets
- Poor accuracy:
  - Increase training data
  - Tune hyperparameters
  - Check data quality
  - Increase num_epochs
- Import errors:
  - Run `uv sync` to install dependencies
  - Check Python version compatibility
If you encounter issues:
- Check the main README for setup instructions
- Review the API documentation
- Look at similar examples for reference
- Open an issue on GitHub with your specific problem
- Main README - Package overview and installation
- API Reference - Complete API documentation
- Developer Guide - Adding new classifier types
- Tests - Unit and integration tests for reference
We welcome new examples! If you have a use case that would benefit others:
- Follow the existing example structure
- Include comprehensive comments
- Add error handling and validation
- Test your example thoroughly
- Update this README with your addition
Example template structure:

```python
"""
Your Example Title

Brief description of what this example demonstrates.
"""

import numpy as np
from torchTextClassifiers import create_fasttext


def main():
    print("Your Example Title")
    print("=" * 50)

    # Your implementation here

    print("Example completed successfully!")


if __name__ == "__main__":
    main()
```