This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
TextAttack (v0.3.10) is a Python framework for adversarial attacks, data augmentation, and model training in NLP. It provides a modular system where attacks are composed of four pluggable components: goal functions, constraints, transformations, and search methods. The project is maintained by UVA QData Lab.
pip install -e .[dev]

make test # Run full test suite (pytest --dist=loadfile -n auto)
pytest tests -v # Verbose test run
pytest tests/test_augment_api.py # Run a single test file
pytest --lf # Re-run only last failed tests

make format # Auto-format with black, isort, docformatter
make lint # Check formatting (black --check, isort --check-only, flake8)

make docs # Build HTML docs with Sphinx
make docs-auto # Hot-reload docs server on port 8765

textattack attack --recipe textfooler --model bert-base-uncased-mr --num-examples 100
textattack augment --input-csv examples.csv --output-csv output.csv --input-column text --recipe embedding
textattack train --model-name-or-path lstm --dataset yelp_polarity --epochs 50
textattack list attack-recipes
textattack peek-dataset --dataset-from-huggingface snli

An Attack is composed of exactly four components:
- GoalFunction (textattack/goal_functions/) - Determines if an attack succeeded. Categories: classification/ (untargeted, targeted), text/ (BLEU, translation overlap), custom/.
- Constraints (textattack/constraints/) - Filter invalid perturbations. Categories: semantics/ (sentence encoders, word embeddings), grammaticality/ (POS, language models, grammar tools), overlap/ (edit distance, BLEU), pre_transformation/ (restrict search space before transforming).
- Transformation (textattack/transformations/) - Generate candidate perturbations. Types: word_swaps/ (embedding, gradient, homoglyph, WordNet), word_insertions/, word_merges/, sentence_transformations/, WordDeletion, CompositeTransformation.
- SearchMethod (textattack/search_methods/) - Traverse the perturbation space. Includes: BeamSearch, GreedySearch, GreedyWordSwapWIR, GeneticAlgorithm, ParticleSwarmOptimization, DifferentialEvolution.
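To make the composition concrete, here is a minimal, self-contained sketch of the four roles in plain Python. It deliberately does not import TextAttack: `ToyAttack`, `swap_words`, `max_one_word_changed`, `label_changed`, `first_success_search`, and `toy_model` are hypothetical names invented for illustration, and the real classes have richer interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Text = str  # the real pipeline uses AttackedText rather than raw strings


def toy_model(text: Text) -> int:
    """Toy classifier (assumption): label 1 if the word 'good' appears, else 0."""
    return 1 if "good" in text.split() else 0


def swap_words(text: Text) -> List[Text]:
    """Transformation role: propose candidates by swapping one word."""
    synonyms = {"good": ["great", "fine"], "movie": ["film"]}
    words = text.split()
    candidates = []
    for i, word in enumerate(words):
        for replacement in synonyms.get(word, []):
            candidates.append(" ".join(words[:i] + [replacement] + words[i + 1:]))
    return candidates


def max_one_word_changed(original: Text, candidate: Text) -> bool:
    """Constraint role: reject candidates that differ in more than one word."""
    a, b = original.split(), candidate.split()
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1


def label_changed(original: Text, candidate: Text) -> bool:
    """Goal function role: succeed when the predicted label flips."""
    return toy_model(candidate) != toy_model(original)


def first_success_search(attack: "ToyAttack", text: Text) -> Optional[Text]:
    """Search method role: return the first valid candidate that meets the goal."""
    for candidate in attack.transformation(text):
        if not all(check(text, candidate) for check in attack.constraints):
            continue
        if attack.goal_function(text, candidate):
            return candidate
    return None


@dataclass
class ToyAttack:
    """Hypothetical container wiring the four pluggable pieces together."""
    goal_function: Callable[[Text, Text], bool]
    constraints: List[Callable[[Text, Text], bool]]
    transformation: Callable[[Text], List[Text]]
    search_method: Callable[..., Optional[Text]]

    def run(self, text: Text) -> Optional[Text]:
        return self.search_method(self, text)


attack = ToyAttack(label_changed, [max_one_word_changed], swap_words, first_success_search)
result = attack.run("a good movie")
print(result)
```

Because each role is a separate argument, swapping the search strategy or adding a constraint does not touch the other three pieces, which is the property the component list above describes.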
The Attacker class orchestrates running attacks on datasets with parallel processing, checkpointing, and logging.
Attack recipes (textattack/attack_recipes/) are pre-built attack configurations from the literature, such as TextFooler, DeepWordBug, BAE, BERT-Attack, CLARE, and CheckList. Each recipe subclasses AttackRecipe and implements a build(model_wrapper) classmethod that returns a configured Attack object. The collection includes multilingual recipes for French, Spanish, and Chinese.
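The build(model_wrapper) convention can be sketched in isolation as follows. This is a toy illustration, not TextAttack code: `ToyAttack`, `ToyRecipe`, and `ToyCharSwapRecipe` are hypothetical names, and the string-valued fields stand in for real component objects.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyAttack:
    """Hypothetical stand-in for a fully configured attack object."""
    model: Callable[[str], int]
    transformation: str
    constraints: List[str]
    search_method: str


class ToyRecipe:
    """Hypothetical base class mirroring the build(model_wrapper) convention."""

    @classmethod
    def build(cls, model_wrapper):
        raise NotImplementedError("recipes must implement build()")


class ToyCharSwapRecipe(ToyRecipe):
    """A recipe fixes every component choice; only the model is left open."""

    @classmethod
    def build(cls, model_wrapper):
        return ToyAttack(
            model=model_wrapper,
            transformation="character swap",
            constraints=["no repeated modification"],
            search_method="greedy",
        )


# The caller supplies only the victim model (here a trivial stub).
toy_attack = ToyCharSwapRecipe.build(lambda text: 0)
print(toy_attack.search_method)
```

Making build a classmethod means a recipe never needs to be instantiated: the class itself acts as a named factory for one published configuration.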
- AttackedText (textattack/shared/attacked_text.py) - Central text representation that maintains both token list and original text with punctuation. Used throughout the pipeline instead of raw strings.
- ModelWrapper (textattack/models/wrappers/) - Abstract interface for models. Implementations for PyTorch, HuggingFace, TensorFlow, sklearn. Models must accept string input and return predictions.
- Dataset (textattack/datasets/) - Iterable of (input, output) pairs. Supports HuggingFace datasets and custom files.
- Augmenter (textattack/augmentation/) - Uses transformations and constraints for data augmentation (not adversarial attacks). Built-in recipes: wordnet, embedding, charswap, eda, checklist, clare, back_trans.
- PromptAugmentationPipeline (textattack/prompt_augmentation/) - Augments prompts and generates LLM responses.
- LLM Wrappers (textattack/llms/) - Wrappers for using LLMs (HuggingFace, ChatGPT) with prompt augmentation.
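The "string in, predictions out" contract for model wrappers can be illustrated with a small standalone sketch. `ToyModelWrapper` and `KeywordSentimentWrapper` are hypothetical names, and the assumption that a wrapper is called with a list of strings and returns one score row per input is made for this example only.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class ToyModelWrapper(ABC):
    """Hypothetical abstract interface: a list of strings in, scores out."""

    @abstractmethod
    def __call__(self, text_input_list: List[str]) -> List[List[float]]:
        ...


class KeywordSentimentWrapper(ToyModelWrapper):
    """Wraps a trivial keyword 'model' so callers never see its internals."""

    def __init__(self, positive_words: Iterable[str]):
        self.positive_words = set(positive_words)

    def __call__(self, text_input_list: List[str]) -> List[List[float]]:
        scores = []
        for text in text_input_list:
            hits = sum(word in self.positive_words for word in text.lower().split())
            positive = min(1.0, 0.5 + 0.25 * hits)
            # One row per input: [negative score, positive score]
            scores.append([1.0 - positive, positive])
        return scores


wrapper = KeywordSentimentWrapper(["good", "great"])
predictions = wrapper(["A good film", "dull"])
print(predictions)
```

Hiding tokenization and framework details behind a single call is what lets the same attack components run against models from different frameworks.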
CLI entry point: textattack/commands/textattack_cli.py. Each command (attack, augment, train, eval-model, list, peek-dataset, benchmark-recipe, attack-resume) is a subclass of TextAttackCommand with register_subcommand() and run() methods.
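The register_subcommand()/run() pattern can be sketched with the standard library's argparse. This is an assumption-laden illustration rather than the project's actual code: `ToyCommand`, `GreetCommand`, `main`, and the `greet` subcommand are hypothetical, and the real method signatures may differ.

```python
import argparse
from abc import ABC, abstractmethod


class ToyCommand(ABC):
    """Hypothetical base class: each command registers itself and runs itself."""

    @staticmethod
    @abstractmethod
    def register_subcommand(subparsers):
        ...

    @abstractmethod
    def run(self, args):
        ...


class GreetCommand(ToyCommand):
    @staticmethod
    def register_subcommand(subparsers):
        parser = subparsers.add_parser("greet", help="print a greeting")
        parser.add_argument("--name", default="world")
        # Remember which command object handles this subcommand.
        parser.set_defaults(func=GreetCommand())

    def run(self, args):
        return f"hello, {args.name}"


def main(argv):
    parser = argparse.ArgumentParser(prog="toy-cli")
    subparsers = parser.add_subparsers(dest="command")
    GreetCommand.register_subcommand(subparsers)
    args = parser.parse_args(argv)
    if not hasattr(args, "func"):
        # No subcommand given: show help instead of failing.
        parser.print_help()
        return None
    return args.func.run(args)


print(main(["greet", "--name", "TextAttack"]))
```

With this layout the entry point only needs to know the list of command classes; each command owns its own flags and behavior.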
- Version tracked in docs/conf.py (imported by setup.py)
- Cache directory: ~/.cache/textattack/ (override with TA_CACHE_DIR env var)
- Formatting: black (line length 88), isort (skip __init__.py), flake8 (ignores: E203, E266, E501, W503, D203)
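As a rough sketch of the cache-directory override described above, the lookup below prefers TA_CACHE_DIR and falls back to the default path. `resolve_cache_dir` is a hypothetical helper written for this example; how the library itself performs the lookup is not shown here.

```python
import os


def resolve_cache_dir(environ=None):
    """Return TA_CACHE_DIR if set, otherwise the default ~/.cache/textattack."""
    environ = os.environ if environ is None else environ
    default = os.path.expanduser("~/.cache/textattack")
    return environ.get("TA_CACHE_DIR", default)


# Passing a dict makes the behavior easy to check without touching the real environment.
print(resolve_cache_dir({"TA_CACHE_DIR": "/tmp/ta-cache"}))
print(resolve_cache_dir({}))
```

In a shell, the equivalent override is to export TA_CACHE_DIR before running any textattack command.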
- check-formatting.yml - Runs make lint on Python 3.9
- run-pytest.yml - Sets up Python 3.8/3.9 (pytest currently skipped in CI)
- publish-to-pypi.yml - PyPI publishing
- make-docs.yml - Documentation build
- codeql-analysis.yml - Security analysis
Tests are in tests/ organized by feature:
- test_command_line/ - CLI command integration tests (attack, augment, train, eval, list, loggers)
- test_constraints/ - Constraint unit tests
- test_augment_api.py, test_transformations.py, test_attacked_text.py, test_tokenizers.py, test_word_embedding.py, test_metric_api.py, test_prompt_augmentation.py
- test_command_line/update_test_outputs.py - Script to regenerate expected test outputs
- Attack recipe: Subclass AttackRecipe in textattack/attack_recipes/, implement build(model_wrapper), add import to __init__.py, add doc reference in docs/attack_recipes.rst.
- Transformation: Subclass Transformation in appropriate subfolder under textattack/transformations/.
- Constraint: Subclass Constraint or PreTransformationConstraint in appropriate subfolder under textattack/constraints/.
- Search method: Subclass SearchMethod in textattack/search_methods/.