AAC Vocabulary and Comparative Analysis

Overview

The AAC metrics module includes vocabulary coverage analysis and comparative analysis between board sets. These features help clinicians and researchers evaluate how well an AAC board set covers core vocabulary and compare different board configurations.

Features

1. Vocabulary Coverage Analysis

Analyzes how well a board set covers core vocabulary lists and identifies gaps.

Key Features:

Coverage statistics for multiple core vocabulary lists
Identification of missing core words
High/low effort word analysis
Extra vocabulary detection (words not in core lists)

Usage:

import { ObfsetProcessor, Analytics } from '@willwade/aac-processors';

// Load board set
const processor = new ObfsetProcessor();
const tree = processor.loadIntoTree('path/to/boardset.obfset');

// Calculate metrics
const metrics = new Analytics.MetricsCalculator().analyze(tree);

// Analyze vocabulary coverage
const vocabAnalyzer = new Analytics.VocabularyAnalyzer();
const analysis = vocabAnalyzer.analyze(metrics);

console.log("Core Coverage:", analysis.core_coverage);
console.log("High Effort Words:", analysis.high_effort_words);
console.log("Low Effort Words:", analysis.low_effort_words);

Output Structure:

{
  core_coverage: {
    'default': {
      name: 'Combined Core, Anderson & Bitner, 2013',
      total_words: 646,
      covered: 142,
      missing: 504,
      coverage_percent: 21.98,
      missing_words: ['more', 'i', 'you', ...],
      average_effort: 3.2456
    },
    // ... more core lists
  },
  total_unique_words: 2443,
  extra_words: [...],
  high_effort_words: [{ word: 'hello', effort: 8.5 }, ...],
  low_effort_words: [{ word: 'more', effort: 1.2 }, ...]
}
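
The relationship between the fields of a core_coverage entry can be sketched as follows. This is a minimal illustration, not the module's implementation: the function name summarizeCoverage and the boardWords input are hypothetical, and lower-casing plus de-duplication of the core list is an assumption.

```typescript
// Hypothetical shape mirroring the count fields of a core_coverage entry.
interface CoverageSummary {
  total_words: number;
  covered: number;
  missing: number;
  coverage_percent: number;
  missing_words: string[];
}

// `boardWords` is an assumed input: the lower-cased words reachable in the board set.
function summarizeCoverage(coreList: string[], boardWords: Set<string>): CoverageSummary {
  // Assumption: compare case-insensitively and count each core word once.
  const core = Array.from(new Set(coreList.map((w) => w.toLowerCase())));
  const missingWords = core.filter((w) => !boardWords.has(w));
  const covered = core.length - missingWords.length;
  const percent = core.length === 0 ? 0 : (covered / core.length) * 100;
  return {
    total_words: core.length,
    covered,
    missing: missingWords.length,
    // Round to two decimals, matching the precision in the sample output.
    coverage_percent: Math.round(percent * 100) / 100,
    missing_words: missingWords,
  };
}

const coverageDemo = summarizeCoverage(
  ['more', 'I', 'you', 'want'],
  new Set(['more', 'want', 'hello']),
);
console.log(coverageDemo.coverage_percent, coverageDemo.missing_words);
```

With the figures from the sample output, 142 covered words out of 646 gives 142 / 646 × 100 ≈ 21.98, and missing is 646 − 142 = 504.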

2. Sentence Construction Analysis

Calculates the effort required to construct test sentences from the board set.

Key Features:

Tests 30 common sentences
Identifies words requiring spelling (typing)
Calculates average effort per word
Statistics across all sentences

Usage:

import { Analytics } from '@willwade/aac-processors';

const sentenceAnalyzer = new Analytics.SentenceAnalyzer();
const refLoader = new Analytics.ReferenceLoader();
const testSentences = refLoader.loadSentences();

// `metrics` is the MetricsCalculator result from the vocabulary coverage example
const analyses = sentenceAnalyzer.analyzeSentences(metrics, testSentences);
const stats = sentenceAnalyzer.calculateStatistics(analyses);

console.log("Average effort:", stats.average_effort);
console.log("Typing percent:", stats.typing_percent);

Output Structure:

[
  {
    sentence: "I like to be here with you",
    words: ["I", "like", "to", "be", "here", "with", "you"],
    effort: 3.45, // Average per word
    total_effort: 24.15, // Total for sentence
    typing: true, // Required spelling
    missing_words: ["I", "like", "to"],
    word_efforts: [
      { word: "I", effort: 12.5, typed: true },
      { word: "like", effort: 15.0, typed: true },
      // ...
    ],
  },
  // ... more sentences
];
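
How the per-sentence fields fit together can be sketched as below. The guide does not state how spelling effort is computed, so the flat per-letter cost here is purely an assumption, and scoreSentence and ASSUMED_EFFORT_PER_LETTER are hypothetical names rather than part of the module.

```typescript
interface WordEffort {
  word: string;
  effort: number;
  typed: boolean;
}

interface SentenceScore {
  sentence: string;
  words: string[];
  effort: number; // average per word
  total_effort: number; // total for the sentence
  typing: boolean; // true when at least one word had to be spelled
  missing_words: string[];
  word_efforts: WordEffort[];
}

// Assumption: a word missing from the board set is spelled at a flat cost per letter.
const ASSUMED_EFFORT_PER_LETTER = 2.5;

function scoreSentence(sentence: string, effortByWord: Map<string, number>): SentenceScore {
  const words = sentence.split(/\s+/).filter((w) => w.length > 0);
  const wordEfforts = words.map((word): WordEffort => {
    const known = effortByWord.get(word.toLowerCase());
    return known !== undefined
      ? { word, effort: known, typed: false }
      : { word, effort: word.length * ASSUMED_EFFORT_PER_LETTER, typed: true };
  });
  const total = wordEfforts.reduce((sum, w) => sum + w.effort, 0);
  const missing = wordEfforts.filter((w) => w.typed).map((w) => w.word);
  return {
    sentence,
    words,
    effort: words.length === 0 ? 0 : total / words.length,
    total_effort: total,
    typing: missing.length > 0,
    missing_words: missing,
    word_efforts: wordEfforts,
  };
}

// Sample efforts are made up for illustration.
const sentenceDemo = scoreSentence('I want more', new Map<string, number>([['i', 1], ['more', 2]]));
console.log(sentenceDemo.total_effort, sentenceDemo.missing_words);
```

In this sketch, effort is always total_effort divided by the word count, which matches the sample output above (24.15 / 7 = 3.45).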

3. Comparative Analysis

Compares two board sets to identify differences and generate CARE component scores.

Key Features:

Missing/extra/overlapping word identification
CARE component scoring (core, sentences, fringe)
High/low effort word comparison
Core list coverage comparison
Sentence construction comparison

Usage:

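import { ObfsetProcessor, Analytics } from '@willwade/aac-processors';

// Load two board sets (paths are placeholders)
const processor = new ObfsetProcessor();
const targetTree = processor.loadIntoTree('path/to/target.obfset');
const compareTree = processor.loadIntoTree('path/to/comparison.obfset');

// Calculate metrics for each board set
const calculator = new Analytics.MetricsCalculator();
const targetResult = calculator.analyze(targetTree);
const compareResult = calculator.analyze(compareTree);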

// Compare
const comparisonAnalyzer = new Analytics.ComparisonAnalyzer();
const comparison = comparisonAnalyzer.compare(targetResult, compareResult, {
  includeSentences: true,
});

console.log("Missing words:", comparison.missing_words);
console.log("CARE components:", comparison.care_components);
console.log("High effort words:", comparison.high_effort_words);

Output Structure:

{
  // Target metrics
  total_boards: 212,
  total_words: 2443,
  target_effort_score: 6.5671,

  // Comparison metrics
  comp_boards: 480,
  comp_words: 4576,
  comp_effort_score: 5.4565,

  // Vocabulary comparison
  missing_words: ['word1', 'word2', ...],     // In comparison, not target
  extra_words: ['word3', 'word4', ...],        // In target, not comparison
  overlapping_words: ['word5', 'word6', ...],  // In both

  // CARE components
  care_components: {
    core: 142,              // Core words in target
    comp_core: 189,         // Core words in comparison
    sentences: 3.45,        // Avg sentence effort target
    comp_sentences: 2.98,   // Avg sentence effort comparison
    fringe: 89,             // Fringe words in target
    comp_fringe: 112,       // Fringe words in comparison
    common_fringe: 45,      // Fringe words in both
  },

  // High/low effort words
  high_effort_words: ['hello', 'goodbye', ...],  // Harder in target
  low_effort_words: ['more', 'want', ...],       // Easier in target

  // Core list analysis
  cores: {
    'default': {
      name: 'Combined Core, Anderson & Bitner, 2013',
      list: ['more', 'i', 'you', ...],
      average_effort: 3.24,
      comp_effort: 2.87
    }
  },

  // Sentence comparison
  sentences: [
    {
      sentence: 'I like to be here with you',
      effort: 3.45,
      typing: true,
      comp_effort: 2.98,
      comp_typing: false
    }
  ]
}
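
The direction of the three vocabulary lists is easy to mix up, so here is a minimal sketch of the set logic the comments describe. The function diffVocabulary is hypothetical, and the lower-casing and sorting are assumptions, not confirmed behavior of ComparisonAnalyzer.

```typescript
interface VocabularyDiff {
  missing_words: string[]; // in comparison, not in target
  extra_words: string[]; // in target, not in comparison
  overlapping_words: string[]; // in both
}

function diffVocabulary(targetWords: string[], comparisonWords: string[]): VocabularyDiff {
  // Assumption: words are compared case-insensitively.
  const target = new Set(targetWords.map((w) => w.toLowerCase()));
  const comparison = new Set(comparisonWords.map((w) => w.toLowerCase()));
  const targetList = Array.from(target);
  const comparisonList = Array.from(comparison);
  return {
    missing_words: comparisonList.filter((w) => !target.has(w)).sort(),
    extra_words: targetList.filter((w) => !comparison.has(w)).sort(),
    overlapping_words: targetList.filter((w) => comparison.has(w)).sort(),
  };
}

const diffDemo = diffVocabulary(
  ['more', 'want', 'hello'],
  ['more', 'want', 'goodbye', 'you'],
);
console.log(diffDemo);
```

Every word in either set lands in exactly one of the three lists, so their lengths add up to the size of the combined vocabulary.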

Reference Data

The module includes reference vocabulary lists for English:

Core Lists: Multiple core vocabulary definitions (Anderson & Bitner, Universal Core, UNC, etc.)
Common Words: High-frequency words with baseline effort scores
Sentences: 30 test sentences for construction analysis
Synonyms: Word-to-synonym mappings
Fringe: Extended vocabulary lists

Location: src/optional/analytics/reference/data/

Testing

Run the test scripts to see the features in action:

# Vocabulary coverage analysis
npx ts-node test-vocabulary-analysis.ts

# Comparative analysis
npx ts-node test-comparison-analysis.ts

Integration with Processors

All processors can now use these analysis features:

import { ObfProcessor, Analytics } from '@willwade/aac-processors';

const processor = new ObfProcessor();
const tree = processor.loadIntoTree('my-board.obf');

const metrics = new Analytics.MetricsCalculator().analyze(tree);

const vocabAnalyzer = new Analytics.VocabularyAnalyzer();
const coverage = vocabAnalyzer.analyze(metrics);

// Identify gaps in core vocabulary
Object.entries(coverage.core_coverage).forEach(([id, data]) => {
  if (data.coverage_percent < 50) {
    console.log(
      `${data.name}: Only ${data.coverage_percent.toFixed(1)}% covered`,
    );
    console.log(`Missing: ${data.missing_words.slice(0, 10).join(", ")}`);
  }
});
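
The coverage result also carries high_effort_words, which could be ranked to find repositioning candidates. The sketch below assumes the { word, effort } shape shown in the vocabulary output structure; the helper topRepositioningCandidates is hypothetical, and the sample effort values other than 8.5 are made up.

```typescript
interface EffortWord {
  word: string;
  effort: number;
}

// Return the `limit` highest-effort words, hardest first.
function topRepositioningCandidates(highEffortWords: EffortWord[], limit: number): EffortWord[] {
  return highEffortWords
    .slice() // copy so the analysis result is not mutated by sort()
    .sort((a, b) => b.effort - a.effort)
    .slice(0, Math.max(0, limit));
}

const rankingDemo = topRepositioningCandidates(
  [
    { word: 'hello', effort: 8.5 },
    { word: 'goodbye', effort: 9.1 },
    { word: 'because', effort: 7.2 },
  ],
  2,
);
console.log(rankingDemo.map((w) => w.word));
```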

CLI Integration (Future)

These features are planned for the CLI. The commands below are proposed and not yet available:

# Analyze vocabulary coverage
aac-processors metrics my-boardset.obf --vocabulary

# Compare two board sets
aac-processors compare target.obfset comparison.obfset --output comparison.json

# Generate coverage report
aac-processors coverage my-boardset.obf --core-lists default,unc --format markdown

Use Cases

Clinical Evaluation: Identify gaps in core vocabulary coverage for a client's board set
Board Set Comparison: Compare different configurations (e.g., before/after optimization)
Research: Analyze vocabulary coverage across different board sets or formats
Quality Assurance: Ensure board sets meet minimum coverage thresholds
Optimization: Identify high-effort words that could be repositioned for easier access
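
For the quality assurance use case, a threshold check over core_coverage might look like the sketch below. The helper listsBelowThreshold and the 50% minimum are illustrative assumptions; the guide does not define a threshold API. The second entry in the demo data is invented.

```typescript
// Only the fields this check needs from a core_coverage entry.
interface CoverageEntry {
  name: string;
  coverage_percent: number;
}

// Return one human-readable message per core list that falls below `minPercent`.
function listsBelowThreshold(
  coreCoverage: Record<string, CoverageEntry>,
  minPercent: number,
): string[] {
  return Object.keys(coreCoverage)
    .filter((id) => coreCoverage[id].coverage_percent < minPercent)
    .map((id) => {
      const entry = coreCoverage[id];
      return `${entry.name}: ${entry.coverage_percent.toFixed(1)}% (minimum ${minPercent}%)`;
    });
}

const qaFailures = listsBelowThreshold(
  {
    default: { name: 'Combined Core, Anderson & Bitner, 2013', coverage_percent: 21.98 },
    sample: { name: 'Hypothetical list', coverage_percent: 80 },
  },
  50,
);
console.log(qaFailures);
```

An empty result means every list met the minimum, which makes the function easy to use as a pass/fail gate in a test or build step.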

Performance

Vocabulary analysis: ~100ms for a 2,400-word board set
Sentence analysis: ~50ms for 30 sentences
Comparative analysis: ~200ms for comparing two 2,000+ word sets
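
These figures are approximate and will vary by machine. A rough way to check them locally is to average several runs; the helper averageMs below is a hypothetical sketch, not part of the module. It uses Date.now(), which only has millisecond resolution, so a single run of a fast task would not be measured reliably.

```typescript
// Run `task` `runs` times and return the mean wall-clock time per run in milliseconds.
function averageMs(task: () => void, runs: number): number {
  if (runs < 1) {
    throw new RangeError('runs must be at least 1');
  }
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    task();
  }
  return (Date.now() - start) / runs;
}

// Stand-in workload; replace with the analysis call being measured.
let timingCalls = 0;
const timingDemo = averageMs(() => {
  timingCalls++;
}, 5);
console.log(`average: ${timingDemo} ms over ${timingCalls} runs`);
```

With the objects from the usage examples in scope, a call such as averageMs(() => vocabAnalyzer.analyze(metrics), 20) would time the vocabulary analysis.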

Files

src/optional/analytics/metrics/vocabulary.ts - Vocabulary coverage analysis
src/optional/analytics/metrics/sentence.ts - Sentence construction analysis
src/optional/analytics/metrics/comparison.ts - Comparative analysis
src/optional/analytics/reference/index.ts - Reference data loader
test-vocabulary-analysis.ts - Vocabulary analysis demo
test-comparison-analysis.ts - Comparative analysis demo
