illidanlab/semantic-dynamics-mci

Conversational Semantic Dynamics for Mild Cognitive Impairment Detection

Overview

Mild Cognitive Impairment (MCI) is an early stage of cognitive decline for which timely detection is important. Existing language-based approaches for MCI detection have shown promise, but most rely on static summaries of speech or text and may miss how semantic content changes over the course of a conversation. We propose a framework that captures dynamic semantic patterns from clinical conversations, including topic drift, prompt alignment, discourse coherence, and session-level semantic dispersion, and integrates them with conventional linguistic features for MCI detection. Our results show that modeling these semantic dynamics improves performance of MCI detection over the original linguistic baseline. These findings suggest that conversational semantic change may contribute to digital biomarkers for early cognitive decline.

Language Marker Extractor

To extract language markers from the transcripts, first extract syntactic complexity features with the L2 Syntactic Complexity Analyzer. Alternatively, use the GUI from the neosca GitHub repository to extract the same features.

Next, save the syntactic complexity features to rawdata/syntactic_complexity_measures.csv, place the transcripts in the Transcriptions folder, and run:

python feature_extractor.py

This generates the new8-extended feature map in rawdata/id2feature.p. The feature vector is 107-dimensional (99 legacy features + 8 new features).
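As a quick sanity check after extraction, the sketch below loads a pickled feature map and reports any entry whose length is not 107. It assumes the pickle holds a dict mapping an ID to a flat feature sequence; the README does not state the exact layout, so treat that structure, the helper name check_feature_map, and the toy IDs as hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

EXPECTED_DIM = 107  # 99 legacy + 8 new features, as stated above


def check_feature_map(path, expected_dim=EXPECTED_DIM):
    """Return (number of entries, {id: length}) for vectors of unexpected length.

    Assumption: the pickle holds a dict mapping an ID to a flat feature sequence.
    Only unpickle files you trust.
    """
    with open(path, "rb") as fh:
        id2feature = pickle.load(fh)
    mismatched = {
        key: len(vec) for key, vec in id2feature.items() if len(vec) != expected_dim
    }
    return len(id2feature), mismatched


# Toy stand-in for rawdata/id2feature.p so the sketch runs without the real data.
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "id2feature.p"
    toy_map = {"s1": [0.0] * 107, "s2": [0.0] * 3}  # hypothetical IDs
    with open(toy_path, "wb") as fh:
        pickle.dump(toy_map, fh)
    n_entries, mismatched = check_feature_map(toy_path)

print(n_entries, mismatched)
```

To check real output, call check_feature_map("rawdata/id2feature.p") after running the extractor.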

New 8-Feature Extension

This updated version includes 8 newly developed linguistic features organized into 4 semantic groups:

Feature Groups

  1. Topic Drift (2 features):

    • topic_z_score: Z-score normalized topic coherence across conversation sequences
    • topic_z_score_indicator: Binary indicator for significant topic drift detection
  2. Prompt Alignment (1 feature):

    • prompt_align: Semantic alignment between participant responses and interviewer prompts
  3. Within-Session Coherence (3 features):

    • coh_mean: Mean coherence within a conversation session
    • coh_var: Variance of coherence within a session
    • coh_min: Minimum coherence within a session
  4. Session Semantic Dispersion (2 features):

    • sess_var_mean: Mean semantic variance across sessions
    • sess_var_norm: Normalized semantic variance across sessions
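To make the within-session coherence group concrete, here is a minimal sketch that derives coh_mean, coh_var, and coh_min from a sequence of utterance embeddings. It assumes coherence is the cosine similarity between consecutive utterance embeddings and that variance is the population variance; the repository may define both differently, and the toy vectors are illustrative only.

```python
import math
from statistics import mean, pvariance


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def coherence_features(embeddings):
    """Summarize similarity between consecutive utterances in one session.

    Assumption: 'coherence' means adjacent-utterance cosine similarity.
    """
    sims = [cosine(a, b) for a, b in zip(embeddings, embeddings[1:])]
    if not sims:
        raise ValueError("need at least two utterances to measure coherence")
    return {
        "coh_mean": mean(sims),
        "coh_var": pvariance(sims),
        "coh_min": min(sims),
    }


# Toy session: the third utterance is orthogonal to the first two.
toy_session = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
features = coherence_features(toy_session)
print(features)
```

In this toy session the first pair of utterances is identical and the second pair is unrelated, so the minimum captures the single abrupt break while the mean and variance describe the session as a whole.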

Ablation Study

To analyze the contribution of each new feature group, run leave-one-group-out ablation using:

python main_ablation_new8_logo.py --num_total_runs 100

The script automatically evaluates 4 settings:

  • leave out topic_drift
  • leave out prompt_alignment
  • leave out within_session_coherence
  • leave out session_semantic_dispersion
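The four settings can be illustrated with a short sketch that enumerates which of the 8 new features remain when each group is held out. The group and feature names come from the lists above; the helper leave_one_group_out is hypothetical and is not taken from main_ablation_new8_logo.py.

```python
FEATURE_GROUPS = {
    "topic_drift": ["topic_z_score", "topic_z_score_indicator"],
    "prompt_alignment": ["prompt_align"],
    "within_session_coherence": ["coh_mean", "coh_var", "coh_min"],
    "session_semantic_dispersion": ["sess_var_mean", "sess_var_norm"],
}


def leave_one_group_out(groups):
    """Yield (held_out_group, remaining_feature_names) for every group."""
    for held_out in groups:
        kept = [
            name
            for group, names in groups.items()
            if group != held_out
            for name in names
        ]
        yield held_out, kept


settings = dict(leave_one_group_out(FEATURE_GROUPS))
for held_out, kept in settings.items():
    print(f"leave out {held_out}: {len(kept)} new features kept")
```

Each setting would then be trained and evaluated with the legacy features plus the remaining new features, so a drop relative to the full model indicates how much the held-out group contributes.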

To summarize ablation results:

python summarize_alpha_ablation.py --pattern "logs/ablation_*.out"

Permutation Importance Analysis

To evaluate feature importance through permutation testing:

python main_permutation_new8.py --num_total_runs 100 --num_permute_repeats 10

This evaluates single-feature and group-level importance of the 8 new features by measuring performance drop after permutation.
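The idea behind permutation importance can be sketched in a few lines of plain Python. This is a generic illustration, not the logic of main_permutation_new8.py: the accuracy metric, the joint shuffling of grouped columns, and the toy classifier are all assumptions made for the example.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which predict(row) equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, columns, n_repeats=10, seed=0):
    """Mean accuracy drop when `columns` are shuffled across rows.

    All listed columns move together under one row permutation, so passing
    several columns gives a group-level score (an assumption of this sketch).
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    drops = []
    for _ in range(n_repeats):
        order = list(range(len(X)))
        rng.shuffle(order)
        X_perm = [list(row) for row in X]
        for i, src in enumerate(order):
            for c in columns:
                X_perm[i][c] = X[src][c]
        drops.append(baseline - accuracy(predict, X_perm, y))
    return sum(drops) / n_repeats


# Toy data: column 0 determines the label, column 1 is ignored by the model.
X = [[0, 5], [1, 3], [0, 7], [1, 1], [0, 2], [1, 9]]
y = [0, 1, 0, 1, 0, 1]


def toy_predict(row):
    return int(row[0] > 0.5)


imp_signal = permutation_importance(toy_predict, X, y, columns=[0])
imp_noise = permutation_importance(toy_predict, X, y, columns=[1])
print(imp_signal, imp_noise)
```

A feature the model never reads yields zero drop, while a feature it depends on yields a positive drop on average. The n_repeats argument plays the same role as --num_permute_repeats in the command above.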

The AuxiliaryExperiments folder also contains scripts for subject differentiation and confounder classification performance, both before and after temporal harmonization.

Data Helper Scripts

The Data folder includes four helper scripts for topic-based preprocessing and analysis:

  1. Build session-topic mapping:
python Data/topic.py

Output: outputs/session_topics.csv

  2. Build participant session text (aligned to topics):
python Data/session.py

Output: outputs/sessions_with_text.csv

  3. Compute session-level topic similarity and z-scores:
python Data/similarity.py

Outputs:

  • outputs/sessions_with_z_scores.csv
  • outputs/mci_scores.csv
  • outputs/nc_scores.csv

  4. Aggregate to subject-level features:
python Data/subject.py

Output: outputs/subject_level_features.csv

Optional plotting for step 4:

python Data/subject.py --plot
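For readers unfamiliar with the z-scoring in step 3, the following sketch standardizes a list of session-level topic similarities and flags unusually low ones. It is not the code in Data/similarity.py: the reference distribution (here, the list itself), the use of the population standard deviation, and the threshold of -1.0 are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values against their own mean and population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)  # no spread, so nothing stands out
    return [(v - mu) / sigma for v in values]


def drift_indicator(z, threshold=-1.0):
    """Flag sessions whose z-score falls below a hypothetical threshold."""
    return [int(v < threshold) for v in z]


# Toy similarities between session text and its topic; the last one is low.
toy_similarities = [0.8, 0.8, 0.8, 0.2]
z = z_scores(toy_similarities)
flags = drift_indicator(z)
print(z, flags)
```

Only the last toy session is flagged, which mirrors how a continuous score such as topic_z_score can be paired with a binary indicator such as topic_z_score_indicator.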

Data Request

The data is available upon request at https://www.i-conect.org/

Acknowledgement

This material is based in part upon work supported by the National Science Foundation under Grant IIS-2212174, the National Institute on Aging (NIA) under Grant 1RF1AG072449, and the National Institute of General Medical Sciences (NIGMS) under Grant 1R01GM145700.
