127 changes: 127 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,127 @@
# Contributing to Asclepios AI

Thank you for your interest in contributing to **Asclepios AI**!\
We welcome contributions that help improve the platform, documentation,
models, and code quality.

------------------------------------------------------------------------

## 🚀 How to Contribute

### 1. Fork the Repository

Click the **Fork** button at the top of the project repository to create
your own copy.

### 2. Clone Your Fork

``` bash
git clone git@github.com:<your-username>/ELO2_Asclepios_Ai.git
cd ELO2_Asclepios_Ai
```

### 3. Create a New Branch

Always create a feature or fix branch in your fork:

``` bash
git checkout -b feature/new-enhancement
```

------------------------------------------------------------------------

## 🧪 Development Setup

Install dependencies:

``` bash
pip install -r requirements.txt
```

Ensure you are using **Python 3.10+**.

------------------------------------------------------------------------

## 📝 Code Standards

To maintain consistency, contributors are expected to follow:

### ✔️ Python Standards

- Use **PEP 8** formatting.

- Run linters before submitting a PR:

``` bash
ruff check .
```

### ✔️ Markdown Standards

Markdown files are checked using the rules in `.markdownlint.yml`.

Run markdown linting:

``` bash
markdownlint .
```

### ✔️ Folder & File Naming

We follow `.ls-lint.yml` rules for consistent file naming across the
project.

------------------------------------------------------------------------

## 🧪 Testing

Ensure all code added or modified includes tests where applicable.

Run tests:

``` bash
pytest
```
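
As an illustration of the kind of test expected, here is a minimal sketch. The helper `completion_rate` and the file name `test_metrics.py` are hypothetical and do not exist in this repository; they only show the shape of a test that `pytest` collects by default (a `test_*.py` file containing functions whose names start with `test_`).

```python
# test_metrics.py (hypothetical file name, for illustration only)


def completion_rate(statuses):
    """Return the fraction of episodes marked 'completed' (hypothetical helper)."""
    if not statuses:
        return 0.0
    completed = sum(1 for status in statuses if status == "completed")
    return completed / len(statuses)


def test_completion_rate_counts_only_completed():
    statuses = ["completed", "dropped_out", "completed", "transferred"]
    assert completion_rate(statuses) == 0.5


def test_completion_rate_handles_empty_input():
    # An empty input should not raise a ZeroDivisionError.
    assert completion_rate([]) == 0.0
```

In a real contribution the helper would live in its own module and the test file would import it.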

------------------------------------------------------------------------

## 🔄 Submitting a Pull Request (PR)

1. Commit your changes:

``` bash
git commit -m "Add: new enhancement"
```

2. Push to your fork:

``` bash
git push origin feature/new-enhancement
```

3. Create a Pull Request on GitHub.

**Include the following in your PR:**

- Clear description of the change
- Related issue number (if applicable)
- Screenshots (for UI/visual tools)
- Test results (if required)

------------------------------------------------------------------------

## 💬 Community Guidelines

- Be respectful and helpful.
- Use clear communication.
- Provide meaningful commit messages.
- Document major changes clearly.

------------------------------------------------------------------------

## 📄 License

By contributing, you agree that your contributions will be licensed
under the **MIT License**, included in this repository.

------------------------------------------------------------------------

Thank you for helping make **Asclepios AI** better!\
For questions or suggestions, feel free to open an issue.
201 changes: 146 additions & 55 deletions README.md
@@ -1,11 +1,43 @@
# Asclepios AI: Optimizing Substance Use Disorder Treatment
# **Asclepios AI: Data-Driven Treatment Optimization for Substance Use Disorder (SUD)**

Welcome to the Asclepios AI project! This repository contains all the work
done by our team to develop an AI-powered platform that helps substance use
disorder (SUD) treatment facilities optimize patient care and resource
allocation.
Asclepios AI is a data-driven exploration into how machine learning,
domain knowledge, and thoughtful analysis can help improve outcomes for
individuals undergoing Substance Use Disorder (SUD) treatment. Using real
admissions and treatment episode data, our goal is to identify patterns
that can help facilities personalize treatment duration, improve completion
rates, and reduce relapse/readmission.

## 🌟 What We're Building
## Our Team

This project is being developed by a collaborative team:

- **Caesar Ghazi** ([@CaesarGhazi](https://github.com/CaesarGhazi))
- **Moe Alwathiq** ([@Moe-phantom](https://github.com/Moe-phantom))
- **Rafaa Ali** ([@RafaaAli](https://github.com/RafaaAli))
- **Wuor Bhang** ([@WuorBhang](https://github.com/WuorBhang))

## **Primary Research Question**

How accurately can patient demographics, history, and treatment factors
predict an optimal treatment duration that minimizes relapse risk in Substance
Use Disorder treatment?

## Problem Statement

Despite the critical importance of completing adequate treatment for
sustained recovery, SUD facilities lack systematic approaches to personalizing
treatment length based on patient characteristics and predicted outcomes.
Approximately **30% of patients are readmitted within one year**, suggesting
many individuals either receive insufficient treatment or discontinue prematurely.

At the same time, treatment facilities struggle with capacity planning. They
often cannot effectively match the timing and volume of new admissions with
available beds and staff. This mismatch leads to under-used treatment slots,
prolonged waitlists, and reduced access to care.

Asclepios AI aims to use data-driven insights to address these challenges.

## What We're Building

We're creating a smart system that helps treatment centers:

@@ -17,16 +49,7 @@ Think of it like a personalized recommendation system - just like how Netflix
recommends movies you might like, our system recommends the optimal treatment
plan for each patient based on their unique situation.

## 👥 Our Team

This project is being developed by a collaborative team:

- **Caesar Ghazi** ([@CaesarGhazi](https://github.com/CaesarGhazi))
- **Moe Alwathiq** ([Moe-phantom](https://github.com/Moe-phantom))
- **Rafaa Ali** ([@RafaaAli](https://github.com/RafaaAli))
- **Wuor Bhang** ([@WuorBhang](https://github.com/WuorBhang))

## 🎯 Why This Matters
## Why This Matters

Substance use disorders affect millions of people worldwide and cost society
billions of dollars each year. Unfortunately:
@@ -39,61 +62,129 @@ billions of dollars each year. Unfortunately:
Our AI system aims to solve these problems by using data to make better
predictions about what each patient needs.

## 📁 Repository Structure
## Data Overview

Our work is organized in a step-by-step approach:
We use publicly available SUD admissions and treatment episode datasets,
which include variables such as:

```/
├── 0_domain_study/ # Research about SUD treatment and AI applications
├── 1_datasets/ # The data we're using to train our AI models
├── 2_data_preparation/ # Cleaning and organizing the data
├── 3_data_exploration/ # Understanding patterns in the data
├── 4_data_analysis/ # Building and testing our AI models
├── 5_communication_strategy/ # How we'll share our findings with others
├── 6_final_presentation/ # Our final presentation materials
└── collaboration/ # Team agreements, schedules, and reflections
- Patient demographics
- Substance use patterns
- Co-occurring issues
- Treatment history
- Treatment length
- Completion status
- Readmission / relapse factors

```
### Repository Data Folders

## 📊 Data Sources
- 📁 **Raw Data:** [`1_datasets/raw/`](./1_datasets/raw/)
- 📁 **Processed Data:** [`1_datasets/processed/`](./1_datasets/processed/)
- 📁 **Sample Data:** [`1_datasets/sample/`](./1_datasets/sample/)
- 📄 **Data Documentation:** [`1_datasets/README.md`](./1_datasets/README.md)

We're using publicly available datasets from reputable sources:
---

## Data Preparation

This step includes:

- Cleaning and formatting raw datasets
- Handling missing or inconsistent data
- Encoding categorical variables
- Normalizing and transforming features
- Structuring datasets for modeling

📁 Folder:
**[`2_data_preparation/`](./2_data_preparation/)**
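
The preparation steps listed above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the field names (`age`, `substance`, `prior_episodes`) and the sample records are hypothetical and are not the dataset's real columns. Median imputation, one-hot encoding, and min-max scaling are stand-ins for whichever techniques the project actually adopts.

```python
from statistics import median

# Hypothetical records; field names are illustrative, not the dataset's real columns.
records = [
    {"age": 34, "substance": "alcohol", "prior_episodes": 2},
    {"age": None, "substance": "opioids", "prior_episodes": 0},
    {"age": 52, "substance": "alcohol", "prior_episodes": 5},
]


def scale(values):
    """Min-max scale a list of numbers into the range [0, 1]."""
    low, high = min(values), max(values)
    span = high - low
    # Avoid division by zero when every value is identical.
    return [0.0 if span == 0 else (v - low) / span for v in values]


def prepare(rows):
    """Impute missing ages, one-hot encode substance, and scale numeric fields."""
    known_ages = [r["age"] for r in rows if r["age"] is not None]
    fill_age = median(known_ages)  # median imputation for missing ages
    categories = sorted({r["substance"] for r in rows})

    ages = scale([r["age"] if r["age"] is not None else fill_age for r in rows])
    priors = scale([r["prior_episodes"] for r in rows])

    prepared = []
    for row, age, prior in zip(rows, ages, priors):
        features = {"age": age, "prior_episodes": prior}
        for category in categories:
            # One indicator column per observed substance category.
            features[f"substance_{category}"] = 1 if row["substance"] == category else 0
        prepared.append(features)
    return prepared


for features in prepare(records):
    print(features)
```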

---

- **Treatment Episode Data Set (TEDS)**: Information about admissions to
substance abuse treatment facilities
- **National Survey on Drug Use and Health (NSDUH)**: Annual survey data about
drug use and mental health in the U.S.
## Exploratory Data Analysis (EDA)

All data is handled ethically and with respect for patient privacy.
We explore:

## 🚀 How to Use This Repository
- Relationships between treatment factors and outcomes
- Duration patterns across substance types
- Demographic and behavioral trends
- Variables most correlated with relapse or completion
- Facility-level patterns affecting patient success

If you're interested in our work, you can:
📁 Folder:
**[`3_data_exploration/`](./3_data_exploration/)**
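
One of the questions above, duration patterns across substance types, could be explored with a simple grouped summary. The sketch below uses invented episodes and a hypothetical tuple layout (substance, length of stay in days, completion flag); it does not reflect the real data or the notebooks in this folder.

```python
from collections import defaultdict

# Hypothetical episodes: (substance, length of stay in days, completed treatment?)
episodes = [
    ("alcohol", 30, True),
    ("alcohol", 10, False),
    ("opioids", 60, True),
    ("opioids", 20, False),
    ("opioids", 40, True),
]


def summarize_by_substance(rows):
    """Return {substance: (mean_days, completion_rate)} for each substance."""
    grouped = defaultdict(list)
    for substance, days, completed in rows:
        grouped[substance].append((days, completed))

    summary = {}
    for substance, items in grouped.items():
        mean_days = sum(days for days, _ in items) / len(items)
        completion_rate = sum(1 for _, done in items if done) / len(items)
        summary[substance] = (mean_days, completion_rate)
    return summary


for substance, (mean_days, rate) in sorted(summarize_by_substance(episodes).items()):
    print(f"{substance}: mean stay {mean_days:.1f} days, completion rate {rate:.0%}")
```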

1. **Explore our research** in the [0_domain_study](./0_domain_study/) folder
2. **See our data** in the [1_datasets](./1_datasets/) folder
3. **Learn about our approach** by reading the README files in each folder
4. **Follow our progress** through our collaboration documents
---

Each folder contains a README.md file that explains what's in that section in
plain language.
## Modeling & Analysis

## 🤝 Want to Contribute?
We will build models to:

While this is primarily a student project, we welcome feedback and
suggestions! Feel free to:
- Predict relapse likelihood
- Identify high-risk patients
- Predict facility resource demand
- Evaluate model fairness and robustness

- Open an issue if you have questions or ideas
- Fork the repository to experiment with our approach
- Contact any of our team members with questions
Analysis includes:

## 📚 Learn More
- Feature engineering
- Training and validation
- Model comparison
- Interpretability

- Read about our research approach in [0_domain_study/README.md](./0_domain_study/README.md)
- See what datasets we're using in [1_datasets/README.md](./1_datasets/README.md)
- Check out our team agreements in the [collaboration](./collaboration/) folder
📁 Folder:
**[`4_data_analysis/`](./4_data_analysis/)**
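
To make "training and validation" and "model comparison" concrete, here is a minimal sketch that holds out part of the data and compares a simple rule against a majority-class baseline. Everything in it is an assumption for illustration: the data is invented, `prior_episodes` is a hypothetical feature, and the threshold rule stands in for whatever models the project ends up training.

```python
import random

# Hypothetical labelled examples: (prior_episodes, readmitted_within_a_year)
data = [(0, 0), (1, 0), (0, 0), (4, 1), (5, 1), (2, 0), (6, 1), (1, 0), (3, 1), (0, 0)]


def train_validation_split(rows, validation_fraction=0.3, seed=0):
    """Shuffle deterministically and hold out a fraction for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * validation_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(feature) == label for feature, label in rows) / len(rows)


def fit_majority(train):
    """Baseline: always predict the most common label in the training set."""
    positives = sum(label for _, label in train)
    majority = 1 if positives * 2 > len(train) else 0
    return lambda feature: majority


def fit_threshold(train):
    """Pick the cut-off on the feature that gives the best training accuracy."""
    candidates = sorted({feature for feature, _ in train})
    best = max(candidates, key=lambda t: accuracy(lambda f: int(f >= t), train))
    return lambda feature: int(feature >= best)


train, validation = train_validation_split(data)
for name, fit in [("majority baseline", fit_majority), ("threshold rule", fit_threshold)]:
    model = fit(train)
    print(f"{name}: validation accuracy {accuracy(model, validation):.2f}")
```

Reporting a trivial baseline next to each model shows how much of the score comes from class imbalance alone.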

---

*This project is part of the Emerging Talent 6 Collaborative Data Science
Project*
## Communicating Results

Our communication will include:

- Insight summaries
- Visual dashboards or graphs
- Interpretations for clinicians and decision-makers
- Recommendations based on model findings
- Limitations and ethical considerations

📁 Folder:
**[`5_communication_strategy/`](./5_communication_strategy/)**

---

## Final Presentation

Our final deliverables will recap the entire analysis with:

- A structured summary of findings
- Visualizations
- Model results
- Recommendations for treatment facilities
- Reflections and next steps

📁 Folder:
**[`6_final_presentation/`](./6_final_presentation/)**

---

## Repository Structure

Below is the full layout of the project repository:

```text
Asclepios_Ai/
├── 0_domain_study/
├── 1_datasets/
│ ├── processed/
│ ├── sample/
│ └── raw/
├── 2_data_preparation/
├── 3_data_exploration/
├── 4_data_analysis/
├── 5_communication_strategy/
└── 6_final_presentation/
```

### License

This project is licensed under the **MIT License**.
📄 [View License](./LICENSE)