diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index e69de29..a5739af 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,127 @@
+# Contributing to Asclepios AI
+
+Thank you for your interest in contributing to **Asclepios AI**!\
+We welcome contributions that help improve the platform, documentation,
+models, and code quality.
+
+------------------------------------------------------------------------
+
+## 🚀 How to Contribute
+
+### 1. Fork the Repository
+
+Click the **Fork** button at the top of the project repository to create
+your own copy.
+
+### 2. Clone Your Fork
+
+``` bash
+git clone git@github.com:MIT-Emerging-Talent/ELO2_Asclepios_Ai.git
+cd asclepios-ai
+```
+
+### 3. Create a New Branch
+
+Always create a feature or fix branch in your fork:
+
+``` bash
+git checkout -b feature/new-enhancement
+```
+
+------------------------------------------------------------------------
+
+## 🧪 Development Setup
+
+Install dependencies:
+
+``` bash
+pip install -r requirements.txt
+```
+
+Ensure you are using **Python 3.10+**.
+
+------------------------------------------------------------------------
+
+## 📝 Code Standards
+
+To maintain consistency, contributors are expected to follow:
+
+### ✔️ Python Standards
+
+- Use **PEP8** formatting.
+
+- Run linters before submitting a PR:
+
+    ``` bash
+    ruff check .
+    ```
+
+### ✔️ Markdown Standards
+
+Markdown files are checked using: - `.markdownlint.yml`
+
+Run markdown linting:
+
+``` bash
+markdownlint .
+```
+
+### ✔️ Folder & File Naming
+
+We follow `.ls-lint.yml` rules for consistent file naming across the
+project.
+
+------------------------------------------------------------------------
+
+## 🧪 Testing
+
+Ensure all code added or modified includes tests where applicable.
+
+Run tests:
+
+``` bash
+pytest
+```
+
+------------------------------------------------------------------------
+
+## 🔄 Submitting a Pull Request (PR)
+
+1. Commit your changes:
+
+    ``` bash
+    git commit -m "Add: new enhancement"
+    ```
+
+2. Push to your fork:
+
+    ``` bash
+    git push origin feature/new-enhancement
+    ```
+
+3. Create a Pull Request on GitHub.
+
+**Include the following in your PR:** - Clear description of the
+change - Related issue number (if applicable) - Screenshots (for
+UI/visual tools) - Test results (if required)
+
+------------------------------------------------------------------------
+
+## 💬 Community Guidelines
+
+- Be respectful and helpful.
+- Use clear communication.
+- Provide meaningful commit messages.
+- Document major changes clearly.
+
+------------------------------------------------------------------------
+
+## 📄 License
+
+By contributing, you agree that your contributions will be licensed
+under the **MIT License**, included in this repository.
+
+------------------------------------------------------------------------
+
+Thank you for helping make **Asclepios AI** better!\
+For questions or suggestions, feel free to open an issue.
diff --git a/README.md b/README.md
index 9e71051..4046920 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,43 @@
-# Asclepios AI: Optimizing Substance Use Disorder Treatment
+# **Asclepios AI: Data-Driven Treatment Optimization for Substance Use Disorder (SUD)**
 
-Welcome to the Asclepios AI project! This repository contains all the work
-done by our team to develop an AI-powered platform that helps substance use
-disorder (SUD) treatment facilities optimize patient care and resource
-allocation.
+Asclepios AI is a data-driven exploration into how machine learning,
+domain knowledge, and thoughtful analysis can help improve outcomes for
+individuals undergoing Substance Use Disorder (SUD) treatment. Using real
+admissions and treatment episode data, our goal is to identify patterns
+that can help facilities personalize treatment duration, improve completion
+rates, and reduce relapse/readmission.
 
-## 🌟 What We're Building
+## Our Team
+
+This project is being developed by a collaborative team:
+
+- **Caesar Ghazi** ([@CaesarGhazi](https://github.com/CaesarGhazi))
+- **Moe Alwathiq** ([Moe-phantom](https://github.com/Moe-phantom))
+- **Rafaa Ali** ([@RafaaAli](https://github.com/RafaaAli))
+- **Wuor Bhang** ([@WuorBhang](https://github.com/WuorBhang))
+
+## **Primary Research Question**
+
+How accurately can patient demographics, history, and treatment factors
+predict an optimal treatment duration that minimizes relapse risk in Substance
+Use Disorder treatment?
+
+## Problem Statement
+
+Despite the critical importance of completing adequate treatment for
+sustained recovery, SUD facilities lack systematic approaches to personalizing
+treatment length based on patient characteristics and predicted outcomes.
+Approximately **30% of patients are readmitted within one year**, suggesting
+many individuals either receive insufficient treatment or discontinue prematurely.
+
+At the same time, treatment facilities struggle with capacity planning. They
+often cannot effectively match the timing and volume of new admissions with
+available beds and staff. This mismatch leads to under-used treatment slots,
+prolonged waitlists, and reduced access to care.
+
+Asclepios AI aims to use data-driven insights to address these challenges.
+
+## What We're Building
 
 We're creating a smart system that helps treatment centers:
 
@@ -17,16 +49,7 @@ Think of it like a personalized recommendation system - just like how Netflix
 recommends movies you might like, our system recommends the optimal treatment
 plan for each patient based on their unique situation.
 
-## 👥 Our Team
-
-This project is being developed by a collaborative team:
-
-- **Caesar Ghazi** ([@CaesarGhazi](https://github.com/CaesarGhazi))
-- **Moe Alwathiq** ([Moe-phantom](https://github.com/Moe-phantom))
-- **Rafaa Ali** ([@RafaaAli](https://github.com/RafaaAli))
-- **Wuor Bhang** ([@WuorBhang](https://github.com/WuorBhang))
-
-## 🎯 Why This Matters
+## Why This Matters
 
 Substance use disorders affect millions of people worldwide and cost society
 billions of dollars each year. Unfortunately:
@@ -39,61 +62,129 @@ billions of dollars each year. Unfortunately:
 Our AI system aims to solve these problems by using data to make better
 predictions about what each patient needs.
 
-## 📁 Repository Structure
+## Data Overview
 
-Our work is organized in a step-by-step approach:
+We use publicly available SUD admissions and treatment episode datasets,
+which include variables such as:
 
-```/
-├── 0_domain_study/  # Research about SUD treatment and AI applications
-├── 1_datasets/  # The data we're using to train our AI models
-├── 2_data_preparation/  # Cleaning and organizing the data
-├── 3_data_exploration/  # Understanding patterns in the data
-├── 4_data_analysis/  # Building and testing our AI models
-├── 5_communication_strategy/  # How we'll share our findings with others
-├── 6_final_presentation/  # Our final presentation materials
-└── collaboration/  # Team agreements, schedules, and reflections
+- Patient demographics
+- Substance use patterns
+- Co-occurring issues
+- Treatment history
+- Treatment length
+- Completion status
+- Readmission / relapse factors
 
-```
+### Repository Data Folders
 
-## 📊 Data Sources
+- 📁 **Raw Data:** [`1_datasets/raw/`](./1_datasets/raw/)
+- 📁 **Processed Data:** [`1_datasets/processed/`](./1_datasets/processed/)
+- 📁 **Sample Data:** [`1_datasets/sample/`](./1_datasets/sample/)
+- 📄 **Data Documentation:** [`1_datasets/README.md`](./1_datasets/README.md)
 
-We're using publicly available datasets from reputable sources:
+---
+
+## Data Preparation
+
+This step includes:
+
+- Cleaning and formatting raw datasets
+- Handling missing or inconsistent data
+- Encoding categorical variables
+- Normalizing and transforming features
+- Structuring datasets for modeling
+
+📁 Folder:
+ **[`2_data_preparation/`](./2_data_preparation/)**
+
+---
 
-- **Treatment Episode Data Set (TEDS)**: Information about admissions to
-  substance abuse treatment facilities
-- **National Survey on Drug Use and Health (NSDUH)**: Annual survey data about
-  drug use and mental health in the U.S.
+## Exploratory Data Analysis (EDA)
 
-All data is handled ethically and with respect for patient privacy.
+We explore:
 
-## 🚀 How to Use This Repository
+- Relationships between treatment factors and outcomes
+- Duration patterns across substance types
+- Demographic and behavioral trends
+- Variables most correlated with relapse or completion
+- Facility-level patterns affecting patient success
 
-If you're interested in our work, you can:
+📁 Folder:
+ **[`3_data_exploration/`](./3_data_exploration/)**
 
-1. **Explore our research** in the [0_domain_study](./0_domain_study/) folder
-2. **See our data** in the [1_datasets](./1_datasets/) folder
-3. **Learn about our approach** by reading the README files in each folder
-4. **Follow our progress** through our collaboration documents
+---
 
-Each folder contains a README.md file that explains what's in that section in
-plain language.
+## Modeling & Analysis
 
-## 🤝 Want to Contribute?
+We will build models to:
 
-While this is primarily a student project, we welcome feedback and
-suggestions! Feel free to:
+- Predict relapse likelihood
+- Identify high-risk patients
+- Predict facility resource demand
+- Evaluate model fairness and robustness
 
-- Open an issue if you have questions or ideas
-- Fork the repository to experiment with our approach
-- Contact any of our team members with questions
+Analysis includes:
 
-## 📚 Learn More
+- Feature engineering
+- Training and validation
+- Model comparison
+- Interpretability
 
-- Read about our research approach in [0_domain_study/README.md](./0_domain_study/README.md)
-- See what datasets we're using in [1_datasets/README.md](./1_datasets/README.md)
-- Check out our team agreements in the [collaboration](./collaboration/) folder
+📁 Folder:
+ **[`4_data_analysis/`](./4_data_analysis/)**
 
 ---
 
-*This project is part of the Emerging Talent 6 Collaborative Data Science
-Project*
+## Communicating Results
+
+Our communication will include:
+
+- Insight summaries
+- Visual dashboards or graphs
+- Interpretations for clinicians and decision-makers
+- Recommendations based on model findings
+- Limitations and ethical considerations
+
+📁 Folder:
+ **[`5_communication_strategy/`](./5_communication_strategy/)**
+
+---
+
+## Final Presentation
+
+Our final deliverables will recap the entire analysis with:
+
+- A structured summary of findings
+- Visualizations
+- Model results
+- Recommendations for treatment facilities
+- Reflections and next steps
+
+📁 Folder:
+ **[`6_final_presentation/`](./6_final_presentation/)**
+
+---
+
+## Repository Structure
+
+Below is the full layout of the project repository:
+
+```text
+Asclepios_Ai/
+│
+├── 0_domain_study/
+├── 1_datasets/
+│   ├── processed/
+│   ├── sample/
+│   └── raw/
+├── 2_data_preparation/
+├── 3_data_exploration/
+├── 4_data_analysis/
+├── 5_communication_strategy/
+└── 6_final_presentation/
+```
+
+### License
+
+This project is licensed under the **MIT License**.
+📄 [View License](./LICENSE)