Open
30 commits
c375103
Initial plan
Copilot Sep 27, 2025
c472605
Fix pydantic imports and add production endpoint config
Copilot Sep 27, 2025
4d5bf39
Complete Tackle Hunger Data Explorer with all requested features
Copilot Sep 27, 2025
ef68ba8
Fix AWS IP detection and enhance dark mode styling
Copilot Sep 27, 2025
38a6709
Fix missing packages and comprehensive dark theme contrast issues
Copilot Sep 27, 2025
666f4fb
Implement client-side pagination with staging GraphQL client for real…
Copilot Sep 27, 2025
cc0f5e3
Remove all AWS calls and map functionality - keep only network graph …
Copilot Sep 27, 2025
b2ab259
Completely eliminate AWS calls with comprehensive network isolation c…
Copilot Sep 28, 2025
3897d29
Fix GraphQL query error and remove sample data fallback - use only re…
Copilot Sep 28, 2025
9f46258
Phase 1: Fix critical performance and security issues in Data Explorer
jonero1 Oct 17, 2025
7e199e0
Fix duplicate sys/os imports flagged in code review
jonero1 Oct 17, 2025
5cb3ff2
Phase 2: Add data export, validation recommendations, and bug fixes
jonero1 Oct 21, 2025
e03fceb
Fix KeyError: 'empty_fields' in quality analytics
jonero1 Oct 21, 2025
7a96c6d
Fix second KeyError: 'completeness' in display_data_tables
jonero1 Oct 21, 2025
08d83f9
Fix pandas Styler cell limit error for large datasets
jonero1 Oct 21, 2025
4bb46ff
Add Priority Export feature for lowest-quality records
jonero1 Oct 21, 2025
899a802
Phase 3.1: Add interactive Folium map with quality-based markers
jonero1 Oct 21, 2025
afaea3f
Clean Phase 3.1: Remove sample data fallback - live API works with AI…
jonero1 Oct 21, 2025
499d7f2
Phase 3.2: Add enhanced network analysis with community detection and…
jonero1 Oct 22, 2025
f4e4419
Fix documentation: Update community detection algorithm name to greed…
jonero1 Oct 22, 2025
c6eabb5
feat(phase-3.3): Add batch quality scan scheduler (Days 1-3)
jonero1 Oct 23, 2025
05e93d4
fix(phase-3.3): Replace deprecated applymap with map
jonero1 Oct 23, 2025
ae4893f
Phase 3.3 Day 4: Persistent scan storage with SQLite (security-hardened)
jonero1 Oct 23, 2025
6154e0c
Phase 3.3 Days 5-6: Quality history tracking + trend visualization + …
jonero1 Oct 24, 2025
515d9e9
docs: Add comprehensive project documentation suite
jonero1 Oct 24, 2025
ac22751
fix: Improve data quality scoring and volunteer workflow
jonero1 Oct 24, 2025
7fc1b27
docs: Add production infrastructure and volunteer onboarding materials
jonero1 Oct 24, 2025
882ce90
docs: Add Phase 4 planning and technical analysis documentation
jonero1 Oct 24, 2025
a5aefb8
docs: Add branch merge analysis and data exploration guides
jonero1 Nov 6, 2025
4a2a821
docs: Phase 4 web scraping architecture and session completion notes …
jonero1 Dec 1, 2025
26 changes: 18 additions & 8 deletions .env.example
@@ -1,13 +1,23 @@
# SIMPLE .env Configuration for Volunteers
# Copy this file to .env and add your actual API token
# Tackle Hunger API Configuration
# Copy this file to .env and fill in the actual values from GitHub secrets

# Required: Get this from your team lead
AI_SCRAPING_TOKEN=your_ai_scraping_token_here

# Optional: Custom GraphQL API URL (defaults to dev API if not set)
# GraphQL API Endpoints
AI_SCRAPING_GRAPHQL_URL=https://devapi.sboc.us/graphql

# Optional: Environment (defaults to "dev" if not set)
# API Authentication
AI_SCRAPING_TOKEN=your_ai_scraping_token_here

# Environment Selection (dev|copilot|staging|production)
ENVIRONMENT=dev

# That's it! The code handles everything else automatically.
# AI/ETL Operation Identifiers
CREATED_METHOD=AI_Copilot_Assistant
MODIFIED_BY=''

# Rate limiting and timeout settings
API_RATE_LIMIT=10
API_TIMEOUT=30

# Logging configuration
LOG_LEVEL=INFO
LOG_FORMAT=json
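
Reviewer note: the variables above could be read and validated in one place. The following is a minimal sketch, not code from this PR. The `Settings` class and `load_settings` function are hypothetical names, the defaults simply mirror the values in `.env.example`, and the unit of `API_RATE_LIMIT` (per second or per minute) is not stated in the diff, so it is carried through as a plain integer.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Allowed values taken from the comment in .env.example.
VALID_ENVIRONMENTS = {"dev", "copilot", "staging", "production"}
PLACEHOLDER_TOKEN = "your_ai_scraping_token_here"


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env.example."""
    graphql_url: str
    token: str
    environment: str
    rate_limit: int   # unit not specified in the diff
    timeout: int      # assumed to be seconds
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping; defaults to os.environ."""
    env = os.environ if env is None else env

    token = env.get("AI_SCRAPING_TOKEN", "")
    if not token or token == PLACEHOLDER_TOKEN:
        raise ValueError("AI_SCRAPING_TOKEN is missing or still the placeholder")

    environment = env.get("ENVIRONMENT", "dev")
    if environment not in VALID_ENVIRONMENTS:
        raise ValueError(f"Unknown ENVIRONMENT: {environment!r}")

    return Settings(
        graphql_url=env.get("AI_SCRAPING_GRAPHQL_URL",
                            "https://devapi.sboc.us/graphql"),
        token=token,
        environment=environment,
        rate_limit=int(env.get("API_RATE_LIMIT", "10")),
        timeout=int(env.get("API_TIMEOUT", "30")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Failing fast on the placeholder token keeps a copied but unedited `.env` from producing confusing authentication errors later.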
Empty file.
79 changes: 79 additions & 0 deletions .gitignore
@@ -174,3 +174,82 @@ env/

# Docker
docker-compose.override.yml

# Personal review documents (local workspace only)
PR65_REVIEW_SUMMARY.md

# Test & Analysis Files (Tackle Hunger specific)
test_*.py
*_backup.py
verify_*.py
check_*.py
investigate_*.py
analyze_*.py
demo_*.py
explore_*.py
export_*.py
generate_*.py
load_*.py
setup_*.py
quality_errors.csv
test_results.csv
data_analysis_*.csv
test_quality_scans.db

# Temporary Documentation (keep only curated docs in committed state)
*_SUMMARY.md
*_TRACKER.md
*_CHECKLIST.md
*_QUICKSTART.md
*_FIX*.md
*_UPDATE.md
*_COMPLETE.md
*_COMMENT.md
*_ISSUE.md
*_DRAFT.md
*_GUIDE.md
*_WORKSHEET.md
MY_INVESTIGATION_WORKSHEET.md
SESSION_SUMMARY_*.md
IMPLEMENTATION_SUMMARY_*.md
TESTING_CHECKLIST_*.md
QUICK_START_*.md
QUICK_REFERENCE_*.md
PHASE_*.md
!README.md
!PRODUCTION_README.md
!DEVELOPMENT_README.md
!DATA_EXPLORER_README.md

# Data Exports & Sample Data
data_exports/
sample_data.json
main_function.txt

# Scripts (temporary/test scripts only)
scripts/test_*.py
scripts/setup_*.py
reorganize_file.py
restart_streamlit.ps1
run_explorer.py
interactive_data_exploration.py

# Environment Templates (keep template, ignore filled versions)
env_template_for_apis.txt

# SQLite databases (except production schema)
*.db
!migrations/*.sql

# CSV data files
*.csv
!requirements*.txt

# GitHub workflows (if not ready for production)
.github/workflows/run-data-explorer.yml

# Copilot config (local only)
.copilot/

# Examples directory (if contains sensitive data)
examples/
47 changes: 47 additions & 0 deletions .streamlit/config.toml
@@ -0,0 +1,47 @@
[global]
# Disable all development and warning features
showWarningOnDirectExecution = false

[server]
# Run in headless mode - no browser opening
headless = true

# Force localhost only - no external network detection
address = "localhost"
port = 8000

# Disable CORS and XSRF which might trigger external lookups
enableCORS = false
enableXsrfProtection = false

# Disable static file serving which might cause network calls
enableStaticServing = false

# Disable websocket compression to avoid potential network overhead
enableWebsocketCompression = false

[browser]
# Completely disable usage statistics collection
gatherUsageStats = false

# Force browser connection to localhost only
serverAddress = "localhost"
serverPort = 8000

[client]
# Use minimal toolbar to avoid external feature checks
toolbarMode = "minimal"

# Don't show error details which might trigger external lookups
showErrorDetails = "none"

[runner]
# Disable magic mode which might cause external calls
magicEnabled = false

# Use fast reruns to avoid delays that might trigger timeouts/external calls
fastReruns = true

[logger]
# Set minimal logging to avoid external log services
level = "error"
124 changes: 124 additions & 0 deletions ACTION_PLAN_PHASE_4_AND_1.md
@@ -0,0 +1,124 @@
# 🎯 Action Plan: Focus on Impact → Scale to Real Data

## **PHASE 4: Focus on Impact (Immediate Actions)**

### **✅ Action Item 1: Fix YMCA Food Hub Address**
**Timeline**: This week
**Priority**: HIGH - Real organization, wrong data hurts people seeking help

**Steps:**
1. **Contact YMCA of Greater New York**
   - Phone: Look up the main YMCA number
   - Email: info@ymcanyc.org or similar
   - Ask: "We found your Neighborhood Food Hub listed with a Bronx address, but it appears to be in Manhattan. Can you confirm the correct address?"

2. **Update Data Sources**
   - Contact the 211 database administrator
   - Report the address correction needed
   - Provide verification documentation

**Expected Outcome**: Correct address gets people to the right location for food assistance

---

### **✅ Action Item 2: Report False Listings**
**Timeline**: This week
**Priority**: MEDIUM - Prevents wasted time for people in need

**Google Places - "Downtown Soup Kitchen"**
1. Open Google Maps, search "Downtown Soup Kitchen, 654 Broadway, NYC"
2. Click "Suggest an edit" → "Remove this place"
3. Reason: "This business does not exist at this location. 654 Broadway is the American Express building."

**Yelp - "St. Mark's Food Pantry"**
1. Find the listing on Yelp
2. Use "Report a Problem" feature
3. Select "Business doesn't exist"
4. Provide details: "789 2nd Ave is a residential building, no food pantry exists here"

**Expected Outcome**: Clean up false data that misleads people seeking help

---

### **✅ Action Item 3: Research Holy Apostles Addition**
**Timeline**: Next 2 weeks
**Priority**: MEDIUM - Major food program may need better data coverage

**Research Steps:**
1. **Check 211 Coverage**
   - Search 211nyc.org for "Holy Apostles Soup Kitchen"
   - Search for "292 9th Ave"
   - Document: Is it listed? Is the information complete and current?

2. **Contact Holy Apostles Directly**
   - Phone: (212) 807-6799 (unverified; confirm before calling)
   - Website: holyapostlesnyc.org (unverified)
   - Ask: "Are you listed in the 211 database? Would you like to be?"

3. **Coordinate with 211**
   - If it is not listed or needs an update, contact the 211 administrator
   - Provide complete organization details
   - Request the addition or update

**Expected Outcome**: NYC's largest emergency food program gets proper database coverage

---

### **📊 Success Metrics for Phase 4:**
- [ ] YMCA address corrected in at least one database
- [ ] 2 false listings reported and flagged for removal
- [ ] Holy Apostles 211 status documented
- [ ] Clear next steps identified for any gaps found

---

## **PHASE 1: Scale to Real Data (After Impact Actions)**

### **🔧 Technical Preparation**

**API Access Setup:**
- [ ] **211 API**: Contact 211nyc.org for developer access
- [ ] **Google Places API**: Set up Google Cloud project, enable Places API
- [ ] **Yelp Fusion API**: Register at developer.yelp.com

**Code Updates Needed:**
```python
# Update graphql_client.py with real API endpoints
# Add rate limiting for API calls
# Implement proper error handling
# Add data export functionality
```
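
The rate limiting and error handling items above could look roughly like the following. This is a minimal sketch under stated assumptions, not the contents of `graphql_client.py`: `RateLimiter` and `call_with_retry` are hypothetical names, the limit is assumed to be expressed in calls per second, and only `ConnectionError` and `TimeoutError` are treated as retryable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimiter:
    """Space calls evenly so at most `calls_per_second` happen per second."""

    def __init__(self, calls_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if calls_per_second <= 0:
            raise ValueError("calls_per_second must be positive")
        self._min_interval = 1.0 / calls_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve that slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval


def call_with_retry(func: Callable[[], T], limiter: RateLimiter,
                    max_attempts: int = 3, base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func() under the limiter, retrying with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        limiter.wait()
        try:
            return func()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The clock and sleep functions are injectable so the behavior can be tested without real delays. A real client would also need to decide which HTTP status codes count as retryable.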

### **🎯 Real Data Strategy**

**Target Scope:**
- **Geographic**: Start with Manhattan (high density, good for testing)
- **Service Type**: Food assistance organizations
- **Expected Volume**: 500-2000 organizations
- **Gap Analysis**: Focus on single-source organizations
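
As an illustration of the single-source idea, the sketch below groups records by a roughly normalized name and address, then keeps the organizations reported by exactly one source. The record shape, the `normalize` rule, and the function names are assumptions for illustration only; real matching would likely need fuzzier address handling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Record = Tuple[str, str, str]  # (source, name, address)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def single_source_orgs(records: Iterable[Record]) -> List[Tuple[str, str, str]]:
    """Return (name, address, source) for organizations seen in one source only."""
    sources_by_key: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for source, name, address in records:
        sources_by_key[(normalize(name), normalize(address))].add(source)
    return sorted(
        (name, address, next(iter(sources)))
        for (name, address), sources in sources_by_key.items()
        if len(sources) == 1
    )


# Made-up records, purely for demonstration.
records = [
    ("211", "Example Pantry", "1 Main St."),
    ("google", "example pantry", "1 Main St"),
    ("yelp", "Other Kitchen", "2 Side Ave"),
]
gaps = single_source_orgs(records)
```

Here only "Other Kitchen" is flagged, because the two "Example Pantry" records normalize to the same key and come from different sources.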

**Quality Assurance:**
- Sample 10% of gaps for manual verification
- Document investigation findings
- Track data quality patterns
- Refine algorithm based on results
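
The 10% sampling step could be made reproducible with a seeded random sample. A minimal sketch, assuming the gaps are already collected in a sequence; `sample_for_review` is a hypothetical helper name.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_for_review(gaps: Sequence[T], fraction: float = 0.10,
                      seed: Optional[int] = None) -> List[T]:
    """Pick a random subset of gaps for manual verification.

    At least one item is returned for any non-empty input, and the same
    seed always yields the same selection.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not gaps:
        return []
    k = max(1, round(len(gaps) * fraction))
    return random.Random(seed).sample(list(gaps), k)
```

Recording the seed alongside the investigation findings lets another volunteer regenerate the same sample later.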

### **📈 Scaling Plan**

- **Weeks 1-2**: API setup and testing
- **Weeks 3-4**: Run Manhattan gap analysis
- **Weeks 5-6**: Investigate top 20 gaps
- **Weeks 7-8**: Document findings and expand to other boroughs

---

## **🚀 Getting Started Today**

**Immediate Next Step**: Pick one Action Item from Phase 4 and start today!

**Recommended Starting Point**:
✅ **Action Item 2 (Report False Listings)** - Takes 10 minutes, immediate impact

**Quick Win**: Report the "Downtown Soup Kitchen" false listing to Google Places right now. It's the easiest action that will prevent people from wasting time looking for help at the wrong address.

Ready to dive in?