168 changes: 168 additions & 0 deletions docs/non_food_service_address_fix.md
@@ -0,0 +1,168 @@
# Non-Food-Service Address Fix

This module identifies and fixes non-food-service addresses (such as PO boxes) on charity Sites. It moves each such address to the parent Organization and, where one is available, updates the Site with a physical address.

## Problem Statement

According to the Tackle Hunger schema requirements:
- **Sites** should contain physical addresses for food pickup/dropoff/distribution (avoid PO boxes)
- **Organizations** can contain mailing addresses including PO boxes

Some sites may have been incorrectly created with PO boxes or other non-physical addresses that should be moved to the parent organization.

## Solution

The fix involves:

1. **Detection**: Identify sites with non-physical addresses (PO boxes, virtual addresses, etc.)
2. **Organization Update**: Move the non-physical address to the parent Organization using `updateOrganizationFromAI`
3. **Site Update**: Update the Site with a physical address from other sites in the organization, or clear it if none available, using `updateSiteFromAI`
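
The three steps above can be sketched as a pure planning function. This is a minimal illustration under stated assumptions, not the module's implementation: `Site`, `plan_fix`, and the crude `looks_like_po_box` heuristic are hypothetical names, and real detection is delegated to `AddressValidator`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    # Hypothetical, simplified stand-in for a Site record.
    site_id: str
    organization_id: str
    street_address: str


def looks_like_po_box(address: str) -> bool:
    # Deliberately crude placeholder for the real validator.
    normalized = address.lower().replace(".", "").replace(" ", "")
    return normalized.startswith("pobox")


def plan_fix(site: Site, siblings: List[Site]) -> Optional[dict]:
    """Return the planned updates for one site, or None if no fix is needed."""
    # Step 1: detection.
    if not looks_like_po_box(site.street_address):
        return None
    # Borrow a physical address from another site in the same organization.
    replacement = next(
        (
            s.street_address
            for s in siblings
            if s.organization_id == site.organization_id
            and s.site_id != site.site_id
            and not looks_like_po_box(s.street_address)
        ),
        "",  # clear the site address when nothing suitable exists
    )
    return {
        "organization_address": site.street_address,  # step 2
        "site_address": replacement,  # step 3
    }
```

Keeping the planning step free of network calls makes it easy to inspect what would change before any mutation is sent.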

## Components

### AddressValidator (`src/tackle_hunger/address_validator.py`)

Validates addresses and identifies non-food-service addresses:

```python
from tackle_hunger.address_validator import AddressValidator

validator = AddressValidator()

# Check if an address is suitable for a food service site
is_suitable = validator.is_suitable_for_site("P.O. Box 123") # Returns False

# Get detailed classification
classification = validator.classify_address("123 Main Street")
print(classification.is_physical_address) # True
print(classification.confidence) # 0.85
```

### SiteOperations (`src/tackle_hunger/site_operations.py`)

Manages the complete workflow for fixing addresses:

```python
from tackle_hunger.graphql_client import TackleHungerClient
from tackle_hunger.site_operations import SiteOperations

client = TackleHungerClient()
site_ops = SiteOperations(client)

# Fix non-food-service addresses
sites_processed, fixes_applied = site_ops.fix_non_food_service_addresses(limit=50)
print(f"Processed {sites_processed} sites, applied {fixes_applied} fixes")
```

## Command Line Usage

The main script provides a convenient command-line interface:

```bash
# Analyze sites without making changes (dry run)
python scripts/fix_non_food_service_addresses.py --dry-run --limit 10

# Fix addresses for up to 50 sites
python scripts/fix_non_food_service_addresses.py --limit 50

# Enable verbose logging
python scripts/fix_non_food_service_addresses.py --verbose --limit 10
```

### Script Options

- `--limit N`: Maximum number of sites to process (default: 50)
- `--dry-run`: Analyze sites but make no changes
- `--verbose`: Enable detailed logging

## Environment Setup

Ensure your `.env` file contains:

```
AI_SCRAPING_TOKEN=your_ai_scraping_token_here
ENVIRONMENT=dev
```

## Address Detection Patterns

The system detects various non-physical address patterns:

### PO Box Patterns
- "P.O. Box 123"
- "PO Box 456"
- "Post Office Box 789"
- "Box 202" (standalone)

### Virtual Address Patterns
- "PMB 456" (Private Mail Box)
- "Suite 123 Mail Forwarding Service"
- "Mail Drop 101"

### Physical Address Indicators
- "123 Main Street"
- "456 Oak Avenue"
- "789 N. Washington Blvd"
- "Building 5, 303 Corporate Drive"
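
As a rough sketch of how the patterns listed above could be matched, the following uses regular expressions. The pattern list and the `is_non_physical` helper are assumptions made for illustration; the actual rules in `AddressValidator` may differ.

```python
import re

# Illustrative patterns only; the real AddressValidator may use different rules.
NON_PHYSICAL_PATTERNS = [
    re.compile(r"\bp\.?\s*o\.?\s*box\b", re.IGNORECASE),  # "P.O. Box", "PO Box"
    re.compile(r"\bpost\s+office\s+box\b", re.IGNORECASE),  # "Post Office Box"
    re.compile(r"^\s*box\s+\d+\s*$", re.IGNORECASE),  # standalone "Box 202"
    re.compile(r"\bpmb\s*\d+\b", re.IGNORECASE),  # "PMB 456"
    re.compile(r"\bmail\s+(drop|forwarding)\b", re.IGNORECASE),  # mail services
]


def is_non_physical(address: str) -> bool:
    """Return True if any non-physical pattern matches the address."""
    return any(pattern.search(address) for pattern in NON_PHYSICAL_PATTERNS)
```

The standalone "Box" pattern is anchored to the whole string so that it does not flag street names that merely contain the word.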

## GraphQL Operations

The fix uses these mutations following the schema defined in README.md:

### updateOrganizationFromAI
```graphql
mutation updateOrganizationFromAI($organizationId: String!, $input: organizationInputUpdate!) {
  updateOrganizationFromAI(organizationId: $organizationId, input: $input) {
    id
    streetAddress
  }
}
```

### updateSiteFromAI
```graphql
mutation updateSiteFromAI($siteId: String!, $input: siteInputForAIUpdate!) {
  updateSiteFromAI(siteId: $siteId, input: $input) {
    id
    streetAddress
  }
}
```
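
For illustration, a request body for `updateSiteFromAI` might be assembled as below. This is a sketch: `build_site_update_payload` is a hypothetical helper, and it assumes the input type accepts a `streetAddress` field, which the mutation's selection set suggests but does not confirm.

```python
import json

UPDATE_SITE_MUTATION = """
mutation updateSiteFromAI($siteId: String!, $input: siteInputForAIUpdate!) {
  updateSiteFromAI(siteId: $siteId, input: $input) {
    id
    streetAddress
  }
}
"""


def build_site_update_payload(site_id: str, street_address: str) -> str:
    """Serialize a GraphQL-over-HTTP request body for updateSiteFromAI."""
    body = {
        "query": UPDATE_SITE_MUTATION,
        "variables": {
            "siteId": site_id,
            # Assumption: the input type accepts a streetAddress field.
            "input": {"streetAddress": street_address},
        },
    }
    return json.dumps(body)
```

Passing values through `variables` rather than interpolating them into the query string avoids quoting problems in addresses.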

## Error Handling

The system includes comprehensive error handling:

- **Network errors**: Gracefully handles GraphQL API failures
- **Data validation**: Safely processes sites with missing or invalid data
- **Transaction safety**: Updates are ordered so that the organization is updated first; if the organization update fails, the site is not modified
- **Logging**: Detailed logs help track what changes were made
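
The ordering described above could look roughly like the following. `apply_fix` and its two callables are hypothetical; the sketch only shows the organization-first sequencing, not the real `SiteOperations` code.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def apply_fix(
    update_org: Callable[[], None],
    update_site: Callable[[], None],
) -> bool:
    """Run the organization update first; touch the site only if it succeeds."""
    try:
        update_org()
    except Exception as exc:
        logger.error("Organization update failed, site left unchanged: %s", exc)
        return False
    try:
        update_site()
    except Exception as exc:
        # The organization has already been changed at this point.
        logger.error("Site update failed after organization update: %s", exc)
        return False
    return True
```

Note that a failure in the second step leaves the organization already updated, so the log entry is what allows a follow-up correction.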

## Testing

Run the comprehensive test suite:

```bash
# Run all tests
python -m pytest tests/ -v

# Run specific test modules
python -m pytest tests/test_address_validator.py -v
python -m pytest tests/test_site_operations.py -v
python -m pytest tests/test_integration.py -v
```

## Monitoring

The script generates detailed logs:

- **Console output**: Shows progress and summary
- **Log file**: `address_fixes.log` contains detailed operation logs
- **Error tracking**: Failed operations are logged with reasons

## Safety Features

- **Dry run mode**: Analyze without making changes
- **Incremental processing**: Process sites in batches
- **Audit trail**: All changes are logged with "AI_Copilot_Assistant" as the modifier
- **Rollback support**: Changes follow standard GraphQL patterns for potential rollback
147 changes: 147 additions & 0 deletions scripts/fix_non_food_service_addresses.py
@@ -0,0 +1,147 @@
#!/usr/bin/env python3
"""
Fix Non-Food-Service Addresses Script

This script identifies and fixes non-food-service addresses (like PO boxes)
in charity Sites by moving them to the parent Organization and updating
the Site with a physical address if available.

Usage:
    python scripts/fix_non_food_service_addresses.py [--limit 50] [--dry-run]
"""

import sys
import os
import logging
import argparse

# Add src to path for imports
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'src'))

from tackle_hunger.graphql_client import TackleHungerClient # noqa: E402
from tackle_hunger.site_operations import SiteOperations # noqa: E402


def setup_logging(verbose: bool = False) -> None:
    """Set up logging configuration."""
    level = logging.DEBUG if verbose else logging.INFO
    format_str = '%(asctime)s - %(levelname)s - %(message)s'

    logging.basicConfig(
        level=level,
        format=format_str,
        handlers=[
            logging.StreamHandler(sys.stdout),
            logging.FileHandler('address_fixes.log')
        ]
    )


def validate_environment() -> bool:
    """Validate that required environment variables are set."""
    required_vars = ['AI_SCRAPING_TOKEN']
    missing_vars = []

    for var in required_vars:
        if not os.getenv(var):
            missing_vars.append(var)

    if missing_vars:
        print(f"Error: Missing required environment variables: {', '.join(missing_vars)}")
        print("Please set these variables in your .env file or environment.")
        return False

    return True


def main():
    """Main script function."""
    parser = argparse.ArgumentParser(
        description="Fix non-food-service addresses in charity Sites"
    )
    parser.add_argument(
        '--limit',
        type=int,
        default=50,
        help='Maximum number of sites to process (default: 50)'
    )
    parser.add_argument(
        '--dry-run',
        action='store_true',
        help='Analyze sites but do not make any changes'
    )
    parser.add_argument(
        '--verbose',
        action='store_true',
        help='Enable verbose logging'
    )

    args = parser.parse_args()

    # Set up logging
    setup_logging(args.verbose)
    logger = logging.getLogger(__name__)

    # Validate environment
    if not validate_environment():
        return 1

    try:
        # Initialize client and operations
        logger.info("Initializing Tackle Hunger client...")
        client = TackleHungerClient()
        site_ops = SiteOperations(client)

        if args.dry_run:
            logger.info("Running in DRY RUN mode - no changes will be made")

            # Fetch and analyze sites
            sites = site_ops.get_sites_for_ai(limit=args.limit)
            logger.info(f"Fetched {len(sites)} sites for analysis")

            # Analyze addresses
            fixes = site_ops.analyze_site_addresses(sites)
            logger.info(f"Identified {len(fixes)} sites requiring address fixes")

            # Display results
            if fixes:
                print("\nSites requiring address fixes:")
                print("=" * 60)
                for i, fix in enumerate(fixes, 1):
                    print(f"{i}. Site: {fix.site_name} (ID: {fix.site_id})")
                    print(f" Current Address: {fix.original_address}")
                    print(f" Action: {fix.action}")
                    print(f" Reason: {fix.reason}")
                    print(f" Organization ID: {fix.organization_id}")
                    if fix.new_org_address:
                        print(f" New Org Address: {fix.new_org_address}")
                    print()
            else:
                print("\nNo sites requiring address fixes found.")

        else:
            # Actually perform the fixes
            logger.info(f"Starting to fix non-food-service addresses for up to {args.limit} sites")
            sites_processed, fixes_applied = site_ops.fix_non_food_service_addresses(limit=args.limit)

            print("\nCompleted address fix operation:")
            print(f"Sites processed: {sites_processed}")
            print(f"Fixes applied: {fixes_applied}")

            if fixes_applied > 0:
                print(f"\nSuccessfully fixed {fixes_applied} sites with non-food-service addresses.")
                print("Check the log file 'address_fixes.log' for detailed information.")
            else:
                print("\nNo address fixes were needed or applied.")

        return 0

    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        if args.verbose:
            logger.exception("Full traceback:")
        return 1


if __name__ == "__main__":
    sys.exit(main())
2 changes: 1 addition & 1 deletion src/tackle_hunger/__init__.py
@@ -6,4 +6,4 @@
"""

__version__ = "1.0.0"
__author__ = "LNRS Tech for Good Volunteers"
__author__ = "LNRS Tech for Good Volunteers"