This guide explains how to use the data collection scripts to build a comprehensive database of Victorian schools and suburbs for the Schoolify application.
The Schoolify database contains:
- Comprehensive Victorian school data with full school names (including campus information)
- School rankings and academic metrics
- Victorian suburb geocoding information
First, run the postcode download script:
python download_postcodes.py
This script will:
- Download a comprehensive CSV of Australian postcodes
- Extract Victorian suburbs (postcodes 3000-3999)
- Create a geocode cache for faster processing
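The filtering step can be sketched roughly as follows. This is an illustration only, not the actual code in download_postcodes.py: the function name, the cache key format, and the CSV column names (postcode, locality, lat, long) are all assumptions, and the sample rows are made up for the demonstration.

```python
import csv
import io


def extract_victorian_suburbs(csv_text):
    """Build a {"SUBURB POSTCODE": {"lat": ..., "lng": ...}} cache for postcodes 3000-3999.

    The column names used here are assumed; adjust them to match the real CSV.
    """
    cache = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        postcode = (row.get("postcode") or "").strip()
        if not postcode.isdigit() or not 3000 <= int(postcode) <= 3999:
            continue  # skip non-Victorian or malformed postcodes
        try:
            lat, lng = float(row["lat"]), float(row["long"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without usable coordinates
        key = f"{row['locality'].strip().upper()} {postcode}"
        cache[key] = {"lat": lat, "lng": lng}
    return cache


# Illustrative sample data, not taken from the real download.
sample = """postcode,locality,state,lat,long
3000,Melbourne,VIC,-37.8136,144.9631
2000,Sydney,NSW,-33.8688,151.2093
3121,Richmond,VIC,-37.8183,145.0014
3999,Nowhere,VIC,,
"""
suburbs = extract_victorian_suburbs(sample)
print(sorted(suburbs))
```

Rows with a postcode outside the range or with missing coordinates are dropped rather than raising an error, so one malformed line does not abort the whole run.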
Next, run the comprehensive scraper to collect school and suburb data:
python comprehensive_scraper.py
This script will:
- Fetch school rankings from Better Education
- Collect additional school data from Victorian government sources
- Process Victorian suburbs data with geocoding information
- Generate a complete geocode database file (geocode-db.js)
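One possible way to produce a file like geocode-db.js is to serialise the suburb lookup as JSON and wrap it in a JavaScript declaration. The sketch below assumes a variable name (GEOCODE_DB) and a record shape that may differ from what comprehensive_scraper.py actually emits.

```python
import json


def render_geocode_js(suburbs, variable_name="GEOCODE_DB"):
    """Serialise the suburb lookup as a JavaScript constant declaration.

    The variable name is hypothetical; use whatever the application expects.
    """
    body = json.dumps(suburbs, indent=2, sort_keys=True)
    return f"const {variable_name} = {body};\n"


js_source = render_geocode_js(
    {"MELBOURNE 3000": {"lat": -37.8136, "lng": 144.9631}}
)
print(js_source)
```

Writing the result to disk is then a single call such as `Path("geocode-db.js").write_text(js_source, encoding="utf-8")`. Sorting the keys keeps the generated file stable between runs, which makes diffs easier to review.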
Finally, update the existing database with the new comprehensive data:
python update_database.py
This script will:
- Create backups of existing data files
- Merge new school data with existing data
- Update the geocode database with comprehensive suburb information
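The backup-then-merge pattern described above might look like the following sketch. It is not the code in update_database.py: the function names, the timestamped file naming, the use of the school name as the merge key, and the rule that non-empty new values override old ones are all assumptions, and the records are invented examples.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(path, backup_dir):
    """Copy path into backup_dir with a timestamp suffix and return the copy's path."""
    path, backup_dir = Path(path), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{path.stem}-{stamp}{path.suffix}"
    shutil.copy2(path, target)  # copy2 also preserves file timestamps
    return target


def merge_schools(existing, new):
    """Merge two lists of school records keyed by name; non-empty new fields win."""
    merged = {school["name"]: dict(school) for school in existing}
    for school in new:
        record = merged.setdefault(school["name"], {})
        record.update({k: v for k, v in school.items() if v not in (None, "")})
    return sorted(merged.values(), key=lambda s: s["name"])


# Invented records for illustration only.
existing = [{"name": "Example High School", "rank": 12, "suburb": "Richmond"}]
new = [
    {"name": "Example High School", "rank": 10, "suburb": ""},
    {"name": "Sample College (North Campus)", "rank": 25, "suburb": "Carlton"},
]
merged = merge_schools(existing, new)
print(merged)
```

Ignoring empty values during the merge means a partial scrape cannot blank out fields that were already populated.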
Check the updated database statistics:
python count_database_entries.py
The scripts collect data from multiple sources:
- Better Education - School rankings and basic information
- Victorian Government - Comprehensive school listings
- Australian Postcodes - Victorian suburbs with geocoding information
To keep the database up-to-date:
- Run these scripts periodically to refresh the data
- Check for changes in data source formats that might require script updates
- Verify the data quality after each update
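A data quality check after an update could be as simple as the sketch below. The record shape (postcode, lat, lng), the function name, and the bounding box are assumptions; the box is only a rough approximation of Victoria's extent, meant to catch obviously wrong coordinates.

```python
# Approximate bounding box for Victoria (assumed values, for sanity checks only).
LAT_RANGE = (-39.5, -33.5)
LNG_RANGE = (140.5, 150.5)


def find_quality_issues(suburbs):
    """Return a list of human-readable problems found in the suburb records."""
    issues = []
    for name, entry in suburbs.items():
        postcode = str(entry.get("postcode", ""))
        if not (postcode.isdigit() and 3000 <= int(postcode) <= 3999):
            issues.append(f"{name}: postcode {postcode!r} outside 3000-3999")
        lat, lng = entry.get("lat"), entry.get("lng")
        if lat is None or lng is None:
            issues.append(f"{name}: missing coordinates")
        elif not (
            LAT_RANGE[0] <= lat <= LAT_RANGE[1]
            and LNG_RANGE[0] <= lng <= LNG_RANGE[1]
        ):
            issues.append(f"{name}: coordinates ({lat}, {lng}) fall outside Victoria")
    return issues


# Invented records for illustration only.
records = {
    "MELBOURNE": {"postcode": "3000", "lat": -37.8136, "lng": 144.9631},
    "SYDNEY": {"postcode": "2000", "lat": -33.8688, "lng": 151.2093},
    "NOWHERE": {"postcode": "3999", "lat": None, "lng": None},
}
issues = find_quality_issues(records)
for issue in issues:
    print(issue)
```

Returning a list of messages instead of raising on the first problem lets a single run report every suspicious record.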
If something goes wrong:
- If the postcodes download fails, you can manually download the CSV from the URL in the script
- If school data scraping encounters errors, check if the source websites have changed their format
- Backups of previous data are stored in the data/backups directory