|
1 | | -# Correos de Chile Postal Code Scraper 🇨🇱 |
| 1 | +# 📦 Correos CL Postal Code Scraper |
2 | 2 |
|
3 | | -This is a Python scraper that extracts Chilean postal codes from [correos.cl](https://www.correos.cl/codigo-postal) using Playwright. It simulates filling out the public form with autocomplete validation and returns the result in clean JSON format. |
| 3 | +   |
4 | 4 |
|
5 | | ---- |
| 5 | +The **Correos CL Postal Code Scraper** is a Python tool that automates postal code lookups on the official **Correos de Chile** website. It fills in the public lookup form, selects the autocomplete suggestions the form requires, and returns the result as clean JSON. The lookup logic can be wrapped in a Django or Flask endpoint. |
6 | 6 |
|
7 | | -## ✅ Features |
| 7 | +## 🌐 Table of Contents |
8 | 8 |
|
9 | | -- Fully automated browser-based scraping (headless by default) |
10 | | -- Robust autocomplete handling for commune and street |
11 | | -- Input verification for all fields |
12 | | -- Input normalization: supports uppercase/lowercase, tildes, and special characters |
13 | | -- JSON output: |
14 | | - - On success: `{ "postalCode": "8260323" }` |
15 | | - - On error: `{ "error": "..." }` |
16 | | -- Ready to convert into an API with Django |
| 9 | +- [Features](#features) |
| 10 | +- [Installation](#installation) |
| 11 | +- [Usage](#usage) |
| 12 | +- [How It Works](#how-it-works) |
| 13 | +- [API Integration](#api-integration) |
| 14 | +- [Contributing](#contributing) |
| 15 | +- [License](#license) |
| 16 | +- [Releases](#releases) |
17 | 17 |
|
18 | | ---- |
| 18 | +## 🚀 Features |
19 | 19 |
|
20 | | -## 🚀 Quickstart |
| 20 | +- **Automated Postal Code Lookup**: Quickly fetch postal codes without manual input. |
| 21 | +- **Autocomplete Validation**: The scraper ensures that inputs are validated against the official form. |
| 22 | +- **Clean JSON Output**: Responses are structured in a JSON format, making them easy to work with. |
| 23 | +- **API-Ready**: Seamlessly integrate with Django or Flask applications. |
| 24 | +- **Headless Browser**: Runs Playwright in headless mode by default. |
| 25 | +- **Cross-Platform**: Works on any system that supports Python. |
21 | 26 |
|
22 | | -### 1. Install dependencies |
| 27 | +## 📥 Installation |
23 | 28 |
|
24 | | -```bash |
25 | | -pip install -r requirements.txt |
26 | | -playwright install |
27 | | -``` |
| 29 | +To get started with the **Correos CL Postal Code Scraper**, you need to have Python 3.8 or higher installed on your machine. You can install the necessary dependencies using pip. |
| 30 | + |
| 31 | +1. Clone the repository: |
| 32 | + ```bash |
| 33 | + git clone https://github.com/Matteuzzz/correos-cl-postal-code-scraper.git |
| 34 | + cd correos-cl-postal-code-scraper |
| 35 | + ``` |
| 36 | + |
| 37 | +2. Install the required packages and the Playwright browser binaries: |
| 38 | + ```bash |
| 39 | + pip install -r requirements.txt && playwright install |
| 40 | + ``` |
28 | 41 |
|
29 | | -### 2. Run the scraper |
| 42 | +## 🛠️ Usage |
| 43 | + |
| 44 | +To use the scraper, you need to execute the main script. The script will take the postal code as input and return the corresponding information in JSON format. |
30 | 45 |
|
31 | 46 | ```bash |
32 | | -python index.py "LA FLORIDA" "LAS ACACIAS" "7700" |
| 47 | +python scraper.py <postal_code> |
33 | 48 | ``` |
34 | 49 |
|
35 | | -### Example Output (Success) |
| 50 | +Replace `<postal_code>` with the actual postal code you want to look up. |
36 | 51 |
|
37 | | -```json |
38 | | -{ "postalCode": "8260323" } |
39 | | -``` |
| 52 | +## 🔍 How It Works |
| 53 | + |
| 54 | +The scraper uses Playwright, a powerful web automation library, to interact with the Correos de Chile website. Here’s a brief overview of the process: |
40 | 55 |
|
41 | | -### Example Output (Error) |
| 56 | +1. **Initialization**: The scraper initializes a headless browser instance. |
| 57 | +2. **Form Simulation**: It navigates to the postal code lookup form and fills in the required fields. |
| 58 | +3. **Autocomplete Handling**: The scraper waits for the autocomplete suggestions to load, ensuring accurate results. |
| 59 | +4. **Data Extraction**: Once the postal code is validated, the scraper extracts the relevant data. |
| 60 | +5. **JSON Response**: Finally, the data is structured into a clean JSON format and returned. |
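The five steps above can be sketched with Playwright's synchronous API. This is a minimal sketch under stated assumptions, not the repository's actual code: the `SELECTORS` values, `format_result`, and `lookup` are hypothetical names, and the selectors are placeholders that would need to be replaced with ones taken from the real page. The form URL, the 20-second timeout, and the `postalCode`/`error` output shapes come from the previous README text.

```python
FORM_URL = "https://www.correos.cl/codigo-postal"  # URL given in the previous README

# Hypothetical selectors: placeholders only, not taken from the real page.
SELECTORS = {
    "commune": "#commune-input",
    "street": "#street-input",
    "number": "#number-input",
    "suggestion": ".autocomplete-item",
    "submit": "#search-button",
    "result": "#postal-code-result",
}


def format_result(postal_code=None, error=None):
    """Build the JSON payload: `postalCode` on success, `error` otherwise."""
    if error is not None or not postal_code:
        return {"error": error or "Postal code not found."}
    return {"postalCode": postal_code}


def lookup(commune, street, number, timeout_ms=20_000):
    """Run the five steps against the public form (assumed page structure)."""
    # Imported lazily so format_result works without Playwright installed.
    from playwright.sync_api import sync_playwright

    try:
        with sync_playwright() as p:
            # 1. Initialization: headless browser instance.
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            page.set_default_timeout(timeout_ms)

            # 2. Form simulation: open the lookup form.
            page.goto(FORM_URL)

            # 3. Autocomplete handling: type, wait for a suggestion, select it.
            for field, value in (("commune", commune), ("street", street)):
                page.fill(SELECTORS[field], value)
                page.wait_for_selector(SELECTORS["suggestion"])
                page.click(SELECTORS["suggestion"])

            page.fill(SELECTORS["number"], number)
            page.click(SELECTORS["submit"])

            # 4. Data extraction: read the result element.
            code = page.inner_text(SELECTORS["result"]).strip()
            browser.close()

        # 5. JSON response.
        return format_result(postal_code=code)
    except Exception as exc:
        return format_result(error=f"Scraper failed: {exc}")
```

Keeping `format_result` separate from the browser code means the output shape can be checked without launching a browser or touching the network.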
42 | 61 |
|
43 | | -```json |
44 | | -{ |
45 | | - "error": "Scraper failed: Failed to select street correctly after 2 attempts." |
46 | | -} |
| 62 | +## 📡 API Integration |
| 63 | + |
| 64 | +This scraper is designed to be easily integrated into your web applications. Here’s a basic example of how you can set it up with Flask: |
| 65 | + |
| 66 | +```python |
| 67 | +from flask import Flask, request, jsonify |
| 68 | +from scraper import PostalCodeScraper |
| 69 | + |
| 70 | +app = Flask(__name__) |
| 71 | + |
| 72 | +@app.route('/api/postal-code', methods=['GET']) |
| 73 | +def get_postal_code(): |
| 74 | + postal_code = request.args.get('code') |
| 75 | + scraper = PostalCodeScraper() |
| 76 | + result = scraper.lookup(postal_code) |
| 77 | + return jsonify(result) |
| 78 | + |
| 79 | +if __name__ == '__main__': |
| 80 | + app.run(debug=True) |
47 | 81 | ``` |
48 | 82 |
|
49 | | ---- |
| 83 | +In this example, a GET request to `/api/postal-code?code=<postal_code>` will return the postal code information in JSON format. |
| 84 | + |
| 85 | +## 🤝 Contributing |
50 | 86 |
|
51 | | -## 📁 Files |
| 87 | +We welcome contributions to improve the **Correos CL Postal Code Scraper**. Here’s how you can help: |
52 | 88 |
|
53 | | -- `index.py`: main script |
54 | | -- `error.png`: generated screenshot on error (if applicable) |
55 | | -- `requirements.txt`: dependency list |
| 89 | +1. **Fork the repository**: Create your own copy of the project. |
| 90 | +2. **Create a branch**: Make a new branch for your feature or bug fix. |
| 91 | +3. **Make your changes**: Implement your changes and test them. |
| 92 | +4. **Submit a pull request**: Once you're happy with your changes, submit a pull request for review. |
56 | 93 |
|
57 | | ---- |
| 94 | +Please ensure your code follows the project's coding standards and includes tests where applicable. |
58 | 95 |
|
59 | | -## 🔧 Notes |
| 96 | +## 📜 License |
60 | 97 |
|
61 | | -- This scraper uses `Playwright` under the hood. |
62 | | -- Form fields require autocomplete selection; manual filling is not enough. |
63 | | -- Input values are normalized to match Correos' expected format (e.g. `Peñalolén` → `PENALOLEN`) |
64 | | -- Timeout is set to 20s by default per operation. |
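The removed note on input normalization (`Peñalolén` → `PENALOLEN`) can be illustrated with the standard library. This is a minimal sketch, assuming the normalization amounts to stripping diacritics, collapsing whitespace, and uppercasing; the function name `normalize` is hypothetical and the repository's own implementation may differ.

```python
import unicodedata


def normalize(value: str) -> str:
    """Strip diacritics, collapse whitespace, and uppercase the input."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", value.strip())
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Uppercase and collapse runs of whitespace into single spaces.
    return " ".join(stripped.upper().split())


print(normalize("Peñalolén"))       # PENALOLEN
print(normalize("  la   florida "))  # LA FLORIDA
```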
| 98 | +This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details. |
65 | 99 |
|
66 | | ---- |
| 100 | +## 📦 Releases |
67 | 101 |
|
68 | | -## 📦 Ready for API? |
| 102 | +You can find the latest releases of the **Correos CL Postal Code Scraper** [here](https://github.com/Matteuzzz/correos-cl-postal-code-scraper/releases). Download the latest version and follow the installation instructions to get started. |
69 | 103 |
|
70 | | -Yes. You can now easily wrap this logic inside a Django or Flask API endpoint. |
| 104 | +## 🌟 Conclusion |
71 | 105 |
|
72 | | ---- |
| 106 | +The **Correos CL Postal Code Scraper** is a powerful tool for anyone needing to automate postal code lookups in Chile. Its clean JSON output and API-ready design make it suitable for various applications. Whether you are building a web app or just need quick access to postal codes, this scraper can simplify your workflow. |
73 | 107 |
|
74 | | -## 👤 Author |
| 108 | +For further information and updates, feel free to check the [Releases](https://github.com/Matteuzzz/correos-cl-postal-code-scraper/releases) section. |
75 | 109 |
|
76 | | -Alejandro Exequiel Hernández Lara |
77 | | -Founder & CEO at [KaiNext](https://kainext.cl) |
| 110 | +Happy scraping! |