This repository was archived by the owner on Mar 24, 2026. It is now read-only.

Commit 125e312

docs: update project for archival as a Go study milestone
- Add Core Learning Objectives and Final Thoughts to README
- Update architecture documentation to focus on learning outcomes
- Mark CONTRIBUTING as archived reference
- Maintain original branding and layout
1 parent 76045eb commit 125e312

3 files changed: 66 additions & 148 deletions

CONTRIBUTING.md

Lines changed: 14 additions & 25 deletions
````diff
@@ -1,29 +1,18 @@
-# Contributing to Go File Processor
+# Contributing (Archived Project)
 
-Thank you for your interest in contributing! This project follows rigorous standards for quality and concurrency in Go.
+Thank you for your interest! This project is currently **archived** and no longer accepting new features or active maintenance.
 
-## Development Setup
+## Project Purpose
+The **Go File Processor** was created as a second learning project to explore Go's concurrency and streaming I/O. It remains available as a historical reference for:
+- Worker Pool implementations.
+- Channel-based pipelines.
+- Middleware design patterns in Go.
 
-1. **Requirements**: Go 1.22+ and `make`.
-2. **Clone**: `git clone https://github.com/ESousa97/go-file-processor.git`.
-3. **Tests**: Use `make test` to ensure everything is OK.
+## Exploring the Code
+You are welcome to fork this project to use as a template or to experiment with its features. Key areas of interest:
+- `internal/processor/csv_json.go`: The core engine using Worker Pools.
+- `internal/processor/transformer.go`: The implementation of the Middleware pattern.
+- `internal/processor/csv_json_bench_test.go`: Benchmarking logic to compare performance.
 
-## Code Conventions
-
-- Follow [Effective Go](https://golang.org/doc/effective_go.html).
-- Run `go fmt` before each commit.
-- All exported items must have professional Godoc comments in English.
-- Maintain extreme modularization: each file with a single responsibility.
-
-## Pull Request Process
-
-1. Create a descriptive branch (`feature/`, `fix/`, `perf/`).
-2. Ensure benchmarks haven't regressed via `make bench`.
-3. Update `CHANGELOG.md` in the `[Unreleased]` section.
-4. Request a code review.
-
-## Areas for Contribution
-
-- Support for new formats (XML, Avro).
-- Consumer optimization to further reduce serialization overhead.
-- CLI improvements (e.g., more detailed progress bar).
+## License
+The project remains under the **MIT License**, allowing you to use and modify it for your own purposes.
````

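For readers pointed at `internal/processor/csv_json_bench_test.go` above: a Go benchmark comparing sequential and worker-pool processing typically takes the shape sketched below. The `simulateRow` helper and the benchmark names are illustrative stand-ins, not the repository's actual code.

```go
package processor

import (
	"sync"
	"testing"
)

// simulateRow stands in for parsing and transforming one CSV record.
func simulateRow(n int) int {
	sum := 0
	for i := 0; i < 1000; i++ {
		sum += (n + i) % 7
	}
	return sum
}

// BenchmarkSequential processes every record on a single goroutine.
func BenchmarkSequential(b *testing.B) {
	for i := 0; i < b.N; i++ {
		simulateRow(i)
	}
}

// BenchmarkWorkerPool fans the same work out to a fixed pool of goroutines.
func BenchmarkWorkerPool(b *testing.B) {
	jobs := make(chan int, 64)
	var wg sync.WaitGroup
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				simulateRow(n)
			}
		}()
	}
	b.ResetTimer() // exclude pool setup from the measurement
	for i := 0; i < b.N; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
}
```

Running `go test -bench .` (or the repo's `make bench`) reports ns/op for each, making the sequential-vs-parallel comparison direct.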
README.md

Lines changed: 34 additions & 102 deletions
````diff
@@ -12,25 +12,36 @@
 [![Go Reference](https://pkg.go.dev/badge/github.com/ESousa97/go-file-processor.svg)](https://pkg.go.dev/github.com/ESousa97/go-file-processor)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Go Version](https://img.shields.io/github/go-mod/go-version/ESousa97/go-file-processor)](https://github.com/ESousa97/go-file-processor)
-[![Last Commit](https://img.shields.io/github/last-commit/ESousa97/go-file-processor)](https://github.com/ESousa97/go-file-processor/commits/main)
+[![Last Commit](https://img.shields.io/github/last-commit/ESousa97/go-file-processor)](https://github.com/ESousa97/go-file-processor)
 
 </div>
 
 ---
 
-**Go File Processor** is a high-performance command-line tool and library designed to efficiently convert massive CSV files (millions of records) into structured JSON. Using the Worker Pool pattern and channel-based processing, it ensures optimized CPU usage and constant memory consumption, regardless of the input file size.
+> **Note: Archival Project**
+> This was my second major project in Go, built as a deep dive into the language's idiomatic concurrency patterns and high-performance I/O. It is now archived but serves as a solid reference for ETL (Extract, Transform, Load) implementations in Golang.
+
+**Go File Processor** is a high-performance command-line tool and library designed to efficiently convert massive CSV files (millions of records) into structured JSON. It demonstrates the power of Go's concurrency primitives to achieve maximum throughput with minimal memory overhead.
+
+## 🚀 Core Learning Objectives
+
+This project was a hands-on laboratory to master several Go concepts:
+
+* **Concurrency via Worker Pool:** Leveraging `goroutines` and `channels` to process data in parallel without overwhelming the system.
+* **Memory Efficiency (Streaming):** Using `io.Reader` and `io.Writer` to process gigabytes of data with a constant, tiny memory footprint.
+* **The Middleware Pattern:** Implementing a "Chain of Responsibility" for data transformation that is both flexible and type-safe.
+* **Atomic Operations:** Using `sync/atomic` for high-speed metrics tracking, avoiding the overhead of mutexes.
+* **Idiomatic Project Layout:** Following standard Go folder structures (`cmd/`, `internal/`) and build automation with `Makefile`.
 
 ## Demonstration
 
 ### As a Library
 
-Add transformers and configure the execution pool fluently:
-
 ```go
 proc := processor.NewCSVToJSONProcessor()
 config := processor.Config{WorkerCount: 8}
 
-// Add transformers (Chain of Responsibility)
+// Fluent transformation chain
 config.AddTransformer(processor.EmailFilter(`@company.com$`))
 config.AddTransformer(processor.FieldMasker("email"))
 
@@ -39,118 +50,39 @@ metrics, err := proc.Process("input.csv", "output.json", config)
 
 ### As a CLI
 
-Run massive processing with real-time metrics:
-
 ```bash
 ./fileproc -input data.csv -output data.json -workers 4
 ```
 
-Output:
-
-```text
-[INFO] Starting processing...
-[INFO] Progress: 100000 rows processed
-[SUMMARY] EXECUTION COMPLETED IN 1.2s
-- Total lines read: 100000
-- Successfully processed: 98500
-- Errors/Ignored: 1500
-```
+## Tech Stack & Architecture
 
-## Tech Stack
-
-| Technology | Role |
+| Technology | What I Learned |
 | ------------------- | ------------------------------------------------------------------- |
-| **Go 1.22+** | Core language with high-performance native concurrency |
-| **Worker Pool** | Parallelism management and load control |
-| **slog** | Structured logging for observability and traceability |
-| **Atomic Counters** | High-performance metrics collection without contention (lock-free) |
-| **Channels** | Secure and decoupled communication between Producer, Workers, and Consumer |
-
-## Prerequisites
-
-- **Go >= 1.22**
-- **Make** (for build automation and benchmarks)
-
-## Installation and Usage
+| **Worker Pool** | How to orchestrate multiple goroutines for parallel work. |
+| **Channels** | Managing safe communication and backpressure between stages. |
+| **Streaming I/O** | Processing files record-by-record instead of loading to RAM. |
+| **Atomic Counters** | Implementing thread-safe counters with maximum performance. |
+| **Structured Logs** | Using `slog` for modern, machine-readable observability. |
 
-### From Source
-
-```bash
-git clone https://github.com/ESousa97/go-file-processor.git
-cd go-file-processor
-make build
-```
+### Pipeline Flow
 
-### Data Generation and Benchmark
-
-To validate performance with 100k+ row files:
-
-```bash
-make generate-data
-make bench
-```
+The system uses a streaming model to maintain low memory usage:
+`Input CSV -> Producer -> Job Channel -> [Workers + Transformers] -> Result Channel -> Consumer -> Output JSON`
 
 ## Makefile Targets
 
 | Target | Description |
 | -------------------- | --------------------------------------------------------- |
-| `make build` | Compiles the `fileproc` binary at the project root |
-| `make test` | Runs the unit test suite |
-| `make bench` | Runs performance comparisons (Sequential vs Parallel) |
-| `make generate-data` | Generates a massive test file (100,000 records) |
-| `make clean` | Removes binaries and temporary files |
-
-## Architecture
-
-The project uses a channel-based streaming model to process data without loading the entire file into memory.
-
-```mermaid
-graph LR
-    Input[CSV Input] --> Producer[Producer]
-    Producer --> Jobs{Job Channel}
-    Jobs --> W1[Worker 1]
-    Jobs --> W2[Worker 2]
-    Jobs --> WN[Worker N]
-    W1 & W2 & WN --> Transformers[Transformation Layer]
-    Transformers --> Results{Result Channel}
-    Results --> Consumer[Consumer]
-    Consumer --> Output[JSON Output]
-
-    subgraph "Worker Pool"
-        W1
-        W2
-        WN
-    end
-```
-
-## API Reference
-
-Detailed technical documentation available at [pkg.go.dev/github.com/ESousa97/go-file-processor](https://pkg.go.dev/github.com/ESousa97/go-file-processor).
+| `make build` | Compiles the `fileproc` binary. |
+| `make test` | Runs the full unit test suite. |
+| `make bench` | Runs benchmarks to see the speed of Parallel vs Sequential. |
+| `make generate-data` | Generates a 100k row test file for performance testing. |
 
-## Configuration (CLI Flags)
+## 📚 Final Thoughts
 
-| Flag | Description | Type | Default |
-| ---------- | --------------------------------- | -------- | ------------- |
-| `-input` | Input CSV file path | `string` | `input.csv` |
-| `-output` | Output JSON file path | `string` | `output.json` |
-| `-workers` | Number of concurrent workers | `int` | `4` |
+Building this project taught me that Go isn't just about syntax; it's about a philosophy of simplicity and performance. The transition from sequential processing to a parallel worker pool showed me how Go empowers developers to build tools that scale effortlessly.
 
-## Roadmap
-
-Follow the project's evolution stages:
-
-- [x] **Phase 1: Foundation** — Worker Pool and streaming core implementation.
-- [x] **Phase 2: Transformation** — Middleware layer (Chain of Responsibility).
-- [x] **Phase 3: Observability** — Atomic metrics and structured logs (`slog`).
-- [x] **Phase 4: Governance** — CI/CD, Professional documentation, and Badges.
-
-## Contributing
-
-Interested in collaborating? Check our [CONTRIBUTING.md](CONTRIBUTING.md) for code standards and PR process.
-
-## License
-
-This project is licensed under the **MIT License** — see the [LICENSE](LICENSE) file for details.
+---
 
 <div align="center">
 
@@ -166,6 +98,6 @@ This project is licensed under the **MIT License** — see the [LICENSE](LICENSE
 
 Made with ❤️ by [Enoque Sousa](https://github.com/ESousa97)
 
-**Project Status:** Active · Constantly updated
+**Project Status:** Archived · Educational Milestone
 
 </div>
````
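The `Producer -> Job Channel -> [Workers] -> Result Channel -> Consumer` flow named in the README maps onto a small amount of idiomatic Go. Below is a minimal, self-contained sketch of that shape, assuming an in-memory CSV string and a two-field `record` type rather than the project's actual API:

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"io"
	"os"
	"strings"
	"sync"
)

// record is an illustrative stand-in for the project's row type.
type record struct {
	Name  string `json:"name"`
	Email string `json:"email"`
}

func main() {
	// A tiny in-memory "file"; the real tool streams from disk via io.Reader.
	input := "alice,alice@company.com\nbob,bob@example.org\n"

	jobs := make(chan []string, 8)  // buffered: absorbs bursts from the producer
	results := make(chan record, 8) // buffered: decouples workers from the consumer

	// Producer: reads the CSV record-by-record and feeds the job channel.
	go func() {
		defer close(jobs)
		r := csv.NewReader(strings.NewReader(input))
		for {
			row, err := r.Read()
			if err == io.EOF {
				return
			}
			if err != nil {
				continue // skip malformed rows
			}
			jobs <- row
		}
	}()

	// Worker pool: a fixed number of goroutines drain the shared channel.
	// Transformers would run here, between decode and send.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for row := range jobs {
				results <- record{Name: row[0], Email: row[1]}
			}
		}()
	}

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Consumer: a single writer serializes the output (order is not guaranteed).
	enc := json.NewEncoder(os.Stdout)
	for rec := range results {
		_ = enc.Encode(rec)
	}
}
```

The buffered channels double as the "shock absorber" described in the architecture notes: when the consumer lags, senders block instead of accumulating unbounded data in memory.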

docs/architecture.md

Lines changed: 18 additions & 21 deletions
````diff
@@ -1,10 +1,10 @@
-# System Architecture
+# Historical Architecture Design
 
-This document details the architectural decisions and data flow of the **Go File Processor**.
+This document serves as a reference for the design decisions made during the development of the **Go File Processor**. This project was a study of Go's system architecture capabilities.
 
-## Data Flow (Pipeline)
+## The Streaming Pipeline
 
-The system uses a parallel streaming pipeline to ensure efficiency with massive files.
+The primary goal was to achieve high throughput with **constant memory usage**. We implemented a pipeline where data is processed in individual records, never loading the entire file into RAM.
 
 ```mermaid
 graph TD
@@ -24,24 +24,21 @@ graph TD
     end
 ```
 
-## Architectural Decisions (ADRs)
+## Core Architectural Lessons
 
-### 1. Worker Pool Pattern
-**Context**: Processing millions of records via a single main loop would cause I/O blocking and CPU underutilization.
-**Decision**: Implement a pool of goroutines (Workers) that process records in parallel.
-**Consequence**: Significant throughput increase on multi-core systems.
+### 1. The Worker Pool Pattern
+**Learning Goal**: Understand how to scale processing by decoupling the producer from the consumers using channels.
+**Implementation**: A fixed number of goroutines (Workers) listen on a shared channel.
+**Outcome**: High CPU utilization across all cores without manual thread management.
 
-### 2. Streaming vs Batching
-**Context**: Loading the entire file into memory (Full Read) can cause OOM (Out Of Memory) on files dozens of GBs in size.
-**Decision**: Process via `io.Reader` and `io.Writer`, keeping only the stream buffer in memory.
-**Consequence**: Constant RAM consumption (~20-50MB) regardless of file size.
+### 2. Backpressure Management
+**Learning Goal**: How to prevent the producer from overwhelming the consumer.
+**Implementation**: Using buffered channels as a "shock absorber" for data bursts.
 
-### 3. Middleware for Transformations
-**Context**: Transformation/filter logic should be flexible and decoupled from the core Worker code.
-**Decision**: Use the "Chain of Responsibility" pattern via the `Transformer func(*User) bool` type.
-**Consequence**: Ease of adding new filters without changing the main worker loop.
+### 3. Decoupled Middleware
+**Learning Goal**: Implementing clean, pluggable logic using Go's function types.
+**Implementation**: Using `Transformer func(*User) bool` as a chain of responsibility.
 
-### 4. Atomic Metrics
-**Context**: Multiple workers need to update success/error counters simultaneously. Mutexes could cause contention.
-**Decision**: Use `sync/atomic` for lock-free counting.
-**Consequence**: Maximum performance in high-concurrency scenarios.
+### 4. Lock-Free Metrics
+**Learning Goal**: Avoiding mutex contention in high-concurrency environments.
+**Implementation**: Using the `sync/atomic` package for thread-safe global counters.
````
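To make lesson 3 concrete, here is a minimal sketch of how a `Transformer func(*User) bool` chain can be wired. The `User` fields and the two helper constructors are assumptions for illustration; the repository's real versions live in `internal/processor/transformer.go`:

```go
package main

import (
	"fmt"
	"strings"
)

// User is an illustrative record type; the real project defines its own.
type User struct {
	Name  string
	Email string
}

// Transformer matches the signature cited above: it may mutate the record
// in place, and returning false drops the record from the output.
type Transformer func(*User) bool

// emailFilter keeps only users whose email ends with the given suffix.
func emailFilter(suffix string) Transformer {
	return func(u *User) bool { return strings.HasSuffix(u.Email, suffix) }
}

// fieldMasker redacts the email field but keeps the record.
func fieldMasker() Transformer {
	return func(u *User) bool {
		u.Email = "***"
		return true
	}
}

// applyChain runs each transformer in order and short-circuits on the first
// rejection: the chain-of-responsibility step inside each worker.
func applyChain(u *User, chain []Transformer) bool {
	for _, t := range chain {
		if !t(u) {
			return false
		}
	}
	return true
}

func main() {
	chain := []Transformer{emailFilter("@company.com"), fieldMasker()}
	users := []User{
		{Name: "alice", Email: "alice@company.com"},
		{Name: "bob", Email: "bob@example.org"},
	}
	for i := range users {
		if applyChain(&users[i], chain) {
			fmt.Printf("%+v\n", users[i]) // only alice survives, with email masked
		}
	}
}
```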

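Lesson 4 similarly reduces to a few `sync/atomic` primitives. A minimal illustration with hypothetical counter names, assuming Go 1.19+ for the typed atomics:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	// Typed atomics (Go 1.19+) make lock-free counters hard to misuse.
	var processed, failed atomic.Int64

	var wg sync.WaitGroup
	for w := 0; w < 8; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				if i%10 == 0 {
					failed.Add(1) // a single atomic instruction, no mutex needed
				} else {
					processed.Add(1)
				}
			}
		}()
	}
	wg.Wait()

	fmt.Printf("processed=%d failed=%d\n", processed.Load(), failed.Load())
}
```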