Enhancement: Integrate real-world dataset for validation #5

@chripiermarini

Description

The pipeline is currently tested on synthetic or controlled data.
To better assess its robustness and how it behaves under realistic conditions, we want to integrate a real-world dataset and run the system on it.


Goals

  • Validate the pipeline on realistic data
  • Identify limitations and assumptions
  • Test scalability and data compatibility

Possible Data Sources

  • Public datasets (e.g. Kaggle)
  • Time series demand datasets
  • Transportation / logistics network datasets

Proposed Steps

  1. Identify a suitable dataset
  2. Adapt the ingestion layer if needed
  3. Run the full pipeline end-to-end
  4. Analyze results and document findings
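
As a starting point for step 2, here is a minimal sketch of what an ingestion adapter could look like. It assumes the dataset arrives as CSV and that the pipeline expects `timestamp`, `node_id`, and `demand` fields. These field names, the source column names in `COLUMN_MAP`, and the `adapt_rows` helper are all hypothetical and not taken from the existing code. Rows that fail parsing are collected rather than dropped silently, so they can be counted in the findings.

```python
import csv
import io
from datetime import datetime

# Hypothetical mapping: source dataset column -> field name the pipeline
# is assumed to expect. Adjust both sides to the real dataset and schema.
COLUMN_MAP = {"Date": "timestamp", "Store": "node_id", "Sales": "demand"}


def adapt_rows(csv_text, column_map=COLUMN_MAP, date_format="%Y-%m-%d"):
    """Rename columns, parse types, and separate valid rows from rejected ones.

    Returns (clean, rejected), where rejected holds (line_number, reason).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(column_map) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"dataset is missing columns: {sorted(missing)}")

    clean, rejected = [], []
    for raw in reader:
        # Keep only the mapped columns, under the pipeline's field names.
        row = {target: raw[source] for source, target in column_map.items()}
        try:
            row["timestamp"] = datetime.strptime(row["timestamp"], date_format)
            row["demand"] = float(row["demand"])
            if row["demand"] < 0:
                raise ValueError("negative demand")
        except (ValueError, TypeError) as exc:
            # Record the failure so data-quality issues can be documented.
            rejected.append((reader.line_num, str(exc)))
            continue
        clean.append(row)
    return clean, rejected
```

The `rejected` list is intended to feed step 4: the share of rejected rows and the reasons for rejection are a concrete way to document limitations and assumptions about data compatibility.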

Acceptance Criteria

  • At least one real dataset successfully integrated
  • Pipeline runs end-to-end without errors
  • Basic analysis of results documented
  • Key limitations and assumptions clearly stated
