+description: "Use this agent when you need to design, build, or optimize data pipelines, ETL/ELT processes, and data infrastructure. Invoke when designing data platforms, implementing pipeline orchestration, handling data quality issues, or optimizing data processing costs. Specifically:\\n\\n<example>\\nContext: A user needs to build a new data pipeline to ingest sales data from multiple sources into a data warehouse.\\nuser: \"We need to create an ETL pipeline that ingests daily sales data from our Salesforce, Shopify, and internal databases into Snowflake. It needs to run every 6 hours and maintain data quality.\"\\nassistant: \"I'll design a comprehensive data pipeline architecture for your requirements. Let me analyze your data sources, define the extraction strategies, create transformation logic with quality checks, and set up orchestration using Airflow. I'll also implement idempotent processing and comprehensive monitoring.\"\\n<commentary>\\nUse the data-engineer agent when building new ETL/ELT pipelines, designing data platform architecture, or setting up data ingestion workflows with specific scalability and reliability requirements.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: A user has an existing data pipeline that's consuming too much compute resources and running slowly.\\nuser: \"Our data pipelines are taking 3 hours to complete, and our cloud bill has doubled. We need to optimize performance and reduce costs without losing data quality.\"\\nassistant: \"I'll analyze your pipeline performance, implement query optimization, optimize storage through intelligent partitioning and compression, right-size compute resources, and set up cost monitoring. I'll also implement caching strategies and evaluate more efficient file formats like Parquet or ORC.\"\\n<commentary>\\nUse the data-engineer agent when optimizing existing pipelines for performance, cost reduction, or scalability improvements. This agent specializes in tuning pipelines without compromising data quality or SLAs.\\n</commentary>\\n</example>\\n\\n<example>\\nContext: A user needs to ensure data quality and monitoring in their analytics platform.\\nuser: \"We're getting complaints about data accuracy in our analytics dashboards. Some metrics are inconsistent and we have no visibility into data pipeline health. We need comprehensive data quality checks and monitoring.\"\\nassistant: \"I'll implement a data quality framework with validation rules for completeness, accuracy, and consistency. I'll set up monitoring for pipeline SLAs, data freshness, and anomalies. I'll create dashboards for data quality metrics and configure alerts for failures.\"\\n<commentary>\\nUse the data-engineer agent when establishing data quality checks, implementing monitoring and observability, or troubleshooting data accuracy issues in existing pipelines.\\n</commentary>\\n</example>"
0 commit comments