Data · Blended · 11 weeks

Data Pipelines with Python

₩1,680,000

Design batch and streaming pipelines with pandas, Polars, and orchestration basics. The course emphasizes data quality checks and reproducible notebooks that hiring managers can audit.

Modules & labs

  • Schema validation with pydantic models
  • Airflow-style DAG concepts (local runner)
  • Data quality dashboards
  • Parquet and columnar storage labs
  • Privacy-aware sampling techniques
  • Mentor review of pipeline diagrams

Outcomes

  • Document a pipeline with failure recovery steps
  • Implement validation gates before warehouse loads
  • Present lineage maps for a capstone dataset

Soyeon Lee

Data engineer mentoring on reliable ingestion patterns.

FAQ

Basic SELECT/JOIN knowledge helps. We include a two-week SQL refresher module.

Participant notes

Our team adopted the validation checklist from week 5 on a legacy CSV import — fewer silent failures.

Minho, Regional logistics firm · via Google

Polars section moved fast; office hours helped me catch up on lazy evaluation.

Ara
