
// data-engineering

Data Engineering tutoring, one-on-one.

Pipelines that scale — SQL, Python, dbt & Airflow.

data & analytics · 3 levels · from $85/hr · 1-on-1 online
First session free · 100% online, 1-on-1 · From $85/hr

// choose your starting point

Data Engineering levels

Start where you are — each level maps to a different point on the journey. We confirm the right one together in your free first session.

$85/hr

Pipelines Foundations

Build your first data pipeline.

Who it's for

For an analyst or Python user who can query a bit and now wants to build the actual pipelines that turn messy raw sources into clean, queryable tables.

Learn the foundations data teams run on: advanced SQL, Python for moving data, and how real ETL/ELT pipelines are built. Start turning messy sources…

What you'll be able to do

  • Write advanced SQL (joins, CTEs, window functions, aggregation) to shape real datasets.
  • Use Python (pandas / SQLAlchemy) to extract, clean, and load data between a source and a database.
  • Explain ETL vs ELT and choose the right approach for a given source and warehouse.
  • Build and run a first repeatable pipeline from a raw source to a clean output table.

Sounds familiar?

  • I can write a SELECT, but real analytics SQL — window functions, CTEs — is where I get stuck.
  • I move data with one-off Python scripts that break constantly and that nobody else can run.
  • People say "ETL" and "ELT" and I genuinely don't know the difference or which I should be doing.
  • I've never built a pipeline end-to-end — I just have a folder of scripts I run by hand.
Advanced SQL · Python for data · ETL vs ELT · Your first pipeline
SQL (PostgreSQL) · Python · pandas · SQLAlchemy

Book this level

$115/hr

dbt & Orchestration

Transform and schedule with dbt + Airflow.

Who it's for

For someone who can build a basic pipeline and now wants the modern analytics-engineering stack: transforming with dbt and orchestrating with Airflow against a cloud warehouse.

Build the modern analytics-engineering stack: transform data with dbt (models, tests, docs), orchestrate jobs with Airflow, and load into a cloud…

What you'll be able to do

  • Build dbt models with tests, sources, and auto-generated docs, structured by dependencies.
  • Author Airflow DAGs that schedule and orchestrate jobs in the correct dependency order.
  • Add data-quality checks (dbt tests / freshness) that catch bad data before it spreads.
  • Load and transform data inside a cloud warehouse such as Snowflake or BigQuery.

Sounds familiar?

  • My SQL transformations are a tangle of scripts with no tests, no docs, and no idea what depends on what.
  • I run jobs manually or with cron, and there's no real scheduling, dependency order, or visibility.
  • When upstream data is bad, it silently poisons everything downstream and I find out too late.
  • I keep hearing "dbt" and "Airflow" in every job description and I've never actually used either.
dbt models & tests · Airflow DAGs · Scheduling & dependencies · Data quality · Warehouses
dbt · Apache Airflow · Snowflake / BigQuery · SQL · Python

Book this level

$145/hr

Scale & Spark

Big data, streaming, production.

Who it's for

For a data engineer comfortable with dbt and Airflow who now needs to handle big data and streaming, and keep pipelines fast and reliable in production.

Engineer for scale: process big data with Spark/PySpark, design for performance, add streaming, and build pipelines that stay reliable in production…

What you'll be able to do

  • Process datasets too large for memory using Spark / PySpark.
  • Diagnose and fix performance problems through partitioning, caching, and avoiding wasteful shuffles.
  • Build a basic streaming pipeline that handles continuously arriving data.
  • Design data models and add reliability practices (idempotency, monitoring) that hold up in production.

Sounds familiar?

  • My pandas/SQL approach falls over once the data is too big to fit in memory.
  • Spark jobs run, but they're slow and I don't understand partitioning, shuffles, or why.
  • Everything I build is batch — I don't know how to handle data that arrives continuously.
  • Pipelines that work in dev fall apart in production and I'm always firefighting reliability.
Spark / PySpark · Partitioning & performance · Streaming basics · Modelling at scale · Reliability
Apache Spark / PySpark · Python · SQL · Kafka (streaming) · Cloud data platform

Book this level

// what's included

How a Data Engineering session works

Every subject runs on the same method — a live, hands-on hour built so the learning sticks, with everything you make saved and yours to keep.

Live co-op coding

We work in one shared editor with you driving: you write the code or derive the maths while I steer in real time, rather than sitting through slides.

Saveable whiteboard

Every diagram and derivation is sketched on a shared whiteboard you keep — the canvas is saved and yours to revisit after the hour.

Written recap

You leave with a written summary of what changed and one or two things to practise, so the session keeps working after we hang up.

Off-class help

Stuck between sessions? Send the error or the question and get unblocked — support does not stop the moment the call ends.

Assignments & checkpoints

We close each hour with a checkpoint you attempt solo, so we both see whether it actually landed, and we loop back before the hour ends if it did not.

Your class archive

Code, whiteboards and recaps live in one place you can return to — a growing folder of everything we have built together.

Want the minute-by-minute anatomy of a real hour? See how it works →

// what to expect

Honest about how it goes

No guarantees, no fixed curriculum — just a specific, repeatable way of working that gets you unstuck on Data Engineering.

Built around your goal

There is no fixed syllabus to keep pace with. The hour is built backwards from the one thing you need — a failing assignment, a concept that will not stick, a project to ship.

Diagnosed, not re-taught

We find the precise step where it breaks down instead of re-covering what you already know — so the time goes to the gap that actually matters.

You drive, I steer

You do the work in real time while I guide — that is how it sticks. You leave able to do it yourself, not just having watched me do it.

Honest pace & pricing

You only pay for the levels and pace that fit. We agree the plan together after the free first session — no packages you do not need.

// faq

Frequently asked questions

About Data Engineering tutoring and how sessions work.

Is the first Data Engineering session really free?

Yes. Your first session is complimentary so you can experience the teaching style, talk through your goals, and decide whether to continue — no credit card required upfront.

How much does Data Engineering tutoring cost?

Sessions start at $85/hour, and multi-session packages are available at a discount. You only pay for the levels and pace that fit your goals — we agree on a plan together after the free first session.

How are Data Engineering sessions delivered?

All sessions are 1-on-1 and 100% online over video, with screen sharing and a shared editor or whiteboard. Sessions are typically 60–90 minutes and scheduled around your availability.

Which Data Engineering level should I start at?

Your starting level is set by where you are now, not by a fixed curriculum. In the free first session we map your background to the right starting level and adjust the pace as you progress.

Who is teaching the sessions?

Every session is taught directly by Ali Jabbary, M.Sc., P.Eng. — not a rotating pool of tutors. You work with the same instructor throughout.

Ready to start Data Engineering?

Your first session is free, with no credit card required. Book a time that suits you and we'll map out exactly where to begin.

from $85/hr · 1-on-1 · 100% online · taught by Ali

Book a free call · Message Ali