
5 weekend data projects that are actually fun (and quietly impressive)
Article Summary
Five Saturday-sized data projects on free public data. Each teaches a real skill, looks good in a portfolio, and starts with data you actually care about.
The most common portfolio I see from someone learning data science is three notebooks, all using the Titanic dataset, all doing the exact same thing as ten thousand other people. It's fine. It's also forgettable — for them and for whoever's looking. The work feels like homework because it was homework.
The fix isn't a harder project. It's a project you actually care about. When the data is yours, or it's about something you find genuinely interesting, you stop "doing data science" and start answering a question you wanted answered anyway. That's when the skills sink in — and, conveniently, that's also what makes a project memorable to anyone reading your GitHub.
Here are five projects, each finishable in a Saturday, each on real free public data, each teaching a skill you'll reuse for years. I'll give you the question, the skill, and where to get the data. No invented datasets, no "imagine you have a CSV of..." — these are all real and free.
The "fun first" filter
Before the projects, one rule that quietly decides whether you'll finish: pick the dataset you have a feeling about.
If you don't care whether your local weather is getting hotter, skip the weather one. If you've never opened Spotify, skip the music one. Curiosity is the fuel that gets you past the boring middle bit of every project — the cleaning, the "why is this column a string," the chart that looks wrong the first time. Care about the answer and you'll push through. Don't, and the notebook joins the graveyard.
Right. Five projects.
Project 1 — Your own life, in data
The question: what do my own habits actually look like?
Most big platforms will hand you your own data if you ask. Spotify, Google (including your YouTube and location history), Apple Health steps, your phone's screen-time export — most let you download an archive of your activity. It's the most motivating dataset on earth, because it's about you.
Skills: loading messy real-world files (often JSON), parsing dates and times, grouping by day or hour, and your first honest chart.
A taste of it, assuming your listening or step data is a JSON file with a timestamp field:

```python
import pandas as pd

# "my_data.json" and "timestamp" are placeholders; use the file and
# field names from your own export.
df = pd.read_json("my_data.json")
df["date"] = pd.to_datetime(df["timestamp"])
df["hour"] = df["date"].dt.hour

# Count events in each hour of the day, then plot (needs matplotlib).
by_hour = df.groupby("hour").size()
by_hour.plot(kind="bar", title="When am I most active?")
```

You'll discover something true about yourself: your real peak hour, your most-played track, the day of the week you walk least. That little jolt of "huh, I didn't know that" is the whole point.
Project 2 — Is my town actually getting warmer?
The question: has the weather where I live genuinely changed over the decades, or does it just feel that way?
NOAA's Climate Data Online gives free public access to decades of quality-controlled daily weather (temperature, precipitation, the lot) for stations all over the world. You request a free API token, pick a station near you, and pull the history.
Skills: working with a real API, handling a long time series, resampling daily readings into yearly averages, and drawing a trend line that means something. This is the project that teaches you time-series basics without a textbook.
The "mini-dashboard" version — average temperature per year with a trend — looks genuinely impressive precisely because the data is real and the question is personal. Nobody can accuse you of copying a tutorial when the chart is about your hometown.
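Here is a minimal sketch of the "yearly average plus trend" step. It does not call the NOAA API; the daily series below is synthetic stand-in data with a small warming trend deliberately built in, so the numbers mean nothing about any real place. The column name `tmax_c` and all the constants are assumptions for illustration. With a real station's history you would swap in your own daily readings and keep the last few lines.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a station's daily highs (deg C), NOT real NOAA data:
# a seasonal cycle, a built-in warming of 0.03 deg C per year, and noise.
rng = np.random.default_rng(42)
dates = pd.date_range("1980-01-01", "2019-12-31", freq="D")
days_elapsed = np.arange(len(dates))
seasonal = 10 * np.sin(2 * np.pi * days_elapsed / 365.25)
warming = 0.03 * days_elapsed / 365.25
noise = rng.normal(0, 3, len(dates))
daily = pd.Series(15 + seasonal + warming + noise, index=dates, name="tmax_c")

# Collapse daily readings into one mean per calendar year.
yearly = daily.groupby(daily.index.year).mean()

# Fit a straight line through the yearly means; the slope is deg C per year.
years = yearly.index.to_numpy()
slope, intercept = np.polyfit(years, yearly.to_numpy(), deg=1)
trend = slope * years + intercept

print(f"Trend: {slope * 10:.2f} deg C per decade")
```

Averaging by year first matters: fitting a line straight through noisy daily readings buries the trend under the seasons. If matplotlib is installed, `yearly.plot()` with `trend` drawn on top gives you the mini-dashboard chart.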
Project 3 — What does the internet actually feel about X?
The question: are reviews / comments / posts about this thing mostly positive or negative — and why?
Grab the text reviews for something you care about (a game, a product, a film) or pull posts from a subreddit on a topic you follow. Then run basic sentiment analysis and see what the mood actually is, versus what you assumed.
Skills: working with text data, basic natural-language processing, counting and ranking words, and — the real lesson — being honest about the limits. Sentiment tools are blunt instruments; they miss sarcasm and context constantly. Discovering that yourself, and saying so in your write-up, is more impressive than pretending you built a perfect mood detector.
```python
# The shape of it: each review gets a rough positive/negative score,
# then you look at what's driving the extremes.
# `reviews` is a DataFrame with a "text" column; `score_sentiment` is
# whatever scoring function you choose (both are placeholders here).
reviews["sentiment"] = reviews["text"].apply(score_sentiment)
print(reviews.sort_values("sentiment").head())  # the angriest reviews
print(reviews.sort_values("sentiment").tail())  # the happiest
```

The interesting bit is never the average score. It's reading the most negative reviews and finding the one specific thing everyone complains about.
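If you want something runnable to start from, here is one possible toy version of `score_sentiment`. It is only a sketch: the word lists are hand-picked for illustration, the reviews are made up, and a real project would more likely use an established sentiment lexicon or library. Its value is that it shows the bluntness of the approach straight away.

```python
import pandas as pd

# Tiny hand-picked word lists, for illustration only (an assumption, not a
# real sentiment lexicon).
POSITIVE = {"great", "love", "fun", "excellent", "smooth"}
NEGATIVE = {"crash", "crashes", "refund", "boring", "broken", "slow"}


def score_sentiment(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


# Made-up example reviews, not real data.
reviews = pd.DataFrame({"text": [
    "Great game, love the art style!",
    "Crashes on startup. I want a refund.",
    "Oh great, another update that crashes my save.",  # sarcastic
    "It is fine.",
]})
reviews["sentiment"] = reviews["text"].apply(score_sentiment)
print(reviews.sort_values("sentiment"))
```

The sarcastic third review scores a neutral 0, because "great" and "crashes" cancel out. That is exactly the kind of failure worth calling out in your write-up.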
Project 4 — The "is this actually worth it?" analysis
The question: does the expensive option actually pay off?
This is my favourite because it's pure real-world thinking. Pick an "is it worth it?" question with public data behind it. Do houses near the train station really cost more? Do higher-rated restaurants actually charge more, or are rating and price barely related? Does a city's cycling investment line up with fewer accidents?
Skills: joining two datasets, comparing groups honestly, and the most underrated skill in all of data science — noticing the difference between correlation and "well, obviously there's a third thing going on." A project that resists an over-confident conclusion shows more maturity than one that claims to have "proven" something.
When you're comparing groups and pivoting numbers around, you'll lean hard on pandas' grouping tools, and knowing when to reach for groupby versus a pivot table saves a lot of fumbling. A groupby-versus-pivot decision guide is the cheat sheet I wish I'd had.
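As a rough illustration of the joining and comparing steps, here is a sketch using the restaurant question. Every name and number below is invented toy data, and the column names are assumptions. It joins two small tables, summarises the result once with `groupby` and once with `pivot_table`, and finishes with a correlation that deliberately proves nothing.

```python
import pandas as pd

# Made-up toy data, for illustration only.
restaurants = pd.DataFrame({
    "name": ["Ana's", "Blue Fig", "Corner", "Dosa Hut", "Elm", "Fuji"],
    "neighbourhood": ["north", "north", "south", "south", "north", "south"],
    "avg_main_price": [14, 28, 11, 9, 31, 22],
})
ratings = pd.DataFrame({
    "name": ["Ana's", "Blue Fig", "Corner", "Dosa Hut", "Elm"],
    "stars": [4.5, 4.0, 3.5, 4.5, 4.0],
})

# Inner join keeps only restaurants present in both tables; validate= raises
# an error if a name is unexpectedly duplicated on either side.
df = restaurants.merge(ratings, on="name", how="inner", validate="one_to_one")

# groupby: one summary number per group, along a single dimension.
mean_price_by_stars = df.groupby("stars")["avg_main_price"].mean()

# pivot_table: the same kind of summary laid out across two dimensions.
price_grid = df.pivot_table(
    index="neighbourhood", columns="stars",
    values="avg_main_price", aggfunc="mean",
)

# One correlation number; with five rows it settles nothing on its own.
corr = df["stars"].corr(df["avg_main_price"])
print(mean_price_by_stars, price_grid, f"correlation: {corr:.2f}", sep="\n\n")
```

Notice that the join silently drops Fuji, which has a price but no rating. Checking how many rows survive a merge is a habit worth building early. The pivot also hints at a possible "third thing": in this toy data the pricey four-star places are all in one neighbourhood.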
Project 5 — A tiny, honest prediction
The question: can I guess tomorrow from yesterday?
Take any of the data above and make the smallest possible prediction. Given the last few days of weather, guess tomorrow's high. Given a film's genre and length, guess its rating bracket. Keep it tiny.
Skills: your first taste of machine learning — splitting data into "train" and "test," fitting a simple model, and (crucially) measuring how wrong you are. The lesson that lasts isn't "I built a model." It's "my model is right about 60% of the time, and here's exactly where it falls over." A modest, honestly-measured prediction beats a flashy one with no error analysis, every time.
This is the natural on-ramp to proper data science — and it'll tell you fast whether you find this stuff genuinely fun before you commit to anything bigger.
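A minimal sketch of the weather version, under loud assumptions: the temperatures are synthetic (a seasonal curve plus random wobble, not real readings), the "model" is just a straight line fitted with NumPy, and error is measured as mean absolute error. The part worth copying is the shape: split by time, fit on the past only, and compare against a baseline that uses no model at all.

```python
import numpy as np

# Synthetic daily highs (deg C), NOT real data: a seasonal cycle plus a
# random wobble that partly carries over from one day to the next.
rng = np.random.default_rng(0)
n_days = 730
seasonal = 15 + 10 * np.sin(2 * np.pi * np.arange(n_days) / 365.25)
wobble = np.zeros(n_days)
for t in range(1, n_days):
    wobble[t] = 0.7 * wobble[t - 1] + rng.normal(0, 2)
temps = seasonal + wobble

# Feature: today's high. Target: tomorrow's high.
today, tomorrow = temps[:-1], temps[1:]

# Split by time, not at random: train on the past, test on the "future".
split = int(len(today) * 0.8)
x_train, y_train = today[:split], tomorrow[:split]
x_test, y_test = today[split:], tomorrow[split:]

# Fit a straight line using the training days only.
slope, intercept = np.polyfit(x_train, y_train, deg=1)
predictions = slope * x_test + intercept

# Measure how wrong the model is, next to a "no model" baseline that
# simply guesses tomorrow will match today.
model_mae = np.mean(np.abs(predictions - y_test))
baseline_mae = np.mean(np.abs(x_test - y_test))
print(f"model MAE: {model_mae:.2f} deg C, baseline MAE: {baseline_mae:.2f} deg C")
```

If the model barely beats "tomorrow equals today", say so in the write-up. That comparison is the honest measurement, and it is more useful than the model itself.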
The README that makes it count
Here's the part most people skip, and it's the part that does 80% of the work of impressing anyone. A project nobody can understand is a project that didn't happen.
Write a short README — five honest sentences:
- The question I was trying to answer.
- The data I used and where it came from (link it).
- What I found — one or two real takeaways, in plain English.
- What surprised me or where the analysis can't be trusted.
- How to run it.
That fourth point — owning the limits — is what separates someone who can run code from someone who can think. Anyone reviewing your work, for a job or a course, is looking for exactly that.
The recap
- Pick data you have a feeling about. Curiosity gets you through the boring middle.
- Five Saturdays, five real skills: messy files, APIs and time series, text, joining and comparing, a first prediction.
- Use real public data (your own exports, NOAA, public reviews) — never the same tired tutorial set.
- Write a five-sentence README, and be honest about the limits. That's the quietly impressive part.
| Project | Skill it builds | Honest difficulty |
|---|---|---|
| Your own life data | Messy files, grouping, dates | Gentle |
| Hometown weather | APIs, time series, trends | Medium |
| Sentiment of reviews | Text data, knowing the limits | Medium |
| "Is it worth it?" | Joining data, correlation sense | Medium |
| Tiny prediction | First ML, measuring error | Spicy (but small) |
You don't need five. You need one, finished, with a README, on something you actually wondered about. That single honest project will teach you — and show — more than a stack of half-done tutorials.
If you want a second pair of eyes on your first real project, or you keep getting stuck at the "why is this column the wrong type" stage, that's exactly the kind of thing a one-on-one session sorts out in an afternoon. Bring your messy notebook; messy is where the learning lives.
Written by Ali Jabbary
M.Sc., P.Eng. • Expert Data Scientist & ML Engineer with 10+ years of experience. 500+ students helped worldwide. Specializing in Python, AI/ML, and turning complex problems into simple solutions.


