
Narwhals: Write Your DataFrame Code Once, Run It Anywhere
Article Summary
Library authors stuck supporting both pandas and Polars: meet Narwhals, a zero-dependency compatibility layer. Write once, run on either.
A student of mine shipped a little open-source library last year — a tidy helper for cleaning survey data. It worked beautifully. Then someone opened an issue: "Does this work with Polars?" It did not. The whole thing was written in pandas, and rewriting it to also speak Polars meant either maintaining two code paths forever or picking a side and annoying half the users.
This is one of the quietest, most annoying problems in the Python data world right now, and most people writing day-to-day analysis code never even notice it. But if you've ever published a tool that touches dataframes, you've felt it. That's where Narwhals comes in.
The problem: you're stuck supporting two dataframes
For years, "a dataframe in Python" meant pandas. One library, one API, done. Then Polars arrived with a genuinely different and faster engine, a different (and frankly nicer) expression-based API, and lazy evaluation. People liked it. Then PyArrow tables started showing up everywhere as the interchange format. DuckDB joined the party. cuDF for GPUs. Modin and Dask for scale.
Suddenly the question "what dataframe does your tool accept?" has a dozen possible answers, and they all have incompatible APIs.
If you only write scripts and notebooks for yourself, this barely matters — you pick pandas or Polars and move on. But if you write anything that other people pass their dataframes into — a plotting library, a feature-engineering toolkit, an internal data-quality package — you have a real dilemma:
- Pick pandas only. Polars users have to convert to pandas first, which is slow and defeats the point of using Polars.
- Pick Polars only. You alienate the enormous installed base of pandas code.
- Support both by hand. You write `if isinstance(df, pd.DataFrame): ... else: ...` branches everywhere, and you maintain that forever. It rots.
- Take a heavy dependency. Some compatibility tools work, but they pull in a big dependency tree, which library authors hate inflicting on their users.
None of these is good. Narwhals exists specifically to dissolve this dilemma.
What Narwhals actually is
Narwhals describes itself as an "extremely lightweight and extensible compatibility layer between dataframe libraries." That's the whole pitch, and it's accurate.
Two properties make it interesting:
It has zero dependencies. From the project's own README: "Zero dependencies, Narwhals only uses what the user passes in so your library can stay lightweight." This is the part library authors care about most. Adding Narwhals to your package doesn't drag pandas, Polars, or anything else into your users' environment. It only uses the dataframe library they already brought.
Its API is a subset of Polars' syntax. Narwhals didn't invent a brand-new API to learn. It deliberately mirrors a subset of Polars' expression API. So if you know `pl.col("x")`, `with_columns`, `filter`, and `group_by`, you already mostly know Narwhals, where the same expression is spelled `nw.col("x")`. And if you don't know Polars yet, learning this subset is a nice on-ramp.
The mental model is simple: you wrap the incoming native dataframe, do your work using the Narwhals API, then unwrap back to whatever the user originally gave you. Pandas in, pandas out. Polars in, Polars out. Your code in the middle never changes.
Crucially, Narwhals doesn't copy or convert your data into some intermediate format. It wraps the original object and dispatches operations down to that library's own engine. A Polars dataframe is still being processed by Polars; pandas by pandas. Narwhals is a translator standing between your code and the backend, not a new dataframe of its own.
What it supports
Per the project docs, the backend support splits into two tiers:
- Full API support: cuDF, Modin, pandas, Polars, PyArrow
- Lazy-only support: Daft, Dask, DuckDB, Ibis, PySpark, SQLFrame
The "lazy-only" group covers engines that are fundamentally lazy/out-of-core, so Narwhals exposes the lazy slice of its API for them.
A small example
Here's the shape of it. One function, works on multiple backends, returns the same type it received.
```python
import narwhals as nw
from narwhals.typing import IntoFrameT


def add_revenue_flag(df_native: IntoFrameT) -> IntoFrameT:
    # 1. Wrap whatever the user passed in.
    df = nw.from_native(df_native)
    # 2. Use the Narwhals (Polars-flavored) API. Backend-agnostic.
    df = df.with_columns(
        revenue=nw.col("price") * nw.col("quantity"),
    ).with_columns(
        is_big_order=nw.col("revenue") > 1000,
    )
    # 3. Unwrap back to the original library.
    return df.to_native()
```

Now both of these work, with no branching on your side:
```python
import pandas as pd
import polars as pl

data = {"price": [10.0, 50.0, 200.0], "quantity": [3, 30, 8]}
pandas_result = add_revenue_flag(pd.DataFrame(data))  # -> pandas DataFrame
polars_result = add_revenue_flag(pl.DataFrame(data))  # -> Polars DataFrame
print(type(pandas_result))  # <class 'pandas.core.frame.DataFrame'>
print(type(polars_result))  # <class 'polars.dataframe.frame.DataFrame'>
```

The user who passed pandas gets pandas back. The Polars user gets Polars back, processed by Polars the whole way. You wrote the logic once.
If you want to guarantee a function only receives a dataframe (and gives one back), there's a decorator-friendly pattern using `nw.narwhalify`, but `from_native` / `to_native` is the explicit version and the one I reach for first because it's obvious what's happening.
Who actually uses it (and who should)
This is the part I want to be honest about, because Narwhals is not for everyone.
Narwhals is built primarily for library and tool authors — people writing code that other people feed dataframes into. And it's not a hypothetical: real, widely-used projects already depend on it. The project lists a number of them, including Plotly, Altair, Bokeh, scikit-lego, and Marimo. When you make an Altair chart from a Polars dataframe and it Just Works, Narwhals is part of why.
Here's the honest dividing line:
| If you are... | Should you reach for Narwhals? |
|---|---|
| Writing a one-off analysis or notebook | No. Just use pandas or Polars directly. |
| Building an internal pipeline you fully control | Probably not. Pick one library and standardize. |
| Writing a library others install | Yes — this is the sweet spot. |
| Building a tool that must accept "any" dataframe | Yes. This is exactly the problem it solves. |
| Wanting to learn the Polars API gradually | Maybe, as a side benefit — the syntax overlaps. |
The key takeaway: if you're the end user of dataframes, Narwhals is plumbing you'll never touch directly — and that's fine. If you're the author who has to accept whatever dataframe a stranger throws at you, Narwhals turns an impossible support matrix into a single code path with no dependency cost.
A couple of honest caveats
Narwhals exposes a subset of the Polars API, not all of it. That's by design: the subset is chosen to be expressible across every backend. If you need some exotic, backend-specific operation, you can always call `to_native()` and drop down to the real library for that step. You're never locked in; you can escape to the underlying dataframe whenever you need to.
And it's a translation layer, so it can't make a slow backend fast. If a user passes pandas, the work runs at pandas speed. Narwhals' job is portability, not performance — though by letting Polars users stay in Polars instead of converting to pandas first, it often removes a real performance penalty that a pandas-only library would have imposed.
Recap
- The modern dataframe ecosystem has many incompatible APIs (pandas, Polars, PyArrow, DuckDB, and more), which is a real headache for anyone publishing dataframe code.
- Narwhals is a zero-dependency compatibility layer. You wrap a native dataframe, write logic once in a Polars-style API, and unwrap back to the original type.
- It doesn't copy your data — it dispatches to the backend the user already brought.
- It's for library and tool authors, not for one-off analysis. Real projects like Plotly, Altair, and Marimo use it in production.
- You can always escape to the native dataframe for backend-specific tricks.
If this is the kind of design decision you're chewing on — "should my tool accept pandas, Polars, or both, and what does that cost me?" — it's exactly the sort of thing I love working through with students one-on-one, where we can look at your actual codebase instead of a toy example. There's usually a cleaner answer hiding in there than the one you started with.
Written by Ali Jabbary
M.Sc., P.Eng. • Expert Data Scientist & ML Engineer with 10+ years of experience. 500+ students helped worldwide. Specializing in Python, AI/ML, and turning complex problems into simple solutions.


