# Python OOM dataframes

# Prehistory

Oops, that’s me, prehistoric *coughs*

So I recently read about a benchmark of FireDucks, a *pandas-like, lazy, blazingly fast dataframe library* (yeah, all the buzzwords are there).

Initial reviews are pretty good, claiming it’s [really indecently fast](https://www.linkedin.com/pulse/first-impressions-fireducks-matt-harrison-mthvc/):

> This ran in 1 second (with `._evaluate` added) on the 55 million row dataset. The classic pandas version took 8 minutes!
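That `._evaluate` remark matters for any timing of a lazy library: building a query only records a plan, and the real work happens when evaluation is forced. Here is a toy sketch of the idea in plain Python. `LazyColumn` and `timed` are made-up names for illustration, not FireDucks internals.

```python
import time


class LazyColumn:
    """Toy stand-in for a lazy dataframe column (hypothetical, not FireDucks)."""

    def __init__(self, values, ops=()):
        self._values = values
        self._ops = tuple(ops)

    def map(self, fn):
        # Nothing is computed here: the operation is only appended to the plan.
        return LazyColumn(self._values, self._ops + (fn,))

    def evaluate(self):
        # Only now does the recorded plan actually run over the data.
        out = list(self._values)
        for fn in self._ops:
            out = [fn(v) for v in out]
        return out


def timed(fn):
    """Run fn() once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


data = LazyColumn(range(200_000))
plan, build_seconds = timed(lambda: data.map(lambda v: v * 2).map(lambda v: v + 1))
result, run_seconds = timed(plan.evaluate)
print(f"building the plan: {build_seconds:.6f}s, evaluating it: {run_seconds:.6f}s")
```

Timing only the first step would measure plan construction, not the computation, which is presumably why the forced evaluation was added in the quoted run.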

However, this [GIF sparked](https://www.linkedin.com/posts/ultanorourke_fireducks-makes-pandas-125x-faster-changing-activity-7279481404327956481-mABC?utm_source=share&utm_medium=member_desktop) a lot of [controversy](https://www.linkedin.com/feed/update/urn:li:activity:7274024597606305792?utm_source=share&utm_medium=member_desktop), despite Avi Chawla doing a [comprehensive overview](https://www.dailydoseofds.com/p/pandas-vs-fireducks-performance-comparison/).

![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1a8dfe-c3ac-4fce-bb06-8d93c16f366d_3003x2673.jpeg align="left")

The folks at FireDucks also seem to have made a [comparison with DuckDB and Polars](https://www.dailydoseofds.com/p/fireducks-vs-pandas-vs-duckdb-vs-polars) (see also their [benchmarks page](https://fireducks-dev.github.io/docs/benchmarks/)), plus a [dashboard](https://fireducks-dev.github.io/db-benchmark/) of [db-benchmark](https://github.com/duckdblabs/db-benchmark) results:

![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7ce9bbf-5415-4efe-97ae-aaf8350aa1c1_2443x1018.png align="left")

# Dundundun DAFTMAN

I also recently read about another dataframe library, [Daft](https://www.getdaft.io/), and decided to add it to the comparison using the [provided Colab](https://www.linkedin.com/feed/update/urn:li:activity:7274024597606305792?utm_source=share&utm_medium=member_desktop), given that it's rather simple to do for a single benchmark.

Behold, results from an [infinitely debugged Colab](https://colab.research.google.com/drive/1IOPFLJmTjt5wcQU1ow-ZbFyvpUOzQxzq?usp=sharing):

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1737824820600/49a20040-c55e-4fcf-848e-405e6405b3eb.png align="center")

Or, as mean times for the DULocation and PULocation runs, respectively:

| **Library** | **Mean, s** |
| --- | --- |
| <mark>FireDucks</mark> | **<mark>4.099</mark>** |
| Daft | 5.424 |
| Polars | 6.155 |

| **Library** | **Mean, s** |
| --- | --- |
| <mark>FireDucks</mark> | **<mark>4.26</mark>** |
| Daft | 5.268 |
| Polars | 6.338 |
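To put those means in relative terms, here is a small sketch that divides each library's mean by the fastest one for the same run. The numbers are copied from the two tables above; the helper name `slowdown_vs_fastest` is just something I made up for this post.

```python
# Mean times in seconds, copied from the two tables above.
mean_seconds = {
    "DULocation": {"FireDucks": 4.099, "Daft": 5.424, "Polars": 6.155},
    "PULocation": {"FireDucks": 4.26, "Daft": 5.268, "Polars": 6.338},
}


def slowdown_vs_fastest(times):
    """Return {library: mean / fastest mean}, ordered from fastest to slowest."""
    fastest = min(times.values())
    ordered = sorted(times.items(), key=lambda item: item[1])
    return {library: seconds / fastest for library, seconds in ordered}


ratios = {run: slowdown_vs_fastest(times) for run, times in mean_seconds.items()}

for run, by_library in ratios.items():
    for library, ratio in by_library.items():
        print(f"{run:<11} {library:<10} {ratio:.2f}x")
```

On these numbers Daft lands at roughly 1.2x to 1.3x the FireDucks time and Polars at about 1.5x, so the gap is real but nowhere near the orders of magnitude seen against classic pandas.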

So:

* Yeah, FireDucks seems to be the fastest, but it's also not [open-source](https://github.com/fireducks-dev/fireducks/issues/22):
    

> By providing the beta version of FireDucks free of charge and enabling data scientists to actually use it, NEC will work to improve its functionality while verifying its effectiveness, with the aim of commercializing it within FY2024.  
> [https://www.nec.com/en/press/202310/global\_20231019\_01.html](https://www.nec.com/en/press/202310/global_20231019_01.html)

* I've been intrigued by Daft's promise and its focus on providing almost all possible [interfaces](https://www.getdaft.io/projects/docs/en/stable/user_guide/integrations.html), data types and engines
    

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Daft also exposes a <a target="_self" rel="noopener noreferrer nofollow" class="reference internal" href="https://www.getdaft.io/projects/docs/en/stable/user_guide/sql.html" style="pointer-events: none">SQL</a> interface which interoperates closely with the DataFrame interface, allowing you to express data transformations and queries on your tables as SQL strings.</div>
</div>
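As a rough idea of what that interop could look like, here is a minimal sketch based on my reading of the Daft docs. It assumes Daft is installed and that `daft.sql` can resolve the local DataFrame variable `trips` by name; the column names, values, and the `expensive_trips` helper are all made up for illustration and are not from the benchmark.

```python
# Made-up sample data, not the taxi dataset used in the benchmark.
sample_rows = {
    "pickup_zone": [1, 2, 3, 4],
    "fare": [5.0, 12.5, 30.0, 80.0],
}


def expensive_trips(rows, min_fare=10.0, max_fare=50.0):
    """Filter once with SQL, then keep filtering with the DataFrame API."""
    import daft  # imported here so this file still loads without Daft installed

    trips = daft.from_pydict(rows)
    # SQL side: the query string refers to the `trips` variable by name
    # (assumption: daft.sql picks up DataFrames from the calling scope).
    pricey = daft.sql(f"SELECT pickup_zone, fare FROM trips WHERE fare > {min_fare}")
    # DataFrame side: the SQL result is a regular Daft DataFrame, so
    # expression-based methods can be chained onto it.
    capped = pricey.where(daft.col("fare") < max_fare)
    return capped.to_pydict()
```

Calling `expensive_trips(sample_rows)` should leave only the trips with fares between the two bounds, with the SQL filter and the DataFrame filter applied to the same lazy query.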

So I'll try doing things with Daft in the near future!

---

Welcome to **Teleogenic**❣️

Other places I cross-post (not always) to:

* [**Hashnode**](https://posts.teleogenic.com)
    
* [**Medium**](https://baldr.medium.com/)
    
* [**Telegram**](https://t.me/ohmyboi)
    
* [**Twitter**](https://twitter.com/ZakharKogan)
    
* [**LinkedIn**](https://www.linkedin.com/in/zakhar-kogan/)
