Description
Title of the talk
Optimizing Large-Scale Data Processing: A Deep Dive into FireDucks vs. Pandas
Description
Abstract:
As data scientists, we rely on Pandas for data preprocessing, but it struggles with performance on large datasets. To overcome this, I explored several high-performance alternatives: DuckDB, Polars, and cuDF. These libraries are fast, but they come with a learning curve, requiring new syntax and concepts. Then I discovered FireDucks, a library designed to be compatible with the Pandas API, so there are no new functions to learn, just a simple import change. In my comparisons, FireDucks delivered impressive speed improvements over Pandas and even outperformed many of the other alternatives in large-scale data processing. In this session, I'll share my experience comparing these tools and demonstrate why FireDucks is a game-changer for handling big data with minimal effort.
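The "simple import change" could look roughly like the sketch below. It assumes FireDucks is installed and exposes its Pandas-compatible module as `fireducks.pandas`. The fallback to plain Pandas and the sample `sales` data are illustrative additions, not part of the talk material.

```python
# Assumption: FireDucks is installed and provides `fireducks.pandas`.
# The fallback keeps the snippet runnable where only pandas is available.
try:
    import fireducks.pandas as pd
except ImportError:
    import pandas as pd

# Hypothetical sample data, used only for illustration.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south"],
    "amount": [120.0, 80.0, 200.0, 50.0, 70.0],
})

# The pipeline itself is ordinary Pandas code; only the import differs.
totals = (
    sales[sales["amount"] > 60]      # keep rows with amount above 60
    .groupby("region")["amount"]     # group the remaining rows by region
    .sum()                           # total amount per region
    .sort_values(ascending=False)    # largest total first
)
print(totals)
```

Everything after the import is written against the Pandas API, which is the point of the import swap: existing scripts should need no other edits, assuming the operations they use are supported.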
Table of contents
Key Takeaways:
✅ Understanding Pandas' limitations with large datasets
✅ Exploring alternatives like DuckDB, Polars, and cuDF
✅ Why FireDucks stands out: seamless integration & high speed
✅ Best practices for using FireDucks efficiently in your workflow
This session is ideal for data professionals, analysts, and engineers looking to enhance their workflow efficiency with large datasets.
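A comparison like the one described above could be set up with a small timing helper. This is only a sketch under stated assumptions, not the benchmark used in the talk: `time_groupby`, the synthetic data, and the row count are hypothetical, and `fireducks.pandas` is assumed to be importable when FireDucks is installed. The result is converted with `to_dict()` inside the timed region so the computation is materialized before the clock stops, which matters if a backend defers execution.

```python
import time

import numpy as np
import pandas


def time_groupby(pd_module, n_rows=100_000, seed=0):
    """Time one group-by mean using the given pandas-compatible module.

    Hypothetical helper for illustration; returns (seconds, group count).
    """
    rng = np.random.default_rng(seed)
    df = pd_module.DataFrame({
        "key": rng.integers(0, 100, size=n_rows),   # 100 possible groups
        "value": rng.random(n_rows),
    })
    start = time.perf_counter()
    result = df.groupby("key")["value"].mean()
    materialized = result.to_dict()  # force the result to be computed
    elapsed = time.perf_counter() - start
    return elapsed, len(materialized)


# Always time pandas; add FireDucks only if it is installed (assumption).
backends = {"pandas": pandas}
try:
    import fireducks.pandas as fpd
    backends["fireducks"] = fpd
except ImportError:
    pass

for name, module in backends.items():
    elapsed, n_groups = time_groupby(module)
    print(f"{name}: {elapsed:.4f}s for {n_groups} groups")
```

Timings from a single run on small synthetic data are noisy, so a real comparison would repeat each measurement and use data sizes representative of the actual workload.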
Duration (including Q&A)
15 minutes
Prerequisites
No response
Speaker bio
The talk/workshop speaker agrees to
- Share the slides, code snippets and other material used during the talk
- If the talk is recorded, you grant the permission to release the video on PythonPune's YouTube channel under CC-BY-4.0 license
- Not do any hiring pitches during the talk and follow the Code of Conduct