
Do you need Spark to create a data stack? #187

@souravsingh

Title of the talk

Do you need Spark to create a data stack?

Description

Spark has long been considered a mature and reliable data processing framework by data engineers across the world. But as the data engineering landscape has evolved, new tools and frameworks have become available.

This talk will focus on using MinIO as an object store, DuckDB for data warehousing, and dbt for defining transformations. We will also look at Polars for data processing.

The purpose of this talk is not to declare Spark obsolete as a data processing framework, but rather to suggest alternatives that can be useful and better suited to specific situations.
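To make the comparison concrete, the sketch below wires these pieces together in a few lines of Python. It assumes a local MinIO server on localhost:9000 with the default minioadmin credentials and a bucket named "lake" holding Parquet files; the bucket, table, and column names are placeholders rather than part of any real dataset.

```python
# A minimal sketch of the stack above, assuming a local MinIO server on
# localhost:9000 with the default minioadmin credentials and a bucket named
# "lake" that holds Parquet files. All names here are illustrative.
import duckdb
import polars as pl

con = duckdb.connect("warehouse.duckdb")

# DuckDB can read from any S3-compatible object store through its httpfs extension.
con.execute("INSTALL httpfs")
con.execute("LOAD httpfs")
con.execute("SET s3_endpoint = 'localhost:9000'")
con.execute("SET s3_access_key_id = 'minioadmin'")
con.execute("SET s3_secret_access_key = 'minioadmin'")
con.execute("SET s3_use_ssl = false")
con.execute("SET s3_url_style = 'path'")

# Materialise a warehouse table directly from the objects stored in MinIO.
con.execute("""
    CREATE OR REPLACE TABLE events AS
    SELECT * FROM read_parquet('s3://lake/events/*.parquet')
""")

# Hand the table to Polars for further in-memory processing.
events = con.execute("SELECT * FROM events").pl()
daily_counts = events.group_by("event_date").agg(pl.len().alias("event_count"))
print(daily_counts)
```

In the full stack, dbt (via the dbt-duckdb adapter) would own the SQL transforms inside the DuckDB file instead of the ad-hoc CREATE TABLE above, while Polars handles the in-memory steps that do not belong in the warehouse.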

Table of contents

  1. Introduction
  2. Background on Spark
  3. The current data engineering landscape
  4. MinIO as a local object store
  5. DuckDB as a data warehouse
  6. Using dbt to define transforms

Duration (including Q&A)

30-35 mins

Prerequisites

No response

Speaker bio

My LinkedIn profile: https://www.linkedin.com/in/sourav-singh-8124b6267

The talk/workshop speaker agrees to
