<!-- PROJECT_TITLE -->

# Laygo - simple pipelines, serious scale

<!-- PROJECT_TAGLINE -->

**A lightweight Python library for building resilient data pipelines with a fluent API, designed to scale from a single script to hundreds of cores and thousands of distributed serverless functions.**

<!-- BADGES_SECTION -->

## 🎯 Overview

**Laygo** is a lightweight Python library for building resilient, in-memory data pipelines. Designed from the ground up to make data engineering simpler and more intuitive, it provides an elegant, fluent API for layering transformations, managing context, and handling errors.

It's built to grow with you: scale seamlessly from a single local script to thousands of concurrent serverless functions with minimal operational overhead.

**Key Features:**

- **Fluent & Readable**: Craft complex data transformations with a clean, chainable method syntax that's easy to write and maintain.

- **Performance Optimized**: Chunked processing and list comprehensions keep per-item overhead low.

- **Memory Efficient**: Lazy evaluation and streaming iterators let you handle datasets far larger than available memory.

- **Effortless Parallelism**: Parallelize chunk processing across a built-in thread pool, with no executor boilerplate.

- **Distributed by Design**: Your pipeline script is both the manager and the worker. Deployed as a serverless function or a container, the same code scales out by simply running more instances: your logic runs the same way on a thousand cores as it does on one.

- **Powerful Context Management**: Share state and configuration across the entire pipeline for advanced, stateful processing.

- **Resilient Error Handling**: Isolate and manage errors at the chunk level, so a single bad record doesn't fail your entire job.

- **Modern & Type-Safe**: Full support for modern Python with generic type hints for robust, maintainable code.
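
Laygo's actual API is not shown in this section, but the style the features describe can be sketched in plain Python. The sketch below is illustrative only: `MiniPipeline`, `transform`, `filter`, and `run` are hypothetical names standing in for a fluent, chunked pipeline with chunk-level error isolation, not Laygo's real interface.

```python
from collections.abc import Callable, Iterable, Iterator
from itertools import islice


def _chunks(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of up to `size` items (chunked processing)."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


class MiniPipeline:
    """Toy stand-in for a fluent, chunk-based pipeline (not Laygo's API)."""

    def __init__(self, source: Iterable, chunk_size: int = 2):
        self._source = source
        self._chunk_size = chunk_size
        self._steps: list[Callable[[list], list]] = []

    def transform(self, fn):
        self._steps.append(lambda chunk: [fn(x) for x in chunk])
        return self  # returning self is what makes the API chainable

    def filter(self, pred):
        self._steps.append(lambda chunk: [x for x in chunk if pred(x)])
        return self

    def run(self, on_error=None) -> list:
        out = []
        for chunk in _chunks(self._source, self._chunk_size):
            try:
                for step in self._steps:
                    chunk = step(chunk)
                out.extend(chunk)
            except Exception as exc:
                # Chunk-level isolation: one bad chunk doesn't fail the job.
                if on_error:
                    on_error(exc, chunk)
        return out


result = (
    MiniPipeline([1, 2, 3, 4, 5])
    .transform(lambda x: x * 10)
    .filter(lambda x: x > 10)
    .run()
)
print(result)  # [20, 30, 40, 50]
```

Because each step returns the pipeline object, transformations chain naturally, and because data moves through in chunks, an exception only discards the chunk that raised it.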

---
