A new simple yet powerful library called Pypeline was released last week for creating concurrent data pipelines in Python. Pypeline is designed for solving simple to medium-complexity data tasks that require concurrency and parallelism, in places where using frameworks such as Spark or Dask feels unnatural.
Pypeline offers a familiar, functional, and easy-to-use API. It lets you build data pipelines using Processes, Threads, and asyncio.Tasks via exactly the same API. With Pypeline, you also have control over the memory and CPU resources used at each stage of your pipeline.
Using Pypeline, you can easily create multi-stage data pipelines with the help of functions such as map, flat_map, and filter. To do so, you define a computational graph specifying the operations to be performed at each stage, the number of resources, and the type of workers you want to use. Pypeline comes with three main modules, each of which uses a different type of worker: processes, threads, and tasks, as shown in the sketch below.
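As a minimal sketch, the three modules can be imported side by side; the import name pypeln is an assumption based on the project's package name, while the module names come from the announcement:

```python
# Sketch only: the "pypeln" import path is an assumption;
# the three module names are those described in the article.
from pypeln import process as pr        # multiprocessing.Process workers
from pypeln import thread as th         # threading.Thread workers
from pypeln import asyncio_task as io   # asyncio.Task workers
```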
You can create a pipeline based on multiprocessing.Process workers with the help of the process module. You can then specify the number of workers at each stage, and the maxsize parameter limits the maximum number of elements that a stage can hold at any given time.
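The sketch below combines the process module with map and filter to build a two-stage pipeline. It assumes the pypeln import path from the previous sketch; the slow_add1 and slow_gt3 helpers are illustrative stand-ins for real work:

```python
import time
from random import random

from pypeln import process as pr  # assumed import path for the process module

def slow_add1(x):
    time.sleep(random())  # simulate a slow computation
    return x + 1

def slow_gt3(x):
    time.sleep(random())
    return x > 3

data = range(10)

# Each stage declares its own worker count; maxsize bounds how many
# elements the stage may hold at once, which caps memory usage.
stage = pr.map(slow_add1, data, workers=3, maxsize=4)
stage = pr.filter(slow_gt3, stage, workers=2)

result = list(stage)  # iterating the final stage runs the pipeline
print(result)         # e.g. [5, 4, 6, 8, 7, 10, 9] (order is non-deterministic)
```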
Similarly, you can create a pipeline based on threading.Thread workers by using the thread module, and a pipeline based on asyncio.Task workers by using the asyncio_task module.
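Since the API is shared across the three modules, switching worker types is mostly a matter of changing the import. The sketch below uses the thread module with the same assumed pypeln import path; the fetch_length helper is an illustrative stand-in for I/O-bound work, and the commented lines show how the asyncio_task module would slot into the same call:

```python
from pypeln import thread as th          # threading.Thread workers (assumed import path)
# from pypeln import asyncio_task as io  # asyncio.Task workers share the same API

def fetch_length(x):
    # Illustrative stand-in for I/O-bound work such as a network call.
    return len(str(x))

data = range(10)

# Thread-based pipeline: well suited to I/O-bound functions.
stage = th.map(fetch_length, data, workers=4, maxsize=8)
print(list(stage))

# The asyncio.Task version would keep the exact same shape:
# stage = io.map(fetch_length, data, workers=4, maxsize=8)
```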
Apart from composing multi-stage data pipelines through nested function calls, Pypeline also lets you build pipelines with the help of the pipe | operator.
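Here is the earlier two-stage process pipeline rewritten with the pipe operator; as before, the pypeln import path is assumed and the helpers are illustrative:

```python
import time
from random import random

from pypeln import process as pr  # assumed import path, as in the earlier sketches

def slow_add1(x):
    time.sleep(random())  # illustrative slow work
    return x + 1

def slow_gt3(x):
    time.sleep(random())
    return x > 3

# The same two-stage pipeline expressed with the pipe (|) operator;
# piping the final stage into list runs the pipeline and collects the results.
result = (
    range(10)
    | pr.map(slow_add1, workers=3, maxsize=4)
    | pr.filter(slow_gt3, workers=2)
    | list
)
print(result)  # e.g. [4, 6, 5, 8, 7, 9, 10] (order is non-deterministic)
```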
For more information, check out the official documentation.
How to build a real-time data pipeline for web developers – Part 1 [Tutorial]
How to build a real-time data pipeline for web developers – Part 2 [Tutorial]
Create machine learning pipelines using unsupervised AutoML [Tutorial]