Getting Started with DuckDB

A practical guide for accelerating your data science, data analytics, and data engineering workflows

Product type: Paperback
Published in: Jun 2024
Publisher: Packt
ISBN-13: 9781803241005
Length: 382 pages
Edition: 1st Edition
Authors (2):
Ned Letcher
Simon Aubury

Table of Contents

Preface
Chapter 1: An Introduction to DuckDB
Chapter 2: Loading Data into DuckDB
Chapter 3: Data Manipulation with DuckDB
Chapter 4: DuckDB Operations and Performance
Chapter 5: DuckDB Extensions
Chapter 6: Semi-Structured Data Manipulation
Chapter 7: Setting up the DuckDB Python Client
Chapter 8: Exploring DuckDB’s Python API
Chapter 9: Exploring DuckDB’s R API
Chapter 10: Using DuckDB Effectively
Chapter 11: Hands-On Exploratory Data Analysis with DuckDB
Chapter 12: DuckDB – The Wider Pond
Index
Other Books You May Enjoy

Preface

There is no shortage of data being produced by humanity, in myriad formats, shapes, and ever-growing quantities. As it grows, so do the opportunities for leveraging data to benefit our world: improving decision-making for governments, companies, and public organizations; supporting scientific research and technological advancements; and enabling the development of consumer products and important public services. To realize these opportunities, we are faced with an imperative: if we want to perform effective data analysis and develop products and services infused with machine learning, we must be able to manage, understand, and effectively work with the data that makes it possible.

Whether you are a data analyst, data scientist, research scientist, data engineer, software engineer, or data hobbyist, you are likely to face many of the same challenges when it comes to working with data. Analytical data workflows and applications require that data be loaded, cleaned, transformed, organized, exported, and crunched into summarized forms. A running joke amongst data practitioners is that they spend more time preparing and wrangling their data, as well as fighting with the tools that support their work, than they do on the value-producing activities that are likely to be in their job descriptions. As data grows in volume and variety, these challenges become both more difficult and more pressing to solve.

DuckDB is an analytical database that handles many of these challenges with ease. It enables data practitioners to streamline and improve the effectiveness of activities across the entire life cycle of data analysis and the development of analytical data infrastructure. It is simple to install and use on virtually any machine and runs entirely in-process, without the overhead of connecting to and maintaining a dedicated server. At the same time, it offers blazing-fast performance for analytical operations, as well as powerful data management capabilities: features that are normally associated with distributed data processing engines and dedicated SQL database management systems. DuckDB’s rich feature set makes it an incredibly versatile tool, well suited to a range of use cases, such as performing interactive data analysis and ad hoc data wrangling, efficiently querying data lakes, developing lean pipelines for transforming data, functioning as an operational data warehouse, and serving as a low-latency query engine for powering responsive data apps. This versatility can also be a bit overwhelming at first, as it’s hard to compare DuckDB with any one existing tool that you might be familiar with.

In this book, we’ll dive into many of DuckDB’s powerful and flexible capabilities. We’ll give you a clear framework for how to think about what kind of data tool DuckDB is and the types of applications it excels at. Through a range of hands-on examples, you’ll learn how to make the most of this exciting tool and discover the many ways that you can incorporate it into your own analytical workflows and projects.
