Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

How-To Tutorials - Data Visualization

2 Articles
article-image-creating-and-using-kibana-dashboards
Huage Chen
02 Jul 2024
12 min read
Save for later

Creating and Using Kibana Dashboards

Huage Chen
02 Jul 2024
12 min read
This article is an excerpt from the book, Elastic Stack 8.x Cookbook, by Huage Chen and Yazid Akadiri. Unlock the full potential of Elastic Stack for search, analytics, security, and observability and manage substantial data workloads in both on-premise and cloud environmentsIntroductionIn this guide, we will integrate all previously created visualizations into a comprehensive dashboard consisting of multiple panels. Additionally, we will explore how to enhance user interaction using control-based drilldowns.Getting readyMake sure to complete the following recipes from this chapter:Creating visualizations with Kibana LensCreating visualizations from runtime fieldsCreating Kibana mapsAt the end of this recipe, you will have dashboards composed of the various visualizations and elements built into the aforementioned recipes.How to do it...Building dashboards is very straightforward in Kibana, especially if you’ve already created some visualizations. Follow these steps:1. Go to Kibana | Analytics | Dashboard and click on Create dashboard.This will bring you to a blank canvas, where you can start adding some visualizations.2. We will start by adding a nice image! You can be creative, but we provided a sample picture:A. Click on Add panel | Image. B. Select the Use link tab and set Link to image with the following URL: https://upload. wikimedia.org/wikipedia/commons/6/60/Ville_de_RENNES_Noir. svg. Then, click on Save:Figure 6.54 – Adding an image for a logoThe logo will be added to the panel. Including a picture is a great way to add some personalization and branding to your dashboards. Let’s add some proper visualizations from the ones we’ve built in the last three recipes.3. Click on Add from library and select the [Rennes Traffic] Number of locations visualization. Make sure to align it to the right with the image panel.4. Let’s add another visualization; this time, we’ll pick [Rennes Traffic] Average speed gauge.At this stage, your dashboard should look like the one shown in Figure 6.55:Figure 6.55 – Rennes traffic dashboard – first stepYou can easily rearrange the position of the different panels by clicking on the title section and moving the panel with your mouse anywhere you want on the canvas. To adjust the size and fit of the panel, position your mouse on the small arrow at the bottom right of the panel. Let’s keep adding more panels to our dashboard.5. Click on Add from library and add the following visualizations in the respective order:I. [Rennes Traffic] Traffic status waffleII. [Rennes Traffic] Speed by road hierarchyIII. [Rennes Traffic] Average speed & Traffic StatusIV. [Rennes Traffic] Traffic status by hour6. Finally, let’s add a Map visualization for a real-time view of the traffic; select the one named [Rennes Traffic] Traffic fluidity.By now, your dashboard should look like the one shown in Figure 6.56:Figure 6.56 – Rennes traffic dashboard – more visualizationsYou can start playing around with the dashboard to see the built-in interactivity of the panels. For example, clicking on a specific road hierarchy will automatically apply the filter to the entire dashboard.You can also have dedicated panels to filter and display only the data you are interested in with Controls. Let’s add some to our dashboard.7. On the dashboard toolbar, click on Controls:Figure 6.57 – Adding controls to the dashboard8. From the drop-down list, select Add control; the Create control flyout will appear on the right of the screen.9. Select the traffic_status field and click on Save and close.10. Back to the dashboard, you now have a new panel on top of the visualization named traffic_status. By clicking on it, you will see a drop-down list where you can select the values associated with the status of the traffic you want to filter, as shown in Figure 6.58. Select congested as an example:Figure 6.58 – Using controls in the dashboard11. You can see on your dashboard that all the panels have been updated according to the value selected in the traffic_status control.Imagine you want to filter your traffic data to analyze it within a specific time range, such as early in the morning or late in the afternoon, to better understand traffic patterns. This is where the time slider control proves to be incredibly useful.12. Go to the Controls menu again in the dashboard toolbar and select Add time slider control.You’ll see a new panel to the right of traffic_status:Figure 6.59 – Time slider controlBy clicking the play icon, you will see your dashboard animate and your data change over the defined time range. You can advance the time range forward as well as backward, which is especially useful when working with time series data.Your dashboard should now look as shown in Figure 6.60, with our two controls:                                                                                     Figure 6.60 – Rennes traffic dashboard with controls13. Save the dashboard by clicking the Save button in the upper-right corner. Name it [Rennes Traffic] Overview.To enhance our dashboard further, consider this: users frequently manage multiple dashboards, and the ability to navigate seamlessly from one to another is crucial, especially when aiming to refine analysis or focus on more detailed panels related to a specific dataset. Dashboard drilldowns are invaluable in this scenario as they allow you to transition between dashboards while maintaining the overall context. Let’s explore how to implement and use this feature effectively!For this exercise, we have already built a drilldown dashboard. Download and save the NDJSON file of the exported dashboard from the following location: https://github.com/PacktPublishing/ Elastic-Stack-8.x-Cookbook/blob/main/Chapter6/kibana-objects/rennesdata-drilldown-dashboard.ndjson. Then, follow these steps:1. To import the dashboard, go to Stack Management | Saved Objects.2. Click on Import and select the NDJSON file you have previously downloaded from the GitHub repository. Upon completing the import process, you will notice a warning in the flyout about data view conflicts. The reason is straightforward: our saved objects rely on an existing data view. To resolve the conflict, simply click on the drop-down list under the New data view column and select metrics-rennes_traffic-raw, as shown in Figure 6.61, then click on Confirm all changes to finalize the import procedure:Figure 6.61 – Importing saved objects and selecting the right data view3. Once all the objects have been imported, you will get a recap as shown in the following screenshot:                                                                                      Figure 6.62 – Saved objects successfully imported from the fileReturn to the [Rennes Traffic] Overview dashboard. Then, open the menu for the [Rennes Traffic] Speed by road hierarchie panel and select Create drilldown:Figure 6.63 – Creating drilldown from the panel4. Navigate to the drilldowns page and select the Go to Dashboard option. Here, you will need to name your drilldown—consider View Details for Road Hierarchy as a suggestion. Then, from the Choose destination dashboard drop-down menu, select [Rennes Traffic] Detailed traffic drilldown dashboard, which you have recently imported. This process sets up a targeted navigation path within your dashboard environment, allowing for a seamless transition between your overview and detailed analysis dashboards:Figure 6.64 – Configuring dashboard drilldown5. Click on Create drilldown. Save the dashboard to test our drilldown, click on one of the five charts in the [Rennes Traffic] Speed by road hierarchie panel. You will be redirected to the detailed dashboard filtered on the value you have selected.Figure 6.65 – Dashboard view after drilldownEt voilà! You have just built your first dashboard with a nice touch of interactivity thanks to controls and drilldowns.How it works...In Kibana, a dashboard is a collection of visualizations and saved searches that you can arrange and customize to display the data that is most important to you. You can create multiple dashboards for different use cases, and each dashboard can have its own set of visualizations and searches.Dashboards are a powerful tool for data analysis because they allow you to see multiple visualizations side by side and quickly identify patterns and trends in your data. You can also use dashboards to monitor key metrics in real time, which is especially useful for operational use cases. Kibana provides a wide range of visualization types that you can use to create custom dashboards, including bar charts, line charts, pie charts, tables, and more.The following table outlines a framework for choosing the right visualization:Use caseRecommended type of visualizationComparison and correlationMany items: Horizontal barFew items: Vertical barComparison over timeFew periods and categories: Stacked barFew time periods but many categories: Line graphDistribution of valuesFew numbers of points: Vertical bar histogramMany points: Line histogramComposition of a wholeSimple compositions with few items: Waffle or TreemapMultiple grouping dimensions for a few bottomlevel items: MosaicMultiple grouping dimensions for many bottomlevel items: TreemapEye-catching summaryOne value: MetricMany values: Table with color stylingVisualizing goals or targetsVertical bar or Line with reference linesMetricTable 6.2 – Choosing the right visualizationIn addition to visualizations, Kibana dashboards also support saved searches, which allow you to quickly filter your data based on specific criteria. You can save searches that you use frequently and add them to your dashboard for easy access.Overall, Kibana dashboards are a powerful tool for data analysis and monitoring. They allow you to quickly identify patterns and trends in your data, monitor key metrics in real time, and customize your view of the data to suit your needs.There’s more...In our recipe, we have used dashboard drilldowns, but you can also create URL and Discover drilldowns. With the former, you can link to data outside of Kibana, and with the latter, you can open Discover from a Lens panel while keeping all the contextual information.Dashboards are great when used in Kibana, but you can also share them with teams and colleagues outside of Kibana. You have many options that are easily accessible from the Share menu in the toolbar when it comes to sharing dashboards: you can interactively embed dashboards as an iFrame, export them as reports in various formats (PNG, CSV, PDF, etc.), and share them as direct links for easy access.When building dashboards, design thinking is a good practice. Start by asking yourself the following questions:What is the outcome or the goal of the dashboard? Is it about understanding high-level behaviors, visually correlating specific metrics at the same time, or finding the root cause of an issue?Who is using this dashboard to do their job? If you are building it for a team or someone else, step into their shoes to visualize their perspective when they will need that data.See alsoLooking for more design tips to elevate your dashboards? Look no further and check out this blog: https://www.elastic.co/blog/designing-intuitive-kibanadashboards-as-a-non-designerIf you’re interested in delving deeper into the topics of creating dashboards more efficiently, be sure to check out this technical blog: https://www.elastic.co/blog/buildingkibana-dashboards-more-efficientlyFor developers interested in debugging their Kibana dashboard, the following article will be very useful: https://www.elastic.co/blog/debugging-kibana-dashboardsConclusionIn this guide, we've explored the process of integrating various visualizations into a comprehensive Kibana dashboard, enhancing user interaction through control-based drilldowns. By following the steps outlined, you should now have a functional and interactive dashboard that can provide valuable insights into your data.We began by preparing the necessary visualizations and then moved on to assembling the dashboard by adding images for personalization and aligning various traffic visualizations. We also incorporated control panels for dynamic filtering, allowing for more precise data analysis. The final touch was adding drilldowns to enable seamless navigation between detailed and overview dashboards.Kibana dashboards offer powerful tools for data analysis and real-time monitoring. By displaying multiple visualizations side by side, you can quickly identify patterns and trends, making dashboards invaluable for operational and analytical use cases.Remember, the key to a successful dashboard is thoughtful design—consider the goals, the audience, and the specific data insights needed. Utilize the wide range of visualization types that Kibana offers and don't hesitate to leverage the sharing options to collaborate with your team effectively.For further reading and advanced tips on designing intuitive dashboards, building them efficiently, or debugging, check out the additional resources provided. Happy dashboarding!Author BioHuage Chen is a member of Elastic's customer engineering team and has been with Elastic for over five years, helping users throughout Europe to innovate and implement cloud-based solutions for search, data analysis, observability, and security. Before joining Elastic, he worked for 10 years in web content management, web portals, and digital experience platforms.Yazid Akadiri has been a solutions architect at Elastic for over four years, helping organizations and users solve their data and most critical business issues by harnessing the power of the Elastic Stack. At Elastic, he works with a broad range of customers, with a particular focus on Elastic observability and security solutions. He previously worked in web services-oriented architecture, focusing on API management and helping organizations build modern applications.
Read more
  • 0
  • 0
  • 686

article-image-hands-on-exploratory-data-analysis-with-duckdb
Ned Letcher
28 Jun 2024
7 min read
Save for later

Hands-On Exploratory Data Analysis with DuckDB

Ned Letcher
28 Jun 2024
7 min read
This article is an excerpt from the book, Getting Started with DuckDB, by Simon Aubury and Ned Letcher. Discover how Snowflake's unique objects and features can be used to leverage universal modeling techniques through real-world examples and SQL recipes.Introduction DuckDB is a versatile and highly optimized database management system designed for efficient data analysis workflows. Its capabilities allow practitioners to scale their data analysis efforts beyond traditional tools, making it an excellent choice for local machine data processing. In this excerpt, we will explore how to use DuckDB for hands-on exploratory data analysis, leveraging Python, Jupyter Notebooks, and Plotly for interactive data visualizations.Technical RequirementsTo follow along with the examples in this guide, you will need the following setup:Python environmentJupyter NotebookDuckDB installedJupySQL libraryPlotly libraryYou can find the necessary code examples in the chapter_11 folder in the book’s GitHub repository at [PacktPublishing](https://github.com/PacktPublishing/Getting-Started-with-DuckDB/tree/main/chapter_11).Obtaining the Dataset We will be using a pedestrian counting system dataset from the city of Melbourne, containing hourly pedestrian counts from sensors located in and around the Melbourne Central Business District (CBD). This dataset provides a comprehensive view of pedestrian traffic patterns over several years.To download the dataset, visit the dataset’s home page [Melbourne Pedestrian Counting System](https://data.melbourne.vic.gov.au/explore/dataset/pedestrian-counting-system-monthly-counts-per-hour) and locate the ZIP file containing the 2009 to 2022 archive.Setting Up the Environment Before diving into the code, ensure your Python environment is set up with the necessary dependencies. You will need to: 1. Set up a Python virtual environment:python -m venv duckdb_env source duckdb_env/bin/activate 2. Install the required libraries:   pip install jupyter duckdb plotly jupysql pandas  3. Start Jupyter Notebook: jupyter notebook Loading and Cleaning DataFirst, we will load our dataset from a CSV file and perform some data cleaning steps before writing it to a DuckDB database.Loading CSV Data into DuckDBimport duckdb import pandas as pd # Load the dataset into a pandas DataFrame data_url = "path_to_downloaded_zip_file/2022/2022.csv" pedestrian_counts = pd.read_csv(data_url) # Display the first few rows of the dataframe print(pedestrian_counts.head()) # Create a DuckDB connection and write the DataFrame to a DuckDB table con = duckdb.connect(database=':memory:') con.execute("CREATE TABLE pedestrian_counts AS SELECT * FROM pedestrian_counts") ```Data Cleaning StepsPerform necessary data cleaning operations such as handling missing values, correcting data types, and filtering irrelevant records.# Convert the 'Date_Time' column to datetime format pedestrian_counts['Date_Time'] = pd.to_datetime(pedestrian_counts['Date_Time']) # Handle missing values by filling them with 0 pedestrian_counts = pedestrian_counts.fillna(0) # Write the cleaned data to DuckDB con.execute("DROP TABLE pedestrian_counts") con.execute("CREATE TABLE pedestrian_counts AS SELECT * FROM pedestrian_counts") # Verify the cleaned data result = con.execute("SELECT * FROM pedestrian_counts LIMIT 5").fetchdf() print(result)Using JupySQL for SQL QueriesJupySQL is a powerful library that allows you to run SQL queries directly in Jupyter Notebooks. This makes it easy to interact with your DuckDB database without switching contexts. #### Example JupySQL Query%load_ext sql %sql duckdb:///:memory: # Query to view the first few rows of the dataset %%sql SELECT * FROM pedestrian_counts LIMIT 5;Visualizing Data with Plotly Plotly is a versatile data visualization library that integrates well with Jupyter Notebooks. We will use it to create interactive visualizations of our dataset.Total Pedestrian Counts Over Timeimport plotly.express as px # Aggregate pedestrian counts by year yearly_counts = con.execute("""    SELECT strftime('%Y', Date_Time) AS Year, SUM(Counts) AS Total_Counts    FROM pedestrian_counts    GROUP BY Year    ORDER BY Year """).fetchdf() # Create a bar chart fig = px.bar(yearly_counts, x='Year', y='Total_Counts', title='Total Pedestrian Counts by Year') fig.show()Monthly Traffic Counts# Aggregate pedestrian counts by month for the years 2019 and 2020 monthly_counts = con.execute("""    SELECT strftime('%Y-%m', Date_Time) AS Month, SUM(Counts) AS Monthly_Counts    FROM pedestrian_counts    WHERE strftime('%Y', Date_Time) IN ('2019', '2020')    GROUP BY Month    ORDER BY Month """).fetchdf() # Create a line chart to compare the two years fig = px.line(monthly_counts, x='Month', y='Monthly_Counts', title='Monthly Pedestrian Counts for 2019 and 2020') fig.show()Hourly Traffic Patterns# Aggregate pedestrian counts by hour of the day hourly_counts = con.execute("""    SELECT strftime('%H', Date_Time) AS Hour, AVG(Counts) AS Average_Counts    FROM pedestrian_counts    GROUP BY Hour    ORDER BY Hour """).fetchdf() # Create a line chart for hourly patterns fig = px.line(hourly_counts, x='Hour', y='Average_Counts', title='Average Hourly Pedestrian Counts') fig.show()Exploratory Data Analysis With our dataset loaded and visualized, we can perform a more detailed exploratory data analysis.Comparing Traffic on Weekdays vs. Weekends# Add a column for day of the week pedestrian_counts['Day_of_Week'] = pedestrian_counts['Date_Time'].dt.day_name() # Aggregate pedestrian counts by day of the week daily_counts = con.execute("""    SELECT Day_of_Week, AVG(Counts) AS Average_Counts    FROM pedestrian_counts    GROUP BY Day_of_Week    ORDER BY FIELD(Day_of_Week, 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday') """).fetchdf() # Create a bar chart for daily patterns fig = px.bar(daily_counts, x='Day_of_Week', y='Average_Counts', title='Average Pedestrian Counts by Day of the Week') fig.show()Peak Hours of Pedestrian Traffic# Identify peak hours by finding the hours with the highest average counts peak_hours = con.execute("""    SELECT strftime('%H', Date_Time) AS Hour, AVG(Counts) AS Average_Counts    FROM pedestrian_counts    GROUP BY Hour    ORDER BY Average_Counts DESC    LIMIT 5 """).fetchdf() # Create a bar chart for peak hours fig = px.bar(peak_hours, x='Hour', y='Average_Counts', title='Peak Hours of Pedestrian Traffic') fig.show()ConclusionDuckDB, combined with JupySQL and Plotly, provides a robust framework for performing hands-on exploratory data analysis. By leveraging DuckDB’s high-performance SQL capabilities and integrating with powerful visualization tools, you can efficiently uncover insights from your data. We encourage you to further explore DuckDB’s features and apply these techniques to your datasets.For a deeper dive into DuckDB's powerful data analysis capabilities and to explore more advanced topics, we highly recommend reading the book 'Getting Started with DuckDB' by Simon Aubury and Ned Letcher."Author BioSimon Aubury has been working in the IT industry since 2000 as a data engineering specialist. He has an extensive background in building large, flexible, highly available distributed data systems. Simon has delivered critical data systems for finance, transport, healthcare, insurance, and telecommunications clients in Australia, Europe, and Asia Pacific. In 2019, Simon joined ThoughtWorks as a principal data engineer and today is associate director of data platforms at Simple Machines in Sydney, Australia. Simon is active in the data community, a regular conference speaker, and the organizer of local and international meetups and data engineering conferences.Ned Letcher has worked as a data science and software engineering consultant since completing his PhD in computational linguistics in 2018 and currently works at Thoughtworks. He has designed and developed data-powered products and services across a range of industries and helped organizations and teams improve the effectiveness of their data processes and workflows. Ned has also worked as a Python trainer, supporting both tertiary students and data professionals across various organizations. He is active in the data community, speaking at and helping organize meetups and conferences, as well as contributing to a range of open source projects.
Read more
  • 0
  • 0
  • 496
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime