Alteryx Designer Cookbook: Over 60 recipes to transform your data into insights and take your productivity to a new level

Working with Databases

Accessing databases from Alteryx is very simple and fast. And the methods we use for files can apply to databases as well.

But databases have more peculiarities, many of which we, as analysts, cannot change (such as response speed, availability, and design).

Also, we’ll be addressing the basics of Data Connection Manager (DCM), a very useful and powerful feature introduced by Alteryx in version 2021.4, but highly improved in version 2022.1.

DCM is a secure, centralized, single-source administration, storage, and connection-sharing capability for database and cloud interoperability, offering enhanced security improvements (credentials linked to data sources and resolved at runtime).

If you are an administrator within your company, you have probably already identified the huge benefits DCM brings to your job. If you’re not, you’ll realize how easy it is to administer your credentials and connections using DCM once you start using it.

Another powerful feature of Alteryx Designer is the In-Database (In-DB) tools. These tools allow us to perform blending and analysis against large sets of data without moving the data out of the database, providing performance improvements over the traditional methods, since everything is executed within our Database Management System (DBMS) and no traffic along the network is required (very low to no latency).

In this chapter, we’ll be looking at some recipes to improve how we work with databases:

  • Scanning databases dynamically (cursor behavior, but more efficient)
  • Using Alteryx Calgary Databases
  • Creating credentials in DCM
  • Creating connections in DCM
  • Getting information from your In-DB connections/queries

Technical requirements

We created a portable database (using SQLite) as a test set for these recipes, but if you want to try them with your own data, you’ll need to have your connection information and access credentials at hand.

For the Calgary recipe, make sure you have enough free disk space (~2GB) on your computer.

Important note

Even when it’s not required for the recipes, access to Alteryx Server will be needed in case you want to synchronize DCM against your existing enterprise credentials.

Cursor behavior, but more efficient

When working with databases, a cursor is the process in which, for each record of a given table, you sequentially scan/read all the records from a second table in search of a condition.

This process is very useful for some use cases, but it might cause a huge overhead for the database management system and the network. For example, a cell phone provider company has all the data about each call – each IMEI for a period of time – and the marketing department is trying to predict the effects of a certain campaign on some customers.

If we analyze the amount of data produced by each call per phone, it’ll be huge and it’ll take us a lot of time. So, in this case, we would probably first extract from the database only the data associated with the customers targeted by the campaign, and then analyze it.

For that, we’ll have a first input consisting of the conditions the targeted audience must fulfill, and we use that input data to scan and retrieve each record from the transactional data source (calls in this case) associated with the selected ones.

In this recipe, we’ll learn how to perform a “cursor-like” reading of tables (for each record in one table, read all the records in a second table) using the Dynamic Input tool, avoiding the overhead and without tying up the database server’s resources.

Getting ready

For this example, we put together a portable database in SQLite that you can download from here:

https://github.com/PacktPublishing/Alteryx-Designer-Cookbook/tree/main/ch2/Recipe1

This set contains a database with three tables:

  • DOCUMENTS: Containing all the information about a company’s billing (~254K records)
  • ARTICLES: Containing a description of each ARTICLE_ID available for the company
  • CUSTOMERS: FIRST, LAST, and EMAIL for each customer
Figure 2.1: Database structure

The use case will be as follows: we, as a hardware store, need to gather the data corresponding to our top 10 CUSTOMERS from last year and get the top 10 ARTICLES each one bought.

We have DOCUMENTS (billing data) in one table, ARTICLES in another, and CUSTOMERS in a third one.

And our top 10 CUSTOMERS from last year come in an Excel File (DATA\Top10CUSTOMERS2021.xlsx).

How to do it…

We will do so using the following steps:

  1. On a new workflow, drop an Input Data tool and point it to DATA\Top10CUSTOMERS2021.xlsx.
  2. Select the 2021Top10 worksheet in Select Excel Input and click OK.
Figure 2.2: Select Excel Input

  3. Drop a Dynamic Input tool (from the Developer category) and configure it as follows:
  4. Click on Edit… for the Input Data Source Template option.
Figure 2.3: Dynamic Input tool configuration options

The Connect a File or Database screen will pop up.

Figure 2.4: Dynamic Input template configuration

  5. For the Connect a File or Database option, point it to the SQLite file. When prompted with Choose Table or Specify Query, click on the SQL Editor tab at the top of the window and write this SQL statement:
    SELECT * FROM DOCUMENTS WHERE CUSTOMER_ID=1234 AND PERIOD=2022

This can be seen here:

Figure 2.5: Dynamic Input template query

As you may notice, there is no customer with CUSTOMER_ID=1234 in the database. The value is only a placeholder, and this is where Alteryx Designer will work its magic.

Once Alteryx validates the query, your template will look like this:

Figure 2.6: Template panel after the configuration

Now, we need to configure the action we want the tool to perform.

  6. Select Modify SQL Query, and click Add on the right of the configuration panel. You’ll be presented with five options. Select SQL: Update WHERE Clause:
Figure 2.7: Modify SQL Query options

A new screen will be shown with pre-populated fields:

Figure 2.8: Configuring the Dynamic Input tool

  7. Make sure CUSTOMER_ID=1234 is selected for SQL Clause to Update, Value Type is set to Integer, Text to Replace is 1234, and Replacement Field shows CUSTOMER_ID. Then click OK.

If you run the workflow, you’ll get all records for 2022 corresponding only to the customer IDs contained in the control file (top 10 buyers from the previous year). From here, you can start the process of getting the top 10 articles bought by each customer, but that will be part of another recipe.

How it works…

When you configure the Dynamic Input tool with any of the Modify SQL Query options, Alteryx Designer reads the query and replaces only the parts you indicated in your selections. In this case, since SQL: Update WHERE Clause was selected, Alteryx will modify only the CUSTOMER_ID=1234 condition of the WHERE clause.

For the second part of the clause (PERIOD=2022), since we didn’t select any modifier for it, it’ll remain untouched.

The amazing part is that Alteryx Designer will execute one straight query per record coming from the Input Data tool, so, instead of having a cursor scanning the database per record in the input file (a single process from start to finish), there’ll be N individual queries running one after the other, causing the release of resources in the DBMS after each query.
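To make the idea concrete outside Alteryx, here is a minimal Python sketch of the same pattern: one independent query per incoming record, built by replacing the placeholder in the template. It is only an illustration, not how the Dynamic Input tool is implemented. The in-memory table, its three columns, and the run_per_record helper are assumptions made up for this example.

```python
import sqlite3

# Hypothetical miniature of the DOCUMENTS table (the real test set is far larger).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DOCUMENTS (CUSTOMER_ID INTEGER, PERIOD INTEGER, TOTAL REAL)")
conn.executemany(
    "INSERT INTO DOCUMENTS VALUES (?, ?, ?)",
    [(1, 2022, 10.0), (1, 2021, 5.0), (2, 2022, 7.5), (3, 2022, 1.0), (2, 2022, 2.5)],
)

# The template query from the recipe; 1234 is only a placeholder value.
TEMPLATE = "SELECT * FROM DOCUMENTS WHERE CUSTOMER_ID=1234 AND PERIOD=2022"


def run_per_record(connection, template, customer_ids):
    """Run one independent query per incoming record (hypothetical helper)."""
    rows = []
    for customer_id in customer_ids:
        # Value Type is Integer in the recipe, so coerce before substituting.
        query = template.replace("CUSTOMER_ID=1234", f"CUSTOMER_ID={int(customer_id)}")
        cursor = connection.execute(query)
        rows.extend(cursor.fetchall())
        cursor.close()  # each query ends before the next one starts
    return rows


top_customers = [1, 2]  # stands in for the records read from the Excel control file
result = run_per_record(conn, TEMPLATE, top_customers)
print(result)
```

Note that the PERIOD=2022 condition is never touched, which mirrors what happens in the tool when no modifier is defined for it.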

Figure 2.9: Multiple queries executed from just one tool

There’s more…

Of course, you can combine multiple WHERE statements, and replace the part you need with incoming data every time you have to.

But, if you look at Figure 2.7, you have other options to make your database queries dynamic, such as replacing strings in queries, which can be very helpful for executing the same query against different tables:

SELECT * FROM "TABLE" WHERE XXXX

You can set up a rule to indicate the tables you want to query, and in the WHERE clause, the conditions to query those tables, and all can be dynamic.
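As a rough illustration of that string-replacement idea, the following Python sketch fills in both the table name and the WHERE condition of the template for each incoming rule. The SALES_2021 and SALES_2022 tables, the ALLOWED_TABLES set, and the build_query helper are hypothetical names used only for this example, and the conditions are assumed to come from a trusted source.

```python
import sqlite3

# Two hypothetical tables sharing the same structure.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALES_2021 (CUSTOMER_ID INTEGER, TOTAL REAL)")
conn.execute("CREATE TABLE SALES_2022 (CUSTOMER_ID INTEGER, TOTAL REAL)")
conn.execute("INSERT INTO SALES_2021 VALUES (1, 10.0)")
conn.executemany("INSERT INTO SALES_2022 VALUES (?, ?)", [(1, 20.0), (2, 30.0)])

TEMPLATE = 'SELECT * FROM "TABLE" WHERE XXXX'
ALLOWED_TABLES = {"SALES_2021", "SALES_2022"}  # guard against unexpected names


def build_query(table, condition):
    """Replace both placeholders of the template (hypothetical helper)."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"Unexpected table name: {table}")
    # The condition is inserted as-is, so it must come from a trusted source.
    return TEMPLATE.replace('"TABLE"', f'"{table}"').replace("XXXX", condition)


# Each rule plays the role of one incoming record: a table and a condition.
rules = [("SALES_2021", "CUSTOMER_ID=1"), ("SALES_2022", "TOTAL>25")]
results = {table: conn.execute(build_query(table, cond)).fetchall() for table, cond in rules}
print(results)
```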

Working with Calgary databases

According to Alteryx’s definition:

Calgary is a list count data retrieval engine designed to perform analyses on large scale databases containing millions of records. Calgary utilizes indexing methodology to quickly retrieve records. A database index is a data structure that improves the speed of data retrieval operations on a database table. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random look ups and efficient access of ordered records.

Besides the actual definition, we can see Calgary as a proprietary file format, with the ability to handle huge amounts of data (~2B records) and to index the contents, so searches are extremely fast because there’s no need to read all the records before filtering them.
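The benefit of an index can be sketched in a few lines of Python. This is only a conceptual illustration of indexed lookup versus a full scan, not a description of Calgary’s internal format. The sample records and the lookup_up_to and scan_up_to helpers are assumptions made up for this example.

```python
from bisect import bisect_right

# Hypothetical sample records.
records = [
    {"BIKE_ID": 101, "BIRTH_YEAR": 1985},
    {"BIKE_ID": 102, "BIRTH_YEAR": 1960},
    {"BIKE_ID": 103, "BIRTH_YEAR": 1972},
    {"BIKE_ID": 104, "BIRTH_YEAR": 1950},
    {"BIKE_ID": 105, "BIRTH_YEAR": 1963},
]

# Build the index once: (value, record position) pairs sorted by value.
index = sorted((rec["BIRTH_YEAR"], pos) for pos, rec in enumerate(records))
index_keys = [key for key, _ in index]


def lookup_up_to(max_year):
    """Use the sorted index: a binary search finds the cut-off position."""
    cut = bisect_right(index_keys, max_year)
    return [records[pos] for _, pos in index[:cut]]


def scan_up_to(max_year):
    """Full scan: every record is read before it can be filtered."""
    return [rec for rec in records if rec["BIRTH_YEAR"] <= max_year]


indexed_ids = sorted(rec["BIKE_ID"] for rec in lookup_up_to(1963))
scanned_ids = sorted(rec["BIKE_ID"] for rec in scan_up_to(1963))
print(indexed_ids == scanned_ids)
```

Both functions return the same records. The difference is that the indexed version only touches the matching entries, which is what makes lookups on very large files fast.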

Alteryx provides a tool category for Calgary containing a set of five native tools:

  • Calgary Input: We’ll use this tool to query Calgary databases
  • Calgary Join: It’ll allow us to take an input file and perform join queries against a Calgary database
  • Calgary Cross Count: Performs aggregations across multiple Calgary databases and returns a count per record
  • Calgary Cross Count Append: This will allow you to take an input file and append counts to records that join a Calgary database when those records match your criteria
  • Calgary Loader: This is the tool we’ll use to create/load data into a Calgary file (.cydb)

We’ll be focusing on the Loader, Input, and Join tools throughout this recipe since they’re the most used tools in this category.

Getting ready

We built a test set for this recipe you can download from here:

https://github.com/PacktPublishing/Alteryx-Designer-Cookbook/tree/main/ch2/Recipe2

If you’re planning to follow along with your own data, make sure you have a decent number of records in your dataset (millions).

In both cases, make sure that you have at least 2 GB of available disk space on your computer.

How to do it…

There are two phases in this recipe:

  1. Creating/loading our data into a Calgary database
  2. Consuming the loaded data

To create/load the data, we will use the following steps:

  1. Drop an Input Data tool on the canvas and point it to the CitiBike_2013.zip file.
  2. Immediately, you’ll be prompted to select which file/s to read from the ZIP file and the type of the files. In our example, there’s just one, so select it and make sure that Select file type to extract is set to Comma Separated Value (*.csv) and click Open.
Figure 2.10: Read a file from a ZIP file

  3. Go to the Input Data tool configuration panel and make sure you change option 9, Delimiters, from a comma (,) to a pipe (|).
Figure 2.11: Input Data configuration panel – Delimiter option

  4. Click the Refresh button and your Preview data will change from the following:
Figure 2.12: Contents with the wrong delimiter

It will change to this:

Figure 2.13: After selecting the right delimiter for our file

  5. Add a Select tool to the canvas, and from the Options menu, click Save/Load and then Load Field Names & Types.
Figure 2.14: Shaping the data types based on saved configurations

  6. Point to the FIELD_TYPES\CitibikesFieldConfiguration.yxft file in the folder where you saved the recipe test set, and your Select tool will be populated with the field definitions saved in that file.
Figure 2.15: Resulting shape of our data

  7. Now, drop a Calgary Loader tool from the Calgary category.
Figure 2.16: Calgary Loader configuration panel

  8. Point Root File Name to the folder you want to save your files in and give the file a name. As a best practice, consider using a single folder per Calgary set of files.

At this point, we can select which fields to keep (save) in the Calgary file and which ones we’ll be using to index. For this recipe, we’ll be indexing all fields, with the Auto index type.

Run the workflow and you’ll see that the file is being created and the data is being indexed. While loading and indexing, Alteryx analyzes the contents of our data (the first million records), selecting the best type of index based on the values contained within it.

Figure 2.17: Results of running the workflow

By now, we’ll have the files: one .cydb, one .cyidx per indexed field, and SelectedName_Indexes.xml containing the index values.

Now, onto making fast queries to our Calgary database. For querying Calgary, Alteryx offers two methods:

  • Static: Using the Calgary Input tool, you can define your query within the tool configuration panel
  • Dynamic: Based on a data stream, you can query your Calgary database dynamically using conditions

We can build a static query as follows:

We’ll be extracting all trips made by people that are 50 years old or more. Since the data is from 2013, we’ll be querying the dataset for those records with BIRTH_YEAR <= 1963:

  1. Drop a Calgary Input tool onto the canvas.
  2. Point the Calgary Data File option to the Citibike.cydb file we just created.

Once you point to the file, the tool’s configuration panel will show you the options for building the query:

Figure 2.18: Calgary Input configuration

  3. From the BIRTH group, double-click on the YEAR field. Alteryx will pop up a new window – Edit Query Item.
Figure 2.19: Setting the query item

  4. We need to get a range starting at any value, but only up to 1963. So, uncheck Include Begin, check Include End, and enter 1963 for the end value.
Figure 2.20: Using only range end

  5. Click OK.

Your actual query clause will be added to the Query section of the configuration panel.

Figure 2.21: Query clause in the configuration panel

  6. Drop a Browse tool following the Calgary Input tool and run the workflow. You’ll be able to see all data regarding trips made by people 50 years old or older.

We will now dynamically query a Calgary database:

We are going to query the Calgary database for the same results but using a different approach. We’ll be getting some input from a data stream and using those values to query/join against the Calgary database:

  1. In this case, we are going to use the age limit as an input, so drop a Text Input tool onto the canvas.
  2. Create a column called AGE_LIMIT and add a record with 50 as the value.
Figure 2.22: Incoming data

Since the input data we have is the minimum age to consider (remember, we are going to get rides done by people 50 years old or more), we need to transform it, so we can query our data based on BIRTH_YEAR.

  3. Connect a Formula tool to the Text Input output anchor, and create a new field called MAX_BORN with the following expression to determine which year is the maximum to query (remember that data is from 2013):
    2013-[AGE_LIMIT]
  4. Add a second column called MIN_BORN with the value 1000 (to ensure all data that represents any year before 1963 is considered).

Your Formula tool should look like this:

Figure 2.23: Formula to determine the range

At this point, we have already defined our year range to query (from 1000 to 1963):

Figure 2.24: Input data enriched

  5. Connect a Calgary Join tool to the Formula tool (so that the MIN_BORN and MAX_BORN fields are available) and point Calgary Data File to CitiBike.cydb.
  6. Select Join Query Results to Each Input Record for the Action option.
  7. Click on the MIN_BORN input field and select BIRTH_YEAR for Index Field, Range - >=Begin AND <=End for Query Type, and MAX_BORN for End of Range, as in the screenshot here:
Figure 2.25: Calgary Join configured

If you run the workflow, you’ll see that you’ll get the same records as we got using the static query with the Calgary Input tool.
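The arithmetic behind both approaches can be checked with a short Python sketch. It is only an illustration of the range logic, not of how the Calgary tools run the query. The sample trips, the TRIP_ID field, and the two helper functions are assumptions made up for this example.

```python
DATA_YEAR = 2013  # the recipe's data is from 2013

# Hypothetical sample trips; TRIP_ID is an illustrative field.
trips = [
    {"TRIP_ID": 1, "BIRTH_YEAR": 1963},
    {"TRIP_ID": 2, "BIRTH_YEAR": 1964},
    {"TRIP_ID": 3, "BIRTH_YEAR": 1940},
    {"TRIP_ID": 4, "BIRTH_YEAR": 1990},
]


def static_query(rows):
    """Fixed condition, as configured in the Calgary Input tool."""
    return [row for row in rows if row["BIRTH_YEAR"] <= 1963]


def dynamic_query(rows, age_limit, min_born=1000):
    """Range derived from incoming data, as in the Formula + Calgary Join steps."""
    max_born = DATA_YEAR - age_limit  # 2013-[AGE_LIMIT]
    return [row for row in rows if min_born <= row["BIRTH_YEAR"] <= max_born]


static_ids = [row["TRIP_ID"] for row in static_query(trips)]
dynamic_ids = [row["TRIP_ID"] for row in dynamic_query(trips, age_limit=50)]
print(static_ids == dynamic_ids)
```

With an age limit of 50, MAX_BORN evaluates to 1963, so both filters select the same records. Changing the incoming AGE_LIMIT value changes the range without editing the query.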

How it works…

Calgary is a proprietary format developed by Alteryx that provides very high compression and very fast reading performance and indexing, making it ideal to work with huge amounts of data for lookups.

We recommend always enriching your data as much as you can before creating a Calgary file (very similar to what you’d do when creating a multidimensional cube). For example, given the use case in this recipe, we would probably add the age of each person to the Calgary database when creating it, so we could use the Calgary Join tool directly on the AGE input.

The Calgary Input tool is very straightforward, allowing you to build queries in a simple way and retrieve the results very fast.

The Calgary Join tool is more complex and provides lots of options to query the data based on incoming/existing data streams, multiple indices, and several actions.

Figure 2.26: Calgary Join actions

Important note

You can’t append records to a Calgary database; you need to re-create it.

There’s more…

As you may have already noticed, the Calgary Input tool organizes the fields into groups based on their name prefixes. For example, the fields starting with START_ (such as START_STATION_ID or START_STATION_NAME) and those starting with END_ were grouped by prefix.

Figure 2.27: Fields grouped by prefix

It is good practice to add a prefix to your fields to have them organized.

Since Alteryx looks at the first million records to select the index type when set to Auto, it might select the incorrect type for your dataset. It’s a good practice to analyze your data first and determine the selectivity of each index, based on the number of different values each data field might have. This can be easily achieved using a Summarize tool configured to perform a Count Distinct action on each field to be indexed.

Figure 2.28: Count Distinct on each field to Index

The rules of thumb for this selection are as follows:

  • If your field has many possible values (more than 550), use High Selectivity (for example, BIKE_ID)
  • If your field has fewer unique values (less than 550), use Low Selectivity (for example, GENDER)

Doing this will also reduce the time Alteryx Designer needs to analyze your data (1 million records per index) and create the indices.
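Outside Alteryx, the same rule of thumb can be sketched in Python: count the distinct values of each field and compare the count with the 550 threshold. The sample rows and the suggest_index_types helper are hypothetical. Treating a count of exactly 550 as Low Selectivity is also an assumption, since the rule above does not cover that case.

```python
THRESHOLD = 550  # rule-of-thumb boundary from the list above


def suggest_index_types(rows, fields, threshold=THRESHOLD):
    """Count distinct values per field and suggest an index type (hypothetical helper)."""
    suggestions = {}
    for field in fields:
        distinct = len({row[field] for row in rows})
        # Assumption: exactly `threshold` distinct values is treated as low selectivity.
        kind = "High Selectivity" if distinct > threshold else "Low Selectivity"
        suggestions[field] = (distinct, kind)
    return suggestions


# Made-up data: 2,000 different bikes and 3 gender codes.
rows = [{"BIKE_ID": 10000 + i, "GENDER": i % 3} for i in range(2000)]
suggestions = suggest_index_types(rows, ["BIKE_ID", "GENDER"])
print(suggestions)
```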

Finally, another good practice that’ll make your work easier is adding flags or identifiers to the data before loading a Calgary database, such as a CURRENT_PERIOD field to easily query all records corresponding to the current period, or a SAME_PERIOD_LAST_YEAR field to get all the records corresponding to a particular period, but from last year.

You can also read Calgary files with a regular Input Data tool, but can’t take advantage of the indices (so the Calgary files will behave like a .yxdb file).

DCM – setting up credentials

As we saw in this chapter’s introduction, DCM allows you to administer credentials and passwords in a single-source, centralized way. It solves several pain points: multiple credential inputs, credentials being shared unsafely, and loss of connection to data sources when workflows are shared, among others.

Before getting into the matter, we need to identify three types of objects/concepts within DCM:

  • Credentials: Authentication mechanism for the specific technology
  • Data Sources: All accessible technologies supported by Alteryx
  • Connections: The combination of a data source and the credentials used to authenticate against it

Also, if you have Alteryx Server, you can synchronize and share your connections against it. If you don’t, credentials, data sources, and connections created with DCM will remain local.

Getting ready

To follow this recipe, you must enable DCM on Alteryx Designer. To do so, go to Options → User Settings → Edit User Settings and from the DCM tab click on Enable DCM.

If the Enable DCM option appears disabled, first click on Override DCM System Settings to make it available.

Figure 2.29: DCM options in User Settings

Make sure DCM Optional is the selected value for DCM Mode and SDK Access Mode is set to Allow.

Restart Alteryx Designer and you’ll be ready to work with DCM.

How to do it…

We will get started using the following steps:

  1. Go to File → Manage Connections.
Figure 2.30: Manage Connections menu

A new window is displayed (yours might be blank):

Figure 2.31: DCM main window

  2. Click on + Add Credential at the top right of the window and Alteryx will ask you to enter values for Credential Name and Method.
Figure 2.32: DCM main window

  3. Enter a meaningful name for your credential, such as SQL SERVER System Administrator, and select from the dropdown for Method. In this case, we’ll be using Username and password.
Figure 2.33: Credential Method options

  4. Once you make a selection for Method, Alteryx will show you the Username and Password input fields, so fill them in with your credentials.
Figure 2.34: Credential Method options

Click Save and your credential will appear in the Credentials panel.

Figure 2.35: New credential added

Thus, we have learned how to set up credentials using DCM.

How it works…

DCM saves the credential information provided as a credential object, encrypted as a secure object, and makes it available to be reused when you need it.

This actually improves the way that credentials are managed, since using DCM changes how that information is saved (if DCM is disabled, credentials are embedded within the workflow).

DCM – setting up a connection

To be able to connect to data using credentials, DCM needs you to create a connection. A connection object is a combination of a data source and a set of credentials.

In this recipe, we’ll be creating a new connection using DCM capabilities.

Getting ready

We’ll prepare to do this using the following steps:

  1. If you haven’t enabled DCM on Alteryx Designer yet, do so now (otherwise, skip this step): go to Options → User Settings → Edit User Settings and, from the DCM tab, click on Enable DCM.
Figure 2.36: DCM options in User Settings

  2. Make sure DCM Optional is the selected value for DCM Mode and SDK Access Mode is set to Allow.
  3. Restart Alteryx Designer and you’ll be ready to work with DCM.

If you have access to Alteryx Server, you’ll be able to synchronize your local and remote connections with it.

Also, make sure you have access to at least one database from any of the technologies supported by Alteryx.

Important note

This synchronization process is manual and can only be triggered from Alteryx Designer.

How to do it…

We’ll set up the actual connection using the following steps:

  1. Go to File → Manage Connections.
Figure 2.37: Manage Connections menu

A new window is displayed (yours might be blank).

Figure 2.38: DCM main window

  2. Click on Add Data Source at the top right of the new window, so the Select Technology option shows up.
  3. From the dropdown, select the type of technology you will be connecting to (see the complete list of tools and technologies supported by DCM here: https://help.alteryx.com/current/designer/dcm-designer):
Figure 2.39: Technology selection for new connections

For this recipe, we’ll be using Microsoft SQL Server Quick Connect, but feel free to select the technology you want. The steps will be the same – what will change is the data you need to enter to connect to that technology.

  4. Enter your connection’s specifics and click Save.
Figure 2.40: Setting up a SQL Server connection

  5. Now, we need to link the credentials with the data source object to create a connection.
Figure 2.41: Linking credentials to a connection

  6. Click on + Connect Credential and the panel will change, so you can select the type of credentials (Authentication Method) you’ll be using for this connection.
Figure 2.42: Selecting Authentication Method for linked credentials

Depending on your selection, Alteryx will filter and show all credentials of the selected type for you to choose.

Select Username and password and you’ll see that a new field was added to the panel, with a dropdown to select from all existing username and password credentials.

Figure 2.43: Selecting the credentials

  7. Select the one we created in recipe #3 (SQL SERVER System Administrator) and click on the Link button.
Figure 2.44: Linked credential

Now we have our connection ready to be used.

How it works…

DCM allows us to create credentials and data sources. Those objects can be individually administered in a centralized, secure space. The combination of a data source and a set of credentials gives us a connection object that we can use in our workflows without caring about logins and server names.

Important note:

If you use DCM, every change you make to a connection will be picked up by your workflows. So, for example, you need to change your password once (in DCM’s Connection Manager) and all workflows using that credential will get updated.
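The following Python sketch models that relationship in a purely conceptual way: workflows hold a reference to a connection, and the connection resolves its credential when it is used. It says nothing about how DCM stores or encrypts these objects. The class names, the resolve method, and all sample values (user name, passwords, server name) are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Credential:
    name: str
    username: str
    password: str


@dataclass
class DataSource:
    name: str
    server: str


@dataclass
class Connection:
    data_source: DataSource
    credential: Credential

    def resolve(self):
        """Read the current values at run time instead of keeping copies."""
        return {
            "server": self.data_source.server,
            "username": self.credential.username,
            "password": self.credential.password,
        }


# Hypothetical objects; all values are illustrative only.
credential = Credential("SQL SERVER System Administrator", "sa_user", "old-secret")
source = DataSource("Sales DB", "sqlserver.example.local")
connection = Connection(source, credential)

# Two workflows reference the same connection object.
workflows = {"workflow_a": connection, "workflow_b": connection}

# Changing the password once is enough: every workflow sees the new value.
credential.password = "new-secret"
resolved = {name: conn.resolve()["password"] for name, conn in workflows.items()}
print(resolved)
```

The contrast is with credentials embedded in each workflow, where every copy would have to be edited separately.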

There’s more…

If you look at the underlying XML within the workflow for your connections, you’ll notice the difference in how they’re stored and managed by Alteryx Designer.

Figure 2.45: Using DCM and without using DCM

See the complete list of tools and technologies supported by DCM here: https://help.alteryx.com/current/designer/dcm-designer

Getting information from your In-DB connection/query

When working with the In-Database tools in Alteryx (and probably the Visual Query Builder), Alteryx builds the queries for us from within the tools. Sometimes we’ll need to take those queries and have somebody optimize them for us, or test them outside Alteryx.

The Dynamic Output In-DB tool allows us to get a lot of information about what Alteryx queries in our databases and how it does so.

Throughout this recipe, we’ll be exploring how to get that information and how we can make use of it.

Getting ready

To practice this recipe, we created a test set that you can download from here:

https://github.com/PacktPublishing/Alteryx-Designer-Cookbook/tree/main/ch2/Recipe5

Before starting with the recipe, just make sure that you install the SQLite ODBC driver (in the \SQLITE-ODBC folder). If you are on 32-bit Windows, use sqliteodbc.exe and if you are on 64-bit Windows, use sqliteodbc_W64.exe for the installation:

  1. Once installed, go to the ODBC data sources corresponding to the actual version of your OS (32- or 64-bit).
Figure 2.46: ODBC Data Source Administrator

  2. In the System DSN tab, click on Add….
  3. Navigate to SQLite3 ODBC Driver, select it, and click Finish.
Figure 2.47: Selecting a driver for the data source

  4. On the new screen, give your data source a name (we used AlteryxCookbook).
Figure 2.48: SQLite3 driver configuration

  5. Click on the Browse… button and select where you saved the provided SQLite database (it should be in \DATA\Chapter2.sqlite).

For this recipe, we’ll not be touching any other settings of the driver.

If you plan to use your own data, you’ll only need to have access to a database you can query.

How to do it…

We are going to get the total billed amounts per customer. For this, we have three tables: DOCUMENTS, ARTICLES, and CUSTOMERS.

The DOCUMENTS table has all the information about the billing (including the amount in the TOTAL field) but has no details about customers or articles (just an ID). So we need to join the tables to get those details.

Figure 2.49: Structures of the tables

To be able to do so, we first need to connect to the database. We’ll be using In-DB connections to do it:

  1. Grab a Connect In-DB tool from the In-Database category and drop it onto the canvas.
  2. From the tool configuration panel, click on Manage Connections to create an Alteryx In-DB connection.
Figure 2.50: In-DB connection

The Manage In-DB Connections screen will pop up, allowing you to start configuring the new connection.

  3. From the Data Source dropdown, select Generic ODBC (we’ll be pointing it to the ODBC data source we created earlier).
  4. For the Connection Type dropdown, leave it at User and click the New button for Connections. This will enable the Connection Name field, so give the connection a name (we used SQLITE as you can see in the following figure).
Figure 2.51: In-DB connection

  5. Now, on Connection String, click on the down-pointing arrow and select New database connection….
Figure 2.52: New database connection…

  6. This will make the ODBC Connection screen pop up. From here, select the AlteryxCookbook connection (the one we created in the Getting ready part of this recipe) and click OK.
Figure 2.53: Selecting which ODBC data source to use for the current connection

  7. Click OK on the Manage In-DB Connections window, and the Choose Table or Specify Query window will pop up showing the existing tables within the current connection (by default, it’ll open in the Tables tab).
  8. Click on the Visual Query Builder tab so you can start building a query using drag and drop.
Figure 2.54: Visual Query Builder

  9. Drag the DOCUMENTS table and drop it into the Main canvas.
  10. Repeat the operation for the ARTICLES and CUSTOMERS tables.

Now, we have the three tables available, and we’ll create the relations between them.

  11. From the DOCUMENTS table, drag the COMPANY_ID field and drop it over the ARTICLES table’s COMPANY_ID field.

Repeat the procedure, linking the following:

  • DOCUMENTS.ARTICLE_ID with ARTICLES.ARTICLE_ID
  • DOCUMENTS.CUSTOMER_ID with CUSTOMERS.CUSTOMER_ID

  12. Now click on the first checkbox in the DOCUMENTS table to select all fields from it (*), then check the boxes for ARTICLES.DESCRIPTION, CUSTOMERS.FIRST, CUSTOMERS.LAST, and CUSTOMERS.EMAIL.

Your query should look like this:

Figure 2.55: Completed query in Visual Query Builder

  13. Click OK and you’ll return to the Alteryx Designer canvas.

Now the tool is ready to execute the query. If you run the workflow, you’ll notice that it returns all records after joining the tables.
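To make the result of the Visual Query Builder more tangible, here is a minimal Python sketch that runs an equivalent join with the standard sqlite3 module. It is not part of the recipe: the in-memory database, its sample rows, and the reduced column lists are assumptions (only the fields named in the steps above are included), and the ORDER BY is added solely to make the printed output predictable.

```python
import sqlite3

# Toy in-memory stand-in for the recipe's database (assumed schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ARTICLES  (COMPANY_ID INTEGER, ARTICLE_ID INTEGER, DESCRIPTION TEXT);
CREATE TABLE CUSTOMERS (CUSTOMER_ID INTEGER, "FIRST" TEXT, "LAST" TEXT, EMAIL TEXT);
CREATE TABLE DOCUMENTS (COMPANY_ID INTEGER, ARTICLE_ID INTEGER, CUSTOMER_ID INTEGER, TOTAL REAL);
INSERT INTO ARTICLES  VALUES (1, 10, 'Widget'), (1, 11, 'Gadget');
INSERT INTO CUSTOMERS VALUES (100, 'Ann', 'Lee', 'ann@example.com'),
                             (101, 'Bob', 'Ray', 'bob@example.com');
INSERT INTO DOCUMENTS VALUES (1, 10, 100, 25.0), (1, 11, 100, 75.0), (1, 10, 101, 40.0);
""")

# Same three links as in the Visual Query Builder:
# COMPANY_ID and ARTICLE_ID towards ARTICLES, CUSTOMER_ID towards CUSTOMERS.
join_sql = """
SELECT DOCUMENTS.*, ARTICLES.DESCRIPTION, CUSTOMERS.FIRST, CUSTOMERS.LAST, CUSTOMERS.EMAIL
FROM DOCUMENTS
   INNER JOIN ARTICLES ON DOCUMENTS.COMPANY_ID = ARTICLES.COMPANY_ID
                      AND DOCUMENTS.ARTICLE_ID = ARTICLES.ARTICLE_ID
   INNER JOIN CUSTOMERS ON DOCUMENTS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID
ORDER BY DOCUMENTS.CUSTOMER_ID, DOCUMENTS.ARTICLE_ID
"""

joined_rows = conn.execute(join_sql).fetchall()
for row in joined_rows:
    print(row)  # one output row per DOCUMENTS record, enriched with the details
```

Each billing record comes back once, now carrying the article description and the customer details instead of just their IDs.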

Now, to get the total amount per customer, we need to summarize the data: group by CUSTOMER_ID, keep FIRST, LAST, and EMAIL, and sum TOTAL.

  14. Drop a Summarize In-DB tool onto the canvas and configure it so that its configuration panel looks like the following figure:
Figure 2.56: Summarize In-DB

Now, if we run the workflow, we’ll get the total amount per customer.

Figure 2.57: Workflow results

At this point, we need to see how Alteryx resolves the queries we created in its drag-and-drop interface, and we can extract that information using a Dynamic Output In-DB tool.

  15. Connect a Dynamic Output In-DB tool to the output anchor of the Summarize In-DB tool, and a regular Browse tool to the output anchor of the Dynamic Output In-DB tool.

At this point, your workflow should look like the following figure:

Figure 2.58: Our workflow

  16. Click on the Dynamic Output In-DB tool and select all output fields, except for Input Connection String and Output Connection String.
Figure 2.59: Dynamic Output In-DB output fields

  17. Run the workflow and review the resulting fields.
Figure 2.60: Results for Dynamic Output In-DB

The following fields can be found here:

  • Query: This is the complete query generated up to this point in the workflow.
  • Connection Name: The name of the Alteryx connection you’re using (comes from the name you gave it when you created it).
  • Connection Data Source: This is the database type. Note that since we used a generic ODBC type of connection, that value is not available to Alteryx – that’s why we get Unknown here.
  • In-DB XML: The Alteryx XML representation of the query.
  • Record Info XML: The XML representation of the query fields.
  • Query Alias List: This contains each segment of the query and the ID Alteryx gave to them.
  • Last Query Alias: The last alias from the list.

From the Query field, you have access to the SQL query created by Alteryx Designer – in our case, the following:

WITH "Tool1_fc91" AS (
   select DOCUMENTS.*,
      ARTICLES.DESCRIPTION,
      CUSTOMERS.FIRST,
      CUSTOMERS.LAST,
      CUSTOMERS.EMAIL
   from DOCUMENTS
      inner join ARTICLES on DOCUMENTS.COMPANY_ID = ARTICLES.COMPANY_ID and DOCUMENTS.ARTICLE_ID = ARTICLES.ARTICLE_ID
      inner join CUSTOMERS on DOCUMENTS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID
)
SELECT "CUSTOMER_ID", MIN("FIRST") AS "FIRST", MIN("LAST") AS "LAST", MIN("EMAIL") AS "EMAIL", SUM("TOTAL") AS "Sum_TOTAL"
FROM "Tool1_fc91"
GROUP BY "CUSTOMER_ID"

Here, "Tool1_fc91" is the unique ID that Alteryx assigns to the tool so that it can reference that tool’s part of the complete query as a subquery.
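As a quick check outside Alteryx, the extracted query can be pasted into any SQL client. The sketch below does this with Python's standard sqlite3 module against a toy in-memory database; the schema is reduced to the fields that appear in the query, and the sample rows are invented for illustration, so treat the numbers as assumptions rather than the recipe's data.

```python
import sqlite3

# Toy stand-in for the recipe's database (assumed schema and rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ARTICLES  (COMPANY_ID INTEGER, ARTICLE_ID INTEGER, DESCRIPTION TEXT);
CREATE TABLE CUSTOMERS (CUSTOMER_ID INTEGER, "FIRST" TEXT, "LAST" TEXT, EMAIL TEXT);
CREATE TABLE DOCUMENTS (COMPANY_ID INTEGER, ARTICLE_ID INTEGER, CUSTOMER_ID INTEGER, TOTAL REAL);
INSERT INTO ARTICLES  VALUES (1, 10, 'Widget'), (1, 11, 'Gadget');
INSERT INTO CUSTOMERS VALUES (100, 'Ann', 'Lee', 'ann@example.com'),
                             (101, 'Bob', 'Ray', 'bob@example.com');
INSERT INTO DOCUMENTS VALUES (1, 10, 100, 25.0), (1, 11, 100, 75.0), (1, 10, 101, 40.0);
""")

# The query text as produced by the Dynamic Output In-DB tool in the recipe.
generated_sql = """
WITH "Tool1_fc91" AS (select DOCUMENTS.*,
   ARTICLES.DESCRIPTION,
   CUSTOMERS.FIRST,
   CUSTOMERS.LAST,
   CUSTOMERS.EMAIL
from DOCUMENTS
   inner join ARTICLES on DOCUMENTS.COMPANY_ID = ARTICLES.COMPANY_ID and DOCUMENTS.ARTICLE_ID = ARTICLES.ARTICLE_ID
   inner join CUSTOMERS on DOCUMENTS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID)
SELECT "CUSTOMER_ID", MIN("FIRST") AS "FIRST", MIN("LAST") AS "LAST",
       MIN("EMAIL") AS "EMAIL", SUM("TOTAL") AS "Sum_TOTAL"
FROM "Tool1_fc91" GROUP BY "CUSTOMER_ID"
"""

# Key the result by CUSTOMER_ID so the lookup does not depend on row order.
totals = {row[0]: row for row in conn.execute(generated_sql)}
for customer_id in sorted(totals):
    print(totals[customer_id])
```

The common table expression holds the joined rows from the Connect In-DB tool, and the outer SELECT is what the Summarize In-DB tool adds on top of it.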

From the Query Alias List field, we can access the different sub-queries created to that point within the workflow.

At this point, we can save or copy that information to analyze and further optimize our queries.
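One possible starting point for that analysis, assuming the database is SQLite as in this recipe, is to ask the engine how it plans to execute the query by using EXPLAIN QUERY PLAN. The following Python sketch compares the plan before and after adding an index; the reduced schema and the index name idx_customers_id are hypothetical, and the plan text differs between SQLite versions, so no expected output is shown.

```python
import sqlite3

# Schema only: EXPLAIN QUERY PLAN does not need any rows (assumed column lists).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ARTICLES  (COMPANY_ID INTEGER, ARTICLE_ID INTEGER, DESCRIPTION TEXT);
CREATE TABLE CUSTOMERS (CUSTOMER_ID INTEGER, "FIRST" TEXT, "LAST" TEXT, EMAIL TEXT);
CREATE TABLE DOCUMENTS (COMPANY_ID INTEGER, ARTICLE_ID INTEGER, CUSTOMER_ID INTEGER, TOTAL REAL);
""")

query = """
WITH "Tool1_fc91" AS (select DOCUMENTS.*, ARTICLES.DESCRIPTION,
   CUSTOMERS.FIRST, CUSTOMERS.LAST, CUSTOMERS.EMAIL
from DOCUMENTS
   inner join ARTICLES on DOCUMENTS.COMPANY_ID = ARTICLES.COMPANY_ID and DOCUMENTS.ARTICLE_ID = ARTICLES.ARTICLE_ID
   inner join CUSTOMERS on DOCUMENTS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID)
SELECT "CUSTOMER_ID", SUM("TOTAL") AS "Sum_TOTAL"
FROM "Tool1_fc91" GROUP BY "CUSTOMER_ID"
"""


def explain(sql):
    """Return the human-readable detail text of each plan step."""
    # The detail is the last column of every EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = explain(query)
# Hypothetical index on the join key, added only to see how the plan reacts.
conn.execute("CREATE INDEX idx_customers_id ON CUSTOMERS (CUSTOMER_ID)")
plan_after = explain(query)

for label, plan in (("before index", plan_before), ("after index", plan_after)):
    print(label)
    for step in plan:
        print("  ", step)
```

Other database engines expose similar commands under different names, so the same idea applies when the In-DB connection points somewhere other than SQLite.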

How it works…

Creating queries in a visual interface is easier than writing code, and not all of us are comfortable writing SQL. The Visual Query Builder gives us the ability to create complex queries without any programming knowledge, but sometimes we’ll need assistance in optimizing those queries.

The Dynamic Output In-DB tool gives us easy access to the queries that Alteryx generates and executes against our database management systems, by capturing that information and returning it as fields we can inspect.

There’s more…

When you link two fields in the Visual Query Builder, you’ll notice that Alteryx adds a black connector between them. If you double-click on it, the Link Properties screen will appear, allowing you to configure the link.

Figure 2.61: Configuring the relationships

See also (follow-up steps)

The Connection Name field and the Query or Query Alias List fields extracted from the Dynamic Output In-DB tool can be used to generate dynamic and/or batch queries using a Dynamic Input In-DB tool connected to a data stream.

Important note:

The Dynamic Input In-DB tool only supports one input record, so if you have several queries to run, consider creating a macro.
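The one-record limitation can be pictured with a plain-Python analogy (this is not Alteryx code): a batch macro runs its inner workflow once per incoming record, so each run sees exactly one query. In the sketch, the record keys mirror the Connection Name and Query fields described earlier, while the connections registry, the run_one helper, and the toy table are hypothetical.

```python
import sqlite3

# Toy database standing in for the In-DB connection named SQLITE (assumed data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DOCUMENTS (CUSTOMER_ID INTEGER, TOTAL REAL);
INSERT INTO DOCUMENTS VALUES (100, 25.0), (100, 75.0), (101, 40.0);
""")
connections = {"SQLITE": conn}  # hypothetical lookup: connection name -> connection

# Several query records, as they might arrive from a data stream.
query_records = [
    {"Connection Name": "SQLITE", "Query": "SELECT COUNT(*) FROM DOCUMENTS"},
    {"Connection Name": "SQLITE", "Query": "SELECT SUM(TOTAL) FROM DOCUMENTS"},
]


def run_one(record):
    """Run a single query record, the way one macro iteration would."""
    connection = connections[record["Connection Name"]]
    return connection.execute(record["Query"]).fetchall()


# One call per record instead of feeding all records to the tool at once.
results = [run_one(record) for record in query_records]
for record, rows in zip(query_records, results):
    print(record["Query"], "->", rows)
```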

Key benefits

  • Acquire the skills necessary to perform analytics operations like an expert
  • Discover hidden trends and insights in your data from various sources to make accurate predictions
  • Reduce the time and effort required to derive insights from your data
  • Purchase of the print or Kindle book includes a free eBook in the PDF format

Description

Alteryx allows you to build data manipulation and analytics workflows in a simple, easy-to-use, code-free UI and run them quickly, offering multiple ways to achieve the same results. The Alteryx Designer Cookbook is a comprehensive guide to maximizing your Alteryx skills and determining the best ways to perform data operations. This book’s recipes will guide you through an analyst’s complete journey, covering all aspects of the data life cycle. The first set of chapters will teach you how to read data from various sources to obtain reports and pass it through the required adjustment operations for analysis. After an explanation of the Alteryx platform components with a particular focus on Alteryx Designer, you’ll be taken on a tour of what you can accomplish with this tool and how. Along the way, you’ll learn best practices and design patterns. The book also covers real-world examples to help you apply your understanding of the features in Alteryx to practical scenarios. By the end of this book, you’ll have enhanced your proficiency with Alteryx Designer and improved your ability to execute tasks within the tool efficiently.

Who is this book for?

This book is for data analysts, data professionals, and business intelligence professionals seeking to harness the full potential of the tool. A basic understanding of Alteryx Designer and Alteryx terminology, including macros, apps, and workflows, is all you need to get started with this book.

What you will learn

  • Speed up the cleansing, data preparing, and shaping process
  • Perform operations and transformations on the data to suit your needs
  • Blend different types of data sources for analysis
  • Pivot and un-pivot the data for easy manipulation
  • Perform aggregations and calculations on the data
  • Encapsulate reusable logic into macros
  • Develop high-quality, data-driven reports to improve consistency

Product Details

Publication date: Oct 31, 2023
Length: 740 pages
Edition: 1st
Language: English
ISBN-13: 9781804613146
Vendor: Alteryx

Table of Contents

Chapter 1: Inputting Data from Files
Chapter 2: Working with Databases
Chapter 3: Preparing Data
Chapter 4: Transforming Data
Chapter 5: Data Parsing
Chapter 6: Grouping Data
Chapter 7: Blending and Merging Datasets
Chapter 8: Aggregating Data
Chapter 9: Dynamic Operations
Chapter 10: Macros and Apps
Chapter 11: Downloads, APIs, and Web Services
Chapter 12: Developer Tools
Chapter 13: Reporting with Alteryx
Chapter 14: Outputting Data
Index
Other Books You May Enjoy

Customer reviews

Rating distribution
4.9 out of 5 (10 ratings)
5 star 90%
4 star 10%
3 star 0%
2 star 0%
1 star 0%

Gary Gruccio Jan 27, 2024
Rating: 5 out of 5
I’ve used Alteryx for over 15 years and still learned something new! Comprehensive guide to building many process in Alteryx. Supplemented by downloadable files and workflows.
Amazon verified review
Chamoddinu Feb 29, 2024
Rating: 5 out of 5
I recently had the pleasure of exploring the Alteryx Designer Cookbook, and I must say, it exceeded my expectations. As someone who wanted to learn Alteryx from scratch, this book was an absolute gem. The author has done a fantastic job of breaking down complex concepts into easily digestible chunks, making it incredibly easy to grasp the tools and functionalities of Alteryx. What I particularly appreciated about this book was its hands-on approach. Not only does it provide clear explanations of each tool, but it also offers practical examples and exercises that allow readers to apply what they've learned in real-world scenarios. This interactive learning style was instrumental in helping me gain confidence in using Alteryx effectively. Whether you're a beginner or someone looking to sharpen their skills, this book caters to all levels of expertise. The step-by-step tutorials are accompanied by clear screenshots and illustrations, making it a breeze to follow along. Additionally, the layout and organization of the content are top-notch, making it easy to navigate and reference back to specific topics as needed. In conclusion, the Alteryx Designer Cookbook is a must-have resource for anyone looking to master Alteryx. With its user-friendly approach and practical exercises, it's the perfect companion for anyone looking to unlock the full potential of this powerful tool. Highly recommended!
Amazon verified review
Hoang Le Nov 10, 2023
Rating: 5 out of 5
The book is easy to read and follow because the author describes the goal clearly in the beginning and shows the instruction step-by-step. In each part, the author usually splits into 3 parts: Getting Ready, How to do it and How it works. In the "Getting Ready" part, the author shows the goal what the user could get and how to prepare, set up the file, load it into Alteryx Designer. In the "How to do it" part, the author will show the instruction step-by-step with screenshots from Alteryx. It’s very easy to follow. In the "How it works" part, the author notes some advice, and important notes after the topic. I really like that structure. It helps the user keep track from beginning to the end of the book. In the first chapter for inputing the file in Alteryx, it’s difficult for some beginners when they don’t understand what is macro. But I understand that the author would like to show in special cases, the user could use the macro to input the files. Overall, this is a good book about Alteryx Designer for people who are new to Alteryx or people who don't have programming background. The book includes all important topics in cleaning and preparing data. This book would be very helpful in the data analytics journey.
Amazon verified review
Kindle Customer Nov 29, 2023
Rating: 5 out of 5
Alberto's new book is a gem! As the title suggests it focusses on easy-to-follow recipes for common challenges faced by Alteryx users, and also provides new and advanced concepts for those who are more experienced in Alteryx. For example - if you are starting out and need to work with multiple different sheets in Excel (one of the most common questions for new users); or produce formatted reports - this book has you covered. Additionally - if you are more advanced and want to work with paged APIs; dynamically update your workflows through XML editing; or create logging on your data pipelines - there's content in there for you too. This is a thoughtful and practical book that is clearly borne of Alberto's deep experience in doing this work for and with clients, and building solutions that were previously thought impossible.
Amazon verified review
H2N Nov 12, 2023
Rating: 5 out of 5
This book is a must-read for data professionals seeking to enhance their skills in Alteryx Designer. Covering 14 chapters, it's an invaluable resource for business intelligence experts, data analysts, and data scientists. The guide provides comprehensive techniques for data manipulation, from inputting various file types to working effectively with databases. It delves into transforming unstructured data, data parsing, grouping, blending, and merging, along with introducing dynamic operations, macros, and applications for streamlined workflows. Additionally, it explores leveraging cloud services, developer tools for tackling data challenges, and effective strategies for data reporting and output. This book is a compact yet rich source of practical insights for mastering Alteryx Designer and data analytics.
Amazon verified review
