Deriving information from existing data
As you have no doubt realized, having data does not automatically mean having information. To turn data into meaningful information, analysis tools are required. Such tools can be implemented as standalone applications, or be integrated within a database or application server layer.
If you're dealing with data stored in a database, implementing data processing logic at the database tier seems to be most efficient. For example, Oracle Database offers a number of native features focusing on data analysis—the process of converting data into information. Sometimes, though, to obtain the required piece of information from your data, it's quite enough to issue a simple SQL query against it. For example, if you want to know the number of orders placed over a certain period of time, say, a year, this query might be as simple as the following:
That is it. As you can see, the required piece of information is derived here from a relational table with a two-line SQL query. It's fairly obvious that you don't need any Business Intelligence tools to answer simple questions like the one in the previous example—SQL alone will be enough.
Unfortunately, not all the questions you have to face in practice can be answered so easily. Much more often, Business Intelligence has to operate on a larger dataset than one comprised by a single relational table, at times aggregating information from disparate data sources. This is where you are unlikely to get away with SQL alone—sophisticated Business Intelligence tools like those that the Oracle Business Intelligence suite includes are required.
However, you should have no illusions about the capabilities of Business Intelligence. It is important to realize that Business Intelligence is not a magic black box that can derive information from nowhere, answering probing analytical questions from sparse datasets. There will be a significant difference in the accuracy of the information extracted, depending on how dense the underlying dataset is.
Answering business questions from your data
As stated above, the quantity and quality of the information you can obtain directly depends on the quantity and quality of the data available at your disposal. Not surprisingly, the more information you can collect, the more probing analytical questions you can answer.
To start with, let's look at a simple example. Suppose you need to analyze the results of a survey you conducted online to get some vital feedback from your customers. It turned out, however, that only one-tenth of the overall number of your customers participated. Of course, the information you will extract from the results of this survey cannot be considered comprehensive, as the opinion of the majority remains unclear and, therefore, only a sparse dataset has been analyzed.
Note
If you consult a dictionary, you should see that 'sparse' is a synonym to 'scanty'. What this means here is that a sparse dataset has gaps. Say, you have sales figures for today and don't have them for yesterday and so on. However, for analysis purposes, you often need a dense dataset—one that contains no gaps. So densification is the process of filling gaps in a dataset.
While in the above example, a factor that complicated performing a quality analysis was that there was not enough incoming information; you will most likely in practice experience another kind of problem—composing the 'right' dataset from the sea of data available at your disposal. In terms of Oracle Discoverer, such datasets are called workbooks and represent business areas—collections of related information. Each workbook contains worksheets displaying data from the perspectives you want to view that data in, to obtain answers to your business questions. So when creating a workbook, you should include all the data structures required for building the worksheets you want.
Diagrammatically, the scenario involving Oracle Discoverer workbooks, each of which can be used to answer a certain set of business questions might look like the following diagram:
As you can see, Oracle Discoverer workbooks can be built on a variety of data sources, representing the derived data so that it can answer your business questions. The diagram in the figure gives you a rough idea of the mechanism used by Oracle Discoverer to provide a business view of the underlying data.
But you're probably wondering "How would I know which data structures must be included in a workbook to give enough information for getting those business questions answered?" Well, in most cases, the answer to this question is intuitively clear: you add to the workbook only those data structures that contain the data you want to see displayed at that point on the workbook's worksheets.
If you're working with relational data, then the rule of thumb is to include not only those tables whose data you want to see on worksheets, but also their related tables. It's not a big issue though, since the Discoverer wizard used to create a new workbook includes related tables by default. As you might guess, related tables may become required when the time comes to drill down through data to detail.
For example, when creating a workbook to be used for sales analysis, you would include at least the orders, customers, and employee tables. This is correct because you often need to have the ability to integrate customer, employee, and sales information for analysis. Now imagine that you want to create a worksheet for the report showing the sales of a particular region. What this means simply is that the regions table must also have been included.
Note
However, this may not be necessary if you're going to include only the region_id
field in your report. Being part of the customers
table, this field serves as the foreign key that relates to the regions
table.
In terms of Oracle Discoverer, database tables included in a workbook are folders, each of which includes items representing columns of the corresponding table. It's interesting to note, however, that folders and items are not necessarily based on relational tables and their columns respectively—other data structure options, such as LDAP directories and entries are also possible. So, the Oracle Discoverer documentation defines a folder as a collection of closely related information, and an item can be thought of as a piece of information of a certain type, within a folder.
Comparing and analyzing data
Now that you have a rough idea of how data can be organized for business analysis, it's time to move on and look at how you might use that data to your advantage, gaining a comprehensive view of your business.
As we have just seen, within a workbook consolidating a collection of related information, you can create a set of worksheets, each of which is to answer a certain business question. Continuing the discussion, this section touches briefly on issues related to comparing and analyzing data.
To view the data displayed on a worksheet from the perspective you need, Oracle Discoverer offers a number of components, including totals, percentages, exceptions, and calculations. With these tools in hand, you'll be able to analyze data more quickly and easily.
Schematically, this might look like the following figure:
The above diagram illustrates that you can add totals, percentages, calculations, and filters to a worksheet through the corresponding Discoverer components. All these Discoverer tools are easy to use, and allow you to rearrange the worksheet's data or drill into it, so that you can see it from another perspective, with just a few clicks. In the Asking business questions using data-access tools section earlier in this chapter, you saw an example of how easy it is to calculate a total for a column on a worksheet.
Although Discoverer tools open up a great way to approach the problem of analyzing data, you may sometimes have to rearrange data even before it is included into a workbook. For example, in the area of comparative analyses, you may need to perform data densification before you can proceed to data analysis.
As usual, this can be best understood by example. Say, when performing a period-to-period comparison, you may face the problem of sparse data. For example, you perform day-to-day comparisons to see the inventory figures for various products on a daily basis. The problem is that the inventory
table usually stores a row for a product when its quantity of available units changes. So, the rows stored in the inventory
table might look like the following:
As you can see, there are days when no event happened, and therefore, what you have here is a sparse dataset. In reports that you might want to create based on this table, you will most likely need this data to be represented differently, as a dense dataset. So, before you can include it to a workbook, a densification is needed.
The simplest way to solve this problem is to create a view based on the inventory table, so that the view contains a dense dataset. The densification might be accomplished with the help of the LAST_VALUE
Oracle SQL analytic function, included in the select list of the view's SELECT
statement.
The above example illustrated that the process of converting data into information you need, may start even before that data is chosen to be processed by a Business Intelligence tool. In other words, you may need to make some preparations to rearrange an underlying dataset so that it's ready for analysis.
Note
The downside to the above approach is the lack of flexibility. However, Oracle Discoverer gives you a greater degree of flexibility.
If you need the inventory table's original rows, you'll need to make a change to the set of underlying data objects. However, Oracle Discoverer provides a better option. Windowing functions, such as LAST_VALUE
discussed here, can be used to compute aggregates with the help of calculations, a Discoverer analysis feature mentioned at the beginning of this section. Like other Discoverer analysis features, calculations can be activated or deactivated with a single click, thus giving you a greater degree of flexibility. The Oracle Discoverer features will be discussed in more detail later in this book.