Packt+ | Advance your knowledge in tech

You're reading from Software Development on the SAP HANA Platform Written by a SAP HANA expert, this book takes you from installation to running your own processes in no time. By the end of the course you'll have awesome data retrieval and analytical powers to call on.

Product type Paperback

Published in Jul 2013

Publisher

ISBN-13 9781849689403

Length 328 pages

Edition Edition

Tools

SAP Hana

Concepts

Application Development

Author (1):

Mark Walker

View More author details

Table of Contents (19) Chapters

Software Development on the SAP HANA Platform

Credits

About the Author

About the Reviewers

www.PacktPub.com

Preface

1. So, What Is This SAP HANA Thing Anyways? FREE CHAPTER

2. SAP HANA Studio – Installation and First Look

3. Your First SAP HANA Development – An Attribute View

4. Painting with Numbers – An Analytic View

5. Let's Get Graphical – Graphical Calculation Views

6. You Talking to Me? – Scripted Calculation Views

7. Hey! That's My Data! – Authorizations in SAP HANA

8. On Another Level – Hierarchies in SAP HANA

9. Deploying Your Reporting Application to Reporting Software

10. Data Provisioning Using Data Services

11. Application Development Using the XS Engine

So Long and Thanks – Where To Go from Here

Index

Time for action – statistics about internal commercial flights in the USA from 1998 to 2011

SAP markets their HANA database as being really, really fast. But how fast is it really, when there is a large amount of data in the system? The author decided to do a couple of tests to find out. Rather than creating some "fake" data, it was decided that using publicly available information would be better. This means that anyone else wanting to do some performance tests can repeat the experiment.

The US government, as a part of their "open" data initiative of making data available for analysis, released all the data concerning commercial flights inside the USA, starting in 1998 and going right through to the present day. You can find this at http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time.

After a couple of days of downloading the data (49 GB of CSV files), tidying it up a bit, and then loading it into SAP HANA and MySQL, it was possible to do a handful of comparisons on the two different database systems. To be fair, it has to be said that the MySQL machine has only 8 GB of memory, but it has an SSD, while the SAP HANA machine has 128 GB, but "real" disks.

As an indication of SAP HANA's data loading performance, each data file of around 1.5 GB and around five million lines, loaded using SAP HANA's native IMPORT FROM CSV FILE instruction, took approximately 45 seconds (on an average). That means around 100000 lines are loaded every second.

The first thing to note is the sheer data volume—we're talking about over 140 million lines of data, which is a reasonable dataset size. Many companies have this much data in their orders or stock movement tables, so doing a couple of stress tests on this dataset allows extrapolation on the performance which we can expect when we use SAP HANA for large applications.

Without explaining all the data fields available, we have some interesting information, such as the origin and destination airports, as well as the flight delay at departure and on arrival. With these fields, we can perform some interesting queries.

We'll start by taking a look at something relatively simple: count the number of flights leaving Los Angeles airport, per year. The SQL statement to achieve this is as follows; the syntax is same in any relational database. Don't worry, we won't be spending 300 pages writing SQL code.

SELECT YEAR, COUNT(*) FROM FLIGHT.ONTIME WHERE ORIGIN = 'LAX' GROUP BY YEAR

First up, let's see how MySQL fares with this query (age before beauty, as they say):

Two minutes for MySQL. Not bad at all. We could probably optimize this, at least a little, by creating an index on the ORIGIN and YEAR columns though.

Now how does SAP HANA do on the same query and same dataset:

Well the results of the SQL are the same—which proves no cheating is going on. The time taken to resolve a query on SAP HANA is slightly under 800 milliseconds and around 160 times faster than MySQL. Impressive stuff.

How about something a little more taxing? Let's count the number of flights per day of the week, and then sort them by the number of results.

To see how MySQL fares with this query, check the following screenshot:

Not too shabby I suppose. MySQL looks at the full 140 million-line dataset in less than three minutes. How much time does SAP HANA take for the same query?

Once again, we can see that SAP HANA returns the result in a couple of seconds, where MySQL needs nearly three minutes to do the same.

If we were to summarize these results, we could say that with SAP HANA:

The machine, working more efficiently, is less busy for less time, and can therefore do more work

With a more efficient database such as SAP HANA, your servers will be able to do even more work, getting necessary information into your users' hands faster, enabling them to spend more time analyzing data, rather than waiting for it.

What just happened?

We've just glanced through the simplest usage of SAP HANA: a "standard" (if rather fast) database. SAP HANA can be used to replace legacy databases in more or less any situation: the system provides connectors for several standard database-access protocols (ODBC and JDBC, notably), so just by unplugging your MySQL, PostgreSQL, or SQL server database, and replacing it with SAP HANA, you'll be getting immediate speedups.

You're reading from Software Development on the SAP HANA Platform Written by a SAP HANA expert, this book takes you from installation to running your own processes in no time. By the end of the course you'll have awesome data retrieval and analytical powers to call on.

Table of Contents (19) Chapters

Time for action – statistics about internal commercial flights in the USA from 1998 to 2011

What just happened?

Authors (1)

Personalised recommendations for you