In today's world, computers, smartphones, and other devices have become an integral part of our lives. Every day, massive quantities of data are produced. Billions of people access services on the Internet, and companies constantly collect data about their users to better target products and improve the user experience.
Handling this ever-increasing amount of data presents substantial challenges. Large companies and organizations often build clusters of machines designed to store, process, and analyze large and complex datasets. Similar datasets are also produced in data-intensive fields such as environmental sciences and health care. These large-scale datasets have recently come to be called big data. The analysis techniques applied to big data usually involve a combination of machine learning, information retrieval, and visualization.
Computing clusters have been...