Introduction
Visualizing large data is challenging. There are more data points than available pixels, and manipulating distributed data can take a long time. Along with the increase in volume, new kinds of datasets are becoming mainstream. The need to analyze user comments, sentiments, customer calls, and other unstructured data has led to new kinds of visualizations. The use of graph databases and graph visualization to represent unstructured data is one example of how this increased variety is changing things.
Several recently developed tools allow interactive analysis with Spark by using caching to bring query latency down to the range of human interaction. Additionally, Spark's unified programming model and diverse programming interfaces enable smooth integration with popular visualization tools. We can use these to perform both exploratory and expository visualization over large data. In this chapter, we are going to look...