Scala
The Scala standard library offers a rich set of tools, such as parallel collections and concurrent classes, to scale number-crunching applications. Although these tools are very effective at processing medium-sized datasets, developers too often discard them in favor of more elaborate frameworks.
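As a minimal sketch of what the concurrent classes can do for a number-crunching task, the following example sums a dataset in chunks using scala.concurrent.Future. The object name ParallelSumSketch, the method parallelSum, its numChunks parameter, and the 10-second timeout are hypothetical choices made for illustration. Futures are used here rather than parallel collections because, from Scala 2.13 onward, parallel collections ship as a separate module rather than in the standard library.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelSumSketch {
  // Hypothetical helper: splits the data into at most numChunks chunks,
  // sums each chunk in its own Future, then adds the partial sums.
  def parallelSum(xs: Vector[Double], numChunks: Int): Double = {
    require(numChunks > 0, "numChunks must be positive")
    val chunkSize = math.max(1, math.ceil(xs.size.toDouble / numChunks).toInt)
    // One Future per chunk; each runs on the global execution context.
    val partials: List[Future[Double]] =
      xs.grouped(chunkSize).map(chunk => Future(chunk.sum)).toList
    // Combine the partial sums once every chunk has completed.
    val total: Future[Double] = Future.sequence(partials).map(_.sum)
    // Blocking is acceptable in a sketch; the timeout is an arbitrary assumption.
    Await.result(total, 10.seconds)
  }
}
```

Calling ParallelSumSketch.parallelSum(Vector.tabulate(1000)(_.toDouble), 4) adds the values 0 to 999 in four chunks. For a dataset this small the overhead of scheduling the futures is likely to outweigh any gain, so the benefit should be measured rather than assumed.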
Object creation
Although code optimization and memory management are beyond the scope of this chapter, it is worthwhile to remember that a few simple steps can be taken to improve the scalability of an application. One of the most frustrating challenges in using Scala to process large datasets is the creation of a large number of objects and the resulting load on the garbage collector.
A partial list of remedial actions is as follows:
- Limiting unnecessary duplication of objects in an iterated function by using a mutable instance
- Using lazy values and Stream classes to create objects as needed
- Leveraging efficient collections such as Bloom filters or skip lists
- Running javap to decipher...
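The first two remedial actions in the list above can be sketched as follows. This is an illustration under stated assumptions, not a benchmark: the object ObjectCreationSketch and its members (cumulativeImmutable, cumulativeMutable, tableBuilt, lookupTable, squares) are hypothetical names introduced here. The example uses Stream to match the text; Scala 2.13 deprecates Stream in favor of LazyList, so the compiler may emit a deprecation warning.

```scala
import scala.collection.mutable.ArrayBuffer

object ObjectCreationSketch {
  // Immutable style: every step of the fold allocates a new tuple
  // and an updated Vector.
  def cumulativeImmutable(xs: Seq[Double]): Vector[Double] = {
    val (result, _) = xs.foldLeft((Vector.empty[Double], 0.0)) {
      case ((acc, sum), x) =>
        val next = sum + x
        (acc :+ next, next)
    }
    result
  }

  // Mutable style: a single pre-sized buffer and a local variable are
  // updated in place across iterations.
  def cumulativeMutable(xs: Seq[Double]): Vector[Double] = {
    val buf = new ArrayBuffer[Double](xs.size)
    var sum = 0.0
    xs.foreach { x =>
      sum += x
      buf += sum
    }
    buf.toVector
  }

  // Flag used only to show when the lazy value is actually initialized.
  var tableBuilt = false

  // Lazy value: the table is allocated on first access, not when the
  // enclosing object is initialized.
  lazy val lookupTable: Array[Double] = {
    tableBuilt = true
    Array.tabulate(1000)(i => math.sqrt(i.toDouble))
  }

  // Stream: elements beyond the head are computed only when requested.
  def squares: Stream[Long] = Stream.from(1).map(i => i.toLong * i)
}
```

Both cumulative methods return the same running totals, for example Vector(1.0, 3.0, 6.0) for the input Seq(1.0, 2.0, 3.0); they differ only in how many short-lived objects they create along the way. The lookup table is not built until lookupTable is first read, and squares.take(5).toList computes only the squares needed to fill the list.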