The big data trends
Big data processing has been an integral part of the IT industry, more so in the past decade. Apache Hadoop and other similar endeavors are focused on building the infrastructure to store and process massive amounts of data. After being around for over 10 years, the Hadoop platform is considered mature and almost synonymous with big data processing. Apache Spark, a general computing engine that works well with is and not limited to the Hadoop ecosystem, was quite successful in the year 2015.
Building data science applications requires knowledge of the big data landscape and what software products are available out of that box. We need to carefully map the right blocks that fit our requirements. There are several options with overlapping functionality, and picking the right tools is easier said than done. The success of the application very much depends on assembling the right mix of technologies and processes. The good news is that there are several open source options...