Imagining the future – a look at emerging trends
Technology seems to progress at an ever-increasing pace. For decades, relational databases from vendors such as Oracle were the primary technology for managing all data. Today, a wide range of database types is available, each suited to particular use cases: graph databases for highly connected datasets, NoSQL databases for low-latency reads and writes against very large tables, and vector databases, which have become popular for ML applications such as generative AI.
It was also not all that long ago that Hadoop MapReduce was the state-of-the-art technology for processing very large datasets, but today, most new projects would choose Apache Spark over a MapReduce implementation. Apache Spark itself has continued to evolve since its initial release, with Spark 3.4 released in April 2023. Along the way, we have also seen the introduction of Spark Streaming, MLlib (Spark's machine learning library), and GraphX for different use cases.
...