The key concern associated with using RDDs is that they can take a lot of time to master. The flexibility of running functional operators such as map and reduce allows you to perform a wide variety of transformations against your data. But with this power comes great responsibility, and it is possible to write inefficient code, such as using groupByKey, which shuffles every key-value pair across the network before aggregating, where reduceByKey combines values locally on each partition first; more information can be found in Avoid GroupByKey at https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html.
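To make the difference concrete, here is a minimal PySpark sketch contrasting the two approaches on a simple pair-count; the sample data and variable names are illustrative:

```python
# A minimal sketch contrasting groupByKey with reduceByKey on a
# word-count-style aggregation; the sample pairs are illustrative.
from pyspark import SparkContext

sc = SparkContext.getOrCreate()
pairs = sc.parallelize([("a", 1), ("b", 1), ("a", 1), ("a", 1), ("b", 1)])

# Inefficient: groupByKey shuffles every (key, value) pair across the
# network before the values are summed on the reducer side.
counts_grouped = pairs.groupByKey().mapValues(sum)

# Preferred: reduceByKey combines values locally on each partition first,
# so far less data crosses the network during the shuffle.
counts_reduced = pairs.reduceByKey(lambda a, b: a + b)

print(counts_reduced.collect())  # e.g. [('b', 2), ('a', 2)]
```

Both versions produce the same result; the difference is purely in how much data moves across the cluster during the shuffle.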
You will generally see slower performance when using RDDs than when using Spark DataFrames, because DataFrame queries are planned by Spark's Catalyst optimizer while RDD functions are opaque to the engine, as noted in the following diagram:
Source: Introducing DataFrames in Apache Spark for Large Scale Data Science at https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark...
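To illustrate what the diagram compares, here is a hedged sketch of the same aggregation written first against an RDD and then against a DataFrame; the sample data and column names are illustrative:

```python
# A minimal sketch of one aggregation written both ways; sample data,
# variable names, and column names are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# RDD version: the Python lambda is opaque to Spark, so the engine
# executes it as-is with no query optimization.
rdd = spark.sparkContext.parallelize([("a", 1), ("b", 2), ("a", 3)])
rdd_result = rdd.reduceByKey(lambda a, b: a + b).collect()

# DataFrame version: the declarative groupBy/sum is planned by the
# Catalyst optimizer before execution.
df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["key", "value"])
df_result = df.groupBy("key").sum("value").collect()
```

Because the DataFrame query is declarative, Spark can optimize it regardless of the language it is written in, which is why the diagram shows DataFrame performance converging across Python and Scala while RDD performance does not.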