
Google Cloud releases a beta version of SparkR job types in Cloud Dataproc

  • 2 min read
  • 21 Dec 2018


Earlier this week, Google released a beta version of SparkR job types on Cloud Dataproc, a cloud service that lets you run Apache Spark and Apache Hadoop cost-effectively.

The new SparkR job type extends R support on GCP. SparkR is a package that provides a lightweight front end for using Apache Spark from R. It supports distributed machine learning through MLlib, and it can be used to process large datasets in Cloud Storage and to run computationally intensive work. SparkR also lets developers apply “dplyr-like operations” to datasets stored in Cloud Storage; dplyr is a popular R package for transforming and summarizing tabular data organized in rows and columns.

The R programming language is widely used for building data analysis tools and statistical applications. With the rise of cloud computing, new opportunities have opened up for developers working with R.

With GCP’s Cloud Dataproc Jobs API, users can submit SparkR jobs to a cluster without opening firewalls to reach web-based IDEs or connecting over SSH to the master node. The API also makes it easy to automate the repeatable R statistics that users want to run on their datasets.
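As a rough illustration of what such automation might look like, the Python sketch below builds the URL and JSON body for a job submission request. It is a minimal sketch under several assumptions: the endpoint path and the `sparkRJob` / `mainRFileUri` field names follow the Dataproc v1 REST reference as we understand it (the beta described here may have used a different API version, so check the current documentation), and the project, region, cluster, and bucket names are hypothetical placeholders. Sending the request would also require an OAuth 2.0 access token, which is left out.

```python
import json

# Hypothetical placeholder values; substitute your own project, region,
# cluster, and script location.
PROJECT_ID = "my-project"
REGION = "us-central1"
CLUSTER_NAME = "my-cluster"
MAIN_R_FILE = "gs://my-bucket/scripts/analysis.R"


def build_sparkr_submit_request(project_id, region, cluster_name,
                                main_r_file_uri, args=None):
    """Return (url, body) for a request that submits a SparkR script as a job.

    Assumption: the path and field names mirror the Dataproc v1 REST API;
    verify them against the official reference before relying on this.
    """
    url = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project_id}/regions/{region}/jobs:submit"
    )
    body = {
        "job": {
            # Which existing cluster should run the job.
            "placement": {"clusterName": cluster_name},
            # The R script to run, plus any arguments passed to it.
            "sparkRJob": {
                "mainRFileUri": main_r_file_uri,
                "args": list(args or []),
            },
        }
    }
    return url, body


if __name__ == "__main__":
    url, body = build_sparkr_submit_request(
        PROJECT_ID, REGION, CLUSTER_NAME, MAIN_R_FILE,
        args=["gs://my-bucket/data/input.csv"],
    )
    # Only print the request here; no network call is made.
    print(url)
    print(json.dumps(body, indent=2))
```

Because the function only assembles the request, it can be called from a scheduler or a CI job and paired with whichever authenticated HTTP client a team already uses.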

Additionally, running R on GCP helps avoid the infrastructure barriers that limit how well data can be understood, such as having to sample a dataset because of compute or data size limits. GCP also lets you build large-scale models to analyze datasets of sizes that would previously have required big investments in high-performance computing infrastructure.

For more information, check out the official Google Cloud blog post.


