Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

Michelangelo PyML: Introducing Uber’s platform for rapid machine learning development

Save for later
  • 3 min read
  • 25 Oct 2018

article-image
Transportation network giants Uber have developed Michelangelo PyML - a Python-powered platform for rapid prototyping of machine learning models. The aim of this platform is to offer machine learning as a service that democratizes machine learning and makes it possible to scale the AI models to meet business needs efficiently.

Michelangelo PyML is an integration of Michelangelo - which Uber developed for large-scale machine learning in 2017. This will make it possible for their data scientists and engineers to build intelligent Python-based models that run at scale for online as well as offline tasks.

Why Uber chose PyML for Michelangelo


Uber developed Michelangelo in September 2017 with a clear focus of high performance and scalability. It currently enables Uber’s product teams to design, build, deploy and maintain machine learning solutions at scale and powers roughly close to 1 million predictions per second. However, that also came at the cost of flexibility. Users mainly were faced with 2 critical issues:

  • It was possible to train the models using the algorithms that were only natively supported by Michelangelo. To run unsupported algorithms, the platform’s capability had to be extended to include additional training and deployment components. This caused a lot of inconvenience at times.
  • The users could not use any feature transformations apart from those offered by Michelangelo’s DSL (Domain Specific Language)


Apart from these constraints, Uber also observed that data scientists usually preferred Python over other programming language, given the rich suite of libraries and frameworks available in Python for effective analytics and machine learning. Also, many data scientists gathered and worked with data locally using tools such as pandas, scikit-learn and Tensorflow, as opposed to Big Data tools such as Apache Spark and Hive, while spending hours in setting them up.

Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime

How PyML improves Michelangelo


Based on the challenges faced in using Michelangelo, Uber decided to revamp the platform by integrating PyML to make it more flexible. PyML provides a concrete framework for data scientists to build and train machine learning models that can be deployed quickly, safely and reliably across different environments. This, without any restriction on the types of data they can use or the algorithms they can choose to build the model, makes it an ideal choice of tool to integrate with a platform like Michelangelo.

By integrating Python-based models that can operate at scale with Michelangelo, Uber will now be able to handle online as well as offline queries and give smart predictions quite easily. This could be a potential masterstroke by Uber, as they try to boost their business and revenue growth after it slowed down over the last year.

Read more


Why did Uber created Hudi, an open source incremental processing framework on Apache Hadoop?

Uber’s Head of corporate development, Cameron Poetzscher, resigns following a report on a 2017 investigation into sexual misconduct

Uber’s Marmaray, an Open Source Data Ingestion and Dispersal Framework for Apache Hadoop