Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

Salesforce Einstein team open sources TransmogrifAI, their automated machine learning library

Save for later
  • 2 min read
  • 17 Aug 2018

article-image

Salesforce has open sourced TransmogrifAI, their end-to-end automated machine learning library for structured data. This library is currently used in production to help power Salesforce Einstein AI platform. TransmogrifAI enables data scientists at Salesforce to transform customer data into meaningful, actionable predictions.  Now, they have open-sourced this project to enable other developers and data scientists to build machine learning solutions at scale, fast.

TransmogrifAI is built on Scala and SparkML that automates data cleansing, feature engineering, and model selection to arrive at a performant model. It encapsulates five main components of the machine learning process:

salesforce-open-sources-transmogrifai-automated-machine-learning-library-img-0

Source: Salesforce Engineering


Feature Inference:


TransmogrifAI allows users to specify a schema for their data to automatically extract the raw predictor and response signals as “Features”. In addition to allowing for user-specified types, TransmogrifAI also does inference of its own. The strongly-typed features allow developers to catch a majority of errors at compile-time rather than run-time.

Transmogrification or automated feature engineering:


TransmogrifAI comes with a myriad of techniques for all the supported feature types ranging from phone numbers, email addresses, geo-location to text data. It also optimizes the transformations to make it easier for machine learning algorithms to learn from the data.

Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime

Automated Feature Validation:


TransgmogrifAI has algorithms that perform automatic feature validation to remove features with little to no predictive power. These algorithms are useful when working with high dimensional and unknown data. They apply statistical tests based on feature types, and additionally, make use of feature lineage to detect and discard bias.

Automated Model Selection:


The TransmogrifAI Model Selector runs several different machine learning algorithms on the data and uses the average validation error to automatically choose the best one. It also automatically deals with the problem of imbalanced data by appropriately sampling the data and recalibrating predictions to match true priors.

Hyperparameter Optimization:


It automatically tunes hyperparameters and offers advanced tuning techniques.

This large-scale automation has brought down the total time taken to train models from weeks and months to a few hours with just a few lines of code. You can check out the project to get started with TransmogrifAI. For detailed information, read the Salesforce Engineering Blog.

Salesforce Spring 18 – New features to be excited about in this release!

How to secure data in Salesforce Einstein Analytics

How to create and prepare your first dataset in Salesforce Einstein