A few days ago, Alibaba Cloud announced the release of Mars, its tensor-based framework for large-scale data computation. Mars tensor provides a NumPy-like interface that will be familiar to most Python users working in scientific computing, such as mathematicians and engineers. Mars can scale in to a single machine and scale out to a cluster with hundreds of machines.
Mars can be installed from PyPI with pip install pymars, after which users can try Mars tensor with just a few lines of code:
import mars.tensor as mt

# create a 1000 x 2000 tensor of random values
a = mt.random.rand(1000, 2000)

# build the expression, then trigger the actual computation with execute()
(a + 1).sum(axis=1).execute()
According to a Medium post by Synced, “Mars can simply tile a large tensor into small chunks and describe the inner computation with a directed graph, enabling the running of parallel computation on a wide range of distributed environments, from a single machine to a cluster comprising thousands of machines.”
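To make the tiling idea concrete, here is a minimal sketch in plain NumPy (a conceptual illustration, not Mars' actual implementation) that splits a matrix into fixed-size chunks, computes a partial result per chunk, and combines them; the 2000 x 2000 shape and the chunk size of 500 are arbitrary values chosen for the example:

import numpy as np

# Conceptual illustration only: tile a matrix into fixed-size chunks,
# compute a partial result per chunk, then combine the partial results.
# Mars expresses the same idea as a directed graph of chunk-level
# operations that can run in parallel; this loop runs them sequentially.

a = np.random.rand(2000, 2000)
chunk = 500  # arbitrary chunk size chosen for this sketch

partial_sums = []
for i in range(0, a.shape[0], chunk):
    for j in range(0, a.shape[1], chunk):
        block = a[i:i + chunk, j:j + chunk]      # one chunk of the tensor
        partial_sums.append((block + 1).sum())   # chunk-level computation

total = sum(partial_sums)
assert np.isclose(total, (a + 1).sum())          # matches the un-tiled result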
Xuye Qin, a senior engineer at Alibaba Cloud, highlighted Mars’ performance, stating, “Mars can complete the computation on a 2.25T-size matrix and a 2.25T-size matrix multiplication in two hours.”
Unlike NumPy, Mars gives users the ability to run matrix computations at a very large scale. Alibaba developers carried out a simple experiment to test Mars’ performance. In the graph below, NumPy (represented by a red cross at the upper left) lags far behind Mars tensor, which comes close to the ideal performance values.
Source: Medium
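The post does not detail the benchmark setup, but a rough comparison along these lines can be tried locally. The sketch below times the same computation in NumPy and in Mars on a deliberately smaller 10000 x 10000 matrix; the shape is an arbitrary choice for illustration and the results will vary by machine:

import time

import numpy as np
import mars.tensor as mt

shape = (10000, 10000)  # far smaller than the 2.25TB case; arbitrary size for illustration

# NumPy: the full matrix is materialised in memory and computed in one shot
start = time.time()
x = np.random.rand(*shape)
(x + 1).sum(axis=1)
print('numpy:', round(time.time() - start, 2), 'seconds')

# Mars: the tensor is tiled into chunks whose computations can run in parallel
start = time.time()
y = mt.random.rand(*shape)
(y + 1).sum(axis=1).execute()
print('mars :', round(time.time() - start, 2), 'seconds')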
Mars supports a subset of the NumPy interface, including array creation and manipulation routines, reductions along axes, basic and fancy indexing, universal functions, linear algebra routines, and random sampling.
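Here is a small sketch of that NumPy-style surface; the specific routines used (mt.ones, mt.random.rand, mt.dot, and axis reductions) are assumed to be part of the supported subset, so check the project documentation for the current list:

import mars.tensor as mt

# array creation routines with NumPy-style signatures
a = mt.ones((2000, 2000))
b = mt.random.rand(2000, 2000)

# elementwise arithmetic and a reduction along an axis
row_sums = (a * b + 1).sum(axis=1)

# matrix multiplication through the dot interface
c = mt.dot(a, b)

# expressions are evaluated only when execute() is called
row_sums.execute()
c.execute()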
To learn more about Mars, visit its official GitHub page.