Nvidia's Volta Tensor Core GPU hits performance milestones. But is it the best?

  • 3 min read
  • 08 May 2018


Nvidia has revealed that its Volta Tensor Core GPU has hit some significant performance milestones. This is big news for the world of AI, as it raises the bar for the complexity and sophistication of the deep learning models that can be built. According to the Nvidia team, the Volta Tensor Core GPU has "achieved record-setting ResNet-50 performance for a single chip and single server" thanks to the updates and changes they have made.

Here are the headline records and milestones the Volta Tensor Core GPU has hit, according to the team's intensive and rigorous testing:

  • When training ResNet-50, a single V100 Tensor Core GPU can process more than 1,075 images per second. That is apparently four times the throughput of Pascal, the previous generation of Nvidia's GPU microarchitecture.
  • Last year, a single DGX-1 server powered by 8 Tensor Core V100s could achieve 4,200 images a second (still a hell of a lot). Now it can achieve 7,850.
  • A single AWS P3 cloud instance powered by 8 Tensor Core V100s can train ResNet-50 in less than 3 hours. That's three times faster than training on a single TPU (see the quick comparison sketch after this list).
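To put those figures in perspective, here is a minimal back-of-the-envelope sketch that derives the speedup factors from the throughput numbers quoted above. The Pascal baseline is inferred from Nvidia's own 4x claim rather than measured independently, so treat it as an assumption.

```python
# Rough speedup arithmetic based on the throughput figures quoted in the article.
# The Pascal baseline is inferred from the quoted "4x" claim, not an independent measurement.

v100_imgs_per_sec = 1075                       # single V100 Tensor Core GPU, ResNet-50 training
pascal_imgs_per_sec = v100_imgs_per_sec / 4    # implied by the quoted 4x figure (assumption)

dgx1_last_year = 4200                          # 8x V100 DGX-1 server, last year's software stack
dgx1_now = 7850                                # same hardware after the latest updates

print(f"V100 vs Pascal: {v100_imgs_per_sec / pascal_imgs_per_sec:.1f}x")
print(f"DGX-1 software-only gain: {dgx1_now / dgx1_last_year:.2f}x")
print(f"8-GPU scaling efficiency: {dgx1_now / (8 * v100_imgs_per_sec):.0%}")
```

The last line is the interesting one: 7,850 images a second across eight GPUs is roughly 91% of eight times the single-GPU figure, which is what makes the server-level numbers credible rather than just marketing.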


But what do these advances in performance mean in practice? And has Nvidia really managed to outperform its competitors?

Volta Tensor Core GPUs might not be as fast as you think


Nvidia is clearly pretty excited about what it has achieved. Certainly the power of the Volta Tensor Core GPUs is impressive and not to be sniffed at. But the website ExtremeTech raises a caveat. The piece argues that there are problems with using FLOPS (floating point operations per second) as a metric for performance, because the formula used to calculate FLOPS assumes a degree of consistency in how work is processed that can be misleading. One GPU, for example, might have higher peak FLOPS but not be running at capacity, and could therefore be outperformed by an 'inferior' GPU. A worked example of that utilization effect is sketched below.
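To make the caveat concrete, here is a minimal sketch. The peak-FLOPS and utilization figures are invented purely for illustration; the point is simply that a chip with a lower theoretical peak can deliver more useful work if it keeps its compute units busier.

```python
# Illustrative only: the peak-FLOPS and utilization figures below are invented
# to show why headline FLOPS can be a misleading performance metric.

chips = {
    # name: (peak teraFLOPS, fraction of peak actually sustained on a real workload)
    "Chip A (higher peak)": (125.0, 0.30),
    "Chip B (lower peak)":  (90.0, 0.55),
}

for name, (peak_tflops, utilization) in chips.items():
    effective = peak_tflops * utilization
    print(f"{name}: peak {peak_tflops:.0f} TFLOPS, "
          f"sustained {effective:.1f} TFLOPS at {utilization:.0%} utilization")
```

With these made-up numbers, the "slower" chip sustains 49.5 TFLOPS against the faster chip's 37.5, which is exactly the kind of gap a headline FLOPS figure hides.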

Other studies (this one from RiseML) have indicated that Google's TPU actually performs better than Nvidia's offering when a different test is used. Admittedly, the difference wasn't huge, but it becomes significant when you consider that the TPU is considerably cheaper than the Volta.
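The price angle is easy to quantify once you pick concrete numbers. The sketch below uses placeholder hourly prices and throughputs (not the figures from the RiseML study or any cloud price list) purely to show how cost-normalized throughput can flip a raw-speed comparison.

```python
# Placeholder numbers only: the hourly prices and throughputs below are
# assumptions for illustration, NOT the figures from the RiseML comparison.

accelerators = {
    # name: (images per second, on-demand price in USD per hour)
    "8x V100 cloud instance": (7850.0, 24.48),
    "Cloud TPU":              (7400.0, 6.50),
}

for name, (imgs_per_sec, usd_per_hour) in accelerators.items():
    imgs_per_dollar = imgs_per_sec * 3600 / usd_per_hour  # images processed per dollar spent
    print(f"{name}: {imgs_per_sec:.0f} img/s, {imgs_per_dollar:,.0f} images per dollar")
```

Even when the GPU instance is slightly faster in absolute terms, the cheaper accelerator can come out several times ahead on images per dollar, which is the comparison that matters for most training budgets.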

Ultimately, the difference between the two is as much about what you want from your GPU or TPU as it is about raw performance. Google might give you a little more power, but there's much less flexibility than you get with the Volta. It will be interesting to see how the competition develops over the next few years. On current form, Nvidia and Google are going to be leading the way for some time, whoever ends up with the bragging rights on performance.
