IBM which creates families of processors to address different segments, with different models for tasks like scale-up, scale-out, and now NVLink deployments. The company is adding new custom models that use new acceleration and memory devices, and that was the focus of this year’s talk at Hot Chips. They also announced about POWER10 which is expected to come with these new enhancements in 2021, they additionally announced, core counts of POWER10 and process technology.
IBM also spoke about focusing on developing diverse memory and accelerator solutions to differentiate its product stack with heterogeneous systems.
IBM aims to reduce the number of PHYs on its chips, so now it has PCIe Gen 4 PHYs while the rest of the SERDES run with the company's own interfaces. This creates a flexible interface that can support many types of accelerators and protocols, like GPUs, ASICs, CAPI, NVLink, and OpenCAPI.
AMD does not have an artificial intelligence–focused chip. However, AMD CEO Lisa Su in a keynote address at Hot Chips 31 stated that the company is working toward becoming a more significant player in artificial intelligence.
Lisa stated that the company had adopted a CPU/GPU/interconnect strategy to tap artificial intelligence and HPC opportunity. She said that AMD would use all its technology in the Frontier supercomputer. The company plans to fully optimize its EYPC CPU and Radeon Instinct GPU for supercomputing. It would further enhance the system’s performance with its Infinity Fabric and unlock performance with its ROCM (Radeon Open Compute) software tools.
Unlike Intel and NVIDIA, AMD does not have a dedicated artificial intelligence chip or application-specific accelerators. Despite this, Su noted, “We’ll absolutely see AMD be a large player in AI.” AMD is considering whether to build a dedicated AI chip or not. This decision will depend on how artificial intelligence evolves.
Lisa explained that companies have been improving their CPU (central processing unit) performance by leveraging various elements. These elements are process technology, die size, TDP (thermal design power), power management, microarchitecture, and compilers. Process technology is the biggest contributor, as it boosts performance by 40%. Increasing die size also boosts performance in the double digits, but it is not cost-effective.
While AMD used microarchitecture to boost EPYC Rome server CPU IPC (instructions per cycle) by 15% in single-threaded and 23% in multi-threaded workloads. This IPC improvement is above the industry average IPC improvement of around 5%–8%.
Intel revealed fine-grained details about its much-anticipated Spring Crest Deep Learning Accelerators at Hot Chips 31. The Nervana Neural Network Processor for Training (NNP-T) comes with 24 processing cores and a new take on data movement that's powered by 32GB of HBM2 memory. The spacious 27 billion transistors are spread across a 688mm2 die. The NNP-T also incorporates leading-edge technology from Intel-rival TSMC.
Intel in another presentation talked about Lakefield 3D Foveros hybrid processors that are the first to come to market with Intel's new 3D chip-stacking technology. The current design consists of two dies. The lower die houses all of the typical southbridge features, like I/O connections, and is fabbed on the 22FFL process. The upper die is a 10nm CPU that features one large compute core and four smaller Atom-based 'efficiency' cores, similar to an ARM big.LITTLE processor. Intel calls this a "hybrid x86 architecture," and it could denote a fundamental shift in the company's strategy. Finally, the company stacks DRAM atop the 3D processor in a PoP (package-on-Package) implementation.
California artificial intelligence startup Cerebras Systems introduced its Cerebras Wafer Scale Engine (WSE), the world’s largest-ever chip built for neural network processing. Sean Lie the Co-Founder and Chief Hardware Architect at Cerebras Lie presented the gigantic chip ever at Hot Chips 31.
The 16nm WSE is a 46,225 mm2 silicon chip which is slightly larger than a 9.7-inch iPad. It features 1.2 trillion transistors, 400,000 AI optimized cores, 18 Gigabytes of on-chip memory, 9 petabyte/s memory bandwidth, and 100 petabyte/s fabric bandwidth. It is 56.7 times larger than the largest Nvidia graphics processing unit, which accommodates 21.1 billion transistors on a 815 mm2 silicon base.
NVIDIA which announced about designing a test multi-chip solution for DNN computations at a VLSI conference last year, the company explained chip technology at Hot Chips 31 this year. It is currently a test chip which involves a multi-chip DL inference. It is designed for CNNs and has a RISC-V chip controller. It has 36 small chips, 8 Vector MACs per PE, and each chip has 12 PEs and each package has 6x6 chips.
AMD competes with Intel by launching EPYC Rome, world’s first 7 nm chip for data centers, luring in Twitter and Google
Apple advanced talks with Intel to buy its smartphone modem chip business for $1 billion, reports WSJ
Alibaba’s chipmaker launches open source RISC-V based ‘XuanTie 910 processor’ for 5G, AI, IoT and self-driving applications