Hub AI
Tensor Processing Unit AI simulator
(@Tensor Processing Unit_simulator)
Tensor Processing Unit
A Tensor Processing Unit (TPU) is a neural processing unit (NPU), an application-specific integrated circuit (ASIC) developed by Google for neural network machine learning. TensorFlow, JAX, and PyTorch are supported frameworks for TPUs. Google began using TPUs internally in 2015, and in 2018 made them available for third-party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale.
Compared to graphics processing units, TPUs are designed for a high volume of low-precision computation (e.g. as little as 8-bit precision) with more input/output operations per joule, and they omit hardware for rasterisation and texture mapping. According to Norman Jouppi, the TPU ASICs are mounted in a heatsink assembly that can fit in a hard drive slot within a data center rack.
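The low-precision arithmetic described above can be sketched in plain NumPy: quantize float32 weights and activations to signed 8-bit integers, multiply and accumulate in 32-bit integer arithmetic, and rescale at the end. This is a generic illustration of 8-bit inference, not the TPU's exact quantization scheme:

```python
import numpy as np

def quantize(x, n_bits=8):
    """Symmetric linear quantization of a float array to signed n-bit ints."""
    qmax = 2 ** (n_bits - 1) - 1                 # 127 for 8 bits
    scale = float(np.max(np.abs(x))) / qmax      # one scale per tensor
    q = np.round(x / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4)).astype(np.float32)
a = rng.standard_normal(4).astype(np.float32)

qW, sW = quantize(W)
qa, sa = quantize(a)

# 8-bit products are accumulated in 32-bit integers (wider accumulators
# avoid overflow), then the result is rescaled back to floating point.
y_int8 = (qW.astype(np.int32) @ qa.astype(np.int32)).astype(np.float32) * (sW * sa)
y_fp32 = W @ a

print(np.max(np.abs(y_int8 - y_fp32)))  # small quantization error
```

The energy savings come from the integer multiply-accumulates: an 8-bit multiplier is far smaller and cheaper per operation than a 32-bit floating-point unit, at the cost of the quantization error printed above.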
Different types of processors are suited for different types of machine learning models. TPUs are well suited for convolutional neural networks (CNNs), while GPUs have benefits for some fully connected neural networks, and CPUs can have advantages for recurrent neural networks (RNNs).
In 2013, Google recruited Dr. Amir Salek to establish custom silicon development capabilities for the company's data centers. As founder and head of Custom Silicon for Google Technical Infrastructure and Google Cloud, Salek led the development of the original TPU (Google's first production chip), TPUv2 (the industry's first production deep-learning training chip), TPUv3, TPUv4, Edge TPU, and additional silicon products including the VCU, IPU, and OpenTitan. According to Jonathan Ross, one of the original TPU engineers and later the founder of Groq, three separate groups at Google were developing AI accelerators, with the TPU, a systolic array, being the design that was ultimately selected.
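The systolic-array dataflow mentioned above can be illustrated with a small cycle-level simulation. This is a generic weight-stationary sketch, not Google's actual microarchitecture:

```python
def systolic_matvec(x, W):
    """Cycle-level sketch of a weight-stationary systolic array computing
    y = x @ W for a length-K input x and a K x N weight matrix W.

    PE (k, n) holds weight W[k][n]. Activation x[k] enters row k with a
    one-cycle skew and moves one column right per cycle; the partial sum
    for output n enters column n at the top and moves one row down per
    cycle, so data and partial sum meet at PE (k, n) exactly at cycle k + n.
    """
    K, N = len(W), len(W[0])
    y = [0.0] * N
    for t in range(K + N - 1):            # cycles until the array drains
        for n in range(N):
            k = t - n                     # row where column n's partial sum sits
            if 0 <= k < K:
                y[n] += x[k] * W[k][n]    # this PE's multiply-accumulate fires
    return y

print(systolic_matvec([1.0, 2.0, 3.0], [[1, 0], [0, 1], [1, 1]]))  # [4.0, 5.0]
```

The point of the arrangement is that operands move only between neighboring processing elements each cycle, so a K x N grid sustains K * N multiply-accumulates per cycle without any global memory traffic for intermediate values.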
Norman P. Jouppi served as the tech lead and principal architect for Google's Tensor Processing Unit development, leading the rapid design, verification, and deployment of the first TPU to production in just 15 months. As lead author of the seminal 2017 paper "In-Datacenter Performance Analysis of a Tensor Processing Unit," presented at the 44th International Symposium on Computer Architecture (ISCA 2017), Jouppi demonstrated that the TPU achieved 15–30× higher performance and 30–80× higher performance-per-watt than contemporary CPUs and GPUs, establishing the TPU as a foundational platform for neural network inference at scale across Google's production services.
The Tensor Processing Unit was announced in May 2016 at the Google I/O conference, when the company said that the TPU had been used inside their data centers for over a year. Google's 2017 paper describing its creation cites previous systolic matrix multipliers of similar architecture built in the 1990s. The chip was specifically designed for Google's TensorFlow framework, a symbolic math library used for machine learning applications such as neural networks. However, as of 2017 Google still used CPUs and GPUs for other types of machine learning. AI accelerator designs from other vendors are also appearing, aimed at embedded and robotics markets.
Google's TPUs are proprietary. Some models are commercially available, and on February 12, 2018, The New York Times reported that Google "would allow other companies to buy access to those chips through its cloud-computing service." Google has said that TPUs were used in the AlphaGo versus Lee Sedol series of human-versus-machine Go games, as well as in the AlphaZero system, which produced chess, shogi, and Go-playing programs from the game rules alone and went on to beat the leading programs in those games. Google has also used TPUs for Google Street View text processing, and was able to find all the text in the Street View database in less than five days. In Google Photos, an individual TPU can process over 100 million photos a day. TPUs are also used in RankBrain, which Google uses to provide search results.
Google provides third-party access to TPUs through its Cloud TPU service, part of the Google Cloud Platform, and through its notebook-based services Kaggle and Colaboratory.
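As a minimal illustration of using such a backend from one of the supported frameworks, a JAX program can list the available accelerator devices and run a computation on whichever backend is present. On a Cloud TPU VM or a Colab/Kaggle TPU runtime, jax.devices() would report TPU devices; on an ordinary machine the same code falls back to CPU:

```python
import jax
import jax.numpy as jnp

# JAX code is backend-agnostic: the same program runs on CPU, GPU, or TPU.
print([d.platform for d in jax.devices()])   # e.g. ['tpu', ...] on a TPU host

x = jnp.ones((128, 128), dtype=jnp.float32)
y = jnp.dot(x, x)        # dispatched to the default backend's devices
print(float(y[0, 0]))    # 128.0 on any backend
```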
