As part of a continuing partnership between Google and NVIDIA, the search giant has revealed that it will bring the chipmaker's Tesla P4 GPUs to its cloud platform. The Google Cloud Platform (GCP) already lets customers use NVIDIA's V100 Tensor Core GPUs, but that flagship part isn't necessarily well-suited to every task. With the new hardware, GCP users get an option that's more scalable without compromising too much on performance. The small-form-factor Tesla P4 is exceptionally efficient, according to NVIDIA. Built on the company's Pascal architecture and drawing between 50W and 75W at most, the P4 delivers up to 5.5 teraFLOPS of single-precision (FP32) performance when boosted. On the integer side of the equation, INT8 performance is rated at around 22 tera-operations per second (TOPS). That's backed by 8GB of memory and 192GB per second of memory bandwidth.
That's plenty of performance, at a claimed 60 times better than what a traditional CPU can offer, and its integration with Google's TensorFlow via the TensorRT library should make it ideal for optimizing trained neural networks for reduced-precision INT8 operations. NVIDIA also says latency is 'extremely low' at just 1.8 milliseconds 'at batch size 1.' In addition, the P4 works with the NVIDIA DeepStream SDK and A.I.-based video services to enable real-time transcoding and inference on up to 35 HD video streams, with the hardware able to decode and analyze streams simultaneously. In short, nearly every aspect of the new offering makes it a workable solution for smaller A.I. projects and for projects that need to scale, but not quite to the performance level of the flagship Tensor Core GPUs mentioned above.
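To give a sense of what "reduced-precision INT8 operations" means in practice: the core idea is mapping a network's FP32 weights and activations onto 8-bit integers with a per-tensor scale factor, trading a little accuracy for much higher throughput. TensorRT's actual INT8 calibration is considerably more sophisticated (it chooses scales by minimizing information loss over calibration data), but the basic symmetric-quantization step can be sketched in a few lines of NumPy. The function names below are illustrative, not part of any NVIDIA API.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric quantization: map FP32 values onto the INT8 range [-127, 127].

    The scale factor is chosen so the largest-magnitude weight lands at 127.
    """
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

# Example: a tiny weight vector round-trips with small quantization error.
w = np.array([0.5, -1.27, 0.02, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_approx = dequantize(q, s)
```

Inference kernels then run the expensive matrix math on the INT8 values, which is where a part like the P4, rated for INT8 throughput, earns its keep.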
Google's announcement was made at this year's Google Cloud Next conference, held earlier this week. As of this writing, however, no timeline has been given for when the new hardware will become available. That said, the Tesla P4 should neatly fill the needs of neural-network and high-performance-computing projects that require scalability. In particular, the new offering looks ideal for those who need more power but for whom one or more V100 Tensor Core GPUs would be overkill.