
Google Adds Distributed Training To TensorFlow

April 13, 2016 - Written By Daniel Fuller

Not long ago, Google’s machine learning library, TensorFlow, went open source. Today, Google announced that TensorFlow has been updated to version 0.8, and the update includes a heavily requested feature: TensorFlow can now run as a distributed system across multiple machines. Spreading the work over several machines can greatly speed up training, and it lets users work with models and datasets that would be impractical on one computer. It also widens the range of possible use cases and makes TensorFlow more accessible, since the processing load can be split across a number of smaller machines rather than resting on a single very powerful one. That matters most for large neural networks, the technology behind much of the current wave of artificial intelligence, which are slow and costly to train on a single machine.

Although the star of the update is the ability to run in a distributed environment, there are also a number of under-the-hood fixes, stability tweaks and new features to help developers and users. Along with the distributed runtime, users get a set of new libraries for describing their own clusters, allowing for custom environment setups. These libraries also cover writing a single-machine program that can be replicated across multiple nodes that share data, which makes it easier to train and deploy a wide range of models. On top of that, the tools for analyzing and monitoring running jobs have been improved along with their supporting libraries, making systems built on TensorFlow easier to monitor and maintain regardless of scale.
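To make the idea of replicated workers sharing data more concrete, here is a minimal sketch in plain Python. It is a simulation only and does not use TensorFlow or its API: the cluster layout, the host names and the helper functions `shard`, `gradient` and `train` are all hypothetical, chosen for illustration. It assumes a simple setup in which one "ps" (parameter server) task holds a shared model parameter and every worker runs the same code on its own slice of the data.

```python
# Plain-Python simulation of data-parallel training; NOT TensorFlow code.
# The cluster layout, host names, and helper names are hypothetical.

# A cluster description: job name -> list of task addresses.
cluster = {
    "ps": ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
}


def shard(data, num_workers, task_index):
    """Give a worker every num_workers-th example, starting at its index."""
    return data[task_index::num_workers]


def gradient(w, examples):
    """Mean gradient of the squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in examples) / len(examples)


def train(data, steps=200, learning_rate=0.05):
    num_workers = len(cluster["worker"])
    w = 0.0  # shared parameter, conceptually held by the single "ps" task
    for _ in range(steps):
        # Every worker runs the same code on its own shard of the data.
        grads = [
            gradient(w, shard(data, num_workers, i))
            for i in range(num_workers)
        ]
        # The parameter server averages the gradients and updates w.
        w -= learning_rate * sum(grads) / num_workers
    return w


# Toy dataset generated from y = 3 * x, so training should recover w near 3.
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]
w = train(data)
print(round(w, 3))
```

In a real distributed system the workers would run on separate machines and exchange gradients over the network; here they are simply looped over in one process so the sketch stays self-contained.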

According to Google’s release announcement, the new distributed capability and the library and monitoring upgrades are only the beginning of the changes coming to TensorFlow. Google says it will keep improving the performance of distributed training, through both engineering work and algorithmic improvements. As those improvements land, they will be shared through the project’s GitHub repository, so users can pick up updates as they are implemented upstream.