TensorFlow
| TensorFlow | |
|---|---|
| Developer | Google Brain Team[1] |
| Initial release | November 9, 2015 |
| Stable release | 2.20.0 / August 19, 2025 |
| Repository | github.com/tensorflow/tensorflow |
| Written in | Python, C++, CUDA |
| Platform | Linux, macOS, Windows, Android, JavaScript[2] |
| Type | Machine learning library |
| License | Apache License 2.0 |
| Website | tensorflow.org |
TensorFlow is a software library for machine learning and artificial intelligence. It can be used across a range of tasks, but is used mainly for training and inference of neural networks.[3][4] It is one of the most popular deep learning frameworks, alongside others such as PyTorch.[5] It is free and open-source software released under the Apache License 2.0.
It was developed by the Google Brain team for Google's internal use in research and production.[6][7][8] The initial version was released under the Apache License 2.0 in 2015.[1][9] Google released an updated version, TensorFlow 2.0, in September 2019.[10]
TensorFlow can be used in a wide variety of programming languages, including Python, JavaScript, C++, and Java,[11] facilitating its use in a range of applications in many sectors.
History
DistBelief
Starting in 2011, Google Brain built DistBelief as a proprietary machine learning system based on deep learning neural networks. Its use grew rapidly across diverse Alphabet companies in both research and commercial applications.[12][13] Google assigned multiple computer scientists, including Jeff Dean, to simplify and refactor the codebase of DistBelief into a faster, more robust application-grade library, which became TensorFlow.[14] In 2009, the team, led by Geoffrey Hinton, had implemented generalized backpropagation and other improvements, which allowed generation of neural networks with substantially higher accuracy, for instance a 25% reduction in errors in speech recognition.[15]
TensorFlow
TensorFlow is Google Brain's second-generation system. Version 1.0.0 was released on February 11, 2017.[16] While the reference implementation runs on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA and SYCL extensions for general-purpose computing on graphics processing units).[17] TensorFlow is available on 64-bit Linux, macOS, Windows, and mobile computing platforms including Android and iOS.[18][19]
Its flexible architecture allows for easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices.
TensorFlow computations are expressed as stateful dataflow graphs. The name TensorFlow derives from the operations that such neural networks perform on multidimensional data arrays, which are referred to as tensors.[20] During the Google I/O Conference in June 2016, Jeff Dean stated that 1,500 repositories on GitHub mentioned TensorFlow, of which only 5 were from Google.[21]
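In modern TensorFlow such a dataflow graph is typically built by tracing a Python function with `tf.function`; the following is a minimal sketch (assuming TensorFlow 2.x), not an excerpt from the original system:

```python
import tensorflow as tf

# tf.function traces the Python code into a reusable dataflow graph
# whose nodes are operations and whose edges carry tensors.
@tf.function
def affine(x, w, b):
    return tf.matmul(x, w) + b

x = tf.ones([1, 2])
w = tf.ones([2, 2])
b = tf.zeros([2])
print(affine(x, w, b).numpy())  # [[2. 2.]]

# The traced graph and its operations can be inspected directly.
graph = affine.get_concrete_function(x, w, b).graph
print([op.type for op in graph.get_operations()])
```

The second print shows the graph's operation types (placeholders for inputs, `MatMul`, an add, and an identity for the output), illustrating the operations-on-tensors structure the name refers to.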
In March 2018, Google announced TensorFlow.js version 1.0 for machine learning in JavaScript.[22]
In January 2019, Google announced TensorFlow 2.0.[23] It became officially available in September 2019.[10]
In May 2019, Google announced TensorFlow Graphics for deep learning in computer graphics.[24]
Tensor processing unit (TPU)
In May 2016, Google announced its Tensor processing unit (TPU), an application-specific integrated circuit (ASIC, a hardware chip) built specifically for machine learning and tailored for TensorFlow. A TPU is a programmable AI accelerator designed to provide high throughput of low-precision arithmetic (e.g., 8-bit), and oriented toward using or running models rather than training them. Google announced they had been running TPUs inside their data centers for more than a year, and had found them to deliver an order of magnitude better-optimized performance per watt for machine learning.[25]
In May 2017, Google announced the second-generation TPU, as well as the availability of TPUs in Google Compute Engine.[26] The second-generation TPUs deliver up to 180 teraflops of performance, and when organized into clusters of 64 TPUs, provide up to 11.5 petaflops.[citation needed]
In May 2018, Google announced the third-generation TPUs delivering up to 420 teraflops of performance and 128 GB high bandwidth memory (HBM). Cloud TPU v3 Pods offer 100+ petaflops of performance and 32 TB HBM.[27]
In February 2018, Google announced that they were making TPUs available in beta on the Google Cloud Platform.[28]
Edge TPU
In July 2018, the Edge TPU was announced. Edge TPU is Google's purpose-built ASIC chip designed to run TensorFlow Lite machine learning (ML) models on small client computing devices such as smartphones,[29] a practice known as edge computing.
TensorFlow Lite
In May 2017, Google announced a software stack specifically for mobile development, TensorFlow Lite.[30] In January 2019, the TensorFlow team released a developer preview of the mobile GPU inference engine with OpenGL ES 3.1 Compute Shaders on Android devices and Metal Compute Shaders on iOS devices.[31] In May 2019, Google announced that their TensorFlow Lite Micro (also known as TensorFlow Lite for Microcontrollers) and ARM's uTensor would be merging.[32]
TensorFlow 2.0
As TensorFlow's market share among research papers was declining to the advantage of PyTorch,[33] the TensorFlow team announced the release of a new major version of the library in September 2019. TensorFlow 2.0 introduced many changes, the most significant being TensorFlow eager, which changed the automatic differentiation scheme from the static computational graph to the "define-by-run" scheme originally made popular by Chainer and later PyTorch.[33] Other major changes included removal of old libraries, cross-compatibility between trained models on different versions of TensorFlow, and significant improvements to performance on GPUs.[34]
Features
AutoDifferentiation
AutoDifferentiation is the process of automatically calculating the gradient vector of a model with respect to each of its parameters. With this feature, TensorFlow can automatically compute the gradients for the parameters in a model, which is useful for algorithms such as backpropagation, which require gradients to optimize performance.[35] To do so, the framework must keep track of the order of operations applied to the input tensors in a model, and then compute the gradients with respect to the appropriate parameters.[35]
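As a brief sketch of how this looks in practice (assuming TensorFlow 2.x), `tf.GradientTape` records the operations applied to tensors and then differentiates through them:

```python
import tensorflow as tf

# Record operations on a variable with GradientTape, then ask for the gradient.
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x  # y = x^2

dy_dx = tape.gradient(y, x)  # dy/dx = 2x, so 6.0 at x = 3
print(float(dy_dx))  # 6.0
```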
Eager execution
TensorFlow includes an "eager execution" mode, which means that operations are evaluated immediately rather than being added to a computational graph that is executed later.[36] Code executed eagerly can be examined step by step through a debugger, since data is available at each line of code rather than only after the computational graph runs.[36] This execution paradigm is considered easier to debug because of its step-by-step transparency.[36]
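For example, in eager mode a matrix product returns a concrete value that can be inspected on the very next line (a minimal sketch, assuming TensorFlow 2.x, where eager execution is the default):

```python
import tensorflow as tf

# Each operation runs immediately and returns a concrete tensor.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.matmul(a, a)   # evaluated right here; no session or graph-build step
print(b.numpy())      # [[ 7. 10.] [15. 22.]]
```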
Distribute
In both eager and graph executions, TensorFlow provides an API for distributing computation across multiple devices with various distribution strategies.[37] This distributed computing can often speed up the training and evaluation of TensorFlow models and is a common practice in the field of AI.[37][38]
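A minimal sketch of the strategy API (assuming TensorFlow with Keras available; `MirroredStrategy` falls back to a single replica when only one device is visible):

```python
import tensorflow as tf

# MirroredStrategy synchronously replicates variables across visible devices.
strategy = tf.distribute.MirroredStrategy()
print("replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created inside the scope are mirrored on every replica.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")
```

Training then proceeds with the usual `model.fit` call; the strategy splits each batch across the replicas.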
Losses
To train and assess models, TensorFlow provides a set of loss functions (also known as cost functions).[39] Some popular examples include mean squared error (MSE) and binary cross entropy (BCE).[39]
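A short sketch of both losses in use (assuming TensorFlow 2.x; the labels and predictions here are made up for illustration):

```python
import tensorflow as tf

# Loss functions compare predictions against ground-truth labels.
mse = tf.keras.losses.MeanSquaredError()
bce = tf.keras.losses.BinaryCrossentropy()

y_true = [0.0, 1.0, 1.0]
y_pred = [0.0, 1.0, 0.5]

# MSE averages the squared errors: (0 + 0 + 0.25) / 3
print(float(mse(y_true, y_pred)))
# BCE penalizes confident wrong probabilities more heavily.
print(float(bce(y_true, y_pred)))
```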
Metrics
To assess the performance of machine learning models, TensorFlow gives API access to commonly used metrics. Examples include various accuracy metrics (binary, categorical, sparse categorical) along with other metrics such as precision, recall, and intersection-over-union (IoU).[40]
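Metrics in TensorFlow are stateful objects: `update_state` accumulates observations and `result` reads the running value. A small sketch (assuming TensorFlow 2.x; the labels and scores are illustrative):

```python
import tensorflow as tf

# BinaryAccuracy thresholds predictions at 0.5 by default.
acc = tf.keras.metrics.BinaryAccuracy()
acc.update_state([1, 1, 0, 0], [0.98, 0.6, 0.4, 0.7])  # 3 of 4 correct
print(float(acc.result()))  # 0.75
```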
TF.nn
tf.nn is a module for executing primitive neural network operations on models.[41] Some of these operations include variations of convolutions (1/2/3D, atrous, depthwise), activation functions (softmax, ReLU, GELU, sigmoid, etc.) and their variations, and other operations (max-pooling, bias-add, etc.).[41]
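A brief sketch of a few of these primitives (assuming TensorFlow 2.x; the inputs are toy values):

```python
import tensorflow as tf

x = tf.constant([-1.0, 0.0, 2.0])
print(tf.nn.relu(x).numpy())     # negatives clamped to 0: [0. 0. 2.]
print(tf.nn.softmax(x).numpy())  # normalized to probabilities summing to 1

# A 1-D convolution: batch of 1, width 4, 1 channel, with a width-2 filter.
signal = tf.reshape(tf.constant([1.0, 2.0, 3.0, 4.0]), [1, 4, 1])
kernel = tf.reshape(tf.constant([1.0, 1.0]), [2, 1, 1])
out = tf.nn.conv1d(signal, kernel, stride=1, padding="VALID")
print(out.numpy())  # sums of adjacent pairs: 3, 5, 7
```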
Optimizers
TensorFlow offers a set of optimizers for training neural networks, including Adam, Adagrad, and stochastic gradient descent (SGD).[42] When training a model, different optimizers offer different modes of parameter tuning, often affecting a model's convergence and performance.[43]
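As a sketch of the optimizer API (assuming TensorFlow 2.x), one SGD step on the toy objective f(w) = w² updates the parameter by the learning rate times the gradient:

```python
import tensorflow as tf

# One step of plain SGD on f(w) = w^2; the gradient is 2w.
w = tf.Variable(4.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.25)

with tf.GradientTape() as tape:
    loss = w * w
grads = tape.gradient(loss, [w])
opt.apply_gradients(zip(grads, [w]))

print(float(w))  # 4.0 - 0.25 * 8.0 = 2.0
```

Swapping `SGD` for `Adam` or `Adagrad` changes how the raw gradient is scaled and accumulated, which is what the text means by different modes of parameter tuning.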
Usage and extensions
TensorFlow
TensorFlow serves as a core platform and library for machine learning. TensorFlow's APIs use Keras to allow users to make their own machine-learning models.[34][44] In addition to building and training models, TensorFlow can also load the data used to train them and deploy them using TensorFlow Serving.[45]
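A minimal sketch of that build-train-predict workflow with the Keras API (assuming TensorFlow 2.x; the data is synthetic, generated from y = 2x):

```python
import numpy as np
import tensorflow as tf

# Build a tiny linear regression model with the Keras API.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.05), loss="mse")

# Learn y = 2x from a handful of points.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * x
model.fit(x, y, epochs=300, verbose=0)

print(model.predict(np.array([[4.0]]), verbose=0))  # close to [[8.]]
```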
TensorFlow provides a stable Python Application Programming Interface (API),[46] as well as APIs without backward compatibility guarantees for JavaScript,[47] C++,[48] and Java.[49][11] Third-party language binding packages are also available for C#,[50][51] Haskell,[52] Julia,[53] MATLAB,[54] Object Pascal,[55] R,[56] Scala,[57] Rust,[58] OCaml,[59] and Crystal.[60] Bindings that are now archived and unsupported include Go[61] and Swift.[62]
TensorFlow.js
TensorFlow also has a library for machine learning in JavaScript. Using the provided JavaScript APIs, TensorFlow.js allows users to use either TensorFlow.js models or models converted from TensorFlow or TFLite, retrain them, and run them on the web.[45][63]
LiteRT
LiteRT, formerly known as TensorFlow Lite,[64] has APIs for mobile apps or embedded devices to generate and deploy TensorFlow models.[65] These models are compressed and optimized in order to be more efficient and have a higher performance on smaller capacity devices.[66]
LiteRT uses FlatBuffers as the data serialization format for network models, eschewing the Protocol Buffers format used by standard TensorFlow models.[66]
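As an illustrative sketch of that conversion path (assuming TensorFlow 2.x; the `double` function stands in for a trained model), `tf.lite.TFLiteConverter` produces the FlatBuffers-serialized model as a bytes object:

```python
import tensorflow as tf

# A tf.function standing in for a trained model.
@tf.function(input_signature=[tf.TensorSpec([1, 4], tf.float32)])
def double(x):
    return x * 2.0

# Convert to the FlatBuffers-based .tflite representation.
converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [double.get_concrete_function()]
)
tflite_bytes = converter.convert()

# The result is an in-memory FlatBuffer, ready to write to a .tflite file.
print(type(tflite_bytes).__name__, len(tflite_bytes))
```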
TFX
TensorFlow Extended (TFX) provides numerous components to perform all the operations needed for end-to-end production.[67] Components include loading, validating, and transforming data, tuning, training, and evaluating the machine learning model, and pushing the model itself into production.[45][67]
Integrations
NumPy
NumPy is one of the most popular Python data libraries, and TensorFlow offers integration and compatibility with its data structures.[68] NumPy ndarrays, the library's native datatype, are automatically converted to TensorFlow Tensors in TF operations, and vice versa.[68] This allows the two libraries to work in unison without requiring the user to write explicit data conversions. Moreover, the integration extends to memory optimization by having TF Tensors share the underlying memory representations of NumPy ndarrays whenever possible.[68]
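A small sketch of the interoperability in both directions (assuming TensorFlow 2.x and NumPy installed):

```python
import numpy as np
import tensorflow as tf

nd = np.array([[1.0, 2.0], [3.0, 4.0]])

# NumPy arrays are accepted directly by TF operations...
t = tf.square(nd)        # returns a tf.Tensor
print(type(t).__name__)

# ...and Tensors convert back to NumPy, explicitly or implicitly.
back = t.numpy()
print(np.add(back, 1.0))  # NumPy consumes the result without conversion code
```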
Extensions
TensorFlow also offers a variety of libraries and extensions to advance and extend the models and methods used.[69] For example, TensorFlow Recommenders and TensorFlow Graphics are libraries for their respective functionalities.[70] Other add-ons, libraries, and frameworks include TensorFlow Model Optimization, TensorFlow Probability, TensorFlow Quantum, and TensorFlow Decision Forests.[69][70]
Google Colab
Google also released Colaboratory, a TensorFlow Jupyter notebook environment that does not require any setup.[71] It runs on Google Cloud and allows users free access to GPUs and the ability to store and share notebooks on Google Drive.[72]
Google JAX
Google JAX is a machine learning framework for transforming numerical functions.[73][74][75] It is described as bringing together a modified version of autograd (automatic obtaining of the gradient function through differentiation of a function) and TensorFlow's XLA (Accelerated Linear Algebra). It is designed to follow the structure and workflow of NumPy as closely as possible and works with TensorFlow as well as other frameworks such as PyTorch. The primary functions of JAX are:[73]
- grad: automatic differentiation
- jit: compilation
- vmap: auto-vectorization
- pmap: SPMD programming
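A brief sketch of the first three transformations, which are freely composable (assuming JAX is installed; `pmap` is omitted since it requires multiple devices):

```python
import jax
import jax.numpy as jnp

f = lambda x: x ** 3

df = jax.grad(f)            # automatic differentiation: df/dx = 3x^2
print(df(2.0))              # 12.0

fast_f = jax.jit(f)         # XLA-compiled version of f
print(fast_f(2.0))          # 8.0

batched_df = jax.vmap(df)   # apply df element-wise over a batch
print(batched_df(jnp.array([1.0, 2.0, 3.0])))  # [ 3. 12. 27.]
```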
Applications
Medical
GE Healthcare used TensorFlow to increase the speed and accuracy of MRIs in identifying specific body parts.[76] Google used TensorFlow to create DermAssist, a free mobile application that allows users to take pictures of their skin and identify potential health complications.[77] Sinovation Ventures used TensorFlow to identify and classify eye diseases from optical coherence tomography (OCT) scans.[77]
Social media
Twitter implemented TensorFlow to rank tweets by importance for a given user, and changed their platform to show tweets in order of this ranking.[78] Previously, tweets were simply shown in reverse chronological order.[78] The photo sharing app VSCO used TensorFlow to help suggest custom filters for photos.[77]
Search Engine
Google officially released RankBrain on October 26, 2015, backed by TensorFlow.[79]
Education
InSpace, a virtual learning platform, used TensorFlow to filter out toxic chat messages in classrooms.[80] Liulishuo, an online English learning platform, utilized TensorFlow to create an adaptive curriculum for each student.[81] TensorFlow was used to accurately assess a student's current abilities, and also helped decide the best future content to show based on those capabilities.[81]
Retail
The e-commerce platform Carousell used TensorFlow to provide personalized recommendations for customers.[77] The cosmetics company ModiFace used TensorFlow to create an augmented reality experience for customers to test various shades of make-up on their face.[82]
Research
TensorFlow is the foundation for the automated image-captioning software DeepDream.[83]
References
- ^ a b "Credits". TensorFlow.org. Archived from the original on November 17, 2015. Retrieved November 10, 2015.
- ^ "TensorFlow.js". Archived from the original on May 6, 2018. Retrieved June 28, 2018.
- ^ Abadi, Martín; Barham, Paul; Chen, Jianmin; Chen, Zhifeng; Davis, Andy; Dean, Jeffrey; Devin, Matthieu; Ghemawat, Sanjay; Irving, Geoffrey; Isard, Michael; Kudlur, Manjunath; Levenberg, Josh; Monga, Rajat; Moore, Sherry; Murray, Derek G.; Steiner, Benoit; Tucker, Paul; Vasudevan, Vijay; Warden, Pete; Wicke, Martin; Yu, Yuan; Zheng, Xiaoqiang (2016). TensorFlow: A System for Large-Scale Machine Learning (PDF). Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’16). arXiv:1605.08695. Archived (PDF) from the original on December 12, 2020. Retrieved October 26, 2020.
- ^ TensorFlow: Open source machine learning. Google. 2015. Archived from the original on November 11, 2021. "It is machine learning software being used for various kinds of perceptual and language understanding tasks" – Jeffrey Dean, minute 0:47 / 2:17 from YouTube clip
- ^ "Top 30 Open Source Projects". Open Source Project Velocity by CNCF. Archived from the original on September 3, 2023. Retrieved October 12, 2023.
- ^ Video clip by Google about TensorFlow 2015 at minute 0:15/2:17
- ^ Video clip by Google about TensorFlow 2015 at minute 0:26/2:17
- ^ Dean et al 2015, p. 2
- ^ Metz, Cade (November 9, 2015). "Google Just Open Sourced TensorFlow, Its Artificial Intelligence Engine". Wired. Archived from the original on November 9, 2015. Retrieved November 10, 2015.
- ^ a b TensorFlow (September 30, 2019). "TensorFlow 2.0 is now available!". Medium. Archived from the original on October 7, 2019. Retrieved November 24, 2019.
- ^ a b "API Documentation". Archived from the original on November 16, 2015. Retrieved June 27, 2018.
- ^ Dean, Jeff; Monga, Rajat; et al. (November 9, 2015). "TensorFlow: Large-scale machine learning on heterogeneous systems" (PDF). TensorFlow.org. Google Research. Archived (PDF) from the original on November 20, 2015. Retrieved November 10, 2015.
- ^ Perez, Sarah (November 9, 2015). "Google Open-Sources The Machine Learning Tech Behind Google Photos Search, Smart Reply And More". TechCrunch. Archived from the original on November 9, 2015. Retrieved November 11, 2015.
- ^ Oremus, Will (November 9, 2015). "What Is TensorFlow, and Why Is Google So Excited About It?". Slate. Archived from the original on November 10, 2015. Retrieved November 11, 2015.
- ^ Ward-Bailey, Jeff (November 25, 2015). "Google chairman: We're making 'real progress' on artificial intelligence". CSMonitor. Archived from the original on September 16, 2015. Retrieved November 25, 2015.
- ^ TensorFlow Developers (2022). "Tensorflow Release 1.0.0". GitHub. doi:10.5281/zenodo.4724125. Archived from the original on February 27, 2021. Retrieved July 24, 2017.
- ^ Metz, Cade (November 10, 2015). "TensorFlow, Google's Open Source AI, Points to a Fast-Changing Hardware World". Wired. Archived from the original on November 11, 2015. Retrieved November 11, 2015.
- ^ Kudale, Aniket Eknath (June 8, 2020). "Building a Facial Expression Recognition App Using TensorFlow.js". Open Source For U. Archived from the original on October 11, 2024. Retrieved April 19, 2025.
- ^ MSV, Janakiram (February 24, 2021). "The Ultimate Guide to Machine Learning Frameworks". The New Stack. Archived from the original on December 24, 2024. Retrieved April 19, 2025.
- ^ "Introduction to tensors". tensorflow.org. Archived from the original on May 26, 2024. Retrieved March 3, 2024.
- ^ Machine Learning: Google I/O 2016, minute 07:30/44:44. Archived December 21, 2016, at the Wayback Machine. Retrieved June 5, 2016.
- ^ TensorFlow (March 30, 2018). "Introducing TensorFlow.js: Machine Learning in Javascript". Medium. Archived from the original on March 30, 2018. Retrieved May 24, 2019.
- ^ TensorFlow (January 14, 2019). "What's coming in TensorFlow 2.0". Medium. Archived from the original on January 14, 2019. Retrieved May 24, 2019.
- ^ TensorFlow (May 9, 2019). "Introducing TensorFlow Graphics: Computer Graphics Meets Deep Learning". Medium. Archived from the original on May 9, 2019. Retrieved May 24, 2019.
- ^ Jouppi, Norm. "Google supercharges machine learning tasks with TPU custom chip". Google Cloud Platform Blog. Archived from the original on May 18, 2016. Retrieved May 19, 2016.
- ^ "Build and train machine learning models on our new Google Cloud TPUs". Google. May 17, 2017. Archived from the original on May 17, 2017. Retrieved May 18, 2017.
- ^ "Cloud TPU". Google Cloud. Archived from the original on May 17, 2017. Retrieved May 24, 2019.
- ^ "Cloud TPU machine learning accelerators now available in beta". Google Cloud Platform Blog. Archived from the original on February 12, 2018. Retrieved February 12, 2018.
- ^ Kundu, Kishalaya (July 26, 2018). "Google Announces Edge TPU, Cloud IoT Edge at Cloud Next 2018". Beebom. Archived from the original on May 26, 2024. Retrieved February 2, 2019.
- ^ Vincent, James (May 17, 2017). "Google's new machine learning framework is going to put more AI on your phone". The Verge. Archived from the original on May 17, 2017. Retrieved May 19, 2017.
- ^ TensorFlow (January 16, 2019). "TensorFlow Lite Now Faster with Mobile GPUs (Developer Preview)". Medium. Archived from the original on January 16, 2019. Retrieved May 24, 2019.
- ^ "uTensor and Tensor Flow Announcement | Mbed". os.mbed.com. Archived from the original on May 9, 2019. Retrieved May 24, 2019.
- ^ a b He, Horace (October 10, 2019). "The State of Machine Learning Frameworks in 2019". The Gradient. Archived from the original on October 10, 2019. Retrieved May 22, 2020.
- ^ a b Ciaramella, Alberto; Ciaramella, Marco (July 2024). Introduction to Artificial Intelligence: from data analysis to generative AI. Intellisemantic Editions. ISBN 9788894787603.
- ^ a b "Introduction to gradients and automatic differentiation". TensorFlow. Archived from the original on October 28, 2021. Retrieved November 4, 2021.
- ^ a b c "Eager execution | TensorFlow Core". TensorFlow. Archived from the original on November 4, 2021. Retrieved November 4, 2021.
- ^ a b "Module: tf.distribute | TensorFlow Core v2.6.1". TensorFlow. Archived from the original on May 26, 2024. Retrieved November 4, 2021.
- ^ Sigeru., Omatu (2014). Distributed Computing and Artificial Intelligence, 11th International Conference. Springer International Publishing. ISBN 978-3-319-07593-8. OCLC 980886715.
- ^ a b "Module: tf.losses | TensorFlow Core v2.6.1". TensorFlow. Archived from the original on October 27, 2021. Retrieved November 4, 2021.
- ^ "Module: tf.metrics | TensorFlow Core v2.6.1". TensorFlow. Archived from the original on November 4, 2021. Retrieved November 4, 2021.
- ^ a b "Module: tf.nn | TensorFlow Core v2.7.0". TensorFlow. Archived from the original on May 26, 2024. Retrieved November 6, 2021.
- ^ "Module: tf.optimizers | TensorFlow Core v2.7.0". TensorFlow. Archived from the original on October 30, 2021. Retrieved November 6, 2021.
- ^ Dogo, E. M.; Afolabi, O. J.; Nwulu, N. I.; Twala, B.; Aigbavboa, C. O. (December 2018). "A Comparative Analysis of Gradient Descent-Based Optimization Algorithms on Convolutional Neural Networks". 2018 International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS). pp. 92–99. doi:10.1109/CTEMS.2018.8769211. ISBN 978-1-5386-7709-4. S2CID 198931032.
- ^ "TensorFlow Core | Machine Learning for Beginners and Experts". TensorFlow. Archived from the original on January 20, 2023. Retrieved November 4, 2021.
- ^ a b c "Introduction to TensorFlow". TensorFlow. Archived from the original on January 20, 2023. Retrieved October 28, 2021.
- ^ "All symbols in TensorFlow 2 | TensorFlow Core v2.7.0". TensorFlow. Archived from the original on November 6, 2021. Retrieved November 6, 2021.
- ^ "TensorFlow.js". js.tensorflow.org. Archived from the original on May 26, 2024. Retrieved November 6, 2021.
- ^ "TensorFlow C++ API Reference | TensorFlow Core v2.7.0". TensorFlow. Archived from the original on January 20, 2023. Retrieved November 6, 2021.
- ^ "org.tensorflow | Java". TensorFlow. Archived from the original on November 6, 2021. Retrieved November 6, 2021.
- ^ Icaza, Miguel de (February 17, 2018). "TensorFlowSharp: TensorFlow API for .NET languages". GitHub. Archived from the original on July 24, 2017. Retrieved February 18, 2018.
- ^ Chen, Haiping (December 11, 2018). "TensorFlow.NET: .NET Standard bindings for TensorFlow". GitHub. Archived from the original on July 12, 2019. Retrieved December 11, 2018.
- ^ "haskell: Haskell bindings for TensorFlow". tensorflow. February 17, 2018. Archived from the original on July 24, 2017. Retrieved February 18, 2018.
- ^ Malmaud, Jon (August 12, 2019). "A Julia wrapper for TensorFlow". GitHub. Archived from the original on July 24, 2017. Retrieved August 14, 2019. "operations like sin, * (matrix multiplication), .* (element-wise multiplication), etc [..]. Compare to Python, which requires learning specialized namespaced functions like tf.matmul."
- ^ "A MATLAB wrapper for TensorFlow Core". GitHub. November 3, 2019. Archived from the original on September 14, 2020. Retrieved February 13, 2020.
- ^ "Use TensorFlow from Pascal (FreePascal, Lazarus, etc.)". GitHub. January 19, 2023. Archived from the original on January 20, 2023. Retrieved January 20, 2023.
- ^ "tensorflow: TensorFlow for R". RStudio. February 17, 2018. Archived from the original on January 4, 2017. Retrieved February 18, 2018.
- ^ Platanios, Anthony (February 17, 2018). "tensorflow_scala: TensorFlow API for the Scala Programming Language". GitHub. Archived from the original on February 18, 2019. Retrieved February 18, 2018.
- ^ "rust: Rust language bindings for TensorFlow". tensorflow. February 17, 2018. Archived from the original on July 24, 2017. Retrieved February 18, 2018.
- ^ Mazare, Laurent (February 16, 2018). "tensorflow-ocaml: OCaml bindings for TensorFlow". GitHub. Archived from the original on June 11, 2018. Retrieved February 18, 2018.
- ^ "fazibear/tensorflow.cr". GitHub. Archived from the original on June 27, 2018. Retrieved October 10, 2018.
- ^ "tensorflow package - github.com/tensorflow/tensorflow/tensorflow/go - pkg.go.dev". pkg.go.dev. Archived from the original on November 6, 2021. Retrieved November 6, 2021.
- ^ "Swift for TensorFlow (In Archive Mode)". TensorFlow. Archived from the original on November 6, 2021. Retrieved November 6, 2021.
- ^ "TensorFlow.js | Machine Learning for JavaScript Developers". TensorFlow. Archived from the original on November 4, 2021. Retrieved October 28, 2021.
- ^ "LiteRT Overview | Google AI Edge". Google AI for Developers. Retrieved May 7, 2025.
- ^ "TensorFlow Lite | ML for Mobile and Edge Devices". TensorFlow. Archived from the original on November 4, 2021. Retrieved November 1, 2021.
- ^ a b "TensorFlow Lite". TensorFlow. Archived from the original on November 2, 2021. Retrieved November 1, 2021.
- ^ a b "TensorFlow Extended (TFX) | ML Production Pipelines". TensorFlow. Archived from the original on November 4, 2021. Retrieved November 2, 2021.
- ^ a b c "Customization basics: tensors and operations | TensorFlow Core". TensorFlow. Archived from the original on November 6, 2021. Retrieved November 6, 2021.
- ^ a b "Guide | TensorFlow Core". TensorFlow. Archived from the original on July 17, 2019. Retrieved November 4, 2021.
- ^ a b "Libraries & extensions". TensorFlow. Archived from the original on November 4, 2021. Retrieved November 4, 2021.
- ^ "Colaboratory – Google". research.google.com. Archived from the original on October 24, 2017. Retrieved November 10, 2018.
- ^ "Google Colaboratory". colab.research.google.com. Archived from the original on February 3, 2021. Retrieved November 6, 2021.
- ^ a b Bradbury, James; Frostig, Roy; Hawkins, Peter; Johnson, Matthew James; Leary, Chris; MacLaurin, Dougal; Necula, George; Paszke, Adam; Vanderplas, Jake; Wanderman-Milne, Skye; Zhang, Qiao (June 18, 2022), "JAX: Autograd and XLA", Astrophysics Source Code Library, Google, Bibcode:2021ascl.soft11002B, archived from the original on June 18, 2022, retrieved June 18, 2022
- ^ "Using JAX to accelerate our research". www.deepmind.com. December 4, 2020. Archived from the original on June 18, 2022. Retrieved June 18, 2022.
- ^ "Why is Google's JAX so popular?". Analytics India Magazine. April 25, 2022. Archived from the original on June 18, 2022. Retrieved June 18, 2022.
- ^ "Intelligent Scanning Using Deep Learning for MRI". Archived from the original on November 4, 2021. Retrieved November 4, 2021.
- ^ a b c d "Case Studies and Mentions". TensorFlow. Archived from the original on October 26, 2021. Retrieved November 4, 2021.
- ^ a b "Ranking Tweets with TensorFlow". Archived from the original on November 4, 2021. Retrieved November 4, 2021.
- ^ Davies, Dave (September 2, 2020). "A Complete Guide to the Google RankBrain Algorithm". Search Engine Journal. Archived from the original on November 6, 2021. Retrieved October 15, 2024.
- ^ "InSpace: A new video conferencing platform that uses TensorFlow.js for toxicity filters in chat". Archived from the original on November 4, 2021. Retrieved November 4, 2021.
- ^ a b Xulin. "流利说基于 TensorFlow 的自适应系统实践". Weixin Official Accounts Platform. Archived from the original on November 6, 2021. Retrieved November 4, 2021.
- ^ "How Modiface utilized TensorFlow.js in production for AR makeup try on in the browser". Archived from the original on November 4, 2021. Retrieved November 4, 2021.
- ^ Byrne, Michael (November 11, 2015). "Google Offers Up Its Entire Machine Learning Library as Open-Source Software". Vice. Archived from the original on January 25, 2021. Retrieved November 11, 2015.
Further reading
- Moroney, Laurence (October 1, 2020). AI and Machine Learning for Coders (1st ed.). O'Reilly Media. p. 365. ISBN 9781492078197. Archived from the original on June 7, 2021. Retrieved December 21, 2020.
- Géron, Aurélien (October 15, 2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (2nd ed.). O'Reilly Media. p. 856. ISBN 9781492032632. Archived from the original on May 1, 2021. Retrieved November 25, 2019.
- Ramsundar, Bharath; Zadeh, Reza Bosagh (March 23, 2018). TensorFlow for Deep Learning (1st ed.). O'Reilly Media. p. 256. ISBN 9781491980446. Archived from the original on June 7, 2021. Retrieved November 25, 2019.
- Hope, Tom; Resheff, Yehezkel S.; Lieder, Itay (August 27, 2017). Learning TensorFlow: A Guide to Building Deep Learning Systems (1st ed.). O'Reilly Media. p. 242. ISBN 9781491978504. Archived from the original on March 8, 2021. Retrieved November 25, 2019.
- Shukla, Nishant (February 12, 2018). Machine Learning with TensorFlow (1st ed.). Manning Publications. p. 272. ISBN 9781617293870.
External links

TensorFlow
Overview
Definition and Purpose
TensorFlow is an open-source software library for numerical computation using dataflow graphs, serving as a flexible interface for defining and training machine learning models, particularly deep neural networks, through operations on multidimensional arrays known as tensors.[10] Developed by the Google Brain team, it was first released in November 2015 under the Apache 2.0 open-source license, enabling widespread adoption for research and production applications.[10] At its core, TensorFlow facilitates the expression and execution of machine learning algorithms across diverse hardware platforms, from mobile devices to large-scale clusters, supporting tasks in fields such as computer vision, natural language processing, and speech recognition.[10]

The primary purposes of TensorFlow include enabling efficient numerical computation, differentiable programming for gradient-based optimization, and scalable model deployment in varied environments.[11] It provides tools for building models that can run seamlessly on desktops, servers, mobile devices, and embedded systems, making it suitable for both prototyping and production-scale machine learning workflows.[3] This end-to-end platform emphasizes ease of use for beginners and experts alike, with high-level APIs like Keras integrated for rapid model development.[4]

In TensorFlow, tensors represent the fundamental data structure as multi-dimensional arrays of elements sharing a uniform data type (dtype), allowing for operations such as element-wise addition, matrix multiplication, and reshaping.[12] For instance, a scalar tensor has shape [], a vector has shape [d1], and a matrix has shape [d1, d2], where d1 and d2 denote the dimensions; these shapes enable efficient handling of data batches, feature vectors, and image pixels in machine learning pipelines.[12] By leveraging tensor operations within dataflow graphs, TensorFlow optimizes computations for performance and parallelism, underpinning its role in scalable machine learning.[10]
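Those shape conventions can be seen directly (a small sketch, assuming TensorFlow 2.x):

```python
import tensorflow as tf

scalar = tf.constant(3.0)                   # shape []
vector = tf.constant([1.0, 2.0, 3.0])       # shape [3]
matrix = tf.constant([[1.0, 2.0],
                      [3.0, 4.0],
                      [5.0, 6.0]])          # shape [3, 2]

print(scalar.shape, vector.shape, matrix.shape)
print(tf.reshape(matrix, [2, 3]).shape)     # reshape preserves element count
```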
Design Philosophy
TensorFlow's design philosophy centers on the use of dataflow graphs to represent computations, where nodes represent operations and edges represent multidimensional data arrays known as tensors. This model allows for efficient expression of complex numerical computations by defining a directed graph that captures dependencies between operations, enabling optimizations such as parallel execution and fusion of subgraphs. By structuring machine learning algorithms as these graphs, TensorFlow facilitates both static optimization during graph construction and dynamic execution, promoting flexibility in model design and deployment.[10] A core principle is portability across diverse hardware and platforms, ensuring that models can run with minimal modifications on CPUs, GPUs, TPUs, as well as desktop, mobile, web, and cloud environments. This is achieved through a unified execution engine that abstracts hardware-specific details, allowing seamless scaling from single devices to large distributed systems. The emphasis on portability supports heterogeneous computing, where computations can migrate between devices, without altering the core model logic.[3] TensorFlow adopts an end-to-end approach to machine learning, encompassing the entire workflow from data ingestion and preprocessing to model training, evaluation, and deployment in production. This holistic design enables practitioners to build, deploy, and manage models within a single ecosystem, reducing fragmentation and accelerating development cycles. Tools like TensorFlow Extended (TFX) integrate these stages, ensuring reproducibility and scalability for real-world applications.[3][13] Modularity and extensibility are foundational, with composable operations that allow users to assemble custom models from reusable building blocks, fostering experimentation and adaptability. 
TensorFlow supports user-defined operations through a registration mechanism, enabling extensions for domain-specific needs while maintaining compatibility. The framework was open-sourced under the Apache 2.0 license to encourage community contributions, democratizing access to advanced machine learning tools and driving rapid innovation through collaborative development.[10][3]
Installation
TensorFlow is typically installed using pip within a Python environment. The standard command for the CPU version is pip install tensorflow.
GPU support on Windows with NVIDIA GPUs depends on the TensorFlow version. Native Windows GPU support was discontinued after TensorFlow 2.10, so later releases (including TensorFlow 2.17 and newer) require WSL2 (Windows Subsystem for Linux 2) for NVIDIA GPU acceleration.[14]
Key requirements for WSL2 GPU support:
- NVIDIA GPU with CUDA compute capability 3.5 or higher (prebuilt binaries in 2.17+ support 6.0+; 5.0 requires building from source).
- NVIDIA GPU drivers >= 528.33.
- CUDA Toolkit 12.3.
- cuDNN 8.9.7.
- Installation via pip install tensorflow[and-cuda] in a WSL2 environment (Windows 10 version 19044 or higher).
Alternatively, the last release line with native Windows GPU support can be installed with pip install "tensorflow<2.11".[14][15]
Users should consult the official TensorFlow documentation for the most current installation instructions and additional details.
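A quick sanity check after installation, sketched minimally, is to query the installed version, list visible GPUs, and run a trivial computation:

```python
import tensorflow as tf

# Report the installed version and any GPUs TensorFlow can see
# (an empty list on CPU-only installs is expected, not an error).
print(tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
print(gpus)

# A trivial computation confirms the runtime works end to end.
total = tf.reduce_sum(tf.constant([1.0, 2.0, 3.0]))
print(total)
```

On a correctly configured WSL2 GPU setup, the device list should contain at least one `PhysicalDevice` entry of type GPU.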
History
DistBelief and Early Development
DistBelief was Google's proprietary deep learning framework, developed in 2011 as part of the Google Brain project, which was co-founded by Jeff Dean to advance artificial intelligence through large-scale neural networks.[16][17] The framework enabled the training of massive deep neural networks on computing clusters comprising thousands of machines, marking a significant advancement in scaling deep learning beyond single-machine capabilities.[18] A core innovation of DistBelief was its support for distributed training techniques, such as Downpour Stochastic Gradient Descent (SGD) and Sandblaster, which allowed asynchronous updates across parameter servers and workers to handle models with billions of parameters efficiently.[18] This capability was demonstrated in applications like large-scale image recognition, where DistBelief trained networks to process vast datasets, achieving state-of-the-art performance on tasks such as object detection in videos from YouTube.[18] These features underscored the framework's role in pushing the boundaries of deep learning at Google, particularly for perception-based AI systems. However, DistBelief's proprietary nature and tight integration with Google's internal infrastructure limited its flexibility, portability, and accessibility for users outside the company.[19] Recognizing these constraints, the Google Brain team, under Jeff Dean's leadership, decided to rebuild the system from the ground up, resulting in the open-source TensorFlow framework released in 2015.[19][16]
Initial Release and Growth
TensorFlow was publicly released as an open-source project on November 9, 2015, under the Apache License 2.0, marking Google's transition from the internal DistBelief system to a broadly accessible machine learning framework.[19] The initial release focused on providing a flexible platform for numerical computation using dataflow graphs, with the first tagged version, 0.5.0, following shortly on November 26, 2015.[20] Development progressed rapidly, culminating in the stable version 1.0 on February 15, 2017, which stabilized the core Python API and introduced experimental support for Java and Go. This milestone reflected iterative improvements driven by early user feedback, enabling more reliable deployment in production environments.[21] Key features in the early versions emphasized static computation graphs, where models were defined as directed acyclic graphs before execution, allowing for optimizations like parallelization and distribution across devices.[19] The framework provided primary APIs in Python for high-level model building and C++ for low-level performance-critical operations, supporting a range of hardware from CPUs to GPUs.[19] These elements facilitated efficient training of deep neural networks, with built-in support for operations like convolutions and matrix multiplications essential for computer vision and natural language processing tasks. 
Adoption surged following the release, with TensorFlow integrated into several Google products, including search functionality in Google Photos for image recognition and neural machine translation in Google Translate.[22] By its first anniversary in 2016, the project had attracted contributions from over 480 individuals, including more than 200 external developers, fostering a vibrant ecosystem.[23] Community engagement propelled growth, as evidenced by the GitHub repository amassing over 140,000 stars by early 2020, signaling widespread interest among researchers and practitioners.[24] Despite its momentum, early TensorFlow faced challenges, notably a steep learning curve stemming from the graph-based execution mode, which required users to separate model definition from runtime evaluation, complicating debugging and iteration.[25] This paradigm, while powerful for optimization, contrasted with more intuitive dynamic execution approaches and initially hindered accessibility for beginners.[26] Nonetheless, external contributions helped address these issues through enhancements to documentation and tooling, solidifying TensorFlow's position as a cornerstone of machine learning development.
Hardware Innovations: TPUs and Edge TPUs
Google developed the Tensor Processing Unit (TPU) as an application-specific integrated circuit (ASIC) optimized for accelerating neural network computations, particularly matrix multiplications central to deep learning workloads. Announced in May 2016 at Google I/O, the TPU had already been deployed internally in Google's data centers for over a year to power services like AlphaGo and Google Photos, addressing the limitations of general-purpose processors in handling the high-throughput tensor operations required by TensorFlow. The first cloud-accessible version became available in beta in 2017, providing external developers with access to this hardware through Google Cloud Platform.[27][28] At its core, the TPU architecture leverages systolic arrays to enable efficient, high-throughput execution of tensor operations, minimizing data movement and maximizing computational density. The inaugural TPU v1 featured a 256×256 systolic array comprising 65,536 8-bit multiply-accumulate units, operating at 700 MHz on a 28 nm process with a 40 W power envelope and a 24 MB unified buffer for activations and weights. 
Subsequent iterations scaled this design for greater efficiency: TPU v2 (2017) added floating-point (bfloat16) support, enabling training as well as inference; v3 (2018) introduced liquid cooling and roughly doubled peak performance; v4 (2020) enhanced interconnect bandwidth; v5 (2023) delivered up to 2.3× better price-performance over v4 through innovations in chip density and memory bandwidth, achieving pod-scale configurations with thousands of chips for large-scale training; Trillium v6 (2024) offered a 4.7× per-chip performance improvement over v5e with doubled high-bandwidth memory capacity; and Ironwood v7 (2025) focused on inference with up to 4× better performance for generative AI workloads.[29][30] This evolution has progressively improved energy efficiency: a 2025 life-cycle assessment found up to 3× better carbon efficiency for AI workloads from TPU v4 (2020) to Trillium (2024).[31][32][33] TPUs integrate natively with TensorFlow via a dedicated compiler that maps computational graphs directly to TPU instructions, enabling seamless execution without extensive code modifications.[31][32][33] In 2018, Google extended TPU technology to edge devices with the Edge TPU, a compact ASIC tailored for on-device machine learning inference in resource-constrained environments. Announced in July 2018 as part of the Coral platform, the Edge TPU delivers up to 4 trillion operations per second (TOPS) at under 2 watts, making it ideal for always-on applications in Internet of Things (IoT) devices such as smart cameras and wearables.
Integrated into Coral development kits, including system-on-modules and USB accelerators, it supports TensorFlow Lite models for quantized inference, enabling local processing to reduce latency and enhance privacy without relying on cloud connectivity.[34][35] The adoption of TPUs has significantly accelerated TensorFlow-based workflows, offering 15–30× higher performance than contemporary GPUs for inference tasks on the first generation, with later versions providing up to 100× efficiency gains in specific large-scale training scenarios due to optimized systolic execution and interconnects. Early access was exclusive to TensorFlow, allowing Google to refine hardware-software co-design before broader framework support, which has since expanded but maintains TensorFlow as the primary interface for peak performance.[31][32]
TensorFlow 2.0 and Recent Developments
TensorFlow 2.0 was released on September 30, 2019, marking a major overhaul that addressed limitations in the previous version by integrating Keras as the default high-level API for model building and training.[36] This shift simplified the development process, allowing users to leverage Keras's intuitive interface directly within TensorFlow without needing separate installations.[36] Additionally, eager execution became the default mode, enabling immediate evaluation of operations like standard Python code, which facilitated faster prototyping, easier debugging, and better integration with debugging tools.[36] These changes improved overall stability through extensive community feedback and real-world testing, such as deployment at Google News.[36] Subsequent releases from versions 2.1 to 2.10, spanning 2020 to 2022, focused on enhancing usability and introducing privacy-preserving capabilities, including support for federated learning through the TensorFlow Federated (TFF) framework.[37] TFF, an open-source extension for machine learning on decentralized data, enabled collaborative model training without sharing raw data, integrating seamlessly with TensorFlow's core APIs to promote secure, distributed computations.[37] These updates also included refinements to Keras for better transformer support, deterministic operations, and performance optimizations via oneDNN, contributing to a more robust ecosystem.[38] In 2023, TensorFlow 2.15 introduced compatibility with Keras 3.0, which supports multiple backends including JAX, allowing models to run on JAX accelerators while maintaining TensorFlow's API consistency.[39] This release simplified GPU installations on Linux by bundling CUDA libraries and enhanced tf.function for better type handling and faster computations without gradients.[40] By August 2025, TensorFlow 2.20 further advanced the C++ API through LiteRT, a new inference runtime with Kotlin and C++ interfaces for on-device deployment, replacing legacy 
tf.lite modules.[41] It added support for NumPy 2.0 compatibility and optimizations for Python 3.13, including autotuning in tf.data for reduced input pipeline latency and zero-copy buffer handling for improved speed and memory efficiency on NPUs and GPUs.[42][41] Throughout these developments, the TensorFlow community emphasized ecosystem maturity by deprecating and removing unstable contrib modules starting in version 2.0, migrating their functionality to core APIs or separate projects like TFF to ensure long-term stability and cleaner codebases. This focus has solidified TensorFlow's role as a production-ready platform, with ongoing contributions from a broad developer base enhancing deployment tools and interoperability.[11]
Technical Architecture
Computation Graphs and Execution Modes
TensorFlow represents computations as dataflow graphs, where nodes correspond to operations (such as mathematical functions or data movements) and edges represent multidimensional data arrays known as tensors.[43] These graphs enable efficient execution across diverse hardware, including CPUs, GPUs, and specialized accelerators, by allowing optimizations like parallelization and fusion of operations.[44] In TensorFlow 1.x, the primary execution paradigm relied on static computation graphs, where developers first define the entire graph structure—specifying operations and their dependencies—before executing it in a session.[45] This define-then-run approach, using constructs like placeholders for inputs and sessions for execution, facilitated graph-level optimizations such as constant folding and dead code elimination via the Grappler optimizer, but required explicit graph management that could complicate debugging.[45] Static graphs excelled in production deployment, enabling portability to environments without Python interpreters, such as mobile devices or embedded systems.[45] TensorFlow 2.x shifted the default to eager execution, an imperative mode where operations are evaluated immediately upon invocation, without building an explicit graph upfront.[46] Introduced experimentally in TensorFlow 1.5 and made the standard in 2.0, eager execution mirrors Python's dynamic nature, allowing seamless integration with control structures like loops and conditionals, and providing instant feedback for shapes, values, and errors during development.[45] This mode enhances flexibility and prototyping speed, particularly for research workflows, though it incurs higher overhead for repeated small operations due to Python interpreter involvement.[45] To combine the debugging ease of eager execution with the performance of static graphs, TensorFlow offers tf.function, which decorates Python functions to automatically trace and convert them into optimized graphs.[47] Upon first
call with specific input types, tf.function uses AutoGraph to transform the code into a tf.Graph representation, creating a callable ConcreteFunction that caches and reuses the graph for subsequent invocations with matching signatures, avoiding retracing overhead.[47] This hybrid approach supports seamless transitions: developers write and test in eager mode, then apply tf.function for acceleration in training loops or inference, yielding up to several times faster execution for compute-intensive models on GPU or TPU hardware.[47]
A representative example is the computation y = x² + z, where x and z are input tensors. In eager execution, the operations tf.square(x) and tf.add are performed step-by-step as Python statements execute.[45] Wrapping this in @tf.function traces it into a static graph on the initial run: inputs flow through the squaring node, then the addition node, with the resulting graph executed efficiently thereafter, visualizing the flow as a directed acyclic graph where tensor values propagate forward without intermediate Python calls.[47]
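A minimal sketch of the two modes, assuming the example computation is y = x² + z:

```python
import tensorflow as tf

x = tf.constant(3.0)
z = tf.constant(2.0)

# Eager mode: each operation executes immediately, like plain Python.
y_eager = tf.add(tf.square(x), z)  # 3.0**2 + 2.0 = 11.0

# Hybrid mode: the same code traced into a reusable graph.
@tf.function
def f(x, z):
    return tf.add(tf.square(x), z)

y_graph = f(x, z)  # first call traces the graph; later calls reuse it
print(float(y_eager), float(y_graph))
```

Both calls return the same value; the difference is that subsequent invocations of `f` with matching input signatures skip Python-level dispatch and run the cached graph.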
The execution flow differs markedly between modes:
- Static graphs (TensorFlow 1.x style): Define full graph → Compile/optimize → Execute in session (batched inputs processed in one pass).
- Eager execution: Invoke operations → Immediate evaluation → Output returned directly (step-by-step, with Python overhead).
- Hybrid via tf.function: Write eager code → Decorator traces to graph → First execution builds and runs graph → Reuse for performance.
For debugging, tf.config.run_functions_eagerly(True) temporarily forces decorated functions to run eagerly, ensuring graphs only activate when beneficial.[45]
Automatic Differentiation
Automatic differentiation in TensorFlow enables the computation of gradients for machine learning models by automatically tracking operations during the forward pass and deriving derivatives during the backward pass, facilitating efficient optimization such as backpropagation. This feature is implemented through the tf.GradientTape API, which records tensor operations in eager execution mode and supports reverse-mode differentiation to compute gradients with respect to input variables or model parameters.[48][44]
Reverse-mode differentiation, also known as backpropagation, is the primary method employed by TensorFlow for deep networks, as it efficiently computes gradients for multiple outputs relative to many inputs by traversing the computation graph in reverse order from the target (e.g., a loss function) to the sources (e.g., weights). This contrasts with forward-mode differentiation, which propagates derivatives from inputs to outputs but becomes inefficient for scenarios with numerous parameters, such as neural networks with millions of weights. TensorFlow's implementation uses a breadth-first search to identify backward paths and sums partial gradients along them, enabling scalability for large-scale models.[48][44]
The API supports higher-order gradients by allowing nested tf.GradientTape contexts, where gradients of gradients can be computed iteratively—for instance, obtaining the second derivative of a function like y = x³ yields d²y/dx² = 6x, demonstrating utility in advanced analyses like Hessian approximations. A representative example involves computing the gradient of a loss with respect to weights in a linear model:
import tensorflow as tf
x = tf.constant([[1., 2.], [3., 4.]])
w = tf.Variable(tf.random.normal((2, 2)))
b = tf.Variable(tf.zeros((2,)))
with tf.GradientTape() as tape:
    y = x @ w + b
    loss = tf.reduce_mean(y ** 2)
grad = tape.gradient(loss, w)  # Computes ∇_w L
Here, tape.gradient(target, sources) derives the gradient via reverse-mode accumulation.[48]
Limitations include handling non-differentiable operations, where tape.gradient returns None for unconnected gradients or ops on non-differentiable types like integers, requiring explicit use of tf.stop_gradient to block flow or tf.GradientTape.stop_recording to pause tracking. For custom needs, such as numerical stability, users can define bespoke gradients using tf.custom_gradient, which registers a forward function and its derivative computation, though these must be traceable for model saving and may increase memory usage if the tape is set to persistent mode for multiple gradient calls.[49][48]
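The gradient-blocking and custom-gradient mechanisms can be sketched as follows; the log1pexp function mirrors the numerically stable example from the TensorFlow documentation, and the input values are illustrative:

```python
import tensorflow as tf

x = tf.Variable(2.0)

# tf.stop_gradient blocks gradient flow through its argument,
# so only the x**2 term contributes to dy/dx.
with tf.GradientTape() as tape:
    y = x ** 2 + tf.stop_gradient(3.0 * x)
g1 = tape.gradient(y, x)  # 2 * x = 4.0

# tf.custom_gradient: log(1 + e^t) with a hand-written gradient
# equal to sigmoid(t), avoiding overflow in the naive formula.
@tf.custom_gradient
def log1pexp(t):
    e = tf.exp(t)
    def grad(upstream):
        return upstream * (1.0 - 1.0 / (1.0 + e))
    return tf.math.log(1.0 + e), grad

t = tf.constant(2.0)
with tf.GradientTape() as tape2:
    tape2.watch(t)  # constants are not watched automatically
    z = log1pexp(t)
g2 = tape2.gradient(z, t)  # sigmoid(2.0), roughly 0.88
```

The custom gradient is exact but reuses the forward-pass intermediate `e`, which is the usual motivation for registering a bespoke derivative.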
Distribution Strategies
TensorFlow provides the tf.distribute.Strategy API to enable distributed training across multiple GPUs, machines, or TPUs with minimal modifications to existing code. This API abstracts the complexities of data and model parallelism, allowing users to scale computations while maintaining compatibility with both Keras high-level APIs and custom training loops. It operates by creating replicas of the model and dataset, synchronizing gradients and variables as needed, and is optimized for performance using TensorFlow's graph execution mode via tf.function.[50]
The MirroredStrategy implements synchronous data parallelism for multi-GPU setups on a single machine, where each GPU holds a replica of the model and processes a portion of the batch. During training, gradients are computed locally on each replica and aggregated using an all-reduce algorithm—defaulting to NCCL for efficient communication—before updating the shared model variables, which are represented as MirroredVariable objects. This strategy ensures consistent model states across devices and is suitable for homogeneous GPU environments.[50]
For multi-machine clusters, the MultiWorkerMirroredStrategy extends synchronous training across multiple workers, each potentially with multiple GPUs. It coordinates via collective operations like ring all-reduce or NCCL, requiring environment variables such as TF_CONFIG to define the cluster topology. This approach scales efficiently for large-scale synchronous distributed training, maintaining the same API as MirroredStrategy for seamless transition. In contrast, the ParameterServerStrategy supports asynchronous training by designating worker nodes for computation and parameter servers for variable storage and updates. Workers fetch parameters, perform local computations, and send gradient updates asynchronously to the servers, which apply them immediately; this can lead to faster convergence in heterogeneous setups but may introduce staleness in gradients.[50]
TPU-specific scaling is handled by TPUStrategy, which integrates with Google Cloud TPUs for synchronous training across TPU cores. It leverages the TPU's high-bandwidth interconnect for efficient all-reduce operations and requires a TPUClusterResolver to configure the TPU system. This strategy is particularly effective for large models, as TPUs provide specialized acceleration for matrix operations central to deep learning.[50]
To utilize these strategies, code is typically wrapped in a strategy.scope() context manager, ensuring that model creation, variable initialization, and optimizer setup occur within the distributed environment. For example, in a Keras workflow:
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
    model.compile(optimizer='adam', loss='mse')
Training then proceeds through model.fit(), replicating the dataset across replicas and aggregating updates. For custom loops, the strategy's run method distributes function calls, such as step computations, across devices.[50]
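A custom-loop sketch of the strategy.run pattern follows; the model, synthetic dataset, and batch size are illustrative, and with no GPUs present MirroredStrategy simply creates a single CPU replica:

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # one replica per visible device

with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    optimizer = tf.keras.optimizers.SGD(0.01)
    # reduction='none' yields per-example losses so we can average
    # correctly across replicas ourselves.
    loss_fn = tf.keras.losses.MeanSquaredError(reduction='none')

dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([64, 4]), tf.random.normal([64, 1]))).batch(16)
dist_dataset = strategy.experimental_distribute_dataset(dataset)

@tf.function
def train_step(inputs):
    x, y = inputs
    with tf.GradientTape() as tape:
        pred = model(x, training=True)
        # Scale by the global batch size so summing across replicas
        # reproduces the global mean loss.
        loss = tf.nn.compute_average_loss(loss_fn(y, pred),
                                          global_batch_size=16)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for batch in dist_dataset:
    # strategy.run executes train_step once per replica; reduce to a scalar.
    per_replica = strategy.run(train_step, args=(batch,))
    loss = strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=None)
```

Switching to MultiWorkerMirroredStrategy or TPUStrategy leaves this loop unchanged; only the strategy construction and cluster configuration differ.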
APIs and Components
Low-Level APIs
TensorFlow's low-level APIs provide the foundational building blocks for constructing custom computations and neural network primitives, offering fine-grained control over tensor operations that underpin more abstracted interfaces. These APIs, part of the TensorFlow Core, enable developers to define operations directly on tensors, supporting both eager execution and graph-based modes for flexibility in model design.[51] The tf.nn namespace encompasses neural network-specific functions, including activation functions that introduce non-linearities into models. For instance, tf.nn.relu applies the rectified linear unit activation by computing the maximum of input features and zero, as in tf.nn.relu([-1.0, 2.0]) yielding [0.0, 2.0]. Similarly, tf.nn.sigmoid computes the sigmoid function element-wise to map inputs to (0,1), useful for binary classification gates. Other activations like tf.nn.gelu implement the Gaussian Error Linear Unit for smoother gradients in modern architectures.[52][53]
Convolutional operations in tf.nn facilitate feature extraction in spatial data, such as images. The tf.nn.conv2d function performs 2-D convolution on a 4-D input tensor (batch, height, width, channels) with filter kernels, enabling hierarchical pattern learning in convolutional neural networks (CNNs). Depthwise convolutions via tf.nn.depthwise_conv2d reduce parameters by applying filters separately to each input channel, optimizing for mobile or efficient models. Pooling layers downsample features to reduce dimensionality and introduce translation invariance; tf.nn.max_pool selects the maximum value in each window, while tf.nn.avg_pool computes averages, both commonly used after convolutions to control overfitting.[54][55]
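A minimal sketch of these spatial operations on arbitrary random inputs:

```python
import tensorflow as tf

# A batch of one 8x8 single-channel "image" and one 3x3 filter,
# in the 4-D layouts conv2d expects.
image = tf.random.normal([1, 8, 8, 1])   # [batch, height, width, channels]
kernel = tf.random.normal([3, 3, 1, 1])  # [height, width, in_ch, out_ch]

# 2-D convolution with stride 1 and SAME padding preserves spatial size.
features = tf.nn.conv2d(image, kernel, strides=1, padding='SAME')

# 2x2 max pooling with stride 2 halves each spatial dimension.
pooled = tf.nn.max_pool2d(features, ksize=2, strides=2, padding='VALID')

print(features.shape, pooled.shape)  # (1, 8, 8, 1) (1, 4, 4, 1)
```

The same call shapes apply to tf.nn.avg_pool2d and tf.nn.depthwise_conv2d, differing only in how the window contents are combined.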
Core operations handle fundamental tensor arithmetic and manipulations. Mathematical functions in tf.math include element-wise addition with tf.math.add, which sums two tensors as in tf.math.add([1, 2], [3, 4]) producing [4, 6]. Matrix multiplication is supported by tf.linalg.matmul, computing the product of two matrices, e.g., tf.linalg.matmul([[1, 2]], [[3], [4]]) resulting in [[11]], essential for linear transformations in neural layers. Tensor manipulations enable reshaping and subset extraction; tf.reshape alters tensor shape without data duplication, using -1 for inference as in reshaping [[1], [2], [3]] to [1, 3]. Slicing via indexing or tf.slice extracts sub-tensors, supporting advanced indexing like rank_1_tensor[1:4] to get [1, 1, 2] from a sequence.[56][57][12]
Extending TensorFlow with custom operations allows integration of domain-specific primitives not covered by built-in ops. In Python, developers can compose existing functions or use tf.Module to define reusable components with trainable variables; for example, a custom dense layer class inherits from tf.Module, initializes weights and biases as tf.Variables, and implements __call__ for forward pass:
class Dense(tf.Module):
    def __init__(self, in_features, out_features, name=None):
        super().__init__(name=name)
        self.w = tf.Variable(tf.random.normal([in_features, out_features]), name='w')
        self.b = tf.Variable(tf.zeros([out_features]), name='b')
    def __call__(self, x):
        return tf.nn.relu(tf.linalg.matmul(x, self.w) + self.b)
For performance-critical extensions, custom operations can also be written in C++ by registering them with REGISTER_OP, implementing the kernel in an OpKernel subclass, and loading via tf.load_op_library; a simple "zero_out" op, for instance, zeros all but the first element of an input tensor.[58][59]
These low-level APIs are particularly valuable for building non-standard models where high-level abstractions lack sufficient control, such as custom recurrent architectures or physics-informed neural networks requiring bespoke tensor flows. By enabling direct op composition, they support innovative research prototypes that deviate from conventional layer stacks.[51]
High-Level APIs
TensorFlow's high-level APIs, primarily through the integrated Keras library, provide intuitive and declarative interfaces for defining, training, and evaluating machine learning models, enabling rapid prototyping and experimentation while abstracting away low-level computational details.[4] Keras supports multiple paradigms for model construction, including the Sequential API for simple stacked architectures, the Functional API for complex, non-linear topologies, and subclassing for highly customized models. These APIs leverage TensorFlow's automatic differentiation under the hood to compute gradients efficiently during training.[4] The Sequential API allows users to build models as a linear sequence of layers by instantiating a tf.keras.Sequential object and adding layers directly, such as model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation='relu'), tf.keras.layers.Dense(1)]), which is ideal for straightforward feedforward networks.[4] For more flexible architectures involving shared layers or multiple inputs/outputs, the Functional API defines models by connecting layers explicitly, for example: inputs = tf.keras.Input(shape=(784,)); x = tf.keras.layers.Dense(64, activation='relu')(inputs); outputs = tf.keras.layers.Dense(10)(x); model = tf.keras.Model(inputs=inputs, outputs=outputs).[60] Subclassing the tf.keras.Model class offers maximum control, enabling custom forward passes and integration of non-standard components, as in class MyModel(tf.keras.Model): def __init__(self): super(MyModel, self).__init__(); self.dense = tf.keras.layers.Dense(10); def call(self, inputs): return self.dense(inputs).[61]
Loss functions in Keras quantify the discrepancy between predictions and true labels, with built-in options like tf.keras.losses.BinaryCrossentropy for binary classification tasks and tf.keras.losses.MeanSquaredError for regression problems; these are specified during model compilation via model.compile(loss=tf.keras.losses.BinaryCrossentropy()).[62] Custom losses can be defined as callable functions, such as def custom_loss(y_true, y_pred): return tf.keras.losses.mean_absolute_error(y_true, y_pred) * 2.0, and passed directly to the compile method for tailored objectives.[62]
Metrics track model performance during training and validation, with common built-ins including tf.keras.metrics.Accuracy for classification accuracy and tf.keras.metrics.AUC for evaluating binary classifiers via the area under the receiver operating characteristic curve; they are listed in the compilation step, e.g., model.compile(optimizer='adam', loss='mse', metrics=[tf.keras.metrics.MeanAbsoluteError()]).[63]
Optimizers update model weights to minimize the loss, featuring implementations like tf.keras.optimizers.Adam for adaptive gradient methods and tf.keras.optimizers.SGD for stochastic gradient descent, often with momentum; learning rate schedules, such as exponential decay via tf.keras.optimizers.schedules.ExponentialDecay, can be integrated to adjust rates dynamically during training.[64] Training occurs through the model.fit() method, which applies the optimizer to gradients computed from the loss, as in model.fit(x_train, y_train, epochs=5, batch_size=32), handling data iteration and evaluation seamlessly.[4]
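Putting the compile-and-fit pieces together, a minimal sketch with a decaying learning rate; the synthetic data is illustrative:

```python
import tensorflow as tf

# Exponential decay: lr = 0.1 * 0.96^(step / 1000).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1, decay_steps=1000, decay_rate=0.96)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.9)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=optimizer, loss='mse',
              metrics=[tf.keras.metrics.MeanAbsoluteError()])

# Synthetic regression data: 32 examples of 4 features each.
x = tf.random.normal([32, 4])
y = tf.random.normal([32, 1])
history = model.fit(x, y, epochs=2, batch_size=8, verbose=0)
print(history.history['loss'])
```

The returned History object records per-epoch loss and metric values, which is the usual hook for plotting training curves.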
The tf.data API complements Keras by constructing efficient input pipelines for large-scale datasets, enabling transformations like mapping preprocessing functions (e.g., normalization via dataset.map(lambda x, y: (x / 255.0, y))) and batching with dataset.batch(32) to group elements for efficient GPU utilization.[65] These pipelines integrate directly with Keras models, passed to model.fit(dataset, epochs=10) for streamlined data loading, shuffling, and prefetching to optimize training throughput without blocking computation.[65]
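A minimal pipeline sketch chaining these transformations; the synthetic image-like dataset is illustrative:

```python
import tensorflow as tf

# Synthetic "images" with pixel values in [0, 255) and integer labels.
images = tf.random.uniform([100, 28, 28], maxval=255)
labels = tf.random.uniform([100], maxval=10, dtype=tf.int32)

dataset = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .shuffle(100)                      # randomize example order
    .map(lambda x, y: (x / 255.0, y))  # normalize pixels to [0, 1)
    .batch(32)                         # group examples into batches
    .prefetch(tf.data.AUTOTUNE)        # overlap preprocessing with training
)

batch_x, batch_y = next(iter(dataset))
print(batch_x.shape, batch_y.shape)  # (32, 28, 28) (32,)
```

This dataset can be passed directly to model.fit, with prefetch keeping the input pipeline from stalling the accelerator.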
Variants and Deployments
TensorFlow Lite
TensorFlow Lite originated in 2017 as a lightweight solution for deploying machine learning models on mobile and embedded devices, evolving from earlier efforts under TensorFlow Mobile to prioritize low-latency inference with reduced computational overhead. Announced as a developer preview on November 14, 2017, it addressed the constraints of resource-limited environments by introducing a streamlined runtime that supports core operations for inference without the full TensorFlow overhead. This marked a shift toward on-device processing, enabling applications to perform predictions locally while minimizing dependencies on cloud connectivity.[66] Key features of TensorFlow Lite include model conversion through the TFLiteConverter tool, which transforms trained TensorFlow models into a compact FlatBuffers format (.tflite) optimized for deployment. Quantization techniques, such as 8-bit integer representation, further reduce model size by up to four times and accelerate inference by converting floating-point weights to lower-precision integers, making it suitable for battery-constrained devices. The framework also provides an interpreter API available in C++, Java, and Python, allowing developers to load and execute models efficiently on platforms like Android and iOS.[67] In 2024, TensorFlow Lite was renamed LiteRT to reflect its expanded role as a versatile runtime supporting models from multiple frameworks beyond TensorFlow, while maintaining backward compatibility for existing implementations. As of TensorFlow 2.20 released in August 2025, the legacy tf.lite module has been deprecated in favor of LiteRT to complete the transition.[68][41] LiteRT enhances performance through delegates, which offload computations to specialized hardware accelerators such as GPUs via the GPU delegate or Android's Neural Networks API (NNAPI).[69] It also supports custom delegates for further optimization. 
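A minimal conversion-and-inference sketch using the tf.lite converter and interpreter APIs described above; the tiny model is an arbitrary illustration:

```python
import tensorflow as tf

# A small Keras model to convert.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1),
])

# Convert to the FlatBuffers .tflite format; Optimize.DEFAULT enables
# post-training (dynamic-range) quantization of the weights.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Run inference with the interpreter API.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp['index'], tf.random.normal([1, 8]).numpy())
interpreter.invoke()
result = interpreter.get_tensor(out['index'])
```

On-device, the same .tflite bytes are loaded by the Java, Kotlin, or C++ interpreter, optionally with a GPU or NNAPI delegate attached.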
Additionally, LiteRT is compatible with Edge TPUs for accelerated inference on compatible hardware. Common use cases for LiteRT involve on-device machine learning in mobile applications, such as real-time image classification in camera apps, where models process inputs directly on the device to ensure privacy and responsiveness.[66] For instance, developers can integrate the interpreter to run lightweight convolutional neural networks for tasks like object detection, enabling real-time performance on mid-range smartphones.[70] This enables seamless deployment in scenarios requiring offline functionality, from augmented reality features to sensor-based analytics on embedded systems.[66]
TensorFlow.js
TensorFlow.js is an open-source JavaScript library developed by Google for machine learning, enabling the definition, training, and execution of models directly in web browsers or Node.js environments.[71] Launched on March 30, 2018, it allows client-side training and inference without requiring server dependencies, keeping user data on the device for enhanced privacy and low-latency processing.[71] This portability stems from its foundation in the core TensorFlow library, adapted for JavaScript runtimes.[72]

At its core, TensorFlow.js leverages a WebGL backend for GPU acceleration in browsers, automatically utilizing available hardware to speed up computations when possible.[71] Models trained in Python using TensorFlow or Keras can be converted to TensorFlow.js format via a command-line tool, producing a model.json file and sharded binary weights optimized for web loading and caching.[73] The converter supports SavedModel, Keras HDF5, and TensorFlow Hub formats, with built-in optimizations like graph simplification using Grappler and optional quantization to reduce model size.[73]
The library provides a high-level Layers API that closely mirrors Keras, facilitating the creation of sequential or functional models with familiar components such as dense layers, convolutional layers, and activation functions.[74] This API supports transfer learning in JavaScript by allowing the loading of pre-trained models—such as MobileNet—and fine-tuning them on custom datasets directly in the browser, as demonstrated in official tutorials for image classification tasks.[75]
TensorFlow.js powers interactive web applications and real-time processing scenarios, such as pose detection using pre-built models like PoseNet, which estimates human keypoints from video streams in the browser without server round-trips.[76] For instance, PoseNet enables single- or multi-person pose estimation at interactive frame rates, supporting use cases in fitness tracking, gesture recognition, and augmented reality demos.[77] These capabilities have been extended with models like MoveNet, offering ultra-fast detection of 17 body keypoints for dynamic applications.[78]
TensorFlow Extended (TFX)
TensorFlow Extended (TFX) is an open-source end-to-end platform for developing and deploying production-scale machine learning pipelines, initially introduced by Google in 2017.[79] It builds on TensorFlow to provide a modular framework that automates key steps in the machine learning workflow, ensuring scalability and reliability in production environments.[80] Core components include TensorFlow Data Validation (TFDV), which detects anomalies and schema mismatches in datasets, and TensorFlow Model Analysis (TFMA), which evaluates model performance across multiple metrics and slices.[81]

TFX pipelines are orchestrated using integrations like Apache Beam for distributed data processing and execution on various runners, enabling efficient handling of large-scale batch and streaming dataflows.[82] Additionally, TFX is compatible with Kubeflow Pipelines, allowing seamless deployment on Kubernetes clusters for managed orchestration of complex workflows.[83]

The platform's key stages encompass data ingestion via ExampleGen, which reads and splits raw data into examples; transformation using TensorFlow Transform to preprocess features consistently between training and serving; validation with TFDV to ensure data quality; training with the Trainer component, which supports models built with Keras or other TensorFlow APIs; evaluation via TFMA for comprehensive model assessment; and serving through integration with TensorFlow Serving for low-latency inference in production.[80] These stages facilitate a reproducible machine learning lifecycle by versioning artifacts like datasets, schemas, and models, while incorporating monitoring for data drift and model performance degradation over time.[80]

For instance, in a typical TFX pipeline for a recommendation system, raw user interaction logs are ingested, validated against an evolving schema, transformed into features, trained into a model, evaluated for fairness metrics, and pushed to serving infrastructure, ensuring end-to-end traceability and continuous improvement.[84] This approach minimizes errors in production transitions and supports iterative development at scale.[79]

Integrations and Ecosystem
Scientific Computing Libraries
TensorFlow provides seamless integration with NumPy, the foundational library for numerical computing in Python, enabling data scientists to leverage familiar tools within machine learning workflows. The tf.convert_to_tensor() function converts NumPy arrays and other compatible objects directly into TensorFlow tensors, preserving data types and shapes where possible.[85] Additionally, TensorFlow implements a subset of the NumPy API through tf.experimental.numpy, which allows NumPy-compatible code to run with TensorFlow's acceleration, including zero-copy sharing of memory between tensors and NumPy ndarrays to minimize overhead during data transfer.[86] This interoperability ensures that operations like array manipulations and mathematical computations align closely with NumPy's behavior, including broadcasting rules that follow the same semantics for efficient element-wise operations across arrays of different shapes.[12]
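The round trip between NumPy and TensorFlow described above can be sketched in a few lines (a minimal illustration assuming TensorFlow 2.x eager execution; the array contents are arbitrary):

```python
import numpy as np
import tensorflow as tf

arr = np.arange(6, dtype=np.float32).reshape(2, 3)

t = tf.convert_to_tensor(arr)        # NumPy array -> tf.Tensor; dtype and shape preserved
row = tf.constant([1.0, 10.0, 100.0])
summed = t + row                     # broadcasting follows NumPy semantics: (2, 3) + (3,)
back = summed.numpy()                # tf.Tensor -> NumPy ndarray
```

The `(2, 3) + (3,)` addition broadcasts the row across both rows of the matrix, exactly as the same expression would in plain NumPy.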
For handling sparse data, TensorFlow's sparse tensors are compatible with SciPy's sparse matrix formats, particularly the coordinate list (COO) representation, allowing straightforward conversion between SciPy's scipy.sparse objects and TensorFlow's tf.sparse.SparseTensor.[87] This enables users to import sparse datasets from SciPy for processing in TensorFlow models without dense conversions, which is crucial for memory-efficient handling of high-dimensional data like text or graphs. Regarding optimization, SciPy's routines from scipy.optimize can be invoked within TensorFlow workflows by wrapping model loss functions as Python callables, facilitating hybrid use cases such as fine-tuning with specialized solvers like L-BFGS-B alongside TensorFlow's native optimizers.
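The COO correspondence is direct: tf.sparse.SparseTensor is built from the same three arrays a SciPy COO matrix exposes. A small sketch of extracting them (the TensorFlow constructor call itself is omitted so the snippet stays NumPy/SciPy-only):

```python
import numpy as np
from scipy import sparse

dense = np.array([[0.0, 2.0, 0.0],
                  [3.0, 0.0, 0.0]])
coo = sparse.coo_matrix(dense)

# The three components tf.sparse.SparseTensor(indices, values, dense_shape) expects:
indices = np.stack([coo.row, coo.col], axis=1).astype(np.int64)  # one (row, col) pair per nonzero
values = coo.data                                                # the nonzero entries themselves
dense_shape = np.array(coo.shape, dtype=np.int64)                # shape of the full matrix
```

Passing these three arrays to the SparseTensor constructor yields the TensorFlow-side representation without ever materializing a dense copy.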
A practical example of this integration is loading NumPy arrays into a tf.data.Dataset for efficient input pipelines, where data from NumPy files (e.g., .npz archives) can be directly ingested, shuffled, and batched for training.[88] This approach, often referenced in high-level APIs like tf.data, supports scalable data loading without redundant copies. Overall, these features provide a seamless transition for data scientists accustomed to NumPy and SciPy ecosystems, reducing the learning curve for adopting TensorFlow in scientific computing tasks. As of November 2025, TensorFlow is compiled with NumPy 2.0 support by default and maintains compatibility with later NumPy 2.x versions, including ongoing support for NumPy 1.26 until the end of 2025.[89]
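The NumPy-to-tf.data path can be sketched as follows (the arrays here are synthetic stand-ins; a real pipeline would typically load them from .npz archives):

```python
import numpy as np
import tensorflow as tf

features = np.arange(20, dtype=np.float32).reshape(10, 2)  # stand-in for data from an .npz file
labels = np.arange(10, dtype=np.int64) % 2

# Build an input pipeline directly from the in-memory arrays: shuffle, then batch.
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(buffer_size=10)
    .batch(4)
)
first_x, first_y = next(iter(dataset))
```

from_tensor_slices copies the arrays into the dataset once; subsequent shuffling and batching operate on that internal representation rather than re-reading the NumPy source.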
Advanced Frameworks
TensorFlow integrates with advanced machine learning frameworks to enhance its flexibility, enabling developers to leverage specialized tools for research, optimization, and deployment while mitigating ecosystem silos. These integrations primarily focus on interoperability through shared compilers, intermediate formats, and conversion utilities, allowing models developed in one framework to be adapted for use in TensorFlow's robust production environment.[90] A key integration is with JAX, Google's high-performance numerical computing library, facilitated by the JAX2TF converter introduced in the jax.experimental.jax2tf module. This tool allows JAX functions and models—such as those built with the Flax neural network library—to be converted into equivalent TensorFlow graphs using jax2tf.convert, preserving functionality for inference and further training within TensorFlow. Since TensorFlow 2.15, enhanced compatibility with the XLA compiler, which both frameworks utilize, has improved performance and stability for these conversions, enabling seamless execution on accelerators like GPUs and TPUs. Additionally, TensorFlow Federated provides experimental support for JAX as an alternative frontend, compiling JAX computations directly to XLA via @tff.jax_computation decorators, which supports federated learning workflows without TensorFlow-specific code.[90][91]
TensorFlow also supports the Open Neural Network Exchange (ONNX) standard for cross-framework model portability, allowing export and import of models to facilitate interoperability. Exporting TensorFlow or Keras models to ONNX is handled by the tf2onnx tool, which converts SavedModels, checkpoints, or TFLite files into ONNX format using commands like python -m tf2onnx.convert --saved-model path/to/model --output model.onnx, supporting ONNX opsets from 14 to 18 (default 15) and TensorFlow versions 2.9 to 2.15. Importing ONNX models into TensorFlow is enabled via the onnx-tf backend, which translates ONNX graphs into TensorFlow operations for execution with TensorFlow's runtime or ONNX Runtime. This bidirectional support ensures models can be trained in TensorFlow and deployed in ONNX-compatible environments, or vice versa, with minimal rework.[92][93][94]
Beyond direct JAX support, TensorFlow enables interoperability with PyTorch through ONNX as an intermediary format; PyTorch models can be exported to ONNX using torch.onnx.export, then imported into TensorFlow via onnx-tf for continued training or serving. Similarly, Flax-based JAX models can be run in TensorFlow using JAX2TF wrappers, as demonstrated in examples where a Flax convolutional network trained partially in JAX is converted and fine-tuned in TensorFlow, combining JAX's research-friendly transformations with TensorFlow's ecosystem.[94][90]
These integrations address vendor lock-in by allowing developers to prototype in agile frameworks like JAX or PyTorch and migrate to TensorFlow for scalable distributed training, such as using tf.distribute.Strategy for multi-GPU setups after conversion. For instance, JAX code can be ported to TensorFlow to leverage its mature distributed strategies like MirroredStrategy, enabling efficient scaling across clusters without rewriting core logic.[90][50]
Development Tools
Google Colab provides a cloud-based Jupyter notebook environment that enables users to execute Python code directly in the browser without local setup, offering free access to GPU and TPU resources for accelerated TensorFlow computations.[95][96] It comes pre-installed with recent TensorFlow versions and supports rapid prototyping and training of machine learning models, making it particularly accessible for resource-constrained developers.[97] In educational contexts, Colab has significantly democratized access to TensorFlow-based machine learning education by allowing students and researchers worldwide to run complex experiments without hardware investments, as evidenced by its adoption in undergraduate AI courses for hands-on deep learning projects.[98][99]

TensorBoard serves as a visualization suite within the TensorFlow ecosystem, allowing developers to inspect computational graphs, monitor training metrics such as loss and accuracy, and explore high-dimensional embeddings through interactive dashboards.[100] Launched alongside early TensorFlow releases, it facilitates debugging and optimization by rendering histograms, images, and scalar plots from logged events during model development.[101] Users can extend TensorBoard with custom logging mechanisms, such as defining bespoke metrics via Keras callbacks or tf.summary APIs, to track application-specific data like custom loss components or intermediate layer outputs.[102]

The TensorFlow Debugger, accessible through the tf.debugging module, offers programmatic tools for inspecting tensor values and execution traces during model training, aiding in the identification of numerical instabilities or logical errors in TensorFlow graphs.[103] Introduced with TensorFlow 1.0, it supports features like conditional breakpoints and watchpoints on tensors, enabling step-by-step debugging similar to traditional programming environments but tailored for graph-based computations.[104]

Complementing these, the TensorFlow Profiler analyzes model performance by capturing traces of operations, memory usage, and hardware utilization, helping developers pinpoint bottlenecks such as inefficient kernel launches or data pipeline delays.[105] Released in 2020 as an integrated TensorBoard plugin, it provides detailed breakdowns of CPU/GPU/TPU workloads and recommends optimizations for faster training iterations.[106] Together, these tools enhance collaborative development by enabling shared visualizations and diagnostics, fostering efficient iteration in the TensorFlow community.[107]

Applications
Healthcare
TensorFlow has been widely adopted in healthcare for medical imaging applications, particularly through convolutional neural networks (CNNs) that analyze X-ray and CT scans to detect conditions such as pneumonia. For instance, researchers have developed TF-based CNN models that process chest X-ray images to classify pneumonia with high accuracy, often achieving over 95% precision on benchmark datasets like the Chest X-ray Pneumonia collection. These models leverage TensorFlow's Keras API to build and train architectures like EfficientNet or custom CNNs, enabling automated detection that assists radiologists in rapid diagnosis. Similar approaches extend to CT scans for identifying abnormalities in lung tissue, where TF facilitates end-to-end pipelines from image preprocessing to predictive output.[108][109]

A seminal example is Google's 2016 deep learning system for detecting diabetic retinopathy in retinal fundus photographs, which used a TensorFlow-trained Inception-v3 CNN to achieve 97.5% sensitivity and 93.4% specificity at the high-sensitivity operating point on external validation sets, outperforming traditional methods and enabling scalable screening in underserved areas. This work laid the foundation for FDA-approved AI tools, such as IDx-DR (now LumineticsCore), the first autonomous AI system cleared in 2018 for detecting more-than-mild diabetic retinopathy in adults with diabetes, analyzing retinal images to provide triage recommendations with 87.2% sensitivity and 90.7% specificity. These tools demonstrate TensorFlow's role in transitioning research prototypes to clinical deployment, enhancing early intervention for vision-threatening conditions.[110][111][112]

Despite these advances, challenges in healthcare applications include stringent data privacy requirements under regulations like HIPAA and GDPR, addressed by TensorFlow Federated (TFF), which enables collaborative model training across institutions without sharing raw patient data, for example by simulating federated setups on electronic health records to predict disease outcomes while preserving confidentiality. Regulatory compliance remains critical, as AI models must undergo rigorous validation for safety and efficacy, with the FDA authorizing over 950 AI-enabled devices by 2024 and more than 1,200 as of mid-2025, many involving imaging analysis. TFF's integration supports privacy-preserving federated learning in scenarios like multi-hospital collaborations for rare disease modeling.[113][114]

The impact of TensorFlow in healthcare is evident in improved diagnostic accuracy and faster triage in emergency settings. In drug discovery, TF-based neural networks expedite virtual screening and lead optimization; for instance, Keras/TensorFlow models have identified novel CXCR3 antagonists for immunity disorders by predicting binding affinities on molecular datasets, shortening the traditional 10-15 year timeline for candidate identification. Overall, these applications enhance personalized medicine by integrating multimodal data, fostering faster therapeutic development while adhering to ethical standards.[115]

Social Media
TensorFlow plays a pivotal role in enhancing user engagement on social media platforms through advanced recommendation systems, particularly via collaborative filtering techniques. These systems leverage deep neural networks to predict user preferences based on historical interactions, such as video watches or post engagements, generalizing traditional matrix factorization methods into nonlinear models. For instance, YouTube's recommendation engine employs TensorFlow to perform extreme multiclass classification, where the model predicts the next video a user might watch from millions of candidates, incorporating user history and contextual features like video freshness to promote viral content. This approach drives a significant portion of views, with daily active users watching around 60 minutes of video, equating to billions of views processed daily.[116]

In content analysis, TensorFlow facilitates natural language processing (NLP) models for detecting sentiment and toxicity in user-generated text, enabling platforms to moderate harmful content effectively. The TensorFlow.js Toxicity Classifier, for example, assesses text for categories like insults, threats, and identity-based attacks, assigning probability scores above a threshold (e.g., 0.9) to flag toxic posts. This model supports real-time filtering by providing immediate client-side evaluation, preventing offensive content from entering databases and reducing backend load on social platforms. Complementing NLP, TensorFlow's computer vision capabilities power image tagging through convolutional neural networks, classifying uploaded photos to identify objects, scenes, or people, which aids in content organization and moderation on media-heavy sites.[117][118][119]

To handle the massive scale of social media data, TensorFlow employs distributed training strategies that enable efficient processing of vast user datasets. Using APIs like tf.distribute.Strategy, models can train synchronously across multiple GPUs, TPUs, or machines, synchronizing gradients via algorithms such as NCCL for low-latency updates. In YouTube's case, this allows training billion-parameter models on hundreds of billions of examples, ensuring sublinear latency for ranking hundreds of candidates per user query. Such scalability is crucial for real-time applications, where platforms process petabytes of interaction data to personalize feeds without compromising performance.[50][116]
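The synchronous gradient exchange behind such strategies amounts to an all-reduce average across replicas. A toy NumPy sketch of that one step (the replica gradients are made-up values; real implementations such as NCCL exchange device buffers ring-wise rather than stacking arrays on the host):

```python
import numpy as np

def allreduce_mean(grads_per_replica):
    """Average each variable's gradient across replicas, as a synchronous all-reduce does."""
    return [np.mean(np.stack(per_var), axis=0) for per_var in zip(*grads_per_replica)]

# Two replicas, each holding gradients for two model variables (made-up values).
replica_0 = [np.array([1.0, 2.0]), np.array([[10.0]])]
replica_1 = [np.array([3.0, 4.0]), np.array([[30.0]])]

averaged = allreduce_mean([replica_0, replica_1])
# Every replica then applies the same averaged update, keeping their weights in sync.
```

Because each replica applies an identical averaged gradient, the model copies never diverge, which is what makes this scheme synchronous.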