Torch (machine learning)

from Wikipedia
Torch
Original authors: Ronan Collobert, Samy Bengio, Johnny Mariéthoz[1]
Initial release: October 2002[1]
Final release: 7.0 / February 27, 2017[2]
Written in: Lua, C, C++
Operating systems: Linux, Android, Mac OS X, iOS
Type: Library for machine learning and deep learning
License: BSD License
Website: torch.ch

Torch is an open-source machine learning library, a scientific computing framework, and a scripting language based on Lua.[3] It provides LuaJIT interfaces to deep learning algorithms implemented in C. It was created at the Idiap Research Institute, affiliated with EPFL. Torch development moved in 2017 to PyTorch, a port of the library to Python.[4][5][6]

torch

The core package of Torch is torch. It provides a flexible N-dimensional array or Tensor, which supports basic routines for indexing, slicing, transposing, type-casting, resizing, sharing storage and cloning. This object is used by most other packages and thus forms the core object of the library. The Tensor also supports mathematical operations like max, min, sum, statistical distributions like uniform, normal and multinomial, and basic linear algebra subprograms (BLAS) operations like dot product, matrix–vector multiplication and matrix–matrix multiplication.

The following exemplifies using torch via its REPL interpreter:

> a = torch.randn(3, 4)

> =a
-0.2381 -0.3401 -1.7844 -0.2615
 0.1411  1.6249  0.1708  0.8299
-1.0434  2.2291  1.0525  0.8465
[torch.DoubleTensor of dimension 3x4]

> a[1][2]
-0.34010116549482
> a:narrow(1,1,2)
-0.2381 -0.3401 -1.7844 -0.2615
 0.1411  1.6249  0.1708  0.8299
[torch.DoubleTensor of dimension 2x4]

> a:index(1, torch.LongTensor{1,2})
-0.2381 -0.3401 -1.7844 -0.2615
 0.1411  1.6249  0.1708  0.8299
[torch.DoubleTensor of dimension 2x4]

> a:min()
-1.7844365427828

The torch package also simplifies object-oriented programming and serialization by providing various convenience functions which are used throughout its packages. The torch.class(classname, parentclass) function can be used to create object factories (classes). When the constructor is called, torch initializes and sets a Lua table with the user-defined metatable, which makes the table an object.

Objects created with the torch factory can also be serialized, as long as they do not contain references to objects that cannot be serialized, such as Lua coroutines, and Lua userdata. However, userdata can be serialized if it is wrapped by a table (or metatable) that provides read() and write() methods.
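The class-and-serialization pattern described above can be sketched as follows; the class name `Dog` and its fields are illustrative, not part of Torch's API:

```lua
require 'torch'

-- torch.class returns the metatable that will be attached to every
-- instance; the __init method acts as the constructor.
local Dog = torch.class('Dog')

function Dog:__init(name)
   self.name = name or 'unnamed'
end

function Dog:speak()
   return self.name .. ' says woof'
end

-- Instantiate and serialize: torch.save handles tensors, plain tables,
-- and class instances alike, writing a binary .t7 file.
local d = Dog('Rex')
torch.save('dog.t7', d)

-- torch.load reconstructs the object with its metatable intact,
-- so methods work on the deserialized copy.
local d2 = torch.load('dog.t7')
print(d2:speak())   -- 'Rex says woof'
```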

nn

The nn package is used for building neural networks. It is divided into modular objects that share a common Module interface. Modules have a forward() and backward() method that allow them to feedforward and backpropagate, respectively. Modules can be joined using module composites, like Sequential, Parallel and Concat to create complex task-tailored graphs. Simpler modules like Linear, Tanh and Max make up the basic component modules. This modular interface provides first-order automatic gradient differentiation. What follows is an example use-case for building a multilayer perceptron using Modules:

> mlp = nn.Sequential()
> mlp:add(nn.Linear(10, 25)) -- 10 input, 25 hidden units
> mlp:add(nn.Tanh()) -- some hyperbolic tangent transfer function
> mlp:add(nn.Linear(25, 1)) -- 1 output
> =mlp:forward(torch.randn(10))
-0.1815
[torch.Tensor of dimension 1]

Loss functions are implemented as sub-classes of Criterion, which has a similar interface to Module. It also has forward() and backward() methods for computing the loss and backpropagating gradients, respectively. Criteria are helpful for training a neural network on classical tasks. Common criteria are the mean squared error criterion implemented in MSECriterion and the cross-entropy criterion implemented in ClassNLLCriterion. What follows is an example of a Lua function that can be iteratively called to train an mlp Module on input Tensor x, target Tensor y with a scalar learningRate:

function gradUpdate(mlp, x, y, learningRate)
  local criterion = nn.ClassNLLCriterion()
  local pred = mlp:forward(x)
  local err = criterion:forward(pred, y); 
  mlp:zeroGradParameters();
  local t = criterion:backward(pred, y);
  mlp:backward(x, t);
  mlp:updateParameters(learningRate);
end

The nn package also has a StochasticGradient class for training a neural network using stochastic gradient descent, although the optim package provides many more options in this respect, such as momentum and weight decay regularization.
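A minimal sketch of the StochasticGradient workflow, assuming an `mlp` module as above and a dataset in the layout the class expects (indexable as `dataset[i] -> {input, target}` with a `size()` method):

```lua
require 'nn'

-- Wrap the model and a criterion in a trainer object.
criterion = nn.ClassNLLCriterion()
trainer = nn.StochasticGradient(mlp, criterion)
trainer.learningRate = 0.01
trainer.maxIteration = 25   -- number of passes over the dataset

-- The trainer iterates over the dataset, calling forward/backward
-- and updating parameters after each example.
trainer:train(dataset)
```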

Other packages

Many packages other than the above official packages are used with Torch. These are listed in the torch cheatsheet.[7] These extra packages provide a wide range of utilities such as parallelism, asynchronous input/output, image processing, and so on. They can be installed with LuaRocks, the Lua package manager which is also included with the Torch distribution.

Applications

Torch is used by the Facebook AI Research Group,[8] IBM,[9] Yandex[10] and the Idiap Research Institute.[11] Torch has been extended for use on Android[12][better source needed] and iOS.[13][better source needed] It has been used to build hardware implementations for data flows like those found in neural networks.[14]

Facebook has released a set of extension modules as open source software.[15]

from Grokipedia
Torch is an open-source scientific computing framework designed for machine learning, featuring wide support for algorithms such as deep neural networks, with a GPU-first architecture for efficient computation.[1] It is built on LuaJIT, providing a lightweight scripting interface to high-performance C and CUDA backends for tensor operations, linear algebra, and optimization routines.[2] Key components include the core torch library for N-dimensional array manipulation, nn for modular neural network construction, and optim for stochastic gradient descent and other solvers, enabling rapid prototyping in areas like computer vision and natural language processing.[1]

Development of Torch began around 2002 at the Idiap Research Institute in Switzerland, evolving through contributions from NEC Laboratories America, New York University, and DeepMind, with Torch 7 released as the primary open-source version under the BSD license.[2] The framework gained prominence in the early 2010s for its flexibility and speed, powering research at organizations including Facebook, Google, and Twitter, and supporting applications from image recognition to speech processing.[2] Its ecosystem extended to community packages for signal processing, parallel computing, and multimedia handling, fostering an active user base despite Lua's niche status in programming.[1]

By 2017, active development of the original Lua-based Torch had largely ceased, with maintainers shifting focus to PyTorch, a reimplementation in Python that preserves Torch's dynamic computation model while adding Pythonic syntax, automatic differentiation, and production deployment tools.[3] PyTorch, initially developed by Facebook AI Research (now Meta AI), has since become a dominant framework in deep learning, used in academia and industry for tasks ranging from research prototyping to scalable model training on GPUs and TPUs.[4] This transition addressed Lua's limitations in ecosystem integration and accessibility, ensuring that the legacy of Torch's innovations endures in modern machine learning workflows.[3]

Introduction

Overview

Torch is an open-source machine learning library, scientific computing framework, and scripting language based on Lua and LuaJIT.[1] It provides a flexible environment for developing and deploying machine learning algorithms, emphasizing efficiency and ease of use.[1] The core purpose of Torch is to enable efficient numerical computations, particularly for machine learning and deep learning tasks, through support for multi-dimensional arrays known as tensors and GPU acceleration.[1] This allows users to perform complex operations on large datasets with high performance, making it suitable for research and production environments.[1] Torch includes libraries for neural networks and optimization, facilitating the construction of sophisticated models.[1] Released under the BSD license, Torch permits broad use, modification, and distribution, fostering a vibrant community of contributors.[1] It was initially developed in October 2002 by Ronan Collobert, Samy Bengio, and Johnny Mariéthoz at the Idiap Research Institute in collaboration with the Swiss Federal Institute of Technology in Lausanne (EPFL).[1]

History

Torch originated in 2002 at the Idiap Research Institute in Martigny, Switzerland, where researchers Ronan Collobert, Samy Bengio, and Johnny Mariéthoz developed it as a modular machine learning software library. The initial version focused on providing efficient implementations of algorithms for tasks such as speech recognition and pattern matching, leveraging C for core computations, with Lua interfaces introduced later in Torch7 (2011). This foundational work was detailed in the technical report "Torch: A Modular Machine Learning Software Library" (Idiap Research Institute, October 2002).[5]

Over the subsequent decade, Torch evolved significantly with contributions from multiple institutions, including the École Polytechnique Fédérale de Lausanne (EPFL) and NEC Laboratories America, with additional contributions from New York University and DeepMind. By 2011, the project had advanced to Torch7, a Matlab-like environment extending Lua for numeric computing and machine learning, which introduced enhanced support for tensor operations and neural network training. NEC Labs played a key role in optimizing deep learning capabilities, while EPFL's involvement strengthened its academic foundations. Adoption grew in industry, notably by Facebook's AI Research team under Yann LeCun, who utilized Torch for pioneering deep learning experiments in computer vision and natural language processing.[6][7]

Key milestones included the integration of LuaRocks for streamlined package management in later Torch7 versions, facilitating easier installation and extension of the ecosystem. Torch was released as open-source software under the BSD-3-Clause license and hosted on GitHub in the torch7 repository, enabling contributions from a global community of researchers. The project's final major update, version 7.0, was released on February 27, 2017, marking the end of active development on the Lua-based framework.
In 2017, the core maintainers announced a shift in focus to PyTorch, a Python-based successor, citing Lua's waning popularity in the machine learning community as a primary reason for the transition. This move reflected broader trends toward Python's dominance in scientific computing and ensured the continuation of Torch's tensor-centric innovations in a more accessible language.

Core Components

Torch Package

The Torch package serves as the foundational component of the Torch machine learning framework, providing support for N-dimensional tensors as the central data structure for all numerical computations.[1][8] These tensors enable efficient handling of multi-dimensional arrays, ranging from vectors (1D) to higher-dimensional data like images (2D or 3D) and videos (4D), forming the basis for scientific computing and machine learning operations in Torch.[9] Implemented with a C backend for performance, the package exposes its functionality through a LuaJIT interface, allowing scripted access to low-level tensor manipulations without sacrificing speed.[8]

Key features of the Torch package include a suite of mathematical operations on tensors, such as element-wise addition and multiplication, which operate directly on tensor data for efficient computation. For instance, element-wise addition computes $ C = A + B $, where $ A $ and $ B $ are tensors of compatible shapes, using functions like torch.add(C, A, B) or the method C:add(A, B).[10] Broadcasting extends this capability to tensors of different dimensions by implicitly expanding singleton dimensions; for example, adding a 1D tensor of length 4 to a 2x2 tensor fills the latter element-wise, as in:[10]

x = torch.Tensor(2,2):fill(2)
y = torch.Tensor(4):fill(3)
x:add(y)  -- Results in a 2x2 tensor filled with 5

Linear algebra routines, such as matrix multiplication via torch.mm(C, A, B) or C:mm(A, B), compute $ C = A \times B $ for matrices $ A $ (n x m) and $ B $ (m x p), yielding an n x p result, essential for operations like those in neural network layers.[10] Additionally, I/O functions like torch.save(tensor, file) and torch.load(file) handle serialization and deserialization of tensors to binary files, supporting efficient data persistence across sessions.[9]

Tensor creation in the Torch package is flexible, beginning with constructors like torch.Tensor() for an empty tensor or torch.Tensor(sizes) to allocate a tensor with specified dimensions, such as z = torch.Tensor(4,5,6,2) for a 4D tensor.[9] Indexing uses Lua-style square brackets, e.g., x[2][3], while advanced slicing employs methods like narrow(dim, index, size) to extract subsets (e.g., x:narrow(1, 2, 3)) or select(dim, index) for single slices.[9] Reshaping is achieved without data copying via resize(sizes) to adjust dimensions (e.g., x:resize(2,2)) or view(sizes) for alternative views of the same data (e.g., x:view(2,2)), leveraging shared underlying storage for memory efficiency.[9] Type conversions support interoperability with Lua, such as converting tensors to Lua tables and vice versa; for example, x:type('torch.IntTensor') or x:int() changes the tensor type while preserving data.[9]

The package's efficiency stems from its C-based backend (the TH library), which manages tensor storage as contiguous memory blocks accessible via pointers, minimizing copies through strides and offsets for operations like narrowing or transposing.[8] This core library integrates with LuaJIT using wrappers like cwrap for type handling and luaT for argument validation, enabling just-in-time compilation of Lua scripts for high-performance execution.[8] Tensors created in this package underpin higher-level abstractions, such as neural network modules in the NN package.[1]

NN Package

The nn package in Torch serves as the primary interface for constructing and training neural networks, offering a modular design that enables the assembly of feedforward and recurrent architectures through composable components.[11] This modularity allows users to define complex models by chaining layers and containers, leveraging Lua scripting for flexibility in model specification.[12] At its core, the package revolves around the nn.Module abstract class, which standardizes the behavior of all network components by defining essential methods for forward propagation and gradient computation.[11]

Key modules facilitate the building of networks, with nn.Sequential providing a straightforward container for stacking layers in a linear sequence, such as connecting input transformations to output predictions without explicit wiring.[11] Fully connected layers are implemented via nn.Linear, which applies an affine transformation $ y = xW^T + b $ to input tensors, where $ W $ represents weights and $ b $ biases. Activation functions integrate seamlessly, including nn.Tanh for hyperbolic tangent nonlinearity and nn.ReLU for rectified linear units, which introduce non-linearities essential for learning hierarchical representations.[11]

Training components within the nn package include loss functions like nn.MSECriterion, which computes the mean squared error between predictions and targets as $ L = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2 $, quantifying model performance during optimization.[11] Optimizers, accessed through the companion optim package, support algorithms such as stochastic gradient descent via optim.sgd, which iteratively updates parameters using the rule $ \theta \leftarrow \theta - \eta \nabla L $, where $ \eta $ is the learning rate and $ \nabla L $ the gradient of the loss.
The package handles specific concepts like forward and backward passes through standardized methods in each module: the forward pass invokes updateOutput(input) to compute outputs from inputs, while the backward pass calls updateGradInput(input, gradOutput) to propagate gradients back through the network and accGradParameters(input, gradOutput) to accumulate parameter gradients.[11] Gradient computation employs an autograd-like mechanism tailored to Lua, where users manually chain backward calls but rely on module-specific implementations for efficient differentiation, avoiding full symbolic differentiation for performance in numerical computing.[11] These passes operate on tensor inputs from the core Torch package, ensuring compatibility with multi-dimensional data.[11] A typical workflow begins with defining a network, such as:
mlp = nn.Sequential()
mlp:add(nn.Linear(784, 128))
mlp:add(nn.ReLU())
mlp:add(nn.Linear(128, 10))
This creates a multi-layer perceptron for classification. Training involves a loop over epochs and batches: compute forward outputs, evaluate loss with nn.MSECriterion, perform backward passes to obtain gradients, and apply optim.sgd to update weights, iterating until convergence.[13] This process exemplifies the package's efficiency in Lua environments for rapid prototyping and experimentation.[6]
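That training loop can be sketched as follows; the input tensor `x`, target `y`, epoch count, and learning rate are illustrative placeholders:

```lua
require 'nn'
require 'optim'

criterion = nn.MSECriterion()

-- Flatten all weights and their gradients into single tensors,
-- as required by the optim interface.
params, gradParams = mlp:getParameters()
local config = {learningRate = 0.01}

for epoch = 1, 100 do
   -- feval returns the loss and the gradient w.r.t. params.
   local function feval(p)
      gradParams:zero()
      local output = mlp:forward(x)
      local loss = criterion:forward(output, y)
      local dloss = criterion:backward(output, y)
      mlp:backward(x, dloss)
      return loss, gradParams
   end
   -- One SGD step: optim.sgd calls feval, then updates params in place.
   optim.sgd(feval, params, config)
end
```

The closure-based `feval` interface is what lets the same loop swap in other optim solvers (momentum SGD, Adam, L-BFGS) without changing the model code.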

Other Packages

Torch's ecosystem includes several auxiliary packages that extend its core functionality for specialized tasks such as GPU acceleration, image processing, optimization, and multi-threading. These packages are managed and installed using LuaRocks, the standard package manager for Lua modules, which allows users to install them without root privileges after the initial Torch setup. For example, core Torch packages are installed via LuaRocks during the distribution setup, and additional ones can be added with commands like luarocks install image or luarocks install optim.[14][15]

The cunn package provides CUDA-based GPU acceleration specifically for neural network modules from the nn package, enabling efficient training and inference on NVIDIA GPUs. It implements CUDA versions of common modules like Linear and LogSoftMax, and allows seamless conversion of CPU-based models to GPU with the module:cuda() method. For instance, after loading a model, calling model:cuda() transfers the module and its parameters to the GPU, interfacing directly with the core torch and nn packages for accelerated computations. Complementing cunn, the cutorch package handles GPU tensor operations, introducing the torch.CudaTensor type and methods like tensor:cuda() to transfer data between CPU and GPU memory, which is essential for GPU-accelerated training workflows where tensors are moved once and processed entirely on the device.[16][17]

For image-related tasks, the image package offers tools for loading, transforming, and displaying images in a format compatible with Torch tensors (typically nChannel x height x width). Key functions include image.load() for reading image files into tensors and image.scale() for resizing, along with utilities for normalization to standardize pixel values (e.g., scaling to [0,1] or mean subtraction). These operations integrate with core Torch tensors, allowing transformed images to be fed directly into neural networks for tasks like computer vision.[18]

The optim package supplies advanced optimization algorithms for training models, building on Torch's tensor operations to update parameters efficiently. It includes methods such as Adam for adaptive learning rates and L-BFGS for quasi-Newton optimization, invoked via functions like optim.adam(feval, params, config), where feval evaluates the loss and its gradient, params are the flattened model parameters, and config holds hyperparameters such as the learning rate. This package interfaces with nn modules by applying gradients computed during backpropagation to perform steps like stochastic gradient descent variants.[19][20]

Additional utilities include the threads package for multi-threading support, which enables parallel execution of tasks across multiple LuaJIT threads using Torch's serialization for data exchange, such as submitting jobs to a thread pool with threads.Threads(nthreads):addjob(callback). The xlua package provides general-purpose extensions to Lua, including progress bars via xlua.progress(done, total) for monitoring long-running operations and utilities for string and table manipulation, enhancing usability in scripts that leverage core Torch components.[21][22]
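The CPU-to-GPU workflow described above can be sketched as follows, assuming cutorch and cunn are installed and an NVIDIA GPU is available; the layer sizes are illustrative:

```lua
require 'cutorch'
require 'cunn'

-- Build a model on the CPU, then move it once: :cuda() converts
-- all parameters to torch.CudaTensor.
model = nn.Sequential()
model:add(nn.Linear(100, 10))
model = model:cuda()

-- Inputs must live on the GPU too; :cuda() copies host -> device.
local input = torch.randn(100):cuda()
local output = model:forward(input)   -- computed entirely on the GPU

-- Copy back to host memory for inspection or saving.
local hostOutput = output:float()
```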

Usage and Applications

Basic Usage

Torch requires basic knowledge of Lua programming, as it is built on LuaJIT and uses Lua scripts for development.[23] Torch has not been actively developed since around 2017 and is in maintenance mode as of 2025; the following describes historical installation and usage for legacy purposes.

Installation of Torch begins with meeting system requirements, including LuaJIT 2.1 and linear algebra libraries such as BLAS (e.g., OpenBLAS or ATLAS). On Ubuntu or macOS, clone the distribution repository and run the installation scripts without sudo to ensure a user-local setup. The commands are: git clone https://github.com/torch/distro.git ~/torch --recursive, followed by cd ~/torch, bash install-deps to handle dependencies like CMake and BLAS, and finally ./install.sh to build LuaJIT, LuaRocks, and core Torch libraries.[24] After installation, set up the environment by sourcing the activation script: source ~/torch/install/bin/torch-activate. This adds Torch's binaries (e.g., th, luarocks) to the PATH. To make this persistent across sessions, add the source command to ~/.bashrc or an equivalent shell profile.[25]

Launch the interactive Torch environment by running th in the terminal, which starts the Lua interpreter with Torch loaded. Within the session, load the core package using require 'torch'. For scripting, write code in a .lua file and execute it with th myscript.lua; this supports debugging via print statements or the built-in Lua debugger.

Simple tensor operations demonstrate basic usage. Create a 3x3 tensor filled with ones: x = torch.Tensor(3,3):fill(1). Perform matrix multiplication: y = torch.mm(x, x), resulting in a tensor where each element is 3. These operations leverage Torch's efficient tensor library for numerical computations. Data persistence uses Torch's serialization tools. Save a tensor to a file: torch.save('data.t7', x). Load it back: loaded = torch.load('data.t7'). These functions handle tensors, tables, and models in a binary format for efficient I/O.

Additional packages, such as those for image processing or optimization, are installed via LuaRocks, Torch's package manager: luarocks install image. This extends functionality without rebuilding the core system.[15]
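Put together, the commands above form a small script (the file names here are illustrative) that could be run with `th myscript.lua`:

```lua
require 'torch'

-- Create a 3x3 tensor of ones and square it with matrix multiplication.
local x = torch.Tensor(3, 3):fill(1)
local y = torch.mm(x, x)        -- every entry equals 3
print(y)

-- Persist the result to disk and read it back.
torch.save('data.t7', y)
local loaded = torch.load('data.t7')
assert(loaded[1][1] == 3)
```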

Notable Applications

Torch was notably adopted in the early 2000s at the Idiap Research Institute for speech recognition tasks, where multilayer perceptron (MLP) models such as the MLP (Gater) Mixer and MLP (expert) architectures were employed to process speech data, including mixture of experts systems for feature extraction and classification.[26] These applications extended to pattern recognition, with Torch facilitating neural network-based experiments on tasks like handwritten digit recognition using linear tanh linear models, demonstrating its utility in handling structured data for signal and image processing in research settings during that decade.[26]

In terms of research impact, Facebook's FAIR lab leveraged Torch for developing computer vision models, contributing to convolutional neural networks (CNNs) and recurrent neural networks (RNNs), enabling efficient prototyping and training of these architectures in academic and industrial research throughout the 2010s. Notable projects utilizing Torch included image classification benchmarks on the ImageNet dataset, where implementations like those in Facebook's fbcunn extension to Torch7 allowed researchers to achieve competitive results in the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC), contributing to the widespread adoption of CNNs for large-scale object recognition.[27] In natural language processing, prototypes such as the unified neural network architecture proposed in the 2011 work on part-of-speech tagging, chunking, named entity recognition, and semantic role labeling were developed using Torch, setting benchmarks for end-to-end learning in NLP tasks with minimal linguistic priors.[28]

Industry examples highlight Torch's deployment in practical systems; Twitter integrated Torch for reinforcement learning applications in content recommendation, releasing extensions like torch-twrl in 2016 to enhance algorithmic decision-making in user feed personalization.
Torch achieved state-of-the-art results in various deep learning challenges during the 2010s, particularly in computer vision and speech domains, owing to its early and efficient CUDA-based GPU support, which accelerated training of large neural networks and outperformed CPU-only alternatives in benchmarks like ImageNet classification. This GPU integration was pivotal in enabling scalable experiments that pushed performance boundaries in convolutional and recurrent models.

Legacy and Successors

Current Status

Torch has not seen active development since early 2017, with the core GitHub repository indicating that it is no longer maintained and that its underlying functionality has been migrated to the ATen library in its successor framework.[23] The last significant updates to the repository occurred around 2018, but substantive contributions ceased earlier as resources shifted away from the Lua-based implementation.[29] The surrounding community remains minimal, consisting primarily of occasional forks and enthusiast-driven maintenance efforts, with no organized development team. LuaRocks continues to host Torch-related packages, though updates have been infrequent, with the primary torch_torch7 module last revised around April 2025 (v1.0.0-1).[30] This limited activity reflects Torch's transition from a vibrant ecosystem to one sustained only for niche or archival purposes. The decline in active use stems from Lua's relatively niche position in the machine learning landscape compared to Python's dominance, which offers broader ecosystem support and accessibility for practitioners. Additionally, developer resources have largely redirected toward more widely adopted frameworks that build on Torch's foundational ideas.[31] Despite its inactivity, Torch remains downloadable from its official website at torch.ch and can be compiled on modern systems, albeit with potential caveats such as LuaJIT compatibility challenges on newer operating systems and compilers lacking official support for post-5.1 Lua versions.[32] As of 2025, Torch is regarded as legacy software, appropriate for historical research or reproducing early deep learning experiments but not recommended for new projects due to its outdated dependencies and lack of security updates.[23]

Relation to PyTorch

PyTorch originated as a direct successor to the original Torch library, announced by Facebook AI Research (FAIR) in January 2017 as a Python-based replacement for the Lua-centric Torch framework while reusing the same efficient C backend libraries, including libTH for tensor computations and THNN for neural network operations.[33][34] This shift addressed Lua's limited popularity in the broader programming community, enabling wider adoption without sacrificing Torch's performance-oriented core.[4] Architecturally, PyTorch closely mirrors Torch's foundational elements, preserving key features such as multidimensional tensor operations with GPU acceleration, dynamic computation graphs for flexible model building, and modular neural network components that echo Torch's nn package—reimplemented in PyTorch as the torch.nn module for defining layers and models.[4] These similarities allowed PyTorch to serve as a near-direct port of Torch's core concepts, ensuring continuity for users familiar with Torch's imperative style and automatic differentiation via autograd. 
Key differences stem primarily from the language transition: PyTorch's Python interface enhances accessibility, debugging, and integration with the extensive Python ecosystem, making it more approachable for researchers and developers compared to Torch's Lua scripting.[4] PyTorch also introduced production-oriented enhancements absent in original Torch, such as TorchScript, which compiles Python code into an intermediate representation for optimized deployment and serialization, bridging research prototyping to scalable inference.[33] Migration from Torch to PyTorch was streamlined through built-in compatibility features: early PyTorch releases provided the torch.utils.serialization.load_lua function, which deserialized Torch's .t7 files to import tensors, parameters, and nn modules directly, often requiring minimal code adjustments (the utility was removed in PyTorch 1.0).[35] The Open Neural Network Exchange (ONNX) standard further supported interoperability, enabling export of Torch models via intermediate formats for import into PyTorch, which facilitated the widespread user transition during 2017-2018 as Torch development halted in favor of PyTorch.[36] PyTorch's enduring influence underscores Torch's legacy, with PyTorch achieving dominance in machine learning research by 2025, powering over 85% of deep learning papers at major conferences and boasting a 63% adoption rate in model training workflows, directly crediting Torch's innovative design for its dynamic, extensible paradigm.[37][38] Torch's C libraries evolved into ATen, PyTorch's core tensor library, providing the low-level optimizations for tensor manipulations and neural operations that sustain its high performance.[34]
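The architectural continuity is visible by translating the article's earlier Lua perceptron into PyTorch's torch.nn; this is a minimal sketch assuming a standard PyTorch installation:

```python
import torch
import torch.nn as nn

# The same multilayer perceptron as the Lua nn example:
# nn.Sequential + nn.Linear(10, 25) + nn.Tanh() + nn.Linear(25, 1)
mlp = nn.Sequential(
    nn.Linear(10, 25),   # 10 inputs, 25 hidden units
    nn.Tanh(),           # hyperbolic tangent nonlinearity
    nn.Linear(25, 1),    # 1 output
)

out = mlp(torch.randn(10))   # forward pass; autograd records the graph
print(out.shape)             # torch.Size([1])
```

The module names and composition model carry over almost verbatim; what changes is the host language and the fact that gradients flow through autograd rather than manually chained backward() calls.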
