List of statistical software
from Wikipedia

The following is a list of statistical software.

Open-source

gretl is an example of an open-source statistical package

Public domain

  • CSPro (core is public domain but without publicly available source code; the web UI has been open sourced under Apache version 2[2] and the help system under GPL version 3[3])
  • Dataplot (NIST)
  • X-13ARIMA-SEATS (public domain in the United States only; outside of the United States is under US government copyright)[4]

Freeware


Proprietary


Add-ons


See also


References

from Grokipedia
Statistical software encompasses a diverse array of computer programs and packages designed to facilitate statistical analysis, visualization, and modeling. These tools enable users to process large datasets, perform computations such as hypothesis testing and predictive modeling, and generate graphical representations that reveal patterns and trends. By automating manual calculations and supporting informed decision-making, statistical software is used across disciplines including the social sciences and healthcare.

The landscape of statistical software includes both proprietary and open-source options, each catering to different user needs in terms of accessibility, cost, and functionality. Proprietary packages such as SAS typically feature menu-driven interfaces suitable for non-programmers and offer vendor support for advanced analyses in commercial and academic environments, though they often require licensing fees. In contrast, open-source alternatives such as R (a programming language and environment for statistical computing) and Python libraries such as statsmodels provide free, extensible platforms that emphasize customization and integration with other tools, making them popular among researchers and data scientists working on novel or large-scale projects.

This list compiles notable statistical software packages, categorized by licensing type (open-source, public domain and freeware, and proprietary) and by functionality, highlighting their primary features, development history, and typical applications. The selection is not exhaustive. It focuses on widely adopted tools that have influenced statistical practice since the 1960s and 1970s, when early packages such as SPSS and SAS emerged to address computational demands in research and industry.

Open-source software

General-purpose systems

General-purpose open-source statistical software provides versatile platforms for data analysis, modeling, and visualization, supporting tasks that range from basic descriptive statistics to advanced simulations across scientific disciplines. These systems emphasize extensibility through community-contributed packages, so users can customize workflows for general statistical work without domain-specific constraints. Key examples are programming environments that integrate numerical computation, graphical output, and reproducible research practices.

R is a free software environment and programming language for statistical computing and graphics, initially released in 1995 under the GNU General Public License (GPL). Developed by Ross Ihaka and Robert Gentleman at the University of Auckland, it builds on the S language from Bell Laboratories. It offers data manipulation through structures such as data frames and matrices, and statistical modeling capabilities including regression, analysis of variance (ANOVA), time-series analysis, and clustering. Graphics are produced with base plotting tools for standard visualizations and with packages such as ggplot2 for layered, publication-quality plots. Extensibility comes through the Comprehensive R Archive Network (CRAN), which hosts over 23,000 packages as of November 2025. R's interpreted nature and object-oriented features make it suitable for interactive exploration and scripting in fields such as bioinformatics.

Python, an open-source general-purpose programming language first released in 1991 and distributed under the Python Software Foundation (PSF) license, serves as a foundation for statistical analysis when augmented by specialized libraries. Core libraries include NumPy for efficient multidimensional array operations and fast vectorized calculations on large datasets; SciPy, which extends NumPy with modules for optimization, integration, and statistical functions such as hypothesis tests; and pandas for data manipulation using DataFrame structures that support indexing, merging, and cleaning operations akin to R's data frames. Statsmodels provides tools for econometric and statistical modeling, including regression diagnostics and time-series decomposition, while scikit-learn supplies machine learning algorithms such as random forests and support vector machines together with validation metrics, so predictive modeling can be folded into broader analyses. The ecosystem's modularity and its compatibility with Jupyter notebooks promote collaborative, reproducible workflows.

Julia is a high-level, high-performance programming language for technical computing, launched in 2012 under the MIT license and designed to address the speed limitations of languages like Python and R in numerical tasks. Its multiple dispatch system allows methods to be defined based on argument types, which supports efficient polymorphic code for statistical simulations and optimizations. Built-in support for parallelism and distributed computing (via packages like Distributed.jl) enables scalable processing on multicore systems or clusters. For probability modeling, the Distributions.jl package offers implementations of univariate and multivariate distributions, density functions, and sampling methods, supporting tasks from Monte Carlo simulations to Bayesian inference. Julia's just-in-time (JIT) compilation via LLVM can approach the speed of C, which suits compute-intensive work such as large-scale hypothesis testing or differential equation solving in scientific modeling.

Scilab is open-source software for numerical computation, initially released in 1994 and now distributed under the GNU General Public License v2.0 (it was originally released under CeCILL, a GPL-compatible license). It provides a matrix-oriented environment similar to MATLAB for engineering and scientific applications. It supports matrix-based computations, including linear algebra operations and polynomial manipulations, alongside tools for simulation and statistical analysis. The Xcos module offers a graphical interface for modeling and simulating hybrid dynamical systems, akin to Simulink, allowing block-diagram-based design for control systems and physical simulations. Scilab's interpreter executes scripts interactively, with over 2,000 functions available for general-purpose tasks, including visualization via 2D and 3D plotting.

Octave, developed as a free alternative to MATLAB, is a high-level language for numerical computations distributed under the GPL. Project development began in 1992, the first alpha release followed in 1993, and the stable version 1.0 appeared in 1994. It focuses on solving linear and nonlinear problems through an interactive command-line interface, supporting matrix manipulations, eigenvalue computations, and optimization routines used in statistical applications. Add-on packages extend functionality to statistics, including hypothesis testing, regression, and probability distributions, as well as linear algebra for multivariate analysis. These are largely compatible with MATLAB scripts, which eases migration of existing code for general data processing and modeling tasks. Octave runs on multiple platforms and is widely used in educational and research environments.
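
As a rough illustration of the kind of computation these environments automate, the following sketch fits a simple linear regression by ordinary least squares using only Python's standard library. The function name `simple_ols` and the toy data are hypothetical and invented for this example; in practice a library such as statsmodels or R's `lm` would normally be used instead.

```python
from statistics import mean


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data, invented for illustration only
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

intercept, slope = simple_ols(hours, score)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

The closed-form estimates used here (slope as the ratio of cross-deviations to squared deviations) only cover the single-predictor case; the general-purpose systems above handle multiple predictors, diagnostics, and inference.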

Specialized applications

This section covers open-source statistical software designed for specific domains such as econometrics, survey analysis, data workflows, and visual analytics, providing targeted tools beyond general-purpose environments. These packages emphasize domain-specific features such as specialized modeling, intuitive interfaces for niche analyses, and integration with broader ecosystems, while remaining freely modifiable under the GNU General Public License (GPL).

Gretl is an econometrics-focused open-source package released in 2000, developed primarily by Allin Cottrell of Wake Forest University and Riccardo "Jack" Lucchetti of Università Politecnica delle Marche. It supports time-series analysis, panel data models such as fixed and random effects, and limited dependent variable estimation. Gretl features a scripting language called hansl (a recursive acronym for "Hansl's A Neat Scripting Language") for automating analyses and extending functionality, alongside a graphical user interface for interactive exploration. It is cross-platform and is used for econometric research on large datasets.

PSPP serves as an open-source alternative to the proprietary SPSS, first released in 2000 under the GPL and maintained by the GNU Project. It supports data entry through a spreadsheet-style editor, computes descriptive statistics including means, frequencies, and correlations, and performs inferential tests such as t-tests, ANOVA, and non-parametric procedures like chi-square. PSPP supports crosstabulations with measures of association and offers syntax compatibility with SPSS commands, which allows existing scripts to be migrated. Its command-line and GUI modes make it suitable for both interactive survey data processing and batch operations in the social sciences.

JASP is a graphical user interface for Bayesian and frequentist statistical analysis, initially released in 2013 under the GPL and developed by the University of Amsterdam's Department of Psychological Methods. Built on the R statistical environment for computations, it provides modules for t-tests with Bayes factors, ANOVA including repeated measures, regression, and meta-analysis with forest plots. JASP emphasizes reproducible results through integrated reporting and visual outputs such as prior and posterior distributions, making it particularly useful for psychological research that calls for accessible Bayesian workflows. Its modular design allows extension via packages for additional specialized analyses.

KNIME is an open-source data analytics platform launched in 2006 under the GPL, originating from the University of Konstanz and now maintained by KNIME AG. It employs a node-based visual workflow system for constructing pipelines for machine learning (such as random forests and neural networks), text analysis via topic modeling, and predictive modeling with cross-validation. KNIME integrates with R and Python for scripting advanced algorithms, supports large-scale data processing through extensions, and enables collaborative analytics in domains such as pharmaceuticals and finance. The drag-and-drop interface makes complex tasks accessible without extensive coding.

Orange is a visual programming tool for data mining, with development beginning in 1996 and a first public release in 1997 under the GPL, created at the University of Ljubljana's Faculty of Computer and Information Science. It offers a drag-and-drop canvas for building workflows involving classification algorithms such as support vector machines, clustering methods such as k-means, and data visualization tools including scatter plots and heatmaps. Orange includes add-ons for bioinformatics and for text analysis, the latter supporting topic extraction. Its component-based architecture promotes exploratory analysis in education and research, with Python scripting for custom extensions.
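
The crosstabulation and chi-square procedures mentioned above can be sketched in a few lines. This is a minimal illustration of the Pearson chi-square statistic for a contingency table, not the implementation used by PSPP or any other package; the function name and the counts are hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical survey crosstab: rows are groups, columns are yes/no answers
crosstab = [[30, 10],
            [20, 40]]

stat, dof = chi_square_statistic(crosstab)
print(f"chi-square = {stat:.3f} on {dof} degree(s) of freedom")
```

Turning the statistic into a p-value requires the chi-square distribution function, which the packages above provide and which is omitted here to keep the sketch dependency-free.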

Public domain and freeware

Public domain software

Public domain statistical software consists of tools developed primarily by government institutions, such as the U.S. National Institute of Standards and Technology (NIST) and the U.S. Census Bureau, and released without copyright restrictions, enabling unrestricted use, modification, and distribution. These programs emphasize reliability, reproducibility, and accessibility for scientific and demographic analysis, often supporting core statistical functions such as data visualization, modeling, and processing without requiring licenses or fees.

Dataplot, developed by NIST since the 1980s, serves as reference software for scientific visualization, statistical analysis, and non-linear modeling. It offers a command-line interface for tasks including ANOVA, distribution fitting, and data plotting, and runs on platforms such as Unix, Linux, Mac OS X, and Windows. Its public domain status allows integration with open-source environments for enhanced scripting capabilities.

X-13ARIMA-SEATS, maintained by the U.S. Census Bureau and descended from the 1965 X-11 method, is a specialized program for seasonal adjustment of time-series data. It performs seasonal adjustment and intervention analysis using ARIMA models and signal extraction techniques, supporting economic and demographic applications with outputs such as adjusted series and diagnostic statistics. As public domain software, it can be adopted freely in research.

The Census and Survey Processing System (CSPro), developed by the U.S. Census Bureau and collaborators since 2005, is designed for data entry, editing, and tabulation in census and survey operations, particularly in developing countries. It handles complex hierarchical data structures for demographic and health surveys, enabling custom applications for validation, aggregation, and reporting without proprietary limitations.

The U.S. Census Bureau's Population Analysis Spreadsheets (PAS), updated and released during the 2010s, provide utilities for demographic data processing, including aggregation of population estimates and calculation of rates and ratios from summary files. These Excel-based workbooks support tasks such as mortality analysis, offering straightforward methods for researchers to derive insights from raw demographic datasets.
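
To make the idea of seasonal adjustment concrete, the sketch below removes an additive seasonal pattern estimated from per-season means. This is a deliberately simplified stand-in and is not the method used by X-13ARIMA-SEATS, which relies on ARIMA modeling and signal extraction; the function name and the quarterly series are hypothetical.

```python
def seasonal_adjust(series, period):
    """Remove an additive seasonal pattern estimated from per-season means."""
    overall_mean = sum(series) / len(series)
    effects = []
    for season in range(period):
        # Every period-th observation belongs to the same season
        values = series[season::period]
        effects.append(sum(values) / len(values) - overall_mean)
    return [value - effects[i % period] for i, value in enumerate(series)]


# Hypothetical quarterly series with a repeating pattern and no trend
quarterly = [10, 20, 30, 20, 10, 20, 30, 20]

adjusted = seasonal_adjust(quarterly, period=4)
print(adjusted)
```

Because the toy series has no trend or irregular component, the adjusted values collapse to the overall mean; real series need trend estimation and outlier handling, which is where dedicated programs come in.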

Freeware

Freeware statistical software refers to closed-source programs distributed at no cost, typically under licenses that permit free use for personal, educational, or non-commercial purposes while restricting access to the source code and modifications. These tools often target specific domains such as epidemiology or survey analysis, providing user-friendly interfaces for professionals who need reliable, no-cost solutions without programming expertise. Unlike open-source alternatives, freeware options emphasize ease of deployment and focused functionality, though they may include limitations on features or usage volume.

Epi Info, developed by the Centers for Disease Control and Prevention (CDC), is a free suite for epidemiological investigations first released in the 1980s to support data management and analysis. It enables the creation of data entry forms with validation rules, basic statistical computations such as odds ratios and chi-square tests for outbreak investigations, and integration with mapping tools for visualizing disease patterns. Version 7, launched in 2011, introduced mobile device compatibility alongside enhanced support for relational databases, making it suitable for fieldwork in resource-limited settings. The tool has been widely adopted in public health responses, including surveillance systems for infectious diseases, because of its straightforward interface and no-cost availability. However, CDC discontinued product development and technical assistance after September 1, 2025.

BlueSky Statistics offers a freeware graphical user interface (GUI) built around the R programming language, introduced around 2010 to simplify statistical workflows for users who prefer not to code. The free version provides point-and-click access to over 400 procedures, including linear and logistic regression, factor analysis, and automated graphics generation, with drag-and-drop functionality for data import and model building. Designed for academics, researchers, and business analysts, it supports data management tasks such as cleaning and transformation and outputs results as tables and plots, bridging the gap between R computations and accessible analysis. Although a commercial Pro edition exists with advanced features, the freeware tier is sufficient for standard statistical tasks.

Alchemer, known as SurveyGizmo from its founding in 2006 until its later rebranding, includes a free tier for closed-source survey creation and basic statistical summaries, targeted at small-scale research and feedback collection. The free plan allows up to three active surveys with a limit of 100 responses per month, supporting descriptive statistics such as means, frequencies, and crosstabulations to summarize respondent data. Users can design questionnaires with logic branching and export results for further analysis, which is useful for educational assessments and similar work where advanced modeling is not required. This tier omits features of the paid plans such as custom branding and unlimited responses, but it covers core needs for quick, cost-free insights from survey data.

WinBUGS, released in 2000 by the MRC Biostatistics Unit at the University of Cambridge, is a freeware application for Bayesian statistical modeling available for non-commercial use. It features an interactive interface for specifying complex hierarchical models and performing Markov chain Monte Carlo (MCMC) simulations to estimate posterior distributions, supporting distributions such as normal, binomial, and Poisson. Users define models using a graphical tool or simple scripting, with built-in diagnostics for convergence assessment, enabling analyses such as disease risk modeling without extensive coding. As part of the broader BUGS project, WinBUGS has helped spread Bayesian methods through its extensible framework and free distribution. For similar Bayesian tasks, it serves as a closed-source counterpart to open-source tools like JASP.
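
The odds ratio computation mentioned for outbreak investigations can be sketched as follows. This is a minimal illustration using the standard log-based (Woolf) 95% confidence interval, not the code of Epi Info or any other product; the function name and the 2x2 counts are hypothetical.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical outbreak data, invented for illustration only
estimate, lower, upper = odds_ratio(a=20, b=10, c=5, d=25)
print(f"OR = {estimate:.1f} (95% CI {lower:.2f} to {upper:.2f})")
```

The approximation breaks down when any cell is zero or very small; exact methods are usually preferred in that situation.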

Proprietary software

Desktop applications

Desktop applications encompass proprietary statistical software packages designed for installation on personal computers, offering graphical user interfaces and tools for advanced analysis, modeling, and visualization. They are widely used in academia, industry, and research for tasks ranging from basic summaries to complex modeling, with vendor-provided support and regular updates. Unlike open-source alternatives, they emphasize ease of use through menu-driven operations and integrated help systems, catering to users without extensive programming expertise.

IBM SPSS Statistics, first released in 1968 as the Statistical Package for the Social Sciences, is a comprehensive suite acquired by IBM in 2009. It provides menu-driven interfaces for hypothesis testing and for machine learning algorithms such as decision trees and neural networks. The latest version, 31 (released in 2025), incorporates AI integration via the AI Output Assistant powered by watsonx.ai, which lets users interact with analysis results to aid interpretation.

SAS, developed at North Carolina State University and first commercialized in 1976 by SAS Institute, is a suite known for predictive modeling and automated reporting. Core desktop functionality relies on DATA steps for data manipulation and PROC steps for statistical analysis, such as PROC REG for regression and PROC ANOVA for analysis of variance. While the Viya platform extends capabilities to cloud environments, the foundational SAS 9.4 remains a desktop-centric installation for on-premises use.

Stata, released in 1985 by StataCorp, is a proprietary econometric and general-purpose statistical package that emphasizes scripting via do-files for reproducible workflows. It handles panel data with commands such as xtreg for fixed-effects models, survival analysis through stset and stcox for Cox proportional hazards models, and customizable graphics using the twoway and graph commands. Version 19 (released in April 2025) introduces enhanced Bayesian features, including model averaging and multilevel modeling, through commands such as bayesmh and the bayes prefix.

Minitab, originating in 1972 at Pennsylvania State University as a lightweight statistical tool, is a package focused on quality improvement and process analysis. It supports Six Sigma methodologies with tools such as capability analysis (for example, Cpk calculations), design of experiments (DOE) via full factorial and response surface designs, and control charts including Xbar-R and individuals charts for monitoring variation. Version 22 (released in 2024) adds Python integration through the mtbpy module, allowing users to extend analyses with custom scripts within the interface.

JMP, launched in 1989 as a product of SAS Institute, is proprietary discovery-oriented software tailored for interactive data exploration, particularly in the life sciences. It features dynamic visualization tools such as scatterplot matrices and treemaps, scripting in the JMP Scripting Language (JSL) for automation, and DOE platforms supporting custom designs such as D-optimal and mixture experiments. JMP is widely adopted for its point-and-click interface, which supports rapid prototyping of analyses.

Many of these desktop applications offer free trials or discounted academic licenses to facilitate evaluation and educational use.
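
The capability indices mentioned for quality improvement follow simple formulas: Cp = (USL - LSL) / (6σ) and Cpk = min(USL - μ, μ - LSL) / (3σ). The sketch below applies them with the overall sample standard deviation. The function name, measurements, and specification limits are hypothetical, and commercial packages may estimate σ differently (for example from within-subgroup variation), so their reported values need not match.

```python
from statistics import mean, stdev


def process_capability(samples, lsl, usl):
    """Return (Cp, Cpk) using the overall sample standard deviation."""
    mu = mean(samples)
    sigma = stdev(samples)  # assumption: overall sigma, not within-subgroup
    cp = (usl - lsl) / (6 * sigma)
    # Cpk penalizes a process whose mean sits closer to one specification limit
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk


# Hypothetical measurements and specification limits
measurements = [98, 100, 102, 100, 100]
cp, cpk = process_capability(measurements, lsl=94, usl=106)
print(f"Cp = {cp:.3f}, Cpk = {cpk:.3f}")
```

When the process mean is centered between the limits, as in this toy data, Cp and Cpk coincide; an off-center mean lowers Cpk while leaving Cp unchanged.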

Add-ons and extensions

Add-ons and extensions are proprietary modules that enhance the statistical functionality of an existing host application, such as a spreadsheet or numerical platform, by providing specialized tools for advanced analysis without requiring users to switch to a standalone application. These extensions typically integrate into the host software, using its interface for data input and visualization while adding capabilities such as hypothesis testing, regression, and validation procedures. They are particularly useful for professionals in fields requiring domain-specific computations, such as biomedical research.

Analyse-it is a statistical add-in for Microsoft Excel, first released in December 1997, that extends the spreadsheet with over 200 statistical tests, specialized charts, and analysis of variance (ANOVA) tools for data exploration and hypothesis testing. It is sold in several editions tailored to user needs, including one for process improvement. Version 6.15, released in April 2023, introduced enhancements for method validation and verification, supporting compliance with regulatory standards through procedures for evaluating analytical and diagnostic methods.

XLSTAT, an add-on for Excel developed since 1993, offers more than 250 statistical features, including partial least squares (PLS) regression for multivariate modeling. It integrates directly with Excel's interface, allowing no-code analysis, and supports R integration via XLSTAT-R, which lets users run R procedures and scripts from Excel dialog boxes.

The MATLAB Statistics and Machine Learning Toolbox, a proprietary extension first released in 1993 as the Statistics Toolbox and renamed in 2015, augments the MATLAB environment with functions for data description, analysis, and modeling. These include clustering algorithms such as k-means and hierarchical methods, classification techniques such as support vector machines (SVMs) and boosted trees, and hypothesis testing tools such as t-tests and nonparametric tests. The toolbox supports both exploratory data analysis and advanced machine learning workflows, with support for large datasets through tall arrays and code generation for deployment.

OriginPro, a graphing and analysis package first released in 1992, includes statistical add-ons focused on peak analysis, allowing users to perform nonlinear regressions, baseline corrections, and analysis of overlapping peaks in scientific datasets. It extends the base graphing tools with advanced fitting functions and Apps for surface analysis, making it suitable for researchers in chemistry, physics, and engineering who need precise visualization and quantitative insights from experimental data.

SigmaPlot, from Systat, is a package for scientific graphing that adds nonlinear regression and design of experiments (DOE) tools to its visualization features, enabling quick creation of publication-quality graphs alongside statistical evaluations such as t-tests and ANOVA. Integrated with SigmaStat for over 50 statistical tests, it supports iterative fitting and automated parameter estimation, helping scientists model complex relationships in biological and physical data.
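
As an illustration of the hypothesis tests these add-ons expose through menus or functions, the following sketch computes Welch's two-sample t statistic and its Welch-Satterthwaite degrees of freedom. It is a generic textbook calculation, not the routine of any product named above; the function name and samples are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    dof = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, dof


# Hypothetical measurements from two groups
group_a = [5, 7, 9]
group_b = [1, 2, 3]

t, dof = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {dof:.2f}")
```

Unlike the pooled-variance t-test, this version does not assume equal variances, which is why the degrees of freedom are generally not an integer.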

Online and cloud-based tools

Free platforms

Free platforms encompass web-based statistical tools that provide no-cost access to computing environments, supporting browser-based coding and visualization without software installation or hardware setup. These services typically host open-source languages and libraries, enabling users to perform statistical modeling, data exploration, and reproducible workflows directly in the browser. By removing setup barriers they support educational use, usually with limits on compute time or resources.

Google Colab, introduced in 2017 and owned by Google, is a hosted Jupyter notebook service that allows users to write and execute code in the browser with free access to computing resources, including GPUs and TPUs. It primarily supports Python for statistical analysis, machine learning tasks, and data visualization, while R can be enabled by switching the runtime type to an R kernel (R 4.4 or later as of 2025). The environment includes libraries such as pandas for statistical computations, making it suitable for quick experiments and for sharing notebooks via Google Drive.

Kaggle Notebooks (formerly known as Kaggle Kernels), launched in 2017, offer a free tier within the data science platform for creating and running interactive R and Python notebooks. Users can draw on built-in datasets, competitions, and statistical libraries such as Seaborn for visualization, all executed in a hosted environment that promotes community collaboration. The platform's integration with Kaggle's large repository of public datasets enables statistical explorations, from initial data inspection to model building, without local setup.

Posit Cloud, formerly RStudio Cloud and rebranded under Posit in 2022, has provided a free hosted version of the RStudio IDE since its beta launch in 2019, offering browser-based access to R environments for individual users. It supports R scripting, development of Shiny applications for interactive statistical dashboards, and creation of reproducible reports using tools such as R Markdown. With no installation required, the Community Edition allows up to 15 hours of monthly compute time, which suits teaching, learning, and lightweight statistical projects.

Observable, publicly launched in 2018, is a JavaScript-based notebook platform that is free for public use, emphasizing reactive computations and interactive visualization. Users build dynamic, exploratory analyses using libraries such as D3 and Plot, where cells react to changes in real time for immediate feedback. The platform's web-native focus makes it particularly effective for creating shareable, embedded visualizations of statistical results. However, as of April 15, 2025, Observable Cloud has been deprecated, with no new instances available; existing deployments continue to function, and new development is encouraged via the open-source Observable Framework.

Commercial services

Commercial services encompass paid, subscription-based platforms for statistical work, offering capacity for large datasets, collaborative features for teams, and integration with enterprise systems to support professional analytics workflows.

Posit Workbench, formerly known as RStudio Server Pro, is a server-hosted integrated development environment (IDE) platform launched in 2011 that supports advanced statistical computing in R and Python. It enables team collaboration through shared projects and multi-user access, integrates version control systems such as Git for code management, and supports statistical modeling, visualization, and related tasks via its RStudio, JupyterLab, and VS Code IDEs. Pricing is customized and available upon request, with individual related products starting around $1,163 per year as of 2025, scaling with user count and advanced features.

SAS Viya is a cloud analytics platform providing end-to-end capabilities for statistical analysis and decision-making in a scalable environment. It supports real-time statistical processing, AI-driven modeling with automated pipelines, and visual interfaces, while allowing integration with on-premises data sources for hybrid deployments. As an enterprise solution, pricing is customized based on deployment scale and features, typically involving annual subscriptions.

IBM Cloud Pak for Data, incorporating SPSS functionality, is a proprietary platform available since 2018 that unifies data services and AI within a cloud ecosystem. It combines the statistical tools of SPSS (such as regression, hypothesis testing, and multivariate analysis) with IBM Watson AI for automated insight generation and model deployment across hybrid clouds. Enterprise pricing is negotiated based on capacity and modules, often starting in the tens of thousands of dollars annually for mid-sized organizations.

Alteryx Designer Cloud, the cloud version of Alteryx's workflow automation tool (which originated in 2010, with a full cloud release in 2020), is a proprietary platform focused on streamlining statistical workflows. It emphasizes data preparation through blending and transformation, predictive statistical modeling via built-in tools, and automation of repetitive tasks in a collaborative browser-based interface. Following edition-based changes in May 2025, subscriptions start at $250 per user per month (billed annually) for the Starter Edition, with costs increasing for enterprise-scale usage and additional automations.

Tableau Server, extended to cloud operations as Tableau Cloud, is a proprietary service that incorporates statistical extensions for advanced analysis. It enables statistical modeling through integrations with R and Python scripts, allowing users to embed custom statistical computations within interactive visualizations and dashboards for collaborative sharing. Pricing as of 2025 is $15 for Viewer, $42 for Explorer, and $75 for Creator per user per month, billed annually, with the Creator tier including full development access.

Many of these services offer free trials for evaluating premium features such as dedicated support, and they are distinguished from free platforms by their focus on production environments.
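
Per-seat pricing of this kind is easy to compare with a short calculation. The sketch below totals the annual cost of a mixed team using the Tableau per-user monthly prices quoted above; the team composition and the function name are hypothetical, and actual invoices may differ because of discounts, taxes, or minimum seat counts.

```python
# Monthly per-user prices (USD) as quoted in the text above, billed annually
TABLEAU_MONTHLY_PRICE = {"viewer": 15, "explorer": 42, "creator": 75}


def annual_cost(seats, monthly_price=TABLEAU_MONTHLY_PRICE):
    """Total yearly subscription cost for a mapping of tier -> seat count."""
    monthly_total = sum(monthly_price[tier] * count for tier, count in seats.items())
    return monthly_total * 12


# Hypothetical team: 10 viewers, 3 explorers, 2 creators
team = {"viewer": 10, "explorer": 3, "creator": 2}
total = annual_cost(team)
print(f"Estimated annual cost: ${total:,}")
```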
