List of Apache Software Foundation projects
from Wikipedia

This list of Apache Software Foundation projects includes the software development initiatives maintained by the Apache Software Foundation (ASF).[1]

In addition to active projects, the ASF maintains several related areas:

  • Incubator: for projects that are in the process of becoming full ASF projects.
  • Attic: for ASF projects that have been formally retired.
  • INFRA (Apache Infrastructure Team): responsible for managing and providing the technical infrastructure and services used by all Apache projects.[2]

Active projects

  • Accumulo: secure implementation of Bigtable
  • ActiveMQ: message broker supporting different communication protocols and clients, including a full Java Message Service (JMS) 1.1 client.[3]
  • AGE: PostgreSQL extension that provides graph database functionality in order to enable users of PostgreSQL to use graph query modeling in unison with PostgreSQL's existing relational model
  • Airavata: a distributed system software framework to manage simple to composite applications with complex execution and workflow patterns on diverse computational resources
  • Airflow: Python-based platform to programmatically author, schedule and monitor workflows
  • Allura: Python-based open source implementation of a software forge
  • Ambari: tool for provisioning, managing, and monitoring Apache Hadoop clusters
  • Ant: Java-based build tool
    • AntUnit: an Ant library that provides Ant tasks for testing Ant tasks; it can also be used to drive functional and integration tests of arbitrary applications with Ant
    • Ivy: a powerful dependency manager oriented toward Java dependency management, though it can be used to manage dependencies of any kind
    • IvyDE: Eclipse plugin that integrates Ivy into the Eclipse IDE
  • APISIX: cloud-native microservices API gateway
  • Archiva: Build Artifact Repository Manager
  • Aries: OSGi Enterprise Programming Model
  • Arrow: "A high-performance cross-system data layer for columnar in-memory analytics".[4][5]
  • AsterixDB: open source Big Data Management System
  • Atlas: scalable and extensible set of core foundational governance services
  • Avro: a data serialization system.
  • Apache Axis Committee
    • Axis: open source, XML based Web service framework
    • Axis2: a service hosting and consumption framework that makes it easy to use SOAP and Web Services
    • Rampart: implementation of the WS-Security standard for the Axis2 Web services engine
    • Sandesha2: an Axis2 module implementing WS-RM.
  • Bahir: extensions to distributed analytic platforms such as Apache Spark
  • Beam: an uber-API for big data
  • Bigtop: a project for the development of packaging and tests of the Apache Hadoop ecosystem.
  • Bloodhound: defect tracker based on Trac[6]
  • BookKeeper: a reliable replicated log service
  • Brooklyn: a framework for modelling, monitoring, and managing applications through autonomic blueprints
  • BRPC: industrial-grade RPC framework for building reliable and high-performance services
  • BuildStream: tool for building/integrating software stacks
  • BVal: Bean Validation API Implementation
  • Calcite: dynamic data management framework
  • Camel: declarative routing and mediation rules engine which implements the Enterprise Integration Patterns using a Java-based domain specific language
  • CarbonData: an indexed columnar data format for fast analytics on big data platform, e.g., Apache Hadoop, Apache Spark, etc
  • Cassandra: highly scalable second-generation distributed database
  • Causeway (formerly Isis): a framework for rapidly developing domain-driven apps in Java
  • Cayenne: Java ORM framework
  • Celix: implementation of the OSGi specification adapted to C and C++
  • CloudStack: software to deploy and manage cloud infrastructure
  • Cocoon: XML publishing framework
  • Commons: reusable Java libraries and utilities too small to merit their own project
    • BCEL: Bytecode Engineering Library
    • Daemon: Commons Daemon
    • Jelly: a Java- and XML-based scripting engine that combines ideas from JSTL, Velocity, DVSL, Ant, and Cocoon in a simple yet powerful scripting engine
    • Logging: Commons Logging is a thin adapter allowing configurable bridging to other, well known logging systems
    • OGNL: Object Graph Navigation Library
  • Community Development: project that creates and provides tools, processes, and advice to help open-source software projects improve their own community health
  • Cordova: mobile development framework
  • CouchDB: Document-oriented database
  • Apache Creadur Committee
    • Rat: improves accuracy and efficiency when reviewing and auditing releases.
    • Tentacles: simplifies the job of reviewing repository releases consisting of large numbers of artefacts
    • Whisker: assists assembled applications to maintain correct legal documentation.
  • cTAKES: clinical "Text Analysis Knowledge Extraction Software" to extract information from electronic medical record clinical free-text
  • Curator: builds on ZooKeeper and handles the complexity of managing connections to the ZooKeeper cluster and retrying operations
  • CXF: web services framework
  • Daffodil: implementation of the Data Format Description Language (DFDL) used to convert between fixed format data and XML/JSON
  • DataFu: collection of libraries for working with large-scale data in Hadoop
  • DataSketches: open source, high-performance library of stochastic streaming algorithms commonly called "sketches" in the data sciences
  • Apache DB Committee
  • DeltaSpike: collection of JSR-299 (CDI) Extensions for building applications on the Java SE and EE platforms
  • Apache Directory Committee
    • Directory: LDAP and Kerberos, entirely in Java.
    • Directory Server: an extensible, embeddable LDAP and Kerberos server, entirely in Java
    • Directory Studio: Eclipse based LDAP browser and directory client
    • Fortress: a standards-based authorization platform that implements ANSI INCITS 359 Role-Based Access Control (RBAC)
    • Kerby: Kerberos binding in Java
    • LDAP API: an SDK for directory access in Java
    • SCIMple: an implementation of the SCIM v2.0 specification
  • DolphinScheduler: a distributed ETL scheduling engine with powerful DAG visualization interface
  • Doris: MPP-based interactive SQL data warehousing for reporting and analysis, good for both high-throughput scenarios and high-concurrency point queries
  • Drill: software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets
  • Druid: high-performance, column-oriented, distributed data store
  • Dubbo: high-performance, lightweight, Java-based RPC framework
  • ECharts: charting and data visualization library written in JavaScript
  • Empire-db: a lightweight relational database abstraction layer and data persistence component
  • EventMesh: dynamic cloud-native basic service runtime used to decouple the application and middleware layer
  • Felix: implementation of the OSGi Release 5 core framework specification
  • Fineract: Platform for Digital Financial Services
  • Flagon: software tool usability testing platform
  • Flex: cross-platform SDK for developing and deploying rich Internet applications.
  • Flink: fast and reliable large-scale data processing engine.
  • Flume: large scale log aggregation framework
  • Apache Fluo Committee
    • Fluo: a distributed processing system that lets users make incremental updates to large data sets
    • Fluo Recipes: Apache Fluo Recipes build on the Fluo API to offer additional functionality to developers
    • Fluo YARN: a tool for running Apache Fluo applications in Apache Hadoop YARN
  • FreeMarker: a template engine, i.e. a generic tool to generate text output based on templates. FreeMarker is implemented in Java as a class library for programmers
  • Geode: low latency, high concurrency data management solutions
  • Geronimo: Java EE server
  • Gobblin: distributed data integration framework
  • Gora: an open source framework that provide an in-memory data model and persistence for big data
  • Griffin: an open source Data Quality solution for Big Data, which supports both batch and streaming mode. Originally developed by eBay[7]
  • Groovy: an object-oriented, dynamic programming language for the Java platform
  • Guacamole: HTML5 web application for accessing remote desktops [8]
  • Gump: integration, dependencies, and versioning management
  • Hadoop: Java software framework that supports data intensive distributed applications
  • HAWQ: advanced enterprise SQL on Hadoop analytic engine
  • HBase: the Hadoop database, a distributed, scalable big data store
  • Helix: a cluster management framework for partitioned and replicated distributed resources
  • Hive: data warehouse software that facilitates querying and managing large datasets residing in distributed storage
  • Hop: The Hop Orchestration Platform, or Apache Hop, aims to facilitate all aspects of data and metadata orchestration.
  • HTTP Server: The Apache HTTP Server application 'httpd'
    • mod_python: module that integrates the Python interpreter into Apache server. Deprecated in favour of mod_wsgi.
  • Apache HttpComponents: low-level Java libraries for HTTP
  • Hudi: provides atomic upserts and incremental data streams on Big Data
  • Iceberg: an open standard for analytic SQL tables, designed for high performance and ease of use.
  • Ignite: an In-Memory Data Fabric providing in-memory data caching, partitioning, processing, and querying components[9]
  • Impala: a high-performance distributed SQL engine
  • InLong: a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities
  • IoTDB: data store for managing large amounts of time series data in industrial applications
  • Jackrabbit: implementation of the Java Content Repository API
  • James: Java email and news server
  • jclouds: open source multi-cloud toolkit for the Java platform
  • Jena: an open source Semantic Web framework for Java
  • JMeter: pure Java application for load and functional testing
  • Johnzon: JSR-353 compliant JSON parsing; modules to help with JSR-353 as well as JSR-374 and JSR-367
  • JSPWiki: A feature-rich and extensible WikiWiki engine built around the standard J2EE components (Java, servlets, JSP)
  • Juneau: A toolkit for marshalling POJOs to a wide variety of content types using a common framework
  • Kafka: a message broker software
  • Karaf: an OSGi distribution for server-side applications.
  • Kibble: a suite of tools for collecting, aggregating and visualizing activity in software projects.
  • Knox: a REST API Gateway for Hadoop Services
  • Kudu: a distributed columnar storage engine built for the Apache Hadoop ecosystem
  • Kvrocks: a distributed key-value NoSQL database, supporting the rich data structure
  • Kylin: distributed analytics engine
  • Kyuubi: a distributed multi-tenant Thrift JDBC/ODBC server for large-scale data management, processing, and analytics, built on top of Apache Spark and designed to support more engines
  • Libcloud: a standard Python library that abstracts away differences among multiple cloud provider APIs.
  • Linkis: a computation middleware project, which decouples the upper applications and the underlying data engines, provides standardized interfaces (REST, JDBC, WebSocket etc.) to easily connect to various underlying engines (Spark, Presto, Flink, etc.)
  • Apache Logging Services Committee
    • Chainsaw: a GUI log viewer.
    • Log4cxx: provides logging services for C++.
    • Log4j: Apache Log4j
    • Log4net: provides logging services for .NET.
    • Log4php: a logging framework for PHP.
  • Apache Lucene Committee
    • Lucene Core: a high-performance, full-featured text search engine library
    • Solr: enterprise search server based on the Lucene Java search library
  • Lucene.NET: a port of the Lucene search engine library, written in C# and targeted at .NET runtime users.
  • MADlib: Scalable, Big Data, SQL-driven machine learning framework for Data Scientists
  • Mahout: machine learning and data mining solution
  • ManifoldCF: Open-source software for transferring content between repositories or search indexes
  • Maven: Java project management and comprehension tool
    • Doxia: a content generation framework, which supports many markup languages.
  • Mesos: open-source cluster manager
  • Apache MINA Committee
    • FtpServer: FTP server written entirely in Java
    • MINA: Multipurpose Infrastructure for Network Applications, a framework for developing high-performance, highly scalable network applications
    • SSHD: a 100% pure Java library supporting the SSH protocols on both the client and server side
    • Vysper: a modular, full-featured XMPP (Jabber) server implemented in Java
  • Mnemonic: a transparent nonvolatile hybrid memory oriented library for Big data, High-performance computing, and Analytics
  • Apache MyFaces Committee
  • Mynewt: embedded OS optimized for networking and built for remote management of constrained devices
  • NetBeans: development environment, tooling platform, and application framework
  • NiFi: easy to use, powerful, and reliable system to process and distribute data
  • Nutch: a highly extensible and scalable open source web crawler
  • NuttX: mature, real-time embedded operating system (RTOS)
  • OFBiz: Open for Business: enterprise automation software
  • Olingo: Client and Server for OData
  • Oozie: a workflow scheduler system to manage Apache Hadoop jobs.
  • OpenJPA: Java Persistence API Implementation
  • OpenMeetings: video conferencing, instant messaging, white board and collaborative document editing application
  • OpenNLP: natural language processing toolkit
  • OpenOffice: an open-source, office-document productivity suite
  • OpenWebBeans: Dependency Injection Platform
  • OpenWhisk: distributed Serverless computing platform
  • ORC: columnar file format for big data workloads
  • Ozone: scalable, redundant, and distributed object store for Hadoop
  • Parquet: a general-purpose columnar storage format
  • PDFBox: Java based PDF library (reading, text extraction, manipulation, viewer)
  • Mod_perl: module that integrates the Perl interpreter into Apache server
  • Pekko: toolkit and an ecosystem for building highly concurrent, distributed, reactive and resilient applications for Java and Scala[10]
  • Petri: deals with the assessment of, education in, and adoption of the Foundation's policies and procedures for collaborative development and the pros and cons of joining the Foundation
  • Phoenix: SQL layer on HBase
  • Pig: a platform for analyzing large data sets on Hadoop
  • Pinot: a column-oriented, open-source, distributed data store written in Java[11]
  • Pivot: a platform for building rich internet applications in Java
  • PLC4X: Universal API for communicating with programmable logic controllers
  • Apache POI Committee
  • APR: Apache Portable Runtime, a portability library written in C
  • Portals: web portal related software
  • Pulsar: distributed pub-sub messaging system originally created at Yahoo
  • Qpid: AMQP messaging system in Java and C++
  • Ranger: a framework to enable, monitor and manage comprehensive data security across the Hadoop platform
  • Ratis: Java implementation for RAFT consensus protocol
  • RocketMQ: a fast, low latency, reliable, scalable, distributed, easy to use message-oriented middleware, especially for processing large amounts of streaming data
  • Roller: a full-featured, multi-user and group blog server suitable for both small and large blog sites
  • Royale: improving developer productivity in creating applications for wherever JavaScript runs (and other runtimes)
  • Rya: cloud-based RDF triple store that supports SPARQL queries
  • Samza: Stream Processing Framework
  • Santuario: XML Security in Java and C++
  • SDAP: integrated data analytic center for Big Science problems
  • SeaTunnel: a very easy-to-use ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data
  • Sedona: big geospatial data processing engine
  • Serf: high performance C-based HTTP client library built upon the Apache Portable Runtime (APR) library
  • ServiceComb: microservice framework that provides a set of tools and components to make development and deployment of cloud applications easier
  • ServiceMix: enterprise service bus that supports JBI and OSGi
  • ShardingSphere: related to a database clustering system providing data sharding, distributed transactions, and distributed database management
  • ShenYu: Java native API Gateway for service proxy, protocol conversion and API governance
  • Shiro: a simple to use Java Security Framework
  • SINGA: a distributed deep learning library
  • Spatial Information System (SIS): A library for developing geospatial applications
  • SkyWalking: application performance management and monitoring (APM)
  • Sling: innovative Web framework based on JCR and OSGi
  • Solr: Full Text search server
  • SpamAssassin: email filter used to identify spam
  • Spark: open source cluster computing framework
  • Steve: a collection of online voting tools used by the ASF to handle STV and other voting methods
  • Storm: a distributed real-time computation system.
  • StreamPipes: self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore (Industrial) IoT data streams
  • Streams: Interoperability of online profiles and activity feeds
  • Struts: Java web applications framework
  • Submarine: Cloud Native Machine Learning Platform
  • Subversion: open source version control (client/server) system
  • Superset: enterprise-ready web application for data exploration, data visualization and dashboarding
  • Synapse: a lightweight and high-performance Enterprise Service Bus (ESB)
  • Syncope: an Open Source system for managing digital identities in enterprise environments.
  • SystemDS: scalable machine learning
  • Tapestry: component-based Java web framework
  • Apache Tcl Committee
    • Tcl integration for Apache httpd
    • Rivet: Server-side Tcl programming system combining ease of use and power
    • Websh: a rapid development environment for building powerful, fast, and reliable web applications in Tcl
  • Tez: an effort to develop a generic application framework which can be used to process arbitrarily complex directed-acyclic graphs (DAGs) of data-processing tasks and also a re-usable set of data-processing primitives which can be used by other projects
  • Thrift: interface definition language and binary communication protocol used to define and create services for numerous languages
  • Tika: content analysis toolkit for extracting metadata and text from digital documents of various types, e.g., audio, video, image, office suite, web, mail, and binary
  • TinkerPop: A graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP)
  • Tomcat: web container for serving servlets and JSP
  • TomEE: an all-Apache Java EE 6 Web Profile stack for Apache Tomcat
  • Traffic Control: built around Apache Traffic Server as the caching software, implements all the core functions of a modern CDN
  • Traffic Server: HTTP/1.1 compliant caching proxy server
  • Turbine: a servlet based framework that allows Java developers to quickly build web applications
  • TVM: an end to end machine learning compiler framework for CPUs, GPUs and accelerators
  • UIMA: unstructured content analytics framework
  • Unomi: reference implementation of the OASIS customer data platform specification
  • VCL: a cloud computing platform for provisioning and brokering access to dedicated remote compute resources.
  • Apache Velocity Committee:
    • Anakia: an XML transformation tool which uses JDOM and Velocity to transform XML documents into multiple formats.
    • Texen: a general purpose text generating utility based on Apache Velocity and Apache Ant.
    • Velocity: Java template creation engine
    • Apache Velocity DVSL: a tool modeled after XSLT and intended for general XML transformations using the Velocity Template Language.
    • Apache Velocity Tools: tools and infrastructure for the template engine
  • Apache Web Services Committee
    • Axiom: an XML object model supporting deferred parsing.
    • Woden: used to develop a Java class library for reading, manipulating, creating and writing WSDL documents.
  • Whimsy: tools that display and visualize various bits of data related to ASF organizations and processes.
  • Wicket: component-based Java web framework
  • Xalan: XSLT processors in Java and C++
  • Xerces: validating XML parser
  • Apache XML Graphics Committee
    • Batik: pure Java library for SVG content manipulation
    • FOP: Java print formatter driven by XSL formatting objects (XSL-FO); supported output formats include PDF, PS, PCL, AFP, XML (area tree representation), Print, AWT and PNG, and to a lesser extent, RTF and TXT
    • XML Graphics Commons: common components for Apache Batik and Apache FOP
  • Yetus: a collection of libraries and tools that enable contribution and release processes for software projects
  • YuniKorn: standalone resource scheduler responsible for scheduling batch jobs and long-running services on large scale distributed systems
  • Zeppelin: a collaborative data analytics and visualization tool for distributed, general-purpose data processing systems
  • ZooKeeper: coordination service for distributed applications

Incubating projects


Retired projects


A retired project is one that has been closed down on the initiative of the board, the project's PMC, the PPMC, or the IPMC for various reasons. It is no longer developed at the Apache Software Foundation and has no further duties within the foundation.

  • Abdera: implementation of the Atom Syndication Format and Atom Publishing Protocol
  • ACE: a distribution framework that allows central management and distribution of software components, configuration data and other artefacts to target systems
  • Any23: Anything To Triples (Any23) is a library, a web service and a command line tool that extracts structured data in RDF format from a variety of Web documents
  • Apex: Enterprise-grade unified stream and batch processing engine
  • Aurora: Mesos framework for long-running services and cron jobs
  • AxKit: XML Application Server for Apache. It provided on-the-fly conversion from XML to any format, such as HTML, WAP or text using either W3C standard techniques, or flexible custom code
  • Beehive: Java visual object model
  • Buildr: a build system for Java-based applications, including support for Scala, Groovy and a growing number of JVM languages and tools
  • Chemistry: provides open source implementations of the Content Management Interoperability Services (CMIS) specification
  • Chukwa: an open source data collection system for monitoring large distributed systems
  • Clerezza: a service platform which provides a set of functionality for management of semantically linked data accessible through RESTful Web Services and in a secured way
  • Click: simple and easy-to-use Java Web Framework
  • Continuum: continuous integration server
  • Crimson: Java XML parser which supports XML 1.0 via various APIs
  • Crunch: Provides a framework for writing, testing, and running MapReduce pipelines
  • Deltacloud: provides common front-end APIs to abstract differences between cloud providers
  • DeviceMap: device Data Repository and classification API
  • DirectMemory: off-heap cache for the Java Virtual Machine
  • DRAT: large scale code license analysis, auditing and reporting
  • Eagle: open source analytics solution for identifying security and performance issues instantly on big data platforms
  • ECS: API for generating elements for various markup languages
  • ESME: secure and highly scalable microsharing and micromessaging platform that allows people to discover and meet one another and get controlled access to other sources of information, all in a business process context
  • Etch: cross-platform, language- and transport-independent RPC-like messaging framework
  • Excalibur: Java inversion of control framework including containers and components
  • Falcon: data governance engine
  • Forrest: documentation framework based upon Cocoon
  • Giraph: scalable Graph Processing System
  • Hama: an efficient and scalable general-purpose BSP computing engine
  • Harmony: Java SE 5 and 6 runtime and development kit
  • HiveMind: services and configuration microkernel
  • iBATIS: Persistence framework which enables mapping SQL queries to POJOs
  • Jakarta: server side Java, including its own set of subprojects
  • Jakarta Cactus: simple test framework for unit testing server-side Java code
  • Joshua: statistical machine translation toolkit
  • Apache jUDDI Committee
    • Scout: an implementation of JSR 93 (JAXR)
  • Labs: a place for innovation where committees of the foundation can experiment with new ideas
  • Lens: Unified Analytics Interface
  • Lenya: content management system (CMS) based on Apache Cocoon
  • Lucy: search engine library that provides full-text search for dynamic programming languages
  • Marmotta: An Open Platform for Linked Data
  • MetaModel: provides a common interface for discovery, exploration of metadata and querying of different types of data sources.
  • Metron: Real-time big data security
  • MRUnit: Java library that helps developers unit test Apache Hadoop map reduce jobs
  • MXNet: Deep learning programming framework
  • ODE: a WS-BPEL implementation that supports web services orchestration using flexible process definitions
  • ObJectRelationalBridge (OJB): Object/Relational mapping tool that allowed transparent persistence for Java Objects against relational databases
  • Oltu - Parent: OAuth protocol implementation in Java
  • Onami: project focused on the development and maintenance of a set of Google Guice extensions not provided out of the box by the library itself
  • OODT: Object Oriented Data Technology, a data management framework for capturing and sharing data
  • Open Climate Workbench: A comprehensive suite of algorithms, libraries, and interfaces designed to standardize and streamline the process of interacting with large quantities of observational data and conducting regional climate model evaluations
  • ORO: Regular Expression engine supporting various dialects
  • Polygene: community based effort exploring Composite Oriented Programming for domain centric application development
  • PredictionIO: an open source Machine Learning Server built on top of a state-of-the-art open source stack, enabling developers to manage and deploy production-ready predictive services for various kinds of machine learning tasks
  • REEF: A scale-out computing fabric that eases the development of Big Data applications on top of resource managers such as Apache YARN and Mesos
  • Regexp: Regular Expression engine
  • River: provides a standards-compliant JINI service
  • Sentry: Fine grained authorization to data and metadata in Apache Hadoop
  • Shale: web application framework based on JavaServer Faces
  • Shindig: OpenSocial container; helps start hosting OpenSocial apps quickly by providing the code to render gadgets, proxy requests, and handle REST and RPC requests
  • Sqoop: a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases
  • STDCXX: collection of algorithms, containers, iterators, and other fundamental components of every piece of software, implemented as C++ classes, templates, and functions essential for writing C++ programs
  • Stanbol: Software components for semantic content management
  • Stratos: Platform-as-a-Service (PaaS) framework
  • Tajo: relational data warehousing system that uses the Hadoop file system as distributed storage
  • Tiles: templating framework built to simplify the development of web application user interfaces.
  • Trafodion: Webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop[12][13][14]
  • Tuscany: SCA implementation, also providing other SOA implementations
  • Twill: Use Apache Hadoop YARN's distributed capabilities with a programming model that is similar to running threads
  • Usergrid: an open-source Backend-as-a-Service ("BaaS" or "mBaaS") composed of an integrated distributed NoSQL database, application layer and client tier with SDKs for developers looking to rapidly build web and/or mobile applications
  • VXQuery: a parallel XML Query processor
  • Wave: online real-time collaborative editing
  • Whirr: set of libraries for running cloud services
  • Wink: RESTFul web services based on JAX-RS specification
  • Wookie: parser, server and plugins for working with W3C Packaged Web Apps
  • WS Muse: implementation of the WS-ResourceFramework (WSRF), WS-BaseNotification (WSN), and WS-DistributedManagement (WSDM) specifications
  • Xang: XML Web Framework that aggregated multiple data sources, made that data URL addressable and defined custom methods to access that data
  • Xindice: XML Database
  • Zipkin: distributed tracing system
  • OpenCMIS: Collection of Java libraries, frameworks and tools around the CMIS specification for document interoperability.

The above may be incomplete, as the list of retired projects changes.

from Grokipedia
The List of Apache Software Foundation projects is a directory cataloging the open-source software initiatives sponsored, developed, and maintained by the Apache Software Foundation (ASF), a U.S.-based non-profit corporation dedicated to fostering community-led development of freely available software for the public good. Founded in 1999, the ASF operates under the guiding principles of the "Apache Way," emphasizing consensus-driven decision-making, meritocracy, and transparency to support a decentralized network of volunteer contributors. As of fiscal year 2025, the foundation oversees more than 320 active top-level projects and subprojects, contributed to by over 8,400 committers and stewarded by more than 1,140 elected members, with software releases exceeding 1,300 annually. These projects are organized on the ASF's official directory by name (alphabetically), by category (such as cloud and libraries), by programming language (including Python and C++), and by metrics like the number of committers, enabling users to explore initiatives ranging from foundational tools to specialized applications. Notable examples include the Apache HTTP Server, widely used web server software powering a significant portion of the web; Hadoop, a framework for distributed storage and processing of large datasets; Kafka, a distributed streaming platform for real-time data pipelines; Spark, an analytics engine for large-scale data processing; and Tomcat, a servlet container for web applications. All projects are released under the permissive Apache License 2.0, promoting broad adoption, modification, and redistribution while ensuring legal protections for contributors. The directory also notes retired projects in the Apache Attic and emerging ones in the Apache Incubator, reflecting the dynamic lifecycle of the ASF ecosystem.
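The directory organization described above (projects browsable by name, category, and language) can be illustrated with a small sketch. The records and field names here are hypothetical, chosen only to mirror the facets the directory exposes; this is not the real ASF data feed.

```python
# Hypothetical project records mirroring the directory's facets.
projects = [
    {"name": "Hadoop", "category": "big-data", "language": "Java"},
    {"name": "Kafka", "category": "big-data", "language": "Java"},
    {"name": "Airflow", "category": "workflow", "language": "Python"},
    {"name": "Arrow", "category": "libraries", "language": "C++"},
]

def by_category(items, category):
    """Return project names in a category, sorted alphabetically."""
    return sorted(p["name"] for p in items if p["category"] == category)

def by_language(items, language):
    """Return project names implemented in a given language."""
    return sorted(p["name"] for p in items if p["language"] == language)

print(by_category(projects, "big-data"))  # ['Hadoop', 'Kafka']
print(by_language(projects, "Python"))    # ['Airflow']
```

The alphabetical sort corresponds to the directory's default by-name listing; the filters correspond to its category and language views.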

Introduction

Overview of the Apache Software Foundation

The Apache Software Foundation (ASF) was founded in 1999 as a 501(c)(3) non-profit corporation in the United States, emerging from the collaborative community behind the Apache HTTP Server project. Dedicated to open-source software, the ASF serves as a neutral steward, providing legal protection, trademark safeguarding, and infrastructure to support volunteer-driven development. The organization's mission centers on creating software for the public good through open collaboration, with all projects licensed under the permissive Apache License 2.0, which encourages widespread adoption and modification. As of fiscal year 2025, the ASF oversees 295 active top-level projects, alongside 32 incubating podlings. These projects span diverse domains such as big data processing, cloud computing, machine learning, and Internet of Things technologies, reflecting the foundation's broad impact on modern software ecosystems. With 9,905 committers contributing worldwide from thousands of organizations, the ASF fosters a culture guided by the principle of "community over code." This global network has grown significantly since its origins, evolving from a single project into a comprehensive portfolio that powers software across industries.

Project Lifecycle and Governance

The Apache Software Foundation (ASF) manages the lifecycle of its software projects through a structured process designed to foster open-source development under the principles of meritocracy and consensus. Projects typically progress through four main stages: idea submission, incubation as podlings, active top-level status, and potential retirement to the Apache Attic. This lifecycle ensures that only mature, community-driven initiatives receive ongoing ASF support while preserving historical contributions from discontinued efforts. In FY2025, five podlings graduated to top-level status, including Apache DataFusion. Idea submission begins when a potential project is proposed by a sponsoring ASF member or officer, often with an existing codebase and intellectual property rights assigned to the ASF. Accepted proposals enter the incubation phase as "podlings," overseen by the Apache Incubator, where the focus is on building a diverse committer base, producing releases, and cultivating a healthy community in line with ASF policies. Incubation typically lasts about 1.5 years and requires demonstration of active development, adherence to the Apache License, and resolution of legal issues; graduation to active status occurs via a vote by the Incubator's Project Management Committee (PMC) upon achieving maturity criteria, such as broad participation and sustainable governance. Governance throughout the lifecycle is handled by PMCs, autonomous groups of volunteer committers who oversee individual projects or incubation efforts, with ultimate oversight from the ASF Board of Directors. The ASF emphasizes a meritocratic model where roles—ranging from users and contributors to committers and PMC members—are earned through demonstrated contributions, and decisions are made via lazy consensus, involving binding votes (+1 for approval, 0 for neutral, -1 for objection), with objections required to be addressed before agreement is reached. This approach promotes collaborative, transparent management without hierarchical control.
Retirement to the Apache Attic occurs when a project or podling exhibits prolonged inactivity, failure to produce releases, or dissolution of its community, as determined through public discussion and votes by the relevant PMC or the Incubator PMC. The process involves archiving the project's assets for historical preservation, making repositories read-only, and closing associated infrastructure, while allowing for potential revival through forking or re-incubation with Board approval. As of November 2025, there are 30 active podlings, including several focused on AI and data applications such as Apache Cloudberry and Apache Texera.
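The lazy-consensus voting convention described above can be sketched as a toy tally. This is purely illustrative: the real ASF process is a human discussion on mailing lists, the three-+1 threshold is the convention for release votes, and the function name and return strings here are invented for this example.

```python
def resolve_vote(votes):
    """Toy tally of an ASF-style consensus vote.

    votes: list of integers, +1 (approve), 0 (neutral), -1 (object).
    Any unaddressed -1 blocks the proposal; otherwise, release votes
    conventionally require at least three binding +1 votes to pass.
    """
    if any(v == -1 for v in votes):
        return "blocked"   # objection must be discussed and resolved
    if sum(1 for v in votes if v == 1) >= 3:
        return "passed"
    return "pending"       # needs more binding +1 votes

print(resolve_vote([1, 1, 1, 0]))  # passed
print(resolve_vote([1, 1, -1]))    # blocked
```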

Active Projects

Data Processing and Analytics Projects

The Apache Software Foundation (ASF) supports a robust portfolio of active projects dedicated to data processing and analytics, encompassing tools for distributed storage, stream and batch processing, event streaming, databases, and metadata management. These projects address the demands of big data environments by enabling scalable, fault-tolerant operations across diverse use cases such as real-time analytics, data pipelines, and lakehouse architectures. As of 2025, the ASF maintains approximately 70 active projects in this category, reflecting the foundation's ongoing emphasis on advancing open-source solutions for handling massive datasets. Apache Hadoop is a distributed storage and processing framework that enables reliable, scalable computation on large datasets using commodity hardware. It is primarily used for executing batch jobs in data workflows, such as log analysis and ETL operations. A key milestone was the release of version 1.0 in 2012, which marked its enterprise readiness after initial development in 2006. Apache Spark serves as a unified analytics engine for large-scale data processing, supporting both batch and streaming workloads with in-memory computation for faster performance. Its primary use cases involve data engineering, data science, and machine learning tasks, including integration with libraries like MLlib for scalable model training. Founded in 2009 at UC Berkeley's AMPLab, a significant milestone was its graduation to top-level project status in 2014. Apache Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant messaging and real-time data pipelines. It is commonly applied in scenarios requiring pub-sub messaging, log aggregation, and stream processing for applications like recommendation systems. Originating in 2011 at LinkedIn, a pivotal milestone was the 1.0.0 release in 2017, solidifying its stability for enterprise adoption. Apache Cassandra provides a distributed NoSQL database that offers high availability and scalability for handling write-heavy workloads across multiple data centers.
Its core use case is storing and querying large volumes of structured data in applications like time-series and IoT sensor data management. Developed initially in 2008 at Facebook, it achieved top-level project status in 2010. Apache Flink is a framework that supports stateful computations over unbounded data streams with exactly-once semantics. It is primarily utilized for real-time analytics, event-driven applications, and stream processing in domains such as fraud detection. With roots in the 2009 Stratosphere research project, a major milestone was the 1.0 release in 2016 following its top-level graduation in 2014. Apache Iceberg is an open table format that enables reliable schema evolution and ACID transactions for analytic datasets in data lakes. It is mainly used for managing petabyte-scale tables in query engines like Spark and Trino, facilitating transactions on object storage. Donated to the ASF in 2018 and graduating to top-level status in 2020, it has seen widespread adoption for modern lakehouse architectures. A recent addition, Apache Gravitino, is a unified metadata service that provides a federated catalog for data and AI assets across heterogeneous environments. It addresses primary use cases in data governance, discovery, and lineage tracking for multi-cloud data lakes. Graduating to top-level project status in June 2025, it represents the ASF's continued innovation in metadata management.
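Kafka's central abstraction, the partitioned append-only log that consumers read by offset, can be illustrated with a minimal in-memory sketch. This is not the Kafka API; the class and method names below are invented for illustration only.

```python
class ToyLog:
    """In-memory sketch of a single Kafka-style log partition."""

    def __init__(self):
        self.records = []   # the append-only log
        self.offsets = {}   # consumer group -> next offset to read

    def produce(self, value):
        """Append a record and return its offset."""
        self.records.append(value)
        return len(self.records) - 1

    def consume(self, group, max_records=10):
        """Read the next batch for a consumer group and commit its offset."""
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch

log = ToyLog()
for event in ["click", "view", "click"]:
    log.produce(event)
print(log.consume("analytics"))  # ['click', 'view', 'click']
print(log.consume("analytics"))  # [] -- this group is caught up
```

Because each consumer group tracks its own offset, independent applications can replay the same stream at their own pace, which is the property that makes the log model suit both messaging and pipeline use cases.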

Infrastructure and Servers Projects

The Infrastructure and Servers projects within the Apache Software Foundation represent core technologies that underpin web serving, distributed coordination, messaging, and resource management in large-scale systems. These projects, many of which originated from industry needs at organizations like Yahoo, the NSA, and early web pioneers, have evolved through the ASF's meritocratic process to become widely adopted standards for reliable infrastructure. As of 2025, the category encompasses dozens of active initiatives, emphasizing their essential role in supporting the scalability and resilience of modern cloud and enterprise environments. Key examples include the Apache HTTP Server, a collaborative effort launched in 1995 to create a freely available, feature-rich web server for delivering HTTP content. Primarily used for hosting static and dynamic websites across millions of domains, it achieved a pivotal milestone by becoming the dominant web server software worldwide within its first year of release, surpassing proprietary alternatives. Another foundational project is Apache Tomcat, an open-source Java servlet and JSP container donated to the ASF in 1999 and first released as version 3.0 in 2000. It serves as the primary runtime for deploying Java-based web applications in enterprise settings, playing a central role as the reference implementation for Jakarta EE specifications and enabling scalable server-side Java development. Apache ActiveMQ, initiated in 2004 and elevated to top-level status in 2007, is a multi-protocol message broker that supports standards like JMS, AMQP, STOMP, and MQTT for asynchronous communication. Its primary use case involves integrating disparate systems in distributed architectures, such as enterprise service buses, where it ensures reliable message delivery and has supported high-throughput scenarios across industries.
Apache NiFi, originating from the NSA's Niagarafiles tool and donated to the ASF in 2014 before graduating in 2015, provides a visual management system for automating the ingestion, routing, and transformation of data flows. Commonly applied in cybersecurity, IoT, and data integration pipelines to handle data movement with built-in provenance and lineage tracking, it marked a milestone in open-sourcing government-grade data automation tools. Apache ZooKeeper, developed by Yahoo and entering the ASF ecosystem in 2008, offers a highly reliable distributed coordination service using a hierarchical namespace for configuration management, synchronization, and naming. It is essential for managing state in large clusters, such as in Hadoop or metadata coordination in Kafka, and has become a standard building block for fault-tolerant distributed applications since its top-level promotion in 2010. A recent addition is Apache StormCrawler, which graduated from incubation to top-level project status in June 2025, delivering a modular SDK for building scalable web crawlers atop Apache Storm. Designed for high-volume, low-latency data extraction in search engines and content aggregation platforms, it addresses challenges in distributed crawling with features like URL deduplication and politeness policies.
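ZooKeeper's hierarchical namespace of small data nodes ("znodes") resembles a filesystem of paths. The sketch below models only that path structure in a plain dictionary; the real service adds replication, ordering guarantees, ephemeral nodes, and watches, and the class and method names here are invented for illustration.

```python
class ToyZNodeStore:
    """Dict-backed sketch of a ZooKeeper-style hierarchical namespace."""

    def __init__(self):
        self.nodes = {"/": b""}  # root znode

    def create(self, path, data=b""):
        """Create a znode; its parent path must already exist."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent!r} does not exist")
        self.nodes[path] = data

    def get(self, path):
        """Return the data stored at a znode."""
        return self.nodes[path]

store = ToyZNodeStore()
store.create("/config")
store.create("/config/db_host", b"10.0.0.5")
print(store.get("/config/db_host"))  # b'10.0.0.5'
```

Storing cluster configuration under well-known paths like this is what lets systems such as Hadoop and Kafka coordinate many processes against one small, consistent source of truth.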

Development Tools and Frameworks Projects

The Apache Software Foundation (ASF) maintains a robust portfolio of active projects focused on development tools and frameworks, encompassing build automation, integration patterns, web development, and search libraries, with approximately 80 such projects as of 2025 that demonstrate the Foundation's enduring influence on software engineering practices. These tools and frameworks support developers in streamlining workflows, managing dependencies, and building scalable applications across diverse environments. Representative examples illustrate their versatility and impact, from foundational build systems to modern integration solutions. Apache Ant, a Java-based build tool established in 2000, automates software build processes using XML-based configuration files, with its primary use case being the compilation and deployment of Java projects in environments requiring flexible, script-like automation. A key milestone was its adoption as the de facto standard for Java builds in the early 2000s, influencing subsequent tools and integrating with popular IDEs. Apache Maven, a build management tool that entered the ASF in 2003, employs a declarative project object model (POM) for managing builds, dependencies, and documentation, primarily used for centralized dependency resolution and standardized build lifecycles in Java ecosystems. Its pivotal milestone includes the release of Maven 1.0 in 2004, which popularized convention-over-configuration principles and Maven Central as a global artifact repository. Apache Struts, a web application framework initiated in 2000, implements the Model-View-Controller (MVC) architecture for Java-based web applications, with its core use case enabling the creation of maintainable, action-oriented applications through tag libraries and validation features. A significant milestone was the evolution to Struts 2 in 2006, merging with WebWork to enhance modularity and support for RESTful services.
Apache Camel, an open-source integration framework launched in 2007, facilitates message routing and mediation using enterprise integration patterns (EIPs), primarily applied in enterprise service buses for connecting disparate systems via components like JMS and HTTP. Its key milestone came with version 2.0 in 2010, introducing blueprint support for OSGi environments and broadening adoption in cloud-native integrations. Apache Lucene, a high-performance search library originating in 1999, provides full-text indexing and search capabilities through an inverted index structure, with primary use cases in full-text search for applications like Solr and Elasticsearch. A landmark achievement was its integration into the Apache ecosystem in 2001 as part of the Jakarta project, evolving into a foundation for scalable search technologies powering billions of queries daily. In 2025, notable additions to this category include graduates from the Apache Incubator such as Apache DevLake, a dev data platform for engineering metrics and analytics that unifies toolchains to measure developer productivity; Apache Grails, a full-stack web framework built on Groovy for rapid application development with convention-based scaffolding; and Apache Fory, a high-performance, multi-language serialization framework leveraging JIT compilation for efficient data exchange across systems. These projects exemplify the ASF's ongoing meritocratic evolution, adapting tools to contemporary demands like observability and cross-language interoperability.
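The inverted index at the heart of Lucene maps each term to the documents that contain it, so lookups avoid scanning every document. A tiny plain-Python illustration of the idea (not Lucene's actual data structures, which add analyzers, scoring, and compressed postings lists):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, term):
    """Return sorted IDs of documents containing the term."""
    return sorted(index.get(term.lower(), set()))

docs = {
    1: "Apache Lucene search library",
    2: "Apache Spark analytics engine",
    3: "full-text search with Lucene",
}
idx = build_inverted_index(docs)
print(search(idx, "Lucene"))  # [1, 3]
print(search(idx, "apache"))  # [1, 2]
```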

Incubating Projects

Current Podlings

The Apache Incubator evaluates proposed projects, known as podlings, through a structured process that ensures alignment with the Foundation's principles of open-source collaboration and meritocracy. As of November 2025, the Incubator hosts approximately 30 active podlings undergoing incubation, each demonstrating initial progress toward becoming full top-level Apache projects. Key evaluation criteria for these podlings include achieving diversity among committers from multiple organizations to foster broad community support and sustainability, as well as producing initial releases that showcase core functionality, adherence to Apache licensing, and active development. The table below lists the current podlings with brief overviews.
Name | Description | Entry Date | Status
Amoro | Lakehouse management system on open data lake formats. | 2024-03-11 | Active development
Auron | Accelerates Apache Spark SQL with a Rust-based vectorized execution layer. | 2025-08-05 | Active development
Baremaps | Toolkit for creating and operating online maps. | 2022-10-10 | Active development
BifroMQ | High-performance distributed MQTT broker. | 2025-04-22 | Active development
Burr | Python framework for state machines and AI agent workflows. | 2025-05-24 | Active development
Cloudberry | Advanced open-source MPP database on PostgreSQL. | 2024-10-11 | Active development
Fesod | Java library for reading/writing Excel files. | 2025-09-17 | Active development
Fluss | Streaming storage for real-time analytics. | 2025-06-04 | Active development
GeaFlow | Distributed stream and batch graph compute engine. | 2025-06-06 | Active development
Gluten | Offloads JVM-based SQL engine execution to native engines. | 2024-01-11 | Active development
GraphAr | Open-source graph data file format. | 2024-03-25 | Active development
Hamilton | Framework for defining and executing DAGs. | 2025-04-12 | Active development
HoraeDB | Distributed cloud-native time-series database. | 2023-12-11 | Active development
HugeGraph | Large-scale graph database. | 2022-01-23 | Active development
Iggy | High-performance message streaming platform in Rust. | 2025-02-04 | Active development
KIE | Solutions for knowledge engineering and process automation. | 2023-01-13 | Active development
Livy | REST interface for managing Apache Spark contexts. | 2017-06-05 | Active development
OpenServerless | Cloud-agnostic serverless platform based on Kubernetes. | 2024-06-17 | Active development
Otava | Command-line tool for detecting changes in time-series data. | 2024-11-27 | Active development
OzHera | Cloud-native application observation platform. | 2024-07-11 | Active development
Pegasus | Distributed key-value storage system. | 2020-06-28 | Active development
Polaris | Catalog for data lakes with enterprise security. | 2024-08-09 | Active development
Pony Mail | Mail-archiving and interaction service. | 2016-05-27 | Active development
PouchDB | JavaScript database inspired by Apache CouchDB. | 2025-04-15 | Active development
ResilientDB | Distributed blockchain framework. | 2023-10-21 | Active development
Seata | Distributed transaction solution. | 2023-10-29 | Active development
Texera | System for collaborative data science and AI workflows. | 2025-04-12 | Active development
Toree | Mechanism to interactively access Apache Spark. | 2015-12-02 | Active development
Wayang | Cross-platform data processing system. | 2020-12-16 | Active development
XTable | Omni-directional converter for table formats. | 2024-02-11 | Active development

Recently Graduated Projects

The Apache Software Foundation's incubation process culminates in graduation for projects that demonstrate maturity, community consensus, and alignment with ASF principles. In 2025, several podlings successfully transitioned to top-level project (TLP) status following approval by the Incubator Project Management Committee (IPMC), which evaluates factors such as code quality, documentation, licensing compliance, and active contributor engagement. This approval leads to an ASF Board resolution establishing the project as a TLP, with initial milestones including the formation of a dedicated Project Management Committee (PMC) and an inaugural top-level release. Apache Gravitino graduated on June 3, 2025, emerging as a high-performance, geo-distributed metadata lake that unifies metadata management for data and AI assets across diverse sources and regions. It enables lakehouse architectures by managing metadata directly in heterogeneous environments, addressing challenges in AI ecosystems where siloed data hinders model training and deployment. Gravitino's impact lies in its ability to provide contextual data engineering capabilities, such as lineage tracking and access controls, fostering scalable AI workflows in enterprise settings. Apache StormCrawler followed on June 4, 2025, as an open-source SDK for constructing scalable, low-latency distributed web crawlers powered by Apache Storm. It offers modular components for handling URL filtering, content parsing, and storage integration, making it suitable for large-scale data acquisition in search engines and analytics pipelines. The project's graduation underscores its role in enhancing web-scale data collection, with contributions from a global community improving its resilience against dynamic web environments. Apache Grails graduated on October 7, 2025, as a Groovy-based framework that enables rapid development of robust web applications using conventions over configuration.
It integrates seamlessly with Spring and Hibernate, supporting modern Java ecosystems while providing productivity tools such as database migrations and a plugin architecture. The graduation highlights its long evolution and sustained community adoption for building scalable enterprise applications. Apache HertzBeat graduated on August 21, 2025, as an observability and monitoring solution that provides real-time metrics collection, alerting, and visualization for cloud-native environments. It supports multi-source data integration and customizable dashboards, enhancing IT operations through intelligent anomaly detection and automated responses. This milestone reflects its growing role in simplifying infrastructure monitoring for operations teams.

Retired Projects

Projects in the Apache Attic

The Apache Attic, established in November 2008, functions as a dedicated repository for top-level Apache projects that have reached the end of their active lifecycle, preserving their code, documentation, and historical artifacts without ongoing maintenance or community support. This mechanism allows the ASF to clearly delineate inactive projects while maintaining their availability for archival, educational, or revival purposes, in line with the foundation's governance policies on project retirement. As of November 2025, the Attic houses over 100 retired top-level projects, spanning domains such as web frameworks, data tools, and development libraries. Retirement typically occurs due to sustained inactivity, diminished community engagement, or strategic shifts, following a formal voting process by the project's PMC. Below is a selection of notable retired projects, highlighting their original purposes, retirement dates, and primary reasons for decommissioning.
Project Name | Original Purpose | Retirement Date | Reason for Retirement
Apache Abdera | Implementation of the Atom Syndication Format and Atom Publishing Protocol for syndication. | February 2017 | Lack of sustained activity and community contributions, leading to no recent development.
Apache Aurora | Mesos-based framework for managing long-running services, jobs, and ad-hoc tasks in cluster environments. | February 2020 | Project inactivity, with committers voting to retire due to insufficient ongoing engagement.
Apache Harmony | Modular open-source Java SE runtime environment with class libraries, aimed at providing an alternative to proprietary implementations. | November 2011 | Declining interest after key contributors like IBM shifted to OpenJDK, compounded by challenges in obtaining a Java TCK license for full compatibility certification.
Apache Sqoop | Toolset for efficiently transferring bulk data between Hadoop and structured data stores like relational databases. | June 2021 | Inactivity and lack of maintainer involvement, as voted by committers, though forks and commercial support continue externally.
Apache Apex | Unified engine for stream and batch processing in big data applications, supporting real-time analytics and ETL workflows. | September 2019 | Dormancy due to waning participation, resulting in a retirement vote for inactivity.
Apache Archiva | Maven-based repository manager for build artifacts, providing centralized storage and proxying for project dependencies. | February 2024 | Prolonged inactivity and failure to attract new contributors, per PMC decision.
Apache Any23 | Micro-parser for extracting RDF triples from various web formats, facilitating data integration. | June 2023 | Insufficient development momentum and community support.
Apache ACE | OSGi-centric framework for centralized lifecycle management and deployment of remote applications. | December 2017 | Lack of active maintenance and engagement.
Apache Avalon | Component-based framework for application assembly and service-oriented programming. | November 2004 | Obsolescence due to evolving Java ecosystem standards and inactivity.
Apache Bloodhound | Project management and issue tracking tool built on Trac, for software development. | July 2024 | Diminished usage and contributor base.
Among recent retirements in 2025, other entries include jclouds (multi-cloud toolkit, retired June) and Gora (NoSQL data store abstraction, retired March), both citing low engagement as the key factor.

Retired Incubating Podlings

The Apache Incubator retires podlings that fail to meet graduation criteria, often due to insufficient community momentum, lack of releases, or prolonged inactivity, providing valuable lessons on the challenges of open-source project maturation within the ASF. Historically, over the two decades since the Incubator's inception, more than 100 podlings have entered incubation, with retirements highlighting common pitfalls like developer burnout or competing alternatives in the ecosystem. These cases underscore the rigorous evaluation process, where podlings must demonstrate self-sustaining communities to advance, emphasizing the importance of early contributor diversity and consistent progress. Recent retirements illustrate these trends, particularly in data processing and AI-related domains where rapid technological evolution can outpace community growth. For instance:
  • Apache Heron: Entered incubation on June 23, 2017, as a distributed, fault-tolerant stream processing engine designed for real-time analytics at scale. It was retired on January 18, 2023, due to insufficient community energy to sustain development despite initial promise as an alternative to established systems.
  • Apache Liminal: Proposed on May 23, 2020, this end-to-end machine learning platform aimed to enable data engineers and scientists to build, train, and deploy machine learning models seamlessly. Retired on July 18, 2024, primarily from inactivity and lack of active releases.
  • Apache Nemo: Incubated starting February 4, 2018, as a versatile system allowing flexible runtime behavior control for big data workflows. It was retired on June 23, 2025, owing to insufficient community support and stalled contributions.
  • Apache NLPCraft: Entered on February 13, 2020, offering an API for building natural language understanding (NLU) applications with intent recognition capabilities. Retired on August 4, 2025, following prolonged inactivity and failure to build a robust contributor base.
These examples reflect broader patterns, with inactivity cited in over half of retirements and community shortcomings in many others, informing ASF guidelines to prioritize projects with diverse, engaged teams from the outset.
