OpenSearch (software)
| OpenSearch | |
|---|---|
| Original author | Amazon Web Services |
| Developer | OpenSearch Software Foundation |
| Initial release | 12 April 2021 |
| Stable release | 3.3.2[1] / 30 October 2025 |
| Repository | github |
| Written in | Java |
| Type | Search engine |
| License | Apache License 2.0 |
| Website | www |
| OpenSearch Dashboards | |
|---|---|
| Developer | OpenSearch Software Foundation |
| Initial release | 12 April 2021 |
| Stable release | 3.3.2[1] |
| Repository | github |
| Written in | TypeScript, JavaScript |
| Type | Search engine |
| License | Apache License 2.0 |
| Website | www |
OpenSearch is a family of software consisting of a search engine (also named OpenSearch), and OpenSearch Dashboards, a data visualization dashboard for that search engine.[2] It is an open-source project developed by the OpenSearch Software Foundation (a Linux Foundation project) written primarily in Java.
As of August 2024, AWS reported that OpenSearch had "tens of thousands" of customers,[3] while Elastic claimed to have over 20,000 subscribers.[4] In the preceding year, OpenSearch had about 50 monthly contributors,[5] while Elasticsearch had between 70 and 90.[6]
History
The project was created in 2021 by Amazon Web Services[7][8][2][9][10] as a fork of Elasticsearch and Kibana after Elastic NV changed the license of new versions of this software away from the open-source Apache License in favour of the Server Side Public License (SSPL).[11][12][8][2] Amazon would hold sole ownership status and write access to the source code repositories, but invited pull requests from anyone.[2][7] Other companies, including Logz.io, CrateDB, and Red Hat, announced an interest in building or joining a community to continue using and maintaining this open-source software.[12][13][8][14]
On September 16, 2024, the Linux Foundation and Amazon Web Services announced the creation of the OpenSearch Software Foundation.[15][16] Ownership of OpenSearch software was transferred from Amazon to OpenSearch Software Foundation, which is organized as an open technical project within the Linux Foundation. The Linux Foundation reported that at the time, "OpenSearch recorded more than 700 million software downloads and participation from thousands of contributors and more than 200 project maintainers." The OpenSearch Software Foundation would launch with support from premier members Amazon Web Services, SAP, and Uber.
Projects
OpenSearch
OpenSearch is a Lucene-based search engine that started as a fork of version 7.10.2 of the Elasticsearch service.[8][2] It has Elastic NV trademarks and telemetry removed. It is licensed under the Apache License, version 2,[2] without a Contributor License Agreement. The maintainers have made a commitment to remain completely compatible with Elasticsearch in its initial versions.[2]
OpenSearch Dashboards
OpenSearch Dashboards started as a fork of version 7.10.2 of Elastic's Kibana software, and is also under the Apache License, version 2.[8][2][17]
References
- ^ a b "Release 3.3.2". 30 October 2025. Retrieved 30 October 2025.
- ^ a b c d e f g h Christina Cardoza (April 13, 2021). "Amazon announces OpenSearch, an open-source fork of Elasticsearch and Kibana". Software Development Times. Retrieved 2021-06-01.
- ^ "Modernize your data observability with Amazon OpenSearch Service zero-ETL integration with Amazon S3 | AWS Big Data Blog". aws.amazon.com. 2024-06-05. Retrieved 2024-08-31.
- ^ "Elastic Reports First Quarter Fiscal 2025 Financial Results". 2024-08-29.
- ^ "OpenSearch Open Source Project on Open Hub: Contributors". openhub.net. Retrieved 2024-08-31.
- ^ "Elasticsearch Open Source Project on Open Hub: Contributors". openhub.net. Retrieved 2024-08-31.
- ^ a b "Introducing OpenSearch". Amazon Web Services. 12 April 2021. Retrieved 27 April 2021.
- ^ a b c d e Tim Anderson (13 Apr 2021). "You know what? Fork this: AWS renames its take on Elasticsearch to OpenSearch following trademark fight". The Register. Retrieved 2021-06-01.
- ^ "Amazon Forks Elasticsearch Rebranding It as OpenSearch". InfoQ. Retrieved 2021-06-30.
- ^ Vaughan-Nichols, Steven (April 13, 2021). "OpenSearch: AWS rolls out its open source Elasticsearch fork". TechRepublic. Retrieved 2021-09-03.
- ^ Banon, Shay (14 January 2021). "Doubling down on open, Part II". Elastic. Retrieved 19 January 2021.
- ^ a b Vaughan-Nichols, Steven J. "Elastic changes open-source license to monetize cloud-service use". ZDNet. Retrieved 23 January 2021.
- ^ "CrateDB Doubling Down on Permissive Licensing and the Elasticsearch Lockdown". CrateDB. 27 January 2021. Retrieved 28 January 2021.
- ^ "Amazon Announces OpenSearch". www.i-programmer.info. Retrieved 2021-06-30.
- ^ "Linux Foundation Announces OpenSearch Software Foundation to Foster Open Collaboration in Search and Analytics". www.linuxfoundation.org. Retrieved 2024-09-20.
- ^ "AWS Welcomes the OpenSearch Software Foundation | AWS Open Source Blog". aws.amazon.com. 2024-09-16. Retrieved 2024-09-20.
- ^ "OpenSearch - Amazon forks Elasticsearch and the divergence begins". OpenSource Connections. 2021-04-14. Retrieved 2021-06-30.
Origins
Background on Elasticsearch Licensing Changes
Elasticsearch was originally released under the permissive Apache License 2.0, which facilitated widespread adoption by allowing unrestricted use, modification, and distribution, including integration into managed cloud services offered by providers such as Amazon Web Services (AWS).[11] This licensing model supported the software's growth into a dominant search and analytics engine, with AWS launching Elasticsearch Service in 2015 based on the open-source codebase. The Apache 2.0 terms imposed no obligations on service providers to share revenues or modifications, enabling competitive offerings that contributed to Elasticsearch's ecosystem expansion.[12]
On January 14, 2021, Elastic NV announced a shift for versions 7.11 and later, relicensing the Apache 2.0 portions of Elasticsearch and Kibana under a dual model of the Server Side Public License (SSPL) v1 and Elastic License 2.0 (ELv2). The SSPL, created by MongoDB, is not recognized as open source by the Open Source Initiative because it requires providers who offer the software as a service to release the source code of their entire service stack under the same license. Elastic adopted it to curb what it described as "unfair" commercialization by hyperscalers offering Elasticsearch as a service without equivalent contributions.[11] Similarly, ELv2 permitted internal use and self-hosting but prohibited offering the software to third parties as a hosted or managed service and circumventing the license-key protection on paid features.[11] Elastic justified the change as a defense against cloud providers duplicating its technology stack (encompassing over 200 components) without reciprocity, citing AWS's Elasticsearch Service as an example where Elastic received no revenue despite significant usage.
Critics, including AWS and portions of the developer community, argued that the relicensing deviated from open-source principles by introducing copyleft-like restrictions that effectively limited forking and vendor-neutral adoption, potentially leading to greater vendor lock-in under Elastic's control rather than empowering users.[12] This backlash manifested in concerns over reduced freedoms, with projects like Apache SkyWalking publicly decrying the move as closing off collaborative development under OSI-approved terms.[13] In response, AWS committed to maintaining a fork of the final Apache 2.0 release, version 7.10.2, as a community-driven alternative to preserve permissive licensing and mitigate risks of proprietary drift.[12]
AWS Fork and Initial Launch
In April 2021, Amazon Web Services initiated the OpenSearch project by forking Elasticsearch version 7.10.2 and Kibana version 7.10.2, establishing a community-driven open-source alternative licensed under the permissive Apache 2.0 terms.[14] This fork preserved the core search, indexing, and analytics engine capabilities of the original projects while committing AWS to ongoing development without restrictive licensing constraints.[14] The immediate objective was to enable seamless continuity for users reliant on open-source distributions, avoiding dependencies on Elastic's subsequent dual-licensing model, which combined the Server Side Public License with the Elastic License for versions beyond 7.10.2.[15]
The forked codebase, rebranded as OpenSearch for the backend and OpenSearch Dashboards for the visualization interface, retained foundational components such as Lucene-based full-text search, distributed querying, and real-time data ingestion to ensure backward compatibility with existing Elasticsearch 7.10.2 deployments.[14] Initial enhancements focused on stabilizing the branch for community contributions, with AWS pledging resources for maintenance and feature parity with the pre-fork baseline, rather than introducing divergent AWS-centric modifications at inception.[16]
On July 12, 2021, OpenSearch 1.0 achieved production readiness, marking the project's first stable release and fulfilling the goal of delivering a viable, independently evolving suite.[16] Subsequently, on September 8, 2021, AWS integrated OpenSearch into its managed cloud offerings by renaming Amazon Elasticsearch Service to Amazon OpenSearch Service and adding support for OpenSearch 1.0, thereby providing a fully managed environment that prioritized operational compatibility, scalability via AWS infrastructure, and freedom from Elastic's license encumbrances for hosted instances.[17] This service launch emphasized ease of migration for Elasticsearch users, with options to run either legacy Apache 2.0-licensed Elasticsearch versions or the new OpenSearch fork under unified management.[17]
Technical Foundation
Core Architecture and Components
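As a concrete reference point for the sharding, replication, and quorum model this section describes, the following minimal Python sketch builds an index-creation body and works through the arithmetic. The setting names `number_of_shards` and `number_of_replicas` are standard index settings; the helper names are hypothetical and chosen only for illustration, and nothing here contacts a cluster.

```python
import json


def index_settings(primary_shards: int, replicas_per_primary: int = 1) -> dict:
    """Build an index-creation request body (hypothetical helper).

    The default of one replica per primary mirrors the default
    described in this section.
    """
    if primary_shards < 1 or replicas_per_primary < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas_per_primary,
            }
        }
    }


def total_shard_copies(primary_shards: int, replicas_per_primary: int = 1) -> int:
    """Each primary shard plus its replicas must be placed on some data node."""
    return primary_shards * (1 + replicas_per_primary)


def quorum(cluster_manager_nodes: int) -> int:
    """Smallest strict majority of cluster-manager-eligible nodes."""
    return cluster_manager_nodes // 2 + 1


def tolerated_failures(cluster_manager_nodes: int) -> int:
    """Cluster manager nodes that can be lost while a majority remains."""
    return cluster_manager_nodes - quorum(cluster_manager_nodes)


if __name__ == "__main__":
    print(json.dumps(index_settings(3), indent=2))
    print("shard copies:", total_shard_copies(3))
    print("quorum of 3:", quorum(3), "tolerated failures:", tolerated_failures(3))
```

With three primaries and the default single replica, the cluster holds six shard copies, and a group of three cluster manager nodes keeps its majority after losing one node.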
OpenSearch employs a distributed architecture centered on Apache Lucene for its indexing and search capabilities, enabling efficient handling of large-scale data through inverted indexes that map terms to documents.[7] At the foundational level, data is organized into indexes, which are subdivided into primary shards (each functioning as an independent Lucene index) for parallel processing and distribution across nodes.[7] Replica shards, created by default as one copy per primary shard, provide redundancy for fault tolerance and enhance query performance by distributing read loads.[7] This sharding and replication mechanism supports horizontal scalability, allowing clusters to expand by adding nodes without introducing single points of failure when configured with multiple cluster manager nodes for quorum-based decision-making.[18]
The system operates within clusters, which are collections of interconnected nodes that collectively manage data ingestion, storage, and retrieval.[7] Node types include data nodes, responsible for storing shards, performing indexing, and executing searches; cluster manager nodes, which orchestrate cluster-wide operations such as shard allocation and node health monitoring; and ingest nodes, dedicated to preprocessing documents via pipelines before indexing to offload compute-intensive tasks from data nodes.[18] In production setups, dedicating three cluster manager nodes across availability zones ensures resilience against node failures, as decisions require a majority quorum (two of the three nodes).[18] Data nodes are provisioned with sufficient RAM and storage to handle shard loads, typically balancing replicas across zones to mitigate zone-level outages.[18]
Interactions with the architecture occur primarily through a RESTful API over HTTP, facilitating real-time document ingestion, full-text searches with relevance scoring (using algorithms like BM25), and aggregations for analytics on distributed data.[19] Queries are broadcast to relevant shards, with results aggregated at coordinating nodes to maintain consistency and efficiency.[7] This design inherently supports near-real-time updates, as ingested documents become searchable shortly after indexing, while the absence of centralized bottlenecks, achieved via decentralized shard management, enables robust scaling for petabyte-scale datasets.[7]
Key Features and Capabilities
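One capability covered in this section is k-NN vector search. As a rough sketch, assuming the k-NN plugin's `knn_vector` field type and `knn` query clause, the snippet below builds an index body and a query body, then ranks a few toy vectors exactly by cosine similarity to show what approximate search is approximating. The field name `embedding`, the helper names, and the sample vectors are hypothetical; the exact ranking is a plain-Python baseline, not how OpenSearch computes results internally.

```python
import math


def knn_index_body(field: str, dimension: int) -> dict:
    """Index body that enables k-NN and declares one vector field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {field: {"type": "knn_vector", "dimension": dimension}}
        },
    }


def knn_query(field: str, vector, k: int) -> dict:
    """Search body asking for the k nearest neighbours of `vector`."""
    return {"size": k, "query": {"knn": {field: {"vector": list(vector), "k": k}}}}


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def exact_top_k(query, docs: dict, k: int) -> list:
    """Brute-force baseline: rank every document, keep the k most similar ids."""
    ranked = sorted(docs, key=lambda doc_id: cosine_similarity(query, docs[doc_id]),
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
    print(knn_index_body("embedding", 2))
    print(knn_query("embedding", [1.0, 0.0], 2))
    print(exact_top_k([1.0, 0.0], docs, 2))
```

The brute-force ranking compares the query against every stored vector, which is the cost that approximate indexes are designed to avoid at scale.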
OpenSearch supports vector search capabilities, enabling the storage and querying of vector embeddings alongside traditional data to facilitate semantic search and AI-driven applications, such as similarity matching via approximate k-nearest neighbor (k-NN) search.[20] This integration allows for machine learning model deployment directly within search workflows, supporting tasks like neural search and document classification without external dependencies.[4] Additionally, it incorporates anomaly detection powered by the Random Cut Forest algorithm, which processes time-series data in near real time to identify outliers and deviations, applicable to monitoring and predictive analytics.[21][22]
The suite includes observability functionalities for handling logs, metrics, and traces, unifying these signals into a single platform for root-cause analysis and performance monitoring, with tools to transform unstructured log data and visualize distributed traces across services.[23][24] This setup supports ingestion from diverse sources via pipelines like Data Prepper, enabling efficient querying and alerting without additional licensing costs.[25]
Licensed under the Apache 2.0 terms, OpenSearch ensures permissive use that promotes extensibility through community contributions and custom modifications, facilitating self-hosting on-premises or multi-cloud environments while avoiding vendor-specific lock-in or escalating fees associated with proprietary shifts in upstream projects.[26] This licensing model underpins cost predictability, as organizations can scale deployments independently of commercial vendors, leveraging the distributed architecture for high availability across nodes.[5]
Ecosystem and Extensions
OpenSearch Dashboards
OpenSearch Dashboards serves as the primary web-based user interface for OpenSearch, forked from Kibana version 7.10.2 in April 2021 to align with the OpenSearch project's emphasis on open-source licensing and community-driven development.[27] It enables users to query, visualize, and manage data stored in OpenSearch indices through interactive tools, facilitating data exploration without requiring direct API interactions.[28] Unlike the original Kibana, which has diverged under Elastic's proprietary licensing model, OpenSearch Dashboards maintains backward compatibility with OpenSearch clusters while evolving independently to support features tailored to distributed search and analytics workloads.[6]
Core functionalities include the creation of customizable dashboards that aggregate visualizations such as line charts, bar graphs, pie charts, and geographic maps for representing time-series data, metrics, and spatial information.[6] The Discover tool allows ad-hoc querying and filtering of raw data using OpenSearch Query Language (OSQL) or Dashboards Query Language (DQL), supporting real-time analysis of logs, application metrics, and other indexed datasets.[28] The Visual Editor provides drag-and-drop capabilities for building complex visualizations, including heatmaps and treemaps, directly from index patterns, with support for aggregations like histograms and percentiles to derive insights from large-scale datasets.[29]
Post-fork, OpenSearch Dashboards has incorporated enhancements specific to security integration, such as native support for the OpenSearch Security plugin, enabling role-based access control, audit logging, and encrypted communications within the visualization layer.[30] Custom plugin development has been expanded through a modular architecture, allowing extensions for advanced observability, anomaly detection visualizations, and trace analytics without reliance on proprietary Elastic features.[31] Recent updates, including workspaces introduced in OpenSearch 3.0 (released May 2025), enable multi-tenant dashboard management and isolated environments for collaborative data exploration, improving scalability for enterprise deployments.[32] These developments ensure ongoing compatibility with OpenSearch indices while addressing gaps in the original Kibana fork, such as enhanced plugin extensibility for custom UI components.[33]
Plugins, Security, and Integrations
OpenSearch features a plugin architecture that allows extension of its search and analytics capabilities through modular components installed via the opensearch-plugin command-line tool.[34] Key plugins in the ecosystem include the Alerting plugin, which evaluates data streams from one or more indexes against user-defined conditions (such as thresholds or scripts) and triggers actions when they are met, supporting use cases like real-time monitoring of log volumes or error rates exceeding 5% within a 5-minute window.[35] The SQL plugin enables querying of OpenSearch indexes using ANSI SQL syntax, translating queries to the native DSL via REST APIs, JDBC/ODBC drivers, or a dedicated CLI tool, with support for features like aggregations, joins, and pagination limits up to 200 rows by default.[36] Complementing these, the Notifications plugin aggregates and routes alerts from Alerting and other plugins to channels including email, Slack, Microsoft Teams, custom webhooks, and Amazon Chime, configurable via a unified interface in OpenSearch Dashboards.[37]
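The error-rate condition mentioned above (more than 5% errors within a 5-minute window) can be illustrated with a small sketch. This is plain Python that mimics the kind of threshold check a monitor trigger expresses; it is not the Alerting plugin's implementation, and the function names and the `(timestamp, is_error)` event format are assumptions made for the example.

```python
from datetime import datetime, timedelta


def error_rate(events, now, window=timedelta(minutes=5)):
    """Fraction of events inside the trailing window that are errors.

    `events` is an iterable of (timestamp, is_error) pairs (hypothetical format).
    Returns 0.0 when the window contains no events.
    """
    recent = [is_error for timestamp, is_error in events
              if now - window <= timestamp <= now]
    if not recent:
        return 0.0
    return sum(recent) / len(recent)


def should_trigger(events, now, threshold=0.05):
    """True when the windowed error rate strictly exceeds the threshold."""
    return error_rate(events, now) > threshold


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, 0)
    # Twenty events spaced ten seconds apart; the two most recent are errors.
    events = [(now - timedelta(seconds=10 * i), i < 2) for i in range(20)]
    print("error rate:", error_rate(events, now))
    print("trigger:", should_trigger(events, now))
```

In the sample data two of twenty events in the window are errors, so the rate is 10% and the condition fires; events older than five minutes are ignored.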
The Security plugin, bundled and enabled by default in OpenSearch installations since the project's inception in April 2021, delivers enterprise-grade protections, distinguishing OpenSearch from the upstream project it was forked from, where comparable security features often require paid licensing.[3] It enforces fine-grained role-based access control (RBAC) with over 30 predefined action groups for permissions on indices, clusters, and dashboards; supports authentication backends like internal users, HTTP basic auth, JWT, SAML, and OpenID Connect; and integrates with LDAP or Active Directory for directory-based user mapping, allowing synchronization of groups to roles via attributes like memberOf.[38] Additional safeguards include transport-layer encryption with TLS (enabled by default for node-to-node communication), HTTPS for REST APIs (configurable), IP filtering, and comprehensive audit logging that records events like authentication failures or unauthorized queries to files or external systems.[39] Multi-tenancy in OpenSearch Dashboards isolates user spaces for visualizations and indices, enabled by default but adjustable in configuration files.[38]
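To make the role-based model above more concrete, here is a minimal sketch of a read-only role and a role mapping, shaped like the bodies the Security plugin's REST API accepts for roles and role mappings. The role contents, the `logs-*` pattern, the directory group name, and the helper functions are hypothetical, and the local pattern check is a deliberate simplification of how permissions are actually evaluated.

```python
from fnmatch import fnmatch


def read_only_role(index_patterns) -> dict:
    """Role body granting the predefined `read` action group on some indexes."""
    return {
        "cluster_permissions": [],
        "index_permissions": [
            {"index_patterns": list(index_patterns), "allowed_actions": ["read"]}
        ],
    }


def role_mapping(backend_roles) -> dict:
    """Role-mapping body tying directory groups (backend roles) to a role."""
    return {"backend_roles": list(backend_roles)}


def role_covers_index(role: dict, index_name: str) -> bool:
    """Simplified local check: does any index pattern in the role match?"""
    return any(
        fnmatch(index_name, pattern)
        for permission in role["index_permissions"]
        for pattern in permission["index_patterns"]
    )


if __name__ == "__main__":
    role = read_only_role(["logs-*"])
    mapping = role_mapping(["cn=log-readers,ou=groups,dc=example,dc=org"])
    print(role)
    print(mapping)
    print(role_covers_index(role, "logs-2024.01"))
    print(role_covers_index(role, "metrics-2024.01"))
```

Keeping the role (what is allowed) separate from the mapping (who receives it) is what lets directory groups be attached to roles without editing the permissions themselves.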
Integrations with AWS services enhance OpenSearch's deployment flexibility, particularly in managed environments like Amazon OpenSearch Service.[40] Native support for Amazon S3 enables zero-ETL querying of object storage data without ingestion, announced in June 2024, allowing direct analysis of petabyte-scale datasets via SQL or dashboards.[41] AWS Lambda integration processes data transformations serverlessly, while OpenSearch Ingestion pipelines—launched in 2023—stream data from sources like Amazon Kinesis or RDS into domains with managed buffering and fault tolerance, supporting protocols such as HTTP and Apache Kafka.[42] These connect with third-party tools via standard APIs, promoting hybrid setups where on-premises OpenSearch clusters federate with cloud resources for unified analytics, as demonstrated in integrations with Amazon CloudWatch for metric alerting on domain health metrics like CPU utilization exceeding 80%.[43]
