Anna's Archive
Anna's Archive is an open source search engine for shadow libraries that was launched by the pseudonymous Anna shortly after law enforcement efforts to shut down Z-Library in 2022. The site aggregates records from Z-Library, Sci-Hub, and Library Genesis (LibGen), among other sources. It calls itself "the largest truly open library in human history", and has said it aims to "catalog all the books in existence" and "track humanity's progress toward making all these books easily available in digital form". It claims not to be liable for downloads of copyrighted works, since the site indexes metadata but does not directly host any files, instead linking to third-party downloads. It has nonetheless faced government blocks and legal action from copyright holders and publishing trade associations for engaging in large-scale copyright infringement.
Anna's Archive emerged out of the Pirate Library Mirror (PiLiMi) project, an anonymous effort to mirror shadow libraries that completed a full copy of Z-Library in September 2022. PiLiMi acknowledged that it "deliberately violated the copyright law in most countries". The project's initial focus was on preservation rather than on making its data searchable. Days after US law enforcement seized several Z-Library domains and arrested its alleged operators in November 2022, PiLiMi member Anna (also known as Anna Archivist) launched Anna's Archive, which initially displayed results from Z-Library and LibGen.
Anna's Archive has been variously described as a search engine, a metasearch engine, and a shadow library itself. The site does not host any files itself (which it claims means it is not liable for downloads of copyrighted works), but it links to third-party downloads provided by anonymous partners. It also offers downloads through the IPFS protocol. Its source code is dedicated to the public domain under the CC0 license. It operates three mirror sites under different top-level domains, currently .li, .se, and .org.
The site's "source libraries" include LibGen, Sci-Hub, Z-Library, the Internet Archive (including "Borrowing Unavailable" items), DuXiu, MagzDB, Nexus/STC, and HathiTrust; Open Library, WorldCat, and Google Books are listed as metadata-only sources. Some of these datasets are already publicly accessible, while others are scraped or otherwise privately acquired for distribution. They are then released in bulk as torrent files to make them resilient to website takedowns. As of July 2025, Anna's Archive includes 52,875,045 books and 98,598,895 papers, and its unified list of torrents totals roughly 1.1 petabytes in size.
A 2025 study comparing the coverage of conventional library databases to various alternatives (including scholarly search engines, other web-based databases, academic social networks, and piracy sites) found that Anna's Archive had some of the most comprehensive full-text coverage, but criticized its interface as unintuitive. In March 2025, it averaged over 650,000 daily downloads, roughly 10 times the estimated distribution of the New York Public Library.
High-speed downloads on Anna's Archive are only available to users with a paid membership, while nonmembers must use slower options with browser verification to prevent abuse by bots. It describes itself as a nonprofit, claiming that membership fees and donations are mostly spent on server infrastructure and that none are personally used by the site's operators. It awards memberships and monetary "bounties" to some volunteer contributors.
Anna's Archive offers high-speed access to its full collection via SFTP to groups training large language models (LLMs) in exchange for large contributions of money or data. It said it provided such access to about 30 companies as of January 2025, primarily based in China, including both LLM companies and data brokers. DeepSeek's VL model was trained on data from the site. Some lawyers have criticized claims that this constitutes fair use under US copyright law, citing precedent for the importance of market harm.
Anna's Archive is a non-profit project with two goals: