Open Library
from Wikipedia

Open Library is an online project intended to create "one web page for every book ever published". Created by Aaron Swartz,[3][4] Brewster Kahle,[5] Alexis Rossi,[6] Anand Chitipothu,[6] and Rebecca Hargrave Malamud,[6] Open Library is a project of the Internet Archive, a nonprofit organization. It has been funded in part by grants from the California State Library and the Kahle/Austin Foundation. Open Library provides online digital copies in multiple formats, created from images of many public domain, out-of-print, and in-print books.

Book database and digital lending library

Its book information is collected from the Library of Congress, other libraries, and Amazon.com, as well as from user contributions through a wiki-like interface.[4] If a book is available in digital form, a button labeled "Read" appears next to its catalog listing. Digital copies of the contents of each scanned book are distributed as encrypted e-books (created from images of scanned pages), audiobooks and streaming audio (created from the page images using OCR and text-to-speech software), unencrypted images of full pages from OpenLibrary.org and Archive.org, and APIs for automated downloading of page images.[7] Links to where books can be purchased or borrowed are also provided.

There are different entities in the database:

  • authors
  • works (which are the aggregate of all books with the same title and text)
  • editions (which are different publications of the corresponding works)
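The relationship between these three entities can be sketched in a few lines of Python. This is an illustrative model only: the class and field names (`Author`, `Work`, `Edition`, `publish_year`, and so on) are assumptions made for the example and are not Open Library's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    name: str


@dataclass
class Edition:
    """One concrete publication of a work (hypothetical fields)."""
    title: str
    publisher: str
    publish_year: int


@dataclass
class Work:
    """Groups every edition that shares the same title and text."""
    title: str
    authors: list[Author]
    editions: list[Edition] = field(default_factory=list)

    def add_edition(self, edition: Edition) -> None:
        self.editions.append(edition)

    def publishers(self) -> set[str]:
        # Distinct publishers across all editions of this work.
        return {e.publisher for e in self.editions}


work = Work(title="Example Title", authors=[Author("A. Writer")])
work.add_edition(Edition("Example Title", "First Press", 1990))
work.add_edition(Edition("Example Title", "Second Press", 2005))
```

Keeping editions as children of a work mirrors the description above: the work carries what all publications share, while each edition records what differs.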

Open Library claims to have over 20 million records in its database.[8] Copies of the contents of tens of thousands of modern books have been made available from 150 libraries and publishers for ebook controlled digital lending.[9] Other books including in-print and in-copyright books have been scanned from copies in library collections, library discards, and donations, and are also available for lending in digital form.[10] In total, the Open Library offers copies of over 1.4 million books for what it calls "digital lending", but critics have called distribution of digital copies a violation of copyright law.[11]

History

Open Library began in 2006 with Aaron Swartz as the original engineer and leader of the Open Library's technical team.[3][4] The project was led by George Oates from April 2009 to December 2011.[12] Oates was responsible for a complete site redesign during her tenure.[13] The project was continued by Giovanni Damiola[6] in 2015, and then by Brenton Cheng[6] and Mek Karpeles[6] in 2016.

The site was redesigned and relaunched in May 2010. Its codebase is on GitHub.[14] The site uses Infobase, its own database framework based on PostgreSQL, and Infogami, its own Wiki engine written in Python.[15] The source code to the site is published under the GNU Affero General Public License.[16][2]

Book sponsorship program

In the week of October 21, 2019, the Open Library website introduced a Book Sponsorship program,[17] which, according to Cory Doctorow, "lets you direct a cash donation to pay for the purchase and scanning of any books. In return, you are first in line to check that book out when it is available, and then anyone who holds an Open Library library card can check it out."[18] The feature was developed by Mek Karpeles, Tabish Shaikh,[6] and other members of the community.[19]

Books for the blind and dyslexic

In May 2010, the website was relaunched with ADA compliance, offering over one million modern and older books to print-disabled users[20] in the DAISY Digital Talking Book format.[21] Under certain provisions of United States copyright law, libraries are sometimes able to reproduce copyrighted works in formats accessible to users with disabilities.[22][23]

The Open Library has justified its ability to offer the full contents of books in digital formats under the first-sale doctrine and fair use law.[24][25] The Open Library owns a physical copy of each book that it has made available, and thus argues that lending out one digital scan of the book in a controlled manner falls within the first-sale doctrine, a practice known as controlled digital lending that is in use by multiple public and academic libraries.[25]

Since its launch, the Open Library has been accused of mass copyright violation by numerous groups,[25] including the American Authors Guild,[26] the British Society of Authors,[27] the Australian Society of Authors,[28] the Science Fiction and Fantasy Writers of America,[29] the US National Writers Union,[30] and a coalition of 37 national and international organizations of "writers, translators, photographers, and graphic artists; unions, organizations, and federations representing the creators of works included in published books; book publishers; and reproduction rights and public lending rights organizations".[31] The UK Society of Authors threatened legal action in 2019 unless the Open Library agreed to cease distribution of copyrighted works.[32]

Hachette v. Internet Archive

The Open Library further came under criticism from several authors and publishers groups when it created the National Emergency Library in response to the COVID-19 pandemic in March 2020. Under these circumstances, the National Emergency Library removed the waitlists of all books in its Open Library collection and allowed any number of digital copies of a book to be downloaded as an encrypted file that would be unusable after two weeks, asserting that this unlimited borrowing was a reasonable exception under the national emergency to allow educational functions to continue since physical libraries and bookstores were forced to be shuttered.[25] The Authors Guild, the Association of American Publishers, the National Writers Union, and others argued that this allowed unlimited copyright infringement and denied revenues from distribution of authorized digital copies of books to authors who also needed relief during the COVID-19 national emergency.[25] Though the Open Library asserted that the copies of entire books in e-book format were still encrypted and the unlimited borrowing was for educational purposes, the National Writers Union asserted that images of each page of each book could still be accessed on the Web without encryption or other controls.[7][33]

Four major publishers (Hachette, Penguin Random House, John Wiley & Sons, and HarperCollins, all members of the Association of American Publishers) filed a lawsuit against the Internet Archive in the United States District Court for the Southern District of New York in June 2020, asserting that the Open Library project violated numerous copyrights.[34] In their suit, the publishers claimed "Without any license or any payment to authors or publishers, [the Internet Archive] scans print books, uploads these illegally scanned books to its servers, and distributes verbatim digital copies of the books in whole via public-facing websites. With just a few clicks, any Internet-connected user can download complete digital copies of in-copyright books from [the] defendant."[35] The publishers were represented by the law firms Davis Wright Tremaine and Oppenheim + Zebrak.[36] The Internet Archive ended the National Emergency Library on June 16, 2020, instead of the intended June 30 date, and asked the publishers to "call off their costly assault".[37]

In July 2022, both parties filed requests for summary judgement. A first hearing was held on March 20, 2023.[38] A summary judgement was issued on March 24, 2023, in favor of the plaintiffs. In its ruling, the court determined that the Internet Archive committed copyright infringement by scanning and distributing copies of books online. The suit stemmed from the creation of the National Emergency Library (NEL) during the onset of the COVID-19 pandemic: Hachette Book Group alleged that the Open Library and the National Emergency Library facilitated copyright infringement.

On March 24, 2023, the court ruled against the Internet Archive, which appealed the decision.[39] The Second Circuit Court of Appeals rejected the appeal in 2024, affirming the ruling.[40]

from Grokipedia

Open Library is an initiative of the non-profit Internet Archive, functioning as an editable online catalog that aims to create a dedicated web page for every book ever published, compiling bibliographic data from various sources and user contributions to facilitate discovery of and access to books. Initiated in 2006, it maintains over 20 million records accessible via a wiki-style interface, allowing users to add, edit, and expand entries while integrating with the Internet Archive's broader efforts.

The project emphasizes openness and open-source software, drawing from large institutional catalogs to build a comprehensive index, and supports features like borrowing through controlled digital lending (CDL), where digital copies are loaned in one-to-one correspondence with owned physical volumes for limited periods. This model has enabled access to millions of digitized works, promoting universal availability of knowledge, though its scale relies on partnerships and grants, such as those from the Kahle/Austin Foundation. Open Library's editable nature fosters collaborative improvement, distinguishing it from static databases by inviting input from librarians, readers, and authors to refine metadata and coverage.

Despite these achievements, Open Library has faced significant legal challenges, particularly over its CDL practices, which publishers including Hachette argued constituted willful copyright infringement by enabling unauthorized digital distribution. In the 2020 lawsuit Hachette Book Group et al. v. Internet Archive, a U.S. district court ruled in 2023 that the lending exceeded fair use, a decision upheld on appeal in 2024, leading to the removal of numerous titles from lending availability and restrictions on the program's scope. These rulings highlight tensions between broad access and copyright enforcement, impacting Open Library's role in providing equitable access to out-of-print and in-copyright materials.

Overview

Purpose and Organizational Context

Open Library operates as an open-source initiative to construct a comprehensive, editable catalog of book metadata, with the explicit objective of generating a dedicated web page for every book ever published worldwide. This collaborative platform enables users to contribute and refine bibliographic details, such as titles, authors, and publication information, while aggregating data from external sources including OCLC's database and scans of physical library holdings. The emphasis on openness distinguishes it as a crowdsourced effort, where registered participants can verify editions, add covers, and link to borrowing options, fostering a dynamic repository that is not under proprietary control.

As a project embedded within the Internet Archive, a 501(c)(3) nonprofit established in 1996 by Brewster Kahle, Open Library aligns with the parent entity's mission to preserve and provide universal access to digital cultural artifacts. The Internet Archive supplies the infrastructural backbone, including its extensive scanning operations that have digitized over 25 million volumes, enabling Open Library to integrate links to borrowable digital copies where available. This affiliation leverages the Archive's vast repository of scanned materials, but Open Library itself functions primarily as a metadata hub rather than a storage facility, directing users to external resources for acquisition or reading.

Unlike conventional libraries that maintain physical collections and circulation systems, Open Library eschews tangible holdings in favor of a virtual index focused on bibliographic data and digital lending facilitation. Its model supports discovery and metadata enrichment without owning or housing physical volumes, thereby complementing the Internet Archive's preservation efforts by emphasizing discoverability and community-driven accuracy in cataloging. This approach allows integration with global library networks while avoiding the logistical constraints of brick-and-mortar institutions.

Core Components and Scale

Open Library's bibliographic catalog aggregates over 20 million records of book metadata, drawn from sources such as library partnerships and community contributions, covering editions, works, authors, and subjects. This extensive dataset forms the foundational component, enabling comprehensive coverage of published works, including out-of-print titles and public domain materials.

The lending system centers on a collection of more than 3 million borrowable digital books as of 2025, with internal metrics indicating up to 4.5 million items available for controlled digital lending, primarily consisting of scanned physical volumes hosted via integration with the Internet Archive. These scans incorporate out-of-print and public domain works, processed to support digital access while adhering to one-to-one lending ratios for in-copyright holdings.

To enhance searchability, the platform links to OCR-applied scans from the Internet Archive, allowing text-based querying across digitized content. External scalability is supported by RESTful APIs, including the Books API for retrieving edition details by identifiers like ISBN or OLID, the Search API for catalog queries, and the Covers API for imagery, enabling integration into third-party library systems and applications.

Scale vulnerabilities were exposed when over 500,000 in-copyright books were removed from the borrowable collection following publisher demands amid ongoing litigation, contracting the active lending pool and highlighting dependencies on legal tolerances for digitized holdings.
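As a rough illustration of how the Books, Search, and Covers APIs named in this section are addressed, the sketch below only builds request URLs and performs no network calls. The URL patterns reflect the publicly documented endpoints as best understood here and should be checked against the current developer documentation before use; the helper function names are invented for this example.

```python
from urllib.parse import urlencode

BASE = "https://openlibrary.org"
COVERS = "https://covers.openlibrary.org"


def search_url(query: str) -> str:
    """Search API: free-text catalog query returning JSON."""
    return f"{BASE}/search.json?{urlencode({'q': query})}"


def books_url(isbn: str) -> str:
    """Books API: edition details looked up by an ISBN bibkey."""
    params = {"bibkeys": f"ISBN:{isbn}", "format": "json", "jscmd": "data"}
    # safe=":" keeps the "ISBN:" prefix readable in the query string.
    return f"{BASE}/api/books?{urlencode(params, safe=':')}"


def cover_url(key_type: str, value: str, size: str = "M") -> str:
    """Covers API: image addressed by identifier type, value, and size."""
    if size not in {"S", "M", "L"}:
        raise ValueError("size must be 'S', 'M', or 'L'")
    return f"{COVERS}/b/{key_type}/{value}-{size}.jpg"


print(search_url("the lord of the rings"))
print(books_url("0451526538"))
print(cover_url("isbn", "0385472579"))
```

Separating URL construction from fetching keeps the example testable offline; an actual client would pass these URLs to an HTTP library and handle rate limits and errors.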

Historical Development

Founding and Initial Launch (2006–2010)

Open Library was initiated in 2006 by Aaron Swartz, who served as its technical lead under the auspices of the Internet Archive, with the goal of creating an open, editable catalog encompassing every book ever published. Swartz announced the project publicly on July 16, 2007, describing it as an effort to construct "the world's greatest library" by aggregating and making freely available bibliographic data for universal access and community editing. The beta version launched that year emphasized metadata collection from diverse sources, including MARC records from major libraries like the Library of Congress and the British National Library, to establish a centralized, wiki-style database without initially prioritizing full-text content.

The project emerged within the broader context of the Open Content Alliance, a collaborative initiative founded by Internet Archive director Brewster Kahle to promote non-proprietary book digitization and open access to cultural materials. Early development involved partnerships with libraries participating in scanning programs, where physical volumes were digitized to create digital surrogates owned outright by the institutions, reflecting a commitment to preserving and democratizing access to knowledge through verifiable physical holdings rather than licensing dependencies. These efforts focused on public domain works, enabling direct online reading without the lending restrictions that would later apply to copyrighted materials.

By 2010, Open Library had integrated basic digital borrowing capabilities for ebooks from the Archive's scanned collections, allowing users to access one at a time in a manner analogous to traditional library checkouts, though limited to non-circulating online views for these freely available titles. This feature built on the platform's metadata foundation, providing over one million out-of-copyright volumes for immediate perusal and underscoring the project's foundational aim of equitable distribution based on owned physical assets.

Expansion and Integration with Internet Archive (2011–2020)

Open Library deepened its integration with the Internet Archive during the 2010s, leveraging the organization's scanning infrastructure to expand the digitized collection significantly. By the late 2010s, this collaboration facilitated the processing of millions of volumes, including contributions from partner libraries and institutions, as part of initiatives like the Open Libraries project aimed at acquiring or digitizing four million books while respecting publisher agreements where applicable.

A key development was the implementation of controlled digital lending (CDL), which applied a one-to-one ratio mirroring physical library practices: one digital scan lent per owned physical copy, with enforced borrowing periods and waitlists to prevent overuse. Piloted in the early 2010s through collaborative digitization and lending efforts, CDL enabled access to 20th-century in-copyright titles contributed by libraries, positioning Open Library as a digital extension of traditional lending models.

In October 2019, Open Library launched a Scan on Demand sponsorship program, enabling users to directly fund the digitization of targeted books, thereby supporting expansion of the borrowable catalog. This user-driven approach complemented institutional scanning by prioritizing reader-requested titles for conversion into lendable digital formats.

Responding to library closures during the COVID-19 pandemic, Open Library initiated the National Emergency Library in March 2020, temporarily removing waitlists to allow concurrent borrowing of digital books until June 16, 2020, after which it reverted to standard CDL protocols. This measure provided crisis-time access to over a million titles but drew criticism from publishers for exceeding fair use boundaries.

Throughout this period, accessibility features advanced, including DAISY-formatted books for print-disabled users, which offered navigable audio and text substitutes compliant with accessibility standards for the visually impaired.

Recent Milestones and Adaptations (2021–2025)

In 2022, Open Library implemented a community reviews feature, allowing users to apply predefined tags, such as content warnings or thematic descriptors, to book entries, thereby adding qualitative metadata while maintaining editorial controls to prevent spam. This adaptation built on prior user editing capabilities, aiming to enrich discovery without relying on external review aggregators. Concurrently, the platform sustained elevated borrowing volumes following the 2020 National Emergency Library suspension of waitlists, which facilitated broader access during library closures and contributed to a surge in digital checkouts amid ongoing pandemic disruptions.

By 2024, Open Library complied with court-mandated restrictions by delisting over 500,000 titles from its controlled digital lending program, curtailing in-copyright ebook loans and redirecting resources toward public domain scans and open-access editions. This shift emphasized verifiable non-infringing materials, including works from 1929 that entered the public domain on January 1, 2025. Reduced lending capacity prompted operational resilience measures, including read-only access protocols during high-demand periods to preserve catalog integrity. These changes coincided with Internet Archive-wide advancements, notably reaching the milestone of 1 trillion archived web pages in October 2025, underscoring Open Library's integration within a larger preservation effort prioritizing durable, accessible digital artifacts over contested lending models.

Operational Features

Bibliographic Catalog and Editing

The Open Library bibliographic catalog functions as a crowdsourced, wiki-style database of book metadata, encompassing over 20 million records derived from bulk imports of library catalogs and user-submitted data. It distinguishes between "works," which aggregate canonical bibliographic information across editions, and specific "editions," which detail publication variants including publishers, formats, and identifiers such as ISBNs, OCLC numbers, LCCNs, and Open Library IDs (OLIDs). This structure enables comprehensive tracking of publication histories, with users contributing details like author attributions, subject classifications, and cover images via an accessible "Edit" interface on individual record pages.

Public editing privileges emphasize decentralized accuracy through community input over centralized institutional oversight, allowing registered users to create new entries, merge duplicates, or refine existing metadata without formal credentials. Contributions integrate empirical elements, such as linking records to digitized scans from partner libraries, alongside subjective additions like subjects or summaries, fostering iterative improvements that depend on participant diligence. The model draws initial data from established library systems and extends coverage through user contributions, though it lacks automated cross-verification against primary sources, so unconfirmed details can persist until subsequent edits occur.

Search functionalities support queries by title, author, or subject, yielding results with facets for filtering by edition specifics, publication dates, languages, and related translations. For instance, subject searches employ controlled vocabularies to retrieve thematically linked works, while work pages link to associated editions and variant translations, with guidelines specifying original-language titles for works and translated titles for editions. This faceted approach facilitates discovery of bibliographic variants, such as international editions, but accuracy hinges on the completeness of user-maintained identifiers and classifications.

Controlled Digital Lending Mechanism

The Controlled Digital Lending (CDL) mechanism employed by Open Library operates on the principle of maintaining a one-to-one correspondence between owned physical copies and loaned digital versions, aiming to replicate the constraints of traditional library lending in a digital format. Under this system, physical books acquired through purchase or donation are digitized via scanning, after which the corresponding physical copy is stored and rendered unavailable for lending during the digital loan period to prevent concurrent access. Digital loans consist of ephemeral files delivered through secure viewers, such as browser-based access or downloads with digital rights management (DRM), which restrict downloading, printing, or redistribution beyond the licensed view-only duration. This approach is predicated on the theory that such controls preserve the economic and access limitations inherent in physical lending, thereby aligning with fair use principles without enabling reproduction beyond owned quantities.

Implementation details include standardized loan periods of 14 days for most titles, during which a single user accesses the digital copy, followed by automatic expiration and return to availability for the next borrower; shorter one-hour loans are available on a first-come, first-served basis for select items, while high-demand titles feature waitlists with notifications. Circulation software enforces the one-digital-per-physical rule by tracking unique identifiers and blocking simultaneous loans, theoretically ensuring no net increase in accessible copies beyond the physical collection. Prior to legal restrictions, this supported lending from approximately 750,000 scanned titles, primarily comprising in-copyright works held by the Internet Archive.
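The owned-to-loaned rule, the 14-day loan period, and the waitlist described above can be sketched as a small in-memory model. This is a minimal illustration under stated assumptions, not the Internet Archive's circulation software: the `Title` class, its fields, and the first-in-line waitlist policy are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timedelta

LOAN_PERIOD = timedelta(days=14)  # standard loan length cited in the text


@dataclass
class Title:
    owned_copies: int                               # physical copies held
    loans: dict = field(default_factory=dict)       # borrower -> due time
    waitlist: deque = field(default_factory=deque)  # borrowers in order

    def expire(self, now: datetime) -> None:
        # Loans end automatically once their due time has passed.
        for borrower in [b for b, due in self.loans.items() if due <= now]:
            del self.loans[borrower]

    def borrow(self, borrower: str, now: datetime) -> bool:
        """Return True if the borrower holds a loan after this call."""
        self.expire(now)
        if borrower in self.loans:
            return True
        copy_free = len(self.loans) < self.owned_copies
        first_in_line = not self.waitlist or self.waitlist[0] == borrower
        if copy_free and first_in_line:
            if self.waitlist:
                self.waitlist.popleft()
            self.loans[borrower] = now + LOAN_PERIOD
            return True
        # No copy free, or someone else is ahead: join the waitlist once.
        if borrower not in self.waitlist:
            self.waitlist.append(borrower)
        return False


title = Title(owned_copies=1)
start = datetime(2020, 1, 1)
title.borrow("alice", start)                      # gets the only copy
title.borrow("bob", start + timedelta(days=1))    # waitlisted
title.borrow("bob", start + timedelta(days=15))   # alice's loan has expired
```

The invariant worth noting is that `len(loans)` never exceeds `owned_copies`, which is the one-to-one constraint the section describes; removing the waitlist check and the copy count would approximate the unlimited borrowing of the National Emergency Library.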
Following the 2020 initiation of publisher lawsuits, CDL operations faced publisher-initiated opt-outs for specific titles and judicial limitations, culminating in a 2023 district court ruling and 2024 appellate affirmation that rejected fair use defenses for the program, mandating removal of over 500,000 titles from circulation. The mechanism's efficacy hinges on verifiable ownership chains for physical copies, such as matched serial numbers or acquisition records, to prevent discrepancies where digital loans exceed physical holdings. This is a scalability challenge exacerbated by bulk acquisitions without granular tracking, as critiqued in analyses of potential market substitution effects. Publishers have contested the sufficiency of these controls, arguing they fail to mitigate the risks of unauthorized proliferation inherent in digitization without publisher consent.

User Engagement and Sponsorship Programs

Open Library's book sponsorship program, launched on October 23, 2019, enables users to direct donations toward acquiring and digitizing specific titles not yet available in the collection, with scanned books subsequently offered for free borrowing. This initiative supports targeted expansion of the digital holdings by leveraging individual contributions to cover costs associated with purchasing physical copies and performing scans.

Complementing sponsorship, users actively contribute to catalog maintenance through editable records, where they can add new records, update edition details such as publishers, and refine work-level information like descriptions, with all changes tracked in edit histories. Author entries can also be managed, including merging duplicates upon approval, fostering a collaborative refinement of bibliographic data.

To encourage evaluative input, Open Library implemented Community Reviews in August 2021, a structured tagging system permitting users to select predefined categories via a modal interface and assign 1-5 star ratings applicable to works rather than editions, with data anonymized and aggregated for display on book pages. These reviews eschew free-form text in favor of tag-based classification, enabling statistical overviews of community sentiment and aiding discovery.

Users further engage by curating content through lists and a reading log, accessible via account management, where books can be categorized thematically, such as by subject or personal interest, and status-tracked as "want to read," "reading," or "read," supporting individualized organization without built-in public sharing mechanisms. Such tools promote sustained interaction, aligning user preferences with catalog utility while empowering niche prioritization in sponsorship-driven digitization.

Accessibility Initiatives for Disabled Users

Open Library facilitates access to digitized books in specialized formats for users with print disabilities, such as blindness or dyslexia, through integration with the Internet Archive's programs. These efforts leverage the Chafee Amendment (17 U.S.C. § 121), which exempts authorized entities from copyright liability when reproducing works in accessible formats exclusively for eligible individuals who cannot read standard print due to physical or perceptual limitations. Unlike controlled digital lending for the general public, which relies on fair use and imposes one-to-one borrowing limits, print disability access operates under this statutory exemption, allowing unlimited distribution to qualified users without concurrent circulation restrictions.

The primary format provided is DAISY (Digital Accessible Information System), a structured audio-text hybrid enabling navigation by chapters, headings, and images via screen readers or braille displays, with support for unencrypted text files in public domain works. Users must qualify as print disabled, certified by a doctor, optometrist, or authorized agency, and obtain a free account with special access status, with eligibility verification streamlined by a May 30, 2025, process enhancement. Once approved, eligible patrons can borrow and download DAISY-formatted books directly, bypassing general waitlists; Braille Ready Format (BRF) support is available for compatible titles through compatible devices.

Launched in 2010, the initiative initially unlocked over 1 million volumes in DAISY format, drawn from the Internet Archive's scanned collections, with expansions by 2018 providing access to millions more for visually impaired users via web-based reading or downloads. This scale reflects ongoing digitization of physical scans optimized for accessibility, including optical character recognition (OCR) enhancements, ensuring compatibility with assistive technologies like JAWS or NVDA screen readers. The program's legal distinction from broader lending models has insulated it from challenges in related litigation, prioritizing equitable access grounded in congressional intent for disability exemptions.

In the early 2010s, as Open Library expanded its controlled digital lending program, beginning with a library partnership in 2011, publishers and authors raised initial concerns about unauthorized scanning and reproduction of copyrighted works. Critics alleged that the project's digitization of entire books, followed by digital loans without publisher consent, exceeded fair use and enabled potential piracy through downloadable copies. These claims intensified in late 2017, when Open Library's practices were highlighted as copyright violations, with critics noting the availability of in-print titles for digital borrowing without remuneration to rights holders.

The Science Fiction and Fantasy Writers of America (SFWA) formalized these objections in a January 2018 infringement alert, asserting that Open Library's scanning of relatively recent books constituted direct infringement rather than analogous physical lending, as it involved creating new digital copies not licensed from publishers. SFWA and allied groups, including Writer Beware, urged authors to verify their works' presence on Open Library and submit DMCA takedown notices, emphasizing that the platform's scale, potentially encompassing millions of titles, amplified the risks of unauthorized distribution. In response, Internet Archive founder Brewster Kahle defended the initiative in a January 2018 blog post, arguing that one-to-one lending preserved library traditions without supplanting sales, and committed to honoring takedown requests while maintaining an opt-out list for publishers seeking broader exclusions.
By 2019, these tensions persisted, with the Authors Guild issuing an open letter protesting controlled digital lending as a flawed model that undermined author earnings by treating digital copies as fungible equivalents to print ones, despite lacking contractual agreements with rights holders. Internet Archive countered that the practice was transformative, serving non-commercial preservation goals akin to physical libraries' interlibrary loans, and relied on fair use doctrine to justify limiting loans to the number of owned physical copies. The prevalence of DMCA notices and opt-out requests during this period empirically demonstrated gaps in Open Library's permission-based framework, as reactive removals addressed individual complaints but did not resolve systemic disputes over proactive scanning authorizations.

Hachette Book Group et al. v. Internet Archive Lawsuit (2020–2024)

In June 2020, , Inc., Publishers LLC, LLC, and John Wiley & Sons, Inc. filed a lawsuit against the in the United States District Court for the Southern District of New York, alleging willful violations through the defendant's Open Library program. The complaint targeted the 's Controlled Digital Lending (CDL) system, which involved scanning physical print books owned by the organization or its partners, creating digital copies, and lending those ebooks on a one-to-one basis with the physical copies—effectively digitizing and distributing unauthorized reproductions of the publishers' works without licenses. The suit specifically highlighted 127 titles from the plaintiffs' catalogs, including works like and , but sought remedies for broader infringement affecting tens of thousands of books. The filing was precipitated by the Internet Archive's launch of the National Emergency Library (NEL) on March 24, 2020, a temporary expansion of CDL that suspended waitlists and enabled simultaneous digital loans equivalent to its entire print collection of over 1.4 million titles, in response to physical library closures during the . Plaintiffs contended that CDL and NEL constituted systematic digital piracy, not protected library lending, as the process generated new copies that competed directly with licensed digital sales and subscriptions, evidenced by internal data showing millions of loans—including to users in markets with available purchases—and peak daily circulations exceeding 100,000 during NEL. They argued this harmed downstream licensing markets for , audiobooks, and print editions, rejecting any claim by emphasizing the commercial substitutability of the free digital loans, the lack of transformation in the reproductions, and the scale of unauthorized copying as exceeding traditional library exceptions. 
The Internet Archive defended CDL as a faithful digital analog to longstanding physical library practices, asserting that it maintained a strict owned-to-loaned ratio (one digital loan per physical copy, with the print book secured during lending), ensured non-commercial access without user fees, and served transformative purposes of preservation and equitable access amid print scarcity. Regarding NEL, the Internet Archive described it as a short-term (March 24 to June 16, 2020) humanitarian response to pandemic disruptions, not a permanent shift, and maintained that both initiatives fell under the fair use doctrine of 17 U.S.C. § 107 by favoring public benefit over market effects in non-profit contexts. The defense cited precedents like interlibrary loans and argued that there was no evidence of net market harm, as loans were time-limited (typically 14 days) and targeted users without easy ebook alternatives, while highlighting the Internet Archive's ownership of the physical copies it scanned.
Following discovery, which included depositions and production of lending logs revealing over 500,000 CDL loans in 2019 alone, the parties filed cross-motions for summary judgment in July 2022. In a March 24, 2023 ruling, Judge John G. Koeltl granted summary judgment to the plaintiffs on direct infringement claims for both CDL and NEL, holding that the practices failed all four fair use factors: non-transformative purpose, reproduction of expressive core content, copying of entire works rather than excerpts, and demonstrable harm to ebook licensing revenues. The court deferred damages and certain secondary claims for further proceedings and later entered a permanent injunction limiting the Internet Archive's digital lending of the plaintiffs' works, prompting the defendant to remove over 500,000 titles from availability pending appeal.

Appellate Rulings and Final Resolution (2024–2025)

On September 4, 2024, the United States Court of Appeals for the Second Circuit affirmed the U.S. District Court for the Southern District of New York's March 2023 summary judgment ruling in Hachette Book Group, Inc. v. Internet Archive. The appellate court held that the Internet Archive's (IA) Controlled Digital Lending (CDL) practices through its Open Library and Free Digital Library programs did not constitute fair use under Section 107 of the Copyright Act, rejecting IA's defense across all four statutory factors. Specifically, the court found that IA's scanning of physical books it owned into digital formats and lending those exact digital copies on a one-to-one basis with print counterparts created direct market substitutes for licensed e-books, undermining publishers' licensing revenues without the transformative purpose required for fair use. The Second Circuit dismissed IA's analogy to traditional library lending, noting that digital copies lack the physical wear and geographic constraints of print books, enabling nationwide distribution that competes with commercial e-book markets. On the fourth factor, the effect on the potential market, the panel emphasized the harm to plaintiffs' sales, including substitution for authorized digital editions, and declined to limit analysis to the 127 titles directly at issue, viewing CDL as a systemic threat to broader licensing ecosystems. The court upheld the district court's permanent injunction prohibiting IA from further distributing the scanned copies of the sued-upon works and similar infringing activities. Following the affirmation, IA did not file a petition for certiorari with the U.S. Supreme Court by the December 3, 2024, deadline, effectively concluding the litigation after over four years.
In a December 4, 2024, announcement, IA confirmed it would comply with its prior stipulation with the Association of American Publishers (AAP), removing from circulation over 500,000 scanned titles subject to the publishers' claims, while preserving access to public domain works and legally licensed materials. This resolution dismantled IA's CDL model for copyrighted in-print books without further appeals, establishing that digital replication and lending of owned print copies does not inherently qualify as fair use absent transformative elements or negligible market impact.

Reception and Criticisms

Supporter Perspectives on Preservation and Access

Supporters of Open Library, including Internet Archive leadership and collaborating librarians, contend that its digitization efforts keep available out-of-print and obscure titles that commercial markets neglect because of low profitability. One study found that 87% of books published in the twentieth century remain unavailable for purchase from major online retailers like Amazon, highlighting a preservation gap that Open Library addresses through scanning and cataloging over 20 million bibliographic records, including millions of rare volumes. In response to the COVID-19 pandemic, the National Emergency Library extension of Open Library suspended one-book-at-a-time waitlists from March 24, 2020, to June 16, 2020 (two weeks before its planned June 30 end date), enabling simultaneous remote borrowing of 1.4 million digitized titles by users worldwide, particularly benefiting students and remote learners cut off from physical collections. Internet Archive officials described this as fulfilling libraries' core mission of equitable access during crises, with the initiative drawing endorsements from educators for sustaining educational continuity amid widespread closures. Advocates, such as Open Libraries Director Chris Freeland, argue that the platform promotes broader access by providing free digital loans of low-demand works excluded from commercial e-book ecosystems, which favor bestsellers, and overcomes geographic barriers inherent in brick-and-mortar libraries. Lending data from Open Library's over 3 million borrowable items underscores usage patterns favoring educational and niche content, with supporters asserting this model supplements rather than supplants market availability, fostering informed citizenship without evidence of widespread substitution for purchasable editions.

Publisher and Author Objections to Market Harm

Publishers and authors have contended that Open Library's controlled digital lending (CDL) practice inflicts direct economic harm by enabling unauthorized digital access that substitutes for licensed purchases and library licenses, thereby eroding royalties and sales revenue. The Authors Guild has highlighted that CDL threatens author incomes by allowing libraries to scan and lend entire books without permission, bypassing the ebook market where authors typically receive royalties from licensed distributions. Specifically, authors earn approximately 25 percent of revenue from library ebook licenses, a stream that CDL circumvents by offering free digital loans, potentially rendering paid licensing obsolete for many titles. This circumvention, critics argue, creates disincentives for new writing, as creators forgo compensation for works that libraries could otherwise purchase or license digitally. Even small royalty losses significantly impact authors' livelihoods, given the marginal economics of writing, where CDL's scale (scanning and lending millions of titles) amplifies the harm, making it akin to unauthorized distribution despite one-user-at-a-time controls. The Association of American Publishers has asserted that any purported public benefits from CDL fail to justify the damage to publishers' actual and potential markets, including lost opportunities from controlled digital exploitation. Following the U.S. District Court's March 2023 ruling and the Second Circuit's September 2024 affirmation in Hachette v. Internet Archive, publishers emphasized that upholding copyright safeguards industry sustainability by preserving licensing models essential for funding authorship and publishing. The decisions were described as preventing "wholesale copying and distribution" that would "devastate the livelihoods of authors," reinforcing that contractual and technical protections are vital for creators to control and monetize their works.

Evaluations of Fair Use Doctrine Application

In the application of the fair use doctrine to controlled digital lending (CDL) by Open Library, courts weighed the four statutory factors under 17 U.S.C. § 107, ultimately finding against the Internet Archive in the Hachette Book Group, Inc. v. Internet Archive litigation. The first factor, concerning the purpose and character of the use, was deemed unfavorable due to the lack of transformative purpose; CDL involved scanning and lending complete digital copies of books for reading, mirroring the original expressive function rather than adding new expression, criticism, or functionality, despite the Internet Archive's nonprofit status. The district court also found commercial elements, such as donations tied to access and partnerships yielding revenue, that further tilted this factor against the Internet Archive; the Second Circuit disagreed that the use was commercial but still weighed the factor against the Internet Archive because the use was not transformative. The second and third factors (nature of the copyrighted work and amount used) also weighed against the Internet Archive, as the works were creative fiction and nonfiction books eligible for full copyright protection, and CDL reproduced entire copies without limitation to excerpts. The fourth factor, effect on the potential market, proved decisive, with evidence showing CDL substituted for licensed sales; for instance, during the period from March to September 2020, Internet Archive's lending of 100,000+ copies of plaintiffs' titles coincided with publishers' reported revenue declines, demonstrating actual harm rather than mere speculation. Courts rejected analogies to physical lending, emphasizing digital copies' infinite replicability and lack of degradation, which causally enable widespread unauthorized distribution beyond owned copies. Doctrinal tensions emerged in distinguishing CDL from precedents like Authors Guild v. Google (Google Books), where fair use was upheld for creating searchable indices with snippets, not full-text lending; Google Books facilitated discovery without supplanting reading markets, whereas CDL's one-to-one digital loans functioned as direct competitors to authorized digital editions.
Some academic critiques, often from library science perspectives, advocate doctrinal reform to prioritize preservation and access over market effects, arguing CDL mimics physical libraries without net harm. However, empirical evidence from publisher data—such as tracked borrowing spikes correlating with sales dips—supports the rulings' emphasis on verifiable substitution, underscoring causal market displacement over idealistic access claims, particularly given the ease of digital amplification.

Impact and Current Status

Contributions to Digital Preservation

Open Library maintains an extensive catalog comprising over 50 million edition records, aggregating and preserving bibliographic metadata from diverse sources and community contributions, which ensures the endurance of descriptive data on published works amid potential disruptions to physical collections. This supports systematic documentation of publication histories, authors, and variants, countering the fragmentation inherent in decentralized analog catalogs. Complementing this, Open Library integrates with the Internet Archive's book digitization program, which has produced scans of more than 25 million books, capturing high-fidelity images of physical volumes to mitigate risks from material decay, such as paper acidification and environmental damage. These digital surrogates preserve the exact content and formatting of originals, including out-of-print titles whose physical copies are increasingly scarce or inaccessible due to library deaccessioning or natural attrition. By enabling verifiable access to these preserved scans, Open Library facilitates scholarly examination of historical texts without reliance on fragile artifacts, thereby reducing the incidence of content loss and supporting longitudinal studies in fields like literature and history. As a non-profit endeavor, its archiving prioritizes the stewardship of cultural artifacts over commercial interests, aligning with longstanding library mandates to safeguard human knowledge for perpetual availability.
Following the district court's ruling that Open Library's controlled digital lending of in-copyright books constituted copyright infringement rather than fair use, which the Second Circuit Court of Appeals affirmed on September 4, 2024, the Internet Archive complied with an injunction requiring the removal of affected titles from its lending catalog. By June 2024, over 500,000 scanned books had been delisted, primarily modern titles from the suing publishers (Hachette, HarperCollins, Penguin Random House, and Wiley), as mandated by the court's order to halt unauthorized digital reproductions and distributions.
The Association of American Publishers (AAP) agreement, which the Internet Archive committed to honoring even after declining to seek Supreme Court review on December 4, 2024, ensured these removals persisted, eliminating a substantial portion of lendable in-copyright materials. These deletions contracted Open Library's operational scope, compelling a pivot toward public domain works and legally acquired materials, as the platform could no longer rely on one-to-one digital lending of scanned print copies for protected titles without publisher licenses. Borrowing volumes declined correspondingly, with users losing access to roughly half a million titles that had previously supported the service's scale, resulting in measurable reductions in overall lending activity and user engagement as reported by the Internet Archive itself. The court's determination that such lending acted as a market substitute for licensed e-books, evidenced by concurrent print and digital availability, underscored the causal link between unauthorized scanning and dissemination and publishers' lost licensing revenue, validating their claims of economic harm. Operationally, the outcomes eroded user trust, as evidenced by a June 2024 petition drive by supporters seeking restoration of removed titles, while highlighting the vulnerabilities of scaling digital libraries without explicit rights holder consent or robust precedents. This reinforced the practical necessity for licensed acquisition models, such as partnerships with publishers for authorized e-book lending, over experimental paradigms like nationwide initiatives that courts deemed to undermine primary markets. The sustained removals, upheld through the final resolution, thus constrained Open Library to a narrower preservation role focused on out-of-copyright materials, limiting its ambition as a comprehensive digital lending hub.

Future Prospects and Broader Implications

Following the Second Circuit's September 4, 2024, affirmation of the district court's ruling that the Internet Archive's Controlled Digital Lending (CDL) practices did not constitute fair use, approximately 500,000 titles remained withdrawn from Open Library's lending catalog in compliance with the decision and an associated agreement with the Association of American Publishers. This curtailment signals a trajectory toward greater reliance on public domain works, out-of-print materials with verifiable ownership chains, or negotiated licenses, as unauthorized scanning and one-to-one lending of in-copyright books risks further infringement liability. While the Internet Archive has pursued preservation partnerships with entities like public libraries for local digital heritage and received grants for non-book archiving, such as $1 million from Press Forward in July 2025 for local news digitization, no large-scale pivot to licensed e-book lending for Open Library has materialized as of late 2025. Emerging technologies like blockchain for tracking digital copies' provenance could theoretically enable provable "owned" lending without physical counterparts, but adoption remains speculative absent demonstrated implementation or legal validation. The ruling's broader implications erode the viability of CDL as a scalable model for non-commercial digital libraries, prioritizing copyright holders' market incentives over unilateral access expansions that courts deemed competitive substitutes for licensed e-books. This outcome reinforces causal dynamics where unauthorized lending discourages publisher investment in digital formats, as evidenced by the plaintiffs' arguments that Open Library's program harmed sales without transformative value.
Market-based solutions, such as subscription platforms or licenses, offer verifiable paths forward, contrasting with idealistic open-access models lacking creator consent, which face enforcement risks amplified by the decision's rejection of fair use across all factors. Policy debates may intensify around targeted exemptions for orphan works or expanded library privileges, but the appellate court's opinion, left standing when the Internet Archive declined to seek Supreme Court review in December 2024, entrenches judicial skepticism toward broad digital reproductions mimicking commercial markets. Without legislative reforms, the precedent favors innovation incentives tied to enforceable rights, potentially spurring hybrid models where libraries collaborate with rights holders rather than litigate doctrinal boundaries.
