
Crossref

from Wikipedia

Crossref (formerly styled CrossRef)[1] is a nonprofit open digital infrastructure organization for the global scholarly research community. It is the largest digital object identifier (DOI) Registration Agency of the International DOI Foundation. It has 19,000 members from 150 countries representing publishers, libraries, research institutions, and funders. It was launched in early 2000 as a cooperative effort among publishers to enable persistent cross-platform citation linking in online academic journals.[2] As of July 2023, Crossref identifies and connects 150 million records of metadata about research objects, made openly available for reuse without restriction. It facilitates an average of 1.1 billion DOI resolutions (clicks of a DOI link) every month and receives 1 billion queries of the metadata every month.

Background

Crossref is a nonprofit association of approximately 19,000 voting members made up of 6,000 societies and publishers, including both commercial and nonprofit organizations, 6,500 academic and research institutions, research funders, museums, repositories, government agencies and NGOs. Crossref includes members with varied business models, including those with both open access and subscription policies. Crossref does not provide a database of fulltext scientific content. Rather, it facilitates the links among distributed content hosted at other sites through the use of open metadata and persistent identifiers.

Crossref interlinks millions of items from a variety of content types, including journals, books, conference proceedings, research grants, working papers, technical reports, and data sets. Linked content includes materials from scientific, technical, and medical (STM), and social sciences and humanities (SSH) disciplines. Crossref's sustainability model includes an annual membership fee, a per-record registration fee, and additional service fees, while all the metadata remains open without restriction. Crossref provides the technical and business infrastructure for this linking, using digital object identifiers (DOIs). Crossref provides a query service for its records through an open REST API and a Search form.[citation needed]

In addition to the technology and metadata linking scholarly objects, Crossref enables a common linking contract among its participants. Members agree to assign DOIs to their current journal content, and they also agree to link from the references of their content to the content of other publishers. This reciprocity is an important component of what makes the system work.

Non-member organizations may participate in Crossref by integrating the metadata. Such organizations include libraries, online journal hosts, linking service providers, secondary database providers, search engines, and providers of article discovery tools.

Metadata

When a scholarly journal publishes an article, the publisher typically enters the following information about the article into Crossref: journal name, article DOI, publication date, journal volume, issue, and page, URLs of the article and the journal, and number of pages. Optional metadata includes the text of the article abstract, ORCID iDs of the authors, funding information (including funder registry IDs and funding award numbers), license information, and similarity check URLs.

The references cited by a work can also be added. This contributes to the Crossref Cited-by service, which allows one to see which articles have cited another.[3] Most major scholarly publishers provide the references for each of their articles; Elsevier was a major holdout but began providing references in 2021.[4]

Crossref does not currently support adding information about the role of each author or other contributor to an article, but this feature is planned to be released in 2025, at least for CRediT information.[5]

Services

In addition to maintaining scholarly records, Crossref provides additional services such as plagiarism screening, searching by funding, and a button on article pages and PDFs to determine the status of an item, such as whether it has been corrected or retracted.

Crossmark

The Crossmark update system facilitates updates, corrections, and retractions for the scholarly community. The reader simply clicks on the Crossmark button to view status information about the document. If an update exists, the status information will include a DOI link to the statement of correction or retraction.

Crossmark provides a cross-platform way for readers to quickly discover the status of a research output along with additional metadata related to the editorial process. Crucially, the Crossmark button can also be embedded in PDFs, which means that members have a way of alerting readers to changes months or even years after it's been downloaded.

— Crossref[6]

Crossmark also allows publishers to link publications about a clinical trial, such as those reporting its results, to the registration for that clinical trial.[7]

Challenges

In June 2024, a paper drew wider attention after a team of researchers found fabricated metadata entered into the Crossref database; for the publisher analyzed, 9% of the references were wrong. The fabricated references were also picked up by downstream datasets such as OpenAlex. In the reported examples, the metadata no longer contains the real citations but made-up ones.[8][9]

Awards

In September 2012, Crossref was awarded the Association of Learned and Professional Society Publishers (ALPSP) Award for Contribution to Scholarly Publishing. According to ALPSP, "With over 4,000 participating publishers, Crossref's reach is international and it is very well regarded not just amongst publishers, but also the literary community and researchers. Crossref has built on this unique position to offer other services such as Crossref Cited by Linking, CrossCheck, CrossMark and the latest project, FundRef. Crossref's services provide solutions that are best done collectively by the industry to improve scholarly communications."[10]

The Council of Science Editors (CSE) awarded Crossref its Award for Meritorious Achievement at the CSE annual meeting in May 2009. This was only the second time the award had been presented to an organization rather than to an individual.[11]

In September 2008, ALPSP awarded Crossref its Innovation in Publishing award for the CrossCheck plagiarism screening service powered by iThenticate.[12] In 2025, to mark the organisation's 25th anniversary, Crossref launched its own Crossref Metadata Awards to emphasize its community's role in stewarding and enriching the scholarly record.[13] The six winners were selected on the basis of the overall highest coverage of metadata elements included in Participation Reports.[14]

from Grokipedia
Crossref is a not-for-profit membership organization that operates open infrastructure for registering digital object identifiers (DOIs) and metadata for scholarly content, facilitating the global linking, citation, discovery, and assessment of research outputs to support open science and scholarly communication.[1] Founded in January 2000 as the Publishers International Linking Association, Inc. (PILA), it began as a collaborative effort among 12 major publishers to create a centralized system for reference linking in academic journals.[2] As of November 2025, Crossref serves over 23,000 members across 163 countries and maintains metadata for over 176 million research records, handling nearly 2 billion API queries monthly.[1][3][4]

The organization originated from discussions in the mid-1990s within the Association of American Publishers and the International DOI Foundation, which sought a persistent identifier solution for digital scholarly materials.[2] A prototype was demonstrated in 1999 at the Frankfurt Book Fair, leading to the official launch of the Crossref system in June 2000 with an initial focus on cross-publisher reference linking.[2] Over the decades, it has expanded beyond journals to include books, datasets, standards, and other research objects, while rebranding from CrossRef to Crossref in 2015 to reflect its broader mission.[5] Governance is community-driven, with a board of directors and equal voting rights for all members, aligning with the Principles of Open Scholarly Infrastructure.[1]

Key services include DOI content registration, which allows members to deposit rich metadata for enhanced discoverability; free metadata retrieval via APIs for global access; and specialized tools such as Cited-by for tracking citations, Crossmark for indicating content updates or corrections, and Similarity Check for plagiarism detection.[6] These services underpin the scholarly ecosystem by creating a reusable, interconnected record of research, promoting transparency in funding through the Funder Registry, and capturing event data for online mentions of works.[6] By enabling efficient metadata exchange, Crossref plays a critical role in making scholarly knowledge more accessible and reliable.[1]

History and Background

Founding and Early Years

The development of the Digital Object Identifier (DOI) system, a precursor to Crossref, began in 1996 when the Enabling Technologies Committee of the Association of American Publishers issued a request for proposals for a persistent identifier system to manage digital content and protect intellectual property. The Corporation for National Research Initiatives (CNRI) proposed its Handle System, which was selected and prototyped to enable unique, location-independent identification of digital objects. This laid the technological foundation for linking scholarly publications reliably across the web.[7][2]

In 1998, the International DOI Foundation (IDF) was established as a not-for-profit organization to develop, maintain, and govern the DOI system, building on the Handle System and addressing the needs of the publishing community for standardized intellectual property management in the digital environment. The IDF collaborated with initiatives like the INDECS project (1998–2000) to refine the DOI's data model and syntax, ensuring interoperability for content identification. By 1999, these efforts had progressed to practical applications in scholarly linking.[7][8]

A key milestone occurred in 1999 with the DOI-X prototype project, led by Academic Press and Wiley in partnership with the IDF, which demonstrated automated reference linking using DOIs and a centralized metadata database. The prototype was showcased at the Frankfurt Book Fair in October 1999 during the STM Annual Conference, impressing representatives from major publishers and prompting the formation of a coalition. In December 1999, 12 founding organizations (the American Association for the Advancement of Science, Academic Press, American Institute of Physics, Association for Computing Machinery, Blackwell, Elsevier Science, Institute of Electrical and Electronics Engineers, Kluwer Academic Publishers, Macmillan Magazines (Nature), Oxford University Press, Springer-Verlag, and John Wiley & Sons) committed to creating a dedicated linking service.[7][9][2]

Crossref was incorporated on January 27, 2000, as Publishers International Linking Association, Inc. (PILA), a not-for-profit entity based in New York to operate the service independently. The system launched on June 5, 2000, as the first collaborative reference linking network for scholarly journals, initially enabling links across over 1.3 million articles from 2,700 journals submitted by 33 publishers. Ed Pentz, who had experience in electronic publishing from launching Academic Press's first online journal in 1995, was appointed as the founding Executive Director on February 1, 2000, overseeing the launch and focusing on persistent citation linking among member publishers to enhance research discoverability.[7][10][11]

Growth and Key Milestones

Crossref marked its 10th anniversary in 2009 by commissioning and publishing The Formation of Crossref: A Short History, a document reflecting on its origins and early development as a collaborative infrastructure for scholarly linking.[2]

Since its inception with 12 founding publisher members in 2000, Crossref has expanded dramatically, growing to over 23,000 members spanning more than 160 countries by 2025.[1] This growth reflects a broadening scope beyond traditional publishers to encompass a diverse array of institutions, funders, and research organizations, fostering a more inclusive ecosystem for scholarly communication. In 2025, marking its 25th anniversary, Crossref launched the Metadata Awards to recognize community efforts in metadata quality. Headquartered in Lynnfield, Massachusetts, Crossref has evolved into a global not-for-profit entity supporting open infrastructure for research outputs.[1][12][13]

Key milestones underscore this expansion. By early 2024, Crossref had amassed over 170 million metadata records, building on earlier achievements such as reaching 100 million records around 2019 and surpassing 150 million by mid-2023. As of November 2025, this figure stands at over 176 million, accompanied by robust usage metrics including approximately 1.1 billion monthly DOI resolutions and over 2 billion metadata queries.[1][14][4] This scale highlights Crossref's role in enabling persistent access to scholarly content worldwide.

Financially, Crossref has sustained steady operations, reporting budgeted revenue of around US$13 million as of 2025.[15] Operationally, the organization has scaled its team from a single employee in 2000 to 49 staff members by 2025, supporting enhanced technological capabilities and community engagement.[1]

Organization and Governance

Structure and Leadership

Crossref operates as a nonprofit organization under the legal entity Publishers International Linking Association, Inc. (PILA), registered in New York, USA, with 501(c)(6) tax-exempt status.[16] It follows a community-governed model, where a board of directors, elected by members, oversees strategic direction, operations, budget approvals, and bylaw amendments, ensuring alignment with its mission to facilitate scholarly communication.[16] The board meets quarterly to make key decisions through motions, promoting open governance and collective input from various committees representing the global research community.[16]

Leadership at Crossref is headed by Executive Director Ed Pentz, who has held the role since the organization's founding in 2000, guiding day-to-day operations and long-term vision.[1] The current Chair of the board is Lisa Schiff from the California Digital Library, supported by a Treasurer (Rose L’Huillier from Elsevier) and Secretary (Lucy Ofiesh, Crossref's Chief Operating Officer).[17][16] The board comprises 16 members drawn from diverse sectors, including academic publishers, learned societies, libraries, research funders, and technology organizations, with representation from multiple countries to reflect the international scope of scholarly publishing.[16] This composition emphasizes inclusive decision-making, with terms typically lasting three years and elections held annually to renew approximately one-third of the seats.[18]

Operationally, Crossref maintains a distributed team of 49 people as of 2025, focused on sustaining core infrastructure, developing tools and services for metadata management, and providing support to its member community worldwide.[1] The organization's financial model relies primarily on tiered annual membership fees, which fund operations and are scaled according to members' publishing revenue or expenses, supplemented by fees for content registration services.[19] Transparency is ensured through publicly available annual reports detailing revenue, expenditures, and sustainability efforts.[20]

Membership and Community Involvement

Crossref's membership comprises over 23,000 organizations from more than 160 countries, encompassing publishers, research institutions, government agencies, research funders, museums, and other entities involved in scholarly communication.[21] This diverse community reflects the global scope of the organization, with members contributing to the registration and linking of over 170 million research objects as of 2025.[1]

Membership requires organizations to agree to Crossref's terms, which include depositing metadata for registered content and adhering to standards for persistent identifiers. Fees are structured on a tiered basis according to an organization's annual revenue or expenses, typically tied to publication output, with annual membership fees starting from $200 for the smallest entities (revenue/expenses under $1,000 USD) and scaling up for larger publishers; additionally, a one-time deposit fee applies per new DOI or metadata record.[19] In return, members gain access to core services such as DOI registration for ensuring long-term accessibility of content, metadata deposition to enhance discoverability, and utilization of tools like Crossmark for signaling updates and Similarity Check for plagiarism detection.[21] The Global Equitable Membership program further supports participation by waiving fees for organizations in least economically advantaged countries, promoting inclusivity in scholarly infrastructure.[22]

The Crossref community fosters active engagement through various collaborative mechanisms, including working groups that advise on technical and policy developments, and an online forum where members discuss implementation challenges and share best practices.[23] Regular events, such as the Crossref Community Update held on May 7, 2025, provide opportunities for feedback on initiatives like the metadata schema version 5.4 update, which introduced enhancements for citations and abstracts.[24] Collaborative projects, exemplified by the first Metadata Sprint in April 2025 in Madrid, Spain, bring together participants to address specific metadata improvement tasks, strengthening community ties and advancing shared goals.[25]

Members play a pivotal role in the sustainability of Crossref by collectively funding and governing the infrastructure that stewards the scholarly record, ensuring equitable access to research outputs worldwide through metadata exchange and persistent linking.[1] This shared model, sustained by membership contributions, supports over 2 billion monthly API queries and promotes the long-term integrity of global research dissemination.[26]

Core Infrastructure

DOI Registration and Persistent Linking

Crossref operates as the largest Digital Object Identifier (DOI) Registration Agency (RA) under the governance of the International DOI Foundation (IDF), which oversees the DOI system as defined in ISO 26324. With over 23,000 members from more than 160 countries, Crossref manages more than 170 million DOI records, far surpassing other RAs in scale and global reach. This infrastructure supports the scholarly community by providing persistent identifiers that ensure long-term accessibility to research outputs, regardless of changes in hosting or location.[1][27]

The DOI registration process begins with membership, where organizations obtain a unique DOI prefix from Crossref. Members then construct DOIs by appending a suffix to this prefix and deposit associated metadata in XML format using Crossref's schema, either through automated systems or manual interfaces. This applies to a wide range of content types, including journal articles, books and chapters, conference proceedings, datasets, dissertations, preprints, reports, and standards. Once registered, the DOI serves as a permanent handle, resolving to the current location of the content via the Handle System managed by the Corporation for National Research Initiatives (CNRI). Crossref facilitates approximately 1.3 billion successful DOI resolutions monthly, accounting for 94% of all global DOI activity and enabling seamless access to scholarly materials.[28][29][30]

Reference linking forms a core component of Crossref's functionality, allowing automated matching of citations in reference lists to corresponding DOIs across publications. Members are required to include DOIs in their outgoing references and accept incoming links, with Crossref providing free tools to parse and match unstructured citations to registered metadata. This process enhances interoperability by eliminating the need for bilateral agreements between publishers, fostering an interconnected web of scholarly content where users can navigate from one full-text source to another with a single click. For instance, a citation in a journal article can resolve directly to the DOI of the referenced work, regardless of the publisher.[31][32]

Crossref's DOI system integrates with global research infrastructure to support non-traditional outputs, extending beyond conventional publications. Members can register DOIs for preprints, with around 16,000 new preprint DOIs added monthly since schema support was introduced in 2016, enabling early citation and versioning. Similarly, DOIs are assigned to software and datasets, often through collaborations like those with DataCite, allowing publications to link bidirectionally to these resources and promoting reuse in computational and data-driven scholarship. This broad applicability ensures that diverse outputs, such as code repositories or data collections, remain persistently discoverable within the scholarly ecosystem.[33][34][35]
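
The prefix-plus-suffix construction described in this section can be illustrated with a minimal Python sketch. The prefix `10.5555` and the suffix are placeholders chosen for illustration, and the helper names `build_doi` and `resolver_url` are hypothetical rather than part of any Crossref tool; the only external fact assumed is that DOIs are commonly resolved through links of the form `https://doi.org/<DOI>`.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def build_doi(prefix: str, suffix: str) -> str:
    """Join a member's prefix and a member-chosen suffix into a DOI."""
    if not prefix.startswith("10."):
        raise ValueError("DOI prefixes start with '10.'")
    if not suffix:
        raise ValueError("suffix must not be empty")
    return f"{prefix}/{suffix}"


def resolver_url(doi: str) -> str:
    """Return a link that resolves the DOI to the content's current location."""
    # Percent-encode characters that are unsafe in a URL path; keep '/'.
    return DOI_RESOLVER + quote(doi, safe="/")


# Placeholder prefix and suffix, for illustration only.
doi = build_doi("10.5555", "example-2025-001")
print(doi)
print(resolver_url(doi))
```

Because the link points at the resolver rather than at the hosting site, the same URL keeps working when the member later updates the registered location of the content.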

Metadata Standards and Schema

Crossref maintains a structured metadata schema to ensure consistency and interoperability for scholarly content registered with digital object identifiers (DOIs). The schema defines required, recommended, and optional elements that capture essential details about research outputs, including DOIs as unique identifiers, titles, contributor information (such as authors with integrated ORCID iDs for persistent identification), abstracts, funding details, and references to support citation tracking.[36][37][38] As of 2025, Crossref has amassed over 170 million open metadata records, enabling widespread reuse in discovery tools and analyses.[1]

The metadata deposit schema has evolved to accommodate richer descriptions, with version 5.4.0 released in March 2025 introducing support for multiple contributor roles (including "corresponding author" and "other"), a type attribute for citations to improve matching, version numbering for works, and expanded language options.[39][40] These updates build on prior versions by enhancing the granularity of roles and references, which facilitate the construction of comprehensive citation graphs across scholarly literature.[41] Full integration of the CRediT taxonomy for detailed contributor roles is planned for subsequent schema releases, such as version 5.5, to further standardize acknowledgments of diverse contributions.[42][43]

Members deposit metadata primarily through XML files formatted according to the Crossref schema, submitted via HTTPS POST or the web-based admin tool, which supports uploads of XML built to standards like NLM or JATS.[44][45] The process emphasizes depositing rich, structured data (such as full references and funding awards) to maximize discoverability, interoperability, and reuse in global scholarly ecosystems.[46][47]

All deposited metadata is publicly accessible without restriction, supporting nearly 2 billion monthly API queries as of 2025 to power tools like search engines and reference managers.[48] Retrieval occurs via the REST API, which provides metadata in JSON or XML formats, alongside OAI-PMH and other interfaces; rate limits for public and polite pools will be revised starting December 1, 2025, to sustain performance amid growing demand.[49][50] This open access model underpins Crossref's role in fostering a connected research infrastructure.[1]
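
As a rough illustration of retrieval through the REST API, the following Python sketch builds a request URL for a single work and extracts a few fields from the JSON reply. It assumes the public endpoint `https://api.crossref.org/works/<DOI>`, a response whose record sits under a `message` key, and a `mailto` query parameter as the way to identify the caller for the polite pool; none of these details are stated in the text above, so they should be checked against the current API documentation. The sample response is hand-written and hypothetical, and `fetch` is shown but not called because it needs network access.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_BASE = "https://api.crossref.org/works/"  # assumed public endpoint


def works_url(doi: str, mailto: Optional[str] = None) -> str:
    """Build the REST API URL for a single work identified by DOI."""
    url = API_BASE + quote(doi, safe="/")
    if mailto:
        # Assumption: a contact address identifies the caller (polite pool).
        url += "?" + urlencode({"mailto": mailto})
    return url


def summarize(response: dict) -> dict:
    """Pick a few fields out of a parsed JSON response."""
    message = response.get("message", {})
    titles = message.get("title") or [""]
    return {
        "doi": message.get("DOI"),
        "title": titles[0],
        "type": message.get("type"),
    }


def fetch(doi: str, mailto: Optional[str] = None) -> dict:
    """Perform the HTTP request and parse the JSON body (needs network)."""
    with urlopen(works_url(doi, mailto), timeout=30) as reply:
        return json.load(reply)


# Offline demonstration with a hand-written, hypothetical response.
sample = {
    "status": "ok",
    "message": {
        "DOI": "10.5555/example-2025-001",
        "title": ["An Example Article"],
        "type": "journal-article",
    },
}
print(works_url("10.5555/example-2025-001", mailto="editor@example.org"))
print(summarize(sample))
```

Keeping URL construction and response parsing separate from the network call makes the parsing logic easy to test against stored responses, which also avoids spending rate-limited requests during development.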

Services and Tools

Crossmark and Content Updates

Crossmark is a service provided by Crossref that enables the display of post-publication updates for scholarly content associated with Digital Object Identifiers (DOIs). Launched on April 27, 2012, it aims to alert readers to important changes such as corrections, retractions, and version updates, while also providing access to additional editorial metadata like publication history and licenses.[51][52]

The functionality of Crossmark relies on participating publishers depositing specific metadata through Crossref's DOI registration system. When an update occurs, members submit details including the type of change (e.g., correction or retraction) and a link to the updated content, which Crossref stores and makes accessible via an API. Readers encounter a Crossmark button or badge embedded on the publisher's website or PDF near the content's title; clicking it reveals a pop-up dialog showing the current status (such as "up-to-date," "updated," or "retracted") along with timestamps, descriptions of changes, and links to the latest version. This system supports tracking retractions by flagging affected DOIs, ensuring the scholarly record remains transparent without altering the original publication.[52][53]

By facilitating easy visibility of post-publication modifications, Crossmark enhances trust in the integrity of scholarly outputs, allowing researchers, librarians, and readers to verify the currency and reliability of cited works. For publishers, it demonstrates a commitment to maintaining the scholarly record, potentially increasing the perceived credibility of their content and enabling the sharing of supplementary information like peer-review status or funding details. Integration with major platforms, including support for REST API queries, further streamlines its use across diverse publishing workflows.[52][54]

Adoption of Crossmark has grown steadily among Crossref members, with participation becoming a standard practice for many large publishers to signal ongoing maintenance of their outputs. As of March 2020, 440 Crossref members had implemented the service, registering Crossmark metadata for 11.4 million DOIs, including notable examples from Elsevier, Oxford University Press, and the Royal Society. While Crossmark is not mandatory for all members, those opting into the service are required to apply the badge consistently to new content and to maintain a policy page outlining update procedures, with backfile application encouraged but optional. Early adopters in 2012 included 21 journals covering nearly 20,000 documents, marking initial traction for retraction and update tracking.[54][51][53]
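
The status shown in the Crossmark dialog can be thought of as a function of the update notices registered against a DOI. The Python sketch below illustrates that idea only; the record layout (a list of dictionaries with `type` and `doi` keys) and the function name `crossmark_status` are hypothetical simplifications, not the actual Crossmark metadata format or API, and the rule that a retraction outranks other updates is an assumption made for the example.

```python
from typing import Iterable, Mapping


def crossmark_status(updates: Iterable[Mapping[str, str]]) -> str:
    """Derive a reader-facing status from a list of update notices.

    Each notice is a hypothetical dict such as
    {"type": "correction", "doi": "10.5555/notice-1"}.
    """
    kinds = [notice.get("type", "").lower() for notice in updates]
    if "retraction" in kinds:
        # Assumed rule: a retraction takes precedence over any other update.
        return "retracted"
    if kinds:
        return "updated"
    return "up-to-date"


# Hypothetical notices, for illustration only.
print(crossmark_status([]))
print(crossmark_status([{"type": "correction", "doi": "10.5555/notice-1"}]))
print(crossmark_status([
    {"type": "correction", "doi": "10.5555/notice-1"},
    {"type": "retraction", "doi": "10.5555/notice-2"},
]))
```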

Cited-by Linking and Similarity Check

Crossref's Cited-by service enables publishers to discover and display forward citations to their content, fostering connections between scholarly works. The service expanded significantly in early 2021 following Elsevier's decision to open its reference data, and it now covers over 90% of Crossref's registered DOIs, allowing members to retrieve comprehensive lists of citing articles, including citation counts and direct links to the citing works.[55][56] This functionality relies on members depositing reference lists as part of their DOI metadata submissions, which are then parsed to create accurate citation links across the ecosystem.[57]

The service provides public API endpoints for accessing citation metrics, such as the total number of citing works via the "is-referenced-by-count" field in metadata responses, enabling integration into publisher websites and research assessment tools. For instance, scholars can query citations for a specific DOI to explore how ideas evolve through subsequent research, enhancing discoverability without additional costs to participants.[58] Non-participating members risk underreporting citations by up to 20%, underscoring the value of comprehensive reference deposition.[55]

Complementing citation tracking, Crossref's Similarity Check service, formerly known as CrossCheck, supports content integrity by detecting potential plagiarism in manuscripts before publication. Powered by iThenticate software from Turnitin, it scans submitted texts against a vast database exceeding 78 million scholarly publications and web sources, generating an originality report with similarity scores and highlighted matches.[59][60] Eligible Crossref members, who must include full-text URLs in at least 90% of their metadata deposits, gain discounted access to upload unlimited manuscripts for analysis.[61] This pre-publication screening helps editors verify originality, protects publication reputations, and educates authors on proper citation practices.[62]

Implementation of both services integrates seamlessly with Crossref's core infrastructure, where reference metadata deposition underpins Cited-by links, while full-text accessibility enables Similarity Check comparisons. Their combined impact bolsters research assessment by providing reliable citation networks and ensuring ethical content creation; for example, the related Grant Linking System reached a milestone in 2024 with over 125,000 grants registered, facilitating traceable funding-to-output connections that enhance transparency in scholarly evaluation.[63]
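
To show how the "is-referenced-by-count" field mentioned in this section might be consumed, here is a small Python sketch that reads the field from work records and ranks the records by it. The records are hand-written, hypothetical examples shaped like parsed JSON, and the helper names are illustrative; only the field name itself comes from the text above.

```python
from typing import Iterable, List, Mapping


def citation_count(work: Mapping) -> int:
    """Read the citing-works total from a metadata record (0 if absent)."""
    return int(work.get("is-referenced-by-count") or 0)


def rank_by_citations(works: Iterable[Mapping]) -> List[Mapping]:
    """Return the records ordered from most to least cited."""
    return sorted(works, key=citation_count, reverse=True)


# Hypothetical records, for illustration only.
works = [
    {"DOI": "10.5555/a", "is-referenced-by-count": 3},
    {"DOI": "10.5555/b", "is-referenced-by-count": 12},
    {"DOI": "10.5555/c"},  # field missing: treated as zero
]
for work in rank_by_citations(works):
    print(work["DOI"], citation_count(work))
```

A count derived this way reflects only the references that members have deposited, so it can be lower than counts reported by other citation indexes.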

Additional Services

Crossref offers several supplementary services that extend beyond its core DOI registration and linking functions, enhancing the transparency, discoverability, and impact assessment of scholarly content. These include a registry of funding sources, tracking of online scholarly events, advanced metadata access options, and streamlined content deposit tools. Together, these services support a more interconnected ecosystem for research outputs, enabling stakeholders to trace funding influences, monitor broader engagement, and manage metadata workflows efficiently.[6]

The Open Funder Registry (OFR) provides persistent identifiers for grant-giving organizations worldwide, facilitating the linking of research outputs to their funding sources. Launched as an open, CC0-licensed resource, the registry contains records for over 44,000 funders, including government agencies, private foundations, and international bodies, and is updated every 4-6 weeks to incorporate new entries and revisions. About 25% of Crossref records include funding information, promoting transparency in research funding and aiding compliance with open access mandates from bodies like the National Institutes of Health (NIH) and the European Research Council (ERC). As of March 2025, members can use Research Organization Registry (ROR) IDs in place of Funder IDs for improved interoperability. Funders and publishers use these IDs to standardize metadata, reducing ambiguity in grant acknowledgments and enabling aggregated analyses of funding outcomes.[64][65][66][67]

The Event Data service captures and aggregates online mentions of scholarly works identified by DOIs, extending traditional citation tracking to include diverse forms of engagement. It monitors sources such as Wikipedia edits, social media platforms like Reddit, peer review platforms, news outlets, blogs, datasets, and patents, compiling events that reflect scholarly discourse beyond formal publications. This aggregation supports altmetrics by providing raw data on shares, comments, and references, which researchers and institutions use to gauge broader societal impact; the service has processed millions of events annually, helping to visualize research visibility across non-academic channels. It distributes this information via a free API, allowing users to query and analyze events without restrictions, though Twitter data was discontinued in 2023 due to API changes.[68][69]

Metadata Retrieval offers free, public access to Crossref's extensive repository through APIs and search tools, enabling global reuse of over 170 million records containing details like titles, authors, publication dates, and funder information. The REST API supports queries by DOI, title, or author, returning results in JSON or XML formats, while the public annual data files provide comprehensive snapshots for bulk analysis. For high-volume users, Metadata Plus provides premium enhancements, including higher API rate limits, monthly dumps in XML and JSON formats, and priority support, ensuring reliable access for large-scale integrations like library catalogs or analytics platforms. These tools prioritize open access to foster metaresearch and interoperability with systems such as ORCID and DataCite.[70][71]

Content Registration equips Crossref members with interfaces to deposit and update metadata alongside DOIs, ensuring persistent identification for diverse content types including journals, books, datasets, and preprints. Members can submit XML-formatted metadata via web-based tools, XML gateways, or custom integrations, adhering to Crossref's schema, which incorporates elements like contributor roles, funding details, and version information. Each registered DOI combines the member's prefix with a unique suffix, enabling immediate resolution to content locations, and records can be updated later to reflect corrections or new versions. By streamlining deposits, the service enhances the accuracy and completeness of the global scholarly record, with over 23,000 members relying on these mechanisms for efficient workflow management.[28]
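The REST API lookups described under Metadata Retrieval above can be sketched in Python as follows. This is a minimal illustration rather than an official client: it assumes the public endpoint at `https://api.crossref.org`, the `query.bibliographic` and `query.author` field queries, and the optional `mailto` parameter used to identify a caller. The helper names (`work_url`, `search_url`, `fetch_message`) are hypothetical, and the contact address is a placeholder.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_BASE = "https://api.crossref.org"  # public REST API endpoint


def work_url(doi, mailto=None):
    """Build the URL for a single record looked up by DOI."""
    # Keep the "/" between DOI prefix and suffix; escape everything else.
    url = f"{API_BASE}/works/{quote(doi, safe='/')}"
    if mailto:
        url += "?" + urlencode({"mailto": mailto})
    return url


def search_url(title=None, author=None, rows=5, mailto=None):
    """Build a search URL that queries by title and/or author."""
    params = {}
    if title:
        params["query.bibliographic"] = title  # free-text bibliographic match
    if author:
        params["query.author"] = author
    params["rows"] = rows  # number of results to return
    if mailto:
        params["mailto"] = mailto
    return f"{API_BASE}/works?" + urlencode(params)


def fetch_message(url, timeout=30):
    """Fetch a URL and return the 'message' part of the JSON envelope.

    Performs a network request; not called at import time.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)["message"]
```

For example, `fetch_message(work_url("10.5555/12345678", mailto="you@example.org"))` would request one record, and `search_url(author="Carberry")` builds a URL for an author search. Building the URL separately from the fetch makes the query parameters easy to inspect before any request is sent.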

Challenges and Developments

Metadata Quality Issues

In 2024, researchers identified a significant case of fabricated metadata in Crossref registrations, where "sneaked references" (extra citations not present in the actual articles) were inserted to artificially inflate citation counts. This manipulation was detected in journals published by Technoscience Academy, with at least 9.8% of recorded references being erroneous across analyzed articles, totaling over 10,000 sneaked entries in 230 publications.[72] The issue arose during the DOI registration process, exploiting Crossref's metadata schema by adding fabricated reference lists that linked back to the same publisher's content.[73]

The primary causes stemmed from inadequate validation during automated metadata submission, which allowed publishers to deposit inconsistent data without sufficient checks against the full-text content. This lack of rigorous verification enabled deliberate alterations, such as inserting hundreds of phantom citations per article, which bypassed traditional peer review and content scrutiny.[72] While automation streamlines registration, it inadvertently facilitated these errors by prioritizing speed over cross-verification with source documents like PDFs or HTML versions.[74]

These metadata inaccuracies have serious effects on scholarly infrastructure, distorting citation networks by creating false interconnections that mislead bibliometric analyses and researcher evaluations. For instance, the sneaked references propagated to downstream datasets like OpenAlex, which relies on Crossref metadata and is widely used to train AI models for tasks such as recommendation systems and knowledge graph construction, potentially introducing biases into machine learning outputs.[75] In one case, affected articles saw citation counts inflated by thousands, compromising the integrity of global scholarly metrics.[72]

In January 2025, researchers reported a new instance of sneaked references in the metadata of articles from the International Journal of Innovative Science and Research Technology (IJISRT), totaling 80,205 fabricated citations, all self-referencing the same journal. This case, detected through methods comparing Crossref metadata against full-text or PDF-extracted references, further highlighted vulnerabilities in metadata integrity and their propagation to services like OpenAlex.[76]

In response, the scholarly community raised alerts through publications and forums, prompting Crossref to investigate and suspend the implicated publisher's registration privileges while collaborating on corrections.[74] Crossref's Data Science team partnered with researchers including Lonni Besançon, Guillaume Cabanac, Cyril Labbé, Alexander Magazinov, Jules di Scala, and Kathryn Weber-Boer to develop automated detection methods, such as comparing metadata references against full-text extractions.[74][76] The organization has since emphasized best practices, including regular metadata updates and adherence to validation guidelines during submission, to prevent recurrence and maintain trust in the persistent identifier system.[38]
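The detection idea mentioned above, comparing the references deposited in metadata against those extracted from the full text, can be illustrated with a short Python sketch. It is a simplification and not the researchers' actual method: it assumes every reference carries a DOI and that matching on normalized DOIs is sufficient, whereas real reference lists also need fuzzy matching for entries without DOIs. The function names and sample DOIs are hypothetical.

```python
import re

# Loose DOI pattern: "10.", a registrant code, "/", then a non-space suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def normalize_doi(raw):
    """Extract a DOI from a string and normalize it, or return None."""
    match = DOI_PATTERN.search(raw)
    if match is None:
        return None
    # DOIs are case-insensitive; strip punctuation that often trails them.
    return match.group(0).rstrip(".,;)").lower()


def _doi_set(references):
    """Normalize an iterable of reference strings into a set of DOIs."""
    return {doi for doi in map(normalize_doi, references) if doi}


def sneaked_references(deposited, extracted):
    """Return DOIs present in deposited metadata but absent from the full text."""
    return sorted(_doi_set(deposited) - _doi_set(extracted))


def sneaked_share(deposited, extracted):
    """Fraction of deposited references that do not appear in the full text."""
    deposited_set = _doi_set(deposited)
    if not deposited_set:
        return 0.0
    return len(deposited_set - _doi_set(extracted)) / len(deposited_set)


if __name__ == "__main__":
    metadata_refs = ["https://doi.org/10.1000/A1", "10.1000/b2", "10.1000/c3"]
    fulltext_refs = ["doi:10.1000/a1."]
    print(sneaked_references(metadata_refs, fulltext_refs))
```

A high share for one article, or a pattern in which the unmatched references all point to the same publisher, would be a signal for manual review rather than proof of manipulation, since extraction errors can also produce mismatches.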

Strategic Initiatives and Future Plans

Crossref's strategic agenda through 2025, originally outlined in a 2021 update extending the 2018-2021 plan, emphasizes six core goals, including bolstering internal team capabilities and adhering to the Principles of Open Scholarly Infrastructure (POSI) for enhanced transparency and accountability.[77] This framework prioritizes open governance through measures like public self-assessments, increased operational transparency, and planned security penetration tests in 2025 with published results.[43] Sustainability efforts focus on financial stability, operational scaling, and environmental responsibility, such as cloud migration from data centers and carbon emission tracking for staff activities.[43] Diversifying the scholarly record involves engaging broader communities, supporting multilingual metadata, and enriching content types like preprints to create a more inclusive ecosystem.[77] By late 2024, progress included revised leadership structures, new hires in technology and programs, and initial cloud transitions, with ongoing API infrastructure upgrades including schema 5.4 development.[14]

Recent initiatives underscore Crossref's commitment to metadata improvement and system reliability. In April 2025, the first Metadata Sprint convened 21 participants and staff in Madrid to collaboratively address short-term metadata issues, resulting in 12 completed projects such as enhancing the Public Data File and integrating Retraction Watch metadata.[78] The metadata schema version 5.4.0, released in March 2025, introduced features like citation type attributes, version numbering support across record types, and preprint status fields, laying groundwork for subsequent updates including full integration of the CRediT taxonomy for contributor roles.[39] Additionally, starting December 1, 2025, revised rate limits for the public and polite pools of the REST API aim to sustain stability amid approximately one billion monthly hits while keeping metadata freely accessible.[79]

Tool evolution reflects efforts to streamline operations. The Metadata Manager, a long-standing registration aid, will retire by the end of 2025 due to maintenance challenges, replaced by a new record registration form developed over four years and launched in September 2024, which now handles around 200 articles daily and supports journal content with expansions planned.[80]

Looking ahead, Crossref's focus centers on enhancing metadata reusability through initiatives like schema advancements for contributor roles and upgraded preprint matching services, while integrating with open science ecosystems via collaborations such as the Open Funder Registry, ROR affiliations, ORCID auto-updates, and Research Nexus for unified data sources including Event Data.[14] These efforts, supported by a new data science team and equitable fee modeling under the Resourcing Crossref for Future Sustainability program, aim to foster interconnectivity and long-term viability in scholarly infrastructure.[43]
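For API clients, staying within the rate limits mentioned above usually means spacing out requests. The Python sketch below shows one way a client might derive a minimum delay from response headers. It assumes headers of the form `X-Rate-Limit-Limit: 50` and `X-Rate-Limit-Interval: 1s`, which the REST API has historically returned; the values shown are illustrative only, the limits in force after December 2025 may differ, and the function name is hypothetical.

```python
import re
import time


def min_delay_seconds(headers):
    """Return the minimum spacing between requests implied by the headers.

    Assumes 'X-Rate-Limit-Limit' holds a request count and
    'X-Rate-Limit-Interval' holds a duration in whole seconds such as '1s'.
    Any other format raises ValueError instead of being guessed at.
    """
    limit = int(headers["X-Rate-Limit-Limit"])
    match = re.fullmatch(r"(\d+)s", headers["X-Rate-Limit-Interval"].strip())
    if match is None or limit <= 0:
        raise ValueError("unrecognized rate limit headers")
    interval = int(match.group(1))
    return interval / limit


def throttled(items, delay):
    """Yield items one at a time, sleeping 'delay' seconds between them."""
    for index, item in enumerate(items):
        if index:
            time.sleep(delay)
        yield item
```

A client could read the headers from its first response, compute the delay once, and then iterate over its list of DOIs with `throttled(dois, delay)` so that requests never exceed the advertised rate.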

Recognition and Impact

Awards and Honors

Crossref has received multiple awards recognizing its role in advancing scholarly infrastructure and publishing practices. In 2008, the partnership between Crossref and iParadigms was awarded the ALPSP Award for Publishing Innovation for CrossCheck, a service designed to detect plagiarism by screening manuscripts against published content.[81] The following year, Crossref earned the Council of Science Editors (CSE) Award for Meritorious Achievement, presented at the CSE annual meeting in Pittsburgh for its contributions to scientific editing and publishing standards.[82] In 2012, Crossref was honored with the ALPSP Award for Contribution to Scholarly Publishing, acknowledging its decade-long impact on enabling reliable access to and interoperability of scholarly research.[83] As part of its 25th anniversary celebrations in 2025, Crossref introduced the inaugural Metadata Awards to highlight exemplary metadata practices among its members. Six winners, including GigaScience Press for completeness and enrichment in small-publisher metadata, were announced on May 7, 2025.[84]

Contributions to Scholarly Communication

Crossref has significantly advanced scholarly communication by enabling persistent global linking of research outputs through its DOI-based infrastructure. As of November 20, 2025, Crossref supports over 176 million metadata records, including journal articles, books, datasets, and more, registered by more than 23,000 members from 160 countries.[4] This vast network facilitates seamless interconnections across diverse content types, promoting open access by embedding license information in metadata, which allows tools like Unpaywall to identify freely available versions of publications.[85] By standardizing identifiers and metadata schemas, Crossref enhances content reuse, enabling researchers to cite, link, and build upon works reliably without facing link rot or access barriers.[1]

The organization's metadata has profoundly impacted research practices by improving citation accuracy, funding traceability, and alternative metrics. Crossref's reference linking service ensures precise connections between citing and cited works, reducing errors in bibliometric analyses and supporting studies on topics like gender disparities in citations.[85] Through the Open Funder Registry, it collects funding acknowledgments in over 25% of records as of 2023, allowing funders to trace the outputs of their investments and fostering transparency in resource allocation.[86] Additionally, Crossref Event Data aggregates non-traditional interactions, such as social media mentions and policy citations, serving as foundational infrastructure for altmetrics platforms to measure broader societal impact beyond traditional citations.[87] Integrations with ORCID and DataCite further amplify these effects; for instance, the Auto-Update service automatically syncs new publications to researchers' ORCID profiles, streamlining identity management and credit attribution across scholarly ecosystems.[88]

As a not-for-profit membership organization, Crossref promotes equitable access to essential infrastructure, deliberately countering proprietary silos in scholarly publishing. Governed by its community of publishers, funders, and researchers, it reinvests all surpluses into metadata improvements and open tools, adhering to principles like universal participation to ensure no entity dominates the linking ecosystem.[1] This model democratizes the tools needed for discoverability and interoperability, benefiting smaller publishers and Global South institutions by providing the same high-quality services as larger players without profit-driven restrictions.[85]

Crossref's long-term legacy positions it as the backbone of the scholarly record, increasingly influencing AI and data-driven science. Its open, license-free metadata serves as a critical dataset for training machine learning models in text mining, entity resolution, and recommendation systems, enabling advancements in automated literature reviews and knowledge graph construction.[85] By maintaining a persistent, interconnected archive, Crossref underpins data-intensive research paradigms, where AI tools can leverage its links to datasets via DataCite collaborations, ultimately accelerating discovery in fields from biomedicine to climate science.[89]

References
