TermBase eXchange
from Wikipedia
TermBase eXchange
Filename extension: .tbx
Internet media type: application/x-tbx [1]
UTI conformation: public.xml
Developed by: Localization Industry Standards Association
Initial release: 2002?
Latest release: April 2019 (2019-04) [2]
Type of format: Terminology
Extended from: XML
Standard: ISO 30042
Open format?: yes
Website: https://www.gala-global.org/sites/default/files/migrated-pages/docs/tbx_oscar_0.pdf

TermBase eXchange (TBX) is an international standard (ISO 30042:2019) for the representation of structured concept-oriented terminological data, copublished by ISO and the Localization Industry Standards Association (LISA).[3][4][5] Originally released in 2002 by LISA's OSCAR special interest group, TBX was adopted by ISO TC 37 in 2008. In 2019, ISO 30042:2008 was withdrawn and replaced by ISO 30042:2019. TBX is currently available both as an ISO standard and as an open industry standard at no charge.[4][5]

TBX defines an XML format for the exchange of terminology data, and is "an industry standard for terminology exchange".[6]

See also

  • IATE (“Inter-Active Terminology for Europe”) is the EU's inter-institutional terminology database used in the EU institutions and agencies since summer 2004 for the collection, dissemination and shared management of EU-specific terminology. The IATE multilingual databases can be downloaded in a zipped format, then multilanguage glossaries in TBX format can be generated using a free tool.
  • OpenTMS (Open Source Translation Management System)
  • XLIFF (XML Localisation Interchange File Format): an XML-based format created to standardize the way localizable data are passed between tools during a localization process and a common format for CAT tool files.

from Grokipedia
TermBase eXchange (TBX) is an international standard for the representation, archiving, and exchange of structured terminological data from terminology databases, known as termbases, in an XML-based format. It facilitates interoperability among terminology management tools, enabling the sharing of lexical and conceptual information such as terms, definitions, and equivalents across languages, primarily in fields like translation and localization.

Developed to address the need for a vendor-neutral format in the language industry, TBX traces its origins to earlier efforts on terminology exchange formats, evolving through the Text Encoding Initiative (TEI) and early work by key contributors like Alan K. Melby. The standard was first published in 2002 by the Localization Industry Standards Association (LISA) through its OSCAR special interest group, providing an initial framework for terminological interchange. In 2008, it was adopted by the International Organization for Standardization (ISO) Technical Committee 37 (ISO TC 37) as ISO 30042:2008, marking its formal recognition as a global standard.

The standard was later revised, resulting in ISO 30042:2019 (TBX Version 3), which improved modularity, validation through schemas like RNG and XSD, and support for dialects such as TBX-Basic for simplified data exchange. This version, led by project contributors including Hanne Smaadahl and Arle Lommel, emphasizes concept-oriented structures while maintaining backward compatibility where possible. Currently, TBX is maintained by the TBX Council in liaison with ISO TC 37 and organizations like the Fédération Internationale des Traducteurs (FIT), with an ongoing five-year review to ensure relevance in evolving digital environments.

History

Origins in LISA

TermBase eXchange (TBX) originated in 2002 within the Localization Industry Standards Association (LISA), specifically through its OSCAR special interest group, whose name stood for Open Standards for Container/Content Allowing Re-use. This group focused on creating XML-based standards to support automated language processing across globalization, internationalization, localization, and translation processes. TBX emerged as a dedicated format for the representation and exchange of terminological data, marking an early effort to unify data handling in the growing field of multilingual content management.

The core purpose of TBX was to standardize termbase interchange within localization workflows, tackling the fragmentation caused by proprietary formats in dominant tools of the era, such as SDL Trados. These systems often used incompatible structures for storing and sharing terminology, leading to inefficiencies in collaborative projects. By providing a neutral, XML-based interchange format, TBX aimed to promote interoperability, allowing terminological resources to be shared across diverse software environments without loss of structure or meaning.

Development involved key contributors from LISA's membership, including academics like Sue Ellen Wright and professionals from translation memory and terminology software vendors, who collaborated to define the foundational framework. The initial release featured a basic structure centered on essential elements such as terms, language equivalents, and definitions, establishing a flexible yet structured approach to terminological data exchange prior to any formal international standardization.

Adoption by ISO

The adoption of TermBase eXchange (TBX) by the International Organization for Standardization (ISO) marked a pivotal transition from an industry-driven initiative to a globally recognized standard. In 2007, the Localization Industry Standards Association (LISA) submitted the TBX specification, developed by its OSCAR special interest group, to ISO Technical Committee 37 (TC 37), Terminology and other language and content resources, Subcommittee 3 (SC 3), Management of terminology. This submission used a fast-track procedure, leading to the formal adoption and publication of ISO 30042:2008 in December 2008, which defined TBX as an XML-based framework for the interchange of terminological data. The standard was co-published by ISO and LISA, ensuring continuity while elevating TBX to an international benchmark for terminology management systems.

A key aspect of this adoption was TBX's alignment with established ISO terminology standards, particularly ISO 12620, which specifies data categories for language resources. ISO 30042:2008 required that all TBX data categories be drawn from the ISO 12620 registry, promoting interoperability and consistency across terminological databases used in translation and content creation. This integration facilitated the modular representation of terminological elements, allowing TBX to support diverse processes such as term extraction, concept modeling, and data exchange without prescribing a single rigid structure.

Following LISA's insolvency in March 2011, its OSCAR standards portfolio, including TBX, was transferred to ISO TC 37 for ongoing maintenance and development. This handover solidified ISO's sole custodianship, ending LISA's formal role and ensuring the standard's evolution under international governance. Early post-adoption efforts by ISO emphasized TBX's modularity to accommodate varying terminological needs, such as specialized dialects for translation workflows or broader knowledge organization tasks. This focus enhanced TBX's adaptability, positioning it as a flexible tool for global standardization in terminology resources.

Technical Specifications

Core Framework

TermBase eXchange (TBX) is an Extensible Markup Language (XML) format designed for the interchange of terminological data, adhering to the ISO 30042 standard for structured representation of terms and related linguistic information. This framework enables the exchange of terminology resources across diverse software systems, ensuring compatibility in fields such as translation and localization. At its foundation, TBX uses XML to define a modular structure that separates core structural elements from customizable data categories, allowing users to tailor the format to specific needs without altering the underlying schema. TBX supports two styles: Data Category as Attribute (DCA), where categories are attributes on generic elements like <descrip>, and Data Category as Tag (DCT), which uses specific element names like <definition>.

A TBX document begins with the root element <tbx>, which evolved from the root of the earlier MARTIF (Machine-Readable Terminology Interchange Format) and encapsulates the entire terminological database. Within this root, individual concepts are represented by <conceptEntry> elements, each containing nested components such as <langSec> for language-specific sections and <termSec> for term details. This structure supports multilingual entries through attributes like xml:lang on <langSec>, which allows equivalent terms to be recorded across languages, along with definitions via <descrip type="definition">, notes in <termNote> elements (e.g., for grammatical information), and administrative metadata in <admin> elements for tracking origins, status, and revision history.

Key principles of the TBX core framework emphasize modularity through dialects, such as the simplified TBX-Basic or more complex private dialects, which extend the mandatory TBX-Core structure with optional modules for enhanced functionality. These dialects maintain backward compatibility while allowing customization of data categories, ensuring that essential elements like terms, definitions, and notes remain consistently represented. Furthermore, TBX works with standard XML technologies, including XML Schema Definition (XSD) files for validation, which enforce structural integrity and data constraints across implementations. This compatibility enables automated processing, parsing, and verification of terminological data in standard XML environments.
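The entry structure described above (a <conceptEntry> holding one <langSec> per language, each with <termSec> and <term>) can be read with any standard XML parser. The following is a minimal sketch using Python's standard library. The sample document is illustrative only and omits the header that a complete TBX file would carry, and the namespace URI is an assumption about how TBX 3.0 files are commonly declared rather than something stated in the text above.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for TBX 3.0 documents; adjust if your files declare another.
TBX_NS = {"tbx": "urn:iso:std:iso:30042:ed-2"}
# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative DCA-style fragment (header omitted, so not a complete TBX file).
SAMPLE_TBX = """<tbx xmlns="urn:iso:std:iso:30042:ed-2" type="TBX-Basic" style="dca" xml:lang="en">
  <text><body>
    <conceptEntry id="c1">
      <descrip type="subjectField">software</descrip>
      <langSec xml:lang="en">
        <termSec>
          <term>install</term>
          <termNote type="partOfSpeech">verb</termNote>
        </termSec>
      </langSec>
      <langSec xml:lang="fr">
        <termSec><term>installer</term></termSec>
      </langSec>
    </conceptEntry>
  </body></text>
</tbx>"""


def terms_by_language(tbx_xml):
    """Map each concept entry id to {language code: [terms]}."""
    root = ET.fromstring(tbx_xml)
    result = {}
    for entry in root.iterfind(".//tbx:conceptEntry", TBX_NS):
        langs = {}
        for lang_sec in entry.iterfind("tbx:langSec", TBX_NS):
            lang = lang_sec.get(XML_LANG)
            langs[lang] = [t.text for t in lang_sec.iterfind("tbx:termSec/tbx:term", TBX_NS)]
        result[entry.get("id")] = langs
    return result


if __name__ == "__main__":
    print(terms_by_language(SAMPLE_TBX))
```

Because the concept, not the term, is the unit of organization, the function groups terms under the entry id first and the language second.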

Data Categories and Modules

The TermBase eXchange (TBX) standard incorporates data categories defined in ISO 12620, which provides a registry of standardized attributes for terminological resources, enabling consistent representation of elements such as terms, definitions, and notes. These categories are expressed through core elements like <term> for a concept's designation, <descrip type="definition"> for explanatory text, <note> for general annotations, <termNote> for notes specific to a term's usage or status, and <adminNote> for administrative metadata such as entry modification history. By drawing from this inventory, TBX ensures consistency across terminological databases while allowing for precise categorization of linguistic and conceptual data.

TBX organizes these data categories into modules and dialects, which serve as customizable building blocks for handling diverse terminological needs. Modules are predefined subsets that specify permissible categories and their constraints. They are either public (endorsed for general use, as in TBX-Basic for simple exchanges) or private (user-defined for specialized applications such as complex ontologies). Dialects, in turn, combine one or more modules to create tailored profiles, with public examples like TBX-Basic extending the minimal TBX-Core to include categories such as /definition/ and /subjectField/, which allows extensions without altering the core structure. This modular approach supports user-defined extensions, ensuring flexibility for domain-specific terminology while maintaining compatibility.

Flexibility in TBX is achieved through attribute-value pairs applied to elements, allowing nuanced specifications such as <termNote type="termType"> to mark a term as an abbreviation or acronym, or xml:lang="fr" to denote the language of a <langSec>. These pairs, used in Data Category as Attribute (DCA) style, enable concise encoding of metadata like term types or notes, promoting efficient data exchange.

To ensure interoperability, TBX documents must validate against dialect-specific schemas, typically using RELAX NG (RNG) for structural constraints, supplemented by XSD for datatype validation and Schematron for additional rules like cardinality or value restrictions. This validation framework verifies compliance with the selected modules and prevents errors in terminological exchanges.
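The idea that a dialect permits only certain data categories can be illustrated without a full schema. The sketch below is not RELAX NG or Schematron validation. It only checks that the type attribute on a few DCA elements belongs to an allowed set. The ALLOWED_TYPES table is a hypothetical, heavily trimmed stand-in for a real module definition, and the sample document omits the header that a complete TBX file would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down list of permitted type values per DCA element.
# A real dialect defines these in its modules and schemas, not in a dict.
ALLOWED_TYPES = {
    "descrip": {"definition", "subjectField", "context"},
    "termNote": {"partOfSpeech", "termType", "grammaticalGender"},
    "admin": {"source"},
}

# Illustrative fragment containing one category outside the allowed set.
SAMPLE_TBX = """<tbx xmlns="urn:iso:std:iso:30042:ed-2" type="TBX-Basic" style="dca" xml:lang="en">
  <text><body>
    <conceptEntry id="c1">
      <descrip type="definition">To set up software for use.</descrip>
      <langSec xml:lang="en">
        <termSec>
          <term>install</term>
          <termNote type="partOfSpeech">verb</termNote>
          <termNote type="register">neutral</termNote>
        </termSec>
      </langSec>
    </conceptEntry>
  </body></text>
</tbx>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def unknown_categories(tbx_xml, allowed=ALLOWED_TYPES):
    """Return (element, type) pairs whose type is not in the allowed table."""
    root = ET.fromstring(tbx_xml)
    problems = []
    for el in root.iter():
        name = local_name(el.tag)
        if name in allowed and el.get("type") not in allowed[name]:
            problems.append((name, el.get("type")))
    return problems


if __name__ == "__main__":
    print(unknown_categories(SAMPLE_TBX))
```

In practice this kind of check is what the dialect's RNG and Schematron files express declaratively. A hand-written table like the one above is useful mainly for quick triage before running real validation.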

Versions and Evolution

TBX 2.0 (ISO 30042:2008)

ISO 30042:2008, published in December 2008, established the first international standard for TermBase eXchange (TBX), adopting the framework previously developed by the Localization Industry Standards Association (LISA) as TBX version 2.0 in 2002 with minor enhancements for standardization. This XML-based specification provided a structured format for interchanging terminological data, primarily aimed at supporting translation and authoring processes in computer-based environments.

The core features of TBX 2.0 centered on a modular architecture consisting of a core structure module and an eXtensible Constraint Specification (XCS) module, enabling customization through subsets or supersets of default data categories defined in accordance with ISO 12620. It supported hierarchical term entries via the <termEntry> element, which encapsulated conceptual information and included multiple <langSet> elements for multilingual equivalents, each containing <tig> (term information group) or <ntig> (nested term information group) elements for terms and related attributes. Administrative metadata was integrated through elements like <admin>, <transac>, and <date>, allowing terminological databases to track creation dates, ownership, and transaction notes.

Despite its advancements, TBX 2.0 had limitations. Its structure could represent many termbase formats without enforcing a single compatible schema, which often led to interoperability challenges among implementations. It offered limited support for ontologies, focusing instead on basic terminological interchange. Additionally, its reliance on separate XCS files for declaring variants and on ISO 12620 for data category specifications created dependencies that could confuse implementers who did not follow them fully.

After its publication, TBX 2.0 saw widespread adoption within the localization industry for exchanging terminology data between computer-aided translation (CAT) tools and translation memory systems, supporting the integration of termbases in multilingual projects.

TBX 3.0 (ISO 30042:2019)

TBX 3.0, formalized as ISO 30042:2019 (the second edition of the standard), was published on April 4, 2019, by the International Organization for Standardization (ISO), superseding and withdrawing the 2008 edition (ISO 30042:2008). This revision enhances support for complex terminologies by introducing a more flexible, modular framework that accommodates domain-specific needs while maintaining backward compatibility through defined migration paths.

Major changes in TBX 3.0 emphasize improved modularity, enabling the creation of industry-defined dialects, such as TBX-Core, TBX-Min, and TBX-Basic, that build progressively as supersets and avoid overlapping modules. These dialects replace the previous XCS formalism with a simpler @type attribute on the root element, allowing tailored category selections for specific communities while ensuring a non-negotiable core structure. The revision also improves the handling of concept systems and relations through the metamodel and through data categories that support hierarchical and associative links between concepts. Additionally, the standard aligns with schema documentation practices via support for ODD (One Document Does it all), a language-neutral approach that separates the normative core from ancillary content and replaces DTDs with flexible schemas like RNG or XSD.

Key additions include the DCT (Data Category as Tag) style alongside the existing DCA (Data Category as Attribute) style, selectable via the @style attribute on the root element. The DCT style introduces XML namespaces for declaring data categories and improves compatibility. The inline markup has been updated to comply with XLIFF 2.0, enhancing interoperability with localization standards by standardizing inline elements and their attributes and allowing richer term-level annotations. Structural refinements, such as renaming <termEntry> to <conceptEntry> and <langSet> to <langSec>, further promote concept-oriented representation and modern XML best practices, including explicit directionality support.

As of 2025, TBX 3.0 is undergoing its mandatory five-year review process, with opportunities for community feedback coordinated through ISO technical committees and resources like the TBX Info website. This review aims to address evolving needs in terminology management while preserving the standard's core principles.
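The element renaming between the two editions can be pictured as a simple tag mapping. The sketch below renames the TBX 2.0 elements named in this article (<termEntry>, <langSet>, <tig>) to the TBX 3.0 names used in the Core Framework section (<conceptEntry>, <langSec>, <termSec>). The mappings for martif and martifHeader are assumptions not stated in the text. A real migration also has to handle <ntig>, XCS declarations, namespaces, and the @type and @style attributes, none of which is attempted here.

```python
import xml.etree.ElementTree as ET

# Partial, illustrative mapping of TBX 2.0 element names to TBX 3.0 names.
# The martif/martifHeader rows are assumptions for this sketch.
RENAMES = {
    "martif": "tbx",
    "martifHeader": "tbxHeader",
    "termEntry": "conceptEntry",
    "langSet": "langSec",
    "tig": "termSec",
}

# Illustrative TBX 2.0 style fragment without namespaces or header.
SAMPLE_V2 = """<martif type="TBX" xml:lang="en">
  <text><body>
    <termEntry id="c1">
      <langSet xml:lang="en">
        <tig><term>install</term></tig>
      </langSet>
    </termEntry>
  </body></text>
</martif>"""


def rename_v2_elements(xml_text):
    """Return the parsed tree with known TBX 2.0 tags renamed in place."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        # Only tag names change; attributes, text, and nesting are untouched.
        el.tag = RENAMES.get(el.tag, el.tag)
    return root


if __name__ == "__main__":
    print(ET.tostring(rename_v2_elements(SAMPLE_V2), encoding="unicode"))
```

The output of such a rename is not a valid TBX 3.0 file on its own. It only shows how closely the entry hierarchy of the two editions corresponds.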

Applications and Usage

Terminology Management in Translation

TermBase eXchange (TBX) serves as a standardized XML-based format for exchanging structured terminological data, enabling import and export of term lists between computer-assisted translation (CAT) tools to maintain consistency across multilingual projects. In such tools, term bases can be exported in TBX format for compatibility with various translation environments, so that resources can be shared without format-specific barriers. This ensures that translators working on the same project can access unified term repositories, regardless of their preferred software.

The adoption of TBX in translation workflows helps reduce localization errors by standardizing terms, their equivalents, and associated context notes among distributed teams. By enforcing terminological consistency, TBX minimizes the inconsistencies that arise from ad-hoc translations, improving technical accuracy and overall quality in multilingual outputs. For example, context notes embedded in TBX files give translators guidance on usage, which helps prevent mistranslations and promotes adherence to project-specific glossaries.

In translation workflows, TBX facilitates the conversion of proprietary termbase formats into a neutral, exchangeable structure, supporting efficient sharing within global supply chains and aligning with requirements for terminology management in professional services. This process allows agencies and freelancers to integrate terminology from vendor-specific tools into collaborative platforms, ensuring that resources like style guides and approved terms are accessible throughout the project lifecycle without lock-in. TBX's modular XML structure further aids this integration by permitting selective data exchange tailored to project needs.

A practical case of TBX application is software localization, where it manages user interface (UI) terms across languages to preserve brand integrity and functionality. For instance, some software vendors provide their official terminology collections in TBX format, enabling localization teams to translate UI elements like buttons and menus consistently while incorporating language-specific adaptations and notes. Similarly, Mozilla's Pontoon platform uses TBX files to handle UI terminology, ensuring that terms such as "install" are rendered accurately in target languages with contextual details for software-specific usage. This approach streamlines the localization of dynamic UI components, reducing revision cycles and errors in end-user experiences.
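The consistency benefit described in this section can be made concrete with a small check. The sketch below assumes a bilingual glossary has already been extracted from a TBX file into a plain dictionary. The glossary contents and the function name are hypothetical. It flags segments where a source term appears but the approved target term does not. Matching is deliberately naive (case-insensitive substring search, no handling of inflected forms), so it would over-report in morphologically rich languages.

```python
import re


def find_term_violations(source, target, glossary):
    """Return (source_term, expected_target_term) pairs that look inconsistent.

    A pair is reported when the source term occurs as a whole word in the
    source segment but the approved target term is absent from the target.
    """
    violations = []
    for src_term, tgt_term in glossary.items():
        in_source = re.search(rf"\b{re.escape(src_term)}\b", source, re.IGNORECASE)
        if in_source and tgt_term.lower() not in target.lower():
            violations.append((src_term, tgt_term))
    return violations


if __name__ == "__main__":
    # Hypothetical English-to-French glossary, as might be read from a TBX file.
    glossary = {"install": "installer", "button": "bouton"}
    source = "Click the button to install the update."
    target = "Cliquez sur le bouton pour mettre à jour."
    print(find_term_violations(source, target, glossary))
```

Production quality-assurance tools typically add stemming or lemmatization and respect term status fields, which this sketch leaves out.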

Integration with Localization Tools

TBX integrates with a range of localization tools by providing a standardized XML-based format for exchanging terminological data, enabling direct import into termbases without loss of structure. For instance, SDL Trados Studio supports importing TBX files into its MultiTerm termbases using built-in conversion tools, allowing users to populate terminology resources from external glossaries efficiently. Similarly, ApSIC Xbench, a quality assurance and terminology management tool, permits adding TBX files to projects for bilingual reference searches and validation during localization workflows. Other computer-assisted translation (CAT) platforms also handle TBX import and export, ensuring compatibility across multilingual projects.

Automation of TBX in localization pipelines is achieved through converters and APIs that transform terminological data for broader use, such as converting TBX to TMX format to integrate terms into translation memories. Tools like TTMEM's Convert and Go utility enable batch conversion of TBX files to TMX, supporting automated workflows in software localization where terminology must align with existing translation assets. The Goldpan TMX/TBX Editor further aids this by allowing creation, editing, and conversion of TBX files with up to eight languages, which helps integration into CI/CD (continuous integration/continuous deployment) pipelines for agile development. In CI/CD environments, these converters ensure that terminology updates propagate automatically, as seen in platforms like RWS that incorporate standards like TBX for continuous localization in software releases. As of 2025, tools such as the TBX Exporter for Excel add-in simplify the creation of TBX files from spreadsheets, making the format more accessible to non-specialist users.

TBX extends its utility through compatibility with complementary standards like SRX (Segmentation Rules eXchange) and XLIFF (XML Localization Interchange File Format), enhancing precision in text processing and segment-level terminology application. When paired with SRX, TBX supports consistent segmentation rules across tools, allowing terminological data to inform how source text is divided for translation, as implemented in systems like XTM Cloud that handle both formats for workflow optimization. With XLIFF, TBX enables embedding or referencing terminology at the segment level, where notes and attributes from TBX (inspired by XLIFF structures) maintain data integrity during file exchanges in localization tools. This integration is evident in frameworks like Okapi, which process TBX alongside XLIFF for end-to-end translation pipelines.

A key challenge in localization ecosystems is bridging proprietary formats from diverse tools, which TBX addresses as a vendor-neutral intermediary for lossless data transfer. By serving as a common format, TBX allows terminology from closed systems, such as SDL MultiTerm, to be exported, shared, and re-imported without proprietary dependencies, promoting interoperability in the translation and localization industry. This neutrality reduces conversion errors and supports scalable exchanges, as outlined in ISO 30042, ensuring TBX's role in handling complex, multi-vendor workflows.
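The TBX-to-TMX conversion mentioned above can be sketched in a few lines. This is not how any of the named tools implement it. It is a lossy illustration that takes the first term of each language section and writes one TMX translation unit per concept entry. The TBX namespace URI, the sample data, and the header attribute values (for example the creationtool name) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for TBX 3.0 documents.
TBX_NS = {"tbx": "urn:iso:std:iso:30042:ed-2"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative TBX fragment (header omitted).
SAMPLE_TBX = """<tbx xmlns="urn:iso:std:iso:30042:ed-2" type="TBX-Basic" style="dca" xml:lang="en">
  <text><body>
    <conceptEntry id="c1">
      <langSec xml:lang="en"><termSec><term>install</term></termSec></langSec>
      <langSec xml:lang="fr"><termSec><term>installer</term></termSec></langSec>
    </conceptEntry>
    <conceptEntry id="c2">
      <langSec xml:lang="en"><termSec><term>toolbar</term></termSec></langSec>
    </conceptEntry>
  </body></text>
</tbx>"""


def tbx_to_tmx(tbx_xml, srclang="en"):
    """Build a TMX 1.4 tree with one <tu> per multilingual concept entry."""
    root = ET.fromstring(tbx_xml)
    tmx = ET.Element("tmx", version="1.4")
    # Placeholder header values chosen for this sketch.
    ET.SubElement(tmx, "header", {
        "creationtool": "tbx-sketch", "creationtoolversion": "0.1",
        "segtype": "phrase", "o-tmf": "TBX", "adminlang": "en",
        "srclang": srclang, "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for entry in root.iterfind(".//tbx:conceptEntry", TBX_NS):
        tu = ET.Element("tu")
        for lang_sec in entry.iterfind("tbx:langSec", TBX_NS):
            term = lang_sec.find("tbx:termSec/tbx:term", TBX_NS)
            if term is None or not term.text:
                continue
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang_sec.get(XML_LANG, "und")})
            ET.SubElement(tuv, "seg").text = term.text
        # Entries with a term in only one language have nothing to pair up.
        if len(tu) >= 2:
            body.append(tu)
    return tmx


if __name__ == "__main__":
    print(ET.tostring(tbx_to_tmx(SAMPLE_TBX), encoding="unicode"))
```

Definitions, notes, and synonyms are discarded in this conversion, which is why TBX rather than TMX is normally kept as the master format for terminology.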

Adoption and Community

Industry Implementation

TBX has achieved significant adoption within European Union institutions, particularly through the Inter-Active Terminology for Europe (IATE) database, which has offered exports of terminological data in TBX format compliant with ISO 30042:2019 since at least 2021. This integration supports multilingual drafting and legal-linguistic consistency across EU agencies. Similarly, some multinational corporations have incorporated TBX into their workflows, offering product-specific terminology collections for download in .tbx format to support standardized exchange and integration with translation tools.

In specialized sectors, TBX supports the management of controlled vocabularies where precision is essential. In the medical field, it has been applied to represent multilingual extensions of clinical terminology systems, using an adapted TBX framework to structure and exchange clinical terminology for translation purposes. The legal sector benefits from TBX's ability to handle concept-oriented data in controlled environments, supporting consistent terminology in multilingual legal documents, though specific implementations often align with broader translation standards.

According to the TBX Council's registry of implementations (last updated 2021), over 30 tools claim support for TBX, including more than 20 commercial applications such as SDL Trados. This tool compatibility underscores TBX's role as an industry standard for terminological interchange.

A primary barrier to broader adoption has been the initial learning curve associated with TBX's XML structure, which requires familiarity with markup for effective use. This challenge is increasingly mitigated by converters, validators, and simplified subsets like TBX-Basic, which reduce complexity while maintaining ISO compliance.

Resources and Support

Official documents for TermBase eXchange (TBX) are primarily governed by the ISO 30042:2019 standard, which defines the framework for representing structured terminological data, including the metamodel, data categories, and XML encoding styles such as Data Category as Attribute (DCA) and Data Category as Tag (DCT). The full text of ISO 30042:2019 can be purchased directly from the ISO website, providing detailed specifications for terminology management and exchange. Complementary specifications and guidelines are hosted on tbxinfo.net, a dedicated resource site maintained by LTAC Global, offering downloadable documents on TBX dialects, modules, and implementation best practices.

For validation of TBX files, schemas are available through tbxinfo.net, enabling users to check compliance with the standard's core structure and dialect-specific requirements. These schemas, often integrated with Schematron rules for additional constraints, support automated verification in XML editors before files are exchanged. The site also provides an online TBX validation tool for quick testing without local software installation.

Community hubs for TBX development and support include tbxinfo.net, which serves as a central repository for public dialects, modules, and developer resources, fostering collaboration among terminologists and software developers. Open-source tools, such as converters for transforming TBX to other formats, are publicly available; for example, the Csh-MultiTerm-TBX-Converter supports bidirectional exchange between SDL MultiTerm and TBX formats using mappings. Another repository, tbx-conversion, provides Python scripts for converting between TBX and formats like NTRF, aiding integration with terminology databases.

Training resources for learning and implementing TBX are offered through LTAC Global, including tutorials on creating dialects and using public modules, accessible via the tbxinfo.net help pages. These materials cover practical topics like schema generation for DCA and DCT styles, with examples for the TBX-Basic dialect. A starter guide, TBXStarterGuide.docx, provides step-by-step instructions for building initial TBX files, including validation against integrated schemas and Schematron rules. While webinars on TBX dialects are not frequently scheduled, LTAC Global contributes to industry events and online resources that address dialect customization and terminology workflows.

Ongoing updates to TBX are managed through ISO Technical Committee 37 (ISO/TC 37), which oversees terminology standards, including a five-year systematic review of ISO 30042:2019 initiated in 2024 and ongoing as of 2025. Feedback for the 2025 review can be submitted by contacting the review coordinator at [email protected], as announced on tbxinfo.net, allowing stakeholders to propose enhancements to dialects and validation mechanisms. This process ensures TBX remains aligned with evolving needs in terminology management and localization.

References

  1. https://learn.microsoft.com/en-us/globalization/localization/localization-file-formats