Database abstraction layer
A database abstraction layer (DBAL[1] or DAL) is an application programming interface which unifies the communication between a computer application and databases such as SQL Server, IBM Db2, MySQL, PostgreSQL, Oracle or SQLite. Traditionally, all database vendors provide their own interface that is tailored to their products. It is up to the application programmer to implement code for the database interfaces that will be supported by the application. Database abstraction layers reduce the amount of work by providing a consistent API to the developer and hide the database specifics behind this interface as much as possible. There exist many abstraction layers with different interfaces in numerous programming languages. If an application has such a layer built in, it is called database-agnostic.[2]
Database levels of abstraction
[edit]Physical level (lowest level)
The lowest level connects to the database and performs the actual operations required by the users. At this level the conceptual instruction has been translated into multiple instructions that the database understands. Executing the instructions in the correct order allows the DAL to perform the conceptual instruction.
Implementation of the physical layer may use database-specific APIs, or the host language's standard database-access technology together with the database's dialect of SQL.
Data types and operations are at their most database-specific at this level.
Conceptual or logical level (middle or next highest level)
The conceptual level consolidates external concepts and instructions into an intermediate data structure that can be decomposed into physical instructions. This layer is the most complex, as it spans the external and physical levels. Additionally, it must span all the supported databases and their quirks, APIs, and problems.
This level is aware of the differences between the databases and is able to construct an execution path of operations in all cases. However, the conceptual layer defers to the physical layer for the actual implementation of each individual operation.
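A minimal sketch of this split, with all class and function names invented for illustration: the conceptual layer describes an operation ("first N rows of a table") in database-agnostic terms and defers to dialect-specific rendering, standing in for the physical level, to produce SQL each backend actually understands.

```python
# Hypothetical sketch: conceptual-level query object deferring
# dialect-specific SQL rendering to physical-level helpers.

def limit_clause_sqlite(n):
    # SQLite and PostgreSQL express row limits with LIMIT
    return f"LIMIT {n}"

def limit_clause_sqlserver(n):
    # Classic SQL Server syntax uses TOP inside the SELECT list
    return f"TOP {n}"

class ConceptualQuery:
    """Database-agnostic description of 'first N rows of a table'."""
    def __init__(self, table, n):
        self.table, self.n = table, n

    def to_sql(self, dialect):
        # The conceptual layer knows *that* backends differ and picks
        # the execution path; the helpers supply the actual syntax.
        if dialect == "sqlserver":
            return f"SELECT {limit_clause_sqlserver(self.n)} * FROM {self.table}"
        return f"SELECT * FROM {self.table} {limit_clause_sqlite(self.n)}"

q = ConceptualQuery("users", 10)
print(q.to_sql("sqlite"))     # SELECT * FROM users LIMIT 10
print(q.to_sql("sqlserver"))  # SELECT TOP 10 * FROM users
```

Real DALs handle far more than row limits (types, escaping, joins), but the division of labor is the same: one abstract instruction, many backend-specific translations.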
External or view level
The external level is exposed to users and developers and supplies a consistent pattern for performing database operations.[3] At this level, database operations are only loosely represented as SQL, or even as database access at all.
Every database should be treated equally at this level with no apparent difference despite varying physical data types and operations.
Database abstraction in the API
Libraries unify access to databases by providing a single low-level programming interface to the application developer. Their advantages are most often speed and flexibility, because they are not tied to a specific query language (subset) and only have to implement a thin layer to reach their goal. As all SQL dialects are similar to one another, application developers can use all the language features, possibly providing configurable elements for database-specific cases, typically user IDs and credentials. A thin layer allows the same queries and statements to run on a variety of database products with negligible overhead.
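Python's DB-API 2.0 (PEP 249) is a real-world example of such a thin layer: every conforming driver (sqlite3, psycopg2, and so on) exposes the same connect/cursor/execute surface. In the sketch below only the connection factory is backend-specific; the query code would be unchanged for another DB-API driver. The `open_connection` helper is invented for illustration.

```python
import sqlite3

def open_connection(backend="sqlite"):
    # Only this step is backend-specific; for PostgreSQL one might
    # return psycopg2.connect(...) here instead (hypothetical branch).
    if backend == "sqlite":
        return sqlite3.connect(":memory:")
    raise NotImplementedError(backend)

conn = open_connection()
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER, name TEXT)")
# Parameter placeholder style ('?' vs '%s') is one of the few
# per-driver differences the DB-API leaves visible.
cur.execute("INSERT INTO users VALUES (?, ?)", (1, "Ada"))
cur.execute("SELECT name FROM users WHERE id = ?", (1,))
name = cur.fetchone()[0]
print(name)  # Ada
conn.close()
```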
Database abstraction layers are also popular in object-oriented programming languages, where they resemble API-level abstraction layers. In an object-oriented language like C++ or Java, a database can be represented through an object whose methods and members (or the equivalent thereof in other programming languages) represent various functionalities of the database. Such interfaces share the advantages and disadvantages of API-level interfaces.
Language-level abstraction
An example of a database abstraction layer on the language level is ODBC, a platform-independent implementation of a database abstraction layer. The user installs specific driver software, through which ODBC can communicate with a database or set of databases. Programs can then communicate with ODBC, which relays the results back and forth between them and the database. The downside of this abstraction level is the increased overhead of transforming statements into constructs understood by the target database.
Alternatively, there are thin wrappers, often described as lightweight abstraction layers, such as OpenDBX[4] and libzdb.[5] Finally, large projects may develop their own libraries, such as, for example, libgda[6] for GNOME.
Arguments
[edit]In favor
- Development period: software developers only have to know the database abstraction layer's API instead of all the APIs of the databases their application should support. The more databases the application must support, the greater the time saved.
- Wider potential install-base: using a database abstraction layer means that there is no requirement for new installations to utilise a specific database, i.e. new users who are unwilling or unable to switch databases can deploy on their existing infrastructure.
- Future-proofing: as new database technologies emerge, software developers will not have to adapt to new interfaces.
- Developer testing: a production database may be replaced with a desktop-level implementation of the data for developer-level unit tests.
- Added Database Features: depending on the database and the DAL, it may be possible for the DAL to add features to the database. A DAL may use database programming facilities or other methods to create standard but unsupported functionality or completely new functionality. For instance, the DBvolution DAL implements the standard deviation function for several databases that do not support it.
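The "developer testing" point above can be sketched concretely: if application code depends only on an injected connection factory, unit tests can substitute a desktop-level in-memory SQLite database for the production backend. The `UserRepository` class and its methods are invented for illustration.

```python
import sqlite3

class UserRepository:
    """Application-side data access; knows nothing about which DBMS
    backs the connection it is given."""
    def __init__(self, connect):
        self.conn = connect()
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER, name TEXT)")

    def add(self, uid, name):
        self.conn.execute("INSERT INTO users VALUES (?, ?)", (uid, name))

    def get(self, uid):
        row = self.conn.execute(
            "SELECT name FROM users WHERE id = ?", (uid,)).fetchone()
        return row[0] if row else None

# In production the factory might wrap a server DBMS; in a unit test
# it is an in-memory SQLite stand-in.
repo = UserRepository(lambda: sqlite3.connect(":memory:"))
repo.add(1, "Grace")
print(repo.get(1))  # Grace
```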
Against
- Speed: any abstraction layer reduces overall speed, more or less depending on the amount of additional code that has to be executed. The more a database layer abstracts from the native database interface and tries to emulate features not present on all database backends, the slower the overall performance. This is especially true for database abstraction layers that also try to unify the query language, such as ODBC.
- Dependency: a database abstraction layer provides yet another functional dependency for a software system, i.e. a given database abstraction layer, like anything else, may eventually become obsolete, outmoded or unsupported.
- Masked operations: database abstraction layers may limit the available database operations to a subset of those supported by the underlying database backends. In particular, they may not fully support backend-specific optimizations or debugging features. These problems magnify significantly with database size, scale, and complexity.
References
- ^ Ambler, Tim; Cloud, Nicholas (2015). JavaScript Frameworks for Modern Web Dev. Apress. p. 346. ISBN 978-1-4842-0662-1.
- ^ "What is database-agnostic? - Definition from WhatIs.com".
- ^ "Levels of Abstraction".
- ^ "OpenDBX". linuxnetworks.de. 24 June 2012. Retrieved 26 July 2018.
- ^ "Libzdb". tildeslash.com. 2018. Retrieved 26 July 2018.
- ^ "GNOME-DB". 12 June 2015. Retrieved 26 July 2018. "Libgda library [...] is mainly a database and data abstraction layer, and includes a GTK+ based UI extension, and some graphical tools."
Core Concepts
ANSI/SPARC Three-Level Architecture
The ANSI/SPARC three-level architecture was introduced in the 1970s by the ANSI/X3/SPARC Study Group on Database Management Systems to establish a standardized framework for DBMS design, emphasizing data independence to insulate application programs from changes in data storage or organization.[6] This model, first detailed in the 1978 framework report, separates database concerns into three distinct levels—internal (physical), conceptual (logical), and external (view)—each with its own schema to facilitate modular development and maintenance in a DBMS.[7]

The physical level, also known as the internal level, defines the lowest abstraction, focusing on hardware-specific aspects of data storage and access. It specifies file structures such as sequential, indexed sequential, or hashed files, along with indexing mechanisms to optimize retrieval, including primary and secondary indexes that map logical keys to physical locations on storage devices.[7] Data compression techniques at this level, such as those for reducing redundancy in storage models like flat files or hierarchical structures, ensure efficient use of disk space and I/O operations, while access paths detail how data is organized for performance without exposing these details to higher levels.[7] This level's schema reflects efficiency considerations, modeling the database in terms of an abstract storage view that hides low-level hardware dependencies from the rest of the system.[6]

The conceptual level, or logical level, provides a unified view of the entire database, independent of physical implementation.
It includes schema definitions that outline the overall structure, such as tables, attributes, and constraints, along with entity relationships that model the semantics of the data, like one-to-many associations between entities in a relational context.[7] A key concept here is logical data independence, which allows modifications to the conceptual schema—such as adding new entities or altering relationships—without impacting external views or application programs that rely on them, thereby promoting flexibility in evolving database designs.[6] This level serves as the core information model of the enterprise, capturing all relevant static and dynamic aspects of the data universe.[7]

The external level, referred to as the view level, offers customized presentations tailored to specific users or applications. It consists of multiple external schemas, each comprising user-specific views that act as virtual tables derived from the conceptual schema, restricting access to relevant subsets of data and renaming elements for clarity.[7] These views enable tailored data presentations, such as simplified subsets for end-users or application-specific projections, without requiring alterations to the underlying conceptual or physical schemas, thus supporting diverse user needs within a shared database.[6] This architecture forms the theoretical basis for database abstraction layers in contemporary systems, enabling separation of data concerns across software stacks.[7]

Purpose and Role of Abstraction Layers
A database abstraction layer (DAL) serves as an intermediary software component that conceals database-specific implementation details from the application code, presenting a unified interface for data operations.[8] This layer translates high-level application requests into database-appropriate commands, shielding developers from vendor-specific syntax, connection protocols, and optimization quirks across different database management systems (DBMS).[9]

The primary roles of a DAL include facilitating database portability, which allows applications to switch underlying DBMS—such as from Oracle to MySQL—without necessitating widespread code modifications.[8] It simplifies maintenance by centralizing database interactions in a single layer, making updates to queries or configurations more efficient and less error-prone.[9] Additionally, a DAL enables support for multiple database backends concurrently within the same application, enhancing scalability and deployment flexibility in heterogeneous environments.[10]

DALs embody key concepts of data independence as outlined in the ANSI/SPARC three-level architecture, which provides the foundational model for separating user views from physical storage.[11] External data independence protects application views from changes in the conceptual schema, while internal (or physical) data independence insulates the conceptual schema from alterations in physical storage, such as file organization or indexing strategies.[12] For instance, a DAL achieves internal data independence by allowing migration from SQL Server to PostgreSQL without altering application logic, as the layer handles differences in SQL dialects and storage mechanisms.[13]

Effective utilization of a DAL presupposes a foundational understanding of basic SQL for query construction and DBMS architectures to grasp how abstraction maps to underlying operations.[14] Developers must also recognize the trade-offs in performance and feature support when abstracting complex database functionalities.[9]

Implementation Methods
API-Based Abstraction
API-based abstraction in database abstraction layers refers to the implementation of a standardized application programming interface (API) that enables applications to interact with multiple database management systems (DBMS) through a uniform set of functions and methods, insulating developers from DBMS-specific details. These APIs typically include core operations such as establishing connections (connect), executing SQL queries (query or executeQuery), and performing updates or inserts (executeUpdate), which are translated by underlying drivers into vendor-specific commands. For example, the JDBC API in Java provides interfaces like Connection for managing database sessions, Statement for basic SQL execution, and PreparedStatement for parameterized queries, allowing applications to issue abstract SQL without direct knowledge of the target DBMS syntax or protocol.[15] Similarly, the ODBC API offers functions like SQLConnect for connections, SQLExecDirect for query execution, and SQLExecute for prepared statements, abstracting access across diverse data sources.[16]
Key components of these APIs address common challenges in multi-DBMS environments, including connection pooling to reuse database connections and minimize establishment overhead, query construction mechanisms to handle SQL dialect variations, and error handling wrappers to normalize exceptions across systems. Connection pooling is facilitated through objects like JDBC's DataSource interface, which maintains a cache of reusable Connection instances, improving scalability in high-load applications by avoiding the costly process of repeated connection creation.[17] Query builders, often embodied in prepared statement APIs, allow developers to parameterize SQL to mitigate syntax differences—such as varying quote characters or function names—while drivers translate the final form to match the DBMS dialect, like converting standard JOIN syntax for Oracle or SQL Server specifics.[18] Error handling wrappers standardize DBMS-specific errors; for instance, JDBC uses the SQLException class to encapsulate details like SQL state codes and vendor error messages, enabling consistent application-level recovery regardless of the underlying system.[15] In ODBC, drivers populate diagnostic records via functions like SQLGetDiagRec to abstract error reporting from DBMS variations.[19]
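The error-handling wrapper described above can be sketched in Python: driver-specific exceptions (here sqlite3.Error) are normalized into a single application-level exception class, analogous to how JDBC funnels vendor errors through SQLException. `DataAccessError` and `run_query` are invented names for this illustration.

```python
import sqlite3

class DataAccessError(Exception):
    """Application-level exception hiding driver-specific error types."""
    def __init__(self, message, vendor_error):
        super().__init__(message)
        # Keep the original driver exception for diagnostics,
        # much like SQLException exposes vendor codes and SQL states.
        self.vendor_error = vendor_error

def run_query(conn, sql, params=()):
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.Error as exc:
        raise DataAccessError(f"query failed: {sql}", exc) from exc

conn = sqlite3.connect(":memory:")
err = None
try:
    run_query(conn, "SELECT * FROM no_such_table")
except DataAccessError as e:
    err = e  # application code catches one type, regardless of driver
print(type(err.vendor_error).__name__)  # OperationalError
```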
Prominent examples of generic APIs include JDBC, which operates in Java environments by leveraging type-specific drivers (e.g., Type 4 pure Java drivers) to map abstract API calls—such as a PreparedStatement.executeQuery("SELECT * FROM table WHERE id = ?")—directly to the DBMS protocol without intermediate translation layers in modern implementations.[15] ODBC, designed for cross-platform access in C and other languages, uses a Driver Manager to route calls to DBMS-specific drivers, which handle mappings like converting ODBC's standard SQLExecute calls to native commands for sources ranging from relational databases to flat files, ensuring portability across Windows, Unix, and other systems.[19] These drivers act as the translation bridge, encapsulating DBMS idiosyncrasies such as data type mappings or escape sequence interpretations.
Performance considerations in API-based abstraction arise primarily from the translation layers within drivers, which introduce overhead by parsing and converting abstract calls to native formats, potentially increasing latency in high-throughput scenarios compared to direct DBMS access.[20] To optimize this, techniques like prepared statements are integral; in JDBC, PreparedStatement objects precompile SQL on the server side, reducing parsing and optimization costs on subsequent executions with varying parameters, which can yield up to several times faster performance for repeated queries.[18] ODBC similarly employs prepared execution via SQLPrepare and SQLExecute, caching execution plans to amortize translation overhead, though overall API latency may still depend on driver efficiency and network factors.[16]
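The prepared-statement optimization has a direct analogue in Python's sqlite3 module, which caches compiled statements when the same SQL text is reused: one parameterized statement is parsed and planned once, then re-executed with varying parameter sets, rather than re-parsing fresh SQL per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

rows = [(i, f"v{i}") for i in range(1000)]
# Same SQL text, many parameter sets -> the statement is compiled
# once and reused, amortizing parse/plan cost across executions.
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 1000
```

The mechanism differs in detail from JDBC's server-side PreparedStatement, but the performance rationale is the same: pay the translation cost once, not per call.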
Language-Integrated Abstraction
Language-integrated abstraction embeds database query operations directly into the syntax and type system of a programming language, enabling developers to express queries using familiar language constructs while maintaining integration with the language's ecosystem for data manipulation.[21] This approach contrasts with external APIs by leveraging the host language's features, such as operators and expressions, to construct and compose queries that are translated to SQL at runtime or compile time.[22] A prominent example is LINQ (Language-Integrated Query) in .NET languages like C# and Visual Basic, where query expressions use declarative syntax resembling SQL but are fully integrated as first-class language elements.[23] Developers can write queries like from customer in customers where customer.City == "London" select customer, which the compiler translates into executable code with type safety ensured at compile time. In Python, SQLAlchemy's Core provides a similar integration through its SQL Expression Language, allowing construction of SQL statements using Python objects and operators, such as select(users).where(users.c.name == 'John'), which builds type-aware expressions without leaving the Python environment.
Key mechanisms include type-safe query construction, where the language's type system validates query elements against database schemas during development, preventing errors like mismatched column types before execution.[24] Compile-time checks further enhance this by analyzing query validity, such as ensuring join conditions align with table relationships, reducing runtime surprises.[25] Integration with language ecosystems facilitates seamless data handling, as queries can chain with native functions for transformations like filtering or aggregation, all within the same code block.[21]
These features improve developer experience by minimizing boilerplate code; for instance, fluent interfaces in LINQ or jOOQ allow method chaining for complex operations, such as query.from(table).join(other).where(condition).select(fields), making queries more readable and maintainable than raw SQL strings.[24] This reduces context-switching between languages and supports IDE autocompletion for schema-aware development.[23]
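A toy fluent query builder, with all names invented for illustration, shows the method chaining described above: each call returns the builder itself, and SQL is only rendered at the end, as in LINQ or jOOQ (real builders also escape identifiers and bind parameters rather than splicing strings).

```python
class Query:
    """Minimal fluent builder: from_().where().select().to_sql()."""
    def __init__(self):
        self._table = None
        self._cols = "*"
        self._wheres = []

    def from_(self, table):
        self._table = table
        return self  # returning self enables chaining

    def where(self, condition):
        self._wheres.append(condition)
        return self

    def select(self, *cols):
        self._cols = ", ".join(cols) if cols else "*"
        return self

    def to_sql(self):
        # SQL is rendered lazily, only when requested
        sql = f"SELECT {self._cols} FROM {self._table}"
        if self._wheres:
            sql += " WHERE " + " AND ".join(self._wheres)
        return sql

q = Query().from_("users").where("city = 'London'").select("id", "name")
print(q.to_sql())  # SELECT id, name FROM users WHERE city = 'London'
```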
However, language-integrated abstraction depends heavily on the host language's runtime environment, which may introduce performance overhead from query translation or limit portability across non-compatible languages.[25] It can also lead to lock-in with language-specific DBMS adapters, complicating migrations to databases not fully supported by the integration layer.
