Datasource
A datasource or DataSource is a name given to the connection set up to a database from a server. The name is commonly used when creating a query to the database. The data source name (DSN) need not be the same as the filename for the database. For example, a database file named friends.mdb could be set up with a DSN of school. Then DSN school would be used to refer to the database when performing a query.
In Java, a DataSource object is a factory for connections to the physical data source that it represents. It is an alternative to the DriverManager facility and the preferred means of getting a connection. An object that implements the DataSource interface will typically be registered with a naming service based on the Java Naming and Directory Interface (JNDI) API.
The DataSource interface is implemented by a driver vendor. There are three types of implementations:
- Basic implementation — produces a standard Connection object
- Connection pooling implementation — produces a Connection object that will automatically participate in connection pooling. This implementation works with a middle-tier connection pooling manager.
- Distributed transaction implementation — produces a Connection object that may be used for distributed transactions and almost always participates in connection pooling. This implementation works with a middle-tier transaction manager and almost always with a connection pooling manager.
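As a rough illustration of the three kinds listed above, the standard javax.sql package declares a separate factory interface for each kind of connection object: DataSource for plain connections, plus ConnectionPoolDataSource and XADataSource, which a pooling manager or transaction manager would typically call behind the scenes. The sketch below only uses reflection to print what each factory method returns. The class name DataSourceKinds and the helper producedType are made up for this example.

```java
import java.lang.reflect.Method;
import javax.sql.ConnectionPoolDataSource;
import javax.sql.DataSource;
import javax.sql.XADataSource;

class DataSourceKinds {
    /** Returns the simple name of the type produced by a no-argument factory method. */
    static String producedType(Class<?> factoryInterface, String methodName) {
        try {
            Method method = factoryInterface.getMethod(methodName);
            return method.getReturnType().getSimpleName();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Basic: the application receives a java.sql.Connection directly
        System.out.println("DataSource -> "
                + producedType(DataSource.class, "getConnection"));
        // Pooling: a pooling manager receives a reusable PooledConnection
        System.out.println("ConnectionPoolDataSource -> "
                + producedType(ConnectionPoolDataSource.class, "getPooledConnection"));
        // Distributed transactions: a transaction manager receives an XAConnection
        System.out.println("XADataSource -> "
                + producedType(XADataSource.class, "getXAConnection"));
    }
}
```

In all three cases the application itself normally still works with a DataSource and a Connection; the other two interfaces are used by the middle tier.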
A DataSource object has properties that can be modified when necessary. For example, if the data source is moved to a different server, the property for the server can be changed. Because only the properties change, code that accesses the data source does not need to be changed.
A driver that is accessed via a DataSource object does not register itself with the DriverManager. Rather, a DataSource object is retrieved through a lookup operation and then used to create a Connection object. With a basic implementation, the connection obtained through a DataSource object is identical to a connection obtained through the DriverManager facility.
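To make the idea of a basic implementation with changeable properties more concrete, here is a minimal sketch of a hand-written DataSource that simply delegates to DriverManager. It is not taken from any vendor's driver: the class name SimpleDataSource, its url, user and password properties, and the placeholder JDBC URL are all assumptions made for illustration, and the login timeout is only stored, not enforced.

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.logging.Logger;
import javax.sql.DataSource;

/** Hypothetical basic DataSource: each call opens a connection via DriverManager. */
class SimpleDataSource implements DataSource {
    private String url;
    private String user;
    private String password;
    private int loginTimeout;        // stored only; this sketch does not enforce it
    private PrintWriter logWriter;

    // JavaBeans-style properties that can be changed without touching callers
    public void setUrl(String url) { this.url = url; }
    public String getUrl() { return url; }
    public void setUser(String user) { this.user = user; }
    public void setPassword(String password) { this.password = password; }

    @Override
    public Connection getConnection() throws SQLException {
        return getConnection(user, password);
    }

    @Override
    public Connection getConnection(String username, String password) throws SQLException {
        // The same call an application would make without a DataSource
        return DriverManager.getConnection(url, username, password);
    }

    @Override public PrintWriter getLogWriter() { return logWriter; }
    @Override public void setLogWriter(PrintWriter out) { this.logWriter = out; }
    @Override public void setLoginTimeout(int seconds) { this.loginTimeout = seconds; }
    @Override public int getLoginTimeout() { return loginTimeout; }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("java.util.logging is not used here");
    }

    @Override
    public <T> T unwrap(Class<T> iface) throws SQLException {
        if (iface.isInstance(this)) {
            return iface.cast(this);
        }
        throw new SQLException("Not a wrapper for " + iface.getName());
    }

    @Override
    public boolean isWrapperFor(Class<?> iface) {
        return iface.isInstance(this);
    }

    public static void main(String[] args) {
        SimpleDataSource ds = new SimpleDataSource();
        ds.setUrl("jdbc:example://db1.internal/school");   // placeholder URL, no such driver
        try (Connection con = ds.getConnection()) {
            System.out.println("connected: " + !con.isClosed());
        } catch (SQLException e) {
            // Expected unless a driver that accepts the URL is on the classpath
            System.out.println("could not connect: " + e.getMessage());
        }
    }
}
```

Moving the database to another server would then mean calling setUrl with a new value (or changing it in deployment configuration), while code that calls getConnection() stays the same.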
A DataSource object is the representation of a data source in the Java programming language. In basic terms, a data source is a facility for storing data. It can be as sophisticated as a complex database for a large corporation or as simple as a file with rows and columns. A data source can reside on a remote server, or it can be on a local desktop machine. Applications access a data source using a connection, and a DataSource object can be thought of as a factory for connections to the particular data source that the DataSource instance represents. The DataSource interface provides two methods for establishing a connection with a data source: getConnection(), which uses default credentials, and getConnection(String username, String password).
Using a DataSource object is the preferred alternative to using the DriverManager for establishing a connection to a data source. They are similar to the extent that the DriverManager class and DataSource interface both have methods for creating a connection, methods for getting and setting a timeout limit for making a connection, and methods for getting and setting a stream for logging.
Their differences are more significant than their similarities, however. Unlike the DriverManager, a DataSource object has properties that identify and describe the data source it represents. Also, a DataSource object works with a Java Naming and Directory Interface (JNDI) naming service and can be created, deployed, and managed separately from the applications that use it. A driver vendor will provide a class that is a basic implementation of the DataSource interface as part of its Java Database Connectivity (JDBC) 2.0 or 3.0 driver product. A system administrator registers the DataSource object with a JNDI naming service, and an application then looks it up there to get a connection to the data source.
Being registered with a JNDI naming service gives a DataSource object two major advantages over the DriverManager. First, an application does not need to hardcode driver information, as it does with the DriverManager. A programmer can choose a logical name for the data source and register the logical name with a JNDI naming service. The application uses the logical name, and the JNDI naming service will supply the DataSource object associated with the logical name. The DataSource object can then be used to create a connection to the data source it represents.
The second major advantage is that the DataSource facility allows developers to implement a DataSource class to take advantage of features like connection pooling and distributed transactions. Connection pooling can increase performance dramatically by reusing connections rather than creating a new physical connection each time a connection is requested. The ability to use distributed transactions enables an application to do the heavy duty database work of large enterprises.
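The reuse described above can be sketched with a toy pool. This is only an illustration of the idea, not how any real pooling library is implemented: TinyPool, ConnectionFactory and fakePhysical are hypothetical names, the pool has no size limit or connection validation, and the "physical" connections are stand-in proxies so that the example runs without a database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy connection pool: illustrative only, with no size limit or validation. */
class TinyPool {
    /** Hypothetical source of physical connections, e.g. a driver call. */
    interface ConnectionFactory {
        Connection create() throws SQLException;
    }

    private final ConnectionFactory factory;
    private final Deque<Connection> idle = new ArrayDeque<>();
    private int physicalCreated;

    TinyPool(ConnectionFactory factory) {
        this.factory = factory;
    }

    /** Reuses an idle physical connection if possible, otherwise opens a new one. */
    synchronized Connection getConnection() throws SQLException {
        Connection physical;
        if (idle.isEmpty()) {
            physical = factory.create();
            physicalCreated++;
        } else {
            physical = idle.pop();
        }
        return wrap(physical);
    }

    synchronized int physicalCreated() {
        return physicalCreated;
    }

    private synchronized void release(Connection physical) {
        idle.push(physical);
    }

    /** Wraps a physical connection so that close() returns it to the pool. */
    private Connection wrap(Connection physical) {
        AtomicBoolean closed = new AtomicBoolean(false);
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("close")) {
                if (closed.compareAndSet(false, true)) {
                    release(physical);   // hand back instead of closing
                }
                return null;
            }
            try {
                return method.invoke(physical, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        return (Connection) Proxy.newProxyInstance(
                TinyPool.class.getClassLoader(), new Class<?>[] {Connection.class}, handler);
    }

    /** Stand-in for a real driver connection so the demo runs without a database. */
    static Connection fakePhysical() {
        return (Connection) Proxy.newProxyInstance(
                TinyPool.class.getClassLoader(), new Class<?>[] {Connection.class},
                (proxy, method, args) -> null);
    }

    public static void main(String[] args) throws SQLException {
        TinyPool pool = new TinyPool(TinyPool::fakePhysical);
        for (int i = 0; i < 3; i++) {
            try (Connection con = pool.getConnection()) {
                // run queries with con here
            }
        }
        System.out.println("physical connections opened: " + pool.physicalCreated());
    }
}
```

Running main borrows a connection three times but opens only one physical connection, because each close() puts the connection back on the idle list. A production pool would additionally cap the pool size, validate connections, and stop a closed wrapper from being used again.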
Although an application may use either the DriverManager or a DataSource object to get a connection, using a DataSource object offers significant advantages and is the recommended way to establish a connection.
The DataSource interface has been part of the Java platform since version 1.4.
Since Java EE 6 a JNDI-bound DataSource can alternatively be configured in a declarative way directly from within the application.[1][2] This alternative is particularly useful for self-sufficient applications or for transparently using an embedded database.[3][4]
In the Yahoo! User Interface (YUI) JavaScript library, a DataSource is an abstract representation of a live set of data that presents a common predictable API for other objects to interact with. The nature of your data, its quantity, its complexity, and the logic for returning query results all play a role in determining your type of DataSource. For small amounts of simple textual data, a JavaScript array is a good choice. If your data has a small footprint but requires a simple computational or transformational filter before being displayed, a JavaScript function may be the right approach. For very large datasets (for example, a robust relational database) or to access a third-party web service, you'll certainly need to leverage the power of a Script Node or XHR DataSource.
References
- "Introducing the DataSourceDefinition Annotation | Java.net". Archived from the original on 2013-12-03. Retrieved 2013-11-30.
- "DataSourceDefinition (Java EE 6)".
- "The state of @DataSourceDefinition in Java EE". 30 June 2012.
- "April 2012".
Datasource
A DataSource is a Java interface, part of the javax.sql package, that serves as a factory for creating connections to a physical data source such as a relational database. Introduced with JDBC 2.0 in 1998, it provides a portable and configurable mechanism for applications to access data sources without directly using the DriverManager class, enabling better integration with enterprise environments like Java EE containers. Unlike the older DriverManager approach, a DataSource object can be configured with properties such as server name, port number, and database name, which can be modified at runtime or deployment without recompiling the application code. It is typically obtained through Java Naming and Directory Interface (JNDI) lookups in server-based applications, promoting resource pooling and connection management for improved performance and scalability. Implementations of DataSource, whether provided by database vendors such as Oracle or by pooling libraries such as Apache Commons DBCP, support features like connection pooling, distributed transactions via XADataSource, and connection validation to handle failover and load balancing. This interface has become foundational in modern Java applications for decoupling data access logic from specific driver details, facilitating easier maintenance and portability across different database systems.
General Concept
Definition and Purpose
A DataSource is a standardized facility or interface in software systems that enables applications to connect to and retrieve data from various underlying storage systems, such as databases, files, or remote services, while abstracting the complexities of direct low-level connections.[1][2] This abstraction layer simplifies data access by providing a uniform mechanism to obtain connections, regardless of the specific data provider or protocol involved.[3]
The primary purpose of a DataSource is to facilitate efficient data access through mechanisms like resource pooling, where connections are reused to minimize the overhead of repeatedly establishing new links to the data origin.[4][5] It also promotes portability by allowing applications to switch between different data providers, such as from one database vendor to another, without requiring modifications to the core application code, thanks to its vendor-independent design.[6][7]
Key benefits of using a DataSource include enhanced scalability, as connection pooling supports handling increased loads by efficiently managing a limited set of reusable connections; improved security through centralized credential management, which avoids embedding sensitive information directly in application code and enables secure mapping of user identities to database privileges; and greater maintainability in multi-tier architectures, where the DataSource acts as a decoupling layer between business logic and data storage.[4][8] These advantages make DataSources particularly valuable in enterprise environments requiring robust, flexible data integration.
Examples of data origins accessible via a DataSource include relational databases through standards like JDBC, and APIs or remote services that expose data endpoints.[2][9] In each case, the emphasis is on the abstraction layer, which shields developers from provider-specific details and ensures consistent data handling across diverse sources.[9]
Historical Development
The concept of DataSource emerged in the late 1980s as part of efforts to standardize database access in enterprise environments, with early precursors focusing on unifying connectivity across disparate systems.[10] In 1992, Microsoft introduced Open Database Connectivity (ODBC) as a key milestone, providing a standardized application programming interface (API) for accessing relational databases on Windows platforms and enabling driver-based connections to various data sources.[11]
In the Java ecosystem, the Java Naming and Directory Interface (JNDI) was specified in 1998 by Sun Microsystems to facilitate resource location in distributed applications, laying groundwork for managed DataSource lookups in application servers.[12] This was followed by the introduction of the javax.sql.DataSource interface in JDBC 2.0's Standard Extension API in 1998, developed by Sun Microsystems to overcome limitations of the basic DriverManager for connection pooling and distributed transactions in enterprise settings.[13] Subsequent enhancements came with JDBC 3.0 in 2002 under JSR 54, which built on DataSource capabilities by adding features like statement pooling and savepoint support to improve performance in high-load scenarios.[14]
On the web development front, Yahoo released its User Interface (YUI) library in February 2006, incorporating a JavaScript DataSource utility as an early adaptation for handling asynchronous data retrieval in AJAX applications.[15]
Post-2010 developments shifted toward cloud-native architectures, exemplified by the release of Spring Boot 1.0 in April 2014, which simplified DataSource configuration through auto-configuration and integration with cloud services for scalable, containerized deployments.[16]
In Database Technologies
Java JDBC DataSource
The Java JDBC DataSource interface, defined in the javax.sql package, serves as a factory for establishing connections to physical data sources, extending the foundational JDBC model to support advanced features like connection pooling and distributed transactions. Introduced as part of JDBC 2.0, it provides a standardized, vendor-implemented mechanism that is typically registered with a naming service such as JNDI, allowing applications to obtain connections without directly interacting with the DriverManager class.[17][18]
Key methods of the DataSource interface include getConnection(), which attempts to establish a database connection using default credentials, and getConnection(String username, String password), which uses provided authentication details; both may throw SQLException if the operation fails. Inheriting from CommonDataSource, it also supports configuration methods such as setLoginTimeout(int seconds) to specify the maximum time in seconds to wait while attempting to connect, and getLoginTimeout() to retrieve this value. The login timeout is initially 0, which means the system default timeout if there is one and otherwise no timeout. For scenarios involving distributed transactions, the related XADataSource interface produces XAConnection objects, enabling coordination across multiple resources via a transaction manager.[17][19][20]
DataSource integrates seamlessly with connection pooling implementations to manage reusable database connections, acting as a factory that minimizes latency by recycling connections rather than creating new ones for each request. Popular libraries include Apache Commons DBCP, which provides a BasicDataSource implementation configurable via JavaBeans properties for basic pooling needs, and HikariCP, a lightweight, high-performance pool known for its minimal overhead and reliability in production environments. These implementations allow DataSource to handle high-concurrency scenarios efficiently, such as in web applications where frequent database queries occur.[18][21]
Configuration of a JDBC DataSource often occurs in application servers like Apache Tomcat through JNDI lookups, where resources are defined in files such as context.xml. Essential properties include driverClassName (e.g., com.mysql.cj.jdbc.Driver for MySQL), url (the JDBC connection string, e.g., jdbc:mysql://localhost:3306/mydb), and pooling parameters like maxTotal (maximum active connections, e.g., 100) or maxIdle (maximum idle connections, e.g., 30). The JDBC driver JAR must be placed in the server's library directory (e.g., $CATALINA_HOME/lib), and the resource is referenced in the application's web.xml for container-managed authentication.[22]
Compared to the DriverManager approach, DataSource offers superior thread-safety for concurrent access, support for distributed transactions through XADataSource, and the ability to avoid hard-coding credentials by leveraging JNDI-bound configurations, making it ideal for enterprise Java applications. Later JDBC versions (e.g., 4.0 and above) introduce additional exception types like SQLTimeoutException for timeout handling.[18][20]
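As a small illustration of the timeout handling mentioned above, SQLTimeoutException is a subclass of SQLTransientException, which in turn extends SQLException, so code can check for the most specific type first. The class ConnectionErrors and the labels it returns are hypothetical and chosen only for this sketch.

```java
import java.sql.SQLException;
import java.sql.SQLTimeoutException;
import java.sql.SQLTransientException;

class ConnectionErrors {
    /** Maps a JDBC exception to a coarse, hypothetical category label. */
    static String classify(SQLException e) {
        if (e instanceof SQLTimeoutException) {
            return "timeout";      // a configured timeout expired
        }
        if (e instanceof SQLTransientException) {
            return "transient";    // might succeed if retried unchanged
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify(new SQLTimeoutException("login timed out")));
        System.out.println(classify(new SQLException("syntax error")));
    }
}
```

The two calls in main print "timeout" and "other" respectively. In application code the same ordering applies to catch clauses: a catch for SQLTimeoutException has to come before a catch for SQLException.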
In a typical servlet-based example, a DataSource is looked up using InitialContext for database operations:
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;
// MyDbClient is an illustrative class name; the code would normally live in a servlet or DAO
public class MyDbClient {
    public void queryDatabase() throws NamingException {
        // Look up the container-managed DataSource by its JNDI name
        Context ctx = new InitialContext();
        DataSource ds = (DataSource) ctx.lookup("java:comp/env/jdbc/MyDB");
        // Obtain a connection; this assumes credentials are set in the container's resource configuration
        try (Connection con = ds.getConnection()) {
            // Perform database queries here, e.g., PreparedStatement execution
        } catch (SQLException e) {
            // Handle exception (also covers a failure while closing the connection)
        }
        // Leaving the try block closes con, which returns the connection to the pool
    }
}
Implementations in Other Languages
In the .NET ecosystem, ADO.NET, introduced in 2002 with the .NET Framework 1.0, provides DataSource-like functionality through classes such as SqlConnection and DbProviderFactory for managing pooled database connections.[23] SqlConnection provides access to SQL Server (other providers, such as OleDbConnection for OLE DB data sources, follow the same model) and reuses connections via connection strings, which specify parameters like server name, database, and pooling options such as Min Pool Size and Max Pool Size to optimize resource usage and reduce overhead from frequent connection establishment. DbProviderFactory, part of the System.Data.Common namespace, implements a factory pattern to instantiate provider-specific connection objects dynamically, promoting portability across different database providers without hardcoding implementation details.
In Python, the DB-API specification (PEP 249), finalized in 1999, defines a standard interface for database access, with connection pooling commonly implemented through libraries like SQLAlchemy, first released in 2006.[24][25] SQLAlchemy's Engine object serves as a DataSource equivalent, creating and managing a pool of connections to databases such as PostgreSQL, where it handles creation, validation, and recycling of connections to maintain performance in multi-threaded or web applications. For PostgreSQL specifically, the psycopg2 adapter (compliant with DB-API) includes built-in pooling classes like ThreadedConnectionPool, which open a minimum number of connections up front and allow up to a configured maximum, so that connections are reused and latency stays low in high-concurrency scenarios.[26]
PHP's PHP Data Objects (PDO), introduced in 2005 with PHP 5.1.0, offers a DataSource-style abstraction layer using Data Source Names (DSN) to specify connection details for various drivers, including MySQL, PostgreSQL, and SQLite, allowing a single interface for multiple backend databases.[27] PDO connections are established via the constructor with a DSN string (e.g., "mysql:host=localhost;dbname=test"). Connection reuse is achieved through persistent connections enabled by the PDO::ATTR_PERSISTENT option, which cache connections across script executions to avoid repeated handshakes, or through extensions like Swoole for advanced, coroutine-based pooling in asynchronous environments.
Across these languages, common patterns emphasize factory-based creation of connection objects and external configuration for portability; for instance, .NET applications often use appsettings.json files to store connection strings, enabling environment-specific adjustments without code changes. A notable cross-language trend is the integration of Object-Relational Mapping (ORM) tools that embed DataSource logic, such as Entity Framework in .NET (released in 2008), which abstracts connection management within its DbContext for simplified querying and reduces boilerplate for pooled access compared to raw ADO.NET.[28] Similar ORM approaches in Python (via SQLAlchemy) and PHP (e.g., Doctrine) follow this pattern, prioritizing developer productivity while leveraging underlying pooling mechanisms.
In Client-Side Development
Yahoo YUI DataSource
The Yahoo YUI DataSource utility, introduced as part of the Yahoo! User Interface (YUI) Library version 2.x in February 2006, served as a JavaScript class designed to fetch, cache, and manage data from various sources in client-side AJAX applications.[29][30] It provided a unified interface for handling tabular data, enabling widgets like DataTable and AutoComplete to interact with local or remote data sources without requiring full page reloads.[30]
YUI DataSource supported several core subclasses tailored to different data origins: LocalDataSource for in-memory structures such as JavaScript arrays, object literals, XML documents, or HTML tables; XHRDataSource for making asynchronous HTTP requests to server-side endpoints; and ScriptNodeDataSource (introduced in YUI 2.6.0) for cross-domain data retrieval via JSONP using dynamic script nodes.[30] Each type inherited from the base DataSource class and included methods like sendRequest(), which initiated data retrieval by passing a request object and a callback configuration, and doBeforeParseData(), a customizable function for preprocessing raw responses before schema-based parsing.[30]
The architecture emphasized asynchronous operation, where data requests were queued, cached locally (with configurable maxCacheEntries to limit size), and processed through a response schema defined via the responseSchema property to extract fields, results, and metadata from formats like JSON, XML, or text.[30] This parsing integrated seamlessly with YUI components, such as populating a DataTable widget by passing the parsed results array directly to its rendering pipeline, while custom events like requestEvent and responseParseEvent allowed developers to hook into the data flow for modifications.[30] Periodic polling was also supported via setInterval() for real-time updates from remote sources.[30]
In early web applications around 2006–2010, DataSource was commonly used to load dynamic content, such as populating UI grids or dropdowns from server APIs in single-page interfaces, addressing the limitations of synchronous scripting in browsers like Internet Explorer 6 and Firefox 1.5.[30] For instance, developers could instantiate an XHRDataSource to query a REST endpoint and feed the results into a sortable DataTable without disrupting user interactions.[30]
YUI DataSource received its last major update in version 2.9.0, released on April 13, 2011. YUI 2 was deprecated that same year as Yahoo shifted focus to YUI 3 and modern JavaScript standards, and the library was fully archived by 2014 with no further maintenance.[31][32]
