Data architecture

Data architecture consists of models, policies, rules, and standards that govern which data is collected and how it is stored, arranged, integrated, and put to use in data systems and in organizations.^[1] Data is usually one of several architecture domains that form the pillars of an enterprise architecture or solution architecture.^[2]

Overview

A data architecture aims to set data standards for all of an organization's data systems, as a vision or model of the eventual interactions between those systems. Data integration, for example, should depend on data architecture standards, since it requires data interactions between two or more data systems. A data architecture, in part, describes the data structures used by a business and its computer application software. Data architectures address data in storage, data in use, and data in motion; descriptions of data stores, data groups, and data items; and mappings of those data artifacts to data qualities, applications, locations, and so on.
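The mapping of data items to stores, applications, locations, and qualities can be pictured as a small catalog. The following is a minimal sketch only; the class, the field names, and every entry in the catalog are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """One data item and the artifacts it is mapped to (all values hypothetical)."""
    name: str
    data_store: str      # where the item is held
    applications: tuple  # applications that read or write the item
    location: str        # physical or logical location of the store
    quality_rule: str    # data quality expectation for the item


# Illustrative catalog; a real one would be maintained in a metadata tool.
CATALOG = [
    DataItem("customer_email", "crm_db", ("crm", "billing"),
             "eu-datacenter", "must be unique"),
    DataItem("invoice_total", "billing_db", ("billing",),
             "eu-datacenter", "must be non-negative"),
]


def items_used_by(application: str) -> list[str]:
    """Return the names of data items mapped to the given application."""
    return [item.name for item in CATALOG if application in item.applications]
```

Calling `items_used_by("billing")` on this catalog lists both items, which is the kind of question such a mapping is meant to answer.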

Essential to realizing the target state, data architecture describes how data is processed, stored, and used in an information system. It provides criteria for data processing operations to make it possible to design data flows and also control the flow of data in the system.

The data architect is typically responsible for defining the target state, keeping development aligned with it, and then following up to ensure that enhancements are made in the spirit of the original blueprint.

During the definition of the target state, the data architecture breaks a subject down to the atomic level and then builds it back up to the desired form. The data architect breaks the subject down by going through three traditional architectural stages:

Conceptual - represents all business entities.
Logical - represents the logic of how entities are related.
Physical - the realization of the data mechanisms for a specific type of functionality.
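The three stages above can be sketched for a toy subject. This is an illustration under stated assumptions, not a prescribed method: the entities `Customer` and `Invoice`, the type mapping, and the helper names are all hypothetical, and the generated SQL is deliberately simplified (no keys or constraints).

```python
from dataclasses import dataclass, fields

# Conceptual: only the business entities, with no attributes or keys yet.
CONCEPTUAL_ENTITIES = {"Customer", "Invoice"}


# Logical: attributes and relationships, independent of any database product.
@dataclass
class Customer:
    customer_id: int
    name: str


@dataclass
class Invoice:
    invoice_id: int
    customer_id: int  # relationship: each invoice belongs to one customer
    total: float


LOGICAL_MODEL = {cls.__name__: cls for cls in (Customer, Invoice)}

# Physical: one possible realization, here a mapping to SQL column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


def uncovered_entities() -> set[str]:
    """Conceptual entities that have no logical model yet."""
    return CONCEPTUAL_ENTITIES - set(LOGICAL_MODEL)


def to_ddl(model) -> str:
    """Derive a simplified CREATE TABLE statement from a logical entity."""
    columns = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(model))
    return f"CREATE TABLE {model.__name__.lower()} ({columns})"
```

Each level adds detail to the one before it: the conceptual set names the entities, the dataclasses give them attributes and a relationship, and `to_ddl` commits to one storage technology.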

The "data" column of the Zachman Framework for enterprise architecture:

Layer	View	Data (what)	Stakeholder
1	Scope/contextual	List of things and architectural standards^[3] important to the business	Planner
2	Business model/conceptual	Semantic model or conceptual/enterprise data model	Owner
3	System model/logical	Enterprise/logical data model	Designer
4	Technology model/physical	Physical data model	Builder
5	Detailed representations	Actual databases	Developer

In this broader sense, data architecture includes a complete analysis of the relationships among an organization's functions, available technologies, and data types.

Data architecture should be defined in the planning phase of the design of a new data processing and storage system. The major types and sources of data necessary to support an enterprise should be identified in a manner that is complete, consistent, and understandable. The primary requirement at this stage is to define all of the relevant data entities, not to specify computer hardware items. A data entity is any real or abstract thing about which an organization or individual wishes to store data.

Physical data architecture

Physical data architecture of an information system is part of a technology plan. The technology plan is focused on the actual tangible elements to be used in the implementation of the data architecture design. Physical data architecture encompasses database architecture. Database architecture is a schema of the actual database technology that would support the designed data architecture.
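As a hedged illustration of a database schema realized in a specific technology, the sketch below uses SQLite through Python's standard `sqlite3` module. The table names, columns, and index are hypothetical; the choice of SQLite is an assumption made only so that the example is self-contained.

```python
import sqlite3

# Hypothetical physical schema: concrete types, keys, constraints, and an index.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total_cents INTEGER NOT NULL CHECK (total_cents >= 0)
);
CREATE INDEX idx_invoice_customer ON invoice(customer_id);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database that implements the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(DDL)
    return conn


def table_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """List a table's column names in declaration order."""
    # PRAGMA table_info returns one row per column; index 1 holds the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Decisions such as storing money as integer cents or indexing the foreign key belong to this physical level; they do not change the logical design they implement.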

Elements of data architecture

Certain elements must be defined during the design phase of the data architecture schema. For example, the administrative structure that will manage the data resources must be described. The methodologies that will be used to store the data must also be defined. In addition, a description of the database technology to be employed must be produced, as well as a description of the processes that will manipulate the data. It is also important to design the interfaces through which other systems access the data, as well as the infrastructure that will support common data operations (e.g. emergency procedures, data imports, data backups, external transfers of data).

Without the guidance of a properly implemented data architecture design, common data operations might be implemented in different ways, rendering it difficult to understand and control the flow of data within such systems. This sort of fragmentation is undesirable due to the potential increased cost and the data disconnects involved. These sorts of difficulties may be encountered with rapidly growing enterprises and also enterprises that service different lines of business.

Properly executed, the data architecture phase of information system planning forces an organization to specify and describe both internal and external information flows. These are patterns that the organization may not have previously taken the time to conceptualize. It is therefore possible at this stage to identify costly information shortfalls, disconnects between departments, and disconnects between organizational systems that may not have been evident before the data architecture analysis.^[4]
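One way to picture this analysis is to record the documented information flows as a directed graph and look for systems that take part in none of them. The sketch below assumes a hypothetical inventory of systems and flows; none of the names come from a real organization.

```python
# Hypothetical documented flows: (source system, target system, data carried).
FLOWS = [
    ("web_shop", "order_db", "orders"),
    ("order_db", "warehouse", "pick lists"),
    ("order_db", "finance", "invoices"),
]

# Hypothetical inventory of every system the organization operates.
SYSTEMS = {"web_shop", "order_db", "warehouse", "finance", "support_desk"}


def disconnected_systems(systems: set[str], flows: list[tuple]) -> list[str]:
    """Return systems that neither send nor receive any documented flow."""
    connected = {name for source, target, _ in flows for name in (source, target)}
    return sorted(systems - connected)
```

Here `disconnected_systems(SYSTEMS, FLOWS)` reports `support_desk`, a candidate disconnect that would not be visible until the flows were written down.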

Constraints and influences

Various constraints and influences will have an effect on data architecture design. These include enterprise requirements, technology drivers, economics, business policies and data processing needs.

Enterprise requirements: These generally include such elements as economical and effective system expansion, acceptable performance levels (especially system access speed), transaction reliability, and transparent data management. In addition, the conversion of raw data such as transaction records and image files into more useful information forms through such features as data warehouses is also a common organizational requirement, since this enables managerial decision making and other organizational processes. One of the architecture techniques is the split between managing transaction data and (master) reference data. Another is splitting data capture systems from data retrieval systems (as done in a data warehouse).
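Both techniques named above can be sketched in a few lines. This is an illustration only, using SQLite from Python's standard library: the `product` table stands in for reference data, `sale` for transaction data, and `sales_summary` for a retrieval-side table derived from the capture side. All table names and rows are hypothetical.

```python
import sqlite3


def build_operational_store() -> sqlite3.Connection:
    """Capture side: reference data and transaction data kept in separate tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product (            -- reference (master) data, changes rarely
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE sale (               -- transaction data, grows continuously
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            quantity   INTEGER NOT NULL
        );
    """)
    conn.executemany("INSERT INTO product VALUES (?, ?)",
                     [(1, "widget"), (2, "gadget")])
    conn.executemany("INSERT INTO sale (product_id, quantity) VALUES (?, ?)",
                     [(1, 3), (2, 1), (1, 2)])
    return conn


def refresh_sales_summary(conn: sqlite3.Connection) -> list[tuple]:
    """Retrieval side: rebuild a reporting table from the captured transactions."""
    conn.execute("DROP TABLE IF EXISTS sales_summary")
    conn.execute("""
        CREATE TABLE sales_summary AS
        SELECT p.name AS product, SUM(s.quantity) AS units
        FROM sale AS s JOIN product AS p USING (product_id)
        GROUP BY p.name
    """)
    return conn.execute(
        "SELECT product, units FROM sales_summary ORDER BY product").fetchall()
```

Reports read `sales_summary` rather than `sale`, so reporting queries do not compete with transaction capture. A real data warehouse would normally live in a separate system; a single in-memory database is used here only to keep the sketch runnable.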

Technology drivers: These are usually suggested by the completed data architecture and database architecture designs. In addition, some technology drivers will derive from existing organizational integration frameworks and standards, organizational economics, and existing site resources (e.g. previously purchased software licensing). In many cases, the integration of multiple legacy systems requires the use of data virtualization technologies.
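The idea behind data virtualization can be shown with a toy example: a unified record is assembled on request from legacy sources that keep their own layouts, and nothing is copied in advance. The two sources, their field names, and the `customer_view` function below are hypothetical; real virtualization products work against live systems rather than in-memory lists.

```python
# Two hypothetical legacy sources with different record layouts.
LEGACY_CRM = [{"cust_no": 7, "full_name": "Ada Lovelace"}]
LEGACY_BILLING = [{"customer": 7, "balance_cents": 1250}]


def customer_view(customer_id: int) -> dict:
    """Assemble a unified customer record on demand from both sources."""
    crm = next((r for r in LEGACY_CRM if r["cust_no"] == customer_id), None)
    billing = next((r for r in LEGACY_BILLING if r["customer"] == customer_id), None)
    return {
        "customer_id": customer_id,
        # A missing source yields None rather than an error.
        "name": crm["full_name"] if crm else None,
        "balance_cents": billing["balance_cents"] if billing else None,
    }
```

Consumers see one consistent shape (`customer_id`, `name`, `balance_cents`) while each legacy system keeps its own field names.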

Economics: Economic factors must also be considered during the data architecture phase. Some solutions, while optimal in principle, may be ruled out by their cost. External factors such as the business cycle, interest rates, market conditions, and legal considerations could all have an effect on decisions relevant to data architecture.

Business policies: Policies that drive data architecture design include internal organizational policies, rules of regulatory bodies, professional standards, and applicable governmental laws, which can vary by agency. These policies and rules describe the manner in which the enterprise wishes to process its data.

Data processing needs: These include accurate and reproducible transactions performed in high volumes, data warehousing for the support of management information systems (and potential data mining), repetitive periodic reporting, ad hoc reporting, and support of various organizational initiatives as required (e.g. annual budgets, new product development).

References

[1] Business Dictionary - Data Architecture. Archived 2013-03-30 at the Wayback Machine; TOGAF 9.1 - Phase C: Information Systems Architectures - Data Architecture
[2] What is data architecture. GeekInterview, 2008-01-28, accessed 2011-04-28
[3] Data Architecture Standards
[4] Mittal, Prashant (2009). Author. pg 256: Global India Publications. p. 314. ISBN 978-93-8022-820-4.

External links

Achieving Usability Through Software Architecture, sei.cmu.edu 2001
The Logical Data Architecture, by Nirmal Baid
Building a modern data and analytics architecture
The “Right to Repair” Data Architecture with DataOps, the DataOps Blog
TOGAF 9: Preparation Process


