Spatial database

A spatial database is a general-purpose database (usually a relational database) that has been enhanced to include spatial data that represents objects defined in a geometric space, along with tools for querying and analyzing such data.

Most spatial databases allow the representation of simple geometric objects such as points, lines and polygons. Some spatial databases handle more complex structures such as 3D objects, topological coverages, linear networks, and triangulated irregular networks (TINs). While typical databases have developed to manage various numeric and character types of data, such databases require additional functionality to process spatial data types efficiently, and developers have often added geometry or feature data types.

A geographic database (or geodatabase) is a georeferenced spatial database used for storing and manipulating geographic data (or geodata, i.e., data associated with a location on Earth),^[a] especially in geographic information systems (GIS). Almost all current relational and object-relational database management systems have spatial extensions, and some GIS software vendors have developed their own spatial extensions to database management systems.

The Open Geospatial Consortium (OGC) developed the Simple Features specification (first released in 1997)^[1] and sets standards for adding spatial functionality to database systems.^[2] The SQL/MM Spatial ISO/IEC standard is part of the SQL multimedia standard (SQL/MM) and extends Simple Features.^[3]

Characteristics

The core functionality added by a spatial extension to a database is one or more spatial datatypes, which allow for the storage of spatial data as attribute values in a table.^[4] Most commonly, a single spatial value would be a geometric primitive (point, line, polygon, etc.) based on the vector data model. The datatypes in most spatial databases are based on the OGC Simple Features specification for representing geometric primitives. Some spatial databases also support the storage of raster data. Because all geographic locations must be specified according to a spatial reference system, spatial databases must also allow for the tracking and transformation of coordinate systems. In many systems, when a spatial column is defined in a table, it also includes a choice of coordinate system, chosen from a list of available systems that is stored in a lookup table.
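As a rough illustration of a spatial value stored as an attribute together with its coordinate system, the following Python sketch models a point tagged with a spatial reference identifier (SRID) and a small lookup table of reference systems. The class names, the table contents, and the `to_web_mercator` helper are assumptions made for this example and do not correspond to any particular database's API; the transformation shown is the standard spherical Web Mercator formula.

```python
import math
from dataclasses import dataclass

# Hypothetical lookup table of spatial reference systems, keyed by EPSG code
SPATIAL_REF_SYS = {
    4326: "WGS 84 (longitude/latitude in degrees)",
    3857: "WGS 84 / Pseudo-Mercator (metres)",
}

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis


@dataclass(frozen=True)
class Point:
    """A point geometry tagged with the SRID of its coordinate system."""
    x: float
    y: float
    srid: int

    def to_wkt(self) -> str:
        # Well-known text (WKT) is the text encoding defined by Simple Features
        return f"POINT ({self.x:g} {self.y:g})"


def to_web_mercator(p: Point) -> Point:
    """Transform a longitude/latitude point (SRID 4326) to SRID 3857."""
    if p.srid != 4326:
        raise ValueError("this sketch only transforms from SRID 4326")
    x = EARTH_RADIUS_M * math.radians(p.x)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(p.y) / 2))
    return Point(x, y, srid=3857)


@dataclass
class CityRow:
    """A table row whose 'shape' attribute holds a spatial value."""
    name: str
    shape: Point


row = CityRow("Paris", Point(2.35, 48.86, srid=4326))
print(row.shape.to_wkt(), "in", SPATIAL_REF_SYS[row.shape.srid])
```

Keeping the SRID with every value is what lets a system refuse to compare geometries in different coordinate systems, or transform one of them first.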

The second major functionality extension in a spatial database is the addition of spatial capabilities to the query language (e.g., SQL); these give the spatial database the same query, analysis, and manipulation operations that are available in traditional GIS software. In most relational database management systems, this functionality is implemented as a set of new functions that can be used in SQL SELECT statements. Several types of operations are specified by the Open Geospatial Consortium standard:

Measurement: Computes line length, polygon area, the distance between geometries, etc.
Geoprocessing: Modifies existing features to create new ones, for example by creating a buffer around them or intersecting features.
Predicates: Answer true/false queries about spatial relationships between geometries. Examples include "do two polygons overlap?" or "is there a residence located within a mile of the area where we are planning to build the landfill?" (see DE-9IM).
Geometry constructors: Create new geometries, usually by specifying the vertices (points or nodes) that define the shape.
Observer functions: Return specific information about a feature, such as the location of the center of a circle.

Some databases support only simplified or modified sets of these operations, especially NoSQL systems such as MongoDB and CouchDB.
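The measurement and predicate categories listed above can be sketched in plain Python. This is a minimal illustration that assumes planar (projected) coordinates and simple, non-self-intersecting polygons; the function names are invented for the example and are not taken from any standard or product.

```python
import math

Point = tuple[float, float]


def line_length(vertices: list[Point]) -> float:
    """Measurement: total length of a polyline, summed segment by segment."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring: list[Point]) -> float:
    """Measurement: area of a simple polygon using the shoelace formula.

    The ring lists each vertex once; the closing edge back to the first
    vertex is implied.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def within_distance(a: Point, b: Point, limit: float) -> bool:
    """Predicate: true if two points are no farther apart than 'limit'."""
    return math.dist(a, b) <= limit


print(line_length([(0, 0), (3, 4), (3, 10)]))          # 5 + 6
print(polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)]))  # 4 x 3 rectangle
print(within_distance((0, 0), (3, 4), 5.0))
```

A database would apply the same kind of computation to geodetic coordinates or to arbitrary geometry pairs, which requires considerably more machinery than this sketch shows.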

Spatial index

A spatial index is used by a spatial database to optimize spatial queries, implementing spatial access methods. Database systems use indices to quickly look up values by sorting data values in a linear (e.g. alphabetical) order; however, this way of indexing data is not optimal for spatial queries in two- or three-dimensional space. Instead, spatial databases use a spatial index designed specifically for multi-dimensional ordering.^[5] Common spatial index methods include:

Binary space partitioning (BSP-Tree): Subdividing space by hyperplanes.
Bounding volume hierarchy (BVH)
Geohash
Grid (spatial index)
HHCode
Hilbert R-tree
k-d tree
M-tree: Can be used for the efficient resolution of similarity queries on complex objects that are compared using an arbitrary metric.
Octree
PH-tree
Quadtree
R-tree: Typically the preferred method for indexing spatial data.^[6] Objects (shapes, lines and points) are grouped using the minimum bounding rectangle (MBR). A new object is added to the MBR within the index whose size would increase the least.
R+ tree
R* tree
UB-tree
X-tree
Z-order (curve)
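The R-tree insertion rule mentioned in the list above (place a new object in the MBR that needs the smallest increase in size) can be sketched as follows. This is a simplified, single-level illustration in Python, not a full R-tree: node splitting and the tree hierarchy are omitted, the class and function names are assumptions, and breaking ties by smaller area is one common convention rather than a requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MBR:
    """Minimum bounding rectangle with axis-aligned sides."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)

    def union(self, other: "MBR") -> "MBR":
        # Smallest rectangle that covers both rectangles
        return MBR(min(self.xmin, other.xmin), min(self.ymin, other.ymin),
                   max(self.xmax, other.xmax), max(self.ymax, other.ymax))

    def enlargement(self, other: "MBR") -> float:
        # Extra area needed for this rectangle to also cover 'other'
        return self.union(other).area() - self.area()


def choose_node(nodes: list[MBR], obj: MBR) -> int:
    """Return the index of the node whose MBR grows least when 'obj' is added.

    Ties are broken in favour of the node with the smaller current area.
    """
    return min(range(len(nodes)),
               key=lambda i: (nodes[i].enlargement(obj), nodes[i].area()))


nodes = [MBR(0, 0, 4, 4), MBR(10, 10, 12, 12)]
new_object = MBR(5, 1, 6, 2)
target = choose_node(nodes, new_object)
nodes[target] = nodes[target].union(new_object)  # grow the chosen MBR
print(target, nodes[target])
```

Here the first node grows by 8 square units while the second would grow by 73, so the object joins the first node.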

Spatial query

A spatial query is a special type of database query supported by spatial databases, including geodatabases. The queries differ from non-spatial SQL queries in several important ways. Two of the most important are that they allow for the use of geometry data types such as points, lines and polygons and that these queries consider the spatial relationship between these geometries.

The function names for queries differ across geodatabases. The following are a few of the functions built into PostGIS, a free geodatabase which is a PostgreSQL extension (the term 'geometry' refers to a point, line, box or other two- or three-dimensional shape):^[7]

Function prototype: functionName (parameter(s)) : return type

ST_Distance(geometry, geometry) : number
ST_Equals(geometry, geometry) : boolean
ST_Disjoint(geometry, geometry) : boolean
ST_Intersects(geometry, geometry) : boolean
ST_Touches(geometry, geometry) : boolean
ST_Crosses(geometry, geometry) : boolean
ST_Overlaps(geometry, geometry) : boolean
ST_Contains(geometry, geometry) : boolean
ST_Length(geometry) : number
ST_Area(geometry) : number
ST_Centroid(geometry) : geometry
ST_Intersection(geometry, geometry) : geometry

Thus, a spatial join between a points layer of cities and a polygon layer of countries could be performed in a spatially-extended SQL statement as:

SELECT * FROM cities, countries WHERE ST_Contains(countries.shape, cities.shape);
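To make the logic of such a spatial join concrete, the following Python sketch pairs each city point with the country polygon that contains it, using a nested loop and a ray-casting point-in-polygon test. The sample data and function names are invented for illustration, and the handling of points lying exactly on a polygon boundary is not guaranteed to match ST_Contains. A real spatial database would normally consult a spatial index rather than test every pair.

```python
Point = tuple[float, float]


def contains(polygon: list[Point], p: Point) -> bool:
    """Ray casting: count how many polygon edges a ray to the right crosses."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical layers: (name, geometry) pairs
cities = [("A-town", (1.0, 1.0)), ("B-ville", (6.0, 2.0)), ("Nowhere", (20.0, 20.0))]
countries = [
    ("Westland", [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]),
    ("Eastland", [(4.0, 0.0), (8.0, 0.0), (8.0, 4.0), (4.0, 4.0)]),
]


def spatial_join(cities, countries):
    """Return (city, country) name pairs where the country contains the city."""
    return [(city, country)
            for city, location in cities
            for country, shape in countries
            if contains(shape, location)]


print(spatial_join(cities, countries))
```

As in the SQL statement, a city that falls inside no country simply produces no row in the result.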

The Intersect vector overlay operation (a core element of GIS software) could be replicated as:

SELECT ST_Intersection(veg.shape, soil.shape) AS int_poly, veg.*, soil.* FROM veg, soil WHERE ST_Intersects(veg.shape, soil.shape);
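The structure of this overlay can be sketched in Python under a strong simplifying assumption: every feature is an axis-aligned rectangle, so that both the intersection test and the intersection geometry reduce to comparisons of coordinates. General polygons require a proper clipping algorithm, which is not shown. The layer contents and names below are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a polygon feature."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Rect, b: Rect) -> bool:
    """True if the rectangles share at least one point (touching counts)."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def intersection(a: Rect, b: Rect) -> Rect:
    """Overlap of two rectangles; only meaningful when intersects(a, b)."""
    return Rect(max(a.xmin, b.xmin), max(a.ymin, b.ymin),
                min(a.xmax, b.xmax), min(a.ymax, b.ymax))


# Hypothetical layers: (attribute, geometry) pairs
veg = [("forest", Rect(0, 0, 4, 4)), ("grass", Rect(4, 0, 8, 4))]
soil = [("clay", Rect(2, 1, 6, 3)), ("sand", Rect(10, 10, 12, 12))]


def overlay(veg, soil):
    """Pair every intersecting feature and keep both sets of attributes."""
    return [(intersection(v_shape, s_shape), v_name, s_name)
            for v_name, v_shape in veg
            for s_name, s_shape in soil
            if intersects(v_shape, s_shape)]


for piece in overlay(veg, soil):
    print(piece)
```

Each output row carries the new geometry plus the attributes of both source features, mirroring the `veg.*, soil.*` columns of the SQL statement.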