Spatial Database Concepts
Introduction to Spatial Databases
• Spatial databases incorporate
functionality that provides support for databases that keep track of objects in
a multidimensional space.
• For example, cartographic databases
that store maps include two-dimensional spatial descriptions of their
objects—from countries and states to rivers, cities, roads, seas, and so on.
• The systems that manage geographic
data and related applications are known as Geographical Information Systems
(GIS).
• Spatial databases are used in areas
such as environmental applications, transportation systems, emergency response
systems, and battle management.
• They are also used as meteorological
databases for weather information; these are three-dimensional, since temperatures
and other meteorological information are related to three-dimensional spatial
points.
• In general, a spatial database
stores objects that have spatial characteristics that describe them and that
have spatial relationships among them.
• The spatial relationships among the
objects are important, and they are often needed when querying the database.
• Although a spatial database can in
general refer to an n-dimensional space for any n, we will limit our discussion
to two dimensions as an illustration.
• A spatial database is optimized to
store and query data related to objects in space, including points, lines and
polygons.
• Satellite images are a prominent
example of spatial data.
• Queries posed on these spatial data,
where predicates for selection deal with spatial parameters, are called spatial
queries.
• For example, “What are the names of
all bookstores within five miles of the College of Computing building at
Georgia Tech?” is a spatial query.
• A query such as “List all the
customers located within twenty miles of company headquarters” will require the
processing of spatial data types typically outside the scope of standard
relational algebra and may involve consulting an external geographic database
that maps the company headquarters and each customer to a 2-D map based on
their address.
• Effectively, each customer will be
associated with a position. A traditional B+-tree
index based on customers’ zip codes or other nonspatial attributes cannot be used
to process this query since traditional indexes are not capable of ordering
multidimensional coordinate data.
• Therefore, there is a special need for databases
tailored for handling spatial data and spatial queries.
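The "customers within twenty miles of headquarters" query can be sketched in a few lines once every customer has a position. This is a minimal illustration, not how a spatial database evaluates the query: it scans every customer, uses the haversine formula as an assumed distance measure, and the function names and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def customers_within(customers, hq, radius_miles):
    """Names of customers whose position is within radius_miles of hq.

    customers maps a name to a (latitude, longitude) pair; hq is one such pair.
    Every customer is checked, which is exactly the cost a spatial index avoids.
    """
    return [name for name, (lat, lon) in customers.items()
            if haversine_miles(lat, lon, hq[0], hq[1]) <= radius_miles]

# Hypothetical positions, for illustration only.
customers = {"near": (33.7490, -84.3880), "far": (34.0522, -118.2437)}
headquarters = (33.7756, -84.3963)
print(customers_within(customers, headquarters, 20))
```

The full scan is the point of the example: without an index that understands two-dimensional positions, there is no way to skip the distant customers.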
Spatial Data Types
• Spatial data comes in three basic
forms. These forms have become a de facto standard due to their wide use
in commercial systems.
• Map data
• Attribute data
• Image data
• Map data includes various
geographic or spatial features of objects in a map, such as an object’s shape
and the location of the object within the map.
• The three basic types of features
are points, lines, and polygons.
• Points are used to represent spatial characteristics
of objects whose locations correspond to a single 2-d coordinate, (x, y) or
(longitude, latitude), in the scale of a particular application.
• Depending on the scale, some
examples of point objects could be buildings, cellular towers, or stationary
vehicles.
• Moving vehicles and other moving
objects can be represented by a sequence of point locations that change over
time.
• Lines represent objects having length, such as roads
or rivers, whose spatial characteristics can be approximated by a sequence of
connected lines.
• Polygons are used to represent spatial characteristics
of objects that have a boundary, such as countries, states, lakes, or cities.
• Notice that some objects, such as
buildings or cities, can be represented as either points or polygons, depending
on the scale of detail.
• Attribute data is the descriptive data that GIS
systems associate with map features.
• For example, suppose that a map
contains features that represent counties within a US state (such as Texas or
Oregon).
• Attributes for each county feature
(object) could include population, largest city/town, area in square miles, and
so on. Other attribute data could be included for other features in the map,
such as states, cities, congressional districts, census tracts, and so
on.
• Image data includes data such as satellite
images and aerial photographs, which are typically created by cameras. Objects
of interest, such as buildings and roads, can be identified and overlaid on
these images.
• Images can also be attributes of map
features. One can add images to other map features so that clicking on the
feature would display the image.
• Aerial and satellite images are
typical examples of raster data.
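One way to picture how the three forms fit together is a single record that carries the geometry (map data), a dictionary of descriptive values (attribute data), and an optional image reference (image data). The class below is a hypothetical sketch; the field names and the sample county values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Feature:
    """A map feature combining map data, attribute data, and image data."""
    kind: str                                  # "point", "line", or "polygon"
    coords: list                               # list of (x, y) tuples
    attributes: dict = field(default_factory=dict)
    image: Optional[str] = None                # e.g. a path to an aerial photo

# Hypothetical county feature; all values are placeholders.
county = Feature(
    kind="polygon",
    coords=[(0, 0), (4, 0), (4, 3), (0, 3)],
    attributes={"name": "Example County", "population": 12345},
)
print(county.kind, county.attributes["population"])
```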
Spatial Models
• Models of spatial information are sometimes grouped into two broad
categories: field and object.
• A spatial application (such as
remote sensing or highway traffic control) is modeled using either a field- or
an object-based model, depending on the requirements and the traditional choice
of model for the application.
• Field models are often used to
model spatial data that is continuous in nature, such as terrain elevation,
temperature data, and soil variation characteristics.
• Object models have traditionally been used for applications
such as transportation networks, land parcels, buildings, and other objects that
possess both spatial and non-spatial attributes.
Spatial Operators
• Spatial operators are used to
capture all the relevant geometric properties of objects embedded in the
physical space and the relations between them, as well as to perform spatial
analysis. Operators are classified into three broad categories:
• Topological operators
• Projective operators
• Metric operators
• Dynamic spatial operators and spatial queries are described after these three categories.
• Topological properties are invariant
when topological transformations are applied.
• These properties do not change after
transformations like rotation, translation, or scaling.
• Topological operators are
hierarchically structured in several levels, where the base level offers
operators the ability to check for detailed topological relations between
regions with a broad boundary, and the higher levels offer more abstract
operators that allow users to query uncertain spatial data independent of the
underlying geometric data model.
• Examples include open (region),
close (region), and inside (point, loop).
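As an illustration of inside (point, loop), the sketch below uses the ray-casting test, one common way to decide whether a point lies inside a closed loop. The choice of algorithm is an assumption, not something the text prescribes, and points lying exactly on the boundary are not treated specially.

```python
def inside(point, loop):
    """Return True if point lies inside the closed loop (ray casting).

    loop is a list of (x, y) vertices; the last vertex connects back to the
    first. A horizontal ray is cast to the right of the point, and the point
    is inside when the ray crosses the boundary an odd number of times.
    """
    x, y = point
    is_inside = False
    n = len(loop)
    for i in range(n):
        x1, y1 = loop[i]
        x2, y2 = loop[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                is_inside = not is_inside
    return is_inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(inside((2, 2), square), inside((5, 2), square))
```

Because the test counts crossings only, the result does not change when the loop and the point are translated, rotated, or scaled together, which matches the invariance described above.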
• Projective operators. Projective
operators, such as convex hull, are used to express predicates about the
concavity/convexity of objects as well as other spatial relations (for example,
being inside the concavity of a given object).
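The convex hull operator can be sketched with Andrew's monotone chain algorithm. This is one standard method chosen here for brevity; the text does not say which algorithm a given system uses, and the function names are illustrative.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Convex hull of 2-d points, in counter-clockwise order.

    Builds the lower and upper chains over the points sorted by (x, y).
    Collinear points on the hull boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

print(convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]))
```

A polygon is convex exactly when every one of its vertices appears on its own hull, so the same routine can back a convexity predicate.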
• Metric operators. Metric operators provide a more
specific description of the object’s geometry. They are used to measure some
global properties of single objects (such as the area, relative size of an
object’s parts, compactness, and symmetry), and to measure the relative
position of different objects in terms of distance and direction.
• Examples include length (arc) and
distance (point, point).
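The two metric operators named above, plus area, can be written directly from their definitions. The sketch assumes planar (x, y) coordinates and Euclidean distance; the area function uses the shoelace formula and assumes a simple (non-self-intersecting) polygon.

```python
import math

def distance(p, q):
    """Euclidean distance between two points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def length(arc):
    """Length of an arc approximated by a sequence of connected points."""
    return sum(distance(a, b) for a, b in zip(arc, arc[1:]))

def area(polygon):
    """Area of a simple polygon given as a list of vertices (shoelace formula)."""
    n = len(polygon)
    twice_area = sum(
        polygon[i][0] * polygon[(i + 1) % n][1]
        - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

print(distance((0, 0), (3, 4)))
print(length([(0, 0), (3, 4), (3, 10)]))
print(area([(0, 0), (4, 0), (4, 4), (0, 4)]))
```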
• Dynamic Spatial Operators. The operations performed by the
operators mentioned above are static, in the sense that the operands are not
affected by the application of the operation.
• For example, calculating the length
of the curve has no effect on the curve itself.
• Dynamic operations alter the objects upon which the
operations act. The three fundamental dynamic operations are create, destroy,
and update.
• A representative example of a dynamic
operation is updating a spatial object. Update can be subdivided into
translate (shift position), rotate (change orientation), scale up or down,
reflect (produce a mirror image), and shear (deform).
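The update sub-operations can be sketched as methods that modify an object's vertices in place, in contrast to the static operators, which leave their operands untouched. The class name and conventions below are assumptions made for illustration: rotation and scaling are about the origin, reflection is across the x-axis, and shear is along the x direction. Create and destroy are not shown.

```python
import math

class SpatialObject:
    """A hypothetical spatial object stored as a list of (x, y) vertices."""

    def __init__(self, vertices):
        self.vertices = [tuple(v) for v in vertices]

    def _apply(self, fn):
        # Dynamic operation: the object's own state is replaced.
        self.vertices = [fn(x, y) for x, y in self.vertices]

    def translate(self, dx, dy):
        self._apply(lambda x, y: (x + dx, y + dy))

    def rotate(self, degrees):
        c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
        self._apply(lambda x, y: (x * c - y * s, x * s + y * c))

    def scale(self, factor):
        self._apply(lambda x, y: (x * factor, y * factor))

    def reflect(self):
        self._apply(lambda x, y: (x, -y))

    def shear(self, k):
        self._apply(lambda x, y: (x + k * y, y))

segment = SpatialObject([(1.0, 0.0), (2.0, 0.0)])
segment.translate(1.0, 1.0)
print(segment.vertices)
```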
• Spatial Queries. Spatial queries are requests for
spatial data that require the use of spatial operations.
• The following categories illustrate
three typical types of spatial queries:
■ Range query. Finds the
objects of a particular type that are within a given spatial area or within a
particular distance from a given location. (For example, find all hospitals
within the Metropolitan Atlanta city area, or find all ambulances within five
miles of an accident location.)
■ Nearest neighbor query. Finds
an object of a particular type that is closest to a given location. (For
example, find the police car that is closest to the location of a crime.)
■ Spatial joins or overlays. Typically
joins the objects of two types based on some spatial condition, such as the
objects intersecting or overlapping spatially or being within a certain
distance of one another. (For example, find all townships located on a major
highway between two cities or find all homes that are within two miles of a
lake.)
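The three query types can be sketched over point objects with brute-force scans. This is a minimal illustration assuming planar coordinates and Euclidean distance; the function names and sample data are hypothetical, and a real system would use a spatial index rather than checking every object.

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def range_query(objects, center, radius):
    """Names of objects within radius of center."""
    return [name for name, pos in objects.items() if dist(pos, center) <= radius]

def nearest_neighbor(objects, location):
    """Name of the object closest to location."""
    return min(objects, key=lambda name: dist(objects[name], location))

def spatial_join(left, right, max_distance):
    """Pairs (a, b) with a from left and b from right within max_distance."""
    return [(a, b)
            for a, pa in left.items()
            for b, pb in right.items()
            if dist(pa, pb) <= max_distance]

# Made-up coordinates for illustration.
ambulances = {"amb1": (1, 1), "amb2": (8, 8), "amb3": (2, 3)}
homes = {"home1": (0, 0), "home2": (9, 9)}
lakes = {"lake1": (1, 0), "lake2": (20, 20)}

print(range_query(ambulances, (0, 0), 5))     # ambulances near an accident
print(nearest_neighbor(ambulances, (7, 7)))   # closest ambulance
print(spatial_join(homes, lakes, 2))          # homes within 2 units of a lake
```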
Spatial Data Indexing
• A spatial index is used to organize
objects into a set of buckets (which correspond to pages of secondary memory),
so that objects in a particular spatial region can be easily located.
• Each bucket has a bucket region, a
part of space containing all objects stored in the bucket.
• The bucket regions are usually
rectangles; for point data structures, these regions are disjoint and they
partition the space so that each point belongs to precisely one bucket.
There are essentially two ways of providing a spatial index:
1. Specialized indexing structures that allow efficient search for data objects based on spatial search operations are included in the database system. These indexing structures would play a similar role to that performed by B+-tree indexes in traditional database systems. Examples of these indexing structures are grid files and R-trees. Special types of spatial indexes, known as spatial join indexes, can be used to speed up spatial join operations.
2. Instead of creating brand new indexing structures, the two-dimensional (2-d) spatial data is converted to single-dimensional (1-d) data, so that traditional indexing techniques (B+-tree) can be used. The algorithms for converting from 2-d to 1-d are known as space-filling curves. We will not discuss these methods in detail.
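To make the 2-d to 1-d conversion concrete, the sketch below uses the Z-order (Morton) curve, one well-known space-filling curve, picked here only as an example. It interleaves the bits of non-negative integer x and y coordinates into a single key; a sorted Python list stands in for the B+-tree, and the names are illustrative.

```python
def z_order(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one integer key.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

# Points that are close in 2-d tend to receive nearby 1-d keys.
points = {"a": (0, 0), "b": (1, 1), "c": (3, 3), "d": (2, 0)}
index = sorted((z_order(x, y), name) for name, (x, y) in points.items())
print(index)
```

The mapping preserves locality only approximately: some neighboring cells end up far apart in key order, which is one reason range queries over such keys usually need a filtering step.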
Spatial Indexing Techniques
• Grid Files: The fixed-grid method divides an n-dimensional
hyperspace into equal-size buckets. The data structure that implements the
fixed grid is an n-dimensional array.
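A two-dimensional fixed grid can be sketched as follows. As a simplification, a dictionary keyed by cell coordinates stands in for the n-dimensional array, so empty cells take no space; the class and method names are hypothetical.

```python
from collections import defaultdict

class FixedGrid:
    """Fixed grid over 2-d points with square cells of side cell_size."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.buckets = defaultdict(list)   # (col, row) -> list of entries

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, name, x, y):
        self.buckets[self._cell(x, y)].append((name, x, y))

    def query_rect(self, xmin, ymin, xmax, ymax):
        """Names of points inside the rectangle, visiting only overlapping cells."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for name, x, y in self.buckets.get((cx, cy), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append(name)
        return hits

grid = FixedGrid(cell_size=10)
grid.insert("a", 1, 1)
grid.insert("b", 5, 5)
grid.insert("c", 12, 3)
print(grid.query_rect(0, 0, 6, 6))
```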
• R-Trees: The R-tree is a
height-balanced tree, which is an extension of the B+-tree for k dimensions,
where k > 1. For two dimensions (2-d), spatial objects are
approximated in the R-tree by their minimum bounding rectangle (MBR),
which is the smallest rectangle, with sides parallel to the coordinate system (x
and y) axes, that contains the object.
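The MBR itself is simple to compute, and so is the rectangle-overlap test that an MBR-based search relies on. The sketch below shows only these two building blocks, not an R-tree; the tuple layout (xmin, ymin, xmax, ymax) and function names are assumptions.

```python
def mbr(points):
    """Minimum bounding rectangle of a set of (x, y) points.

    Returned as (xmin, ymin, xmax, ymax), with sides parallel to the axes.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def mbrs_intersect(a, b):
    """True if two MBRs overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def enclosing_mbr(a, b):
    """Smallest MBR containing both a and b, as a parent tree node would store."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

triangle = [(1, 2), (4, 0), (3, 5)]
print(mbr(triangle))
```

Overlapping MBRs do not guarantee that the underlying objects intersect, so the test acts as a cheap filter before an exact geometric check.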
• Spatial Join Index: A spatial join index precomputes a spatial join
operation and stores the pointers to the related objects in an index structure.
Join indexes improve the performance of recurring join queries over tables that
have low update rates.
Spatial Data Mining
• Spatial data tends to be highly
correlated. For example, people with similar characteristics, occupations, and
backgrounds tend to cluster together in the same neighborhoods.
• The three major spatial data mining
techniques are spatial classification, spatial association, and spatial
clustering.
• The goal of spatial classification is to
estimate the value of an attribute of a relation based on the value of the
relation’s other attributes.
• Spatial association rules are defined in terms of spatial predicates
rather than items. A spatial association
rule is of the form
P1 ∧ P2 ∧ ... ∧ Pn ⇒ Q1 ∧ Q2 ∧ ... ∧ Qm,
where at least one of the Pi’s
or Qj’s is a spatial predicate.
• Spatial clustering attempts to group database objects
so that the most similar objects are in the same cluster, and objects in
different clusters are as dissimilar as possible.
• One application of spatial
clustering is to group together seismic events in order to determine earthquake
faults.
• An example of a spatial clustering
algorithm is density-based clustering, which tries to find clusters
based on the density of data points in a region.
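Density-based clustering can be sketched with a compact version of DBSCAN, a widely used algorithm of this kind, named here as an example rather than as the method the text has in mind. The parameters eps (neighborhood radius) and min_pts (minimum neighbors for a dense point) follow the usual DBSCAN convention; neighbor search is a brute-force scan.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indexes of all points within eps of points[i] (including i itself)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(px - qx, py - qy) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks points in no cluster."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster            # i is a core point: start a cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)   # j is also core: keep growing
        cluster += 1
    return labels

# Made-up event locations: two dense groups and one isolated point.
events = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(events, eps=1.5, min_pts=3))
```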
Applications of Spatial Data
• Spatial data management is useful in
many disciplines, including geography, remote sensing, urban planning, and
natural resource management.
• Spatial database management is
playing an important role in the solution of challenging scientific problems
such as global climate change and genomics.
• Due to the spatial nature of genome
data, GIS and spatial database management systems have a large role to play in
the area of bioinformatics.
• Some of the typical applications
include pattern recognition (for example, to check if the topology of a
particular gene in the genome is found in any other sequence feature map in the
database), genome browser development, and visualization maps.
• Another important application area of spatial
data mining is spatial outlier detection. A spatial outlier is a
spatially referenced object whose nonspatial attribute values are significantly
different from those of other spatially referenced objects in its spatial
neighborhood.
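The definition of a spatial outlier suggests a simple test: compare each object's nonspatial value with the average value of its spatial neighbors, then flag objects whose difference is unusually large. The sketch below is one possible reading of that idea, not a method given in the text; the neighborhood radius, the z-score threshold of 2.0, and the sample data are all assumptions.

```python
import math
from statistics import mean, pstdev

def spatial_outliers(objects, radius, threshold=2.0):
    """Names of objects whose value differs strongly from their neighborhood.

    objects maps a name to ((x, y), value). For each object, the difference
    between its value and the mean value of neighbors within radius is
    computed; differences are standardized and compared with threshold.
    """
    diffs = {}
    for name, (pos, value) in objects.items():
        neighbor_values = [
            v for other, (p, v) in objects.items()
            if other != name
            and math.hypot(pos[0] - p[0], pos[1] - p[1]) <= radius
        ]
        if neighbor_values:
            diffs[name] = value - mean(neighbor_values)
    values = list(diffs.values())
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [name for name, d in diffs.items() if abs(d - mu) / sigma > threshold]

# Hypothetical houses along a street; h3 has an unusual value for its location.
houses = {f"h{i}": ((i, 0), 100) for i in range(7)}
houses["h3"] = ((3, 0), 200)
print(spatial_outliers(houses, radius=1.0))
```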