How geographical information (GI) data is stored and managed largely determines how much value can be obtained from it and how easily it can be shared between different user groups.
When storing GI data, the important areas to consider are:
- Types of storage and data structure
- Geospatial data formats
- Metadata
Types of storage and data structure
Database systems should be efficient tools for the storage, analysis and reporting of your data. As a result, the choice of database package and data structure used in a given project should be dictated by the requirements of each organisation.
Data structure
Data structures currently fall into four major types:
- Flat file
- Hierarchical
- Relational
- Object-oriented
Some information is provided by the Archaeology Data Service, and a more detailed discussion can be found in the book Fundamentals of Spatial Information Systems (Laurini and Thompson 1996).
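The four structure types listed above can be illustrated with a short Python sketch. The sites, finds and field names are invented for illustration, and the standard-library `sqlite3` module simply stands in for whichever relational database package an organisation uses.

```python
import sqlite3
from dataclasses import dataclass

# Flat file: one self-contained row per record; site details repeat on every row.
flat_rows = [
    {"find_id": 1, "site": "Site A", "material": "pottery"},
    {"find_id": 2, "site": "Site A", "material": "flint"},
]

# Hierarchical: each child record hangs from exactly one parent.
hierarchical = {
    "Site A": {"finds": [{"find_id": 1, "material": "pottery"},
                         {"find_id": 2, "material": "flint"}]},
}

# Relational: separate tables linked through a key field (site_id).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (site_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE finds (find_id INTEGER PRIMARY KEY, "
    "site_id INTEGER REFERENCES sites(site_id), material TEXT)"
)
conn.execute("INSERT INTO sites VALUES (1, 'Site A')")
conn.executemany("INSERT INTO finds VALUES (?, ?, ?)",
                 [(1, 1, "pottery"), (2, 1, "flint")])
relational_rows = conn.execute(
    "SELECT f.find_id, s.name, f.material "
    "FROM finds f JOIN sites s ON s.site_id = f.site_id "
    "ORDER BY f.find_id"
).fetchall()


# Object-oriented: data and the behaviour that acts on it live together.
@dataclass
class Find:
    find_id: int
    material: str


@dataclass
class Site:
    name: str
    finds: list

    def materials(self):
        """Return the distinct materials recorded at this site."""
        return sorted({find.material for find in self.finds})


site = Site("Site A", [Find(1, "pottery"), Find(2, "flint")])
```

The same two finds appear in every representation; what changes is where the site information lives and how it is linked to each find.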
Attribute database structuring
Attribute data is often captured separately from the geometry simply because it is more cost effective. An essential requirement for separate data entry is a common identifier that can be used to relate object geometry and attributes to each other after data capture.
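As a minimal sketch of how a common identifier relates the two, the Python below joins separately captured geometry and attribute records. The `feature_id` field, the coordinates and the land-use values are hypothetical.

```python
# Geometry captured in one pass, keyed by the shared identifier.
geometries = {
    "F001": {"type": "Point", "coordinates": [451200.0, 206300.0]},
    "F002": {"type": "Point", "coordinates": [451950.0, 207120.0]},
}

# Attributes captured separately, each row carrying the same identifier.
attributes = [
    {"feature_id": "F001", "land_use": "arable"},
    {"feature_id": "F002", "land_use": "woodland"},
    {"feature_id": "F004", "land_use": "pasture"},  # no matching geometry
]


def join_on_identifier(geometries, attributes, key="feature_id"):
    """Pair each attribute row with its geometry; report identifiers with no match."""
    joined, unmatched = [], []
    for row in attributes:
        geometry = geometries.get(row[key])
        if geometry is None:
            unmatched.append(row[key])
        else:
            joined.append({"geometry": geometry, "properties": row})
    return joined, unmatched


joined, unmatched = join_on_identifier(geometries, attributes)
```

Reporting the unmatched identifiers, rather than silently dropping them, makes data-entry errors in the key field visible at the point of integration.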
Combining and integrating attribute databases from different sources relies on the implementation of data standards to ensure that they have a compatible field structure. When structuring and organising an attribute database, the following factors are therefore critical:
- Naming conventions
- Key fields
- Character field definitions
- Grid references
- Validation
- Numeric data
- Data entry control
- Confidence values
- Consistency
- Documentation
- Dates
The Archaeology Data Service discusses each of these issues in turn.
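Several of the factors above (key fields, grid references, dates, confidence values and validation) can be enforced at data entry. The following Python sketch is illustrative only: the field names, the simplified two-letter grid reference pattern and the permitted confidence values are assumptions, not rules taken from the Archaeology Data Service.

```python
import re
from datetime import date

# Simplified pattern (assumption): two letters followed by an even number of digits.
GRID_REF = re.compile(r"^[A-Z]{2}(\d{2}){1,5}$")
CONFIDENCE_VALUES = {"high", "medium", "low"}  # hypothetical controlled vocabulary


def validate_record(record):
    """Return a list of problems found in one attribute record."""
    errors = []
    if not record.get("site_id"):
        errors.append("site_id: key field is missing")
    grid_ref = str(record.get("grid_ref", "")).replace(" ", "").upper()
    if not GRID_REF.match(grid_ref):
        errors.append("grid_ref: not in the expected format")
    try:
        date.fromisoformat(str(record.get("recorded_on", "")))
    except ValueError:
        errors.append("recorded_on: not a valid YYYY-MM-DD date")
    if record.get("confidence") not in CONFIDENCE_VALUES:
        errors.append("confidence: value is not in the controlled list")
    return errors
```

For example, `validate_record({"site_id": "S001", "grid_ref": "SU 512 063", "recorded_on": "2004-06-15", "confidence": "high"})` returns an empty list, while a record with a blank key field or a date written as `15/06/2004` is reported.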
Geospatial data formats
One of the biggest problems with sharing GI data is that it can be encoded in many different formats. Typically, GI data is stored in proprietary, system-specific formats and visualised with GIS, CAD or remote sensing software. Popular formats include:
| Vector | Raster |
|---|---|
| ESRI Shapefile | Windows Bitmap (BMP) |
| MapInfo TAB, MID, MIF | ERDAS Imagine (IMG) |
| Encapsulated PostScript (EPS) | Portable Network Graphics (PNG) |
| AutoCAD Drawing (DWG) | Graphics Interchange Format (GIF) |
| AutoCAD Drawing Exchange Format (DXF) | JPEG File Interchange Format (JFIF) |
| Vector Product Format (VPF) | Multi-resolution Seamless Image Database (MrSID) |
There are many formats and no uniformly accepted industry-wide standards. However, a number of popular data formats used for geospatial data are considered to be de facto or proprietary standards, in that they are widely used, freely available and openly published. They are, however, still under the control of commercial organisations who may change the specification at any time. Examples of such formats include:
File size also varies across the different formats, and GI data files are typically much larger than other kinds of data, which can cause storage problems.
Translation tools
Given the high cost of creating geographical databases, there has been high demand for tools to transfer data between systems. This has led to the development of translation software, either using a direct read into memory or through an intermediate file format.
Geographic translation software must address both syntactic translation (converting specific digital symbols) and semantic translation (converting the meaning inherent in geographic information).
Translation is not always trouble-free: users may end up with corrupted media, incomplete data files, the wrong version of a translator, or differing interpretations of a format specification. This highlights the need for common file interchange formats.
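The difference between syntactic and semantic translation can be sketched in Python. Everything specific here is hypothetical: the semicolon-delimited source record, the land-use codes and the mapping table are invented, and a GeoJSON-style feature is used as the intermediate representation purely as an example.

```python
import json

# Semantic step: hypothetical mapping from source codes to the target vocabulary.
SOURCE_TO_TARGET_LAND_USE = {"A1": "arable", "W2": "woodland"}


def parse_source_line(line):
    """Syntactic step: decode a hypothetical 'id;lon;lat;code' text record."""
    feature_id, lon, lat, code = line.strip().split(";")
    return {"id": feature_id, "lon": float(lon), "lat": float(lat), "code": code}


def translate(line):
    """Translate one source record into a GeoJSON-style feature."""
    record = parse_source_line(line)
    land_use = SOURCE_TO_TARGET_LAND_USE.get(record["code"])
    if land_use is None:
        # Fail loudly rather than pass on a code the target system cannot interpret.
        raise ValueError(f"no semantic mapping for code {record['code']!r}")
    return {
        "type": "Feature",
        "id": record["id"],
        "geometry": {"type": "Point", "coordinates": [record["lon"], record["lat"]]},
        "properties": {"land_use": land_use},
    }


feature = translate("F001;-1.2577;51.7520;A1")
feature_json = json.dumps(feature, indent=2)
```

Parsing the record changes only its encoding; mapping `A1` to `arable` changes how its meaning is expressed, and this second step is where differing interpretations between systems tend to surface.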
Open source formats
A lack of open source formats for GI data not only creates problems for sharing data, but also for long-term preservation of this type of data.
The Open Geospatial Consortium (OGC) promotes the development and use of advanced open standards and techniques, which encourage and enable interoperability among and between diverse data stores, services, applications and organisations.
XML and GML are examples of standard open data exchange formats that are becoming widely favoured by data suppliers such as Ordnance Survey, enabling creation of a level playing field for producers of software and translators.
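To give a sense of what GML looks like, the Python sketch below builds a single point geometry with the standard-library XML module. The coordinates are invented, and the use of the `http://www.opengis.net/gml` namespace with British National Grid (`EPSG:27700`) is an assumption made for illustration; real supplier data follows the schema published with the product.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML_NS)


def gml_point(easting, northing, srs_name="EPSG:27700"):
    """Build a gml:Point element holding one coordinate pair."""
    point = ET.Element(f"{{{GML_NS}}}Point", {"srsName": srs_name})
    pos = ET.SubElement(point, f"{{{GML_NS}}}pos")
    pos.text = f"{easting} {northing}"
    return point


# Serialise to text, as it would appear in an exchange file.
xml_text = ET.tostring(gml_point(451200, 206300), encoding="unicode")
```

Because the result is plain, openly specified XML, any software that can parse XML can read the geometry without a vendor-specific library.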
The OGC has also created open standards to support web mapping interoperability:
- The Web Map Service (WMS) allows a client to request map images from multiple Web Map Services on the internet and overlay them for display.
- The Web Feature Service (WFS) allows a client to retrieve geospatial data encoded in Geography Markup Language (GML) from multiple Web Feature Services on the internet.
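Both services are accessed through HTTP requests with standard parameters. The Python sketch below builds a WMS 1.3.0 `GetMap` request and a WFS 2.0.0 `GetFeature` request; the server address, layer name and feature type name are hypothetical, and real services advertise their own in a `GetCapabilities` response.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   crs="EPSG:27700", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap request URL for a rendered map image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


def wfs_getfeature_url(base_url, type_name, count=10):
    """Build a WFS 2.0.0 GetFeature request URL for feature data."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
        "COUNT": count,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoints and names, for illustration only.
map_url = wms_getmap_url("https://example.org/wms", ["landuse"],
                         (450000, 205000, 455000, 210000), 800, 800)
feature_url = wfs_getfeature_url("https://example.org/wfs", "ns:landuse")
```

A client that overlays maps from several servers issues one `GetMap` request per server with the same bounding box, coordinate reference system and image size, then stacks the returned images.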