CDEFD

The Gathering Site: Geographical and Ecological Data

A detailed and complete coverage of all the data items which may be incorporated into the geographical and ecological site description would clearly exceed the scope of this model. The US Federal Geographic Data Committee (FGDC 1994) lists more than 300 individual data elements and compound elements for geospatial data alone. CDEFD came to the conclusion that for extensive coverage of geographical and other data related to field collections, a database system should rely on one of the available commercial GIS (geographic information system) programs. However, site data cannot be excluded from this model, because they are too important (e.g. in herbarium collections) or even form an essential part of the objective of the task (floristic mapping). The model as presented here is intended to aid in the definition of requirements for a linked GIS or for the basic structures to be implemented in a proprietary database design. Roughly, four data areas may be distinguished: co-ordinate data, gazetteer data, geo-ecological classification units, and ecological site descriptors.

These four data areas provide the framework for the entity relation model of site information (Diagram 8). Gazetteer data, geo-ecological classification units, and ecological site descriptors are handled by entities named accordingly. Co-ordinate data are included in the entity Gathering Site.

Specialized collections may rely on a very specific sampling scheme, for example, vouchers taken from numbered trees in a forest sample plot, or microbial samples from a defined spot on a dunghill. To keep the site information general, such data will normally be specified in the attribute individual low-scale locality description of the Gathering or Field Unit entity. Information like depth below surface for underground or underwater collections belongs here as well.


Diagram 8: Gathering site-related entities (Entities)


Because the degree of detail recorded varies widely, the general data model must be kept very flexible to allow for adaptation to specific needs without sacrificing overall compatibility. This is achieved by allowing the major entities involved to be subtyped. Thus, user-defined as well as standard data tables may be incorporated as they become available. Examples of subtypes include:

The definition of the entities themselves and of their relation to the Gathering Event is complicated by the fact that geo-ecological data are time-related. Co-ordinate data are stable over time, and for named areas a validity period may be defined in the gazetteer itself. However, the Geo-ecological Classification Unit a site belongs to may change rapidly, and Ecological Descriptors may be highly dependent on the observation date. In such cases the user has to define a new Gathering Site to link the appropriate information.


Diagram 9: Site data


The user defined site name provides a shortcut to a previously used gathering site in an implemented system.

For individual geographic area names and categories, names in different languages may exist (attributes language, area category name in language, area name in language). One of these synonyms has to be marked by means of a default name flag for a defined system. The area category is a designation which may or may not be cited with the name, e.g. "Department"; "Kreis"; "Municipio"; "Eparchia"; "Island"; "TDWG Botanical Country". Area categories may be necessary to identify a specific area (e.g. "New York", city and state).
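The role of the default name flag can be illustrated with a minimal Python sketch. The class name AreaName, the function pick_name, and the sample records are assumptions made for illustration only; the fields merely paraphrase the attributes listed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AreaName:
    """One language-specific name of a geographic area (hypothetical layout)."""
    language: str
    area_name: str
    area_category_name: str
    is_default: bool = False  # the "default name flag"


def pick_name(names: List[AreaName], language: Optional[str] = None) -> AreaName:
    """Return the name in the requested language, else the flagged default."""
    if language is not None:
        for name in names:
            if name.language == language:
                return name
    defaults = [name for name in names if name.is_default]
    if len(defaults) != 1:
        raise ValueError("exactly one synonym must carry the default name flag")
    return defaults[0]


# Illustrative records, not taken from any gazetteer.
names = [
    AreaName("de", "Bayern", "Bundesland", is_default=True),
    AreaName("en", "Bavaria", "Federal state"),
]
```

Requesting a language for which no synonym is stored falls back to the flagged default, which is why exactly one synonym has to carry the flag.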

Area validation data can be used for quality control. It is necessary to cite the area circumscription validity period, because named areas may change significantly over time. The actual circumscription (planar and altitudinal) of the area could be expressed in a GIS as a series of vectors describing the perimeter of the area in some kind of co-ordinate system. The maximum/minimum data given here can easily be stored in a normal database and are very useful for the validation of the input of point locations.
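As a rough sketch of how such maximum/minimum data might be used to validate a point location, consider the following Python fragment. The class AreaValidation, its field names, the function check_point, and the sample extent are all hypothetical; an area that straddles the 180° meridian would need extra handling that is not shown here.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class AreaValidation:
    """Planar and altitudinal extent of a named area (hypothetical layout)."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    min_alt_m: float
    max_alt_m: float
    valid_from: Optional[date] = None  # start of circumscription validity
    valid_to: Optional[date] = None    # end of circumscription validity


def check_point(area: AreaValidation, lat: float, lon: float,
                alt_m: float, gathered_on: date) -> List[str]:
    """Return a list of validation problems; an empty list means no conflict."""
    problems = []
    if not area.min_lat <= lat <= area.max_lat:
        problems.append("latitude outside area extent")
    if not area.min_lon <= lon <= area.max_lon:
        problems.append("longitude outside area extent")
    if not area.min_alt_m <= alt_m <= area.max_alt_m:
        problems.append("altitude outside area extent")
    if area.valid_from is not None and gathered_on < area.valid_from:
        problems.append("gathering date precedes the area's validity period")
    if area.valid_to is not None and gathered_on > area.valid_to:
        problems.append("gathering date follows the area's validity period")
    return problems


# Invented extent for illustration only, not real gazetteer data.
area = AreaValidation(47.0, 51.0, 9.0, 14.0, 100.0, 3000.0,
                      valid_from=date(1949, 1, 1))
```

A point that passes this check is only plausible, not proven to lie within the area, since the true perimeter is generally smaller than its bounding extent.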

Geo-ecological classification data include a defined category, e.g. "Formation", its value, e.g. "Gallery forest" and, preferably, a bibliographical reference detailing the system used (e.g. "Braun-Blanquet & Fuller 1929"). A classification usually involves a hierarchy, so a pointer to a higher category may be used.
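The pointer to a higher category might be represented as in the following Python sketch. The class ClassificationUnit and the function lineage are assumed names, and the parent unit ("Formation class", "Forest") is invented to show the hierarchy; only the strings "Formation", "Gallery forest" and the reference are the examples given above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClassificationUnit:
    """A geo-ecological classification unit (hypothetical layout)."""
    category: str
    value: str
    reference: Optional[str] = None                # bibliographical reference
    higher: Optional["ClassificationUnit"] = None  # pointer to higher category


def lineage(unit: Optional[ClassificationUnit]) -> List[ClassificationUnit]:
    """Follow the pointers upward and return the path from top level to unit."""
    path = []
    seen = set()
    while unit is not None:
        if id(unit) in seen:
            raise ValueError("classification hierarchy contains a cycle")
        seen.add(id(unit))
        path.append(unit)
        unit = unit.higher
    path.reverse()
    return path


# The parent unit is invented for illustration.
parent = ClassificationUnit("Formation class", "Forest")
leaf = ClassificationUnit("Formation", "Gallery forest",
                          reference="Braun-Blanquet & Fuller 1929",
                          higher=parent)
```

Storing only the pointer to the next higher unit keeps each record small; the full hierarchy is recovered by walking upward when it is needed.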

For ecological site descriptors only a very general structure is provided here, consisting of the basic data elements present for any descriptor: name and value (i.e. character and character state).

The geographic area subtype designator tells the application program which of the possible area subtypes is to be included. Analogous subtype designations are given for geo-ecological classifications and for ecological descriptors.

Diagram 10 defines basic attributes which may be used to define geospatial co-ordinates in the entity Gathering Site. Altitude and co-ordinate precision modifiers include expressions such as "about", "ca.", etc. The grid precision modifier is used to record uncertainty in grid assignment or proximity to a grid line (indicating the possibility of duplicate recording in different grid cells). The grid cell code together with the grid system name defines a location within a grid cell (e.g. "UTM" and "UF19", or "German MTB (Meßtischblatt)" and "7413/14").

All geographical co-ordinate systems (including polar co-ordinates) can be expressed by the co-ordinate system's name, the altitude, and two floating point numbers (x- and y-value). North/South and East/West in geographical co-ordinates can be expressed by positive and negative values, respectively. In a general implementation, the definition, data entry rules, and formatting rules for the x- and y-value can be defined in a separate entity Co-ordinate System. For the standard Greenwich geographical co-ordinate system, this entity would contain the information: Co-ordinate system name: "Std. Geographical"; Central meridian: 0; Prefix for X: "Latitude"; Suffix for X positive: "N"; Suffix for X negative: "S"; Prefix for Y: "Longitude"; Suffix for Y positive: "E"; Suffix for Y negative: "W"; Number formatting: "degree-minute-second" (other choices are: degree-decimals; decimals). The input formatting routine for degree-values should accept floating point values in the degrees or minutes part (e.g. 41.50° should be understood as 41° 30'), thus allowing a mixed data entry of degree-minute-second, degree-minute (with fractional part) and degree-decimal.
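The mixed data entry rule and the prefix/suffix formatting can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the class CoordinateSystem and the functions to_decimal_degrees and format_dms are assumed names, and the fields simply mirror the items listed for the "Std. Geographical" system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoordinateSystem:
    """Entry and formatting rules for x- and y-values (hypothetical layout)."""
    name: str
    central_meridian: float
    prefix_x: str
    suffix_x_positive: str
    suffix_x_negative: str
    prefix_y: str
    suffix_y_positive: str
    suffix_y_negative: str
    number_formatting: str


STD_GEOGRAPHICAL = CoordinateSystem(
    name="Std. Geographical", central_meridian=0.0,
    prefix_x="Latitude", suffix_x_positive="N", suffix_x_negative="S",
    prefix_y="Longitude", suffix_y_positive="E", suffix_y_negative="W",
    number_formatting="degree-minute-second",
)


def to_decimal_degrees(degrees: float, minutes: float = 0.0,
                       seconds: float = 0.0, negative: bool = False) -> float:
    """Combine possibly fractional degree, minute and second parts.

    The sign is passed separately because a value such as 0 degrees 15 minutes
    West cannot carry its sign in the degrees part.
    """
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -value if negative else value


def format_dms(value: float, prefix: str,
               suffix_positive: str, suffix_negative: str) -> str:
    """Format a signed decimal value as degree-minute-second with a suffix."""
    suffix = suffix_negative if value < 0 else suffix_positive
    total_seconds = round(abs(value) * 3600)  # rounded to whole seconds
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{prefix}: {degrees}° {minutes}' {seconds}\" {suffix}"


cs = STD_GEOGRAPHICAL
x = to_decimal_degrees(41.50)  # degree-decimal entry, same as 41° 30'
print(format_dms(x, cs.prefix_x, cs.suffix_x_positive, cs.suffix_x_negative))
# prints: Latitude: 41° 30' 0" N
```

Because every part is reduced to a single signed floating point value before storage, the three entry styles yield the same stored x- or y-value, and the choice of display format remains a property of the Co-ordinate System entity.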

The measurement method refers to the source of co-ordinate and/or grid values, and contains entries such as: "estimation from map", "estimation from map with altitude by barometric altimeter", "GPS", "GPS with local reference", etc. Co-ordinate measurement error contains a numeric value read as plus/minus measurement error of the co-ordinate values.


Diagram 10: Data structure of geospatial co-ordinates


Descriptive information about the position of the site which cannot be accommodated by means of the gazetteer or co-ordinates is recorded in the attribute site location detail (e.g. "Roadside, road between Jucurán and Casas Viejas, about 2 km from Jucurán").


Last updated: April 29, 1996, wgb@zedat.fu-Berlin.de