TITLE: A Common Datastructure for European Floristic Databases (CDEFD)

COORDINATOR: Dr. W. G. Berendsohn, BGBM

START DATE: 01.08.1993

DURATION: 33 months

FUNDING: European Commission, 3rd Framework Programme, DG XII, Biotechnology

OBJECTIVES AND ACHIEVEMENTS:

Researchers in chemistry, molecular biology, ecology and many other branches of science use materials obtained from organisms as the basis of their studies and subsequently store their research results in databases. Their source material often consists of samples taken from natural history collections, or it is vouchered in such collections to ensure proper identification of the organisms. This material includes plant, animal, or paleontological specimens in natural history collections, culture collections of microbial strains, botanical or zoological garden collections, natural product collections, etc.

In almost all of these fields, Europe owns the most extensive collections of such specimens worldwide. To facilitate access to these resources, electronic inventories are needed. However, only a few significant efforts in this direction have been undertaken in Europe so far, at least compared to the United States, where a policy decision towards the creation of such inventories was taken long ago.

To achieve interoperability of present and future databases, common data models and standards are needed. At the outset of the project, the objective of CDEFD was to develop project-independent structures to be used in the design of floristic databases and databases including floristic data. In the course of the project, this was extended to include biological collections in general, because it was realized that all objects or samples obtained from organisms share the same core data structure.

By means of a CASE (Computer Aided Software Engineering) program the results were recorded in a constantly updated information model, which consists mainly of diagrams depicting the complex data structures. Accompanying text and tables further define the contents and meaning of structural elements.

The CDEFD datamodel for biological collections is based upon a proposed structure of elemental interrelated types of "biological objects": scientific plant name, potential taxon (plant name with circumscription reference), collection or observation site, and unit (a physical object in a collection or in the field). All results of biological studies or information gathered in administrative or decision-making processes are linked to at least one of these objects.
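The four object types and the "linked to at least one" rule can be illustrated with a minimal Python sketch. It is not part of the CDEFD model itself: all class and field names (PlantName, PotentialTaxon, Site, Unit, Result, linked_objects) are assumptions chosen for illustration, and the real model defines far more attributes and relationships.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class PlantName:
    """A scientific plant name (hypothetical, heavily simplified)."""
    name: str


@dataclass(frozen=True)
class PotentialTaxon:
    """A plant name together with a circumscription reference."""
    name: PlantName
    circumscription_reference: str


@dataclass(frozen=True)
class Site:
    """A collection or observation site."""
    description: str


@dataclass(frozen=True)
class Unit:
    """A physical object in a collection or in the field."""
    unit_id: str
    site: Optional[Site] = None


# Any of the four elemental "biological objects".
BiologicalObject = Union[PlantName, PotentialTaxon, Site, Unit]


@dataclass
class Result:
    """A study result or administrative record (illustrative only)."""
    description: str
    linked_objects: List[BiologicalObject]

    def __post_init__(self) -> None:
        # Enforce the rule that every result is linked to at least one object.
        if not self.linked_objects:
            raise ValueError("a result must be linked to at least one biological object")
```

For example, Result("chromosome count", [PlantName("Bellis perennis L.")]) is accepted, whereas a result with an empty list of linked objects raises a ValueError.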

"Plant Name" and "Potential Taxon Name" are objects which have been treated in detail by a separate project (IOPI GPC Taxonomic Information Model). The emphasis of the CDEFD model lies on the Units, which have been analysed in full detail. Units may be derived from other units, in processes like duplication or extraction. This is modelled by means of a reflexive relationship, with an entity "Derived Unit Creation Event" used to provide further details on the process. Unit-subtypes (entities which may be added to the core unit's attributes) are defined for different classes of parameters, ranging from quantification data to chemical identification procedures and results. Collection management was analysed, including storage parameters and transaction management of units in and between biological collections, e.g. the exchange of herbarium material or the distribution of strains from a culture collection. At some point in time, any material in a collection has been obtained from a field source. This process is termed "gathering", and it relates the unit to a system of geographic data items.
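The reflexive unit relationship and the gathering process can be sketched as follows. This is a hedged illustration in Python, not the CDEFD entity definitions: the names CollectionUnit, GatheringEvent, created_by, derive and field_origin are hypothetical, and only DerivedUnitCreationEvent mirrors an entity named in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatheringEvent:
    """Obtaining material from a field source (simplified to a site description)."""
    site_description: str


@dataclass
class CollectionUnit:
    """A unit; it is either gathered in the field or derived from another unit."""
    unit_id: str
    gathering: Optional[GatheringEvent] = None
    created_by: Optional["DerivedUnitCreationEvent"] = None


@dataclass
class DerivedUnitCreationEvent:
    """Details of the process that produced a derived unit from a parent unit."""
    parent: CollectionUnit
    process: str  # e.g. "duplication" or "extraction"


def derive(parent: CollectionUnit, unit_id: str, process: str) -> CollectionUnit:
    """Create a new unit linked back to its parent through a creation event."""
    event = DerivedUnitCreationEvent(parent=parent, process=process)
    return CollectionUnit(unit_id=unit_id, created_by=event)


def field_origin(unit: CollectionUnit) -> Optional[GatheringEvent]:
    """Follow the derivation chain upwards until a gathered unit is found."""
    current: Optional[CollectionUnit] = unit
    while current is not None:
        if current.gathering is not None:
            return current.gathering
        current = current.created_by.parent if current.created_by else None
    return None
```

In this sketch, a herbarium specimen would carry a GatheringEvent directly, while a duplicate of it, or an extract taken from that duplicate, reaches the same gathering through field_origin. Keeping the process details on a separate event object, rather than on the unit itself, follows the description of the "Derived Unit Creation Event" entity above.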

Problems surfaced in the attempt to define and standardize the information relating to the geographic description and location of collection sites. Experts were consulted and specialized geographic information systems were evaluated. The group came to the conclusion that commercially available geographic information systems (GIS) should be used whenever detailed coverage of geographic information is required. Such systems can be linked to relational databases, so that integration with the model for collection data can be achieved. Nevertheless, a tentative detailed datastructure for geographic data relating to collection sites has been included in the model.

Links with information outside the realm of collection data were evaluated in detail by looking at the structure of chemical data (secondary metabolites), karyological data, and ecophysiological data. This information was modelled from two points of view: the data about organisms produced by research studies, and the data which is used during the study itself. The latter discussion led to the development of a generalized model for experimental studies. The discussion of linked information provided important feedback for the modelling process of core biological collection data.

The CDEFD model is considered to be basic research, i.e. it is not meant to provide a model for direct implementation. Instead, the complex research model provides a general framework for the planning of specific databases. In addition, the model supplies guidelines for the definition of data fields and thus facilitates the discussion on data standards.

PUBLICATIONS AND PRESENTATIONS:

Total: 15 Joint: 13

A. Anagnostopoulos, W. Berendsohn, G. Hagedorn, J. Jakupovic, P. L. Nimis & B. Valdés, Field data in the CDEFD Model. Poster, ESF Workshop Amsterdam 24.-27. 3. 1996.

W. Berendsohn, Core Features of the CDEFD Information Model for Biological Collections. Lecture during the European Science Foundation (ESF), Disseminating Biodiversity Information Workshop, Amsterdam 24.-27. 3. 1996.

W. Berendsohn, A. Anagnostopoulos, G. Hagedorn, J. Jakupovic, P. L. Nimis & B. Valdés, A Framework for Biological Data. Poster, ESF Workshop Amsterdam 24.-27. 3. 1996.

W. Berendsohn, A. Anagnostopoulos, J. Jakupovic, P. L. Nimis & B. Valdés, A Datamodel for Botanical Collections. Poster presented during the meeting of the Organization for Phyto-Taxonomic Investigation of the Mediterranean Area (OPTIMA), Seville 25. - 29. 9. 1995 (proceedings in preparation) and during the XII meeting of the IUBS Commission for Taxonomic Databases (TDWG), in Madrid 4.-6. 10. 1995.

W. Berendsohn, A. Anagnostopoulos, J. Jakupovic, P. L. Nimis & B. Valdés, Standardizing Botanical Collection Data. Poster presented during the XII meeting of the IUBS Commission for Taxonomic Databases (TDWG) in Madrid 4.-6. 10. 1995.

W. Berendsohn, A. Anagnostopoulos, J. Jakupovic, P. L. Nimis & B. Valdés, The CDEFD Information Model for Biological Collections. In: W. Los (ed.): Proceedings from the European Science Foundation Workshop Disseminating Biodiversity Information Workshop, Amsterdam 24.-27. 3. 1996. (in press).

W. Berendsohn, J. Greilhuber, A. Anagnostopoulos, G. Bedini, J. Jakupovic, P. L. Nimis & B. Valdés, A comprehensive datamodel for karyological databases. Pl. Syst. Evol. (in press, 1996).

S. Elankovan, W. Berendsohn & H. Meyer, Person Teams: An Implementation. Poster, ESF Workshop Amsterdam 24.-27. 3. 1996.

S. Elankovan, W. Berendsohn & H. Meyer, Person Teams: An Implementation. In: W. Los (ed.): Proceedings from the European Science Foundation Workshop Disseminating Biodiversity Information Workshop, Amsterdam 24.-27. 3. 1996. (in press).

G. Hagedorn, A. Anagnostopoulos, W. Berendsohn, J. Jakupovic, P. L. Nimis & B. Valdés, Culture Collections in the CDEFD Model. Poster, ESF Workshop Amsterdam 24.-27. 3. 1996.

J. Jakupovic, A. Anagnostopoulos, W. Berendsohn, G. Hagedorn, P. L. Nimis & B. Valdés, CDEFD: Collections of Natural Products. Poster, ESF Workshop Amsterdam 24.-27. 3. 1996.

P. L. Nimis, A. Anagnostopoulos, W. Berendsohn, G. Hagedorn, J. Jakupovic & B. Valdés, Units: A Lichen Exsiccatum in Trieste. Poster, ESF Workshop, Amsterdam 24.-27. 3. 1996.

H. Meyer, W. Berendsohn & S. Elankovan, CDEFD: Taxonomic Identification. Poster, ESF Workshop Amsterdam 24.-27. 3. 1996.

H. Meyer, W. Berendsohn & S. Elankovan, CDEFD: Taxonomic Identification. In: W. Los (ed.): Proceedings from the European Science Foundation Workshop Disseminating Biodiversity Information Workshop, Amsterdam 24.-27. 3. 1996. (in press).

B. Valdés, A. Anagnostopoulos, W. Berendsohn, G. Hagedorn, J. Jakupovic & P. L. Nimis, Collection Management in the CDEFD Model. Poster, ESF Workshop Amsterdam 24.-27. 3. 1996.

KEYWORDS:

information model/ data model/ CASE tools/ databases/ biological information/ biological collections/ natural history collections/ collection databases/ culture collections/ secondary metabolite collections/ floristic collections/ zoological collections/ entomological collections/ microbiological collections/ karyological data

CORE PROJECT MEMBERS:

W. BERENDSOHN (COORDINATOR), Botanical Garden and Botanical Museum Berlin-Dahlem (BGBM), Königin-Luise-Str. 6-8, D - 14195 Berlin, Germany

J. JAKUPOVIC, AnalytiCon GmbH, Gustav-Meyer-Allee 25, D - 13355 Berlin, Germany

P.-L. NIMIS, Biology Department, University of Trieste, 10 Via Giorgieri, I - 34127 Trieste, Italy

D. PHITOS and A. ANAGNOSTOPOULOS, Botanical Institute, University of Patras, GR - 26500 Patras, Greece

B. VALDÉS, Departamento de Biologia Vegetal y Ecologia, Universidad de Sevilla, Apartado de Correos 1095, E - 41012 Sevilla, Spain.

Acknowledgements to contributors in meetings:

G. Bedini, Pisa; F. Bisby, Southampton; J. Cooper, Egham; E. Feoli, Trieste; P. Ganis, Trieste; D. Green, Albury; J. Greilhuber, Vienna; H. Haeupler, Bochum; H. Hovenkamp, Leiden; H. Kremers, Berlin; J. Lebbe, Paris; W. Loader, Kew; R. May, Bonn; D. Minter, Egham; C. Oberprieler, Berlin; E. Pastor, Seville; D. Phitos, Patras; R. Pankhurst, Edinburgh; G. Rambold, Munich; J. le Renard, Paris; A. Rissone, London; J. Rubio Recio, Seville; K. Siems, Berlin; M. Tretiach, Trieste.


This page last updated Sept. 29, 1996. Contact: W. Berendsohn, wgb@zedat.fu-berlin.de