1
ABCD Schema
  • Access to Biological Collection Data
  • Content Definition

    TDWG Annual Meeting 2004, Walter G. Berendsohn, ABCD subgroup convenor, 10 Nov. 2004
2
Agenda
  • Introduction (design principles)
  • What happened since Oeiras?
  • Structure of ABCD
    • Mandatory/Recommended elements
    • Datasets – Dataset – Unit
    • Metadata, IDs
  • Standard process, publication, revisions
3
On the website
  • www.bgbm.org/TDWG/CODATA/Schema/
  • Purpose of ABCD
  • Origin and Evolution of ABCD
  • Design principles
  • Mapping existing data definitions and standards (e.g. Darwin Core, HISPID, OECD-MDS, EURISCO)
4
Purpose of ABCD Schema
  • A common data specification for biological collection units
    • primary occurrence data
    • living and preserved specimens as well as field observations
  • Intended to provide semantic definitions
  • Intended to support the exchange and integration of detailed primary data
5
Rich data
  • Support multiple special interest networks within a single architecture
  • Current research questions go beyond simple taxonomic / geographic priorities
  • Applied research: interactions, pathogens, ...
  • Research stimulated by data access: unknown questions
6
Origin and Evolution
  • Unit-level collection standard has been a community effort since 1990
  • Based on standards, information models, implemented databases, ...
  • CODATA (Committee on Data for Science and Technology) support since 2001
  • Recent contributors are listed on the website


7
Design principles
  • Full coverage for unit-level collection data; generalise what’s in common
    • not for full taxonomic or nomenclatural data
    • not for structured descriptive data
    • not for collection-level “meta”data
  • Polymorphism (variable atomisation)
    • be inclusive; allow less structured data sources to be connected
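
Note: variable atomisation can be sketched as below. This is only an illustration under stated assumptions, not part of the ABCD schema: the class `GatheringDate` and its fields are hypothetical names chosen for this example. It shows one container that accepts either atomised parts or a single free-text value, so that less structured sources can still be connected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatheringDate:
    """Hypothetical container: atomised parts or one free-text string."""
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    text: Optional[str] = None  # fallback for less structured sources

    def display(self) -> str:
        # Prefer the atomised form whenever a year is present.
        if self.year is not None:
            parts = [f"{self.year:04d}"]
            if self.month is not None:
                parts.append(f"{self.month:02d}")
                # A day is only meaningful if the month is known.
                if self.day is not None:
                    parts.append(f"{self.day:02d}")
            return "-".join(parts)
        # Otherwise fall back to whatever free text the source supplied.
        return self.text or ""


atomised = GatheringDate(year=1998, month=5, day=17)
free_text = GatheringDate(text="spring 1998")
```

Both providers fit the same structure: a well-atomised database fills the parts, while a legacy source only fills `text`.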
8
Design principles
  • Extensibility
    • slots for standardised community-specific data items
    • slot for interim extensions
  • Flexible containers
    • element-element or element-attribute couples for category-value pairs allow freely defined and repeatable data fields (e.g., higher taxa, measurements, morphological features)
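
Note: the element-attribute variant of a category-value pair might look like the sketch below. The element names `HigherTaxa` and `Item` and the attribute `category` are assumptions made for illustration and are not taken from the schema. The point is that the category is data, not a fixed element name, so fields can be freely defined and repeated.

```python
import xml.etree.ElementTree as ET


def add_pair(parent, category, value):
    """Append one category-value pair as an element-attribute couple."""
    item = ET.SubElement(parent, "Item", {"category": category})
    item.text = value
    return item


def read_pairs(parent):
    """Return all pairs as (category, value) tuples, in document order."""
    return [(item.get("category"), item.text) for item in parent.findall("Item")]


container = ET.Element("HigherTaxa")  # hypothetical container name
add_pair(container, "family", "Asteraceae")
add_pair(container, "order", "Asterales")
add_pair(container, "family", "Compositae")  # categories may repeat

pairs = read_pairs(container)
xml_text = ET.tostring(container, encoding="unicode")
```

A new category (for example a measurement type) needs no schema change in this design; the trade-off is that category names are not validated by the schema itself.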

9
Design principles
  • Typing
    • building blocks for community standards
  • Avoid referencing
    • single hierarchy (current protocols)
    • future version must support multiple object types and referencing
  • No recursive structures
    • programmers need some sleep, too
10
Design principles
  • Machine-readable annotations
    • provide documentation of the schema
    • can be used in the mapping process
      (semantic search by configuration assistant)
  • Language support
    • language attribute where appropriate
    • script identification needed?
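
Note: a consumer could use a language attribute roughly as sketched below. The attribute name `language`, the element names `Unit` and `Note`, and the helper `pick_text` are hypothetical and chosen only for this example; the schema's actual attribute naming is not shown on this slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the same free-text field in two languages.
SAMPLE = """
<Unit>
  <Note language="de">Auf Kalkfelsen</Note>
  <Note language="en">On limestone rocks</Note>
</Unit>
"""


def pick_text(parent, tag, preferred, fallback="en"):
    """Return the text in the preferred language, else the fallback, else None."""
    by_language = {el.get("language"): el.text for el in parent.findall(tag)}
    for code in (preferred, fallback):
        if code in by_language:
            return by_language[code]
    return None


unit = ET.fromstring(SAMPLE)
german = pick_text(unit, "Note", "de")
french_or_fallback = pick_text(unit, "Note", "fr")
```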
11
Design principles
  • Support of IPR statements
    • provider rights (IPR, ownership, etc.)
    • disclaimers
    • restrictions for use
    • licensing
  • Low tech
    • biologists must be able to join the discussion
12
What happened since Oeiras?
  • ABCD v. 1.2, continued reference implementation using BioCASe Protocol
  • Inclusion in GBIF Portal
  • Further discussion of content -> v. 1.49e
    • Plant genetic resources
    • Culture collections
    • Metadata
  • Documentation
    • Standard mappings
    • Semantic objects library for biological collections
13
ABCD Reference Implementation
  • Driven by European projects (BioCASE, ENBI, SYNTHESYS) and Belgian and German GBIF National Node projects
  • Uses v. 1.2 and 1.48 (culture collections)
  • > 4 million records online and accessible through GBIF and BioCASE portals
14
Structure
  • Datasets
  • Dataset
    • Set-level metadata
      • UDDI, derivation history, IPR, ...
    • Unit-data
      • Unit-level metadata
      • Taxonomic determination and classification
      • Collection event and site
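
Note: the nesting on this slide can be mirrored in a small data model. The sketch below is an assumption-laden illustration, not the schema: class and field names are hypothetical, and dictionaries stand in for the much richer metadata, determination and gathering structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    """One collection unit (specimen or observation)."""
    unit_metadata: dict = field(default_factory=dict)
    determinations: List[str] = field(default_factory=list)  # taxonomic determinations
    gathering: dict = field(default_factory=dict)            # collection event and site


@dataclass
class Dataset:
    """Set-level metadata plus the units it describes."""
    metadata: dict = field(default_factory=dict)  # e.g. contacts, derivation history, IPR
    units: List[Unit] = field(default_factory=list)


@dataclass
class Datasets:
    """Top-level container: one document may carry several datasets."""
    datasets: List[Dataset] = field(default_factory=list)

    def unit_count(self) -> int:
        return sum(len(dataset.units) for dataset in self.datasets)


# Hypothetical example values.
herbarium = Dataset(
    metadata={"Title": "Example herbarium"},
    units=[
        Unit(determinations=["Bellis perennis"], gathering={"country": "DE"}),
        Unit(determinations=["Taraxacum officinale"]),
    ],
)
document = Datasets(datasets=[herbarium])
```

The model is a single hierarchy with no cross-references and no recursion, in line with the design principles listed earlier.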
15
ABCD – basic structure
16
ABCD – basic structure
17
ABCD: Mandatory Elements
  • Dataset-level
    • ContentContact/Name (UDDI administrative contact)
    • TechnicalContact/Name (UDDI)
    • Title (DC, short concise title for dataset)
    • DateModified (DC, date of creation or last change)
  • Unit-level (equivalent to DwC IDs)
    • SourceInstitutionID
    • SourceID
    • UnitID
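
Note: a provider could check the mandatory elements before publishing, along the lines of the sketch below. The element names come from this slide, but their placement directly under hypothetical `Dataset` and `Unit` elements, the absence of XML namespaces, and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Mandatory element paths as listed on the slide.
DATASET_REQUIRED = ["ContentContact/Name", "TechnicalContact/Name", "Title", "DateModified"]
UNIT_REQUIRED = ["SourceInstitutionID", "SourceID", "UnitID"]


def missing_paths(element, paths):
    """Return the paths that are absent or empty below the given element."""
    missing = []
    for path in paths:
        node = element.find(path)
        if node is None or not (node.text or "").strip():
            missing.append(path)
    return missing


def check_dataset(dataset):
    """Collect human-readable problems for one dataset and all its units."""
    problems = [f"Dataset: {p}" for p in missing_paths(dataset, DATASET_REQUIRED)]
    for index, unit in enumerate(dataset.iter("Unit"), start=1):
        problems += [f"Unit {index}: {p}" for p in missing_paths(unit, UNIT_REQUIRED)]
    return problems


# Hypothetical sample; the second unit lacks its SourceID.
SAMPLE = """
<Dataset>
  <ContentContact><Name>A. Curator</Name></ContentContact>
  <TechnicalContact><Name>B. Admin</Name></TechnicalContact>
  <Title>Example herbarium</Title>
  <DateModified>2004-11-10</DateModified>
  <Unit>
    <SourceInstitutionID>XYZ</SourceInstitutionID>
    <SourceID>Herbarium</SourceID>
    <UnitID>0001</UnitID>
  </Unit>
  <Unit>
    <SourceInstitutionID>XYZ</SourceInstitutionID>
    <UnitID>0002</UnitID>
  </Unit>
</Dataset>
"""

problems = check_dataset(ET.fromstring(SAMPLE))
```

In practice, validation against the published XML Schema would cover these checks; the sketch only makes the short mandatory list concrete.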
18
Basic metadata
19
Metadata: UDDI Contacts
20-28
Content Metadata
29
Unit IDs
30
Next steps
  • Incorporation of further comments
  • Consolidation of metadata definitions
  • Consolidation of types
  • Publication of ABCD version 2 as a proposed standard
  • Reference implementations of v. 2
  • Vote at St. Petersburg?