Biodiversitätsinformatik / Biodiversity Informatics

Choosing a formal language for the

R1. PT1 and PT2 are congruent
R2. PT1 is included in PT2
R3. PT1 includes PT2
R4. PT1 and PT2 overlap each other
R5. PT1 and PT2 exclude each other

The relationships between several Potential Taxa thus form an oriented graph in which the nodes are the Potential Taxa and the edges are those pairs of Potential Taxa for which the expert(s) assigned set relationships.
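
A minimal sketch of how such an oriented graph could be held in memory, written in Python for brevity. The names `Relation` and `TaxonGraph` and the example edges are illustrative assumptions and are not part of MoreTax.

```python
from enum import Enum


class Relation(Enum):
    """The five set relationships R1 to R5 between two Potential Taxa."""
    CONGRUENT = "R1"
    IS_INCLUDED_IN = "R2"
    INCLUDES = "R3"
    OVERLAPS = "R4"
    EXCLUDES = "R5"


class TaxonGraph:
    """Oriented graph: nodes are Potential Taxa, edges carry a Relation."""

    def __init__(self):
        # edges[source][target] = relationship assigned by an expert
        self.edges = {}

    def add_edge(self, source, target, relation):
        self.edges.setdefault(source, {})[target] = relation
        self.edges.setdefault(target, {})  # register the target as a node

    def nodes(self):
        return sorted(self.edges)

    def relation(self, source, target):
        """Return the directly assigned relationship, or None if no edge exists."""
        return self.edges.get(source, {}).get(target)


# Hypothetical example data
graph = TaxonGraph()
graph.add_edge("PTi", "PTj", Relation.IS_INCLUDED_IN)
graph.add_edge("PTj", "PTk", Relation.OVERLAPS)
```

In this sketch `graph.relation("PTi", "PTk")` returns `None`, because no expert assigned a relationship to that pair directly.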

Suppose that there is a pool of connected Potential Taxa from different sources. Two different kinds of queries about factual information are of interest to the user:
1) To which taxa (actually: Potential Taxa) does certain factual information apply?
2) Which factual information applies to certain taxon names (actually: Potential Taxa)?
The result should not depend on which Potential Taxon the factual information was originally linked to. For this purpose, a rule system based on the Potential Taxon graph has to be developed. Moreover, it should be possible to formulate flexible rules that restrict the result (e.g. depending on factors such as an assessment of the expertise of sources or authors of relationships). As a result, users can be notified about qualitative aspects of the linkage between the transmitted facts and the Potential Taxon they used at the start of their query.
There are four categories for the applicability of factual information with respect to "their" Potential Taxon:
1) fully applicable, if the factual information applies to every element of the taxon,
2) partially applicable, if the factual information applies only to a subset of the elements of the taxon,
3) doubtful applicable, if the factual information may apply to some elements of the taxon, and
4) not applicable, if the factual information does not apply to any element of the taxon.
Suppose that some factual information is fully applicable to the Potential Taxon PT1. Taking into account the graph with its relationships, there are at least three options for the quality of the factual information when it is transmitted to the Potential Taxon PT2:
1) fully applicable, if PT1 ≡ PT2 or PT1 ⊃ PT2 (PT1 is congruent with PT2 or includes it)
2) partially applicable, if PT1 ⊂ PT2 or PT1 ⊕ PT2 (PT1 is included in PT2 or overlaps it)
3) not applicable, if PT1 ! PT2 (PT1 and PT2 exclude each other)
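
The three options above can be written as a small lookup. This is a sketch in Python, not the project's implementation. The function name and the string labels for the relationships are assumptions chosen for illustration.

```python
def transfer_fully_applicable(relation):
    """Applicability at PT2 of a fact that is fully applicable to PT1.

    `relation` names the set relationship of PT1 to PT2.
    """
    if relation in ("congruent", "includes"):
        # Every element of PT2 is also an element of PT1
        return "fully applicable"
    if relation in ("is_included_in", "overlaps"):
        # Only part of PT2 is covered by PT1
        return "partially applicable"
    if relation == "excludes":
        # PT1 and PT2 share no element
        return "not applicable"
    raise ValueError(f"unknown relationship: {relation!r}")
```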
As shown, the quality of the factual information applying to PT2 depends both on the quality of the same factual information when applied to PT1 and on the set relationship between the two taxa.
In the graph, an edge does not exist for every pair of Potential Taxa, although a path (a sequence of edges) between them might exist. In our example this is the case for PTi and PTk. There must therefore be a rule that calculates the resulting set relationship when two contiguous edges with their respective set relationships are concatenated. If, for example, Bij and Bjk are both the set relationship "⊂", then it is easy to see that the resulting relationship between PTi and PTk is also "⊂", and hence factual information that is fully applicable to PTi is only partially applicable to PTk. Now assume that Bij remains "⊂" but that Bjk is "⊕". Then the resulting relationship between PTi and PTk is no longer unique: it could be "⊂", "⊕" or even "!". This forces the introduction of "combined" relationships and a corresponding extension of the rule. With this extended rule it is then possible to associate a unique "combined" relationship with each path in the graph.
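
The two cases just discussed can be checked by brute force. The Python sketch below enumerates all non-empty subsets of a small universe and collects every relationship that can occur between the first and the third set. The function names, the string labels and the universe size of four elements are assumptions made for this illustration, and taxa are assumed to be non-empty sets.

```python
from itertools import combinations


def classify(a, b):
    """Return the set relationship of a to b for two non-empty sets."""
    if a == b:
        return "congruent"
    if a < b:
        return "is_included_in"
    if a > b:
        return "includes"
    if a & b:
        return "overlaps"
    return "excludes"


def compose(rel_ab, rel_bc, universe_size=4):
    """Collect every relationship A-C that is compatible with A-B and B-C."""
    universe = range(universe_size)
    subsets = [frozenset(c)
               for n in range(1, universe_size + 1)
               for c in combinations(universe, n)]
    results = set()
    for a in subsets:
        for b in subsets:
            if classify(a, b) != rel_ab:
                continue
            for c in subsets:
                if classify(b, c) == rel_bc:
                    results.add(classify(a, c))
    return results
```

With these definitions, `compose("is_included_in", "is_included_in")` yields only `"is_included_in"`, whereas `compose("is_included_in", "overlaps")` yields `"is_included_in"`, `"overlaps"` and `"excludes"`, which is the non-unique result that motivates "combined" relationships.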
In fact, two Potential Taxa can be connected in the graph by several paths. This is the case for PTi and PTl, because there is a "direct" path (the corresponding edge) and also an "indirect" path via PTj. A "combined" relationship is associated with each path. Additional rules must therefore specify how the system derives the resulting "combined" relationship from two such "combined" relationships. This leads to at least two alternative rules.
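
As a sketch of the two alternatives, a "combined" relationship can be modelled as the set of relationships that are still possible. The strong rule then keeps what both paths allow, and the weak rule keeps what at least one path allows. The Python names and the example values below are assumptions for illustration, and the treatment of doubtful relationships is left out.

```python
def strong_agreement(rel_a, rel_b):
    """Keep only the relationships that both paths allow (intersection)."""
    return rel_a & rel_b


def weak_agreement(rel_a, rel_b):
    """Keep every relationship that at least one path allows (union)."""
    return rel_a | rel_b


# Hypothetical combined relationships for the two paths from PTi to PTl
direct = frozenset({"is_included_in"})
indirect = frozenset({"is_included_in", "overlaps", "excludes"})

strong = strong_agreement(direct, indirect)  # only "is_included_in" remains
weak = weak_agreement(direct, indirect)      # all three remain possible
```

An empty result under the strong rule would mean that the two paths contradict each other.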
For each oriented relationship between two Potential Taxa PT1 and PT2 there exists a reverse-oriented relationship between PT2 and PT1, which can likewise be defined by an appropriate rule. Altogether this results in at least four different rules.
The quality of factual information when transmitted from an "original" PTo to a "target" PTt thus depends (i) on the Potential Taxon graph, or more precisely on all paths from PTo to PTt and on the oriented relationships assigned to the edges included in these paths, and (ii) on the applicability of the factual data. Computing the quality of transmitted factual information is therefore based on:
1) algorithms that find all paths from PTo to PTt in an oriented graph,
2) rules that assign a relationship to each path on the basis of the relationships of its edges, and that then assign a unique final relationship to the pair (PTo, PTt) based on all paths from PTo to PTt; this final relationship is used to compute the quality of the transmitted factual information,
3) a rule that combines the resulting relationship with the applicability of the factual information to arrive at a relevant result.
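
The first ingredient, finding all paths, could be sketched as a depth-first search. The Python example below is an illustration under two assumptions that the text does not state: only simple paths (no node visited twice) are collected, and edges are followed in their stored direction only. The adjacency data is hypothetical.

```python
def find_all_paths(edges, origin, target):
    """Return every simple path from origin to target as a list of nodes.

    `edges` maps each node to the list of nodes reachable over one edge.
    Restricting the search to simple paths keeps it finite even if the
    graph contains cycles.
    """
    paths = []

    def walk(node, path):
        if node == target:
            paths.append(path)
            return
        for successor in edges.get(node, []):
            if successor not in path:
                walk(successor, path + [successor])

    walk(origin, [origin])
    return paths


# Hypothetical graph: a direct edge PTi -> PTl and a detour over PTj
edges = {"PTi": ["PTj", "PTl"], "PTj": ["PTl", "PTk"], "PTk": []}
paths = find_all_paths(edges, "PTi", "PTl")
# paths == [["PTi", "PTj", "PTl"], ["PTi", "PTl"]]
```

Each path found this way would then be reduced to one "combined" relationship by concatenating the relationships of its edges.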
Any high-level programming language can be used for the formal description of such a graph as well as for the algorithms and rules. These rules do not need to be edited, since they do not depend on the specific contents of the included data. As an example, we used Visual Basic to define a "relationship data type" as well as the above-mentioned rules.
Definition of a data type for "combined relationship" objects:
Public Type Relationship
   ' One flag for each set relationship (R1 to R5) that remains possible
   Congruent_to As Boolean
   Is_included_in As Boolean
   Includes As Boolean
   Overlaps As Boolean
   Excludes As Boolean
   Doubtful As Boolean
End Type
Reversal rule for "combined relationships":
Public Function reverse(Rel1 As Relationship) As Relationship
   ' Only the two oriented relationships change when the direction is reversed
   reverse = Rel1
   reverse.Is_included_in = Rel1.Includes
   reverse.Includes = Rel1.Is_included_in
End Function
Unification rule for two "combined relationships" (strong agreement - intersection):
Public Function cons(Rel1 As Relationship, Rel2 As Relationship) As Relationship
   If Rel1.Doubtful = Rel2.Doubtful Then
      ' Same level of certainty: keep only what both relationships allow
      cons.Congruent_to = Rel1.Congruent_to And Rel2.Congruent_to
      cons.Is_included_in = Rel1.Is_included_in And Rel2.Is_included_in
      cons.Includes = Rel1.Includes And Rel2.Includes
      cons.Overlaps = Rel1.Overlaps And Rel2.Overlaps
      cons.Excludes = Rel1.Excludes And Rel2.Excludes
      cons.Doubtful = Rel1.Doubtful
   ElseIf Rel1.Doubtful = False Then
      ' Only Rel1 is not doubtful: it takes precedence
      cons = Rel1
   Else
      ' Only Rel2 is not doubtful: it takes precedence
      cons = Rel2
   End If
End Function
Unification rule for two "combined relationships" (weak agreement - union):
Public Function large_cons(Rel1 As Relationship, Rel2 As Relationship) As Relationship
   ' Keep everything that at least one of the two relationships allows
   large_cons.Congruent_to = Rel1.Congruent_to Or Rel2.Congruent_to
   large_cons.Is_included_in = Rel1.Is_included_in Or Rel2.Is_included_in
   large_cons.Includes = Rel1.Includes Or Rel2.Includes
   large_cons.Overlaps = Rel1.Overlaps Or Rel2.Overlaps
   large_cons.Excludes = Rel1.Excludes Or Rel2.Excludes
   large_cons.Doubtful = Rel1.Doubtful Or Rel2.Doubtful
End Function
Concatenation rule for two contiguous "combined relationships":
Public Function concatenate(Rel1 As Relationship, Rel2 As Relationship) As Relationship
   Dim RelNull As Relationship
   Dim RelFull As Relationship
   Dim TempRelResult As Relationship

   ' RelNull: no set relationship is possible
   RelNull.Congruent_to = False
   RelNull.Is_included_in = False
   RelNull.Includes = False
   RelNull.Overlaps = False
   RelNull.Excludes = False
   RelNull.Doubtful = False

   ' RelFull: every set relationship is possible
   RelFull.Congruent_to = True
   RelFull.Is_included_in = True
   RelFull.Includes = True
   RelFull.Overlaps = True
   RelFull.Excludes = True
   RelFull.Doubtful = False

   concatenate = RelNull
   TempRelResult = RelNull

   ' A congruent edge passes the other relationship through unchanged
   If Rel1.Congruent_to Then
      concatenate = Rel2
   End If
   If Rel2.Congruent_to Then
      TempRelResult = Rel1
      concatenate = large_cons(concatenate, TempRelResult)
      TempRelResult = RelNull
   End If

   ' Each remaining pair of possible relationships adds its outcomes to the result
   If Rel1.Is_included_in Then
      If Rel2.Is_included_in Then
         TempRelResult.Is_included_in = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Includes Then
         TempRelResult = RelFull
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Overlaps Then
         TempRelResult.Is_included_in = True
         TempRelResult.Overlaps = True
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Excludes Then
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
   End If

   If Rel1.Includes Then
      If Rel2.Is_included_in Then
         TempRelResult.Congruent_to = True
         TempRelResult.Is_included_in = True
         TempRelResult.Includes = True
         TempRelResult.Overlaps = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Includes Then
         TempRelResult.Includes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Overlaps Then
         TempRelResult.Includes = True
         TempRelResult.Overlaps = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Excludes Then
         TempRelResult.Includes = True
         TempRelResult.Overlaps = True
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
   End If

   If Rel1.Overlaps Then
      If Rel2.Is_included_in Then
         TempRelResult.Is_included_in = True
         TempRelResult.Overlaps = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Includes Then
         TempRelResult.Includes = True
         TempRelResult.Overlaps = True
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Overlaps Then
         TempRelResult = RelFull
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Excludes Then
         TempRelResult.Includes = True
         TempRelResult.Overlaps = True
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
   End If

   If Rel1.Excludes Then
      If Rel2.Is_included_in Then
         TempRelResult.Is_included_in = True
         TempRelResult.Overlaps = True
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Includes Then
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Overlaps Then
         TempRelResult.Is_included_in = True
         TempRelResult.Overlaps = True
         TempRelResult.Excludes = True
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
      If Rel2.Excludes Then
         TempRelResult = RelFull
         concatenate = large_cons(concatenate, TempRelResult)
         TempRelResult = RelNull
      End If
   End If

   ' The result is doubtful as soon as one of the two edges is doubtful
   concatenate.Doubtful = Rel1.Doubtful Or Rel2.Doubtful

End Function
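
To show how the concatenation rule extends from two edges to a whole path, the following Python sketch transcribes the cases of the Visual Basic rule above into a lookup table and folds it over the edges of a path. Combined relationships are modelled as sets of labels. The names `TABLE`, `rels` and `relationship_along_path` are assumptions for this illustration, and the Doubtful flag is left out.

```python
from functools import reduce

ALL = frozenset({"congruent", "is_included_in", "includes", "overlaps", "excludes"})


def rels(*names):
    """Build a combined relationship from relationship labels."""
    return frozenset(names)


# Outcomes for each pair of single relationships (congruence handled separately)
TABLE = {
    ("is_included_in", "is_included_in"): rels("is_included_in"),
    ("is_included_in", "includes"): ALL,
    ("is_included_in", "overlaps"): rels("is_included_in", "overlaps", "excludes"),
    ("is_included_in", "excludes"): rels("excludes"),
    ("includes", "is_included_in"): rels("congruent", "is_included_in", "includes", "overlaps"),
    ("includes", "includes"): rels("includes"),
    ("includes", "overlaps"): rels("includes", "overlaps"),
    ("includes", "excludes"): rels("includes", "overlaps", "excludes"),
    ("overlaps", "is_included_in"): rels("is_included_in", "overlaps"),
    ("overlaps", "includes"): rels("includes", "overlaps", "excludes"),
    ("overlaps", "overlaps"): ALL,
    ("overlaps", "excludes"): rels("includes", "overlaps", "excludes"),
    ("excludes", "is_included_in"): rels("is_included_in", "overlaps", "excludes"),
    ("excludes", "includes"): rels("excludes"),
    ("excludes", "overlaps"): rels("is_included_in", "overlaps", "excludes"),
    ("excludes", "excludes"): ALL,
}


def compose_single(r1, r2):
    """Compose two single relationships; congruence passes the other one through."""
    if r1 == "congruent":
        return frozenset({r2})
    if r2 == "congruent":
        return frozenset({r1})
    return TABLE[(r1, r2)]


def concatenate(rel1, rel2):
    """Union of the outcomes of every pair of possible relationships."""
    result = frozenset()
    for r1 in rel1:
        for r2 in rel2:
            result |= compose_single(r1, r2)
    return result


def relationship_along_path(edge_relationships):
    """Reduce the relationships of consecutive edges to one combined relationship."""
    return reduce(concatenate, edge_relationships)
```

For a path whose edges carry "is included in", "is included in" and "excludes", `relationship_along_path` returns the single relationship "excludes".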
Interpretation rule for "combined relationships":
Public Function evaluate(Category As String, Rel1 As Relationship) As String
   ' A trailing "!" marks a result that is not doubtful, a trailing "?" a doubtful one
   If (Not Rel1.Congruent_to) And (Not Rel1.Is_included_in) And (Not Rel1.Includes) _
         And (Not Rel1.Overlaps) And (Not Rel1.Excludes) Then
      ' No set relationship remains possible
      evaluate = " Contradiction !"
   ElseIf Category = " fully applicable " Then
      If (Not Rel1.Doubtful) Then
         If (Not Rel1.Excludes) Then
            If (Not Rel1.Is_included_in) And (Not Rel1.Overlaps) Then
               evaluate = " fully applicable !"
            Else
               evaluate = " partially applicable !"
            End If
         Else
            If (Rel1.Congruent_to Or Rel1.Is_included_in Or Rel1.Includes Or Rel1.Overlaps) Then
               evaluate = " doubtful applicable !"
            Else
               evaluate = " not applicable !"
            End If
         End If
      Else
         If (Not Rel1.Excludes) Then
            If (Not Rel1.Is_included_in) And (Not Rel1.Overlaps) Then
               evaluate = " fully applicable ?"
            Else
               evaluate = " partially applicable ?"
            End If
         Else
            If (Rel1.Congruent_to Or Rel1.Is_included_in Or Rel1.Includes Or Rel1.Overlaps) Then
               evaluate = " doubtful applicable ?"
            Else
               evaluate = " not applicable ?"
            End If
         End If
      End If
   ElseIf Category = " partially applicable " Then
      If (Not Rel1.Doubtful) Then
         If (Not Rel1.Excludes) Then
            If (Not Rel1.Includes) And (Not Rel1.Overlaps) Then
               evaluate = " partially applicable !"
            Else
               evaluate = " doubtful applicable !"
            End If
         Else
            If (Rel1.Congruent_to Or Rel1.Is_included_in Or Rel1.Includes Or Rel1.Overlaps) Then
               evaluate = " doubtful applicable !"
            Else
               evaluate = " not applicable !"
            End If
         End If
      Else
         If (Not Rel1.Excludes) Then
            If (Not Rel1.Includes) And (Not Rel1.Overlaps) Then
               evaluate = " partially applicable ?"
            Else
               evaluate = " doubtful applicable ?"
            End If
         Else
            If (Rel1.Congruent_to Or Rel1.Is_included_in Or Rel1.Includes Or Rel1.Overlaps) Then
               evaluate = " doubtful applicable ?"
            Else
               evaluate = " not applicable ?"
            End If
         End If
      End If
   Else
      ' All other categories
      If (Not Rel1.Doubtful) Then
         If (Rel1.Congruent_to Or Rel1.Is_included_in Or Rel1.Includes Or Rel1.Overlaps) Then
            evaluate = " doubtful applicable !"
         Else
            evaluate = " not applicable !"
         End If
      Else
         If (Rel1.Congruent_to Or Rel1.Is_included_in Or Rel1.Includes Or Rel1.Overlaps) Then
            evaluate = " doubtful applicable ?"
         Else
            evaluate = " not applicable ?"
         End If
      End If
   End If
End Function
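
The branches of the interpretation rule can be condensed considerably. The Python sketch below is intended to return the same verdicts as the Visual Basic function above, with combined relationships modelled as sets of labels. The string labels (written here without the padding spaces of the original) and the `doubtful` keyword argument are assumptions of this sketch.

```python
NON_EXCLUDING = {"congruent", "is_included_in", "includes", "overlaps"}


def evaluate(category, rel, doubtful=False):
    """Combine the applicability category at the origin with a combined relationship."""
    if not rel:
        # No set relationship remains possible
        return "contradiction !"
    mark = "?" if doubtful else "!"
    if "excludes" in rel:
        # Exclusion is possible: at best doubtful, whatever the category
        verdict = "doubtful applicable" if rel & NON_EXCLUDING else "not applicable"
    elif category == "fully applicable":
        narrowed = rel & {"is_included_in", "overlaps"}
        verdict = "partially applicable" if narrowed else "fully applicable"
    elif category == "partially applicable":
        weakened = rel & {"includes", "overlaps"}
        verdict = "doubtful applicable" if weakened else "partially applicable"
    else:
        # Without exclusion, a non-empty relationship always shares elements
        verdict = "doubtful applicable"
    return f"{verdict} {mark}"
```

For example, a fact that is fully applicable at the origin stays "fully applicable !" if the only possible relationship is "includes", and becomes "doubtful applicable !" if "is included in", "overlaps" and "excludes" are all possible.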
        
Sometimes, depending on specific characteristics of the data or data sources under consideration, new rules must be added and/or existing ones adjusted, e.g. to:
1) include or exclude certain data sources that are available in the system
2) give preferential treatment to certain data sources for data output
3) weight edges depending on their source (e.g. giving a higher weight to the opinion held by a certain expert for a certain taxonomic group)
4) define a special treatment for queries that entail some special risk (e.g. medical information or information concerning the protection of species)
Since rules of this kind are not generally foreseeable and since they may refer directly to data contents and metadata of the source, they should not be incorporated in the core rules and analysis algorithms, but should be read and applied at run-time.
To allow these rules to be adjusted, they could be formulated in a formal language suited to propositional calculus. This would also facilitate the implementation of a user interface for this purpose. The programming language Prolog fulfils these requirements, and we shall use it for the further description of the system. For the implementation, however, other languages can be taken into account. An implementation could also be based on a complex configuration file from which parameters are passed to the core rules at run-time.
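
The configuration-file variant could look roughly as follows. This Python sketch reads source-specific rules from a JSON text at run-time and applies them to the edges before the core rules see them. The file format, the keys `excluded_sources` and `source_weights`, the source names and the edge fields are all hypothetical.

```python
import json

# Hypothetical run-time configuration; in practice this would be read from a file
CONFIG_TEXT = """
{
  "excluded_sources": ["source_b"],
  "source_weights": {"source_a": 2.0}
}
"""


def load_rules(text):
    """Parse the run-time rules from a JSON text."""
    config = json.loads(text)
    return {
        "excluded": set(config.get("excluded_sources", [])),
        "weights": config.get("source_weights", {}),
    }


def apply_rules(edges, rules, default_weight=1.0):
    """Drop edges from excluded sources and attach a weight to the rest."""
    kept = []
    for edge in edges:
        if edge["source"] in rules["excluded"]:
            continue
        weight = rules["weights"].get(edge["source"], default_weight)
        kept.append({**edge, "weight": weight})
    return kept


# Hypothetical edges, each tagged with the source that asserted it
edges = [
    {"from": "PTi", "to": "PTj", "relation": "is_included_in", "source": "source_a"},
    {"from": "PTj", "to": "PTk", "relation": "overlaps", "source": "source_b"},
    {"from": "PTi", "to": "PTl", "relation": "congruent", "source": "source_c"},
]
weighted = apply_rules(edges, load_rules(CONFIG_TEXT))
```

Because the rules are data rather than code, they can be changed without touching the core rules for concatenation, unification and interpretation.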
Marc Geoffroy, Anton Güntsch & Walter G. Berendsohn
First version (German only): August 2001
 Revised second (German and English) version: June 2002
__________________________________________________________________________
MoreTax (Rule-based association of taxonomic concepts) is a research and development project financed by the Federal Agency for Nature Conservation of the German Ministry of the Environment.
Project co-ordinator: Walter Berendsohn
Project scientist: Marc Geoffroy

This page last updated on 12-11-2002
© Freie Universität Berlin, Botanischer Garten und Botanisches Museum Berlin-Dahlem
Seitenverantwortlicher / Page editor: M. Geoffroy