Find missing Links
After mapping to and extending CIDOC CRM, we have a more or less loose collection of entities that we want to say something about. To build an ontology, we need the relationships between these entities. This is where it becomes challenging, because it is no longer just a matter of finding or extending individual classes but of building an entire semantic net. Fortunately, WissKI can support us in this activity with the Pathbuilder (more detailed information here). It helps us structure the data and automatically checks the restrictions of our ontology. It is highly recommended to document and/or visualize the ontology with an entity-relationship model or a Unified Modeling Language (UML) schema.
We now start an iterative process by trying to implement our conceptual model in the pathbuilder. In doing so, we may encounter "dead ends", meaning that no direct path exists between two entities and we need to insert additional intermediate classes to get from one entity to the other. These intermediate classes may in turn have to be created first in the ontology editor. We therefore switch between the logical model, the pathbuilder and the ontology editor until our conceptual model is implemented. We do not try to build the full data model with disambiguation, field types, formatters or cardinality at this point. At this stage we use the pathbuilder only to identify missing entities or object properties, so that we can complete our ontology and consolidate our conceptual model. Advanced users may build the ontology and create the whole pathbuilder structure in one pass, but for the sake of comprehensibility we will proceed step by step. You will learn about the complete implementation in the chapter "Configuration".
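The idea of a "dead end" and of bridging it with an intermediate class can be illustrated with a small sketch. It is not how WissKI works internally; it only searches a hand-written table of properties for a chain that connects two classes. All class and property names in it (Specimen, Acquisition, was_acquired_in and so on) are hypothetical and do not come from CIDOC CRM or WissKI.

```python
from collections import deque

# Hypothetical toy ontology: property -> (domain class, range class).
# None of these names come from CIDOC CRM or WissKI.
PROPERTIES = {
    "was_acquired_in": ("Specimen", "Acquisition"),
    "added_to": ("Acquisition", "Collection"),
    "has_title": ("Collection", "Title"),
}


def find_path(start, goal):
    """Breadth-first search for the shortest class/property chain from start to goal."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == goal:
            return path
        for prop, (domain, range_) in PROPERTIES.items():
            if domain == current and range_ not in visited:
                visited.add(range_)
                queue.append(path + [prop, range_])
    return None  # dead end: no chain of properties connects the two classes


print(find_path("Specimen", "Collection"))
print(find_path("Title", "Specimen"))
```

In this toy table there is no direct property from Specimen to Collection, so the search returns a chain through the intermediate class Acquisition. The second call returns None, which corresponds to a dead end that can only be resolved by adding a class or property in the ontology editor.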
Create development Pathbuilder
We create a pathbuilder only to use its function of displaying the possible object properties between classes. Figure 1 shows a simple table view of our desired information. You may notice that this table only shows entities and does not name the relationships between them. We will find them by adding steps to the pathbuilder.
The pathbuilder automatically checks the restrictions of the ontology and only displays classes and properties that are allowed. Every time we find a full path, we document our progress. If we reach a dead end, we switch back to Protégé and create the required entities or properties.

Under Manage > Configuration > Pathbuilders, click + Add Pathbuilder. Name it Development and click Save without form generation. Click + Add Path. In the section STEP, click please select to see all classes of the ontology that you uploaded to your triple store.

We now try to find a path between our class E78 Curated Holding and Collection Title. Select http://erlangen-crm.org/200717/E78_Curated_Holding as the first step. You can immediately choose the second step, which must be a property. Look closely for a property name that could match and check whether its scope note fits your needs. We take http://erlangen-crm.org/200717/P102_has_title here. The third step is https://wisski.com/Collection_Title, which is also the last step. The standard Datatype Property is http://erlangen-crm.org/200717/P3_has_note. The path is now complete and we should document our progress.
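The path found above alternates between classes and properties. The following sketch writes it down as a list of steps and checks it against a restriction table. The table is a hand-written simplification for illustration: in WissKI the pathbuilder reads the restrictions from the ontology in the triple store, and the function names used here (allowed_properties, check_path) are hypothetical helpers, not part of WissKI.

```python
# Simplified, hand-written restrictions for illustration only. The real
# restrictions come from the ontology, including inherited domains and
# ranges, which this sketch does not model.
CRM = "http://erlangen-crm.org/200717/"
TITLE = "https://wisski.com/Collection_Title"

# property URI -> (allowed domain class, allowed range class)
OBJECT_PROPERTIES = {
    CRM + "P102_has_title": (CRM + "E78_Curated_Holding", TITLE),
}


def allowed_properties(cls):
    """Return the properties that may follow the given class in a path."""
    return [p for p, (domain, _) in OBJECT_PROPERTIES.items() if domain == cls]


def check_path(steps):
    """Check an alternating class, property, class, ... list of steps."""
    if len(steps) % 2 == 0:
        return False  # a path must start and end with a class
    for i in range(1, len(steps), 2):
        restriction = OBJECT_PROPERTIES.get(steps[i])
        if restriction != (steps[i - 1], steps[i + 1]):
            return False
    return True


# The path from the text: class, property, class.
path = [CRM + "E78_Curated_Holding", CRM + "P102_has_title", TITLE]
# The datatype property that holds the literal value at the end of the path.
datatype_property = CRM + "P3_has_note"

print(allowed_properties(CRM + "E78_Curated_Holding"))
print(check_path(path))
```

Writing each documented path down in this alternating form, together with its datatype property, makes it easy to compare the documentation with what is later configured in the pathbuilder.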
So far we have avoided speaking of properties as the attributes of entities. The term is a bit confusing: in relational database contexts, properties are the characteristics or attributes of an entity, so Collection Title would be a property of Collection. In graph database contexts, however, properties are the relationships between entities, and they impose restrictions by determining domains ("start entities" or subjects) and ranges ("end entities" or objects). In this case P102 has title is a property that connects the class E78 Curated Holding with the class Collection Title. From now on we will speak of "properties" only when we are talking about relationships between entities; we will use the term "entity" when we are in the scope of the ontology and "group" when we are in the scope of Drupal. We deal with the terms attribute and field in a similar way: "attributes" are the characteristics of an entity in the scope of the ontology, and "fields" are their equivalent in the context of Drupal, as parts of groups.
Document the ontology
As we create our ontology, we should regularly document our progress visually. Software like yEd, Lucidchart or Visio is made to support this task. Figure 2 shows an example of a synthesis of the ontology (class name, object property, data property) and the WissKI structure (field/group name and machine name). The field or group name should be the label the user sees in the frontend. The machine name should help you to identify the groups and fields in their different contexts. In this guide the naming convention consists of three parts: a prefix that states whether it is a group, an input or a reference; the context of the field, abbreviated to at least three letters; and a human-readable name. For example, the field Collection Title gets the machine name i_col_title, short for input_collection_title. The field Belongs to Collection in the group Specimen would be named ref_spe_belongs_to_collection. This avoids problems with recurring field labels like Name or Title, because labels may be identical while machine names must be unique.
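The naming convention can be sketched as a small helper function. It is only an illustration under a few assumptions: the function name machine_name is hypothetical, the context is abbreviated to exactly its first three letters (the guide only asks for at least three), and only the two prefixes that appear in the examples above are included, since the prefix for groups is not given in the text.

```python
import re

# Prefixes taken from the two examples in the text; the prefix for groups is
# not given there, so it is left out rather than guessed.
PREFIXES = {"input": "i", "reference": "ref"}


def machine_name(kind, context, name):
    """Build '<prefix>_<context abbreviation>_<name>' in lowercase snake case."""
    # Assumption: abbreviate the context to its first three letters.
    abbreviation = context.strip().lower()[:3]
    if len(abbreviation) < 3:
        raise ValueError("context needs at least three letters")
    # Replace every run of non-alphanumeric characters with one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"{PREFIXES[kind]}_{abbreviation}_{slug}"


print(machine_name("input", "Collection", "Title"))  # i_col_title
print(machine_name("reference", "Specimen", "Belongs to Collection"))  # ref_spe_belongs_to_collection
```

Generating the names from the same three parts every time keeps them predictable, which helps when the same label, such as Title, appears in several groups.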
Even if your database is small and the research question is narrow, the ontology can quickly become confusing. A visualization of the ontology helps to keep track of all components and makes it easier to extend the model later. While Figure 1 showed a simple schema of our data, Figure 2 gives an advanced overview of the structures our example ontology consists of. Figure 3 shows a visualization of the complete ontology with all desired entities.
You can see that even a small model of our museum collection has a relatively large number of entries. You may also notice the two classes that are colored cyan. These two are "borrowed" from CIDOC CRMsci, an extension of CIDOC CRM "intended to be used as a global schema for integrating metadata about scientific observation, measurements and processed data in descriptive and empirical sciences such as biodiversity, geology, geography, archaeology, cultural heritage conservation and others in research IT environments and research data libraries." In many cases, it is easier and more consistent to take classes from existing standards than to invent your own. It is therefore worthwhile to look into other (CIDOC) ontologies if no suitable mapping is found in CRM.
Import the complete ontology into the triple store
Once you feel that every entity has a corresponding class and can be connected to the others with suitable properties, it is time to reimport the ontology.
Disambiguation, data types, formatters and more
This chapter gave you a rough overview of ontology development, and you have already become somewhat familiar with the pathbuilder. We focused on the logical model without going deeper into configuration options. The next step is therefore to learn about the pathbuilder's advanced configuration capabilities, such as disambiguation, field types and formatters.