Commit 701d112

Merge pull request #53 from nmrML/Clean-up-chemical-entity-#20
Clean up chemical entity #20
2 parents 23bff55 + efb2766 commit 701d112

7 files changed

+310
-10625
lines changed

docs/odk-workflows/RepositoryFileStructure.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ These are the current imports in NMRCV
 | omo | http://purl.obolibrary.org/obo/omo.owl | mirror |
 | iao | http://purl.obolibrary.org/obo/iao.owl | None |
 | obi | http://purl.obolibrary.org/obo/obi.owl | custom |
-| chebi | http://purl.obolibrary.org/obo/chebi.owl | None |
+| chebi | http://purl.obolibrary.org/obo/chebi.owl | custom |
 
 ## Components
 Components, in contrast to imports, are considered full members of the ontology. This means that any axiom in a component is also included in the ontology base - which means it is considered _native_ to the ontology. While this sounds complicated, consider this: conceptually, no component should be part of more than one ontology. If that seems to be the case, we are most likely talking about an import. Components are often not needed for ontologies, but there are some use cases:

src/ontology/Makefile

Lines changed: 3 additions & 6 deletions
@@ -10,7 +10,7 @@
 # More information: https://github.com/INCATools/ontology-development-kit/
 
 # Fingerprint of the configuration file when this Makefile was last generated
-CONFIG_HASH= 1ba644404ead28f12b159a84d638daf10791c394ce2f9060998152d134df9dcc
+CONFIG_HASH= fd98f62cea8e98a347fcdcf420367f85a1508e39fd0bcd97b7b3b1b954831e0b
 
 
 # ----------------------------------------
@@ -382,11 +382,8 @@ $(IMPORTDIR)/obi_import.owl: $(MIRRORDIR)/obi.owl
 	echo "ERROR: You have configured your default module type to be custom; this behavior needs to be overwritten in nmrCV.Makefile!" && false
 ## Module for ontology: chebi
 
-$(IMPORTDIR)/chebi_import.owl: $(MIRRORDIR)/chebi.owl $(IMPORTDIR)/chebi_terms_combined.txt
-	if [ $(IMP) = true ] && [ $(IMP_LARGE) = true ]; then $(ROBOT) extract -i $< -T $(IMPORTDIR)/chebi_terms_combined.txt --force true --copy-ontology-annotations true --individuals include --method BOT \
-		query --update ../sparql/inject-subset-declaration.ru --update ../sparql/inject-synonymtype-declaration.ru --update ../sparql/postprocess-module.ru \
-		$(ANNOTATE_CONVERT_FILE); fi
-
+$(IMPORTDIR)/chebi_import.owl: $(MIRRORDIR)/chebi.owl
+	echo "ERROR: You have configured your default module type to be custom; this behavior needs to be overwritten in nmrCV.Makefile!" && false
 
 .PHONY: refresh-imports
 refresh-imports:

src/ontology/imports/chebi_import.owl

Lines changed: 228 additions & 10597 deletions
Large diffs are not rendered by default.

src/ontology/imports/chebi_terms.txt

Lines changed: 27 additions & 2 deletions
@@ -1,3 +1,29 @@
+# needed root terms
+CHEBI:33250 # atom
+CHEBI:33252 # atomic nucleus
+CHEBI:59999 # 'chemical substance'
+CHEBI:23367 # 'molecular entity'
+CHEBI:36357 # polyatomic entity
+CHEBI:33579 # main group molecular entity
+CHEBI:24835 # inorganic molecular entity
+CHEBI:33259 # elemental molecular entity
+CHEBI:33497 # transition element molecular entity
+CHEBI:35568 # mancude ring
+CHEBI:24870 # ion
+
+# object relations for needed axioms
+BFO:0000051 # 'has part'
+RO:0000087 # 'has role'
+
+# needed roles
+CHEBI:51086 # 'chemical role'
+CHEBI:197449 # NMR solvent
+CHEBI:228364 # NMR chemical shift reference compound
+CHEBI:67137 # NMR shift reagent
+
+CHEBI:139358 # isotopically modified compound
+CHEBI:76107 # deuterated compound
+
 CHEBI:156265 # methanol-d4
 CHEBI:193038 # acetonitrile-d3
 CHEBI:193039 # benzene-d6
@@ -24,7 +50,6 @@ CHEBI:41981 # dideuterium oxide
 CHEBI:48236 # trichlorofluoromethane
 CHEBI:78217 # acetone d6
 CHEBI:85365 # deuterated chloroform
-CHEBI:197449 # NMR solvent
 CHEBI:36810 # (trifluoromethyl)benzene
 CHEBI:38585 # 1,4-difluorobenzene
 CHEBI:47032 # 1,4-dioxane
@@ -112,7 +137,7 @@ CHEBI:176578 # cobalt-59 atom
 # TODO: add gallium atom isotopes
 CHEBI:52758 # germanium-73 atom
 CHEBI:176584 # arsenic-75 atom
-CHEBI:52457 # CHEBI:52457
+CHEBI:52457 # selenium-77 atom
 CHEBI:52743 # bromine-79 atom
 # TODO: add bromine atom isotopes
 # TODO: add krypton atom isotopes
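
The term list above uses a simple layout: one CURIE per line, an optional trailing `# label` comment, full-line comments, and blank separator lines. As a rough illustration only, the Python sketch below reads that layout and expands each CURIE to an OBO PURL of the form that appears in the `--exclude-term` arguments of this commit. The names `parse_term_list` and `curie_to_iri` are hypothetical helpers, not part of ROBOT or the ODK, and the sketch makes no claim about how ROBOT itself parses term files.

```python
OBO_PURL = "http://purl.obolibrary.org/obo/"


def parse_term_list(text):
    """Return (curie, label) pairs from text in the 'CURIE # label' layout."""
    terms = []
    for raw in text.splitlines():
        # Everything after the first '#' is treated as a comment or label.
        curie, _, comment = raw.partition("#")
        curie = curie.strip()
        if not curie:
            # Blank line or full-line comment such as '# needed roles'.
            continue
        terms.append((curie, comment.strip().strip("'")))
    return terms


def curie_to_iri(curie):
    """Expand e.g. 'CHEBI:33250' to an OBO PURL (assumed IRI pattern)."""
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_PURL}{prefix}_{local_id}"


# A few lines copied from the term list above.
sample = """\
# needed root terms
CHEBI:33250 # atom
CHEBI:59999 # 'chemical substance'

BFO:0000051 # 'has part'
"""

terms = parse_term_list(sample)
iris = [curie_to_iri(curie) for curie, _ in terms]
```

A check of this kind could, for example, flag entries whose comment merely repeats the identifier, like the `CHEBI:52457 # CHEBI:52457` line corrected in this commit.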

src/ontology/nmrCV-edit.owl

Lines changed: 20 additions & 18 deletions
@@ -3868,10 +3868,11 @@ AnnotationAssertion(dce:source <http://nmrML.org/nmrCV#NMR:1000329> "http://www.
 AnnotationAssertion(rdfs:label <http://nmrML.org/nmrCV#NMR:1000329> "AB multiplet pattern"@en)
 SubClassOf(<http://nmrML.org/nmrCV#NMR:1000329> <http://nmrML.org/nmrCV#NMR:1400305>)
 
-# Class: <http://nmrML.org/nmrCV#NMR:1000330> (NMR solvent)
+# Class: <http://nmrML.org/nmrCV#NMR:1000330> (NMR solvent (molecular entity))
 
+AnnotationAssertion(obo:IAO_0000111 <http://nmrML.org/nmrCV#NMR:1000330> "NMR solvent"@en)
 AnnotationAssertion(obo:IAO_0000115 <http://nmrML.org/nmrCV#NMR:1000330> "A molecular entity that is used as a solvent in nuclear magnetic resonance (NMR) spectroscopy"@en)
-AnnotationAssertion(rdfs:label <http://nmrML.org/nmrCV#NMR:1000330> "NMR solvent"@en)
+AnnotationAssertion(Annotation(skos:editorialNote "We needed to add 'molecular entity' to the label to avoid clashes with the similarly labeled role class in CHEBI."@en) rdfs:label <http://nmrML.org/nmrCV#NMR:1000330> "NMR solvent (molecular entity)"@en)
 EquivalentClasses(<http://nmrML.org/nmrCV#NMR:1000330> ObjectIntersectionOf(obo:CHEBI_23367 ObjectSomeValuesFrom(obo:RO_0000087 obo:CHEBI_197449)))
 SubClassOf(Annotation(oboInOwl:is_inferred "true") <http://nmrML.org/nmrCV#NMR:1000330> obo:CHEBI_23367)
 
@@ -6743,10 +6744,6 @@ SubClassOf(<http://nmrML.org/nmrCV#NMR:1400320> <http://nmrML.org/nmrCV#NMR:1400
 AnnotationAssertion(rdfs:label <http://nmrML.org/nmrCV#NMR:1400321> "Bruker WIN NMR format")
 SubClassOf(<http://nmrML.org/nmrCV#NMR:1400321> <http://nmrML.org/nmrCV#NMR:1400285>)
 
-# Class: obo:BFO_0000023 (role)
-
-EquivalentClasses(obo:BFO_0000023 obo:CHEBI_50906)
-
 # Class: obo:CHEBI_156265 (methanol-d4)
 
 SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_156265 <http://nmrML.org/nmrCV#NMR:1000330>)
@@ -6880,13 +6877,9 @@ SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_229457 <http://nmrM
 
 SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_229458 <http://nmrML.org/nmrCV#NMR:1400033>)
 
-# Class: obo:CHEBI_24431 (chemical entity)
-
-SubClassOf(obo:CHEBI_24431 obo:BFO_0000040)
+# Class: obo:CHEBI_23367 (molecular entity)
 
-# Class: obo:CHEBI_24432 (biological role)
-
-SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_24432 obo:BFO_0000023)
+SubClassOf(obo:CHEBI_23367 obo:BFO_0000040)
 
 # Class: obo:CHEBI_26078 (phosphoric acid)
 
@@ -6934,7 +6927,15 @@ SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_33093 <http://nmrML
 
 # Class: obo:CHEBI_33232 (application)
 
-SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_33232 obo:BFO_0000023)
+SubClassOf(obo:CHEBI_33232 obo:BFO_0000023)
+
+# Class: obo:CHEBI_33250 (atom)
+
+SubClassOf(obo:CHEBI_33250 obo:BFO_0000040)
+
+# Class: obo:CHEBI_33252 (atomic nucleus)
+
+SubClassOf(obo:CHEBI_33252 obo:BFO_0000040)
 
 # Class: obo:CHEBI_33681 (helium(0))
 
@@ -6973,6 +6974,7 @@ SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_39429 <http://nmrML
 # Class: obo:CHEBI_41981 (dideuterium oxide)
 
 SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_41981 <http://nmrML.org/nmrCV#NMR:1000330>)
+SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_41981 <http://nmrML.org/nmrCV#NMR:1400033>)
 
 # Class: obo:CHEBI_45892 (trifluoroacetic acid)
 
@@ -6999,13 +7001,9 @@ SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_47032 <http://nmrML
 SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_48236 <http://nmrML.org/nmrCV#NMR:1000330>)
 SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_48236 <http://nmrML.org/nmrCV#NMR:1400033>)
 
-# Class: obo:CHEBI_50906 (role)
-
-SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_50906 obo:BFO_0000017)
-
 # Class: obo:CHEBI_51086 (chemical role)
 
-SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_51086 obo:BFO_0000023)
+SubClassOf(obo:CHEBI_51086 obo:BFO_0000023)
 
 # Class: obo:CHEBI_5115 (monofluorobenzene)
 
@@ -7019,6 +7017,10 @@ SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_55317 <http://nmrML
 
 SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_59606 <http://nmrML.org/nmrCV#NMR:1400033>)
 
+# Class: obo:CHEBI_59999 (chemical substance)
+
+SubClassOf(obo:CHEBI_59999 obo:BFO_0000040)
+
 # Class: obo:CHEBI_63005 (sodium nitrate)
 
 SubClassOf(Annotation(oboInOwl:is_inferred "true") obo:CHEBI_63005 <http://nmrML.org/nmrCV#NMR:1400033>)
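
The `EquivalentClasses` axiom for `NMR:1000330` in this file defines the class as "molecular entity that has role some NMR solvent", and the `is_inferred "true"` `SubClassOf` axioms record which ChEBI classes a reasoner placed under such defined classes. The Python sketch below is a toy model of that classification step under strong simplifying assumptions: it only handles named superclasses and `has role` restrictions, it is not how ELK works, and the helper names, the direct parent link, and the `EX:` identifier are hypothetical.

```python
def ancestors(cls, parents):
    """All named superclasses of cls, following asserted parent links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_subclasses(genus, role, parents, has_role):
    """Classes satisfying 'genus and (has role some role)' in the toy model."""
    inferred = []
    for cls in sorted(set(parents) | set(has_role)):
        lineage = {cls} | ancestors(cls, parents)
        # Roles asserted on the class itself or inherited from a superclass.
        roles = {r for c in lineage for r in has_role.get(c, ())}
        bears_role = any(r == role or role in ancestors(r, parents) for r in roles)
        if genus in lineage and bears_role:
            inferred.append(cls)
    return inferred


# Simplified hierarchy: the direct link to 'molecular entity' is a shortcut,
# and EX:plain_compound is a made-up class with no role.
parents = {
    "CHEBI:41981": ["CHEBI:23367"],        # dideuterium oxide
    "EX:plain_compound": ["CHEBI:23367"],  # hypothetical
}
has_role = {"CHEBI:41981": ["CHEBI:197449"]}  # has role some 'NMR solvent'

nmr_solvents = infer_subclasses("CHEBI:23367", "CHEBI:197449", parents, has_role)
```

In this toy run only `CHEBI:41981` is grouped under the defined class, which mirrors the inferred `SubClassOf(... obo:CHEBI_41981 <http://nmrML.org/nmrCV#NMR:1000330>)` axiom shown in the diff.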

src/ontology/nmrCV-odk.yaml

Lines changed: 2 additions & 0 deletions
@@ -52,4 +52,6 @@ import_group:
 - id: chebi
   is_large: true
   use_gzipped: true
+  # Using ROBOT filter in nmrCV.Makefile because the default ODK ROBOT extract method pulls in too much from CHEBI.
+  module_type: custom
 

src/ontology/nmrCV.Makefile

Lines changed: 29 additions & 1 deletion
@@ -13,4 +13,32 @@ $(IMPORTDIR)/obi_import.owl: $(MIRRORDIR)/obi.owl $(IMPORTDIR)/obi_terms.txt
 		extract -T $(IMPORTDIR)/obi_terms.txt --force true --copy-ontology-annotations true --individuals exclude --method BOT \
 		query --update ../sparql/inject-subset-declaration.ru --update ../sparql/inject-synonymtype-declaration.ru --update ../sparql/postprocess-module.ru \
 		remove -T $(IMPORTDIR)/obi_remove_list.txt --select "self descendants instances" --signature true \
-		$(ANNOTATE_CONVERT_FILE); fi
+		$(ANNOTATE_CONVERT_FILE); fi
+
+
+## Module for ontology: chebi
+
+# We use ROBOT filter instead of the default ODK ROBOT extract method because the latter pulls in too much from ChEBI.
+# Using ROBOT filter like this allows us to import only the terms we need under their very general ChEBI 'root' parents.
+# This ROBOT filter approach means that not all axioms of the imported terms are imported as well,
+# e.g. currently only 'has part' and 'has role' relations between the terms specified in chebi_terms.txt are
+# imported.
+# If other axioms are needed in the future, the nmrCV editors need to make sure to include the object properties
+# and classes used in them. This is quite time-consuming, but necessary, as ROBOT extract pulls in too much
+# and the ChEBI module would otherwise be too big to load.
+# Since we also define classes based on the roles borne by some ChEBI terms, e.g. a 'chemical shift reference compound'
+# is equivalent to "'molecular entity' and ('has role' some 'NMR chemical shift reference compound [role]')",
+# we additionally run a reasoning step to materialize the SubClassOf axioms needed to group these ChEBI terms under
+# the classes we define.
+
+$(IMPORTDIR)/chebi_import.owl: $(MIRRORDIR)/chebi.owl $(IMPORTDIR)/chebi_terms_combined.txt
+	if [ $(IMP) = true ] && [ $(IMP_LARGE) = true ]; then $(ROBOT) \
+		filter -i $< -T $(IMPORTDIR)/chebi_terms_combined.txt --signature false --select "annotations self" \
+		--exclude-term http://purl.obolibrary.org/obo/CHEBI_24431 \
+		--exclude-term http://purl.obolibrary.org/obo/CHEBI_24432 \
+		--exclude-term http://purl.obolibrary.org/obo/CHEBI_50906 \
+		query --update ../sparql/inject-subset-declaration.ru --update ../sparql/inject-synonymtype-declaration.ru \
+		--update ../sparql/postprocess-module.ru \
+		$(ANNOTATE_CONVERT_FILE); fi
+	$(ROBOT) reason --reasoner ELK -i $(SRC) --exclude-duplicate-axioms true --exclude-tautologies all --equivalent-classes-allowed asserted-only \
+		--annotate-inferred-axioms true --axiom-generators "SubClass" convert -f ofn --output $(SRC)
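
The comment block in this rule says that, with the filter approach, only relations between terms named in the term list end up in the import module. The Python sketch below is a toy model of that stated effect, not ROBOT's implementation or its exact option semantics: axioms are reduced to `(subject, property, object)` triples, and one survives only if all three identifiers are in the keep list. The function name `filter_axioms` and the `EX:` identifiers are hypothetical.

```python
def filter_axioms(axioms, keep):
    """Keep a triple only if subject, property and object are all listed."""
    keep = set(keep)
    return [axiom for axiom in axioms if set(axiom) <= keep]


# Term list entries relevant to this example (taken from chebi_terms.txt).
keep = ["CHEBI:41981", "RO:0000087", "CHEBI:197449"]

axioms = [
    # dideuterium oxide 'has role' some 'NMR solvent': every term is listed.
    ("CHEBI:41981", "RO:0000087", "CHEBI:197449"),
    # Hypothetical relation that is not in the term list: dropped.
    ("CHEBI:41981", "EX:other_relation", "CHEBI:197449"),
    # Hypothetical role that is not in the term list: dropped.
    ("CHEBI:41981", "RO:0000087", "EX:unlisted_role"),
]

kept = filter_axioms(axioms, keep)
```

This is why the comment warns that any additional axiom needed later requires adding its object property and every class it mentions to the term list first.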
