[OTDev] RDF, APIs and ontologies

Nina Jeliazkova nina at acad.bg
Fri Nov 13 12:49:40 CET 2009



Christoph Helma wrote:
> Excerpts from Nina Jeliazkova's message of Wed Nov 11 14:54:42 +0100 2009:
>   
>> Dear Christoph, All,
>>
>> I would suggest starting with an example.  Before the Friday meeting it
>> would be good to have a specific idea of how to represent features in RDF.
>> We can consider the BO ontology for descriptors and the preliminary
>> carcinogenicity ontology that Olga Tcheremenskaia showed yesterday during
>> the online meeting.
>>
>> So far we have identified that the following information is necessary to
>> describe a feature:
>>
>> 1) Name
>> 2) Units
>> 3) Data type (numeric, string, etc.)
>> 4) Where the feature originates from: this can be an algorithm used to
>> calculate it, a model, a measurement protocol, a literature reference, or
>> another data source.
>>
>> RDF suggestions to represent this information are welcome. 
>>     
>
> I would represent feature values in the dataset RDF as follows:
>
> 	@prefix compound: <http://webservices.in-silico.ch/compound/> .
> 	@prefix feature: <http://opentox.org/ontologies/features/> .
>
> 	compound:{compound_id} feature:{feature_id} {feature_value} .
>
> Examples:
>
> 	# Carcinogenicity classification
> 	# if we are happy with the DSSTOX definition
> 	compound:InChI=1S/C6H5NO2/c8-7(9)6-4-2-1-3-5-6/h1-5H <http://www.epa.gov/ncct/dsstox/CentralFieldDef.html#ActivityOutcome_CPDBAS_MultiCellCall> true . # true and false are boolean literals in N3, you can also define datatypes explicitly (http://www.w3.org/TR/rdf-mt/#dtype_interp)
>
> 	# if we want to manage our own definitions
> 	compound:InChI=1S/C6H5NO2/c8-7(9)6-4-2-1-3-5-6/h1-5H feature:multi_cell_call true . 
>
> 	# Rat TD50
> 	compound:InChI=1S/C6H5NO2/c8-7(9)6-4-2-1-3-5-6/h1-5H feature:rat_td50_mmol 0.207 . # implies numeric values
>
> 	# BBRC structural feature from supervised graph mining
> 	compound:InChI=1S/C6H5NO2/c8-7(9)6-4-2-1-3-5-6/h1-5H feature:bbrc_representative  [ <#smarts> "NO"; <#p_value> 0.99;  <#effect> "activating"  ]. # a more complex feature with name/value pairs
> 	
> 	...
>
> GET http://opentox.org/ontologies/features/{feature_id} should return the feature definitions in RDF like:
>
> 	@prefix feature: <http://opentox.org/ontologies/features/> .
> 	@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>
> 	feature:{feature_id} rdfs:label {feature_name} .
> 	feature:{feature_id} whatever:unit {feature_unit} . # I would have to find an ontology entry, maybe there is something in Blue Obelisk or ChemAxon
> 	feature:{feature_id} whatever:source {uri_for_algorithm_or_model_or_protocol_or_reference} . # have to find a suitable ontology
> 	# if we need to specify algorithm/model/... parameters
> 	{uri_for_algorithm_or_model_or_protocol_or_reference} whatever:parameters {parameter_value} . # have to find a suitable ontology
>
> Examples:
>
> 	feature:multi_cell_call rdfs:label "DSSTOX/CPDB Multi Cell Call" .
> 	# no unit - nothing to define here
> 	feature:multi_cell_call whatever:source <http://www.epa.gov/ncct/dsstox/StructureDataFiles/CPDBAS_DownloadFiles/CPDBAS_v5d_1547_20Nov2008.zip> . # source file
> 	# http://www.epa.gov/ncct/dsstox/CentralFieldDef.html#TD50_Rat_mmol
> 	feature:rat_td50_mmol whatever:unit "mmol/kg-bw/day" .
> 	feature:rat_td50_mmol whatever:source <http://www.epa.gov/ncct/dsstox/StructureDataFiles/CPDBAS_DownloadFiles/CPDBAS_v5d_1547_20Nov2008.zip> . # source file
> 	feature:bbrc_representative rdfs:label "Backbone refinement class representatives" .
> 	feature:bbrc_representative whatever:source <http://webservices.in-silico.ch/algorithms/fminer> .
> 	<http://webservices.in-silico.ch/algorithms/fminer> whatever:parameters [ <#dataset_uri> <http://webservices.in-silico.ch/dataset/3> ] .
>
> POSTing the same RDF to http://opentox.org/ontologies/features/ should
> create http://opentox.org/ontologies/features/{feature_id}. PUT and
> DELETE would work analogously.
>
>   
For everybody's convenience, I am gathering links to existing
ontologies at
http://opentox.org/dev/apis/api-1.1/feature_ontology/ontologies_existing/onto_list
There are links to various ontologies related to chemistry and data mining,
as well as generic ones such as Dublin Core and measurement units.

The proposal sounds reasonable as a start.  We will no doubt be refining a
lot of things when going into implementation.

I would propose:
1) Every OpenTox object makes use of the Dublin Core ontology to define
title, subject, description, type, source, relation, creator and
publisher.  An excerpt from the Dublin Core elements is below:
http://dublincore.org/documents/usageguide/elements.shtml
4.1. Title
4.2. Subject
4.3. Description
4.4. Type
4.5. Source
4.6. Relation
4.8. Creator
4.9. Publisher
4.10. Contributor
4.11. Rights
4.12. Date
4.13. Format
4.14. Identifier
4.16. Audience
4.17. Provenance

For example, the "Source" element can be used to refer to the algorithm
used to generate a feature, or to the original data source or
publication.  The "Relation" element can be used to denote that the feature
is, e.g., a carcinogenicity endpoint, by referring to a carcinogenicity
ontology.
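
A rough sketch of what such a Dublin Core description could look like when
emitted as N-Triples.  The function name feature_triples is only
illustrative, and the carcinogenicity ontology URI is a placeholder
(example.org), since we have not agreed on one yet; the feature namespace
and the fminer URI are taken from Christoph's examples.

```python
# Dublin Core element set namespace and the feature namespace from the proposal.
DC = "http://purl.org/dc/elements/1.1/"
FEATURE = "http://opentox.org/ontologies/features/"


def feature_triples(feature_id, title, source_uri=None, relation_uri=None):
    """Return N-Triples lines describing one feature with Dublin Core terms."""
    subject = f"<{FEATURE}{feature_id}>"
    # Escape backslashes and double quotes so the title is a valid literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [f'{subject} <{DC}title> "{escaped}" .']
    if source_uri:
        # dc:source: algorithm, original data source or publication.
        lines.append(f"{subject} <{DC}source> <{source_uri}> .")
    if relation_uri:
        # dc:relation: e.g. an endpoint defined in another ontology.
        lines.append(f"{subject} <{DC}relation> <{relation_uri}> .")
    return lines


for line in feature_triples(
    "bbrc_representative",
    "Backbone refinement class representatives",
    source_uri="http://webservices.in-silico.ch/algorithms/fminer",
    # Placeholder URI, not an agreed ontology location.
    relation_uri="http://example.org/ontology/carcinogenicity",
):
    print(line)
```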


2) Does the proposal mean we abandon the API that allows retrieving
feature values, given compound and feature identifiers?
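
To make the question concrete: with the dataset RDF proposed above, such a
lookup amounts to matching the triple pattern (compound, feature, ?).  A
minimal sketch, assuming the dataset is already parsed into an in-memory
list of (subject, predicate, object) tuples; the function name
feature_values is hypothetical and not part of any existing API.

```python
COMPOUND = "http://webservices.in-silico.ch/compound/"
FEATURE = "http://opentox.org/ontologies/features/"


def feature_values(triples, compound_uri, feature_uri):
    """Return the objects of all triples matching (compound_uri, feature_uri, ?)."""
    return [o for s, p, o in triples if s == compound_uri and p == feature_uri]


# The compound and values below are the ones used in Christoph's examples.
compound = COMPOUND + "InChI=1S/C6H5NO2/c8-7(9)6-4-2-1-3-5-6/h1-5H"
dataset = [
    (compound, FEATURE + "multi_cell_call", True),
    (compound, FEATURE + "rat_td50_mmol", 0.207),
]

print(feature_values(dataset, compound, FEATURE + "rat_td50_mmol"))
```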

Best regards,
Nina

> Best regards,
> Christoph



