[OTDev] RDF for dataset representation
Nina Jeliazkova nina at acad.bg
Wed Oct 28 14:45:27 CET 2009
Hi Christoph,

Sounds good, see more comments inline, mostly based on use cases I am interested in.

Christoph Helma wrote:
> Hi all,
>
> I had a closer look at the RDF stuff, and from my superficial understanding up
> to now (no practical experience yet) it seems to be a good exchange format for
> the dataset component. I would suggest the following convention for creating
> RDF triples that represent a dataset:
>
> Subject: Compound URI
> Predicate: Measurement/Algorithm definition URI
> Object: Feature value
>
> An example could look like (in Notation 3 http://www.w3.org/2000/10/swap/Primer):
>
> @prefix algorithm: <http://www.opentox.org/ontologies/algorithm/> .
> @prefix toxicity: <http://www.opentox.org/ontologies/toxicity/> .
> @prefix blueobelisk: <http://blueobelisk.sourceforge.net/ontologies/chemoinformatics-algorithms/#> .
>
> # Examples:
>
> # a calculated logP
> <http://webservices.in-silico.ch/compound/InChI=1S/H4N2/c1-2/h1-2H2> blueobelisk:xlogP -2.20 .
>

- Do we need to distinguish between different implementations of, e.g., XLogP (I would say yes)? Can this be handled via the BO ontology, or do we need an extension?
- What would be the best way to extend the BO ontology (this is more a question for Egon)?
- How would we handle quantities defined in existing data sets (e.g. all LogP flavours available in EPA DSSTOX) that are not calculated via OpenTox, or quantities in a user-uploaded dataset?
- How do we handle quantities calculated via some algorithm, but with different parameters (e.g. eHOMO calculated with AM1 or PM3)?

I would prefer that the property (e.g. blueobelisk:xlogp) refer to a specific implementation, rather than to the algorithm itself (the same concept as the algorithm/model split we already invented). The implementation itself will be linked to the algorithm.
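To make the convention concrete, here is a minimal sketch of the subject/predicate/object layout, with the predicate pointing at a specific implementation that is in turn linked to the algorithm. It uses plain Python tuples rather than an RDF library. The `http://example.org/opentox-sketch/` namespace, the `xlogP-impl-A` implementation name, and the `implementationOf` link are hypothetical, invented purely for illustration; they are not part of the BO ontology or of any OpenTox API.

```python
# Sketch only: plain tuples stand in for an RDF graph; no RDF library is used.
BO = "http://blueobelisk.sourceforge.net/ontologies/chemoinformatics-algorithms/#"
# Hypothetical namespace, invented for this illustration.
EX = "http://example.org/opentox-sketch/"

COMPOUND = "http://webservices.in-silico.ch/compound/InChI=1S/H4N2/c1-2/h1-2H2"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def uri(value):
    """Wrap a URI in angle brackets, N-Triples style."""
    return "<" + value + ">"


def literal(value, datatype=None):
    """Format a simple literal; no escaping is done, so keep values plain."""
    text = '"' + str(value) + '"'
    return text + "^^" + uri(datatype) if datatype else text


def dataset_triples():
    # Hypothetical implementation-specific property (assumption, not in BO).
    implementation = EX + "implementation/xlogP-impl-A"
    return [
        # Subject: compound URI, predicate: implementation URI, object: value.
        (uri(COMPOUND), uri(implementation), literal("-2.20", XSD_DECIMAL)),
        # The implementation is linked back to the algorithm it implements.
        (uri(implementation), uri(EX + "implementationOf"), uri(BO + "xlogP")),
    ]


def to_ntriples(triples):
    """Serialize triples as one N-Triples-style statement per line."""
    return "\n".join(" ".join(triple) + " ." for triple in triples)


if __name__ == "__main__":
    print(to_ntriples(dataset_triples()))
```

With this layout, two XLogP implementations (or eHOMO computed with AM1 versus PM3) would simply get two distinct predicate URIs, each linked to the same algorithm resource.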
Looking into the current list of feature definitions in Ambit (http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition), most of them can be mapped to existing or to-be-developed ontologies, but we need to extend your proposal so that it keeps track of the source of the data. For example, it is important to know that the feature MolWeight <http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition/12109> represents molecular weight, but I would not want to lose the information that it came from ISSCAN_v3a_1153_19Sept08.1222179139.sdf <http://www.epa.gov/NCCT/dsstox/sdf_isscan_external.html>
http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition/12109 <http://ambit.uni-plovdiv.bg:8080/ambit2/feature_definition/11945>

This was the primary reason for defining a feature definition as name + reference - I am sure this can be described in RDF as well.

> # toxicological classification
> <http://webservices.in-silico.ch/compound/InChI=1S/H4N2/c1-2/h1-2H2> toxicity:multi_cell_call "active" .
>
> # a class sensitive structural feature, calculated by an algorithm that does not yet exist in an established ontology
> <http://webservices.in-silico.ch/compound/InChI=1S/H4N2/c1-2/h1-2H2> algorithm:backbone_refinement_class [ <#smarts> "N-N"; <#p_value> 0.9998; <#effect> "activating" ] .
>
>

Actually I was thinking of an (extensible) ontology for SMARTS-defined fragments; the ChEBI ontology has a lot of predefined groups that can be used.
The read-across use case will benefit from that :)

> or in RDF/XML:
>
> <rdf:RDF xmlns="file:///home/ch/ontologies/tmp#"
>     xmlns:algorithm="http://www.opentox.org/ontologies/algorithm/"
>     xmlns:blueobelisk="http://blueobelisk.sourceforge.net/ontologies/chemoinformatics-algorithms/#"
>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>     xmlns:toxicity="http://www.opentox.org/ontologies/toxicity/">
>
>   <rdf:Description rdf:about="http://webservices.in-silico.ch/compound/InChI=1S/H4N2/c1-2/h1-2H2">
>     <blueobelisk:xlogP rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">-2.2</blueobelisk:xlogP>
>     <algorithm:backbone_refinement_class rdf:parseType="Resource">
>       <effect>activating</effect>
>       <p_value rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">0.9998</p_value>
>       <smarts>N-N</smarts>
>     </algorithm:backbone_refinement_class>
>     <toxicity:multi_cell_call>active</toxicity:multi_cell_call>
>   </rdf:Description>
> </rdf:RDF>
>
> Advantages:
>
> - We do not need a separate feature webservice (at least for simple feature values and moderately complex features, like the tuples in the BBRC example)
> - We do not necessarily need a feature-ontology (or feature-definition) webservice, if we use, expand and combine existing ontologies
>

We would need a way to handle dynamically defined properties and even ontologies. I am particularly thinking of user-defined datasets.

> - It can help us to solve the problem of unique IDs, by using URIs
>

AFAIK, that will require an RDF store for the ontology service (a centralised one?) - am I right? It would be good if there is a distributed solution.

> - Plays well with REST
> - Established standard
> - Facilitates queries/reasoning (especially useful for building GUIs)
>
> Possible disadvantages:
>
> - Support in programming languages?
>

There are several Java libraries; even Restlet 2.x has some support (no querying): a graph structure with serialization to several formats.
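On the question of language support, the RDF/XML example quoted above is simple enough to read with a standard XML parser. The following is a minimal sketch using only the Python standard library. It is not a general RDF/XML parser: it assumes exactly the shape shown in the message (one `rdf:Description` per compound, property elements with literal text, and one level of `rdf:parseType="Resource"` nesting). The helper names `predicate_uri`, `property_value` and `read_features` are made up for this sketch, and all values are kept as strings.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The RDF/XML document quoted in the message.
RDF_XML = """<rdf:RDF xmlns="file:///home/ch/ontologies/tmp#"
    xmlns:algorithm="http://www.opentox.org/ontologies/algorithm/"
    xmlns:blueobelisk="http://blueobelisk.sourceforge.net/ontologies/chemoinformatics-algorithms/#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:toxicity="http://www.opentox.org/ontologies/toxicity/">
  <rdf:Description rdf:about="http://webservices.in-silico.ch/compound/InChI=1S/H4N2/c1-2/h1-2H2">
    <blueobelisk:xlogP rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">-2.2</blueobelisk:xlogP>
    <algorithm:backbone_refinement_class rdf:parseType="Resource">
      <effect>activating</effect>
      <p_value rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">0.9998</p_value>
      <smarts>N-N</smarts>
    </algorithm:backbone_refinement_class>
    <toxicity:multi_cell_call>active</toxicity:multi_cell_call>
  </rdf:Description>
</rdf:RDF>"""


def predicate_uri(element):
    """Turn ElementTree's "{namespace}local" tag into a full predicate URI."""
    tag = element.tag
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace + local
    return tag


def property_value(element):
    """Return literal text, or a dict for nested (parseType="Resource") values."""
    if len(element):
        return {predicate_uri(child): property_value(child) for child in element}
    return (element.text or "").strip()


def read_features(rdf_xml):
    """Map each compound URI to a dict of predicate URI -> feature value."""
    root = ET.fromstring(rdf_xml)
    features = {}
    for description in root.findall("{%s}Description" % RDF_NS):
        compound = description.get("{%s}about" % RDF_NS)
        features[compound] = {
            predicate_uri(prop): property_value(prop) for prop in description
        }
    return features


if __name__ == "__main__":
    for compound, properties in read_features(RDF_XML).items():
        print(compound)
        for predicate, value in properties.items():
            print("  ", predicate, "=", value)
```

A real service would more likely rely on a proper RDF library, since RDF/XML allows many equivalent serializations that this sketch does not cover.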
> I suspect that RDF could be also useful for the representation of other
> OpenTox objects (Algorithms, Models, ...).
>

Yes. Could we have a closer look at the Algorithm object in the BO dictionary and decide whether it can be reused in OpenTox?

> Any opinions?
>

Needless to say, I am in favour of trying to cast OpenTox objects to RDF instead of custom formats.

BTW, a couple of weeks ago I started a list of potentially useful ontologies at http://opentox.org/dev/apis/api-1.1/feature_ontology/ontologies_existing/onto_list/?searchterm=existing%20ontologies . If there is a better place for the list on the site, please feel free to move it.

Best regards,
Nina

> Best regards,
> Christoph
> _______________________________________________
> Development mailing list
> Development at opentox.org
> http://www.opentox.org/mailman/listinfo/development
>