[OTDev] POST for feature selection

Christoph Helma helma at in-silico.de
Wed Oct 28 14:32:15 CET 2009


Excerpts from Nina Jeliazkova's message of Mon Oct 19 12:35:34 +0200 2009:
> Hi Tobias,
> 
> Tobias Girschick wrote:
> > On Fri, 2009-10-16 at 13:22 +0200, Christoph Helma wrote:
> >   
> >> Excerpts from Nina Jeliazkova's message of Tue Oct 13 12:08:44 +0200 2009:
> >>     
> >>>> Well, yes, but that's the goal of creating/using an ontology, isn't it?
> >>>>   
> >>>>         
> >>> Indeed.
> >>>
> >>> I agree with simplification of the URIs, few questions follow.
> >>>
> >>> - If we have the so-called Algorithm Ontology service completely
> >>> independent of the Algorithm service, how does the Algorithm Ontology
> >>> service know which Algorithms are available?
> >>>       
> >> It could register a new algorithm, e.g. with
> >>
> >> PUT /algorithm-ontology/{class}/{subclass} algorithm_uri
> >>     
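
In code, such a registration could look e.g. like this (just a sketch:
the service URL, the class/subclass names and the text/uri-list payload
are assumptions, not an agreed interface):

    import requests

    # Hypothetical ontology service and algorithm URIs (not real endpoints)
    ONTOLOGY_SERVICE = "http://example.org/algorithm-ontology"
    ALGORITHM_URI = "http://example.org/algorithm/xlogp"

    # Register the algorithm under class "descriptor", subclass "constitutional"
    response = requests.put(
        f"{ONTOLOGY_SERVICE}/descriptor/constitutional",
        data=ALGORITHM_URI,
        headers={"Content-Type": "text/uri-list"},
    )
    response.raise_for_status()
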
> >
> > This sounds intuitive; the problem I see is that of the different
> > servers. To which server do I PUT the new algorithm? Only my own, where
> > the algorithm is also provided? All servers that provide algorithms
> > (IMHO not a good idea)?
> Why not? Each particular server need not support the whole ontology,
> only the part that is relevant.  Besides, the purpose of an ontology is
> common meaning; what I would be interested in is being able to query like
> "get me all feature definitions that represent 'Melting point'" or
> "retrieve all classification algorithms". "Melting point" in the sense
> of the ontology is a "class" and a particular "MP_1" feature definition is
> an instance of that class.
> 
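
Such a query could be expressed e.g. as SPARQL against the shared
ontology file (just a sketch: the file name and the class URI for
"Melting point" are made up, the real URIs would come from the agreed
endpoint ontology):

    from rdflib import Graph

    g = Graph()
    # Hypothetical local copy of the shared endpoint ontology
    g.parse("ontologies/endpoints.owl", format="xml")

    # "Melting point" is a class; feature definitions such as MP_1 are
    # instances of it, so we ask for all instances of that class.
    query = """
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?feature WHERE {
        ?feature rdf:type <http://example.org/endpoints#MeltingPoint> .
    }
    """
    for row in g.query(query):
        print(row[0])
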
> Following these thoughts, my initial idea is that each resource (e.g.
> feature definition, algorithm, model) comes with an assigned meaning
> (e.g. MP_1 is assigned "Melting Point"), based on the corresponding
> ontology.  Christoph suggested splitting ontologies and resources, IMHO
> mostly driven by the fact that he is developing a single application, where
> the resources have predefined (hardcoded) meanings and an ontology
> resource is perceived as something optional.

My main intention is to separate the semantic part (which is mostly
needed at the application/GUI level) from the core information that is
needed by the individual webservices. This should allow us to build
small and efficient webservices that do not have to care about
semantics, and to assemble them into applications that use the
ontologies to put the computational results into a meaningful
(toxicological) context.
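
To illustrate the kind of split I have in mind (all URIs and the JSON
layout below are invented for this sketch): the core webservice returns
only the bare value, and the application asks an ontology service for
the meaning when it presents the result.

    import requests

    CORE = "http://core.example.org"          # hypothetical core webservice
    ONTOLOGY = "http://ontology.example.org"  # hypothetical ontology service

    # Core webservice: returns the bare feature value, no semantics attached
    feature = requests.get(f"{CORE}/feature/MP_1",
                           headers={"Accept": "application/json"}).json()
    # e.g. {"uri": "http://core.example.org/feature/MP_1", "value": 42.0}

    # Application/GUI layer: looks up what MP_1 means in toxicological terms
    meaning = requests.get(f"{ONTOLOGY}/lookup",
                           params={"resource": feature["uri"]},
                           headers={"Accept": "application/json"}).json()
    # e.g. {"class": "MeltingPoint", "endpoint": "Physicochemical effects"}

    print(feature["value"], "->", meaning["class"])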

I am also too lazy to build my own ontologies, and I hope to reuse
external ontologies when it comes to application development.

> I am fine with the split, provided that an ontology service for each
> kind of resource is mandatory, so that the meaning of each resource
> will not rely on anything predefined and specific to the application.
> In my view, the easiest way is for everybody implementing a
> "feature" or "algorithm" resource to also implement its ontology
> counterpart. For these to be compatible, we need a common description of
> the ontologies; for example, for descriptors we can use the Blue Obelisk
> OWL file (and perhaps extend it).  If all implementations depend on the
> same OWL file, there should not be compatibility issues.
> 
> An example: below is the RDF description of the XLogP descriptor from the
> BO ontology.  MolecularDescriptor is a class defined in the BO dictionary;
> the same holds for constitutionalDescriptor.  All OpenTox resources
> implementing XLogP could refer to this entry in the BO ontology.
> Additionally, we might establish an OWL for endpoints and annotate that
> XLogP is "Octanol water partition coefficient" and that "Octanol water
> partition coefficient" belongs to "Physicochemical effects" (following
> the ECHA vocabulary).
> 
>         <MolecularDescriptor rdf:about="&me;xlogP">
>             <rdfs:label>XLogP</rdfs:label>
>             <dc:contributor rdf:resource="&me;mf"/>
>             <dc:contributor rdf:resource="&me;elw"/>
>             <dc:date>2005-01-27</dc:date>
>             <definition rdf:parseType='Literal'>
>                 Prediction of logP based on the atom-type method called
>     XLogP.
>             </definition>
>             <description rdf:parseType='Literal'>
>                 For a description of the methodology see
>                 <bibtex:cite ref="WANG97"/>
>                 .
>             </description>
>             <isClassifiedAs rdf:resource="&me;constitutionalDescriptor"/>
>         </MolecularDescriptor>
> 
> > Or do we want a central server that hosts at least
> > the ontologies (it's the same question for the feature ontology as
> >   
> yes, of course it's the same issue.
> > well)? That would make it easier to control the ontology vocabulary...
> >   
> Well, I thought a set of common .owl files would be sufficient to keep
> a common vocabulary.
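
To illustrate how an OpenTox resource could refer to the shared BO entry
above (just a sketch: the namespace URI -- i.e. what "&me;" expands to --
and the local algorithm URI are assumptions):

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF, RDFS

    # Assumed namespace for the BO descriptor ontology; the exact URI may differ
    BO = Namespace(
        "http://blueobelisk.sourceforge.net/ontologies/chemoinformatics-algorithms/#")
    algorithm = URIRef("http://example.org/algorithm/xlogp")  # local implementation

    g = Graph()
    g.bind("bo", BO)
    # Point the local implementation at the shared BO definition instead of
    # re-defining the meaning of XLogP locally.
    g.add((algorithm, RDF.type, BO.MolecularDescriptor))
    g.add((algorithm, RDFS.seeAlso, BO.xlogP))
    g.add((algorithm, RDFS.label, Literal("XLogP (local implementation)")))

    print(g.serialize(format="xml"))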

If we adopt RDF for data exchange (see my separate post on this list),
we might need ontology webservices to obtain valid URIs for the
predicates (e.g.
<http://blueobelisk.sourceforge.net/ontologies/chemoinformatics-algorithms/#xlogP>,
but this link seems to be defunct). I suspect that local references
could also work, but this would mean a duplication of effort (and we
would have to make sure that everyone uses the same vocabulary version).
Although local files have their advantages for single-machine
installations, I would suggest the following strategy:

- Use (and contribute to) external ontology services whenever possible
- Run our own service(s) for vocabularies that do not fit into external
  ontologies
- Pull the most recent OWL files from these services for local
  installations (a rough sketch of this follows below)
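
A rough sketch of the last point (the source list, file names and cache
directory are placeholders; the Blue Obelisk URL is the possibly defunct
one mentioned above):

    from pathlib import Path
    import requests
    from rdflib import Graph

    # Placeholder list of ontology sources to mirror locally
    ONTOLOGY_SOURCES = {
        "blueobelisk-descriptors.owl":
            "http://blueobelisk.sourceforge.net/ontologies/chemoinformatics-algorithms/",
    }

    cache = Path("ontologies")
    cache.mkdir(exist_ok=True)

    for filename, url in ONTOLOGY_SOURCES.items():
        local_file = cache / filename
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        local_file.write_bytes(response.content)

        # Sanity check: the downloaded file should parse as RDF/XML
        graph = Graph()
        graph.parse(local_file.as_posix(), format="xml")
        print(f"{filename}: {len(graph)} triples cached")

Local installations would then load the cached files instead of hitting
the external services at runtime.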

Best regards,
Christoph


