[OTDev] POST for feature selection

Mon Oct 19 12:35:34 CEST 2009

Hi Tobias,

Tobias Girschick wrote:
> On Fri, 2009-10-16 at 13:22 +0200, Christoph Helma wrote:
>   
>> Excerpts from Nina Jeliazkova's message of Tue Oct 13 12:08:44 +0200 2009:
>>     
>>>> Well, yes, but that's the goal of creating/using an ontology, isn't it.
>>>>   
>>>>         
>>> Indeed.
>>>
>>> I agree with simplification of the URIs, few questions follow.
>>>
>>> - If we have the so-called Algorithm Ontology service completely
>>> independent of the Algorithm service, how does the Algorithm Ontology
>>> service know which Algorithms are available?
>>>       
>> It could register a new algorithm, eg. with
>>
>> PUT /algorithm-ontology/{class}/{subclass} algorithm_uri
>>     
>
> This sounds intuitive; the problem I see is that of the different
> servers. To which server do I PUT the new algorithm? Only my own, where
> the algorithm is also provided? All servers that provide algorithms
> (IMHO not a good idea)?
Why not? Each particular server need not support the whole ontology,
only the part that is relevant.  Besides, the purpose of an ontology is
common meaning; what I would be interested in is being able to run
queries like "get me all feature definitions that represent 'Melting
point'" or "retrieve all classification algorithms".  "Melting point"
in the ontology sense is a "class", and a particular "MP_1" feature
definition is an instance of that class.
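
To make the class/instance idea concrete, here is a minimal in-memory
sketch of what such an ontology service might do.  Everything in it is
an assumption for illustration: the OntologyRegistry class, its method
names and the example.org-style URIs are hypothetical, and the class
hierarchy is assumed to be acyclic.  The register() call only mimics
the effect of the PUT proposed above; it is not an agreed API.

```python
from collections import defaultdict


class OntologyRegistry:
    """Toy stand-in for an ontology service (hypothetical, in-memory only)."""

    def __init__(self):
        # ontology class name -> set of resource URIs registered as instances
        self._instances = defaultdict(set)
        # subclass name -> parent class name (assumed to form a tree)
        self._parent = {}

    def add_subclass(self, subclass, parent):
        """Declare that `subclass` is a kind of `parent`."""
        self._parent[subclass] = parent

    def register(self, cls, resource_uri):
        """Roughly the effect of PUT /algorithm-ontology/{class} algorithm_uri."""
        self._instances[cls].add(resource_uri)

    def instances_of(self, cls):
        """Return sorted URIs registered under `cls` or any of its subclasses."""
        result = set(self._instances.get(cls, ()))
        for sub, parent in self._parent.items():
            if parent == cls:
                result.update(self.instances_of(sub))
        return sorted(result)


registry = OntologyRegistry()
# Two independent servers register feature definitions with the same meaning.
registry.register("MeltingPoint", "http://server-a.example/feature/MP_1")
registry.register("MeltingPoint", "http://server-b.example/feature/MP_2")
# "retrieve all classification algorithms", including subclasses.
registry.add_subclass("DecisionTree", "Classification")
registry.register("DecisionTree", "http://server-a.example/algorithm/J48")
registry.register("Classification", "http://server-b.example/algorithm/kNN")

melting_points = registry.instances_of("MeltingPoint")
classifiers = registry.instances_of("Classification")
```

Here instances_of("MeltingPoint") returns both MP_1 and MP_2 even though
they live on different servers, which is the kind of query I have in
mind.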

Following this thought, my initial idea is that each resource (e.g.
feature definition, algorithm, model) comes with an assigned meaning
(e.g. MP_1 is assigned "Melting Point"), based on the corresponding
ontology.  Christoph suggested splitting ontologies and resources, IMHO
mostly driven by the fact that he is developing a single application,
where the resources have predefined (hardcoded) meanings and an ontology
resource is perceived as something optional.

I am fine with the split, provided that an ontology service for each
kind of resource is mandatory, so that the meaning of each resource
does not rely on anything predefined and specific to the application.
In my view, the easiest way is for everybody implementing a "feature"
resource or an "algorithm" resource to also implement its ontology
counterpart.  For these to be compatible, we need a common description
of the ontologies; for descriptors, for example, we can use the Blue
Obelisk OWL file (and perhaps extend it).  If all implementations depend
on the same file (e.g. an OWL file), there should not be compatibility
issues.

An example: below is the RDF description of the XLogP descriptor from
the BO ontology.  MolecularDescriptor is a class defined in the BO
dictionary; same for constitutionalDescriptor.  All OpenTox resources
implementing XLogP could refer to this entry in the BO ontology.
Additionally, we might establish an OWL file for endpoints and annotate
that XLogP is "Octanol water partition coefficient" and that "Octanol
water partition coefficient" belongs to "Physicochemical effects"
(following the ECHA vocabulary).

        <MolecularDescriptor rdf:about="&me;xlogP">
            <rdfs:label>XLogP</rdfs:label>
            <dc:contributor rdf:resource="&me;mf"/>
            <dc:contributor rdf:resource="&me;elw"/>
            <dc:date>2005-01-27</dc:date>
            <definition rdf:parseType='Literal'>
                Prediction of logP based on the atom-type method called
                XLogP.
            </definition>
            <description rdf:parseType='Literal'>
                For a description of the methodology see
                <bibtex:cite ref="WANG97"/>
                .
            </description>
            <isClassifiedAs rdf:resource="&me;constitutionalDescriptor"/>
        </MolecularDescriptor>
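
As a rough sketch of how a resource could refer to this entry and to an
endpoint annotation, the relations can be written as plain triples and
followed link by link.  The "bo:", "ot:" and "echa:" prefixes, the
predicate names ot:hasMeaning, ot:representsEndpoint and ot:belongsTo,
and the server URI are all invented for illustration; only xlogP,
MolecularDescriptor, isClassifiedAs and constitutionalDescriptor come
from the BO fragment above.

```python
# Hypothetical (subject, predicate, object) triples; prefixes are illustrative.
TRIPLES = [
    # From the BO fragment above.
    ("bo:xlogP", "rdf:type", "bo:MolecularDescriptor"),
    ("bo:xlogP", "bo:isClassifiedAs", "bo:constitutionalDescriptor"),
    # An OpenTox feature on some server pointing at the shared BO entry.
    ("http://server-a.example/feature/XLogP_1", "ot:hasMeaning", "bo:xlogP"),
    # Possible endpoint annotations (assumed, not an existing OWL file).
    ("bo:xlogP", "ot:representsEndpoint",
     "echa:OctanolWaterPartitionCoefficient"),
    ("echa:OctanolWaterPartitionCoefficient", "ot:belongsTo",
     "echa:PhysicochemicalEffects"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return all objects o such that (subject, predicate, o) is stated."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def endpoint_category(feature_uri, triples=TRIPLES):
    """Follow feature -> descriptor -> endpoint -> category.

    Returns None as soon as one of the links is missing.
    """
    node = feature_uri
    for predicate in ("ot:hasMeaning", "ot:representsEndpoint", "ot:belongsTo"):
        found = objects(node, predicate, triples)
        if not found:
            return None
        node = found[0]  # the sketch assumes at most one value per link
    return node


category = endpoint_category("http://server-a.example/feature/XLogP_1")
```

The point is only that the feature itself carries a single reference
(to the BO entry); everything else is derived from the shared files.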

> Or do we want a central server that hosts at least
> the ontologies (it's the same questions for the feature ontology as
>   
yes, of course it's the same issue.
> well)? That would make it easier to control the ontology vocabulary...
>   
Well, I thought a set of common .owl files would be sufficient to keep
a common vocabulary.
> any opinions? Do I miss something and we don't have this problem?
>
>   
IMHO you are not missing anything; on the contrary.  Since we are
thinking not about a single application with a shared back end, but
about multiple independent services, we do have a problem.  The problem
is that in REST style there is no notion of resource "sameness".  It's
amazing how this topic is avoided in the usual REST discussions...

I don't think we have sufficient resources to design a generic system of
interacting distributed resources (this is a research topic in itself).
We either try a workaround (a service fixed by configuration, or a
centralized service) or try to make use of RDF/OWL, where we can declare
"object1 is the same as object2".  At this moment I don't know exactly
how to incorporate the latter in our case; Ivelina (in cc) is currently
studying RDF and related topics, and hopefully we'll have a proposal
later this week.  And hopefully there will be more ideas on this list.
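
Just to illustrate what declared "sameness" could buy us (this is not
the proposal itself), here is a small sketch that treats such
declarations as an equivalence relation over URIs, in the spirit of
owl:sameAs.  The SameAsIndex class and the URIs are hypothetical; a
real solution would presumably read the declarations from RDF rather
than from method calls.

```python
class SameAsIndex:
    """Groups URIs declared to denote the same thing (union-find sketch)."""

    def __init__(self):
        # uri -> parent uri; a root is its own parent
        self._parent = {}

    def _find(self, uri):
        """Return the representative URI of the group containing `uri`."""
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path directly at the root.
        while self._parent[uri] != root:
            self._parent[uri], uri = root, self._parent[uri]
        return root

    def declare_same(self, a, b):
        """Record the statement "a is the same as b"."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same(self, a, b):
        """True if a and b are linked by any chain of declarations."""
        return self._find(a) == self._find(b)


index = SameAsIndex()
index.declare_same("http://server-a.example/algorithm/kNN",
                   "http://server-b.example/algorithm/knn")
index.declare_same("http://server-b.example/algorithm/knn",
                   "http://server-c.example/algorithm/nearest-neighbour")

linked = index.same("http://server-a.example/algorithm/kNN",
                    "http://server-c.example/algorithm/nearest-neighbour")
unrelated = index.same("http://server-a.example/algorithm/kNN",
                       "http://server-a.example/algorithm/J48")
```

Note that sameness is transitive here: server A never mentions server
C, yet the two URIs end up in the same group.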

Best regards,
Nina

P.S. Restlet 2.x has support for RDF and various RDF serializations, has
anybody tried these?  Our ambit implementation is still based on pre-2.0
Restlet.
>>> - An ontology implies that we have defined classes (e.g. algorithm
>>> types) and individuals (specific algorithm instances).  An algorithm
>>> type will be an ontology entry (of type class). An algorithm will be
>>> again an ontology entry, but of type "individual".  Thus, what is the
>>> purpose of splitting classes and instances into different resources /
>>> services?
>>>       
>> I think the main benefits are
>>
>> - simplified algorithm_uris
>> - stable algorithm_uris, even if the algorithm ontology is modified
>> 	(I suspect that we will need a lot of reclassifications/modifications/additions of algorithms in the near future and we do not want to modify the algorithm API/URIs every time we modify the ontology)
>>
>> Best regards,
>> Christoph
>> _______________________________________________
>> Development mailing list
>> Development at opentox.org
>> http://www.opentox.org/mailman/listinfo/development
>>