[OTDev] Fwd: alpha testing of MaxTox; API compliance

Nina Jeliazkova nina at acad.bg
Tue Apr 20 21:52:40 CEST 2010


Hello Surajit,  All,

Vedrin Jeliazkov wrote:
> ---------- Forwarded message ----------
> From: Barry Hardy <barry.hardy at douglasconnect.com>
> Date: 12 April 2010 15:47
> Subject: Re: alpha testing of MaxTox; A&A
> To: surajit ray <mr.surajit.ray at gmail.com>
> Cc: sunil chawla <sunil at seascapelearning.com>, Indira Ghosh
> <indirag at mail.jnu.ac.in>, indira ghosh <ighdna at yahoo.com>,
> bhargavpatel62 at gmail.com, vedrin Jeliazkov <vedrin at acad.bg>, Andreas
> Maunz <andreas at maunz.de>
>
>
> Dear Surajit:
>
> - If you are ready with a prediction application, then we should carry
> out an alpha user test, not just Vedrin's performance test.  You can
> request Vedrin for one; probably David would carry out the first test.
>  Also, has Vedrin confirmed with his test whether you are API
> compliant or not?
>
>   
Not entirely compliant yet.  Details below:

    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:ot="http://www.opentox.org/api/1.1#"
        xmlns:owl="http://www.w3.org/2002/07/owl#"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
      <owl:Class rdf:about="http://www.opentox.org/api/1.1#Model"/>
      <ot:Model
    rdf:about="http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/model/3">
        <ot:parameters rdf:parseType="Resource">

There is NO ot:parameters class; instead there is an ot:Parameter class,
and ot:parameters is a property of ot:Model. The correct code therefore
looks like this:

    <ot:Model rdf:ID="Model_1">
        <ot:parameters rdf:resource="#Parameter_2"/>
        <ot:parameters rdf:resource="ou"/>
    </ot:Model>
    <ot:Parameter rdf:ID="Parameter_2">
        <ot:paramScope rdf:datatype="&xsd;string">mandatory</ot:paramScope>
        <ot:paramValue rdf:datatype="&xsd;string">AAAAA</ot:paramValue>
    </ot:Parameter>
    <ot:Parameter rdf:about="ou">
        <ot:paramScope rdf:datatype="&xsd;string">optional</ot:paramScope>
        <ot:paramValue rdf:datatype="&xsd;double">3.14</ot:paramValue>
    </ot:Parameter>

Please consult opentox.owl for properties and classes.
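
A quick structural check of this layout could be sketched in Python using
only the standard library. This is an illustration, not part of any OpenTox
tooling: the helper name dangling_parameters and the sample document are
assumptions, and references are compared as plain strings without URI
resolution.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OT = "http://www.opentox.org/api/1.1#"

# Hypothetical sample document following the corrected layout.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ot="http://www.opentox.org/api/1.1#">
  <ot:Model rdf:ID="Model_1">
    <ot:parameters rdf:resource="#Parameter_2"/>
    <ot:parameters rdf:resource="ou"/>
  </ot:Model>
  <ot:Parameter rdf:ID="Parameter_2">
    <ot:paramScope>mandatory</ot:paramScope>
    <ot:paramValue>AAAAA</ot:paramValue>
  </ot:Parameter>
  <ot:Parameter rdf:about="ou">
    <ot:paramScope>optional</ot:paramScope>
    <ot:paramValue>3.14</ot:paramValue>
  </ot:Parameter>
</rdf:RDF>"""


def dangling_parameters(rdf_xml):
    """Return ot:parameters references that match no ot:Parameter node."""
    root = ET.fromstring(rdf_xml)
    declared = set()
    for param in root.iter(f"{{{OT}}}Parameter"):
        node_id = param.get(f"{{{RDF}}}ID")
        about = param.get(f"{{{RDF}}}about")
        if node_id is not None:
            # A node declared with rdf:ID="x" is referenced as "#x".
            declared.add("#" + node_id)
        if about is not None:
            declared.add(about)
    problems = []
    for prop in root.iter(f"{{{OT}}}parameters"):
        # A property without rdf:resource (for example one written with
        # rdf:parseType="Resource") is reported as None.
        ref = prop.get(f"{{{RDF}}}resource")
        if ref not in declared:
            problems.append(ref)
    return problems


print(dangling_parameters(SAMPLE))
```

An empty list means every ot:parameters property points at an ot:Parameter
node declared in the same document.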


          <ot:paramScope
    rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >mandatory</ot:paramScope>
          <ot:paramValue
    rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >Any URI</ot:paramValue>

The parameters in a model's RDF representation are NOT placeholders;
they should be filled in with the actual URIs and values used to create
the model. Therefore "Any URI" is not appropriate here.

          <dc:identifier
    rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI"
         
    >http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/parameter/compound_uri</dc:identifier>
          <dc:description
    rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >URI of compound whose toxicity needs to be
    predicted.</dc:description>
          <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >compound_uri</dc:title>
        </ot:parameters>



The compound URI is NOT a parameter of the model and should not be
included in the RDF.  The compound_uri (or dataset_uri) is a parameter
of the POST command, e.g.

curl -X POST -d "compound_uri=..." 
http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/model/3

        <ot:parameters rdf:parseType="Resource">
          <dc:description
    rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >TD50  (0 = Active & 1 = NonActive) => Dataset from
    DSSTOX tested for TD50</dc:description>
          <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >ToxicityEndpoint</dc:title>
        </ot:parameters>

The endpoint is also NOT a parameter of the model.  Like compound_uri,
it is a parameter of the POST command, named prediction_feature, e.g.

curl -X POST -d "compound_uri=..."  -d
"prediction_feature=uri-of-theendpoint" 
http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/model/3
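
The same request could be assembled in Python as in the sketch below. The
request is only built, not sent. The helper name build_prediction_request
and the example.org URIs are hypothetical placeholders, since the actual
compound and endpoint URIs are not given here.

```python
from urllib.parse import urlencode
from urllib.request import Request

MODEL_URI = "http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/model/3"


def build_prediction_request(model_uri, compound_uri, prediction_feature=None):
    """Build (but do not send) the form-encoded POST to a model URI."""
    fields = {"compound_uri": compound_uri}
    if prediction_feature is not None:
        fields["prediction_feature"] = prediction_feature
    # urlencode percent-encodes the values, as curl -d expects them to be.
    body = urlencode(fields).encode("ascii")
    return Request(model_uri, data=body, method="POST")


# Placeholder URIs, purely hypothetical.
req = build_prediction_request(
    MODEL_URI,
    compound_uri="http://example.org/compound/1",
    prediction_feature="http://example.org/feature/TD50",
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing the request to urllib.request.urlopen would send it; that step is
left out so the sketch does not depend on the service being reachable.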

        <ot:trainingDataset rdf:parseType="Resource">
          <dc:identifier
    rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI"
         
    >http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/dataset/FT_3</dc:identifier>
          <dc:description
    rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >The dataset which shows all the fragments in the
    dictionary</dc:description>
          <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >fragmentDataset</dc:title>
        </ot:trainingDataset>

ot:trainingDataset is NOT a resource, but a property of ot:Model that
points to an ot:Dataset resource.  Therefore, the RDF for the training
dataset should look like this:

    <ot:Model rdf:ID="Model_1">
...
        <ot:trainingDataset rdf:resource="dataset_url"/>
    </ot:Model>
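
For services that generate this RDF programmatically, the property form
can be produced as in the following Python sketch (standard library only).
The function name model_with_training_dataset and the example.org dataset
URI are hypothetical; only the ot:Model / ot:trainingDataset structure
comes from the text above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OT = "http://www.opentox.org/api/1.1#"

# Keep the usual prefixes in the serialized output.
ET.register_namespace("rdf", RDF)
ET.register_namespace("ot", OT)


def model_with_training_dataset(model_uri, dataset_uri):
    """Serialize an ot:Model whose ot:trainingDataset is a plain reference."""
    root = ET.Element(f"{{{RDF}}}RDF")
    model = ET.SubElement(root, f"{{{OT}}}Model", {f"{{{RDF}}}about": model_uri})
    # The property only carries rdf:resource; the dataset's own metadata
    # (title, description) belongs to the dataset resource, not here.
    ET.SubElement(
        model, f"{{{OT}}}trainingDataset", {f"{{{RDF}}}resource": dataset_uri}
    )
    return ET.tostring(root, encoding="unicode")


xml_text = model_with_training_dataset(
    "http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/model/3",
    "http://example.org/dataset/1",  # hypothetical dataset URI
)
print(xml_text)
```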


        <ot:trainingDataset rdf:parseType="Resource">
          <dc:identifier
    rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI"
         
    >http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/dataset/MT_3</dc:identifier>
          <dc:description
    rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >The dataset which was used to get the fragments in the
    dictionary for this model</dc:description>
          <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
          >dictionaryProducingDataset</dc:title>
        </ot:trainingDataset>

The dictionary-producing dataset seems to belong to the descriptor
calculation service that generates the fragments, rather than being a
training dataset of the model itself.

       
    <ot:isA>http://www.opentox.org/modelTypes.owl#MCSSBasedToxicityPredictor</ot:isA>

There is no ontology http://www.opentox.org/modelTypes.owl, and
therefore no entity
http://www.opentox.org/modelTypes.owl#MCSSBasedToxicityPredictor.

The model should be associated with an algorithm, and the algorithm has
a type from the algorithm types ontology.  That ontology might be
extended to include MaxTox-related entities.

        <dc:identifier
    rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI"
       
    >http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/model/3</dc:identifier>
        <dc:description
    rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
        >TD50 --> Dataset from DSSTOX tested for TD50. *** No of
    compounds in training set : 200 *** Toxic/Non-toxic = 100/100 *** No
    of Dictionary Scaffolds : 3627 *** Training Set Accuracy : 0.925 ***
    Training Set Sensitivity : 0.89 *** Training Set Specificity : 0.96
    *** Test Set Accuracy : 0.83 *** Test Set Sensitivity : 0.76 ***
    Test Set Specificity : 0.84</dc:description>
        <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
        >Model Number : 3</dc:title>
      </ot:Model>
    </rdf:RDF>


Please note that in RDF there are resources (the OWL Classes tab in
Protege) and properties (the Properties tab in Protege).  Resources have
properties; the reverse is not true.

More generally, the confusion in fitting MaxTox into the OpenTox API
seems to be that the core part of MaxTox (the one generating the
fragments) is in fact a descriptor calculation algorithm, not a model.

The MaxTox model itself is just a RandomForest model (with its R
implementation).  If MaxTox is modularized the right way, there should
therefore be a descriptor calculation algorithm that generates the
fragments (similar to FMiner) and a random forest model.  The descriptor
(fragment) calculation algorithm may have the property ot:hasInput
linked to the dictionaryProducingDataset.


Best regards,
Nina

> - The first report related to A&A is due end of May, although that
> will not be the end of it!  We should also have an A&A spec included
> in API 1.2 which will be drafted and commented on through the OpenTox
> website under the API section and should be finalised for September.
> Andreas indicated today he was ready to run tests related to his
> OpenSSO setup.
>
> Barry
>
>
> Am 12.04.2010 11:29, schrieb surajit ray:
>
> Hi Barry,
> Just an update on our progress :-
> We are already API compliant for our prediction workflow (similar to
> ToxPredict). In the future we intend to expose our fingerprint
> generator as a descriptor calculation algorithm for the ToxCreate use
> case.
> For the first workflow continuous testing is ongoing by Vedrin's
> server. As for the other component we will inform Vedrin of our
> readiness as soon as we expose the algorithm.
> Also we would like to know the time frame for the Authentication API
> finalization. Since we have to include that as well in the final case.
> Thanks
> Surajit
>   