[OTDev] Performance testing and monitoring

Sat Apr 17 16:17:27 CEST 2010

Hi Nina,

I was trying to make sense of the changes mentioned here.

On Mon, Mar 22, 2010 at 3:40 PM, Nina Jeliazkova <nina at acad.bg> wrote:

> Hi Surajit,
>
> The main issue is the code has some deviation in representing objects and
> properties, compared to those defined in
> http://opentox.org/api/1.1/opentox.owl .
>
> IMHO, the most convenient way to familiarize oneself with objects and
> relationships is to open opentox.owl with Protege and explore OWLClasses tab
> with properties view.
>
> The list of properties defined for the ot:Model object is in the middle
> panel.
>
>
> surajit ray wrote:
>
> Hi Nina,
> this is the code generating the curl output for the model:
>
>     OntModel jenaModel = createJenaRDFModel();
>     OT.OTClass.Model.createOntClass(jenaModel);
>
>     Individual model = jenaModel.createIndividual(
>             MaxtoxApplicationSettings.getServerRootPath() + "/model/" + model_number,
>             OT.OTClass.Model.getOntClass(jenaModel));
>     model.addLiteral(DC.title, jenaModel.createTypedLiteral(
>             "Model Number : " + model_number, XSDDatatype.XSDstring));
>     model.addLiteral(DC.description, jenaModel.createTypedLiteral(
>             modelDetails.get("description"), XSDDatatype.XSDstring));
>     model.addLiteral(DC.identifier, jenaModel.createTypedLiteral(
>             MaxtoxApplicationSettings.getServerRootPath() + "/model/" + model_number,
>             XSDDatatype.XSDanyURI));
>
>
> Links to other ontologies are to be established for ot:Algorithm, not
> ot:Model, so the statement below is not relevant for ot:Model:
>
>     model.addProperty(OT.isA,
>             "http://www.opentox.org/modelTypes.owl#MCSSBasedToxicityPredictor");
>
I guess your perspective is correct, given the way Ambit is built from the
ground up. But in our case the algorithm is part of the model and not a
separate entity. I could make a REST interface for just a description of the
algorithm, but we don't intend, at this stage, to give separate access to the
algorithm (at least that was not mentioned in the use-case requirements
agreed upon at the project onset). Also, our algorithm is quite multi-layered,
so exposing it piece by piece would be a time-consuming task.
There is also the question of storing the intermediate data that the first
layer of the algorithm generates before the other layers can work on
it.

Right now we can provide access to the model as a whole (as agreed in the
use-case). In that perspective the sub-components need only have
descriptions.

>
>     Individual dictionaryProducingDataset = jenaModel.createIndividual(
>             OT.OTClass.Dataset.getOntClass(jenaModel));
>     dictionaryProducingDataset.addLiteral(DC.title, jenaModel.createTypedLiteral(
>             "dictionaryProducingDataset", XSDDatatype.XSDstring));
>     dictionaryProducingDataset.addLiteral(DC.description, jenaModel.createTypedLiteral(
>             "The dataset which was used to get the fragments in the dictionary for this model",
>             XSDDatatype.XSDstring));
>     dictionaryProducingDataset.addLiteral(DC.identifier, jenaModel.createTypedLiteral(
>             modelDetails.get("dataset_uri"), XSDDatatype.XSDanyURI));
>     model.addProperty(OT.trainingDataset, dictionaryProducingDataset);
>
>     Individual fragmentDataset = jenaModel.createIndividual(
>             OT.OTClass.Dataset.getOntClass(jenaModel));
>     fragmentDataset.addLiteral(DC.title, jenaModel.createTypedLiteral(
>             "fragmentDataset", XSDDatatype.XSDstring));
>     fragmentDataset.addLiteral(DC.description, jenaModel.createTypedLiteral(
>             "The dataset which shows all the fragments in the dictionary",
>             XSDDatatype.XSDstring));
>     fragmentDataset.addLiteral(DC.identifier, jenaModel.createTypedLiteral(
>             modelDetails.get("fragmentset_uri"), XSDDatatype.XSDanyURI));
>     model.addProperty(OT.trainingDataset, fragmentDataset);
>
In our case we wish to expose two kinds of datasets, because the
intermediate data happens to be a descriptor for the molecule. The training
data obviously fits right in with ot:trainingDataset. How about the MCSS
fragment set for the model? Where do you think that fits in?

Also, right now the dataset we are providing as the training dataset is not
one of those available as a service within OpenTox (since at the time we
started we were unaware of such datasets).

> Endpoints are assigned via ot:Feature, not as parameters of ot:Model:
>
>     Individual endpoint = jenaModel.createIndividual(
>             OT.OTClass.Parameter.getOntClass(jenaModel));
>     endpoint.addLiteral(DC.title, jenaModel.createTypedLiteral(
>             "ToxicityEndpoint", XSDDatatype.XSDstring));
>     endpoint.addLiteral(DC.description, jenaModel.createTypedLiteral(
>             modelDetails.get("endpoint"), XSDDatatype.XSDstring));
>     // endpoint.addLiteral(DC.identifier, jenaModel.createTypedLiteral(
>     //         MaxtoxApplicationSettings.getServerRootPath()
>     //                 + "/parameter/moleculeSizeCutoffParameter",
>     //         XSDDatatype.XSDanyURI));
>     model.addProperty(OT.parameters, endpoint);
>
>
Accepted. Does that mean we have to build a REST interface for a feature
as well? Once the other services are up, we intend to use datasets from the
OpenTox interfaces, so in the long run we do not need to host a REST
interface for the features. I'd rather code in the direction of more
utilization of existing interfaces than provide redundant feature REST
interfaces. Once we make a model from an OpenTox dataset (which will take some
research and time), I think the problem will automatically get solved.

> The URI of the compound used for prediction is not a parameter of the model:
>
>     Individual compound_uri = jenaModel.createIndividual(
>             OT.OTClass.Parameter.getOntClass(jenaModel));
>     compound_uri.addLiteral(DC.title, jenaModel.createTypedLiteral(
>             "compound_uri", XSDDatatype.XSDstring));
>     compound_uri.addLiteral(DC.description, jenaModel.createTypedLiteral(
>             "URI of compound whose toxicity needs to be predicted.",
>             XSDDatatype.XSDstring));
>     compound_uri.addLiteral(DC.identifier, jenaModel.createTypedLiteral(
>             MaxtoxApplicationSettings.getServerRootPath() + "/parameter/compound_uri",
>             XSDDatatype.XSDanyURI));
>     compound_uri.addLiteral(OT.paramValue, jenaModel.createTypedLiteral(
>             "Any URI", XSDDatatype.XSDstring));
>     compound_uri.addLiteral(OT.paramScope, jenaModel.createTypedLiteral(
>             "mandatory", XSDDatatype.XSDstring));
>     model.addProperty(OT.parameters, compound_uri);
>
OK. So what is it? Is it ot:hasInput? But per the API spec that seems to be
a dataset. What if it is a compound URI? I guess the API does not take care
of variations on the central theme.

> And finally, the ot:algorithm, ot:independentVariables, ot:dependentVariables,
> and ot:predictedVariables properties are missing.
>
>
Must we fill in ALL the blanks to be compliant? If so, I must say the API is
way too inflexible.
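
To make concrete what filling in those blanks would involve, here is a minimal
sketch (plain Java, no Jena) that prints the link statements a fully described
ot:Model would carry, as N-Triples. It is an illustration under assumptions,
not the reference representation: the namespace http://www.opentox.org/api/1.1#
is assumed from the opentox.owl URL, the class name ModelTriplesSketch and all
example.org URIs are hypothetical, and the property names are used simply as
they are named in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ModelTriplesSketch {
    // Assumed namespace of the OpenTox ontology; verify against opentox.owl.
    static final String OT = "http://www.opentox.org/api/1.1#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    /** Renders one N-Triples statement whose object is a resource (no URI escaping is done). */
    static String triple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .\n";
    }

    /** Emits the rdf:type statement plus one statement per (ot property, target URI) pair. */
    static String render(String modelUri, Map<String, String> links) {
        StringBuilder out = new StringBuilder(triple(modelUri, RDF_TYPE, OT + "Model"));
        for (Map.Entry<String, String> e : links.entrySet()) {
            out.append(triple(modelUri, OT + e.getKey(), e.getValue()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String base = "http://example.org/MaxtoxTest"; // hypothetical server root
        Map<String, String> links = new LinkedHashMap<>();
        links.put("algorithm", base + "/algorithm/mcss");
        links.put("trainingDataset", base + "/dataset/1");
        links.put("independentVariables", base + "/feature/fragments");
        links.put("dependentVariables", base + "/feature/endpoint");
        links.put("predictedVariables", base + "/feature/predicted");
        System.out.print(render(base + "/model/1", links));
    }
}
```

Each property points at a URI, so in this sketch the cost of compliance is
mostly in having something resolvable behind each of those URIs rather than in
the RDF itself.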

>  Could you tell me how to do it correctly ?
>
> Please have a look at
>
>
> https://ambit.svn.sourceforge.net/svnroot/ambit/branches/opentox/opentox-client/src/main/java/org/opentox/rdf/representation/ModelRepresentation.java
>
> The code at
> https://ambit.svn.sourceforge.net/svnroot/ambit/branches/opentox/opentox-client
> depends only on Jena and Restlet and can be directly used in other
> projects.
>
> Best regards,
> Nina
>
>
The links don't work. Could you please give me some alternatives?

And lastly:
We have upgraded the website to have three fully analysed and validated
models. The list of models is at:
http://opentox2.informatik.uni-freiburg.de:8080/MaxtoxTest/model (can be
opened in a browser)

As to the alpha testing mentioned by Barry, I would like to know what would
constitute an alpha test. Since we are predicting a single molecule's
activity with the model (the same as what Vedrin is running as a performance
test), I gather that from our perspective it is pretty much what is already
happening. We could compare the results with other algorithms. That would be
validation/testing.

As to the API, I still believe that it has arisen from a single reference
application (Ambit) and needs to include other possibilities which may not
follow the same patterns. Otherwise the load of "compliance coding" is going
to severely hamper independent developers from joining in, a serious issue
from the sustainability perspective.

Datasets are easier to provide; it is just a question of putting the interface
in place. Models, and the way they are conceptualized within the API, will lead
to huge "redesign sessions" for anyone who simply intends to provide a
prediction model that he/she has built, essentially leaving most amateur
coders out in the cold (since they will not have the bandwidth to make such
huge code changes).

And one final question: is this API going to last the next six months?
If not, compliance could mean a different set of rules in the future.
Normally in the domain of software dev and testing, compliance testing
happens when the API is mature enough (and stable). Our AA system will make
more changes to the API, and this is far from finalized.

Cheers
Surajit