[OTDev] Fwd: Predicted variables and confidence --- was: [OTP] Lazar models

Martin Guetlein martin.guetlein at googlemail.com
Wed May 25 08:55:14 CEST 2011


On Tue, May 24, 2011 at 8:46 PM, Nina Jeliazkova
<jeliazkova.nina at gmail.com> wrote:

> Hi Martin, All,
>
> On 24 May 2011 21:27, Martin Guetlein <martin.guetlein at googlemail.com>
> wrote:
>
> > This should probably have been posted to the development list...
> >
> > ---------- Forwarded message ----------
> > From: Martin Guetlein <martin.guetlein at googlemail.com>
> > Date: Tue, May 24, 2011 at 8:26 PM
> > Subject: Predicted variables and confidence --- was: [OTP] Lazar models
> > To: opentox partners mailing list <partners at opentox.org>,
> > Nina Jeliazkova <jeliazkova.nina at gmail.com>
> > Cc: Christoph Helma <helma at in-silico.ch>
> >
> >
> > Hi all,
> >
> > I just managed to produce the first validation report that uses
> > non-lazar 'confidence' values, with a j48 model from ambit:
> > http://local-ot/validation/report/validation/47
> > (Once again, this is just a proof of concept: it is a training data
> > validation, and the confidence value is the class-probability value coming
> > from WEKA. I asked Nina to add this information to the model predictions
> > some time ago.)
> >
>
> Good to have both services working :)
>

Oops, this is the right link:
http://toxcreate2.in-silico.ch/validation/report/validation/93


>
>
> >
> > Both model services (ambit and lazar) now add the confidence as a separate
> > feature to the prediction dataset, which is nice. I think we should keep it
> > that way.
> >
> > One deviation is that Ambit adds both features (prediction and confidence)
> > to Model#predictedVariables, while IST puts them into
> > PredictionDataset#features. IST does this because we do not have a
> > feature service; features exist only in datasets (which makes A&A
> > easier). I am fine with both solutions, but maybe we should agree on a
> > common way to do it?
> >
> >
> What about combining both solutions? Features could be in the dataset, as
> in IST services, or separate resources, but additionally models would provide
> a list of predicted variables via /model/id/predicted. This way there would
> still be no need for a separate feature service for you.
>

At the moment there is no /model/id/predicted for IST models, and the
prediction feature is created for the prediction dataset. For example, we
apply model http://toxcreate2.in-silico.ch/model/102
to test dataset http://toxcreate2.in-silico.ch/dataset/2200. The model
creates a prediction dataset, http://toxcreate2.in-silico.ch/dataset/2202,
which includes the model prediction feature
http://toxcreate2.in-silico.ch/dataset/2202/feature/prediction/Hamster%20Carcinogenicity/value

@Christoph
Couldn't we just move this predicted feature to the model?
http://toxcreate2.in-silico.ch/model/102/predicted/feature/Hamster%20Carcinogenicity/value
We still would not need a feature service, we could provide /model/id/predicted,
and we would get the other advantages Nina describes below.
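
As a rough sketch of that URI scheme: the helper name is made up, and the
path layout simply mirrors the example URL above. Nothing like this exists
in the IST services yet, so treat it as an illustration only.

```python
from urllib.parse import quote


def predicted_feature_uri(model_uri, endpoint_name):
    """Build a model-scoped URI for a predicted feature.

    Hypothetical helper: the '/predicted/feature/<name>/value' layout is
    copied from the example URL in this mail, not from an agreed API.
    """
    # Percent-encode the endpoint name so that spaces become %20.
    encoded = quote(endpoint_name, safe="")
    return "%s/predicted/feature/%s/value" % (model_uri.rstrip("/"), encoded)
```

Called with the model URI http://toxcreate2.in-silico.ch/model/102 and the
name "Hamster Carcinogenicity", this reproduces the URL given above.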

Martin


> It's quite convenient to know how many and which features are generated by
> a model. We are using these to find out if the predictions are already
> cached or need to be calculated anew. And there will be a straightforward
> way to check if a dataset indeed contains features from a particular model.
> Finally, if <dependentVariable | predictedVariable> owl:sameAs <endpoint>
> is set, then the model will appear under one of the endpoint categories in
> ToxPredict, and not as a model with unknown endpoint, as now.
>
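
The cache check described here could be sketched roughly as follows. This
is not the Ambit implementation; it assumes the model's predicted variables
and the dataset's features have already been fetched as plain lists of URI
strings, and both function names are hypothetical.

```python
def missing_predictions(predicted_variables, dataset_features):
    """Return the predicted-variable URIs that the dataset does not contain.

    Both arguments are assumed to be lists of feature URI strings.
    """
    present = set(dataset_features)
    return [uri for uri in predicted_variables if uri not in present]


def predictions_cached(predicted_variables, dataset_features):
    """True if every predicted variable of the model is already a dataset feature.

    Note: a model without predicted variables counts as 'cached' here.
    """
    return not missing_predictions(predicted_variables, dataset_features)
```

If predictions_cached returns False, the result of missing_predictions says
which features still have to be calculated.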
>
> > The second deviation is what the actual prediction and confidence features
> > look like. To unify this, my proposal would be:
> > * The predicted feature is of type OT:ModelPredictionFeature (subclass of
> >   OT:Feature)
> > * The confidence feature is of type OT:ModelConfidenceFeature (subclass of
> >   OT:Feature)
> > * The confidence feature has a property OT:confidenceOf which points to the
> >   predicted feature (in case a model has more than one prediction feature)
> >
> >
> Agree.
>
> Nina
>
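
To make the three points of the proposal concrete, here is a minimal sketch
that writes them down as RDF triples. The class and property names come from
the proposal above and are not part of the OpenTox ontology yet; the OT
namespace URI is my assumption of the one used by the current API, and the
function names are made up.

```python
# Assumed OpenTox namespace; the three OT terms used below are only proposed.
OT = "http://www.opentox.org/api/1.1#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def prediction_triples(prediction_uri, confidence_uri):
    """Return (subject, predicate, object) URI triples for one prediction
    feature and the confidence feature that belongs to it."""
    return [
        (OT + "ModelPredictionFeature", RDFS_SUBCLASS_OF, OT + "Feature"),
        (OT + "ModelConfidenceFeature", RDFS_SUBCLASS_OF, OT + "Feature"),
        (prediction_uri, RDF_TYPE, OT + "ModelPredictionFeature"),
        (confidence_uri, RDF_TYPE, OT + "ModelConfidenceFeature"),
        # Links the confidence to the prediction it refers to.
        (confidence_uri, OT + "confidenceOf", prediction_uri),
    ]


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples lines."""
    return "\n".join("<%s> <%s> <%s> ." % triple for triple in triples)
```

With more than one prediction feature per model, each confidence feature
would get its own confidenceOf link, which is the case the third point is
meant to cover.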
>
> > Best regards,
> > Martin
> >
> >
> > --
> > Dipl-Inf. Martin Gütlein
> > Phone:
> > +49 (0)761 203 8442 (office)
> > +49 (0)177 623 9499 (mobile)
> > Email:
> > guetlein at informatik.uni-freiburg.de
> > _______________________________________________
> > Development mailing list
> > Development at opentox.org
> > http://www.opentox.org/mailman/listinfo/development
> >



-- 
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)761 203 8442 (office)
+49 (0)177 623 9499 (mobile)
Email:
guetlein at informatik.uni-freiburg.de


