[OTDev] Errors and warnings in datasets

Christoph Helma helma at in-silico.ch
Tue Oct 12 09:20:58 CEST 2010


Excerpts from Nina Jeliazkova's message of Mon Oct 11 19:44:01 +0200 2010:
> On 11 October 2010 19:51, Christoph Helma <helma at in-silico.ch> wrote:
> 
> > Excerpts from Nina Jeliazkova's message of Mon Oct 11 17:27:14 +0200 2010:
> > > Christoph,
> > >
> > > Could you tell what kind of errors (parsing of SMILES?) would you like to
> > > store into metadata?  Is it possible to provide examples?
> >
> > It will be a mixed bag of SMILES errors, duplicated structures,
> > incorrect activity entries, .... Examples can be found e.g. at
> > http://toxcreate.org/models under "Warnings:  show".  Simple string
> > annotation for concatenated error/warning messages could be sufficient.
> >
> 
> Ah, ok - agree such information would be very useful.
> 
> What do you think about storing these somehow linked to the compounds?  As
> a start it could be a simple string annotation, but linked to the compounds
> and not to datasets.  Thus one should be able to point to the exact compound
> where the error is.
> 
> Perhaps introduce "metadata" for compounds as well?
> 

I think parsing errors are dataset related, not properties of
compounds. I am also reporting parsing errors for feature values
and formatting errors, which are not compound related.
Metadata for data entries might be more consistent, but I fear
that it could make things too complicated or slow. Take the display of
dataset summaries, for example: it would require iterating over all
dataset entries just to show the error messages, whereas obtaining them
directly from the dataset metadata would be far more efficient.
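
A minimal sketch of the trade-off described above, assuming a purely
hypothetical in-memory model (the names Dataset, DataEntry, add_warning and
the two summary functions are illustrative only and are not part of any
OpenTox API). Warnings are always appended to the dataset metadata; linking
a warning to a single entry is optional, so formatting errors that belong to
no compound still have a place to go.

```python
from dataclasses import dataclass, field


@dataclass
class DataEntry:
    """Hypothetical data entry: one compound plus its feature values."""
    compound: str                                   # e.g. a SMILES string
    values: dict
    warnings: list = field(default_factory=list)    # optional per-entry notes


@dataclass
class Dataset:
    """Hypothetical dataset with string warnings kept in its metadata."""
    entries: list = field(default_factory=list)
    metadata: dict = field(default_factory=lambda: {"warnings": []})

    def add_warning(self, message, entry=None):
        # Always record the message at dataset level.
        self.metadata["warnings"].append(message)
        # Additionally link it to an entry when the error has one.
        if entry is not None:
            entry.warnings.append(message)


def summary_from_metadata(ds):
    # One lookup, independent of the number of entries.
    return "\n".join(ds.metadata["warnings"])


def summary_from_entries(ds):
    # Has to visit every entry, and misses errors not tied to an entry.
    return "\n".join(w for e in ds.entries for w in e.warnings)


ds = Dataset()
bad = DataEntry(compound="C1CC", values={"activity": "1"})
ds.entries.append(bad)
ds.add_warning("Cannot parse SMILES 'C1CC'", entry=bad)
ds.add_warning("Header has 2 columns, but a data row has 3")  # no compound
```

With this layout a dataset summary only needs summary_from_metadata(ds),
while the per-entry link is still available for pointing at the exact
compound.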

Best regards,
Christoph


