[OTDev] Dataset features in Ontology

Christoph Helma helma at in-silico.ch
Fri May 6 12:44:38 CEST 2011


And here is the thread on the qualitative rating of datasets (e.g. for
selection in ToxCreate)

> Christoph, All,
> 
> 
> On 31 March 2011 14:12, Nina Jeliazkova <jeliazkova.nina at gmail.com> wrote:
> 
> >
> >
> > On 31 March 2011 13:48, Christoph Helma <helma at in-silico.ch> wrote:
> >
> >>
> >> > > Thanks, this works as advertised! But how can I decide which datasets
> >> > > are ready for production use (e.g. to select the four relevant datasets
> >> > >
> >> > >    dc:title "Benchmark Data Set for in Silico Prediction of Ames Mutagenicity" ;
> >> > >    dc:title "Bursi mutagenicity dataset.sdf" ;
> >> > >    dc:title "CPDBAS: Carcinogenic Potency Database Summary Tables - All Species" ;
> >> > >    dc:title "ISSCAN: Istituto Superiore di Sanita, CHEMICAL CARCINOGENS: STRUCTURES AND EXPERIMENTAL DATA" ;
> >> > >
> >> > > from 247 mutagenicity datasets), if I do not know the titles in advance?
> >> > >
> >> >
> >> > There is currently no metadata to handle this. Let's agree on some RDF
> >> > property to denote "production use", and we'll include it in
> >> > /dataset/id/metadata (read and update), as was recently done for
> >> > licenses.
> >>
> >> That would be *very* useful, also for models (or even algorithms - to
> >> distinguish between algorithms in development and stable/mature
> >> algorithms).
> >
> >
> > Agreed. This could even be applicable to chemical structures.
> >
> >
> >> Do you know of existing properties that can be used, or do we
> >> need to invent one?
> >>
> >
> >
> I've been told to look into:
> 
> 1) Vocabulary Status Ontology
> 
> http://www.w3.org/2003/06/sw-vocab-status/ns
> 
> http://www.w3.org/2003/06/sw-vocab-status/note
> 
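> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix vs:   <http://www.w3.org/2003/06/sw-vocab-status/ns#> .
> 
> vs:term_status
>     a    rdf:Property;
>     rdfs:comment
>         "the status of a vocabulary term, expressed as a short symbolic string; known values include 'unstable','testing', 'stable' and 'archaic'";
>     rdfs:label
>        "term status";
>     vs:term_status
>        "unstable".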
> 
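As a rough illustration of how such a status could be attached to a dataset, here is a minimal sketch that renders a single N-Triples statement using vs:term_status. The dataset URI (`http://example.org/dataset/1`) and the helper name `status_triple` are hypothetical and not part of any OpenTox service; only the property URI comes from the vocabulary namespace above.

```python
# Property URI from the Vocabulary Status namespace quoted above.
VS_TERM_STATUS = "http://www.w3.org/2003/06/sw-vocab-status/ns#term_status"


def status_triple(resource_uri: str, status: str) -> str:
    """Render one N-Triples statement: <resource> vs:term_status "status" ."""
    # Escape backslashes and double quotes so the literal stays well-formed.
    escaped = status.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{resource_uri}> <{VS_TERM_STATUS}> "{escaped}" .'


# Hypothetical dataset URI, used only for illustration.
print(status_triple("http://example.org/dataset/1", "stable"))
```

The same statement could presumably be stored alongside the other dataset metadata, but how the service would accept it is not specified in this thread.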
> 
> 2) DCMI terms ontology
> 
> http://dublincore.org/documents/dcmi-terms/
> 
> 3)  DOAP (for software - algorithms in our case)
> http://trac.usefulinc.com/doap
> 
> 
> So far the first option looks like the simplest, and it is applicable in all the
> cases we need. The DCMI terms ontology has a lot of properties, but I am not sure
> which are applicable. A DOAP description of the software-related resources would
> be nice to have, but may need to be adapted to datasets.
> 
> For a quick solution, I would suggest vs:term_status from 1). Do you think
> the suggested status values ('unstable', 'testing', 'stable' and 'archaic')
> are sufficient, or do we need to extend the list? The values themselves are
> not formally defined in the ontology, only listed in a free-text comment.
> 
> Best regards,
> Nina
> 
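Because the four status values are only listed in a free-text comment, a client would have to check them itself. The following is a minimal sketch of that check and of picking out production-ready datasets, assuming that 'stable' is taken to mean "ready for production use" (an interpretation, not something the vocabulary defines). The dataset URIs, their statuses and the function names are hypothetical.

```python
# The four values named in the vs:term_status comment.
KNOWN_STATUSES = ("unstable", "testing", "stable", "archaic")


def check_status(value: str) -> str:
    """Return the value unchanged, or raise if it is not a known status."""
    if value not in KNOWN_STATUSES:
        raise ValueError(f"unknown term status: {value!r}")
    return value


def production_ready(statuses: dict) -> list:
    """Return the URIs whose status is 'stable', in insertion order."""
    return [uri for uri, status in statuses.items()
            if check_status(status) == "stable"]


# Hypothetical metadata: dataset URI -> vs:term_status value.
example = {
    "http://example.org/dataset/1": "stable",
    "http://example.org/dataset/2": "testing",
    "http://example.org/dataset/3": "stable",
}
print(production_ready(example))
```

Rejecting unknown values outright is one possible design choice; if the list of statuses is extended later, only `KNOWN_STATUSES` would need to change.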
> 
> 
> 
> >> Best regards,
> >> Christoph
> >>
> >
> >
> 


