[OTDev] Dataset features in Ontology

Nina Jeliazkova jeliazkova.nina at gmail.com
Sat Apr 2 08:13:46 CEST 2011


Christoph, All,


On 31 March 2011 14:12, Nina Jeliazkova <jeliazkova.nina at gmail.com> wrote:

>
>
> On 31 March 2011 13:48, Christoph Helma <helma at in-silico.ch> wrote:
>
>>
>> > > Thanks, this works as advertised! But how can I decide, which datasets
>> > > are ready for production use (e.g. to select the four relevant datasets
>> > >
>> > >    dc:title "Benchmark Data Set for in Silico Prediction of Ames Mutagenicity" ;
>> > >    dc:title "Bursi mutagenicity dataset.sdf" ;
>> > >    dc:title "CPDBAS: Carcinogenic Potency Database Summary Tables - All Species" ;
>> > >    dc:title "ISSCAN: Istituto Superiore di Sanita, CHEMICAL CARCINOGENS: STRUCTURES AND EXPERIMENTAL DATA" ;
>> > >
>> > > from 247 mutagenicity datasets), if I do not know titles in advance?
>> > >
>> >
>> > There is currently no metadata to handle this. Let's agree on some RDF
>> > property to denote "production use", and we'll include it in
>> > /dataset/id/metadata (read and update), as was recently done for licenses.
>>
>> That would be *very* useful, also for models (or even algorithms - to
>> distinguish between algorithms in development and stable/mature
>> algorithms).
>
>
> Agree.  Could be applicable even for chemical structures.
>
>
>> Do you know existing properties that can be used or do we
>> need to invent one?
>>
>
>
I've been told to look into:

1) Vocabulary Status ontology

http://www.w3.org/2003/06/sw-vocab-status/ns

http://www.w3.org/2003/06/sw-vocab-status/note

vs:term_status
    a    rdf:Property;
    rdfs:comment
        "the status of a vocabulary term, expressed as a short symbolic string; known values include 'unstable','testing', 'stable' and 'archaic'";
    rdfs:label
       "term status";
    vs:term_status
       "unstable".


2) DCMI terms ontology

http://dublincore.org/documents/dcmi-terms/

3)  DOAP (for software - algorithms in our case)
http://trac.usefulinc.com/doap


So far the first option looks like the simplest, and it is applicable in all
the cases we need. The DCMI terms ontology has a lot of properties, but I am
not sure which are applicable. A DOAP description of the software-related
resources would be nice to have, but may need to be adapted to datasets.

For a quick solution, I would suggest vs:term_status from 1). Do you think
the suggested status values ('unstable', 'testing', 'stable' and 'archaic')
are sufficient, or do we need to extend the list? The values themselves are
not formally defined in the ontology, only listed in a free-text comment.
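
Since the status values are only listed in a comment, a client would have to
check them itself. Below is a minimal sketch of how selecting the "stable"
datasets could look, assuming the metadata is available as plain
(subject, predicate, object) triples. The function name select_by_status and
the example.org dataset URIs are hypothetical and only for illustration; this
is not an existing service API.

```python
# Property URI: the vocab-status namespace linked above, plus '#term_status'.
VS_TERM_STATUS = "http://www.w3.org/2003/06/sw-vocab-status/ns#term_status"

# Values that appear only in the free-text comment of vs:term_status.
KNOWN_STATUSES = {"unstable", "testing", "stable", "archaic"}


def select_by_status(triples, wanted="stable"):
    """Return the sorted subjects whose vs:term_status equals `wanted`."""
    if wanted not in KNOWN_STATUSES:
        raise ValueError(f"unknown status: {wanted!r}")
    return sorted(
        s for s, p, o in triples if p == VS_TERM_STATUS and o == wanted
    )


# Hypothetical dataset URIs and statuses, for illustration only.
triples = [
    ("http://example.org/dataset/1", VS_TERM_STATUS, "stable"),
    ("http://example.org/dataset/2", VS_TERM_STATUS, "testing"),
    ("http://example.org/dataset/3", VS_TERM_STATUS, "stable"),
]

print(select_by_status(triples))
```

If we extend the list of values, only the KNOWN_STATUSES set in such a client
would need to change.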

Best regards,
Nina




>> Best regards,
>> Christoph
>>
>
>


