[OTDev] Java Examples for Dataset Creation/statistics/validation
Tobias Girschick tobias.girschick at in.tum.de
Wed Dec 9 16:26:35 CET 2009
- Previous message: [OTDev] Java Examples for Dataset Creation/statistics/validation
- Next message: [OTDev] Java Examples for Dataset Creation/statistics/validation - questions
Hi Martin,

On Wed, 2009-12-09 at 15:24 +0200, Nina Jeliazkova wrote:
> Martin Guetlein wrote:
> > Hi Nina, All,
> >
> > On Wed, Dec 9, 2009 at 8:26 AM, Nina Jeliazkova <nina at acad.bg> wrote:
> >
> >> Hi Pantelis,
> >>
> >> chung wrote:
> >>
> >>> Hi Nina,
> >>> At http://www.opentox.org/data/documents/development/RDF%20files/AlgorithmTypes/view?searchterm=Algorithm%20Types%20ontology
> >>> (the ontology for all algorithm types we use in OT), all algorithm types
> >>> appear to be Resources, not Literals. However, in
> >>> http://www.opentox.org/data/documents/development/RDF%20files/JavaOnly/JenaExamples ,
> >>> the object which answers the question:
> >>>
> >>> <http://myservice.com/algorithm/id>
> >>> <http://www.opentox.org/api/1.1#isA> ?obj
> >>>
> >>> is a literal. The corresponding triple is:
> >>>
> >>> Subject:
> >>> http://opentox.ntua.gr:3000/algorithm/mlr
> >>> Predicate:
> >>> http://www.opentox.org/api/1.1#isA
> >>> Object:
> >>> "http://www.opentox.org/algorithmTypes.owl#RegressionEagerSingleTarget"
> >>>
> >>> Is this correct? If yes, should we use literals in that case or
> >>> resources?
> >>>
> >> Should be resources, not literals (the Range of isA property is a
> >> resource). I'll update the example ASAP.
> >>
> >>> The same holds for the supported statistics. The Java code snippet
> >>> produces an RDF which includes the triple:
> >>>
> >>> http://opentox.ntua.gr:3000/algorithm/mlr
> >>> http://www.opentox.org/api/1.1#statisticsSupported
> >>> "statistics-1"^^http://www.w3.org/2001/XMLSchema#string
> >>>
> >>> Shouldn't the object
> >>> ("statistics-1"^^http://www.w3.org/2001/XMLSchema#string ) be a Resource
> >>> instead of a string? Were it a Resource, one could assign properties to
> >>> it; for example, one could declare its type etc.
> >>>
> >> In the current opentox.owl supported statistics are simply literals
> >> (just to follow the old XML spec), but I agree it will be better if
> >> "statisticsSupported" are indeed resources.
> >> On another note, statisticsSupported are closely related to the
> >> Validation service, which already has defined several statistics,
> >> specific to classification and regression models. I am not sure how/if
> >> Validation service (or another client or service) uses the information
> >> from "statisticsSupported" field, but it would be good if this
> >> information is somehow exploited.
> >>
> >> Tobias, Martin, what do you think?
> >>
> >
> > I don't quite understand what the statisticsSupported flag is about.
> > In the example on the overview page
> > (http://www.opentox.org/data/documents/development/RDF%20files/Overview)
> > the svm algorithm supports all the regression statistics listed in the
> > validation object so far. Would it not be enough to state that it is a
> > regression algorithm (then the RegressionStatistics object in the
> > validation result will be set)?
> > Or do I misinterpret the functionality? If so, could you give an example?
> >
> I guess TUM/NTUA could answer better.

I think that at the moment all regression algorithms support all the
statistics. But this might not always be the case. Some algorithms might,
for example, only supply an RMSE and no other quality measure.

> > This leads to another question regarding validation. AFAIK there is no
> > regression/classification flag in prediction models(?). That's why I'm
> >
> This is supposed to be handled via AlgorithmTypes ontology
> http://opentox.org/data/documents/development/RDF%20files/AlgorithmTypes/view
>
> and each Algorithm is supposed to declare a link to that ontology via
> ot:isA or owl:sameAs property.
>
> > planning to distinguish between regression and classification via data
> > type of the prediction feature (numerical -> regression, else
> > classification). Do you think that's sufficient?
> >
> This might not be sufficient to handle e.g. Toxtree or clustering
> algorithms.
>
> Best regards,
> Nina
> > Best regards,
> > Martin
> >
> >> Best regards,
> >> Nina
> >>
> >>> These phenomena do not appear in the RDF representation of a dataset
> >>> where most elements are Resources instead of Literals.
> >>>
> >>> On Tue, 2009-12-08 at 13:45 +0200, Nina Jeliazkova wrote:
> >>>
> >>>> Hi Pantelis,
> >>>>
> >>>> In principle yes (the Class of the resource should be defined and this
> >>>> is done via RDFType), but there is already in the example
> >>>>
> >>>> OT.OTClass.Dataset.createOntClass(jenaModel);
> >>>>
> >>>> which does the same, if jenaModel is OntModel.
> >>>>
> >>> When I add this piece of code, the following triple is additionally
> >>> included in the representation:
> >>>
> >>> * http://sth.com/dataset/1
> >>> * http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> >>> * http://www.opentox.org/api/1.1#Dataset
> >>>
> >>> which is absent if
> >>> dataset.addRDFType(OT.OTClass.Dataset.createProperty(datasetModel)); is
> >>> not included. Otherwise the only triple present is this:
> >>>
> >>> * http://www.opentox.org/api/1.1#Dataset
> >>> * http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> >>> * http://www.w3.org/2002/07/owl#Class
> >>>
> >>> which simply implies that Dataset is of type Class (this doesn't provide
> >>> information about the dataset itself, as an instance, but only for the
> >>> resource http://www.opentox.org/api/1.1#Dataset ). So this way, we have
> >>> not defined that http://sth.com/dataset/1 is of type
> >>> http://www.opentox.org/api/1.1#Dataset which in turn is a Class.
> >>>
> >>> Best Regards,
> >>> Pantelis
> >>>
> >>>> Regards,
> >>>> Nina
> >>>> chung wrote:
> >>>>
> >>>>> Hi Nina,
> >>>>> I think we have to include the following line in the code for the
> >>>>> creation of an RDF representation for datasets:
> >>>>>
> >>>>> dataset.addRDFType(OT.OTClass.Dataset.createProperty(datasetModel));
> >>>>>
> >>>>> This declares that the Resource under consideration is a Dataset.
> >>>>>
> >>>>> P.S. Thanks for the snippets!
> >>>>>
> >>>>> Best regards,
> >>>>> Pantelis
> >>>>>
> >>> _______________________________________________
> >>> Development mailing list
> >>> Development at opentox.org
> >>> http://www.opentox.org/mailman/listinfo/development

--
Dipl.-Bioinf. Tobias Girschick

Technische Universität München
Institut für Informatik
Lehrstuhl I12 - Bioinformatik
Boltzmannstr. 3
85748 Garching b. München, Germany

Room: MI 01.09.042
Phone: +49 (89) 289-18002
Email: tobias.girschick at in.tum.de
Web: http://wwwkramer.in.tum.de/girschick
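
The literal-versus-resource question raised in the thread can be seen in serialized form with a minimal sketch. It deliberately uses plain Java rather than Jena, so the class and method names (TripleSketch, withResourceObject, withLiteralObject) are hypothetical and not part of any OpenTox or Jena API; only the URIs are taken from the messages above. The output follows N-Triples syntax, where a resource object is written in angle brackets and a literal in double quotes.

```java
/** Hypothetical helper for illustration only; not part of Jena or the OpenTox code base. */
final class TripleSketch {

    // URIs quoted in the thread.
    static final String ALGORITHM = "http://opentox.ntua.gr:3000/algorithm/mlr";
    static final String OT_IS_A = "http://www.opentox.org/api/1.1#isA";
    static final String TYPE_URI =
            "http://www.opentox.org/algorithmTypes.owl#RegressionEagerSingleTarget";

    /** Object is a URI reference (a Resource): further triples can use it as a subject. */
    static String withResourceObject(String subject, String predicate, String objectUri) {
        return "<" + subject + "> <" + predicate + "> <" + objectUri + "> .";
    }

    /** Object is a plain literal: just a string value, even if it happens to look like a URI. */
    static String withLiteralObject(String subject, String predicate, String text) {
        // Escape backslashes and double quotes as N-Triples requires.
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "<" + subject + "> <" + predicate + "> \"" + escaped + "\" .";
    }

    public static void main(String[] args) {
        // What the thread agrees the triple should be (Resource object).
        System.out.println(withResourceObject(ALGORITHM, OT_IS_A, TYPE_URI));
        // What the original example produced (Literal object).
        System.out.println(withLiteralObject(ALGORITHM, OT_IS_A, TYPE_URI));
    }
}
```

The two printed lines differ only in how the object is delimited, but an RDF consumer treats them differently: only the first links the algorithm to the node defined in the AlgorithmTypes ontology.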