[OTDev] Validation: classification statistics for non-binary class values
Martin Guetlein martin.guetlein at googlemail.com
Wed Dec 9 13:45:09 CET 2009
Hi Nina,

On Wed, Dec 9, 2009 at 12:38 PM, Nina Jeliazkova <nina at acad.bg> wrote:
> Hi Martin,
>
> Martin Guetlein wrote:
>
> Hi Nina, All,
>
> Very good point. Here is how it could look:
>
> [[
> default:confusionmatrix
>     a ot:ConfusionMatrix ;
>
>     # contains numClassValues**2 entries like the following
>
>     ot:confusionMatrixValue
>     [
>         a ot:ConfusionMatrixValue ;
>         dc:value "25"^^xsd:int ;
>         ot:confusionMatrixCoordinates
>         [
>             a ot:ConfusionMatrixCoordinate ;
>             dc:predictedValue "active"^^xsd:string ;
>             dc:actualValue "moderately_active"^^xsd:string ;
>         ]
>     ]
>     ...
> ]]
>
> I think we will end up with quite a lot of classes in our opentox.owl.
>
>
> Having a large number of classes should be fine, provided we are not
> replicating things under different names.
>
> Here is an idea of how we can reuse some of the existing classes:
>
> #This is a cell in a confusion matrix
> <owl:Class rdf:ID="ConfusionMatrixCell">
>   <rdfs:subClassOf rdf:resource="#OpentoxResource"/>
> </owl:Class>
>
> #the cell is linked to the Feature and the actual value via FeatureValue class
> <owl:ObjectProperty rdf:ID="confusionMatrixActual">
>   <rdf:type rdf:resource="&owl;FunctionalProperty"/>
>   <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
>   <rdfs:range rdf:resource="#FeatureValue"/>
> </owl:ObjectProperty>
>
> #the cell is linked to the Feature and the predicted value via FeatureValue class
> <owl:ObjectProperty rdf:ID="confusionMatrixPredicted">
>   <rdf:type rdf:resource="&owl;FunctionalProperty"/>
>   <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
>   <rdfs:range rdf:resource="#FeatureValue"/>
> </owl:ObjectProperty>
>
> #and the numeric value itself
> <owl:DatatypeProperty rdf:ID="confusionMatrixValue">
>   <rdf:type rdf:resource="&owl;FunctionalProperty"/>
>   <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
>   <rdfs:range rdf:resource="&xsd;int"/>
> </owl:DatatypeProperty>
>
> #the above is to be added in opentox.owl
>
> #instances elsewhere (generated by services)
>
> <ConfusionMatrixCell rdf:ID="ConfusionMatrixCell_7">
>   <confusionMatrixActual rdf:resource="#FeatureValue_8"/>
>   <confusionMatrixPredicted rdf:resource="#FeatureValue_9"/>
>   <confusionMatrixValue rdf:datatype="&xsd;int">25</confusionMatrixValue>
> </ConfusionMatrixCell>
>
> <FeatureValue rdf:ID="FeatureValue_8">
>   <feature rdf:resource="#Feature_10"/>
>   <value rdf:datatype="&xsd;string">active</value>
> </FeatureValue>
> <FeatureValue rdf:ID="FeatureValue_9">
>   <feature rdf:resource="#Feature_10"/>
>   <value rdf:datatype="&xsd;string">moderate</value>
> </FeatureValue>
>
> (as a side effect, visualising the confusion matrix with relevant links for
> predicted/actual will be straightforward :)
>
> #and using ConfusionMatrixCell to denote a ConfusionMatrix as in your
> proposal.
>
> Any comments?
>

Looks good, linking to feature values really makes sense. I will try
to integrate this into the opentox ontology today.

Best regards,
Martin

> Best regards,
> Nina
>
> Best Regards,
> Martin
>
>
>
> On Tue, Dec 8, 2009 at 12:57 PM, Nina Jeliazkova <nina at acad.bg> wrote:
>
>
> Hi Martin,
>
> Do we have a confusion matrix somewhere in the classification statistics?
> It provides more information than just true positives.
>
> Best regards,
> Nina
>
>
> Martin Guetlein wrote:
>
>
> Hello All,
>
> as Harry noted in one of the last meetings, the classification
> statistics in the validation object only take binary classification
> into account so far. There can of course be more than two class values
> (e.g. inactive, moderately-active, active).
> Hence, some classification results (e.g. numTruePositives) are now
> available multiple times (once for each class value).
>
> As collections are not allowed in OWL-DL, I had to create
> intermediate classes (following the scheme Nina proposed for the
> dataset). Here is what an example of the Classification Statistics
> object may look like:
>
> [[
> default:thisClassificationStatistics
>     a ot:classificationStatistics ;
>
>     ot:accuracy "99.0"^^xsd:float ;  # accuracy is only available once
>     ot:numberUnclassified "26"^^xsd:int ;
>     ...
>
>     ot:classStatisticEntry
>     [   a ot:classStatisticEntry ;
>         ot:classValue "moderately_active"^^xsd:string ;
>         ot:classStatisticValue
>         [   a ot:ClassStatisticValue ;
>             ot:classStatistic default:areaUnderRocCurve ;
>             ot:value "0.77"^^xsd:float ;
>         ] ;
>         ot:classStatisticValue
>         [   a ot:ClassStatisticValue ;
>             ot:classStatistic default:numTruePositives ;
>             ot:value "123"^^xsd:int ;
>         ] ;
>         ot:classStatisticValue
>         ...
>
>     ot:classStatisticEntry
>     [   a ot:classStatisticEntry ;
>         ot:classValue "inactive"^^xsd:string ;
>         ...
>
>     ot:classStatisticEntry
>     [   a ot:classStatisticEntry ;
>         ot:classValue "active"^^xsd:string ;
>         ...
> ]]
>
> Here is the old classification statistics object (I renamed it from
> ClassifcationInformation to ClassificationStatistics):
> http://www.opentox.org/data/documents/development/RDF%20files/Validation/#-ot-classificationinfo-rdf
>
> Any comments, corrections before I add that to the opentox.owl?
>
> Best regards,
> Martin
>
>
>
>
>
> _______________________________________________
> Development mailing list
> Development at opentox.org
> http://www.opentox.org/mailman/listinfo/development
>
>
>
>

--
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)761 203 8442 (office)
+49 (0)177 623 9499 (mobile)
Email:
guetlein at informatik.uni-freiburg.de
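
The cell-based confusion matrix discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not the OpenTox implementation: the function names `confusion_cells` and `per_class_counts` are hypothetical, and of the result keys only `numTruePositives` appears in the thread, while the other three are assumed by analogy. Each cell is an (actual, predicted, count) triple, so a problem with numClassValues class values yields numClassValues**2 cells, and the per-class statistics can be derived from the cells alone.

```python
from collections import Counter
from itertools import product


def confusion_cells(actual, predicted, class_values):
    """Return one (actual, predicted, count) cell per ordered pair of class values.

    Pairs that never occur still get a cell with count 0, so the result
    always has len(class_values) ** 2 entries.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter(zip(actual, predicted))
    return [(a, p, counts[(a, p)]) for a, p in product(class_values, repeat=2)]


def per_class_counts(cells, class_value):
    """Derive one-vs-rest counts for a single class value from the cells.

    Only 'numTruePositives' is a name taken from the thread; the other
    keys are assumed by analogy.
    """
    result = {
        "numTruePositives": 0,
        "numFalsePositives": 0,
        "numFalseNegatives": 0,
        "numTrueNegatives": 0,
    }
    for a, p, n in cells:
        if a == class_value and p == class_value:
            result["numTruePositives"] += n
        elif p == class_value:
            result["numFalsePositives"] += n
        elif a == class_value:
            result["numFalseNegatives"] += n
        else:
            result["numTrueNegatives"] += n
    return result


# Made-up example data with the three class values mentioned in the thread.
class_values = ["inactive", "moderately_active", "active"]
actual = ["active", "active", "moderately_active",
          "inactive", "moderately_active", "inactive"]
predicted = ["active", "moderately_active", "moderately_active",
             "inactive", "active", "inactive"]

cells = confusion_cells(actual, predicted, class_values)
for class_value in class_values:
    print(class_value, per_class_counts(cells, class_value))
```

Keeping one record per cell mirrors the ConfusionMatrixCell idea: a service would emit one resource per triple, and a client can recompute any per-class statistic without needing a collection type.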