[OTDev] Validation: classification statistics for non-binary class values
Nina Jeliazkova nina at acad.bg
Wed Dec 9 12:38:59 CET 2009
- Previous message: [OTDev] Validation: classification statistics for non-binary class values
- Next message: [OTDev] Validation: classification statistics for non-binary class values
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Martin,

Martin Guetlein wrote:
> Hi Nina, All,
>
> Very good point. Here is how it could look:
>
> [[
> default:confusionmatrix
>     a ot:ConfusionMatrix ;
>
>     # contains numClassValues**2 entries like the following
>
>     ot:confusionMatrixValue
>     [
>         a ot:ConfusionMatrixValue ;
>         dc:value "25"^^xsd:int ;
>         ot:confusionMatrixCoordinates
>         [
>             a ot:ConfusionMatrixCoordinate ;
>             dc:predictedValue "active"^^xsd:string ;
>             dc:actualValue "moderately_active"^^xsd:string ;
>         ]
>     ]
>     ...
> ]]
>
> I think we will end up with quite a lot of classes in our opentox.owl.
>

Having a large number of classes should be fine, provided we are not
replicating things under different names.

Here is an idea of how we can reuse some of the existing classes:

#This is a cell in a confusion matrix
<owl:Class rdf:ID="ConfusionMatrixCell">
  <rdfs:subClassOf rdf:resource="#OpentoxResource"/>
</owl:Class>

#the cell is linked to the Feature and the actual value via FeatureValue class
<owl:ObjectProperty rdf:ID="confusionMatrixActual">
  <rdf:type rdf:resource="&owl;FunctionalProperty"/>
  <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
  <rdfs:range rdf:resource="#FeatureValue"/>
</owl:ObjectProperty>

#the cell is linked to the Feature and the predicted value via FeatureValue class
<owl:ObjectProperty rdf:ID="confusionMatrixPredicted">
  <rdf:type rdf:resource="&owl;FunctionalProperty"/>
  <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
  <rdfs:range rdf:resource="#FeatureValue"/>
</owl:ObjectProperty>

#and the numeric value itself
<owl:DatatypeProperty rdf:ID="confusionMatrixValue">
  <rdf:type rdf:resource="&owl;FunctionalProperty"/>
  <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
  <rdfs:range rdf:resource="&xsd;int"/>
</owl:DatatypeProperty>

#the above is to be added in opentox.owl

#instances elsewhere (generated by services)
<ConfusionMatrixCell rdf:ID="ConfusionMatrixCell_7">
  <confusionMatrixActual rdf:resource="#FeatureValue_8"/>
  <confusionMatrixPredicted rdf:resource="#FeatureValue_9"/>
  <confusionMatrixValue rdf:datatype="&xsd;int">25</confusionMatrixValue>
</ConfusionMatrixCell>

<FeatureValue rdf:ID="FeatureValue_8">
  <feature rdf:resource="#Feature_10"/>
  <value rdf:datatype="&xsd;string">active</value>
</FeatureValue>

<FeatureValue rdf:ID="FeatureValue_9">
  <feature rdf:resource="#Feature_10"/>
  <value rdf:datatype="&xsd;string">moderate</value>
</FeatureValue>

(as a side effect, visualising the confusion matrix with relevant links for
predicted/actual will be straightforward :)

#and using ConfusionMatrixCell to denote a ConfusionMatrix as in your proposal.

Any comments?

Best regards,
Nina

> Best Regards,
> Martin
>
> On Tue, Dec 8, 2009 at 12:57 PM, Nina Jeliazkova <nina at acad.bg> wrote:
>
>> Hi Martin,
>>
>> Do we have a confusion matrix somewhere in the classification statistics?
>> It provides more information than just true positives.
>>
>> Best regards,
>> Nina
>>
>> Martin Guetlein wrote:
>>
>>> Hello All,
>>>
>>> as Harry noted in one of the last meetings, the classification
>>> statistics in the validation object only take binary classification
>>> into account so far. There can of course be more than two class values
>>> (e.g. inactive, moderately-active, active).
>>> Hence, some classification results (e.g. numTruePositives) are now
>>> available multiple times (once for each class value).
>>>
>>> As collections are not allowed in OWL-DL, I had to create
>>> intermediate classes (following the scheme Nina proposed for the
>>> dataset). Here is how an example of the Classification Statistics
>>> Object may look:
>>>
>>> [[
>>> default:thisClassificationStatistics
>>>     a ot:classificationStatistics ;
>>>
>>>     ot:accuracy "99.0"^^xsd:float ; # accuracy is only available once
>>>     ot:numberUnclassified "26"^^xsd:int ;
>>>     ...
>>>
>>>     ot:classStatisticEntry
>>>     [ a ot:classStatisticEntry ;
>>>       ot:classValue "moderately_active"^^xsd:string ;
>>>       ot:classStatisticValue
>>>       [ a ot:ClassStatisticValue ;
>>>         ot:classStatistic default:areaUnderRocCurve ;
>>>         ot:value "0.77"^^xsd:float ;
>>>       ] ;
>>>       ot:classStatisticValue
>>>       [ a ot:ClassStatisticValue ;
>>>         ot:classStatistic default:numTruePositives ;
>>>         ot:value "123"^^xsd:int ;
>>>       ] ;
>>>       ot:classStatisticValue
>>>       ...
>>>
>>>     ot:classStatisticEntry
>>>     [ a ot:classStatisticEntry ;
>>>       ot:classValue "inactive"^^xsd:string ;
>>>       ...
>>>
>>>     ot:classStatisticEntry
>>>     [ a ot:classStatisticEntry ;
>>>       ot:classValue "active"^^xsd:string ;
>>>       ...
>>> ]]
>>>
>>> Here is the old classification statistics object (I renamed it from
>>> ClassifcationInformation to ClassificationStatistics):
>>> http://www.opentox.org/data/documents/development/RDF%20files/Validation/#-ot-classificationinfo-rdf
>>>
>>> Any comments, corrections before I add that to the opentox.owl?
>>>
>>> Best regards,
>>> Martin
>>>
>> _______________________________________________
>> Development mailing list
>> Development at opentox.org
>> http://www.opentox.org/mailman/listinfo/development
>>
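
To illustrate the point made in the thread that a confusion matrix carries more information than per-class true positives, here is a minimal Python sketch. It assumes each ConfusionMatrixCell has already been read into an (actual, predicted, count) tuple. The function names and all counts are hypothetical, except the single "actual active, predicted moderate, 25" cell, which mirrors the instance example in the message. It is not part of opentox.owl or of any OpenTox service.

```python
from collections import defaultdict

# Hypothetical cells as (actual, predicted, count). Only the
# ("active", "moderate", 25) cell mirrors the example in the thread;
# the remaining counts are made up for illustration.
cells = [
    ("active", "active", 40),
    ("active", "moderate", 25),
    ("moderate", "moderate", 30),
    ("moderate", "active", 5),
    ("inactive", "inactive", 50),
    ("inactive", "moderate", 10),
]


def build_matrix(cells):
    """Collect cells into a dict keyed by (actual, predicted)."""
    matrix = defaultdict(int)
    for actual, predicted, count in cells:
        matrix[(actual, predicted)] += count
    return dict(matrix)


def per_class_counts(matrix):
    """Derive true positives, false positives and false negatives per class value."""
    labels = sorted({label for pair in matrix for label in pair})
    stats = {}
    for label in labels:
        tp = matrix.get((label, label), 0)
        fp = sum(n for (a, p), n in matrix.items() if p == label and a != label)
        fn = sum(n for (a, p), n in matrix.items() if a == label and p != label)
        stats[label] = {"tp": tp, "fp": fp, "fn": fn}
    return stats


def accuracy(matrix):
    """Fraction of instances on the diagonal (predicted == actual)."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total if total else 0.0


matrix = build_matrix(cells)
stats = per_class_counts(matrix)
```

With one cell per (actual, predicted) pair, the per-class counts and the overall accuracy can be recomputed from the matrix, whereas the matrix cannot be recovered from the per-class counts alone.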