[OTDev] Validation: classification statistics for non-binary class values

Martin Guetlein martin.guetlein at googlemail.com
Thu Dec 10 12:41:45 CET 2009


Hello All,

I changed the ClassificationStatistics class (formerly
ClassificationInfo) to take non-binary classification tasks into
account.
You can find an updated version of the opentox.owl on the webpage:
http://www.opentox.org/data/documents/development/RDF%20files/OpenToxOntology/view
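
For anyone who wants to see what "non-binary" means for the numbers
themselves, here is a minimal Python sketch. It is not part of
opentox.owl or of any OpenTox service. The function names and all
statistics other than numTruePositives (numFalsePositives,
numFalseNegatives, numTrueNegatives) are assumptions made up for
illustration. Each class value is treated one-vs-rest, so every
per-class statistic exists once per class value, while accuracy exists
only once.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs; pairs that never occur count as 0."""
    return Counter(zip(actual, predicted))


def per_class_statistics(actual, predicted):
    """Return (accuracy, stats), where stats maps each class value to its
    one-vs-rest counts. The key names are illustrative only."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    counts = confusion_counts(actual, predicted)
    class_values = sorted(set(actual) | set(predicted))
    total = len(actual)
    stats = {}
    for c in class_values:
        tp = counts[(c, c)]
        # predicted as c although the actual value is something else
        fp = sum(counts[(a, c)] for a in class_values if a != c)
        # actual value is c but something else was predicted
        fn = sum(counts[(c, p)] for p in class_values if p != c)
        stats[c] = {
            "numTruePositives": tp,
            "numFalsePositives": fp,
            "numFalseNegatives": fn,
            "numTrueNegatives": total - tp - fp - fn,
        }
    # accuracy is a single number for the whole classification
    accuracy = sum(counts[(c, c)] for c in class_values) / total
    return accuracy, stats


if __name__ == "__main__":
    actual = ["active", "active", "inactive",
              "moderately_active", "inactive", "active"]
    predicted = ["active", "moderately_active", "inactive",
                 "moderately_active", "active", "active"]
    accuracy, stats = per_class_statistics(actual, predicted)
    print(accuracy)
    for class_value, values in stats.items():
        print(class_value, values)
```

Accuracy is returned as a fraction in this sketch.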

Best regards,
Martin


On Wed, Dec 9, 2009 at 1:45 PM, Martin Guetlein
<martin.guetlein at googlemail.com> wrote:
> Hi Nina,
>
> On Wed, Dec 9, 2009 at 12:38 PM, Nina Jeliazkova <nina at acad.bg> wrote:
>> Hi Martin,
>>
>> Martin Guetlein wrote:
>>
>> Hi Nina, All,
>>
>> Very good point. Here is how it could look:
>>
>> [[
>> default:confusionmatrix
>>   a ot:ConfusionMatrix ;
>>
>>   # contains numClassValues**2 entries like the following
>>
>>   ot:confusionMatrixValue
>>   [
>>     a ot:ConfusionMatrixValue ;
>>     dc:value "25"^^xsd:int ;
>>     ot:confusionMatrixCoordinates
>>     [
>>       a ot:ConfusionMatrixCoordinate ;
>>       dc:predictedValue "active"^^xsd:string ;
>>       dc:actualValue "moderately_active"^^xsd:string ;
>>     ]
>>   ]
>>   ...
>> ]]
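
A rough illustration of the numClassValues**2 comment in the draft
above, as a minimal Python sketch. The function name and the dictionary
keys are made up for this example (the keys only mirror actualValue,
predictedValue and value from the draft), and nothing here is OpenTox
API. It emits one cell for every (actual, predicted) combination,
including combinations that never occur.

```python
from itertools import product


def confusion_matrix_cells(class_values, actual, predicted):
    """Return one cell per (actual, predicted) pair of class values,
    so len(class_values) ** 2 cells in total, zero counts included."""
    counts = {pair: 0 for pair in product(class_values, repeat=2)}
    for a, p in zip(actual, predicted):
        # raises KeyError if a label is not listed in class_values
        counts[(a, p)] += 1
    return [
        {"actualValue": a, "predictedValue": p, "value": n}
        for (a, p), n in counts.items()
    ]


if __name__ == "__main__":
    class_values = ["inactive", "moderately_active", "active"]
    actual = ["active", "moderately_active", "inactive", "active"]
    predicted = ["active", "active", "inactive", "active"]
    for cell in confusion_matrix_cells(class_values, actual, predicted):
        print(cell)
```

With three class values this gives nine cells, and the cell values sum
to the number of classified instances.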
>>
>> I think we will end up with quite a lot of classes in our opentox.owl.
>>
>>
>> Having a large number of classes should be fine, provided we are not
>> replicating things under different names.
>>
>> Here is an idea of how we can reuse some of the existing classes:
>>
>> #This is a cell in a confusion matrix
>>     <owl:Class rdf:ID="ConfusionMatrixCell">
>>         <rdfs:subClassOf rdf:resource="#OpentoxResource"/>
>>     </owl:Class>
>> #the cell is linked to the Feature and the actual value via the FeatureValue class
>>     <owl:ObjectProperty rdf:ID="confusionMatrixActual">
>>         <rdf:type rdf:resource="&owl;FunctionalProperty"/>
>>         <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
>>         <rdfs:range rdf:resource="#FeatureValue"/>
>>     </owl:ObjectProperty>
>>
>> #the cell is linked to the Feature and the predicted value via the FeatureValue class
>>     <owl:ObjectProperty rdf:ID="confusionMatrixPredicted">
>>         <rdf:type rdf:resource="&owl;FunctionalProperty"/>
>>         <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
>>         <rdfs:range rdf:resource="#FeatureValue"/>
>>     </owl:ObjectProperty>
>> #and the numeric value itself
>>     <owl:DatatypeProperty rdf:ID="confusionMatrixValue">
>>         <rdf:type rdf:resource="&owl;FunctionalProperty"/>
>>         <rdfs:domain rdf:resource="#ConfusionMatrixCell"/>
>>         <rdfs:range rdf:resource="&xsd;int"/>
>>     </owl:DatatypeProperty>
>>
>> #the above is to be added to opentox.owl
>>
>> #instances elsewhere (generated by services)
>>     <ConfusionMatrixCell rdf:ID="ConfusionMatrixCell_7">
>>         <confusionMatrixActual rdf:resource="#FeatureValue_8"/>
>>         <confusionMatrixPredicted rdf:resource="#FeatureValue_9"/>
>>         <confusionMatrixValue rdf:datatype="&xsd;int">25</confusionMatrixValue>
>>     </ConfusionMatrixCell>
>>
>>     <FeatureValue rdf:ID="FeatureValue_8">
>>         <feature rdf:resource="#Feature_10"/>
>>         <value rdf:datatype="&xsd;string">active</value>
>>     </FeatureValue>
>>     <FeatureValue rdf:ID="FeatureValue_9">
>>         <feature rdf:resource="#Feature_10"/>
>>         <value rdf:datatype="&xsd;string">moderate</value>
>>     </FeatureValue>
>>
>> (As a side effect, visualising a confusion matrix with relevant links for
>> predicted/actual values will be straightforward :)
>>
>> #and using ConfusionMatrixCell to denote a ConfusionMatrix as in your
>> proposal.
>>
>> Any comments?
>>
>
> Looks good, linking to feature values really makes sense.
> I will try to integrate this into the opentox ontology today.
>
> Best regards,
> Martin
>
>
>> Best regards,
>> Nina
>>
>> Best Regards,
>> Martin
>>
>>
>>
>> On Tue, Dec 8, 2009 at 12:57 PM, Nina Jeliazkova <nina at acad.bg> wrote:
>>
>>
>> Hi Martin,
>>
>> Do we have a confusion matrix somewhere in the classification statistics?
>> It provides more information than just true positives.
>>
>> Best regards,
>> Nina
>>
>>
>> Martin Guetlein wrote:
>>
>>
>> Hello All,
>>
>> As Harry noted in one of the last meetings, the classification
>> statistics in the validation object only take binary classification
>> into account so far. There can of course be more than two class values
>> (e.g. inactive, moderately-active, active).
>> Hence, some classification results (e.g. numTruePositives) are now
>> available multiple times (once for each class value).
>>
>> As collections are not allowed in OWL-DL, I had to create
>> intermediate classes (following the scheme Nina proposed for the
>> dataset). Here is what an example of the ClassificationStatistics
>> object might look like:
>>
>> [[
>> default:thisClassificationStatistics
>>   a ot:ClassificationStatistics ;
>>
>>   ot:accuracy "99.0"^^xsd:float ; # accuracy is only available once
>>   ot:numberUnclassified "26"^^xsd:int ;
>>   ...
>>
>>   ot:classStatisticEntry
>>     [ a ot:ClassStatisticEntry ;
>>       ot:classValue "moderately_active"^^xsd:string ;
>>       ot:classStatisticValue
>>         [ a ot:ClassStatisticValue ;
>>           ot:classStatistic default:areaUnderRocCurve ;
>>           ot:value "0.77"^^xsd:float ;
>>         ] ;
>>       ot:classStatisticValue
>>         [ a ot:ClassStatisticValue ;
>>           ot:classStatistic default:numTruePositives ;
>>           ot:value "123"^^xsd:int ;
>>         ] ;
>>       ot:classStatisticValue
>>       ...
>>
>>   ot:classStatisticEntry
>>     [ a ot:ClassStatisticEntry ;
>>       ot:classValue "inactive"^^xsd:string ;
>>       ...
>>
>>   ot:classStatisticEntry
>>     [ a ot:ClassStatisticEntry ;
>>       ot:classValue "active"^^xsd:string ;
>>       ...
>> ]]
>>
>> Here is the old classification statistics object (I renamed it from
>> ClassificationInformation to ClassificationStatistics):
>> http://www.opentox.org/data/documents/development/RDF%20files/Validation/#-ot-classificationinfo-rdf
>>
>> Any comments or corrections before I add this to opentox.owl?
>>
>> Best regards,
>> Martin
>
>
>
> --
> Dipl-Inf. Martin Gütlein
> Phone:
> +49 (0)761 203 8442 (office)
> +49 (0)177 623 9499 (mobile)
> Email:
> guetlein at informatik.uni-freiburg.de
>





