[OTDev] On confidences

Barry Hardy barry.hardy at douglasconnect.com
Thu Jun 2 08:51:59 CEST 2011


(I am expanding this comms exchange on lazar model confidences with 
Christoph as it would probably benefit from further discussion on what 
our general framework and approach to confidences and communications of 
confidences should be)

I read the material Andreas Maunz sent over just before the AXLR8 meeting,
and felt afterwards that I had at least a reasonable (although
superficial) understanding of the maths behind your generalisation
to significance-weighted Tanimotos, the smoothing of nearest-neighbour
similarities, and the added Gaussian smoothing exponential. However,
taking the next steps needs further work and interaction:
a) How can we understand the above maths (and others) more clearly and
deeply? An interactive discussion along the path "developer -
communicator - user" is probably needed. Otherwise I worry that
converting the maths into a simplified explanation in English may
result in incorrect statements. Any such explanation will at least
need a review.
b) Even then it is hard to understand the values in practice, so we need
several examples with several models to get a better feel for the
meaning of the numbers.
c) The ToxCreate help says that "For most models confidence > 0.025 is a
sensible (hard) cutoff to distinguish between reliable and unreliable
predictions." You can tell people that, but the first reaction to
hearing that a prediction with a confidence of 0.026 is reliable is
confusion; people often assume the opposite. So redefining the index
(even 1-x?) would help first-time comprehension. Could we even attach a
classification to the index (strongly confident ... very unconfident)
that users could understand more easily?
d) Then we also have to prepare help explaining the maths and concepts
in a way that is easy to understand (probably leaving out the maths).
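
To make the classification idea in c) a bit more concrete, here is a
minimal sketch of how a raw confidence value could be mapped to a verbal
category. Only the 0.025 cutoff comes from the ToxCreate help text; the
other cutoffs, the labels, and the function name confidence_label are
hypothetical placeholders for discussion, not anything lazar currently
implements.

```python
from bisect import bisect_left

# Only 0.025 is taken from the ToxCreate help ("confidence > 0.025").
# The remaining cutoffs and all labels are hypothetical placeholders.
CUTOFFS = [0.025, 0.1, 0.3]
LABELS = [
    "unreliable",
    "low confidence",
    "moderate confidence",
    "high confidence",
]


def confidence_label(confidence: float) -> str:
    """Map a raw confidence value to a verbal category.

    bisect_left counts the cutoffs strictly below the value, so a value
    must be strictly greater than a cutoff to move up a category. This
    matches the "> 0.025" wording of the help text.
    """
    return LABELS[bisect_left(CUTOFFS, confidence)]


if __name__ == "__main__":
    for value in (0.01, 0.025, 0.026, 0.08, 0.5):
        print(f"{value:.3f} -> {confidence_label(value)}")
```

With a mapping like this, a user would see a label next to 0.026 rather
than having to know that small numbers can still count as reliable.
Sensible cutoffs above 0.025 would have to come from the developers and
from the examples suggested in b).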

Another issue is that different models using different methods to
communicate confidences in predictions will also be difficult for users
to grasp. Could a classification approach on diverse confidences somehow
"normalise meanings" for users?

Barry

>> - Also, the developers will have to communicate with the "tutorial
>> developer", and perhaps even react. For example, try to explain to
>> someone what a confidence of 0.08 for a lazar model prediction means. I
>> am not even sure yet from the material how I would redefine it to be
>> somewhat intuitive.
> The last point is a good example of why developers should not write
> tutorials. I have explained lazar confidence ~1000 times (and an
> explanation pops up if you click on the word in ToxCreate), so I simply
> do not notice if a proper definition is missing in a tutorial. For this
> reason I need someone who is less involved to spot such problems. I can
> of course try to give an explanation (e.g. as in ToxCreate), but I cannot
> judge if it is understandable for someone with a different background.
>
> Mapping lazar confidences to something more intuitive (i.e. real
> probabilities) is possible, and it is on my list, but we just have not
> had the time to implement it.
>
> Best regards,
> Christoph
>
>



