[OTDev] [Fwd: Re: Feature Generation Algorithms: Avoiding duplicates]

Nina Jeliazkova nina at acad.bg
Tue Jan 19 15:58:24 CET 2010


Tobias Girschick wrote:
> Hi Nina,
>
>   
>> It might help if you try to define your descriptors in a way similar 
>> to BO ontology.
>>     
>
> We have thought about that. But I am not sure that this makes sense or
> is possible. At least if I consider e.g. this as one descriptor:
>
> C(C) (minSup: 0.7, dataset: http://somedataset, hasSource/algo: FTM)
>
> How should I describe this in an ontology? What I can do is use
> information of some of the parameters (e.g. path not tree) for
>   

If extending the BO ontology, a new descriptor will just be an individual
of the MolecularDescriptor class.  If this is not sufficient, you might
subclass MolecularDescriptor and
define additional properties for the new class (requires dataset, has
parameters, etc.).
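
A rough sketch of the two options as plain subject/predicate/object
triples, using Python tuples rather than any RDF library. Only
MolecularDescriptor and the C(C) example come from this thread; the bo:
and ex: prefixes, ex:SubstructureDescriptor, ex:requiresDataset and
ex:hasParameter are hypothetical names made up for illustration.

```python
# Hypothetical names throughout, except MolecularDescriptor itself.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS = "rdfs:subClassOf"

# Option 1: the new descriptor is just an individual of MolecularDescriptor.
simple = [
    ("ex:descriptor_CC", RDF_TYPE, "bo:MolecularDescriptor"),
]

# Option 2: subclass MolecularDescriptor and attach extra properties
# (dataset and parameters) to individuals of the new class.
extended = [
    ("ex:SubstructureDescriptor", RDFS_SUBCLASS, "bo:MolecularDescriptor"),
    ("ex:descriptor_CC", RDF_TYPE, "ex:SubstructureDescriptor"),
    ("ex:descriptor_CC", "ex:requiresDataset", "http://somedataset"),
    ("ex:descriptor_CC", "ex:hasParameter", "minSup=0.7"),
]


def types_of(subject, triples):
    """Return the classes a subject is directly typed with."""
    return [o for s, p, o in triples if s == subject and p == RDF_TYPE]


def properties_of(subject, triples):
    """Return (predicate, object) pairs other than rdf:type for a subject."""
    return [(p, o) for s, p, o in triples if s == subject and p != RDF_TYPE]
```

With option 1 the descriptor carries no information beyond its class;
with option 2, properties_of("ex:descriptor_CC", extended) also exposes
the dataset and the minSup value.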
> categorization. But if I am right a single descriptor is to be
> understood as a unique mapping, a function that takes the molecule and
> maps it to a real, int or boolean value. For e.g. physico-chemical
> descriptors, the owl:sameAs relation gives a definition of the function,
> right? 
>   
As I have read recently, owl:sameAs is interpreted differently by
different people, and the best way to define some type hierarchy is to
use class inheritance via rdf:type.

> Clearly we need to define a way to store parameters for (some) features
> and if I remember your last email to Fabian and the last meeting right,
> you agree on that. The question is how? 
> I still don't like the idea of declaring this type of feature as some
> kind of model, although from a modelling point of view it seems the same
> or very similar. But from a semantic point of view these are two totally
> different things. What do you think about extending the Feature instead
> of the Model? We could have simple Features (same as at the moment) and
> ComplexFeatures that have an ot:Algorithm with ot:parameters and an
> ot:dataset?
>
>
>   
We already have complex features, which are features with values other
than scalar (have a look at the latest opentox.owl).
That is a slightly different semantics from describing how the features
are obtained. The latter was meant to be described via the hasSource
property.

I would prefer to define an additional resource, separate from ot:Feature,
and link it to the feature via ot:hasSource.  That way we keep features
generic and can link them into other ontologies. (Features represent
experimental as well as calculated values so far.)

Let's introduce
ot:ParameterizedDescriptor, which can be a subclass of algorithm and
will have exactly the same properties as Model has (e.g. dataset,
parameters and algorithm), but will not be a model.
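
A minimal sketch of how this could look for the C(C) example above,
again as plain triples in Python. ot:Feature, ot:hasSource and
ot:ParameterizedDescriptor are the names used in this thread; the
property names ot:algorithm, ot:dataset and ot:parameters follow the
wording above but are assumptions about the final ontology, and the
ex: URIs and dc:title are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterizedDescriptor:
    """Separate resource describing how a feature was generated (not a model)."""
    uri: str
    algorithm: str
    dataset: str
    parameters: tuple  # tuple of (name, value) pairs

    def triples(self):
        out = [
            (self.uri, "rdf:type", "ot:ParameterizedDescriptor"),
            (self.uri, "ot:algorithm", self.algorithm),  # assumed property name
            (self.uri, "ot:dataset", self.dataset),      # assumed property name
        ]
        for name, value in self.parameters:
            out.append((self.uri, "ot:parameters", f"{name}={value}"))
        return out


def feature_triples(feature_uri, title, source):
    """The feature stays generic and only points at its source."""
    return [
        (feature_uri, "rdf:type", "ot:Feature"),
        (feature_uri, "dc:title", title),  # placeholder for a label
        (feature_uri, "ot:hasSource", source.uri),
    ]


ftm = ParameterizedDescriptor(
    uri="ex:descriptor/1",
    algorithm="ex:algorithm/FTM",
    dataset="http://somedataset",
    parameters=(("minSup", 0.7),),
)
graph = ftm.triples() + feature_triples("ex:feature/1", "C(C)", ftm)
```

The feature itself carries no algorithm, dataset or parameter
properties; everything specific to the generation step hangs off the
resource it points to via ot:hasSource.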

Does this make sense?

Regards,
Nina




>>  BTW, it seems you are not using owl:sameAs in RDF description of
>> features, or at least they do not appear in the database. Can we
>> verify? It might be parsing error from my side as well.
>>     
>
> No we are not using them up to now, so it's no parsing error ;) We were
> not sure what to put there. In the CDK and JOELib2 case we will have to
> do a by-hand mapping of the descriptors to the BO ontology and use this
> (or extend it), right? In the FTM or gSpan case, problems see above...
>
> Best regards,
> Tobias
>
>
>
>   



