[OTDev] Descriptor Calculation Services
Nina Jeliazkova nina at acad.bg
Tue Jan 12 18:23:57 CET 2010
Christoph Helma wrote:
> Excerpts from Tobias Girschick's message of Mon Jan 11 10:05:23 +0100 2010:
>
>> Hi Pantelis, All,
>>
>> On Thu, 2010-01-07 at 18:49 +0200, chung wrote:
>>
>>> Hi Tobias, All,
>>> While trying to train a model, the service is possible to "find" some
>>> missing values for a specific feature.
>>>
>> To obviate misunderstandings: You want to train a model with a data set
>> that contains missing values for a specific feature and the service
>> detects the missing features before training, right?
>>
>>> Is there a way to use your
>>> services to obtain the missing value?
>>>
>> If the feature with the missing values was produced from our descriptor
>> calculation service, yes. But you would have to build a dataset with all
>> the compounds where the value is missing and submit it to the descriptor
>> calculation service.
>> The question is, if a model training service should automatically
>> provide the functionality of "filling up" missing values. I think this
>> is something that should be done in the preprocessing phase - in a
>> preprocessing/data cleaning service.
>>
>
> I would be extremely careful with the addition of missing features for
> several reasons:
>
> - Sometimes there are good physical/chemical/biological/algorithmic reasons why
>   features are missing - calculating these features might give
>   you a number but it is very likely that it is meaningless.
>
Agree.

> - A sameAs relationship does not guarantee, that (calculated and
>   measured) feature values are comparable (very frequently they are
>   not).
>
Right, this is the reason for having ot:hasSource for features: it allows
identifying exactly which descriptor calculation service was used.

> - Even if you find a measured value for the same feature, there is a
>   good chance, that it has been obtained by a different protocol and
>   that it is not comparable with the other feature values.
>
Agree.

> I would suggest to add features only
>
> - if you have a clear understanding, why a feature is missing
> - if you can prove that the feature calculation algorithm creates values
>   that are comparable with the original measurements (or calculation
>   algorithm)
> - if you clearly document how and why the original dataset has been
>   modified
>
A user interface supporting the above (e.g. allowing the user to document
why something is modified) would be relevant for both Fastox and Toxmodel.

Best regards,
Nina

> Best regards,
> Christoph
> _______________________________________________
> Development mailing list
> Development at opentox.org
> http://www.opentox.org/mailman/listinfo/development
>
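
The workflow discussed in this thread (collect the compounds with a missing value, obtain values from a descriptor calculation service, and fill them in only when the source matches and the change is documented) could be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `Dataset` layout, the `FillRecord` type and the `compounds_missing` and `fill_missing` helpers are hypothetical names, not part of the OpenTox API, and the actual call to a descriptor calculation service is left out (its results are passed in as `computed`).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical in-memory dataset: compound URI -> {feature URI -> value or None}
Dataset = Dict[str, Dict[str, Optional[float]]]


@dataclass
class FillRecord:
    """Documents one filled-in value so the modification stays traceable."""
    compound: str
    feature: str
    value: float
    source: str  # service that produced the value (cf. ot:hasSource)
    reason: str  # why filling in this value is considered justified


def compounds_missing(dataset: Dataset, feature: str) -> List[str]:
    """Return the compounds that have no value for the given feature."""
    return [c for c, values in dataset.items() if values.get(feature) is None]


def fill_missing(
    dataset: Dataset,
    feature: str,
    computed: Dict[str, float],       # values returned by the service (assumed)
    source: str,                      # service that produced `computed`
    feature_sources: Dict[str, str],  # feature -> source of its existing values
    reason: str,
) -> List[FillRecord]:
    """Fill missing values in place and return a log of what was changed."""
    # Require a documented reason for modifying the dataset.
    if not reason.strip():
        raise ValueError("a documented reason is required")
    # Refuse values from a different source: they may not be comparable.
    if feature_sources.get(feature) != source:
        raise ValueError("computed values come from a different source")
    log: List[FillRecord] = []
    for compound in compounds_missing(dataset, feature):
        if compound in computed:
            dataset[compound][feature] = computed[compound]
            log.append(
                FillRecord(compound, feature, computed[compound], source, reason)
            )
    return log
```

The source check is a deliberately strict stand-in for "comparable values"; a real preprocessing service would probably need a richer notion of comparability than string equality of the source.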