[OTDev] TUM open questions
Christoph Helma helma at in-silico.de
Fri Dec 4 10:21:30 CET 2009
- Previous message: [OTDev] TUM open questions
- Next message: [OTDev] TUM open questions
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Excerpts from Tobias Girschick's message of Fri Dec 04 09:46:19 +0100 2009:

> Dear All,
>
> in our yesterday's meeting some questions/unresolved issues came up. To
> make it easier to discuss them later in the meeting I will give a short
> overview:
>
> (1) Could one of you (maybe Nina or Christoph) shortly repeat the
> rationale behind the DataEntry in the RDF? (Will there be an API
> "access"?)

Nina has explained her rationale in previous posts - I am not sure if I
understand all of her arguments correctly.

> (2) About the API: Is there (will there be) a Feature API? (The current
> state "obsolete with RDF" contains a lot of stuff from version 1.0, e.g.
> feature_definitions.)

I do not think that we need a separate service for feature values, as
these can be written as literals in the RDF, which is served through the
dataset service. We do need a service to look up features
(feature_definitions in API 1.0) - this should be done through an
ontology service. Well-established features are covered e.g. in Blue
Obelisk, but we need a mechanism for new developments. This can be done
either through the ot: ontology or by the algorithms themselves; fminer,
e.g., will provide metadata for its features.

> (3) Don't we need a (REST) API to query the ontology?

Yes.

> There is currently
> no way to access the ontology via REST services. E.g. how do I (or the
> GUI) get all the Algorithms (their URIs) for calculating
> physico-chemical descriptors? We lost this functionality in the
> 1.0->1.1 transition.
>
> (4) We propose to reintroduce one level of hierarchy to the algorithm
> API to make a clearer statement about the input and output of a POST
> to /algorithm possible. We prefer to distinguish algorithms that learn
> a model from algorithms that merely alter a dataset (adding or
> selecting descriptors, ...).

I do not think that we have to expose that through the API. Just ask the
algorithm service for an RDF with metadata about your algorithms and you
can make very flexible queries.
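To illustrate what I mean by filtering on algorithm metadata instead of a URI hierarchy, here is a minimal sketch. The algorithm URIs and the ot:algorithmType values are my own illustrative assumptions, not the actual OpenTox vocabulary:

```python
# Sketch: categorize algorithms via their RDF metadata rather than via
# extra levels in the /algorithm URI hierarchy.
# All URIs and type names below are hypothetical examples.

# Metadata as it might be extracted from the RDF served by an
# algorithm service:
algorithm_metadata = {
    "http://example.org/algorithm/joelib": {"ot:algorithmType": "DescriptorCalculation"},
    "http://example.org/algorithm/fminer": {"ot:algorithmType": "DescriptorCalculation"},
    "http://example.org/algorithm/lazar":  {"ot:algorithmType": "Learning"},
}

def algorithms_of_type(metadata, algorithm_type):
    """Return the URIs of all algorithms whose metadata matches the given type."""
    return sorted(uri for uri, meta in metadata.items()
                  if meta.get("ot:algorithmType") == algorithm_type)

# E.g. a GUI asking for all descriptor-calculation algorithms:
print(algorithms_of_type(algorithm_metadata, "DescriptorCalculation"))
```

The same query mechanism covers the learning-vs-dataset-altering distinction from (4) without changing the API surface.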
> (5) At the moment we see the workflow of predicting (applying a model)
> like this:
> 1 - POST /model/3 dataset/1 (dataset/1 may not have all the necessary
>     descriptors needed to apply the model)
> 2 - ModelWS checks which descriptors need to be calculated
> 3 - POST /algorithm/<calcDesc> dataset/1 -> dataset/1
> 4 - calculate predictions for dataset/1 based on model/3
> 5 - POST/PUT dataset/1
>
> This is fine. But in case we want to use the same test dataset
> (dataset/1) with several models (e.g. same algo but different
> parameters) we will have to recalculate the missing descriptors every
> time. Could we add a method/algorithm/service that transfers the
> features/descriptors from one (training) dataset to another (test)
> dataset to avoid this? Does this make sense?

I use the following workflow:

POST /descriptor_calculation training_dataset   # creates feature_dataset
POST /algorithm training_dataset feature_dataset  # creates model
POST /model compound_uri                        # creates prediction

or

POST /model prediction_dataset                  # creates dataset with predictions

This is fairly straightforward and allows you to reuse/exchange
descriptors.

> (6) Regarding the AlgorithmTypes.owl: Could you explain why
> ClassificationEagerSingleTarget, ... are Individuals and not an
> instantiation of it, like WekaJ48? Furthermore we feel that it would be
> better called Multiple, not Many, but this is a minor thing.

Nina?

Best regards,
Christoph
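Step 2 of the workflow above (ModelWS checking which descriptors still need to be calculated) is essentially a set difference between the features a model requires and those already in the test dataset. A minimal sketch, with feature URIs that are purely illustrative assumptions:

```python
# Sketch of step 2 in the prediction workflow from (5): determine which
# descriptors the descriptor-calculation service still has to provide.
# The feature URIs below are hypothetical examples.

# Features the model was trained on:
model_features = {
    "http://example.org/feature/logP",
    "http://example.org/feature/MW",
    "http://example.org/feature/TPSA",
}

# Features already present in the test dataset:
dataset_features = {
    "http://example.org/feature/MW",
}

def missing_features(needed, present):
    """Return the features that still have to be calculated for the dataset."""
    return sorted(needed - present)

print(missing_features(model_features, dataset_features))
```

If the calculated feature values are written back to the dataset (or kept in a reusable feature dataset, as in my workflow), repeated predictions with different models only ever trigger calculation of the features that are genuinely missing.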