[OTDev] TUM open questions

Martin Guetlein martin.guetlein at googlemail.com
Fri Dec 4 16:25:05 CET 2009


Hello All,

On Fri, Dec 4, 2009 at 3:09 PM, Christoph Helma <helma at in-silico.de> wrote:
> Excerpts from Nina Jeliazkova's message of Fri Dec 04 12:39:56 +0100 2009:
>> > I use the following workflow:
>> >
>> > POST /descriptor_calculation training_dataset                 # creates feature_dataset
>> > POST /algorithm              training_dataset feature_dataset # creates model
>> > POST /model                  compound_uri                     # creates prediction
>> > or
>> > POST /model                  prediction_dataset               # creates dataset with predictions
>> >
>> > This is fairly straightforward and allows you to reuse/exchange descriptors.
>> >
>> Yes, but straightforward implementation duplicates information
>> (training/feature datasets are not very much different).
>
> No, training and feature datasets are disjoint in my case. This allows
> me e.g. to quickly create lazar models with different types of
> descriptors and compare the results with other algorithms.


The parameters for building a prediction model (what to predict, which
features to use) have to be determined for the validation as well.
I made a proposal for what the curl call for validating an algorithm could
look like (see http://www.opentox.org/data/documents/development/validation/validation-and-reporting-overview-and-data-flow).
An excerpt:

  curl -X POST -d algorithm_uri="<algorithm_service>/algorithm/<algorithm_id>" \
               -d training_dataset_uri="<dataset_service>/dataset/<train_dataset_id>" \
               -d test_dataset_uri="<dataset_service>/dataset/<test_dataset_id>" \
               -d prediction_feature="<prediction_feature>" \
               -d algorithm_params="<alg_param_key_1>=<alg_param_val1>;<alg_param_key_2>=<alg_param_val2>" \
               <validation_service>/validation

(The algorithm_params parameter is optional.)

-> validation-internal api call to build model:

  curl -X POST -d dataset_uri="<dataset_service>/dataset/<train_dataset_id>" \
               -d prediction_feature="<prediction_feature>" \
               -d <alg_param_key_1>="<alg_param_val1>" \
               -d <alg_param_key_2>="<alg_param_val2>" \
               <algorithm_service>/algorithm/<algorithm_id>
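
To make the mapping between the two calls explicit, here is a minimal
Python sketch of how the validation service might translate its own
request into the internal model-building request. This is only an
illustration of the proposal, not an existing implementation: the
function names, the example.org URIs and the parameter names
(min_frequency, kernel) are made up, and the error handling is an
assumption on my side.

```python
from urllib.parse import urlencode


def parse_algorithm_params(raw):
    """Split 'key1=val1;key2=val2' into a dict; missing or empty input gives {}."""
    params = {}
    if not raw:
        return params
    for pair in raw.split(";"):
        if not pair:
            continue  # tolerate a trailing ';'
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed algorithm parameter: {pair!r}")
        params[key] = value
    return params


def build_model_request(validation_form):
    """Map the validation request fields to the internal algorithm request.

    Returns the target algorithm URI and the form fields to POST to it.
    """
    algorithm_form = {
        "dataset_uri": validation_form["training_dataset_uri"],
        "prediction_feature": validation_form["prediction_feature"],
    }
    extra = parse_algorithm_params(validation_form.get("algorithm_params"))
    # Assumption: algorithm parameters must not shadow the fixed fields.
    clashes = set(extra) & set(algorithm_form)
    if clashes:
        raise ValueError(f"reserved parameter names used: {sorted(clashes)}")
    algorithm_form.update(extra)
    return validation_form["algorithm_uri"], algorithm_form


# Hypothetical example values, for illustration only.
validation_form = {
    "algorithm_uri": "http://algorithm.example.org/algorithm/1",
    "training_dataset_uri": "http://dataset.example.org/dataset/10",
    "test_dataset_uri": "http://dataset.example.org/dataset/11",
    "prediction_feature": "http://dataset.example.org/feature/3",
    "algorithm_params": "min_frequency=5;kernel=rbf",
}
target_uri, algorithm_form = build_model_request(validation_form)
body = urlencode(algorithm_form)  # form-encoded body, as curl -d would send it
```

Note that test_dataset_uri is not forwarded: in this sketch the algorithm
service only sees the training dataset, and the test dataset is used by
the validation service afterwards.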

What do you think?

Best regards,
Martin



-- 
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)761 203 8442 (office)
+49 (0)177 623 9499 (mobile)
Email:
guetlein at informatik.uni-freiburg.de


