[OTDev] validation and reporting workflow
Nina Jeliazkova nina at acad.bg
Mon Dec 7 09:13:23 CET 2009
Hi Martin,

(apologies, I have to read older emails before replying to the more
recent ones ...)

Martin Guetlein wrote:
> Hi Tobias, All,
>
> On Fri, Dec 4, 2009 at 8:41 AM, Tobias Girschick
> <tobias.girschick at in.tum.de> wrote:
>
>> Hello Martin,
>>
>> thanks for the visualization of the Validation and Reporting Workflows.
>> It would be interesting to see the "API-Version" (e.g. sequence of curl
>> calls) of the graphical overviews, too. This could also be helpful to
>> check if the API in its current state is capable of handling the full
>> validation and reporting.
>>
>
> That's a nice idea, I will add the curl calls.
>

It would be good to have the same for ToxModel and Fastox as well.

> On Fri, Dec 4, 2009 at 8:55 AM, Tobias Girschick
> <tobias.girschick at in.tum.de> wrote:
>
>> Hello Martin,
>>
>> another thing that is not clear to me is that you write "The following
>> chart illustrates the possible working process of validating an
>> algorithm" (http://www.opentox.org/data/documents/development/validation/validation-and-reporting-overview-and-data-flow)
>> and further below you say the reports described are "reports for model
>> validation".
>> In my opinion, the OpenTox user usually will validate a model, not an
>> algorithm. On the other hand, if you build "the same" (everything except
>> algorithm identical) model with two or three different algorithms (or
>> algorithm parameters), you can validate the algorithms (regarding this
>> dataset/model).
>>
>
> I'm not quite sure if I got your point right.
> I use the term 'validate an algorithm' for the procedure 'use
> algorithm to build model on training set, make predictions on test
> set, compare predictions to actual values'.
> And the term 'validate a model' to 'make predictions on test set,
> compare predictions to actual values'.
> Both are of course possible with the validation webservice (I just
> sketched the first case on the web page, because it is more
> complicated, and it includes the second case).
>

Very useful discussion. It highlights the fact that the validation
service is a client of the Algorithm service (exactly the same way the
ToxModel user interface is a client of the Algorithm service). In this
case it makes sense to have a common Algorithm API, specifying e.g.
dataset_uri, parameters, feature_uri-s, etc., which can be used by all
clients to build a model.

Then the validation service will either
1) take an existing model, or
2) build a model using the Algorithm service API,
and then
3) use it for the validation procedures.

Along the same line of thought, the Validation service is also a client
of the Model service, using the Model prediction API with various
datasets (and of course doing more specific tasks such as gathering
statistics).

Looking at the proposed workflow
http://www.opentox.org/data/documents/development/validation/validation-and-reporting-overview-and-data-flow ,
the Model API so far seems to be sufficient, while the Algorithm API
needs to be clarified.
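
A minimal sketch of the three steps above, written in Python so the
branching is explicit. It is only an illustration under assumptions: the
function names, the injected `post` transport, and the idea that a
prediction is requested by POSTing dataset_uri to the model URI are
hypothetical and not part of any agreed OpenTox API. The form fields
dataset_uri and prediction_feature follow Martin's curl example from
this thread. The statistics step (comparing predictions to actual
values) is left out.

```python
from typing import Callable, Dict, Optional

# Hypothetical transport: POSTs form fields to a URI and returns the URI
# contained in the response. Injected so the sketch needs no network.
Post = Callable[[str, Dict[str, str]], str]


def build_model(post: Post, algorithm_uri: str, dataset_uri: str,
                prediction_feature: str,
                algorithm_params: Optional[Dict[str, str]] = None) -> str:
    """Step 2: ask the Algorithm service to train a model; return its URI."""
    form = {"dataset_uri": dataset_uri,
            "prediction_feature": prediction_feature}
    # Algorithm-specific parameters are sent as extra key/value fields.
    form.update(algorithm_params or {})
    return post(algorithm_uri, form)


def validate(post: Post, test_dataset_uri: str,
             model_uri: Optional[str] = None,
             algorithm_uri: Optional[str] = None,
             training_dataset_uri: Optional[str] = None,
             prediction_feature: Optional[str] = None,
             algorithm_params: Optional[Dict[str, str]] = None) -> str:
    """Return the URI of the predictions made on the test dataset."""
    if model_uri is None:
        # No existing model (step 1 not applicable): build one (step 2).
        if not (algorithm_uri and training_dataset_uri and prediction_feature):
            raise ValueError("need either model_uri or algorithm_uri, "
                             "training_dataset_uri and prediction_feature")
        model_uri = build_model(post, algorithm_uri, training_dataset_uri,
                                prediction_feature, algorithm_params)
    # Step 3: use the model on the test dataset (assumed call shape).
    return post(model_uri, {"dataset_uri": test_dataset_uri})
```

With this shape, 'validate a model' is the call with model_uri set, and
'validate an algorithm' is the call with algorithm_uri and a training
dataset instead; both end in the same prediction step.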

Is it possible to fix the Algorithm API parameters as in Martin's
example:

curl -X POST -d dataset_uri="<dataset_service>/dataset/<train_dataset_id>" \
     -d prediction_feature="<prediction_feature>" \
     -d <alg_param_key1>="<alg_param_val1>" \
     -d <alg_param_key2>="<alg_param_val2>" \
     <algorithm_service>/algorithm/<algorithm_id>
-> <model_service>/model/<model_id>

Algorithm POST call parameters:

1) Training dataset:
   dataset_uri="<dataset_service>/dataset/<train_dataset_id>"
2) Prediction feature(s):
   prediction_feature="uri to prediction features (might be >=1)"
3) I would add parameters for the independent variables as well:
   independent_variables="uri to independent variables"
4) Algorithm parameters could be as proposed above:
   <alg_param_key1>="<alg_param_val1>"

I am not sure what the best way to pass algorithm parameters is, for
example how to embed in the key/value representation Weka algorithm
parameters of the form "-P -M10", or MOPAC keywords
"PM3 NOINTER NOMM BONDS MULLIK PRECISE GNORM=0.0"?

Other suggestions/comments?

> If a developer wants to compare his new algorithm to others, he could
> use the 'validate an algorithm' command (with the new algorithm, as
>

Then this is "compare an algorithm", not "validate an algorithm". I am
not sure the latter term is generally accepted - perhaps Stefan / TUM
group could clarify?

Best regards,
Nina

> well as other algorithms, maybe on a range of data sets). Other
> techniques like cross-validation are possible as well, of course.
>
> If a developer has a model for a certain endpoint, he will use the
> 'validate model' command.
> Does that answer your question?
>
> Regards,
> Martin
>
>
>> best Regards,
>> Tobias
>>
>> On Thu, 2009-12-03 at 18:35 +0100, Martin Guetlein wrote:
>>
>>> Hello All,
>>>
>>> as discussed in the virtual meeting yesterday, I prepared a web page
>>> to give some insight into the validation and reporting services:
>>>
>>> http://www.opentox.org/data/documents/development/validation/validation-and-reporting-overview-and-data-flow
>>>
>>> (You will find a link to this page on the validation api site as well.)
>>>
>>> Comments and suggestions for improvement are highly appreciated.
>>>
>>> Regards,
>>> Martin
>>>
>> --
>> Dipl.-Bioinf. Tobias Girschick
>>
>> Technische Universität München
>> Institut für Informatik
>> Lehrstuhl I12 - Bioinformatik
>> Bolzmannstr. 3
>> 85748 Garching b. München, Germany
>>
>> Room: MI 01.09.042
>> Phone: +49 (89) 289-18002
>> Email: tobias.girschick at in.tum.de
>> Web: http://wwwkramer.in.tum.de/girschick
>>
>> _______________________________________________
>> Development mailing list
>> Development at opentox.org
>> http://www.opentox.org/mailman/listinfo/development
>>