[OTDev] Validation: Efficiency

Tue Mar 1 11:50:49 CET 2011

Martin, All,

On 28 February 2011 11:44, Martin Guetlein
<martin.guetlein at googlemail.com> wrote:

> On Sun, Feb 27, 2011 at 11:32 AM, Nina Jeliazkova
> <jeliazkova.nina at gmail.com> wrote:
> > Christoph, All,
> >
> >
> >> >
> >> > >
> >> > Mapped to our services, there is a need for top level "noun"
> >> >
> >> > http://host:port/ambit2/{set_id}/{dataset_id}
> >> >
> >> > http://host:port/ambit2/dataset/{set_id}/{dataset_id}
> >>
> >> This is what I had in mind. I guess we will need a slight API
> >> modification to create dataset sets (e.g. POST
> >> http://host:port/ambit2/dataset/set to create a set, which can be the
> >> target of a further POST to create a dataset).
> >>
> >> I am not sure if such a solution fits well into the framework, as the
> >> OpenTox way to group datasets would be through ontology entries - but
> >> that does not reduce the number of policies.  Let's hear Martin's and
> >> Andreas' opinions first; maybe someone else also has another idea how to
> >> reduce the number of validation policies.
> >>
> >>
> > If the above will change the current pattern of /dataset/id, I am not
> > much in favour of it (testing compliance across all partners is very
> > time consuming, and at this stage it is better to avoid any such
> > changes). If it only adds a new resource, without changing the current
> > API, it's fine.
>
>
> Hi Nina, Christoph, All,
>
> I just had a short discussion with Andreas Maunz, and we both think
> that sets are a good solution.
> Just a few points:
>
> Backward compatibility should be assured: the dataset service should
> work as it does now.
>
> The set concept would be needed for models too, as the number of
> models grows with the number of folds, and so does the number of
> policies (so far).
>
> This point is more for my understanding of how the whole thing would
> work: A set would only contain resources of the local service, e.g.
> <model-service>/set/<set-id> would only contain models from the same
> service with URIs like <model-service>/set/<set-id>/model/<model-id>.
> The model service uses the set URI for checking user rights at the
> policy service (no wildcards needed at the policy service). When
> creating a model (or a dataset) the set is given as 'destination
> location' parameter. Is this how it could work?
>
>
Sounds feasible - maybe it's time to describe the "set" extensions on the
API page?

And the set extensions could be applied to any of the OpenTox resources
besides datasets and models - e.g. features, as was suggested before by
Surajit.
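
A minimal sketch of the lookup Martin describes above, in which a service
checks rights against the set URI rather than against each member URI. The
function names, the in-memory policy table, and the exact path layout
(<service>/set/<set-id>/<type>/<id>) are assumptions for illustration only,
not part of the current OpenTox API:

```python
from typing import Dict, Optional, Set
from urllib.parse import urlparse


def enclosing_set_uri(resource_uri: str) -> Optional[str]:
    """Return the URI of the set containing the resource, or None.

    Assumes the hypothetical layout <service>/set/<set-id>/<type>/<id>.
    """
    parsed = urlparse(resource_uri)
    segments = parsed.path.strip("/").split("/")
    if "set" not in segments:
        return None
    i = segments.index("set")
    if i + 1 >= len(segments):
        return None  # ".../set" with no set id following it
    set_path = "/".join(segments[: i + 2])
    return f"{parsed.scheme}://{parsed.netloc}/{set_path}"


def is_allowed(user: str, resource_uri: str,
               policies: Dict[str, Set[str]]) -> bool:
    """Check the user against the policy of the enclosing set, if any.

    `policies` is a stand-in for the policy service: it maps a URI to the
    users allowed to access it. Resources outside any set fall back to a
    policy on their own URI, so no wildcard matching is needed.
    """
    policy_uri = enclosing_set_uri(resource_uri) or resource_uri
    return user in policies.get(policy_uri, set())
```

With this layout a cross-validation that builds many models inside one set
would need a single policy entry, keyed by the set URI.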

>
> > What about the following:
> >
> > The validation service starts a validation procedure. At this point it
> > already knows it should split the dataset into N subsets and there will
> > be N more datasets, holding prediction results.  It could allocate
> > placeholders (empty datasets with known URIs) for all the necessary
> > resources and create one policy, involving all URIs (as Andreas noted,
> > one policy could have many URIs), then proceed with calculations.
> >
> > This will require an option to tell the model where to store the results
> > (into the empty dataset created as above).  Such an option was already
> > discussed before in the context of descriptor calculation (to be able to
> > POST/PUT results into a given dataset URI - added as optional in the
> > API). Your implementation will only need to be slightly extended, to
> > accept a POST (or is PUT better in this case?) to a dataset which is
> > empty (I assume you could easily check if a dataset is empty).  Finally,
> > as it is only one policy, the policy deletion issue should be resolved.
> >
> > Will this work?
>
> Nice idea. I would favor the set concept though, because this approach
> has IMHO some drawbacks:
>
> Allocating the empty datasets would require some
> create-empty-dataset-without-policy mechanism, because you do not know
> the dataset URI beforehand.

Yes, it seems I missed this - it could be resolved via a "create several
datasets" operation, but that is an API extension and closer to the sets
approach.
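
As a rough sketch of what such a "create several datasets" operation might
look like, the following uses an in-memory stand-in for a dataset service.
The class, its method names, and the policy dictionary are hypothetical and
only illustrate the flow: the service assigns the URIs, and one policy then
covers all of them.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Tuple


@dataclass
class InMemoryDatasetService:
    """Hypothetical stand-in for a dataset service; no HTTP involved."""
    base_uri: str
    datasets: Dict[str, list] = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def create_empty_datasets(self, n: int) -> List[str]:
        # The service, not the caller, assigns the dataset URIs.
        uris = []
        for _ in range(n):
            uri = f"{self.base_uri}/dataset/{next(self._ids)}"
            self.datasets[uri] = []
            uris.append(uri)
        return uris

    def is_empty(self, uri: str) -> bool:
        return not self.datasets[uri]

    def put_results(self, uri: str, rows: list) -> None:
        # Only placeholders that are still empty may be filled.
        if not self.is_empty(uri):
            raise ValueError(f"dataset is not empty: {uri}")
        self.datasets[uri] = list(rows)


def prepare_crossvalidation(service: InMemoryDatasetService, n_folds: int,
                            owner: str) -> Tuple[List[str], dict]:
    """Allocate N split datasets plus N prediction datasets, one policy."""
    uris = service.create_empty_datasets(2 * n_folds)
    policy = {"owner": owner, "uris": uris}  # single policy, many URIs
    return uris, policy
```

Deleting that one policy at the end would then clean up access rules for
every dataset of the validation run.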

Best regards,
Nina

> This mechanism would require either an API
> extension, or it would limit the validation service to only work with
> 'its own' dataset service.
>
> Don't know how this would work for models.
>
> Best regards,
> Martin
>
> >
> >
> > Best regards,
> > Nina
> >
> >
> >
> >
> >
> >> Best regards,
> >> Christoph
> >>
> > _______________________________________________
> > Development mailing list
> > Development at opentox.org
> > http://www.opentox.org/mailman/listinfo/development
> >
>
>
>
> --
> Dipl-Inf. Martin Gütlein
> Phone:
> +49 (0)761 203 8442 (office)
> +49 (0)177 623 9499 (mobile)
> Email:
> guetlein at informatik.uni-freiburg.de
>