[OTDev] Task history

Fri Feb 26 12:30:05 CET 2010

Hi Nina,

The situation is:

I am generating a task that does the following:

a) Read a molecule from the ambit website
b) Load the dictionary from a MySQL table
c) Fingerprint the molecule using this dictionary
d) Load the R engine interface and RandomForest package
e) Send the fingerprint to the Rserve instance and retrieve a prediction

If I generate an atomic task for each step, it would lead to a long chain
of URI redirections, and I would also have to keep all the atomic tasks in
memory (so they can redirect to the next step). Although this may sound
elegant in principle, the code involved and the memory cost of retaining
atomic tasks for redirection would be too much to justify for this simple
set of steps. Also, if an atomic task fails, it would be a hassle to chain
DELETE all the preceding tasks.

I like Christoph's suggestion to have a tree for a task. But the question
is to what level within the tree the user can have control (in terms of
DELETE requests) in case of failed tasks. Also, threads may get locked if
the user accidentally deletes a subtask whose result is being awaited by
another thread. Right now I find it convenient to run a single main task
and maintain a history. Later we could modify the API to have a querying
mechanism for a tree of tasks, to retrieve histories as well as
intermediate results obtained before the failed subtask.
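
To make the "single main task plus history" idea concrete, here is a
minimal sketch in Python. It is only an illustration under assumptions:
the names Task, HistoryEntry, run and the stub step functions are
hypothetical and are not part of the OpenTox Task API, and the stubs stand
in for the real ambit, MySQL and Rserve calls.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class HistoryEntry:
    """One line of task history (hypothetical structure)."""
    step: str
    status: str                  # "completed" or "failed"
    finished_at: str             # ISO 8601 timestamp, UTC
    error: Optional[str] = None  # message of the exception, if any


@dataclass
class Task:
    """A single main task that runs its steps in order and keeps a history."""
    steps: List[Tuple[str, Callable[[Any], Any]]]
    status: str = "pending"
    result: Any = None
    history: List[HistoryEntry] = field(default_factory=list)

    def _record(self, step: str, status: str, error: Optional[str] = None) -> None:
        now = datetime.now(timezone.utc).isoformat()
        self.history.append(HistoryEntry(step, status, now, error))

    def run(self, initial: Any = None) -> "Task":
        self.status = "running"
        value = initial
        for name, func in self.steps:
            try:
                # Each step receives the output of the previous one.
                value = func(value)
            except Exception as exc:
                # Stop at the first failure; the history shows which step failed.
                self._record(name, "failed", str(exc))
                self.status = "failed"
                return self
            self._record(name, "completed")
        self.result = value
        self.status = "completed"
        return self


# Stub steps (assumptions standing in for the real services).
def read_molecule(_: Any) -> str:
    return "c1ccccc1"


def fingerprint(molecule: str) -> List[int]:
    return [1, 0, 1]


def predict(fp: List[int]) -> float:
    raise RuntimeError("Rserve not reachable")


task = Task(steps=[
    ("read molecule", read_molecule),
    ("fingerprint", fingerprint),
    ("predict", predict),
]).run()

for entry in task.history:
    print(entry.step, entry.status, entry.error or "")
```

With a history like this, a GET on the one task URI could report which
step failed and why, without the client having to follow or DELETE a chain
of atomic tasks.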

Cheers
Surajit

On Fri, Feb 26, 2010 at 3:01 PM, Nina Jeliazkova <nina at acad.bg> wrote:

> surajit ray wrote:
> > Hi Nina,
> >
> > Should we not have some sort of task history parameter/value in the Task
> > API? Otherwise, in cases where there are multiple steps to a single task,
> > the user may not be able to see which step is failing, or why.
> >
> >
> Well, the Task object was assumed to encapsulate an atomic job, which
> does not consist of steps.  With the current redirection API, it is
> quite easy to achieve a series of tasks in a transparent manner.  As an
> example, this is how the TUM algorithm and model services currently work:
> 1) A dataset URI is posted to the Model service.
> 2) It returns a Task URI at the TUM service.
> 3) The TUM Model service runs the calculations and posts the results
> into the IDEA dataset service.  When the TUM Task service is queried for
> the Task URI from step 2, it redirects (303) to the IDEA Task service,
> which returns a new Task URI on the IDEA server.
> 4) Subsequent GETs on the IDEA Task URI will return 200 OK if the
> results are stored in the database.
>
> This is a very elegant approach to automatic workflow using HTTP
> redirects, and it is not restricted to tasks on a single server.  The
> most interesting part is that we did not design it intentionally; it just
> happened by using REST style and proper HTTP codes.
>
> We should strive to keep things as simple as possible.  The TUM group
> might also be willing to share their experience of arranging workflows
> around OpenTox services.
>
> Best regards,
> Nina

-- 
Surajit Ray
Partner
www.rareindianart.com