[OTDev] Task history

Christoph Helma helma at in-silico.de
Fri Feb 26 13:55:44 CET 2010


Excerpts from Nina Jeliazkova's message of Fri Feb 26 12:52:29 +0100 2010:
> Hi Surajit,
> 
> surajit ray wrote:
> > Hi Nina,
> >
> > The situation is:
> >
> > I am generating a task which does the following
> >
> > a) Read a molecule from the ambit website
> > b) load the dictionary from a mysql table
> > c) fingerprint the molecule using this dictionary
> > d) load the R engine interface and RandomForest package
> > e) send fingerprint to the Rserve instance and retrieve a prediction
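> >
> > Internally the task could keep its own step history, so that the error
> > report can at least say which step failed. A rough Python sketch (all
> > helper and field names hypothetical):
> >
> >     def run_steps(task, steps):
> >         # steps: list of (name, callable); each step's return value
> >         # is fed to the next step
> >         value = None
> >         task["history"] = []
> >         try:
> >             for name, step in steps:
> >                 task["history"].append(name)  # record before running
> >                 value = step(value)
> >             task["result"], task["status"] = value, "Completed"
> >         except Exception as e:
> >             task["status"] = "Error"
> >             task["error"] = "step %r failed: %s" % (task["history"][-1], e)
> >
> >     # usage with steps a)-e) above (helper names hypothetical):
> >     # run_steps(task, [("read molecule", lambda _: read_molecule(uri)),
> >     #                  ("fingerprint", lambda m: fingerprint(m, dic)),
> >     #                  ("predict via Rserve", lambda fp: predict(fp))])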
> >   
> From the outside world it looks like a single task that generates a
> prediction for a given molecule/dataset; the individual steps might
> only be reflected in the errors/error descriptions thrown.
> 
> For example, if MaxTox is used by ToxPredict, I would not like to
> receive details about how exactly the prediction is generated; an error
> report if something fails is sufficient.  For internal processing you
> might take any reasonable approach, but this will not be reflected in
> the API.
> > If I generate an atomic task for each step, it would lead to a huge
> > URI redirection route, and I would also have to keep all the atomic
> > tasks in memory (so they can redirect to the next step).
> Redirection makes sense if the processing tasks are on different and/or
> remote services. For internal processing it indeed doesn't make sense.
> > Although in principle this may sound elegant, the code involved and
> > the cost in terms of memory for retaining atomic tasks for redirection
> > would be too much to justify for this simple set of steps. Also, if an
> > atomic task fails, it would be a hassle to chain-DELETE all the
> > preceding tasks.
> >   
> No need to chain; it might even be impossible if tasks are on remote
> services. Completed/failed tasks might just expire after a certain time.
> > I like Christoph's suggestion to have a tree for a task. But the
> > question is to what level within the tree the user can have control (in
> > terms of DELETE requests) in case of failed tasks. Also, threads may get
> > locked if the user accidentally deletes a subtask whose result is being
> > awaited by another thread. Right now I find it convenient to run a
> > single main task and maintain a history. Later we could modify the API
> > to have a querying mechanism for a tree of tasks, to retrieve histories
> > as well as intermediate results obtained before the failed subtask.
> >   
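> > For illustration, a rough sketch of such a tree node (all names purely
> > hypothetical, not an API proposal):
> >
> >     class TaskNode:
> >         def __init__(self, uri, parent=None):
> >             self.uri, self.parent = uri, parent
> >             self.children, self.status = [], "Running"
> >
> >         def history(self):
> >             # walk the subtree: statuses of all subtasks, including
> >             # intermediate results obtained before a failed one
> >             yield self.uri, self.status
> >             for child in self.children:
> >                 for entry in child.history():
> >                     yield entry
> >
> >         def delete(self):
> >             # guard against the accidental DELETE mentioned above: a
> >             # running subtask may still be awaited by another thread
> >             if self.status == "Running":
> >                 raise ValueError("%s still running" % self.uri)
> >             for child in list(self.children):
> >                 child.delete()
> >             if self.parent is not None:
> >                 self.parent.children.remove(self)
> >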
> The problem with more complex structures, rather than atomic tasks, is
> that one indeed dives into thread management, deadlocks, etc. In a
> distributed setting this is a nightmare, and certainly not the main
> topic of this project.  Tasks arranged as trees might be fine if
> everything runs on the same site, but if not, it seems like additional
> trouble.  We should, however, extend the API to be able to cancel
> atomic tasks.

I am not sure if we can avoid it altogether (although I would like to -
it is indeed a nightmare).
Let's assume the following scenario from model validation:

task: model validation (validation service)
    split dataset
    n times do
        task: create dataset (dataset service)
        wait_for_task
    n times do
        task: create model (algorithm service)
            task: parse input and create dataset (dataset service)
            wait_for_task
            task: create features (algorithm service)
                task: create feature dataset (dataset service)
                wait_for_task
            wait_for_task
            task: create model (algorithm service)
            wait_for_task
        task: predict test set (model service)
        wait_for_task
    calculate statistics
wait_for_task
create report
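
Every wait_for_task above boils down to polling a task URI until it
leaves the running state. A minimal sketch in Python (assuming the
202-while-running / 303-redirect-on-completion convention; requests
stands in for any HTTP client):

    import time
    import requests

    def wait_for_task(task_uri, poll_interval=5):
        while True:
            r = requests.get(task_uri, allow_redirects=False)
            if r.status_code == 202:        # still running
                time.sleep(poll_interval)
            elif r.status_code == 303:      # completed: result URI
                return r.headers["Location"]
            elif r.ok:                      # 200: results available
                return task_uri
            else:                           # raise, so the failure
                raise RuntimeError(         # propagates to the parent
                    "task %s failed: %d" % (task_uri, r.status_code))

Because failures surface as exceptions, a failing subtask aborts its
parent instead of leaving it waiting forever.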

It involves at least 4 services (and gets more complicated when modeling
and feature calculation algorithms come from different services), and we
cannot expect that everything runs on the same machine. Furthermore, we
cannot make any assumptions about calculation times (they might take
hours or even days for large datasets), so setting expiration times is
not an option here.

In my experience tasks tend to fail at the most unlikely places and for
the stupidest reasons (e.g. exceeding database size limits, all kinds of
timeouts, Redland memory leaks, ...), so we need a mechanism to
communicate errors (and maybe also progress) back to parent tasks. I am
not sure about the best mechanism (I favor simplicity), but we should
avoid one-way tickets. Suggestions are very welcome!
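
One simple possibility (just a sketch, not a worked-out proposal): let
every task carry an error report, and let parents wrap their children's
reports, so the root task exposes the whole chain of causes:

    def run_subtask(name, work):
        # wrap the child's failure instead of swallowing it, so the
        # cause travels all the way up to the root task
        try:
            return work()
        except Exception as e:
            report = {"task": name, "message": str(e),
                      "caused_by": getattr(e, "report", None)}
            wrapped = RuntimeError("%s failed" % name)
            wrapped.report = report
            raise wrapped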

Best regards,
Christoph

> 
> My approach up to now has been to use language-provided tools for
> thread handling (e.g. the java.util.concurrent package in our case) and
> not to introduce thread-management complexity into the API.  Workflows
> might be a better means for managing task sequences.
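> 
> A rough sketch of that idea in Python terms (concurrent.futures being
> the analogue of java.util.concurrent): one externally visible task,
> with internal steps handled as futures the service itself waits on:
> 
>     from concurrent.futures import ThreadPoolExecutor
> 
>     executor = ThreadPoolExecutor(max_workers=4)
> 
>     def run_task(steps):
>         # internal parallelism stays internal; the API only ever
>         # sees the single surrounding task
>         futures = [executor.submit(step) for step in steps]
>         return [f.result() for f in futures]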
> 
> Regards,
> Nina
> > Cheers
> > Surajit
> >
> > On Fri, Feb 26, 2010 at 3:01 PM, Nina Jeliazkova <nina at acad.bg> wrote:
> >
> >   
> >> surajit ray wrote:
> >>     
> >>> Hi Nina,
> >>>
> >>> Should we not have some sort of task history parameter/value in the
> >>> Task API? Otherwise, in cases where there are multiple steps to a
> >>> single task, the user may not be able to see which step is failing
> >>> ... or why ...
> >>>
> >> Well, the Task object was assumed to encapsulate an atomic job, which
> >> does not consist of steps.  With the current redirection API, it is
> >> quite easy to achieve a series of tasks in a transparent manner.  As an
> >> example, this is how the TUM algorithm and model services currently work:
> >> 1) A dataset URI is posted to the Model service.
> >> 2) It returns a Task URI at the TUM service.
> >> 3) The TUM Model service runs the calculations and posts the results
> >> into the IDEA dataset service.  When the TUM Task service is queried
> >> for the Task URI from step 2, it redirects (303) to the IDEA Task
> >> service, which returns a new Task URI on the IDEA server.
> >> 4) Subsequent GETs on the IDEA Task URI will return 200 OK once the
> >> results are stored in the database.
> >>
> >> This is a very elegant approach to automatic workflows using HTTP
> >> redirects, and it is not restricted to tasks on a single server.  The
> >> most interesting part is that we did not design it intentionally; it
> >> just happened by using REST style and proper HTTP codes.
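> >>
> >> A rough client-side sketch of why this is transparent (Python;
> >> polling while the task is still running is omitted here):
> >>
> >>     import requests
> >>
> >>     def fetch_result(task_uri):
> >>         # requests follows the 303 chain by default, so the client
> >>         # ends up at the final result no matter how many services
> >>         # the task hopped across (TUM -> IDEA in the example above)
> >>         r = requests.get(task_uri)
> >>         r.raise_for_status()     # 200 OK once results are stored
> >>         return r.url, r.text     # final URI and representation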
> >>
> >> We should strive to keep things as simple as possible.  The TUM
> >> group might also be willing to share their experience of arranging
> >> workflows around OpenTox services.
> >>
> >> Best regards,
> >> Nina