[OTDev] scripts to extract toxicity data from echa site

Barry Hardy barry.hardy at douglasconnect.com
Tue Jun 7 15:57:40 CEST 2011


I believe the current answer is that it is either not planned or a long 
time off.
Barry

On 07.06.2011 15:10, Vedrin Jeliazkov wrote:
> Hi Barry, All,
>
> On 7 June 2011 15:29, Barry Hardy <barry.hardy at douglasconnect.com> wrote:
>> Dear All:
>> It might be worthwhile for the developer community to write scripts to
>> extract public REACH dossier toxicity data from the ECHA website to make it
>> available in a more suitable form for scientific purposes including model
>> building, improving models etc.
> While certainly feasible (at least to some extent), such scripts
> wouldn't perform the foreseen task in an optimal way from a technical
> point of view. The scripts would basically involve the following
> steps:
>
> -- mirror the relevant HTML content from ECHA site;
> -- run some heavy post-processing in order to extract valuable bits of
> information in a structured and machine readable format;
> -- populate some DB backend with the interesting data and expose it
> through an OT dataset (or similar) service.
>
> Having in mind that:
>
> 1) ECHA already has this data in a DB (IUCLID5) and that this DB has a
> well-documented webservices interface,
> 2) AMBIT includes a module for data exchange with IUCLID5 through
> this webservices interface,
>
> IMHO a much more technically sound way of using this data would be
> through such a setup. This would, though, require talking to the right
> people at ECHA and convincing them to publish the data through IUCLID5
> webservices in addition to the plain HTML they currently have.
>
>> It should also be done in a way that is legal.
> ECHA's disclaimer
> (http://echa.europa.eu/disclaimer_en.asp#registration) includes the
> following sentence:
>
> "Use the information with care. Reproduction or further distribution
> of the information is subject to copyright laws and might require the
> permission of the owner of that information."
>
> Obviously there are some open legal issues, apart from the technical
> ones, that require further discussion with ECHA.
>
>> What do you think?
> I wouldn't bother doing anything before we have a clear statement from ECHA on:
>
> -- whether they plan to publish dossier data through an IUCLID5 web service;
> -- whether this data could be used for model building, validation or
> whatever other cheminformatics purposes one could be interested in.
>
> Just my two cents,
> Vedrin
>
>
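
For illustration only, the extract-and-load part of the three steps Vedrin
lists above (mirror, post-process, populate a DB) might look roughly like the
sketch below. It does not touch the ECHA site: SAMPLE_HTML is a made-up
stand-in for one mirrored page, and the table layout, the RowExtractor class,
the extract_rows and load_rows helpers and the "endpoint" table are all
assumptions, not anything described in this thread. The real pages would need
their own parsing rules, and the legal questions raised above still apply.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical stand-in for one mirrored page; the real markup will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Endpoint</th><th>Value</th><th>Unit</th></tr>
  <tr><td>LD50 oral (rat)</td><td>2000</td><td>mg/kg bw</td></tr>
  <tr><td>LC50 inhalation (rat)</td><td>5.1</td><td>mg/L</td></tr>
</table>
"""


class RowExtractor(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so they stay empty
            # and are skipped here.
            if self._row:
                self.rows.append(tuple(cell.strip() for cell in self._row))
            self._row = None


def extract_rows(html):
    """Step 2: turn mirrored HTML into (name, value, unit) tuples."""
    parser = RowExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def load_rows(conn, rows):
    """Step 3: store the extracted records in a DB backend (SQLite here)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS endpoint (name TEXT, value REAL, unit TEXT)"
    )
    conn.executemany(
        "INSERT INTO endpoint VALUES (?, ?, ?)",
        [(name, float(value), unit) for name, value, unit in rows],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_rows(conn, extract_rows(SAMPLE_HTML))
    for record in conn.execute("SELECT name, value, unit FROM endpoint"):
        print(record)
```

Even in this toy form the parsing rules are tied to one page layout, which is
the fragility Vedrin points out when comparing scraping with a proper IUCLID5
webservices route.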



