[OTDev] scripts to extract toxicity data from echa site
Vedrin Jeliazkov vedrin.jeliazkov at gmail.com
Tue Jun 7 15:10:46 CEST 2011
Hi Barry, All,

On 7 June 2011 15:29, Barry Hardy <barry.hardy at douglasconnect.com> wrote:
> Dear All:
> It might be worthwhile for the developer community to write scripts to
> extract public REACH dossier toxicity data from the ECHA website to make it
> available in a more suitable form for scientific purposes including model
> building, improving models etc.

While certainly feasible (at least to some extent), such scripts wouldn't
perform the foreseen task in an optimal way from a technical point of view.
The scripts would basically involve the following steps:

-- mirror the relevant HTML content from the ECHA site;
-- run some heavy post-processing in order to extract valuable bits of
   information in a structured and machine-readable format;
-- populate some DB backend with the interesting data and expose it through
   an OT dataset (or similar) service.

Having in mind that:

1) ECHA already has this data in a DB (IUCLID5) and that this DB has a
   well-documented web services interface,
2) AMBIT includes a module for data exchange with IUCLID5 through this
   web services interface,

IMHO a much more technically sound way of using this data would be through
such a setup. This would, however, require talking to the right people at
ECHA and convincing them to publish the data through IUCLID5 web services in
addition to the plain HTML they currently have.

> It should also be done in a way that is legal.

ECHA's disclaimer (http://echa.europa.eu/disclaimer_en.asp#registration)
includes the following sentence:

"Use the information with care. Reproduction or further distribution of the
information is subject to copyright laws and might require the permission of
the owner of that information."

Obviously there are some open legal issues, apart from the technical ones,
that require further discussion with ECHA.

> What do you think?

I wouldn't bother doing anything before we have a clear statement from ECHA on:

-- whether they plan to publish dossier data through an IUCLID5 web service;
-- whether this data could be used for model building, validation or whatever
   other cheminformatics purposes one could be interested in.

Just my two cents,
Vedrin
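
For illustration only, the post-processing step listed above (turning mirrored
HTML into structured, machine-readable records) might look roughly like the
sketch below. It uses only Python's standard html.parser module. The table
layout, the column names and the sample values are invented for the example
and do not reflect the actual ECHA pages; the names TableRowParser and
extract_records are likewise hypothetical.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_records(html):
    """Treat the first table row as a header and return the rest as dicts."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *data = parser.rows
    return [dict(zip(header, row)) for row in data]


# Invented sample page: NOT real ECHA markup or data.
SAMPLE_HTML = """
<table>
  <tr><th>endpoint</th><th>species</th><th>value</th></tr>
  <tr><td>LD50 oral</td><td>rat</td><td>123 mg/kg</td></tr>
  <tr><td>LC50 inhalation</td><td>mouse</td><td>4.5 mg/L</td></tr>
</table>
"""

records = extract_records(SAMPLE_HTML)
```

Even in this toy form the extraction is tied to one particular page layout,
which is part of why a web services interface would be the more robust route.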