[OTDev] Luca Settimo

Nina Jeliazkova jeliazkova.nina at gmail.com
Wed Sep 7 16:07:20 CEST 2011


Dear Luca,

The access to the structures (and data) is assumed to be via the OpenTox
REST web services API, not through database calls.

As explained at http://ambit.uni-plovdiv.bg/downloads/ambit2/ , the
ambit2.war file has to be deployed to a servlet container and the data
accessed via API calls. The content of the download page is also accessible at
http://ambit.sourceforge.net/download_ambitrest.html .

For the API documentation and publications, see:
http://opentox.org/dev/apis/api-1.2

http://www.jcheminf.com/content/2/1/7
http://www.jcheminf.com/content/3/1/18

http://ambit.sourceforge.net/api.html
http://ambit.sourceforge.net/ambit_services.html


On 7 September 2011 16:36, <luca_settimo at vrtx.com> wrote:

> Dear OpenTox support and dear Vedrin,
> Thank you for the answer that you gave me last month.
> We had some difficulty getting the structures from the database dump
> http://ambit.uni-plovdiv.bg/downloads/ambit2/db/ambit2-2011051401.7z
> that you sent me some time ago.
> My colleague (Pat Walters) tried to load the database into MySQL. He said
> that there are about 50 tables, and we don't see any documentation. Do
> you know if there is a description or an entity relationship diagram
> available?
>
> There's a table called "structures", that has 478,009 structures in SDF
> format and 70,646 in INC (INChI?) format.
> There's another table called "chemicals" with 118,726 SMILES
>
> The other tables contain descriptors and data, but we are not sure how it
> all fits together.
>
>

If you would like to study the database schema, a version of it can be
found in the Prototype Database deliverable (p.12), along with other
information.

http://www.opentox.org/data/documents/development/opentoxreports/opentoxreportd32


The updated document for the final database is under review by partners and
hopefully will be published soon.

Information about the schema can be found at
https://ambit.svn.sourceforge.net/svnroot/ambit/trunk/ambit2-all/ambit2-db/src/main/resources/ambit2/db/sql/ambit2.mwb
(MySQL Workbench file) and in the book chapter [1].


If you would only like to access structures (and/or data), and not
necessarily install the OpenTox web services, it might be more appropriate
to use the pre-installed service at
https://ambit.uni-plovdiv.bg:8443/ambit2/dataset
(for example, https://ambit.uni-plovdiv.bg:8443/ambit2/dataset?max=100)
and just download the structures in the preferred format.
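
As a rough illustration of "download the structures in the preferred
format", here is a minimal Python sketch. It is not taken from the AMBIT
documentation. It assumes that a dataset lives at <base>/dataset/<id>, that
the output format is chosen with the HTTP Accept header, and that the MIME
types listed in FORMATS are the ones the service understands. The dataset id
used in the usage line is hypothetical, and authentication (the service
requires an OpenTox login) is left out entirely.

```python
import shutil
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed MIME types for content negotiation; check the API documentation
# of the service you are talking to before relying on them.
FORMATS = {
    "sdf": "chemical/x-mdl-sdfile",
    "smiles": "chemical/x-daylight-smiles",
    "csv": "text/csv",
}


def build_dataset_request(base_url: str,
                          dataset_id: Optional[int] = None,
                          fmt: str = "sdf",
                          max_results: Optional[int] = None) -> Request:
    """Build (but do not send) a GET request for a dataset or the dataset list."""
    url = base_url.rstrip("/") + "/dataset"
    if dataset_id is not None:
        url += f"/{dataset_id}"
    if max_results is not None:
        # Same 'max' query parameter as in the link quoted in this message.
        url += "?" + urlencode({"max": max_results})
    return Request(url, headers={"Accept": FORMATS[fmt]})


def download(request: Request, out_path: str) -> None:
    """Send the request and stream the response body to a local file."""
    with urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Usage would look something like
download(build_dataset_request("https://ambit.uni-plovdiv.bg:8443/ambit2", dataset_id=1), "dataset.sdf"),
where dataset id 1 is only a placeholder.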

Best regards,
Nina Jeliazkova

[1] Jeliazkova N., Jaworska J., Worth A. (2010) Chapter 17. Open Source
Tools for Read-Across and Category Formation. In M. Cronin & J. Madden
(Eds.), In Silico Toxicology: Principles and Applications (pp. 408-445).
Cambridge, UK: RSC Publishing.
http://www.rsc.org/publishing/ebooks/2010/9781849730044.asp


> Please include Pat Walters in the reply if you can help.
>
> Thanks
> Luca
>
>
>
> From:   Vedrin Jeliazkov <vedrin.jeliazkov at gmail.com>
> To:     luca_settimo at vrtx.com
> Cc:     opentox development mailing list <development at opentox.org>
> Date:   04/08/2011 13:29
> Subject:        Re: Luca Settimo
>
>
>
> Hi Luca,
>
> > could you give me some more info on the databases that you collected for
> > AMBIT?
>
> The database dump that is available at
> http://ambit.uni-plovdiv.bg/downloads/ambit2/db/ambit2-2011051401.7z
> contains the following datasets:
>
> ECHA list of pre-registered substances (143835 entries)
> ChemIDplus (structures for 80468 chemicals from the ECHA list of
> pre-registered substances)
> Chemical Identifier Resolver (structures for 72985 chemicals from the
> ECHA list of pre-registered substances)
> ChemDraw (structures for 22519 chemicals from the ECHA list of
> pre-registered substances)
> CPDBAS (1547 entries)
> DBPCAN (209 entries)
> EPAFHM (617 entries)
> FDAMDD (1216 entries)
> HPVCSI (3548 entries)
> HPVISD (1006 entries)
> IRISTR (544 entries)
> KIERBL (278 entries)
> NCTRER (232 entries)
> NTPBSI (2330 entries)
> NTPHTS (1408 entries)
> ISSCAN (1150 entries)
> ISSMIC (151 entries)
> ISSSTY (232 entries)
> TOXCST (320 entries)
> TXCST2 (960 entries)
> ECETOC Technical Report No. 66 Skin irritation and corrosion Reference
> Chemicals data base (1995) (176 entries)
> Local Lymph Node Data for the Evaluation of Skin Sensitization -
> Compilation of historical data (Dermatitis Vol 16 No 4 2005) (209
> entries)
> Local Lymph Node Data for the Evaluation of Skin Sensitization -
> Second compilation (Dermatitis Vol 21 No 1 2010) (108 entries)
> Bioconcentration factor (BCF) Gold Standard Database (1130 entries)
> Benchmark Data Set for pKa Prediction of Monoprotic Small Molecules
> the SMARTS Way (185 entries)
> Benchmark Data Set for In Silico Prediction of Ames Mutagenicity (6512
> entries)
> Bursi AMES Toxicity Dataset (4337 entries)
> EPI_AOP (818 entries)
> EPI_BCF (685 entries)
> EPI_BioHC (175 entries)
> EPI_Biowin (1263 entries)
> EPI_Boil_Pt (5890 entries)
> EPI_Henry (1829 entries)
> EPI_KM (631 entries)
> EPI_KOA (308 entries)
> EPI_Kowwin (15809 entries)
> EPI_Melt_Pt (10051 entries)
> EPI_PCKOC (788 entries)
> EPI_VP (3037 entries)
> EPI_WaterFrag (5764 entries)
> EPI_Wskowwin (2348 entries)
> TOXCST_ACEA (320 entries)
> TOXCST_Attagene (320 entries)
> TOXCST_BioSeek (320 entries)
> TOXCST_Cellumen (320 entries)
> TOXCST_CellzDirect (320 entries)
> TOXCST_Gentronix (320 entries)
> TOXCST_NCGC (320 entries)
> TOXCST_Novascreen (320 entries)
> TOXCST_Solidus (320 entries)
> TOXCST_ToxRefDB (320 entries)
> ECBPRS (structures and data for 80410 chemicals from the ECHA list of
> pre-registered substances)
> OPSIN (structures for 78458 chemicals from the ECHA list of
> pre-registered substances)
>
> You can also access all of the above mentioned datasets at
> https://ambit.uni-plovdiv.bg:8443/ambit2/dataset after you login with
> your OpenTox username and password at
> https://ambit.uni-plovdiv.bg:8443/ambit2/opentoxuser (You can register
> as an OpenTox user at http://www.opentox.org/join_form if you haven't
> already).
>
> In addition to these datasets, you could access at the same location
> the PubChem Structures + Assays dataset (473965 entries), which is not
> included in the MySQL dump that is available for download in order to
> keep it more compact.
>
> Please note that some additional datasets (not listed above, but
> available in the DB) are accessible only by OpenTox partners, due to
> specific licensing requirements and agreements.
>
> > Are you aware of this paper?
>
> [http://dx.doi.org/10.1016/j.taap.2009.08.022]
>
> > Perhaps you will find Table 1 very useful because it shows all databases
> > for tox that are available in the literature. Which of these
> > do you have?
>
> As you can see from the list above, there's some degree of overlap
> between the references in Table 1 of this paper and the datasets
> included in the OpenTox DB, but both have entries that are absent in
> the other list. One major obstacle for including some of the sources
> that you mention is the lack of computer-readable bulk download for
> them. In addition, the AMBIT database is evolving continuously (even
> as I write these lines) and it can be somewhat hard to tell what's
> included and what's not -- all registered users with sufficient
> privileges can add datasets at any time. In general, the OpenTox
> framework (and AMBIT as one particular implementation of the OpenTox
> API) provides the infrastructure to store and process relevant data,
> much as the Apache HTTP server provides the infrastructure for
> serving web site content. It's up to the users to upload whatever
> datasets, algorithms, models, etc. they like to use or make
> available to others. So, in essence, the OpenTox DB is a kind of
> starting reference point, with particular emphasis on datasets that
> are relevant to the European REACH legislation, mainly due to the
> specific context of the OpenTox project. However, the OpenTox
> framework was designed in a generic way, to enable its use in other
> domains as well. It's up to the users to install, populate, run, and
> maintain their own instances of OpenTox services. Furthermore, due to
> the common API, these services could be linked together and rely on
> each other for executing specific tasks (e.g. an algorithm provided by
> service A can be used to build a model by service B, using training
> dataset available at service C; the model at service B could be
> validated by service D and used to predict properties for a dataset
> hosted at service E, etc). You can have all of these running on a
> single box, or on a private cluster, or as (distributed) services that
> you offer to the public to use.
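
To make the chaining described above a little more concrete, here is a
hedged Python sketch of the idea that every step is an HTTP POST of resource
URIs to another resource URI. It is only an illustration: the form
parameter names dataset_uri and prediction_feature are assumptions based on
the OpenTox API and should be checked against the API documentation, all
example.org hosts and resource paths are hypothetical, and the requests are
only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def post_form(uri: str, **params: str) -> Request:
    """Build (but do not send) a form-encoded POST to the given resource URI."""
    body = urlencode(params).encode("ascii")
    return Request(
        uri,
        data=body,
        method="POST",
        headers={
            # Assumption: the service answers with the URI of the new resource.
            "Accept": "text/uri-list",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def training_request(algorithm_uri: str, training_dataset_uri: str,
                     target_feature_uri: str) -> Request:
    """Ask an algorithm resource to build a model from a remote training dataset."""
    return post_form(algorithm_uri,
                     dataset_uri=training_dataset_uri,
                     prediction_feature=target_feature_uri)


def prediction_request(model_uri: str, dataset_uri: str) -> Request:
    """Ask a model resource to predict properties for a remote dataset."""
    return post_form(model_uri, dataset_uri=dataset_uri)


# Hypothetical URIs standing in for services A, C and E from the text above.
ALGORITHM_URI = "https://service-a.example.org/algorithm/some-learner"
TRAINING_SET_URI = "https://service-c.example.org/dataset/1"
TARGET_FEATURE_URI = "https://service-c.example.org/feature/1"
NEW_COMPOUNDS_URI = "https://service-e.example.org/dataset/2"
```

The point of the design is that the services only exchange URIs, so none of
them needs to know in advance where the data, algorithms, or models are
hosted.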
>
> > So Barry told me that you have a linux version of
> > tox-create/tox-predict? Is that true?
>
> See my previous mail and Micha's mail for detailed answers to these
> questions. The apps are platform independent and can run on any OS.
> ToxPredict and its dependencies are Java-based; ToxCreate and its
> dependencies are Ruby-based.
>
> As a somewhat easier first step, you might want to try the OpenTox
> virtual appliance, which has all of these apps pre-installed for you
> on a recent version of Linux:
>
>
> http://ambit.uni-plovdiv.bg/downloads/opentox/Opentox%20Virtual%20Appliance%20DC.ova
>
>
> Please note that this is a large file (2730474496 bytes). Its MD5
> checksum, which you can use to verify that no errors occurred while
> downloading it, is: 1530bb83e88c3c646bcbac3183745bab
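
One possible way to check the download against the MD5 value quoted above
is the small Python sketch below. It reads the file in chunks, so the
roughly 2.7 GB appliance image does not have to fit in memory. The function
names and the local file name in the usage line are made up for
illustration. Only the checksum itself comes from the message.

```python
import hashlib

# MD5 checksum quoted in the message above.
EXPECTED_MD5 = "1530bb83e88c3c646bcbac3183745bab"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare the file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path) == expected.strip().lower()
```

For example, checksum_matches("Opentox Virtual Appliance DC.ova") should
return True for an intact download, assuming the file was saved under that
name in the current directory.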
>
> You could import and run the appliance in VirtualBox
> (http://www.virtualbox.org/).
>
> Let us know if we can be of further assistance.
>
> Kind regards,
> Vedrin
>
> _______________________________________________
> Development mailing list
> Development at opentox.org
> http://www.opentox.org/mailman/listinfo/development
>


