[OTDev] IBMC QNA\MNA services

Druzhilovsky dmitry.druzhilovsky at ibmc.msk.ru
Wed Dec 22 12:09:52 CET 2010


Dear Christoph,

You wrote:
"I have ... suspect that some brackets around explicit hydrogens are missing."

Explanation.
The MNA descriptors:

HO
CHHHN
CHHCC
CHCC
CHCN
CCCC
CCCN
NCCC
NCO
OHN

are first-level MNA descriptors,
whereas the MNA descriptors:

C(C(CCC)C(CC-H)N(CC-C))
C(C(CCC)C(CC-H)-H(C))
C(C(CCC)C(CN-H)-C(C-C-N))
C(C(CCN)C(CC-H)C(CC-C))
C(C(CCN)C(CC-H)-H(C))
C(C(CC-H)C(CC-H)-H(C))
C(C(CC-H)C(CC-H)-C(C-H-H-C))
C(C(CC-H)C(CC-C)-H(C))
C(C(CC-C)N(CC-C)-H(C))
N(C(CCN)C(CN-H)-C(N-H-H-H))
-H(C(CC-H))
-H(C(CN-H))
-H(-C(C-H-H-C))
-H(-C(N-H-H-H))
-H(-O(-H-N))
-C(C(CC-C)-H(-C)-H(-C)-C(C-C-N))
-C(C(CC-C)-C(C-H-H-C)-N(-C-O))
-C(N(CC-C)-H(-C)-H(-C)-H(-C))
-N(-C(C-C-N)-O(-H-N))
-O(-H(-O)-N(-C-O))

are second-level MNA descriptors.

So the brackets correspond to the first and second immediate neighbours.
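
The distinction can be sketched in a few lines of Python. This is only an illustration inferred from the examples listed in this message, not the algorithm used by the MakeMNA service: it assumes that a descriptor without brackets is first level and that a descriptor whose brackets nest exactly two deep is second level. The function names are hypothetical.

```python
def max_bracket_depth(descriptor: str) -> int:
    """Return the deepest nesting of round brackets in a descriptor string."""
    depth = 0
    deepest = 0
    for ch in descriptor:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unbalanced brackets in {descriptor!r}")
    if depth != 0:
        raise ValueError(f"unbalanced brackets in {descriptor!r}")
    return deepest


def mna_level(descriptor: str) -> int:
    """Guess the MNA level from bracket nesting.

    Assumption (taken only from the examples in this message):
    no brackets means first level, nesting depth 2 means second level.
    """
    depth = max_bracket_depth(descriptor)
    if depth == 0:
        return 1
    if depth == 2:
        return 2
    raise ValueError(f"unexpected bracket depth {depth} in {descriptor!r}")


if __name__ == "__main__":
    for d in ["CHHHN", "NCO", "C(C(CCC)C(CC-H)N(CC-C))", "-O(-H(-O)-N(-C-O))"]:
        print(d, "-> level", mna_level(d))
```

Read this way, a string such as CHHHN is not meant to be parsed as SMARTS at all: it is the central atom followed by its immediate neighbours.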


Best regards

Dmitry S. Druzhilovsky

Laboratory of Structure-Function Based Drug Design
119121, Russia, Moscow, Pogodinskaya street, 10 
Phone: +7 499 255-30-29
Fax: +7 499 245-08-57

D> -----Original Message-----
D> From: development-bounces at opentox.org [mailto:development-bounces at opentox.org] On Behalf Of Christoph Helma
D> Sent: Monday, December 13, 2010 7:40 PM
D> To: development
D> Subject: Re: [OTDev] IBMC QNA\MNA services
D> 
D> Dear Dmitry, all,
D> 
D> You can find my representation for the first 10 substructures in your
D> example in the attachment (rdf/xml and turtle formats).
D> 
D> I have noticed that some of your features are not valid SMARTS (e.g.
D> CHHHN, CHHCC) and suspect that some brackets around explicit hydrogens
D> are missing.
D> 
D> Best regards,
D> Christoph
D> 
D> 
D> Excerpts from Nina Jeliazkova's message of Mon Dec 13 11:36:54 +0100 2010:
D> > Christoph,
D> >
D> > Here is the example
D> >
D> > <?xml version="1.0" ?>
D> > <rdf:RDF xmlns:ot="http://www.opentox.org/api/1.1#"
D> >     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
D> >     xmlns:owl="http://www.w3.org/2002/07/owl#"
D> >     xmlns:dc="http://purl.org/dc/elements/1.1/">
D> > <owl:Class rdf:about="http://www.opentox.org/api/1.1#Dataset"></owl:Class>
D> > <owl:Class rdf:about="http://www.opentox.org/api/1.1#DataEntry"></owl:Class>
D> > <owl:Class rdf:about="http://www.opentox.org/api/1.1#Feature"></owl:Class>
D> > <owl:Class rdf:about="http://www.opentox.org/api/1.1#FeatureValue"></owl:Class>
D> > <owl:Class rdf:about="http://www.opentox.org/api/1.1#Compound"></owl:Class>
D> > <owl:ObjectProperty rdf:about="http://www.opentox.org/api/1.1#compound"></owl:ObjectProperty>
D> > <owl:ObjectProperty rdf:about="http://www.opentox.org/api/1.1#dataEntry"></owl:ObjectProperty>
D> > <owl:ObjectProperty rdf:about="http://www.opentox.org/api/1.1#values"></owl:ObjectProperty>
D> > <owl:ObjectProperty rdf:about="http://www.opentox.org/api/1.1#feature"></owl:ObjectProperty>
D> > <owl:ObjectProperty rdf:about="http://www.opentox.org/api/1.1#hasSource"></owl:ObjectProperty>
D> > <owl:ObjectProperty rdf:about="http://www.opentox.org/api/1.1#acceptValue"></owl:ObjectProperty>
D> > <owl:DatatypeProperty rdf:about="http://www.opentox.org/api/1.1#units"></owl:DatatypeProperty>
D> > <owl:DatatypeProperty rdf:about="http://www.opentox.org/api/1.1#value"></owl:DatatypeProperty>
D> > <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/description"></owl:AnnotationProperty>
D> > <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/creator"></owl:AnnotationProperty>
D> > <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/type"></owl:AnnotationProperty>
D> > <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/title"></owl:AnnotationProperty>
D> > <ot:Dataset rdf:about="https://ambit.uni-plovdiv.bg:8443/ambit2/dataset/36400">
D> > <ot:dataEntry><ot:DataEntry>
D> > <ot:compound><ot:Compound rdf:about="https://ambit.uni-plovdiv.bg:8443/ambit2/compound/163134/conformer/506294"></ot:Compound></ot:compound>
D> > <ot:values><ot:FeatureValue>
D> > <ot:feature rdf:resource="https://ambit.uni-plovdiv.bg:8443/ambit2/feature/178539"></ot:feature>
D> > <ot:value rdf:datatype="http://www.w3.org/2001/XMLSchema#string">HC
D> > HO
D> > CHHHN
D> > CHHCC
D> > CHCC
D> > CHCN
D> > CCCC
D> > CCCN
D> > NCCC
D> > NCO
D> > OHN
D> > C(C(CCC)C(CC-H)N(CC-C))
D> > C(C(CCC)C(CC-H)-H(C))
D> > C(C(CCC)C(CN-H)-C(C-C-N))
D> > C(C(CCN)C(CC-H)C(CC-C))
D> > C(C(CCN)C(CC-H)-H(C))
D> > C(C(CC-H)C(CC-H)-H(C))
D> > C(C(CC-H)C(CC-H)-C(C-H-H-C))
D> > C(C(CC-H)C(CC-C)-H(C))
D> > C(C(CC-C)N(CC-C)-H(C))
D> > N(C(CCN)C(CN-H)-C(N-H-H-H))
D> > -H(C(CC-H))
D> > -H(C(CN-H))
D> > -H(-C(C-H-H-C))
D> > -H(-C(N-H-H-H))
D> > -H(-O(-H-N))
D> > -C(C(CC-C)-H(-C)-H(-C)-C(C-C-N))
D> > -C(C(CC-C)-C(C-H-H-C)-N(-C-O))
D> > -C(N(CC-C)-H(-C)-H(-C)-H(-C))
D> > -N(-C(C-C-N)-O(-H-N))
D> > -O(-H(-O)-N(-C-O))
D> >
D> > </ot:value></ot:FeatureValue></ot:values></ot:DataEntry></ot:dataEntry></ot:Dataset>
D> > <ot:Feature rdf:about="https://ambit.uni-plovdiv.bg:8443/ambit2/feature/178539">
D> > <dc:creator>Default</dc:creator>
D> > <ot:hasSource>http://195.178.207.160/OpenTox/MNAGet</ot:hasSource>
D> > <owl:sameAs rdf:resource="http://www.opentox.org/api/1.1#MNA"></owl:sameAs>
D> > <ot:units></ot:units>
D> > <dc:title>MNA</dc:title>
D> > </ot:Feature></rdf:RDF>
D> >
D> >
D> > Nina
D> >
D> > On 13 December 2010 10:28, Nina Jeliazkova <jeliazkova.nina at gmail.com> wrote:
D> >
D> > > Dear Dmitry,
D> > >
D> > > On 3 December 2010 16:08, Druzhilovsky <dmitry.druzhilovsky at ibmc.msk.ru> wrote:
D> > >
D> > >> Dear Nina, All,
D> > >>
D> > >> We have finished the MakeMNA/MakeQNA services, which are available at:
D> > >>
D> > >> http://195.178.207.160/OpenTox/MakeMNA
D> > >> http://195.178.207.160/OpenTox/MakeQNA
D> > >>
D> > >> Could you check them and give us your comments? And how could we
D> > >> integrate our services into ToxCreate?
D> > >>
D> > >> Example POST:
D> > >>
D> > >> curl -X POST \
D> > >>   -d dataset_uri=https://ambit.uni-plovdiv.bg:8443/ambit2/dataset/2765 \
D> > >>   -d dataset_service=https://ambit.uni-plovdiv.bg:8443/ambit2/dataset \
D> > >>   http://195.178.207.160/OpenTox/MakeMNA
D> > >>
D> > >
D> > > curl -X POST \
D> > >   -d dataset_uri=https://ambit.uni-plovdiv.bg:8443/ambit2/dataset/2765 \
D> > >   -d dataset_service=https://ambit.uni-plovdiv.bg:8443/ambit2/dataset \
D> > >   http://195.178.207.160/OpenTox/MakeMNA
D> > > <?xml version="1.0" encoding="UTF-8"?>
D> > > <rdf:RDF
D> > >   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
D> > >   xmlns:ns0="http://www.opentox.org/api/1.1#"
D> > >   xmlns:ns1="http://purl.org/dc/elements/1.1/">
D> > >
D> > >   <rdf:Description rdf:about="http://www.opentox.org/api/1.1#DataSet">
D> > >     <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
D> > >   </rdf:Description>
D> > >
D> > >   <rdf:Description rdf:about="http://apps.ideaconsult.net:8080/ambit2/dataset/">
D> > >     <ns0:dataEntry rdf:nodeID="arc056fb1"/>
D> > >   </rdf:Description>
D> > >
D> > >   <rdf:Description rdf:nodeID="arc056fb1">
D> > >     <rdf:type rdf:resource="http://www.opentox.org/api/1.1#DataEntry"/>
D> > >     <ns0:compound rdf:resource="https://ambit.uni-plovdiv.bg:8443/ambit2/compound/163134/conformer/506294"/>
D> > >     <ns0:values rdf:nodeID="MNA1"/>
D> > >   </rdf:Description>
D> > >
D> > >   <rdf:Description rdf:nodeID="MNA1">
D> > >     <rdf:type rdf:resource="http://www.opentox.org/api/1.1#FeatureValue"/>
D> > >     <ns0:feature rdf:nodeID="Feature11"/>
D> > >     <ns0:value rdf:datatype="http://www.w3.org/2001/XMLSchema#string">HC
D> > > HO
D> > > CHHHN
D> > > CHHCC
D> > > CHCC
D> > > CHCN
D> > > CCCC
D> > > CCCN
D> > > NCCC
D> > > NCO
D> > > OHN
D> > > C(C(CCC)C(CC-H)N(CC-C))
D> > > C(C(CCC)C(CC-H)-H(C))
D> > > C(C(CCC)C(CN-H)-C(C-C-N))
D> > > C(C(CCN)C(CC-H)C(CC-C))
D> > > C(C(CCN)C(CC-H)-H(C))
D> > > C(C(CC-H)C(CC-H)-H(C))
D> > > C(C(CC-H)C(CC-H)-C(C-H-H-C))
D> > > C(C(CC-H)C(CC-C)-H(C))
D> > > C(C(CC-C)N(CC-C)-H(C))
D> > > N(C(CCN)C(CN-H)-C(N-H-H-H))
D> > > -H(C(CC-H))
D> > > -H(C(CN-H))
D> > > -H(-C(C-H-H-C))
D> > > -H(-C(N-H-H-H))
D> > > -H(-O(-H-N))
D> > > -C(C(CC-C)-H(-C)-H(-C)-C(C-C-N))
D> > > -C(C(CC-C)-C(C-H-H-C)-N(-C-O))
D> > > -C(N(CC-C)-H(-C)-H(-C)-H(-C))
D> > > -N(-C(C-C-N)-O(-H-N))
D> > > -O(-H(-O)-N(-C-O))
D> > >
D> > > </ns0:value>
D> > >   </rdf:Description>
D> > >
D> > >   <rdf:Description rdf:nodeID="Feature11">
D> > >     <rdf:type rdf:resource="http://www.opentox.org/api/1.1#Feature"/>
D> > >     <ns0:hasSource rdf:datatype="http://www.w3.org/2001/XMLSchema#string">http://195.178.207.160/OpenTox/MNAGet</ns0:hasSource>
D> > >     <ns1:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string">MNA</ns1:title>
D> > >   </rdf:Description>
D> > >
D> > > </rdf:RDF><rdf:RDF
D> > >     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
D> > >     xmlns:ot="http://www.opentox.org/api/1.1#"
D> > >     xmlns:bx="http://purl.org/net/nknouf/ns/bibtex#"
D> > >     xmlns:owl="http://www.w3.org/2002/07/owl#"
D> > >     xmlns:otee="http://www.opentox.org/echaEndpoints.owl#"
D> > >     xmlns:dc="http://purl.org/dc/elements/1.1/"
D> > >     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
D> > >     xmlns:ota="http://www.opentox.org/algorithmTypes.owl#"
D> > >     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
D> > >   <owl:Class rdf:about="http://www.opentox.org/api/1.1#Task"/>
D> > >   <owl:DatatypeProperty rdf:about="http://www.opentox.org/api/1.1#percentageCompleted"/>
D> > >   <owl:DatatypeProperty rdf:about="http://www.opentox.org/api/1.1#hasStatus"/>
D> > >   <ot:Task rdf:about="https://ambit.uni-plovdiv.bg:8443/ambit2/task/80a6cf85-a807-4347-8c94-9abc220bf039">
D> > >     <ot:percentageCompleted rdf:datatype="http://www.w3.org/2001/XMLSchema#float"
D> > >     >0.0</ot:percentageCompleted>
D> > >     <ot:hasStatus rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
D> > >     >Running</ot:hasStatus>
D> > >     <dc:date rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime"
D> > >     >1292228310135</dc:date>
D> > >     <dc:title rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
D> > >     >File import application/rdf+xml [1938]</dc:title>
D> > >   </ot:Task>
D> > >   <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/date"/>
D> > >   <owl:AnnotationProperty rdf:about="http://purl.org/dc/elements/1.1/title"/>
D> > > </rdf:RDF>
D> > >
D> > > curl -L -k -H "Accept:text/uri-list" "https://ambit.uni-plovdiv.bg:8443/ambit2/task/80a6cf85-a807-4347-8c94-9abc220bf039"
D> > > https://ambit.uni-plovdiv.bg:8443/ambit2/dataset/36400
D> > >
D> > > As can be seen, the URI of the dataset with MNA descriptors is
D> > > finally returned, so it works fine. However, the dataset
D> > > representation, the SMILES string, and anything else returned from
D> > > the first POST call that is not an ot:Task representation should
D> > > not be there.
D> > >
D> > > Another comment: the MNA/QNA descriptor calculation returns the
D> > > task URI only after all processing is completed and the results
D> > > have been sent to the dataset service. This means the HTTP POST
D> > > call may not complete for a long time if the dataset to be
D> > > processed contains more than a few compounds. The OpenTox API
D> > > recommends returning the task URI immediately after accepting the
D> > > request; the client then polls the task URI to find out whether
D> > > the task has completed.
D> > >
D> > > I have not tested how well the services work when multiple
D> > > requests are sent in parallel; it would be better if you agreed to
D> > > set up smokeping testing for this purpose.
D> > >
D> > >
D> > >> But, as you pointed out, if the MNA descriptors are still
D> > >> described as one string feature, nobody else will be able to make
D> > >> sense of them ... We suggest the following structure: the
D> > >> representation line contains only 1s and 0s, which denote the
D> > >> presence or absence of an MNA descriptor. For each structure we
D> > >> will generate a fixed number of MNA descriptors, for example 500,
D> > >> so each partner could use this string as independent variables
D> > >> (500 of them) for regression analysis.
D> > >> For the QNA representation, Chebyshev polynomials will be used,
D> > >> so each string will include 100 independent variables. Each
D> > >> variable is a Chebyshev polynomial value obtained from the QNA
D> > >> descriptors.
D> > >>
D> > >>
D> > > I guess the best way to handle custom formats for feature value
D> > > content (in addition to the standard string and number types) is
D> > > to propose specific MIME formats and document them on the OpenTox
D> > > site. There might be better ways; it would be good to discuss this
D> > > during today's meeting.
D> > >
D> > >
D> > >
D> > >> Also, what kind of data format do you use for uploading data for
D> > >> lazar regression?
D> > >>
D> > >
D> > > I hope Christoph can answer the ToxCreate- and lazar-related
D> > > questions.
D> > >
D> > > Best regards,
D> > > Nina
D> > >
D> > >
D> > >>
D> > >> Best regards
D> > >> Dmitry
D> > >>
D> > >> _______________________________________________
D> > >> Development mailing list
D> > >> Development at opentox.org
D> > >> http://www.opentox.org/mailman/listinfo/development
D> > >>
D> > >
D> > >



