Opened 4 years ago

Last modified 4 years ago

#4824 assigned enhancement

3-letter code lookup

Reported by: pett
Owned by: pett
Priority: moderate
Milestone:
Component: Input/Output
Version:
Keywords:
Cc: Tristan Croll
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

Given a ligand structure, find the PDB 3-letter name for it.

Email discussion:

Beginning to sound like heavy lifting. :-). Particularly the easy-to-not-notice-when-it-breaks hosting/fetching of the components.cif. Probably better would be to search on just heavy atoms, then fetch just those components and try the IDATM matching.

--Eric

On Jun 11, 2021, at 2:51 PM, Tristan Croll <tic20@…> wrote:

If you grab/host an up-to-date copy of components.cif, then the graph-matching approach I've been using could be quite useful. I've been just using the element masses for the nodes - but if the geometry of your search residue is decent and you load the ideal coordinates from components.cif as templates using chimerax.mmcif, then you could use idatm_type instead... which would probably allow you to get away with only using heavy atoms for most cases... although it still wouldn't be able to distinguish chiral enantiomers (NAG/NDG etc.). I believe RDKit has a pretty thorough implementation already built in, if you wanted to go that way.
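
To illustrate the graph-matching idea described above, here is a minimal sketch in plain Python. It is not ISOLDE's or ChimeraX's implementation: the function names `graphs_match` and `_adjacency` are hypothetical, atoms are plain label strings rather than ChimeraX atom objects, and element symbols stand in for the element masses mentioned above. The labels "Car" and "C3" are meant as IDATM-style types for aromatic and sp3 carbon. The matcher is a simple backtracking search, adequate for ligand-sized graphs only.

```python
def _adjacency(n, bonds):
    """Build a list of neighbor sets from (i, j) bond index pairs."""
    adj = [set() for _ in range(n)]
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def graphs_match(labels_a, bonds_a, labels_b, bonds_b):
    """Return True if two labeled molecular graphs are isomorphic.

    labels_*: list of node labels (element symbol, IDATM-style type, ...)
    bonds_*:  iterable of (i, j) atom index pairs
    """
    if sorted(labels_a) != sorted(labels_b):
        return False
    adj_a = _adjacency(len(labels_a), bonds_a)
    adj_b = _adjacency(len(labels_b), bonds_b)
    # Equal bond counts plus an injective bond-preserving map implies isomorphism.
    if sum(map(len, adj_a)) != sum(map(len, adj_b)):
        return False

    mapping = {}  # atom index in A -> atom index in B
    used = set()  # atom indices in B already assigned

    def extend(i):
        if i == len(labels_a):
            return True
        for j in range(len(labels_b)):
            if j in used or labels_a[i] != labels_b[j]:
                continue
            if len(adj_a[i]) != len(adj_b[j]):
                continue
            # Every already-mapped neighbor of i must map to a neighbor of j.
            if any(n in mapping and mapping[n] not in adj_b[j] for n in adj_a[i]):
                continue
            mapping[i] = j
            used.add(j)
            if extend(i + 1):
                return True
            del mapping[i]
            used.discard(j)
        return False

    return extend(0)


# Heavy atoms of a six-membered carbon ring (benzene or cyclohexane).
ring = [(i, (i + 1) % 6) for i in range(6)]

# With element labels only, the two rings are indistinguishable.
match_by_element = graphs_match(["C"] * 6, ring, ["C"] * 6, ring)

# With IDATM-style labels, aromatic and sp3 carbons no longer match.
match_by_type = graphs_match(["Car"] * 6, ring, ["C3"] * 6, ring)

# Atom ordering does not matter: a ring with one nitrogen, numbered two ways.
match_renumbered = graphs_match(
    ["N", "C", "C", "C", "C", "C"], ring,
    ["C", "C", "N", "C", "C", "C"], ring,
)
```

Because the graph carries no stereo information, a matcher of this kind would treat enantiomers such as NAG and NDG as identical, which is the limitation noted above.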

From: Eric Pettersen <pett@…>
Sent: 11 June 2021 22:06
To: Tristan Croll <tic20@…>
Subject: Re: Ligand look up

The heavy-atom-and-narrow-down is better than nothing I suppose.

--Eric

On Jun 11, 2021, at 1:59 PM, Tristan Croll <tic20@…> wrote:

Well, you can do searches by heavy-atom composition, and I suppose you could provide an interface for the user to narrow things down from there. I think the complete set of available searches is implemented at http://ligand-expo.rcsb.org/ld-search.html. Even internally, their structure-matching tools aren't great. Here's the response I received for my query when I deposited a re-refinement and the deposition interface refused to automatically match my ligand:

Thank you for the information and apologies for the issues you are having in ligand assignment. The assignment algorithm has difficulty finding exact matching for aromatic compounds as it only analyses the specific arrangement of single and double bonds in the structure. Please provide the additional information for this molecule - including the 'alternative' ID, which in this case should be the ID you have already provided. We will then address this during processing of your entry.

From: Eric Pettersen <pett@…>
Sent: 11 June 2021 21:34
To: Tristan Croll <tic20@…>
Subject: Re: Ligand look up

Ah, it's not just a heavy-atom search, you have to get the protonation state "correct" (i.e. neutral)? Yeah, that would be hard. I guess it makes sense since otherwise you can't distinguish benzene from cyclohexane.

--Eric
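
The benzene/cyclohexane point above can be made concrete with a small sketch. The helper name `heavy_atom_composition` is hypothetical and the atom lists are written by hand; this only shows why a heavy-atom-only search cannot separate the two compounds, and is not part of any ChimeraX or RCSB API.

```python
from collections import Counter


def heavy_atom_composition(elements):
    """Count element symbols, ignoring hydrogen (and deuterium)."""
    return Counter(e for e in elements if e.upper() not in ("H", "D"))


benzene = ["C"] * 6 + ["H"] * 6       # C6H6
cyclohexane = ["C"] * 6 + ["H"] * 12  # C6H12

# Heavy atoms alone: both are simply six carbons.
same_heavy = heavy_atom_composition(benzene) == heavy_atom_composition(cyclohexane)

# Including hydrogens: the compositions differ.
same_full = Counter(benzene) == Counter(cyclohexane)
```

Here `same_heavy` is true and `same_full` is false, so the hydrogens (and therefore the protonation state) are what tell the two apart.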

On Jun 11, 2021, at 1:30 PM, Tristan Croll <tic20@…> wrote:

Hi Eric,

Not really, no. In general that's an almost absurdly-difficult task. What name(s) are attached to a new compound is left pretty much up to the depositor, so searching by name is quite hit-and-miss. While there is a search-by-SMILES functionality, apparently it gets quite confused by aromatic rings where the arrangement of double and single bonds is arbitrary. Oh, and then there's the lack of any clear consistency on which amine and acid groups are protonated... lots of challenges there.

-- Tristan
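
The aromatic-ring problem mentioned above can be sketched as follows. This is an illustration only, not how any RCSB search tool works: `kekule_bonds` is a hypothetical helper, and the atom numbering is assumed fixed (as it effectively is when ring atoms are distinguished by their substituents).

```python
def kekule_bonds(first_double):
    """Bond orders around a six-membered ring with alternating double bonds.

    first_double: 0 or 1, the ring position where the first double bond starts.
    Returns a dict mapping frozenset({i, j}) -> bond order (1 or 2).
    """
    return {
        frozenset((i, (i + 1) % 6)): 2 if i % 2 == first_double else 1
        for i in range(6)
    }


form_a = kekule_bonds(0)  # double bonds at 0-1, 2-3, 4-5
form_b = kekule_bonds(1)  # double bonds at 1-2, 3-4, 5-0

# The same atoms are bonded in both drawings...
same_connectivity = form_a.keys() == form_b.keys()

# ...but an exact comparison of single/double bond placement fails.
same_bond_orders = form_a == form_b
```

Both drawings describe the same aromatic ring, yet a matcher that compares the exact placement of single and double bonds would see them as different.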

From: Eric Pettersen <pett@…>
Sent: 11 June 2021 21:05
To: Tristan Croll <tic20@…>
Subject: Ligand look up

Hi Tristan,

Does ISOLDE have any builtin facility for determining if a ligand has an existing code in the Chemical Component Dictionary?

--Eric

Change History (1)

comment:1 by pett, 4 years ago

Type: defect → enhancement