[chimera-dev] PDB file parser failing to identify chain IDs

Tue Jul 21 07:47:23 PDT 2009

Eric,

Thanks for the answer. I was aware of the 'sequences' method but I needed
singleton residues as well, which is the reason why I tried implementing my
own chain detection. So I went with your solution 2 using the
rootForAtom(atom, True) method and it did the trick!
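
For the archive, here is a minimal sketch of that grouping. It assumes a Molecule-like object with a `residues` list and the `rootForAtom(atom, True)` call described in Eric's reply below. The helper name `group_residues_by_root` is made up for illustration, and the code has not been checked against a particular Chimera release.

```python
def group_residues_by_root(molecule):
    """Group residues by the root atom of their connected chain.

    Residues that are not bonded to any other residue (waters, ions, ...)
    come back as groups of length one, so singletons are kept.
    Assumes every residue has at least one atom.
    """
    groups = []  # list of (root_atom, [residues]) in first-seen order
    for res in molecule.residues:
        root = molecule.rootForAtom(res.atoms[0], True)
        for known_root, members in groups:
            if known_root == root:      # same comparison as in Eric's reply
                members.append(res)
                break
        else:
            groups.append((root, [res]))
    return [members for _, members in groups]
```

The linear scan over known roots only relies on `==` between atoms, which is what the original suggestion uses; a dict keyed on the root atom would be faster if atoms are hashable in your build.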

Thanks again,

Sebastien

2009/7/20 Eric Pettersen <pett at cgl.ucsf.edu>

> Hi Sébastien,
>
> You don't say precisely why you need to know whether two residues
> are in the same chain, so I'm going to describe two solutions.  If they
> aren't sufficient for your needs, let me know why and I can offer
> further suggestions.  Anyway:
>
> solution 1)  Molecules have a 'sequences()' member function that returns a
> list of Sequences.  These Sequences are the multi-residue chains (i.e. there
> is no Sequence containing waters).  Each Sequence has a 'residues'
> attribute, which is the list of Residues in the chain.  So you can easily run
> through the residues of a chain this way.
>
> solution 2)  For purposes of drawing the graphics, each connected chain of
> atoms has a unique "root" atom that drawing commences from.  Therefore if
> "a1, a2 = res1.atoms[0], res2.atoms[0]" then res1 is in the same connected
> chain as res2 if "res1.molecule.rootForAtom(a1, True) ==
> res2.molecule.rootForAtom(a2, True)".  The second argument to rootForAtom()
> controls whether the chain is considered connected across active bond
> rotations.
>
> bonus solution :-)  ModelPanel.base has a getPhysicalChains function that
> returns a list of lists.  Each inner list holds the residues of one physically
> connected chain, except the first list, which holds the singleton residues.
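
Solutions 1 and 2 can be sketched as two small helpers. This is only an illustration built from the names mentioned in the text (`sequences()`, `residues`, `atoms`, `molecule`, `rootForAtom`); the function names `residues_per_sequence` and `same_connected_chain` are hypothetical, and the code assumes every residue has at least one atom.

```python
def residues_per_sequence(molecule):
    """Solution 1: one list of residues per multi-residue chain.

    Singleton residues such as waters do not appear, because no
    Sequence contains them.
    """
    return [list(seq.residues) for seq in molecule.sequences()]


def same_connected_chain(res1, res2):
    """Solution 2: True if both residues share the same root atom."""
    a1, a2 = res1.atoms[0], res2.atoms[0]
    # True: treat the chain as connected across active bond rotations
    return (res1.molecule.rootForAtom(a1, True) ==
            res2.molecule.rootForAtom(a2, True))
```

The first helper is enough for walking the polymer chains; the second also works for residues that belong to no Sequence.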
>
> --Eric
>
> On Jul 20, 2009, at 6:30 AM, Sébastien Cuendet wrote:
>
> Hi,
>
> I'm working with PDB files, opening them with:
>
> chimera.openModels.open(fileHandler, 'PDB').
>
> Later on I use the chain ID of the molecule's residues (accessed by
> residue.id.chainId). The problem is that the PDB format does not require
> the chain ID to be defined. When it is missing, all residues have the ID
> ' ', which makes it hard to tell which chain each residue belongs to.
>
> Now, bringing up problems only is never really appreciated ;), so my
> colleague Aurelien Grosdidier and I tried to think of some solutions to
> split the residues into chains when loading a model from a PDB file that
> does not contain any chain ID.
>
> 1) Split on TER. Chains are usually separated by a TER line, which could
> thus be used to determine when to create a new chain. However, like the
> chain id information, the TER line is optional in the PDB format...
>
> 2) Compute an atom-to-atom distance matrix for each pair of consecutive
> residues. Since the typical distance between two bonded atoms is known, we
> can decide whether two residues reasonably belong to the same chain. The
> computation can easily be pruned when two residues belong to the same
> chain, which is the most frequent case.
>
> 3) Compute an atom-to-atom distance matrix and use it to determine which
> atoms belong to the same chain. This is computationally more expensive than
> 2), but it must be done somewhere in Chimera to compute the bonds between
> the atoms that are rendered graphically. Any chance we can access this
> information (I could not find out how)?
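
Ideas 1) and 2) can be combined in a standalone sketch that works on the raw PDB text, independent of Chimera. It starts a new chain after every TER record, or when no atom of a residue lies within a cutoff of any atom of the previous residue. The function names and the 2.0 angstrom default cutoff are assumptions made for illustration (a peptide C-N bond is about 1.33 angstroms); the cutoff would need tuning, for instance if hydrogens are present.

```python
import math


def parse_residues(pdb_lines):
    """Return [(key, coords, after_ter)] in file order, using PDB fixed columns."""
    residues = []
    after_ter = False
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "TER":
            after_ter = True
        elif record in ("ATOM", "HETATM"):
            # resName (cols 18-20), resSeq + insertion code (cols 23-27)
            key = (line[17:20].strip(), line[22:27].strip())
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            if residues and residues[-1][0] == key and not after_ter:
                residues[-1][1].append(xyz)   # another atom of the same residue
            else:
                residues.append((key, [xyz], after_ter))
                after_ter = False
    return residues


def min_distance(coords_a, coords_b):
    """Smallest atom-to-atom distance between two residues."""
    return min(math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
               for a in coords_a for b in coords_b)


def split_into_chains(pdb_lines, cutoff=2.0):
    """Group residue keys into chains; cutoff (angstroms) is an assumed value."""
    chains = []
    previous = None   # coordinates of the previous residue
    for key, coords, after_ter in parse_residues(pdb_lines):
        bonded = (previous is not None and not after_ter
                  and min_distance(previous, coords) < cutoff)
        if bonded:
            chains[-1].append(key)
        else:
            chains.append([key])
        previous = coords
    return chains
```

Only consecutive residues are compared, so the cost grows linearly with the number of residues, at the price of missing connections between residues that are not adjacent in the file.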
>
> Any comment on those solutions? Any other solutions? Any help will be
> appreciated!
>
> Thanks,
>
> Sebastien
> _______________________________________________
> Chimera-dev mailing list
> Chimera-dev at cgl.ucsf.edu
> http://www.cgl.ucsf.edu/mailman/listinfo/chimera-dev