[chimera-dev] PDB file parser failing to identify chain IDs
Eric Pettersen
pett at cgl.ucsf.edu
Mon Jul 20 10:37:38 PDT 2009
Hi Sébastien,
You don't precisely say why you need to know if two residues are in
the same chain, so I'm going to describe two solutions. If they
aren't sufficient for your needs then let me know why and I can offer
further suggestions. Anyway:
solution 1) Molecules have a 'sequences()' member function that
returns a list of Sequences. These Sequences are the multi-residue
chains (i.e. there is no Sequence containing waters). Each Sequence
has a 'residues' attribute which is the list of Residues in the
chain. So you can easily run through the residues of a chain this way.
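A minimal sketch of solution 1, relying only on the 'sequences()' call and the 'residues' attribute described above. The helper names are made up for illustration, and the check for None entries is a defensive assumption about gaps in a Sequence:

```python
def chain_index_by_residue(molecule):
    """Map each residue to the index of the Sequence (chain) that holds it.

    Residues that belong to no Sequence (e.g. waters) are simply absent.
    """
    index = {}
    for i, seq in enumerate(molecule.sequences()):
        for res in seq.residues:
            # Assumption: a gap in the chain may appear as None; skip it.
            if res is not None:
                index[res] = i
    return index


def same_sequence_chain(res1, res2, index):
    """True if both residues were found in the same Sequence."""
    i1 = index.get(res1)
    return i1 is not None and i1 == index.get(res2)
```

Build the index once per molecule and reuse it, e.g. "index = chain_index_by_residue(mol)" followed by "same_sequence_chain(r1, r2, index)".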
solution 2) For purposes of drawing the graphics, each connected
chain of atoms has a unique "root" atom that drawing commences from.
Therefore if "a1, a2 = res1.atoms[0], res2.atoms[0]" then res1 is in
the same connected chain as res2 if "res1.molecule.rootForAtom(a1,
True) == res2.molecule.rootForAtom(a2, True)". The second argument to
rootForAtom() controls whether the chain is considered connected
across active bond rotations.
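The test in solution 2 can be wrapped in a small helper. This is only a sketch: the function name and the 'across_rotations' keyword are made up, and the value is passed straight through as the second argument of rootForAtom(), as in the expression above:

```python
def in_same_connected_chain(res1, res2, across_rotations=True):
    """True if the two residues share the same "root" atom.

    across_rotations is forwarded unchanged as the second argument of
    rootForAtom(); True matches the expression given in the text.
    """
    # Any atom of a residue will do; take the first one of each.
    a1, a2 = res1.atoms[0], res2.atoms[0]
    root1 = res1.molecule.rootForAtom(a1, across_rotations)
    root2 = res2.molecule.rootForAtom(a2, across_rotations)
    return root1 == root2
```

Note that this looks at one atom per residue, so it assumes each residue is itself a single connected piece.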
bonus solution :-) ModelPanel.base has a getPhysicalChains function
that returns a list of lists. Each list holds the residues of one
physically connected chain, except the first list, which collects the
singleton residues.
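If the goal is to give each residue a usable chain label, the list of lists can be turned into a lookup table. A rough sketch: it works on the returned list of lists only, the helper name is made up, and the call shown in the comment (passing the Molecule) is an assumption about the signature rather than something stated above:

```python
import string


def label_physical_chains(chains):
    """Map residue -> label ('A', 'B', ...) for each physical chain.

    chains is the list of lists described above; chains[0] (the singleton
    residues) is skipped, so those residues get no label.
    """
    letters = string.ascii_uppercase
    labels = {}
    for i, residues in enumerate(chains[1:]):
        # Fall back to a numeric label once the alphabet runs out.
        label = letters[i] if i < len(letters) else str(i)
        for res in residues:
            labels[res] = label
    return labels

# Hypothetical usage inside Chimera (call signature assumed):
#   from ModelPanel.base import getPhysicalChains
#   labels = label_physical_chains(getPhysicalChains(mol))
```

The labels live in a separate dictionary, so the residues themselves are left untouched.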
--Eric
On Jul 20, 2009, at 6:30 AM, Sébastien Cuendet wrote:
> Hi,
>
> I'm working with PDB files, opening them with:
>
> chimera.openModels.open(fileHandler, 'PDB').
>
> Later on I use the chain ID of the residues of the molecule
> (accessed by residue.id.chainId). The problem is that the PDB format
> does not require the chain ID to be defined. When it is missing,
> all residues have ID ' ', which makes it hard to tell which chain
> each residue belongs to.
>
> Now, bringing up problems only is never really appreciated ;), so my
> colleague Aurelien Grosdidier and I tried to think of some solutions
> to split the residues into chains when loading a model from a PDB
> file that does not contain any chain ID.
>
> 1) Split on TER. Chains are usually separated by a TER line, which
> could thus be used to determine when to create a new chain. However,
> like the chain id information, the TER line is optional in the PDB
> format...
>
> 2) Compute an atom-to-atom distance matrix for each pair of
> consecutive residues. Since the typical distance between two bonded
> atoms is known, we can decide whether two residues reasonably belong
> to the same chain. The computation can be cut short as soon as two
> residues are found to be in the same chain, which is the most
> frequent case.
>
> 3) Compute an atom-to-atom distance matrix and use it to determine
> which atoms belong to the same chain. This is computationally more
> expensive than 2), but it must be done somewhere in Chimera to
> compute the bonds between the atoms that are rendered graphically.
> Any chance we can access this information (I did not find how)?
>
> Any comment on those solutions? Any other solutions? Any help will
> be appreciated!
>
> Thanks,
>
> Sebastien