[Chimera-users] Slow dealing with pdb files

Eric Pettersen pett at cgl.ucsf.edu
Wed Jan 2 17:00:40 PST 2008


On Jan 1, 2008, at 1:44 PM, Francesco Pietra wrote:

> I am dealing with the average structure (a protein complex embedded  
> in a POCP
> membrane and water solvated) derived with Amber's ptraj from a 1.5  
> ns MD.
>
> Opening this pdb file in 1.2470 Chimera has become extremely slow.  
> The file is
> 6.4MB. First, below the screen it is warned "Ignored bad PDB record  
> found on
> line #", for lines from 1 to 114154. This may take some 10 minutes.

These warnings come from the water ATOM records in which the atom
serial number and/or residue number is "****" (what FORTRAN writes
when a number won't fit inside its field width).
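
If you want to see how many records are affected before opening the
file, a rough sketch along these lines should work. It assumes the
standard PDB fixed-column layout (serial number in columns 7-11,
residue number in columns 23-26); the function names are made up for
illustration and this is not part of Chimera.

```python
def has_overflowed_field(line):
    """True if an ATOM/HETATM record has '*' in its serial-number
    or residue-number columns (assumed standard PDB layout)."""
    if not line.startswith(("ATOM", "HETATM")):
        return False
    serial = line[6:11]     # columns 7-11: atom serial number
    res_seq = line[22:26]   # columns 23-26: residue sequence number
    return "*" in serial or "*" in res_seq


def count_overflowed(lines):
    """Count the records whose numeric fields overflowed."""
    return sum(1 for line in lines if has_overflowed_field(line))
```

Calling count_overflowed(open(path)) on your file (path being
whatever your PDB file is called) gives the number of records that
would trigger the warning.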

> After that, the warning message changes to "Computed secondary  
> structure
> assignments (see reply log)" which lasts for longer than 1 hour and  
> 20 minutes.
> During this time, "top" command shows that python is using 12% MEM  
> and 99% CPU.

Because this is an "average" structure, Chimera's estimate of the
connectivity is bad for many parts of the structure -- particularly
the POP residues in the membrane.  This creates a rat's nest of
intra-residue bonds that the ring-finding algorithm (designed for
"reasonable" structures) takes a long time to work through.  Normally
Chimera wouldn't run ring-finding as a structure opens, but due to
some unusual naming of hydrogens in the POP residues (e.g. RH16), it
assigns some of the hydrogens to other elements (e.g. rhodium, as per
PDB atom-naming rules).  Since rhodium is a metal, Chimera wants to
depict it as a sphere, which means it needs to know the radius, which
in turn depends on the atom type, and determining the atom type
requires finding rings...
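
To illustrate why a name like RH16 gets read as rhodium, here is a
deliberately naive sketch of a column-based element guess.  It is not
Chimera's actual code; the function name and the small element table
are assumptions for illustration.  The idea is that a two-letter
element symbol starts in the first column of the 4-character atom-name
field, whereas a one-letter element normally leaves that column blank.

```python
# Small illustrative subset of two-letter element symbols (assumption,
# not a complete table).
TWO_LETTER_ELEMENTS = {"RH", "FE", "ZN", "MG", "CA", "NA", "CL"}


def guess_element(name_field):
    """Naive element guess from the 4-character PDB atom-name field."""
    field = name_field.ljust(4)[:4]
    two = field[:2].upper()
    # Name starts in the first column with two letters: treat those
    # letters as a two-letter element symbol if it is a known one.
    if two.isalpha() and two in TWO_LETTER_ELEMENTS:
        return two.capitalize()
    # Otherwise fall back to the first alphabetic character.
    for ch in field:
        if ch.isalpha():
            return ch.upper()
    return ""
```

With this rule, "RH16" comes out as Rh, while " H16" (leading blank)
comes out as H.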

> Then, the graphics appears, with the membrane-protein-complex not  
> centered in
> the water box.

This is due to the "****" waters being ignored.

> I could then carry out rapid mapping of the protein residues around  
> the
> single-residue ligand (select protein & :ligandname z<#), which was  
> what I
> wanted to do.

If you only care about the protein and ligand in your analysis, you  
should just edit your file to strip the waters and lipids.  When I  
did this with the file you sent it only took moments to open.
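
If you'd rather script the edit than do it by hand, something like
the following sketch should do.  The residue names are assumptions
(WAT and HOH for water, POP for the lipid, going by the names in your
file); adjust them to match what is actually in columns 18-20.  The
function names are made up for illustration.

```python
def strip_residues(lines, drop_names=("WAT", "HOH", "POP")):
    """Return the lines with ATOM/HETATM records of the named
    residues removed; all other lines are kept unchanged."""
    drop = set(drop_names)
    kept = []
    for line in lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        # Columns 18-20 hold the residue name in the PDB layout.
        if is_atom and line[17:20].strip() in drop:
            continue
        kept.append(line)
    return kept


def strip_pdb_file(in_path, out_path, drop_names=("WAT", "HOH", "POP")):
    """Read in_path, drop the named residues, write out_path."""
    with open(in_path) as src:
        kept = strip_residues(src.readlines(), drop_names)
    with open(out_path, "w") as dst:
        dst.writelines(kept)
```

For example, strip_pdb_file("average.pdb", "protein_ligand.pdb"),
where both file names are placeholders.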

--Eric

                         Eric Pettersen
                         UCSF Computer Graphics Lab
                         http://www.cgl.ucsf.edu
