[Chimera-users] 16,000 molecules slow

Wed Sep 13 15:09:45 PDT 2006

Email exchange with Brittany Morgan about slow ball-and-stick display
when displaying 16,000 molecules.

----

From: "Brittany Morgan" <brmorgan at clarku.edu>
To: "'Thomas Goddard'" <goddard at cgl.ucsf.edu>
Subject: RE: [Chimera-users] Memory allocation
Date: Tue, 12 Sep 2006 20:52:39 -0400

  I am working with molecules, and Chimera has problems redrawing them in
ball-and-stick representation, although I expect that at larger system sizes
other functions will start slowing down as well. If I look at the memory
usage while it is redrawing them, Chimera seems to be using only about 1/4
of the available memory, but creating a paging file nonetheless. It will
work, it just takes about 15 minutes to redraw about 64,000 atoms, and I
need to go to system sizes much larger. If I need more memory, that is
possible, I just wasn't sure if that was the issue because it didn't appear
that Chimera was using all of the available memory. I am using Chimera in a
non-standard way (I'm not displaying proteins, but rather many small
(unrealistic) molecule-type objects).

Thank you for your help!
Brittany

-----

From: Thomas Goddard [mailto:goddard at cgl.ucsf.edu] 
Sent: Tuesday, September 12, 2006 9:22 PM
To: brmorgan at clarku.edu
Cc: chimera-users at cgl.ucsf.edu
Subject: Re: [Chimera-users] Memory allocation

Hi Brittany,

  Thanks for the detailed information.  I do not think memory is the
problem.  Display of 58,000 atoms (pdb 1uf2) in ball and stick took
10 seconds on my 4 year old laptop.  That is the time to make the
initial drawing.  Then it updates at about 2 frames per second when
rotating the model on this rather old machine.

  Your problem is likely caused by one of two issues.  First, if your
"molecule-type objects" contain many (tens or hundreds) small rings
within a residue then Chimera will bog down trying to compute the
chemical atom types (aromaticity...).  This is a problem we have
recently analyzed when using Chimera with molecules that do not have
reasonable chemical structure.  We plan on fixing this problem by
being able to tag molecules in Chimera as "non-chemical".  But we do
not have a version of Chimera available with that feature yet.  So
currently the only solution is to modify your non-molecule model using
multiple residues to avoid having many rings in a single residue.  The
ring computation is not done in wire display style but is invoked when
switching to ball and stick style because atom radii are needed and
are based on the chemical types.

  The second possible problem is that you are not using hardware
accelerated 3D graphics.  On Linux systems you usually need to install
a graphics driver to get hardware acceleration.  Windows and Mac
systems usually already have the appropriate driver.  Rendering
computations can be 100 times slower if it is done in software instead
of by the graphics chip.  Ball and stick display style is much more
time consuming to render than wire frame.  If you are able to rotate
the 64,000 atom ball and stick model after it is first displayed (15
minutes) and it updates reasonably fast (> 1 frame per second) then
you have hardware acceleration and this is not the problem.

	Tom

-----

From: "Brittany Morgan" <brmorgan at clarku.edu>
To: "'Thomas Goddard'" <goddard at cgl.ucsf.edu>
Subject: RE: [Chimera-users] Memory allocation
Date: Tue, 12 Sep 2006 21:34:09 -0400

Hi Tom,

  I can rotate it after it is first displayed with no difficulty, so I would
think that the problem is likely the first. My toy model molecules consist
of 4 "atoms" in a linear configuration, so these are very small molecules,
and I wouldn't have any rings within a residue, if I understand this
correctly. However, would the fact that I have so many molecules be an issue
(if I have 64,000 atoms, I have 16,000 molecules)? I can tell you more
explicitly what I am attempting to display (and how) if that would be of
use. Since I guessed, more or less, on how to output the data to get it to
display what I want, it is likely that I am doing this in a less than
optimal way.

Brittany

--------

Date: Tue, 12 Sep 2006 18:51:58 -0700 (PDT)
From: Thomas Goddard <goddard at cgl.ucsf.edu>
To: brmorgan at clarku.edu

Hi Brittany,

  Ok it's not rings.  Very likely something is slow when dealing with
your large number of molecules.  The typical Chimera use involves only
a handful of molecules; twenty or thirty is considered a lot, and 16,000
may be a record!

  Another Chimera developer Eric Pettersen will know more about efficiency
with large numbers of molecules.  (I work primarily on volume data.)
So I've sent this email to him and he may have advice.

  If it is possible I think you would be better off making your 4 atom
chains be a single residue with all chains in a single molecule.  Chimera
is likely to handle that much better.  An immediate problem is that if
you are using the PDB file format its residue number field is only 4 digits
(max 9999 residues).  It is awkward but you could use multiple PDB chain
identifiers (single letter a-z, A-Z, 0-9) to handle up to ~600,000 chains.
Another approach is to put all the chains in a single residue.  But a
residue with 64,000 atoms may also cause poor Chimera performance.
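
As a rough illustration of the chain identifier workaround, the sketch
below maps a running molecule index onto a (chain ID, residue number)
pair.  The function name chain_and_resnum and the ordering of the chain
identifiers are assumptions made for this example, not anything Chimera
requires.

```python
import string

# Single-character chain identifiers: a-z, A-Z, 0-9 (62 in total).
CHAIN_IDS = string.ascii_lowercase + string.ascii_uppercase + string.digits
MAX_RESNUM = 9999  # the PDB residue number field is 4 digits wide


def chain_and_resnum(index):
    """Map a 0-based molecule index to a (chain ID, residue number) pair."""
    chain, offset = divmod(index, MAX_RESNUM)
    if chain >= len(CHAIN_IDS):
        raise ValueError("index %d needs more than %d chain IDs"
                         % (index, len(CHAIN_IDS)))
    return CHAIN_IDS[chain], offset + 1


# Total number of residues this numbering scheme can label.
capacity = len(CHAIN_IDS) * MAX_RESNUM
```

With 62 identifiers and 9999 residue numbers each, this scheme labels
619,938 residues, which is where the ~600,000 figure comes from.  The
last of 16,000 molecules (index 15999) lands in chain "b" as residue 6001.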

  I'll also warn you that Chimera is not as good as other molecular
graphics programs (PyMol and VMD) at handling systems larger than 100,000
atoms.  Chimera uses much more memory (about 3 Kbytes per atom).  PyMol
and VMD I am told are 5-10 times more memory efficient on a per-atom basis.
Chimera may be better for doing custom types of analysis though.
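
To put the per-atom figure in perspective, here is a back-of-the-envelope
estimate for a 100,000 atom system.  It assumes memory scales linearly
with atom count at about 3 Kbytes per atom, which is only approximate;
the function name estimated_mbytes is made up for this example, and
measured usage may differ substantially.

```python
KBYTES_PER_ATOM = 3.0  # approximate per-atom figure quoted for Chimera


def estimated_mbytes(n_atoms, kbytes_per_atom=KBYTES_PER_ATOM):
    """Rough memory estimate in Mbytes (1 Mbyte = 1024 Kbytes)."""
    return n_atoms * kbytes_per_atom / 1024.0


chimera_mb = estimated_mbytes(100000)

# If another program is 5-10 times more memory efficient per atom:
other_low_mb = chimera_mb / 10
other_high_mb = chimera_mb / 5
```

Under these assumptions Chimera would need roughly 293 Mbytes for
100,000 atoms, against roughly 29-59 Mbytes for a program that is 5-10
times more efficient.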

	Tom

Is it OK if I forward your last email to the chimera-users mailing list?
That may help others with similar problems.

-----

From: "Brittany Morgan" <brmorgan at clarku.edu>
To: "'Thomas Goddard'" <goddard at cgl.ucsf.edu>
Subject: RE: [Chimera-users] Memory allocation
Date: Wed, 13 Sep 2006 13:57:36 -0400

Hi Tom,

Feel free to forward anything that you think would be of use to others.

Unfortunately, 16,000 is only a moderately large system for me. I'm looking
to be able to get up to around 112,500 atoms and a little under 30,000
molecules. I only need the visualization aspects of Chimera, all analysis I
do on my own. Do you think PyMol would be able to handle something like
that? I can try making them all a single molecule, but I also suspect that
having a chain with 112,500 atoms would cause problems.

Thanks again,
Brittany

----

Date: Wed, 13 Sep 2006 11:07:59 -0700 (PDT)
From: Thomas Goddard <goddard at cgl.ucsf.edu>
To: brmorgan at clarku.edu

Hi Brittany,

  There are two difficulties you are likely to run into: PDB file format
limits on number of atoms (100,000) and number of residues (10,000),
and poor software performance for such large systems.  If you are
primarily concerned with visualization rather than analysis then PyMOL
I believe will give better performance.  But you may have a hard time
with the file format issues.  I think you will get better support
solving the file format problems with Chimera since we have 5 developers
while PyMOL has just one.

  Having 100,000 atoms in a single PDB chain is not a problem.  I believe
also that Chimera can handle a modification of PDB file format where the
atom number starts in column 6 instead of column 7 allowing up to 1,000,000
atoms.  Eric would know.

	Tom

----

From: Eric Pettersen <pett at cgl.ucsf.edu>
To: brmorgan at clarku.edu
Date: Wed, 13 Sep 2006 12:26:30 -0700

Hi Brittany,
	I confess I find your problem somewhat mysterious.  I wrote a  
Chimera Python script (attached) that can either generate a single  
molecule with 16000 4-atom residues or 16000 4-atom molecules  
(depending on whether the 'singleMol' variable is True or False).   
While Chimera is better with the single-molecule system (on my Mac,  
taking about 5 seconds each to run the script and to change to ball- 
and-stick), it isn't that much worse using the 16000-molecule system  
(about 20 seconds each -- not nearly 15 minutes!).  It also took  
about twice as much memory (~1GB vs ~480 MB; this was using the  
1.2255 snapshot).  You can try it out yourself by using File..Open to  
open the script and changing the "singleMol = False" line in the  
script beforehand to choose modes.
	You should check that Chimera is using your graphics card rather  
than software rendering.  One way to do that is to use Help->Report A  
Bug.. and click the Next button in the resulting dialog.  The ensuing  
panel will have a "Bug Description" field that reports your OpenGL  
vendor and renderer.  What does it say?  If it says something  
involving "Mesa" then you are using software rendering and should try  
to update the driver for your card by visiting your card  
manufacturer's web site.  You can click "Cancel" in the bug dialog to  
dismiss it.
	As Tom mentioned, we allow a modified PDB format where the sixth  
column of an ATOM record can be a digit, effectively allowing 1  
million atoms.  There are several gotchas in your situation.  Whereas  
most such large systems are composed of standard residues with known  
connectivity, you may well need to specify your connectivity -- and  
CONECT records are strictly limited to 5-digit atom serial numbers.   
If your molecules do have typical bonding distances, you can let  
Chimera figure out the connectivity, but you would need to put each  
molecule in its own residue to prevent the N-squared intra-residue  
connectivity search from getting out of hand (so you would likely  
need to use chain IDs once you're past 9999 residues), and you would  
need to insert TER cards between residues to prevent the implied  
connectivity between residues (may not actually be necessary in the  
1.2255 snapshot).
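
A minimal sketch of the file layout described above: one residue per
molecule, chain IDs once the residue numbers run out, a TER card after
each residue, and an atom serial number that is allowed to spill into
column 6.  The function names, the grid layout, the 1.5 angstrom spacing
along each molecule, and the bare "TER" lines are all assumptions made
for illustration; whether a given Chimera version accepts every detail
should be checked against a small test file first.

```python
import string

CHAIN_IDS = string.ascii_uppercase + string.ascii_lowercase + string.digits
MAX_RESNUM = 9999  # 4-digit residue number field


def atom_line(serial, name, resname, chain, resnum, x, y, z, element):
    """Format one ATOM record; the serial may use column 6 (6 digits)."""
    if not 0 < serial <= 999999:
        raise ValueError("serial number does not fit in 6 columns")
    # "ATOM " is 5 characters, so %6d fills columns 6-11.  For serials
    # below 100,000 the result is identical to a standard ATOM record.
    return ("ATOM %6d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            "          %2s"
            % (serial, name, resname, chain, resnum,
               x, y, z, 1.0, 0.0, element))


def pdb_lines(n_mols, atoms_per_mol=4, bond_length=1.5, spacing=4.0):
    """Return PDB lines for n_mols linear molecules, one residue each."""
    lines = []
    serial = 0
    for mi in range(n_mols):
        # Move to the next chain ID after every 9999 residues.
        chain_index, offset = divmod(mi, MAX_RESNUM)
        chain = CHAIN_IDS[chain_index]
        # Lay molecules out on a grid so coordinates fit the 8.3f fields.
        x0 = (mi % 100) * spacing
        z0 = (mi // 100) * spacing
        for ai in range(atoms_per_mol):
            serial += 1
            lines.append(atom_line(serial, " C%d" % (ai + 1), "UNK",
                                   chain, offset + 1,
                                   x0, ai * bond_length, z0, "C"))
        # TER after each residue to block implied inter-residue bonds.
        lines.append("TER")
    lines.append("END")
    return lines
```

Joining the result of pdb_lines(16000) with newlines and writing it to a
file gives 64,000 ATOM records.  Keeping each molecule in its own residue
also keeps the pairwise search small: 4 atoms give 6 pairs per residue,
or 96,000 pairs over 16,000 residues, whereas 64,000 atoms in a single
residue would give 2,047,968,000 pairs.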

--Eric

File manyMols.py follows

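import chimera

# True builds one molecule with 16000 4-atom residues;
# False builds 16000 separate 4-atom molecules.
singleMol = False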
if singleMol:
	m = chimera.Molecule()
	m.name = "test"
	mols = [m]
	for ri in range(16000):
		r = m.newResidue("UNK", " ", ri+1, " ")
		prev = None
		for ai in range(4):
			a = m.newAtom("C%d" % (ai+1), chimera.Element("C"))
			r.addAtom(a)
			if prev:
				m.newBond(a, prev)
			crd = chimera.Coord(ri, ai, 0.0)
			a.setCoord(crd)
			prev = a
else:
	mols = []
	for mi in range(16000):
		m = chimera.Molecule()
		m.name = "test%d" % mi
		mols.append(m)
		r = m.newResidue("UNK", " ", 1, " ")
		prev = None
		for ai in range(4):
			a = m.newAtom("C%d" % (ai+1), chimera.Element("C"))
			r.addAtom(a)
			if prev:
				m.newBond(a, prev)
			crd = chimera.Coord(mi, ai, 0.0)
			a.setCoord(crd)
			prev = a
chimera.openModels.add(mols)