[chimera-dev] [Chimera-users] distance measurements
Repic Matej
matej.repic at epfl.ch
Wed Aug 19 18:37:11 PDT 2015
Hi Feixia,
While findclash in Chimera will do the job, I believe you need a more
lightweight tool suited to high-throughput work. Chimera has a lot of
overhead when parsing a structure, and its Python backend is not very
efficient for the large number of calculations your case requires.
The PDB has ~110,000 structures, and even at 1 s per structure it would
still take about 30 hours to loop through all of them. Therefore, I
suggest using the right tool for the job, which in this case is the
"ncont" tool from the CCP4 crystallographic suite, available for free
from http://www.ccp4.ac.uk/download/
For my test set of 20 PDB files it was about 15 times faster than Chimera.
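The arithmetic behind that estimate can be sketched in a few lines of Python. The 110,000 count and the 1 s per structure are the rough figures quoted above; carrying the 15-fold speedup from the 20-structure test over to the whole archive is an assumption, and estimate_hours is only an illustrative helper name.

```python
# Rough wall-clock estimate for a serial scan of the whole archive.
# Both constants are the approximate figures from the message, not measurements.
N_STRUCTURES = 110000        # approximate number of PDB entries
SECONDS_PER_STRUCTURE = 1.0  # assumed parse time per structure


def estimate_hours(n_structures, seconds_each, speedup=1.0):
    """Return the total run time in hours for a serial loop."""
    return n_structures * seconds_each / speedup / 3600.0


if __name__ == "__main__":
    base = estimate_hours(N_STRUCTURES, SECONDS_PER_STRUCTURE)
    # Assumption: the 15x speedup seen on 20 files holds for all files.
    fast = estimate_hours(N_STRUCTURES, SECONDS_PER_STRUCTURE, speedup=15.0)
    print("serial estimate: %.1f h (%.2f days)" % (base, base / 24.0))
    print("with a 15x speedup: %.1f h" % fast)
```

The point is only the order of magnitude: a day or more for the slow route versus a couple of hours if the speedup carries over.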
An ncont input file for checking all :LYS@CA against all :LYS@CA would look
like:
------ncont.inp-------
source //(LYS)/CA
target //(LYS)/CA
maxdist 1000
sort distance inc
----------------------
You can run this file with the command:
ncont xyzin structure.pdb < ncont.inp > ncont.out
If you want to run it on every pdb file in the folder just run a loop:
for f in *.pdb; do ncont xyzin "$f" < ncont.inp > "$f".dist; echo "$f finished"; done
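If you would rather drive the same loop from Python, for example to stop at the first file on which ncont fails, something along these lines should be equivalent. This is only a sketch: it assumes ncont is on your PATH and that ncont.inp sits in the current folder, and the names run_ncont, run_all and the runner parameter are made up for illustration.

```python
import glob
import subprocess


def run_ncont(pdb_path, inp_path="ncont.inp", runner=subprocess.run):
    """Run ncont on one file, feeding the keyword file on stdin.

    The output goes to <pdb_path>.dist, as in the shell loop.
    """
    out_path = pdb_path + ".dist"
    with open(inp_path) as keywords, open(out_path, "w") as out:
        # check=True raises CalledProcessError if ncont exits with an error
        runner(["ncont", "xyzin", pdb_path],
               stdin=keywords, stdout=out, check=True)
    return out_path


def run_all(pattern="*.pdb", inp_path="ncont.inp", runner=subprocess.run):
    """Process every file matching the pattern in the current folder."""
    written = []
    for pdb_path in sorted(glob.glob(pattern)):
        written.append(run_ncont(pdb_path, inp_path, runner))
        print("%s finished" % pdb_path)
    return written


if __name__ == "__main__":
    run_all()
```

The runner argument is there only so the loop can be tried out with a stand-in function when ncont is not installed.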
For doing the same thing in chimera I used this script:
---------------------------dist.py---------------------------------
# The script loops over all PDB files in the current working
# directory and saves all distances between lysine C-alpha atoms
# to one .dist file per structure.
#
# This script should be run with the following command:
# chimera --silent --nogui dist.py
import chimera
from chimera import runCommand as rc
from os import listdir
# Make sure we only get pdbs from the current folder
files = [ f for f in listdir('.') if f.endswith(".pdb")]
# Loop over pdb files
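for f in files:
    rc('open %s' % f)
    # Measure all Lys CA-CA distances and save them to <file>.dist
    rc('findclash :lys@ca test self overlap -1000 bondSeparation 1 '
       'saveFile %s.dist' % f)
    # Close the model before opening the next structure
    rc('close 0')
    print('%s done' % f)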
--------------------------------------------------------------------
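Once the .dist files exist, the distances usually need to be read back in. Below is a minimal sketch of a parser, assuming each data line looks like the ones in Elaine's example output quoted below (three fields per atom, then overlap and distance). That layout is an assumption and may differ between versions or options, and parse_contacts is a hypothetical helper name.

```python
import sys


def parse_contacts(lines):
    """Yield (atom1, atom2, overlap, distance) from findclash-style lines.

    Lines that do not have exactly eight whitespace-separated fields
    ending in two numbers (headers, contact counts, blanks) are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 8:
            continue
        try:
            overlap = float(fields[6])
            distance = float(fields[7])
        except ValueError:
            continue
        yield " ".join(fields[:3]), " ".join(fields[3:6]), overlap, distance


if __name__ == "__main__":
    # Usage: python parse_dist.py file1.pdb.dist file2.pdb.dist ...
    for path in sys.argv[1:]:
        with open(path) as handle:
            for atom1, atom2, overlap, distance in parse_contacts(handle):
                print("%s\t%s\t%s\t%.3f" % (path, atom1, atom2, distance))
```

Printing the file name on every row keeps the results from different structures apart when everything is collected into one table.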
Best,
Matej
------------------------------------------------------
Dr. Matej Repic
Ecole Polytechnique Fédérale de Lausanne
Laboratory of Computational Chemistry and Biochemistry
SB - ISIC LCBC
BCH 4108
CH - 1015 Lausanne
------------------------------------------------------
On 8/19/15, 19:55, "chimera-users-bounces at cgl.ucsf.edu on behalf of Elaine
Meng" <chimera-users-bounces at cgl.ucsf.edu on behalf of meng at cgl.ucsf.edu>
wrote:
>Hi Feixia!
>Yes, you could use Chimera to measure all lysine CA-CA distances in many
>structures, but it would require some scripting to loop through the
>structures, find the lysines, and do the measurements.
>
>There is some information on looping through structures and running
>Chimera commands here:
><http://www.rbvi.ucsf.edu/chimera/docs/ProgrammersGuide/basicPrimer.html>
>
>Now, for each structure, you might imagine the script should first find
>all the lysines and then use the "distance" command on each pairwise
>combination, substituting in the proper residue numbers. However, there
>is an easier way with the "findclash" command. You can use it for
>multiple distance measurements in a single command. For example:
>
>open 2gbp
>findclash :lys@ca test self overlap -1000 log true
>
>... will measure all distances among lysine CA atoms in structure 2gbp and
>list results in the Reply Log (open from Favorites menu). The -1000 in
>the command says to measure the distance even if the atoms are
>"overlapping" by -1000 (more than 1000 angstroms apart). The last column
>in the results is the atom-atom distance, and they are given in order of
>increasing distance, in this case:
>
>464 contacts
>atom1 atom2 overlap distance
>LYS 191.A CA LYS 189.A CA -1.802 5.562
>LYS 227.A CA LYS 223.A CA -2.137 5.897
>LYS 300.A CA LYS 246.A CA -2.800 6.560
>LYS 169.A CA LYS 164.A CA -4.676 8.436
>LYS 270.A CA LYS 276.A CA -4.677 8.437
>[... several lines removed ...]
>LYS 276.A CA LYS 203.A CA -53.705 57.465
>LYS 276.A CA LYS 137.A CA -56.756 60.516
>LYS 61.A CA LYS 137.A CA -56.829 60.589
>LYS 58.A CA LYS 137.A CA -59.598 63.358
>464 contacts
>
>The log contents can be saved to a text file, for example see
>
><http://plato.cgl.ucsf.edu/pipermail/chimera-users/2008-October/003184.html>
>
>However, your log would have a huge amount of results for all structures
>and it might be hard to tell which results go with which structures, so
>another possibility would be to save a separate file of results for each
>structure, and in that case your script would need to substitute in an
>output filename, e.g.
>
>findclash :lys@ca test self overlap -1000 saveFile ~/Desktop/2gbp.log
>
>See findclash documentation for all the options. Then you can embed the
>command in the python looping script as described in the first link
>above.
><http://www.rbvi.ucsf.edu/chimera/docs/UsersGuide/midas/findclash.html>
>
>I hope this helps,
>Elaine
>-----
>Elaine C. Meng, Ph.D.
>UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
>Department of Pharmaceutical Chemistry
>University of California, San Francisco
>
>On Aug 18, 2015, at 8:36 AM, Feixia <feixia.chu at unh.edu> wrote:
>
>> Hi there,
>> I am interested in retrieving distance information from large dataset
>>in an automatic fashion. For instance, can we use Chimera to get the
>>distances between lysine alpha-carbons of current PDB entries.
>>Presumably, we can download all PDB structures on our local desktop, and
>>just call functions one structure at a time. I wonder if we can do that
>>with Chimera. Your advice will be highly appreciated.
>> Best,
>> Feixia
>
>
>_______________________________________________
>Chimera-users mailing list
>Chimera-users at cgl.ucsf.edu
>http://www.rbvi.ucsf.edu/mailman/listinfo/chimera-users