[Chimera-users] adding chain IDs
Eric Pettersen
pett at cgl.ucsf.edu
Fri Oct 7 14:43:08 PDT 2005
Hi,
Since it's a lot easier to do things in Chimera when a file has
chain IDs, Maximilian Andrews wrote a script (below) for adding chain
IDs to files that lack them. It does require that TER cards be
present at the end of chains, but those are a lot easier to add than
chain IDs. Enjoy!
Eric Pettersen
UCSF Computer Graphics Lab
pett at cgl.ucsf.edu
http://www.cgl.ucsf.edu
#! /usr/bin/env python
# Filename: rename_pdb_for_chimera.py
######################################################################
# #
# Created: Maximilian N. Andrews, July 18, 2005 #
# Ciamician, Bologna University #
# #
# Function: #
# #
# This script adds a chain ID to all the subunits (column 22) in #
# the PDB file. #
# The chains are named 'A','B','C',etc. until all of the subunits #
# have been labeled. #
# #
# For the subunits to be renamed properly the PDB must contain a #
# line beginning with TER after each subunit, for example: #
# ... #
# ATOM 1570 OXT HIE 103 24.590 -9.249 30.054 #
# TER #
# ATOM 1571 N SER 104 16.936 7.767 30.236 #
# ... #
# #
# Notes: #
# #
# The original file is first read by the script, then copied to a #
# backup file - with the same name, but with "~" added at the end #
# of it - and then it is renamed and saved with the original name. #
# #
# Usage: #
# #
# rename_pdb.py arg1 arg2 #
# arg1 is the the number of chains that have to be renamed; #
# arg2 is the name of the input file; #
# example: rename_pdb.py 4 MY_PDB.pdb #
# #
######################################################################
import sys
import os
def BAILING_OUT(N_ARG):
""" The srcipt needs two arguments:
[int] The number of subunits to be renamed;
[str] The name of the PDB file to be renamed.
The correct use of this script is:
rename_pdb.py arg1 arg2
arg1 is the number of chains in the protein;
arg2 is the name of the input file;
example: rename_pdb.py 4 MY_PDB.pdb
"""
if N_ARG != 3:
print BAILING_OUT.__doc__
sys.exit()
else:
return
BAILING_OUT(len(sys.argv))
NUMBER_CHAINS = sys.argv[1]
PDB_IN = sys.argv[2]
print 'Number of chains is: ',sys.argv[1]
print 'The PDB file to rename is: ',sys.argv[2]
PDB_OUT = PDB_IN
PDB_BAK = PDB_IN+'~'
f = open(PDB_IN,'r')
lines = f.readlines()
f.close()
print 'Making backup file '+PDB_BAK+'...'
os.rename(PDB_IN,PDB_BAK)
print '...done!'
print 'Renaming '+PDB_IN+'...'
CHAIN = 65 # this is the integer that corrisponds to the character "A"
MAX_CHAIN = int(NUMBER_CHAINS) + CHAIN
LETTER = chr(CHAIN)
NEW_PDB = []
for i in range(len(lines)):
line = lines[i]
if line.startswith('TER'):
CHAIN = CHAIN + 1
if CHAIN < MAX_CHAIN:
LETTER = chr(CHAIN)
else:
LETTER = ' '
if line.startswith('ATOM'):
newline=line[0:21]+LETTER+line[22:]
NEW_PDB.append(newline)
elif line.startswith('TER') and len(line) > 21: # just in case
there's a chain identifier in the "TER" line... (the condition len
(line) is added just in case the TER line is a just few characters
long... i.e. it would add a blank space at the start of the following
line)
newline=line[0:21]+' '+line[22:]
NEW_PDB.append(newline)
else: # if line doesn't start with either ATOM or TER, it will
print it as is
NEW_PDB.append(line)
print '...done!'
print 'Writing renamed file '+PDB_OUT+'...'
f = open(PDB_OUT,'w')
for i in range(len(NEW_PDB)):
f.write(NEW_PDB[i])
f.close()
print '...done!'
More information about the Chimera-users
mailing list