[Chimera-users] Converting SDF files to Mol2 Format
Eric Pettersen
pett at cgl.ucsf.edu
Mon Mar 29 12:48:52 PDT 2010
On Mar 26, 2010, at 8:15 AM, Nancy wrote:
> Hi Elaine,
>
> I need to convert a large number of PubChem SDF files into Mol2
> format. When I convert an SDF file into Mol2 format, I add partial
> charges and run energy minimisation on the molecule, as this is
> necessary for the program I am inputting the files into. So far, I
> have only used Chimera in this method to convert individual files.
> Is there a way to convert a large number of SDF files to Mol2 format
> at once, and also preserve the filenames?
>
> Thanks in advance,
>
> Nancy
Hi Nancy,
There are several possibilities here. Probably the most
straightforward is to use either the 1.4 or 1.4.1 release in
conjunction with a script to process your files. Let's assume your
files are all in one directory and have names of the form
moleculeName.sdf . The following csh-style script would accomplish
the task:
#!/bin/tcsh -f
foreach sdf (*.sdf)
    echo $sdf
    chimera --nogui $sdf processSDF.cmd
    mv output.mol2 $sdf:r.mol2
end
where the contents of "processSDF.cmd" are just:
minimize
write format mol2 0 output.mol2
Put the csh script in a file (e.g. processSDF.csh) and make it
executable (chmod +x processSDF.csh). Run the script in the directory
that holds the SDF files, which is also where the processSDF.cmd file
needs to be located. If you are familiar with shell scripting you can
of course add paths to the various script names to change these
requirements.
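If csh is not available, the same one-Chimera-per-file loop can be
sketched in plain Python. This is only an illustration, not something
from the Chimera distribution: the function name convert_all and its
"run" parameter are made up here, and it assumes that "chimera" is on
your PATH and that processSDF.cmd sits in the SDF directory, exactly as
for the csh script.

```python
import os
import subprocess


def convert_all(directory, run=subprocess.call):
    """Run one Chimera instance per SDF file, mirroring the csh loop.

    `run` is any callable taking an argument list and a `cwd` keyword;
    it defaults to subprocess.call and is a parameter only so the loop
    can be tried out without Chimera installed (an assumption of this
    sketch, not a Chimera feature).
    """
    converted = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".sdf"):
            continue
        # Same command line as in the csh script; cwd makes Chimera look
        # for processSDF.cmd and write output.mol2 in the SDF directory.
        run(["chimera", "--nogui", name, "processSDF.cmd"], cwd=directory)
        produced = os.path.join(directory, "output.mol2")
        if not os.path.exists(produced):
            # Nothing was written for this file; report it and move on.
            print("no output for " + name)
            continue
        # Rename output.mol2 so the Mol2 file keeps the SDF file's name.
        target = os.path.join(directory, name[:-4] + ".mol2")
        os.rename(produced, target)
        converted.append(target)
    return converted
```

Calling convert_all(".") from the SDF directory would then play the
role of running processSDF.csh. Unlike the csh version, it skips the
rename when Chimera produced no output.mol2 for a file.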
The above, though simple, may be a little slow because it starts one
instance of Chimera per SDF file. You can instead get a single Chimera
instance to run through your directory of SDF files, but you have to
use a Python script rather than a Chimera command script, like this:
chimera --nogui processSDF.py
with the contents of processSDF.py being:
from os import chdir, listdir
from chimera import runCommand

chdir("/path/to/SDF-file-dir") # change to the SDF file directory
for sdf in listdir("."):
    # skip anything that is not an SDF file
    if not sdf.endswith(".sdf"):
        continue
    runCommand("open " + sdf)
    runCommand("minimize")
    # strip the ".sdf" suffix so the Mol2 file keeps the original name
    runCommand("write format mol2 0 " + sdf[:-4] + ".mol2")
    # close the model before opening the next file
    runCommand("close all")
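With a large batch, one problem file can abort the whole run. A
possible variant of the loop that records failures and keeps going is
sketched below. The function name process_files is hypothetical, and
the sketch assumes that runCommand raises a Python exception when a
command fails and that "close all" is harmless when nothing is open;
check both against your Chimera version.

```python
import sys


def process_files(sdf_names, run_command):
    """Minimize and convert each SDF file, collecting failures.

    `run_command` is meant to be chimera.runCommand when this runs
    inside Chimera; it is passed in so that the loop has no hard
    dependency on the chimera module (a choice made for this sketch).
    """
    failed = []
    for sdf in sdf_names:
        try:
            run_command("open " + sdf)
            run_command("minimize")
            run_command("write format mol2 0 " + sdf[:-4] + ".mol2")
        except Exception:
            # sys.exc_info() works on both Python 2 and Python 3
            failed.append((sdf, str(sys.exc_info()[1])))
        finally:
            # always close the model before moving to the next file
            run_command("close all")
    return failed
```

Inside processSDF.py you would call it along the lines of
process_files([f for f in listdir(".") if f.endswith(".sdf")],
runCommand) and then print the returned list to see which files need
another look.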
You could do all of the above in the 1.5 release, but the 1.5 branch
now uses the 1.3 version of AmberTools/Antechamber, which relies on the
program sqm rather than mopac to compute charges. While the charges
computed by sqm are theoretically more precise than those from mopac,
they take considerably longer to compute. We intend to add options for
using less strict charge-convergence criteria in order to speed things
up, but that work has yet to be done. Moieties of ~35 or fewer atoms
(including hydrogens) don't take excessive amounts of time (30 seconds
or less), but the compute time scales with the cube of the number of
atoms(!), so a system of 72 atoms took more than 13 minutes and a
system of 84 atoms took more than 21 minutes. I've also found that sqm
sometimes fails to converge. The only structure I've had this happen
with is ATP (at -4 charge), but it happened for a variety of conformers
of ATP.
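To get a rough feel for what cubic scaling means for a batch, you can
extrapolate from one of the timings above. The sketch below assumes
pure cubic scaling anchored at the 72-atom, 13-minute data point; the
function name estimate_minutes is made up for illustration, and real
timings will depend on the machine and the molecule.

```python
def estimate_minutes(n_atoms, ref_atoms=72, ref_minutes=13.0):
    """Extrapolate sqm charge time assuming pure cubic scaling.

    The default reference point (72 atoms, 13 minutes) is one of the
    timings quoted in this message; treat it as a lower bound since the
    run took "more than" 13 minutes.
    """
    return ref_minutes * (float(n_atoms) / ref_atoms) ** 3


for n in (72, 84, 100):
    print("%3d atoms: ~%.1f min" % (n, estimate_minutes(n)))
```

For 84 atoms this gives roughly 20.6 minutes, in line with the 21+
minutes observed. For small molecules it overshoots (about 1.5 minutes
predicted at 35 atoms versus 30 seconds or less observed), so use it
only as an order-of-magnitude guide for the larger ones.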
A final possibility is to use the PubChem3D files (see the PubChem3D
release note) and convert them directly to Mol2 files. This would
require the 1.5 release (the 1.4 series doesn't read the charges in the
SDF file) and assumes that the included MMFF charges are sufficient for
your needs and that the conformer provided is good enough for your
purposes without minimization. Using minimization would defeat the
purpose here: Chimera's minimization needs to know GAFF atom types,
which requires non-standard residues to be processed by Antechamber,
and Antechamber will add charges in addition to assigning types.
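For that direct route the per-file command sequence is just the earlier
one with the minimize step left out. A small sketch follows; the helper
name direct_conversion_commands is hypothetical, and whether the charges
from the SDF file end up in the Mol2 output depends on the 1.5 behavior
described above.

```python
def direct_conversion_commands(sdf_name):
    """Chimera commands to convert one SDF file to Mol2, no minimization."""
    base = sdf_name[:-4]  # drop the ".sdf" suffix
    return [
        "open " + sdf_name,
        "write format mol2 0 " + base + ".mol2",
        "close all",
    ]
```

In a Chimera Python script you would pass each returned string to
runCommand, in order, for every SDF file in the directory.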
--Eric
Eric Pettersen
UCSF Computer Graphics Lab
http://www.cgl.ucsf.edu