[chimera-dev] Help with "looping through PDB IDs" script

Mon Apr 21 10:07:14 PDT 2014

Hi Navya,
I'm not sure if this is your question, but it is not possible to submit structures from Chimera to the CASTp web server for a new calculation.  The only way to run a new calculation is by submitting directly at their website.
<http://sts-fw.bioengr.uic.edu/castp/calculation.php>

The CASTp fetch in Chimera only retrieves pre-calculated results for existing PDB entries from the CASTp database.
I hope this clarifies the situation,
Elaine
----------
Elaine C. Meng, Ph.D. 
UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
Department of Pharmaceutical Chemistry
University of California, San Francisco

On Apr 17, 2014, at 10:12 PM, Navya Shilpa Josyula <njosyu2 at uic.edu> wrote:

> Hi Eric,
> 
> Thank you for your reply. I understand that the CASTp Python script works only for .pdb files. However, it would be very time-consuming for me to upload each .pdb1 file to CASTp and then download the .poc and .pocInfo files, since I would then have to process these two files again to extract the atom list, pocID and MS_volume values.
> 
> Could you please elaborate on how to extract the .poc and .pocInfo values for .pdb1 files from the CASTp server using processCastpID() and processCastpFiles()? I need to loop through all .pdb1 files and, for each one, get the atom list for each pocID along with the pocket volumes, and write these values to a .csv file.
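
For the .csv step described above, a minimal sketch using only the standard csv module might look like the following. It assumes the per-pocket atom lists and volumes have already been extracted into plain dictionaries; the function name write_pocket_csv, the column names, and the input layout are all hypothetical, not part of Chimera or CASTp.

```python
import csv

def write_pocket_csv(out_file, results):
    """Write one CSV row per pocket.

    results is a list of (structure_name, atoms_by_pocket, volume_by_pocket)
    tuples, where atoms_by_pocket maps pocket ID -> list of atom serials and
    volume_by_pocket maps pocket ID -> volume. This layout is an assumption
    made for the example.
    """
    writer = csv.writer(out_file)
    # Hypothetical column names; rename to suit your own analysis.
    writer.writerow(["structure", "pocket_id", "ms_volume", "atom_serials"])
    for structure_name, atoms_by_pocket, volume_by_pocket in results:
        # Pocket IDs are sorted as given (strings sort lexically).
        for pocket_id in sorted(atoms_by_pocket):
            writer.writerow([
                structure_name,
                pocket_id,
                volume_by_pocket.get(pocket_id, ""),  # blank if no volume found
                " ".join(atoms_by_pocket[pocket_id]),
            ])
```

The function takes an already-open file object, so the same call works for one big file or one file per protein. Under the Python 2 bundled with Chimera, the output file would normally be opened in "wb" mode for the csv module.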
> 
> Thank you in advance,
> Navya
> 
> 
> On Thu, Apr 17, 2014 at 5:05 PM, Eric Pettersen <pett at cgl.ucsf.edu> wrote:
> On Apr 17, 2014, at 2:11 PM, Navya Shilpa Josyula <njosyu2 at uic.edu> wrote:
> 
>> Now I am trying to write the CASTp information for each of my proteins into a separate file. As you suggested in your earlier email, the processCastpID function is in the gui.py file, not in the __init__.py file; I hope I am not missing anything here. As I understand it, this function fetches the four CASTp files, of which I need only the ".poc" and ".pocInfo" files. From these two files I want to write only the atom list, pocID and MS_Volume data into a single file for all 400 proteins in my dataset. Is there a link or script available for this?
> 
> There are some fine points that I missed in my answer yesterday, and the situation is complicated further by your use of .pdb1 files instead of the "normal" entries.
> 
> So for one thing, if you are going to use the .pdb1 files, then you are going to have to run CASTp yourself on each and then process the results.  In that case you might as well also analyze the .poc and .pocInfo files yourself to determine what pocket each atom belongs to (the next-to-last field in the .poc file) and the volume of that pocket (listed in the .pocInfo file).
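
A rough sketch of that kind of parsing, in plain Python with no Chimera calls. The only format detail taken from the message above is that the pocket ID is the next-to-last field of each atom line in the .poc file. Everything else is an assumption to check against your own files: that .poc atom lines follow PDB column layout, that the .pocInfo file is a whitespace-delimited table whose first line is a header, and that the header uses the column names "ID" and "Vol_ms". The function names are hypothetical.

```python
from collections import defaultdict

def pocket_atoms(poc_lines):
    """Map pocket ID -> list of atom serial numbers from .poc lines."""
    pockets = defaultdict(list)
    for line in poc_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip anything that is not an atom record
        fields = line.split()
        if len(fields) < 3:
            continue  # too short to carry a pocket ID
        pocket_id = fields[-2]        # next-to-last field is the pocket ID
        serial = line[6:11].strip()   # PDB atom serial number, columns 7-11
        pockets[pocket_id].append(serial)
    return dict(pockets)

def pocket_volumes(pocinfo_lines, id_column="ID", volume_column="Vol_ms"):
    """Map pocket ID -> volume from .pocInfo lines.

    Assumes the first non-blank line is a header naming the columns and
    that data lines have the same number of fields as the header.
    The default column names are assumptions.
    """
    id_idx = vol_idx = None
    volumes = {}
    for line in pocinfo_lines:
        fields = line.split()
        if not fields:
            continue
        if id_idx is None:
            # First non-blank line: locate the columns by name.
            id_idx = fields.index(id_column)
            vol_idx = fields.index(volume_column)
            continue
        volumes[fields[id_idx]] = float(fields[vol_idx])
    return volumes
```

Both functions take any iterable of lines, so an open file object can be passed directly. Pocket IDs are kept as strings so the two dictionaries can be matched up without conversion.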
> 
> The main point I missed in my reply, which may now be moot because of the .pdb1 thing, is that processCastpID() builds its own structure. You would therefore not open the PDB entry first; instead you would return the structure (along with the cavities list) from that method and make the structure available in Chimera with:
> 
> 	chimera.openModels.add([structure])
> 
> and then proceed with selecting the right residues, using currentResidues() to list them, etc.  I guess if you didn't want to process the .pdb1 CASTp files yourself (after running CASTp on the .pdb1) you could use processCastpFiles() to get the cavity list and structure and proceed as I just outlined.  processCastpFiles is in __init__, unlike processCastpID() as you found!
> 
>> Again, as mentioned in my last email, since my output files will be huge in size, will I be able to write my files directly to a database table in SQL server?
> 
> I'm not much of an expert on this, but maybe this page would help: DatabaseInterfaces - Python Wiki
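
As an illustration of the Python database interface mentioned above, here is a minimal sketch that writes pocket rows into a table. It uses the standard-library sqlite3 module purely as a stand-in, since no SQL Server driver is assumed to be installed; a SQL Server driver that follows the same Python DB-API would be used in a similar way, though its connection call and placeholder style may differ. The table name, column names, and function name are hypothetical.

```python
import sqlite3

def store_pockets(conn, rows):
    """Insert (structure, pocket_id, ms_volume, atom_serials) rows.

    conn is a DB-API connection; sqlite3 is used here only as a stand-in.
    The "?" placeholders are sqlite3's parameter style and may need to
    change for another driver.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS pockets ("
        "structure TEXT, pocket_id TEXT, ms_volume REAL, atom_serials TEXT)"
    )
    # executemany sends all rows with one parameterized statement.
    cur.executemany("INSERT INTO pockets VALUES (?, ?, ?, ?)", rows)
    conn.commit()

# Example with an in-memory database:
conn = sqlite3.connect(":memory:")
store_pockets(conn, [("1abc.pdb1", "1", 15.5, "1 12"),
                     ("1abc.pdb1", "2", 40.25, "30")])
```

Writing rows as they are produced, rather than building one huge file first, keeps memory use small even for a few hundred proteins.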
> 
> --Eric