[Chimera-users] Reading and saving large files in chimera

Tue Mar 27 12:46:35 PDT 2018

Hi, Tofayel.

Elaine is indisposed for a few days, so I'm looking into this problem.

Unfortunately, from what I can tell, it is a problem with the PDB format
itself.  The official standard
(https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM) 
only allows for three-character residue names in columns 18-20 and 
one-character chain identifiers in column 22.  It looks like Phenix, 
seeing that column 21 is unused, is allowing two-character chain 
identifiers in columns 21-22.  Chimera (and ChimeraX), on the other 
hand, seeing that column 21 is unused, is allowing four-character 
residue names.  So the string "ASPA1" is interpreted by Phenix as
residue "ASP" in chain "A1", but by Chimera as residue "ASPA" in chain
"1".  (Doubly unfortunately, the one-character chain identifier limit is
built into the Chimera data structures, so we cannot simply change the
PDB parser to support two-character chains.)
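
To make the column conflict concrete, here is a minimal Python sketch of
the three readings of columns 18-22.  The function names and the sample
ATOM line are made up for illustration; this is not the actual parser
code of Phenix or Chimera, only the slicing that each interpretation
implies.

```python
def make_atom_line(cols_18_22):
    """Build the start of a hypothetical ATOM record around columns 18-22."""
    assert len(cols_18_22) == 5
    # cols 1-6 record name, 7-11 serial, 12 blank, 13-16 atom name, 17 altLoc
    prefix = "ATOM  " + "%5d" % 1 + " " + " N  " + " "
    # cols 23-26 hold the residue sequence number
    return prefix + cols_18_22 + "%4d" % 1


def strict_pdb(line):
    """Official format: residue name in cols 18-20, chain ID in col 22."""
    return line[17:20].strip(), line[21].strip()


def two_char_chain(line):
    """Phenix-style reading (as described above): chain ID in cols 21-22."""
    return line[17:20].strip(), line[20:22].strip()


def four_char_residue(line):
    """Chimera-style reading (as described above): residue name in cols 18-21."""
    return line[17:21].strip(), line[21].strip()


ambiguous = make_atom_line("ASPA1")
print(two_char_chain(ambiguous))     # residue ASP, chain A1
print(four_char_residue(ambiguous))  # residue ASPA, chain 1
```

With column 21 blank (for example "ASP A"), all three readings agree,
which is why the conflict only shows up once a writer starts using
column 21.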

As Elaine suggested earlier, your best bet is probably using mmCIF 
format instead of PDB.  mmCIF is not column-based and is explicitly 
designed to handle very large structures.  Phenix versions 1.8.2 and
later support mmCIF, as does ChimeraX.  I would not recommend Chimera for
large mmCIF files as it requires excessive memory resources and is very 
slow to boot.  (ChimeraX can open 5ire_BIOMT_expanded.pdb in under a 
minute and renders a beautiful [tooting our own horn :-)] image with 
ambient occlusion.)
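
As a rough illustration of why mmCIF avoids the problem: atom records
there are whitespace-separated tokens matched to named columns, so a
chain identifier can be any length.  The sketch below is not a real
mmCIF parser (it ignores quoting and the many other columns), and the
column subset and sample row are invented for the example.

```python
# Simplified, hypothetical subset of _atom_site column names.
ATOM_SITE_COLUMNS = [
    "group_PDB",
    "id",
    "label_atom_id",
    "label_comp_id",   # residue name
    "label_asym_id",   # chain identifier
    "label_seq_id",
    "Cartn_x",
    "Cartn_y",
    "Cartn_z",
]


def parse_atom_site_row(columns, row):
    """Pair whitespace-separated tokens with their column names."""
    values = row.split()
    if len(values) != len(columns):
        raise ValueError("expected %d tokens, got %d" % (len(columns), len(values)))
    return dict(zip(columns, values))


# Made-up row with a two-character chain ID; no fixed columns involved.
atom = parse_atom_site_row(ATOM_SITE_COLUMNS, "ATOM 1 N ASP A1 1 10.000 20.000 30.000")
print(atom["label_comp_id"], atom["label_asym_id"])
```

Because the tokens are delimited rather than positional, "ASP" and "A1"
can never run together the way they do in columns 18-22 of a PDB file.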

Conrad

On 3/26/2018 10:08 PM, #TOFAYEL AHMED# wrote:
> Hi Elaine,
> 
> 
> Thanks for getting back to me so promptly. Please find below the google 
> drive link for the files. I am describing what I have in the folder: I 
> have downloaded the model 5IRE from PDB website (filename:5ire.pdb) and 
> applied phenix to operate on the transformation matrix (BIOMT record in 
> REMARK 350) to generate the icosahedral complete 
> virus (filename:5ire_BIOMT_expanded.pdb). Now, phenix has used two 
> letter IDs for the chains and these chain IDs get changed when I open 
> 5ire_BIOMT_expanded.pdb in Chimera1.12. I have then resaved the file 
> again using Chimera1.12 and this file is named as chimera-saved.pdb.
> 
> 
> This is just an example case and it closely matches my problem. In real 
> scenario, I need to use fit in map option in chimera and move the 
> coordinates, so there is no way I can avoid "opening my model in chimera 
> and resave them in new position".
> 
> 
> Kindly suggest after going through the files !
> 
> 
> Link: 
> https://drive.google.com/drive/folders/1p8-QxEDjlszAbQ3M-f9MREo2gCHgIOWQ?usp=sharing
> 
> 
> Best regards,
> 
> Tofayel
> 
> 
> 
> ------------------------------------------------------------------------
> *From:* Elaine Meng <meng at cgl.ucsf.edu>
> *Sent:* Tuesday, March 27, 2018 1:04:22 AM
> *To:* #TOFAYEL AHMED#
> *Cc:* UCSF Chimera Mailing List
> *Subject:* Re: [Chimera-users] Reading and saving large files in chimera
> Hi Tofayel,
> Not being a phenix user, I may need more specifics, like:
> 
> - are you getting a single PDB file of the whole capsid from applying 
> the biomt in phenix, and opening that single PDB file in Chimera?
> - does that file have more than one chain with the same ID, or chain IDs 
> with multiple characters?
> - is the chain ID different as soon as you read the structure in to 
> Chimera, or is it only different when you write to an output file?
> - are you saving as Chimera session or writing a single PDB file?
> 
> I know the capsid in your example has many chains, so I can only guess 
> that in the file from phenix, there are either duplicate chain IDs or 
> chain IDs with more characters than Chimera or PDB format allow.
> 
> One idea is to try using ChimeraX instead.  It can write mmCIF format.  
> However, with my limited understanding of what is happening (sorry), I 
> don’t know if that would solve your problem.
> Best regards,
> Elaine
> -----
> Elaine C. Meng, Ph.D.
> UCSF Chimera(X) team
> Department of Pharmaceutical Chemistry
> University of California, San Francisco
> 
>> On Mar 26, 2018, at 4:03 AM, #TOFAYEL AHMED# <TOFAYEL001 at e.ntu.edu.sg> wrote:
>> 
>> Hi Chimera developers and users,
>> 
>> I have faced a problem with chain IDs and secondary structure while
>> working on a large file. As I am refining my model in Phenix and
>> visualizing in Chimera, I need to go back and forth between these two
>> software and that creates a problem. To replicate my problem, here I
>> have taken an example from PDB so that you get the idea. If I take
>> model bearing PDB id 5IRE and apply "phenix.pdb.biomt_reconstruction"
>> I can generate the complete virus icosahedral structure from the
>> asymmetric unit. But when I open this complete virus structure in
>> Chimera and save it back again the chain IDs get changed. Is there any
>> work around to maintain the original chain IDs and therefore maintain
>> the secondary structure definitions in the file header, after opening
>> and thereafter saving using Chimera?
>> 
>> Best regards,
>> Tofayel
>> NTU Singapore