Opened 8 years ago

Last modified 5 years ago

#711 accepted defect

mis-association (misalignment) between structure 1ml1 and its own sequence

Reported by: Elaine Meng
Owned by: Eric Pettersen
Priority: minor
Milestone:
Component: Sequence
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

If I open 1ml1 and then click the “monotim” link to show all the chains associated with the sequence, there are two blocks of mismatches near the beginning of the sequence. Mousing over them shows that the real issue is that the sequence and structure are shifted relative to each other by one residue. If I click just the “A” link, I also get two blocks of mismatches for that chain alone, but shorter ones. Screenshot of sequence windows attached.

Attachments (2)

monotim-mismatches.png (69.3 KB) - added by Elaine Meng 8 years ago.
1ml1.pdb (939.4 KB) - added by Elaine Meng 8 years ago.
PDB file of 1ml1 from May 2 2017


Change History (14)

by Elaine Meng, 8 years ago

Attachment: monotim-mismatches.png added

comment:1 by Eric Pettersen, 8 years ago

Status: assigned → feedback

Did the mmCIF entry for 1ml1 change? The chain name is now "triosephosphate isomerase" instead of "monotim", though the sequences look to be the same. The mis-alignments also no longer exist.

comment:2 by Elaine Meng, 8 years ago

I see, this only happens with the PDB file. I tried the one I had from May 2 and then also fetched a new one (ignore cache) and it still happened. Use: open 1ml1 format pdb

comment:3 by Elaine Meng, 8 years ago

Oops I didn't ignore cache. When I actually ignore cache I get an error fetching (any) PDB but not mmCIF. Is RCSB no longer providing PDB? Or is this the same problem mentioned in our last dev meeting? Let me know if it needs a new ticket. E.g.

open 1zik format pdb ignoreCache true
Fetching url http://www.pdb.org/pdb/files/pdb1zik.ent failed: HTTP Error 404: Not Found

by Elaine Meng, 8 years ago

Attachment: 1ml1.pdb added

PDB file of 1ml1 from May 2 2017

comment:4 by Eric Pettersen, 8 years ago

Fixed the PDB fetching. BTW, if you had fetched "from pdbe" it would have worked -- so that's another workaround until tomorrow's build.

comment:5 by Eric Pettersen, 8 years ago

I'm not 1000% sure what I'm supposed to do here. In each chain, residue 14 (a cysteine) is missing, but in the 3D structure residue 13 is clearly directly connected to residue 15 via the normal backbone atoms. The mmCIF reader puts a missing-structure pseudobond in lieu of what is pretty clearly a covalent bond, whereas the PDB reader uses a regular bond. Because the pseudobond makes it possible to insert a gap there, the mmCIF structure can be aligned to the sequence without error, but the PDB structure cannot.

The REMARK 465 records in the PDB file note that residue 14 was not located in the experiment.
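
For readers who want to check which residues a PDB file declares as unobserved, the following is a minimal Python sketch that collects them from REMARK 465 lines. It is not ChimeraX code. The function name `missing_residues` is hypothetical, the sample lines are made up to mirror the situation described above (they are not copied from the attached 1ml1.pdb), and the parser assumes whitespace-separated data rows of the form "RES CHAIN SEQNUM", optionally preceded by a model number.

```python
def missing_residues(pdb_lines):
    """Collect (chain, residue name, sequence number) from REMARK 465 lines.

    Assumption: data rows hold either "RES CHAIN SEQNUM" or
    "MODEL RES CHAIN SEQNUM"; anything else is treated as header text.
    Rows whose sequence number carries an insertion code are skipped.
    """
    missing = []
    for line in pdb_lines:
        if not line.startswith("REMARK 465"):
            continue
        fields = line[10:].split()
        if len(fields) == 4:
            fields = fields[1:]  # drop the leading model number (or header token)
        if len(fields) != 3:
            continue
        resname, chain, seqnum = fields
        try:
            number = int(seqnum)
        except ValueError:
            continue  # column-header row or insertion code, not handled here
        missing.append((chain, resname, number))
    return missing


# Hypothetical sample lines, for illustration only.
sample = [
    "REMARK 465 MISSING RESIDUES",
    "REMARK 465   M RES C SSSEQI",
    "REMARK 465     CYS A    14",
    "REMARK 465     CYS B    14",
    "ATOM      1  N   ALA A   2",
]
found = missing_residues(sample)
print(found)
```

A list like this only says what the depositors declared missing; it does not settle whether the reader should draw a covalent bond or a missing-structure pseudobond across the flanking residues.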

in reply to:  8 comment:6 by Elaine Meng, 8 years ago

I don’t know what is best either.  If we think it’s rare, then the workaround can be “use mmCIF instead.”  If it turns out to be a recurring type of problem, then revisit maybe trying to do something more proactive, not that I have any ideas of what that might be.  Could demote priority on this ticket and just keep it as an issue of which we are aware.

comment:7 by Eric Pettersen, 8 years ago

Priority: major → trivial
Status: feedback → accepted

The problem is that both behaviors are both right and wrong. The mmCIF behavior is right if you "believe" the SEQRES. The PDB behavior is right if you "believe" the structure.

in reply to:  10 ; comment:8 by Elaine Meng, 8 years ago

That suggests trying both behaviors and then using the one that doesn’t give mismatches… perhaps not feasible though.

in reply to:  11 comment:9 by Eric Pettersen, 8 years ago

That’s not what I’m suggesting.  I am just saying that the PDB behavior is correct (i.e. there _is_ a mismatch with the sequence) if you look at the structure.  The mmCIF behavior is correct (you need to insert a gap between those residues despite the deposited structure) if you look at the SEQRES sequence.

—Eric
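
The difference between the two behaviors can be illustrated with a toy Python sketch. This is not ChimeraX's association algorithm: the sequence is invented, the function names `mismatches` and `place_segments` are hypothetical, and the placement is a simple greedy left-to-right fit that assumes the segments fit within the sequence. A structure read as one bonded segment must be laid against the sequence without gaps, while a structure split at a missing-structure pseudobond can leave a gap between its two segments.

```python
def mismatches(seq, segment, offset):
    """Count positions where the segment disagrees with seq at this offset."""
    return sum(1 for i, res in enumerate(segment) if seq[offset + i] != res)


def place_segments(seq, segments):
    """Greedily place ungapped segments left to right on seq.

    Returns (offsets, total mismatches). Assumes the segments fit in seq.
    """
    offsets, total, start = [], 0, 0
    for n, seg in enumerate(segments):
        # Leave room for the segments that still have to be placed.
        tail = sum(len(s) for s in segments[n + 1:])
        last = len(seq) - tail - len(seg)
        best = min(range(start, last + 1),
                   key=lambda off: (mismatches(seq, seg, off), off))
        offsets.append(best)
        total += mismatches(seq, seg, best)
        start = best + len(seg)
    return offsets, total


# Invented 12-residue sequence; the structure lacks the fifth residue ("F").
seqres = "ACDEFGHIKLMN"

# One bonded segment (no gap allowed): the best fit is shifted by one.
bonded = place_segments(seqres, ["ACDEGHIKLMN"])
print(bonded)  # ([1], 4)

# Two segments separated by a pseudobond (gap allowed): a perfect fit.
split = place_segments(seqres, ["ACDE", "GHIKLMN"])
print(split)  # ([0, 5], 0)
```

In the toy case the single segment ends up offset by one with a block of mismatches at the start, and the split structure matches once a one-residue gap is left between the segments.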

in reply to:  12 ; comment:10 by Elaine Meng, 8 years ago

Can we infer a possible gap from the numbering break, or is residue number always ignored?

in reply to:  13 ; comment:11 by Eric Pettersen, 8 years ago

Numbering gaps don’t always imply structure/sequence gaps.  Circular permutations and fusion proteins can have numbering gaps where no sequence/structure gap exists.
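
As a rough Python sketch of why residue numbering alone is an unreliable signal, the helper below (a hypothetical name, `numbering_breaks`, with invented residue numbers) lists every place where consecutive residue numbers are not adjacent. It reports a break for a genuinely missing residue and for a fusion-style numbering jump alike, so a break can only be treated as a candidate gap, not as proof of one.

```python
def numbering_breaks(residue_numbers):
    """Return (previous, next) pairs where numbering is not consecutive."""
    return [(a, b)
            for a, b in zip(residue_numbers, residue_numbers[1:])
            if b != a + 1]


# Invented numbering with one residue (14) absent from the structure.
missing_one = numbering_breaks([11, 12, 13, 15, 16])
print(missing_one)  # [(13, 15)]

# Invented fusion-style numbering: contiguous chain, but a numbering jump.
fusion_style = numbering_breaks([1, 2, 3, 1001, 1002])
print(fusion_style)  # [(3, 1001)]
```

Both inputs produce a break, even though only the first corresponds to a residue absent from the structure.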

comment:12 by Eric Pettersen, 5 years ago

Priority: trivial → minor