Opened 16 months ago

Closed 15 months ago

Last modified 15 months ago

#15522 closed defect (fixed)

Mismatch of crosslink positions when loading structures from PDB-Dev

Reported by: Georg.Kempf@…
Owned by: ben@…
Priority: normal
Milestone:
Component: Higher-Order Structure
Version:
Keywords:
Cc: Eric Pettersen, Tom Goddard
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

Hello,

I would like to report a potential issue. When loading a model from PDB-Dev that contains crosslink restraints in the IHM format, the crosslink positions sometimes do not match the model residue numbering. As I understand it, this is because the PDB-Dev/IHM format requires crosslink positions to be given in “_atom_site.label_seq_id” numbering (which refers to the “ENTITY_POLY_SEQ” list), and this can be offset from “_atom_site.auth_seq_id”, which represents the actual model residue number. As a result, the crosslinks are displayed on the wrong residues.
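
To make the offset concrete, here is a minimal Python sketch. The residue rows and the offset of 20 are invented for illustration and are not taken from any PDB-Dev entry; the dictionaries are plain stand-ins, not ChimeraX or python-ihm objects.

```python
# Hypothetical atom_site-style rows for one chain:
# (label_seq_id, auth_seq_id, residue name). Values are invented.
rows = [
    (1, 21, "MET"),
    (2, 22, "GLU"),
    (3, 23, "ALA"),
    (22, 42, "LYS"),
]

# Two lookup tables, one per numbering scheme.
by_auth = {auth: name for _, auth, name in rows}
by_label = {label: name for label, _, name in rows}

# IHM crosslink tables give positions as label_seq_id.
crosslink_seq_id = 22

# Treating the label_seq_id as an author number finds a different residue.
wrong = by_auth.get(crosslink_seq_id)
# Looking it up in the numbering the file actually uses finds the intended one.
right = by_label.get(crosslink_seq_id)

print(f"as author number: {wrong}, as label_seq_id: {right}")
```

In this toy chain the same number 22 names two different residues depending on the scheme, which is the kind of mismatch described above.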

An example would be one of our structures:
https://pdb-dev.wwpdb.org/entry.html?PDBDEV_00000210


Best regards,
Georg

Change History (9)

comment:1 by Eric Pettersen, 16 months ago

Cc: Eric Pettersen added
Component: Unassigned → Higher-Order Structure
Owner: set to Tom Goddard
Platform: all
Project: ChimeraX
Status: new → assigned

This could possibly be achieved by setting the structure's 'res_numbering' attribute to 'canonical' before adding the pseudobonds, and optionally setting the numbering back to 'author' afterward.

comment:2 by Tom Goddard, 16 months ago

Cc: Tom Goddard added
Owner: changed from Tom Goddard to ben@…

Hi Ben,

The ChimeraX IHM code looks up residues to make crosslink pseudobonds using the author sequence numbers, since those are the standard numbers ChimeraX uses. But the sequence numbers the code takes from the IHM tables are the mmCIF sequence numbers, which can differ from the author sequence numbers. In addition to the crosslink case, I see the ChimeraX IHM code also uses residue numbers for associating template models. I guess that code is also wrong, treating mmCIF numbering from the file as if it were author numbering when the ChimeraX residue lookup is done. I'll leave it to you to straighten out this problem. Hopefully it is not too difficult since, as Eric comments above, you can temporarily switch ChimeraX to use mmCIF residue numbering.

Tom

comment:3 by Ben Webb, 15 months ago

I'm not super-familiar with this code, but Eric's suggestion to set res_numbering does appear to work in my testing. See https://github.com/RBVI/ChimeraX/pull/44 for a proposed fix.

comment:4 by Tom Goddard, 15 months ago

I wrote the original ChimeraX IHM code and then Ben Webb heavily modified it to use the PyPI IHM reader he wrote.

I don't know if the canonical residue numbering will break things when reading. That depends on whether all the IHM mmCIF tables being read use that residue numbering when referring to the atomic models. If they do, then it seems like a sane change to make. But I think Ben has a better understanding of the IHM mmCIF dictionary than I do.

I looked at Ben's pull request and requested a small change: the switch to canonical residue numbers, and the switch back to author residue numbers after loading the data, should happen in the same function in the code.
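
One way to keep the switch and the restore together is a context manager. The following is only a sketch under stated assumptions: `ToyStructure`, `find_residue` and `canonical_numbering` are hypothetical names, not the ChimeraX API or the code in the pull request, and the string values 'author' and 'canonical' simply follow the wording of comment:1.

```python
from contextlib import contextmanager


class ToyStructure:
    """Hypothetical stand-in for a structure; not the ChimeraX class."""

    def __init__(self, residues):
        # residues: list of (canonical_number, author_number, name)
        self._residues = residues
        self.res_numbering = "author"

    def find_residue(self, number):
        # Pick which column to match based on the active numbering scheme.
        index = 0 if self.res_numbering == "canonical" else 1
        for res in self._residues:
            if res[index] == number:
                return res[2]
        return None


@contextmanager
def canonical_numbering(structure):
    """Switch to canonical numbering, restoring the previous scheme on exit."""
    previous = structure.res_numbering
    structure.res_numbering = "canonical"
    try:
        yield structure
    finally:
        # Runs even if the body raises, so the structure is never left switched.
        structure.res_numbering = previous


# Invented residues with an offset of 20 between the two schemes.
toy = ToyStructure([(1, 21, "MET"), (2, 22, "GLU"), (22, 42, "LYS")])
with canonical_numbering(toy):
    inside = toy.find_residue(22)   # resolved with canonical numbering
after = toy.find_residue(22)        # resolved with author numbering again
```

Because the restore sits in a `finally` clause, the numbering goes back to its previous value even if reading the crosslink data fails partway through.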

comment:5 by Ben Webb, 15 months ago

All of the IHMCIF tables use seq_id to refer to residues, never author-provided residue numbers, so I can't think of anywhere where using canonical numbering would break anything. (It gets a bit more involved when using nonpolymer entities, where the author-provided residue number is used as the key, but all of the IHMCIF tables currently handled by ChimeraX pertain only to polymers.)

comment:6 by Tom Goddard, 15 months ago

Ok. I'll merge your patch when I return from vacation next week. Thanks for making the fix.

comment:7 by Tom Goddard, 15 months ago

Resolution: fixed
Status: assigned → closed

Fixed.

Merged Ben's pull request #44 which switches to using canonical instead of author residue numbering while the IHM file is being parsed and crosslinks are being created.

comment:8 by Georg.Kempf@…, 15 months ago

Great, thanks a lot for the fast fix! Will it be included in the next daily builds?

Best regards,
Georg

From: ChimeraX <ChimeraX-bugs-admin@cgl.ucsf.edu>
Date: Tuesday, 23. July 2024 at 05:26
To: Kempf, Georg <Georg.Kempf@fmi.ch>, ben@salilab.org <ben@salilab.org>, goddard@cgl.ucsf.edu <goddard@cgl.ucsf.edu>
Cc: pett@cgl.ucsf.edu <pett@cgl.ucsf.edu>
Subject: Re: [ChimeraX] #15522: Mismatch of crosslink positions when loading structures from PDB-Dev
Ticket URL: <https://www.rbvi.ucsf.edu/trac/ChimeraX/ticket/15522#comment:7>

comment:9 by goddard@…, 15 months ago

Yes, the fix is in the daily build on the ChimeraX web site now.

   Tom