Opened 7 years ago

Closed 7 years ago

#1570 closed defect (fixed)

Chain info incorrect when microheterogeneity is present

Reported by: Eric Pettersen Owned by: Greg Couch
Priority: moderate Milestone:
Component: Input/Output Version:
Keywords: Cc: Eric Pettersen
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

For structures with microheterogeneity, the mmCIF reader produces two copies of each chain. Example structures are 1ejg and 2izq. The PDB reader produces only one copy of each chain, though it has other problems.

Change History (13)

comment:1 by Eric Pettersen, 7 years ago

Cc: Eric Pettersen added

comment:2 by Greg Couch, 7 years ago

Owner: changed from Greg Couch to Eric Pettersen

Adding a bunch of print statements makes the bug go away. I'm assuming the bug is not in the mmcif code.

comment:3 by Greg Couch, 7 years ago

Cc: Greg Couch added; Eric Pettersen removed

comment:4 by Greg Couch, 7 years ago

More info. Valgrind did not find any errors. The print statements were for structure->num_chains() and were done in multiple places inside ExtractMolecule::finished_parse(). num_chains() indirectly invokes make_chains().

Adding an explicit make_chains() call at the end of finished_parse() works around the bug.

comment:5 by Greg Couch, 7 years ago

Correction: that workaround worked on Linux, but not on Windows. Will look a bit more.

comment:6 by Eric Pettersen, 7 years ago

Cc: Eric Pettersen added; Greg Couch removed
Owner: changed from Eric Pettersen to Greg Couch

For this structure, in the mmcif reader you call Structure.chains() multiple times before you ever call Structure.set_input_seq_info(). So the early call to chains() creates four chains, and then your later calls to set_input_seq_info() create four more chains.
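To make the ordering problem concrete, here is a minimal toy model. ToyStructure and load() are hypothetical stand-ins, not the atomic library's code; the sketch assumes only what is described above: chains() builds chains lazily as a side effect, and set_input_seq_info() creates a chain on the assumption that none exists yet.

```python
class ToyStructure:
    """Hypothetical stand-in; not the ChimeraX atomic library."""

    def __init__(self, chain_ids):
        self.chain_ids = list(chain_ids)  # chain IDs seen on residues
        self._chains = None               # built lazily on first request

    def make_chains(self):
        # Build one chain per chain ID, but only if nothing built them yet.
        if self._chains is None:
            self._chains = [(cid, None) for cid in self.chain_ids]

    def chains(self):
        self.make_chains()                # hidden side effect of a "getter"
        return self._chains

    def set_input_seq_info(self, chain_id, seq):
        # Assumes it is the first to create this chain, so it always appends.
        if self._chains is None:
            self._chains = []
        self._chains.append((chain_id, seq))


def load(chain_ids, touch_chains_early):
    s = ToyStructure(chain_ids)
    if touch_chains_early:
        s.chains()                        # the unexpected early call
    for cid in chain_ids:
        s.set_input_seq_info(cid, "SEQ")
    return [cid for cid, _ in s.chains()]


expected_order = load(["A", "B"], touch_chains_early=False)
early_call = load(["A", "B"], touch_chains_early=True)
```

In this toy, expected_order holds one entry per chain ID, while early_call holds each chain ID twice, which is the same doubling reported for 1ejg and 2izq.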

comment:7 by Greg Couch, 7 years ago

How did you figure that out? There are no calls to chains() or num_chains().

comment:8 by Eric Pettersen, 7 years ago

I put print statements before and after the mmcif reading in mmcif.py, and also in Structure.chains() and Structure.set_input_seq_info().

comment:9 by Eric Pettersen, 7 years ago

It would be nice to get a stack trace from the chains() call and see what the intervening layers are, but oh well.
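For what it's worth, when the caller is Python code, a stack trace can be captured with the standard traceback module. The sketch below uses hypothetical names (traced, Toy, cleanup_ribbons, delete_atoms) purely for illustration. It would not see callers inside compiled C++ code; for those, a debugger breakpoint on the C++ function is probably the better tool.

```python
import functools
import traceback

call_sites = []  # one recorded stack per call to a traced function


def traced(func):
    """Record the Python call stack each time func is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Drop the last frame, which is this wrapper itself.
        call_sites.append(traceback.extract_stack()[:-1])
        return func(*args, **kwargs)
    return wrapper


class Toy:
    @traced
    def chains(self):
        return []


def cleanup_ribbons(s):
    return s.chains()       # the indirect call we want to find


def delete_atoms(s):
    cleanup_ribbons(s)


delete_atoms(Toy())
# Innermost callers come last in the recorded stack.
names = [frame.name for frame in call_sites[0]]
```

The tail of names lists the intervening layers, innermost last, so the function that actually made the call is the final entry.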

comment:10 by Eric Pettersen, 7 years ago

Yeah, I'm not seeing the chains() call in the code either. Must be called indirectly somehow...

comment:11 by Eric Pettersen, 7 years ago

Reporter: changed from lpravda@… to Eric Pettersen

I think you have to put the same print statements in that I did and then more within the mmcif code to binary search for the culprit. I'm guessing the "culprit" lurks in the atomic library somewhere, though the call to chains() from within atomic [if that's what it is] isn't necessarily a bug.

Also, taking Lukas off this ticket as an act of mercy. :-)

comment:12 by Greg Couch, 7 years ago

Looks like it's the delete_residue() call that gets rid of the extra residue. But I don't see how.

comment:13 by Greg Couch, 7 years ago

Resolution: fixed
Status: assigned → closed

Found it. Structure::_delete_atoms asks for the residue's chain in order to clean up ribbons, and that caused Structure::make_chains() to be called.
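As an illustration only (this is not the change that closed the ticket, which is not described here), one general way to avoid this kind of hidden trigger is to give cleanup code a lookup that never builds chains. All names below are hypothetical, and the sketch is in Python rather than the library's C++.

```python
class ToyStructure:
    """Hypothetical stand-in; names and behavior are illustrative only."""

    def __init__(self):
        self._chains = None            # None means "chains not built yet"
        self._residue_to_chain = {}

    def make_chains(self):
        if self._chains is None:
            self._chains = []

    def chain_of(self, residue):
        # Getter with a side effect: forces chain construction.
        self.make_chains()
        return self._residue_to_chain.get(residue)

    def chain_of_if_built(self, residue):
        # Side-effect-free lookup: never triggers make_chains().
        if self._chains is None:
            return None
        return self._residue_to_chain.get(residue)

    def delete_atoms(self, residue, peek):
        lookup = self.chain_of_if_built if peek else self.chain_of
        chain = lookup(residue)
        if chain is not None:
            pass  # ribbon cleanup for the chain would go here


def chains_built_after_delete(peek):
    s = ToyStructure()
    s.delete_atoms("res1", peek)
    return s._chains is not None
```

With peek=False the deletion builds chains as a side effect; with peek=True the structure is left untouched, so a later set_input_seq_info() would still be the first to create them.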
