Opened 3 years ago

Closed 4 months ago

#8483 closed defect (fixed)

Merging fragments issues

Reported by: Tristan Croll Owned by: Eric Pettersen
Priority: normal Milestone:
Component: Structure Editing Version:
Keywords: Cc:
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description (last modified by Eric Pettersen)

The following bug report has been submitted:
Platform:        Linux-5.15.0-58-generic-x86_64-with-glibc2.35
ChimeraX Version: 1.6.dev202301180300 (2023-01-18 03:00:03 UTC)
Description
It's high time I started giving users of ISOLDE the ability to merge in fragments from a different model (that is, pieces that may or may not be inserted into an existing chain, not just adding new chains). My existing code for this (at https://github.com/tristanic/isolde/blob/0399d0b4c9473004b321f0a816bafe26469dfb99/isolde/src/atomic/building/merge.py#L2) is currently hidden away in the API and (to my knowledge) used only by me, because it provides too many ways to break things ~irreparably. Was hoping to get some advice/pointers from Eric regarding the building and, in particular, sequence APIs - and also report a few things that could be considered bugs in ChimeraX.

In no particular order:

- accidentally linking backbone atoms in "unexpected" ways (e.g. C from one amino acid residue to N from a different chain) has really nasty effects. Perhaps the most benign is what happened when I accidentally linked the terminal C from a newly-added fragment in chain E back to somewhere in the middle of chain D (due to a typo when setting the new `anchor_c` residue) - deleting the offending bond created a missing-structure pseudobond, and when I saved the structure to .pdb both chains were written as chain E. The worst-case scenario is that ChimeraX dies with a segmentation fault. I can of course modify my own code to sanity-check the inputs, but perhaps it would be better for mistakes like this to be blocked by the ChimeraX API?

- if I open a model with no definitive sequence defined (e.g. a PDB file with no SEQRES records), then merging a fragment into an existing chain with the method linked above overwrites that chain's sequence with just the sequence of the new fragment. This breaks matchmaker for that chain until I regenerate `chain.characters`. 

- when a chain *does* have a definitive sequence, newly-added residues aren't updated in the sequence viewer (i.e. it still shows them as missing, and doesn't highlight sequence mismatches).

Happy to put in the legwork to fix the above, but just hoped you could get me pointed in the right direction. Looking back at the above, the most pressing question boils down to: how do I go about updating the sequence, sequence-to-residue mapping etc. so that the sequence viewer will correctly reflect changes?

OpenGL version: 3.3.0 NVIDIA 515.86.01
OpenGL renderer: NVIDIA GeForce RTX 3070/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation

Python: 3.9.11
Locale: en_GB.UTF-8
Qt version: PyQt6 6.4.0, Qt 6.4.0
Qt runtime version: 6.4.2
Qt platform: xcb

XDG_SESSION_TYPE=x11
DESKTOP_SESSION=ubuntu
XDG_SESSION_DESKTOP=ubuntu
XDG_CURRENT_DESKTOP=ubuntu:GNOME
DISPLAY=:1
Manufacturer: Dell Inc.
Model: XPS 8950
OS: Ubuntu 22.04 Jammy Jellyfish
Architecture: 64bit ELF
Virtual Machine: none
CPU: 20 12th Gen Intel(R) Core(TM) i7-12700
Cache Size: 25600 KB
Memory:
	               total        used        free      shared  buff/cache   available
	Mem:            31Gi       6.8Gi       9.5Gi       327Mi        14Gi        23Gi
	Swap:          2.0Gi       2.0Mi       2.0Gi

Graphics:
	0000:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] [10de:2488] (rev a1)	
	Subsystem: Dell GA104 [GeForce RTX 3070 Lite Hash Rate] [1028:c903]	
	Kernel driver in use: nvidia

Installed Packages:
    alabaster: 0.7.13
    appdirs: 1.4.4
    asttokens: 2.2.1
    Babel: 2.11.0
    backcall: 0.2.0
    blockdiag: 3.0.0
    build: 0.8.0
    certifi: 2022.12.7
    cftime: 1.6.2
    charset-normalizer: 2.1.1
    ChimeraX-AddCharge: 1.5.8
    ChimeraX-AddH: 2.2.3
    ChimeraX-AlignmentAlgorithms: 2.0.1
    ChimeraX-AlignmentHdrs: 3.3.1
    ChimeraX-AlignmentMatrices: 2.1
    ChimeraX-Alignments: 2.8
    ChimeraX-AlphaFold: 1.0
    ChimeraX-AltlocExplorer: 1.0.3
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.1
    ChimeraX-Atomic: 1.43.5
    ChimeraX-AtomicLibrary: 10.0.1
    ChimeraX-AtomSearch: 2.0.1
    ChimeraX-AxesPlanes: 2.3.2
    ChimeraX-BasicActions: 1.1.2
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.1.2
    ChimeraX-BondRot: 2.0.1
    ChimeraX-BugReporter: 1.0.1
    ChimeraX-BuildStructure: 2.7.2
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.2
    ChimeraX-ButtonPanel: 1.0.1
    ChimeraX-CageBuilder: 1.0.1
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.3.2
    ChimeraX-ChangeChains: 1.0.2
    ChimeraX-CheckWaters: 1.3.1
    ChimeraX-ChemGroup: 2.0
    ChimeraX-Clashes: 2.2.4
    ChimeraX-Clipper: 0.20.0
    ChimeraX-ColorActions: 1.0.3
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.3
    ChimeraX-CommandLine: 1.2.5
    ChimeraX-ConnectStructure: 2.0.1
    ChimeraX-Contacts: 1.0.1
    ChimeraX-Core: 1.6.dev202301180300
    ChimeraX-CoreFormats: 1.1
    ChimeraX-coulombic: 1.4.2
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0.1
    ChimeraX-DataFormats: 1.2.3
    ChimeraX-Dicom: 1.1
    ChimeraX-DistMonitor: 1.3.1
    ChimeraX-DockPrep: 1.1
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ESMFold: 1.0
    ChimeraX-ExperimentalCommands: 1.0
    ChimeraX-FileHistory: 1.0.1
    ChimeraX-FunctionKey: 1.0.1
    ChimeraX-Geometry: 1.2.1
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.1.1
    ChimeraX-Hbonds: 2.4
    ChimeraX-Help: 1.2.1
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ISOLDE: 1.6.dev0
    ChimeraX-ItemsInspection: 1.0.1
    ChimeraX-Label: 1.1.7
    ChimeraX-LinuxSupport: 1.0.1
    ChimeraX-ListInfo: 1.1.1
    ChimeraX-Log: 1.1.5
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.8.2
    ChimeraX-Map: 1.1.4
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0.1
    ChimeraX-MapFilter: 2.0.1
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1.1
    ChimeraX-Markers: 1.0.1
    ChimeraX-Mask: 1.0.2
    ChimeraX-MatchMaker: 2.0.11
    ChimeraX-MDcrds: 2.6
    ChimeraX-MedicalToolbar: 1.0.2
    ChimeraX-Meeting: 1.0.1
    ChimeraX-MLP: 1.1.1
    ChimeraX-mmCIF: 2.11
    ChimeraX-MMTF: 2.2
    ChimeraX-Modeller: 1.5.8
    ChimeraX-ModelPanel: 1.3.6
    ChimeraX-ModelSeries: 1.0.1
    ChimeraX-Mol2: 2.0
    ChimeraX-Mole: 1.0
    ChimeraX-Morph: 1.0.2
    ChimeraX-MouseModes: 1.2
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nucleotides: 2.0.3
    ChimeraX-OpenCommand: 1.10
    ChimeraX-PDB: 2.6.12
    ChimeraX-PDBBio: 1.0
    ChimeraX-PDBLibrary: 1.0.2
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PickBlobs: 1.0.1
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1
    ChimeraX-PubChem: 2.1
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1.1
    ChimeraX-RemoteControl: 1.0
    ChimeraX-RenderByAttr: 1.0
    ChimeraX-RenumberResidues: 1.1
    ChimeraX-ResidueFit: 1.0.1
    ChimeraX-RestServer: 1.1
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 3.0
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5.1
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0.1
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0.1
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.8.1
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0.1
    ChimeraX-Shortcuts: 1.1.1
    ChimeraX-ShowSequences: 1.0.1
    ChimeraX-SideView: 1.0.1
    ChimeraX-Smiles: 2.1
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.10.1
    ChimeraX-STL: 1.0.1
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.1.1
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0.1
    ChimeraX-SwapAA: 2.0.1
    ChimeraX-SwapRes: 2.2
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-Toolbar: 1.1.2
    ChimeraX-ToolshedUtils: 1.2.1
    ChimeraX-Topography: 1.0
    ChimeraX-Tug: 1.0.1
    ChimeraX-UI: 1.26
    ChimeraX-uniprot: 2.2.2
    ChimeraX-UnitCell: 1.0.1
    ChimeraX-ViewDockX: 1.1.6
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0.1
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0.2
    ChimeraX-WebServices: 1.1.1
    ChimeraX-Zone: 1.0.1
    colorama: 0.4.5
    comm: 0.1.2
    contourpy: 1.0.7
    cxservices: 1.2
    cycler: 0.11.0
    Cython: 0.29.32
    debugpy: 1.6.5
    decorator: 5.1.1
    distro: 1.7.0
    docutils: 0.19
    entrypoints: 0.4
    executing: 1.2.0
    filelock: 3.7.1
    fonttools: 4.38.0
    funcparserlib: 1.0.1
    grako: 3.16.5
    h5py: 3.7.0
    html2text: 2020.1.16
    idna: 3.4
    ihm: 0.35
    imagecodecs: 2022.9.26
    imagesize: 1.4.1
    importlib-metadata: 6.0.0
    ipykernel: 6.19.2
    ipython: 8.7.0
    ipython-genutils: 0.2.0
    jedi: 0.18.2
    Jinja2: 3.1.2
    jupyter-client: 7.4.8
    jupyter-core: 5.1.3
    kiwisolver: 1.4.4
    line-profiler: 3.5.1
    lxml: 4.9.1
    lz4: 4.0.2
    MarkupSafe: 2.1.2
    matplotlib: 3.6.2
    matplotlib-inline: 0.1.6
    msgpack: 1.0.4
    nest-asyncio: 1.5.6
    netCDF4: 1.6.0
    networkx: 2.8.8
    numexpr: 2.8.4
    numpy: 1.23.5
    openvr: 1.23.701
    packaging: 23.0
    ParmEd: 3.4.3
    parso: 0.8.3
    pep517: 0.13.0
    pexpect: 4.8.0
    pickleshare: 0.7.5
    Pillow: 9.3.0
    pip: 22.2.2
    pkginfo: 1.8.3
    platformdirs: 2.6.2
    prompt-toolkit: 3.0.36
    psutil: 5.9.4
    ptyprocess: 0.7.0
    pure-eval: 0.2.2
    pycollada: 0.7.2
    pydicom: 2.3.0
    Pygments: 2.12.0
    PyOpenGL: 3.1.5
    PyOpenGL-accelerate: 3.1.5
    pyparsing: 3.0.9
    PyQt6-commercial: 6.4.0
    PyQt6-Qt6: 6.4.2
    PyQt6-sip: 13.4.0
    PyQt6-WebEngine-commercial: 6.4.0
    PyQt6-WebEngine-Qt6: 6.4.2
    python-dateutil: 2.8.2
    pytz: 2022.7.1
    pyzmq: 25.0.0
    qtconsole: 5.4.0
    QtPy: 2.3.0
    RandomWords: 0.4.0
    rdkit-pypi: 2022.9.2
    requests: 2.28.1
    scipy: 1.9.3
    setuptools: 65.1.1
    sfftk-rw: 0.7.2
    six: 1.16.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    Sphinx: 5.1.1
    sphinx-autodoc-typehints: 1.19.1
    sphinxcontrib-applehelp: 1.0.3
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.2
    sphinxcontrib-htmlhelp: 2.0.0
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.3
    sphinxcontrib-serializinghtml: 1.1.5
    stack-data: 0.6.2
    tables: 3.7.0
    tifffile: 2022.10.10
    tinyarray: 1.2.4
    tomli: 2.0.1
    tornado: 6.2
    traitlets: 5.8.0
    urllib3: 1.26.14
    wcwidth: 0.2.6
    webcolors: 1.12
    wheel: 0.37.1
    wheel-filename: 1.4.1
    zipp: 3.11.0

Change History (7)

comment:1 by Eric Pettersen, 3 years ago

Component: UnassignedStructure Editing
Owner: set to Eric Pettersen
Platform: all
Project: ChimeraX
Status: newaccepted
Summary: ChimeraX bug report submissionMerging fragments issues

comment:2 by Eric Pettersen, 3 years ago

The three behaviors you describe all sound like bugs and I will fix them, but it could be awhile before I get to them. Nonetheless, to address your actual main question, to set/correct the sequence information for a chain you use chain.bulk_set(residues, characters) where characters can be a list or a string. Residues can contain None values if you are giving the full sequence but some of the residues of the sequence have no structure. If you are specifying the full sequence you should probably also set chain.from_seqres to True, so that SEQRES records will be written out in PDB format. Conversely, set it to False for partial sequence information.

comment:3 by Eric Pettersen, 14 months ago

Description: modified (diff)

First of the three problems (unwanted chain merge) fixed.

Fix: https://github.com/RBVI/ChimeraX/commit/a7f6e47e63f25668b6460e5bc31c0de0921db2e5

comment:4 by Eric Pettersen, 14 months ago

Problem 2 (sequence merging) no longer seems to be an issue.

comment:5 by Eric Pettersen, 4 months ago

First part of the third issue was that the sequences weren't properly being combined in the first place.

The fix for that is: https://github.com/RBVI/ChimeraX/commit/c499042558193a9f783318e63bba9d6cace61639

comment:6 by Eric Pettersen, 4 months ago

At least some residue additions now update the sequence viewer. I only did some minimal testing, so there could still be "blind spots". File more tickets (with examples!) if you run into any.

Fix: https://github.com/RBVI/ChimeraX/commit/39b6562419e061ae89b214e6c378e9e13039d81e

comment:7 by Eric Pettersen, 4 months ago

Resolution: fixed
Status: acceptedclosed
Note: See TracTickets for help on using tickets.