#15106 closed defect (limitation)

Sequence viewer structure association has mismatches for identical subsequence

Reported by: goddard@… Owned by: pett
Priority: normal Milestone:
Component: Sequence Version:
Keywords: Cc:
Blocked By: Blocking:
Notify when closed: Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        macOS-14.4.1-arm64-arm-64bit
ChimeraX Version: 1.8.dev202405030711 (2024-05-03 07:11:24 UTC)
Description
I'm puzzled that Sequence Viewer declares there are 88 mismatches when trying to associate AlphaFold model A2G0Z0 with an exactly identical subsequence of A2G0Z0 loaded from a fasta file.  I recall Eric told me that associating a structure with a subsequence of that structure may not do a good job, but this seems kind of extreme given that the subsequence exactly matches part of the structure sequence.  I've attached the fasta file.  The AlphaFold model can be opened with "open A2G0Z0 from alphafold".

The reason I have a subsequence in the fasta file is because it came from a sequence search of a protein with just one of the two domains of A2G0Z0.  This seems like a common scenario where structures are found by sequence search that are larger than the query sequence.  So it would be nice if such structures can associate with their own exactly matching subsequence in the alignment.

Log:
UCSF ChimeraX version: 1.8.dev202405030711 (2024-05-03)  
© 2016-2024 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> open /Users/goddard/ucsf/umap/braf_alphafold/af292_env/a2g0z0.fasta

Summary of feedback from opening
/Users/goddard/ucsf/umap/braf_alphafold/af292_env/a2g0z0.fasta  
---  
note | Alignment identifier is a2g0z0.fasta  
  
Opened 1 sequence from a2g0z0.fasta  

> open a2g0z0 fromDatabase alphafold

Chain information for AlphaFold A2G0Z0 #1  
---  
Chain | Description | UniProt  
A | TKL family protein kinase | A2G0Z0_TRIVA 1-1028  
  
Color AlphaFold A2G0Z0 by residue attribute pLDDT_score  
Associated AlphaFold A2G0Z0 chain A to A2G0Z0 with 88 mismatches  




OpenGL version: 4.1 Metal - 88
OpenGL renderer: Apple M2 Ultra
OpenGL vendor: Apple

Python: 3.11.4
Locale: UTF-8
Qt version: PyQt6 6.6.1, Qt 6.6.1
Qt runtime version: 6.6.3
Qt platform: cocoa
Hardware:

    Hardware Overview:

      Model Name: Mac Studio
      Model Identifier: Mac14,14
      Model Number: Z1800003VLL/A
      Chip: Apple M2 Ultra
      Total Number of Cores: 24 (16 performance and 8 efficiency)
      Memory: 64 GB
      System Firmware Version: 10151.101.3
      OS Loader Version: 10151.101.3

Software:

    System Software Overview:

      System Version: macOS 14.4.1 (23E224)
      Kernel Version: Darwin 23.4.0
      Time since boot: 1 day, 7 hours, 54 minutes

Graphics/Displays:

    Apple M2 Ultra:

      Chipset Model: Apple M2 Ultra
      Type: GPU
      Bus: Built-In
      Total Number of Cores: 60
      Vendor: Apple (0x106b)
      Metal Support: Metal 3
      Displays:
        PHL 278B1:
          Resolution: 3840 x 2160 (2160p/4K UHD 1 - Ultra High Definition)
          UI Looks like: 1920 x 1080 @ 60.00Hz
          Main Display: Yes
          Mirror: Off
          Online: Yes
          Rotation: Supported


Installed Packages:
    aiobotocore: 2.12.3
    aiohttp: 3.9.5
    aioitertools: 0.11.0
    aiosignal: 1.3.1
    alabaster: 0.7.16
    alphashape: 1.3.1
    annotated-types: 0.6.0
    appdirs: 1.4.4
    appnope: 0.1.4
    asciitree: 0.3.3
    asttokens: 2.4.1
    attrs: 23.2.0
    Babel: 2.14.0
    beautifulsoup4: 4.12.3
    biopython: 1.83
    blockdiag: 3.0.0
    blosc2: 2.0.0
    botocore: 1.34.69
    build: 1.2.1
    certifi: 2023.11.17
    cffi: 1.16.0
    cftime: 1.6.3
    charset-normalizer: 3.3.2
    ChimeraX-AddCharge: 1.5.17
    ChimeraX-AddH: 2.2.6
    ChimeraX-AlignmentAlgorithms: 2.0.2
    ChimeraX-AlignmentHdrs: 3.5
    ChimeraX-AlignmentMatrices: 2.1
    ChimeraX-Alignments: 2.12.6
    ChimeraX-AlphaFold: 1.0
    ChimeraX-AltlocExplorer: 1.1.1
    ChimeraX-AmberInfo: 1.0
    ChimeraX-Arrays: 1.1
    ChimeraX-ArtiaX: 0.4.5
    ChimeraX-Atomic: 1.57
    ChimeraX-AtomicLibrary: 14.0.3
    ChimeraX-AtomSearch: 2.0.1
    ChimeraX-AxesPlanes: 2.4
    ChimeraX-BasicActions: 1.1.2
    ChimeraX-BILD: 1.0
    ChimeraX-BlastProtein: 2.4.5
    ChimeraX-BondRot: 2.0.4
    ChimeraX-BugReporter: 1.0.1
    ChimeraX-BuildStructure: 2.12.1
    ChimeraX-Bumps: 1.0
    ChimeraX-BundleBuilder: 1.2.3
    ChimeraX-ButtonPanel: 1.0.1
    ChimeraX-CageBuilder: 1.0.1
    ChimeraX-CellPack: 1.0
    ChimeraX-Centroids: 1.4
    ChimeraX-ChangeChains: 1.1
    ChimeraX-CheckWaters: 1.4
    ChimeraX-ChemGroup: 2.0.1
    ChimeraX-Clashes: 2.2.4
    ChimeraX-Clipper: 0.23.0
    ChimeraX-ColorActions: 1.0.4
    ChimeraX-ColorGlobe: 1.0
    ChimeraX-ColorKey: 1.5.5
    ChimeraX-CommandLine: 1.2.5
    ChimeraX-ConnectStructure: 2.0.1
    ChimeraX-Contacts: 1.0.1
    ChimeraX-copick: 0.1.0
    ChimeraX-Core: 1.8.dev202405030711
    ChimeraX-CoreFormats: 1.2
    ChimeraX-coulombic: 1.4.3
    ChimeraX-crai: 0.3
    ChimeraX-Crosslinks: 1.0
    ChimeraX-Crystal: 1.0
    ChimeraX-CrystalContacts: 1.0.1
    ChimeraX-DataFormats: 1.2.3
    ChimeraX-Dicom: 1.2
    ChimeraX-DiffPlot: 1.0
    ChimeraX-DistMonitor: 1.4.2
    ChimeraX-DockPrep: 1.1.3
    ChimeraX-Dssp: 2.0
    ChimeraX-EMDB-SFF: 1.0
    ChimeraX-ESMFold: 1.0
    ChimeraX-FileHistory: 1.0.1
    ChimeraX-FunctionKey: 1.0.1
    ChimeraX-Geometry: 1.3
    ChimeraX-gltf: 1.0
    ChimeraX-Graphics: 1.1.1
    ChimeraX-Hbonds: 2.4
    ChimeraX-Help: 1.2.2
    ChimeraX-HKCage: 1.3
    ChimeraX-IHM: 1.1
    ChimeraX-ImageFormats: 1.2
    ChimeraX-IMOD: 1.0
    ChimeraX-IO: 1.0.1
    ChimeraX-ItemsInspection: 1.0.1
    ChimeraX-IUPAC: 1.0
    ChimeraX-Label: 1.1.9
    ChimeraX-ListInfo: 1.2.2
    ChimeraX-Log: 1.1.6
    ChimeraX-LookingGlass: 1.1
    ChimeraX-Maestro: 1.9.1
    ChimeraX-Map: 1.2
    ChimeraX-MapData: 2.0
    ChimeraX-MapEraser: 1.0.1
    ChimeraX-MapFilter: 2.0.1
    ChimeraX-MapFit: 2.0
    ChimeraX-MapSeries: 2.1.1
    ChimeraX-Markers: 1.0.1
    ChimeraX-Mask: 1.0.2
    ChimeraX-MatchMaker: 2.1.3
    ChimeraX-MCopy: 1.0
    ChimeraX-MDcrds: 2.7
    ChimeraX-MedicalToolbar: 1.0.2
    ChimeraX-Meeting: 1.0.1
    ChimeraX-MLP: 1.1.1
    ChimeraX-mmCIF: 2.14.1
    ChimeraX-MMTF: 2.2
    ChimeraX-Modeller: 1.5.15
    ChimeraX-ModelPanel: 1.5
    ChimeraX-ModelSeries: 1.0.1
    ChimeraX-Mol2: 2.0.3
    ChimeraX-Mole: 1.0
    ChimeraX-Morph: 1.0.2
    ChimeraX-MouseModes: 1.2
    ChimeraX-Movie: 1.0
    ChimeraX-Neuron: 1.0
    ChimeraX-Nifti: 1.1
    ChimeraX-NIHPresets: 1.1.17
    ChimeraX-NMRSTAR: 1.0.2
    ChimeraX-NRRD: 1.1
    ChimeraX-Nucleotides: 2.0.3
    ChimeraX-OME-Zarr: 0.5.3
    ChimeraX-OpenCommand: 1.13.4
    ChimeraX-PDB: 2.7.5
    ChimeraX-PDBBio: 1.0.1
    ChimeraX-PDBLibrary: 1.0.4
    ChimeraX-PDBMatrices: 1.0
    ChimeraX-PhenixUI: 1.2.2
    ChimeraX-PickBlobs: 1.0.1
    ChimeraX-Positions: 1.0
    ChimeraX-PresetMgr: 1.1.1
    ChimeraX-PubChem: 2.2
    ChimeraX-ReadPbonds: 1.0.1
    ChimeraX-Registration: 1.1.2
    ChimeraX-RemoteControl: 1.0
    ChimeraX-RenderByAttr: 1.4.1
    ChimeraX-RenumberResidues: 1.1
    ChimeraX-ResidueFit: 1.0.1
    ChimeraX-RestServer: 1.2
    ChimeraX-RNALayout: 1.0
    ChimeraX-RotamerLibMgr: 4.0
    ChimeraX-RotamerLibsDunbrack: 2.0
    ChimeraX-RotamerLibsDynameomics: 2.0
    ChimeraX-RotamerLibsRichardson: 2.0
    ChimeraX-SaveCommand: 1.5.1
    ChimeraX-SchemeMgr: 1.0
    ChimeraX-SDF: 2.0.2
    ChimeraX-Segger: 1.0
    ChimeraX-Segment: 1.0.1
    ChimeraX-Segmentations: 1.0.4
    ChimeraX-SelInspector: 1.0
    ChimeraX-SeqView: 2.11.2
    ChimeraX-Shape: 1.0.1
    ChimeraX-Shell: 1.0.1
    ChimeraX-Shortcuts: 1.1.1
    ChimeraX-ShowSequences: 1.0.3
    ChimeraX-SideView: 1.0.1
    ChimeraX-Smiles: 2.1.2
    ChimeraX-SmoothLines: 1.0
    ChimeraX-SpaceNavigator: 1.0
    ChimeraX-StdCommands: 1.16.4
    ChimeraX-STL: 1.0.1
    ChimeraX-Storm: 1.0
    ChimeraX-StructMeasure: 1.2.1
    ChimeraX-Struts: 1.0.1
    ChimeraX-Surface: 1.0.1
    ChimeraX-SwapAA: 2.0.1
    ChimeraX-SwapRes: 2.5
    ChimeraX-TapeMeasure: 1.0
    ChimeraX-TaskManager: 1.0
    ChimeraX-Test: 1.0
    ChimeraX-TetraScapeCommand: 0.1
    ChimeraX-Toolbar: 1.1.2
    ChimeraX-ToolshedUtils: 1.2.4
    ChimeraX-Topography: 1.0
    ChimeraX-ToQuest: 1.0
    ChimeraX-Tug: 1.0.1
    ChimeraX-UI: 1.38
    ChimeraX-uniprot: 2.3
    ChimeraX-UnitCell: 1.0.1
    ChimeraX-ViewDockX: 1.4.1
    ChimeraX-VIPERdb: 1.0
    ChimeraX-Vive: 1.1
    ChimeraX-VolumeMenu: 1.0.1
    ChimeraX-vrml: 1.0
    ChimeraX-VTK: 1.0
    ChimeraX-WavefrontOBJ: 1.0
    ChimeraX-WebCam: 1.0.2
    ChimeraX-WebServices: 1.1.3
    ChimeraX-Zone: 1.0.1
    click: 8.1.7
    click-log: 0.4.0
    cloudpickle: 3.0.0
    colorama: 0.4.6
    comm: 0.2.2
    contourpy: 1.2.1
    copick: 0.1.dev45+g5a19260
    cripser: 0.0.13
    cryptography: 42.0.5
    cxservices: 1.2.2
    cycler: 0.12.1
    Cython: 3.0.10
    dask: 2024.4.2
    debugpy: 1.8.1
    decorator: 5.1.1
    distributed: 2024.4.2
    docutils: 0.20.1
    executing: 2.0.1
    fasteners: 0.19
    filelock: 3.13.4
    fonttools: 4.51.0
    frozenlist: 1.4.1
    fsspec: 2024.3.1
    funcparserlib: 2.0.0a0
    geomdl: 5.3.1
    glfw: 2.7.0
    grako: 3.16.5
    h5py: 3.11.0
    hatchling: 1.24.2
    html2text: 2024.2.26
    idna: 3.7
    ihm: 1.0
    imagecodecs: 2024.1.1
    imageio: 2.34.1
    imagesize: 1.4.1
    importlib-metadata: 7.1.0
    ipykernel: 6.29.2
    ipython: 8.21.0
    ipywidgets: 8.1.2
    jedi: 0.19.1
    Jinja2: 3.1.3
    jmespath: 1.0.1
    joblib: 1.4.0
    jupyter-client: 8.6.0
    jupyter-core: 5.7.2
    jupyterlab-widgets: 3.0.10
    kiwisolver: 1.4.5
    lazy-loader: 0.4
    line-profiler: 4.1.2
    llvmlite: 0.42.0
    locket: 1.0.0
    lxml: 5.2.1
    lz4: 4.3.3
    MarkupSafe: 2.1.5
    matplotlib: 3.8.4
    matplotlib-inline: 0.1.7
    mpmath: 1.3.0
    mrcfile: 1.5.0
    msgpack: 1.0.8
    multidict: 6.0.5
    nest-asyncio: 1.6.0
    netCDF4: 1.6.5
    networkx: 3.3
    nibabel: 5.2.0
    nptyping: 2.5.0
    numba: 0.59.1
    numcodecs: 0.12.1
    numexpr: 2.10.0
    numpy: 1.26.4
    ome-zarr: 0.8.3
    openvr: 1.26.701
    packaging: 23.2
    pandas: 2.2.2
    ParmEd: 4.2.2
    parso: 0.8.4
    partd: 1.4.1
    pathspec: 0.12.1
    pep517: 0.13.1
    pexpect: 4.9.0
    pillow: 10.3.0
    pip: 24.0
    pkginfo: 1.10.0
    platformdirs: 4.2.1
    pluggy: 1.5.0
    prompt-toolkit: 3.0.43
    psutil: 5.9.8
    ptyprocess: 0.7.0
    pure-eval: 0.2.2
    py-cpuinfo: 9.0.0
    pyarrow: 16.0.0
    pycollada: 0.8
    pycparser: 2.22
    pydantic: 2.7.1
    pydantic-core: 2.18.2
    pydicom: 2.4.4
    pygments: 2.17.2
    pynmrstar: 3.3.4
    pynndescent: 0.5.12
    pynrrd: 1.0.0
    PyOpenGL: 3.1.7
    PyOpenGL-accelerate: 3.1.7
    pyopenxr: 1.0.3401
    pyparsing: 3.1.2
    pyproject-hooks: 1.1.0
    PyQt6-commercial: 6.6.1
    PyQt6-Qt6: 6.6.3
    PyQt6-sip: 13.6.0
    PyQt6-WebEngine-commercial: 6.6.0
    PyQt6-WebEngine-Qt6: 6.6.3
    pyspnego: 0.10.2
    python-dateutil: 2.9.0.post0
    pytz: 2024.1
    PyYAML: 6.0.1
    pyzmq: 26.0.3
    qtconsole: 5.5.1
    QtPy: 2.4.1
    RandomWords: 0.4.0
    requests: 2.31.0
    Rtree: 1.1.0
    s3fs: 2024.3.1
    scikit-image: 0.23.2
    scikit-learn: 1.4.2
    scipy: 1.13.0
    setuptools: 69.5.1
    setuptools-scm: 8.0.4
    sfftk-rw: 0.8.1
    shapely: 2.0.2
    six: 1.16.0
    smbprotocol: 1.13.0
    snowballstemmer: 2.2.0
    sortedcontainers: 2.4.0
    soupsieve: 2.5
    sphinx: 7.2.6
    sphinx-autodoc-typehints: 2.0.1
    sphinxcontrib-applehelp: 1.0.8
    sphinxcontrib-blockdiag: 3.0.0
    sphinxcontrib-devhelp: 1.0.6
    sphinxcontrib-htmlhelp: 2.0.5
    sphinxcontrib-jsmath: 1.0.1
    sphinxcontrib-qthelp: 1.0.7
    sphinxcontrib-serializinghtml: 1.1.10
    stack-data: 0.6.3
    starfile: 0.5.6
    superqt: 0.6.3
    sympy: 1.12
    tables: 3.8.0
    tblib: 3.0.0
    tcia-utils: 1.5.1
    threadpoolctl: 3.4.0
    tifffile: 2024.1.30
    tinyarray: 1.2.4
    toolz: 0.12.1
    torch: 2.2.2
    tornado: 6.4
    tqdm: 4.66.2
    traitlets: 5.14.2
    trimesh: 4.0.10
    trove-classifiers: 2024.4.10
    typing-extensions: 4.11.0
    tzdata: 2024.1
    umap-learn: 0.5.6
    urllib3: 2.2.1
    wcwidth: 0.2.13
    webcolors: 1.13
    wheel: 0.43.0
    wheel-filename: 1.4.1
    widgetsnbextension: 4.0.10
    wrapt: 1.16.0
    yarl: 1.9.4
    zarr: 2.17.2
    zict: 3.0.0
    zipp: 3.18.1
File attachment: a2g0z0.fasta

Attachments (1)

a2g0z0.fasta (301 bytes ) - added by goddard@… 18 months ago.
Added by email2trac


Change History (7)

by goddard@…, 18 months ago

Attachment: a2g0z0.fasta added

Added by email2trac

comment:1 by Tom Goddard, 18 months ago

Component: Unassigned → Sequence
Owner: set to pett
Platform: all
Project: ChimeraX
Status: new → assigned
Summary: ChimeraX bug report submission → Sequence viewer structure association has mismatches for identical subsequence

comment:2 by Tom Goddard, 18 months ago

The fasta sequence is missing the N and C termini of the structure sequence and is also missing a 3-residue segment and a 2-residue segment. I guess the structure having a 3-residue and a 2-residue insertion is what is tripping up the association of the sequences.
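
A minimal sketch of why the internal gaps matter, assuming "true subsequence" in this ticket means one unbroken stretch of the chain. The sequences and helper names below are made up for illustration; they are not the real A2G0Z0 residues and not part of ChimeraX.

```python
# Toy chain sequence (hypothetical, NOT the real A2G0Z0 sequence).
chain_seq = "MKTAYIAKQRQISFVKSHFSRQ"

# A contiguous stretch of the chain: only the termini are trimmed.
contiguous = chain_seq[3:18]

# The same stretch with a 3-residue and a 2-residue internal segment removed,
# mimicking the situation described in this ticket.
with_deletions = chain_seq[3:8] + chain_seq[11:14] + chain_seq[16:18]


def is_contiguous_subsequence(query: str, full: str) -> bool:
    """True if query occurs in full as one unbroken stretch."""
    return query in full


def is_gapped_subsequence(query: str, full: str) -> bool:
    """True if query can be obtained from full by deleting residues."""
    remaining = iter(full)
    # Each 'in' test consumes the iterator up to and including the match,
    # so residues must be found in order.
    return all(residue in remaining for residue in query)


print(is_contiguous_subsequence(contiguous, chain_seq))
print(is_contiguous_subsequence(with_deletions, chain_seq))
print(is_gapped_subsequence(with_deletions, chain_seq))
```

Trimming only the termini keeps the sequence a contiguous stretch of the chain, while removing internal segments leaves every residue identical but breaks contiguity.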

comment:3 by Tom Goddard, 18 months ago

If I force it to associate with Needleman-Wunsch then it associates with 0 mismatches.

It is not easy to get it to associate with Needleman-Wunsch. The only way I could figure out to do it is to change seqalign settings assoc_error_rate and rerun Alignment.associate(). It would be nice if I could ask it to associate with Needleman-Wunsch using an optional argument to Alignment.associate().
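
For illustration, here is a toy Needleman-Wunsch global alignment showing how gaps can absorb missing internal segments so that no mismatches remain. This is a sketch only: it is not the ChimeraX implementation, and the sequences, function names, and scoring values (match=1, mismatch=-1, gap=-1) are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner along one optimal path.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def count_mismatches(aligned_a, aligned_b):
    """Count aligned columns where both sides have a residue and they differ."""
    return sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x != "-" and y != "-" and x != y
    )


# Hypothetical toy data: the query lacks both termini plus a 3-residue and
# a 2-residue internal segment of the chain.
chain_seq = "MKTAYIAKQRQISFVKSHFSRQ"
query_seq = "AYIAKISFSH"

aligned_chain, aligned_query = needleman_wunsch(chain_seq, query_seq)
print(aligned_chain)
print(aligned_query)
print("mismatches:", count_mismatches(aligned_chain, aligned_query))
```

Because every query residue can be placed opposite an identical chain residue, any optimal alignment under this scoring matches all ten query residues and puts gaps opposite the rest of the chain.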

comment:4 by pett, 18 months ago

Yes, the insertions are causing the mismatches. Bluntly put, the sequence in the alignment is not the sequence of the structure, nor is it a subsequence.

Your best path to force NW association is to not use associate() at all, instead:

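from chimerax.seqalign.alignment import nw_assoc
# alignment_seq is the sequence in the alignment; chain is the structure chain
match_map, errors = nw_assoc(session, alignment_seq, chain)
# hand the precomputed match map to the alignment
alignment.prematched_assoc_structure(match_map, errors, False)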

comment:5 by goddard@…, 18 months ago

Ok, thanks for the tip.  I was thinking I would check whether the association produces mismatches and, in that case, run Needleman-Wunsch.  But maybe I will just always use Needleman-Wunsch, since I know which sequence in the alignment to match to and it was quite fast for these nearly identical sequences.

comment:6 by pett, 18 months ago

Resolution: limitation
Status: assigned → closed

Yes, sorry, but it is a basic assumption that the sequence in the alignment is either the full sequence of the chain or a true subsequence.
