Opened 18 months ago
Closed 18 months ago
#15106 closed defect (limitation)
Sequence viewer structure association has mismatches for identical subsequence
Reported by: | Owned by: | pett | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Sequence | Version: | |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
The following bug report has been submitted:
Platform: macOS-14.4.1-arm64-arm-64bit
ChimeraX Version: 1.8.dev202405030711 (2024-05-03 07:11:24 UTC)
Description
I'm puzzled that Sequence Viewer declares there are 88 mismatches when trying to associate AlphaFold model A2G0Z0 to an exact identical subsequence of A2G0Z0 loaded from a fasta file. I recall Eric told me that associating a structure to a subsequence of that structure may not do a good job, but this seems kind of extreme given that the subsequence exactly matches part of the structure sequence. I've attached the fasta file. The AlphaFold model can be opened with "open A2G0Z0 from alphafold".

The reason I have a subsequence in the fasta file is because it came from a sequence search of a protein with just one of the two domains of A2G0Z0. This seems like a common scenario where structures are found by sequence search that are larger than the query sequence. So it would be nice if such structures can associate with their own exactly matching subsequence in the alignment.

Log:
UCSF ChimeraX version: 1.8.dev202405030711 (2024-05-03)
© 2016-2024 Regents of the University of California. All rights reserved.
How to cite UCSF ChimeraX
> open /Users/goddard/ucsf/umap/braf_alphafold/af292_env/a2g0z0.fasta
Summary of feedback from opening /Users/goddard/ucsf/umap/braf_alphafold/af292_env/a2g0z0.fasta
---
note | Alignment identifier is a2g0z0.fasta
Opened 1 sequence from a2g0z0.fasta
> open a2g0z0 fromDatabase alphafold
Chain information for AlphaFold A2G0Z0 #1
---
Chain | Description | UniProt
A | TKL family protein kinase | A2G0Z0_TRIVA 1-1028
Color AlphaFold A2G0Z0 by residue attribute pLDDT_score
Associated AlphaFold A2G0Z0 chain A to A2G0Z0 with 88 mismatches

OpenGL version: 4.1 Metal - 88
OpenGL renderer: Apple M2 Ultra
OpenGL vendor: Apple
Python: 3.11.4
Locale: UTF-8
Qt version: PyQt6 6.6.1, Qt 6.6.1
Qt runtime version: 6.6.3
Qt platform: cocoa

Hardware:
Hardware Overview:
Model Name: Mac Studio
Model Identifier: Mac14,14
Model Number: Z1800003VLL/A
Chip: Apple M2 Ultra
Total Number of Cores: 24 (16 performance and 8 efficiency)
Memory: 64 GB
System Firmware Version: 10151.101.3
OS Loader Version: 10151.101.3

Software:
System Software Overview:
System Version: macOS 14.4.1 (23E224)
Kernel Version: Darwin 23.4.0
Time since boot: 1 day, 7 hours, 54 minutes

Graphics/Displays:
Apple M2 Ultra:
Chipset Model: Apple M2 Ultra
Type: GPU
Bus: Built-In
Total Number of Cores: 60
Vendor: Apple (0x106b)
Metal Support: Metal 3
Displays:
PHL 278B1:
Resolution: 3840 x 2160 (2160p/4K UHD 1 - Ultra High Definition)
UI Looks like: 1920 x 1080 @ 60.00Hz
Main Display: Yes
Mirror: Off
Online: Yes
Rotation: Supported

Installed Packages:
aiobotocore: 2.12.3 aiohttp: 3.9.5 aioitertools: 0.11.0 aiosignal: 1.3.1 alabaster: 0.7.16 alphashape: 1.3.1 annotated-types: 0.6.0 appdirs: 1.4.4 appnope: 0.1.4 asciitree: 0.3.3 asttokens: 2.4.1 attrs: 23.2.0 Babel: 2.14.0 beautifulsoup4: 4.12.3 biopython: 1.83 blockdiag: 3.0.0 blosc2: 2.0.0 botocore: 1.34.69 build: 1.2.1 certifi: 2023.11.17 cffi: 1.16.0 cftime: 1.6.3 charset-normalizer: 3.3.2 ChimeraX-AddCharge: 1.5.17 ChimeraX-AddH: 2.2.6
ChimeraX-AlignmentAlgorithms: 2.0.2 ChimeraX-AlignmentHdrs: 3.5 ChimeraX-AlignmentMatrices: 2.1 ChimeraX-Alignments: 2.12.6 ChimeraX-AlphaFold: 1.0 ChimeraX-AltlocExplorer: 1.1.1 ChimeraX-AmberInfo: 1.0 ChimeraX-Arrays: 1.1 ChimeraX-ArtiaX: 0.4.5 ChimeraX-Atomic: 1.57 ChimeraX-AtomicLibrary: 14.0.3 ChimeraX-AtomSearch: 2.0.1 ChimeraX-AxesPlanes: 2.4 ChimeraX-BasicActions: 1.1.2 ChimeraX-BILD: 1.0 ChimeraX-BlastProtein: 2.4.5 ChimeraX-BondRot: 2.0.4 ChimeraX-BugReporter: 1.0.1 ChimeraX-BuildStructure: 2.12.1 ChimeraX-Bumps: 1.0 ChimeraX-BundleBuilder: 1.2.3 ChimeraX-ButtonPanel: 1.0.1 ChimeraX-CageBuilder: 1.0.1 ChimeraX-CellPack: 1.0 ChimeraX-Centroids: 1.4 ChimeraX-ChangeChains: 1.1 ChimeraX-CheckWaters: 1.4 ChimeraX-ChemGroup: 2.0.1 ChimeraX-Clashes: 2.2.4 ChimeraX-Clipper: 0.23.0 ChimeraX-ColorActions: 1.0.4 ChimeraX-ColorGlobe: 1.0 ChimeraX-ColorKey: 1.5.5 ChimeraX-CommandLine: 1.2.5 ChimeraX-ConnectStructure: 2.0.1 ChimeraX-Contacts: 1.0.1 ChimeraX-copick: 0.1.0 ChimeraX-Core: 1.8.dev202405030711 ChimeraX-CoreFormats: 1.2 ChimeraX-coulombic: 1.4.3 ChimeraX-crai: 0.3 ChimeraX-Crosslinks: 1.0 ChimeraX-Crystal: 1.0 ChimeraX-CrystalContacts: 1.0.1 ChimeraX-DataFormats: 1.2.3 ChimeraX-Dicom: 1.2 ChimeraX-DiffPlot: 1.0 ChimeraX-DistMonitor: 1.4.2 ChimeraX-DockPrep: 1.1.3 ChimeraX-Dssp: 2.0 ChimeraX-EMDB-SFF: 1.0 ChimeraX-ESMFold: 1.0 ChimeraX-FileHistory: 1.0.1 ChimeraX-FunctionKey: 1.0.1 ChimeraX-Geometry: 1.3 ChimeraX-gltf: 1.0 ChimeraX-Graphics: 1.1.1 ChimeraX-Hbonds: 2.4 ChimeraX-Help: 1.2.2 ChimeraX-HKCage: 1.3 ChimeraX-IHM: 1.1 ChimeraX-ImageFormats: 1.2 ChimeraX-IMOD: 1.0 ChimeraX-IO: 1.0.1 ChimeraX-ItemsInspection: 1.0.1 ChimeraX-IUPAC: 1.0 ChimeraX-Label: 1.1.9 ChimeraX-ListInfo: 1.2.2 ChimeraX-Log: 1.1.6 ChimeraX-LookingGlass: 1.1 ChimeraX-Maestro: 1.9.1 ChimeraX-Map: 1.2 ChimeraX-MapData: 2.0 ChimeraX-MapEraser: 1.0.1 ChimeraX-MapFilter: 2.0.1 ChimeraX-MapFit: 2.0 ChimeraX-MapSeries: 2.1.1 ChimeraX-Markers: 1.0.1 ChimeraX-Mask: 1.0.2 ChimeraX-MatchMaker: 
2.1.3 ChimeraX-MCopy: 1.0 ChimeraX-MDcrds: 2.7 ChimeraX-MedicalToolbar: 1.0.2 ChimeraX-Meeting: 1.0.1 ChimeraX-MLP: 1.1.1 ChimeraX-mmCIF: 2.14.1 ChimeraX-MMTF: 2.2 ChimeraX-Modeller: 1.5.15 ChimeraX-ModelPanel: 1.5 ChimeraX-ModelSeries: 1.0.1 ChimeraX-Mol2: 2.0.3 ChimeraX-Mole: 1.0 ChimeraX-Morph: 1.0.2 ChimeraX-MouseModes: 1.2 ChimeraX-Movie: 1.0 ChimeraX-Neuron: 1.0 ChimeraX-Nifti: 1.1 ChimeraX-NIHPresets: 1.1.17 ChimeraX-NMRSTAR: 1.0.2 ChimeraX-NRRD: 1.1 ChimeraX-Nucleotides: 2.0.3 ChimeraX-OME-Zarr: 0.5.3 ChimeraX-OpenCommand: 1.13.4 ChimeraX-PDB: 2.7.5 ChimeraX-PDBBio: 1.0.1 ChimeraX-PDBLibrary: 1.0.4 ChimeraX-PDBMatrices: 1.0 ChimeraX-PhenixUI: 1.2.2 ChimeraX-PickBlobs: 1.0.1 ChimeraX-Positions: 1.0 ChimeraX-PresetMgr: 1.1.1 ChimeraX-PubChem: 2.2 ChimeraX-ReadPbonds: 1.0.1 ChimeraX-Registration: 1.1.2 ChimeraX-RemoteControl: 1.0 ChimeraX-RenderByAttr: 1.4.1 ChimeraX-RenumberResidues: 1.1 ChimeraX-ResidueFit: 1.0.1 ChimeraX-RestServer: 1.2 ChimeraX-RNALayout: 1.0 ChimeraX-RotamerLibMgr: 4.0 ChimeraX-RotamerLibsDunbrack: 2.0 ChimeraX-RotamerLibsDynameomics: 2.0 ChimeraX-RotamerLibsRichardson: 2.0 ChimeraX-SaveCommand: 1.5.1 ChimeraX-SchemeMgr: 1.0 ChimeraX-SDF: 2.0.2 ChimeraX-Segger: 1.0 ChimeraX-Segment: 1.0.1 ChimeraX-Segmentations: 1.0.4 ChimeraX-SelInspector: 1.0 ChimeraX-SeqView: 2.11.2 ChimeraX-Shape: 1.0.1 ChimeraX-Shell: 1.0.1 ChimeraX-Shortcuts: 1.1.1 ChimeraX-ShowSequences: 1.0.3 ChimeraX-SideView: 1.0.1 ChimeraX-Smiles: 2.1.2 ChimeraX-SmoothLines: 1.0 ChimeraX-SpaceNavigator: 1.0 ChimeraX-StdCommands: 1.16.4 ChimeraX-STL: 1.0.1 ChimeraX-Storm: 1.0 ChimeraX-StructMeasure: 1.2.1 ChimeraX-Struts: 1.0.1 ChimeraX-Surface: 1.0.1 ChimeraX-SwapAA: 2.0.1 ChimeraX-SwapRes: 2.5 ChimeraX-TapeMeasure: 1.0 ChimeraX-TaskManager: 1.0 ChimeraX-Test: 1.0 ChimeraX-TetraScapeCommand: 0.1 ChimeraX-Toolbar: 1.1.2 ChimeraX-ToolshedUtils: 1.2.4 ChimeraX-Topography: 1.0 ChimeraX-ToQuest: 1.0 ChimeraX-Tug: 1.0.1 ChimeraX-UI: 1.38 ChimeraX-uniprot: 2.3 ChimeraX-UnitCell: 1.0.1 
ChimeraX-ViewDockX: 1.4.1 ChimeraX-VIPERdb: 1.0 ChimeraX-Vive: 1.1 ChimeraX-VolumeMenu: 1.0.1 ChimeraX-vrml: 1.0 ChimeraX-VTK: 1.0 ChimeraX-WavefrontOBJ: 1.0 ChimeraX-WebCam: 1.0.2 ChimeraX-WebServices: 1.1.3 ChimeraX-Zone: 1.0.1 click: 8.1.7 click-log: 0.4.0 cloudpickle: 3.0.0 colorama: 0.4.6 comm: 0.2.2 contourpy: 1.2.1 copick: 0.1.dev45+g5a19260 cripser: 0.0.13 cryptography: 42.0.5 cxservices: 1.2.2 cycler: 0.12.1 Cython: 3.0.10 dask: 2024.4.2 debugpy: 1.8.1 decorator: 5.1.1 distributed: 2024.4.2 docutils: 0.20.1 executing: 2.0.1 fasteners: 0.19 filelock: 3.13.4 fonttools: 4.51.0 frozenlist: 1.4.1 fsspec: 2024.3.1 funcparserlib: 2.0.0a0 geomdl: 5.3.1 glfw: 2.7.0 grako: 3.16.5 h5py: 3.11.0 hatchling: 1.24.2 html2text: 2024.2.26 idna: 3.7 ihm: 1.0 imagecodecs: 2024.1.1 imageio: 2.34.1 imagesize: 1.4.1 importlib-metadata: 7.1.0 ipykernel: 6.29.2 ipython: 8.21.0 ipywidgets: 8.1.2 jedi: 0.19.1 Jinja2: 3.1.3 jmespath: 1.0.1 joblib: 1.4.0 jupyter-client: 8.6.0 jupyter-core: 5.7.2 jupyterlab-widgets: 3.0.10 kiwisolver: 1.4.5 lazy-loader: 0.4 line-profiler: 4.1.2 llvmlite: 0.42.0 locket: 1.0.0 lxml: 5.2.1 lz4: 4.3.3 MarkupSafe: 2.1.5 matplotlib: 3.8.4 matplotlib-inline: 0.1.7 mpmath: 1.3.0 mrcfile: 1.5.0 msgpack: 1.0.8 multidict: 6.0.5 nest-asyncio: 1.6.0 netCDF4: 1.6.5 networkx: 3.3 nibabel: 5.2.0 nptyping: 2.5.0 numba: 0.59.1 numcodecs: 0.12.1 numexpr: 2.10.0 numpy: 1.26.4 ome-zarr: 0.8.3 openvr: 1.26.701 packaging: 23.2 pandas: 2.2.2 ParmEd: 4.2.2 parso: 0.8.4 partd: 1.4.1 pathspec: 0.12.1 pep517: 0.13.1 pexpect: 4.9.0 pillow: 10.3.0 pip: 24.0 pkginfo: 1.10.0 platformdirs: 4.2.1 pluggy: 1.5.0 prompt-toolkit: 3.0.43 psutil: 5.9.8 ptyprocess: 0.7.0 pure-eval: 0.2.2 py-cpuinfo: 9.0.0 pyarrow: 16.0.0 pycollada: 0.8 pycparser: 2.22 pydantic: 2.7.1 pydantic-core: 2.18.2 pydicom: 2.4.4 pygments: 2.17.2 pynmrstar: 3.3.4 pynndescent: 0.5.12 pynrrd: 1.0.0 PyOpenGL: 3.1.7 PyOpenGL-accelerate: 3.1.7 pyopenxr: 1.0.3401 pyparsing: 3.1.2 pyproject-hooks: 1.1.0 PyQt6-commercial: 
6.6.1 PyQt6-Qt6: 6.6.3 PyQt6-sip: 13.6.0 PyQt6-WebEngine-commercial: 6.6.0 PyQt6-WebEngine-Qt6: 6.6.3 pyspnego: 0.10.2 python-dateutil: 2.9.0.post0 pytz: 2024.1 PyYAML: 6.0.1 pyzmq: 26.0.3 qtconsole: 5.5.1 QtPy: 2.4.1 RandomWords: 0.4.0 requests: 2.31.0 Rtree: 1.1.0 s3fs: 2024.3.1 scikit-image: 0.23.2 scikit-learn: 1.4.2 scipy: 1.13.0 setuptools: 69.5.1 setuptools-scm: 8.0.4 sfftk-rw: 0.8.1 shapely: 2.0.2 six: 1.16.0 smbprotocol: 1.13.0 snowballstemmer: 2.2.0 sortedcontainers: 2.4.0 soupsieve: 2.5 sphinx: 7.2.6 sphinx-autodoc-typehints: 2.0.1 sphinxcontrib-applehelp: 1.0.8 sphinxcontrib-blockdiag: 3.0.0 sphinxcontrib-devhelp: 1.0.6 sphinxcontrib-htmlhelp: 2.0.5 sphinxcontrib-jsmath: 1.0.1 sphinxcontrib-qthelp: 1.0.7 sphinxcontrib-serializinghtml: 1.1.10 stack-data: 0.6.3 starfile: 0.5.6 superqt: 0.6.3 sympy: 1.12 tables: 3.8.0 tblib: 3.0.0 tcia-utils: 1.5.1 threadpoolctl: 3.4.0 tifffile: 2024.1.30 tinyarray: 1.2.4 toolz: 0.12.1 torch: 2.2.2 tornado: 6.4 tqdm: 4.66.2 traitlets: 5.14.2 trimesh: 4.0.10 trove-classifiers: 2024.4.10 typing-extensions: 4.11.0 tzdata: 2024.1 umap-learn: 0.5.6 urllib3: 2.2.1 wcwidth: 0.2.13 webcolors: 1.13 wheel: 0.43.0 wheel-filename: 1.4.1 widgetsnbextension: 4.0.10 wrapt: 1.16.0 yarl: 1.9.4 zarr: 2.17.2 zict: 3.0.0 zipp: 3.18.1 File attachment: a2g0z0.fasta
Attachments (1)
Change History (7)
by , 18 months ago
Attachment: | a2g0z0.fasta added |
---|
comment:1 by , 18 months ago
Component: | Unassigned → Sequence |
---|---|
Owner: | set to |
Platform: | → all |
Project: | → ChimeraX |
Status: | new → assigned |
Summary: | ChimeraX bug report submission → Sequence viewer structure association has mismatches for identical subsequence |
comment:2 by , 18 months ago
The fasta sequence is missing the N and C termini of the structure sequence and is also missing a 3-residue segment and a 2-residue segment. I guess the structure having a 3-residue and a 2-residue insertion is what is tripping up the association of the sequences.
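To illustrate why the internal deletions matter, here is a small pure-Python sketch. It uses a made-up toy sequence, not the real A2G0Z0 data, and the helper names are hypothetical. Trimming only the termini leaves a contiguous piece of the structure sequence, while also removing internal segments does not.

```python
def delete_segments(seq, segments):
    """Return seq with the given (start, length) segments removed (0-based)."""
    pieces = []
    pos = 0
    for start, length in sorted(segments):
        pieces.append(seq[pos:start])
        pos = start + length
    pieces.append(seq[pos:])
    return "".join(pieces)


def is_contiguous_subsequence(query, target):
    """True if query occurs in target as one unbroken stretch."""
    return query in target


# Toy stand-in for the structure sequence (assumption, not real data).
structure_seq = "MKTAYIAKQRQISFVKSHFSRQ"

# Missing only the N- and C-terminal residues: still a true subsequence.
trimmed = structure_seq[2:-2]

# Additionally missing a 3-residue and a 2-residue internal segment.
with_deletions = delete_segments(trimmed, [(5, 3), (12, 2)])

print(is_contiguous_subsequence(trimmed, structure_seq))         # True
print(is_contiguous_subsequence(with_deletions, structure_seq))  # False
```

Every residue of the second sequence still appears in order in the structure sequence, which is why it looks identical by eye, but it no longer matches as a single contiguous stretch.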
comment:3 by , 18 months ago
If I force it to associate with Needleman-Wunsch then it associates with 0 mismatches.
It is not easy to get it to associate with Needleman-Wunsch. The only way I could figure out to do it is to change the seqalign setting assoc_error_rate and rerun Alignment.associate(). It would be nice if I could ask it to associate with Needleman-Wunsch using an optional argument to Alignment.associate().
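As a rough illustration of why a global alignment gets 0 mismatches here, below is a minimal Needleman-Wunsch sketch in plain Python. It is not the ChimeraX implementation: the scoring (match +1, mismatch -1, linear gap -1), the function names, and the toy sequences are all assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner along one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            if score[i][j] == diag:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def count_mismatches(aligned_a, aligned_b):
    """Count aligned columns where both residues are present but differ."""
    return sum(1 for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-" and x != y)


# Toy sequences (assumptions): the second lacks both termini plus a
# 3-residue and a 2-residue internal segment of the first.
structure_seq = "MKTAYIAKQRQISFVKSHFSRQ"
alignment_seq = "TAYIAQISFSHFS"

aligned_structure, aligned_query = needleman_wunsch(structure_seq, alignment_seq)
print(aligned_structure)
print(aligned_query)
print(count_mismatches(aligned_structure, aligned_query))  # 0
```

The gaps absorb the missing termini and the two internal segments, so every residue that is paired is an exact match.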
comment:4 by , 18 months ago
Yes, the insertions are causing the mismatches. Bluntly put, the sequence in the alignment is not the sequence of the structure, nor is it a subsequence.
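Your best path to force NW association is to not use associate() at all. Instead:
# alignment_seq is the sequence in the alignment; chain is the structure chain to associate with it
from chimerax.seqalign.alignment import nw_assoc
# Align the sequence and the chain with Needleman-Wunsch
match_map, errors = nw_assoc(session, alignment_seq, chain)
# Hand the precomputed match map to the alignment instead of calling associate()
alignment.prematched_assoc_structure(match_map, errors, False)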
comment:5 by , 18 months ago
Ok, thanks for the tip. I was thinking I would see if the association produces mismatches and in that case run Needleman-Wunsch. But maybe I will just always use Needleman-Wunsch, since I know which sequence in the alignment to match to and it was quite fast for these nearly identical sequences.
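The "check for mismatches, then fall back" idea could be sketched generically as follows. This is only an outline under stated assumptions: associate_with_fallback and the two stand-in callables are hypothetical and do not correspond to any ChimeraX function. In real use the callables would wrap whatever default association and Needleman-Wunsch calls are available.

```python
from typing import Callable, Dict, Tuple

# (position map, mismatch count); the shape is an assumption for this sketch.
AssocResult = Tuple[Dict[int, int], int]


def associate_with_fallback(
    fast: Callable[[], AssocResult],
    global_align: Callable[[], AssocResult],
    max_mismatches: int = 0,
) -> Tuple[str, AssocResult]:
    """Use the fast result if it is clean enough, else run the global alignment."""
    result = fast()
    if result[1] <= max_mismatches:
        return "fast", result
    return "global", global_align()


# Stand-in callables for illustration only (hypothetical values).
def fake_fast() -> AssocResult:
    return {0: 2, 1: 3}, 88


def fake_global() -> AssocResult:
    return {0: 2, 1: 3}, 0


method, (match_map, mismatches) = associate_with_fallback(fake_fast, fake_global)
print(method, mismatches)
```

Always running the global alignment, as suggested above, avoids the branch entirely and is the simpler choice when the target sequence in the alignment is already known.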
comment:6 by , 18 months ago
Resolution: | → limitation |
---|---|
Status: | assigned → closed |
Yes, sorry, but it is a basic assumption that the sequence in the alignment is either the full sequence of the chain or a true subsequence.