Opened 19 months ago
Closed 18 months ago
#15106 closed defect (limitation)
Sequence viewer structure association has mismatches for identical subsequence
| Reported by: | | Owned by: | Eric Pettersen |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | Sequence | Version: | |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Notify when closed: | | Platform: | all |
| Project: | ChimeraX | | |
Description
The following bug report has been submitted:
Platform: macOS-14.4.1-arm64-arm-64bit
ChimeraX Version: 1.8.dev202405030711 (2024-05-03 07:11:24 UTC)
Description
I'm puzzled that Sequence Viewer declares there are 88 mismatches when trying to associate AlphaFold model A2G0Z0 to an exactly identical subsequence of A2G0Z0 loaded from a fasta file. I recall Eric told me that associating a structure to a subsequence of that structure may not do a good job, but this seems kind of extreme given that the subsequence exactly matches part of the structure sequence. I've attached the fasta file. The AlphaFold model can be opened with "open A2G0Z0 from alphafold".
The reason I have a subsequence in the fasta file is that it came from a sequence search of a protein with just one of the two domains of A2G0Z0. This seems like a common scenario, where structures found by sequence search are larger than the query sequence. So it would be nice if such structures could associate with their own exactly matching subsequence in the alignment.
Log:
UCSF ChimeraX version: 1.8.dev202405030711 (2024-05-03)
© 2016-2024 Regents of the University of California. All rights reserved.
How to cite UCSF ChimeraX
> open /Users/goddard/ucsf/umap/braf_alphafold/af292_env/a2g0z0.fasta
Summary of feedback from opening
/Users/goddard/ucsf/umap/braf_alphafold/af292_env/a2g0z0.fasta
---
note | Alignment identifier is a2g0z0.fasta
Opened 1 sequence from a2g0z0.fasta
> open a2g0z0 fromDatabase alphafold
Chain information for AlphaFold A2G0Z0 #1
---
Chain | Description | UniProt
A | TKL family protein kinase | A2G0Z0_TRIVA 1-1028
Color AlphaFold A2G0Z0 by residue attribute pLDDT_score
Associated AlphaFold A2G0Z0 chain A to A2G0Z0 with 88 mismatches
OpenGL version: 4.1 Metal - 88
OpenGL renderer: Apple M2 Ultra
OpenGL vendor: Apple
Python: 3.11.4
Locale: UTF-8
Qt version: PyQt6 6.6.1, Qt 6.6.1
Qt runtime version: 6.6.3
Qt platform: cocoa
Hardware:
Hardware Overview:
Model Name: Mac Studio
Model Identifier: Mac14,14
Model Number: Z1800003VLL/A
Chip: Apple M2 Ultra
Total Number of Cores: 24 (16 performance and 8 efficiency)
Memory: 64 GB
System Firmware Version: 10151.101.3
OS Loader Version: 10151.101.3
Software:
System Software Overview:
System Version: macOS 14.4.1 (23E224)
Kernel Version: Darwin 23.4.0
Time since boot: 1 day, 7 hours, 54 minutes
Graphics/Displays:
Apple M2 Ultra:
Chipset Model: Apple M2 Ultra
Type: GPU
Bus: Built-In
Total Number of Cores: 60
Vendor: Apple (0x106b)
Metal Support: Metal 3
Displays:
PHL 278B1:
Resolution: 3840 x 2160 (2160p/4K UHD 1 - Ultra High Definition)
UI Looks like: 1920 x 1080 @ 60.00Hz
Main Display: Yes
Mirror: Off
Online: Yes
Rotation: Supported
Installed Packages:
aiobotocore: 2.12.3
aiohttp: 3.9.5
aioitertools: 0.11.0
aiosignal: 1.3.1
alabaster: 0.7.16
alphashape: 1.3.1
annotated-types: 0.6.0
appdirs: 1.4.4
appnope: 0.1.4
asciitree: 0.3.3
asttokens: 2.4.1
attrs: 23.2.0
Babel: 2.14.0
beautifulsoup4: 4.12.3
biopython: 1.83
blockdiag: 3.0.0
blosc2: 2.0.0
botocore: 1.34.69
build: 1.2.1
certifi: 2023.11.17
cffi: 1.16.0
cftime: 1.6.3
charset-normalizer: 3.3.2
ChimeraX-AddCharge: 1.5.17
ChimeraX-AddH: 2.2.6
ChimeraX-AlignmentAlgorithms: 2.0.2
ChimeraX-AlignmentHdrs: 3.5
ChimeraX-AlignmentMatrices: 2.1
ChimeraX-Alignments: 2.12.6
ChimeraX-AlphaFold: 1.0
ChimeraX-AltlocExplorer: 1.1.1
ChimeraX-AmberInfo: 1.0
ChimeraX-Arrays: 1.1
ChimeraX-ArtiaX: 0.4.5
ChimeraX-Atomic: 1.57
ChimeraX-AtomicLibrary: 14.0.3
ChimeraX-AtomSearch: 2.0.1
ChimeraX-AxesPlanes: 2.4
ChimeraX-BasicActions: 1.1.2
ChimeraX-BILD: 1.0
ChimeraX-BlastProtein: 2.4.5
ChimeraX-BondRot: 2.0.4
ChimeraX-BugReporter: 1.0.1
ChimeraX-BuildStructure: 2.12.1
ChimeraX-Bumps: 1.0
ChimeraX-BundleBuilder: 1.2.3
ChimeraX-ButtonPanel: 1.0.1
ChimeraX-CageBuilder: 1.0.1
ChimeraX-CellPack: 1.0
ChimeraX-Centroids: 1.4
ChimeraX-ChangeChains: 1.1
ChimeraX-CheckWaters: 1.4
ChimeraX-ChemGroup: 2.0.1
ChimeraX-Clashes: 2.2.4
ChimeraX-Clipper: 0.23.0
ChimeraX-ColorActions: 1.0.4
ChimeraX-ColorGlobe: 1.0
ChimeraX-ColorKey: 1.5.5
ChimeraX-CommandLine: 1.2.5
ChimeraX-ConnectStructure: 2.0.1
ChimeraX-Contacts: 1.0.1
ChimeraX-copick: 0.1.0
ChimeraX-Core: 1.8.dev202405030711
ChimeraX-CoreFormats: 1.2
ChimeraX-coulombic: 1.4.3
ChimeraX-crai: 0.3
ChimeraX-Crosslinks: 1.0
ChimeraX-Crystal: 1.0
ChimeraX-CrystalContacts: 1.0.1
ChimeraX-DataFormats: 1.2.3
ChimeraX-Dicom: 1.2
ChimeraX-DiffPlot: 1.0
ChimeraX-DistMonitor: 1.4.2
ChimeraX-DockPrep: 1.1.3
ChimeraX-Dssp: 2.0
ChimeraX-EMDB-SFF: 1.0
ChimeraX-ESMFold: 1.0
ChimeraX-FileHistory: 1.0.1
ChimeraX-FunctionKey: 1.0.1
ChimeraX-Geometry: 1.3
ChimeraX-gltf: 1.0
ChimeraX-Graphics: 1.1.1
ChimeraX-Hbonds: 2.4
ChimeraX-Help: 1.2.2
ChimeraX-HKCage: 1.3
ChimeraX-IHM: 1.1
ChimeraX-ImageFormats: 1.2
ChimeraX-IMOD: 1.0
ChimeraX-IO: 1.0.1
ChimeraX-ItemsInspection: 1.0.1
ChimeraX-IUPAC: 1.0
ChimeraX-Label: 1.1.9
ChimeraX-ListInfo: 1.2.2
ChimeraX-Log: 1.1.6
ChimeraX-LookingGlass: 1.1
ChimeraX-Maestro: 1.9.1
ChimeraX-Map: 1.2
ChimeraX-MapData: 2.0
ChimeraX-MapEraser: 1.0.1
ChimeraX-MapFilter: 2.0.1
ChimeraX-MapFit: 2.0
ChimeraX-MapSeries: 2.1.1
ChimeraX-Markers: 1.0.1
ChimeraX-Mask: 1.0.2
ChimeraX-MatchMaker: 2.1.3
ChimeraX-MCopy: 1.0
ChimeraX-MDcrds: 2.7
ChimeraX-MedicalToolbar: 1.0.2
ChimeraX-Meeting: 1.0.1
ChimeraX-MLP: 1.1.1
ChimeraX-mmCIF: 2.14.1
ChimeraX-MMTF: 2.2
ChimeraX-Modeller: 1.5.15
ChimeraX-ModelPanel: 1.5
ChimeraX-ModelSeries: 1.0.1
ChimeraX-Mol2: 2.0.3
ChimeraX-Mole: 1.0
ChimeraX-Morph: 1.0.2
ChimeraX-MouseModes: 1.2
ChimeraX-Movie: 1.0
ChimeraX-Neuron: 1.0
ChimeraX-Nifti: 1.1
ChimeraX-NIHPresets: 1.1.17
ChimeraX-NMRSTAR: 1.0.2
ChimeraX-NRRD: 1.1
ChimeraX-Nucleotides: 2.0.3
ChimeraX-OME-Zarr: 0.5.3
ChimeraX-OpenCommand: 1.13.4
ChimeraX-PDB: 2.7.5
ChimeraX-PDBBio: 1.0.1
ChimeraX-PDBLibrary: 1.0.4
ChimeraX-PDBMatrices: 1.0
ChimeraX-PhenixUI: 1.2.2
ChimeraX-PickBlobs: 1.0.1
ChimeraX-Positions: 1.0
ChimeraX-PresetMgr: 1.1.1
ChimeraX-PubChem: 2.2
ChimeraX-ReadPbonds: 1.0.1
ChimeraX-Registration: 1.1.2
ChimeraX-RemoteControl: 1.0
ChimeraX-RenderByAttr: 1.4.1
ChimeraX-RenumberResidues: 1.1
ChimeraX-ResidueFit: 1.0.1
ChimeraX-RestServer: 1.2
ChimeraX-RNALayout: 1.0
ChimeraX-RotamerLibMgr: 4.0
ChimeraX-RotamerLibsDunbrack: 2.0
ChimeraX-RotamerLibsDynameomics: 2.0
ChimeraX-RotamerLibsRichardson: 2.0
ChimeraX-SaveCommand: 1.5.1
ChimeraX-SchemeMgr: 1.0
ChimeraX-SDF: 2.0.2
ChimeraX-Segger: 1.0
ChimeraX-Segment: 1.0.1
ChimeraX-Segmentations: 1.0.4
ChimeraX-SelInspector: 1.0
ChimeraX-SeqView: 2.11.2
ChimeraX-Shape: 1.0.1
ChimeraX-Shell: 1.0.1
ChimeraX-Shortcuts: 1.1.1
ChimeraX-ShowSequences: 1.0.3
ChimeraX-SideView: 1.0.1
ChimeraX-Smiles: 2.1.2
ChimeraX-SmoothLines: 1.0
ChimeraX-SpaceNavigator: 1.0
ChimeraX-StdCommands: 1.16.4
ChimeraX-STL: 1.0.1
ChimeraX-Storm: 1.0
ChimeraX-StructMeasure: 1.2.1
ChimeraX-Struts: 1.0.1
ChimeraX-Surface: 1.0.1
ChimeraX-SwapAA: 2.0.1
ChimeraX-SwapRes: 2.5
ChimeraX-TapeMeasure: 1.0
ChimeraX-TaskManager: 1.0
ChimeraX-Test: 1.0
ChimeraX-TetraScapeCommand: 0.1
ChimeraX-Toolbar: 1.1.2
ChimeraX-ToolshedUtils: 1.2.4
ChimeraX-Topography: 1.0
ChimeraX-ToQuest: 1.0
ChimeraX-Tug: 1.0.1
ChimeraX-UI: 1.38
ChimeraX-uniprot: 2.3
ChimeraX-UnitCell: 1.0.1
ChimeraX-ViewDockX: 1.4.1
ChimeraX-VIPERdb: 1.0
ChimeraX-Vive: 1.1
ChimeraX-VolumeMenu: 1.0.1
ChimeraX-vrml: 1.0
ChimeraX-VTK: 1.0
ChimeraX-WavefrontOBJ: 1.0
ChimeraX-WebCam: 1.0.2
ChimeraX-WebServices: 1.1.3
ChimeraX-Zone: 1.0.1
click: 8.1.7
click-log: 0.4.0
cloudpickle: 3.0.0
colorama: 0.4.6
comm: 0.2.2
contourpy: 1.2.1
copick: 0.1.dev45+g5a19260
cripser: 0.0.13
cryptography: 42.0.5
cxservices: 1.2.2
cycler: 0.12.1
Cython: 3.0.10
dask: 2024.4.2
debugpy: 1.8.1
decorator: 5.1.1
distributed: 2024.4.2
docutils: 0.20.1
executing: 2.0.1
fasteners: 0.19
filelock: 3.13.4
fonttools: 4.51.0
frozenlist: 1.4.1
fsspec: 2024.3.1
funcparserlib: 2.0.0a0
geomdl: 5.3.1
glfw: 2.7.0
grako: 3.16.5
h5py: 3.11.0
hatchling: 1.24.2
html2text: 2024.2.26
idna: 3.7
ihm: 1.0
imagecodecs: 2024.1.1
imageio: 2.34.1
imagesize: 1.4.1
importlib-metadata: 7.1.0
ipykernel: 6.29.2
ipython: 8.21.0
ipywidgets: 8.1.2
jedi: 0.19.1
Jinja2: 3.1.3
jmespath: 1.0.1
joblib: 1.4.0
jupyter-client: 8.6.0
jupyter-core: 5.7.2
jupyterlab-widgets: 3.0.10
kiwisolver: 1.4.5
lazy-loader: 0.4
line-profiler: 4.1.2
llvmlite: 0.42.0
locket: 1.0.0
lxml: 5.2.1
lz4: 4.3.3
MarkupSafe: 2.1.5
matplotlib: 3.8.4
matplotlib-inline: 0.1.7
mpmath: 1.3.0
mrcfile: 1.5.0
msgpack: 1.0.8
multidict: 6.0.5
nest-asyncio: 1.6.0
netCDF4: 1.6.5
networkx: 3.3
nibabel: 5.2.0
nptyping: 2.5.0
numba: 0.59.1
numcodecs: 0.12.1
numexpr: 2.10.0
numpy: 1.26.4
ome-zarr: 0.8.3
openvr: 1.26.701
packaging: 23.2
pandas: 2.2.2
ParmEd: 4.2.2
parso: 0.8.4
partd: 1.4.1
pathspec: 0.12.1
pep517: 0.13.1
pexpect: 4.9.0
pillow: 10.3.0
pip: 24.0
pkginfo: 1.10.0
platformdirs: 4.2.1
pluggy: 1.5.0
prompt-toolkit: 3.0.43
psutil: 5.9.8
ptyprocess: 0.7.0
pure-eval: 0.2.2
py-cpuinfo: 9.0.0
pyarrow: 16.0.0
pycollada: 0.8
pycparser: 2.22
pydantic: 2.7.1
pydantic-core: 2.18.2
pydicom: 2.4.4
pygments: 2.17.2
pynmrstar: 3.3.4
pynndescent: 0.5.12
pynrrd: 1.0.0
PyOpenGL: 3.1.7
PyOpenGL-accelerate: 3.1.7
pyopenxr: 1.0.3401
pyparsing: 3.1.2
pyproject-hooks: 1.1.0
PyQt6-commercial: 6.6.1
PyQt6-Qt6: 6.6.3
PyQt6-sip: 13.6.0
PyQt6-WebEngine-commercial: 6.6.0
PyQt6-WebEngine-Qt6: 6.6.3
pyspnego: 0.10.2
python-dateutil: 2.9.0.post0
pytz: 2024.1
PyYAML: 6.0.1
pyzmq: 26.0.3
qtconsole: 5.5.1
QtPy: 2.4.1
RandomWords: 0.4.0
requests: 2.31.0
Rtree: 1.1.0
s3fs: 2024.3.1
scikit-image: 0.23.2
scikit-learn: 1.4.2
scipy: 1.13.0
setuptools: 69.5.1
setuptools-scm: 8.0.4
sfftk-rw: 0.8.1
shapely: 2.0.2
six: 1.16.0
smbprotocol: 1.13.0
snowballstemmer: 2.2.0
sortedcontainers: 2.4.0
soupsieve: 2.5
sphinx: 7.2.6
sphinx-autodoc-typehints: 2.0.1
sphinxcontrib-applehelp: 1.0.8
sphinxcontrib-blockdiag: 3.0.0
sphinxcontrib-devhelp: 1.0.6
sphinxcontrib-htmlhelp: 2.0.5
sphinxcontrib-jsmath: 1.0.1
sphinxcontrib-qthelp: 1.0.7
sphinxcontrib-serializinghtml: 1.1.10
stack-data: 0.6.3
starfile: 0.5.6
superqt: 0.6.3
sympy: 1.12
tables: 3.8.0
tblib: 3.0.0
tcia-utils: 1.5.1
threadpoolctl: 3.4.0
tifffile: 2024.1.30
tinyarray: 1.2.4
toolz: 0.12.1
torch: 2.2.2
tornado: 6.4
tqdm: 4.66.2
traitlets: 5.14.2
trimesh: 4.0.10
trove-classifiers: 2024.4.10
typing-extensions: 4.11.0
tzdata: 2024.1
umap-learn: 0.5.6
urllib3: 2.2.1
wcwidth: 0.2.13
webcolors: 1.13
wheel: 0.43.0
wheel-filename: 1.4.1
widgetsnbextension: 4.0.10
wrapt: 1.16.0
yarl: 1.9.4
zarr: 2.17.2
zict: 3.0.0
zipp: 3.18.1
File attachment: a2g0z0.fasta
Attachments (1)
Change History (7)
by , 19 months ago
| Attachment: | a2g0z0.fasta added |
|---|
comment:1 by , 19 months ago
| Component: | Unassigned → Sequence |
|---|---|
| Owner: | set to |
| Platform: | → all |
| Project: | → ChimeraX |
| Status: | new → assigned |
| Summary: | ChimeraX bug report submission → Sequence viewer structure association has mismatches for identical subsequence |
comment:2 by , 18 months ago
The fasta sequence is missing the N and C termini of the structure sequence and is also missing a 3-residue segment and a 2-residue segment. I guess the structure having a 3-residue and a 2-residue insertion is what is tripping up the sequence association.
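To illustrate the distinction, here is a minimal sketch, assuming "subsequence" means one unbroken stretch of the structure sequence. The sequences are toy data invented for illustration (not the real A2G0Z0 sequence), and `is_contiguous_subsequence` is a hypothetical helper, not part of ChimeraX.

```python
def is_contiguous_subsequence(query: str, full: str) -> bool:
    """Return True when query occurs in full as one unbroken run of residues."""
    return query in full


# Toy structure sequence, invented for illustration only.
structure_seq = "MKTAYIAKQRQISFVKSHFSRQ"

# Trimmed at both termini only: still a contiguous stretch of the structure.
trimmed = structure_seq[3:-3]

# Trimmed at both termini AND missing an internal 3-residue segment.
with_gap = structure_seq[3:9] + structure_seq[12:-3]

print(is_contiguous_subsequence(trimmed, structure_seq))
print(is_contiguous_subsequence(with_gap, structure_seq))
```

Every residue of `with_gap` is present in the structure in the same order, yet the string no longer matches any single stretch of it, which is the situation described above.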
comment:3 by , 18 months ago
If I force it to associate with Needleman-Wunsch then it associates with 0 mismatches.
It is not easy to get it to associate with Needleman-Wunsch. The only way I could figure out to do it is to change seqalign settings assoc_error_rate and rerun Alignment.associate(). It would be nice if I could ask it to associate with Needleman-Wunsch using an optional argument to Alignment.associate().
comment:4 by , 18 months ago
Yes, the insertions are causing the mismatches. Bluntly put, the sequence in the alignment is not the sequence of the structure, nor is it a subsequence.
Your best path to force NW association is to not use associate() at all. Instead:
# alignment_seq is the sequence in the alignment; chain is the structure chain to associate with it
from chimerax.seqalign.alignment import nw_assoc
match_map, errors = nw_assoc(session, alignment_seq, chain)
alignment.prematched_assoc_structure(match_map, errors, False)
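For readers unfamiliar with why Needleman-Wunsch copes with the missing segments, the following is a minimal pure-Python sketch of global alignment with a linear gap penalty. It is not ChimeraX's nw_assoc implementation; the scoring values, the toy sequences, and the names `nw_align` and `count_mismatches` are assumptions made for illustration.

```python
def nw_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def count_mismatches(aligned_a: str, aligned_b: str) -> int:
    """Count aligned columns where both sides have a residue and they differ."""
    return sum(
        1 for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-" and x != y
    )


# Toy data, invented for illustration: the query lacks both termini and an
# internal 3-residue segment of the structure sequence.
structure_seq = "MKTAYIAKQRQISFVKSHFSRQ"
query_seq = structure_seq[3:9] + structure_seq[12:-3]

aligned_query, aligned_structure = nw_align(query_seq, structure_seq)
print(aligned_query)
print(aligned_structure)
print(count_mismatches(aligned_query, aligned_structure))
```

Because the aligner is free to open gaps opposite the missing segments, every residue of the query can pair with an identical residue of the structure.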
comment:5 by , 18 months ago
Ok, thanks for the tip. I was thinking I would see if the association produces mismatches and in that case run Needleman-Wunsch. But maybe I will just always use Needleman-Wunsch since I know which sequence in the alignment to match to and it was quite fast for these nearly identical sequences.
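The first strategy mentioned here (try the default association, then fall back to Needleman-Wunsch only when mismatches are reported) could be sketched as below. This is a generic illustration: `associate_with_fallback` and both callables are hypothetical stand-ins, and the (match map, error count) return shape is assumed from the nw_assoc snippet above rather than taken from the ChimeraX API.

```python
from typing import Callable, Dict, Tuple

# Assumed result shape: (match map, number of mismatches).
AssocResult = Tuple[Dict[int, int], int]


def associate_with_fallback(
    fast_assoc: Callable[[], AssocResult],
    nw_assoc_fn: Callable[[], AssocResult],
    max_errors: int = 0,
) -> AssocResult:
    """Run the fast association; rerun with NW if it reports too many mismatches."""
    match_map, errors = fast_assoc()
    if errors > max_errors:
        match_map, errors = nw_assoc_fn()
    return match_map, errors


# Stand-in callables for demonstration only: the fast path reports 88
# mismatches, so the fallback result is the one returned.
result = associate_with_fallback(lambda: ({}, 88), lambda: ({0: 3}, 0))
print(result)
```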
comment:6 by , 18 months ago
| Resolution: | → limitation |
|---|---|
| Status: | assigned → closed |
Yes, sorry, but it is a basic assumption that the sequence in the alignment is either the full sequence of the chain or a true subsequence.