Opened 4 years ago
Last modified 4 years ago
#6472 assigned enhancement
Show AlphaFold predicted aligned error (PAE) plots
Reported by: | Tom Goddard | Owned by: | Tom Goddard |
---|---|---|---|
Priority: | moderate | Milestone: | |
Component: | Structure Prediction | Version: | |
Keywords: | | Cc: | Elaine Meng, Tristan Croll |
Blocked By: | | Blocking: | |
Notify when closed: | | Platform: | all |
Project: | ChimeraX | | |
Description
Want to show residue-residue predicted aligned error (PAE) calculated by AlphaFold. This matrix of distance values indicates, for every pair of residues, whether AlphaFold thinks its prediction has the two residues in the correct relative positions. This is useful for judging when AlphaFold predicts the wrong domain positions. Want to show a 2D plot (image) color coded by PAE value and be able to drag a box on the plot and color the associated residues. This is the same capability available on AlphaFold database pages.
Also want to be able to color domains defined by clustering residues that the PAE matrix suggests are in the right relative position, as detailed in a request from Tristan Croll, #4966.
Also should be able to fetch the PAE JSON files from the EBI AlphaFold database.
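A fetch along these lines might start from the sketch below. The URL pattern and the default `version` are assumptions based on the public AlphaFold DB file naming and may change between database releases. `pae_url` and `parse_pae_json` are hypothetical helper names, not ChimeraX API. The parser accepts two JSON layouts, a full `predicted_aligned_error` matrix or older flattened `residue1`/`residue2`/`distance` lists, and other layouts may exist.

```python
import json

def pae_url(uniprot_id, version=4):
    """Build an AlphaFold DB PAE file URL (pattern and version are assumptions)."""
    return ('https://alphafold.ebi.ac.uk/files/'
            f'AF-{uniprot_id}-F1-predicted_aligned_error_v{version}.json')

def parse_pae_json(text):
    """Return the PAE matrix as a list of rows from AlphaFold DB style JSON text."""
    data = json.loads(text)
    # The files wrap the record in a one-element list; accept a bare dict too.
    entry = data[0] if isinstance(data, list) else data
    if 'predicted_aligned_error' in entry:
        return [[float(v) for v in row] for row in entry['predicted_aligned_error']]
    # Older layout: parallel flattened lists with 1-based residue numbers.
    r1, r2, d = entry['residue1'], entry['residue2'], entry['distance']
    n = max(max(r1), max(r2))
    matrix = [[0.0] * n for _ in range(n)]
    for i, j, pae in zip(r1, r2, d):
        matrix[i - 1][j - 1] = float(pae)
    return matrix
```

The text passed to `parse_pae_json` could come from `urllib.request.urlopen(pae_url('Q5VSL9')).read()`. The download is kept out of the sketch so the parsing can be tried on a local file.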
Change History (8)
comment:1 by , 4 years ago
comment:2 by , 4 years ago
I am thinking about a few more additions for AlphaFold PAE.
Probably the open command should be able to directly open the json or pkl file. Specifying an associated structure will be necessary which is easy to do by command, but currently our Open File dialog does not support such mandatory options (because it is a native dialog and Qt does not support adding controls to native Open dialogs).
Probably should support the open command directly fetching PAE from AlphaFold DB, e.g. "open Q5VSL9 fromDatabase pae", or maybe instead "open Q5VSL9 from alphafold format pae".
Might want to add a color command that can do the domain coloring, e.g. "color paedomains #1".
Might want to add session saving support for the PAE matrix. Currently the plot and PAE matrix itself are not saved in sessions.
Might want to add plot GUI options to control frequently useful parameters like connectMaxPae. Still deciding what parameters may qualify.
comment:3 by , 4 years ago
You might want to look at md_crds/__init__.py which handles the analogous situation for opening coordinate files from the Open dialog, which need associated structure models.
comment:4 by , 4 years ago
Is that the same as what I do with MTZ files in the Clipper plugin (i.e. launch a secondary choose-model dialog after the user clicks OK on the Open dialog)? That's been working out quite well.
comment:5 by , 4 years ago
I see the new code attaches the PAE matrix to the structure as structure._alphafold_pae. Would it be possible to "upgrade" that to a non-private variable (i.e. make it part of the supported API)? I'd like to make ISOLDE use that for its confidence weighting of restraints rather than its current independent fetch.
follow-up: 6 comment:6 by , 4 years ago
Yes, I can make that public. It is not currently saved in sessions which may cause you some problems if an ISOLDE session is restored. Should probably save it in sessions.
follow-up: 7 comment:7 by , 4 years ago
Not saving in sessions shouldn't really be a problem for ISOLDE. The PAE matrix is only accessed at the time "isolde restrain distances adjustForConfidence true" is called - after that the weights are stored with the restraints and will be saved/restored. The only (rare) situation where it might be annoying is if the user wants to use the command after restoring a session, and the only fallout would be that it would ask them to load the PAE matrix again.
comment:8 by , 4 years ago
Ok, I made the AtomicStructure alphafold_pae attribute public (no leading underscore). The attribute may not exist.
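Since the attribute may be missing, calling code probably wants a guarded lookup. A minimal sketch, where `pae_for_structure` is a hypothetical helper and `FakeStructure` is only a stand-in for an AtomicStructure so the example runs outside ChimeraX:

```python
def pae_for_structure(structure):
    """Return whatever is attached as alphafold_pae, or None if absent."""
    return getattr(structure, 'alphafold_pae', None)

class FakeStructure:
    """Stand-in for an AtomicStructure, used only for this illustration."""
    pass

s = FakeStructure()
if pae_for_structure(s) is None:
    print('No PAE data attached; ask the user to load it.')

# Placeholder value; the real attribute holds whatever ChimeraX attaches.
s.alphafold_pae = [[0.0, 2.5], [2.5, 0.0]]
print(pae_for_structure(s) is not None)
```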
First version done.
I added a tool, menu Structure Prediction / AlphaFold Error Plot, to open PAE files (.json or .pkl) and display them as an interactive plot. The plot has a button on it to color each domain of the associated structure a different color, with domains derived from the PAE matrix. I added an "alphafold pae" command that does the same things as the tool. And I added an "Error Plot" button to the AlphaFold tool to show the error plot tool.
Here is the current "alphafold pae" command syntax.
alphafold pae [structure] [file name of a file to open/read; a name of 'browse' will bring up a file browser] [uniprotId UniProt id] [palette a colormap] [range range] [plot true or false] [colorDomains true or false] [connectMaxPae a number] [cluster a number]
The structure argument says which open structure the PAE data is associated with. The file argument specifies a path to a .json or .pkl PAE file. The AlphaFold DB provides .json, but if a user runs the full AlphaFold it produces .pkl (and not .json). The uniprotId option specifies a UniProt accession id or uniprot_name to fetch the PAE from the EBI AlphaFold database.
Palette and range specify the colormap for making the 2D plot. There is a new builtin colormap named "pae", in different shades of green, that is the default, covering PAE values ranging from 0 to 30 Angstroms. The colorDomains option says whether to color the structure domains, default false. The plot option says whether to show the plot, default true if colorDomains is false, or false if colorDomains is true.
The connectMaxPae (default 5) and cluster (default 0.5) floating point options are parameters that control the residue clustering algorithm that computes domains, which is NetworkX greedy_modularity_communities(). The "cluster" parameter is what that algorithm calls graph_resolution, with small values producing fewer larger domains and larger values making more smaller domains, typical range 0.5 - 5. The connectMaxPae parameter is the maximum PAE value at which two residues are considered connected. Connected residues form a weighted graph with weight inversely proportional to PAE value, and clusters are formed from the graph. Larger connectMaxPae values may lead to more connectivity but increase the computation time. Typical computation time is about 5 seconds for 1000 residues with the default connectMaxPae of 5.
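To make the clustering description concrete, here is a rough sketch of how such a graph could be built and clustered. It is not the code in ChimeraX. The helper names `pae_edges` and `pae_domains`, the averaging of the two PAE directions, and the `min_pae` clamp that avoids dividing by zero are all assumptions of this example. The cluster value is passed as the `resolution` keyword of NetworkX greedy_modularity_communities(), which needs a reasonably recent NetworkX.

```python
def pae_edges(pae, connect_max_pae=5.0, min_pae=0.2):
    """List (i, j, weight) edges for residue pairs with PAE at or below the cutoff.

    pae is a square list of rows. The (i, j) and (j, i) values are averaged,
    and the result is clamped to min_pae before inverting; both choices are
    assumptions of this sketch.
    """
    n = len(pae)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = 0.5 * (pae[i][j] + pae[j][i])
            if p <= connect_max_pae:
                # Weight is inversely proportional to the PAE value.
                edges.append((i, j, 1.0 / max(p, min_pae)))
    return edges

def pae_domains(pae, connect_max_pae=5.0, cluster=0.5):
    """Cluster residue indices into domains; requires the networkx package."""
    import networkx as nx
    from networkx.algorithms import community

    edges = pae_edges(pae, connect_max_pae)
    if not edges:
        # Nothing is connected, so every residue is its own domain.
        return [[i] for i in range(len(pae))]
    g = nx.Graph()
    g.add_nodes_from(range(len(pae)))
    g.add_weighted_edges_from(edges)
    clusters = community.greedy_modularity_communities(
        g, weight='weight', resolution=cluster)
    return [sorted(c) for c in clusters]
```

The double loop in `pae_edges` is quadratic in the number of residues and raising connect_max_pae adds edges, which is one reason larger cutoffs cost more time.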