Changes between Version 2 and Version 3 of PhenixChimeraX


Timestamp: Apr 16, 2020, 12:34:27 PM
Author: Tom Goddard
Comment: --

  • PhenixChimeraX

    v2 v3  
    1313== Specific advantages I can see include: ==
    1414
    15 - integration with Phenix’s auto-building tools for a faster and (optionally) more interactive model building pipeline. These tools are largely based on fragment-based matching to traces through density. While quite powerful (particularly at resolutions better than about 3 Angstroms) they become very slow (on the order of hours to days for a large model) when aiming for maximum completeness, and begin to fail badly beyond 3 Angstroms. Major reasons for this slowness that I can see are (a) the need to generate and then judge multiple mutually-exclusive paths; (b) difficulties in matching sidechains to density, particularly for larger and more flexible sidechains in lower-resolution density; and (c) poor geometry where fragments join. My initial experiments suggest that including ISOLDE (whether human-driven or automatic) in the build-refine-judge-build cycle can help enormously here: it tends to very rapidly pull somewhat-wonky-but-essentially-correct segments into better conformations (and fit) while avoiding most of the force-fitting of truly-wrong spots that can happen with traditional crystallographic restraints. My essential vision here would be a tool where the user defines a start and end point for a missing fragment, an associated sequence and a box of density, and asks Phenix to do its best using its “quick” mode (this is quite fast, on the order of seconds for a scenario like this). The user then gets the opportunity to correct what has been built (settling/remodelling into density, trimming away anything completely wrong) before iterating again. Done properly, this alone could be extremely powerful, significantly easing and accelerating what is arguably the most time-consuming step in building novel proteins.
     15- **Interactive fragment building.** Integration with Phenix’s auto-building tools for a faster and (optionally) more interactive model building pipeline. These tools are largely based on fragment-based matching to traces through density. While quite powerful (particularly at resolutions better than about 3 Angstroms) they become very slow (on the order of hours to days for a large model) when aiming for maximum completeness, and begin to fail badly beyond 3 Angstroms. Major reasons for this slowness that I can see are (a) the need to generate and then judge multiple mutually-exclusive paths; (b) difficulties in matching sidechains to density, particularly for larger and more flexible sidechains in lower-resolution density; and (c) poor geometry where fragments join. My initial experiments suggest that including ISOLDE (whether human-driven or automatic) in the build-refine-judge-build cycle can help enormously here: it tends to very rapidly pull somewhat-wonky-but-essentially-correct segments into better conformations (and fit) while avoiding most of the force-fitting of truly-wrong spots that can happen with traditional crystallographic restraints. My essential vision here would be a tool where the user defines a start and end point for a missing fragment, an associated sequence and a box of density, and asks Phenix to do its best using its “quick” mode (this is quite fast, on the order of seconds for a scenario like this). The user then gets the opportunity to correct what has been built (settling/remodelling into density, trimming away anything completely wrong) before iterating again. Done properly, this alone could be extremely powerful, significantly easing and accelerating what is arguably the most time-consuming step in building novel proteins.
    1616
    17 - Phenix’s crystallographic map calculations (based on CCTBX) are significantly more advanced and rigorous than mine (using Clipper). They are, however, much slower at present. Part of this is simply because the core C++ component is not parallelised (its single-threaded performance is actually about 30% faster than Clipper for the same problem) – Duncan Stockwell in the Read lab is currently looking at fixing this. The remainder is inefficiencies in the Python layer which could be mitigated/avoided with some further work. If it were possible, switching from Clipper to CCTBX for handling of crystallographic maps and symmetry would make things much easier both practically and politically. Practically, in the sense that ISOLDE would automatically benefit from the 20-30 years of theoretical advances in the field embodied in CCTBX. Politically, because my current path will otherwise put me in effective competition with Phenix: my overall vision with ISOLDE is essentially to merge model building, refinement and validation into one smooth continuous process – which, taken to its logical extreme, would eventually take “current” Phenix out of the picture entirely. I would much prefer to avoid that conflict if possible, and joining forces seems the most obvious way to do so. If I could have used CCTBX when starting this project I would have, but the incompatibility of Python versions made that effectively impossible (and the size and complexity of the code base is far too daunting for me to have taken on the task of modernising it). There is a down-side, though: while CCTBX is undoubtedly more advanced than Clipper in its capabilities it is much harder to understand and sparse on documentation (or inline code commenting), particularly in its C++ layer.
     17- **X-ray map calculations.** Phenix’s crystallographic map calculations (based on CCTBX) are significantly more advanced and rigorous than mine (using Clipper). They are, however, much slower at present. Part of this is simply because the core C++ component is not parallelised (its single-threaded performance is actually about 30% faster than Clipper for the same problem) – Duncan Stockwell in the Read lab is currently looking at fixing this. The remainder is inefficiencies in the Python layer which could be mitigated/avoided with some further work. If it were possible, switching from Clipper to CCTBX for handling of crystallographic maps and symmetry would make things much easier both practically and politically. Practically, in the sense that ISOLDE would automatically benefit from the 20-30 years of theoretical advances in the field embodied in CCTBX. Politically, because my current path will otherwise put me in effective competition with Phenix: my overall vision with ISOLDE is essentially to merge model building, refinement and validation into one smooth continuous process – which, taken to its logical extreme, would eventually take “current” Phenix out of the picture entirely. I would much prefer to avoid that conflict if possible, and joining forces seems the most obvious way to do so. If I could have used CCTBX when starting this project I would have, but the incompatibility of Python versions made that effectively impossible (and the size and complexity of the code base is far too daunting for me to have taken on the task of modernising it). There is a down-side, though: while CCTBX is undoubtedly more advanced than Clipper in its capabilities it is much harder to understand and sparse on documentation (or inline code commenting), particularly in its C++ layer.
    1818
    19 - Airlie McCoy (here in the Read Lab, primary developer of Phaser for molecular replacement) has often talked about wanting to look seriously at ChimeraX as a potential front-end. This (I think) would be a relatively straightforward task compared to anything to do with model building, primarily involving showing the same model in a range of possible poses along with the resulting maps. There may be some scope for help from ISOLDE here as well: while molecular replacement is mostly a rigid-(multi)body search, once you get to the point where solutions are looking sufficiently “real” then some tightly-restrained settling of the model into the map may be justified.
     19- **X-ray phasing front-end.** Airlie McCoy (here in the Read Lab, primary developer of Phaser for molecular replacement) has often talked about wanting to look seriously at ChimeraX as a potential front-end. This (I think) would be a relatively straightforward task compared to anything to do with model building, primarily involving showing the same model in a range of possible poses along with the resulting maps. There may be some scope for help from ISOLDE here as well: while molecular replacement is mostly a rigid-(multi)body search, once you get to the point where solutions are looking sufficiently “real” then some tightly-restrained settling of the model into the map may be justified.
    2020
    21 - One thing that ISOLDE doesn’t do right now is refine atomic B-factors. This has no real practical impact for fitting of cryoEM maps (but is important for judging of fit to density and for final interpretation of the model), but has a serious impact in crystallography where errors in B-factors directly affect the quality of the calculated map. This again is an enormously complicated field; it would be great to be able to work directly with the refinement algorithms already implemented in Phenix. While I believe these do have room for improvement at low resolutions, this would save a huge amount of effort (and political capital) that would need to be spent in reimplementation, and set a solid base for future improvement.
     21- **Refining B-factors.** One thing that ISOLDE doesn’t do right now is refine atomic B-factors. This has no real practical impact for fitting of cryoEM maps (but is important for judging of fit to density and for final interpretation of the model), but has a serious impact in crystallography where errors in B-factors directly affect the quality of the calculated map. This again is an enormously complicated field; it would be great to be able to work directly with the refinement algorithms already implemented in Phenix. While I believe these do have room for improvement at low resolutions, this would save a huge amount of effort (and political capital) that would need to be spent in reimplementation, and set a solid base for future improvement.
    2222
    23 - Phenix (specifically, the command-line tool phenix.elbow) now incorporates an ANTECHAMBER-based pipeline for ligand parameterisation, which would provide an easy route to supporting most novel ligands in ISOLDE. Current limitations are that it only supports non-covalently bound ligands, and fails on metal-containing systems (e.g. heme). Like all Phenix command-line tools, phenix.elbow is a wrapper around a Python script. Going forward, the plan is for all of these tools to be based on a common template class (designed by Billy Poon on the Phenix team), providing a fairly consistent API. This would probably be an easy sell funding-wise in the context of the current COVID-19 crisis – I expect over the coming months we’re going to see a flood of complexes with potential inhibitors (experimentally determined and computationally predicted alike).
     23- **Ligand parameterization.** Phenix (specifically, the command-line tool phenix.elbow) now incorporates an ANTECHAMBER-based pipeline for ligand parameterisation, which would provide an easy route to supporting most novel ligands in ISOLDE. Current limitations are that it only supports non-covalently bound ligands, and fails on metal-containing systems (e.g. heme). Like all Phenix command-line tools, phenix.elbow is a wrapper around a Python script. Going forward, the plan is for all of these tools to be based on a common template class (designed by Billy Poon on the Phenix team), providing a fairly consistent API. This would probably be an easy sell funding-wise in the context of the current COVID-19 crisis – I expect over the coming months we’re going to see a flood of complexes with potential inhibitors (experimentally determined and computationally predicted alike).
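
The build-correct-iterate cycle described under interactive fragment building could be sketched roughly as follows. This is only an illustration of the control flow, not anything that exists in Phenix, ISOLDE or ChimeraX: `FragmentRequest`, `build_fragment` and the three callables (`quick_build`, `correct`, `is_acceptable`) are hypothetical names, and the model is represented as a plain list of residues purely for the sake of the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]


@dataclass
class FragmentRequest:
    """What the user supplies for one missing fragment (hypothetical structure)."""
    start_residue: int               # residue number where the gap begins
    end_residue: int                 # residue number where the gap ends
    sequence: str                    # one-letter sequence expected in the gap
    density_box: Tuple[Point, Point]  # (min corner, max corner) of the map region


def build_fragment(
    request: FragmentRequest,
    quick_build: Callable[[FragmentRequest, List[str]], List[str]],
    correct: Callable[[List[str]], List[str]],
    is_acceptable: Callable[[FragmentRequest, List[str]], bool],
    max_rounds: int = 5,
) -> Tuple[List[str], int]:
    """Alternate a fast automatic build with a correction pass.

    quick_build   -- stands in for a fast auto-building call (assumption)
    correct       -- stands in for settling/trimming by the user or ISOLDE (assumption)
    is_acceptable -- decides whether the fragment is good enough to stop
    Returns the current model and the number of rounds that were run.
    """
    model: List[str] = []  # residues kept so far
    for round_number in range(1, max_rounds + 1):
        candidate = quick_build(request, model)  # automatic step
        model = correct(candidate)               # correction step
        if is_acceptable(request, model):
            return model, round_number
    return model, max_rounds
```

Passing the two steps in as callables keeps the loop independent of whether the correction is human-driven or automatic, which matches the "whether human-driven or automatic" option mentioned in the text.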
    2424
    2525== REST Interface ==