Opened 20 months ago
Last modified 19 months ago
#14713 accepted defect
DSSP output in ChimeraX
Reported by: | Owned by: | pett | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Structure Analysis | Version: | |
Keywords: | Cc: | Elaine Meng, Tom Goddard, m.hekkelman@…, a.perrakis@…, Greg Couch | |
Blocked By: | Blocking: | ||
Notify when closed: | Platform: | all | |
Project: | ChimeraX |
Description
Dear ChimeraX developers,
We are the current developers of DSSP, and during a group meeting the DSSP implementation in ChimeraX came up. From the documentation I gather that you are using a reimplementation of the 1983 algorithm. There have been a number of updates to that algorithm. For instance, there was an update for Alpha-bulges, and more recently we added detection of poly-proline type II helices (a.k.a. kappa-helices). To incorporate the latter, we have made backward-compatible changes to the DSSP format (see https://pdb-redo.eu/dssp/about#DSSP). At the same time we also added direct annotation of mmCIF files (see https://github.com/PDB-REDO/dssp/blob/trunk/libdssp/mmcif_pdbx/dssp-extension.dic for the dictionary extension).
The current version of DSSP is available as a stand-alone executable, library code and an API (see https://pdb-redo.eu/dssp/download). The code is under a BSD 2-clause license. Would you be interested in incorporating the current version of DSSP in ChimeraX, reading the mmCIF output and/or executing the algorithm? We would be happy to discuss by mail or in a short teleconference.
On a related note, we also develop the PDB-REDO databank and webserver/service. It would be great if PDB-REDO entries (i.e. re-refined and rebuilt versions of PDB entries, frequently better models than the original; see pdb-redo.eu) could be loaded directly into ChimeraX, equivalent to how this is done for PDB and EMDB entries. We host both models and map coefficients in MTZ format. For example, PDB(-REDO) entry '1cbs' is hosted as https://pdb-redo.eu/db/1cbs/1cbs_final.cif for the model and https://pdb-redo.eu/db/1cbs/1cbs_final.mtz for the electron density maps (normal, difference and, if available, anomalous). Similar options for fetching PDB-REDO data are available in Coot, Moorhen, YASARA and CCP4mg, and we believe this could also be very useful for ChimeraX users.
Best wishes, Robbie Joosten
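For anyone wanting to script against the hosting scheme described above, here is a minimal sketch that builds the two URLs for an entry. It assumes the '1cbs' pattern generalizes to any 4-character PDB ID; the function name `pdb_redo_urls` is hypothetical and is not part of ChimeraX or PDB-REDO.

```python
# Sketch only: the URL pattern is inferred from the single '1cbs' example
# in the description and is assumed to hold for other entries.
PDB_REDO_BASE = "https://pdb-redo.eu/db"


def pdb_redo_urls(pdb_id: str) -> dict[str, str]:
    """Return assumed model (.cif) and map-coefficient (.mtz) URLs."""
    entry = pdb_id.strip().lower()
    # Assumption: classic 4-character alphanumeric PDB identifiers.
    if len(entry) != 4 or not entry.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    return {
        "model": f"{PDB_REDO_BASE}/{entry}/{entry}_final.cif",
        "map": f"{PDB_REDO_BASE}/{entry}/{entry}_final.mtz",
    }


if __name__ == "__main__":
    for kind, url in pdb_redo_urls("1cbs").items():
        print(kind, url)
```

The sketch only constructs the URLs; downloading and handling missing entries are left out on purpose.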
Change History (8)
comment:1 by , 20 months ago
Cc: | added |
---|---|
Component: | Unassigned → Structure Analysis |
Owner: | set to |
Platform: | → all |
Project: | → ChimeraX |
Status: | new → accepted |
comment:2 by , 19 months ago
Dear Eric,
Thanks for picking this up. We will have a look at the API documentation. The API works (the form on the main page uses it) but, yeah, we forgot to document it properly. Anyway, I'm happy to hear that you are interested in incorporating the current DSSP in one way or another. We are happy to help if needed.
With respect to supporting PDB-REDO, please add me to the ticket.
Best wishes, Robbie
comment:3 by , 19 months ago
Cc: | added |
---|
Hi Robbie,
I added you to the PDB-REDO ticket. One thing I wanted to mention is that ChimeraX uses some highly specialized code tuned to the exact format used by the PDB for their mmCIF files to read them as fast as possible. If it encounters files that don't match that format, it can still read them but not as fast and it issues a warning to the ChimeraX log.
The PDB-REDO mmCIFs don't quite match the PDB format because in their atom information the numerical fields are right justified instead of left justified. So right now ChimeraX issues a warning when reading them and uses the slightly slower parsing. I don't know if you are interested in changing this in your files or not. If you are, I would just wait for the REDO files to get updated. Otherwise, I can suppress the warning so that users don't see it all the time...
--Eric
comment:5 by , 19 months ago
Although 3x is only the parse time -- the time to create the molecular data and draw the structure is unaffected, so the overall time from opening the file to displaying the structure is slowed, but not by 3x.
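To put rough numbers on that: if parsing accounts for a fraction p of the total open-to-display time, slowing only the parse by a factor f scales the total by p*f + (1 - p). The sketch below is plain arithmetic; the 25% parse share in the usage line is a made-up illustration, not a measurement.

```python
def overall_slowdown(parse_fraction: float, parse_factor: float = 3.0) -> float:
    """Overall slowdown when only the parse step gets slower.

    parse_fraction: share of the original total time spent parsing (0..1).
    parse_factor:   how much slower the parse becomes (3.0 for "3x").
    """
    if not 0.0 <= parse_fraction <= 1.0:
        raise ValueError("parse_fraction must be between 0 and 1")
    # The non-parse part (building the molecular data, drawing) is unchanged.
    return parse_fraction * parse_factor + (1.0 - parse_fraction)


if __name__ == "__main__":
    # Hypothetical: parsing is 25% of the total time.
    print(overall_slowdown(0.25))  # 1.5
```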
comment:6 by , 19 months ago
Hi Eric,
I didn't realise an mmCIF parser could be sensitive to that. I don't think we ever benchmarked the performance of libcifpp (our cif library) on different formatting, mostly because the (mm)CIF format doesn't prescribe any particular white-space organisation. Now I'm interested in how you do with files from Phenix that use a single space between fields and have no column formatting.
Now since right justification is easiest to read (I'm afraid that I still have to do that more often than one would like), I don't think we are going to change our formatting anytime soon. I think that removing the warning makes sense. That said, if you find errors (as in dictionary non-compliance) in our mmCIF files, please scream bloody murder and tell us.
Cheers, Robbie
comment:7 by , 19 months ago
See https://readcif.readthedocs.io/en/latest/compare.html for benchmarking results. In the files from the PDB, the columns are left-justified. That is a requirement for reliably figuring out the column offsets from the first row of a table.
For PDB-REDO, if you leave out the audit_conform table from the generated mmCIF file, then it wouldn't look like it comes from the PDB, and ChimeraX would automatically use the slower parsing.
-- Greg
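To illustrate why justification matters for the offset trick, here is a toy sketch. It is not the readcif or ChimeraX code: the helper names are hypothetical, the rows are invented, and it ignores mmCIF quoting (values containing spaces). It records where each field starts in the first row and checks whether later rows start their fields at the same offsets.

```python
def column_offsets(row: str) -> list[int]:
    """Start offset of each whitespace-separated field in a row."""
    offsets = []
    in_field = False
    for i, ch in enumerate(row):
        if ch.isspace():
            in_field = False
        elif not in_field:
            offsets.append(i)
            in_field = True
    return offsets


def offsets_hold(rows: list[str]) -> bool:
    """True if every row's fields start at the first row's offsets."""
    first = column_offsets(rows[0])
    return all(column_offsets(r) == first for r in rows[1:])


# Invented rows: same values, padded on the right vs. on the left.
left = [
    "ATOM 1    N 12.5   3.25",
    "ATOM 10   C 112.75 13.5",
]
right = [
    "ATOM    1 N   12.5   3.25",
    "ATOM   10 C 112.75  13.5",
]

if __name__ == "__main__":
    print(offsets_hold(left))   # True
    print(offsets_hold(right))  # False
```

With left justification every field starts at a fixed offset regardless of its width, so offsets taken from the first row remain valid. With right justification the start moves with the width of the value, so a parser relying on those offsets has to fall back to tokenizing each row.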
comment:8 by , 19 months ago
Hi Greg,
Thank you for the link. The stylized read is a nice performance trick if you can rely on consistent formatting for an entire block. I tried that for reading reflection data from the PDB (back in the day) and found out that I couldn't rely on this, particularly when you also have line wrapping. But nowadays PDB entries are very consistent.
The audit_conform block describes the dictionary (version) and is important when you hold on to files for a longer time and want to make data as FAIR as you can. It is not PDB-specific per se, but we seem to be one of the few groups writing these records. We cannot drop these, but I'll try to think of another record that distinguishes PDB from PDB-REDO data.
Cheers, Robbie
Dear Robbie,
Sincerely, Eric