Opened 2 years ago
Closed 17 months ago
#9799 closed enhancement (worksforme)
Make sure 5 character CCD templates in mmCIF files work
Reported by: | Tom Goddard | Owned by: | Greg Couch |
---|---|---|---|
Priority: | moderate | Milestone: | 1.9 |
Component: | Input/Output | Version: | |
Keywords: | | Cc: | pett |
Blocked By: | | Blocking: | |
Notify when closed: | | Platform: | all |
Project: | ChimeraX | | |
Description
The PDB expects to start using 5 character chemical component names by the end of 2023 because it is running out of 3 character names.
We should check that ChimeraX works with 5 character CCD names. There are example files on GitHub:
https://github.com/wwPDB/extended-wwPDB-identifier-examples
Change History (3)
comment:1 by , 2 years ago
Milestone: | → 1.7 |
---|
comment:2 by , 2 years ago
Milestone: | 1.7 → 1.8 |
---|
There don't appear to be any CCD files with 5 character names to fetch yet. We will need to know what the URL will be. For 3 character CCD identifiers, we use: https://files.wwpdb.org/pub/pdb/refdata/chem_comp/{name[-1]}/{name}/{name}.cif.
Sent email to info@… asking what to use.
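For reference, the URL pattern above can be sketched in Python as follows. This is only an illustration, not the ChimeraX implementation: the function name `ccd_url`, the constant `CCD_BASE_URL`, and the upper-casing of the input are assumptions, and "A1LU6" is a hypothetical 5 character ID used purely as an example. It also assumes that 5 character names are hashed by their last character in the same way as 3 character names.

```python
# Base of the wwPDB chemical component tree, taken from the URL quoted above.
CCD_BASE_URL = "https://files.wwpdb.org/pub/pdb/refdata/chem_comp"


def ccd_url(name: str) -> str:
    """Build the CCD template URL for a component name.

    The directory is chosen by the last character of the name, so the
    same code handles 3 and 5 character names without a length check.
    """
    # Assumption: CCD names are stored upper case on the server.
    name = name.strip().upper()
    if not name:
        raise ValueError("empty chemical component name")
    return f"{CCD_BASE_URL}/{name[-1]}/{name}/{name}.cif"


if __name__ == "__main__":
    print(ccd_url("ATP"))    # existing 3 character name
    print(ccd_url("A1LU6"))  # hypothetical 5 character name
```

Because the hash directory comes from `name[-1]` rather than a fixed index, nothing in this sketch depends on the name being 3 characters long.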
comment:3 by , 17 months ago
Milestone: | 1.8 → 1.9 |
---|---|
Resolution: | → worksforme |
Status: | assigned → closed |
From info@… on 12/12/23:
Dear Greg,
My apologies for the delayed response.
This will remain the same as in the current FTP tree, i.e. with last character as hash.
(see https://www.wwpdb.org/news/news?year=2021#613b93b3ef055f03d1f222cf).
Best wishes,
Rachel
Here's the PDB email about the change:
From: Jasmine Young via pdb-l <pdb-l@…>
Subject: pdb-l: Coming Soon: PDB Entries with Novel Ligands Distributed Only in PDBx/mmCIF and PDBML File Formats
Date: September 18, 2023 at 9:24:27 AM PDT
To: pdb-l@…
Reply-To: Jasmine Young <jasmin@…>
Dear PDB-l.
At current growth rates, we anticipate running out of three-character Chemical Component IDs by the end of 2023. After this point, the wwPDB will issue *five-character alphanumeric accession codes for CCD IDs in the OneDep system*. To avoid confusion with current four-character PDB IDs, four-character codes will not be used. Owing to limitations of the legacy PDB file format, PDB entries containing the new five character ID codes will only be distributed in PDBx/mmCIF and PDBML formats (see previous announcement <http://www.wwpdb.org/news/news?year=2023#63ff72ccc031758bf1c30ff7>).
In addition, wwPDB has reserved a set of CCD IDs: 01 - 99, DRG, INH, LIG that will never be used in the PDB. These reserved codes can be used for new ligands during structure determination so that they can be identified as new upon deposition and added to the CCD during biocuration.
wwPDB asks users and software developers to review code to remove any current limitations on CCD ID lengths, and to enable use of PDBx/mmCIF format files. Example files with extended CCD IDs are available via GitHub <https://github.com/wwPDB/extended-wwPDB-identifier-examples> to assist code revisions. Information about the PDBx/mmCIF dictionary and file format is provided at mmcif.wwpdb.org <https://mmcif.wwpdb.org/>.
For any further information please contact us at info@….
--
Regards,
Jasmine
===========================================================
Jasmine Young, Ph.D.
Biocuration Team Lead
RCSB Protein Data Bank
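The ID rules in the announcement above (reserved IDs 01 - 99, DRG, INH, LIG, and no four-character codes) could be checked with something like the sketch below. The function names are hypothetical and this is not ChimeraX code. The length check assumes that existing CCD IDs are 1 to 3 characters long, which the announcement does not state, and the checks look only at length and reserved values, not at which characters are allowed.

```python
import re

# Reserved words listed in the wwPDB announcement; never issued by the PDB.
RESERVED_WORDS = {"DRG", "INH", "LIG"}


def is_reserved_ccd_id(ccd_id: str) -> bool:
    """Return True for IDs the announcement reserves: 01 - 99, DRG, INH, LIG."""
    ccd_id = ccd_id.strip().upper()
    if ccd_id in RESERVED_WORDS:
        return True
    # Two ASCII digits, excluding "00", which is outside the stated 01 - 99 range.
    return re.fullmatch(r"[0-9]{2}", ccd_id) is not None and ccd_id != "00"


def has_plausible_ccd_length(ccd_id: str) -> bool:
    """Return True for lengths 1-3 (assumed legacy IDs) or 5 (new IDs).

    Four-character codes are excluded to avoid confusion with PDB IDs.
    """
    return len(ccd_id.strip()) in (1, 2, 3, 5)


if __name__ == "__main__":
    for example in ("LIG", "07", "ATP", "ABCD", "A1LU6"):
        print(example, is_reserved_ccd_id(example), has_plausible_ccd_length(example))
```

A reader could use checks like these to flag placeholder ligands in a file, without imposing a 3 character limit on component names.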