Opened 5 years ago
Closed 5 years ago
#3528 closed defect (fixed)
Obsolete Blast hits
Reported by: | Tristan Croll | Owned by: | pett
---|---|---|---
Priority: | minor | Milestone: |
Component: | Sequence | Version: |
Keywords: | | Cc: | Conrad
Blocked By: | | Blocking: |
Notify when closed: | | Platform: | all
Project: | ChimeraX | |
Description
The following bug report has been submitted:
Platform: Linux-3.10.0-1127.13.1.el7.x86_64-x86_64-with-centos-7.8.2003-Core
ChimeraX Version: 1.0 (2020-06-04 23:15:07 UTC)
Description
A search using the Blast Protein tool picked up 4jcb, which has been obsoleted since 2014. Would be a good idea to regularly weed out obsolete entries from your database - they're usually removed for pretty good reason, and almost always replaced with improved models.
Log:
UCSF ChimeraX version: 1.0 (2020-06-04)
© 2016-2020 Regents of the University of California. All rights reserved.
How to cite UCSF ChimeraX
> open final.cif
Summary of feedback from opening final.cif
---
warnings | Unknown polymer entity '1' near line 599
Unknown polymer entity '2' near line 10916
Unknown polymer entity '3' near line 11350
Unknown polymer entity '4' near line 21311
Unknown polymer entity '5' near line 23726
8 messages similar to the above omitted
Atom C1 is not in the residue template for GPC /AV:101
Atom C1 is not in the residue template for GPC /BA:101
Atom C1 is not in the residue template for GPC /BB:101
Atom C1 is not in the residue template for GPC /BC:101
Atom C1 is not in the residue template for GPC /BD:101
36 messages similar to the above omitted
Missing or incomplete entity_poly_seq table. Inferred polymer connectivity.
Chain information for final.cif #1
---
Chain | Description
AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX | ?
BA BC BF BG BH BJ BK BL BM BN BO BP BQ BR BS BT BU BX ba bb bc bd be bf bg bh bi bj bk bl bm bo bp | ?
BB BD BE BI BV BW bn | ?
C | ?
H1 | ?
H2 | ?
L | ?
M | ?
UA | ?
UB | ?
UC | ?
aa | ?
ab ac ad ae af ag ah ai aj ak al am an ao ap | ?
> addh Summary of feedback from adding hydrogens to final.cif #1 --- warnings | Not adding hydrogens to /UA UNK 2 CB because it is missing heavy- atom bond partners Not adding hydrogens to /UA UNK 3 CB because it is missing heavy-atom bond partners Not adding hydrogens to /UA UNK 4 CB because it is missing heavy-atom bond partners Not adding hydrogens to /UA UNK 5 CB because it is missing heavy-atom bond partners Not adding hydrogens to /UA UNK 6 CB because it is missing heavy-atom bond partners 59 messages similar to the above omitted notes | No usable SEQRES records for final.cif (#1) chain AA; guessing termini instead No usable SEQRES records for final.cif (#1) chain AB; guessing termini instead No usable SEQRES records for final.cif (#1) chain AC; guessing termini instead No usable SEQRES records for final.cif (#1) chain AD; guessing termini instead No usable SEQRES records for final.cif (#1) chain AE; guessing termini instead 83 messages similar to the above omitted Chain-initial residues that are actual N termini: /AA HIS 2, /AB HIS 2, /AC HIS 2, /AD HIS 2, /AE HIS 2, /AF HIS 2, /AG HIS 2, /AH HIS 2, /AI HIS 2, /AJ HIS 2, /AK HIS 2, /AL HIS 2, /AM HIS 2, /AN HIS 2, /AO HIS 2, /AP HIS 2, /AQ HIS 2, /AR HIS 2, /AS HIS 2, /AT HIS 2, /AU HIS 2, /AV HIS 2, /AW HIS 2, /AX HIS 2, /BA GLY 6, /BB GLY 5, /BC GLY 6, /BD GLY 5, /BE GLY 5, /BF GLY 6, /BG GLY 6, /BH GLY 6, /BI GLY 5, /BJ GLY 6, /BK GLY 6, /BL GLY 6, /BM GLY 6, /BN GLY 6, /BO GLY 6, /BP GLY 6, /BQ GLY 6, /BR GLY 6, /BS GLY 6, /BT GLY 6, /BU GLY 6, /BV GLY 5, /BW GLY 5, /BX GLY 6, /C ALA 15, /H1 MET 1, /H2 SER 1, /L ALA 1, /M MET 1, /UA UNK 2, /UB UNK 18, /UC UNK 1, /aa HIS 2, /ab MET 1, /ac MET 1, /ad MET 1, /ae MET 1, /af MET 1, /ag MET 1, /ah MET 1, /ai MET 1, /aj MET 1, /ak MET 1, /al MET 1, /am MET 1, /an MET 1, /ao MET 1, /ap MET 1, /ba GLY 6, /bb GLY 6, /bc GLY 6, /bd GLY 6, /be GLY 6, /bf GLY 6, /bg GLY 6, /bh GLY 6, /bi GLY 6, /bj GLY 6, /bk GLY 6, /bl GLY 6, /bm GLY 6, /bn GLY 5, /bo GLY 6, /bp GLY 
6 Chain-initial residues that are not actual N termini: Chain-final residues that are actual C termini: /BA PHE 44, /BB PHE 44, /BC PHE 44, /BD PHE 44, /BE PHE 44, /BF PHE 44, /BG PHE 44, /BH PHE 44, /BI PHE 44, /BJ PHE 44, /BK PHE 44, /BL PHE 44, /BM PHE 44, /BN PHE 44, /BO PHE 44, /BP PHE 44, /BQ PHE 44, /BR PHE 44, /BS PHE 44, /BT PHE 44, /BU PHE 44, /BV PHE 44, /BW PHE 44, /BX PHE 44, /C ARG 302, /H2 ILE 181, /L LYS 273, /ba PHE 44, /bb PHE 44, /bc PHE 44, /bd PHE 44, /be PHE 44, /bf PHE 44, /bg PHE 44, /bh PHE 44, /bi PHE 44, /bj PHE 44, /bk PHE 44, /bl PHE 44, /bm PHE 44, /bn PHE 44, /bo PHE 44, /bp PHE 44 Chain-final residues that are not actual C termini: /AA TYR 46, /AB TYR 46, /AC TYR 46, /AD TYR 46, /AE TYR 46, /AF TYR 46, /AG TYR 46, /AH TYR 46, /AI TYR 46, /AJ TYR 46, /AK TYR 46, /AL TYR 46, /AM TYR 46, /AN TYR 46, /AO TYR 46, /AP TYR 46, /AQ TYR 46, /AR TYR 46, /AS TYR 46, /AT TYR 46, /AU TYR 46, /AV TYR 46, /AW TYR 46, /AX TYR 46, /H1 LYS 53, /M TYR 324, /UA UNK 33, /UB UNK 17, /UC UNK 14, /aa ALA 60, /ab ALA 60, /ac ALA 60, /ad ALA 60, /ae ALA 60, /af ALA 60, /ag ALA 60, /ah ALA 60, /ai ALA 60, /aj ALA 60, /ak ALA 60, /al ALA 60, /am ALA 60, /an ALA 60, /ao ALA 60, /ap ALA 60 4988 hydrogen bonds /AA TYR 46 is not terminus, removing H atom from 'C' /AB TYR 46 is not terminus, removing H atom from 'C' /AC TYR 46 is not terminus, removing H atom from 'C' /AD TYR 46 is not terminus, removing H atom from 'C' /AE TYR 46 is not terminus, removing H atom from 'C' 40 messages similar to the above omitted 46120 hydrogens added > open /run/media/tic20/storage/structure_dump/pu_qian/gprclh1-338aleft.mrc Opened gprclh1-338aleft.mrc, grid size 300,300,300, pixel 1.05, shown at level 0.0201, step 2, values float32 > volume gaussian #2 bfactor 40 > clipper associate #2,3 toModel #1 Chain information for final.cif --- Chain | Description 1.2/AA 1.2/AB 1.2/AC 1.2/AD 1.2/AE 1.2/AF 1.2/AG 1.2/AH 1.2/AI 1.2/AJ 1.2/AK 1.2/AL 1.2/AM 1.2/AN 1.2/AO 1.2/AP 1.2/AQ 1.2/AR 
1.2/AS 1.2/AT 1.2/AU 1.2/AV 1.2/AW 1.2/AX | ? 1.2/BA 1.2/BC 1.2/BF 1.2/BG 1.2/BH 1.2/BJ 1.2/BK 1.2/BL 1.2/BM 1.2/BN 1.2/BO 1.2/BP 1.2/BQ 1.2/BR 1.2/BS 1.2/BT 1.2/BU 1.2/BX 1.2/ba 1.2/bb 1.2/bc 1.2/bd 1.2/be 1.2/bf 1.2/bg 1.2/bh 1.2/bi 1.2/bj 1.2/bk 1.2/bl 1.2/bm 1.2/bo 1.2/bp | ? 1.2/BB 1.2/BD 1.2/BE 1.2/BI 1.2/BV 1.2/BW 1.2/bn | ? 1.2/C | ? 1.2/H1 | ? 1.2/H2 | ? 1.2/L | ? 1.2/M | ? 1.2/UA | ? 1.2/UB | ? 1.2/UC | ? 1.2/aa | ? 1.2/ab 1.2/ac 1.2/ad 1.2/ae 1.2/af 1.2/ag 1.2/ah 1.2/ai 1.2/aj 1.2/ak 1.2/al 1.2/am 1.2/an 1.2/ao 1.2/ap | ? > isolde start > set selectionWidth 4 Done loading forcefield > ui tool show Shell /opt/UCSF/ChimeraX/lib/python3.7/site-packages/IPython/core/history.py:226: UserWarning: IPython History requires SQLite, your history will not be saved warn("IPython History requires SQLite, your history will not be saved") Failed to add /run/media/tic20/storage/structure_dump/pu_qian/new_2020_05/GP1.xml: Residue template USER_GP1 with the same override level 0 already exists. Failed to add /run/media/tic20/storage/structure_dump/pu_qian/new_2020_05/GPC.xml: Residue template USER_GPC with the same override level 0 already exists. > set bgColor white > show sel > delete sel > clipper set contourSensitivity 0.25 > select clear > select /UA 226 atoms, 225 bonds, 1 model selected > view sel > ui tool show "Blast Protein" > blastprotein /ae database pdb cutoff 1e-3 matrix BLOSUM62 maxSeqs 100 name > bp1 Web Service: BlastProtein2 is a Python wrapper that calls blastp to search nr or pdb for sequences similar to the given protein sequence Opal service URL: http://webservices.rbvi.ucsf.edu/opal2/services/BlastProtein2Service Opal job id: appBlastProtein2Service15950736967141427397413 Opal status URL prefix: http://webservices.rbvi.ucsf.edu/appBlastProtein2Service15950736967141427397413 stdout.txt = standard output stderr.txt = standard error BlastProtein finished. 
> open 1xrd Summary of feedback from opening 1xrd fetched from pdb --- warning | Atom H1 is not in the residue template for MET /A:1 note | Fetching compressed mmCIF 1xrd from http://files.rcsb.org/download/1xrd.cif 1xrd title: Light-Harvesting Complex 1 Alfa Subunit from Wild-Type Rhodospirillum rubrum [more info...] Chain information for 1xrd --- Chain | Description 2.1/A 2.2/A 2.3/A 2.4/A 2.5/A 2.6/A 2.7/A 2.8/A 2.9/A 2.10/A | Light- harvesting protein B-880, α chain > matchmaker #2/A to #1/ae Parameters --- Chain pairing | bb Alignment algorithm | Needleman-Wunsch Similarity matrix | BLOSUM-62 SS fraction | 0.3 Gap open (HH/SS/other) | 18/18/6 Gap extend | 1 SS matrix | | | H | S | O ---|---|---|--- H | 6 | -9 | -6 S | | 6 | -6 O | | | 4 Iteration cutoff | 2 Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.1), sequence alignment score = 166.3 RMSD between 21 pruned atom pairs is 0.910 angstroms; (across all 52 pairs: 9.822) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.2), sequence alignment score = 169.3 RMSD between 26 pruned atom pairs is 1.041 angstroms; (across all 52 pairs: 8.892) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.3), sequence alignment score = 169.3 RMSD between 23 pruned atom pairs is 0.901 angstroms; (across all 52 pairs: 8.994) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.4), sequence alignment score = 160.8 RMSD between 21 pruned atom pairs is 0.966 angstroms; (across all 52 pairs: 9.853) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.5), sequence alignment score = 159.7 RMSD between 22 pruned atom pairs is 1.120 angstroms; (across all 52 pairs: 9.871) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.6), sequence alignment score = 157.8 RMSD between 27 pruned atom pairs is 0.862 angstroms; (across all 52 pairs: 8.182) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.7), sequence alignment score = 162.7 RMSD between 22 pruned atom pairs is 1.030 
angstroms; (across all 52 pairs: 9.215) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.8), sequence alignment score = 157.3 RMSD between 20 pruned atom pairs is 1.037 angstroms; (across all 52 pairs: 8.139) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.9), sequence alignment score = 165.7 RMSD between 23 pruned atom pairs is 0.988 angstroms; (across all 52 pairs: 10.126) Matchmaker final.cif, chain ae (#1.2) with 1xrd, chain A (#2.10), sequence alignment score = 151.2 RMSD between 25 pruned atom pairs is 0.983 angstroms; (across all 52 pairs: 7.352) > close #2 > open 4jc9 Summary of feedback from opening 4jc9 fetched from pdb --- warning | PDB entry 4JC9 has been replaced by 4V9G notes | Fetching compressed mmCIF 4jc9 from http://files.rcsb.org/download/4jc9.cif Fetching CCD SPO from http://ligand-expo.rcsb.org/reports/S/SPO/SPO.cif 4jc9 title: RC-LH1-PufX dimer complex from Rhodobacter sphaeroides [more info...] Chain information for 4jc9 #2 --- Chain | Description 1 2 3 5 7 D F J N P T V X Z | Light-harvesting protein B-875 α chain 4 6 8 9 E G I K O Q S U W Y | Light-harvesting protein B-875 β chain B | Intrinsic membrane protein PufX H | Reaction center protein H chain L | Reaction center protein L chain M | Reaction center protein M chain Non-standard residues in 4jc9 #2 --- BCL — bacteriochlorophyll A BPH — bacteriopheophytin A FE2 — Fe (II) ion PO4 — phosphate ion SPO — spheroidene U10 — ubiquinone-10 (Coenzyme Q10) > matchmaker #2/1 to #1/ae Parameters --- Chain pairing | bb Alignment algorithm | Needleman-Wunsch Similarity matrix | BLOSUM-62 SS fraction | 0.3 Gap open (HH/SS/other) | 18/18/6 Gap extend | 1 SS matrix | | | H | S | O ---|---|---|--- H | 6 | -9 | -6 S | | 6 | -6 O | | | 4 Iteration cutoff | 2 Matchmaker final.cif, chain ae (#1.2) with 4jc9, chain 1 (#2), sequence alignment score = 57.4 RMSD between 24 pruned atom pairs is 1.059 angstroms; (across all 42 pairs: 4.207) > close #2 OpenGL version: 3.3.0 NVIDIA 
450.51.05 OpenGL renderer: TITAN Xp/PCIe/SSE2 OpenGL vendor: NVIDIA Corporation Manufacturer: Dell Inc. Model: Precision T5600 OS: CentOS Linux 7 Core Architecture: 64bit ELF CPU: 32 Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz Cache Size: 20480 KB Memory: total used free shared buff/cache available Mem: 62G 8.1G 45G 238M 9.0G 53G Swap: 4.9G 0B 4.9G Graphics: 03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN Xp] [10de:1b02] (rev a1) Subsystem: NVIDIA Corporation Device [10de:11df] Kernel driver in use: nvidia PyQt version: 5.12.3 Compiled Qt version: 5.12.4 Runtime Qt version: 5.12.8
Change History (8)
comment:1 by , 5 years ago
Cc: | added |
---|---|
Component: | Unassigned → Sequence |
Owner: | set to |
Platform: | → all |
Project: | → ChimeraX |
Status: | new → accepted |
Summary: | ChimeraX bug report submission → Obsolete Blast hits |
comment:2 by , 5 years ago
Priority: | normal → minor |
---|
follow-up: 3 comment:3 by , 5 years ago
Well... that might explain a bit. Use of obsolete entries as templates was quite an issue in the last CASP round. I thought it was mostly people using (and not properly updating) their own bespoke databases - and that’s certainly true for some of the servers. But if the official NCBI database is also at fault... well, crap.
follow-up: 4 comment:4 by , 5 years ago
If you were feeling enthusiastic, the up-to-date list of obsolete entries (and their replacements, if any) is kept at http://ftp.wwpdb.org/pub/pdb/data/status/obsolete.dat. Rather than filtering them out of the database, it might be easier to just filter the search results against that list before formatting them for display?
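For illustration only, filtering search results against the obsolete list could look roughly like the sketch below. It is not how ChimeraX or the BlastProtein service does anything; the function names `parse_obsolete` and `filter_hits` are made up here, and the record layout (`OBSLTE`, a date, the obsolete ID, then zero or more replacement IDs, whitespace-separated) is an assumption about obsolete.dat that should be checked against the real file. The sample text uses the 4JC9 to 4V9G replacement reported in the log above; the dates and the `0XXX` entry are placeholders.

```python
def parse_obsolete(text):
    """Map each obsolete PDB ID (upper case) to its list of replacement IDs."""
    obsolete = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumed record layout: OBSLTE <date> <old id> [<new id> ...]
        # Header lines and blank lines do not start with OBSLTE and are skipped.
        if len(fields) >= 3 and fields[0] == "OBSLTE":
            obsolete[fields[2].upper()] = [f.upper() for f in fields[3:]]
    return obsolete


def filter_hits(hit_ids, obsolete):
    """Split hit IDs into those still current and those that are obsolete."""
    kept, dropped = [], {}
    for pdb_id in hit_ids:
        key = pdb_id.upper()
        if key in obsolete:
            dropped[key] = obsolete[key]  # remember replacements for reporting
        else:
            kept.append(pdb_id)
    return kept, dropped


# Placeholder sample in the assumed format (dates and 0XXX are not real data).
SAMPLE = """\
 LIST OF OBSOLETE COORDINATE ENTRIES AND SUCCESSORS
OBSLTE    01-JAN-00 4JC9     4V9G
OBSLTE    01-JAN-00 0XXX
"""

obsolete = parse_obsolete(SAMPLE)
kept, dropped = filter_hits(["1xrd", "4jc9"], obsolete)
print("kept:", kept)
print("dropped:", dropped)
```

In practice the text would come from downloading the obsolete.dat URL given above rather than from an inline sample. Keeping the replacement IDs around would let the results display point users at the superseding entry instead of silently dropping the hit.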
comment:6 by , 5 years ago
I have contacted NLM about the obsolete entries.
Case #: CAS-595013-C6T1M0
comment:7 by , 5 years ago
The response from NLM:
Hello,
Thank you for the notice. I'll notify the blast developers and Structure group and we'll see what we can do.
Best regards,
Wayne
-=-=-=-=-=-=-=-=-=-=-=-
Wayne Matten, PhD
NCBI Customer Services
NCBI | NLM | NIH
comment:8 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
It looks like NLM has removed the obsolete structures from the blast database.