Opened 6 years ago

Closed 6 years ago

#2333 closed defect (not a bug)

UTF-16 file

Reported by: p.albanese@…
Owned by: Eric Pettersen
Priority: normal
Milestone:
Component: Input/Output
Version:
Keywords:
Cc: Greg Couch, Tom Goddard
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

The following bug report has been submitted:
Platform:        Windows-10-10.0.17763
ChimeraX Version: 0.9 (2019-06-06)
Description
I'm trying to map crosslinks in the form of .pb files (I was already doing this with previous versions, daily releases actually), but with this release I get this message:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

File "C:\Program Files\ChimeraX\bin\lib\codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)

See log for complete Python traceback.

Is that something related to the PDB or the .pb file attached?

Log:
UCSF ChimeraX version: 0.9 (2019-06-06)  
© 2016-2019 Regents of the University of California. All rights reserved.  
How to cite UCSF ChimeraX  

> open "C:\\\Users\\\Alban001\\\Downloads\\\4wjg.pdb"

4wjg.pdb title:  
Structure of T. Brucei haptoglobin-hemoglobin receptor binding to human
haptoglobin-hemoglobin [more info...]  
  
Chain information for 4wjg.pdb #1  
---  
Chain | Description  
1 B G L Q V | hemoglobin subunit β  
2 C H M R W | zonulin  
3 D I N S X | iron-regulated surface determinant protein H  
4 E J O T Y | haptoglobin-hemoglobin receptor  
A F K P U Z | hemoglobin subunit α  
  
Non-standard residues in 4wjg.pdb #1  
---  
HEM — protoporphyrin IX containing Fe (HEME)  
NAG — N-acetyl-D-glucosamine  
OXY — oxygen molecule  
  

> select /4:36-296

1963 atoms, 1982 bonds selected  

> select /E:36-296

1963 atoms, 1982 bonds selected  

> select /1:1-146

1123 atoms, 1153 bonds selected  

> select /B:1-146

1123 atoms, 1153 bonds selected  

> select /B:1-146

1123 atoms, 1153 bonds selected  

> select /A:44@O

1 atom selected  

> select /1:1-146

1123 atoms, 1153 bonds selected  

> show selAtoms ribbons

> hide selAtoms

> select up

40807 atoms, 41616 bonds, 64 pseudobonds, 1 model selected  

> select up

41424 atoms, 42331 bonds, 64 pseudobonds, 1 model selected  

> select up

44086 atoms, 45092 bonds, 64 pseudobonds, 1 model selected  

> select up

46624 atoms, 47688 bonds, 64 pseudobonds, 1 model selected  

> select up

47792 atoms, 48888 bonds, 77 pseudobonds, 1 model selected  

> hide selAtoms

> show selAtoms ribbons

> select clear

> select /B:9

6 atoms, 5 bonds selected  

> select up

1530 atoms, 1560 bonds selected  

> select up

9578 atoms, 9843 bonds selected  

> select ~sel

38214 atoms, 39045 bonds, 51 pseudobonds, 1 model selected  

> delete sel

> save "K:\\\HbHp_TP\\\PDB\\\HpHb_human.pdb"

> select clear

> close #1

> open K:/HbHp_TP/PDB/HpHb_human.pdb

Summary of feedback from opening K:/HbHp_TP/PDB/HpHb_human.pdb  
---  
warning | Ignored bad PDB record found on line 21459  
END  
  
HpHb_human.pdb title:  
Structure of T. Brucei haptoglobin-hemoglobin receptor binding to human
haptoglobin-hemoglobin [more info...]  
  
Chain information for HpHb_human.pdb #1  
---  
Chain | Description  
A F | hemoglobin subunit α  
B G | hemoglobin subunit β  
C H | zonulin  
  
Non-standard residues in HpHb_human.pdb #1  
---  
HEM — protoporphyrin IX containing Fe (HEME)  
NAG — N-acetyl-D-glucosamine  
OXY — oxygen molecule  
  

> open "K:/HbHp_TP/PD results (tera2)/0_mapped.pb"

Traceback (most recent call last):  
File "C:\Program Files\ChimeraX\bin\lib\site-packages\chimerax\ui\gui.py",
line 572, in _qt_safe  
run(session, "open " + quote_if_necessary(paths[0]))  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\commands\run.py", line 31, in run  
results = command.run(text, log=log)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\commands\cli.py", line 2632, in run  
result = ci.function(session, **kw_args)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\commands\open.py", line 64, in open  
path_models = session.models.open(paths, format=format, name=name, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\models.py", line 601, in open  
session, filenames, format=format, name=name, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-packages\chimerax\core\io.py",
line 477, in open_multiple_data  
models, status = open_data(session, fspec, format=format, name=name, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-packages\chimerax\core\io.py",
line 433, in open_data  
models, status = open_func(*args, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\read_pbonds\__init__.py", line 23, in open_file  
return readpbonds.read_pseudobond_file(session, stream, file_name)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\read_pbonds\readpbonds.py", line 16, in read_pseudobond_file  
lines = stream.readlines()  
File "C:\Program Files\ChimeraX\bin\lib\codecs.py", line 322, in decode  
(result, consumed) = self._buffer_decode(data, self.errors, final)  
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0:
invalid start byte  
  
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0:
invalid start byte  
  
File "C:\Program Files\ChimeraX\bin\lib\codecs.py", line 322, in decode  
(result, consumed) = self._buffer_decode(data, self.errors, final)  
  
See log for complete Python traceback.  
  

> sequence chain #1/C#1/H

Alignment identifier is 1  

> select clear

> open "K:/HbHp_TP/PD results (tera2)/HbHp_75.pb"

Traceback (most recent call last):  
File "C:\Program Files\ChimeraX\bin\lib\site-packages\chimerax\ui\gui.py",
line 572, in _qt_safe  
run(session, "open " + quote_if_necessary(paths[0]))  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\commands\run.py", line 31, in run  
results = command.run(text, log=log)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\commands\cli.py", line 2632, in run  
result = ci.function(session, **kw_args)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\commands\open.py", line 64, in open  
path_models = session.models.open(paths, format=format, name=name, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\core\models.py", line 601, in open  
session, filenames, format=format, name=name, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-packages\chimerax\core\io.py",
line 477, in open_multiple_data  
models, status = open_data(session, fspec, format=format, name=name, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-packages\chimerax\core\io.py",
line 433, in open_data  
models, status = open_func(*args, **kw)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\read_pbonds\__init__.py", line 23, in open_file  
return readpbonds.read_pseudobond_file(session, stream, file_name)  
File "C:\Program Files\ChimeraX\bin\lib\site-
packages\chimerax\read_pbonds\readpbonds.py", line 16, in read_pseudobond_file  
lines = stream.readlines()  
File "C:\Program Files\ChimeraX\bin\lib\codecs.py", line 322, in decode  
(result, consumed) = self._buffer_decode(data, self.errors, final)  
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0:
invalid start byte  
  
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0:
invalid start byte  
  
File "C:\Program Files\ChimeraX\bin\lib\codecs.py", line 322, in decode  
(result, consumed) = self._buffer_decode(data, self.errors, final)  
  
See log for complete Python traceback.  
  




OpenGL version: 3.3.0
OpenGL renderer: Quadro FX 380/PCIe/SSE2
OpenGL vendor: NVIDIA Corporation
File attachment: 0_mapped.pb

Attachments (1)

0_mapped.pb (4.5 KB ) - added by p.albanese@… 6 years ago.
Added by email2trac

Change History (3)

by p.albanese@…, 6 years ago

Attachment: 0_mapped.pb added

Added by email2trac

comment:1 by Eric Pettersen, 6 years ago

Cc: Greg Couch, Tom Goddard added
Component: Unassigned → Input/Output
Owner: set to Eric Pettersen
Platform: all
Project: ChimeraX
Status: new → accepted
Summary: ChimeraX bug report submission → UTF-16 file

comment:2 by Eric Pettersen, 6 years ago

Resolution: not a bug
Status: accepted → closed

Hi Pascal,

The file you sent is in UTF-16 format with a 2-byte BOM (byte order mark: https://en.wikipedia.org/wiki/Byte_order_mark) at the beginning. I don't think any version of ChimeraX was ever able to read that particular file. Certainly the 0.9 production release (from early June) cannot. You need to write the file in either ASCII or UTF-8 format.

--Eric

Eric Pettersen
UCSF Computer Graphics Lab
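
For anyone hitting the same error, the following is a minimal sketch of one way to check a file for a byte order mark and rewrite it as UTF-8 outside of ChimeraX. It is not part of ChimeraX; the function names sniff_encoding and convert_to_utf8 and the output file name are hypothetical, and only the Python standard library is used. A UTF-16 little-endian file starts with the bytes 0xFF 0xFE, which matches the "byte 0xff in position 0" in the traceback above.

```python
import codecs
from pathlib import Path

# BOMs are checked longest first, because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on a leading BOM, or `default` if none is found."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return default


def convert_to_utf8(src, dst) -> str:
    """Rewrite `src` as BOM-less UTF-8 at `dst`; return the codec used to decode it."""
    raw = Path(src).read_bytes()
    encoding = sniff_encoding(raw)
    # The "utf-16", "utf-32" and "utf-8-sig" codecs consume the BOM while decoding.
    text = raw.decode(encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return encoding


if __name__ == "__main__":
    # Hypothetical file names; adjust to the actual location of the .pb file.
    used = convert_to_utf8("0_mapped.pb", "0_mapped_utf8.pb")
    print(f"decoded as {used}, wrote UTF-8 copy")
```

The converted copy should then be readable as plain UTF-8 text. Files without a BOM are assumed to already be UTF-8 in this sketch, so a BOM-less UTF-16 file would still need its encoding specified by hand.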
