Python 3: How to specify stdin encoding
While porting code from Python 2 to Python 3, I ran into this problem when reading UTF-8 text from standard input. In Python 2, this works fine:
for line in sys.stdin: ...
But Python 3 expects ASCII from sys.stdin, and if there are non-ASCII characters in the input, I get the error:
UnicodeDecodeError: 'ascii' codec can't decode byte .. in position ..: ordinal not in range(128)
For a regular file, I would specify the encoding when opening the file:
with open('filename', 'r', encoding='utf-8') as file:
    for line in file: ...
But how can I specify the encoding for standard input? Other posts have suggested using
input_stream = codecs.getreader('utf-8')(sys.stdin)
for line in input_stream: ...
However, this doesn't work in Python 3. I still get the same error message. I'm using Ubuntu 12.04.2 and my locale is set to en_US.UTF-8.
Python 3 does not expect ASCII from sys.stdin. It opens stdin in text mode and makes an educated guess as to which encoding is used. That guess may come down to ASCII, but that is not a given. See the sys.stdin documentation on how the codec is selected.
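To see which codec was guessed on a given machine, it can help to print what the interpreter actually chose. The following is a minimal sketch; the helper name describe_stdin_encoding is hypothetical, and showing the locale's preferred encoding next to it assumes the locale is what drives the guess on your system, which can vary with the Python version and its settings.

```python
import locale
import sys


def describe_stdin_encoding(stream=None):
    """Return the encoding-related settings of a text stream.

    Hypothetical helper for illustration; defaults to sys.stdin.
    """
    stream = sys.stdin if stream is None else stream
    return {
        # Codec the text layer uses to decode incoming bytes.
        "stream_encoding": getattr(stream, "encoding", None),
        # Error handler applied when decoding fails (for example 'strict').
        "stream_errors": getattr(stream, "errors", None),
        # Encoding suggested by the current locale settings.
        "locale_preferred": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for name, value in describe_stdin_encoding().items():
        print(name, "=", value)
```

If stream_encoding shows an ASCII codec even though the terminal locale is UTF-8, the process is probably being started with a different environment than the interactive shell.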
Like other file objects opened in text mode, the sys.stdin object derives from the io.TextIOBase base class; it has a .buffer attribute pointing to the underlying buffered IO instance (which in turn has a .raw attribute).
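The layering described above can be inspected on any file opened in text mode. This sketch uses a temporary file rather than sys.stdin so that it runs without piped input; layer_names and demo are hypothetical names chosen for illustration.

```python
import os
import tempfile


def layer_names(text_stream):
    """Return the class names of the text, buffered, and raw layers."""
    buffered = text_stream.buffer  # binary layer: yields bytes, not str
    raw = buffered.raw             # unbuffered layer closest to the OS
    return [
        type(text_stream).__name__,
        type(buffered).__name__,
        type(raw).__name__,
    ]


def demo():
    """Open a small UTF-8 file in text mode and report its layers."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "wb") as f:
            f.write("na\u00efve\n".encode("utf-8"))
        with open(path, "r", encoding="utf-8") as text_stream:
            return layer_names(text_stream)


if __name__ == "__main__":
    print(demo())
```

Calling layer_names(sys.stdin) in a regular script should show the same kind of stack, although the exact classes can differ when stdin has been replaced by a test runner or an IDE.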
Wrap the sys.stdin.buffer attribute in a new io.TextIOWrapper() instance to specify a different encoding:
import io
import sys

input_stream = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8')
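As a sketch of how the wrapper might be used in the original loop, the generator below decodes any binary stream as UTF-8 line by line. The names utf8_lines and read_stdin_utf8 are hypothetical, and detaching the wrapper at the end is a design choice made here so that the underlying buffer is not closed when the wrapper is discarded.

```python
import io
import sys


def utf8_lines(binary_stream):
    """Yield lines from a binary stream, always decoding them as UTF-8."""
    text_stream = io.TextIOWrapper(binary_stream, encoding="utf-8")
    try:
        for line in text_stream:
            yield line.rstrip("\n")
    finally:
        # Release the binary stream so discarding the wrapper does not close it.
        text_stream.detach()


def read_stdin_utf8():
    """Convenience wrapper: iterate over standard input decoded as UTF-8."""
    return utf8_lines(sys.stdin.buffer)
```

In a script, `for line in read_stdin_utf8(): ...` would then replace `for line in sys.stdin: ...`. Because the function accepts any binary stream, it can be exercised with io.BytesIO without touching the real standard input.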
Alternatively, set the PYTHONIOENCODING environment variable to the desired codec when running python.
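One way to check the effect of PYTHONIOENCODING is to start a child interpreter with the variable set and ask it which codec its sys.stdin uses. This is a minimal sketch; child_stdin_encoding is a hypothetical helper, and it assumes that sys.executable points at a usable Python 3 interpreter.

```python
import os
import subprocess
import sys


def child_stdin_encoding(io_encoding):
    """Run a child Python with PYTHONIOENCODING set; return its stdin codec."""
    # Copy the current environment and override only the one variable.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.encoding)"],
        env=env,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


if __name__ == "__main__":
    print(child_stdin_encoding("utf-8"))
```

From a POSIX shell, the equivalent would be to prefix the command, for example `PYTHONIOENCODING=utf-8 python3 script.py < input.txt`, where script.py and input.txt are placeholder names.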