@widewing ctrl+v+a. I got that one wrong, sorry.
@ioth It's a delimited text file; each line is later split on the delimiter.
@crb912 OK, my mistake. I only wanted to ask how to iterate over the file faster.
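@silymore print is the slowest part? Then I'll remove it. I'm now running the `for line in input` version and the mmap version side by side, and mmap seems no faster than the former. Am I imagining it?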
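import mmap
import chardet

# Encodings decoded as detected; anything else (including None, which chardet
# reports when detection fails) is decoded as UTF-8 instead.
KNOWN_ENCODINGS = ['ascii', 'utf-8', 'GB2312', 'GBK', 'Big5', 'GB18030', 'windows-1252']

with open("test1.txt", "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
    while True:
        line = mm.readline()
        if line == '':  # readline() returns an empty string at end of file
            break
        # '\x01' is the Ctrl-A delimiter (typed as ctrl+v, a; displayed as ^A)
        for v in line.split('\x01'):
            try:
                # Detect once per field instead of up to three times
                encoding = chardet.detect(v)['encoding']
                if encoding in KNOWN_ENCODINGS:
                    print v.decode(encoding).encode('utf-8')
                else:
                    print v.decode('utf-8').encode('utf-8')
            except Exception:
                # Keep the offending line for later inspection
                with open('error_mmap.txt', 'a') as e:
                    e.write(line)
    mm.close()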
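
To check whether the difference is real rather than an impression, both loops can be timed on the same file. Below is a minimal sketch, not a definitive benchmark. The helper names (`count_fields_plain`, `count_fields_mmap`, `timed`, `make_sample`) are made up for illustration, and the sample file is a throwaway stand-in for test1.txt. It uses `access=mmap.ACCESS_READ` instead of `prot=mmap.PROT_READ` so it also runs outside Unix, and it only counts fields, with chardet and print left out, so it measures iteration alone.

```python
import mmap
import os
import tempfile
import time


def count_fields_plain(path, sep=b'\x01'):
    """Iterate with `for line in f` and count delimiter-separated fields."""
    total = 0
    with open(path, 'rb') as f:
        for line in f:
            total += len(line.rstrip(b'\n').split(sep))
    return total


def count_fields_mmap(path, sep=b'\x01'):
    """Iterate with mmap.readline() and count delimiter-separated fields."""
    total = 0
    with open(path, 'rb') as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            while True:
                line = mm.readline()
                if not line:  # empty result means end of file
                    break
                total += len(line.rstrip(b'\n').split(sep))
        finally:
            mm.close()
    return total


def timed(func, path):
    """Return (result, elapsed seconds) for one call of func(path)."""
    start = time.time()
    result = func(path)
    return result, time.time() - start


def make_sample(n_lines=1000):
    """Hypothetical sample data: three Ctrl-A separated fields per line."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as f:
        for i in range(n_lines):
            f.write(b'a\x01b\x01' + str(i).encode('ascii') + b'\n')
    return path


if __name__ == '__main__':
    sample = make_sample()
    try:
        for func in (count_fields_plain, count_fields_mmap):
            fields, elapsed = timed(func, sample)
            print('%s: %d fields in %.4fs' % (func.__name__, fields, elapsed))
    finally:
        os.remove(sample)
```

On a file as small as the sample the timings are mostly noise, so point `timed` at the real file and repeat each run a few times before comparing.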