The code is as follows:
#! /usr/bin/env python
# -*- coding: utf-8 -*-
"""
*
"""
import sys
import time

from nltk.tag import StanfordPOSTagger

# Python 2 only: make utf-8 the default for implicit str/unicode conversion.
reload(sys)
sys.setdefaultencoding('utf-8')

model_filename = "./data/modles/pos.tagger"
path_to_jar = "./stanford-postagger.jar"
Tagger = StanfordPOSTagger(model_filename=model_filename, path_to_jar=path_to_jar)

if __name__ == "__main__":
    st = time.time()
    # st is never reset, so each printed time is cumulative since the first call.
    print Tagger.tag([u"你的", u"百度", u"打人"]), time.time()-st
    print Tagger.tag([u"你的", u"百度", u"打人"]), time.time()-st
    print Tagger.tag([u"你的", u"百度", u"打人"]), time.time()-st
Output:
[(u'\u4f60\u7684', u'nz'), (u'\u767e\u5ea6', u'nz'), (u'\u6253\u4eba', u'v')] 5.10674095154 s
[(u'\u4f60\u7684', u'nz'), (u'\u767e\u5ea6', u'nz'), (u'\u6253\u4eba', u'v')] 10.2533240318 s
[(u'\u4f60\u7684', u'nz'), (u'\u767e\u5ea6', u'nz'), (u'\u6253\u4eba', u'v')] 16.8123478889 s
It is surprisingly slow. Could someone more experienced advise: am I using it the wrong way, or is something else going on?
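Since the script sets its start time only once, the three figures in the output are running totals, which works out to roughly 5 to 6.5 seconds per call. A minimal sketch of timing each call on its own follows. `time_each_call` and `fake_tag` are hypothetical names introduced here, and `fake_tag` is only a stand-in for `Tagger.tag` so that the snippet runs without the Stanford jar.

```python
import time


def time_each_call(func, inputs):
    """Call func once per input and return a list of (result, seconds) pairs."""
    timings = []
    for item in inputs:
        start = time.time()  # reset the clock before every call
        result = func(item)
        timings.append((result, time.time() - start))
    return timings


def fake_tag(tokens):
    """Hypothetical stand-in for Tagger.tag: labels every token with 'x'."""
    return [(tok, "x") for tok in tokens]


results = time_each_call(fake_tag, [["a", "b"], ["c"]])
```

With the real tagger, passing `Tagger.tag` in place of `fake_tag` should show the per-call cost directly instead of a cumulative one.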
1
holajamc 2017-11-29 18:33:04 +08:00
Since you are already using Stanford, you might as well just use hankcs.
2
knightdf 2017-11-29 22:42:16 +08:00
Don't call it from Python. It seems to start a new JVM for every call, so of course it's slow.
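If staying with NLTK, one way to pay the JVM startup cost only once is to pass all sentences to `tag_sents()` in a single call. As far as I know, `StanfordPOSTagger.tag()` delegates to `tag_sents()`, and each `tag_sents()` call launches Java once. The sketch below illustrates the difference with a counting stub. `CountingTagger` and `tag_all_at_once` are hypothetical names, and the stub only imitates the real tagger's call pattern rather than running it.

```python
# -*- coding: utf-8 -*-


def tag_all_at_once(tagger, sentences):
    """Tag every sentence with one tag_sents() call (one Java launch)."""
    return tagger.tag_sents(sentences)


class CountingTagger(object):
    """Hypothetical stand-in for StanfordPOSTagger that counts launches."""

    def __init__(self):
        self.launches = 0

    def tag_sents(self, sentences):
        # The real tagger would start a Java process here (assumption).
        self.launches += 1
        return [[(tok, "x") for tok in sent] for sent in sentences]

    def tag(self, tokens):
        # Mirrors the delegation to tag_sents() with a single sentence.
        return self.tag_sents([tokens])[0]


sentences = [[u"你的", u"百度", u"打人"]] * 3

# One tag() call per sentence: three launches.
stub = CountingTagger()
for sent in sentences:
    stub.tag(sent)
per_call_launches = stub.launches

# One batched call: a single launch for all three sentences.
stub = CountingTagger()
batched = tag_all_at_once(stub, sentences)
batch_launches = stub.launches
```

With the real `Tagger`, `Tagger.tag_sents(sentences)` would be the equivalent batched call. For a long-running service, a persistent tagging server avoids the startup cost altogether, but that is outside this sketch.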