I wrote a word extractor and ran it on a single article:
http://www.bbc.co.uk/news/business-19661899
Results:
also : marking
although : many
anant : agarwal
answering : whether
anyone : learn
battle : growing, higher, open, global, doubt, numbers
beat : system
bought : being
chancellor : growing, doubt, martin, global, open, higher, vice, numbers, battle
commencing : humanities
complete : degree
conversations : difficult
currently : studying
depends : marking
dissemination : research
distance : institutions
doubt : open, numbers, higher
engineering : research, while
front : computer, sitting
global : story, doubt, continue, reading, open, numbers, higher
growing : higher, main, numbers, reading, story, continue, open, doubt
ideas : years
individual : open
institutions : research
iris : recognition
irony : computer
issue : less
issues : challenging
large : open
latter : former
lecture : traditional
level : degree
lift : rising, agarwal
literature : english
made : most
main : global, quote
majority : degree, level, quite
martin : vice, numbers, global, higher, battle, growing, doubt, open
maths : research, while
numbers : higher, open, large
open : higher
paper : computer
peer : grade
professor : world
quite : degree, level
quote : reading, continue
research : while, social
rising : agarwal
same : traditional
saudi : arabia
schools : most
sciences : while, research
shaping : global, numbers, open, martin, doubt, growing, chancellor, vice, higher, battle
sitting : computer
social : while
someone : latest
story : quote
subjects : questions
taking : exams
testing : traditional
through : halfway
vice : doubt, global, open, numbers, growing, battle, higher
code : cheating
engineering : social
growing : global
main : continue, story, reading
maths : engineering, social
reading : continue
sciences : maths, social, engineering
story : continue, reading
taylor : prof
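It looks like a lot of common words need to be filtered out; the results are pretty messy.
And for a large number of articles, I'm not sure what method to use...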
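
One possible direction for both problems, sketched under assumptions: drop words from a stop-word list, then weight the remaining words by TF-IDF so that words appearing in most articles score low. This is not the extractor used above, just a common baseline. The `STOP_WORDS` set and the function names `tokenize`, `tf_idf`, and `top_keywords` are made up for illustration, and the stop-word list is deliberately tiny.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that",
              "also", "although", "while", "most", "many", "quite", "same"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tf_idf(documents):
    """Return one {word: score} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each word appears in.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        # A word present in every document gets log(1) = 0, so it drops out.
        scores.append({w: (c / total) * math.log(n_docs / df[w])
                       for w, c in tf.items()})
    return scores


def top_keywords(documents, k=5):
    """Return the k highest-scoring words per document (ties broken alphabetically)."""
    return [sorted(s, key=lambda w: (-s[w], w))[:k] for s in tf_idf(documents)]
```

Calling `top_keywords(articles)` on a list of article texts gives one keyword list per article. With only one article every IDF is zero, so this only helps once there are several articles to compare against.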