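I've run into some difficulties with my graduation project, and I'm new to Hadoop. I have some scraped text files containing business information, in roughly this format: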
{
"status": 0,
"message": "ok",
"result": {
"name": "台北纯K(江北店)",
"location": {
"lng": 106.55089,
"lat": 29.585814
},
"address": "江北区北城天街46号九街高屋1楼(近同创国际)",
"telephone": "(023)67116711",
"uid": "d874d4cbb060e92e2bd7ab37",
"detail_info": {
"tag": "休闲娱乐;ktv",
"detail_url": "http://api.map.baidu.com/place/detail?uid\u003dd874d4cbb060e92e2bd7ab37\u0026output\u003dhtml\u0026source\u003dplaceapi_v2",
"type": "life",
"price": "95",
"overall_rating": "4.5",
"service_rating": "0",
"environment_rating": "0",
"image_num": "30",
"comment_num": "100",
"shop_hours": "11:00-",
"description": "门店介绍:"
}
}
}
{
"status": 0,
"message": "ok",
"result": {
"name": "欢乐迪KTV(未来国际店)",
"location": {
"lng": 106.53938,
"lat": 29.580435
},
"address": "观音桥步行街未来国际大厦5楼",
"telephone": "023-67704888",
"uid": "f696f3c267d7b4f21b11d5cd",
"detail_info": {
"tag": "休闲娱乐;ktv",
"detail_url": "http://api.map.baidu.com/place/detail?uid\u003df696f3c267d7b4f21b11d5cd\u0026output\u003dhtml\u0026source\u003dplaceapi_v2",
"type": "life",
"price": "23",
"overall_rating": "5.0",
"service_rating": "3.4",
"environment_rating": "3.6",
"image_num": "30",
"groupon_num": "10",
"comment_num": "1052",
"shop_hours": "13-00-次日凌晨3:30",
"alias": "HappydayKTV",
"description": "门店介绍:"
}
}
}
Now I'd like to use Hadoop to extract a few of the fields and print them, for example in the following format:
name overall_rating address
台北纯K(江北店) 4.5 江北区北城天街46号九街高屋1楼(近同创国际)
欢乐迪KTV(未来国际店) 5.0 观音桥步行街未来国际大厦5楼
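After that, I'd like to sort the records by rating and output the sorted result. Could anyone who knows this give me some pointers? Thanks~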
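
For reference, here is a minimal Python sketch of the two steps described above (pull out `name`, `overall_rating`, and `address`, then sort by rating). It is plain Python rather than an actual Hadoop job; the helper names `iter_records`, `extract`, and `top_rated` are made up for illustration, and the sample string is a trimmed-down copy of the two records shown earlier. It assumes the file is a sequence of concatenated JSON objects and that `overall_rating` is always a numeric string when present.

```python
import json


def iter_records(text):
    """Yield each top-level JSON object from a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def extract(record):
    """Return (name, overall_rating, address), or None if a field is missing."""
    result = record.get("result") or {}
    detail = result.get("detail_info") or {}
    name = result.get("name")
    rating = detail.get("overall_rating")
    if name is None or rating is None:
        return None
    # Ratings are stored as strings, so convert them for numeric sorting.
    return name, float(rating), result.get("address", "")


def top_rated(text):
    """Extract the three fields from every record, highest rating first."""
    rows = [row for row in map(extract, iter_records(text)) if row is not None]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Trimmed-down copy of the two sample records (assumption: other fields are not needed).
sample = """
{"status": 0, "result": {"name": "台北纯K(江北店)",
  "address": "江北区北城天街46号九街高屋1楼(近同创国际)",
  "detail_info": {"overall_rating": "4.5"}}}
{"status": 0, "result": {"name": "欢乐迪KTV(未来国际店)",
  "address": "观音桥步行街未来国际大厦5楼",
  "detail_info": {"overall_rating": "5.0"}}}
"""

print("name\toverall_rating\taddress")
for name, rating, address in top_rated(sample):
    print(f"{name}\t{rating}\t{address}")
```

If this were moved onto Hadoop Streaming, `extract` would roughly correspond to the mapper and the sort to the shuffle or a single reducer. One thing to watch: the records above span many lines, and input splits can cut a multi-line record in two, so it is probably safest to first rewrite the data as one JSON object per line.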