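Urgent: could the experts here please take a look at my site? Googlebot is crawling it aggressively and eating a huge amount of bandwidth. According to the site's traffic statistics, Googlebot uses about 20 GB of traffic a month, even though the site has only just launched, is very small, and normally gets almost no visitors. Checking the logs today, I found that IP 203.208.60.215 is crawling the site nonstop, and every request goes to the same script, logged as "job/index.php" 235126. The number after it, 235126, also looks excessive to me, and there is no file by that name at all. I blocked googlebot in robots yesterday, but it is still crawling today. Could someone who understands this help me analyze what is going on and what I should do? Thanks!! An excerpt of the log is below.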
203.208.60.215 - - [23/Aug/2017:01:18:31 +0800] "GET /job/index.php?job=35_50_970&city=32_394_3340&order=lastdate&keyword=B%B3%AC&all=0_0_70_16_62_57_7&tp=2&page=1 HTTP/1.1" 200 10492 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
qxu2063240029.my3w.com text/html "/usr/home/qxu2063240029/htdocs/job/index.php" 235126
203.208.60.215 - - [23/Aug/2017:01:18:31 +0800] "GET /job/index.php?job=35_50_155&city=32_394_3345&order=sdate&keyword=%B8%BE%BF%C6&all=53_0_68_0_62_57_7&tp=2&cert=3&page=1 HTTP/1.1" 200 10530 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
qxu2063240029.my3w.com text/html "/usr/home/qxu2063240029/htdocs/job/index.php" 228234
203.208.60.215 - - [23/Aug/2017:01:18:32 +0800] "GET /job/index.php?job=35_50_988&city=32_394_3345&order=sdate&keyword=%BC%EC%D1%E9&all=48_0_68_18_62_0_7&tp=2&cert=3&page=1 HTTP/1.1" 200 10546 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
qxu2063240029.my3w.com text/html "/usr/home/qxu2063240029/htdocs/job/index.php" 230826
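To put a number on how much each client is actually pulling, a rough Python sketch like the one below could total requests and response bytes per IP. It assumes the request lines follow the usual "combined" access-log layout, where the field after the status code is the response size in bytes; that is an assumption about this host's format, not something confirmed here. The second line of each entry (the one with the script path and the trailing number) is host-specific and its meaning is not known, so the sketch simply skips it. `LINE_RE`, `summarize`, and `SAMPLE` are names made up for illustration.

```python
import re
from collections import defaultdict

# Assumed layout: ip - - [time] "request" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Return {ip: {"requests": n, "bytes": total}} for lines that match."""
    totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            # Not a request line (for example the path/number line); skip it.
            continue
        size = 0 if m["size"] == "-" else int(m["size"])
        entry = totals[m["ip"]]
        entry["requests"] += 1
        entry["bytes"] += size
    return dict(totals)


# Two request lines and one follow-up line copied from the excerpt above.
SAMPLE = [
    '203.208.60.215 - - [23/Aug/2017:01:18:31 +0800] "GET /job/index.php?job=35_50_970&city=32_394_3340&order=lastdate&keyword=B%B3%AC&all=0_0_70_16_62_57_7&tp=2&page=1 HTTP/1.1" 200 10492 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    'qxu2063240029.my3w.com text/html "/usr/home/qxu2063240029/htdocs/job/index.php" 235126',
    '203.208.60.215 - - [23/Aug/2017:01:18:31 +0800] "GET /job/index.php?job=35_50_155&city=32_394_3345&order=sdate&keyword=%B8%BE%BF%C6&all=53_0_68_0_62_57_7&tp=2&cert=3&page=1 HTTP/1.1" 200 10530 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    for ip, stats in summarize(SAMPLE).items():
        print(ip, stats["requests"], "requests,", stats["bytes"], "bytes")
```

Feeding it a full day's log, for example `summarize(open(path, encoding="latin-1"))` with `path` pointing at the downloaded log file, would show whether the byte totals for this IP are in line with the 20 GB a month reported by the traffic statistics.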