Error when syncing data between Elasticsearch and MongoDB

2016-09-11 15:11:11 +08:00
 necpowman

The contents of mongo-connector.log are:

OperationFailed: TransportError(404, u'{"_index":"nicovideo","_type":"posts","_id":"27131240","found":false}')
2016-09-10 00:01:36,787 [ERROR] mongo_connector.oplog_manager:324 - Unable to process oplog document {u'h': -4790094769725122799L, u'ts': Timestamp(1473480246, 2), u'o': {u'$set': {u'view_count': 22, u'__v': 5, u'urls': [{u'cookie': u'sm29618346:1473480091:1473480091:0747757f18bebe4a:1', u'vip': False, u'type': u'mp4', u'get_at': datetime.datetime(2016, 9, 10, 4, 4, 6, 945000), u'value': u'http://smile-fnl11.nicovideo.jp/smile?m=29618346.70461'}]}}, u't': 2L, u'v': 2, u'ns': u'nicovideo.posts', u'o2': {u'_id': 29618346}, u'op': u'u'}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 310, in run
    ns, timestamp)
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 43, in wrapped
    reraise(new_type, exc_value, exc_tb)
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 32, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py", line 161, in update
    id=u(document_id))
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 69, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 330, in get
    doc_type, id), params=params)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 307, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 93, in perform_request
    self._raise_error(response.status, raw_data)
  File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py", line 105, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
OperationFailed: TransportError(404, u'{"_index":"nicovideo","_type":"posts","_id":"29618346","found":false}')

It looks like the data that the oplog entry refers to can't be found.
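To see which documents are affected, the `TransportError(404, ...)` lines in the log can be collected into a list of IDs to re-index later. This is only a rough sketch: `missing_ids` is a hypothetical helper name, and the regular expression assumes the log lines look exactly like the two `OperationFailed` lines above.

```python
import json
import re

# Matches the JSON payload inside lines such as:
#   OperationFailed: TransportError(404, u'{"_index":...,"found":false}')
_NOT_FOUND = re.compile(r"TransportError\(404, u?'(\{.*?\})'\)")


def missing_ids(log_lines):
    """Return unique (index, type, id) tuples reported as not found,
    in the order they first appear in the log."""
    seen = []
    for line in log_lines:
        match = _NOT_FOUND.search(line)
        if not match:
            continue
        info = json.loads(match.group(1))
        key = (info["_index"], info["_type"], info["_id"])
        if key not in seen:
            seen.append(key)
    return seen
```

Calling `missing_ids(open("mongo-connector.log"))` would then give the documents that exist in MongoDB's oplog but were not found in Elasticsearch.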

I filed an issue on GitHub but nobody replied, so I'm asking fellow V2EX users for help.

Node: Programmers
8 replies
Nexvar
2016-09-11 15:22:02 +08:00
Bump.
yeasy
2016-09-11 22:15:57 +08:00
The connector approach isn't very stable; I'd suggest implementing the sync yourself.
Nexvar
2016-09-12 01:00:35 +08:00
@yeasy Any ideas on how to go about implementing it yourself?
dangyuluo
2016-09-12 08:28:10 +08:00
I wrote a program that reads the data from the database into ES on its own at 5 a.m. every day.
yybeta
2016-09-12 08:59:46 +08:00
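I also tried the river plugin; it didn't work very well. If your mongo documents have a timestamp field, you can write a script that periodically checks for changes and incrementally syncs them to ES. That's how I do it.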
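A minimal sketch of that timestamp-based incremental sync, not the actual script mentioned above. It assumes every document carries a timestamp field (called `updated_at` here, a made-up name), that `collection` behaves like a pymongo collection (only `find` is used), and that `send_bulk` is a hypothetical callable that ships the actions to ES, for example a thin wrapper around `elasticsearch.helpers.bulk`. The index and type names are taken from the log in the question.

```python
from datetime import datetime


def build_bulk_actions(docs, index, doc_type):
    """Yield action dicts in the shape accepted by elasticsearch.helpers.bulk.

    "_type" only applies to older ES versions that still have mapping types.
    """
    for doc in docs:
        # Keep MongoDB's _id out of the body; it becomes the ES document id.
        source = {k: v for k, v in doc.items() if k != "_id"}
        yield {
            "_op_type": "index",  # create the document or overwrite it
            "_index": index,
            "_type": doc_type,
            "_id": str(doc["_id"]),
            "_source": source,
        }


def sync_once(collection, send_bulk, last_sync,
              ts_field="updated_at", index="nicovideo", doc_type="posts"):
    """Push documents changed after last_sync; return the new watermark."""
    docs = list(collection.find({ts_field: {"$gt": last_sync}}))
    if not docs:
        return last_sync
    send_bulk(build_bulk_actions(docs, index, doc_type))
    return max(doc[ts_field] for doc in docs)
```

A cron job or loop would call `sync_once` periodically and persist the returned timestamp between runs. Because every action is an `index` operation, a document that is missing from ES is simply created rather than failing with a 404. Note that deletions in MongoDB are not picked up by polling a timestamp field.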
necpowman
2016-09-12 14:29:49 +08:00
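@dangyuluo @yybeta
Could either of you share some code I could use as a reference? I stayed up all night writing an import script, and when I tested it today it turned out not to work...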
yybeta
2016-09-12 17:49:27 +08:00
@necpowman Sent, and I've @-mentioned you.
yeasy
2016-09-14 14:48:29 +08:00
The approaches are all pretty much the same.
Export from mongo as JSON, then import it into ES.
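As a rough illustration of the export-then-import idea, the sketch below turns MongoDB-style dicts into the newline-delimited JSON body that Elasticsearch's `_bulk` endpoint expects (one action line, then one source line per document). `to_bulk_ndjson` is a hypothetical helper name, and the way non-JSON values such as `datetime` are converted is just one possible choice.

```python
import json
from datetime import datetime


def _json_default(value):
    """Convert values the json module can't encode on its own."""
    if isinstance(value, datetime):
        return value.isoformat()
    return str(value)  # e.g. a MongoDB ObjectId


def to_bulk_ndjson(docs, index, doc_type):
    """Build a _bulk request body: an action line followed by a source line
    for each document, with a trailing newline at the end."""
    lines = []
    for doc in docs:
        source = dict(doc)
        doc_id = source.pop("_id")  # used as the ES id, not kept in the body
        action = {"index": {"_index": index, "_type": doc_type,
                            "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source, default=_json_default))
    return ("\n".join(lines) + "\n") if lines else ""
```

The returned string can then be sent as the body of a POST to `_bulk`. For large collections it would make sense to call this on batches of documents rather than on the whole collection at once.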


https://www.v2ex.com/t/305462
