New to V2EX, so I'm reposting one of my earlier blog posts.
Issues fixed in 3.0.4
The following are the more important JIRA issues fixed in 3.0.4.
Since the list was compiled by hand, omissions are inevitable; additions are welcome.
With the WiredTiger engine, if there are heavy writes to the source chunk during a chunk migration, the migrated chunk may end up missing some of that data.
If a handle is busy when a checkpoint starts (for example, during a bulk load), and the checkpoint then touches that handle (for example, through a forced drop), the checkpoint fails.
The journal's fdatasync was mistakenly performed on the data directory instead. The reason is that __wt_log_open initializes log->log_dir_fh by calling __wt_open(session, "journal", ...), which expands the path to "/mnt/db/journal" before passing it to __open_directory. But __open_directory appears to expect the path to a file in the directory rather than the path of the directory, as it strips the final component of the path and opens "/mnt/db". The result is that the wt log system calls fdatasync on "/mnt/db" and not on "/mnt/db/journal".
There are two consequences to this:
• Performance - the unnecessary fdatasync calls on /mnt/db can get stuck behind a lengthy call to fdatasync on a .wt file, or otherwise be hampered by I/O to .wt files; whereas an fdatasync call to /mnt/db/journal should be unimpeded by any activity in /mnt/db if /mnt/db/journal has been placed on a separate drive, per our recommended best practices.
• Durability - I believe durability of journaled writes depends on the fdatasync of /mnt/db/journal, in order to ensure that newly created journal files are durable.
Before marking a node as down, retry the heartbeat check.
During initial sync, the node's state is reset and the oplog emptied; this can trigger the rollback logic to attempt a rollback, fail, and lead to a controlled shutdown.
The rollback logic should ignore an empty oplog during rollback to accommodate an initial sync or re-sync running at the same time.
With WiredTiger, the server may fail to recover on startup after an unexpected crash (if the crash interrupted a checkpoint).
Throughput drops noticeably while a checkpoint is being committed.
mongos keeps connections to every shard but does not directly manage their lifetime; an error can occur when mongos connects to a replica set member that is in the RECOVERING state.
With WiredTiger, a bulk insert or bulk update issued with upsert=true that matches no document, where the updated field carries a unique index, can produce large numbers of WriteConflict errors; the triggering pattern is sketched below.
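A minimal mongo shell sketch of that pattern; the collection and field names here (users, email) are hypothetical:

// Hypothetical schema: a unique index on email.
db.users.createIndex({ email: 1 }, { unique: true })

// Run concurrently from several clients: each update matches no existing
// document, so the upserts race to insert the same unique key, which can
// surface as WriteConflict errors under WiredTiger.
db.users.update(
    { email: "a@example.com" },
    { $set: { visits: 1 } },
    { upsert: true }
)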
Reads on a secondary can block replication: when the secondary is under heavy load, long-running queries there may fail to yield their locks, causing replication lag. A couple of shell helpers for spotting this are shown below.
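These are stock shell helpers for observing the symptom, not part of the fix itself:

rs.printSlaveReplicationInfo()   // per-secondary lag behind the primary
db.currentOp()                   // look for long-running reads on the secondary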
When a secondary is building multiple indexes on the same collection in background mode, a change to that collection's metadata during the builds can cause a fatal error (the kind of workload involved is sketched below).
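A sketch of that workload; the collection and key names are hypothetical:

// Several background index builds on one collection; these replicate to
// secondaries, where a concurrent metadata change on the collection could
// hit this bug.
db.orders.createIndex({ customerId: 1 }, { background: true })
db.orders.createIndex({ createdAt: 1 }, { background: true })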
A plan cache refresh problem: once a plan is cached for a query shape, it is reused without re-validation, but queries of the same shape with different predicates can behave very differently.
This can be addressed by setting the internalQueryCacheReplanningEnabled parameter.
The query optimizer caches plans for each query shape and reuses these plans for a time. In situations where the performance of the cached plan is poor for a particular instance of the query shape, the optimizer may select the plan with poor performance and fail to evict the cache entry. This behavior may impact deployments where two queries with the same shape have different performance characteristics if they have different selectivity.
This improvement makes the query planner evaluate the cost of the cached query plan, and if the cost of this plan is too high, the query planner switches to a more efficient plan. This more efficient plan is then cached for future use.
This improvement is not enabled by default. To enable it, set the internalQueryCacheReplanningEnabled parameter to true using the setParameter command on a running system, or at start time using the setParameter command-line option or setParameter in the configuration file.
For example, to enable it at runtime using setParameter:
db.runCommand({setParameter: 1, internalQueryCacheReplanningEnabled: true})
This improvement can be disabled as follows:
db.runCommand({setParameter: 1, internalQueryCacheReplanningEnabled: false})
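For the start-time forms mentioned above, the equivalents would be, as a sketch:

mongod --setParameter internalQueryCacheReplanningEnabled=true

or in a YAML configuration file:

setParameter:
   internalQueryCacheReplanningEnabled: true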
This parameter is available in 3.0.4 and is disabled by default.
Zhou Liyang, community ID eshujiushiwo,
Director of Operations at Teambition.
Focused on MySQL and MongoDB, data architecture, server architecture, and related topics.
The first MongoDB certified DBA in mainland China; author of mongo-mopre and mongo-mload; moderator of the CSDN MongoDB board; founder of the MongoDB Shanghai User Group;
core member of the official MongoDB translation team; blogger on the MongoDB Chinese site; recipient of the MongoDB Contribution Award;
speaker at MongoDB Days Beijing 2014.
Contact: 378013446
MongoDB Shanghai User Group: 313290880
Feel free to get in touch.
When reposting, please credit the original link: http://www.mongoing.com/jira_3.0.4