MongoDB full-text index turns out to be slower than a regex

2017-06-07 00:53:46 +08:00
 linkbg

Version:

> version()
3.4.3

Creating the full-text index:

> db.tests.createIndex({'a':'text'}) // the values of field a are large, over 1024 characters

Query:

> db.tests.find({'$text':{'$search':'ere'}}).explain('executionStats')
"executionStats" : {
		"executionSuccess" : true,
		"nReturned" : 80018,
		"executionTimeMillis" : 306877,
		"totalKeysExamined" : 83546,
		"totalDocsExamined" : 83546,
		"executionStages" : {
			"stage" : "TEXT",
			"nReturned" : 80018,
			"executionTimeMillisEstimate" : 306699,
			"works" : 167095,
			"advanced" : 80018,
			"needTime" : 87076,
			"needYield" : 0,
			"saveState" : 16525,
			"restoreState" : 16525,
			"isEOF" : 1,
			"invalidates" : 0,
			"indexPrefix" : {
				
			},
			"indexName" : "a_text",
			"parsedTextQuery" : {
				"terms" : [
					"ii"
				],
				"negatedTerms" : [ ],
				"phrases" : [
					"ere"
				],
				"negatedPhrases" : [ ]
			},
			"textIndexVersion" : 3,
			"inputStage" : {
				"stage" : "TEXT_MATCH",
				"nReturned" : 80018,
				"executionTimeMillisEstimate" : 306649,
				"works" : 167095,
				"advanced" : 80018,
				"needTime" : 87076,
				"needYield" : 0,
				"saveState" : 16525,
				"restoreState" : 16525,
				"isEOF" : 1,
				"invalidates" : 0,
				"docsRejected" : 3528,
				"inputStage" : {
					"stage" : "TEXT_OR",
					"nReturned" : 83546,
					"executionTimeMillisEstimate" : 305932,
					"works" : 167095,
					"advanced" : 83546,
					"needTime" : 83548,
					"needYield" : 0,
					"saveState" : 16525,
					"restoreState" : 16525,
					"isEOF" : 1,
					"invalidates" : 0,
					"docsExamined" : 83546,
					"inputStage" : {
						"stage" : "IXSCAN",
						"nReturned" : 83546,
						"executionTimeMillisEstimate" : 1103,
						"works" : 83547,
						"advanced" : 83546,
						"needTime" : 0,
						"needYield" : 0,
						"saveState" : 16525,
						"restoreState" : 16525,
						"isEOF" : 1,
						"invalidates" : 0,
						"keyPattern" : {
							"_fts" : "text",
							"_ftsx" : 1
						},
						"indexName" : "a_text",
						"isMultiKey" : true,
						"isUnique" : false,
						"isSparse" : false,
						"isPartial" : false,
						"indexVersion" : 2,
						"direction" : "backward",
						"indexBounds" : {
							
						},
						"keysExamined" : 83546,
						"seeks" : 1,
						"dupsTested" : 83546,
						"dupsDropped" : 0,
						"seenInvalidated" : 0
					}

1. I don't quite understand why the TEXT_OR and TEXT_MATCH stages are needed.
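A rough sketch of what the three stages in the plan above appear to do, assuming the usual reading of a `$text` plan: IXSCAN reads the index entries for each search term, TEXT_OR merges the per-term results into one record per document (fetching the document and accumulating a score), and TEXT_MATCH re-checks each fetched document against phrases and negated terms. The documents, the inverted index and every function name below are hypothetical; this is plain JavaScript modelling the idea, not MongoDB internals.

```javascript
// Toy model of the IXSCAN -> TEXT_OR -> TEXT_MATCH pipeline.
// Every name and every document here is hypothetical; this is not MongoDB code.
const docs = new Map([
  [1, "Where were you"],
  [2, "Ere long we met"],
  [3, "Nothing relevant"],
]);

// Build a tiny inverted index: lower-cased word -> [{ id, weight }].
function buildIndex(docs) {
  const index = new Map();
  for (const [id, text] of docs) {
    const words = text.toLowerCase().split(/\W+/).filter(Boolean);
    for (const word of new Set(words)) {
      if (!index.has(word)) index.set(word, []);
      index.get(word).push({ id, weight: 1 / words.length });
    }
  }
  return index;
}

// IXSCAN: read the index entries stored under one term.
function ixscan(index, term) {
  return index.get(term) || [];
}

// TEXT_OR: union the per-term entries into one record per document,
// fetching the document and summing its score along the way.
function textOr(docs, keyLists) {
  const merged = new Map();
  for (const keys of keyLists) {
    for (const { id, weight } of keys) {
      const entry = merged.get(id) || { id, text: docs.get(id), score: 0 };
      entry.score += weight;
      merged.set(id, entry);
    }
  }
  return [...merged.values()];
}

// TEXT_MATCH: re-check each fetched document against the phrases and
// negated terms, which the term index alone cannot decide.
function textMatch(entries, phrases, negatedTerms) {
  return entries.filter(({ text }) => {
    const lower = text.toLowerCase();
    const words = lower.split(/\W+/).filter(Boolean);
    return phrases.every((p) => lower.includes(p)) &&
      !negatedTerms.some((t) => words.includes(t));
  });
}

// Run the three stages in order and report how many candidates survived.
function search(docs, index, { terms, phrases = [], negatedTerms = [] }) {
  const candidates = textOr(docs, terms.map((t) => ixscan(index, t)));
  const matched = textMatch(candidates, phrases, negatedTerms);
  return { candidates: candidates.length, ids: matched.map((e) => e.id) };
}

const index = buildIndex(docs);
const result = search(docs, index, { terms: ["ere", "were"], phrases: ["ere long"] });
console.log(result);
```

In this toy run the union produces two candidates and the phrase check rejects one of them. That mirrors the plan above, where TEXT_OR returns 83546 documents, TEXT_MATCH returns 80018, and `docsRejected` is 3528.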

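Ideally a full-text index should be faster than a regex; otherwise, what is the point of having one? But running the same search as a regex gives the result below every time, whether or not the a_text index exists, and even after restarting mongo to avoid warm data: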

> db.tests.find({'a':{'$regex':'ere','$options':'i'}}).explain('executionStats')
"executionStats" : {
		"executionSuccess" : true,
		"nReturned" : 81319,
		"executionTimeMillis" : 101701,
		"totalKeysExamined" : 0,
		"totalDocsExamined" : 4256954,
		"executionStages" : {
			"stage" : "COLLSCAN",
			"filter" : {
				"a" : {
					"$regex" : "ere",
					"$options" : "i"
				}
			},
			"nReturned" : 81319,
			"executionTimeMillisEstimate" : 101391,
			"works" : 4256956,
			"advanced" : 81319,
			"needTime" : 4175636,
			"needYield" : 0,
			"saveState" : 33964,
			"restoreState" : 33964,
			"isEOF" : 1,
			"invalidates" : 0,
			"direction" : "forward",
			"docsExamined" : 4256954
		}

Am I doing something wrong? Any pointers would be appreciated. Thanks.
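To put the two timings side by side, here is a small arithmetic sketch that uses only figures copied from the two explain outputs in this post (`executionTimeMillis`, `docsExamined`, `nReturned`). The variable and function names are made up for illustration, and an average per document is only a crude summary, not a profile of where the time goes.

```javascript
// Figures copied from the two explain('executionStats') outputs in this post.
const textQuery = { millis: 306877, docsExamined: 83546, returned: 80018 };
const regexQuery = { millis: 101701, docsExamined: 4256954, returned: 81319 };

// Average wall-clock cost per examined document, in milliseconds.
function msPerDoc(q) {
  return q.millis / q.docsExamined;
}

const textCost = msPerDoc(textQuery);
const regexCost = msPerDoc(regexQuery);

// How many times more expensive each examined document is under $text.
const ratio = textCost / regexCost;

// Fraction of the collection that the regex actually matched.
const selectivity = regexQuery.returned / regexQuery.docsExamined;

console.log(`$text:  ${textCost.toFixed(3)} ms per document examined`);
console.log(`$regex: ${regexCost.toFixed(3)} ms per document examined`);
console.log(`per-document cost ratio: ${ratio.toFixed(1)}`);
console.log(`regex match rate: ${(selectivity * 100).toFixed(1)}%`);
```

The text query examines about fifty times fewer documents, yet each one costs roughly 3.7 ms against roughly 0.02 ms for the collection scan. In the text plan the IXSCAN stage reports only 1103 ms while the TEXT_OR stage above it reports 305932 ms, so nearly all of the time is attributed to the stage that examines the documents rather than to reading the index.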

3538 views
Node: MongoDB
2 replies
mooncakejs
2017-06-07 06:48:09 +08:00
How many documents are there?
linkbg
2017-06-07 07:49:12 +08:00
@mooncakejs Millions of documents.

https://www.v2ex.com/t/366499