We built a Redis that supports full-text search and relational queries



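Our enterprise file-sharing product OnceDoc and our management software use the in-memory database Redis. Redis is a key-value store written in C; it is compact, fast, and easy to deploy. Many high-traffic sites, including Twitter, GitHub, Weibo, Snapchat, Flickr, and Alibaba, use it for session storage and cache management. For performance reasons, Redis's built-in commands generally do not support lookup by value. Enterprise software, however, needs a database that can search, run complex conditional queries, and do aggregate analysis. To get these features we modified the Redis source code and created a new open-source fork, OnceDB. You can download the latest Windows build from GitHub to try it out.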

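Note: We have deployed some Linux instances for a number of customers, and they have run stably so far. The Windows version has not been tested in production.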

Version

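Our changes are based on Redis 3.x. Although the latest 4.x adds support for external modules, that does not suit our use case: external modules would increase the risk of running Redis and make our modifications harder. Redis has supported clustering (Redis Cluster) since 3.x, which is enough for our needs.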

Driver

We built a new node.js driver, oncedb-client, based on [email protected].

Installation

Install the oncedb-client driver module with npm:

npm install oncedb-client

Load the oncedb-client module with require and create a client. By default it connects to the Redis instance on local port 6379, after which you can use it to run queries from node.js.

var client = require("oncedb-client").createClient();

Querying strings: search [key pattern] operator value

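String is the most basic Redis type, and strings are binary safe, meaning a string can hold any data, such as a JPEG image or a serialized JSON object. The example below looks through keys matching text* for values equal to 'Kris'.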

var client = require("oncedb-client").createClient();

client.search('text*', '=', 'Kris', function(err, objs) {
  console.log(objs)
})

The result is an array: the first element is the key and the second is the value.

> [ 'text1', 'Kris' ]
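
Because the reply is a flat array that alternates keys and values, it can be handy to regroup it into pairs before use. The helper below is a minimal sketch; toPairs is a hypothetical name and is not part of oncedb-client.

```javascript
// Hypothetical helper (not part of oncedb-client): regroup a flat
// [key1, value1, key2, value2, ...] reply into [[key, value], ...] pairs.
function toPairs(flat) {
  var pairs = [];
  for (var i = 0; i + 1 < flat.length; i += 2) {
    pairs.push([flat[i], flat[i + 1]]);
  }
  return pairs;
}

console.log(toPairs(['text1', 'Kris']));
```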

Search for records containing Kris:

client.search('text*', '~', 'Kris', function(err, objs) {
  console.log(objs)
})

The output contains two key-value pairs:

> [ 'text1', 'Kris', 'text5', 'This is ok, Kris' ]
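
To make the two operators concrete, here is a minimal in-memory sketch of the semantics described above: = matches values exactly and ~ matches values that contain the given text. It is an illustration only, not OnceDB's C implementation. globToRegExp and searchStrings are hypothetical names, only the * and ? wildcards of key patterns are handled, and the text2 and note1 entries are made-up sample data.

```javascript
// Convert a Redis-style glob (only * and ? are handled here) to a RegExp.
function globToRegExp(pattern) {
  var escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
}

// Sequentially scan every key and apply the operator to its value.
// Returns a flat [key, value, key, value, ...] array like the driver output.
function searchStrings(store, keyPattern, operator, value) {
  var keyRe = globToRegExp(keyPattern);
  var result = [];
  store.forEach(function (stored, key) {
    if (!keyRe.test(key)) return;
    var hit = operator === '=' ? stored === value
            : operator === '~' ? stored.indexOf(value) !== -1
            : false; // other operators are not modeled in this sketch
    if (hit) result.push(key, stored);
  });
  return result;
}

var store = new Map([
  ['text1', 'Kris'],
  ['text2', 'Alice'],             // made-up sample value
  ['text5', 'This is ok, Kris'],
  ['note1', 'Kris']               // made-up key that does not match text*
]);

console.log(searchStrings(store, 'text*', '=', 'Kris'));
console.log(searchStrings(store, 'text*', '~', 'Kris'));
```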

Querying hashes: hsearch [key pattern] field operator value ...

A Redis hash is a map from string fields to string values. A key of type hash holds multiple fields, and each field maps to one value. Hashes are well suited to storing JSON objects. hsearch supports querying by field.

For example, the query below looks up userInfo records in which nVisit is greater than 100, written as {'>':'100'}. A MongoDB-like syntax also works: {'$gt':'100'}.

client.hsearch('userInfo:*', {
    'name'     : 'Mar'
  , 'gender'   : 'male'
  , 'nVisit'   : {'>':'100'}
}, function(err, objs) {
    console.log(objs)  
})

Result:

> [ { _key: 'userInfo:1006',
   name: 'Mar',
   gender: 'male',
   nVisit: '10000' } ]
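
The condition object can be read as a set of per-field tests. The in-memory sketch below illustrates that reading. It assumes that all conditions must hold at once (AND) and that > compares numerically, neither of which the post states explicitly. matchesConditions and hsearchSketch are hypothetical names, key-pattern matching is left out, only plain equality, > and its $gt alias are modeled, and the userInfo:1001 record is made up.

```javascript
// Map the MongoDB-like alias from the post onto its symbolic operator.
var ALIASES = { '$gt': '>' };

// Test one hash against a condition object. Plain values mean equality;
// {'>': n} or {'$gt': n} means "numerically greater than n" (assumed).
function matchesConditions(hash, conditions) {
  return Object.keys(conditions).every(function (field) {
    var cond = conditions[field];
    var actual = hash[field];
    if (actual === undefined) return false;
    if (typeof cond !== 'object' || cond === null) return actual === String(cond);
    return Object.keys(cond).every(function (op) {
      var symbol = ALIASES[op] || op;
      if (symbol === '>') return Number(actual) > Number(cond[op]);
      return false; // operators not modeled in this sketch never match
    });
  });
}

// Scan all hashes and return the matching ones with their key as _key.
function hsearchSketch(hashes, conditions) {
  var result = [];
  Object.keys(hashes).forEach(function (key) {
    if (matchesConditions(hashes[key], conditions)) {
      result.push(Object.assign({ _key: key }, hashes[key]));
    }
  });
  return result;
}

var hashes = {
  'userInfo:1001': { name: 'Mar', gender: 'male', nVisit: '50' }, // made up
  'userInfo:1006': { name: 'Mar', gender: 'male', nVisit: '10000' }
};
var query = { name: 'Mar', gender: 'male', nVisit: { '$gt': '100' } };
console.log(hsearchSketch(hashes, query));
```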

[Documentation is still being improved]

Original post: OnceDoc Blog

Addendum 1 · 2017-01-16 16:41:22 +08:00

Batch-print hashes (specifying fields and keys): hselect [num of fields] field1 field2 ... key1 key2 ...

client.hselect(
    ['name', 'email', 'isPublic', 'nVisit']
  , ['userInfo:100', 'userInfo:103', 'userInfo:1005', 'userInfo:1006']
  , function(err, objs) {
    console.log(objs)
})

Output:

> [ { _key: 'userInfo:100',
    name: 'shanghai',
    email: null,
    isPublic: null,
    nVisit: null },
{ _key: 'userInfo:103',
    name: 'newghost',
    email: null,
    isPublic: null,
    nVisit: null },
{ _key: 'userInfo:1005',
    name: 'Mars2',
    email: null,
    isPublic: '0',
    nVisit: '10000' },
{ _key: 'userInfo:1006',
    name: 'Mar',
    email: null,
    isPublic: '1',
    nVisit: '10000' } ]
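
The command syntax puts the field count first, presumably so that the server can tell where the field names end and the keys begin. The sketch below shows one way the two arrays could be flattened into that argument list, plus an in-memory projection that yields null for missing fields, as in the output above. Both functions are hypothetical and are not taken from oncedb-client; the sample hashes are trimmed-down versions of the records shown.

```javascript
// Flatten (fields, keys) into: [numFields, field1, ..., key1, ...].
function buildHselectArgs(fields, keys) {
  return [fields.length].concat(fields, keys);
}

// Project the requested fields from each hash; missing fields become null.
function hselectSketch(hashes, fields, keys) {
  return keys.map(function (key) {
    var hash = hashes[key] || {};
    var row = { _key: key };
    fields.forEach(function (field) {
      row[field] = hash[field] !== undefined ? hash[field] : null;
    });
    return row;
  });
}

console.log(buildHselectArgs(['name', 'nVisit'], ['userInfo:100', 'userInfo:1006']));

var hashes = {
  'userInfo:100': { name: 'shanghai' },
  'userInfo:1006': { name: 'Mar', isPublic: '1', nVisit: '10000' }
};
console.log(hselectSketch(hashes, ['name', 'nVisit'], ['userInfo:100', 'userInfo:1006']));
```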

Batch-print hashes (specifying keys only): hmgetall key1 key2 ...

client.hmgetall(['userInfo:100', 'userInfo:1003', 'userInfo:100'], function(err, objs) {
  console.log(objs)
})

Output:

> [ { _key: 'userInfo:100',
    id: '100',
    name: 'shanghai',
    gender: 'female',
    poster: '龙' },
  { _key: 'userInfo:1003',
    name: 'Telyer',
    id: '1003',
    gender: 'male',
    active: '0',
    joinTime: '1484445746020',
    poster: '王五',
    isPublic: '0',
    nVisit: '300' },
  { _key: 'userInfo:100',
    id: '100',
    name: 'shanghai',
    gender: 'female',
    poster: '龙' } ]
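
In the output above, userInfo:100 is requested twice and appears twice: the rows follow the order of the requested keys, with no de-duplication. A minimal in-memory sketch of that behavior, using a hypothetical hmgetallSketch function and abbreviated sample data:

```javascript
// Return every field of each requested hash, in request order.
// Duplicate keys produce duplicate rows, mirroring the output above.
function hmgetallSketch(hashes, keys) {
  return keys.map(function (key) {
    // For a missing key this sketch returns just { _key }; what OnceDB
    // does in that case is not stated in the post.
    return Object.assign({ _key: key }, hashes[key]);
  });
}

var hashes = {
  'userInfo:100': { id: '100', name: 'shanghai' },
  'userInfo:1003': { id: '1003', name: 'Telyer' }
};
var rows = hmgetallSketch(hashes, ['userInfo:100', 'userInfo:1003', 'userInfo:100']);
console.log(rows.map(function (row) { return row._key; }));
```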

[Documentation is still being improved]

17 replies • 2017-03-15 10:10:12 +08:00

alwayshere

2017-01-16 16:12:33 +08:00

This looks pretty good. Have you considered using ssdb in place of redis? http://ssdb.io . Its protocol is almost identical to redis's.

lan894734188

2017-01-16 16:13:26 +08:00 via Android

Supporting this.

newghost

2017-01-16 16:22:06 +08:00

@alwayshere

The nice thing about Redis is that it supports both Windows and Linux. This one looks good too, though; I'll look into it when I get the chance.

newghost

2017-01-16 16:23:06 +08:00

@lan894734188

Thank you.

enenaaa

2017-01-16 16:27:02 +08:00

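A question for the OP: with the Windows version, once the program window is minimized, most of its memory gets swapped out to disk-backed virtual memory, which makes lookups stutter. How did you solve that? Do you run it as a background service?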

newghost

2017-01-16 16:40:52 +08:00

@enenaaa

We have not used the Windows version in production, so we have not run into what you describe.

misaka19000

2017-01-16 16:43:19 +08:00

Looks really good, supporting this~

yanzixuan

2017-01-16 16:50:32 +08:00

@newghost Won't your company consider Linux users? *tears*

alwayshere

2017-01-16 20:19:32 +08:00

@newghost I hope you can make an ssdb version. Thanks, guru!

newghost

2017-01-16 22:14:04 +08:00

@alwayshere

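I'm no guru; this project is the joint effort of a few people. We did find some questionable design choices in redis, though; for example, the performance of keys is very poor. I'll study the ssdb code when I get the chance.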

@yanzixuan

Our production environments all run Linux, but local testing is done on Windows; after all, Windows has 80% of the market.

laiwei

2017-01-16 23:18:45 +08:00 via Android

Do you have any benchmark data?

newghost

2017-01-17 08:43:18 +08:00

@laiwei

We are putting it together. In our tests, full-text search over 150,000 records took milliseconds.

wyntergreg

2017-01-17 08:55:00 +08:00

Starred.

kofj

2017-01-17 10:27:53 +08:00

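First of all, kudos, this is a nice idea. But after a quick look at your code, I found that search is executed with the most basic sequential scan, and I could not find any part that builds an index, so I think the claim of "full-text search support" is a stretch.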
https://github.com/OnceDoc/OnceDB/blob/master/src/t_string.c#L479-L555

newghost

2017-01-17 12:17:23 +08:00

@kofj

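Right, redis key search currently only works this way; internally it does not maintain a sorted list of keys. Keys are ordered by their hash values, possibly because hash values are fairly evenly distributed, which makes load balancing easier in a cluster. In the future we may add a sorted list of keys inside redis or in the driver layer, which would greatly improve search performance.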
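
As a rough illustration of the sorted key list idea, the sketch below keeps keys in a sorted array and answers prefix patterns such as userInfo:* with a binary search instead of scanning every key. This is a hypothetical sketch, not OnceDB code: SortedKeyIndex and its methods are made-up names, and only prefix lookups are covered.

```javascript
// Hypothetical sorted key index: keys are kept in ascending order so that
// all keys sharing a prefix form one contiguous run.
function SortedKeyIndex() {
  this.keys = [];
}

// Binary search for the first position whose key is >= target.
SortedKeyIndex.prototype.lowerBound = function (target) {
  var lo = 0, hi = this.keys.length;
  while (lo < hi) {
    var mid = (lo + hi) >>> 1;
    if (this.keys[mid] < target) lo = mid + 1; else hi = mid;
  }
  return lo;
};

// Insert a key at its sorted position, ignoring duplicates.
SortedKeyIndex.prototype.add = function (key) {
  var pos = this.lowerBound(key);
  if (this.keys[pos] !== key) this.keys.splice(pos, 0, key);
};

// Collect the contiguous run of keys that start with the prefix.
SortedKeyIndex.prototype.withPrefix = function (prefix) {
  var result = [];
  for (var i = this.lowerBound(prefix); i < this.keys.length; i++) {
    if (this.keys[i].indexOf(prefix) !== 0) break;
    result.push(this.keys[i]);
  }
  return result;
};

var index = new SortedKeyIndex();
['userInfo:1006', 'text1', 'userInfo:100', 'text5'].forEach(function (k) {
  index.add(k);
});
console.log(index.withPrefix('userInfo:'));
```

Array insertion here costs O(n) per key; a balanced tree or skip list would be the more usual choice for a real index.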

@wyntergreg

Thanks.

zonghua

2017-03-03 01:10:40 +08:00 via iPhone

Python has a package called redisco that implements models stored in redis; internally it seems to use redis zsets.

jingling0101

2017-03-15 10:10:12 +08:00

Does it support Chinese?