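The article Distributed locks with Redis – Redis first describes how to correctly implement a distributed lock with a single instance, and then introduces the distributed version of the algorithm. I have a number of questions about the distributed version.
First, the article says: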
In the distributed version of the algorithm we assume we have N Redis masters. Those nodes are totally independent, so we don’t use replication or any other implicit coordination system. We already described how to acquire and release the lock safely in a single instance. We take for granted that the algorithm will use this method to acquire and release the lock in a single instance. In our examples we set N=5, which is a reasonable value, so we need to run 5 Redis masters on different computers or virtual machines in order to ensure that they’ll fail in a mostly independent way.
In order to acquire the lock, the client performs the following operations:
1. It gets the current time in milliseconds.
2. It tries to acquire the lock in all the N instances sequentially, using the same key name and random value in all the instances. During step 2, when setting the lock in each instance, the client uses a timeout which is small compared to the total lock auto-release time in order to acquire it. For example if the auto-release time is 10 seconds, the timeout could be in the ~ 5-50 milliseconds range. This prevents the client from remaining blocked for a long time trying to talk with a Redis node which is down: if an instance is not available, we should try to talk with the next instance ASAP.
3. The client computes how much time elapsed in order to acquire the lock, by subtracting from the current time the timestamp obtained in step 1. If and only if the client was able to acquire the lock in the majority of the instances (at least 3), and the total time elapsed to acquire the lock is less than lock validity time, the lock is considered to be acquired.
4. If the lock was acquired, its validity time is considered to be the initial validity time minus the time elapsed, as computed in step 3.
5. If the client failed to acquire the lock for some reason (either it was not able to lock N/2+1 instances or the validity time is negative), it will try to unlock all the instances (even the instances it believed it was not able to lock).
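To make the quoted steps easier to follow, here is a minimal Python sketch of the acquisition flow. It is only an illustration under stated assumptions: `FakeInstance`, `set_nx_px`, `delete_if_value` and `acquire` are hypothetical names standing in for real Redis masters and commands, the short per-instance network timeout is not modelled, a monotonic clock is used for the timestamps, and the same `ttl_ms` is passed to every instance, which is one reading of the quoted text rather than something it states outright.

```python
import time
import uuid


class FakeInstance:
    """In-memory stand-in for one independent Redis master (hypothetical)."""

    def __init__(self, up=True):
        self.up = up
        self.store = {}  # key -> (value, expiry in monotonic milliseconds)

    def set_nx_px(self, key, value, ttl_ms):
        # Mimics SET key value NX PX ttl_ms: succeeds only if the key is absent or expired.
        if not self.up:
            raise ConnectionError("instance unavailable")
        now = time.monotonic() * 1000
        current = self.store.get(key)
        if current is not None and current[1] > now:
            return False
        self.store[key] = (value, now + ttl_ms)
        return True

    def delete_if_value(self, key, value):
        # Mimics the single-instance release: delete only if the stored value matches.
        if not self.up:
            raise ConnectionError("instance unavailable")
        current = self.store.get(key)
        if current is not None and current[0] == value:
            del self.store[key]


def acquire(instances, key, ttl_ms):
    """Return (value, validity_ms) on success, or None on failure."""
    value = uuid.uuid4().hex                    # same random value on every instance
    start = time.monotonic() * 1000             # step 1: take a timestamp
    locked = 0
    for inst in instances:                      # step 2: try each instance in turn
        try:
            if inst.set_nx_px(key, value, ttl_ms):
                locked += 1
        except ConnectionError:
            continue                            # instance down: go to the next one
    elapsed = time.monotonic() * 1000 - start   # step 3: time spent acquiring
    validity = ttl_ms - elapsed                 # step 4: remaining validity time
    if locked >= len(instances) // 2 + 1 and validity > 0:
        return value, validity
    for inst in instances:                      # step 5: unlock all instances
        try:
            inst.delete_if_value(key, value)
        except ConnectionError:
            pass
    return None
```

With five reachable fake instances, `acquire(instances, "resource_name", 10000)` returns the random value and a validity slightly below 10000 ms; with only two reachable instances it returns `None` and leaves no key behind.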
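Does the lock validity time mentioned above refer to the ttl in SET resource_name my_random_value NX PX {ttl}, that is, the TTL I describe below? And when SET resource_name my_random_value NX PX {ttl} is run on each instance, how is ttl calculated? I think the ttl differs between instances, because the lock is attempted on each instance at a different time. To make sure the same key is deleted on all instances at the same moment, I believe the ttl set on each instance should be "TTL - (time of the lock attempt on that instance - time obtained in step 1)". Is that right? (Here TTL means the logical TTL, not the ttl actually set on any particular instance; in other words, the same key on every instance would be deleted at "time obtained in step 1 + TTL".)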
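The formula proposed in the question can be checked with a small worked example. This only illustrates the hypothesis being asked about; it is not a statement of how the Redlock algorithm actually sets ttl. The function name `per_instance_ttl` and all timestamps are made up for the example.

```python
def per_instance_ttl(logical_ttl_ms, start_ms, attempt_ms):
    """TTL - (time of the attempt on this instance - time obtained in step 1)."""
    return logical_ttl_ms - (attempt_ms - start_ms)


start_ms = 1_000_000        # hypothetical timestamp from step 1
logical_ttl_ms = 10_000     # logical TTL of 10 seconds
# Hypothetical times at which each of the 5 instances is contacted.
attempts_ms = [1_000_002, 1_000_010, 1_000_025, 1_000_031, 1_000_040]

ttls = [per_instance_ttl(logical_ttl_ms, start_ms, t) for t in attempts_ms]
expiries = [t + ttl for t, ttl in zip(attempts_ms, ttls)]

print(ttls)      # [9998, 9990, 9975, 9969, 9960]
print(expiries)  # every entry is 1010000, i.e. start_ms + logical_ttl_ms
```

Under this hypothesis each instance gets a slightly smaller ttl than the previous one, so that all copies of the key would expire at the same instant, assuming the clocks involved agree.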