V2EX = way to explore

V2EX 是一个关于分享和探索的地方

现在注册

已注册用户请登录

• 请不要在回答技术问题时复制粘贴 AI 生成的内容

这是一个创建于 2168 天前的主题，其中的信息可能已经有所发展或是发生改变。

如下，表 table_a 字段 id （主键），check_no(varchar 唯一索引)，no_type 三个字段，这个表只有 insert 操作，没有查询，其他操作，数据量 1000 万左右另外一个表 table_b 数据量 1 万左右逻辑是这样： 1.insert table_a valus(主键，单号，单号状态); 2.步骤 1 插入成功之后 select * table_b from where id = xx for update;然后对这个表 update table_b set xxx=xxx where id =xx

并发不大，每秒 10 个请求，一般情况没有任何问题，16ms 左右就可以执行完，但是偶尔会有死锁（频率不大，一两个月会有发生一次，时间没有规律，持续时长短的话 30s，长的话 5 分钟）

异常日志表现如下： 1.项目日志报错：

SQL: select * from tables_b where id=xxx for update

Cause: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction

; SQL []; Lock wait timeout exceeded; try restarting transaction; nested exception is java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction

2.数据库慢查询如下：

Time: 190413 0:01:45

User@Host: xxx[xxxx] @ [xxx.xxx.xxx.xxx]

Thread_id: 27063157 Schema: m_center QC_hit: No

Query_time: 12.579884 Lock_time: 10.984690 Rows_sent: 0 Rows_examined: 0

SET timestamp=1555128105; INSERT INTO table_a(id,check_no,no_type) VALUES ('xxxx','xxxxx','xxx');

3.监控数据库状态 cpu 消耗升高，内存消耗升高，有死锁(数据库配置：8H8G)

分析测试： 1.开始怀疑并发操作单条记录会造成死锁，然后自己并发测试 10 个线程对同一条记录进行修改，每个线程 100 次循环，测试结果并未发生死锁 2.数据库慢查询日志锁时间很长的都是 table_a,如上图 Lock_time: 10.984690，不明白为什么 insert 表会 lock 这么久

小弟知识有限，请各位大佬指点迷津

table_a

SQL

死锁

lock

6 条回复 • 2019-07-24 14:37:32 +08:00

keepeye

2019-04-13 16:43:25 +08:00

是不是事务中间出现异常没 rollback?

hunterzhang86

2019-04-13 18:48:23 +08:00

select * table_b from where id = xx for update 这个如果查不到数据，会加上间隙锁，而如果能查到数据，select * table_b from where id = xx for update 和 update table_b set xxx=xxx where id =xx 都需要 X 锁，所以如果两个 session 同时拿到间隙锁，后面的 update 语句又需要 X 锁，就造成了死锁。而其他 session 如果执行 select * table_b from where id = xx for update 也会出现拿不到锁的情况。注意这里的间隙锁的区间包含了 X 锁的 id。
可以考虑改一下数据库的隔离级别为 RC 解决这个问题。
参考： https://time.geekbang.org/column/article/75173

carlclone

2019-04-13 19:59:57 +08:00

为什么还要 for update , update 操作本来就加锁了

zhihhh

2019-04-14 16:14:52 +08:00

想问一下事务情况是怎样的。
上面说的
插入
查询 for update
update

这三个操作时一个 begin 事务 1，2，3 commit 的吗？

zhihhh

2019-04-14 16:30:43 +08:00

我仔细看了一下楼主的描述感觉有可能发生这样的情况

我感觉 insert 那个可能是另外的问题。

推测
select * from tables_b where id=xxx for update 这个语句等锁超时了。
排除掉其他对这两个表的查询导致的情况
可能是首先第一个事务执行了
begin select * from tables_b where id =xxx for update 锁住了对应行，
这个时候准备执行 update table_b set xxx=xxx where id =xx 但是这个 xx 被另外的
begin select * from tables_b where id =xx 锁住了，就死锁了。
超过了超时时间，就报错了。

建议楼主如果是 5.7 打开 innodb_deadlock_detect=on 再观察一下。
如果还有类似问题要考虑一下这两个表是否还有别的加锁操作或者长事务在上面跑？