面试中遇到的一道 sql 题

2022-11-17 12:45:35 +08:00
 qiyong

面试中遇到一道简单的 sql 题,小弟也能写出来,但写的非常乱,嵌套了很多子查询,我感觉有更美妙的写法,请教各位

题目

我的解答

SELECT a.id, a.name, a.group_id, b.name group_name, c.sum_goods_sales_volume, d.sum_group_sales_volume
FROM goods a,
     goods_group b,
     (select sum(sales_volume) sum_goods_sales_volume, goods_id
      from goods_sales_record
      group by goods_id
      order by sum_goods_sales_volume desc
      limit 3) c,
     (select sum(sales_volume) sum_group_sales_volume, bb.group_id FROM goods_sales_record aa
         JOIN goods bb WHERE aa.goods_id = bb.id GROUP BY bb.group_id) d
WHERE a.group_id = b.id AND  a.id = c.goods_id AND a.group_id = d.group_id
order by sum_group_sales_volume desc, sum_goods_sales_volume desc;

附上表结构 及 数据 方便大佬们使用

goods

/*
 Navicat Premium Data Transfer

 Source Server         : 本地 MySql
 Source Server Type    : MySQL
 Source Server Version : 80028
 Source Host           : localhost:3306
 Source Schema         : transaction

 Target Server Type    : MySQL
 Target Server Version : 80028
 File Encoding         : 65001

 Date: 17/11/2022 12:43:34
*/

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for goods
-- ----------------------------
DROP TABLE IF EXISTS `goods`;
CREATE TABLE `goods`  (
  `id` int NOT NULL,
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NULL DEFAULT NULL,
  `group_id` int NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of goods
-- ----------------------------
INSERT INTO `goods` VALUES (1, '苹果手机', 1);
INSERT INTO `goods` VALUES (2, '三星手机', 1);
INSERT INTO `goods` VALUES (3, '联想电脑', 2);
INSERT INTO `goods` VALUES (4, '华为手机', 1);
INSERT INTO `goods` VALUES (5, '华硕电脑', 2);
INSERT INTO `goods` VALUES (6, 'IKBC', 3);

SET FOREIGN_KEY_CHECKS = 1;

goods_group

/*
 Navicat Premium Data Transfer

 Source Server         : 本地 MySql
 Source Server Type    : MySQL
 Source Server Version : 80028
 Source Host           : localhost:3306
 Source Schema         : transaction

 Target Server Type    : MySQL
 Target Server Version : 80028
 File Encoding         : 65001

 Date: 17/11/2022 12:43:41
*/

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for goods_group
-- ----------------------------
DROP TABLE IF EXISTS `goods_group`;
CREATE TABLE `goods_group`  (
  `id` int NOT NULL,
  `name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of goods_group
-- ----------------------------
INSERT INTO `goods_group` VALUES (1, '手机');
INSERT INTO `goods_group` VALUES (2, '电脑');
INSERT INTO `goods_group` VALUES (3, '键盘');

SET FOREIGN_KEY_CHECKS = 1;

goods_sales_record

/*
 Navicat Premium Data Transfer

 Source Server         : 本地 MySql
 Source Server Type    : MySQL
 Source Server Version : 80028
 Source Host           : localhost:3306
 Source Schema         : transaction

 Target Server Type    : MySQL
 Target Server Version : 80028
 File Encoding         : 65001

 Date: 17/11/2022 12:43:26
*/

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for goods_sales_record
-- ----------------------------
DROP TABLE IF EXISTS `goods_sales_record`;
CREATE TABLE `goods_sales_record`  (
  `id` int NOT NULL,
  `goods_id` int NULL DEFAULT NULL,
  `sales_volume` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NULL DEFAULT NULL,
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of goods_sales_record
-- ----------------------------
INSERT INTO `goods_sales_record` VALUES (1, 1, '50');
INSERT INTO `goods_sales_record` VALUES (2, 2, '30');
INSERT INTO `goods_sales_record` VALUES (3, 3, '88');
INSERT INTO `goods_sales_record` VALUES (4, 4, '88');
INSERT INTO `goods_sales_record` VALUES (5, 5, '444');
INSERT INTO `goods_sales_record` VALUES (6, 6, '34');

SET FOREIGN_KEY_CHECKS = 1;
5513 次点击
所在节点    程序员
38 条回复
zhangxh1023
2022-11-17 13:40:44 +08:00
这种东西感觉没个半小时调试加搜索我感觉我写不出来🤦‍♂️
fengjianxinghun
2022-11-17 13:51:04 +08:00
这题没有 copilot 做不出来
wangnimabenma
2022-11-17 13:54:35 +08:00
抖个机灵,我会分开查不会写复杂的子查询或者其他。原因我可以和面试官好好说说
iseki
2022-11-17 14:04:42 +08:00
用 with 拆开会更好一点
qiyong
2022-11-17 14:11:25 +08:00
忘说了 是第二道题
zhzy0077
2022-11-17 14:27:47 +08:00
第一问第二问不是连着的吗

SELECT TOP 3 A.name, C.sales_volume, B.name FROM goods A
JOIN goods_group B ON A.group_id = B.id
JOIN goods_sales_record C ON A.id = C.goods_id
ORDER BY sales_volume DESC

SELECT D.good_name, D.sales_volume, D.group_name, E.sum_sales FROM
(
SELECT TOP 3 A.name AS good_name, C.sales_volume, B.name AS group_name, B.id AS group_id FROM goods A
JOIN goods_group B ON A.group_id = B.id
JOIN goods_sales_record C ON A.id = C.goods_id
ORDER BY sales_volume DESC
) D
JOIN (
SELECT B.id AS id, SUM(C.sales_volume) AS sum_sales FROM goods A
JOIN goods_group B ON A.group_id = B.id
JOIN goods_sales_record C ON A.id = C.goods_id
GROUP BY B.id
) E ON D.group_id = E.id
ORDER BY sum_sales DESC, sales_volume DESC
CRVV
2022-11-17 14:46:13 +08:00
首先把销量的类型改成数字, `sales_volume` INT

最后都需要套一层子查询来重新排序,就不写了。
这三种写法都是用 ORDER BY volume DESC LIMIT 3 来选出前 3 ,还可以用 rank <= 3 来选前 3 的,如果有重复的会得到不同结果
都可以在最新的 MySQL 上执行

SELECT goods.name,
goods_sales_record.sales_volume,
goods_group.name AS group_name,
t.group_volume
FROM goods
INNER JOIN goods_sales_record ON goods.id = goods_sales_record.goods_id
INNER JOIN goods_group ON goods.group_id = goods_group.id
INNER JOIN (SELECT goods_group.id, sum(goods_sales_record.sales_volume) AS group_volume
FROM goods_group
INNER JOIN goods ON goods.group_id = goods_group.id
INNER JOIN goods_sales_record on goods.id = goods_sales_record.goods_id
GROUP BY goods_group.id) AS t ON t.id = goods_group.id
ORDER BY 2 DESC limit 3;

SELECT goods.name,
goods_sales_record.sales_volume,
goods_group.name AS group_name,
t.group_volume
FROM goods
INNER JOIN goods_sales_record ON goods.id = goods_sales_record.goods_id
INNER JOIN goods_group ON goods.group_id = goods_group.id
CROSS JOIN LATERAL (SELECT sum(goods_sales_record.sales_volume) AS group_volume
FROM goods INNER JOIN goods_sales_record on goods.id = goods_sales_record.goods_id
WHERE group_id = goods_group.id) AS t
ORDER BY 2 DESC limit 3;

SELECT goods.name,
goods_sales_record.sales_volume,
goods_group.name AS group_name,
sum(sales_volume) OVER (PARTITION BY goods_group.id) AS group_volume
FROM goods
INNER JOIN goods_sales_record ON goods.id = goods_sales_record.goods_id
INNER JOIN goods_group ON goods.group_id = goods_group.id
ORDER BY 2 DESC
LIMIT 3;
DinnyXu
2022-11-17 14:56:37 +08:00
这个题用 SQL 是能做出来,但是对于实际业务来说 SQL 不会写成这种 join 和 嵌套,商品和商品分组查出来,在单独去统计各个商品的销量以及商品分组总销量。要是我面试就可以跟面试官扯皮一下,这种题拿来面试是侮辱智商的。直接喷他
CRVV
2022-11-17 15:08:19 +08:00
@DinnyXu
这种 SQL 在很多地方都用得到,而且这个题其实不难。
如果每个表都有几亿行,用 BigQuery 写一句 SQL 就能实现,速度也不慢。如果不写 SQL 你打算怎么做?
jhb
2022-11-17 15:08:28 +08:00
这应该是一道考窗口函数的题目,不用写复杂的分组语句
qzwmjv
2022-11-17 15:43:56 +08:00
这不写嵌套 sql 就完事?除非有数据倾斜需要解决
potatowish
2022-11-17 16:13:30 +08:00
select t0.* from (
select
t.goods_name,
t.sales_volume sv,
t.group_name,
sum(t.sales_volume) OVER (PARTITION BY t.group_name)gsv
from (
select g.id as goods_id, g.name as goods_name, gg.name as group_name, gsr.sales_volume
from goods_sales_record gsr
left join goods g on g.id = gsr.goods_id
left join goods_group gg on gg.id = g.group_id
)t
order by sv desc limit 3
)t0
order by t0.gsv desc, t0.sv desc;
raysonlu
2022-11-17 16:52:32 +08:00
@CRVV 这类的查询业务,数据量到达亿级的话,是把整个查询业务都放在 sql 呢(比如这种面试题需要一句 sql 语句得出最终结果,或窗口函数),还是多次 sql 查询分步处理好?(比如 php 或 java 先查出销量最好三个,再根据分组 id 查分组销量情况,再拼凑)
wangxin3
2022-11-17 16:56:52 +08:00
mysql8

select g.name as 商品名,
gsr.sales_volume as 商品销量,
gg.name as 商品所属分组名,
sum(gsr.sales_volume) over (partition by g.group_id) as 总销量
from goods_sales_record gsr
left join goods g on gsr.goods_id = g.id
left join goods_group gg on gg.id = g.group_id
order by 总销量 desc, 商品销量 desc
limit 3
qiyong
2022-11-17 17:18:18 +08:00
@wangxin3 大佬 你这个商品销量排序是失效的 为啥呢 我没看出来
wangxin3
2022-11-17 17:40:57 +08:00
@qiyong 你的销量字段类型改下,改为 int 再试试
CRVV
2022-11-17 17:48:34 +08:00
@raysonlu

如果能分步处理当然分步处理是很多人首选的方案,因为代码比较好懂,会写复杂 SQL 的人没那么多。

但就这个题来说,如果商品有 1 亿个,销量表是分小时的,要查所有商品,按近一个月的销量倒序排列,要能搜索能翻页,还是那种能直接跳到第 10000 页的设计。这样的不太可能分步来处理吧,每一个中间步骤的结果都很大。实际的需求通常都比这个复杂。

写一个大 SQL 可能直接就把数据库弄死了,当然也不行。
所以我上面说 BigQuery ,这东西就是干这个事用的,应该算是解决这种问题的方案之一。
sadfQED2
2022-11-17 17:53:34 +08:00
@DinnyXu PM:我想要个 xxx 数据,麻烦帮我跑一个吧

写 SQL 查十几分钟就搞定了,而且实际工作中,这种 SQL 都是丢到数仓里面去执行的,数仓里面几百行的 SQL 很常见
no13bus
2022-11-17 17:59:35 +08:00
有的时候业务量没那么大的时候,假设想差一些临时性的数据,逻辑比较麻烦,这个时候写 sql 是很正常的做法。
m2276699
2022-11-17 18:39:08 +08:00
cte 、rank

这是一个专为移动设备优化的页面(即为了让你能够在 Google 搜索结果里秒开这个页面),如果你希望参与 V2EX 社区的讨论,你可以继续到 V2EX 上打开本讨论主题的完整版本。

https://www.v2ex.com/t/895912

V2EX 是创意工作者们的社区,是一个分享自己正在做的有趣事物、交流想法,可以遇见新朋友甚至新机会的地方。

V2EX is a community of developers, designers and creative people.

© 2021 V2EX