求教， Mysql group by 后对分组数据的处理问题

2017-02-13 22:09:04 +08:00

sorcerer

有这样一个表

|id |col1|col2|col3|col4|
|---|----|----|--- |----|
| 1 |  1 |  1 |  1 |  1 |
| 2 |  1 |  1 |  1 |  2 |
| 3 |  1 |  1 |  1 |  1 |
| 4 |  1 |  1 |  2 |  1 |
| 5 |  1 |  1 |  2 |  2 |
| 6 |  1 |  2 |  1 |  1 |
| 7 |  1 |  2 |  1 |  1 |
| 8 |  1 |  3 |  2 |  2 |
| 9 |  1 |  3 |  2 |  2 |

现在我需要按 col1 和 col2 对表进行分组，然后对每个分组进行统计，统计 col3=1,col4 不重复值的数量 ps:count(distinct(col4)), 统计 col3=2 ， col4 值的数量 ps:count(col4)

即结果为

|id |col1|col2|distinct(col4)|count(col4)|
|---|----|----|--------------|-----------|
| 1 |  1 |  1 |       2      |         2 |
| 2 |  1 |  2 |       1      |         0 |
| 3 |  1 |  3 |       0      |         2 |

想了很久都没搞出来=。=只能厚着脸皮来这里请教了。

5659 次点击

所在节点

MySQL

18 条回复

yangqi

2017-02-13 22:19:57 +08:00

你这什么意思，逻辑有问题吧？第一行 count(col4)难道不是 5 么， distinct(col4)实际上是 count(distinct(col4))?

sorcerer

2017-02-13 22:25:32 +08:00

@yangqi 是 count(distinct(col4))，我写错了，而且还要和 col3 的值要有联系....

yangqi

2017-02-13 22:41:23 +08:00

@sorcerer 那你第三列 distinct(col4)又是什么？

sorcerer

2017-02-13 22:44:48 +08:00

@yangqi 第三列是统计 col3=1,col4 不重复值的数量 ps:count(distinct(col4)) 第四列是统计 col3=2 ， col4 值的数量 ps:count(col4)

yangqi

2017-02-13 22:49:03 +08:00

@sorcerer 这个统计不可能用一个简单查询做的，肯定要子查询然后用到 IF()判断了。

sorcerer

2017-02-14 08:37:12 +08:00

@yangqi 不好意思，昨晚睡了没看到，用子查询应该怎么写呢，我需要对 col1 和 col2 进行分组统计，所以 group by 肯定要的

ebony0319

2017-02-14 09:02:53 +08:00

分组后就不存在什么 ID 了把，就是一个集合的概念了。还有就是 where 过滤不是应该在之前么。

sorcerer

2017-02-14 09:08:07 +08:00

@ebony0319 只要按那种格式显示出来就可以了， id 那一列去掉吧......

sorcerer

2017-02-14 09:10:14 +08:00

@ebony0319 我想对分组后的数据再进行条件筛选有可能实现吗

staticor

2017-02-14 10:03:32 +08:00

试试这样的逻辑 count(distinct if(col3=2, col4, null))

sorcerer

2017-02-14 10:39:42 +08:00

谢谢这正是我想要的，不过怎样才能 if 中 distinct 两列呢，我尝试了一下 count(distinct if(col3=2,(col4,col5),null)) 好像不行

staticor

2017-02-14 12:25:17 +08:00

@sorcerer concat 两列得到一个辅助列?

ebony0319

2017-02-14 13:19:17 +08:00

你的 col1 ， col2 聚合的时候 col3 ， col4 怎么办？我想看一下原需求是什么.

sorcerer

2017-02-14 13:49:43 +08:00

@ebony0319 比如说 col3 代表一个账号的状态， col4 姓名， col5 是电话号码，现在我聚合后,想要统计 col3 等于 1 ，即激活状态下 col4 和 col5 两列不重复的账号数量，以及 col3 等于 2 ，即非激活状态下，账号的数量(这个不需要去重)

CRVV

2017-02-14 15:06:41 +08:00

按照 staticor 的提示

SELECT
col1,
col2,
count(DISTINCT if(col3 = 1, col4, NULL)),
count(if(col3 = 2, col4, NULL))
FROM table
GROUP BY col1, col2
ORDER BY col2;

就得到了原文里的结果

sorcerer

2017-02-14 15:21:47 +08:00

@CRVV 嗯这样是可以，但是 distinct 两列就不行了，试过 concat 函数貌似不行 count(distinct (if(col3=1,concat(col4,col5),null)))......

ebony0319

2017-02-14 17:21:13 +08:00

我看懂了。给你结果你可能要吐血了。其实很简单。
首先选出不重复且激活状态的数据，然后聚合就好了。
SELECT
col1, col2, col4, COUNT(col4)
FROM
(SELECT DISTINCT
col1, col2, col3, col4
FROM
temp
WHERE
col3 = 1) T
GROUP BY col1 , col2

http://ww1.sinaimg.cn/large/af912ab4gy1fcq4du1bwgj20ak0ab0sr

ebony0319

2017-02-14 17:27:37 +08:00

微博图床挂了？图片是那个结果，你按照我的思路来应该没错的。 ww1.sinaimg.cn/large/af912ab4gy1fcq4du1bwgj20ak0ab0sr

第 1 页／共 1 页

这是一个专为移动设备优化的页面（即为了让你能够在 Google 搜索结果里秒开这个页面），如果你希望参与 V2EX 社区的讨论，你可以继续到 V2EX 上打开本讨论主题的完整版本。

https://www.v2ex.com/t/340250

V2EX 是创意工作者们的社区，是一个分享自己正在做的有趣事物、交流想法，可以遇见新朋友甚至新机会的地方。

V2EX is a community of developers, designers and creative people.