How to split a dictionary elegantly in Python

2014-09-12 15:45:09 +08:00
 14
Original data:
{
"A0801_000000_201301": "1,321.8",
"A0801_000000_201302": "1,199.8",
"A0801_000000_201309": "1,433.4",
"A0802_000000_201305": "6,688.3",
"A0802_000000_201306": "8,085.2",
"A0802_000000_201307": "9,481.0",
"A0802_000000_201308": "10,878.4",
"A0802_000000_201309": "12,311.8",
"A0802_000000_201310": "13,739.9",
……


The goal is:
{
"A0801": [{"201301": ""}, ……]
"A0802": [{"201308": ""}, ……]
……
}
4442 views
Node: Python
19 replies
xiaket
2014-09-12 15:59:15 +08:00
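The requirements aren't stated clearly... How is that 000000 in the middle supposed to be handled?

Use a list comprehension or something from itertools.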
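
A minimal sketch of the itertools route, assuming the middle field is simply discarded and the target is the list of single-pair dicts from the original post. The helper name `prefix` and the shortened sample data are illustrative only.

```python
from itertools import groupby

data = {
    "A0801_000000_201301": "1,321.8",
    "A0801_000000_201302": "1,199.8",
    "A0802_000000_201305": "6,688.3",
}

def prefix(key):
    # First underscore-separated field, e.g. "A0801".
    return key.split("_")[0]

# groupby only merges adjacent items, so the keys are sorted first.
result = {
    p: [{k.split("_")[-1]: data[k]} for k in keys]
    for p, keys in groupby(sorted(data), key=prefix)
}
print(result)
```

The sort makes this O(n log n), so for plain grouping a single pass with a dict is usually the cheaper choice.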
linKnowEasy
2014-09-12 15:59:57 +08:00
String replacement.
alsotang
2014-09-12 16:00:30 +08:00
Just plain iteration.
icinessz
2014-09-12 16:09:18 +08:00
package main

import "fmt"

func main() {
	src := map[string]string{
		"A0801_000000_201301": "1,321.8",
		"A0801_000000_201302": "1,199.8",
		"A0801_000000_201309": "1,433.4",
		"A0802_000000_201305": "6,688.3",
		"A0802_000000_201306": "8,085.2",
		"A0802_000000_201307": "9,481.0",
		"A0802_000000_201308": "10,878.4",
		"A0802_000000_201309": "12,311.8",
		"A0802_000000_201310": "13,739.9",
	}
	rs := map[string]map[string]string{}
	for k, v := range src {
		// k[:5] is the prefix ("A0801"), k[13:] is the trailing date field.
		if _, ok := rs[k[:5]]; !ok {
			rs[k[:5]] = map[string]string{}
		}
		rs[k[:5]][k[13:]] = v
	}
	fmt.Println(rs)
}

--------------------------------
map[A0801:map[201309:1,433.4 201301:1,321.8 201302:1,199.8] A0802:map[201308:10,878.4 201309:12,311.8 201310:13,739.9 201305:6,688.3 201306:8,085.2 201307:9,481.0]]
spritevan
2014-09-12 16:16:18 +08:00
#!/usr/bin/env python

from pprint import pprint as pp

origin = {
"A0801_000000_201301": "1,321.8",
"A0801_000000_201302": "1,199.8",
"A0801_000000_201309": "1,433.4",
"A0802_000000_201305": "6,688.3",
"A0802_000000_201306": "8,085.2",
"A0802_000000_201307": "9,481.0",
"A0802_000000_201308": "10,878.4",
"A0802_000000_201309": "12,311.8",
"A0802_000000_201310": "13,739.9",
}

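res = {}
# Group by the first field; each entry is a single-pair dict keyed by the last field.
fn = lambda fields, v: res.setdefault(fields[0], []).append({fields[-1]: v})
for k, v in origin.items():  # items() works on both Python 2 and 3
    fn(k.split('_'), v)
pp(res)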

---

{'A0801': [{'201309': '1,433.4'},
           {'201302': '1,199.8'},
           {'201301': '1,321.8'}],
 'A0802': [{'201305': '6,688.3'},
           {'201306': '8,085.2'},
           {'201307': '9,481.0'},
           {'201310': '13,739.9'},
           {'201308': '10,878.4'},
           {'201309': '12,311.8'}]}
imn1
2014-09-12 16:20:19 +08:00
If the original data is a string (JSON), splitting it with a regex is quick.
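
A rough sketch of the regex idea, assuming the raw text is a flat JSON object whose keys always have three underscore-separated fields and whose strings contain no escaped quotes. The names `raw` and `pattern` are made up for illustration; if those assumptions do not hold, parsing with `json.loads` first is the safer route.

```python
import re
from collections import defaultdict

raw = (
    '{"A0801_000000_201301": "1,321.8", '
    '"A0801_000000_201302": "1,199.8", '
    '"A0802_000000_201305": "6,688.3"}'
)

# Captures: prefix, trailing field, value. The middle field is matched but dropped.
pattern = re.compile(r'"([^_"]+)_[^_"]+_([^_"]+)"\s*:\s*"([^"]*)"')

grouped = defaultdict(list)
for prefix, suffix, value in pattern.findall(raw):
    grouped[prefix].append({suffix: value})

result = dict(grouped)
print(result)
```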
hahastudio
2014-09-12 16:35:50 +08:00
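The OP probably hasn't played with setdefault. I remember being amazed by this usage too when I first read the Cookbook.
I'm not sure what you want to do with that run of zeros in the middle, though.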

https://gist.github.com/hahastudio/e1d4bb5423be3052935b
14
2014-09-12 16:43:37 +08:00
@hahastudio Thanks, this is exactly what I was after
hahastudio
2014-09-12 16:57:40 +08:00
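@14 Actually it's not quite the same as what you asked for = =
Only on a closer look did I notice that your post wants a list whose elements are dicts holding a single key-value pair each = =
In that case, see @spritevan's answer = =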
14
2014-09-12 17:07:34 +08:00
@hahastudio True... but a small tweak to your code does the job:
for k in d:
    key, mid, subkey = k.split('_')
    new_d.setdefault(key, []).append({subkey: d[k]})
advancedxy
2014-09-12 17:27:59 +08:00
from collections import defaultdict
from functools import reduce  # a builtin in Python 2, must be imported in Python 3

d = {
"A0801_000000_201301": "1,321.8",
"A0801_000000_201302": "1,199.8",
"A0801_000000_201309": "1,433.4",
"A0802_000000_201305": "6,688.3",
"A0802_000000_201306": "8,085.2",
"A0802_000000_201307": "9,481.0",
"A0802_000000_201308": "10,878.4",
"A0802_000000_201309": "12,311.8",
"A0802_000000_201310": "13,739.9",
}

def addItem(dd, item):
    # item is a (key, value) pair taken from d.items()
    k, v = item
    k1, k2, k3 = k.split('_')
    dd[k1].append({k3: v})
    return dd

# Reduce over the items, not the bare keys, so each step receives (key, value).
dict(reduce(addItem, d.items(), defaultdict(list)))
starsoi
2014-09-12 20:49:36 +08:00
@hahastudio @14 setdefault looks concise, but it is still slower than a straightforward if/else (which is roughly 30% faster)

https://gist.github.com/starsoi/ef3c813ebd2c04e3e8ff.js
hahastudio
2014-09-12 21:29:31 +08:00
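@starsoi Well, of course the performance falls short = =
That's the price of the fancy style = =

But note two things:
First, it should be new_d[key]
Second, your version drops the first result every time a new key is created
use_ifelse = """
new_d = {}
for k in tabledata:
    key, mid, subkey = k.split('_')
    if key not in new_d:
        new_d[key] = []
    new_d[key].append({subkey:tabledata[k]})
"""
Compare against this version and the speedup is not as large as you said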
frankzeng
2014-09-12 21:43:51 +08:00
#!/usr/bin/python

ss = {
"A0801_000000_201301": "1,321.8",
"A0801_000000_201302": "1,199.8",
"A0801_000000_201309": "1,433.4",
"A0802_000000_201305": "6,688.3",
"A0802_000000_201306": "8,085.2",
"A0802_000000_201307": "9,481.0",
"A0802_000000_201308": "10,878.4",
"A0802_000000_201309": "12,311.8",
"A0802_000000_201310": "13,739.9",}


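output = {}
for key, data in ss.items():  # items() works on both Python 2 and 3
    temp = key.split("_")
    try:
        k = temp[0]
        j = temp[2]
    except IndexError:
        # Report keys that lack three underscore-separated fields, then skip them.
        print(key, data)
        continue
    if k not in output:
        output[k] = []
    output[k].append({j: data})

print(output)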

It makes a single pass and is simple to follow. You deserve to have it.
starsoi
2014-09-12 22:33:00 +08:00
@hahastudio Fair point, turns out I coded it wrong..
hahastudio
2014-09-12 22:43:29 +08:00
@starsoi That said, once the data gets large, defaultdict seems to be the better choice

new_d = defaultdict(list)
for k in tabledata:
    key, mid, subkey = k.split('_')
    new_d[key].append({subkey:tabledata[k]})

http://nbviewer.ipython.org/gist/hahastudio/5f7ed0ee9c4adfa2a86f
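
For anyone who wants to rerun the comparison, here is a hedged timing sketch covering the three variants discussed in this thread. The synthetic `tabledata` (1000 prefixes, 12 months each) and the repeat count are arbitrary assumptions, and the numbers will vary with machine and Python version, so none are quoted here.

```python
import timeit
from collections import defaultdict

# Synthetic data shaped like the thread's keys; the sizes are arbitrary.
tabledata = {
    "A%04d_000000_2013%02d" % (i, m): str(i * m)
    for i in range(1000)
    for m in range(1, 13)
}

def use_setdefault():
    new_d = {}
    for k in tabledata:
        key, mid, subkey = k.split('_')
        new_d.setdefault(key, []).append({subkey: tabledata[k]})
    return new_d

def use_ifelse():
    new_d = {}
    for k in tabledata:
        key, mid, subkey = k.split('_')
        if key not in new_d:
            new_d[key] = []
        new_d[key].append({subkey: tabledata[k]})
    return new_d

def use_defaultdict():
    new_d = defaultdict(list)
    for k in tabledata:
        key, mid, subkey = k.split('_')
        new_d[key].append({subkey: tabledata[k]})
    return new_d

for fn in (use_setdefault, use_ifelse, use_defaultdict):
    seconds = timeit.timeit(fn, number=5)
    print("%-16s %.3f s" % (fn.__name__, seconds))
```

All three functions build the same mapping, so any timing difference comes only from how the missing-key case is handled.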
mengzhuo
2014-09-12 23:02:35 +08:00
The famous One Line Tree, absolutely elegant

aa = {
"A0801_000000_201301": "1,321.8",
"A0801_000000_201302": "1,199.8",
"A0801_000000_201309": "1,433.4",
"A0802_000000_201305": "6,688.3",
"A0802_000000_201306": "8,085.2",
"A0802_000000_201307": "9,481.0",
"A0802_000000_201308": "10,878.4",
"A0802_000000_201309": "12,311.8",
"A0802_000000_201310": "13,739.9",
}

from collections import defaultdict
def tree(): return defaultdict(tree)

bb = tree()

for k,v in aa.items():
    prefix, _ , appendix = k.split('_')
    bb[prefix][appendix] = v
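
Note that the tree yields nested dicts rather than the list of single-pair dicts asked for in the original post. The sketch below, on shortened sample data, shows one way to convert it. The name `target` is an illustrative assumption. It also shows a caveat of the tree: merely reading a missing key creates it.

```python
from collections import defaultdict

def tree():
    return defaultdict(tree)

aa = {
    "A0801_000000_201301": "1,321.8",
    "A0801_000000_201302": "1,199.8",
    "A0802_000000_201305": "6,688.3",
}

bb = tree()
for k, v in aa.items():
    prefix, _, appendix = k.split('_')
    bb[prefix][appendix] = v

# Convert the nested mapping into {prefix: [{subkey: value}, ...]}.
target = {
    prefix: [{sub: val} for sub, val in inner.items()]
    for prefix, inner in bb.items()
}
print(target)

# Caveat: a plain lookup of an absent key silently inserts an empty subtree.
before = "A9999" in bb
bb["A9999"]
after = "A9999" in bb
print(before, after)
```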
xylophone21
2014-09-13 15:10:21 +08:00
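@hahastudio

Four lines of code, and they still went to the trouble of wrapping them up in a dedicated method. Python's service really is thorough.

if self.data.has_key(key):
    self.data[key].append(value)
else:
    self.data[key] = [value]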


http://starship.python.net/~mwh/hacks/setdefault.html
biglazycat
2020-09-06 21:04:16 +08:00
tabledata = {
"A0801_000000_201301": "1,321.8",
"A0801_000000_201302": "1,199.8",
"A0801_000000_201309": "1,433.4",
"A0802_000000_201305": "6,688.3",
"A0802_000000_201306": "8,085.2",
"A0802_000000_201307": "9,481.0",
"A0802_000000_201308": "10,878.4",
"A0802_000000_201309": "12,311.8",
"A0802_000000_201310": "13,739.9",
}

output = {}
for k, v in tabledata.items():
    (a, b, c) = k.split('_')
    output.setdefault(a, []).append({c: v})
print(output)

https://www.v2ex.com/t/133087
