How can the itertools module save memory and still get the job done?


How can the itertools module save memory and still get the job done? Today 番茄加速 walks through it with you.

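　　Concatenating elements

　　The chain function in itertools concatenates elements. Its prototype is shown below; the * marks a variable number of arguments.

    chain(*iterables)

　　Usage: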

　　 In [33]: list(chain(['I','love'],['python'],['very', 'much']))

　　 Out[33]: ['I', 'love', 'python', 'very', 'much']

　　Very handy. It feels a bit like join, but it is more general: every argument can be any iterable, not only a sequence of strings.
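　　To make the comparison with join concrete, here is a small sketch. The variable names and sample values are made up for illustration; only chain and str.join come from the standard library.

```python
from itertools import chain

# Hypothetical inputs of three different iterable types.
words = ['I', 'love']                   # list
more = ('python',)                      # tuple
tail = (w for w in ['very', 'much'])    # generator expression

# chain walks each iterable in order, whatever its type.
sentence = list(chain(words, more, tail))
print(sentence)            # ['I', 'love', 'python', 'very', 'much']

# str.join, by contrast, only builds a single string out of strings.
joined = ' '.join(sentence)
print(joined)              # I love python very much
```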

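　　So how does chain manage to save memory? Its implementation is roughly:

    def chain(*iterables):
        # Walk the input iterables in order, yielding one element at a time.
        for it in iterables:
            for element in it:
                yield element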

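　　The code above is easy to follow. chain returns a lazy iterator (a generator in this sketch; the real itertools.chain is implemented in C but behaves the same way), so it hands out one element at a time and never builds a combined list. What it saves is the memory for that intermediate list; the input iterables themselves still take up whatever memory they already did.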
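　　One way to see the laziness is to record when values are actually produced. This is only a sketch: the numbers helper and the produced log are hypothetical names introduced for the demonstration.

```python
from itertools import chain, islice

produced = []  # log of (label, value) pairs, filled in as values are generated

def numbers(label, n):
    """Hypothetical generator that records each value as it is produced."""
    for i in range(n):
        produced.append((label, i))
        yield i

lazy = chain(numbers('a', 3), numbers('b', 3))

# Building the chain consumes nothing from its inputs.
assert produced == []

# Pull only the first four values; the remaining ones are never generated.
first_four = list(islice(lazy, 4))
print(first_four)   # [0, 1, 2, 0]
print(produced)     # [('a', 0), ('a', 1), ('a', 2), ('b', 0)]
```

　　The same property is what lets chain sit in front of large files or endless generators without loading them up front.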

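　　Accumulating one by one

　　Returns the running (accumulated) results over an iterable. Prototype (the initial parameter requires Python 3.8 or later):

　　 accumulate(iterable[, func, *, initial=None])

　　Usage: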

　　 In [36]: list(accumulate([1,2,3,4,5,6],lambda x,y: x*y))

　　 Out[36]: [1, 2, 6, 24, 120, 720]

　　The implementation of accumulate is roughly:

    import operator

    def accumulate(iterable, func=operator.add, *, initial=None):
        it = iter(iterable)
        total = initial
        if initial is None:
            # No initial value: start from the first element, if there is one.
            try:
                total = next(it)
            except StopIteration:
                return
        yield total
        for element in it:
            total = func(total, element)
            yield total

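　　Still with me? Unlike the simple yield in chain, this one takes a little more care. yield resembles return in that it hands a value back to the caller, but it suspends the function instead of ending it. Calling accumulate(...) runs none of the body; it only returns a generator object, call it gen. The first next(gen) runs up to the first yield total line and hands back the first result: the first element of iterable when initial is None, otherwise initial itself. The second next(gen) resumes right after that yield and enters the for element in it: loop. When initial is None, the iterator it has already moved past the first element, so the loop starts from the second one, and each later step yields func(total, element).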
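　　Stepping through the generator by hand shows the same sequence of events. This is a minimal sketch using the standard itertools.accumulate; the initial argument assumes Python 3.8 or later, and the sample numbers are arbitrary.

```python
from itertools import accumulate
import operator

gen = accumulate([1, 2, 3, 4], operator.mul)

print(next(gen))  # 1  (first element, yielded as-is)
print(next(gen))  # 2  (1 * 2)
print(next(gen))  # 6  (2 * 3)

# With initial, the initial value is yielded first,
# so the output has one more item than the input.
running = list(accumulate([1, 2, 3, 4], initial=100))
print(running)    # [100, 101, 103, 106, 110]

# An empty input with no initial value yields nothing at all.
empty = list(accumulate([]))
print(empty)      # []
```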

　　Funnel filtering

　　This is the compress function. It works like a funnel, so I call it funnel filtering. Prototype:

　　 compress(data, selectors)

　　 In [38]: list(compress('abcdefg',[1,1,0,1]))

　　 Out[38]: ['a', 'b', 'd']

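　　Note that compress stops as soon as the shorter of its two arguments is exhausted, so it returns at most that many elements. The actual count is the number of truthy selectors within that range: here 3, even though the shorter argument has length 4.

　　Its implementation is roughly:

    def compress(data, selectors):
        # Keep each d whose paired selector s is truthy.
        return (d for d, s in zip(data, selectors) if s)

　　This function is very handy.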
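　　A short sketch of how the result length relates to the selectors, plus one common use: building the selectors from a parallel sequence. The names and scores are invented sample data.

```python
from itertools import compress

data = 'abcdefg'
selectors = [1, 1, 0, 1]   # shorter than data, so 'e', 'f', 'g' are never examined

picked = list(compress(data, selectors))
print(picked)              # ['a', 'b', 'd']

# The number of items kept equals the number of truthy selectors.
truthy = sum(bool(s) for s in selectors)
print(len(picked), truthy)  # 3 3

# Selectors can be computed lazily from a parallel sequence.
names = ['ann', 'bob', 'cid']   # hypothetical data
scores = [91, 58, 77]
passed = list(compress(names, (s >= 60 for s in scores)))
print(passed)              # ['ann', 'cid']
```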

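　　Segment filtering

　　Scans the iterable, drops elements while the predicate holds, and keeps everything from the first element that fails the predicate onward. Prototype:

　　 dropwhile(predicate, iterable)

　　Example: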

　　 In [39]: list(dropwhile(lambda x: x<3,[1,0,2,4,1,1,3,5,-5]))

　　 Out[39]: [4, 1, 1, 3, 5, -5]

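　　Its implementation is roughly:

    def dropwhile(predicate, iterable):
        iterable = iter(iterable)
        # Skip the leading run of elements that satisfy the predicate.
        for x in iterable:
            if not predicate(x):
                yield x
                break
        # After the first failure, pass everything through untested.
        for x in iterable:
            yield x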
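　　Because the predicate is no longer consulted after its first failure, dropwhile behaves differently from filter. The sketch below compares the two on the same data and adds takewhile, which keeps the leading run instead of dropping it; below_three is just an illustrative helper name.

```python
from itertools import dropwhile, takewhile

values = [1, 0, 2, 4, 1, 1, 3, 5, -5]

def below_three(x):
    return x < 3

# dropwhile discards only the leading run that satisfies the predicate.
dropped = list(dropwhile(below_three, values))
print(dropped)    # [4, 1, 1, 3, 5, -5]

# filter tests every element, so later small values are removed too.
filtered = [x for x in values if not below_three(x)]
print(filtered)   # [4, 3, 5]

# takewhile keeps the leading run and stops at the first failure.
taken = list(takewhile(below_three, values))
print(taken)      # [1, 0, 2]
```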

1 reply • 2020-12-22 18:25:11 +08:00

wysnylc

Dec 22, 2020

Send it back for a rewrite.