https://www.reddit.com/r/golang/comments/7ony5f/what_is_the_meaning_of_flat_and_cum_in_golang/
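Everything I have read agrees with the thread above: cum (cumulative) is the time spent in a function itself plus the time spent in all the functions it calls, while flat counts only the function's own time. I wrote a demo using the same call pattern as that thread, and the results match expectations: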
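// ctA does no work of its own; it only calls ctB.
func ctA() int64 {
	return ctB()
}

// ctB runs its own loop (1e9 iterations), then calls ctC and ctD.
func ctB() int64 {
	var x int64 = 0
	for i := 0; i < 1e9; i++ {
		x = x + 1
	}
	return x + ctC() + ctD()
}

// ctC is a leaf function with a short loop (1e8 iterations).
func ctC() int64 {
	var x int64 = 0
	for i := 0; i < 1e8; i++ {
		x = x + 1
	}
	return x
}

// ctD is a leaf function with a longer loop (5e8 iterations).
func ctD() int64 {
	var x int64 = 0
	for i := 0; i < 5e8; i++ {
		x = x + 1
	}
	return x
}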
(pprof) top
Showing nodes accounting for 48.68s, 99.43% of 48.96s total
Dropped 11 nodes (cum <= 0.24s)
      flat  flat%   sum%        cum   cum%
    30.31s 61.91% 61.91%     48.68s 99.43%  main.ctB
    15.35s 31.35% 93.26%     15.35s 31.35%  main.ctD
     3.02s  6.17% 99.43%      3.02s  6.17%  main.ctC
         0     0% 99.43%     48.68s 99.43%  main.ctA
         0     0% 99.43%     48.68s 99.43%  main.main
         0     0% 99.43%     48.68s 99.43%  runtime.main
         0     0% 99.43%      0.27s  0.55%  runtime.mstart
         0     0% 99.43%      0.27s  0.55%  runtime.mstart1
         0     0% 99.43%      0.27s  0.55%  runtime.sysmon
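But in a slightly more complex example, the cum value of main.main is very small (see below). That doesn't make sense to me: isn't main always on the call stack? In theory it should have the largest cum.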
(pprof) top30
Showing nodes accounting for 15.82s, 95.36% of 16.59s total
Dropped 92 nodes (cum <= 0.08s)
Showing top 30 nodes out of 67
      flat  flat%   sum%        cum   cum%
     6.69s 40.33% 40.33%      6.69s 40.33%  runtime.pthread_cond_wait
     4.61s 27.79% 68.11%      4.61s 27.79%  runtime.pthread_cond_signal
     1.35s  8.14% 76.25%      1.35s  8.14%  runtime.memmove
     0.84s  5.06% 81.31%      0.84s  5.06%  runtime.usleep
     0.51s  3.07% 84.39%      0.52s  3.13%  runtime.findnull
     0.49s  2.95% 87.34%      0.49s  2.95%  runtime.nanotime
     0.44s  2.65% 89.99%      0.44s  2.65%  runtime.(*semaRoot).dequeue
     0.34s  2.05% 92.04%      0.34s  2.05%  runtime.memclrNoHeapPointers
     0.20s  1.21% 93.25%      0.33s  1.99%  runtime.scanobject
     0.09s  0.54% 93.79%      0.09s  0.54%  runtime.pthread_mutex_lock
     0.05s   0.3% 94.09%      0.13s  0.78%  runtime.gentraceback
     0.03s  0.18% 94.27%      0.35s  2.11%  sync.(*Mutex).Lock
     0.03s  0.18% 94.45%      0.38s  2.29%  sync.(*RWMutex).Lock
     0.02s  0.12% 94.58%      0.14s  0.84%  bufio.(*Scanner).Scan
     0.02s  0.12% 94.70%      0.75s  4.52%  runtime.scang
     0.02s  0.12% 94.82%      8.40s 50.63%  runtime.schedule
     0.01s  0.06% 94.88%      0.19s  1.15%  main.Loop
     0.01s  0.06% 94.94%      0.14s  0.84%  runtime.callers
     0.01s  0.06% 95.00%      7.07s 42.62%  runtime.findrunnable
     0.01s  0.06% 95.06%      1.20s  7.23%  runtime.gcDrain
     0.01s  0.06% 95.12%      0.11s  0.66%  runtime.mallocgc
     0.01s  0.06% 95.18%      3.04s 18.32%  runtime.newproc.func1
     0.01s  0.06% 95.24%      3.03s 18.26%  runtime.newproc1
     0.01s  0.06% 95.30%      0.56s  3.38%  runtime.osyield
     0.01s  0.06% 95.36%      4.92s 29.66%  runtime.systemstack
         0     0% 95.36%      0.18s  1.08%  main.(*TrieNode).addWord
         0     0% 95.36%      0.52s  3.13%  main.BuildTire
         0     0% 95.36%      2.45s 14.77%  main.Loop.func1
         0     0% 95.36%      0.53s  3.19%  main.main
         0     0% 95.36%      0.10s   0.6%  runtime.(*gcWork).balance
How should I make sense of this?
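One explanation that seems consistent with the output above, though it is an assumption about this program rather than something confirmed here: a CPU sample is attributed to the stack of whichever goroutine or runtime thread was running when it was taken, and a goroutine started with `go` has its own stack that begins at the goroutine's function, not at main.main. That would fit main.Loop.func1 (cum 2.45s) and runtime.schedule (cum 8.40s) both exceeding main.main (cum 0.53s). The sketch below only illustrates the stack-root part; `stackNames` is a hypothetical helper built on `runtime.Callers`.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// stackNames returns the function names on the calling goroutine's
// stack, innermost frame first (hypothetical helper for this sketch).
func stackNames() []string {
	pcs := make([]uintptr, 32)
	// Skip two frames: runtime.Callers itself and stackNames.
	n := runtime.Callers(2, pcs)
	if n == 0 {
		return nil
	}
	frames := runtime.CallersFrames(pcs[:n])
	var names []string
	for {
		fr, more := frames.Next()
		names = append(names, fr.Function)
		if !more {
			break
		}
	}
	return names
}

func main() {
	// Stack of the main goroutine: main.main is on it.
	fmt.Println("main goroutine:", strings.Join(stackNames(), " <- "))

	// Stack of a goroutine started with `go`: it starts at the
	// closure itself, so there is no main.main frame beneath it.
	done := make(chan []string)
	go func() { done <- stackNames() }()
	fmt.Println("new goroutine: ", strings.Join(<-done, " <- "))
}
```

If that is what is happening, work done inside goroutines spawned by main.Loop would be charged to main.Loop.func1 and to the runtime's scheduling functions, and main.main's cum would cover only what the main goroutine itself executed.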