更快地在Haskell中进行直方图计算

我对Haskell很新,我想创建一个直方图。 我正在使用Data.Vector.Unboxed来融合数据上的操作; 这是快速的(用-O -fllvm编译时),瓶颈是我的折叠应用程序; 它汇总了桶的数量。

我怎样才能让它更快? 我读过关于通过严格保密来减少Thunk的数量,所以我通过使用seq和foldr来制定严格的规定,但是没有看到性能提高。 强烈鼓励您的想法。

import qualified Data.Vector.Unboxed as V

histogram :: [(Int,Int)]
histogram = V.foldr' agg [] $ V.zip k v
 where 
    n = 10000000
    c = 1000000
    k = V.generate n (i -> i `div` c * c)
    v = V.generate n (i -> 1)
    agg kv [] = [kv]
    agg kv@(k,v) acc@((ck,cv):as)
        | k == ck = let a = (ck,cv+v):as in a `seq` a
        | otherwise = let a = kv:acc in a `seq` a

main :: IO ()
main = print histogram 

编译:

ghc --make -O -fllvm histogram.hs

首先,用-O2 -rtsopts编译程序。 然后,要获得可以优化的第一个想法,请使用选项+RTS -sstderr运行程序:

$ ./question +RTS -sstderr
[(0,1000000),(1000000,1000000),(2000000,1000000),(3000000,1000000),(4000000,1000000),(5000000,1000000),(6000000,1000000),(7000000,1000000),(8000000,1000000),(9000000,1000000)]
   1,193,907,224 bytes allocated in the heap
   1,078,027,784 bytes copied during GC
     282,023,968 bytes maximum residency (7 sample(s))
      86,755,184 bytes maximum slop
             763 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1964 colls,     0 par    3.99s    4.05s     0.0021s    0.0116s
  Gen  1         7 colls,     0 par    1.60s    1.68s     0.2399s    0.6665s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.67s  (  2.68s elapsed)
  GC      time    5.59s  (  5.73s elapsed)
  EXIT    time    0.02s  (  0.03s elapsed)
  Total   time    8.29s  (  8.43s elapsed)

  %GC     time      67.4%  (67.9% elapsed)

  Alloc rate    446,869,876 bytes per MUT second

  Productivity  32.6% of total user, 32.0% of total elapsed

注意你的时间有67%花在GC上! 显然有一些错误。 为了找出问题所在,我们可以在启用堆分析的情况下运行程序(使用+RTS -h ),产生下图:

第一个堆配置文件

所以,你正在泄漏thunk。 这是如何发生的? 查看代码,唯一一次在agg构建(递归)thunk的时间是在添加时进行的。 通过添加爆炸模式来制作cv严格来修复问题:

{-# LANGUAGE BangPatterns #-}
import qualified Data.Vector.Unboxed as V

histogram :: [(Int,Int)]
histogram = V.foldr' agg [] $ V.zip k v
 where
    n = 10000000
    c = 1000000
    k = V.generate n (i -> i `div` c * c)
    v = V.generate n id
    agg kv [] = [kv]
    agg kv@(k,v) acc@((ck,!cv):as) -- Note the !
        | k == ck = (ck,cv+v):as
        | otherwise = kv:acc

main :: IO ()
main = print histogram

输出:

$ time ./improved +RTS -sstderr 
[(0,499999500000),(1000000,1499999500000),(2000000,2499999500000),(3000000,3499999500000),(4000000,4499999500000),(5000000,5499999500000),(6000000,6499999500000),(7000000,7499999500000),(8000000,8499999500000),(9000000,9499999500000)]
     672,063,056 bytes allocated in the heap
          94,664 bytes copied during GC
     160,028,816 bytes maximum residency (2 sample(s))
       1,464,176 bytes maximum slop
             155 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       992 colls,     0 par    0.03s    0.03s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.03s    0.03s     0.0161s    0.0319s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    1.24s  (  1.25s elapsed)
  GC      time    0.06s  (  0.06s elapsed)
  EXIT    time    0.03s  (  0.03s elapsed)
  Total   time    1.34s  (  1.34s elapsed)

  %GC     time       4.4%  (4.5% elapsed)

  Alloc rate    540,674,868 bytes per MUT second

  Productivity  95.5% of total user, 95.1% of total elapsed

./improved +RTS -sstderr  1,14s user 0,20s system 99% cpu 1,352 total

这好多了。


所以现在你可以问,为什么这个问题出现了,即使你使用了seq ? 原因是seq只会迫使第一个参数成为WHNF,并且对于(_,_) (其中_是未评估的thunk)已经是WHNF! 此外, seq aa是相同的a ,因为它seq ab (非正式)表示:评价一个计算b之前,所以seq aa只是意味着:评估前的评估,这是一样的只是评估!

链接地址: http://www.djcxy.com/p/42981.html

上一篇: Making a histogram computation in Haskell faster

下一篇: How does take 2 $ [1..] work in haskell?