Haskell: A stricter fold' with deepseq

The page Foldr Foldl Foldl' discusses foldl' , and defines it like:

foldl' f z []     = z
foldl' f z (x:xs) = let z' = z `f` x 
                    in seq z' $ foldl' f z' xs

This is done to avoid space leaks, ie so fold' that produces a constant size result only uses constant space.

However, this doesn't necessarily work, as pointed out here:

The involved seq function does only evaluate the top-most constructor. If the accumulator is a more complex object, then fold' will still build up unevaluated thunks.

The obvious solution is to change seq to deepseq as shown (assuming you're working with NFData ):

foldl_strict f z []     = z
foldl_strict f z (x:xs) = let z' = z `f` x 
                          in deepseq z' $ foldl_strict f z' xs

But I have a feeling this can be horribly inefficient, as the entire structure will need to be traversed by deepseq each pass through the loop (unless the compiler can statically prove this is not necessary).

I then tried this:

foldl_stricter  f z l      = deepseq z $ foldl_stricter' f z l
foldl_stricter' f z []     = z
foldl_stricter' f z (x:xs) = let z' = deepseq x $ z `f` x 
                             in seq z' $ foldl_stricter' f z' xs

But found it had this issue. The below fails when it should return 3.

foldl_stricter (x y -> x + head y) 0 [[1..],[2..]]

So fold_stricter is too strict. The list need not be strict, what is important to prevent a space leak is that the accumulator is strict. fold_stricter goes too far and also makes the list strict also, which causes the above to fail.

Which takes us back to fold_strict . Does repeatedly running deepseq on a data structure of size n take O(n) time, or only O(n) time the first time and O(1) thereafter? (As dbaupp suggests in his comment below)


In fact, your two implementations of foldl are significantly different. There is no guarantee that fzx will need to completely traverse x to compute its answer, so deepseq x (fzx) may do unnecessary work; moreover, even if x is completely evaluated, there is no guarantee that fzx has no nested thunks, so let z' = deepseq x (fzx) in seq z' (foo z') may not do enough work.

The correct solution to the problem you stated is to use foldl' and a strict data type as the accumulator type; this way, the seq will only need to check the constructor to know that the entire structure is completely evaluated, and conversely forcing the constructor will force the entire structure to be completely evaluated.

链接地址: http://www.djcxy.com/p/10912.html

上一篇: 在Java中尝试/做模式实现

下一篇: 哈斯克尔:一个更严格的折叠'与​​deepseq