How is foldl lazy?

2018-06-15 03:15:30

There are lots of good questions and answers about foldl, foldr, and foldl' in Haskell.

So now I know that:
1) foldl is lazy
2) don't use foldl because it can blow up the stack
3) use foldl' instead because it is strict (ish)

How foldl is evaluated:
1) a whole bunch of thunks are created
2) after Haskell is done creating thunks, the thunks are reduced
3) overflow the stack if there are too many thunks

What I'm confused about:
1) why does reduction have to occur after all thunk-ing?
2) why isn't foldl evaluated just like foldl'? Is this just an implementation side-effect?
3) from the definition, foldl looks like it could be evaluated efficiently using tail-recursion -- how can I tell whether a function will actually be efficiently evaluated? It seems like I have to start worrying about order-of-evaluation in Haskell, if I don't want my program to crash.

Thanks in advance. I don't know if my understanding of the evaluation of foldl is right -- please suggest corrections if necessary.

UPDATE: it looks like the answer to my question has something to do with Normal Form, Weak Head Normal Form, and Head Normal Form, and Haskell's implementation of them.
However, I'm still looking for an example where evaluating the combining function more eagerly would lead to a different result (either a crash or unnecessary evaluation).

You know that, by definition:

foldl op start (x1:x2:...:xN:[]) = ((start `op` x1) `op` x2) ...

The line in foldl that does this is:

foldl op a (x:xs) = foldl op (a `op` x) xs

You are right that this is tail recursive, but observe that the expression

(a `op` x)

is lazy, so by the end of the list a huge expression will have been built, and only then is it reduced. The difference from foldl' is just that the expression above is forced on every recursive step, so at each step the accumulator is already a value in weak head normal form.
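To make the difference concrete, here is a minimal sketch of the two folds. The names myFoldl and myFoldl' are hypothetical stand-ins written for illustration; they are not the definitions in base, which are more elaborate. The traces in the comments are conceptual and ignore anything the optimizer might do.

```haskell
-- Hypothetical re-implementations for illustration only.

-- Lazy left fold: the new accumulator (acc `op` x) is passed along unevaluated.
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _  acc []     = acc
myFoldl op acc (x:xs) = myFoldl op (acc `op` x) xs

-- Strict left fold: seq forces the new accumulator to weak head normal form
-- before the recursive call is made.
myFoldl' :: (b -> a -> b) -> b -> [a] -> b
myFoldl' _  acc []     = acc
myFoldl' op acc (x:xs) =
  let acc' = acc `op` x
  in acc' `seq` myFoldl' op acc' xs

-- Conceptual trace of myFoldl (+) 0 [1,2,3]:
--   myFoldl (+) 0 [1,2,3]
--   myFoldl (+) (0 + 1) [2,3]
--   myFoldl (+) ((0 + 1) + 2) [3]
--   myFoldl (+) (((0 + 1) + 2) + 3) []
--   ((0 + 1) + 2) + 3        -- the nested thunks are only reduced now
--
-- Conceptual trace of myFoldl' (+) 0 [1,2,3]:
--   myFoldl' (+) 0 [1,2,3]
--   myFoldl' (+) 1 [2,3]
--   myFoldl' (+) 3 [3]
--   myFoldl' (+) 6 []
--   6

main :: IO ()
main = do
  print (myFoldl  (+) 0 [1 .. 10 :: Int])  -- 55
  print (myFoldl' (+) 0 [1 .. 10 :: Int])  -- 55
```

Both give the same answer here. The only difference is when the additions happen and how much unevaluated expression is held in memory in the meantime.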

The short answer is that in foldl f, it's not necessarily the case that f is strict in its accumulator, so reducing the thunks up front might be too eager: it could evaluate something the lazy fold would never have touched. However, in practice f usually is strict, so you nearly always want to be using foldl'.
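A small sketch of a case where the two folds disagree, assuming a combining function that ignores its accumulator. The names keepLast, lazyResult and strictResult are made up for this example.

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.List (foldl')

-- Hypothetical combining function that never looks at the accumulator.
keepLast :: Int -> Int -> Int
keepLast _ x = x

-- foldl never forces the intermediate accumulator, so the undefined
-- element is discarded without ever being evaluated.
lazyResult :: Int
lazyResult = foldl keepLast 0 [undefined, 2]

-- foldl' forces every intermediate accumulator. After the first element
-- the accumulator is undefined, so forcing it raises an error.
strictResult :: Int
strictResult = foldl' keepLast 0 [undefined, 2]

main :: IO ()
main = do
  print lazyResult  -- 2
  r <- try (evaluate strictResult) :: IO (Either SomeException Int)
  putStrLn (either (const "foldl' raised an error") show r)
```

This is contrived, but it shows why the library cannot silently make foldl behave like foldl': for a non-strict f the two are observably different.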

I wrote a more in-depth explanation of how the evaluation order of foldl and foldl' works on another question. It's rather long but I think it should clarify things a bit for you.

I'm still looking for an example where evaluating the combining function more eagerly would lead to a different result

The general rule of thumb is never, ever use foldl. Always use foldl', except when you should use foldr. I think you know enough about foldl to understand why it should be avoided.

See also: Real World Haskell > Functional Programming # Left folds, laziness, and space leaks

However, your question neglects foldr. The nifty thing about foldr is that it can produce incremental results, while foldl' needs to traverse the entire list before yielding an answer. This means that foldr's laziness allows it to work with infinite lists. There are also questions and answers that go into detail about this sort of thing.
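A minimal sketch of foldr producing results incrementally from an infinite list. The names doubled, firstThreeDoubled and anyOverTen are invented for this example.

```haskell
-- Each step emits one list cell and leaves the rest of the fold as a thunk,
-- so a consumer can stop whenever it has seen enough.
doubled :: [Int]
doubled = foldr (\x rest -> 2 * x : rest) [] [1 ..]

firstThreeDoubled :: [Int]
firstThreeDoubled = take 3 doubled

-- (||) short-circuits: once an element satisfies the test, the thunk for
-- the rest of the infinite list is never demanded.
anyOverTen :: Bool
anyOverTen = foldr (\x rest -> x > 10 || rest) False [1 :: Int ..]

main :: IO ()
main = do
  print firstThreeDoubled  -- [2,4,6]
  print anyOverTen         -- True
```

The same computations written with foldl or foldl' would not terminate, since a left fold has to reach the end of the list before it can return anything.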

Having brought that up, let me try to succinctly answer your questions.

1) why does reduction have to occur after all thunk-ing?

Reduction only occurs at strictness points. Performing IO, for example, is a strictness point. foldl' uses seq to add an additional strictness point that foldl does not have.
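As a rough illustration of strictness points, the sketch below builds a value that raises an error if forced and shows which operations actually force it. The names bomb, pair, sndOfPair and pairToWhnf are hypothetical.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A value that raises an error if anything ever forces it.
bomb :: Int
bomb = error "bomb was forced"

-- Building the pair does not force its fields.
pair :: (Int, Int)
pair = (bomb, 1)

-- Only the second field is demanded, so bomb is never evaluated.
sndOfPair :: Int
sndOfPair = snd pair

-- seq evaluates its first argument to weak head normal form only:
-- here that means the (,) constructor, not the fields inside it.
pairToWhnf :: String
pairToWhnf = pair `seq` "pair forced to WHNF, bomb untouched"

main :: IO ()
main = do
  print sndOfPair      -- printing is a strictness point for sndOfPair
  putStrLn pairToWhnf
  -- Forcing bomb itself with seq does raise the error.
  r <- try (evaluate (bomb `seq` ())) :: IO (Either SomeException ())
  putStrLn (either (const "forcing bomb raised an error") (const "no error") r)
```

This is also why foldl' is only "strict-ish": with a tuple or record accumulator, seq stops at the outermost constructor and the fields can still pile up thunks.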

2) why isn't foldl evaluated just like foldl'? Is this just an implementation side-effect?

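Because of the additional strictness point in foldl': seq forces the accumulator at every step, whereas foldl leaves it as a thunk until the final result is demanded.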

3) from the definition, foldl looks like a tail-recursive function to me -- how can I tell whether a function will actually be efficiently evaluated? It seems like I have to start worrying about order-of-evaluation in Haskell, if I don't want my program to crash.

You need to learn a little bit more about lazy evaluation. Lazy evaluation is not exclusive to Haskell, but Haskell is one of the very, very few languages in which laziness is the default. For a beginner, just remember to always use foldl' and you should be fine.

If laziness actually gets you into trouble someday, that is when you should probably make sure you understand laziness and Haskell's strictness points. You might say that said theoretical day is a strictness point for lazy-by-default learning.