GHC doesn't pick the only available instance

2018-06-15 05:47:05

I'm trying to write a CSS DSL in Haskell, and keep the syntax as close to CSS as possible. One difficulty is that certain terms can appear both as a property and value. For example flex: you can have "display: flex" and "flex: 1" in CSS.

I've let myself inspire by the Lucid API, which overrides functions based on the function arguments to generate either attributes or DOM nodes (which sometimes also share names, eg <style> and <div style="..."> ).

Anyway, I've ran into a problem that GHC fails to typecheck the code (Ambiguous type variable), in a place where it is supposed to pick one of the two available typeclass instances. There is only one instance which fits (and indeed, in the type error GHC prints "These potential instance exist:" and then it lists just one). I'm confused that given the choice of a single instance, GHC refuses to use it. Of course, if I add explicit type annotations then the code compiles. Full example below (only dependency is mtl, for Writer).

{-# LANGUAGE FlexibleInstances #-}
module Style where

import Control.Monad.Writer.Lazy


type StyleM = Writer [(String, String)]
newtype Style = Style { runStyle :: StyleM () }


class Term a where
    term :: String -> a

instance Term String where
    term = id

instance Term (String -> StyleM ()) where
    term property value = tell [(property, value)]


display :: String -> StyleM ()
display = term "display"

flex :: Term a => a
flex = term "flex"

someStyle :: Style
someStyle = Style $ do
    flex "1"     -- [1] :: StyleM ()
    display flex -- [2]

And the error:

Style.hs:29:5: error:
    • Ambiguous type variable ‘a0’ arising from a use of ‘flex’
      prevents the constraint ‘(Term
                                  ([Char]
                                   -> WriterT
                                        [(String, String)]
                                        Data.Functor.Identity.Identity
                                        a0))’ from being solved.
        (maybe you haven't applied a function to enough arguments?)
      Probable fix: use a type annotation to specify what ‘a0’ should be.
      These potential instance exist:
        one instance involving out-of-scope types
          instance Term (String -> StyleM ()) -- Defined at Style.hs:17:10
    • In a stmt of a 'do' block: flex "1"
      In the second argument of ‘($)’, namely
        ‘do { flex "1";
              display flex }’
      In the expression:
        Style
        $ do { flex "1";
               display flex }
Failed, modules loaded: none.

I've found two ways how to make this code compile, none of which I'm happy with.

Add explicit annotation where the flex function is used ([1]).

Move the line where flex is used to the end of the do block (eg. comment out [2]).

One difference between my API and Lucid is that the Lucid terms always take one argument, and Lucid uses fundeps, which presumably gives the GHC typechecker more information to work with (to choose the correct typeclass instance). But in my case the terms don't always have an argument (when they appear as the value).

The problem is that the Term instance for String -> StyleM () only exists when StyleM is parameterized with () . But in a do-block like

someStyle :: Style
someStyle = Style $ do
    flex "1"
    return ()

there is not enough information to know which is the type parameter in flex "1" , because the return value is thrown away.

A common solution to this problem is the "constraint trick". It requires type equality constraints, so you have to enable {-# LANGUAGE TypeFamilies #-} or {-# LANGUAGE GADTs #-} and tweak the instance like this:

{-# LANGUAGE TypeFamilies #-}

instance (a ~ ()) => Term (String -> StyleM a) where
    term property value = tell [(property, value)]

This tells the compiler: "You don't need to know the precise type a to get the instance, there is one for all types! However, once the instance is determined, you'll always find that the type was () after all!"

This trick is the typeclass version of Henry Ford's "You can have any color you like, as long as it's black." The compiler can find an instance despite the ambiguity, and finding the instance gives him enough information to resolve the ambiguity.

It works because Haskell's instance resolution never backtracks, so once an instance "matches", the compiler has to commit to any equalities it discovers in the preconditions of the instance declaration, or throw a type error.

There is only one instance which fits (and indeed, in the type error GHC prints "These potential instance exist:" and then it lists just one). I'm confused that given the choice of a single instance, GHC refuses to use it.

Type classes are open; any module could define new instances. So GHC never assumes that it knows about all instances, when checking a use of a type class. (With the possible exception of the bad extensions like OverlappingInstances .) Logically, then, the only possible answers to a question "is there an instance for CT " are "yes" and "I don't know". To answer "no" risks incoherence with another part of your program that does define an instance CT .

So, you should not imagine the compiler iterating over every declared instance and seeing whether it fits at the particular use site of interest, because what would it do with all the "I don't know"s? Instead, the process works like this: infer the most general type that could be used at the particular use site and query the instance store for the needed instance. The query can return a more general instance than the one needed, but it can never return a more specific instance, since it would have to choose which more specific instance to return; then your program is ambiguous.

One way to think about the difference is that iterating over all declared instances for C would take linear time in the number of instances, while querying the instance store for a specific instance only has to examine a constant number of potential instances. For example, if I want to type check

Left True == Left False

I need an instance for Eq (Either Bool t) , which can only be satisfied by one of

instance Eq (Either Bool t)
instance Eq (Either a t)    -- *
instance Eq (f Bool t)
instance Eq (f a t)
instance Eq (g t)
instance Eq b

(The instance marked * is the one that actually exists, and in standard Haskell (without FlexibleInstances ) it's the only one of these instances that is legal to declare; the traditional restriction to instances of the form C (T var1 ... varN) makes this step easy since there will always be exactly one potential instance.)

If instances are stored in something like a hash table then this query can be done in constant time regardless of the number of declared instances of Eq (which is probably a pretty large number).

In this step, only instance heads (the stuff to the right of the => ) are examined. Along with a "yes" answer, the instance store can return new constraints on type variables that come from the context of the instance (the stuff to the left of the => ). These constraints then need to be solved in the same manner. (This is why instances are considered to overlap if they have overlapping heads, even if their contexts look mutually exclusive, and why instance Foo a => Bar a is almost never a good idea.)

In your case, since a value of any type can be discarded in do notation, we need an instance for Term (String -> StyleM a) . The instance Term (String -> StyleM ()) is more specific, so it's useless in this case. You could either write

do
  () <- flex "1"
  ...

to make the needed instance more specific, or make the provided instance more general by using the type equality trick as explained in danidiaz's answer.

链接地址: http://www.djcxy.com/p/43266.html

上一篇: 使用ViewPatterns和PatternSynonyms来简单地模式匹配

下一篇: GHC不会选择唯一可用的实例