Haskell: use of unsafePerformIO for global constant bindings

There are lots of discussions of using unsafePerformIO carefully for global mutable variables, and some language additions to support it (e.g. Data.Global). I have a related but distinct question: using it for global constant bindings. Here's a usage I consider entirely OK: command-line parsing.

module Main where

--------------------------------------------------------------------------------
import Data.Bool (bool)
import Data.Monoid ((<>))
import Options.Applicative (short, help, execParser, info, helper, fullDesc,
                            progDesc, long, switch)
import System.IO.Unsafe (unsafePerformIO)

--------------------------------------------------------------------------------
data CommandLine = CommandLine
  Bool               --quiet
  Bool               --verbose
  Bool               --force

commandLineParser = CommandLine

  <$> switch
    (  long "quiet"
    <> short 'q'
    <> help "Show only error messages.")

  <*> switch
    (  long "verbose"
    <> short 'v'
    <> help "Show lots of detail.")

  <*> switch
    (  long "force"
    <> short 'f'
    <> help "Do stuff anyway.")

{- Parse the command line, and bind related values globally for
convenience. This use of unsafePerformIO is OK since the action has no
side effects and it's idempotent. -}

CommandLine cQuiet cVerbose cForce 

  = unsafePerformIO . execParser $ info (helper <*> commandLineParser)
      ( fullDesc
     <> progDesc "example program"
      )

-- Print a message:
say     = say' $ not cQuiet -- unless --quiet
verbose = say' cVerbose     -- if --verbose
say'    = bool (const $ return ()) putStrLn

--------------------------------------------------------------------------------
main :: IO ()

main = do
  verbose "a verbose message"
  say "a regular message"

It is very valuable to be able to refer to cQuiet, cVerbose, etc. globally rather than having to pass them around as arguments wherever they're needed. After all, this is exactly what global identifiers are for: these have a single value that never changes during any run of the program. It just happens that the value is initialized from the outside world rather than declared in the program text.

It makes sense in principle to do the same thing with other sorts of constant data fetched from the outside, e.g. settings from a configuration file. But then an extra point arises: the action which fetches those is not idempotent, unlike reading the command line (I'm slightly abusing the term “idempotent” here, but trust that I'm understood). This just adds the constraint that the action must be performed only once. My question is: what's the best way to do that with code of this form:

data Config = Config String (Maybe String) Int  -- fields: foo, bar, baz

readConfig :: IO Config
readConfig = do …

Config foo bar baz = unsafePerformIO readConfig

The documentation for unsafePerformIO suggests to me that this is sufficient and that none of the precautions mentioned there are needed, but I'm not sure. I've seen proposals for adding a top-level syntax inspired by do-notation specifically for such situations:

Config foo bar baz <- readConfig

… which seems like a very good idea; I'd rather be sure the action will be performed at most once than rely on various compiler settings and hope no compiler behavior comes along that breaks existing code.
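For comparison, here is a minimal sketch of the form the System.IO.Unsafe documentation describes for top-level bindings: bind the whole record once under a NOINLINE pragma, then expose the fields through pure accessors. The record field names (cfgFoo and friends), the name globalConfig, and the stand-in body of readConfig are all hypothetical, since the real action is elided above. This illustrates the usual precaution rather than guaranteeing that the action runs at most once under every compiler setting.

```haskell
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical record; the field names are illustrative only.
data Config = Config
  { cfgFoo :: String
  , cfgBar :: Maybe String
  , cfgBaz :: Int
  }

-- Stand-in for the real action, which would read a configuration file.
readConfig :: IO Config
readConfig = pure (Config "foo" Nothing 42)

-- NOINLINE asks GHC not to copy the unsafePerformIO call to use sites;
-- if the call were inlined, the I/O could be performed more than once.
{-# NOINLINE globalConfig #-}
globalConfig :: Config
globalConfig = unsafePerformIO readConfig

-- Pure accessors that all go through the single globalConfig binding.
foo :: String
foo = cfgFoo globalConfig

bar :: Maybe String
bar = cfgBar globalConfig

baz :: Int
baz = cfgBaz globalConfig
```

Binding one value and projecting from it, instead of pattern-matching at top level, also gives the pragma a single name to attach to.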

I feel the fact that these are in fact constants, together with the ugliness involved in passing such things around explicitly despite the fact that they never change, argue strongly for there being a safe and supported way to do this. I'm open to hearing contrary opinions if someone thinks I'm missing an important point here, though.

Updates

  • The say and verbose uses in the example are not the best, because values in the IO monad are not the real annoyance: these could easily read the parameters from a global IORef. The problem is the pervasive use of such parameters in pure code, which would all have to be rewritten either to take the parameters explicitly (even though they do not change and thus should not need to be function parameters) or to be converted to IO, which is even worse. I'll improve the example when I have time.

  • Another way to think about this: the class of behaviors I'm talking about could be obtained in the following clunky way: run a program that fetches some data via I/O; take the results and substitute them into the template text of the main program as the values of some global bindings; then compile and run the resulting main program. You would then safely have the advantage of referring to those constants easily throughout the program. It seems that it should not be so hard to implement this pattern directly. I phrased the question mentioning unsafePerformIO, but really I'm interested in understanding this kind of behavior, and what the best way to obtain it would be. unsafePerformIO is one way, but it has drawbacks.

  • Known limitations:

  • With unsafePerformIO, the point at which the data-fetching action runs is not fixed. This may be a feature: e.g. an error related to a missing configuration parameter occurs if and only if that parameter is actually used. If you need different behavior, you'll have to force the values with seq as needed.

  • I don't know if I'd consider top-level command line parsing to always be OK! Specifically, observe what happens with this alternate main when the user provides bad input.

    main = do
      putStrLn "Arbitrary program initialization"
      verbose "a verbose message"
      say "a regular message"
      putStrLn "Clean shutdown"
    
    > ./commands -x
    Arbitrary program initialization
    Invalid option `-x'
    
    Usage: ...
    

    Now in this case you can force one (or all!) of the pure values so that the parser is known to have run by a well-defined point in time.

    main = do
      () <- return $ cQuiet `seq` cVerbose `seq` cForce `seq` ()
      -- ...
    
    > ./commands -x
    Invalid option `-x'
    ...
    

    But what happens if you have something like—

    forkIO (withArgs newArgs action)
    

    The only sensible thing to do is to add {-# NOINLINE cQuiet #-} and friends, so some of the precautions in System.IO.Unsafe do apply to you. But even with that case patched over, note that you have given up the ability to run sub-computations with alternate values. A ReaderT solution using local, for example, doesn't have that drawback.
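As a rough illustration of the point about local, here is a minimal sketch using Reader from the transformers package (Control.Monad.Trans.Reader). It uses the pure Reader rather than ReaderT over IO to stay small, and it collects messages in a list instead of printing them. The Options record and the names say and job are hypothetical.

```haskell
import Control.Monad.Trans.Reader (Reader, asks, local, runReader)

-- Hypothetical options record standing in for the parsed command line.
data Options = Options
  { optQuiet   :: Bool
  , optVerbose :: Bool
  }

-- Collect the message that would be shown, instead of printing it.
say :: String -> Reader Options [String]
say msg = do
  quiet <- asks optQuiet
  pure [msg | not quiet]

-- Run one sub-computation with quiet forced on, then carry on with the
-- original options.
job :: Reader Options [String]
job = do
  before <- say "outer message"
  inner  <- local (\o -> o { optQuiet = True }) (say "inner message")
  after  <- say "outer again"
  pure (before ++ inner ++ after)
```

With this, runReader job (Options False False) yields only the two outer messages: the override made by local is visible inside the sub-computation and nowhere else.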

    This seems an even larger drawback to me in the case of reading config files, as long running applications usually are reconfigurable without requiring a stop/start cycle. A top-level pure value precludes reconfiguration.
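One way to keep a global while still allowing reconfiguration is the common top-level IORef idiom, sketched below. The Settings record and the names settingsRef, currentSettings and reconfigure are hypothetical. The one unsafePerformIO call only allocates the reference and still needs NOINLINE. Reads now live in IO, which is the trade-off.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical settings record.
data Settings = Settings
  { logLevel :: Int
  , greeting :: String
  } deriving (Eq, Show)

defaultSettings :: Settings
defaultSettings = Settings 1 "hello"

-- The only unsafePerformIO here allocates the reference itself;
-- NOINLINE keeps every use site pointing at the same IORef.
{-# NOINLINE settingsRef #-}
settingsRef :: IORef Settings
settingsRef = unsafePerformIO (newIORef defaultSettings)

-- Read whatever settings are current at the time of the call.
currentSettings :: IO Settings
currentSettings = readIORef settingsRef

-- Swap in new settings, e.g. after re-reading a config file.
reconfigure :: Settings -> IO ()
reconfigure = writeIORef settingsRef
```

If several threads may reconfigure at once, an MVar (or atomicModifyIORef') would be the safer choice. This sketch assumes a single writer.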

    But maybe this is even more clear if you consider the intersection of both your config files and your command line arguments. In many utilities arguments on the command line override values provided in a config file, an impossible behavior given what you have now.
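When both sources are explicit values, the override rule becomes an ordinary pure function. The sketch below assumes each source is parsed into a record of Maybe fields, where Nothing means "not specified". The types Partial and Resolved, the function resolve, and the built-in defaults are all hypothetical.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Hypothetical partial settings: Nothing means "not specified here".
data Partial = Partial
  { pVerbose :: Maybe Bool
  , pOutput  :: Maybe FilePath
  }

-- Hypothetical fully resolved settings.
data Resolved = Resolved
  { rVerbose :: Bool
  , rOutput  :: FilePath
  } deriving (Eq, Show)

-- Command-line values win, the config file fills the gaps, and the
-- built-in defaults (False and "out.txt") come last.
resolve :: Partial -> Partial -> Resolved
resolve cli file = Resolved
  { rVerbose = fromMaybe False     (pVerbose cli <|> pVerbose file)
  , rOutput  = fromMaybe "out.txt" (pOutput cli  <|> pOutput file)
  }
```

For example, resolve (Partial (Just True) Nothing) (Partial (Just False) (Just "a.log")) takes verbose from the command line and the output path from the file.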

    For toys, sure, go hog wild. For anything else, at least make your top-level value an IORef or MVar. There are still some ways to make the non-unsafePerformIO solutions nicer, though. Consider:

    data Config = Config { say     :: String -> IO ()
                         , verbose :: String -> IO ()
                         }
    
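    mkSay :: Bool -> String -> IO ()
    mkSay quiet s | quiet     = return ()
                  | otherwise = putStrLn s

    -- mkVerbose is used below but was not shown; this definition is an
    -- assumption that mirrors mkSay.
    mkVerbose :: Bool -> String -> IO ()
    mkVerbose isVerbose s | isVerbose = putStrLn s
                          | otherwise = return ()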
    
    -- In some action...
      let config = Config (mkSay quietFlag) (mkVerbose verboseFlag)
    
    compute :: Config -> IO Value
    compute config = do
      -- ...
      verbose config "Debugging info"
      -- ...
    

    This also respects the spirit of Haskell function signatures, in that it's now clear (without even needing to consider the open world of IO) that your functions' behavior actually does depend on program configuration.


    Two cases from Hackage that come to mind:

    The package cmdargs makes use of unsafePerformIO, treating command line arguments as constant.

    In the package oeis, the "pure" function getSequenceByID uses unsafePerformIO to return content from a web page on http://oeis.org. It notes in its documentation:

    Note that the result is not in the IO monad, even though the implementation requires looking up information via the Internet. There are no side effects to speak of, and from a practical point of view the function is referentially transparent (OEIS A-numbers could change in theory, but it's extremely unlikely).


    -XImplicitParams is useful in this situation.

    {-# LANGUAGE ImplicitParams #-}

    import Data.Bool (bool)

    data CommandLine = CommandLine
      Bool               --quiet
      Bool               --verbose
      Bool               --force
    
    say' :: Bool -> String -> IO ()
    say' = bool (const $ return ()) putStrLn
    
    say, verbose :: (?cmdLine :: CommandLine) => String -> IO ()
    say = case ?cmdLine of CommandLine cQuiet _ _ -> say' $ not cQuiet
    verbose = case ?cmdLine of CommandLine _ cVerbose _ -> say' cVerbose
    

    Anything that is implicitly typed and uses say or verbose will have the ?cmdLine :: CommandLine implicit parameter added to its type.

    :type (\s -> say (show s))
    (\s -> say (show s))
      :: (Show a, ?cmdLine::CommandLine) => a -> IO ()
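The snippet above does not show where ?cmdLine gets its value. A minimal sketch of one way to bind it, using a let binding near the top of the program, follows. The helper shouldSay and the action run are hypothetical, and the flag values are hard-coded purely for illustration. A real program would obtain them from the parser.

```haskell
{-# LANGUAGE ImplicitParams #-}

data CommandLine = CommandLine
  Bool               -- quiet
  Bool               -- verbose
  Bool               -- force

-- Hypothetical pure helper: should a regular message be shown?
shouldSay :: (?cmdLine :: CommandLine) => Bool
shouldSay = case ?cmdLine of CommandLine cQuiet _ _ -> not cQuiet

say :: (?cmdLine :: CommandLine) => String -> IO ()
say msg = if shouldSay then putStrLn msg else pure ()

-- Bind the implicit parameter once; everything called below this point
-- that mentions ?cmdLine sees this value.
run :: IO ()
run = do
  let ?cmdLine = CommandLine False True False  -- illustrative values only
  say "a regular message"
```

Because the binding is an ordinary let, a sub-computation can rebind ?cmdLine locally, much as local does for ReaderT.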
    