Automatically restarting Erlang applications

I recently ran into a bug where an entire Erlang application died, yielding a log message that looked like this:

=INFO REPORT==== 11-Jun-2010::11:07:25 ===
     application: myapp
     exited: shutdown
     type: temporary

I have no idea what triggered this shutdown, but the real problem I have is that it didn't restart itself. Instead, the now-empty Erlang VM just sat there doing nothing.

Now, from the research I've done, it looks like there are other "start types" you can give an application: 'transient' and 'permanent'.

If I start a Supervisor within an application, I can tell it to make a particular process transient or permanent, and it will automatically restart it for me. However, according to the documentation, if I make an application transient or permanent, it doesn't restart it when it dies, but rather it kills all the other applications as well.

What I really want to do is somehow tell the Erlang VM that a particular application should always be running, and if it goes down, restart it. Is this possible to do?

(I'm not talking about implementing a supervisor on top of my application, because then it's a catch 22: what if my supervisor process crashes? I'm looking for some sort of API or setting that I can use to have Erlang monitor and restart my application for me.)

Thanks!


You should be able to fix this in the top-level supervisor: set the restart strategy to allow one million restarts every second, and the application should never crash. Something like:

init(_Args) ->
    {ok, {{one_for_one, 1000000, 1},
          [{ch3, {ch3, start_link, []},
            permanent, brutal_kill, worker, [ch3]}]}}.

(Example adapted from the OTP Design Principles User Guide.)


You can use heart to restart the entire VM if it goes down, then use a permanent application type to make sure that the VM exits when your application exits.

Ultimately you need something above your application that you need to trust, whether it is a supervisor process, the erlang VM, or some shell script you wrote - it will always be a problem if that happens to fail also.


Use Monit, then setup your application to terminate by using a supervisor for the whole application with a reasonable restart frequency. If the application terminates, the VM terminates, and monit restarts everything.

I could never get Heart to be reliable enough, as it only restarts the VM once, and it doesn't deal well with a kill -9 of the erlang VM.

链接地址: http://www.djcxy.com/p/38228.html

上一篇: Erlang主管:如何检查所有工作人员是否已回复

下一篇: 自动重启Erlang应用程序