How to check whether a process was restarted by its supervisor?

To be more precise:

I have a supervisor for a dynamic number of children. I want it to use a different init function when a given child is added and started for the first time than for all the restarts that will happen later. Alternatively, I could use the same function, if it is possible for the process to discover that it was restarted.


Technically, there are side effects that could be used to find out whether a process was restarted by its supervisor or whether it was the first start. For example, you could check the pid of the process and compare it with the pid of the supervisor. However, this is ugly, error-prone, and at odds with OTP principles. Indeed, the supervisor itself might have been restarted, or the application, or the node itself. It's futile to try to find out.

Instead, following OTP principles, you must make sure that supervised processes perform the same task whether they were started for the first time or restarted. This can be achieved with a proper supervision tree that handles dependencies between processes.

The typical reason for wanting to find out whether a process was started or restarted is that, on first start, it must do something which doesn't need to be redone on restart. Ultimately, you need to make sure that what is done on start is undone on termination, so your processes can do the same thing on start in all situations.

For example, if what has to be done is to start another process (let's call it B), then just link the child and process B on start: should the child terminate, B will terminate as well (and vice versa). You will have to configure the supervisor of B processes not to restart its children (i.e. make them temporary).
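Here is a minimal sketch of that linking idea. The module name linked_child and the helper loop are made up for illustration, and B is spawned directly with spawn_link/1 to keep the example self-contained; in a real tree B would more likely be started under its own supervisor as a temporary child and then linked.

```erlang
-module(linked_child).
-behaviour(gen_server).

-export([start_link/0, helper/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Returns the pid of the helper process B owned by this child.
helper(Child) ->
    gen_server:call(Child, helper).

init([]) ->
    %% The same work is done on first start and on every restart:
    %% create B and link to it, so B goes away when this process dies.
    B = spawn_link(fun helper_loop/0),
    {ok, B}.

handle_call(helper, _From, B) ->
    {reply, B, B}.

handle_cast(_Msg, B) ->
    {noreply, B}.

handle_info(_Msg, B) ->
    {noreply, B}.

%% Hypothetical stand-in for whatever B really does.
helper_loop() ->
    receive
        stop -> ok
    end.
```

Since the child does not trap exits, an abnormal exit of B takes the child down too, and the supervisor then restarts the child, which creates a fresh B. Note that an exit signal with reason normal does not propagate this way.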

(update based on first comment below)

Adding factories only pushes the problem further but does not solve it entirely. Let's say that you have a factory that is responsible for creating the children. This process could save the states of children and restore them on restart. To achieve this, you would:

  • make sure the supervisor does not restart children (they should be specified as temporary);
  • create a monitor after the child is created with erlang:monitor/2. Whenever a child terminates, the factory will receive a message. It can then restart the child and provide it with the state;
  • get the children to periodically send required information to the factory.

Please note that, for memory efficiency, you should restore the state of the process from the factory in a separate message. Indeed, if you put the saved state in the child specification, the supervisor will keep a copy of it.
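The steps above could be sketched roughly as follows. Everything named here (state_factory, new_child/2, save/3, the {restore, State} message and the toy child loop) is an assumption for illustration only. To stay self-contained the factory spawns the children itself instead of going through a supervisor with temporary children.

```erlang
-module(state_factory).
-behaviour(gen_server).

-export([start_link/0, new_child/2, child_pid/2, save/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Factory state: a map Id => {Pid, MonitorRef, LastSavedState}.

start_link() -> gen_server:start_link(?MODULE, [], []).

new_child(Factory, Id) -> gen_server:call(Factory, {new_child, Id}).
child_pid(Factory, Id) -> gen_server:call(Factory, {child_pid, Id}).

%% Called by children to hand their latest state to the factory.
save(Factory, Id, ChildState) ->
    gen_server:cast(Factory, {save, Id, ChildState}).

init([]) -> {ok, #{}}.

handle_call({new_child, Id}, _From, Children) ->
    {Pid, Ref} = start_child(Id, undefined),
    {reply, {ok, Pid}, Children#{Id => {Pid, Ref, undefined}}};
handle_call({child_pid, Id}, _From, Children) ->
    #{Id := {Pid, _Ref, _Saved}} = Children,
    {reply, Pid, Children}.

handle_cast({save, Id, ChildState}, Children) ->
    case Children of
        #{Id := {Pid, Ref, _Old}} ->
            {noreply, Children#{Id := {Pid, Ref, ChildState}}};
        _ ->
            {noreply, Children}
    end.

%% A monitored child terminated: start a replacement and give it
%% the last state we have for it.
handle_info({'DOWN', Ref, process, Pid, _Reason}, Children) ->
    Found = [{I, S} || {I, {P, R, S}} <- maps:to_list(Children),
                       P =:= Pid, R =:= Ref],
    case Found of
        [{Id, Saved}] ->
            {NewPid, NewRef} = start_child(Id, Saved),
            {noreply, Children#{Id := {NewPid, NewRef, Saved}}};
        [] ->
            {noreply, Children}
    end;
handle_info(_Other, Children) ->
    {noreply, Children}.

start_child(Id, Saved) ->
    Factory = self(),
    Pid = spawn(fun() -> child_init(Factory, Id) end),
    Ref = erlang:monitor(process, Pid),
    %% The state travels in a separate message, not in a child spec.
    Pid ! {restore, Saved},
    {Pid, Ref}.

child_init(Factory, Id) ->
    receive
        {restore, Saved} -> child_loop(Factory, Id, Saved)
    end.

%% Toy child: keeps one term and reports every change to the factory.
child_loop(Factory, Id, State) ->
    receive
        {set, NewState} ->
            save(Factory, Id, NewState),
            child_loop(Factory, Id, NewState);
        {get, From} ->
            From ! {state, Id, State},
            child_loop(Factory, Id, State)
    end.
```

This sketch restarts a child on any exit reason and has no restart limit, so a child that crashes immediately would be restarted forever; a real factory would need its own equivalent of the supervisor's restart intensity.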

You also probably want to make sure that children die if the factory dies. To achieve this, you should link the two processes. As a result, instead of using erlang:monitor/2, you could configure your factory to receive EXIT messages by making the process trap exits (with erlang:process_flag/2).
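A minimal sketch of the link plus trap_exit variant, again with made-up names (linking_factory, new_child/1, exits/1). It only records the exits it sees; restarting the child and restoring its state would go where the comment indicates.

```erlang
-module(linking_factory).
-behaviour(gen_server).

-export([start_link/0, new_child/1, exits/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() -> gen_server:start_link(?MODULE, [], []).

new_child(Factory) -> gen_server:call(Factory, new_child).

%% Returns the list of {Pid, Reason} exits seen so far, newest first.
exits(Factory) -> gen_server:call(Factory, exits).

init([]) ->
    %% Turn exit signals from linked processes into messages.
    process_flag(trap_exit, true),
    {ok, []}.

handle_call(new_child, _From, Exits) ->
    %% The link works both ways: if the factory dies, this child
    %% receives an exit signal and (not trapping exits) dies as well.
    Pid = spawn_link(fun() -> receive stop -> ok end end),
    {reply, Pid, Exits};
handle_call(exits, _From, Exits) ->
    {reply, Exits, Exits}.

handle_cast(_Msg, Exits) ->
    {noreply, Exits}.

handle_info({'EXIT', Pid, Reason}, Exits) ->
    %% A linked child terminated; a real factory would restart it here.
    {noreply, [{Pid, Reason} | Exits]};
handle_info(_Other, Exits) ->
    {noreply, Exits}.
```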

Yet, this does not solve the problem, as the factory itself could terminate abnormally. It would be restarted by its supervisor and all states would be lost, without proper cleanup. So you need to make sure that what needs to be done on children start is undone when the factory terminates.


I think you should start by clarifying why a process will die, and then what the relevant strategy is to restart it properly (you don't give any information about this). Then, if it is normal behavior in your application for those processes to terminate (for example on session timeout), my opinion is that you should try to separate the task that can die from the one that is in charge of storing the state. It may also be relevant to keep information in ETS, DETS or Mnesia. But, as Paul said, try to stick to the OTP principles.
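As a rough illustration of keeping the state apart from the process that can die, here is a tiny ETS wrapper. The module name session_store and its three functions are invented for this sketch. The table belongs to whichever process calls new/0 and disappears with it, so that caller should be something long-lived rather than the worker itself.

```erlang
-module(session_store).

-export([new/0, save/2, restore/1]).

%% Create the table. Call this once, from a long-lived owner process.
new() ->
    ets:new(?MODULE, [named_table, public, set]).

%% Store (or overwrite) the state kept under Id.
save(Id, State) ->
    ets:insert(?MODULE, {Id, State}).

%% Fetch the state saved under Id, if any.
restore(Id) ->
    case ets:lookup(?MODULE, Id) of
        [{Id, State}] -> {ok, State};
        [] -> not_found
    end.
```

A worker can then call session_store:restore(Id) in its init function on every start, first or not, and simply begin from scratch when it gets not_found.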


Supervised processes are OTP processes. Your supervisor is therefore starting OTP processes such as gen_server. To see when processes are restarted, write your own name server, and then use gen_server:start_link({via, YourModule, Identifier}... to track restarts.
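A sketch of what such a name server might look like. The module name restart_registry and the start_count/1 function are assumptions; the four callbacks register_name/2, unregister_name/1, whereis_name/1 and send/2 are the ones a via module has to export. It only handles pids on the local node.

```erlang
-module(restart_registry).
-behaviour(gen_server).

-export([start_link/0, start_count/1]).
%% Callbacks required from a {via, Module, Name} registry.
-export([register_name/2, unregister_name/1, whereis_name/1, send/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% How many times a process has been registered under Name.
start_count(Name) -> gen_server:call(?MODULE, {start_count, Name}).

register_name(Name, Pid) -> gen_server:call(?MODULE, {register, Name, Pid}).
unregister_name(Name) -> gen_server:call(?MODULE, {unregister, Name}).
whereis_name(Name) -> gen_server:call(?MODULE, {whereis, Name}).

send(Name, Msg) ->
    case whereis_name(Name) of
        undefined -> exit({badarg, {Name, Msg}});
        Pid -> Pid ! Msg, Pid
    end.

%% Registry state: a map Name => {Pid | undefined, StartCount}.
init([]) -> {ok, #{}}.

handle_call({register, Name, Pid}, _From, Names) ->
    {Old, Count} = maps:get(Name, Names, {undefined, 0}),
    case alive(Old) of
        true -> {reply, no, Names};
        %% Name is free (never used, or its owner is dead): count a start.
        false -> {reply, yes, Names#{Name => {Pid, Count + 1}}}
    end;
handle_call({unregister, Name}, _From, Names) ->
    {_Old, Count} = maps:get(Name, Names, {undefined, 0}),
    {reply, ok, Names#{Name => {undefined, Count}}};
handle_call({whereis, Name}, _From, Names) ->
    {Pid, _Count} = maps:get(Name, Names, {undefined, 0}),
    case alive(Pid) of
        true -> {reply, Pid, Names};
        false -> {reply, undefined, Names}
    end;
handle_call({start_count, Name}, _From, Names) ->
    {_Pid, Count} = maps:get(Name, Names, {undefined, 0}),
    {reply, Count, Names}.

handle_cast(_Msg, Names) -> {noreply, Names}.

alive(undefined) -> false;
alive(Pid) -> is_process_alive(Pid).
```

A worker started with gen_server:start_link({via, restart_registry, Id}, ...) could call restart_registry:start_count(Id) in its init/1; a value above 1 means the name has been registered before. The counts live in the registry process, so they are lost if the registry itself is restarted.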
