What is the point of having temporary children for Erlang supervisors?

2018-06-13 09:41:33

Supervisors are there to restart processes that fail. Temporary processes are ones that should never be restarted. So, why bother having this type of child for a supervisor? Is it mainly so they can be terminated as part of a one_for_all strategy, or when the application is terminated?

There are several problems with the view that supervisors only exist to restart jobs. Here are a few of them:

The meaning of "temporary" is vague.

Restarting a job that failed partway through for an unknown (possibly resource-bound or otherwise external) reason is one thing; letting a process expire once it has finished its job is another.

Supervisors provide a consistent interface or doorway not just to restarts, but also to starts, logging, tracing, crash cleanup, servicing of state at a higher (as in "less acceptable to fail") level, and various other OTP conveniences that are built into tools like SASL.

Ultimately, all processes are temporary. To model this in an Erlang system you have to make it OK for a supervisor to spawn certain jobs and let them expire. This is why you can add jobs to a supervisor, why there are various types of supervisors, and why the frequent answer to "how do I find process X" is "ask its supervisor".

You can certainly just spawn some random one-off process in the middle of your code to complete some one-off task (and sometimes this is the right thing to do), but now you've got to write crash handling code within your process just in case something fails (if you care about the job, that is -- and if you don't, why are you doing it?). If you do this very often you will wind up writing an informally specified, buggy implementation of a lot of the functionality that is already a part of what OTP provides in the form of supervisors -- this is the Erlang version of Greenspun's Tenth Rule.
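To make that concrete, here is a minimal sketch of the kind of hand-rolled crash handling a one-off job tends to grow. The module name `adhoc_job` and the function `run/2` are hypothetical, invented for illustration; only `spawn_monitor/1`, `exit/2` and `erlang:demonitor/2` are standard calls.

```erlang
-module(adhoc_job).
-export([run/2]).

%% Run Fun in a one-off process and wait up to Timeout milliseconds for it.
%% Everything below is bookkeeping that a supervisor would otherwise own.
run(Fun, Timeout) when is_function(Fun, 0) ->
    {Pid, Ref} = spawn_monitor(Fun),
    receive
        %% The job finished on its own.
        {'DOWN', Ref, process, Pid, normal} ->
            ok;
        %% The job crashed; all we can do here is report the reason.
        {'DOWN', Ref, process, Pid, Reason} ->
            {error, Reason}
    after Timeout ->
        %% The job is stuck: kill it and discard any late 'DOWN' message.
        exit(Pid, kill),
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.
```

Even this small version has to decide what counts as a failure, how long to wait, and how to clean up. Logging, tracing and shutdown ordering are still missing, which is roughly the point of the paragraph above.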

(The Tenth Rule thing happens all the time because, while the Erlang language is very small, simple, and not the subject of many misperceptions across hackerdom, OTP and the runtime environment are huge, complicated, and the part of Erlang/OTP that is the subject of a bajillion outsider misconceptions.)

Most of the time modules that are written to perform some one-off job are written with a start/0,N (or do or whatever) sort of function that actually calls a named supervisor, adds a temporary worker to its list, and gets it spun up under supervision even though it is temporary. This is not the right thing to do in every case but it is a pretty common thing to see -- and I tend to default to something like this until I have a reason not to.
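A rough sketch of that pattern, assuming a `simple_one_for_one` supervisor whose children are all `temporary`. The module name `job_sup` and the functions `start_job/1`, `jobs/0` and `run_job/1` are made up for this example; the `supervisor` and `proc_lib` calls are the standard OTP ones.

```erlang
-module(job_sup).
-behaviour(supervisor).

-export([start_link/0, start_job/1, jobs/0]).
-export([init/1, run_job/1]).

%% Start the supervisor under a registered name so callers can find it.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Public entry point: run Fun as a supervised, temporary worker.
start_job(Fun) when is_function(Fun, 0) ->
    supervisor:start_child(?MODULE, [Fun]).

%% "Ask its supervisor": list the pids of the jobs currently alive.
jobs() ->
    [Pid || {_Id, Pid, _Type, _Mods} <- supervisor:which_children(?MODULE),
            is_pid(Pid)].

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 1,
                 period => 5},
    ChildSpec = #{id => job,
                  start => {?MODULE, run_job, []},
                  restart => temporary,      % never restarted, whatever the exit reason
                  shutdown => brutal_kill,
                  type => worker,
                  modules => [?MODULE]},
    {ok, {SupFlags, [ChildSpec]}}.

%% Called in the supervisor process; the argument list passed to
%% start_child/2 is appended to the (empty) argument list in the child spec.
run_job(Fun) ->
    {ok, proc_lib:spawn_link(Fun)}.
```

Calling `job_sup:start_job(fun() -> do_work() end)` (where `do_work/0` stands in for the real task) returns `{ok, Pid}`. If the job crashes it is not restarted, but the failure should still show up in the logs, and every job still running is terminated when the supervisor shuts down.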

To think of it another way... In the real world the term "supervisor" means someone who supervises a job, task or worker. It's not quite as broad as "manager", but it's a lot broader than just "hiring new workers whenever one worker quits" -- that's too narrow a definition even for an HR department.

The role of a supervisor is not only to start/restart processes, but also to kill them.

It is mandatory to take care of killing processes because, in a long-lived application, there is a risk of accumulating "orphan" processes that have become useless. So, as a rule of thumb, each process should be linked to some other process unless it is guaranteed to die by itself within a limited time.

This can be done in the module itself, and it makes sense when, for example, a process A spawns a process B and they should always die under the same conditions (don't forget that a process dying with the reason 'normal' will not propagate its death to the linked processes). But this has two main drawbacks:
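The remark about the reason 'normal' can be checked with a small experiment. This is only a sketch: the module `link_demo` and the function `linked_exit/1` are hypothetical names, and the 200 ms wait is an arbitrary choice.

```erlang
-module(link_demo).
-export([linked_exit/1]).

%% Spawn a watcher that links to a short-lived child exiting with Reason,
%% then report whether the watcher was taken down by the link.
linked_exit(Reason) ->
    {Watcher, Ref} = spawn_monitor(fun() ->
        spawn_link(fun() -> exit(Reason) end),
        receive stop -> ok end            % otherwise idle forever
    end),
    receive
        {'DOWN', Ref, process, Watcher, Why} ->
            {watcher_died, Why}
    after 200 ->
        %% The watcher is still alive: clean it up ourselves.
        exit(Watcher, kill),
        erlang:demonitor(Ref, [flush]),
        watcher_survived
    end.
```

Here `link_demo:linked_exit(boom)` returns `{watcher_died, boom}`, while `link_demo:linked_exit(normal)` returns `watcher_survived`: the link did nothing, and without the explicit cleanup the watcher would have stayed around as exactly the kind of orphan described above.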

As the situation is often much more complex, it will add some (a lot of?) process management code to your module, mixed with the code for its main purpose, making the application harder to read.

It will give you a very partial (local) view of process management, which is really a cross-cutting, architectural concern. Again, this makes the process management hard to read.

The usage of a supervisor, in fact a supervision tree, allows you to separate the process management from your application code. It provides a comprehensive, centralized and standardized view of it, integrated in the OTP environment.

Temporary processes exist (in some cases they are the majority); their life cycle must be managed; this is the job of the supervisors.
