diff --git a/lib/elixir/pages/getting-started/debugging.md b/lib/elixir/pages/getting-started/debugging.md index b9fe6ea2475..a370f70f1d3 100644 --- a/lib/elixir/pages/getting-started/debugging.md +++ b/lib/elixir/pages/getting-started/debugging.md @@ -77,7 +77,7 @@ dbg(Map.put(feature, :in_version, "1.14.0")) The code above prints this: -```shell +```text [my_file.exs:2: (file)] feature #=> %{inspiration: "Rust", name: :dbg} [my_file.exs:3: (file)] @@ -97,7 +97,7 @@ __ENV__.file This code prints: -```shell +```text [dbg_pipes.exs:5: (file)] __ENV__.file #=> "/home/myuser/dbg_pipes.exs" |> String.split("/", trim: true) #=> ["home", "myuser", "dbg_pipes.exs"] diff --git a/lib/elixir/pages/mix-and-otp/agents.md b/lib/elixir/pages/mix-and-otp/agents.md index 4df693b71ae..86e4252c8c8 100644 --- a/lib/elixir/pages/mix-and-otp/agents.md +++ b/lib/elixir/pages/mix-and-otp/agents.md @@ -3,7 +3,7 @@ SPDX-FileCopyrightText: 2021 The Elixir Team --> -# Simple state management with agents +# Simple state with agents In this chapter, we will learn how to keep and share state between multiple entities. If you have previous programming experience, you may think of globally shared variables, but the model we will learn here is quite different. The next chapters will generalize the concepts introduced here. @@ -11,19 +11,14 @@ If you have skipped the *Getting Started* guide or read it long ago, be sure to ## The trouble with (mutable) state -Elixir is an immutable language where nothing is shared by default. If we want to share information, which can be read and modified from multiple places, we have two main options in Elixir: +Elixir is an immutable language where nothing is shared by default. If we want to share information, this is typically done by sending messages between processes. - * Using processes and message passing - * [ETS (Erlang Term Storage)](`:ets`) - -We covered processes in the *Getting Started* guide. 
ETS (Erlang Term Storage) is a new topic that we will explore in later chapters. When it comes to processes though, we rarely hand-roll our own, instead we use the abstractions available in Elixir and OTP: +When it comes to processes though, we rarely hand-roll our own, instead we use the abstractions available in Elixir and OTP: * `Agent` — Simple wrappers around state. * `GenServer` — "Generic servers" (processes) that encapsulate state, provide sync and async calls, support code reloading, and more. * `Task` — Asynchronous units of computation that allow spawning a process and potentially retrieving its result at a later time. -We will explore these abstractions as we move forward. Keep in mind that they are all implemented on top of processes using the basic features provided by the VM, like `send/2`, `receive/1`, `spawn/1` and `Process.link/1`. - Here, we will use agents, and create a module named `KV.Bucket`, responsible for storing our key-value entries in a way that allows them to be read and modified by other processes. ## Agents 101 @@ -47,7 +42,7 @@ iex> Agent.stop(agent) :ok ``` -We started an agent with an initial state of an empty list. We updated the agent's state, adding our new item to the head of the list. The second argument of `Agent.update/3` is a function that takes the agent's current state as input and returns its desired new state. Finally, we retrieved the whole list. The second argument of `Agent.get/3` is a function that takes the state as input and returns the value that `Agent.get/3` itself will return. Once we are done with the agent, we can call `Agent.stop/3` to terminate the agent process. +We started an agent with an initial state of an empty list. The `start_link/1` function returned the `:ok` tuple with a process identifier (PID) of the agent. We will use this PID for all further interactions. We then updated the agent's state, adding our new item to the head of the list. 
The second argument of `Agent.update/3` is a function that takes the agent's current state as input and returns its desired new state. Finally, we retrieved the whole list. The second argument of `Agent.get/3` is a function that takes the state as input and returns the value that `Agent.get/3` itself will return. Once we are done with the agent, we can call `Agent.stop/3` to terminate the agent process. The `Agent.update/3` function accepts as a second argument any function that receives one argument and returns a value: @@ -93,7 +88,9 @@ Also note the `async: true` option passed to `ExUnit.Case`. This option makes th Async or not, our new test should obviously fail, as none of the functionality is implemented in the module being tested: ```text -** (UndefinedFunctionError) function KV.Bucket.start_link/1 is undefined (module KV.Bucket is not available) +1) test stores values by key (KV.BucketTest) + test/kv/bucket_test.exs:4 + ** (UndefinedFunctionError) function KV.Bucket.start_link/1 is undefined (module KV.Bucket is not available) ``` In order to fix the failing test, let's create a file at `lib/kv/bucket.ex` with the contents below. Feel free to give a try at implementing the `KV.Bucket` module yourself using agents before peeking at the implementation below. @@ -104,9 +101,11 @@ defmodule KV.Bucket do @doc """ Starts a new bucket. + + All options are forwarded to `Agent.start_link/2`. """ - def start_link(_opts) do - Agent.start_link(fn -> %{} end) + def start_link(opts) do + Agent.start_link(fn -> %{} end, opts) end @doc """ @@ -125,49 +124,43 @@ defmodule KV.Bucket do end ``` -The first step in our implementation is to call `use Agent`. Most of the functionality we will learn, such as `GenServer` and `Supervisor`, follow this pattern. For all of them, calling `use` generates a `child_spec/1` function with default configuration, which will be handy when we start supervising processes in chapter 4. +The first step in our implementation is to call `use Agent`. 
This is a pattern we will see throughout the guides and understand in depth in the next chapter. -Then we define a `start_link/1` function, which will effectively start the agent. It is a convention to define a `start_link/1` function that always accepts a list of options. We don't plan on using any options right now, but we might later on. We then proceed to call `Agent.start_link/1`, which receives an anonymous function that returns the Agent's initial state. +Then we define a `start_link/1` function, which will effectively start the agent. It is a convention to define a `start_link/1` function that always accepts a list of options. We then call `Agent.start_link/2` passing an anonymous function that returns the Agent's initial state and the same list of options we received. We are keeping a map inside the agent to store our keys and values. Getting and putting values on the map is done with the Agent API and the capture operator `&`, introduced in [the Getting Started guide](../getting-started/anonymous-functions.md#the-capture-operator). The agent passes its state to the anonymous function via the `&1` argument when `Agent.get/2` and `Agent.update/2` are called. Now that the `KV.Bucket` module has been defined, our test should pass! You can try it yourself by running: `mix test`. -## Test setup with ExUnit callbacks +## Naming processes -Before moving on and adding more features to `KV.Bucket`, let's talk about ExUnit callbacks. As you may expect, all `KV.Bucket` tests will require a bucket agent to be up and running. Luckily, ExUnit supports callbacks that allow us to skip such repetitive tasks. +When starting `KV.Bucket`, we pass a list of options which we forward to `Agent.start_link/2`. One of the options accepted by `Agent.start_link/2` is a name option which allows us to name a process, so we can interact with it using its name instead of its PID. -Let's rewrite the test case to use callbacks: +Let's write a test as an example. 
Back on `KV.BucketTest`, add this: ```elixir -defmodule KV.BucketTest do - use ExUnit.Case, async: true + test "stores values by key on a named process" do + {:ok, _} = KV.Bucket.start_link(name: :shopping_list) + assert KV.Bucket.get(:shopping_list, "milk") == nil - setup do - {:ok, bucket} = KV.Bucket.start_link([]) - %{bucket: bucket} + KV.Bucket.put(:shopping_list, "milk", 3) + assert KV.Bucket.get(:shopping_list, "milk") == 3 end - - test "stores values by key", %{bucket: bucket} do - assert KV.Bucket.get(bucket, "milk") == nil - - KV.Bucket.put(bucket, "milk", 3) - assert KV.Bucket.get(bucket, "milk") == 3 - end -end ``` -We have first defined a setup callback with the help of the `setup/1` macro. The `setup/1` macro defines a callback that is run before every test, in the same process as the test itself. - -Note that we need a mechanism to pass the `bucket` PID from the callback to the test. We do so by using the *test context*. When we return `%{bucket: bucket}` from the callback, ExUnit will merge this map into the test context. Since the test context is a map itself, we can pattern match the bucket out of it, providing access to the bucket inside the test: +However, keep in mind that names are shared in the current node. If two tests attempt to create two processes named `:shopping_list` at the same time, one would succeed and the other would fail. 
For this reason, it is a common practice in Elixir to name processes started during tests after the test itself, like this:

```elixir
-test "stores values by key", %{bucket: bucket} do
-  # `bucket` is now the bucket from the setup block
-end
+  test "stores values by key on a named process", config do
+    {:ok, _} = KV.Bucket.start_link(name: config.test)
+    assert KV.Bucket.get(config.test, "milk") == nil
+
+    KV.Bucket.put(config.test, "milk", 3)
+    assert KV.Bucket.get(config.test, "milk") == 3
+  end
```

-You can read more about ExUnit cases in the [`ExUnit.Case` module documentation](`ExUnit.Case`) and more about callbacks in `ExUnit.Callbacks`.
+The `config` argument, passed after the test name, is the *test context* and it includes configuration and metadata about the current test, which is useful in scenarios like these.

## Other agent actions

@@ -214,4 +207,4 @@ end

When a long action is performed on the server, all other requests to that particular server will wait until the action is done, which may cause some clients to timeout.

-In the next chapter, we will explore GenServers, where the segregation between clients and servers is made more apparent.
+Some APIs, such as GenServers, make a clearer distinction between client and server, and we will explore them in future chapters. Next, let's talk about naming things, applications, and supervisors.
diff --git a/lib/elixir/pages/mix-and-otp/config-and-distribution.md b/lib/elixir/pages/mix-and-otp/config-and-distribution.md
new file mode 100644
index 00000000000..6ed27bfb771
--- /dev/null
+++ b/lib/elixir/pages/mix-and-otp/config-and-distribution.md
@@ -0,0 +1,305 @@
+
+
+# Configuration and distribution
+
+So far we have hardcoded our applications to run a web server on port 4040. This has been somewhat problematic since we can't, for example, run our development server and tests at the same time. 
In this chapter, we will learn how to use the application environment for configuration, paving the way for us to enable distribution by running multiple development servers on the same machine (on different ports).
+
+Over this chapter and the next ones, we will make our key-value store distributed and finally package the software for production.
+
+Let's do this.
+
+## Application environment
+
+In the chapter [Registries, applications, and supervisors](supervisor-and-application.md), we learned that our project is backed by an application, which bundles our modules and specifies how our supervision tree starts and shuts down. Each application can also have its own configuration, which in Erlang/OTP (and therefore Elixir) is called the "application environment".
+
+We can use the application environment to configure our own application, as well as others. Let's see the application environment in practice. Create a file `config/runtime.exs` with the following:
+
+```elixir
+import Config
+
+port =
+  cond do
+    port_env = System.get_env("PORT") ->
+      String.to_integer(port_env)
+
+    config_env() == :test ->
+      4040
+
+    true ->
+      4050
+  end
+
+config :kv, :port, port
+```
+
+The above attempts to read the "PORT" environment variable and use it as the port if defined. Otherwise, we default to port `4040` for tests and port `4050` for other environments, eliminating the conflict between environments we have seen in the past. Then we store its value under the `:port` key of our `:kv` application.
+
+Now we just need to read this configuration. 
Open up `lib/kv.ex` and change the `start/2` function to the following:
+
+```elixir
+  def start(_type, _args) do
+    port = Application.fetch_env!(:kv, :port)
+
+    children = [
+      {Registry, name: KV, keys: :unique},
+      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one},
+      {Task.Supervisor, name: KV.ServerSupervisor},
+      Supervisor.child_spec({Task, fn -> KV.Server.accept(port) end}, restart: :permanent)
+    ]
+
+    Supervisor.start_link(children, strategy: :one_for_one)
+  end
+```
+
+Run `iex -S mix` and you will see the following message printed:
+
+```text
+[info] Accepting connections on port 4050
+```
+
+Run the tests, without killing the development server, and you will see the test server running on port 4040.
+
+Our change was straightforward. We used `Application.fetch_env!/2` to read the entry for `:port` in `:kv`'s environment. We explicitly used `fetch_env!/2` (instead of `get_env/2` or `fetch_env/2`) because it will raise if the port is not configured, preventing the app from booting.
+
+## Compile vs runtime configuration
+
+Configuration files provide a mechanism for us to configure the environment of any application. Elixir provides two configuration entry points:
+
+  * `config/config.exs` — this file is read at build time, before we compile our application and before we even load our dependencies. This means we can't access the code in our application nor in our dependencies. However, it means we can control how they are compiled
+
+  * `config/runtime.exs` — this file is read after our application and dependencies are compiled and therefore it can configure how our application works at runtime. If you want to read system environment variables (via `System.get_env/1`) or access external configuration, this is the appropriate place to do so
+
+You can learn more about configuration in the `Config` and `Config.Provider` modules.
+
+Generally speaking, we use `Application.fetch_env!/2` (and friends) to read runtime configuration. 
`Application.compile_env/2` is available for reading compile-time configuration. This allows Elixir to track which modules to recompile when the compilation environment changes. + +Now that we can start multiple servers, let's explore distribution. + +## Our first distributed code + +Elixir ships with facilities to connect nodes and exchange information between them. In fact, we use the same concepts of processes, message passing and receiving messages when working in a distributed environment because Elixir processes are *location transparent*. This means that when sending a message, it doesn't matter if the recipient process is on the same node or on another node, the VM will be able to deliver the message in both cases. + +In order to run distributed code, we need to start the VM with a name. The name can be short (when in the same network) or long (requires the full computer address). Let's start a new IEx session: + +```console +$ iex --sname foo +``` + +You can see now the prompt is slightly different and shows the node name followed by the computer name: + + Interactive Elixir - press Ctrl+C to exit (type h() ENTER for help) + iex(foo@jv)1> + +My computer is named `jv`, so I see `foo@jv` in the example above, but you will get a different result. We will use `foo@computer-name` in the following examples and you should update them accordingly when trying out the code. + +Let's define a module named `Hello` in this shell: + +```elixir +iex> defmodule Hello do +...> def world, do: IO.puts("hello world") +...> end +``` + +If you have another computer on the same network with both Erlang and Elixir installed, you can start another shell on it. If you don't, you can start another IEx session in another terminal. 
In either case, give it the short name of `bar`:
+
+```console
+$ iex --sname bar
+```
+
+Note that inside this new IEx session, we cannot access `Hello.world/0`:
+
+```elixir
+iex> Hello.world
+** (UndefinedFunctionError) function Hello.world/0 is undefined (module Hello is not available)
+    Hello.world()
+```
+
+However, we can spawn a new process on `foo@computer-name` from `bar@computer-name`! Let's give it a try (where `@computer-name` is the one you see locally):
+
+```elixir
+iex> Node.spawn_link(:"foo@computer-name", fn -> Hello.world() end)
+#PID<9014.59.0>
+hello world
+```
+
+Elixir spawned a process on another node and returned its PID. You can see the PID number no longer starts with zero, showing it belongs to another node. The code then executed on the other node where the `Hello.world/0` function exists and invoked that function. Note that the result of "hello world" was printed on the current node `bar` and not on `foo`. In other words, the message to be printed was sent back from `foo` to `bar`. This happens because the process spawned on the other node (`foo`) knows all the output should be sent back to the original node!
+
+We can send and receive messages from the PID returned by `Node.spawn_link/2` as usual. Let's try a quick ping-pong example:
+
+```elixir
+iex> pid = Node.spawn_link(:"foo@computer-name", fn ->
+...>   receive do
+...>     {:ping, client} -> send(client, :pong)
+...>   end
+...> end)
+#PID<9014.59.0>
+iex> send(pid, {:ping, self()})
+{:ping, #PID<0.73.0>}
+iex> flush()
+:pong
+:ok
+```
+
+In other words, we can spawn processes on other nodes, hold onto their PIDs, and then send messages to them as if they were running on the same machine. That's the *location transparency* principle. And because everything we have built so far relies on message passing, we should be able to adjust our key-value store to become a distributed one with little work. 
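Location transparency also applies to named processes: `send/2` accepts a `{name, node}` tuple, so we can message a registered process on any node without knowing its PID. Here is a minimal sketch you can run even on a single, non-distributed node (where `node()` returns the local node); the `:pinger` name is ours, purely for illustration:

```elixir
# Register the current process under a name on this node.
Process.register(self(), :pinger)

# Address it as {name, node}. The same syntax reaches a remote node,
# for example {:pinger, :"foo@computer-name"}, once nodes are connected.
send({:pinger, node()}, {:ping, self()})

# We messaged ourselves, so the ping is now in our own mailbox.
receive do
  {:ping, from} -> send(from, :pong)
end

receive do
  message -> IO.inspect(message) #=> :pong
end
```

The naming registry we introduce next builds on the same idea: it resolves a name to a PID, so messages can be routed regardless of which node the process lives on.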
+ +## Distributed naming registry with `:global` + +First, let's check that our code is not currently distributed. Start a new node like this: + +```console +$ PORT=4100 iex --sname foo -S mix +``` + +And the other like this: + +```console +$ PORT=4101 iex --sname bar -S mix +``` + +Now, within `foo@computer-name`, do this: + +```elixir +iex> :erpc.call(:"bar@computer-name", KV, :create_bucket, ["shopping"]) +{:ok, #PID<22121.164.0>} +``` + +Instead of using `Node.spawn_link/2`, we used [Erlang's builtin RPC module](`:erpc`) to call the function `create_bucket` in the `KV` module passing a one element list with the string "shopping" as the argument list. We could have used `Node.spawn_link/2`, but `:erpc.call/4` conveniently returns the result of the invocation. + +Still in `foo@computer-name`, let's try to access the bucket: + +```elixir +iex> KV.lookup_bucket("shopping") +nil +``` + +It returns `nil`. However, if you run `KV.lookup_bucket("shopping")` in `bar@computer-name`, it will return the proper bucket. In other words, the nodes can communicate with each other, but buckets spawned in one node are not visible to the other. + +This is because we are using [Elixir's Registry](`Registry`) to name our buckets, which is a **local** process registry. In other words, it is designed for processes running on a single node and not for distribution. + +Luckily, Erlang ships with a distributed registry called [`:global`](`:global`), which is directly supported by the `:name` option by passing a `{:global, name}` tuple. All we need to do is update the `via/1` function in `lib/kv.ex` from this: + +```elixir + defp via(name), do: {:via, Registry, {KV, name}} +``` + +to this: + +```elixir + defp via(name), do: {:global, name} +``` + +Do the change above and restart both `foo@computer-name` and `bar@computer-name`. 
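The `{:global, name}` tuple delegates to the `:global` API, which we can also call directly in IEx to see what registration and lookup return. A small sketch, using an illustrative `:demo` name (on a single node it behaves the same, just without cross-node replication):

```elixir
# Register the current process under a cluster-wide name.
# :global.register_name/2 returns :yes on success and :no on conflict.
:yes = :global.register_name(:demo, self())

# Lookups resolve the name to a PID, from any connected node,
# or return :undefined when the name is not registered.
pid = :global.whereis_name(:demo)
IO.inspect(pid == self()) #=> true
```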
Now, back on `foo@computer-name`, let's give it another try:
+
+```elixir
+iex> :erpc.call(:"bar@computer-name", KV, :create_bucket, ["shopping"])
+{:ok, #PID<21821.179.0>}
+iex> KV.lookup_bucket("shopping")
+#PID<21821.179.0>
+```
+
+And there you go! By simply changing which naming registry we used, we now have a distributed key-value store. You can even try using `telnet` to connect to the servers on different ports and validate that changes in one session are visible in the other one. Exciting!
+
+## Node discovery and dependencies
+
+There is one essential ingredient to wrap up our distributed key-value store. In order for the `:global` registry to work, we need to make sure the nodes are connected to each other. When we ran the `:erpc` call passing the node name:
+
+```elixir
+:erpc.call(:"bar@computer-name", KV, :create_bucket, ["shopping"])
+```
+
+Elixir automatically connected the nodes together. This is easy to do in an IEx session when both instances are running on the same machine, but it requires more work in a production environment, where instances are on different machines which may be started at any time and running on different IP addresses.
+
+Luckily for us, this is also a well-solved problem. For example, if you are using [the Phoenix web framework](https://phoenixframework.org) in production, it ships with [the `dns_cluster` package](https://github.com/phoenixframework/dns_cluster), which automatically runs DNS queries to find new nodes and connect them. If you are using Kubernetes or cloud providers, [packages like `libcluster`](https://github.com/bitwalker/libcluster) ship with different strategies to discover and connect nodes.
+
+Installing dependencies in Elixir is simple. 
Most commonly, we use the [Hex Package Manager](https://hex.pm), by listing the dependency inside the `deps` function in our `mix.exs` file:
+
+```elixir
+def deps do
+  [{:dns_cluster, "~> 0.2"}]
+end
+```
+
+This dependency refers to the latest version of `dns_cluster` in the 0.x version series that has been pushed to Hex. This is indicated by the `~>` preceding the version number. For more information on specifying version requirements, see the documentation for the `Version` module.
+
+Typically, stable releases are pushed to Hex. If you want to depend on an external dependency still in development, Mix is able to manage Git dependencies too:
+
+```elixir
+def deps do
+  [{:dns_cluster, git: "https://github.com/phoenixframework/dns_cluster.git"}]
+end
+```
+
+You will notice that when you add a dependency to your project, Mix generates a `mix.lock` file that guarantees *repeatable builds*. The lock file must be checked in to your version control system, to guarantee that everyone who uses the project will use the same dependency versions as you.
+
+Mix provides many tasks for working with dependencies, which can be seen in `mix help`:
+
+```console
+$ mix help
+mix deps              # Lists dependencies and their status
+mix deps.clean        # Deletes the given dependencies' files
+mix deps.compile      # Compiles dependencies
+mix deps.get          # Gets all out of date dependencies
+mix deps.tree         # Prints the dependency tree
+mix deps.unlock       # Unlocks the given dependencies
+mix deps.update       # Updates the given dependencies
+```
+
+The most common tasks are `mix deps.get` and `mix deps.update`. Once fetched, dependencies are automatically compiled for you. You can read more about deps by running `mix help deps`.
+
+To wrap up this chapter, we will build a very simple node discovery mechanism, where the names of the nodes we should connect to are given on boot, using the lessons we have learned so far. 
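Coming back to the `~>` requirement used above, we can check what a given requirement accepts with `Version.match?/2` (the version numbers below are merely examples):

```elixir
# "~> 0.2" accepts anything from 0.2.0 up to (excluding) 1.0.0.
IO.inspect(Version.match?("0.2.5", "~> 0.2"))   #=> true
IO.inspect(Version.match?("0.9.0", "~> 0.2"))   #=> true
IO.inspect(Version.match?("1.0.0", "~> 0.2"))   #=> false

# "~> 0.2.0" is stricter and stays within the 0.2.x series.
IO.inspect(Version.match?("0.2.9", "~> 0.2.0")) #=> true
IO.inspect(Version.match?("0.3.0", "~> 0.2.0")) #=> false
```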
## `Node.connect/1`
+
+We will change our application to support a "NODES" environment variable with the names of all nodes each instance should connect to.
+
+Open up `config/runtime.exs` and add this to the bottom:
+
+```elixir
+nodes =
+  System.get_env("NODES", "")
+  |> String.split(",", trim: true)
+  |> Enum.map(&String.to_atom/1)
+
+config :kv, :nodes, nodes
+```
+
+We fetch the environment variable, split it on "," while discarding all empty strings, and then convert each entry to an atom, as node names are atoms.
+
+Now, in your `start/2` callback, add this to the top of the function:
+
+```elixir
+  def start(_type, _args) do
+    for node <- Application.fetch_env!(:kv, :nodes) do
+      Node.connect(node)
+    end
+```
+
+Now we can start our nodes as:
+
+```console
+$ NODES="foo@computer-name,bar@computer-name" PORT=4040 iex --sname foo -S mix
+$ NODES="foo@computer-name,bar@computer-name" PORT=4041 iex --sname bar -S mix
+```
+
+And they should connect to each other. Give it a try!
+
+In an actual production system, there is some additional care we must take. For example, we often use `--name` instead of `--sname` and give fully qualified node names.
+
+Furthermore, when connecting two instances, we must guarantee they have the same cookie, which is a secret Erlang uses to authorize the connection. When they run on the same machine, they share the same cookie by default, but it must be either explicitly set or shared in other ways when deploying in a cluster.
+
+We will revisit these topics in the last chapter when we talk about releases.
+
+## Distributed system trade-offs
+
+In this chapter, we made our key-value store distributed by using the `:global` naming registry. However, it is important to keep in mind that every distributed system, be it a library or a full-blown database, is designed with a series of trade-offs in mind.
+
+In particular, `:global` requires consistency across all known nodes whenever a new bucket is created. 
For example, if your cluster has three nodes, creating a new bucket will require all three nodes to agree on its name. This means if one node is unresponsive, perhaps due to a [network partition](https://en.wikipedia.org/wiki/Network_partition), the node will have to either reconnect or be kicked out before registration succeeds. This also means that, as your cluster grows in size, registration becomes more expensive, although lookups are always cheap and immediate. Within the ecosystem, there are other named registries, which explore different trade-offs, such as [Syn](https://github.com/ostinelli/syn).
+
+Further complications arise when we consider storage. Today, when our nodes terminate, we lose all data stored in the buckets. In our current design, since we allow each node to store its own buckets, it means we would need to back up each node. And, if we don't want data loss, we would also need to replicate the data.
+
+For those reasons, it is still very common to use a database (or any storage system) when writing production applications in Elixir, and use Elixir to implement the realtime and collaborative aspects of your applications that extend beyond storage. For example, we can use Elixir to track which clients are connected to the cluster at any given moment or implement a feed where users are notified in realtime whenever items are added or removed from a bucket.
+
+In fact, that's exactly what we will build in the next chapter, allowing us to wrap up everything we have learned so far and to talk about one of the essential building blocks in Elixir software: GenServers. 
diff --git a/lib/elixir/pages/mix-and-otp/config-and-releases.md b/lib/elixir/pages/mix-and-otp/config-and-releases.md deleted file mode 100644 index 0eb7b83ac9c..00000000000 --- a/lib/elixir/pages/mix-and-otp/config-and-releases.md +++ /dev/null @@ -1,429 +0,0 @@ - - -# Configuration and releases - -In this last guide, we will make the routing table for our distributed key-value store configurable, and then finally package the software for production. - -Let's do this. - -## Application environment - -So far we have hard-coded the routing table into the `KV.Router` module. However, we would like to make the table dynamic. This allows us not only to configure development/test/production, but also to allow different nodes to run with different entries in the routing table. There is a feature of OTP that does exactly that: the application environment. - -Each application has an environment that stores the application's specific configuration by key. For example, we could store the routing table in the `:kv` application environment, giving it a default value and allowing other applications to change the table as needed. - -Open up `apps/kv/mix.exs` and change the `application/0` function to return the following: - -```elixir -def application do - [ - extra_applications: [:logger], - env: [routing_table: []], - mod: {KV, []} - ] -end -``` - -We have added a new `:env` key to the application. It returns the application default environment, which has an entry of key `:routing_table` and value of an empty list. It makes sense for the application environment to ship with an empty table, as the specific routing table depends on the testing/deployment structure. - -In order to use the application environment in our code, we need to replace `KV.Router.table/0` with the definition below: - -```elixir -@doc """ -The routing table. 
-""" -def table do - Application.fetch_env!(:kv, :routing_table) -end -``` - -We use `Application.fetch_env!/2` to read the entry for `:routing_table` in `:kv`'s environment. You can find more information and other functions to manipulate the app environment in the `Application` module. - -Since our routing table is now empty, our distributed tests should fail. Restart the apps and re-run tests to see the failure: - -```console -$ iex --sname bar -S mix -$ elixir --sname foo -S mix test --only distributed -``` - -We need a way to configure the application environment. That's when we use configuration files. - -## Configuration - -Configuration files provide a mechanism for us to configure the environment of any application. Elixir provides two configuration entry points: - - * `config/config.exs` — this file is read at build time, before we compile our application and before we even load our dependencies. This means we can't access the code in our application nor in our dependencies. However, it means we can control how they are compiled - - * `config/runtime.exs` — this file is read after our application and dependencies are compiled and therefore it can configure how our application works at runtime. If you want to read system environment variables (via `System.get_env/1`) or access external configuration, this is the appropriate place to do so - -You can learn more about configuration in the `Config` and `Config.Provider` modules. For now, let's see an example. - -We can configure IEx default prompt to another value by creating a `config/runtime.exs` file with the following content: - -```elixir -import Config -config :iex, default_prompt: ">>>" -``` - -Start IEx with `iex -S mix` and you can see that the IEx prompt has changed. - -This means we can also configure our `:routing_table` directly in the `config/runtime.exs` file. However, which configuration value should we use? - -Currently we have two tests tagged with `@tag :distributed`. 
The "server interaction" test in `KVServerTest`, and the "route requests across nodes" in `KV.RouterTest`. Both tests are failing since they require a routing table, which is currently empty. - -For simplicity, we will define a routing table that always points to the current node. That's the table we will use for development and most of our tests. Back in `config/runtime.exs`, add this line: - -```elixir -config :kv, :routing_table, [{?a..?z, node()}] -``` - -With such a simple table available, we can now remove `@tag :distributed` from the test in `test/kv_server_test.exs`. If you run the complete suite, the test should now pass. - -However, for the tests in `KV.RouterTest`, we effectively need two nodes in our routing table. To do so, we will write a setup block that runs before all tests in that file. The setup block will change the application environment and revert it back once we are done, like this: - -```elixir -defmodule KV.RouterTest do - use ExUnit.Case - - setup_all do - current = Application.get_env(:kv, :routing_table) - - Application.put_env(:kv, :routing_table, [ - {?a..?m, :"foo@computer-name"}, - {?n..?z, :"bar@computer-name"} - ]) - - on_exit fn -> Application.put_env(:kv, :routing_table, current) end - end - - @tag :distributed - test "route requests across nodes" do -``` - -Note we removed `async: true` from `use ExUnit.Case`. Since the application environment is a global storage, tests that modify it cannot run concurrently. With all changes in place, all tests should pass, including the distributed one. - -## Releases - -Now that our application runs distributed, you may be wondering how we can package our application to run in production. After all, all of our code so far depends on Erlang and Elixir versions that are installed in your current system. To achieve this goal, Elixir provides releases. 
- -A release is a self-contained directory that consists of your application code, all of its dependencies, plus the whole Erlang Virtual Machine (VM) and runtime. Once a release is assembled, it can be packaged and deployed to a target as long as the target runs on the same operating system (OS) distribution and version as the machine that assembled the release. - -In a regular project, we can assemble a release by simply running `mix release`. However, we have an umbrella project, and in such cases Elixir requires some extra input from us. Let's see what is necessary: - -```shell -$ MIX_ENV=prod mix release -** (Mix) Umbrella projects require releases to be explicitly defined with a non-empty applications key that chooses which umbrella children should be part of the releases: - -releases: [ - foo: [ - applications: [child_app_foo: :permanent] - ], - bar: [ - applications: [child_app_bar: :permanent] - ] -] - -Alternatively you can perform the release from the children applications -``` - -That's because an umbrella project gives us plenty of options when deploying the software. We can: - - * deploy all applications in the umbrella to a node that will work as both TCP server and key-value storage - - * deploy the `:kv_server` application to work only as a TCP server as long as the routing table points only to other nodes - - * deploy only the `:kv` application when we want a node to work only as storage (no TCP access) - -As a starting point, let's define a release that includes both `:kv_server` and `:kv` applications. We will also add a version to it. Open up the `mix.exs` in the umbrella root and add inside `def project`: - -```elixir -releases: [ - foo: [ - version: "0.0.1", - applications: [kv_server: :permanent, kv: :permanent] - ] -] -``` - -That defines a release named `foo` with both `kv_server` and `kv` applications. Their mode is set to `:permanent`, which means that, if those applications crash, the whole node terminates. 
That's reasonable since those applications are essential to our system. - -Before we assemble the release, let's also define our routing table for production. Given we expect to have two nodes, we need to update `config/runtime.exs` to look like this: - -```elixir -import Config - -config :kv, :routing_table, [{?a..?z, node()}] - -if config_env() == :prod do - config :kv, :routing_table, [ - {?a..?m, :"foo@computer-name"}, - {?n..?z, :"bar@computer-name"} - ] -end -``` - -We have hard-coded the table and node names, which is good enough for our example, but you would likely move it to an external configuration system in an actual production setup. We have also wrapped it in a `config_env() == :prod` check, so this configuration does not apply to other environments. - -With the configuration in place, let's give assembling the release another try: - - $ MIX_ENV=prod mix release foo - * assembling foo-0.0.1 on MIX_ENV=prod - * skipping runtime configuration (config/runtime.exs not found) - - Release created at _build/prod/rel/foo! - - # To start your system - _build/prod/rel/foo/bin/foo start - - Once the release is running: - - # To connect to it remotely - _build/prod/rel/foo/bin/foo remote - - # To stop it gracefully (you may also send SIGINT/SIGTERM) - _build/prod/rel/foo/bin/foo stop - - To list all commands: - - _build/prod/rel/foo/bin/foo - -Excellent! A release was assembled in `_build/prod/rel/foo`. Inside the release, there will be a `bin/foo` file which is the entry point to your system. 
It supports multiple commands, such as: - - * `bin/foo start`, `bin/foo start_iex`, `bin/foo restart`, and `bin/foo stop` — for general management of the release - - * `bin/foo rpc COMMAND` and `bin/foo remote` — for running commands on the running system or to connect to the running system - - * `bin/foo eval COMMAND` — to start a fresh system that runs a single command and then shuts down - - * `bin/foo daemon` and `bin/foo daemon_iex` — to start the system as a daemon on Unix-like systems - - * `bin/foo install` — to install the system as a service on Windows machines - -If you run `bin/foo start`, it will start the system using a short name (`--sname`) equal to the release name, which in this case is `foo`. The next step is to start a system named `bar`, so we can connect `foo` and `bar` together, like we did in the previous chapter. But before we achieve this, let's talk a bit about the benefits of releases. - -## Why releases? - -Releases allow developers to precompile and package all of their code and the runtime into a single unit. The benefits of releases are: - - * Code preloading. The VM has two mechanisms for loading code: interactive and embedded. By default, it runs in the interactive mode which dynamically loads modules when they are used for the first time. The first time your application calls `Enum.map/2`, the VM will find the `Enum` module and load it. There's a downside. When you start a new server in production, it may need to load many other modules, causing the first requests to have an unusual spike in response time. Releases run in embedded mode, which loads all available modules upfront, guaranteeing your system is ready to handle requests after booting. - - * Configuration and customization. Releases give developers fine grained control over system configuration and the VM flags used to start the system. - - * Self-contained. A release does not require the source code to be included in your production artifacts. 
All of the code is precompiled and packaged. Releases do not even require Erlang or Elixir on your servers, as they include the Erlang VM and its runtime by default. Furthermore, both Erlang and Elixir standard libraries are stripped to bring only the parts you are actually using. - - * Multiple releases. You can assemble different releases with different configuration per application or even with different applications altogether. - -We have written extensive documentation on releases, so [please check the official documentation for more information](`mix release`). For now, we will continue exploring some of the features outlined above. - -## Assembling multiple releases - -So far, we have assembled a release named `foo`, but our routing table contains information for both `foo` and `bar`. Let's start `foo`: - - $ _build/prod/rel/foo/bin/foo start - 16:58:58.508 [info] Accepting connections on port 4040 - -And let's connect to it and issue a request in another terminal: - - $ telnet 127.0.0.1 4040 - Trying 127.0.0.1... - Connected to localhost. - Escape character is '^]'. - CREATE bitsandpieces - OK - PUT bitsandpieces sword 1 - OK - GET bitsandpieces sword - 1 - OK - GET shopping foo - Connection closed by foreign host. - -Our application works already when we operate on the bucket named "bitsandpieces". But since the "shopping" bucket would be stored on `bar`, the request fails as `bar` is not available. 
If you go back to the terminal running `foo`, you will see: - - 17:16:19.555 [error] Task #PID<0.622.0> started from #PID<0.620.0> terminating - ** (stop) exited in: GenServer.call({KV.RouterTasks, :"bar@computer-name"}, {:start_task, [{:"foo@josemac-2", #PID<0.622.0>, #PID<0.622.0>}, [#PID<0.622.0>, #PID<0.620.0>, #PID<0.618.0>], :monitor, {KV.Router, :route, ["shopping", KV.Registry, :lookup, [KV.Registry, "shopping"]]}], :temporary, nil}, :infinity) - ** (EXIT) no connection to bar@computer-name - (elixir) lib/gen_server.ex:1010: GenServer.call/3 - (elixir) lib/task/supervisor.ex:454: Task.Supervisor.async/6 - (kv) lib/kv/router.ex:21: KV.Router.route/4 - (kv_server) lib/kv_server/command.ex:74: KVServer.Command.lookup/2 - (kv_server) lib/kv_server.ex:29: KVServer.serve/1 - (elixir) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2 - (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3 - Function: #Function<0.128611034/0 in KVServer.loop_acceptor/1> - Args: [] - -Let's now define a release for `:bar`. One first step could be to define a release exactly like `foo` inside `mix.exs`. Additionally we will set the `cookie` option on both releases to `weknoweachother` in order for them to allow connections from each other. 
See the [Distributed Erlang Documentation](http://www.erlang.org/doc/reference_manual/distributed.html) for further information on this topic:
-
-```elixir
-releases: [
-  foo: [
-    version: "0.0.1",
-    applications: [kv_server: :permanent, kv: :permanent],
-    cookie: "weknoweachother"
-  ],
-  bar: [
-    version: "0.0.1",
-    applications: [kv_server: :permanent, kv: :permanent],
-    cookie: "weknoweachother"
-  ]
-]
-```
-
-And now let's assemble both releases:
-
-```shell
-$ MIX_ENV=prod mix release foo
-$ MIX_ENV=prod mix release bar
-```
-
-Stop `foo` if it's still running and re-start it to load the `cookie`:
-
-```shell
-$ _build/prod/rel/foo/bin/foo start
-```
-
-And start `bar` in another terminal:
-
-```shell
-$ _build/prod/rel/bar/bin/bar start
-```
-
-You should see the error below five times before the application finally shuts down:
-
-```text
- 17:21:57.567 [error] Task #PID<0.620.0> started from KVServer.Supervisor terminating
- ** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
-     (kv_server) lib/kv_server.ex:12: KVServer.accept/1
-     (elixir) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
-     (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
- Function: #Function<0.98032413/0 in KVServer.Application.start/2>
-     Args: []
-```
-
-That's happening because the release `foo` is already listening on port `4040` and `bar` is trying to do the same! One option could be to move the `:port` configuration to the application environment, like we did for the routing table, and set up different ports per node.
-
-But let's try something else. Let's make the `bar` release contain only the `:kv` application, so it works as storage but won't have a front-end.
Change the `:bar` information to this:
-
-```elixir
-releases: [
-  foo: [
-    version: "0.0.1",
-    applications: [kv_server: :permanent, kv: :permanent],
-    cookie: "weknoweachother"
-  ],
-  bar: [
-    version: "0.0.1",
-    applications: [kv: :permanent],
-    cookie: "weknoweachother"
-  ]
-]
-```
-
-And now let's assemble `bar` once more:
-
-    $ MIX_ENV=prod mix release bar
-
-And finally boot it successfully:
-
-    $ _build/prod/rel/bar/bin/bar start
-
-If you connect to localhost once again and perform another request, now everything should work, as long as the routing table contains the correct node names. Outstanding!
-
-With releases, we were able to "cut different slices" of our project and prepare them to run in production, all packaged into a single directory.
-
-## Configuring releases
-
-Releases also provide built-in hooks for configuring almost every need of the production system:
-
-  * `config/config.exs` — provides build-time application configuration, which is executed before our application compiles. This file often imports configuration files based on the environment, such as `config/dev.exs` and `config/prod.exs`.
-
-  * `config/runtime.exs` — provides runtime application configuration. It is executed every time the release boots and is further extensible via config providers.
-
-  * `rel/env.sh.eex` and `rel/env.bat.eex` — template files that are copied into every release and executed on every command to set up environment variables, including ones specific to the VM, and the general environment.
-
-  * `rel/vm.args.eex` — a template file that is copied into every release and provides static configuration of the Erlang Virtual Machine and other runtime flags.
-
-As we have seen, `config/config.exs` and `config/runtime.exs` are loaded during releases and regular Mix commands. On the other hand, `rel/env.sh.eex` and `rel/vm.args.eex` are specific to releases. Let's take a look.
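-
-As a quick sketch of the runtime entry point, a `config/runtime.exs` could read an operating system environment variable at boot. The `:kv_server`/`:port` key below is illustrative only, echoing the earlier idea of moving the port to the application environment; the project does not define this key yet:
-
-```elixir
-import Config
-
-# Illustrative only: pick the TCP port from the PORT environment
-# variable at boot time, falling back to 4040 when it is unset.
-if config_env() == :prod do
-  config :kv_server, :port, String.to_integer(System.get_env("PORT") || "4040")
-end
-```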
-
-### Operating System environment configuration
-
-Every release contains an environment file, named `env.sh` on Unix-like systems and `env.bat` on Windows machines, that executes before the Elixir system starts. In this file, you can execute any OS-level code, such as invoking other applications, setting environment variables, and so on. Some of those environment variables can even configure how the release itself runs.
-
-For instance, releases run using short-names (`--sname`). However, if you want to actually run a distributed key-value store in production, you will need multiple nodes and to start the release with the `--name` option. We can achieve this by setting the `RELEASE_DISTRIBUTION` environment variable inside the `env.sh` and `env.bat` files. Mix already has a template for said files which we can customize, so let's ask Mix to copy them to our application:
-
-    $ mix release.init
-    * creating rel/vm.args.eex
-    * creating rel/remote.vm.args.eex
-    * creating rel/env.sh.eex
-    * creating rel/env.bat.eex
-
-If you open up `rel/env.sh.eex`, you will see:
-
-```shell
-#!/bin/sh
-
-# # Sets and enables heart (recommended only in daemon mode)
-# case $RELEASE_COMMAND in
-#   daemon*)
-#     HEART_COMMAND="$RELEASE_ROOT/bin/$RELEASE_NAME $RELEASE_COMMAND"
-#     export HEART_COMMAND
-#     export ELIXIR_ERL_OPTIONS="-heart"
-#     ;;
-#   *)
-#     ;;
-# esac
-
-# # Set the release to load code on demand (interactive) instead of preloading (embedded).
-# export RELEASE_MODE=interactive
-
-# # Set the release to work across nodes.
-# # RELEASE_DISTRIBUTION must be "sname" (local), "name" (distributed) or "none".
-# export RELEASE_DISTRIBUTION=name
-# export RELEASE_NODE=<%= @release.name %>
-```
-
-The steps necessary to work across nodes are already commented out as an example. You can enable full distribution by removing the leading `# ` from the last two lines.
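-
-Once uncommented, the tail of `rel/env.sh.eex` would look like this (the EEx node name template stays exactly as generated):
-
-```shell
-# Set the release to work across nodes.
-# RELEASE_DISTRIBUTION must be "sname" (local), "name" (distributed) or "none".
-export RELEASE_DISTRIBUTION=name
-export RELEASE_NODE=<%= @release.name %>
-```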
-
-If you are on Windows, you will have to open up `rel/env.bat.eex`, where you will find this:
-
-```bat
-@echo off
-rem Set the release to load code on demand (interactive) instead of preloading (embedded).
-rem set RELEASE_MODE=interactive
-
-rem Set the release to work across nodes.
-rem RELEASE_DISTRIBUTION must be "sname" (local), "name" (distributed) or "none".
-rem set RELEASE_DISTRIBUTION=name
-rem set RELEASE_NODE=<%= @release.name %>
-```
-
-Once again, uncomment the last two lines by removing the leading `rem ` to enable full distribution. And that's all!
-
-### VM arguments
-
-The `rel/vm.args.eex` file allows you to specify low-level flags that control how the Erlang VM and its runtime operate. Entries are specified as if they were command line arguments, and code comments are also supported. Here is the default generated file:
-
-    ## Customize flags given to the VM: https://www.erlang.org/doc/man/erl.html
-    ## -mode/-name/-sname/-setcookie are configured via env vars, do not set them here
-
-    ## Increase number of concurrent ports/sockets
-    ##+Q 65536
-
-    ## Tweak GC to run more often
-    ##-env ERL_FULLSWEEP_AFTER 10
-
-You can see [a complete list of VM arguments and flags in the Erlang documentation](http://www.erlang.org/doc/man/erl.html).
-
-## Summing up
-
-Throughout the guide, we have built a very simple distributed key-value store as an opportunity to explore many constructs like generic servers, supervisors, tasks, agents, applications and more. Not only that, we have written tests for the whole application, got familiar with ExUnit, and learned how to use the Mix build tool to accomplish a wide range of tasks.
-
-If you are looking for a distributed key-value store to use in production, you should definitely look into [Riak](http://riak.com/products/riak-kv/), which also runs in the Erlang VM.
In Riak, the buckets are replicated, to avoid data loss, and instead of a router, they use [consistent hashing](https://en.wikipedia.org/wiki/Consistent_hashing) to map a bucket to a node. A consistent hashing algorithm helps reduce the amount of data that needs to be migrated when new storage nodes are added to your live system. - -Of course, Elixir can be used for much more than distributed key-value stores. Embedded systems, data-processing and data-ingestion, web applications, audio/video streaming systems, and others are many of the different domains Elixir excels at. We hope this guide has prepared you to explore any of those domains or any future domain you may desire to bring Elixir into. - -Happy coding! diff --git a/lib/elixir/pages/mix-and-otp/dependencies-and-umbrella-projects.md b/lib/elixir/pages/mix-and-otp/dependencies-and-umbrella-projects.md deleted file mode 100644 index cecc976323f..00000000000 --- a/lib/elixir/pages/mix-and-otp/dependencies-and-umbrella-projects.md +++ /dev/null @@ -1,305 +0,0 @@ - - -# Dependencies and umbrella projects - -In this chapter, we will discuss how to manage dependencies in Mix. - -Our `kv` application is complete, so it's time to implement the server that will handle the requests we defined in the first chapter: - -```text -CREATE shopping -OK - -PUT shopping milk 1 -OK - -PUT shopping eggs 3 -OK - -GET shopping milk -1 -OK - -DELETE shopping eggs -OK -``` - -However, instead of adding more code to the `kv` application, we are going to build the TCP server as another application that is a client of the `kv` application. Since the whole runtime and Elixir ecosystem are geared towards applications, it makes sense to break our projects into smaller applications that work together rather than building a big, monolithic app. - -Before creating our new application, we must discuss how Mix handles dependencies. In practice, there are two kinds of dependencies we usually work with: internal and external dependencies. 
Mix supports mechanisms to work with both. - -## External dependencies - -External dependencies are the ones not tied to your business domain. For example, if you need an HTTP API for your distributed KV application, you can use the [Plug](https://github.com/elixir-lang/plug) project as an external dependency. - -Installing external dependencies is simple. Most commonly, we use the [Hex Package Manager](https://hex.pm), by listing the dependency inside the deps function in our `mix.exs` file: - -```elixir -def deps do - [{:plug, "~> 1.0"}] -end -``` - -This dependency refers to the latest version of Plug in the 1.x.x version series that has been pushed to Hex. This is indicated by the `~>` preceding the version number. For more information on specifying version requirements, see the documentation for the `Version` module. - -Typically, stable releases are pushed to Hex. If you want to depend on an external dependency still in development, Mix is able to manage Git dependencies too: - -```elixir -def deps do - [{:plug, git: "https://github.com/elixir-lang/plug.git"}] -end -``` - -You will notice that when you add a dependency to your project, Mix generates a `mix.lock` file that guarantees *repeatable builds*. The lock file must be checked in to your version control system, to guarantee that everyone who uses the project will use the same dependency versions as you. - -Mix provides many tasks for working with dependencies, which can be seen in `mix help`: - -```console -$ mix help -mix deps # Lists dependencies and their status -mix deps.clean # Deletes the given dependencies' files -mix deps.compile # Compiles dependencies -mix deps.get # Gets all out of date dependencies -mix deps.tree # Prints the dependency tree -mix deps.unlock # Unlocks the given dependencies -mix deps.update # Updates the given dependencies -``` - -The most common tasks are `mix deps.get` and `mix deps.update`. Once fetched, dependencies are automatically compiled for you. 
You can read more about deps by typing `mix help deps`, and in the documentation for the `Mix.Tasks.Deps` module. - -## Internal dependencies - -Internal dependencies are the ones that are specific to your project. They usually don't make sense outside the scope of your project/company/organization. Most of the time, you want to keep them private, whether due to technical, economic or business reasons. - -If you have an internal dependency, Mix supports two methods to work with them: Git repositories or umbrella projects. - -For example, if you push the `kv` project to a Git repository, you'll need to list it in your deps code in order to use it: - -```elixir -def deps do - [{:kv, git: "https://github.com/YOUR_ACCOUNT/kv.git"}] -end -``` - -If the repository is private though, you may need to specify the private URL `git@github.com:YOUR_ACCOUNT/kv.git`. In any case, Mix will be able to fetch it for you as long as you have the proper credentials. - -Using Git repositories for internal dependencies is somewhat discouraged in Elixir. Remember that the runtime and the Elixir ecosystem already provide the concept of applications. As such, we expect you to frequently break your code into applications that can be organized logically, even within a single project. - -However, if you push every application as a separate project to a Git repository, your projects may become very hard to maintain as you will spend a lot of time managing those Git repositories rather than writing your code. - -For this reason, Mix supports "umbrella projects". Umbrella projects are used to build applications that run together in a single repository. That is exactly the style we are going to explore in the next sections. - -Let's create a new Mix project. We are going to creatively name it `kv_umbrella`, and this new project will have both the existing `kv` application and the new `kv_server` application inside. 
The directory structure will look like this: - - + kv_umbrella - + apps - + kv - + kv_server - -The interesting thing about this approach is that Mix has many conveniences for working with such projects, such as the ability to compile and test all applications inside `apps` with a single command. However, even though they are all listed together inside `apps`, they are still decoupled from each other, so you can build, test and deploy each application in isolation if you want to. - -So let's get started! - -## Umbrella projects - -Let's start a new project using `mix new`. This new project will be named `kv_umbrella` and we need to pass the `--umbrella` option when creating it. Do not create this new project inside the existing `kv` project! - -```console -$ mix new kv_umbrella --umbrella -* creating README.md -* creating .formatter.exs -* creating .gitignore -* creating mix.exs -* creating apps -* creating config -* creating config/config.exs -``` - -From the printed information, we can see far fewer files are generated. The generated `mix.exs` file is different too. Let's take a look (comments have been removed): - -```elixir -defmodule KvUmbrella.MixProject do - use Mix.Project - - def project do - [ - apps_path: "apps", - start_permanent: Mix.env() == :prod, - deps: deps() - ] - end - - defp deps do - [] - end -end -``` - -What makes this project different from the previous one is the `apps_path: "apps"` entry in the project definition. This means this project will act as an umbrella. Such projects do not have source files nor tests, although they can have their own dependencies. Each child application must be defined inside the `apps` directory. - -Let's move inside the apps directory and start building `kv_server`. 
This time, we are going to pass the `--sup` flag, which will tell Mix to generate a supervision tree automatically for us, instead of building one manually as we did in previous chapters: - -```console -$ cd kv_umbrella/apps -$ mix new kv_server --module KVServer --sup -``` - -The generated files are similar to the ones we first generated for `kv`, with a few differences. Let's open up `mix.exs`: - -```elixir -defmodule KVServer.MixProject do - use Mix.Project - - def project do - [ - app: :kv_server, - version: "0.1.0", - build_path: "../../_build", - config_path: "../../config/config.exs", - deps_path: "../../deps", - lockfile: "../../mix.lock", - elixir: "~> 1.14", - start_permanent: Mix.env() == :prod, - deps: deps() - ] - end - - # Run "mix help compile.app" to learn about applications - def application do - [ - extra_applications: [:logger], - mod: {KVServer.Application, []} - ] - end - - # Run "mix help deps" to learn about dependencies - defp deps do - [ - # {:dep_from_hexpm, "~> 0.3.0"}, - # {:dep_from_git, git: "https://github.com/elixir-lang/my_dep.git", tag: "0.1.0"}, - # {:sibling_app_in_umbrella, in_umbrella: true}, - ] - end -end -``` - -First of all, since we generated this project inside `kv_umbrella/apps`, Mix automatically detected the umbrella structure and added four lines to the project definition: - -```elixir -build_path: "../../_build", -config_path: "../../config/config.exs", -deps_path: "../../deps", -lockfile: "../../mix.lock", -``` - -Those options mean all dependencies will be checked out to `kv_umbrella/deps`, and they will share the same build, config, and lock files. We haven't talked about configuration yet, but from here we can build the intuition that all configuration and dependencies are shared across all projects in an umbrella, and it is not per application. 
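-
-To build some intuition about the umbrella layout, you can ask Mix which child applications it currently sees. This is just an exploratory check; `Mix.Project.apps_paths/0` returns a map of child apps to their directories when the current project is an umbrella, and `nil` otherwise:
-
-```elixir
-# Inside `iex -S mix` at the kv_umbrella root. At this point only
-# kv_server exists under apps/, so the map has a single entry.
-Mix.Project.apps_paths()
-```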
- -The second change is in the `application` function inside `mix.exs`: - -```elixir -def application do - [ - extra_applications: [:logger], - mod: {KVServer.Application, []} - ] -end -``` - -Because we passed the `--sup` flag, Mix automatically added `mod: {KVServer.Application, []}`, specifying that `KVServer.Application` is our application callback module. `KVServer.Application` will start our application supervision tree. - -In fact, let's open up `lib/kv_server/application.ex`: - -```elixir -defmodule KVServer.Application do - # See https://hexdocs.pm/elixir/Application.html - # for more information on OTP Applications - @moduledoc false - - use Application - - @impl true - def start(_type, _args) do - # List all child processes to be supervised - children = [ - # Starts a worker by calling: KVServer.Worker.start_link(arg) - # {KVServer.Worker, arg}, - ] - - # See https://hexdocs.pm/elixir/Supervisor.html - # for other strategies and supported options - opts = [strategy: :one_for_one, name: KVServer.Supervisor] - Supervisor.start_link(children, opts) - end -end -``` - -Notice that it defines the application callback function, `start/2`, and instead of defining a supervisor named `KVServer.Supervisor` that uses the `Supervisor` module, it conveniently defined the supervisor inline! You can read more about such supervisors by reading the `Supervisor` module documentation. - -We can already try out our first umbrella child. We could run tests inside the `apps/kv_server` directory, but that wouldn't be much fun. Instead, go to the root of the umbrella project and run `mix test`: - -```console -$ mix test -``` - -And it works! - -Since we want `kv_server` to eventually use the functionality we defined in `kv`, we need to add `kv` as a dependency to our application. - -## Dependencies within an umbrella project - -Dependencies between applications in an umbrella project must still be explicitly defined and Mix makes it easy to do so. 
Open up `apps/kv_server/mix.exs` and change the `deps/0` function to the following:
-
-```elixir
-defp deps do
-  [{:kv, in_umbrella: true}]
-end
-```
-
-The line above makes `:kv` available as a dependency inside `:kv_server` and automatically starts the `:kv` application before the server starts.
-
-Finally, copy the `kv` application we have built so far to the `apps` directory in our new umbrella project. The final directory structure should match the structure we mentioned earlier:
-
-    + kv_umbrella
-      + apps
-        + kv
-        + kv_server
-
-We now need to modify `apps/kv/mix.exs` to contain the umbrella entries we have seen in `apps/kv_server/mix.exs`. Open up `apps/kv/mix.exs` and add to the `project/0` function:
-
-```elixir
-build_path: "../../_build",
-config_path: "../../config/config.exs",
-deps_path: "../../deps",
-lockfile: "../../mix.lock",
-```
-
-Now you can run tests for both projects from the umbrella root with `mix test`. Sweet!
-
-## Don't drink the kool aid
-
-Umbrella projects are a convenience to help you organize and manage multiple applications. While they provide a degree of separation between applications, those applications are not fully decoupled, as they share the same configuration and the same dependencies.
-
-The pattern of keeping multiple applications in the same repository is known as "mono-repo". Umbrella projects maximize this pattern by providing conveniences to compile, test and run multiple applications at once.
-
-If you find yourself in a position where you want to use different configurations in each application for the same dependency or use different dependency versions, then it is likely your codebase has grown beyond what umbrellas can provide.
-
-The good news is that breaking an umbrella apart is quite straightforward, as you simply need to move applications outside of the umbrella project's `apps/` directory and update the project's `mix.exs` file to no longer set the `build_path`, `config_path`, `deps_path`, and `lockfile` configuration. You can depend on private projects outside of the umbrella in multiple ways:
-
-  1. Move it to a separate folder within the same repository and point to it using a path dependency (the mono-repo pattern)
-  2. Move the repository to a separate Git repository and depend on it
-  3. Publish the project to a private [Hex.pm](https://hex.pm/) organization
-
-## Summing up
-
-In this chapter, we have learned more about Mix dependencies and umbrella projects. While we may run `kv` without a server, our `kv_server` depends directly on `kv`. By breaking them into separate applications, we gain more control over how they are developed and tested.
-
-When using umbrella applications, it is important to have a clear boundary between them. Our upcoming `kv_server` must only access public APIs defined in `kv`. Think of your umbrella apps as any other dependency or even Elixir itself: you can only access what is public and documented. Reaching into private functionality in your dependencies is a poor practice that will eventually cause your code to break when a new version is released.
-
-Umbrella applications can also be used as a stepping stone for eventually extracting an application from your codebase. For example, imagine a web application that has to send "push notifications" to its users. The whole "push notifications system" can be developed as a separate application in the umbrella, with its own supervision tree and APIs. If you ever run into a situation where another project needs the push notifications system, the system can be moved to a private repository or [a Hex package](https://hex.pm/).
- -Finally, keep in mind that applications in an umbrella project all share the same configurations and dependencies. If two applications in your umbrella need to configure the same dependency in drastically different ways or even use different versions, you have probably outgrown the benefits brought by umbrellas. Remember you can break the umbrella and still leverage the benefits behind "mono-repos". - -With our umbrella project up and running, it is time to start writing our server. diff --git a/lib/elixir/pages/mix-and-otp/distributed-tasks.md b/lib/elixir/pages/mix-and-otp/distributed-tasks.md deleted file mode 100644 index 7461017527a..00000000000 --- a/lib/elixir/pages/mix-and-otp/distributed-tasks.md +++ /dev/null @@ -1,361 +0,0 @@ - - -# Distributed tasks and tags - -In this chapter, we will go back to the `:kv` application and add a routing layer that will allow us to distribute requests between nodes based on the bucket name. - -The routing layer will receive a routing table of the following format: - -```elixir -[ - {?a..?m, :"foo@computer-name"}, - {?n..?z, :"bar@computer-name"} -] -``` - -The router will check the first byte of the bucket name against the table and dispatch to the appropriate node based on that. For example, a bucket starting with the letter "a" (`?a` represents the Unicode codepoint of the letter "a") will be dispatched to node `foo@computer-name`. - -If the matching entry points to the node evaluating the request, then we've finished routing, and this node will perform the requested operation. If the matching entry points to a different node, we'll pass the request to said node, which will look at its own routing table (which may be different from the one in the first node) and act accordingly. If no entry matches, an error will be raised. - -> Note: we will be using two nodes in the same machine throughout this chapter. You are free to use two (or more) different machines on the same network but you need to do some prep work. 
First of all, you need to ensure all machines have a `~/.erlang.cookie` file with exactly the same value. Then you need to guarantee [epmd](https://www.erlang.org/doc/apps/erts/epmd_cmd) is running on a port that is not blocked (you can run `epmd -d` for debug info). - -## Our first distributed code - -Elixir ships with facilities to connect nodes and exchange information between them. In fact, we use the same concepts of processes, message passing and receiving messages when working in a distributed environment because Elixir processes are *location transparent*. This means that when sending a message, it doesn't matter if the recipient process is on the same node or on another node, the VM will be able to deliver the message in both cases. - -In order to run distributed code, we need to start the VM with a name. The name can be short (when in the same network) or long (requires the full computer address). Let's start a new IEx session: - -```console -$ iex --sname foo -``` - -You can see now the prompt is slightly different and shows the node name followed by the computer name: - - Interactive Elixir - press Ctrl+C to exit (type h() ENTER for help) - iex(foo@jv)1> - -My computer is named `jv`, so I see `foo@jv` in the example above, but you will get a different result. We will use `foo@computer-name` in the following examples and you should update them accordingly when trying out the code. - -Let's define a module named `Hello` in this shell: - -```elixir -iex> defmodule Hello do -...> def world, do: IO.puts("hello world") -...> end -``` - -If you have another computer on the same network with both Erlang and Elixir installed, you can start another shell on it. If you don't, you can start another IEx session in another terminal. 
In either case, give it the short name of `bar`: - -```console -$ iex --sname bar -``` - -Note that inside this new IEx session, we cannot access `Hello.world/0`: - -```elixir -iex> Hello.world -** (UndefinedFunctionError) function Hello.world/0 is undefined (module Hello is not available) - Hello.world() -``` - -However, we can spawn a new process on `foo@computer-name` from `bar@computer-name`! Let's give it a try (where `@computer-name` is the one you see locally): - -```elixir -iex> Node.spawn_link(:"foo@computer-name", fn -> Hello.world() end) -#PID<9014.59.0> -hello world -``` - -Elixir spawned a process on another node and returned its PID. The code then executed on the other node where the `Hello.world/0` function exists and invoked that function. Note that the result of "hello world" was printed on the current node `bar` and not on `foo`. In other words, the message to be printed was sent back from `foo` to `bar`. This happens because the process spawned on the other node (`foo`) knows all the output should be sent back to the original node! - -We can send and receive messages from the PID returned by `Node.spawn_link/2` as usual. Let's try a quick ping-pong example: - -```elixir -iex> pid = Node.spawn_link(:"foo@computer-name", fn -> -...> receive do -...> {:ping, client} -> send(client, :pong) -...> end -...> end) -#PID<9014.59.0> -iex> send(pid, {:ping, self()}) -{:ping, #PID<0.73.0>} -iex> flush() -:pong -:ok -``` - -From our quick exploration, we could conclude that we should use `Node.spawn_link/2` to spawn processes on a remote node every time we need to do a distributed computation. However, we have learned throughout this guide that spawning processes outside of supervision trees should be avoided if possible, so we need to look for other options. - -There are three better alternatives to `Node.spawn_link/2` that we could use in our implementation: - -1. We could use Erlang's [`:erpc`](`:erpc`) module to execute functions on a remote node. 
Inside the `bar@computer-name` shell above, you can call `:erpc.call(:"foo@computer-name", Hello, :world, [])` and it will print "hello world" - -2. We could have a server running on the other node and send requests to that node via the `GenServer` API. For example, you can call a server on a remote node by using `GenServer.call({name, node}, arg)` or passing the remote process PID as the first argument - -3. We could use [tasks](`Task`), which we have learned about in [a previous chapter](task-and-gen-tcp.md), as they can be spawned on both local and remote nodes - -The options above have different properties. The GenServer would serialize your requests on a single server, while tasks are effectively running asynchronously on the remote node, with the only serialization point being the spawning done by the supervisor. - -For our routing layer, we are going to use tasks, but feel free to explore the other alternatives too. - -## async/await - -So far we have explored tasks that are started and run in isolation, without regard to their return value. However, sometimes it is useful to run a task to compute a value and read its result later on. For this, tasks also provide the `async/await` pattern: - -```elixir -task = Task.async(fn -> compute_something_expensive() end) -res = compute_something_else() -res + Task.await(task) -``` - -`async/await` provides a very simple mechanism to compute values concurrently. Not only that, `async/await` can also be used with the same `Task.Supervisor` we have used in previous chapters. We just need to call `Task.Supervisor.async/2` instead of `Task.Supervisor.start_child/2` and use `Task.await/2` to read the result later on. - -## Distributed tasks - -Distributed tasks are exactly the same as supervised tasks. The only difference is that we pass the node name when spawning the task on the supervisor. Open up `lib/kv/supervisor.ex` from the `:kv` application. 
Let's add a task supervisor as the last child of the tree: - -```elixir -{Task.Supervisor, name: KV.RouterTasks}, -``` - -Now, let's start two named nodes again, but inside the `:kv` application: - -```console -$ cd apps/kv -$ iex --sname foo -S mix -$ iex --sname bar -S mix -``` - -From inside `bar@computer-name`, we can now spawn a task directly on the other node via the supervisor: - -```elixir -iex> task = Task.Supervisor.async({KV.RouterTasks, :"foo@computer-name"}, fn -> -...> {:ok, node()} -...> end) -%Task{ - mfa: {:erlang, :apply, 2}, - owner: #PID<0.122.0>, - pid: #PID<12467.88.0>, - ref: #Reference<0.0.0.400> -} -iex> Task.await(task) -{:ok, :"foo@computer-name"} -``` - -Our first distributed task retrieves the name of the node the task is running on. Notice we have given an anonymous function to `Task.Supervisor.async/2` but, in distributed cases, it is preferable to give the module, function, and arguments explicitly: - -```elixir -iex> task = Task.Supervisor.async({KV.RouterTasks, :"foo@computer-name"}, Kernel, :node, []) -%Task{ - mfa: {Kernel, :node, 0}, - owner: #PID<0.122.0>, - pid: #PID<12467.89.0>, - ref: #Reference<0.0.0.404> -} -iex> Task.await(task) -:"foo@computer-name" -``` - -The difference is that anonymous functions require the target node to have exactly the same code version as the caller. Using module, function, and arguments is more robust because you only need to find a function with matching arity in the given module. - -With this knowledge in hand, let's finally write the routing code. - -## Routing layer - -Create a file at `lib/kv/router.ex` with the following contents: - -```elixir -defmodule KV.Router do - @doc """ - Dispatch the given `mod`, `fun`, `args` request - to the appropriate node based on the `bucket`. 
- """ - def route(bucket, mod, fun, args) do - # Get the first byte of the binary - first = :binary.first(bucket) - - # Try to find an entry in the table() or raise - entry = - Enum.find(table(), fn {enum, _node} -> - first in enum - end) || no_entry_error(bucket) - - # If the entry node is the current node - if elem(entry, 1) == node() do - apply(mod, fun, args) - else - {KV.RouterTasks, elem(entry, 1)} - |> Task.Supervisor.async(KV.Router, :route, [bucket, mod, fun, args]) - |> Task.await() - end - end - - defp no_entry_error(bucket) do - raise "could not find entry for #{inspect bucket} in table #{inspect table()}" - end - - @doc """ - The routing table. - """ - def table do - # Replace computer-name with your local machine name - [{?a..?m, :"foo@computer-name"}, {?n..?z, :"bar@computer-name"}] - end -end -``` - -Let's write a test to verify our router works. Create a file named `test/kv/router_test.exs` containing: - -```elixir -defmodule KV.RouterTest do - use ExUnit.Case, async: true - - test "route requests across nodes" do - assert KV.Router.route("hello", Kernel, :node, []) == - :"foo@computer-name" - assert KV.Router.route("world", Kernel, :node, []) == - :"bar@computer-name" - end - - test "raises on unknown entries" do - assert_raise RuntimeError, ~r/could not find entry/, fn -> - KV.Router.route(<<0>>, Kernel, :node, []) - end - end -end -``` - -The first test invokes `Kernel.node/0`, which returns the name of the current node, based on the bucket names "hello" and "world". According to our routing table so far, we should get `foo@computer-name` and `bar@computer-name` as responses, respectively. - -The second test checks that the code raises for unknown entries. - -In order to run the first test, we need to have two nodes running. Let's move into `apps/kv` and restart the node named `bar` which is going to be used by tests. This way, `bar` will not load the `:kv_server` app and leave the port available for `foo` and tests. 
- -```console -$ cd apps/kv -$ iex --sname bar -S mix -``` - -And now run tests with: - -```console -$ elixir --sname foo -S mix test -``` - -The test should pass. - -## Test filters and tags - -Although our tests pass, our testing structure is getting more complex. In particular, running tests with only `mix test` causes failures in our suite, since our test requires a connection to another node. - -Luckily, ExUnit ships with a facility to tag tests, allowing us to run specific callbacks or even filter tests altogether based on those tags. We have already used the `:capture_log` tag in the previous chapter, which has its semantics specified by ExUnit itself. - -This time let's add a `:distributed` tag to `test/kv/router_test.exs`: - -```elixir -@tag :distributed -test "route requests across nodes" do -``` - -Writing `@tag :distributed` is equivalent to writing `@tag distributed: true`. - -With the test properly tagged, we can now check if the node is alive on the network and, if not, we can exclude all distributed tests. Open up `test/test_helper.exs` inside the `:kv` application and add the following: - -```elixir -exclude = - if Node.alive?(), do: [], else: [distributed: true] - -ExUnit.start(exclude: exclude) -``` - -Now run tests with `mix test`: - -```console -$ mix test -Excluding tags: [distributed: true] - -....... - -Finished in 0.05 seconds -9 tests, 0 failures, 1 excluded -``` - -This time all tests passed and ExUnit warned us that distributed tests were being excluded. If you run tests with `$ elixir --sname foo -S mix test`, one extra test should run and successfully pass as long as the `bar@computer-name` node is available. - -The `mix test` command also allows us to dynamically include and exclude tags. For example, we can run `$ mix test --include distributed` to run distributed tests regardless of the value set in `test/test_helper.exs`. We could also pass `--exclude` to exclude a particular tag from the command line. 
Finally, `--only` can be used to run only tests with a particular tag: - -```console -$ elixir --sname foo -S mix test --only distributed -``` - -You can read more about filters, tags, and the default tags in the `ExUnit.Case` module documentation. - -## Wiring it all up - -Now with our routing system in place, let's change `KVServer` to use the router. Replace the `lookup/2` function in `KVServer.Command` from this: - -```elixir -defp lookup(bucket, callback) do - case KV.Registry.lookup(KV.Registry, bucket) do - {:ok, pid} -> callback.(pid) - :error -> {:error, :not_found} - end -end -``` - -by this: - -```elixir -defp lookup(bucket, callback) do - case KV.Router.route(bucket, KV.Registry, :lookup, [KV.Registry, bucket]) do - {:ok, pid} -> callback.(pid) - :error -> {:error, :not_found} - end -end -``` - -Instead of directly looking up the registry, we are using the router instead to match a specific node. Then we get a `pid` that can be from any process in our cluster. From now on, `GET`, `PUT` and `DELETE` requests are all routed to the appropriate node. - -Let's also make sure that when a new bucket is created it ends up on the correct node. Replace the `run/1` function in `KVServer.Command`, the one that matches the `:create` command, with the following: - -```elixir -def run({:create, bucket}) do - case KV.Router.route(bucket, KV.Registry, :create, [KV.Registry, bucket]) do - pid when is_pid(pid) -> {:ok, "OK\r\n"} - _ -> {:error, "FAILED TO CREATE BUCKET"} - end -end -``` - -Now if you run the tests, you will see that an existing test that checks the server interaction will fail, as it will attempt to use the routing table. 
To address this failure, change the `test_helper.exs` for `:kv_server` application as we did for `:kv` and add `@tag :distributed` to this test too: - -```elixir -@tag :distributed -test "server interaction", %{socket: socket} do -``` - -However, keep in mind that by making the test distributed, we will likely run it less frequently, since we may not do the distributed setup on every test run. We will learn how to address this in the next chapter, by effectively learning how to make the routing table configurable. - -## Summing up - -We have only scratched the surface of what is possible when it comes to distribution. - -In all of our examples, we relied on Erlang's ability to automatically connect nodes whenever there is a request. For example, when we invoked `Node.spawn_link(:"foo@computer-name", fn -> Hello.world() end)`, Erlang automatically connected to said node and started a new process. However, you may also want to take a more explicit approach to connections, by using `Node.connect/1` and `Node.disconnect/1`. - -By default, Erlang establishes a fully meshed network, which means all nodes are connected to each other. Under this topology, the Erlang distribution is known to scale to several dozens of nodes in the same cluster. Erlang also has the concept of hidden nodes, which can allow developers to assemble custom topologies as seen in projects such as [Partisan](https://github.com/lasp-lang/partisan). - -In production, you may have nodes connecting and disconnecting at any time. In such scenarios, you need to provide *node discoverability*. Libraries such as [libcluster](https://github.com/bitwalker/libcluster/) and [dns_cluster](https://github.com/phoenixframework/dns_cluster) provide several strategies for node discoverability using DNS, Kubernetes, etc. - -Distributed key-value stores, used in real-life, need to consider the fact nodes may go up and down at any time and also migrate the bucket across nodes. 
Even further, buckets often need to be duplicated between nodes, so a failure in a node does not lead to the whole bucket being lost. This process is called *replication*. Our implementation won't attempt to tackle such problems. Instead, we assume there is a fixed number of nodes and therefore use a fixed routing table. - -These topics can be daunting at first but remember that most Elixir frameworks abstract those concerns for you. For example, when using [the Phoenix web framework](https://phoenixframework.org), its plug-and-play abstractions take care of sending messages and tracking how users join and leave a cluster. However, if you are interested in distributed systems after all, there is much to explore. Here are some additional references: - - * [The excellent Distribunomicon chapter from Learn You Some Erlang](http://learnyousomeerlang.com/distribunomicon) - * Erlang's [`:global` module](`:global`), which can provide global names and global locks, allowing unique names and unique locks in a whole cluster of machines - * Erlang's [`:pg` module](`:pg`), which allows process to join different groups shared across the whole cluster - * [Phoenix PubSub project](https://github.com/phoenixframework/phoenix_pubsub), which provides a distributed messaging system and a distributed presence system for tracking users and processes in a cluster - -You will also find many libraries for building distributed systems within the overall Erlang ecosystem. For now, it is time to go back to our simple distributed key-value store and learn how to configure and package it for production. 
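As an aside, the routing decision described in the chapter above is easy to experiment with in isolation. Below is a minimal, node-free sketch of the first-byte table lookup; the node names are placeholders, as in the chapter, and the `route` helper is an illustrative stand-in for `KV.Router.route/4` without the remote dispatch:

```elixir
# Node-free sketch of the router's table scan: find the entry whose
# byte range covers the bucket name's first byte.
table = [
  {?a..?m, :"foo@computer-name"},
  {?n..?z, :"bar@computer-name"}
]

route = fn bucket ->
  first = :binary.first(bucket)

  case Enum.find(table, fn {range, _node} -> first in range end) do
    {_range, node} -> {:ok, node}
    nil -> {:error, :no_entry}
  end
end

{:ok, :"foo@computer-name"} = route.("hello")
{:ok, :"bar@computer-name"} = route.("world")
{:error, :no_entry} = route.(<<0>>)
```

The pattern matches at the bottom double as assertions: "hello" starts with `?h` (in `?a..?m`), "world" with `?w` (in `?n..?z`), and `<<0>>` falls outside both ranges.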
diff --git a/lib/elixir/pages/mix-and-otp/docs-tests-and-with.md b/lib/elixir/pages/mix-and-otp/docs-tests-and-with.md index e29949dded9..eb9d0c62412 100644 --- a/lib/elixir/pages/mix-and-otp/docs-tests-and-with.md +++ b/lib/elixir/pages/mix-and-otp/docs-tests-and-with.md @@ -25,7 +25,7 @@ DELETE shopping eggs OK ``` -After the parsing is done, we will update our server to dispatch the parsed commands to the `:kv` application we built previously. +After the parsing is done, we will update our server to dispatch the parsed commands to the relevant buckets. ## Doctests @@ -33,16 +33,16 @@ On the language homepage, we mention that Elixir makes documentation a first-cla In this section, we will implement the parsing functionality, document it and make sure our documentation is up to date with doctests. This helps us provide documentation with accurate code samples. -Let's create our command parser at `lib/kv_server/command.ex` and start with the doctest: +Let's create our command parser at `lib/kv/command.ex` and start with the doctest: ```elixir -defmodule KVServer.Command do +defmodule KV.Command do @doc ~S""" Parses the given `line` into a command. ## Examples - iex> KVServer.Command.parse("CREATE shopping\r\n") + iex> KV.Command.parse("CREATE shopping\r\n") {:ok, {:create, "shopping"}} """ @@ -56,29 +56,29 @@ Doctests are specified by an indentation of four spaces followed by the `iex>` p Also, note that we started the documentation string using `@doc ~S"""`. The `~S` prevents the `\r\n` characters from being converted to a carriage return and line feed until they are evaluated in the test. 
-To run our doctests, we'll create a file at `test/kv_server/command_test.exs` and call `doctest KVServer.Command` in the test case: +To run our doctests, we'll create a file at `test/kv/command_test.exs` and call `doctest KV.Command` in the test case: ```elixir -defmodule KVServer.CommandTest do +defmodule KV.CommandTest do use ExUnit.Case, async: true - doctest KVServer.Command + doctest KV.Command end ``` Run the test suite and the doctest should fail: ```text - 1) doctest KVServer.Command.parse/1 (1) (KVServer.CommandTest) - test/kv_server/command_test.exs:3 + 1) doctest KV.Command.parse/1 (1) (KV.CommandTest) + test/kv/command_test.exs:3 Doctest failed doctest: - iex> KVServer.Command.parse("CREATE shopping\r\n") + iex> KV.Command.parse("CREATE shopping\r\n") {:ok, {:create, "shopping"}} - code: KVServer.Command.parse "CREATE shopping\r\n" === {:ok, {:create, "shopping"}} + code: KV.Command.parse "CREATE shopping\r\n" === {:ok, {:create, "shopping"}} left: :not_implemented right: {:ok, {:create, "shopping"}} stacktrace: - lib/kv_server/command.ex:7: KVServer.Command (module) + lib/kv/command.ex:7: KV.Command (module) ``` Excellent! @@ -96,50 +96,50 @@ end Our implementation splits the line on whitespace and then matches the command against a list. Using `String.split/1` means our commands will be whitespace-insensitive. Leading and trailing whitespace won't matter, nor will consecutive spaces between words. Let's add some new doctests to test this behavior along with the other commands: ```elixir -@doc ~S""" -Parses the given `line` into a command. + @doc ~S""" + Parses the given `line` into a command. 
-## Examples + ## Examples - iex> KVServer.Command.parse "CREATE shopping\r\n" - {:ok, {:create, "shopping"}} + iex> KV.Command.parse "CREATE shopping\r\n" + {:ok, {:create, "shopping"}} - iex> KVServer.Command.parse "CREATE shopping \r\n" - {:ok, {:create, "shopping"}} + iex> KV.Command.parse "CREATE shopping \r\n" + {:ok, {:create, "shopping"}} - iex> KVServer.Command.parse "PUT shopping milk 1\r\n" - {:ok, {:put, "shopping", "milk", "1"}} + iex> KV.Command.parse "PUT shopping milk 1\r\n" + {:ok, {:put, "shopping", "milk", "1"}} - iex> KVServer.Command.parse "GET shopping milk\r\n" - {:ok, {:get, "shopping", "milk"}} + iex> KV.Command.parse "GET shopping milk\r\n" + {:ok, {:get, "shopping", "milk"}} - iex> KVServer.Command.parse "DELETE shopping eggs\r\n" - {:ok, {:delete, "shopping", "eggs"}} + iex> KV.Command.parse "DELETE shopping eggs\r\n" + {:ok, {:delete, "shopping", "eggs"}} -Unknown commands or commands with the wrong number of -arguments return an error: + Unknown commands or commands with the wrong number of + arguments return an error: - iex> KVServer.Command.parse "UNKNOWN shopping eggs\r\n" - {:error, :unknown_command} + iex> KV.Command.parse "UNKNOWN shopping eggs\r\n" + {:error, :unknown_command} - iex> KVServer.Command.parse "GET shopping\r\n" - {:error, :unknown_command} + iex> KV.Command.parse "GET shopping\r\n" + {:error, :unknown_command} -""" + """ ``` With doctests at hand, it is your turn to make tests pass! 
Once you're ready, you can compare your work with our solution below: ```elixir -def parse(line) do - case String.split(line) do - ["CREATE", bucket] -> {:ok, {:create, bucket}} - ["GET", bucket, key] -> {:ok, {:get, bucket, key}} - ["PUT", bucket, key, value] -> {:ok, {:put, bucket, key, value}} - ["DELETE", bucket, key] -> {:ok, {:delete, bucket, key}} - _ -> {:error, :unknown_command} + def parse(line) do + case String.split(line) do + ["CREATE", bucket] -> {:ok, {:create, bucket}} + ["GET", bucket, key] -> {:ok, {:get, bucket, key}} + ["PUT", bucket, key, value] -> {:ok, {:put, bucket, key, value}} + ["DELETE", bucket, key] -> {:ok, {:delete, bucket, key}} + _ -> {:error, :unknown_command} + end end -end ``` Notice how we were able to elegantly parse the commands without adding a bunch of `if/else` clauses that check the command name and number of arguments! @@ -147,104 +147,107 @@ Notice how we were able to elegantly parse the commands without adding a bunch o Finally, you may have observed that each doctest corresponds to a different test in our suite, which now reports a total of 7 doctests. That is because ExUnit considers the following to define two different doctests: ```elixir -iex> KVServer.Command.parse("UNKNOWN shopping eggs\r\n") +iex> KV.Command.parse("UNKNOWN shopping eggs\r\n") {:error, :unknown_command} -iex> KVServer.Command.parse("GET shopping\r\n") +iex> KV.Command.parse("GET shopping\r\n") {:error, :unknown_command} ``` Without new lines, as seen below, ExUnit compiles it into a single doctest: ```elixir -iex> KVServer.Command.parse("UNKNOWN shopping eggs\r\n") +iex> KV.Command.parse("UNKNOWN shopping eggs\r\n") {:error, :unknown_command} -iex> KVServer.Command.parse("GET shopping\r\n") +iex> KV.Command.parse("GET shopping\r\n") {:error, :unknown_command} ``` As the name says, doctest is documentation first and a test later. Their goal is not to replace tests but to provide up-to-date documentation. 
You can read more about doctests in the `ExUnit.DocTest` documentation.

-## `with`
+## Using `with`

As we are now able to parse commands, we can finally start implementing the logic that runs the commands. Let's add a stub definition for this function for now:

```elixir
-defmodule KVServer.Command do
+defmodule KV.Command do
   @doc """
   Runs the given command.
   """
-  def run(command) do
-    {:ok, "OK\r\n"}
+  def run(command, socket) do
+    :gen_tcp.send(socket, "OK\r\n")
+    :ok
   end
 end
```

-Before we implement this function, let's change our server to start using our new `parse/1` and `run/1` functions. Remember, our `read_line/1` function was also crashing when the client closed the socket, so let's take the opportunity to fix it, too. Open up `lib/kv_server.ex` and replace the existing server definition:
+Before we implement this function, let's change our server to start using our new `parse/1` and `run/2` functions. Remember, our `read_line/1` function was also crashing when the client closed the socket, so let's take the opportunity to fix it, too. 
Open up `lib/kv/server.ex` and replace the existing server definition: ```elixir -defp serve(socket) do - socket - |> read_line() - |> write_line(socket) + defp serve(socket) do + socket + |> read_line() + |> write_line(socket) - serve(socket) -end + serve(socket) + end -defp read_line(socket) do - {:ok, data} = :gen_tcp.recv(socket, 0) - data -end + defp read_line(socket) do + {:ok, data} = :gen_tcp.recv(socket, 0) + data + end -defp write_line(line, socket) do - :gen_tcp.send(socket, line) -end + defp write_line(line, socket) do + :gen_tcp.send(socket, line) + end ``` by the following: ```elixir -defp serve(socket) do - msg = - case read_line(socket) do - {:ok, data} -> - case KVServer.Command.parse(data) do - {:ok, command} -> - KVServer.Command.run(command) - {:error, _} = err -> - err - end - {:error, _} = err -> - err - end - - write_line(socket, msg) - serve(socket) -end + defp serve(socket) do + msg = + case read_line(socket) do + {:ok, data} -> + case KV.Command.parse(data) do + {:ok, command} -> + KV.Command.run(command, socket) + + {:error, _} = err -> + err + end + + {:error, _} = err -> + err + end + + write_line(socket, msg) + serve(socket) + end -defp read_line(socket) do - :gen_tcp.recv(socket, 0) -end + defp read_line(socket) do + :gen_tcp.recv(socket, 0) + end -defp write_line(socket, {:ok, text}) do - :gen_tcp.send(socket, text) -end + defp write_line(_socket, :ok) do + :ok + end -defp write_line(socket, {:error, :unknown_command}) do - # Known error; write to the client - :gen_tcp.send(socket, "UNKNOWN COMMAND\r\n") -end + defp write_line(socket, {:error, :unknown_command}) do + # Known error; write to the client + :gen_tcp.send(socket, "UNKNOWN COMMAND\r\n") + end -defp write_line(_socket, {:error, :closed}) do - # The connection was closed, exit politely - exit(:shutdown) -end + defp write_line(_socket, {:error, :closed}) do + # The connection was closed, exit politely + exit(:shutdown) + end -defp write_line(socket, {:error, error}) do - # 
Unknown error; write to the client and exit - :gen_tcp.send(socket, "ERROR\r\n") - exit(error) -end + defp write_line(socket, {:error, error}) do + # Unknown error; write to the client and exit + :gen_tcp.send(socket, "ERROR\r\n") + exit(error) + end ``` If we start our server, we can now send commands to it. For now, we will get two different responses: "OK" when the command is known and "UNKNOWN COMMAND" otherwise: @@ -264,18 +267,18 @@ This means our implementation is going in the correct direction, but it doesn't The previous implementation used pipelines which made the logic straightforward to follow. However, now that we need to handle different error codes along the way, our server logic is nested inside many `case` calls. -Thankfully, Elixir v1.2 introduced the `with` construct, which allows you to simplify code like the above, replacing nested `case` calls with a chain of matching clauses. Let's rewrite the `serve/1` function to use `with`: +Thankfully, Elixir has the `with` construct, which allows you to simplify code like the above, replacing nested `case` calls with a chain of matching clauses. Let's rewrite the `serve/1` function to use `with`: ```elixir -defp serve(socket) do - msg = - with {:ok, data} <- read_line(socket), - {:ok, command} <- KVServer.Command.parse(data), - do: KVServer.Command.run(command) - - write_line(socket, msg) - serve(socket) -end + defp serve(socket) do + msg = + with {:ok, data} <- read_line(socket), + {:ok, command} <- KV.Command.parse(data), + do: KV.Command.run(command, socket) + + write_line(socket, msg) + serve(socket) + end ``` Much better! `with` will retrieve the value returned by the right-side of `<-` and match it against the pattern on the left side. If the value matches the pattern, `with` moves on to the next expression. In case there is no match, the non-matching value is returned. @@ -286,55 +289,60 @@ You can read more about `with/1` in our documentation. 
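To see that short-circuiting behavior in isolation, here is a small self-contained sketch where `parse` and `run` are simplified stand-ins rather than the real `KV.Command` functions:

```elixir
# Simplified stand-ins: parse a line into a command, then run it.
parse = fn
  "CREATE " <> rest -> {:ok, {:create, String.trim(rest)}}
  _other -> {:error, :unknown_command}
end

run = fn {:create, bucket} -> {:ok, "OK #{bucket}\r\n"} end

# `with` moves on while the patterns match and returns the first
# non-matching value otherwise, mirroring the serve/1 rewrite above.
handle = fn data ->
  with {:ok, command} <- parse.(data),
       do: run.(command)
end

{:ok, "OK shopping\r\n"} = handle.("CREATE shopping\r\n")
{:error, :unknown_command} = handle.("HELLO\r\n")
```

Note how the `{:error, :unknown_command}` value falls straight through `with` without ever reaching `run`, which is exactly what lets the server funnel all error tuples into a single `write_line/2` dispatch.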
## Running commands

-The last step is to implement `KVServer.Command.run/1`, to run the parsed commands against the `:kv` application. Its implementation is shown below:
+The last step is to implement `KV.Command.run/2` to run the parsed commands on top of buckets. Its implementation is shown below:

```elixir
-@doc """
-Runs the given command.
-"""
-def run(command)
-
-def run({:create, bucket}) do
-  KV.Registry.create(KV.Registry, bucket)
-  {:ok, "OK\r\n"}
-end
+  @doc """
+  Runs the given command.
+  """
+  def run(command, socket)

-def run({:get, bucket, key}) do
-  lookup(bucket, fn pid ->
-    value = KV.Bucket.get(pid, key)
-    {:ok, "#{value}\r\nOK\r\n"}
-  end)
-end
+  def run({:create, bucket}, socket) do
+    KV.create_bucket(bucket)
+    :gen_tcp.send(socket, "OK\r\n")
+    :ok
+  end

-def run({:put, bucket, key, value}) do
-  lookup(bucket, fn pid ->
-    KV.Bucket.put(pid, key, value)
-    {:ok, "OK\r\n"}
-  end)
-end
+  def run({:get, bucket, key}, socket) do
+    lookup(bucket, fn pid ->
+      value = KV.Bucket.get(pid, key)
+      :gen_tcp.send(socket, "#{value}\r\nOK\r\n")
+      :ok
+    end)
+  end

-def run({:delete, bucket, key}) do
-  lookup(bucket, fn pid ->
-    KV.Bucket.delete(pid, key)
-    {:ok, "OK\r\n"}
-  end)
-end
+  def run({:put, bucket, key, value}, socket) do
+    lookup(bucket, fn pid ->
+      KV.Bucket.put(pid, key, value)
+      :gen_tcp.send(socket, "OK\r\n")
+      :ok
+    end)
+  end

-defp lookup(bucket, callback) do
-  case KV.Registry.lookup(KV.Registry, bucket) do
-    {:ok, pid} -> callback.(pid)
-    :error -> {:error, :not_found}
+  def run({:delete, bucket, key}, socket) do
+    lookup(bucket, fn pid ->
+      KV.Bucket.delete(pid, key)
+      :gen_tcp.send(socket, "OK\r\n")
+      :ok
+    end)
+  end
+
+  defp lookup(bucket, callback) do
+    if bucket = KV.lookup_bucket(bucket) do
+      callback.(bucket)
+    else
+      {:error, :not_found}
+    end
   end
-end
```

-Every function clause dispatches the appropriate command to the `KV.Registry` server that we registered during the `:kv` application startup. 
Since our `:kv_server` depends on the `:kv` application, it is completely fine to depend on the services it provides. +Each function clause dispatches the appropriate command to the appropriate bucket. -You might have noticed we have a function head, `def run(command)`, without a body. In the [Modules and Functions](../getting-started/modules-and-functions.md#default-arguments) chapter, we learned that a bodiless function can be used to declare default arguments for a multi-clause function. Here is another use case where we use a function without a body to document what the arguments are. +You might have noticed we have a function head, `def run(command, socket)`, without a body. In the [Modules and Functions](../getting-started/modules-and-functions.md#default-arguments) chapter, we learned that a bodiless function can be used to declare default arguments for a multi-clause function. Here is another use case where we use a function without a body to document what the arguments are. -Note that we have also defined a private function named `lookup/2` to help with the common functionality of looking up a bucket and returning its `pid` if it exists, `{:error, :not_found}` otherwise. +We have also defined a private function named `lookup/2` to help with the common functionality of looking up a bucket and returning its `pid` if it exists, `{:error, :not_found}` otherwise. -By the way, since we are now returning `{:error, :not_found}`, we should amend the `write_line/2` function in `KVServer` to print such error as well: +By the way, since we are now returning `{:error, :not_found}`, we should amend the `write_line/2` function in `KV.Server` to print such error as well: ```elixir defp write_line(socket, {:error, :not_found}) do @@ -342,67 +350,57 @@ defp write_line(socket, {:error, :not_found}) do end ``` -Our server functionality is almost complete. Only tests are missing. This time, we have left tests for last because there are some important considerations to be made. 
- -`KVServer.Command.run/1`'s implementation is sending commands directly to the server named `KV.Registry`, which is registered by the `:kv` application. This means this server is global and if we have two tests sending messages to it at the same time, our tests will conflict with each other (and likely fail). We need to decide between having unit tests that are isolated and can run asynchronously, or writing integration tests that work on top of the global state, but exercise our application's full stack as it is meant to be exercised in production. - -So far we have only written unit tests, typically testing a single module directly. However, in order to make `KVServer.Command.run/1` testable as a unit we would need to change its implementation to not send commands directly to the `KV.Registry` process but instead pass a server as an argument. For example, we would need to change `run`'s signature to `def run(command, pid)` and then change all clauses accordingly: +Our server functionality is almost complete. Only tests are missing. -```elixir -def run({:create, bucket}, pid) do - KV.Registry.create(pid, bucket) - {:ok, "OK\r\n"} -end - -# ... other run clauses ... -``` +## Integration tests -Feel free to go ahead and do the changes above and write some unit tests. The idea is that your tests will start an instance of the `KV.Registry` and pass it as an argument to `run/2` instead of relying on the global `KV.Registry`. This has the advantage of keeping our tests asynchronous as there is no shared state. +`KV.Command.run/1`'s implementation is sending commands directly to the `KV` module, which is using a local registry to name processes. This means if we have two tests sending messages to the same bucket, our tests will conflict with each other (and likely fail). 
One might think this would be a reason to use mocks and other strategies to keep our tests isolated, but such techniques often make our testing environment too distant from how our code actually runs in production, and you may end up with bugs lurking.

-But let's also try something different. Let's write integration tests that rely on the global server names to exercise the whole stack from the TCP server to the bucket. Our integration tests will rely on global state and must be synchronous. With integration tests, we get coverage on how the components in our application work together at the cost of test performance. They are typically used to test the main flows in your application. For example, we should avoid using integration tests to test an edge case in our command parsing implementation.
+Luckily, there is a technique that we have been using throughout this guide that is equally applicable here: it is ok to rely on the local registry as long as each test uses unique names. Using a combination of the test module and test name is more than enough to guarantee that.

-Our integration test will use a TCP client that sends commands to our server and assert we are getting the desired responses.
+So let's write integration tests that rely on unique names to exercise the whole stack from the TCP server to the bucket.
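As a quick illustration of why the module-plus-test-name combination is unique enough, here is what it evaluates to (the `KV.ServerTest` module and test atom below are illustrative — any ExUnit test context has this shape):

```elixir
# Illustrative test context, mimicking what ExUnit passes to setup/2:
# the test module and the test name (an atom containing spaces).
config = %{module: KV.ServerTest, test: :"test server interaction"}

# Replace spaces so the name survives our space-delimited command parsing.
test_name = config.test |> Atom.to_string() |> String.replace(" ", "-")

IO.puts("#{config.module}-#{test_name}")
#=> Elixir.KV.ServerTest-test-server-interaction
```

Since ExUnit forbids two tests with the same name in one module, no two tests can ever produce the same bucket name, so async tests cannot collide.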
-Let's implement the integration test in `test/kv_server_test.exs` as shown below: +Create a new file at `test/kv/server_test.exs` as shown below: ```elixir -defmodule KVServerTest do - use ExUnit.Case +defmodule KV.ServerTest do + use ExUnit.Case, async: true - setup do - Application.stop(:kv) - :ok = Application.start(:kv) - end + @socket_options [:binary, packet: :line, active: false] - setup do - opts = [:binary, packet: :line, active: false] - {:ok, socket} = :gen_tcp.connect(~c"localhost", 4040, opts) - %{socket: socket} + setup config do + {:ok, socket} = :gen_tcp.connect(~c"localhost", 4040, @socket_options) + test_name = config.test |> Atom.to_string() |> String.replace(" ", "-") + %{socket: socket, name: "#{config.module}-#{test_name}"} end - test "server interaction", %{socket: socket} do - assert send_and_recv(socket, "UNKNOWN shopping\r\n") == - "UNKNOWN COMMAND\r\n" + test "server interaction", %{socket: socket, name: name} do + # CREATE + assert send_and_recv(socket, "CREATE #{name}\r\n") == "OK\r\n" - assert send_and_recv(socket, "GET shopping eggs\r\n") == - "NOT FOUND\r\n" + # PUT + assert send_and_recv(socket, "PUT #{name} eggs 3\r\n") == "OK\r\n" - assert send_and_recv(socket, "CREATE shopping\r\n") == - "OK\r\n" + # GET + assert send_and_recv(socket, "GET #{name} eggs\r\n") == "3\r\n" + assert send_and_recv(socket, "") == "OK\r\n" - assert send_and_recv(socket, "PUT shopping eggs 3\r\n") == - "OK\r\n" + # DELETE + assert send_and_recv(socket, "DELETE #{name} eggs\r\n") == "OK\r\n" - # GET returns two lines - assert send_and_recv(socket, "GET shopping eggs\r\n") == "3\r\n" + # GET + assert send_and_recv(socket, "GET #{name} eggs\r\n") == "\r\n" assert send_and_recv(socket, "") == "OK\r\n" + end - assert send_and_recv(socket, "DELETE shopping eggs\r\n") == - "OK\r\n" + test "unknown command", %{socket: socket} do + assert send_and_recv(socket, "WHATEVER\r\n") == + "UNKNOWN COMMAND\r\n" + end - # GET returns two lines - assert 
send_and_recv(socket, "GET shopping eggs\r\n") == "\r\n" - assert send_and_recv(socket, "") == "OK\r\n" + test "unknown bucket", %{socket: socket} do + assert send_and_recv(socket, "GET whatever eggs\r\n") == + "NOT FOUND\r\n" end defp send_and_recv(socket, command) do @@ -413,38 +411,16 @@ defmodule KVServerTest do end ``` -Our integration test checks all server interaction, including unknown commands and not found errors. It is worth noting that, as with ETS tables and linked processes, there is no need to close the socket. Once the test process exits, the socket is automatically closed. - -This time, since our test relies on global data, we have not given `async: true` to `use ExUnit.Case`. Furthermore, in order to guarantee our test is always in a clean state, we stop and start the `:kv` application before each test. In fact, stopping the `:kv` application even prints a warning on the terminal: - -```text -18:12:10.698 [info] Application kv exited: :stopped -``` +Run `mix test` and the tests should all pass. However, make sure to terminate any `iex -S mix` session you may have running, as currently tests and development environment are running on the same port (4040). We will address it in the next chapter. -To avoid printing log messages during tests, ExUnit provides a neat feature called `:capture_log`. By setting `@tag :capture_log` before each test or `@moduletag :capture_log` for the whole test module, ExUnit will automatically capture anything that is logged while the test runs. In case our test fails, the captured logs will be printed alongside the ExUnit report. +We added three tests, the first one tests most bucket actions, while the other two deal with error cases. Given there is a lot of shared setup across these tests, we used the `setup/2` macro to deal with common boilerplate. The macro receives the same *test context* as tests and starts a client TCP connection per test. 
It also defines a unique bucket name using the module name and the test name, making sure any space in the test name is replaced by `-` so as not to interfere with our command parsing logic.

-Between `use ExUnit.Case` and `setup`, add the following call:
+Then, in each test, we pattern matched on the *test context*, extracting the socket or name as necessary. This is similar to the code we wrote in `test/kv/bucket_test.exs`:

```elixir
-@moduletag :capture_log
+  test "stores values by key on a named process", config do
```

-In case the test crashes, you will see a report as follows:
-
-```text
-  1) test server interaction (KVServerTest)
-     test/kv_server_test.exs:17
-     ** (RuntimeError) oops
-     stacktrace:
-       test/kv_server_test.exs:29
-
-     The following output was logged:
-
-     13:44:10.035 [notice] Application kv exited: :stopped
-```
-
-With this simple integration test, we start to see why integration tests may be slow. Not only can this test not run asynchronously, but it also requires the expensive setup of stopping and starting the `:kv` application.
-
-At the end of the day, it is up to you and your team to figure out the best testing strategy for your applications. You need to balance code quality, confidence, and test suite runtime. For example, we may start with testing the server only with integration tests, but if the server continues to grow in future releases, or it becomes a part of the application with frequent bugs, it is important to consider breaking it apart and writing more intensive unit tests that don't have the weight of an integration test.
+Except back then we matched on the whole config and, this time around, we matched only on the data we needed.

-Let's move to the next chapter. We will finally make our system distributed by adding a bucket routing mechanism. We will use this opportunity to also improve our testing chops.
+Let's move to the next chapter.
We will finally make our system distributed by adding a tiny bit of configuration and, *spoiler alert*, changing one line of code. diff --git a/lib/elixir/pages/mix-and-otp/dynamic-supervisor.md b/lib/elixir/pages/mix-and-otp/dynamic-supervisor.md index 8c83e4bb783..78ea475582d 100644 --- a/lib/elixir/pages/mix-and-otp/dynamic-supervisor.md +++ b/lib/elixir/pages/mix-and-otp/dynamic-supervisor.md @@ -5,161 +5,192 @@ # Supervising dynamic children -We have now successfully defined our supervisor which is automatically started (and stopped) as part of our application life cycle. +We have successfully learned how our supervision tree is automatically started (and stopped) as part of our application's life cycle. We can also name our buckets via the `:name` option. We also learned that, in practice, we should always start new processes inside supervisors. Let's apply these insights by ensuring our buckets are named and supervised. -Remember, however, that our `KV.Registry` is both linking (via `start_link`) and monitoring (via `monitor`) bucket processes in the `handle_cast/2` callback: +## Child specs + +Supervisors know how to start processes because they are given "child specifications". In our `lib/kv.ex` file, we defined a list of children with a single child spec: ```elixir -{:ok, bucket} = KV.Bucket.start_link([]) -ref = Process.monitor(bucket) + children = [ + {Registry, name: KV, keys: :unique} + ] ``` -Links are bidirectional, which implies that a crash in a bucket will crash the registry. Although we now have the supervisor, which guarantees the registry will be back up and running, crashing the registry still means we lose all data associating bucket names to their respective processes. - -In other words, we want the registry to keep on running even if a bucket crashes. 
Let's write a new registry test:
+When the child specification is a tuple (as above) or a module, it is equivalent to calling the `child_spec/1` function on said module, which then returns the full specification. The pair above is equivalent to:

```elixir
-test "removes bucket on crash", %{registry: registry} do
-  KV.Registry.create(registry, "shopping")
-  {:ok, bucket} = KV.Registry.lookup(registry, "shopping")
-
-  # Stop the bucket with non-normal reason
-  Agent.stop(bucket, :shutdown)
-  assert KV.Registry.lookup(registry, "shopping") == :error
-end
+iex> Registry.child_spec(name: KV, keys: :unique)
+%{
+  id: KV,
+  start: {Registry, :start_link, [[name: KV, keys: :unique]]},
+  type: :supervisor
+}
```

-The test is similar to "removes bucket on exit" except that we are being a bit more harsh by sending `:shutdown` as the exit reason instead of `:normal`. If a process terminates with a reason other than `:normal`, all linked processes receive an EXIT signal, causing the linked process to also terminate unless it is trapping exits.
+The underlying map contains the `:id` (required), the module-function-args triplet to invoke to start the process (required), and the type of the process (optional), among other optional keys. In other words, the `child_spec/1` function allows us to compose and encapsulate specifications in modules.

-Since the bucket terminated, the registry also stopped, and our test fails when trying to `GenServer.call/3` it:
+Therefore, if we want to supervise `KV.Bucket`, we only need to define a `child_spec/1` function. Luckily for us, whenever we invoke `use Agent` (or `use GenServer` or `use Supervisor` and so forth), an implementation with reasonable defaults is provided. So let's take it for a spin.
Back on `iex -S mix`, try this: -```text - 1) test removes bucket on crash (KV.RegistryTest) - test/kv/registry_test.exs:26 - ** (exit) exited in: GenServer.call(#PID<0.148.0>, {:lookup, "shopping"}, 5000) - ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started - code: assert KV.Registry.lookup(registry, "shopping") == :error - stacktrace: - (elixir) lib/gen_server.ex:770: GenServer.call/3 - test/kv/registry_test.exs:33: (test) +```elixir +iex> KV.Bucket.child_spec([]) +%{id: KV.Bucket, start: {KV.Bucket, :start_link, [[]]}} +iex> KV.Bucket.child_spec([name: :shopping]) +%{id: KV.Bucket, start: {KV.Bucket, :start_link, [[name: :shopping]]}} ``` -We are going to solve this issue by defining a new supervisor that will spawn and supervise all buckets. Opposite to the previous Supervisor we defined, the children are not known upfront, but they are rather started dynamically. For those situations, we use a supervisor optimized to such use cases called `DynamicSupervisor`. The `DynamicSupervisor` does not expect a list of children during initialization; instead each child is started manually via `DynamicSupervisor.start_child/2`. +Let's try to start it as part of a supervisor then, using the `{module, options}` format to pass the bucket name (let's also use an atom as the name for convenience): -## The bucket supervisor +```elixir +iex> children = [{KV.Bucket, name: :shopping}] +iex> Supervisor.start_link(children, strategy: :one_for_one) +iex> KV.Bucket.put(:shopping, "milk", 1) +:ok +iex> KV.Bucket.get(:shopping, "milk") +1 +``` -Since a `DynamicSupervisor` does not define any children during initialization, the `DynamicSupervisor` also allows us to skip the work of defining a whole separate module with the usual `start_link` function and the `init` callback. 
Instead, we can define a `DynamicSupervisor` directly in the supervision tree, by giving it a name and a strategy.
+What happens now if we explicitly kill the bucket process?
+
+```elixir
+# Find the pid for the given name
+iex> pid = Process.whereis(:shopping)
+#PID<0.48.0>
+# Send it a kill exit signal
+iex> Process.exit(pid, :kill)
+true
+# But a new process is alive in its place
+iex> Process.whereis(:shopping)
+#PID<0.50.0>
+```

-Open up `lib/kv/supervisor.ex` and add the dynamic supervisor as a child as follows:
+Our buckets can now be supervised, so it is time to hook them into our supervision tree.
+
+## Dynamic supervisors
+
+Given our buckets can already be supervised, you may be thinking of starting them as part of our application's `start/2` callback, such as:

```elixir
- def init(:ok) do
    children = [
-     {KV.Registry, name: KV.Registry},
-     {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one}
+     {Registry, name: KV, keys: :unique},
+     {KV.Bucket, name: {:via, Registry, {KV, "shopping"}}}
    ]
-
-   Supervisor.init(children, strategy: :one_for_one)
- end
```

-Remember that the name of a process can be any atom. So far, we have named processes with the same name as the modules that define their implementation. For example, the process defined by `KV.Registry` was given a process name of `KV.Registry`. This is simply a convention: If later there is an error in your system that says, "process named KV.Registry crashed with reason", we know exactly where to investigate.
-
-In this case, there is no module, so we picked the name `KV.BucketSupervisor`. It could have been any other name. We also chose the `:one_for_one` strategy, which is currently the only available strategy for dynamic supervisors.
+And while the above would definitely work, it comes with a huge caveat: it only starts a single bucket. In practice, we want the user to be able to create new buckets at any time. In other words, we need to start and supervise processes dynamically.
-Run `iex -S mix` so we can give our dynamic supervisor a try: +While the `Supervisor` module has APIs for starting children after its initialization, it was not designed or optimized for the use case of having potentially millions of children. For this purpose, Elixir instead provides the `DynamicSupervisor` module. Using it is quite similar to `Supervisor` except that, instead of specifying the children during start, you do it afterwards. Let's take it for a spin: ```elixir -iex> {:ok, bucket} = DynamicSupervisor.start_child(KV.BucketSupervisor, KV.Bucket) -{:ok, #PID<0.72.0>} -iex> KV.Bucket.put(bucket, "eggs", 3) +iex> {:ok, sup_pid} = DynamicSupervisor.start_link(strategy: :one_for_one) +iex> DynamicSupervisor.start_child(sup_pid, {KV.Bucket, name: :another_list}) +iex> KV.Bucket.put(:another_list, "milk", 1) :ok -iex> KV.Bucket.get(bucket, "eggs") -3 +iex> KV.Bucket.get(:another_list, "milk") +1 ``` -`DynamicSupervisor.start_child/2` expects the name of the supervisor and the child specification of the child to be started. - -The last step is to change the registry to use the dynamic supervisor: +And it all works as expected. 
In fact, we can even name `DynamicSupervisor`s themselves, instead of passing PIDs around, and use them to start buckets that are named via the registry:

```elixir
-  def handle_cast({:create, name}, {names, refs}) do
-    if Map.has_key?(names, name) do
-      {:noreply, {names, refs}}
-    else
-      {:ok, pid} = DynamicSupervisor.start_child(KV.BucketSupervisor, KV.Bucket)
-      ref = Process.monitor(pid)
-      refs = Map.put(refs, ref, name)
-      names = Map.put(names, name, pid)
-      {:noreply, {names, refs}}
-    end
-  end
+iex> DynamicSupervisor.start_link(strategy: :one_for_one, name: :dyn_sup)
+iex> name = {:via, Registry, {KV, "yet_another_list"}}
+iex> DynamicSupervisor.start_child(:dyn_sup, {KV.Bucket, name: name})
+iex> KV.Bucket.put(name, "milk", 1)
+:ok
+iex> KV.Bucket.get(name, "milk")
+1
```

-That's enough for our tests to pass but there is a resource leakage in our application. When a bucket terminates, the supervisor will start a new bucket in its place. After all, that's the role of the supervisor!
+Overall, processes can be named and supervised, regardless of whether they are supervisors, agents, etc., since all of Elixir's standard library was designed around those capabilities.

-However, when the supervisor restarts the new bucket, the registry does not know about it. So we will have an empty bucket in the supervisor that nobody can access! To solve this, we want to say that buckets are actually temporary. If they crash, regardless of the reason, they should not be restarted.
-
-We can do this by passing the `restart: :temporary` option to `use Agent` in `KV.Bucket`:
+With all ingredients in place to supervise and name buckets, open up the `lib/kv.ex` module and let's add two new functions: `KV.create_bucket/1`, which creates a bucket with the given name, and `KV.lookup_bucket/1`, which returns the bucket registered under that name:

```elixir
-defmodule KV.Bucket do
-  use Agent, restart: :temporary
-```
+defmodule KV do
+  use Application

-Let's also add a test to `test/kv/bucket_test.exs` that guarantees the bucket is temporary:
+  @impl true
+  def start(_type, _args) do
+    children = [
+      {Registry, name: KV, keys: :unique},
+      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one}
+    ]

-```elixir
-  test "are temporary workers" do
-    assert Supervisor.child_spec(KV.Bucket, []).restart == :temporary
+    Supervisor.start_link(children, strategy: :one_for_one)
   end
-```

+  @doc """
+  Creates a bucket with the given name.
+  """
+  def create_bucket(name) do
+    DynamicSupervisor.start_child(KV.BucketSupervisor, {KV.Bucket, name: via(name)})
+  end

-Our test uses the `Supervisor.child_spec/2` function to retrieve the child specification out of a module and then assert its restart value is `:temporary`. At this point, you may be wondering why use a supervisor if it never restarts its children. It happens that supervisors provide more than restarts, they are also responsible for guaranteeing proper startup and shutdown, especially in case of crashes in a supervision tree.
-
-## Supervision trees

+  @doc """
+  Looks up the given bucket.
+  """
+  def lookup_bucket(name) do
+    GenServer.whereis(via(name))
+  end

-When we added `KV.BucketSupervisor` as a child of `KV.Supervisor`, we began to have supervisors that supervise other supervisors, forming so-called "supervision trees".

-Every time you add a new child to a supervisor, it is important to evaluate if the supervisor strategy is correct as well as the order of child processes.
In this case, we are using `:one_for_one` and the `KV.Registry` is started before `KV.BucketSupervisor`.
+  defp via(name), do: {:via, Registry, {KV, name}}
+end
+```

-One flaw that shows up right away is the ordering issue. Since `KV.Registry` invokes `KV.BucketSupervisor`, then the `KV.BucketSupervisor` must be started before `KV.Registry`. Otherwise, it may happen that the registry attempts to reach the bucket supervisor before it has started.
+The code is relatively simple. First we changed `start/2` to also start a dynamic supervisor named `KV.BucketSupervisor`. Then we implemented `KV.create_bucket/1`, which receives a bucket name and starts a bucket using our registry and dynamic supervisor. And we also added `KV.lookup_bucket/1`, which receives the same name and attempts to find its PID.

-The second flaw is related to the supervision strategy. If `KV.Registry` dies, all information linking `KV.Bucket` names to bucket processes is lost. Therefore the `KV.BucketSupervisor` and all children must terminate too - otherwise we will have orphan processes.
+To make sure it all works as expected, let's write a test. Open up `test/kv_test.exs` and add this:

-In light of this observation, we should consider moving to another supervision strategy. The two other candidates are `:one_for_all` and `:rest_for_one`. A supervisor using the `:rest_for_one` strategy will kill and restart child processes which were started *after* the crashed child. In this case, we would want `KV.BucketSupervisor` to terminate if `KV.Registry` terminates. This would require the bucket supervisor to be placed after the registry which violates the ordering constraints we have established two paragraphs above.

```elixir
+defmodule KVTest do
+  use ExUnit.Case, async: true

-So our last option is to go all in and pick the `:one_for_all` strategy: the supervisor will kill and restart all of its children processes whenever any one of them dies.
This is a completely reasonable approach for our application, since the registry can't work without the bucket supervisor, and the bucket supervisor should terminate without the registry. Let's reimplement `init/1` in `KV.Supervisor` to encode those properties:
+  test "creates and looks up buckets by any name" do
+    name = "a unique name that won't be shared"
+    assert is_nil(KV.lookup_bucket(name))

```elixir
-  def init(:ok) do
-    children = [
-      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one},
-      {KV.Registry, name: KV.Registry}
-    ]
+    assert {:ok, bucket} = KV.create_bucket(name)
+    assert KV.lookup_bucket(name) == bucket

-    Supervisor.init(children, strategy: :one_for_all)
+    assert KV.create_bucket(name) == {:error, {:already_started, bucket}}
   end
+end
```

-There are two topics left before we move on to the next chapter.
+The test shows we are creating and locating buckets with any name, making sure we use a unique name to avoid conflicts between tests.
+
+## The `start_supervised` test helper
+
+Before we move on, let's do some cleanup.

-## Shared state in tests
+In `test/kv/bucket_test.exs`, we explicitly invoked `KV.Bucket.start_link/1` to start our buckets. However, we now know that we should avoid calling `start_link/1` directly and instead start processes as part of supervision trees.

-So far we have been starting one registry per test to ensure they are isolated:
+In order to aid testing, `ExUnit` already starts a supervision tree per test and provides the `start_supervised` function to start processes within a test-specific supervision tree. One advantage of this approach is that `ExUnit` guarantees any started process is shut down at the end of the test too.
Let's rewrite our tests to use it instead: ```elixir -setup do - registry = start_supervised!(KV.Registry) - %{registry: registry} -end -``` +defmodule KV.BucketTest do + use ExUnit.Case, async: true + + test "stores values by key" do + {:ok, bucket} = start_supervised(KV.Bucket) + assert KV.Bucket.get(bucket, "milk") == nil -Since we have changed our registry to use `KV.BucketSupervisor`, our tests are now relying on this shared supervisor even though each test has its own registry. The question is: should we? + KV.Bucket.put(bucket, "milk", 3) + assert KV.Bucket.get(bucket, "milk") == 3 + end + + test "stores values by key on a named process", config do + {:ok, _} = start_supervised({KV.Bucket, name: config.test}) + assert KV.Bucket.get(config.test, "milk") == nil -It depends. It is ok to rely on shared state as long as we depend only on a non-shared partition of this state. Although multiple registries may start buckets on the shared bucket supervisor, those buckets and registries are isolated from each other. We would only run into concurrency issues if we used a function like `DynamicSupervisor.count_children(KV.BucketSupervisor)` which would count all buckets from all registries, potentially giving different results when tests run concurrently. + KV.Bucket.put(config.test, "milk", 3) + assert KV.Bucket.get(config.test, "milk") == 3 + end +end +``` -Since we have relied only on a non-shared partition of the bucket supervisor so far, we don't need to worry about concurrency issues in our test suite. In case it ever becomes a problem, we can start a supervisor per test and pass it as an argument to the registry `start_link` function. +It is a small change, but our tests are now using all of the relevant best practices. Excellent! ## Observer @@ -174,13 +205,10 @@ iex> :observer.start() > When running `iex` inside a project with `iex -S mix`, `observer` won't be available as a dependency. 
To do so, you will need to call the following functions before:
>
> ```elixir
-> iex> Mix.ensure_application!(:wx) # Not necessary on Erlang/OTP 27+
-> iex> Mix.ensure_application!(:runtime_tools) # Not necessary on Erlang/OTP 27+
-> iex> Mix.ensure_application!(:observer)
> iex> :observer.start()
> ```
>
-> If any of the calls above fail, here is what may have happened: some package managers default to installing a minimized Erlang without WX bindings for GUI support. In some package managers, you may be able to replace the headless Erlang with a more complete package (look for packages named `erlang` vs `erlang-nox` on Debian/Ubuntu/Arch). In others managers, you may need to install a separate `erlang-wx` (or similarly named) package.
+> If the call above fails, here is what may have happened: some package managers default to installing a minimized Erlang without WX bindings for GUI support. In some package managers, you may be able to replace the headless Erlang with a more complete package (look for packages named `erlang` vs `erlang-nox` on Debian/Ubuntu/Arch). In other managers, you may need to install a separate `erlang-wx` (or similarly named) package.
>
> There are conversations to improve this experience in future releases.

@@ -193,12 +221,12 @@ In the Applications tab, you will see all applications currently running in your

Not only that, as you create new buckets on the terminal, you should see new processes spawned in the supervision tree shown in Observer:

```elixir
-iex> KV.Registry.create(KV.Registry, "shopping")
-:ok
+iex> KV.lookup_bucket("shopping")
+#PID<0.89.0>
```

We will leave it up to you to further explore what Observer provides. Note you can double-click any process in the supervision tree to retrieve more information about it, as well as right-click a process to send "a kill signal", a perfect way to emulate failures and see if your supervisor reacts as expected.
At the end of the day, tools like Observer are one of the reasons you want to always start processes inside supervision trees, even if they are temporary, to ensure they are always reachable and introspectable. -Now that our buckets are properly linked and supervised, let's see how we can speed things up. +Now that our buckets are named and supervised, we are ready to start our server and start receiving requests. diff --git a/lib/elixir/pages/mix-and-otp/erlang-term-storage.md b/lib/elixir/pages/mix-and-otp/erlang-term-storage.md deleted file mode 100644 index 6d5a7db9ea1..00000000000 --- a/lib/elixir/pages/mix-and-otp/erlang-term-storage.md +++ /dev/null @@ -1,276 +0,0 @@ - - -# Speeding up with ETS - -Every time we need to look up a bucket, we need to send a message to the registry. In case our registry is being accessed concurrently by multiple processes, the registry may become a bottleneck! - -In this chapter, we will learn about ETS (Erlang Term Storage) and how to use it as a cache mechanism. - -> Warning! Don't use ETS as a cache prematurely! Log and analyze your application performance and identify which parts are bottlenecks, so you know *whether* you should cache, and *what* you should cache. This chapter is merely an example of how ETS can be used, once you've determined the need. - -## ETS as a cache - -ETS allows us to store any Elixir term in an in-memory table. Working with ETS tables is done via [Erlang's `:ets` module](`:ets`): - -```elixir -iex> table = :ets.new(:buckets_registry, [:set, :protected]) -#Reference<0.1885502827.460455937.234656> -iex> :ets.insert(table, {"foo", self()}) -true -iex> :ets.lookup(table, "foo") -[{"foo", #PID<0.41.0>}] -``` - -When creating an ETS table, two arguments are required: the table name and a set of options. From the available options, we passed the table type and its access rules. We have chosen the `:set` type, which means that keys cannot be duplicated. 
We've also set the table's access to `:protected`, meaning only the process that created the table can write to it, but all processes can read from it. The possible access controls: - - `:public` — Read/Write available to all processes. - - `:protected` — Read available to all processes. Only writable by owner process. This is the default. - - `:private` — Read/Write limited to owner process. - -Be aware that if your Read/Write call violates the access control, the operation will raise `ArgumentError`. Finally, since `:set` and `:protected` are the default values, we will skip them from now on. - -ETS tables can also be named, allowing us to access them by a given name: - -```elixir -iex> :ets.new(:buckets_registry, [:named_table]) -:buckets_registry -iex> :ets.insert(:buckets_registry, {"foo", self()}) -true -iex> :ets.lookup(:buckets_registry, "foo") -[{"foo", #PID<0.41.0>}] -``` - -Let's change the `KV.Registry` to use ETS tables. The first change is to modify our registry to require a name argument, we will use it to name the ETS table and the registry process itself. ETS names and process names are stored in different locations, so there is no chance of conflicts. - -Open up `lib/kv/registry.ex`, and let's change its implementation. We've added comments to the source code to highlight the changes we've made: - -```elixir -defmodule KV.Registry do - use GenServer - - ## Client API - - @doc """ - Starts the registry with the given options. - - `:name` is always required. - """ - def start_link(opts) do - # 1. Pass the name to GenServer's init - server = Keyword.fetch!(opts, :name) - GenServer.start_link(__MODULE__, server, opts) - end - - @doc """ - Looks up the bucket pid for `name` stored in `server`. - - Returns `{:ok, pid}` if the bucket exists, `:error` otherwise. - """ - def lookup(server, name) do - # 2. 
Lookup is now done directly in ETS, without accessing the server - case :ets.lookup(server, name) do - [{^name, pid}] -> {:ok, pid} - [] -> :error - end - end - - @doc """ - Ensures there is a bucket associated with the given `name` in `server`. - """ - def create(server, name) do - GenServer.cast(server, {:create, name}) - end - - ## Server callbacks - - @impl true - def init(table) do - # 3. We have replaced the names map by the ETS table - names = :ets.new(table, [:named_table, read_concurrency: true]) - refs = %{} - {:ok, {names, refs}} - end - - # 4. The previous handle_call callback for lookup was removed - - @impl true - def handle_cast({:create, name}, {names, refs}) do - # 5. Read and write to the ETS table instead of the map - case lookup(names, name) do - {:ok, _pid} -> - {:noreply, {names, refs}} - - :error -> - {:ok, pid} = DynamicSupervisor.start_child(KV.BucketSupervisor, KV.Bucket) - ref = Process.monitor(pid) - refs = Map.put(refs, ref, name) - :ets.insert(names, {name, pid}) - {:noreply, {names, refs}} - end - end - - @impl true - def handle_info({:DOWN, ref, :process, _pid, _reason}, {names, refs}) do - # 6. Delete from the ETS table instead of the map - {name, refs} = Map.pop(refs, ref) - :ets.delete(names, name) - {:noreply, {names, refs}} - end - - @impl true - def handle_info(_msg, state) do - {:noreply, state} - end -end -``` - -Notice that before our changes `KV.Registry.lookup/2` sent requests to the server, but now it reads directly from the ETS table, which is shared across all processes. That's the main idea behind the cache mechanism we are implementing. - -In order for the cache mechanism to work, the created ETS table needs to have access `:protected` (the default), so all clients can read from it, while only the `KV.Registry` process writes to it. We have also set `read_concurrency: true` when starting the table, optimizing the table for the common scenario of concurrent read operations. 
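The access rules described above are easy to verify directly. The following sketch (the table name `:access_demo` and the values are made up for illustration) shows a read succeeding from another process, while a write from a non-owner process raises `ArgumentError`:

```elixir
# A :protected table: any process may read, only the owner may write.
:ets.new(:access_demo, [:set, :protected, :named_table, read_concurrency: true])
true = :ets.insert(:access_demo, {"foo", 1})

parent = self()

# Reads from another process succeed.
spawn(fn -> send(parent, {:read, :ets.lookup(:access_demo, "foo")}) end)
{:read, [{"foo", 1}]} = (receive do msg -> msg end)

# Writes from another process violate :protected and raise ArgumentError.
spawn(fn ->
  try do
    :ets.insert(:access_demo, {"bar", 2})
    send(parent, :write_succeeded)
  rescue
    ArgumentError -> send(parent, :write_denied)
  end
end)
:write_denied = (receive do msg -> msg end)
```

Note the table is owned by the process that called `:ets.new/2`; when the owner exits, the table is deleted.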
- -The changes we have performed above have broken our tests because the registry requires the `:name` option when starting up. Furthermore, some registry operations such as `lookup/2` require the name to be given as an argument, instead of a PID, so we can do the ETS table lookup. Let's change the setup function in `test/kv/registry_test.exs` to fix both issues: - -```elixir - setup context do - _ = start_supervised!({KV.Registry, name: context.test}) - %{registry: context.test} - end -``` - -Since each test has a unique name, we use the test name to name our registries. This way, we no longer need to pass the registry PID around, instead we identify it by the test name. Also note we assigned the result of `start_supervised!` to underscore (`_`). This idiom is often used to signal that we are not interested in the result of `start_supervised!`. - -Once we change `setup`, some tests will continue to fail. You may even notice tests pass and fail inconsistently between runs. For example, the "spawns buckets" test: - -```elixir -test "spawns buckets", %{registry: registry} do - assert KV.Registry.lookup(registry, "shopping") == :error - - KV.Registry.create(registry, "shopping") - assert {:ok, bucket} = KV.Registry.lookup(registry, "shopping") - - KV.Bucket.put(bucket, "milk", 1) - assert KV.Bucket.get(bucket, "milk") == 1 -end -``` - -may be failing on this line: - -```elixir -{:ok, bucket} = KV.Registry.lookup(registry, "shopping") -``` - -How can this line fail if we just created the bucket in the previous line? - -The reason those failures are happening is because, for educational purposes, we have made two mistakes: - - 1. We are prematurely optimizing (by adding this cache layer) - 2. We are using `cast/2` (while we should be using `call/2`) - -## Race conditions? - -Developing in Elixir does not make your code free of race conditions. However, Elixir's abstractions where nothing is shared by default make it easier to spot a race condition's root cause. 
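The interplay between asynchronous casts and synchronous calls can be seen in isolation with a minimal sketch (the `Counter` module below is invented for illustration, not part of the application). A cast returns before the server handles it, but because a GenServer processes its mailbox in order, a later call only returns once every earlier message has been handled:

```elixir
defmodule Counter do
  use GenServer

  # Client API: an asynchronous and a synchronous increment.
  def start_link(), do: GenServer.start_link(__MODULE__, 0)
  def increment_async(pid), do: GenServer.cast(pid, :increment)
  def increment_sync(pid), do: GenServer.call(pid, :increment)
  def value(pid), do: GenServer.call(pid, :value)

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:increment, _from, count), do: {:reply, :ok, count + 1}
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

{:ok, pid} = Counter.start_link()

# The cast returns immediately, before the server handles it...
Counter.increment_async(pid)

# ...but the mailbox is processed in order, so by the time this
# call returns, the earlier cast has been handled too.
:ok = Counter.increment_sync(pid)
2 = Counter.value(pid)
```

This ordering guarantee is exactly what a synchronous request provides and an asynchronous one does not.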
- -What is happening in our tests is that there is a delay in between an operation and the time we can observe this change in the ETS table. Here is what we were expecting to happen: - -1. We invoke `KV.Registry.create(registry, "shopping")` -2. The registry creates the bucket and updates the cache table -3. We access the information from the table with `KV.Registry.lookup(registry, "shopping")` -4. The command above returns `{:ok, bucket}` - -However, since `KV.Registry.create/2` is a cast operation, the command will return before we actually write to the table! In other words, this is happening: - -1. We invoke `KV.Registry.create(registry, "shopping")` -2. We access the information from the table with `KV.Registry.lookup(registry, "shopping")` -3. The command above returns `:error` -4. The registry creates the bucket and updates the cache table - -To fix the failure we need to make `KV.Registry.create/2` synchronous by using `call/2` rather than `cast/2`. This will guarantee that the client will only continue after changes have been made to the table. Let's back to `lib/kv/registry.ex` and change the function and its callback as follows: - -```elixir -def create(server, name) do - GenServer.call(server, {:create, name}) -end -``` - -```elixir -@impl true -def handle_call({:create, name}, _from, {names, refs}) do - case lookup(names, name) do - {:ok, pid} -> - {:reply, pid, {names, refs}} - - :error -> - {:ok, pid} = DynamicSupervisor.start_child(KV.BucketSupervisor, KV.Bucket) - ref = Process.monitor(pid) - refs = Map.put(refs, ref, name) - :ets.insert(names, {name, pid}) - {:reply, pid, {names, refs}} - end -end -``` - -We changed the callback from `handle_cast/2` to `handle_call/3` and changed it to reply with the PID of the created bucket. Generally speaking, Elixir developers prefer to use `call/2` instead of `cast/2` as it also provides back-pressure — you block until you get a reply. 
Using `cast/2` when not necessary can also be considered a premature optimization. - -Let's run the tests once again. This time though, we will pass the `--trace` option: - -```console -$ mix test --trace -``` - -The `--trace` option is useful when your tests are deadlocking or there are race conditions, as it runs all tests synchronously (`async: true` has no effect) and shows detailed information about each test. If you run the tests multiple times you may see this intermittent failure: - -```text - 1) test removes buckets on exit (KV.RegistryTest) - test/kv/registry_test.exs:19 - Assertion with == failed - code: assert KV.Registry.lookup(registry, "shopping") == :error - left: {:ok, #PID<0.109.0>} - right: :error - stacktrace: - test/kv/registry_test.exs:23 -``` - -According to the failure message, we are expecting that the bucket no longer exists on the table, but it still does! This problem is the opposite of the one we have just solved: while previously there was a delay between the command to create a bucket and updating the table, now there is a delay between the bucket process dying and its entry being removed from the table. Since this is a race condition, you may not be able to reproduce it on your machine, but it is there. - -Last time we fixed the race condition by replacing the asynchronous operation, a `cast`, by a `call`, which is synchronous. Unfortunately, the `handle_info/2` callback we are using to receive the `:DOWN` message and delete the entry from the ETS table does not have a synchronous equivalent. This time, we need to find a way to guarantee the registry has processed the `:DOWN` notification sent when the bucket process terminated. - -An easy way to do so is by sending a synchronous request to the registry before we do the bucket lookup. The `Agent.stop/2` operation is synchronous and only returns after the bucket process terminates. 
Therefore, once `Agent.stop/2` returns, the registry has received the `:DOWN` message but it may not have processed it yet. In order to guarantee the processing of the `:DOWN` message, we can do a synchronous request. Since messages are processed in order, once the registry replies to the synchronous request, then the `:DOWN` message will definitely have been processed. - -Let's do so by creating a "bogus" bucket, which is a synchronous request, after `Agent.stop/2` in both "remove" tests at `test/kv/registry_test.exs`: - -```elixir - test "removes buckets on exit", %{registry: registry} do - KV.Registry.create(registry, "shopping") - {:ok, bucket} = KV.Registry.lookup(registry, "shopping") - Agent.stop(bucket) - - # Do a call to ensure the registry processed the DOWN message - _ = KV.Registry.create(registry, "bogus") - assert KV.Registry.lookup(registry, "shopping") == :error - end - - test "removes bucket on crash", %{registry: registry} do - KV.Registry.create(registry, "shopping") - {:ok, bucket} = KV.Registry.lookup(registry, "shopping") - - # Stop the bucket with non-normal reason - Agent.stop(bucket, :shutdown) - - # Do a call to ensure the registry processed the DOWN message - _ = KV.Registry.create(registry, "bogus") - assert KV.Registry.lookup(registry, "shopping") == :error - end -``` - -Our tests should now (always) pass! - -This concludes our optimization chapter. We have used ETS as a cache mechanism where reads can happen from any processes but writes are still serialized through a single process. More importantly, we have also learned that once data can be read asynchronously, we need to be aware of the race conditions it might introduce. - -In practice, if you find yourself in a position where you need a registry for dynamic processes, you should use the `Registry` module provided as part of Elixir. 
It provides functionality similar to the one we have built using a GenServer + `:ets` while also being able to perform both writes and reads concurrently. [It has been benchmarked to scale across all cores even on machines with 40 cores](https://elixir-lang.org/blog/2017/01/05/elixir-v1-4-0-released/).
-
-Next, let's discuss external and internal dependencies and how Mix helps us manage large codebases.
diff --git a/lib/elixir/pages/mix-and-otp/genservers.md b/lib/elixir/pages/mix-and-otp/genservers.md
index ecbc405a892..30fcf2a7ce7 100644
--- a/lib/elixir/pages/mix-and-otp/genservers.md
+++ b/lib/elixir/pages/mix-and-otp/genservers.md
@@ -3,46 +3,70 @@
 SPDX-FileCopyrightText: 2021 The Elixir Team
 -->
 
-# Client-server communication with GenServer
+# Client-server with GenServer
 
-In the [previous chapter](agents.md), we used agents to represent our buckets. In the [introduction to mix](introduction-to-mix.md), we specified we would like to name each bucket so we can do the following:
+To wrap up our distributed key-value store, we will implement a feature where a client can subscribe to a bucket and receive real-time notifications of any modification happening in the bucket, regardless of where in the cluster the bucket is located.
 
-```elixir
-CREATE shopping
-OK
-
-PUT shopping milk 1
-OK
+We will do so by adding a new command, called SUBSCRIBE, to be used like this:
 
-GET shopping milk
-1
-OK
+```text
+SUBSCRIBE shopping
+milk SET TO 1
+eggs SET TO 10
+milk DELETED
 ```
 
-In the session above we interacted with the "shopping" bucket.
+To make this work, we must change our `KV.Bucket` implementation to track subscriptions and emit broadcasts. However, as we will see, we cannot implement such a feature on top of agents, and we will need to rewrite our bucket implementation to a `GenServer`.
+
+## Links and monitors
+
+Processes in Elixir are isolated. When they need to communicate, they do so by sending messages. 
However, how do you know when a process terminates, either because it has completed or due to a crash? -Since agents are processes, each bucket has a process identifier (PID), but buckets do not have a name. Back [in the Process chapter](../getting-started/processes.md), we have learned that we can register processes in Elixir by giving them atom names: +We have two options: links and monitors. + +We have used links extensively. Whenever we started a process, we typically did so by using `start_link` or similar. The idea behind links is that, if any of the processes crash, the other will crash due to the link. We talked about them in the [Process chapter of the Getting Started guide](../getting-started/processes.md). Here is a refresher: ```elixir -iex> Agent.start_link(fn -> %{} end, name: :shopping) -{:ok, #PID<0.43.0>} -iex> KV.Bucket.put(:shopping, "milk", 1) -:ok -iex> KV.Bucket.get(:shopping, "milk") -1 +iex> self() +#PID<0.115.0> +iex> spawn_link(fn -> :nothing_bad_will_happen end) +#PID<0.116.0> +iex> self() +#PID<0.115.0> ``` -However, naming dynamic processes with atoms is a terrible idea! If we use atoms, we would need to convert the bucket name (often received from an external client) to atoms, and **we should never convert user input to atoms**. This is because atoms are not garbage collected. Once an atom is created, it is never reclaimed. Generating atoms from user input would mean the user can inject enough different names to exhaust our system memory! +```elixir +iex> spawn_link(fn -> raise "oops" end) +#PID<0.117.0> -In practice, it is more likely you will reach the Erlang VM limit for the maximum number of atoms before you run out of memory, which will bring your system down regardless. 
+
+12:37:33.229 [error] Process #PID<0.117.0> raised an exception
+Interactive Elixir (1.18.4) - press Ctrl+C to exit (type h() ENTER for help)
+iex> self()
+#PID<0.118.0>
 ```
 
-Instead of abusing the built-in name facility, we will create our own *process registry* that associates the bucket name to the bucket process.
+The reason links are so pervasive is that when we start a process inside a supervisor, we want our process to crash if the supervisor terminates. On the other hand, we don't want the supervisor to crash when a child terminates, and therefore supervisors trap exits from links by calling `Process.flag(:trap_exit, true)`.
 
-The registry needs to guarantee that it is always up to date. For example, if one of the bucket processes crashes due to a bug, the registry must notice this change and avoid serving stale entries. In Elixir, we say the registry needs to *monitor* each bucket. Because our *registry* needs to be able to receive and handle ad-hoc messages from the system, the `Agent` API is not enough.
+In other words, links create an intrinsic relationship between the processes. If we simply want to track when a process dies, without tying their exit signals to each other, a better solution is to use monitors. When a monitored process terminates, we receive a message in our inbox, regardless of the reason:
 
-We will use a `GenServer` to create a registry process that can monitor the bucket processes. GenServer provides industrial strength functionality for building servers in both Elixir and OTP.
+```elixir
+iex> pid = spawn(fn -> Process.sleep(5000) end)
+#PID<0.119.0>
+iex> Process.monitor(pid)
+#Reference<0.1076459149.2159017989.118674>
+iex> flush()
+:ok
+# Wait five seconds
+iex> flush()
+{:DOWN, #Reference<0.1076459149.2159017989.118674>, :process, #PID<0.119.0>, :normal}
+:ok
+```
+
+Once the process terminates, we receive a "DOWN message", represented in a five-element tuple. 
The last element is the reason why it crashed (`:normal` means it terminated successfully). -Please read the `GenServer` module documentation for an overview if you haven't yet. Once you do so, we are ready to proceed. +Monitors will play a very important role in our subscribe feature. When a client subscribes to a bucket, the bucket will store the client PID and send messages to it on every change. However, if the client terminates (for example because it was disconnected), the bucket must remove the client from its list of subscribers (otherwise the list would keep on growing forever as clients connect and disconnect). + +We chose the `Agent` module to implement our `KV.Bucket` and, unfortunately, agents cannot receive messages. So the first step is to rewrite our `KV.Bucket` to a `GenServer`. The `GenServer` module documentation has a good overview on what they are and how to implement them. Give it a read and then we are ready to proceed. ## GenServer callbacks @@ -84,234 +108,255 @@ def handle_call({:put, key, value}, _from, state) do end ``` -There is quite a bit more ceremony in the GenServer code but, as we will see, it brings some benefits too. - -For now, we will write only the server callbacks for our bucket registering logic, without providing a proper API, which we will do later. - -Create a new file at `lib/kv/registry.ex` with the following contents: +Let's go ahead and rewrite `KV.Bucket` at once. Open up `lib/kv/bucket.ex` and replace its contents with this new version: ```elixir -defmodule KV.Registry do +defmodule KV.Bucket do use GenServer - ## Missing Client API - will add this later + @doc """ + Starts a new bucket. + """ + def start_link(opts) do + GenServer.start_link(__MODULE__, %{}, opts) + end - ## Defining GenServer Callbacks + @doc """ + Gets a value from the `bucket` by `key`. 
+ """ + def get(bucket, key) do + GenServer.call(bucket, {:get, key}) + end - @impl true - def init(:ok) do - {:ok, %{}} + @doc """ + Puts the `value` for the given `key` in the `bucket`. + """ + def put(bucket, key, value) do + GenServer.call(bucket, {:put, key, value}) + end + + @doc """ + Deletes `key` from `bucket`. + + Returns the current value of `key`, if `key` exists. + """ + def delete(bucket, key) do + GenServer.call(bucket, {:delete, key}) end + ### Callbacks + @impl true - def handle_call({:lookup, name}, _from, names) do - {:reply, Map.fetch(names, name), names} + def init(bucket) do + state = %{ + bucket: bucket + } + + {:ok, state} end @impl true - def handle_cast({:create, name}, names) do - if Map.has_key?(names, name) do - {:noreply, names} - else - {:ok, bucket} = KV.Bucket.start_link([]) - {:noreply, Map.put(names, name, bucket)} - end + def handle_call({:get, key}, _from, state) do + value = get_in(state.bucket[key]) + {:reply, value, state} + end + + def handle_call({:put, key, value}, _from, state) do + state = put_in(state.bucket[key], value) + {:reply, :ok, state} + end + + def handle_call({:delete, key}, _from, state) do + {value, state} = pop_in(state.bucket[key]) + {:reply, value, state} end end ``` -There are two types of requests you can send to a GenServer: calls and casts. Calls are synchronous and the server **must** send a response back to such requests. While the server computes the response, the client is **waiting**. Casts are asynchronous: the server won't send a response back and therefore the client won't wait for one. Both requests are messages sent to the server, and will be handled in sequence. In the above implementation, we pattern-match on the `:create` messages, to be handled as cast, and on the `:lookup` messages, to be handled as call. +The first function is `start_link/1`, which starts a new GenServer passing a list of options. 
It calls out to `GenServer.start_link/3`, which takes three arguments:
 
-In order to invoke the callbacks above, we need to go through the corresponding `GenServer` functions. Let's start a registry, create a named bucket, and then look it up:
+1. The module where the server callbacks are implemented, in this case `__MODULE__` (meaning the current module)
 
-```elixir
-iex> {:ok, registry} = GenServer.start_link(KV.Registry, :ok)
-{:ok, #PID<0.136.0>}
-iex> GenServer.cast(registry, {:create, "shopping"})
-:ok
-iex> {:ok, bucket} = GenServer.call(registry, {:lookup, "shopping"})
-{:ok, #PID<0.174.0>}
-```
+2. The initialization arguments, in this case the empty bucket `%{}`
 
-Our `KV.Registry` process received a cast with `{:create, "shopping"}` and a call with `{:lookup, "shopping"}`, in this sequence. `GenServer.cast` will immediately return, as soon as the message is sent to the `registry`. The `GenServer.call` on the other hand, is where we would be waiting for an answer, provided by the above `KV.Registry.handle_call` callback.
+3. A list of options which can be used to specify things like the name of the server. Once again, we forward the list of options that we receive on `start_link/1` to `GenServer.start_link/3`, as we did for agents.
 
-You may also have noticed that we have added `@impl true` before each callback. The `@impl true` informs the compiler that our intention for the subsequent function definition is to define a callback. If by any chance we make a mistake in the function name or in the number of arguments, like we define a `handle_call/2`, the compiler would warn us there isn't any `handle_call/2` to define, and would give us the complete list of known callbacks for the `GenServer` module.
+Once started, the GenServer will invoke the `init/1` callback, which receives the second argument given to `GenServer.start_link/3` and returns `{:ok, state}`, where state is a new map. 
We can already notice how the `GenServer` API makes the client/server segregation more apparent. `start_link/3` happens in the client, while `init/1` is the respective callback that runs on the server. -This is all good and well, but we still want to offer our users an API that allows us to hide our implementation details. +There are two types of requests you can send to a GenServer: calls and casts. Calls are synchronous and the server **must** send a response back to such requests. While the server computes the response, the client is **waiting**. Casts are asynchronous: the server won't send a response back and therefore the client won't wait for one. Both requests are messages sent to the server, and will be handled in sequence. So far we have only used `GenServer.call/2`, to keep the same semantics as the Agent, but we will give `cast` a try when implementing subscriptions. Given we kept the same behaviour, all tests will still pass. -## The Client API +Each request must be implemented as a specific callback. For `call/2` requests, we implement a `handle_call/3` callback that receives the `request`, the process from which we received the request (`_from`), and the current server state (`state`). The `handle_call/3` callback returns a tuple in the format `{:reply, reply, updated_state}`. The first element of the tuple, `:reply`, indicates that the server should send a reply back to the client. The second element, `reply`, is what will be sent to the client while the third, `updated_state` is the new server state. -A GenServer is implemented in two parts: the client API and the server callbacks. You can either combine both parts into a single module or you can separate them into a client module and a server module. The client is any process that invokes the client function. The server is always the process identifier or process name that we will explicitly pass as argument to the client API. 
Here we'll use a single module for both the server callbacks and the client API.
+Another Elixir feature we used in the implementation above is the set of nested traversal functions: `get_in/1`, `put_in/2`, and `pop_in/1`. Instead of keeping the `bucket` as our GenServer state, we defined a state map with a `bucket` key inside. This will be important as we also need to track subscribers as part of the GenServer state. These functions make it straightforward to manipulate data structures nested in other data structures.
 
-Edit the file at `lib/kv/registry.ex`, filling in the blanks for the client API:
+With our GenServer in place, let's work on subscriptions, starting with the tests.
 
-```elixir
-  ## Client API
+## Implementing subscriptions
 
-  @doc """
-  Starts the registry.
-  """
-  def start_link(opts) do
-    GenServer.start_link(__MODULE__, :ok, opts)
-  end
+Our new test will subscribe to a bucket and then assert that, as operations are performed against the bucket, we receive messages of said events.
 
-  @doc """
-  Looks up the bucket pid for `name` stored in `server`.
+Open up `test/kv/bucket_test.exs` and key this in:
 
-  Returns `{:ok, pid}` if the bucket exists, `:error` otherwise.
-  """
-  def lookup(server, name) do
-    GenServer.call(server, {:lookup, name})
-  end
+```elixir
+  test "subscribes to puts and deletes" do
+    {:ok, bucket} = start_supervised(KV.Bucket)
+    KV.Bucket.subscribe(bucket)
 
-  @doc """
-  Ensures there is a bucket associated with the given `name` in `server`.
-  """
-  def create(server, name) do
-    GenServer.cast(server, {:create, name})
+    KV.Bucket.put(bucket, "milk", 3)
+    assert_receive {:put, "milk", 3}
+
+    # Also check it works even from another process
+    spawn(fn -> KV.Bucket.delete(bucket, "milk") end)
+    assert_receive {:delete, "milk"}
   end
 ```
 
-The first function is `start_link/1`, which starts a new GenServer passing a list of options. `start_link/1` calls out to `GenServer.start_link/3`, which takes three arguments:
-
-1. 
The module where the server callbacks are implemented, in this case `__MODULE__` (meaning the current module) - -2. The initialization arguments, in this case the atom `:ok` +In order to make the test pass, we need to implement the `KV.Bucket.subscribe/1`. So let's add these three new functions to `KV.Bucket`: -3. A list of options which can be used to specify things like the name of the server. For now, we forward the list of options that we receive on `start_link/1` to `GenServer.start_link/3` - -The next two functions, `lookup/2` and `create/2`, are responsible for sending these requests to the server. In this case, we have used `{:lookup, name}` and `{:create, name}` respectively. Requests are often specified as tuples, like this, in order to provide more than one "argument" in that first argument slot. It's common to specify the action being requested as the first element of a tuple, and arguments for that action in the remaining elements. Note that the requests must match the first argument to `handle_call/3` or `handle_cast/2`. +```elixir + @doc """ + Subscribes the current process to the bucket. + """ + def subscribe(bucket) do + GenServer.cast(bucket, {:subscribe, self()}) + end -That's it for the client API. On the server side, we can implement a variety of callbacks to guarantee the server initialization, termination, and handling of requests. Those callbacks are optional and for now, we have only implemented the ones we care about. Let's recap. + @impl true + def handle_cast({:subscribe, pid}, state) do + Process.monitor(pid) + state = update_in(state.subscribers, &MapSet.put(&1, pid)) + {:noreply, state} + end -The first is the `init/1` callback, that receives the second argument given to `GenServer.start_link/3` and returns `{:ok, state}`, where state is a new map. We can already notice how the `GenServer` API makes the client/server segregation more apparent. 
`start_link/3` happens in the client, while `init/1` is the respective callback that runs on the server. + @impl true + def handle_info({:DOWN, _ref, _type, pid, _reason}, state) do + state = update_in(state.subscribers, &MapSet.delete(&1, pid)) + {:noreply, state} + end +``` -For `call/2` requests, we implement a `handle_call/3` callback that receives the `request`, the process from which we received the request (`_from`), and the current server state (`names`). The `handle_call/3` callback returns a tuple in the format `{:reply, reply, new_state}`. The first element of the tuple, `:reply`, indicates that the server should send a reply back to the client. The second element, `reply`, is what will be sent to the client while the third, `new_state` is the new server state. +On subscription, we send a `cast/2` request with the current process identifier and implement its `handle_cast/2` callback that receives the `request` and the current server state. We then proceed to monitor the given `pid` and add it to the list of subscribers, which we are implementing using `MapSet`. The `handle_cast/2` callback returns a tuple in the format `{:noreply, updated_state}`. Note that in a real application we would have probably implemented it with a synchronous call, as it provides back pressure, instead of an asynchronous cast. We are doing it this way to illustrate how to implement a cast callback. -For `cast/2` requests, we implement a `handle_cast/2` callback that receives the `request` and the current server state (`names`). The `handle_cast/2` callback returns a tuple in the format `{:noreply, new_state}`. Note that in a real application we would have probably implemented the callback for `:create` with a synchronous call instead of an asynchronous cast. We are doing it this way to illustrate how to implement a cast callback. +Then, because we have monitored a process, once that process terminates, we will receive a "DOWN message". 
GenServers handle regular messages using the `handle_info/2` callback, which also typically returns `{:noreply, updated_state}`. In this callback, we remove the PID that terminated from our list of subscribers.
 
-There are other tuple formats both `handle_call/3` and `handle_cast/2` callbacks may return. There are other callbacks like `terminate/2` and `code_change/3` that we could implement. You are welcome to explore the full `GenServer` documentation to learn more about those.
 
+We are almost there. We can see both `handle_cast/2` and `handle_info/2` callbacks assume there is a `subscribers` key in our state with a `MapSet`. So let's add it by updating the existing `init/1` to the following:
 
-For now, let's write some tests to guarantee our GenServer works as expected.
 
+```elixir
+  @impl true
+  def init(bucket) do
+    state = %{
+      bucket: bucket,
+      subscribers: MapSet.new()
+    }
 
-## Testing a GenServer
 
+    {:ok, state}
+  end
+```
 
-Testing a GenServer is not much different from testing an agent. We will spawn the server on a setup callback and use it throughout our tests. 
Create a file at `test/kv/registry_test.exs` with the following: +And finally let's update the callbacks for `put/3` and `delete/2` to broadcast messages whenever they are invoked, like this: ```elixir -defmodule KV.RegistryTest do - use ExUnit.Case, async: true - - setup do - registry = start_supervised!(KV.Registry) - %{registry: registry} + def handle_call({:put, key, value}, _from, state) do + state = put_in(state.bucket[key], value) + broadcast(state, {:put, key, value}) + {:reply, :ok, state} end - test "spawns buckets", %{registry: registry} do - assert KV.Registry.lookup(registry, "shopping") == :error - - KV.Registry.create(registry, "shopping") - assert {:ok, bucket} = KV.Registry.lookup(registry, "shopping") + def handle_call({:delete, key}, _from, state) do + {value, state} = pop_in(state.bucket[key]) + broadcast(state, {:delete, key}) + {:reply, value, state} + end - KV.Bucket.put(bucket, "milk", 1) - assert KV.Bucket.get(bucket, "milk") == 1 + defp broadcast(state, message) do + for pid <- state.subscribers do + send(pid, message) + end end -end ``` -Our test case first asserts there are no buckets in our registry, creates a named bucket, looks it up, and asserts it behaves as a bucket. - -There is one important difference between the `setup` block we wrote for `KV.Registry` and the one we wrote for `KV.Bucket`. Instead of starting the registry by hand by calling `KV.Registry.start_link/1`, we instead called the `ExUnit.Callbacks.start_supervised!/2` function, passing the `KV.Registry` module. +There is no need to modify the callback for `get/2`. And that's it, run the tests again, and our new test should pass! -The `start_supervised!` function was injected into our test module by `use ExUnit.Case`. It does the job of starting the `KV.Registry` process, by calling its `start_link/1` function. The advantage of using `start_supervised!` is that ExUnit will guarantee that the registry process will be shutdown **before** the next test starts. 
In other words, it helps guarantee that the state of one test is not going to interfere with the next one in case they depend on shared resources. +## Wiring it all up -When starting processes during your tests, we should always prefer to use `start_supervised!`. We recommend you to change the `setup` block in `bucket_test.exs` to use `start_supervised!` too. +Now that our bucket deals with subscriptions, we need to expose this new functionality in our server. Let's once again start with the test. -Run the tests and they should all pass! +Open up `test/kv/server_test.exs` and add this new test: -## The need for monitoring +```elixir + test "subscribes to buckets", %{socket: socket, name: name} do + assert send_and_recv(socket, "CREATE #{name}\r\n") == "OK\r\n" + :gen_tcp.send(socket, "SUBSCRIBE #{name}\r\n") -Everything we have done so far could have been implemented with a `Agent`. In this section, we will see one of many things that we can achieve with a GenServer that is not possible with an Agent. + {:ok, other} = :gen_tcp.connect(~c"localhost", 4040, @socket_options) -Let's start with a test that describes how we want the registry to behave if a bucket stops or crashes: + assert send_and_recv(other, "PUT #{name} milk 3\r\n") == "OK\r\n" + assert :gen_tcp.recv(socket, 0, 1000) == {:ok, "milk SET TO 3\r\n"} -```elixir -test "removes buckets on exit", %{registry: registry} do - KV.Registry.create(registry, "shopping") - {:ok, bucket} = KV.Registry.lookup(registry, "shopping") - Agent.stop(bucket) - assert KV.Registry.lookup(registry, "shopping") == :error -end + assert send_and_recv(other, "DELETE #{name} milk\r\n") == "OK\r\n" + assert :gen_tcp.recv(socket, 0, 1000) == {:ok, "milk DELETED\r\n"} + end ``` -The test above will fail on the last assertion as the bucket name remains in the registry even after we stop the bucket process. +The test creates a bucket and subscribes to it. Then it opens up another TCP connection to send commands. 
For each command sent, we expect the subscribed socket to receive a message.

-In order to fix this bug, we need the registry to monitor every bucket it spawns. Once we set up a monitor, the registry will receive a notification every time a bucket process exits, allowing us to clean the registry up.
-
-Let's first play with monitors by starting a new console with `iex -S mix`:
+To make the test pass, we need to change `KV.Command` to parse the new `SUBSCRIBE` command and then run it. Open up `lib/kv/command.ex` and first change the `parse/1` definition to the following:

```elixir
-iex> {:ok, pid} = KV.Bucket.start_link([])
-{:ok, #PID<0.66.0>}
-iex> Process.monitor(pid)
-#Reference<0.0.0.551>
-iex> Agent.stop(pid)
-:ok
-iex> flush()
-{:DOWN, #Reference<0.0.0.551>, :process, #PID<0.66.0>, :normal}
+  def parse(line) do
+    case String.split(line) do
+      ["SUBSCRIBE", bucket] -> {:ok, {:subscribe, bucket}}
+      ["CREATE", bucket] -> {:ok, {:create, bucket}}
+      ["GET", bucket, key] -> {:ok, {:get, bucket, key}}
+      ["PUT", bucket, key, value] -> {:ok, {:put, bucket, key, value}}
+      ["DELETE", bucket, key] -> {:ok, {:delete, bucket, key}}
+      _ -> {:error, :unknown_command}
+    end
+  end
```

-Note `Process.monitor(pid)` returns a unique reference that allows us to match upcoming messages to that monitoring reference. After we stop the agent, we can `flush/0` all messages and notice a `:DOWN` message arrived, with the exact reference returned by `monitor`, notifying that the bucket process exited with reason `:normal`.
-
-Let's reimplement the server callbacks to fix the bug and make the test pass. First, we will modify the GenServer state to two maps: one that contains `name -> pid` and another that holds `ref -> name`. Then we need to monitor the buckets on `handle_cast/2` as well as implement a `handle_info/2` callback to handle the monitoring messages. The full server callbacks implementation is shown below:
+We added a new clause that converts "SUBSCRIBE" into a tuple.
Now we need to match on this tuple within `run/2`. We can do so by adding a new clause at the bottom of `run/2`, with the following code:

```elixir
-## Server callbacks
+  def run({:subscribe, bucket}, socket) do
+    lookup(bucket, fn pid ->
+      KV.Bucket.subscribe(pid)
+      :inet.setopts(socket, active: true)
+      receive_messages(socket)
+    end)
+  end

-@impl true
-def init(:ok) do
-  names = %{}
-  refs = %{}
-  {:ok, {names, refs}}
-end
+  defp receive_messages(socket) do
+    receive do
+      {:put, key, value} ->
+        :gen_tcp.send(socket, "#{key} SET TO #{value}\r\n")
+        receive_messages(socket)

-@impl true
-def handle_call({:lookup, name}, _from, state) do
-  {names, _} = state
-  {:reply, Map.fetch(names, name), state}
-end
+      {:delete, key} ->
+        :gen_tcp.send(socket, "#{key} DELETED\r\n")
+        receive_messages(socket)
+
+      {:tcp_closed, ^socket} ->
+        {:error, :closed}

-@impl true
-def handle_cast({:create, name}, {names, refs}) do
-  if Map.has_key?(names, name) do
-    {:noreply, {names, refs}}
-  else
-    {:ok, bucket} = KV.Bucket.start_link([])
-    ref = Process.monitor(bucket)
-    refs = Map.put(refs, ref, name)
-    names = Map.put(names, name, bucket)
-    {:noreply, {names, refs}}
+      # If we receive any message, including socket writes, we discard them
+      _ ->
+        receive_messages(socket)
+    end
   end
-end
+```

-@impl true
-def handle_info({:DOWN, ref, :process, _pid, _reason}, {names, refs}) do
-  {name, refs} = Map.pop(refs, ref)
-  names = Map.delete(names, name)
-  {:noreply, {names, refs}}
-end

Let's go over it part by part. We use the existing `lookup/2` private function to look up a bucket. If one is found, we subscribe the current process to the bucket. Then we call `:inet.setopts(socket, active: true)` (which we will explain soon) and `receive_messages/1`.
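Stripped of the guide's modules, the broadcast-and-monitor pattern that `KV.Bucket.subscribe/1` relies on can be sketched as a self-contained GenServer. The `PubSub` module and its function names below are illustrative only, not the guide's actual API:

```elixir
defmodule PubSub do
  use GenServer

  # Client API
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def subscribe(server), do: GenServer.cast(server, {:subscribe, self()})
  def publish(server, message), do: GenServer.cast(server, {:publish, message})

  # Server callbacks
  @impl true
  def init(:ok), do: {:ok, MapSet.new()}

  @impl true
  def handle_cast({:subscribe, pid}, subscribers) do
    # Monitor the subscriber so we can drop it once it exits
    Process.monitor(pid)
    {:noreply, MapSet.put(subscribers, pid)}
  end

  def handle_cast({:publish, message}, subscribers) do
    # Relay the message to every subscribed process
    for pid <- subscribers, do: send(pid, message)
    {:noreply, subscribers}
  end

  @impl true
  def handle_info({:DOWN, _ref, :process, pid, _reason}, subscribers) do
    # A subscriber exited: remove it from the set
    {:noreply, MapSet.delete(subscribers, pid)}
  end
end
```

Subscribers simply `receive` whatever is published, which is the same shape of loop `receive_messages/1` implements over a socket.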

-@impl true
-def handle_info(msg, state) do
-  require Logger
-  Logger.debug("Unexpected message in KV.Registry: #{inspect(msg)}")
-  {:noreply, state}
-end
-```

`receive_messages/1` waits for messages from the bucket and then calls itself again, becoming a loop. We match on `{:put, key, value}` and `{:delete, key}` and write those events to the socket. We also match on `{:tcp_closed, ^socket}`, which is a message that will be delivered if the TCP socket closes, and use it to abort the loop. We discard any other message.
+
+At this point you may be wondering: where does `{:tcp_closed, ^socket}` come from?
+
+So far, when receiving messages from the socket, we used `:gen_tcp.recv/3` to perform calls that will block the current process until content is available. This is known as "passive mode". However, we can also ask `:gen_tcp` to stream messages to the current process inbox as they arrive. This is known as "active mode" and is exactly what we configured when we called `:inet.setopts(socket, active: true)`. Those messages have the shape `{:tcp, socket, data}`. When the socket is in active mode and it is closed, it delivers a `{:tcp_closed, socket}` message. Once we receive this message, we exit the loop, which will exit the connection process. Since the bucket is monitoring the process, it will automatically remove the subscription too. You could verify this in practice by adding a `COUNT SUBSCRIPTIONS` command that returns the number of subscribers for a given bucket.

-Observe that we were able to considerably change the server implementation without changing any of the client API. That's one of the benefits of explicitly segregating the server and the client.

+In practice, many systems would prefer to call `:inet.setopts(socket, active: :once)` to specify only a single TCP message should be delivered to avoid overflowing message queues. Once the message is received, they call `:inet.setopts/2` again.
In our case, we are simply discarding anything that arrives over the socket, so setting `active: true` is equally fine. In all scenarios, the benefit of using active mode is that the process can receive TCP messages as well as messages from other processes at the same time, instead of blocking on `:gen_tcp.recv/3`.

-Finally, different from the other callbacks, we have defined a "catch-all" clause for `handle_info/2` that discards and logs any unknown message. To understand why, let's move on to the next section.
+To wrap it all up, you should give our new feature a try in a distributed setting too. Start two `NODES=... PORT=... iex --sname ... -S mix` instances. In one of them, create a bucket. In the other, subscribe to the same bucket. Once you go back to the first shell, you will see that, even as you send commands to the bucket on one machine, the messages will be streamed to the other one. In other words, our subscription system is also distributed, and all we had to do was send messages!

## `call`, `cast` or `info`?

@@ -319,25 +364,20 @@ So far we have used three callbacks: `handle_call/3`, `handle_cast/2` and `handl

1. `handle_call/3` must be used for synchronous requests. This should be the default choice as waiting for the server reply is a useful back-pressure mechanism.

-2. `handle_cast/2` must be used for asynchronous requests, when you don't care about a reply. A cast does not guarantee the server has received the message and, for this reason, should be used sparingly. For example, the `create/2` function we have defined in this chapter should have used `call/2`. We have used `cast/2` for educational purposes.
+2. `handle_cast/2` must be used for asynchronous requests, when you don't care about a reply. A cast does not guarantee the server has received the message and, for this reason, should be used sparingly. For example, the `subscribe/1` function we have defined in this chapter should have used `call/2`.
We have used `cast/2` for educational purposes.

3. `handle_info/2` must be used for all other messages a server may receive that are not sent via `GenServer.call/2` or `GenServer.cast/2`, including regular messages sent with `send/2`. The monitoring `:DOWN` messages are an example of this.

-Since any message, including the ones sent via `send/2`, go to `handle_info/2`, there is a chance that unexpected messages will arrive to the server. Therefore, if we don't define the catch-all clause, those messages could cause our registry to crash, because no clause would match. We don't need to worry about such cases for `handle_call/3` and `handle_cast/2` though. Calls and casts are only done via the `GenServer` API, so an unknown message is quite likely a developer mistake.
-
 To help developers remember the differences between call, cast and info, the supported return values and more, we have a tiny [GenServer cheat sheet](https://elixir-lang.org/downloads/cheatsheets/gen-server.pdf).

-## Monitors or links?
+## Agents or GenServers?

-We have previously learned about links in the [Process chapter](../getting-started/processes.md). Now, with the registry complete, you may be wondering: when should we use monitors and when should we use links?
+Before moving forward to the last chapter, you may be wondering: in the future, should you use an `Agent` or a `GenServer`?

-Links are bi-directional. If you link two processes and one of them crashes, the other side will crash too (unless it is trapping exits). A monitor is uni-directional: only the monitoring process will receive notifications about the monitored one. In other words: use links when you want linked crashes, and monitors when you just want to be informed of crashes, exits, and so on.
+As we saw throughout this guide, agents are straightforward to get started with, but they are limited in what they can do. Agents are effectively a subset of GenServers. In fact, agents are implemented on top of GenServers.
So are supervisors, the `Registry` module, and many other features you will find in both Erlang and Elixir.

-Returning to our `handle_cast/2` implementation, you can see the registry is both linking and monitoring the buckets:
+In other words, GenServers are the most essential component for building concurrent and fault-tolerant systems in Elixir. They provide a robust and flexible framework for managing state and coordinating interactions between processes.

-```elixir
-{:ok, bucket} = KV.Bucket.start_link([])
-ref = Process.monitor(bucket)
-```
+For those reasons, many adopt a rule of thumb to never use Agents and jump straight into GenServers instead. On the other hand, others are more than fine with using agents to store a bit of state here and there. Either way, you will be fine!

-This is a bad idea, as we don't want the registry to crash when a bucket crashes. The proper fix is to actually not link the bucket to the registry. Instead, we will link each bucket to a special type of process called Supervisors, which are explicitly designed to handle failures and crashes. We will learn more about them in the next chapter.
+This is the last feature we have implemented for our distributed key-value store. In the next chapter, we will learn how to package our application before shipping it to production.
diff --git a/lib/elixir/pages/mix-and-otp/introduction-to-mix.md b/lib/elixir/pages/mix-and-otp/introduction-to-mix.md
index 84888ce85a7..80b2bdd2688 100644
--- a/lib/elixir/pages/mix-and-otp/introduction-to-mix.md
+++ b/lib/elixir/pages/mix-and-otp/introduction-to-mix.md
@@ -9,8 +9,8 @@ In this guide, we will build a complete Elixir application, with its own supervi

 The requirements for this guide are (see `elixir -v`):

-  * Elixir 1.15.0 onwards
-  * Erlang/OTP 24 onwards
+  * Elixir 1.18.0 onwards
+  * Erlang/OTP 27 onwards

 The application works as a distributed key-value store.
We are going to organize key-value pairs into buckets and distribute those buckets across multiple nodes. We will also build a simple client that allows us to connect to any of those nodes and send requests such as: @@ -44,7 +44,7 @@ In this chapter, we will create our first project using Mix and explore differen > #### Source code {: .info} > -> The final code for the application built in this guide is in [this repository](https://github.com/josevalim/kv_umbrella) and can be used as a reference. +> The final code for the application built in this guide is in [this repository](https://github.com/josevalim/kv) and can be used as a reference. > #### Is this guide required reading? {: .info} > @@ -82,7 +82,7 @@ Let's take a brief look at those generated files. > #### Executables in the `PATH` {: .info} > -> Mix is an Elixir executable. This means that in order to run `mix`, you need to have both `mix` and `elixir` executables in your PATH. That's what happens when you install Elixir. +> Mix is an Elixir executable. This means that in order to run `mix`, you need to have both `mix` and `elixir` executables in your [`PATH`](https://en.wikipedia.org/wiki/PATH_(variable)). That's what happens when you install Elixir. ## Project compilation diff --git a/lib/elixir/pages/mix-and-otp/releases.md b/lib/elixir/pages/mix-and-otp/releases.md new file mode 100644 index 00000000000..526d1bbb35d --- /dev/null +++ b/lib/elixir/pages/mix-and-otp/releases.md @@ -0,0 +1,170 @@ + + +# Releases + +Now that our application is ready, you may be wondering how we can package our application to run in production. After all, all of our code so far depends on Erlang and Elixir versions that are installed in your current system. To achieve this goal, Elixir provides releases. + +A release is a self-contained directory that consists of your application code, all of its dependencies, plus the whole Erlang Virtual Machine (VM) and runtime. 
Once a release is assembled, it can be packaged and deployed to a target as long as the target runs on the same operating system (OS) distribution and version as the machine that assembled the release. + +To get started, simply run `mix release` while setting `MIX_ENV=prod`: + +```console +$ MIX_ENV=prod mix release +Compiling 4 files (.ex) +Generated kv app +* assembling kv-0.1.0 on MIX_ENV=prod +* using config/runtime.exs to configure the release at runtime + +Release created at _build/prod/rel/kv + + # To start your system + _build/prod/rel/kv/bin/kv start + +Once the release is running: + + # To connect to it remotely + _build/prod/rel/kv/bin/kv remote + + # To stop it gracefully (you may also send SIGINT/SIGTERM) + _build/prod/rel/kv/bin/kv stop + +To list all commands: + + _build/prod/rel/kv/bin/kv +``` + +Excellent! A release was assembled in `_build/prod/rel/kv`. Everything you need to run your application is inside that directory. In particular, there is a `bin/kv` file which is the entry point to your system. It supports multiple commands, such as: + + * `bin/kv start`, `bin/kv start_iex`, `bin/kv restart`, and `bin/kv stop` — for general management of the release + + * `bin/kv rpc COMMAND` and `bin/kv remote` — for running commands on the running system or to connect to the running system + + * `bin/kv eval COMMAND` — to start a fresh system that runs a single command and then shuts down + + * `bin/kv daemon` and `bin/kv daemon_iex` — to start the system as a daemon on Unix-like systems + + * `bin/kv install` — to install the system as a service on Windows machines + +If you run `bin/kv start_iex` inside the release directory, it will start the system using a short name (`--sname`) equal to the release name, which in this case is `kv`. The next step is to start two instances, on different ports and different names, as we did earlier on. But before we do this, let's talk a bit about the benefits of releases. + +## Why releases? 
+ +Releases allow developers to precompile and package all of their code and the runtime into a single unit. The benefits of releases are: + + * Code preloading. The VM has two mechanisms for loading code: interactive and embedded. By default, it runs in the interactive mode which dynamically loads modules when they are used for the first time. The first time your application calls `Enum.map/2`, the VM will find the `Enum` module and load it. There's a downside. When you start a new server in production, it may need to load many other modules, causing the first requests to have an unusual spike in response time. Releases run in embedded mode, which loads all available modules upfront, guaranteeing your system is ready to handle requests after booting. + + * Configuration and customization. Releases give developers fine grained control over system configuration and the VM flags used to start the system. + + * Self-contained. A release does not require the source code to be included in your production artifacts. All of the code is precompiled and packaged. Releases do not even require Erlang or Elixir on your servers, as they include the Erlang VM and its runtime by default. Furthermore, both Erlang and Elixir standard libraries are stripped to bring only the parts you are actually using. + + * Multiple releases. You can assemble different releases with different configuration per application or even with different applications altogether. + +We have written extensive documentation on releases, so [please check the official documentation for more information](`mix release`). For now, we will continue exploring some of the features outlined above. + +## Configuring releases + +Releases also provide built-in hooks for configuring almost every need of the production system: + + * `config/config.exs` — provides build-time application configuration, which is executed before our application compiles. 
This file often imports configuration files based on the environment, such as `config/dev.exs` and `config/prod.exs`.
+
+  * `config/runtime.exs` — provides runtime application configuration. It is executed every time the release boots and is further extensible via config providers.
+
+  * `rel/env.sh.eex` and `rel/env.bat.eex` — template files that are copied into every release and executed on every command to set up environment variables, including ones specific to the VM, and the general environment.
+
+  * `rel/vm.args.eex` — a template file that is copied into every release and provides static configuration of the Erlang Virtual Machine and other runtime flags.
+
+In this case, we have already specified a `config/runtime.exs` that deals with both `PORT` and `NODES` environment variables. Furthermore, while releases don't accept a `--sname` parameter, they do allow us to set the name via the `RELEASE_NODE` env var. Therefore, we can start two copies of the system by jumping into `_build/prod/rel/kv` and typing this (remember to adjust `@computer-name` to your actual computer name):
+
+```console
+$ NODES="foo@computer-name,bar@computer-name" PORT=4040 RELEASE_NODE="foo" bin/kv start_iex
+```
+
+```console
+$ NODES="foo@computer-name,bar@computer-name" PORT=4041 RELEASE_NODE="bar" bin/kv start_iex
+```
+
+To verify it all worked out, you can type `Node.list` in the IEx session and see if it returns the other node. If it doesn't, you can start diagnosing, first by comparing the node names within each `iex>` prompt and calling `Node.connect/1` directly. With applications running, you can `telnet` into them as usual too.
+
+While the above is enough to get started, you may want to perform advanced configuration based on the environment you are deploying to. Releases provide scripts for that, which are great for automating configuration based on host, network, or cloud settings.
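+Although the guide's `config/runtime.exs` is not reproduced here, a file along these lines could read both variables at boot time. This is a hypothetical sketch; the actual keys and parsing in the guide's file may differ:
+
+```elixir
+import Config
+
+# Hypothetical sketch: read PORT and NODES from the environment at boot time.
+config :kv,
+  port: String.to_integer(System.get_env("PORT") || "4040"),
+  nodes:
+    (System.get_env("NODES") || "")
+    |> String.split(",", trim: true)
+    |> Enum.map(&String.to_atom/1)
+```
+
+Because `config/runtime.exs` runs every time the release boots, changing these environment variables takes effect on the next restart without recompiling anything.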
+
+## Operating System scripts
+
+Every release contains an environment file, named `env.sh` on Unix-like systems and `env.bat` on Windows machines, that executes before the Elixir system starts. In this file, you can execute any OS-level code, such as invoking other applications, setting environment variables, and so on. Some of those environment variables can even configure how the release itself runs.
+
+For instance, releases run using short-names (`--sname`). However, if you want to actually run a distributed key-value store in production, you will need multiple nodes and start the release with the `--name` option. We can achieve this by setting the `RELEASE_DISTRIBUTION` environment variable inside the `env.sh` and `env.bat` files. Mix already has a template for said files which we can customize, so let's ask Mix to copy them to our application:
+
+    $ mix release.init
+    * creating rel/vm.args.eex
+    * creating rel/remote.vm.args.eex
+    * creating rel/env.sh.eex
+    * creating rel/env.bat.eex
+
+If you open up `rel/env.sh.eex`, you will see:
+
+```shell
+#!/bin/sh
+
+# # Sets and enables heart (recommended only in daemon mode)
+# case $RELEASE_COMMAND in
+#   daemon*)
+#     HEART_COMMAND="$RELEASE_ROOT/bin/$RELEASE_NAME $RELEASE_COMMAND"
+#     export HEART_COMMAND
+#     export ELIXIR_ERL_OPTIONS="-heart"
+#     ;;
+#   *)
+#     ;;
+# esac
+
+# # Set the release to load code on demand (interactive) instead of preloading (embedded).
+# export RELEASE_MODE=interactive
+
+# # Set the release to work across nodes.
+# # RELEASE_DISTRIBUTION must be "sname" (local), "name" (distributed) or "none".
+# export RELEASE_DISTRIBUTION=name
+# export RELEASE_NODE=<%= @release.name %>
+```
+
+The steps necessary to work across nodes are already commented out as an example. You can enable full distribution by setting the `RELEASE_DISTRIBUTION` variable to `name`.
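+For instance, an edited `rel/env.sh.eex` that enables full distribution might look like this. This is an illustrative sketch: the `@release` assign comes from the template itself, while deriving the host part from `hostname -f` is an assumption about your setup:
+
+```shell
+#!/bin/sh
+
+# Illustrative sketch: enable distribution across machines.
+# RELEASE_DISTRIBUTION must be "sname" (local), "name" (distributed) or "none".
+export RELEASE_DISTRIBUTION=name
+export RELEASE_NODE=<%= @release.name %>@$(hostname -f)
+```
+
+With `RELEASE_DISTRIBUTION=name`, the node name must be fully qualified (`name@host.domain`), which is why the sketch appends a host name to the release name.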
+ +If you are on Windows, you will have to open up `rel/env.bat.eex`, where you will find this: + +```bat +@echo off +rem Set the release to load code on demand (interactive) instead of preloading (embedded). +rem set RELEASE_MODE=interactive + +rem Set the release to work across nodes. +rem RELEASE_DISTRIBUTION must be "sname" (local), "name" (distributed) or "none". +rem set RELEASE_DISTRIBUTION=name +rem set RELEASE_NODE=<%= @release.name %> +``` + +Once again, set the `RELEASE_DISTRIBUTION` variable to `name` and you are good to go! + +## VM arguments + +The `rel/vm.args.eex` allows you to specify low-level flags that control how the Erlang VM and its runtime operate. You specify entries as if you were specifying arguments in the command line with code comments also supported. Here is the default generated file: + + ## Customize flags given to the VM: https://www.erlang.org/doc/man/erl.html + ## -mode/-name/-sname/-setcookie are configured via env vars, do not set them here + + ## Increase number of concurrent ports/sockets + ##+Q 65536 + + ## Tweak GC to run more often + ##-env ERL_FULLSWEEP_AFTER 10 + +You can see [a complete list of VM arguments and flags in the Erlang documentation](http://www.erlang.org/doc/man/erl.html). + +## Summing up + +Throughout the guide, we have built a very simple distributed key-value store as an opportunity to explore many constructs like generic servers, supervisors, tasks, agents, applications and more. Not only that, we have written tests for the whole application, got familiar with ExUnit, and learned how to use the Mix build tool to accomplish a wide range of tasks. + +If you are looking for a distributed key-value store to use in production, you should definitely look into [Riak](http://riak.com/products/riak-kv/), which also runs in the Erlang VM. In Riak, the buckets are replicated and stored across several nodes to avoid data loss. + +Of course, Elixir can be used for much more than distributed key-value stores. 
Embedded systems, data processing and ingestion, web applications, audio/video streaming systems, and machine learning are just some of the domains where Elixir excels. We hope this guide has prepared you to explore any of those domains or any future domain you may desire to bring Elixir into.
+
+Happy coding!
diff --git a/lib/elixir/pages/mix-and-otp/supervisor-and-application.md b/lib/elixir/pages/mix-and-otp/supervisor-and-application.md
index 8c5839b2022..b3b51f5d090 100644
--- a/lib/elixir/pages/mix-and-otp/supervisor-and-application.md
+++ b/lib/elixir/pages/mix-and-otp/supervisor-and-application.md
@@ -3,142 +3,65 @@
 SPDX-FileCopyrightText: 2021 The Elixir Team
 -->

-# Supervision trees and applications
+# Registries and supervision trees

-In the previous chapter about `GenServer`, we implemented `KV.Registry` to manage buckets. At some point, we started monitoring buckets so we were able to take action whenever a `KV.Bucket` crashed. Although the change was relatively small, it introduced a question which is frequently asked by Elixir developers: what happens when something fails?
+In the [previous chapter](agents.md), we used agents to represent our buckets. In the [introduction to mix](introduction-to-mix.md), we specified we would like to name each bucket so we can do the following:

-Before we added monitoring, if a bucket crashed, the registry would forever point to a bucket that no longer exists. If a user tried to read or write to the crashed bucket, it would fail. Any attempt at creating a new bucket with the same name would just return the PID of the crashed bucket. In other words, that registry entry for that bucket would forever be in a bad state. Once we added monitoring, the registry automatically removes the entry for the crashed bucket. Trying to lookup the crashed bucket now (correctly) says the bucket does not exist and a user of the system can successfully create a new one if desired.
+```text +CREATE shopping +OK -In practice, we are not expecting the processes working as buckets to fail. But, if it does happen, for whatever reason, we can rest assured that our system will continue to work as intended. +PUT shopping milk 1 +OK -If you have prior programming experience, you may be wondering: "could we just guarantee the bucket does not crash in the first place?". As we will see, Elixir developers tend to refer to those practices as "defensive programming". That's because a live production system has dozens of different reasons why something can go wrong. The disk can fail, memory can be corrupted, bugs, the network may stop working for a second, etc. If we were to write software that attempted to protect or circumvent all of those errors, we would spend more time handling failures than writing our own software! - -Therefore, an Elixir developer prefers to "let it crash" or "fail fast". And one of the most common ways we can recover from a failure is by restarting whatever part of the system crashed. - -For example, imagine your computer, router, printer, or whatever device is not working properly. How often do you fix it by restarting it? Once we restart the device, we reset the device back to its initial state, which is well-tested and guaranteed to work. In Elixir, we apply this same approach to software: whenever a process crashes, we start a new process to perform the same job as the crashed process. - -In Elixir, this is done by a Supervisor. A Supervisor is a process that supervises other processes and restarts them whenever they crash. To do so, Supervisors manage the whole life cycle of any supervised processes, including startup and shutdown. - -In this chapter, we will learn how to put those concepts into practice by supervising the `KV.Registry` process. After all, if something goes wrong with the registry, the whole registry is lost and no bucket could ever be found! 
To address this, we will define a `KV.Supervisor` module that guarantees that our `KV.Registry` is up and running at any given moment. - -At the end of the chapter, we will also talk about Applications. As we will see, Mix has been packaging all of our code into an application, and we will learn how to customize our application to guarantee that our Supervisor and the Registry are up and running whenever our system starts. - -## Our first supervisor - -A supervisor is a process which supervises other processes, which we refer to as child processes. The act of supervising a process includes three distinct responsibilities. The first one is to start child processes. Once a child process is running, the supervisor may restart a child process, either because it terminated abnormally or because a certain condition was reached. For example, a supervisor may restart all children if any child dies. Finally, a supervisor is also responsible for shutting down the child processes when the system is shutting down. Please see the `Supervisor` module for a more in-depth discussion. - -Creating a supervisor is not much different from creating a GenServer. We are going to define a module named `KV.Supervisor`, which will use the Supervisor behaviour, inside the `lib/kv/supervisor.ex` file: - -```elixir -defmodule KV.Supervisor do - use Supervisor - - def start_link(opts) do - Supervisor.start_link(__MODULE__, :ok, opts) - end - - @impl true - def init(:ok) do - children = [ - KV.Registry - ] - - Supervisor.init(children, strategy: :one_for_one) - end -end +GET shopping milk +1 +OK ``` -Our supervisor has a single child so far: `KV.Registry`. After we define a list of children, we call `Supervisor.init/2`, passing the children and the supervision strategy. - -The supervision strategy dictates what happens when one of the children crashes. `:one_for_one` means that if a child dies, it will be the only one restarted. Since we have only one child now, that's all we need. 
The `Supervisor` behaviour supports several strategies, which we will discuss in this chapter. +In the example session above we interacted with the "shopping" bucket by referencing its name. Therefore, an important feature in our key-value store is to give names to processes. -Once the supervisor starts, it will traverse the list of children and it will invoke the `child_spec/1` function on each module. - -The `child_spec/1` function returns the child specification which describes how to start the process, if the process is a worker or a supervisor, if the process is temporary, transient or permanent and so on. The `child_spec/1` function is automatically defined when we `use Agent`, `use GenServer`, `use Supervisor`, etc. Let's give it a try in the terminal with `iex -S mix`: +We have also learned in the previous chapter we can already name our buckets. For example: ```elixir -iex> KV.Registry.child_spec([]) -%{id: KV.Registry, start: {KV.Registry, :start_link, [[]]}} -``` - -We will learn those details as we move forward on this guide. If you would rather peek ahead, check the `Supervisor` docs. - -After the supervisor retrieves all child specifications, it proceeds to start its children one by one, in the order they were defined, using the information in the `:start` key in the child specification. For our current specification, it will call `KV.Registry.start_link([])`. - -Let's take the supervisor for a spin: - -```elixir -iex> {:ok, sup} = KV.Supervisor.start_link([]) -{:ok, #PID<0.148.0>} -iex> Supervisor.which_children(sup) -[{KV.Registry, #PID<0.150.0>, :worker, [KV.Registry]}] -``` - -So far we have started the supervisor and listed its children. Once the supervisor started, it also started all of its children. - -What happens if we intentionally crash the registry started by the supervisor? 
Let's do so by sending it a bad input on `call`: - -```elixir -iex> [{_, registry, _, _}] = Supervisor.which_children(sup) -[{KV.Registry, #PID<0.150.0>, :worker, [KV.Registry]}] -iex> GenServer.call(registry, :bad_input) -08:52:57.311 [error] GenServer #PID<0.150.0> terminating -** (FunctionClauseError) no function clause matching in KV.Registry.handle_call/3 -iex> Supervisor.which_children(sup) -[{KV.Registry, #PID<0.157.0>, :worker, [KV.Registry]}] -``` - -Notice how the supervisor automatically started a new registry, with a new PID, in place of the first one once we caused it to crash due to a bad input. - -In the previous chapters, we have always started processes directly. For example, we would call `KV.Registry.start_link([])`, which would return `{:ok, pid}`, and that would allow us to interact with the registry via its `pid`. Now that processes are started by the supervisor, we have to directly ask the supervisor who its children are, and fetch the PID from the returned list of children. In practice, doing so every time would be very expensive. To address this, we often give names to processes, allowing them to be uniquely identified in a single machine from anywhere in our code. - -Let's learn how to do that. - -## Naming processes - -While our application will have many buckets, it will only have a single registry. Therefore, whenever we start the registry, we want to give it a unique name so we can reach out to it from anywhere. We do so by passing a `:name` option to `KV.Registry.start_link/1`. 
- -Let's slightly change our children definition (in `KV.Supervisor.init/1`) to be a list of tuples instead of a list of atoms: - -```elixir - def init(:ok) do - children = [ - {KV.Registry, name: KV.Registry} - ] +iex> KV.Bucket.start_link(name: :shopping) +{:ok, #PID<0.43.0>} +iex> KV.Bucket.put(:shopping, "milk", 1) +:ok +iex> KV.Bucket.get(:shopping, "milk") +1 ``` -With this in place, the supervisor will now start `KV.Registry` by calling `KV.Registry.start_link(name: KV.Registry)`. +However, naming dynamic processes with atoms is a terrible idea! If we use atoms, we would need to convert the bucket name (often received from an external client) to atoms, and **we should never convert user input to atoms**. This is because atoms are not garbage collected. Once an atom is created, it is never reclaimed. Generating atoms from user input would mean the user can inject enough different names to exhaust our system memory! -If you revisit the `KV.Registry.start_link/1` implementation, you will remember it simply passes the options to GenServer: +In practice, it is more likely you will reach the Erlang VM limit for the maximum number of atoms before you run out of memory, which will bring your system down regardless. -```elixir - def start_link(opts) do - GenServer.start_link(__MODULE__, :ok, opts) - end -``` +Luckily, Elixir (and Erlang) comes with built-in abstractions for naming processes, called name registries, each with different trade-offs which we will explore throughout these guides. -which in turn will register the process with the given name. The `:name` option expects an atom for locally named processes (locally named means it is available to this machine — there are other options, which we won't discuss here). Since module identifiers are atoms (try `i(KV.Registry)` in IEx), we can name a process after the module that implements it, provided there is only one process for that name. This helps when debugging and introspecting the system. 
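+
+As an aside, you can ask the VM itself how many atoms exist and what the hard limit is, via Erlang's `:erlang.system_info/1`. In the sketch below, the exact count will vary per machine; the limit shown is the default (it can be changed with the `+t` VM flag):
+
+```elixir
+iex> :erlang.system_info(:atom_limit)
+1048576
+iex> :erlang.system_info(:atom_count) < :erlang.system_info(:atom_limit)
+true
+```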
+## Local, decentralized, and scalable registry -Let's give the updated supervisor a try inside `iex -S mix`: +Elixir ships with a single-node process registry module aptly called `Registry`. Its main feature is that you can use any Elixir value to name a process, not only atoms. Let's take it for a spin in `iex`: ```elixir -iex> KV.Supervisor.start_link([]) -{:ok, #PID<0.66.0>} -iex> KV.Registry.create(KV.Registry, "shopping") +iex> Registry.start_link(name: KV, keys: :unique) +iex> name = {:via, Registry, {KV, "shopping"}} +iex> KV.Bucket.start_link(name: name) +{:ok, #PID<0.43.0>} +iex> KV.Bucket.put(name, "milk", 1) :ok -iex> KV.Registry.lookup(KV.Registry, "shopping") -{:ok, #PID<0.70.0>} +iex> KV.Bucket.get(name, "milk") +1 ``` -This time the supervisor started a named registry, allowing us to create buckets without having to explicitly fetch the PID from the supervisor. You should also know how to make the registry crash again, without looking up its PID: give it a try. - -> At this point, you may be wondering: should you also locally name bucket processes? Remember buckets are started dynamically based on user input. Since local names MUST be atoms, we would have to dynamically create atoms, which is a bad idea since once an atom is defined, it is never erased nor garbage collected. This means that, if we create atoms dynamically based on user input, we will eventually run out of memory (or to be more precise, the VM will crash because it imposes a hard limit on the number of atoms). This limitation is precisely why we created our own registry (or why one would use Elixir's built-in `Registry` module). +As you can see, instead of passing an atom to the `:name` option, we pass a tuple of shape `{:via, registry_module, {registry_name, process_name}}`, and everything just worked. You could have used anything as the `process_name`, even an integer or a map! 
That's because all of Elixir's built-in behaviours (agents, supervisors, tasks, and so on) are compatible with name registries, as long as you pass them using the "via" tuple format.

-We are getting closer and closer to a fully working system. The supervisor automatically starts the registry. But how can we automatically start the supervisor whenever our system starts? To answer this question, let's talk about applications.
+Therefore, all we need to do to name our buckets is to start a `Registry`, using `Registry.start_link/1`. But you may be wondering, where exactly should we place that?

## Understanding applications

-We have been working inside an application this entire time. Every time we changed a file and ran `mix compile`, we could see a `Generated kv app` message in the compilation output.
+Every Elixir project is an application. Elixir itself is defined in an application named `:elixir`. The `ExUnit.Case` module is part of the `:ex_unit` application. And so forth.
+
+In fact, we have been working inside an application this entire time. Every time we changed a file and ran `mix compile`, we could see a `Generated kv app` message in the compilation output.

We can find the generated `.app` file at `_build/dev/lib/kv/ebin/kv.app`. Let's have a look at its contents:

@@ -146,8 +69,7 @@ We can find the generated `.app` file at `_build/dev/lib/kv/ebin/kv.app`. Let's
{application,kv,
             [{applications,[kernel,stdlib,elixir,logger]},
              {description,"kv"},
-              {modules,['Elixir.KV','Elixir.KV.Bucket','Elixir.KV.Registry',
-                        'Elixir.KV.Supervisor']},
+              {modules,['Elixir.KV','Elixir.KV.Bucket']},
              {registered,[]},
              {vsn,"0.1.0"}]}.
```

@@ -156,7 +78,7 @@ This file contains Erlang terms (written using Erlang syntax). Even though we ar

> The `logger` application ships as part of Elixir. We stated that our application needs it by specifying it in the `:extra_applications` list in `mix.exs`. See the [official documentation](`Logger`) for more information.
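+
+Because the `.app` file contains plain Erlang terms, we can even read it back at runtime. As an aside, here is one way to do it with Erlang's `:file.consult/1` (the path assumes the default `dev` build):
+
+```elixir
+iex> {:ok, [{:application, :kv, properties}]} = :file.consult("_build/dev/lib/kv/ebin/kv.app")
+iex> properties[:vsn]
+~c"0.1.0"
+```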
-In a nutshell, an application consists of all the modules defined in the `.app` file, including the `.app` file itself. An application has generally only two directories: `ebin`, for Elixir artifacts, such as `.beam` and `.app` files, and `priv`, with any other artifact or asset you may need in your application.
+In a nutshell, an application consists of all the modules defined in the `.app` file, including the `.app` file itself. The application itself is located in the `_build/dev/lib/kv` folder and typically has only two directories: `ebin`, for Elixir artifacts, such as `.beam` and `.app` files, and `priv`, with any other artifact or asset you may need in your application.

Although Mix generates and maintains the `.app` file for us, we can customize its contents by adding new entries to the `application/0` function inside the `mix.exs` project file. We are going to do our first customization soon.

@@ -196,9 +118,9 @@

iex> Application.ensure_all_started(:kv)
{:ok, [:logger, :kv]}
```

-In practice, our tools always start our applications for us, but there is an API available if you need fine-grained control.
+In practice, our tools always start our applications for us, and you don't have to worry about the above, but it is good to know how it all works behind the scenes.

-## The application callback
+### The application callback

Whenever we invoke `iex -S mix`, Mix automatically starts our application by calling `Application.start(:kv)`. But can we customize what happens when our application starts? As a matter of fact, we can! To do so, we define an application callback.

@@ -213,50 +135,119 @@ The first step is to tell our application definition (for example, our `.app` fi

end
```

-The `:mod` option specifies the "application callback module", followed by the arguments to be passed on application start. The application callback module can be any module that implements the `Application` behaviour. 
+The `:mod` option specifies the "application callback module", followed by the arguments to be passed on application start. The application callback module can be any module that invokes `use Application`. Since we have specified `KV` as the module callback, let's change the `KV` module defined in `lib/kv.ex` to the following:

-To implement the `Application` behaviour, we have to `use Application` and define a `start/2` function. The goal of `start/2` is to start a supervisor, which will then start any child services or execute any other code our application may need. Let's use this opportunity to start the `KV.Supervisor` we have implemented earlier in this chapter.
+```elixir
+defmodule KV do
+  use Application
+end
+```

-Since we have specified `KV` as the module callback, let's change the `KV` module defined in `lib/kv.ex` to implement a `start/2` function:
+Now run `mix test` and you will see a couple of things happening. First of all, you will get a compilation warning:
+
+```text
+Compiling 1 file (.ex)
+    warning: function start/2 required by behaviour Application is not implemented (in module KV)
+    │
+  1 │ defmodule KV do
+    │ ~~~~~~~~~~~~~~~
+    │
+    └─ lib/kv.ex:1: KV (module)
+```
+
+This warning is telling us that `use Application` actually defines a behaviour, which expects us to implement a `start/2` function in our `KV` module.
+
+Then our application does not even boot because the `start/2` function is not actually implemented:
+
+```text
+18:29:39.109 [notice] Application kv exited: exited in: KV.start(:normal, [])
+    ** (EXIT) an exception was raised:
+        ** (UndefinedFunctionError) function KV.start/2 is undefined or private
+```
+
+Implementing the `start/2` callback is relatively straightforward: all we need to do is start a supervision tree and return `{:ok, root_supervisor_pid}`. The `Supervisor.start_link/2` function does precisely that; it only expects a list of children and the supervision strategy. 
Let's just pass an empty list of children for now:

```elixir
 defmodule KV do
   use Application

+  # The @impl true annotation says we are implementing a callback
   @impl true
   def start(_type, _args) do
-    # Although we don't use the supervisor name below directly,
-    # it can be useful when debugging or introspecting the system.
-    KV.Supervisor.start_link(name: KV.Supervisor)
+    Supervisor.start_link([], strategy: :one_for_one)
   end
 end
 ```

-> Please note that by doing this, we are breaking the boilerplate test case which tested the `hello` function in `KV`. You can simply remove that test case.
+Now run `mix test` again: our app should boot, but we should see one failure. When we changed the `KV` module, we broke the boilerplate test case which tested the `KV.hello/0` function. You can simply remove that test case and we are back to a green suite.
+
+We wrote very little code but we did something incredibly powerful. We now have a function, `KV.start/2`, that is invoked whenever your application starts. This gives us the perfect place to start our key-value registry. The `Application` module also allows us to define a `stop/1` callback and other functionality. You can check the `Application` and `Supervisor` modules for extensive documentation on their uses.
+
+Let's finally start our registry.
+
+## Supervision trees

-When we `use Application`, we may define a couple of functions, similar to when we used `Supervisor` or `GenServer`. This time we only had to define a `start/2` function. The `Application` behaviour also has a `stop/1` callback, but it is rarely used in practice. You can check the documentation for more information.
+Now that we have the `start/2` callback, we can finally go ahead and start our registry. You may be tempted to do it like this:

-Now that you have defined an application callback which starts our supervisor, we expect the `KV.Registry` process to be up and running as soon as we start `iex -S mix`. 
Let's give it another try: +```elixir + def start(_type, _args) do + Registry.start_link(name: KV, keys: :unique) + Supervisor.start_link([], strategy: :one_for_one) + end +``` + +However, this would not be a good idea. In Elixir, we typically start processes inside supervision trees. In fact, we rarely use the `start_link` functions to start processes (except at the root of the supervision tree itself). Instead, do this: ```elixir -iex> KV.Registry.create(KV.Registry, "shopping") + def start(_type, _args) do + children = [ + {Registry, name: KV, keys: :unique} + ] + + Supervisor.start_link(children, strategy: :one_for_one) + end +``` + +A supervisor receives one or more child specifications that tell it exactly how to start each child. A child specification is typically represented by a `{module, options}` pair, as shown above, and often as simply the module name. Sometimes, these children are supervisors themselves, giving us supervision trees. + +Let's take it for a spin and see if we can indeed name our buckets using our new registry. Let's make sure to start a new `iex -S mix` (`recompile()` is not enough, as it does not reload your supervision tree) and then: + +```iex +iex> name = {:via, Registry, {KV, "shopping"}} +iex> KV.Bucket.start_link(name: name) +{:ok, #PID<0.43.0>} +iex> KV.Bucket.put(name, "milk", 1) :ok -iex> KV.Registry.lookup(KV.Registry, "shopping") -{:ok, #PID<0.88.0>} +iex> KV.Bucket.get(name, "milk") +1 ``` -Let's recap what is happening. Whenever we invoke `iex -S mix`, it automatically starts our application by calling `Application.start(:kv)`, which then invokes the application callback. The application callback's job is to start a **supervision tree**. Right now, our supervisor has a single child named `KV.Registry`, started with name `KV.Registry`. Our supervisor could have other children, and some of these children could be their own supervisors with their own children, leading to the so-called supervision trees. 
+Perfect, this time we didn't need to start the registry inside `iex`, as it was started as part of the application itself. + +By starting processes inside supervisors, we gain important properties such as: + + * **Introspection**: for each application, you can fully introspect and visualize each process in its supervision tree, its memory usage, message queue, etc + + * **Resilience**: when a process fails for an unexpected reason, its supervisor controls if and how those processes should be restarted, leading to self-healing systems + + * **Graceful shutdown**: when your application is shutting down, the children of a supervision tree are terminated in the opposite order they were started, leading to graceful shutdowns ## Projects or applications? -Mix makes a distinction between projects and applications. Based on the contents of our `mix.exs` file, we would say we have a Mix project that defines the `:kv` application. As we will see in later chapters, there are projects that don't define any application. +Mix makes a distinction between projects and applications. Based on the contents of our `mix.exs` file, we would say we have a Mix project that defines the `:kv` application. When we say "project" you should think about Mix. Mix is the tool that manages your project. It knows how to compile your project, test your project and more. It also knows how to compile and start the application relevant to your project. When we talk about applications, we talk about OTP. Applications are the entities that are started and stopped as a whole by the runtime. You can learn more about applications and how they relate to booting and shutting down of your system as a whole in the documentation for the `Application` module. 
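+
+As a quick sanity check of the concepts above, you can ask the runtime which applications are currently started. Inside `iex -S mix`, you could try something along these lines (the exact list, descriptions, and versions will vary, and the `...` elides the remaining entries):
+
+```elixir
+iex> Application.started_applications()
+[{:kv, ~c"kv", ~c"0.1.0"}, ...]
+```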
-## Next steps
+## Summing up
+
+We learned important concepts in this chapter:
+
+  * Name registries allow us to find processes in a given machine (or, as we will see in the future, even in a cluster)
+
+  * Applications bundle our modules, their dependencies, and how code starts and stops

-Although this chapter was the first time we implemented a supervisor, it was not the first time we used one! In the previous chapter, when we used `start_supervised!` to start the registry during our tests, `ExUnit` started the registry under a supervisor managed by the ExUnit framework itself. By defining our own supervisor, we provide more structure on how we initialize, shutdown and supervise processes in our applications, aligning our production code and tests with best practices.
+  * Processes are started as part of supervisors for introspection and fault-tolerance

-But we are not done yet. So far we are supervising the registry but our application is also starting buckets. Since buckets are started dynamically, we can use a special type of supervisor called `DynamicSupervisor`, which is optimized to handle such scenarios. Let's explore it next.
+In the next chapter, we will tie it all up by making sure all our buckets are named and supervised. To do so, we will learn a new tool called dynamic supervisors.
diff --git a/lib/elixir/pages/mix-and-otp/task-and-gen-tcp.md b/lib/elixir/pages/mix-and-otp/task-and-gen-tcp.md
index 8ae44d38962..fe037ee3dcc 100644
--- a/lib/elixir/pages/mix-and-otp/task-and-gen-tcp.md
+++ b/lib/elixir/pages/mix-and-otp/task-and-gen-tcp.md
@@ -5,7 +5,7 @@

# Task and gen_tcp

-In this chapter, we are going to learn how to use Erlang's [`:gen_tcp` module](`:gen_tcp`) to serve requests. This provides a great opportunity to explore Elixir's `Task` module. In future chapters, we will expand our server so that it can actually serve the commands.
+In this chapter, we are going to learn how to use Erlang's [`:gen_tcp` module](`:gen_tcp`) to serve requests. 
This provides a great opportunity to explore Elixir's `Task` module. In future chapters, we will expand our server so that it can actually interact with buckets. ## Echo server @@ -17,10 +17,10 @@ A TCP server, in broad strokes, performs the following steps: 2. Waits for a client connection on that port and accepts it 3. Reads the client request and writes a response back -Let's implement those steps. Move to the `apps/kv_server` application, open up `lib/kv_server.ex`, and add the following functions: +Let's implement those steps. Create a new `lib/kv/server.ex` and add the following functions: ```elixir -defmodule KVServer do +defmodule KV.Server do require Logger def accept(port) do @@ -62,7 +62,7 @@ defmodule KVServer do end ``` -We are going to start our server by calling `KVServer.accept(4040)`, where 4040 is the port. The first step in `accept/1` is to listen to the port until the socket becomes available and then call `loop_acceptor/1`. `loop_acceptor/1` is a loop accepting client connections. For each accepted connection, we call `serve/1`. +We are going to start our server by calling `KV.Server.accept(4040)`, where 4040 is the port. The first step in `accept/1` is to listen to the port until the socket becomes available and then call `loop_acceptor/1`. `loop_acceptor/1` is a loop accepting client connections. For each accepted connection, we call `serve/1`. `serve/1` is another loop that reads a line from the socket and writes those lines back to the socket. Note that the `serve/1` function uses the pipe operator `|>/2` to express this flow of operations. The pipe operator evaluates the left side and passes its result as the first argument to the function on the right side. The example above: @@ -85,7 +85,7 @@ This is pretty much all we need to implement our echo server. Let's give it a tr Start an IEx session inside the `kv_server` application with `iex -S mix`. 
Inside IEx, run:

```elixir
-iex> KVServer.accept(4040)
+iex> KV.Server.accept(4040)
```

The server is now running, and you will even notice the console is blocked. Let's use [a `telnet` client](https://en.wikipedia.org/wiki/Telnet) to access our server. There are clients available on most operating systems, and their command lines are generally similar:

@@ -109,10 +109,12 @@ My particular telnet client can be exited by typing `ctrl + ]`, typing `quit`, a

Once you exit the telnet client, you will likely see an error in the IEx session:

-    ** (MatchError) no match of right hand side value: {:error, :closed}
-        (kv_server) lib/kv_server.ex:45: KVServer.read_line/1
-        (kv_server) lib/kv_server.ex:37: KVServer.serve/1
-        (kv_server) lib/kv_server.ex:30: KVServer.loop_acceptor/1
+```text
+** (MatchError) no match of right hand side value: {:error, :closed}
+    (kv) lib/kv/server.ex:45: KV.Server.read_line/1
+    (kv) lib/kv/server.ex:37: KV.Server.serve/1
+    (kv) lib/kv/server.ex:30: KV.Server.loop_acceptor/1
+```

That's because we were expecting data from `:gen_tcp.recv/2` but the client closed the connection. We need to handle such cases better in future revisions of our server.

@@ -120,36 +122,25 @@ For now, there is a more important bug we need to fix: what happens if our TCP a

## Tasks

-We have learned about agents, generic servers, and supervisors. They are all meant to work with multiple messages or manage state. But what do we use when we only need to execute some task and that is it?
-
-The `Task` module provides this functionality exactly. For example, it has a `Task.start_link/1` function that receives an anonymous function and executes it inside a new process that will be part of a supervision tree.
+Whenever you have an existing function and you simply want to execute it when your application starts, the `Task` module is exactly what you need. 
For example, it has a `Task.start_link/1` function that receives an anonymous function and executes it inside a new process that will be part of a supervision tree.

-Let's give it a try. Open up `lib/kv_server/application.ex`, and let's change the supervisor in the `start/2` function to the following:
+Let's give it a try. Open up `lib/kv.ex` and add a new child:

```elixir
   def start(_type, _args) do
     children = [
-      {Task, fn -> KVServer.accept(4040) end}
+      {Registry, name: KV, keys: :unique},
+      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one},
+      {Task, fn -> KV.Server.accept(4040) end}
     ]

-    opts = [strategy: :one_for_one, name: KVServer.Supervisor]
-    Supervisor.start_link(children, opts)
+    Supervisor.start_link(children, strategy: :one_for_one)
   end
```

-As usual, we've passed a two-element tuple as a child specification, which in turn will invoke `Task.start_link/1`.
-
-With this change, we are saying that we want to run `KVServer.accept(4040)` as a task. We are hardcoding the port for now but this could be changed in a few ways, for example, by reading the port out of the system environment when starting the application:
-
-```elixir
-port = String.to_integer(System.get_env("PORT") || "4040")
-# ...
-{Task, fn -> KVServer.accept(port) end}
-```
+With this change, we are saying that we want to run `KV.Server.accept(4040)` as a task. We are hardcoding the port for now but we will make it configurable in later chapters. As usual, we've passed a two-element tuple as a child specification, which in turn will invoke `Task.start_link/1`.

-Insert these changes in your code and now you may start your application using the following command `PORT=4321 mix run --no-halt`, notice how we are passing the port as a variable, but still defaults to `4040` if none is given.
-
-Now that the server is part of the supervision tree, it should start automatically when we run the application. 
Start your server, now passing the port, and once again use the `telnet` client to make sure that everything still works:
+Now that the server is part of the supervision tree, it should start automatically when we run the application. Run `iex -S mix` to boot the app and use the `telnet` client to make sure that everything still works:

```console
-$ telnet 127.0.0.1 4321
+$ telnet 127.0.0.1 4040
@@ -178,7 +169,7 @@ HELLOOOOOO?

It doesn't seem to work at all. That's because we are serving requests in the same process that are accepting connections. When one client is connected, we can't accept another client.

-## Task supervisor
+## Adding (flawed) concurrency

In order to make our server handle simultaneous connections, we need to have one process working as an acceptor that spawns other processes to serve requests. One solution would be to change:

@@ -195,43 +186,49 @@ to also use `Task.start_link/1`:

```elixir
 defp loop_acceptor(socket) do
   {:ok, client} = :gen_tcp.accept(socket)
-  Task.start_link(fn -> serve(client) end)
+  {:ok, pid} = Task.start_link(fn -> serve(client) end)
+  :ok = :gen_tcp.controlling_process(client, pid)
   loop_acceptor(socket)
 end
 ```

-We are starting a linked Task directly from the acceptor process. But we've already made this mistake once. Do you remember?
+In the new acceptor loop, we are starting a new task every time there is a new client. Now, if you attempt to connect two clients at the same time, it should work!
+
+Or does it? For example, what happens when you exit one telnet session? The other session should crash! The reason for this crash is twofold:
+
+1. We have a bug in our server where we don't expect `:gen_tcp.recv/2` to return an `{:error, :closed}` tuple
+
+2. Because each server task is linked to the acceptor process, if one task crashes, the acceptor process will also crash, taking down all other tasks and clients

-This is similar to the mistake we made when we called `KV.Bucket.start_link/1` straight from the registry. 
That meant a failure in any bucket would bring the whole registry down.
+An important rule of thumb throughout this guide is to always start processes as children of supervisors. The code above is an excellent example of what happens when we don't. If we don't isolate the different parts of our systems, failures can cascade through our system, as would happen in other languages.

-The code above would have the same flaw: if we link the `serve(client)` task to the acceptor, a crash when serving a request would bring the acceptor, and consequently all other connections, down.
+To fix this, we could use a `DynamicSupervisor`, but tasks also provide a specialized `Task.Supervisor` which has better ergonomics and is optimized for supervising tasks themselves. Let's give it a try.

-We fixed the issue for the registry by using a simple one for one supervisor. We are going to use the same tactic here, except that this pattern is so common with tasks that `Task` already comes with a solution: a simple one for one supervisor that starts temporary tasks as part of our supervision tree.
+## Adding a task supervisor

-Let's change `start/2` once again, to add a supervisor to our tree:
+Let's change `start/2` in `lib/kv.ex` once more, to add the task supervisor to our tree:

```elixir
   def start(_type, _args) do
-    port = String.to_integer(System.get_env("PORT") || "4040")
-
     children = [
-      {Task.Supervisor, name: KVServer.TaskSupervisor},
-      {Task, fn -> KVServer.accept(port) end}
+      {Registry, name: KV, keys: :unique},
+      {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one},
+      {Task.Supervisor, name: KV.ServerSupervisor},
+      {Task, fn -> KV.Server.accept(4040) end}
     ]

-    opts = [strategy: :one_for_one, name: KVServer.Supervisor]
-    Supervisor.start_link(children, opts)
+    Supervisor.start_link(children, strategy: :one_for_one)
   end
```

-We'll now start a `Task.Supervisor` process with name `KVServer.TaskSupervisor`. 
Remember, since the acceptor task depends on this supervisor, the supervisor must be started first.
+We'll now start a `Task.Supervisor` process with name `KV.ServerSupervisor`. Keep in mind that the order children are started matters. For example, the acceptor must come last because, if it comes first, it means our application can start accepting requests before the `Task.Supervisor` is running or before we can locate buckets. Shutting down an application will also stop the children in reverse order, guaranteeing a clean termination.

Now we need to change `loop_acceptor/1` to use `Task.Supervisor` to serve each request:

```elixir
defp loop_acceptor(socket) do
  {:ok, client} = :gen_tcp.accept(socket)
-  {:ok, pid} = Task.Supervisor.start_child(KVServer.TaskSupervisor, fn -> serve(client) end)
+  {:ok, pid} = Task.Supervisor.start_child(KV.ServerSupervisor, fn -> serve(client) end)
  :ok = :gen_tcp.controlling_process(client, pid)
  loop_acceptor(socket)
end
@@ -239,76 +236,72 @@ end

You might notice that we added a line, `:ok = :gen_tcp.controlling_process(client, pid)`. This makes the child process the "controlling process" of the `client` socket. If we didn't do this, the acceptor would bring down all the clients if it crashed because sockets would be tied to the process that accepted them (which is the default behavior).

-Start a new server with `PORT=4040 mix run --no-halt` and we can now open up many concurrent telnet clients. You will also notice that quitting a client does not bring the acceptor down. Excellent!
+Now start a new server with `iex -S mix` and try to open up many concurrent telnet clients. You will notice that quitting a client does not bring the acceptor down, even though we haven't fixed the bug in `:gen_tcp.recv/2` yet (which we will address in the next chapter). Excellent!
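+
+Since we gave the task supervisor a name, we can also introspect it at runtime. With a couple of telnet clients connected, `Task.Supervisor.children/1` should list one process per connection (the PIDs below are illustrative and will differ on your machine):
+
+```elixir
+iex> Task.Supervisor.children(KV.ServerSupervisor)
+[#PID<0.153.0>, #PID<0.154.0>]
+```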
-Here is the full echo server implementation: +## Restart strategies -```elixir -defmodule KVServer do - require Logger +There is one important topic we haven't explored yet with the necessary depth. What happens when a supervised process crashes? - @doc """ - Starts accepting connections on the given `port`. - """ - def accept(port) do - {:ok, socket} = :gen_tcp.listen(port, - [:binary, packet: :line, active: false, reuseaddr: true]) - Logger.info "Accepting connections on port #{port}" - loop_acceptor(socket) - end +In the previous chapter, when we started a bucket and killed it, the supervisor automatically started one in its place: - defp loop_acceptor(socket) do - {:ok, client} = :gen_tcp.accept(socket) - {:ok, pid} = Task.Supervisor.start_child(KVServer.TaskSupervisor, fn -> serve(client) end) - :ok = :gen_tcp.controlling_process(client, pid) - loop_acceptor(socket) - end +```elixir +iex> children = [{KV.Bucket, name: :shopping}] +iex> Supervisor.start_link(children, strategy: :one_for_one) +iex> KV.Bucket.put(:shopping, "milk", 1) +iex> pid = Process.whereis(:shopping) +#PID<0.48.0> +iex> Process.exit(pid, :kill) +true +iex> Process.whereis(:shopping) +#PID<0.50.0> +``` - defp serve(socket) do - socket - |> read_line() - |> write_line(socket) +What exactly happens when a process terminates is part of its child specification. For `KV.Bucket`, we have this: - serve(socket) - end +```elixir +iex> KV.Bucket.child_spec([]) +%{id: KV.Bucket, start: {KV.Bucket, :start_link, [[]]}} +``` - defp read_line(socket) do - {:ok, data} = :gen_tcp.recv(socket, 0) - data - end +However, for tasks, we have this: - defp write_line(line, socket) do - :gen_tcp.send(socket, line) - end -end +```elixir +iex> Task.child_spec(fn -> :ok end) +%{ + id: Task, + restart: :temporary, + start: {Task, :start_link, [#Function<43.39164016/0 in :erl_eval.expr/6>]} +} ``` -Since we have changed the supervisor specification, we need to ask: is our supervision strategy still correct? 
+Notice that a task says `:restart` is `:temporary`. `KV.Bucket` says nothing, which means it defaults to `:permanent`. `:temporary` means that a process is never restarted, regardless of why it crashed. `:permanent` means a process is always restarted, regardless of the exit reason. There is also `:transient`, which means it won't be restarted as long as it terminates successfully.
+
+Now we must ask ourselves, are those the correct settings?

-In this case, the answer is yes: if the acceptor crashes, there is no need to crash the existing connections. On the other hand, if the task supervisor crashes, there is no need to crash the acceptor too.
+For `KV.Bucket`, using `:permanent` seems logical, as we should not require the user to recreate a bucket they have previously created. Although currently we would lose the bucket data, in an actual system we would add mechanisms to recover it on initialization. However, for tasks, we have used them in two opposing ways in this chapter, which means at least one of them is wrong.

-However, there is still one concern left, which are the restart strategies. Tasks, by default, have the `:restart` value set to `:temporary`, which means they are not restarted. This is an excellent default for the connections started via the `Task.Supervisor`, as it makes no sense to restart a failed connection, but it is a bad choice for the acceptor. If the acceptor crashes, we want to bring the acceptor up and running again.
+We use a task to start the acceptor. The acceptor is a critical component of our infrastructure. If it crashes, it means we won't accept further requests, and our server would then be useless as no one can connect to it. On the other hand, we also use `Task.Supervisor` to start tasks that deal with each connection. In this case, restarting may not be useful at all, given the reason we crashed could just as well be a connection issue, and attempting to restart over the same connection would lead to further failures.
-Let's fix this. We know that for a child of shape `{Task, fun}`, Elixir will invoke `Task.child_spec(fun)` to retrieve the underlying child specification. Therefore, one might imagine that to change the `{Task, fun}` specification to have a `:restart` of `:permanent`, we would need to change the `Task` module. However, that's impossible to do, as the `Task` module is defined as part of Elixir's standard library (and even if it was possible, it is unlikely it would be a good idea).
-Luckily, this can be done by using `Supervisor.child_spec/2`, which allows us to configure a child specification with new values. Let's rewrite `start/2` in `KVServer.Application` once more:
+Therefore, we want the acceptor to run in `:permanent` mode, while keeping the tasks under `Task.Supervisor` as `:temporary`. Luckily, Elixir provides `Supervisor.child_spec/2`, which allows us to override values in an existing child specification, as we do below.
+
+Let's change `start/2` in `lib/kv.ex` once more to the following:

```elixir
 def start(_type, _args) do
-  port = String.to_integer(System.get_env("PORT") || "4040")
-
   children = [
-    {Task.Supervisor, name: KVServer.TaskSupervisor},
-    Supervisor.child_spec({Task, fn -> KVServer.accept(port) end}, restart: :permanent)
+    {Registry, name: KV, keys: :unique},
+    {DynamicSupervisor, name: KV.BucketSupervisor, strategy: :one_for_one},
+    {Task.Supervisor, name: KV.ServerSupervisor},
+    Supervisor.child_spec({Task, fn -> KV.Server.accept(4040) end}, restart: :permanent)
   ]

-  opts = [strategy: :one_for_one, name: KVServer.Supervisor]
-  Supervisor.start_link(children, opts)
+  Supervisor.start_link(children, strategy: :one_for_one)
 end
```

Now we have an always running acceptor that starts temporary task processes under an always running task supervisor.

-## Wrapping up
+## Leveraging the ecosystem

In this chapter, we implemented a basic TCP acceptor while exploring concurrency and fault-tolerance.
Our acceptor can manage concurrent connections, but it is still not ready for production. Production-ready TCP servers run a pool of acceptors, each with their own supervisor. Elixir's `PartitionSupervisor` might be used to partition and scale the acceptor, but it is out of scope for this guide. In practice, you will use existing packages tailored for this use-case, such as [Ranch](https://github.com/ninenines/ranch) (in Erlang) or [Thousand Island](https://github.com/mtrudel/thousand_island) (in Elixir). diff --git a/lib/elixir/scripts/elixir_docs.exs b/lib/elixir/scripts/elixir_docs.exs index 74571ac0f8d..1676a01078e 100644 --- a/lib/elixir/scripts/elixir_docs.exs +++ b/lib/elixir/scripts/elixir_docs.exs @@ -49,15 +49,13 @@ canonical = System.fetch_env!("CANONICAL") "lib/elixir/pages/references/unicode-syntax.md", "lib/elixir/pages/mix-and-otp/introduction-to-mix.md", "lib/elixir/pages/mix-and-otp/agents.md", - "lib/elixir/pages/mix-and-otp/genservers.md", "lib/elixir/pages/mix-and-otp/supervisor-and-application.md", "lib/elixir/pages/mix-and-otp/dynamic-supervisor.md", - "lib/elixir/pages/mix-and-otp/erlang-term-storage.md", - "lib/elixir/pages/mix-and-otp/dependencies-and-umbrella-projects.md", "lib/elixir/pages/mix-and-otp/task-and-gen-tcp.md", "lib/elixir/pages/mix-and-otp/docs-tests-and-with.md", - "lib/elixir/pages/mix-and-otp/distributed-tasks.md", - "lib/elixir/pages/mix-and-otp/config-and-releases.md", + "lib/elixir/pages/mix-and-otp/config-and-distribution.md", + "lib/elixir/pages/mix-and-otp/genservers.md", + "lib/elixir/pages/mix-and-otp/releases.md", "lib/elixir/pages/meta-programming/quote-and-unquote.md", "lib/elixir/pages/meta-programming/macros.md", "lib/elixir/pages/meta-programming/domain-specific-languages.md", @@ -73,9 +71,9 @@ canonical = System.fetch_env!("CANONICAL") groups_for_extras: [ "Getting started": ~r"pages/getting-started/.*\.md$", Cheatsheets: ~r"pages/cheatsheets/.*\.cheatmd$", + "Mix & OTP": ~r"pages/mix-and-otp/.*\.md$", 
    "Anti-patterns": ~r"pages/anti-patterns/.*\.md$",
     "Meta-programming": ~r"pages/meta-programming/.*\.md$",
-    "Mix & OTP": ~r"pages/mix-and-otp/.*\.md$",
     References: ~r"pages/references/.*\.md$"
   ],
   groups_for_docs: [
diff --git a/lib/mix/lib/mix/project.ex b/lib/mix/lib/mix/project.ex
index 7c89b2ba35e..060c533c885 100644
--- a/lib/mix/lib/mix/project.ex
+++ b/lib/mix/lib/mix/project.ex
@@ -130,6 +130,67 @@ defmodule Mix.Project do
   makes sure Elixir is not added as a dependency to the generated `.app` file
   or to the escript generated with `mix escript.build`, and so on.
 
+  ## Umbrella projects
+
+  Umbrella projects are a convenience to help you organize and manage multiple
+  applications. While they provide a degree of separation between applications,
+  those applications are not fully decoupled, as they share the same configuration
+  and the same dependencies.
+
+  In an umbrella project, you have an `apps/` folder where you store each application.
+  Then, instead of each app in the umbrella having its own configuration, build cache,
+  lockfile and so on, they all point to the parent project by specifying the following
+  configuration in their `mix.exs`:
+
+      build_path: "../../_build",
+      config_path: "../../config/config.exs",
+      deps_path: "../../deps",
+      lockfile: "../../mix.lock",
+
+  The pattern of keeping multiple applications in the same repository is known as
+  [monorepo](https://en.wikipedia.org/wiki/Monorepo). Umbrella projects build on
+  this pattern by providing conveniences to compile, test and run multiple
+  applications at once. When an umbrella application needs to depend on another
+  one, this is done by passing the `in_umbrella: true` option to the dependency.
+  If an umbrella application `:foo` depends on its sibling `:bar`, you can specify
+  this dependency in `foo`'s `mix.exs` file as:
+
+      {:bar, in_umbrella: true}
+
+  ### Undoing umbrellas
+
+  Using umbrella projects can impact how you design and write your software and,
+  as time passes, they may turn out to be the wrong choice.
+  If you find yourself in a position where you want to use different configurations
+  in each application for the same dependency or use different dependency versions,
+  then it is likely your codebase has grown beyond what umbrellas can provide.
+
+  In such cases, you have two options:
+
+    1. Convert everything into a single Mix project, which can be done in steps.
+       First move all files in `lib`, `test`, `priv`, and friends into a single
+       application, while still keeping the overall umbrella structure and
+       `mix.exs` files. For example, if your umbrella has three applications,
+       `foo`, `bar` and `baz`, where `baz` depends on both `foo` and `bar`,
+       move all source code to `baz`. Then remove `foo` and `bar` one by one,
+       updating any configuration and removing references to the `:foo` and
+       `:bar` application names, until you have only a single application.
+
+    2. Remove the umbrella structure while keeping the applications distinct.
+       This is done by moving applications outside of the umbrella
+       project's `apps/` directory and updating the projects' `mix.exs` files
+       to no longer set the `build_path`, `config_path`, `deps_path`, and
+       `lockfile` configurations, guaranteeing each of them has its own
+       build and dependency structure.
+
+  Keep in mind that umbrellas are one of many options for managing private
+  packages within your organization. You might:
+
+    1. Have multiple directories inside the same repository and use `:path`
+       dependencies (which is essentially the monorepo pattern)
+    2. Use private Git repositories and Mix's ability to fetch Git dependencies
+    3. 
Publish packages to a private [Hex.pm](https://hex.pm/) organization
+
   ## Invoking this module
 
   This module contains many functions that return project information and