# Docker Problem with Network Changes

All Docker containers of a given service, as defined in its service commissioning file, run inside a separate Docker network. This service-specific Docker network is named after the [service ID](https://docs.cybus.io/2-0-6/documentation/services/serviceid). It connects all containers of the service, as well as the ingress proxy container of Connectware. The ingress proxy container is named `connectware`, prefixed with the Docker Compose project name (e.g., `connectware`) and suffixed with the number 1. Hence, the full container name can be `connectware_connectware_1`.

In addition to the service-specific networks, there is one more Docker network that connects the Connectware-internal application containers. Its name consists of the Docker Compose project name as prefix (e.g., `connectware`) followed by `default`, resulting in, e.g., `connectware_default`.

There is one somewhat unexpected caveat: a change of the Docker network configuration can lead to a temporary loss of certain data connections. Such a change can occur only when a Connectware service is enabled or disabled. For details, read on.

## Service-Specific Networks

Each time a Connectware service with container resources is enabled, a new service-specific Docker network is created. The service-specific network is then connected to all service containers, as well as to the common ingress proxy container. Upon disabling a Connectware service, this network is disconnected from the ingress proxy and the other containers.

Hence, the Docker network configuration of the ingress proxy container changes every time a service is enabled or disabled, if that service contains containers. This is not a problem as long as the network change does not alter the default gateway setting of the ingress proxy.

## Ingress Proxy Default Gateway

Whether or not a connected Docker network sets the default gateway of a container depends on the lexicographical ordering of the network names.

Without any service-specific networks, the ingress proxy is connected to the Connectware-internal network named after the Docker Compose project name, for example, `connectware_default`. Any data connection from outside clients to, e.g., REST endpoints of Connectware relies on the ingress proxy configuration, which should route the request from the outside to the Connectware-internal network, so that the appropriate service container responds to the specific REST request. This concerns both HTTP clients (for REST) and MQTT clients (for MQTT data connections).

The problem occurs if there are services whose service ID sorts lexicographically before the Connectware-internal network name. In the default installation, the Connectware-internal network name is `connectware_default`. If another service with service ID `anotherService` is enabled, its name sorts first because `a` comes before `c`. Hence, upon enabling `anotherService`, the Docker runtime changes the ingress proxy’s default gateway from the `connectware_default` network to the `anotherService` network.

This causes a connection loss for several seconds, until reachability of the internal containers is restored by trying not only the default gateway network but also, one by one, the additional networks. Connection losses on the order of 10 seconds have been observed, after which everything is restored automatically.
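
The ordering rule described above can be illustrated with a minimal Python sketch. It is not Docker's implementation: the function name `predicted_default_gateway_network` is hypothetical, and the sketch assumes that network names are compared with a plain code-point string comparison.

```python
def predicted_default_gateway_network(network_names):
    """Return the network name that sorts first lexicographically.

    Assumption: plain code-point string comparison approximates the
    ordering described in this section. This is only an illustration,
    not the Docker runtime's actual selection logic.
    """
    if not network_names:
        raise ValueError("at least one network name is required")
    return min(network_names)


# Only the Connectware-internal network is attached to the ingress proxy.
before = predicted_default_gateway_network(["connectware_default"])

# A service with service ID "anotherService" is enabled.
after = predicted_default_gateway_network(
    ["connectware_default", "anotherService"]
)

print(before, "->", after)
```

Under these assumptions, the predicted default gateway network moves from `connectware_default` to `anotherService` as soon as the second network is attached, which is the situation described in this section.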

## Observed Errors

If the network change introduces the described error, the observed effect is that any ongoing MQTT client or HTTP client loses the connection to the Connectware service for approximately 10 seconds. After that time, the reachability of the internal containers is automatically restored by trying the routing not only on the default gateway network but also on the additional Docker networks.

In addition, when this event occurs, the Docker logs of the `container-manager` contain a message of log level WARNING stating:

{% code lineNumbers="true" %}

```text
Ingress container default network changed to anotherService. Was connectware_default.
```

{% endcode %}
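
To check collected log output for this event, the message can be matched with a regular expression. The following is a minimal sketch: the pattern is derived only from the example message above, the helper name `find_gateway_changes` is hypothetical, and the exact wording of the message in other Connectware versions is an assumption.

```python
import re

# Pattern derived from the example message above. The exact wording in
# other Connectware versions is an assumption.
GATEWAY_CHANGE = re.compile(
    r"Ingress container default network changed to (?P<new>\S+)\. "
    r"Was (?P<old>\S+)\."
)


def find_gateway_changes(log_lines):
    """Return (old_network, new_network) tuples for each matching line."""
    changes = []
    for line in log_lines:
        match = GATEWAY_CHANGE.search(line)
        if match:
            changes.append((match.group("old"), match.group("new")))
    return changes


sample_lines = [
    "Ingress container default network changed to anotherService. "
    "Was connectware_default.",
    "some unrelated log line",
]
changes = find_gateway_changes(sample_lines)
print(changes)
```

The function accepts any iterable of text lines, so it can be applied to log lines that have been saved to a file or captured from the container logs.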

## Scope Limitation

This error concerns only a very specific scenario. It does not occur in any of the following cases:

* Connectware services are not enabled or disabled but just keep running.
* The enabled Connectware services do not contain any `Cybus::Container` resource.
* All installed services have service IDs that sort lexicographically after the Connectware-internal network name.
* There are no ongoing MQTT, HTTP, or TCP client connections from outside clients to Connectware. Connections from Connectware to other servers or machines are not affected.

## Workarounds

If this error is a problem for you, any of the following workarounds can be applied to avoid it:

* Rename the Connectware-internal network so that it always sorts lexicographically before any service ID, such as `a_a_connectware...`. To achieve this renaming, the Docker Compose project name must be changed. By default, the Docker Compose project name is taken from the current working directory where the `docker-compose.yml` file is installed and the `docker compose` command is called. This directory can be renamed from `connectware` to the desired other name. Alternatively, the `-p` option can be used in all calls to `docker compose`, such as `docker compose -p a_a_connectware ...`. Watch out: When renaming a running installation, the Docker volumes must be renamed, too.
* Rename all service IDs so that they always sort lexicographically after the Connectware-internal network. For example, a convention could require every service ID to start with a prefix such as `s_`.
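
Either workaround can be sanity-checked before deployment by comparing the names involved. The sketch below is illustrative only: the helper `services_sorting_before` and the service IDs `s_machine` and `Zeta` are hypothetical, the network name `a_a_connectware_default` is derived from the project name `a_a_connectware` using the naming rule described at the top of this page, and the comparison is assumed to be a plain code-point string comparison.

```python
def services_sorting_before(internal_network, service_ids):
    """Return the service IDs that sort before the internal network name.

    Assumption: plain code-point string comparison, used here as an
    approximation of the ordering that decides the default gateway.
    """
    return sorted(sid for sid in service_ids if sid < internal_network)


# Hypothetical service IDs for illustration.
service_ids = ["anotherService", "s_machine"]

# Default installation: "anotherService" sorts before the internal network.
risky_default = services_sorting_before("connectware_default", service_ids)

# After renaming the project to "a_a_connectware" (first workaround).
risky_renamed = services_sorting_before("a_a_connectware_default", service_ids)

# Uppercase letters sort before lowercase ones in a code-point comparison.
risky_uppercase = services_sorting_before("a_a_connectware_default", ["Zeta"])

print(risky_default, risky_renamed, risky_uppercase)
```

Under the code-point assumption, a service ID that starts with an uppercase letter would still sort before a lowercase network name. Whether Docker compares names in exactly this way is not confirmed here, so treat the result as a hint rather than a guarantee.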

## Further Reading

This problem in the Docker runtime has been reported and has been known for many years. Further reading:

* <https://github.com/moby/moby/issues/21741#issuecomment-210090728>
* <https://github.com/moby/libnetwork/issues/1141#issuecomment-215522809>
* <https://gist.github.com/jfellus/cfee9efc1e8e1baf9d15314f16a46eca>
