Troubleshooting Connectware on Kubernetes
This guide helps you diagnose and resolve common issues with Connectware running on Kubernetes. Follow the sections in order for systematic troubleshooting.
How to Troubleshoot
When troubleshooting Connectware issues, proceed in the following order:
1. Check pod status to identify obvious failures.
2. Inspect pod events for Kubernetes-level errors.
3. Review logs to identify application-level problems.
4. Collect debug information before making changes.
5. Restart or remove unhealthy pods if appropriate.
If you cannot identify or resolve the issue, contact the Cybus support team at [email protected].
Prerequisites
Before troubleshooting, ensure that:

- Helm version 3 is installed on your system.
- The Kubernetes command line tool kubectl is configured and has access to the target installation.
- You know the name and namespace of your Connectware installation. See Obtaining the Name, Namespace, and Version of Your Connectware Installation.
- You have permissions to view pods, logs, and events.
Checking Pod Status
Connectware requires all pods to be in Running status with all containers ready. Check this by running:
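For example, replacing `<namespace>` with the namespace of your Connectware installation:

```bash
kubectl get pods -n <namespace>
```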
Expected output: All pods show matching values in the `READY` column, for example `1/1` or `2/2`, and a `STATUS` of `Running`.
```
NAME                                       READY   STATUS    RESTARTS   AGE
admin-web-app-8649f98fc6-sktb7             1/1     Running   0          3m1s
auth-server-5f46964984-5rwvc               1/1     Running   0          2m39s
broker-0                                   1/1     Running   0          2m11s
broker-1                                   1/1     Running   0          2m50s
connectware-5b948ffdff-tj2x9               1/1     Running   0          2m41s
container-manager-5f5678657c-94486         1/1     Running   0          2m46s
ingress-controller-85fffdcb4b-m8kpm        1/1     Running   0          2m37s
nats-0                                     1/1     Running   0          2m31s
nats-1                                     1/1     Running   0          2m30s
nats-2                                     1/1     Running   0          2m30s
postgresql-0                               1/1     Running   0          2m58s
protocol-mapper-69f59f7dd4-6xhkf           1/1     Running   0          2m42s
resource-status-tracking-fcd58dc79-cl5nw   1/1     Running   0          2m12s
resource-status-tracking-fcd58dc79-vlzqs   1/1     Running   0          2m22s
service-manager-6b5fffd66d-gt584           1/1     Running   0          2m52s
system-control-server-bd486f5bd-2mkxz      1/1     Running   0          2m45s
welder-robots-0                            1/1     Running   0          2m59s
workbench-57d4b59fbb-gqwnb                 1/1     Running   0          2m38s
```
Identifying Unhealthy Pods
A pod should be considered unhealthy if it:

- Shows an error state such as `CrashLoopBackOff` or a failed `Init` phase.
- Remains in a transitional state for an extended time.
- Shows mismatched Ready values (e.g., `0/1` instead of `1/1`).
Example of a pod that is unable to start:

```
NAME                          READY   STATUS     RESTARTS   AGE
auth-server-b4b69ccfd-fvsmz   0/1     Init:0/1   0          8m
```
Inspecting Pod Events
To identify the cause of a pod issue:
Describe the pod:
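For example, replacing the placeholders with your pod name and namespace:

```bash
kubectl describe pod <pod-name> -n <namespace>
```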
Review the Events section at the bottom of the output.
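For example, output similar to the following (a representative volume-related event; the exact wording varies by cluster and storage provider) points to a storage problem:

```
Events:
  Type     Reason       Age   From     Message
  ----     ------       ----  ----     -------
  Warning  FailedMount  2m    kubelet  Unable to attach or mount volumes: unmounted volumes=[data]
```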
Events like these indicate a cluster-level issue where required volumes are unavailable. Such issues must be resolved at the Kubernetes or infrastructure level and are outside the scope of the Connectware documentation.
If no clear cause is visible, continue with log inspection.
As general guidance:
Issues immediately after upgrades or configuration changes are often caused by incorrect Helm values.
Issues appearing later are often related to cluster infrastructure.
Checking Logs Using Kubetail
For viewing logs from multiple pods simultaneously, we recommend using kubetail. kubetail is a wrapper around kubectl that aggregates logs from multiple pods. By default, kubetail always follows the logs, as `kubectl logs -f` would.
Installation instructions are available at https://github.com/johanhaleby/kubetail.
Here are a few examples of how you can use kubetail. Also make sure to check `kubetail --help`.
Displaying Logs from All Pods in a Namespace
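kubetail matches pods by a search term, so one way to capture everything in a namespace is an all-matching regular expression, assuming your kubetail version supports the `--regex` flag:

```bash
kubetail ".*" --regex -n <namespace>
```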
Displaying Logs of Pods That Match a Search Term
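For example, to aggregate logs from all pods whose names contain `auth-server`:

```bash
kubetail auth-server -n <namespace>
```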
Displaying Logs for Pods That Match a Regular Expression
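For example, to follow the `broker` and `nats` pods at once (the pattern is an illustration based on the pod listing above):

```bash
kubetail "^broker|^nats" --regex -n <namespace>
```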
Displaying Logs from the Past
You can combine the parameter `-s <timeframe>` with any other command to display logs from the past up to now:
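For example, to show the last hour of logs for all pods matching `protocol-mapper`:

```bash
kubetail protocol-mapper -s 1h -n <namespace>
```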
Displaying Logs of a Terminated Container of a Pod
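Assuming your kubetail version supports the `--previous` flag (the equivalent of `kubectl logs --previous`):

```bash
kubetail auth-server --previous -n <namespace>
```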
Displaying Timestamps
If the logs you are viewing are missing timestamps, you can use the parameter `--timestamps` for kubetail to add timestamps to each log line:
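For example:

```bash
kubetail auth-server --timestamps -n <namespace>
```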
Checking Logs Using Kubectl
If you do not want to use kubetail as described in the previous section, you can use kubectl directly to read logs.
Here are a few examples of how you can use it:
Displaying and Tailing Logs of a Pod
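For example, to follow the logs of a single pod:

```bash
kubectl logs -f <pod-name> -n <namespace>
```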
Displaying and Tailing Logs for All Pods with a Label
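For example, to follow all pods that share a label (the label key and value are placeholders for whatever your pods carry):

```bash
kubectl logs -f -l <label-key>=<label-value> -n <namespace>
```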
Displaying Logs of a Terminated Container of a Pod
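For example, to read the logs of the previously terminated container of a pod:

```bash
kubectl logs --previous <pod-name> -n <namespace>
```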
Displaying Logs from the Past
You can combine the parameter `--since <timeframe>` with any other command to display logs from the past up to now:
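For example, to show the last 30 minutes of logs:

```bash
kubectl logs --since=30m <pod-name> -n <namespace>
```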
Displaying Timestamps
If the logs that you are viewing are missing timestamps, you can use the parameter `--timestamps` for kubectl to add timestamps to each log line:
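For example:

```bash
kubectl logs --timestamps <pod-name> -n <namespace>
```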
Removing Unhealthy Pods
When a pod is identified as unhealthy, either through pod status checks or log inspection, first collect the current system state using the debugging script (collect_debug.sh) from the Connectware Kubernetes Toolkit. This ensures that diagnostic information is preserved before any changes are made. For more information, see Collecting Debug Information.
After collecting debug data, delete the affected pod:
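For example:

```bash
kubectl delete pod <pod-name> -n <namespace>
```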
The controller managing the pod will automatically create a new instance. Restarting pods in this way often resolves transient issues and does not delete persisted data.
Special Considerations for StatefulSet Pods
Pods whose names end with a fixed number, such as broker-0, belong to a StatefulSet. Kubernetes handles StatefulSets differently from other workloads. An unhealthy StatefulSet pod is not automatically replaced after configuration changes.
If a StatefulSet pod is unhealthy due to a configuration issue, you must:
Correct the configuration.
Manually delete the affected pod so it can be recreated with the updated settings.
This behavior is intentional, as StatefulSets often manage persistent or stateful data.
In Connectware, StatefulSets include the broker, nats, postgresql, and any protocol mapper agents that you have defined.
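For example, after correcting the configuration, you would recreate the first broker pod like this:

```bash
kubectl delete pod broker-0 -n <namespace>
```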
Collecting Debug Information
The Connectware Kubernetes Toolkit provides a debugging script (collect_debug.sh) to gather logs and state information. Always run this script before attempting fixes.
Prerequisites
- The following tools installed: `kubectl`, `tar`, `sed`, `rm`, `sort`
- Access to the target installation using `kubectl`
Downloading the Debugging Script
You can download the debugging script from https://download.cybus.io/.
Example
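A possible download using curl; the exact path on the download portal is an assumption, so adjust it to the actual location of `collect_debug.sh`:

```bash
# Hypothetical path: check https://download.cybus.io/ for the actual location
curl -fLO https://download.cybus.io/<path-to>/collect_debug.sh
chmod +x collect_debug.sh
```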
Running the Debugging Script
The debugging script takes parameters to target the correct Kubernetes namespace holding a Connectware installation:
| Parameter | Argument | Description |
| --- | --- | --- |
| `-n` | namespace | The Kubernetes namespace to use |
| `-k` | path to kubeconfig file | A kubeconfig file to use instead of the default (`~/.kube/config`) |
| `-c` | name of kubeconfig context | The name of a kubeconfig context different from the currently selected one |
If your kubectl command is already configured to point at the correct cluster, you can use the debugging script by specifying the namespace:
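For example:

```bash
./collect_debug.sh -n <namespace>
```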
The script creates a compressed archive in the current directory. Provide this archive to support when reporting issues.
If you use a central log aggregator, also include relevant logs for the affected timeframe.
Troubleshooting Protocol-Mapper Agents
This section covers issues with protocol-mapper agents that are often the result of minor configuration mistakes.
Agent does not connect when the Connectware broker uses mTLS
Symptoms
Agent log shows:
Likely cause
mTLS is not enabled in the agent configuration.
Resolution
Enable mTLS for the agent as described in Using Mutual Transport Layer Security (mTLS) for agents with the connectware-agent Helm chart.
TLS connection fails before handshake
Symptoms
Agent log shows:
Likely cause
The agent is connecting to the wrong MQTTS port on the broker.
Resolution
Verify `mqttPort` and `mqttDataPort` in the `protocolMapperAgents` section of your Helm `values.yaml`. If you are not using a custom setup, these values are correct by default and can be removed.
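A minimal sketch of where these keys live in `values.yaml`; the agent name and port values shown here are assumptions for illustration, not chart defaults you should copy:

```yaml
protocolMapperAgents:
  - name: welder-robots    # must match the registered agent name
    # Only set these for custom setups; the chart defaults are usually correct.
    mqttPort: 8883         # assumed broker MQTTS control port
    mqttDataPort: 8883     # assumed broker MQTTS data port
```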
Agent with mTLS enabled does not connect to broker
Symptoms
Agent log shows:
Likely cause
Certificates are missing or invalid.
Resolution
Verify certificate generation and configuration as described in Using Mutual Transport Layer Security (mTLS) for agents with the connectware-agent Helm chart.
Ensure the Kubernetes objects were created from files named `ca-chain.pem`, `tls.crt`, and `tls.key`. Incorrect filenames will cause the agent to fail to locate the certificates.
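A sketch of creating such an object with kubectl; the secret name is a placeholder, and `--from-file` is used without a key override so that the filenames become the object keys:

```bash
kubectl create secret generic <agent-tls-secret> \
  --from-file=ca-chain.pem \
  --from-file=tls.crt \
  --from-file=tls.key \
  -n <namespace>
```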
Agent registration fails due to certificate Common Name mismatch
Symptoms
Allowing an mTLS-enabled agent in the Connectware Client Registry fails with the message "An Error has occurred - Registration failed."
auth-server logs show:
Likely cause
The certificate Common Name does not match the agent name.
Resolution
Ensure the certificate Common Name exactly matches the agent name configured in the Helm value `name`.
Agent registration fails with connection error
Symptoms
Agent log shows:
Likely cause
The agent certificate was signed by the wrong Certificate Authority.
Resolution
Verify the agent certificate was signed by the Certificate Authority that is used by Connectware.
Agent registration fails with conflict error
Symptoms
Agent log shows:
Likely cause
An agent or user with the same name already exists.
Resolution
Every agent needs a user whose username matches the value configured in the `name` Helm value for the agent.
Ensure the agent name is unique.
If there is another agent with the same name, do the following:
Delete the agent.
Delete the corresponding user. For more information, see Deleting Users.
If you created a user with the agent’s name for something else, you have to choose a different name for the agent.
Agent enters CrashLoopBackOff due to license errors
Symptoms
- Agent pod enters `CrashLoopBackOff`.
- Logs show authentication or license errors followed by agent shutdown.
Example
Likely cause
Cached agent credentials are no longer valid.
Resolution
The agent needs to be re-registered:

1. Delete the agent.
2. Delete the corresponding user. For more information, see Deleting Users.
3. Delete the agent StatefulSet (see the sketch after this list).
4. Delete the agent PersistentVolumeClaim (see the sketch after this list).
5. Apply the configuration changes via the `helm upgrade` command.
For more information, see Applying Helm Configuration Changes.
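A minimal sketch of steps 3 to 5; the StatefulSet, PersistentVolumeClaim, release, and chart names are placeholders that depend on your setup:

```bash
# Step 3: delete the agent StatefulSet (name is a placeholder)
kubectl delete statefulset <agent-name> -n <namespace>

# Step 4: delete the agent PersistentVolumeClaim (name is a placeholder)
kubectl delete pvc <agent-pvc-name> -n <namespace>

# Step 5: re-apply the Helm configuration (release and chart are placeholders)
helm upgrade <release-name> <chart> --values values.yaml -n <namespace>
```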