Collect diagnostic data to streamline support case resolution
Abstract
- 1. Diagnostic data collection overview
- 2. Run must-gather on OpenShift to collect diagnostic data
- 3. Run must-gather on Kubernetes to collect diagnostic data
- 4. Collect diagnostic data from air-gapped clusters with a mirrored must-gather image
- 5. Diagnostic data types and collection scope
- 6. Collect heap dumps to diagnose memory issues
- 7. Configuration options for diagnostic collection
- 8. Diagnostic data output structure and organization
- 9. Additional resources
Collect diagnostic data from Red Hat Developer Hub deployments on OpenShift Container Platform and Kubernetes platforms by using the must-gather tool to accelerate troubleshooting and provide complete information for support cases.
1. Diagnostic data collection overview
The must-gather tool collects diagnostic data and logging information from your cluster, which helps Red Hat Support resolve your deployment issues efficiently.
These features are for Technology Preview only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs), might not be functionally complete, and Red Hat does not recommend using them for production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information on Red Hat Technology Preview features, see Technology Preview Features Scope.
Use must-gather when you open a support ticket with Red Hat Global Support Services, troubleshoot deployment issues, or capture deployment state before RHDH upgrades.
Typical collection size ranges from 10 MB to 50 MB for basic collections and 500 MB to 2 GB when including heap dumps. Plan for adequate storage and bandwidth when collecting for support tickets.
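A quick free-space check before you start a collection can be sketched as follows. The helper name has_free_mb is hypothetical and is not part of must-gather; the 2048 MB figure in the usage comment comes from the 2 GB upper bound quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of must-gather): succeed only if the file
# system that holds a directory has at least the requested free space.
has_free_mb() {
    # $1: directory where the collection will be written
    # $2: required free space in MB
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( $2 * 1024 )) ]
}

# Example: require 2048 MB before a collection that includes heap dumps.
#   has_free_mb . 2048 || echo "Not enough free space for heap dumps" >&2
```

The check only reads the available-space column from df; it does not reserve space, so treat the result as a hint rather than a guarantee.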
The must-gather output might contain sensitive information such as configuration values, environment variables, application logs, and resource definitions. While the tool automatically sanitizes known secret types, you should review the output before sharing it with support to check for domain-specific sensitive data.
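One way to start this review is a keyword scan of the output directory. The following is a minimal sketch: the helper name list_possible_secrets and the keyword list are assumptions, not part of the must-gather tool. A match is not proof of a leak, and no match does not guarantee that the output is clean.

```shell
#!/bin/sh
# Hypothetical helper (not part of must-gather): list files in a collected
# output directory that mention keywords often associated with credentials.
# The keyword list is an assumption; extend it with terms from your domain.
list_possible_secrets() {
    # $1: path to the must-gather output directory
    grep -rliE 'password|passwd|token|secret|api[_-]?key|private[_-]?key' "$1" | sort
}

# Example (replace the placeholder with your actual directory name):
#   list_possible_secrets ./must-gather.local.<timestamp>
```

Open each listed file and inspect the matching lines manually before you share the archive.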
2. Run must-gather on OpenShift to collect diagnostic data
Collect diagnostic data from RHDH deployments on OpenShift Container Platform.
Prerequisites
- You have deployed Red Hat Developer Hub on OpenShift Container Platform.
- Cluster administrator access or RBAC permissions for resource inspection.
- Authenticated OpenShift CLI (oc) session on OpenShift Container Platform.
Procedure
Run must-gather:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:1.10
Output:
./must-gather.local.<timestamp>
Note: The must-gather tool automatically detects and collects data from all RHDH instances in the cluster, regardless of whether they were deployed using the Operator or Helm chart.
For advanced scenarios, you can limit data collection by appending one of the following to the command:
- Skip Helm-based deployments: -- /usr/bin/gather --without-helm
- Skip Operator-based deployments: -- /usr/bin/gather --without-operator
Verification
Confirm that the output directory exists:
$ ls must-gather.local.<timestamp>/
The directory contains subdirectories for each deployment method found:
- operator/: Data from Operator-based deployments
- helm/: Data from Helm-based deployments
3. Run must-gather on Kubernetes to collect diagnostic data
Collect diagnostic data from RHDH deployments on supported Kubernetes platforms.
Prerequisites
- You have deployed Red Hat Developer Hub on a supported Kubernetes platform.
- Cluster administrator access or RBAC permissions for resource inspection.
- Authenticated CLI: kubectl and helm.
Procedure
Install must-gather:
$ helm upgrade --install my-rhdh-must-gather redhat-developer-hub-must-gather \
    --repo https://charts.openshift.io \
    --namespace rhdh-diagnostics \
    --create-namespace
Note: The must-gather tool automatically detects and collects data from all RHDH instances in the cluster, regardless of whether they were deployed using the Operator or Helm chart.
For advanced scenarios, you can limit data collection using Helm values:
- Skip Helm-based deployments: --set gather.withHelm=false
- Skip Operator-based deployments: --set gather.withOperator=false
Wait for collection:
$ kubectl wait --for=condition=ready pod \
    -l app.kubernetes.io/instance=my-rhdh-must-gather,app.kubernetes.io/component=gather \
    --timeout=3600s -n rhdh-diagnostics
Extract data:
$ kubectl exec deploy/my-rhdh-must-gather -c data-holder -n rhdh-diagnostics -- \
    tar czf - -C /must-gather . > rhdh-must-gather-output.tar.gz
Clean up:
$ helm uninstall my-rhdh-must-gather -n rhdh-diagnostics
Verification
Confirm that the output archive contains diagnostic data:
$ tar -tzf rhdh-must-gather-output.tar.gz
The archive contains subdirectories for each deployment method found:
- operator/: Data from Operator-based deployments
- helm/: Data from Helm-based deployments
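The archive check above can be narrowed to the top-level entries only, which makes it easier to see whether operator/ and helm/ are present. This is a minimal sketch; the helper name list_archive_top_level is hypothetical, and it assumes the archive was created with the extract command shown in the procedure, so entries start with "./".

```shell
#!/bin/sh
# Hypothetical helper: print the unique top-level entries of a .tar.gz archive.
list_archive_top_level() {
    # $1: path to the archive
    # Strip the leading "./", drop the empty root entry, keep the first path part.
    tar -tzf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}

# Example:
#   list_archive_top_level rhdh-must-gather-output.tar.gz
```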
4. Collect diagnostic data from air-gapped clusters with a mirrored must-gather image
Mirror the must-gather image to your registry to enable diagnostic data collection from disconnected clusters.
Prerequisites
- skopeo tool installed on the machine with registry access.
- Access to an internal container registry.
- For OpenShift Container Platform: Cluster administrator access to update the global pull secret.
- For Kubernetes: Ability to create secrets in target namespaces.
Procedure
Choose your mirroring workflow based on network access:
Partially disconnected:
Your local machine can access both the internet and the internal registry.
Fully disconnected:
Requires a bastion host or approved file transfer method to move images.
Mirror the must-gather image:
For partially disconnected environments:
$ skopeo copy \
    docker://registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:<version> \
    docker://<internal-registry>/rhdh/rhdh-must-gather:<version>
For fully disconnected environments:
On a machine with internet access, pull the must-gather image:
$ skopeo copy \
    docker://registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:<version> \
    dir:./rhdh-must-gather-<version>
Pull the Helm chart:
$ helm pull redhat-developer-hub-must-gather --repo https://charts.openshift.io
Transfer the image directory and the redhat-developer-hub-must-gather-<version>.tgz chart file to your bastion host.
Push the image to the internal registry:
$ skopeo copy \
    dir:./rhdh-must-gather-<version> \
    docker://<internal-registry>/rhdh/rhdh-must-gather:<version>
Replace <version> with the must-gather version that matches your RHDH deployment and <internal-registry> with your internal registry hostname.
Configure image pull authentication:
Note: If your internal registry requires authentication, you must configure image pull secrets before running must-gather. Without proper credentials, the must-gather pod fails with ImagePullBackOff errors.
On OpenShift Container Platform:
Add your internal registry credentials to the cluster-wide pull secret. For instructions, see Updating the global cluster pull secret in the Red Hat OpenShift Container Platform documentation.
On Kubernetes:
Create a docker-registry secret:
$ kubectl create secret docker-registry must-gather-pull-secret \
    --docker-server=<internal-registry> \
    --docker-username=<username> \
    --docker-password=<password> \
    -n rhdh-diagnostics
Run must-gather using the mirrored image:
On OpenShift Container Platform:
$ oc adm must-gather --image=<internal-registry>/rhdh/rhdh-must-gather:<version>
On Kubernetes (partially disconnected):
$ helm upgrade --install my-rhdh-must-gather redhat-developer-hub-must-gather \
    --repo https://charts.openshift.io \
    --namespace rhdh-diagnostics \
    --create-namespace \
    --set image.registry=<internal-registry> \
    --set image.repository=rhdh/rhdh-must-gather \
    --set image.tag=<version> \
    --set 'imagePullSecrets[0].name=must-gather-pull-secret'
On Kubernetes (fully disconnected):
$ helm upgrade --install my-rhdh-must-gather /path/to/redhat-developer-hub-must-gather-<version>.tgz \
    --namespace rhdh-diagnostics \
    --create-namespace \
    --set image.registry=<internal-registry> \
    --set image.repository=rhdh/rhdh-must-gather \
    --set image.tag=<version> \
    --set 'imagePullSecrets[0].name=must-gather-pull-secret'
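The image references used throughout this procedure follow one pattern, which can be sketched as a dry-run helper that prints the skopeo command for the partially disconnected case instead of running it. The function name print_mirror_command is hypothetical; the source and destination paths are the ones shown in the steps above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not execute) the skopeo command that
# mirrors the must-gather image for a given version and internal registry.
print_mirror_command() {
    # $1: must-gather version tag
    # $2: internal registry hostname
    src="docker://registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:$1"
    dst="docker://$2/rhdh/rhdh-must-gather:$1"
    echo "skopeo copy $src $dst"
}

# Example (registry.example.com is a placeholder hostname):
#   print_mirror_command 1.10 registry.example.com
```

Printing the command first lets you confirm that the tag and registry path are correct before any image is copied.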
Verification
Verify that the collection starts without ImagePullBackOff errors:
$ oc get pods -n openshift-must-gather-<random>   # On OpenShift
$ kubectl get pods -n rhdh-diagnostics            # On Kubernetes
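To make the check above scriptable, you can filter the pod listing for image pull failures. This is a minimal sketch: the helper name pods_with_pull_errors is hypothetical, and it assumes the default `get pods` table layout, where the third column is STATUS.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` or `oc get pods` table output
# on stdin and print the names of pods whose STATUS column reports an image
# pull failure (ImagePullBackOff or ErrImagePull).
pods_with_pull_errors() {
    awk 'NR > 1 && $3 ~ /ImagePullBackOff|ErrImagePull/ {print $1}'
}

# Example:
#   kubectl get pods -n rhdh-diagnostics | pods_with_pull_errors
```

An empty result means that no pod in the listing currently reports a pull failure.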
Troubleshooting
ImagePullBackOff errors:
Check that registry credentials are configured correctly and that the image path matches your registry structure.
Certificate errors:
If your internal registry uses self-signed certificates, configure certificate trust on the cluster. For OpenShift Container Platform, see Configuring image registry repository mirroring in the Red Hat OpenShift Container Platform documentation.
5. Diagnostic data types and collection scope
The must-gather tool uses collectors to gather specific types of diagnostic data.
Reduce collection time by:
- Excluding collectors not needed for your deployment
- Filtering to specific namespaces
- Enabling optional collectors only when required
5.1. Default-enabled collectors
The following collectors run by default. Exclude collectors not needed for your deployment using --without-<collector> flags:
- platform collector: Collects cluster version and platform type (OpenShift Container Platform, AKS, EKS, or GKE). Disable with --without-platform.
- helm collector: Collects Helm release information. Use for Helm-based deployments only. Disable with --without-helm.
- operator collector: Collects Operator logs and custom resources. Use for Operator-based deployments only. Disable with --without-operator.
- orchestrator collector: Collects workflow data for the Orchestrator plugin. Use to troubleshoot Orchestrator workflows only. Disable with --without-orchestrator.
- route-ingress collector: Collects route definitions (OpenShift Container Platform) and ingress configurations (Kubernetes). Use to troubleshoot external access only. Disable with --without-route or --without-ingress.
- namespace-inspect collector: Collects namespace resources for support teams. Limit to specific namespaces using --namespaces. Recommended for all collections. Disable with --without-namespace-inspect.
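The exclusion flags listed above can be assembled into the argument string that follows the must-gather image on OpenShift. The sketch below is illustrative: the function name gather_exclusions is hypothetical, and it accepts only the flag suffixes named in this section.

```shell
#!/bin/sh
# Hypothetical helper: build the "-- /usr/bin/gather ..." argument string
# from a list of collectors to exclude. Only the suffixes documented in this
# section are accepted; anything else is rejected.
gather_exclusions() {
    args="-- /usr/bin/gather"
    for c in "$@"; do
        case "$c" in
            platform|helm|operator|orchestrator|route|ingress|namespace-inspect)
                args="$args --without-$c" ;;
            *)
                echo "unknown collector: $c" >&2
                return 1 ;;
        esac
    done
    echo "$args"
}

# Example: print the arguments for an Operator-only deployment without Orchestrator.
#   gather_exclusions helm orchestrator
```

Append the printed string to the `oc adm must-gather --image=...` command shown earlier.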
5.2. Opt-in collectors
The following collectors do not run by default. Enable them only when needed:
- cluster-info collector
Collects cluster-wide state information beyond basic platform metadata. Enable with --cluster-info.
Important: Enable this collector only when support requests it, because it significantly increases collection time and might affect cluster performance during collection.
- heap-dumps collector
Collects Node.js memory snapshots for diagnosing memory issues. Enable with --with-heap-dumps.
Requirements:
- Liveness probe timeout increased to prevent pod restarts
- Sufficient storage (50 MB to 500 MB per heap dump file)
Use only when:
- Troubleshooting memory leaks
- Support requests memory analysis
- Diagnosing out-of-memory errors
6. Collect heap dumps to diagnose memory issues
Collect Node.js heap snapshots when you troubleshoot memory leaks, out-of-memory errors, or when support requests memory analysis.
Prerequisites
- Sufficient storage for heap dump files (50 MB to 500 MB each).
- For the SIGUSR2 method: Node.js 12 or later, which supports the --heapsnapshot-signal flag.
Procedure
Choose the heap dump collection method:
Inspector protocol (default):
- Works out of the box
- Use unless your deployment sets --disable-sigusr1 in NODE_OPTIONS
SIGUSR2 signal:
- Requires configuring NODE_OPTIONS in your deployment first
- Use when the inspector protocol is not available
If using SIGUSR2, configure NODE_OPTIONS:
Note: Skip this step if using the inspector protocol (default method).
Operator deployments:
apiVersion: rhdh.redhat.com/v1alpha1
kind: Backstage
metadata:
  name: my-rhdh
spec:
  application:
    extraEnvs:
      - name: NODE_OPTIONS
        value: "--heapsnapshot-signal=SIGUSR2 --diagnostic-dir=/tmp"

Apply the change:
$ oc apply -f backstage-cr.yaml
Helm deployments:
upstream:
  backstage:
    extraEnvVars:
      - name: NODE_OPTIONS
        value: "--heapsnapshot-signal=SIGUSR2 --diagnostic-dir=/tmp"

Update the release:
$ helm upgrade my-rhdh redhat-developer-hub/backstage -f values.yaml -n <namespace>
Wait for pods to restart before proceeding.
Increase the liveness probe timeout to prevent pod restarts:
Important: RHDH stops responding during heap dump collection, and large memory footprints (1 GB or more) can cause liveness probe failures. Plan collection during maintenance windows or low-traffic periods.
Operator deployments:
Update the Backstage custom resource to patch the deployment:
apiVersion: rhdh.redhat.com/v1alpha1
kind: Backstage
metadata:
  name: my-rhdh
spec:
  deployment:
    patch:
      spec:
        template:
          spec:
            containers:
              - name: backstage-backend
                livenessProbe:
                  failureThreshold: 180

Apply the change:
$ oc apply -f backstage-cr.yaml
Helm deployments:
Update your Helm values:
upstream:
  backstage:
    livenessProbe:
      failureThreshold: 180

Apply the change:
$ helm upgrade my-rhdh redhat-developer-hub/backstage -f values.yaml -n <namespace>
Wait for pods to restart before proceeding.
Run must-gather with heap dumps enabled:
On OpenShift Container Platform:
For inspector protocol (default):
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:1.10 -- /usr/bin/gather --with-heap-dumps
For SIGUSR2 method:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:1.10 -- /usr/bin/gather --with-heap-dumps --heap-dump-method sigusr2
On supported Kubernetes platforms:
For inspector protocol (default):
gather:
  heapDump:
    enabled: true
    method: "inspector"

For SIGUSR2 method:
gather:
  heapDump:
    enabled: true
    method: "sigusr2"

Collection takes 5 to 15 minutes depending on pod count and memory size.
Verification
Check for .heapsnapshot files:
$ find must-gather.local.<timestamp> -name "*.heapsnapshot"   # On OpenShift
$ tar -tzf rhdh-must-gather-output.tar.gz | grep heapsnapshot   # On Kubernetes
- If collection starts but times out before completing
You see progress messages, but collection eventually times out before finishing. Check the must-gather logs for warnings about liveness probe settings:
[WARN] Pod 'xxx' may restart during heap dump collection!
[WARN] Current: failureThreshold=3 × periodSeconds=10s = 30s before restart
[WARN] Required: at least 900s (HEAP_DUMP_TIMEOUT)
Set your liveness probe timeout longer than the collection time.
For Operator-based deployments, update the Backstage custom resource:
spec:
  deployment:
    patch:
      spec:
        template:
          spec:
            containers:
              - name: backstage-backend
                livenessProbe:
                  failureThreshold: 180

Apply with oc apply -f backstage-cr.yaml.
For Helm-based deployments, patch the deployment directly:
$ oc patch deployment <deployment-name> -n <namespace> \
    -p '{"spec":{"template":{"spec":{"containers":[{"name":"backstage-backend","livenessProbe":{"failureThreshold":180}}]}}}}'
Wait for your pods to restart with the new configuration, then run heap dump collection again.
- If collection times out
Large memory footprints (multiple gigabytes) can take 10-15 minutes to snapshot. Increase the heap dump timeout to allow more time for collection.
On OpenShift Container Platform, set the timeout using an environment variable:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:1.10 \
    -- /usr/bin/env HEAP_DUMP_TIMEOUT=1800 /usr/bin/gather --with-heap-dumps
On Kubernetes, set the timeout in your Helm values:
gather:
  heapDump:
    enabled: true
    timeout: "1800"
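The liveness probe warning in this section is based on simple arithmetic: the time before a restart is failureThreshold multiplied by periodSeconds, and that product must be at least the heap dump timeout. The following sketch computes the smallest failureThreshold that satisfies this; the function name min_failure_threshold is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: smallest failureThreshold for which
# failureThreshold * periodSeconds >= heap dump timeout (integer ceiling).
min_failure_threshold() {
    # $1: heap dump timeout in seconds (for example, the HEAP_DUMP_TIMEOUT value)
    # $2: liveness probe periodSeconds
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Examples with the 10-second period from the warning message:
#   min_failure_threshold 900 10    # prints 90
#   min_failure_threshold 1800 10   # prints 180
```

With a 10-second period, the failureThreshold of 180 used in this procedure allows 1800 seconds before a restart, which matches the extended HEAP_DUMP_TIMEOUT of 1800 shown above.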
7. Configuration options for diagnostic collection
Reference tables for must-gather command-line flags, environment variables, and Helm chart values.
7.1. Command-line flags
Command-line flags are available when running the gather script directly. On OpenShift Container Platform, pass these flags after -- /usr/bin/gather. On Kubernetes, use the equivalent Helm chart values instead.
| Flag | Description | Default | Example |
|---|---|---|---|
| --namespaces | Comma-separated list of namespaces to collect data from. | All namespaces | |
| --cluster-info | Include cluster-wide state information. Use only when requested by support. | Not included | |
| --with-heap-dumps | Trigger Node.js heap snapshots and collect heap dump files for memory analysis. | Not included | |
| | Comma-separated list of specific RHDH instance names to collect heap dumps from. Supports exact match, prefix match, or contains match. Only applies when --with-heap-dumps is set. | All RHDH instances | |
| --heap-dump-method | Method to trigger heap dumps. Valid values: inspector, sigusr2. | inspector | |
| --without-operator | Exclude the Operator collector. Use for Helm-based deployments. | Operator collector included | |
| --without-helm | Exclude the Helm collector. Use for Operator-based deployments. | Helm collector included | |
| --without-orchestrator | Exclude the Orchestrator collector. Use when not using Orchestrator functionality. | Orchestrator collector included | |
| --without-platform | Exclude the platform collector. | Platform collector included | |
| --without-route | Exclude the route collector (OpenShift Container Platform routes). | Route collector included | |
| --without-ingress | Exclude the ingress collector (Kubernetes ingress resources). | Ingress collector included | |
| --without-namespace-inspect | Exclude the namespace-inspect collector. Not recommended as it aids support navigation. | Namespace-inspect collector included | |
7.2. Environment variables
Environment variables control must-gather operational behavior when running the gather script directly. When using the Helm chart, use the Helm chart values described in the next section instead of environment variables.
| Variable | Description | Default | Helm Chart Equivalent |
|---|---|---|---|
| | Logging verbosity level. | | |
| | Timeout for individual kubectl and Helm commands (seconds). | | |
| | Relative time duration to limit log collection. | Not set (collects all available logs) | |
| | Absolute RFC3339 timestamp to limit log collection. | Not set (collects all available logs) | |
| HEAP_DUMP_TIMEOUT | Timeout in seconds for heap dump generation per pod. | | gather.heapDump.timeout |
7.3. Helm chart values
When deploying must-gather on Kubernetes platforms using the Helm chart, configure collection options using a values file. The following table shows key Helm values and their equivalent command-line flags or environment variables.
| Helm Value Path | Description | Equivalent Command-line Flag or Environment Variable |
|---|---|---|
| | Array of namespaces to collect from. | --namespaces |
| | Boolean to include cluster-wide state information. Use only when requested by support. | --cluster-info |
| gather.heapDump.enabled | Boolean to collect heap dumps from Node.js backend pods. | --with-heap-dumps |
| | Array of specific RHDH instance names to collect heap dumps from. | |
| gather.heapDump.method | Method to trigger heap dumps. Valid values: inspector, sigusr2. | --heap-dump-method |
| gather.heapDump.timeout | Timeout in seconds for heap dump collection. | HEAP_DUMP_TIMEOUT |
| gather.withOperator | Boolean to enable or disable the Operator collector. Set to false to exclude it. | --without-operator |
| gather.withHelm | Boolean to enable or disable the Helm collector. Set to false to exclude it. | --without-helm |
| | Boolean to enable or disable the Orchestrator collector. Set to false to exclude it. | --without-orchestrator |
| | Log level for must-gather operations. | |
| | Timeout in seconds for individual kubectl or helm commands. | |
| | Relative time duration to limit log collection. | |
| | Absolute RFC3339 timestamp to limit log collection. | |
Helm boolean values use positive logic (withX: true means include), while command-line flags use negative logic (--without-X means exclude).
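The inversion between the two styles can be sketched as a small lookup. The function name helm_set_for_flag is hypothetical, and it covers only the two mappings that the Kubernetes procedure in this document states explicitly; other flags are rejected rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: translate a negative-logic gather flag into the
# positive-logic Helm --set option. Only the two mappings shown in this
# document are included.
helm_set_for_flag() {
    case "$1" in
        --without-helm)     echo "--set gather.withHelm=false" ;;
        --without-operator) echo "--set gather.withOperator=false" ;;
        *)
            echo "no confirmed Helm equivalent for: $1" >&2
            return 1 ;;
    esac
}

# Example:
#   helm_set_for_flag --without-helm   # prints --set gather.withHelm=false
```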
8. Diagnostic data output structure and organization
The must-gather output directory organizes data by collector type, namespace, and resource type.
8.1. Top-level directory structure
The must-gather output directory contains files and subdirectories organized by collection method and data type. The following table shows the top-level structure:
| Path | Contents |
|---|---|
| | Collection metadata including timestamps, must-gather version, collectors that ran, and collection parameters. |
| | Data sanitization summary and details. |
| | All OpenShift Container Platform routes cluster-wide. |
| | All Kubernetes ingresses cluster-wide. |
| | Cluster-wide information (only present if you specified --cluster-info). |
| namespace-inspect/ | Deep namespace inspect data (collected by default). |
| | Platform and infrastructure information. |
| helm/ | Helm deployment data (native releases and standalone deployments). Only present if Helm-deployed RHDH instances are detected. |
| | Orchestrator-flavored deployment data. Only present if Orchestrator components are detected. |
| operator/ | Operator deployment data. Only present if RHDH operators are detected. |
8.2. Heap dump file locations
When the tool collects heap dumps with the --with-heap-dumps flag, they appear within deployment-specific directories, not under namespace-inspect/namespaces/. The location depends on the installation method:
Heap dump file paths by deployment type:
- Helm releases: helm/releases/ns=<namespace>/<release>/deployment/heap-dumps/…
- Helm standalone: helm/standalone/ns=<namespace>/<workload>/deployment/heap-dumps/…
- Operator: operator/backstage-crs/ns=<namespace>/<cr>/deployment/heap-dumps/…
Example:
helm/releases/ns=rhdh-prod/developer-hub/deployment/heap-dumps/pod=backstage-developer-hub-7d8f9c5b-xk2m4/container=backstage-backend/heapdump-20260430-143022.heapsnapshot
Analyze heap dump files using Chrome DevTools, Node.js heap analysis tools, or memory profilers.
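Because the namespace, pod, and container are encoded in the path as key=value segments, you can recover them from a file path before opening the snapshot. This is a minimal sketch based on the example path above; the function name describe_heap_dump is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the ns=, pod=, and container= segments from a
# heap dump path that follows the layout shown in this section.
describe_heap_dump() {
    # $1: relative path to a .heapsnapshot file inside the must-gather output
    ns=$(printf '%s\n' "$1" | sed -n 's|.*/ns=\([^/]*\)/.*|\1|p')
    pod=$(printf '%s\n' "$1" | sed -n 's|.*/pod=\([^/]*\)/.*|\1|p')
    container=$(printf '%s\n' "$1" | sed -n 's|.*/container=\([^/]*\)/.*|\1|p')
    echo "namespace=$ns pod=$pod container=$container"
}

# Example:
#   find must-gather.local.<timestamp> -name "*.heapsnapshot" |
#       while read -r f; do describe_heap_dump "$f"; done
```

For the example path above, the helper reports namespace rhdh-prod, pod backstage-developer-hub-7d8f9c5b-xk2m4, and container backstage-backend.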
8.3. Common diagnostic data locations
The following table shows where to find frequently needed diagnostic information in the must-gather output:
| Diagnostic Data | Location |
|---|---|
| Backend pod logs | |
| PostgreSQL connection configuration | |
| Operator logs | |
| Backstage custom resource definition | |
| Route or Ingress definitions | |
| Helm release values | |
| Deployment configurations | |
| Service definitions | |
| Platform and cluster version | |
| Heap dumps for memory analysis | |