Collect diagnostic data to streamline support resolution
Generate diagnostic data to streamline support case resolution
Abstract
- 1. Generate diagnostic data using the must-gather tool to streamline support cases
- 1.1. Deployment methods and data collection
- 1.2. Diagnostic data types and collection scope
- 1.3. Run must-gather for Operator-based deployments to collect diagnostic data
- 1.4. Run must-gather for Helm-based deployments to collect diagnostic data
- 1.5. Collect heap dumps to diagnose memory issues
- 1.6. Configuration options for diagnostic collection
- 1.7. Diagnostic data output structure and organization
- 1.8. Platform compatibility and supported configurations
- 1.9. Additional resources
Generate diagnostic data from Red Hat Developer Hub deployments on OpenShift Container Platform and Kubernetes platforms by using the must-gather tool to accelerate troubleshooting and provide complete information for support cases.
1. Generate diagnostic data using the must-gather tool to streamline support cases
When you open a support ticket for a Red Hat Developer Hub (RHDH) deployment issue, use the must-gather tool to generate diagnostic data quickly. The tool collects deployment manifests, logs, and resource definitions in one operation, which helps support resolve platform issues faster.
Use must-gather when opening a support ticket with Red Hat Global Support Services, troubleshooting deployment issues, or capturing deployment state before RHDH upgrades.
On OpenShift Container Platform:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:latest
Output: ./must-gather.local.<timestamp>
On supported Kubernetes platforms:
$ helm install my-rhdh-must-gather redhat-developer-hub-must-gather \
    --repo https://charts.openshift.io \
    --namespace rhdh-diagnostics \
    --create-namespace
$ kubectl wait --for=condition=ready pod \
    -l app.kubernetes.io/instance=my-rhdh-must-gather,app.kubernetes.io/component=gather \
    --timeout=3600s -n rhdh-diagnostics
$ kubectl exec deploy/my-rhdh-must-gather -c data-holder -n rhdh-diagnostics -- \
    tar czf - -C /must-gather . > rhdh-must-gather-output.tar.gz
$ helm uninstall my-rhdh-must-gather -n rhdh-diagnostics
Typical collection size: 10-50 MB (basic), 500 MB - 2 GB (with heap dumps). Plan for adequate storage and bandwidth when collecting for support tickets.
1.1. Deployment methods and data collection
RHDH supports two deployment methods, and the must-gather tool uses different collectors for each method.
1.1.1. Operator-based deployments
Operator-based deployments use the RHDH Operator to manage Backstage custom resources. These deployments have Backstage custom resources in the cluster that the Operator controller watches and reconciles.
The must-gather Operator collector gathers:
- Operator controller logs
- Backstage custom resource definitions
- Operator configuration and status
For Operator deployments, exclude the Helm collector to focus collection on relevant data.
1.1.2. Helm-based deployments
Helm-based deployments use the RHDH Helm chart to deploy Kubernetes resources directly without an Operator. These deployments have Helm releases that manage the application lifecycle.
The must-gather Helm collector gathers:
- Helm release information and manifests
- Deployed resource configurations
- Helm chart values
For Helm deployments, exclude the Operator collector to focus collection on relevant data.
1.1.3. Mixed deployments
Clusters can contain both Operator-managed and Helm-managed RHDH instances in different namespaces. In this case, run must-gather without exclusion flags to collect data from all deployment methods.
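The choice of exclusion flag per deployment method can be sketched as a small shell helper. This is a minimal illustration, not part of the must-gather tool: the function name `collector_flags` and the keywords `operator`, `helm`, and `mixed` are hypothetical. The flags and the image reference are the ones used elsewhere in this document. The script only prints the resulting command; it does not run it.

```shell
#!/usr/bin/env bash
# Map a deployment type to the collector-exclusion flag described above.
# The function name and the operator|helm|mixed keywords are illustrative only.
collector_flags() {
  case "$1" in
    operator) printf '%s\n' '--without-helm' ;;      # Operator-based: skip the Helm collector
    helm)     printf '%s\n' '--without-operator' ;;  # Helm-based: skip the Operator collector
    mixed)    printf '\n' ;;                         # Mixed: no exclusion flags
    *)        printf 'unknown deployment type: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print (do not run) the command for an Operator-based deployment.
image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:latest
echo "oc adm must-gather --image=${image} -- /usr/bin/gather $(collector_flags operator)"
```

For a mixed cluster the helper returns an empty string, so the gather script runs with all default collectors.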
1.2. Diagnostic data types and collection scope
The must-gather tool uses collectors to gather specific types of diagnostic data.
Reduce collection time by:
- Excluding collectors not needed for your deployment
- Filtering to specific namespaces
- Enabling optional collectors only when required
The must-gather output might contain sensitive information:
- Configuration values and environment variables
- Application logs and resource definitions
Security measures:
- Rely on automatic sanitization of known secret types.
- Review output before sharing with support.
- Check for domain-specific sensitive data.
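As one way to support the manual review step, the following sketch lists files in a collected output directory that mention common credential keywords. The keyword list and the function name `scan_sensitive` are assumptions for illustration. The sketch does not replace the tool's built-in sanitization or a human review, and a throwaway directory stands in for real must-gather output.

```shell
#!/usr/bin/env bash
# List files under a directory that mention common credential keywords.
# The keyword list is an assumption; extend it with terms specific to your organization.
scan_sensitive() {
  local dir=$1
  # -r recurse, -I skip binary files, -i ignore case, -l print file names only
  grep -rIilE 'password|passwd|token|api[_-]?key|client[_-]?secret' "$dir" | sort
}

# Demonstration against a temporary directory that stands in for must-gather output.
demo=$(mktemp -d)
printf 'LOG_LEVEL=info\n' > "$demo/app.env"
printf 'GITHUB_TOKEN=abc123\n' > "$demo/integrations.env"
scan_sensitive "$demo"   # lists only integrations.env
rm -rf "$demo"
```

A match is only a prompt to inspect the file; keyword hits often turn out to be harmless field names.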
1.2.1. Default-enabled collectors
The following collectors run by default. Exclude collectors not needed for your deployment using --without-<collector> flags:
- platform collector
Collects cluster version and platform type (OpenShift Container Platform, AKS, EKS, or GKE). Disable with --without-platform.
- helm collector
Collects Helm release information. Use for Helm-based deployments only. Disable with --without-helm.
- operator collector
Collects Operator logs and custom resources. Use for Operator-based deployments only. Disable with --without-operator.
- orchestrator collector
Collects workflow data for the Orchestrator plugin. Use when troubleshooting Orchestrator workflows only. Disable with --without-orchestrator.
- route-ingress collector
Collects route definitions (OpenShift Container Platform) and ingress configurations (Kubernetes). Use when troubleshooting external access only. Disable with --without-route or --without-ingress.
- namespace-inspect collector
Collects namespace resources for support teams. Limit to specific namespaces using --namespaces. Recommended for all collections. Disable with --without-namespace-inspect.
1.2.2. Opt-in collectors
The following collectors do not run by default. Enable them only when needed:
- cluster-info collector
Collects cluster-wide state information beyond basic platform metadata. Enable with --cluster-info.
Important: Enable only when support requests it.
Impact:
- Significantly increases collection time.
- May impact cluster performance during collection.
- heap-dumps collector
Collects Node.js memory snapshots for diagnosing memory issues. Enable with --with-heap-dumps.
Requirements:
- Liveness probe timeout increased to prevent pod restarts
- Sufficient storage (50-500 MB per heap dump file)
Use only when:
- Troubleshooting memory leaks
- Support requests memory analysis
- Diagnosing out-of-memory errors
1.3. Run must-gather for Operator-based deployments to collect diagnostic data
Collect diagnostic data from Operator-based RHDH deployments.
Prerequisites
- Cluster administrator access or RBAC permissions for resource inspection.
- Authenticated CLI: OpenShift CLI (oc) (OpenShift Container Platform) or kubectl and helm (Kubernetes).
Procedure
Verify you have an Operator-based deployment:
$ oc get backstage -A
If this returns Backstage custom resources, you have an Operator deployment.
Run must-gather:
On OpenShift Container Platform:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:latest -- /usr/bin/gather --without-helm
Output: ./must-gather.local.<timestamp>
On supported Kubernetes platforms:
Install must-gather:
$ helm upgrade --install my-rhdh-must-gather redhat-developer-hub-must-gather \
    --repo https://charts.openshift.io \
    --namespace rhdh-diagnostics \
    --create-namespace \
    --set gather.withHelm=false
Wait for collection:
$ kubectl wait --for=condition=ready pod \
    -l app.kubernetes.io/instance=my-rhdh-must-gather,app.kubernetes.io/component=gather \
    --timeout=3600s -n rhdh-diagnostics
Extract data:
$ kubectl exec deploy/my-rhdh-must-gather -c data-holder -n rhdh-diagnostics -- \
    tar czf - -C /must-gather . > rhdh-must-gather-output.tar.gz
Clean up:
$ helm uninstall my-rhdh-must-gather -n rhdh-diagnostics
Verification
Check the output directory exists:
On OpenShift Container Platform:
$ ls must-gather.local.<timestamp>/operator/
On Kubernetes:
$ tar -tzf rhdh-must-gather-output.tar.gz | grep operator/
1.4. Run must-gather for Helm-based deployments to collect diagnostic data
Collect diagnostic data from Helm-based RHDH deployments.
Prerequisites
- Cluster administrator access or RBAC permissions for resource inspection.
- Authenticated CLI: OpenShift CLI (oc) (OpenShift Container Platform) or kubectl and helm (Kubernetes).
Procedure
Verify you have a Helm-based deployment:
$ helm list -A | grep developer-hub
If this returns Helm releases, you have a Helm deployment.
Run must-gather:
On OpenShift Container Platform:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:latest -- /usr/bin/gather --without-operator
Output: ./must-gather.local.<timestamp>
On supported Kubernetes platforms:
Install must-gather:
$ helm upgrade --install my-rhdh-must-gather redhat-developer-hub-must-gather \
    --repo https://charts.openshift.io \
    --namespace rhdh-diagnostics \
    --create-namespace \
    --set gather.withOperator=false
Wait for collection:
$ kubectl wait --for=condition=ready pod \
    -l app.kubernetes.io/instance=my-rhdh-must-gather,app.kubernetes.io/component=gather \
    --timeout=3600s -n rhdh-diagnostics
Extract data:
$ kubectl exec deploy/my-rhdh-must-gather -c data-holder -n rhdh-diagnostics -- \
    tar czf - -C /must-gather . > rhdh-must-gather-output.tar.gz
Clean up:
$ helm uninstall my-rhdh-must-gather -n rhdh-diagnostics
Verification
Check the output directory exists:
On OpenShift Container Platform:
$ ls must-gather.local.<timestamp>/helm/
On Kubernetes:
$ tar -tzf rhdh-must-gather-output.tar.gz | grep helm/
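The Kubernetes verification step can be wrapped in a small check that reports whether an extracted archive contains a given collector directory. This is a sketch under stated assumptions: the function name `archive_has_collector` is hypothetical, and the demonstration builds a tiny stand-in archive rather than using real must-gather output.

```shell
#!/usr/bin/env bash
# Succeed if the archive lists at least one entry under the given collector directory.
archive_has_collector() {   # args: archive collector-directory-name
  tar -tzf "$1" | grep -Eq "(^|/)$2/"
}

# Demonstration with a stand-in archive that contains only a helm/ directory.
demo=$(mktemp -d)
mkdir -p "$demo/src/helm"
echo 'release: developer-hub' > "$demo/src/helm/values.yaml"
tar czf "$demo/out.tar.gz" -C "$demo/src" .

if archive_has_collector "$demo/out.tar.gz" helm; then echo "helm data present"; fi
if ! archive_has_collector "$demo/out.tar.gz" operator; then echo "operator data missing"; fi
rm -rf "$demo"
```

A missing directory is expected when the corresponding collector was excluded, for example operator/ after a run with --without-operator.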
1.5. Collect heap dumps to diagnose memory issues
Collect Node.js heap snapshots when troubleshooting memory leaks, out-of-memory errors, or when support requests memory analysis.
Prerequisites
- Sufficient storage for heap dump files (50-500 MB each).
- For SIGUSR2 method: Node.js 12+ for --heapsnapshot-signal flag support.
Procedure
Choose the heap dump collection method:
Inspector protocol (default):
- Works out of the box
- Use unless your deployment sets --disable-sigusr1 in NODE_OPTIONS
SIGUSR2 signal:
- Requires configuring NODE_OPTIONS in your deployment first
- Use when the inspector protocol is not available
If using SIGUSR2, configure NODE_OPTIONS:
Note: Skip this step if using the inspector protocol (default method).
Operator deployments:
apiVersion: rhdh.redhat.com/v1alpha1
kind: Backstage
metadata:
  name: my-rhdh
spec:
  application:
    extraEnvs:
      - name: NODE_OPTIONS
        value: "--heapsnapshot-signal=SIGUSR2 --diagnostic-dir=/tmp"
Apply the change:
$ oc apply -f backstage-cr.yaml
Helm deployments:
upstream:
  backstage:
    extraEnvVars:
      - name: NODE_OPTIONS
        value: "--heapsnapshot-signal=SIGUSR2 --diagnostic-dir=/tmp"
Update the release:
$ helm upgrade my-rhdh redhat-developer-hub/backstage -f values.yaml -n <namespace>
Wait for pods to restart before proceeding.
Increase the liveness probe timeout to prevent pod restarts:
Important: RHDH stops responding during heap dump collection. Large memory footprints (1 GB+) can cause liveness probe failures. Plan collection during maintenance windows or low-traffic periods.
Operator deployments:
$ oc patch deployment <deployment-name> -n <namespace> \
    -p '{"spec":{"template":{"spec":{"containers":[{"name":"backstage-backend","livenessProbe":{"failureThreshold":180}}]}}}}'
Helm deployments:
upstream:
  backstage:
    livenessProbe:
      failureThreshold: 180
Wait for pods to restart before proceeding.
Run must-gather with heap dumps enabled:
On OpenShift Container Platform:
For inspector protocol (default):
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:latest -- /usr/bin/gather --with-heap-dumps
For SIGUSR2 method:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:latest -- /usr/bin/gather --with-heap-dumps --heap-dump-method sigusr2
On supported Kubernetes platforms:
For inspector protocol (default):
gather:
  heapDump:
    enabled: true
    method: "inspector"
For SIGUSR2 method:
gather:
  heapDump:
    enabled: true
    method: "sigusr2"
Collection takes 5-15 minutes depending on pod count and memory size.
Verification
Check for .heapsnapshot files:
$ find must-gather.local.<timestamp> -name "*.heapsnapshot"        # On OpenShift
$ tar -tzf rhdh-must-gather-output.tar.gz | grep heapsnapshot      # On Kubernetes
- If collection starts but times out before completing
You see progress messages, but collection times out before finishing. Check the must-gather logs for warnings about liveness probe settings:
[WARN] Pod 'xxx' may restart during heap dump collection!
[WARN] Current: failureThreshold=3 × periodSeconds=10s = 30s before restart
[WARN] Required: at least 900s (HEAP_DUMP_TIMEOUT)
Increase the liveness probe window (failureThreshold × periodSeconds) so that it is longer than the heap dump collection time:
$ oc patch deployment <deployment-name> -n <namespace> \
    -p '{"spec":{"template":{"spec":{"containers":[{"name":"backstage-backend","livenessProbe":{"failureThreshold":180}}]}}}}'
Wait for your pods to restart with the new configuration, then run heap dump collection again.
- If collection times out
Large memory footprints (multiple gigabytes) can take 10-15 minutes to snapshot. Increase the heap dump timeout to allow more time for collection.
On OpenShift Container Platform, set the timeout using an environment variable:
$ oc adm must-gather --image=registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:latest \
    -- /usr/bin/env HEAP_DUMP_TIMEOUT=1800 /usr/bin/gather --with-heap-dumps
On Kubernetes, set the timeout in your Helm values:
gather:
  heapDump:
    enabled: true
    timeout: "1800"
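The arithmetic behind the liveness probe warning can be worked through with two small shell functions. This is a minimal sketch with hypothetical function names. The inputs come from the sample warning in this section (failureThreshold=3, periodSeconds=10s, a 900-second requirement) and from the failureThreshold of 180 used in the patch command. Check the actual periodSeconds of your deployment before relying on the result.

```shell
#!/usr/bin/env bash
# Seconds a pod can stay unresponsive before the liveness probe restarts it.
probe_window() {            # args: failureThreshold periodSeconds
  echo $(( $1 * $2 ))
}

# Smallest failureThreshold whose window covers the heap dump timeout (rounds up).
min_failure_threshold() {   # args: timeoutSeconds periodSeconds
  echo $(( ($1 + $2 - 1) / $2 ))
}

probe_window 3 10              # values from the sample warning: prints 30
min_failure_threshold 900 10   # 900-second requirement from the warning: prints 90
min_failure_threshold 1800 10  # HEAP_DUMP_TIMEOUT=1800: prints 180
probe_window 180 10            # failureThreshold from the patch command: prints 1800
```

With a 10-second probe period, a failureThreshold of 180 gives a 1800-second window. That window matches a HEAP_DUMP_TIMEOUT of 1800 and is well above the 900-second requirement shown in the warning.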
1.6. Configuration options for diagnostic collection
Reference tables for must-gather command-line flags, environment variables, and Helm chart values.
1.6.1. Command-line flags
Command-line flags are available when running the gather script directly. On OpenShift Container Platform, pass these flags after -- /usr/bin/gather. On Kubernetes, use the equivalent Helm chart values instead.
| Flag | Description | Default | Example |
|---|---|---|---|
| | Comma-separated list of namespaces to collect data from. | All namespaces | |
| | Include cluster-wide state information. Use only when requested by support. | Not included | |
| | Trigger Node.js heap snapshots and collect heap dump files for memory analysis. | Not included | |
| | Comma-separated list of specific RHDH instance names to collect heap dumps from. Supports exact match, prefix match, or contains match. Only applies when | All RHDH instances | |
| | Method to trigger heap dumps. Valid values: | | |
| | Exclude the Operator collector. Use for Helm-based deployments. | Operator collector included | |
| | Exclude the Helm collector. Use for Operator-based deployments. | Helm collector included | |
| | Exclude the Orchestrator collector. Use when not using Orchestrator functionality. | Orchestrator collector included | |
| | Exclude the platform collector. | Platform collector included | |
| | Exclude the route collector (OpenShift Container Platform routes). | Route collector included | |
| | Exclude the ingress collector (Kubernetes ingress resources). | Ingress collector included | |
| | Exclude the namespace-inspect collector. Not recommended as it aids support navigation. | Namespace-inspect collector included | |
1.6.2. Environment variables
Environment variables control must-gather operational behavior when running the gather script directly. When using the Helm chart, use the Helm chart values described in the next section instead of environment variables.
| Variable | Description | Default | Helm Chart Equivalent |
|---|---|---|---|
| | Logging verbosity level. Valid values: | | |
| | Timeout for individual kubectl and Helm commands (seconds). | | |
| | Relative time duration to limit log collection (for example, | Not set (collects all available logs) | |
| | Absolute RFC3339 timestamp to limit log collection. | Not set (collects all available logs) | |
| | Timeout in seconds for heap dump generation per pod. | | |
1.6.3. Helm chart values
When deploying must-gather on Kubernetes platforms using the Helm chart, configure collection options using a values file. The following table shows key Helm values and their equivalent command-line flags or environment variables.
| Helm Value Path | Description | Equivalent Command-line Flag or Environment Variable |
|---|---|---|
| | Array of namespaces to collect from. | |
| | Boolean to include cluster-wide state information. Use only when requested by support. | |
| | Boolean to collect heap dumps from Node.js backend pods. | |
| | Array of specific RHDH instance names to collect heap dumps from. | |
| | Method to trigger heap dumps. Valid values: | |
| | Timeout in seconds for heap dump collection. | |
| | Boolean to enable or disable the Operator collector. Set to | |
| | Boolean to enable or disable the Helm collector. Set to | |
| | Boolean to enable or disable the Orchestrator collector. Set to | |
| | Log level for must-gather operations. Valid values: | |
| | Timeout in seconds for individual kubectl or helm commands. | |
| | Relative time duration to limit log collection (for example, | |
| | Absolute RFC3339 timestamp to limit log collection. | |
Helm boolean values use positive logic (withX: true means include), while command-line flags use negative logic (--without-X means exclude).
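The inversion between the two conventions can be illustrated with a small translation helper. This sketch covers only the two mappings that appear in this document (gather.withHelm and gather.withOperator). The function name `helm_set_for_flag` is hypothetical, and other flags are rejected rather than guessed.

```shell
#!/usr/bin/env bash
# Translate a collector-exclusion flag into the matching Helm --set expression.
# Only the two mappings that appear in this document are covered.
helm_set_for_flag() {
  case "$1" in
    --without-helm)     echo 'gather.withHelm=false' ;;
    --without-operator) echo 'gather.withOperator=false' ;;
    *) echo "no known Helm value for $1" >&2; return 1 ;;
  esac
}

echo "--set $(helm_set_for_flag --without-helm)"       # prints: --set gather.withHelm=false
echo "--set $(helm_set_for_flag --without-operator)"   # prints: --set gather.withOperator=false
```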
1.7. Diagnostic data output structure and organization
The must-gather output directory organizes data by collector type, namespace, and resource type.
1.7.1. Top-level directory structure
The must-gather output directory contains files and subdirectories organized by collection method and data type. The following table shows the top-level structure:
| Path | Contents |
|---|---|
| | Tool version information (for example, |
| | Data sanitization summary and details. |
| | All OpenShift Container Platform routes cluster-wide. |
| | All Kubernetes ingresses cluster-wide. |
| | Must-gather container logs (if running in pod). |
| | Cluster-wide information (only present if you specified |
| | Deep namespace inspect data (collected by default). |
| | Platform and infrastructure information. |
| | Helm deployment data (native releases and standalone deployments). Only present if Helm-deployed RHDH instances are detected. |
| | Orchestrator-flavored deployment data (if detected). Only present if Orchestrator components are detected. |
| | Operator deployment data (if RHDH operators found). Only present if RHDH operators are detected. |
1.7.2. Heap dump file locations
When the tool collects heap dumps with the --with-heap-dumps flag, they appear within deployment-specific directories, not under namespace-inspect/namespaces/. The location depends on the installation method:
Heap dump file paths by deployment type:
- Helm releases: helm/releases/ns=<namespace>/<release>/deployment/heap-dumps/…
- Helm standalone: helm/standalone/ns=<namespace>/<workload>/deployment/heap-dumps/…
- Operator: operator/backstage-crs/ns=<namespace>/<cr>/deployment/heap-dumps/…
Example:
helm/releases/ns=rhdh-prod/developer-hub/deployment/heap-dumps/pod=backstage-developer-hub-7d8f9c5b-xk2m4/container=backstage-backend/heapdump-20260430-143022.heapsnapshot
Analyze heap dump files using Chrome DevTools, Node.js heap analysis tools, or memory profilers.
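Because the path encodes the namespace, pod, and container as key=value segments, a short shell function can recover them when sorting many snapshots. This is an illustrative sketch: the function name `heap_dump_field` is hypothetical, and it relies only on the ns=, pod=, and container= segments visible in the example path above.

```shell
#!/usr/bin/env bash
# Print the value of a key=value segment (for example ns, pod, container)
# found in a heap dump path; return 1 if the key is absent.
heap_dump_field() {   # args: path key
  local path=$1 key=$2 segment
  local IFS=/         # split the path on slashes for the loop below
  for segment in $path; do
    case "$segment" in
      "$key"=*) printf '%s\n' "${segment#"$key"=}"; return 0 ;;
    esac
  done
  return 1
}

p='helm/releases/ns=rhdh-prod/developer-hub/deployment/heap-dumps/pod=backstage-developer-hub-7d8f9c5b-xk2m4/container=backstage-backend/heapdump-20260430-143022.heapsnapshot'
heap_dump_field "$p" ns          # prints: rhdh-prod
heap_dump_field "$p" pod         # prints: backstage-developer-hub-7d8f9c5b-xk2m4
heap_dump_field "$p" container   # prints: backstage-backend
```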
1.7.3. Locating common diagnostic data
The following table shows where to find frequently needed diagnostic information in the must-gather output:
| Diagnostic Data | Location |
|---|---|
| Backend pod logs | |
| PostgreSQL connection configuration | |
| Operator logs | |
| Backstage custom resource definition | |
| Route or Ingress definitions | |
| Helm release values | |
| Deployment configurations | |
| Service definitions | |
| Platform and cluster version | |
| Heap dumps for memory analysis | |
1.8. Platform compatibility and supported configurations
The must-gather tool supports OpenShift Container Platform 4.18+ and supported Kubernetes platforms.
1.8.1. Supported platforms
The RHDH must-gather tool supports the following platforms and installation methods:
| Platform | Minimum Version | Installation Method | Notes |
|---|---|---|---|
| Red Hat OpenShift Container Platform | 4.18 | | Integrates with OpenShift |
| Microsoft Azure Kubernetes Service | Kubernetes 1.31+ | Helm chart | Requires |
| Amazon Elastic Kubernetes Service | Kubernetes 1.31+ | Helm chart | Requires |
| Google Kubernetes Engine | Kubernetes 1.31+ | Helm chart | Requires |
1.8.2. Red Hat Developer Hub version compatibility
The RHDH must-gather tool collects diagnostic data from any supported RHDH instance regardless of deployment version. The must-gather tool was introduced in RHDH 1.10.0. Use the latest must-gather image version to ensure the most up-to-date collection capabilities and fixes.
1.8.3. Disconnected and air-gapped environment support
For disconnected environments, mirror the must-gather image to your internal registry.
Mirror the image:
$ podman pull registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:1.10.0
$ podman tag registry.access.redhat.com/rhdh/rhdh-must-gather-rhel9:1.10.0 internal-registry.example.com/rhdh/must-gather:1.10.0
$ podman push internal-registry.example.com/rhdh/must-gather:1.10.0
Use the mirrored image on OpenShift Container Platform:
$ oc adm must-gather --image=internal-registry.example.com/rhdh/must-gather:1.10.0
For authenticated registries, add credentials to the cluster-wide pull secret in the openshift-config namespace (cluster-wide required because oc adm must-gather creates temporary namespaces).
Use the mirrored image on Kubernetes with Helm values:
image:
  registry: internal-registry.example.com
  repository: rhdh/must-gather
  tag: "1.10.0"
For authenticated registries, create a docker registry secret:
$ kubectl create secret docker-registry my-registry-secret \
    --docker-server=internal-registry.example.com \
    --docker-username=<username> \
    --docker-password=<password> \
    --namespace rhdh-diagnostics
Add to Helm values:
imagePullSecrets:
  - name: my-registry-secret
Transfer output from disconnected environments to a system with internet access for support case upload. Test collection scope in non-production first to estimate output size (basic: 10-50 MB, with heap dumps: 500 MB - 2 GB).
1.8.4. Required RBAC permissions
Minimum required permissions:
- Read pods, deployments, services, configmaps in target namespaces.
- Read pod logs.
- Read custom resources (Backstage CRs for Operator, Helm releases for Helm).
- Read routes (OpenShift Container Platform) or ingresses (Kubernetes).
- Execute commands in pods (only for --with-heap-dumps).
On OpenShift Container Platform, oc adm must-gather requires cluster-admin privileges. On Kubernetes, the Helm chart creates a service account with appropriate role bindings. For restrictive security policies, create additional role bindings or use an elevated service account.
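One way to check some of these permissions ahead of time is with kubectl auth can-i. The sketch below only prints the check commands for review and does not contact a cluster. The function name `rbac_checks` and the namespace rhdh-prod are illustrative. Checks for custom resources, routes, and ingresses are left out because their exact resource names depend on the cluster.

```shell
#!/usr/bin/env bash
# Print kubectl auth can-i checks for the core permissions listed above.
# Printing instead of running keeps the sketch usable without cluster access.
rbac_checks() {   # args: namespace [heap-dumps: yes|no]
  local ns=$1 heap=${2:-no} resource
  for resource in pods deployments services configmaps; do
    echo "kubectl auth can-i get ${resource} -n ${ns}"
  done
  # Reading pod logs uses the log subresource of pods.
  echo "kubectl auth can-i get pods --subresource=log -n ${ns}"
  if [ "$heap" = yes ]; then
    # Only --with-heap-dumps needs to execute commands in pods.
    echo "kubectl auth can-i create pods --subresource=exec -n ${ns}"
  fi
}

rbac_checks rhdh-prod yes
```

Each printed command answers yes or no when run against a cluster with the credentials that will perform the collection.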
1.8.5. Known limitations and unsupported scenarios
- Unsupported platforms
Red Hat tests only supported Kubernetes platforms and OpenShift Container Platform. The tool might work on other distributions without guarantee.
- Multi-cluster deployments
Collects data from a single cluster only. For multi-cluster RHDH deployments, run must-gather on each cluster separately.
- Dynamic plugin diagnostics
No specialized collectors for individual dynamic plugins. Plugin diagnostics rely on pod logs and ConfigMap data.
- Ephemeral environments
Requires resources to remain stable during collection (2-15 minutes). Frequent pod restarts might result in incomplete data.