:_mod-docs-content-type: ASSEMBLY
ifdef::context[:parent-context: {context}]

[id="troubleshooting-key-manager-proteccio-adoption_{context}"]

:context: troubleshooting-proteccio

= Troubleshooting {key_manager} Proteccio HSM adoption

[role="_abstract"]
Use this reference to troubleshoot common issues that might occur during {key_manager_first_ref} adoption with Proteccio HSM integration. If Proteccio HSM issues persist, consult the Eviden Trustway documentation and ensure that the HSM server configuration matches the client settings.

== Resolving prerequisite validation failures

[role="_abstract"]
If the adoption script fails with any of the following errors during the prerequisites check, verify that your configuration includes all the required Proteccio files, that the OpenShift cluster is reachable, and that the HSM Ansible role is available.

----
ERROR: Required file proteccio_files/YOUR_CERT_FILE not found
ERROR: Cannot connect to OpenShift cluster
ERROR: Proteccio HSM Ansible role not found
----

.Procedure

. Verify that all required Proteccio files are present:
+
----
$ ls -la /path/to/your/proteccio_files/
----
+
Ensure that your configured certificate files, private key, HSM certificate file, and configuration file exist as specified in your `proteccio_required_files` configuration.

. Test OpenShift cluster connectivity:
+
----
$ oc cluster-info
$ oc get pods -n openstack
----

. Verify that the HSM Ansible role is available:
+
----
$ ls -la /path/to/your/roles/ansible-role-rhoso-proteccio-hsm/
----

== Resolving SSH connection failures to the source environment

[role="_abstract"]
If you cannot connect to the source {OpenStackPreviousInstaller} environment, you might see the following error:

----
Warning: Permanently added 'YOUR_UNDERCLOUD_HOST' (ED25519) to the list of known hosts.
Permission denied (publickey).
----

To troubleshoot the issue, verify your SSH key access and test the SSH commands that the adoption uses.

.Procedure

. Verify SSH key access to the undercloud:
+
----
$ ssh YOUR_UNDERCLOUD_HOST echo "Connection test"
----

. Test the specific SSH commands used by the adoption:
+
----
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack bash -lc "echo test"'
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "echo test"'
----

. If the connection fails, verify the SSH configuration and ensure that the undercloud hostname resolves correctly.

== Resolving database import failures

[role="_abstract"]
If the source database export or import fails, check the source Galera container, database connectivity, and the source {key_manager_first_ref} configuration. The database import or export error looks similar to the following example:

----
Error: no container with name or ID "galera-bundle-podman-0" found
mysqldump: Got error: 1045: "Access denied for user 'barbican'@'localhost'"
----

.Procedure

. Verify that the source Galera container is running:
+
----
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo podman ps | grep galera"'
----

. Test database connectivity with the extracted credentials:
+
----
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo podman exec galera-bundle-podman-0 mysql -u barbican -p -e \"SELECT 1;\""'
----

. Check the source {key_manager} configuration for the correct database password:
+
----
$ sudo ssh -t YOUR_UNDERCLOUD_HOST 'sudo -u stack ssh -t tripleo-admin@YOUR_CONTROLLER_HOST.ctlplane "sudo grep connection /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf"'
----

== Resolving custom image pull failures

[role="_abstract"]
If Proteccio custom images fail to pull or start, you might see the following error:

----
Failed to pull image "<image>": rpc error
Pod has unbound immediate PersistentVolumeClaims
----

To troubleshoot this issue, verify image registry access, image pull secrets, and registry authentication.

.Procedure

. Verify image registry access:
+
----
$ podman pull <image>
----

. Check image pull secrets and registry authentication:
+
----
$ oc get secrets -n openstack | grep pull
$ oc describe pod <pod_name> -n openstack
----

. Verify that the `OpenStackVersion` resource was applied correctly:
+
----
$ oc get openstackversion openstack -n openstack -o yaml
----

== Resolving HSM certificate mounting issues

[role="_abstract"]
If Proteccio client certificates are not properly mounted in pods, check the secret creation and ensure that the {key_manager_first_ref} configuration includes the correct volume mounts. The error looks similar to the following example:

----
$ oc exec <pod_name> -c barbican-api -- ls -la /etc/proteccio/
ls: cannot access '/etc/proteccio/': No such file or directory
----

.Procedure

. Verify that the `proteccio-data` secret was created correctly:
+
----
$ oc describe secret proteccio-data -n openstack
----

. Check that the secret contains the expected files:
+
----
$ oc get secret proteccio-data -n openstack -o yaml
----

. Verify that the {key_manager} configuration includes the correct volume mounts:
+
----
$ oc get barbican barbican -n openstack -o yaml | grep -A10 pkcs11
----

== Resolving service startup failures

[role="_abstract"]
If {key_manager_first_ref} services fail to start after configuration, check the pod logs, RabbitMQ user configuration, and resource constraints. The error looks similar to the following example:

----
CrashLoopBackOff
Init:Error
amqp.exceptions.AccessRefused: Login was refused using authentication mechanism AMQPLAIN
----

.Procedure

. Check pod logs for specific error messages:
+
----
$ oc logs <pod_name> -c barbican-api -n openstack
$ oc logs <pod_name> -c barbican-api-log -n openstack
----

. Verify that the {key_manager} configuration is valid:
+
----
$ oc get barbican barbican -n openstack -o yaml
----

. Check the RabbitMQ user configuration if you see authentication errors:
+
----
# Get the transport URL to find the expected username
$ oc get secret rabbitmq-transport-url-barbican-barbican-transport -n openstack -o jsonpath='{.data.transport_url}' | base64 -d

# Create the missing RabbitMQ user (extract the username and password from the URL above)
$ oc exec rabbitmq-server-0 -n openstack -- rabbitmqctl add_user <username> <password>
$ oc exec rabbitmq-server-0 -n openstack -- rabbitmqctl set_permissions <username> ".*" ".*" ".*"

# Restart the failing pods
$ oc delete pods -l service=barbican -n openstack
----

. Check for resource constraints or scheduling issues:
+
----
$ oc describe pod <pod_name> -n openstack
----

== Resolving adoption verification failures

[role="_abstract"]
If the secrets from the source environment are not accessible after adoption, you might see the following error:

----
$ openstack secret list
# Returns empty list or HTTP 500 errors
----

To troubleshoot this issue, verify that the database import completed successfully, check for schema adoption issues, and test API connectivity.

.Procedure

. Verify that the database import completed successfully:
+
----
$ oc exec openstack-galera-0 -n openstack -- mysql -u root -p barbican -e "SELECT COUNT(*) FROM secrets;"
----

. Check for schema adoption issues:
+
----
$ oc logs job.batch/barbican-db-sync -n openstack
----

. Test API connectivity:
+
----
$ oc exec openstackclient -n openstack -- curl -s -k -H "X-Auth-Token: $(openstack token issue -f value -c id)" https://barbican-internal.openstack.svc:9311/v1/secrets
----

. Verify that projects and users were adopted correctly, because secrets are project-scoped.

== Rolling back the HSM adoption

[role="_abstract"]
If the hardware security module (HSM) adoption fails, you can restore your environment to its original state and attempt the adoption again.

.Procedure

. Restore the {rhos_long} 18.0 database backup:
+
----
$ oc exec -i openstack-galera-0 -n openstack -- mysql -u root -p barbican < /path/to/your/backups/rhoso18_barbican_backup.sql
----

. Reset to standard images:
+
----
$ oc delete openstackversion openstack -n openstack
----

. Restore the base control plane configuration:
+
----
$ oc apply -f /path/to/your/base_controlplane.yaml
----

.Next steps

To avoid additional issues when you attempt the adoption again, consider the following suggestions:

* Check the adoption logs that are stored in your configured working directory with timestamped summary reports.
* For HSM-specific issues, consult the Proteccio documentation and verify HSM connectivity from the target environment.
* Run the adoption in dry-run mode (`./run_proteccio_adoption.sh` option 3) to validate the environment before making changes.
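
Before you retry the adoption, you can also script the manual checks from the prerequisite validation section so that they run in one pass. The following is a minimal sketch, not part of the adoption tooling: the function names are hypothetical, the error text only mirrors the example messages in this reference, and the directory and file names that you pass in must come from your own `proteccio_required_files` configuration.

```shell
#!/usr/bin/env bash
# Pre-flight sketch for the prerequisite checks described in this reference.
# All function names here are hypothetical helpers, not adoption script code.

# check_required_files DIR FILE...
# Prints one ERROR line per missing file; returns 1 if any file is missing.
check_required_files() {
    local dir=$1
    shift
    local missing=0 name
    for name in "$@"; do
        if [ ! -f "$dir/$name" ]; then
            echo "ERROR: Required file $dir/$name not found"
            missing=1
        fi
    done
    return "$missing"
}

# check_role_dir DIR
# Returns 1 if the HSM Ansible role directory does not exist.
check_role_dir() {
    if [ ! -d "$1" ]; then
        echo "ERROR: Proteccio HSM Ansible role not found"
        return 1
    fi
}

# check_cluster
# Returns 1 if `oc cluster-info` fails (or if `oc` is not installed).
check_cluster() {
    if ! oc cluster-info >/dev/null 2>&1; then
        echo "ERROR: Cannot connect to OpenShift cluster"
        return 1
    fi
}
```

For example, after you source the file, you might run `check_required_files /path/to/your/proteccio_files YOUR_CERT_FILE`, then `check_role_dir /path/to/your/roles/ansible-role-rhoso-proteccio-hsm`, and then `check_cluster`. Each function returns a non-zero status on failure, so you can chain the calls with `&&` and stop at the first failed check.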