This guide provides security controls and remediation steps for hardening a stand-alone Longhorn storage system on RKE2/K3s. It prioritizes findings from Longhorn hardened cluster logs to address compliance failures in restricted environments.
This section hardens the underlying Kubernetes nodes to ensure they meet CIS benchmark requirements and provide a stable, secure foundation for Longhorn storage operations.
The Center for Internet Security (CIS) Kubernetes Benchmark is an industry-standard set of best practices for securely configuring Kubernetes clusters. Implementing these benchmarks is critical because they provide a prescriptive roadmap for reducing the attack surface of the control plane and worker nodes, ensuring that default settings, which often prioritize ease of use over security, are hardened against exploitation.
This control (a specific technical safeguard) ensures that the Kubernetes distribution (RKE2 or K3s) runs with a CIS benchmark profile and that kernel defaults are protected from runtime modification. These settings enforce hardened defaults for the kubelet, kube-apiserver, and host kernel behavior.
For Longhorn, compliance with CIS profiles is especially important because its storage components operate close to the host kernel, block devices, and networking stack. Inconsistent or weakened kernel settings can directly impact I/O behavior, isolation guarantees, and network reliability, increasing the risk of data corruption, performance degradation, or node instability.
References:
Run RKE2/K3s with a CIS profile enabled and enforce kernel defaults using protect-kernel-defaults. Configure kernel panic behavior to fail fast during unrecoverable errors.
Enable CIS enforcement in the distribution config.
Create or update /etc/rancher/rke2/config.yaml on all nodes. For users running Kubernetes v1.29 and later, the cis-1.11 profile is recommended:
profile: "cis"
protect-kernel-defaults: true
# Required for cis-1.11 only; not required for cis-1.9/cis-1.10
kube-apiserver-arg:
- 'service-account-extend-token-expiration=false'
The profile: "cis" flag automates the application of CIS-compliant configurations. The service-account-extend-token-expiration=false argument is a mandatory requirement for the CIS-1.11 profile to ensure service account tokens adhere to strict security lifecycles.

Apply the required kernel parameters.
Apply the hardening parameters to the host system.
Note on RKE2 Installation Paths: If you installed RKE2 using the official script (curl -sfL https://get.rke2.io | sh), a preconfigured CIS sysctl file is available at /opt/rke2/share/rke2/rke2-cis-sysctl.conf. You can copy it directly:

cp /opt/rke2/share/rke2/rke2-cis-sysctl.conf /etc/sysctl.d/60-rke2-cis.conf
Alternatively, you can create the file manually:
cat << EOF > /etc/sysctl.d/60-rke2-cis.conf
vm.panic_on_oom=0
vm.overcommit_memory=1
kernel.panic=10
kernel.panic_on_oops=1
EOF
systemctl restart systemd-sysctl
Memory settings (vm.*): Settings like vm.overcommit_memory=1 provide more predictable memory allocation, reducing the risk of the Out-Of-Memory (OOM) killer terminating critical Longhorn replica processes.

Panic settings (kernel.panic*): Forcing a reboot on “oops” or panics ensures that a compromised or unstable node does not continue to send faulty storage heartbeats or corrupted data blocks to the rest of the Longhorn cluster.

Restart the RKE2 service on each node.
Validation: Confirm that the RKE2 configuration enables the CIS profile and protects kernel defaults:

grep -E "profile|protect-kernel-defaults" /etc/rancher/rke2/config.yaml
Expected Result: Output includes profile: "cis" (or another compliant CIS version) and protect-kernel-defaults: true.
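The kernel parameters can be spot-checked on each node as well. This is a sketch that reads the values straight from /proc/sys (equivalent to sysctl -n), using the four parameter names from the sysctl file above:

```shell
# Print the effective values of the four CIS kernel parameters.
# After applying /etc/sysctl.d/60-rke2-cis.conf, these should read 0, 1, 10, 1.
for p in vm/panic_on_oom vm/overcommit_memory kernel/panic kernel/panic_on_oops; do
  printf '%s = %s\n' "${p//\//.}" "$(cat /proc/sys/$p)"
done
```

Any deviation indicates another sysctl drop-in is overriding the CIS file, or the file was never applied.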
Longhorn requires specific kernel modules and host utilities to attach, encrypt and manage block devices. In CIS-hardened environments, these dependencies are not guaranteed to be present or loaded by default.
Failure to load these modules results in Longhorn volume attachment failures and node-level errors.
References:
Ensure that the required kernel modules for both the V1 and V2 engines are installed, loaded, and restricted to privileged execution on the host. For V2 Data Engine (Technical Preview) stability, ensure nodes run Linux kernel 5.19 or later (6.7+ recommended) to prevent unexpected reboots and memory corruption associated with NVMe-TCP. You can check the references mentioned above for the latest Longhorn requirements.
The Longhorn CLI automates the environment setup, including the installation of open-iscsi and cryptsetup, and the loading of modules like iscsi_tcp, dm_crypt, and nvme_tcp. This tool ensures a “secure-by-default” configuration across all nodes with a single command.
Setup and Verify: Use the install command to apply dependencies and the check command to confirm the environment is ready.
# Install dependencies (Add --enable-spdk for V2 Data Engine)
longhornctl install preflight
# Validate the environment
longhornctl check preflight
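If longhornctl is not available on a node, the module checks can be approximated by hand. This sketch reads /proc/modules directly (equivalent to lsmod), using the module names listed above:

```shell
# Spot-check that the required kernel modules are loaded by reading
# /proc/modules directly; module names are those named in this section.
for m in iscsi_tcp dm_crypt nvme_tcp; do
  grep -qw "^$m" /proc/modules && echo "$m: loaded" || echo "$m: MISSING"
done
```

A MISSING line means the module must be loaded (for example, via modprobe) or made persistent in /etc/modules-load.d before Longhorn volumes will attach on that node.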
V2 Memory Requirements: If using the V2 Data Engine, 2 MiB-sized Huge Pages must be manually allocated (for example, echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages) to allow SPDK to perform high-speed polling. Refer to the V2 Quick Start Guide for permanent configuration steps.
Expected Result: A successful longhornctl check returns an info status for all required services and modules. On immutable systems like SLE Micro, a reboot may be necessary to finalize the installation.
This section ensures the confidentiality of data at rest through encryption and preserves data integrity by detecting silent corruption.
Volume encryption ensures data confidentiality at rest by utilizing the Linux kernel’s dm_crypt module and LUKS (Linux Unified Key Setup). This protection ensures that even if physical storage media is compromised or a node is accessed by an unauthorized user, the data remains unreadable.
Longhorn supports encryption for both Filesystem and Block modes. Critically, encryption is maintained throughout the data lifecycle: any backups created from encrypted volumes inherit the same level of protection.
References:
Enforce encryption for all sensitive workloads by defining a StorageClass that references a Kubernetes Secret. This secret should ideally be restricted to the longhorn-system namespace. To minimize the “blast radius” in multi-tenant environments, utilize per-volume secrets to ensure each volume has a unique encryption key.
The Longhorn CSI driver handles volume encryption by resolving secret templates during volume provisioning. Configuration focuses on two main components:
Secret Management: Longhorn utilizes Kubernetes Secrets to store encryption passphrases. While CRYPTO_KEY_VALUE is mandatory, advanced security postures can be achieved by defining custom algorithms like argon2i for PBKDF or aes-xts-plain64 for ciphers.
StorageClass Orchestration: To enable encryption, the StorageClass must include the encrypted: "true" parameter.
Per-Volume Secrets: Use templated secret parameters (for example, ${pvc.name}) to isolate keys between workloads.

Expected State: Once configured, a StorageClass with the encrypted: "true" flag automatically provisions LUKS-encrypted block devices. PVCs remain in a Pending state until their associated encryption secrets are created and accessible to the CSI sidecars.
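As an illustration, a global-key setup might pair a Secret with a StorageClass along the following lines. This is a sketch based on the parameters described above; the names longhorn-crypto and longhorn-crypto-global and the passphrase are placeholders to replace:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-crypto            # placeholder name
  namespace: longhorn-system
stringData:
  CRYPTO_KEY_VALUE: "change-me"    # mandatory passphrase; replace before use
  CRYPTO_KEY_PROVIDER: "secret"
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-crypto-global     # placeholder name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  encrypted: "true"
  # All volumes from this class share the Secret above
  csi.storage.k8s.io/provisioner-secret-name: "longhorn-crypto"
  csi.storage.k8s.io/provisioner-secret-namespace: "longhorn-system"
  csi.storage.k8s.io/node-publish-secret-name: "longhorn-crypto"
  csi.storage.k8s.io/node-publish-secret-namespace: "longhorn-system"
  csi.storage.k8s.io/node-stage-secret-name: "longhorn-crypto"
  csi.storage.k8s.io/node-stage-secret-namespace: "longhorn-system"
```

For per-volume keys, replace the secret-name references with the ${pvc.name} and ${pvc.namespace} templates so that each PVC resolves its own Secret.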
This section focuses on maintaining the correctness of data by detecting and repairing silent data corruption.
Snapshot data integrity checks are designed to detect “bit rot” or silent data corruption that occurs at the physical storage level and remains invisible to the file system. Longhorn maintains the integrity of your data by generating and verifying hashes for snapshot disk files, ensuring that the data recovered from a snapshot is identical to the state originally captured.
References:
Enable the snapshot-data-integrity global setting to enforce automated verification. In hardened environments, ensuring data has not been tampered with or corrupted is as critical as preventing unauthorized access. For a balance of performance and protection, use fast-check mode to minimize I/O overhead while ensuring metadata-consistent integrity.
The integrity engine can be configured globally or on a per-volume basis. Enabling this feature allows Longhorn to automatically detect corrupted replicas and initiate the rebuilding process to restore data consistency.
Global Enforcement: Use the Longhorn settings to enable hashing across the cluster.
Mode Selection: Choose one of the supported values (disabled, fast-check, enabled).

Scheduling and Performance: Hashing disk files consumes storage and computation resources. It is recommended to:
Use snapshot-data-integrity-cronjob to schedule checks during off-peak hours.

Disable snapshot-data-integrity-immediate-check-after-snapshot-creation to minimize immediate I/O impact.

Expected State: When enabled, the global snapshot-data-integrity setting is reflected in the cluster configuration. A successful implementation ensures that any snapshot disk file corruption is identified through periodic hash validation, signaling the system to repair affected replicas.
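Applied declaratively, the three settings above might look like the following. This is a sketch; verify the Setting resource shape (apiVersion and top-level value field) against the CRDs of your installed Longhorn version, and adjust the cron schedule for your off-peak window:

```yaml
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: snapshot-data-integrity
  namespace: longhorn-system
value: "fast-check"
---
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: snapshot-data-integrity-cronjob
  namespace: longhorn-system
value: "0 2 * * *"    # example schedule: daily at 02:00, off-peak
---
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: snapshot-data-integrity-immediate-check-after-snapshot-creation
  namespace: longhorn-system
value: "false"
```

The same values can also be set through the Longhorn UI under Settings.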
This section restricts network communication to reduce lateral movement and isolate storage traffic.
In “default-deny” security environments, Longhorn components require explicit authorization to communicate with the Kubernetes API, between the control and data planes, and with external backup targets. Manually managing these policies is complex due to the extensive port range (for example, 9500–9503 for Manager, 8500–8504 for Instance Manager, and 10000–30000 for replica traffic).
Restricting traffic to only known Longhorn components minimizes the “blast radius” if a malicious pod attempts lateral movement within the cluster.
References:
Enable the built-in NetworkPolicies provided by the Longhorn Helm chart rather than maintaining manual manifests. This ensures that the complete communication matrix, including gRPC, UI, CSI, and Webhook traffic, is automatically whitelisted according to the specific requirements of the Longhorn version in use.
The Longhorn Helm chart includes a built-in policy engine that generates required NetworkPolicy objects for all core components (Manager, Instance Manager, etc.).
Policy Enforcement: Toggle the internal policy engine during installation or upgrade. This automates the configuration of complex port ranges (for example, 8500–8504 for Instance Managers and 10000–30000 for replica traffic).
Helm Configuration: Set networkPolicies.enabled to true in your Helm configuration. Refer to the Helm Chart Values for provider-specific settings (for example, RKE2, K3s, or Cilium).

Egress for Backup Targets: Built-in policies primarily secure intra-cluster communication. If your environment enforces a global default-deny egress policy, you must manually define egress rules to allow longhorn-manager and instance-manager pods to reach external backupstore endpoints (for example, NFS on port 2049 or S3/MinIO on port 443).
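An egress exception for an S3/MinIO backup target on port 443 might look like the following. This is an illustrative sketch, not a chart-provided policy: the name allow-backup-egress and the label selector are assumptions, so verify the actual labels on your deployed longhorn-manager and instance-manager pods before applying:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-backup-egress        # illustrative name
  namespace: longhorn-system
spec:
  # Selector is an assumption; confirm pod labels in your cluster
  podSelector:
    matchExpressions:
      - key: app
        operator: In
        values: ["longhorn-manager", "instance-manager"]
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: TCP
          port: 443                # use 2049 for an NFS backupstore
```

If your backup target sits at a fixed address, tighten the rule further with an ipBlock `to:` clause.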
Expected State: Upon application, the longhorn-system namespace contains a suite of NetworkPolicy objects. This hardened state ensures that only authorized Longhorn components can communicate over the required TCP ports, effectively isolating the storage control and data planes from unauthorized cluster traffic.
By default, Longhorn uses the primary Kubernetes CNI network for all traffic. To enhance security and performance, Longhorn supports the isolation of in-cluster data traffic (replication and management) to a dedicated network interface. This segmentation prevents high-bandwidth storage traffic from interfering with the Kubernetes control plane and significantly reduces the attack surface of the storage layer by physically or logically isolating data replication.
References:
Implement network segregation by configuring a dedicated storage network using a Multus NetworkAttachmentDefinition. This ensures that sensitive storage traffic is physically or logically separated from general application and management traffic.
Longhorn uses the Multus CNI to attach secondary network interfaces to storage-related components.
Multus Integration: The Storage Network setting accepts a NetworkAttachmentDefinition in <NAMESPACE>/<NAME> format. Applying this setting adds the required CNI annotations to Longhorn pods, routing replication traffic through the secondary interface.
Operational Requirements:
Volume Detachment: Changing the storage network setting requires all volumes to be detached; Longhorn then restarts the instance-manager and backing-image-manager pods once volumes are offline.

Prerequisites: Ensure the specified Multus network is reachable across all cluster nodes. Refer to the Prerequisite Verification for testing connectivity between nodes using a simple DaemonSet.
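A NetworkAttachmentDefinition for the storage network might look like the following sketch. The CNI type (macvlan), host interface (eth1), IPAM plugin (whereabouts), and address range are all placeholders for your environment:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: storage-network            # placeholder name
  namespace: longhorn-system
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "whereabouts",
        "range": "192.168.10.0/24"
      }
    }
```

The Longhorn Storage Network setting would then reference it as longhorn-system/storage-network, following the <NAMESPACE>/<NAME> format noted above.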
Expected State: Once the setting is active, all storage-related pods (for example, instance-manager) feature the k8s.v1.cni.cncf.io/networks annotation. The hardened state is confirmed when storage replication traffic is successfully verified on the isolated secondary interface, leaving the primary cluster network for control plane operations.
By default, communication between the Longhorn control plane (longhorn-manager) and the data plane (instance-manager) is unencrypted. Implementing Mutual TLS (mTLS) ensures that all gRPC traffic is encrypted and that both parties authenticate each other using a trusted Certificate Authority (CA).
This prevents “man-in-the-middle” attacks and unauthorized command injection within the storage network. Longhorn uses a Kubernetes Secret-based mechanism to distribute these certificates.
References:
Enable mTLS for all gRPC communication by deploying a Kubernetes Secret named longhorn-grpc-tls in the longhorn-system namespace. For the highest security posture, this secret should be created before Longhorn installation to ensure all components are initialized with encryption active.
Longhorn utilizes an optional secret mount. If the longhorn-grpc-tls secret is present, components automatically enable TLS. Otherwise, they fall back to plaintext.
Certificate Requirements: The tls.crt must be signed by your CA and must contain a specific list of Subject Alternative Names (SANs) to support Longhorn’s internal service discovery (for example, longhorn-backend, longhorn-engine-manager, and longhorn-csi).
Secret Formatting: Deploy the CA, certificate, and private key as a kubernetes.io/tls secret.
Lifecycle and Upgrades:
Post-Installation Activation: If the secret is created after installation, the longhorn-manager and instance-manager pods must be restarted to mount the secret and initiate encrypted communication.

Expected State: Successful activation is validated through the longhorn-manager logs, which indicate that gRPC services are initializing with TLS enabled. Once verified, the storage network is effectively hardened against unauthenticated gRPC traffic.
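The secret described above follows the standard kubernetes.io/tls shape with the CA bundle added. This skeleton shows the expected keys; the base64 placeholders must be replaced with your PEM-encoded material:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-grpc-tls
  namespace: longhorn-system
type: kubernetes.io/tls
data:
  ca.crt: <base64-encoded CA certificate>
  tls.crt: <base64-encoded certificate containing the required SANs>
  tls.key: <base64-encoded private key>
```

Creating this secret before installing Longhorn ensures every component starts with TLS active rather than falling back to plaintext first.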
© 2019-2026 Longhorn Authors | Documentation Distributed under CC-BY-4.0