Linux KVM

Key Points


References

Reference description with linked URLs                              Notes




https://github.com/intel/linux-sgx                                  Linux SGX
https://docs.kernel.org/virt/ne_overview.html                       Linux support for AWS Nitro Enclaves in EC2







Key Concepts



Security Guides


Secure Enclaves for Containers


Graphene - lightweight library OS that runs on Linux and in SGX enclaves - 2024

https://github.com/gramineproject/graphene

Graphene is a lightweight library OS, designed to run a single application with minimal host requirements. Graphene can run applications in an isolated environment with benefits comparable to running a complete OS in a virtual machine -- including guest customization, ease of porting to different OSes, and process migration.

Graphene supports native, unmodified Linux binaries on any platform. Currently, Graphene runs on Linux and Intel SGX enclaves on Linux platforms.

In untrusted cloud and edge deployments, there is a strong desire to shield the whole application from the rest of the infrastructure. Graphene supports this “lift and shift” paradigm for bringing unmodified applications into Confidential Computing with Intel SGX. Graphene can protect applications from a malicious system stack with minimal porting effort.


Status of SGX Cloud support - 2023

https://tozny.com/blog/secure-computation-cloud-sgx/

The article links to several vendors that provide open-source frameworks and SDKs for developing code to run in enclaves (trusted execution environments).


KVM Container Security Concepts

https://kvmforum2022.sched.com/event/15jJS/the-five-big-problems-with-confidential-containers-christophe-de-dinechin-red-hat?iframe=no&w=&sidebar=yes&bg=no

Confidential computing is a set of technologies, such as Intel's TDX or AMD's SEV, designed to protect data in use, notably with the use of encrypted memory. Confidential containers (CC) are the application of this technology to run containers in a way that does not expose any data to the host. Alice Frosi, Sergio Lopez and Christophe de Dinechin presented this technology last year, in a talk titled "Don't peek into my container". This year, CC became a CNCF sandbox project.

This technology is full of promise, but it also presents a number of hard technical challenges, for which we have solutions of unequal quality. In this talk, we will focus on five major technical or commercial difficulties:

1. attestation of the workloads
2. performance (including memory, disk and networking bloat)
3. image download (including possible optimizations)
4. access control (and the need to rethink credentials)
5. debuggability

For some of these problems, we have solutions in the works or on the horizon. For others, we just know that it will be bad, and we are exploring ideas on how to limit the damage. The majority of these problems involve the hypervisor or KVM to some extent.


https://static.sched.com/hosted_files/kvmforum2022/f9/Five%20Big%20Problems%20with%20Confidential%20Containers%20%E2%80%93%C2%A0KVM%20Forum%202022.pdf

Five Big Problems with Confidential Containers – KVM Forum 2022.pdf file






Linux KVM Forum 2022

https://events.linuxfoundation.org/kvm-forum/program/schedule/



Linux support for SGX enclaves

https://github.com/intel/linux-sgx


Intel(R) Software Guard Extensions (Intel(R) SGX) is an Intel technology for application developers seeking to protect select code and data from disclosure or modification.

The Linux* Intel(R) SGX software stack consists of the Intel(R) SGX driver, the Intel(R) SGX SDK, and the Intel(R) SGX Platform Software (PSW). The Intel(R) SGX SDK and Intel(R) SGX PSW are hosted in the linux-sgx project.

The SGXDataCenterAttestationPrimitives project maintains an out-of-tree driver for the Linux* Intel(R) SGX software stack, which will be used until the driver upstreaming process is complete. It is used on platforms with Flexible Launch Control and Intel(R) AES New Instructions support, and can support both Elliptic Curve Digital Signature Algorithm (ECDSA) based attestation and Enhanced Privacy Identification (EPID) based attestation.

Note: Ice Lake Xeon-SP (and future Xeon-SP platforms) does not support EPID attestation.

The linux-sgx-driver project hosts the other out-of-tree driver for the Linux* Intel(R) SGX software stack, which will be used until the driver upstreaming process is complete. It is used to support Enhanced Privacy Identification (EPID) based attestation on the platforms without Flexible Launch Control.

The intel-device-plugins-for-kubernetes project enables users to run container applications running Intel(R) SGX enclaves in Kubernetes clusters. It also gives instructions on how to set up ECDSA based attestation in a cluster.

The intel-sgx-ssl project provides a full-strength general purpose cryptography library for Intel(R) SGX enclave applications. It is based on the underlying OpenSSL* Open Source project. Intel(R) SGX provides a build combination for building an SGXSSL-based SDK. Users can also use this cryptography library in SGX enclave applications separately.

This repository provides a reference implementation of a Launch Enclave for 'Flexible Launch Control' under psw/ae/ref_le. The reference LE implementation can be used as a basis for enforcing different launch control policies by the platform developer or owner. To build and try it yourself, please refer to ref_le.md for details.
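
As a hedged illustration of the untrusted application side of the SDK, the sketch below loads a signed enclave with sgx_create_enclave() from sgx_urts.h and calls into it through a generated ECALL proxy. The enclave file name, the ecall_do_work() call, and the enclave_u.h header are hypothetical placeholders produced by the SDK's edger8r tool from one's own EDL file; exact usage should be checked against the linux-sgx documentation.

/* Minimal sketch: loading an SGX enclave from the untrusted host side.
 * Assumes the Intel(R) SGX SDK (sgx_urts.h); "enclave.signed.so" and
 * ecall_do_work() are hypothetical placeholders generated by the SDK's
 * edger8r tool for your own EDL file. */
#include <stdio.h>
#include <sgx_urts.h>
#include "enclave_u.h"   /* hypothetical: untrusted proxies generated from enclave.edl */

int main(void)
{
    sgx_enclave_id_t eid = 0;
    sgx_launch_token_t token = {0};
    int token_updated = 0;

    /* Load and initialize the signed enclave image (debug flag = 1). */
    sgx_status_t ret = sgx_create_enclave("enclave.signed.so", 1,
                                          &token, &token_updated, &eid, NULL);
    if (ret != SGX_SUCCESS) {
        fprintf(stderr, "sgx_create_enclave failed: 0x%x\n", (unsigned)ret);
        return 1;
    }

    /* Call into the enclave through a generated ECALL proxy (hypothetical). */
    ret = ecall_do_work(eid);
    if (ret != SGX_SUCCESS)
        fprintf(stderr, "ECALL failed: 0x%x\n", (unsigned)ret);

    sgx_destroy_enclave(eid);
    return 0;
}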



AWS Nitro Enclaves for Linux

https://docs.kernel.org/virt/ne_overview.html

Nitro Enclaves (NE) is a new Amazon Elastic Compute Cloud (EC2) capability that allows customers to carve out isolated compute environments within EC2 instances [1].

For example, an application that processes sensitive data and runs in a VM can be separated from other applications running in the same VM. This application then runs in a separate VM from the primary VM, namely an enclave. It runs alongside the VM that spawned it. This setup matches the needs of low-latency applications.

The current supported architectures for the NE kernel driver, available in the upstream Linux kernel, are x86 and ARM64.

The resources that are allocated for the enclave, such as memory and CPUs, are carved out of the primary VM. Each enclave is mapped to a process running in the primary VM that communicates with the NE kernel driver via an ioctl interface.

In this sense, there are two components:

1. An enclave abstraction process - a user space process running in the primary VM guest that uses the provided ioctl interface of the NE driver to spawn an enclave VM (that’s 2 below).

There is an NE emulated PCI device exposed to the primary VM. The driver for this new PCI device is included in the NE driver.

The ioctl logic is mapped to PCI device commands, e.g. the NE_START_ENCLAVE ioctl maps to an enclave start PCI command (see the C sketch after this list). The PCI device commands are then translated into actions taken on the hypervisor side; that’s the Nitro hypervisor running on the host where the primary VM is running. The Nitro hypervisor is based on core KVM technology.

2. The enclave itself - a VM running on the same host as the primary VM that spawned it. Memory and CPUs are carved out of the primary VM and are dedicated for the enclave VM. An enclave does not have persistent storage attached.
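
For orientation, here is a minimal C sketch of the enclave abstraction process side of this ioctl flow. The ioctl and structure names are assumed to match the upstream UAPI header linux/nitro_enclaves.h (NE_CREATE_VM, NE_SET_USER_MEMORY_REGION, NE_ADD_VCPU, NE_START_ENCLAVE); field names and the exact call sequence should be verified against that header and the kernel samples.

/* Hedged sketch of the enclave abstraction process using the NE driver's
 * ioctl interface. Ioctl/struct names are assumed to follow the upstream
 * include/uapi/linux/nitro_enclaves.h; verify before use. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nitro_enclaves.h>

/* enclave_mem: huge-page-backed region carved out of the primary VM
 * (see the allocation sketch after the memory paragraph below). */
int launch_enclave(void *enclave_mem, uint64_t mem_size)
{
    int ne_fd = open("/dev/nitro_enclaves", O_RDWR | O_CLOEXEC);
    if (ne_fd < 0) { perror("open /dev/nitro_enclaves"); return -1; }

    /* Create the enclave VM; the ioctl returns a file descriptor that
     * represents the enclave from here on. */
    uint64_t slot_uid = 0;
    int enclave_fd = ioctl(ne_fd, NE_CREATE_VM, &slot_uid);
    if (enclave_fd < 0) { perror("NE_CREATE_VM"); close(ne_fd); return -1; }

    /* Donate the memory region to the enclave. */
    struct ne_user_memory_region region = {
        .flags = 0,   /* default region flags (assumption) */
        .memory_size = mem_size,
        .userspace_addr = (uint64_t)(uintptr_t)enclave_mem,
    };
    if (ioctl(enclave_fd, NE_SET_USER_MEMORY_REGION, &region) < 0)
        perror("NE_SET_USER_MEMORY_REGION");

    /* Carve one vCPU out of the pre-configured NE CPU pool (0 = auto-choose). */
    uint32_t vcpu_id = 0;
    if (ioctl(enclave_fd, NE_ADD_VCPU, &vcpu_id) < 0)
        perror("NE_ADD_VCPU");

    /* Boot the enclave; this ioctl maps to the enclave start PCI command. */
    struct ne_enclave_start_info start = {0};
    if (ioctl(enclave_fd, NE_START_ENCLAVE, &start) < 0)
        perror("NE_START_ENCLAVE");

    close(ne_fd);
    return enclave_fd;   /* kept open to poll for enclave exit events */
}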

The memory regions carved out of the primary VM and given to an enclave need to be 2 MiB / 1 GiB aligned, physically contiguous memory regions (or a multiple of this size, e.g. 8 MiB). The memory can be allocated e.g. by using hugetlbfs from user space [2][3][7]. The memory size for an enclave needs to be at least 64 MiB. The enclave memory and CPUs need to be from the same NUMA node.
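
A minimal sketch of allocating such a region from user space, assuming anonymous huge pages via mmap(MAP_HUGETLB); a hugetlbfs mount or explicit 1 GiB pages are alternatives, and enough huge pages must already be reserved on the host (e.g. via vm.nr_hugepages).

/* Minimal sketch: reserve a huge-page-backed, physically contiguous region
 * to hand to the NE driver as enclave memory. Assumes 2 MiB huge pages have
 * been reserved on the host beforehand. */
#include <stdio.h>
#include <sys/mman.h>

#define ENCLAVE_MEM_SIZE (64UL * 1024 * 1024)   /* 64 MiB: the minimum enclave size */

int main(void)
{
    void *mem = mmap(NULL, ENCLAVE_MEM_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (mem == MAP_FAILED) { perror("mmap(MAP_HUGETLB)"); return 1; }

    /* 'mem' is what the launch sketch above would pass as userspace_addr. */
    printf("enclave memory region at %p\n", mem);
    munmap(mem, ENCLAVE_MEM_SIZE);
    return 0;
}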

An enclave runs on dedicated cores. CPU 0 and its CPU siblings need to remain available for the primary VM. A CPU pool has to be set aside for NE purposes by a user with admin capability. See the cpu list section of the kernel documentation [4] for the CPU pool format.
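
The exact configuration interface is not spelled out in this overview; as an assumption based on the upstream driver, the CPU pool is typically set through the nitro_enclaves ne_cpus module parameter. A minimal sketch writing a CPU list there (admin capability required):

/* Hedged sketch: set the NE CPU pool by writing a CPU list to the
 * nitro_enclaves module parameter. The sysfs path is an assumption based on
 * the upstream driver; CPU 0 and its siblings stay with the primary VM. */
#include <stdio.h>

int main(void)
{
    const char *path = "/sys/module/nitro_enclaves/parameters/ne_cpus";
    FILE *f = fopen(path, "w");
    if (!f) { perror("fopen ne_cpus"); return 1; }

    /* Dedicate CPUs 2-3 to the enclave CPU pool. */
    if (fprintf(f, "2-3") < 0) perror("write ne_cpus");
    fclose(f);
    return 0;
}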

An enclave communicates with the primary VM via a local communication channel, using virtio-vsock [5]. The primary VM has a virtio-pci vsock emulated device, while the enclave VM has a virtio-mmio vsock emulated device. The vsock device uses eventfd for signaling. The enclave VM sees the usual interfaces - local APIC and IOAPIC - to get interrupts from the virtio-vsock device. The virtio-mmio device is placed in memory below the typical 4 GiB.

The application that runs in the enclave needs to be packaged in an enclave image together with the OS (e.g. kernel, ramdisk, init) that will run in the enclave VM. The enclave VM has its own kernel and follows the standard Linux boot protocol [6][8].

The kernel bzImage, the kernel command line and the ramdisk(s) are part of the Enclave Image Format (EIF), plus an EIF header including metadata such as the magic number, EIF version, image size and CRC.

Hash values are computed for the entire enclave image (EIF), the kernel and ramdisk(s). That’s used, for example, to check that the enclave image that is loaded in the enclave VM is the one that was intended to be run.

These crypto measurements are included in a signed attestation document generated by the Nitro Hypervisor and further used to prove the identity of the enclave; KMS is an example of a service that NE is integrated with and that checks the attestation document.

The enclave image (EIF) is loaded in the enclave memory at offset 8 MiB. The init process in the enclave connects to the vsock CID of the primary VM and a predefined port - 9000 - to send a heartbeat value - 0xb7. This mechanism is used to check in the primary VM that the enclave has booted. The CID of the primary VM is 3.
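
A minimal C sketch of the primary-VM side of this handshake, assuming standard AF_VSOCK sockets per vsock(7): listen on port 9000 and check for the 0xb7 heartbeat byte sent by the enclave's init process. Error handling is abbreviated.

/* Hedged sketch: primary-VM side of the NE boot heartbeat. Listens on
 * vsock port 9000 and expects a single 0xb7 byte from the enclave init. */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/vm_sockets.h>

int main(void)
{
    int srv = socket(AF_VSOCK, SOCK_STREAM, 0);
    if (srv < 0) { perror("socket(AF_VSOCK)"); return 1; }

    struct sockaddr_vm addr = {0};
    addr.svm_family = AF_VSOCK;
    addr.svm_cid = VMADDR_CID_ANY;   /* bind on the primary VM's own CID */
    addr.svm_port = 9000;            /* predefined heartbeat port */

    if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(srv, 1) < 0) { perror("bind/listen"); return 1; }

    int conn = accept(srv, NULL, NULL);
    if (conn < 0) { perror("accept"); return 1; }

    unsigned char hb = 0;
    if (read(conn, &hb, 1) == 1 && hb == 0xb7)
        printf("enclave booted (heartbeat 0x%02x)\n", hb);

    close(conn);
    close(srv);
    return 0;
}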

If the enclave VM crashes or gracefully exits, an interrupt event is received by the NE driver. This event is sent further to the user space enclave process running in the primary VM via a poll notification mechanism. Then the user space enclave process can exit.
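
A minimal sketch of how the user space enclave process might wait for that notification by polling the enclave file descriptor obtained from NE_CREATE_VM; the exact revents bits reported by the NE driver are an assumption to verify against the driver documentation.

/* Hedged sketch: wait for the enclave-exit notification on the enclave fd
 * (returned by the NE_CREATE_VM ioctl) via poll(). The driver is assumed to
 * signal the event as a hang-up/error condition; verify the exact bits. */
#include <poll.h>
#include <stdio.h>

int wait_for_enclave_exit(int enclave_fd)
{
    struct pollfd pfd = { .fd = enclave_fd, .events = POLLIN };

    if (poll(&pfd, 1, -1) < 0) { perror("poll"); return -1; }

    if (pfd.revents & (POLLHUP | POLLERR)) {
        printf("enclave exited or crashed (revents=0x%x)\n", pfd.revents);
        return 0;   /* user space enclave process can now exit */
    }
    return 1;
}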

[1] https://aws.amazon.com/ec2/nitro/nitro-enclaves/
[2] https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html
[3] https://lwn.net/Articles/807108/
[4] https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
[5] https://man7.org/linux/man-pages/man7/vsock.7.html
[6] https://www.kernel.org/doc/html/latest/x86/boot.html
[7] https://www.kernel.org/doc/html/latest/arm64/hugetlbpage.html
[8] https://www.kernel.org/doc/html/latest/arm64/booting.html



Potential Value Opportunities



Potential Challenges



Candidate Solutions



Step-by-step guide for Example



sample code block



Recommended Next Steps