Compute hardware

  1. CPU
    1. CPU to VM allocation
  2. Interfaces
    1. HBA offload
    2. Smart NIC
    3. PCIe
    4. Dataplane development kit (DPDK)
    5. Single root input/output virtualization (SR-IOV)
  3. Memory
    1. Random access memory (RAM)
      1. DRAM types
      2. Dual inline memory module (DIMM)
    2. Read-only memory (ROM)
      1. ROM types
    3. Remote direct memory access (RDMA)
    4. Intel Optane DC persistent memory module (DCPMM)
  4. Trusted platform module (TPM)

CPU

  • VT-x: CPU instruction set extensions for running VMs
  • VT-d: I/O virtualization (IOMMU) support for PCIe pass-through

CPU to VM allocation

  • no more than 3 VMs per CPU/vCPU (empirical rule)
    • at higher ratios: cache contention
      core 1  core 2  core 3  core 4
  T1  VM3     VM3     VM3     VM3
  T2  VM1     VM2     idle    idle
  T3  VM3     VM3     VM3     VM3
  T4  VM1     VM2     idle    idle

VM1, VM2: 1 vCPU each
VM3: going from 2 vCPU to 4 vCPU results in performance hit: VM3 needs all 4 cores at once, and 2 cores stay idle while VM1 and VM2 run
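
The timeline above can be reproduced with a small simulation. This is a minimal sketch, assuming strict co-scheduling (a VM runs only when all of its vCPUs get a core in the same time slot) and a simple round-robin queue. Real hypervisor schedulers are more relaxed, and the function names `gang_schedule` and `utilization` are hypothetical.

```python
def gang_schedule(vms, cores, slots):
    """Simulate strict co-scheduling: a VM runs in a time slot only if
    all of its vCPUs get a core at the same time (assumed policy)."""
    queue = list(vms.items())          # round-robin order: [(name, vcpus)]
    timeline = []
    for _ in range(slots):
        free, row, ran, waited = cores, [], [], []
        for name, vcpus in queue:
            if vcpus <= free:          # the whole VM fits: run it
                row += [name] * vcpus
                free -= vcpus
                ran.append((name, vcpus))
            else:                      # not enough free cores: wait
                waited.append((name, vcpus))
        row += ["idle"] * free
        timeline.append(row)
        queue = waited + ran           # VMs that waited go first next slot
    return timeline


def utilization(timeline):
    """Fraction of core time slots that did useful work."""
    cells = [cell for row in timeline for cell in row]
    return sum(cell != "idle" for cell in cells) / len(cells)


wide = gang_schedule({"VM3": 4, "VM1": 1, "VM2": 1}, cores=4, slots=4)
narrow = gang_schedule({"VM3": 2, "VM1": 1, "VM2": 1}, cores=4, slots=4)
print(utilization(wide), utilization(narrow))   # 0.75 1.0
```

With VM3 at 4 vCPU the cores are busy 75% of the time and VM3 runs only every other slot. With VM3 at 2 vCPU all three VMs fit into every slot.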

Interfaces

HBA offload

  • HBA processes TCP/IP, iSCSI instead of OS

Smart NIC

  • host CPU offload:
    • packet inspection
    • policy enforcement
    • encryption/decryption
    • tunnelling
    • monitoring, analytics
    • clustering: use resources of a less loaded NIC
    • indexing, hashing
    • compression, deduplication
  • FPGA + NPU

PCIe

  • point-to-point (P2P) connections
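  • revisions (throughput per lane, per direction)
    • 1.0: 250 MBps ≡ 2.5 GTps (8b/10b encoding)
    • 2.0: 500 MBps ≡ 5 GTps (8b/10b encoding)
    • 3.0: ≈ 1 GBps ≡ 8 GTps (128b/130b encoding)
    • 4.0: ≈ 2 GBps ≡ 16 GTps (128b/130b encoding)
    • 5.0: ≈ 4 GBps ≡ 32 GTps (128b/130b encoding)
  • GTps: gigatransfers per second
    • raw bit rate, including encoding overhead
    • 8b/10b: 8 data bits are sent as 10 bits ⇒ 2.5 GTps × 8/10 ÷ 8 = 250 MBps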
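
The relation between GTps and usable throughput can be checked with a short calculation. A minimal sketch: the helper name `lane_throughput_mbps` is hypothetical, and the figures are per lane and per direction. Revisions 1.0 and 2.0 use 8b/10b encoding, revisions 3.0 and later use 128b/130b.

```python
def lane_throughput_mbps(gtps, payload_bits, encoded_bits):
    """Usable MB/s per lane and direction for a raw rate in GT/s."""
    data_gbits = gtps * payload_bits / encoded_bits   # strip encoding overhead
    return data_gbits / 8 * 1000                      # bits -> bytes, G -> M


REVISIONS = {           # revision: (GT/s, payload bits, encoded bits)
    "1.0": (2.5, 8, 10),
    "2.0": (5.0, 8, 10),
    "3.0": (8.0, 128, 130),
    "4.0": (16.0, 128, 130),
    "5.0": (32.0, 128, 130),
}

for rev, params in REVISIONS.items():
    print(rev, round(lane_throughput_mbps(*params)), "MB/s per lane")
```

For an x16 slot the per-lane value is multiplied by 16.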

Dataplane development kit (DPDK)

  • libraries and drivers
  • application can bypass kernel and hypervisor to access NIC

Single root input/output virtualization (SR-IOV)

  • presents single physical PCIe device as several virtual devices to VMs
  • VMs share physical PCIe bus
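
On Linux, the virtual devices (virtual functions, VFs) are typically created through the `sriov_totalvfs` and `sriov_numvfs` sysfs attributes of the physical device. The sketch below assumes a Linux host; the helper name `set_vf_count` and the PCI address in the usage comment are hypothetical.

```python
from pathlib import Path


def set_vf_count(device_dir, num_vfs):
    """Enable `num_vfs` virtual functions on the PCIe device at `device_dir`."""
    dev = Path(device_dir)
    total = int((dev / "sriov_totalvfs").read_text())   # hardware limit
    if not 0 <= num_vfs <= total:
        raise ValueError(f"device supports at most {total} VFs")
    numvfs = dev / "sriov_numvfs"
    numvfs.write_text("0")            # the count must be reset before it changes
    if num_vfs:
        numvfs.write_text(str(num_vfs))
    return num_vfs


# Hypothetical usage (needs root and an SR-IOV capable device):
# set_vf_count("/sys/bus/pci/devices/0000:3b:00.0", 4)
```

Each VF then appears as a separate PCIe device that can be passed through to a VM.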

Memory

Random access memory (RAM)

  • volatile
  • electronic types
    • dynamic RAM (DRAM)
      • charge is stored in capacitors ⇒ controller periodically refreshes it to the same value
      • slower (≈ 10 ns) due to refreshing
    • static RAM (SRAM)
      • bit is held by a latch of several transistors, no refresh needed
      • CPU registers, caches
  • column address strobe (CAS) latency: delay (in clock cycles) between read command and data being ready
  • memory rank: how many chip sets (ranks) are on the module
    • 2Rx4: 2 ranks, 4-bit data bus width of a single chip
    • only one rank is accessible at a time
    • more ranks ≡ more capacity; latency is lower up to 4 ranks (parallel processing), higher beyond that
    • rank width: 64 bits (72 bits with ECC)
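
Two of the quantities above can be worked out numerically: CAS latency in nanoseconds, and the chip count implied by a rank label. This is a minimal sketch. The helper names are hypothetical, and the example values (16 cycles at a 1600 MHz clock) are chosen only for illustration.

```python
import re


def cas_latency_ns(cas_cycles, clock_mhz):
    """Convert CAS latency from clock cycles to nanoseconds."""
    return cas_cycles * 1000 / clock_mhz


def parse_rank_label(label, ecc=False):
    """Decode a label such as '2Rx4' into ranks, chip width, chip counts."""
    match = re.fullmatch(r"(\d+)Rx(\d+)", label)
    if match is None:
        raise ValueError(f"unexpected rank label: {label!r}")
    ranks, chip_bits = int(match.group(1)), int(match.group(2))
    rank_bits = 72 if ecc else 64          # rank width from the notes above
    return {
        "ranks": ranks,
        "chip_bits": chip_bits,
        "chips_per_rank": rank_bits // chip_bits,
        "chips_total": ranks * rank_bits // chip_bits,
    }


print(cas_latency_ns(16, 1600))            # 10.0 (ns)
print(parse_rank_label("2Rx4", ecc=True))  # 18 chips per rank, 36 in total
```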

DRAM types

  1. synchronous DRAM (SDRAM)
    • synchronized with the system bus clock
  2. extended data out DRAM (EDO DRAM)
    • starts next access while data of previous block is still being output: overlaps write cmd + read data
  3. burst EDO DRAM (BEDO DRAM)
    • can transfer data from up to 4 addresses in one short burst
  4. double data rate DRAM (DDR DRAM)
    • transfers data on both rising and falling clock edges
    • usually off-chip modules
  5. high bandwidth memory (HBM)
    • up to 8 stacked SDRAM dies
    • Through Silicon Via (TSV) interconnects between the dies
    • 2×128-bit channels on the package per die, each controlled separately
    • not a module, integrated into the chip
    • high latency
    • lower capacity than DDR
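
The effect of transferring on both clock edges can be shown with a peak bandwidth estimate. A minimal sketch: the helper name `peak_bandwidth_mbps` is hypothetical, and the 1600 MHz clock with a 64-bit bus (DDR4-3200) is used only as an example.

```python
def peak_bandwidth_mbps(clock_mhz, bus_bits=64, transfers_per_clock=2):
    """Theoretical peak bandwidth of one memory channel in MB/s."""
    transfers_per_s = clock_mhz * transfers_per_clock   # MT/s
    return transfers_per_s * bus_bits // 8              # bytes per transfer


ddr = peak_bandwidth_mbps(1600)                          # both clock edges
sdr = peak_bandwidth_mbps(1600, transfers_per_clock=1)   # one edge only
print(ddr, sdr)   # 25600 12800
```

At the same clock, double data rate doubles the theoretical peak. Sustained bandwidth is lower in practice.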

Dual inline memory module (DIMM)

  • unbuffered DIMM (UDIMM)
    • controller sends commands and data to DRAM chip directly
    • parallel transfer
    • low latency, low capacity
    • usually no ECC
    • 2×DIMM per channel max
  • registered DIMM (RDIMM)
    • uses register to buffer commands and addresses
    • does not buffer data
    • 3×DIMM per channel max
    • higher capacity and latency
  • load-reduced DIMM (LRDIMM)
    • uses isolation memory buffer (iMB) for both commands and data
    • higher capacity and latency

Read-only memory (ROM)

  • non-volatile

ROM types

  1. ROM
    • software is burned in at the factory
  2. programmable ROM (PROM)
    • software can be written separately, after manufacturing
    • writing burns certain inter-transistor connections ⇒ read-only afterwards
  3. erasable PROM (EPROM)
    • data can be erased by UV-light
    • data is erased as a whole
  4. electrically erasable PROM (EEPROM)
    • per byte I/O
  5. Flash
    • block I/O ⇒ faster than EEPROM

Remote direct memory access (RDMA)

  • zero-copy data transfer in HPC with NUMA: no copy to OS buffers
  • offloads transfer from CPU

Intel Optane DC persistent memory module (DCPMM)

  • modes
    • memory
      • volatile
      • DRAM ≡ cache for DCPMM
    • app-direct
      • works like SSD, non-volatile
      • application decides itself where to write data
    • mixed
      • 25% memory + 75% app-direct
  • DDR4 DIMM
  • more IOPS and drive writes per day (DWPD) compared to SSD
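
In app-direct mode an application usually maps persistent memory into its address space and writes to it at offsets of its own choosing. The sketch below only imitates this with a memory-mapped regular file in a temporary directory. On a real system the file would sit on a DAX-mounted persistent-memory file system, and a dedicated library would normally be used. The helper names and the file name are hypothetical.

```python
import mmap
import os
import tempfile


def persist_record(path, offset, payload, size=4096):
    """Write `payload` at `offset` of a memory-mapped file and flush it."""
    if offset < 0 or offset + len(payload) > size:
        raise ValueError("record does not fit into the mapped region")
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        os.ftruncate(fd, size)                 # make the file `size` bytes long
        with mmap.mmap(fd, size) as region:
            region[offset:offset + len(payload)] = payload
            region.flush()                     # push the change to the medium
    finally:
        os.close(fd)


def read_record(path, offset, length):
    """Read `length` bytes back from `offset` through a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return region[offset:offset + length]


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "pmem-demo.bin")   # stand-in for a DAX file
    persist_record(demo_path, 128, b"hello")
    print(read_record(demo_path, 128, 5))            # b'hello'
```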

Trusted platform module (TPM)

  • memory segments:
    • persistent
      • endorsement key (EK)
        • TPM authentication, AIK creation
        • not used for digital signatures
      • storage root key (SRK)
        • key encryption key (KEK)
    • versatile
      • attestation identity key (AIK)
        • digital signatures
      • platform configuration register (PCR)
        • stores configuration hash
      • storage keys
        • keys for data encryption
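
A PCR is not written directly. It is extended: the new value is the hash of the old value concatenated with the digest of the new measurement, so the final value depends on every measurement and on their order. A minimal sketch, assuming a SHA-256 PCR bank that starts as 32 zero bytes. The helper name `pcr_extend` and the component names are hypothetical.

```python
import hashlib


def pcr_extend(pcr, measurement):
    """Return the PCR value after extending it with one measurement."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()   # H(old PCR || digest)


pcr = bytes(32)                                    # assumed initial value
for component in (b"firmware", b"bootloader", b"kernel"):
    pcr = pcr_extend(pcr, component)
print(pcr.hex())
```

Changing any measured component, or the order of measurements, produces a different final value. This is what lets the register attest to the configuration.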