5. Feature set and Requirements from Infrastructure
A profile (see Profiles, Profile Extensions & Flavours) specifies the configuration of a Cloud Infrastructure node (host or server); Profile Extensions (specialisations) may specify additional configuration. Workloads use profiles to describe the configuration of the nodes on which they can be hosted. Workload Flavours provide a mechanism to specify the VM or Pod sizing needed to host the workload. Depending on the requirements of the workload, a VM or a Pod is deployed as per the specified Flavour on a node configured as per the specified Profile. Not only do the nodes (the hardware) have to be configured, but some of the capabilities also need to be configured in the software layers (such as the Operating System and Virtualisation Software). Thus, a Profile can be defined in terms of the configuration needed in the software layers (the Cloud Infrastructure Software Profile) and in the hardware (the Cloud Infrastructure Hardware Profile).
5.1. Cloud Infrastructure Software profile description
The Cloud Infrastructure Software layer is composed of two layers (Figure 5.1):
- The virtualisation Infrastructure layer, which is based on hypervisor virtualisation technology or container-based virtualisation technology. Container virtualisation can be nested in hypervisor-based virtualisation.
- The host OS layer.

Ref | Cloud Infrastructure Software | Type | Definition/Notes | Capabilities Reference (1) |
---|---|---|---|---|
infra.sw.001 | Host Operating System | <value> | Values such as Ubuntu20.04, Windows 10 Release #, etc. | e.cap.021 |
infra.sw.002 | Virtualisation Infrastructure Layer | <value> | Values such as KVM, Hyper-V, Kubernetes, etc. | e.cap.022 |

(1) Reference to the capabilities defined in Infrastructure Capabilities, Measurements and Catalogue.
For a host (compute node or physical server), the virtualisation layer is an abstraction layer between hardware components (compute, storage, and network resources) and virtual resources allocated to a VM or a Pod. Figure 5.2 represents the virtual resources (virtual compute, virtual network, and virtual storage) allocated to a VM or a Pod and managed by the Cloud Infrastructure Manager.
A Cloud Infrastructure Software Profile is a set of features, capabilities, and metrics offered by a Cloud Infrastructure software layer and configured in the software layers (the Operating System (OS) and the virtualisation software, such as a hypervisor). Figure 5.3 depicts a high-level view of the Basic and High Performance Cloud Infrastructure Profiles.
The following sections detail the Cloud Infrastructure Software Profile capabilities per type of virtual resource.
5.1.1. Virtual Compute Profiles
Table 5-1 and Table 5-2 depict the features related to virtual compute.

Reference | Feature | Type | Description | Capabilities Reference |
---|---|---|---|---|
infra.com.cfg.001 | CPU allocation ratio | <value> | Number of virtual cores per physical core. | i.cap.016 |
infra.com.cfg.002 | NUMA alignment | Yes/No | Support of NUMA at the Host OS and virtualisation layers, in addition to hardware. | e.cap.007 |
infra.com.cfg.003 | CPU pinning | Yes/No | Binds a vCPU to a physical core or SMT thread. Configured in OS and virtualisation layers. | e.cap.006 |
infra.com.cfg.004 | Huge pages | Yes/No | Ability to manage huge pages of memory. Configured in OS and virtualisation layers. | i.cap.018 |
infra.com.cfg.005 | Simultaneous Multithreading (SMT) | Yes/No/Optional | Allows multiple execution threads to be executed on a single physical CPU core. Configured in OS, in addition to the hardware. | e.cap.018 |

Table 5-1: Virtual Compute features.
Reference | Feature | Type | Description | Capabilities Reference |
---|---|---|---|---|
infra.com.acc.cfg.001 | IPSec Acceleration | Yes/No/Optional | IPSec Acceleration | e.cap.008 |
infra.com.acc.cfg.002 | Transcoding Acceleration | Yes/No/Optional | Transcoding Acceleration | e.cap.010 |
infra.com.acc.cfg.003 | Programmable Acceleration | Yes/No/Optional | Programmable Acceleration | e.cap.011 |
infra.com.acc.cfg.004 | GPU | Yes/No/Optional | Hardware coprocessor | e.cap.014 |
infra.com.acc.cfg.005 | FPGA/other Acceleration H/W | Yes/No/Optional | Non-specific hardware. These Capabilities generally require hardware-dependent drivers be injected into workloads. | e.cap.016 |

Table 5-2: Virtual Compute Acceleration features.
5.1.2. Virtual Storage Profiles
Table 5-3 and Table 5-4 depict the features related to virtual storage.

Reference | Feature | Type | Description |
---|---|---|---|
infra.stg.cfg.001 | Catalogue Storage Types | Yes/No | Support of Storage types described in the catalogue |
infra.stg.cfg.002 | Storage Block | Yes/No | |
infra.stg.cfg.003 | Storage with replication | Yes/No | |
infra.stg.cfg.004 | Storage with encryption | Yes/No | |

Table 5-3: Virtual Storage features.
Reference | Feature | Type | Description |
---|---|---|---|
infra.stg.acc.cfg.001 | Storage IOPS oriented | Yes/No | |
infra.stg.acc.cfg.002 | Storage capacity oriented | Yes/No | |

Table 5-4: Virtual Storage Acceleration features.
5.1.3. Virtual Networking Profiles
Table 5-5 and Table 5-6 depict the features related to virtual networking.

Reference | Feature | Type | Description | Capabilities Reference |
---|---|---|---|---|
infra.net.cfg.001 | Connection Point interface | IO virtualisation | e.g. virtio1.1 | |
infra.net.cfg.002 | Overlay protocol | Protocols | The overlay network encapsulation protocol needs to enable ECMP in the underlay to take advantage of the scale-out features of the network fabric. | |
infra.net.cfg.003 | NAT | Yes/No | Support of Network Address Translation | |
infra.net.cfg.004 | Security Groups | Yes/No | Set of rules managing incoming and outgoing network traffic | |
infra.net.cfg.005 | Service Function Chaining | Yes/No | Support of Service Function Chaining (SFC) | |
infra.net.cfg.006 | Traffic patterns symmetry | Yes/No | Traffic patterns should be optimal, in terms of packet flow. North-south traffic shall not be concentrated in specific elements in the architecture, making those critical choke-points, unless strictly necessary (i.e. when NAT 1:many is required). | |

Table 5-5: Virtual Networking features.
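The ECMP requirement attached to infra.net.cfg.002 is commonly met by UDP-based overlays by deriving the outer UDP source port from a hash of the inner flow, so that the underlay can spread different flows over equal-cost paths while keeping each flow on one path. The following is a minimal sketch of that idea, not a description of any particular datapath: the function names are hypothetical, CRC32 is only a stand-in for whatever hash a real implementation uses, and the port range is the dynamic/private range that RFC 7348 recommends for VXLAN.

```python
import zlib

# Dynamic/private port range recommended for the VXLAN outer UDP source port (RFC 7348).
PORT_MIN, PORT_MAX = 49152, 65535


def outer_udp_source_port(src_ip: str, dst_ip: str, proto: int,
                          src_port: int, dst_port: int) -> int:
    """Map an inner 5-tuple to a stable outer UDP source port (hypothetical helper)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    # CRC32 is an illustrative stand-in for the hash used by a real datapath.
    return PORT_MIN + zlib.crc32(key) % (PORT_MAX - PORT_MIN + 1)


def pick_underlay_path(outer_src_port: int, n_paths: int) -> int:
    """Toy ECMP decision: choose a path from the outer source port only (assumption)."""
    if n_paths < 1:
        raise ValueError("n_paths must be >= 1")
    return outer_src_port % n_paths
```

Because the port is a pure function of the inner 5-tuple, packets of one flow always take the same underlay path, which avoids reordering, while distinct flows are likely to land on different paths.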
Reference | Feature | Type | Description | Capabilities Reference |
---|---|---|---|---|
infra.net.acc.cfg.001 | vSwitch optimisation | Yes/No and SW Optimisation | e.g. DPDK. | |
infra.net.acc.cfg.002 | SmartNIC (for HW Offload) | Yes/No | HW Offload | |
infra.net.acc.cfg.003 | Crypto acceleration | Yes/No | | |
infra.net.acc.cfg.004 | Crypto Acceleration Interface | Yes/No | | |

Table 5-6: Virtual Networking Acceleration features.
5.1.4. Security
See Chapter 7 Security.
5.1.5. Platform Services
This section details the services that may be made available to workloads by the Cloud Infrastructure.

Reference | Feature | Type | Description |
---|---|---|---|
infra.svc.stg.001 | Object Storage | Yes/No | Object Storage Service (e.g. S3-compatible) |

Table 5-7: Cloud Infrastructure Platform services.
Platform Service Category | Platform Service Examples |
---|---|
Data Stores/Databases | Ceph, etcd, MongoDB, Redis |
Streaming and Messaging | Apache Kafka, RabbitMQ |
Load Balancer and Service Proxy | Envoy, Istio, NGINX |
Service Mesh | Envoy, Istio |
Security & Compliance | Calico, cert-manager |
Monitoring | Prometheus, Grafana (for Visualisation), Kiali (for Service Mesh) |
Logging | Fluentd, Elasticsearch (Elastic.io, Open Distro), ELK Stack (Elasticsearch, Logstash, and Kibana) |
Application Definition and Image Build | Helm |
CI/CD | Argo, GitLab, Jenkins |
Ingress/Egress Controllers | Envoy, Istio, NGINX |
Network Service | CoreDNS, Istio |
Coordination and Service Discovery | CoreDNS, etcd, ZooKeeper |
Automation and Configuration | Ansible |
Key Management | Vault |
Tracing | Jaeger |

Table 5-7a: Service examples.
5.1.5.1. Platform Services - Load Balancer Requirements
The table below specifies a set of requirements for the Load Balancer platform service.

Reference | Requirement | Notes |
---|---|---|
pas.lb.001 | The Load Balancer must support workload resource scaling | |
pas.lb.002 | The Load Balancer must support resource resiliency | |
pas.lb.003 | The Load Balancer must support scaling and resiliency in the local environment | Local environment: within a subnet, tenant network, Availability Zone of a cloud, … |
pas.lb.004 | The Load Balancer must support OSI Level 3/4 load-balancing | Load-balancing decisions are based on the OSI Level 3 source and destination IP addresses and the OSI Level 4 TCP port numbers. |
pas.lb.005 | The Load Balancer must, at a minimum, support round-robin load-balancing | |
pas.lb.006 | The Load Balancer must create event logs with the appropriate severity levels (catastrophic, critical, …) | |
pas.lb.007 | The Load Balancer must support monitoring of endpoints | |
pas.lb.008 | The Load Balancer must support Direct Server Return (DSR) | Other modes are acceptable as well, but DSR should always be supported |
pas.lb.009 | The Load Balancer must support stateful TCP load-balancing | |
pas.lb.010 | The Load Balancer must support UDP load-balancing | |
pas.lb.011 | The Load Balancer must support load-balancing and correct handling of fragmented packets | |
pas.lb.012 | The Load Balancer may support stateful SCTP load-balancing | |
pas.lb.013 | The Load Balancer may support stateful M-TCP load-balancing | |
pas.lb.014 | The Load Balancer may support Level 7 load-balancing | OSI Level 7 (application characteristics based) load-balancing should support HTTP and HTTPS |
pas.lb.015 | The L7 Load Balancer may support HTTP2 | |
pas.lb.016 | The L7 Load Balancer may support HTTP3 | |
pas.lb.017 | The L7 Load Balancer may support QUIC | |

Table 5-7b: Platform Services - Load Balancer Requirements.
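To make pas.lb.005 (round-robin) and pas.lb.007 (endpoint monitoring) concrete, here is a minimal sketch of a round-robin selector that skips endpoints reported as unhealthy. It is an illustration only: the class and method names are hypothetical, and the health state is assumed to be fed in by some external probe that is not shown.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class RoundRobinBalancer:
    """Toy round-robin selector that skips endpoints marked unhealthy."""

    def __init__(self, endpoints: List[str]) -> None:
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._healthy: Dict[str, bool] = {e: True for e in self._endpoints}
        self._ring: Iterator[str] = cycle(self._endpoints)

    def set_health(self, endpoint: str, healthy: bool) -> None:
        """Record the result of an external (hypothetical) health probe."""
        if endpoint not in self._healthy:
            raise KeyError(endpoint)
        self._healthy[endpoint] = healthy

    def next_endpoint(self) -> str:
        """Return the next healthy endpoint in rotation."""
        # One full turn of the ring is enough to find a healthy endpoint, if any.
        for _ in range(len(self._endpoints)):
            candidate = next(self._ring)
            if self._healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy endpoints available")
```

A caller would invoke `set_health()` from its monitoring loop and `next_endpoint()` once per new connection; stateful behaviour such as connection tracking (pas.lb.009) is deliberately out of scope for this sketch.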
5.1.5.2. Platform Services - Log Management Service (LMS)
The table below specifies a set of requirements for the Log Management Service (LMS).

Reference | Requirement | Notes |
---|---|---|
pas.lms.001 | LMS must support log management from multiple, distributed sources | |
pas.lms.002 | LMS must manage log rotation at configurable time periods | |
pas.lms.003 | LMS must manage log rotation at configurable log file status (%full) | |
pas.lms.004 | LMS must manage archival and retention of logs for configurable time periods by different log types | |
pas.lms.005 | LMS must ensure log file integrity (no changes, particularly changes that may affect the completeness, consistency, and accuracy, including event times, of the log file content) | Covered by req.sec.mon.005: “The Prod-Platform and NonProd-Platform must secure and protect all logs (containing sensitive information) both in-transit and at rest.” |
pas.lms.006 | LMS must monitor log rotation and log archival processes | |
pas.lms.007 | LMS must monitor the logging status of all log sources | |
pas.lms.008 | LMS must ensure that each logging host’s clock is synchronised to a common time source | |
pas.lms.009 | LMS must support reconfiguring of logging as needed based on policy changes, technology changes, and other factors | |
pas.lms.010 | LMS must support the documenting and reporting of anomalies in log settings, configurations, and processes | |
pas.lms.011 | LMS must support the correlating of entries from multiple logs that relate to the same event | |
pas.lms.012 | LMS must support the correlating of multiple log entries from a single source or multiple sources based on logged values (e.g., event types, timestamps, IP addresses) | |
pas.lms.013 | LMS should support rule-based correlation | |

Table 5-7c: Platform Services - Log Management Service (LMS) Requirements.
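As an illustration of what pas.lms.002 (time-based rotation) and pas.lms.003 (rotation by file fill level) can look like on a single log source, the sketch below uses the rotation handlers from the Python standard library. The logger name, file names, and default limits are assumptions chosen for the example; a real LMS would apply such policies across many distributed sources.

```python
import logging
import logging.handlers
import os


def build_logger(log_dir: str, max_bytes: int = 1024, backups: int = 3) -> logging.Logger:
    """Return a logger with size-based and time-based rotation (illustrative names)."""
    logger = logging.getLogger("lms_sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # keep the sketch idempotent if called twice
    logger.propagate = False

    # Size-based rotation: roll over before the file would exceed max_bytes.
    by_size = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "size.log"), maxBytes=max_bytes, backupCount=backups)
    # Time-based rotation: roll over at midnight and keep `backups` old files.
    by_time = logging.handlers.TimedRotatingFileHandler(
        os.path.join(log_dir, "time.log"), when="midnight", backupCount=backups)

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (by_size, by_time):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```

The `backupCount` argument also gives a crude form of retention: older rotated files beyond that count are deleted, which is only a small part of what pas.lms.004 asks for.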
5.1.5.3. Platform Services - Monitoring Service Requirements
The table below specifies a set of requirements for the Monitoring service (aka monitoring system).

Reference | Requirement | Notes |
---|---|---|
pas.mon.001 | The Monitoring service must be able to collect data generated by or collected from any resource (physical and virtual infrastructure, application, network, etc.) | Capabilities to monitor applications, services, operating systems, network protocols, system metrics and infrastructure components |
pas.mon.002 | The Monitoring service must be able to aggregate collected data | |
pas.mon.003 | The Monitoring service must be able to correlate data from different systems | |
pas.mon.004 | The Monitoring service must be able to perform at least one of active or passive monitoring | |
pas.mon.005 | The Monitoring service must support configuration of thresholds, outside of which the resource cannot function normally, for alert generation | |
pas.mon.006 | The Monitoring service must support configuration of alert notification medium (email, SMS, phone, etc.) | |
pas.mon.007 | The Monitoring service must support configurable re-alerting after a configurable period of time if the metric remains outside of the threshold | |
pas.mon.008 | The Monitoring service must support configurable alert escalations | |
pas.mon.009 | The Monitoring service must support alert acknowledgments by disabling future alerting of the same resource/reason | |
pas.mon.010 | The Monitoring service must support selective enabling and disabling of alerts by resource, category of resources, and time periods | |
pas.mon.011 | The Monitoring service must publish its APIs for programmatic invocation of all monitoring service functions | |
pas.mon.012 | The Monitoring service must itself be monitored through a logging service | |
pas.mon.013 | The Monitoring service should be implemented for high availability to ensure non-stop monitoring of critical infrastructure components | |
pas.mon.014 | The Monitoring service should run separately from production services | |
pas.mon.015 | Failure of the system being monitored should not cause a failure in the monitoring service | |
pas.mon.016 | An inoperative monitoring service should not generate alerts about the monitored system | |
pas.mon.017 | The Monitoring service should provide a consolidated view of the entire monitored infrastructure | View: dashboard or report |

Table 5-7d: Platform Services - Monitoring Service Requirements.
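The interplay of thresholds (pas.mon.005), re-alerting (pas.mon.007), and acknowledgment (pas.mon.009) can be sketched as a small state machine for one metric on one resource. This is one possible reading of those requirements, not a prescribed design: the class, its fields, and the choice to clear an acknowledgment once the metric returns to its normal range are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdAlert:
    """Toy alert rule for one metric on one resource (all names hypothetical)."""
    low: float
    high: float
    realert_after_s: float
    acknowledged: bool = False
    _last_alert_at: Optional[float] = None

    def evaluate(self, value: float, now_s: float) -> bool:
        """Return True if an alert should be sent for this sample."""
        if self.low <= value <= self.high:
            # Back in range: reset state so that a later breach alerts again.
            self._last_alert_at = None
            self.acknowledged = False
            return False
        if self.acknowledged:
            return False  # acknowledgment suppresses further alerts for this breach
        if self._last_alert_at is None or now_s - self._last_alert_at >= self.realert_after_s:
            self._last_alert_at = now_s
            return True
        return False

    def acknowledge(self) -> None:
        """Operator acknowledgment: stop alerting until the metric recovers."""
        self.acknowledged = True
```

Notification media (pas.mon.006) and escalation (pas.mon.008) would sit on top of the boolean returned by `evaluate()` and are not modelled here.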
5.2. Cloud Infrastructure Software Profiles features and requirements
This section details the Cloud Infrastructure Software Profiles and associated configurations for the two types of Cloud Infrastructure Profiles: Basic and High Performance.
5.2.1. Virtual Compute
Table 5-8 depicts the features and configurations related to virtual compute for the two (2) Cloud Infrastructure Profiles.

Reference | Feature | Type | Basic | High Performance |
---|---|---|---|---|
infra.com.cfg.001 | CPU allocation ratio | <value> | N:1 | 1:1 |
infra.com.cfg.002 | NUMA alignment | Yes/No | N | Y |
infra.com.cfg.003 | CPU pinning | Yes/No | N | Y |
infra.com.cfg.004 | Huge pages | Yes/No | N | Y |
infra.com.cfg.005 | Simultaneous Multithreading (SMT) | Yes/No/Optional | Y | Optional |

Table 5-8: Virtual Compute features and configuration for the 2 types of Cloud Infrastructure Profiles.
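The values in Table 5-8 lend themselves to a machine-readable form that tooling could compare against a node's actual configuration. The sketch below is one hypothetical encoding: the field names are invented for illustration, the N:1 versus 1:1 ratio is reduced to a single "overcommit allowed" flag (an interpretation, not something the table states), and "Optional" is represented as `None`, meaning no constraint.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ComputeProfile:
    """Virtual compute settings from Table 5-8 (field names are illustrative)."""
    name: str
    cpu_overcommit_allowed: bool   # infra.com.cfg.001: True for N:1, False for 1:1
    numa_alignment: bool           # infra.com.cfg.002
    cpu_pinning: bool              # infra.com.cfg.003
    huge_pages: bool               # infra.com.cfg.004
    smt: Optional[bool]            # infra.com.cfg.005: None means "Optional"


BASIC = ComputeProfile("Basic", True, False, False, False, True)
HIGH_PERFORMANCE = ComputeProfile("High Performance", False, True, True, True, None)


def deviations(node: ComputeProfile, target: ComputeProfile) -> List[str]:
    """List the fields where a node's configuration differs from the target profile."""
    issues = []
    for field in ("cpu_overcommit_allowed", "numa_alignment",
                  "cpu_pinning", "huge_pages", "smt"):
        want, have = getattr(target, field), getattr(node, field)
        if want is not None and want != have:   # None in the target = no constraint
            issues.append(f"{field}: expected {want}, found {have}")
    return issues
```

For example, a node with NUMA alignment and CPU pinning enabled but huge pages disabled would be reported as deviating from the High Performance profile on `huge_pages` only.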
Table 5-9 lists the features related to compute acceleration for the High Performance profile. The table also lists the applicable Profile Extensions and Extra Specs that may need to be specified.
Reference | Feature | Profile-Extensions | Profile Extra Specs |
---|---|---|---|
infra.com.acc.cfg.001 | IPSec Acceleration | Compute Intensive GPU | |
infra.com.acc.cfg.002 | Transcoding Acceleration | Compute Intensive GPU | Video Transcoding |
infra.com.acc.cfg.003 | Programmable Acceleration | Firmware-programmable adapter | Accelerator |
infra.com.acc.cfg.004 | GPU | Compute Intensive GPU | |
infra.com.acc.cfg.005 | FPGA/other Acceleration H/W | Firmware-programmable adapter | |

Table 5-9: Virtual Compute Acceleration features.
5.2.2. Virtual Storage
Table 5-10 and Table 5-11 depict the features and configurations related to virtual storage for the two (2) Cloud Infrastructure Profiles.

Reference | Feature | Type | Basic | High Performance |
---|---|---|---|---|
infra.stg.cfg.001 | Catalogue Storage Types | Yes/No | Y | Y |
infra.stg.cfg.002 | Storage Block | Yes/No | Y | Y |
infra.stg.cfg.003 | Storage with replication | Yes/No | N | Y |
infra.stg.cfg.004 | Storage with encryption | Yes/No | Y | Y |

Table 5-10: Virtual Storage features and configuration for the two (2) profiles.
Table 5-11 depicts the features related to Virtual Storage Acceleration.

Reference | Feature | Type | Basic | High Performance |
---|---|---|---|---|
infra.stg.acc.cfg.001 | Storage IOPS oriented | Yes/No | N | Y |
infra.stg.acc.cfg.002 | Storage capacity oriented | Yes/No | N | N |

Table 5-11: Virtual Storage Acceleration features.
5.2.3. Virtual Networking
Table 5-12 and Table 5-13 depict the features and configurations related to virtual networking for the 2 types of Cloud Infrastructure Profiles.

Reference | Feature | Type | Basic | High Performance |
---|---|---|---|---|
infra.net.cfg.001 | Connection Point interface | IO virtualisation | virtio1.1 | virtio1.1* |
infra.net.cfg.002 | Overlay protocol | Protocols | VXLAN, MPLSoUDP, GENEVE, other | VXLAN, MPLSoUDP, GENEVE, other |
infra.net.cfg.003 | NAT | Yes/No | Y | Y |
infra.net.cfg.004 | Security Groups | Yes/No | Y | Y |
infra.net.cfg.005 | Service Function Chaining | Yes/No | N | Y |
infra.net.cfg.006 | Traffic patterns symmetry | Yes/No | Y | Y |

Table 5-12: Virtual Networking features and configuration for the 2 types of SW profiles.
Note: * Other interfaces (such as SR-IOV VFs passed directly to a VM or a Pod) or NIC-specific drivers on guest machines may be allowed temporarily, until sufficiently mature solutions with a similar level of efficiency (for example regarding CPU and energy consumption) are available.
Reference | Feature | Type | Basic | High Performance |
---|---|---|---|---|
infra.net.acc.cfg.001 | vSwitch optimisation (DPDK) | Yes/No and SW Optimisation | N | Y |
infra.net.acc.cfg.002 | SmartNIC (for HW Offload) | Yes/No/Optional | N | Optional |
infra.net.acc.cfg.003 | Crypto acceleration | Yes/No/Optional | N | Optional |
infra.net.acc.cfg.004 | Crypto Acceleration Interface | Yes/No/Optional | N | Optional |

Table 5-13: Virtual Networking Acceleration features.
5.3. Cloud Infrastructure Hardware Profile description
Supporting a variety of workload types, each with different (sometimes conflicting) compute, storage, and network characteristics, including accelerations and optimisations, drives the need to aggregate these characteristics as a hardware (host) profile and capabilities. A host profile is essentially a “personality” assigned to a compute host (also known as physical server, host, node, or pServer). The host profiles and related capabilities consist of the intrinsic compute host capabilities (such as number of CPU sockets, number of cores per CPU, RAM, local disks and their capacity, etc.), capabilities enabled in hardware/BIOS, specialised hardware (such as accelerators), the underlay networking, and storage.
This chapter defines a simplified host profile and related capabilities model for each of the Cloud Infrastructure Hardware Profiles; the two profiles (see Profiles, Profile Extensions & Flavours; also known as host profiles, node profiles, or hardware profiles) and some of their associated capabilities are shown in Figure 5.4.
The profiles can be considered to be the set of EPA-related (Enhanced Performance Awareness) configurations on Cloud Infrastructure resources.
Note: In this chapter we shall not list all of the EPA-related configuration parameters.
A given host can be assigned only a single host profile; a host profile can be assigned to multiple hosts. In addition to the host profile, Profile Extensions (see Profiles and Workload Flavours) and additional capability specifications for the configuration of the host can be specified. Different Cloud Service Providers (CSPs) may use different naming standards for their host profiles. For the profiles to be configured, the architecture of the underlying resource needs to be known.
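The cardinality rule stated here (one profile per host, many hosts per profile) is easy to enforce in inventory tooling. The following is a minimal sketch with hypothetical class, method, and host names; it is not tied to any particular inventory system.

```python
from collections import defaultdict
from typing import Dict, List


class HostProfileRegistry:
    """Each host maps to exactly one profile; a profile may cover many hosts."""

    def __init__(self) -> None:
        self._profile_of: Dict[str, str] = {}

    def assign(self, host: str, profile: str) -> None:
        """Assign a profile, rejecting a second, different profile for the same host."""
        current = self._profile_of.get(host)
        if current is not None and current != profile:
            raise ValueError(f"{host} already has profile {current!r}")
        self._profile_of[host] = profile

    def hosts_by_profile(self) -> Dict[str, List[str]]:
        """Group hosts by their assigned profile, with hosts in sorted order."""
        grouped: Dict[str, List[str]] = defaultdict(list)
        for host, profile in sorted(self._profile_of.items()):
            grouped[profile].append(host)
        return dict(grouped)
```

Storing the mapping keyed by host makes the "single profile per host" constraint structural, while the reverse view is derived on demand.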
Ref | Cloud Infrastructure Resource | Type | Definition/Notes | Capabilities Reference |
---|---|---|---|---|
infra.hw.001 | CPU Architecture | <value> | Values such as x64, ARM, etc. | |

The host profile properties are specified in the following sub-sections. The following diagram (Figure 5.5) pictorially represents a high-level abstraction of a physical server (host).
5.4. Cloud Infrastructure Hardware Profiles features and requirements
The configurations specified here will be used in specifying the actual hardware profile configurations for each of the Cloud Infrastructure Hardware Profiles depicted in Figure 5.4.
5.4.1. Compute Resources

Reference | Feature | Description | Basic | High Performance |
---|---|---|---|---|
infra.hw.cpu.cfg.001 | Minimum number of CPU sockets | Specifies the minimum number of populated CPU sockets within each host (*) | 2 | 2 |
infra.hw.cpu.cfg.002 | Minimum number of cores per CPU | Specifies the number of cores needed per CPU (*) | 20 | 20 |
infra.hw.cpu.cfg.003 | NUMA alignment | NUMA alignment enabled and BIOS configured to enable NUMA | N | Y |
infra.hw.cpu.cfg.004 | Simultaneous Multithreading (SMT) | SMT enabled, allowing each core to work on multiple streams of data simultaneously | Y | Optional |

Table 5-14: Minimum sizing and capability configurations for general purpose servers.
(*) Please note that these specifications are for general purpose servers normally located in large data centres. Servers for specialised use within the data centres or at other locations, such as edge sites, are likely to have different specifications.
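The minimum sizing in Table 5-14 can be combined with the CPU allocation ratio (infra.com.cfg.001, virtual cores per physical core) to estimate how many vCPUs a host can offer. The sketch below does that arithmetic under stated assumptions: the function and the `reserved_cores` parameter (cores held back for the host OS and infrastructure software) are hypothetical, SMT threads are not counted, and any value of N used for the Basic profile is an example, since the table only says "N:1".

```python
def vcpu_capacity(sockets: int, cores_per_cpu: int,
                  vcpus_per_core: int, reserved_cores: int = 0) -> int:
    """Schedulable vCPUs on one host for a given CPU allocation ratio."""
    if min(sockets, cores_per_cpu, vcpus_per_core) < 1:
        raise ValueError("sockets, cores_per_cpu and vcpus_per_core must be >= 1")
    physical = sockets * cores_per_cpu
    if not 0 <= reserved_cores < physical:
        raise ValueError("reserved_cores must be in [0, physical cores)")
    return (physical - reserved_cores) * vcpus_per_core
```

With the minimum of 2 sockets and 20 cores per CPU, a 1:1 ratio gives 40 vCPUs, while an assumed N of 4 gives 160; reserving 4 cores for the host at 1:1 leaves 36.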
5.4.1.1. Compute Acceleration Hardware Specifications

Reference | Feature | Description | Basic | High Performance | Capabilities Reference |
---|---|---|---|---|---|
infra.hw.cac.cfg.001 | GPU | GPU | N | Optional | |
infra.hw.cac.cfg.002 | FPGA/other Acceleration H/W | HW Accelerators | N | Optional | |

Table 5-15: Compute acceleration configuration specifications.
5.4.2. Storage Configurations

Reference | Feature | Description | Basic | High Performance |
---|---|---|---|---|
infra.hw.stg.hdd.cfg.001* | Local Storage HDD | Hard Disk Drive | | |
infra.hw.stg.ssd.cfg.002* | Local Storage SSD | Solid State Drive | Recommended | Recommended |

Table 5-16: Storage configuration specification.
Note: * This specifies the local storage configuration, including the number and capacity of storage drives.
5.4.3. Network Resources
5.4.3.1. NIC configurations

Reference | Feature | Description | Basic | High Performance |
---|---|---|---|---|
infra.hw.nic.cfg.001 | NIC Ports | Total number of NIC Ports available in the host | 4 | 4 |
infra.hw.nic.cfg.002 | Port Speed | Port speed specified in Gbps (minimum values) | 10 | 25 |

Table 5-17: Minimum NIC configuration specification.
5.4.3.2. PCIe Configurations

Reference | Feature | Description | Basic | High Performance |
---|---|---|---|---|
infra.hw.pci.cfg.001 | PCIe slots | Number of PCIe slots available in the host | 8 | 8 |
infra.hw.pci.cfg.002 | PCIe speed | | Gen 3 | Gen 3 |
infra.hw.pci.cfg.003 | PCIe Lanes | | 8 | 8 |

Table 5-18: PCIe configuration specification.
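A rough way to relate the PCIe figures in Table 5-18 to the NIC port speeds in Table 5-17 is to compare raw link rates. PCIe Gen 3 signals at 8 GT/s per lane with 128b/130b encoding, so an 8-lane link carries a little over 63 Gbps per direction before protocol overhead. The sketch below encodes that arithmetic; reading "PCIe Lanes: 8" as lanes per slot, and placing all ports of a NIC in one slot, are assumptions made for the example, and the helper names are hypothetical.

```python
GEN3_GT_PER_S = 8.0        # giga-transfers per second, per lane (PCIe Gen 3)
GEN3_ENCODING = 128 / 130  # 128b/130b line encoding efficiency


def pcie_gen3_gbps(lanes: int) -> float:
    """Raw per-direction bandwidth of a Gen 3 link in Gbps, before protocol overhead."""
    if lanes < 1:
        raise ValueError("lanes must be >= 1")
    return lanes * GEN3_GT_PER_S * GEN3_ENCODING


def slot_can_carry(lanes: int, ports: int, port_speed_gbps: float) -> bool:
    """True if the slot's raw bandwidth covers all ports running at line rate."""
    return pcie_gen3_gbps(lanes) >= ports * port_speed_gbps
```

Under these assumptions, one x8 Gen 3 slot covers four 10 Gbps ports or two 25 Gbps ports at line rate, but not four 25 Gbps ports; a real budget would also subtract packet-level protocol overhead.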
5.4.3.3. Network Acceleration Configurations

Reference | Feature | Description | Basic | High Performance | Capabilities Reference |
---|---|---|---|---|---|
infra.hw.nac.cfg.001 | Crypto Acceleration | IPSec, Crypto | N | Optional | |
infra.hw.nac.cfg.002 | SmartNIC | Offload of network functionality | N | Optional | |
infra.hw.nac.cfg.003 | Compression | | Optional | Optional | |
infra.hw.nac.cfg.004 | SR-IOV over PCI-PT | SR-IOV | N | Optional | |

Table 5-19: Network acceleration configuration specification.