OPNFV Documentation

Open Platform for NFV (OPNFV) facilitates the development and evolution of NFV components across various open source ecosystems. Through system level integration, deployment and testing, OPNFV creates a reference NFV platform to accelerate the transformation of enterprise and service provider networks. Participation is open to anyone, whether you are an employee of a member company or just passionate about network transformation.

Platform overview

Introduction

Network Functions Virtualization (NFV) is transforming the networking industry via software-defined infrastructures and open source is the proven method for quickly developing software for commercial products and services that can move markets. Open Platform for NFV (OPNFV) facilitates the development and evolution of NFV components across various open source ecosystems. Through system level integration, deployment and testing, OPNFV constructs a reference NFV platform to accelerate the transformation of enterprise and service provider networks. As an open source project, OPNFV is uniquely positioned to bring together the work of standards bodies, open source communities, service providers and commercial suppliers to deliver a de facto NFV platform for the industry.

By integrating components from upstream projects, the community is able to conduct performance and use case-based testing on a variety of solutions to ensure the platform’s suitability for NFV use cases. OPNFV also works upstream with other open source communities to bring contributions and learnings from its work directly to those communities in the form of blueprints, patches, bugs, and new code.

OPNFV focuses on building NFV Infrastructure (NFVI) and Virtualised Infrastructure Management (VIM) by integrating components from upstream projects such as OpenDaylight, ONOS, Tungsten Fabric, OVN, OpenStack, Kubernetes, Ceph Storage, KVM, Open vSwitch, and Linux. More recently, OPNFV has extended its portfolio of forwarding solutions to include DPDK, fd.io and ODP. The platform is able to run on both Intel and ARM commercial and white-box hardware, supports VM, container and bare-metal workloads, and, as of the Fraser release, includes Management and Network Orchestration (MANO) components primarily for application composition and management.

These capabilities, along with application programming interfaces (APIs) to other NFV elements, form the basic infrastructure required for Virtualized Network Functions (VNFs) and MANO components.

Concentrating on these components while also considering proposed projects on additional topics (such as the MANO components and applications themselves), OPNFV aims to enhance NFV services by increasing performance and power efficiency, improving reliability, availability and serviceability, and delivering comprehensive platform instrumentation.

OPNFV Platform Architecture

The OPNFV project addresses a number of aspects in the development of a consistent virtualisation platform including common hardware requirements, software architecture, MANO and applications.

OPNFV Platform Overview Diagram

Overview infographic of the opnfv platform and projects.

To address these areas effectively, the OPNFV platform architecture can be decomposed into the following basic building blocks:

  • Hardware: Infrastructure working group, Pharos project and associated activities
  • Software Platform: Platform integration and deployment projects
  • MANO: MANO working group and associated projects
  • Tooling and testing: Testing working group and test projects
  • Applications: all other areas, which drive requirements for OPNFV

OPNFV Lab Infrastructure

The infrastructure working group oversees such topics as lab management, workflow, definitions, metrics and tools for OPNFV infrastructure.

Fundamental to the WG is the Pharos Specification which provides a set of defined lab infrastructures over a geographically and technically diverse federated global OPNFV lab.

Labs may instantiate bare-metal and virtual environments that are accessed remotely by the community and used for OPNFV platform and feature development, build, deploy and testing. No two labs are the same and the heterogeneity of the Pharos environment provides the ideal platform for establishing hardware and software abstractions providing well understood performance characteristics.

Community labs are hosted by OPNFV member companies on a voluntary basis. The Linux Foundation also hosts an OPNFV lab that provides centralized CI and other production resources which are linked to community labs.

The Lab-as-a-service (LaaS) offering gives developers ready access to NFV infrastructure on demand. Planned lab capabilities include the ability to easily automate the deployment and testing of any OPNFV install scenario in any lab environment, using a concept called “Dynamic CI”.

OPNFV Software Platform Architecture

The OPNFV software platform consists exclusively of open source implementations of platform component pieces. OPNFV is able to draw from the rich ecosystem of NFV related technologies available in open source communities, and then integrate, test, measure and improve these components in conjunction with our upstream communities.

Virtual Infrastructure Management

OPNFV derives its virtual infrastructure management from one of our largest upstream ecosystems: OpenStack. OpenStack provides a complete reference cloud management system and associated technologies. While the OpenStack community sustains a broad set of projects, not all of them are relevant in the NFV domain; the OPNFV community therefore consumes a subset of OpenStack projects, whose usage and composition may vary depending on the installer and scenario.

For details on the scenarios available in OPNFV and the specific composition of components refer to the OPNFV User Guide & Configuration Guide.

OPNFV now also has initial support for containerized VNFs.

Operating Systems

OPNFV currently uses Linux on all target machines; this can include Ubuntu, CentOS or SUSE Linux. The specific version of Linux used for any deployment is documented in the installation guide.

Networking Technologies

SDN Controllers

OPNFV, as an NFV focused project, has a significant investment in networking technologies and provides a broad variety of integrated open source reference solutions. The diversity of controllers able to be used in OPNFV is supported by a similarly diverse set of forwarding technologies.

There are many SDN controllers available today that are relevant to virtual environments, and the OPNFV community supports and contributes to a number of them. The controllers being worked on by the community during this release of OPNFV include:

  • Neutron: an OpenStack project to provide “network connectivity as a service” between interface devices (e.g., vNICs) managed by other OpenStack services (e.g. Nova).
  • OpenDaylight: addresses multivendor, traditional and greenfield networks, establishing the industry’s de facto SDN platform and providing the foundation for networks of the future.
  • Tungsten Fabric: an open source SDN controller designed for cloud and NFV use cases. It includes an analytics engine and well defined northbound REST APIs for configuration and for gathering ops/analytics data.
  • OVN: a virtual networking solution developed by the same team that created OVS. OVN stands for Open Virtual Networking and differs from the above projects in that it focuses only on overlay networks.

Data Plane

OPNFV extends Linux virtual networking capabilities by using virtual switching and routing components. The OPNFV community proactively engages with the following open source communities to address performance, scale and resiliency needs apparent in carrier networks.

  • OVS (Open vSwitch): a production quality, multilayer virtual switch designed to enable massive network automation through programmatic extension, while still supporting standard management interfaces and protocols.
  • FD.io (Fast data - Input/Output): a high performance alternative to Open vSwitch, the core engine of FD.io is a vector processing engine (VPP). VPP processes a number of packets in parallel instead of one at a time thus significantly improving packet throughput.
  • DPDK: a set of libraries that bypass the kernel and provide polling mechanisms, instead of interrupt based operations, to speed up packet processing. DPDK works with both OVS and FD.io.
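The throughput benefit of vector processing can be illustrated with a small, purely conceptual Python sketch; this is not FD.io code, and the per-packet "work" is a stand-in. The point is that fixed per-call costs are paid once per batch rather than once per packet:

```python
# Conceptual sketch (NOT FD.io/VPP code): processing packets in
# batches ("vectors") amortizes per-call overhead, which is the
# core idea behind VPP's improved packet throughput.

def process_one(packet):
    # stand-in for per-packet work: parse, look up, rewrite
    return packet ^ 0xFF

def scalar_path(packets):
    # one packet per call: fixed overhead is paid for every packet
    return [process_one(p) for p in packets]

def vector_path(packets, vector_size=256):
    # a whole vector per call: fixed costs (dispatch, instruction-cache
    # warm-up) are paid once per batch of up to `vector_size` packets
    out = []
    for i in range(0, len(packets), vector_size):
        batch = packets[i:i + vector_size]
        out.extend(p ^ 0xFF for p in batch)
    return out

packets = list(range(1000))
assert scalar_path(packets) == vector_path(packets)
```

Both paths produce identical results; the difference in a real data plane is that the vectorized path keeps the instruction cache hot and reduces per-packet overhead.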

MANO

OPNFV integrates open source MANO projects for NFV orchestration and VNF management. New MANO projects are constantly being added.

Deployment Architecture

A typical OPNFV deployment starts with three controller nodes running in a high-availability configuration, including control plane components from OpenStack, SDN controllers, etc., and a minimum of two compute nodes for the deployment of workloads (VNFs). A detailed description of the hardware required to support this 5 node configuration can be found in the Pharos specification: Pharos Project

In addition to the deployment on a highly available physical infrastructure, OPNFV can be deployed for development and lab purposes in a virtual environment. In this case each of the hosts is provided by a virtual machine and allows control and workload placement using nested virtualization.

The initial deployment is done using a staging server, referred to as the “jumphost”. This server, either physical or virtual, is first installed with the installation program, which then installs OpenStack and other components on the controller nodes and compute nodes. See the OPNFV User Guide & Configuration Guide for more details.

The OPNFV Testing Ecosystem

The OPNFV community has set out to address the needs of virtualization in the carrier network and as such platform validation and measurements are a cornerstone to the iterative releases and objectives.

To simplify the complex task of feature, component and platform validation and characterization the testing community has established a fully automated method for addressing all key areas of platform validation. This required the integration of a variety of testing frameworks in our CI systems, real time and automated analysis of results, storage and publication of key facts for each run as shown in the following diagram.

Overview infographic of the OPNFV testing Ecosystem

Release Verification

The OPNFV community relies on its testing community to establish release criteria for each OPNFV release. With each release cycle the testing criteria become more stringent and more representative of our feature and resiliency requirements. Each release establishes a set of deployment scenarios to validate, and the testing infrastructure and test suites need to accommodate these features and capabilities.

The release criteria as established by the testing teams include passing a set of test cases derived from the functional testing project ‘functest,’ a set of test cases derived from our platform system and performance test project ‘yardstick,’ and a selection of test cases for feature capabilities derived from other test projects such as bottlenecks, vsperf, cperf and storperf. The scenario needs to be able to be deployed, pass these tests, and be removed from the infrastructure iteratively in order to fulfill the release criteria.

Functest

Functest provides a functional testing framework incorporating a number of test suites and test cases that test and verify OPNFV platform functionality. The scope of Functest and relevant test cases can be found in the Functest User Guide

Functest provides both feature project and component test suite integration, leveraging OpenStack and SDN controllers testing frameworks to verify the key components of the OPNFV platform are running successfully.

Yardstick

Yardstick is a testing project for verifying infrastructure compliance when running VNF applications. Yardstick benchmarks a number of characteristics and performance vectors on the infrastructure, making it a valuable pre-deployment NFVI testing tool.

Yardstick provides a flexible testing framework for launching other OPNFV testing projects.

There are two types of test cases in Yardstick:

  • Yardstick generic test cases: basic characteristics benchmarking in the compute/storage/network areas.
  • OPNFV feature test cases: basic telecom feature testing from OPNFV projects; for example nfv-kvm, sfc, ipv6, Parser, Availability and SDN VPN.

System Evaluation and compliance testing

The OPNFV community is developing a set of test suites intended to evaluate reference behaviors and capabilities of NFV systems developed outside the OPNFV ecosystem, measuring their ability to provide the features and capabilities developed within it.

The Dovetail project will provide a test framework and methodology able to be used on any NFV platform, including an agreed set of test cases establishing an evaluation criteria for exercising an OPNFV compatible system. The Dovetail project has begun establishing the test framework and will provide a preliminary methodology for the Fraser release. Work will continue to develop these test cases to establish a stand alone compliance evaluation solution in future releases.

Additional Testing

Besides the test suites and cases for release verification, additional testing is performed to validate specific features or characteristics of the OPNFV platform. These testing frameworks and test cases may address specific needs, such as extended measurements, additional testing stimuli, or tests simulating environmental disturbances or failures.

These additional testing activities provide a more complete evaluation of the OPNFV platform. Some of the projects focused on these testing areas include:

Bottlenecks

Bottlenecks provides a framework to find system limitations and bottlenecks, providing root cause isolation capabilities to facilitate system evaluation.

NFVBench

NFVbench is a lightweight end-to-end dataplane benchmarking framework project. It includes traffic generator(s) and measures a number of packet performance related metrics.

QTIP

QTIP boils down NFVI compute and storage performance into one single metric for easy comparison. QTIP crunches these numbers based on five different categories of compute metrics and relies on Storperf for storage metrics.

Storperf

Storperf measures the performance of external block storage. The goal of this project is to provide a report based on SNIA’s (Storage Networking Industry Association) Performance Test Specification.

VSPERF

VSPERF provides an automated test-framework and comprehensive test suite for measuring data-plane performance of the NFVI including switching technology, physical and virtual network interfaces. The provided test cases with network topologies can be customized while also allowing individual versions of Operating System, vSwitch and hypervisor to be specified.

Installation

Abstract

This is an overview document for the installation of the Gambia release of OPNFV.

The Gambia release can be installed making use of any of the installer projects in OPNFV: Apex, Compass4Nfv or Fuel. Each installer provides the ability to install a common OPNFV platform as well as integrating additional features delivered through a variety of scenarios by the OPNFV community.

Introduction

The OPNFV platform is comprised of a variety of upstream components that may be deployed on your infrastructure. A composition of components, tools and configurations is identified in OPNFV as a deployment scenario.

The various OPNFV scenarios provide unique features and capabilities that you may want to leverage, and it is important to understand your required target platform capabilities before installing and configuring your scenarios.

An OPNFV installation requires either a physical infrastructure environment as defined in the Pharos specification, or a virtual one. When configuring a physical infrastructure it is strongly advised to follow the Pharos configuration guidelines.

Scenarios

OPNFV scenarios are designed to host virtualised network functions (VNFs) in a variety of deployment architectures and locations. Each scenario provides specific capabilities and/or components aimed at solving specific problems for the deployment of VNFs.

A scenario may, for instance, include components such as OpenStack, OpenDaylight, OVS, KVM etc., where each scenario will include different source components or configurations.

To learn more about the scenarios supported in the Gambia release refer to the scenario description documents provided:

Installation Procedure

Detailed step by step instructions for working with an installation toolchain and installing the required scenario are provided by the installation projects. The projects providing installation support for the OPNFV Gambia release are: Apex, Compass4nfv and Fuel.

The instructions for each toolchain can be found in these links:

OPNFV Test Frameworks

If you have elected to install the OPNFV platform using the deployment toolchain provided by OPNFV, your system will have been validated once the installation is completed. The basic deployment validation only addresses a small part of capabilities in the platform and you may want to execute more exhaustive tests. Some investigation will be required to select the right test suites to run on your platform.

Many of the OPNFV test projects provide user-guide documentation and installation instructions, linked in this document.

User Guide & Configuration Guide

Abstract

OPNFV is a collaborative project aimed at providing a variety of virtualisation deployments intended to host applications serving the networking and carrier industries. This document provides guidance and instructions for using platform features designed to support these applications that are made available in the OPNFV Gambia release.

This document is not intended to replace or replicate documentation from other upstream open source projects such as KVM, OpenDaylight, OpenStack, etc., but to highlight the features and capabilities delivered through the OPNFV project.

Introduction

OPNFV provides a suite of scenarios (infrastructure deployment options) which can be installed to host virtualised network functions (VNFs). This document intends to help users of the platform leverage the features and capabilities delivered by OPNFV.

OPNFV’s Continuous Integration builds, deploys and tests combinations of virtual infrastructure components in what are defined as scenarios. A scenario may include components such as KVM, OpenDaylight, OpenStack, OVS, etc., where each scenario will include different source components or configurations. Scenarios are designed to enable specific features and capabilities in the platform that can be leveraged by the OPNFV user community.

Feature Overview

The following links outline the feature deliverables from participating OPNFV projects in the Gambia release. Each of the participating projects provides detailed descriptions about the delivered features including use cases, implementation, and configuration specifics.

The following Configuration Guides and User Guides assume that the reader already has some knowledge about a given project’s specifics and deliverables. These Guides are intended to be used following the installation with an OPNFV installer to allow users to deploy and implement features delivered by OPNFV.

If you are unsure about the specifics of a given project, please refer to the OPNFV wiki page at http://wiki.opnfv.org for more details.

Feature Configuration Guides

Feature User Guides

Testing Frameworks

Testing Framework Overview

OPNFV Testing Overview

Introduction

Testing is one of the key activities in OPNFV and includes unit, feature, component and system level testing for development, automated deployment, performance characterization and stress testing.

Test projects are dedicated to providing frameworks, tooling and test cases categorized as functional, performance or compliance testing. Test projects fulfill different roles, such as verifying VIM functionality, benchmarking components and platforms, or analysing measured KPIs for OPNFV release scenarios.

Feature projects also provide their own test suites that either run independently or within a test project.

This document details the OPNFV testing ecosystem, describes common test components used by individual OPNFV projects and provides links to project specific documentation.

The OPNFV Testing Ecosystem

The OPNFV testing projects are represented in the following diagram:

Overview of OPNFV Testing projects

The major testing projects are described in the table below:

Project Description
Bottlenecks This project aims to find system bottlenecks by testing and verifying OPNFV infrastructure in a staging environment before it is committed to production. Instead of debugging a deployment in a production environment, it adopts an automatic method for executing benchmarks that validates the deployment during staging. This project forms a staging framework to find bottlenecks and to analyse the OPNFV infrastructure.
CPerf SDN Controller benchmarks and performance testing, applicable to controllers in general. Collaboration of upstream controller testing experts, external test tool developers and the standards community. Primarily contribute to upstream/external tooling, then add jobs to run those tools on OPNFV’s infrastructure.
Dovetail This project intends to define and provide a set of OPNFV related validation criteria/tests that will provide input for the OPNFV Compliance Verification Program. The Dovetail project is executed with the guidance and oversight of the Compliance and Certification (C&C) committee and works to secure the goals of the C&C committee for each release. The project intends to incrementally define qualification criteria that establish the foundations of how one is able to measure the ability to utilize the OPNFV platform, how the platform itself should behave, and how applications may be deployed on the platform.
Functest This project deals with the functional testing of the VIM and NFVI. It leverages several upstream test suites (OpenStack, ODL, ONOS, etc.) and can be used by feature projects to launch feature test suites in CI/CD. The project is used for scenario validation.
NFVbench NFVbench is a compact and self contained data plane performance measurement tool for OpenStack based NFVI platforms. It is agnostic of the NFVI distribution, Neutron networking implementation and hardware. It runs on any Linux server with a DPDK compliant NIC connected to the NFVI platform data plane and bundles a highly efficient software traffic generator. It provides a fully automated measurement of the most common packet paths at any level of scale and load using RFC 2544, is available as a Docker container with simple command line and REST interfaces, and is easy to use as it takes care of most of the guesswork generally associated with data plane benchmarking. It can run in any lab or in production environments.
QTIP QTIP as the project for “Platform Performance Benchmarking” in OPNFV aims to provide user a simple indicator for performance, supported by comprehensive testing data and transparent calculation formula. It provides a platform with common services for performance benchmarking which helps users to build indicators by themselves with ease.
StorPerf The purpose of this project is to provide a tool to measure block and object storage performance in an NFVI. When complemented with a characterization of typical VF storage performance requirements, it can provide pass/fail thresholds for test, staging, and production NFVI environments.
VSPERF VSPERF is an OPNFV project that provides an automated test-framework and comprehensive test suite based on Industry Test Specifications for measuring NFVI data-plane performance. The data-path includes switching technologies with physical and virtual network interfaces. The VSPERF architecture is switch and traffic generator agnostic and test cases can be easily customized. Software versions and configurations including the vSwitch (OVS or VPP) as well as the network topology are controlled by VSPERF (independent of OpenStack). VSPERF is used as a development tool for optimizing switching technologies, qualification of packet processing components and for pre-deployment evaluation of the NFV platform data-path.
Yardstick The goal of the project is to verify infrastructure compliance when running VNF applications. NFV use cases described in ETSI GS NFV 001 show a large variety of applications, each defining specific requirements and complex configuration on the underlying infrastructure and test tools. The Yardstick concept decomposes typical VNF workload performance metrics into a number of characteristics/performance vectors, each of which can be represented by distinct test cases.

Testing Working Group Resources

Test Results Collection Framework

Any test project running in the global OPNFV lab infrastructure that is integrated with OPNFV CI can push test results to the community Test Database using a common Test API. This database can be used to track the evolution of testing and analyse test runs to compare results across installers, scenarios and between technically and geographically diverse hardware environments.

Results from the database are used to generate a dashboard with the current test status for each testing project. Please note that you can also deploy the Test Database and Test API locally in your own environment.
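As an illustration, a query against the community Test API might be built as follows. The base URL appears in the architecture diagram below; the `/api/v1/results` path and the `project`/`case`/`last` query parameters are assumptions based on common REST conventions, so check the Test API documentation for the authoritative interface:

```python
# Hedged sketch: building a Test API results query. The endpoint path
# and parameter names are assumptions; consult the Test API docs.
from urllib.parse import urlencode

TEST_API_BASE = "http://testresults.opnfv.org/test"

def results_url(project, case=None, last=10):
    """Build a results query URL for a project (and optional test case)."""
    params = {"project": project, "last": last}
    if case:
        params["case"] = case
    return f"{TEST_API_BASE}/api/v1/results?{urlencode(params)}"

print(results_url("functest", case="vping_ssh"))
# -> http://testresults.opnfv.org/test/api/v1/results?project=functest&last=10&case=vping_ssh
```

The URL would then be fetched with any HTTP client to retrieve raw JSON results for post-processing.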

Overall Test Architecture

The management of test results can be summarized as follows:

+-------------+    +-------------+    +-------------+
|             |    |             |    |             |
|   Test      |    |   Test      |    |   Test      |
| Project #1  |    | Project #2  |    | Project #N  |
|             |    |             |    |             |
+-------------+    +-------------+    +-------------+
         |               |               |
         V               V               V
     +---------------------------------------------+
     |                                             |
     |           Test Rest API front end           |
     |    http://testresults.opnfv.org/test        |
     |                                             |
     +---------------------------------------------+
         ^                |                     ^
         |                V                     |
         |     +-------------------------+      |
         |     |                         |      |
         |     |    Test Results DB      |      |
         |     |         Mongo DB        |      |
         |     |                         |      |
         |     +-------------------------+      |
         |                                      |
         |                                      |
   +----------------------+        +----------------------+
   |                      |        |                      |
   | Testing Dashboards   |        |  Test Landing page   |
   |                      |        |                      |
   +----------------------+        +----------------------+
The Test Database

A Mongo DB Database was introduced for the Brahmaputra release. The following collections are declared in this database:

  • pods: the list of pods used for production CI
  • projects: the list of projects providing test cases
  • test cases: the test cases related to a given project
  • results: the results of the test cases
  • scenarios: the OPNFV scenarios tested in CI

This database can be used by any project through the Test API. Please note that projects may also use additional databases. The Test Database is mainly used to collect CI test results and generate scenario trust indicators. The Test Database is also cloned for OPNFV Plugfests in order to provide a private datastore only accessible to Plugfest participants.

Test API description

The Test API is used to declare pods, projects, test cases and test results. Pods correspond to a cluster of machines (three controller and two compute nodes in HA mode) used to run the tests, and are defined in the Pharos project. The results pushed to the database are related to pods, projects and test cases. Trying to push results generated from a non-referenced pod will return an error message from the Test API.

The data model is very basic; 5 objects are available:
  • Pods
  • Projects
  • Test cases
  • Results
  • Scenarios

For detailed information, please go to http://artifacts.opnfv.org/releng/docs/testapi.html

The code of the Test API is hosted in the releng-testresults repository [TST2]. The static documentation of the Test API can be found at [TST3]. The Test API has been dockerized and may be installed locally in your lab.

The deployment of the Test API has been automated. A Jenkins job manages:

  • the unit tests of the Test API
  • the creation of a new docker file
  • the deployment of the new Test API
  • the archive of the old Test API
  • the backup of the Mongo DB
Test API Authorization

PUT/DELETE/POST operations of the Test API now require token-based authorization. The token needs to be added to the request using the header ‘X-Auth-Token’ for access to the database.

e.g:

headers['X-Auth-Token']

The value of the header, i.e. the token, can be accessed via the Jenkins environment variable TestApiToken. The token value is added as a masked password.

headers['X-Auth-Token'] = os.environ.get('TestApiToken')

The above example is in Python. Token-based authentication has been added so that only CI pods running Jenkins jobs can access the database. Please note that token authorization is currently implemented but not yet enabled.
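Putting the snippets above together, a result push with the token header might look like the sketch below. The `/api/v1/results` path and the payload field names are assumptions based on the data model described earlier; consult the Test API documentation for the real schema:

```python
# Hedged sketch of pushing a test result with token authorization.
# Endpoint path and payload fields are illustrative assumptions.
import json
import os
import urllib.request

def push_result(api_url, payload):
    """Build an authorized POST request for the Test API."""
    token = os.environ.get("TestApiToken", "")  # masked Jenkins credential
    return urllib.request.Request(
        url=api_url + "/api/v1/results",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # required for POST/PUT/DELETE
        },
        method="POST",
    )

req = push_result(
    "http://testresults.opnfv.org/test",
    {"pod_name": "intel-pod1", "project_name": "functest",
     "case_name": "vping_ssh", "criteria": "PASS"},
)
# the caller would then execute: urllib.request.urlopen(req)
```

Because the request object is built separately from its execution, the header and payload can be unit-tested without network access.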

Test Project Reporting

The reporting page for the test projects is http://testresults.opnfv.org/reporting/

Testing group reporting page

This page provides reporting per OPNFV release and per testing project.

Testing group Euphrates reporting page

An evolution of the reporting page is planned to unify test reporting by creating a landing page that shows the scenario status at a glance (this information was previously consolidated manually on a wiki page). The landing page will be displayed per scenario and show:

  • the status of the deployment
  • the score from each test suite (there is no overall score; scoring is determined by each test project)
  • a trust indicator

Test Case Catalog

Until the Colorado release, each testing project managed its own list of test cases. This made it very hard to get a global view of the available test cases across the different test projects. A common view was possible through the API but it was not very user friendly. Test cases per project may be listed by calling:

with project_name: bottlenecks, functest, qtip, storperf, vsperf, yardstick

A test case catalog has now been created [TST4]. Roll over a project and click to get its list of test cases, then click on a case to get more details.

Testing group testcase catalog
Test Dashboards

The Test Dashboard is used to provide a consistent view of the results collected in CI. The results shown on the dashboard are post processed from the Database, which only contains raw results. The dashboard can be used in addition to the reporting page (high level view) to allow the creation of specific graphs according to what the test owner wants to show.

In Brahmaputra, a basic dashboard was created in Functest. In Colorado, Yardstick used Grafana (time based graphs) and ELK (complex graphs). Since Danube, the OPNFV testing community decided to adopt the ELK framework and to use Bitergia for creating highly flexible dashboards [TST5].

Power Consumption Monitoring Framework
Introduction

Power consumption is a key driver for NFV. Just as end users are interested to know which smartphone applications are power hungry (and why the phone has to be plugged in every day), operators want to know which VNFs consume the most power.

Power consumption is hard to evaluate empirically. It is, however, possible to collect information and leverage the Pharos federation to try to detect power profiles/footprints. In fact, thanks to CI, we know that we are running a known, deterministic list of test cases. The idea is to correlate this knowledge with the power consumption in order to identify statistical patterns.

High Level Architecture

The energy recorder high level architecture may be described as follows:

Energy recorder high level architecture

The energy monitoring system is based on 3 software components:

  • Power info collector: polls servers to collect instantaneous power consumption information
  • Energy recording API + influxdb: receives server consumption on one leg and scenario notifications on the other. It is then able to establish the correlation between consumption and scenario and stores it in a time-series database (influxdb)
  • Python SDK: a Python SDK using decorators to send notifications to the Energy recording API from test case scenarios
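The correlation the recording API performs can be sketched as follows: given raw timestamped power samples and the step notifications sent by the SDK, tag each sample with the step that was active when it was taken. The data shapes and function name are illustrative, not the actual API schema.

```python
# Minimal sketch of the consumption/scenario correlation, with illustrative
# data shapes (NOT the actual Energy recording API schema).

def tag_samples(samples, steps):
    """samples: list of (timestamp, watts); steps: list of (start_ts, name)
    sorted by start time. Returns (timestamp, watts, step_name) tuples."""
    tagged = []
    for ts, watts in samples:
        active = None
        for start, name in steps:
            if start <= ts:
                active = name      # most recent step started before this sample
            else:
                break
        tagged.append((ts, watts, active))
    return tagged

samples = [(10, 180.0), (25, 210.5), (40, 195.2)]
steps = [(0, "idle"), (20, "step1"), (35, "step2")]
print(tag_samples(samples, steps))
```

In the real system the tagged series is what lands in influxdb, so per-step consumption can later be aggregated with time-series queries.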

Power Info Collector

It collects instantaneous power consumption information and sends it to the Event API, which is in charge of storing the data. The collector uses different connectors to read the power consumption on remote servers:

  • IPMI: this is the basic method and is manufacturer dependent. Depending on the manufacturer, the refresh delay may vary (generally from 10 to 30 sec.)
  • Redfish: an industry-standard RESTful API for hardware management. Unfortunately it is not yet supported by many suppliers.
  • ILO: the HP RESTful API. This connector supports both the 2.1 and 2.4 versions of HP-ILO.

IPMI is supported by at least:

  • HP
  • IBM
  • Dell
  • Nokia
  • Advantech
  • Lenovo
  • Huawei

Redfish API has been successfully tested on:

  • HP
  • Dell
  • Huawei (E9000 class servers used in OPNFV Community Labs are IPMI 2.0 compliant and use the Redfish login interface through browsers supporting JRE 1.7/1.8)

Several test campaigns done with a physical wattmeter showed that IPMI results were not very accurate while Redfish results were. So if Redfish is available, it is highly recommended to use it.
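The recommendation above amounts to a simple selection policy: prefer Redfish when the BMC supports it, fall back to IPMI otherwise. A sketch, where the capability probe is a placeholder (a real collector would attempt an HTTP GET on the Redfish service root, /redfish/v1, to detect support):

```python
# Sketch of the connector-selection policy described above. The probe is
# injected as a function so the policy stays testable without hardware.

def choose_connector(server, redfish_supported):
    """Return the connector name to use for a server."""
    if redfish_supported(server):
        return "redfish"   # more accurate, per wattmeter comparisons
    return "ipmi"          # baseline, manufacturer dependent

# Hypothetical capability table standing in for a live probe:
capabilities = {"hp-server": True, "nokia-server": False}
print(choose_connector("hp-server", capabilities.get))
print(choose_connector("nokia-server", capabilities.get))
```

Servers absent from the table also fall back to IPMI, since `dict.get` returns None for them.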

Installation

To run the server power consumption collector agent, you need to deploy a docker container locally on your infrastructure.

This container requires:

  • Connectivity on the LAN where server administration services (ILO, iDRAC, IPMI, ...) are configured, and IP access to the POD’s servers
  • Outgoing HTTP access to the Event API (internet)

Build the image by typing:

curl -s https://raw.githubusercontent.com/bherard/energyrecorder/master/docker/server-collector.dockerfile|docker build -t energyrecorder/collector -

Create local folders on your host for logs and config files:

mkdir -p /etc/energyrecorder
mkdir -p /var/log/energyrecorder

In /etc/energyrecorder create a configuration for logging in a file named collector-logging.conf:

curl -s https://raw.githubusercontent.com/bherard/energyrecorder/master/server-collector/conf/collector-logging.conf.sample > /etc/energyrecorder/collector-logging.conf

Check the configuration in this file (folders, log levels, ...). In /etc/energyrecorder, create a configuration for the collector in a file named collector-settings.yaml:

curl -s https://raw.githubusercontent.com/bherard/energyrecorder/master/server-collector/conf/collector-settings.yaml.sample > /etc/energyrecorder/collector-settings.yaml

Define the “PODS” section and their “servers” section according to the environment to monitor. Note: The “environment” key should correspond to the pod name, as defined in the “NODE_NAME” environment variable by CI when running.
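The consistency requirement above (the "environment" key matching the NODE_NAME variable set by CI) can be checked programmatically. The settings structure shown here is an assumption about collector-settings.yaml; the sample file fetched above remains the authoritative layout.

```python
# Sketch of a sanity check for collector-settings.yaml: verify that some POD
# entry's "environment" key matches the pod name CI exports in NODE_NAME.
# The PODS/servers layout below is assumed, not the official schema.

def pod_matches_ci(settings, node_name):
    """Return True if some POD's "environment" key matches the CI node name."""
    return any(pod.get("environment") == node_name
               for pod in settings.get("PODS", []))

# Hypothetical settings mirroring a PODS/servers layout:
settings = {"PODS": [{"environment": "lf-pod2",
                      "servers": [{"host": "192.0.2.10", "type": "ipmi"}]}]}
print(pod_matches_ci(settings, "lf-pod2"))
```

In practice you would load the real file with a YAML parser and compare against `os.environ.get("NODE_NAME")`.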

IMPORTANT NOTE: To apply a new configuration, you need to kill the running container and start a new one (see below).

Run

To run the container, you have to map the folders located on the host to folders in the container (config, logs):

docker run -d --name energy-collector --restart=always -v /etc/energyrecorder:/usr/local/energyrecorder/server-collector/conf -v /var/log/energyrecorder:/var/log/energyrecorder energyrecorder/collector
Energy Recording API

An event API to insert contextual information when monitoring energy (e.g. start Functest, start Tempest, destroy VM, ...). It is associated with an influxDB to store the power consumption measures. It is hosted on a shared environment with the following access points:

Component Connectivity
Energy recording API documentation http://energy.opnfv.fr/resources/doc/
influxDB (data) http://energy.opnfv.fr:8086

If needed, you can also host your own instance of the Energy recording API (in that case, the Python SDK may require a settings update). If you plan to use the default shared API, the following steps are not required.

Image creation

First, you need to build an image:

curl -s https://raw.githubusercontent.com/bherard/energyrecorder/master/docker/recording-api.dockerfile|docker build -t energyrecorder/api -
Setup

Create local folders on your host for logs and config files:

mkdir -p /etc/energyrecorder
mkdir -p /var/log/energyrecorder
mkdir -p /var/lib/influxdb

In /etc/energyrecorder create a configuration for logging in a file named webapp-logging.conf:

curl -s https://raw.githubusercontent.com/bherard/energyrecorder/master/recording-api/conf/webapp-logging.conf.sample > /etc/energyrecorder/webapp-logging.conf

Check the configuration in this file (folders, log levels, ...).

In /etc/energyrecorder create a configuration for the API in a file named webapp-settings.yaml:

curl -s https://raw.githubusercontent.com/bherard/energyrecorder/master/recording-api/conf/webapp-settings.yaml.sample > /etc/energyrecorder/webapp-settings.yaml

Normally the included configuration is ready to use, except for the influx username/password (see the run command below). Use the admin user here.

IMPORTANT NOTE: To apply a new configuration, you need to kill the running container and start a new one (see below).

Run

To run the container, you have to map the folders located on the host to folders in the container (config, logs):

docker run -d --name energyrecorder-api -p 8086:8086 -p 8888:8888  -v /etc/energyrecorder:/usr/local/energyrecorder/web.py/conf -v /var/log/energyrecorder/:/var/log/energyrecorder -v /var/lib/influxdb:/var/lib/influxdb energyrecorder/api admin-influx-user-name admin-password readonly-influx-user-name user-password

with

Parameter name            Description
admin-influx-user-name    Influx user with admin grants to create
admin-password            Influx password to set for the admin user
readonly-influx-user-name Influx user with readonly grants to create
user-password             Influx password to set for the readonly user

NOTE: The local folder /var/lib/influxdb is the location where influx data are stored. You may use any other location at your convenience; just remember to define this mapping properly when running the container.

Power consumption Python SDK

A Python SDK, minimally intrusive, based on Python decorators that trigger calls to the event API.

It is currently hosted in Functest repo but if other projects adopt it, a dedicated project could be created and/or it could be hosted in Releng.

How to use the SDK

import the energy library:

import functest.energy.energy as energy

Notify that you want power recording in your testcase:

@energy.enable_recording
def run(self):
    self.do_some_stuff1()
    self.do_some_stuff2()

If you want to register additional steps during the scenarios, you can do it in 2 different ways.

Notify step on method definition:

@energy.set_step("step1")
def do_some_stuff1(self):
...
@energy.set_step("step2")
def do_some_stuff2(self):

Notify directly from code:

@energy.enable_recording
def run(self):
  energy.set_step("step1")
  self.do_some_stuff1()
  ...
  energy.set_step("step2")
  self.do_some_stuff2()
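To make the mechanics concrete, here is an illustrative re-implementation of the decorator pattern the SDK relies on. This is NOT the Functest SDK itself: instead of POSTing notifications to the Energy recording API, it appends them to an in-memory list so the call sequence is visible.

```python
# Hypothetical stand-in for the SDK decorators; real ones send HTTP
# notifications to the Energy recording API instead of appending to a list.
import functools

NOTIFICATIONS = []

def enable_recording(func):
    """Frame a test case run with start/stop notifications."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        NOTIFICATIONS.append(("start", func.__name__))
        try:
            return func(*args, **kwargs)
        finally:
            NOTIFICATIONS.append(("stop", func.__name__))
    return wrapper

def set_step(name):
    """Record a named step around the decorated method."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            NOTIFICATIONS.append(("step", name))
            return func(*args, **kwargs)
        return wrapper
    return decorator

@set_step("step1")
def do_stuff1():
    pass

@enable_recording
def run():
    do_stuff1()

run()
print(NOTIFICATIONS)
```

The `finally` clause guarantees the stop notification is emitted even when the test case raises, which is why a decorator is a good fit for this kind of instrumentation.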
SDK Setting

Settings delivered in the project git are ready to use and assume that you will use the shared energy recording API. If you want to use another instance, you have to update the key “energy_recorder.api_url” in <FUNCTEST>/functest/ci/config_functest.yaml by setting the proper hostname/IP.

Results

Here is an example of results coming from LF POD2. This sequence represents several CI runs in a row (0 power corresponds to a hard reboot of the servers).

You may connect to http://energy.opnfv.fr:3000 for more results (ask the infra team for credentials).

Energy monitoring of LF POD2
OPNFV Test Group Information

For more information or to participate in the OPNFV test community please see the following:

wiki: https://wiki.opnfv.org/testing

mailing list: test-wg@lists.opnfv.org

IRC channel: #opnfv-testperf

weekly meeting (https://wiki.opnfv.org/display/meetings/TestPerf):
  • Usual time: Every Thursday 15:00-16:00 UTC / 7:00-8:00 PST
  • APAC time: 2nd Wednesday of the month 8:00-9:00 UTC

Reference Documentation

Project Documentation links
Bottlenecks https://wiki.opnfv.org/display/bottlenecks/Bottlenecks
CPerf https://wiki.opnfv.org/display/cperf
Dovetail https://wiki.opnfv.org/display/dovetail
Functest https://wiki.opnfv.org/display/functest/
NFVbench https://wiki.opnfv.org/display/nfvbench/
QTIP https://wiki.opnfv.org/display/qtip
StorPerf https://wiki.opnfv.org/display/storperf/Storperf
VSPERF https://wiki.opnfv.org/display/vsperf
Yardstick https://wiki.opnfv.org/display/yardstick/Yardstick

[TST1]: OPNFV web site

[TST2]: TestAPI code repository link in releng-testresults

[TST3]: TestAPI autogenerated documentation

[TST4]: Testcase catalog

[TST5]: Testing group dashboard

Testing User Guides

This page provides the links to the installation, configuration and user guides of the different test projects.

Bottlenecks

Bottlenecks User Guide
User Guide

For each testsuite, you can set up either a teststory or a testcase to run a certain test. A teststory comprises several testcases as a set in one configuration file. You can call a teststory or a testcase using the Bottlenecks user interfaces. Details are shown in the following sections.

Brief Introduction of the Test Suites in Project Releases

Brahmaputra:

  • rubbos is introduced, which is an end-to-end NFVI performance tool.
  • Virtual switch test framework (VSTF) is also introduced, which is a test framework used for vswitch performance testing.

Colorado:

  • rubbos is refactored using puppet, easing the integration with several load generators (Client) and workers (Tomcat).
  • VSTF is refactored by extracting the test cases’ configuration information.

Danube:

  • The posca testsuite is introduced to implement stress (factor), feature and tuning tests in a parametric manner.
  • Two testcases are developed and integrated into community CI pipeline.
  • Rubbos and VSTF are not supported any more.

Euphrates:

  • Introduction of a simple monitoring module, i.e., Prometheus+Collectd+Node+Grafana, to monitor the system behavior when executing stress tests.
  • Support for VNF scale-up/scale-out tests to verify the NFVI capability to adapt to resource consumption.
  • Extension of the life-cycle test to the data-plane to validate the system capability to handle concurrent network usage.
  • Testing framework is revised to support installer-agnostic testing.

These enhancements and test cases help end users gain a more comprehensive understanding of the SUT. Graphic reports of the system behavior, in addition to the test case results, are provided to indicate the confidence level of the SUT. The installer-agnostic testing framework allows end users to do stress testing adaptively over either open source or commercial deployments.

Integration Description
Release Integrated Installer Supported Testsuite
Brahmaputra Fuel Rubbos, VSTF
Colorado Compass Rubbos, VSTF
Danube Compass POSCA
Euphrates Any POSCA
Fraser Any POSCA
Gambia Any POSCA, kubestone
Test suite & Test case Description
POSCA 1 posca_factor_ping
2 posca_factor_system_bandwidth
3 posca_factor_soak_throughputs
4 posca_feature_vnf_scale_up
5 posca_feature_vnf_scale_out
6 posca_factor_storperf
7 posca_factor_multistack_storage_parallel
8 posca_factor_multistack_storage
9 posca_feature_moon_resources
10 posca_feature_moon_tenants
Kubestone 1 deployment_capacity

As for the test suites abandoned in previous Bottlenecks releases, please refer to http://docs.opnfv.org/en/stable-danube/submodules/bottlenecks/docs/testing/user/userguide/deprecated.html.

POSCA Testsuite Guide
POSCA Introduction

The POSCA (Parametric Bottlenecks Testing Catalogue) test suite classifies the bottlenecks test cases and results into 5 categories. The results are then analyzed and bottlenecks are searched for among these categories.

The POSCA testsuite aims to locate the bottlenecks in a parametric manner and to decouple the bottlenecks regarding the deployment requirements. The POSCA testsuite provides a user-friendly way to profile and understand the E2E system behavior and deployment requirements.

Goals of the POSCA testsuite:
  1. Automatically locate the bottlenecks in an iterative manner.
  2. Automatically generate the testing report for bottlenecks in different categories.
  3. Implement automated staging.
Scopes of the POSCA testsuite:
  1. Modeling, Testing and Test Result analysis.
  2. Parameters choosing and Algorithms.
Test stories of POSCA testsuite:
  1. Factor test (stress test): base test cases that feature and optimization tests depend on, or stress tests to validate the system.
  2. Feature test: test cases for features/scenarios.
  3. Optimization test: test to tune the system parameter.

The detailed workflow is illustrated below.

Preinstall Packages

[Since Euphrates release, the docker-compose package is not required.]

if [ -f /usr/local/bin/docker-compose ]; then
    rm -f /usr/local/bin/docker-compose
fi
curl -L https://github.com/docker/compose/releases/download/1.11.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
Run POSCA Locally

The test environment preparation, the installation of the testing tools, the execution of the tests and the reporting/analysis of the POSCA test suite are highly automated. A few steps are needed to run it locally.

In Euphrates, Bottlenecks has modified its framework to support installer-agnostic testing which means that test cases could be executed over different deployments.

Downloading Bottlenecks Software
mkdir /home/opnfv
cd /home/opnfv
git clone https://gerrit.opnfv.org/gerrit/bottlenecks
cd bottlenecks
Preparing Python Virtual Environment
. pre_virt_env.sh
Preparing configuration/description files

Put the OpenStack RC file (admin_rc.sh), os_cacert and pod.yaml (pod description file) in the /tmp directory. Edit admin_rc.sh and add the following line:

export OS_CACERT=/tmp/os_cacert

If you have deployed your OpenStack environment with Compass, you can use the following command to get the required files. As for the Fuel, Apex and JOID installers, we currently provide only limited support for retrieving the configuration/description files. If the following command cannot do the magic, you should put the required files in /tmp manually.

bash ./utils/env_prepare/config_prepare.sh -i <installer> [--debug]

Note that if you execute the command above, admin_rc.sh and pod.yml will be created automatically in the /tmp folder, along with the line export OS_CACERT=/tmp/os_cacert added to the admin_rc.sh file.

Executing Specified Testcase
  1. Bottlenecks provides a CLI interface to run the tests, which is one of the most convenient ways since it is close to natural language. A GUI interface with a REST API will also be provided in a later update.
bottlenecks testcase|teststory run <testname>

For the testcase command, testname should be the same as the name of the test case configuration file located in testsuites/posca/testcase_cfg. For stress tests in Danube/Euphrates, testname should be replaced by either posca_factor_ping or posca_factor_system_bandwidth. For the teststory command, a user can specify the test cases to be executed by defining them in a teststory configuration file located in testsuites/posca/testsuite_story. There is also an example there named posca_factor_test.

  2. There are also 2 other ways to run test cases and test stories.

    The first one is to use shell script.

bash run_tests.sh [-h|--help] -s <testsuite>|-c <testcase>


The second is to use the python interpreter.
REPORT=False
opts="--privileged=true -id"
docker_volume="-v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp"
docker run $opts --name bottlenecks-load-master $docker_volume opnfv/bottlenecks:latest /bin/bash
sleep 5
POSCA_SCRIPT="/home/opnfv/bottlenecks/testsuites/posca"
docker exec bottlenecks-load-master python ${POSCA_SCRIPT}/../run_posca.py testcase|teststory <testname> ${REPORT}
Showing Report

Bottlenecks uses ELK to illustrate the testing results. Assuming the IP of the SUT (System Under Test) is denoted as ipaddr, the address of Kibana is http://[ipaddr]:5601. One can visit this address to see the illustrations. The address for elasticsearch is http://[ipaddr]:9200. One can use any REST tool to visit the testing data stored in elasticsearch.
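A small helper makes the address scheme explicit; only the URL construction is shown, since the elasticsearch index layout is not documented here and the query itself is best left to a REST tool of your choice.

```python
# Sketch of the ELK endpoints described above, parameterized on the SUT IP.
# The example IP is a documentation address (RFC 5737), not a real SUT.

def kibana_url(ipaddr):
    """Kibana dashboard endpoint for a given SUT address."""
    return "http://{}:5601".format(ipaddr)

def elasticsearch_url(ipaddr):
    """Raw elasticsearch REST endpoint for a given SUT address."""
    return "http://{}:9200".format(ipaddr)

print(kibana_url("192.0.2.7"))
print(elasticsearch_url("192.0.2.7"))
```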

Cleaning Up Environment
. rm_virt_env.sh

If you want to clean up the docker containers established during the test, you can execute the additional commands below.

bash run_tests.sh --cleanup

Note that you can also add the cleanup parameter when you run a test case; the environment will then be cleaned up automatically when the test completes.

Run POSCA through Community CI

POSCA test cases are now run by the OPNFV CI. See https://build.opnfv.org for details of the build jobs. Each build job is set up to execute a single test case. The test results/logs are printed on the web page and reported automatically to the community MongoDB. There are two ways to report the results.

  1. Report testing result by shell script
bash run_tests.sh [-h|--help] -s <testsuite>|-c <testcase> --report
  2. Report testing result by python interpreter
REPORT=True
opts="--privileged=true -id"
docker_volume="-v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp"
docker run $opts --name bottlenecks-load-master $docker_volume opnfv/bottlenecks:latest /bin/bash
sleep 5
REPORT="True"
POSCA_SCRIPT="/home/opnfv/bottlenecks/testsuites/posca"
docker exec bottlenecks-load-master python ${POSCA_SCRIPT}/../run_posca.py testcase|teststory <testcase> ${REPORT}
Test Result Description
Dashboard Guide
Scope

This document provides an overview of the results of test cases developed by the OPNFV Bottlenecks Project, executed on OPNFV community labs.

The OPNFV CI (Continuous Integration) system provides automated build, deploy and testing for the software developed in OPNFV. Unless stated otherwise, the reported tests are automated via Jenkins jobs.

Test results are visible in the following dashboard:

  • Testing dashboard: uses MongoDB to store test results and Bitergia for visualization, which includes the rubbos and vstf test results.
Bottlenecks - Test Cases
POSCA Stress (Factor) Test of System bandwidth
Test Case
Bottlenecks POSCA Stress Test Traffic
test case name posca_factor_system_bandwith
description Stress test regarding baseline of the system for a single user, i.e., a VM pair while increasing the package size
configuration
config file:
/testsuite/posca/testcase_cfg/
posca_factor_system_bandwith.yaml

stack number: 1

test result PKT loss rate, latency, throughput, CPU usage
Configuration
test_config:
  tool: netperf
  protocol: tcp
  test_time: 20
  tx_pkt_sizes: 64, 256, 1024, 4096, 8192, 16384, 32768, 65536
  rx_pkt_sizes: 64, 256, 1024, 4096, 8192, 16384, 32768, 65536
  cpu_load: 0.9
  latency: 100000
runner_config:
  dashboard: "y"
  dashboard_ip:
  stack_create: yardstick
  yardstick_test_ip:
  yardstick_test_dir: "samples"
  yardstick_testcase: "netperf_bottlenecks"
POSCA Stress (Factor) Test of Performance Life-Cycle
Test Case
Bottlenecks POSCA Stress Test Ping
test case name posca_posca_ping
description Stress test regarding life-cycle while using ping to validate the VM pairs constructions
configuration
config file:
/testsuite/posca/testcase_cfg/posca_posca_ping.yaml

stack number: 5, 10, 20, 50 ...

test result PKT loss rate, success rate, test time, latency
Configuration
load_manager:
  scenarios:
    tool: ping
    test_times: 100
    package_size:
    num_stack: 5, 5
    package_loss: 0

  contexts:
    stack_create: yardstick
    flavor:
    yardstick_test_ip:
    yardstick_test_dir: "samples"
    yardstick_testcase: "ping_bottlenecks"

dashboard:
  dashboard: "y"
  dashboard_ip:
POSCA Stress Test of Storage Usage
Test Case
Bottlenecks POSCA Stress Test Storage
test case name posca_factor_storperf
description Stress test regarding storage using Storperf
configuration
config file:
/testsuite/posca/testcase_cfg/posca_posca_storperf.yaml
test result Read / Write IOPS, Throughput, latency
Configuration
load_manager:
  scenarios:
    tool: storperf
POSCA Stress (Factor) Test of Multistack Storage
Test Case
Bottlenecks POSCA Stress Test MultiStack Storage
test case name posca_factor_multistack_storage
description Stress test regarding multistack storage using yardstick as a runner
configuration
config file:
/testsuite/posca/testcase_cfg/posca_factor_multistack_storage.yaml

stack number: 5, 10, 20, 50 ...

test result Read / Write IOPS, Throughput, latency
Configuration
load_manager:
  scenarios:
    tool: fio
    test_times: 10
    rw: write, read, rw, rr, randomrw
    bs: 4k
    size: 50g
    rwmixwrite: 50
    num_stack: 1, 3
    volume_num: 1
    numjobs: 1
    direct: 1

  contexts:
    stack_create: yardstick
    flavor:
    yardstick_test_ip:
    yardstick_test_dir: "samples"
    yardstick_testcase: "storage_bottlenecks"

dashboard:
  dashboard: "y"
  dashboard_ip:
POSCA Stress (Factor) Test of Multistack Storage
Test Case
Bottlenecks POSCA Stress Test Storage (Multistack with Yardstick)
test case name posca_factor_multistack_storage_parallel
description Stress test regarding storage while using yardstick for multistack as a runner
configuration
config file:
/testsuite/posca/testcase_cfg/posca_factor_multistack_storage_parallel.yaml
test result Read / Write IOPS, Throughput, latency
Configuration
load_manager:
  scenarios:
    tool: fio
    test_times: 10
    rw: write, read, rw, rr, randomrw
    bs: 4k
    size: 50g
    rwmixwrite: 50
    num_stack: 1, 3
    volume_num: 1
    numjobs: 1
    direct: 1

  contexts:
    stack_create: yardstick
    flavor:
    yardstick_test_ip:
    yardstick_test_dir: "samples"
    yardstick_testcase: "storage_bottlenecks"

dashboard:
  dashboard: "y"
  dashboard_ip:
POSCA Factor Test of Soak Throughputs
Test Case
Bottlenecks POSCA Soak Test Throughputs
test case name posca_factor_soak_throughputs
description Long duration stability tests of data-plane traffic
configuration
config file:
/testsuite/posca/testcase_cfg/...
posca_factor_soak_throughputs.yaml
test result THROUGHPUT,THROUGHPUT_UNITS,MEAN_LATENCY,LOCAL_CPU_UTIL, REMOTE_CPU_UTIL,LOCAL_BYTES_SENT,REMOTE_BYTES_RECVD
Configuration
load_manager:
  scenarios:
    tool: netperf
    test_duration_hours: 1
    vim_pair_ttl: 300
    vim_pair_lazy_cre_delay: 2
    package_size:
    threshhold:
        package_loss: 0%
        latency: 300

  runners:
    stack_create: yardstick
    flavor:
    yardstick_test_dir: "samples"
    yardstick_testcase: "netperf_soak"
POSCA feature Test of Moon Security for resources per tenant
Test Case
Bottlenecks POSCA Soak Test Throughputs
test case name posca_feature_moon_resources
description Moon authentication capability test for maximum number of authentication operations per tenant
configuration
config file:
/testsuite/posca/testcase_cfg/...
posca_feature_moon_resources.yaml
test result number of tenants, max number of users
Configuration
load_manager:
  scenarios:
    tool: https request
    # info that the cpus and memes have the same number of data.
    pdp_name: pdp
    policy_name: "MLS Policy example"
    model_name: MLS
    tenants: 1,5,10,20
    subject_number: 10
    object_number: 10
    timeout: 0.2

  runners:
    stack_create: yardstick
    Debug: False
    yardstick_test_dir: "samples"
    yardstick_testcase: "moon_resource"
POSCA feature Test of Moon Security for Tenants
Test Case
Bottlenecks POSCA Soak Test Throughputs
test case name posca_feature_moon_tenants
description Moon authentication capability test for maximum tenants
configuration
config file:
/testsuite/posca/testcase_cfg/...
posca_feature_moon_tenants.yaml
test result Max number of tenants
Configuration
load_manager:
  scenarios:
    tool: https request
    # info that the cpus and memes have the same number of data.
    pdp_name: pdp
    policy_name: "MLS Policy example"
    model_name: MLS
    subject_number: 20
    object_number: 20
    timeout: 0.003
    initial_tenants: 0
    steps_tenants: 1
    tolerate_time: 20
    SLA: 5

  runners:
    stack_create: yardstick
    Debug: False
    yardstick_test_dir: "samples"
    yardstick_testcase: "moon_tenant"
POSCA feature Test of VNF Scale Out
Test Case
Bottlenecks POSCA Soak Test Throughputs
test case name posca_feature_nfv_scale_out
description SampleVNF Scale Out Test
configuration
config file:
/testsuite/posca/testcase_cfg/...
posca_feature_nfv_scale_out.yaml
test result throughputs, latency, loss rate
Configuration
load_manager:
  scenarios:
    number_vnfs: 1, 2, 4
    iterations: 10
    interval: 35

  runners:
    stack_create: yardstick
    flavor:
    yardstick_test_dir: "samples/vnf_samples/nsut/acl"
    yardstick_testcase: "tc_heat_rfc2544_ipv4_1rule_1flow_64B_trex_correlated_traffic_scale_out"
Kubernetes Stress Test of Deployment Capacity
Test Case
Bottlenecks Kubestone Deployment Capacity Test
test case name kubestone_deployment_capacity
description Stress test regarding capacity of deployment
configuration
config file:
testsuite/kubestone/testcases/deployment.yaml
test result Capacity, Life-Cycle Duration, Available Deployments
Configuration
apiVersion: apps/v1
kind: Deployment
namespace: bottlenecks-kubestone
test_type: Horizontal-Scaling
scaling_steps: 10, 50, 100, 200
template: None
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

Dovetail / OPNFV Verified Program

1. OVP Workflow
1.1. Introduction

This document provides guidance for prospective participants on how to obtain ‘OPNFV Verified’ status. The OPNFV Verified Program (OVP) is administered by the OPNFV Compliance and Certification (C&C) committee.

For further information about the workflow and general inquiries about the program, please check out the OVP web portal, or contact the C&C committee by email address verified@opnfv.org. This email address should be used for all communication with the OVP.

1.2. Step 1: Participation Form Submission

A participant should start the process by submitting an online participation form. The participation form can be found on the OVP web portal or directly at OVP participation form, and the following information must be provided:

  • Organization name
  • Organization website (if public)
  • Product name and/or identifier
  • Product specifications
  • Product public documentation
  • Product categories, choose one: (i) software and hardware (ii) software and third party hardware (please specify)
  • Primary contact name, business email, postal address and phone number. Only the primary contact email address should be used for official communication with OPNFV OVP.
  • User ID for OVP web portal. The OVP web portal supports the Linux Foundation user ID in the current release. If a new user ID is needed, visit https://identity.linuxfoundation.org.
  • Location where the verification testing is to be conducted. Choose one: (internal vendor lab, third-party lab)
  • If the test is to be conducted by a third-party lab, please specify name and contact information of the third-party lab, including email, address and phone number.
  • OVP software version for compliance verification
  • Testing date

Once the participation form information is received and in order, an email response will be sent to the primary contact with confirmation and information to proceed. The primary contact specified in the participation form will be entered into OVP web portal back-end by the program administrator and will be permitted to submit results for review on behalf of their organization.

There is no fee at this time for participation in the OVP.

1.3. Step 2: Testing

The following documents guide testers to prepare the test environment and run tests:

A unique Test ID is generated by the Dovetail tool for each test run and can only be submitted to the OVP web portal once.

1.4. Step 3: Submitting Test Results

Users/testers other than the primary contact may use the OVP web portal as a resource to upload, evaluate and share results in a private manner. Testers can upload the test results to the OVP web portal. By default, the results are visible only to the tester who uploaded the data.

Testers can self-review the test results through the portal until they are ready to ask for OVP review. They may also add new test results as needed.

Once the tester is satisfied with the test result, the primary contact grants access to the test result for OVP review using a ‘submit for review’ operation via the portal. The test result is identified by the unique Test ID and becomes visible to a review group comprised of OPNFV community members.

When a test result is made visible to the reviewers, the program administrator will ask for volunteers from the review group using the verified@opnfv.org email list, CCing the primary contact email to indicate that a review request has been made. The program administrator will supply the Test ID and owner field (primary contact user ID) to the reviewers to identify the results.

1.5. Step 4: OVP Review

Upon receiving the email request from the program administrator, the review group conducts a peer-based review of the test result using reviewer guidelines published per OVP release. Persons employed by the organization that submitted the test results, or by affiliated organizations, will not be among the reviewers.

The primary contact may be asked via email for any missing information or clarification of the test results. The reviewers will make a determination and recommend compliance or non-compliance to the C&C Committee. A positive review requires a minimum of two approvals from two distinct organizations without any negative reviews. The program administrator sends an email to OVP/C&C emails announcing a positive review. A one week limit is given for issues to be raised. If no issue is raised, the C&C Committee approves the result and the program administrator sends an email to OVP/C&C emails stating the result is approved.
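The approval rule above can be stated compactly: a result passes review with at least two approvals from distinct organizations and no negative review. A sketch with hypothetical data shapes, not any actual OVP tooling:

```python
# Illustrative model of the OVP review decision rule described above.

def review_outcome(reviews):
    """reviews: list of (organization, verdict) with verdict in
    {"approve", "reject"}. Returns the recommendation state."""
    if any(verdict == "reject" for _, verdict in reviews):
        return "non-compliant"          # any negative review blocks approval
    approving_orgs = {org for org, verdict in reviews if verdict == "approve"}
    if len(approving_orgs) >= 2:        # two distinct organizations required
        return "compliant"
    return "pending"

print(review_outcome([("orgA", "approve"), ("orgB", "approve")]))
```

Note that two approvals from the same organization leave the result pending, which matches the "two distinct organizations" wording.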

Normally, the outcome of the review should be communicated to the primary contact within 10 business days after all required information is in order.

If a test result is denied, an appeal can be made to the C&C Committee for arbitration.

1.6. Step 5: Grant of Use of Program Marks

If an application is approved, further information will be communicated to the primary contact on the guidelines of using OVP Program Marks (including OVP logo) and the status of compliance for promotional purposes.

2. Guidelines Addendum for 2018.09 release
2.1. Introduction

This addendum provides a high-level description of the testing scope and pass/fail criteria used in the OPNFV Verified Program (OVP) for the 2018.09 release. This information is intended as an overview for OVP testers and for the Dovetail Project to help guide test-tool and test-case development for the OVP 2018.09 release. The Dovetail project is responsible for documenting test-case specifications as well as implementing the OVP tool-chain through collaboration with the OPNFV testing community. OVP testing focuses on establishing the ability of the System Under Test (SUT) to perform NFVI and VIM operations and support Service Provider oriented features that ensure manageable, resilient and secure networks.

2.2. Meaning of Compliance

OPNFV Compliance indicates adherence of an NFV platform to behaviors defined through specific platform capabilities, allowing users to prepare, instantiate, operate and remove VNFs running on the NFVI. OVP 2018.09 compliance evaluates the ability of a platform to support Service Provider network capabilities and workloads that are supported in the OPNFV platform as of this release. Compliance test cases are designated as compulsory or optional based on the maturity of OPNFV capabilities as well as industry expectations. Compulsory test cases may for example include NFVI management capabilities, whereas tests for certain high-availability features may be deemed optional.

Test coverage and pass/fail criteria are designed to ensure an acceptable level of compliance but not be so restrictive as to disqualify variations in platform implementations, capabilities and features.

2.3. SUT Assumptions

Assumptions about the System Under Test (SUT) include ...

  • The minimal specification of physical infrastructure, including controller nodes, compute nodes and networks, is defined by the Pharos specification.
  • The SUT is fully deployed and operational, i.e. SUT deployment tools are out of scope of testing.
2.4. Scope of Testing

The OVP Governance Guidelines, as approved by the Board of Directors, outlines the key objectives of the OVP as follows:

  • Help build the market for
    • OPNFV based infrastructure
    • applications designed to run on that infrastructure
  • Reduce adoption risks for end-users
  • Decrease testing costs by verifying hardware and software platform interfaces and components
  • Enhance interoperability

The guidelines further direct the scope to be constrained to “features, capabilities, components, and interfaces included in an OPNFV release that are generally available in the industry (e.g., through adoption by an upstream community)”, and that compliance verification is evaluated using “functional tests that focus on defined interfaces and/or behaviors without regard to the implementation of the underlying system under test”.

OPNFV provides a broad range of capabilities, including the reference platform itself as well as tool-chains and methodologies for building infrastructures, and for deploying and testing the platform. Not all of these aspects are in scope for OVP, and not all functions and components are tested in the initial versions of OVP. For example, the deployment tools for the SUT and the CI/CD toolchain are currently out of scope. Similarly, performance benchmarking related testing is also out of scope or left for further study. Newer functional areas such as MANO (outside of APIs in the NFVI and VIM) are still developing and are left for future consideration.

2.4.1. General Approach

In order to meet the above objectives for OVP, we follow a general approach: first identify the overall requirements for all stakeholders, then analyze what OPNFV and the upstream communities can effectively test and verify at present to derive an initial working scope for OVP, and finally recommend what the community should strive to achieve in future releases.

The overall requirements for OVP can be categorized into the basic cloud capabilities representing common operations needed by basic VNFs, and additional requirements for VNFs that go beyond the common cloud capabilities, including functional extensions, operational capabilities and additional carrier grade requirements.

For the basic NFV requirements, we will analyze the required test cases, leverage or improve upon existing test cases in OPNFV projects and upstream projects whenever we can, and bridge the gaps when we must, to meet these basic requirements.

We are not yet ready to include compliance requirements for capabilities such as hardware portability, carrier grade performance, fault management and other operational features, security, MANO and VNF verification. These areas are being studied for consideration in future OVP releases.

In some areas, we will start with a limited level of verification initially, constrained by what community resources are able to support at this time, but still serve a basic need that is not being fulfilled elsewhere. In these areas, we bring significant value to the community we serve by starting a new area of verification, breaking new ground and expanding it in the future.

In other areas, the functions being verified have yet to reach wide adoption but are seen as important requirements in NFV, or features are only needed for specific NFV use cases but an industry consensus about the APIs and behaviors is still deemed beneficial. In such cases, we plan to incorporate the test areas as optional. An optional test area will not have to be run or passed in order to achieve compliance. Optional tests provide an opportunity for vendors to demonstrate compliance with specific OPNFV features beyond the mandatory test scope.

2.4.2. Analysis of Scope

In order to define the scope of the 2018.09 release of the compliance and verification program, this section analyzes NFV-focused platform capabilities with respect to the high-level objectives and the general approach outlined in the previous section. The analysis determines which capabilities are suitable for inclusion in this release of the OVP and which capabilities are to be addressed in future releases.

  1. Basic Cloud Capabilities

The intent of these tests is to verify that the SUT has the required capabilities that a basic VNF needs, and these capabilities are implemented in a way that enables this basic VNF to run on any OPNFV compliant deployment.

A basic VNF can be thought of as a single virtual machine that is networked and can perform the simplest network functions, for example, a simple forwarding gateway, or a set of such virtual machines connected only by simple virtual network services. Running such a basic VNF leads to a set of common requirements, including:

  • image management (testing Glance API)
  • identity management (testing Keystone Identity API)
  • virtual compute (testing Nova Compute API)
  • virtual storage (testing Cinder API)
  • virtual networks (testing Neutron Network API)
  • forwarding packets through virtual networks in data path
  • filtering packets based on security rules and port security in data path
  • dynamic network runtime operations through the life of a VNF (e.g. attach/detach, enable/disable, read stats)
  • correct behavior after common virtual machine life cycles events (e.g. suspend/resume, reboot, migrate)
  • simple virtual machine resource scheduling on multiple nodes

OPNFV mainly supports OpenStack as the VIM up to the 2018.09 release. The VNFs used in the OVP program, and the features in scope for the program which are considered basic to all VNFs, require commercial OpenStack distributions to support a common basic level of cloud capabilities, and to be compliant with a common specification for these capabilities. This requirement significantly overlaps with the goals of the OpenStack community’s Interop working group, but the two are not identical. The OVP runs the OpenStack Refstack-Compute test cases to verify compliance with the basic common API requirements of cloud management functions and VNF (as a VM) management for OPNFV. Additional NFV-specific requirements are added for network data path validation, packet filtering by security group rules and port security, life cycle runtime events of virtual networks, multiple networks in a topology, and validation of a VNF’s functional state after common life-cycle events including reboot, pause, suspend, stop/start and cold migration. In addition, the basic requirements also verify that the SUT can allocate VNF resources based on simple anti-affinity rules.

The combined test cases help to ensure that these basic operations are always supported by a compliant platform and they adhere to a common standard to enable portability across OPNFV compliant platforms.

  1. NFV specific functional requirements

NFV has functional requirements beyond the basic common cloud capabilities, especially in the networking area. Features such as BGPVPN, IPv6 and SFC may be considered additional NFV requirements beyond general purpose cloud computing. These feature requirements expand beyond common OpenStack (or other VIM) requirements. The OVP will incorporate test cases to verify compliance in these areas as they mature. Because these extensions may impose new API demands, maturity and industry adoption are prerequisites for making them mandatory requirements for OPNFV compliance. For the 2018.09 release, the tests of the OpenStack IPv6 API have been promoted from optional to mandatory, while BGPVPN remains an optional test area. Passing optional tests is not required to pass OPNFV compliance verification.

BGPVPNs are relevant due to the wide adoption of MPLS/BGP based VPNs in wide area networks, which makes it necessary for data centers hosting VNFs to be able to seamlessly interconnect with such networks. SFC is also an important NFV requirement; however, its implementation had not yet been accepted or adopted upstream at the time of the 2018.09 release.

  1. High availability

High availability is a common carrier grade requirement. Availability of a platform involves many aspects of the SUT, for example hardware or lower layer system failures or system overloads, and is also highly dependent on configurations. The current OPNFV high availability verification focuses on OpenStack control service failures and resource overloads. It verifies service continuity when the system encounters such failures or resource overloads, and also verifies that the system heals after a failure episode within a reasonable time window. These service HA capabilities are commonly adopted in the industry and should be a mandatory requirement.

The current test cases in HA cover the basic area of failure and resource overload conditions for a cloud platform’s service availability, including all of the basic cloud capability services and basic compute and storage loads, so it is a meaningful first step for OVP. We expect additional high availability scenarios to be added in future releases.

  1. Stress Testing

Resiliency testing involves stressing the SUT and verifying its ability to absorb stress conditions and still provide an acceptable level of service. Resiliency is an important requirement for end-users.

The 2018.09 release of OVP includes a load test which spins up a number of VM pairs in parallel to assert that the system under test can process the workload spike in a stable and deterministic fashion.

  1. Security

Security is among the top priorities as a carrier grade requirement for end-users. Some of the basic common functions, including virtual network isolation, security groups, port security and role based access control, are already covered as part of the basic cloud capabilities that are verified in OVP. These test cases however do not yet cover the basic required security capabilities expected of an end-user deployment. It is an area that we should address in the near future, to define a common set of requirements and develop test cases for verifying those requirements.

The 2018.09 release includes new test cases which verify that the role-based access control (RBAC) functionality of the VIM is behaving as expected.

Another common requirement is security vulnerability scanning. While the OPNFV security project integrated tools for security vulnerability scanning, this has not been fully analyzed or exercised in the 2018.09 release. This area needs further work to identify the required level of security for the purpose of OPNFV in order to be integrated into the OVP. End-user input on specific security requirements is needed.

  1. Service assurance

Service assurance (SA) is a broad area of concern for reliability of the NFVI/VIM and VNFs, and depends upon multiple subsystems of an NFV platform for essential information and control mechanisms. These subsystems include telemetry, fault management (e.g. alarms), performance management, audits, and control mechanisms such as security and configuration policies.

The current 2018.09 release implements some enabling capabilities in NFVI/VIM such as telemetry, policy, and fault management. However, the specification of expected system components, behavior and the test cases to verify them have not yet been adequately developed. We will therefore not be testing this area at this time but defer to future study.

  1. Use case testing

Use-case test cases exercise multiple functional capabilities of a platform in order to realize a larger end-to-end scenario. Such end-to-end use cases do not necessarily add new API requirements to the SUT per se, but exercise aspects of the SUT’s functional capabilities in more complex ways. For instance, they allow for verifying the complex interactions among multiple VNFs and between VNFs and the cloud platform in a more realistic fashion. End-users consider use-case-level testing as a significant tool in verifying OPNFV compliance because it validates design patterns and support for the types of NFVI features that users care about.

Several OPNFV projects develop use cases and sample VNFs. The 2018.09 release of OVP features two such use-case tests, which spawn and verify a vIMS and a vEPC, respectively.

  1. Additional capabilities

In addition to the capabilities analyzed above, there are further system aspects which are of importance for the OVP. These comprise operational and management aspects such as platform in-place upgrades and platform operational insights such as telemetry and logging. Further aspects include API backward compatibility / micro-versioning, workload migration, multi-site federation and interoperability with workload automation platforms, e.g. ONAP. Finally, efficiency aspects such as the hardware and energy footprint of the platform are worth considering in the OVP.

OPNFV is addressing these items on different levels of details in different projects. However, the contributions developed in these projects are not yet considered widely available in commercial systems in order to include them in the OVP. Hence, these aspects are left for inclusion in future releases of the OVP.

2.4.3. Scope of the 2018.09 release of the OVP

Summarizing the results of the analysis above, the scope of the 2018.09 release of OVP is as follows:

  • Mandatory test scope:
    • functest.vping.userdata
    • functest.vping.ssh
    • functest.tempest.osinterop*
    • functest.tempest.compute
    • functest.tempest.identity_v3
    • functest.tempest.image
    • functest.tempest.network_api
    • functest.tempest.volume
    • functest.tempest.neutron_trunk_ports
    • functest.tempest.ipv6_api
    • functest.security.patrole
    • yardstick.ha.nova_api
    • yardstick.ha.neutron_server
    • yardstick.ha.keystone
    • yardstick.ha.glance_api
    • yardstick.ha.cinder_api
    • yardstick.ha.cpu_load
    • yardstick.ha.disk_load
    • yardstick.ha.haproxy
    • yardstick.ha.rabbitmq
    • yardstick.ha.database
    • bottlenecks.stress.ping
  • Optional test scope:
    • functest.tempest.ipv6_scenario
    • functest.tempest.multi_node_scheduling
    • functest.tempest.network_security
    • functest.tempest.vm_lifecycle
    • functest.tempest.network_scenario
    • functest.tempest.bgpvpn
    • functest.bgpvpn.subnet_connectivity
    • functest.bgpvpn.tenant_separation
    • functest.bgpvpn.router_association
    • functest.bgpvpn.router_association_floating_ip
    • yardstick.ha.neutron_l3_agent
    • yardstick.ha.controller_restart
    • functest.vnf.vims
    • functest.vnf.vepc
    • functest.snaps.smoke

* The OPNFV OVP utilizes the same set of test cases as the OpenStack interoperability program OpenStack Powered Compute. Passing the OPNFV OVP does not imply that the SUT is certified according to the OpenStack Powered Compute program. OpenStack Powered Compute is a trademark of the OpenStack Foundation and the corresponding certification label can only be awarded by the OpenStack Foundation.

Note: The SUT is limited to NFVI and VIM functions. While testing MANO component capabilities is out of scope, certain APIs exposed towards MANO are used by the current OPNFV compliance testing suite. MANO and other operational elements may be part of the test infrastructure; for example used for workload deployment and provisioning.

2.4.4. Scope considerations for future OVP releases

Based on the previous analysis, the following items are outside the scope of the 2018.09 release of OVP but are being considered for inclusion in future releases:

  • service assurance
  • use case testing
  • platform in-place upgrade
  • API backward compatibility / micro-versioning
  • workload migration
  • multi-site federation
  • service function chaining
  • platform operational insights, e.g. telemetry, logging
  • efficiency, e.g. hardware and energy footprint of the platform
  • interoperability with workload automation platforms e.g. ONAP
  • resilience
  • security and vulnerability scanning
  • performance measurements
2.5. Criteria for Awarding Compliance

This section provides guidance on compliance criteria for each test area. The criteria described here are high-level; detailed pass/fail metrics are documented in the Dovetail test specifications.

  1. All mandatory test cases must pass.

Exceptions to this rule may be legitimate, e.g. due to imperfect test tools or reasonable circumstances that we cannot foresee. These exceptions must be documented and accepted by the reviewers.

  1. Optional test cases are optional to run. Their test results, pass or fail, do not impact compliance.

Applicants who choose to run the optional test cases can include the results of the optional test cases to highlight the additional compliance.

2.6. Exemption from strict API response validation

Vendors of commercial NFVI products may have extended the Nova API to support proprietary add-on features. These additions can cause Nova Tempest API tests to fail due to unexpected data in API responses. In order to resolve this transparently in the context of OVP, a temporary exemption process has been created. More information on the exemption can be found in section Disabling Strict API Validation in Tempest.

3. OVP Reviewer Guide
3.1. Introduction

This document provides detailed guidance for reviewers on how to handle the result review process.

The OPNFV Verified Program (OVP) allows users to upload test results to the OVP portal and request a review of those results from the OVP community. After the user submits the test results for review, their status changes from ‘private’ to ‘review’ (as shown in Figure 2).

The OVP administrator asks for review volunteers using the verified@opnfv.org email alias. The administrator identifies the incoming results for review by their particular Test ID and Owner values.

Volunteers who accept the review request can access the test results by logging in to the OVP portal and clicking on the My Results tab in the top-level navigation bar.

_images/ovp_top_nav.png

Figure 1

The corresponding OVP portal result will have a status of ‘review’.

_images/ovp_result_review.png

Figure 2

Reviewers must follow the checklist below, at a minimum, to ensure review consistency for the OPNFV Verified Program (OVP) 2018.09 (Fraser) release.

  1. Mandatory Test Area Results - Validate that results for all mandatory test areas are present.
  2. Test-Case Pass Percentage - Ensure all tests have passed (100% pass rate).
  3. Log File Verification - Inspect the log file for each test area.
  4. SUT Info Verification - Validate the system under test (SUT) hardware and software endpoint info is present.
3.2. 1. Mandatory Test Area Results

Test results can be displayed by clicking on the hyperlink under the ‘Test ID’ column. The reviewer should validate that results for all mandatory test areas are included in the overall test suite. The required mandatory test cases are:

  • functest.vping.userdata
  • functest.vping.ssh
  • bottlenecks.stress.ping
  • functest.tempest.osinterop
  • functest.tempest.compute
  • functest.tempest.identity_v3
  • functest.tempest.image
  • functest.tempest.network_api
  • functest.tempest.volume
  • functest.tempest.neutron_trunk_ports
  • functest.tempest.ipv6_api
  • functest.security.patrole
  • yardstick.ha.nova_api
  • yardstick.ha.neutron_server
  • yardstick.ha.keystone
  • yardstick.ha.glance_api
  • yardstick.ha.cinder_api
  • yardstick.ha.cpu_load
  • yardstick.ha.disk_load
  • yardstick.ha.haproxy
  • yardstick.ha.rabbitmq
  • yardstick.ha.database

Note that the ‘Test ID’ column in this view condenses the UUID used for ‘Test ID’ to eight characters, even though the ‘Test ID’ is a longer UUID in the back-end.

_images/ovp_result_overview.png

Figure 3
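The completeness check in this step amounts to a set comparison between the mandatory list above and the test cases present in the submission. A minimal sketch, using the mandatory test-case names from this guide and hypothetical submission data:

```python
# Sketch: verify that a submitted result set covers all mandatory test cases.
# The mandatory list is taken from the OVP 2018.09 scope above; the submitted
# results below are hypothetical example data.
MANDATORY = {
    "functest.vping.userdata", "functest.vping.ssh",
    "bottlenecks.stress.ping", "functest.tempest.osinterop",
    "functest.tempest.compute", "functest.tempest.identity_v3",
    "functest.tempest.image", "functest.tempest.network_api",
    "functest.tempest.volume", "functest.tempest.neutron_trunk_ports",
    "functest.tempest.ipv6_api", "functest.security.patrole",
    "yardstick.ha.nova_api", "yardstick.ha.neutron_server",
    "yardstick.ha.keystone", "yardstick.ha.glance_api",
    "yardstick.ha.cinder_api", "yardstick.ha.cpu_load",
    "yardstick.ha.disk_load", "yardstick.ha.haproxy",
    "yardstick.ha.rabbitmq", "yardstick.ha.database",
}

def missing_mandatory(submitted):
    """Return the mandatory test cases absent from the submitted results."""
    return sorted(MANDATORY - set(submitted))

# Example: a submission that omits one mandatory HA test case.
submitted = MANDATORY - {"yardstick.ha.database"}
print(missing_mandatory(submitted))  # ['yardstick.ha.database']
```

A review should only proceed to the next step when this check returns an empty list.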

3.3. 2. Test-Case Pass Percentage

All mandatory test-cases have to run successfully. The ‘Test Run Results’ diagram below is one way to check this; in the example it shows that only 98.15% of the mandatory test-cases have passed, which is not sufficient. This value must be 100% for a compliant result.

_images/ovp_pass_percentage.png

Figure 4

Failed test cases can also be easily identified by the color of the pass/total number:

  • Green when all test-cases pass
  • Orange when at least one fails
  • Red when all test-cases fail
_images/ovp_pass_fraction.png

Figure 5
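The color coding above can be sketched as a small classification function; the counts used in the example call are hypothetical:

```python
def pass_color(passed, total):
    """Classify a pass/total count the way the portal colors it:
    green when all test-cases pass, red when all fail, orange otherwise."""
    if total <= 0:
        raise ValueError("total must be positive")
    if passed == total:
        return "green"
    if passed == 0:
        return "red"
    return "orange"

print(pass_color(108, 108))  # green
print(pass_color(106, 108))  # orange (e.g. ~98.15% pass, not compliant)
print(pass_color(0, 108))    # red
```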

3.4. 3. Log File Verification

Each log file of the mandatory test cases has to be verified for content.

Log files can be displayed by clicking on the setup icon to the right of the results, as shown in figure below.

_images/ovp_log_setup.png

Figure 6

Note that all log files can be found in the results/ directory, as shown in the following table.

Mandatory Test Case    Location
bottlenecks            results/stress_logs/
functest.vping         results/vping_logs/
functest.tempest       results/tempest_logs/
functest.security      results/security_logs/
yardstick              results/ha_logs/

The bottlenecks log must contain the ‘SUCCESS’ result, as shown in the following example:

2018-08-22 14:11:21,815 [INFO] yardstick.benchmark.core.task task.py:127 Testcase: “ping_bottlenecks” SUCCESS!!!

The Functest log opens an HTML page that lists all test cases, as shown in Figure 7. All test cases must have run successfully.

_images/ovp_log_files_functest_image.png

Figure 7

For the vping test area, check the log file (functest.log). The two entries displayed in the figures below must be present in this log file.

functest.vping_userdata

_images/ovp_vping_ssh.png

Figure 8

functest.vping_ssh

_images/ovp_vping_user.png

Figure 9

The yardstick log must contain the ‘SUCCESS’ result for each of the test-cases within this test area. This can be verified by searching the log for the keyword ‘SUCCESS’.

Examples of a FAILED and a SUCCESS test case are listed below:

2018-08-28 10:25:09,946 [ERROR] yardstick.benchmark.scenarios.availability.monitor.monitor_multi monitor_multi.py:78 SLA failure: 14.015082 > 5.000000

2018-08-28 10:23:41,907 [INFO] yardstick.benchmark.core.task task.py:127 Testcase: “opnfv_yardstick_tc052” SUCCESS!!!
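The keyword searches described above for the yardstick and bottlenecks logs can be sketched as a small parser. The line format is assumed to follow the SUCCESS/SLA-failure examples shown; the sample log text below is synthetic:

```python
import re

# Sketch: scan a yardstick/bottlenecks log for per-test-case verdicts.
# Assumes the 'Testcase: "<name>" SUCCESS' line format from the examples above.
SUCCESS_RE = re.compile(r'Testcase: "([^"]+)" SUCCESS')

def successful_testcases(log_text):
    """Return the names of test cases that logged a SUCCESS result."""
    return SUCCESS_RE.findall(log_text)

log = (
    '2018-08-28 10:25:09,946 [ERROR] yardstick.benchmark.scenarios.'
    'availability.monitor.monitor_multi monitor_multi.py:78 '
    'SLA failure: 14.015082 > 5.000000\n'
    '2018-08-28 10:23:41,907 [INFO] yardstick.benchmark.core.task task.py:127 '
    'Testcase: "opnfv_yardstick_tc052" SUCCESS!!!\n'
)
print(successful_testcases(log))  # ['opnfv_yardstick_tc052']
```

A reviewer would flag any mandatory test case in the area that does not appear in the returned list.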

3.5. 4. SUT Info Verification

SUT information must be present in the results to validate that all required endpoint services and at least two controllers were present during test execution. For the results shown below, click the ‘info‘ hyperlink in the SUT column to navigate to the SUT information page.

_images/sut_info.png

Figure 10

In the ‘Endpoints‘ listing shown below for the SUT VIM component, ensure that services are present for identity, compute, image, volume and network at a minimum by inspecting the ‘Service Type‘ column.

_images/sut_endpoints.png

Figure 11

Inspect the ‘Hosts‘ listing found below the Endpoints section of the SUT info page and ensure at least two hosts are present, as two controllers are required for the mandatory HA test-cases.
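The two SUT info checks in this step can be sketched as a single validation function. The data structures are a hypothetical representation of the portal's Endpoints and Hosts listings; the required service-type names come from the text above:

```python
REQUIRED_SERVICE_TYPES = {"identity", "compute", "image", "volume", "network"}

def check_sut_info(endpoints, hosts):
    """Validate the SUT info page data: the required endpoint service types
    are present and at least two hosts exist (two controllers are needed for
    the mandatory HA test-cases). 'endpoints' is a list of dicts with a
    'service_type' key; 'hosts' is a list of host names. Both structures are
    hypothetical stand-ins for the portal listings."""
    present = {ep["service_type"] for ep in endpoints}
    missing = REQUIRED_SERVICE_TYPES - present
    problems = []
    if missing:
        problems.append("missing service types: %s" % sorted(missing))
    if len(hosts) < 2:
        problems.append("fewer than two hosts: %d" % len(hosts))
    return problems

endpoints = [{"service_type": t} for t in
             ("identity", "compute", "image", "volume", "network")]
print(check_sut_info(endpoints, ["controller-0", "controller-1"]))  # []
```

An empty return value means both SUT info checks pass.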

4. OVP System Preparation Guide

This document provides a general guide to hardware system prerequisites and expectations for running OPNFV OVP testing. For a detailed guide on preparing software tools and configurations and conducting the tests, please refer to the User Guide :ref:dovetail-testing_user_guide.

The OVP test tools expect the hardware of the System Under Test (SUT) to be compliant with the Pharos specification.

The Pharos specification itself is a general guideline, rather than a set of specific hard requirements at this time, developed by the OPNFV community. For the purpose of helping OVP testers, we summarize the main aspects of hardware to consider in preparation for OVP testing.

As described in the OVP Testing User Guide, the hardware systems involved in OVP testing include a Test Node, a System Under Test (SUT), and network connectivity between them.

The Test Node can be a bare metal machine or a virtual machine that can support a Docker container environment. If it is a bare metal machine, it needs to be x86-based at this time. Detailed information on how to configure and prepare the Test Node can be found in the User Guide.

The System Under Test (SUT) is expected to consist of a set of general purpose servers, storage devices or systems, and networking infrastructure connecting them together. The servers are expected to be of the same architecture, either x86-64 or ARM-64. Mixing different architectures in the same SUT is not supported.

A minimum of 5 servers, 3 configured as controllers and 2 or more configured as compute resources, is expected. However, this is not a hard requirement at this phase. The OVP 1.0 mandatory test cases only require one compute server. At least two compute servers are required to pass some of the optional test cases in the current OVP release. The OVP control service high availability tests expect two or more control nodes to pass, depending on the HA mechanism implemented by the SUT.

The SUT is also expected to include components for persistent storage. The OVP testing does not expect or impose significant storage size or performance requirements.

The SUT is expected to be connected with high performance networks. These networks are expected in the SUT:

  • A management network by which the Test Node can reach all identity, image, network, and compute services in the SUT
  • A data network that supports the virtual network capabilities and data path testing

Additional networks, such as a Lights Out Management network or storage networks, may be beneficial and found in the SUT, but they are not a requirement for OVP testing.
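A practical way to confirm the management-network requirement is to check that the Test Node can open TCP connections to each service endpoint. A minimal sketch, where the endpoint URLs and the port assignments are hypothetical examples rather than prescribed values:

```python
from urllib.parse import urlsplit
import socket

DEFAULT_PORTS = {"http": 80, "https": 443}

def endpoint_targets(urls):
    """Extract (host, port) pairs from service endpoint URLs so the Test
    Node's reachability over the management network can be probed."""
    targets = []
    for url in urls:
        parts = urlsplit(url)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme, 80)
        targets.append((parts.hostname, port))
    return targets

def reachable(host, port, timeout=2.0):
    """Best-effort TCP connect check from the Test Node (sketch only)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical endpoint URLs for the required services.
urls = ["http://192.0.2.10:5000/v3",    # identity (Keystone)
        "http://192.0.2.10:9292",       # image (Glance)
        "http://192.0.2.10:9696",       # network (Neutron)
        "http://192.0.2.10:8774/v2.1"]  # compute (Nova)
print(endpoint_targets(urls)[0])  # ('192.0.2.10', 5000)
```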

5. OVP Test Specifications
5.1. Introduction

The OPNFV OVP provides a series of test areas aimed at evaluating the operation of an NFV system in accordance with carrier networking needs. Each test area contains a number of associated test cases which are described in detail in the associated test specification.

All tests in the OVP are required to fulfill a specific set of criteria so that the OVP can provide a fair assessment of the system under test. Test requirements are described in the :ref:dovetail-test_case_requirements document.

All tests areas addressed in the OVP are covered in the following test specification documents.

5.1.1. OpenStack Services HA test specification
5.1.1.1. Scope

The HA test area evaluates the ability of the System Under Test to support service continuity and recovery from component failures of OpenStack controller services (“nova-api”, “neutron-server”, “keystone”, “glance-api”, “cinder-api”) and of the load balancer service.

The tests in this test area emulate component failures by killing the processes of the above target services, stressing the CPU load or blocking disk I/O on the selected controller node, and then check whether the impacted services are still available and whether the killed processes are recovered on the selected controller node within a given time interval.

5.1.1.2. References

This test area references the following specifications:

5.1.1.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • SUT - system under test
  • Monitor - tools used to measure the service outage time and the process outage time
  • Service outage time - the outage time (seconds) of the specific OpenStack service
  • Process outage time - the outage time (seconds) from the specific processes being killed to recovered
5.1.1.4. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM in operation on a Pharos compliant infrastructure.

The SUT is assumed to be in a high availability configuration, which typically means that more than one controller node is in the System Under Test.

5.1.1.5. Test Area Structure

The HA test area is structured with the following test cases in a sequential manner.

Each test case is able to run independently; a preceding test case’s failure will not affect the subsequent test cases.

Preconditions of each test case will be described in the following test descriptions.

5.1.1.6. Test Descriptions
5.1.1.6.1. Test Case 1 - Controller node OpenStack service down - nova-api
5.1.1.6.1.1. Short name

yardstick.ha.nova_api

Yardstick test case: opnfv_yardstick_tc019.yaml

5.1.1.6.1.2. Use case specification

This test case verifies the service continuity capability in the face of a software process failure. It kills the processes of the OpenStack “nova-api” service on the selected controller node, then checks whether the “nova-api” service is still available during the failure, by creating a VM and then deleting the VM, and checks whether the killed processes are recovered within a given time interval.

5.1.1.6.1.3. Test preconditions

There is more than one controller node providing the “nova-api” service at the API endpoint.

One controller node is denoted as Node1 in the following configuration.

5.1.1.6.1.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.1.4.1. Methodology for verifying service continuity and recovery

The service continuity and process recovery capabilities of the “nova-api” service are evaluated by monitoring the service outage time, the process outage time, and the results of nova operations.

Service outage time is measured by continuously executing the “openstack server list” command in a loop and checking whether the response of the command request is returned with no failure. When the response fails, the “nova-api” service is considered to be in outage. The time between the first response failure and the last response failure is considered the service outage time.

Process outage time is measured by checking the status of the “nova-api” processes on the selected controller node. The time from the “nova-api” processes being killed to the “nova-api” processes being recovered is the process outage time. Process recovery is verified by checking the existence of the “nova-api” processes.

If all nova operations are carried out correctly within a given time interval, this suggests that the “nova-api” service is continuously available.
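The service outage computation described above reduces to finding the span between the first and the last failed poll. A minimal sketch with synthetic monitor samples (the real monitor polls "openstack server list"):

```python
def service_outage_time(samples):
    """Compute the service outage time from timestamped poll results.
    'samples' is a list of (timestamp_seconds, ok) pairs produced by a
    monitor loop; per the methodology above, the outage is the time between
    the first and the last failed response. The sample data is synthetic."""
    failures = [t for t, ok in samples if not ok]
    if not failures:
        return 0.0
    return failures[-1] - failures[0]

# Poll every 0.5s; responses fail between t=2.0s and t=4.0s.
samples = [(i * 0.5, not (2.0 <= i * 0.5 <= 4.0)) for i in range(12)]
print(service_outage_time(samples))  # 2.0
```

The process outage time follows the same pattern, with the monitor sampling the existence of the "nova-api" processes instead of a command response.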

5.1.1.6.1.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that the “nova-api” processes are running on Node1
  • Test action 2: Create an image with “openstack image create test-cirros --file cirros-0.3.5-x86_64-disk.img --disk-format qcow2 --container-format bare”
  • Test action 3: Execute “openstack flavor create m1.test --id auto --ram 512 --disk 1 --vcpus 1” to create the flavor “m1.test”
  • Test action 4: Start two monitors: one for the “nova-api” processes and the other for the “openstack server list” command. Each monitor runs as an independent process
  • Test action 5: Connect to Node1 through SSH, and then kill the “nova-api” processes
  • Test action 6: When “openstack server list” returns with no error, calculate the service outage time, and execute “openstack server create --flavor m1.test --image test-cirros test-instance”
  • Test action 7: Continuously execute “openstack server show test-instance” to check whether the status of the VM “test-instance” is “Active”
  • Test action 8: If the VM “test-instance” is “Active”, execute “openstack server delete test-instance”, then execute “openstack server list” to check that the VM is no longer in the list
  • Test action 9: Continuously measure the process outage time from the monitor until the process outage time is more than 30s
5.1.1.6.1.4.3. Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The nova operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.1.5. Post conditions

Restart the process of “nova-api” if they are not running.

Delete image with “openstack image delete test-cirros”.

Delete flavor with “openstack flavor delete m1.test”.

5.1.1.6.2. Test Case 2 - Controller node OpenStack service down - neutron-server
5.1.1.6.2.1. Short name

yardstick.ha.neutron_server

Yardstick test case: opnfv_yardstick_tc045.yaml

5.1.1.6.2.2. Use case specification

This test verifies the high availability of the “neutron-server” service provided by OpenStack controller nodes. It kills the processes of OpenStack “neutron-server” service on the selected controller node, then checks whether the “neutron-server” service is still available, by creating a network and deleting the network, and checks whether the killed processes are recovered.

5.1.1.6.2.3. Test preconditions

There is more than one controller node, which provides the “neutron-server” service for the API end-point.

Denote a controller node as Node1 in the following configuration.

5.1.1.6.2.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.2.4.1. Methodology for monitoring high availability

The high availability of the “neutron-server” service is evaluated by monitoring service outage time, process outage time, and the results of neutron operations.

Service outage time is tested by continuously executing the “openstack router list” command in a loop and checking whether the response is returned with no failure. When the response fails, the “neutron-server” service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

Process outage time is tested by checking the status of the “neutron-server” processes on the selected controller node. The time from the “neutron-server” processes being killed to the time of their recovery is the process outage time. Process recovery is verified by checking the existence of the “neutron-server” processes.

5.1.1.6.2.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “neutron-server” processes are running on Node1
  • Test action 2: Start two monitors: one for “neutron-server” process and the other for “openstack router list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “neutron-server” processes
  • Test action 4: When “openstack router list” returns with no error, calculate the service outage time, and execute “openstack network create test-network”
  • Test action 5: Continuously execute “openstack network show test-network” to check if the status of “test-network” is “Active”
  • Test action 6: If “test-network” is “Active”, execute “openstack network delete test-network”, then execute “openstack network list” to check if the “test-network” is not in the list
  • Test action 7: Continuously measure process outage time from the monitor until the process outage time is more than 30s
5.1.1.6.2.4.3. Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The neutron operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.2.5. Post conditions

Restart the processes of “neutron-server” if they are not running.

5.1.1.6.3. Test Case 3 - Controller node OpenStack service down - keystone
5.1.1.6.3.1. Short name

yardstick.ha.keystone

Yardstick test case: opnfv_yardstick_tc046.yaml

5.1.1.6.3.2. Use case specification

This test verifies the high availability of the “keystone” service provided by OpenStack controller nodes. It kills the processes of OpenStack “keystone” service on the selected controller node, then checks whether the “keystone” service is still available by executing command “openstack user list” and whether the killed processes are recovered.

5.1.1.6.3.3. Test preconditions

There is more than one controller node, which provides the “keystone” service for the API end-point.

Denote a controller node as Node1 in the following configuration.

5.1.1.6.3.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.3.4.1. Methodology for monitoring high availability

The high availability of the “keystone” service is evaluated by monitoring service outage time and process outage time.

Service outage time is tested by continuously executing the “openstack user list” command in a loop and checking whether the response is returned with no failure. When the response fails, the “keystone” service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

Process outage time is tested by checking the status of the “keystone” processes on the selected controller node. The time from the “keystone” processes being killed to the time of their recovery is the process outage time. Process recovery is verified by checking the existence of the “keystone” processes.

5.1.1.6.3.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “keystone” processes are running on Node1
  • Test action 2: Start two monitors: one for “keystone” process and the other for “openstack user list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “keystone” processes
  • Test action 4: Calculate the service outage time and process outage time
  • Test action 5: The test passes if process outage time is less than 20s and service outage time is less than 5s
  • Test action 6: Continuously measure process outage time from the monitor until the process outage time is more than 30s
5.1.1.6.3.4.3. Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.3.5. Post conditions

Restart the processes of “keystone” if they are not running.

5.1.1.6.4. Test Case 4 - Controller node OpenStack service down - glance-api
5.1.1.6.4.1. Short name

yardstick.ha.glance_api

Yardstick test case: opnfv_yardstick_tc047.yaml

5.1.1.6.4.2. Use case specification

This test verifies the high availability of the “glance-api” service provided by OpenStack controller nodes. It kills the processes of the OpenStack “glance-api” service on the selected controller node, then checks whether the “glance-api” service is still available, by creating an image and deleting it, and checks whether the killed processes are recovered.

5.1.1.6.4.3. Test preconditions

There is more than one controller node, which provides the “glance-api” service for the API end-point.

Denote a controller node as Node1 in the following configuration.

5.1.1.6.4.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.4.4.1. Methodology for monitoring high availability

The high availability of the “glance-api” service is evaluated by monitoring service outage time, process outage time, and the results of glance operations.

Service outage time is tested by continuously executing the “openstack image list” command in a loop and checking whether the response is returned with no failure. When the response fails, the “glance-api” service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

Process outage time is tested by checking the status of the “glance-api” processes on the selected controller node. The time from the “glance-api” processes being killed to the time of their recovery is the process outage time. Process recovery is verified by checking the existence of the “glance-api” processes.

5.1.1.6.4.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “glance-api” processes are running on Node1
  • Test action 2: Start two monitors: one for “glance-api” process and the other for “openstack image list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “glance-api” processes
  • Test action 4: When “openstack image list” returns with no error, calculate the service outage time, and execute “openstack image create test-image --file cirros-0.3.5-x86_64-disk.img --disk-format qcow2 --container-format bare”
  • Test action 5: Continuously execute “openstack image show test-image”, check if status of “test-image” is “active”
  • Test action 6: If “test-image” is “active”, execute “openstack image delete test-image”. Then execute “openstack image list” to check if “test-image” is not in the list
  • Test action 7: Continuously measure process outage time from the monitor until the process outage time is more than 30s
5.1.1.6.4.4.3. Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The glance operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.4.5. Post conditions

Restart the processes of “glance-api” if they are not running.

Delete image with “openstack image delete test-image”.

5.1.1.6.5. Test Case 5 - Controller node OpenStack service down - cinder-api
5.1.1.6.5.1. Short name

yardstick.ha.cinder_api

Yardstick test case: opnfv_yardstick_tc048.yaml

5.1.1.6.5.2. Use case specification

This test verifies the high availability of the “cinder-api” service provided by OpenStack controller nodes. It kills the processes of OpenStack “cinder-api” service on the selected controller node, then checks whether the “cinder-api” service is still available by executing command “openstack volume list” and whether the killed processes are recovered.

5.1.1.6.5.3. Test preconditions

There is more than one controller node, which provides the “cinder-api” service for the API end-point.

Denote a controller node as Node1 in the following configuration.

5.1.1.6.5.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.5.4.1. Methodology for monitoring high availability

The high availability of the “cinder-api” service is evaluated by monitoring service outage time and process outage time.

Service outage time is tested by continuously executing the “openstack volume list” command in a loop and checking whether the response is returned with no failure. When the response fails, the “cinder-api” service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

Process outage time is tested by checking the status of the “cinder-api” processes on the selected controller node. The time from the “cinder-api” processes being killed to the time of their recovery is the process outage time. Process recovery is verified by checking the existence of the “cinder-api” processes.

5.1.1.6.5.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “cinder-api” processes are running on Node1
  • Test action 2: Start two monitors: one for “cinder-api” process and the other for “openstack volume list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “cinder-api” processes
  • Test action 4: Continuously measure service outage time from the monitor until the service outage time is more than 5s
  • Test action 5: Continuously measure process outage time from the monitor until the process outage time is more than 30s
5.1.1.6.5.4.3. Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The cinder operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.5.5. Post conditions

Restart the processes of “cinder-api” if they are not running.

5.1.1.6.6. Test Case 6 - Controller Node CPU Overload High Availability
5.1.1.6.6.1. Short name

yardstick.ha.cpu_load

Yardstick test case: opnfv_yardstick_tc051.yaml

5.1.1.6.6.2. Use case specification

This test verifies the availability of services when one of the controller nodes suffers from heavy CPU overload. When the CPU usage of the specified controller node reaches 100%, which disrupts the OpenStack services on this node, the OpenStack services should still be available overall. This test case stresses the CPU usage of a specific controller node to 100%, then checks whether all services provided by the SUT are still available with the monitor tools.

5.1.1.6.6.3. Test preconditions

There is more than one controller node, which provides the “cinder-api”, “neutron-server”, “glance-api” and “keystone” services for the API end-point.

Denote a controller node as Node1 in the following configuration.

5.1.1.6.6.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.6.4.1. Methodology for monitoring high availability

The high availability of the related OpenStack services is evaluated by monitoring service outage time.

Service outage time is tested by continuously executing the “openstack router list”, “openstack stack list”, “openstack volume list” and “openstack image list” commands in a loop and checking whether the responses are returned with no failure. When a response fails, the related service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

5.1.1.6.6.4.2. Methodology for stressing CPU usage

To evaluate the high availability of the target OpenStack services under heavy CPU load, the test case first gets the number of logical CPU cores on the target controller node by shell command, then uses that number to execute ‘dd’ commands that continuously copy from /dev/zero to /dev/null in a loop. The ‘dd’ operation uses only CPU, with no disk I/O, which makes it ideal for stressing CPU usage.

Since the ‘dd’ commands run continuously and drive the CPU usage to 100%, the scheduler places each ‘dd’ command on a different logical CPU core, so that eventually all logical CPU cores reach 100% usage.
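A sketch of that stress step, using `os.cpu_count()` as a stand-in for reading the core count over SSH; the command strings illustrate the idea rather than the exact Yardstick script:

```python
import os

def cpu_stress_commands(cores=None):
    """Build one 'dd' command per logical CPU core: each copies /dev/zero
    to /dev/null, which is pure CPU work with no disk I/O."""
    if cores is None:
        cores = os.cpu_count()
    return ["dd if=/dev/zero of=/dev/null" for _ in range(cores)]

# On a 4-core controller the test would launch these four commands in
# parallel and later kill them (Test action 4):
print(len(cpu_stress_commands(4)))  # 4
```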

5.1.1.6.6.4.3. Test execution
  • Test action 1: Start four monitors: one for “openstack image list” command, one for “openstack router list” command, one for “openstack stack list” command and the last one for “openstack volume list” command. Each monitor will run as an independent process.
  • Test action 2: Connect to Node1 through SSH, and then stress all logical CPU cores usage rate to 100%
  • Test action 3: Continuously measure all the service outage times until they are more than 5s
  • Test action 4: Kill the process that stresses the CPU usage
5.1.1.6.6.4.4. Pass / fail criteria

All the service outage times are less than 5s.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.6.5. Post conditions

No impact on the SUT.

5.1.1.6.7. Test Case 7 - Controller Node Disk I/O Overload High Availability
5.1.1.6.7.1. Short name

yardstick.ha.disk_load

Yardstick test case: opnfv_yardstick_tc052.yaml

5.1.1.6.7.2. Use case specification

This test verifies the high availability of the control node. When the disk I/O of a specific disk is overloaded, which disrupts the OpenStack services on this node, the read and write services should continue to be available. This test case blocks the disk I/O of the specific controller node, then checks whether the services that need to read or write the disk of the controller node are available with some monitor tools.

5.1.1.6.7.3. Test preconditions

There is more than one controller node. Denote a controller node as Node1 in the following configuration. The controller node has at least 20GB of free disk space.

5.1.1.6.7.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.7.4.1. Methodology for monitoring high availability

The high availability of the nova service is evaluated by monitoring service outage time.

Service availability is tested by continuously executing the “openstack flavor list” command in a loop and checking whether the response is returned with no failure. When the response fails, the related service is considered to be in outage.

5.1.1.6.7.4.2. Methodology for stressing disk I/O

To evaluate the high availability of the target OpenStack service under heavy I/O load, the test case executes a shell command on the selected controller node to continuously write 8KB blocks to /test.dbf.
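The write-stress command might look like the following sketch; the exact flags (`count`, `conv=fsync`) are illustrative assumptions, not the verbatim Yardstick script:

```python
def disk_stress_command(path="/test.dbf", block_size="8k", blocks=100000):
    """Build a dd command that writes `blocks` blocks of `block_size` to
    `path`; the real test runs this in a loop on the controller node."""
    return f"dd if=/dev/zero of={path} bs={block_size} count={blocks} conv=fsync"

print(disk_stress_command())
```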

5.1.1.6.7.4.3. Test execution
  • Test action 1: Connect to Node1 through SSH, and then stress disk I/O by continuously writing 8KB blocks to /test.dbf
  • Test action 2: Start a monitor for the “openstack flavor list” command
  • Test action 3: Create a flavor called “test-001”
  • Test action 4: Check whether the flavor “test-001” is created
  • Test action 5: Continuously measure service outage time from the monitor until the service outage time is more than 5s
  • Test action 6: Stop writing to /test.dbf and delete file /test.dbf
5.1.1.6.7.4.4. Pass / fail criteria

The service outage time is less than 5s.

The nova operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.7.5. Post conditions

Delete flavor with “openstack flavor delete test-001”.

5.1.1.6.8. Test Case 8 - Controller Load Balance as a Service High Availability
5.1.1.6.8.1. Short name

yardstick.ha.haproxy

Yardstick test case: opnfv_yardstick_tc053.yaml

5.1.1.6.8.2. Use case specification

This test verifies the high availability of the “haproxy” service. When the “haproxy” service of a specified controller node is killed, the test checks whether the “haproxy” service on other controller nodes still works, and whether the controller node restarts its “haproxy” service. This test case kills the processes of the “haproxy” service on the selected controller node, then checks whether the requests of the related OpenStack commands are processed with no failure and whether the killed processes are recovered.

5.1.1.6.8.3. Test preconditions

There is more than one controller node, which provides the “haproxy” service for the REST API.

Denote a controller node as Node1 in the following configuration.

5.1.1.6.8.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.8.4.1. Methodology for monitoring high availability

The high availability of the “haproxy” service is evaluated by monitoring service outage time and process outage time.

Service outage time is tested by continuously executing the “openstack image list” command in a loop and checking whether the response is returned with no failure. When the response fails, the “haproxy” service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

Process outage time is tested by checking the status of the processes of the “haproxy” service on the selected controller node. The time from those processes being killed to the time of their recovery is the process outage time. Process recovery is verified by checking the existence of the processes of the “haproxy” service.

5.1.1.6.8.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that processes of “haproxy” service are running on Node1
  • Test action 2: Start two monitors: one for processes of “haproxy” service and the other for “openstack image list” command. Each monitor will run as an independent process
  • Test action 3: Connect to Node1 through SSH, and then kill the processes of “haproxy” service
  • Test action 4: Continuously measure service outage time from the monitor until the service outage time is more than 5s
  • Test action 5: Continuously measure process outage time from the monitor until the process outage time is more than 30s
5.1.1.6.8.4.3. Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.8.5. Post conditions

Restart the processes of “haproxy” if they are not running.

5.1.1.6.9. Test Case 9 - Controller node OpenStack service down - Database
5.1.1.6.9.1. Short name

yardstick.ha.database

Yardstick test case: opnfv_yardstick_tc090.yaml

5.1.1.6.9.2. Use case specification

This test case verifies that the high availability of the database instances used by OpenStack (MySQL) on the control node is working properly. Specifically, this test case kills the processes of the database service on a selected control node, then checks whether the requests of the related OpenStack commands succeed and whether the killed processes are recovered.

5.1.1.6.9.3. Test preconditions

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: fault_type, process_name and host.

The purpose of this attacker is to kill any process with a specific process name running on the host node. If multiple processes use the same name on the host node, all of them are killed by this attacker.

5.1.1.6.9.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.9.4.1. Methodology for verifying service continuity and recovery

In order to verify this service, two different monitors are used.

The first monitor uses an OpenStack command and acts as a watcher for the database connections of the different OpenStack components.

The second monitor is a process monitor whose main purpose is to watch whether the database processes on the host node are killed properly.

Therefore, in this test case, there are two metrics:

  • service_outage_time, which indicates the maximum outage time (seconds) of the specified OpenStack command request
  • process_recover_time, which indicates the maximum time (seconds) from the process being killed to recovered
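The verdict over these two metrics reduces to a simple threshold check; the function name is illustrative, with default thresholds taken from the pass/fail criteria (process outage under 30s, service outage under 5s):

```python
def sla_verdict(service_outage_time, process_recover_time,
                max_service_outage=5.0, max_process_recover=30.0):
    """True if both metrics are within their SLA thresholds (seconds)."""
    return (service_outage_time < max_service_outage
            and process_recover_time < max_process_recover)

print(sla_verdict(2.1, 12.5))  # True
print(sla_verdict(6.0, 12.5))  # False
```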
5.1.1.6.9.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “database” processes are running on Node1
  • Test action 2: Start two monitors: one for the “database” processes on the host node and the other for the connections from OpenStack components toward the database, verifying the results of openstack image list, openstack router list, openstack stack list and openstack volume list. Each monitor will run as an independent process
  • Test action 3: Connect to Node1 through SSH, and then kill the “mysql” process(es)
  • Test action 4: Stop monitors after a period of time specified by “waiting_time”. The monitor info will be aggregated.
  • Test action 5: Verify the SLA and set the verdict of the test case to pass or fail.
5.1.1.6.9.4.3. Pass / fail criteria

Check whether the SLA is passed:

  • The process outage time is less than 30s.
  • The service outage time is less than 5s.

The database operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.9.5. Post conditions

The database service is up and running again. If the database service did not recover successfully by itself, the test explicitly restarts the database service.

5.1.1.6.10. Test Case 10 - Controller Messaging Queue as a Service High Availability
5.1.1.6.10.1. Short name

yardstick.ha.rabbitmq

Yardstick test case: opnfv_yardstick_tc056.yaml

5.1.1.6.10.2. Use case specification

This test case verifies the high availability of the messaging queue service (RabbitMQ) that supports OpenStack on the controller node. This test case expects the message bus service implementation to be RabbitMQ. If the SUT uses a different message bus implementation, the Dovetail configuration (pod.yaml) can be changed accordingly. When the (active) messaging queue service of a specified controller node is killed, the test case checks whether the (standby) messaging queue services on other controller nodes are switched to active, and whether the cluster manager on the attacked controller node restarts the stopped messaging queue.

5.1.1.6.10.3. Test preconditions

There is more than one controller node, which provides the “messaging queue” service. Denote a controller node as Node1 in the following configuration.

5.1.1.6.10.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.10.4.1. Methodology for verifying service continuity and recovery

The high availability of the “messaging queue” service is evaluated by monitoring service outage time and process outage time.

Service outage time is tested by continuously executing the “openstack image list”, “openstack network list”, “openstack volume list” and “openstack stack list” commands in a loop and checking whether the responses are returned with no failure. When a response fails, the “messaging queue” service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

Process outage time is tested by checking the status of the processes of the “messaging queue” service on the selected controller node. The time from those processes being killed to the time of their recovery is the process outage time. Process recovery is verified by checking the existence of the processes of the “messaging queue” service.

5.1.1.6.10.4.2. Test execution
  • Test action 1: Start five monitors: one for the processes of the “messaging queue” service and the others for the “openstack image list”, “openstack network list”, “openstack stack list” and “openstack volume list” commands. Each monitor will run as an independent process
  • Test action 2: Connect to Node1 through SSH, and then kill all the processes of “messaging queue” service
  • Test action 3: Continuously measure service outage time from the monitors until the service outage time is more than 5s
  • Test action 4: Continuously measure process outage time from the monitor until the process outage time is more than 30s
5.1.1.6.10.4.3. Pass / fail criteria

Test passes if the process outage time is no more than 30s and the service outage time is no more than 5s.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.10.5. Post conditions

Restart the processes of “messaging queue” if they are not running.

5.1.1.6.11. Test Case 11 - Controller node OpenStack service down - Controller Restart
5.1.1.6.11.1. Short name

yardstick.ha.controller_restart

Yardstick test case: opnfv_yardstick_tc025.yaml

5.1.1.6.11.2. Use case specification

This test case verifies that the high availability of the controller node is working properly. Specifically, this test case shuts down a specified controller node via IPMI, then checks whether all services provided by the controller node are OK with some monitor tools.

5.1.1.6.11.3. Test preconditions

In this test case, an attacker called “host-shutdown” is needed. This attacker includes two parameters: fault_type and host.

The purpose of this attacker is to shut down a controller and check whether the services handled by this controller are still working normally.

5.1.1.6.11.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.11.4.1. Methodology for verifying service continuity and recovery

In order to verify this service, one monitor is used.

This monitor runs an OpenStack command against the OpenStack component whose service we want to verify is still running normally.

In this test case, there is one metric: service_outage_time, which indicates the maximum outage time (seconds) of the specified OpenStack command request.

5.1.1.6.11.4.2. Test execution
  • Test action 1: Connect to Node1 through SSH, and check that controller services are running normally
  • Test action 2: Start monitors: each monitor will run as an independent process, monitoring the image list, router list, stack list and volume list accordingly. The monitor info will be collected.
  • Test action 3: Using the IPMI component, Node1 is shut down remotely.
  • Test action 4: Stop monitors after a period of time specified by “waiting_time”. The monitor info will be aggregated.
  • Test action 5: Verify the SLA and set the verdict of the test case to pass or fail.
5.1.1.6.11.4.3. Pass / fail criteria

Check whether the SLA is passed:

  • The process outage time is less than 30s.
  • The service outage time is less than 5s.

The controller operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.11.5. Post conditions

The controller has been restarted.

5.1.1.6.12. Test Case 12 - OpenStack Controller Virtual Router Service High Availability
5.1.1.6.12.1. Short name

yardstick.ha.neutron_l3_agent

Yardstick test case: opnfv_yardstick_tc058.yaml

5.1.1.6.12.2. Use case specification

This test case verifies the high availability of virtual routers (L3 agents) on the controller node. When a virtual router service on a specified controller node is shut down, this test case checks whether the network of the virtual machines is affected, and whether the attacked virtual router service is recovered.

5.1.1.6.12.3. Test preconditions

There is more than one controller node, which provides the virtual router service through the Neutron API extension called “neutron-l3-agent”.

Denote a controller node as Node1 in the following configuration.

5.1.1.6.12.4. Basic test flow execution description and pass/fail criteria
5.1.1.6.12.4.1. Methodology for verifying service continuity and recovery

The high availability of the “neutron-l3-agent” virtual router service is evaluated by monitoring service outage time and process outage time.

Service outage is tested by pinging the virtual machines. The ping verifies that the network routing of the virtual machines is working. When the response fails, the virtual router service is considered to be in outage. The time between the first response failure and the last response failure is taken as the service outage time.

Process outage time is tested by checking the status of processes of “neutron-l3-agent” service on the selected controller node. The time of those processes being killed to the time of those processes being recovered is the process outage time.

Process recovery is verified by checking the existence of processes of “neutron-l3-agent” service.

5.1.1.6.12.4.2. Test execution
  • Test action 1: Boot two host VMs in two different networks connected by a virtual router
  • Test action 2: Start monitors: each monitor will run as an independent process. The monitor info will be collected.
  • Test action 3: Run the attacker: connect to the host through SSH, and then execute the kill-process script with the parameter value specified by “process_name”
  • Test action 4: Stop monitors after a period of time specified by “waiting_time”. The monitor info will be aggregated.
  • Test action 5: Verify the SLA and set the verdict of the test case to pass or fail.
5.1.1.6.12.4.3. Pass / fail criteria

Check whether the SLA is passed:

  • The process outage time is less than 30s.
  • The service outage time is less than 5s.

A negative result will be generated if the above criteria are not fully met.

5.1.1.6.12.5. Post conditions

Delete image with “openstack image delete neutron-l3-agent_ha_image”.

Delete flavor with “openstack flavor delete neutron-l3-agent_ha_flavor”.

5.1.2. Patrole Tempest Tests
5.1.2.1. Scope

This test area evaluates the ability of a system under test to support a role-based access control (RBAC) implementation. The test area specifically validates the image and networking services.

5.1.2.3. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM deployed on a Pharos compliant infrastructure.

5.1.2.4. Test Area Structure

The test area is structured in individual tests as listed below. Each test case is able to run independently, i.e. regardless of the state created by a previous test. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links.

Image basic RBAC test:

These tests cover the RBAC tests of image basic operations.

Implementation: BasicOperationsImagesRbacTest

  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_create_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_create_image_tag
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_deactivate_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_delete_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_delete_image_tag
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_download_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_list_images
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_publicize_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_reactivate_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_show_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_update_image
  • patrole_tempest_plugin.tests.api.image.test_images_rbac.BasicOperationsImagesRbacTest.test_upload_image

Image namespaces RBAC test:

These tests cover the RBAC tests of image namespaces.

Implementation: ImageNamespacesRbacTest

  • patrole_tempest_plugin.tests.api.image.test_image_namespace_rbac.ImageNamespacesRbacTest.test_create_metadef_namespace
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_rbac.ImageNamespacesRbacTest.test_list_metadef_namespaces
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_rbac.ImageNamespacesRbacTest.test_modify_metadef_namespace

Image namespaces objects RBAC test:

These tests cover the RBAC tests of image namespaces objects.

Implementation: ImageNamespacesObjectsRbacTest

  • patrole_tempest_plugin.tests.api.image.test_image_namespace_objects_rbac.ImageNamespacesObjectsRbacTest.test_create_metadef_object_in_namespace
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_objects_rbac.ImageNamespacesObjectsRbacTest.test_list_metadef_objects_in_namespace
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_objects_rbac.ImageNamespacesObjectsRbacTest.test_show_metadef_object_in_namespace
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_objects_rbac.ImageNamespacesObjectsRbacTest.test_update_metadef_object_in_namespace

Image namespaces property RBAC test:

These tests cover the RBAC tests of image namespaces property.

Implementation: NamespacesPropertyRbacTest

  • patrole_tempest_plugin.tests.api.image.test_image_namespace_property_rbac.NamespacesPropertyRbacTest.test_add_md_properties
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_property_rbac.NamespacesPropertyRbacTest.test_get_md_properties
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_property_rbac.NamespacesPropertyRbacTest.test_get_md_property
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_property_rbac.NamespacesPropertyRbacTest.test_modify_md_properties

Image namespaces tags RBAC test:

These tests cover the RBAC tests of image namespaces tags.

Implementation: NamespaceTagsRbacTest

  • patrole_tempest_plugin.tests.api.image.test_image_namespace_tags_rbac.NamespaceTagsRbacTest.test_create_namespace_tag
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_tags_rbac.NamespaceTagsRbacTest.test_create_namespace_tags
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_tags_rbac.NamespaceTagsRbacTest.test_list_namespace_tags
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_tags_rbac.NamespaceTagsRbacTest.test_show_namespace_tag
  • patrole_tempest_plugin.tests.api.image.test_image_namespace_tags_rbac.NamespaceTagsRbacTest.test_update_namespace_tag

Image resource types RBAC test:

These tests cover the RBAC tests of image resource types.

Implementation: ImageResourceTypesRbacTest

  • patrole_tempest_plugin.tests.api.image.test_image_resource_types_rbac.ImageResourceTypesRbacTest.test_add_metadef_resource_type
  • patrole_tempest_plugin.tests.api.image.test_image_resource_types_rbac.ImageResourceTypesRbacTest.test_get_metadef_resource_type
  • patrole_tempest_plugin.tests.api.image.test_image_resource_types_rbac.ImageResourceTypesRbacTest.test_list_metadef_resource_types

Image member RBAC test:

These tests cover the RBAC tests of image member.

Implementation: ImagesMemberRbacTest

  • patrole_tempest_plugin.tests.api.image.test_images_member_rbac.ImagesMemberRbacTest.test_add_image_member
  • patrole_tempest_plugin.tests.api.image.test_images_member_rbac.ImagesMemberRbacTest.test_delete_image_member
  • patrole_tempest_plugin.tests.api.image.test_images_member_rbac.ImagesMemberRbacTest.test_list_image_members
  • patrole_tempest_plugin.tests.api.image.test_images_member_rbac.ImagesMemberRbacTest.test_show_image_member

Network agents RBAC test:

These tests cover the RBAC tests of network agents.

Implementation: AgentsRbacTest and DHCPAgentSchedulersRbacTest.

  • patrole_tempest_plugin.tests.api.network.test_agents_rbac.AgentsRbacTest.test_show_agent
  • patrole_tempest_plugin.tests.api.network.test_agents_rbac.AgentsRbacTest.test_update_agent
  • patrole_tempest_plugin.tests.api.network.test_agents_rbac.DHCPAgentSchedulersRbacTest.test_add_dhcp_agent_to_network
  • patrole_tempest_plugin.tests.api.network.test_agents_rbac.DHCPAgentSchedulersRbacTest.test_delete_network_from_dhcp_agent
  • patrole_tempest_plugin.tests.api.network.test_agents_rbac.DHCPAgentSchedulersRbacTest.test_list_networks_hosted_by_one_dhcp_agent

Network floating ips RBAC test:

These tests cover the RBAC tests of network floating ips.

Implementation: FloatingIpsRbacTest

  • patrole_tempest_plugin.tests.api.network.test_floating_ips_rbac.FloatingIpsRbacTest.test_create_floating_ip
  • patrole_tempest_plugin.tests.api.network.test_floating_ips_rbac.FloatingIpsRbacTest.test_create_floating_ip_floatingip_address
  • patrole_tempest_plugin.tests.api.network.test_floating_ips_rbac.FloatingIpsRbacTest.test_delete_floating_ip
  • patrole_tempest_plugin.tests.api.network.test_floating_ips_rbac.FloatingIpsRbacTest.test_show_floating_ip
  • patrole_tempest_plugin.tests.api.network.test_floating_ips_rbac.FloatingIpsRbacTest.test_update_floating_ip

Network basic RBAC test:

These tests cover the RBAC tests of network basic operations.

Implementation: NetworksRbacTest

  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_create_network
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_create_network_provider_network_type
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_create_network_provider_segmentation_id
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_create_network_router_external
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_create_network_shared
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_create_subnet
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_delete_network
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_delete_subnet
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_list_dhcp_agents_on_hosting_network
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_show_network
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_show_network_provider_network_type
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_show_network_provider_physical_network
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_show_network_provider_segmentation_id
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_show_network_router_external
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_show_subnet
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_update_network
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_update_network_router_external
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_update_network_shared
  • patrole_tempest_plugin.tests.api.network.test_networks_rbac.NetworksRbacTest.test_update_subnet

Network ports RBAC test:

These tests cover the RBAC tests of network ports.

Implementation: PortsRbacTest

  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port_allowed_address_pairs
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port_binding_host_id
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port_binding_profile
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port_device_owner
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port_fixed_ips
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port_mac_address
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_create_port_security_enabled
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_delete_port
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_show_port
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_show_port_binding_host_id
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_show_port_binding_profile
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_show_port_binding_vif_details
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_show_port_binding_vif_type
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port_allowed_address_pairs
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port_binding_host_id
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port_binding_profile
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port_device_owner
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port_fixed_ips
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port_mac_address
  • patrole_tempest_plugin.tests.api.network.test_ports_rbac.PortsRbacTest.test_update_port_security_enabled

Network routers RBAC test:

These tests cover the RBAC tests of network routers.

Implementation: RouterRbacTest

  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_add_router_interface
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_create_router
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_create_router_enable_snat
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_create_router_external_fixed_ips
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_delete_router
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_remove_router_interface
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_show_router
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_update_router
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_update_router_enable_snat
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_update_router_external_fixed_ips
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_update_router_external_gateway_info
  • patrole_tempest_plugin.tests.api.network.test_routers_rbac.RouterRbacTest.test_update_router_external_gateway_info_network_id

Network security groups RBAC test:

These tests cover the RBAC tests of network security groups.

Implementation: SecGroupRbacTest

  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_create_security_group
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_create_security_group_rule
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_delete_security_group
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_delete_security_group_rule
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_list_security_group_rules
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_list_security_groups
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_show_security_group_rule
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_show_security_groups
  • patrole_tempest_plugin.tests.api.network.test_security_groups_rbac.SecGroupRbacTest.test_update_security_group

Network service providers RBAC test:

These tests cover the RBAC tests of network service providers.

Implementation: ServiceProvidersRbacTest

  • patrole_tempest_plugin.tests.api.network.test_service_providers_rbac.ServiceProvidersRbacTest.test_list_service_providers

Network subnetpools RBAC test:

These tests cover the RBAC tests of network subnetpools.

Implementation: SubnetPoolsRbacTest

  • patrole_tempest_plugin.tests.api.network.test_subnetpools_rbac.SubnetPoolsRbacTest.test_create_subnetpool
  • patrole_tempest_plugin.tests.api.network.test_subnetpools_rbac.SubnetPoolsRbacTest.test_create_subnetpool_shared
  • patrole_tempest_plugin.tests.api.network.test_subnetpools_rbac.SubnetPoolsRbacTest.test_delete_subnetpool
  • patrole_tempest_plugin.tests.api.network.test_subnetpools_rbac.SubnetPoolsRbacTest.test_show_subnetpool
  • patrole_tempest_plugin.tests.api.network.test_subnetpools_rbac.SubnetPoolsRbacTest.test_update_subnetpool
  • patrole_tempest_plugin.tests.api.network.test_subnetpools_rbac.SubnetPoolsRbacTest.test_update_subnetpool_is_default

Network subnets RBAC test:

These tests cover the RBAC tests of network subnets.

Implementation: SubnetsRbacTest

  • patrole_tempest_plugin.tests.api.network.test_subnets_rbac.SubnetsRbacTest.test_create_subnet
  • patrole_tempest_plugin.tests.api.network.test_subnets_rbac.SubnetsRbacTest.test_delete_subnet
  • patrole_tempest_plugin.tests.api.network.test_subnets_rbac.SubnetsRbacTest.test_list_subnets
  • patrole_tempest_plugin.tests.api.network.test_subnets_rbac.SubnetsRbacTest.test_show_subnet
  • patrole_tempest_plugin.tests.api.network.test_subnets_rbac.SubnetsRbacTest.test_update_subnet
5.1.3. SNAPS smoke test specification
5.1.3.1. Scope

The SNAPS smoke test case contains tests that set up and destroy environments containing VMs, with and without floating IPs, using a newly created user and project.

5.1.3.2. References

This smoke test executes the Python tests included with the SNAPS libraries, which exercise many of the OpenStack APIs within Keystone, Glance, Neutron, and Nova.

5.1.3.3. System Under Test (SUT)

The SUT is assumed to be the NFVI and VIM in operation on a Pharos compliant infrastructure.

5.1.3.4. Test Area Structure

The test area is structured in individual tests as listed below. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links:

Dynamic creation of User/Project objects to be leveraged for the integration tests:

  • Create Image Success tests
    • snaps.openstack.tests.create_image_tests.CreateImageSuccessTests.test_create_delete_image
    • snaps.openstack.tests.create_image_tests.CreateImageSuccessTests.test_create_image_clean_file
    • snaps.openstack.tests.create_image_tests.CreateImageSuccessTests.test_create_image_clean_url
    • snaps.openstack.tests.create_image_tests.CreateImageSuccessTests.test_create_image_clean_url_properties
    • snaps.openstack.tests.create_image_tests.CreateImageSuccessTests.test_create_same_image
    • snaps.openstack.tests.create_image_tests.CreateImageSuccessTests.test_create_same_image_new_settings
  • Create Image Negative tests
    • snaps.openstack.tests.create_image_tests.CreateImageNegativeTests.test_bad_image_file
    • snaps.openstack.tests.create_image_tests.CreateImageNegativeTests.test_bad_image_image_type
    • snaps.openstack.tests.create_image_tests.CreateImageNegativeTests.test_bad_image_name
    • snaps.openstack.tests.create_image_tests.CreateImageNegativeTests.test_bad_image_url
  • Create Image Multi Part tests
    • snaps.openstack.tests.create_image_tests.CreateMultiPartImageTests.test_create_three_part_image_from_file_3_creators
    • snaps.openstack.tests.create_image_tests.CreateMultiPartImageTests.test_create_three_part_image_from_url
    • snaps.openstack.tests.create_image_tests.CreateMultiPartImageTests.test_create_three_part_image_from_url_3_creators
  • Create Keypairs tests
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsTests.test_create_delete_keypair
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsTests.test_create_keypair_from_file
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsTests.test_create_keypair_large_key
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsTests.test_create_keypair_only
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsTests.test_create_keypair_save_both
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsTests.test_create_keypair_save_pub_only
  • Create Keypairs Cleanup tests
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsCleanupTests.test_create_keypair_exist_files_delete
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsCleanupTests.test_create_keypair_exist_files_keep
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsCleanupTests.test_create_keypair_gen_files_delete_1
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsCleanupTests.test_create_keypair_gen_files_delete_2
    • snaps.openstack.tests.create_keypairs_tests.CreateKeypairsCleanupTests.test_create_keypair_gen_files_keep
  • Create Network Success tests
    • snaps.openstack.tests.create_network_tests.CreateNetworkSuccessTests.test_create_delete_network
    • snaps.openstack.tests.create_network_tests.CreateNetworkSuccessTests.test_create_network_router_admin_user_to_new_project
    • snaps.openstack.tests.create_network_tests.CreateNetworkSuccessTests.test_create_network_router_new_user_to_admin_project
    • snaps.openstack.tests.create_network_tests.CreateNetworkSuccessTests.test_create_network_with_router
    • snaps.openstack.tests.create_network_tests.CreateNetworkSuccessTests.test_create_network_without_router
    • snaps.openstack.tests.create_network_tests.CreateNetworkSuccessTests.test_create_networks_same_name
  • Create Router Success tests
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_delete_router
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_admin_state_True
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_admin_state_false
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_admin_user_to_new_project
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_external_network
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_new_user_as_admin_project
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_private_network
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_vanilla
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_router_with_ext_port
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_with_internal_sub
    • snaps.openstack.tests.create_router_tests.CreateRouterSuccessTests.test_create_with_invalid_internal_sub
  • Create Router Negative tests
    • snaps.openstack.tests.create_router_tests.CreateRouterNegativeTests.test_create_router_admin_ports
    • snaps.openstack.tests.create_router_tests.CreateRouterNegativeTests.test_create_router_invalid_gateway_name
    • snaps.openstack.tests.create_router_tests.CreateRouterNegativeTests.test_create_router_noname
  • Create QoS tests
    • snaps.openstack.tests.create_qos_tests.CreateQoSTests.test_create_delete_qos
    • snaps.openstack.tests.create_qos_tests.CreateQoSTests.test_create_qos
    • snaps.openstack.tests.create_qos_tests.CreateQoSTests.test_create_same_qos
  • Create Simple Volume Success tests
    • snaps.openstack.tests.create_volume_tests.CreateSimpleVolumeSuccessTests.test_create_delete_volume
    • snaps.openstack.tests.create_volume_tests.CreateSimpleVolumeSuccessTests.test_create_same_volume
    • snaps.openstack.tests.create_volume_tests.CreateSimpleVolumeSuccessTests.test_create_volume_simple
  • Create Simple Volume Failure tests
    • snaps.openstack.tests.create_volume_tests.CreateSimpleVolumeFailureTests.test_create_volume_bad_image
    • snaps.openstack.tests.create_volume_tests.CreateSimpleVolumeFailureTests.test_create_volume_bad_size
    • snaps.openstack.tests.create_volume_tests.CreateSimpleVolumeFailureTests.test_create_volume_bad_type
  • Create Volume With Type tests
    • snaps.openstack.tests.create_volume_tests.CreateVolumeWithTypeTests.test_bad_volume_type
    • snaps.openstack.tests.create_volume_tests.CreateVolumeWithTypeTests.test_valid_volume_type
  • Create Volume With Image tests
    • snaps.openstack.tests.create_volume_tests.CreateVolumeWithImageTests.test_bad_image_name
    • snaps.openstack.tests.create_volume_tests.CreateVolumeWithImageTests.test_valid_volume_image
  • Create Simple Volume Type Success tests
    • snaps.openstack.tests.create_volume_type_tests.CreateSimpleVolumeTypeSuccessTests.test_create_delete_volume_type
    • snaps.openstack.tests.create_volume_type_tests.CreateSimpleVolumeTypeSuccessTests.test_create_same_volume_type
    • snaps.openstack.tests.create_volume_type_tests.CreateSimpleVolumeTypeSuccessTests.test_create_volume_type
  • Create Volume Type Complex tests
    • snaps.openstack.tests.create_volume_type_tests.CreateVolumeTypeComplexTests.test_volume_type_with_encryption
    • snaps.openstack.tests.create_volume_type_tests.CreateVolumeTypeComplexTests.test_volume_type_with_qos
    • snaps.openstack.tests.create_volume_type_tests.CreateVolumeTypeComplexTests.test_volume_type_with_qos_and_encryption
  • Simple Health Check
    • snaps.openstack.tests.create_instance_tests.SimpleHealthCheck.test_check_vm_ip_dhcp
  • Create Instance Two Net tests
    • snaps.openstack.tests.create_instance_tests.CreateInstanceTwoNetTests.test_ping_via_router
  • Create Instance Simple tests
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSimpleTests.test_create_admin_instance
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSimpleTests.test_create_delete_instance
  • Create Instance Port Manipulation tests
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_allowed_address_pairs
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_allowed_address_pairs_bad_ip
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_allowed_address_pairs_bad_mac
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_custom_invalid_ip_one_subnet
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_custom_invalid_mac
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_custom_mac_and_ip
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_custom_valid_ip_one_subnet
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_custom_valid_mac
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_one_port_two_ip_one_subnet
    • snaps.openstack.tests.create_instance_tests.CreateInstancePortManipulationTests.test_set_one_port_two_ip_two_subnets
  • Instance Security Group tests
    • snaps.openstack.tests.create_instance_tests.InstanceSecurityGroupTests.test_add_invalid_security_group
    • snaps.openstack.tests.create_instance_tests.InstanceSecurityGroupTests.test_add_same_security_group
    • snaps.openstack.tests.create_instance_tests.InstanceSecurityGroupTests.test_add_security_group
    • snaps.openstack.tests.create_instance_tests.InstanceSecurityGroupTests.test_remove_security_group
    • snaps.openstack.tests.create_instance_tests.InstanceSecurityGroupTests.test_remove_security_group_never_added
  • Create Instance On Compute Host
    • snaps.openstack.tests.create_instance_tests.CreateInstanceOnComputeHost.test_deploy_vm_to_each_compute_node
  • Create Instance From Three Part Image
    • snaps.openstack.tests.create_instance_tests.CreateInstanceFromThreePartImage.test_create_instance_from_three_part_image
  • Create Instance Volume tests
    • snaps.openstack.tests.create_instance_tests.CreateInstanceVolumeTests.test_create_instance_with_one_volume
    • snaps.openstack.tests.create_instance_tests.CreateInstanceVolumeTests.test_create_instance_with_two_volumes
  • Create Instance Single Network tests
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSingleNetworkTests.test_single_port_static
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSingleNetworkTests.test_ssh_client_fip_after_active
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSingleNetworkTests.test_ssh_client_fip_after_init
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSingleNetworkTests.test_ssh_client_fip_after_reboot
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSingleNetworkTests.test_ssh_client_fip_before_active
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSingleNetworkTests.test_ssh_client_fip_reverse_engineer
    • snaps.openstack.tests.create_instance_tests.CreateInstanceSingleNetworkTests.test_ssh_client_fip_second_creator
  • Create Stack Success tests
    • snaps.openstack.tests.create_stack_tests.CreateStackSuccessTests.test_create_delete_stack
    • snaps.openstack.tests.create_stack_tests.CreateStackSuccessTests.test_create_same_stack
    • snaps.openstack.tests.create_stack_tests.CreateStackSuccessTests.test_create_stack_short_timeout
    • snaps.openstack.tests.create_stack_tests.CreateStackSuccessTests.test_create_stack_template_dict
    • snaps.openstack.tests.create_stack_tests.CreateStackSuccessTests.test_create_stack_template_file
    • snaps.openstack.tests.create_stack_tests.CreateStackSuccessTests.test_retrieve_network_creators
    • snaps.openstack.tests.create_stack_tests.CreateStackSuccessTests.test_retrieve_vm_inst_creators
  • Create Stack Volume tests
    • snaps.openstack.tests.create_stack_tests.CreateStackVolumeTests.test_retrieve_volume_creator
    • snaps.openstack.tests.create_stack_tests.CreateStackVolumeTests.test_retrieve_volume_type_creator
  • Create Stack Flavor tests
    • snaps.openstack.tests.create_stack_tests.CreateStackFlavorTests.test_retrieve_flavor_creator
  • Create Stack Keypair tests
    • snaps.openstack.tests.create_stack_tests.CreateStackKeypairTests.test_retrieve_keypair_creator
  • Create Stack Security Group tests
    • snaps.openstack.tests.create_stack_tests.CreateStackSecurityGroupTests.test_retrieve_security_group_creator
  • Create Stack Negative tests
    • snaps.openstack.tests.create_stack_tests.CreateStackNegativeTests.test_bad_stack_file
    • snaps.openstack.tests.create_stack_tests.CreateStackNegativeTests.test_missing_dependencies
  • Create Security Group tests
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_add_rule
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_create_delete_group
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_create_group_admin_user_to_new_project
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_create_group_new_user_to_admin_project
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_create_group_with_one_complex_rule
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_create_group_with_one_simple_rule
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_create_group_with_several_rules
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_create_group_without_rules
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_remove_rule_by_id
    • snaps.openstack.tests.create_security_group_tests.CreateSecurityGroupTests.test_remove_rule_by_setting

Floating IP and Ansible provisioning:

  • Create Stack Floating tests
    • snaps.openstack.tests.create_stack_tests.CreateStackFloatingIpTests.test_connect_via_ssh_heat_vm
    • snaps.openstack.tests.create_stack_tests.CreateStackFloatingIpTests.test_connect_via_ssh_heat_vm_derived
  • Ansible Provisioning tests
    • snaps.provisioning.tests.ansible_utils_tests.AnsibleProvisioningTests.test_apply_simple_playbook
    • snaps.provisioning.tests.ansible_utils_tests.AnsibleProvisioningTests.test_apply_template_playbook
5.1.4. Stress Test Specification
5.1.4.1. Scope

The stress test involves testing and verifying the ability of the SUT to withstand stress and other challenging factors. The main purpose of the testing is to make sure the SUT is able to absorb failures while providing an acceptable level of service.

5.1.4.2. References

This test area references the following specifications, definitions and reviews:

5.1.4.3. Definitions and Abbreviations

The following terms and abbreviations are used in conjunction with this test area:

  • iff - if and only if
  • NFVI - Network Functions Virtualization Infrastructure
  • NaN - Not a Number
  • Service Level - Measurable terms that describe the quality of the service provided by the SUT within a given time period
  • SUT - System Under Test
  • VM - Virtual Machine
5.1.4.4. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM in operation on a Pharos compliant infrastructure.

5.1.4.5. Test Area Structure

According to the testing goals stated in the test scope section, a preceding test will not affect a subsequent test as long as the SUT is able to sustain the given stress while providing an acceptable level of service. Any FAIL result from a single test case causes the SUT to fail the whole test.

5.1.4.6. Test Descriptions
5.1.4.6.1. Test Case 1 - Concurrent capacity based on life-cycle ping test
5.1.4.6.1.1. Short name

dovetail.stress.ping

5.1.4.6.1.2. Use case specification

This test case verifies the ability of the SUT to concurrently set up VM pairs for different tenants (through different OpenStack related components) and to provide acceptable capacity under stressful conditions. The connectivity between the VMs in a tenant’s VM pair is validated through a ping test. A life-cycle event in this test case refers specifically to a VM-pair life-cycle consisting of spawning, pinging and destroying.

5.1.4.6.1.3. Test preconditions
  • heat_template_version: 2013-05-23
  • ElasticSearch Port: 9200
  • LogStash Port: 5044
  • Kibana Port: 5601
  • Yardstick Port: 5000
5.1.4.6.1.4. Basic test flow execution description and pass/fail criteria
5.1.4.6.1.4.1. Methodology for validating capacity of the SUT

Validating the capacity of the SUT based on the life-cycle ping test generally involves two subtests, the second of which provides secondary validation that the SUT furnishes users with reliable capacity without being overwhelmed.

Let N1, N2, N3 and P1 be certain preset numbers. In subtest 1, the SUT concurrently sets up N1 VM pairs, with each VM pair belonging to a different tenant. Then VM1 in each VM pair pings VM2 P1 times with P1 packets. The connectivity is validated iff VM1 successfully pings VM2 with all P1 packets. Subtest 1 is finished iff all the concurrent (N1) requests for creating VM pairs are fulfilled, with returned values indicating the statuses of the VM pair creations.

Subtest 2 is executed after subtest 1 as secondary validation of the capacity. It follows the same workflow as subtest 1 does to set up N2 VM pairs.

Let S1 and S2 be the numbers of VM pairs that are successfully created in subtest 1 and subtest 2, respectively. If min(S1,S2)>=N3, the SUT is considered PASS; otherwise, the SUT is marked FAIL.

Note that for subtest 1, if the number of successfully created VM pairs, i.e., S1, is smaller than N3, subtest 2 will not be executed and the SUT will be marked FAIL.

5.1.4.6.1.4.2. Test execution
  • Test action 1: Install the testing tools by pulling and running the Bottlenecks Docker container
  • Test action 2: Prepare the test by sourcing the OpenStack credential file, eliminating environment constraints (i.e., quota settings), setting up the Yardstick docker, and pulling and registering the OS images and VM flavor
  • Test action 3: Call Yardstick to concurrently create N1 VM pairs for N1 tenants
  • Test action 4: In each VM pair, VM1 pings VM2 P1 times with P1 packets while recording the number of successful pings
  • Test action 5: Mark the VM pairs with P1 successful pings as PASS and record the total number of PASS VM pairs as S1
  • Test action 6: Destroy all the VM pairs
  • Test action 7: If S1<N3, the SUT is marked with FAIL and the test returns. Otherwise go to Test action 8
  • Test action 8: Go to Test action 3 and run the test again to create N2 VM pairs, with PASS VM pairs counted as S2
  • Test action 9: If S2<N3, the SUT is marked with FAIL. Otherwise it is marked with PASS
5.1.4.6.1.4.3. Pass / fail criteria

A typical setting of (N1, N2, N3, P1) is (5, 5, 5, 10). This reference setting is derived from the results of OPNFV CI jobs and testing over commercial products.

The connectivity within a VM pair is validated iff:

  • VM1 successfully pings VM2 P1 times with P1 packets

The SUT is considered passing the test iff:

  • min(S1,S2)>=N3

Note that after each subtest, the program checks whether the number of successfully created VM pairs is smaller than N3. If so, the program returns and the SUT is marked with FAIL; the passing criterion is then equivalent to the expression above. When the test returns after subtest 1, S2 is denoted by NaN.
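The per-pair validation and the overall verdict described above can be sketched in Python. The function names are illustrative only; S2 is represented as NaN when subtest 2 is skipped, as described in the note:

```python
import math

def pair_passes(successful_pings: int, p1: int) -> bool:
    """A VM pair passes iff all P1 pings from VM1 to VM2 succeed."""
    return successful_pings == p1

def stress_verdict(s1: int, s2: float, n3: int) -> str:
    """Return the SUT verdict given the subtest results.

    s2 is NaN when subtest 2 was skipped because s1 < n3.
    """
    if s1 < n3:          # subtest 2 is never executed in this case
        return "FAIL"
    if math.isnan(s2):   # defensive: s2 must exist once s1 >= n3
        raise ValueError("subtest 2 result missing although s1 >= n3")
    return "PASS" if min(s1, s2) >= n3 else "FAIL"

# With the typical setting (N1, N2, N3, P1) = (5, 5, 5, 10):
print(stress_verdict(5, 5, 5))             # PASS
print(stress_verdict(4, float("nan"), 5))  # FAIL (subtest 2 skipped)
```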

5.1.4.6.1.5. Post conditions

N/A

5.1.5. Tempest Compute test specification
5.1.5.1. Scope

The Tempest Compute test area evaluates the ability of the System Under Test (SUT) to support dynamic network runtime operations through the life of a VNF. The tests in this test area will evaluate IPv4 network runtime operations functionality.

These runtime operations include:

  • Create, list and show flavors
  • Create and list security group rules
  • Create, delete and list security groups
  • Create, delete, show and list interfaces; attach and detach ports to servers
  • List server addresses
  • Individual version endpoints info works
  • Servers Test Boot From Volume
5.1.5.2. References

Security Groups:

  • create security group
  • delete security group

Networks:

  • create network
  • delete network

Routers and interface:

  • create router
  • update router
  • delete router
  • add interface to router

Subnets:

  • create subnet
  • update subnet
  • delete subnet

Servers:

  • create keypair
  • create server
  • delete server
  • add/assign floating IP
  • disassociate floating IP

Ports:

  • create port
  • update port
  • delete port

Floating IPs:

  • create floating IP
  • delete floating IP
5.1.5.3. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM in operation on a Pharos compliant infrastructure.

5.1.5.4. Test Area Structure

The test area is structured in individual tests as listed below. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links:

All these test cases are included in the test case dovetail.tempest.compute of the OVP test suite.

  • Flavor V2 test
    • tempest.api.compute.flavors.test_flavors.FlavorsV2TestJSON.test_get_flavor
    • tempest.api.compute.flavors.test_flavors.FlavorsV2TestJSON.test_list_flavors
  • Security Group Rules test
    • tempest.api.compute.security_groups.test_security_group_rules.SecurityGroupRulesTestJSON.test_security_group_rules_create
    • tempest.api.compute.security_groups.test_security_group_rules.SecurityGroupRulesTestJSON.test_security_group_rules_list
  • Security Groups test
    • tempest.api.compute.security_groups.test_security_groups.SecurityGroupsTestJSON.test_security_groups_create_list_delete
  • Attach Interfaces test
    • tempest.api.compute.servers.test_attach_interfaces.AttachInterfacesTestJSON.test_add_remove_fixed_ip
  • Server Addresses test
    • tempest.api.compute.servers.test_server_addresses.ServerAddressesTestJSON.test_list_server_addresses
    • tempest.api.compute.servers.test_server_addresses.ServerAddressesTestJSON.test_list_server_addresses_by_network
  • Test Versions
    • tempest.api.compute.test_versions.TestVersions.test_get_version_details
  • Servers Test Boot From Volume
    • tempest.api.compute.servers.test_create_server.ServersTestBootFromVolume.test_verify_server_details
    • tempest.api.compute.servers.test_create_server.ServersTestBootFromVolume.test_list_servers
  • Server Basic Operations test
    • tempest.scenario.test_server_basic_ops.TestServerBasicOps.test_server_basic_ops
5.1.6. Tempest Identity v3 test specification
5.1.6.1. Scope

The Tempest Identity v3 test area evaluates the ability of the System Under Test (SUT) to create, list, delete and verify users through the life of a VNF. The tests in this test area will evaluate IPv4 network runtime operations functionality.

These runtime operations may include creating, listing, verifying and deleting:

  • credentials
  • domains
  • endpoints
  • user groups
  • policies
  • regions
  • roles
  • services
  • identities
  • API versions
5.1.6.2. References

Identity API v3.0

5.1.6.3. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM in operation on a Pharos compliant infrastructure.

5.1.6.4. Test Area Structure

The test area is structured in individual tests as listed below. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links:

All these test cases are included in the test case dovetail.tempest.identity_v3 of the OVP test suite.

5.1.7. Tempest Image test specification
5.1.7.1. Scope

The Tempest Image test area tests the basic operations of Images of the System Under Test (SUT) through the life of a VNF. The tests in this test area will evaluate IPv4 network runtime operations functionality.

5.1.7.2. References

Image Service API v2

5.1.7.3. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM in operation on a Pharos compliant infrastructure.

5.1.7.4. Test Area Structure

The test area is structured in individual tests as listed below. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links:

All these test cases are included in the test case dovetail.tempest.image of the OVP test suite.

5.1.8. IPv6 test specification
5.1.8.1. Scope

The IPv6 test area will evaluate the ability of the SUT to support IPv6 Tenant Network features and functionality. The tests in this test area will evaluate:

  • network, subnet, port, router API CRUD operations
  • interface add and remove operations
  • security group and security group rule API CRUD operations
  • IPv6 address assignment with dual stack, dual net, multiprefix in mode DHCPv6 stateless or SLAAC
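As background for the SLAAC mode listed above: under SLAAC, a VM derives its IPv6 address from the advertised /64 prefix and its vNIC MAC address via the modified EUI-64 interface identifier. A minimal sketch using Python's standard `ipaddress` module (the prefix and MAC below are arbitrary example values):

```python
import ipaddress

def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Derive a SLAAC address from a /64 prefix and a MAC address
    using the modified EUI-64 interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    octets = [int(b, 16) for b in mac.split(":")]
    # Insert ff:fe in the middle of the MAC and flip the
    # universal/local bit of the first octet.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    eui64[0] ^= 0x02
    iid = int.from_bytes(bytes(eui64), "big")
    return net[iid]

# Example: prefix 2001:db8::/64 with MAC 52:54:00:12:34:56
print(slaac_address("2001:db8::/64", "52:54:00:12:34:56"))
# → 2001:db8::5054:ff:fe12:3456
```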
5.1.8.2. References
5.1.8.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area:

  • API - Application Programming Interface
  • CIDR - Classless Inter-Domain Routing
  • CRUD - Create, Read, Update, and Delete
  • DHCP - Dynamic Host Configuration Protocol
  • DHCPv6 - Dynamic Host Configuration Protocol version 6
  • ICMP - Internet Control Message Protocol
  • NFVI - Network Functions Virtualization Infrastructure
  • NIC - Network Interface Controller
  • RA - Router Advertisements
  • radvd - The Router Advertisement Daemon
  • SDN - Software Defined Network
  • SLAAC - Stateless Address Auto Configuration
  • TCP - Transmission Control Protocol
  • UDP - User Datagram Protocol
  • VM - Virtual Machine
  • vNIC - virtual Network Interface Card
5.1.8.4. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM deployed with a Pharos compliant infrastructure.

5.1.8.5. Test Area Structure

The test area is structured based on network, port and subnet operations. Each test case is able to run independently, i.e., irrespective of the state created by a previous test.

5.1.8.6. Test Descriptions
5.1.8.6.1. API Used and Reference

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • show network details
  • update network
  • delete network
  • list networks
  • create network
  • bulk create networks

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • list subnets
  • create subnet
  • bulk create subnet
  • show subnet details
  • update subnet
  • delete subnet

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • list routers
  • create router
  • show router details
  • update router
  • delete router
  • add interface to router
  • remove interface from router

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • show port details
  • update port
  • delete port
  • list port
  • create port
  • bulk create ports

Security groups: https://developer.openstack.org/api-ref/networking/v2/index.html#security-groups-security-groups

  • list security groups
  • create security groups
  • show security group
  • update security group
  • delete security group

Security groups rules: https://developer.openstack.org/api-ref/networking/v2/index.html#security-group-rules-security-group-rules

  • list security group rules
  • create security group rule
  • show security group rule
  • delete security group rule

Servers: https://developer.openstack.org/api-ref/compute/

  • list servers
  • create server
  • create multiple servers
  • list servers detailed
  • show server details
  • update server
  • delete server

All IPv6 API and scenario test cases addressed in OVP are covered in the following test specification documents.

5.1.8.6.1.1. Test Case 1 - Create and Delete Bulk Network, IPv6 Subnet and Port
5.1.8.6.1.1.1. Short name

dovetail.tempest.ipv6_api.bulk_network_subnet_port_create_delete

5.1.8.6.1.1.2. Use case specification

This test case evaluates the SUT API’s ability to create and delete multiple networks, IPv6 subnets and ports in one request. The references are:

tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_network
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_subnet
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_port

5.1.8.6.1.1.3. Test preconditions

None

5.1.8.6.1.1.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.1.4.1. Test execution
  • Test action 1: Create 2 networks using bulk create, storing the “id” parameters returned in the response
  • Test action 2: List all networks, verifying the two network ids are found in the list
  • Test assertion 1: The two “id” parameters are found in the network list
  • Test action 3: Delete the 2 created networks using the stored network ids
  • Test action 4: List all networks, verifying the network ids are no longer present
  • Test assertion 2: The two “id” parameters are not present in the network list
  • Test action 5: Create 2 networks using bulk create, storing the “id” parameters returned in the response
  • Test action 6: Create an IPv6 subnet on each of the two networks using bulk create commands, storing the associated “id” parameters
  • Test action 7: List all subnets, verify the IPv6 subnets are found in the list
  • Test assertion 3: The two IPv6 subnet “id” parameters are found in the subnet list
  • Test action 8: Delete the 2 IPv6 subnets using the stored “id” parameters
  • Test action 9: List all subnets, verify the IPv6 subnets are no longer present in the list
  • Test assertion 4: The two IPv6 subnet “id” parameters are not present in the list
  • Test action 10: Delete the 2 networks created in test action 5, using the stored network ids
  • Test action 11: List all networks, verifying the network ids are no longer present
  • Test assertion 5: The two “id” parameters are not present in the network list
  • Test action 12: Create 2 networks using bulk create, storing the “id” parameters returned in the response
  • Test action 13: Create a port on each of the two networks using bulk create commands, storing the associated “port_id” parameters
  • Test action 14: List all ports, verify the port_ids are found in the list
  • Test assertion 6: The two “port_id” parameters are found in the ports list
  • Test action 15: Delete the 2 ports using the stored “port_id” parameters
  • Test action 16: List all ports, verify port_ids are no longer present in the list
  • Test assertion 7: The two “port_id” parameters are not present in the list
  • Test action 17: Delete the 2 networks created in test action 12, using the stored network ids
  • Test action 18: List all networks, verifying the network ids are no longer present
  • Test assertion 8: The two “id” parameters are not present in the network list
5.1.8.6.1.1.4.2. Pass / fail criteria

This test evaluates the ability to use bulk create commands to create networks, IPv6 subnets and ports on the SUT API. Specifically it verifies that:

  • Bulk network create commands return valid “id” parameters which are reported in the list commands
  • Bulk IPv6 subnet commands return valid “id” parameters which are reported in the list commands
  • Bulk port commands return valid “port_id” parameters which are reported in the list commands
  • All items created using bulk create commands are able to be removed using the returned identifiers
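The create/list/verify/delete pattern behind these assertions can be sketched with a minimal in-memory stand-in; `FakeNetworkClient` is a hypothetical stub for illustration, not the tempest or OpenStack client:

```python
import uuid

class FakeNetworkClient:
    """In-memory stand-in for the bulk network API, for illustration only."""
    def __init__(self):
        self._networks = {}

    def bulk_create_networks(self, names):
        # Create all networks in one call and return their ids.
        ids = [str(uuid.uuid4()) for _ in names]
        for nid, name in zip(ids, names):
            self._networks[nid] = {"id": nid, "name": name}
        return ids

    def list_networks(self):
        return list(self._networks)

    def delete_network(self, nid):
        del self._networks[nid]

client = FakeNetworkClient()
ids = client.bulk_create_networks(["net-a", "net-b"])
# Assertion pattern from the test: created ids must appear in the list ...
assert all(i in client.list_networks() for i in ids)
for i in ids:
    client.delete_network(i)
# ... and must be gone after deletion.
assert not any(i in client.list_networks() for i in ids)
print("bulk create/delete verified")
```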
5.1.8.6.1.1.5. Post conditions

N/A

5.1.8.6.1.2. Test Case 2 - Create, Update and Delete an IPv6 Network and Subnet
5.1.8.6.1.2.1. Short name

dovetail.tempest.ipv6_api.network_subnet_create_update_delete

5.1.8.6.1.2.2. Use case specification

This test case evaluates the SUT API’s ability to create, update and delete a network and an IPv6 subnet within the network. The reference is:

tempest.api.network.test_networks.NetworksIpV6Test.test_create_update_delete_network_subnet

5.1.8.6.1.2.3. Test preconditions

None

5.1.8.6.1.2.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.2.4.1. Test execution
  • Test action 1: Create a network, storing the “id” and “status” parameters returned in the response
  • Test action 2: Verify the value of the created network’s “status” is ACTIVE
  • Test assertion 1: The created network’s “status” is ACTIVE
  • Test action 3: Update this network with a new_name
  • Test action 4: Verify the network’s name equals the new_name
  • Test assertion 2: The network’s name equals the new_name after the name update
  • Test action 5: Create an IPv6 subnet within the network, storing the “id” parameters returned in the response
  • Test action 6: Update this IPv6 subnet with a new_name
  • Test action 7: Verify the IPv6 subnet’s name equals the new_name
  • Test assertion 3: The IPv6 subnet’s name equals the new_name after the name update
  • Test action 8: Delete the IPv6 subnet created in test action 5, using the stored subnet id
  • Test action 9: List all subnets, verifying the subnet id is no longer present
  • Test assertion 4: The IPv6 subnet “id” is not present in the subnet list
  • Test action 10: Delete the network created in test action 1, using the stored network id
  • Test action 11: List all networks, verifying the network id is no longer present
  • Test assertion 5: The network “id” is not present in the network list
5.1.8.6.1.2.4.2. Pass / fail criteria

This test evaluates the ability to create, update, delete network, IPv6 subnet on the SUT API. Specifically it verifies that:

  • Create network commands return ACTIVE “status” parameters which are reported in the list commands
  • Update network commands return updated “name” parameters which equal the “name” used
  • Update subnet commands return updated “name” parameters which equal the “name” used
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.2.5. Post conditions

None

5.1.8.6.1.3. Test Case 3 - Check External Network Visibility
5.1.8.6.1.3.1. Short name

dovetail.tempest.ipv6_api.external_network_visibility

5.1.8.6.1.3.2. Use case specification

This test case verifies that a user can see external networks but not their subnets. The reference is:

tempest.api.network.test_networks.NetworksIpV6Test.test_external_network_visibility

5.1.8.6.1.3.3. Test preconditions
  1. The SUT has at least one external network.
  2. In the external network list, there is no network without an external router, i.e., all networks in this list have an external router.
  3. There is one external network with the configured public network id, and there is no subnet on this network.
5.1.8.6.1.3.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.3.4.1. Test execution
  • Test action 1: List all networks with an external router, storing the “id” parameters returned in the response
  • Test action 2: Verify list in test action 1 is not empty
  • Test assertion 1: The network with external router list is not empty
  • Test action 3: List all networks without an external router within the test action 1 list
  • Test action 4: Verify list in test action 3 is empty
  • Test assertion 2: The list of networks without an external router within the external network list is empty
  • Test action 5: Verify the configured public network id is found in test action 1 stored “id”s
  • Test assertion 3: the public network id is found in the external network “id”s
  • Test action 6: List the subnets of the external network with the configured public network id
  • Test action 7: Verify list in test action 6 is empty
  • Test assertion 4: There is no subnet of the external network with the configured public network id
5.1.8.6.1.3.4.2. Pass / fail criteria

This test evaluates the ability to use list commands to list the external networks and the pre-configured public network. Specifically it verifies that:

  • Network list commands to find visible networks with external router
  • Network list commands to find visible network with pre-configured public network id
  • Subnet list commands to find no subnet on the pre-configured public network
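The filtering performed in test actions 1-4 amounts to selecting networks by their `router:external` attribute. A minimal sketch over plain dictionaries (the network records and public network id are illustrative examples):

```python
# Illustrative network records as a networking API might return them.
networks = [
    {"id": "pub-net", "router:external": True, "subnets": []},
    {"id": "tenant-net", "router:external": False, "subnets": ["s1"]},
]

# Test actions 1-2: networks with an external router, list must be non-empty.
external = [n for n in networks if n["router:external"]]
assert external, "assertion 1: external network list is not empty"

# Test actions 3-4: no non-external networks inside that list.
non_external_in_list = [n for n in external if not n["router:external"]]
assert not non_external_in_list, "assertion 2: only external networks in the list"

# Test actions 5-7: configured public network id is present and has no subnets.
public_network_id = "pub-net"  # the configured public network id (example)
pub = next(n for n in external if n["id"] == public_network_id)
assert not pub["subnets"], "assertion 4: no subnets on the public network"
print("external network visibility checks passed")
```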
5.1.8.6.1.3.5. Post conditions

None

5.1.8.6.1.4. Test Case 4 - List IPv6 Networks and Subnets
5.1.8.6.1.4.1. Short name

dovetail.tempest.ipv6_api.network_subnet_list

5.1.8.6.1.4.2. Use case specification

This test case evaluates the SUT API’s ability to list networks and subnets after creating a network and an IPv6 subnet. The references are:

tempest.api.network.test_networks.NetworksIpV6Test.test_list_networks
tempest.api.network.test_networks.NetworksIpV6Test.test_list_subnets

5.1.8.6.1.4.3. Test preconditions

None

5.1.8.6.1.4.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.4.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: List all networks, verifying the network id is found in the list
  • Test assertion 1: The “id” parameter is found in the network list
  • Test action 3: Create an IPv6 subnet of the network created in test action 1, storing the “id” parameter returned in the response
  • Test action 4: List all subnets of this network, verifying the IPv6 subnet id is found in the list
  • Test assertion 2: The “id” parameter is found in the IPv6 subnet list
  • Test action 5: Delete the IPv6 subnet using the stored “id” parameters
  • Test action 6: List all subnets, verify subnet_id is no longer present in the list
  • Test assertion 3: The IPv6 subnet “id” parameter is not present in list
  • Test action 7: Delete the network created in test action 1, using the stored network ids
  • Test action 8: List all networks, verifying the network id is no longer present
  • Test assertion 4: The network “id” parameter is not present in the network list
5.1.8.6.1.4.4.2. Pass / fail criteria

This test evaluates the ability to use create commands to create a network and an IPv6 subnet, and list commands to list them, via the SUT API. Specifically it verifies that:

  • Create commands to create a network and an IPv6 subnet
  • List commands to find the created network and IPv6 subnet in the full network and subnet lists
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.4.5. Post conditions

None

5.1.8.6.1.5. Test Case 5 - Show Details of an IPv6 Network and Subnet
5.1.8.6.1.5.1. Short name

dovetail.tempest.ipv6_api.network_subnet_show

5.1.8.6.1.5.2. Use case specification

This test case evaluates the SUT API’s ability to show network and subnet details. The references are:

tempest.api.network.test_networks.NetworksIpV6Test.test_show_network
tempest.api.network.test_networks.NetworksIpV6Test.test_show_subnet

5.1.8.6.1.5.3. Test preconditions

None

5.1.8.6.1.5.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.5.4.1. Test execution
  • Test action 1: Create a network, storing the “id” and “name” parameter returned in the response
  • Test action 2: Show the network id and name, verifying the network id and name equal to the “id” and “name” stored in test action 1
  • Test assertion 1: The id and name equal to the “id” and “name” stored in test action 1
  • Test action 3: Create an IPv6 subnet of the network, storing the “id” and CIDR parameter returned in the response
  • Test action 4: Show the details of the created IPv6 subnet, verifying the id and CIDR in the details are equal to the stored id and CIDR in test action 3.
  • Test assertion 2: The “id” and CIDR in show details equal to “id” and CIDR stored in test action 3
  • Test action 5: Delete the IPv6 subnet using the stored “id” parameter
  • Test action 6: List all subnets on the network, verify the IPv6 subnet id is no longer present in the list
  • Test assertion 3: The IPv6 subnet “id” parameter is not present in list
  • Test action 7: Delete the network created in test action 1, using the stored network id
  • Test action 8: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list
5.1.8.6.1.5.4.2. Pass / fail criteria

This test evaluates the ability to use create commands to create network, IPv6 subnet and show commands to show network, IPv6 subnet details on the SUT API. Specifically it verifies that:

  • Network show commands return correct “id” and “name” parameter which equal to the returned response in the create commands
  • IPv6 subnet show commands return correct “id” and CIDR parameter which equal to the returned response in the create commands
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.5.5. Post conditions

None

5.1.8.6.1.6. Test Case 6 - Create an IPv6 Port in Allowed Allocation Pools
5.1.8.6.1.6.1. Short name

dovetail.tempest.ipv6_api.port_create_in_allocation_pool

5.1.8.6.1.6.2. Use case specification

This test case evaluates the SUT API’s ability to create an IPv6 subnet within an allowed IPv6 address allocation pool and to create a port whose address is in the range of the pool. The reference is:

tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_in_allowed_allocation_pools

5.1.8.6.1.6.3. Test preconditions

There should be an IPv6 CIDR configuration whose prefixlen is less than 126.

5.1.8.6.1.6.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.6.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Check the allocation pools configuration, verifying the prefixlen of the IPv6 CIDR configuration is less than 126.
  • Test assertion 1: The prefixlen of the IPv6 CIDR configuration is less than 126
  • Test action 3: Get the allocation pool by setting the start_ip and end_ip based on the IPv6 CIDR configuration.
  • Test action 4: Create an IPv6 subnet of the network within the allocation pools, storing the “id” parameter returned in the response
  • Test action 5: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 6: Verify the port’s IP address is in the range of the allocation pool obtained in test action 3
  • Test assertion 2: The port’s IP address is in the range of the allocation pool
  • Test action 7: Delete the port using the stored “id” parameter
  • Test action 8: List all ports, verify the port id is no longer present in the list
  • Test assertion 3: The port “id” parameter is not present in list
  • Test action 9: Delete the IPv6 subnet using the stored “id” parameter
  • Test action 10: List all subnets on the network, verify the IPv6 subnet id is no longer present in the list
  • Test assertion 4: The IPv6 subnet “id” parameter is not present in list
  • Test action 11: Delete the network created in test action 1, using the stored network id
  • Test action 12: List all networks, verifying the network id is no longer present
  • Test assertion 5: The “id” parameter is not present in the network list
5.1.8.6.1.6.4.2. Pass / fail criteria

This test evaluates the ability to use create commands to create an IPv6 subnet within allowed IPv6 address allocation pool and create a port whose address is in the range of the pool. Specifically it verifies that:

  • IPv6 subnet create command to create an IPv6 subnet within allowed IPv6 address allocation pool
  • Port create command to create a port whose IP address is in the range of the allocation pool
  • All items created using create commands are able to be removed using the returned identifiers
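The precondition and the pool-membership check can be illustrated with Python's standard `ipaddress` module; the CIDR, pool bounds and port addresses below are example values only:

```python
import ipaddress

cidr = ipaddress.IPv6Network("2001:db8:0:2::/64")
assert cidr.prefixlen < 126  # test precondition

# Example allocation pool: from the third address up to the
# second-to-last address of the subnet.
start_ip = cidr[2]
end_ip = cidr[-2]

def in_pool(addr: str) -> bool:
    """Check whether a port's fixed IP lies within the allocation pool."""
    ip = ipaddress.IPv6Address(addr)
    return start_ip <= ip <= end_ip

print(in_pool("2001:db8:0:2::42"))   # True: inside the pool
print(in_pool("2001:db8:0:3::42"))   # False: outside the subnet
```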
5.1.8.6.1.6.5. Post conditions

None

5.1.8.6.1.7. Test Case 7 - Create an IPv6 Port with Empty Security Groups
5.1.8.6.1.7.1. Short name

dovetail.tempest.ipv6_api.port_create_empty_security_group

5.1.8.6.1.7.2. Use case specification

This test case evaluates the SUT API’s ability to create a port with an empty security group. The reference is:

tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_with_no_securitygroups

5.1.8.6.1.7.3. Test preconditions

None

5.1.8.6.1.7.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.7.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create an IPv6 subnet of the network, storing the “id” parameter returned in the response
  • Test action 3: Create a port of the network with an empty security group, storing the “id” parameter returned in the response
  • Test action 4: Verify the security group of the port is not none but is empty
  • Test assertion 1: the security group of the port is not none but is empty
  • Test action 5: Delete the port using the stored “id” parameter
  • Test action 6: List all ports, verify the port id is no longer present in the list
  • Test assertion 2: The port “id” parameter is not present in list
  • Test action 7: Delete the IPv6 subnet using the stored “id” parameter
  • Test action 8: List all subnets on the network, verify the IPv6 subnet id is no longer present in the list
  • Test assertion 3: The IPv6 subnet “id” parameter is not present in list
  • Test action 9: Delete the network created in test action 1, using the stored network id
  • Test action 10: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list
5.1.8.6.1.7.4.2. Pass / fail criteria

This test evaluates the ability to use create commands to create port with empty security group of the SUT API. Specifically it verifies that:

  • Port create commands to create a port with an empty security group
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.7.5. Post conditions

None

5.1.8.6.1.8. Test Case 8 - Create, Update and Delete an IPv6 Port
5.1.8.6.1.8.1. Short name

dovetail.tempest.ipv6_api.port_create_update_delete

5.1.8.6.1.8.2. Use case specification

This test case evaluates the SUT API’s ability to create, update and delete an IPv6 port. The reference is:

tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_update_delete_port

5.1.8.6.1.8.3. Test preconditions

None

5.1.8.6.1.8.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.8.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create a port of the network, storing the “id” and “admin_state_up” parameters returned in the response
  • Test action 3: Verify the value of port’s ‘admin_state_up’ is True
  • Test assertion 1: the value of port’s ‘admin_state_up’ is True after creating
  • Test action 4: Update the port’s name with a new_name and set port’s admin_state_up to False, storing the name and admin_state_up parameters returned in the response
  • Test action 5: Verify the stored port’s name equals to new_name and the port’s admin_state_up is False.
  • Test assertion 2: the stored port’s name equals to new_name and the port’s admin_state_up is False
  • Test action 6: Delete the port using the stored “id” parameter
  • Test action 7: List all ports, verify the port is no longer present in the list
  • Test assertion 3: The port “id” parameter is not present in list
  • Test action 8: Delete the network created in test action 1, using the stored network id
  • Test action 9: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list
5.1.8.6.1.8.4.2. Pass / fail criteria

This test evaluates the ability of the SUT API to create, update and delete a port. Specifically it verifies that:

  • Port create commands return ‘admin_state_up’ as True in the response
  • Port update commands can update ‘name’ to new_name and ‘admin_state_up’ to False
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.8.5. Post conditions

None
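
The create/update/delete lifecycle above can be sketched with openstacksdk (a sketch under the assumption that `conn` is an authenticated connection to the SUT; `is_admin_state_up` is openstacksdk's attribute for Neutron's `admin_state_up`, which defaults to True on create):

```python
def verify_port_lifecycle(conn, network_id):
    """Test case 8 sketch: create, update, then delete a port."""
    port = conn.network.create_port(network_id=network_id)
    # Test assertion 1: Neutron enables a new port by default.
    assert port.is_admin_state_up is True
    port = conn.network.update_port(port, name="new_name",
                                    is_admin_state_up=False)
    # Test assertion 2: the updated values are returned.
    assert port.name == "new_name"
    assert port.is_admin_state_up is False
    conn.network.delete_port(port)
    # Test assertion 3: the port id is no longer listed.
    assert port.id not in [p.id for p in conn.network.ports()]
```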

5.1.8.6.1.9. Test Case 9 - List IPv6 Ports
5.1.8.6.1.9.1. Short name

dovetail.tempest.ipv6_api.port_list

5.1.8.6.1.9.2. Use case specification

This test case evaluates the SUT ability of creating a port on a network and finding the port in the list of all ports, the reference is,

tempest.api.network.test_ports.PortsIpV6TestJSON.test_list_ports

5.1.8.6.1.9.3. Test preconditions

None

5.1.8.6.1.9.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.9.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 3: List all ports, verify the port id is found in the list
  • Test assertion 1: The “id” parameter is found in the port list
  • Test action 4: Delete the port using the stored “id” parameter
  • Test action 5: List all ports, verify the port is no longer present in the list
  • Test assertion 2: The port “id” parameter is not present in list
  • Test action 6: Delete the network created in test action 1, using the stored network id
  • Test action 7: List all networks, verifying the network id is no longer present
  • Test assertion 3: The “id” parameter is not present in the network list
5.1.8.6.1.9.4.2. Pass / fail criteria

This test evaluates the ability to use list commands to list the networks and ports on the SUT API. Specifically it verifies that:

  • Port list command to list all ports, the created port is found in the list.
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.9.5. Post conditions

None

5.1.8.6.1.10. Test Case 10 - Show Key/Value Details of an IPv6 Port
5.1.8.6.1.10.1. Short name

dovetail.tempest.ipv6_api.port_show_details

5.1.8.6.1.10.2. Use case specification

This test case evaluates the SUT ability of showing the port details, where the values in the details should equal the values used to create the port, the reference is,

tempest.api.network.test_ports.PortsIpV6TestJSON.test_show_port

5.1.8.6.1.10.3. Test preconditions

None

5.1.8.6.1.10.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.10.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 3: Show the details of the port, verify the stored port’s id in test action 2 exists in the details
  • Test assertion 1: The “id” parameter is found in the port shown details
  • Test action 4: Verify the values in the details of the port are the same as the values to create the port
  • Test assertion 2: The values in the details of the port are the same as the values to create the port
  • Test action 5: Delete the port using the stored “id” parameter
  • Test action 6: List all ports, verify the port is no longer present in the list
  • Test assertion 3: The port “id” parameter is not present in list
  • Test action 7: Delete the network created in test action 1, using the stored network id
  • Test action 8: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list
5.1.8.6.1.10.4.2. Pass / fail criteria

This test evaluates the ability to use show commands to show port details on the SUT API. Specifically it verifies that:

  • Port show commands to show the details of the port, whose id is present in the details
  • Port show commands to show the details of the port, whose values equal the values used to create the port
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.10.5. Post conditions

None
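
The show-and-compare flow above can be sketched with openstacksdk (a sketch, assuming `conn` is an authenticated connection to the SUT; the port name is a hypothetical example value):

```python
def verify_port_show(conn, network_id):
    """Test case 10 sketch: show a port and compare with creation values."""
    created = conn.network.create_port(network_id=network_id,
                                       name="tc10-port")
    shown = conn.network.get_port(created.id)
    # Test assertion 1: the created id appears in the shown details.
    assert shown.id == created.id
    # Test assertion 2: shown values match the values used at creation.
    assert shown.name == created.name
    assert shown.network_id == network_id
    conn.network.delete_port(created)
    # Test assertion 3: the port id is no longer listed.
    assert created.id not in [p.id for p in conn.network.ports()]
```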

5.1.8.6.1.11. Test Case 11 - Add Multiple Interfaces for an IPv6 Router
5.1.8.6.1.11.1. Short name

dovetail.tempest.ipv6_api.router_add_multiple_interface

5.1.8.6.1.11.2. Use case specification

This test case evaluates the SUT ability of adding multiple interfaces to a router, the reference is,

tempest.api.network.test_routers.RoutersIpV6Test.test_add_multiple_router_interfaces

5.1.8.6.1.11.3. Test preconditions

None

5.1.8.6.1.11.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.11.4.1. Test execution
  • Test action 1: Create 2 networks named network01 and network02 sequentially, storing the “id” parameters returned in the response
  • Test action 2: Create an IPv6 subnet01 in network01, an IPv6 subnet02 in network02 sequentially, storing the “id” parameters returned in the response
  • Test action 3: Create a router, storing the “id” parameter returned in the response
  • Test action 4: Create interface01 with subnet01 and the router
  • Test action 5: Verify the router_id stored in test action 3 equals to the interface01’s ‘device_id’ and subnet01_id stored in test action 2 equals to the interface01’s ‘subnet_id’
  • Test assertion 1: the router_id equals to the interface01’s ‘device_id’ and subnet01_id equals to the interface01’s ‘subnet_id’
  • Test action 6: Create interface02 with subnet02 and the router
  • Test action 7: Verify the router_id stored in test action 3 equals to the interface02’s ‘device_id’ and subnet02_id stored in test action 2 equals to the interface02’s ‘subnet_id’
  • Test assertion 2: the router_id equals to the interface02’s ‘device_id’ and subnet02_id equals to the interface02’s ‘subnet_id’
  • Test action 8: Delete the interfaces, router, IPv6 subnets and networks, then list all interfaces, ports, IPv6 subnets and networks; the test passes if the deleted ones are not found in the lists.
  • Test assertion 3: The interfaces, router, IPv6 subnets and networks ids are not present in the lists after deleting
5.1.8.6.1.11.4.2. Pass / fail criteria

This test evaluates the ability to add multiple interfaces to a router on the SUT API. Specifically it verifies that:

  • Interface create commands to create an interface with an IPv6 subnet and a router; the interface’s ‘device_id’ and ‘subnet_id’ should equal the router id and the IPv6 subnet id, respectively.
  • Interface create commands to create multiple interfaces with the same router and multiple IPv6 subnets.
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.11.5. Post conditions

None
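
The multi-interface flow above can be sketched with openstacksdk. This is a sketch under the assumptions that `conn` is an authenticated connection to the SUT and that `add_interface_to_router` returns the interface information as a mapping holding ‘subnet_id’ and ‘port_id’:

```python
def verify_router_multi_interfaces(conn, router, subnet01_id, subnet02_id):
    """Test case 11 sketch: attach two IPv6 subnets to one router."""
    for subnet_id in (subnet01_id, subnet02_id):
        iface = conn.network.add_interface_to_router(router,
                                                     subnet_id=subnet_id)
        # Test assertions 1/2: the interface references the subnet,
        # and its port belongs to the router (device_id check).
        assert iface["subnet_id"] == subnet_id
        port = conn.network.get_port(iface["port_id"])
        assert port.device_id == router.id
        # Cleanup follows the same delete-then-list pattern as the
        # other test cases.
        conn.network.remove_interface_from_router(router,
                                                  subnet_id=subnet_id)
```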

5.1.8.6.1.12. Test Case 12 - Add and Remove an IPv6 Router Interface with port_id
5.1.8.6.1.12.1. Short name

dovetail.tempest.ipv6_api.router_interface_add_remove_with_port

5.1.8.6.1.12.2. Use case specification

This test case evaluates the SUT ability of adding and removing a router interface to a port. The subnet_id and port_id of the interface are checked, and the port’s device_id is checked for equality with the router_id. The reference is,

tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_port_id

5.1.8.6.1.12.3. Test preconditions

None

5.1.8.6.1.12.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.12.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create an IPv6 subnet of the network, storing the “id” parameter returned in the response
  • Test action 3: Create a router, storing the “id” parameter returned in the response
  • Test action 4: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 5: Add router interface to the port created, storing the “id” parameter returned in the response
  • Test action 6: Verify the interface’s keys include ‘subnet_id’ and ‘port_id’
  • Test assertion 1: the interface’s keys include ‘subnet_id’ and ‘port_id’
  • Test action 7: Show the port details, verify the ‘device_id’ in port details equals to the router id stored in test action 3
  • Test assertion 2: ‘device_id’ in port details equals to the router id
  • Test action 8: Delete the interface, port, router, subnet and network, then list all interfaces, ports, routers, subnets and networks, the test passes if the deleted ones are not found in the list.
  • Test assertion 3: interfaces, ports, routers, subnets and networks are not found in the lists after deleting
5.1.8.6.1.12.4.2. Pass / fail criteria

This test evaluates the ability to use add/remove commands to add/remove router interface to the port, show commands to show port details on the SUT API. Specifically it verifies that:

  • Router_interface add commands to add router interface to a port, the interface’s keys should include ‘subnet_id’ and ‘port_id’
  • Port show commands to show ‘device_id’ in port details, which should be equal to the router id
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.12.5. Post conditions

None

5.1.8.6.1.13. Test Case 13 - Add and Remove an IPv6 Router Interface with subnet_id
5.1.8.6.1.13.1. Short name

dovetail.tempest.ipv6_api.router_interface_add_remove

5.1.8.6.1.13.2. Use case specification

This test case evaluates the SUT API ability of adding and removing a router interface with the IPv6 subnet id, the reference is

tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_subnet_id

5.1.8.6.1.13.3. Test preconditions

None

5.1.8.6.1.13.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.13.4.1. Test execution
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create an IPv6 subnet with the network created, storing the “id” parameter returned in the response
  • Test action 3: Create a router, storing the “id” parameter returned in the response
  • Test action 4: Add a router interface with the stored ids of the router and IPv6 subnet
  • Test assertion 1: Key ‘subnet_id’ is included in the added interface’s keys
  • Test assertion 2: Key ‘port_id’ is included in the added interface’s keys
  • Test action 5: Show the port info with the stored interface’s port id
  • Test assertion 3: The stored router id is equal to the device id shown in the port info
  • Test action 6: Delete the router interface created in test action 4, using the stored subnet id
  • Test action 7: List all router interfaces, verifying the router interface is no longer present
  • Test assertion 4: The router interface with the stored subnet id is not present in the router interface list
  • Test action 8: Delete the router created in test action 3, using the stored router id
  • Test action 9: List all routers, verifying the router id is no longer present
  • Test assertion 5: The router “id” parameter is not present in the router list
  • Test action 10: Delete the subnet created in test action 2, using the stored subnet id
  • Test action 11: List all subnets, verifying the subnet id is no longer present
  • Test assertion 6: The subnet “id” parameter is not present in the subnet list
  • Test action 12: Delete the network created in test action 1, using the stored network id
  • Test action 13: List all networks, verifying the network id is no longer present
  • Test assertion 7: The network “id” parameter is not present in the network list
5.1.8.6.1.13.4.2. Pass / fail criteria

This test evaluates the ability to add and remove router interface with the subnet id on the SUT API. Specifically it verifies that:

  • Router interface add command returns valid ‘subnet_id’ parameter which is reported in the interface’s keys
  • Router interface add command returns valid ‘port_id’ parameter which is reported in the interface’s keys
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.13.5. Post conditions

None

5.1.8.6.1.14. Test Case 14 - Create, Show, List, Update and Delete an IPv6 router
5.1.8.6.1.14.1. Short name

dovetail.tempest.ipv6_api.router_create_show_list_update_delete

5.1.8.6.1.14.2. Use case specification

This test case evaluates the SUT API ability of creating, showing, listing, updating and deleting routers, the reference is

tempest.api.network.test_routers.RoutersIpV6Test.test_create_show_list_update_delete_router

5.1.8.6.1.14.3. Test preconditions

There should exist an OpenStack external network.

5.1.8.6.1.14.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.14.4.1. Test execution
  • Test action 1: Create a router, set the admin_state_up to be False and external_network_id to be public network id, storing the “id” parameter returned in the response
  • Test assertion 1: The created router’s admin_state_up is False
  • Test assertion 2: The created router’s external network id equals to the public network id
  • Test action 2: Show details of the router created in test action 1, using the stored router id
  • Test assertion 3: The router’s name shown is the same as the router created
  • Test assertion 4: The router’s external network id shown is the same as the public network id
  • Test action 3: List all routers and verify the created router is in the response
  • Test assertion 5: The stored router id is in the router list
  • Test action 4: Update the name of router and verify if it is updated
  • Test assertion 6: The name of router equals to the name used to update in test action 4
  • Test action 5: Show the details of router, using the stored router id
  • Test assertion 7: The router’s name shown equals to the name used to update in test action 4
  • Test action 6: Delete the router created in test action 1, using the stored router id
  • Test action 7: List all routers, verifying the router id is no longer present
  • Test assertion 8: The “id” parameter is not present in the router list
5.1.8.6.1.14.4.2. Pass / fail criteria

This test evaluates the ability to create, show, list, update and delete router on the SUT API. Specifically it verifies that:

  • Router create command returns valid “admin_state_up” and “id” parameters in the response
  • Router show command returns a valid “name” parameter which equals the name of the created router
  • Router show command returns a valid external network id which equals the public network id
  • Router list command returns a valid “id” parameter which equals the stored router “id”
  • Router update command returns an updated “name” parameter which equals the “name” used to update
  • Router created using create command is able to be removed using the returned identifiers
5.1.8.6.1.14.5. Post conditions

None
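
The router lifecycle above can be sketched with openstacksdk (a sketch, assuming `conn` is an authenticated connection to the SUT and `public_net_id` is the id of the pre-existing external network; the router names are hypothetical example values):

```python
def verify_router_lifecycle(conn, public_net_id):
    """Test case 14 sketch: create, show, list, update, delete a router."""
    router = conn.network.create_router(
        name="tc14-router", is_admin_state_up=False,
        external_gateway_info={"network_id": public_net_id})
    # Test assertions 1/2: created with the requested state and gateway.
    assert router.is_admin_state_up is False
    assert router.external_gateway_info["network_id"] == public_net_id
    # Test assertion 5: the router appears in the router list.
    assert router.id in [r.id for r in conn.network.routers()]
    router = conn.network.update_router(router, name="tc14-renamed")
    # Test assertions 6/7: the name change is visible on show.
    assert conn.network.get_router(router.id).name == "tc14-renamed"
    conn.network.delete_router(router)
    # Test assertion 8: the router id is no longer listed.
    assert router.id not in [r.id for r in conn.network.routers()]
```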

5.1.8.6.1.15. Test Case 15 - Create, List, Update, Show and Delete an IPv6 security group
5.1.8.6.1.15.1. Short name

dovetail.tempest.ipv6_api.security_group_create_list_update_show_delete

5.1.8.6.1.15.2. Use case specification

This test case evaluates the SUT API ability of creating, listing, updating, showing and deleting security groups, the reference is

tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_list_update_show_delete_security_group

5.1.8.6.1.15.3. Test preconditions

None

5.1.8.6.1.15.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.15.4.1. Test execution
  • Test action 1: Create a security group, storing the “id” parameter returned in the response
  • Test action 2: List all security groups and verify the created security group is present in the response
  • Test assertion 1: The created security group’s “id” is found in the list
  • Test action 3: Update the name and description of this security group, using the stored id
  • Test action 4: Verify if the security group’s name and description are updated
  • Test assertion 2: The security group’s name equals to the name used in test action 3
  • Test assertion 3: The security group’s description equals to the description used in test action 3
  • Test action 5: Show details of the updated security group, using the stored id
  • Test assertion 4: The security group’s name shown equals to the name used in test action 3
  • Test assertion 5: The security group’s description shown equals to the description used in test action 3
  • Test action 6: Delete the security group created in test action 1, using the stored id
  • Test action 7: List all security groups, verifying the security group’s id is no longer present
  • Test assertion 6: The “id” parameter is not present in the security group list
5.1.8.6.1.15.4.2. Pass / fail criteria

This test evaluates the ability to create, list, update, show and delete security groups on the SUT API. Specifically it verifies that:

  • Security group create commands return valid “id” parameter which is reported in the list commands
  • Security group update commands return valid “name” and “description” parameters which are reported in the show commands
  • Security group created using create command is able to be removed using the returned identifiers
5.1.8.6.1.15.5. Post conditions

None
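
The security group lifecycle above can be sketched with openstacksdk (a sketch, assuming `conn` is an authenticated connection to the SUT; the names and descriptions are hypothetical example values):

```python
def verify_secgroup_lifecycle(conn):
    """Test case 15 sketch: create, list, update, show, delete a group."""
    sg = conn.network.create_security_group(name="tc15-sg",
                                            description="before")
    # Test assertion 1: the new group shows up in the list.
    assert sg.id in [g.id for g in conn.network.security_groups()]
    conn.network.update_security_group(sg, name="tc15-new",
                                       description="after")
    shown = conn.network.get_security_group(sg.id)
    # Test assertions 2-5: the update is visible on show.
    assert shown.name == "tc15-new" and shown.description == "after"
    conn.network.delete_security_group(sg)
    # Test assertion 6: the group id is no longer listed.
    assert sg.id not in [g.id for g in conn.network.security_groups()]
```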

5.1.8.6.1.16. Test Case 16 - Create, Show and Delete IPv6 security group rule
5.1.8.6.1.16.1. Short name

dovetail.tempest.ipv6_api.security_group_rule_create_show_delete

5.1.8.6.1.16.2. Use case specification

This test case evaluates the SUT API ability of creating, showing, listing and deleting security group rules, the reference is

tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_show_delete_security_group_rule

5.1.8.6.1.16.3. Test preconditions

None

5.1.8.6.1.16.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.16.4.1. Test execution
  • Test action 1: Create a security group, storing the “id” parameter returned in the response
  • Test action 2: Create a rule of the security group with protocol tcp, udp and icmp, respectively, using the stored security group’s id, storing the “id” parameter returned in the response
  • Test action 3: Show details of the created security group rule, using the stored id of the security group rule
  • Test assertion 1: All the created security group rule’s values equal to the rule values shown in test action 3
  • Test action 4: List all security group rules
  • Test assertion 2: The stored security group rule’s id is found in the list
  • Test action 5: Delete the security group rule, using the stored security group rule’s id
  • Test action 6: List all security group rules, verifying the security group rule’s id is no longer present
  • Test assertion 3: The security group rule “id” parameter is not present in the list
  • Test action 7: Delete the security group, using the stored security group’s id
  • Test action 8: List all security groups, verifying the security group’s id is no longer present
  • Test assertion 4: The security group “id” parameter is not present in the list
5.1.8.6.1.16.4.2. Pass / fail criteria

This test evaluates the ability to create, show, list and delete security group rules on the SUT API. Specifically it verifies that:

  • Security group rule create command returns valid values which are reported in the show command
  • Security group rule created using create command is able to be removed using the returned identifiers
5.1.8.6.1.16.5. Post conditions

None
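
The rule flow above can be sketched with openstacksdk (a sketch, assuming `conn` is an authenticated connection to the SUT and `sg_id` is a stored security group id; `ether_type` is openstacksdk's attribute for Neutron's `ethertype` field):

```python
def verify_secgroup_rules(conn, sg_id):
    """Test case 16 sketch: create, show, list, delete IPv6 rules."""
    for proto in ("tcp", "udp", "icmp"):
        rule = conn.network.create_security_group_rule(
            security_group_id=sg_id, direction="ingress",
            ether_type="IPv6", protocol=proto)
        # Test assertion 1: shown values equal the created values.
        shown = conn.network.get_security_group_rule(rule.id)
        assert shown.protocol == proto and shown.ether_type == "IPv6"
        # Test assertion 2: the rule id appears in the list.
        assert rule.id in [r.id for r in
                           conn.network.security_group_rules()]
        conn.network.delete_security_group_rule(rule)
        # Test assertion 3: the rule id is no longer listed.
        assert rule.id not in [r.id for r in
                               conn.network.security_group_rules()]
```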

5.1.8.6.1.17. Test Case 17 - List IPv6 Security Groups
5.1.8.6.1.17.1. Short name

dovetail.tempest.ipv6_api.security_group_list

5.1.8.6.1.17.2. Use case specification

This test case evaluates the SUT API ability of listing security groups, the reference is

tempest.api.network.test_security_groups.SecGroupIPv6Test.test_list_security_groups

5.1.8.6.1.17.3. Test preconditions

There should exist a default security group.

5.1.8.6.1.17.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.17.4.1. Test execution
  • Test action 1: List all security groups
  • Test action 2: Verify the default security group exists in the list, the test passes if the default security group exists
  • Test assertion 1: The default security group is in the list
5.1.8.6.1.17.4.2. Pass / fail criteria

This test evaluates the ability to list security groups on the SUT API. Specifically it verifies that:

  • Security group list command returns valid security groups which include the default security group
5.1.8.6.1.17.5. Post conditions

None

5.1.8.6.1.18. Test Case 1 - IPv6 Address Assignment - Dual Stack, SLAAC, DHCPv6 Stateless
5.1.8.6.1.18.1. Short name

dovetail.tempest.ipv6_scenario.dhcpv6_stateless

5.1.8.6.1.18.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, guest instance obtains IPv6 address from OpenStack managed radvd using SLAAC and optional info from dnsmasq using DHCPv6 stateless. This test case then verifies the ping6 available VM can ping the other VM’s v4 and v6 addresses as well as the v6 subnet’s gateway ip in the same network, the reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os

5.1.8.6.1.18.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.18.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.18.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create one IPv6 subnet of the network created in test action 1 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameter returned in the response
  • Test action 6: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete the IPv6 subnet created in test action 5, using the stored id
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list
5.1.8.6.1.18.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verify the ping6 available VM can ping the other VM’s v4 and v6 addresses as well as the v6 subnet’s gateway ip in the same network. Specifically it verifies that:

  • The IPv6 addresses in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.18.5. Post conditions

None
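
The subnet created in test action 5 can be sketched with openstacksdk. Note that the Neutron API value for these modes uses a hyphen (‘dhcpv6-stateless’) even though the test names use an underscore; the CIDR below is an illustrative documentation prefix, and `conn` is assumed to be an authenticated connection to the SUT.

```python
def create_stateless_v6_subnet(conn, network_id, cidr="2001:db8::/64"):
    """Create an IPv6 subnet whose addresses come from SLAAC-style RAs,
    with optional info served by stateless DHCPv6."""
    return conn.network.create_subnet(
        network_id=network_id, ip_version=6, cidr=cidr,
        ipv6_ra_mode="dhcpv6-stateless",
        ipv6_address_mode="dhcpv6-stateless")
```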

5.1.8.6.1.19. Test Case 2 - IPv6 Address Assignment - Dual Net, Dual Stack, SLAAC, DHCPv6 Stateless
5.1.8.6.1.19.1. Short name

dovetail.tempest.ipv6_scenario.dualnet_dhcpv6_stateless

5.1.8.6.1.19.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, guest instance obtains IPv6 address from OpenStack managed radvd using SLAAC and optional info from dnsmasq using DHCPv6 stateless. This test case then verifies the ping6 available VM can ping the other VM’s v4 address in one network and v6 address in another network as well as the v6 subnet’s gateway ip, the reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os

5.1.8.6.1.19.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.19.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.19.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create one IPv6 subnet of network created in test action 5 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameter returned in the response
  • Test action 7: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The 1st vNIC of each VM gets one v4 address assigned and the 2nd vNIC of each VM gets one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete the IPv6 subnet created in test action 6, using the stored id
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list
5.1.8.6.1.19.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verify the ping6 available VM can ping the other VM’s v4 address in one network and v6 address in another network as well as the v6 subnet’s gateway ip. Specifically it verifies that:

  • The IPv6 addresses in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 address in one network and IPv6 address in another network as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.19.5. Post conditions

None

5.1.8.6.1.20. Test Case 3 - IPv6 Address Assignment - Multiple Prefixes, Dual Stack, SLAAC, DHCPv6 Stateless
5.1.8.6.1.20.1. Short name

dovetail.tempest.ipv6_scenario.multiple_prefixes_dhcpv6_stateless

5.1.8.6.1.20.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, guest instance obtains IPv6 addresses from OpenStack managed radvd using SLAAC and optional info from dnsmasq using DHCPv6 stateless. This test case then verifies the ping6 available VM can ping the other VM’s one v4 address and two v6 addresses with different prefixes as well as the v6 subnets’ gateway ips in the same network, the reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_dhcpv6_stateless

5.1.8.6.1.20.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.20.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.20.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create two IPv6 subnets of the network created in test action 1 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameters returned in the response
  • Test action 6: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete two IPv6 subnets created in test action 5, using the stored ids
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list
5.1.8.6.1.20.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verifies that the ping6-capable VM can ping the other VM’s v4 address and two v6 addresses with different prefixes, as well as the v6 subnets’ gateway ips, in the same network. Specifically, it verifies that:

  • IPv6 addresses with different prefixes in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.20.5. Post conditions

None
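The subnet creation in test actions 2 and 5 can be sketched as the request bodies sent to the Neutron API. This is a minimal illustration, not code from the test itself: `ipv6_ra_mode` and `ipv6_address_mode` are the real Neutron subnet attributes (the API spells the value `dhcpv6-stateless`, with hyphens), while the network id, names and CIDRs are hypothetical placeholders.

```python
NETWORK_ID = "NET1"  # placeholder for the "id" stored in test action 1

# Test action 2: one IPv4 subnet of the created network
ipv4_subnet = {
    "subnet": {
        "network_id": NETWORK_ID,
        "ip_version": 4,
        "cidr": "10.0.0.0/24",  # illustrative CIDR
    }
}

def make_v6_stateless_subnet(cidr):
    """Build a DHCPv6-stateless IPv6 subnet body (test action 5 creates
    two of these with different prefixes on the same network)."""
    return {
        "subnet": {
            "network_id": NETWORK_ID,
            "ip_version": 6,
            "cidr": cidr,
            "ipv6_ra_mode": "dhcpv6-stateless",
            "ipv6_address_mode": "dhcpv6-stateless",
        }
    }

v6_subnets = [make_v6_stateless_subnet(c)
              for c in ("2001:db8:1::/64", "2001:db8:2::/64")]

# The two IPv6 subnets share the network but use different prefixes,
# which is what test assertion 1 later checks on the booted VMs.
assert {s["subnet"]["cidr"] for s in v6_subnets} == {"2001:db8:1::/64",
                                                     "2001:db8:2::/64"}
```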

5.1.8.6.1.21. Test Case 4 - IPv6 Address Assignment - Dual Net, Multiple Prefixes, Dual Stack, SLAAC, DHCPv6 Stateless
5.1.8.6.1.21.1. Short name

dovetail.tempest.ipv6_scenario.dualnet_multiple_prefixes_dhcpv6_stateless

5.1.8.6.1.21.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, the guest instance obtains IPv6 addresses from the OpenStack-managed radvd using SLAAC and optional information from dnsmasq using DHCPv6 stateless. This test case then verifies that the ping6-capable VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network, as well as the v6 subnets’ gateway ips. The reference is:

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_dhcpv6_stateless

5.1.8.6.1.21.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.21.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.21.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create two IPv6 subnets of the network created in test action 5 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameters returned in the response
  • Test action 7: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6-capable VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete two IPv6 subnets created in test action 6, using the stored ids
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list
5.1.8.6.1.21.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verifies that the ping6-capable VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network, as well as the v6 subnets’ gateway ips. Specifically, it verifies that:

  • The IPv6 addresses in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.21.5. Post conditions

None

5.1.8.6.1.22. Test Case 5 - IPv6 Address Assignment - Dual Stack, SLAAC
5.1.8.6.1.22.1. Short name

dovetail.tempest.ipv6_scenario.slaac

5.1.8.6.1.22.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains an IPv6 address from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6-capable VM can ping the other VM’s v4 and v6 addresses, as well as the v6 subnet’s gateway ip, in the same network. The reference is:

tempest.scenario.test_network_v6.TestGettingAddress.test_slaac_from_os

5.1.8.6.1.22.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.22.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.22.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create one IPv6 subnet of the network created in test action 1 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameter returned in the response
  • Test action 6: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6-capable VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete the IPv6 subnet created in test action 5, using the stored id
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list
5.1.8.6.1.22.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6-capable VM can ping the other VM’s v4 and v6 addresses, as well as the v6 subnet’s gateway ip, in the same network. Specifically, it verifies that:

  • The IPv6 addresses in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.22.5. Post conditions

None
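Test assertion 1 above amounts to classifying the addresses reported on a VM’s vNIC and checking for exactly one v4 and one v6 address. A minimal sketch of that check, using Python’s standard `ipaddress` module; the address list is illustrative placeholder data standing in for what the test reads from the server’s interface:

```python
import ipaddress

# Placeholder for the addresses actually assigned to a VM's vNIC.
nic_addresses = ["10.0.0.5", "2001:db8:1::5"]

# Classify each address by IP version.
versions = [ipaddress.ip_address(a).version for a in nic_addresses]

assert versions.count(4) == 1, "expected exactly one IPv4 address"
assert versions.count(6) == 1, "expected exactly one IPv6 address"
```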

5.1.8.6.1.23. Test Case 6 - IPv6 Address Assignment - Dual Net, Dual Stack, SLAAC
5.1.8.6.1.23.1. Short name

dovetail.tempest.ipv6_scenario.dualnet_slaac

5.1.8.6.1.23.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains an IPv6 address from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6-capable VM can ping the other VM’s v4 address in one network and v6 address in another network, as well as the v6 subnet’s gateway ip. The reference is:

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_slaac_from_os

5.1.8.6.1.23.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.23.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.23.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create one IPv6 subnet of the network created in test action 5 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameter returned in the response
  • Test action 7: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The 1st vNIC of each VM gets one v4 address assigned and the 2nd vNIC of each VM gets one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6-capable VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete the IPv6 subnet created in test action 6, using the stored id
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list
5.1.8.6.1.23.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6-capable VM can ping the other VM’s v4 address in one network and v6 address in another network, as well as the v6 subnet’s gateway ip. Specifically, it verifies that:

  • The IPv6 addresses in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 address in one network and IPv6 address in another network as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.23.5. Post conditions

None

5.1.8.6.1.24. Test Case 7 - IPv6 Address Assignment - Multiple Prefixes, Dual Stack, SLAAC
5.1.8.6.1.24.1. Short name

dovetail.tempest.ipv6_scenario.multiple_prefixes_slaac

5.1.8.6.1.24.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains IPv6 addresses from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6-capable VM can ping the other VM’s v4 address and two v6 addresses with different prefixes, as well as the v6 subnets’ gateway ips, in the same network. The reference is:

tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac

5.1.8.6.1.24.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.24.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.24.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create two IPv6 subnets of the network created in test action 1 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameters returned in the response
  • Test action 6: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6-capable VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete two IPv6 subnets created in test action 5, using the stored ids
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list
5.1.8.6.1.24.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6-capable VM can ping the other VM’s v4 address and two v6 addresses with different prefixes, as well as the v6 subnets’ gateway ips, in the same network. Specifically, it verifies that:

  • IPv6 addresses with different prefixes in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.24.5. Post conditions

None
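The “different prefixes” part of test assertion 1 can be sketched as follows: the two IPv6 addresses on the vNIC must fall under different prefixes. A minimal check using the standard `ipaddress` module; the addresses and the /64 prefix length are illustrative assumptions, not values from the test:

```python
import ipaddress

# Placeholder v6 addresses as assigned to one VM's vNIC by the two subnets.
v6_addrs = ["2001:db8:1::5", "2001:db8:2::5"]

# Derive the enclosing network for each address, assuming /64 subnets.
prefixes = {ipaddress.ip_interface(a + "/64").network for a in v6_addrs}

# Two distinct networks means the addresses carry different prefixes.
assert len(prefixes) == 2, "the two v6 addresses must have different prefixes"
```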

5.1.8.6.1.25. Test Case 8 - IPv6 Address Assignment - Dual Net, Dual Stack, Multiple Prefixes, SLAAC
5.1.8.6.1.25.1. Short name

dovetail.tempest.ipv6_scenario.dualnet_multiple_prefixes_slaac

5.1.8.6.1.25.2. Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains IPv6 addresses from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6-capable VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network, as well as the v6 subnets’ gateway ips. The reference is:

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac

5.1.8.6.1.25.3. Test preconditions

There should exist a public router or a public network.

5.1.8.6.1.25.4. Basic test flow execution description and pass/fail criteria
5.1.8.6.1.25.4.1. Test execution
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create two IPv6 subnets of the network created in test action 5 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameters returned in the response
  • Test action 7: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6-capable VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete two IPv6 subnets created in test action 6, using the stored ids
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list
5.1.8.6.1.25.4.2. Pass / fail criteria

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6-capable VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network, as well as the v6 subnets’ gateway ips. Specifically, it verifies that:

  • The IPv6 addresses in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
5.1.8.6.1.25.5. Post conditions

None

5.1.9. VM Resource Scheduling on Multiple Nodes test specification
5.1.9.1. Scope

The VM resource scheduling test area evaluates the ability of the system under test to support VM resource scheduling on multiple nodes. The tests in this test area will evaluate capabilities to schedule VMs to multiple compute nodes directly with scheduler hints, and to create server groups with affinity and anti-affinity policies.

5.1.9.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
5.1.9.4. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.9.5. Test Area Structure

The test area is structured based on server group operations and server operations on multiple nodes. Each test case is able to run independently, i.e. irrelevant of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All these test cases are included in the test case dovetail.tempest.multi_node_scheduling of the OVP test suite.

5.1.9.6. Test Descriptions
5.1.9.6.1. API Used and Reference

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • delete router
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • show server
  • delete server
  • add/assign floating IP
  • create server group
  • delete server group
  • list server groups
  • show server group details

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • create port
  • delete port

Floating IPs: https://developer.openstack.org/api-ref/networking/v2/index.html#floating-ips-floatingips

  • create floating IP
  • delete floating IP

Availability zone: https://developer.openstack.org/api-ref/compute/

  • get availability zone
5.1.9.6.2. Test Case 1 - Schedule VM to compute nodes
5.1.9.6.2.1. Test case specification

tempest.scenario.test_server_multinode.TestServerMultinode.test_schedule_to_all_nodes

5.1.9.6.2.2. Test preconditions
  • At least 2 compute nodes
  • OpenStack Nova and Neutron services are available
  • One public network
5.1.9.6.2.3. Basic test flow execution description and pass/fail criteria
5.1.9.6.2.3.1. Test execution
  • Test action 1: Get all availability zones AZONES1 in the SUT
  • Test action 2: Get all compute nodes in AZONES1
  • Test action 3: Get the value of ‘min_compute_nodes’, which is set by the user in the tempest configuration file and specifies the minimum number of compute nodes expected
  • Test assertion 1: Verify that SUT has at least as many compute nodes as specified by the ‘min_compute_nodes’ threshold
  • Test action 4: Create one server for each compute node, up to the ‘min_compute_nodes’ threshold
  • Test assertion 2: Verify the number of servers matches the ‘min_compute_nodes’ threshold
  • Test action 5: Get every server’s ‘hostId’ and store them in a set which has no duplicate values
  • Test assertion 3: Verify that the length of the set equals the number of servers, to ensure that every server ended up on a different host
  • Test action 6: Delete the created servers
5.1.9.6.2.3.2. Pass / fail criteria

This test evaluates the functionality of VM resource scheduling. Specifically, the test verifies that:

  • VMs are scheduled to the requested compute nodes correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.
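The core check of this test (test action 5 and test assertion 3) can be sketched in a few lines: collect every server’s ‘hostId’ into a set and compare sizes. This is a simplified illustration with placeholder data, not the Tempest implementation itself; ‘hostId’ is the real field name returned by the Nova “show server” API.

```python
# Placeholder server records standing in for the Nova "show server" responses
# of the servers created in test action 4.
servers = [
    {"id": "vm-1", "hostId": "host-a"},
    {"id": "vm-2", "hostId": "host-b"},
]

# Test action 5: a set automatically drops duplicate hostId values.
host_ids = {s["hostId"] for s in servers}

# Test assertion 3: equal sizes means no two servers share a compute host.
assert len(host_ids) == len(servers), "two servers scheduled to the same host"
```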

5.1.9.6.2.4. Post conditions

N/A

5.1.9.6.3. Test Case 2 - Test create and delete multiple server groups with same name and policy
5.1.9.6.3.1. Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_multiple_server_groups_with_same_name_policy

5.1.9.6.3.2. Test preconditions

None

5.1.9.6.3.3. Basic test flow execution description and pass/fail criteria
5.1.9.6.3.3.1. Test execution
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity
  • Test action 3: Create another server group SERG2 with N1 and policy affinity
  • Test assertion 1: The names of SERG1 and SERG2 are the same
  • Test assertion 2: The ‘policies’ of SERG1 and SERG2 are the same
  • Test assertion 3: The ids of SERG1 and SERG2 are different
  • Test action 4: Delete SERG1 and SERG2
  • Test action 5: List all server groups
  • Test assertion 4: SERG1 and SERG2 are not in the list
5.1.9.6.3.3.2. Pass / fail criteria

This test evaluates the functionality of creating and deleting server groups with the same name and policy. Specifically, the test verifies that:

  • Server groups can be created with the same name and policy.
  • Server groups with the same name and policy can be deleted successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.
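The flow above can be simulated to make the assertions concrete: two server groups may share a name and policy but must receive distinct ids. In this sketch, `create_group` is a hypothetical stand-in for the Nova “create server group” call, not a real SDK function; ‘name’, ‘policies’ and ‘id’ are the real fields of a server group response.

```python
import uuid

def create_group(name, policy):
    """Stand-in for the Nova create-server-group call: each call yields a
    fresh id even when name and policy repeat."""
    return {"id": str(uuid.uuid4()), "name": name, "policies": [policy]}

n1 = "group-" + uuid.uuid4().hex[:8]   # test action 1: random name N1
serg1 = create_group(n1, "affinity")   # test action 2
serg2 = create_group(n1, "affinity")   # test action 3

assert serg1["name"] == serg2["name"]          # test assertion 1
assert serg1["policies"] == serg2["policies"]  # test assertion 2
assert serg1["id"] != serg2["id"]              # test assertion 3
```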

5.1.9.6.3.4. Post conditions

N/A

5.1.9.6.4. Test Case 3 - Test create and delete server group with affinity policy
5.1.9.6.4.1. Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_server_group_with_affinity_policy

5.1.9.6.4.2. Test preconditions

None

5.1.9.6.4.3. Basic test flow execution description and pass/fail criteria
5.1.9.6.4.3.1. Test execution
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity
  • Test assertion 1: The name of SERG1 returned in the response is the same as N1
  • Test assertion 2: The ‘policies’ of SERG1 returned in the response is affinity
  • Test action 3: Delete SERG1 and list all server groups
  • Test assertion 3: SERG1 is not in the list
5.1.9.6.4.3.2. Pass / fail criteria

This test evaluates the functionality of creating and deleting a server group with affinity policy. Specifically, the test verifies that:

  • A server group can be created with affinity policy and deleted successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.9.6.4.4. Post conditions

N/A

5.1.9.6.5. Test Case 4 - Test create and delete server group with anti-affinity policy
5.1.9.6.5.1. Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_server_group_with_anti_affinity_policy

5.1.9.6.5.2. Test preconditions

None

5.1.9.6.5.3. Basic test flow execution description and pass/fail criteria
5.1.9.6.5.3.1. Test execution
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy anti-affinity
  • Test assertion 1: The name of SERG1 returned in the response is the same as N1
  • Test assertion 2: The ‘policies’ of SERG1 returned in the response is anti-affinity
  • Test action 3: Delete SERG1 and list all server groups
  • Test assertion 3: SERG1 is not in the list
5.1.9.6.5.3.2. Pass / fail criteria

This test evaluates the functionality of creating and deleting a server group with anti-affinity policy. Specifically, the test verifies that:

  • A server group can be created with anti-affinity policy and deleted successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.9.6.5.4. Post conditions

N/A

5.1.9.6.6. Test Case 5 - Test list server groups
5.1.9.6.6.1. Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_list_server_groups

5.1.9.6.6.2. Test preconditions

None

5.1.9.6.6.3. Basic test flow execution description and pass/fail criteria
5.1.9.6.6.3.1. Test execution
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity
  • Test action 3: List all server groups
  • Test assertion 1: SERG1 is in the list
  • Test action 4: Delete SERG1
5.1.9.6.6.3.2. Pass / fail criteria

This test evaluates the functionality of listing server groups. Specifically, the test verifies that:

  • Server groups can be listed successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.9.6.6.4. Post conditions

N/A

5.1.9.6.7. Test Case 6 - Test show server group details
5.1.9.6.7.1. Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_show_server_group

5.1.9.6.7.2. Test preconditions

None

5.1.9.6.7.3. Basic test flow execution description and pass/fail criteria
5.1.9.6.7.3.1. Test execution
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity, and store the details (D1) returned in the response
  • Test action 3: Show the details (D2) of SERG1
  • Test assertion 1: All values in D1 are the same as the values in D2
  • Test action 4: Delete SERG1
5.1.9.6.7.3.2. Pass / fail criteria

This test evaluates the functionality of showing server group details. Specifically, the test verifies that:

  • Server groups can be shown successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.9.6.7.4. Post conditions

N/A

5.1.10. Tempest Network API test specification
5.1.10.1. Scope

The Tempest Network API test area tests the basic operations of the System Under Test (SUT) through the life of a VNF. The tests in this test area will evaluate IPv4 network runtime operations functionality.

These runtime operations include creating, listing, verifying or deleting:

  • Floating IP
  • Network
  • Subnet
  • Port
  • External Network Visibility
  • Router
  • Subnetpools
  • API Version Resources
5.1.10.2. References

Networks:

  • create network
  • delete network

Routers and interface:

  • create router
  • update router
  • delete router
  • add interface to router

Subnets:

  • create subnet
  • update subnet
  • delete subnet

Subnetpools:

  • create subnetpool
  • update subnetpool
  • delete subnetpool

Ports:

  • create port
  • update port
  • delete port

Floating IPs:

  • create floating IP
  • delete floating IP

API Versions:

  • list version
  • show version
5.1.10.3. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.10.4. Test Area Structure

The test area is structured in individual tests as listed below. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links:

All these test cases are included in the test case dovetail.tempest.network of the OVP test suite.

List, Show and Verify the Details of the Available Extensions
  • tempest.api.network.test_extensions.ExtensionsTestJSON.test_list_show_extensions
Floating IP tests
  • Create a Floating IP
  • Update a Floating IP
  • Delete a Floating IP
  • List all Floating IPs
  • Show Floating IP Details
  • Associate a Floating IP with a Port and then Delete that Port
  • Associate a Floating IP with a Port and then with a Port on Another Router
  • tempest.api.network.test_floating_ips.FloatingIPTestJSON.test_create_floating_ip_specifying_a_fixed_ip_address
  • tempest.api.network.test_floating_ips.FloatingIPTestJSON.test_create_list_show_update_delete_floating_ip
Network tests
  • Bulk Network Creation & Deletion
  • Bulk Subnet Create & Deletion
  • Bulk Port Creation & Deletion
  • List Project’s Networks
  • tempest.api.network.test_networks.BulkNetworkOpsTest.test_bulk_create_delete_network
  • tempest.api.network.test_networks.BulkNetworkOpsTest.test_bulk_create_delete_port
  • tempest.api.network.test_networks.BulkNetworkOpsTest.test_bulk_create_delete_subnet
External Network Visibility test
  • tempest.api.network.test_networks.NetworksTest.test_external_network_visibility
Create Port with No Security Groups test
  • tempest.api.network.test_ports.PortsTestJSON.test_create_port_with_no_securitygroups
Router test
  • tempest.api.network.test_routers.RoutersTest.test_add_multiple_router_interfaces
  • tempest.api.network.test_routers.RoutersTest.test_add_remove_router_interface_with_port_id
  • tempest.api.network.test_routers.RoutersTest.test_add_remove_router_interface_with_subnet_id
  • tempest.api.network.test_routers.RoutersTest.test_create_show_list_update_delete_router
Create, List, Show, Update and Delete Subnetpools
  • tempest.api.network.test_subnetpools_extensions.SubnetPoolsTestJSON.test_create_list_show_update_delete_subnetpools
API Version Resources test
  • tempest.api.network.test_versions.NetworksApiDiscovery.test_api_version_resources
5.1.11. Tempest Network Scenario test specification
5.1.11.1. Scope

The Tempest Network scenario test area evaluates the ability of the system under test to support dynamic network runtime operations through the life of a VNF (e.g. attach/detach, enable/disable, read stats). The tests in this test area evaluate IPv4 network runtime operations functionality. These runtime operations include hot-plugging a network interface, detaching a floating IP from a VM, attaching a floating IP to a VM, updating a subnet’s DNS, updating a VM instance’s port admin state and updating the router admin state.

5.1.11.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • DNS - Domain Name System
  • ICMP - Internet Control Message Protocol
  • MAC - Media Access Control
  • NIC - Network Interface Controller
  • NFVi - Network Functions Virtualization infrastructure
  • SSH - Secure Shell
  • TCP - Transmission Control Protocol
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
5.1.11.4. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.11.5. Test Area Structure

The test area is structured based on dynamic network runtime operations. Each test case is able to run independently, i.e. regardless of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All of these test cases are included in the dovetail.tempest.network_scenario test case of the OVP test suite.

5.1.11.6. Test Descriptions
5.1.11.6.1. API Used and Reference

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • update router
  • delete router
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • update subnet
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • delete server
  • add/assign floating IP
  • disassociate floating IP

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • create port
  • update port
  • delete port

Floating IPs: https://developer.openstack.org/api-ref/networking/v2/index.html#floating-ips-floatingips

  • create floating IP
  • delete floating IP
5.1.11.6.2. Test Case 1 - Basic network operations
5.1.11.6.2.1. Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops

5.1.11.6.2.2. Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • OpenStack nova and neutron services are available
  • One public network
5.1.11.6.2.3. Basic test flow execution description and pass/fail criteria
5.1.11.6.2.3.1. Test execution
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 via FIP1 successfully
  • Test assertion 2: Ping the internal gateway from VM1 successfully
  • Test assertion 3: Ping the default gateway from VM1 using its floating IP FIP1 successfully
  • Test action 6: Detach FIP1 from VM1
  • Test assertion 4: VM1 becomes unreachable after FIP1 disassociated
  • Test action 7: Create a new server VM2 with NET1, and associate floating IP FIP1 to VM2
  • Test assertion 5: Ping FIP1 and SSH to VM2 via FIP1 successfully
  • Test action 8: Delete SG1, NET1, SUBNET1, R1, VM1, VM2 and FIP1
5.1.11.6.2.3.2. Pass / fail criteria

This test evaluates the functionality of basic network operations. Specifically, the test verifies that:

  • The Tempest host can ping the VM’s IP address. This implies, but does not guarantee (see the SSH check that follows), that the VM has been assigned the correct IP address and has connectivity to the Tempest host.
  • The Tempest host can perform key-based authentication to an SSH server hosted at the VM’s IP address. This check guarantees that the IP address is associated with the target VM.
  • The Tempest host can SSH into the VM via the IP address and successfully ping the internal gateway address, implying connectivity to another VM on the same network.
  • The Tempest host can SSH into the VM via the IP address and successfully ping the default gateway, implying external connectivity.
  • After the floating IP is detached from the VM, the VM becomes unreachable.
  • After the floating IP is associated with a new VM, the new VM becomes reachable.
  • The floating IP status is updated correctly after each change.

In order to pass this test, all test assertions listed in the test execution above need to pass.
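The floating IP reachability semantics asserted above can be summarised with a minimal model. This is an illustrative sketch only, not Neutron code; the `FloatingIP` class and `reachable` helper are hypothetical names introduced here for clarity.

```python
# Illustrative model (NOT Neutron code) of floating IP association and
# the reachability behaviour the assertions above rely on.

class FloatingIP:
    def __init__(self, address):
        self.address = address
        self.server = None      # associated server, if any
        self.status = "DOWN"    # Neutron reports ACTIVE / DOWN

    def associate(self, server):
        self.server = server
        self.status = "ACTIVE"

    def disassociate(self):
        self.server = None
        self.status = "DOWN"

def reachable(fip, server):
    """A server is reachable via a floating IP only while associated."""
    return fip.server is server and fip.status == "ACTIVE"

fip1 = FloatingIP("172.24.4.10")
fip1.associate("VM1")
assert reachable(fip1, "VM1")        # assertions 1-3: VM1 reachable via FIP1
fip1.disassociate()
assert not reachable(fip1, "VM1")    # assertion 4: unreachable after detach
fip1.associate("VM2")
assert reachable(fip1, "VM2")        # assertion 5: new VM reachable
```

The model also captures the last pass criterion: the floating IP status transitions with each association change.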

5.1.11.6.2.4. Post conditions

N/A

5.1.11.6.3. Test Case 2 - Hotplug network interface
5.1.11.6.3.1. Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_hotplug_nic

5.1.11.6.3.2. Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • Compute interface_attach feature is enabled
  • VM vnic_type is not set to ‘direct’ or ‘macvtap’
  • OpenStack nova and neutron services are available
  • One public network
5.1.11.6.3.3. Basic test flow execution description and pass/fail criteria
5.1.11.6.3.3.1. Test execution
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test action 6: Create a second neutron network NET2 and subnet SUBNET2, and attach VM1 to NET2
  • Test action 7: Get VM1’s ethernet interface NIC2 for NET2
  • Test assertion 2: Ping NET2’s internal gateway successfully
  • Test action 8: Delete SG1, NET1, NET2, SUBNET1, SUBNET2, R1, NIC2, VM1 and FIP1
5.1.11.6.3.3.2. Pass / fail criteria

This test evaluates the functionality of adding a network interface to an active VM. Specifically, the test verifies that:

  • A new network interface can be added to an existing VM successfully.
  • The Tempest host can SSH into the VM via the IP address and successfully ping the new network’s internal gateway address, implying connectivity to the new network.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.11.6.3.4. Post conditions

N/A

5.1.11.6.4. Test Case 3 - Update subnet’s configuration
5.1.11.6.4.1. Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details

5.1.11.6.4.2. Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • DHCP client is available
  • Tenant networks should be non-shared and isolated
  • OpenStack nova and neutron services are available
  • One public network
5.1.11.6.4.3. Basic test flow execution description and pass/fail criteria
5.1.11.6.4.3.1. Test execution
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface, configure SUBNET1 with dns nameserver ‘1.2.3.4’
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test assertion 2: Retrieve the VM1’s configured dns and verify it matches the one configured for SUBNET1
  • Test action 6: Update SUBNET1’s dns to ‘9.8.7.6’
  • Test assertion 3: After triggering the DHCP renew from the VM manually, retrieve the VM1’s configured dns and verify it has been successfully updated
  • Test action 7: Delete SG1, NET1, SUBNET1, R1, VM1 and FIP1
5.1.11.6.4.3.2. Pass / fail criteria

This test evaluates the functionality of updating a subnet’s configuration. Specifically, the test verifies that:

  • Updating the subnet’s DNS server configuration takes effect on the VMs attached to that subnet.

In order to pass this test, all test assertions listed in the test execution above need to pass.
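The DNS propagation behaviour checked by this test can be sketched as a toy model (illustrative only; not Neutron or dnsmasq code, and the `Subnet`/`VM` classes are hypothetical): the VM sees the subnet's updated nameserver only after it renews its DHCP lease, which is why the test triggers the renew manually.

```python
# Toy model (NOT Neutron/dnsmasq code) of subnet DNS propagation via
# DHCP renew, mirroring test assertions 2 and 3 above.

class Subnet:
    def __init__(self, dns):
        self.dns_nameservers = list(dns)

class VM:
    def __init__(self, subnet):
        self.subnet = subnet
        # resolv.conf is populated from the subnet's DNS at boot via DHCP
        self.resolv_conf = list(subnet.dns_nameservers)

    def dhcp_renew(self):
        """Re-read DNS configuration from the subnet, as a lease renew would."""
        self.resolv_conf = list(self.subnet.dns_nameservers)

subnet1 = Subnet(["1.2.3.4"])
vm1 = VM(subnet1)
assert vm1.resolv_conf == ["1.2.3.4"]       # assertion 2: DNS matches SUBNET1
subnet1.dns_nameservers = ["9.8.7.6"]       # test action 6: update the subnet
vm1.dhcp_renew()                            # manual DHCP renew, as in the test
assert vm1.resolv_conf == ["9.8.7.6"]       # assertion 3: VM picked up the change
```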

5.1.11.6.4.4. Post conditions

N/A

5.1.11.6.5. Test Case 4 - Update VM port admin state
5.1.11.6.5.1. Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_instance_port_admin_state

5.1.11.6.5.2. Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • Network port_admin_state_change feature is enabled
  • OpenStack nova and neutron services are available
  • One public network
5.1.11.6.5.3. Basic test flow execution description and pass/fail criteria
5.1.11.6.5.3.1. Test execution
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test action 6: Create a server VM2 with SG1 and NET1, and assign a floating IP FIP2 to VM2
  • Test action 7: Get a SSH client SSHCLNT1 to VM2
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test assertion 2: Ping FIP1 via SSHCLNT1 successfully
  • Test action 8: Update admin_state_up attribute of VM1 port to False
  • Test assertion 3: Ping FIP1 and SSH to VM1 with FIP1 failed
  • Test assertion 4: Ping FIP1 via SSHCLNT1 failed
  • Test action 9: Update admin_state_up attribute of VM1 port to True
  • Test assertion 5: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test assertion 6: Ping FIP1 via SSHCLNT1 successfully
  • Test action 10: Delete SG1, NET1, SUBNET1, R1, SSHCLNT1, VM1, VM2 and FIP1, FIP2
5.1.11.6.5.3.2. Pass / fail criteria

This test evaluates the VM’s public and project connectivity status by toggling the VM port’s admin_state_up between True and False. Specifically, the test verifies that:

  • Public and project connectivity is reachable before updating admin_state_up attribute of VM port to False.
  • Public and project connectivity is unreachable after updating admin_state_up attribute of VM port to False.
  • Public and project connectivity is reachable after updating admin_state_up attribute of VM port from False to True.

In order to pass this test, all test assertions listed in the test execution above need to pass.
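The admin-state gating exercised here (and by the router variant in Test Case 5) reduces to a simple rule, sketched below as an illustrative model (not Neutron code; `vm_reachable` is a hypothetical helper): a VM is externally reachable only while every hop on the path, the router and the VM's port, is administratively up.

```python
# Illustrative model (NOT Neutron code) of admin-state gating: external
# ping/SSH to a VM succeeds only while both the tenant router and the
# VM's port have admin_state_up == True.

def vm_reachable(router_admin_up, port_admin_up):
    """True if external ping/SSH to the VM can succeed."""
    return bool(router_admin_up and port_admin_up)

assert vm_reachable(True, True)          # assertions 1-2: reachable initially
assert not vm_reachable(True, False)     # assertions 3-4: port admin-down
assert vm_reachable(True, True)          # assertions 5-6: reachable again
assert not vm_reachable(False, True)     # Test Case 5: router admin-down
```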

5.1.11.6.5.4. Post conditions

N/A

5.1.11.6.6. Test Case 5 - Update router admin state
5.1.11.6.6.1. Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state

5.1.11.6.6.2. Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • Multi-tenant networks capabilities
  • OpenStack nova and neutron services are available
  • One public network
5.1.11.6.6.3. Basic test flow execution description and pass/fail criteria
5.1.11.6.6.3.1. Test execution
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test action 6: Update admin_state_up attribute of R1 to False
  • Test assertion 2: Ping FIP1 and SSH to VM1 with FIP1 failed
  • Test action 7: Update admin_state_up attribute of R1 to True
  • Test assertion 3: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test action 8: Delete SG1, NET1, SUBNET1, R1, VM1 and FIP1
5.1.11.6.6.3.2. Pass / fail criteria

This test evaluates the router’s public connectivity status by toggling the router’s admin_state_up between True and False. Specifically, the test verifies that:

  • Public connectivity is reachable before updating admin_state_up attribute of router to False.
  • Public connectivity is unreachable after updating admin_state_up attribute of router to False.
  • Public connectivity is reachable after updating admin_state_up attribute of router from False to True.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.11.6.6.4. Post conditions

N/A

5.1.12. Security Group and Port Security test specification
5.1.12.1. Scope

The security group and port security test area evaluates the ability of the system under test to support packet filtering by security group and port security. The tests in this test area evaluate preventing MAC spoofing via port security; basic security group operations, including cross-tenant and in-tenant traffic; multiple security groups; using port security to disable security groups; and updating security groups.

5.1.12.2. References

N/A

5.1.12.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • ICMP - Internet Control Message Protocol
  • MAC - Media Access Control
  • NFVi - Network Functions Virtualization infrastructure
  • SSH - Secure Shell
  • TCP - Transmission Control Protocol
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
5.1.12.4. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.12.5. Test Area Structure

The test area is structured based on the basic operations of security group and port security. Each test case is able to run independently, i.e. regardless of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All of these test cases are included in the dovetail.tempest.network_security test case of the OVP test suite.

5.1.12.6. Test Descriptions
5.1.12.6.1. API Used and Reference

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network
  • list networks
  • create floating ip
  • delete floating ip

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • delete router
  • list routers
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • list subnets
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • delete server
  • add/assign floating ip

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • update port
  • list ports
  • show port details
5.1.12.6.2. Test Case 1 - Port Security and MAC Spoofing
5.1.12.6.2.1. Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_port_security_macspoofing_port

5.1.12.6.2.2. Test preconditions
  • Neutron port-security extension API
  • Neutron security-group extension API
  • One public network
5.1.12.6.2.3. Basic test flow execution description and pass/fail criteria
5.1.12.6.2.3.1. Test execution
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 6: Verify can ping FIP1 successfully and can SSH to VM1 with FIP1
  • Test action 7: Create a second neutron network NET2 and subnet SUBNET2, and attach VM1 to NET2
  • Test action 8: Get VM1’s ethernet interface NIC2 for NET2
  • Test action 9: Create second server VM2 on NET2
  • Test action 10: Verify VM1 is able to communicate with VM2 via NIC2
  • Test action 11: Login to VM1 and spoof the MAC address of NIC2 to “00:00:00:00:00:01”
  • Test action 12: Verify VM1 fails to communicate with VM2 via NIC2
  • Test assertion 1: The ping operation fails
  • Test action 13: Update ‘security_groups’ to be none for VM1’s NIC2 port
  • Test action 14: Update ‘port_security_enable’ to be False for VM1’s NIC2 port
  • Test action 15: Verify now VM1 is able to communicate with VM2 via NIC2
  • Test assertion 2: The ping operation is successful
  • Test action 16: Delete SG1, NET1, NET2, SUBNET1, SUBNET2, R1, VM1, VM2 and FIP1
5.1.12.6.2.3.2. Pass / fail criteria

This test evaluates the ability to prevent MAC spoofing by using port security. Specifically, the test verifies that:

  • With port security enabled, ICMP packets from a MAC-spoofing server cannot pass the port.
  • With port security disabled, ICMP packets from a MAC-spoofing server can pass the port.

In order to pass this test, all test assertions listed in the test execution above need to pass.
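The MAC-spoofing check can be paraphrased as a toy filtering rule. The sketch below is illustrative only, not the actual Neutron/OVS firewall implementation, and `frame_passes` is a hypothetical helper: with port security enabled, a port forwards only frames whose source MAC matches the port's own MAC (or an allowed address pair); disabling port security lifts the check entirely.

```python
# Toy model (NOT Neutron/OVS firewall code) of port-security MAC
# filtering, mirroring test assertions 1 and 2 above.

def frame_passes(src_mac, port_mac, port_security_enabled,
                 allowed_pairs=()):
    """True if a frame with src_mac is forwarded through the port."""
    if not port_security_enabled:
        return True                     # no anti-spoofing without port security
    return src_mac == port_mac or src_mac in allowed_pairs

PORT_MAC = "fa:16:3e:11:22:33"
SPOOFED = "00:00:00:00:00:01"

assert frame_passes(PORT_MAC, PORT_MAC, True)     # legitimate traffic passes
assert not frame_passes(SPOOFED, PORT_MAC, True)  # assertion 1: spoof dropped
assert frame_passes(SPOOFED, PORT_MAC, False)     # assertion 2: passes when disabled
```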

5.1.12.6.2.4. Post conditions

N/A

5.1.12.6.3. Test Case 2 - Test Security Group Cross Tenant Traffic
5.1.12.6.3.1. Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_cross_tenant_traffic

5.1.12.6.3.2. Test preconditions
  • Neutron security-group extension API
  • Two tenants
  • One public network
5.1.12.6.3.3. Basic test flow execution description and pass/fail criteria
5.1.12.6.3.3.1. Test execution
  • Test action 1: Create a neutron network NET1 for primary tenant
  • Test action 2: Create a primary tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2 for primary tenant
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Repeat test action 1 to 6 and create NET2, R2, SUBNET2, SG3, SG4, FIP2 and VM2 for an alt_tenant
  • Test action 8: Verify VM1 fails to communicate with VM2 through FIP2
  • Test assertion 1: The ping operation fails
  • Test action 9: Add ICMP rule to SG4
  • Test action 10: Verify VM1 is able to communicate with VM2 through FIP2
  • Test assertion 2: The ping operation is successful
  • Test action 11: Verify VM2 fails to communicate with VM1 through FIP1
  • Test assertion 3: The ping operation fails
  • Test action 12: Add ICMP rule to SG2
  • Test action 13: Verify VM2 is able to communicate with VM1 through FIP1
  • Test assertion 4: The ping operation is successful
  • Test action 14: Delete SG1, SG2, SG3, SG4, NET1, NET2, SUBNET1, SUBNET2, R1, R2, VM1, VM2, FIP1 and FIP2
5.1.12.6.3.3.2. Pass / fail criteria

This test evaluates the ability of security groups to filter packets across tenants. Specifically, the test verifies that:

  • Without an ICMP security group rule, ICMP packets cannot be received by a server in a tenant different from the source server’s tenant.
  • With an ingress ICMP security group rule enabled only in tenant1, the server in tenant2 can ping the server in tenant1, but not in the reverse direction.
  • With an ingress ICMP security group rule also enabled in tenant2, ping works in both directions.

In order to pass this test, all test assertions listed in the test execution above need to pass.
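The security group semantics these test cases rely on, default-deny ingress with the union of all rules across a port's groups, can be sketched as a minimal model. This is illustrative only, not Neutron code; the `allowed` helper and the `(direction, protocol)` rule tuples are hypothetical simplifications (real rules also carry remote prefixes, port ranges, and ethertypes).

```python
# Minimal model (NOT Neutron code) of security group evaluation:
# ingress traffic is denied unless some rule on some of the port's
# security groups matches the traffic's direction and protocol.

def allowed(proto, direction, security_groups):
    """security_groups: list of rule lists; a rule is (direction, proto)."""
    return any((direction, proto) in group for group in security_groups)

sg_tenant1 = [("ingress", "tcp")]     # SG1-style group: TCP rule only
sg_tenant2 = [("ingress", "tcp")]

# Without an ICMP rule, cross-tenant ping fails (assertion 1):
assert not allowed("icmp", "ingress", [sg_tenant1])
# Add an ICMP rule only to tenant1: tenant2 -> tenant1 works (assertion 2)...
sg_tenant1.append(("ingress", "icmp"))
assert allowed("icmp", "ingress", [sg_tenant1])
# ...but tenant1 -> tenant2 still fails (assertion 3):
assert not allowed("icmp", "ingress", [sg_tenant2])
# After tenant2 also allows ICMP, both directions work (assertion 4):
sg_tenant2.append(("ingress", "icmp"))
assert allowed("icmp", "ingress", [sg_tenant2])
```

The same `any(...)` union over groups is what Test Case 4 exercises with two security groups on a single port.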

5.1.12.6.3.4. Post conditions

N/A

5.1.12.6.4. Test Case 3 - Test Security Group in Tenant Traffic
5.1.12.6.4.1. Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_in_tenant_traffic

5.1.12.6.4.2. Test preconditions
  • Neutron security-group extension API
  • One public network
5.1.12.6.4.3. Basic test flow execution description and pass/fail criteria
5.1.12.6.4.3.1. Test execution
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Create second server VM2 with default security group and NET1
  • Test action 8: Verify VM1 fails to communicate with VM2 through VM2’s fixed ip
  • Test assertion 1: The ping operation fails
  • Test action 9: Add ICMP security group rule to default security group
  • Test action 10: Verify VM1 is able to communicate with VM2 through VM2’s fixed ip
  • Test assertion 2: The ping operation is successful
  • Test action 11: Delete SG1, SG2, NET1, SUBNET1, R1, VM1, VM2 and FIP1
5.1.12.6.4.3.2. Pass / fail criteria

This test evaluates the ability of the security group to filter packets in one tenant. Specifically, the test verifies that:

  • Without an ICMP security group rule, ICMP packets cannot be received by a server in the same tenant.
  • With an ICMP security group rule, ICMP packets can be received by a server in the same tenant.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.12.6.4.4. Post conditions

N/A

5.1.12.6.5. Test Case 4 - Test Multiple Security Groups
5.1.12.6.5.1. Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_multiple_security_groups

5.1.12.6.5.2. Test preconditions
  • Neutron security-group extension API
  • One public network
5.1.12.6.5.3. Basic test flow execution description and pass/fail criteria
5.1.12.6.5.3.1. Test execution
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Verify failed to ping FIP1
  • Test assertion 1: The ping operation fails
  • Test action 8: Add ICMP security group rule to SG2
  • Test action 9: Verify can ping FIP1 successfully
  • Test assertion 2: The ping operation is successful
  • Test action 10: Verify can SSH to VM1 with FIP1
  • Test assertion 3: Can SSH to VM1 successfully
  • Test action 11: Delete SG1, SG2, NET1, SUBNET1, R1, VM1 and FIP1
5.1.12.6.5.3.2. Pass / fail criteria

This test evaluates the ability of multiple security groups to filter packets. Specifically, the test verifies that:

  • A server with two security groups, one with a TCP rule and neither with an ICMP rule, cannot receive ICMP packets sent from the Tempest host machine.
  • A server with two security groups, one with a TCP rule and the other with an ICMP rule, can receive ICMP packets sent from the Tempest host machine and can be connected to via the SSH client.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.12.6.5.4. Post conditions

N/A

5.1.12.6.6. Test Case 5 - Test Port Security Disable Security Group
5.1.12.6.6.1. Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_port_security_disable_security_group

5.1.12.6.6.2. Test preconditions
  • Neutron security-group extension API
  • Neutron port-security extension API
  • One public network
5.1.12.6.6.3. Basic test flow execution description and pass/fail criteria
5.1.12.6.6.3.1. Test execution
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Create second server VM2 with default security group and NET1
  • Test action 8: Update ‘security_groups’ to be none and ‘port_security_enabled’ to be True for VM2’s port
  • Test action 9: Verify VM1 fails to communicate with VM2 through VM2’s fixed ip
  • Test assertion 1: The ping operation fails
  • Test action 10: Update ‘security_groups’ to be none and ‘port_security_enabled’ to be False for VM2’s port
  • Test action 11: Verify VM1 is able to communicate with VM2 through VM2’s fixed ip
  • Test assertion 2: The ping operation is successful
  • Test action 12: Delete SG1, SG2, NET1, SUBNET1, R1, VM1, VM2 and FIP1
5.1.12.6.6.3.2. Pass / fail criteria

This test evaluates the ability of port security to disable security groups. Specifically, the test verifies that:

  • The ICMP packets cannot pass the port whose ‘port_security_enabled’ is True and security_groups is none.
  • The ICMP packets can pass the port whose ‘port_security_enabled’ is False and security_groups is none.

In order to pass this test, all test assertions listed in the test execution above need to pass.
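The interaction asserted above can be sketched as an illustrative rule (not Neutron code; `ingress_allowed` is a hypothetical helper): when no security groups are attached, `port_security_enabled` alone decides whether ingress traffic is dropped (True means default deny) or passed (False means no filtering at all).

```python
# Illustrative model (NOT Neutron code) of how port_security_enabled
# interacts with an empty security group list, mirroring assertions
# 1 and 2 above.

def ingress_allowed(port_security_enabled, security_groups, proto="icmp"):
    """True if ingress traffic of the given protocol reaches the port."""
    if not port_security_enabled:
        return True                         # no filtering without port security
    # Default deny: allow only if some group has a matching ingress rule.
    return any(("ingress", proto) in group for group in security_groups)

assert not ingress_allowed(True, [])        # assertion 1: ping fails
assert ingress_allowed(False, [])           # assertion 2: ping succeeds
```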

5.1.12.6.6.4. Post conditions

N/A

5.1.12.6.7. Test Case 6 - Test Update Port Security Group
5.1.12.6.7.1. Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_port_update_new_security_group

5.1.12.6.7.2. Test preconditions
  • Neutron security-group extension API
  • One public network
5.1.12.6.7.3. Basic test flow execution description and pass/fail criteria
5.1.12.6.7.3.1. Test execution
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Create third empty security group SG3
  • Test action 8: Add ICMP rule to SG3
  • Test action 9: Create second server VM2 with default security group and NET1
  • Test action 10: Verify VM1 fails to communicate with VM2 through VM2’s fixed ip
  • Test assertion 1: The ping operation fails
  • Test action 11: Update ‘security_groups’ to be SG3 for VM2’s port
  • Test action 12: Verify VM1 is able to communicate with VM2 through VM2’s fixed ip
  • Test assertion 2: The ping operation is successful
  • Test action 13: Delete SG1, SG2, SG3, NET1, SUBNET1, R1, VM1, VM2 and FIP1
5.1.12.6.7.3.2. Pass / fail criteria

This test evaluates the ability to update a port with a new security group. Specifically, the test verifies that:

  • Without an ICMP security group rule, the VM cannot receive ICMP packets.
  • After the port is updated with a new security group containing an ICMP rule, the VM can receive ICMP packets.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.12.6.7.4. Post conditions

N/A

5.1.13. OpenStack Interoperability test specification

The test cases documented here are the API test cases in the OpenStack Interop guideline 2017.09 as implemented by the RefStack client.

5.1.13.1. References

All OpenStack interop test cases addressed in OVP are covered in the following test specification documents.

5.1.13.1.1. VIM compute operations test specification
5.1.13.1.1.1. Scope

The VIM compute operations test area evaluates the ability of the system under test to support VIM compute operations. The test cases documented here are the compute API test cases in the OpenStack Interop guideline 2017.09 as implemented by the RefStack client. These test cases will evaluate basic OpenStack (as a VIM) compute operations, including:

  • Image management operations
  • Basic support operations
  • API version support operations
  • Quotas management operations
  • Basic server operations
  • Volume management operations
5.1.13.1.1.2. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • SUT - System Under Test
  • UUID - Universally Unique Identifier
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
5.1.13.1.1.3. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM deployed with a Pharos compliant infrastructure.

5.1.13.1.1.4. Test Area Structure

The test area is structured based on VIM compute API operations. Each test case is able to run independently, i.e. regardless of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All of these test cases are included in the dovetail.tempest.osinterop test case of the OVP test suite.

5.1.13.1.1.5. Test Descriptions
5.1.13.1.1.5.1. API Used and Reference

Servers: https://developer.openstack.org/api-ref/compute/

  • create server
  • delete server
  • list servers
  • start server
  • stop server
  • update server
  • get server action
  • set server metadata
  • update server metadata
  • rebuild server
  • create image
  • delete image
  • create keypair
  • delete keypair

Block storage: https://developer.openstack.org/api-ref/block-storage

  • create volume
  • delete volume
  • attach volume to server
  • detach volume from server
5.1.13.1.1.5.2. Test Case 1 - Image operations within the Compute API
5.1.13.1.1.5.2.1. Test case specification

tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_delete_image
tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_image_specify_multibyte_character_image_name

5.1.13.1.1.5.2.2. Test preconditions
  • Compute server extension API
5.1.13.1.1.5.2.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.2.3.1. Test execution
  • Test action 1: Create a server VM1 with an image IMG1 and wait for VM1 to reach ‘ACTIVE’ status
  • Test action 2: Create a new server image IMG2 from VM1, specifying image name and image metadata. Wait for IMG2 to reach ‘ACTIVE’ status, and then delete IMG2
  • Test assertion 1: Verify IMG2 is created with the correct image name and image metadata; verify IMG2’s ‘minRam’ equals IMG1’s ‘minRam’ and IMG2’s ‘minDisk’ equals IMG1’s ‘minDisk’ or VM1’s flavor disk size
  • Test assertion 2: Verify IMG2 is deleted correctly
  • Test action 3: Create another server image IMG3 from VM1, specifying an image name containing a 3-byte UTF-8 character
  • Test assertion 3: Verify IMG3 is created correctly
  • Test action 4: Delete VM1, IMG1 and IMG3
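The property checks in test assertion 1 can be sketched as follows. This is a simplified model (function name and sample values are hypothetical): the snapshot image inherits ‘minRam’ from the base image, while its ‘minDisk’ may come from either the base image or the flavor:

```python
def snapshot_min_disk_ok(snapshot, base_image, flavor):
    """Simplified form of test assertion 1: the snapshot inherits min_ram
    from the base image, and its min_disk must equal either the base
    image's min_disk or the flavor's disk size."""
    return (snapshot["min_ram"] == base_image["min_ram"]
            and snapshot["min_disk"] in (base_image["min_disk"], flavor["disk"]))

# Hypothetical example values:
base = {"min_ram": 512, "min_disk": 1}
flavor = {"disk": 20}
snap = {"min_ram": 512, "min_disk": 20}
print(snapshot_min_disk_ok(snap, base, flavor))  # True for these values
```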
5.1.13.1.1.5.2.3.2. Pass / fail criteria

This test evaluates the Compute API’s ability to create an image from a server, delete an image, and create a server image with a multi-byte character name. Specifically, the test verifies that:

  • Compute server create image and delete image APIs work correctly.
  • Compute server image can be created with multi-byte character name.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.2.4. Post conditions

N/A

5.1.13.1.1.5.3. Test Case 2 - Action operation within the Compute API
5.1.13.1.1.5.3.1. Test case specification

tempest.api.compute.servers.test_instance_actions.InstanceActionsTestJSON.test_get_instance_action
tempest.api.compute.servers.test_instance_actions.InstanceActionsTestJSON.test_list_instance_actions

5.1.13.1.1.5.3.2. Test preconditions
  • Compute server extension API
5.1.13.1.1.5.3.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.3.3.1. Test execution
  • Test action 1: Create a server VM1 and wait for VM1 to reach ‘ACTIVE’ status
  • Test action 2: Get the action details ACT_DTL of VM1
  • Test assertion 1: Verify ACT_DTL’s ‘instance_uuid’ matches VM1’s ID and ACT_DTL’s ‘action’ matches ‘create’
  • Test action 3: Create a server VM2 and wait for VM2 to reach ‘ACTIVE’ status
  • Test action 4: Delete server VM2 and wait for VM2 to reach termination
  • Test action 5: Get the action list ACT_LST of VM2
  • Test assertion 2: Verify ACT_LST’s length is 2 and the two actions are ‘create’ and ‘delete’
  • Test action 6: Delete VM1
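The expected action records can be modelled with a small sketch; the FakeActionLog class below is a hypothetical stand-in that mirrors the shape of the os-instance-actions records checked by this test:

```python
class FakeActionLog:
    """Hypothetical model of per-instance action records."""
    def __init__(self):
        self._actions = {}

    def record(self, instance_uuid, action):
        self._actions.setdefault(instance_uuid, []).append(
            {"instance_uuid": instance_uuid, "action": action})

    def list_actions(self, instance_uuid):
        return self._actions.get(instance_uuid, [])

log = FakeActionLog()
log.record("vm2-uuid", "create")   # mirrors test action 3
log.record("vm2-uuid", "delete")   # mirrors test action 4

# mirrors test assertion 2: two actions, 'create' and 'delete'
actions = [a["action"] for a in log.list_actions("vm2-uuid")]
print(actions)
```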
5.1.13.1.1.5.3.3.2. Pass / fail criteria

This test evaluates the Compute API’s ability to get the action details of a provided server and the action list of a deleted server. Specifically, the test verifies that:

  • The details of an action performed on a specified server can be retrieved.
  • The actions performed on a specified server can be listed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.3.4. Post conditions

N/A

5.1.13.1.1.5.4. Test Case 3 - Generate, import and delete SSH keys within Compute services
5.1.13.1.1.5.4.1. Test case specification

tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_specify_keypair

5.1.13.1.1.5.4.2. Test preconditions
  • Compute server extension API
5.1.13.1.1.5.4.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.4.3.1. Test execution
  • Test action 1: Create a keypair KEYP1 and list all existing keypairs
  • Test action 2: Create a server VM1 with KEYP1 and wait for VM1 to reach ‘ACTIVE’ status
  • Test action 3: Show details of VM1
  • Test assertion 1: Verify value of ‘key_name’ in the details equals to the name of KEYP1
  • Test action 4: Delete KEYP1 and VM1
5.1.13.1.1.5.4.3.2. Pass / fail criteria

This test evaluates the Compute API’s ability to create a keypair, list keypairs and create a server with a provided keypair. Specifically, the test verifies that:

  • Compute create keypair and list keypair APIs work correctly.
  • A keypair can be specified while creating a server.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.4.4. Post conditions

N/A

5.1.13.1.1.5.5. Test Case 4 - List supported versions of the Compute API
5.1.13.1.1.5.5.1. Test case specification

tempest.api.compute.test_versions.TestVersions.test_list_api_versions

5.1.13.1.1.5.5.2. Test preconditions
  • Compute versions extension API
5.1.13.1.1.5.5.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.5.3.1. Test execution
  • Test action 1: Get a list of versioned endpoints in the SUT
  • Test assertion 1: Verify endpoints versions start at ‘v2.0’
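For reference, an abridged sketch of the version-discovery document this test inspects, as returned by GET / on the compute endpoint (status values vary by deployment; the entries shown are illustrative):

```python
# Abridged, illustrative response to GET / on the compute endpoint.
versions_response = {
    "versions": [
        {"id": "v2.0", "status": "SUPPORTED"},
        {"id": "v2.1", "status": "CURRENT"},
    ]
}

ids = sorted(v["id"] for v in versions_response["versions"])
# Test assertion 1: the list of versions starts at 'v2.0'
assert ids[0] == "v2.0"
print(ids)
```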
5.1.13.1.1.5.5.3.2. Pass / fail criteria

This test evaluates the functionality of listing all available APIs to API consumers. Specifically, the test verifies that:

  • Compute list API versions API works correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.5.4. Post conditions

N/A

5.1.13.1.1.5.6. Test Case 5 - Quotas management in Compute API
5.1.13.1.1.5.6.1. Test case specification

tempest.api.compute.test_quotas.QuotasTestJSON.test_get_default_quotas
tempest.api.compute.test_quotas.QuotasTestJSON.test_get_quotas

5.1.13.1.1.5.6.2. Test preconditions
  • Compute quotas extension API
5.1.13.1.1.5.6.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.6.3.1. Test execution
  • Test action 1: Get the default quota set using the tenant ID
  • Test assertion 1: Verify the default quota set ID matches the tenant ID and the default quota set is complete
  • Test action 2: Get the quota set using the tenant ID
  • Test assertion 2: Verify the quota set ID matches the tenant ID and the quota set is complete
  • Test action 3: Get the quota set using the user ID
  • Test assertion 3: Verify the quota set ID matches the tenant ID and the quota set is complete
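The “quota set is complete” check can be sketched as verifying that every expected field is present. The key set below is an illustrative subset of the Nova quota fields, not an exhaustive list, and the sample values are hypothetical:

```python
# Illustrative subset of Nova quota-set fields (not exhaustive).
EXPECTED_QUOTA_KEYS = {
    "instances", "cores", "ram",
    "key_pairs", "metadata_items", "server_groups",
}

def quota_set_complete(quota_set, tenant_id):
    """Simplified form of assertions 1-3: the quota set carries the
    tenant's ID and contains every expected field."""
    return (quota_set.get("id") == tenant_id
            and EXPECTED_QUOTA_KEYS <= set(quota_set))

# Hypothetical quota set returned for a tenant:
quotas = {"id": "tenant-1", "instances": 10, "cores": 20, "ram": 51200,
          "key_pairs": 100, "metadata_items": 128, "server_groups": 10}
print(quota_set_complete(quotas, "tenant-1"))
```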
5.1.13.1.1.5.6.3.2. Pass / fail criteria

This test evaluates the functionality of getting quota sets. Specifically, the test verifies that:

  • User can get the default quota set for its tenant.
  • User can get the quota set for its tenant.
  • User can get the quota set using user ID.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.6.4. Post conditions

N/A

5.1.13.1.1.5.7. Test Case 6 - Basic server operations in the Compute API
5.1.13.1.1.5.7.1. Test case specification

This test case evaluates the Compute API’s ability to perform basic server operations, including:

  • Create a server with admin password
  • Create a server with a name that already exists
  • Create a server with a numeric name
  • Create a server with metadata that exceeds the length limit
  • Create a server with a name whose length exceeds 255 characters
  • Create a server with an unknown flavor
  • Create a server with an unknown image ID
  • Create a server with an invalid network UUID
  • Delete a server using a server ID that exceeds length limit
  • Delete a server using a negative server ID
  • Get the details of a nonexistent server
  • Verify the instance host name is the same as the server name
  • Create a server with an invalid access IPv6 address
  • List all existent servers
  • Filter the (detailed) list of servers by flavor, image, server name, server status or limit
  • Lock a server and try server stop, unlock and retry
  • Get and delete metadata from a server
  • List and set metadata for a server
  • Reboot, rebuild, stop and start a server
  • Update a server’s access addresses and server name

The references are:

tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_server_with_admin_password
tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_with_existing_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_numeric_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_server_metadata_exceeds_length_limit
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_server_name_length_exceeds_256
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_flavor
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_image
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_network_uuid
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_delete_server_pass_id_exceeding_length_limit
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_delete_server_pass_negative_id
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_get_non_existent_server
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_host_name_is_same_as_server_name
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_host_name_is_same_as_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_invalid_ip_v6_address
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers_with_detail
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_list_servers
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_list_servers_with_detail
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_flavor
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_image
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_server_name
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_server_status
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_limit_results
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_flavor
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_image
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_limit
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_server_name
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_active_status
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filtered_by_name_wildcard
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_changes_since_future_date
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_changes_since_invalid_date
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_greater_than_actual_count
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_pass_negative_value
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_pass_string
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_flavor
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_image
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_server_name
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_detail_server_is_deleted
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_status_non_existing
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_with_a_deleted_server
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_lock_unlock_server
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_delete_server_metadata_item
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_get_server_metadata_item
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_list_server_metadata
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_set_server_metadata
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_set_server_metadata_item
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_update_server_metadata
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_server_name_blank
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_reboot_server_hard
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_reboot_non_existent_server
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_rebuild_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_deleted_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_non_existent_server
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_stop_start_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_stop_non_existent_server
tempest.api.compute.servers.test_servers.ServersTestJSON.test_update_access_server_address
tempest.api.compute.servers.test_servers.ServersTestJSON.test_update_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_name_of_non_existent_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_server_name_length_exceeds_256
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_server_set_empty_name
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_created_server_vcpus
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_server_details
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_created_server_vcpus
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_server_details
tempest.api.compute.servers.test_delete_server.DeleteServersTestJSON.test_delete_active_server

5.1.13.1.1.5.7.2. Test preconditions
  • Compute quotas extension API
5.1.13.1.1.5.7.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.7.3.1. Test execution
  • Test action 1: Create a server VM1 with an admin password ‘testpassword’
  • Test assertion 1: Verify the password returned in the response equals to ‘testpassword’
  • Test action 2: Generate a VM name VM_NAME
  • Test action 3: Create 2 servers VM2 and VM3 both with name VM_NAME
  • Test assertion 2: Verify VM2’s ID is not equal to VM3’s ID, and VM2’s name equals VM3’s name
  • Test action 4: Create a server VM4 with a numeric name ‘12345’
  • Test assertion 3: Verify creating VM4 failed
  • Test action 5: Create a server VM5 with a long metadata ‘{‘a’: ‘b’ * 260}’
  • Test assertion 4: Verify creating VM5 failed
  • Test action 6: Create a server VM6 with name length exceeding 255 characters
  • Test assertion 5: Verify creating VM6 failed
  • Test action 7: Create a server VM7 with an unknown flavor ‘-1’
  • Test assertion 6: Verify creating VM7 failed
  • Test action 8: Create a server VM8 with an unknown image ID ‘-1’
  • Test assertion 7: Verify creating VM8 failed
  • Test action 9: Create a server VM9 with an invalid network UUID ‘a-b-c-d-e-f-g-h-i-j’
  • Test assertion 8: Verify creating VM9 failed
  • Test action 10: Delete a server using a server ID that exceeds system’s max integer limit
  • Test assertion 9: Verify deleting server failed
  • Test action 11: Delete a server using a server ID ‘-1’
  • Test assertion 10: Verify deleting server failed
  • Test action 12: Get a nonexistent server by using a random generated server ID
  • Test assertion 11: Verify get server failed
  • Test action 13: SSH into a provided server and get server’s hostname
  • Test assertion 12: Verify server’s host name is the same as the server name
  • Test action 14: SSH into a provided server and get server’s hostname (manual disk configuration)
  • Test assertion 13: Verify server’s host name is the same as the server name (manual disk configuration)
  • Test action 15: Create a server with an invalid access IPv6 address
  • Test assertion 14: Verify creating server failed, a bad request error is returned in response
  • Test action 16: List all existent servers
  • Test assertion 15: Verify a provided server is in the server list
  • Test action 17: List all existent servers in detail
  • Test assertion 16: Verify a provided server is in the detailed server list
  • Test action 18: List all existent servers (manual disk configuration)
  • Test assertion 17: Verify a provided server is in the server list (manual disk configuration)
  • Test action 19: List all existent servers in detail (manual disk configuration)
  • Test assertion 18: Verify a provided server is in the detailed server list (manual disk configuration)
  • Test action 20: List all existent servers in detail and filter the server list by flavor
  • Test assertion 19: Verify the filtered server list is correct
  • Test action 21: List all existent servers in detail and filter the server list by image
  • Test assertion 20: Verify the filtered server list is correct
  • Test action 22: List all existent servers in detail and filter the server list by server name
  • Test assertion 21: Verify the filtered server list is correct
  • Test action 23: List all existent servers in detail and filter the server list by server status
  • Test assertion 22: Verify the filtered server list is correct
  • Test action 24: List all existent servers in detail and filter the server list by display limit ‘1’
  • Test assertion 23: Verify the length of filtered server list is 1
  • Test action 25: List all existent servers and filter the server list by flavor
  • Test assertion 24: Verify the filtered server list is correct
  • Test action 26: List all existent servers and filter the server list by image
  • Test assertion 25: Verify the filtered server list is correct
  • Test action 27: List all existent servers and filter the server list by display limit ‘1’
  • Test assertion 26: Verify the length of filtered server list is 1
  • Test action 28: List all existent servers and filter the server list by server name
  • Test assertion 27: Verify the filtered server list is correct
  • Test action 29: List all existent servers and filter the server list by server status
  • Test assertion 28: Verify the filtered server list is correct
  • Test action 30: List all existent servers and filter the server list by server name wildcard
  • Test assertion 29: Verify the filtered server list is correct
  • Test action 31: List all existent servers and filter the server list by part of server name
  • Test assertion 30: Verify the filtered server list is correct
  • Test action 32: List all existent servers and filter the server list by a future change-since date
  • Test assertion 31: Verify the filtered server list is empty
  • Test action 33: List all existent servers and filter the server list by an invalid change-since date format
  • Test assertion 32: Verify a bad request error is returned in the response
  • Test action 34: List all existent servers and filter the server list by a display limit value greater than the length of the server list
  • Test assertion 33: Verify the length of filtered server list equals to the length of server list
  • Test action 35: List all existent servers and filter the server list by display limit ‘-1’
  • Test assertion 34: Verify a bad request error is returned in the response
  • Test action 36: List all existent servers and filter the server list by a string type limit value ‘testing’
  • Test assertion 35: Verify a bad request error is returned in the response
  • Test action 37: List all existent servers and filter the server list by a nonexistent flavor
  • Test assertion 36: Verify the filtered server list is empty
  • Test action 38: List all existent servers and filter the server list by a nonexistent image
  • Test assertion 37: Verify the filtered server list is empty
  • Test action 39: List all existent servers and filter the server list by a nonexistent server name
  • Test assertion 38: Verify the filtered server list is empty
  • Test action 40: List all existent servers in detail and search the server list for a deleted server
  • Test assertion 39: Verify the deleted server is not in the server list
  • Test action 41: List all existent servers and filter the server list by a nonexistent server status
  • Test assertion 40: Verify the filtered server list is empty
  • Test action 42: List all existent servers in detail
  • Test assertion 41: Verify a provided deleted server’s id is not in the server list
  • Test action 43: Lock a provided server VM10 and retrieve the server’s status
  • Test assertion 42: Verify VM10 is in ‘ACTIVE’ status
  • Test action 44: Stop VM10
  • Test assertion 43: Verify stop VM10 failed
  • Test action 45: Unlock VM10 and stop VM10 again
  • Test assertion 44: Verify VM10 is stopped and in ‘SHUTOFF’ status
  • Test action 46: Start VM10
  • Test assertion 45: Verify VM10 is in ‘ACTIVE’ status
  • Test action 47: Delete metadata item ‘key1’ from a provided server
  • Test assertion 46: Verify the metadata item is removed
  • Test action 48: Get metadata item ‘key2’ from a provided server
  • Test assertion 47: Verify the metadata item is correct
  • Test action 49: List all metadata key/value pair for a provided server
  • Test assertion 48: Verify all metadata are retrieved correctly
  • Test action 50: Set metadata {‘meta2’: ‘data2’, ‘meta3’: ‘data3’} for a provided server
  • Test assertion 49: Verify server’s metadata are replaced correctly
  • Test action 51: Set metadata item nova’s value to ‘alt’ for a provided server
  • Test assertion 50: Verify server’s metadata are set correctly
  • Test action 52: Update metadata {‘key1’: ‘alt1’, ‘key3’: ‘value3’} for a provided server
  • Test assertion 51: Verify server’s metadata are updated correctly
  • Test action 53: Create a server with empty name parameter
  • Test assertion 52: Verify create server failed
  • Test action 54: Hard reboot a provided server
  • Test assertion 53: Verify server is rebooted successfully
  • Test action 55: Soft reboot a nonexistent server
  • Test assertion 54: Verify reboot failed, an error is returned in the response
  • Test action 56: Rebuild a provided server with new image, new server name and metadata
  • Test assertion 55: Verify server is rebuilt successfully, server image, name and metadata are correct
  • Test action 57: Create a server VM11
  • Test action 58: Delete VM11 and wait for VM11 to reach termination
  • Test action 59: Rebuild VM11 with another image
  • Test assertion 56: Verify rebuild server failed, an error is returned in the response
  • Test action 60: Rebuild a nonexistent server
  • Test assertion 57: Verify rebuild server failed, an error is returned in the response
  • Test action 61: Stop a provided server
  • Test assertion 58: Verify server reaches ‘SHUTOFF’ status
  • Test action 62: Start the stopped server
  • Test assertion 59: Verify server reaches ‘ACTIVE’ status
  • Test action 63: Stop a provided server
  • Test assertion 60: Verify stop server failed, an error is returned in the response
  • Test action 64: Create a server VM12 and wait for it to reach ‘ACTIVE’ status
  • Test action 65: Update VM12’s IPv4 and IPv6 access addresses
  • Test assertion 61: Verify VM12’s access addresses have been updated correctly
  • Test action 66: Create a server VM13 and wait for it to reach ‘ACTIVE’ status
  • Test action 67: Update VM13’s server name with the non-ASCII characters ‘\u00CD\u00F1st\u00E1\u00F1c\u00E9’
  • Test assertion 62: Verify VM13’s server name has been updated correctly
  • Test action 68: Update the server name of a nonexistent server
  • Test assertion 63: Verify update server name failed, an ‘object not found’ error is returned in the response
  • Test action 69: Update a provided server’s name with a 256-character long name
  • Test assertion 64: Verify update server name failed, a bad request is returned in the response
  • Test action 70: Update a provided server’s server name with an empty string
  • Test assertion 65: Verify update server name failed, a bad request error is returned in the response
  • Test action 71: Get the number of vcpus of a provided server
  • Test action 72: Get the number of vcpus stated by the server’s flavor
  • Test assertion 66: Verify that the number of vcpus reported by the server matches the amount stated by the server’s flavor
  • Test action 73: Create a server VM14
  • Test assertion 67: Verify VM14’s server attributes are set correctly
  • Test action 74: Get the number of vcpus of a provided server (manual disk configuration)
  • Test action 75: Get the number of vcpus stated by the server’s flavor (manual disk configuration)
  • Test assertion 68: Verify that the number of vcpus reported by the server matches the amount stated by the server’s flavor (manual disk configuration)
  • Test action 76: Create a server VM15 (manual disk configuration)
  • Test assertion 69: Verify VM15’s server attributes are set correctly (manual disk configuration)
  • Test action 77: Create a server VM16 and then delete it when its status is ‘ACTIVE’
  • Test assertion 70: Verify VM16 is deleted successfully
  • Test action 78: Delete all VMs created
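Several of the negative steps above expect invalid create requests to be rejected with a bad-request error. A minimal sketch of that kind of validation follows (illustrative only; the checks Nova actually performs are broader than this):

```python
class BadRequest(Exception):
    """Stands in for the HTTP 400 badRequest error the API returns."""

def validate_server_name(name):
    """Illustrative name validation mirroring the negative cases above."""
    if not name:
        raise BadRequest("server name must not be empty")
    if len(name) > 255:
        raise BadRequest("server name exceeds 255 characters")
    if name.isdigit():
        raise BadRequest("server name must not be purely numeric")
    return name

# The empty, over-long and numeric names are all rejected:
for candidate in ["", "x" * 256, "12345"]:
    try:
        validate_server_name(candidate)
    except BadRequest as exc:
        print(f"rejected: {exc}")

print(validate_server_name("valid-name"))
```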
5.1.13.1.1.5.7.3.2. Pass / fail criteria

This test evaluates the functionality of basic server operations. Specifically, the test verifies that:

  • If an admin password is provided on server creation, the server’s root password is set to that password
  • Creating a server with a name that already exists is allowed
  • Creating a server with a numeric name or a name that exceeds the length limit is not allowed
  • Creating a server with metadata that exceeds the length limit is not allowed
  • Creating a server with an invalid flavor, an invalid image or an invalid network UUID is not allowed
  • Deleting a server with a server ID that exceeds the length limit or a nonexistent server ID is not allowed
  • Deleting a server whose status is ‘ACTIVE’ is allowed
  • A provided server’s host name is the same as the server name
  • Creating a server with an invalid IPv6 access address is not allowed
  • A created server is in the (detailed) list of servers
  • The (detailed) list of servers can be filtered by flavor, image, server name, server status, and display limit, respectively
  • Filtering the list of servers by a future change-since date returns an empty list
  • Filtering the list of servers by an invalid date format, a negative display limit or a string display limit value is not allowed
  • Filtering the list of servers by a nonexistent flavor, image, server name or server status returns an empty list
  • Deleted servers are not in the list of servers
  • Deleted servers do not show by default in the detailed list of servers
  • A locked server cannot be stopped by a non-admin user
  • Metadata items can be retrieved from and deleted from servers
  • Server metadata can be listed, set and updated
  • Creating a server with an empty name parameter is not allowed
  • Hard rebooting a server power cycles the server
  • Rebooting, rebuilding or stopping a nonexistent server is not allowed
  • A server can be rebuilt using a provided image and metadata
  • A server can be stopped and restarted
  • A server’s name and access addresses can be updated
  • Updating the name of a nonexistent server is not allowed
  • Updating a server’s name to a name that exceeds the length limit is not allowed
  • Updating a server’s name to an empty string is not allowed
  • The number of vcpus reported by the server matches the amount stated by the server’s flavor
  • The specified server attributes are set correctly

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.7.4. Post conditions

N/A

5.1.13.1.1.5.8. Test Case 7 - Retrieve volume information through the Compute API
5.1.13.1.1.5.8.1. Test case specification

This test case evaluates the Compute API’s ability to attach a volume to a specified server and retrieve volume information. The references are:

tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_attach_detach_volume
tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_list_get_volume_attachments

5.1.13.1.1.5.8.2. Test preconditions
  • Compute volume extension API
5.1.13.1.1.5.8.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.8.3.1. Test execution
  • Test action 1: Create a server VM1 and a volume VOL1
  • Test action 2: Attach VOL1 to VM1
  • Test assertion 1: Stop VM1 successfully and wait for VM1 to reach ‘SHUTOFF’ status
  • Test assertion 2: Start VM1 successfully and wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 3: SSH into VM1 and verify VOL1 is in VM1’s root disk devices
  • Test action 3: Detach VOL1 from VM1
  • Test assertion 4: Stop VM1 successfully and wait for VM1 to reach ‘SHUTOFF’ status
  • Test assertion 5: Start VM1 successfully and wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 6: SSH into VM1 and verify VOL1 is not in VM1’s root disk devices
  • Test action 4: Create a server VM2 and a volume VOL2
  • Test action 5: Attach VOL2 to VM2
  • Test action 6: List VM2’s volume attachments
  • Test assertion 7: Verify the length of the list is 1 and VOL2 attachment is in the list
  • Test action 7: Retrieve VM2’s volume information
  • Test assertion 8: Verify volume information is correct
  • Test action 8: Delete VM1, VM2, VOL1 and VOL2
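The attachment-listing check (test assertion 7) can be modelled with a small sketch; the FakeServer class and device path below are hypothetical, mirroring only the shape of the volume-attachment records:

```python
class FakeServer:
    """Hypothetical server model tracking volume attachments."""
    def __init__(self, name):
        self.name = name
        self.status = "ACTIVE"
        self.attachments = []

    def attach_volume(self, volume_id, device="/dev/vdb"):
        self.attachments.append({"volumeId": volume_id, "device": device})

    def detach_volume(self, volume_id):
        self.attachments = [a for a in self.attachments
                            if a["volumeId"] != volume_id]

vm2 = FakeServer("VM2")
vm2.attach_volume("vol2-uuid")   # mirrors test action 5
print(vm2.attachments)           # mirrors test action 6: list VM2's attachments
```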
5.1.13.1.1.5.8.3.2. Pass / fail criteria

This test evaluates the functionality of retrieving volume information. Specifically, the test verifies that:

  • Stopping and starting a server with an attached volume work correctly.
  • A server’s volume information can be retrieved correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.8.4. Post conditions

N/A

5.1.13.1.1.5.9. Test Case 8 - List Compute service availability zones with the Compute API
5.1.13.1.1.5.9.1. Test case specification

This test case evaluates the Compute API’s ability to list availability zones as a non-admin user. The reference is:

tempest.api.compute.servers.test_availability_zone.AZV2TestJSON.test_get_availability_zone_list_with_non_admin_user

5.1.13.1.1.5.9.2. Test preconditions
  • Compute volume extension API
5.1.13.1.1.5.9.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.9.3.1. Test execution
  • Test action 1: List availability zones with a non-admin user
  • Test assertion 1: Verify the returned list of availability zones is not empty
5.1.13.1.1.5.9.3.2. Pass / fail criteria

This test evaluates the functionality of listing availability zones as a non-admin user. Specifically, the test verifies that:

  • Non-admin users can list availability zones.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.1.5.9.4. Post conditions

N/A

5.1.13.1.1.5.10. Test Case 9 - List Flavors within the Compute API
5.1.13.1.1.5.10.1. Test case specification

This test case evaluates the Compute API’s ability to list flavors. The references are:

tempest.api.compute.flavors.test_flavors.FlavorsV2TestJSON.test_list_flavors
tempest.api.compute.flavors.test_flavors.FlavorsV2TestJSON.test_list_flavors_with_detail

5.1.13.1.1.5.10.2. Test preconditions
  • Compute volume extension API
5.1.13.1.1.5.10.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.1.5.10.3.1. Test execution
  • Test action 1: List all flavors
  • Test assertion 1: Verify a given flavor is present in the list of all flavors
  • Test action 2: List all flavors with details
  • Test assertion 2: Verify the given flavor is present in the detailed list of all flavors
5.1.13.1.1.5.10.3.2. Pass / fail criteria

This test evaluates the functionality of listing flavors within the Compute API. Specifically, the test verifies that:

  • Flavors can be listed, with and without details, within the Compute API.

In order to pass this test, all test assertions listed in the test execution above need to pass.
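A minimal sketch of the two assertions, using canned stand-ins for the plain and detailed flavor listings (the ids, names and sizes are hypothetical sample values):

```python
# Stand-ins for GET /flavors and GET /flavors/detail (sample values).
flavors = [{"id": "1", "name": "m1.tiny"}, {"id": "2", "name": "m1.small"}]
flavors_detail = [
    {"id": "1", "name": "m1.tiny", "vcpus": 1, "ram": 512, "disk": 1},
    {"id": "2", "name": "m1.small", "vcpus": 1, "ram": 2048, "disk": 20},
]

given = {"id": "1", "name": "m1.tiny"}

# Test assertion 1: the given flavor appears in the plain list.
assert any(f["id"] == given["id"] for f in flavors)

# Test assertion 2: it also appears in the detailed list, which adds
# fields such as vcpus, ram and disk.
match = next(f for f in flavors_detail if f["id"] == given["id"])
assert match["name"] == given["name"]
print(match["ram"])
```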

5.1.13.1.1.5.10.4. Post conditions

N/A

5.1.13.1.2. VIM identity operations test specification
5.1.13.1.2.1. Scope

The VIM identity test area evaluates the ability of the system under test to support VIM identity operations. The tests in this area will evaluate API discovery operations within the Identity v3 API and auth operations within the Identity API.

5.1.13.1.2.2. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualisation infrastructure
  • VIM - Virtual Infrastructure Manager
5.1.13.1.2.3. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.13.1.2.4. Test Area Structure

The test area is structured based on VIM identity operations. Each test case is able to run independently, i.e. independently of the state created by a previous test.

All these test cases are included in the test case dovetail.tempest.osinterop of the OVP test suite.

5.1.13.1.2.5. Dependency Description

The VIM identity operations test cases are part of the OpenStack interoperability tempest test cases. The Fraser-based Dovetail release adopts the OpenStack interoperability guidelines (version 2017.09), which are valid for the Mitaka, Newton, Ocata and Pike releases of OpenStack.

5.1.13.1.2.6. Test Descriptions
5.1.13.1.2.6.1. API discovery operations within the Identity v3 API
5.1.13.1.2.6.1.1. Use case specification

tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_version_resources
tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_media_types
tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_version_statuses

5.1.13.1.2.6.1.2. Test preconditions

None

5.1.13.1.2.6.1.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.2.6.1.3.1. Test execution
  • Test action 1: Show the v3 identity API description; the test passes if the keys ‘id’, ‘links’, ‘media-types’, ‘status’ and ‘updated’ are all included in the description response message.
  • Test action 2: Get the value of the v3 identity API ‘media-types’; the test passes if API version 2 and version 3 are both included in the response.
  • Test action 3: Show the v3 identity API description; the test passes if ‘current’, ‘stable’, ‘experimental’, ‘supported’ and ‘deprecated’ are all of the identity API ‘status’ values.
5.1.13.1.2.6.1.3.2. Pass / fail criteria

This test case passes if all test action steps execute successfully and all assertions are affirmed. If any test step fails to execute successfully or any of the assertions is not met, the test case fails.
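The first and third test actions can be sketched as key and status checks over a canned version-description body; the field values below are hypothetical samples in the shape the Identity v3 discovery document uses:

```python
# Hypothetical body of GET /v3/ (Identity API version discovery),
# reduced to what the test actions inspect (sample values).
version = {
    "id": "v3.8",
    "status": "stable",
    "updated": "2017-02-22T00:00:00Z",
    "links": [{"rel": "self", "href": "http://keystone/v3/"}],
    "media-types": [
        {"base": "application/json",
         "type": "application/vnd.openstack.identity-v3+json"},
    ],
}

# Test action 1: all expected keys are present in the description.
assert {"id", "links", "media-types", "status", "updated"} <= version.keys()

# Test action 3: the reported status is one of the allowed values.
allowed = {"current", "stable", "experimental", "supported", "deprecated"}
assert version["status"] in allowed
print(version["status"])
```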

5.1.13.1.2.6.1.4. Post conditions

None

5.1.13.1.2.6.2. Auth operations within the Identity API
5.1.13.1.2.6.2.1. Use case specification

tempest.api.identity.v3.test_tokens.TokensV3Test.test_create_token

5.1.13.1.2.6.2.2. Test preconditions

None

5.1.13.1.2.6.2.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.2.6.2.3.1. Test execution
  • Test action 1: Get a token with system credentials; the test passes if the returned token_id is a non-empty string.
  • Test action 2: Get the user_id from the token response message; the test passes if it is equal to the user_id which was used to get the token.
  • Test action 3: Get the user_name from the token response message; the test passes if it is equal to the user_name which was used to get the token.
  • Test action 4: Get the method from the token response message; the test passes if it is equal to ‘password’, the method which was used to get the token.
5.1.13.1.2.6.2.3.2. Pass / fail criteria

This test case passes if all test action steps execute successfully and all assertions are affirmed. If any test step fails to execute successfully or any of the assertions is not met, the test case fails.
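The four token checks can be sketched against a canned auth response; the token value, user id and user name below are hypothetical samples in the shape of a POST /v3/auth/tokens response:

```python
# Credentials used to request the token (hypothetical sample values).
creds = {"user_id": "u-123", "user_name": "demo", "method": "password"}

# Stand-ins for the X-Subject-Token header and the response body.
token_id = "gAAAAAB-sample-token"
token_body = {
    "token": {
        "user": {"id": "u-123", "name": "demo"},
        "methods": ["password"],
    }
}

# Test action 1: the token id is a non-empty string.
assert isinstance(token_id, str) and token_id

# Test actions 2-4: user id, user name and auth method echo the credentials.
tok = token_body["token"]
assert tok["user"]["id"] == creds["user_id"]
assert tok["user"]["name"] == creds["user_name"]
assert creds["method"] in tok["methods"]
print(tok["user"]["name"])
```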

5.1.13.1.2.6.2.4. Post conditions

None

5.1.13.1.3. VIM image operations test specification
5.1.13.1.3.1. Scope

The VIM image test area evaluates the ability of the system under test to support VIM image operations. The test cases documented here are the Image API test cases in the OpenStack Interop guideline 2017.09 as implemented by the Refstack client. These test cases will evaluate basic OpenStack (as a VIM) image operations including image creation, image list, image update and image deletion capabilities using the Glance v2 API.

5.1.13.1.3.2. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • CRUD - Create, Read, Update, and Delete
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
5.1.13.1.3.3. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.13.1.3.4. Test Area Structure

The test area is structured based on VIM image operations. Each test case is able to run independently, i.e. independently of the state created by a previous test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All these test cases are included in the test case dovetail.tempest.osinterop of the OVP test suite.

5.1.13.1.3.5. Test Descriptions
5.1.13.1.3.5.1. API Used and Reference

Images: https://developer.openstack.org/api-ref/image/v2/

  • create image
  • delete image
  • show image details
  • show images
  • show image schema
  • show images schema
  • upload binary image data
  • add image tag
  • delete image tag
5.1.13.1.3.5.2. Image get tests using the Glance v2 API
5.1.13.1.3.5.2.1. Test case specification

tempest.api.image.v2.test_images.ListUserImagesTest.test_get_image_schema
tempest.api.image.v2.test_images.ListUserImagesTest.test_get_images_schema
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_delete_deleted_image
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_image_null_id
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_non_existent_image

5.1.13.1.3.5.2.2. Test preconditions

Glance is available.

5.1.13.1.3.5.2.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.3.5.2.3.1. Test execution
  • Test action 1: Create 6 images and store their ids in a created images list.
  • Test action 2: Use image v2 API to show image schema and check the body of the response.
  • Test assertion 1: In the body of the response, the value of the key ‘name’ is ‘image’.
  • Test action 3: Use image v2 API to show images schema and check the body of the response.
  • Test assertion 2: In the body of the response, the value of the key ‘name’ is ‘images’.
  • Test action 4: Create an image with name ‘test’, container_formats ‘bare’ and disk_formats ‘raw’. Delete this image with its id and then try to show it with its id. Delete this deleted image again with its id and check the API’s response code.
  • Test assertion 3: The operations of showing and deleting a deleted image with its id both get 404 response code.
  • Test action 5: Use a null image id to show an image and check the API’s response code.
  • Test assertion 4: The API’s response code is 404.
  • Test action 6: Generate a random uuid and use it as the image id to show the image.
  • Test assertion 5: The API’s response code is 404.
  • Test action 7: Delete the 6 images with the stored ids. Show all images and check whether the 6 images’ ids are not in the show list.
  • Test assertion 6: The 6 images’ ids are not found in the show list.
5.1.13.1.3.5.2.3.2. Pass / fail criteria

The first two test cases evaluate the ability to use Glance v2 API to show image and images schema. The latter three test cases evaluate the ability to use Glance v2 API to show images with a deleted image id, a null image id and a non-existing image id. Specifically it verifies that:

  • Glance image get API can show the image and images schema.
  • Glance image get API can’t show an image with a deleted image id.
  • Glance image get API can’t show an image with a null image id.
  • Glance image get API can’t show an image with a non-existing image id.

In order to pass this test, all test assertions listed in the test execution above need to pass.
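The negative lookups can be sketched as a tiny in-memory stand-in for the image store; `show_image` and its return shape are hypothetical helpers that mimic the 404 behaviour the assertions above expect, not the Glance client API:

```python
import uuid

# Minimal in-memory stand-in for the Glance v2 image store.
images = {}  # id -> image record

def show_image(image_id):
    """Return (status_code, body) the way GET /v2/images/{id} would."""
    if image_id in images:
        return 200, images[image_id]
    return 404, None

# A deleted id, a null-like id and a random non-existent id all yield 404.
deleted_id = str(uuid.uuid4())  # created then deleted, so absent here
results = [show_image(bad_id)[0]
           for bad_id in (deleted_id, "", str(uuid.uuid4()))]
assert all(code == 404 for code in results)
print(results)
```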

5.1.13.1.3.5.2.4. Post conditions

None

5.1.13.1.3.5.3. CRUD image operations in Images API v2
5.1.13.1.3.5.3.1. Test case specification

tempest.api.image.v2.test_images.ListUserImagesTest.test_list_no_params

5.1.13.1.3.5.3.2. Test preconditions

Glance is available.

5.1.13.1.3.5.3.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.3.5.3.3.1. Test execution
  • Test action 1: Create 6 images and store their ids in a created images list.
  • Test action 2: List all images and check whether the ids listed are in the created images list.
  • Test assertion 1: The ids obtained from the list images API are in the created images list.
5.1.13.1.3.5.3.3.2. Pass / fail criteria

This test case evaluates the ability to use Glance v2 API to list images. Specifically it verifies that:

  • Glance image API can show the images.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.3.5.3.4. Post conditions

None

5.1.13.1.3.5.4. Image list tests using the Glance v2 API
5.1.13.1.3.5.4.1. Test case specification

tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_container_format
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_disk_format
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_limit
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_min_max_size
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_size
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_status
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_visibility

5.1.13.1.3.5.4.2. Test preconditions

Glance is available.

5.1.13.1.3.5.4.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.3.5.4.3.1. Test execution
  • Test action 1: Create 6 images with a random size ranging from 1024 to 4096 and visibility ‘private’; set their (container_format, disk_format) pair to be (ami, ami), (ami, ari), (ami, aki), (ami, vhd), (ami, vmdk) and (ami, raw); store their ids in a list and upload the binary images data.
  • Test action 2: Use Glance v2 API to list all images whose container_format is ‘ami’ and store the response details in a list.
  • Test assertion 1: The list is not empty and all the values of container_format in the list are ‘ami’.
  • Test action 3: Use Glance v2 API to list all images whose disk_format is ‘raw’ and store the response details in a list.
  • Test assertion 2: The list is not empty and all the values of disk_format in the list are ‘raw’.
  • Test action 4: Use Glance v2 API to list one image by setting limit to be 1 and store the response details in a list.
  • Test assertion 3: The length of the list is one.
  • Test action 5: Use Glance v2 API to list images by setting size_min and size_max, and store the response images’ sizes in a list. Choose the first image’s size as the median, size_min is median-500 and size_max is median+500.
  • Test assertion 4: All sizes in the list are no less than size_min and no more than size_max.
  • Test action 6: Use Glance v2 API to show the first created image with its id and get its size from the response. Use Glance v2 API to list images whose size is equal to this size and store the response details in a list.
  • Test assertion 5: All sizes of the images in the list are equal to the size used to list the images.
  • Test action 7: Use Glance v2 API to list the images whose status is active and store the response details in a list.
  • Test assertion 6: All status of images in the list are active.
  • Test action 8: Use Glance v2 API to list the images whose visibility is private and store the response details in a list.
  • Test assertion 7: All images’ values of visibility in the list are private.
  • Test action 9: Delete the 6 images with the stored ids. Show images and check whether the 6 ids are not in the show list.
  • Test assertion 8: The stored 6 ids are not found in the show list.
5.1.13.1.3.5.4.3.2. Pass / fail criteria

This test case evaluates the ability to use Glance v2 API to list images with different parameters. Specifically it verifies that:

  • Glance image API can show the images with the container_format.
  • Glance image API can show the images with the disk_format.
  • Glance image API can show the images by setting a limit number.
  • Glance image API can show the images with the size_min and size_max.
  • Glance image API can show the images with the size.
  • Glance image API can show the images with the status.
  • Glance image API can show the images with the visibility type.

In order to pass this test, all test assertions listed in the test execution above need to pass.
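The filter checks above can be sketched over canned image records; `list_images` is a hypothetical helper imitating the query parameters of GET /v2/images, and the sizes and formats are arbitrary sample values:

```python
# Canned image records (sample values) in the shape Glance v2 returns.
images = [
    {"id": "i1", "container_format": "ami", "disk_format": "raw",
     "size": 2048, "status": "active", "visibility": "private"},
    {"id": "i2", "container_format": "ami", "disk_format": "vmdk",
     "size": 3072, "status": "active", "visibility": "private"},
]

def list_images(**filters):
    """Filter the way GET /v2/images?container_format=...&size_min=... would."""
    out = images
    if "container_format" in filters:
        out = [i for i in out
               if i["container_format"] == filters["container_format"]]
    if "size_min" in filters:
        out = [i for i in out if i["size"] >= filters["size_min"]]
    if "size_max" in filters:
        out = [i for i in out if i["size"] <= filters["size_max"]]
    return out

# Test assertion 1: every listed image has the requested container_format.
assert all(i["container_format"] == "ami"
           for i in list_images(container_format="ami"))

# Test assertion 4: sizes stay within [median - 500, median + 500].
median = images[0]["size"]
ranged = list_images(size_min=median - 500, size_max=median + 500)
assert all(median - 500 <= i["size"] <= median + 500 for i in ranged)
print(len(ranged))
```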

5.1.13.1.3.5.4.4. Post conditions

None

5.1.13.1.3.5.5. Image update tests using the Glance v2 API
5.1.13.1.3.5.5.1. Test case specification

tempest.api.image.v2.test_images.BasicOperationsImagesTest.test_update_image
tempest.api.image.v2.test_images_tags.ImagesTagsTest.test_update_delete_tags_for_image
tempest.api.image.v2.test_images_tags_negative.ImagesTagsNegativeTest.test_update_tags_for_non_existing_image

5.1.13.1.3.5.5.2. Test preconditions

Glance is available.

5.1.13.1.3.5.5.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.3.5.5.3.1. Test execution
  • Test action 1: Create an image with container_formats ‘ami’, disk_formats ‘ami’ and visibility ‘private’ and store its id returned in the response. Check whether the status of the created image is ‘queued’.
  • Test assertion 1: The status of the created image is ‘queued’.
  • Test action 2: Use the stored image id to upload the binary image data and update this image’s name. Show this image with the stored id. Check if the stored id and name used to update the image are equal to the id and name in the show list.
  • Test assertion 2: The id and name returned in the show list are equal to the stored id and name used to update the image.
  • Test action 3: Create an image with container_formats ‘bare’, disk_formats ‘raw’ and visibility ‘private’ and store its id returned in the response.
  • Test action 4: Use the stored id to add a tag. Show the image with the stored id and check if the tag used to add is in the image’s tags returned in the show list.
  • Test assertion 3: The tag used to add into the image is in the show list.
  • Test action 5: Use the stored id to delete this tag. Show the image with the stored id and check if the tag used to delete is not in the show list.
  • Test assertion 4: The tag used to delete from the image is not in the show list.
  • Test action 6: Generate a random uuid as the image id. Use the image id to add a tag into the image’s tags.
  • Test assertion 5: The API’s response code is 404.
  • Test action 7: Delete the images created in test action 1 and 3. Show the images and check whether the ids are not in the show list.
  • Test assertion 6: The two ids are not found in the show list.
5.1.13.1.3.5.5.3.2. Pass / fail criteria

This test case evaluates the ability to use Glance v2 API to update images with different parameters. Specifically it verifies that:

  • Glance image API can update image’s name with the existing image id.
  • Glance image API can update image’s tags with the existing image id.
  • Glance image API can’t update image’s tags with a non-existing image id.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.3.5.5.4. Post conditions

None

5.1.13.1.3.5.6. Image deletion tests using the Glance v2 API
5.1.13.1.3.5.6.1. Test case specification

tempest.api.image.v2.test_images.BasicOperationsImagesTest.test_delete_image
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_delete_image_null_id
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_delete_non_existing_image
tempest.api.image.v2.test_images_tags_negative.ImagesTagsNegativeTest.test_delete_non_existing_tag

5.1.13.1.3.5.6.2. Test preconditions

Glance is available.

5.1.13.1.3.5.6.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.3.5.6.3.1. Test execution
  • Test action 1: Create an image with container_formats ‘ami’, disk_formats ‘ami’ and visibility ‘private’. Use the id of the created image to delete the image. List all images and check whether this id is in the list.
  • Test assertion 1: The id of the created image is not found in the list of all images after the deletion operation.
  • Test action 2: Delete images with a null id and check the API’s response code.
  • Test assertion 2: The API’s response code is 404.
  • Test action 3: Generate a random uuid and delete images with this uuid as image id. Check the API’s response code.
  • Test assertion 3: The API’s response code is 404.
  • Test action 4: Create an image with container_formats ‘bare’, disk_formats ‘raw’ and visibility ‘private’. Delete this image’s tag with the image id and a random tag. Check the API’s response code.
  • Test assertion 4: The API’s response code is 404.
  • Test action 5: Delete the images created in test action 1 and 4. List all images and check whether the ids are in the list.
  • Test assertion 5: The two ids are not found in the list.
5.1.13.1.3.5.6.3.2. Pass / fail criteria

The first three test cases evaluate the ability to use Glance v2 API to delete images with an existing image id, a null image id and a non-existing image id. The last one evaluates the ability to use the API to delete a non-existing image tag. Specifically it verifies that:

  • Glance image deletion API can delete the image with an existing id.
  • Glance image deletion API can’t delete an image with a null image id.
  • Glance image deletion API can’t delete an image with a non-existing image id.
  • Glance image deletion API can’t delete a non-existing image tag.

In order to pass this test, all test assertions listed in the test execution above need to pass.
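The delete-then-verify pattern can be sketched in a few lines; `delete_image` is a hypothetical helper returning HTTP-like status codes (204 on success, 404 for an unknown id), matching the behaviour the assertions above rely on:

```python
# In-memory stand-in for the image store (sample record).
images = {"img-1": {"name": "test"}}

def delete_image(image_id):
    """Mimic DELETE /v2/images/{id}: 204 on success, 404 when unknown."""
    if image_id in images:
        del images[image_id]
        return 204
    return 404

first = delete_image("img-1")
assert first == 204
assert "img-1" not in images        # test assertion 1: gone from the list

second = delete_image("img-1")      # deleting the same id again
assert second == 404                # negative case: 404 response code
print(first, second)
```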

5.1.13.1.3.5.6.4. Post conditions

None

5.1.13.1.4. VIM network operations test specification
5.1.13.1.4.1. Scope

The VIM network test area evaluates the ability of the system under test to support VIM network operations. The test cases documented here are the network API test cases in the OpenStack Interop guideline 2017.09 as implemented by the Refstack client. These test cases will evaluate basic OpenStack (as a VIM) network operations including basic CRUD operations on L2 networks, L2 network ports and security groups.

5.1.13.1.4.2. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • CRUD - Create, Read, Update and Delete
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
5.1.13.1.4.3. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.13.1.4.4. Test Area Structure

The test area is structured based on VIM network operations. Each test case is able to run independently, i.e. independently of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All these test cases are included in the test case dovetail.tempest.osinterop of the OVP test suite.

5.1.13.1.4.5. Test Descriptions
5.1.13.1.4.5.1. API Used and Reference

Network: http://developer.openstack.org/api-ref/networking/v2/index.html

  • create network
  • update network
  • list networks
  • show network details
  • delete network
  • create subnet
  • update subnet
  • list subnets
  • show subnet details
  • delete subnet
  • create port
  • bulk create ports
  • update port
  • list ports
  • show port details
  • delete port
  • create security group
  • update security group
  • list security groups
  • show security group
  • delete security group
  • create security group rule
  • list security group rules
  • show security group rule
  • delete security group rule
5.1.13.1.4.5.2. Basic CRUD operations on L2 networks and L2 network ports
5.1.13.1.4.5.2.1. Test case specification

tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_allocation_pools
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_dhcp_enabled
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_gw
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_gw_and_allocation_pools
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_host_routes_and_dns_nameservers
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_without_gateway
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_all_attributes
tempest.api.network.test_networks.NetworksTest.test_create_update_delete_network_subnet
tempest.api.network.test_networks.NetworksTest.test_delete_network_with_subnet
tempest.api.network.test_networks.NetworksTest.test_list_networks
tempest.api.network.test_networks.NetworksTest.test_list_networks_fields
tempest.api.network.test_networks.NetworksTest.test_list_subnets
tempest.api.network.test_networks.NetworksTest.test_list_subnets_fields
tempest.api.network.test_networks.NetworksTest.test_show_network
tempest.api.network.test_networks.NetworksTest.test_show_network_fields
tempest.api.network.test_networks.NetworksTest.test_show_subnet
tempest.api.network.test_networks.NetworksTest.test_show_subnet_fields
tempest.api.network.test_networks.NetworksTest.test_update_subnet_gw_dns_host_routes_dhcp
tempest.api.network.test_ports.PortsTestJSON.test_create_bulk_port
tempest.api.network.test_ports.PortsTestJSON.test_create_port_in_allowed_allocation_pools
tempest.api.network.test_ports.PortsTestJSON.test_create_update_delete_port
tempest.api.network.test_ports.PortsTestJSON.test_list_ports
tempest.api.network.test_ports.PortsTestJSON.test_list_ports_fields
tempest.api.network.test_ports.PortsTestJSON.test_show_port
tempest.api.network.test_ports.PortsTestJSON.test_show_port_fields

5.1.13.1.4.5.2.2. Test preconditions

Neutron is available.

5.1.13.1.4.5.2.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.4.5.2.3.1. Test execution
  • Test action 1: Create a network and create a subnet of this network by setting allocation_pools, then check the details of the subnet and delete the subnet and network
  • Test assertion 1: The allocation_pools returned in the response equal the ones used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 2: Create a network and create a subnet of this network by setting enable_dhcp “True”, then check the details of the subnet and delete the subnet and network
  • Test assertion 2: The enable_dhcp returned in the response is “True” and the network and subnet ids are not found after deletion
  • Test action 3: Create a network and create a subnet of this network by setting gateway_ip, then check the details of the subnet and delete the subnet and network
  • Test assertion 3: The gateway_ip returned in the response equals the one used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 4: Create a network and create a subnet of this network by setting allocation_pools and gateway_ip, then check the details of the subnet and delete the subnet and network
  • Test assertion 4: The allocation_pools and gateway_ip returned in the response equal to the ones used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 5: Create a network and create a subnet of this network by setting host_routes and dns_nameservers, then check the details of the subnet and delete the subnet and network
  • Test assertion 5: The host_routes and dns_nameservers returned in the response equal to the ones used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 6: Create a network and create a subnet of this network without setting gateway_ip, then delete the subnet and network
  • Test assertion 6: The network and subnet ids are not found after deletion
  • Test action 7: Create a network and create a subnet of this network by setting enable_dhcp “true”, gateway_ip, ip_version, cidr, host_routes, allocation_pools and dns_nameservers, then check the details of the subnet and delete the subnet and network
  • Test assertion 7: The values returned in the response equal to the ones used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 8: Create a network and update this network’s name, then create a subnet and update this subnet’s name, delete the subnet and network
  • Test assertion 8: The network’s status and subnet’s status are both ‘ACTIVE’ after creation, their names equal to the new names used to update, and the network and subnet ids are not found after deletion
  • Test action 9: Create a network and create a subnet of this network, then delete this network
  • Test assertion 9: The subnet has also been deleted after deleting the network
  • Test action 10: Create a network and list all networks
  • Test assertion 10: The network created is found in the list
  • Test action 11: Create a network and list networks with the id and name of the created network
  • Test assertion 11: The id and name of the listed network equal the created network’s id and name
  • Test action 12: Create a network and create a subnet of this network, then list all subnets
  • Test assertion 12: The subnet created is found in the list
  • Test action 13: Create a network and create a subnet of this network, then list subnets with the id and network_id of the created subnet
  • Test assertion 13: The id and network_id of the listed subnet equal those of the created subnet
  • Test action 14: Create a network and show network’s details with the id of the created network
  • Test assertion 14: The id and name returned in the response equal to the created network’s id and name
  • Test action 15: Create a network and just show network’s id and name info with the id of the created network
  • Test assertion 15: The keys returned in the response are only id and name, and the values of all the keys equal to network’s id and name
  • Test action 16: Create a network and create a subnet of this network, then show subnet’s details with the id of the created subnet
  • Test assertion 16: The id and cidr info returned in the response equal to the created subnet’s id and cidr
  • Test action 17: Create a network and create a subnet of this network, then show subnet’s id and network_id info with the id of the created subnet
  • Test assertion 17: The keys returned in the response are just id and network_id, and the values of all the keys equal the subnet’s id and network_id
  • Test action 18: Create a network and create a subnet of this network, then update subnet’s name, host_routes, dns_nameservers and gateway_ip
  • Test assertion 18: The name, host_routes, dns_nameservers and gateway_ip returned in the response equal to the values used to update the subnet
  • Test action 19: Create 2 networks and bulk create 2 ports with the ids of the created networks
  • Test assertion 19: The network_id of each port equals the one used to create the port and the admin_state_up of each port is True
  • Test action 20: Create a network and create a subnet of this network by setting allocation_pools, then create a port with the created network’s id
  • Test assertion 20: The ip_address of the created port is in the range of the allocation_pools
  • Test action 21: Create a network and create a port with its id, then update the port’s name and set its admin_state_up to be False
  • Test assertion 21: The name returned in the response equals the name used to update the port and the port’s admin_state_up is False
  • Test action 22: Create a network and create a port with its id, then list all ports
  • Test assertion 22: The created port is found in the list
  • Test action 23: Create a network and create a port with its id, then list ports with the id and mac_address of the created port
  • Test assertion 23: The created port is found in the list
  • Test action 24: Create a network and create a port with its id, then show the port’s details
  • Test assertion 24: The key ‘id’ is in the details
  • Test action 25: Create a network and create a port with its id, then show the port’s id and mac_address info with the port’s id
  • Test assertion 25: The keys returned in the response are just id and mac_address, and the values of all the keys equal to port’s id and mac_address
5.1.13.1.4.5.2.3.2. Pass / fail criteria

These test cases evaluate the ability of basic CRUD operations on L2 networks and L2 network ports. Specifically it verifies that:

  • Subnets can be created successfully by setting different parameters.
  • Subnets can be updated after being created.
  • Ports can be bulk created with network ids.
  • Port’s security group(s) can be updated after being created.
  • Networks/subnets/ports can be listed with their ids and other parameters.
  • All details or special fields’ info of networks/subnets/ports can be shown with their ids.
  • Networks/subnets/ports can be successfully deleted.

In order to pass this test, all test assertions listed in the test execution above need to pass.
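The create-and-echo pattern behind most of these assertions (test action 1, for example) can be sketched with a canned request/response pair; the network id, subnet id and address values below are hypothetical samples in the shape of POST /v2.0/subnets:

```python
# Hypothetical subnet-create request, trimmed to the checked fields.
request = {
    "network_id": "net-1",
    "cidr": "10.0.0.0/24",
    "ip_version": 4,
    "allocation_pools": [{"start": "10.0.0.10", "end": "10.0.0.20"}],
}

# A conforming Neutron echoes the pools back in the created subnet.
response = {"subnet": dict(request, id="sub-1")}

# Test assertion 1: the allocation_pools in the response equal the
# ones used to create the subnet.
assert (response["subnet"]["allocation_pools"]
        == request["allocation_pools"])
print(response["subnet"]["id"])
```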

5.1.13.1.4.5.2.4. Post conditions

N/A

5.1.13.1.4.5.3. Basic CRUD operations on security groups
5.1.13.1.4.5.3.1. Test case specification

tempest.api.network.test_security_groups.SecGroupTest.test_create_list_update_show_delete_security_group
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_additional_args
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_icmp_type_code
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_protocol_integer_value
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_remote_group_id
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_remote_ip_prefix
tempest.api.network.test_security_groups.SecGroupTest.test_create_show_delete_security_group_rule
tempest.api.network.test_security_groups.SecGroupTest.test_list_security_groups
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_additional_default_security_group_fails
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_duplicate_security_group_rule_fails
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_ethertype
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_protocol
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_remote_ip_prefix
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_invalid_ports
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_non_existent_remote_groupid
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_non_existent_security_group
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_delete_non_existent_security_group
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_show_non_existent_security_group
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_show_non_existent_security_group_rule

5.1.13.1.4.5.3.2. Test preconditions

Neutron is available.

5.1.13.1.4.5.3.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.4.5.3.3.1. Test execution
  • Test action 1: Create a security group SG1, list all security groups, update the name and description of SG1, show details of SG1 and delete SG1
  • Test assertion 1: SG1 is in the list, the name and description of SG1 equal to the ones used to update it, the name and description of SG1 shown in the details equal to the ones used to update it, and SG1’s id is not found after deletion
  • Test action 2: Create a security group SG1, and create a rule with protocol ‘tcp’, port_range_min and port_range_max
  • Test assertion 2: The values returned in the response equal to the ones used to create the rule
  • Test action 3: Create a security group SG1, and create a rule with protocol ‘icmp’ and icmp_type_codes
  • Test assertion 3: The values returned in the response equal to the ones used to create the rule
  • Test action 4: Create a security group SG1, and create a rule with protocol ‘17’
  • Test assertion 4: The values returned in the response equal to the ones used to create the rule
  • Test action 5: Create a security group SG1, and create a rule with protocol ‘udp’, port_range_min, port_range_max and remote_group_id
  • Test assertion 5: The values returned in the response equal to the ones used to create the rule
  • Test action 6: Create a security group SG1, and create a rule with protocol ‘tcp’, port_range_min, port_range_max and remote_ip_prefix
  • Test assertion 6: The values returned in the response equal to the ones used to create the rule
  • Test action 7: Create a security group SG1, create 3 rules with protocol ‘tcp’, ‘udp’ and ‘icmp’ respectively, show details of each rule, list all rules and delete all rules
  • Test assertion 7: The values in the shown details equal to the ones used to create the rule, all rules are found in the list, and all rules are not found after deletion
  • Test action 8: List all security groups
  • Test assertion 8: There is one default security group in the list
  • Test action 9: Create a security group whose name is ‘default’
  • Test assertion 9: Failed to create this security group because of name conflict
  • Test action 10: Create a security group SG1, create a rule with protocol ‘tcp’, port_range_min and port_range_max, and create another tcp rule with the same parameters
  • Test assertion 10: Failed to create this security group rule because of duplicate protocol
  • Test action 11: Create a security group SG1, and create a rule with ethertype ‘bad_ethertype’
  • Test assertion 11: Failed to create this security group rule because of bad ethertype
  • Test action 12: Create a security group SG1, and create a rule with protocol ‘bad_protocol_name’
  • Test assertion 12: Failed to create this security group rule because of bad protocol
  • Test action 13: Create a security group SG1, and create a rule with remote_ip_prefix ‘92.168.1./24’, ‘192.168.1.1/33’, ‘bad_prefix’ and ‘256’ respectively
  • Test assertion 13: Failed to create these security group rules because of bad remote_ip_prefix
  • Test action 14: Create a security group SG1, and create a tcp rule with (port_range_min, port_range_max) (-16, 80), (80, 79), (80, 65536), (None, 6) and (-16, 65536) respectively
  • Test assertion 14: Failed to create these security group rules because of bad ports
  • Test action 15: Create a security group SG1, and create a tcp rule with remote_group_id ‘bad_group_id’ and a random uuid respectively
  • Test assertion 15: Failed to create these security group rules because of nonexistent remote_group_id
  • Test action 16: Create a security group SG1, and create a rule with a random uuid as security_group_id
  • Test assertion 16: Failed to create these security group rules because of nonexistent security_group_id
  • Test action 17: Generate a random uuid and use this id to delete security group
  • Test assertion 17: Failed to delete security group because of nonexistent security_group_id
  • Test action 18: Generate a random uuid and use this id to show security group
  • Test assertion 18: Failed to show security group because of nonexistent id of security group
  • Test action 19: Generate a random uuid and use this id to show security group rule
  • Test assertion 19: Failed to show security group rule because of nonexistent id of security group rule
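The negative cases in test actions 13 and 14 come down to input validation of the rule parameters. The rules being checked can be illustrated locally with Python's standard ipaddress module and a simple port-range predicate; this is a sketch of the validation semantics only, not Neutron's actual implementation, and the function names are illustrative:

```python
import ipaddress

def valid_remote_ip_prefix(prefix):
    """True if the prefix parses as a CIDR network (host bits tolerated)."""
    try:
        ipaddress.ip_network(prefix, strict=False)
        return True
    except ValueError:
        return False

def valid_port_range(port_min, port_max):
    """Both ports must be set, lie within 1..65535, and satisfy min <= max."""
    if port_min is None or port_max is None:
        return False
    return 1 <= port_min <= port_max <= 65535

# Prefixes rejected in test action 13:
for bad in ("92.168.1./24", "192.168.1.1/33", "bad_prefix"):
    assert not valid_remote_ip_prefix(bad)
assert valid_remote_ip_prefix("192.168.1.0/24")

# (port_range_min, port_range_max) pairs rejected in test action 14:
for bad_pair in ((-16, 80), (80, 79), (80, 65536), (None, 6), (-16, 65536)):
    assert not valid_port_range(*bad_pair)
assert valid_port_range(22, 80)
```

Each rejected input maps to the "bad request" responses the negative tempest tests assert on.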
5.1.13.1.4.5.3.3.2. Pass / fail criteria

These test cases evaluate the ability to perform basic CRUD operations on security groups and security group rules. Specifically, they verify that:

  • Security groups can be created, listed, updated, shown and deleted.
  • Security group rules can be created with different parameters, listed, shown and deleted.
  • Cannot create an additional default security group.
  • Cannot create duplicate security group rules.
  • Cannot create security group rules with bad ethertype, protocol, remote_ip_prefix, ports, remote_group_id and security_group_id.
  • Cannot show or delete security groups or security group rules with nonexistent ids.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.4.5.3.4. Post conditions

N/A

5.1.13.1.4.5.4. CRUD operations on subnet pools
5.1.13.1.4.5.4.1. Test case specification

tempest.api.network.test_subnetpools_extensions.SubnetPoolsTestJSON.test_create_list_show_update_delete_subnetpools

5.1.13.1.4.5.4.2. Test preconditions

Neutron is available.

5.1.13.1.4.5.4.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.4.5.4.3.1. Test execution
  • Test action 1: Create a subnetpool SNP1 with a specific name and get the name from the response body
  • Test assertion 1: The name retrieved from the body is the same as the name used to create SNP1
  • Test action 2: Show SNP1 and get the name from the response body
  • Test assertion 2: The name retrieved from the body is the same as the name used to create SNP1
  • Test action 3: Update the name of SNP1 and get the new name from the response body
  • Test assertion 3: The name retrieved from the body is the same as the name used to update SNP1
  • Test action 4: Delete SNP1
5.1.13.1.4.5.4.3.2. Pass / fail criteria

These test cases evaluate the ability to perform basic CRUD operations on subnetpools. Specifically, they verify that:

  • Subnetpools can be created, updated, shown and deleted.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.4.5.4.4. Post conditions

N/A

5.1.13.1.5. VIM volume operations test specification
5.1.13.1.5.1. Scope

The VIM volume operations test area evaluates the ability of the system under test to support VIM volume operations. The test cases documented here are the volume API test cases in the OpenStack Interop guideline 2017.09 as implemented by the RefStack client. These test cases will evaluate basic OpenStack (as a VIM) volume operations, including:

  • Volume attach and detach operations
  • Volume service availability zone operations
  • Volume cloning operations
  • Image copy-to-volume operations
  • Volume creation and deletion operations
  • Volume service extension listing
  • Volume metadata operations
  • Volume snapshot operations
5.1.13.1.5.2. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • SUT - System Under Test
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
5.1.13.1.5.3. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM deployed with a Pharos compliant infrastructure.

5.1.13.1.5.4. Test Area Structure

The test area is structured based on VIM volume API operations. Each test case is able to run independently, i.e. regardless of the state created by any previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All these test cases are included in the test case dovetail.tempest.osinterop of the OVP test suite.

5.1.13.1.5.5. Test Descriptions
5.1.13.1.5.5.1. API Used and Reference

Block storage: https://developer.openstack.org/api-ref/block-storage

  • create volume
  • delete volume
  • update volume
  • attach volume to server
  • detach volume from server
  • create volume metadata
  • update volume metadata
  • delete volume metadata
  • list volume
  • create snapshot
  • update snapshot
  • delete snapshot
5.1.13.1.5.5.2. Test Case 1 - Upload volumes with Cinder v2 or v3 API
5.1.13.1.5.5.2.1. Test case specification

tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_volume_upload

5.1.13.1.5.5.2.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.2.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.2.3.1. Test execution
  • Test action 1: Create a volume VOL1
  • Test action 2: Convert VOL1 and upload it to Glance as image IMG1
  • Test action 3: Wait until the status of IMG1 is ‘ACTIVE’ and VOL1 is ‘available’
  • Test action 4: Show the details of IMG1
  • Test assertion 1: The name of IMG1 shown is the same as the name used to upload it
  • Test assertion 2: The disk_format of IMG1 is the same as the disk_format of VOL1
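Test action 3 is a wait-and-poll step: the test repeatedly queries the resource status until it reaches the wanted value or a timeout expires. The pattern can be sketched with a stubbed status getter; `wait_for_status` and the status strings here are illustrative, not Tempest's actual API:

```python
import time

def wait_for_status(get_status, wanted, timeout=10.0, interval=0.01):
    """Poll get_status() until it returns `wanted` or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == wanted:
            return status
        if status == "error":
            raise RuntimeError("resource went into error state")
        time.sleep(interval)
    raise TimeoutError("resource never reached %r" % wanted)

# Stub: the image becomes 'active' on the third poll.
statuses = iter(["queued", "saving", "active"])
assert wait_for_status(lambda: next(statuses), "active") == "active"
```

A real run would poll the image and volume endpoints until IMG1 is 'active' and VOL1 is back to 'available', as the test action describes.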
5.1.13.1.5.5.2.3.2. Pass / fail criteria

This test case evaluates the volume API ability of uploading images. Specifically, the test verifies that:

  • The volume API can convert a volume to an image and upload it to Glance.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.2.4. Post conditions

N/A

5.1.13.1.5.5.3. Test Case 2 - Volume service availability zone operations with the Cinder v2 or v3 API
5.1.13.1.5.5.3.1. Test case specification

tempest.api.volume.test_availability_zone.AvailabilityZoneTestJSON.test_get_availability_zone_list

5.1.13.1.5.5.3.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.3.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.3.3.1. Test execution
  • Test action 1: List all existent availability zones
  • Test assertion 1: Verify the availability zone list length is greater than 0
5.1.13.1.5.5.3.3.2. Pass / fail criteria

This test case evaluates the volume API ability of listing availability zones. Specifically, the test verifies that:

  • Availability zones can be listed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.3.4. Post conditions

N/A

5.1.13.1.5.5.4. Test Case 3 - Volume cloning operations with the Cinder v2 or v3 API
5.1.13.1.5.5.4.1. Test case specification

tempest.api.volume.test_volumes_get.VolumesGetTest.test_volume_create_get_update_delete_as_clone

5.1.13.1.5.5.4.2. Test preconditions
  • Volume extension API
  • Cinder volume clones feature is enabled
5.1.13.1.5.5.4.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.4.3.1. Test execution
  • Test action 1: Create a volume VOL1
  • Test action 2: Create a volume VOL2 from source volume VOL1 with a specific name and metadata
  • Test action 3: Wait for VOL2 to reach ‘available’ status
  • Test assertion 1: Verify the name of VOL2 is correct
  • Test action 4: Retrieve VOL2’s detail information
  • Test assertion 2: Verify the retrieved volume name, ID and metadata are the same as VOL2
  • Test assertion 3: Verify VOL2’s bootable flag is ‘False’
  • Test action 5: Update the name of VOL2 with the original value
  • Test action 6: Update the name of VOL2 with a new value
  • Test assertion 4: Verify the name of VOL2 is updated successfully
  • Test action 7: Create a volume VOL3 with no name specified and a description containing the characters ‘@#$%^*’
  • Test assertion 5: Verify VOL3 is created successfully
  • Test action 8: Update the name and description of VOL3 with the original values
  • Test assertion 6: Verify VOL3’s bootable flag is ‘False’
5.1.13.1.5.5.4.3.2. Pass / fail criteria

This test case evaluates the volume API ability of creating a cloned volume from a source volume, getting cloned volume detail information and updating cloned volume attributes.

Specifically, the test verifies that:

  • Cloned volume can be created from a source volume.
  • Cloned volume detail information can be retrieved.
  • Cloned volume detail information can be updated.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.4.4. Post conditions

N/A

5.1.13.1.5.5.5. Test Case 4 - Image copy-to-volume operations with the Cinder v2 or v3 API
5.1.13.1.5.5.5.1. Test case specification

tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_volume_bootable
tempest.api.volume.test_volumes_get.VolumesGetTest.test_volume_create_get_update_delete_from_image

5.1.13.1.5.5.5.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.5.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.5.3.1. Test execution
  • Test action 1: Set a provided volume VOL1’s bootable flag to ‘True’
  • Test action 2: Retrieve VOL1’s bootable flag
  • Test assertion 1: Verify VOL1’s bootable flag is ‘True’
  • Test action 3: Set a provided volume VOL1’s bootable flag to ‘False’
  • Test action 4: Retrieve VOL1’s bootable flag
  • Test assertion 2: Verify VOL1’s bootable flag is ‘False’
  • Test action 5: Create a bootable volume VOL2 from one image with a specific name and metadata
  • Test action 6: Wait for VOL2 to reach ‘available’ status
  • Test assertion 3: Verify the name of VOL2 is correct
  • Test action 7: Retrieve VOL2’s information
  • Test assertion 4: Verify the retrieved volume name, ID and metadata are the same as VOL2
  • Test assertion 5: Verify VOL2’s bootable flag is ‘True’
  • Test action 8: Update the name of VOL2 with the original value
  • Test action 9: Update the name of VOL2 with a new value
  • Test assertion 6: Verify the name of VOL2 is updated successfully
  • Test action 10: Create a volume VOL3 with no name specified and a description containing the characters ‘@#$%^*’
  • Test assertion 7: Verify VOL3 is created successfully
  • Test action 11: Update the name and description of VOL3 with the original values
  • Test assertion 8: Verify VOL3’s bootable flag is ‘True’
5.1.13.1.5.5.5.3.2. Pass / fail criteria

This test case evaluates the volume API ability of updating volume’s bootable flag and creating a bootable volume from an image, getting bootable volume detail information and updating bootable volume.

Specifically, the test verifies that:

  • Volume bootable flag can be set and retrieved.
  • Bootable volume can be created from an image.
  • Bootable volume detail information can be retrieved.
  • Bootable volume detail information can be updated.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.5.4. Post conditions

N/A

5.1.13.1.5.5.6. Test Case 5 - Volume creation and deletion operations with the Cinder v2 or v3 API
5.1.13.1.5.5.6.1. Test case specification

tempest.api.volume.test_volumes_get.VolumesGetTest.test_volume_create_get_update_delete
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_invalid_size
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_nonexistent_source_volid
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_nonexistent_volume_type
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_without_passing_size
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_size_negative
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_size_zero

5.1.13.1.5.5.6.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.6.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.6.3.1. Test execution
  • Test action 1: Create a volume VOL1 with a specific name and metadata
  • Test action 2: Wait for VOL1 to reach ‘available’ status
  • Test assertion 1: Verify the name of VOL1 is correct
  • Test action 3: Retrieve VOL1’s information
  • Test assertion 2: Verify the retrieved volume name, ID and metadata are the same as VOL1
  • Test assertion 3: Verify VOL1’s bootable flag is ‘False’
  • Test action 4: Update the name of VOL1 with the original value
  • Test action 5: Update the name of VOL1 with a new value
  • Test assertion 4: Verify the name of VOL1 is updated successfully
  • Test action 6: Create a volume VOL2 with no name specified and a description containing the characters ‘@#$%^*’
  • Test assertion 5: Verify VOL2 is created successfully
  • Test action 7: Update the name and description of VOL2 with the original values
  • Test assertion 6: Verify VOL2’s bootable flag is ‘False’
  • Test action 8: Create a volume with an invalid size ‘#$%’
  • Test assertion 7: Verify create volume failed, a bad request error is returned in the response
  • Test action 9: Create a volume with a nonexistent source volume
  • Test assertion 8: Verify create volume failed, a ‘Not Found’ error is returned in the response
  • Test action 10: Create a volume with a nonexistent volume type
  • Test assertion 9: Verify create volume failed, a ‘Not Found’ error is returned in the response
  • Test action 11: Create a volume without passing a volume size
  • Test assertion 10: Verify create volume failed, a bad request error is returned in the response
  • Test action 12: Create a volume with a negative volume size
  • Test assertion 11: Verify create volume failed, a bad request error is returned in the response
  • Test action 13: Create a volume with volume size ‘0’
  • Test assertion 12: Verify create volume failed, a bad request error is returned in the response
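The negative size cases (test actions 8, 11, 12 and 13) all reduce to one input check: the requested size must be a strictly positive integer. A minimal sketch of that rule, as a local illustration rather than Cinder's actual code:

```python
def valid_volume_size(size):
    """Accept only a strictly positive integer size (in GiB)."""
    try:
        size = int(size)
    except (TypeError, ValueError):
        # Non-numeric input such as '#$%' or a missing size (None).
        return False
    return size > 0

# Sizes rejected in the negative tests: invalid, missing, negative, zero.
for bad in ("#$%", None, -1, 0):
    assert not valid_volume_size(bad)
assert valid_volume_size(1)
```

Each rejected size corresponds to a "bad request" response in the assertions above.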
5.1.13.1.5.5.6.3.2. Pass / fail criteria

This test case evaluates the volume API ability of creating a volume, getting volume detail information, updating a volume and deleting a volume. Specifically, the test verifies that:

  • Volumes can be created and deleted.
  • Volume detail information can be retrieved and updated.
  • Creating a volume with an invalid size is not allowed.
  • Creating a volume with a nonexistent source volume or volume type is not allowed.
  • Creating a volume without passing a volume size is not allowed.
  • Creating a volume with a negative volume size is not allowed.
  • Creating a volume with volume size ‘0’ is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.6.4. Post conditions

N/A

5.1.13.1.5.5.7. Test Case 6 - Volume service extension listing operations with the Cinder v2 or v3 API
5.1.13.1.5.5.7.1. Test case specification

tempest.api.volume.test_extensions.ExtensionsTestJSON.test_list_extensions

5.1.13.1.5.5.7.2. Test preconditions
  • Volume extension API
  • At least one Cinder extension is configured
5.1.13.1.5.5.7.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.7.3.1. Test execution
  • Test action 1: List all cinder service extensions
  • Test assertion 1: Verify all extensions are listed in the extension list
5.1.13.1.5.5.7.3.2. Pass / fail criteria

This test case evaluates the volume API ability of listing all existent volume service extensions. Specifically, the test verifies that:

  • Cinder service extensions can be listed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.7.4. Post conditions

N/A

5.1.13.1.5.5.8. Test Case 7 - Volume GET operations with the Cinder v2 or v3 API
5.1.13.1.5.5.8.1. Test case specification

tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_get_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_get_volume_without_passing_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_volume_get_nonexistent_volume_id

5.1.13.1.5.5.8.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.8.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.8.3.1. Test execution
  • Test action 1: Retrieve a volume with an invalid volume ID
  • Test assertion 1: Verify retrieve volume failed, a ‘Not Found’ error is returned in the response
  • Test action 2: Retrieve a volume with an empty volume ID
  • Test assertion 2: Verify retrieve volume failed, a ‘Not Found’ error is returned in the response
  • Test action 3: Retrieve a volume with a nonexistent volume ID
  • Test assertion 3: Verify retrieve volume failed, a ‘Not Found’ error is returned in the response
5.1.13.1.5.5.8.3.2. Pass / fail criteria

This test case evaluates the volume API ability of getting volumes. Specifically, the test verifies that:

  • Getting a volume with an invalid, an empty or a nonexistent volume ID is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.8.4. Post conditions

N/A

5.1.13.1.5.5.9. Test Case 8 - Volume listing operations with the Cinder v2 or v3 API
5.1.13.1.5.5.9.1. Test case specification

tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_by_name
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_details_by_name
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_param_display_name_and_status
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_detail_param_display_name_and_status
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_detail_param_metadata
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_details
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_param_metadata
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volumes_list_by_availability_zone
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volumes_list_by_status
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volumes_list_details_by_availability_zone
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volumes_list_details_by_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_detail_with_invalid_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_detail_with_nonexistent_name
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_with_invalid_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_with_nonexistent_name
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_details_pagination
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_details_with_multiple_params
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_pagination

5.1.13.1.5.5.9.2. Test preconditions
  • Volume extension API
  • The backing file for the volume group that Nova uses has space for at least 3 1G volumes
5.1.13.1.5.5.9.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.9.3.1. Test execution
  • Test action 1: List all existent volumes
  • Test assertion 1: Verify the volume list is complete
  • Test action 2: List existent volumes and filter the volume list by volume name
  • Test assertion 2: Verify the length of filtered volume list is 1 and the retrieved volume is correct
  • Test action 3: List existent volumes in detail and filter the volume list by volume name
  • Test assertion 3: Verify the length of filtered volume list is 1 and the retrieved volume is correct
  • Test action 4: List existent volumes and filter the volume list by volume name and status ‘available’
  • Test assertion 4: Verify the name and status parameters of the fetched volume are correct
  • Test action 5: List existent volumes in detail and filter the volume list by volume name and status ‘available’
  • Test assertion 5: Verify the name and status parameters of the fetched volume are correct
  • Test action 6: List all existent volumes in detail and filter the volume list by volume metadata
  • Test assertion 6: Verify the metadata parameter of the fetched volume is correct
  • Test action 7: List all existent volumes in detail
  • Test assertion 7: Verify the volume list is complete
  • Test action 8: List all existent volumes and filter the volume list by volume metadata
  • Test assertion 8: Verify the metadata parameter of the fetched volume is correct
  • Test action 9: List existent volumes and filter the volume list by availability zone
  • Test assertion 9: Verify the availability zone parameter of the fetched volume is correct
  • Test action 10: List all existent volumes and filter the volume list by volume status ‘available’
  • Test assertion 10: Verify the status parameter of the fetched volume is correct
  • Test action 11: List existent volumes in detail and filter the volume list by availability zone
  • Test assertion 11: Verify the availability zone parameter of the fetched volume is correct
  • Test action 12: List all existent volumes in detail and filter the volume list by volume status ‘available’
  • Test assertion 12: Verify the status parameter of the fetched volume is correct
  • Test action 13: List all existent volumes in detail and filter the volume list by an invalid volume status ‘null’
  • Test assertion 13: Verify the filtered volume list is empty
  • Test action 14: List all existent volumes in detail and filter the volume list by a non-existent volume name
  • Test assertion 14: Verify the filtered volume list is empty
  • Test action 15: List all existent volumes and filter the volume list by an invalid volume status ‘null’
  • Test assertion 15: Verify the filtered volume list is empty
  • Test action 16: List all existent volumes and filter the volume list by a non-existent volume name
  • Test assertion 16: Verify the filtered volume list is empty
  • Test action 17: List all existent volumes in detail and paginate the volume list by desired volume IDs
  • Test assertion 17: Verify only the desired volumes are listed in the filtered volume list
  • Test action 18: List all existent volumes in detail and filter the volume list by volume status ‘available’ and display limit ‘2’
  • Test action 19: Sort the filtered volume list by IDs in ascending order
  • Test assertion 18: Verify the length of filtered volume list is 2
  • Test assertion 19: Verify the status of retrieved volumes is correct
  • Test assertion 20: Verify the filtered volume list is sorted correctly
  • Test action 20: List all existent volumes in detail and filter the volume list by volume status ‘available’ and display limit ‘2’
  • Test action 21: Sort the filtered volume list by IDs in descending order
  • Test assertion 21: Verify the length of filtered volume list is 2
  • Test assertion 22: Verify the status of retrieved volumes is correct
  • Test assertion 23: Verify the filtered volume list is sorted correctly
  • Test action 22: List all existent volumes and paginate the volume list by desired volume IDs
  • Test assertion 24: Verify only the desired volumes are listed in the filtered volume list
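The listing behaviour exercised above combines three mechanisms: exact-match attribute filters, limit plus sort direction, and marker-based pagination. Their semantics can be sketched over plain dicts; this is an illustration of the API semantics with a hypothetical `list_volumes` helper, not the Cinder implementation:

```python
def list_volumes(volumes, filters=None, limit=None, sort_dir="asc", marker=None):
    """Filter by exact attribute match, sort by id, page after `marker`, cap at `limit`."""
    result = [v for v in volumes
              if all(v.get(k) == val for k, val in (filters or {}).items())]
    result.sort(key=lambda v: v["id"], reverse=(sort_dir == "desc"))
    if marker is not None:
        # Marker pagination: return only entries after the marker id.
        ids = [v["id"] for v in result]
        result = result[ids.index(marker) + 1:]
    if limit is not None:
        result = result[:limit]
    return result

vols = [
    {"id": "a1", "name": "vol1", "status": "available"},
    {"id": "b2", "name": "vol2", "status": "available"},
    {"id": "c3", "name": "vol3", "status": "in-use"},
]

# Filter by status (actions 10 and 12): only 'available' volumes come back.
assert [v["id"] for v in list_volumes(vols, {"status": "available"})] == ["a1", "b2"]
# An invalid status yields an empty list (actions 13 and 15).
assert list_volumes(vols, {"status": "null"}) == []
# Limit 2, descending by id (actions 20 and 21).
assert [v["id"] for v in list_volumes(vols, limit=2, sort_dir="desc")] == ["c3", "b2"]
# Marker pagination (actions 17 and 22): everything after 'a1'.
assert [v["id"] for v in list_volumes(vols, marker="a1")] == ["b2", "c3"]
```

The real API expresses the same semantics through query parameters (name, status, metadata, availability_zone, limit, sort_dir, marker).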
5.1.13.1.5.5.9.3.2. Pass / fail criteria

This test case evaluates the volume API ability of getting a list of volumes and filtering the volume list. Specifically, the test verifies that:

  • Getting a list of volumes (in detail) is successful.
  • Getting a list of volumes (in detail) filtered by name/status/metadata/availability zone is successful.
  • Volume list pagination functionality is working.
  • Getting a list of volumes in detail using combined filter conditions is successful.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.9.4. Post conditions

N/A

5.1.13.1.5.5.10. Test Case 9 - Volume metadata operations with the Cinder v2 or v3 API
5.1.13.1.5.5.10.1. Test case specification

tempest.api.volume.test_volume_metadata.VolumesMetadataTest.test_crud_volume_metadata
tempest.api.volume.test_volume_metadata.VolumesMetadataTest.test_update_show_volume_metadata_item

5.1.13.1.5.5.10.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.10.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.10.3.1. Test execution
  • Test action 1: Create metadata for a provided volume VOL1
  • Test action 2: Get the metadata of VOL1
  • Test assertion 1: Verify the metadata of VOL1 is correct
  • Test action 3: Update the metadata of VOL1
  • Test assertion 2: Verify the metadata of VOL1 is updated
  • Test action 4: Delete one metadata item ‘key1’ of VOL1
  • Test assertion 3: Verify the metadata item ‘key1’ is deleted
  • Test action 5: Create metadata for a provided volume VOL2
  • Test assertion 4: Verify the metadata of VOL2 is correct
  • Test action 6: Update one metadata item ‘key3’ of VOL2
  • Test assertion 5: Verify the metadata of VOL2 is updated
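The metadata operations above map onto create/replace/item-update/item-delete semantics over a key-value store. A dict-backed sketch of those semantics (an illustration under assumed semantics, with illustrative method names, not the Cinder implementation):

```python
class VolumeMetadata:
    """Dict-backed model of the volume metadata endpoints."""

    def __init__(self):
        self._meta = {}

    def create(self, metadata):
        # Create metadata: merge the new keys into the existing set.
        self._meta.update(metadata)
        return dict(self._meta)

    def replace(self, metadata):
        # Update the whole collection: full replacement.
        self._meta = dict(metadata)
        return dict(self._meta)

    def update_item(self, key, value):
        # Update a single metadata item.
        self._meta[key] = value
        return {key: value}

    def delete_item(self, key):
        # Delete a single metadata item.
        del self._meta[key]

meta = VolumeMetadata()
meta.create({"key1": "v1", "key2": "v2"})           # test action 1
assert meta._meta == {"key1": "v1", "key2": "v2"}   # test assertion 1
meta.replace({"key1": "v1", "key3": "v3"})          # test action 3
meta.delete_item("key1")                            # test action 4
assert "key1" not in meta._meta                     # test assertion 3
meta.update_item("key3", "new")                     # test action 6
assert meta._meta["key3"] == "new"                  # test assertion 5
```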
5.1.13.1.5.5.10.3.2. Pass / fail criteria

This test case evaluates the volume API ability of creating metadata for a volume, getting the metadata of a volume, updating volume metadata and deleting a metadata item of a volume. Specifically, the test verifies that:

  • Metadata can be created for a volume.
  • Metadata of a volume can be retrieved.
  • Volume metadata and metadata items can be updated.
  • A metadata item of a volume can be deleted.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.10.4. Post conditions

N/A

5.1.13.1.5.5.11. Test Case 10 - Verification of read-only status on volumes with the Cinder v2 or v3 API
5.1.13.1.5.5.11.1. Test case specification

tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_volume_readonly_update

5.1.13.1.5.5.11.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.11.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.11.3.1. Test execution
  • Test action 1: Update a provided volume VOL1’s read-only access mode to ‘True’
  • Test assertion 1: Verify VOL1 is in read-only access mode
  • Test action 2: Update a provided volume VOL1’s read-only access mode to ‘False’
  • Test assertion 2: Verify VOL1 is not in read-only access mode
5.1.13.1.5.5.11.3.2. Pass / fail criteria

This test case evaluates the volume API ability of setting and updating volume read-only access mode. Specifically, the test verifies that:

  • Volume read-only access mode can be set and updated.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.11.4. Post conditions

N/A

5.1.13.1.5.5.12. Test Case 11 - Volume reservation operations with the Cinder v2 or v3 API
5.1.13.1.5.5.12.1. Test case specification

tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_reserve_unreserve_volume
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_reserve_volume_with_negative_volume_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_reserve_volume_with_nonexistent_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_unreserve_volume_with_nonexistent_volume_id

5.1.13.1.5.5.12.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.12.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.12.3.1. Test execution
  • Test action 1: Update a provided volume VOL1 as reserved
  • Test assertion 1: Verify VOL1 is in ‘attaching’ status
  • Test action 2: Update VOL1 as un-reserved
  • Test assertion 2: Verify VOL1 is in ‘available’ status
  • Test action 3: Update a provided volume VOL2 as reserved
  • Test action 4: Update VOL2 as reserved again
  • Test assertion 3: Verify update VOL2 status failed, a ‘Bad Request’ error is returned in the response
  • Test action 5: Update VOL2 as un-reserved
  • Test action 6: Update a non-existent volume as reserved by using an invalid volume ID
  • Test assertion 4: Verify update non-existent volume as reserved failed, a ‘Not Found’ error is returned in the response
  • Test action 7: Update a non-existent volume as un-reserved by using an invalid volume ID
  • Test assertion 5: Verify update non-existent volume as un-reserved failed, a ‘Not Found’ error is returned in the response
5.1.13.1.5.5.12.3.2. Pass / fail criteria

This test case evaluates the ability of the volume API to reserve and un-reserve volumes. Specifically, the test verifies that:

  • Volumes can be reserved and un-reserved.
  • Updating a non-existent volume as reserved is not allowed.
  • Updating a non-existent volume as un-reserved is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.
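The reserve/un-reserve behaviour exercised above can be summarised as a small state machine. The sketch below is a hedged illustration, not the Cinder implementation: `BadRequest` and `NotFound` stand in for the HTTP 400 and 404 errors the real API returns.

```python
# Toy state machine for volume reservation semantics.
class BadRequest(Exception): pass   # stands in for HTTP 400
class NotFound(Exception): pass     # stands in for HTTP 404

class VolumeService:
    def __init__(self, volume_ids):
        # every known volume starts out 'available'
        self.status = {vid: "available" for vid in volume_ids}

    def reserve(self, vid):
        if vid not in self.status:
            raise NotFound(vid)                 # invalid/nonexistent volume ID
        if self.status[vid] != "available":
            raise BadRequest(self.status[vid])  # already reserved
        self.status[vid] = "attaching"

    def unreserve(self, vid):
        if vid not in self.status:
            raise NotFound(vid)
        self.status[vid] = "available"

svc = VolumeService(["VOL1", "VOL2"])
svc.reserve("VOL1")
assert svc.status["VOL1"] == "attaching"   # Test assertion 1
svc.unreserve("VOL1")
assert svc.status["VOL1"] == "available"   # Test assertion 2
```

Reserving VOL2 twice raises `BadRequest` (assertion 3), while reserving or un-reserving an unknown ID raises `NotFound` (assertions 4 and 5).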

5.1.13.1.5.5.12.4. Post conditions

N/A

5.1.13.1.5.5.13. Test Case 12 - Volume snapshot creation/deletion operations with the Cinder v2 or v3 API
5.1.13.1.5.5.13.1. Test case specification

tempest.api.volume.test_snapshot_metadata.SnapshotMetadataTestJSON.test_crud_snapshot_metadata
tempest.api.volume.test_snapshot_metadata.SnapshotMetadataTestJSON.test_update_show_snapshot_metadata_item
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_nonexistent_snapshot_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_delete_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_delete_volume_without_passing_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_volume_delete_nonexistent_volume_id
tempest.api.volume.test_volumes_snapshots.VolumesSnapshotTestJSON.test_snapshot_create_get_list_update_delete
tempest.api.volume.test_volumes_snapshots.VolumesSnapshotTestJSON.test_volume_from_snapshot
tempest.api.volume.test_volumes_snapshots_list.VolumesSnapshotListTestJSON.test_snapshots_list_details_with_params
tempest.api.volume.test_volumes_snapshots_list.VolumesSnapshotListTestJSON.test_snapshots_list_with_params
tempest.api.volume.test_volumes_snapshots_negative.VolumesSnapshotNegativeTestJSON.test_create_snapshot_with_nonexistent_volume_id
tempest.api.volume.test_volumes_snapshots_negative.VolumesSnapshotNegativeTestJSON.test_create_snapshot_without_passing_volume_id

5.1.13.1.5.5.13.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.13.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.13.3.1. Test execution
  • Test action 1: Create metadata for a provided snapshot SNAP1
  • Test action 2: Get the metadata of SNAP1
  • Test assertion 1: Verify the metadata of SNAP1 is correct
  • Test action 3: Update the metadata of SNAP1
  • Test assertion 2: Verify the metadata of SNAP1 is updated
  • Test action 4: Delete one metadata item ‘key3’ of SNAP1
  • Test assertion 3: Verify the metadata item ‘key3’ is deleted
  • Test action 5: Create metadata for a provided snapshot SNAP2
  • Test assertion 4: Verify the metadata of SNAP2 is correct
  • Test action 6: Update one metadata item ‘key3’ of SNAP2
  • Test assertion 5: Verify the metadata of SNAP2 is updated
  • Test action 7: Create a volume with a nonexistent snapshot
  • Test assertion 6: Verify create volume failed, a ‘Not Found’ error is returned in the response
  • Test action 8: Delete a volume with an invalid volume ID
  • Test assertion 7: Verify delete volume failed, a ‘Not Found’ error is returned in the response
  • Test action 9: Delete a volume with an empty volume ID
  • Test assertion 8: Verify delete volume failed, a ‘Not Found’ error is returned in the response
  • Test action 10: Delete a volume with a nonexistent volume ID
  • Test assertion 9: Verify delete volume failed, a ‘Not Found’ error is returned in the response
  • Test action 11: Create a snapshot SNAP2 from a provided volume VOL1
  • Test action 12: Retrieve SNAP2’s detail information
  • Test assertion 10: Verify SNAP2 is created from VOL1
  • Test action 13: Update the name and description of SNAP2
  • Test assertion 11: Verify the name and description of SNAP2 are updated in the response body of update snapshot API
  • Test action 14: Retrieve SNAP2’s detail information
  • Test assertion 12: Verify the name and description of SNAP2 are correct
  • Test action 15: Delete SNAP2
  • Test action 16: Create a volume VOL2 with a volume size
  • Test action 17: Create a snapshot SNAP3 from VOL2
  • Test action 18: Create a volume VOL3 from SNAP3 with a bigger volume size
  • Test action 19: Retrieve VOL3’s detail information
  • Test assertion 13: Verify volume size and source snapshot of VOL3 are correct
  • Test action 20: List all snapshots in detail and filter the snapshot list by name
  • Test assertion 14: Verify the filtered snapshot list is correct
  • Test action 21: List all snapshots in detail and filter the snapshot list by status
  • Test assertion 15: Verify the filtered snapshot list is correct
  • Test action 22: List all snapshots in detail and filter the snapshot list by name and status
  • Test assertion 16: Verify the filtered snapshot list is correct
  • Test action 23: List all snapshots and filter the snapshot list by name
  • Test assertion 17: Verify the filtered snapshot list is correct
  • Test action 24: List all snapshots and filter the snapshot list by status
  • Test assertion 18: Verify the filtered snapshot list is correct
  • Test action 25: List all snapshots and filter the snapshot list by name and status
  • Test assertion 19: Verify the filtered snapshot list is correct
  • Test action 26: Create a snapshot from a nonexistent volume by using an invalid volume ID
  • Test assertion 20: Verify create snapshot failed, a ‘Not Found’ error is returned in the response
  • Test action 27: Create a snapshot from a volume by using an empty volume ID
  • Test assertion 21: Verify create snapshot failed, a ‘Not Found’ error is returned in the response
5.1.13.1.5.5.13.3.2. Pass / fail criteria

This test case evaluates the ability of the volume API to manage snapshots and snapshot metadata. Specifically, the test verifies that:

  • Metadata can be created for a snapshot successfully.
  • Snapshot metadata can be retrieved successfully.
  • Snapshot metadata and individual metadata items can be updated successfully.
  • A metadata item of a snapshot can be deleted successfully.
  • Creating a volume from a nonexistent snapshot is not allowed.
  • Deleting a volume using an invalid volume ID is not allowed.
  • Deleting a volume without passing the volume ID is not allowed.
  • Deleting a non-existent volume is not allowed.
  • Snapshots can be created successfully.
  • A snapshot’s detail information can be retrieved successfully.
  • Snapshot attributes can be updated successfully.
  • Snapshots can be deleted successfully.
  • A volume can be created from a snapshot with a size different from the source volume.
  • Snapshot details can be listed using display_name and status filters successfully.
  • Creating a snapshot from a nonexistent volume is not allowed.
  • Creating a snapshot without passing the volume ID is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.
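The snapshot-metadata portion of the flow (test actions 1 through 6) is a plain CRUD cycle over a key/value mapping. The following is a minimal sketch of those semantics, not the Tempest code; a real deployment performs these steps through the Cinder snapshot-metadata endpoints:

```python
# Toy model of snapshot metadata CRUD semantics.
class Snapshot:
    def __init__(self):
        self.metadata = {}

    def set_metadata(self, md):          # create/replace the whole mapping
        self.metadata = dict(md)

    def update_item(self, key, value):   # update a single metadata item
        self.metadata[key] = value

    def delete_item(self, key):          # delete a single metadata item
        del self.metadata[key]

snap1 = Snapshot()
snap1.set_metadata({"key1": "v1", "key2": "v2", "key3": "v3"})
snap1.update_item("key2", "v2-new")      # update one item
snap1.delete_item("key3")                # Test action 4: delete 'key3'
assert snap1.metadata == {"key1": "v1", "key2": "v2-new"}
```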

5.1.13.1.5.5.13.4. Post conditions

N/A

5.1.13.1.5.5.14. Test Case 13 - Volume update operations with the Cinder v2 or v3 API
5.1.13.1.5.5.14.1. Test case specification

tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_update_volume_with_empty_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_update_volume_with_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_update_volume_with_nonexistent_volume_id

5.1.13.1.5.5.14.2. Test preconditions
  • Volume extension API
5.1.13.1.5.5.14.3. Basic test flow execution description and pass/fail criteria
5.1.13.1.5.5.14.3.1. Test execution
  • Test action 1: Update a volume by using an empty volume ID
  • Test assertion 1: Verify update volume failed, a ‘Not Found’ error is returned in the response
  • Test action 2: Update a volume by using an invalid volume ID
  • Test assertion 2: Verify update volume failed, a ‘Not Found’ error is returned in the response
  • Test action 3: Update a non-existent volume by using a randomly generated volume ID
  • Test assertion 3: Verify update volume failed, a ‘Not Found’ error is returned in the response
5.1.13.1.5.5.14.3.2. Pass / fail criteria

This test case evaluates the ability of the volume API to update volume attributes. Specifically, the test verifies that:

  • Updating a volume without passing the volume ID is not allowed.
  • Updating a volume using an invalid volume ID is not allowed.
  • Updating a non-existent volume is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.13.1.5.5.14.4. Post conditions

N/A

5.1.14. Neutron Trunk Port Tempest Tests
5.1.14.1. Scope

This test area evaluates the ability of a system under test to support Neutron trunk ports. The test area specifically validates port and sub-port API CRUD operations, by means of both positive and negative tests.

5.1.14.2. References
5.1.14.3. System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM deployed on a Pharos compliant infrastructure.

5.1.14.4. Test Area Structure

The test area is structured in individual tests as listed below. Each test case can run independently, i.e. regardless of the state created by a previous test. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links:

Trunk port and sub-port CRUD operations:

These tests cover the CRUD (Create, Read, Update, Delete) life-cycle operations of trunk ports and subports.

Implementation: TrunkTestInheritJSONBase and TrunkTestJSON.

  • neutron.tests.tempest.api.test_trunk.TrunkTestInheritJSONBase.test_add_subport
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_add_subport
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_create_show_delete_trunk
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_create_trunk_empty_subports_list
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_create_trunk_subports_not_specified
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_create_update_trunk
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_create_update_trunk_with_description
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_delete_trunk_with_subport_is_allowed
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_get_subports
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_list_trunks
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_remove_subport
  • neutron.tests.tempest.api.test_trunk.TrunkTestJSON.test_show_trunk_has_project_id

MTU-related operations:

These tests validate that trunk ports and subports can be created and added when specifying valid MTU sizes. These tests do not include negative tests covering invalid MTU sizes.

Implementation: TrunkTestMtusJSON

  • neutron.tests.tempest.api.test_trunk.TrunkTestMtusJSON.test_add_subport_with_mtu_equal_to_trunk
  • neutron.tests.tempest.api.test_trunk.TrunkTestMtusJSON.test_add_subport_with_mtu_smaller_than_trunk
  • neutron.tests.tempest.api.test_trunk.TrunkTestMtusJSON.test_create_trunk_with_mtu_equal_to_subport
  • neutron.tests.tempest.api.test_trunk.TrunkTestMtusJSON.test_create_trunk_with_mtu_greater_than_subport
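Taken together with the negative MTU tests listed further below, these tests check one rule: a subport's network MTU must not exceed the MTU of the trunk's parent network. A hedged sketch of that rule (function name and values are illustrative, not part of the Neutron API):

```python
# Illustrative check of the trunk/subport MTU rule these tests validate.
def can_add_subport(trunk_mtu: int, subport_mtu: int) -> bool:
    # A subport is accepted only if its network MTU fits within the
    # trunk parent network's MTU.
    return subport_mtu <= trunk_mtu

assert can_add_subport(1500, 1500)      # equal MTU: allowed
assert can_add_subport(1500, 1400)      # smaller subport MTU: allowed
assert not can_add_subport(1500, 9000)  # larger subport MTU: rejected
```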

API for listing query results:

These tests verify that listing operations of trunk port objects work. This functionality is required for CLI and UI operations.

Implementation: TrunksSearchCriteriaTest

  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_no_pagination_limit_0
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_pagination
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_pagination_page_reverse_asc
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_pagination_page_reverse_desc
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_pagination_page_reverse_with_href_links
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_pagination_with_href_links
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_pagination_with_marker
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_sorts_asc
  • neutron.tests.tempest.api.test_trunk.TrunksSearchCriteriaTest.test_list_sorts_desc
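The pagination tests above exercise Neutron's marker-based paging, where a client requests `limit` items and passes the ID of the last item seen as the `marker` of the next request. A simplified sketch of that scheme (not the real Neutron client; the trunk IDs are made up):

```python
# Illustrative marker-based pagination over a sorted result list.
def list_page(items, limit, marker=None):
    # Return up to `limit` items that come after the item whose id equals
    # `marker`; with no marker, start from the beginning.
    ids = [i["id"] for i in items]
    start = ids.index(marker) + 1 if marker is not None else 0
    return items[start:start + limit]

trunks = [{"id": f"trunk-{n}"} for n in range(5)]
page1 = list_page(trunks, limit=2)
page2 = list_page(trunks, limit=2, marker=page1[-1]["id"])
assert [t["id"] for t in page1] == ["trunk-0", "trunk-1"]
assert [t["id"] for t in page2] == ["trunk-2", "trunk-3"]
```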

Query trunk port details:

These tests validate that all attributes of trunk port objects can be queried.

Implementation: TestTrunkDetailsJSON

  • neutron.tests.tempest.api.test_trunk_details.TestTrunkDetailsJSON.test_port_resource_empty_trunk_details
  • neutron.tests.tempest.api.test_trunk_details.TestTrunkDetailsJSON.test_port_resource_trunk_details_no_subports
  • neutron.tests.tempest.api.test_trunk_details.TestTrunkDetailsJSON.test_port_resource_trunk_details_with_subport

Negative tests:

This group of tests comprises negative tests, which verify that invalid operations are handled correctly by the system under test.

Implementation: TrunkTestNegative

  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_add_subport_duplicate_segmentation_details
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_add_subport_passing_dict
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_add_subport_port_id_disabled_trunk
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_add_subport_port_id_uses_trunk_port_id
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_subport_invalid_inherit_network_segmentation_type
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_subport_missing_segmentation_id
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_subport_nonexistent_port_id
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_subport_nonexistent_trunk
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_trunk_duplicate_subport_segmentation_ids
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_trunk_nonexistent_port_id
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_trunk_nonexistent_subport_port_id
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_trunk_with_subport_missing_port_id
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_trunk_with_subport_missing_segmentation_id
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_create_trunk_with_subport_missing_segmentation_type
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_delete_port_in_use_by_subport
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_delete_port_in_use_by_trunk
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_delete_trunk_disabled_trunk
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_remove_subport_not_found
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_remove_subport_passing_dict
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestJSON.test_remove_subport_port_id_disabled_trunk
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestMtusJSON.test_add_subport_with_mtu_greater_than_trunk
  • neutron.tests.tempest.api.test_trunk_negative.TrunkTestMtusJSON.test_create_trunk_with_mtu_smaller_than_subport

Scenario tests (tests covering more than one functionality):

In contrast to the API tests above, these tests validate more than one specific API capability. Instead, they verify that a simple scenario (example workflow) functions as intended. To this end, they boot two VMs with trunk ports and sub-ports and verify connectivity between those VMs.

Implementation: TrunkTest

  • neutron.tests.tempest.scenario.test_trunk.TrunkTest.test_subport_connectivity
  • neutron.tests.tempest.scenario.test_trunk.TrunkTest.test_trunk_subport_lifecycle
5.1.15. Common virtual machine life cycle events test specification
5.1.15.1. Scope

The common virtual machine life cycle events test area evaluates the ability of the system under test to behave correctly after common virtual machine life cycle events. The tests in this test area will evaluate:

  • Stop/Start a server
  • Reboot a server
  • Rebuild a server
  • Pause/Unpause a server
  • Suspend/Resume a server
  • Resize a server
  • Resize a volume-backed server
  • Suspend/Resume a server in sequence
  • Shelve/Unshelve a server
  • Cold migrate a server
  • Live migrate a server
5.1.15.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area:

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
5.1.15.4. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.15.5. Test Area Structure

The test area is structured based on common virtual machine life cycle events. Each test case can run independently, i.e. regardless of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All of these test cases are included in the test case dovetail.tempest.vm_lifecycle of the OVP test suite.

5.1.15.6. Test Descriptions
5.1.15.6.1. API Used and Reference

Block storage: https://developer.openstack.org/api-ref/block-storage

  • create volume
  • delete volume
  • attach volume to server
  • detach volume from server

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • delete router
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • show server
  • delete server
  • add/assign floating IP
  • resize server
  • revert resized server
  • confirm resized server
  • pause server
  • unpause server
  • start server
  • stop server
  • reboot server
  • rebuild server
  • suspend server
  • resume suspended server
  • shelve server
  • unshelve server
  • migrate server
  • live-migrate server

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • create port
  • delete port

Floating IPs: https://developer.openstack.org/api-ref/networking/v2/index.html#floating-ips-floatingips

  • create floating IP
  • delete floating IP

Availability zone: https://developer.openstack.org/api-ref/compute/

  • get availability zone
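The test cases that follow all hinge on the relationship between a server's status and its reachability. As a hedged illustration (a toy state machine, not the Nova implementation), the transitions and the reachability rule they verify can be sketched as:

```python
# Toy state machine for the server life cycle events tested below.
# Each action maps (valid source status) -> (resulting status).
TRANSITIONS = {
    "stop":    ("ACTIVE",    "SHUTOFF"),
    "start":   ("SHUTOFF",   "ACTIVE"),
    "pause":   ("ACTIVE",    "PAUSED"),
    "unpause": ("PAUSED",    "ACTIVE"),
    "suspend": ("ACTIVE",    "SUSPENDED"),
    "resume":  ("SUSPENDED", "ACTIVE"),
}

class Server:
    def __init__(self):
        self.status = "ACTIVE"

    def do(self, action):
        src, dst = TRANSITIONS[action]
        assert self.status == src, f"{action} not valid from {self.status}"
        self.status = dst

    @property
    def reachable(self):
        # ping/SSH assertions in these tests succeed only while ACTIVE
        return self.status == "ACTIVE"

vm1 = Server()
vm1.do("pause")
assert not vm1.reachable   # paused servers cannot be reached
vm1.do("unpause")
assert vm1.reachable       # reachability recovers once ACTIVE again
```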
5.1.15.6.2. Test Case 1 - Minimum basic scenario
5.1.15.6.2.1. Test case specification

tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario

5.1.15.6.2.2. Test preconditions
  • Nova, cinder, glance, neutron services are available
  • One public network
5.1.15.6.2.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.2.3.1. Test execution
  • Test action 1: Create an image IMG1
  • Test action 2: Create a keypair KEYP1
  • Test action 3: Create a server VM1 with IMG1 and KEYP1
  • Test assertion 1: Verify VM1 is created successfully
  • Test action 4: Create a volume VOL1
  • Test assertion 2: Verify VOL1 is created successfully
  • Test action 5: Attach VOL1 to VM1
  • Test assertion 3: Verify VOL1’s status has been updated after being attached to VM1
  • Test action 6: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test assertion 4: Verify VM1’s addresses have been refreshed after associating FIP1
  • Test action 7: Create and add security group SG1 to VM1
  • Test assertion 5: Verify can SSH to VM1 via FIP1
  • Test action 8: Reboot VM1
  • Test assertion 6: Verify can SSH to VM1 via FIP1
  • Test assertion 7: Verify VM1’s disk count equals 1
  • Test action 9: Delete the floating IP FIP1 from VM1
  • Test assertion 8: Verify VM1’s addresses have been refreshed after disassociating FIP1
  • Test action 10: Delete SG1, IMG1, KEYP1, VOL1, VM1 and FIP1
5.1.15.6.2.3.2. Pass / fail criteria

This test evaluates a minimum basic scenario. Specifically, the test verifies that:

  • The server can be connected before reboot.
  • The server can be connected after reboot.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.2.4. Post conditions

N/A

5.1.15.6.3. Test Case 2 - Cold migration
5.1.15.6.3.1. Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration

5.1.15.6.3.2. Test preconditions
  • At least 2 compute nodes
  • Nova, neutron services are available
  • One public network
5.1.15.6.3.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.3.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Get VM1’s host info SRC_HOST
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 6: Cold migrate VM1
  • Test action 7: Wait for VM1 to reach ‘VERIFY_RESIZE’ status
  • Test action 8: Confirm resize VM1
  • Test action 9: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 10: Get VM1’s host info DST_HOST
  • Test assertion 3: Verify SRC_HOST does not equal DST_HOST
  • Test action 11: Delete KEYP1, VM1 and FIP1
5.1.15.6.3.3.2. Pass / fail criteria

This test evaluates the ability to cold migrate VMs. Specifically, the test verifies that:

  • Servers can be cold migrated from one compute node to another compute node.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.3.4. Post conditions

N/A

5.1.15.6.4. Test Case 3 - Pause and unpause server
5.1.15.6.4.1. Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_pause_unpause

5.1.15.6.4.2. Test preconditions
  • Nova, neutron services are available
  • One public network
5.1.15.6.4.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.4.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Pause VM1
  • Test action 5: Wait for VM1 to reach ‘PAUSED’ status
  • Test assertion 1: Verify FIP1 status is ‘ACTIVE’
  • Test assertion 2: Verify ping FIP1 failed and SSH to VM1 via FIP1 failed
  • Test action 6: Unpause VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 3: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1
5.1.15.6.4.3.2. Pass / fail criteria

This test evaluates the ability to pause and unpause VMs. Specifically, the test verifies that:

  • When paused, servers cannot be reached.
  • When unpaused, servers recover their reachability.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.4.4. Post conditions

N/A

5.1.15.6.5. Test Case 4 - Reboot server
5.1.15.6.5.1. Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_reboot

5.1.15.6.5.2. Test preconditions
  • Nova, neutron services are available
  • One public network
5.1.15.6.5.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.5.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Soft reboot VM1
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 6: Delete KEYP1, VM1 and FIP1
5.1.15.6.5.3.2. Pass / fail criteria

This test evaluates the ability to reboot servers. Specifically, the test verifies that:

  • After reboot, servers can still be connected.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.5.4. Post conditions

N/A

5.1.15.6.6. Test Case 5 - Rebuild server
5.1.15.6.6.1. Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_rebuild

5.1.15.6.6.2. Test preconditions
  • Nova, neutron services are available
  • One public network
5.1.15.6.6.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.6.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Rebuild VM1 with another image
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 6: Delete KEYP1, VM1 and FIP1
5.1.15.6.6.3.2. Pass / fail criteria

This test evaluates the ability to rebuild servers. Specifically, the test verifies that:

  • Servers can be rebuilt with a specific image correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.6.4. Post conditions

N/A

5.1.15.6.7. Test Case 6 - Resize server
5.1.15.6.7.1. Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_resize

5.1.15.6.7.2. Test preconditions
  • Nova, neutron services are available
  • One public network
5.1.15.6.7.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.7.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Resize VM1 with another flavor
  • Test action 5: Wait for VM1 to reach ‘VERIFY_RESIZE’ status
  • Test action 6: Confirm resize VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1
5.1.15.6.7.3.2. Pass / fail criteria

This test evaluates the ability to resize servers. Specifically, the test verifies that:

  • Servers can be resized with a specific flavor correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.7.4. Post conditions

N/A

5.1.15.6.8. Test Case 7 - Stop and start server
5.1.15.6.8.1. Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_stop_start

5.1.15.6.8.2. Test preconditions
  • Nova, neutron services are available
  • One public network
5.1.15.6.8.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.8.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Stop VM1
  • Test action 5: Wait for VM1 to reach ‘SHUTOFF’ status
  • Test assertion 1: Verify ping FIP1 failed and SSH to VM1 via FIP1 failed
  • Test action 6: Start VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1
5.1.15.6.8.3.2. Pass / fail criteria

This test evaluates the ability to stop and start servers. Specifically, the test verifies that:

  • When stopped, servers cannot be reached.
  • When started, servers recover their reachability.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.8.4. Post conditions

N/A

5.1.15.6.9. Test Case 8 - Suspend and resume server
5.1.15.6.9.1. Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_suspend_resume

5.1.15.6.9.2. Test preconditions
  • Nova, neutron services are available
  • One public network
5.1.15.6.9.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.9.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Suspend VM1
  • Test action 5: Wait for VM1 to reach ‘SUSPENDED’ status
  • Test assertion 1: Verify ping FIP1 failed and SSH to VM1 via FIP1 failed
  • Test action 6: Resume VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1
5.1.15.6.9.3.2. Pass / fail criteria

This test evaluates the ability to suspend and resume servers. Specifically, the test verifies that:

  • When suspended, servers cannot be reached.
  • When resumed, servers recover their reachability.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.9.4. Post conditions

N/A

5.1.15.6.10. Test Case 9 - Suspend and resume server in sequence
5.1.15.6.10.1. Test case specification

tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_server_sequence_suspend_resume

5.1.15.6.10.2. Test preconditions
  • Nova, neutron services are available
  • One public network
5.1.15.6.10.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.10.3.1. Test execution
  • Test action 1: Create a server VM1
  • Test action 2: Suspend VM1
  • Test action 3: Wait for VM1 to reach ‘SUSPENDED’ status
  • Test assertion 1: Verify VM1’s status is ‘SUSPENDED’
  • Test action 4: Resume VM1
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify VM1’s status is ‘ACTIVE’
  • Test action 6: Suspend VM1
  • Test action 7: Wait for VM1 to reach ‘SUSPENDED’ status
  • Test assertion 3: Verify VM1 status is ‘SUSPENDED’
  • Test action 8: Resume VM1
  • Test action 9: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 4: Verify VM1 status is ‘ACTIVE’
  • Test action 10: Delete VM1
5.1.15.6.10.3.2. Pass / fail criteria

This test evaluates the ability to suspend and resume servers in sequence. Specifically, the test verifies that:

  • Servers can be suspended and resumed in sequence correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.10.4. Post conditions

N/A

5.1.15.6.11. Test Case 10 - Resize volume backed server
5.1.15.6.11.1. Test case specification

tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_volume_backed_server_confirm

5.1.15.6.11.2. Test preconditions
  • Nova, neutron, cinder services are available
  • One public network
5.1.15.6.11.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.11.3.1. Test execution
  • Test action 1: Create a volume backed server VM1
  • Test action 2: Resize VM1 with another flavor
  • Test action 3: Wait for VM1 to reach ‘VERIFY_RESIZE’ status
  • Test action 4: Confirm resize VM1
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: VM1’s status is ‘ACTIVE’
  • Test action 6: Delete VM1
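Resize is a two-phase operation: the server first enters ‘VERIFY_RESIZE’ and only switches flavors once the resize is confirmed. A toy state machine illustrating the transitions the test expects (class and flavor names are hypothetical):

```python
class FakeServer:
    """Toy model of Nova's resize state machine, using the statuses
    named in the test actions above."""
    def __init__(self, flavor):
        self.flavor = flavor
        self.status = "ACTIVE"
        self._pending = None

    def resize(self, new_flavor):
        self._pending = new_flavor
        self.status = "VERIFY_RESIZE"  # Nova waits for confirmation

    def confirm_resize(self):
        assert self.status == "VERIFY_RESIZE"
        self.flavor, self._pending = self._pending, None
        self.status = "ACTIVE"

vm1 = FakeServer("m1.small")
vm1.resize("m1.medium")                  # test action 2
assert vm1.status == "VERIFY_RESIZE"     # test action 3
vm1.confirm_resize()                     # test action 4
assert vm1.status == "ACTIVE"            # test assertion 1
assert vm1.flavor == "m1.medium"
```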
5.1.15.6.11.3.2. Pass / fail criteria

This test evaluates the ability to resize volume backed servers. Specifically, the test verifies that:

  • Volume backed servers can be resized to a specified flavor correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.11.4. Post conditions

N/A

5.1.15.6.12. Test Case 11 - Shelve and unshelve server
5.1.15.6.12.1. Test case specification

tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance

5.1.15.6.12.2. Test preconditions
  • Nova, neutron, image services are available
  • One public network
5.1.15.6.12.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.12.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 3: Create a server with SG1 and KEYP1
  • Test action 4: Create a timestamp and store it in a file F1 inside VM1
  • Test action 5: Shelve VM1
  • Test action 6: Unshelve VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test action 8: Read F1 and compare if the read value and the previously written value are the same or not
  • Test assertion 1: Verify the values written and read are the same
  • Test action 9: Delete SG1, KEYP1 and VM1
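The shelve/unshelve check boils down to writing a value to disk before shelving and verifying it is unchanged afterwards. A minimal sketch of that verification logic (using a local temporary file in place of a file inside VM1):

```python
import os
import tempfile

def write_timestamp(path, ts):
    """Store the timestamp in a file (F1 inside VM1 in the test)."""
    with open(path, "w") as f:
        f.write(ts)

def read_timestamp(path):
    with open(path) as f:
        return f.read()

with tempfile.TemporaryDirectory() as d:
    f1 = os.path.join(d, "timestamp.txt")
    t_before = "1700000000"
    write_timestamp(f1, t_before)        # test action 4
    # ... shelve and unshelve would happen here; disk contents must survive ...
    t_after = read_timestamp(f1)         # test action 8
    assert t_before == t_after           # test assertion 1
```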
5.1.15.6.12.3.2. Pass / fail criteria

This test evaluates the ability to shelve and unshelve servers. Specifically, the test verifies that:

  • Servers can be shelved and unshelved correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.12.4. Post conditions

N/A

5.1.15.6.13. Test Case 12 - Shelve and unshelve volume backed server
5.1.15.6.13.1. Test case specification

tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_volume_backed_instance

5.1.15.6.13.2. Test preconditions
  • Nova, neutron, image, cinder services are available
  • One public network
5.1.15.6.13.3. Basic test flow execution description and pass/fail criteria
5.1.15.6.13.3.1. Test execution
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a security group SG1, which has rules for allowing incoming and outgoing SSH and ICMP traffic
  • Test action 3: Create a volume backed server VM1 with SG1 and KEYP1
  • Test action 4: SSH to VM1 to create a timestamp T_STAMP1 and store it in a file F1 inside VM1
  • Test action 5: Shelve VM1
  • Test action 6: Unshelve VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test action 8: SSH to VM1 to read the timestamp T_STAMP2 stored in F1
  • Test assertion 1: Verify T_STAMP1 equals to T_STAMP2
  • Test action 9: Delete SG1, KEYP1 and VM1
5.1.15.6.13.3.2. Pass / fail criteria

This test evaluates the ability to shelve and unshelve volume backed servers. Specifically, the test verifies that:

  • Volume backed servers can be shelved and unshelved correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.15.6.13.4. Post conditions

N/A

5.1.16. Tempest Volume test specification
5.1.16.1. Scope

This test area evaluates the ability of a system under test to manage volumes.

The test area specifically validates volume creation, deletion, and attach/detach operations.

5.1.16.2. References

N/A

5.1.16.3. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.16.4. Test Area Structure

The test area is structured in individual tests as listed below. For detailed information on the individual steps and assertions performed by the tests, review the Python source code accessible via the following links:

All these test cases are included in the test case dovetail.tempest.volume of the OVP test suite.

5.1.16.4.1. Test Case 1 - Attach Detach Volume to Instance
5.1.16.4.1.1. Test case specification

Implementation: Attach Detach Volume to Instance

  • tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_attach_detach_volume_to_instance
5.1.16.4.1.2. Test preconditions
  • Volume extension API
5.1.16.4.1.3. Basic test flow execution description and pass/fail criteria
5.1.16.4.1.3.1. Test execution
  • Test action 1: Create a server VM1
  • Test action 2: Attach a provided VOL1 to VM1
  • Test assertion 1: Verify VOL1 is in ‘in-use’ status
  • Test action 3: Detach VOL1 from VM1
  • Test assertion 2: Verify detach volume VOL1 successfully and VOL1 is in ‘available’ status
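The expected volume status transitions can be sketched as a small state machine (a hypothetical stand-in for the Cinder volume; the real test reads the status from the volume API):

```python
class FakeVolume:
    """Toy model of a Cinder volume's attach/detach status transitions."""
    def __init__(self):
        self.status = "available"
        self.attached_to = None

    def attach(self, server_id):
        assert self.status == "available"
        self.attached_to, self.status = server_id, "in-use"

    def detach(self):
        assert self.status == "in-use"
        self.attached_to, self.status = None, "available"

vol1 = FakeVolume()
vol1.attach("vm1-uuid")
assert vol1.status == "in-use"       # test assertion 1
vol1.detach()
assert vol1.status == "available"    # test assertion 2
```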
5.1.16.4.1.3.2. Pass / fail criteria

This test evaluates the volume API ability of attaching a volume to a server and detaching a volume from a server. Specifically, the test verifies that:

  • Volumes can be attached and detached from servers.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.16.4.1.4. Post conditions

N/A

5.1.16.4.2. Test Case 2 - Volume Boot Pattern test
5.1.16.4.2.1. Test case specification

Implementation: Volume Boot Pattern test

  • tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_volume_boot_pattern
5.1.16.4.2.2. Test preconditions
  • Volume extension API
5.1.16.4.2.3. Basic test flow execution description and pass/fail criteria
5.1.16.4.2.3.1. Test execution
  • Test action 1: Create in Cinder a bootable volume VOL1 by importing a Glance image
  • Test action 2: Boot an instance VM1 from the bootable volume VOL1
  • Test action 3: Write content to VOL1
  • Test action 4: Delete VM1 and boot a new instance VM2 from the volume VOL1
  • Test action 5: Check the written content in the instance
  • Test assertion 1: Verify the content of the file written in action 3
  • Test action 6: Create a volume snapshot VOL2 while the instance VM2 is running
  • Test action 7: Boot an additional instance VM3 from the new snapshot-based volume VOL2
  • Test action 8: Check the written content in the instance booted from the snapshot
  • Test assertion 2: Verify the content of the file written in action 3
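The pattern above verifies that data written to a bootable volume survives instance deletion and is carried into snapshots. A toy model of that storage-consistency check (all names hypothetical):

```python
import copy

class FakeVolume:
    """Toy model of a bootable volume: `data` stands in for the
    contents of the block device."""
    def __init__(self, image_data=""):
        self.data = image_data

    def snapshot(self):
        # A snapshot captures the volume contents at this point in time.
        return copy.deepcopy(self)

vol1 = FakeVolume(image_data="glance-image")
vol1.data += "\ntimestamp-123"           # test action 3: write content
# test action 4: VM1 is deleted and VM2 boots from the same volume;
# the volume contents are unaffected by the instance lifecycle.
assert "timestamp-123" in vol1.data      # test assertion 1
vol2 = vol1.snapshot()                   # test action 6
assert "timestamp-123" in vol2.data      # test assertion 2
```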
5.1.16.4.2.3.2. Pass / fail criteria

This test evaluates the volume storage consistency. Specifically, the test verifies that:

  • The content of a file written to the volume persists when the volume is reused by new instances and snapshots.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.16.4.2.4. Post conditions

N/A

5.1.17. VNF test specification
5.1.17.1. Scope

The VNF test area evaluates basic NFV capabilities of the system under test. These capabilities include creating a small number of virtual machines, deploying the SUT VNF together with supporting VNFs and an orchestrator, and verifying the proper behavior of the basic VNF.

5.1.17.2. References

This test area references the following specifications and guides:

5.1.17.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • 3GPP - 3rd Generation Partnership Project
  • EPC - Evolved Packet Core
  • ETSI - European Telecommunications Standards Institute
  • IMS - IP Multimedia Core Network Subsystem
  • LTE - Long Term Evolution
  • NFV - Network functions virtualization
  • OAI - Open Air Interface
  • TS - Technical Specifications
  • VM - Virtual machine
  • VNF - Virtual Network Function
5.1.17.4. System Under Test (SUT)

The system under test is assumed to be the VNF and VIM in operation on a Pharos compliant infrastructure.

5.1.17.5. Test Area Structure

The test area is structured in two separate tests which are executed sequentially. The order of the tests is arbitrary as there are no dependencies across the tests. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

5.1.17.6. Test Descriptions
5.1.17.6.1. Test Case 1 - vEPC
5.1.17.6.1.1. Short name

dovetail.vnf.vepc

5.1.17.6.1.2. Use case specification

The Evolved Packet Core (EPC) is the main component of the System Architecture Evolution (SAE) which forms the core of the 3GPP LTE specification.

vEPC has been integrated in Functest to demonstrate the capability to deploy a complex mobility-specific NFV scenario on the OPNFV platform. The OAI EPC supports most of the essential functions defined by the 3GPP Technical Specs; hence the successful execution of functional tests on the OAI EPC provides a good endorsement of the underlying NFV platform.

5.1.17.6.1.3. Test preconditions

At least one compute node is available. No further pre-configuration needed.

5.1.17.6.1.4. Basic test flow execution description and pass/fail criteria
5.1.17.6.1.4.1. Methodology for verifying connectivity

This integration also includes ABot, a test orchestration system that enables test scenarios to be defined in a high-level DSL. ABot is itself deployed as a VM on the OPNFV platform; this provides an example of the automation driver and the Test VNF both being deployed as separate VNFs on the underlying OPNFV platform.

5.1.17.6.1.4.2. Test execution
  • Test action 1: Deploy Juju controller (VNF Manager) using Bootstrap command.
  • Test action 2: Deploy ABot (Orchestrator) and OAI EPC as Juju charms. Configuration of ABot and OAI EPC components is handled through built-in Juju relations.
  • Test action 3: Execution of ABot feature files triggered by Juju actions. This executes a suite of LTE signalling tests on the OAI EPC.
  • Test action 4: ABot test results are parsed accordingly.
  • Test action 5: The deployed VMs are deleted.
5.1.17.6.1.4.3. Pass / fail criteria

The VNF Manager (Juju) should be deployed successfully.

The test executor (ABot), a test orchestration system that enables test scenarios to be defined in a high-level DSL, should be deployed successfully.

The VMs acting as VNFs (including the VNF that is the SUT for this test case) behave according to the 3GPP technical specifications.

5.1.17.6.1.5. Post conditions

The clean-up operations are run.

5.1.17.6.2. Test Case 2 - vIMS
5.1.17.6.2.1. Short name

dovetail.vnf.vims

5.1.17.6.2.2. Use case specification

The IP Multimedia Subsystem or IP Multimedia Core Network Subsystem (IMS) is an architectural framework for delivering IP multimedia services.

vIMS test case is integrated to demonstrate the capability to deploy a relatively complex NFV scenario on top of the OPNFV infrastructure.

It provides an example of a real VNF deployment that demonstrates the NFV capabilities of the platform. The IP Multimedia Subsystem is a typical Telco test case, referenced by ETSI, and provides a fully functional VoIP system.

5.1.17.6.2.3. Test preconditions

For the required Ubuntu server and Cloudify image versions, refer to the Dovetail testing user guide.

At least 30 GB of RAM and 50 vCPU cores are required.

5.1.17.6.2.4. Basic test flow execution description and pass/fail criteria

vIMS has been integrated in Functest to demonstrate the capability to deploy a relatively complex NFV scenario on the OPNFV platform. The deployment of a complete functional VNF allows the test of most of the essential functions needed for a NFV platform.

5.1.17.6.2.4.1. Test execution
  • Test action 1: Deploy a VNF orchestrator (Cloudify).
  • Test action 2: Deploy a Clearwater vIMS (IP Multimedia Subsystem) VNF from this orchestrator based on a TOSCA blueprint defined in repository of opnfv-cloudify-clearwater [1].
  • Test action 3: Run a suite of signaling tests on top of this VNF.
  • Test action 4: Collect test results.
  • Test action 5: The deployed VMs are deleted.
5.1.17.6.2.4.2. Pass / fail criteria

The VNF orchestrator (Cloudify) should be deployed successfully.

The Clearwater vIMS (IP Multimedia Subsystem) VNF from this orchestrator should be deployed successfully.

The suite of signaling tests on top of vIMS should be run successfully.

The test scenarios on the NFV platform should be executed successfully following the ETSI standards accordingly.

5.1.17.6.2.5. Post conditions

All resources created during the test run have been cleaned up.

5.1.18. Vping test specification
5.1.18.1. Scope

The vping test area evaluates basic NFVi capabilities of the system under test. These capabilities include creating a small number of virtual machines, establishing basic L3 connectivity between them and verifying connectivity by means of ICMP packets.

5.1.18.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • ICMP - Internet Control Message Protocol
  • L3 - Layer 3
  • NFVi - Network functions virtualization infrastructure
  • SCP - Secure Copy
  • SSH - Secure Shell
  • VM - Virtual machine
5.1.18.4. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.18.5. Test Area Structure

The test area is structured in two separate tests which are executed sequentially. The order of the tests is arbitrary as there are no dependencies across the tests.

5.1.18.6. Test Descriptions
5.1.18.6.1. Test Case 1 - vPing using userdata provided by nova metadata service
5.1.18.6.1.1. Short name

dovetail.vping.userdata

5.1.18.6.1.2. Use case specification

This test evaluates the use case where an NFVi tenant boots up two VMs and requires L3 connectivity between those VMs. The target IP is passed to the VM that will initiate pings by using a custom userdata script provided by the nova metadata service.

5.1.18.6.1.3. Test preconditions

At least one compute node is available. No further pre-configuration needed.

5.1.18.6.1.4. Basic test flow execution description and pass/fail criteria
5.1.18.6.1.4.1. Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IP is passed to the VM sending pings by using a custom userdata script by means of the config driver mechanism provided by Nova metadata service. Whether or not a ping was successful is determined by checking the console output of the source VMs.
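Determining success from the console output amounts to scanning the log for the marker strings. A minimal sketch of such a check (the marker strings match those described in the test; the parsing helper itself is hypothetical):

```python
def ping_result_from_console(console_log):
    """Return True iff the userdata ping script reported success,
    i.e. 'vPing OK' appears anywhere in the VM console log."""
    for line in console_log.splitlines():
        if "vPing OK" in line:
            return True
    return False

sample = "Booting...\nvPing KO\nvPing KO\nvPing OK\n"
assert ping_result_from_console(sample) is True
assert ping_result_from_console("vPing KO\n") is False
```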

5.1.18.6.1.4.2. Test execution
  • Test action 1:
    • Create a private tenant network by using neutron client
    • Create one subnet and one router in the network by neutron client
    • Add one interface between the subnet and router
    • Add one gateway route to the router by neutron client
    • Store the network id in the response
  • Test assertion 1: The network id, subnet id and router id can be found in the response

  • Test action 2:
    • Create a security group by using neutron client
    • Store the security group id parameter in the response
  • Test assertion 2: The security group id can be found in the response

  • Test action 3: Boot VM1 by using nova client with configured name, image, flavor, private tenant network created in test action 1, security group created in test action 2

  • Test assertion 3: The VM1 object can be found in the response

  • Test action 4: Generate ping script with the IP of VM1 to be passed as userdata provided by the nova metadata service.

  • Test action 5: Boot VM2 by using nova client with configured name, image, flavor, private tenant network created in test action 1, security group created in test action 2, userdata created in test action 4

  • Test assertion 4: The VM2 object can be found in the response

  • Test action 6: Inside VM2, the ping script is executed automatically at boot. The script loops, pinging VM1 until the return code is 0 or the timeout is reached; for each ping, “vPing OK” is printed in the VM2 console-log when the return code is 0, otherwise “vPing KO” is printed. Monitor the console-log of VM2 to see the response generated by the script.

  • Test assertion 5: “vPing OK” is detected, when monitoring the console-log in VM2

  • Test action 7: Delete VM1, VM2

  • Test assertion 6: VM1 and VM2 are not present in the VM list

  • Test action 8: Delete security group, gateway, interface, router, subnet and network

  • Test assertion 7: The security group, gateway, interface, router, subnet and network are no longer present in the lists after deleting
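The userdata script passed to VM2 in test action 4 could be generated along these lines (a sketch; the actual script used by the test may differ):

```python
def build_ping_userdata(target_ip, timeout=120):
    """Generate a cloud-init userdata shell script that loops until a
    ping to target_ip succeeds or the timeout is reached, printing
    'vPing OK' / 'vPing KO' to the console log as described above."""
    return (
        "#!/bin/sh\n"
        "end=$(( $(date +%s) + " + str(timeout) + " ))\n"
        "while [ $(date +%s) -lt $end ]; do\n"
        "  if ping -c 1 " + target_ip + " > /dev/null 2>&1; then\n"
        "    echo 'vPing OK'; exit 0\n"
        "  fi\n"
        "  echo 'vPing KO'; sleep 1\n"
        "done\n"
        "echo 'vPing KO'; exit 1\n"
    )

script = build_ping_userdata("10.0.0.5")
assert script.startswith("#!/bin/sh")
assert "ping -c 1 10.0.0.5" in script
assert "vPing OK" in script
```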

5.1.18.6.1.4.3. Pass / fail criteria

This test evaluates basic NFVi capabilities of the system under test. Specifically, the test verifies that:

  • Neutron client network, subnet, router, interface create commands return valid “id” parameters which are shown in the create response message
  • Neutron client interface add command to add between subnet and router returns success code
  • Neutron client gateway add command to add to router returns success code
  • Neutron client security group create command returns valid “id” parameter which is shown in the response message
  • Nova client VM create command returns valid VM attributes response message
  • The Nova metadata service can transfer userdata configuration to the VM at boot time
  • Ping command from one VM to the other in same private tenant network returns valid code
  • All items created using neutron client or nova client create commands are able to be removed by using the returned identifiers

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.18.6.1.5. Post conditions

None

5.1.18.6.2. Test Case 2 - vPing using SSH to a floating IP
5.1.18.6.2.1. Short name

dovetail.vping.ssh

5.1.18.6.2.2. Use case specification

This test evaluates the use case where an NFVi tenant boots up two VMs and requires L3 connectivity between those VMs. An SSH connection is established from the host to a floating IP associated with VM2 and ping is executed on VM2 with the IP of VM1 as target.

5.1.18.6.2.3. Test preconditions

At least one compute node is available. An OpenStack external network exists and floating IPs can be assigned from it.

5.1.18.6.2.4. Basic test flow execution description and pass/fail criteria
5.1.18.6.2.4.1. Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. To this end, the test establishes an SSH connection from the host running the test suite to a floating IP associated with VM2 and executes ping on VM2 with the IP of VM1 as target.

5.1.18.6.2.4.2. Test execution
  • Test action 1:
    • Create a private tenant network by neutron client
    • Create one subnet and one router in the network by using neutron client
    • Create one interface between the subnet and router
    • Add one gateway route to the router by neutron client
    • Store the network id in the response
  • Test assertion 1: The network id, subnet id and router id can be found in the response

  • Test action 2:
    • Create a security group by using neutron client
    • Store the security group id parameter in the response
  • Test assertion 2: The security group id can be found in the response

  • Test action 3: Boot VM1 by using nova client with configured name, image, flavor, private tenant network created in test action 1, security group created in test action 2

  • Test assertion 3: The VM1 object can be found in the response

  • Test action 4: Boot VM2 by using nova client with configured name, image, flavor, private tenant network created in test action 1, security group created in test action 2

  • Test assertion 4: The VM2 object can be found in the response

  • Test action 5: Create one floating IP by using neutron client, storing the floating IP address returned in the response

  • Test assertion 5: Floating IP address can be found in the response

  • Test action 6: Assign the floating IP address created in test action 5 to VM2 by using nova client

  • Test assertion 6: The assigned floating IP can be found in the VM2 console log file

  • Test action 7: Establish SSH connection between the test host and VM2 through the floating IP

  • Test assertion 7: SSH connection between the test host and VM2 is established within 300 seconds

  • Test action 8: Copy the Ping script from the test host to VM2 by using SCPClient

  • Test assertion 8: The Ping script can be found inside VM2

  • Test action 9: Inside VM2, execute the Ping script to ping VM1. The script loops, pinging until the return code is 0 or the timeout is reached; for each ping, “vPing OK” is printed in the VM2 console-log when the return code is 0, otherwise “vPing KO” is printed. Monitor the console-log of VM2 to see the response generated by the script.

  • Test assertion 9: “vPing OK” is detected, when monitoring the console-log in VM2

  • Test action 10: Delete VM1, VM2

  • Test assertion 10: VM1 and VM2 are not present in the VM list

  • Test action 11: Delete floating IP, security group, gateway, interface, router, subnet and network

  • Test assertion 11: The security group, gateway, interface, router, subnet and network are no longer present in the lists after deleting
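Test assertion 7 requires the SSH connection to be established within 300 seconds, which implies a retry loop with a deadline. A minimal sketch of such a helper (the fake connection function simulates a VM that becomes reachable on the third attempt):

```python
import time

def connect_with_retry(try_connect, timeout=300, interval=5):
    """Retry try_connect() until it succeeds or `timeout` seconds
    elapse (test assertion 7 requires success within 300 seconds)."""
    deadline = time.time() + timeout
    last_error = None
    while time.time() < deadline:
        try:
            return try_connect()
        except ConnectionError as exc:
            last_error = exc
            time.sleep(interval)
    raise TimeoutError("no SSH connection within %ss: %s" % (timeout, last_error))

# Simulated connection that succeeds on the third attempt.
attempts = {"n": 0}
def fake_ssh():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("connection refused")
    return "connected"

assert connect_with_retry(fake_ssh, timeout=10, interval=0.01) == "connected"
```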

5.1.18.6.2.4.3. Pass / fail criteria

This test evaluates basic NFVi capabilities of the system under test. Specifically, the test verifies that:

  • Neutron client network, subnet, router, interface create commands return valid “id” parameters which are shown in the create response message
  • Neutron client interface add command to add between subnet and router returns success code
  • Neutron client gateway add command to add to router returns success code
  • Neutron client security group create command returns valid “id” parameter which is shown in the response message
  • Nova client VM create command returns valid VM attributes response message
  • Neutron client floating IP create command returns valid floating IP address
  • Nova client add floating IP command returns valid response message
  • SSH connection can be established using a floating IP
  • Ping command from one VM to another in same private tenant network returns valid code
  • All items created using neutron client or nova client create commands are able to be removed by using the returned identifiers

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.18.6.2.5. Post conditions

None

5.1.19. VPN test specification
5.1.19.1. Scope

The VPN test area evaluates the ability of the system under test to support VPN networking for virtual workloads. The tests in this test area evaluate establishing VPN networks, publishing routes and communicating between endpoints using BGP, and tearing down the networks.

5.1.19.2. References

This test area evaluates the ability of the system to perform selected actions defined in the following specifications. Details of specific features evaluated are described in the test descriptions.

5.1.19.3. Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • BGP - Border gateway protocol
  • eRT - Export route target
  • IETF - Internet Engineering Task Force
  • iRT - Import route target
  • NFVi - Network functions virtualization infrastructure
  • Tenant - An isolated set of virtualized infrastructures
  • VM - Virtual machine
  • VPN - Virtual private network
  • VLAN - Virtual local area network
5.1.19.4. System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

5.1.19.5. Test Area Structure

The test area is structured in four separate tests which are executed sequentially. The order of the tests is arbitrary as there are no dependencies across the tests. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

The test area evaluates the ability of the SUT to establish connectivity between Virtual Machines using an appropriate route target configuration, reconfigure the route targets to remove connectivity between the VMs, then reestablish connectivity by re-association.

5.1.19.6. Test Descriptions
5.1.19.6.1. Test Case 1 - VPN provides connectivity between Neutron subnets
5.1.19.6.1.1. Short name

dovetail.sdnvpn.subnet_connectivity

5.1.19.6.1.2. Use case specification

This test evaluates the use case where an NFVi tenant uses a BGPVPN to provide connectivity between VMs on different Neutron networks and subnets that reside on different hosts.

5.1.19.6.1.3. Test preconditions

2 compute nodes are available, denoted Node1 and Node2 in the following.

5.1.19.6.1.4. Basic test flow execution description and pass/fail criteria
5.1.19.6.1.4.1. Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IPs are passed to the VMs sending pings by means of a custom user data script. Whether or not a ping was successful is determined by checking the console output of the source VMs.

5.1.19.6.1.4.2. Test execution
  • Create Neutron network N1 and subnet SN1 with IP range 10.10.10.0/24
  • Create Neutron network N2 and subnet SN2 with IP range 10.10.11.0/24
  • Create VM1 on Node1 with a port in network N1
  • Create VM2 on Node1 with a port in network N1
  • Create VM3 on Node2 with a port in network N1
  • Create VM4 on Node1 with a port in network N2
  • Create VM5 on Node2 with a port in network N2
  • Create VPN1 with eRT<>iRT
  • Create network association between network N1 and VPN1
  • VM1 sends ICMP packets to VM2 using ping
  • Test assertion 1: Ping from VM1 to VM2 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM3 using ping
  • Test assertion 2: Ping from VM1 to VM3 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 3: Ping from VM1 to VM4 fails: ping exits with a non-zero return code
  • Create network association between network N2 and VPN1
  • VM4 sends ICMP packets to VM5 using ping
  • Test assertion 4: Ping from VM4 to VM5 succeeds: ping exits with return code 0
  • Configure iRT=eRT in VPN1
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 5: Ping from VM1 to VM4 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM5 using ping
  • Test assertion 6: Ping from VM1 to VM5 succeeds: ping exits with return code 0
  • Delete all instances: VM1, VM2, VM3, VM4 and VM5
  • Delete all networks and subnets: networks N1 and N2 including subnets SN1 and SN2
  • Delete all network associations and VPN1
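The assertions above follow directly from BGP/MPLS VPN route-target semantics: an importer only learns routes whose export targets intersect its import targets. A one-line predicate capturing that rule (route-target names hypothetical):

```python
def routes_imported(exporter_eRT, importer_iRT):
    """A BGP/MPLS VPN importer learns an exporter's routes iff at
    least one export route target matches one of its import route
    targets."""
    return bool(set(exporter_eRT) & set(importer_iRT))

# eRT<>iRT: networks associated with VPN1 do not exchange routes
# across subnets (test assertion 3).
assert not routes_imported({"RT:1"}, {"RT:2"})
# After configuring iRT=eRT, routes are exchanged (assertions 5 and 6).
assert routes_imported({"RT:1"}, {"RT:1"})
```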
5.1.19.6.1.4.3. Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet have IP connectivity regardless of BGP/MPLS VPNs (test assertion 1, 2, 4)
  • VMs in different Neutron subnets do not have IP connectivity by default - in this case without associating VPNs with the same import and export route targets to the Neutron networks (test assertion 3)
  • VMs in different Neutron subnets have routed IP connectivity after associating both networks with BGP/MPLS VPNs which have been configured with the same import and export route targets (test assertion 5, 6). Hence, adjusting the ingress and egress route targets enables as well as prohibits routing.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.19.6.1.5. Post conditions

N/A

5.1.19.6.2. Test Case 2 - VPNs ensure traffic separation between tenants
5.1.19.6.2.1. Short Name

dovetail.sdnvpn.tenant_separation

5.1.19.6.2.2. Use case specification

This test evaluates if VPNs provide separation of traffic such that overlapping IP ranges can be used.

5.1.19.6.2.3. Test preconditions

2 compute nodes are available, denoted Node1 and Node2 in the following.

5.1.19.6.2.4. Basic test flow execution description and pass/fail criteria
5.1.19.6.2.4.1. Methodology for verifying connectivity

Connectivity between VMs is tested by establishing an SSH connection. Moreover, the command “hostname” is executed at the remote VM in order to retrieve the hostname of the remote VM. The retrieved hostname is furthermore compared against an expected value. This is used to verify tenant traffic separation, i.e., despite overlapping IPs, a connection is made to the correct VM as determined by means of the hostname of the target VM.
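The hostname comparison described above can be sketched as follows (the remote-execution callable is injected, so a fake per-tenant resolver stands in for a real SSH session):

```python
def verify_tenant_separation(run_remote_cmd, target_ip, expected_hostname):
    """Connect to target_ip, run `hostname`, and check that it matches
    the VM we expect, i.e. that despite overlapping IP ranges the
    connection reached the correct tenant's VM."""
    actual = run_remote_cmd(target_ip, "hostname").strip()
    return actual == expected_hostname

# Simulated per-tenant resolution: within VM1's VPN, 10.10.10.12
# resolves to VM2 even though another tenant reuses the same IP.
fake_exec = lambda ip, cmd: {"10.10.10.12": "vm2\n"}[ip]
assert verify_tenant_separation(fake_exec, "10.10.10.12", "vm2")
assert not verify_tenant_separation(fake_exec, "10.10.10.12", "vm4")
```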

5.1.19.6.2.4.2. Test execution
  • Create Neutron network N1
  • Create subnet SN1a of network N1 with IP range 10.10.10.0/24
  • Create subnet SN1b of network N1 with IP range 10.10.11.0/24
  • Create Neutron network N2
  • Create subnet SN2a of network N2 with IP range 10.10.10.0/24
  • Create subnet SN2b of network N2 with IP range 10.10.11.0/24
  • Create VM1 on Node1 with a port in network N1 and IP 10.10.10.11.
  • Create VM2 on Node1 with a port in network N1 and IP 10.10.10.12.
  • Create VM3 on Node2 with a port in network N1 and IP 10.10.11.13.
  • Create VM4 on Node1 with a port in network N2 and IP 10.10.10.12.
  • Create VM5 on Node2 with a port in network N2 and IP 10.10.11.13.
  • Create VPN1 with iRT=eRT=RT1
  • Create network association between network N1 and VPN1
  • VM1 attempts to execute the command hostname on the VM with IP 10.10.10.12 via SSH.
  • Test assertion 1: VM1 can successfully connect to the VM with IP 10.10.10.12 via SSH and execute the remote command hostname. The retrieved hostname equals the hostname of VM2.
  • VM1 attempts to execute the command hostname on the VM with IP 10.10.11.13 via SSH.
  • Test assertion 2: VM1 can successfully connect to the VM with IP 10.10.11.13 via SSH and execute the remote command hostname. The retrieved hostname equals the hostname of VM3.
  • Create VPN2 with iRT=eRT=RT2
  • Create network association between network N2 and VPN2
  • VM4 attempts to execute the command hostname on the VM with IP 10.10.11.13 via SSH.
  • Test assertion 3: VM4 can successfully connect to the VM with IP 10.10.11.13 via SSH and execute the remote command hostname. The retrieved hostname equals the hostname of VM5.
  • VM4 attempts to execute the command hostname on the VM with IP 10.10.11.11 via SSH.
  • Test assertion 4: VM4 cannot connect to the VM with IP 10.10.11.11 via SSH.
  • Delete all instances: VM1, VM2, VM3, VM4 and VM5
  • Delete all networks and subnets: networks N1 and N2 including subnets SN1a, SN1b, SN2a and SN2b
  • Delete all network associations, VPN1 and VPN2
5.1.19.6.2.4.3. Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet (still) have IP connectivity between each other when a BGP/MPLS VPN is associated with the network (test assertion 1).
  • VMs in different Neutron subnets have routed IP connectivity between each other when BGP/MPLS VPNs with the same import and export route targets are associated with both networks (assertion 2).
  • VMs in different Neutron networks and BGP/MPLS VPNs with different import and export route targets can have overlapping IP ranges. The BGP/MPLS VPNs provide traffic separation (assertion 3 and 4).

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.19.6.2.5. Post conditions

N/A

5.1.19.6.3. Test Case 3 - VPN provides connectivity between subnets using router association
5.1.19.6.3.1. Short Name

dovetail.sdnvpn.router_association

5.1.19.6.3.2. Use case specification

This test evaluates if a VPN provides connectivity between two subnets by utilizing two different VPN association mechanisms: a router association and a network association.

Specifically, the test network topology comprises two networks N1 and N2 with corresponding subnets. Additionally, network N1 is connected to a router R1. This test verifies that a VPN V1 provides connectivity between both networks when applying a router association to router R1 and a network association to network N2.

5.1.19.6.3.3. Test preconditions

2 compute nodes are available, denoted Node1 and Node2 in the following.

5.1.19.6.3.4. Basic test flow execution description and pass/fail criteria
5.1.19.6.3.4.1. Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IPs are passed to the VMs sending pings by means of a custom user data script. Whether or not a ping was successful is determined by checking the console output of the source VMs.
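The approach above can be illustrated with a minimal user-data sketch; this is hypothetical and the actual Dovetail script differs. The idea is that target IPs are injected into the VM and a parseable verdict is written to the console log, which the test framework later inspects:

```shell
#!/bin/sh
# Sketch of the connectivity-check user-data approach (hypothetical).
# Each target IP is pinged once and a parseable verdict is written to
# stdout, i.e. the VM console log in the real setup.

# PING is overridable purely to keep this sketch self-contained.
PING="${PING:-ping -c 1 -W 2}"

check_targets() {
    for ip in "$@"; do
        if $PING "$ip" >/dev/null 2>&1; then
            echo "vPing OK to $ip"      # success marker in console output
        else
            echo "vPing FAILED to $ip"  # failure marker
        fi
    done
}

# The test framework would pass the peer VM IPs here, e.g.:
check_targets 10.10.10.12 10.10.10.13
```

The framework then only needs to grep the console output of the source VM for the success or failure markers.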

5.1.19.6.3.4.2. Test execution
  • Create a network N1, a subnet SN1 with IP range 10.10.10.0/24 and a connected router R1
  • Create a network N2, a subnet SN2 with IP range 10.10.11.0/24
  • Create VM1 on Node1 with a port in network N1
  • Create VM2 on Node1 with a port in network N1
  • Create VM3 on Node2 with a port in network N1
  • Create VM4 on Node1 with a port in network N2
  • Create VM5 on Node2 with a port in network N2
  • Create VPN1 with eRT<>iRT so that the associated subnets cannot reach each other
  • Create a router association between router R1 and VPN1
  • VM1 sends ICMP packets to VM2 using ping
  • Test assertion 1: Ping from VM1 to VM2 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM3 using ping
  • Test assertion 2: Ping from VM1 to VM3 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 3: Ping from VM1 to VM4 fails: ping exits with a non-zero return code
  • Create network association between network N2 and VPN1
  • VM4 sends ICMP packets to VM5 using ping
  • Test assertion 4: Ping from VM4 to VM5 succeeds: ping exits with return code 0
  • Change VPN1 so that iRT=eRT
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 5: Ping from VM1 to VM4 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM5 using ping
  • Test assertion 6: Ping from VM1 to VM5 succeeds: ping exits with return code 0
  • Delete all instances: VM1, VM2, VM3, VM4 and VM5
  • Delete all networks, subnets and routers: networks N1 and N2 including subnets SN1 and SN2, router R1
  • Delete all network and router associations and VPN1
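For reference, the association steps above roughly map onto the networking-bgpvpn CLI extension as sketched below. The names and route-target values are illustrative, and the exact client options may vary by release:

```
$ neutron bgpvpn-create --name VPN1 --import-targets 64512:1 --export-targets 64512:2
$ neutron bgpvpn-router-assoc-create VPN1 --router R1
$ neutron bgpvpn-net-assoc-create VPN1 --network N2
$ neutron bgpvpn-update VPN1 --route-targets 64512:1    # make iRT = eRT
```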
5.1.19.6.3.4.3. Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet have IP connectivity regardless of the import and export route target configuration of BGP/MPLS VPNs (test assertion 1, 2, 4)
  • VMs in different Neutron subnets do not have IP connectivity by default - in this case without associating VPNs with the same import and export route targets to the Neutron networks or connected Neutron routers (test assertion 3).
  • VMs in two different Neutron subnets have routed IP connectivity after associating the first network and a router connected to the second network with BGP/MPLS VPNs which have been configured with the same import and export route targets (test assertion 5, 6). Hence, adjusting the import and export route targets enables as well as prohibits routing.
  • Network and router associations are equivalent methods for binding Neutron networks to a VPN.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.19.6.3.5. Post conditions

N/A

5.1.19.6.4. Test Case 4 - Verify interworking of router and network associations with floating IP functionality
5.1.19.6.4.1. Short Name

dovetail.sdnvpn.router_association_floating_ip

5.1.19.6.4.2. Use case specification

This test evaluates if both the router association and network association mechanisms interwork with floating IP functionality.

Specifically, the test network topology comprises two networks N1 and N2 with corresponding subnets. Additionally, network N1 is connected to a router R1. This test verifies that i) a VPN V1 provides connectivity between both networks when applying a router association to router R1 and a network association to network N2 and ii) a VM in network N1 is reachable externally by means of a floating IP.

5.1.19.6.4.3. Test preconditions

At least one compute node is available.

5.1.19.6.4.4. Basic test flow execution description and pass/fail criteria
5.1.19.6.4.4.1. Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IPs are passed to the VMs sending pings by means of a custom user data script. Whether or not a ping was successful is determined by checking the console output of the source VMs.

5.1.19.6.4.4.2. Test execution
  • Create a network N1, a subnet SN1 with IP range 10.10.10.0/24 and a connected router R1
  • Create a network N2 and a subnet SN2 with IP range 10.10.20.0/24
  • Create VM1 with a port in network N1
  • Create VM2 with a port in network N2
  • Create VPN1
  • Create a router association between router R1 and VPN1
  • Create a network association between network N2 and VPN1
  • VM1 sends ICMP packets to VM2 using ping
  • Test assertion 1: Ping from VM1 to VM2 succeeds: ping exits with return code 0
  • Assign a floating IP to VM1
  • The host running the test framework sends ICMP packets to VM1 using ping
  • Test assertion 2: Ping from the host running the test framework to the floating IP of VM1 succeeds: ping exits with return code 0
  • Delete floating IP assigned to VM1
  • Delete all instances: VM1, VM2
  • Delete all networks, subnets and routers: networks N1 and N2 including subnets SN1 and SN2, router R1
  • Delete all network and router associations as well as VPN1
5.1.19.6.4.4.3. Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet have IP connectivity regardless of the import and export route target configuration of BGP/MPLS VPNs (test assertion 1)
  • VMs connected to a network which has been associated with a BGP/MPLS VPN are reachable through floating IPs.

In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.19.6.4.5. Post conditions

N/A

5.1.19.6.5. Test Case 5 - Tempest API CRUD Tests
5.1.19.6.5.1. Short Name

dovetail.tempest.bgpvpn

5.1.19.6.5.2. Use case specification

This test case combines multiple CRUD (Create, Read, Update, Delete) tests for the objects defined by the BGPVPN API extension of Neutron.

These tests are implemented in the upstream networking-bgpvpn project repository as a Tempest plugin.

5.1.19.6.5.3. Test preconditions

The VIM is operational and the networking-bgpvpn service plugin for Neutron is correctly configured and loaded. At least one compute node is available.
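For reference, loading the service plugin typically amounts to configuration along the following lines. This is a sketch only; the exact driver class and file locations depend on the installation and the chosen backend:

```
# /etc/neutron/neutron.conf (sketch): append 'bgpvpn' to the existing list
[DEFAULT]
service_plugins = router,bgpvpn

[service_providers]
# Reference driver shipped with networking-bgpvpn; production setups
# use a backend-specific driver instead.
service_provider = BGPVPN:Dummy:networking_bgpvpn.neutron.services.service_drivers.driver_api.BGPVPNDriver:default
```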

5.1.19.6.5.4. Basic test flow execution description and pass/fail criteria

List of test cases

  • networking_bgpvpn_tempest.tests.api.test_create_bgpvpn
  • networking_bgpvpn_tempest.tests.api.test_create_bgpvpn_as_non_admin_fail
  • networking_bgpvpn_tempest.tests.api.test_delete_bgpvpn_as_non_admin_fail
  • networking_bgpvpn_tempest.tests.api.test_show_bgpvpn_as_non_owner_fail
  • networking_bgpvpn_tempest.tests.api.test_list_bgpvpn_as_non_owner_fail
  • networking_bgpvpn_tempest.tests.api.test_show_netassoc_as_non_owner_fail
  • networking_bgpvpn_tempest.tests.api.test_list_netassoc_as_non_owner_fail
  • networking_bgpvpn_tempest.tests.api.test_associate_disassociate_network
  • networking_bgpvpn_tempest.tests.api.test_update_route_target_non_admin_fail
  • networking_bgpvpn_tempest.tests.api.test_create_bgpvpn_with_invalid_routetargets
  • networking_bgpvpn_tempest.tests.api.test_update_bgpvpn_invalid_routetargets
  • networking_bgpvpn_tempest.tests.api.test_associate_invalid_network
  • networking_bgpvpn_tempest.tests.api.test_disassociate_invalid_network
  • networking_bgpvpn_tempest.tests.api.test_associate_disassociate_router
  • networking_bgpvpn_tempest.tests.api.test_attach_associated_subnet_to_associated_router

The tests include both positive tests and negative tests. The latter are identified with the suffix “_fail” in their name.

5.1.19.6.5.4.1. Test execution

The tests are executed sequentially and a separate pass/fail result is recorded per test.

In general, every test case performs the API operations indicated in its name and asserts that the action succeeds (positive test) or a specific exception is triggered (negative test). The following describes the test execution per test in further detail.

5.1.19.6.5.4.1.1. networking_bgpvpn_tempest.tests.api.test_create_bgpvpn
  • Create a BGPVPN as an admin.
  • Test assertion: The API call succeeds.
5.1.19.6.5.4.1.2. networking_bgpvpn_tempest.tests.api.test_create_bgpvpn_as_non_admin_fail
  • Attempt to create a BGPVPN as non-admin.
  • Test assertion: Creating a BGPVPN as non-admin fails.
5.1.19.6.5.4.1.3. networking_bgpvpn_tempest.tests.api.test_delete_bgpvpn_as_non_admin_fail
  • Create BGPVPN vpn1 as admin.
  • Attempt to delete vpn1 as non-admin.
  • Test assertion: The deletion of vpn1 as non-admin fails.
5.1.19.6.5.4.1.4. networking_bgpvpn_tempest.tests.api.test_show_bgpvpn_as_non_owner_fail
  • Create a BGPVPN vpn1 as admin in project1.
  • Test assertion: Attempting to retrieve detailed properties of vpn1 in project2 fails.
5.1.19.6.5.4.1.5. networking_bgpvpn_tempest.tests.api.test_list_bgpvpn_as_non_owner_fail
  • Create a BGPVPN vpn1 as admin in project1.
  • Retrieve a list of all BGPVPNs in project2.
  • Test assertion: The list of BGPVPNs retrieved in project2 does not include vpn1.
5.1.19.6.5.4.1.6. networking_bgpvpn_tempest.tests.api.test_show_netassoc_as_non_owner_fail
  • Create BGPVPN vpn1 as admin in project1.
  • Associate vpn1 with a Neutron network in project1
  • Test assertion: Retrieving detailed properties of the network association fails in project2.
5.1.19.6.5.4.1.7. networking_bgpvpn_tempest.tests.api.test_list_netassoc_as_non_owner_fail
  • Create BGPVPN vpn1 as admin in project1.
  • Create network association net-assoc1 with vpn1 and Neutron network net1 in project1.
  • Retrieve a list of all network associations in project2.
  • Test assertion: The retrieved list of network associations does not include network association net-assoc1.
5.1.19.6.5.4.1.8. networking_bgpvpn_tempest.tests.api.test_associate_disassociate_network
  • Create a BGPVPN vpn1 as admin.
  • Associate vpn1 with a Neutron network net1.
  • Test assertion: The metadata of vpn1 includes the UUID of net1.
  • Disassociate vpn1 from the Neutron network.
  • Test assertion: The metadata of vpn1 does not include the UUID of net1.
5.1.19.6.5.4.1.9. networking_bgpvpn_tempest.tests.api.test_update_route_target_non_admin_fail
  • Create a BGPVPN vpn1 as admin with specific route targets.
  • Attempt to update vpn1 with different route targets as non-admin.
  • Test assertion: The update fails.
5.1.19.6.5.4.1.10. networking_bgpvpn_tempest.tests.api.test_create_bgpvpn_with_invalid_routetargets
  • Attempt to create a BGPVPN as admin with invalid route targets.
  • Test assertion: The creation of the BGPVPN fails.
5.1.19.6.5.4.1.11. networking_bgpvpn_tempest.tests.api.test_update_bgpvpn_invalid_routetargets
  • Create a BGPVPN vpn1 as admin with empty route targets.
  • Attempt to update vpn1 with invalid route targets.
  • Test assertion: The update of the route targets fails.
5.1.19.6.5.4.1.12. networking_bgpvpn_tempest.tests.api.test_associate_invalid_network
  • Create BGPVPN vpn1 as admin.
  • Attempt to associate vpn1 with a non-existing Neutron network.
  • Test assertion: Creating the network association fails.
5.1.19.6.5.4.1.13. networking_bgpvpn_tempest.tests.api.test_disassociate_invalid_network
  • Create BGPVPN vpn1 as admin.
  • Create network association net-assoc1 with vpn1 and Neutron network net1.
  • Attempt to delete net-assoc1 with an invalid network UUID.
  • Test assertion: The deletion of the net-assoc fails.
5.1.19.6.5.4.1.14. networking_bgpvpn_tempest.tests.api.test_associate_disassociate_router
  • Create a BGPVPN vpn1 as admin.
  • Associate vpn1 with a Neutron router router1.
  • Test assertion: The metadata of vpn1 includes the UUID of router1.
  • Disassociate router1 from vpn1.
  • Test assertion: The metadata of vpn1 does not include the UUID of router1.
5.1.19.6.5.4.1.15. networking_bgpvpn_tempest.tests.api.test_attach_associated_subnet_to_associated_router
  • Create BGPVPN vpn1 as admin.
  • Associate vpn1 with Neutron network net1.
  • Create BGPVPN vpn2
  • Associate vpn2 with Neutron router router1.
  • Attempt to add the subnet of net1 to router1
  • Test assertion: The association fails.
5.1.19.6.5.4.2. Pass / fail criteria

This test validates that all supported CRUD operations (create, read, update, delete) can be applied to the objects of the Neutron BGPVPN extension. In order to pass this test, all test assertions listed in the test execution above need to pass.

5.1.19.6.5.5. Post conditions

N/A

6. OVP Testing User Guide
6.1. Conducting OVP Testing with Dovetail
6.1.1. Overview

The Dovetail testing framework for OVP consists of two major parts: the testing client that executes all test cases in a lab (vendor self-testing or a third party lab), and the server system that is hosted by the OVP administrator to store and view test results based on a web API. The following diagram illustrates this overall framework.

_images/dovetail_online_mode.png

Within the tester’s lab, the Test Host is the machine where Dovetail executes all automated test cases. As it hosts the test harness, the Test Host must not be part of the System Under Test (SUT) itself. The above diagram assumes that the tester’s Test Host is situated in a DMZ, which has internal network access to the SUT and external access via the public Internet. The public Internet connection allows for easy installation of the Dovetail containers. A single compressed file that includes all the underlying results can be pulled from the Test Host and uploaded to the OPNFV OVP server. This arrangement may not be supported in some labs. Dovetail also supports an offline mode of installation that is illustrated in the next diagram.

_images/dovetail_offline_mode.png

In the offline mode, the Test Host only needs to have access to the SUT via the internal network, but does not need to connect to the public Internet. This user guide will highlight differences between the online and offline modes of the Test Host. While it is possible to run the Test Host as a virtual machine, this user guide assumes it is a physical machine for simplicity.

The rest of this guide will describe how to install the Dovetail tool as a Docker container image, go over the steps of running the OVP test suite, and then discuss how to view test results and make sense of them.

Readers interested in using Dovetail for its functionalities beyond OVP testing, e.g. for in-house or extended testing, should consult the Dovetail developer’s guide for additional information.

6.1.2. Installing Dovetail

In this section, we describe the procedure to install Dovetail client tool on the Test Host. The Test Host must have network access to the management network with access rights to the Virtual Infrastructure Manager’s API.

6.1.2.1. Checking the Test Host Readiness

The Test Host must have network access to the Virtual Infrastructure Manager’s API hosted in the SUT so that the Dovetail tool can exercise the API from the Test Host. It must also have ssh access to the Linux operating system of the compute nodes in the SUT. The ssh mechanism is used by some test cases to generate test events in the compute nodes. You can find out which test cases use this mechanism in the test specification document.
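Both prerequisites can be verified manually from the Test Host before proceeding. The addresses below are placeholders for your SUT (sketch):

```
$ curl -sS http://xxx.xxx.xxx.xxx:5000/v3     # the VIM (Keystone) API is reachable
$ ssh root@xxx.xxx.xxx.xxx hostname           # ssh login to a compute node works
```

If both commands succeed, the Test Host meets the basic connectivity requirements.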

We have tested the Dovetail tool on the following host operating systems. Other versions or distributions of Linux may also work, but community support may be more available on these versions.

  • Ubuntu 16.04.2 LTS (Xenial) or 14.04 LTS (Trusty)
  • CentOS-7-1611
  • Red Hat Enterprise Linux 7.3
  • Fedora 24 or 25 Server

Use of Ubuntu 16.04 is highly recommended, as it has been most widely employed during testing. Non-Linux operating systems, such as Windows and Mac OS, have not been tested and are not supported.

If online mode is used, the tester should also validate that the Test Host can reach the public Internet. For example,

$ ping www.opnfv.org
PING www.opnfv.org (50.56.49.117): 56 data bytes
64 bytes from 50.56.49.117: icmp_seq=0 ttl=48 time=52.952 ms
64 bytes from 50.56.49.117: icmp_seq=1 ttl=48 time=53.805 ms
64 bytes from 50.56.49.117: icmp_seq=2 ttl=48 time=53.349 ms
...

Or, if the lab environment does not allow ping, try validating it using HTTPS instead.

$ curl https://www.opnfv.org
<!doctype html>


<html lang="en-US" class="no-js">
<head>
...
6.1.2.2. Installing Prerequisite Packages on the Test Host

The main prerequisite software for Dovetail is Docker.

Dovetail does not work with Docker versions prior to 1.12.3. We have validated Dovetail with Docker 17.03 CE. Other versions of Docker later than 1.12.3 may also work, but community support may be more available on Docker 17.03 CE or greater.

$ sudo docker version
Client:
Version:      17.03.1-ce
API version:  1.27
Go version:   go1.7.5
Git commit:   c6d412e
Built:        Mon Mar 27 17:10:36 2017
OS/Arch:      linux/amd64

Server:
Version:      17.03.1-ce
API version:  1.27 (minimum version 1.12)
Go version:   go1.7.5
Git commit:   c6d412e
Built:        Mon Mar 27 17:10:36 2017
OS/Arch:      linux/amd64
Experimental: false

If your Test Host does not have Docker installed, or has a Docker version older than 1.12.3, or you have a Docker version other than 17.03 CE and wish to change, you will need to install, upgrade, or re-install Docker in order to run Dovetail. Since the Docker installation process can be complex, refer to the official Docker installation guide that is relevant to your Test Host’s operating system.

The above installation steps assume that the Test Host is in the online mode. For offline testing, use the following offline installation steps instead.

To install Docker offline, download the Docker static binaries and copy the tar file to the Test Host. For example, for Ubuntu 14.04, you may follow the link below to install:

https://github.com/meetyg/docker-offline-install
6.1.2.3. Configuring the Test Host Environment

The Test Host needs a few environment variables set correctly in order to access the Openstack API required to drive the Dovetail tests. For convenience and as a convention, we will also create a home directory for storing all Dovetail related config files and results files:

$ mkdir -p ${HOME}/dovetail
$ export DOVETAIL_HOME=${HOME}/dovetail

Here, we set the Dovetail home directory to ${HOME}/dovetail as an example. Then create two directories named pre_config and images in this directory to store all Dovetail related config files and all test images, respectively:

$ mkdir -p ${DOVETAIL_HOME}/pre_config
$ mkdir -p ${DOVETAIL_HOME}/images
6.1.2.4. Setting up Primary Configuration File

At this point, you will need to consult your SUT (Openstack) administrator to correctly set the configurations in a file named env_config.sh. The Openstack settings need to be configured such that the Dovetail client has all the necessary credentials and privileges to execute all test operations. If the SUT uses terms somewhat differently from the standard Openstack naming, you will need to adjust this file accordingly.

Create and edit the file ${DOVETAIL_HOME}/pre_config/env_config.sh so that all parameters are set correctly to match your SUT. Here is an example of what this file should contain.

$ cat ${DOVETAIL_HOME}/pre_config/env_config.sh

# Project-level authentication scope (name or ID), recommend admin project.
export OS_PROJECT_NAME=admin

# Authentication username, belongs to the project above, recommend admin user.
export OS_USERNAME=admin

# Authentication password. Use your own password
export OS_PASSWORD=xxxxxxxx

# Authentication URL, one of the endpoints of keystone service. If this is v3 version,
# there need some extra variables as follows.
export OS_AUTH_URL='http://xxx.xxx.xxx.xxx:5000/v3'

# Default is 2.0. If use keystone v3 API, this should be set as 3.
export OS_IDENTITY_API_VERSION=3

# Domain name or ID containing the user above.
# Command to check the domain: openstack user show <OS_USERNAME>
export OS_USER_DOMAIN_NAME=default

# Domain name or ID containing the project above.
# Command to check the domain: openstack project show <OS_PROJECT_NAME>
export OS_PROJECT_DOMAIN_NAME=default

# Special environment parameters for https.
# If using https + cacert, the path of cacert file should be provided.
# The cacert file should be put at $DOVETAIL_HOME/pre_config.
export OS_CACERT=/path/to/pre_config/cacert.pem

# If using https + no cacert, should add OS_INSECURE environment parameter.
export OS_INSECURE=True

# The name of a network with external connectivity for allocating floating
# IPs. It is required that at least one Neutron network with the attribute
# 'router:external=True' is pre-configured on the system under test.
# This network is used by test cases to SSH into tenant VMs and perform
# operations there.
export EXTERNAL_NETWORK=xxx

# Set an existing role used to create project and user for vping test cases.
# Otherwise, it will create a role 'Member' to do that.
export NEW_USER_ROLE=xxx

The OS_AUTH_URL variable is key to configure correctly, as the other admin services are gleaned from the identity service. If HTTPS is configured in the SUT, either OS_CACERT or OS_INSECURE should be set: provide OS_CACERT if a CA certificate is available, otherwise set OS_INSECURE. If SSL is disabled in the SUT, comment out both the OS_CACERT and OS_INSECURE variables. Ensure the ‘/path/to/pre_config’ directory in the above file matches the directory location of the cacert file for the OS_CACERT variable.
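Assuming the openstack command-line client is installed on the Test Host, a quick way to validate these settings is to source the file and request a token:

```
$ source ${DOVETAIL_HOME}/pre_config/env_config.sh
$ openstack token issue
```

If a token is returned without errors, the credentials, auth URL and domain settings match the SUT.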

The next three sections outline additional configuration files used by Dovetail. The tempest (tempest_conf.yaml) configuration file is required for executing all tempest test cases (e.g. functest.tempest.compute, functest.tempest.ipv6 ...) and functest.security.patrole. The HA (pod.yaml) configuration file is required for HA test cases and is also employed to collect SUT hardware info. The hosts.yaml is optional for hostname/IP resolution.

6.1.2.5. Configuration for Running Tempest Test Cases (Mandatory)

The test cases in the test areas tempest and security are based on Tempest. A SUT-specific configuration of Tempest is required in order to run those test cases successfully. The corresponding SUT-specific configuration options must be supplied in the file $DOVETAIL_HOME/pre_config/tempest_conf.yaml.

Create and edit file $DOVETAIL_HOME/pre_config/tempest_conf.yaml. Here is an example of what this file should contain.

compute:
  # The minimum number of compute nodes expected.
  # This should be no less than 2 and no larger than the compute nodes the SUT actually has.
  min_compute_nodes: 2

  # Expected device name when a volume is attached to an instance.
  volume_device_name: vdb

Use the listing above as a minimum to execute the mandatory test areas.

If the optional BGPVPN Tempest API tests shall be run, Tempest needs to be told that the BGPVPN service is available. To do that, add the following to the $DOVETAIL_HOME/pre_config/tempest_conf.yaml configuration file:

service_available:
  bgpvpn: True
6.1.2.6. Configuration for Running HA Test Cases (Mandatory)

The HA test cases require OpenStack controller node info, which must include each node’s name, role and IP, as well as the user and either a key_filename or password for logging into the node. Users must create the file ${DOVETAIL_HOME}/pre_config/pod.yaml to store this info. Some HA test cases log into the controller node ‘node1’ and kill specific processes. The names of these processes may differ from the actual process names on the SUT; they can be overridden in the file ${DOVETAIL_HOME}/pre_config/pod.yaml.

This file is also used as basis to collect SUT hardware information that is stored alongside results and uploaded to the OVP web portal. The SUT hardware information can be viewed within the ‘My Results’ view in the OVP web portal by clicking the SUT column ‘info’ link. In order to collect SUT hardware information holistically, ensure this file has an entry for each of the controller and compute nodes within the SUT.

Below is a sample with the required syntax when password is employed by the controller.

nodes:
-
    # This can not be changed and must be node0.
    name: node0

    # This must be Jumpserver.
    role: Jumpserver

    # This is the install IP of a node which has ipmitool installed.
    ip: xx.xx.xx.xx

    # User name of this node. This user must have sudo privileges.
    user: root

    # Password of the user.
    password: root

-
    # This can not be changed and must be node1.
    name: node1

    # This must be controller.
    role: Controller

    # This is the install IP of a controller node, which is the haproxy primary node
    ip: xx.xx.xx.xx

    # User name of this node. This user must have sudo privileges.
    user: root

    # Password of the user.
    password: root

process_info:
-
    # The default attack process of yardstick.ha.rabbitmq is 'rabbitmq-server'.
    # It can be overridden here, e.g. set to 'rabbitmq'.
    testcase_name: yardstick.ha.rabbitmq
    attack_process: rabbitmq

-
    # The default attack host for all HA test cases is 'node1'.
    # It can be overridden here to any other node listed in the section 'nodes'.
    testcase_name: yardstick.ha.glance_api
    attack_host: node2

Besides the ‘password’, a ‘key_filename’ entry can be provided to login to the controller node. Users need to create file $DOVETAIL_HOME/pre_config/id_rsa to store the private key. A sample is provided below to show the required syntax when using a key file.

nodes:
-
    name: node1
    role: Controller
    ip: 10.1.0.50
    user: root

    # Private ssh key for accessing the controller nodes. If a keyfile is
    # being used, the path specified **must** be as shown below as this
    # is the location of the user-provided private ssh key inside the
    # Yardstick container.
    key_filename: /home/opnfv/userconfig/pre_config/id_rsa

Under nodes, repeat entries for name, role, ip, user and password or key file for each of the controller/compute nodes that comprise the SUT. Use a ‘-‘ to separate each of the entries. Specify the value for the role key to be either ‘Controller’ or ‘Compute’ for each node.

Under process_info, repeat entries for testcase_name, attack_host and attack_process for each HA test case. Use a ‘-‘ to separate each of the entries. The default attack host of all HA test cases is node1. The default attack processes of all HA test cases are listed here:

Test Case Name                  Attack Process Name
yardstick.ha.cinder_api         cinder-api
yardstick.ha.database           mysql
yardstick.ha.glance_api         glance-api
yardstick.ha.haproxy            haproxy
yardstick.ha.keystone           keystone
yardstick.ha.neutron_l3_agent   neutron-l3-agent
yardstick.ha.neutron_server     neutron-server
yardstick.ha.nova_api           nova-api
yardstick.ha.rabbitmq           rabbitmq-server
6.1.2.7. Configuration of Hosts File (Optional)

If your SUT uses a hosts file to translate hostnames into the IP of OS_AUTH_URL, then you need to provide the hosts info in a file $DOVETAIL_HOME/pre_config/hosts.yaml.

Create and edit file $DOVETAIL_HOME/pre_config/hosts.yaml. Below is an example of what this file should contain. Note, that multiple hostnames can be specified for each IP address, as shown in the generic syntax below the example.

$ cat ${DOVETAIL_HOME}/pre_config/hosts.yaml

---
hosts_info:
  192.168.141.101:
    - identity.endpoint.url
    - compute.endpoint.url

  <ip>:
    - <hostname1>
    - <hostname2>
6.1.2.8. Installing Dovetail on the Test Host

The Dovetail project maintains a Docker image that has the Dovetail test tools preinstalled. This Docker image is tagged with versions. Before pulling the Dovetail image, check the OPNFV OVP web page first to determine the right tag for OVP testing.

6.1.2.8.1. Online Test Host

If the Test Host is online, you can directly pull Dovetail Docker image and download Ubuntu and Cirros images. All other dependent docker images will automatically be downloaded. The Ubuntu and Cirros images are used by Dovetail for image creation and VM instantiation within the SUT.

$ wget -nc http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img -P ${DOVETAIL_HOME}/images
$ wget -nc https://cloud-images.ubuntu.com/releases/14.04/release/ubuntu-14.04-server-cloudimg-amd64-disk1.img -P ${DOVETAIL_HOME}/images
$ wget -nc https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.img -P ${DOVETAIL_HOME}/images
$ wget -nc http://repository.cloudifysource.org/cloudify/4.0.1/sp-release/cloudify-manager-premium-4.0.1.qcow2 -P ${DOVETAIL_HOME}/images

$ sudo docker pull opnfv/dovetail:ovp-2.0.0
ovp-2.0.0: Pulling from opnfv/dovetail
324d088ce065: Pull complete
2ab951b6c615: Pull complete
9b01635313e2: Pull complete
04510b914a6c: Pull complete
83ab617df7b4: Pull complete
40ebbe7294ae: Pull complete
d5db7e3e81ae: Pull complete
0701bf048879: Pull complete
0ad9f4168266: Pull complete
d949894f87f6: Pull complete
Digest: sha256:7449601108ebc5c40f76a5cd9065ca5e18053be643a0eeac778f537719336c29
Status: Downloaded newer image for opnfv/dovetail:ovp-2.0.0
6.1.2.8.2. Offline Test Host

If the Test Host is offline, you will need to first pull the Dovetail Docker image, and all the dependent images that Dovetail uses, to a host that is online. All dependent images must be pulled because Dovetail normally checks dependencies at run-time and automatically pulls images as needed only when the Test Host is online. If the Test Host is offline, all these dependencies need to be copied manually.

The Docker images and Cirros image below are necessary for all mandatory test cases.

$ sudo docker pull opnfv/dovetail:ovp-2.0.0
$ sudo docker pull opnfv/functest-smoke:opnfv-6.3.0
$ sudo docker pull opnfv/yardstick:ovp-2.0.0
$ sudo docker pull opnfv/bottlenecks:ovp-2.0.0
$ wget -nc http://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img -P {ANY_DIR}

The other Docker images and test images below are only used by optional test cases.

$ sudo docker pull opnfv/functest-healthcheck:opnfv-6.3.0
$ sudo docker pull opnfv/functest-features:opnfv-6.3.0
$ sudo docker pull opnfv/functest-vnf:opnfv-6.3.0
$ wget -nc https://cloud-images.ubuntu.com/releases/14.04/release/ubuntu-14.04-server-cloudimg-amd64-disk1.img -P {ANY_DIR}
$ wget -nc https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.img -P {ANY_DIR}
$ wget -nc http://repository.cloudifysource.org/cloudify/4.0.1/sp-release/cloudify-manager-premium-4.0.1.qcow2 -P {ANY_DIR}

Once all these images are pulled, save the images, copy to the Test Host, and then load the Dovetail image and all dependent images at the Test Host.

At the online host, save the images with the command below.

$ sudo docker save -o dovetail.tar opnfv/dovetail:ovp-2.0.0 \
  opnfv/functest-smoke:opnfv-6.3.0 opnfv/functest-healthcheck:opnfv-6.3.0 \
  opnfv/functest-features:opnfv-6.3.0 opnfv/functest-vnf:opnfv-6.3.0 \
  opnfv/yardstick:ovp-2.0.0 opnfv/bottlenecks:ovp-2.0.0

The command above creates a dovetail.tar file with all the images, which can then be copied to the Test Host. To load the Dovetail images on the Test Host execute the command below.

$ sudo docker load --input dovetail.tar

Now check to see that all Docker images have been pulled or loaded properly.

$ sudo docker images
REPOSITORY                      TAG                 IMAGE ID            CREATED             SIZE
opnfv/dovetail                  ovp-2.0.0           ac3b2d12b1b0        24 hours ago        784 MB
opnfv/functest-smoke            opnfv-6.3.0         010aacb7c1ee        17 hours ago        594.2 MB
opnfv/functest-healthcheck      opnfv-6.3.0         2cfd4523f797        17 hours ago        234 MB
opnfv/functest-features         opnfv-6.3.0         b61d4abd56fd        17 hours ago        530.5 MB
opnfv/functest-vnf              opnfv-6.3.0         929e847a22c3        17 hours ago        1.87 GB
opnfv/yardstick                 ovp-2.0.0           84b4edebfc44        17 hours ago        2.052 GB
opnfv/bottlenecks               ovp-2.0.0           3d4ed98a6c9a        21 hours ago        638 MB

After copying and loading the Dovetail images on the Test Host, also copy the test images (Ubuntu, CirrOS and cloudify-manager) to the Test Host.

  • Copy image cirros-0.4.0-x86_64-disk.img to ${DOVETAIL_HOME}/images/.
  • Copy image ubuntu-14.04-server-cloudimg-amd64-disk1.img to ${DOVETAIL_HOME}/images/.
  • Copy image ubuntu-16.04-server-cloudimg-amd64-disk1.img to ${DOVETAIL_HOME}/images/.
  • Copy image cloudify-manager-premium-4.0.1.qcow2 to ${DOVETAIL_HOME}/images/.
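As a quick sanity check after copying, the presence of all four test images on the Test Host can be verified with a small script. This is only a sketch; the helper name and the default path are illustrative.

```shell
#!/bin/sh
# Sketch: verify that all test images required by Dovetail are present
# in ${DOVETAIL_HOME}/images/. The helper name is illustrative.
check_images() {
    dir=$1
    missing=0
    for img in cirros-0.4.0-x86_64-disk.img \
               ubuntu-14.04-server-cloudimg-amd64-disk1.img \
               ubuntu-16.04-server-cloudimg-amd64-disk1.img \
               cloudify-manager-premium-4.0.1.qcow2; do
        if [ ! -f "$dir/$img" ]; then
            echo "missing: $img"
            missing=$((missing + 1))
        fi
    done
    echo "$missing image(s) missing"
}

check_images "${DOVETAIL_HOME:-/home/dovetail}/images"
```

If anything is reported missing, repeat the copy steps above before starting the Dovetail container.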
6.1.3. Starting Dovetail Docker

Regardless of whether you pulled the Dovetail image directly online or loaded it from a static image tar file, you are now ready to run Dovetail. Use the command below to create a Dovetail container and get access to its shell.

$ sudo docker run --privileged=true -it \
          -e DOVETAIL_HOME=$DOVETAIL_HOME \
          -v $DOVETAIL_HOME:$DOVETAIL_HOME \
          -v /var/run/docker.sock:/var/run/docker.sock \
          opnfv/dovetail:<tag> /bin/bash

The -e option sets the DOVETAIL_HOME environment variable in the container, and the -v options mount files from the Test Host to the destination paths inside the container. The latter options allow the Dovetail container to read the configuration files and to write result files into DOVETAIL_HOME on the Test Host. Once the command above is executed, you should be inside the Dovetail container shell.

6.1.4. Running the OVP Test Suite

All or a subset of the available tests can be executed from any location within the Dovetail container prompt. Refer to the Dovetail Command Line Interface Reference for details of the CLI.

$ dovetail run --testsuite <test-suite-name>

The '--testsuite' option controls, at a high level, the set of tests intended for execution. For the purposes of running the OVP test suite, the test suite name follows the format ovp.<major>.<minor>. The latest and default test suite is ovp.2018.09.

$ dovetail run

This command is equivalent to

$ dovetail run --testsuite ovp.2018.09

Without any additional options, the above command will attempt to execute all mandatory and optional test cases within test suite ovp.2018.09. To restrict the test scope, the '--mandatory' or '--optional' options can be used.

$ dovetail run --mandatory

There is also a '--testcase' option to run a single specified test case.

$ dovetail run --testcase functest.tempest.osinterop

Dovetail allows the user to disable strict API response validation implemented by Nova Tempest tests by means of the --no-api-validation option. Usage of this option is only advisable if the SUT returns Nova API responses that contain additional attributes. For more information on this command line option and its intended usage, refer to Disabling Strict API Validation in Tempest.

$ dovetail run --testcase functest.tempest.osinterop --no-api-validation

By default, during test case execution, the respective feature decides which flavor to use for each test scenario under its umbrella. In parallel, a mechanism is implemented so that the extra specs in the flavors of executing test scenarios use hugepages instead of the default option. This happens when the name of the scenario contains the substring "ovs": in that case, the flavor used for the running test case has 'hugepage' characteristics.

Taking the above into consideration, and given that the DEPLOY_SCENARIO environment parameter is not otherwise used by the Dovetail framework (its initial value is 'unknown'), we pass as input, to the features responsible for the test case execution, a DEPLOY_SCENARIO environment parameter containing the substring "ovs" (e.g. os-nosdn-ovs-ha).

Note for the users:
  • if your system uses DPDK, run with --deploy-scenario <xx-yy-ovs-zz> (e.g. os-nosdn-ovs-ha)
  • this is an experimental feature
$ dovetail run --testcase functest.tempest.osinterop --deploy-scenario os-nosdn-ovs-ha

By default, results are stored in local files on the Test Host at $DOVETAIL_HOME/results. Each time the 'dovetail run' command is executed, the results in the aforementioned directory are overwritten. To create a single compressed result file for upload to the OVP portal or for archival purposes, the tool provides the '--report' option.

$ dovetail run --report
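Because each 'dovetail run' overwrites $DOVETAIL_HOME/results, it can be useful to archive the previous results before starting a new run. A minimal sketch (the helper name and default path are illustrative):

```shell
#!/bin/sh
# Sketch: copy the previous results directory aside before a new
# 'dovetail run' overwrites it. The helper name is illustrative.
backup_results() {
    src=$1
    stamp=$(date +%Y%m%d_%H%M%S)
    if [ -d "$src" ]; then
        cp -r "$src" "${src}_${stamp}"
        echo "backed up to ${src}_${stamp}"
    fi
}

backup_results "${DOVETAIL_HOME:-/home/dovetail}/results"
```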

If the Test Host is offline, --offline should be added to support running with local resources.

$ dovetail run --offline

Below is an example of running one test case and the creation of the compressed result file on the Test Host.

$ dovetail run --offline --testcase functest.vping.userdata --report
2018-05-22 08:16:16,353 - run - INFO - ================================================
2018-05-22 08:16:16,353 - run - INFO - Dovetail compliance: ovp.2018.09!
2018-05-22 08:16:16,353 - run - INFO - ================================================
2018-05-22 08:16:16,353 - run - INFO - Build tag: daily-master-660de986-5d98-11e8-b635-0242ac110001
2018-05-22 08:19:31,595 - run - WARNING - There is no hosts file /home/dovetail/pre_config/hosts.yaml, may be some issues with domain name resolution.
2018-05-22 08:19:31,595 - run - INFO - Get hardware info of all nodes list in file /home/dovetail/pre_config/pod.yaml ...
2018-05-22 08:19:39,778 - run - INFO - Hardware info of all nodes are stored in file /home/dovetail/results/all_hosts_info.json.
2018-05-22 08:19:39,961 - run - INFO - >>[testcase]: functest.vping.userdata
2018-05-22 08:31:17,961 - run - INFO - Results have been stored with file /home/dovetail/results/functest_results.txt.
2018-05-22 08:31:17,969 - report.Report - INFO -

Dovetail Report
Version: 1.0.0
Build Tag: daily-master-660de986-5d98-11e8-b635-0242ac110001
Upload Date: 2018-05-22 08:31:17 UTC
Duration: 698.01 s

Pass Rate: 100.00% (1/1)
vping:                     pass rate 100.00%
-functest.vping.userdata   PASS

When test execution is complete, a tar file with all result and log files is written to $DOVETAIL_HOME on the Test Host. An example filename is ${DOVETAIL_HOME}/logs_20180105_0858.tar.gz. The file is named using a timestamp that follows the convention 'YearMonthDay_HourMinute'; in this case, it was generated at 08:58 on January 5th, 2018. This tar file is the one to upload to the OVP portal.
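Since each run with '--report' produces a new timestamped tarball, a quick way to pick the most recent one for upload is the following sketch (the helper name is illustrative):

```shell
#!/bin/sh
# Sketch: print the newest logs_<timestamp>.tar.gz in a directory,
# i.e. the result file to upload to the OVP portal.
latest_tarball() {
    ls -t "$1"/logs_*.tar.gz 2>/dev/null | head -n 1
}

latest_tarball "${DOVETAIL_HOME:-/home/dovetail}"
```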

6.1.4.1. Making Sense of OVP Test Results

When a tester is performing trial runs, Dovetail stores results in local files on the Test Host by default within the directory specified below.

cd $DOVETAIL_HOME/results
  1. Local files
    • Log file: dovetail.log
      • Review dovetail.log to check whether all important information has been captured - in default mode without DEBUG.
      • Review results.json to see all result data, including the criteria for PASS or FAIL.
    • Tempest and security test cases
      • Log details are in tempest_logs/functest.tempest.XXX.html and security_logs/functest.security.XXX.html respectively, listing the passed, skipped and failed test cases.
      • These files need to be opened with a web browser.
      • Skipped test cases include the reason why they were skipped.
      • Failed test cases include rich debug information showing why they failed.
    • Vping test cases
      • Logs are stored in vping_logs/functest.vping.XXX.log.
    • HA test cases
      • Logs are stored in ha_logs/yardstick.ha.XXX.log.
    • Stress test cases
      • Logs are stored in stress_logs/bottlenecks.stress.XXX.log.
    • Snaps test cases
      • Logs are stored in snaps_logs/functest.snaps.smoke.log.
    • VNF test cases
      • Logs are stored in vnf_logs/functest.vnf.XXX.log.
    • Bgpvpn test cases
      • Logs are stored in bgpvpn_logs/functest.bgpvpn.XXX.log.
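To spot problems quickly across all of these log files, failed test cases can be located with a recursive grep. This is a sketch only; the helper name is illustrative and matching on the literal string "FAIL" is a rough heuristic.

```shell
#!/bin/sh
# Sketch: list files under the results directory that mention FAIL.
# Matching the literal string FAIL is a rough heuristic only.
list_failed_logs() {
    grep -rl "FAIL" "$1" 2>/dev/null
}

list_failed_logs "${DOVETAIL_HOME:-/home/dovetail}/results" || true
```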
6.1.5. OVP Portal Web Interface

The OVP portal is a public web interface for the community to collaborate on results and to submit results for official OPNFV compliance verification. The portal can be used as a resource by users and testers to navigate and inspect results more easily than by manually inspecting the log files. The portal also allows users to share results in a private manner until they are ready to submit results for peer community review.

  • Web Site URL
  • Sign In / Sign Up Links
    • Sign-in is supported through Linux Foundation or OpenStack account credentials.
    • If you already have a Linux Foundation ID, you can sign in directly with your ID.
    • If you do not have a Linux Foundation ID, you can sign up for a new one using ‘Sign Up’
  • My Results Tab
    • This is the primary view where most of the workflow occurs.
    • This page lists all results uploaded by you after signing in.
    • You can also upload results on this page with the two steps below.
    • Obtain the results tar file located in ${DOVETAIL_HOME}/, e.g. logs_20180105_0858.tar.gz.
    • Use the Choose File button to open a file selection dialog and choose your result file from the hard disk. Then click the Upload button; a results ID is shown once your upload succeeds.
    • Results are status ‘private’ until they are submitted for review.
    • Use the Operation column drop-down option ‘submit to review’, to expose results to OPNFV community peer reviewers. Use the ‘withdraw submit’ option to reverse this action.
    • Use the Operation column drop-down option ‘share with’ to share results with other users by supplying either the login user ID or the email address associated with the share target account. The result is exposed to the share target but remains private otherwise.
  • Profile Tab
    • This page shows your account info after you sign in.
6.1.6. Updating Dovetail or a Test Suite

Follow the instructions in sections Installing Dovetail on the Test Host and Running the OVP Test Suite, replacing the Docker images with the new tags:

sudo docker pull opnfv/dovetail:<dovetail_new_tag>
sudo docker pull opnfv/functest:<functest_new_tag>
sudo docker pull opnfv/yardstick:<yardstick_new_tag>

This step is necessary whenever the Dovetail software or the OVP test suite has been updated.

6.2. Dovetail Command Line Interface Reference

The Dovetail command line provides a simple interface that makes it easier for users to access the functions the Dovetail framework provides.

6.2.1. Commands List
dovetail --help | -h
    Show usage of command "dovetail"
dovetail --version
    Show version number

Dovetail List Commands

dovetail list --help | -h
    Show usage of command "dovetail list"
dovetail list
    List all available test suites and all test cases within each test suite
dovetail list <test_suite_name>
    List all available test areas within test suite <test_suite_name>

Dovetail Show Commands

dovetail show --help | -h
    Show usage of command "dovetail show"
dovetail show <test_case_name>
    Show the details of one test case

Dovetail Run Commands

dovetail run --help | -h
    Show usage of command "dovetail run"
dovetail run
    Run Dovetail with all test cases within the default test suite
dovetail run --testsuite <test_suite_name>
    Run Dovetail with all test cases within test suite <test_suite_name>
dovetail run --testsuite <test_suite_name> --testarea <test_area_name>
    Run Dovetail with test area <test_area_name> within test suite <test_suite_name>. The test area can be chosen from (vping, tempest, security, ha, stress, bgpvpn, vnf, snaps). Repeat the option to set multiple test areas.
dovetail run --testcase <test_case_name>
    Run Dovetail with one or more specified test cases. Repeat the option to set multiple test cases.
dovetail run --mandatory --testsuite <test_suite_name>
    Run Dovetail with all mandatory test cases within test suite <test_suite_name>
dovetail run --optional --testsuite <test_suite_name>
    Run Dovetail with all optional test cases within test suite <test_suite_name>
dovetail run --debug | -d
    Run Dovetail in debug mode and show all debug logs
dovetail run --offline
    Run Dovetail offline, using local Docker images instead of downloading them online
dovetail run --report | -r <db_url>
    Package the results directory so it can be uploaded to the OVP web portal
dovetail run --deploy-scenario <deploy_scenario_name>
    Specify the deploy scenario, e.g. one having 'ovs' as the project name
dovetail run --no-api-validation
    Disable strict API response validation
dovetail run --no-clean | -n
    Keep all containers created, for debugging
dovetail run --stop | -s
    Stop immediately when one test case fails
6.2.2. Commands Examples
6.2.2.1. Dovetail Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail --help
Usage: dovetail [OPTIONS] COMMAND [ARGS]...

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  list  list the testsuite details
  run   run the testcases
  show  show the testcases details
root@1f230e719e44:~/dovetail/dovetail# dovetail --version
dovetail, version 2018.9.0
6.2.2.2. Dovetail List Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail list --help
Usage: dovetail list [OPTIONS] [TESTSUITE]

  list the testsuite details

Options:
  -h, --help  Show this message and exit.
root@1f230e719e44:~/dovetail/dovetail# dovetail list ovp.2018.09
- mandatory
    functest.vping.userdata
    functest.vping.ssh
    functest.tempest.osinterop
    functest.tempest.compute
    functest.tempest.identity_v3
    functest.tempest.image
    functest.tempest.network_api
    functest.tempest.volume
    functest.tempest.neutron_trunk_ports
    functest.tempest.ipv6_api
    functest.security.patrole
    yardstick.ha.nova_api
    yardstick.ha.neutron_server
    yardstick.ha.keystone
    yardstick.ha.glance_api
    yardstick.ha.cinder_api
    yardstick.ha.cpu_load
    yardstick.ha.disk_load
    yardstick.ha.haproxy
    yardstick.ha.rabbitmq
    yardstick.ha.database
    bottlenecks.stress.ping
- optional
    functest.tempest.ipv6_scenario
    functest.tempest.multi_node_scheduling
    functest.tempest.network_security
    functest.tempest.vm_lifecycle
    functest.tempest.network_scenario
    functest.tempest.bgpvpn
    functest.bgpvpn.subnet_connectivity
    functest.bgpvpn.tenant_separation
    functest.bgpvpn.router_association
    functest.bgpvpn.router_association_floating_ip
    yardstick.ha.neutron_l3_agent
    yardstick.ha.controller_restart
    functest.vnf.vims
    functest.vnf.vepc
    functest.snaps.smoke
6.2.2.3. Dovetail Show Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail show --help
Usage: dovetail show [OPTIONS] TESTCASE

  show the testcases details

Options:
  -h, --help  Show this message and exit.
root@1f230e719e44:~/dovetail/dovetail# dovetail show functest.vping.ssh
---
functest.vping.ssh:
  name: functest.vping.ssh
  objective: testing for vping using ssh
  validate:
    type: functest
    testcase: vping_ssh
  report:
    source_archive_files:
      - functest.log
    dest_archive_files:
      - vping_logs/functest.vping.ssh.log
    check_results_file: 'functest_results.txt'
    sub_testcase_list:
root@1f230e719e44:~/dovetail/dovetail# dovetail show functest.tempest.image
---
functest.tempest.image:
  name: functest.tempest.image
  objective: tempest smoke test cases about image
  validate:
    type: functest
    testcase: tempest_custom
    pre_condition:
      - 'cp /home/opnfv/userconfig/pre_config/tempest_conf.yaml /usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/tempest/custom_tests/tempest_conf.yaml'
      - 'cp /home/opnfv/userconfig/pre_config/testcases.yaml /usr/lib/python2.7/site-packages/xtesting/ci/testcases.yaml'
    pre_copy:
      src_file: tempest_custom.txt
      dest_path: /usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/tempest/custom_tests/test_list.txt
  report:
    source_archive_files:
      - functest.log
      - tempest_custom/tempest.log
      - tempest_custom/tempest-report.html
    dest_archive_files:
      - tempest_logs/functest.tempest.image.functest.log
      - tempest_logs/functest.tempest.image.log
      - tempest_logs/functest.tempest.image.html
    check_results_file: 'functest_results.txt'
    sub_testcase_list:
      - tempest.api.image.v2.test_images.BasicOperationsImagesTest.test_register_upload_get_image_file[id-139b765e-7f3d-4b3d-8b37-3ca3876ee318,smoke]
      - tempest.api.image.v2.test_versions.VersionsTest.test_list_versions[id-659ea30a-a17c-4317-832c-0f68ed23c31d,smoke]
6.2.2.4. Dovetail Run Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail run --help
Usage: run.py [OPTIONS]

Dovetail compliance test entry!

Options:
--deploy-scenario TEXT  Specify the DEPLOY_SCENARIO which will be used as input by each testcase respectively
--optional              Run all optional test cases.
--offline               run in offline method, which means not to update the docker upstream images, functest, yardstick, etc.
-r, --report            Create a tarball file to upload to OVP web portal
-d, --debug             Flag for showing debug log on screen.
--testcase TEXT         Compliance testcase. Specify option multiple times to include multiple test cases.
--testarea TEXT         Compliance testarea within testsuite. Specify option multiple times to include multiple test areas.
-s, --stop              Flag for stopping on test case failure.
-n, --no-clean          Keep all Containers created for debuging.
--no-api-validation     disable strict API response validation
--mandatory             Run all mandatory test cases.
--testsuite TEXT        compliance testsuite.
-h, --help              Show this message and exit.
root@1f230e719e44:~/dovetail/dovetail# dovetail run --testcase functest.vping.ssh --offline -r --deploy-scenario os-nosdn-ovs-ha
2017-10-12 14:57:51,278 - run - INFO - ================================================
2017-10-12 14:57:51,278 - run - INFO - Dovetail compliance: ovp.2018.09!
2017-10-12 14:57:51,278 - run - INFO - ================================================
2017-10-12 14:57:51,278 - run - INFO - Build tag: daily-master-b80bca76-af5d-11e7-879a-0242ac110002
2017-10-12 14:57:51,278 - run - INFO - DEPLOY_SCENARIO : os-nosdn-ovs-ha
2017-10-12 14:57:51,336 - run - WARNING - There is no hosts file /home/dovetail/pre_config/hosts.yaml, may be some issues with domain name resolution.
2017-10-12 14:57:51,336 - run - INFO - Get hardware info of all nodes list in file /home/cvp/pre_config/pod.yaml ...
2017-10-12 14:57:51,336 - run - INFO - Hardware info of all nodes are stored in file /home/cvp/results/all_hosts_info.json.
2017-10-12 14:57:51,517 - run - INFO - >>[testcase]: functest.vping.ssh
2017-10-12 14:58:21,325 - report.Report - INFO - Results have been stored with file /home/cvp/results/functest_results.txt.
2017-10-12 14:58:21,325 - report.Report - INFO -

Dovetail Report
Version: 2018.09
Build Tag: daily-master-b80bca76-af5d-11e7-879a-0242ac110002
Test Date: 2018-08-13 03:23:56 UTC
Duration: 291.92 s

Pass Rate: 0.00% (1/1)
vping:                     pass rate 100%
-functest.vping.ssh        PASS

Functest

Functest Installation Guide
Introduction

This document describes how to install and configure Functest in OPNFV.

High level architecture

The high level architecture of Functest within OPNFV can be described as follows:

CIMC/Lights+out management               Admin  Mgmt/API  Public  Storage Private
                                          PXE
+                                           +       +        +       +       +
|                                           |       |        |       |       |
|     +----------------------------+        |       |        |       |       |
|     |                            |        |       |        |       |       |
+-----+       Jumphost             |        |       |        |       |       |
|     |                            |        |       |        |       |       |
|     |   +--------------------+   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | Tools              |   +----------------+        |       |       |
|     |   | - Rally            |   |        |       |        |       |       |
|     |   | - Robot            |   |        |       |        |       |       |
|     |   | - RefStack         |   |        |       |        |       |       |
|     |   |                    |   |-------------------------+       |       |
|     |   | Testcases          |   |        |       |        |       |       |
|     |   | - VIM              |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | - SDN Controller   |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | - Features         |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | - VNF              |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   +--------------------+   |        |       |        |       |       |
|     |     Functest Docker        +        |       |        |       |       |
|     |                            |        |       |        |       |       |
|     |                            |        |       |        |       |       |
|     |                            |        |       |        |       |       |
|     +----------------------------+        |       |        |       |       |
|                                           |       |        |       |       |
|    +----------------+                     |       |        |       |       |
|    |             1  |                     |       |        |       |       |
+----+ +--------------+-+                   |       |        |       |       |
|    | |             2  |                   |       |        |       |       |
|    | | +--------------+-+                 |       |        |       |       |
|    | | |             3  |                 |       |        |       |       |
|    | | | +--------------+-+               |       |        |       |       |
|    | | | |             4  |               |       |        |       |       |
|    +-+ | | +--------------+-+             |       |        |       |       |
|      | | | |             5  +-------------+       |        |       |       |
|      +-+ | |  nodes for     |             |       |        |       |       |
|        | | |  deploying     +---------------------+        |       |       |
|        +-+ |  OPNFV         |             |       |        |       |       |
|          | |                +------------------------------+       |       |
|          +-+     SUT        |             |       |        |       |       |
|            |                +--------------------------------------+       |
|            |                |             |       |        |       |       |
|            |                +----------------------------------------------+
|            +----------------+             |       |        |       |       |
|                                           |       |        |       |       |
+                                           +       +        +       +       +
             SUT = System Under Test

Note: connectivity to the management network is not needed for most of the test cases, but it may be needed for some specific SNAPS tests.

All the libraries and dependencies needed by all of the Functest tools are pre-installed in the Docker images. This allows Functest to run on any platform.

The automated mechanisms inside the Functest Docker containers will:

  • Prepare the environment according to the System Under Test (SUT)
  • Perform the appropriate functional tests
  • Push the test results into the OPNFV test result database (optional)

The OpenStack credentials file must be provided to the container.

These Docker images can be integrated into CI or deployed independently.

Please note that the Functest Docker images were designed for OPNFV; however, it is possible to adapt them to any OpenStack-based VIM + controller environment, since most of the test cases are integrated from upstream communities.

The functional test cases are described in the Functest User Guide.

Prerequisites

The OPNFV deployment is out of the scope of this document, but details can be found at http://docs.opnfv.org. The OPNFV platform is considered the SUT in this document.

Several prerequisites are needed for Functest:

  1. A Jumphost to run Functest on
  2. A Docker daemon shall be installed on the Jumphost
  3. A public/external network created on the SUT
  4. An admin/management network created on the SUT
  5. Connectivity from the Jumphost to the SUT public/external network

Some specific SNAPS tests may require connectivity from the Jumphost to the SUT admin/management network, but most of the test cases do not. This requirement can be lifted by overriding the 'interface' attribute (OS_INTERFACE) to 'public' in the credentials file. Another way to circumvent this issue is to change the 'snaps.use_keystone' value from True to False.
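For example, the override could look as follows in the credentials file. This fragment is an assumption based on the text above; adjust it to your deployment.

```shell
# Illustrative fragment of the OpenStack credentials file, not a
# complete file: use the public endpoint interface so SNAPS tests
# do not require access to the admin/management network.
export OS_INTERFACE=public
```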

WARNING: Connectivity from the Jumphost is essential, and it is of paramount importance to make sure it is working before even considering installing and running Functest. Also make sure you understand how your networking is designed to work.

NOTE: Jumphost refers to any server which meets the previous requirements. Normally it is the same server from where the OPNFV deployment has been triggered previously, but it could be any server with proper connectivity to the SUT.

NOTE: If your Jumphost operates behind a company HTTP proxy and/or firewall, please first consult the section Proxy support towards the end of this document. That section details some tips/tricks which may be of help in a proxified environment.

Docker installation

Docker installation and configuration only needs to be done once during the life cycle of the Jumphost.

If your Jumphost is based on Ubuntu, SUSE, RHEL or CentOS Linux, please consult the references below for more detailed instructions. The commands below are offered as a short reference.

Tip: To run Docker containers behind a proxy, you first need some extra configuration, which is described in section Docker Installation on CentOS behind http proxy. You should follow that section before installing the Docker engine.

Docker installation needs to be done as the root user. You may use other user IDs to create and run the actual containers later if desired. Log on to your Jumphost as root and install the Docker Engine (e.g. for the CentOS family):

curl -sSL https://get.docker.com/ | sh
systemctl start docker

Tip: If you are working through a proxy, please set the https_proxy environment variable before executing the curl command.

Add your user to the docker group to be able to run commands without sudo:

sudo usermod -aG docker <your_user>
A reconnection is needed. There are two ways to do this:
  1. Re-login to your account
  2. su - <username>
References - Installing Docker Engine on different Linux Operating Systems:
Public/External network on SUT

Some of the tests against the VIM (Virtual Infrastructure Manager) need connectivity through an existing public/external network in order to succeed. This is needed, for example, to create floating IPs to access VM instances through the public/external network (i.e. from the Docker container).

By default, the five OPNFV installers provide a fresh installation with a public/external network created along with a router. Make sure that the public/external subnet is reachable from the Jumphost and an external router exists.

Hint: For the OPNFV installer in use, the IP subnet address used for the public/external network is usually a planning item and should thus be known. Ensure you can reach each node in the SUT from the Jumphost using the 'ping' command with the respective IP address on the public/external network for each node. The details of how to determine the needed IP addresses for each node in the SUT may vary according to the installer used and are therefore omitted here.

Installation and configuration

Alpine containers were introduced in Euphrates. Alpine allows Functest testing in several very light containers and, thanks to the refactoring of dependency management, allows the creation of light and fully customized Docker images.

Functest Dockers for OpenStack deployment

Docker images are available on Docker Hub:

  • opnfv/functest-core
  • opnfv/functest-healthcheck
  • opnfv/functest-smoke
  • opnfv/functest-benchmarking
  • opnfv/functest-features
  • opnfv/functest-components
  • opnfv/functest-vnf
Preparing your environment

cat env:

EXTERNAL_NETWORK=XXX
DEPLOY_SCENARIO=XXX  # if not os-nosdn-nofeature-noha scenario
NAMESERVER=XXX  # if not 8.8.8.8

See section on environment variables for details.

cat env_file:

export OS_AUTH_URL=XXX
export OS_USER_DOMAIN_NAME=XXX
export OS_PROJECT_DOMAIN_NAME=XXX
export OS_USERNAME=XXX
export OS_PROJECT_NAME=XXX
export OS_PASSWORD=XXX
export OS_IDENTITY_API_VERSION=3

See section on OpenStack credentials for details.

Create a directory for the different images (attached as a Docker volume):

mkdir -p images && wget -q -O- https://git.opnfv.org/functest/plain/functest/ci/download_images.sh?h=stable/fraser | bash -s -- images && ls -1 images/*

images/CentOS-7-aarch64-GenericCloud.qcow2
images/CentOS-7-aarch64-GenericCloud.qcow2.xz
images/CentOS-7-x86_64-GenericCloud.qcow2
images/cirros-0.4.0-x86_64-disk.img
images/cirros-0.4.0-x86_64-lxc.tar.gz
images/cloudify-manager-premium-4.0.1.qcow2
images/shaker-image-arm64.qcow2
images/shaker-image.qcow
images/ubuntu-14.04-server-cloudimg-amd64-disk1.img
images/ubuntu-14.04-server-cloudimg-arm64-uefi1.img
images/ubuntu-16.04-server-cloudimg-amd64-disk1.img
images/vyos-1.1.7.img
Testing healthcheck suite

Run healthcheck suite:

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file \
    -v $(pwd)/images:/home/opnfv/functest/images \
    opnfv/functest-healthcheck:gambia

Results shall be displayed as follows:

+----------------------------+------------------+---------------------+------------------+----------------+
|         TEST CASE          |     PROJECT      |         TIER        |     DURATION     |     RESULT     |
+----------------------------+------------------+---------------------+------------------+----------------+
|      connection_check      |     functest     |     healthcheck     |      00:09       |      PASS      |
|       tenantnetwork1       |     functest     |     healthcheck     |      00:14       |      PASS      |
|       tenantnetwork2       |     functest     |     healthcheck     |      00:11       |      PASS      |
|          vmready1          |     functest     |     healthcheck     |      00:19       |      PASS      |
|          vmready2          |     functest     |     healthcheck     |      00:16       |      PASS      |
|         singlevm1          |     functest     |     healthcheck     |      00:41       |      PASS      |
|         singlevm2          |     functest     |     healthcheck     |      00:36       |      PASS      |
|         vping_ssh          |     functest     |     healthcheck     |      00:46       |      PASS      |
|       vping_userdata       |     functest     |     healthcheck     |      00:41       |      PASS      |
|        cinder_test         |     functest     |     healthcheck     |      01:18       |      PASS      |
|         api_check          |     functest     |     healthcheck     |      10:33       |      PASS      |
|     snaps_health_check     |     functest     |     healthcheck     |      00:44       |      PASS      |
|            odl             |     functest     |     healthcheck     |      00:00       |      SKIP      |
+----------------------------+------------------+---------------------+------------------+----------------+

NOTE: the durations are indicative only and may vary depending on your SUT.

Testing smoke suite

Run smoke suite:

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file \
    -v $(pwd)/images:/home/opnfv/functest/images \
    opnfv/functest-smoke:gambia

Results shall be displayed as follows:

+------------------------------------+------------------+---------------+------------------+----------------+
|             TEST CASE              |     PROJECT      |      TIER     |     DURATION     |     RESULT     |
+------------------------------------+------------------+---------------+------------------+----------------+
|           tempest_smoke            |     functest     |     smoke     |      06:13       |      PASS      |
|     neutron-tempest-plugin-api     |     functest     |     smoke     |      09:32       |      PASS      |
|            rally_sanity            |     functest     |     smoke     |      29:34       |      PASS      |
|             rally_jobs             |     functest     |     smoke     |      24:02       |      PASS      |
|          refstack_defcore          |     functest     |     smoke     |      13:07       |      PASS      |
|              patrole               |     functest     |     smoke     |      05:17       |      PASS      |
|            snaps_smoke             |     functest     |     smoke     |      90:13       |      PASS      |
|           neutron_trunk            |     functest     |     smoke     |      00:00       |      SKIP      |
|         networking-bgpvpn          |     functest     |     smoke     |      00:00       |      SKIP      |
|           networking-sfc           |     functest     |     smoke     |      00:00       |      SKIP      |
|              barbican              |     functest     |     smoke     |      05:01       |      PASS      |
+------------------------------------+------------------+---------------+------------------+----------------+

Note: if the scenario does not support some tests, they are indicated as SKIP. See User guide for details.

Testing benchmarking suite

Run benchmarking suite:

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file \
    -v $(pwd)/images:/home/opnfv/functest/images \
    opnfv/functest-benchmarking:gambia

Results shall be displayed as follows:

+-------------------+------------------+----------------------+------------------+----------------+
|     TEST CASE     |     PROJECT      |         TIER         |     DURATION     |     RESULT     |
+-------------------+------------------+----------------------+------------------+----------------+
|        vmtp       |     functest     |     benchmarking     |      18:43       |      PASS      |
|       shaker      |     functest     |     benchmarking     |      29:45       |      PASS      |
+-------------------+------------------+----------------------+------------------+----------------+

Note: if the scenario does not support some tests, they are indicated as SKIP. See User guide for details.

Testing features suite

Run features suite:

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file \
    -v $(pwd)/images:/home/opnfv/functest/images \
    opnfv/functest-features:gambia

Results shall be displayed as follows:

+-----------------------------+------------------------+------------------+------------------+----------------+
|          TEST CASE          |        PROJECT         |       TIER       |     DURATION     |     RESULT     |
+-----------------------------+------------------------+------------------+------------------+----------------+
|     doctor-notification     |         doctor         |     features     |      00:00       |      SKIP      |
|            bgpvpn           |         sdnvpn         |     features     |      00:00       |      SKIP      |
|       functest-odl-sfc      |          sfc           |     features     |      00:00       |      SKIP      |
|      barometercollectd      |       barometer        |     features     |      00:00       |      SKIP      |
|             fds             |     fastdatastacks     |     features     |      00:00       |      SKIP      |
|             vgpu            |        functest        |     features     |      00:00       |      SKIP      |
|         stor4nfv_os         |        stor4nfv        |     features     |      00:00       |      SKIP      |
+-----------------------------+------------------------+------------------+------------------+----------------+

Note: if the scenario does not support some tests, they are indicated as SKIP. See User guide for details.

Testing components suite

Run components suite:

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file \
    -v $(pwd)/images:/home/opnfv/functest/images \
    opnfv/functest-components:gambia

Results shall be displayed as follows:

+--------------------------+------------------+--------------------+------------------+----------------+
|        TEST CASE         |     PROJECT      |        TIER        |     DURATION     |     RESULT     |
+--------------------------+------------------+--------------------+------------------+----------------+
|       tempest_full       |     functest     |     components     |      49:51       |      PASS      |
|     tempest_scenario     |     functest     |     components     |      18:50       |      PASS      |
|        rally_full        |     functest     |     components     |      167:13      |      PASS      |
+--------------------------+------------------+--------------------+------------------+----------------+
Testing vnf suite

Run vnf suite:

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file \
    -v $(pwd)/images:/home/opnfv/functest/images \
    opnfv/functest-vnf:gambia

Results shall be displayed as follows:

+----------------------+------------------+--------------+------------------+----------------+
|      TEST CASE       |     PROJECT      |     TIER     |     DURATION     |     RESULT     |
+----------------------+------------------+--------------+------------------+----------------+
|       cloudify       |     functest     |     vnf      |      04:05       |      PASS      |
|     cloudify_ims     |     functest     |     vnf      |      24:07       |      PASS      |
|       heat_ims       |     functest     |     vnf      |      18:15       |      PASS      |
|     vyos_vrouter     |     functest     |     vnf      |      15:48       |      PASS      |
|       juju_epc       |     functest     |     vnf      |      29:38       |      PASS      |
+----------------------+------------------+--------------+------------------+----------------+
Functest Dockers for Kubernetes deployment

Docker images are available on Docker Hub:

  • opnfv/functest-kubernetes-core
  • opnfv/functest-kubernetes-healthcheck
  • opnfv/functest-kubernetes-smoke
  • opnfv/functest-kubernetes-features
Preparing your environment

cat env:

DEPLOY_SCENARIO=k8s-XXX
Testing healthcheck suite

Run healthcheck suite:

sudo docker run -it --env-file env \
    -v $(pwd)/config:/root/.kube/config \
    opnfv/functest-kubernetes-healthcheck:gambia

A kubeconfig file named ‘config’ in the current directory is also required; as shown in the command above, it is volume-mapped to ~/.kube/config inside the container.

Results shall be displayed as follows:

+-------------------+------------------+---------------------+------------------+----------------+
|     TEST CASE     |     PROJECT      |         TIER        |     DURATION     |     RESULT     |
+-------------------+------------------+---------------------+------------------+----------------+
|     k8s_smoke     |     functest     |     healthcheck     |      02:27       |      PASS      |
+-------------------+------------------+---------------------+------------------+----------------+
Testing smoke suite

Run smoke suite:

sudo docker run -it --env-file env \
    -v $(pwd)/config:/root/.kube/config \
    opnfv/functest-kubernetes-smoke:gambia

Results shall be displayed as follows:

+-------------------------+------------------+---------------+------------------+----------------+
|        TEST CASE        |     PROJECT      |      TIER     |     DURATION     |     RESULT     |
+-------------------------+------------------+---------------+------------------+----------------+
|     k8s_conformance     |     functest     |     smoke     |      57:14       |      PASS      |
+-------------------------+------------------+---------------+------------------+----------------+
Testing features suite

Run features suite:

sudo docker run -it --env-file env \
    -v $(pwd)/config:/root/.kube/config \
    opnfv/functest-kubernetes-features:gambia

Results shall be displayed as follows:

+----------------------+------------------+------------------+------------------+----------------+
|      TEST CASE       |     PROJECT      |       TIER       |     DURATION     |     RESULT     |
+----------------------+------------------+------------------+------------------+----------------+
|     stor4nfv_k8s     |     stor4nfv     |     stor4nfv     |      00:00       |      SKIP      |
|      clover_k8s      |      clover      |      clover      |      00:00       |      SKIP      |
+----------------------+------------------+------------------+------------------+----------------+
Environment variables

Several environment variables may be specified:

  • INSTALLER_IP=<Specific IP Address>
  • DEPLOY_SCENARIO=<vim>-<controller>-<nfv_feature>-<ha_mode>
  • NAMESERVER=XXX # if not 8.8.8.8
  • VOLUME_DEVICE_NAME=XXX # if not vdb
  • EXTERNAL_NETWORK=XXX # if not first network with router:external=True
  • NEW_USER_ROLE=XXX # if not member

INSTALLER_IP is required by Barometer in order to access the installer node and the deployment.

The format for the DEPLOY_SCENARIO env variable can be described as follows:
  • vim: (os|k8s) = OpenStack or Kubernetes
  • controller is one of ( nosdn | odl )
  • nfv_feature is one or more of ( ovs | kvm | sfc | bgpvpn | nofeature )
  • ha_mode (high availability) is one of ( ha | noha )

If several features are pertinent then use the underscore character ‘_’ to separate each feature (e.g. ovs_kvm). ‘nofeature’ indicates that no OPNFV feature is deployed.
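As an illustration only (the parse_scenario helper below is ours, not part of Functest), a scenario name can be split into its four fields with standard shell tools:

```shell
# Hypothetical helper: split a DEPLOY_SCENARIO value into its fields.
# Field order follows the format above: <vim>-<controller>-<nfv_feature>-<ha_mode>
parse_scenario() {
    vim=$(echo "$1" | cut -d- -f1)
    controller=$(echo "$1" | cut -d- -f2)
    nfv_feature=$(echo "$1" | cut -d- -f3)
    ha_mode=$(echo "$1" | cut -d- -f4)
}

# Underscores separate combined features, so they survive the split:
parse_scenario "os-odl-ovs_kvm-noha"
echo "$vim $controller $nfv_feature $ha_mode"
# os odl ovs_kvm noha
```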

The list of supported scenarios per release/installer is indicated in the release note.

NOTE: The scenario name is mainly used to detect automatically whether a test suite is runnable (e.g. it prevents the ODL test suite from being run on ‘nosdn’ scenarios). If it is not set, Functest will try to run the default test cases, which might not include the SDN controller or a specific feature.

NOTE: An HA scenario means that 3 OpenStack controller nodes are deployed. It does not necessarily mean that the whole system is HA. See installer release notes for details.

Finally, three additional environment variables can also be passed in to the Functest Docker container, using the -e “<EnvironmentVariable>=<Value>” mechanism. The last two parameters are only relevant to Jenkins CI invoked testing and should not be used when performing manual test scenarios:

  • INSTALLER_TYPE=(apex|compass|daisy|fuel)
  • NODE_NAME=<Test POD Name>
  • BUILD_TAG=<Jenkins Build Tag>

where:

  • <Test POD Name> = Symbolic name of the POD where the tests are run.

    Visible in test results files, which are stored to the database. This option is only used when tests are activated under Jenkins CI control. It indicates the POD/hardware where the test has been run. If not specified, then the POD name is defined as “Unknown” by default. DO NOT USE THIS OPTION IN MANUAL TEST SCENARIOS.

  • <Jenkins Build tag> = Symbolic name of the Jenkins Build Job.

    Visible in test results files, which are stored to the database. This option is only set when tests are activated under Jenkins CI control. It enables the correlation of test results, which are independently pushed to the results database from different Jenkins jobs. DO NOT USE THIS OPTION IN MANUAL TEST SCENARIOS.

OpenStack credentials

OpenStack credentials are mandatory and must be provided to Functest. When running the command “functest env prepare”, the framework automatically looks for the OpenStack credentials file “/home/opnfv/functest/conf/env_file” and exits with an error if it is not present or is empty.

There are 2 ways to provide that file:

  • by using a Docker volume with the -v option when creating the Docker container. This is referred to in the Docker documentation as “Bind Mounting”. See the usage of this parameter in the following chapter.
  • or by manually creating the file ‘/home/opnfv/functest/conf/env_file’ inside the running container and pasting the credentials into it. Consult your installer guide for further details. This method is not covered in this document.

In a proxified environment you may need to change the credentials file. There are some tips in the chapter: Proxy support

SSL Support

If you need to connect to a server that is TLS-enabled (the auth URL begins with “https”) and it uses a certificate from a private CA or a self-signed certificate, you will need to provide the path to an appropriate CA certificate, used to validate the server certificate, via the environment variable OS_CACERT:

echo $OS_CACERT
/etc/ssl/certs/ca.crt

However, this certificate does not exist in the container by default. It has to be copied manually from the OpenStack deployment. This can be done in 2 ways:

  1. Create the file manually and copy the contents from the OpenStack controller.

  2. (Recommended) Add the file using a Docker volume when starting the container:

    -v <path_to_your_cert_file>:/etc/ssl/certs/ca.crt
    

You might need to export OS_CACERT environment variable inside the credentials file:

export OS_CACERT=/etc/ssl/certs/ca.crt
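Putting the volume mount and the variable together, the extra options to append to the docker run commands shown earlier can be sketched as follows (<path_to_your_cert_file> stays a placeholder for your local copy of the CA file):

```shell
# Sketch only: build the extra docker run options for TLS support.
# The in-container path matches the OS_CACERT value above.
CERT=/etc/ssl/certs/ca.crt
cert_opts="-v <path_to_your_cert_file>:$CERT -e OS_CACERT=$CERT"
echo "$cert_opts"
# -v <path_to_your_cert_file>:/etc/ssl/certs/ca.crt -e OS_CACERT=/etc/ssl/certs/ca.crt
```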

Certificate verification can be turned off using OS_INSECURE=true. For example, Fuel uses self-signed certificates by default, so a preliminary step would be:

export OS_INSECURE=true
Logs

By default all the logs are written to /home/opnfv/functest/results/functest.log. If you want more logs in the console, you may edit the logging.ini file manually. Connect to the container, then edit the file located at /usr/lib/python2.7/site-packages/xtesting/ci/logging.ini

Change wconsole to console in the desired module to get more traces.
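For example, the swap can be done with sed. The snippet below works on an illustrative copy (the two-line fragment is not the real file's content); inside the container you would target /usr/lib/python2.7/site-packages/xtesting/ci/logging.ini instead:

```shell
# Illustrative logging.ini fragment (not the real file's full content).
cat > /tmp/logging.ini <<'EOF'
[logger_functest]
handlers=wconsole
EOF

# Switch the handler from the file-only 'wconsole' to 'console'
# to get more traces on stdout.
sed -i 's/wconsole/console/' /tmp/logging.ini
grep handlers /tmp/logging.ini
# handlers=console
```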

Configuration

You may also directly modify the python code or the configuration file (e.g. testcases.yaml used to declare test constraints) under /usr/lib/python2.7/site-packages/xtesting and /usr/lib/python2.7/site-packages/functest

Tips
Docker

Typing exit at the container prompt exits the container and probably stops it. When a running Docker container is stopped, all changes are lost. There is a keyboard shortcut to quit the container without stopping it: <CTRL>-P + <CTRL>-Q. To reconnect to the running container, DO NOT use the run command again (it would create a new container); use the exec or attach command instead:

docker ps  # <check the container ID from the output>
docker exec -ti <CONTAINER_ID> /bin/bash

There are other useful Docker commands that might be needed to manage possible issues with the containers.

List the running containers:

docker ps

List all the containers including the stopped ones:

docker ps -a

Start a stopped container named “FunTest”:

docker start FunTest

Attach to a running container named “StrikeTwo”:

docker attach StrikeTwo

It is sometimes useful to remove a container if there are problems with it:

docker rm <CONTAINER_ID>

Use the -f option if the container is still running; it forces the container to be destroyed:

docker rm -f <CONTAINER_ID>

Check the Docker documentation [dockerdocs] for more information.

Checking OpenStack and credentials

It is recommended and fairly straightforward to check that OpenStack and the credentials are working as expected.

Once the credentials are inside the container, they should be sourced before running any OpenStack commands:

source /home/opnfv/functest/conf/env_file

After this, try to run any OpenStack command to see if you get any output, for instance:

openstack user list

This will return the list of users in the OpenStack deployment. If it does not, check that the credentials are sourced:

env|grep OS_

This command must show a set of environment variables starting with OS_, for example:

OS_REGION_NAME=RegionOne
OS_USER_DOMAIN_NAME=Default
OS_PROJECT_NAME=admin
OS_AUTH_VERSION=3
OS_IDENTITY_API_VERSION=3
OS_PASSWORD=da54c27ae0d10dfae5297e6f0d6be54ebdb9f58d0f9dfc
OS_AUTH_URL=http://10.1.0.9:5000/v3
OS_USERNAME=admin
OS_TENANT_NAME=admin
OS_ENDPOINT_TYPE=internalURL
OS_INTERFACE=internalURL
OS_NO_CACHE=1
OS_PROJECT_DOMAIN_NAME=Default

If the OpenStack command still does not show anything or complains about connectivity issues, it could be due to an incorrect url given to the OS_AUTH_URL environment variable. Check the deployment settings.
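To check exactly that, the host:port part can be pulled out of OS_AUTH_URL with plain shell and probed with curl or nc (the extraction below is our own helper, not a Functest command):

```shell
# Extract host:port from OS_AUTH_URL; the value is the example shown above.
OS_AUTH_URL="http://10.1.0.9:5000/v3"
endpoint=$(echo "$OS_AUTH_URL" | sed 's|^[a-z]*://||; s|/.*$||')
echo "$endpoint"
# 10.1.0.9:5000
```

If probing that endpoint (e.g. with curl) gets no answer, fix the deployment settings before re-running any suite.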

Proxy support

If your Jumphost node operates behind an HTTP proxy, then there are 2 places where special actions may be needed for operations to succeed:

  1. Initial installation of the Docker engine. First, try following the official Docker documentation for proxy settings. Some issues were experienced on CentOS 7 based Jumphosts; tips are documented in the section Docker Installation on CentOS behind http proxy below.

If that is the case, make sure the resolv.conf and the needed http_proxy and https_proxy environment variables, as well as the ‘no_proxy’ environment variable are set correctly:

# Make double sure that the 'no_proxy=...' line in the
# 'env_file' file is commented out first. Otherwise, the
# values set into the 'no_proxy' environment variable below will
# be overwritten each time the command
# 'source ~/functest/conf/env_file' is issued.

cd ~/functest/conf/
sed -i 's/export no_proxy/#export no_proxy/' env_file
source ./env_file

# Next calculate some IP addresses for which http_proxy
# usage should be excluded:

publicURL_IP=$(echo $OS_AUTH_URL | grep -Eo "([0-9]+\.){3}[0-9]+")

adminURL_IP=$(openstack catalog show identity | \
grep adminURL | grep -Eo "([0-9]+\.){3}[0-9]+")

export http_proxy="<your http proxy settings>"
export https_proxy="<your https proxy settings>"
export no_proxy="127.0.0.1,localhost,$publicURL_IP,$adminURL_IP"

# Ensure that "git" uses the http_proxy
# This may be needed if your firewall forbids SSL based git fetch
git config --global http.sslVerify True
git config --global http.proxy <Your http proxy settings>

For example, try to use the nc command from inside the functest docker container:

nc -v opnfv.org 80
Connection to opnfv.org 80 port [tcp/http] succeeded!

nc -v opnfv.org 443
Connection to opnfv.org 443 port [tcp/https] succeeded!

Note: On a Jumphost node based on a CentOS family OS, the nc commands might not work. You can use the curl command instead.

curl http://www.opnfv.org:80

<HTML><HEAD><meta http-equiv="content-type" . . </BODY></HTML>

curl https://www.opnfv.org:443

<HTML><HEAD><meta http-equiv="content-type" . . </BODY></HTML>

(Ignore the content. If the command returns a valid HTML page, the connection works.)

Docker Installation on CentOS behind http proxy

This section applies to a CentOS family OS on a Jumphost which is itself behind a proxy server. In that case, follow the instructions below before installing the Docker engine:

1) # Make a directory '/etc/systemd/system/docker.service.d'
   # if it does not exist
   sudo mkdir /etc/systemd/system/docker.service.d

2) # Create a file called 'env.conf' in that directory with
   # the following contents:
   [Service]
   EnvironmentFile=-/etc/sysconfig/docker

3) # Set up a file called 'docker' in directory '/etc/sysconfig'
   # with the following contents:
   HTTP_PROXY="<Your http proxy settings>"
   HTTPS_PROXY="<Your https proxy settings>"
   http_proxy="${HTTP_PROXY}"
   https_proxy="${HTTPS_PROXY}"

4) # Reload the daemon
   systemctl daemon-reload

5) # Sanity check - check the following docker settings:
   systemctl show docker | grep -i env

   Expected result:
   ----------------
   EnvironmentFile=/etc/sysconfig/docker (ignore_errors=yes)
   DropInPaths=/etc/systemd/system/docker.service.d/env.conf

Now follow the instructions in [Install Docker on CentOS] to download and install the docker-engine. The instructions conclude with a “test pull” of a sample “Hello World” docker container. This should now work with the above pre-requisite actions.

Integration in CI

In CI we use the Docker images and execute the appropriate commands within the container from Jenkins.

The following steps have been defined:
  • functest-cleanup: clean existing functest dockers on the jumphost
  • functest-daily: run dockers opnfv/functest-* (healthcheck, smoke, features, vnf)
  • functest-store-results: push logs to artifacts

See [2] for details.

References

[1] : Keystone and public end point constraint

[2] : Functest Jenkins jobs

[3] : OPNFV main site

[4] : Functest wiki page

IRC support channel: #opnfv-functest

Functest User Guide
Introduction

The goal of this document is to describe the OPNFV Functest test cases and to provide a procedure to execute them.

IMPORTANT: It is assumed here that Functest has been properly deployed following the installation guide procedure Functest Installation Guide.

Overview of the Functest suites

Functest is the OPNFV project primarily targeting functional testing. In the Continuous Integration pipeline, it is launched after an OPNFV fresh installation to validate and verify the basic functions of the infrastructure.

The current list of test suites can be distributed over 5 main domains:
  • VIM (Virtualised Infrastructure Manager)
  • Controllers (i.e. SDN Controllers)
  • Features
  • VNF (Virtual Network Functions)
  • Kubernetes

Functest test suites are also distributed in the OPNFV testing categories: healthcheck, smoke, features, components, performance, VNF, Stress tests.

All the healthcheck and smoke tests of a given scenario must be successful to validate the scenario for the release.

The test cases, grouped by domain and tier, are the following:

VIM
  healthcheck:
    • connection_check: Check OpenStack connectivity through the SNAPS framework
    • api_check: Check the OpenStack API through the SNAPS framework
    • snaps_health_check: Basic instance creation, check DHCP
  smoke:
    • vping_ssh: NFV “Hello World” using an SSH connection to a destination VM over a created floating IP address on the SUT Public/External network. Using the SSH connection, a test script is copied to the destination VM and executed via SSH. The script pings another VM on a specified IP address over the SUT Private Tenant network
    • vping_userdata: Uses ping with given userdata to test intra-VM connectivity over the SUT Private Tenant network. The correct operation of the Nova metadata service is also verified in this test
    • tempest_smoke: Generate and run a relevant Tempest test suite in smoke mode. The generated test set depends on the OpenStack deployment environment
    • rally_sanity: Run a subset of the OpenStack Rally test suite in smoke mode
    • snaps_smoke: Run the SNAPS-OO integration tests
    • refstack_defcore: Reference RefStack suite, a Tempest selection for NFV
    • patrole: Patrole is a Tempest plugin for testing and verifying RBAC policy enforcement. It covers the following OpenStack services: Nova, Neutron, Glance, Cinder and Keystone
    • neutron_trunk: The Neutron trunk port testcases, supported by the Apex, Fuel and Compass installers
  components:
    • tempest_full_parallel: Generate and run a full set of the OpenStack Tempest test suite. See the OpenStack reference test suite [2]. The generated test set depends on the OpenStack deployment environment
    • rally_full: Run the OpenStack testing tool benchmarking the OpenStack modules. See the Rally documents [3]

Controllers
  smoke:
    • odl: OpenDaylight test suite, a limited suite checking the basic Neutron (Layer 2) operations, mainly based on upstream testcases. See below for details

Features
  features:
    • bgpvpn: Implementation of the OpenStack bgpvpn API from the SDNVPN feature project. It allows the creation of BGP VPNs. See the SDNVPN User Guide for details
    • doctor: The Doctor platform, as of the Colorado release, provides three features: immediate notification, consistent resource state awareness for compute host down, and valid compute host status given to the VM owner. See the Doctor User Guide for details
    • odl-sfc: SFC testing for odl scenarios. See the SFC User Guide for details
    • parser: Parser is an integration project which aims to provide placement/deployment template translation for the OPNFV platform, including TOSCA -> HOT, POLICY -> TOSCA and YANG -> TOSCA. It deals with a fake vRNC. See the Parser User Guide for details
    • fds: Test suite for the OpenDaylight SDN Controller when the GBP features are installed. It integrates some upstream test suites using Robot as the test framework

VNF
  vnf:
    • cloudify_ims: Example of a real VNF deployment to show the NFV capabilities of the platform. The IP Multimedia Subsystem is a typical Telco test case, referenced by ETSI. It provides a fully functional VoIP system
    • vyos_vrouter: vRouter testing
    • juju_epc: Validates deployment of a complex mobility VNF on the OPNFV platform. Uses Juju for deploying the OAI EPC and ABot for defining test scenarios in a high-level DSL. The VNF tests reference 3GPP Technical Specs and are executed through protocol drivers provided by ABot

Kubernetes
  healthcheck:
    • k8s_smoke: Test a running Kubernetes cluster and ensure it satisfies minimal functional requirements
  smoke:
    • k8s_conformance: Run a subset of Kubernetes End-to-End tests, expected to pass on any Kubernetes cluster
  stor4nfv:
    • stor4nfv_k8s: Run tests necessary to demonstrate conformance of the K8s+Stor4NFV deployment
  clover:
    • clover_k8s: Test functionality of the K8s+Istio+Clover deployment

As shown above, Functest is structured into different ‘domains’, ‘tiers’ and ‘test cases’. Each ‘test case’ usually represents an actual test suite comprised, in turn, of several internal test cases.

Test cases also have an implicit execution order. For example, if the early ‘healthcheck’ Tier testcase fails, or if there are any failures in the ‘smoke’ Tier testcases, there is little point in launching a full testcase execution round.
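That implicit ordering can be pictured with a small shell sketch (run_tier is a stand-in stub, not actual Functest code):

```shell
# Stub standing in for a real tier execution; here 'smoke' is made to fail.
run_tier() {
    [ "$1" != "smoke" ]
}

executed=""
for tier in healthcheck smoke components vnf; do
    if run_tier "$tier"; then
        executed="$executed $tier"
    else
        echo "$tier failed, skipping the remaining tiers"
        break
    fi
done
# healthcheck runs, smoke fails, components and vnf are never launched
```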

In Danube, the smoke and SDN controller tiers were merged into the smoke tier.

An overview of the Functest Structural Concept is depicted graphically below:

Functest Concepts Structure

Some of the test cases are developed by Functest team members, whereas others are integrated from upstream communities or other OPNFV projects. For example, Tempest is the OpenStack integration test suite and Functest is in charge of the selection, integration and automation of those tests that fit suitably to OPNFV.

The Tempest test suite is the default OpenStack smoke test suite but no new test cases have been created in OPNFV Functest.

The results produced by the tests run from CI are pushed and collected into a NoSQL database. The goal is to populate the database with results from different sources and scenarios and to show them on a Functest Dashboard. A screenshot of a live Functest Dashboard is shown below:

Functest Dashboard

Basic components (VIM, SDN controllers) are tested through their own suites. Feature projects also provide their own test suites with different ways of running their tests.

The notion of domain has been introduced in the description of the test cases stored in the database. This parameter, as well as possible tags, can be used for the test case catalog.

The vIMS test case was integrated to demonstrate the capability to deploy a relatively complex NFV scenario on top of the OPNFV infrastructure.

Functest considers OPNFV as a black box. OPNFV offers a lot of potential combinations (which may change from one version to another):

  • 3 controllers (OpenDaylight, ONOS, OpenContrail)
  • 5 installers (Apex, Compass, Daisy, Fuel, Joid)

Most of the tests are runnable by any combination, but some tests might have restrictions imposed by the utilized installers or due to the available deployed features. The system uses the environment variables (INSTALLER_TYPE and DEPLOY_SCENARIO) to automatically determine the valid test cases, for each given environment.
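A minimal sketch of that gating (the case statement is illustrative; the actual constraints are declared in testcases.yaml, mentioned earlier):

```shell
# Decide whether the ODL suite is runnable for a given scenario name.
DEPLOY_SCENARIO="os-nosdn-nofeature-ha"
case "$DEPLOY_SCENARIO" in
    *-odl*) odl_status="RUN" ;;
    *)      odl_status="SKIP" ;;
esac
echo "odl: $odl_status"
# odl: SKIP
```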

A convenient Functest CLI utility is also available to simplify setting up the Functest environment, managing the OpenStack environment (e.g. resource clean-up) and executing tests. The Functest CLI organises the testcases into logical Tiers, each of which contains one or more testcases. The CLI allows execution of a single specified testcase, of all testcases in a specified Tier, or of ALL testcases. The Functest CLI is introduced in more detail in the next section.

The different test cases are described in the remaining sections of this document.

VIM (Virtualized Infrastructure Manager)
Healthcheck tests

Since Danube, healthcheck tests have been refactored and rely on SNAPS, an OPNFV middleware project.

SNAPS stands for “SDN/NFV Application development Platform and Stack”. SNAPS is an object-oriented OpenStack library packaged with tests that exercise OpenStack. More information on SNAPS can be found in  [13]

Three tests are declared as healthcheck tests and can be used for gating by the installer; together they functionally cover the tests previously performed by the healthcheck test case.

The tests are:

  • connection_check
  • api_check
  • snaps_health_check

Connection_check consists of 9 test cases (test duration < 5s) checking the connectivity with Glance, Keystone, Neutron, Nova and the external network.

Api_check verifies the retrieval of the OpenStack clients (Keystone, Glance, Neutron and Nova) and may perform some simple queries. When the config value snaps.use_keystone is True, Functest must have access to the cloud’s private network. This suite consists of 49 tests (test duration < 2 minutes).

Snaps_health_check creates a VM with a single port whose IPv4 address is assigned by DHCP, then validates the expected IP address against the actual one.

The flavors for the SNAPS test cases can be configured with new metadata values as well as new values for the basic flavor elements (i.e. ram, vcpu, disk, ephemeral, swap, etc.). The snaps.flavor_extra_specs dict in the config_functest.yaml file can be used for this purpose.

Obviously, successful completion of the ‘healthcheck’ testcase is a necessary prerequisite for the execution of all other test Tiers.

vPing_ssh

Given the script ping.sh:

#!/bin/sh
ping -c 1 $1 2>&1 >/dev/null
RES=$?
if [ "Z$RES" = "Z0" ] ; then
    echo 'vPing OK'
else
    echo 'vPing KO'
fi

The goal of this test is to establish an SSH connection using a floating IP on the Public/External network and verify that 2 instances can talk over a Private Tenant network:

vPing_ssh test case
+-------------+                    +-------------+
|             |                    |             |
|             | Boot VM1 with IP1  |             |
|             +------------------->|             |
|   Tester    |                    |   System    |
|             | Boot VM2           |    Under    |
|             +------------------->|     Test    |
|             |                    |             |
|             | Create floating IP |             |
|             +------------------->|             |
|             |                    |             |
|             | Assign floating IP |             |
|             | to VM2             |             |
|             +------------------->|             |
|             |                    |             |
|             | Establish SSH      |             |
|             | connection to VM2  |             |
|             | through floating IP|             |
|             +------------------->|             |
|             |                    |             |
|             | SCP ping.sh to VM2 |             |
|             +------------------->|             |
|             |                    |             |
|             | VM2 executes       |             |
|             | ping.sh to VM1     |             |
|             +------------------->|             |
|             |                    |             |
|             |    If ping:        |             |
|             |      exit OK       |             |
|             |    else (timeout): |             |
|             |      exit Failed   |             |
|             |                    |             |
+-------------+                    +-------------+

This test can be considered a “Hello World” example. It is the first basic use case, which must work on any deployment.

vPing_userdata

This test case is similar to vPing_ssh but does not use Floating IPs or the Public/External network to transfer the ping script. Instead, it uses the Nova metadata service to pass the script to the instance at boot time. Like vPing_ssh, it checks that 2 instances can talk to each other over a Private Tenant network:

vPing_userdata test case
+-------------+                     +-------------+
|             |                     |             |
|             | Boot VM1 with IP1   |             |
|             +-------------------->|             |
|             |                     |             |
|             | Boot VM2 with       |             |
|             | ping.sh as userdata |             |
|             | with IP1 as $1.     |             |
|             +-------------------->|             |
|   Tester    |                     |   System    |
|             | VM2 executes ping.sh|    Under    |
|             | (ping IP1)          |     Test    |
|             +-------------------->|             |
|             |                     |             |
|             | Monitor nova        |             |
|             |  console-log VM 2   |             |
|             |    If ping:         |             |
|             |      exit OK        |             |
|             |    else (timeout)   |             |
|             |      exit Failed    |             |
|             |                     |             |
+-------------+                     +-------------+

When the second VM boots, it automatically executes the script passed as userdata. The ping is detected by periodically capturing the output of the console-log of the second VM.
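
The polling step can be sketched as follows (a minimal illustration, not the actual Functest implementation; `get_console_log` is a hypothetical callable, e.g. a wrapper around `nova console-log <VM2>`):

```python
import time

def wait_for_vping(get_console_log, timeout=300, interval=10):
    """Poll a VM console log until 'vPing OK' (or 'vPing KO') appears.

    get_console_log: callable returning the current console output as a
    string (hypothetical helper for this sketch).
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        log = get_console_log()
        if "vPing OK" in log:
            return True   # the userdata script reached VM1
        if "vPing KO" in log:
            return False  # ping.sh ran inside VM2 but the ping failed
        time.sleep(interval)
    return False          # timeout: the marker never appeared
```

The markers looked for are exactly the strings echoed by ping.sh above.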

Tempest

Tempest [2] is the reference OpenStack Integration test suite. It is a set of integration tests to be run against a live OpenStack cluster. Tempest has suites of tests for:

  • OpenStack API validation
  • Scenarios
  • Other specific tests useful in validating an OpenStack deployment

Functest uses Rally [3] to run the Tempest suite. Rally automatically generates the Tempest configuration file tempest.conf. Before running the actual test cases, Functest creates the needed resources (user, tenant) and updates the appropriate parameters in the configuration file.

When the Tempest suite is executed, each test duration is measured and the full console output is stored to a log file for further analysis.

The Tempest testcases are distributed across the Tiers as follows:

  • Smoke Tier - Test Case ‘tempest_smoke’
  • Components Tier - Test case ‘tempest_full’
  • Neutron Trunk Port - Test case ‘neutron_trunk’
  • OpenStack interop testcases - Test case ‘refstack_defcore’
  • Testing and verifying RBAC policy enforcement - Test case ‘patrole’

NOTE: Test case ‘tempest_smoke’ executes a defined set of tempest smoke tests. Test case ‘tempest_full’ executes all defined Tempest tests.

NOTE: The ‘neutron_trunk’ test set allows connecting a VM to multiple VLAN-separated networks using a single NIC. The neutron trunk ports feature is supported by Apex, Fuel and Compass, so the corresponding tempest testcases have been integrated normally.

NOTE: Rally is also used to run the OpenStack Interop testcases [9], which focus on testing interoperability between OpenStack clouds.

NOTE: Patrole is a Tempest plugin for testing and verifying RBAC policy enforcement. It runs Tempest-based API tests using specified RBAC roles, thus allowing deployments to verify that only the intended roles have access to those APIs. Patrole currently offers testing for the following OpenStack services: Nova, Neutron, Glance, Cinder and Keystone. In Functest, currently only Neutron and Glance are tested.

The goal of the Tempest test suite is to check the basic functionalities of the different OpenStack components on an OPNFV fresh installation, using the corresponding REST API interfaces.

Rally bench test suites

Rally [3] is a benchmarking tool that answers the question:

How does OpenStack work at scale?

The goal of this test suite is to benchmark all the different OpenStack modules and get significant figures that could help to define Telco Cloud KPIs.

The OPNFV Rally scenarios are based on the upstream collection of Rally scenarios:

  • authenticate
  • cinder
  • glance
  • heat
  • keystone
  • neutron
  • nova
  • quotas

A basic SLA (stop test on errors) has been implemented.
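
For illustration, a Rally task with such an SLA typically looks as follows (the scenario name, arguments and runner values below are examples, not the exact OPNFV task files):

```yaml
NovaServers.boot_and_delete_server:
  - args:
      flavor:
        name: m1.tiny          # illustrative flavor
      image:
        name: cirros           # illustrative image
    runner:
      type: constant
      times: 10
      concurrency: 2
    sla:
      failure_rate:
        max: 0                 # any failed iteration makes the task fail
```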

The Rally testcases are distributed across two Tiers:

  • Smoke Tier - Test Case ‘rally_sanity’
  • Components Tier - Test case ‘rally_full’

NOTE: Test case ‘rally_sanity’ executes a limited number of Rally smoke test cases. Test case ‘rally_full’ executes the full defined set of Rally tests.

snaps_smoke

This test case contains tests that set up and destroy environments with VMs, with and without Floating IPs, using a newly created user and project. Set the config value snaps.use_floating_ips (True|False) to toggle this functionality. Please note that when the configuration value of snaps.use_keystone is True, Functest must have access to the cloud’s private network. This suite consists of 120 tests (test duration ~50 minutes).

The flavors for the SNAPS test cases can be customized by supplying new metadata values as well as new values for the basic flavor attributes (i.e. ram, vcpu, disk, ephemeral, swap, etc.). The snaps.flavor_extra_specs dict in the config_functest.yaml file can be used for this purpose.

SDN Controllers
OpenDaylight

The OpenDaylight (ODL) test suite consists of a set of basic tests inherited from the ODL project using the Robot [11] framework. The suite verifies creation and deletion of networks, subnets and ports with OpenDaylight and Neutron.

The list of tests can be described as follows:

  • Basic Restconf test cases
    • Connect to Restconf URL
    • Check the HTTP code status
  • Neutron Reachability test cases
    • Get the complete list of neutron resources (networks, subnets, ports)
  • Neutron Network test cases
    • Check OpenStack networks
    • Check OpenDaylight networks
    • Create a new network via OpenStack and check the HTTP status code returned by Neutron
    • Check that the network has also been successfully created in OpenDaylight
  • Neutron Subnet test cases
    • Check OpenStack subnets
    • Check OpenDaylight subnets
    • Create a new subnet via OpenStack and check the HTTP status code returned by Neutron
    • Check that the subnet has also been successfully created in OpenDaylight
  • Neutron Port test cases
    • Check OpenStack Neutron for known ports
    • Check OpenDaylight ports
    • Create a new port via OpenStack and check the HTTP status code returned by Neutron
    • Check that the new port has also been successfully created in OpenDaylight
  • Delete operations
    • Delete the port previously created via OpenStack
    • Check that the port has been also successfully deleted in OpenDaylight
    • Delete the subnet previously created via OpenStack
    • Check that the subnet has also been successfully deleted in OpenDaylight
    • Delete the network created via OpenStack
    • Check that the network has also been successfully deleted in OpenDaylight

Note: the checks in OpenDaylight are based on the HTTP status codes returned by OpenDaylight.

Features

Functest has been supporting several feature projects since Brahmaputra:

Test             Brahma   Colorado   Danube   Euphrates   Fraser
barometer                            X        X           X
bgpvpn                    X          X        X           X
copper                    X
doctor           X        X          X        X           X
domino                    X          X        X
fds                                  X        X           X
moon                      X
multisite                 X          X
netready                             X
odl_sfc                   X          X        X           X
opera                                X
orchestra                            X        X           X
parser                               X        X           X
promise          X        X          X        X           X
security_scan             X          X
clover                                                    X
stor4nfv                                                  X

Please refer to the dedicated feature user guides for details.

VNF
cloudify_ims

The IP Multimedia Subsystem or IP Multimedia Core Network Subsystem (IMS) is an architectural framework for delivering IP multimedia services.

vIMS has been integrated in Functest to demonstrate the capability to deploy a relatively complex NFV scenario on the OPNFV platform. Deploying a complete functional VNF allows testing most of the essential functions needed for an NFV platform.

The goal of this test suite consists of:

  • deploy a VNF orchestrator (Cloudify)
  • deploy a Clearwater vIMS (IP Multimedia Subsystem) VNF from this orchestrator based on a TOSCA blueprint defined in [5]
  • run a suite of signaling tests on top of this VNF

The Clearwater architecture is described as follows:

vIMS architecture
heat_ims

The IP Multimedia Subsystem or IP Multimedia Core Network Subsystem (IMS) is an architectural framework for delivering IP multimedia services.

vIMS has been integrated in Functest to demonstrate the capability to deploy a relatively complex NFV scenario on the OPNFV platform. Deploying a complete functional VNF allows testing most of the essential functions needed for an NFV platform.

The goal of this test suite consists of:

  • deploy a Clearwater vIMS (IP Multimedia Subsystem) VNF using the OpenStack Heat orchestrator, based on a HOT template defined in [17]
  • run a suite of signaling tests on top of this VNF

The Clearwater architecture is described as follows:

vIMS architecture
vyos-vrouter

This test case deals with the deployment and testing of the VyOS vRouter with the Cloudify orchestrator. The test case can verify the interoperability of the BGP protocol using VyOS.

The Workflow is as follows:
  • Deploy

    Deploy VNF Testing topology by Cloudify using blueprint.

  • Configuration

    Set the configuration on the target VNF and the reference VNF using SSH.

  • Run

    Execute the test commands for each test item defined in a YAML-format file. Check the VNF status and behavior.

  • Reporting

    Output a report of the results in JSON format.

The vyos-vrouter architecture is described in [14].
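
A purely hypothetical sketch of such a YAML test-item file is shown below (all key names and values are illustrative assumptions, not the actual Functest format; see [14] for details):

```yaml
# Hypothetical test-item file for the BGP checks (illustrative only).
test_items:
  - name: check_bgp_neighbor
    target: target_vnf                # VNF reached over SSH
    command: show ip bgp summary      # VyOS operational command
    expected: Established             # neighbor state to look for
```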

juju_epc

The Evolved Packet Core (EPC) is the main component of the System Architecture Evolution (SAE) which forms the core of the 3GPP LTE specification.

vEPC has been integrated in Functest to demonstrate the capability to deploy a complex mobility-specific NFV scenario on the OPNFV platform. The OAI EPC supports most of the essential functions defined by the 3GPP Technical Specs; hence the successful execution of functional tests on the OAI EPC provides a good endorsement of the underlying NFV platform.

This integration also includes ABot, a Test Orchestration system that enables test scenarios to be defined in high-level DSL. ABot is also deployed as a VM on the OPNFV platform; and this provides an example of the automation driver and the Test VNF being both deployed as separate VNFs on the underlying OPNFV platform.

The Workflow is as follows:
  • Deploy Orchestrator

    Deploy Juju controller using Bootstrap command.

  • Deploy VNF

    Deploy ABot orchestrator and OAI EPC as Juju charms. Configuration of ABot and OAI EPC components is handled through built-in Juju relations.

  • Test VNF

    Execution of ABot feature files triggered by Juju actions. This executes a suite of LTE signalling tests on the OAI EPC.

  • Reporting

    ABot test results are parsed accordingly and pushed to the Functest DB.

Details of the ABot test orchestration tool may be found in [15].

Kubernetes (K8s)

Kubernetes testing relies on sets of tests, which are part of the Kubernetes source tree, such as the Kubernetes End-to-End (e2e) tests [16].

The kubernetes testcases are distributed across various Tiers:

  • Healthcheck Tier
    • k8s_smoke Test Case: Creates a Guestbook application comprising a Redis master, 2 Redis slave instances, a frontend application, a frontend service, a redis-master service and a redis-slave service. Using the frontend service, the test writes an entry into the guestbook application, which stores the entry in the backend Redis database. The application flow MUST work as expected and the data written MUST be available to read.
  • Smoke Tier
    • k8s_conformance Test Case: Runs a series of k8s e2e tests expected to pass on any Kubernetes cluster. The subset of tests necessary to demonstrate conformance grows with each release. Conformance is thus considered versioned, with backwards-compatibility guarantees, and the tests are designed to be run with no cloud provider configured.
Executing Functest suites

As mentioned in the Functest Installation Guide, Alpine Docker containers have been introduced in Euphrates, and one container has been created per tier. Assuming that you have pulled the container and your environment is ready, you can simply run a tier by typing (e.g. with functest-healthcheck):

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file  \
    -v $(pwd)/images:/home/opnfv/functest/images  \
    opnfv/functest-healthcheck

You should get:

+----------------------------+------------------+---------------------+------------------+----------------+
|         TEST CASE          |     PROJECT      |         TIER        |     DURATION     |     RESULT     |
+----------------------------+------------------+---------------------+------------------+----------------+
|      connection_check      |     functest     |     healthcheck     |      00:02       |      PASS      |
|         api_check          |     functest     |     healthcheck     |      03:19       |      PASS      |
|     snaps_health_check     |     functest     |     healthcheck     |      00:46       |      PASS      |
+----------------------------+------------------+---------------------+------------------+----------------+

You can run functest-healthcheck, functest-smoke, functest-features, functest-components and functest-vnf.

The result table shows the results per test case; each result can be:

* PASS
* FAIL
* SKIP: if the scenario/installer does not support the test case
Manual run

If you want to run the tests step by step, you may add Docker options to open an interactive shell, then run the different commands within the container.

Considering the healthcheck example, running Functest manually means:

sudo docker run -ti --env-file env \
  -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file  \
  -v $(pwd)/images:/home/opnfv/functest/images  \
  opnfv/functest-healthcheck /bin/bash

The container prompt shall be returned. Then, within the container, run the following commands:

$ source /home/opnfv/functest/conf/env_file
Tier

Each Alpine container provided on Docker Hub corresponds to a tier. The following commands are available:

# functest tier list
  - 0. healthcheck:
  ['connection_check', 'api_check', 'snaps_health_check']
# functest tier show healthcheck
+---------------------+---------------+--------------------------+-------------------------------------------------+------------------------------------+
|        TIERS        |     ORDER     |         CI LOOP          |                   DESCRIPTION                   |             TESTCASES              |
+---------------------+---------------+--------------------------+-------------------------------------------------+------------------------------------+
|     healthcheck     |       0       |     (daily)|(weekly)     |     First tier to be executed to verify the     |     connection_check api_check     |
|                     |               |                          |           basic operations in the VIM.          |         snaps_health_check         |
+---------------------+---------------+--------------------------+-------------------------------------------------+------------------------------------+

To run all the cases of the tier, type:

# functest tier run healthcheck
Testcase

Testcases can be listed, shown and run through the CLI:

# functest testcase list
 connection_check
 api_check
 snaps_health_check
# functest testcase show api_check
+-------------------+--------------------------------------------------+------------------+---------------------------+
|     TEST CASE     |                   DESCRIPTION                    |     CRITERIA     |         DEPENDENCY        |
+-------------------+--------------------------------------------------+------------------+---------------------------+
|     api_check     |     This test case verifies the retrieval of     |       100        |       ^((?!lxd).)*$       |
|                   |       OpenStack clients: Keystone, Glance,       |                  |                           |
|                   |      Neutron and Nova and may perform some       |                  |                           |
|                   |     simple queries. When the config value of     |                  |                           |
|                   |       snaps.use_keystone is True, functest       |                  |                           |
|                   |     must have access to the cloud's private      |                  |                           |
|                   |                     network.                     |                  |                           |
+-------------------+--------------------------------------------------+------------------+---------------------------+
# functest testcase run connection_check
...
# functest run all

You can also type run_tests -t all to run all the tests.

Note that the list of test cases depends on the installer and the scenario.
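
This dependency is expressed as a regular expression matched against the deployment scenario name, as shown in the DEPENDENCY column of `functest testcase show` earlier. For instance, the api_check dependency `^((?!lxd).)*$` is a negative lookahead that only matches scenarios not containing `lxd`. A small illustration, assuming the scenario name is available as a plain string:

```python
import re

# Dependency regex from the api_check example: matches any scenario
# name that does NOT contain the substring "lxd".
DEPENDENCY = r"^((?!lxd).)*$"

def is_runnable(scenario):
    """Return True if the dependency allows running on this scenario."""
    return re.match(DEPENDENCY, scenario) is not None

print(is_runnable("os-odl_l2-nofeature-ha"))  # True
print(is_runnable("os-nosdn-lxd-noha"))       # False
```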

Note that the flavors for the SNAPS test cases can be customized by supplying new metadata values as well as new values for the basic flavor attributes (i.e. ram, vcpu, disk, ephemeral, swap, etc.). The snaps.flavor_extra_specs dict in the config_functest.yaml file can be used for this purpose.

Reporting results to the test Database

In OPNFV, all the results from CI are collected. A test API is available, as well as a test database [16].

Test results
Manual testing

In manual mode test results are displayed in the console and result files are put in /home/opnfv/functest/results.

If you want additional logs, you may configure the logging.ini under /usr/lib/python2.7/site-packages/xtesting/ci.

Automated testing

In automated mode, tests are run within split Alpine containers, and test results are displayed in the Jenkins logs. The result summary is provided at the end of each suite and can be described as follows.

Healthcheck suite:

+----------------------------+------------------+---------------------+------------------+----------------+
|         TEST CASE          |     PROJECT      |         TIER        |     DURATION     |     RESULT     |
+----------------------------+------------------+---------------------+------------------+----------------+
|      connection_check      |     functest     |     healthcheck     |      00:07       |      PASS      |
|         api_check          |     functest     |     healthcheck     |      07:46       |      PASS      |
|     snaps_health_check     |     functest     |     healthcheck     |      00:36       |      PASS      |
+----------------------------+------------------+---------------------+------------------+----------------+

Smoke suite:

+------------------------------+------------------+---------------+------------------+----------------+
|          TEST CASE           |     PROJECT      |      TIER     |     DURATION     |     RESULT     |
+------------------------------+------------------+---------------+------------------+----------------+
|          vping_ssh           |     functest     |     smoke     |      00:57       |      PASS      |
|        vping_userdata        |     functest     |     smoke     |      00:33       |      PASS      |
|     tempest_smoke_serial     |     functest     |     smoke     |      13:22       |      PASS      |
|         rally_sanity         |     functest     |     smoke     |      24:07       |      PASS      |
|       refstack_defcore       |     functest     |     smoke     |      05:21       |      PASS      |
|           patrole            |     functest     |     smoke     |      04:29       |      PASS      |
|         snaps_smoke          |     functest     |     smoke     |      46:54       |      PASS      |
|             odl              |     functest     |     smoke     |      00:00       |      SKIP      |
|        neutron_trunk         |     functest     |     smoke     |      00:00       |      SKIP      |
+------------------------------+------------------+---------------+------------------+----------------+

Features suite:

+-----------------------------+------------------------+------------------+------------------+----------------+
|          TEST CASE          |        PROJECT         |       TIER       |     DURATION     |     RESULT     |
+-----------------------------+------------------------+------------------+------------------+----------------+
|     doctor-notification     |         doctor         |     features     |      00:00       |      SKIP      |
|            bgpvpn           |         sdnvpn         |     features     |      00:00       |      SKIP      |
|       functest-odl-sfc      |          sfc           |     features     |      00:00       |      SKIP      |
|      barometercollectd      |       barometer        |     features     |      00:00       |      SKIP      |
|             fds             |     fastdatastacks     |     features     |      00:00       |      SKIP      |
+-----------------------------+------------------------+------------------+------------------+----------------+

Components suite:

+-------------------------------+------------------+--------------------+------------------+----------------+
|           TEST CASE           |     PROJECT      |        TIER        |     DURATION     |     RESULT     |
+-------------------------------+------------------+--------------------+------------------+----------------+
|     tempest_full_parallel     |     functest     |     components     |      48:28       |      PASS      |
|           rally_full          |     functest     |     components     |      126:02      |      PASS      |
+-------------------------------+------------------+--------------------+------------------+----------------+

Vnf suite:

+----------------------+------------------+--------------+------------------+----------------+
|      TEST CASE       |     PROJECT      |     TIER     |     DURATION     |     RESULT     |
+----------------------+------------------+--------------+------------------+----------------+
|     cloudify_ims     |     functest     |     vnf      |      28:15       |      PASS      |
|     vyos_vrouter     |     functest     |     vnf      |      17:59       |      PASS      |
|       juju_epc       |     functest     |     vnf      |      46:44       |      PASS      |
+----------------------+------------------+--------------+------------------+----------------+

Parser testcase:

+-----------------------+-----------------+------------------+------------------+----------------+
|       TEST CASE       |     PROJECT     |       TIER       |     DURATION     |     RESULT     |
+-----------------------+-----------------+------------------+------------------+----------------+
|     parser-basics     |      parser     |     features     |      00:00       |      SKIP      |
+-----------------------+-----------------+------------------+------------------+----------------+

Functest Kubernetes test result:

+--------------------------------------+------------------------------------------------------------+
|               ENV VAR                |                           VALUE                            |
+--------------------------------------+------------------------------------------------------------+
|            INSTALLER_TYPE            |                          compass                           |
|           DEPLOY_SCENARIO            |                   k8-nosdn-nofeature-ha                    |
|              BUILD_TAG               |     jenkins-functest-compass-baremetal-daily-master-75     |
|               CI_LOOP                |                           daily                            |
+--------------------------------------+------------------------------------------------------------+

Kubernetes healthcheck suite:

+-------------------+------------------+---------------------+------------------+----------------+
|     TEST CASE     |     PROJECT      |         TIER        |     DURATION     |     RESULT     |
+-------------------+------------------+---------------------+------------------+----------------+
|     k8s_smoke     |     functest     |     healthcheck     |      01:54       |      PASS      |
+-------------------+------------------+---------------------+------------------+----------------+

Kubernetes smoke suite:

+-------------------------+------------------+---------------+------------------+----------------+
|        TEST CASE        |     PROJECT      |      TIER     |     DURATION     |     RESULT     |
+-------------------------+------------------+---------------+------------------+----------------+
|     k8s_conformance     |     functest     |     smoke     |      57:47       |      PASS      |
+-------------------------+------------------+---------------+------------------+----------------+

Kubernetes features suite:

+----------------------+------------------+------------------+------------------+----------------+
|      TEST CASE       |     PROJECT      |       TIER       |     DURATION     |     RESULT     |
+----------------------+------------------+------------------+------------------+----------------+
|     stor4nfv_k8s     |     stor4nfv     |     stor4nfv     |      00:00       |      SKIP      |
|      clover_k8s      |      clover      |      clover      |      00:00       |      SKIP      |
+----------------------+------------------+------------------+------------------+----------------+

Results are automatically pushed to the test results database, some additional result files are pushed to OPNFV artifact web sites.

Based on the results stored in the result database, a Functest reporting portal is also automatically updated. This portal provides information on the overall status per scenario and per installer.

Test reporting

An automatic reporting page has been created in order to provide a consistent view of the Functest tests on the different scenarios.

In this page, each scenario is evaluated according to test criteria.

The results are collected from the centralized database every day and, per scenario, a score is calculated based on the results from the last 10 days. This score is the sum of the single test scores. Each test case has a success criterion, reflected in the criteria field of the results.

As an illustration, let’s consider the os-odl_l2-nofeature-ha scenario: the scenario score is the sum of the scores of all the runnable tests from the tiers (healthcheck, smoke and features) corresponding to this scenario.

Test             Apex   Compass   Fuel   Joid
vPing_ssh        X      X         X      X
vPing_userdata   X      X         X      X
tempest_smoke    X      X         X      X
rally_sanity     X      X         X      X
odl              X      X         X      X
promise                           X      X
doctor           X                X
security_scan    X
parser                            X
copper           X                       X

src: os-odl_l2-nofeature-ha Colorado (see release note for the last matrix version)

All the testcases (X) listed in the table are runnable on os-odl_l2-nofeature scenarios. Please note that other test cases (e.g. sfc_odl, bgpvpn) need ODL configuration add-ons and, as a consequence, specific scenarios. They are not considered runnable on the generic odl_l2 scenario.

If no result is available, or if all the results are failed, the test case gets 0 points. If it was successful at least once but not during the last 4 runs, the case gets 1 point (it worked once). If at least 3 of the last 4 runs were successful, the case gets 2 points. If the last 4 runs of the test are successful, the test gets 3 points.

In the example above, the target score for fuel/os-odl_l2-nofeature-ha is 3 x 8 = 24 points and for compass it is 3 x 5 = 15 points.

The scenario is validated per installer when we get 3 points for all individual test cases (e.g. 24/24 for fuel, 15/15 for compass).

Please note that complex or long-duration tests are not considered yet for the scoring. In fact, the success criteria are not always easy to define and may require specific hardware configurations.

Please also note that all the test cases have the same “weight” in the score calculation, whatever the complexity of the test case. Concretely, a vPing has the same weight as the 200 Tempest tests. Moreover, some installers support more features than others. The more cases your scenario deals with, the more difficult it is to reach a good score.

Therefore the scoring provides 3 types of indicators:

  • the richness of the scenario: if the target score is high, it means that the scenario includes lots of features
  • the maturity: if the percentage (score / target score * 100) is high, it means that all the tests PASS
  • the stability: as the number of iterations is included in the calculation, the percentage can be high only if the scenario is run regularly (at least more than 4 iterations over the last 10 days in CI)

In any case, the scoring is used to give feedback to the other projects and does not represent an absolute value of the scenario.

See the reporting page for details. For the status, click on the version, then Functest, then the Status menu.

Functest reporting portal Fuel status page
Troubleshooting

This section gives some guidelines about how to troubleshoot the test cases owned by Functest.

IMPORTANT: As in the previous section, the steps defined below must be executed inside the Functest Docker container and after sourcing the OpenStack credentials:

. $creds

or:

source /home/opnfv/functest/conf/env_file
VIM

This section covers the test cases related to the VIM (healthcheck, vping_ssh, vping_userdata, tempest_smoke, tempest_full, rally_sanity, rally_full).

vPing common

For both vPing test cases (vPing_ssh, and vPing_userdata), the first steps are similar:

  • Create Glance image
  • Create Network
  • Create Security Group
  • Create Instances

After these actions, the test cases differ and will be explained in their respective section.

These test cases can be run inside the container, using new Functest CLI as follows:

$ run_tests -t vping_ssh
$ run_tests -t vping_userdata

The Functest CLI is designed to route a call to the corresponding internal python scripts, located in paths:

/usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/vping/vping_ssh.py
/usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/vping/vping_userdata.py

Notes:

  1. There is one difference between the Functest CLI based test case execution and the earlier Bash shell script based execution, which is relevant to point out in troubleshooting scenarios:

    The Functest CLI does not yet support the option to suppress clean-up of the generated OpenStack resources, following the execution of a test case.

    Explanation: After finishing the test execution, the corresponding script will remove, by default, all created resources in OpenStack (image, instances, network and security group). When troubleshooting, it is sometimes advisable to keep those resources in case the test fails and manual testing is needed.

    It is actually still possible to invoke test execution with suppression of OpenStack resource clean-up; however, this requires invocation of a specific Python script: ‘run_tests’. The OPNFV Functest Developer Guide provides guidance on the use of that script in such troubleshooting cases.

Some of the common errors that can appear in this test case are:

vPing_ssh- ERROR - There has been a problem when creating the neutron network....

This means that there have been some problems with Neutron, even before creating the instances. Try to manually create a Neutron network and a subnet to see if that works. The debug messages will also help to see where it failed (subnet or router creation). Example of Neutron commands (using the 10.6.0.0/24 range):

neutron net-create net-test
neutron subnet-create --name subnet-test --allocation-pool start=10.6.0.2,end=10.6.0.100 \
--gateway 10.6.0.254 net-test 10.6.0.0/24
neutron router-create test_router
neutron router-interface-add <ROUTER_ID> subnet-test
neutron router-gateway-set <ROUTER_ID> <EXT_NET_NAME>

Another related error can occur while creating the Security Groups for the instances:

vPing_ssh- ERROR - Failed to create the security group...

In this case, proceed to create it manually. These are some hints:

neutron security-group-create sg-test
neutron security-group-rule-create sg-test --direction ingress --protocol icmp \
--remote-ip-prefix 0.0.0.0/0
neutron security-group-rule-create sg-test --direction ingress --ethertype IPv4 \
--protocol tcp --port-range-min 80 --port-range-max 80 --remote-ip-prefix 0.0.0.0/0
neutron security-group-rule-create sg-test --direction egress --ethertype IPv4 \
--protocol tcp --port-range-min 80 --port-range-max 80 --remote-ip-prefix 0.0.0.0/0

The next step is to create the instances. The image used is located in /home/opnfv/functest/data/cirros-0.4.0-x86_64-disk.img and a Glance image is created with the name functest-vping. If booting the instances fails (i.e. the status is not ACTIVE), you can check why it failed by doing:

nova list
nova show <INSTANCE_ID>

It might show some messages about the booting failure. To try that manually:

nova boot --flavor m1.small --image functest-vping --nic net-id=<NET_ID> nova-test

This will spawn a VM using the network created manually before. In all the OPNFV scenarios tested in CI, there has never been a problem with the previous actions. Further possible problems are explained in the following sections.

vPing_SSH

This test case creates a floating IP on the external network and assigns it to the second instance opnfv-vping-2. The purpose of this is to establish an SSH connection to that instance and SCP a script that will ping the first instance. This script is located in the repository under /usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/vping/ping.sh and takes an IP as a parameter. When the SCP is completed, the test will make an SSH call to that script inside the second instance. Some problems can happen here:

vPing_ssh- ERROR - Cannot establish connection to IP xxx.xxx.xxx.xxx. Aborting

If this is displayed, stop the test or wait for it to finish (if you have used the special method of test invocation with suppression of OpenStack resource clean-up, as explained earlier). It means that the container cannot reach the Public/External IP assigned to the instance opnfv-vping-2. There are many possible reasons, and they really depend on the chosen scenario. For most of the ODL-L3 and ONOS scenarios this has been noticed and it is a known limitation.

First, make sure that the instance opnfv-vping-2 succeeded in getting an IP from the DHCP agent. It can be checked by doing:

nova console-log opnfv-vping-2

If the message Sending discover and No lease, failing is shown, it probably means that the Neutron dhcp-agent failed to assign an IP or even that it was not responding. At this point it does not make sense to try to ping the floating IP.

If the instance got an IP properly, try to ping the VM manually from the container:

nova list
<grab the public IP>
ping <public IP>

If the ping does not return anything, try to ping from the host where the Docker container is running. If that solves the problem, check the iptables rules, because there might be some rules rejecting ICMP or TCP traffic coming from or going to the container.

At this point, if the ping does not work either, try to reproduce the test manually with the steps described above in the vPing common section with the addition:

neutron floatingip-create <EXT_NET_NAME>
nova floating-ip-associate nova-test <FLOATING_IP>

Further troubleshooting is out of scope of this document, as it might be due to problems with the SDN controller. Contact the installer team members or send an email to the corresponding OPNFV mailing list for more information.

vPing_userdata

This test case neither creates a floating IP nor establishes an SSH connection. Instead, it uses the nova metadata service when creating an instance to pass the same script as before (ping.sh), but as one-line text. This script is executed automatically when the second instance opnfv-vping-2 is booted.

The main known cause for this test to fail is lack of support for cloud-init (the nova metadata service). Check the console of the instance:

nova console-log opnfv-vping-2

If this text or similar is shown:

checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 1.13. request failed
failed 2/20: up 13.18. request failed
failed 3/20: up 25.20. request failed
failed 4/20: up 37.23. request failed
failed 5/20: up 49.25. request failed
failed 6/20: up 61.27. request failed
failed 7/20: up 73.29. request failed
failed 8/20: up 85.32. request failed
failed 9/20: up 97.34. request failed
failed 10/20: up 109.36. request failed
failed 11/20: up 121.38. request failed
failed 12/20: up 133.40. request failed
failed 13/20: up 145.43. request failed
failed 14/20: up 157.45. request failed
failed 15/20: up 169.48. request failed
failed 16/20: up 181.50. request failed
failed 17/20: up 193.52. request failed
failed 18/20: up 205.54. request failed
failed 19/20: up 217.56. request failed
failed 20/20: up 229.58. request failed
failed to read iid from metadata. tried 20

it means that the instance failed to read from the metadata service. Contact the Functest or installer teams for more information.

Tempest

In the upstream OpenStack CI all the Tempest test cases are supposed to pass. If some test cases fail in an OPNFV deployment, the reason is very probably one of the following:

Error Details
Resources required for testcase execution are missing Such resources could be e.g. an external network and access to the management subnet (adminURL) from the Functest docker container.
OpenStack components or services are missing or not configured properly Check the running services on the controller and compute nodes (e.g. with “systemctl” or “service” commands). Configuration parameters can be verified from the related .conf files located under the ‘/etc/<component>’ directories.
Some resources required for test case execution are missing The tempest.conf file, automatically generated by Rally in Functest, does not contain all the needed parameters or some parameters are not set properly. The tempest.conf file is located in the directory ‘/root/.rally/verification/verifier-<UUID>/for-deployment-<UUID>’ in the Functest Docker container. Use the “rally deployment list” command in order to check the UUID of the current deployment.

When a Tempest test case fails, the captured traceback, and possibly also the related REST API requests/responses, is output to the console. More detailed debug information can be found in the tempest.log file stored in the related Rally deployment folder.

Functest offers the possibility to run a customized list of Tempest test cases. To enable that, add a new entry to docker/components/testcases.yaml in the “components” container with the following content:

-
    case_name: tempest_custom
    project_name: functest
    criteria: 100
    blocking: false
    description: >-
        The test case allows running a customized list of tempest
        test cases
    dependencies:
        installer: ''
        scenario: ''
    run:
        module: 'functest.opnfv_tests.openstack.tempest.tempest'
        class: 'TempestCustom'

Also, a list of the Tempest test cases must be provided to the container, or the existing one in /usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/tempest/custom_tests/test_list.txt must be modified.

Example of custom list of tests ‘my-custom-tempest-tests.txt’:

tempest.scenario.test_server_basic_ops.TestServerBasicOps.test_server_basic_ops[compute,id-7fff3fb3-91d8-4fd0-bd7d-0204f1f180ba,network,smoke]
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops[compute,id-f323b3ba-82f8-4db7-8ea6-6a895869ec49,network,smoke]

This is an example of running a customized list of Tempest tests in Functest:

sudo docker run --env-file env \
    -v $(pwd)/openstack.creds:/home/opnfv/functest/conf/env_file \
    -v $(pwd)/images:/home/opnfv/functest/images \
    -v $(pwd)/my-custom-testcases.yaml:/usr/lib/python2.7/site-packages/functest/ci/testcases.yaml \
    -v $(pwd)/my-custom-tempest-tests.txt:/usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/tempest/custom_tests/test_list.txt \
    opnfv/functest-components run_tests -t tempest_custom
Rally

The same error causes mentioned above for the Tempest test cases may also lead to errors in Rally.

Possible scenarios are:
  • authenticate
  • glance
  • cinder
  • heat
  • keystone
  • neutron
  • nova
  • quotas
  • vm

The scenarios are defined in the directory /usr/lib/python2.7/site-packages/functest/opnfv_tests/openstack/rally/scenario. For more information about Rally scenario definitions, please refer to the official Rally documentation. [3]

To check any possible problems with Rally, the logs are stored under /home/opnfv/functest/results/rally/ in the Functest Docker container.

Controllers
Opendaylight

If the Basic Restconf test suite fails, check that the ODL controller is reachable and its Restconf module has been installed.

If the Neutron Reachability test fails, verify that the modules implementing Neutron requirements have been properly installed.

If any of the other test cases fails, check that Neutron and ODL have been correctly configured to work together (Neutron configuration files, accounts, IP addresses, etc.).

Features

Please refer to the dedicated feature user guides for details.

VNF
cloudify_ims

vIMS deployment may fail for several reasons, the most frequent ones are described in the following table:

Error Comments
Keystone admin API not reachable Impossible to create the vIMS user and tenant
Impossible to retrieve the admin role ID Impossible to create the vIMS user and tenant
Error when uploading the image from OpenStack to Glance Impossible to deploy the VNF
Cinder quota cannot be updated Default quotas are not sufficient; they are adapted in the script
Impossible to create a volume The VNF cannot be deployed
SSH connection issue between the Functest Docker container and the VM If the vPing test fails, the vIMS test will fail too
No Internet access from the VM The VMs of the VNF must have external access to the Internet
No access to the OpenStack API from the VM The orchestrator can be installed, but the vIMS VNF installation fails

Please note that this test case requires significant resources (8 VMs with 2 GB RAM each + 1 VM with 4 GB); it is therefore not recommended to run it on a light configuration.

References

[2]: OpenStack Tempest documentation

[3]: Rally documentation

[4]: Functest in depth (Danube)

[5]: Clearwater vIMS blueprint

[6]: Security Content Automation Protocol

[7]: OpenSCAP web site

[8]: Refstack client

[9]: Defcore

[10]: OpenStack interoperability procedure

[11]: Robot Framework web site

[13]: SNAPS wiki

[14]: vRouter

[15]: Testing OpenStack Tempest part 1

[16]: OPNFV Test API

OPNFV main site: OPNFV official web site

Functest page: Functest wiki page

IRC support chan: #opnfv-functest

NFVbench

NFVbench User Guide

The NFVbench tool provides an automated way to measure the network performance for the most common data plane packet flows on any OpenStack system. It is designed to be easy to install and easy to use by non-experts (no need to be an expert in traffic generators and data plane performance testing).

Features
Data Plane Performance Measurement Features

NFVbench supports the following main measurement capabilities:

  • supports 2 measurement modes:
    • fixed rate mode to generate traffic at a fixed rate for a fixed duration
    • NDR (No Drop Rate) and PDR (Partial Drop Rate) measurement mode
  • configurable frame sizes (any list of fixed sizes or ‘IMIX’)

  • built-in packet paths (PVP, PVVP)

  • built-in loopback VNFs based on fast L2 or L3 forwarders running in VMs

  • configurable number of flows and service chains

  • configurable traffic direction (single or bi-directional)

NDR is the highest throughput achieved without dropping packets. PDR is the highest throughput achieved without dropping more than a pre-set limit (called PDR threshold or allowance, expressed in %).
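Conceptually, an NDR/PDR search can be sketched as a binary search over the offered rate. The sketch below is illustrative only and does not reflect NFVbench's actual search algorithm; `measure_drop_rate` is a hypothetical stand-in for a real traffic-generator run:

```python
def find_rate(measure_drop_rate, line_rate_pps, allowance_pct, precision_pps=1000):
    """Binary-search the highest rate whose drop rate stays within allowance_pct.

    allowance_pct = 0 corresponds to NDR; a small positive value (e.g. 0.1)
    corresponds to PDR with a 0.1% drop allowance.
    """
    lo, hi = 0.0, float(line_rate_pps)
    best = 0.0
    while hi - lo > precision_pps:
        rate = (lo + hi) / 2
        if measure_drop_rate(rate) <= allowance_pct:
            best, lo = rate, rate   # within allowance: try a higher rate
        else:
            hi = rate               # too many drops: try a lower rate
    return best

def drop_model(rate):
    # Toy model of the system under test: drops start beyond 8 Mpps.
    return 0.0 if rate <= 8_000_000 else 5.0

ndr = find_rate(drop_model, 10_000_000, allowance_pct=0.0)
```

In a real run, each `measure_drop_rate` call would be one fixed-rate traffic iteration, which is why the search duration grows with the configured precision.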

Results of each run include the following data:

  • Aggregated achieved throughput in bps
  • Aggregated achieved packet rate in pps (or fps)
  • Actual drop rate in %
  • Latency in usec (min, max, average in the current version)
Built-in OpenStack support

NFVbench can stage OpenStack resources to build 1 or more service chains using direct OpenStack APIs. Each service chain is composed of:

  • 1 or 2 loopback VM instances per service chain
  • 2 Neutron networks per loopback VM

OpenStack resources are staged using the OpenStack APIs (Nova and Neutron) before traffic is measured, then disposed of after the measurements complete.

The loopback VM flavor to use can be configured in the NFVbench configuration file.

Note that NFVbench does not use OpenStack Heat nor any higher level service (VNFM or NFVO) to create the service chains, because its main purpose is to measure the performance of the NFV infrastructure (NFVI), which is mainly a matter of L2 forwarding performance.

External Chains

NFVbench supports settings that involve externally staged packet paths with or without OpenStack:

  • run benchmarks on existing service chains at the L3 level that are staged externally by any other tool (e.g. any VNF capable of L3 routing)
  • run benchmarks on existing L2 chains that are configured externally (e.g. pure L2 forwarder such as DPDK testpmd)
Direct L2 Loopback (Switch or wire loopback)

NFVbench supports benchmarking of pure L2 loopbacks (see the "--l2-loopback vlan" option):

  • Switch level loopback
  • Port to port wire loopback

In this mode, NFVbench takes a VLAN ID and sends packets from each port to the other port (with the destination MAC set to the other port's MAC) using the same VLAN ID on both ports. This can be useful, for example, to verify that the connectivity to the switch is working properly.

Traffic Generation

NFVbench currently integrates with the open source TRex traffic generator:

  • TRex (pre-built into the NFVbench container)
Supported Packet Paths

Packet paths describe where packets flow in the NFVI platform. The most commonly used paths are identified by 3 or 4 letter abbreviations. A packet path can generally describe the flow of packets associated with one or more service chains, with each service chain composed of 1 or more VNFs.

The following packet paths are currently supported by NFVbench:

  • PVP (Physical interface to VM to Physical interface)
  • PVVP (Physical interface to VM to VM to Physical interface)
  • N*PVP (N concurrent PVP packet paths)
  • N*PVVP (N concurrent PVVP packet paths)

The traffic is made of 1 or more flows of L3 frames (UDP packets) with different payload sizes. Each flow is identified by a unique source and destination MAC/IP tuple.
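As an illustration of how unique flows can be identified, the sketch below derives one distinct source/destination IP pair per flow (this is a simplified illustration, not NFVbench's actual flow-generation code):

```python
import ipaddress

def make_flows(n, src_base="10.0.0.1", dst_base="20.0.0.1"):
    """Generate n flows, each identified by a unique (src IP, dst IP) pair.

    The base addresses are arbitrary examples; NFVbench derives its own
    address ranges from its configuration.
    """
    src0 = ipaddress.IPv4Address(src_base)
    dst0 = ipaddress.IPv4Address(dst_base)
    return [(str(src0 + i), str(dst0 + i)) for i in range(n)]

flows = make_flows(3)
# [('10.0.0.1', '20.0.0.1'), ('10.0.0.2', '20.0.0.2'), ('10.0.0.3', '20.0.0.3')]
```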

Loopback VM

NFVbench provides a loopback VM image that runs CentOS with 2 pre-installed forwarders:

  • DPDK testpmd configured to do L2 cross connect between 2 virtual interfaces
  • FD.io VPP configured to perform L3 routing between 2 virtual interfaces

Frames are simply forwarded from one interface to the other. In the case of testpmd, the source and destination MAC are rewritten, which corresponds to the MAC forwarding mode (--forward-mode=mac). In the case of VPP, VPP acts as a real L3 router, and the packets are routed from one port to the other using static routes.

Which forwarder and which Nova flavor to use can be selected in the NFVbench configuration. By default the DPDK testpmd forwarder is used with 2 vCPUs per VM. The configuration of these forwarders (such as the MAC rewrite configuration or static routes) is managed by NFVbench.

PVP Packet Path

This packet path represents a single service chain with 1 loopback VNF and 2 Neutron networks:

_images/nfvbench-pvp.png
PVVP Packet Path

This packet path represents a single service chain with 2 loopback VNFs in sequence and 3 Neutron networks. The 2 VNFs can run on the same compute node (PVVP intra-node):

_images/nfvbench-pvvp.png

or on different compute nodes (PVVP inter-node) based on a configuration option:

_images/nfvbench-pvvp2.png
Multi-Chaining (N*PVP or N*PVVP)

Multiple service chains can be set up by NFVbench without any limit on the concurrency (other than the limits imposed by available resources on compute nodes). In the case of multiple service chains, NFVbench will instruct the traffic generator to use multiple L3 packet streams (frames directed to each path will have a unique destination MAC address).

Example of multi-chaining with 2 concurrent PVP service chains:

_images/nfvbench-npvp.png

This feature makes it easy to measure the performance of a fully loaded compute node running multiple service chains.

Multi-chaining is currently limited to 1 compute node (PVP or PVVP intra-node) or 2 compute nodes (for PVVP inter-node). The 2 edge interfaces for all service chains will share the same 2 networks. The total traffic will be split equally across all chains.

SR-IOV

By default, service chains will be based on virtual switch interfaces.

NFVbench provides an option to select SR-IOV based virtual interfaces instead (thus bypassing any virtual switch) for those OpenStack systems that include and support SR-IOV capable NICs on compute nodes.

The PVP packet path will bypass the virtual switch completely when the SR-IOV option is selected:

_images/nfvbench-sriov-pvp.png

The PVVP packet path will use SR-IOV for the left and right networks and the virtual switch for the middle network by default:

_images/nfvbench-sriov-pvvp.png

Or in the case of inter-node:

_images/nfvbench-sriov-pvvp2.png

This packet path is a good way to approximate VM to VM (V2V) performance (middle network) given the high efficiency of the left and right networks. The V2V throughput will likely be very close to the PVVP throughput while its latency will be very close to the difference between the SR-IOV PVVP latency and the SR-IOV PVP latency.

It is possible to also force the middle network to use SR-IOV (in this version, the middle network is limited to use the same SR-IOV phys net):

_images/nfvbench-all-sriov-pvvp.png

The chain can also span across 2 nodes with the use of 2 SR-IOV ports in each node:

_images/nfvbench-all-sriov-pvvp2.png
Other Misc Packet Paths

P2P (Physical interface to Physical interface - no VM) can be supported using the external chain/L2 forwarding mode.

V2V (VM to VM) is not supported but PVVP provides a more complete (and more realistic) alternative.

Supported Neutron Network Plugins and vswitches
Any Virtual Switch, Any Encapsulation

NFVbench is agnostic of the virtual switch implementation and has been tested with the following virtual switches:

  • ML2/VPP/VLAN (networking-vpp)
  • OVS/VLAN and OVS-DPDK/VLAN
  • ML2/ODL/VPP (OPNFV Fast Data Stack)
Limitations

NFVbench only supports VLAN with OpenStack. NFVbench does not support VxLAN overlays.

Installation and Quick Start Guides
Requirements for running NFVbench
Hardware Requirements

To run NFVbench you need the following hardware:

  • a Linux server
  • a DPDK-compatible NIC with at least 2 ports (preferably 10 Gbps or higher)
  • 2 Ethernet cables between the NIC and the OpenStack pod under test (usually through a top-of-rack switch)

The DPDK-compatible NIC must be one supported by the TRex traffic generator (such as the Intel X710; refer to the TRex Installation Guide for a complete list of supported NICs).

To run the TRex traffic generator (that is bundled with NFVbench) you will need to wire 2 physical interfaces of the NIC to the TOR switch(es):
  • if you have only 1 TOR, wire both interfaces to that same TOR
  • 1 interface to each TOR if you have 2 TORs and want to use bonded links to your compute nodes
_images/nfvbench-trex-setup.png
Switch Configuration

The 2 corresponding ports on the switch(es) facing the TRex ports on the Linux server should be configured in trunk mode (NFVbench will instruct TRex to insert the appropriate VLAN tag).

Using a TOR switch is more representative of a real deployment; it allows measuring packet flows on any compute node in the rack without rewiring, and it includes the overhead of the TOR switch.

Although not the primary targeted use case, NFVbench could also support the direct wiring of the traffic generator to a compute node without a switch.

Software Requirements

You need Docker to be installed on the Linux server.

TRex uses the DPDK interface to interact with the DPDK compatible NIC for sending and receiving frames. The Linux server will need to be configured properly to enable DPDK.

DPDK requires a uio (User space I/O) or vfio (Virtual Function I/O) kernel module to be installed on the host to work. There are 2 main uio kernel module implementations (igb_uio and uio_pci_generic) and one vfio kernel module implementation.

To check if a uio or vfio is already loaded on the host:

lsmod | grep -e igb_uio -e uio_pci_generic -e vfio

If missing, it is necessary to install a uio/vfio kernel module on the host server:

  • find a suitable kernel module for your host server (any uio or vfio kernel module built with the same Linux kernel version should work)
  • load it using the modprobe and insmod commands

Example of installation of the igb_uio kernel module:

modprobe uio
insmod ./igb_uio.ko

Finally, the correct IOMMU options and huge pages must be configured on the Linux server boot command line:

  • enable intel_iommu and iommu pass through: “intel_iommu=on iommu=pt”
  • for Trex, pre-allocate 1024 huge pages of 2MB each (for a total of 2GB): “hugepagesz=2M hugepages=1024”
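The boot-line options above can be sanity-checked with a little arithmetic (an illustrative sketch, not NFVbench code):

```python
# Sketch: compose and sanity-check the kernel boot-line parameters above.
iommu_opts = "intel_iommu=on iommu=pt"
hugepage_size_mb = 2       # hugepagesz=2M
hugepage_count = 1024      # hugepages=1024
hugepage_opts = f"hugepagesz={hugepage_size_mb}M hugepages={hugepage_count}"

# 1024 pages of 2 MB each pre-allocate 2 GB in total.
total_gb = hugepage_size_mb * hugepage_count / 1024

boot_args = f"{iommu_opts} {hugepage_opts}"
```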

More detailed instructions can be found in the DPDK documentation (https://media.readthedocs.org/pdf/dpdk/latest/dpdk.pdf).

NFVbench Installation and Quick Start Guide

Make sure you satisfy the hardware and software requirements before you start.

1. Container installation

To pull the latest NFVbench container image:

docker pull opnfv/nfvbench
2. Docker Container configuration

The NFVbench container requires the following Docker options to operate properly.

Docker options Description
-v /lib/modules/$(uname -r):/lib/modules/$(uname -r) needed by kernel modules in the container
-v /usr/src/kernels:/usr/src/kernels needed by TRex to build kernel modules when needed
-v /dev:/dev needed by kernel modules in the container
-v $PWD:/tmp/nfvbench optional but recommended to pass files between the host and the Docker space (see examples below). Here we map the current directory on the host to the /tmp/nfvbench directory in the container, but any other similar mapping can work as well
--net=host (optional) needed if you run the NFVbench server in the container (or use any appropriate Docker network mode other than "host")
--privileged (optional) required if SELinux is enabled on the host
-e HOST="127.0.0.1" (optional) required if the REST server is enabled
-e PORT=7556 (optional) required if the REST server is enabled
-e CONFIG_FILE="/root/nfvbenchconfig.json" (optional) required if the REST server is enabled

It can be convenient to write a shell script (or an alias) to automatically insert the necessary options.

The minimal configuration file must specify the openrc file to use (using the in-container path), the PCI addresses of the 2 NIC ports to use for generating traffic, and the line rate (in each direction) of each of these 2 interfaces.

Here is an example of a minimal configuration where: the openrc file is located in the host's current directory, which is mapped under /tmp/nfvbench in the container (this is achieved using -v $PWD:/tmp/nfvbench), and the 2 NIC ports to use for generating traffic have the PCI addresses “04:00.0” and “04:00.1”:

{
    "openrc_file": "/tmp/nfvbench/openrc",
    "traffic_generator": {
        "generator_profile": [
            {
                "interfaces": [
                    {
                        "pci": "04:00.0",
                        "port": 0
                    },
                    {
                        "pci": "04:00.1",
                        "port": 1
                    }
                ],
                "intf_speed": "",
                "ip": "127.0.0.1",
                "name": "trex-local",
                "software_mode": false,
                "tool": "TRex"
            }
        ]
    }
}

The other options in the minimal configuration must be present and must have the same values as above.

3. Start the Docker container

As for any Docker container, you can execute NFVbench measurement sessions using a temporary container (“docker run” - which exits after each NFVbench run) or you can decide to run the NFVbench container in the background then execute one or more NFVbench measurement sessions on that container (“docker exec”).

The former approach is simpler to manage (since each container is started and terminated after each command) but incurs a small delay at start time (several seconds). The latter approach is more responsive, as the delay is only incurred once, when the container is started.

We will take the second approach and start the NFVbench container in detached mode with the name “nfvbench” (this works with bash; prefix with “sudo” if you do not use the root login).

First create a new working directory and change the current working directory to it. An “nfvbench_ws” directory under your home directory is a good place for that; this is where the OpenStack RC file and the NFVbench config file will sit.

To run NFVbench without server mode:

cd ~/nfvbench_ws
docker run --detach --net=host --privileged -v $PWD:/tmp/nfvbench -v /dev:/dev -v /lib/modules/$(uname -r):/lib/modules/$(uname -r) -v /usr/src/kernels:/usr/src/kernels --name nfvbench opnfv/nfvbench

To run NFVbench enabling the REST server (mount the configuration JSON and the path for openrc):

cd ~/nfvbench_ws
docker run --detach --net=host --privileged -e HOST="127.0.0.1" -e PORT=7556 -e CONFIG_FILE="/tmp/nfvbench/nfvbenchconfig.json" -v $PWD:/tmp/nfvbench -v /dev:/dev -v /lib/modules/$(uname -r):/lib/modules/$(uname -r) -v /usr/src/kernels:/usr/src/kernels --name nfvbench opnfv/nfvbench start_rest_server

Then create an alias to make it easy to execute nfvbench commands directly from the host shell prompt:

alias nfvbench='docker exec -it nfvbench nfvbench'

The next-to-last “nfvbench” refers to the name of the container, while the last “nfvbench” refers to the NFVbench binary that runs in the container.

To verify it is working:

nfvbench --version
nfvbench --help
4. NFVbench configuration

Create a new file containing the minimal configuration for NFVbench. We can give it any name, for example “my_nfvbench.cfg”, and paste the following YAML template into the file:

openrc_file:
traffic_generator:
    generator_profile:
        - name: trex-local
          tool: TRex
          ip: 127.0.0.1
          cores: 3
          software_mode: false
          interfaces:
            - port: 0
              switch_port:
              pci:
            - port: 1
              switch_port:
              pci:
          intf_speed:

NFVbench requires an openrc file to connect to OpenStack using the OpenStack API. This file can be downloaded from the OpenStack Horizon dashboard (refer to the OpenStack documentation on how to retrieve the openrc file). The file pathname in the container must be stored in the “openrc_file” property. If it is stored on the host in the current directory, its full pathname must start with /tmp/nfvbench (since the current directory is mapped to /tmp/nfvbench in the container).

The required configuration is the PCI addresses of the 2 physical interfaces that will be used by the traffic generator. The PCI addresses can be obtained, for example, by using the “lspci” Linux command. For example:

[root@sjc04-pod6-build ~]# lspci | grep 710
0a:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
0a:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
0a:00.2 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
0a:00.3 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)

Example of edited configuration with an OpenStack RC file stored in the current directory with the “openrc” name, and PCI addresses “0a:00.0” and “0a:00.1” (first 2 ports of the quad port NIC):

openrc_file: /tmp/nfvbench/openrc
traffic_generator:
    generator_profile:
        - name: trex-local
          tool: TRex
          ip: 127.0.0.1
          cores: 3
          software_mode: false
          interfaces:
            - port: 0
              switch_port:
              pci: "0a:00.0"
            - port: 1
              switch_port:
              pci: "0a:00.1"
          intf_speed:

Warning

You have to put quotes around the PCI addresses as shown in the example above, otherwise TRex will read them incorrectly.

Alternatively, the full template with comments can be obtained using the --show-default-config option in yaml format:

nfvbench --show-default-config > my_nfvbench.cfg

Edit the my_nfvbench.cfg file to keep only those properties that need to be modified (preserving the nesting).

Make sure you have your nfvbench configuration file (my_nfvbench.cfg) and OpenStack RC file in your pre-created working directory.

5. Run NFVbench

To do a single run at 10,000 pps bi-directional (i.e. 5 kpps in each direction) using the PVP packet path:

nfvbench -c /tmp/nfvbench/my_nfvbench.cfg --rate 10kpps

NFVbench options used:

  • -c /tmp/nfvbench/my_nfvbench.cfg : specify the config file to use (this must reflect the file path from inside the container)
  • --rate 10kpps : specify the packet rate for the test, in both directions combined, using the kpps unit (thousands of packets per second)
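As a quick sanity check of the rate arithmetic above (illustrative only):

```python
# --rate 10kpps is a total across both directions; each direction
# therefore carries half of it.
total_rate_kpps = 10
per_direction_kpps = total_rate_kpps / 2
per_direction_pps = per_direction_kpps * 1000  # 5000 packets per second each way
```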

This should produce a result similar to this (a simple run with the above options should take less than 5 minutes):

[TBP]
7. Terminating the NFVbench container

When no longer needed, the container can be terminated using the usual docker commands:

docker kill nfvbench
docker rm nfvbench
Example of Results

Example run for fixed rate

nfvbench -c /nfvbench/nfvbenchconfig.json --rate 1%
========== NFVBench Summary ==========
Date: 2017-09-21 23:57:44
NFVBench version 1.0.9
Openstack Neutron:
  vSwitch: BASIC
  Encapsulation: BASIC
Benchmarks:
> Networks:
  > Components:
    > TOR:
        Type: None
    > Traffic Generator:
        Profile: trex-local
        Tool: TRex
    > Versions:
      > TOR:
      > Traffic Generator:
          build_date: Aug 30 2017
          version: v2.29
          built_by: hhaim
          build_time: 16:43:55
  > Service chain:
    > PVP:
      > Traffic:
          Profile: traffic_profile_64B
          Bidirectional: True
          Flow count: 10000
          Service chains count: 1
          Compute nodes: []

            Run Summary:

              +-----------------+-------------+----------------------+----------------------+----------------------+
              |   L2 Frame Size |  Drop Rate  |   Avg Latency (usec) |   Min Latency (usec) |   Max Latency (usec) |
              +=================+=============+======================+======================+======================+
              |              64 |   0.0000%   |                   53 |                   20 |                  211 |
              +-----------------+-------------+----------------------+----------------------+----------------------+


            L2 frame size: 64
            Chain analysis duration: 60.076 seconds

            Run Config:

              +-------------+---------------------------+------------------------+-----------------+---------------------------+------------------------+-----------------+
              |  Direction  |  Requested TX Rate (bps)  |  Actual TX Rate (bps)  |  RX Rate (bps)  |  Requested TX Rate (pps)  |  Actual TX Rate (pps)  |  RX Rate (pps)  |
              +=============+===========================+========================+=================+===========================+========================+=================+
              |   Forward   |       100.0000 Mbps       |      95.4546 Mbps      |  95.4546 Mbps   |        148,809 pps        |      142,045 pps       |   142,045 pps   |
              +-------------+---------------------------+------------------------+-----------------+---------------------------+------------------------+-----------------+
              |   Reverse   |       100.0000 Mbps       |      95.4546 Mbps      |  95.4546 Mbps   |        148,809 pps        |      142,045 pps       |   142,045 pps   |
              +-------------+---------------------------+------------------------+-----------------+---------------------------+------------------------+-----------------+
              |    Total    |       200.0000 Mbps       |     190.9091 Mbps      |  190.9091 Mbps  |        297,618 pps        |      284,090 pps       |   284,090 pps   |
              +-------------+---------------------------+------------------------+-----------------+---------------------------+------------------------+-----------------+

            Chain Analysis:

              +-------------------+----------+-----------------+---------------+---------------+-----------------+---------------+---------------+
              |     Interface     |  Device  |  Packets (fwd)  |   Drops (fwd) |  Drop% (fwd)  |  Packets (rev)  |   Drops (rev) |  Drop% (rev)  |
              +===================+==========+=================+===============+===============+=================+===============+===============+
              | traffic-generator |   trex   |    8,522,729    |               |               |    8,522,729    |             0 |    0.0000%    |
              +-------------------+----------+-----------------+---------------+---------------+-----------------+---------------+---------------+
              | traffic-generator |   trex   |    8,522,729    |             0 |    0.0000%    |    8,522,729    |               |               |
              +-------------------+----------+-----------------+---------------+---------------+-----------------+---------------+---------------+

Example run for NDR/PDR with 1518B frame size

nfvbench -c /nfvbench/nfvbenchconfig.json -fs 1518
========== NFVBench Summary ==========
Date: 2017-09-22 00:02:07
NFVBench version 1.0.9
Openstack Neutron:
  vSwitch: BASIC
  Encapsulation: BASIC
Benchmarks:
> Networks:
  > Components:
    > TOR:
        Type: None
    > Traffic Generator:
        Profile: trex-local
        Tool: TRex
    > Versions:
      > TOR:
      > Traffic Generator:
          build_date: Aug 30 2017
          version: v2.29
          built_by: hhaim
          build_time: 16:43:55
  > Measurement Parameters:
      NDR: 0.001
      PDR: 0.1
  > Service chain:
    > PVP:
      > Traffic:
          Profile: custom_traffic_profile
          Bidirectional: True
          Flow count: 10000
          Service chains count: 1
          Compute nodes: []

            Run Summary:

              +-----+-----------------+------------------+------------------+-----------------+----------------------+----------------------+----------------------+
              |  -  |   L2 Frame Size |  Rate (fwd+rev)  |  Rate (fwd+rev)  |  Avg Drop Rate  |   Avg Latency (usec) |   Min Latency (usec) |  Max Latency (usec)  |
              +=====+=================+==================+==================+=================+======================+======================+======================+
              | NDR |            1518 |   19.9805 Gbps   |  1,623,900 pps   |     0.0001%     |                  342 |                   30 |         704          |
              +-----+-----------------+------------------+------------------+-----------------+----------------------+----------------------+----------------------+
              | PDR |            1518 |   20.0000 Gbps   |  1,625,486 pps   |     0.0022%     |                  469 |                   40 |        1,266         |
              +-----+-----------------+------------------+------------------+-----------------+----------------------+----------------------+----------------------+


            L2 frame size: 1518
            Chain analysis duration: 660.442 seconds
            NDR search duration: 660 seconds
            PDR search duration: 0 seconds
Advanced Usage

This section covers a few examples of how to run NFVbench with different settings. It presents the most common and useful use cases and explains some fields of the default configuration file.

How to change any NFVbench run configuration (CLI)

NFVbench always starts with a default configuration which can further be refined (overridden) by the user from the CLI or from REST requests.

First, have a look at the default configuration:

nfvbench --show-default-config

It is sometimes useful to derive your own configuration from a copy of the default config:

nfvbench --show-default-config > nfvbench.cfg

At this point you can edit the copy by:

  • removing any parameter that is not to be changed (since NFVbench always loads the default configuration, default values are not needed)
  • editing the parameters that are to be changed

A run with the new configuration can then simply be requested using the -c option and the actual path of the configuration file as seen from inside the container (in this example, we assume the current directory is mapped to /tmp/nfvbench in the container):

nfvbench -c /tmp/nfvbench/nfvbench.cfg

The same -c option also accepts any valid yaml or json string to override certain parameters without having to create a configuration file.

NFVbench provides many configuration options as optional arguments. For example the number of flows can be specified using the --flow-count option.

The flow count option can be specified in any of 3 ways:

  • by providing a configuration file that has the flow_count value to use (-c myconfig.yaml where myconfig.yaml contains 'flow_count: 100k')
  • by passing that yaml parameter inline (-c "flow_count: 100k") or (-c "{flow_count: 100k}")
  • by using the flow count optional argument (--flow-count 100k)
Showing the running configuration

Because configuration parameters can be overridden, it is sometimes useful to show the final configuration (after all overrides are applied) by using the --show-config option. This final configuration is also called the "running" configuration.

For example, this will only display the running configuration (without actually running anything):

nfvbench -c "{flow_count: 100k, debug: true}" --show-config
Connectivity and Configuration Check

NFVbench can test connectivity to the devices used in the selected packet path. It runs the whole test without actually sending any traffic. This is also a good way to check that everything is configured properly in the configuration file and to see which component versions are used.

To verify everything works without sending any traffic, use the --no-traffic option:

nfvbench --no-traffic

Used parameters:

  • --no-traffic or -0 : sending traffic from traffic generator is skipped
Fixed Rate Run

Fixed rate run is the most basic type of NFVbench usage. It can be used to measure the drop rate with a fixed transmission rate of packets.

This example shows how to run the PVP packet path (which is the default packet path) with multiple different settings:

nfvbench -c nfvbench.cfg --no-cleanup --rate 100000pps --duration 30 --interval 15 --json results.json

Used parameters:

  • -c nfvbench.cfg : path to the config file
  • --no-cleanup : resources (networks, VMs, attached ports) are not deleted after test is finished
  • --rate 100000pps : defines rate of packets sent by traffic generator
  • --duration 30 : specifies how long traffic should run, in seconds
  • --interval 15 : stats are checked and shown periodically at this interval (in seconds) while traffic is flowing
  • --json results.json : collected data are stored in this file after the run is finished

Note

It is your responsibility to clean up resources if needed when --no-cleanup parameter is used. You can use the nfvbench_cleanup helper script for that purpose.

The --json parameter makes it easy to store NFVbench results. The --show-summary (or -ss) option can be used to display the results in a json results file in a text tabular format:

nfvbench --show-summary results.json

This example shows how to specify a different packet path:

nfvbench -c nfvbench.cfg --rate 1Mbps --inter-node --service-chain PVVP

Used parameters:

  • -c nfvbench.cfg : path to the config file
  • --rate 1Mbps : defines rate of packets sent by traffic generator
  • --inter-node : VMs are created on different compute nodes, works only with PVVP flow
  • --service-chain PVVP or -sc PVVP : specifies the type of service chain (or packet path) to use

Note

When parameter --inter-node is not used or there aren’t enough compute nodes, VMs are on the same compute node.

Rate Units

Parameter --rate accepts different types of values:

  • packets per second (pps, kpps, mpps), e.g. 1000pps or 10kpps
  • load percentage (%), e.g. 50%
  • bits per second (bps, kbps, Mbps, Gbps), e.g. 1Gbps, 1000bps
  • NDR/PDR (ndr, pdr, ndr_pdr), e.g. ndr_pdr

NDR/PDR is the default rate when not specified.
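As an illustration of these formats, a --rate value could be parsed along the following lines (a sketch only, not NFVbench's actual parser):

```python
import re

def parse_rate(rate):
    """Return (value, unit) where unit is 'pps', 'bps', '%' or an NDR/PDR keyword."""
    rate = rate.strip().lower()
    if rate in ("ndr", "pdr", "ndr_pdr"):
        return (None, rate)
    m = re.match(r"^([\d.]+)\s*(kpps|mpps|pps|kbps|mbps|gbps|bps|%)$", rate)
    if not m:
        raise ValueError("unrecognized rate: %r" % rate)
    value = float(m.group(1))
    unit = m.group(2)
    # normalize k/m/g prefixes to plain pps or bps
    scale = {"kpps": 1e3, "mpps": 1e6, "kbps": 1e3, "mbps": 1e6, "gbps": 1e9}
    if unit in scale:
        value *= scale[unit]
        unit = "pps" if unit.endswith("pps") else "bps"
    return (value, unit)
```

For example, parse_rate("10kpps") normalizes to 10,000 packets per second, while parse_rate("ndr_pdr") selects the NDR/PDR search mode.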

NDR and PDR

The NDR and PDR test is used to determine the maximum throughput performance of the system under test following guidelines defined in RFC-2544:

  • NDR (No Drop Rate): maximum packet rate sent without dropping any packet
  • PDR (Partial Drop Rate): maximum packet rate sent while allowing a given maximum drop rate

The NDR search can also be relaxed to allow some very small amount of drop rate (lower than the PDR maximum drop rate). NFVbench will measure the NDR and PDR values by driving the traffic generator through multiple iterations at different transmission rates using a binary search algorithm.
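The search described above can be sketched as a simple bisection over the load range (illustrative only; measure_drop_pct is an assumed callable that returns the drop percentage observed at a given load expressed in % of line rate):

```python
def find_max_load(measure_drop_pct, max_drop_pct, load_epsilon=0.1):
    """Highest load (in % of line rate) whose drop rate stays within max_drop_pct."""
    lo, hi, best = 0.0, 100.0, 0.0
    while hi - lo > load_epsilon:
        mid = (lo + hi) / 2.0
        if measure_drop_pct(mid) <= max_drop_pct:
            best, lo = mid, mid   # within the drop budget: try higher loads
        else:
            hi = mid              # too many drops: try lower loads
    return best

# NDR uses a tighter drop budget than PDR (e.g. 0.001% vs 0.1%),
# so NDR <= PDR for the same system under test.
```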

The configuration file contains a section where NDR/PDR settings can be adjusted:

# NDR/PDR configuration
measurement:
    # Drop rates represent the ratio of dropped packet to the total number of packets sent.
    # Values provided here are percentages. A value of 0.01 means that at most 0.01% of all
    # packets sent are dropped (or 1 packet every 10,000 packets sent)

    # No Drop Rate; Default to 0.001%
    NDR: 0.001
    # Partial Drop Rate; NDR should always be less than PDR
    PDR: 0.1
    # The accuracy of NDR and PDR load percentiles; The actual load percentile that match NDR
    # or PDR should be within `load_epsilon` difference than the one calculated.
    load_epsilon: 0.1

Because NDR/PDR is the default --rate value, it is possible to run NFVbench simply like this:

nfvbench -c nfvbench.cfg

Other possible run options:

nfvbench -c nfvbench.cfg --duration 120 --json results.json

Used parameters:

  • -c nfvbench.cfg : path to the config file
  • --duration 120 : specifies how long traffic should run in each iteration
  • --json results.json : collected data are stored in this file after run is finished
Multichain

NFVbench can run multiple chains at the same time. For example it is possible to stage the PVP service chain N times, where N is limited only by the available compute capacity. With N = 10, NFVbench will spawn 10 VMs as part of 10 simultaneous PVP chains.

The number of chains is specified by --service-chain-count or -scc flag with a default value of 1. For example to run NFVbench with 3 PVP chains:

nfvbench -c nfvbench.cfg --rate 10000pps -scc 3

It is not necessary to specify the service chain type (-sc) because PVP is the default. With this configuration, the PVP service chains will have 3 VMs in 3 chains. If -sc PVVP is specified instead, there will be 6 VMs in 3 chains as this service chain has 2 VMs per chain. Both fixed rate and NDR/PDR runs can be run as multichain. Running multichain is a scenario closer to a real life situation than runs with a single chain.
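The VM counts mentioned above follow directly from the chain type; a minimal sketch (chain types and per-chain VM counts taken from the text, not from NFVbench source):

```python
# VMs per chain for the built-in service chain types described above
VMS_PER_CHAIN = {"PVP": 1, "PVVP": 2}

def total_vms(service_chain, chain_count):
    """Total number of VMs NFVbench will spawn for a multichain run."""
    return VMS_PER_CHAIN[service_chain] * chain_count
```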

External Chain

NFVbench can measure the performance of 1 or more L3 service chains that are setup externally. Instead of being setup by NFVbench, the complete environment (VMs and networks) has to be setup prior to running NFVbench.

Each external chain is made of 1 or more VNFs and has exactly 2 end network interfaces (left and right network interfaces) that are connected to 2 neutron networks (left and right networks). The internal composition of a multi-VNF service chain can be arbitrary (usually linear) as far as NFVbench is concerned; the only requirement is that the service chain can route L3 packets properly between the left and right networks.

To run NFVbench on such external service chains:

  • explicitly tell NFVbench to use external service chain by adding -sc EXT or --service-chain EXT to NFVbench CLI options

  • specify the number of external chains using the -scc option (defaults to 1 chain)

  • specify the 2 end point networks of your environment in external_networks inside the config file.
    • The two networks specified there have to exist in Neutron and will be used as the end point networks by NFVbench (‘napa’ and ‘marin’ in the diagram below)
  • specify the router gateway IPs for the external service chains (1.1.0.2 and 2.2.0.2)

  • specify the traffic generator gateway IPs for the external service chains (1.1.0.102 and 2.2.0.102 in diagram below)

  • specify the packet source and destination IPs for the virtual devices that are simulated (10.0.0.0/8 and 20.0.0.0/8)

[Diagram: external chain configuration (extchain-config.png)]

L3 routing must be enabled in the VNF and configured to:

  • reply to ARP requests to its public IP addresses on both left and right networks
  • route packets from each set of remote devices toward the appropriate dest gateway IP in the traffic generator using 2 static routes (as illustrated in the diagram)

Upon start, NFVbench will:

  • first retrieve the properties of the left and right networks using Neutron APIs
  • extract the underlying network ID (typically the VLAN segmentation ID)
  • generate packets with the proper VLAN ID and measure traffic

Note that in the case of multiple chains, all chains end interfaces must be connected to the same two left and right networks. The traffic will be load balanced across the corresponding gateway IP of these external service chains.

Multiflow

NFVbench always generates L3 packets from the traffic generator but allows the user to specify how many flows to generate. A flow is identified by a unique src/dest MAC, IP and port tuple sent by the traffic generator. Flows are generated by ranging the IP addresses while using a small fixed number of MAC addresses.

The flows will be spread roughly evenly across chains when more than 1 chain is being tested. For example, with 11 flows and 3 chains, the chains will carry 3, 4 and 4 flows respectively.
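That spreading rule can be sketched as follows (an illustration consistent with the example above, not the actual NFVbench code):

```python
def flows_per_chain(flow_count, chain_count):
    """Spread flow_count roughly evenly across chain_count chains."""
    base, extra = divmod(flow_count, chain_count)
    # the first (chain_count - extra) chains get 'base' flows, the rest get one more
    return [base] * (chain_count - extra) + [base + 1] * extra
```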

The number of flows is specified by --flow-count or -fc flag, the default value is 2 (1 flow in each direction). To run NFVbench with 3 chains and 100 flows, use the following command:

nfvbench -c nfvbench.cfg --rate 10000pps -scc 3 -fc 100

Note that from a vswitch point of view, the number of flows seen will be higher, at least 4 times the number of flows sent by the traffic generator (flows to the VM and flows from the VM are added).

IP addresses generated can be controlled with the following NFVbench configuration options:

ip_addrs: ['10.0.0.0/8', '20.0.0.0/8']
ip_addrs_step: 0.0.0.1
tg_gateway_ip_addrs: ['1.1.0.100', '2.2.0.100']
tg_gateway_ip_addrs_step: 0.0.0.1
gateway_ip_addrs: ['1.1.0.2', '2.2.0.2']
gateway_ip_addrs_step: 0.0.0.1

ip_addrs are the starting addresses of the 2 IP ranges used by the traffic generator as packet source and destination addresses; each range is associated with the virtual devices simulated behind 1 physical interface of the traffic generator. These can also be written in CIDR notation to represent the subnet.

tg_gateway_ip_addrs are the traffic generator gateway (virtual) IP addresses; all traffic to/from the virtual devices goes through them.

gateway_ip_addrs are the 2 gateway ip address ranges of the VMs used in the external chains. They are only used with external chains and must correspond to their public IP address.

The corresponding step is used for ranging the IP addresses from the ip_addrs, tg_gateway_ip_addrs and gateway_ip_addrs base addresses. 0.0.0.1 is the default step for all IP ranges. In ip_addrs, 'random' can be configured, which tells NFVbench to generate random src/dst IP pairs in the traffic stream.
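For illustration, here is how a base address and a step expand into consecutive addresses, using Python's standard ipaddress module (a sketch, not NFVbench code):

```python
import ipaddress

def expand_ips(base, step, count):
    """First `count` addresses obtained by repeatedly adding `step` to `base`."""
    start = int(ipaddress.ip_address(base))
    inc = int(ipaddress.ip_address(step))
    return [str(ipaddress.ip_address(start + i * inc)) for i in range(count)]
```

With the default step of 0.0.0.1, expand_ips("10.0.0.0", "0.0.0.1", 3) yields 10.0.0.0, 10.0.0.1 and 10.0.0.2; a larger step spaces the generated addresses further apart.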

Traffic Configuration via CLI

While traffic configuration can be modified using the configuration file, it can be inconvenient to have to change the configuration file every time you need to change a traffic configuration option. Traffic configuration options can be overridden with a few CLI options.

Here is an example of configuring traffic via CLI:

nfvbench --rate 10kpps --service-chain-count 2 -fs 64 -fs IMIX -fs 1518 --unidir

This command will run NFVbench with a unidirectional flow for three packet sizes 64B, IMIX, and 1518B.

Used parameters:

  • --rate 10kpps : defines rate of packets sent by traffic generator (total TX rate)
  • -scc 2 or --service-chain-count 2 : specifies number of parallel chains of given flow to run (default to 1)
  • -fs 64 or --frame-size 64: add the specified frame size to the list of frame sizes to run
  • --unidir : run traffic with unidirectional flow (default to bidirectional flow)
MAC Addresses

NFVbench will discover the MAC addresses to use for generated frames using:

  • either OpenStack discovery (find the MAC of an existing VM) in the case of PVP and PVVP service chains
  • or dynamic ARP discovery (find the MAC from the IP) in the case of external chains

Status and Cleanup of NFVbench Resources

The --status option will display the status of NFVbench and list any NFVbench resources. You need to pass the OpenStack RC file in order to connect to OpenStack.

# nfvbench --status -r /tmp/nfvbench/openrc
2018-04-09 17:05:48,682 INFO Version: 1.3.2.dev1
2018-04-09 17:05:48,683 INFO Status: idle
2018-04-09 17:05:48,757 INFO Discovering instances nfvbench-loop-vm...
2018-04-09 17:05:49,252 INFO Discovering flavor nfvbench.medium...
2018-04-09 17:05:49,281 INFO Discovering networks...
2018-04-09 17:05:49,365 INFO No matching NFVbench resources found
#

The Status can be either “idle” or “busy (run pending)”.

The --cleanup option will first discover resources created by NFVbench and prompt if you want to proceed with cleaning them up. Example of run:

# nfvbench --cleanup -r /tmp/nfvbench/openrc
2018-04-09 16:58:00,204 INFO Version: 1.3.2.dev1
2018-04-09 16:58:00,205 INFO Status: idle
2018-04-09 16:58:00,279 INFO Discovering instances nfvbench-loop-vm...
2018-04-09 16:58:00,829 INFO Discovering flavor nfvbench.medium...
2018-04-09 16:58:00,876 INFO Discovering networks...
2018-04-09 16:58:00,960 INFO Discovering ports...
2018-04-09 16:58:01,012 INFO Discovered 6 NFVbench resources:
+----------+-------------------+--------------------------------------+
| Type     | Name              | UUID                                 |
|----------+-------------------+--------------------------------------|
| Instance | nfvbench-loop-vm0 | b039b858-777e-467e-99fb-362f856f4a94 |
| Flavor   | nfvbench.medium   | a027003c-ad86-4f24-b676-2b05bb06adc0 |
| Network  | nfvbench-net0     | bca8d183-538e-4965-880e-fd92d48bfe0d |
| Network  | nfvbench-net1     | c582a201-8279-4309-8084-7edd6511092c |
| Port     |                   | 67740862-80ac-4371-b04e-58a0b0f05085 |
| Port     |                   | b5db95b9-e419-4725-951a-9a8f7841e66a |
+----------+-------------------+--------------------------------------+
2018-04-09 16:58:01,013 INFO NFVbench will delete all resources shown...
Are you sure? (y/n) y
2018-04-09 16:58:01,865 INFO Deleting instance nfvbench-loop-vm0...
2018-04-09 16:58:02,058 INFO     Waiting for 1 instances to be fully deleted...
2018-04-09 16:58:02,182 INFO     1 yet to be deleted by Nova, retries left=6...
2018-04-09 16:58:04,506 INFO     1 yet to be deleted by Nova, retries left=5...
2018-04-09 16:58:06,636 INFO     1 yet to be deleted by Nova, retries left=4...
2018-04-09 16:58:08,701 INFO Deleting flavor nfvbench.medium...
2018-04-09 16:58:08,729 INFO Deleting port 67740862-80ac-4371-b04e-58a0b0f05085...
2018-04-09 16:58:09,102 INFO Deleting port b5db95b9-e419-4725-951a-9a8f7841e66a...
2018-04-09 16:58:09,620 INFO Deleting network nfvbench-net0...
2018-04-09 16:58:10,357 INFO Deleting network nfvbench-net1...
#

The --force-cleanup option will do the same but without prompting for confirmation.

NFVbench Fluentd Integration

NFVbench has an optional fluentd integration to save logs and results.

Configuring Fluentd to receive NFVbench logs and results

The following configurations should be added to Fluentd configuration file to enable logs or results.

To receive logs, and forward to a storage server:

In the example below, nfvbench is the tag name for logs (which should match logging_tag in the NFVbench configuration), and the storage backend is Elasticsearch running at localhost:9200.

<match nfvbench.**>
@type copy
<store>
    @type elasticsearch
    host localhost
    port 9200
    logstash_format true
    logstash_prefix nfvbench
    utc_index false
    flush_interval 15s
</store>
</match>

To receive results, and forward to a storage server:

In the example below, resultnfvbench is the tag name for results (which should match result_tag in the NFVbench configuration), and the storage backend is Elasticsearch running at localhost:9200.

<match resultnfvbench.**>
@type copy
<store>
    @type elasticsearch
    host localhost
    port 9200
    logstash_format true
    logstash_prefix resultnfvbench
    utc_index false
    flush_interval 15s
</store>
</match>
Configuring NFVbench to connect Fluentd

To configure NFVbench to connect to Fluentd, fill in the following parameters in the configuration file:

Configuration   Description
logging_tag     Tag for NFVbench logs; should match the tag defined in the Fluentd configuration
result_tag      Tag for NFVbench results; should match the tag defined in the Fluentd configuration
ip              IP address of the Fluentd server
port            Port number of the Fluentd server

An example configuration for a Fluentd server listening at 127.0.0.1:24224, with logging tag nfvbench and result tag resultnfvbench:

fluentd:
    # by default (logging_tag is empty) nfvbench log messages are not sent to fluentd
    # to enable logging to fluentd, specify a valid fluentd tag name to be used for the
    # log records
    logging_tag: nfvbench

    # by default (result_tag is empty) nfvbench results are not sent to fluentd
    # to enable sending nfvbench results to fluentd, specify a valid fluentd tag name
    # to be used for the results records, which is different than logging_tag
    result_tag: resultnfvbench

    # IP address of the server, defaults to loopback
    ip: 127.0.0.1

    # port # to use, by default, use the default fluentd forward port
    port: 24224
Example of logs and results

An example of log obtained from fluentd by elasticsearch:

{
  "_index": "nfvbench-2017.10.17",
  "_type": "fluentd",
  "_id": "AV8rhnCjTgGF_dX8DiKK",
  "_version": 1,
  "_score": 3,
  "_source": {
    "loglevel": "INFO",
    "message": "Service chain 'PVP' run completed.",
    "@timestamp": "2017-10-17T18:09:09.516897+0000",
    "runlogdate": "2017-10-17T18:08:51.851253+0000"
  },
  "fields": {
    "@timestamp": [
      1508263749516
    ]
  }
}

For each packet size and rate a result record is sent. Users can label those results by passing the --user-label parameter to the NFVbench run.

The results of such a run, as obtained from fluentd by elasticsearch:

{
  "_index": "resultnfvbench-2017.10.17",
  "_type": "fluentd",
  "_id": "AV8rjYlbTgGF_dX8Drl1",
  "_version": 1,
  "_score": null,
  "_source": {
    "compute_nodes": [
      "nova:compute-3"
    ],
    "total_orig_rate_bps": 200000000,
    "@timestamp": "2017-10-17T18:16:43.755240+0000",
    "frame_size": "64",
    "forward_orig_rate_pps": 148809,
    "flow_count": 10000,
    "avg_delay_usec": 6271,
    "total_tx_rate_pps": 283169,
    "total_tx_rate_bps": 190289668,
    "forward_tx_rate_bps": 95143832,
    "reverse_tx_rate_bps": 95145836,
    "forward_tx_rate_pps": 141583,
    "chain_analysis_duration": "60.091",
    "service_chain": "PVP",
    "version": "1.0.10.dev1",
    "runlogdate": "2017-10-17T18:10:12.134260+0000",
    "Encapsulation": "VLAN",
    "user_label": "nfvbench-label",
    "min_delay_usec": 70,
    "profile": "traffic_profile_64B",
    "reverse_rx_rate_pps": 68479,
    "reverse_rx_rate_bps": 46018044,
    "reverse_orig_rate_pps": 148809,
    "total_rx_rate_bps": 92030085,
    "drop_rate_percent": 51.6368455626846,
    "forward_orig_rate_bps": 100000000,
    "bidirectional": true,
    "vSwitch": "OPENVSWITCH",
    "sc_count": 1,
    "total_orig_rate_pps": 297618,
    "type": "single_run",
    "reverse_orig_rate_bps": 100000000,
    "total_rx_rate_pps": 136949,
    "max_delay_usec": 106850,
    "forward_rx_rate_pps": 68470,
    "forward_rx_rate_bps": 46012041,
    "reverse_tx_rate_pps": 141586
  },
  "fields": {
    "@timestamp": [
      1508264203755
    ]
  },
  "sort": [
    1508264203755
  ]
}
Testing SR-IOV

NFVbench supports SR-IOV with the PVP packet flow (PVVP is not supported). SR-IOV support is not applicable for external chains since the networks have to be setup externally (and can themselves be pre-set to use SR-IOV or not).

Pre-requisites

To test SR-IOV you need to have compute nodes configured to support one or more SR-IOV interfaces (also known as PF or physical function) and you need OpenStack to be configured to support SR-IOV. You will also need to know:

  • the name of the physical networks associated with your SR-IOV interfaces (this is a configuration in Nova compute)
  • the VLAN range that can be used on the switch ports that are wired to the SR-IOV ports; such switch ports are normally configured in trunk mode with a range of VLAN IDs enabled on that port

For example, in the case of 2 SR-IOV ports per compute node, 2 physical networks are generally configured in OpenStack with distinct names. The VLAN range to use is also allocated and reserved by the network administrator in coordination with the corresponding top of rack switch port configuration.

Configuration

To enable SR-IOV testing, you will need to provide the following configuration options to NFVbench (in the configuration file). This example instructs NFVbench to create the left and right networks of a PVP packet flow on 2 SR-IOV ports named "phys_sriov0" and "phys_sriov1" using segmentation IDs 2000 and 2001 respectively:

internal_networks:
   left:
       segmentation_id: 2000
       physical_network: phys_sriov0
   right:
       segmentation_id: 2001
       physical_network: phys_sriov1

The segmentation ID fields must be different. In the case of PVVP, the middle network also needs to be provisioned properly. The same physical network can also be shared by the virtual networks but with different segmentation IDs.

NIC NUMA socket placement and flavors

If the 2 selected ports reside on NICs that are on different NUMA sockets, you will need to explicitly tell Nova to use 2 NUMA nodes in the flavor used for the VMs in order to satisfy the filters, for example:

flavor:
  # Number of vCPUs for the flavor
  vcpus: 2
  # Memory for the flavor in MB
  ram: 8192
  # Size of local disk in GB
  disk: 0
  extra_specs:
      "hw:cpu_policy": dedicated
      "hw:mem_page_size": large
      "hw:numa_nodes": 2

Failure to do so might cause the VM creation to fail with the Nova error “Instance creation error: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology.”

NFVbench Server mode and NFVbench client API

NFVbench can run as an HTTP server to:

  • optionally provide access to arbitrary HTML files (HTTP server function)
  • service fully parameterized asynchronous run requests using the HTTP protocol (REST/json with polling)
  • service fully parameterized run requests with interval stats reporting using the WebSocket/SocketIO protocol
Start the NFVbench server

To run in server mode, simply use the --server <http_root_path> option and optionally the listen address to use (--host <ip>, default is 0.0.0.0) and listening port to use (--port <port>, default is 7555).

If HTTP files are to be serviced, they must be stored right under the http root path. This root path must contain a static folder to hold static files (css, js) and a templates folder with at least an index.html file to be used as the template. This mode is convenient when you do not already have a web server hosting the UI front end. If HTTP file servicing is not needed (REST only or WebSocket/SocketIO mode), the root path can point to any dummy folder.

Once started, the NFVbench server will be ready to service HTTP or WebSocket/SocketIO requests at the advertised URL.

Example of NFVbench server start in a container:

# get to the container shell (assume the container name is "nfvbench")
docker exec -it nfvbench bash
# from the container shell start the NFVbench server in the background
nfvbench -c /tmp/nfvbench/nfvbench.cfg --server /tmp &
# exit container
exit
HTTP Interface
<http-url>/echo (GET)

This request simply returns whatever content is sent in the body of the request (body should be in json format, only used for testing)

Example request:

curl -XGET '127.0.0.1:7556/echo' -H "Content-Type: application/json" -d '{"nfvbench": "test"}'
Response:
{
  "nfvbench": "test"
}
<http-url>/status (GET)

This request fetches the status of an asynchronous run. It will return in json format:

  • a request pending reply (if the run is still not completed)
  • an error reply if there is no run pending
  • or the complete result of the run

The client can keep polling until the run completes.
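A client-side polling loop could look like the sketch below; run_finished only inspects the JSON replies documented in this section, and the HTTP GET uses the Python standard library (status_url is a placeholder for <http-url>/status):

```python
import json
import time
import urllib.request

def run_finished(reply):
    """True once the /status reply is no longer a pending one."""
    return reply.get("status") != "PENDING"

def poll_status(status_url, interval_sec=5, max_tries=120):
    """Poll <http-url>/status until the run completes and return the final reply."""
    for _ in range(max_tries):
        with urllib.request.urlopen(status_url) as resp:
            reply = json.load(resp)
        if run_finished(reply):
            return reply  # either the complete result or an error reply
        time.sleep(interval_sec)
    raise TimeoutError("NFVbench run still pending after polling")
```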

Example of return when the run is still pending:

{
  "error_message": "nfvbench run still pending",
  "status": "PENDING"
}

Example of return when the run completes:

{
  "result": {...},
  "status": "OK"
}
<http-url>/start_run (POST)

This request starts an NFVbench run with the passed configuration. If no configuration is passed, a run with the default configuration will be executed.

Example request: curl -XPOST 'localhost:7555/start_run' -H "Content-Type: application/json" -d @nfvbenchconfig.json

See "NFVbench configuration JSON parameter" below for details on how to format this parameter.

The request returns immediately with JSON content indicating whether there was an error (status=ERROR) or the request was submitted successfully (status=PENDING). Example of the return when the submission is successful:

{
  "error_message": "NFVbench run still pending",
  "request_id": "42cccb7effdc43caa47f722f0ca8ec96",
  "status": "PENDING"
}

If there is already an NFVbench run in progress, it will return:

{
 "error_message": "there is already an NFVbench request running",
 "status": "ERROR"
}
WebSocket/SocketIO events

List of SocketIO events supported:

Client to Server

start_run:

sent by the client to start a new run with the configuration passed as an argument. The configuration can be any valid NFVbench configuration passed as a JSON document (see "NFVbench configuration JSON parameter" below).
Server to Client

run_interval_stats:

sent by the server to report statistics during a run; the message contains the statistics {"time_ms": time_ms, "tx_pps": tx_pps, "rx_pps": rx_pps, "drop_pct": drop_pct}

ndr_found:

(during NDR-PDR search) sent by the server when the NDR rate is found; the message contains the NDR value {"rate_pps": ndr_pps}

pdr_found:

(during NDR-PDR search) sent by the server when the PDR rate is found; the message contains the PDR value {"rate_pps": pdr_pps}

run_end:

sent by the server to report the end of a run; the message contains the complete results in JSON format
NFVbench configuration JSON parameter

The NFVbench configuration describes the parameters of an NFVbench run and can be passed to the NFVbench server as a JSON document.

Default configuration

The simplest JSON document is the empty dictionary "{}", which indicates that the default NFVbench configuration is to be used:

  • PVP
  • NDR-PDR measurement
  • 64 byte packets
  • 1 flow per direction

The entire default configuration can be viewed using the --show-config option on the CLI:

# nfvbench --show-config
{
    "availability_zone": null,
    "compute_node_user": "root",
    "compute_nodes": null,
    "debug": false,
    "duration_sec": 60,
    "flavor": {
        "disk": 0,
        "extra_specs": {
            "hw:cpu_policy": "dedicated",
            "hw:mem_page_size": 2048
        },
        "ram": 8192,
        "vcpus": 2
    },
    "flavor_type": "nfv.medium",
    "flow_count": 1,
    "generic_poll_sec": 2,
    "generic_retry_count": 100,
    "inter_node": false,
    "internal_networks": {
        "left": {
            "name": "nfvbench-net0",
            "subnet": "nfvbench-subnet0",
            "cidr": "192.168.1.0/24"
        },
        "right": {
            "name": "nfvbench-net1",
            "subnet": "nfvbench-subnet1",
            "cidr": "192.168.2.0/24"
        },
        "middle": {
            "name": "nfvbench-net2",
            "subnet": "nfvbench-subnet2",
            "cidr": "192.168.3.0/24"
        }
    },
    "interval_sec": 10,
    "json": null,
    "loop_vm_name": "nfvbench-loop-vm",
    "measurement": {
        "NDR": 0.001,
        "PDR": 0.1,
        "load_epsilon": 0.1
    },
    "name": "(built-in default config)",
    "no_cleanup": false,
    "no_traffic": false,
    "openrc_file": "/tmp/nfvbench/openstack/openrc",
    "rate": "ndr_pdr",
    "service_chain": "PVP",
    "service_chain_count": 1,
    "sriov": false,
    "std_json": null,
    "traffic": {
        "bidirectional": true,
        "profile": "traffic_profile_64B"
    },
    "traffic_generator": {
        "default_profile": "trex-local",
        "gateway_ip_addrs": [
            "1.1.0.2",
            "2.2.0.2"
        ],
        "gateway_ip_addrs_step": "0.0.0.1",
        "generator_profile": [
            {
                "cores": 3,
                "interfaces": [
                    {
                        "pci": "0a:00.0",
                        "port": 0,
                        "switch_port": "Ethernet1/33",
                        "vlan": null
                    },
                    {
                        "pci": "0a:00.1",
                        "port": 1,
                        "switch_port": "Ethernet1/34",
                        "vlan": null
                    }
                ],
                "intf_speed": null,
                "ip": "127.0.0.1",
                "name": "trex-local",
                "tool": "TRex"
            }
        ],
        "host_name": "nfvbench_tg",
        "ip_addrs": [
            "10.0.0.0/8",
            "20.0.0.0/8"
        ],
        "ip_addrs_step": "0.0.0.1",
        "mac_addrs": [
            "00:10:94:00:0A:00",
            "00:11:94:00:0A:00"
        ],
        "step_mac": null,
        "tg_gateway_ip_addrs": [
            "1.1.0.100",
            "2.2.0.100"
        ],
        "tg_gateway_ip_addrs_step": "0.0.0.1"
    },
    "traffic_profile": [
        {
            "l2frame_size": [
                "64"
            ],
            "name": "traffic_profile_64B"
        },
        {
            "l2frame_size": [
                "IMIX"
            ],
            "name": "traffic_profile_IMIX"
        },
        {
            "l2frame_size": [
                "1518"
            ],
            "name": "traffic_profile_1518B"
        },
        {
            "l2frame_size": [
                "64",
                "IMIX",
                "1518"
            ],
            "name": "traffic_profile_3sizes"
        }
    ],
    "unidir_reverse_traffic_pps": 1,
    "vlan_tagging": true
}
Common examples of JSON configuration

Use the default configuration but with 10000 flows per direction (instead of 1):

{ "flow_count": 10000 }

Use the default configuration but with 10000 flows, the "EXT" chain and IMIX packet size:

{
    "flow_count": 10000,
    "service_chain": "EXT",
    "traffic": {
        "profile": "traffic_profile_IMIX"
    }
}

A short run of 5 seconds at a fixed rate of 1Mpps (and everything else same as the default configuration):

{
    "duration_sec": 5,
    "rate": "1Mpps"
}
Example of interaction with the NFVbench server using HTTP and curl

HTTP requests can be sent directly to the NFVbench server from CLI using curl from any host that can connect to the server (here we run it from the local host).

This is a POST request to start a run using the default NFVbench configuration but with traffic generation disabled (the "no_traffic" property is set to true):

[root@sjc04-pod3-mgmt ~]# curl -H "Accept: application/json" -H "Content-type: application/json" -X POST -d '{"no_traffic":true}' http://127.0.0.1:7555/start_run
{
  "error_message": "nfvbench run still pending",
  "status": "PENDING"
}
[root@sjc04-pod3-mgmt ~]#

This request will return immediately with status set to "PENDING" if the request was started successfully.

The status can be polled until the run completes. Here the poll returns a "PENDING" status, indicating that the run has not yet completed:

[root@sjc04-pod3-mgmt ~]# curl -G http://127.0.0.1:7555/status
{
  "error_message": "nfvbench run still pending",
  "status": "PENDING"
}
[root@sjc04-pod3-mgmt ~]#

Finally, the status request returns an "OK" status along with the full results (truncated here):

[root@sjc04-pod3-mgmt ~]# curl -G http://127.0.0.1:7555/status
{
  "result": {
    "benchmarks": {
      "network": {
        "service_chain": {
          "PVP": {
            "result": {
              "bidirectional": true,
              "compute_nodes": {
                "nova:sjc04-pod3-compute-4": {
                  "bios_settings": {
                    "Adjacent Cache Line Prefetcher": "Disabled",
                    "All Onboard LOM Ports": "Enabled",
                    "All PCIe Slots OptionROM": "Enabled",
                    "Altitude": "300 M",
...

    "date": "2017-03-31 22:15:41",
    "nfvbench_version": "0.3.5",
    "openstack_spec": {
      "encaps": "VxLAN",
      "vswitch": "VTS"
    }
  },
  "status": "OK"
}
[root@sjc04-pod3-mgmt ~]#
Example of interaction with the NFVbench server using a python CLI app (nfvbench_client)

The module client/client.py contains an example Python class that can be used to control the NFVbench server from a Python app using HTTP or WebSocket/SocketIO.

The module client/nfvbench_client.py has a simple main application to control the NFVbench server from the CLI. The "nfvbench_client" wrapper script can be used to invoke the client front end (this wrapper is pre-installed in the NFVbench container).

Example invocation of the nfvbench_client front end from the host (assuming the NFVbench container is named "nfvbench"), using the default NFVbench configuration but without generating traffic (no_traffic property set to true; the full JSON result is truncated here):

[root@sjc04-pod3-mgmt ~]# docker exec -it nfvbench nfvbench_client -c '{"no_traffic":true}' http://127.0.0.1:7555
{u'status': u'PENDING', u'error_message': u'nfvbench run still pending'}
{u'status': u'PENDING', u'error_message': u'nfvbench run still pending'}
{u'status': u'PENDING', u'error_message': u'nfvbench run still pending'}

{u'status': u'OK', u'result': {u'date': u'2017-03-31 22:04:59', u'nfvbench_version': u'0.3.5',
 u'config': {u'compute_nodes': None, u'compute_node_user': u'root', u'traffic_generator': {u'tg_gateway_ip_addrs': [u'1.1.0.100', u'2.2.0.100'], u'ip_addrs_step': u'0.0.0.1',
 u'step_mac': None, u'generator_profile': [{u'intf_speed': u'', u'interfaces': [{u'pci': u'0a:00.0', u'port': 0, u'vlan': 1998, u'switch_port': None},

...

[root@sjc04-pod3-mgmt ~]#

The HTTP interface is used unless --use-socketio is specified.

Example of invocation using WebSocket/SocketIO, executing NFVbench with the default configuration but with a duration of 5 seconds and a fixed rate of 5kpps:

[root@sjc04-pod3-mgmt ~]# docker exec -it nfvbench nfvbench_client -c '{"duration":5,"rate":"5kpps"}' --use-socketio  http://127.0.0.1:7555 >results.json
Frequently Asked Questions
General Questions
Can NFVbench be used without OpenStack?

Yes. This can be done using the EXT chain mode, with or without ARP (depending on whether your system under test can do routing), and by setting the openrc_file property to empty in the NFVbench configuration.

Can NFVbench be used with a different traffic generator than TRex?

This is possible but requires developing a new python class to manage the new traffic generator interface.

Can I connect Trex directly to my compute node?

Yes.

Can I drive NFVbench using a REST interface?

NFVbench can run in server mode and accept HTTP or WebSocket/SocketIO events to run any type of measurement (fixed rate run or NDR_PDR run) with any run configuration.

Can I run NFVbench on a Cisco UCS-B series blade?

Yes, provided your UCS-B series server has a Cisco VIC 1340 (with a recent firmware version). TRex requires VIC firmware version 3.1(2) or higher for blade servers (which supports more filtering capabilities). In this setting, the two physical interfaces for data plane traffic are simply hooked to the UCS-B fabric interconnect (no need to connect to a switch).

Troubleshooting
TrafficClientException: End-to-end connectivity cannot be ensured

Prior to running a benchmark, NFVbench will make sure that traffic is passing in the service chain by sending a small flow of packets in each direction and verifying that they are received back at the other end. This exception means that NFVbench cannot pass any traffic in the service chain.

The most common issues that prevent traffic from passing are incorrect wiring of the NFVbench/TRex interfaces, and an incorrect vlan_tagging setting in the NFVbench configuration (this setting needs to match how the NFVbench ports on the switch are configured, trunk or access port):

  • if the switch port is configured as an access port, you must disable vlan_tagging in the NFVbench configuration
  • if the switch port is configured as a trunk (recommended method), you must enable it
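For example, when the switch ports are access ports, the JSON configuration passed to the server (or the equivalent entry in the configuration file) would include:

```json
{ "vlan_tagging": false }
```

For trunk ports, vlan_tagging stays at its default of true, as shown in the default configuration above.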

Storperf

StorPerf User Guide
1. StorPerf Introduction

The purpose of StorPerf is to provide a tool to measure ephemeral and block storage performance of OpenStack.

A key challenge to measuring disk performance is to know when the disk (or, for OpenStack, the virtual disk or volume) is performing at a consistent and repeatable level of performance. Initial writes to a volume can perform poorly due to block allocation, and reads can appear instantaneous when reading empty blocks. How do we know when the data reported is valid? The Storage Network Industry Association (SNIA) has developed methods which enable manufacturers to set, and customers to compare, the performance specifications of Solid State Storage devices. StorPerf applies this methodology to OpenStack Cinder and Glance services to provide a high level of confidence in the performance metrics in the shortest reasonable time.

1.1. How Does StorPerf Work?

Once launched, StorPerf presents a ReST interface, along with a Swagger UI that makes it easier to form HTTP ReST requests. Issuing an HTTP POST to the configurations API causes StorPerf to talk to OpenStack’s heat service to create a new stack with as many agent VMs and attached Cinder volumes as specified.

After the stack is created, we can issue one or more jobs by issuing a POST to the jobs ReST API. The job is the smallest unit of work that StorPerf can use to measure the disk’s performance.

While the job is running, StorPerf collects the performance metrics from each of the disks under test every minute. Once the trend of metrics matches the criteria specified in the SNIA methodology, the job automatically terminates and the valid set of metrics is available for querying.

What are the criteria? Simply put, when the measured metrics start to "flat line" and stay within that range for the specified amount of time, the metrics are considered to be indicative of a repeatable level of performance.
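The "flat line" idea can be illustrated with a small sketch. The 20% spread and 10% slope thresholds and the window handling here are illustrative only; StorPerf encodes the actual SNIA test specification:

```python
def is_steady_state(samples, range_limit=0.20, slope_limit=0.10):
    """Illustrative 'flat line' check over a measurement window:
    both the spread of the samples and the total drift of a
    least-squares fit must stay within a fraction of the average."""
    n = len(samples)
    if n < 2:
        return False
    avg = sum(samples) / n
    # Spread criterion: max excursion within range_limit of the average.
    if max(samples) - min(samples) > range_limit * avg:
        return False
    # Slope criterion: total drift across the window (least-squares
    # slope times window length) within slope_limit of the average.
    xs = range(n)
    x_avg = sum(xs) / n
    slope = (sum((x - x_avg) * (y - avg) for x, y in zip(xs, samples))
             / sum((x - x_avg) ** 2 for x in xs))
    return abs(slope * (n - 1)) <= slope_limit * avg
```

In StorPerf, a check of this kind is applied to the per-minute samples described above to decide when the job can terminate.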

1.2. StorPerf Testing Guidelines

First of all, StorPerf is not able to give pointers on how to tune a Cinder implementation, as there are far too many backends (Ceph, NFS, LVM, etc), each with their own methods of tuning. StorPerf is here to assist in getting a reliable performance measurement by encoding the test specification from SNIA, and helping present the results in a way that makes sense.

Having said that, there are some general guidelines that we can present to assist with planning a performance test.

1.2.1. Workload Modelling

This is an important item to address as there are many parameters to how data is accessed. Databases typically use a fixed block size and tend to manage their data so that sequential access is more likely. GPS image tiles can be around 20-60kb and will be accessed by reading the file in full, with no easy way to predict which tiles will be needed next. Some programs are able to submit I/O asynchronously, while others use multiple threads and may be synchronous. There is no one size fits all here, so knowing what type of I/O pattern we need to model is critical to getting realistic measurements.

1.2.2. System Under Test

The unfortunate part is that StorPerf does not have any knowledge about the underlying OpenStack itself – we can only see what is available through OpenStack APIs, and none of them provide details about the underlying storage implementation. As the test executor, we need to know information such as: the number of disks or storage nodes; the amount of RAM available for caching; the type of connection to the storage and bandwidth available.

1.2.3. Measure Storage, not Cache

As part of the test data size, we need to ensure that we prevent caching from interfering in the measurements. The total size of the data set in the test must exceed the total size of all the disk cache memory available by a certain amount in order to ensure we are forcing non-cached I/O. There is no exact science here, but if we balance test duration against cache hit ratio, it can be argued that 20% cache hit is good enough and increasing file size would result in diminishing returns. Let’s break this number down a bit. Given a cache size of 10GB, we could write, then read the following dataset sizes:

  • 10GB gives 100% cache hit
  • 20GB gives 50% cache hit
  • 50GB gives 20% cache hit
  • 100GB gives 10% cache hit

This means that for the first test, 100% of the results are unreliable due to cache. At 50GB, the true performance without cache has only a 20% margin of error. Given the fact that the 100GB would take twice as long, and that we are only reducing the margin of error by 10%, we recommend this as the best tradeoff.
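The cache-hit figures above follow from a one-line ratio of cache size to dataset size; a minimal sketch:

```python
def cache_hit_pct(cache_gb, dataset_gb):
    """Approximate cache-hit percentage when reading back a dataset
    that was just written through a cache of the given size."""
    return min(100.0, 100.0 * cache_gb / dataset_gb)

# Reproduces the four bullet figures above for a 10GB cache:
for size_gb in (10, 20, 50, 100):
    print(f"{size_gb} GB -> {cache_hit_pct(10, size_gb):.0f}% cache hit")
```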

How much cache do we actually have? This depends on the storage device being used. For hardware NAS or other arrays, it should be fairly easy to get the number from the manufacturer, but for software defined storage it can be harder to determine. Let's take Ceph as an example. Ceph runs as software on the bare metal server and therefore has access to all the RAM available on the server to use as its cache. Well, not exactly all the memory: we have to take into account the memory consumed by the operating system, by the Ceph processes, and by any other processes running on the same system. In the case of hyper-converged Ceph, where workload VMs and Ceph run on the same systems, it can become quite difficult to predict. Ultimately, the amount of memory that is left over is the cache for that single Ceph instance. We then need to add the memory available from all the other Ceph storage nodes in the environment.

Time for another example: given 3 Ceph storage nodes with 256GB RAM each, reserving memory for the OS and other processes leaves approximately 240GB per node. This gives us 3 x 240, or 720GB, of total RAM available for cache. The total amount of data we want to write in order to initialize our Cinder volumes would then be 5 x 720, or 3,600 GB. The following illustrates some ways to allocate the data:

  • 1 VM with 1 3,600 GB volume
  • 10 VMs each with 1 360 GB volume
  • 2 VMs each with 5 360 GB volumes
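The sizing walk-through reduces to two multiplications, where the 5x multiplier comes from the 20% cache-hit rule of thumb above:

```python
def required_dataset_gb(nodes, cache_gb_per_node, cache_multiplier=5):
    """Total data to write so that only 1/cache_multiplier of it
    (here 20%) can possibly remain in the aggregate RAM cache."""
    return cache_multiplier * nodes * cache_gb_per_node

total_gb = required_dataset_gb(3, 240)  # 3 Ceph nodes, ~240 GB cache each
per_vm_gb = total_gb / 10               # e.g. 10 VMs -> 360 GB volumes each
```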
1.2.4. Back to Modelling

Now that we know there is 3.6 TB of data to be written, we need to go back to the workload model to determine how we are going to write it. Factors to consider:

  • Number of Volumes. We might be simulating a single database of 3.6 TB, so only 1 Cinder volume is needed to represent this. Or, we might be simulating a web server farm where there are hundreds of processes accessing many different volumes. In this case, we divide the 3.6 TB by the number of volumes, making each volume smaller.
  • Number of Virtual Machines. We might have one monster VM that will drive all our I/O in the system, or maybe there are hundreds of VMs, each with their own individual volume. Using Ceph as an example again, we know that it allows for a single VM to consume all the Ceph resources, which can be perceived as a problem in terms of multi-tenancy and scaling. A common practice to mitigate this is to use Cinder to throttle IOPS at the VM level. If this technique is being used in the environment under test, we must adjust the number of VMs used in the test accordingly.
  • Block Size. We need to know if the application is managing the volume as a raw device (i.e. /dev/vdb) or as a filesystem mounted over the device. Different filesystems have their own block sizes: ext4 only allows 1024, 2048 or 4096 bytes as the block size. Typically the larger the block, the better the throughput; however, as blocks must be written as an atomic unit, larger block sizes can also reduce effective throughput by having to pad the block if the content is smaller than the actual block size.
  • I/O Depth. This represents the amount of I/O that the application can issue simultaneously. In a multi-threaded app, or one that uses asynchronous I/O, it is possible to have multiple read or write requests outstanding at the same time. For example, with software defined storage where there is an Ethernet network between the client and the storage, the storage would have a higher latency for each I/O, but is capable of accepting many requests in parallel. With an I/O depth of 1, we spend time waiting for the network latency before a response comes back. With higher I/O depth, we can get more throughput despite each I/O having higher latency. Typically, we do not see applications that would go beyond a queue depth of 8, however this is not a firm rule.
  • Data Access Pattern. We need to know if the application typically reads data sequentially or randomly, as well as what the mixture of read vs. write is. It is possible to measure read by itself, or write by itself, but this is not typical behavior for applications. It is useful for determining the potential maximum throughput of a given type of operation.
1.2.5. Fastest Path to Results

Once we have the information gathered, we can now start executing some tests. Let’s take some of the points discussed above and describe our system:

  • OpenStack deployment with 3 Control nodes, 5 Compute nodes and 3 dedicated Ceph storage nodes.
  • Ceph nodes each have 240 GB RAM available to be used as cache.
  • Our application writes directly to the raw device (/dev/vdb)
  • There will be 10 instances of the application running, each with its own volume.
  • Our application can use block sizes of 4k or 64k.
  • Our application is capable of maintaining up to 6 I/O operations simultaneously.

The first thing we know is that we want to keep our cache hit ratio around 20%, so we will be moving 3,600 GB of data. We also know this will take a significant amount of time, so here is where StorPerf helps.

First, we use the configurations API to launch our 10 virtual machines each with a 360 GB volume. Next comes the most time consuming part: we call the initializations API to fill each one of these volumes with random data. By preloading the data, we ensure a number of things:

  • The storage device has had to fully allocate all of the space for our volumes. This is especially important for software defined storage like Ceph, which is smart enough to know if data is being read from a block that has never been written. No data on disk means no disk read is needed and the response is immediate.
  • The RAM cache has been overrun multiple times. Only 20% of what was written can possibly remain in cache.

This last part is important as we can now use StorPerf’s implementation of SNIA’s steady state algorithm to ensure our follow up tests execute as quickly as possible. Given the fact that 80% of the data in any given test results in a cache miss, we can run multiple tests in a row without having to re-initialize or invalidate the cache again in between test runs. We can also mix and match the types of workloads to be run in a single performance job submission.

Now we can submit a job to the jobs API to execute a 70%/30% mix of read/write, with a block size of 4k and an I/O queue depth of 6. This job will run until either the maximum time has expired, or until StorPerf detects steady state has been reached, at which point it will immediately complete and report the results of the measurements.

StorPerf uses FIO as its workload engine, so whatever workload parameters we would like to use with FIO can be passed directly through via StorPerf’s jobs API.
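The job described above maps onto the following FIO workload parameters. The FIO option names themselves are standard; how they are keyed in the jobs API body should be confirmed against the Swagger UI:

```
rw=randrw        # mixed random read/write
rwmixread=70     # 70% reads / 30% writes
bs=4k            # 4 KB block size
iodepth=6        # up to 6 outstanding I/Os
```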

1.3. What Data Can We Get?

StorPerf provides the following metrics:

  • IOPS
  • Bandwidth (number of kilobytes read or written per second)
  • Latency

These metrics are available for every job, and for the specific workloads, I/O loads and I/O types (read, write) associated with the job.

For each metric, StorPerf also provides the set of samples that were collected along with the slope, min and max values that can be used for plotting or comparison.

As of this time, StorPerf only provides textual reports of the metrics.

1. StorPerf Installation Guide
1.1. OpenStack Prerequisites

If you do not have an Ubuntu 16.04 image in Glance, you will need to add one. You also need to create the StorPerf flavor, or choose one that closely matches. For Ubuntu 16.04, it must have a minimum of a 4 GB disk. It should also have about 8 GB RAM to support FIO’s memory mapping of written data blocks to ensure 100% coverage of the volume under test.

There are scripts in the storperf/ci directory to assist, or you can use the following code snippets:

# Put an Ubuntu Image in glance
wget -q https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.img
openstack image create "Ubuntu 16.04 x86_64" --disk-format qcow2 --public \
    --container-format bare --file ubuntu-16.04-server-cloudimg-amd64-disk1.img

# Create StorPerf flavor
openstack flavor create storperf \
    --id auto \
    --ram 8192 \
    --disk 4 \
    --vcpus 2
1.1.1. OpenStack Credentials

You must have your OpenStack Controller environment variables defined and passed to the StorPerf container. The easiest way to do this is to put the rc file contents into a clean file called admin.rc that looks similar to this for V2 authentication:

cat << 'EOF' > admin.rc
OS_AUTH_URL=http://10.13.182.243:5000/v2.0
OS_TENANT_ID=e8e64985506a4a508957f931d1800aa9
OS_TENANT_NAME=admin
OS_PROJECT_NAME=admin
OS_USERNAME=admin
OS_PASSWORD=admin
OS_REGION_NAME=RegionOne
EOF

For V3 authentication, at a minimum, use the following:

cat << 'EOF' > admin.rc
OS_AUTH_URL=http://10.10.243.14:5000/v3
OS_USERNAME=admin
OS_PASSWORD=admin
OS_PROJECT_DOMAIN_NAME=Default
OS_PROJECT_NAME=admin
OS_USER_DOMAIN_NAME=Default
EOF

Additionally, if you want your results published to the common OPNFV Test Results DB, add the following:

TEST_DB_URL=http://testresults.opnfv.org/testapi
1.2. Planning

StorPerf is delivered as a series of Docker containers managed by docker-compose. There are two possible methods for installation:

  1. Run the containers on bare metal
  2. Run the containers in a VM

Requirements:

  • Docker and docker-compose must be installed (note: sudo may be required if the user is not part of the docker group)
  • OpenStack Controller credentials are available
  • Host has access to the OpenStack Controller API
  • Host must have internet connectivity for downloading docker image
  • Enough OpenStack floating IPs must be available to match your agent count
  • A local directory for holding the Carbon DB Whisper files

Local disk is used for the Carbon DB storage, as the default size of the docker container is only 10g. Here is an example of how to create a local storage directory and set its permissions so that StorPerf can write to it:

mkdir -p ./carbon
sudo chown 33:33 ./carbon
1.3. Ports

The following ports are exposed if you use the supplied docker-compose.yaml file:

  • 5000 for StorPerf ReST API and Swagger UI

Note: Port 8000 is no longer exposed; Graphite can be accessed via http://storperf:5000/graphite

1.4. Running StorPerf Container

As of the Euphrates (development) release (June 2017), StorPerf has changed to use docker-compose in order to start its services.

Docker compose requires a local file to be created in order to define the services that make up the full StorPerf application. This file can be:

  • Manually created
  • Downloaded from the StorPerf git repo, or
  • Created via a helper script from the StorPerf git repo

Manual creation involves taking the sample in the StorPerf git repo and typing in the contents by hand on your target system.

1.5. Downloading From Git Repo
wget https://raw.githubusercontent.com/opnfv/storperf/master/docker-compose/docker-compose.yaml
sha256sum docker-compose.yaml

which should result in:

69856e9788bec36308a25303ec9154ed68562e126788a47d54641d68ad22c8b9  docker-compose.yaml

To run, you must specify three environment variables:

  • ENV_FILE, which points to your OpenStack admin.rc as noted above.
  • CARBON_DIR, which points to a directory that will be mounted to store the raw metrics.
  • TAG, which specifies the Docker tag for the build (e.g. latest, danube.3.0, etc.).

The following command will start all the StorPerf services:

TAG=latest ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=latest ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d

StorPerf is now available at http://docker-host:5000/

1.6. Downloading Helper Tool

A tool to help you get started with the docker-compose.yaml can be downloaded from:

wget https://raw.githubusercontent.com/opnfv/storperf/master/docker-compose/create-compose.py
sha256sum create-compose.py

which should result in:

327cad2a7b3a3ca37910978005c743799313c2b90709e4a3f142286a06e53f57  create-compose.py

Note: The script will run fine on python3. Install the python future package to avoid errors on python2.

pip install future
1.6.1. Docker Exec

If needed, any StorPerf container can be entered with docker exec. This is not normally required.

docker exec -it storperf-master /bin/bash
1.7. Pulling StorPerf Containers

The tags for StorPerf can be found here: https://hub.docker.com/r/opnfv/storperf-master/tags/

1.7.1. Master (latest)

This tag represents StorPerf at its most current state of development. While self-tests have been run, there is no guarantee that all features will be functional, and there may be bugs.

Documentation for latest can be found using the latest label at:

User Guide

For x86_64 based systems, use:

TAG=x86_64-latest ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=x86_64-latest ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d

For 64 bit ARM based systems, use:

TAG=aarch64-latest ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=aarch64-latest ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d
1.7.2. Release (stable)

This tag represents StorPerf at its most recent stable release. There are no known bugs; known issues and workarounds are documented in the release notes. Issues found here should be reported in JIRA:

https://jira.opnfv.org/secure/RapidBoard.jspa?rapidView=3

For x86_64 based systems, use:

TAG=x86_64-stable ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=x86_64-stable ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d

For 64 bit ARM based systems, use:

TAG=aarch64-stable ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=aarch64-stable ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d
1.7.3. Fraser (opnfv-6.0.0)

This tag represents the 6th OPNFV release and the 5th StorPerf release. There are no known bugs; known issues and workarounds are documented in the release notes. Documentation can be found under the Fraser label at:

http://docs.opnfv.org/en/stable-fraser/submodules/storperf/docs/testing/user/index.html

Issues found here should be reported against release 6.0.0 in JIRA:

https://jira.opnfv.org/secure/RapidBoard.jspa?rapidView=3

For x86_64 based systems, use:

TAG=x86_64-opnfv-6.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=x86_64-opnfv-6.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d

For 64 bit ARM based systems, use:

TAG=aarch64-opnfv-6.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=aarch64-opnfv-6.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d
1.7.4. Euphrates (opnfv-5.0.0)

This tag represents the 5th OPNFV release and the 4th StorPerf release. There are no known bugs; known issues and workarounds are documented in the release notes. Documentation can be found under the Euphrates label at:

http://docs.opnfv.org/en/stable-euphrates/submodules/storperf/docs/testing/user/index.html

Issues found here should be reported against release 5.0.0 in JIRA:

https://jira.opnfv.org/secure/RapidBoard.jspa?rapidView=3

For x86_64 based systems, use:

TAG=x86_64-opnfv-5.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=x86_64-opnfv-5.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d

For 64 bit ARM based systems, use:

TAG=aarch64-opnfv-5.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose pull
TAG=aarch64-opnfv-5.0.0 ENV_FILE=./admin.rc CARBON_DIR=./carbon/ docker-compose up -d
2. StorPerf Test Execution Guide
2.1. Prerequisites

This guide requires StorPerf to be running and have its ReST API accessible. If the ReST API is not running on port 5000, adjust the commands provided here as needed.

2.2. Interacting With StorPerf

Once the StorPerf container has been started and the ReST API exposed, you can interact directly with it using the ReST API. StorPerf comes with a Swagger interface that is accessible through the exposed port at:

http://StorPerf:5000/swagger/index.html

The typical test execution follows this pattern:

  1. Configure the environment
  2. Initialize the cinder volumes
  3. Execute one or more performance runs
  4. Delete the environment
2.3. Configure The Environment

The following pieces of information are required to prepare the environment:

  • The number of VMs/Cinder volumes to create.
  • The Cinder volume type (optional) to create
  • The Glance image that holds the VM operating system to use.
  • The OpenStack flavor to use when creating the VMs.
  • The name of the public network that agents will use.
  • The size, in gigabytes, of the Cinder volumes to create.
  • The number of the Cinder volumes to attach to each VM.
  • The availability zone (optional) in which the VM is to be launched. Defaults to nova.
  • The username (optional) if we specify a custom image.
  • The password (optional) for the above image.

Note: on ARM based platforms there is a known kernel bug which can prevent VMs from properly attaching Cinder volumes. There are two known workarounds:

  1. Create the environment with 0 Cinder volumes attached, and after the VMs have finished booting, modify the stack to have 1 or more Cinder volumes. See the section on Changing Stack Parameters later in this guide.
  2. Add the following image metadata to Glance. This will cause the Cinder volume to be mounted as a SCSI device, and therefore your target will be /dev/sdb, etc., instead of /dev/vdb. You will need to specify this in your warm up and workload jobs.

The ReST API is a POST to http://StorPerf:5000/api/v1.0/configurations and takes a JSON payload as follows.

{
  "agent_count": int,
  "agent_flavor": "string",
  "agent_image": "string",
  "availability_zone": "string",
  "password": "string",
  "public_network": "string",
  "username": "string",
  "volume_count": int,
  "volume_size": int,
  "volume_type": "string"
}

This call will block until the stack is created, at which point it will return the OpenStack heat stack id as well as the IP addresses of the slave agents.
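As a concrete sketch, the configuration call can be made with curl. The image, flavor and network names below are illustrative assumptions and must be replaced with values valid for your own OpenStack environment:

```shell
# Create a StorPerf environment of 2 agent VMs, each with one 2 GB Cinder
# volume. Image, flavor and network names are examples only.
curl -X POST --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     -d '{"agent_count": 2, "agent_flavor": "m1.small",
          "agent_image": "Ubuntu 16.04", "public_network": "external",
          "volume_count": 1, "volume_size": 2}' \
     http://StorPerf:5000/api/v1.0/configurations
```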

2.4. Initialize the Target Volumes

Before executing a test run for the purpose of measuring performance, it is necessary to fill the volume or file with random data. Failure to execute this step can result in meaningless numbers, especially for read performance. Most Cinder drivers are smart enough to know what blocks contain data, and which do not. Uninitialized blocks return “0” immediately without actually reading from the volume.

Initiating the data fill behaves similarly to a regular performance run, but tags the data with a special workload name, "_warm_up". It is designed to run to completion, filling 100% of the specified target with random data.

The ReST API is a POST to http://StorPerf:5000/api/v1.0/initializations and takes a JSON payload as follows. The body is optional unless your target is something other than /dev/vdb. For example, if you want to profile a glance ephemeral storage file, you could specify the target as “/filename.dat”, which is a file that then gets created on the root filesystem.

{
   "target": "/dev/vdb"
}
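For example, assuming StorPerf's ReST API is on the default port, the initialization can be started with curl (the body is only needed for a non-default target, but is shown here for completeness):

```shell
# Fill the target device with random data before measuring performance.
curl -X POST --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     -d '{"target": "/dev/vdb"}' \
     http://StorPerf:5000/api/v1.0/initializations
```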

This will return a job ID as follows.

{
  "job_id": "edafa97e-457e-4d3d-9db4-1d6c0fc03f98"
}

This job ID can be used to query the state to determine when it has completed. See the section on querying jobs for more information.

2.5. Execute a Performance Run

Performance runs can execute either a single workload, or iterate over a matrix of workload types, block sizes and queue depths.

2.5.1. Workload Types
rr
Read, Random. 100% read of random blocks
rs
Read, Sequential. 100% read of sequential blocks of data
rw
Read / Write Mix, Random. 70% read, 30% write
wr
Write, Random. 100% write of random blocks
ws
Write, Sequential. 100% write of sequential blocks.
2.5.2. Custom Workload Types

New in Gambia (7.0), you can specify custom workload parameters for StorPerf to pass on to FIO. This is available in the /api/v2.0/jobs API, and takes a different format than the default v1.0 API.

The format is as follows:

"workloads": {
  "name": {
     "fio argument": "fio value"
  }
}

The name is used the same way 'rr', 'rs', 'rw', etc. are used, but can be any arbitrary alphanumeric string; it is how you identify the job later. Following the name is a series of arguments to pass on to FIO. The most important one of these is the actual I/O operation to perform. From the FIO manual, there are a number of different workloads:

  • read
  • write
  • trim
  • randread
  • etc

This is an example of how the original ‘ws’ workload looks in the new format:

"workloads": {
  "ws": {
     "rw": "write"
  }
}

Using this format, it is now possible to initiate any combination of IO workload type. For example, a mix of 60% reads and 40% writes scattered randomly throughout the volume being profiled would be:

"workloads": {
  "6040randrw": {
      "rw": "randrw",
      "rwmixread": "60"
  }
}

Additional arguments can be added as needed. Here is an example of random writes, with 25% duplicated blocks, followed by a second run of 75/25% mixed reads and writes. This can be used to test the deduplication capabilities of the underlying storage driver.

"workloads": {
  "dupwrite": {
     "rw": "randwrite",
      "dedupe_percentage": "25"
  },
  "7525randrw": {
     "rw": "randrw",
      "rwmixread": "75",
      "dedupe_percentage": "25"
  }
}

There is no limit on the number of workloads and additional FIO arguments that can be specified.

Note that as in v1.0, the list of workloads will be iterated over with the block sizes and queue depths specified.

StorPerf will also do a verification of the arguments given prior to returning a Job ID from the ReST call. If an argument fails validation, the error will be returned in the payload of the response.

2.5.3. Block Sizes

A comma delimited list of the different block sizes to use when reading and writing data. Note: Some Cinder drivers (such as Ceph) cannot support block sizes larger than 16k (16384).

2.5.4. Queue Depths

A comma delimited list of the different queue depths to use when reading and writing data. The queue depth parameter causes FIO to keep this many I/O requests outstanding at one time. It is used to simulate traffic patterns on the system. For example, a queue depth of 4 would simulate 4 processes constantly creating I/O requests.

2.5.5. Deadline

The deadline is the maximum amount of time in minutes for a workload to run. If steady state has not been reached by the deadline, the workload will terminate and that particular run will be marked as not having reached steady state. Any remaining workloads will continue to execute in order.

{
   "block_sizes": "2048,16384",
   "deadline": 20,
   "queue_depths": "2,4",
   "workload": "wr,rr,rw"
}
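A payload such as the one above is submitted with a POST to the jobs API, for example:

```shell
# Start a performance run iterating over the given workloads,
# block sizes and queue depths, with a 20 minute deadline each.
curl -X POST --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     -d '{"block_sizes": "2048,16384", "deadline": 20,
          "queue_depths": "2,4", "workload": "wr,rr,rw"}' \
     http://StorPerf:5000/api/v1.0/jobs
```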
2.5.6. Metadata

A job can have metadata associated with it for tagging. The following metadata is required in order to push results to the OPNFV Test Results DB:

"metadata": {
    "disk_type": "HDD or SSD",
    "pod_name": "OPNFV Pod Name",
    "scenario_name": string,
    "storage_node_count": int,
    "version": string,
    "build_tag": string,
    "test_case": "snia_steady_state"
}
2.5.7. Changing Stack Parameters

While StorPerf currently does not support changing the parameters of the stack directly, it is possible to change the stack using the OpenStack client library. The following parameters can be changed:

  • agent_count: to increase or decrease the number of VMs.
  • volume_count: to change the number of Cinder volumes per VM.
  • volume_size: to increase the size of each volume. Note: Cinder cannot shrink volumes.

Increasing the number of agents or volumes, or increasing the size of the volumes will require you to kick off a new _warm_up job to initialize the newly allocated volumes.

The following is an example of how to change the stack using the heat client:
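The original example is missing from this extract. A minimal sketch using the openstack CLI is shown below; it assumes the stack is named StorPerfAgentGroup, so verify the actual name with openstack stack list first:

```shell
# Confirm the name of the StorPerf Heat stack.
openstack stack list
# Update the existing stack to attach 2 Cinder volumes per VM,
# keeping all other parameters unchanged.
openstack stack update --existing \
    --parameter volume_count=2 StorPerfAgentGroup
```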

2.6. Query Jobs Information

By issuing a GET to the job API http://StorPerf:5000/api/v1.0/jobs?job_id=<ID>, you can fetch information about the job as follows:

  • &type=status: to report on the status of the job.
  • &type=metrics: to report on the collected metrics.
  • &type=metadata: to report back any metadata sent with the job ReST API
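For example, to fetch the status of the job started in the initialization example above:

```shell
# Query the status of a specific job; quote the URL so the shell
# does not interpret the '&'.
curl -X GET --header 'Accept: application/json' \
     'http://StorPerf:5000/api/v1.0/jobs?job_id=edafa97e-457e-4d3d-9db4-1d6c0fc03f98&type=status'
```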
2.6.1. Status

The Status field can be:

  • Running to indicate the job is still in progress, or
  • Completed to indicate the job is done. This could be either normal completion or manual termination via an HTTP DELETE call.

Workloads can have a value of:

  • Pending to indicate the workload has not yet started,
  • Running to indicate this is the active workload, or
  • Completed to indicate this workload has completed.

This is an example of a type=status call.

{
  "Status": "Running",
  "TestResultURL": null,
  "Workloads": {
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.1.block-size.16384": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.1.block-size.4096": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.1.block-size.512": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.4.block-size.16384": "Running",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.4.block-size.4096": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.4.block-size.512": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.8.block-size.16384": "Completed",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.8.block-size.4096": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.8.block-size.512": "Pending"
  }
}

If the job_id is not provided along with type status, then all jobs are returned along with their status.

2.6.2. Metrics

Metrics can be queried at any time during or after the completion of a run. Note that the metrics show up only after the first interval has passed, and are subject to change until the job completes.

This is a sample of a type=metrics call.

{
  "rw.queue-depth.1.block-size.512.read.bw": 52.8,
  "rw.queue-depth.1.block-size.512.read.iops": 106.76199999999999,
  "rw.queue-depth.1.block-size.512.read.lat_ns.mean": 93.176,
  "rw.queue-depth.1.block-size.512.write.bw": 22.5,
  "rw.queue-depth.1.block-size.512.write.iops": 45.760000000000005,
  "rw.queue-depth.1.block-size.512.write.lat_ns.mean": 21764.184999999998
}
2.7. Abort a Job

Issuing an HTTP DELETE to the jobs API http://StorPerf:5000/api/v1.0/jobs will force termination of the whole job, regardless of how many workloads remain to be executed.

curl -X DELETE --header 'Accept: application/json' http://StorPerf:5000/api/v1.0/jobs
2.8. List all Jobs

A list of all jobs can also be queried; simply issue a GET request without any job ID.

curl -X GET --header 'Accept: application/json' http://StorPerf:5000/api/v1.0/jobs
2.9. Delete the Environment

After you are done testing, you can have StorPerf delete the Heat stack by issuing an HTTP DELETE to the configurations API.

curl -X DELETE --header 'Accept: application/json' http://StorPerf:5000/api/v1.0/configurations

You may also want to delete an environment, and then create a new one with a different number of VMs/Cinder volumes to test the impact of the number of VMs in your environment.

2.10. Viewing StorPerf Logs

Logs are an integral part of any application, as they help with debugging. To view the entire log, issue an HTTP GET request:

curl -X GET --header 'Accept: application/json' http://StorPerf:5000/api/v1.0/logs?lines=all

Alternatively, a certain number of lines can be viewed by specifying the number in the request. If no number is specified, the last 35 lines are returned:

curl -X GET --header 'Accept: application/json' http://StorPerf:5000/api/v1.0/logs?lines=12
3. Storperf Reporting Module
3.1. About this project
  • This project aims to create a series of graphs to support the SNIA reports.
  • All data for the reports can be fetched either from the OPNFV Test Results DB, or locally from StorPerf’s own database of current run data.
  • The report code may be stored in either the Releng repository (so it can be included in the Test Results Dashboards), or locally in StorPerf’s own git repository.
  • The report (generated by the reporting module) looks like the following example:
_images/demo_example.png
3.2. Usage
  • Enter the URL for the location of the data for which you want to generate the report (e.g. http://StorPerf:5000/api/v1.0/jobs?type=metadata).
  • Note: You can test the module using the test data present in the directory storperf-reporting/src/static/testdata. Instead of the URL, enter the name of a file present in the testdata directory, e.g. local-data.json.
  • After entering the URL, you are taken to a page showing the details of all the jobs present in the data.
  • Click on the Click here to view details to see the different block sizes for the respective job.
  • Click on the block size and select the parameter for which you want to view the graph.
3.3. Graph Explanation

An example of a generated graph is shown below:

_images/graph_explanation.png

Steady State Convergence Graph

  • This graph shows the values as reported by StorPerf for the actual and average throughput.
  • Also shown are the average ±10% band and the slope.
  • It serves to visually demonstrate compliance with the steady state definition.
  • The allowed maximum data excursion is 20% of the average (average x 0.20).
  • The allowed maximum slope excursion is 10% of the average.
  • The measured data excursion is the value from the range.
  • The measured slope excursion is the value from the range.
3.4. Workflow

A Flask server is used to fetch the data, which is then sent to the client side where the graphs are built (using JavaScript).

3.4.1. Steps involved
  • Step 1: Data is fetched from the OPNFV Test Results ReST API
  • Step 2: The fields “report_data” and “metrics” are taken from the JSON object retrieved in the above step and sent to the client side.
  • Step 3: The "report_data" is received by the client side, where a parser written in JavaScript, together with Plotly.js, builds the graphs.
3.5. Directory structure

storperf/docker/storperf-reporting/ contains the code used for this project.

The file structure is as follows:

storperf-reporting
|-- Dockerfile                          # Dockerfile for the storperf-reporting container
|-- requirements.txt                    # pip requirements for the container
`-- src                                 # Contains the code for the flask server
    |-- app.py                          # Code to run the flask application
    |-- static                          # Contains the static files (js,css)
    |   |-- css                         # Contains css files
    |   |   `-- bootstrap.min.css
    |   |-- images
    |   |-- js                          # Contains the javascript files
    |   |   |-- bootstrap.min.js
    |   |   |-- Chart.min.js
    |   |   |-- jquery-2.1.3.min.js
    |   |   |-- jquery.bootpag.min.js
    |   |   `-- plotly-latest.min.js    # Used for plotting the graphs
    |   `-- testdata                    # Contains testing data for the module
    `-- templates
        |-- index.html
        |-- plot_jobs.html
        |-- plot_multi_data.html
        `-- plot_tables.html
3.6. Graphing libraries and tools
  • Plotly.js is used as the graphing library for this project (Link: https://plot.ly/javascript/)
  • Bootstrap is used for the UI of the project.

VSPERF

VSPERF Configuration and User Guide
Introduction

VSPERF is an OPNFV testing project.

VSPERF provides an automated test-framework and comprehensive test suite based on Industry Test Specifications for measuring NFVI data-plane performance. The data-path includes switching technologies with physical and virtual network interfaces. The VSPERF architecture is switch and traffic generator agnostic and test cases can be easily customized. VSPERF was designed to be independent of OpenStack therefore OPNFV installer scenarios are not required. VSPERF can source, configure and deploy the device-under-test using specified software versions and network topology. VSPERF is used as a development tool for optimizing switching technologies, qualification of packet processing functions and for evaluation of data-path performance.

The Euphrates release adds new features and improvements that will help advance high performance packet processing on Telco NFV platforms. This includes new test cases, flexibility in customizing test-cases, new results display options, improved tool resiliency, additional traffic generator support and VPP support.

VSPERF provides a framework where the entire NFV Industry can learn about NFVI data-plane performance and try-out new techniques together. A new IETF benchmarking specification (RFC8204) is based on VSPERF work contributed since 2015. VSPERF is also contributing to development of ETSI NFV test specifications through the Test and Open Source Working Group.

VSPERF Install and Configuration
1. Installing vswitchperf
1.1. Downloading vswitchperf

Vswitchperf can be downloaded from its official git repository, which is hosted by OPNFV. It is necessary to install git on your DUT before downloading vswitchperf. Installation of git is specific to the packaging system used by the Linux OS installed on the DUT.

Example of installation of GIT package and its dependencies:

  • in case of OS based on RedHat Linux:

    sudo yum install git
    
  • in case of Ubuntu or Debian:

    sudo apt-get install git
    

After git is successfully installed on the DUT, vswitchperf can be downloaded as follows:

git clone http://git.opnfv.org/vswitchperf

This command will create a directory vswitchperf containing a local copy of the vswitchperf repository.

1.2. Supported Operating Systems
  • CentOS 7.3
  • Fedora 24 (kernel 4.8 requires DPDK 16.11 and newer)
  • Fedora 25 (kernel 4.9 requires DPDK 16.11 and newer)
  • openSUSE 42.2
  • openSUSE 42.3
  • openSUSE Tumbleweed
  • SLES 15
  • RedHat 7.2 Enterprise Linux
  • RedHat 7.3 Enterprise Linux
  • RedHat 7.5 Enterprise Linux
  • Ubuntu 14.04
  • Ubuntu 16.04
  • Ubuntu 16.10 (kernel 4.8 requires DPDK 16.11 and newer)
1.3. Supported vSwitches

The vSwitch must support Open Flow 1.3 or greater.

  • Open vSwitch
  • Open vSwitch with DPDK support
  • TestPMD application from DPDK (supports p2p and pvp scenarios)
  • Cisco VPP
1.4. Supported Hypervisors
  • Qemu version 2.3 or greater (version 2.5.0 is recommended)
1.5. Supported VNFs

In theory, it is possible to use any VNF image that is compatible with a supported hypervisor. However, such a VNF must ensure that the appropriate number of network interfaces is configured and that traffic is properly forwarded among them. New vswitchperf users are recommended to start with the official vloop-vnf image, which is maintained by the vswitchperf community.

1.5.1. vloop-vnf

The official VM image is called vloop-vnf and is available for free download from the OPNFV artifactory. This image is based on the Ubuntu Linux distribution and supports the following applications for traffic forwarding:

  • DPDK testpmd
  • Linux Bridge
  • Custom l2fwd module

The vloop-vnf image can be downloaded to the DUT, for example with wget:

wget http://artifacts.opnfv.org/vswitchperf/vnf/vloop-vnf-ubuntu-14.04_20160823.qcow2

NOTE: If wget is not installed on your DUT, you can install it on an RPM based system with sudo yum install wget or on a DEB based system with sudo apt-get install wget.

Changelog of vloop-vnf:

1.6. Installation

The test suite requires Python 3.3 or newer and relies on a number of other system and python packages. These need to be installed for the test suite to function.

An updated kernel and certain development packages are required by DPDK, OVS (especially Vanilla OVS) and QEMU. It is necessary to check that the versions of these packages are not being held back and that the DNF/APT/YUM configuration does not prevent their modification by enforcing settings such as "exclude-kernel".

Installation of required packages, preparation of the Python 3 virtual environment and compilation of OVS, DPDK and QEMU are performed by the script systems/build_base_machine.sh. It should be executed under the user account that will be used for vsperf execution.

NOTE: Password-less sudo access must be configured for given user account before the script is executed.

$ cd systems
$ ./build_base_machine.sh

NOTE: You don't need to go into any of the systems subdirectories; simply run the top level build_base_machine.sh and your OS will be detected automatically.

Script build_base_machine.sh will install all the vsperf dependencies in terms of system packages, Python 3.x and required Python modules. In case of CentOS 7 or RHEL it will install Python 3.3 from an additional repository provided by Software Collections. The installation script will also use virtualenv to create a vsperf virtual environment, which is isolated from the default Python environment, using the Python 3 package located in /usr/bin/python3. This environment will reside in a directory called vsperfenv in $HOME. It ensures that the system wide Python installation is not modified or broken by the VSPERF installation. The complete list of Python packages installed inside the virtualenv can be found in the file requirements.txt, which is located in the vswitchperf repository.

NOTE: For RHEL 7.3 Enterprise and CentOS 7.3 OVS Vanilla is not built from upstream source due to kernel incompatibilities. Please see the instructions in the vswitchperf_design document for details on configuring OVS Vanilla for binary package usage.

NOTE: For RHEL 7.5 Enterprise DPDK and Openvswitch are not built from upstream sources due to kernel incompatibilities. Please use subscription channels to obtain binary equivalents of openvswitch and dpdk packages or build binaries using instructions from openvswitch.org and dpdk.org.

1.6.1. VPP installation

VPP installation is now included as part of the VSPerf installation scripts.

In case of an error message about a missing file such as “Couldn’t open file /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7” you can resolve this issue by simply downloading the file.

$ wget https://dl.fedoraproject.org/pub/epel/RPM-GPG-KEY-EPEL-7
1.7. Using vswitchperf

You will need to activate the virtual environment every time you start a new shell session. Its activation is specific to your OS:

  • CentOS 7 and RHEL

    $ scl enable rh-python34 bash
    $ source $HOME/vsperfenv/bin/activate
    
  • Fedora and Ubuntu

    $ source $HOME/vsperfenv/bin/activate
    

After the virtual environment is configured, VSPERF can be used. For example:

(vsperfenv) $ cd vswitchperf
(vsperfenv) $ ./vsperf --help
1.7.1. Gotcha

If you see the following error during environment activation:

$ source $HOME/vsperfenv/bin/activate
Badly placed ()'s.

then check what type of shell you are using:

$ echo $SHELL
/bin/tcsh

See what scripts are available in $HOME/vsperfenv/bin:

$ ls $HOME/vsperfenv/bin/
activate          activate.csh      activate.fish     activate_this.py

Source the appropriate script:

$ source bin/activate.csh
1.7.2. Working Behind a Proxy

If you’re behind a proxy, you’ll likely want to configure this before running any of the above. For example:

export http_proxy=proxy.mycompany.com:123
export https_proxy=proxy.mycompany.com:123
1.7.3. Bind Tools DPDK

VSPerf supports the default DPDK bind tool, but also supports driverctl. The driverctl tool allows driver bindings to persist across reboots. It is not provided by VSPerf, but can be downloaded from upstream sources. Once installed, set the bind tool to driverctl to allow VSPERF to correctly bind cards for DPDK tests.

PATHS['dpdk']['src']['bind-tool'] = 'driverctl'
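As a hedged illustration of upstream driverctl usage (the PCI address below is an example; find the address of your own NIC with lspci):

```shell
# Persistently bind the example NIC at PCI address 0000:05:00.0 to vfio-pci;
# driverctl records the override so it survives reboots.
driverctl set-override 0000:05:00.0 vfio-pci
# List current persistent overrides to confirm the binding.
driverctl list-overrides
```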
1.8. Hugepage Configuration

Systems running vsperf with DPDK and/or tests with guests must configure hugepage amounts to support these configurations. It is recommended to configure 1GB hugepages as the pagesize.

The amount of hugepages needed depends on your configuration files in vsperf. Each guest image requires 2048 MB by default according to the default settings in the 04_vnf.conf file.

GUEST_MEMORY = ['2048']

The dpdk startup parameters also require an amount of hugepages depending on your configuration in the 02_vswitch.conf file.

DPDK_SOCKET_MEM = ['1024', '0']

NOTE: Option DPDK_SOCKET_MEM is used by all vSwitches with DPDK support. It means Open vSwitch, VPP and TestPMD.

VSPerf will verify hugepage amounts are free before executing test environments. In case of hugepage amounts not being free, test initialization will fail and testing will stop.

NOTE: In some instances on a test failure dpdk resources may not release hugepages used in dpdk configuration. It is recommended to configure a few extra hugepages to prevent a false detection by VSPerf that not enough free hugepages are available to execute the test environment. Normally dpdk would use previously allocated hugepages upon initialization.

Depending on your OS selection configuration of hugepages may vary. Please refer to your OS documentation to set hugepages correctly. It is recommended to set the required amount of hugepages to be allocated by default on reboots.

Information on hugepage requirements for dpdk can be found at http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html

You can review your hugepage amounts by executing the following command:

cat /proc/meminfo | grep Huge

If no hugepages are available, vsperf will try to allocate some automatically. Allocation is controlled by the HUGEPAGE_RAM_ALLOCATION configuration parameter in the 02_vswitch.conf file. The default is 2GB, resulting in either 2 1GB hugepages or 1024 2MB hugepages.
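A manual allocation might look like the sketch below; note that runtime allocation of 1GB pages can fail on a fragmented system, so setting them on the kernel boot command line is generally preferred:

```shell
# Allocate 8 x 1GB hugepages at runtime (run as root; may fail if memory
# is fragmented). For a persistent setup, add e.g.
# "default_hugepagesz=1G hugepagesz=1G hugepages=8" to the kernel boot
# parameters instead.
echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
# Verify the result.
grep Huge /proc/meminfo
```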

1.9. Tuning Considerations

With the large amount of tuning guides available online on how to properly tune a DUT, it becomes difficult to achieve consistent numbers for DPDK testing. VSPerf recommends a simple approach that has been tested by different companies to achieve proper CPU isolation.

The idea behind CPU isolation when running DPDK based tests is to have as few interruptions to a PMD process as possible. There is a utility available on most Linux systems that achieves proper CPU isolation with very little effort and customization. The tool is called tuned-adm and is most likely installed by default on the Linux DUT.

VSPerf recommends the latest tuned-adm package, which can be downloaded from the following location:

http://www.tuned-project.org/2017/04/27/tuned-2-8-0-released/

Follow the instructions to install the latest tuned-adm onto your system. For current RHEL customers you should already have the most current version. You just need to install the cpu-partitioning profile.

yum install -y tuned-profiles-cpu-partitioning.noarch

Proper CPU isolation starts with knowing which NUMA node your NIC is installed on. You can identify this by checking the output of the following command:

cat /sys/class/net/<NIC NAME>/device/numa_node

You can then use utilities such as lscpu or cpu_layout.py (located in the src dpdk area of VSPerf). These tools show the CPU layout, i.e. which cores/hyperthreads are located on the same NUMA node.

Determine which CPUS/Hyperthreads will be used for PMD threads and VCPUs for VNFs. Then modify the /etc/tuned/cpu-partitioning-variables.conf and add the CPUs into the isolated_cores variable in some form of x-y or x,y,z or x-y,z, etc. Then apply the profile.
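The resulting edit might look like the fragment below; the core list is purely illustrative and must match your own NUMA layout:

```shell
# /etc/tuned/cpu-partitioning-variables.conf
# Cores reserved for PMD threads and VNF VCPUs (example values only);
# housekeeping remains on the cores not listed here.
isolated_cores=2-19,22-39
```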

tuned-adm profile cpu-partitioning

After applying the profile, reboot your system.

After rebooting the DUT, you can verify the profile is active by running

tuned-adm active

Now you should have proper CPU isolation active and can achieve consistent results with DPDK based tests.

The last consideration is that when running TestPMD inside a VNF, it may make sense to enable enough cores to run each PMD thread on a separate core/hyperthread. To achieve this, set the number of VCPUs to 3 and enable enough nb-cores in the TestPMD config. You can modify these options in the conf files.

GUEST_SMP = ['3']
GUEST_TESTPMD_PARAMS = ['-l 0,1,2 -n 4 --socket-mem 512 -- '
                        '--burst=64 -i --txqflags=0xf00 '
                        '--disable-hw-vlan --nb-cores=2']

Verify you set the VCPU core locations appropriately on the same NUMA as with your PMD mask for OVS-DPDK.

2. Upgrading vswitchperf
2.1. Generic

If VSPERF was cloned from its git repository, it is easy to upgrade it to the newest stable version or to the development version.

You can get a list of stable releases with a git command. It is necessary to update the local git repository first.

NOTE: Git commands must be executed from the directory where the VSPERF repository was cloned, e.g. vswitchperf.

Update of local git repository:

$ git pull

List of stable releases:

$ git tag

brahmaputra.1.0
colorado.1.0
colorado.2.0
colorado.3.0
danube.1.0
euphrates.1.0

You can select which stable release should be used. For example, select danube.1.0:

$ git checkout danube.1.0

Development version of VSPERF can be selected by:

$ git checkout master
2.2. Colorado to Danube upgrade notes
2.2.1. Obsoleted features

Support for the vHost Cuse interface has been removed in the Danube release. This means that it is no longer possible to select QemuDpdkVhostCuse as a VNF; the option QemuDpdkVhostUser should be used instead. Please check your configuration files and the definitions of your testcases for any occurrence of:

VNF = "QemuDpdkVhostCuse"

or

"VNF" : "QemuDpdkVhostCuse"

If QemuDpdkVhostCuse is found, it must be changed to QemuDpdkVhostUser.

NOTE: If execution of VSPERF is automated by scripts (e.g. for CI purposes), then these scripts must be checked and updated too. Any occurrence of:

./vsperf --vnf QemuDpdkVhostCuse

must be updated to:

./vsperf --vnf QemuDpdkVhostUser
2.2.2. Configuration

Several configuration changes were introduced during Danube release. The most important changes are discussed below.

2.2.2.1. Paths to DPDK, OVS and QEMU

VSPERF uses external tools for proper testcase execution, so it is important to properly configure the paths to these tools. If the tools were installed by the installation scripts and are located inside the ./src directory of the VSPERF home, then no changes are needed. On the other hand, if the path settings were changed in a custom configuration file, then the configuration must be updated accordingly. Please check your configuration files for the following configuration options:

OVS_DIR
OVS_DIR_VANILLA
OVS_DIR_USER
OVS_DIR_CUSE

RTE_SDK_USER
RTE_SDK_CUSE

QEMU_DIR
QEMU_DIR_USER
QEMU_DIR_CUSE
QEMU_BIN

If any of these options is defined, then the configuration must be updated. All paths to the tools are now stored inside the PATHS dictionary. Please refer to Configuration of PATHS dictionary and update your configuration where necessary.

2.2.2.2. Configuration change via CLI

In previous releases it was possible to modify selected configuration options (mostly VNF specific) via the command line interface, i.e. by the --test-params argument. This concept has been generalized in the Danube release: it is now possible to modify any configuration parameter via the CLI or via the Parameters section of the testcase definition. The old configuration options have been obsoleted, and it is now required to specify a configuration parameter name in the same form as it is defined inside the configuration file, i.e. in uppercase. Please refer to Overriding values defined in configuration files for additional details.

NOTE: If VSPERF execution is automated by scripts (e.g. for CI purposes), then these scripts must be checked and updated too, so that any occurrence of

guest_loopback
vanilla_tgen_port1_ip
vanilla_tgen_port1_mac
vanilla_tgen_port2_ip
vanilla_tgen_port2_mac
tunnel_type

is changed to the uppercase form, and the data types of the entered values match the data types of the original values from the configuration files.

If guest_nic1_name or guest_nic2_name is changed, then the new GUEST_NICS dictionary must be modified accordingly. Please see Configuration of GUEST options and conf/04_vnf.conf for additional details.
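
As an illustration of the renaming described above, a small helper could rewrite old-style lowercase parameter names into the new uppercase form. The helper and its name are hypothetical, not part of VSPERF:

```python
# Illustrative helper (not part of VSPERF): rewrite obsolete lowercase
# --test-params keys from pre-Danube releases into the uppercase form.
OBSOLETE_KEYS = {
    'guest_loopback': 'GUEST_LOOPBACK',
    'vanilla_tgen_port1_ip': 'VANILLA_TGEN_PORT1_IP',
    'vanilla_tgen_port1_mac': 'VANILLA_TGEN_PORT1_MAC',
    'vanilla_tgen_port2_ip': 'VANILLA_TGEN_PORT2_IP',
    'vanilla_tgen_port2_mac': 'VANILLA_TGEN_PORT2_MAC',
    'tunnel_type': 'TUNNEL_TYPE',
}

def migrate_test_params(params):
    """Return a copy of params with obsolete keys renamed to uppercase."""
    return {OBSOLETE_KEYS.get(key, key): value
            for key, value in params.items()}

print(migrate_test_params({'tunnel_type': 'vxlan'}))
```

Remember that, besides the renaming, the data types of the values must also match those in the configuration files.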

2.2.2.3. Traffic configuration via CLI

In previous releases it was possible to modify selected attributes of the generated traffic via the command line interface. This concept was enhanced in the Danube release, and it is now possible to modify all traffic specific options via the CLI or via the TRAFFIC dictionary in the configuration file. A detailed description is available in the Configuration of TRAFFIC dictionary section of the documentation.

Please check your automated VSPERF execution scripts for the following CLI parameters and update them according to the documentation:

bidir
duration
frame_rate
iload
lossrate
multistream
pkt_sizes
pre-installed_flows
rfc2544_tests
stream_type
traffic_type
3. ‘vsperf’ Traffic Gen Guide
3.1. Overview

VSPERF supports the following traffic generators:

To see the list of supported traffic generators from the CLI:

$ ./vsperf --list-trafficgens

This guide provides the details of how to install and configure the various traffic generators.

3.2. Background Information

The default traffic configuration can be found in conf/03_traffic.conf and looks as follows:

TRAFFIC = {
    'traffic_type' : 'rfc2544_throughput',
    'frame_rate' : 100,
    'burst_size' : 100,
    'bidir' : 'True',  # will be passed as string in title format to tgen
    'multistream' : 0,
    'stream_type' : 'L4',
    'pre_installed_flows' : 'No',           # used by vswitch implementation
    'flow_type' : 'port',                   # used by vswitch implementation
    'flow_control' : False,                 # supported only by IxNet
    'learning_frames' : True,               # supported only by IxNet
    'l2': {
        'framesize': 64,
        'srcmac': '00:00:00:00:00:00',
        'dstmac': '00:00:00:00:00:00',
    },
    'l3': {
        'enabled': True,
        'proto': 'udp',
        'srcip': '1.1.1.1',
        'dstip': '90.90.90.90',
    },
    'l4': {
        'enabled': True,
        'srcport': 3000,
        'dstport': 3001,
    },
    'vlan': {
        'enabled': False,
        'id': 0,
        'priority': 0,
        'cfi': 0,
    },
    'capture': {
        'enabled': False,
        'tx_ports' : [0],
        'rx_ports' : [1],
        'count': 1,
        'filter': '',
    },
    'scapy': {
        'enabled': False,
        '0' : 'Ether(src={Ether_src}, dst={Ether_dst})/'
              'Dot1Q(prio={Dot1Q_prio}, id={Dot1Q_id}, vlan={Dot1Q_vlan})/'
              'IP(proto={IP_proto}, src={IP_src}, dst={IP_dst})/'
              '{IP_PROTO}(sport={IP_PROTO_sport}, dport={IP_PROTO_dport})',
        '1' : 'Ether(src={Ether_dst}, dst={Ether_src})/'
              'Dot1Q(prio={Dot1Q_prio}, id={Dot1Q_id}, vlan={Dot1Q_vlan})/'
              'IP(proto={IP_proto}, src={IP_dst}, dst={IP_src})/'
              '{IP_PROTO}(sport={IP_PROTO_dport}, dport={IP_PROTO_sport})',
    }
}

A detailed description of the TRAFFIC dictionary can be found at Configuration of TRAFFIC dictionary.
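
Overrides from a testcase's Parameters section or from --test-params are applied on top of this nested dictionary, leaving unspecified defaults intact. A minimal sketch of such a recursive merge (the function is illustrative, not VSPERF's actual code):

```python
def merge_traffic(defaults, overrides):
    """Recursively overlay user-supplied values on the default TRAFFIC dict."""
    result = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_traffic(result[key], value)
        else:
            result[key] = value
    return result

# A trimmed-down subset of the defaults shown above.
TRAFFIC_DEFAULTS = {
    'traffic_type': 'rfc2544_throughput',
    'frame_rate': 100,
    'l3': {'enabled': True, 'proto': 'udp',
           'srcip': '1.1.1.1', 'dstip': '90.90.90.90'},
}

traffic = merge_traffic(TRAFFIC_DEFAULTS,
                        {'traffic_type': 'rfc2544_continuous',
                         'l3': {'dstip': '10.0.0.1'}})
print(traffic['l3'])
```

Note how only 'dstip' inside 'l3' changes; the sibling keys keep their default values.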

The framesize parameter can be overridden by adding the following to your custom configuration file 10_custom.conf:

TRAFFICGEN_PKT_SIZES = (64, 128,)

OR from the commandline:

$ ./vsperf --test-params "TRAFFICGEN_PKT_SIZES=(x,y)" $TESTNAME

You can also modify the traffic transmission duration and the number of tests run by the traffic generator by extending the example commandline above to:

$ ./vsperf --test-params "TRAFFICGEN_PKT_SIZES=(x,y);TRAFFICGEN_DURATION=10;" \
                         "TRAFFICGEN_RFC2544_TESTS=1" $TESTNAME
3.3. Dummy

The Dummy traffic generator can be used to test the VSPERF installation or to demonstrate VSPERF functionality on a DUT without a connection to a real traffic generator.

You can also use the Dummy generator when your external traffic generator is not supported by VSPERF. In that case you can use VSPERF to set up your test scenario and then transmit the traffic yourself. After the transmission is completed, you can enter values for all collected metrics and VSPERF will use them to generate the final reports.

3.3.1. Setup

To select the Dummy generator please add the following to your custom configuration file 10_custom.conf.

TRAFFICGEN = 'Dummy'

OR run vsperf with the --trafficgen argument

$ ./vsperf --trafficgen Dummy $TESTNAME

Where $TESTNAME is the name of the vsperf test you would like to run. This will set up the vSwitch and the VNF (if one is part of your test), print the traffic configuration, and prompt you to transmit traffic when the setup is complete.

Please send 'continuous' traffic with the following stream config:
30mS, 90mpps, multistream False
and the following flow config:
{
    "flow_type": "port",
    "l3": {
        "enabled": True,
        "srcip": "1.1.1.1",
        "proto": "udp",
        "dstip": "90.90.90.90"
    },
    "traffic_type": "rfc2544_continuous",
    "multistream": 0,
    "bidir": "True",
    "vlan": {
        "cfi": 0,
        "priority": 0,
        "id": 0,
        "enabled": False
    },
    "l4": {
        "enabled": True,
        "srcport": 3000,
        "dstport": 3001,
    },
    "frame_rate": 90,
    "l2": {
        "dstmac": "00:00:00:00:00:00",
        "srcmac": "00:00:00:00:00:00",
        "framesize": 64
    }
}
What was the result for 'frames tx'?

When your traffic generator has completed the traffic transmission and provided the results, please input them at the VSPERF prompt. VSPERF will try to verify the input:

Is '$input_value' correct?

Please answer with y OR n.

VSPERF will ask you to provide a value for each of the collected metrics. The list of metrics can be found at traffic-type-metrics. Finally vsperf will print out the results for your test and generate the appropriate logs and report files.

3.3.2. Metrics collected for supported traffic types

Below is the list of metrics collected by VSPERF for each of the supported traffic types.

RFC2544 Throughput and Continuous:

  • frames tx
  • frames rx
  • min latency
  • max latency
  • avg latency
  • frameloss

RFC2544 Back2back:

  • b2b frames
  • b2b frame loss %
3.3.3. Dummy result pre-configuration

With the Dummy traffic generator it is possible to pre-configure the test results. This is useful for the creation of demo testcases, which do not require a real traffic generator. Such a testcase can be run by any user and it will still generate all reports and result files.

Result values can be specified within the TRAFFICGEN_DUMMY_RESULTS dictionary, where each of the collected metrics must be properly defined. Please check the list of traffic-type-metrics.

The dictionary with dummy results can be passed via the CLI argument --test-params or specified in the Parameters section of the testcase definition.

Example of testcase execution with dummy results defined by CLI argument:

$ ./vsperf back2back --trafficgen Dummy --test-params \
  "TRAFFICGEN_DUMMY_RESULTS={'b2b frames':'3000','b2b frame loss %':'0.0'}"

Example of testcase definition with pre-configured dummy results:

{
    "Name": "back2back",
    "Traffic Type": "rfc2544_back2back",
    "Deployment": "p2p",
    "biDirectional": "True",
    "Description": "LTD.Throughput.RFC2544.BackToBackFrames",
    "Parameters" : {
        'TRAFFICGEN_DUMMY_RESULTS' : {'b2b frames':'3000','b2b frame loss %':'0.0'}
    },
},

NOTE: Pre-configured results will be used only when the Dummy traffic generator is in use. Otherwise the option TRAFFICGEN_DUMMY_RESULTS will be ignored.
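
When scripting such demo runs, the --test-params fragment can be rendered from a Python dict instead of being hand-typed. A tiny sketch (the helper name is hypothetical, not a VSPERF API):

```python
# Illustrative sketch (helper name is hypothetical): render a metrics
# dict as the TRAFFICGEN_DUMMY_RESULTS override passed via --test-params.
def dummy_results_param(results):
    """Format the dummy-results dict as a --test-params fragment."""
    return 'TRAFFICGEN_DUMMY_RESULTS=%r' % (results,)

arg = dummy_results_param({'b2b frames': '3000', 'b2b frame loss %': '0.0'})
print(arg)
```

The resulting string can then be passed directly after --test-params, as in the CLI example above.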

3.4. Ixia

VSPERF can use both IxNetwork and IxExplorer TCL servers to control Ixia chassis; however, the IxNetwork TCL server is the preferred option. The following sections describe the installation and configuration of the IxNetwork components used by VSPERF.

3.4.1. Installation

On the system under test you need to install IxNetworkTclClient$(VER_NUM)Linux.bin.tgz.

On the IXIA client software system you need to install IxNetwork TCL server. After its installation you should configure it as follows:

  1. Find the IxNetwork TCL server app (start -> All Programs -> IXIA -> IxNetwork -> IxNetwork_$(VER_NUM) -> IxNetwork TCL Server)

  2. Right click on IxNetwork TCL Server and select Properties. Under the Shortcut tab, in the Target dialogue box, make sure the argument “-tclport xxxx” is present, where xxxx is your port number (take note of this port number as you will need it for the 10_custom.conf file).

    _images/TCLServerProperties.png
  3. Hit Ok and start the TCL server application

3.4.2. VSPERF configuration

There are several configuration options specific to the IxNetwork traffic generator from IXIA. It is essential to set them correctly before VSPERF is executed for the first time.

Detailed description of options follows:

  • TRAFFICGEN_IXNET_MACHINE - IP address of server, where IxNetwork TCL Server is running
  • TRAFFICGEN_IXNET_PORT - PORT, where IxNetwork TCL Server is accepting connections from TCL clients
  • TRAFFICGEN_IXNET_USER - username, which will be used during communication with IxNetwork TCL Server and IXIA chassis
  • TRAFFICGEN_IXIA_HOST - IP address of IXIA traffic generator chassis
  • TRAFFICGEN_IXIA_CARD - identification of card with dedicated ports at IXIA chassis
  • TRAFFICGEN_IXIA_PORT1 - identification of the first dedicated port on TRAFFICGEN_IXIA_CARD at the IXIA chassis; VSPERF uses two separate ports for traffic generation. In case of unidirectional traffic, it is essential to connect the 1st IXIA port to the 1st NIC of the DUT, i.e. to the first PCI handle from the WHITELIST_NICS list. Otherwise traffic may not be able to pass through the vSwitch. NOTE: If TRAFFICGEN_IXIA_PORT1 and TRAFFICGEN_IXIA_PORT2 are set to the same value, then VSPERF assumes that there is only one port connection between IXIA and the DUT. In this case it must be ensured that the chosen IXIA port is physically connected to the first NIC from the WHITELIST_NICS list.
  • TRAFFICGEN_IXIA_PORT2 - identification of the second dedicated port on TRAFFICGEN_IXIA_CARD at the IXIA chassis; VSPERF uses two separate ports for traffic generation. In case of unidirectional traffic, it is essential to connect the 2nd IXIA port to the 2nd NIC of the DUT, i.e. to the second PCI handle from the WHITELIST_NICS list. Otherwise traffic may not be able to pass through the vSwitch. NOTE: If TRAFFICGEN_IXIA_PORT1 and TRAFFICGEN_IXIA_PORT2 are set to the same value, then VSPERF assumes that there is only one port connection between IXIA and the DUT. In this case it must be ensured that the chosen IXIA port is physically connected to the first NIC from the WHITELIST_NICS list.
  • TRAFFICGEN_IXNET_LIB_PATH - path to the DUT specific installation of IxNetwork TCL API
  • TRAFFICGEN_IXNET_TCL_SCRIPT - name of the TCL script, which VSPERF will use for communication with IXIA TCL server
  • TRAFFICGEN_IXNET_TESTER_RESULT_DIR - folder accessible from IxNetwork TCL server, where test results are stored, e.g. c:/ixia_results; see test-results-share
  • TRAFFICGEN_IXNET_DUT_RESULT_DIR - directory accessible from the DUT, where test results from IxNetwork TCL server are stored, e.g. /mnt/ixia_results; see test-results-share
3.4.3. Test results share

VSPERF is not able to retrieve test results via the TCL API directly. Instead, all test results are stored at the IxNetwork TCL server, in the folder defined by the TRAFFICGEN_IXNET_TESTER_RESULT_DIR configuration parameter. The content of this folder must be shared (e.g. via the samba protocol) between the TCL server and the DUT, where VSPERF is executed. VSPERF expects the test results to be available in the directory configured by the TRAFFICGEN_IXNET_DUT_RESULT_DIR configuration parameter.

Example of sharing configuration:

  • Create a new folder at IxNetwork TCL server machine, e.g. c:\ixia_results

  • Modify sharing options of ixia_results folder to share it with everybody

  • Create a new directory at DUT, where shared directory with results will be mounted, e.g. /mnt/ixia_results

  • Update your custom VSPERF configuration file as follows:

    TRAFFICGEN_IXNET_TESTER_RESULT_DIR = 'c:/ixia_results'
    TRAFFICGEN_IXNET_DUT_RESULT_DIR = '/mnt/ixia_results'
    

    NOTE: It is essential to use forward slashes ‘/’ also in the path configured by the TRAFFICGEN_IXNET_TESTER_RESULT_DIR parameter.

  • Install cifs-utils package.

    e.g. at rpm based Linux distribution:

    yum install cifs-utils
    
  • Mount shared directory, so VSPERF can access test results.

    e.g. by running the mount command below (or by adding an equivalent record into /etc/fstab)

    mount -t cifs //_TCL_SERVER_IP_OR_FQDN_/ixia_results /mnt/ixia_results
          -o file_mode=0777,dir_mode=0777,nounix
    

It is recommended to verify that any new file inserted into the c:/ixia_results folder is visible on the DUT inside the /mnt/ixia_results directory.

3.5. Spirent Setup

Spirent installation files and instructions are available on the Spirent support website at:

http://support.spirent.com

Select a version of Spirent TestCenter software to utilize. This guide uses Spirent TestCenter v4.57 as an example; substitute the appropriate version in place of ‘v4.57’ in the examples below.

3.5.1. On the CentOS 7 System

Download and install the following:

Spirent TestCenter Application, v4.57 for 64-bit Linux Client

3.5.2. Spirent Virtual Deployment Service (VDS)

Spirent VDS is required for both TestCenter hardware and virtual chassis in the vsperf environment. For installation, select the version that matches the Spirent TestCenter Application version. For v4.57, the matching VDS version is 1.0.55. Download either the ova (VMware) or qcow2 (QEMU) image and create a VM with it. Initialize the VM according to Spirent installation instructions.

3.5.3. Using Spirent TestCenter Virtual (STCv)

STCv is available in both ova (VMware) and qcow2 (QEMU) formats. For VMware, download:

Spirent TestCenter Virtual Machine for VMware, v4.57 for Hypervisor - VMware ESX.ESXi

Virtual test port performance is affected by the hypervisor configuration. For best results when deploying STCv, the following is suggested:

  • Create a single VM with two test ports rather than two VMs with one port each
  • Set STCv in DPDK mode
  • Give STCv 2*n + 1 cores, where n is the number of ports. For vsperf, that means 5 cores.
  • Turn off hyperthreading and pin these cores to improve performance
  • Give STCv 2 GB of RAM

To get the highest performance and accuracy, Spirent TestCenter hardware is recommended. vsperf can run with either type of test ports.

3.5.4. Using STC REST Client

The stcrestclient package provides the stchttp.py ReST API wrapper module. This allows simple function calls, nearly identical to those provided by StcPython.py, to be used to access TestCenter server sessions via the STC ReST API. Basic ReST functionality is provided by the resthttp module, and may be used for writing ReST clients independent of STC.

To use the REST interface, follow the instructions on the project page to install the package. Once installed, the scripts named with the ‘rest’ keyword can be used. For example, testcenter-rfc2544-rest.py can be used to run RFC 2544 tests using the REST interface.

3.5.5. Configuration
  1. The Labserver and license server addresses. These parameters apply to all tests and are mandatory.
TRAFFICGEN_STC_LAB_SERVER_ADDR = " "
TRAFFICGEN_STC_LICENSE_SERVER_ADDR = " "
TRAFFICGEN_STC_PYTHON2_PATH = " "
TRAFFICGEN_STC_TESTCENTER_PATH = " "
TRAFFICGEN_STC_TEST_SESSION_NAME = " "
TRAFFICGEN_STC_CSV_RESULTS_FILE_PREFIX = " "
  2. For RFC2544 tests, the following parameters are mandatory:
TRAFFICGEN_STC_EAST_CHASSIS_ADDR = " "
TRAFFICGEN_STC_EAST_SLOT_NUM = " "
TRAFFICGEN_STC_EAST_PORT_NUM = " "
TRAFFICGEN_STC_EAST_INTF_ADDR = " "
TRAFFICGEN_STC_EAST_INTF_GATEWAY_ADDR = " "
TRAFFICGEN_STC_WEST_CHASSIS_ADDR = ""
TRAFFICGEN_STC_WEST_SLOT_NUM = " "
TRAFFICGEN_STC_WEST_PORT_NUM = " "
TRAFFICGEN_STC_WEST_INTF_ADDR = " "
TRAFFICGEN_STC_WEST_INTF_GATEWAY_ADDR = " "
TRAFFICGEN_STC_RFC2544_TPUT_TEST_FILE_NAME
  3. RFC2889 tests: Currently, the forwarding, address-caching, and address-learning-rate tests of RFC2889 are supported. The testcenter-rfc2889-rest.py script implements the RFC2889 tests. The configuration for RFC2889 involves test-case definition and parameter definition, as described below. New result constants, shown below, were added to support these tests.

Example of testcase definition for RFC2889 tests:

{
    "Name": "phy2phy_forwarding",
    "Deployment": "p2p",
    "Description": "LTD.Forwarding.RFC2889.MaxForwardingRate",
    "Parameters" : {
        "TRAFFIC" : {
            "traffic_type" : "rfc2889_forwarding",
        },
    },
}

For RFC2889 tests, specifying the locations of the monitoring ports is mandatory. The necessary parameters are:

TRAFFICGEN_STC_RFC2889_TEST_FILE_NAME
TRAFFICGEN_STC_EAST_CHASSIS_ADDR = " "
TRAFFICGEN_STC_EAST_SLOT_NUM = " "
TRAFFICGEN_STC_EAST_PORT_NUM = " "
TRAFFICGEN_STC_EAST_INTF_ADDR = " "
TRAFFICGEN_STC_EAST_INTF_GATEWAY_ADDR = " "
TRAFFICGEN_STC_WEST_CHASSIS_ADDR = ""
TRAFFICGEN_STC_WEST_SLOT_NUM = " "
TRAFFICGEN_STC_WEST_PORT_NUM = " "
TRAFFICGEN_STC_WEST_INTF_ADDR = " "
TRAFFICGEN_STC_WEST_INTF_GATEWAY_ADDR = " "
TRAFFICGEN_STC_VERBOSE = "True"
TRAFFICGEN_STC_RFC2889_LOCATIONS="//10.1.1.1/1/1,//10.1.1.1/2/2"

Other configurations are:

TRAFFICGEN_STC_RFC2889_MIN_LR = 1488
TRAFFICGEN_STC_RFC2889_MAX_LR = 14880
TRAFFICGEN_STC_RFC2889_MIN_ADDRS = 1000
TRAFFICGEN_STC_RFC2889_MAX_ADDRS = 65536
TRAFFICGEN_STC_RFC2889_AC_LR = 1000

The first 2 values are for the address-learning test, whereas the other 3 values are for the address-caching capacity test. LR: Learning Rate. AC: Address Caching. The maximum value for addresses is 16777216, whereas the maximum for LR is 4294967295.
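
As a quick sanity check against the limits just mentioned, the configured ranges can be validated before a run. The helper below is illustrative only, not part of VSPERF:

```python
# Illustrative sanity check (not part of VSPERF) for the RFC2889
# address and learning-rate settings against the documented maxima.
MAX_ADDRESSES = 16777216       # maximum address-caching value
MAX_LEARNING_RATE = 4294967295  # maximum learning rate

def check_rfc2889(min_lr, max_lr, min_addrs, max_addrs):
    """Return True if the configured ranges fit the documented limits."""
    return (0 < min_lr <= max_lr <= MAX_LEARNING_RATE and
            0 < min_addrs <= max_addrs <= MAX_ADDRESSES)

# Using the example values from the configuration above.
print(check_rfc2889(1488, 14880, 1000, 65536))
```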

Results for RFC2889 tests: the forwarding test outputs the following values:

TX_RATE_FPS : "Transmission Rate in Frames/sec"
THROUGHPUT_RX_FPS : "Received Throughput in Frames/sec"
TX_RATE_MBPS : "Transmission Rate in MBPS"
THROUGHPUT_RX_MBPS : "Received Throughput in MBPS"
TX_RATE_PERCENT : "Transmission Rate in Percentage"
FRAME_LOSS_PERCENT : "Frame Loss in Percentage"
FORWARDING_RATE_FPS : "Maximum Forwarding Rate in FPS"

whereas the address caching test outputs the following values:

CACHING_CAPACITY_ADDRS = 'Number of address it can cache'
ADDR_LEARNED_PERCENT = 'Percentage of address successfully learned'

and address learning test outputs just a single value:

OPTIMAL_LEARNING_RATE_FPS = 'Optimal learning rate in fps'

Note that ‘FORWARDING_RATE_FPS’, ‘CACHING_CAPACITY_ADDRS’, ‘ADDR_LEARNED_PERCENT’ and ‘OPTIMAL_LEARNING_RATE_FPS’ are the new result-constants added to support RFC2889 tests.

3.6. Xena Networks
3.6.1. Installation

The Xena Networks traffic generator requires specific files and packages to be installed. It is assumed the user has access to the Xena2544.exe file, which must be placed in the VSPerf installation location under the tools/pkt_gen/xena folder. Contact Xena Networks for the latest version of this file. Users with a valid support contract can also obtain the file from www.xenanetworks/downloads.

NOTE: VSPerf has been fully tested with version v2.43 of Xena2544.exe.

To execute the Xena2544.exe file under Linux distributions the mono-complete package must be installed. To install this package follow the instructions below. Further information can be obtained from http://www.mono-project.com/docs/getting-started/install/linux/

rpm --import "http://keyserver.ubuntu.com/pks/lookup?op=get&search=0x3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF"
yum-config-manager --add-repo http://download.mono-project.com/repo/centos/
yum -y install mono-complete-5.8.0.127-0.xamarin.3.epel7.x86_64

To prevent gpg errors on future yum installations of packages, the mono-project repo should be disabled once installed.

yum-config-manager --disable download.mono-project.com_repo_centos_
3.6.2. Configuration

Connection information for your Xena Chassis must be supplied inside the 10_custom.conf or 03_custom.conf file. The following parameters must be set to allow for proper connections to the chassis.

TRAFFICGEN_XENA_IP = ''
TRAFFICGEN_XENA_PORT1 = ''
TRAFFICGEN_XENA_PORT2 = ''
TRAFFICGEN_XENA_USER = ''
TRAFFICGEN_XENA_PASSWORD = ''
TRAFFICGEN_XENA_MODULE1 = ''
TRAFFICGEN_XENA_MODULE2 = ''
3.6.3. RFC2544 Throughput Testing

Xena traffic generator testing for RFC2544 throughput can be modified for different behaviors if needed. The default values of the following options are optimized for best results.

TRAFFICGEN_XENA_2544_TPUT_INIT_VALUE = '10.0'
TRAFFICGEN_XENA_2544_TPUT_MIN_VALUE = '0.1'
TRAFFICGEN_XENA_2544_TPUT_MAX_VALUE = '100.0'
TRAFFICGEN_XENA_2544_TPUT_VALUE_RESOLUTION = '0.5'
TRAFFICGEN_XENA_2544_TPUT_USEPASS_THRESHHOLD = 'false'
TRAFFICGEN_XENA_2544_TPUT_PASS_THRESHHOLD = '0.0'

Each value modifies the behavior of RFC2544 throughput testing. Refer to your Xena documentation to understand the behavior changes when modifying these values.

Xena RFC2544 testing inside VSPerf also includes a final verification option. This option allows a faster binary search followed by a longer final verification of the binary search result. The feature, and the length of the final verification in seconds, can be enabled in the configuration files:

TRAFFICGEN_XENA_RFC2544_VERIFY = False
TRAFFICGEN_XENA_RFC2544_VERIFY_DURATION = 120

If the final verification does not pass the test with the specified lossrate, the binary search will continue from its previous point. If the smart search option is enabled, the search will continue at the current pass rate minus the minimum, divided by 2. The maximum is set to the last pass rate minus the threshold value.

For example, if the settings are as follows:

TRAFFICGEN_XENA_RFC2544_BINARY_RESTART_SMART_SEARCH = True
TRAFFICGEN_XENA_2544_TPUT_MIN_VALUE = '0.5'
TRAFFICGEN_XENA_2544_TPUT_VALUE_RESOLUTION = '0.5'

and the verification attempt was 64.5, smart search would take (64.5 - 0.5) / 2. This would continue the search at 32, but still have a maximum possible value of 64.

If smart search is not enabled, the search will simply resume at the last pass rate minus the threshold value.
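
The resume arithmetic described above can be worked through in a few lines. This is a worked example of the description, not Xena's or VSPerf's actual code:

```python
# Worked example of the smart-search resume logic described above
# (illustrative only): after a failed verification, the search restarts
# at (last pass rate - minimum) / 2 and caps the new search maximum at
# the last pass rate minus the resolution threshold.
def smart_search_resume(last_pass, minimum, resolution):
    """Return (next_rate, new_maximum) after a failed verification."""
    next_rate = (last_pass - minimum) / 2
    new_maximum = last_pass - resolution
    return next_rate, new_maximum

# The example from the text: pass rate 64.5, minimum 0.5, resolution 0.5.
print(smart_search_resume(64.5, 0.5, 0.5))
```

This reproduces the numbers in the example: the search continues at 32 with a maximum of 64.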

3.6.4. Continuous Traffic Testing

Xena continuous traffic by default does a 3 second learning preemption to allow the DUT to receive learning packets before a continuous test is performed. If a custom test case requires this learning be disabled, you can disable the option or modify the length of the learning by modifying the following settings.

TRAFFICGEN_XENA_CONT_PORT_LEARNING_ENABLED = False
TRAFFICGEN_XENA_CONT_PORT_LEARNING_DURATION = 3
3.6.5. Multistream Modifier

Xena has a maximum modifier value of 64k. For this reason, when specifying multistream values greater than 64k for Layer 2 or Layer 3, VSPERF uses two modifiers; the requested value may be adjusted to one whose square root can form the two modifiers. You will see a log notification with the new value that was calculated.
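
The adjustment can be illustrated with simple arithmetic. The sketch below shows the square-root split as described, not the exact Xena implementation:

```python
import math

# Illustrative math for the two-modifier split described above: for
# multistream counts above 64k, approximate the count by a value whose
# square root gives the size of each of the two modifiers.
def split_modifiers(streams):
    """Return (modifier_size, effective_streams) for a requested count."""
    if streams <= 65536:
        return streams, streams      # fits in a single modifier
    side = round(math.sqrt(streams))
    return side, side * side         # two modifiers of 'side' values each

# 100000 streams cannot fit one 64k modifier, so it is split.
print(split_modifiers(100000))
```

Here a request for 100000 streams becomes two modifiers of 316 values, i.e. an effective count of 99856, which matches the kind of adjusted value the log notification reports.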

3.7. MoonGen
3.7.1. Installation

MoonGen architecture overview and general installation instructions can be found here:

https://github.com/emmericp/MoonGen

  • Note: Today, MoonGen with VSPERF only supports 10Gbps line speeds.

For VSPERF use, MoonGen should be cloned from here (as opposed to the previously mentioned GitHub):

git clone https://github.com/atheurer/lua-trafficgen

and use the master branch:

git checkout master

VSPERF uses a particular Lua script with the MoonGen project:

trafficgen.lua

Follow MoonGen set up and execution instructions here:

https://github.com/atheurer/lua-trafficgen/blob/master/README.md

Note that one will need to set up passwordless ssh login between the server running MoonGen and the device under test (running the VSPERF test infrastructure). This is because VSPERF on one server uses ‘ssh’ to configure and run MoonGen on the other server.

One can set up this ssh access by doing the following on both servers:

ssh-keygen -b 2048 -t rsa
ssh-copy-id <other server>
3.7.2. Configuration

Connection information for MoonGen must be supplied inside the 10_custom.conf or 03_custom.conf file. The following parameters must be set to allow for proper connections to the host with MoonGen.

TRAFFICGEN_MOONGEN_HOST_IP_ADDR = ""
TRAFFICGEN_MOONGEN_USER = ""
TRAFFICGEN_MOONGEN_BASE_DIR = ""
TRAFFICGEN_MOONGEN_PORTS = ""
TRAFFICGEN_MOONGEN_LINE_SPEED_GBPS = ""
3.8. Trex
3.8.1. Installation

Trex architecture overview and general installation instructions can be found here:

https://trex-tgn.cisco.com/trex/doc/trex_stateless.html

You can directly download from GitHub:

git clone https://github.com/cisco-system-traffic-generator/trex-core

and use the same Trex version for both server and client API.

NOTE: The Trex API version used by VSPERF is defined by variable TREX_TAG in file src/package-list.mk.

git checkout v2.38

or you can download the latest Trex release from here:

wget --no-cache http://trex-tgn.cisco.com/trex/release/latest

After download, the Trex repo has to be built:

cd trex-core/linux_dpdk
./b configure   (run only once)
./b build

The next step is to create a minimum configuration file. It can be created with the dpdk_setup_ports.py script; with the parameter -i the script runs in interactive mode and creates the file /etc/trex_cfg.yaml.

cd trex-core/scripts
sudo ./dpdk_setup_ports.py -i

Alternatively, an example configuration file can be copied from the location below, but it must be updated manually:

cp trex-core/scripts/cfg/simple_cfg /etc/trex_cfg.yaml

For additional information about configuration file see official documentation (chapter 3.1.2):

https://trex-tgn.cisco.com/trex/doc/trex_manual.html#_creating_minimum_configuration_file

After compilation and configuration it is possible to run the Trex server in stateless mode, which is necessary for a proper connection between the Trex server and VSPERF.

cd trex-core/scripts/
./t-rex-64 -i

NOTE: Please check your firewall settings at both the DUT and the T-Rex server. The firewall must allow a connection from the DUT (VSPERF) to the T-Rex server running at TCP port 4501.

NOTE: For high speed cards it may be advantageous to start T-Rex with more transmit queues/cores.

cd trex-core/scripts/
./t-rex-64 -i -c 10

For additional information about Trex stateless mode see Trex stateless documentation:

https://trex-tgn.cisco.com/trex/doc/trex_stateless.html

NOTE: One will need to set up passwordless ssh login between the server running Trex and the device under test (running the VSPERF test infrastructure). This is because VSPERF on one server uses ‘ssh’ to configure and run Trex on the other server.

One can set up this ssh access by doing the following on both servers:

ssh-keygen -b 2048 -t rsa
ssh-copy-id <other server>
3.8.2. Configuration

Connection information for Trex must be supplied inside the custom configuration file. The following parameters must be set to allow proper connections to the host with Trex. An example of this configuration can be found in conf/03_traffic.conf or conf/10_custom.conf.

TRAFFICGEN_TREX_HOST_IP_ADDR = ''
TRAFFICGEN_TREX_USER = ''
TRAFFICGEN_TREX_BASE_DIR = ''

TRAFFICGEN_TREX_USER must have sudo permission and password-less access. TRAFFICGEN_TREX_BASE_DIR is the directory where the ‘t-rex-64’ file is stored.

It is possible to specify the accuracy of the RFC2544 Throughput measurement. The threshold below defines the maximal difference between the frame rate of a successful iteration (i.e. the defined frameloss was reached) and an unsuccessful one (i.e. the frameloss was exceeded).

Default value of this parameter is defined in conf/03_traffic.conf as follows:

TRAFFICGEN_TREX_RFC2544_TPUT_THRESHOLD = ''
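
The role of this threshold in a binary search can be sketched in a few lines. This is illustrative pseudo-logic with a hypothetical pass/fail predicate, not VSPERF's actual implementation:

```python
# Illustrative binary search bounded by an accuracy threshold: iterate
# until the gap between the lowest failing and highest passing frame
# rate drops below the configured threshold.
def rfc2544_search(passes_at, threshold, low=0.0, high=100.0):
    """passes_at(rate) tells whether a rate meets the frameloss target."""
    best = low
    while high - low > threshold:
        rate = (low + high) / 2
        if passes_at(rate):
            best = rate
            low = rate       # successful iteration: search higher
        else:
            high = rate      # unsuccessful iteration: search lower
    return best

# Hypothetical DUT that sustains up to 42.0 % of line rate without loss.
print(rfc2544_search(lambda r: r <= 42.0, 0.05))
```

A smaller threshold yields a more accurate result at the cost of more search iterations.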

T-Rex can have learning packets enabled. For certain tests it may be beneficial to send some packets before starting the test traffic, to allow switch learning to take place. This can be adjusted with the following configuration options:

TRAFFICGEN_TREX_LEARNING_MODE=True
TRAFFICGEN_TREX_LEARNING_DURATION=5

Latency measurements have an impact on T-Rex performance, so vswitchperf uses a separate latency stream with limited speed for each direction. This workaround is used for the RFC2544 Throughput and Continuous traffic types. In case of the Burst traffic type, latency statistics are measured for all frames in the burst. Collection of latency statistics is driven by the configuration option TRAFFICGEN_TREX_LATENCY_PPS as follows:

  • value 0 - disables latency measurements

  • non-zero integer value - enables latency measurements; in case of the Throughput and Continuous traffic types, it specifies the speed of the latency specific stream in PPS; in case of the Burst traffic type, it enables latency measurements for all frames.

TRAFFICGEN_TREX_LATENCY_PPS = 1000
3.8.3. SR-IOV and Multistream layer 2

T-Rex by default only accepts packets on the receive side if the destination MAC matches the MAC address specified in /etc/trex_cfg.yaml on the server side. For SR-IOV this creates challenges with modifying the MAC address in the traffic profile to correctly flow packets through specified VFs. To remove this limitation, enable promiscuous mode on T-Rex so that all packets are accepted regardless of the destination MAC.

This also creates problems when doing multistream at Layer 2, since the source MACs will be modified. Enable promiscuous mode when doing multistream Layer 2 testing with T-Rex.

TRAFFICGEN_TREX_PROMISCUOUS=True
3.8.4. Card Bandwidth Options

The T-Rex API will attempt to retrieve the highest possible speed from the card using internal calls to port information. If you are using two separate cards, it will take the lower of the two speeds as the maximum. If necessary, you can force the API to use a specific maximum speed per port by adjusting the configurations below.

TRAFFICGEN_TREX_FORCE_PORT_SPEED = True
TRAFFICGEN_TREX_PORT_SPEED = 40000 # 40 gig

NOTE: Setting speeds higher than the card supports will result in unpredictable behavior when running tests, such as duration inaccuracy and/or complete test failure.

3.8.5. RFC2544 Validation

T-Rex can perform a verification run for a longer duration once the binary search of the RFC2544 trials has completed. This duration should be at least 60 seconds. This is similar to other traffic generator functionality, where a more sustained run can be attempted to verify the result of the search. This can be configured with the following parameters:

TRAFFICGEN_TREX_VERIFICATION_MODE = False
TRAFFICGEN_TREX_VERIFICATION_DURATION = 60
TRAFFICGEN_TREX_MAXIMUM_VERIFICATION_TRIALS = 10

The duration and the maximum number of attempted verification trials can be set to change the behavior of this step. If the verification step fails, the binary search resumes with new values, where the maximum output is the last attempted frame rate minus the currently set threshold.

3.8.6. Scapy frame definition

It is possible to use a SCAPY frame definition to generate various network protocols with the T-Rex traffic generator. If a particular network protocol layer is disabled in the TRAFFIC dictionary (e.g. TRAFFIC[‘vlan’][‘enabled’] = False), then the disabled layer will be removed from the scapy frame definition by VSPERF.

The scapy frame definition can refer to values defined in the TRAFFIC dictionary via the following keywords, which are used in the examples below.

  • Ether_src - refers to TRAFFIC['l2']['srcmac']
  • Ether_dst - refers to TRAFFIC['l2']['dstmac']
  • IP_proto - refers to TRAFFIC['l3']['proto']
  • IP_PROTO - refers to upper case version of TRAFFIC['l3']['proto']
  • IP_src - refers to TRAFFIC['l3']['srcip']
  • IP_dst - refers to TRAFFIC['l3']['dstip']
  • IP_PROTO_sport - refers to TRAFFIC['l4']['srcport']
  • IP_PROTO_dport - refers to TRAFFIC['l4']['dstport']
  • Dot1Q_prio - refers to TRAFFIC['vlan']['priority']
  • Dot1Q_id - refers to TRAFFIC['vlan']['cfi']
  • Dot1Q_vlan - refers to TRAFFIC['vlan']['id']

In the following examples of SCAPY frame definitions, only the relevant parts of the TRAFFIC dictionary are shown. The rest of the TRAFFIC dictionary is set to the default values as defined in conf/03_traffic.conf.

Please check official documentation of SCAPY project for details about SCAPY frame definition and supported network layers at: http://www.secdev.org/projects/scapy

  1. Generate ICMP frames:

    'scapy': {
        'enabled': True,
        '0' : 'Ether(src={Ether_src}, dst={Ether_dst})/IP(proto="icmp", src={IP_src}, dst={IP_dst})/ICMP()',
        '1' : 'Ether(src={Ether_dst}, dst={Ether_src})/IP(proto="icmp", src={IP_dst}, dst={IP_src})/ICMP()',
    }
    
  2. Generate IPv6 ICMP Echo Request frames:

    'l3' : {
        'srcip': 'feed::01',
        'dstip': 'feed::02',
    },
    'scapy': {
        'enabled': True,
        '0' : 'Ether(src={Ether_src}, dst={Ether_dst})/IPv6(src={IP_src}, dst={IP_dst})/ICMPv6EchoRequest()',
        '1' : 'Ether(src={Ether_dst}, dst={Ether_src})/IPv6(src={IP_dst}, dst={IP_src})/ICMPv6EchoRequest()',
    }
    
  3. Generate TCP frames:

    This example uses the default SCAPY frame definition, which reflects the TRAFFIC['l3']['proto'] setting.

    'l3' : {
        'proto' : 'tcp',
    },
    
4. ‘vsperf’ Additional Tools Configuration Guide
4.1. Overview

VSPERF supports additional tools in the following categories:

  • Infrastructure Metrics Collection
  • Load Generation
  • Last Level Cache Management

Under each category, there are one or more tools supported by VSPERF. This guide provides the details of how to install (if required) and configure the above mentioned tools.

4.2. Infrastructure Metrics Collection

VSPERF supports the following two tools for collecting and reporting metrics:

  • pidstat
  • collectd

pidstat is a command available on Linux systems, used for monitoring individual tasks currently being managed by the Linux kernel. In VSPERF this command is used to monitor the ovs-vswitchd, ovsdb-server and kvm processes.

collectd is a Linux application that collects, stores and transfers various system metrics. For every category of metrics, there is a separate collectd plugin. For example, the CPU plugin and the Interface plugin provide all the CPU metrics and interface metrics, respectively. CPU metrics may include user-time, system-time, etc., whereas interface metrics may include received-packets, dropped-packets, etc.

4.2.1. Installation

No installation is required for pidstat, whereas collectd has to be installed separately. For installation of collectd, we recommend following the process described in the OPNFV-Barometer project, which can be found here: Barometer-Euphrates, or the most recent release.

VSPERF assumes that collectd is installed and configured to send metrics over localhost. The metrics sent should be for the following categories: CPU, Processes, Interface, OVS, DPDK, Intel-RDT.

4.2.2. Configuration

The configuration file for the collectors can be found in conf/05_collector.conf. pidstat specific configuration includes:

  • PIDSTAT_MONITOR - processes to be monitored by pidstat
  • PIDSTAT_OPTIONS - options which will be passed to pidstat command
  • PIDSTAT_SAMPLE_INTERVAL - sampling interval used by pidstat to collect statistics
  • LOG_FILE_PIDSTAT - prefix of pidstat’s log file
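As an illustration, a pidstat section of conf/05_collector.conf might look like the following (the values shown are examples only; check the shipped configuration file for the actual defaults):

```python
# Example pidstat collector settings (illustrative values)
PIDSTAT_MONITOR = ['ovs-vswitchd', 'ovsdb-server', 'kvm']  # processes to watch
PIDSTAT_OPTIONS = '-dur'          # options passed to the pidstat command
PIDSTAT_SAMPLE_INTERVAL = 1       # sample statistics every second
LOG_FILE_PIDSTAT = 'pidstat'      # prefix of pidstat's log file
```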

The collectd configuration option includes:

  • COLLECTD_IP - IP address where collectd is running
  • COLLECTD_PORT - Port number over which collectd is sending the metrics
  • COLLECTD_SECURITY_LEVEL - Security level for receiving metrics
  • COLLECTD_AUTH_FILE - Authentication file for receiving metrics
  • LOG_FILE_COLLECTD - Prefix for collectd’s log file.
  • COLLECTD_CPU_KEYS - Interesting metrics from CPU
  • COLLECTD_PROCESSES_KEYS - Interesting metrics from processes
  • COLLECTD_INTERFACE_KEYS - Interesting metrics from interface
  • COLLECTD_OVSSTAT_KEYS - Interesting metrics from OVS
  • COLLECTD_DPDKSTAT_KEYS - Interesting metrics from DPDK.
  • COLLECTD_INTELRDT_KEYS - Interesting metrics from Intel-RDT
  • COLLECTD_INTERFACE_XKEYS - Metrics to exclude from Interface
  • COLLECTD_INTELRDT_XKEYS - Metrics to exclude from Intel-RDT
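For example, a collectd section might be configured as follows (illustrative values only; 25826 is collectd's default network plugin port):

```python
# Example collectd collector settings (illustrative values)
COLLECTD_IP = '127.0.0.1'                # collectd sends metrics to localhost
COLLECTD_PORT = 25826                    # default collectd network plugin port
COLLECTD_SECURITY_LEVEL = 'None'         # no signing/encryption of metrics
COLLECTD_AUTH_FILE = ''                  # unused at security level 'None'
COLLECTD_CPU_KEYS = ['system', 'user']   # CPU metrics to keep
COLLECTD_INTERFACE_XKEYS = ['lo']        # interfaces to exclude
```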
4.3. Load Generation

In VSPERF, load generation refers to creating background CPU and memory loads to study the impact of these loads on the system under test. There are two options to create loads in VSPERF. These options are used for different use-cases. The options are:

  • stress or stress-ng
  • Stressor-VMs

stress and stress-ng are Linux tools used to stress the system in various ways. They can stress different subsystems such as CPU and memory. stress-ng is the improved version of stress. StressorVMs are custom-built virtual machines for noisy-neighbor use-cases.

4.3.1. Installation

stress and stress-ng can be installed through standard linux installation process. Information about stress-ng, including the steps for installing can be found here: stress-ng

There are two options for StressorVMs - one is VMs based on stress-ng and the second is a VM based on Spirent's cloudstress. VMs based on stress-ng can be found at this link. Spirent's cloudstress-based VM can be downloaded from this site.

These StressorVMs are OSv-based VMs, which are very small in size. Download these VMs and place them in an appropriate location; this location is used in the configuration, as mentioned below.

4.3.2. Configuration

The configuration file for loadgens can be found in conf/07_loadgen.conf. There are no specific configurations for stress and stress-ng command based load-generation. However, for StressorVMs, the following configurations apply:

  • NN_COUNT - Number of stressor VMs required.
  • NN_MEMORY - Comma separated memory configuration for each VM
  • NN_SMP - Comma separated configuration for each VM
  • NN_IMAGE - Comma separated list of Paths for each VM image
  • NN_SHARED_DRIVE_TYPE - Comma separated list of shared drive type for each VM
  • NN_BOOT_DRIVE_TYPE - Comma separated list of boot drive type for each VM
  • NN_CORE_BINDING - Comma separated lists of list specifying the cores associated with each VM.
  • NN_NICS_NR - Comma separated list of number of NICs for each VM
  • NN_BASE_VNC_PORT - Base VNC port Index.
  • NN_LOG_FILE - Name of the log file
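A configuration for two stress-ng based StressorVMs could look like the following (an illustrative sketch; the image paths and values are hypothetical):

```python
# Example StressorVM load generator settings (illustrative values)
NN_COUNT = 2                                   # two noisy-neighbor VMs
NN_MEMORY = ['2048', '2048']                   # memory per VM in MB
NN_SMP = ['2', '2']                            # vCPUs per VM
NN_IMAGE = ['/opt/vms/stressng.qcow2',         # hypothetical image paths
            '/opt/vms/stressng.qcow2']
NN_SHARED_DRIVE_TYPE = ['raw', 'raw']
NN_BOOT_DRIVE_TYPE = ['raw', 'raw']
NN_CORE_BINDING = [('10',), ('11',)]           # cores bound to each VM
NN_NICS_NR = ['2', '2']                        # NICs per VM
NN_BASE_VNC_PORT = 1
NN_LOG_FILE = 'noisy_vm.log'
```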
4.4. Last Level Cache Management

VSPERF supports last-level cache management using Intel's RDT tool(s) - the relevant ones are Intel CAT-CMT and Intel RMD. RMD is a Linux daemon that runs on individual hosts and provides a REST API for the control/orchestration layer to request LLC for VMs/Containers/Applications. RMD receives a resource policy from the orchestration layer - in this case, from VSPERF - and enforces it on the host. It achieves this enforcement via kernel interfaces such as resctrlfs and libpqos. The resource here refers to the last-level cache. Users can configure policies to define how much cache a CPU can get. The policy configuration is described below.

4.4.1. Installation

For installation of the RMD tool, please install CAT-CMT first and then install RMD. The details of installation can be found here: Intel CAT-CMT and Intel RMD.

4.4.2. Configuration

The configuration file for cache management can be found in conf/08_llcmanagement.conf.

VSPERF provides the following configuration options for the user to define and enforce policies via RMD.

  • LLC_ALLOCATION - Enable or Disable LLC management.
  • RMD_PORT - RMD port (port number on which API server is listening)
  • RMD_SERVER_IP - IP address where RMD is running. Currently only localhost.
  • RMD_API_VERSION - RMD version. Currently it is ‘v1’
  • POLICY_TYPE - Specify how the policy is defined - either COS or CUSTOM
  • VSWITCH_COS - Class of service (CoS) for the vSwitch. CoS can be gold, silver-bf or bronze-shared.
  • VNF_COS - Class of service for VNF
  • PMD_COS - Class of service for PMD
  • NOISEVM_COS - Class of service of Noisy VM.
  • VSWITCH_CA - [min-cache-value, max-cache-value] for vswitch
  • VNF_CA - [min-cache-value, max-cache-value] for VNF
  • PMD_CA - [min-cache-value, max-cache-value] for PMD
  • NOISEVM_CA - [min-cache-value, max-cache-value] for Noisy VM
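Putting these together, a COS-based policy might be configured as follows (illustrative values only):

```python
# Example LLC management settings (illustrative values)
LLC_ALLOCATION = True           # enable LLC management
RMD_PORT = 8081                 # port the RMD API server listens on
RMD_SERVER_IP = 'localhost'     # currently only localhost is supported
RMD_API_VERSION = 'v1'
POLICY_TYPE = 'COS'             # use class-of-service based policies
VSWITCH_COS = 'gold'
VNF_COS = 'silver-bf'
PMD_COS = 'gold'
NOISEVM_COS = 'bronze-shared'
```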
VSPERF Test Guide
1. vSwitchPerf test suites userguide
1.1. General

VSPERF requires a traffic generator to run tests. Automated traffic generator support in VSPERF includes:

  • IXIA traffic generator (IxNetwork hardware) and a machine that runs the IXIA client software.
  • Spirent traffic generator (TestCenter hardware chassis or TestCenter virtual in a VM) and a VM to run the Spirent Virtual Deployment Service image, formerly known as “Spirent LabServer”.
  • Xena Network traffic generator (Xena hardware chassis) that houses the Xena Traffic generator modules.
  • Moongen software traffic generator. Requires a separate machine running moongen to execute packet generation.
  • T-Rex software traffic generator. Requires a separate machine running T-Rex Server to execute packet generation.

If you want to use another traffic generator, please select the Dummy generator.

1.2. VSPERF Installation

To see the supported Operating Systems, vSwitches and system requirements, please follow the installation instructions <vsperf-installation>.

1.3. Traffic Generator Setup

Follow the Traffic generator instructions <trafficgen-installation> to install and configure a suitable traffic generator.

1.4. Cloning and building src dependencies

In order to run VSPERF, you will need to download DPDK and OVS. You can do this manually and build them in a preferred location, OR you can use vswitchperf/src. The vswitchperf/src directory contains makefiles that will allow you to clone and build the libraries that VSPERF depends on, such as DPDK and OVS. To clone and build simply:

$ cd src
$ make

VSPERF can be used with stock OVS (without DPDK support). When the build is finished, the libraries are stored in the src_vanilla directory.

The 'make' builds all options in src:

  • Vanilla OVS
  • OVS with vhost_user as the guest access method (with DPDK support)

The vhost_user build will reside in src/ovs/. The Vanilla OVS build will reside in vswitchperf/src_vanilla.

To delete a src subdirectory and its contents to allow you to re-clone simply use:

$ make clobber
1.5. Configure the ./conf/10_custom.conf file

The 10_custom.conf file is the configuration file that overrides default configurations in all the other configuration files in ./conf. The supplied 10_custom.conf file MUST be modified, as it contains configuration items for which there are no reasonable default values.

The configuration items that can be added are not limited to the initial contents. Any configuration item mentioned in any .conf file in the ./conf directory can be added, and that item will be overridden by the custom configuration value.

Further details about configuration file evaluation and the special behaviour of options with the GUEST_ prefix can be found in the design document.

1.6. Using a custom settings file

If your 10_custom.conf doesn’t reside in the ./conf directory or if you want to use an alternative configuration file, the file can be passed to vsperf via the --conf-file argument.

$ ./vsperf --conf-file <path_to_custom_conf> ...
1.7. Evaluation of configuration parameters

The value of a configuration parameter can be specified in various places, e.g. in the test case definition, inside configuration files, by command line argument, etc. Thus it is important to understand the order of configuration parameter evaluation. This "priority hierarchy" can be described like so (1 = max priority):

  1. Testcase definition keywords vSwitch, Trafficgen, VNF and Tunnel Type
  2. Parameters inside testcase definition section Parameters
  3. Command line arguments (e.g. --test-params, --vswitch, --trafficgen, etc.)
  4. Environment variables (see --load-env argument)
  5. Custom configuration file specified via --conf-file argument
  6. Standard configuration files, where higher prefix number means higher priority.

For example, if the same configuration parameter is defined in the custom configuration file (specified via the --conf-file argument), via the --test-params argument and also inside the Parameters section of the testcase definition, then the parameter value from the Parameters section will be used.

Further details about the order of configuration file evaluation and the special behaviour of options with the GUEST_ prefix can be found in the design document.

1.8. Overriding values defined in configuration files

Configuration items can be overridden by the command line argument --test-params. In this case, the configuration items and their values should be passed in the form item=value and separated by semicolons.

Example:

$ ./vsperf --test-params "TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,);" \
                         "GUEST_LOOPBACK=['testpmd','l2fwd']" pvvp_tput

The --test-params command line argument can also be used to override default configuration values for multiple tests. Providing a list of parameters will apply each element of the list to the test with the same index. If more tests are run than parameters provided, the last element of the list will repeat.

$ ./vsperf --test-params "['TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)',"
                         "'TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(64,)']" \
                         pvvp_tput pvvp_tput

The second option is to override configuration items via the Parameters section of the test case definition. The configuration items can be added to the Parameters dictionary with their new values. These values will override values defined in configuration files or specified by the --test-params command line argument.

Example:

"Parameters" : {'TRAFFICGEN_PKT_SIZES' : (128,),
                'TRAFFICGEN_DURATION' : 10,
                'GUEST_LOOPBACK' : ['testpmd','l2fwd'],
               }

NOTE: In both cases, configuration item names and their values must be specified in the same form as they are defined inside configuration files. Parameter names must be specified in uppercase and the data types of the original and new value must match. Python syntax rules related to data types and structures must be followed. For example, the parameter TRAFFICGEN_PKT_SIZES above is defined as a tuple with a single value 128. In this case the trailing comma is mandatory, otherwise the value can be wrongly interpreted as a number instead of a tuple and vsperf execution will fail. Please check the configuration files for default values and their types and use them as a basis for any customized values. In case of any doubt, please check the official Python documentation related to data structures like tuples, lists and dictionaries.
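The tuple pitfall mentioned in the note can be demonstrated directly in Python:

```python
# The trailing comma is what makes a one-element tuple a tuple:
single_element_tuple = (128,)   # a tuple containing the number 128
plain_number = (128)            # parentheses alone: this is just the int 128

assert isinstance(single_element_tuple, tuple)
assert isinstance(plain_number, int)
```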

NOTE: Vsperf execution will terminate with a runtime error if an unknown parameter name is passed via the --test-params CLI argument or defined in the Parameters section of a test case definition. It is also forbidden to redefine the value of the TEST_PARAMS configuration item via the CLI or the Parameters section.

NOTE: A new definition of a dictionary parameter, specified via --test-params or inside the Parameters section, will not override the original dictionary values. Instead, the original dictionary will be updated with values from the new dictionary definition.

1.9. Referencing parameter values

It is possible to use a special macro #PARAM() to refer to the value of another configuration parameter. This reference is evaluated during access of the parameter value (by settings.getValue() call), so it can refer to parameters created during VSPERF runtime, e.g. NICS dictionary. It can be used to reflect DUT HW details in the testcase definition.

Example:

{
    ...
    "Name": "testcase",
    "Parameters" : {
        "TRAFFIC" : {
            'l2': {
                # set destination MAC to the MAC of the first
                # interface from WHITELIST_NICS list
                'dstmac' : '#PARAM(NICS[0]["mac"])',
            },
        },
    ...
1.10. vloop_vnf

VSPERF uses a VM image called vloop_vnf for looping traffic in the deployment scenarios involving VMs. The image can be downloaded from http://artifacts.opnfv.org/.

Please see the installation instructions for information on vloop-vnf images.

1.11. l2fwd Kernel Module

A kernel module that provides OSI Layer 2 IPv4 termination or forwarding with support for Destination Network Address Translation (DNAT) for both the MAC and IP addresses. l2fwd can be found in <vswitchperf_dir>/src/l2fwd.

1.12. Additional Tools Setup

Follow the Additional tools instructions <additional-tools-configuration> to install and configure additional tools such as collectors and loadgens.

1.13. Executing tests

All examples inside these docs assume that the user is inside the VSPERF directory; however, VSPERF can be executed from any directory.

Before running any tests make sure you have root permissions by adding the following line to /etc/sudoers:

username ALL=(ALL)       NOPASSWD: ALL

username in the example above should be replaced with a real username.

To list the available tests:

$ ./vsperf --list

To run a single test:

$ ./vsperf $TESTNAME

Where $TESTNAME is the name of the vsperf test you would like to run.

To run a test multiple times, repeat it:

$ ./vsperf $TESTNAME $TESTNAME $TESTNAME

To run a group of tests, for example all tests with a name containing ‘RFC2544’:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf --tests="RFC2544"

To run all tests:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf

Some tests allow for configurable parameters, including test duration (in seconds) as well as packet sizes (in bytes).

$ ./vsperf --conf-file user_settings.py \
    --tests RFC2544Tput \
    --test-params "TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)"

To specify configurable parameters for multiple tests, use a list of parameters, one element for each test.

$ ./vsperf --conf-file user_settings.py \
    --test-params "['TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)',"\
    "'TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(64,)']" \
    phy2phy_cont phy2phy_cont

If the CUMULATIVE_PARAMS setting is set to True and different parameters are provided for each test using --test-params, each test will take the parameters of the previous test before applying its own. With CUMULATIVE_PARAMS set to True the following command is equivalent to the previous example:

$ ./vsperf --conf-file user_settings.py \
    --test-params "['TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)',"\
    "'TRAFFICGEN_PKT_SIZES=(64,)']" \
    phy2phy_cont phy2phy_cont
    "

For all available options, check out the help dialog:

$ ./vsperf --help
1.14. Executing Vanilla OVS tests
  1. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  2. Update your 10_custom.conf file to use Vanilla OVS:

    VSWITCH = 'OvsVanilla'
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>
    

    Please note that if you don't want to configure Vanilla OVS through the configuration file, you can pass it as a CLI argument.

    $ ./vsperf --vswitch OvsVanilla
    
1.15. Executing tests with VMs

To run tests using vhost-user as guest access method:

  1. Set VSWITCH and VNF of your settings file to:

    VSWITCH = 'OvsDpdkVhost'
    VNF = 'QemuDpdkVhost'
    
  2. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf
    

NOTE: By default the vSwitch acts as the server for dpdk vhost-user sockets. If QEMU should be the server for vhost-user sockets instead, the parameter VSWITCH_VHOSTUSER_SERVER_MODE should be set to False.

1.16. Executing tests with VMs using Vanilla OVS

To run tests using Vanilla OVS:

  1. Set the following variables:

    VSWITCH = 'OvsVanilla'
    VNF = 'QemuVirtioNet'
    
    VANILLA_TGEN_PORT1_IP = n.n.n.n
    VANILLA_TGEN_PORT1_MAC = nn:nn:nn:nn:nn:nn
    
    VANILLA_TGEN_PORT2_IP = n.n.n.n
    VANILLA_TGEN_PORT2_MAC = nn:nn:nn:nn:nn:nn
    
    VANILLA_BRIDGE_IP = n.n.n.n
    

    or use --test-params option

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
               --test-params "VANILLA_TGEN_PORT1_IP=n.n.n.n;" \
                             "VANILLA_TGEN_PORT1_MAC=nn:nn:nn:nn:nn:nn;" \
                             "VANILLA_TGEN_PORT2_IP=n.n.n.n;" \
                             "VANILLA_TGEN_PORT2_MAC=nn:nn:nn:nn:nn:nn"
    
  2. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf
    
1.17. Executing VPP tests

Currently it is not possible to use standard scenario deployments for the execution of tests with VPP. This means that the deployments p2p, pvp, pvvp and in general any PXP deployment won't work with VPP. However, it is possible to use VPP in Step driven tests. A basic set of VPP testcases covering phy2phy, pvp and pvvp tests is already prepared.

List of performance tests with VPP support follows:

  • phy2phy_tput_vpp: VPP: LTD.Throughput.RFC2544.PacketLossRatio
  • phy2phy_cont_vpp: VPP: Phy2Phy Continuous Stream
  • phy2phy_back2back_vpp: VPP: LTD.Throughput.RFC2544.BackToBackFrames
  • pvp_tput_vpp: VPP: LTD.Throughput.RFC2544.PacketLossRatio
  • pvp_cont_vpp: VPP: PVP Continuous Stream
  • pvp_back2back_vpp: VPP: LTD.Throughput.RFC2544.BackToBackFrames
  • pvvp_tput_vpp: VPP: LTD.Throughput.RFC2544.PacketLossRatio
  • pvvp_cont_vpp: VPP: PVVP Continuous Stream
  • pvvp_back2back_vpp: VPP: LTD.Throughput.RFC2544.BackToBackFrames

In order to execute testcases with VPP, VPP must be installed and configured first; see the installation instructions <vsperf-installation>.

After that it is possible to execute the VPP testcases listed above.

For example:

$ ./vsperf --conf-file=<path_to_custom_conf> phy2phy_tput_vpp
1.18. Using vfio_pci with DPDK

To use vfio with DPDK instead of igb_uio, add the following parameter to your custom configuration file:

PATHS['dpdk']['src']['modules'] = ['uio', 'vfio-pci']

NOTE: If DPDK is installed from a binary package, please set PATHS['dpdk']['bin']['modules'] instead.

NOTE: Please ensure that Intel VT-d is enabled in BIOS.

NOTE: Please ensure your boot/grub parameters include the following:

iommu=pt intel_iommu=on

To check that IOMMU is enabled on your platform:

$ dmesg | grep IOMMU
[    0.000000] Intel-IOMMU: enabled
[    0.139882] dmar: IOMMU 0: reg_base_addr fbffe000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.139888] dmar: IOMMU 1: reg_base_addr ebffc000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.139893] IOAPIC id 2 under DRHD base  0xfbffe000 IOMMU 0
[    0.139894] IOAPIC id 0 under DRHD base  0xebffc000 IOMMU 1
[    0.139895] IOAPIC id 1 under DRHD base  0xebffc000 IOMMU 1
[    3.335744] IOMMU: dmar0 using Queued invalidation
[    3.335746] IOMMU: dmar1 using Queued invalidation
....

NOTE: In case of VPP, it is required to explicitly define that the vfio-pci DPDK driver should be used. This means updating the dpdk part of the VSWITCH_VPP_ARGS dictionary with a uio-driver section, e.g. VSWITCH_VPP_ARGS['dpdk'] = 'uio-driver vfio-pci'.

1.19. Using SRIOV support

To use the virtual functions of a NIC with SRIOV support, use the extended form of the NIC PCI slot definition:

WHITELIST_NICS = ['0000:05:00.0|vf0', '0000:05:00.1|vf3']

Where 'vf' indicates virtual function usage and the following number defines the VF to be used. If VF usage is detected, vswitchperf will enable SRIOV support for the given card and detect the PCI slot numbers of the selected VFs.

So in the example above, one VF will be configured for NIC '0000:05:00.0' and four VFs will be configured for NIC '0000:05:00.1'. Vswitchperf will detect the PCI addresses of the selected VFs and use them during test execution.
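The extended slot syntax can be decoded with a small sketch (a hypothetical helper; vswitchperf's own detection logic is more involved). Since VF numbering starts at zero, selecting vf3 implies that four VFs (0-3) are configured:

```python
def parse_vf_spec(nic_spec):
    """Split '<pci_address>|vf<N>' into its parts (illustrative only).
    Returns the PCI address, the chosen VF index and the number of VFs
    that must exist for that index to be usable."""
    address, _, vf_part = nic_spec.partition('|')
    vf_index = int(vf_part[2:])              # drop the leading 'vf'
    return address, vf_index, vf_index + 1
```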

At the end of vswitchperf execution, SRIOV support will be disabled.

SRIOV support is generic and can be used in different testing scenarios. For example:

  • vSwitch tests with DPDK or without DPDK support to verify impact of VF usage on vSwitch performance
  • tests without vSwitch, where traffic is forwarded directly between VF interfaces by packet forwarder (e.g. testpmd application)
  • tests without vSwitch, where VM accesses VF interfaces directly by PCI-passthrough to measure raw VM throughput performance.
1.20. Using QEMU with PCI passthrough support

Raw virtual machine throughput performance can be measured by executing a PVP test with direct access to NICs via PCI pass-through. To execute a VM with direct access to PCI devices, enable vfio-pci. In order to use virtual functions, SRIOV support must be enabled.

Execution of test with PCI pass-through with vswitch disabled:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
           --vswitch none --vnf QemuPciPassthrough pvp_tput

Any of the supported guest loopback applications can be used inside a VM with PCI pass-through support.

NOTE: Qemu with PCI pass-through support can be used only with the PVP test deployment.

1.21. Selection of loopback application for tests with VMs

To select the loopback applications which will forward packets inside VMs, the following parameter should be configured:

GUEST_LOOPBACK = ['testpmd']

or use --test-params CLI argument:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
      --test-params "GUEST_LOOPBACK=['testpmd']"

Supported loopback applications are:

'testpmd'       - testpmd from dpdk will be built and used
'l2fwd'         - l2fwd module provided by Huawei will be built and used
'linux_bridge'  - linux bridge will be configured
'buildin'       - nothing will be configured by vsperf; VM image must
                  ensure traffic forwarding between its interfaces

A guest loopback application must be configured, otherwise traffic will not be forwarded by the VM and testcases with VM related deployments will fail. The guest loopback application is set to 'testpmd' by default.

NOTE: If only one NIC, or more than two NICs, are configured for the VM, then 'testpmd' should be used, as it is able to forward traffic between multiple VM NIC pairs.

NOTE: In case of linux_bridge, all guest NICs are connected to the same bridge inside the guest.

1.22. Mergable Buffers Options with QEMU

Mergable buffers can be disabled with VSPerf within QEMU. This option can increase performance significantly when not using jumbo frame sized packets. By default VSPerf disables mergable buffers. If you wish to enable them, you can modify the setting in a custom conf file.

GUEST_NIC_MERGE_BUFFERS_DISABLE = [False]

Then execute using the custom conf file.

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf

Alternatively, you can just pass the parameter during execution.

$ ./vsperf --test-params "GUEST_NIC_MERGE_BUFFERS_DISABLE=[False]"
1.23. Selection of dpdk binding driver for tests with VMs

To select the dpdk binding driver, which specifies which driver the VM NICs will use for the dpdk bind, the following configuration parameter should be set:

GUEST_DPDK_BIND_DRIVER = ['igb_uio_from_src']

The supported dpdk guest bind drivers are:

'uio_pci_generic'      - Use uio_pci_generic driver
'igb_uio_from_src'     - Build and use the igb_uio driver from the dpdk src
                         files
'vfio_no_iommu'        - Use vfio with no iommu option. This requires custom
                         guest images that support this option. The default
                         vloop image does not support this driver.

NOTE: uio_pci_generic does not support SR-IOV testcases with guests attached, because uio_pci_generic only supports legacy interrupts. If uio_pci_generic is selected with QemuPciPassthrough as the VNF, it will be modified to use igb_uio_from_src instead.

NOTE: vfio_no_iommu requires a kernel version of 4.5 or greater and dpdk 16.04 or greater. Using this option will also taint the kernel.

Please refer to the dpdk documents at http://dpdk.org/doc/guides for more information on these drivers.

1.24. Guest Core and Thread Binding

VSPERF provides options to achieve better performance through guest core binding and guest vCPU thread binding. Core binding binds all the qemu threads. Thread binding binds the housekeeping threads to some CPUs and the vCPU threads to other CPUs; this helps to reduce the noise from the qemu housekeeping threads.

GUEST_CORE_BINDING = [('#EVAL(6+2*#VMINDEX)', '#EVAL(7+2*#VMINDEX)')]

NOTE: By default GUEST_THREAD_BINDING is None, which means the same as GUEST_CORE_BINDING, i.e. the vCPU threads share the physical CPUs with the housekeeping threads. Better performance using vCPU thread binding can be achieved by enabling the affinity in the custom configuration file.

For example, if an environment requires cores 32 and 33 for core binding and cores 29, 30 and 31 for guest thread binding to achieve better performance:

VNF_AFFINITIZATION_ON = True
GUEST_CORE_BINDING = [('32','33')]
GUEST_THREAD_BINDING = [('29', '30', '31')]
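The #EVAL macro in the default GUEST_CORE_BINDING shown earlier computes a distinct core pair per VM; a minimal sketch of that arithmetic, assuming #VMINDEX expands to the zero-based VM index:

```python
def guest_core_binding(vmindex):
    """Mirrors ('#EVAL(6+2*#VMINDEX)', '#EVAL(7+2*#VMINDEX)'):
    each VM is assigned the next consecutive pair of cores."""
    return (str(6 + 2 * vmindex), str(7 + 2 * vmindex))

# VM 0 gets cores ('6', '7'), VM 1 gets cores ('8', '9'), and so on.
```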
1.25. Qemu CPU features

QEMU defaults to a compatible subset of performance enhancing CPU features. To pass all available host processor features to the guest:

GUEST_CPU_OPTIONS = ['host,migratable=off']

NOTE: To enhance performance, CPU features such as the TSC deadline timer for the guest, the guest PMU, and the invariant TSC can be provided in the custom configuration file.

1.26. Multi-Queue Configuration

VSPerf currently supports multi-queue with the following limitations:

  1. Requires QEMU 2.5 or greater and any OVS version higher than 2.5. The default upstream package versions installed by VSPerf satisfy this requirement.

  2. Guest image must have ethtool utility installed if using l2fwd or linux bridge inside guest for loopback.

  3. If using OVS version 2.5.0 or less, enable old style multi-queue as shown in the '02_vswitch.conf' file.

    OVS_OLD_STYLE_MQ = True
    

To enable multi-queue for dpdk, modify the '02_vswitch.conf' file.

VSWITCH_DPDK_MULTI_QUEUES = 2

NOTE: You should consider using vSwitch affinity to set a PMD CPU mask that can optimize your performance. If applicable, consider the NUMA node of the NIC in use by checking /sys/class/net/<eth_name>/device/numa_node and setting an appropriate mask to create PMD threads on the same NUMA node.

When multi-queue is enabled, each dpdk or dpdkvhostuser port that is created on the switch will set the option for multiple queues. If old style multi-queue has been enabled, a global option for multi-queue will be used instead of the port by port option.

To enable multi-queue on the guest, modify the '04_vnf.conf' file.

GUEST_NIC_QUEUES = [2]

Enabling multi-queue at the guest will add multiple queues to each NIC port when qemu launches the guest.

In case of Vanilla OVS, multi-queue is enabled on the tuntap ports and NIC queues will be enabled inside the guest with ethtool. Simply enabling multi-queue on the guest is sufficient for Vanilla OVS multi-queue.

Testpmd should be configured to take advantage of multi-queue on the guest if using DPDKVhostUser. This can be done by modifying the '04_vnf.conf' file.

GUEST_TESTPMD_PARAMS = ['-l 0,1,2,3,4  -n 4 --socket-mem 512 -- '
                        '--burst=64 -i --txqflags=0xf00 '
                        '--nb-cores=4 --rxq=2 --txq=2 '
                        '--disable-hw-vlan']

NOTE: The guest SMP cores must be configured to allow for testpmd to use the optimal number of cores to take advantage of the multiple guest queues.

In case of using Vanilla OVS and qemu virtio-net, you can increase performance by binding vhost-net threads to CPUs. This can be done by enabling the affinity in the '04_vnf.conf' file. This can also be done for non multi-queue enabled configurations, as there will be 2 vhost-net threads.

VSWITCH_VHOST_NET_AFFINITIZATION = True

VSWITCH_VHOST_CPU_MAP = [4,5,8,11]

NOTE: This method of binding would require a custom script in a real environment.

NOTE: For optimal performance, guest SMPs and/or vhost-net threads should be on the same NUMA node as the NIC in use, if possible/applicable. Testpmd should be assigned at least (nb_cores + 1) total cores with the CPU mask.

1.27. Jumbo Frame Testing

VSPERF provides options to support jumbo frame testing with a jumbo frame supported NIC and traffic generator for the following vswitches:

  1. OVSVanilla
  2. OvsDpdkVhostUser
  3. TestPMD loopback with or without a guest

NOTE: Jumbo frames are not currently supported with SR-IOV or VPP.

All packet forwarding applications are supported for pxp testing.

To enable jumbo frame testing, simply enable the option in the conf files and set the maximum frame size that will be used.

VSWITCH_JUMBO_FRAMES_ENABLED = True
VSWITCH_JUMBO_FRAMES_SIZE = 9000

To enable jumbo frame testing with OVSVanilla, the NIC under test on the host must have its MTU size changed manually using ifconfig or applicable tools:

ifconfig eth1 mtu 9000 up

NOTE: To make the setting consistent across reboots you should reference the OS documents as it differs from distribution to distribution.

To start a test for jumbo frames modify the conf file packet sizes or pass the option through the VSPERF command line.

TEST_PARAMS = {'TRAFFICGEN_PKT_SIZES':(2000,9000)}
./vsperf --test-params "TRAFFICGEN_PKT_SIZES=2000,9000"
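
Both forms reduce to the same list of frame sizes. As a minimal sketch (illustrative only, not VSPERF's actual parser), a test-params entry of this shape could be interpreted as:

```python
# Illustrative only: turn a CLI test-params entry like
# "TRAFFICGEN_PKT_SIZES=2000,9000" into a parameter name and a tuple of ints.

def parse_pkt_sizes(param):
    name, _, value = param.partition('=')
    sizes = tuple(int(s) for s in value.split(','))
    return name, sizes

print(parse_pkt_sizes("TRAFFICGEN_PKT_SIZES=2000,9000"))
# -> ('TRAFFICGEN_PKT_SIZES', (2000, 9000))
```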

It is recommended to increase the memory size for OvsDpdkVhostUser testing from the default 1024. The required size may vary depending on the number of guests in your testing; 4096 appears to work well for most typical testing scenarios.

DPDK_SOCKET_MEM = ['4096', '0']

NOTE: For jumbo frames to work with DpdkVhostUser, mergeable buffers will be enabled by default. If testing with mergeable buffers in QEMU is desired, disable jumbo frames and only test non-jumbo frame sizes. Test jumbo frame sizes separately to avoid this collision.

1.28. Executing Packet Forwarding tests

To select the applications which will forward packets, the following parameters should be configured:

VSWITCH = 'none'
PKTFWD = 'TestPMD'

or use --vswitch and --fwdapp CLI arguments:

$ ./vsperf phy2phy_cont --conf-file user_settings.py \
           --vswitch none \
           --fwdapp TestPMD

Supported Packet Forwarding applications are:

'testpmd'       - testpmd from dpdk
  1. Update your ‘‘10_custom.conf’’ file to use the appropriate variables for selected Packet Forwarder:

    # testpmd configuration
    TESTPMD_ARGS = []
    # packet forwarding mode supported by testpmd; Please see DPDK documentation
    # for comprehensive list of modes supported by your version.
    # e.g. io|mac|mac_retry|macswap|flowgen|rxonly|txonly|csum|icmpecho|...
    # Note: Option "mac_retry" has been changed to "mac retry" since DPDK v16.07
    TESTPMD_FWD_MODE = 'csum'
    # checksum calculation layer: ip|udp|tcp|sctp|outer-ip
    TESTPMD_CSUM_LAYER = 'ip'
    # checksum calculation place: hw (hardware) | sw (software)
    TESTPMD_CSUM_CALC = 'sw'
    # recognize tunnel headers: on|off
    TESTPMD_CSUM_PARSE_TUNNEL = 'off'
    
  2. Run test:

    $ ./vsperf phy2phy_tput --conf-file <path_to_settings_py>
    
1.29. Executing Packet Forwarding tests with one guest

TestPMD with DPDK 16.11 or greater can be used to forward packets as a switch to a single guest using the TestPMD vdev option. To set this configuration, the following parameters should be used.

VSWITCH = 'none'
PKTFWD = 'TestPMD'

or use --vswitch and --fwdapp CLI arguments:

$ ./vsperf pvp_tput --conf-file user_settings.py \
           --vswitch none \
           --fwdapp TestPMD

Only TestPMD is supported as the guest forwarding application in this configuration.

GUEST_LOOPBACK = ['testpmd']

For optimal performance, one CPU per port plus one additional CPU should be used for TestPMD. Also set additional parameters for the packet forwarding application to use the correct number of nb-cores.

DPDK_SOCKET_MEM = ['1024', '0']
VSWITCHD_DPDK_ARGS = ['-l', '46,44,42,40,38', '-n', '4']
TESTPMD_ARGS = ['--nb-cores=4', '--txq=1', '--rxq=1']

For the guest TestPMD, 3 vCPUs should be assigned, with the following TestPMD params.

GUEST_TESTPMD_PARAMS = ['-l 0,1,2 -n 4 --socket-mem 1024 -- '
                        '--burst=64 -i --txqflags=0xf00 '
                        '--disable-hw-vlan --nb-cores=2 --txq=1 --rxq=1']

TestPMD can be executed with the following command line:

./vsperf pvp_tput --vswitch=none --fwdapp=TestPMD --conf-file <path_to_settings_py>

NOTE: To achieve the best 0% loss numbers with rfc2544 throughput testing, other tunings should be applied to host and guest such as tuned profiles and CPU tunings to prevent possible interrupts to worker threads.

1.30. VSPERF modes of operation

VSPERF can be run in different modes. By default it will configure the vSwitch, traffic generator and VNF. However, it can also be used just for configuration and execution of the traffic generator, or for execution of all components except the traffic generator itself.

Mode of operation is driven by the configuration parameter -m or --mode:

-m MODE, --mode MODE  vsperf mode of operation;
    Values:
        "normal" - execute vSwitch, VNF and traffic generator
        "trafficgen" - execute only traffic generator
        "trafficgen-off" - execute vSwitch and VNF
        "trafficgen-pause" - execute vSwitch and VNF but wait before traffic transmission

If VSPERF is executed in “trafficgen” mode, the configuration of the traffic generator can be modified through the TRAFFIC dictionary passed to the --test-params option. It is not necessary to specify all values of the TRAFFIC dictionary; it is sufficient to specify only the values which should be changed. A detailed description of the TRAFFIC dictionary can be found at Configuration of TRAFFIC dictionary.
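
The effect of passing a partial TRAFFIC dictionary can be pictured as a recursive merge into the defaults. The sketch below is illustrative only; the default values shown are assumptions, not VSPERF's actual defaults:

```python
# Illustrative sketch: a partial TRAFFIC dict overrides only the keys it
# mentions; everything else keeps its default value.

def merge_traffic(defaults, override):
    merged = dict(defaults)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_traffic(merged[key], value)
        else:
            merged[key] = value
    return merged

# Hypothetical defaults, partially overridden as in the trafficgen example.
defaults = {'traffic_type': 'rfc2544_throughput', 'bidir': 'True',
            'framerate': 100, 'l3': {'dstip': '90.90.90.90'}}
override = {'traffic_type': 'rfc2544_continuous', 'bidir': 'False',
            'framerate': 60}
print(merge_traffic(defaults, override))
```

Note that untouched nested values (here the 'l3' sub-dictionary) survive the merge.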

Example of execution of VSPERF in “trafficgen” mode:

$ ./vsperf -m trafficgen --trafficgen IxNet --conf-file vsperf.conf \
    --test-params "TRAFFIC={'traffic_type':'rfc2544_continuous','bidir':'False','framerate':60}"
1.31. Performance Matrix

The --matrix command line argument analyses and displays the performance of all the tests run. Using the metric specified by MATRIX_METRIC in the conf-file, the first test is set as the baseline and all the other tests are compared to it. The MATRIX_METRIC must always refer to a numeric value to enable comparison. A table with the test ID, metric value, change of the metric in %, test name and the test parameters used for each test is printed out as well as saved into the results directory.
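
The change column is simply the metric's percentage change relative to the baseline; a sketch of the arithmetic:

```python
# Illustrative: the matrix change column is the metric's percentage change
# relative to the baseline (test ID 0).

def percent_change(baseline, value):
    return round((value - baseline) / baseline * 100, 3)

baseline = 23749000.0                         # throughput_rx_fps of test ID 0
print(percent_change(baseline, 16850500.0))   # -29.048
```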

Example of 2 tests being compared using Performance Matrix:

$ ./vsperf --conf-file user_settings.py \
    --test-params "['TRAFFICGEN_PKT_SIZES=(64,)',"\
    "'TRAFFICGEN_PKT_SIZES=(128,)']" \
    phy2phy_cont phy2phy_cont --matrix

Example output:

+------+--------------+---------------------+----------+---------------------------------------+
|   ID | Name         |   throughput_rx_fps |   Change | Parameters, CUMULATIVE_PARAMS = False |
+======+==============+=====================+==========+=======================================+
|    0 | phy2phy_cont |        23749000.000 |        0 | 'TRAFFICGEN_PKT_SIZES': [64]          |
+------+--------------+---------------------+----------+---------------------------------------+
|    1 | phy2phy_cont |        16850500.000 |  -29.048 | 'TRAFFICGEN_PKT_SIZES': [128]         |
+------+--------------+---------------------+----------+---------------------------------------+
1.32. Code change verification by pylint

Every developer participating in the VSPERF project should run pylint before their Python code is submitted for review. Project-specific configuration for pylint is available in ‘pylintrc’.

Example of manual pylint invocation:

$ pylint --rcfile ./pylintrc ./vsperf
1.33. GOTCHAs:
1.33.1. Custom image fails to boot

Custom VM images may fail to boot within VSPerf pxp testing because of the drive boot and shared type, which could be caused by a missing SCSI driver inside the image. If you run into issues, you can try changing the drive boot type to ide.

GUEST_BOOT_DRIVE_TYPE = ['ide']
GUEST_SHARED_DRIVE_TYPE = ['ide']
1.33.2. OVS with DPDK and QEMU

If you encounter the following error during qemu initialization: “before (last 100 chars): ‘-path=/dev/hugepages,share=on: unable to map backing store for hugepages: Cannot allocate memory\r\n\r\n’”, check the amount of hugepages on your system:

$ cat /proc/meminfo | grep HugePages

By default the vswitchd is launched with 1GB of memory. To change this, modify the --socket-mem parameter in conf/02_vswitch.conf to allocate an appropriate amount of memory:

DPDK_SOCKET_MEM = ['1024', '0']
VSWITCHD_DPDK_ARGS = ['-c', '0x4', '-n', '4']
VSWITCHD_DPDK_CONFIG = {
    'dpdk-init' : 'true',
    'dpdk-lcore-mask' : '0x4',
    'dpdk-socket-mem' : '1024,0',
}

Note: The VSWITCHD_DPDK_ARGS option is used for vswitchd versions which support the --dpdk parameter. For recent vswitchd versions, the VSWITCHD_DPDK_CONFIG option will be used to configure vswitchd via ovs-vsctl calls.
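
For recent vswitchd versions, the VSWITCHD_DPDK_CONFIG entries map onto other_config keys in the Open_vSwitch table. A sketch of the resulting ovs-vsctl calls (illustrative, not the exact VSPERF code):

```python
# Illustrative: translate VSWITCHD_DPDK_CONFIG entries into the ovs-vsctl
# commands a recent vswitchd expects (other_config keys, Open_vSwitch table).

VSWITCHD_DPDK_CONFIG = {
    'dpdk-init': 'true',
    'dpdk-lcore-mask': '0x4',
    'dpdk-socket-mem': '1024,0',
}

def ovs_vsctl_commands(config):
    return ['ovs-vsctl set Open_vSwitch . other_config:{}={}'.format(k, v)
            for k, v in sorted(config.items())]

for cmd in ovs_vsctl_commands(VSWITCHD_DPDK_CONFIG):
    print(cmd)
```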

1.34. More information

For more information and details refer to the rest of the vSwitchPerf user documentation.

2. Step driven tests

In general, test scenarios are defined by a deployment used in the particular test case definition. The chosen deployment scenario will take care of the vSwitch configuration, deployment of VNFs and it can also affect configuration of a traffic generator. In order to allow a more flexible way of testcase scripting, VSPERF supports a detailed step driven testcase definition. It can be used to configure and program vSwitch, deploy and terminate VNFs, execute a traffic generator, modify a VSPERF configuration, execute external commands, etc.

Execution of step driven tests is done on a step by step work flow starting with step 0 as defined inside the test case. Each step of the test increments the step number by one which is indicated in the log.

(testcases.integration) - Step 0 'vswitch add_vport ['br0']' start

Test steps are defined as a list of steps within a TestSteps item of test case definition. Each step is a list with following structure:

'[' [ optional-alias ',' ] test-object ',' test-function [ ',' optional-function-params ] '],'

Step driven tests can be used for both performance and integration testing. In the case of an integration test, each step in the test case is validated. If a step does not pass validation, the test will fail and terminate. The test will continue until a failure is detected or all steps pass. A CSV report file is generated after a test completes, with an OK or FAIL result.
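
The step-by-step workflow with validation can be sketched as a small dispatcher (a simplified illustration with a dummy object, not VSPERF's actual runner):

```python
# Simplified sketch of step-driven execution. Real VSPERF objects (vswitch,
# vnf, trafficgen, ...) are stood in for by a dummy object with the named
# methods and a matching validate_ method.

class DummySwitch:
    def __init__(self):
        self.bridges = set()

    def add_switch(self, name):
        self.bridges.add(name)

    def validate_add_switch(self, result, name):
        return name in self.bridges

def run_steps(objects, steps, validate=True):
    results = []
    for step in steps:
        obj_name, func_name, *params = step
        obj = objects[obj_name]
        result = getattr(obj, func_name)(*params)   # call the step's function
        results.append(result)
        if validate:
            # integration runs consult validate_<function> when it exists
            validator = getattr(obj, 'validate_' + func_name, None)
            if validator and not validator(result, *params):
                raise AssertionError('step failed: {}'.format(step))
    return results

objects = {'vswitch': DummySwitch()}
run_steps(objects, [['vswitch', 'add_switch', 'int_br0']])
```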

NOTE: It is possible to suppress the validation process of a given step by prefixing it with ! (exclamation mark). In the following example, test execution won’t fail if all traffic is dropped:

['!trafficgen', 'send_traffic', {}]

In case of performance test, the validation of steps is not performed and standard output files with results from traffic generator and underlying OS details are generated by vsperf.

Step driven testcases can be used in two different ways:

  1. Description of a full testcase - in this case the clean deployment is used to indicate that vsperf should neither configure the vSwitch nor deploy any VNF. The test shall perform all required vSwitch configuration and programming and deploy the required number of VNFs.
  2. Modification of an existing deployment - in this case, any of the supported deployments can be used to perform initial vSwitch configuration and deployment of VNFs. Additional actions defined by TestSteps can be used to alter the vSwitch configuration or deploy additional VNFs. After the last step is processed, the test execution will continue with traffic execution.
2.1. Test objects and their functions

Every test step can call a function of one of the supported test objects. In general, any existing function of a supported test object can be called by a test step. If step validation is required (i.e. for integration test steps which are not suppressed), then an appropriate validate_ method must be implemented.

The list of supported objects and their most common functions is listed below. Please check implementation of test objects for full list of implemented functions and their parameters.

  • vswitch - provides functions for vSwitch configuration

    List of supported functions:

    • add_switch br_name - creates a new switch (bridge) with given br_name
    • del_switch br_name - deletes switch (bridge) with given br_name
    • add_phy_port br_name - adds a physical port into bridge specified by br_name
    • add_vport br_name - adds a virtual port into bridge specified by br_name
    • del_port br_name port_name - removes physical or virtual port specified by port_name from bridge br_name
    • add_flow br_name flow - adds flow specified by flow dictionary into the bridge br_name; Content of flow dictionary will be passed to the vSwitch. In case of Open vSwitch it will be passed to the ovs-ofctl add-flow command. Please see Open vSwitch documentation for the list of supported flow parameters.
    • del_flow br_name [flow] - deletes flow specified by flow dictionary from bridge br_name; In case that optional parameter flow is not specified or set to an empty dictionary {}, then all flows from bridge br_name will be deleted.
    • dump_flows br_name - dumps all flows from bridge specified by br_name
    • enable_stp br_name - enables Spanning Tree Protocol for bridge br_name
    • disable_stp br_name - disables Spanning Tree Protocol for bridge br_name
    • enable_rstp br_name - enables Rapid Spanning Tree Protocol for bridge br_name
    • disable_rstp br_name - disables Rapid Spanning Tree Protocol for bridge br_name
    • restart - restarts switch, which is useful for failover testcases

    Examples:

    ['vswitch', 'add_switch', 'int_br0']
    
    ['vswitch', 'del_switch', 'int_br0']
    
    ['vswitch', 'add_phy_port', 'int_br0']
    
    ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]']
    
    ['vswitch', 'add_flow', 'int_br0', {'in_port': '1', 'actions': ['output:2'],
     'idle_timeout': '0'}],
    
    ['vswitch', 'enable_rstp', 'int_br0']
    
  • vnf[ID] - provides functions for deployment and termination of VNFs; the optional alphanumeric ID is used for VNF identification in case the testcase deploys multiple VNFs.

    List of supported functions:

    • start - starts a VNF based on VSPERF configuration
    • stop - gracefully terminates given VNF
    • execute command [delay] - executes command inside the VNF; the optional delay defines the number of seconds to wait before the next step is executed. The method returns the command output as a string.
    • execute_and_wait command [timeout] [prompt] - executes command inside the VNF; the optional timeout defines the number of seconds to wait until prompt is detected. The optional prompt defines a string which is used to detect successful command execution. If prompt is not defined, the content of the GUEST_PROMPT_LOGIN parameter will be used. The method returns the command output as a string.

    Examples:

    ['vnf1', 'start'],
    ['vnf2', 'start'],
    ['vnf1', 'execute_and_wait', 'ifconfig eth0 5.5.5.1/24 up'],
    ['vnf2', 'execute_and_wait', 'ifconfig eth0 5.5.5.2/24 up', 120, 'root.*#'],
    ['vnf2', 'execute_and_wait', 'ping -c1 5.5.5.1'],
    ['vnf2', 'stop'],
    ['vnf1', 'stop'],
    
  • VNF[ID] - provides access to VNFs deployed automatically by the testcase deployment scenario. For example, the pvvp deployment automatically starts two VNFs before any TestStep is executed. It is possible to access these VNFs by the VNF0 and VNF1 labels.

    The list of supported functions is identical to the vnf[ID] object above, except for the functions start and stop.

    Examples:

    ['VNF0', 'execute_and_wait', 'ifconfig eth2 5.5.5.1/24 up'],
    ['VNF1', 'execute_and_wait', 'ifconfig eth2 5.5.5.2/24 up', 120, 'root.*#'],
    ['VNF2', 'execute_and_wait', 'ping -c1 5.5.5.1'],
    
  • trafficgen - triggers traffic generation

    List of supported functions:

    • send_traffic traffic - starts traffic based on the vsperf configuration and the given traffic dictionary. More details about the traffic dictionary and its possible values are available in the Traffic Generator Integration Guide
    • get_results - returns dictionary with results collected from previous execution of send_traffic

    Examples:

    ['trafficgen', 'send_traffic', {'traffic_type' : 'rfc2544_throughput'}]
    
    ['trafficgen', 'send_traffic', {'traffic_type' : 'rfc2544_back2back', 'bidir' : 'True'}],
    ['trafficgen', 'get_results'],
    ['tools', 'assert', '#STEP[-1][0]["frame_loss_percent"] < 0.05'],
    
  • settings - reads or modifies VSPERF configuration

    List of supported functions:

    • getValue param - returns value of given param
    • setValue param value - sets value of param to given value
    • resetValue param - if param was overridden by TEST_PARAMS (e.g. by “Parameters” section of the test case definition), then it will be set to its original value.

    Examples:

    ['settings', 'getValue', 'TOOLS']
    
    ['settings', 'setValue', 'GUEST_USERNAME', ['root']]
    
    ['settings', 'resetValue', 'WHITELIST_NICS'],
    

    It is possible and more convenient to access any VSPERF configuration option directly via the $NAME notation. Option evaluation is done during runtime and vsperf will automatically translate it to the appropriate settings.getValue call. If the referred parameter does not exist, vsperf will keep the $NAME string untouched and continue with testcase execution. The reason is to avoid test execution failure in case the $ sign has been used for a reason other than vsperf parameter evaluation.

    NOTE: It is recommended to use ${NAME} notation for any shell parameters used within Exec_Shell call to avoid a clash with configuration parameter evaluation.

    NOTE: It is possible to refer to a vsperf parameter value by the #PARAM() macro (see Overriding values defined in configuration files). However, the #PARAM() macro is evaluated at the beginning of vsperf execution and will not reflect any changes made to the vsperf configuration during runtime. On the other hand, the $NAME notation is evaluated during test execution and thus it reflects any modifications to the configuration parameter made by vsperf (e.g. TOOLS and NICS dictionaries) or by the testcase definition (e.g. TRAFFIC dictionary).

    Examples:

    ['tools', 'exec_shell', "$TOOLS['ovs-vsctl'] show"]
    
    ['settings', 'setValue', 'TRAFFICGEN_IXIA_PORT2', '$TRAFFICGEN_IXIA_PORT1'],
    
    ['vswitch', 'add_flow', 'int_br0',
     {'in_port': '#STEP[1][1]',
      'dl_type': '0x800',
      'nw_proto': '17',
      'nw_dst': '$TRAFFIC["l3"]["dstip"]/8',
      'actions': ['output:#STEP[2][1]']
     }
    ]
    
  • namespace - creates or modifies network namespaces

    List of supported functions:

    • create_namespace name - creates new namespace with given name
    • delete_namespace name - deletes namespace specified by its name
    • assign_port_to_namespace port name [port_up] - assigns NIC specified by port into given namespace name; If optional parameter port_up is set to True, then port will be brought up.
    • add_ip_to_namespace_eth port name addr cidr - assigns an IP address addr/cidr to the NIC specified by port within namespace name
    • reset_port_to_root port name - returns given port from namespace name back to the root namespace

    Examples:

    ['namespace', 'create_namespace', 'testns']
    
    ['namespace', 'assign_port_to_namespace', 'eth0', 'testns']
    
  • veth - manipulates eth and veth devices

    List of supported functions:

    • add_veth_port port peer_port - adds a pair of veth ports named port and peer_port
    • del_veth_port port peer_port - deletes a veth port pair specified by port and peer_port
    • bring_up_eth_port eth_port [namespace] - brings up eth_port in (optional) namespace

    Examples:

    ['veth', 'add_veth_port', 'veth', 'veth1']
    
    ['veth', 'bring_up_eth_port', 'eth1']
    
  • tools - provides a set of helper functions

    List of supported functions:

    • assert condition - evaluates the given condition and raises AssertionError if the condition is not True
    • eval expression - evaluates the given expression as python code and returns its result
    • exec_shell command - executes a shell command and waits until it finishes
    • exec_shell_background command - executes a shell command in the background; the command will be automatically terminated at the end of testcase execution.
    • exec_python code - executes python code

    Examples:

    ['tools', 'exec_shell', 'numactl -H', 'available: ([0-9]+)']
    ['tools', 'assert', '#STEP[-1][0]>1']
    
  • wait - is used for test case interruption. This object doesn’t have any functions. Once reached, vsperf will pause test execution and wait for the Enter key to be pressed. It can be used during testcase design for debugging purposes.

    Examples:

    ['wait']
    
  • sleep - is used to pause testcase execution for a defined number of seconds.

    Examples:

    ['sleep', '60']
    
  • log level message - is used to log a message of the given level to the vsperf output. Level is one of info, debug, warning or error.

    Examples:

    ['log', 'error', 'tools $TOOLS']
    
  • pdb - executes python debugger

    Examples:

    ['pdb']
    
2.2. Test Macros

Test profiles can include macros as part of a test step. Each step in the profile may return a value, such as a port name. Recall macros use #STEP to indicate the recalled value inside the return structure. If the method called by a test step returns a value, it can be recalled later, for example:

{
    "Name": "vswitch_add_del_vport",
    "Deployment": "clean",
    "Description": "vSwitch - add and delete virtual port",
    "TestSteps": [
            ['vswitch', 'add_switch', 'int_br0'],               # STEP 0
            ['vswitch', 'add_vport', 'int_br0'],                # STEP 1
            ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],  # STEP 2
            ['vswitch', 'del_switch', 'int_br0'],               # STEP 3
         ]
}

This test profile uses the vswitch add_vport method, which returns a string value of the port added. This is later used by the del_port method, using the name from step 1.

It is also possible to use negative indexes in step macros. In that case #STEP[-1] will refer to the result of the previous step, #STEP[-2] to the result of the step before that, etc. This means you could change STEP 2 from the previous example to achieve the same functionality:

['vswitch', 'del_port', 'int_br0', '#STEP[-1][0]'],  # STEP 2

Another option for referring to previous values is to define an alias for a given step via its first argument with a ‘#’ prefix. The alias must be unique and cannot be a number. Example of step alias usage:

['#port1', 'vswitch', 'add_vport', 'int_br0'],
['vswitch', 'del_port', 'int_br0', '#STEP[port1][0]'],
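
All three addressing forms (absolute index, negative index and alias) resolve against the same list of step results. A simplified resolver might look like this (illustrative only; VSPERF's actual macro handling differs in detail):

```python
import re

# Illustrative: resolve '#STEP[x][y]' macros against a list of step results.
# 'x' may be an absolute index, a negative index, or a step alias.

def resolve_step_macro(macro, results, aliases):
    match = re.fullmatch(r'#STEP\[(-?\w+)\]\[(\d+)\]', macro)
    index, item = match.group(1), int(match.group(2))
    if index.lstrip('-').isdigit():
        step = results[int(index)]          # absolute or negative index
    else:
        step = results[aliases[index]]      # alias maps to a step number
    return step[item]

results = [('int_br0',), ('dpdkvhostuser0', 1)]   # results of STEP 0 and STEP 1
aliases = {'port1': 1}
print(resolve_step_macro('#STEP[1][0]', results, aliases))      # dpdkvhostuser0
print(resolve_step_macro('#STEP[-1][0]', results, aliases))     # dpdkvhostuser0
print(resolve_step_macro('#STEP[port1][0]', results, aliases))  # dpdkvhostuser0
```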

Also commonly used steps can be created as a separate profile.

STEP_VSWITCH_PVP_INIT = [
    ['vswitch', 'add_switch', 'int_br0'],           # STEP 0
    ['vswitch', 'add_phy_port', 'int_br0'],         # STEP 1
    ['vswitch', 'add_phy_port', 'int_br0'],         # STEP 2
    ['vswitch', 'add_vport', 'int_br0'],            # STEP 3
    ['vswitch', 'add_vport', 'int_br0'],            # STEP 4
]

This profile can then be used inside other testcases

{
    "Name": "vswitch_pvp",
    "Deployment": "clean",
    "Description": "vSwitch - configure switch and one vnf",
    "TestSteps": STEP_VSWITCH_PVP_INIT +
                 [
                    ['vnf', 'start'],
                    ['vnf', 'stop'],
                 ] +
                 STEP_VSWITCH_PVP_FINIT
}

It is possible to refer to vsperf configuration parameters within step macros. Please see step-driven-tests-variable-usage for more details.

If a step returns a string or a list of strings, it is possible to filter the output by a regular expression. This optional filter can be specified as the last step parameter with the prefix ‘|’. The output will be split into separate lines and only matching records will be returned. It is also possible to return a specified group of characters from the matching lines, e.g. by the regex |ID (\d+).

Examples:

['tools', 'exec_shell', "sudo $TOOLS['ovs-appctl'] dpif-netdev/pmd-rxq-show",
 '|dpdkvhostuser0\s+queue-id: \d'],
['tools', 'assert', 'len(#STEP[-1])==1'],

['vnf', 'execute_and_wait', 'ethtool -L eth0 combined 2'],
['vnf', 'execute_and_wait', 'ethtool -l eth0', '|Combined:\s+2'],
['tools', 'assert', 'len(#STEP[-1])==2']
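
The filtering behaviour used in these examples can be sketched as follows (an illustrative helper, not the VSPERF implementation):

```python
import re

# Illustrative: apply a step's trailing '|regex' filter to string output.
# Output is split into lines; only matching lines are kept, and if the regex
# contains a capture group, the captured text is returned instead of the line.

def filter_output(output, regex):
    matched = []
    for line in output.splitlines():
        match = re.search(regex, line)
        if match:
            matched.append(match.group(1) if match.groups() else line)
    return matched

output = "pmd thread numa_id 0 core_id 4:\n  port: dpdkvhostuser0  queue-id: 0\n"
print(filter_output(output, r'dpdkvhostuser0\s+queue-id: \d'))
print(filter_output("available: 2 nodes (0-1)", r'available: ([0-9]+)'))
```
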
2.3. HelloWorld and other basic Testcases

The following examples are for demonstration purposes. You can run them by copying and pasting into the conf/integration/01_testcases.conf file. A command-line instruction is shown at the end of each example.

2.3.1. HelloWorld

The first example is a HelloWorld testcase. It simply creates a bridge with 2 physical ports, then sets up a flow to drop incoming packets from the port instantiated at STEP 1. There is no interaction with the traffic generator. Then the flow, the 2 ports and the bridge are deleted. The ‘add_phy_port’ method creates a ‘dpdk’ type interface that will manage the physical port. The string value returned is the port name that will be referred to by ‘del_port’ later on.

{
    "Name": "HelloWorld",
    "Description": "My first testcase",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['drop'], 'idle_timeout': '0'}],
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]

},

To run HelloWorld test:

./vsperf --conf-file user_settings.py --integration HelloWorld
2.3.2. Specify a Flow by the IP address

The next example shows how to explicitly set up a flow by specifying a destination IP address. All packets received on the port created at STEP 1 that have a destination IP address of 90.90.90.90 will be forwarded to the port created at STEP 2.

{
    "Name": "p2p_rule_l3da",
    "Description": "Phy2Phy with rule on L3 Dest Addr",
    "Deployment": "clean",
    "biDirectional": "False",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_dst': '90.90.90.90', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous'}],
        ['vswitch', 'dump_flows', 'int_br0'],   # STEP 5
        ['vswitch', 'del_flow', 'int_br0'],     # STEP 6 == del-flows
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration p2p_rule_l3da
2.3.3. Multistream feature

The next testcase uses the multistream feature. The traffic generator will send packets with different UDP ports. This is accomplished by using the “Stream Type” and “MultiStream” keywords. 4 different flows are set up to forward all incoming packets.

{
    "Name": "multistream_l4",
    "Description": "Multistream on UDP ports",
    "Deployment": "clean",
    "Parameters": {
        'TRAFFIC' : {
            "multistream": 4,
            "stream_type": "L4",
        },
    },
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '0', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '1', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '2', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '3', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Send mono-dir traffic
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
     ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration multistream_l4
2.3.4. PVP with a VM Replacement

This example launches a first VM in a PVP topology and then replaces it with another VM. When the VNF setup parameter in ./conf/04_vnf.conf is “QemuDpdkVhostUser”, the ‘add_vport’ method creates a ‘dpdkvhostuser’ type port to connect a VM.

{
    "Name": "ex_replace_vm",
    "Description": "PVP with VM replacement",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4

        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[2][1]', \
            'actions': ['output:#STEP[4][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[3][1]', \
            'actions': ['output:#STEP[1][1]'], 'idle_timeout': '0'}],

        # Start VM 1
        ['vnf1', 'start'],
        # Now we want to replace VM 1 with another VM
        ['vnf1', 'stop'],

        ['vswitch', 'add_vport', 'int_br0'],        # STEP 11    vm2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 12
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['output:#STEP[11][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[12][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],

        # Start VM 2
        ['vnf2', 'start'],
        ['vnf2', 'stop'],
        ['vswitch', 'dump_flows', 'int_br0'],

        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],    # vm1
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[11][0]'],   # vm2
        ['vswitch', 'del_port', 'int_br0', '#STEP[12][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration ex_replace_vm
2.3.5. VM with a Linux bridge

This example sets up a PVP topology and routes traffic to the VM based on the destination IP address. A command-line parameter is used to select a Linux bridge as the guest loopback application. It is also possible to select a guest loopback application via the GUEST_LOOPBACK configuration option.

{
    "Name": "ex_pvp_rule_l3da",
    "Description": "PVP with flow on L3 Dest Addr",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4
        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_dst': '90.90.90.90', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        # Each pkt from the VM is forwarded to the 2nd dpdk port
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Start VMs
        ['vnf1', 'start'],
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        ['vnf1', 'stop'],
        # Clean up
        ['vswitch', 'dump_flows', 'int_br0'],       # STEP 10
        ['vswitch', 'del_flow', 'int_br0'],         # STEP 11
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],  # vm1 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --test-params \
        "GUEST_LOOPBACK=['linux_bridge']" --integration ex_pvp_rule_l3da
2.3.6. Forward packets based on UDP port

This example launches two VMs connected in parallel. Incoming packets are forwarded to a specific VM depending on the destination UDP port.

{
    "Name": "ex_2pvp_rule_l4dp",
    "Description": "2 PVP with flows on L4 Dest Port",
    "Deployment": "clean",
    "Parameters": {
        'TRAFFIC' : {
            "multistream": 2,
            "stream_type": "L4",
        },
    },
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 5    vm2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 6
        # Setup Flows to reply ICMPv6 and similar packets, so to
        # avoid flooding internal port with their re-transmissions
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:01', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:02', \
            'actions': ['output:#STEP[4][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:03', \
            'actions': ['output:#STEP[5][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:04', \
            'actions': ['output:#STEP[6][1]'], 'idle_timeout': '0'}],
        # Forward UDP packets depending on dest port
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '0', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '1', \
            'actions': ['output:#STEP[5][1]'], 'idle_timeout': '0'}],
        # Send VM output to phy port #2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[6][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Start VMs
        ['vnf1', 'start'],                          # STEP 16
        ['vnf2', 'start'],                          # STEP 17
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        ['vnf1', 'stop'],
        ['vnf2', 'stop'],
        ['vswitch', 'dump_flows', 'int_br0'],
        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],  # vm1 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[5][0]'],  # vm2 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[6][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

The same test can be written in a shorter form using "Deployment": "pvpv".

To run the test:

./vsperf --conf-file user_settings.py --integration ex_2pvp_rule_l4dp
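
With "Deployment": "pvpv", VSPERF itself deploys the switch, the two parallel VMs and the flows, so the explicit TestSteps become unnecessary. A possible shorthand form is sketched below; this is an illustrative outline only, and the exact behaviour is defined by the pvpv deployment in VSPERF:

```python
{
    "Name": "ex_2pvp_rule_l4dp_short",
    "Description": "2 PVP with flows on L4 Dest Port (pvpv shorthand)",
    # 'pvpv' lets VSPERF create the bridge, ports, VMs and flows itself
    "Deployment": "pvpv",
    "Parameters": {
        'TRAFFIC' : {
            "multistream": 2,
            "stream_type": "L4",
        },
    },
},
```
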
2.3.7. Modification of existing PVVP deployment

This is an example of modifying a standard deployment scenario with additional TestSteps. The standard PVVP scenario is used to configure a vSwitch and to deploy two VNFs connected in series. Additional TestSteps deploy a third VNF and connect it in parallel to the already configured VNFs. The traffic generator is instructed (via the Multistream feature) to send two separate traffic streams: one stream is sent to the standalone VNF and the second to the two chained VNFs.

If the test is defined as a performance test, traffic results will be collected and made available in both CSV and RST report files.

{
    "Name": "pvvp_pvp_cont",
    "Deployment": "pvvp",
    "Description": "PVVP and PVP in parallel with Continuous Stream",
    "Parameters" : {
        "TRAFFIC" : {
            "traffic_type" : "rfc2544_continuous",
            "multistream": 2,
        },
    },
    "TestSteps": [
                    ['vswitch', 'add_vport', '$VSWITCH_BRIDGE_NAME'],
                    ['vswitch', 'add_vport', '$VSWITCH_BRIDGE_NAME'],
                    # priority must be higher than default 32768, otherwise flows won't match
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME',
                     {'in_port': '1', 'actions': ['output:#STEP[-2][1]'], 'idle_timeout': '0', 'dl_type':'0x0800',
                                                  'nw_proto':'17', 'tp_dst':'0', 'priority': '33000'}],
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME',
                     {'in_port': '2', 'actions': ['output:#STEP[-2][1]'], 'idle_timeout': '0', 'dl_type':'0x0800',
                                                  'nw_proto':'17', 'tp_dst':'0', 'priority': '33000'}],
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME', {'in_port': '#STEP[-4][1]', 'actions': ['output:1'],
                                                    'idle_timeout': '0'}],
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME', {'in_port': '#STEP[-4][1]', 'actions': ['output:2'],
                                                    'idle_timeout': '0'}],
                    ['vswitch', 'dump_flows', '$VSWITCH_BRIDGE_NAME'],
                    ['vnf1', 'start'],
                 ]
},

To run the test:

./vsperf --conf-file user_settings.py pvvp_pvp_cont
3. Integration tests

VSPERF includes a set of integration tests defined in conf/integration. These tests can be run by specifying --integration as a parameter to vsperf. Current tests in conf/integration include switch functionality and overlay tests.

Tests in conf/integration can be used to test the scaling of different switch configurations by adding steps to the test case.

For the overlay tests, VSPERF supports the VXLAN, GRE and GENEVE tunneling protocols. Testing of these protocols is limited to unidirectional traffic and P2P (Physical to Physical) scenarios.

NOTE: The configuration for overlay tests provided in this guide is for unidirectional traffic only.

NOTE: The overlay tests require an IxNet traffic generator. The tunneled traffic is configured by the ixnetrfc2544v2.tcl script. This script can be used with all supported deployment scenarios to generate frames with the VXLAN, GRE or GENEVE protocols. In that case, the "Tunnel Operation" and "TRAFFICGEN_IXNET_TCL_SCRIPT" options must be properly configured in the testcase definition.

3.1. Executing Integration Tests

To execute integration tests, VSPERF is run with the --integration parameter. To view the current test list, simply execute the following command:

./vsperf --integration --list

The standard tests included are defined inside the conf/integration/01_testcases.conf file.

3.2. Executing Tunnel encapsulation tests

The VXLAN OVS DPDK encapsulation tests require IPs, MAC addresses, bridge names and WHITELIST_NICS for DPDK.

NOTE: Only Ixia traffic generators currently support the execution of the tunnel encapsulation tests. Support for other traffic generators may come in a future release.

Default values are already provided. To customize for your environment, override the following variables in your user_settings.py file:

# Variables defined in conf/integration/02_vswitch.conf
# Tunnel endpoint for Overlay P2P deployment scenario
# used for br0
VTEP_IP1 = '192.168.0.1/24'

# Used as remote_ip when adding the OVS tunnel port and
# to set the ARP entry in OVS (e.g. tnl/arp/set br-ext 192.168.240.10 02:00:00:00:00:02)
VTEP_IP2 = '192.168.240.10'

# Network to use when adding a route for inner frame data
VTEP_IP2_SUBNET = '192.168.240.0/24'

# Bridge names
TUNNEL_INTEGRATION_BRIDGE = 'vsperf-br0'
TUNNEL_EXTERNAL_BRIDGE = 'vsperf-br-ext'

# IP of br-ext
TUNNEL_EXTERNAL_BRIDGE_IP = '192.168.240.1/24'

# vxlan|gre|geneve
TUNNEL_TYPE = 'vxlan'

# Variables defined conf/integration/03_traffic.conf
# For OP2P deployment scenario
TRAFFICGEN_PORT1_MAC = '02:00:00:00:00:01'
TRAFFICGEN_PORT2_MAC = '02:00:00:00:00:02'
TRAFFICGEN_PORT1_IP = '1.1.1.1'
TRAFFICGEN_PORT2_IP = '192.168.240.10'

To run VXLAN encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=vxlan' overlay_p2p_tput

To run GRE encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=gre' overlay_p2p_tput

To run GENEVE encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=geneve' overlay_p2p_tput

To run OVS NATIVE tunnel tests (VXLAN/GRE/GENEVE):

  1. Install the OVS kernel modules
cd src/ovs/ovs
sudo -E make modules_install
  2. Set the following variables:
VSWITCH = 'OvsVanilla'
# Specify vport_* kernel module to test.
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_vxlan',
    'vport_gre',
    'vport_geneve',
    'datapath/linux/openvswitch.ko',
]

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  3. Run tests:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=vxlan' overlay_p2p_tput
3.3. Executing VXLAN decapsulation tests

To run VXLAN decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Run test:
./vsperf --conf-file user_settings.py --integration overlay_p2p_decap_cont

If you want to use different values for your VXLAN frame, you may set:

VXLAN_FRAME_L3 = {'proto': 'udp',
                  'packetsize': 64,
                  'srcip': TRAFFICGEN_PORT1_IP,
                  'dstip': '192.168.240.1',
                 }
VXLAN_FRAME_L4 = {'srcport': 4789,
                  'dstport': 4789,
                  'vni': VXLAN_VNI,
                  'inner_srcmac': '01:02:03:04:05:06',
                  'inner_dstmac': '06:05:04:03:02:01',
                  'inner_srcip': '192.168.0.10',
                  'inner_dstip': '192.168.240.9',
                  'inner_proto': 'udp',
                  'inner_srcport': 3000,
                  'inner_dstport': 3001,
                 }
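
The srcport/dstport value 4789 above is the IANA-assigned VXLAN UDP port, and vni selects the 24-bit VXLAN Network Identifier carried in the 8-byte VXLAN header defined by RFC 7348. As an illustration of the header layout (not VSPERF code):

```python
import struct

def vxlan_header(vni):
    """Build the 8-byte VXLAN header per RFC 7348: flags byte 0x08
    (VNI-valid bit), 24 reserved bits, 24-bit VNI, 8 reserved bits."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # '!B3xI' = flags byte, 3 zero bytes, then the VNI shifted into
    # the top 24 bits of a big-endian 32-bit word
    return struct.pack('!B3xI', 0x08, vni << 8)

print(vxlan_header(99).hex())  # 0800000000006300
```
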
3.4. Executing GRE decapsulation tests

To run GRE decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Run test:
./vsperf --conf-file user_settings.py --test-params 'TUNNEL_TYPE=gre' \
         --integration overlay_p2p_decap_cont

If you want to use different values for your GRE frame, you may set:

GRE_FRAME_L3 = {'proto': 'gre',
                'packetsize': 64,
                'srcip': TRAFFICGEN_PORT1_IP,
                'dstip': '192.168.240.1',
               }

GRE_FRAME_L4 = {'srcport': 0,
                'dstport': 0,
                'inner_srcmac': '01:02:03:04:05:06',
                'inner_dstmac': '06:05:04:03:02:01',
                'inner_srcip': '192.168.0.10',
                'inner_dstip': '192.168.240.9',
                'inner_proto': 'udp',
                'inner_srcport': 3000,
                'inner_dstport': 3001,
               }
3.5. Executing GENEVE decapsulation tests

IxNet 7.3X does not natively support the GENEVE protocol. The GeneveIxNetTemplate.xml_ClearText.xml template must be imported into IxNetwork for this testcase to work.

To import the template:

  1. Run the IxNetwork TCL Server
  2. Click on the Traffic menu
  3. Click on the Traffic actions and click Edit Packet Templates
  4. In the Template editor window, click Import. Select the template located at 3rd_party/ixia/GeneveIxNetTemplate.xml_ClearText.xml and click Import.
  5. Restart the TCL Server.

To run GENEVE decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Run test:
./vsperf --conf-file user_settings.py --test-params 'tunnel_type=geneve' \
         --integration overlay_p2p_decap_cont

If you want to use different values for your GENEVE frame, you may set:

GENEVE_FRAME_L3 = {'proto': 'udp',
                   'packetsize': 64,
                   'srcip': TRAFFICGEN_PORT1_IP,
                   'dstip': '192.168.240.1',
                  }

GENEVE_FRAME_L4 = {'srcport': 6081,
                   'dstport': 6081,
                   'geneve_vni': 0,
                   'inner_srcmac': '01:02:03:04:05:06',
                   'inner_dstmac': '06:05:04:03:02:01',
                   'inner_srcip': '192.168.0.10',
                   'inner_dstip': '192.168.240.9',
                   'inner_proto': 'udp',
                   'inner_srcport': 3000,
                   'inner_dstport': 3001,
                  }
3.6. Executing Native/Vanilla OVS VXLAN decapsulation tests

To run VXLAN decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_vxlan',
    'datapath/linux/openvswitch.ko',
]

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

VXLAN_FRAME_L2 = {'srcmac':
                  '01:02:03:04:05:06',
                  'dstmac':
                  '06:05:04:03:02:01',
                 }

VXLAN_FRAME_L3 = {'proto': 'udp',
                  'packetsize': 64,
                  'srcip': TRAFFICGEN_PORT1_IP,
                  'dstip': '172.16.1.1',
                 }

VXLAN_FRAME_L4 = {
                  'srcport': 4789,
                  'dstport': 4789,
                  'protocolpad': 'true',
                  'vni': 99,
                  'inner_srcmac': '01:02:03:04:05:06',
                  'inner_dstmac': '06:05:04:03:02:01',
                  'inner_srcip': '192.168.1.2',
                  'inner_dstip': TRAFFICGEN_PORT2_IP,
                  'inner_proto': 'udp',
                  'inner_srcport': 3000,
                  'inner_dstport': 3001,
                 }

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=vxlan' overlay_p2p_decap_cont
3.7. Executing Native/Vanilla OVS GRE decapsulation tests

To run GRE decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_gre',
    'datapath/linux/openvswitch.ko',
]

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

GRE_FRAME_L2 = {'srcmac':
                '01:02:03:04:05:06',
                'dstmac':
                '06:05:04:03:02:01',
               }

GRE_FRAME_L3 = {'proto': 'udp',
                'packetsize': 64,
                'srcip': TRAFFICGEN_PORT1_IP,
                'dstip': '172.16.1.1',
               }

GRE_FRAME_L4 = {
                'srcport': 4789,
                'dstport': 4789,
                'protocolpad': 'true',
                'inner_srcmac': '01:02:03:04:05:06',
                'inner_dstmac': '06:05:04:03:02:01',
                'inner_srcip': '192.168.1.2',
                'inner_dstip': TRAFFICGEN_PORT2_IP,
                'inner_proto': 'udp',
                'inner_srcport': 3000,
                'inner_dstport': 3001,
               }

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=gre' overlay_p2p_decap_cont
3.8. Executing Native/Vanilla OVS GENEVE decapsulation tests

To run GENEVE decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_geneve',
    'datapath/linux/openvswitch.ko',
]

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

GENEVE_FRAME_L2 = {'srcmac':
                   '01:02:03:04:05:06',
                   'dstmac':
                   '06:05:04:03:02:01',
                  }

GENEVE_FRAME_L3 = {'proto': 'udp',
                   'packetsize': 64,
                   'srcip': TRAFFICGEN_PORT1_IP,
                   'dstip': '172.16.1.1',
                  }

GENEVE_FRAME_L4 = {'srcport': 6081,
                   'dstport': 6081,
                   'protocolpad': 'true',
                   'geneve_vni': 0,
                   'inner_srcmac': '01:02:03:04:05:06',
                   'inner_dstmac': '06:05:04:03:02:01',
                   'inner_srcip': '192.168.1.2',
                   'inner_dstip': TRAFFICGEN_PORT2_IP,
                   'inner_proto': 'udp',
                   'inner_srcport': 3000,
                   'inner_dstport': 3001,
                  }

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=geneve' overlay_p2p_decap_cont
3.9. Executing Tunnel encapsulation+decapsulation tests

The OVS DPDK encapsulation/decapsulation tests require IPs, MAC addresses, bridge names and WHITELIST_NICS for DPDK.

Unlike the test cases above, these test cases exercise tunnel encapsulation and decapsulation without requiring any ingress overlay traffic. To achieve this, OVS is configured to perform encapsulation and decapsulation in series on the same traffic stream, as shown below.

TRAFFIC-IN --> [ENCAP] --> [MOD-PKT] --> [DECAP] --> TRAFFIC-OUT

Default values are already provided. To customize for your environment, override the following variables in your user_settings.py file:

# Variables defined in conf/integration/02_vswitch.conf

# Bridge names
TUNNEL_EXTERNAL_BRIDGE1 = 'br-phy1'
TUNNEL_EXTERNAL_BRIDGE2 = 'br-phy2'
TUNNEL_MODIFY_BRIDGE1 = 'br-mod1'
TUNNEL_MODIFY_BRIDGE2 = 'br-mod2'

# IP of br-mod1
TUNNEL_MODIFY_BRIDGE_IP1 = '10.0.0.1/24'

# MAC of br-mod1
TUNNEL_MODIFY_BRIDGE_MAC1 = '00:00:10:00:00:01'

# IP of br-mod2
TUNNEL_MODIFY_BRIDGE_IP2 = '20.0.0.1/24'

# MAC of br-mod2
TUNNEL_MODIFY_BRIDGE_MAC2 = '00:00:20:00:00:01'

# vxlan|gre|geneve; only VXLAN is supported for now.
TUNNEL_TYPE = 'vxlan'

To run VXLAN encapsulation+decapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         overlay_p2p_mod_tput
4. Execution of vswitchperf testcases by Yardstick
4.1. General

Yardstick is a generic framework for test execution, used to validate the installation of the OPNFV platform. In the future, Yardstick will support two options for vswitchperf testcase execution:

  • plugin mode, which will execute native vswitchperf testcases; tests will be executed natively by vsperf, and test results will be processed and reported by yardstick.
  • traffic generator mode, which will run vswitchperf in trafficgen mode only; the Yardstick framework will be used to launch VNFs and to configure flows to ensure that traffic is properly routed. This mode will allow testing OVS performance in real-world scenarios.

In the Colorado release, only the traffic generator mode is supported.

4.2. Yardstick Installation

In order to run Yardstick testcases, you will need to prepare your test environment. Please follow the installation instructions to install yardstick.

Please note that yardstick uses OpenStack for the execution of testcases. OpenStack must be installed with the Heat and Neutron services; otherwise vswitchperf testcases cannot be executed.

4.3. VM image with vswitchperf

A special VM image is required for the execution of vswitchperf specific testcases by yardstick. It is possible to use a sample VM image available at the OPNFV artifactory or to build a customized image.

4.3.1. Sample VM image with vswitchperf

A sample VM image is available for free download from the vswitchperf section of the OPNFV artifactory:

$ wget http://artifacts.opnfv.org/vswitchperf/vnf/vsperf-yardstick-image.qcow2

This image can be used for the execution of sample testcases with a dummy traffic generator.

NOTE: Traffic generators might require the installation of client software. This software is not included in the sample image and must be installed by the user.

NOTE: This image will be updated only if new features related to yardstick integration are added to vswitchperf.

4.3.2. Preparation of custom VM image

In general, any Linux distribution supported by vswitchperf can be used as a base image. One possibility is to modify the vloop-vnf image, which can be downloaded from http://artifacts.opnfv.org/vswitchperf.html/ (see vloop-vnf).

Please follow the Installing vswitchperf instructions to install vswitchperf inside the vloop-vnf image. As vswitchperf will be run in trafficgen mode, it is possible to skip the installation and compilation of OVS, QEMU and DPDK to keep the image size smaller.

If the selected traffic generator requires the installation of additional client software, please follow the appropriate documentation. For example, in the case of IXIA, you would need to install IxOS and the IxNetwork TCL API.

4.3.3. VM image usage

The image with vswitchperf must be uploaded into the glance service and a vswitchperf specific flavor configured, e.g.:

$ glance --os-username admin --os-image-api-version 1 image-create --name \
  vsperf --is-public true --disk-format qcow2 --container-format bare --file \
  vsperf-yardstick-image.qcow2

$ nova --os-username admin flavor-create vsperf-flavor 100 2048 25 1
4.4. Testcase execution

After installation, yardstick is available as a python package within the yardstick specific virtual environment. This means that the yardstick environment must be enabled before test execution, e.g.:

source ~/yardstick_venv/bin/activate

The next step is the configuration of the OpenStack environment, e.g. in the case of devstack:

source /opt/openstack/devstack/openrc
export EXTERNAL_NETWORK=public

Vswitchperf testcases executable by yardstick are located in the vswitchperf repository inside the yardstick/tests directory. An example of their download and execution follows:

git clone https://gerrit.opnfv.org/gerrit/vswitchperf
cd vswitchperf

yardstick -d task start yardstick/tests/rfc2544_throughput_dummy.yaml

NOTE: Optional argument -d shows debug output.

4.5. Testcase customization

Yardstick testcases are described by YAML files. vswitchperf specific testcases are part of the vswitchperf repository and their yaml files can be found in the yardstick/tests directory. For a detailed description of the yaml file structure, please see the yardstick documentation and testcase samples. Only vswitchperf specific parts are discussed here.

Example of yaml file:

...
scenarios:
-
  type: Vsperf
  options:
    testname: 'p2p_rfc2544_throughput'
    trafficgen_port1: 'eth1'
    trafficgen_port2: 'eth3'
    external_bridge: 'br-ex'
    test_params: 'TRAFFICGEN_DURATION=30;TRAFFIC={"traffic_type":"rfc2544_throughput"}'
    conf_file: '~/vsperf-yardstick.conf'

  host: vsperf.demo

  runner:
    type: Sequence
    scenario_option_name: frame_size
    sequence:
    - 64
    - 128
    - 512
    - 1024
    - 1518
  sla:
    metrics: 'throughput_rx_fps'
    throughput_rx_fps: 500000
    action: monitor

context:
...
4.5.1. Section option

Section option defines the details of a vswitchperf test scenario. Many of the options are identical to the vswitchperf parameters passed through the --test-params argument. The following options are supported:

  • frame_size - a packet size for which the test should be executed; multiple packet sizes can be tested by modifying the Sequence runner section inside the YAML definition. Default: '64'
  • conf_file - sets the path to the vswitchperf configuration file, which will be uploaded to the VM; Default: '~/vsperf-yardstick.conf'
  • setup_script - sets the path to the setup script, which will be executed during the setup and teardown phases
  • trafficgen_port1 - specifies the device name of the 1st interface connected to the trafficgen
  • trafficgen_port2 - specifies the device name of the 2nd interface connected to the trafficgen
  • external_bridge - specifies the name of the external bridge configured in OVS; Default: 'br-ex'
  • test_params - specifies a string with a list of vsperf configuration parameters, which will be passed to the --test-params CLI argument; parameters should be stated in the form param=value and separated by semicolons. Configuration of the traffic generator is driven by the TRAFFIC dictionary, which can also be updated by values defined in test_params. Please check the VSPERF documentation for details about available configuration parameters and their data types. If both test_params and conf_file are specified, values from test_params will override values defined in the configuration file.
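
The param=value;param=value convention used by test_params can be pictured with a small parser sketch (a hypothetical helper, not the actual yardstick or VSPERF implementation; note it splits naively on ';', so dictionary values that themselves contain ';' would need real parsing):

```python
def parse_test_params(params):
    """Split a 'key=value;key=value' string into a dict. Values stay
    as strings; VSPERF itself evaluates them into their real types
    (ints, dicts such as TRAFFIC, ...)."""
    result = {}
    for item in params.split(';'):
        item = item.strip()
        if not item:
            continue
        # partition on the first '=' so values may contain '='
        key, _, value = item.partition('=')
        result[key.strip()] = value.strip()
    return result

print(parse_test_params('TRAFFICGEN_DURATION=30;EXTERNAL_BRIDGE=br-ex'))
```
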

If trafficgen_port1 and/or trafficgen_port2 are defined, these interfaces will be inserted into the external_bridge of OVS. It is expected that OVS runs on the same node where the testcase is executed. For more complex OpenStack installations, or when additional OVS configuration is needed, setup_script can be used.

NOTE: It is essential to specify a configuration for the selected traffic generator. If a standalone testcase is created, the traffic generator can be selected and configured directly in the YAML file via test_params. On the other hand, if multiple testcases should be executed with the same traffic generator settings, a customized configuration file should be prepared and its name passed via the conf_file option.

4.5.2. Section runner

Yardstick supports several runner types. For vswitchperf specific testcases, the Sequence runner type can be used to execute the testcase for a given list of frame sizes.

4.5.3. Section sla

If the sla section is not defined, the testcase will always be considered successful. On the other hand, it is possible to define a set of test metrics and their minimal values to evaluate test success. Any numeric value reported by vswitchperf inside the CSV result file can be used. Multiple metrics can be defined as a comma-separated list of items. A minimal value must be set separately for each metric.

e.g.:

sla:
    metrics: 'throughput_rx_fps,throughput_rx_mbps'
    throughput_rx_fps: 500000
    throughput_rx_mbps: 1000

If any of the defined metrics is lower than its defined minimal value, the testcase will be marked as failed. Based on the action policy, yardstick will either stop test execution (value assert) or run the next test (value monitor).
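
The pass/fail logic described above can be sketched as follows (an illustrative model, not yardstick's actual code):

```python
def evaluate_sla(sla, measured):
    """Return (passed, failures): every metric named in sla['metrics']
    must reach at least the minimal value configured for it."""
    failures = {}
    for metric in (m.strip() for m in sla['metrics'].split(',')):
        value = measured.get(metric, 0)
        if value < sla[metric]:
            failures[metric] = (value, sla[metric])
    return (not failures, failures)

sla = {'metrics': 'throughput_rx_fps,throughput_rx_mbps',
       'throughput_rx_fps': 500000,
       'throughput_rx_mbps': 1000}
# mbps result below its 1000 minimum -> testcase marked as failed
print(evaluate_sla(sla, {'throughput_rx_fps': 620000,
                         'throughput_rx_mbps': 950}))
```
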

NOTE: The throughput SLA (or any other SLA) cannot be set to a meaningful value without knowledge of the server and networking environment, possibly including prior testing in that environment to establish a baseline SLA level under well-understood circumstances.

5. List of vswitchperf testcases
5.1. Performance testcases
Testcase Name Description
phy2phy_tput LTD.Throughput.RFC2544.PacketLossRatio
phy2phy_forwarding LTD.Forwarding.RFC2889.MaxForwardingRate
phy2phy_learning LTD.AddrLearning.RFC2889.AddrLearningRate
phy2phy_caching LTD.AddrCaching.RFC2889.AddrCachingCapacity
back2back LTD.Throughput.RFC2544.BackToBackFrames
phy2phy_tput_mod_vlan LTD.Throughput.RFC2544.PacketLossRatioFrameModification
phy2phy_cont Phy2Phy Continuous Stream
pvp_cont PVP Continuous Stream
pvvp_cont PVVP Continuous Stream
pvpv_cont Two VMs in parallel with Continuous Stream
phy2phy_scalability LTD.Scalability.Flows.RFC2544.0PacketLoss
pvp_tput LTD.Throughput.RFC2544.PacketLossRatio
pvp_back2back LTD.Throughput.RFC2544.BackToBackFrames
pvvp_tput LTD.Throughput.RFC2544.PacketLossRatio
pvvp_back2back LTD.Throughput.RFC2544.BackToBackFrames
phy2phy_cpu_load LTD.CPU.RFC2544.0PacketLoss
phy2phy_mem_load LTD.Memory.RFC2544.0PacketLoss
phy2phy_tput_vpp VPP: LTD.Throughput.RFC2544.PacketLossRatio
phy2phy_cont_vpp VPP: Phy2Phy Continuous Stream
phy2phy_back2back_vpp VPP: LTD.Throughput.RFC2544.BackToBackFrames
pvp_tput_vpp VPP: LTD.Throughput.RFC2544.PacketLossRatio
pvp_cont_vpp VPP: PVP Continuous Stream
pvp_back2back_vpp VPP: LTD.Throughput.RFC2544.BackToBackFrames
pvvp_tput_vpp VPP: LTD.Throughput.RFC2544.PacketLossRatio
pvvp_cont_vpp VPP: PVVP Continuous Stream
pvvp_back2back_vpp VPP: LTD.Throughput.RFC2544.BackToBackFrames

The list of performance testcases above can be obtained by executing:

$ ./vsperf --list
5.2. Integration testcases
Testcase Name Description
vswitch_vports_add_del_flow vSwitch - configure switch with vports, add and delete flow
vswitch_add_del_flows vSwitch - add and delete flows
vswitch_p2p_tput vSwitch - configure switch and execute RFC2544 throughput test
vswitch_p2p_back2back vSwitch - configure switch and execute RFC2544 back2back test
vswitch_p2p_cont vSwitch - configure switch and execute RFC2544 continuous stream test
vswitch_pvp vSwitch - configure switch and one vnf
vswitch_vports_pvp vSwitch - configure switch with vports and one vnf
vswitch_pvp_tput vSwitch - configure switch, vnf and execute RFC2544 throughput test
vswitch_pvp_back2back vSwitch - configure switch, vnf and execute RFC2544 back2back test
vswitch_pvp_cont vSwitch - configure switch, vnf and execute RFC2544 continuous stream test
vswitch_pvp_all vSwitch - configure switch, vnf and execute all test types
vswitch_pvvp vSwitch - configure switch and two vnfs
vswitch_pvvp_tput vSwitch - configure switch, two chained vnfs and execute RFC2544 throughput test
vswitch_pvvp_back2back vSwitch - configure switch, two chained vnfs and execute RFC2544 back2back test
vswitch_pvvp_cont vSwitch - configure switch, two chained vnfs and execute RFC2544 continuous stream test
vswitch_pvvp_all vSwitch - configure switch, two chained vnfs and execute all test types
vswitch_p4vp_tput 4 chained vnfs, execute RFC2544 throughput test, deployment pvvp4
vswitch_p4vp_back2back 4 chained vnfs, execute RFC2544 back2back test, deployment pvvp4
vswitch_p4vp_cont 4 chained vnfs, execute RFC2544 continuous stream test, deployment pvvp4
vswitch_p4vp_all 4 chained vnfs, execute all test types, deployment pvvp4
2pvp_udp_dest_flows RFC2544 Continuous TC with 2 Parallel VMs, flows on UDP Dest Port, deployment pvpv2
4pvp_udp_dest_flows RFC2544 Continuous TC with 4 Parallel VMs, flows on UDP Dest Port, deployment pvpv4
6pvp_udp_dest_flows RFC2544 Continuous TC with 6 Parallel VMs, flows on UDP Dest Port, deployment pvpv6
vhost_numa_awareness vSwitch DPDK - verify that PMD threads are served by the same NUMA slot as QEMU instances
ixnet_pvp_tput_1nic PVP Scenario with 1 port towards IXIA
vswitch_vports_add_del_connection_vpp VPP: vSwitch - configure switch with vports, add and delete connection
p2p_l3_multi_IP_ovs OVS: P2P L3 multistream with unique flow for each IP stream
p2p_l3_multi_IP_mask_ovs OVS: P2P L3 multistream with 1 flow for /8 net mask
pvp_l3_multi_IP_mask_ovs OVS: PVP L3 multistream with 1 flow for /8 net mask
pvvp_l3_multi_IP_mask_ovs OVS: PVVP L3 multistream with 1 flow for /8 net mask
p2p_l4_multi_PORT_ovs OVS: P2P L4 multistream with unique flow for each IP stream
p2p_l4_multi_PORT_mask_ovs OVS: P2P L4 multistream with 1 flow for /8 net and port mask
pvp_l4_multi_PORT_mask_ovs OVS: PVP L4 multistream flows for /8 net and port mask
pvvp_l4_multi_PORT_mask_ovs OVS: PVVP L4 multistream with flows for /8 net and port mask
p2p_l3_multi_IP_arp_vpp VPP: P2P L3 multistream with unique ARP entry for each IP stream
p2p_l3_multi_IP_mask_vpp VPP: P2P L3 multistream with 1 route for /8 net mask
p2p_l3_multi_IP_routes_vpp VPP: P2P L3 multistream with unique route for each IP stream
pvp_l3_multi_IP_mask_vpp VPP: PVP L3 multistream with route for /8 netmask
pvvp_l3_multi_IP_mask_vpp VPP: PVVP L3 multistream with route for /8 netmask
p2p_l4_multi_PORT_arp_vpp VPP: P2P L4 multistream with unique ARP entry for each IP stream and port check
p2p_l4_multi_PORT_mask_vpp VPP: P2P L4 multistream with 1 route for /8 net mask and port check
p2p_l4_multi_PORT_routes_vpp VPP: P2P L4 multistream with unique route for each IP stream and port check
pvp_l4_multi_PORT_mask_vpp VPP: PVP L4 multistream with route for /8 net and port mask
pvvp_l4_multi_PORT_mask_vpp VPP: PVVP L4 multistream with route for /8 net and port mask
vxlan_multi_IP_mask_ovs OVS: VxLAN L3 multistream
vxlan_multi_IP_arp_vpp VPP: VxLAN L3 multistream with unique ARP entry for each IP stream
vxlan_multi_IP_mask_vpp VPP: VxLAN L3 multistream with 1 route for /8 netmask

The list of integration testcases above can be obtained by executing:

$ ./vsperf --integration --list
5.3. OVS/DPDK Regression TestCases

These regression tests verify several DPDK features used internally by Open vSwitch. The tests can be used to verify performance and correct functionality of upcoming DPDK and OVS releases and release candidates.

These tests are part of the integration testcases and must be executed with the --integration CLI parameter.

Example of execution of all OVS/DPDK regression tests:

$ ./vsperf --integration --tests ovsdpdk_

Testcases are defined in the file conf/integration/01b_dpdk_regression_tests.conf. This file contains a set of configuration options with the prefix OVSDPDK_. These parameters can be used to customize the regression tests and they will override some of the standard VSPERF configuration options. It is recommended to check the OVSDPDK configuration parameters and modify them in accordance with your VSPERF configuration.

At least the following parameters should be examined. Their values must ensure that DPDK and QEMU threads are pinned to CPU cores of the same NUMA slot to which the tested NICs are connected.

_OVSDPDK_1st_PMD_CORE
_OVSDPDK_2nd_PMD_CORE
_OVSDPDK_GUEST_5_CORES
5.3.1. DPDK NIC Support

A set of performance tests to verify support of DPDK accelerated network interface cards. Testcases use the standard physical-to-physical network scenario with several vSwitch and traffic configurations, which include one and two PMD threads, uni- and bidirectional traffic, and RFC2544 Continuous or RFC2544 Throughput with 0% packet loss traffic types.

Testcase Name Description
ovsdpdk_nic_p2p_single_pmd_unidir_cont P2P with single PMD in OVS and unidirectional traffic (RFC2544 Continuous).
ovsdpdk_nic_p2p_single_pmd_bidir_cont P2P with single PMD in OVS and bidirectional traffic (RFC2544 Continuous).
ovsdpdk_nic_p2p_two_pmd_bidir_cont P2P with two PMDs in OVS and bidirectional traffic (RFC2544 Continuous).
ovsdpdk_nic_p2p_single_pmd_unidir_tput P2P with single PMD in OVS and unidirectional traffic (RFC2544 Throughput).
ovsdpdk_nic_p2p_single_pmd_bidir_tput P2P with single PMD in OVS and bidirectional traffic (RFC2544 Throughput).
ovsdpdk_nic_p2p_two_pmd_bidir_tput P2P with two PMDs in OVS and bidirectional traffic (RFC2544 Throughput).
5.3.2. DPDK Hotplug Support

A set of functional tests to verify DPDK hotplug support. The tests verify that it is possible to use a port which was not bound to the DPDK driver during vSwitch startup. There is also a test which verifies the possibility to detach a port from the DPDK driver. However, support for manual detachment of a port from DPDK has been removed from recent OVS versions, so this testcase is expected to fail.

Testcase Name Description
ovsdpdk_hotplug_attach Ensure successful port-add after binding a device to igb_uio after ovs-vswitchd is launched.
ovsdpdk_hotplug_detach Same as ovsdpdk_hotplug_attach, but delete and detach the device after the hotplug. Note: Support of netdev-dpdk/detach has been removed from OVS, so this testcase will fail with recent OVS/DPDK versions.
5.3.3. RX Checksum Support

A set of functional tests for verification of RX checksum calculation for tunneled traffic. Open vSwitch enables RX checksum offloading by default if the NIC supports it. Note that it is not possible to explicitly disable or enable RX checksum offloading. In order to verify correct RX checksum calculation in software, the user has to execute these testcases on a NIC without HW offloading capabilities.

Testcases utilize the existing overlay physical-to-physical (op2p) network deployment implemented in vsperf. This deployment expects that the traffic generator sends unidirectional tunneled traffic (e.g. VXLAN) and that Open vSwitch decapsulates the data and sends it back to the traffic generator via the second port.

Testcase Name Description
ovsdpdk_checksum_l3 Test verifies RX IP header checksum (offloading) validation for tunneling protocols.
ovsdpdk_checksum_l4 Test verifies RX UDP header checksum (offloading) validation for tunneling protocols.
5.3.4. Flow Control Support

A set of functional testcases for the validation of flow control support in Open vSwitch with DPDK support. If flow control is enabled in both OVS and the traffic generator, then whenever a network endpoint (OVS or TGEN) is unable to process incoming data, it detects an RX buffer overflow and sends an Ethernet pause frame (as defined in 802.3x) to the TX side. This mechanism ensures that the TX side slows down traffic transmission and thus no data is lost at the RX side.

The introduced testcases use a physical-to-physical scenario to forward data between traffic generator ports. It is expected that the processing of small frames in OVS is slower than line rate, which means that with flow control disabled the traffic generator will report frame loss. On the other hand, with flow control enabled, the traffic generator should report 0% frame loss.
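
The expected outcome can be illustrated with a toy model. All numbers below are hypothetical and only illustrate the relationship between offered load, switch capacity and reported loss; they are not measured values:

```python
# Toy model of the behaviour described above (hypothetical numbers).
OFFERED_FPS = 14_880_952          # 64B frames offered at 10 Gbps line rate
SWITCH_CAPACITY_FPS = 11_000_000  # assumed OVS small-frame forwarding capacity

def frame_loss_percent(flow_control_enabled):
    """Return the frame loss the traffic generator would report."""
    if flow_control_enabled:
        # Pause frames throttle the sender down to switch capacity,
        # so nothing is dropped at the RX side.
        return 0.0
    dropped = max(0, OFFERED_FPS - SWITCH_CAPACITY_FPS)
    return 100.0 * dropped / OFFERED_FPS

print(frame_loss_percent(True))   # 0.0
print(frame_loss_percent(False) > 0)  # True
```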

Testcase Name Description
ovsdpdk_flow_ctrl_rx Test the rx flow control functionality of DPDK PHY ports.
ovsdpdk_flow_ctrl_rx_dynamic Change the rx flow control support at run time and ensure the system honored the changes.
5.3.5. Multiqueue Support

A set of functional testcases for validation of multiqueue support for both physical and vHost User DPDK ports. Testcases utilize P2P and PVP network deployments and native support of multiqueue configuration available in VSPERF.

Testcase Name Description
ovsdpdk_mq_p2p_rxqs Setup rxqs on NIC port.
ovsdpdk_mq_p2p_rxqs_same_core_affinity Affinitize rxqs to the same core.
ovsdpdk_mq_p2p_rxqs_multi_core_affinity Affinitize rxqs to separate cores.
ovsdpdk_mq_pvp_rxqs Setup rxqs on vhost user port.
ovsdpdk_mq_pvp_rxqs_linux_bridge Confirm traffic received over vhost RXQs with Linux virtio device in guest.
ovsdpdk_mq_pvp_rxqs_testpmd Confirm traffic received over vhost RXQs with DPDK device in guest.
5.3.6. Vhost User

A set of functional testcases for validation of vHost User Client and vHost User Server modes in OVS.

NOTE: Vhost User Server mode is deprecated and it will be removed from OVS in the future.

Testcase Name Description
ovsdpdk_vhostuser_client Test vhost-user client mode
ovsdpdk_vhostuser_client_reconnect Test vhost-user client mode reconnect feature
ovsdpdk_vhostuser_server Test vhost-user server mode
ovsdpdk_vhostuser_sock_dir Verify functionality of vhost-sock-dir flag
5.3.7. Virtual Devices Support

A set of functional testcases for verification of correct functionality of virtual device PMD drivers.

Testcase Name Description
ovsdpdk_vdev_add_null_pmd Test addition of port using the null DPDK PMD driver.
ovsdpdk_vdev_del_null_pmd Test deletion of port using the null DPDK PMD driver.
ovsdpdk_vdev_add_af_packet_pmd Test addition of port using the af_packet DPDK PMD driver.
ovsdpdk_vdev_del_af_packet_pmd Test deletion of port using the af_packet DPDK PMD driver.
5.3.8. NUMA Support

A functional testcase for validation of NUMA awareness feature in OVS.

Testcase Name Description
ovsdpdk_numa Test vhost-user NUMA support. Vhost-user PMD threads should migrate to the same NUMA slot where QEMU is executed.
5.3.9. Jumbo Frame Support

A set of functional testcases for verification of jumbo frame support in OVS. Testcases utilize P2P and PVP network deployments and native support of jumbo frames available in VSPERF.

Testcase Name Description
ovsdpdk_jumbo_increase_mtu_phy_port_ovsdb Ensure that the increased MTU for a DPDK physical port is updated in OVSDB.
ovsdpdk_jumbo_increase_mtu_vport_ovsdb Ensure that the increased MTU for a DPDK vhost-user port is updated in OVSDB.
ovsdpdk_jumbo_reduce_mtu_phy_port_ovsdb Ensure that the reduced MTU for a DPDK physical port is updated in OVSDB.
ovsdpdk_jumbo_reduce_mtu_vport_ovsdb Ensure that the reduced MTU for a DPDK vhost-user port is updated in OVSDB.
ovsdpdk_jumbo_increase_mtu_phy_port_datapath Ensure that the MTU for a DPDK physical port is updated in the datapath itself when increased to a valid value.
ovsdpdk_jumbo_increase_mtu_vport_datapath Ensure that the MTU for a DPDK vhost-user port is updated in the datapath itself when increased to a valid value.
ovsdpdk_jumbo_reduce_mtu_phy_port_datapath Ensure that the MTU for a DPDK physical port is updated in the datapath itself when decreased to a valid value.
ovsdpdk_jumbo_reduce_mtu_vport_datapath Ensure that the MTU for a DPDK vhost-user port is updated in the datapath itself when decreased to a valid value.
ovsdpdk_jumbo_mtu_upper_bound_phy_port Verify that the upper bound limit is enforced for OvS DPDK Phy ports.
ovsdpdk_jumbo_mtu_upper_bound_vport Verify that the upper bound limit is enforced for OvS DPDK vhost-user ports.
ovsdpdk_jumbo_mtu_lower_bound_phy_port Verify that the lower bound limit is enforced for OvS DPDK Phy ports.
ovsdpdk_jumbo_mtu_lower_bound_vport Verify that the lower bound limit is enforced for OvS DPDK vhost-user ports.
ovsdpdk_jumbo_p2p Ensure that jumbo frames are received, processed and forwarded correctly by DPDK physical ports.
ovsdpdk_jumbo_pvp Ensure that jumbo frames are received, processed and forwarded correctly by DPDK vhost-user ports.
ovsdpdk_jumbo_p2p_upper_bound Ensure that jumbo frames above the configured Rx port’s MTU are not accepted.
5.3.10. Rate Limiting

A set of functional testcases for validation of rate limiting support. This feature allows configuring ingress policing for both physical and vHost User DPDK ports.

NOTE: The desired maximum rate is specified in kilobits per second and defines the rate of the payload only.
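
Because the configured rate covers the payload only, the corresponding packet rate depends on the payload bytes carried per frame. A minimal sketch of the conversion, with hypothetical numbers:

```python
# Convert a payload-only policing rate into an expected packet rate.
# rate_kbps and payload_bytes are hypothetical example values.
rate_kbps = 1000      # configured ingress policing rate, kilobits/s
payload_bytes = 512   # payload carried by each frame

packets_per_second = (rate_kbps * 1000) / (payload_bytes * 8)
print(round(packets_per_second, 2))  # 244.14
```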

Testcase Name Description
ovsdpdk_rate_create_phy_port Ensure a rate limiting interface can be created on a physical DPDK port.
ovsdpdk_rate_delete_phy_port Ensure a rate limiting interface can be destroyed on a physical DPDK port.
ovsdpdk_rate_create_vport Ensure a rate limiting interface can be created on a vhost-user port.
ovsdpdk_rate_delete_vport Ensure a rate limiting interface can be destroyed on a vhost-user port.
ovsdpdk_rate_no_policing Ensure that when a user attempts to create a rate limiting interface but omits the policing rate argument, no rate limiter is created.
ovsdpdk_rate_no_burst Ensure that when a user attempts to create a rate limiting interface but omits the policing burst argument, the rate limiter is still created.
ovsdpdk_rate_p2p Ensure when a user creates a rate limiting physical interface that the traffic is limited to the specified policer rate in a p2p setup.
ovsdpdk_rate_pvp Ensure when a user creates a rate limiting vHost User interface that the traffic is limited to the specified policer rate in a pvp setup.
ovsdpdk_rate_p2p_multi_pkt_sizes Ensure that rate limiting works for various frame sizes.
5.3.11. Quality of Service

A set of functional testcases for validation of QoS support. This feature allows configuring egress policing for both physical and vHost User DPDK ports.

NOTE: The desired maximum rate is specified in bytes per second and defines the rate of the payload only.

Testcase Name Description
ovsdpdk_qos_create_phy_port Ensure a QoS policy can be created on a physical DPDK port
ovsdpdk_qos_delete_phy_port Ensure an existing QoS policy can be destroyed on a physical DPDK port.
ovsdpdk_qos_create_vport Ensure a QoS policy can be created on a virtual vhost user port.
ovsdpdk_qos_delete_vport Ensure an existing QoS policy can be destroyed on a vhost user port.
ovsdpdk_qos_create_no_cir Ensure that a QoS policy cannot be created if the egress policer cir argument is missing.
ovsdpdk_qos_create_no_cbs Ensure that a QoS policy cannot be created if the egress policer cbs argument is missing.
ovsdpdk_qos_p2p In a p2p setup, ensure when a QoS egress policer is created that the traffic is limited to the specified rate.
ovsdpdk_qos_pvp In a pvp setup, ensure when a QoS egress policer is created that the traffic is limited to the specified rate.
5.3.12. Custom Statistics

A set of functional testcases for validation of Custom Statistics support in OVS. This feature allows custom statistics to be accessed by VSPERF.

These testcases require DPDK v17.11, the latest Open vSwitch (v2.9.90) and the IxNet traffic generator.

Testcase Name Description
ovsdpdk_custstat_check Test if custom statistics are supported.
ovsdpdk_custstat_rx_error Test bad ethernet CRC counter ‘rx_crc_errors’ exposed by custom statistics.
5.4. T-Rex in VM TestCases

A set of functional testcases which use T-Rex running in a VM as a traffic generator. These testcases require a VM image with the T-Rex server installed. An example of such an image is the vloop-vnf image with T-Rex, available for download at:

http://artifacts.opnfv.org/vswitchperf/vnf/vloop-vnf-ubuntu-16.04_trex_20180209.qcow2

This image can be used for both T-Rex VM and loopback VM in vm2vm testcases.

NOTE: The performance of T-Rex running inside a VM is lower compared to T-Rex execution on bare metal. The user should calibrate the VM's maximum FPS capability to ensure this limitation is understood.

Testcase Name Description
trex_vm_cont T-Rex VM - execute RFC2544 Continuous Stream from T-Rex VM and loop it back through Open vSwitch.
trex_vm_tput T-Rex VM - execute RFC2544 Throughput from T-Rex VM and loop it back through Open vSwitch.
trex_vm2vm_cont T-Rex VM2VM - execute RFC2544 Continuous Stream from T-Rex VM and loop it back through 2nd VM.
trex_vm2vm_tput T-Rex VM2VM - execute RFC2544 Throughput from T-Rex VM and loop it back through 2nd VM.
VSPERF Test Guide
1. vSwitchPerf test suites userguide
1.1. General

VSPERF requires a traffic generator to run tests. Automated traffic generator support in VSPERF includes:

  • IXIA traffic generator (IxNetwork hardware) and a machine that runs the IXIA client software.
  • Spirent traffic generator (TestCenter hardware chassis or TestCenter virtual in a VM) and a VM to run the Spirent Virtual Deployment Service image, formerly known as “Spirent LabServer”.
  • Xena Network traffic generator (Xena hardware chassis) that houses the Xena Traffic generator modules.
  • Moongen software traffic generator. Requires a separate machine running moongen to execute packet generation.
  • T-Rex software traffic generator. Requires a separate machine running T-Rex Server to execute packet generation.

If you want to use another traffic generator, please select the Dummy generator.

1.2. VSPERF Installation

To see the supported Operating Systems, vSwitches and system requirements, please follow the installation instructions <vsperf-installation>.

1.3. Traffic Generator Setup

Follow the Traffic generator instructions <trafficgen-installation> to install and configure a suitable traffic generator.

1.4. Cloning and building src dependencies

In order to run VSPERF, you will need to download DPDK and OVS. You can do this manually and build them in a preferred location, or you can use vswitchperf/src. The vswitchperf/src directory contains makefiles that will allow you to clone and build the libraries that VSPERF depends on, such as DPDK and OVS. To clone and build, simply:

$ cd src
$ make

VSPERF can be used with stock OVS (without DPDK support). When the build is finished, the libraries are stored in the src_vanilla directory.

The ‘make’ builds all options in src:

  • Vanilla OVS
  • OVS with vhost_user as the guest access method (with DPDK support)

The vhost_user build will reside in src/ovs/. The Vanilla OVS build will reside in vswitchperf/src_vanilla.

To delete a src subdirectory and its contents so you can re-clone, simply use:

$ make clobber
1.5. Configure the ./conf/10_custom.conf file

The 10_custom.conf file is the configuration file that overrides default configurations in all the other configuration files in ./conf. The supplied 10_custom.conf file MUST be modified, as it contains configuration items for which there are no reasonable default values.

The configuration items that can be added are not limited to the initial contents. Any configuration item mentioned in any .conf file in the ./conf directory can be added, and its default will be overridden by the custom configuration value.

Further details about the evaluation of configuration files and the special behaviour of options with the GUEST_ prefix can be found in the design document.

1.6. Using a custom settings file

If your 10_custom.conf doesn’t reside in the ./conf directory or if you want to use an alternative configuration file, the file can be passed to vsperf via the --conf-file argument.

$ ./vsperf --conf-file <path_to_custom_conf> ...
1.7. Evaluation of configuration parameters

The value of a configuration parameter can be specified in various places, e.g. in the test case definition, inside configuration files, by command line argument, etc. Thus it is important to understand the order of configuration parameter evaluation. This “priority hierarchy” can be described as follows (1 = max priority):

  1. Testcase definition keywords vSwitch, Trafficgen, VNF and Tunnel Type
  2. Parameters inside testcase definition section Parameters
  3. Command line arguments (e.g. --test-params, --vswitch, --trafficgen, etc.)
  4. Environment variables (see --load-env argument)
  5. Custom configuration file specified via --conf-file argument
  6. Standard configuration files, where higher prefix number means higher priority.

For example, if the same configuration parameter is defined in the custom configuration file (specified via the --conf-file argument), via the --test-params argument, and also inside the Parameters section of the testcase definition, then the parameter value from the Parameters section will be used.
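
The resolution order can be sketched in Python (the sources and values below are hypothetical; this is an illustration, not VSPERF code):

```python
# Sources are listed from highest to lowest priority; the first one
# that defines the parameter wins. All values are hypothetical.
def resolve(name, sources):
    for source in sources:
        if name in source:
            return source[name]
    raise KeyError(name)

testcase_parameters = {'TRAFFICGEN_DURATION': 10}   # priority 2
cli_test_params     = {'TRAFFICGEN_DURATION': 20}   # priority 3
custom_conf_file    = {'TRAFFICGEN_DURATION': 30}   # priority 5

value = resolve('TRAFFICGEN_DURATION',
                [testcase_parameters, cli_test_params, custom_conf_file])
print(value)  # 10 -- the Parameters section wins
```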

Further details about the order of configuration file evaluation and the special behaviour of options with the GUEST_ prefix can be found in the design document.

1.8. Overriding values defined in configuration files

The configuration items can be overridden by the command line argument --test-params. In this case, the configuration items and their values should be passed in the form item=value, separated by semicolons.

Example:

$ ./vsperf --test-params "TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,);" \
                         "GUEST_LOOPBACK=['testpmd','l2fwd']" pvvp_tput

The --test-params command line argument can also be used to override default configuration values for multiple tests. Providing a list of parameters will apply each element of the list to the test with the same index. If more tests are run than parameters provided, the last element of the list will repeat.

$ ./vsperf --test-params "['TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)',"
                         "'TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(64,)']" \
                         pvvp_tput pvvp_tput
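
The index-based mapping of parameter entries to tests can be sketched as follows (a simplified model of the behaviour described above, not VSPERF code):

```python
# Pair each test with the parameter entry of the same index; the last
# entry repeats when there are more tests than entries.
def params_for_tests(tests, param_list):
    return [param_list[min(i, len(param_list) - 1)]
            for i in range(len(tests))]

tests = ['pvvp_tput', 'pvvp_tput', 'pvvp_tput']
plist = ['TRAFFICGEN_PKT_SIZES=(128,)', 'TRAFFICGEN_PKT_SIZES=(64,)']
for test, params in zip(tests, params_for_tests(tests, plist)):
    print(test, '->', params)  # the third test reuses the (64,) entry
```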

The second option is to override configuration items via the Parameters section of the test case definition. The configuration items can be added to the Parameters dictionary with their new values. These values will override values defined in configuration files or specified via the --test-params command line argument.

Example:

"Parameters" : {'TRAFFICGEN_PKT_SIZES' : (128,),
                'TRAFFICGEN_DURATION' : 10,
                'GUEST_LOOPBACK' : ['testpmd','l2fwd'],
               }

NOTE: In both cases, configuration item names and their values must be specified in the same form as they are defined inside the configuration files. Parameter names must be specified in uppercase and the data types of the original and new values must match. Python syntax rules related to data types and structures must be followed. For example, the parameter TRAFFICGEN_PKT_SIZES above is defined as a tuple with the single value 128. In this case the trailing comma is mandatory, otherwise the value would be wrongly interpreted as a number instead of a tuple and vsperf execution would fail. Please check the configuration files for default values and their types and use them as a basis for any customized values. In case of any doubt, please consult the official Python documentation related to data structures like tuples, lists and dictionaries.
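
The tuple pitfall mentioned in the note can be demonstrated directly in Python:

```python
# (128,) is a single-element tuple; (128) is just the integer 128.
sizes_ok = (128,)
sizes_wrong = (128)

print(type(sizes_ok).__name__)     # tuple
print(type(sizes_wrong).__name__)  # int
print(len(sizes_ok))               # 1
```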

NOTE: Vsperf execution will terminate with a runtime error if an unknown parameter name is passed via the --test-params CLI argument or defined in the Parameters section of a test case definition. It is also forbidden to redefine the value of the TEST_PARAMS configuration item via the CLI or the Parameters section.

NOTE: The new definition of a dictionary parameter, specified via --test-params or inside the Parameters section, will not override the original dictionary values. Instead, the original dictionary will be updated with the values from the new dictionary definition.
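
A minimal illustration of this merge behaviour, using a hypothetical dictionary parameter:

```python
# The new definition updates the original dictionary instead of
# replacing it; keys absent from the new definition survive.
# The parameter keys and values here are hypothetical.
original = {'latency_histogram': True, 'count': 1}
new_definition = {'count': 5}

original.update(new_definition)
print(original)  # {'latency_histogram': True, 'count': 5}
```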

1.9. Referencing parameter values

It is possible to use a special macro #PARAM() to refer to the value of another configuration parameter. This reference is evaluated when the parameter value is accessed (by the settings.getValue() call), so it can refer to parameters created at VSPERF runtime, e.g. the NICS dictionary. It can be used to reflect DUT HW details in a testcase definition.

Example:

{
    ...
    "Name": "testcase",
    "Parameters" : {
        "TRAFFIC" : {
            'l2': {
                # set destination MAC to the MAC of the first
                # interface from WHITELIST_NICS list
                'dstmac' : '#PARAM(NICS[0]["mac"])',
            },
        },
    ...
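
How such a reference could be resolved can be sketched with a regular-expression substitution (illustrative only; this is not VSPERF's actual implementation, and the MAC value is hypothetical):

```python
import re

# Parameters known at runtime, e.g. the NICS dictionary built by
# VSPERF from WHITELIST_NICS; the MAC below is hypothetical.
params = {'NICS': [{'mac': '00:1b:21:aa:bb:cc'}]}

def expand(value):
    """Replace each #PARAM(expr) with the evaluated expression."""
    return re.sub(r'#PARAM\(([^)]*)\)',
                  lambda m: str(eval(m.group(1), {}, params)),
                  value)

print(expand('#PARAM(NICS[0]["mac"])'))  # 00:1b:21:aa:bb:cc
```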
1.10. vloop_vnf

VSPERF uses a VM image called vloop_vnf for looping traffic in the deployment scenarios involving VMs. The image can be downloaded from http://artifacts.opnfv.org/.

Please see the installation instructions for information on vloop-vnf images.

1.11. l2fwd Kernel Module

A kernel module that provides OSI Layer 2 IPv4 termination or forwarding with support for Destination Network Address Translation (DNAT) for both the MAC and IP addresses. l2fwd can be found in <vswitchperf_dir>/src/l2fwd.

1.12. Additional Tools Setup

Follow the Additional tools instructions <additional-tools-configuration> to install and configure additional tools such as collectors and loadgens.

1.13. Executing tests

All examples inside these docs assume that the user is inside the VSPERF directory; however, VSPERF can be executed from any directory.

Before running any tests make sure you have root permissions by adding the following line to /etc/sudoers:

username ALL=(ALL)       NOPASSWD: ALL

username in the example above should be replaced with a real username.

To list the available tests:

$ ./vsperf --list

To run a single test:

$ ./vsperf $TESTNAME

Where $TESTNAME is the name of the vsperf test you would like to run.

To run a test multiple times, repeat it:

$ ./vsperf $TESTNAME $TESTNAME $TESTNAME

To run a group of tests, for example all tests with a name containing ‘RFC2544’:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf --tests="RFC2544"

To run all tests:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf

Some tests allow for configurable parameters, including test duration (in seconds) as well as packet sizes (in bytes).

$ ./vsperf --conf-file user_settings.py \
    --tests RFC2544Tput \
    --test-params "TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)"

To specify configurable parameters for multiple tests, use a list of parameters, with one element for each test.

$ ./vsperf --conf-file user_settings.py \
    --test-params "['TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)',"\
    "'TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(64,)']" \
    phy2phy_cont phy2phy_cont

If the CUMULATIVE_PARAMS setting is set to True and different parameters are provided for each test using --test-params, each test will take the parameters of the previous test before applying its own. With CUMULATIVE_PARAMS set to True the following command is equivalent to the previous example:

$ ./vsperf --conf-file user_settings.py \
    --test-params "['TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)',"\
    "'TRAFFICGEN_PKT_SIZES=(64,)']" \
    phy2phy_cont phy2phy_cont

For all available options, check out the help dialog:

$ ./vsperf --help
1.14. Executing Vanilla OVS tests
  1. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  2. Update your 10_custom.conf file to use Vanilla OVS:

    VSWITCH = 'OvsVanilla'
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>
    

    Please note that if you don’t want to configure Vanilla OVS through the configuration file, you can pass it as a CLI argument.

    $ ./vsperf --vswitch OvsVanilla
    
1.15. Executing tests with VMs

To run tests using vhost-user as guest access method:

  1. Set VSWITCH and VNF of your settings file to:

    VSWITCH = 'OvsDpdkVhost'
    VNF = 'QemuDpdkVhost'
    
  2. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf
    

NOTE: By default the vSwitch acts as a server for DPDK vhost-user sockets. If QEMU should be the server for vhost-user sockets instead, the parameter VSWITCH_VHOSTUSER_SERVER_MODE should be set to False.

1.16. Executing tests with VMs using Vanilla OVS

To run tests using Vanilla OVS:

  1. Set the following variables:

    VSWITCH = 'OvsVanilla'
    VNF = 'QemuVirtioNet'
    
    VANILLA_TGEN_PORT1_IP = n.n.n.n
    VANILLA_TGEN_PORT1_MAC = nn:nn:nn:nn:nn:nn
    
    VANILLA_TGEN_PORT2_IP = n.n.n.n
    VANILLA_TGEN_PORT2_MAC = nn:nn:nn:nn:nn:nn
    
    VANILLA_BRIDGE_IP = n.n.n.n
    

    or use --test-params option

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
               --test-params "VANILLA_TGEN_PORT1_IP=n.n.n.n;" \
                             "VANILLA_TGEN_PORT1_MAC=nn:nn:nn:nn:nn:nn;" \
                             "VANILLA_TGEN_PORT2_IP=n.n.n.n;" \
                             "VANILLA_TGEN_PORT2_MAC=nn:nn:nn:nn:nn:nn"
    
  2. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf
    
1.17. Executing VPP tests

Currently it is not possible to use the standard scenario deployments for execution of tests with VPP. This means that the deployments p2p, pvp, pvvp, and in general any PXP deployment, won’t work with VPP. However, it is possible to use VPP in step driven tests. A basic set of VPP testcases covering phy2phy, pvp and pvvp tests is already prepared.

List of performance tests with VPP support follows:

  • phy2phy_tput_vpp: VPP: LTD.Throughput.RFC2544.PacketLossRatio
  • phy2phy_cont_vpp: VPP: Phy2Phy Continuous Stream
  • phy2phy_back2back_vpp: VPP: LTD.Throughput.RFC2544.BackToBackFrames
  • pvp_tput_vpp: VPP: LTD.Throughput.RFC2544.PacketLossRatio
  • pvp_cont_vpp: VPP: PVP Continuous Stream
  • pvp_back2back_vpp: VPP: LTD.Throughput.RFC2544.BackToBackFrames
  • pvvp_tput_vpp: VPP: LTD.Throughput.RFC2544.PacketLossRatio
  • pvvp_cont_vpp: VPP: PVVP Continuous Stream
  • pvvp_back2back_vpp: VPP: LTD.Throughput.RFC2544.BackToBackFrames

In order to execute testcases with VPP, additional setup is required; once this setup is complete, the VPP testcases listed above can be executed.

For example:

$ ./vsperf --conf-file=<path_to_custom_conf> phy2phy_tput_vpp
1.18. Using vfio_pci with DPDK

To use vfio-pci with DPDK instead of igb_uio, add the following parameter to your custom configuration file:

PATHS['dpdk']['src']['modules'] = ['uio', 'vfio-pci']

NOTE: If DPDK is installed from a binary package, please set PATHS['dpdk']['bin']['modules'] instead.

NOTE: Please ensure that Intel VT-d is enabled in BIOS.

NOTE: Please ensure your boot/grub parameters include the following:

iommu=pt intel_iommu=on

To check that IOMMU is enabled on your platform:

$ dmesg | grep IOMMU
[    0.000000] Intel-IOMMU: enabled
[    0.139882] dmar: IOMMU 0: reg_base_addr fbffe000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.139888] dmar: IOMMU 1: reg_base_addr ebffc000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.139893] IOAPIC id 2 under DRHD base  0xfbffe000 IOMMU 0
[    0.139894] IOAPIC id 0 under DRHD base  0xebffc000 IOMMU 1
[    0.139895] IOAPIC id 1 under DRHD base  0xebffc000 IOMMU 1
[    3.335744] IOMMU: dmar0 using Queued invalidation
[    3.335746] IOMMU: dmar1 using Queued invalidation
....

NOTE: In the case of VPP, it is required to explicitly define that the vfio-pci DPDK driver should be used. This means updating the dpdk part of the VSWITCH_VPP_ARGS dictionary with a uio-driver section, e.g. VSWITCH_VPP_ARGS['dpdk'] = 'uio-driver vfio-pci'.

1.19. Using SRIOV support

To use the virtual functions of a NIC with SRIOV support, use the extended form of the NIC PCI slot definition:

WHITELIST_NICS = ['0000:05:00.0|vf0', '0000:05:00.1|vf3']

Where ‘vf’ indicates virtual function usage and the following number defines the VF to be used. When VF usage is detected, vswitchperf will enable SRIOV support for the given card and detect the PCI slot numbers of the selected VFs.

So in the example above, one VF will be configured for NIC ‘0000:05:00.0’ and four VFs will be configured for NIC ‘0000:05:00.1’. Vswitchperf will detect the PCI addresses of the selected VFs and use them during test execution.
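
The notation can be illustrated with a small parser sketch (parse_vf_spec is a hypothetical helper for illustration, not a VSPERF API):

```python
# Parse the extended '<pci_address>|vf<N>' notation described above.
# VF index N implies that N+1 VFs are configured on the card.
def parse_vf_spec(spec):
    """Return (pci_address, vf_index, number_of_vfs_to_configure)."""
    if '|vf' in spec:
        pci, _, index = spec.partition('|vf')
        return pci, int(index), int(index) + 1
    return spec, None, 0

for nic in ['0000:05:00.0|vf0', '0000:05:00.1|vf3']:
    print(parse_vf_spec(nic))
# ('0000:05:00.0', 0, 1)
# ('0000:05:00.1', 3, 4)
```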

At the end of vswitchperf execution, SRIOV support will be disabled.

SRIOV support is generic and can be used in different testing scenarios. For example:

  • vSwitch tests with DPDK or without DPDK support to verify impact of VF usage on vSwitch performance
  • tests without vSwitch, where traffic is forwarded directly between VF interfaces by packet forwarder (e.g. testpmd application)
  • tests without vSwitch, where VM accesses VF interfaces directly by PCI-passthrough to measure raw VM throughput performance.
1.20. Using QEMU with PCI passthrough support

Raw virtual machine throughput performance can be measured by executing the PVP test with direct access to NICs via PCI pass-through. To execute a VM with direct access to PCI devices, enable vfio-pci. In order to use virtual functions, SRIOV support must be enabled.

Execution of test with PCI pass-through with vswitch disabled:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
           --vswitch none --vnf QemuPciPassthrough pvp_tput

Any of the supported guest loopback applications can be used inside a VM with PCI pass-through support.

Note: QEMU with PCI pass-through support can be used only with the PVP test deployment.

1.21. Selection of loopback application for tests with VMs

To select the loopback applications which will forward packets inside VMs, the following parameter should be configured:

GUEST_LOOPBACK = ['testpmd']

or use --test-params CLI argument:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
      --test-params "GUEST_LOOPBACK=['testpmd']"

Supported loopback applications are:

'testpmd'       - testpmd from dpdk will be built and used
'l2fwd'         - l2fwd module provided by Huawei will be built and used
'linux_bridge'  - linux bridge will be configured
'buildin'       - nothing will be configured by vsperf; VM image must
                  ensure traffic forwarding between its interfaces

A guest loopback application must be configured, otherwise traffic will not be forwarded by the VM and testcases with VM-related deployments will fail. The guest loopback application is set to ‘testpmd’ by default.

NOTE: If only one NIC, or more than two NICs, are configured for the VM, then ‘testpmd’ should be used, as it is able to forward traffic between multiple VM NIC pairs.

NOTE: In case of linux_bridge, all guest NICs are connected to the same bridge inside the guest.

1.22. Mergeable Buffers Options with QEMU

Mergeable buffers can be disabled with VSPerf within QEMU. Disabling them can increase performance significantly when not using jumbo frame sized packets. By default VSPerf disables mergeable buffers. If you wish to enable them, you can modify the setting in a custom conf file.

GUEST_NIC_MERGE_BUFFERS_DISABLE = [False]

Then execute using the custom conf file.

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf

Alternatively you can just pass the param during execution.

$ ./vsperf --test-params "GUEST_NIC_MERGE_BUFFERS_DISABLE=[False]"
1.23. Selection of dpdk binding driver for tests with VMs

To select the dpdk binding driver, i.e. the driver the VM NICs will be bound to for dpdk, the following configuration parameter should be set:

GUEST_DPDK_BIND_DRIVER = ['igb_uio_from_src']

The supported dpdk guest bind drivers are:

'uio_pci_generic'      - Use uio_pci_generic driver
'igb_uio_from_src'     - Build and use the igb_uio driver from the dpdk src
                         files
'vfio_no_iommu'        - Use vfio with no iommu option. This requires custom
                         guest images that support this option. The default
                         vloop image does not support this driver.

Note: uio_pci_generic does not support SR-IOV testcases with guests attached, because uio_pci_generic only supports legacy interrupts. If uio_pci_generic is selected with the vnf set to QemuPciPassthrough, it will be modified to use igb_uio_from_src instead.

Note: vfio_no_iommu requires a kernel version of 4.5 or greater and DPDK 16.04 or greater. Using this option will also taint the kernel.

Please refer to the dpdk documents at http://dpdk.org/doc/guides for more information on these drivers.

1.24. Guest Core and Thread Binding

VSPERF provides options to achieve better performance through guest core binding and guest vCPU thread binding. Core binding binds all the qemu threads. Thread binding binds the housekeeping threads to some CPUs and the vCPU threads to other CPUs, which helps to reduce the noise from qemu housekeeping threads.

GUEST_CORE_BINDING = [('#EVAL(6+2*#VMINDEX)', '#EVAL(7+2*#VMINDEX)')]
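The #EVAL macro above computes the core pair from the VM index; its effect can be sketched as follows (illustrative, not vsperf's macro engine):

```python
# Expansion of GUEST_CORE_BINDING = [('#EVAL(6+2*#VMINDEX)', '#EVAL(7+2*#VMINDEX)')]
# for consecutive VM indexes.
def core_pair(vm_index):
    return (6 + 2 * vm_index, 7 + 2 * vm_index)

print([core_pair(i) for i in range(3)])  # [(6, 7), (8, 9), (10, 11)]
```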

NOTE: By default GUEST_THREAD_BINDING is None, which means the same as GUEST_CORE_BINDING, i.e. the vCPU threads share the physical CPUs with the housekeeping threads. Better performance using vCPU thread binding can be achieved by enabling affinity in the custom configuration file.

For example, if an environment requires cores 32 and 33 to be core bound and cores 29, 30 and 31 to be used for guest thread binding to achieve better performance:

VNF_AFFINITIZATION_ON = True
GUEST_CORE_BINDING = [('32','33')]
GUEST_THREAD_BINDING = [('29', '30', '31')]
1.25. Qemu CPU features

QEMU defaults to a compatible subset of performance-enhancing CPU features. To pass all available host processor features to the guest:

GUEST_CPU_OPTIONS = ['host,migratable=off']

NOTE: To enhance performance, CPU features such as the TSC deadline timer for the guest, the guest PMU and the invariant TSC can be provided in the custom configuration file.
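A hedged example of such a configuration entry; the feature flags shown (invtsc for the invariant TSC, tsc-deadline for the TSC deadline timer, pmu=on for the guest PMU) are passed through to QEMU's -cpu option string, and their exact names may vary with your QEMU version:

```python
# Custom conf file sketch: expose selected performance-related CPU features
# to the guest in addition to the full host feature set.
# Feature flag names are assumptions; verify against your QEMU version.
GUEST_CPU_OPTIONS = ['host,migratable=off,+invtsc,+tsc-deadline,pmu=on']
```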

1.26. Multi-Queue Configuration

VSPerf currently supports multi-queue with the following limitations:

  1. Requires QEMU 2.5 or greater and any OVS version higher than 2.5. The default upstream package versions installed by VSPerf satisfy this requirement.

  2. Guest image must have ethtool utility installed if using l2fwd or linux bridge inside guest for loopback.

  3. If using OVS version 2.5.0 or less, enable old style multi-queue as shown in the ‘‘02_vswitch.conf’’ file.

    OVS_OLD_STYLE_MQ = True
    

To enable multi-queue for dpdk modify the ‘‘02_vswitch.conf’’ file.

VSWITCH_DPDK_MULTI_QUEUES = 2

NOTE: You should consider using the switch affinity to set a PMD CPU mask that can optimize your performance. If applicable, consider the NUMA node of the NIC in use by checking /sys/class/net/<eth_name>/device/numa_node and setting an appropriate mask to create PMD threads on the same NUMA node.
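Building such a PMD CPU mask from a list of core numbers can be sketched as follows (an illustrative helper, not part of vsperf):

```python
# Compute a CPU mask (as used by pmd-cpu-mask) from a list of core ids,
# e.g. cores on the same NUMA node as the NIC.
def pmd_cpu_mask(cores):
    mask = 0
    for core in cores:
        mask |= 1 << core
    return hex(mask)

print(pmd_cpu_mask([4, 5]))  # 0x30
```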

When multi-queue is enabled, each dpdk or dpdkvhostuser port that is created on the switch will set the option for multiple queues. If old style multi-queue has been enabled, a global option for multi-queue will be used instead of the port-by-port option.

To enable multi-queue on the guest modify the ‘‘04_vnf.conf’’ file.

GUEST_NIC_QUEUES = [2]

Enabling multi-queue at the guest will add multiple queues to each NIC port when qemu launches the guest.

In case of Vanilla OVS, multi-queue is enabled on the tuntap ports and NIC queues will be enabled inside the guest with ethtool. Simply enabling multi-queue on the guest is sufficient for Vanilla OVS multi-queue.

Testpmd should be configured to take advantage of multi-queue on the guest if using DPDKVhostUser. This can be done by modifying the ‘‘04_vnf.conf’’ file.

GUEST_TESTPMD_PARAMS = ['-l 0,1,2,3,4  -n 4 --socket-mem 512 -- '
                        '--burst=64 -i --txqflags=0xf00 '
                        '--nb-cores=4 --rxq=2 --txq=2 '
                        '--disable-hw-vlan']

NOTE: The guest SMP cores must be configured to allow for testpmd to use the optimal number of cores to take advantage of the multiple guest queues.

In case of using Vanilla OVS and qemu virtio-net, you can increase performance by binding vhost-net threads to CPUs. This can be done by enabling the affinity in the ‘‘04_vnf.conf’’ file. This can also be done for non multi-queue enabled configurations, as there will be 2 vhost-net threads.

VSWITCH_VHOST_NET_AFFINITIZATION = True

VSWITCH_VHOST_CPU_MAP = [4,5,8,11]

NOTE: This method of binding would require a custom script in a real environment.

NOTE: For optimal performance, guest SMP cores and/or vhost-net threads should be on the same NUMA node as the NIC in use, if possible/applicable. Testpmd should be assigned at least (nb_cores + 1) total cores with the CPU mask.

1.27. Jumbo Frame Testing

VSPERF provides options to support jumbo frame testing with a jumbo frame supported NIC and traffic generator for the following vswitches:

  1. OVSVanilla
  2. OvsDpdkVhostUser
  3. TestPMD loopback with or without a guest

NOTE: There is currently no support for SR-IOV or VPP with jumbo frames.

All packet forwarding applications for pxp testing are supported.

To enable jumbo frame testing simply enable the option in the conf files and set the maximum size that will be used.

VSWITCH_JUMBO_FRAMES_ENABLED = True
VSWITCH_JUMBO_FRAMES_SIZE = 9000

To enable jumbo frame testing with OVSVanilla the NIC in test on the host must have its mtu size changed manually using ifconfig or applicable tools:

ifconfig eth1 mtu 9000 up

NOTE: To make the setting consistent across reboots you should reference the OS documents as it differs from distribution to distribution.

To start a test for jumbo frames, modify the packet sizes in the conf file or pass the option through the VSPERF command line:

TEST_PARAMS = {'TRAFFICGEN_PKT_SIZES':(2000,9000)}
./vsperf --test-params "TRAFFICGEN_PKT_SIZES=2000,9000"
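As a sanity check (an illustrative helper, not vsperf code), the requested packet sizes should not exceed the configured jumbo frame size:

```python
# Verify that all requested packet sizes fit within the jumbo frame limit
# configured via VSWITCH_JUMBO_FRAMES_SIZE.
def sizes_fit(pkt_sizes, jumbo_size=9000):
    return all(size <= jumbo_size for size in pkt_sizes)

print(sizes_fit((2000, 9000)))  # True
```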

It is recommended to increase the memory size for OvsDpdkVhostUser testing from the default 1024. The size required may vary depending on the number of guests in your testing; 4096 appears to work well for most typical testing scenarios.

DPDK_SOCKET_MEM = ['4096', '0']

NOTE: For jumbo frames to work with DpdkVhostUser, mergeable buffers will be enabled by default. If testing with mergeable buffers in QEMU is desired, disable jumbo frames and only test non jumbo frame sizes. Test jumbo frame sizes separately to avoid this collision.

1.28. Executing Packet Forwarding tests

To select the applications which will forward packets, the following parameters should be configured:

VSWITCH = 'none'
PKTFWD = 'TestPMD'

or use --vswitch and --fwdapp CLI arguments:

$ ./vsperf phy2phy_cont --conf-file user_settings.py \
           --vswitch none \
           --fwdapp TestPMD

Supported Packet Forwarding applications are:

'testpmd'       - testpmd from dpdk
  1. Update your ‘‘10_custom.conf’’ file to use the appropriate variables for selected Packet Forwarder:

    # testpmd configuration
    TESTPMD_ARGS = []
    # packet forwarding mode supported by testpmd; Please see DPDK documentation
    # for comprehensive list of modes supported by your version.
    # e.g. io|mac|mac_retry|macswap|flowgen|rxonly|txonly|csum|icmpecho|...
    # Note: Option "mac_retry" has been changed to "mac retry" since DPDK v16.07
    TESTPMD_FWD_MODE = 'csum'
    # checksum calculation layer: ip|udp|tcp|sctp|outer-ip
    TESTPMD_CSUM_LAYER = 'ip'
    # checksum calculation place: hw (hardware) | sw (software)
    TESTPMD_CSUM_CALC = 'sw'
    # recognize tunnel headers: on|off
    TESTPMD_CSUM_PARSE_TUNNEL = 'off'
    
  2. Run test:

    $ ./vsperf phy2phy_tput --conf-file <path_to_settings_py>
    
1.29. Executing Packet Forwarding tests with one guest

TestPMD with DPDK 16.11 or greater can be used to forward packets as a switch to a single guest using the TestPMD vdev option. To set this configuration, the following parameters should be used.

VSWITCH = 'none'
PKTFWD = 'TestPMD'

or use --vswitch and --fwdapp CLI arguments:

$ ./vsperf pvp_tput --conf-file user_settings.py \
           --vswitch none \
           --fwdapp TestPMD

Only TestPMD is supported as the guest forwarding application in this configuration.

GUEST_LOOPBACK = ['testpmd']

For optimal performance, one CPU per port plus one additional CPU should be used for TestPMD. Also set additional parameters for the packet forwarding application to use the correct number of nb-cores.

DPDK_SOCKET_MEM = ['1024', '0']
VSWITCHD_DPDK_ARGS = ['-l', '46,44,42,40,38', '-n', '4']
TESTPMD_ARGS = ['--nb-cores=4', '--txq=1', '--rxq=1']

For guest TestPMD, three vCPUs should be assigned with the following TestPMD parameters.

GUEST_TESTPMD_PARAMS = ['-l 0,1,2 -n 4 --socket-mem 1024 -- '
                        '--burst=64 -i --txqflags=0xf00 '
                        '--disable-hw-vlan --nb-cores=2 --txq=1 --rxq=1']

Execution of TestPMD can be run with the following command line:

./vsperf pvp_tput --vswitch=none --fwdapp=TestPMD --conf-file <path_to_settings_py>

NOTE: To achieve the best 0% loss numbers with rfc2544 throughput testing, other tunings should be applied to host and guest such as tuned profiles and CPU tunings to prevent possible interrupts to worker threads.

1.30. VSPERF modes of operation

VSPERF can be run in different modes. By default it will configure the vSwitch, the traffic generator and the VNF. However, it can be used just for configuration and execution of the traffic generator. Another option is execution of all components except the traffic generator itself.

The mode of operation is driven by the configuration parameter -m or --mode:

-m MODE, --mode MODE  vsperf mode of operation;
    Values:
        "normal" - execute vSwitch, VNF and traffic generator
        "trafficgen" - execute only traffic generator
        "trafficgen-off" - execute vSwitch and VNF
        "trafficgen-pause" - execute vSwitch and VNF but wait before traffic transmission

In case VSPERF is executed in “trafficgen” mode, the configuration of the traffic generator can be modified through the TRAFFIC dictionary passed to the --test-params option. It is not necessary to specify all values of the TRAFFIC dictionary; it is sufficient to specify only the values which should be changed. A detailed description of the TRAFFIC dictionary can be found at Configuration of TRAFFIC dictionary.

Example of execution of VSPERF in “trafficgen” mode:

$ ./vsperf -m trafficgen --trafficgen IxNet --conf-file vsperf.conf \
    --test-params "TRAFFIC={'traffic_type':'rfc2544_continuous','bidir':'False','framerate':60}"
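The partial-override behaviour described above can be sketched as a dictionary merge (illustrative; the default values shown are placeholders, not vsperf's actual defaults):

```python
import copy

# Stand-in for vsperf's built-in TRAFFIC dictionary (placeholder values).
DEFAULT_TRAFFIC = {'traffic_type': 'rfc2544_throughput',
                   'bidir': 'True',
                   'framerate': 100}

def merge_traffic(overrides):
    # only the listed keys change; the rest keep their defaults
    traffic = copy.deepcopy(DEFAULT_TRAFFIC)
    traffic.update(overrides)
    return traffic

print(merge_traffic({'traffic_type': 'rfc2544_continuous', 'framerate': 60}))
```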
1.31. Performance Matrix

The --matrix command line argument analyses and displays the performance of all the tests run. Using the metric specified by MATRIX_METRIC in the conf-file, the first test is set as the baseline and all the other tests are compared to it. The MATRIX_METRIC must always refer to a numeric value to enable comparison. A table with the test ID, metric value, the change of the metric in %, test name and the test parameters used for each test is printed out as well as saved into the results directory.

Example of 2 tests being compared using Performance Matrix:

$ ./vsperf --conf-file user_settings.py \
    --test-params "['TRAFFICGEN_PKT_SIZES=(64,)',"\
    "'TRAFFICGEN_PKT_SIZES=(128,)']" \
    phy2phy_cont phy2phy_cont --matrix

Example output:

+------+--------------+---------------------+----------+---------------------------------------+
|   ID | Name         |   throughput_rx_fps |   Change | Parameters, CUMULATIVE_PARAMS = False |
+======+==============+=====================+==========+=======================================+
|    0 | phy2phy_cont |        23749000.000 |        0 | 'TRAFFICGEN_PKT_SIZES': [64]          |
+------+--------------+---------------------+----------+---------------------------------------+
|    1 | phy2phy_cont |        16850500.000 |  -29.048 | 'TRAFFICGEN_PKT_SIZES': [128]         |
+------+--------------+---------------------+----------+---------------------------------------+
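The Change column in the table above is the percentage difference of the metric relative to the baseline (test ID 0); a sketch of the computation:

```python
# Percentage change of a metric value against the baseline test's value.
def matrix_change(baseline, value):
    return round((value - baseline) / baseline * 100, 3)

print(matrix_change(23749000.0, 16850500.0))  # -29.048, as in the table
```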
1.32. Code change verification by pylint

Every developer participating in the VSPERF project should run pylint before their Python code is submitted for review. A project specific configuration for pylint is available in ‘pylint.rc’.

Example of manual pylint invocation:

$ pylint --rcfile ./pylintrc ./vsperf
1.33. GOTCHAs:
1.33.1. Custom image fails to boot

Custom VM images may fail to boot within VSPerf pxp testing because of the boot and shared drive types, which could be caused by a missing SCSI driver inside the image. In case of issues, you can try changing the drive boot type to ide.

GUEST_BOOT_DRIVE_TYPE = ['ide']
GUEST_SHARED_DRIVE_TYPE = ['ide']
1.33.2. OVS with DPDK and QEMU

If you encounter the following error during qemu initialization: “before (last 100 chars): ‘-path=/dev/hugepages,share=on: unable to map backing store for hugepages: Cannot allocate memory’”, check the amount of hugepages on your system:

$ cat /proc/meminfo | grep HugePages

By default the vswitchd is launched with 1Gb of memory. To change this, modify the --socket-mem parameter in conf/02_vswitch.conf to allocate an appropriate amount of memory:

DPDK_SOCKET_MEM = ['1024', '0']
VSWITCHD_DPDK_ARGS = ['-c', '0x4', '-n', '4']
VSWITCHD_DPDK_CONFIG = {
    'dpdk-init' : 'true',
    'dpdk-lcore-mask' : '0x4',
    'dpdk-socket-mem' : '1024,0',
}

Note: The option VSWITCHD_DPDK_ARGS is used for vswitchd, which supports the --dpdk parameter. In recent vswitchd versions, the option VSWITCHD_DPDK_CONFIG will be used to configure vswitchd via ovs-vsctl calls.

1.34. More information

For more information and details refer to the rest of the vSwitchPerf user documentation.

2. Step driven tests

In general, test scenarios are defined by a deployment used in the particular test case definition. The chosen deployment scenario will take care of the vSwitch configuration, deployment of VNFs and it can also affect configuration of a traffic generator. In order to allow a more flexible way of testcase scripting, VSPERF supports a detailed step driven testcase definition. It can be used to configure and program vSwitch, deploy and terminate VNFs, execute a traffic generator, modify a VSPERF configuration, execute external commands, etc.

Execution of step driven tests is done in a step by step workflow starting with step 0, as defined inside the test case. Each step of the test increments the step number by one, which is indicated in the log.

(testcases.integration) - Step 0 'vswitch add_vport ['br0']' start

Test steps are defined as a list of steps within a TestSteps item of test case definition. Each step is a list with following structure:

'[' [ optional-alias ',' ] test-object ',' test-function [ ',' optional-function-params ] '],'

Step driven tests can be used for both performance and integration testing. In case of integration test, each step in the test case is validated. If a step does not pass validation the test will fail and terminate. The test will continue until a failure is detected or all steps pass. A csv report file is generated after a test completes with an OK or FAIL result.

NOTE: It is possible to suppress the validation process of a given step by prefixing it with ! (exclamation mark). In the following example, test execution won’t fail if all traffic is dropped:

['!trafficgen', 'send_traffic', {}]

In case of performance test, the validation of steps is not performed and standard output files with results from traffic generator and underlying OS details are generated by vsperf.

Step driven testcases can be used in two different ways:

  1. Description of a full testcase - in this case the clean deployment is used to indicate that vsperf should neither configure the vSwitch nor deploy any VNF. The test shall perform all required vSwitch configuration and programming and deploy the required number of VNFs.

  2. Modification of an existing deployment - in this case, any of the supported deployments can be used to perform the initial vSwitch configuration and deployment of VNFs. Additional actions defined by TestSteps can be used to alter the vSwitch configuration or deploy additional VNFs. After the last step is processed, the test execution will continue with traffic execution.
2.1. Test objects and their functions

Every test step can call a function of one of the supported test objects. In general, any existing function of a supported test object can be called by a test step. In case step validation is required (valid for integration test steps which are not suppressed), an appropriate validate_ method must be implemented.

The list of supported objects and their most common functions is listed below. Please check implementation of test objects for full list of implemented functions and their parameters.

  • vswitch - provides functions for vSwitch configuration

    List of supported functions:

    • add_switch br_name - creates a new switch (bridge) with given br_name
    • del_switch br_name - deletes switch (bridge) with given br_name
    • add_phy_port br_name - adds a physical port into bridge specified by br_name
    • add_vport br_name - adds a virtual port into bridge specified by br_name
    • del_port br_name port_name - removes physical or virtual port specified by port_name from bridge br_name
    • add_flow br_name flow - adds flow specified by flow dictionary into the bridge br_name; Content of flow dictionary will be passed to the vSwitch. In case of Open vSwitch it will be passed to the ovs-ofctl add-flow command. Please see Open vSwitch documentation for the list of supported flow parameters.
    • del_flow br_name [flow] - deletes flow specified by flow dictionary from bridge br_name; In case that optional parameter flow is not specified or set to an empty dictionary {}, then all flows from bridge br_name will be deleted.
    • dump_flows br_name - dumps all flows from bridge specified by br_name
    • enable_stp br_name - enables Spanning Tree Protocol for bridge br_name
    • disable_stp br_name - disables Spanning Tree Protocol for bridge br_name
    • enable_rstp br_name - enables Rapid Spanning Tree Protocol for bridge br_name
    • disable_rstp br_name - disables Rapid Spanning Tree Protocol for bridge br_name
    • restart - restarts switch, which is useful for failover testcases

    Examples:

    ['vswitch', 'add_switch', 'int_br0']
    
    ['vswitch', 'del_switch', 'int_br0']
    
    ['vswitch', 'add_phy_port', 'int_br0']
    
    ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]']
    
    ['vswitch', 'add_flow', 'int_br0', {'in_port': '1', 'actions': ['output:2'],
     'idle_timeout': '0'}],
    
    ['vswitch', 'enable_rstp', 'int_br0']
    
  • vnf[ID] - provides functions for deployment and termination of VNFs; the optional alphanumerical ID is used for VNF identification in case the testcase deploys multiple VNFs.

    List of supported functions:

    • start - starts a VNF based on VSPERF configuration
    • stop - gracefully terminates given VNF
    • execute command [delay] - executes command inside the VNF; the optional delay defines the number of seconds to wait before the next step is executed. The method returns the command output as a string.
    • execute_and_wait command [timeout] [prompt] - executes command inside the VNF; the optional timeout defines the number of seconds to wait until prompt is detected. The optional prompt defines a string which is used as detection of successful command execution. In case prompt is not defined, the content of the GUEST_PROMPT_LOGIN parameter will be used. The method returns the command output as a string.

    Examples:

    ['vnf1', 'start'],
    ['vnf2', 'start'],
    ['vnf1', 'execute_and_wait', 'ifconfig eth0 5.5.5.1/24 up'],
    ['vnf2', 'execute_and_wait', 'ifconfig eth0 5.5.5.2/24 up', 120, 'root.*#'],
    ['vnf2', 'execute_and_wait', 'ping -c1 5.5.5.1'],
    ['vnf2', 'stop'],
    ['vnf1', 'stop'],
    
  • VNF[ID] - provides access to VNFs deployed automatically by the testcase deployment scenario. For example, the pvvp deployment automatically starts two VNFs before any TestStep is executed. It is possible to access these VNFs via the VNF0 and VNF1 labels.

    The list of supported functions is identical to the vnf[ID] option above, except for the functions start and stop.

    Examples:

    ['VNF0', 'execute_and_wait', 'ifconfig eth2 5.5.5.1/24 up'],
    ['VNF1', 'execute_and_wait', 'ifconfig eth2 5.5.5.2/24 up', 120, 'root.*#'],
    ['VNF2', 'execute_and_wait', 'ping -c1 5.5.5.1'],
    
  • trafficgen - triggers traffic generation

    List of supported functions:

    • send_traffic traffic - starts traffic based on the vsperf configuration and the given traffic dictionary. More details about the traffic dictionary and its possible values are available in the Traffic Generator Integration Guide
    • get_results - returns dictionary with results collected from previous execution of send_traffic

    Examples:

    ['trafficgen', 'send_traffic', {'traffic_type' : 'rfc2544_throughput'}]
    
    ['trafficgen', 'send_traffic', {'traffic_type' : 'rfc2544_back2back', 'bidir' : 'True'}],
    ['trafficgen', 'get_results'],
    ['tools', 'assert', '#STEP[-1][0]["frame_loss_percent"] < 0.05'],
    
  • settings - reads or modifies VSPERF configuration

    List of supported functions:

    • getValue param - returns value of given param
    • setValue param value - sets value of param to given value
    • resetValue param - if param was overridden by TEST_PARAMS (e.g. by “Parameters” section of the test case definition), then it will be set to its original value.

    Examples:

    ['settings', 'getValue', 'TOOLS']
    
    ['settings', 'setValue', 'GUEST_USERNAME', ['root']]
    
    ['settings', 'resetValue', 'WHITELIST_NICS'],
    

    It is possible, and more convenient, to access any VSPERF configuration option directly via the $NAME notation. Option evaluation is done during runtime and vsperf will automatically translate it to the appropriate call of settings.getValue. If the referred parameter does not exist, vsperf will keep the $NAME string untouched and continue with testcase execution. The reason is to avoid test execution failure in case the $ sign has been used for a reason other than vsperf parameter evaluation.

    NOTE: It is recommended to use ${NAME} notation for any shell parameters used within Exec_Shell call to avoid a clash with configuration parameter evaluation.

    NOTE: It is possible to refer to a vsperf parameter value by the #PARAM() macro (see Overriding values defined in configuration files). However, the #PARAM() macro is evaluated at the beginning of vsperf execution and will not reflect any changes made to the vsperf configuration during runtime. On the other hand, the $NAME notation is evaluated during test execution and thus contains any modifications to the configuration parameter made by vsperf (e.g. TOOLS and NICS dictionaries) or by the testcase definition (e.g. TRAFFIC dictionary).

    Examples:

    ['tools', 'exec_shell', "$TOOLS['ovs-vsctl'] show"]
    
    ['settings', 'setValue', 'TRAFFICGEN_IXIA_PORT2', '$TRAFFICGEN_IXIA_PORT1'],
    
    ['vswitch', 'add_flow', 'int_br0',
     {'in_port': '#STEP[1][1]',
      'dl_type': '0x800',
      'nw_proto': '17',
      'nw_dst': '$TRAFFIC["l3"]["dstip"]/8',
      'actions': ['output:#STEP[2][1]']
     }
    ]
    
  • namespace - creates or modifies network namespaces

    List of supported functions:

    • create_namespace name - creates new namespace with given name
    • delete_namespace name - deletes namespace specified by its name
    • assign_port_to_namespace port name [port_up] - assigns NIC specified by port into given namespace name; If optional parameter port_up is set to True, then port will be brought up.
    • add_ip_to_namespace_eth port name addr cidr - assigns an IP address addr/cidr to the NIC specified by port within namespace name
    • reset_port_to_root port name - returns given port from namespace name back to the root namespace

    Examples:

    ['namespace', 'create_namespace', 'testns']
    
    ['namespace', 'assign_port_to_namespace', 'eth0', 'testns']
    
  • veth - manipulates eth and veth devices

    List of supported functions:

    • add_veth_port port peer_port - adds a pair of veth ports named port and peer_port
    • del_veth_port port peer_port - deletes a veth port pair specified by port and peer_port
    • bring_up_eth_port eth_port [namespace] - brings up eth_port in (optional) namespace

    Examples:

    ['veth', 'add_veth_port', 'veth', 'veth1']
    
    ['veth', 'bring_up_eth_port', 'eth1']
    
  • tools - provides a set of helper functions

    List of supported functions:

    • Assert condition - evaluates the given condition and raises AssertionError in case the condition is not True
    • Eval expression - evaluates the given expression as python code and returns its result
    • Exec_Shell command - executes a shell command and waits until it finishes
    • Exec_Shell_Background command - executes a shell command in the background; the command will be automatically terminated at the end of testcase execution.
    • Exec_Python code - executes python code

    Examples:

    ['tools', 'exec_shell', 'numactl -H', 'available: ([0-9]+)']
    ['tools', 'assert', '#STEP[-1][0]>1']
    
  • wait - is used for test case interruption. This object doesn’t have any functions. Once reached, vsperf will pause test execution and wait for the Enter key to be pressed. It can be used during testcase design for debugging purposes.

    Examples:

    ['wait']
    
  • sleep - is used to pause testcase execution for a defined number of seconds.

    Examples:

    ['sleep', '60']
    
  • log level message - is used to log a message of the given level into the vsperf output. Level is one of info, debug, warning or error.

    Examples:

    ['log', 'error', 'tools $TOOLS']
    
  • pdb - executes python debugger

    Examples:

    ['pdb']
    
2.2. Test Macros

Test profiles can include macros as part of the test steps. Each step in the profile may return a value, such as a port name. Recall macros use #STEP to indicate the recalled value inside the returned structure. If the method a test step calls returns a value, it can be recalled later, for example:

{
    "Name": "vswitch_add_del_vport",
    "Deployment": "clean",
    "Description": "vSwitch - add and delete virtual port",
    "TestSteps": [
            ['vswitch', 'add_switch', 'int_br0'],               # STEP 0
            ['vswitch', 'add_vport', 'int_br0'],                # STEP 1
            ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],  # STEP 2
            ['vswitch', 'del_switch', 'int_br0'],               # STEP 3
         ]
}

This test profile uses the vswitch add_vport method which returns a string value of the port added. This is later called by the del_port method using the name from step 1.

It is also possible to use negative indexes in step macros. In that case #STEP[-1] will refer to the result of the previous step, #STEP[-2] will refer to the result of the step before the previous step, etc. This means that you could change STEP 2 from the previous example to achieve the same functionality:

['vswitch', 'del_port', 'int_br0', '#STEP[-1][0]'],  # STEP 2
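Resolution of both positive and negative #STEP indexes can be sketched as a lookup into the list of recorded step results (illustrative, not vsperf's parser):

```python
import re

# Resolve a '#STEP[n][i]' macro against previously recorded step results;
# negative n counts back from the most recent step, like Python indexing.
def resolve_step(macro, results):
    step, idx = re.match(r'#STEP\[(-?\d+)\]\[(\d+)\]', macro).groups()
    return results[int(step)][int(idx)]

results = [['int_br0'], ('dpdkvhostuser0', 3)]
print(resolve_step('#STEP[1][0]', results))   # dpdkvhostuser0
print(resolve_step('#STEP[-1][0]', results))  # dpdkvhostuser0
```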

Another option for referring to previous values is to define an alias for a given step by its first argument with a ‘#’ prefix. An alias must be unique and it can’t be a number. Example of step alias usage:

['#port1', 'vswitch', 'add_vport', 'int_br0'],
['vswitch', 'del_port', 'int_br0', '#STEP[port1][0]'],

Commonly used steps can also be created as a separate profile.

STEP_VSWITCH_PVP_INIT = [
    ['vswitch', 'add_switch', 'int_br0'],           # STEP 0
    ['vswitch', 'add_phy_port', 'int_br0'],         # STEP 1
    ['vswitch', 'add_phy_port', 'int_br0'],         # STEP 2
    ['vswitch', 'add_vport', 'int_br0'],            # STEP 3
    ['vswitch', 'add_vport', 'int_br0'],            # STEP 4
]

This profile can then be used inside other testcases:

{
    "Name": "vswitch_pvp",
    "Deployment": "clean",
    "Description": "vSwitch - configure switch and one vnf",
    "TestSteps": STEP_VSWITCH_PVP_INIT +
                 [
                    ['vnf', 'start'],
                    ['vnf', 'stop'],
                 ] +
                 STEP_VSWITCH_PVP_FINIT
}

It is possible to refer to vsperf configuration parameters within step macros. Please see step-driven-tests-variable-usage for more details.

In case a step returns a string or a list of strings, it is possible to filter the output by a regular expression. This optional filter can be specified as the last step parameter, with the prefix ‘|’. Output will be split into separate lines and only matching records will be returned. It is also possible to return a specified group of characters from the matching lines, e.g. by the regex |ID (\d+).

Examples:

['tools', 'exec_shell', "sudo $TOOLS['ovs-appctl'] dpif-netdev/pmd-rxq-show",
 '|dpdkvhostuser0\s+queue-id: \d'],
['tools', 'assert', 'len(#STEP[-1])==1'],

['vnf', 'execute_and_wait', 'ethtool -L eth0 combined 2'],
['vnf', 'execute_and_wait', 'ethtool -l eth0', '|Combined:\s+2'],
['tools', 'assert', 'len(#STEP[-1])==2']
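The filtering behaviour can be sketched in plain Python (an illustration of the ‘|’ filter semantics described above, not vsperf's actual implementation):

```python
import re

def filter_output(output, regex):
    """Mimic the '|regex' step filter: split output into lines, keep
    matching lines; if the regex defines a group, return the group."""
    matches = []
    for line in output.splitlines():
        found = re.search(regex, line)
        if found:
            matches.append(found.group(1) if found.groups() else line)
    return matches

# Hypothetical command output used only for illustration.
output = "port: dpdkvhostuser0 ID 5\nport: dpdkvhostuser1 ID 7\n"

# Without a group, matching lines are returned whole.
assert len(filter_output(output, r'dpdkvhostuser0')) == 1
# With a group, only the captured characters are returned.
assert filter_output(output, r'ID (\d+)') == ['5', '7']
```

The assert steps in the examples above then simply check the length of the filtered list, e.g. `len(#STEP[-1])==2`.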
2.3. HelloWorld and other basic Testcases

The following examples are for demonstration purposes. You can run them by copying and pasting into the conf/integration/01_testcases.conf file. A command-line instruction is shown at the end of each example.

2.3.1. HelloWorld

The first example is a HelloWorld testcase. It creates a bridge with two physical ports, then sets up a flow to drop incoming packets from the port that was instantiated in STEP #1. There is no interaction with the traffic generator. The flow, the two ports and the bridge are then deleted. The ‘add_phy_port’ method creates a ‘dpdk’ type interface that will manage the physical port. The string value returned is the port name that will later be referred to by ‘del_port’.

{
    "Name": "HelloWorld",
    "Description": "My first testcase",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['drop'], 'idle_timeout': '0'}],
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]

},

To run HelloWorld test:

./vsperf --conf-file user_settings.py --integration HelloWorld
2.3.2. Specify a Flow by the IP address

The next example shows how to explicitly set up a flow by specifying a destination IP address. All packets received on the port created in STEP #1 with the destination IP address 90.90.90.90 will be forwarded to the port created in STEP #2.

{
    "Name": "p2p_rule_l3da",
    "Description": "Phy2Phy with rule on L3 Dest Addr",
    "Deployment": "clean",
    "biDirectional": "False",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_dst': '90.90.90.90', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous'}],
        ['vswitch', 'dump_flows', 'int_br0'],   # STEP 5
        ['vswitch', 'del_flow', 'int_br0'],     # STEP 6 == del-flows
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration p2p_rule_l3da
2.3.3. Multistream feature

The next testcase uses the multistream feature. The traffic generator sends packets with different UDP ports, which is accomplished by the “Stream Type” and “MultiStream” keywords. Four different flows are set up to forward all incoming packets.

{
    "Name": "multistream_l4",
    "Description": "Multistream on UDP ports",
    "Deployment": "clean",
    "Parameters": {
        'TRAFFIC' : {
            "multistream": 4,
            "stream_type": "L4",
        },
    },
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '0', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '1', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '2', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '3', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Send mono-dir traffic
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
     ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration multistream_l4
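Because testcase definitions live in plain Python configuration files, repetitive steps such as the four UDP flows above can also be generated programmatically. A sketch (the helper name is ours, not part of vsperf):

```python
def udp_flow_steps(streams):
    """Build one 'add_flow' step per UDP destination port, mirroring
    the four hand-written flows in the multistream example above."""
    steps = []
    for port in range(streams):
        steps.append(['vswitch', 'add_flow', 'int_br0',
                      {'in_port': '#STEP[1][1]', 'dl_type': '0x0800',
                       'nw_proto': '17', 'udp_dst': str(port),
                       'actions': ['output:#STEP[2][1]'],
                       'idle_timeout': '0'}])
    return steps

steps = udp_flow_steps(4)
assert len(steps) == 4
assert steps[3][3]['udp_dst'] == '3'
```

The resulting list can be concatenated into "TestSteps" the same way STEP_VSWITCH_PVP_INIT is reused earlier in this chapter.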
2.3.4. PVP with a VM Replacement

This example launches a first VM in a PVP topology and then replaces it with another VM. When the VNF setup parameter in ./conf/04_vnf.conf is “QemuDpdkVhostUser”, the ‘add_vport’ method creates a ‘dpdkvhostuser’ type port to connect a VM.

{
    "Name": "ex_replace_vm",
    "Description": "PVP with VM replacement",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4

        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[2][1]', \
            'actions': ['output:#STEP[4][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[3][1]', \
            'actions': ['output:#STEP[1][1]'], 'idle_timeout': '0'}],

        # Start VM 1
        ['vnf1', 'start'],
        # Now we want to replace VM 1 with another VM
        ['vnf1', 'stop'],

        ['vswitch', 'add_vport', 'int_br0'],        # STEP 11    vm2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 12
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['output:#STEP[11][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[12][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],

        # Start VM 2
        ['vnf2', 'start'],
        ['vnf2', 'stop'],
        ['vswitch', 'dump_flows', 'int_br0'],

        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],    # vm1
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[11][0]'],   # vm2
        ['vswitch', 'del_port', 'int_br0', '#STEP[12][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration ex_replace_vm
2.3.5. VM with a Linux bridge

This example sets up a PVP topology and routes traffic to the VM based on the destination IP address. A command-line parameter is used to select a Linux bridge as the guest loopback application. The guest loopback application can also be selected via the configuration option GUEST_LOOPBACK.

{
    "Name": "ex_pvp_rule_l3da",
    "Description": "PVP with flow on L3 Dest Addr",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4
        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_dst': '90.90.90.90', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        # Each pkt from the VM is forwarded to the 2nd dpdk port
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Start VMs
        ['vnf1', 'start'],
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        ['vnf1', 'stop'],
        # Clean up
        ['vswitch', 'dump_flows', 'int_br0'],       # STEP 10
        ['vswitch', 'del_flow', 'int_br0'],         # STEP 11
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],  # vm1 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --test-params \
        "GUEST_LOOPBACK=['linux_bridge']" --integration ex_pvp_rule_l3da
2.3.6. Forward packets based on UDP port

This example launches two VMs connected in parallel. Incoming packets are forwarded to one specific VM depending on the destination UDP port.

{
    "Name": "ex_2pvp_rule_l4dp",
    "Description": "2 PVP with flows on L4 Dest Port",
    "Deployment": "clean",
    "Parameters": {
        'TRAFFIC' : {
            "multistream": 2,
            "stream_type": "L4",
        },
    },
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 5    vm2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 6
        # Setup flows to reply to ICMPv6 and similar packets, so as to
        # avoid flooding the internal port with their re-transmissions
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:01', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:02', \
            'actions': ['output:#STEP[4][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:03', \
            'actions': ['output:#STEP[5][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:04', \
            'actions': ['output:#STEP[6][1]'], 'idle_timeout': '0'}],
        # Forward UDP packets depending on dest port
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '0', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '1', \
            'actions': ['output:#STEP[5][1]'], 'idle_timeout': '0'}],
        # Send VM output to phy port #2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[6][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Start VMs
        ['vnf1', 'start'],                          # STEP 16
        ['vnf2', 'start'],                          # STEP 17
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        ['vnf1', 'stop'],
        ['vnf2', 'stop'],
        ['vswitch', 'dump_flows', 'int_br0'],
        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],  # vm1 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[5][0]'],  # vm2 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[6][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

The same test can be written in a shorter form using “Deployment” : “pvpv”.

To run the test:

./vsperf --conf-file user_settings.py --integration ex_2pvp_rule_l4dp
2.3.7. Modification of existing PVVP deployment

This is an example of modifying a standard deployment scenario with additional TestSteps. The standard PVVP scenario is used to configure a vSwitch and to deploy two VNFs connected in series. Additional TestSteps then deploy a third VNF and connect it in parallel to the already configured VNFs. The traffic generator is instructed (via the Multistream feature) to send two separate traffic streams: one stream is sent to the standalone VNF and the second to the two chained VNFs.

If the test is defined as a performance test, traffic results will be collected and made available in both the csv and rst report files.

{
    "Name": "pvvp_pvp_cont",
    "Deployment": "pvvp",
    "Description": "PVVP and PVP in parallel with Continuous Stream",
    "Parameters" : {
        "TRAFFIC" : {
            "traffic_type" : "rfc2544_continuous",
            "multistream": 2,
        },
    },
    "TestSteps": [
                    ['vswitch', 'add_vport', '$VSWITCH_BRIDGE_NAME'],
                    ['vswitch', 'add_vport', '$VSWITCH_BRIDGE_NAME'],
                    # priority must be higher than default 32768, otherwise flows won't match
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME',
                     {'in_port': '1', 'actions': ['output:#STEP[-2][1]'], 'idle_timeout': '0', 'dl_type':'0x0800',
                                                  'nw_proto':'17', 'tp_dst':'0', 'priority': '33000'}],
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME',
                     {'in_port': '2', 'actions': ['output:#STEP[-2][1]'], 'idle_timeout': '0', 'dl_type':'0x0800',
                                                  'nw_proto':'17', 'tp_dst':'0', 'priority': '33000'}],
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME', {'in_port': '#STEP[-4][1]', 'actions': ['output:1'],
                                                    'idle_timeout': '0'}],
                    ['vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME', {'in_port': '#STEP[-4][1]', 'actions': ['output:2'],
                                                    'idle_timeout': '0'}],
                    ['vswitch', 'dump_flows', '$VSWITCH_BRIDGE_NAME'],
                    ['vnf1', 'start'],
                 ]
},

To run the test:

./vsperf --conf-file user_settings.py pvvp_pvp_cont
3. Integration tests

VSPERF includes a set of integration tests defined in conf/integration. These tests can be run by specifying --integration as a parameter to vsperf. Current tests in conf/integration include switch functionality and overlay tests.

Tests in the conf/integration can be used to test scaling of different switch configurations by adding steps into the test case.

For the overlay tests VSPERF supports the VXLAN, GRE and GENEVE tunneling protocols. Testing of these protocols is limited to unidirectional traffic and P2P (Physical to Physical) scenarios.

NOTE: The configuration for overlay tests provided in this guide is for unidirectional traffic only.

NOTE: The overlay tests require an IxNet traffic generator. The tunneled traffic is configured by the ixnetrfc2544v2.tcl script. This script can be used with all supported deployment scenarios to generate frames with the VXLAN, GRE or GENEVE protocols. In that case the options “Tunnel Operation” and “TRAFFICGEN_IXNET_TCL_SCRIPT” must be properly configured in the testcase definition.

3.1. Executing Integration Tests

Integration tests are executed by running VSPERF with the --integration parameter. To view the current test list, simply execute the following command:

./vsperf --integration --list

The standard tests included are defined inside the conf/integration/01_testcases.conf file.

3.2. Executing Tunnel encapsulation tests

The VXLAN OVS DPDK encapsulation tests require IPs, MAC addresses, bridge names and WHITELIST_NICS for DPDK.

NOTE: Only Ixia traffic generators currently support the execution of the tunnel encapsulation tests. Support for other traffic generators may come in a future release.

Default values are already provided. To customize them for your environment, override the following variables in your user_settings.py file:

# Variables defined in conf/integration/02_vswitch.conf
# Tunnel endpoint for Overlay P2P deployment scenario
# used for br0
VTEP_IP1 = '192.168.0.1/24'

# Used as remote_ip when adding the OVS tunnel port and
# to set the ARP entry in OVS (e.g. tnl/arp/set br-ext 192.168.240.10 02:00:00:00:00:02)
VTEP_IP2 = '192.168.240.10'

# Network to use when adding a route for inner frame data
VTEP_IP2_SUBNET = '192.168.240.0/24'

# Bridge names
TUNNEL_INTEGRATION_BRIDGE = 'vsperf-br0'
TUNNEL_EXTERNAL_BRIDGE = 'vsperf-br-ext'

# IP of br-ext
TUNNEL_EXTERNAL_BRIDGE_IP = '192.168.240.1/24'

# vxlan|gre|geneve
TUNNEL_TYPE = 'vxlan'

# Variables defined conf/integration/03_traffic.conf
# For OP2P deployment scenario
TRAFFICGEN_PORT1_MAC = '02:00:00:00:00:01'
TRAFFICGEN_PORT2_MAC = '02:00:00:00:00:02'
TRAFFICGEN_PORT1_IP = '1.1.1.1'
TRAFFICGEN_PORT2_IP = '192.168.240.10'
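When overriding these values it is easy to leave the route and the tunnel endpoint inconsistent. A small stand-alone sanity check (our addition, not part of vsperf) can catch that before a test run:

```python
# Stand-alone consistency check for the default tunnel settings above.
import ipaddress

VTEP_IP2 = '192.168.240.10'
VTEP_IP2_SUBNET = '192.168.240.0/24'
TRAFFICGEN_PORT2_IP = '192.168.240.10'

# The remote tunnel endpoint must lie inside the routed subnet,
# otherwise the inner-frame route will never reach it.
assert ipaddress.ip_address(VTEP_IP2) in ipaddress.ip_network(VTEP_IP2_SUBNET)

# In the OP2P scenario the traffic generator's second port acts as
# the remote endpoint, so the two addresses should match.
assert TRAFFICGEN_PORT2_IP == VTEP_IP2
```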

To run VXLAN encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=vxlan' overlay_p2p_tput

To run GRE encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=gre' overlay_p2p_tput

To run GENEVE encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=geneve' overlay_p2p_tput

To run OVS NATIVE tunnel tests (VXLAN/GRE/GENEVE):

  1. Install the OVS kernel modules
cd src/ovs/ovs
sudo -E make modules_install
  2. Set the following variables:
VSWITCH = 'OvsVanilla'
# Specify vport_* kernel module to test.
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_vxlan',
    'vport_gre',
    'vport_geneve',
    'datapath/linux/openvswitch.ko',
]

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  3. Run tests:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=vxlan' overlay_p2p_tput
3.3. Executing VXLAN decapsulation tests

To run VXLAN decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Run test:
./vsperf --conf-file user_settings.py --integration overlay_p2p_decap_cont

If you want to use different values for your VXLAN frame, you may set:

VXLAN_FRAME_L3 = {'proto': 'udp',
                  'packetsize': 64,
                  'srcip': TRAFFICGEN_PORT1_IP,
                  'dstip': '192.168.240.1',
                 }
VXLAN_FRAME_L4 = {'srcport': 4789,
                  'dstport': 4789,
                  'vni': VXLAN_VNI,
                  'inner_srcmac': '01:02:03:04:05:06',
                  'inner_dstmac': '06:05:04:03:02:01',
                  'inner_srcip': '192.168.0.10',
                  'inner_dstip': '192.168.240.9',
                  'inner_proto': 'udp',
                  'inner_srcport': 3000,
                  'inner_dstport': 3001,
                 }
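For reference when sizing test traffic: VXLAN encapsulation over IPv4 adds 50 bytes of outer headers to each inner frame. A quick sketch of the arithmetic (a general protocol fact, not vsperf code):

```python
# Outer header sizes in bytes for VXLAN over IPv4 (no VLAN tag):
# Ethernet (14) + IPv4 (20) + UDP (8) + VXLAN (8) = 50 bytes overhead.
OUTER_ETH, OUTER_IPV4, OUTER_UDP, VXLAN_HDR = 14, 20, 8, 8

def vxlan_outer_frame_size(inner_frame_size):
    """Return the on-wire size of a VXLAN-encapsulated inner frame."""
    return inner_frame_size + OUTER_ETH + OUTER_IPV4 + OUTER_UDP + VXLAN_HDR

# A 64 byte inner frame becomes a 114 byte encapsulated frame, and a
# 1450 byte inner frame just fits a standard 1500 byte MTU.
assert vxlan_outer_frame_size(64) == 114
assert vxlan_outer_frame_size(1450) == 1500
```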
3.4. Executing GRE decapsulation tests

To run GRE decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Run test:
./vsperf --conf-file user_settings.py --test-params 'TUNNEL_TYPE=gre' \
         --integration overlay_p2p_decap_cont

If you want to use different values for your GRE frame, you may set:

GRE_FRAME_L3 = {'proto': 'gre',
                'packetsize': 64,
                'srcip': TRAFFICGEN_PORT1_IP,
                'dstip': '192.168.240.1',
               }

GRE_FRAME_L4 = {'srcport': 0,
                'dstport': 0,
                'inner_srcmac': '01:02:03:04:05:06',
                'inner_dstmac': '06:05:04:03:02:01',
                'inner_srcip': '192.168.0.10',
                'inner_dstip': '192.168.240.9',
                'inner_proto': 'udp',
                'inner_srcport': 3000,
                'inner_dstport': 3001,
               }
3.5. Executing GENEVE decapsulation tests

IxNet 7.3X does not natively support the GENEVE protocol. The template GeneveIxNetTemplate.xml_ClearText.xml should be imported into IxNetwork for this testcase to work.

To import the template, do the following:

  1. Run the IxNetwork TCL Server
  2. Click on the Traffic menu
  3. Click on the Traffic actions and click Edit Packet Templates
  4. On the Template editor window, click Import. Select the template located at 3rd_party/ixia/GeneveIxNetTemplate.xml_ClearText.xml and click Import.
  5. Restart the TCL Server.

To run GENEVE decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Run test:
./vsperf --conf-file user_settings.py --test-params 'tunnel_type=geneve' \
         --integration overlay_p2p_decap_cont

If you want to use different values for your GENEVE frame, you may set:

GENEVE_FRAME_L3 = {'proto': 'udp',
                   'packetsize': 64,
                   'srcip': TRAFFICGEN_PORT1_IP,
                   'dstip': '192.168.240.1',
                  }

GENEVE_FRAME_L4 = {'srcport': 6081,
                   'dstport': 6081,
                   'geneve_vni': 0,
                   'inner_srcmac': '01:02:03:04:05:06',
                   'inner_dstmac': '06:05:04:03:02:01',
                   'inner_srcip': '192.168.0.10',
                   'inner_dstip': '192.168.240.9',
                   'inner_proto': 'udp',
                   'inner_srcport': 3000,
                   'inner_dstport': 3001,
                  }
3.6. Executing Native/Vanilla OVS VXLAN decapsulation tests

To run VXLAN decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_vxlan',
    'datapath/linux/openvswitch.ko',
]

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

VXLAN_FRAME_L2 = {'srcmac':
                  '01:02:03:04:05:06',
                  'dstmac':
                  '06:05:04:03:02:01',
                 }

VXLAN_FRAME_L3 = {'proto': 'udp',
                  'packetsize': 64,
                  'srcip': TRAFFICGEN_PORT1_IP,
                  'dstip': '172.16.1.1',
                 }

VXLAN_FRAME_L4 = {
                  'srcport': 4789,
                  'dstport': 4789,
                  'protocolpad': 'true',
                  'vni': 99,
                  'inner_srcmac': '01:02:03:04:05:06',
                  'inner_dstmac': '06:05:04:03:02:01',
                  'inner_srcip': '192.168.1.2',
                  'inner_dstip': TRAFFICGEN_PORT2_IP,
                  'inner_proto': 'udp',
                  'inner_srcport': 3000,
                  'inner_dstport': 3001,
                 }

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=vxlan' overlay_p2p_decap_cont
3.7. Executing Native/Vanilla OVS GRE decapsulation tests

To run GRE decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_gre',
    'datapath/linux/openvswitch.ko',
]

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

GRE_FRAME_L2 = {'srcmac':
                '01:02:03:04:05:06',
                'dstmac':
                '06:05:04:03:02:01',
               }

GRE_FRAME_L3 = {'proto': 'udp',
                'packetsize': 64,
                'srcip': TRAFFICGEN_PORT1_IP,
                'dstip': '172.16.1.1',
               }

GRE_FRAME_L4 = {
                'srcport': 4789,
                'dstport': 4789,
                'protocolpad': 'true',
                'inner_srcmac': '01:02:03:04:05:06',
                'inner_dstmac': '06:05:04:03:02:01',
                'inner_srcip': '192.168.1.2',
                'inner_dstip': TRAFFICGEN_PORT2_IP,
                'inner_proto': 'udp',
                'inner_srcport': 3000,
                'inner_dstport': 3001,
               }

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=gre' overlay_p2p_decap_cont
3.8. Executing Native/Vanilla OVS GENEVE decapsulation tests

To run GENEVE decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_geneve',
    'datapath/linux/openvswitch.ko',
]

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

GENEVE_FRAME_L2 = {'srcmac':
                   '01:02:03:04:05:06',
                   'dstmac':
                   '06:05:04:03:02:01',
                  }

GENEVE_FRAME_L3 = {'proto': 'udp',
                   'packetsize': 64,
                   'srcip': TRAFFICGEN_PORT1_IP,
                   'dstip': '172.16.1.1',
                  }

GENEVE_FRAME_L4 = {'srcport': 6081,
                   'dstport': 6081,
                   'protocolpad': 'true',
                   'geneve_vni': 0,
                   'inner_srcmac': '01:02:03:04:05:06',
                   'inner_dstmac': '06:05:04:03:02:01',
                   'inner_srcip': '192.168.1.2',
                   'inner_dstip': TRAFFICGEN_PORT2_IP,
                   'inner_proto': 'udp',
                   'inner_srcport': 3000,
                   'inner_dstport': 3001,
                  }

NOTE: If Vanilla OVS is installed from a binary package, set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=geneve' overlay_p2p_decap_cont
3.9. Executing Tunnel encapsulation+decapsulation tests

The OVS DPDK encapsulation/decapsulation tests require IPs, MAC addresses, bridge names and WHITELIST_NICS for DPDK.

In contrast to the test cases above, these tests exercise tunnel encapsulation and decapsulation without any ingress overlay traffic. To achieve this, OVS is configured to perform encapsulation and decapsulation in series on the same traffic stream, as shown below:

TRAFFIC-IN --> [ENCAP] --> [MOD-PKT] --> [DECAP] --> TRAFFIC-OUT

Default values are already provided. To customize them for your environment, override the following variables in your user_settings.py file:

# Variables defined in conf/integration/02_vswitch.conf

# Bridge names
TUNNEL_EXTERNAL_BRIDGE1 = 'br-phy1'
TUNNEL_EXTERNAL_BRIDGE2 = 'br-phy2'
TUNNEL_MODIFY_BRIDGE1 = 'br-mod1'
TUNNEL_MODIFY_BRIDGE2 = 'br-mod2'

# IP of br-mod1
TUNNEL_MODIFY_BRIDGE_IP1 = '10.0.0.1/24'

# MAC of br-mod1
TUNNEL_MODIFY_BRIDGE_MAC1 = '00:00:10:00:00:01'

# IP of br-mod2
TUNNEL_MODIFY_BRIDGE_IP2 = '20.0.0.1/24'

# MAC of br-mod2
TUNNEL_MODIFY_BRIDGE_MAC2 = '00:00:20:00:00:01'

# vxlan|gre|geneve, Only VXLAN is supported for now.
TUNNEL_TYPE = 'vxlan'

To run VXLAN encapsulation+decapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         overlay_p2p_mod_tput
1. Traffic Capture

The ability to capture traffic at multiple points of the system is crucial to many of the functional tests. It allows verification of functionality for both the vSwitch and NICs that use hardware acceleration for packet manipulation and modification.

There are three different methods of traffic capture supported by VSPERF. Detailed descriptions of these methods, as well as their pros and cons, can be found in the following chapters.

1.1. Traffic Capture inside of a VM

This method uses the standard PVP scenario, in which the vSwitch first processes and modifies the packet before forwarding it to the VM. Inside the VM the traffic is captured using tcpdump or a similar technique. The captured information is then used to verify the expected modifications made to the packet by the vSwitch.

                                                     _
+--------------------------------------------------+  |
|                                                  |  |
|   +------------------------------------------+   |  |
|   |  Traffic capture and Packet Forwarding   |   |  |
|   +------------------------------------------+   |  |
|          ^                            :          |  |
|          |                            |          |  |  Guest
|          :                            v          |  |
|   +---------------+          +---------------+   |  |
|   | logical port 0|          | logical port 1|   |  |
+---+---------------+----------+---------------+---+ _|
            ^                          :
            |                          |
            :                          v            _
+---+---------------+----------+---------------+---+  |
|   | logical port 0|          | logical port 1|   |  |
|   +---------------+          +---------------+   |  |
|           ^                          :           |  |
|           |                          |           |  |  Host
|           :                          v           |  |
|   +--------------+            +--------------+   |  |
|   |   phy port   |  vSwitch   |   phy port   |   |  |
+---+--------------+------------+--------------+---+ _|
            ^                          :
            |                          |
            :                          v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+

PROS:

  • supports testing with all traffic generators
  • easy to use and implement into test
  • allows testing hardware offloading on the ingress side

CONS:

  • does not allow testing hardware offloading on the egress side

An example of Traffic Capture in VM test:

# Capture Example 1 - Traffic capture inside VM (PVP scenario)
# This TestCase will modify VLAN ID set by the traffic generator to the new value.
# The correct VLAN ID setting is verified by inspecting captured frames.
{
    Name: capture_pvp_modify_vid,
    Deployment: pvp,
    Description: Test and verify VLAN ID modification by Open vSwitch,
    Parameters : {
        VSWITCH : OvsDpdkVhost, # works also for Vanilla OVS
        TRAFFICGEN_DURATION : 5,
        TRAFFIC : {
            traffic_type : rfc2544_continuous,
            frame_rate : 100,
            'vlan': {
                'enabled': True,
                'id': 8,
                'priority': 1,
                'cfi': 0,
            },
        },
        GUEST_LOOPBACK : ['linux_bridge'],
    },
    TestSteps: [
        # replace original flows with vlan ID modification
        ['!vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME', {'in_port': '1', 'actions': ['mod_vlan_vid:4','output:3']}],
        ['!vswitch', 'add_flow', '$VSWITCH_BRIDGE_NAME', {'in_port': '2', 'actions': ['mod_vlan_vid:4','output:4']}],
        ['vswitch', 'dump_flows', '$VSWITCH_BRIDGE_NAME'],
        # verify that received frames have modified vlan ID
        ['VNF0', 'execute_and_wait', 'tcpdump -i eth0 -c 5 -w dump.pcap vlan 4 &'],
        ['trafficgen', 'send_traffic',{}],
        ['!VNF0', 'execute_and_wait', 'tcpdump -qer dump.pcap vlan 4 2>/dev/null | wc -l','|^(\d+)$'],
        ['tools', 'assert', '#STEP[-1][0] == 5'],
    ],
},
1.2. Traffic Capture for testing NICs with HW offloading/acceleration

The NIC with hardware acceleration/offloading is inserted as an additional card into the server. Two ports on this card are then connected together using a patch cable as shown in the diagram. Only a single port of the tested NIC is set up with DPDK acceleration, while the other is handled by the Linux IP stack, allowing for traffic capture. The two NICs are then connected by the vSwitch so the original card can forward the processed packets to the traffic generator. The ports handled by the Linux IP stack allow for capturing packets, which are then analyzed for changes made by both the vSwitch and the NIC with hardware acceleration.

                                                   _
+------------------------------------------------+  |
|                                                |  |
|   +----------------------------------------+   |  |
|   |                 vSwitch                |   |  |
|   |  +----------------------------------+  |   |  |
|   |  |                                  |  |   |  |
|   |  |       +------------------+       |  |   |  |
|   |  |       |                  |       v  |   |  |
|   +----------------------------------------+   |  |  Device under Test
|      ^       |                  ^       |      |  |
|      |       |                  |       |      |  |
|      |       v                  |       v      |  |
|   +--------------+          +--------------+   |  |
|   |              |          | NIC w HW acc |   |  |
|   |   phy ports  |          |   phy ports  |   |  |
+---+--------------+----------+--------------+---+ _|
       ^       :                  ^       :
       |       |                  |       |
       |       |                  +-------+
       :       v                 Patch Cable
+------------------------------------------------+
|                                                |
|                traffic generator               |
|                                                |
+------------------------------------------------+

PROS:

  • allows testing hardware offloading on both the ingress and egress side
  • supports testing with all traffic generators
  • relatively easy to use and implement into tests

CONS:

  • a more complex setup with two cards
  • if the tested card only has one port, an additional card is needed

An example of a Traffic Capture test for NICs with HW offloading:

# Capture Example 2 - Setup with 2 NICs, where traffic is captured after it is
# processed by the NIC under test (2nd NIC). See documentation for further details.
# This TestCase will strip VLAN headers from traffic sent by the traffic generator.
# The removal of VLAN headers is verified by inspection of captured frames.
#
# NOTE: This setup expects a DUT with two NICs with two ports each. The first NIC is
# connected to the traffic generator (standard VSPERF setup). Ports of the second NIC
# are interconnected by a patch cable. PCI addresses of all four ports have to be
# properly configured in the WHITELIST_NICS parameter.
{
    Name: capture_p2p2p_strip_vlan_ovs,
    Deployment: clean,
    Description: P2P Continuous Stream,
    Parameters : {
        _CAPTURE_P2P2P_OVS_ACTION : 'strip_vlan',
        TRAFFIC : {
            bidir : False,
            traffic_type : rfc2544_continuous,
            frame_rate : 100,
            'l2': {
                'srcmac': 'ca:fe:00:00:00:00',
                'dstmac': '00:00:00:00:00:01'
            },
            'vlan': {
                'enabled': True,
                'id': 8,
                'priority': 1,
                'cfi': 0,
            },
        },
        # suppress DPDK configuration, so physical interfaces are not bound to DPDK driver
        'WHITELIST_NICS' : [],
        'NICS' : [],
    },
    TestSteps: _CAPTURE_P2P2P_SETUP + [
        # capture traffic after processing by the NIC under test (after possible egress HW offloading)
        ['tools', 'exec_shell_background', 'tcpdump -i [2][device] -c 5 -w capture.pcap '
                                           'ether src [l2][srcmac]'],
        ['trafficgen', 'send_traffic', {}],
        ['vswitch', 'dump_flows', '$VSWITCH_BRIDGE_NAME'],
        ['vswitch', 'dump_flows', 'br1'],
        # there must be 5 captured frames...
        ['tools', 'exec_shell', 'tcpdump -r capture.pcap | wc -l', '|^(\d+)$'],
        ['tools', 'assert', '#STEP[-1][0] == 5'],
        # ...but no vlan headers
        ['tools', 'exec_shell', 'tcpdump -r capture.pcap vlan | wc -l', '|^(\d+)$'],
        ['tools', 'assert', '#STEP[-1][0] == 0'],
    ],
},
1.3. Traffic Capture on the Traffic Generator

Using the functionality of the traffic generator makes it possible to configure traffic capture on both of its ports. With traffic capture enabled, VSPERF instructs the traffic generator to automatically export captured data into a pcap file. The captured packets are then sent to VSPERF for analysis and verification, monitoring any changes made by both the vSwitch and the NICs.

VSPERF currently supports this functionality only with the T-Rex traffic generator.

                                                     _
+--------------------------------------------------+  |
|                                                  |  |
|           +--------------------------+           |  |
|           |                          |           |  |
|           |                          v           |  |  Host
|   +--------------+            +--------------+   |  |
|   |   phy port   |  vSwitch   |   phy port   |   |  |
+---+--------------+------------+--------------+---+ _|
            ^                          :
            |                          |
            :                          v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+

PROS:

  • allows testing hardware offloading on both the ingress and egress side
  • does not require an additional NIC

CONS:

  • currently only supported by T-Rex traffic generator

An example of a Traffic Capture on the Traffic Generator test:

# Capture Example 3 - Traffic capture by the traffic generator.
# This TestCase uses an OVS flow to add a VLAN tag with a given ID into every
# frame sent by the traffic generator. Correct frame modification is verified by
# inspection of the packet capture received by T-Rex.
{
    Name: capture_p2p_add_vlan_ovs_trex,
    Deployment: clean,
    Description: OVS: Test VLAN tag modification and verify it by traffic capture,
    vSwitch : OvsDpdkVhost, # works also for Vanilla OVS
    Parameters : {
        TRAFFICGEN : Trex,
        TRAFFICGEN_DURATION : 5,
        TRAFFIC : {
            traffic_type : rfc2544_continuous,
            frame_rate : 100,
            # enable capture of five RX frames
            'capture': {
                'enabled': True,
                'tx_ports' : [],
                'rx_ports' : [1],
                'count' : 5,
            },
        },
    },
    TestSteps : STEP_VSWITCH_P2P_INIT + [
        # replace standard L2 flows by flows, which will add VLAN tag with ID 3
        ['!vswitch', 'add_flow', 'int_br0', {'in_port': '1', 'actions': ['mod_vlan_vid:3','output:2']}],
        ['!vswitch', 'add_flow', 'int_br0', {'in_port': '2', 'actions': ['mod_vlan_vid:3','output:1']}],
        ['vswitch', 'dump_flows', 'int_br0'],
        ['trafficgen', 'send_traffic', {}],
        ['trafficgen', 'get_results'],
        # verify that captured frames have vlan tag with ID 3
        ['tools', 'exec_shell', 'tcpdump -qer /#STEP[-1][0][capture_rx] vlan 3 '
                                '2>/dev/null | wc -l', '|^(\d+)$'],
        # number of received frames with expected VLAN id must match the number of captured frames
        ['tools', 'assert', '#STEP[-1][0] == 5'],
    ] + STEP_VSWITCH_P2P_FINIT,
},
4. Execution of vswitchperf testcases by Yardstick
4.1. General

Yardstick is a generic test execution framework, which is used for validation of the OPNFV platform installation. In the future, Yardstick will support two modes of vswitchperf testcase execution:

  • plugin mode, which will execute native vswitchperf testcases; tests will be executed natively by vsperf, and test results will be processed and reported by yardstick.
  • traffic generator mode, which will run vswitchperf in trafficgen mode only; the Yardstick framework will be used to launch VNFs and to configure flows to ensure that traffic is properly routed. This mode will make it possible to test OVS performance in real-world scenarios.

In the Colorado release only the traffic generator mode is supported.

4.2. Yardstick Installation

In order to run Yardstick testcases, you will need to prepare your test environment. Please follow the installation instructions to install Yardstick.

Please note that Yardstick uses OpenStack for execution of testcases. OpenStack must be installed with the Heat and Neutron services; otherwise vswitchperf testcases cannot be executed.

4.3. VM image with vswitchperf

A special VM image is required for execution of vswitchperf-specific testcases by Yardstick. It is possible to use a sample VM image available at the OPNFV artifactory or to build a customized image.

4.3.1. Sample VM image with vswitchperf

A sample VM image is available in the vswitchperf section of the OPNFV artifactory for free download:

$ wget http://artifacts.opnfv.org/vswitchperf/vnf/vsperf-yardstick-image.qcow2

This image can be used for execution of sample testcases with a dummy traffic generator.

NOTE: Traffic generators might require installation of client software. This software is not included in the sample image and must be installed by the user.

NOTE: This image will be updated only if new features related to Yardstick integration are added to vswitchperf.

4.3.2. Preparation of custom VM image

In general, any Linux distribution supported by vswitchperf can be used as a base image. One possibility is to modify the vloop-vnf image, which can be downloaded from http://artifacts.opnfv.org/vswitchperf.html/ (see vloop-vnf).

Please follow the Installing vswitchperf instructions to install vswitchperf inside the vloop-vnf image. As vswitchperf will be run in trafficgen mode, it is possible to skip installation and compilation of OVS, QEMU and DPDK to keep the image size smaller.

In case the selected traffic generator requires installation of additional client software, please follow the appropriate documentation. For example, in case of IXIA, you would need to install the IxOS and IxNetwork TCL APIs.

4.3.3. VM image usage

The image with vswitchperf must be uploaded into the glance service and a vswitchperf-specific flavor must be configured, e.g.:

$ glance --os-username admin --os-image-api-version 1 image-create --name \
  vsperf --is-public true --disk-format qcow2 --container-format bare --file \
  vsperf-yardstick-image.qcow2

$ nova --os-username admin flavor-create vsperf-flavor 100 2048 25 1
4.4. Testcase execution

After installation, Yardstick is available as a python package within a Yardstick-specific virtual environment. This means that the Yardstick environment must be enabled before the test execution, e.g.:

source ~/yardstick_venv/bin/activate

The next step is configuration of the OpenStack environment, e.g. in case of devstack:

source /opt/openstack/devstack/openrc
export EXTERNAL_NETWORK=public

Vswitchperf testcases executable by Yardstick are located in the vswitchperf repository inside the yardstick/tests directory. An example of their download and execution follows:

git clone https://gerrit.opnfv.org/gerrit/vswitchperf
cd vswitchperf

yardstick -d task start yardstick/tests/rfc2544_throughput_dummy.yaml

NOTE: The optional argument -d shows debug output.

4.5. Testcase customization

Yardstick testcases are described by YAML files. Vswitchperf-specific testcases are part of the vswitchperf repository and their yaml files can be found in the yardstick/tests directory. For a detailed description of the yaml file structure, please see the yardstick documentation and testcase samples. Only vswitchperf-specific parts will be discussed here.

Example of a yaml file:

...
scenarios:
-
  type: Vsperf
  options:
    testname: 'p2p_rfc2544_throughput'
    trafficgen_port1: 'eth1'
    trafficgen_port2: 'eth3'
    external_bridge: 'br-ex'
    test_params: 'TRAFFICGEN_DURATION=30;TRAFFIC={"traffic_type":"rfc2544_throughput"}'
    conf_file: '~/vsperf-yardstick.conf'

  host: vsperf.demo

  runner:
    type: Sequence
    scenario_option_name: frame_size
    sequence:
    - 64
    - 128
    - 512
    - 1024
    - 1518
  sla:
    metrics: 'throughput_rx_fps'
    throughput_rx_fps: 500000
    action: monitor

context:
...
4.5.1. Section option

The option section defines the details of a vswitchperf test scenario. Many options are identical to the vswitchperf parameters passed through the --test-params argument. The following options are supported:

  • frame_size - the packet size for which the test should be executed; multiple packet sizes can be tested by modifying the Sequence runner section inside the YAML definition. Default: ‘64’
  • conf_file - sets the path to the vswitchperf configuration file, which will be uploaded to the VM; Default: ‘~/vsperf-yardstick.conf’
  • setup_script - sets the path to the setup script, which will be executed during the setup and teardown phases
  • trafficgen_port1 - specifies the device name of the 1st interface connected to the trafficgen
  • trafficgen_port2 - specifies the device name of the 2nd interface connected to the trafficgen
  • external_bridge - specifies the name of the external bridge configured in OVS; Default: ‘br-ex’
  • test_params - specifies a string with a list of vsperf configuration parameters, which will be passed to the --test-params CLI argument; parameters should be stated in the form of param=value and separated by semicolons. Configuration of the traffic generator is driven by the TRAFFIC dictionary, which can also be updated by values defined in test_params. Please check the VSPERF documentation for details about available configuration parameters and their data types. In case both test_params and conf_file are specified, values from test_params will override values defined in the configuration file.

In case trafficgen_port1 and/or trafficgen_port2 are defined, these interfaces will be inserted into the external_bridge of OVS. It is expected that OVS runs on the same node where the testcase is executed. In case of a more complex OpenStack installation or a need for additional OVS configuration, setup_script can be used.

NOTE: It is essential to specify a configuration for the selected traffic generator. In case a standalone testcase is created, the traffic generator can be selected and configured directly in the YAML file by test_params. On the other hand, if multiple testcases should be executed with the same traffic generator settings, then a customized configuration file should be prepared and its name passed via the conf_file option.
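A minimal sketch of such a customized configuration file follows; the generator choice, host address, user and path are assumptions and must match the actual environment (parameter names follow the VSPERF T-Rex configuration options):

```python
# ~/vsperf-yardstick.conf -- illustrative traffic generator configuration.
# All values below are assumptions; adjust them to your deployment.
TRAFFICGEN = 'Trex'
TRAFFICGEN_TREX_HOST_IP_ADDR = '192.168.35.2'   # hypothetical T-Rex host
TRAFFICGEN_TREX_USER = 'root'
TRAFFICGEN_TREX_BASE_DIR = '/opt/trex/scripts/' # hypothetical install path
TRAFFICGEN_DURATION = 30
```

Parameters given here can still be overridden per testcase through the test_params option.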

4.5.2. Section runner

Yardstick supports several runner types. In case of vswitchperf-specific testcases, the Sequence runner type can be used to execute the testcase for a given list of frame sizes.

4.5.3. Section sla

In case the sla section is not defined, the testcase will always be considered successful. On the other hand, it is possible to define a set of test metrics and their minimal values to evaluate test success. Any numeric value reported by vswitchperf inside the CSV result file can be used. Multiple metrics can be defined as a comma-separated list of items. A minimal value must be set separately for each metric.

e.g.:

sla:
    metrics: 'throughput_rx_fps,throughput_rx_mbps'
    throughput_rx_fps: 500000
    throughput_rx_mbps: 1000

In case any of the defined metrics is lower than the defined value, the testcase will be marked as failed. Based on the action policy, yardstick will either stop test execution (value assert) or run the next test (value monitor).

NOTE The throughput SLA (or any other SLA) cannot be set to a meaningful value without knowledge of the server and networking environment, possibly including prior testing in that environment to establish a baseline SLA level under well-understood circumstances.

5. List of vswitchperf testcases
5.1. Performance testcases
Testcase Name Description
phy2phy_tput LTD.Throughput.RFC2544.PacketLossRatio
phy2phy_forwarding LTD.Forwarding.RFC2889.MaxForwardingRate
phy2phy_learning LTD.AddrLearning.RFC2889.AddrLearningRate
phy2phy_caching LTD.AddrCaching.RFC2889.AddrCachingCapacity
back2back LTD.Throughput.RFC2544.BackToBackFrames
phy2phy_tput_mod_vlan LTD.Throughput.RFC2544.PacketLossRatioFrameModification
phy2phy_cont Phy2Phy Continuous Stream
pvp_cont PVP Continuous Stream
pvvp_cont PVVP Continuous Stream
pvpv_cont Two VMs in parallel with Continuous Stream
phy2phy_scalability LTD.Scalability.Flows.RFC2544.0PacketLoss
pvp_tput LTD.Throughput.RFC2544.PacketLossRatio
pvp_back2back LTD.Throughput.RFC2544.BackToBackFrames
pvvp_tput LTD.Throughput.RFC2544.PacketLossRatio
pvvp_back2back LTD.Throughput.RFC2544.BackToBackFrames
phy2phy_cpu_load LTD.CPU.RFC2544.0PacketLoss
phy2phy_mem_load LTD.Memory.RFC2544.0PacketLoss
phy2phy_tput_vpp VPP: LTD.Throughput.RFC2544.PacketLossRatio
phy2phy_cont_vpp VPP: Phy2Phy Continuous Stream
phy2phy_back2back_vpp VPP: LTD.Throughput.RFC2544.BackToBackFrames
pvp_tput_vpp VPP: LTD.Throughput.RFC2544.PacketLossRatio
pvp_cont_vpp VPP: PVP Continuous Stream
pvp_back2back_vpp VPP: LTD.Throughput.RFC2544.BackToBackFrames
pvvp_tput_vpp VPP: LTD.Throughput.RFC2544.PacketLossRatio
pvvp_cont_vpp VPP: PVVP Continuous Stream
pvvp_back2back_vpp VPP: LTD.Throughput.RFC2544.BackToBackFrames

The list of performance testcases above can be obtained by executing:

$ ./vsperf --list
5.2. Integration testcases
Testcase Name Description
vswitch_vports_add_del_flow vSwitch - configure switch with vports, add and delete flow
vswitch_add_del_flows vSwitch - add and delete flows
vswitch_p2p_tput vSwitch - configure switch and execute RFC2544 throughput test
vswitch_p2p_back2back vSwitch - configure switch and execute RFC2544 back2back test
vswitch_p2p_cont vSwitch - configure switch and execute RFC2544 continuous stream test
vswitch_pvp vSwitch - configure switch and one vnf
vswitch_vports_pvp vSwitch - configure switch with vports and one vnf
vswitch_pvp_tput vSwitch - configure switch, vnf and execute RFC2544 throughput test
vswitch_pvp_back2back vSwitch - configure switch, vnf and execute RFC2544 back2back test
vswitch_pvp_cont vSwitch - configure switch, vnf and execute RFC2544 continuous stream test
vswitch_pvp_all vSwitch - configure switch, vnf and execute all test types
vswitch_pvvp vSwitch - configure switch and two vnfs
vswitch_pvvp_tput vSwitch - configure switch, two chained vnfs and execute RFC2544 throughput test
vswitch_pvvp_back2back vSwitch - configure switch, two chained vnfs and execute RFC2544 back2back test
vswitch_pvvp_cont vSwitch - configure switch, two chained vnfs and execute RFC2544 continuous stream test
vswitch_pvvp_all vSwitch - configure switch, two chained vnfs and execute all test types
vswitch_p4vp_tput 4 chained vnfs, execute RFC2544 throughput test, deployment pvvp4
vswitch_p4vp_back2back 4 chained vnfs, execute RFC2544 back2back test, deployment pvvp4
vswitch_p4vp_cont 4 chained vnfs, execute RFC2544 continuous stream test, deployment pvvp4
vswitch_p4vp_all 4 chained vnfs, execute RFC2544 throughput tests, deployment pvvp4
2pvp_udp_dest_flows RFC2544 Continuous TC with 2 Parallel VMs, flows on UDP Dest Port, deployment pvpv2
4pvp_udp_dest_flows RFC2544 Continuous TC with 4 Parallel VMs, flows on UDP Dest Port, deployment pvpv4
6pvp_udp_dest_flows RFC2544 Continuous TC with 6 Parallel VMs, flows on UDP Dest Port, deployment pvpv6
vhost_numa_awareness vSwitch DPDK - verify that PMD threads are served by the same NUMA slot as QEMU instances
ixnet_pvp_tput_1nic PVP Scenario with 1 port towards IXIA
vswitch_vports_add_del_connection_vpp VPP: vSwitch - configure switch with vports, add and delete connection
p2p_l3_multi_IP_ovs OVS: P2P L3 multistream with unique flow for each IP stream
p2p_l3_multi_IP_mask_ovs OVS: P2P L3 multistream with 1 flow for /8 net mask
pvp_l3_multi_IP_mask_ovs OVS: PVP L3 multistream with 1 flow for /8 net mask
pvvp_l3_multi_IP_mask_ovs OVS: PVVP L3 multistream with 1 flow for /8 net mask
p2p_l4_multi_PORT_ovs OVS: P2P L4 multistream with unique flow for each IP stream
p2p_l4_multi_PORT_mask_ovs OVS: P2P L4 multistream with 1 flow for /8 net and port mask
pvp_l4_multi_PORT_mask_ovs OVS: PVP L4 multistream flows for /8 net and port mask
pvvp_l4_multi_PORT_mask_ovs OVS: PVVP L4 multistream with flows for /8 net and port mask
p2p_l3_multi_IP_arp_vpp VPP: P2P L3 multistream with unique ARP entry for each IP stream
p2p_l3_multi_IP_mask_vpp VPP: P2P L3 multistream with 1 route for /8 net mask
p2p_l3_multi_IP_routes_vpp VPP: P2P L3 multistream with unique route for each IP stream
pvp_l3_multi_IP_mask_vpp VPP: PVP L3 multistream with route for /8 netmask
pvvp_l3_multi_IP_mask_vpp VPP: PVVP L3 multistream with route for /8 netmask
p2p_l4_multi_PORT_arp_vpp VPP: P2P L4 multistream with unique ARP entry for each IP stream and port check
p2p_l4_multi_PORT_mask_vpp VPP: P2P L4 multistream with 1 route for /8 net mask and port check
p2p_l4_multi_PORT_routes_vpp VPP: P2P L4 multistream with unique route for each IP stream and port check
pvp_l4_multi_PORT_mask_vpp VPP: PVP L4 multistream with route for /8 net and port mask
pvvp_l4_multi_PORT_mask_vpp VPP: PVVP L4 multistream with route for /8 net and port mask
vxlan_multi_IP_mask_ovs OVS: VxLAN L3 multistream
vxlan_multi_IP_arp_vpp VPP: VxLAN L3 multistream with unique ARP entry for each IP stream
vxlan_multi_IP_mask_vpp VPP: VxLAN L3 multistream with 1 route for /8 netmask

The list of integration testcases above can be obtained by executing:

$ ./vsperf --integration --list
5.3. OVS/DPDK Regression TestCases

These regression tests verify several DPDK features used internally by Open vSwitch. Tests can be used for verification of performance and correct functionality of upcoming DPDK and OVS releases and release candidates.

These tests are part of the integration testcases and they must be executed with the --integration CLI parameter.

Example of execution of all OVS/DPDK regression tests:

$ ./vsperf --integration --tests ovsdpdk_

Testcases are defined in the file conf/integration/01b_dpdk_regression_tests.conf. This file contains a set of configuration options with the prefix OVSDPDK_. These parameters can be used for customization of the regression tests and they will override some of the standard VSPERF configuration options. It is recommended to check the OVSDPDK configuration parameters and modify them in accordance with the VSPERF configuration.

At least the following parameters should be examined. Their values shall ensure that DPDK and QEMU threads are pinned to CPU cores of the same NUMA slot where the tested NICs are connected.

_OVSDPDK_1st_PMD_CORE
_OVSDPDK_2nd_PMD_CORE
_OVSDPDK_GUEST_5_CORES
5.3.1. DPDK NIC Support

A set of performance tests to verify support of DPDK accelerated network interface cards. Testcases use a standard physical to physical network scenario with several vSwitch and traffic configurations, which include one and two PMD threads, uni- and bidirectional traffic, and RFC2544 Continuous or RFC2544 Throughput with 0% packet loss traffic types.

Testcase Name Description
ovsdpdk_nic_p2p_single_pmd_unidir_cont P2P with single PMD in OVS and unidirectional traffic.
ovsdpdk_nic_p2p_single_pmd_bidir_cont P2P with single PMD in OVS and bidirectional traffic.
ovsdpdk_nic_p2p_two_pmd_bidir_cont P2P with two PMDs in OVS and bidirectional traffic.
ovsdpdk_nic_p2p_single_pmd_unidir_tput P2P with single PMD in OVS and unidirectional traffic.
ovsdpdk_nic_p2p_single_pmd_bidir_tput P2P with single PMD in OVS and bidirectional traffic.
ovsdpdk_nic_p2p_two_pmd_bidir_tput P2P with two PMDs in OVS and bidirectional traffic.
5.3.2. DPDK Hotplug Support

A set of functional tests to verify DPDK hotplug support. The tests verify that it is possible to use a port which was not bound to the DPDK driver during vSwitch startup. There is also a test which verifies the possibility to detach a port from the DPDK driver. However, support for manual detachment of a port from DPDK has been removed from recent OVS versions and thus this testcase is expected to fail.

Testcase Name Description
ovsdpdk_hotplug_attach Ensure successful port-add after binding a device to igb_uio after ovs-vswitchd is launched.
ovsdpdk_hotplug_detach Same as ovsdpdk_hotplug_attach, but delete and detach the device after the hotplug. Note: support of netdev-dpdk/detach has been removed from OVS, so the testcase will fail with recent OVS/DPDK versions.
5.3.3. RX Checksum Support

A set of functional tests for verification of RX checksum calculation for tunneled traffic. Open vSwitch enables RX checksum offloading by default if the NIC supports it. Note that it is not possible to explicitly enable or disable RX checksum offloading. In order to verify correct RX checksum calculation in software, the user has to execute these testcases on a NIC without HW offloading capabilities.

Testcases utilize the existing overlay physical to physical (op2p) network deployment implemented in vsperf. This deployment expects that the traffic generator sends unidirectional tunneled traffic (e.g. vxlan) and Open vSwitch performs data decapsulation and sends the traffic back to the traffic generator via the second port.

Testcase Name Description
ovsdpdk_checksum_l3 Test verifies RX IP header checksum (offloading) validation for tunneling protocols.
ovsdpdk_checksum_l4 Test verifies RX UDP header checksum (offloading) validation for tunneling protocols.
5.3.4. Flow Control Support

A set of functional testcases for the validation of flow control support in Open vSwitch with DPDK support. If flow control is enabled in both OVS and the traffic generator, a network endpoint (OVS or TGEN) that is not able to process incoming data detects an RX buffer overflow. It then sends an Ethernet pause frame (as defined in 802.3x) to the TX side. This mechanism ensures that the TX side slows down traffic transmission and thus no data is lost at the RX side.

The introduced testcases use a physical to physical scenario to forward data between traffic generator ports. It is expected that the processing of small frames in OVS is slower than line rate. This means that with flow control disabled, the traffic generator will report frame loss. On the other hand, with flow control enabled, the traffic generator should report 0% frame loss.
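Outside of VSPERF, flow control on a DPDK physical port can be toggled through the OVS interface options; a sketch (the port name is an assumption, and option availability should be verified against the OVS version in use):

```shell
# Illustrative: enable RX and TX flow control on a DPDK physical port.
# "dpdk0" is a hypothetical port name.
$ ovs-vsctl set Interface dpdk0 options:rx-flow-ctrl=true
$ ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=true
```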

Testcase Name Description
ovsdpdk_flow_ctrl_rx Test the rx flow control functionality of DPDK PHY ports.
ovsdpdk_flow_ctrl_rx_dynamic Change the rx flow control support at run time and ensure the system honored the changes.
5.3.5. Multiqueue Support

A set of functional testcases for validation of multiqueue support for both physical and vHost User DPDK ports. Testcases utilize P2P and PVP network deployments and the native support of multiqueue configuration available in VSPERF.

Testcase Name Description
ovsdpdk_mq_p2p_rxqs Setup rxqs on NIC port.
ovsdpdk_mq_p2p_rxqs_same_core_affinity Affinitize rxqs to the same core.
ovsdpdk_mq_p2p_rxqs_multi_core_affinity Affinitize rxqs to separate cores.
ovsdpdk_mq_pvp_rxqs Setup rxqs on vhost user port.
ovsdpdk_mq_pvp_rxqs_linux_bridge Confirm traffic received over vhost RXQs with Linux virtio device in guest.
ovsdpdk_mq_pvp_rxqs_testpmd Confirm traffic received over vhost RXQs with DPDK device in guest.
5.3.6. Vhost User

A set of functional testcases for validation of vHost User Client and vHost User Server modes in OVS.

NOTE: Vhost User Server mode is deprecated and it will be removed from OVS in the future.

Testcase Name Description
ovsdpdk_vhostuser_client Test vhost-user client mode
ovsdpdk_vhostuser_client_reconnect Test vhost-user client mode reconnect feature
ovsdpdk_vhostuser_server Test vhost-user server mode
ovsdpdk_vhostuser_sock_dir Verify functionality of vhost-sock-dir flag
5.3.7. Virtual Devices Support

A set of functional testcases for verification of correct functionality of virtual device PMD drivers.

Testcase Name Description
ovsdpdk_vdev_add_null_pmd Test addition of port using the null DPDK PMD driver.
ovsdpdk_vdev_del_null_pmd Test deletion of port using the null DPDK PMD driver.
ovsdpdk_vdev_add_af_packet_pmd Test addition of port using the af_packet DPDK PMD driver.
ovsdpdk_vdev_del_af_packet_pmd Test deletion of port using the af_packet DPDK PMD driver.
5.3.8. NUMA Support

A functional testcase for validation of NUMA awareness feature in OVS.

Testcase Name Description
ovsdpdk_numa Test vhost-user NUMA support. Vhostuser PMD threads should migrate to the same numa slot, where QEMU is executed.
5.3.9. Jumbo Frame Support

A set of functional testcases for verification of jumbo frame support in OVS. Testcases utilize P2P and PVP network deployments and the native support of jumbo frames available in VSPERF.
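The MTU changes verified by these tests correspond to the OVS mtu_request mechanism; a sketch (port name and MTU value are assumptions):

```shell
# Illustrative: request a 9000 byte MTU on a hypothetical DPDK port "dpdk0"
# and read back the MTU recorded in OVSDB.
$ ovs-vsctl set Interface dpdk0 mtu_request=9000
$ ovs-vsctl get Interface dpdk0 mtu
```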

Testcase Name Description
ovsdpdk_jumbo_increase_mtu_phy_port_ovsdb Ensure that the increased MTU for a DPDK physical port is updated in OVSDB.
ovsdpdk_jumbo_increase_mtu_vport_ovsdb Ensure that the increased MTU for a DPDK vhost-user port is updated in OVSDB.
ovsdpdk_jumbo_reduce_mtu_phy_port_ovsdb Ensure that the reduced MTU for a DPDK physical port is updated in OVSDB.
ovsdpdk_jumbo_reduce_mtu_vport_ovsdb Ensure that the reduced MTU for a DPDK vhost-user port is updated in OVSDB.
ovsdpdk_jumbo_increase_mtu_phy_port_datapath Ensure that the MTU for a DPDK physical port is updated in the datapath itself when increased to a valid value.
ovsdpdk_jumbo_increase_mtu_vport_datapath Ensure that the MTU for a DPDK vhost-user port is updated in the datapath itself when increased to a valid value.
ovsdpdk_jumbo_reduce_mtu_phy_port_datapath Ensure that the MTU for a DPDK physical port is updated in the datapath itself when decreased to a valid value.
ovsdpdk_jumbo_reduce_mtu_vport_datapath Ensure that the MTU for a DPDK vhost-user port is updated in the datapath itself when decreased to a valid value.
ovsdpdk_jumbo_mtu_upper_bound_phy_port Verify that the upper bound limit is enforced for OvS DPDK Phy ports.
ovsdpdk_jumbo_mtu_upper_bound_vport Verify that the upper bound limit is enforced for OvS DPDK vhost-user ports.
ovsdpdk_jumbo_mtu_lower_bound_phy_port Verify that the lower bound limit is enforced for OvS DPDK Phy ports.
ovsdpdk_jumbo_mtu_lower_bound_vport Verify that the lower bound limit is enforced for OvS DPDK vhost-user ports.
ovsdpdk_jumbo_p2p Ensure that jumbo frames are received, processed and forwarded correctly by DPDK physical ports.
ovsdpdk_jumbo_pvp Ensure that jumbo frames are received, processed and forwarded correctly by DPDK vhost-user ports.
ovsdpdk_jumbo_p2p_upper_bound Ensure that jumbo frames above the configured Rx port’s MTU are not accepted
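
The MTU upper/lower bound testcases above reduce to a simple range check on the requested value. A minimal sketch of that check (the numeric limits below are illustrative assumptions; the real bounds depend on the NIC and the OvS/DPDK version in use):

```python
# Illustrative MTU range check mirroring the upper/lower bound testcases.
# The exact limits are assumptions: real bounds depend on the NIC and the
# OvS/DPDK version.
MTU_LOWER_BOUND = 68      # assumed minimum MTU accepted by OvS DPDK ports
MTU_UPPER_BOUND = 9702    # assumed device-dependent jumbo-frame maximum

def mtu_request_valid(mtu: int) -> bool:
    """Return True if an mtu_request value would be accepted."""
    return MTU_LOWER_BOUND <= mtu <= MTU_UPPER_BOUND

# A standard frame, a jumbo frame, and two out-of-range requests:
assert mtu_request_valid(1500)
assert mtu_request_valid(9000)
assert not mtu_request_valid(67)       # below lower bound -> rejected
assert not mtu_request_valid(65536)    # above upper bound -> rejected
```
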
5.3.10. Rate Limiting

A set of functional testcases for validation of rate limiting support. This feature allows ingress policing to be configured for both physical and vHost User DPDK ports.

NOTE: The desired maximum rate is specified in kilobits per second and defines the rate of the payload only.

Testcase Name Description
ovsdpdk_rate_create_phy_port Ensure a rate limiting interface can be created on a physical DPDK port.
ovsdpdk_rate_delete_phy_port Ensure a rate limiting interface can be destroyed on a physical DPDK port.
ovsdpdk_rate_create_vport Ensure a rate limiting interface can be created on a vhost-user port.
ovsdpdk_rate_delete_vport Ensure a rate limiting interface can be destroyed on a vhost-user port.
ovsdpdk_rate_no_policing Ensure that when a user attempts to create a rate limiting interface but omits the policing rate argument, no rate limiter is created.
ovsdpdk_rate_no_burst Ensure that when a user attempts to create a rate limiting interface but omits the policing burst argument, the rate limiter is still created.
ovsdpdk_rate_p2p Ensure when a user creates a rate limiting physical interface that the traffic is limited to the specified policer rate in a p2p setup.
ovsdpdk_rate_pvp Ensure when a user creates a rate limiting vHost User interface that the traffic is limited to the specified policer rate in a pvp setup.
ovsdpdk_rate_p2p_multi_pkt_sizes Ensure that rate limiting works for various frame sizes.
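
Because the policer rate counts payload only, the achievable frame rate depends on the frame size. A small sketch of that relationship (the 18-byte Ethernet header + CRC overhead is an assumption used for illustration, not a value taken from VSPERF):

```python
# Sketch: expected frame rate for a given ingress policing rate.
# Assumption (not from VSPERF): "payload" means the frame minus the
# 14-byte Ethernet header and the 4-byte CRC, i.e. 18 bytes of overhead.
ETH_OVERHEAD = 18  # bytes: header (14) + CRC (4); illustrative assumption

def expected_frames_per_sec(rate_kbps: int, frame_size: int) -> float:
    """Frames/s that fit under a payload-only policer of rate_kbps."""
    payload_bits = (frame_size - ETH_OVERHEAD) * 8
    return rate_kbps * 1000 / payload_bits

# A 1000 kbps policer with 64-byte frames: 46 payload bytes per frame,
# i.e. 368 payload bits, so roughly 2717 frames/s.
fps = expected_frames_per_sec(1000, 64)
assert round(fps) == 2717
```
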
5.3.11. Quality of Service

A set of functional testcases for validation of QoS support. This feature allows egress policing to be configured for both physical and vHost User DPDK ports.

NOTE: The desired maximum rate is specified in bytes per second and defines the rate of the payload only.

Testcase Name Description
ovsdpdk_qos_create_phy_port Ensure a QoS policy can be created on a physical DPDK port.
ovsdpdk_qos_delete_phy_port Ensure an existing QoS policy can be destroyed on a physical DPDK port.
ovsdpdk_qos_create_vport Ensure a QoS policy can be created on a virtual vhost user port.
ovsdpdk_qos_delete_vport Ensure an existing QoS policy can be destroyed on a vhost user port.
ovsdpdk_qos_create_no_cir Ensure that a QoS policy cannot be created if the egress policer cir argument is missing.
ovsdpdk_qos_create_no_cbs Ensure that a QoS policy cannot be created if the egress policer cbs argument is missing.
ovsdpdk_qos_p2p In a p2p setup, ensure when a QoS egress policer is created that the traffic is limited to the specified rate.
ovsdpdk_qos_pvp In a pvp setup, ensure when a QoS egress policer is created that the traffic is limited to the specified rate.
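
The cir (committed information rate, bytes per second) and cbs (committed burst size, bytes) arguments follow the usual single-rate token-bucket model. A minimal sketch of that model for illustration (this is not the OvS implementation):

```python
# Minimal single-rate token bucket illustrating cir/cbs semantics.
# cir = committed information rate (bytes/s), cbs = committed burst size
# (bytes). An illustration of the model, not the OvS egress-policer code.
class EgressPolicer:
    def __init__(self, cir: int, cbs: int):
        self.cir = cir
        self.cbs = cbs
        self.tokens = cbs      # bucket starts full
        self.last = 0.0

    def allow(self, now: float, frame_bytes: int) -> bool:
        # Refill tokens at cir bytes/s, capped at cbs.
        self.tokens = min(self.cbs, self.tokens + (now - self.last) * self.cir)
        self.last = now
        if self.tokens >= frame_bytes:
            self.tokens -= frame_bytes
            return True
        return False           # frame dropped by the policer

p = EgressPolicer(cir=1000, cbs=1500)   # 1000 B/s sustained, 1500 B burst
assert p.allow(0.0, 1500)               # burst fits the full bucket
assert not p.allow(0.0, 100)            # bucket empty at the same instant
assert p.allow(1.0, 1000)               # 1 s later: 1000 tokens refilled
```
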
5.3.12. Custom Statistics

A set of functional testcases for validation of Custom Statistics support by OVS. This feature allows Custom Statistics to be accessed by VSPERF.

These testcases require DPDK v17.11, the latest Open vSwitch (v2.9.90) and the IxNet traffic generator.

ovsdpdk_custstat_check Test if custom statistics are supported.
ovsdpdk_custstat_rx_error Test bad ethernet CRC counter ‘rx_crc_errors’ exposed by custom statistics.
5.4. T-Rex in VM TestCases

A set of functional testcases, which use T-Rex running in VM as a traffic generator. These testcases require a VM image with T-Rex server installed. An example of such image is a vloop-vnf image with T-Rex available for download at:

http://artifacts.opnfv.org/vswitchperf/vnf/vloop-vnf-ubuntu-16.04_trex_20180209.qcow2

This image can be used for both T-Rex VM and loopback VM in vm2vm testcases.

NOTE: The performance of T-Rex running inside a VM is lower than that of T-Rex running on bare metal. The user should calibrate the VM's maximum FPS capability to ensure this limitation is understood.

trex_vm_cont T-Rex VM - execute RFC2544 Continuous Stream from T-Rex VM and loop it back through Open vSwitch.
trex_vm_tput T-Rex VM - execute RFC2544 Throughput from T-Rex VM and loop it back through Open vSwitch.
trex_vm2vm_cont T-Rex VM2VM - execute RFC2544 Continuous Stream from T-Rex VM and loop it back through 2nd VM.
trex_vm2vm_tput T-Rex VM2VM - execute RFC2544 Throughput from T-Rex VM and loop it back through 2nd VM.

Yardstick

Yardstick User Guide
1. Introduction

Welcome to Yardstick’s documentation!

Yardstick is an OPNFV Project.

The project’s goal is to verify infrastructure compliance, from the perspective of a Virtual Network Function (VNF).

The Project’s scope is the development of a test framework, Yardstick, test cases and test stimuli to enable Network Function Virtualization Infrastructure (NFVI) verification.

Yardstick is used in OPNFV for verifying the OPNFV infrastructure and some of the OPNFV features. The Yardstick framework is deployed in several OPNFV community labs. It is installer, infrastructure and application independent.

See also

Pharos for information on OPNFV community labs and this Presentation for an overview of Yardstick

1.1. About This Document

This document consists of the following chapters:

  • Chapter Introduction provides a brief introduction to Yardstick project’s background and describes the structure of this document.
  • Chapter Methodology describes the methodology implemented by the Yardstick Project for NFVI verification.
  • Chapter Architecture provides information on the software architecture of Yardstick.
  • Chapter Yardstick Installation provides instructions to install Yardstick.
  • Chapter Yardstick Usage provides information on how to use Yardstick to run and create testcases.
  • Chapter Installing a plug-in into Yardstick provides information on how to integrate other OPNFV testing projects into Yardstick.
  • Chapter Store Other Project’s Test Results in InfluxDB provides information on how to run plug-in test cases and store test results in the community’s InfluxDB.
  • Chapter Grafana dashboard provides information on the Yardstick Grafana dashboard and how to add a dashboard to it.
  • Chapter Yardstick Restful API provides information on the Yardstick REST API and how to use it.
  • Chapter Yardstick User Interface provides information on how to use the yardstick report CLI to view test results in table format and as values pinned onto a graph.
  • Chapter Network Services Benchmarking (NSB) describes the methodology implemented by Yardstick Network Services Benchmarking to test real-world use cases for a given VNF.
  • Chapter 13-nsb_installation provides instructions to install Yardstick - Network Service Benchmarking (NSB) testing.
  • Chapter Yardstick - NSB Testing - Operation provides information on running NSB.
  • Chapter Yardstick Test Cases includes a list of available Yardstick test cases.
1.2. Contact Yardstick

Feedback? Contact us

2. Methodology
2.1. Abstract

This chapter describes the methodology implemented by the Yardstick project for verifying the NFVI from the perspective of a VNF.

2.2. ETSI-NFV

The document ETSI GS NFV-TST001, “Pre-deployment Testing; Report on Validation of NFV Environments and Services”, recommends methods for pre-deployment testing of the functional components of an NFV environment.

The Yardstick project implements the methodology described in chapter 6, “Pre-deployment validation of NFV infrastructure”.

The methodology consists of decomposing the typical VNF workload performance metrics into a number of characteristics/performance vectors, each of which can be represented by distinct test-cases.

The methodology includes five steps:

  • Step 1: Define infrastructure - the hardware, software and corresponding configuration targeted for validation; the OPNFV infrastructure, in OPNFV community labs.
  • Step 2: Identify VNF type - the application for which the infrastructure is to be validated, and its requirements on the underlying infrastructure.
  • Step 3: Select test cases - depending on the workload that represents the application for which the infrastructure is to be validated, select the relevant test cases from the list of available Yardstick test cases.
  • Step 4: Execute tests - define the duration and number of iterations for the selected test cases; test runs are automated via OPNFV Jenkins jobs.
  • Step 5: Collect results - using the common API for result collection.

See also

Yardsticktst for material on alignment ETSI TST001 and Yardstick.

2.3. Metrics

The metrics, as defined by ETSI GS NFV-TST001, are shown in Table1, Table2 and Table3.

In the OPNFV Colorado release, generic test cases covering aspects of the listed metrics are available; further OPNFV releases will provide extended testing of these metrics. The mapping of available Yardstick test cases to the ETSI definitions of Table1, Table2 and Table3 is shown in Table4. It should be noted that the Yardstick test cases are examples: the test duration and number of iterations are configurable, as are the System Under Test (SUT) and the attributes (or, in Yardstick nomenclature, the scenario options).

Table 1 - Performance/Speed Metrics

Category Performance/Speed
Compute
  • Latency for random memory access
  • Latency for cache read/write operations
  • Processing speed (instructions per second)
  • Throughput for random memory access (bytes per second)
Network
  • Throughput per NFVI node (frames/byte per second)
  • Throughput provided to a VM (frames/byte per second)
  • Latency per traffic flow
  • Latency between VMs
  • Latency between NFVI nodes
  • Packet delay variation (jitter) between VMs
  • Packet delay variation (jitter) between NFVI nodes
Storage
  • Sequential read/write IOPS
  • Random read/write IOPS
  • Latency for storage read/write operations
  • Throughput for storage read/write operations

Table 2 - Capacity/Scale Metrics

Category Capacity/Scale
Compute
  • Number of cores and threads
  • Available memory size
  • Cache size
  • Processor utilization (max, average, standard deviation)
  • Memory utilization (max, average, standard deviation)
  • Cache utilization (max, average, standard deviation)
Network
  • Number of connections
  • Number of frames sent/received
  • Maximum throughput between VMs (frames/byte per second)
  • Maximum throughput between NFVI nodes (frames/byte per second)
  • Network utilization (max, average, standard deviation)
  • Number of traffic flows
Storage
  • Storage/Disk size
  • Capacity allocation (block-based, object-based)
  • Block size
  • Maximum sequential read/write IOPS
  • Maximum random read/write IOPS
  • Disk utilization (max, average, standard deviation)

Table 3 - Availability/Reliability Metrics

Category Availability/Reliability
Compute
  • Processor availability (Error free processing time)
  • Memory availability (Error free memory time)
  • Processor mean-time-to-failure
  • Memory mean-time-to-failure
  • Number of processing faults per second
Network
  • NIC availability (Error free connection time)
  • Link availability (Error free transmission time)
  • NIC mean-time-to-failure
  • Network timeout duration due to link failure
  • Frame loss rate
Storage
  • Disk availability (Error free disk access time)
  • Disk mean-time-to-failure
  • Number of failed storage read/write operations per second

Table 4 - Yardstick Generic Test Cases

Category Performance/Speed Capacity/Scale Availability/Reliability
Compute TC003 [1] TC004 TC010 TC012 TC014 TC069 TC003 [1] TC004 TC024 TC055 TC013 [1] TC015 [1]
Network TC001 TC002 TC009 TC011 TC042 TC043 TC044 TC073 TC075 TC016 [1] TC018 [1]
Storage TC005 TC063 TC017 [1]

Note

The description in this OPNFV document is intended as a reference for users to understand the scope of the Yardstick Project and the deliverables of the Yardstick framework. For complete description of the methodology, please refer to the ETSI document.

Footnotes

[1] To be included in future deliveries.
3. Architecture
3.1. Abstract

This chapter describes the Yardstick framework software architecture from four perspectives: the Use-Case View, Logical View, Process View and Deployment View, along with the relevant technical details.

3.2. Overview
3.2.1. Architecture overview

Yardstick is mainly written in Python, and test configurations are made in YAML. Documentation is written in reStructuredText format, i.e. .rst files. Yardstick is inspired by Rally. Yardstick is intended to run on a computer with access and credentials to a cloud. The test case is described in a configuration file given as an argument.

How it works: the benchmark task configuration file is parsed and converted into an internal model. The context part of the model is converted into a Heat template and deployed into a stack. Each scenario is run using a runner, either serially or in parallel. Each runner runs in its own subprocess, executing commands in a VM using SSH. The output of each scenario is written as JSON records to a file, InfluxDB or an HTTP server; InfluxDB is used as the backend, and the test results are visualized with Grafana.

3.2.2. Concept

Benchmark - assess the relative performance of something

Benchmark configuration file - describes a single test case in yaml format

Context - The set of cloud resources used by a scenario, such as user names, image names, affinity rules and network configurations. A context is converted into a simplified Heat template, which is used to deploy onto the OpenStack environment.

Data - Output produced by running a benchmark, written to a file in json format

Runner - Logic that determines how a test scenario is run and reported, for example the number of test iterations, input value stepping and test duration. Predefined runner types exist for re-usage, see Runner types.

Scenario - Type/class of measurement for example Ping, Pktgen, (Iperf, LmBench, ...)

SLA - Relates to the result boundary a test case must meet to pass, for example a latency limit or an amount or ratio of lost packets. The action taken on an SLA violation can be configured: either just log it (monitor) or stop further testing (assert). The SLA criteria are set in the benchmark configuration file and evaluated by the runner.
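
The SLA evaluation described above can be pictured as a small predicate over the measured result, with the configured action deciding whether a violation is merely logged or aborts the run. A simplified sketch (the function and names are illustrative, not Yardstick's actual code):

```python
# Simplified SLA evaluation: 'monitor' logs the violation, 'assert' stops
# the test. This mirrors the description above, not Yardstick's real code.
def check_sla(measured_rtt: float, max_rtt: float, action: str):
    if measured_rtt <= max_rtt:
        return "pass"
    if action == "assert":
        raise AssertionError(f"SLA violated: rtt {measured_rtt} > {max_rtt}")
    return "sla_violation_logged"       # action == "monitor"

assert check_sla(8.0, 10.0, "monitor") == "pass"
assert check_sla(12.0, 10.0, "monitor") == "sla_violation_logged"
stopped = False
try:
    check_sla(12.0, 10.0, "assert")     # violation with action: assert
except AssertionError:
    stopped = True                      # further testing would be stopped
assert stopped
```
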

3.2.3. Runner types

Several predefined runner types exist to choose from when designing a test scenario:

Arithmetic: Every test run arithmetically steps the specified input value(s) in the test scenario, adding a value to the previous input value. It is also possible to combine several input values for the same test case in different combinations.

Snippet of an Arithmetic runner configuration:

runner:
    type: Arithmetic
    iterators:
    -
      name: stride
      start: 64
      stop: 128
      step: 64
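
The snippet above steps the scenario's stride input from 64 to 128 in increments of 64. The values the runner would feed to the scenario can be sketched as (assuming, for illustration, that stop is inclusive):

```python
# Values an Arithmetic runner with start=64, stop=128, step=64 would feed
# to the scenario, assuming an inclusive stop (an illustrative sketch).
def arithmetic_values(start: int, stop: int, step: int):
    return list(range(start, stop + 1, step))

assert arithmetic_values(64, 128, 64) == [64, 128]
```
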

Duration: The test runs for a specific period of time before it completes.

Snippet of a Duration runner configuration:

runner:
  type: Duration
  duration: 30

Sequence: The test changes a specified input value to the scenario. The input values to the sequence are specified in a list in the benchmark configuration file.

Snippet of a Sequence runner configuration:

runner:
  type: Sequence
  scenario_option_name: packetsize
  sequence:
  - 100
  - 200
  - 250

Iteration: Tests are run a specified number of times before completing.

Snippet of an Iteration runner configuration:

runner:
  type: Iteration
  iterations: 2
3.3. Use-Case View

The Yardstick Use-Case View shows two kinds of users: the Tester, who runs tests in the cloud, and the User, who is more concerned with test results and their analysis.

Testers run a single test case or a test case suite to verify infrastructure compliance or to benchmark their own infrastructure performance. Test results are stored by the dispatcher module; three storage methods (file, InfluxDB and HTTP) can be configured. Detailed information on scenarios and runners can be queried via the CLI.

Users can check test results in one of four ways.

If the dispatcher module is configured as file (the default), there are two ways to check test results: read them from yardstick.out (default path: /tmp/yardstick.out), or view a plot of the results, shown when users execute the command “yardstick-plot”.

If the dispatcher module is configured as influxdb, users can check test results on Grafana, which is commonly used for visualizing time-series data.

If the dispatcher module is configured as http, users can check test results on the OPNFV testing dashboard, which uses MongoDB as its backend.

Yardstick Use-Case View
3.4. Logical View

Yardstick Logical View describes the most important classes, their organization, and the most important use-case realizations.

Main classes:

TaskCommands - “yardstick task” subcommand handler.

HeatContext - Converts the context section of the test yaml file into a HOT template; deploys and undeploys the OpenStack Heat stack.

Runner - Logic that determines how a test scenario is run and reported.

TestScenario - Type/class of measurement for example Ping, Pktgen, (Iperf, LmBench, ...)

Dispatcher - Choose user defined way to store test results.

TaskCommands is the main entry point of the “yardstick task” subcommand. It takes a yaml file (e.g. test.yaml) as input and uses HeatContext to convert the yaml file’s context section to HOT. After the OpenStack Heat stack is deployed by HeatContext with the converted HOT, TaskCommands uses Runner to run the specified TestScenario. During the first runner initialization, an output process is created; the output process uses Dispatcher to push test results. The Runner also creates a process to execute the TestScenario, and there is a multiprocessing queue between each runner process and the output process, so the runner process can push real-time test results to the storage media. TestScenario commonly connects to VMs using SSH: it sets up the VMs and runs test measurement scripts through the SSH tunnel. After all TestScenarios have finished, TaskCommands undeploys the Heat stack, and the whole test is finished.
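
The runner/output-process pairing described above can be illustrated with a toy version of the pattern: a runner process pushes result records onto a multiprocessing queue, and the output side drains the queue and hands each record to a dispatcher. All names here are illustrative, not Yardstick's actual classes:

```python
# Toy version of the runner -> output-process pattern described above:
# a runner process pushes result records onto a queue; the output side
# drains it and hands records to a dispatcher. Names are illustrative.
import multiprocessing

def runner(queue):
    for i in range(3):
        queue.put({"iteration": i, "rtt": 1.0 + i})   # fake results
    queue.put(None)                                    # end-of-run marker

if __name__ == "__main__":
    q = multiprocessing.Queue()
    proc = multiprocessing.Process(target=runner, args=(q,))
    proc.start()
    results = []
    while (record := q.get()) is not None:
        results.append(record)      # a real Dispatcher would store these
    proc.join()
    assert len(results) == 3
```
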

Yardstick framework architecture in Danube
3.5. Process View (Test execution flow)

The Yardstick Process View shows how Yardstick runs a test case. Below is a sequence diagram of the test execution flow using the Heat context; each object represents one module in Yardstick:

Yardstick Process View

A user who wants to run a test with Yardstick can use the CLI to input the command to start a task. “TaskCommands” receives the command and asks “HeatContext” to parse the context. “HeatContext” then asks “Model” to convert the model. After the model is generated, “HeatContext” informs “Openstack” to deploy the Heat stack from the Heat template. After “Openstack” deploys the stack, “HeatContext” informs “Runner” to run the specific test case.

First, “Runner” asks “TestScenario” to process the specific scenario. “TestScenario” then logs on to the specified VMs via the SSH protocol and executes the test case. After the script execution finishes, “TestScenario” sends a message to inform “Runner”. When the testing job is done, “Runner” informs “Dispatcher” to output the test result via file, InfluxDB or HTTP. After the result is output, “HeatContext” calls “Openstack” to undeploy the Heat stack. Once the stack is undeployed, the whole test ends.

3.6. Deployment View

The Yardstick Deployment View shows how the Yardstick tool can be deployed onto the underlying platform. Generally, the Yardstick tool is installed on the JumpServer (see 07-installation for detailed installation steps), which is connected to the control/compute servers over the network. Based on this deployment, Yardstick can run test cases on these hosts and collect the test results for presentation.

Yardstick Deployment View
3.7. Yardstick Directory structure

yardstick/ - Yardstick main directory.

tests/ci/ - Used for continuous integration of Yardstick at different PODs and with support for different installers.

docs/ - All documentation is stored here, such as configuration guides, user guides and Yardstick descriptions.

etc/ - Used for test cases requiring specific POD configurations.

samples/ - Test case samples are stored here; most scenario and feature samples are shown in this directory.

tests/ - Both Yardstick internal tests (functional/ and unit/) and the test cases run to verify the NFVI (opnfv/) are stored here. Configurations of what to run daily and weekly at the different PODs are also located here.

tools/ - Currently contains tools to build the image for VMs which are deployed by Heat, including how to build the yardstick-trusty-server image with the different tools that are needed from within the image.

plugin/ - Plug-in configuration files are stored here.

yardstick/ - Contains the internals of Yardstick: runners, scenarios, contexts, CLI parsing, keys, plotting tools, dispatcher, plugin install/remove scripts and so on.
4. Yardstick Installation

Yardstick supports installation by Docker or directly in Ubuntu. The installation procedure for Docker and direct installation are detailed in the sections below.

To use Yardstick you should have access to an OpenStack environment, with at least Nova, Neutron, Glance, Keystone and Heat installed.

The steps needed to run Yardstick are:

  1. Install Yardstick.
  2. Load OpenStack environment variables.
  3. Create Yardstick flavor.
  4. Build a guest image and load it into the OpenStack environment.
  5. Create the test configuration .yaml file and run the test case/suite.
4.1. Prerequisites

The OPNFV deployment is out of the scope of this document and can be found in User Guide & Configuration Guide. The OPNFV platform is considered as the System Under Test (SUT) in this document.

Several prerequisites are needed for Yardstick:

  1. A Jumphost to run Yardstick on
  2. A Docker daemon or a virtual environment installed on the Jumphost
  3. A public/external network created on the SUT
  4. Connectivity from the Jumphost to the SUT public/external network

Note

Jumphost refers to any server which meets the previous requirements. Normally it is the same server from where the OPNFV deployment has been triggered.

Warning

Connectivity from Jumphost is essential and it is of paramount importance to make sure it is working before even considering to install and run Yardstick. Make also sure you understand how your networking is designed to work.

Note

If your Jumphost is operating behind a company HTTP proxy and/or firewall, please first consult the Proxy Support section towards the end of this document. That section details some tips and tricks which may help in a proxied environment.

4.3. Install Yardstick directly in Ubuntu (second option)

Alternatively you can install Yardstick framework directly in Ubuntu or in an Ubuntu Docker image. No matter which way you choose to install Yardstick, the following installation steps are identical.

If you choose to use the Ubuntu Docker image, you can pull the Ubuntu Docker image from Docker hub:

sudo -EH docker pull ubuntu:16.04
4.3.1. Install Yardstick

Prerequisite preparation:

sudo -EH apt-get update && sudo -EH apt-get install -y \
   git python-setuptools python-pip
sudo -EH easy_install -U setuptools==30.0.0
sudo -EH pip install appdirs==1.4.0
sudo -EH pip install virtualenv

Download the source code and install Yardstick from it:

git clone https://gerrit.opnfv.org/gerrit/yardstick
export YARDSTICK_REPO_DIR=~/yardstick
cd ~/yardstick
sudo -EH ./install.sh

If the host is ever restarted, nginx and uwsgi need to be restarted:

service nginx restart
uwsgi -i /etc/yardstick/yardstick.ini
4.3.2. Configure the Yardstick environment (Todo)

For installing Yardstick directly in Ubuntu, the yardstick env command is not available. You need to prepare OpenStack environment variables and create Yardstick flavor and guest images manually.

4.3.3. Uninstall Yardstick

For uninstalling Yardstick, just delete the virtual environment:

rm -rf ~/yardstick_venv
4.4. Install Yardstick directly in OpenSUSE

You can install Yardstick framework directly in OpenSUSE.

4.4.1. Install Yardstick

Prerequisite preparation:

sudo -EH zypper -n install -y gcc \
   wget \
   git \
   sshpass \
   qemu-tools \
   kpartx \
   libffi-devel \
   libopenssl-devel \
   python \
   python-devel \
   python-virtualenv \
   libxml2-devel \
   libxslt-devel \
   python-setuptools-git

Create a virtual environment:

virtualenv ~/yardstick_venv
export YARDSTICK_VENV=~/yardstick_venv
source ~/yardstick_venv/bin/activate
sudo -EH easy_install -U setuptools

Download the source code and install Yardstick from it:

git clone https://gerrit.opnfv.org/gerrit/yardstick
export YARDSTICK_REPO_DIR=~/yardstick
cd yardstick
sudo -EH python setup.py install
sudo -EH pip install -r requirements.txt

Install missing python modules:

sudo -EH pip install pyyaml \
   oslo_utils \
   oslo_serialization \
   oslo_config \
   paramiko \
   python-heatclient \
   python-novaclient \
   python-glanceclient \
   python-neutronclient \
   scp \
   jinja2
4.4.2. Configure the Yardstick environment

Source the OpenStack environment variables:

source DEVSTACK_DIRECTORY/openrc

Export the Openstack external network. The default installation of Devstack names the external network public:

export EXTERNAL_NETWORK=public
export OS_USERNAME=demo

Change the API version used by Yardstick to v2.0 (the devstack openrc sets it to v3):

export OS_AUTH_URL=http://PUBLIC_IP_ADDRESS:5000/v2.0
4.4.3. Uninstall Yardstick

To uninstall Yardstick, just delete the virtual environment:

rm -rf ~/yardstick_venv
4.5. Verify the installation

It is recommended to verify that Yardstick was installed successfully by executing some simple commands and test samples. Before executing Yardstick test cases make sure yardstick-flavor and yardstick-image can be found in OpenStack and the openrc file is sourced. Below is an example invocation of Yardstick help command and ping.py test sample:

yardstick -h
yardstick task start samples/ping.yaml

Note

The above commands can be run both in the Yardstick container and directly on Ubuntu.

Each testing tool supported by Yardstick has a sample configuration file. These configuration files can be found in the samples directory.

Default location for the output is /tmp/yardstick.out.

4.6. Deploy InfluxDB and Grafana using Docker

Without InfluxDB, Yardstick stores the results of each test case run in the file /tmp/yardstick.out. However, this makes it inconvenient to retrieve and display test results, so the following sections show how to use InfluxDB to store the data and Grafana to display it.

4.6.2. Manual deployment of InfluxDB and Grafana containers

You can also deploy the InfluxDB and Grafana containers manually on the Jumphost. The following sections show how.

Pull docker images:

sudo -EH docker pull tutum/influxdb
sudo -EH docker pull grafana/grafana

Run influxDB:

sudo -EH docker run -d --name influxdb \
   -p 8083:8083 -p 8086:8086 --expose 8090 --expose 8099 \
   tutum/influxdb
docker exec -it influxdb bash

Configure influxDB:

influx
   >CREATE USER root WITH PASSWORD 'root' WITH ALL PRIVILEGES
   >CREATE DATABASE yardstick;
   >use yardstick;
   >show MEASUREMENTS;

Run Grafana:

sudo -EH docker run -d --name grafana -p 1948:3000 grafana/grafana

Log on http://{YOUR_IP_HERE}:1948 using admin/admin and configure database resource to be {YOUR_IP_HERE}:8086.

Grafana data source configuration

Configure yardstick.conf:

sudo -EH docker exec -it yardstick /bin/bash
sudo cp etc/yardstick/yardstick.conf.sample /etc/yardstick/yardstick.conf
sudo vi /etc/yardstick/yardstick.conf

Modify yardstick.conf:

[DEFAULT]
debug = True
dispatcher = influxdb

[dispatcher_influxdb]
timeout = 5
target = http://{YOUR_IP_HERE}:8086
db_name = yardstick
username = root
password = root

Now you can run Yardstick test cases and store the results in influxDB.

4.7. Deploy InfluxDB and Grafana directly in Ubuntu (Todo)
4.8. Proxy Support

To configure the Jumphost to access the Internet through a proxy, it is necessary to export several variables to the environment, as in the following script:

#!/bin/sh
_proxy=<proxy_address>
_proxyport=<proxy_port>
_ip=$(hostname -I | awk '{print $1}')

export ftp_proxy=http://$_proxy:$_proxyport
export FTP_PROXY=http://$_proxy:$_proxyport
export http_proxy=http://$_proxy:$_proxyport
export HTTP_PROXY=http://$_proxy:$_proxyport
export https_proxy=http://$_proxy:$_proxyport
export HTTPS_PROXY=http://$_proxy:$_proxyport
export no_proxy=127.0.0.1,localhost,$_ip,$(hostname),<.localdomain>
export NO_PROXY=127.0.0.1,localhost,$_ip,$(hostname),<.localdomain>

Enabling Internet access from a container using Docker depends on the OS version. On Ubuntu 14.04 LTS, which uses SysVinit, /etc/default/docker must be modified:

.......
# If you need Docker to use an HTTP proxy, it can also be specified here.
export http_proxy="http://<proxy_address>:<proxy_port>/"
export https_proxy="https://<proxy_address>:<proxy_port>/"

Then it is necessary to restart the Docker service:

sudo -EH service docker restart

On Ubuntu 16.04 LTS, which uses systemd, it is necessary to create a drop-in directory:

sudo mkdir /etc/systemd/system/docker.service.d

Then, the proxy configuration will be stored in the following file:

# cat /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=https://<proxy_address>:<proxy_port>/"
Environment="HTTPS_PROXY=https://<proxy_address>:<proxy_port>/"
Environment="NO_PROXY=localhost,127.0.0.1,<localaddress>,<.localdomain>"

The changes need to be flushed and the docker service restarted:

sudo systemctl daemon-reload
sudo systemctl restart docker

Any container already created won’t contain these modifications. If needed, stop and delete the container:

sudo docker stop yardstick
sudo docker rm yardstick

Warning

Be careful, the above rm command will delete the container completely. Everything on this container will be lost.

Then follow the previous instructions in Prepare the Yardstick container to rebuild the Yardstick container.

4.9. References
5. Yardstick Usage

Once you have yardstick installed, you can start using it to run testcases immediately, through the CLI. You can also define and run new testcases and test suites. This chapter details basic usage (running testcases), as well as more advanced usage (creating your own testcases).

5.1. Yardstick common CLI
5.1.1. List test cases

yardstick testcase list: This command lists all test cases in Yardstick. The output looks like this:

+---------------------------------------------------------------------------------------
| Testcase Name         | Description
+---------------------------------------------------------------------------------------
| opnfv_yardstick_tc001 | Measure network throughput using pktgen
| opnfv_yardstick_tc002 | measure network latency using ping
| opnfv_yardstick_tc005 | Measure Storage IOPS, throughput and latency using fio.
...
+---------------------------------------------------------------------------------------
5.1.2. Show a test case config file

Take opnfv_yardstick_tc002 as an example. This test case measures network latency. Type yardstick testcase show opnfv_yardstick_tc002, and the console will show the config yaml of this test case:

---

schema: "yardstick:task:0.1"
description: >
    Yardstick TC002 config file;
    measure network latency using ping;

{% set image = image or "cirros-0.3.5" %}

{% set provider = provider or none %}
{% set physical_network = physical_network or 'physnet1' %}
{% set segmentation_id = segmentation_id or none %}
{% set packetsize = packetsize or 100 %}

scenarios:
{% for i in range(2) %}
-
  type: Ping
  options:
    packetsize: {{packetsize}}
  host: athena.demo
  target: ares.demo

  runner:
    type: Duration
    duration: 60
    interval: 10

  sla:
    max_rtt: 10
    action: monitor
{% endfor %}

context:
  name: demo
  image: {{image}}
  flavor: yardstick-flavor
  user: cirros

  placement_groups:
    pgrp1:
      policy: "availability"

  servers:
    athena:
      floating_ip: true
      placement: "pgrp1"
    ares:
      placement: "pgrp1"

  networks:
    test:
      cidr: '10.0.1.0/24'
      {% if provider == "vlan" or provider == "sriov" %}
      provider: {{provider}}
      physical_network: {{physical_network}}
        {% if segmentation_id %}
      segmentation_id: {{segmentation_id}}
        {% endif %}
      {% endif %}
5.1.3. Run a Yardstick test case

To run a test case, use yardstick task start <test_case_path>. This command supports the following parameters:

Parameters:

  • -d: show the debug log of the Yardstick run
  • --task-args: to customize test case parameters, pass a JSON string of parameter key-value pairs
  • --task-args-file: to customize test case parameters from a file, pass the path of a file containing the parameters
  • --parse-only: parse and validate the test case config file without running the test
  • --output-file OUTPUT_FILE_PATH: specify where to write the log; if not given, the default is "/tmp/yardstick/yardstick.log"
  • --suite TEST_SUITE_PATH: run a test suite; TEST_SUITE_PATH specifies where the test suite is located
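As a sketch of how these options fit together, the snippet below assembles a yardstick task start invocation with customized parameters. The helper name and the packetsize value (taken from the tc002 template above) are illustrative, not part of Yardstick itself.

```python
import json

def build_task_start_cmd(test_case_path, task_args=None, output_file=None):
    """Assemble a 'yardstick task start' invocation as an argument list."""
    cmd = ["yardstick", "task", "start", test_case_path]
    if task_args is not None:
        # --task-args takes a JSON string of parameter key-value pairs
        cmd += ["--task-args", json.dumps(task_args)]
    if output_file is not None:
        cmd += ["--output-file", output_file]
    return cmd

cmd = build_task_start_cmd(
    "tests/opnfv/test_cases/opnfv_yardstick_tc002.yaml",
    task_args={"packetsize": 200},
)
# The resulting list can be passed to subprocess.run(cmd) on a host
# where Yardstick is installed.
```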
5.2. Run Yardstick in a local environment

We also have a guide about How to run Yardstick in a local environment. This work was contributed by Tapio Tallgren.

5.3. Create a new testcase for Yardstick

As a user, you may want to define a new testcase in addition to the ones already available in Yardstick. This section will show you how to do this.

Each testcase consists of two sections:

  • scenarios describes what will be done by the test
  • context describes the environment in which the test will be run.
5.3.1. Defining the testcase scenarios

TODO

5.3.2. Defining the testcase context(s)

Each testcase consists of one or more contexts, which describe the environment in which the testcase will be run. Current available contexts are:

  • Dummy: this is a no-op context, and is used when there is no environment to set up e.g. when testing whether OpenStack services are available
  • Node: this context is used to perform operations on baremetal servers
  • Heat: uses OpenStack to provision the required hosts, networks, etc.
  • Kubernetes: uses Kubernetes to provision the resources required for the test.

Regardless of the context type, the context section of the testcase will consist of the following:

context:
  name: demo
  type: Dummy|Node|Heat|Kubernetes

The content of the context section will vary based on the context type.

5.3.2.1. Dummy Context

No additional information is required for the Dummy context:

context:
  name: my_context
  type: Dummy
5.3.2.2. Node Context

TODO

5.3.2.3. Heat Context

In addition to name and type, a Heat context requires the following arguments:

  • image: the image to be used to boot VMs
  • flavor: the flavor to be used for VMs in the context
  • user: the username for connecting into the VMs
  • networks: the networks to be created; networks are identified by name
    • name: network name (required)
    • (TODO) Any optional attributes
  • servers: The servers to be created
    • name: server name
    • (TODO) Any optional attributes

In addition to the required arguments, the following optional arguments can be passed to the Heat context:

  • placement_groups:
    • name: the name of the placement group to be created
    • policy: either affinity or availability
  • server_groups:
    • name: the name of the server group
    • policy: either affinity or anti-affinity

Combining these elements together, a sample Heat context config looks like:
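A minimal sketch, assembled from the required and optional arguments above and the values used in the tc002 template shown earlier (the image, flavor, server and network names are illustrative):

```yaml
context:
  name: demo
  type: Heat
  image: cirros-0.3.5
  flavor: yardstick-flavor
  user: cirros

  placement_groups:
    pgrp1:
      policy: "availability"

  servers:
    athena:
      floating_ip: true
      placement: "pgrp1"
    ares:
      placement: "pgrp1"

  networks:
    test:
      cidr: '10.0.1.0/24'
```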

5.3.2.3.1. Using existing HOT Templates

TODO

5.3.2.4. Kubernetes Context

TODO

5.3.2.5. Using multiple contexts in a testcase

When using multiple contexts in a testcase, the context section is replaced by a contexts section, and each context is separated with a - line:

contexts:
-
  name: context1
  type: Heat
  ...
-
  name: context2
  type: Node
  ...
5.3.2.6. Reusing a context

Typically, a context is torn down after a testcase is run; however, the user may wish to keep a context intact after the testcase is complete.

Note

This feature has been implemented for the Heat context only

To keep or reuse a context, the flags option must be specified:

  • no_setup: skip the deploy stage, and fetch the details of a deployed context/Heat stack.
  • no_teardown: skip the undeploy stage, thus keeping the stack intact for the next test.

If either of these flags is True, the context information must still be given. By default, both flags are disabled:

context:
  name: mycontext
  type: Heat
  flags:
    no_setup: True
    no_teardown: True
  ...
5.4. Create a test suite for Yardstick

A test suite in Yardstick is a .yaml file which includes one or more test cases. Yardstick supports running a test suite as a task, so you can customize your own test suite and run it in one task.

tests/opnfv/test_suites is the folder where Yardstick keeps its CI test suites. A typical test suite looks like this (the fuel_test_suite.yaml example):

---
# Fuel integration test task suite

schema: "yardstick:suite:0.1"

name: "fuel_test_suite"
test_cases_dir: "samples/"
test_cases:
-
  file_name: ping.yaml
-
  file_name: iperf3.yaml

As you can see, there are two test cases in fuel_test_suite.yaml. The schema and the name must be specified. The test cases are listed under the test_cases tag, and the directory containing them is given by the test_cases_dir tag.

Yardstick test suites also support constraints and task args for each test case. Here is another sample (the os-nosdn-nofeature-ha.yaml example), excerpted from a larger test suite, to show this:

---

schema: "yardstick:suite:0.1"

name: "os-nosdn-nofeature-ha"
test_cases_dir: "tests/opnfv/test_cases/"
test_cases:
-
  file_name: opnfv_yardstick_tc002.yaml
-
  file_name: opnfv_yardstick_tc005.yaml
-
  file_name: opnfv_yardstick_tc043.yaml
  constraint:
    installer: compass
    pod: huawei-pod1
  task_args:
    huawei-pod1: '{"pod_info": "etc/yardstick/.../pod.yaml",
      "host": "node4.LF","target": "node5.LF"}'

As you can see in test case opnfv_yardstick_tc043.yaml, there are two extra tags, constraint and task_args. constraint specifies which installer or pod the test case can run on in the CI environment. task_args specifies the task arguments for each pod.

All in all, to create a test suite in Yardstick you just need to create a YAML file and add test cases, plus constraints or task arguments if necessary.
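As an illustration of that file layout, the hypothetical helper below (not part of Yardstick) emits a minimal suite in the same shape as the fuel_test_suite.yaml example above:

```python
def make_suite(name, test_cases_dir, file_names):
    """Render a minimal Yardstick test suite YAML document as a string."""
    lines = [
        "---",
        'schema: "yardstick:suite:0.1"',
        "",
        'name: "%s"' % name,
        'test_cases_dir: "%s"' % test_cases_dir,
        "test_cases:",
    ]
    for file_name in file_names:
        # each entry is a one-key mapping under the test_cases list
        lines += ["-", "  file_name: %s" % file_name]
    return "\n".join(lines) + "\n"

suite = make_suite("my_test_suite", "samples/", ["ping.yaml", "iperf3.yaml"])
```

Writing the returned string to a .yaml file under your suite directory gives a suite you can run with yardstick task start --suite.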

5.5. References
6. Installing a plug-in into Yardstick
6.1. Abstract

Yardstick provides a plugin CLI command to support integration with other OPNFV testing projects. Below is an example invocation of the Yardstick plugin command, using the Storperf plug-in as a sample.

6.2. Installing Storperf into Yardstick

Storperf is delivered as a Docker container from https://hub.docker.com/r/opnfv/storperf/tags/.

There are two possible methods for installation in your environment:

  • Run container on Jump Host
  • Run container in a VM

In this introduction we will install Storperf on the Jump Host.

6.2.1. Step 0: Environment preparation

Requirements for running Storperf on the Jump Host:

  • Docker must be installed
  • Jump Host must have access to the OpenStack Controller API
  • Jump Host must have internet connectivity for downloading docker image
  • Enough floating IPs must be available to match your agent count

Before installing Storperf into Yardstick you need to check your OpenStack environment and other dependencies:

  1. Make sure docker is installed.
  2. Make sure Keystone, Nova, Neutron, Glance and Heat are installed correctly.
  3. Make sure the Jump Host has access to the OpenStack Controller API.
  4. Make sure the Jump Host has internet connectivity for downloading the docker image.
  5. You need to know where to get basic OpenStack Keystone authorization info, such as OS_PASSWORD, OS_PROJECT_NAME, OS_AUTH_URL and OS_USERNAME.
  6. To run a Storperf container, you need to have the OpenStack Controller environment variables defined and passed to the Storperf container. The best way to do this is to put the environment variables in a "storperf_admin-rc" file. The storperf_admin-rc should include at least these credential environment variables:
    • OS_AUTH_URL
    • OS_USERNAME
    • OS_PASSWORD
    • OS_PROJECT_NAME
    • OS_PROJECT_ID
    • OS_USER_DOMAIN_ID

Yardstick has a prepare_storperf_admin-rc.sh script which can be used to generate the storperf_admin-rc file. This script is located at test/ci/prepare_storperf_admin-rc.sh:

#!/bin/bash
# Prepare storperf_admin-rc for StorPerf.
AUTH_URL=${OS_AUTH_URL}
USERNAME=${OS_USERNAME:-admin}
PASSWORD=${OS_PASSWORD:-console}

# OS_TENANT_NAME is still present to keep backward compatibility with legacy
# deployments, but should be replaced by OS_PROJECT_NAME.
TENANT_NAME=${OS_TENANT_NAME:-admin}
PROJECT_NAME=${OS_PROJECT_NAME:-$TENANT_NAME}
PROJECT_ID=$(openstack project show admin | grep '\bid\b' | awk -F '|' '{print $3}' | sed -e 's/^[[:space:]]*//')
USER_DOMAIN_ID=${OS_USER_DOMAIN_ID:-default}

rm -f ~/storperf_admin-rc
touch ~/storperf_admin-rc

echo "OS_AUTH_URL=$AUTH_URL" >> ~/storperf_admin-rc
echo "OS_USERNAME=$USERNAME" >> ~/storperf_admin-rc
echo "OS_PASSWORD=$PASSWORD" >> ~/storperf_admin-rc
echo "OS_PROJECT_NAME=$PROJECT_NAME" >> ~/storperf_admin-rc
echo "OS_PROJECT_ID=$PROJECT_ID" >> ~/storperf_admin-rc
echo "OS_USER_DOMAIN_ID=$USER_DOMAIN_ID" >> ~/storperf_admin-rc

The generated storperf_admin-rc file will be stored in the home directory of the user running the script (~/storperf_admin-rc). If you installed Yardstick using Docker, this file will be located in the container; you may need to copy it to the host where Storperf is deployed.

6.2.2. Step 1: Plug-in configuration file preparation

To install a plug-in, first you need to prepare a plug-in configuration file in YAML format and store it in the "plugin" directory. The plug-in configuration file works as the input of the yardstick plugin command. Below is the Storperf plug-in configuration file sample:

---
# StorPerf plugin configuration file
# Used for integration StorPerf into Yardstick as a plugin
schema: "yardstick:plugin:0.1"
plugins:
  name: storperf
deployment:
  ip: 192.168.23.2
  user: root
  password: root

In the plug-in configuration file, you need to specify the plug-in name and the plug-in deployment info, including the node IP and the node login username and password. Here Storperf will be installed on IP 192.168.23.2, which is the Jump Host in this example environment.

6.2.3. Step 2: Plug-in install/remove scripts preparation

In the yardstick/resource/scripts directory there are two folders: an install folder and a remove folder. You need to store the plug-in install and remove scripts in these two folders respectively.

The detailed install and remove operations should be defined in these two scripts. The names of both the install and remove scripts must match the plug-in name that you specified in the plug-in configuration file.

For example, the install and remove scripts for Storperf are both named storperf.bash.
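Following this convention, the expected script locations can be derived from the plug-in name. The helper below is an illustrative sketch, not part of Yardstick:

```python
import os

def plugin_script_paths(plugin_name, base="yardstick/resource/scripts"):
    """Return the (install, remove) script paths for a given plug-in name."""
    return (os.path.join(base, "install", plugin_name + ".bash"),
            os.path.join(base, "remove", plugin_name + ".bash"))

install_script, remove_script = plugin_script_paths("storperf")
```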

6.2.4. Step 3: Install and remove Storperf

To install Storperf, simply execute the following command:

# Install Storperf
yardstick plugin install plugin/storperf.yaml
6.2.4.1. Removing Storperf from Yardstick

To remove Storperf, simply execute the following command:

# Remove Storperf
yardstick plugin remove plugin/storperf.yaml

The yardstick plugin command uses the username and password to log into the deployment target and then executes the corresponding install or remove script.

7. Store Other Project’s Test Results in InfluxDB
7.1. Abstract

This chapter illustrates how to run plug-in test cases and store test results into the community's InfluxDB. The framework is shown in the figure below.

Store Other Project's Test Results in InfluxDB
7.2. Store Storperf Test Results into Community’s InfluxDB

As shown in the figure above, there are two ways to store Storperf test results into the community's InfluxDB:

  1. Yardstick executes the Storperf test case (TC074), posting the test job to the Storperf container via the ReST API. After the test job is completed, Yardstick reads the test results from Storperf via the ReST API and posts the test data to InfluxDB.
  2. Additionally, Storperf can run tests by itself and post the test results directly to InfluxDB. This method of posting data directly to InfluxDB will be supported in the future.

The plan is to support a REST API in the Danube release so that other testing projects can call it to use the Yardstick dispatcher service to push data to Yardstick's InfluxDB database.

For now, InfluxDB only supports the line protocol; the JSON protocol is deprecated.

Take the ping test case for example; the raw_result is in JSON format like this:

{
  "benchmark": {
    "timestamp": 1470315409.868095,
    "errors": "",
    "data": {
      "rtt": {
        "ares": 1.125
      }
    },
    "sequence": 1
  },
  "runner_id": 2625
}

With the help of "influxdb_line_protocol", the JSON is transformed into a line string like the one below:

'ping,deploy_scenario=unknown,host=athena.demo,installer=unknown,pod_name=unknown,
  runner_id=2625,scenarios=Ping,target=ares.demo,task_id=77755f38-1f6a-4667-a7f3-
    301c99963656,version=unknown rtt.ares=1.125 1470315409868094976'

So, for data output in JSON format, you just need to transform the JSON into line format and call the InfluxDB API to post the data into the database. This functionality has already been implemented; if you need support with it, please contact Mingjiang.
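The transformation can be sketched as follows. This simplified function is not Yardstick's actual implementation and ignores the line protocol's escaping rules for special characters:

```python
def to_line_protocol(measurement, tags, fields, timestamp_s):
    """Build a single InfluxDB line protocol string from tags and fields."""
    tag_part = ",".join("%s=%s" % (k, v) for k, v in sorted(tags.items()))
    field_part = ",".join("%s=%s" % (k, v) for k, v in sorted(fields.items()))
    # InfluxDB line protocol expects a nanosecond integer timestamp
    ts_ns = int(timestamp_s * 10**9)
    return "%s,%s %s %d" % (measurement, tag_part, field_part, ts_ns)

line = to_line_protocol(
    "ping",
    {"host": "athena.demo", "target": "ares.demo", "runner_id": 2625},
    {"rtt.ares": 1.125},
    1470315409.868095,
)
```

The resulting string has the same measurement, tag and field layout as the ping example above and can be sent to InfluxDB with a POST to /write.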

curl -i -XPOST 'http://104.197.68.199:8086/write?db=yardstick' \
  --data-binary 'ping,deploy_scenario=unknown,host=athena.demo,installer=unknown, ...'

Grafana will be used for visualizing the collected test data, as shown in the figure below. Grafana can be accessed via its login page.

results visualization
8. Grafana dashboard
8.1. Abstract

This chapter describes the Yardstick Grafana dashboard. The Yardstick Grafana dashboard can be found here: http://testresults.opnfv.org/grafana/

Yardstick grafana dashboard
8.2. Public access

Yardstick provides a public account for accessing the dashboard. The username and password are both set to 'opnfv'.

8.3. Testcase dashboard

For each test case, there is a dedicated dashboard. Shown here is the dashboard of TC002.

On the top left of each test case dashboard there is a dashboard selection; you can switch to a different test case using this pull-down menu.

Underneath, we have a pod and scenario selection. All the pods and scenarios that have ever published test data to the InfluxDB will be shown here.

You can check multiple pods or scenarios.

For each test case, we have a short description and a link to detailed test case information in Yardstick user guide.

Underneath is the result presentation section. You can use the time period selection in the top right corner to zoom the chart in or out.

8.4. Administration access

For a user with administration rights it is easy to update and save any dashboard configuration. Saved updates take effect immediately and become live. This may cause issues such as:

  • Changes and updates made to the live configuration in Grafana can compromise existing Grafana content in an unwanted, unpredicted or incompatible way. Grafana as such is not version controlled, there exists one single Grafana configuration per dashboard.
  • There is a risk several people can disturb each other when doing updates to the same Grafana dashboard at the same time.

Administrators should therefore make any changes with care.

8.5. Add a dashboard into yardstick grafana

Due to security concerns, users using the public opnfv account are not able to edit the Yardstick Grafana dashboards directly. It takes a few more steps for a non-Yardstick user to add a custom dashboard into Yardstick Grafana.

There are six steps to go.

Add a dashboard into yardstick grafana
  1. You need to set up a local InfluxDB and Grafana, so you can do the work locally. You can refer to the How to deploy InfluxDB and Grafana locally wiki page for how to do this.
  2. Once step one is done, you can fetch the existing Grafana dashboard configuration file from the Yardstick repository and import it into your local Grafana. After the import is done, your Grafana dashboard will be ready to use just like the community's dashboard.
  3. The third step is running some test cases to generate test results and publishing them to your local InfluxDB.
  4. Now you have some data to visualize in your dashboard. In the fourth step, it is time to create your own dashboard. You can either modify an existing dashboard or try to create a new one from scratch. If you choose to modify an existing dashboard, then in the menu of the existing dashboard do a "Save As..." into a new dashboard copy, and then continue doing all updates and saves within the dashboard copy.
  5. When you have finished all Grafana configuration changes in this temporary dashboard, choose "export" of the updated dashboard copy into a JSON file and put it up for review in Gerrit, in the file /yardstick/dashboard/Yardstick-TCxxx-yyyyyyyyyyyyy. For instance, a typical default name of the file would be Yardstick-TC001 Copy-1234567891234.
  6. Once you finish your dashboard, the next step is to export the configuration file and propose a patch into Yardstick. The Yardstick team will review it and merge it into the Yardstick repository. After the review is approved, the Yardstick team will do an "import" of the JSON file and a "save dashboard" as soon as possible to replace the old live dashboard configuration.
9. Yardstick Restful API
9.1. Abstract

Yardstick has supported a RESTful API since Danube.

9.2. Available API
9.2.1. /yardstick/env/action

Description: This API is used to prepare Yardstick test environment. For Euphrates, it supports:

  1. Prepare the Yardstick test environment, including setting the EXTERNAL_NETWORK environment variable, loading the Yardstick VM images and creating flavors;
  2. Start an InfluxDB Docker container and configure Yardstick to output to InfluxDB;
  3. Start a Grafana Docker container and configure it with the InfluxDB.

Which operation is performed depends on the parameters passed.

Method: POST

Prepare Yardstick test environment Example:

{
    'action': 'prepare_env'
}

This is an asynchronous API. You need to call /yardstick/asynctask API to get the task result.

Start and config an InfluxDB docker container Example:

{
    'action': 'create_influxdb'
}

This is an asynchronous API. You need to call /yardstick/asynctask API to get the task result.

Start and config a Grafana docker container Example:

{
    'action': 'create_grafana'
}

This is an asynchronous API. You need to call /yardstick/asynctask API to get the task result.

9.2.2. /yardstick/asynctask

Description: This API is used to get the status of asynchronous tasks

Method: GET

Get the status of asynchronous tasks Example:

http://<SERVER IP>:<PORT>/yardstick/asynctask?task_id=3f3f5e03-972a-4847-a5f8-154f1b31db8c

The returned status will be 0 (running), 1 (finished) or 2 (failed).
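A client polling this endpoint only needs to interpret those three codes. The helper names, IP and port below are illustrative, not part of the Yardstick API:

```python
# Status codes returned by /yardstick/asynctask, as documented above
STATUS = {0: "running", 1: "finished", 2: "failed"}

def asynctask_url(server_ip, port, task_id):
    """Build the query URL for the asynctask status endpoint."""
    return "http://%s:%s/yardstick/asynctask?task_id=%s" % (server_ip, port, task_id)

def is_done(status_code):
    """A task is done once it has finished or failed."""
    return STATUS.get(status_code) in ("finished", "failed")

url = asynctask_url("192.0.2.10", 8888, "3f3f5e03-972a-4847-a5f8-154f1b31db8c")
```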

NOTE:

<SERVER IP>: the IP of the host where you started your Yardstick container
<PORT>: the host port you mapped when you started the Yardstick container
9.2.3. /yardstick/testcases

Description: This API is used to list all released Yardstick test cases.

Method: GET

Get a list of released test cases Example:

http://<SERVER IP>:<PORT>/yardstick/testcases
9.2.4. /yardstick/testcases/release/action

Description: This API is used to run a Yardstick released test case.

Method: POST

Run a released test case Example:

{
    'action': 'run_test_case',
    'args': {
        'opts': {},
        'testcase': 'opnfv_yardstick_tc002'
    }
}

This is an asynchronous API. You need to call /yardstick/results to get the result.

9.2.5. /yardstick/testcases/samples/action

Description: This API is used to run a Yardstick sample test case.

Method: POST

Run a sample test case Example:

{
    'action': 'run_test_case',
    'args': {
        'opts': {},
        'testcase': 'ping'
    }
}

This is an asynchronous API. You need to call /yardstick/results to get the result.

9.2.6. /yardstick/testcases/<testcase_name>/docs

Description: This API is used to get the documentation of a certain released test case.

Method: GET

Get the documentation of a certain test case Example:

http://<SERVER IP>:<PORT>/yardstick/testcases/opnfv_yardstick_tc002/docs
9.2.7. /yardstick/testsuites/action

Description: This API is used to run a Yardstick test suite.

Method: POST

Run a test suite Example:

{
    'action': 'run_test_suite',
    'args': {
        'opts': {},
        'testsuite': 'opnfv_smoke'
    }
}

This is an asynchronous API. You need to call /yardstick/results to get the result.

9.2.8. /yardstick/tasks/<task_id>/log

Description: This API is used to get the real time log of test case execution.

Method: GET

Get the real time log of test case execution Example:

http://<SERVER IP>:<PORT>/yardstick/tasks/14795be8-f144-4f54-81ce-43f4e3eab33f/log?index=0
9.2.9. /yardstick/results

Description: This API is used to get the test results of tasks. If you call /yardstick/testcases/samples/action API, it will return a task id. You can use the returned task id to get the results by using this API.

Method: GET

Get test results of one task Example:

http://<SERVER IP>:<PORT>/yardstick/results?task_id=3f3f5e03-972a-4847-a5f8-154f1b31db8c

This API will return a list of test case results.

9.2.10. /api/v2/yardstick/openrcs

Description: This API provides functionality of handling OpenStack credential file (openrc). For Euphrates, it supports:

  1. Upload an openrc file for an OpenStack environment;
  2. Update an openrc;
  3. Get openrc file information;
  4. Delete an openrc file.

Which operation is performed depends on the parameters passed.

METHOD: POST

Upload an openrc file for an OpenStack environment Example:

{
    'action': 'upload_openrc',
    'args': {
        'file': file,
        'environment_id': environment_id
    }
}

METHOD: POST

Update an openrc file Example:

{
    'action': 'update_openrc',
    'args': {
        'openrc': {
            "EXTERNAL_NETWORK": "ext-net",
            "OS_AUTH_URL": "http://192.168.23.51:5000/v3",
            "OS_IDENTITY_API_VERSION": "3",
            "OS_IMAGE_API_VERSION": "2",
            "OS_PASSWORD": "console",
            "OS_PROJECT_DOMAIN_NAME": "default",
            "OS_PROJECT_NAME": "admin",
            "OS_USERNAME": "admin",
            "OS_USER_DOMAIN_NAME": "default"
        },
        'environment_id': environment_id
    }
}
9.2.11. /api/v2/yardstick/openrcs/<openrc_id>

Description: This API provides functionality of handling OpenStack credential file (openrc). For Euphrates, it supports:

  1. Get openrc file information;
  2. Delete an openrc file.

METHOD: GET

Get openrc file information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/openrcs/5g6g3e02-155a-4847-a5f8-154f1b31db8c

METHOD: DELETE

Delete openrc file Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/openrcs/5g6g3e02-155a-4847-a5f8-154f1b31db8c
9.2.12. /api/v2/yardstick/pods

Description: This API provides functionality of handling Yardstick pod file (pod.yaml). For Euphrates, it supports:

  1. Upload a pod file;

Which operation is performed depends on the parameters passed.

METHOD: POST

Upload a pod.yaml file Example:

{
    'action': 'upload_pod_file',
    'args': {
        'file': file,
        'environment_id': environment_id
    }
}
9.2.13. /api/v2/yardstick/pods/<pod_id>

Description: This API provides functionality of handling Yardstick pod file (pod.yaml). For Euphrates, it supports:

  1. Get pod file information;
  2. Delete a pod file.

METHOD: GET

Get pod file information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/pods/5g6g3e02-155a-4847-a5f8-154f1b31db8c

METHOD: DELETE

Delete pod file Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/pods/5g6g3e02-155a-4847-a5f8-154f1b31db8c
9.2.14. /api/v2/yardstick/images

Description: This API is used to do some work related to Yardstick VM images. For Euphrates, it supports:

  1. Load Yardstick VM images;

Which operation is performed depends on the parameters passed.

METHOD: POST

Load VM images Example:

{
    'action': 'load_image',
    'args': {
        'name': 'yardstick-image'
    }
}
9.2.15. /api/v2/yardstick/images/<image_id>

Description: This API is used to do some work related to Yardstick VM images. For Euphrates, it supports:

  1. Get image’s information;
  2. Delete images

METHOD: GET

Get image information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/images/5g6g3e02-155a-4847-a5f8-154f1b31db8c

METHOD: DELETE

Delete images Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/images/5g6g3e02-155a-4847-a5f8-154f1b31db8c
9.2.16. /api/v2/yardstick/tasks

Description: This API is used to do some work related to yardstick tasks. For Euphrates, it supports:

  1. Create a Yardstick task;

Which operation is performed depends on the parameters passed.

METHOD: POST

Create a Yardstick task Example:

{
    'action': 'create_task',
        'args': {
            'name': 'task1',
            'project_id': project_id
        }
}
9.2.17. /api/v2/yardstick/tasks/<task_id>

Description: This API is used to do some work related to yardstick tasks. For Euphrates, it supports:

  1. Add an environment to a task;
  2. Add a test case to a task;
  3. Add a test suite to a task;
  4. Run a Yardstick task;
  5. Get a task's information;
  6. Delete a task.

METHOD: PUT

Add an environment to a task

Example:

{
    'action': 'add_environment',
    'args': {
        'environment_id': 'e3cadbbb-0419-4fed-96f1-a232daa0422a'
    }
}

METHOD: PUT

Add a test case to a task Example:

{
    'action': 'add_case',
    'args': {
        'case_name': 'opnfv_yardstick_tc002',
        'case_content': case_content
    }
}

METHOD: PUT

Add a test suite to a task Example:

{
    'action': 'add_suite',
    'args': {
        'suite_name': 'opnfv_smoke',
        'suite_content': suite_content
    }
}

METHOD: PUT

Run a task

Example:

{
    'action': 'run'
}

METHOD: GET

Get a task’s information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/tasks/5g6g3e02-155a-4847-a5f8-154f1b31db8c

METHOD: DELETE

Delete a task

Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/tasks/5g6g3e02-155a-4847-a5f8-154f1b31db8c
9.2.18. /api/v2/yardstick/testcases

Description: This API is used to do some work related to Yardstick testcases. For Euphrates, it supports:

  1. Upload a test case;
  2. Get all released test cases’ information;

Which operation is performed depends on the parameters passed.

METHOD: POST

Upload a test case Example:

{
    'action': 'upload_case',
    'args': {
        'file': file
    }
}

METHOD: GET

Get all released test cases’ information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/testcases
9.2.19. /api/v2/yardstick/testcases/<case_name>

Description: This API is used to do some work related to yardstick testcases. For Euphrates, it supports:

  1. Get certain released test case’s information;
  2. Delete a test case.

METHOD: GET

Get certain released test case’s information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/testcases/opnfv_yardstick_tc002

METHOD: DELETE

Delete a certain test case Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/testcases/opnfv_yardstick_tc002
9.2.20. /api/v2/yardstick/testsuites

Description: This API is used to do some work related to yardstick test suites. For Euphrates, it supports:

  1. Create a test suite;
  2. Get all test suites;

Which operation is performed depends on the parameters passed.

METHOD: POST

Create a test suite Example:

{
    'action': 'create_suite',
    'args': {
        'name': <suite_name>,
        'testcases': [
            'opnfv_yardstick_tc002'
        ]
    }
}

METHOD: GET

Get all test suites Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/testsuites
9.2.21. /api/v2/yardstick/testsuites/<suite_name>

Description: This API is used to do some work related to yardstick test suites. For Euphrates, it supports:

  1. Get certain test suite’s information;
  2. Delete a test suite.

METHOD: GET

Get certain test suite’s information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/testsuites/<suite_name>

METHOD: DELETE

Delete a certain test suite Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/testsuites/<suite_name>
9.2.22. /api/v2/yardstick/projects

Description: This API is used to do some work related to Yardstick test projects. For Euphrates, it supports:

  1. Create a Yardstick project;
  2. Get all projects;

Which operation is performed depends on the parameters passed.

METHOD: POST

Create a Yardstick project Example:

{
    'action': 'create_project',
    'args': {
        'name': 'project1'
    }
}

METHOD: GET

Get all projects’ information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/projects
9.2.23. /api/v2/yardstick/projects/<project_id>

Description: This API is used to do some work related to yardstick test projects. For Euphrates, it supports:

  1. Get certain project’s information;
  2. Delete a project.

METHOD: GET

Get certain project’s information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/projects/<project_id>

METHOD: DELETE

Delete a certain project Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/projects/<project_id>
9.2.24. /api/v2/yardstick/containers

Description: This API is used to do some work related to Docker containers. For Euphrates, it supports:

  1. Create a Grafana Docker container;
  2. Create an InfluxDB Docker container;

Which operation is performed depends on the parameters passed.

METHOD: POST

Create a Grafana Docker container Example:

{
    'action': 'create_grafana',
    'args': {
        'environment_id': <environment_id>
    }
}

METHOD: POST

Create an InfluxDB Docker container Example:

{
    'action': 'create_influxdb',
    'args': {
        'environment_id': <environment_id>
    }
}
9.2.25. /api/v2/yardstick/containers/<container_id>

Description: This API is used to do some work related to Docker containers. For Euphrates, it supports:

  1. Get certain container’s information;
  2. Delete a container.

METHOD: GET

Get certain container’s information Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/containers/<container_id>

METHOD: DELETE

Delete a certain container Example:

http://<SERVER IP>:<PORT>/api/v2/yardstick/containers/<container_id>
10. Yardstick User Interface

This interface allows a user to view test results in table format and also plotted on a graph.

10.1. Command
yardstick report generate <task-ID> <testcase-filename>
10.2. Description

1. When the command is triggered, using the task-ID and the test case name provided, the respective values are retrieved from the database (InfluxDB in this particular case).

2. The values are then formatted and provided to the HTML template, framed with a complete HTML body, using the Django framework.

3. Then the whole template is written into an HTML file.

The graph is framed with the timestamp on the x-axis and the output values (which differ from test case to test case) on the y-axis with the help of "Highcharts".
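The graph data can be thought of as (timestamp, value) pairs per metric. The sketch below is illustrative, not the report's actual code:

```python
def to_series(records, key):
    """Extract (timestamp, value) pairs for one metric from result records."""
    return [(r["timestamp"], r[key]) for r in records if key in r]

# Example records shaped like the ping results shown earlier in this guide
series = to_series(
    [{"timestamp": 1470315409.8, "rtt.ares": 1.125},
     {"timestamp": 1470315419.8, "rtt.ares": 1.1}],
    "rtt.ares",
)
```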

11. Network Services Benchmarking (NSB)
11.1. Abstract

This chapter provides an overview of the NSB, a contribution to OPNFV Yardstick from Intel.

11.2. Overview

The goal of NSB is to extend Yardstick to perform real-world VNF and NFVi characterization and benchmarking with repeatable and deterministic methods.

Network Service Benchmarking (NSB) extends the Yardstick framework to do VNF characterization and benchmarking in three different execution environments: bare metal (i.e. a native Linux environment), a standalone virtual environment, and a managed virtualized environment (e.g. OpenStack). It also brings in the capability to interact with external traffic generators, both hardware and software based, for triggering and validating traffic according to user-defined profiles.

NSB extension includes:

  • Generic data models of Network Services, based on ETSI spec ETSI GS NFV-TST 001

  • New Standalone context for VNF testing like SRIOV, OVS, OVS-DPDK etc

  • Generic VNF configuration models and metrics implemented with Python classes

  • Traffic generator features and traffic profiles

    • L1-L3 stateless traffic profiles
    • L4-L7 stateful traffic profiles
    • Tunneling protocol / network overlay support
  • Test case samples

    • Ping
    • Trex
    • vPE, vCGNAT, vFirewall etc. - IPv4 throughput, latency etc.
  • Traffic generators like Trex, ab/nginx, ixia, iperf etc

  • KPIs for a given use case:

    • System agent support for collecting NFVi KPI. This includes:

      • CPU statistics
      • Memory BW
      • OVS-DPDK Stats
    • Network KPIs, e.g., inpackets, outpackets, throughput, latency etc.

    • VNF KPIs, e.g., packet_in, packet_drop, packet_fwd etc

11.3. Architecture

The Network Service (NS) defines a set of Virtual Network Functions (VNF) connected together using NFV infrastructure.

The Yardstick NSB extension can support multiple VNFs created by different vendors, including traffic generators. Every VNF being tested has its own data model. The network service defines a VNF model on the basis of the network functionality performed. Part of the data model is the set of configuration parameters, the number of connection points used, and the flavor, including the core and memory amount.

ETSI defines a Network Service as a set of configurable VNFs working in some NFV infrastructure, connected to each other using Virtual Links available through Connection Points. The ETSI MANO specification defines a set of management entities called Network Service Descriptors (NSD) and VNF Descriptors (VNFD) that define a real network service. The picture below gives an example of how a real network operator use case can map onto the ETSI network service definition.
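In NSB these descriptors end up as plain Python data structures. The fragment below is a hypothetical, much-simplified VNFD (real NSB descriptors also carry benchmark KPIs, traffic profile references, management interfaces, and more); it only illustrates the descriptor idea:

```python
# Hypothetical, much-simplified VNFD; field names chosen for illustration.
vnfd = {
    "vnfd-id": "vfw",
    "connection-point": [
        {"name": "xe0", "type": "VPORT"},
        {"name": "xe1", "type": "VPORT"},
    ],
    "flavor": {"vcpus": 6, "memory-mb": 4096},
}

def connection_point_names(descriptor):
    """Connection points are where Virtual Links attach to the VNF."""
    return [cp["name"] for cp in descriptor.get("connection-point", [])]

print(connection_point_names(vnfd))  # prints ['xe0', 'xe1']
```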

The Network Service framework performs the necessary test steps. These may involve:

  • Interacting with the traffic generator and providing the inputs on traffic type / packet structure to generate the required traffic as per the test case. Traffic profiles are used for this.
  • Executing the commands required for the test procedure and analysing the command output to confirm whether the command executed correctly, e.g. as per the test case, running the traffic for the given time period or waiting for the necessary time delay.
  • Verifying the test result.
  • Validating the traffic flow from the SUT.
  • Fetching the table / data from the SUT and verifying the values as per the test case.
  • Uploading the logs from the SUT onto the Test Harness server.
  • Reading the KPIs provided by the particular VNF.
11.3.1. Components of Network Service
  • Models for Network Service benchmarking: Network Service benchmarking requires a proper modelling approach. NSB provides models as Python files defining NSDs and VNFDs.

The benchmark control application, being a part of OPNFV Yardstick, can call these Python models to instantiate and configure the VNFs. Depending on the infrastructure type (bare metal or fully virtualized), these calls can be made directly or through a MANO system.

  • Traffic generators in NSB: Any benchmark application requires a set of traffic generators and traffic profiles defining the method in which traffic is generated.

The Network Service benchmarking model extends the network service definition with a set of Traffic Generators (TG) that are treated the same way as the other VNFs that are part of the benchmarked network service. Like other VNFs, the traffic generators are instantiated and terminated.

Every traffic generator has its own configuration, defined as a traffic profile, and a set of supported KPIs. The Python model for a TG is extended with specific calls to listen for and generate traffic.

  • The stateless TREX traffic generator: The main traffic generator used as the network service stimulus is the open source TREX tool.

The TREX tool can generate any kind of stateless traffic.

+--------+      +-------+      +--------+
|        |      |       |      |        |
|  Trex  | ---> |  VNF  | ---> |  Trex  |
|        |      |       |      |        |
+--------+      +-------+      +--------+

Supported test case scenarios:

  • Correlated UDP traffic using the TREX traffic generator and a replay VNF.

    • using different IMIX configurations, e.g. pure voice, pure video traffic etc.
    • using different numbers of IP flows, e.g. 1 flow, 1K, 16K, 64K, 256K, 1M flows
    • using different numbers of configured rules, e.g. 1 rule, 1K, 10K rules

For UDP correlated traffic, the following Key Performance Indicators are collected for every combination of test case parameters:

  • RFC2544 throughput for the various defined loss rates (1% is the default)
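RFC2544 throughput is typically found by a binary search over the offered rate: the highest rate whose measured packet loss stays within the allowed percentage (1% by default) is reported. The sketch below is self-contained and uses a stand-in measurement function; in NSB the actual measurement is performed by the traffic generator, not by this code:

```python
def rfc2544_throughput(measure_loss_pct, max_rate, allowed_loss_pct=1.0,
                       precision=0.1):
    """Binary-search the highest rate whose packet loss does not exceed
    allowed_loss_pct.  measure_loss_pct(rate) runs one trial and
    returns the observed loss percentage at that offered rate."""
    lo, hi, best = 0.0, max_rate, 0.0
    while hi - lo > precision:
        rate = (lo + hi) / 2.0
        if measure_loss_pct(rate) <= allowed_loss_pct:
            best, lo = rate, rate      # trial passed: search higher
        else:
            hi = rate                  # trial failed: search lower
    return best

# Stand-in DUT: loss stays at 0% up to 7.5 (Mpps, say), then grows fast.
fake_dut = lambda rate: 0.0 if rate <= 7.5 else (rate - 7.5) * 50
print(rfc2544_throughput(fake_dut, max_rate=10.0))  # prints 7.5
```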
11.4. Graphical Overview

NSB testing with the Yardstick framework facilitates performance testing of the various VNFs provided.

+-----------+
|           |                                                     +-----------+
|   vPE     |                                                   ->|TGen Port 0|
| TestCase  |                                                   | +-----------+
|           |                                                   |
+-----------+     +------------------+            +-------+     |
                  |                  | -- API --> |  VNF  | <--->
+-----------+     |     Yardstick    |            +-------+     |
| Test Case | --> |    NSB Testing   |                          |
+-----------+     |                  |                          |
      |           |                  |                          |
      |           +------------------+                          |
+-----------+                                                   | +-----------+
|   Traffic |                                                   ->|TGen Port 1|
|  patterns |                                                     +-----------+
+-----------+

            Figure 1: Network Service - 2 server configuration
11.4.1. VNFs supported for characterization:
  1. CGNAPT - Carrier Grade Network Address and Port Translation

  2. vFW - Virtual Firewall

  3. vACL - Access Control List

  4. Prox - Packet pROcessing eXecution engine:
    • VNF can act as Drop, Basic Forwarding (no touch), L2 Forwarding (change MAC), GRE encap/decap, Load balance based on packet fields, Symmetric load balancing
    • QinQ encap/decap IPv4/IPv6, ARP, QoS, Routing, Unmpls, Policing, ACL
  5. UDP_Replay

12. Yardstick - NSB Testing - Installation
12.1. Abstract

The Network Service Benchmarking (NSB) extends the Yardstick framework to do VNF characterization and benchmarking in three different execution environments: bare metal (native Linux environment), standalone virtual environment, and managed virtualized environment (e.g. OpenStack). It also brings in the capability to interact with external traffic generators, both hardware and software based, for triggering and validating traffic according to user-defined profiles.

The steps needed to run Yardstick with NSB testing are:

  • Install Yardstick (NSB testing).
  • Set up or reference a pod.yaml describing the test topology.
  • Create or reference the test configuration yaml file.
  • Run the test case.
12.2. Prerequisites

Refer to the chapter Yardstick Installation for more information on Yardstick prerequisites.

Several prerequisites are needed for Yardstick (VNF testing):

  • Python Modules: pyzmq, pika.
  • flex
  • bison
  • build-essential
  • automake
  • libtool
  • librabbitmq-dev
  • rabbitmq-server
  • collectd
  • intel-cmt-cat
12.2.1. Hardware & Software Ingredients

SUT requirements:

Item      Description
Memory    Min 20GB
NICs      2 x 10G
OS        Ubuntu 16.04.3 LTS
Kernel    4.4.0-34-generic
DPDK      17.02

Boot and BIOS settings:

Boot settings:
    default_hugepagesz=1G hugepagesz=1G hugepages=16 hugepagesz=2M hugepages=2048
    isolcpus=1-11,22-33 nohz_full=1-11,22-33 rcu_nocbs=1-11,22-33
    iommu=on iommu=pt intel_iommu=on

    Note: nohz_full and rcu_nocbs are used to disable Linux kernel interrupts on the isolated CPUs.

BIOS settings:
    CPU Power and Performance Policy: <Performance>
    CPU C-state: Disabled
    CPU P-state: Disabled
    Enhanced Intel® Speedstep® Tech: Disabled
    Hyper-Threading Technology (if supported): Enabled
    Virtualization Technology: Enabled
    Intel(R) VT for Direct I/O: Enabled
    Coherency: Enabled
    Turbo Boost: Disabled
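The hugepage parameters in the boot settings reserve 16 x 1G plus 2048 x 2M = 20 GiB, which matches the 20GB minimum memory requirement above. A small parser (assuming exactly the cmdline syntax shown) makes the arithmetic explicit:

```python
def hugepage_bytes(cmdline):
    """Sum the memory reserved by hugepagesz=/hugepages= pairs on a
    kernel command line such as the boot settings above."""
    units = {"2M": 2 * 1024**2, "1G": 1024**3}
    total, size = 0, None
    for tok in cmdline.split():
        key, _, val = tok.partition("=")
        if key == "hugepagesz":
            size = units[val]
        elif key == "hugepages" and size is not None:
            total += size * int(val)
    return total

boot = ("default_hugepagesz=1G hugepagesz=1G hugepages=16 "
        "hugepagesz=2M hugepages=2048")
print(hugepage_bytes(boot) / 1024**3)  # prints 20.0
```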
12.3. Install Yardstick (NSB Testing)

Download the source code and install Yardstick from it:

git clone https://gerrit.opnfv.org/gerrit/yardstick

cd yardstick

# Switch to latest stable branch
# git checkout <tag or stable branch>
git checkout stable/euphrates

Configure the network proxy, either using the environment variables or setting the global environment file:

cat /etc/environment
http_proxy='http://proxy.company.com:port'
https_proxy='http://proxy.company.com:port'
export http_proxy='http://proxy.company.com:port'
export https_proxy='http://proxy.company.com:port'

The last step is to modify the Yardstick installation inventory, used by Ansible:

cat ./ansible/install-inventory.ini
[jumphost]
localhost  ansible_connection=local

[yardstick-standalone]
yardstick-standalone-node ansible_host=192.168.1.2
yardstick-standalone-node-2 ansible_host=192.168.1.3

# The section below is only for backward compatibility.
# It will be removed later.
[yardstick:children]
jumphost

[all:vars]
ansible_user=root
ansible_pass=root

Note

Passwordless SSH access needs to be configured for all nodes defined in the install-inventory.ini file. If you want to use password authentication, you need to install sshpass:

sudo -EH apt-get install sshpass

To execute an installation for a Bare-Metal or a Standalone context:

./nsb_setup.sh

To execute an installation for an OpenStack context:

./nsb_setup.sh <path to admin-openrc.sh>

The above command sets up a Docker container with the latest Yardstick code. To enter the container, execute:

docker exec -it yardstick bash

It will also automatically download all the packages needed for the NSB testing setup. For more on Docker, refer to the chapter Yardstick Installation, section Install Yardstick using Docker (recommended).

12.4. System Topology:
+----------+              +----------+
|          |              |          |
|          | (0)----->(0) |          |
|    TG1   |              |    DUT   |
|          |              |          |
|          | (1)<-----(1) |          |
+----------+              +----------+
trafficgen_1                   vnf
12.5. Environment parameters and credentials
12.5.1. Config yardstick conf

If the user did not run ‘yardstick env influxdb’ inside the container (which generates a correct yardstick.conf), then create the config file manually (run inside the container):

cp ./etc/yardstick/yardstick.conf.sample /etc/yardstick/yardstick.conf
vi /etc/yardstick/yardstick.conf

Add trex_path, trex_client_lib and bin_path to the ‘nsb’ section.

[DEFAULT]
debug = True
dispatcher = file, influxdb

[dispatcher_influxdb]
timeout = 5
target = http://{YOUR_IP_HERE}:8086
db_name = yardstick
username = root
password = root

[nsb]
trex_path=/opt/nsb_bin/trex/scripts
bin_path=/opt/nsb_bin
trex_client_lib=/opt/nsb_bin/trex_client/stl
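yardstick.conf is standard INI syntax, so the nsb paths can be sanity-checked with Python’s configparser before running test cases. The snippet below mirrors the sample above; the absolute-path check is just an illustrative precaution, not a Yardstick feature:

```python
import configparser

SAMPLE = """
[DEFAULT]
debug = True
dispatcher = file, influxdb

[nsb]
trex_path = /opt/nsb_bin/trex/scripts
bin_path = /opt/nsb_bin
trex_client_lib = /opt/nsb_bin/trex_client/stl
"""

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE)

# Every NSB path must be present and absolute before running test cases.
for key in ("trex_path", "bin_path", "trex_client_lib"):
    assert cfg["nsb"][key].startswith("/"), key

print(cfg["nsb"]["trex_path"])  # prints /opt/nsb_bin/trex/scripts
```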
12.6. Run Yardstick - Network Service Testcases
12.6.1. NS testing - using yardstick CLI
docker exec -it yardstick /bin/bash
source /etc/yardstick/openstack.creds (only for heat TC if nsb_setup.sh was NOT used)
export EXTERNAL_NETWORK="<openstack public network>" (only for heat TC)
yardstick --debug task start yardstick/samples/vnf_samples/nsut/<vnf>/<test case>
12.7. Network Service Benchmarking - Bare-Metal
12.7.1. Bare-Metal Config pod.yaml describing Topology
12.7.1.1. Bare-Metal 2-Node setup
+----------+              +----------+
|          |              |          |
|          | (0)----->(0) |          |
|    TG1   |              |    DUT   |
|          |              |          |
|          | (n)<-----(n) |          |
+----------+              +----------+
trafficgen_1                   vnf
12.7.1.2. Bare-Metal 3-Node setup - Correlated Traffic
+----------+              +----------+            +------------+
|          |              |          |            |            |
|          |              |          |            |            |
|          | (0)----->(0) |          |            |    UDP     |
|    TG1   |              |    DUT   |            |   Replay   |
|          |              |          |            |            |
|          |              |          |(1)<---->(0)|            |
+----------+              +----------+            +------------+
trafficgen_1                   vnf                 trafficgen_2
12.7.2. Bare-Metal Config pod.yaml

Before executing Yardstick test cases, make sure that pod.yaml reflects the topology and update all the required fields:

cp /etc/yardstick/nodes/pod.yaml.nsb.sample /etc/yardstick/nodes/pod.yaml
nodes:
-
    name: trafficgen_1
    role: TrafficGen
    ip: 1.1.1.1
    user: root
    password: r00t
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.0"
            driver:    i40e # default kernel driver
            dpdk_port_num: 0
            local_ip: "152.16.100.20"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:01"
        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.1"
            driver:    i40e # default kernel driver
            dpdk_port_num: 1
            local_ip: "152.16.40.20"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:02"

-
    name: vnf
    role: vnf
    ip: 1.1.1.2
    user: root
    password: r00t
    host: 1.1.1.2 #BM - host == ip, virtualized env - Host - compute node
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.0"
            driver:    i40e # default kernel driver
            dpdk_port_num: 0
            local_ip: "152.16.100.19"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:03"

        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.1"
            driver:    i40e # default kernel driver
            dpdk_port_num: 1
            local_ip: "152.16.40.19"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:04"
    routing_table:
    - network: "152.16.100.20"
      netmask: "255.255.255.0"
      gateway: "152.16.100.20"
      if: "xe0"
    - network: "152.16.40.20"
      netmask: "255.255.255.0"
      gateway: "152.16.40.20"
      if: "xe1"
    nd_route_tbl:
    - network: "0064:ff9b:0:0:0:0:9810:6414"
      netmask: "112"
      gateway: "0064:ff9b:0:0:0:0:9810:6414"
      if: "xe0"
    - network: "0064:ff9b:0:0:0:0:9810:2814"
      netmask: "112"
      gateway: "0064:ff9b:0:0:0:0:9810:2814"
      if: "xe1"
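Small typos in these interface fields (a stray . in a MAC address, an IP that does not fit its netmask) are easy to miss. The validator sketch below uses only the standard library; the field names match the pod.yaml sample above, but the helper itself is illustrative and not part of Yardstick:

```python
import ipaddress
import re

MAC_RE = re.compile(r"^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$")

def check_interface(name, iface):
    """Validate the local_ip/netmask/local_mac fields of one interface;
    returns a list of error strings (empty means OK)."""
    errors = []
    try:
        ipaddress.ip_interface(
            "%s/%s" % (iface["local_ip"], iface["netmask"]))
    except ValueError as exc:
        errors.append("%s: bad ip/netmask (%s)" % (name, exc))
    if not MAC_RE.match(iface["local_mac"]):
        errors.append("%s: bad MAC %r" % (name, iface["local_mac"]))
    return errors

good = {"local_ip": "152.16.100.20", "netmask": "255.255.255.0",
        "local_mac": "00:00:00:00:00:01"}
bad = dict(good, local_mac="00:00.00:00:00:02")  # dot where a colon belongs
```

Running `check_interface("xe0", good)` yields no errors, while the dotted MAC in `bad` is flagged.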
12.8. Network Service Benchmarking - Standalone Virtualization
12.8.1. SR-IOV
12.8.1.1. SR-IOV Pre-requisites
On Host, where VM is created:
  1. Create and configure a bridge named br-int for the VM to connect to the external network. Currently this can be done using a VXLAN tunnel.

    Execute the following on host, where VM is created:

ip link add type vxlan remote <Jumphost IP> local <DUT IP> id <ID: 10> dstport 4789
brctl addbr br-int
brctl addif br-int vxlan0
ip link set dev vxlan0 up
ip addr add <IP#1, like: 172.20.2.1/24> dev br-int
ip link set dev br-int up

Note

It may be necessary to add extra iptables rules to forward traffic:

iptables -A FORWARD -i br-int -s <network ip address>/<netmask> -j ACCEPT
iptables -A FORWARD -o br-int -d <network ip address>/<netmask> -j ACCEPT

Execute the following on a jump host:

ip link add type vxlan remote <DUT IP> local <Jumphost IP> id <ID: 10> dstport 4789
ip addr add <IP#2, like: 172.20.2.2/24> dev vxlan0
ip link set dev vxlan0 up

Note

The host and the jump host are different bare-metal servers.

  2. Modify the test case management CIDR. The IP addresses IP#1, IP#2 and the CIDR must be in the same network.
servers:
  vnf:
    network_ports:
      mgmt:
        cidr: '1.1.1.7/24'
  3. Build the guest image for the VNF to run. Most of the sample test cases in Yardstick use a guest image called yardstick-nsb-image, which deviates from an Ubuntu Cloud Server image. Yardstick has a tool for building this custom image with SampleVNF. It is necessary to have sudo rights to use this tool.

    You may also need to install several additional packages to use this tool:

    sudo apt-get update && sudo apt-get install -y qemu-utils kpartx
    

    This image can be built using the following command in the directory where Yardstick is installed

    export YARD_IMG_ARCH='amd64'
    echo 'Defaults env_keep += "YARD_IMG_ARCH"' | sudo tee -a /etc/sudoers
    

    Please use the Ansible script to generate a cloud image; for more details refer to the chapter Yardstick Installation.

    Note

    The VM should be built with a static IP and should be accessible from the Yardstick host.
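The earlier requirement that IP#1, IP#2 and the management CIDR be in the same network can be verified up front with the standard ipaddress module. The addresses below are the examples from the steps above:

```python
import ipaddress

def same_network(*addr_prefixes):
    """True when every 'addr/prefix' lies in one and the same network."""
    nets = {ipaddress.ip_interface(a).network for a in addr_prefixes}
    return len(nets) == 1

# IP#1 on the host bridge and IP#2 on the jump host: same /24, OK.
print(same_network("172.20.2.1/24", "172.20.2.2/24"))  # prints True

# The sample mgmt cidr 1.1.1.7/24 would have to be changed to match:
print(same_network("172.20.2.1/24", "1.1.1.7/24"))     # prints False
```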

12.8.1.2. SR-IOV Config pod.yaml describing Topology
12.8.1.3. SR-IOV 2-Node setup:
                             +--------------------+
                             |                    |
                             |                    |
                             |        DUT         |
                             |       (VNF)        |
                             |                    |
                             +--------------------+
                             | VF NIC |  | VF NIC |
                             +--------+  +--------+
                                   ^          ^
                                   |          |
                                   |          |
+----------+               +-------------------------+
|          |               |       ^          ^      |
|          |               |       |          |      |
|          | (0)<----->(0) | ------           |      |
|    TG1   |               |           SUT    |      |
|          |               |                  |      |
|          | (n)<----->(n) |------------------       |
+----------+               +-------------------------+
trafficgen_1                          host
12.8.1.4. SR-IOV 3-Node setup - Correlated Traffic
                             +--------------------+
                             |                    |
                             |                    |
                             |        DUT         |
                             |       (VNF)        |
                             |                    |
                             +--------------------+
                             | VF NIC |  | VF NIC |
                             +--------+  +--------+
                                   ^          ^
                                   |          |
                                   |          |
+----------+               +-------------------------+            +--------------+
|          |               |       ^          ^      |            |              |
|          |               |       |          |      |            |              |
|          | (0)<----->(0) | ------           |      |            |     TG2      |
|    TG1   |               |           SUT    |      |            | (UDP Replay) |
|          |               |                  |      |            |              |
|          | (n)<----->(n) |                  ------ | (n)<-->(n) |              |
+----------+               +-------------------------+            +--------------+
trafficgen_1                          host                       trafficgen_2

Before executing Yardstick test cases, make sure that pod.yaml reflects the topology and update all the required fields.

cp <yardstick>/etc/yardstick/nodes/standalone/trex_bm.yaml.sample /etc/yardstick/nodes/standalone/pod_trex.yaml
cp <yardstick>/etc/yardstick/nodes/standalone/host_sriov.yaml /etc/yardstick/nodes/standalone/host_sriov.yaml

Note

Update all the required fields like ip, user, password, pcis, etc...

12.8.1.5. SR-IOV Config pod_trex.yaml
nodes:
-
    name: trafficgen_1
    role: TrafficGen
    ip: 1.1.1.1
    user: root
    password: r00t
    key_filename: /root/.ssh/id_rsa
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.0"
            driver:    i40e # default kernel driver
            dpdk_port_num: 0
            local_ip: "152.16.100.20"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:01"
        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.1"
            driver:    i40e # default kernel driver
            dpdk_port_num: 1
            local_ip: "152.16.40.20"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:02"
12.8.1.6. SR-IOV Config host_sriov.yaml
nodes:
-
   name: sriov
   role: Sriov
   ip: 192.168.100.101
   user: ""
   password: ""

SR-IOV testcase update: <yardstick>/samples/vnf_samples/nsut/vfw/tc_sriov_rfc2544_ipv4_1rule_1flow_64B_trex.yaml

12.8.1.6.1. Update “contexts” section
contexts:
 - name: yardstick
   type: Node
   file: /etc/yardstick/nodes/standalone/pod_trex.yaml
 - type: StandaloneSriov
   file: /etc/yardstick/nodes/standalone/host_sriov.yaml
   name: yardstick
   vm_deploy: True
   flavor:
     images: "/var/lib/libvirt/images/ubuntu.qcow2"
     ram: 4096
     extra_specs:
       hw:cpu_sockets: 1
       hw:cpu_cores: 6
       hw:cpu_threads: 2
     user: "" # update VM username
     password: "" # update password
   servers:
     vnf:
       network_ports:
         mgmt:
           cidr: '1.1.1.61/24'  # Update VM IP address, if static, <ip>/<mask> or if dynamic, <start of ip>/<mask>
         xe0:
           - uplink_0
         xe1:
           - downlink_0
   networks:
     uplink_0:
       phy_port: "0000:05:00.0"
       vpci: "0000:00:07.0"
       cidr: '152.16.100.10/24'
       gateway_ip: '152.16.100.20'
     downlink_0:
       phy_port: "0000:05:00.1"
       vpci: "0000:00:08.0"
       cidr: '152.16.40.10/24'
       gateway_ip: '152.16.100.20'
12.8.2. OVS-DPDK
12.8.2.1. OVS-DPDK Pre-requisites
On Host, where VM is created:
  1. Create and configure a bridge named br-int for the VM to connect to the external network. Currently this can be done using a VXLAN tunnel.

    Execute the following on host, where VM is created:

ip link add type vxlan remote <Jumphost IP> local <DUT IP> id <ID: 10> dstport 4789
brctl addbr br-int
brctl addif br-int vxlan0
ip link set dev vxlan0 up
ip addr add <IP#1, like: 172.20.2.1/24> dev br-int
ip link set dev br-int up

Note

It may be necessary to add extra iptables rules to forward traffic:

iptables -A FORWARD -i br-int -s <network ip address>/<netmask> -j ACCEPT
iptables -A FORWARD -o br-int -d <network ip address>/<netmask> -j ACCEPT

Execute the following on a jump host:

ip link add type vxlan remote <DUT IP> local <Jumphost IP> id <ID: 10> dstport 4789
ip addr add <IP#2, like: 172.20.2.2/24> dev vxlan0
ip link set dev vxlan0 up

Note

The host and the jump host are different bare-metal servers.

  2. Modify the test case management CIDR. The IP addresses IP#1, IP#2 and the CIDR must be in the same network.
servers:
  vnf:
    network_ports:
      mgmt:
        cidr: '1.1.1.7/24'
  3. Build the guest image for the VNF to run. Most of the sample test cases in Yardstick use a guest image called yardstick-nsb-image, which deviates from an Ubuntu Cloud Server image. Yardstick has a tool for building this custom image with SampleVNF. It is necessary to have sudo rights to use this tool.

    You may also need to install several additional packages to use this tool:

    sudo apt-get update && sudo apt-get install -y qemu-utils kpartx
    

    This image can be built using the following command in the directory where Yardstick is installed:

    export YARD_IMG_ARCH='amd64'
    echo 'Defaults env_keep += "YARD_IMG_ARCH"' | sudo tee -a /etc/sudoers
    sudo tools/yardstick-img-dpdk-modify tools/ubuntu-server-cloudimg-samplevnf-modify.sh
    

    For more details refer to the chapter Yardstick Installation.

    Note

    The VM should be built with a static IP and should be accessible from the Yardstick host.

  4. OVS & DPDK version.
    • OVS 2.7 and DPDK 16.11.1 or above are supported
  5. Set up OVS/DPDK on the host.

    Please refer to the link below on how to set up OVS-DPDK.

12.8.2.2. OVS-DPDK Config pod.yaml describing Topology
12.8.2.3. OVS-DPDK 2-Node setup
                             +--------------------+
                             |                    |
                             |                    |
                             |        DUT         |
                             |       (VNF)        |
                             |                    |
                             +--------------------+
                             | virtio |  | virtio |
                             +--------+  +--------+
                                  ^          ^
                                  |          |
                                  |          |
                             +--------+  +--------+
                             | vHOST0 |  | vHOST1 |
+----------+               +-------------------------+
|          |               |       ^          ^      |
|          |               |       |          |      |
|          | (0)<----->(0) | ------           |      |
|    TG1   |               |          SUT     |      |
|          |               |       (ovs-dpdk) |      |
|          | (n)<----->(n) |------------------       |
+----------+               +-------------------------+
trafficgen_1                          host
12.8.2.4. OVS-DPDK 3-Node setup - Correlated Traffic
                             +--------------------+
                             |                    |
                             |                    |
                             |        DUT         |
                             |       (VNF)        |
                             |                    |
                             +--------------------+
                             | virtio |  | virtio |
                             +--------+  +--------+
                                  ^          ^
                                  |          |
                                  |          |
                             +--------+  +--------+
                             | vHOST0 |  | vHOST1 |
+----------+               +-------------------------+          +------------+
|          |               |       ^          ^      |          |            |
|          |               |       |          |      |          |            |
|          | (0)<----->(0) | ------           |      |          |    TG2     |
|    TG1   |               |          SUT     |      |          |(UDP Replay)|
|          |               |      (ovs-dpdk)  |      |          |            |
|          | (n)<----->(n) |                  ------ |(n)<-->(n)|            |
+----------+               +-------------------------+          +------------+
trafficgen_1                          host                       trafficgen_2

Before executing Yardstick test cases, make sure that pod.yaml reflects the topology and update all the required fields.

cp <yardstick>/etc/yardstick/nodes/standalone/trex_bm.yaml.sample /etc/yardstick/nodes/standalone/pod_trex.yaml
cp <yardstick>/etc/yardstick/nodes/standalone/host_ovs.yaml /etc/yardstick/nodes/standalone/host_ovs.yaml

Note

Update all the required fields like ip, user, password, pcis, etc...

12.8.2.5. OVS-DPDK Config pod_trex.yaml
nodes:
-
  name: trafficgen_1
  role: TrafficGen
  ip: 1.1.1.1
  user: root
  password: r00t
  interfaces:
      xe0:  # logical name from topology.yaml and vnfd.yaml
          vpci:      "0000:07:00.0"
          driver:    i40e # default kernel driver
          dpdk_port_num: 0
          local_ip: "152.16.100.20"
          netmask:   "255.255.255.0"
          local_mac: "00:00:00:00:00:01"
      xe1:  # logical name from topology.yaml and vnfd.yaml
          vpci:      "0000:07:00.1"
          driver:    i40e # default kernel driver
          dpdk_port_num: 1
          local_ip: "152.16.40.20"
          netmask:   "255.255.255.0"
          local_mac: "00:00:00:00:00:02"
12.8.2.6. OVS-DPDK Config host_ovs.yaml
nodes:
-
   name: ovs_dpdk
   role: OvsDpdk
   ip: 192.168.100.101
   user: ""
   password: ""

ovs_dpdk testcase update: <yardstick>/samples/vnf_samples/nsut/vfw/tc_ovs_rfc2544_ipv4_1rule_1flow_64B_trex.yaml

12.8.2.6.1. Update “contexts” section
contexts:
 - name: yardstick
   type: Node
   file: /etc/yardstick/nodes/standalone/pod_trex.yaml
 - type: StandaloneOvsDpdk
   name: yardstick
   file: /etc/yardstick/nodes/standalone/pod_ovs.yaml
   vm_deploy: True
   ovs_properties:
     version:
       ovs: 2.7.0
       dpdk: 16.11.1
     pmd_threads: 2
     ram:
       socket_0: 2048
       socket_1: 2048
     queues: 4
     vpath: "/usr/local"

   flavor:
     images: "/var/lib/libvirt/images/ubuntu.qcow2"
     ram: 4096
     extra_specs:
       hw:cpu_sockets: 1
       hw:cpu_cores: 6
       hw:cpu_threads: 2
     user: "" # update VM username
     password: "" # update password
   servers:
     vnf:
       network_ports:
         mgmt:
           cidr: '1.1.1.61/24'  # Update VM IP address, if static, <ip>/<mask> or if dynamic, <start of ip>/<mask>
         xe0:
           - uplink_0
         xe1:
           - downlink_0
   networks:
     uplink_0:
       phy_port: "0000:05:00.0"
       vpci: "0000:00:07.0"
       cidr: '152.16.100.10/24'
       gateway_ip: '152.16.100.20'
     downlink_0:
       phy_port: "0000:05:00.1"
       vpci: "0000:00:08.0"
       cidr: '152.16.40.10/24'
       gateway_ip: '152.16.100.20'
12.9. Network Service Benchmarking - OpenStack with SR-IOV support

This section describes how to run a Sample VNF test case, using Heat context, with SR-IOV. It also covers how to install OpenStack on Ubuntu 16.04, using DevStack, with SR-IOV support.

12.9.1. Single node OpenStack setup with external TG
                               +----------------------------+
                               |OpenStack(DevStack)         |
                               |                            |
                               |   +--------------------+   |
                               |   |sample-VNF VM       |   |
                               |   |                    |   |
                               |   |        DUT         |   |
                               |   |       (VNF)        |   |
                               |   |                    |   |
                               |   +--------+  +--------+   |
                               |   | VF NIC |  | VF NIC |   |
                               |   +-----+--+--+----+---+   |
                               |         ^          ^       |
                               |         |          |       |
+----------+                   +---------+----------+-------+
|          |                   |        VF0        VF1      |
|          |                   |         ^          ^       |
|          |                   |         |   SUT    |       |
|    TG    | (PF0)<----->(PF0) +---------+          |       |
|          |                   |                    |       |
|          | (PF1)<----->(PF1) +--------------------+       |
|          |                   |                            |
+----------+                   +----------------------------+
trafficgen_1                                 host
12.9.1.1. Host pre-configuration

Warning

The following configuration requires sudo access to the system. Make sure that your user has that access.

Enable the Intel VT-d or AMD-Vi extension in the BIOS. Some system manufacturers disable this extension by default.

Activate the Intel VT-d or AMD-Vi extension in the kernel by modifying the GRUB config file /etc/default/grub.

For the Intel platform:

...
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"
...

For the AMD platform:

...
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on"
...

Update the grub configuration file and restart the system:

Warning

The following command will reboot the system.

sudo update-grub
sudo reboot

Make sure the extension has been enabled:

sudo journalctl -b 0 | grep -e IOMMU -e DMAR

Feb 06 14:50:14 hostname kernel: ACPI: DMAR 0x000000006C406000 0001E0 (v01 INTEL  S2600WF  00000001 INTL 20091013)
Feb 06 14:50:14 hostname kernel: DMAR: IOMMU enabled
Feb 06 14:50:14 hostname kernel: DMAR: Host address width 46
Feb 06 14:50:14 hostname kernel: DMAR: DRHD base: 0x000000d37fc000 flags: 0x0
Feb 06 14:50:14 hostname kernel: DMAR: dmar0: reg_base_addr d37fc000 ver 1:0 cap 8d2078c106f0466 ecap f020de
Feb 06 14:50:14 hostname kernel: DMAR: DRHD base: 0x000000e0ffc000 flags: 0x0
Feb 06 14:50:14 hostname kernel: DMAR: dmar1: reg_base_addr e0ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020de
Feb 06 14:50:14 hostname kernel: DMAR: DRHD base: 0x000000ee7fc000 flags: 0x0
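If the journal output is ambiguous, the kernel also exposes one directory per IOMMU group under sysfs once the IOMMU is active. A quick sketch of that check (the sysfs path is standard, but the group count depends on the platform):

```shell
# Count the IOMMU groups exposed by the kernel; zero usually means VT-d/AMD-Vi
# is still disabled in the BIOS or missing from the kernel command line.
groups=$(ls /sys/kernel/iommu_groups 2>/dev/null | wc -l)
echo "IOMMU groups found: ${groups}"
```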

Setup system proxy (if needed). Add the following configuration into the /etc/environment file:

Note

The proxy server name/port and IPs should be changed according to the actual/current proxy configuration in the lab.

export http_proxy=http://proxy.company.com:port
export https_proxy=http://proxy.company.com:port
export ftp_proxy=http://proxy.company.com:port
export no_proxy=localhost,127.0.0.1,company.com,<IP-OF-HOST1>,<IP-OF-HOST2>,...
export NO_PROXY=localhost,127.0.0.1,company.com,<IP-OF-HOST1>,<IP-OF-HOST2>,...

Upgrade the system:

sudo -EH apt-get update
sudo -EH apt-get upgrade
sudo -EH apt-get dist-upgrade

Install the dependencies needed by DevStack:

sudo -EH apt-get install python
sudo -EH apt-get install python-dev
sudo -EH apt-get install python-pip

Setup SR-IOV ports on the host:

Note

The enp24s0f0 and enp24s0f1 interfaces are physical function (PF) interfaces on the host and enp24s0f3 is the public interface used in OpenStack, so the interface names should be changed according to the HW environment used for testing.

sudo ip link set dev enp24s0f0 up
sudo ip link set dev enp24s0f1 up
sudo ip link set dev enp24s0f3 up

# Create VFs on PF
echo 2 | sudo tee /sys/class/net/enp24s0f0/device/sriov_numvfs
echo 2 | sudo tee /sys/class/net/enp24s0f1/device/sriov_numvfs
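The result can be verified through sysfs and lspci. A small sketch, assuming the PF names used above (the show_vfs helper is illustrative, not part of the toolchain):

```shell
# show_vfs: report how many VFs exist on a PF and the hardware maximum.
show_vfs() {
    pf="$1"
    if [ -r "/sys/class/net/${pf}/device/sriov_totalvfs" ]; then
        created=$(cat "/sys/class/net/${pf}/device/sriov_numvfs")
        maximum=$(cat "/sys/class/net/${pf}/device/sriov_totalvfs")
        echo "${pf}: ${created}/${maximum} VFs created"
    else
        echo "${pf}: not SR-IOV capable (or no such interface)"
    fi
}
show_vfs enp24s0f0
show_vfs enp24s0f1
# The new VFs should also appear on the PCI bus (guarded in case lspci is absent):
lspci 2>/dev/null | grep -i "Virtual Function" || true
```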
12.9.1.2. DevStack installation

Use the official DevStack documentation to install OpenStack on the host. Please note that the stable/pike branch of the DevStack repository should be used during the installation. The required local.conf configuration file is described below.

DevStack configuration file:

Note

Update the DevStack configuration file by replacing the angle brackets with the short description inside them.

Note

Use the lspci | grep Ether and lspci -n | grep <PCI ADDRESS> commands to get the device and vendor IDs of the virtual function (VF).
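For example, an XL710/X710 virtual function reports the pair 8086:154c (Intel vendor ID, VF device ID). The pair can be extracted from an lspci -n line with a simple grep; the sample line below is illustrative:

```shell
# Sample `lspci -n` output line for an X710 Virtual Function.
line='18:02.0 0200: 8086:154c (rev 02)'
# Extract the <vendor>:<device> pair (two 4-digit hex numbers).
ids=$(echo "$line" | grep -oE '[0-9a-f]{4}:[0-9a-f]{4}')
echo "vendor:device = ${ids}"   # 8086:154c
```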

[[local|localrc]]
HOST_IP=<HOST_IP_ADDRESS>
ADMIN_PASSWORD=password
MYSQL_PASSWORD=$ADMIN_PASSWORD
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
HORIZON_PASSWORD=$ADMIN_PASSWORD

# Internet access.
RECLONE=False
PIP_UPGRADE=True
IP_VERSION=4

# Services
disable_service n-net
ENABLED_SERVICES+=,q-svc,q-dhcp,q-meta,q-agt,q-sriov-agt

# Heat
enable_plugin heat https://git.openstack.org/openstack/heat stable/pike

# Neutron
enable_plugin neutron https://git.openstack.org/openstack/neutron.git stable/pike

# Neutron Options
FLOATING_RANGE=<RANGE_IN_THE_PUBLIC_INTERFACE_NETWORK>
Q_FLOATING_ALLOCATION_POOL=start=<START_IP_ADDRESS>,end=<END_IP_ADDRESS>
PUBLIC_NETWORK_GATEWAY=<PUBLIC_NETWORK_GATEWAY>
PUBLIC_INTERFACE=<PUBLIC INTERFACE>

# ML2 Configuration
Q_PLUGIN=ml2
Q_ML2_PLUGIN_MECHANISM_DRIVERS=openvswitch,sriovnicswitch
Q_ML2_PLUGIN_TYPE_DRIVERS=vlan,flat,local,vxlan,gre,geneve

# Open vSwitch provider networking configuration
Q_USE_PROVIDERNET_FOR_PUBLIC=True
OVS_PHYSICAL_BRIDGE=br-ex
OVS_BRIDGE_MAPPINGS=public:br-ex
PHYSICAL_DEVICE_MAPPINGS=physnet1:<PF0_IFNAME>,physnet2:<PF1_IFNAME>
PHYSICAL_NETWORK=physnet1,physnet2


[[post-config|$NOVA_CONF]]
[DEFAULT]
scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
# Whitelist PCI devices
pci_passthrough_whitelist = {\\"devname\\": \\"<PF0_IFNAME>\\", \\"physical_network\\": \\"physnet1\\" }
pci_passthrough_whitelist = {\\"devname\\": \\"<PF1_IFNAME>\\", \\"physical_network\\": \\"physnet2\\" }

[filter_scheduler]
enabled_filters = RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,SameHostFilter

[libvirt]
cpu_mode = host-model


# ML2 plugin bits for SR-IOV enablement of Intel Corporation XL710/X710 Virtual Function
[[post-config|/$Q_PLUGIN_CONF_FILE]]
[ml2_sriov]
agent_required = True
supported_pci_vendor_devs = <VF_DEV_ID:VF_VEN_ID>

Start the devstack installation on a host.

12.9.1.3. TG host configuration

Yardstick automatically installs and configures the TRex traffic generator on the TG host based on the provided POD file (see below). However, it is recommended to check the compatibility of the NIC installed in the TG server with the TRex software using the manual at https://trex-tgn.cisco.com/trex/doc/trex_manual.html.

12.9.1.4. Run the Sample VNF test case

There is an example of a Sample VNF test case ready to be executed in an OpenStack environment with SR-IOV support: samples/vnf_samples/nsut/vfw/tc_heat_sriov_external_rfc2544_ipv4_1rule_1flow_64B_trex.yaml.

Install yardstick using Install Yardstick (NSB Testing) steps for OpenStack context.

Create pod file for TG in the yardstick repo folder located in the yardstick container:

Note

The ip, user, password and vpci fields should be changed according to the HW environment used for the testing. Use the lshw -c network -businfo command to get the PF PCI address for the vpci field.
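lshw prefixes each PCI address with pci@, which is dropped when filling the vpci field. A small sketch of the mapping (the device rows in the comments are illustrative output):

```shell
# `lshw -c network -businfo` prints one row per interface, e.g.:
#   Bus info          Device      Class      Description
#   pci@0000:18:00.0  enp24s0f0   network    Ethernet Controller X710 for 10GbE SFP+
#   pci@0000:18:00.1  enp24s0f1   network    Ethernet Controller X710 for 10GbE SFP+
# Strip the "pci@" prefix to obtain the value for the vpci field:
businfo='pci@0000:18:00.0'
vpci=${businfo#pci@}
echo "vpci: ${vpci}"   # 0000:18:00.0
```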

nodes:
-
    name: trafficgen_1
    role: tg__0
    ip: <TG-HOST-IP>
    user: <TG-USER>
    password: <TG-PASS>
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:18:00.0"
            driver:    i40e # default kernel driver
            dpdk_port_num: 0
            local_ip: "10.1.1.150"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:01"
        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:18:00.1"
            driver:    i40e # default kernel driver
            dpdk_port_num: 1
            local_ip: "10.1.1.151"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:02"

Run the Sample vFW RFC2544 SR-IOV TC (samples/vnf_samples/nsut/vfw/tc_heat_sriov_external_rfc2544_ipv4_1rule_1flow_64B_trex.yaml) in the heat context using the steps described in the NS testing - using yardstick CLI section.

12.9.2. Multi node OpenStack TG and VNF setup (two nodes)
+----------------------------+                   +----------------------------+
|OpenStack(DevStack)         |                   |OpenStack(DevStack)         |
|                            |                   |                            |
|   +--------------------+   |                   |   +--------------------+   |
|   |sample-VNF VM       |   |                   |   |sample-VNF VM       |   |
|   |                    |   |                   |   |                    |   |
|   |         TG         |   |                   |   |        DUT         |   |
|   |    trafficgen_1    |   |                   |   |       (VNF)        |   |
|   |                    |   |                   |   |                    |   |
|   +--------+  +--------+   |                   |   +--------+  +--------+   |
|   | VF NIC |  | VF NIC |   |                   |   | VF NIC |  | VF NIC |   |
|   +----+---+--+----+---+   |                   |   +-----+--+--+----+---+   |
|        ^           ^       |                   |         ^          ^       |
|        |           |       |                   |         |          |       |
+--------+-----------+-------+                   +---------+----------+-------+
|       VF0         VF1      |                   |        VF0        VF1      |
|        ^           ^       |                   |         ^          ^       |
|        |    SUT2   |       |                   |         |   SUT1   |       |
|        |           +-------+ (PF0)<----->(PF0) +---------+          |       |
|        |                   |                   |                    |       |
|        +-------------------+ (PF1)<----->(PF1) +--------------------+       |
|                            |                   |                            |
+----------------------------+                   +----------------------------+
         host2 (compute)                               host1 (controller)
12.9.2.1. Controller/Compute pre-configuration

Pre-configuration of the controller and compute hosts is the same as described in the Host pre-configuration section. Follow the steps in that section.

12.9.2.2. DevStack configuration

Use the official DevStack documentation to install OpenStack on the hosts. Please note that the stable/pike branch of the DevStack repository should be used during the installation. The required local.conf configuration files are described below.

Note

Update the DevStack configuration files by replacing the angle brackets with the short description inside them.

Note

Use the lspci | grep Ether and lspci -n | grep <PCI ADDRESS> commands to get the device and vendor IDs of the virtual function (VF).

DevStack configuration file for controller host:

[[local|localrc]]
HOST_IP=<HOST_IP_ADDRESS>
ADMIN_PASSWORD=password
MYSQL_PASSWORD=$ADMIN_PASSWORD
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
HORIZON_PASSWORD=$ADMIN_PASSWORD
# Controller node
SERVICE_HOST=$HOST_IP
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292

# Internet access.
RECLONE=False
PIP_UPGRADE=True
IP_VERSION=4

# Services
disable_service n-net
ENABLED_SERVICES+=,q-svc,q-dhcp,q-meta,q-agt,q-sriov-agt

# Heat
enable_plugin heat https://git.openstack.org/openstack/heat stable/pike

# Neutron
enable_plugin neutron https://git.openstack.org/openstack/neutron.git stable/pike

# Neutron Options
FLOATING_RANGE=<RANGE_IN_THE_PUBLIC_INTERFACE_NETWORK>
Q_FLOATING_ALLOCATION_POOL=start=<START_IP_ADDRESS>,end=<END_IP_ADDRESS>
PUBLIC_NETWORK_GATEWAY=<PUBLIC_NETWORK_GATEWAY>
PUBLIC_INTERFACE=<PUBLIC INTERFACE>

# ML2 Configuration
Q_PLUGIN=ml2
Q_ML2_PLUGIN_MECHANISM_DRIVERS=openvswitch,sriovnicswitch
Q_ML2_PLUGIN_TYPE_DRIVERS=vlan,flat,local,vxlan,gre,geneve

# Open vSwitch provider networking configuration
Q_USE_PROVIDERNET_FOR_PUBLIC=True
OVS_PHYSICAL_BRIDGE=br-ex
OVS_BRIDGE_MAPPINGS=public:br-ex
PHYSICAL_DEVICE_MAPPINGS=physnet1:<PF0_IFNAME>,physnet2:<PF1_IFNAME>
PHYSICAL_NETWORK=physnet1,physnet2


[[post-config|$NOVA_CONF]]
[DEFAULT]
scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
# Whitelist PCI devices
pci_passthrough_whitelist = {\\"devname\\": \\"<PF0_IFNAME>\\", \\"physical_network\\": \\"physnet1\\" }
pci_passthrough_whitelist = {\\"devname\\": \\"<PF1_IFNAME>\\", \\"physical_network\\": \\"physnet2\\" }

[libvirt]
cpu_mode = host-model


# ML2 plugin bits for SR-IOV enablement of Intel Corporation XL710/X710 Virtual Function
[[post-config|/$Q_PLUGIN_CONF_FILE]]
[ml2_sriov]
agent_required = True
supported_pci_vendor_devs = <VF_DEV_ID:VF_VEN_ID>

DevStack configuration file for compute host:

[[local|localrc]]
HOST_IP=<HOST_IP_ADDRESS>
MYSQL_PASSWORD=password
DATABASE_PASSWORD=password
RABBIT_PASSWORD=password
ADMIN_PASSWORD=password
SERVICE_PASSWORD=password
HORIZON_PASSWORD=password
# Controller node
SERVICE_HOST=<CONTROLLER_IP_ADDRESS>
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292

# Internet access.
RECLONE=False
PIP_UPGRADE=True
IP_VERSION=4

# Neutron
enable_plugin neutron https://git.openstack.org/openstack/neutron.git stable/pike

# Services
ENABLED_SERVICES=n-cpu,rabbit,q-agt,placement-api,q-sriov-agt

# Neutron Options
PUBLIC_INTERFACE=<PUBLIC INTERFACE>

# ML2 Configuration
Q_PLUGIN=ml2
Q_ML2_PLUGIN_MECHANISM_DRIVERS=openvswitch,sriovnicswitch
Q_ML2_PLUGIN_TYPE_DRIVERS=vlan,flat,local,vxlan,gre,geneve

# Open vSwitch provider networking configuration
PHYSICAL_DEVICE_MAPPINGS=physnet1:<PF0_IFNAME>,physnet2:<PF1_IFNAME>


[[post-config|$NOVA_CONF]]
[DEFAULT]
scheduler_default_filters=RamFilter,ComputeFilter,AvailabilityZoneFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,PciPassthroughFilter
# Whitelist PCI devices
pci_passthrough_whitelist = {\\"devname\\": \\"<PF0_IFNAME>\\", \\"physical_network\\": \\"physnet1\\" }
pci_passthrough_whitelist = {\\"devname\\": \\"<PF1_IFNAME>\\", \\"physical_network\\": \\"physnet2\\" }

[libvirt]
cpu_mode = host-model


# ML2 plugin bits for SR-IOV enablement of Intel Corporation XL710/X710 Virtual Function
[[post-config|/$Q_PLUGIN_CONF_FILE]]
[ml2_sriov]
agent_required = True
supported_pci_vendor_devs = <VF_DEV_ID:VF_VEN_ID>

Start the devstack installation on the controller and compute hosts.

12.9.2.3. Run the sample vFW TC

Install yardstick using Install Yardstick (NSB Testing) steps for OpenStack context.

Run the sample vFW RFC2544 SR-IOV TC (samples/vnf_samples/nsut/vfw/tc_heat_rfc2544_ipv4_1rule_1flow_64B_trex.yaml) in the heat context using the steps described in the NS testing - using yardstick CLI section and the following yardstick command line arguments:

yardstick -d task start --task-args='{"provider": "sriov"}' \
samples/vnf_samples/nsut/vfw/tc_heat_rfc2544_ipv4_1rule_1flow_64B_trex.yaml
12.10. Enabling other Traffic generator
  1. Software needed: IxLoadAPI <IxLoadTclApi version>Linux64.bin.tgz and <IxOS version>Linux64.bin.tar.gz (download from the Ixia support site). Install <IxLoadTclApi version>Linux64.bin.tgz and <IxOS version>Linux64.bin.tar.gz. If the installation was not done inside the container, then after installing the Ixia client, check /opt/ixia/ixload/<ver>/bin/ixloadpython and make sure you can run this command inside the yardstick container. Usually the user is required to copy or link /opt/ixia/python/<ver>/bin/ixiapython to /usr/bin/ixiapython<ver> inside the container.
  2. Update the pod_ixia.yaml file with the Ixia details.
cp <repo>/etc/yardstick/nodes/pod.yaml.nsb.sample.ixia etc/yardstick/nodes/pod_ixia.yaml

Configure pod_ixia.yaml:

nodes:
-
    name: trafficgen_1
    role: IxNet
    ip: 1.2.1.1 #ixia machine ip
    user: user
    password: r00t
    key_filename: /root/.ssh/id_rsa
    tg_config:
        ixchassis: "1.2.1.7" #ixia chassis ip
        tcl_port: "8009" # tcl server port
        lib_path: "/opt/ixia/ixos-api/8.01.0.2/lib/ixTcl1.0"
        root_dir: "/opt/ixia/ixos-api/8.01.0.2/"
        py_bin_path: "/opt/ixia/ixload/8.01.106.3/bin/"
        dut_result_dir: "/mnt/ixia"
        version: 8.1
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci: "2:5" # Card:port
            driver:    "none"
            dpdk_port_num: 0
            local_ip: "152.16.100.20"
            netmask:   "255.255.0.0"
            local_mac: "00:98:10:64:14:00"
        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci: "2:6" # [(Card, port)]
            driver:    "none"
            dpdk_port_num: 1
            local_ip: "152.40.40.20"
            netmask:   "255.255.0.0"
            local_mac: "00:98:28:28:14:00"

For sriov/ovs_dpdk pod files, please refer to the Standalone Virtualization section above for the ovs-dpdk/sriov configuration.

  3. Start the IxOS TCL Server (install 'Ixia IxExplorer IxOS <version>'). You will also need to configure the IxLoad machine to start the IXIA IxosTclServer. This can be started like so:
    • Connect to the IxLoad machine using RDP
    • Go to: Start->Programs->Ixia->IxOS->IxOS 8.01-GA-Patch1->Ixia Tcl Server IxOS 8.01-GA-Patch1 or "C:\Program Files (x86)\Ixia\IxOS\8.01-GA-Patch1\ixTclServer.exe"
  4. Create a folder Results in c:\ and share the folder on the network.
  5. Execute the test case in the samplevnf folder, e.g. <repo>/samples/vnf_samples/nsut/vfw/tc_baremetal_http_ixload_1b_Requests-65000_Concurrency.yaml
12.10.1. IxNetwork

IxNetwork test cases use the IxNetwork API Python Bindings module, which is installed as part of the project requirements.

  1. Update the pod_ixia.yaml file with the Ixia details.
cp <repo>/etc/yardstick/nodes/pod.yaml.nsb.sample.ixia etc/yardstick/nodes/pod_ixia.yaml

Configure pod_ixia.yaml:

nodes:
-
    name: trafficgen_1
    role: IxNet
    ip: 1.2.1.1 #ixia machine ip
    user: user
    password: r00t
    key_filename: /root/.ssh/id_rsa
    tg_config:
        ixchassis: "1.2.1.7" #ixia chassis ip
        tcl_port: "8009" # tcl server port
        lib_path: "/opt/ixia/ixos-api/8.01.0.2/lib/ixTcl1.0"
        root_dir: "/opt/ixia/ixos-api/8.01.0.2/"
        py_bin_path: "/opt/ixia/ixload/8.01.106.3/bin/"
        dut_result_dir: "/mnt/ixia"
        version: 8.1
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci: "2:5" # Card:port
            driver:    "none"
            dpdk_port_num: 0
            local_ip: "152.16.100.20"
            netmask:   "255.255.0.0"
            local_mac: "00:98:10:64:14:00"
        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci: "2:6" # [(Card, port)]
            driver:    "none"
            dpdk_port_num: 1
            local_ip: "152.40.40.20"
            netmask:   "255.255.0.0"
            local_mac: "00:98:28:28:14:00"

For sriov/ovs_dpdk pod files, please refer to the Standalone Virtualization section above for the ovs-dpdk/sriov configuration.

  2. Start the IxNetwork TCL Server. You will also need to configure the IxNetwork machine to start the IXIA IxNetworkTclServer. This can be started like so:

    • Connect to the IxNetwork machine using RDP
    • Go to: Start->Programs->Ixia->IxNetwork->IxNetwork 7.21.893.14 GA->IxNetworkTclServer (or IxNetworkApiServer)
  3. Execute the test case in the samplevnf folder, e.g. <repo>/samples/vnf_samples/nsut/vfw/tc_baremetal_rfc2544_ipv4_1rule_1flow_64B_ixia.yaml

13. Yardstick - NSB Testing - Operation
13.1. Abstract

NSB test configuration and OpenStack setup requirements

13.2. OpenStack Network Configuration

NSB requires certain OpenStack deployment configurations. For optimal VNF characterization using external traffic generators, NSB requires provider/external networks.

13.2.1. Provider networks

The VNFs require a direct L2 connection to the external network in order to generate realistic traffic from multiple address ranges and ports.

In order to prevent Neutron from filtering traffic, we have to disable Neutron port security. We also disable DHCP on the data ports because we are binding the ports to DPDK and do not need DHCP addresses, and we disable gateways because multiple default gateways can prevent SSH access to the VNF from the floating IP. We only want a gateway on the mgmt network.

uplink_0:
  cidr: '10.1.0.0/24'
  gateway_ip: 'null'
  port_security_enabled: False
  enable_dhcp: 'false'
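The Heat context normally creates these networks automatically; for reference, a roughly equivalent manual setup with the OpenStack CLI could look like the sketch below (network and subnet names are illustrative, and the commands are guarded so the sketch is a no-op without an authenticated OpenStack environment):

```shell
# Create a provider network with port security disabled, then a subnet
# without DHCP and without a gateway, mirroring the uplink_0 settings above.
if command -v openstack >/dev/null 2>&1; then
    openstack network create uplink_0 \
        --provider-network-type flat \
        --provider-physical-network physnet1 \
        --disable-port-security
    openstack subnet create uplink_0-subnet \
        --network uplink_0 \
        --subnet-range 10.1.0.0/24 \
        --gateway none --no-dhcp
else
    echo "openstack CLI not available; skipping"
fi
```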
13.2.2. Heat Topologies

By default Heat will attach every node to every Neutron network that is created. For scale-out tests we do not want to attach every node to every network.

For each node you can specify which ports are on which network using the network_ports dictionary.

In this example we have TRex xe0 <-> xe0 VNF xe1 <-> xe0 UDP_Replay:

vnf_0:
  floating_ip: true
  placement: "pgrp1"
  network_ports:
    mgmt:
      - mgmt
    uplink_0:
      - xe0
    downlink_0:
      - xe1
tg_0:
  floating_ip: true
  placement: "pgrp1"
  network_ports:
    mgmt:
      - mgmt
    uplink_0:
      - xe0
    # Trex always needs two ports
    uplink_1:
      - xe1
tg_1:
  floating_ip: true
  placement: "pgrp1"
  network_ports:
    mgmt:
     - mgmt
    downlink_0:
     - xe0
13.2.3. Availability zone

The configuration of the availability zone is required in cases where the location of an exact compute host or group of compute hosts needs to be specified for the SampleVNF or traffic generator in the Heat test case. If this is the case, please follow the instructions below.

  1. Create a host aggregate in OpenStack and add the available compute hosts to the aggregate group.

    Note

    Change the <AZ_NAME> (availability zone name), <AGG_NAME> (host aggregate name) and <HOST> (host name of one of the compute nodes) in the commands below.

    # create host aggregate
    openstack aggregate create --zone <AZ_NAME> --property availability_zone=<AZ_NAME> <AGG_NAME>
    # show available hosts
    openstack compute service list --service nova-compute
    # add selected host into the host aggregate
    openstack aggregate add host <AGG_NAME> <HOST>
    
  2. To specify the OpenStack location (the exact compute host or group of the hosts) of SampleVNF or traffic generator in the heat test case, the availability_zone server configuration option should be used. For example:

    Note

    The <AZ_NAME> (availability zone name) should be changed according to the name used during the host aggregate creation steps above.

    context:
      name: yardstick
      image: yardstick-samplevnfs
      ...
      servers:
        vnf__0:
          ...
          availability_zone: <AZ_NAME>
          ...
        tg__0:
          ...
          availability_zone: <AZ_NAME>
          ...
      networks:
        ...
    

There are two examples of SampleVNF scale-out test cases which use the availability zone feature to specify the exact location of the scaled VNFs and traffic generators.

Those are:

<repo>/samples/vnf_samples/nsut/prox/tc_prox_heat_context_l2fwd_multiflow-2-scale-out.yaml
<repo>/samples/vnf_samples/nsut/vfw/tc_heat_rfc2544_ipv4_1rule_1flow_64B_trex_scale_out.yaml

Note

This section describes the PROX scale-out testcase, but the same procedure is used for the vFW test case.

  1. Before running the scale-out test case, make sure the host aggregates are configured in the OpenStack environment. To check this, run the following command:

    # show configured host aggregates (example)
    openstack aggregate list
    +----+------+-------------------+
    | ID | Name | Availability Zone |
    +----+------+-------------------+
    |  4 | agg0 | AZ_NAME_0         |
    |  5 | agg1 | AZ_NAME_1         |
    +----+------+-------------------+
    
  2. If no host aggregates are configured, please use the steps above to configure them.

  3. Run the SampleVNF PROX scale-out test case, specifying the availability zone of each VNF and traffic generator as task arguments.

    Note

    The az_0 and az_1 should be changed according to the host aggregates created in OpenStack.

    yardstick -d task start\
    <repo>/samples/vnf_samples/nsut/prox/tc_prox_heat_context_l2fwd_multiflow-2-scale-out.yaml\
      --task-args='{
        "num_vnfs": 4, "availability_zone": {
          "vnf_0": "az_0", "tg_0": "az_1",
          "vnf_1": "az_0", "tg_1": "az_1",
          "vnf_2": "az_0", "tg_2": "az_1",
          "vnf_3": "az_0", "tg_3": "az_1"
        }
      }'
    

    num_vnfs specifies how many VNFs are going to be deployed in the heat contexts. The vnf_X and tg_X arguments configure the availability zone where each VNF and traffic generator is going to be deployed.

13.3. Collectd KPIs

NSB can collect KPIs from collectd. We have support for various plugins enabled by the Barometer project.

The default yardstick-samplevnf image has collectd installed. This allows KPIs to be collected from the VNF.

Collecting KPIs from the NFVi is more complicated and requires manual setup. We assume that collectd is not installed on the compute nodes.

To collect KPIs from the NFVi compute nodes:

  • install collectd on the compute nodes
  • create pod.yaml for the compute nodes
  • enable specific plugins depending on the vswitch and DPDK

Example pod.yaml section for a compute node running collectd:

-
  name: "compute-1"
  role: Compute
  ip: "10.1.2.3"
  user: "root"
  ssh_port: "22"
  password: ""
  collectd:
    interval: 5
    plugins:
      # for libvirtd stats
      virt: {}
      intel_pmu: {}
      ovs_stats:
        # path to OVS socket
        ovs_socket_path: /var/run/openvswitch/db.sock
      intel_rdt: {}
13.4. Scale-Up

VNF performance data with scale-up:

  • Helps to determine the optimal number of cores to specify when creating the Virtual Machine template or VNF
  • Helps in comparisons between different VNF vendor offerings
  • A better scale-up index indicates better performance scalability of a particular solution
13.4.1. Heat

For VNF scale-up tests we increase the number of VNF worker threads. In the case of VNFs, we also need to increase the number of VCPUs and the memory allocated to the VNF.

An example scale-up Heat testcase is:

# Copyright (c) 2016-2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
{% set mem = mem or 20480 %}
{% set vcpus = vcpus or 10 %}
{% set vports = vports or 2 %}
---
schema: yardstick:task:0.1
scenarios:
- type: NSPerf
  traffic_profile: ../../traffic_profiles/ipv4_throughput-scale-up.yaml
  extra_args:
    vports: {{ vports }}
  topology: vfw-tg-topology-scale-up.yaml
  nodes:
    tg__0: tg_0.yardstick
    vnf__0: vnf_0.yardstick
  options:
    framesize:
      uplink: {64B: 100}
      downlink: {64B: 100}
    flow:
      src_ip: [
{% for vport in range(0,vports,2|int) %}
       {'tg__0': 'xe{{vport}}'},
{% endfor %}  ]
      dst_ip: [
{% for vport in range(1,vports,2|int) %}
      {'tg__0': 'xe{{vport}}'},
{% endfor %}  ]
      count: 1
    traffic_type: 4
    rfc2544:
      allowed_drop_rate: 0.0001 - 0.0001
    vnf__0:
      rules: acl_1rule.yaml
      vnf_config: {lb_config: 'SW', file: vfw_vnf_pipeline_cores_{{vcpus}}_ports_{{vports}}_lb_1_sw.conf }
  runner:
    type: Iteration
    iterations: 10
    interval: 35
context:
  # put node context first, so we don't HEAT deploy if node has errors
  name: yardstick
  image: yardstick-samplevnfs
  flavor:
    vcpus: {{ vcpus }}
    ram: {{ mem }}
    disk: 6
    extra_specs:
      hw:cpu_sockets: 1
      hw:cpu_cores: {{ vcpus }}
      hw:cpu_threads: 1
  user: ubuntu
  placement_groups:
    pgrp1:
      policy: "availability"
  servers:
    tg_0:
      floating_ip: true
      placement: "pgrp1"
    vnf_0:
      floating_ip: true
      placement: "pgrp1"
  networks:
    mgmt:
      cidr: '10.0.1.0/24'
{% for vport in range(1,vports,2|int) %}
    uplink_{{loop.index0}}:
      cidr: '10.1.{{vport}}.0/24'
      gateway_ip: 'null'
      port_security_enabled: False
      enable_dhcp: 'false'
    downlink_{{loop.index0}}:
      cidr: '10.1.{{vport+1}}.0/24'
      gateway_ip: 'null'
      port_security_enabled: False
      enable_dhcp: 'false'
{% endfor %}

This test case template requires specifying the number of VCPUs, memory and ports. We set the VCPUs and memory using the --task-args option:

yardstick task start --task-args='{"mem": 10480, "vcpus": 4, "vports": 2}' \
samples/vnf_samples/nsut/vfw/tc_heat_rfc2544_ipv4_1rule_1flow_64B_trex_scale-up.yaml

In order to support ports scale-up, traffic and topology templates need to be used in the test case.

An example topology template is:

# Copyright (c) 2016-2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
---
{% set vports = get(extra_args, 'vports', '2') %}
nsd:nsd-catalog:
    nsd:
    -   id: 3tg-topology
        name: 3tg-topology
        short-name: 3tg-topology
        description: 3tg-topology
        constituent-vnfd:
        -   member-vnf-index: '1'
            vnfd-id-ref: tg__0
            VNF model: ../../vnf_descriptors/tg_rfc2544_tpl.yaml      #VNF type
        -   member-vnf-index: '2'
            vnfd-id-ref: vnf__0
            VNF model: ../../vnf_descriptors/vfw_vnf.yaml      #VNF type

        vld:
{% for vport in range(0,vports,2|int) %}
        -   id: uplink_{{loop.index0}}
            name: tg__0 to vnf__0 link {{vport + 1}}
            type: ELAN
            vnfd-connection-point-ref:
            -   member-vnf-index-ref: '1'
                vnfd-connection-point-ref: xe{{vport}}
                vnfd-id-ref: tg__0
            -   member-vnf-index-ref: '2'
                vnfd-connection-point-ref: xe{{vport}}
                vnfd-id-ref: vnf__0
        -   id: downlink_{{loop.index0}}
            name: vnf__0 to tg__0 link {{vport + 2}}
            type: ELAN
            vnfd-connection-point-ref:
            -   member-vnf-index-ref: '2'
                vnfd-connection-point-ref: xe{{vport+1}}
                vnfd-id-ref: vnf__0
            -   member-vnf-index-ref: '1'
                vnfd-connection-point-ref: xe{{vport+1}}
                vnfd-id-ref: tg__0
{% endfor %}

This template takes vports as an argument. To pass this argument it needs to be configured in the extra_args section of the scenario definition. Please note that more arguments can be defined in that section; all of them will be passed to the topology and traffic profile templates.

For example:

schema: yardstick:task:0.1
scenarios:
- type: NSPerf
  traffic_profile: ../../traffic_profiles/ipv4_throughput-scale-up.yaml
  extra_args:
    vports: {{ vports }}
  topology: vfw-tg-topology-scale-up.yaml

An example traffic profile template is:

# Copyright (c) 2016-2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# flow definition for ACL tests - 1K flows - ipv4 only
#
# the number of flows defines the widest range of parameters
# for example if srcip_range=1.0.0.1-1.0.0.255 and dst_ip_range=10.0.0.1-10.0.1.255
# and it should define only 16 flows
#
# there is assumption that packets generated will have a random sequences of following addresses pairs
# in the packets
# 1. src=1.x.x.x(x.x.x =random from 1..255) dst=10.x.x.x (random from 1..512)
# 2. src=1.x.x.x(x.x.x =random from 1..255) dst=10.x.x.x (random from 1..512)
# ...
# 512. src=1.x.x.x(x.x.x =random from 1..255) dst=10.x.x.x (random from 1..512)
#
# not all combination should be filled
# Any other field with random range will be added to flow definition
#
# the example.yaml provides all possibilities for traffic generation
#
# the profile defines a public and private side to make limited traffic correlation
# between private and public side same way as it is made by IXIA solution.
#
{% set vports = get(extra_args, 'vports', '2') %}
---
schema: "nsb:traffic_profile:0.1"

# This file is a template, it will be filled with values from tc.yaml before passing to the traffic generator

name: rfc2544
description: Traffic profile to run RFC2544 latency
traffic_profile:
  traffic_type: RFC2544Profile # defines traffic behavior - constant or look for highest possible throughput
  frame_rate: 100  # pc of linerate
  duration: {{ duration }}

{% set count = 0 %}
{% for vport in range(vports|int) %}
uplink_{{vport}}:
  ipv4:
    id: {{count + 1 }}
    outer_l2:
      framesize:
        64B: "{{ get(imix, 'imix.uplink.64B', '0') }}"
        128B: "{{ get(imix, 'imix.uplink.128B', '0') }}"
        256B: "{{ get(imix, 'imix.uplink.256B', '0') }}"
        373B: "{{ get(imix, 'imix.uplink.373B', '0') }}"
        512B: "{{ get(imix, 'imix.uplink.512B', '0') }}"
        570B: "{{ get(imix, 'imix.uplink.570B', '0') }}"
        1400B: "{{ get(imix, 'imix.uplink.1400B', '0') }}"
        1500B: "{{ get(imix, 'imix.uplink.1500B', '0') }}"
        1518B: "{{ get(imix, 'imix.uplink.1518B', '0') }}"
    outer_l3v4:
      proto: "udp"
      srcip4: "{{ get(flow, 'flow.src_ip_{{vport}}', '1.1.1.1-1.1.255.255') }}"
      dstip4: "{{ get(flow, 'flow.dst_ip_{{vport}}', '90.90.1.1-90.90.255.255') }}"
      count: "{{ get(flow, 'flow.count', '1') }}"
      ttl: 32
      dscp: 0
    outer_l4:
      srcport: "{{ get(flow, 'flow.src_port_{{vport}}', '1234-4321') }}"
      dstport: "{{ get(flow, 'flow.dst_port_{{vport}}', '2001-4001') }}"
      count: "{{ get(flow, 'flow.count', '1') }}"
downlink_{{vport}}:
  ipv4:
    id: {{count + 2}}
    outer_l2:
      framesize:
        64B: "{{ get(imix, 'imix.downlink.64B', '0') }}"
        128B: "{{ get(imix, 'imix.downlink.128B', '0') }}"
        256B: "{{ get(imix, 'imix.downlink.256B', '0') }}"
        373B: "{{ get(imix, 'imix.downlink.373B', '0') }}"
        512B: "{{ get(imix, 'imix.downlink.512B', '0') }}"
        570B: "{{ get(imix, 'imix.downlink.570B', '0') }}"
        1400B: "{{ get(imix, 'imix.downlink.1400B', '0') }}"
        1500B: "{{ get(imix, 'imix.downlink.1500B', '0') }}"
        1518B: "{{ get(imix, 'imix.downlink.1518B', '0') }}"

    outer_l3v4:
      proto: "udp"
      srcip4: "{{ get(flow, 'flow.dst_ip_{{vport}}', '90.90.1.1-90.90.255.255') }}"
      dstip4: "{{ get(flow, 'flow.src_ip_{{vport}}', '1.1.1.1-1.1.255.255') }}"
      count: "{{ get(flow, 'flow.count', '1') }}"
      ttl: 32
      dscp: 0
    outer_l4:
      srcport: "{{ get(flow, 'flow.dst_port_{{vport}}', '1234-4321') }}"
      dstport: "{{ get(flow, 'flow.src_port_{{vport}}', '2001-4001') }}"
      count: "{{ get(flow, 'flow.count', '1') }}"
{% set count = count + 2 %}
{% endfor %}
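The flow-count comments at the top of the template can be made concrete with a small helper. This is an illustrative sketch only (`range_size` is not part of NSB): it counts the addresses in an inclusive `start-end` range such as the `srcip4`/`dstip4` defaults above. The number of flows actually generated is further bounded by the `flow.count` option.

```python
from ipaddress import IPv4Address

def range_size(ip_range: str) -> int:
    """Number of addresses in an inclusive 'start-end' IPv4 range."""
    start, end = ip_range.split("-")
    return int(IPv4Address(end)) - int(IPv4Address(start)) + 1

# Default uplink source range from the template above:
print(range_size("1.1.1.1-1.1.255.255"))  # 65279 candidate source addresses
```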

There is an option to provide a predefined config for SampleVNFs. The path to the config file may be specified in the vnf_config scenario section.

vnf__0:
   rules: acl_1rule.yaml
   vnf_config: {lb_config: 'SW', file: vfw_vnf_pipeline_cores_4_ports_2_lb_1_sw.conf }
13.4.2. Baremetal
  1. Follow the traffic generator section above for setup.
  2. Edit the number of threads in <repo>/samples/vnf_samples/nsut/vfw/tc_baremetal_rfc2544_ipv4_1rule_1flow_64B_trex_scale_up.yaml, e.g. 6 threads for the given VNF:
schema: yardstick:task:0.1
scenarios:
{% for worker_thread in [1, 2, 3, 4, 5, 6] %}
- type: NSPerf
  traffic_profile: ../../traffic_profiles/ipv4_throughput.yaml
  topology: vfw-tg-topology.yaml
  nodes:
    tg__0: trafficgen_1.yardstick
    vnf__0: vnf.yardstick
  options:
    framesize:
      uplink: {64B: 100}
      downlink: {64B: 100}
    flow:
      src_ip: [{'tg__0': 'xe0'}]
      dst_ip: [{'tg__0': 'xe1'}]
      count: 1
    traffic_type: 4
    rfc2544:
      allowed_drop_rate: 0.0001 - 0.0001
    vnf__0:
      rules: acl_1rule.yaml
      vnf_config: {lb_config: 'HW', lb_count: 1, worker_config: '1C/1T', worker_threads: {{worker_thread}}}
      nfvi_enable: True
  runner:
    type: Iteration
    iterations: 10
    interval: 35
{% endfor %}
context:
  type: Node
  name: yardstick
  nfvi_type: baremetal
  file: /etc/yardstick/nodes/pod.yaml
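The Jinja2 loop in the task file above renders one NSPerf scenario per worker-thread count. A minimal Python sketch of that expansion (illustrative only, not Yardstick code):

```python
# Mimic the {% for worker_thread in [1, 2, 3, 4, 5, 6] %} loop above:
# the rendered task file contains six NSPerf scenarios, one per thread count.
scenarios = [
    {"type": "NSPerf",
     "vnf__0": {"worker_config": "1C/1T", "worker_threads": n}}
    for n in [1, 2, 3, 4, 5, 6]
]
print(len(scenarios))  # 6
```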
13.5. Scale-Out

VNF performance data with scale-out helps:

  • in capacity planning to meet the given network node requirements
  • in comparing different VNF vendor offerings
  • in meeting future capacity requirements: the better the scale-out index, the more flexibility it provides
13.5.1. Standalone

Scale-out is not supported on baremetal.

  1. Follow the traffic generator section above for setup.
  2. Generate the test case for standalone virtualization using the Ansible scripts:
cd <repo>/ansible
trex: standalone_ovs_scale_out_trex_test.yaml or standalone_sriov_scale_out_trex_test.yaml
ixia: standalone_ovs_scale_out_ixia_test.yaml or standalone_sriov_scale_out_ixia_test.yaml
ixia_correlated: standalone_ovs_scale_out_ixia_correlated_test.yaml or standalone_sriov_scale_out_ixia_correlated_test.yaml

Update the ovs_dpdk or sriov Ansible scripts above to reflect the setup.

  3. Run the test using one of the generated test cases, e.g.:
<repo>/samples/vnf_samples/nsut/tc_sriov_vfw_udp_ixia_correlated_scale_out-1.yaml
<repo>/samples/vnf_samples/nsut/tc_sriov_vfw_udp_ixia_correlated_scale_out-2.yaml
13.5.2. Heat

There are sample scale-out all-VM Heat tests. These tests only use VMs and don’t use external traffic.

The tests use UDP_Replay and correlated traffic.

<repo>/samples/vnf_samples/nsut/cgnapt/tc_heat_rfc2544_ipv4_1flow_64B_trex_correlated_scale_4.yaml

To run the test you need to increase OpenStack CPU, Memory and Port quotas.

13.6. Traffic Generator tuning

The TRex traffic generator can be set up to use multiple threads per core; this is useful for multiqueue testing.

TRex does not automatically enable multiple threads because the number of queues on a device cannot currently be detected.

To enable multiple queues, set the queues_per_port value in the TG VNF options section:

scenarios:
  - type: NSPerf
    nodes:
      tg__0: tg_0.yardstick

    options:
      tg_0:
        queues_per_port: 2
13.7. Standalone configuration

NSB supports certain standalone deployment configurations. Standalone supports provisioning a VM in a standalone virtualised environment using KVM/QEMU. There are two types of standalone contexts available: OVS-DPDK and SRIOV. OVS-DPDK uses an OVS network with DPDK drivers. SRIOV enables network traffic to bypass the software switch layer of the virtualization stack.

13.7.1. Standalone with OVS-DPDK

SampleVNF image is spawned in a VM on a baremetal server. OVS with DPDK is installed on the baremetal server.

Note

Ubuntu 17.10 requires DPDK v17.05 or higher; DPDK v17.05 requires OVS v2.8.0.

Default values for OVS-DPDK:

  • queues: 4
  • lcore_mask: “”
  • pmd_cpu_mask: “0x6”
13.7.2. Sample test case file
  1. Prepare the SampleVNF image and copy it to flavor/images.
  2. Prepare context files for TRex and SampleVNF under contexts/file.
  3. Add a bridge named br-int to the baremetal host where the SampleVNF image is deployed.
  4. Modify networks/phy_port according to the baremetal setup.
  5. Run the test from the sample task file:
# Copyright (c) 2016-2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

---
schema: yardstick:task:0.1
scenarios:
- type: NSPerf
  traffic_profile: ../../traffic_profiles/ipv4_throughput.yaml
  topology: acl-tg-topology.yaml
  nodes:
    tg__0: trafficgen_1.yardstick
    vnf__0: vnf__0.yardstick
  options:
    framesize:
      uplink: {64B: 100}
      downlink: {64B: 100}
    flow:
      src_ip: [{'tg__0': 'xe0'}]
      dst_ip: [{'tg__0': 'xe1'}]
      count: 1
    traffic_type: 4
    rfc2544:
      allowed_drop_rate: 0.0001 - 0.0001
    vnf__0:
      rules: acl_1rule.yaml
      vnf_config: {lb_config: 'SW', lb_count: 1, worker_config: '1C/1T', worker_threads: 1}
  runner:
    type: Iteration
    iterations: 10
    interval: 35
contexts:
   - name: yardstick
     type: Node
     file: /etc/yardstick/nodes/standalone/trex_bm.yaml
   - type: StandaloneOvsDpdk
     name: yardstick
     file: /etc/yardstick/nodes/standalone/host_ovs.yaml
     vm_deploy: True
     ovs_properties:
       version:
         ovs: 2.7.0
         dpdk: 16.11.1
       pmd_threads: 2
       ram:
         socket_0: 2048
         socket_1: 2048
       queues: 4
       lcore_mask: ""
       pmd_cpu_mask: "0x6"
       vpath: "/usr/local"

     flavor:
       images: "/var/lib/libvirt/images/yardstick-nsb-image.img"
       ram: 16384
       extra_specs:
         hw:cpu_sockets: 1
         hw:cpu_cores: 6
         hw:cpu_threads: 2
       user: ""
       password: ""
     servers:
       vnf__0:
         network_ports:
           mgmt:
             cidr: '1.1.1.7/24'
           xe0:
             - uplink_0
           xe1:
             - downlink_0
     networks:
       uplink_0:
         port_num: 0
         phy_port: "0000:05:00.0"
         vpci: "0000:00:07.0"
         cidr: '152.16.100.10/24'
         gateway_ip: '152.16.100.20'
       downlink_0:
         port_num: 1
         phy_port: "0000:05:00.1"
         vpci: "0000:00:08.0"
         cidr: '152.16.40.10/24'
         gateway_ip: '152.16.100.20'
14. Yardstick Test Cases
14.1. Abstract

This chapter lists available Yardstick test cases. Yardstick test cases are divided in two main categories:

  • Generic NFVI Test Cases - Test Cases developed to realize the methodology described in Methodology
  • OPNFV Feature Test Cases - Test Cases developed to verify one or more aspect of a feature delivered by an OPNFV Project.
14.2. Generic NFVI Test Case Descriptions
14.2.1. Yardstick Test Case Description TC001
Network Performance
test case id OPNFV_YARDSTICK_TC001_NETWORK PERFORMANCE
metric Number of flows and throughput
test purpose

The purpose of TC001 is to evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

pktgen

The Linux packet generator (pktgen) is a kernel-level tool for generating packets at very high speed, mainly used for testing network equipment and LANs. pktgen supports multi-threading: to generate UDP packets with random MAC addresses, IP addresses and port numbers, it can use multiple CPU cores driving NICs on different PCI or PCIe buses. pktgen performance depends on hardware parameters such as CPU processing speed, memory latency and PCI bus speed; the transmit rate can exceed 10 Gbit/s, which satisfies most NIC test requirements.

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

test description This test case uses Pktgen to generate packet flow between two hosts for simulating network workloads on the SUT.
traffic profile An iptables rule is set up on the server to monitor received packets.
configuration

file: opnfv_yardstick_tc001.yaml

Packet size is set to 60 bytes. Number of ports: 10, 50, 100, 500 and 1000, where each runs for 20 seconds. The whole sequence is run twice. The client and server are distributed on different hardware.

For SLA, max_ppm is set to 1000. The configured port amounts map to between 110 and 1,001,000 flows, respectively.
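One reading consistent with the flow counts quoted above is that a port amount `n` yields `n * (n + 1)` flows (10 ports give 110 flows, 1000 ports give 1,001,000 flows). A sketch of that arithmetic, assuming this interpretation:

```python
def flows(port_amount: int) -> int:
    # Assumed mapping: n ports -> n * (n + 1) flows, which reproduces
    # the 110 and 1,001,000 endpoints quoted in the configuration above.
    return port_amount * (port_amount + 1)

for n in (10, 50, 100, 500, 1000):
    print(n, flows(n))
```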

applicability

Test can be configured with different:

  • packet sizes;
  • amount of flows;
  • test duration.

Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that it is acceptable to lose (i.e. not receive).
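The max_ppm SLA amounts to a simple loss-per-million check; this is an illustrative sketch, not Yardstick's implementation:

```python
def ppm_lost(sent: int, received: int) -> float:
    """Packets lost per million packets sent."""
    return (sent - received) / sent * 1_000_000

# A run that loses 500 of 1,000,000 packets is within a max_ppm of 1000:
print(ppm_lost(1_000_000, 999_500) <= 1000)  # True
```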

usability This test case is used for generating high network throughput to simulate certain workloads on the SUT. Hence it should work with other test cases.
references

pktgen

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected with the server VM by using ssh. The ‘pktgen_benchmark’ bash script is copied from the Jump Host to the server VM via the ssh tunnel.
step 3 An iptables rule is set up on the server to monitor received packets.
step 4

pktgen is invoked to generate packet flows between the server and the client to simulate network workloads on the SUT. Results are processed and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 5 Two host VMs are deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.2. Yardstick Test Case Description TC002
Network Latency
test case id OPNFV_YARDSTICK_TC002_NETWORK LATENCY
metric RTT (Round Trip Time)
test purpose

The purpose of TC002 is to do a basic verification that network latency is within acceptable boundaries when packets travel between hosts located on same or different compute blades.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

ping

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer that are echoed back to the source.

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Docker image. (For example, a Cirros image, which includes ping, can be downloaded from cirros-image.)

test topology

Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host VM to target VM(s) to elicit ICMP ECHO_RESPONSE.

For one host VM there can be multiple target VMs. Host VM and target VM(s) can be on same or different compute blades.

configuration

file: opnfv_yardstick_tc002.yaml

Packet size is 100 bytes. Test duration is 60 seconds, with one ping every 10 seconds. The test is iterated two times. The SLA RTT is set to a maximum of 10 ms.

applicability

This test case can be configured with different:

  • packet sizes;
  • burst sizes;
  • ping intervals;
  • test durations;
  • test iterations.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected, and is also normal to achieve in balanced L2 environments. However, to cover most configurations, both bare metal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many real-time applications start to suffer badly if the RTT is higher than this; some may also suffer close to this RTT, while others may not suffer at all. It is a compromise that may have to be tuned for different configuration purposes.
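The verdict rule for this test case reduces to: the test should not pass if any measured RTT exceeds the configured maximum. A hedged sketch of that check (not Yardstick's actual code):

```python
def rtt_sla_passed(rtts_ms, max_rtt_ms=10.0):
    """True only if every measured RTT is within the SLA."""
    return all(rtt <= max_rtt_ms for rtt in rtts_ms)

print(rtt_sla_passed([1.2, 3.4, 9.9]))   # True
print(rtt_sla_passed([1.2, 12.0]))       # False
```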

usability This test case is one of Yardstick’s generic tests. Thus it is runnable in most scenarios.
references

Ping

ETSI-NFV-TST001

pre-test conditions

The test case image (cirros-image) needs to be installed into Glance with ping included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected with the server VM by using ssh. ‘ping_benchmark’ bash script is copied from Jump Host to the server VM via the ssh tunnel.
step 3

Ping is invoked. Ping packets are sent from server VM to client VM. RTT results are calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 Two host VMs are deleted.
test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
14.2.3. Yardstick Test Case Description TC004
Cache Utilization
test case id OPNFV_YARDSTICK_TC004_CACHE Utilization
metric cache hit, cache miss, hit/miss ratio, buffer size and page cache size
test purpose

The purpose of TC004 is to evaluate the IaaS compute capability with regards to cache utilization. This test case should be run in parallel with other Yardstick test cases and not run as a stand-alone test case.

This test case measures cache usage statistics, including cache hit, cache miss, hit ratio, buffer cache size and page cache size, with some workloads running on the infrastructure. Both average and maximum values are collected.
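The hit/miss ratio named in the metric row is a simple derived statistic. A sketch of how it and the average/maximum values could be computed (illustrative only, not cachestat's code):

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of page cache accesses served from cache."""
    return hits / (hits + misses)

# Per-interval samples reduced to average and maximum, as described above:
samples = [hit_ratio(90, 10), hit_ratio(80, 20), hit_ratio(95, 5)]
print(sum(samples) / len(samples), max(samples))
```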

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

cachestat

cachestat is a tool that uses Linux ftrace capabilities to show Linux page cache hit/miss statistics.

(cachestat is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with cachestat included.)

test description The cachestat test is invoked in a host VM on a compute blade; it requires some other test cases running on the host to stimulate a workload.
configuration

File: cachestat.yaml (in the ‘samples’ directory)

The interval is set to 1: the test repeats, pausing for 1 second in between. Test duration is set to 60 seconds.

SLA is not available in this test case.

applicability

Test can be configured with different:

  • interval;
  • runner Duration.

Default values exist.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable in most scenarios.
references

cachestat

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with cachestat included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with cachestat installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. The ‘cache_stat’ bash script is copied from the Jump Host to the server VM via the ssh tunnel.
step 3

‘cache_stat’ script is invoked. Raw cache usage statistics are collected and filtrated. Average and maximum values are calculated and recorded. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict None. Cache utilization results are collected and stored.
14.2.4. Yardstick Test Case Description TC005
Storage Performance
test case id OPNFV_YARDSTICK_TC005_STORAGE PERFORMANCE
metric IOPS (Average IOs performed per second), Throughput (Average disk read/write bandwidth rate), Latency (Average disk read/write latency)
test purpose

The purpose of TC005 is to evaluate the IaaS storage performance with regards to IOPS, throughput and latency.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

fio

fio is an I/O tool meant to be used both for benchmark and stress/hardware verification. It has support for 19 different types of I/O engines (sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O, forked or threaded jobs, and much more.

(fio is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with fio included.)

test description The fio test is invoked in a host VM on a compute blade; a job file and parameters are passed to fio, which then does what the job file tells it to do.
configuration

file: opnfv_yardstick_tc005.yaml

IO types are set to read, write, randwrite, randread and rw. IO block sizes are set to 4KB, 64KB and 1024KB. fio is run for each IO type and IO block size combination; each iteration runs for 30 seconds (10 for ramp time, 20 for runtime).
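The schedule described above can be enumerated: 5 IO types times 3 block sizes gives 15 fio runs of 30 seconds each. A quick sketch of that enumeration:

```python
io_types = ["read", "write", "randwrite", "randread", "rw"]
block_sizes = ["4KB", "64KB", "1024KB"]
runs = [(t, bs) for t in io_types for bs in block_sizes]
print(len(runs), "runs of 30 s each (10 s ramp + 20 s runtime)")
```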

For SLA, minimum read/write iops is set to 100, minimum read/write throughput is set to 400 KB/s, and maximum read/write latency is set to 20000 usec.

applicability

This test case can be configured with different:

  • IO types;
  • IO block size;
  • IO depth;
  • ramp time;
  • test duration.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably higher throughput and lower latency are expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read/write bandwidths are lower than this.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable in most scenarios.
references

fio

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with fio included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with fio installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. The ‘fio_benchmark’ bash script is copied from the Jump Host to the host VM via the ssh tunnel.
step 3

‘fio_benchmark’ script is invoked. Simulated IO operations are started. IOPS, disk read/write bandwidth and latency are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.5. Yardstick Test Case Description TC008
Packet Loss Extended Test
test case id OPNFV_YARDSTICK_TC008_NW PERF, Packet loss Extended Test
metric Number of flows, packet size and throughput
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different packet sizes and amounts of flows matter for the throughput between VMs on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc008.yaml

Packet size: 64, 128, 256, 512, 1024, 1280 and 1518 bytes.

Number of ports: 1, 10, 50, 100, 500 and 1000. The configured port amounts map to between 2 and 1,001,000 flows, respectively. Each packet_size/port_amount combination is run ten times, for 20 seconds each; then the next packet_size/port_amount combination is run, and so on.

The client and server are distributed on different HW.

For SLA max_ppm is set to 1000.
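The combinations above can be enumerated the same way: 7 packet sizes times 6 port amounts gives 42 packet_size/port_amount combinations, each run ten times for 20 seconds. A sketch:

```python
packet_sizes = [64, 128, 256, 512, 1024, 1280, 1518]
port_amounts = [1, 10, 50, 100, 500, 1000]
combos = [(size, ports) for size in packet_sizes for ports in port_amounts]
print(len(combos))  # 42 combinations, each run ten times for 20 s
```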

test tool

pktgen

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

references

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that it is acceptable to lose (i.e. not receive).

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.6. Yardstick Test Case Description TC009
Packet Loss
test case id OPNFV_YARDSTICK_TC009_NW PERF, Packet loss
metric Number of flows, packets lost and throughput
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between VMs on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc009.yaml

Packet size: 64 bytes

Number of ports: 1, 10, 50, 100, 500 and 1000. The configured port amounts map to between 2 and 1,001,000 flows, respectively. Each port amount is run ten times, for 20 seconds each; then the next port_amount is run, and so on.

The client and server are distributed on different HW.

For SLA max_ppm is set to 1000.

test tool

pktgen

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

references

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that it is acceptable to lose (i.e. not receive).

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.7. Yardstick Test Case Description TC010
Memory Latency
test case id OPNFV_YARDSTICK_TC010_MEMORY LATENCY
metric Memory read latency (nanoseconds)
test purpose

The purpose of TC010 is to evaluate the IaaS compute performance with regards to memory read latency. It measures the memory read latency for varying memory sizes and strides. Whole memory hierarchy is measured.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

Lmbench

Lmbench is a suite of operating system microbenchmarks; this test case uses the lat_mem_rd tool from that suite. The suite includes benchmarks for:

  • Context switching
  • Networking: connection establishment, pipe, TCP, UDP, and RPC hot potato
  • File system creates and deletes
  • Process creation
  • Signal handling
  • System call overhead
  • Memory read latency

(LMbench is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with LMbench included.)

test description

LMbench lat_mem_rd benchmark measures memory read latency for varying memory sizes and strides.

The benchmark runs as two nested loops: the outer loop is the stride size, the inner loop is the array size. For each array size, the benchmark creates a ring of pointers that each point backward one stride. Traversing the array is done by:

p = (char **)*p;

in a for loop (the overhead of the for loop is not significant; the loop is an unrolled loop 100 loads long). The size of the array varies from 512 bytes to (typically) eight megabytes. For the small sizes, the cache will have an effect and the loads will be much faster. This becomes much more apparent when the data is plotted.

Only data accesses are measured; the instruction cache is not measured.

The results are reported in nanoseconds per load and have been verified accurate to within a few nanoseconds on an SGI Indy.
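The backward-stride pointer chase described above can be sketched in Python. This is illustrative only: the real benchmark does this in C with actual pointers, so that every load depends on the result of the previous one.

```python
def build_ring(array_size: int, stride: int) -> list:
    """Each slot 'points' backward one stride, forming a closed ring."""
    return [(i - stride) % array_size for i in range(array_size)]

ring = build_ring(array_size=1024, stride=128)
p = 0
for _ in range(100):   # analogous to the unrolled 'p = (char **)*p;' loop
    p = ring[p]
```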

configuration

File: opnfv_yardstick_tc010.yaml

  • SLA (max_latency): 30 nanoseconds
  • Stride - 128 bytes
  • Stop size - 64 megabytes
  • Iterations: 10 - test is run 10 times iteratively.
  • Interval: 1 - there is 1 second delay between each iteration.

SLA is optional. The SLA in this test case serves as an example. Considerably lower read latency is expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read latency is higher than this.

applicability

Test can be configured with different:

  • strides;
  • stop_size;
  • iterations and intervals.

Default values exist.

SLA (optional) : max_latency: The maximum memory latency that is accepted.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable in most scenarios.
references

LMbench lat_mem_rd

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with Lmbench included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with LMbench installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. The ‘lmbench_latency_benchmark’ bash script is copied from the Jump Host to the host VM via the ssh tunnel.
step 3

‘lmbench_latency_benchmark’ script is invoked. LMbench’s lat_mem_rd benchmark starts to measure memory read latency for varying memory sizes and strides. Memory read latencies are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Test fails if the measured memory latency is above the SLA value or if there is a test case execution problem.
14.2.8. Yardstick Test Case Description TC011
Packet delay variation between VMs
test case id OPNFV_YARDSTICK_TC011_PACKET DELAY VARIATION BETWEEN VMs
metric jitter: packet delay variation (ms)
test purpose

The purpose of TC011 is to evaluate the IaaS network performance with regards to network jitter (packet delay variation). It measures the packet delay variation sending the packets from one VM to the other.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

iperf3

iPerf3 is a tool for active measurements of the maximum achievable bandwidth on IP networks. It supports tuning of various parameters related to timing, buffers and protocols. The UDP protocol can be used to measure jitter.

(iperf3 is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with iperf3 included.)

test description

iperf3 test is invoked between a host VM and a target VM.

Jitter calculations are continuously computed by the server, as specified by RTP in RFC 1889. The client records a 64-bit second/microsecond timestamp in the packet. The server computes the relative transit time as (server’s receive time - client’s send time). The client’s and server’s clocks do not need to be synchronized; any difference is subtracted out in the jitter calculation. Jitter is the smoothed mean of differences between consecutive transit times.
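The RFC 1889 estimator described above keeps a running smoothed mean: for each difference D between consecutive transit times, J is updated as J += (|D| - J) / 16. A sketch, assuming the standard RTP formula (this is not iperf3's source code):

```python
def smoothed_jitter(transit_times):
    """RFC 1889 jitter: J(i) = J(i-1) + (|D(i-1,i)| - J(i-1)) / 16."""
    j, prev = 0.0, transit_times[0]
    for t in transit_times[1:]:
        j += (abs(t - prev) - j) / 16.0
        prev = t
    return j

print(smoothed_jitter([5.0, 5.0, 5.0]))  # 0.0 — constant transit, no jitter
```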

configuration

File: opnfv_yardstick_tc011.yaml

  • options: protocol: udp # The protocol used by iperf3. bandwidth: 20m # Packets are sent at the given bandwidth without pausing.

  • runner: duration: 30 # Total test duration 30 seconds.

  • SLA (optional): jitter: 10 (ms) # The maximum amount of jitter that is accepted.
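
Assembled as a scenario entry, the options above might look like the following sketch (the Iperf3 type name and the sla action key follow Yardstick’s usual scenario schema and are assumptions to be checked against the shipped opnfv_yardstick_tc011.yaml; context and host sections are omitted):

```yaml
scenarios:
- type: Iperf3
  options:
    protocol: udp      # UDP is required for iperf3 to report jitter
    bandwidth: 20m     # packets are sent at the given bandwidth without pausing
  runner:
    type: Duration
    duration: 30       # total test duration, 30 seconds
  sla:
    jitter: 10         # maximum accepted jitter, in ms
    action: monitor    # assumed; Yardstick SLAs typically declare an action
```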

applicability

Test can be configured with different:

  • bandwidth: Test case can be configured with different bandwidths.

  • duration: The test duration can be configured.

  • jitter: SLA is optional. The SLA in this test case serves as an example.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable on most of the scenarios.
references

iperf3

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with iperf3 included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs with iperf3 installed are booted, as server and client.
step 2 Yardstick is connected with the host VM by using ssh. An iperf3 server is started on the server VM via the ssh tunnel.
step 3

iperf3 benchmark is invoked. Jitter is calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VMs are deleted.
test verdict Test should not PASS if any jitter is above the optional SLA value, or if there is a test case execution problem.
14.2.9. Yardstick Test Case Description TC012
Memory Bandwidth
test case id OPNFV_YARDSTICK_TC012_MEMORY BANDWIDTH
metric Memory read/write bandwidth (MBps)
test purpose

The purpose of TC012 is to evaluate the IaaS compute performance with regards to memory throughput. It measures the rate at which data can be read from and written to the memory (this includes all levels of memory).

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

LMbench

LMbench is a suite of operating system microbenchmarks. This test uses the bw_mem tool from that suite, including:

  • Cached file read
  • Memory copy (bcopy)
  • Memory read
  • Memory write
  • Pipe
  • TCP

(LMbench is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with LMbench included.)

test description The LMbench bw_mem benchmark allocates twice the specified amount of memory, zeros it, and then times the copying of the first half to the second half. The benchmark is invoked in a host VM on a compute blade. Results are reported in megabytes moved per second.
configuration

File: opnfv_yardstick_tc012.yaml

  • SLA (optional): min_bw: 15000 (MBps) - the minimum amount of memory bandwidth that is accepted.
  • Size: 10 240 kB - the test allocates twice that size (20 480 kB), zeros it and then measures the time it takes to copy from one half to the other.
  • Benchmark: rdwr - measures the time to read data into memory and then write data to the same location.
  • Warmup: 0 - the number of iterations to perform before taking actual measurements.
  • Iterations: 10 - test is run 10 times iteratively.
  • Interval: 1 - there is 1 second delay between each iteration.

SLA is optional. The SLA in this test case serves as an example. Considerably higher bandwidth is expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read/write bandwidths are lower than this.
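
Put together, the configuration above might look like the following sketch (the Lmbench type name and option keys are assumptions following Yardstick’s scenario schema; values are the ones stated above; context and host sections are omitted):

```yaml
scenarios:
- type: Lmbench
  options:
    test_type: bandwidth   # select the bw_mem benchmark rather than latency
    size: 10240            # kB; bw_mem allocates twice this amount
    benchmark: rdwr        # read data into memory, then write to the same location
    warmup: 0              # iterations before actual measurements
  runner:
    type: Iteration
    iterations: 10
    interval: 1            # seconds of delay between iterations
  sla:
    min_bandwidth: 15000   # MBps
```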

applicability

Test can be configured with different:

  • memory sizes;
  • memory operations (such as rd, wr, rdwr, cp, frd, fwr, fcp, bzero, bcopy);
  • number of warmup iterations;
  • iterations and intervals.

Default values exist.

SLA (optional) : min_bandwidth: The minimum memory bandwidth that is accepted.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable on most of the scenarios.
references

LMbench bw_mem

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with LMbench included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with LMbench installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. “lmbench_bandwidth_benchmark” bash script is copied from Jump Host to the host VM via ssh tunnel.
step 3

‘lmbench_bandwidth_benchmark’ script is invoked. LMbench’s bw_mem benchmark starts to measure memory read/write bandwidth. Memory read/write bandwidth results are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Test fails if the measured memory bandwidth is below the SLA value or if there is a test case execution problem.
14.2.10. Yardstick Test Case Description TC014
Processing speed
test case id OPNFV_YARDSTICK_TC014_PROCESSING SPEED
metric score of single cpu running, score of parallel running
test purpose

The purpose of TC014 is to evaluate the IaaS compute performance with regards to CPU processing speed. It measures the scores of single CPU running and parallel running.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

UnixBench

UnixBench is a widely used CPU benchmarking tool. It can measure the performance of bash scripts and of CPUs in multithreading and single threading, as well as the performance of parallel tasks. Specific disk IO tests for small and large files are also performed. It can be used to measure both Linux dedicated servers and Linux VPS servers, running CentOS, Debian, Ubuntu, Fedora and other distros.

(UnixBench is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with UnixBench included.)

test description

The UnixBench runs system benchmarks in a host VM on a compute blade, getting information on the CPUs in the system. If the system has more than one CPU, the tests will be run twice – once with a single copy of each test running at once, and once with N copies, where N is the number of CPUs.

UnixBench will process a set of results from a single test by averaging the individual pass results into a single final value.

configuration

file: opnfv_yardstick_tc014.yaml

run_mode: run UnixBench in quiet mode or verbose mode
test_type: dhry2reg, whetstone and so on

For the SLA, both single_score and parallel_score can be set by the user; the default is NA.
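
A minimal scenario sketch for the above (the UnixBench type name is an assumption following Yardstick’s scenario schema, and the SLA thresholds are hypothetical examples since the defaults are NA):

```yaml
scenarios:
- type: UnixBench
  options:
    run_mode: verbose        # or quiet
    test_type: dhry2reg      # e.g. dhry2reg, whetstone
  sla:                       # optional; both scores default to NA
    single_score: "100"      # hypothetical threshold
    parallel_score: "500"    # hypothetical threshold
```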

applicability

Test can be configured with different:

  • test types;
  • dhry2reg;
  • whetstone.

Default values exist.

SLA (optional) : min_score: The minimum UnixBench score that is accepted.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable on most of the scenarios.
references

unixbench

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with UnixBench included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with UnixBench installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. “unixbench_benchmark” bash script is copied from Jump Host to the host VM via ssh tunnel.
step 3

UnixBench is invoked. All the tests are executed using the “Run” script in the top level of the UnixBench directory. The “Run” script will run a standard “index” test and save the report in the “results” directory. Then the report is processed by “unixbench_benchmark” and checked against the SLA.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.11. Yardstick Test Case Description TC024
CPU Load
test case id OPNFV_YARDSTICK_TC024_CPU Load
metric CPU load
test purpose To evaluate the CPU load performance of the IaaS. This test case should be run in parallel with other Yardstick test cases and not as a stand-alone test case. Average, minimum and maximum values are obtained. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: cpuload.yaml (in the ‘samples’ directory)

  • interval: 1 - repeat, pausing 1 second in-between.
  • count: 10 - display statistics 10 times, then exit.
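
The cpuload.yaml options above reduce to a very small scenario entry; a sketch (the CPULoad type name is an assumption following Yardstick’s scenario schema):

```yaml
scenarios:
- type: CPULoad
  options:
    interval: 1   # seconds of pause between mpstat samples
    count: 10     # display statistics 10 times, then exit
```

Because this TC runs in the background, it carries no SLA; it only collects mpstat output alongside the test cases under study.
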
test tool

mpstat

(mpstat is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. However, if mpstat is not present the TC instead uses /proc/stats as the source to produce “mpstat” output.)

references man-pages
applicability

Test can be configured with different:

  • interval;
  • count;
  • runner iterations and intervals.

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance with mpstat included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed. The related TC, or TCs, is invoked and mpstat logs are produced and stored.

Result: Stored logs

test verdict None. CPU load results are fetched and stored.
14.2.12. Yardstick Test Case Description TC037
Latency, CPU Load, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC037_LATENCY,CPU LOAD,THROUGHPUT, PACKET LOSS
metric Number of flows, latency, throughput, packet loss CPU utilization percentage, CPU interrupt per second
test purpose

The purpose of TC037 is to evaluate the IaaS compute capacity and network performance with regards to CPU utilization, packet flows and network throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades, and the CPU load variation.

Typically, e.g., the performance of a vSwitch depends on the number of flows running through it. The performance of other equipment or entities can also depend on the number of flows or the packet sizes used.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

Ping, Pktgen, mpstat

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer that are echoed back to the source.

The Linux packet generator (pktgen) is a kernel-space tool that generates packets at very high speed and is mainly used to test networks and LAN equipment. pktgen supports multi-threading and can generate UDP packets with random MAC addresses, IP addresses and port numbers, spreading the load across multiple CPU cores and NICs on different PCI/PCIe buses. Its performance depends on hardware parameters such as CPU processing speed, memory latency and PCI bus speed; the transmit rate can exceed 10 Gbit/s, which satisfies most NIC test requirements.

The mpstat command writes to standard output activities for each available processor, processor 0 being the first one. Global average activities among all processors are also reported. The mpstat command can be used both on SMP and UP machines, but in the latter, only global average activities will be printed.

(Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Docker image. For example, a Cirros image can be downloaded from cirros-image; it includes ping.

Pktgen and mpstat are not always part of a Linux distribution, hence they need to be installed. They are part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen and mpstat included.)

test description This test case uses Pktgen to generate packet flow between two hosts for simulating network workloads on the SUT. Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from a host VM to the target VM(s) to elicit ICMP ECHO_RESPONSE, meanwhile CPU activities are monitored by mpstat.
configuration

file: opnfv_yardstick_tc037.yaml

Packet size is set to 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured port amounts map from 2 up to 1001000 flows, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test the CPU load on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different hardware. The mpstat monitoring interval is set to 1 second. The ping packet size is set to 100 bytes. For the SLA, max_ppm is set to 1000.
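
The prose above corresponds roughly to a scenario entry like this sketch (the Pktgen type name and option keys are assumptions following Yardstick’s scenario schema; the port amounts are stepped through by the test, so only one value is shown):

```yaml
scenarios:
- type: Pktgen
  options:
    packetsize: 64           # pktgen packet size, bytes
    number_of_ports: 10      # stepped through 1, 10, 50, 100, 300, 500, 750, 1000
    duration: 20             # seconds per run
  runner:
    type: Iteration
    iterations: 2            # each port amount is run two times
  sla:
    max_ppm: 1000            # accepted packet loss, per million packets sent
```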

applicability

Test can be configured with different:

  • pktgen packet sizes;
  • amount of flows;
  • test duration;
  • ping packet size;
  • mpstat monitor interval.

Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose (not received).

references

Ping

mpstat

pktgen

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with pktgen, mpstat included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected with the server VM by using ssh. The ‘pktgen_benchmark’ and ‘ping_benchmark’ bash scripts are copied from the Jump Host to the server VM via the ssh tunnel.
step 3 An iptables rule is set up on the server to monitor for received packets.
step 4

pktgen is invoked to generate packet flow between the server and client to simulate network workloads on the SUT. Ping is invoked. Ping packets are sent from the server VM to the client VM. mpstat is invoked, recording activities for each available processor. Results are processed and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 5 Two host VMs are deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.13. Yardstick Test Case Description TC038
Latency, CPU Load, Throughput, Packet Loss (Extended measurements)
test case id OPNFV_YARDSTICK_TC038_Latency,CPU Load,Throughput,Packet Loss
metric Number of flows, latency, throughput, CPU load, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically, e.g., the performance of a vSwitch depends on the number of flows running through it. The performance of other equipment or entities can also depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc038.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured port amounts map from 2 up to 1001000 flows, respectively. Each port amount is run ten times, for 20 seconds each. Then the next port_amount is run, and so on. During the test the CPU load on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For the SLA, max_ppm is set to 1000.

test tool

pktgen

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image can also be downloaded; it includes ping.)

mpstat

(Mpstat is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image.)

references

Ping and Mpstat man pages

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose (not received).

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.14. Yardstick Test Case Description TC042
Network Performance
test case id OPNFV_YARDSTICK_TC042_DPDK pktgen latency measurements
metric L2 Network Latency
test purpose Measure L2 network latency when DPDK is enabled between hosts on different compute blades.
configuration

file: opnfv_yardstick_tc042.yaml

  • Packet size: 64 bytes
  • SLA (max_latency): 100 usec
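
A minimal scenario sketch for this configuration (the PktgenDPDKLatency type name and the max_latency key are assumptions following Yardstick’s scenario schema, to be checked against the shipped opnfv_yardstick_tc042.yaml):

```yaml
scenarios:
- type: PktgenDPDKLatency
  options:
    packetsize: 64     # bytes
  sla:
    max_latency: 100   # microseconds
```
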
test tool

DPDK Pktgen-dpdk

(DPDK and Pktgen-dpdk are not part of a Linux distribution, hence they need to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with DPDK and pktgen-dpdk included.)

references

DPDK

Pktgen-dpdk

ETSI-NFV-TST001

applicability Test can be configured with different packet sizes. Default values exist.
pre-test conditions

The test case image needs to be installed into Glance with DPDK and pktgen-dpdk included in it.

The NICs of the compute nodes in the POD must support DPDK.

At least the compute nodes must have hugepages configured.

To achieve a high performance result, it is recommended to use NUMA, CPU pinning, OVS and so on.

test sequence description and expected result
step 1 The hosts are installed on different blades, as server and client. Both server and client have three interfaces. The first one is used for management, such as ssh. The other two are used by DPDK.
step 2 Testpmd is invoked with configurations to forward packets from one DPDK port to the other on server.
step 3

Pktgen-dpdk is invoked with configurations as a traffic generator and logs are produced and stored on client.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.15. Yardstick Test Case Description TC043
Network Latency Between NFVI Nodes
test case id OPNFV_YARDSTICK_TC043_LATENCY_BETWEEN_NFVI_NODES
metric RTT (Round Trip Time)
test purpose

The purpose of TC043 is to do a basic verification that network latency is within acceptable boundaries when packets travel between different NFVI nodes.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

ping

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer that are echoed back to the source.

test topology Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host node to target node to elicit ICMP ECHO_RESPONSE.
configuration

file: opnfv_yardstick_tc043.yaml

Packet size 100 bytes. Total test duration 600 seconds. One ping every 10 seconds. SLA RTT is set to a maximum of 10 ms.
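
Those values correspond to a scenario entry along these lines (a sketch; the Ping type name and the runner/sla keys are assumptions following Yardstick’s scenario schema):

```yaml
scenarios:
- type: Ping
  options:
    packetsize: 100   # bytes
  runner:
    type: Duration
    duration: 600     # total test duration, seconds
    interval: 10      # one ping every 10 seconds
  sla:
    max_rtt: 10       # ms
```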

applicability

This test case can be configured with different:

  • packet sizes;
  • burst sizes;
  • ping intervals;
  • test durations;
  • test iterations.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected, and also normal to achieve in balanced L2 environments. However, to cover most configurations, both bare metal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many real time applications start to suffer badly if the RTT is higher than this. Some may also suffer at RTTs close to this value, while others may not suffer at all. It is a compromise that may have to be tuned for different configuration purposes.

references

Ping

ETSI-NFV-TST001

pre-test conditions Each pod node must have ping included in it.
test sequence description and expected result
step 1 Yardstick is connected with the NFVI node by using ssh. The ‘ping_benchmark’ bash script is copied from the Jump Host to the NFVI node via the ssh tunnel.
step 2

Ping is invoked. Ping packets are sent from server node to client node. RTT results are calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
14.2.16. Yardstick Test Case Description TC044
Memory Utilization
test case id OPNFV_YARDSTICK_TC044_Memory Utilization
metric Memory utilization
test purpose To evaluate the IaaS compute capability with regards to memory utilization. This test case should be run in parallel with other Yardstick test cases and not as a stand-alone test case. It measures the memory usage statistics, including used memory, free memory, buffer, cache and shared memory. Both average and maximum values are obtained. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

File: memload.yaml (in the ‘samples’ directory)

  • interval: 1 - repeat, pausing 1 second in-between.
  • count: 10 - display statistics 10 times, then exit.
test tool

free

free provides information about unused and used memory and swap space on any computer running Linux or another Unix-like operating system. free is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

man-pages

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • interval;
  • count;
  • runner iterations and intervals.

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance with free included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed as client. The related TC, or TCs, is invoked and free logs are produced and stored.

Result: logs are stored.

test verdict None. Memory utilization results are fetched and stored.
14.2.17. Yardstick Test Case Description TC055
Compute Capacity
test case id OPNFV_YARDSTICK_TC055_Compute Capacity
metric Number of cpus, number of cores, number of threads, available memory size and total cache size.
test purpose To evaluate the IaaS compute capacity with regards to hardware specification, including number of cpus, number of cores, number of threads, available memory size and total cache size. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc055.yaml

There are no additional configurations to be set for this TC.

test tool

/proc/cpuinfo

This TC uses /proc/cpuinfo as the source to produce compute capacity output.

references

/proc/cpuinfo

ETSI-NFV-TST001

applicability None.
pre-test conditions No POD specific requirements have been identified.
test sequence description and expected result
step 1

The hosts are installed, TC is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict None. Hardware specifications are fetched and stored.
14.2.18. Yardstick Test Case Description TC061
Network Utilization
test case id OPNFV_YARDSTICK_TC061_Network Utilization
metric Network utilization
test purpose To evaluate the IaaS network capability with regards to network utilization, including: total number of packets received per second, total number of packets transmitted per second, total number of kilobytes received per second, total number of kilobytes transmitted per second, number of compressed packets received per second (for cslip etc.), number of compressed packets transmitted per second, number of multicast packets received per second, and utilization percentage of the network interface. This test case should be run in parallel with other Yardstick test cases and not as a stand-alone test case. It measures the network usage statistics from the network devices. Average, minimum and maximum values are obtained. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

File: netutilization.yaml (in the ‘samples’ directory)

  • interval: 1 - repeat, pausing 1 second in-between.
  • count: 1 - display statistics 1 time, then exit.
test tool

sar

The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. sar is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

man-pages

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • interval;
  • count;
  • runner iterations and intervals.

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance with sar included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed as client. The related TC, or TCs, is invoked and sar logs are produced and stored.

Result: logs are stored.

test verdict None. Network utilization results are fetched and stored.
14.2.19. Yardstick Test Case Description TC063
Storage Capacity
test case id OPNFV_YARDSTICK_TC063_Storage Capacity
metric Storage/disk size, block size Disk Utilization
test purpose This test case will check the parameters which could decide several models and each model has its specified task to measure. The test purposes are to measure disk size, block size and disk utilization. With the test results, we could evaluate the storage capacity of the host.
configuration
file: opnfv_yardstick_tc063.yaml
  • test_type: “disk_size”

  • runner:

    type: Iteration, iterations: 1 - test is run 1 time iteratively.
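
A minimal scenario sketch for the above (the StorageCapacity type name is an assumption following Yardstick’s scenario schema; the interval and count keys shown are the disk-utilization options listed under applicability below and apply only to that test_type):

```yaml
scenarios:
- type: StorageCapacity
  options:
    test_type: "disk_size"   # or "block_size", "disk_utilization"
    interval: 1              # seconds between disk-utilization samples
    count: 15                # number of disk-utilization samples
  runner:
    type: Iteration
    iterations: 1            # test is run 1 time
```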

test tool

fdisk A command-line utility that provides disk partitioning functions.

iostat This is a computer system monitor tool used to collect and show operating system storage input and output statistics.

references

iostat fdisk

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • test_type: “disk size”, “block size”, “disk utilization”

  • interval: 1 - how often to stat disk utilization

    type: int unit: seconds

  • count: 15 - how many times to stat disk utilization

    type: int unit: na

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance.

No POD specific requirements have been identified.

test sequence The specific disk storage capacity information is output in sequence to a file.
step 1

The pod is available and the hosts are installed. Node5 is used and logs are produced and stored.

Result: Logs are stored.

test verdict None.
14.2.20. Yardstick Test Case Description TC069
Memory Bandwidth
test case id OPNFV_YARDSTICK_TC069_Memory Bandwidth
metric Megabyte per second (MBps)
test purpose To evaluate the IaaS compute performance with regards to memory bandwidth. It measures the maximum possible cache and memory performance while reading and writing blocks of data (starting from 1 kB and increasing in powers of 2) continuously through the ALU and FPU respectively. It measures different aspects of memory performance via synthetic simulations. Each simulation consists of four performance measurements (Copy, Scale, Add, Triad). Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

File: opnfv_yardstick_tc069.yaml

  • SLA (optional): min_bandwidth: 7000 (MBps) - the minimum amount of memory bandwidth that is accepted.

  • type_id: 1 - runs a specified benchmark (by an ID number):

    1 – INTmark [writing] 2 – INTmark [reading] 3 – INTmem 4 – FLOATmark [writing] 5 – FLOATmark [reading] 6 – FLOATmem

  • block_size: 64 Megabytes - the maximum block size per array.

  • load: 32 Gigabytes - the amount of data load per pass.

  • iterations: 5 - test is run 5 times iteratively.

  • interval: 1 - there is 1 second delay between each iteration.
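
Collected into a scenario entry, the options above might look like this sketch (the Ramspeed type name and option keys are assumptions following Yardstick’s scenario schema, to be checked against the shipped opnfv_yardstick_tc069.yaml):

```yaml
scenarios:
- type: Ramspeed
  options:
    type_id: 1            # INTmark [writing]
    block_size: 64        # Megabytes; the maximum block size per array
    load: 32              # Gigabytes of data load per pass
  runner:
    type: Iteration
    iterations: 5
    interval: 1           # seconds of delay between iterations
  sla:
    min_bandwidth: 7000   # MBps
```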

test tool

RAMspeed

RAMspeed is a free open source command line utility to measure cache and memory performance of computer systems. RAMspeed is not always part of a Linux distribution, hence it needs to be installed in the test image.

references

RAMspeed

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • benchmark operations (such as INTmark [writing], INTmark [reading], FLOATmark [writing], FLOATmark [reading], INTmem, FLOATmem);
  • block size per array;
  • load per pass;
  • number of batch run iterations;
  • iterations and intervals.

There are default values for each above-mentioned option.

pre-test conditions

The test case image needs to be installed into Glance with RAMspeed included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed as client. RAMspeed is invoked and logs are produced and stored.

Result: logs are stored.

test verdict Test fails if the measured memory bandwidth is below the SLA value or if there is a test case execution problem.
14.2.21. Yardstick Test Case Description TC070
Latency, Memory Utilization, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC070_Latency, Memory Utilization, Throughput,Packet Loss
metric Number of flows, latency, throughput, Memory Utilization, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically, e.g., the performance of a vSwitch depends on the number of flows running through it. The performance of other equipment or entities can also depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc070.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured port amounts map from 2 up to 1001000 flows, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test the memory utilization on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For the SLA, max_ppm is set to 1000.

test tool

pktgen

Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. (As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image can also be downloaded; it includes ping.)

free

free provides information about unused and used memory and swap space on any computer running Linux or another Unix-like operating system. free is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

Ping and free man pages

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, not received.
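As a minimal sketch of the max_ppm check (the packet counters are hypothetical inputs here; in practice they come from the pktgen result logs):

```python
# Minimal sketch of the max_ppm SLA check. Packet counters are hypothetical
# inputs; in a real run they are parsed from the pktgen result logs.
def packet_loss_ppm(packets_sent: int, packets_received: int) -> float:
    """Lost packets per million packets sent."""
    return (packets_sent - packets_received) * 1_000_000 / packets_sent

MAX_PPM = 1000  # SLA value from the configuration above

def sla_passed(sent: int, received: int) -> bool:
    return packet_loss_ppm(sent, received) <= MAX_PPM

print(sla_passed(1_000_000, 999_500))  # 500 ppm loss -> True
```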

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.22. Yardstick Test Case Description TC071
Latency, Cache Utilization, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC071_Latency, Cache Utilization, Throughput,Packet Loss
metric Number of flows, latency, throughput, Cache Utilization, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc071.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured port amounts map to flow counts from 2 up to 1001000, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test, Cache Utilization on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For SLA, max_ppm is set to 1000.

test tool

pktgen

Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. (As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image can also be downloaded; it includes ping.)

cachestat

cachestat is not always part of a Linux distribution, hence it needs to be installed.

references

Ping man pages

pktgen

cachestat

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.23. Yardstick Test Case Description TC072
Latency, Network Utilization, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC072_Latency, Network Utilization, Throughput,Packet Loss
metric Number of flows, latency, throughput, Network Utilization, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc072.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured port amounts map to flow counts from 2 up to 1001000, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test, Network Utilization on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For SLA, max_ppm is set to 1000.

test tool

pktgen

Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. (As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image can also be downloaded; it includes ping.)

sar

The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. sar is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

Ping and sar man pages

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.24. Yardstick Test Case Description TC073
Throughput per NFVI node test
test case id OPNFV_YARDSTICK_TC073_Network latency and throughput between nodes
metric Network latency and throughput
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of packet sizes and flows matter for the throughput between nodes in one pod.
configuration

file: opnfv_yardstick_tc073.yaml

Packet size: default 1024 bytes.

Test length: default 20 seconds.

The client and server are distributed on different nodes.

For SLA max_mean_latency is set to 100.

test tool netperf Netperf is a software application that provides network bandwidth testing between two hosts on a network. It supports Unix domain sockets, TCP, SCTP, DLPI and UDP via BSD Sockets. Netperf provides a number of predefined tests e.g. to measure bulk (unidirectional) data transfer or request response performance. (netperf is not always part of a Linux distribution, hence it needs to be installed.)
references netperf Man pages ETSI-NFV-TST001
applicability

Test can be configured with different packet sizes and test duration. Default values exist.

SLA (optional): max_mean_latency
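A minimal sketch of the max_mean_latency check, assuming latency samples are parsed from the netperf output (sample values and units here are made up):

```python
# Hypothetical sketch: check measured latency samples against the
# max_mean_latency SLA. In practice, values come from the netperf output.
def mean_latency_sla(latencies, max_mean_latency=100):
    mean = sum(latencies) / len(latencies)
    return mean <= max_mean_latency

samples = [80, 95, 110, 90]  # made-up samples
print(mean_latency_sla(samples))  # mean 93.75 -> True
```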

pre-test conditions The POD can be reached by an external IP and logged into via SSH.
test sequence description and expected result
step 1 Install netperf tool on each specified node, one is as the server, and the other as the client.
step 2 Log on to the client node and use the netperf command to execute the network performance test.
step 3 The throughput results are stored.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.2.25. Yardstick Test Case Description TC074
Storperf
test case id OPNFV_YARDSTICK_TC074_Storperf
metric Storage performance
test purpose

To evaluate and report on the Cinder volume performance.

This test case integrates with OPNFV StorPerf to measure block performance of the underlying Cinder drivers. Many options are supported, and even the root disk (Glance ephemeral storage) can be profiled.

The fundamental concept of the test case is to first fill the volumes with random data to ensure reported metrics are indicative of continued usage and not skewed by transitional performance while the underlying storage driver allocates blocks. The metrics for filling the volumes with random data are not reported in the final results. The test also ensures the volumes are performing at a consistent level of performance by measuring metrics every minute, and comparing the trend of the metrics over the run. By evaluating the min and max values, as well as the slope of the trend, it can make the determination that the metrics are stable, and not fluctuating beyond industry standard norms.

configuration

file: opnfv_yardstick_tc074.yaml

  • agent_count: 1 - the number of VMs to be created
  • agent_image: “Ubuntu-14.04” - image used for creating VMs
  • public_network: “ext-net” - name of public network
  • volume_size: 2 - cinder volume size
  • block_sizes: “4096” - data block size
  • queue_depths: “4” - the number of simultaneous I/Os to perform at all times
  • StorPerf_ip: “192.168.200.2”
  • query_interval: 10 - state query interval
  • timeout: 600 - maximum allowed job time
test tool

Storperf

StorPerf is a tool to measure block and object storage performance in an NFVI.

StorPerf is delivered as a Docker container from https://hub.docker.com/r/opnfv/storperf-master/tags/.

The underlying tool used is FIO, and StorPerf supports any FIO option in order to tailor the test to the exact workload needed.

references

Storperf

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • agent_count

  • volume_size

  • block_sizes

  • queue_depths

  • query_interval

  • timeout

  • target=[device or path] The path to either an attached storage device (/dev/vdb, etc) or a directory path (/opt/storperf) that will be used to execute the performance test. In the case of a device, the entire device will be used. If not specified, the current directory will be used.

  • workload=[workload module] If not specified, the default is to run all workloads. The workload types are:

    • rs: 100% Read, sequential data
    • ws: 100% Write, sequential data
    • rr: 100% Read, random access
    • wr: 100% Write, random access
    • rw: 70% Read / 30% write, random access


  • workloads={json maps} This parameter supersedes the workload and calls the V2.0 API in StorPerf. It allows for greater control of the parameters to be passed to FIO. For example, running a random read/write with a mix of 90% read and 10% write would be expressed as follows: {“9010randrw”: {“rw”:”randrw”,”rwmixread”: “90”}} Note: This must be passed in as a string, so don’t forget to escape or otherwise properly deal with the quotes.
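Since the workloads map must be passed in as an escaped string, building it with a JSON serializer avoids quoting mistakes; a sketch using the example above:

```python
import json

# Build the workloads map from the example above: a random read/write mix
# of 90% read / 10% write, serialized to the string form expected by the
# workloads parameter.
workloads = {"9010randrw": {"rw": "randrw", "rwmixread": "90"}}
workloads_str = json.dumps(workloads)  # escape-safe string form
print(workloads_str)
```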

  • report= [job_id] Query the status of the supplied job_id and report on metrics. If a workload is supplied, will report on only that subset.

  • availability_zone: Specify the availability zone which the stack will use to create instances.

  • volume_type: Cinder volumes can have different types, for example encrypted vs. not encrypted, so that the difference between the two can be profiled.

  • subnet_CIDR: Specify subnet CIDR of private network

  • stack_name: Specify the name of the stack that will be created, the default: “StorperfAgentGroup”

  • volume_count: Specify the number of volumes per virtual machine

    There are default values for each above-mentioned option.

pre-test conditions

If you do not have an Ubuntu 14.04 image in Glance, you will need to add one.

StorPerf is required to be installed in the environment. There are two possible methods for StorPerf installation:

  • Run the container on the Jump Host
  • Run the container in a VM

Running StorPerf on Jump Host Requirements:

  • Docker must be installed
  • Jump Host must have access to the OpenStack Controller API
  • Jump Host must have internet connectivity for downloading docker image
  • Enough floating IPs must be available to match your agent count

Running StorPerf in a VM Requirements:

  • VM has docker installed
  • VM has OpenStack Controller credentials and can communicate with the Controller API
  • VM has internet connectivity for downloading the docker image
  • Enough floating IPs must be available to match your agent count

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Yardstick calls StorPerf to create the heat stack with the number of VMs and size of Cinder volumes specified. The VMs will be on their own private subnet, and take floating IP addresses from the specified public network.
step 2 Yardstick calls StorPerf to fill all the volumes with random data.
step 3 Yardstick calls StorPerf to perform the series of tests specified by the workload, queue depths and block sizes.
step 4 Yardstick calls StorPerf to delete the stack it created.
test verdict None. Storage performance results are fetched and stored.
14.2.26. Yardstick Test Case Description TC075
Network Capacity and Scale Testing
test case id OPNFV_YARDSTICK_TC075_Network_Capacity_and_Scale_testing
metric Number of connections, Number of frames sent/received
test purpose To evaluate the network capacity and scale with regards to connections and frames.
configuration

file: opnfv_yardstick_tc075.yaml

There is no additional configuration to be set for this TC.

test tool

netstat

Netstat is normally part of any Linux distribution, hence it doesn’t need to be installed.

references

Netstat man page

ETSI-NFV-TST001

applicability This test case is mainly for evaluating network performance.
pre-test conditions Each pod node must have netstat included in it.
test sequence description and expected result
step 1

The pod is available. Netstat is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict None. Number of connections and frames are fetched and stored.
14.2.27. Yardstick Test Case Description TC076
Monitor Network Metrics
test case id OPNFV_YARDSTICK_TC076_Monitor_Network_Metrics
metric IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate
test purpose

The purpose of TC076 is to evaluate the IaaS network reliability with regards to IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate.

TC076 monitors network metrics provided by the Linux kernel in a host and calculates IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

nstat

nstat is a simple tool to monitor kernel snmp counters and network interface statistics.

(nstat is not always part of a Linux distribution, hence it needs to be installed. nstat is provided by the iproute2 collection, which is usually also the name of the package in many Linux distributions. As an example see the /yardstick/tools/ directory for how to generate a Linux image with iproute2 included.)

test description

Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host VM to target VM(s) to elicit ICMP ECHO_RESPONSE.

nstat is invoked on the target VM to monitor network metrics provided by the Linux kernel.
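A sketch of how the four error rates could be derived from the kernel counters that nstat reports (counter names follow /proc/net/snmp; the sample values are made up, not real measurements):

```python
# Hypothetical error-rate calculation from kernel SNMP counters as reported
# by nstat. Counter names follow /proc/net/snmp; values are made up.
def error_rate_percent(errors: int, total: int) -> float:
    return errors * 100.0 / total if total else 0.0

counters = {
    "IpInReceives": 10_000, "IpInHdrErrors": 2,   # IP datagrams
    "IcmpInMsgs": 500, "IcmpInErrors": 1,         # ICMP messages
    "TcpInSegs": 8_000, "TcpInErrs": 4,           # TCP segments
    "UdpInDatagrams": 2_000, "UdpInErrors": 0,    # UDP datagrams
}
ip_err = error_rate_percent(counters["IpInHdrErrors"], counters["IpInReceives"])
icmp_err = error_rate_percent(counters["IcmpInErrors"], counters["IcmpInMsgs"])
tcp_err = error_rate_percent(counters["TcpInErrs"], counters["TcpInSegs"])
udp_err = error_rate_percent(counters["UdpInErrors"], counters["UdpInDatagrams"])
```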

configuration

file: opnfv_yardstick_tc076.yaml

There is no additional configuration to be set for this TC.

references

nstat man page

ETSI-NFV-TST001

applicability This test case is mainly for monitoring network metrics.
pre-test conditions

The test case image needs to be installed into Glance with iproute2 (which provides nstat) included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected with the server VM by using ssh. The ‘ping_benchmark’ bash script is copied from the Jump Host to the server VM via the ssh tunnel.
step 3

Ping is invoked. Ping packets are sent from the server VM to the client VM. RTT results are calculated and checked against the SLA. nstat is invoked on the client VM to monitor network metrics provided by the Linux kernel. IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate are calculated. Logs are produced and stored.

Result: Logs are stored.

step 4 Two host VMs are deleted.
test verdict None.
14.2.28. Yardstick Test Case Description TC078
Compute Performance
test case id OPNFV_YARDSTICK_TC078_SPEC CPU 2006
metric compute-intensive performance
test purpose The purpose of TC078 is to evaluate the IaaS compute performance by using the SPEC CPU 2006 benchmark. The SPEC CPU 2006 benchmark has several different ways to measure computer performance. One way is to measure how fast the computer completes a single task; this is called a speed measurement. Another way is to measure how many tasks a computer can accomplish in a certain amount of time; this is called a throughput, capacity or rate measurement.
test tool

SPEC CPU 2006

The SPEC CPU 2006 benchmark is SPEC’s industry-standardized, CPU-intensive benchmark suite, stressing a system’s processor, memory subsystem and compiler. This benchmark suite includes the SPECint benchmarks and the SPECfp benchmarks. The SPECint 2006 benchmark contains 12 different benchmark tests and the SPECfp 2006 benchmark contains 19 different benchmark tests.

SPEC CPU 2006 is not always part of a Linux distribution. SPEC requires that users purchase a license and agree with their terms and conditions. For this test case, users must manually download cpu2006-1.2.iso from the SPEC website and save it under the yardstick/resources folder (e.g. /home/ opnfv/repos/yardstick/yardstick/resources/cpu2006-1.2.iso) SPEC CPU® 2006 benchmark is available for purchase via the SPEC order form (https://www.spec.org/order.html).

test description This test case uses SPEC CPU 2006 benchmark to measure compute-intensive performance of hosts.
configuration

file: spec_cpu.yaml (in the ‘samples’ directory)

benchmark_subset is set to int.

SLA is not available in this test case.

applicability

Test can be configured with different:

  • benchmark_subset - a subset of SPEC CPU2006 benchmarks to run;
  • SPECint_benchmark - a SPECint benchmark to run;
  • SPECfp_benchmark - a SPECfp benchmark to run;
  • output_format - desired report format;
  • runspec_config - SPEC CPU2006 config file provided to the runspec binary;
  • runspec_iterations - the number of benchmark iterations to execute. For a reportable run, must be 3;
  • runspec_tune - tuning to use (base, peak, or all). For a reportable run, must be either base or all. Reportable runs do base first, then (optionally) peak;
  • runspec_size - size of input data to run (test, train, or ref). Reportable runs ensure that your binaries can produce correct results with the test and train workloads
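The options above map naturally onto a runspec invocation; the sketch below assembles one (the flag spellings follow common runspec usage and should be treated as assumptions, as is the default config file name):

```python
# Sketch: assemble a runspec invocation from the parameters listed above.
# Flag spellings follow common runspec usage and are assumptions here;
# "default.cfg" is a hypothetical config file name.
def runspec_command(benchmark_subset="int", runspec_config="default.cfg",
                    runspec_tune="base", runspec_size="ref",
                    runspec_iterations=3, output_format="all"):
    return ("runspec --config={config} --tune={tune} --size={size} "
            "--iterations={iters} --output_format={fmt} {subset}").format(
        config=runspec_config, tune=runspec_tune, size=runspec_size,
        iters=runspec_iterations, fmt=output_format, subset=benchmark_subset)

print(runspec_command())
```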
usability This test case is used for executing the SPEC CPU 2006 benchmark on physical servers. The SPECint 2006 benchmark takes approximately 5 hours.
references

spec_cpu2006

ETSI-NFV-TST001

pre-test conditions
To run and install SPEC CPU2006, the following are required:
  • For SPECint2006: Both C99 and C++98 compilers;
  • For SPECfp2006: All three of C99, C++98 and Fortran-95 compilers;
  • At least 8 GB of disk space available on the system.
test sequence description and expected result
step 1 cpu2006-1.2.iso has been saved under the yardstick/resources folder (e.g. /home/opnfv/repos/yardstick/yardstick/resources/cpu2006-1.2.iso). Additionally, to use your custom runspec config file you can save it under the yardstick/resources/files folder and specify the config file name in the runspec_config parameter.
step 2 Upload SPEC CPU2006 ISO to the target server and install SPEC CPU2006 via ansible.
step 3 Yardstick is connected with the target server by using ssh. If a custom runspec config file is used, this file is copied from yardstick to the target server via the ssh tunnel.
step 4 SPEC CPU2006 benchmark is invoked and SPEC CPU 2006 metrics are generated.
step 5 Text, HTML, CSV, PDF, and Configuration file outputs for the SPEC CPU 2006 metrics are fetched from the server and stored under the /tmp/result folder.
step 6 Uninstall SPEC CPU2006 and remove cpu2006-1.2.iso from the target server.
test verdict None. SPEC CPU2006 results are collected and stored.
14.2.29. Yardstick Test Case Description TC079
Storage Performance
test case id OPNFV_YARDSTICK_TC079_Bonnie++
metric Sequential Input/Output and Sequential/Random Create speed and CPU usage.
test purpose The purpose of TC079 is to evaluate the IaaS storage performance with regards to Sequential Input/Output and Sequential/Random Create speed and CPU usage statistics.
test tool

Bonnie++

Bonnie++ is a disk and file system benchmarking tool for measuring I/O performance. With Bonnie++ you can quickly and easily produce a meaningful value to represent your current file system performance.

Bonnie++ is not always part of a Linux distribution, hence it needs to be installed in the test image.

test description
This test case uses Bonnie++ to perform the tests below:
  • Create files in sequential order
  • Stat files in sequential order
  • Delete files in sequential order
  • Create files in random order
  • Stat files in random order
  • Delete files in random order
configuration

file: bonnie++.yaml (in the ‘samples’ directory)

file_size is set to 1024; ram_size is set to 512; test_dir is set to ‘/tmp’; concurrency is set to 1.

SLA is not available in this test case.

applicability

Test can be configured with different:

  • file_size - size of the test file in MB. File size should be double RAM for good results;
  • ram_size - specify RAM size in MB to use, this is used to reduce testing time;
  • test_dir - this directory is where bonnie++ will create the benchmark operations;
  • test_user - the user who should perform the test. This is not required if you are not running as root;
  • concurrency - number of threads to perform the test.
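The options above map onto a bonnie++ command line; a sketch (the flag letters follow the bonnie++ man page: -d directory, -s file size in MB, -r RAM size in MB, -c concurrency, -u user):

```python
# Sketch: build a bonnie++ command line from the options listed above.
# Flags follow the bonnie++ man page: -d dir, -s file size (MB),
# -r RAM size (MB), -c concurrency, -u user.
def bonnie_command(file_size=1024, ram_size=512, test_dir="/tmp",
                   concurrency=1, test_user=None):
    cmd = ["bonnie++", "-d", test_dir, "-s", str(file_size),
           "-r", str(ram_size), "-c", str(concurrency)]
    if test_user:  # only needed when running as root
        cmd += ["-u", test_user]
    return " ".join(cmd)

print(bonnie_command())
```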
usability This test case is used for executing Bonnie++ benchmark in VMs.
references

bonnie++_

ETSI-NFV-TST001

pre-test conditions The Bonnie++ distribution includes a ‘bon_csv2html’ Perl script, which takes the comma-separated values reported by Bonnie++ and generates an HTML page displaying them. To use this feature, bonnie++ is required to be installed with yardstick (e.g. in the yardstick Docker container).
test sequence description and expected result
step 1 A host VM with fio installed is booted.
step 2 Yardstick is connected with the host VM by using ssh.
step 3

Bonnie++ benchmark is invoked. Simulated IO operations are started. Logs are produced and stored.

Result: Logs are stored.

step 4 An HTML report is generated using bonnie++ benchmark results and stored under /tmp/bonnie.html.
step 5 The host VM is deleted.
test verdict None. Bonnie++ html report is generated.
14.2.30. Yardstick Test Case Description TC080
Network Latency
test case id OPNFV_YARDSTICK_TC080_NETWORK_LATENCY_BETWEEN_CONTAINER
metric RTT (Round Trip Time)
test purpose

The purpose of TC080 is to do a basic verification that network latency is within acceptable boundaries when packets travel between containers located in two different Kubernetes pods.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

ping

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer that are echoed back to the source.

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Docker image.

test topology Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host container to target container to elicit ICMP ECHO_RESPONSE.
configuration

file: opnfv_yardstick_tc080.yaml

Packet size 200 bytes. Test duration 60 seconds. SLA RTT is set to maximum 10 ms.

applicability

This test case can be configured with different:

  • packet sizes;
  • burst sizes;
  • ping intervals;
  • test durations;
  • test iterations.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected, and also normal to achieve in balanced L2 environments. However, to cover most configurations, both bare metal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many real time applications start to suffer badly if the RTT time is higher than this. Some may suffer bad also close to this RTT, while others may not suffer at all. It is a compromise that may have to be tuned for different configuration purposes.

usability This test case should be run in a Kubernetes environment.
references

Ping

ETSI-NFV-TST001

pre-test conditions

The test case Docker image (openretriever/yardstick) needs to be pulled into Kubernetes environment.

No further requirements have been identified.

test sequence description and expected result
step 1 Two containers are booted, as server and client.
step 2 Yardstick is connected with the server container by using ssh. ‘ping_benchmark’ bash script is copied from Jump Host to the server container via the ssh tunnel.
step 3

Ping is invoked. Ping packets are sent from server container to client container. RTT results are calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 Two containers are deleted.
test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
14.2.31. Yardstick Test Case Description TC081
Network Latency
test case id OPNFV_YARDSTICK_TC081_NETWORK_LATENCY_BETWEEN_CONTAINER_AND_VM
metric RTT (Round Trip Time)
test purpose

The purpose of TC081 is to do a basic verification that network latency is within acceptable boundaries when packets travel between a container and a VM.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

ping

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer that are echoed back to the source.

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Docker image. (For example, a Cirros image can also be downloaded from cirros-image; it includes ping.)

test topology Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host container to target vm to elicit ICMP ECHO_RESPONSE.
configuration

file: opnfv_yardstick_tc081.yaml

Packet size 200 bytes. Test duration 60 seconds. SLA RTT is set to maximum 10 ms.

applicability

This test case can be configured with different:

  • packet sizes;
  • burst sizes;
  • ping intervals;
  • test durations;
  • test iterations.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected, and also normal to achieve in balanced L2 environments. However, to cover most configurations, both bare metal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many real time applications start to suffer badly if the RTT time is higher than this. Some may suffer bad also close to this RTT, while others may not suffer at all. It is a compromise that may have to be tuned for different configuration purposes.

usability This test case should be run in a Kubernetes environment.
references

Ping

ETSI-NFV-TST001

pre-test conditions

The test case Docker image (openretriever/yardstick) needs to be pulled into Kubernetes environment. The VM image (cirros-image) needs to be installed into Glance with ping included in it.

No further requirements have been identified.

test sequence description and expected result
step 1 A container is booted as server and a VM is booted as client.
step 2 Yardstick is connected with the server container by using ssh. ‘ping_benchmark’ bash script is copied from Jump Host to the server container via the ssh tunnel.
step 3

Ping is invoked. Ping packets are sent from server container to client VM. RTT results are calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The container and VM are deleted.
test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
14.2.32. Yardstick Test Case Description TC083
Throughput per VM test
test case id OPNFV_YARDSTICK_TC083_Network latency and throughput between VMs
metric Network latency and throughput
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of packet sizes and flows matter for the throughput between 2 VMs in one pod.
configuration

file: opnfv_yardstick_tc083.yaml

Packet size: default 1024 bytes.

Test length: default 20 seconds.

The client and server are distributed on different nodes.

For SLA max_mean_latency is set to 100.

test tool netperf Netperf is a software application that provides network bandwidth testing between two hosts on a network. It supports Unix domain sockets, TCP, SCTP, DLPI and UDP via BSD Sockets. Netperf provides a number of predefined tests e.g. to measure bulk (unidirectional) data transfer or request response performance. (netperf is not always part of a Linux distribution, hence it needs to be installed.)
references netperf Man pages ETSI-NFV-TST001
applicability

Test can be configured with different packet sizes and test duration. Default values exist.

SLA (optional): max_mean_latency

pre-test conditions The POD can be reached by an external IP and logged into via SSH.
test sequence description and expected result
step 1 Install netperf tool on each specified node, one is as the server, and the other as the client.
step 2 Log on to the client node and use the netperf command to execute the network performance test.
step 3 The throughput results are stored.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3. OPNFV Feature Test Cases
14.3.1. H A
14.3.1.1. Yardstick Test Case Description TC019
Control Node Openstack Service High Availability
test case id OPNFV_YARDSTICK_TC019_HA: Control node Openstack service down
test purpose This test case will verify the high availability of the services provided by OpenStack (like nova-api, neutron-server) on the control node.
test method This test case kills the processes of a specific Openstack service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “kill-process” in this test case. 2) process_name: the process name of the specified OpenStack service. If multiple processes on the host use the same name, all of them are killed by this attacker. 3) host: the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “nova-api” -host: node1

monitors

In this test case, two kinds of monitors are needed:

1. The “openstack-cmd” monitor constantly requests a specific OpenStack command. It needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request.

2. The “process” monitor checks whether a process is running on a specific node. It needs three parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “process” for this monitor. 2) process_name: the process name to monitor. 3) host: the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “openstack server list” monitor2: -monitor_type: “process” -process_name: “nova-api” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: the maximum time (seconds) from the process being killed until it is recovered.
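How the two metrics fall out of the monitor data can be sketched as follows. This is an assumed simplification, not the ha_tools implementation: each monitor is treated as yielding timestamped pass/fail samples, and the metric is the longest continuous failing span.

```python
# Assumed simplification of the TC019 metrics: each monitor produces
# (timestamp_seconds, ok) samples; the metric is the longest continuous
# span during which the samples were failing.

def longest_outage(samples):
    """Return the longest continuous failing span, in seconds."""
    longest, start = 0.0, None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                       # outage begins
        elif ok and start is not None:
            longest = max(longest, ts - start)
            start = None                     # outage ends
    return longest

# openstack-cmd monitor: the command failed from t=2 until t=5
service_outage_time = longest_outage(
    [(0, True), (1, True), (2, False), (3, False), (5, True)])
# process monitor: nova-api was gone from t=2 until t=4
process_recover_time = longest_outage(
    [(0, True), (2, False), (4, True)])
```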
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc019.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the process being killed to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the kill-process script with the parameter value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

It is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so that subsequent test cases can run.

Notice: this post-action uses the ‘lsb_release’ command to check the host Linux distribution and determine the OpenStack service name in order to restart the process. Lack of ‘lsb_release’ on the host may cause the restart to fail.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.2. Yardstick Test Case Description TC025
OpenStack Controller Node abnormally shutdown High Availability
test case id OPNFV_YARDSTICK_TC025_HA: OpenStack Controller Node abnormally shutdown
test purpose This test case will verify the high availability of the controller node. When one of the controller nodes is abnormally shut down, the services it provides should still be available.
test method This test case shuts down a specified controller node with a fault injection tool, then checks whether all services provided by the controller node are OK using monitor tools.
attackers

In this test case, an attacker called “host-shutdown” is needed. This attacker includes two parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “host-shutdown” in this test case. 2) host: the name of the controller node being attacked.

e.g. -fault_type: “host-shutdown” -host: node1

monitors

In this test case, one kind of monitor is needed: the “openstack-cmd” monitor, which constantly requests a specific OpenStack command. It needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request.

There are four instances of the “openstack-cmd” monitor: monitor1: -monitor_type: “openstack-cmd” -api_name: “nova image-list” monitor2: -monitor_type: “openstack-cmd” -api_name: “neutron router-list” monitor3: -monitor_type: “openstack-cmd” -api_name: “heat stack-list” monitor4: -monitor_type: “openstack-cmd” -api_name: “cinder list”

metrics In this test case, there is one metric: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc025.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the fault being injected to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the shutdown script on the host

Result: The host will be shutdown.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: All monitor results will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action It is the action taken when the test case exits. It restarts the specified controller node if it has not been restarted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.3. Yardstick Test Case Description TC045
Control Node Openstack Service High Availability - Neutron Server
test case id OPNFV_YARDSTICK_TC045: Control node Openstack service down - neutron server
test purpose This test case will verify the high availability of the network service provided by OpenStack (neutron-server) on the control node.
test method This test case kills the processes of neutron-server service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “kill-process” in this test case. 2) process_name: the process name of the specified OpenStack service. If multiple processes on the host use the same name, all of them are killed by this attacker. In this case, this parameter should always be set to “neutron-server”. 3) host: the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “neutron-server” -host: node1

monitors

In this test case, two kinds of monitors are needed: 1. The “openstack-cmd” monitor constantly requests a specific OpenStack command. It needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request. In this case, the command name should be a neutron-related command.

2. The “process” monitor checks whether a process is running on a specific node. It needs three parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “process” for this monitor. 2) process_name: the process name to monitor. 3) host: the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “neutron agent-list” monitor2: -monitor_type: “process” -process_name: “neutron-server” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc045.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the process being killed to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the kill-process script with the parameter value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

It is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so that subsequent test cases can run.

Notice: this post-action uses the ‘lsb_release’ command to check the host Linux distribution and determine the OpenStack service name in order to restart the process. Lack of ‘lsb_release’ on the host may cause the restart to fail.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.4. Yardstick Test Case Description TC046
Control Node Openstack Service High Availability - Keystone
test case id OPNFV_YARDSTICK_TC046: Control node Openstack service down - keystone
test purpose This test case will verify the high availability of the user service provided by OpenStack (keystone) on control node.
test method This test case kills the processes of keystone service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “kill-process” in this test case. 2) process_name: the process name of the specified OpenStack service. If multiple processes on the host use the same name, all of them are killed by this attacker. In this case, this parameter should always be set to “keystone”. 3) host: the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “keystone” -host: node1

monitors

In this test case, two kinds of monitors are needed: 1. The “openstack-cmd” monitor constantly requests a specific OpenStack command. It needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request. In this case, the command name should be a keystone-related command.

2. The “process” monitor checks whether a process is running on a specific node. It needs three parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “process” for this monitor. 2) process_name: the process name to monitor. 3) host: the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “keystone user-list” monitor2: -monitor_type: “process” -process_name: “keystone” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc046.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the process being killed to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the kill-process script with the parameter value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

It is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so that subsequent test cases can run.

Notice: this post-action uses the ‘lsb_release’ command to check the host Linux distribution and determine the OpenStack service name in order to restart the process. Lack of ‘lsb_release’ on the host may cause the restart to fail.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.5. Yardstick Test Case Description TC047
Control Node Openstack Service High Availability - Glance Api
test case id OPNFV_YARDSTICK_TC047: Control node Openstack service down - glance api
test purpose This test case will verify the high availability of the image service provided by OpenStack (glance-api) on control node.
test method This test case kills the processes of glance-api service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “kill-process” in this test case. 2) process_name: the process name of the specified OpenStack service. If multiple processes on the host use the same name, all of them are killed by this attacker. In this case, this parameter should always be set to “glance-api”. 3) host: the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “glance-api” -host: node1

monitors

In this test case, two kinds of monitors are needed: 1. The “openstack-cmd” monitor constantly requests a specific OpenStack command. It needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request. In this case, the command name should be a glance-related command.

2. The “process” monitor checks whether a process is running on a specific node. It needs three parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “process” for this monitor. 2) process_name: the process name to monitor. 3) host: the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “glance image-list” monitor2: -monitor_type: “process” -process_name: “glance-api” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc047.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the process being killed to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the kill-process script with the parameter value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

It is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so that subsequent test cases can run.

Notice: this post-action uses the ‘lsb_release’ command to check the host Linux distribution and determine the OpenStack service name in order to restart the process. Lack of ‘lsb_release’ on the host may cause the restart to fail.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.6. Yardstick Test Case Description TC048
Control Node Openstack Service High Availability - Cinder Api
test case id OPNFV_YARDSTICK_TC048: Control node Openstack service down - cinder api
test purpose This test case will verify the high availability of the volume service provided by OpenStack (cinder-api) on control node.
test method This test case kills the processes of cinder-api service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “kill-process” in this test case. 2) process_name: the process name of the specified OpenStack service. If multiple processes on the host use the same name, all of them are killed by this attacker. In this case, this parameter should always be set to “cinder-api”. 3) host: the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “cinder-api” -host: node1

monitors

In this test case, two kinds of monitors are needed: 1. The “openstack-cmd” monitor constantly requests a specific OpenStack command. It needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request. In this case, the command name should be a cinder-related command.

2. The “process” monitor checks whether a process is running on a specific node. It needs three parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “process” for this monitor. 2) process_name: the process name to monitor. 3) host: the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “cinder list” monitor2: -monitor_type: “process” -process_name: “cinder-api” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc048.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the process being killed to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the kill-process script with the parameter value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

It is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so that subsequent test cases can run.

Notice: this post-action uses the ‘lsb_release’ command to check the host Linux distribution and determine the OpenStack service name in order to restart the process. Lack of ‘lsb_release’ on the host may cause the restart to fail.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.7. Yardstick Test Case Description TC049
Control Node Openstack Service High Availability - Swift Proxy
test case id OPNFV_YARDSTICK_TC049: Control node Openstack service down - swift proxy
test purpose This test case will verify the high availability of the storage service provided by OpenStack (swift-proxy) on control node.
test method This test case kills the processes of swift-proxy service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “kill-process” in this test case. 2) process_name: the process name of the specified OpenStack service. If multiple processes on the host use the same name, all of them are killed by this attacker. In this case, this parameter should always be set to “swift-proxy”. 3) host: the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “swift-proxy” -host: node1

monitors

In this test case, two kinds of monitors are needed: 1. The “openstack-cmd” monitor constantly requests a specific OpenStack command. It needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request. In this case, the command name should be a swift-related command.

2. The “process” monitor checks whether a process is running on a specific node. It needs three parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “process” for this monitor. 2) process_name: the process name to monitor. 3) host: the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “swift stat” monitor2: -monitor_type: “process” -process_name: “swift-proxy” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc049.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the process being killed to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the kill-process script with the parameter value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

It is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so that subsequent test cases can run.

Notice: this post-action uses the ‘lsb_release’ command to check the host Linux distribution and determine the OpenStack service name in order to restart the process. Lack of ‘lsb_release’ on the host may cause the restart to fail.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.8. Yardstick Test Case Description TC050
OpenStack Controller Node Network High Availability
test case id OPNFV_YARDSTICK_TC050: OpenStack Controller Node Network High Availability
test purpose This test case will verify the high availability of the control node. When one of the controllers loses network connectivity, the OpenStack services on that node break down; these OpenStack services should still be accessible via the other controller nodes, and the services on the failed controller node should be isolated.
test method This test case turns off the network interfaces of a specified control node, then checks whether all services provided by the control node are OK with some monitor tools.
attackers

In this test case, an attacker called “close-interface” is needed. This attacker includes three parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “close-interface” in this test case. 2) host: the name of the control node being attacked. 3) interface: the network interface to be turned off.

The interface to be closed by the attacker can be set via the variable “{{ interface_name }}”:

attackers:
- fault_type: “general-attacker”
  host: {{ attack_host }}
  key: “close-br-public”
  attack_key: “close-interface”
  action_parameter:
    interface: {{ interface_name }}
  rollback_parameter:
    interface: {{ interface_name }}

monitors

In this test case, the monitor named “openstack-cmd” is needed. This monitor needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request

There are four instances of the “openstack-cmd” monitor: monitor1:

  • monitor_type: “openstack-cmd”
  • command_name: “nova image-list”
monitor2:
  • monitor_type: “openstack-cmd”
  • command_name: “neutron router-list”
monitor3:
  • monitor_type: “openstack-cmd”
  • command_name: “heat stack-list”
monitor4:
  • monitor_type: “openstack-cmd”
  • command_name: “cinder list”
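The four monitors above poll in parallel and their results are aggregated into a single outage figure. A minimal sketch of that pattern follows; it is an illustration, not Yardstick's code, and the OpenStack CLI call each monitor would issue (e.g. “nova image-list”) is stubbed out by a callable.

```python
# Minimal sketch of several "openstack-cmd" monitors polling in parallel;
# the CLI request is replaced by a stub callable returning True/False.
import threading
import time

def run_monitor(command, duration, interval, results, name):
    """Poll `command` until `duration` elapses; accumulate failing time."""
    failed = 0.0
    end = time.monotonic() + duration
    while time.monotonic() < end:
        if not command():                  # stand-in for the CLI request
            failed += interval
        time.sleep(interval)
    results[name] = failed

results = {}
healthy = lambda: True                     # stub: the command always succeeds
threads = [threading.Thread(target=run_monitor,
                            args=(healthy, 0.2, 0.05, results, f"monitor{i}"))
           for i in range(1, 5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
max_outage = max(results.values())         # aggregated service_outage_time
```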
metrics In this test case, there is one metric: 1) service_outage_time: the maximum outage time (seconds) of the specified OpenStack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc050.yaml -Attackers: see the “attackers” description above -waiting_time: the time (seconds) from the fault being injected to stopping the monitors -Monitors: see the “monitors” description above -SLA: see the “metrics” description above

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case uses the node name given in pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attack: connect to the host through SSH, then execute the script that turns off the network interface, with the parameter value specified by “{{ interface_name }}”.

Result: The specified network interface will be down.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action It is the action taken when the test case exits. It brings the network interface of the control node back up if it is still down.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.9. Yardstick Test Case Description TC051
OpenStack Controller Node CPU Overload High Availability
test case id OPNFV_YARDSTICK_TC051: OpenStack Controller Node CPU Overload High Availability
test purpose This test case will verify the high availability of the control node. When the CPU usage of a specified controller node is stressed to 100%, the OpenStack services on that node break down; these OpenStack services should still be accessible via the other controller nodes, and the services on the failed controller node should be isolated.
test method This test case stresses the CPU usage of a specified control node to 100%, then checks whether all services provided by the environment are OK using monitor tools.
attackers In this test case, an attacker called “stress-cpu” is needed. This attacker includes two parameters: 1) fault_type: used for finding the attacker’s scripts; it should always be set to “stress-cpu” in this test case. 2) host: the name of the control node being attacked. e.g. -fault_type: “stress-cpu” -host: node1
monitors

In this test case, the monitor named “openstack-cmd” is needed. This monitor needs two parameters: 1) monitor_type: used for finding the monitor class and related scripts; it should always be set to “openstack-cmd” for this monitor. 2) command_name: the command name used for the request

There are four instances of the “openstack-cmd” monitor: monitor1: -monitor_type: “openstack-cmd” -command_name: “nova image-list” monitor2: -monitor_type: “openstack-cmd” -command_name: “neutron router-list” monitor3: -monitor_type: “openstack-cmd” -command_name: “heat stack-list” monitor4: -monitor_type: “openstack-cmd” -command_name: “cinder list”

metrics In this test case, there is one metric: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
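The service_outage_time metric can be read as the longest contiguous window in which the monitored command fails. A sketch of that computation over (timestamp, success) monitor samples (function and parameter names are illustrative, not Yardstick’s internals):

```python
def max_outage_time(samples):
    """Compute the longest contiguous failure window (seconds) from a list
    of (timestamp, ok) monitor samples -- a sketch of service_outage_time."""
    longest, start = 0.0, None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                          # outage begins
        elif ok and start is not None:
            longest = max(longest, ts - start)  # outage ends
            start = None
    if start is not None and samples:
        longest = max(longest, samples[-1][0] - start)  # still down at end
    return longest
```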
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc051.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect to the host through SSH, and then execute the stress-cpu script on the host.

Result: The CPU usage of the host will be stressed to 100%.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This action is executed when the test case exits. It kills the process that stresses the CPU usage.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.10. Yardstick Test Case Description TC052
OpenStack Controller Node Disk I/O Block High Availability
test case id OPNFV_YARDSTICK_TC052: OpenStack Controller Node Disk I/O Block High Availability
test purpose This test case will verify the high availability of the control node. When the disk I/O of a specified disk is blocked, breaking the Openstack services on this node, read and write services should still be accessible via other controller nodes, and the services on the failed controller node should be isolated.
test method This test case blocks the disk I/O of a specified control node, then checks whether the services that need to read or write the disk of the control node are OK with some monitor tools.
attackers In this test case, an attacker called “disk-block” is needed. This attacker includes two parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should be always set to “disk-block” in this test case. 2) host: which is the name of a control node being attacked. e.g. -fault_type: “disk-block” -host: node1
monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific Openstack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

e.g. -monitor_type: “openstack-cmd” -command_name: “nova flavor-list”

2. the second monitor verifies the read and write function by an “operation” and a “result checker”. The “operation” has two parameters: 1) operation_type: which is used for finding the operation class and related scripts. 2) action_parameter: parameters for the operation. The “result checker” has three parameters: 1) checker_type: which is used for finding the result checker class and related scripts. 2) expectedValue: the expected value for the output of the checker script. 3) condition: whether the expected value is in the output of the checker script or is exactly the same as the output.

In this case, the “operation” adds a flavor and the “result checker” checks whether the flavor is created. Their parameters are shown as follows: operation: -operation_type: “nova-create-flavor” -action_parameter:

flavorconfig: “test-001 test-001 100 1 1”

result checker: -checker_type: “check-flavor” -expectedValue: “test-001” -condition: “in”
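The “condition” parameter of the result checker can be pictured as a simple comparison rule. A hedged sketch (only “in” appears in this example; the exact-match branch is an assumption about the other condition value):

```python
def check_result(output, expected_value, condition):
    """Apply the result checker's condition: 'in' tests whether the
    expected value appears in the checker script's output; otherwise
    require an exact match. Sketch only; the real checker is in ha_tools."""
    if condition == "in":
        return expected_value in output
    return output == expected_value
```

With the flavor example above, `check_result(flavor_list_output, "test-001", "in")` passes once the flavor exists.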

metrics In this test case, there is one metric: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc052.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

do attacker: connect to the host through SSH, and then execute the block-disk-I/O script on the host.

Result: The disk I/O of the host will be blocked

step 2

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 3 do operation: add a flavor
step 4 do result checker: check whether the flavor is created
step 5

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 6

verify the SLA

Result: The test case is passed or not.

post-action This action is executed when the test case exits. It executes the release-disk-I/O script to release the blocked I/O.
test verdict Fails if the monitor SLA is not passed or the result checker is not passed, or if there is a test case execution problem.
14.3.1.11. Yardstick Test Case Description TC053
OpenStack Controller Load Balance Service High Availability
test case id OPNFV_YARDSTICK_TC053: OpenStack Controller Load Balance Service High Availability
test purpose This test case will verify the high availability of the load balancing service (currently HAProxy) that supports OpenStack on the controller node. When the load balancing service of a specified controller node is killed, this test case checks whether load balancers on other controller nodes still work, and whether the controller node will restart the killed load balancer.
test method This test case kills the processes of the load balancing service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “haproxy”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “haproxy” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific Openstack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor 3) host: which is the name of the node running the process. In this case, the command_name of monitor1 should be a service that is supported by the load balancer and the process_name of monitor2 should be “haproxy”, for example:

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “nova image-list” monitor2: -monitor_type: “process” -process_name: “haproxy” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed to recovered
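The process_recover_time metric can be sketched as the delay between the kill and the first sample in which the “process” monitor sees the process running again (function and parameter names are illustrative):

```python
def process_recover_time(kill_time, samples):
    """Return seconds from kill_time until the first (timestamp, running)
    sample with running=True at or after kill_time; None if the process
    never recovers within the observed samples. Illustrative sketch."""
    for ts, running in samples:
        if ts >= kill_time and running:
            return ts - kill_time
    return None
```

A None result would correspond to the SLA failing because the process never recovered before the monitors stopped.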
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc053.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect to the host through SSH, and then execute the kill-process script with the param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

This action is executed when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, in preparation for the next test cases.

Notice: This post-action uses ‘lsb_release’ command to check the host linux distribution and determine the OpenStack service name to restart the process. Lack of ‘lsb_release’ on the host may cause failure to restart the process.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.12. Yardstick Test Case Description TC054
OpenStack Virtual IP High Availability
test case id OPNFV_YARDSTICK_TC054: OpenStack Virtual IP High Availability
test purpose This test case will verify the high availability of the virtual IP in the environment. When the master node of the virtual IP is abnormally shut down, connections to the virtual IP and the services bound to the virtual IP should still be OK.
test method This test case shuts down the virtual IP master node with some fault injection tools, then checks whether the virtual IPs can be pinged and the services bound to the virtual IP are OK with some monitor tools.
attackers

In this test case, an attacker called “control-shutdown” is needed. This attacker includes two parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should be always set to “control-shutdown” in this test case. 2) host: which is the name of a control node being attacked.

In this case the host should be the virtual IP master node, which means the host ip is the virtual IP, for example: -fault_type: “control-shutdown” -host: node1 (the VIP master node)
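Identifying the VIP master node amounts to finding which controller currently has the virtual IP assigned. A minimal sketch, assuming the per-node address lists have already been collected (e.g. via `ip addr` over SSH); the mapping and helper name are hypothetical:

```python
def find_vip_master(node_addresses, vip):
    """Return the name of the node whose address list contains the
    virtual IP, i.e. the current VIP master; None if no node holds it.
    node_addresses is a hypothetical mapping of node name -> IP list."""
    for node, addresses in node_addresses.items():
        if vip in addresses:
            return node
    return None
```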

monitors

In this test case, two kinds of monitor are needed: 1. the “ip_status” monitor that pings a specific ip to check the connectivity of this ip, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should be always set to “ip_status” for this monitor. 2) ip_address: The ip to be pinged. In this case, ip_address should be the virtual IP.

2. the “openstack-cmd” monitor constantly requests a specific Openstack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

e.g. monitor1: -monitor_type: “ip_status” -host: 192.168.0.2 monitor2: -monitor_type: “openstack-cmd” -command_name: “nova image-list”

metrics In this test case, there are two metrics: 1) ping_outage_time: which indicates the maximum outage time to ping the specified host. 2) service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc054.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect to the host through SSH, and then execute the shutdown script on the VIP master node.

Result: The VIP master node will be shut down

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This action is executed when the test case exits. It restarts the original VIP master node if it has not been restarted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.13. Yardstick Test Case Description TC056
OpenStack Controller Messaging Queue Service High Availability
test case id OPNFV_YARDSTICK_TC056:OpenStack Controller Messaging Queue Service High Availability
test purpose This test case will verify the high availability of the messaging queue service (RabbitMQ) that supports OpenStack on the controller node. When the active messaging queue service of a specified controller node is killed, the test case will check whether the standby messaging queue services on other controller nodes are switched to active, and whether the cluster manager on the attacked controller node restarts the stopped messaging queue.
test method This test case kills the processes of the messaging queue service on a selected controller node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “rabbitmq”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “rabbitmq-server” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific Openstack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor 3) host: which is the name of the node running the process. In this case, the command_name of monitor1 should be services that use the messaging queue (currently nova, neutron, cinder, heat and ceilometer use RabbitMQ), and the process_name of monitor2 should be “rabbitmq”, for example:

e.g. monitor1-1: -monitor_type: “openstack-cmd” -command_name: “openstack image list” monitor1-2: -monitor_type: “openstack-cmd” -command_name: “openstack network list” monitor1-3: -monitor_type: “openstack-cmd” -command_name: “openstack volume list” monitor2: -monitor_type: “process” -process_name: “rabbitmq” -host: node1

metrics In this test case, there are two metrics: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request. 2)process_recover_time: which indicates the maximum time (seconds) from the process being killed to recovered
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc056.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect to the host through SSH, and then execute the kill-process script with the param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action

This action is executed when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, in preparation for the next test cases.

Notice: This post-action uses ‘lsb_release’ command to check the host linux distribution and determine the OpenStack service name to restart the process. Lack of ‘lsb_release’ on the host may cause failure to restart the process.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.14. Yardstick Test Case Description TC057
OpenStack Controller Cluster Management Service High Availability
test case id OPNFV_YARDSTICK_TC057_HA: OpenStack Controller Cluster Management Service High Availability
test purpose This test case will verify the quorum configuration of the cluster manager (pacemaker) on controller nodes. When a controller node that holds all active application resources fails to communicate with the other cluster nodes (via corosync), the test case will check whether the standby application resources take the place of those active application resources, which should be regarded as down by the cluster manager.
test method This test case kills the processes of the cluster messaging service (corosync) on a selected controller node (the node that holds the active application resources), then checks whether the active application resources are switched to other controller nodes and whether the Openstack commands are OK.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the cluster messaging service. If multiple processes use the same name on the host, all of them are killed by this attacker. 3) host: which is the name of the control node being attacked.

In this case, the process name should be set to “corosync”, for example: -fault_type: “kill-process” -process_name: “corosync” -host: node1

monitors

In this test case, one kind of monitor is needed: 1. the “openstack-cmd” monitor constantly requests a specific Openstack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request

In this case, the command_name of monitor1 should be services that are managed by the cluster manager. (Since rabbitmq and haproxy are managed by pacemaker, most Openstack Services can be used to check high availability in this case)

(e.g.) monitor1: -monitor_type: “openstack-cmd” -command_name: “nova image-list” monitor2: -monitor_type: “openstack-cmd” -command_name: “neutron router-list” monitor3: -monitor_type: “openstack-cmd” -command_name: “heat stack-list” monitor4: -monitor_type: “openstack-cmd” -command_name: “cinder list”

checkers

In this test case, a checker is needed; the checker will check the status of application resources in pacemaker. The checker has five parameters: 1) checker_type: which is used for finding the result checker class and related scripts. In this case the checker type will be “pacemaker-check-resource” 2) resource_name: the application resource name 3) resource_status: the expected status of the resource 4) expectedValue: the expected value for the output of the checker script; in this case the expected value will be the identifier in the cluster manager 5) condition: whether the expected value is in the output of the checker script or is exactly the same as the output. (note: pcs is required to be installed on the controller node in order to run this checker)

(e.g.) checker1: -checker_type: “pacemaker-check-resource” -resource_name: “p_rabbitmq-server” -resource_status: “Stopped” -expectedValue: “node-1” -condition: “in” checker2: -checker_type: “pacemaker-check-resource” -resource_name: “p_rabbitmq-server” -resource_status: “Master” -expectedValue: “node-2” -condition: “in”
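The pacemaker-check-resource checker can be pictured as scanning `pcs status`-style output for a line that names the resource with the expected status, then applying the condition against the expected value. A sketch under the assumption of one resource per output line (real `pcs` output formatting varies; this is not the actual checker script):

```python
def pacemaker_check_resource(pcs_output, resource_name, resource_status,
                             expected_value, condition="in"):
    """Scan pcs-status-like output for a line containing resource_name and
    resource_status, then test expected_value (e.g. a node identifier)
    against that line using the 'in' or exact-match condition. Sketch only."""
    for line in pcs_output.splitlines():
        if resource_name in line and resource_status in line:
            if condition == "in":
                return expected_value in line
            return line.strip() == expected_value
    return False
```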

metrics In this test case, there is one metric: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
test tool None. Self-developed.
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc057.yaml -Attackers: see above “attackers” description -Monitors: see above “monitors” description -Checkers: see above “checkers” description -Steps: the test case execution step, see “test sequence” description below

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect to the host through SSH, and then execute the kill-process script with the param value specified by “process_name”

Result: Process will be killed.

step 3 do checker: check whether the status of application resources on different nodes are updated
step 4

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 5

verify the SLA

Result: The test case is passed or not.

post-action This action is executed when the test case exits. It will check the status of the cluster messaging process (corosync) on the host, and restart the process if it is not running, in preparation for the next test cases. Notice: This post-action uses the ‘lsb_release’ command to check the host linux distribution and determine the OpenStack service name to restart the process. Lack of ‘lsb_release’ on the host may cause failure to restart the process.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.15. Yardstick Test Case Description TC058
OpenStack Controller Virtual Router Service High Availability
test case id OPNFV_YARDSTICK_TC058: OpenStack Controller Virtual Router Service High Availability
test purpose This test case will verify the high availability of virtual routers(L3 agent) on controller node. When a virtual router service on a specified controller node is shut down, this test case will check whether the network of virtual machines will be affected, and whether the attacked virtual router service will be recovered.
test method This test case kills the processes of the virtual router service (l3-agent) on a selected controller node (the node that holds the active l3-agent), then checks whether the network routing of virtual machines is OK and whether the killed service will be recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the virtual router service. If multiple processes use the same name on the host, all of them are killed by this attacker. 3) host: which is the name of the control node being attacked.

In this case, the process name should be set to “l3agent”, for example: -fault_type: “kill-process” -process_name: “l3agent” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “ip_status” monitor that pings a specific ip to check the connectivity of this ip, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “ip_status” for this monitor. 2) ip_address: The ip to be pinged. In this case, ip_address will be either an ip address of an external network or an ip address of a virtual machine. 3) host: The node on which ping will be executed; in this case the host will be a virtual machine.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. In this case, the process_name of monitor2 should be “l3agent” 3) host: which is the name of the node running the process

e.g. monitor1-1: -monitor_type: “ip_status” -host: 172.16.0.11 -ip_address: 172.16.1.11 monitor1-2: -monitor_type: “ip_status” -host: 172.16.0.11 -ip_address: 8.8.8.8 monitor2: -monitor_type: “process” -process_name: “l3agent” -host: node1

metrics In this test case, there are two metrics: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request. 2)process_recover_time: which indicates the maximum time (seconds) from the process being killed to recovered
test tool None. Self-developed.
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc058.yaml -Attackers: see above “attackers” description -Monitors: see above “monitors” description -Steps: the test case execution step, see “test sequence” description below

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
pre-test conditions The test case image needs to be installed into Glance with cachestat included in the image.
step 1 Two host VMs are booted, these two hosts are in two different networks, the networks are connected by a virtual router.
step 2

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 3

do attacker: connect to the host through SSH, and then execute the kill-process script with the param value specified by “process_name”

Result: Process will be killed.

step 4

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 5

verify the SLA

Result: The test case is passed or not.

post-action

This action is executed when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, in preparation for the next test cases. Virtual machines and networks created in the test case will be destroyed.

Notice: This post-action uses ‘lsb_release’ command to check the host linux distribution and determine the OpenStack service name to restart the process. Lack of ‘lsb_release’ on the host may cause failure to restart the process.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.16. Yardstick Test Case Description TC087
14.3.1.17. Yardstick Test Case Description TC092
SDN Controller resilience in HA configuration
test case id OPNFV_YARDSTICK_TC092: SDN controller resilience and high availability in HA configuration
test purpose

This test validates SDN controller node high availability by verifying that there is no impact on data plane connectivity when one SDN controller fails in a HA configuration, i.e. all existing configured network services (DHCP, ARP, L2, L3VPN, Security Groups) should continue to operate between the existing VMs while one SDN controller instance is offline and rebooting.

The test also validates that network service operations such as creating a new VM in an existing or new L2 network remain operational while one instance of the SDN controller is offline and recovers from the failure.

test method
This test case:
  1. fails one instance of an SDN controller cluster running in a HA configuration on the OpenStack controller node
  2. checks if already configured L2 connectivity between existing VMs is not impacted
  3. verifies that the system never loses the ability to execute virtual network operations, even when the failed SDN Controller is still recovering
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters:

  1. fault_type: which is used for finding the attacker’s scripts. It should be set to ‘kill-process’ in this test
  2. process_name: should be set to the SDN controller process name
  3. host: which is the name of a control node where opendaylight process is running
example:
  • fault_type: “kill-process”
  • process_name: “opendaylight-karaf” (TBD)
  • host: node1
monitors
In this test case, the following monitors are needed
  1. ping_same_network_l2: monitor that pings traffic between the VMs in the same neutron network
  2. ping_external_snat: monitor that pings traffic from the VMs to external destinations (e.g. google.com)
  3. SDN controller process monitor: a monitor checking the state of a specified SDN controller process. It measures the recovery time of the given process.
operations
In this test case, the following operations are needed:
  1. “nova-create-instance-in_network”: create a VM instance in one of the existing neutron networks.
metrics
In this test case, there are two metrics:
  1. process_recover_time: which indicates the maximum time (seconds) from the process being killed to recovered
  2. packet_drop: measures the packets that have been dropped by the monitors using pktgen.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references TBD
configuration
This test case needs two configuration files:
  1. test case file: opnfv_yardstick_tc092.yaml - Attackers: see above “attackers” description - Monitors: see above “monitors” description

    • waiting_time: which is the time (seconds) from the process being killed to stopping the monitors
    • SLA: see above “metrics” description
  2. POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence Description and expected result
pre-action
  1. The OpenStack cluster is set up with an SDN controller running in a three node cluster configuration.
  2. One or more neutron networks are created with two or more VMs attached to each of the neutron networks.
  3. The neutron networks are attached to a neutron router which is attached to an external network towards the DCGW.
  4. The master node of SDN controller cluster is known.
step 1
Start ip connectivity monitors:
  1. Check the L2 connectivity between the VMs in the same neutron network.
  2. Check the external connectivity of the VMs.

Each monitor runs in an independent process.

Result: The monitor info will be collected.

step 2

Start attacker: SSH to the VIM node and kill the SDN controller process on the master node identified in the pre-action.

Result: One SDN controller service will be shut down

step 3 Restart the SDN controller.
step 4 Create a new VM in the existing Neutron network while the SDN controller is offline or still recovering.
step 5

Stop IP connectivity monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated

step 6

Verify the IP connectivity monitor result

Result: IP connectivity monitor should not have any packet drop failures reported

step 7

Verify that process_recover_time, the maximum time (seconds) from the process being killed until it is recovered, is within the SLA. This step blocks until either the process has recovered or a timeout occurs.

Result: process_recover_time is within the SLA limits; otherwise the test case fails and stops.

step 8
Start IP connectivity monitors for the new VM:
  1. Check the L2 connectivity from the existing VMs to the new VM in the Neutron network.
  2. Check connectivity from one VM to an external host on the Internet to verify SNAT functionality.

Result: The monitor info will be collected.

step 9

Stop IP connectivity monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated

step 10

Verify the IP connectivity monitor result

Result: IP connectivity monitor should not have any packet drop failures reported

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.1.18. Yardstick Test Case Description TC093
SDN Vswitch resilience in non-HA or HA configuration
test case id OPNFV_YARDSTICK_TC093: SDN Vswitch resilience in non-HA or HA configuration
test purpose

This test validates that network data plane services are resilient in the event of Virtual Switch failure on compute nodes. Specifically, the test verifies that existing data plane connectivity is not permanently impacted, i.e. all configured network services such as DHCP, ARP, L2, L3 and Security Groups continue to operate between the existing VMs once the Virtual Switches have finished rebooting.

The test also validates that new network service operations (creating a new VM in the existing L2/L3 network or in a new network, etc.) are operational after the Virtual Switches have recovered from a failure.

test method This test case first checks that the already configured DHCP/ARP/L2/L3/SNAT connectivity is working. It then kills and restarts the Vswitch services running on both OpenStack compute nodes, and checks that the already configured DHCP/ARP/L2/L3/SNAT connectivity between VMs is not permanently impacted (even if there are some packet loss events) and that the system is able to execute new virtual network operations once the Vswitch services are restarted and have fully recovered.
attackers

In this test case, two attackers called “kill-process” are needed. Each attacker includes three parameters:

  1. fault_type: used to locate the attacker’s scripts; it should be set to ‘kill-process’ in this test
  2. process_name: should be set to the name of the Vswitch process
  3. host: the name of the compute node where the Vswitch process is running
example:
  • fault_type: “kill-process”
  • process_name: “openvswitch”
  • host: node1
monitors

This test case utilizes two monitors of type “ip-status” and one monitor of type “process” to track the following conditions:

  1. “ping_same_network_l2”: monitor ICMP traffic between VMs in the same Neutron network
  2. “ping_external_snat”: monitor ICMP traffic from VMs to an external host on the Internet to verify SNAT functionality.
  3. “Vswitch process monitor”: a monitor checking the state of the specified Vswitch process. It measures the recovery time of the given process.

Monitors of type “ip-status” use the “ping” utility to verify reachability of a given target IP.

operations
In this test case, the following operations are needed:
  1. “nova-create-instance-in_network”: create a VM instance in one of the existing Neutron networks.
metrics
In this test case, there are two metrics:
  1. process_recover_time: the maximum time (seconds) from the process being killed until it is recovered
  2. outage_time: measures the total time in which monitors were failing in their tasks (e.g. total time of Ping failure)
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references none
configuration
This test case needs two configuration files:
  1. test case file: opnfv_yardstick_tc093.yaml - Attackers: see the “attackers” description above - monitor_time: the time (seconds) from starting to stopping the monitors

    • Monitors: see the “monitors” description above
    • SLA: see the “metrics” description above
  2. POD file: pod.yaml The POD configuration should be recorded in pod.yaml first; the “host” item in this test case uses the node name from pod.yaml.

test sequence Description and expected result
pre-action
  1. The Vswitches are set up in both compute nodes.
  2. One or more Neutron networks are created with two or more VMs attached to each of the Neutron networks.
  3. The Neutron networks are attached to a Neutron router which is attached to an external network towards the DCGW.
step 1
Start IP connectivity monitors:
  1. Check the L2 connectivity between the VMs in the same Neutron network.
  2. Check connectivity from one VM to an external host on the Internet to verify SNAT functionality.

Result: The monitor info will be collected.

step 2

Start attackers: SSH to the VIM compute nodes and kill the Vswitch processes.

Result: the SDN Vswitch services will be shut down.

step 3

Verify the results of the IP connectivity monitors.

Result: The outage_time metric reported by the monitors is not greater than the max_outage_time.

step 4 Restart the SDN Vswitch services.
step 5 Create a new VM in the existing Neutron network
step 6
Verify connectivity between VMs as follows:
  1. Check the L2 connectivity between the previously existing VM and the newly created VM on the same Neutron network by sending ICMP messages
step 7

Stop IP connectivity monitors after a period of time specified by “monitor_time”

Result: The monitor info will be aggregated

step 8

Verify the IP connectivity monitor results

Result: IP connectivity monitor should not have any packet drop failures reported

test verdict

This test fails if the SLAs are not met or if there is a test case execution problem. The SLAs are defined as follows for this test:

  • SDN Vswitch recovery: process_recover_time <= 30 sec
  • No impact on data plane connectivity during SDN Vswitch failure and recovery: packet_drop == 0
14.3.2. IPv6
14.3.2.1. Yardstick Test Case Description TC027
IPv6 connectivity between nodes on the tenant network
test case id OPNFV_YARDSTICK_TC027_IPv6 connectivity
metric RTT, Round Trip Time
test purpose To do a basic verification that IPv6 connectivity is within acceptable boundaries when IPv6 packets travel between hosts located on the same or different compute blades. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc027.yaml

Packet size is 56 bytes. The SLA RTT is set to a maximum of 30 ms. The IPv6 test case can be configured as three independent modules (setup, run, teardown). If you only want to set up the IPv6 testing environment and run your own tests, “run_step” in the task yaml file should be configured as “setup”. If you want to set up the environment and run the ping6 test automatically, “run_step” should be configured as “setup, run”. If you already have an environment set up and only want to verify IPv6 network connectivity, “run_step” should be “run”. By default, the three modules run sequentially.
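For example, to verify connectivity only, against an environment that is already set up, the task file could carry the following excerpt (the placement of the “run_step” key is illustrative; the rest of the scenario is elided):

```yaml
# opnfv_yardstick_tc027.yaml (excerpt, illustrative)
scenarios:
-
  type: Ping6
  options:
    packetsize: 56
  run_step: "run"        # "setup", "run", "teardown", comma-separated
  sla:
    max_rtt: 30
    action: monitor
```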

test tool

ping6

Ping6 is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

ipv6

ETSI-NFV-TST001

applicability Test case can be configured with different run steps; setup, run benchmark and teardown can be run independently. SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected.
pre-test conditions

The test case image needs to be installed into Glance with ping6 included in it.

For Brahmaputra, a compass_os_nosdn_ha deploy scenario is needed. More installers and SDN deploy scenarios will be supported in the future.

test sequence description and expected result
step 1 To set up the IPv6 testing environment: 1. disable the security group 2. create (IPv6, IPv4) router, network and subnet 3. create VRouter, VM1, VM2
step 2 To run ping6 to verify IPv6 connectivity: 1. ssh to VM1 2. ping6 the IPv6 router from VM1 3. collect the result (RTT) and store the logs
step 3 To tear down the IPv6 testing environment: 1. delete VRouter, VM1, VM2 2. delete (IPv6, IPv4) router, network and subnet 3. enable the security group
test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
14.3.3. KVM
14.3.3.1. Yardstick Test Case Description TC028
KVM Latency measurements
test case id OPNFV_YARDSTICK_TC028_KVM Latency measurements
metric min, avg and max latency
test purpose To evaluate the IaaS KVM virtualization capability with regards to min, avg and max latency. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration file: samples/cyclictest-node-context.yaml
test tool

Cyclictest

(Cyclictest is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with cyclictest included.)

references Cyclictest
applicability This test case is mainly for kvm4nfv project CI verification: upgrade the host Linux kernel, boot a guest VM and update its Linux kernel, then run cyclictest to verify that the new kernel works well.
pre-test conditions

The test kernel rpm, test sequence scripts and test guest image need to be put in the right folders as specified in the test case yaml file. The test guest image needs cyclictest included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host and guest OS kernels are upgraded. Cyclictest is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.3.4. Parser
14.3.4.1. Yardstick Test Case Description TC040
Verify Parser Yang-to-Tosca
test case id OPNFV_YARDSTICK_TC040 Verify Parser Yang-to-Tosca
metric
  1. tosca file which is converted from yang file by Parser
  2. result indicating whether the output matches the expected outcome
test purpose To verify the function of Yang-to-Tosca in Parser.
configuration

file: opnfv_yardstick_tc040.yaml

yangfile: the path of the yangfile which you want to convert
toscafile: the path of the toscafile which is your expected outcome

test tool

Parser

(Parser is not part of a Linux distribution, hence it needs to be installed. As an example see /yardstick/benchmark/scenarios/parser/parser_setup.sh for how to install it manually. It will be installed and uninstalled automatically when you run this test case with Yardstick.)

references Parser
applicability The test can be configured with different yangfile and toscafile paths to fit your environment and verify Parser.
pre-test conditions No POD specific requirements have been identified. The test can be run without a VM.
test sequence description and expected result
step 1

Parser is installed without a VM; the Yang-to-Tosca module is run to convert the yang file to a tosca file, and the output is validated against the expected outcome.

Result: Logs are stored.

test verdict Fails only if the output differs from the expected outcome or if there is a test case execution problem.
14.3.5. StorPerf
14.3.5.1. Yardstick Test Case Description TC074
Storperf
test case id OPNFV_YARDSTICK_TC074_Storperf
metric Storage performance
test purpose

To evaluate and report on the Cinder volume performance.

This testcase integrates with OPNFV StorPerf to measure block performance of the underlying Cinder drivers. Many options are supported, and even the root disk (Glance ephemeral storage) can be profiled.

The fundamental concept of the test case is to first fill the volumes with random data to ensure reported metrics are indicative of continued usage and not skewed by transitional performance while the underlying storage driver allocates blocks. The metrics for filling the volumes with random data are not reported in the final results. The test also ensures the volumes are performing at a consistent level of performance by measuring metrics every minute, and comparing the trend of the metrics over the run. By evaluating the min and max values, as well as the slope of the trend, it can make the determination that the metrics are stable, and not fluctuating beyond industry standard norms.

configuration

file: opnfv_yardstick_tc074.yaml

  • agent_count: 1 - the number of VMs to be created
  • agent_image: “Ubuntu-14.04” - image used for creating VMs
  • public_network: “ext-net” - name of public network
  • volume_size: 2 - cinder volume size
  • block_sizes: “4096” - data block size
  • queue_depths: “4” - the number of simultaneous I/Os to perform at all times
  • StorPerf_ip: “192.168.200.2”
  • query_interval: 10 - state query interval
  • timeout: 600 - maximum allowed job time
test tool

Storperf

StorPerf is a tool to measure block and object storage performance in an NFVI.

StorPerf is delivered as a Docker container from https://hub.docker.com/r/opnfv/storperf-master/tags/.

The underlying tool used is FIO, and StorPerf supports any FIO option in order to tailor the test to the exact workload needed.

references

Storperf

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • agent_count

  • volume_size

  • block_sizes

  • queue_depths

  • query_interval

  • timeout

  • target=[device or path] The path to either an attached storage device (/dev/vdb, etc) or a directory path (/opt/storperf) that will be used to execute the performance test. In the case of a device, the entire device will be used. If not specified, the current directory will be used.

  • workload=[workload module] If not specified, the default is to run all workloads. The workload types are:

    • rs: 100% Read, sequential data
    • ws: 100% Write, sequential data
    • rr: 100% Read, random access
    • wr: 100% Write, random access
    • rw: 70% Read / 30% write, random access


  • workloads={json maps} This parameter supersedes the workload parameter and calls the V2.0 API in StorPerf. It allows for greater control of the parameters to be passed to FIO. For example, running a random read/write with a mix of 90% read and 10% write would be expressed as follows: {"9010randrw": {"rw": "randrw", "rwmixread": "90"}} Note: This must be passed in as a string, so don’t forget to escape or otherwise properly deal with the quotes.

  • report= [job_id] Query the status of the supplied job_id and report on metrics. If a workload is supplied, will report on only that subset.

  • availability_zone: Specify the availability zone which the stack will use to create instances.

  • volume_type: Cinder volumes can have different types, for example encrypted vs. not encrypted, which makes it possible to profile the difference between the two.

  • subnet_CIDR: Specify subnet CIDR of private network

  • stack_name: Specify the name of the stack that will be created, the default: “StorperfAgentGroup”

  • volume_count: Specify the number of volumes per virtual machine

    There are default values for each above-mentioned option.
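Putting a few of the options above together, a scenario options section could look like the following sketch (values are copied from the configuration list above; the workloads string shows the V2.0 API form and its exact placement in opnfv_yardstick_tc074.yaml is illustrative):

```yaml
scenarios:
-
  type: StorPerf
  options:
    agent_count: 1
    agent_image: "Ubuntu-14.04"
    public_network: "ext-net"
    volume_size: 2
    block_sizes: "4096"
    queue_depths: "4"
    StorPerf_ip: "192.168.200.2"
    query_interval: 10
    timeout: 600
    # V2.0 API workload map, passed as a string:
    workloads: '{"9010randrw": {"rw": "randrw", "rwmixread": "90"}}'
```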

pre-test conditions

If you do not have an Ubuntu 14.04 image in Glance, you will need to add one.

Storperf is required to be installed in the environment. There are two possible methods for Storperf installation:

  • Run the container on the Jump Host
  • Run the container in a VM

Running StorPerf on Jump Host Requirements:

  • Docker must be installed
  • Jump Host must have access to the OpenStack Controller API
  • Jump Host must have internet connectivity for downloading docker image
  • Enough floating IPs must be available to match your agent count

Running StorPerf in a VM Requirements:

  • VM has docker installed
  • VM has OpenStack Controller credentials and can communicate with the Controller API
  • VM has internet connectivity for downloading the docker image
  • Enough floating IPs must be available to match your agent count

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Yardstick calls StorPerf to create the heat stack with the number of VMs and size of Cinder volumes specified. The VMs will be on their own private subnet, and take floating IP addresses from the specified public network.
step 2 Yardstick calls StorPerf to fill all the volumes with random data.
step 3 Yardstick calls StorPerf to perform the series of tests specified by the workload, queue depths and block sizes.
step 4 Yardstick calls StorPerf to delete the stack it created.
test verdict None. Storage performance results are fetched and stored.
14.3.6. virtual Traffic Classifier
14.3.6.1. Yardstick Test Case Description TC006
Volume storage Performance
test case id OPNFV_YARDSTICK_TC006_VOLUME STORAGE PERFORMANCE
metric IOPS (Average IOs performed per second), Throughput (Average disk read/write bandwidth rate), Latency (Average disk read/write latency)
test purpose

The purpose of TC006 is to evaluate the IaaS volume storage performance with regards to IOPS, throughput and latency.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

fio

fio is an I/O tool meant to be used both for benchmark and stress/hardware verification. It has support for 19 different types of I/O engines (sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O, forked or threaded jobs, and much more.

(fio is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with fio included.)

test description The fio test is invoked in a host VM with a volume attached, running on a compute blade; a job file and parameters are passed to fio, which then executes what the job file specifies.
configuration

file: opnfv_yardstick_tc006.yaml

A fio job file is provided to define the benchmark process. The target volume is mounted at the /FIO_Test directory.

For SLA, minimum read/write iops is set to 100, minimum read/write throughput is set to 400 KB/s, and maximum read/write latency is set to 20000 usec.
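Expressed in the task file, the SLA above might look like the following sketch; the key names are assumptions, not verified against opnfv_yardstick_tc006.yaml:

```yaml
sla:
  read_iops: 100     # minimum IOs per second
  write_iops: 100
  read_bw: 400       # minimum bandwidth, KB/s
  write_bw: 400
  read_lat: 20000    # maximum latency, usec
  write_lat: 20000
  action: monitor
```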

applicability

This test case can be configured with different:

  • Job file;
  • Volume mount directory.

SLA is optional. The SLA in this test case serves as an example. Considerably higher throughput and lower latency are expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read/write bandwidths are lower than this.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable in most scenarios.
references

fio

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with fio included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with fio installed is booted. A 200G volume is attached to the host VM.
step 2 Yardstick connects to the host VM using ssh. ‘job_file.ini’ is copied from the Jump Host to the host VM via the ssh tunnel. The attached volume is formatted and mounted.
step 3

Fio benchmark is invoked. Simulated IO operations are started. IOPS, disk read/write bandwidth and latency are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
14.4. Templates
14.4.1. Yardstick Test Case Description TCXXX
test case slogan e.g. Network Latency
test case id e.g. OPNFV_YARDSTICK_TC001_NW Latency
metric what will be measured, e.g. latency
test purpose describe what is the purpose of the test case
configuration what .yaml file to use, state SLA if applicable, state test duration, list and describe the scenario options used in this TC and also list the options using default values.
test tool e.g. ping
references e.g. RFCxxx, ETSI-NFVyyy
applicability describe variations of the test case which can be performed, e.g. run the test for different packet sizes
pre-test conditions describe configuration in the tool(s) used to perform the measurements (e.g. fio, pktgen), POD-specific configuration required to enable running the test
test sequence description and expected result
step 1

use this to describe tests that require several steps, e.g. collect logs.

Result: what happens in this step e.g. logs collected

step 2

remove interface

Result: interface down.

step N

what is done in step N

Result: what happens

test verdict expected behavior, or SLA, pass/fail criteria
14.4.2. Task Template Syntax
14.4.2.1. Basic template syntax

A nice feature of the input task format used in Yardstick is that it supports the template syntax based on Jinja2. This turns out to be extremely useful when, say, you have a fixed structure of your task but you want to parameterize this task in some way. For example, imagine your input task file (task.yaml) runs a set of Ping scenarios:

# Sample benchmark task config file
# measure network latency using ping
schema: "yardstick:task:0.1"

scenarios:
-
  type: Ping
  options:
    packetsize: 200
  host: athena.demo
  target: ares.demo

  runner:
    type: Duration
    duration: 60
    interval: 1

  sla:
    max_rtt: 10
    action: monitor

context:
    ...

Let’s say you want to run the same set of scenarios with the same runner/ context/sla, but you want to try another packetsize to compare the performance. The most elegant solution is then to turn the packetsize name into a template variable:

# Sample benchmark task config file
# measure network latency using ping

schema: "yardstick:task:0.1"
scenarios:
-
  type: Ping
  options:
    packetsize: {{packetsize}}
  host: athena.demo
  target: ares.demo

  runner:
    type: Duration
    duration: 60
    interval: 1

  sla:
    max_rtt: 10
    action: monitor

context:
    ...

and then pass the argument value for {{packetsize}} when starting a task with this configuration file. Yardstick provides you with different ways to do that:

1. Pass the argument values directly in the command-line interface (with either a JSON or YAML dictionary):

yardstick task start samples/ping-template.yaml \
--task-args '{"packetsize":"200"}'

2. Refer to a file that specifies the argument values (JSON/YAML):

yardstick task start samples/ping-template.yaml --task-args-file args.yaml
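The referenced args.yaml simply maps template variable names to values; for the ping template above it could contain (contents illustrative):

```yaml
# args.yaml - arguments for samples/ping-template.yaml
packetsize: "200"
```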
14.4.2.2. Using the default values

Note that the Jinja2 template syntax allows you to set the default values for your parameters. With default values set, your task file will work even if you don’t parameterize it explicitly while starting a task. The default values should be set using the {% set ... %} clause (task.yaml). For example:

# Sample benchmark task config file
# measure network latency using ping
schema: "yardstick:task:0.1"
{% set packetsize = packetsize or "100" %}
scenarios:
-
  type: Ping
  options:
    packetsize: {{packetsize}}
  host: athena.demo
  target: ares.demo

  runner:
    type: Duration
    duration: 60
    interval: 1
  ...

If you don’t pass the value for {{packetsize}} while starting a task, the default one will be used.

14.4.2.3. Advanced templates

Yardstick makes it possible to use all the power of Jinja2 template syntax, including the mechanism of built-in functions. As an example, let us make up a task file that will do a block storage performance test. The input task file (fio-template.yaml) below uses the Jinja2 for-endfor construct to accomplish that:

#Test block sizes of 4KB, 8KB, 64KB, 1MB
#Test 5 workloads: read, write, randwrite, randread, rw
schema: "yardstick:task:0.1"

scenarios:
{% for bs in ['4k', '8k', '64k', '1024k' ] %}
  {% for rw in ['read', 'write', 'randwrite', 'randread', 'rw' ] %}
-
  type: Fio
  options:
    filename: /home/ubuntu/data.raw
    bs: {{bs}}
    rw: {{rw}}
    ramp_time: 10
  host: fio.demo
  runner:
    type: Duration
    duration: 60
    interval: 60

  {% endfor %}
{% endfor %}
context:
    ...
15. NSB Sample Test Cases
15.1. Abstract

This chapter lists available NSB test cases.

15.2. NSB PROX Test Case Descriptions
15.2.1. Yardstick Test Case Description: NSB PROX ACL
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_acl-{port_num}

  • context = baremetal or heat_context;
  • port_num = 2 or 4;
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

This test measures how well the SUT can exploit structures in the list of ACL rules. The ACL rules are matched against a 7-tuple of the input packet: the regular 5-tuple and two VLAN tags. The rules in the rule set allow the packet to be forwarded and the rule set contains a default “match all” rule.

The KPI is measured with a rule set that has a moderate number of rules with moderate similarity between the rules and the fraction of rules that were used.

The ACL test cases are implemented to run in baremetal and heat context for 2 port and 4 port configuration.

configuration

The ACL test cases are listed below:

  • tc_prox_baremetal_acl-2.yaml
  • tc_prox_baremetal_acl-4.yaml
  • tc_prox_heat_context_acl-2.yaml
  • tc_prox_heat_context_acl-4.yaml

Test duration is set to 300 seconds for each test. Packet size is set to 64 bytes in the traffic profile. Both can be configured.

test tool PROX PROX is a DPDK application that can simulate VNF workloads and generate traffic, and is used for NFVI characterization.
applicability

The PROX ACL test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For OpenStack, the test case image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it. The test needs multi-queue enabled in the Glance image.

For baremetal test cases, PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted, as Traffic generator and VNF(ACL workload) based on the test flavor.

step 2 Yardstick connects to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is done until the dropped packets are within the tolerated loss.

The KPI is the number of packets per second for 64 bytes packet size with an accepted minimal packet loss for the default configuration.
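The halving described in this step amounts to a binary search over the offered rate. A minimal sketch of that convergence loop in Python (function and parameter names are hypothetical illustrations, not the actual PROX/NSB driver code):

```python
def find_max_throughput(run_trial, tolerated_loss=0.001, precision=0.1):
    """Binary-search the highest rate (percent of line rate) whose
    packet loss stays within the tolerated fraction.

    run_trial(rate) stands in for one traffic-generator run and is
    assumed to return (tx_packets, rx_packets).
    """
    lo, hi = 0.0, 100.0
    best = 0.0
    while hi - lo > precision:
        rate = (lo + hi) / 2.0
        tx, rx = run_trial(rate)
        loss = (tx - rx) / tx if tx else 1.0
        if loss <= tolerated_loss:
            best = rate   # within tolerated loss: try a higher rate
            lo = rate
        else:
            hi = rate     # too many drops: search lower rates
    return best
```

The search stops once the rate interval shrinks below the chosen precision, and the last passing rate is reported as the throughput KPI.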

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.2. Yardstick Test Case Description: NSB PROX BNG
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_bng-{port_num}

  • context = baremetal or heat_context;
  • port_num = 4;
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

The BNG workload converts packets from QinQ to GRE tunnels, handles routing and adds/removes MPLS tags. This use case simulates a realistic and complex application. The number of users is 32K per port and the number of routes is 8K.

The BNG test cases are implemented to run in baremetal and heat context and require a 4 port topology to run the default configuration.

configuration

The BNG test cases are listed below:

  • tc_prox_baremetal_bng-2.yaml
  • tc_prox_baremetal_bng-4.yaml
  • tc_prox_heat_context_bng-2.yaml
  • tc_prox_heat_context_bng-4.yaml

Test duration is set to 300 seconds for each test. The minimum packet size for the BNG test is 78 bytes. This is set in the BNG traffic profile and can be configured to use a higher packet size for the test.

test tool PROX PROX is a DPDK application that can simulate VNF workloads and generate traffic, and is used for NFVI characterization.
applicability

The PROX BNG test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For OpenStack, the test case image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it. The test needs multi-queue enabled in the Glance image.

For baremetal test cases, PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted, as Traffic generator and VNF(BNG workload) based on the test flavor.

step 2 Yardstick connects to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is done until the dropped packets are within the tolerated loss.

The KPI is the number of packets per second for 78 bytes packet size with an accepted minimal packet loss for the default configuration.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.3. Yardstick Test Case Description: NSB PROX BNG_QoS
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_bng_qos-{port_num}

  • context = baremetal or heat_context;
  • port_num = 4;
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

The BNG+QoS workload converts packets from QinQ to GRE tunnels, handles routing, adds/removes MPLS tags and performs QoS. This use case simulates a realistic and complex application. The number of users is 32K per port and the number of routes is 8K.

The BNG_QoS test cases are implemented to run in baremetal and heat context and require a 4 port topology to run the default configuration.
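
For illustration, the outer (S-VLAN) and inner (C-VLAN) tags of a QinQ frame, which the BNG workload consumes before GRE encapsulation, can be read like this. A hedged sketch only: TPID values and offsets assume a plain QinQ Ethernet frame, not PROX's actual parsing code:

```python
import struct

def parse_qinq(frame: bytes):
    """Return (outer VLAN ID, inner VLAN ID) of a QinQ frame.
    The two tags start right after the 12 bytes of dst+src MACs."""
    outer_tpid, outer_tci, inner_tpid, inner_tci = struct.unpack_from(
        "!HHHH", frame, 12)
    if outer_tpid not in (0x88A8, 0x8100) or inner_tpid != 0x8100:
        raise ValueError("not a QinQ frame")
    return outer_tci & 0x0FFF, inner_tci & 0x0FFF  # low 12 bits are the IDs
```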

configuration

The BNG_QoS test cases are listed below:

  • tc_prox_baremetal_bng_qos-2.yaml
  • tc_prox_baremetal_bng_qos-4.yaml
  • tc_prox_heat_context_bng_qos-2.yaml
  • tc_prox_heat_context_bng_qos-4.yaml

Test duration is set as 300sec for each test. The minimum packet size for the BNG_QoS test is 78 bytes. This is set in the bng_qos traffic profile and can be configured to use a higher packet size for the test.

test tool PROX. PROX is a DPDK application that can simulate VNF workloads and generate traffic; it is used for NFVI characterization.
applicability

The PROX BNG_QoS test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For the Openstack test case an image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it. The test needs multi-queue enabled in the Glance image.

For Baremetal test cases PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted: a traffic generator and the VNF (BNG_QoS workload), based on the test flavor.

step 2 Yardstick is connected to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is repeated until the dropped packets are within an acceptable tolerated loss.

The KPI is the number of packets per second for a 78 byte packet size with an accepted minimal packet loss for the default configuration.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.4. Yardstick Test Case Description: NSB PROX L2FWD
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_l2fwd-{port_num}

  • context = baremetal or heat_context;
  • port_num = 2 or 4;
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

The PROX L2FWD test has 3 types of test cases:

  • L2FWD: The application takes packets in from one port and forwards them unmodified to another port.
  • L2FWD_Packet_Touch: The application takes packets in from one port, updates the src and dst MACs and forwards them to another port.
  • L2FWD_Multi_Flow: The application takes packets in from one port, updates the src and dst MACs and forwards them to another port. This test case exercises the softswitch with 200k flows.

The above test cases are implemented for baremetal and heat context for 2 port and 4 port configuration.
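
The packet-touch variant boils down to rewriting the first 12 bytes of each frame. A minimal sketch, illustrative only and not PROX code:

```python
def touch_macs(frame: bytes, new_dst: bytes, new_src: bytes) -> bytes:
    """L2FWD_Packet_Touch in miniature: rewrite the destination and
    source MAC addresses (the first 12 bytes of an Ethernet frame) and
    leave the rest of the packet untouched before forwarding it."""
    assert len(new_dst) == 6 and len(new_src) == 6
    return new_dst + new_src + frame[12:]
```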

configuration

The L2FWD test cases are listed below:

  • tc_prox_baremetal_l2fwd-2.yaml
  • tc_prox_baremetal_l2fwd-4.yaml
  • tc_prox_baremetal_l2fwd_pktTouch-2.yaml
  • tc_prox_baremetal_l2fwd_pktTouch-4.yaml
  • tc_prox_baremetal_l2fwd_multiflow-2.yaml
  • tc_prox_baremetal_l2fwd_multiflow-4.yaml
  • tc_prox_heat_context_l2fwd-2.yaml
  • tc_prox_heat_context_l2fwd-4.yaml
  • tc_prox_heat_context_l2fwd_pktTouch-2.yaml
  • tc_prox_heat_context_l2fwd_pktTouch-4.yaml
  • tc_prox_heat_context_l2fwd_multiflow-2.yaml
  • tc_prox_heat_context_l2fwd_multiflow-4.yaml

Test duration is set as 300sec for each test. The packet size is set as 64 bytes in the traffic profile. Both can be configured.

test tool PROX. PROX is a DPDK application that can simulate VNF workloads and generate traffic; it is used for NFVI characterization.
applicability

The PROX L2FWD test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For the Openstack test case an image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it.

For Baremetal test cases PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted: a traffic generator and the VNF (L2FWD workload), based on the test flavor.

step 2 Yardstick is connected to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is repeated until the dropped packets are within an acceptable tolerated loss.

The KPI is the number of packets per second for a 64 byte packet size with an accepted minimal packet loss for the default configuration.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.5. Yardstick Test Case Description: NSB PROX L3FWD
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_l3fwd-{port_num}

  • context = baremetal or heat_context;
  • port_num = 2 or 4;
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

The PROX L3FWD application performs basic routing of packets using an LPM (longest prefix match) based look-up method.

The L3FWD test cases are implemented for baremetal and heat context for 2 port and 4 port configuration.
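
LPM-based look-up picks the most specific route that contains the destination address. A simplified sketch of the idea, with a linear scan standing in for DPDK's LPM tables:

```python
import ipaddress

def lpm_lookup(routes, dst):
    """Longest-prefix-match routing: among all (prefix, next_hop) pairs
    whose prefix contains dst, return the next hop of the longest one.
    Illustrative only; real l3fwd uses DPDK's optimized LPM library."""
    addr = ipaddress.ip_address(dst)
    best_net, best_hop = None, None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None
                            or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop
```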

configuration

The L3FWD test cases are listed below:

  • tc_prox_baremetal_l3fwd-2.yaml
  • tc_prox_baremetal_l3fwd-4.yaml
  • tc_prox_heat_context_l3fwd-2.yaml
  • tc_prox_heat_context_l3fwd-4.yaml

Test duration is set as 300sec for each test. The minimum packet size for L3FWD test is 64 bytes. This is set in the traffic profile and can be configured to use a higher packet size for the test.

test tool PROX. PROX is a DPDK application that can simulate VNF workloads and generate traffic; it is used for NFVI characterization.
applicability

The PROX L3FWD test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For the Openstack test case an image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it. The test needs multi-queue enabled in the Glance image.

For Baremetal test cases PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted: a traffic generator and the VNF (L3FWD workload), based on the test flavor.

step 2 Yardstick is connected to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is repeated until the dropped packets are within an acceptable tolerated loss.

The KPI is the number of packets per second for 64 byte packets with an accepted minimal packet loss for the default configuration.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.6. Yardstick Test Case Description: NSB PROX MPLS Tagging
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_mpls_tagging-{port_num}

  • context = baremetal or heat_context;
  • port_num = 2 or 4;
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

The PROX MPLS Tagging test takes packets in from one port, adds an MPLS tag and forwards them to another port. While forwarding packets in the other direction, MPLS tags are removed.

The MPLS test cases are implemented to run in baremetal and heat context and require a 4 port topology to run the default configuration.
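
An MPLS tag is a single 32-bit word prepended to the payload. A hedged sketch of the add/remove operations, following the standard MPLS header layout rather than PROX's implementation:

```python
import struct

# MPLS header word: label(20 bits) | TC(3) | bottom-of-stack(1) | TTL(8)
MPLS_HDR = struct.Struct("!I")

def push_mpls(payload: bytes, label: int, tc: int = 0, ttl: int = 64) -> bytes:
    """Prepend a single (bottom-of-stack) MPLS header, as done when
    forwarding in one direction of the test."""
    word = (label << 12) | (tc << 9) | (1 << 8) | ttl
    return MPLS_HDR.pack(word) + payload

def pop_mpls(packet: bytes) -> bytes:
    """Strip the 4-byte MPLS header on the way back."""
    return packet[MPLS_HDR.size:]
```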

configuration

The MPLS Tagging test cases are listed below:

  • tc_prox_baremetal_mpls_tagging-2.yaml
  • tc_prox_baremetal_mpls_tagging-4.yaml
  • tc_prox_heat_context_mpls_tagging-2.yaml
  • tc_prox_heat_context_mpls_tagging-4.yaml

Test duration is set as 300sec for each test. The minimum packet size for MPLS test is 68 bytes. This is set in the traffic profile and can be configured to use higher packet sizes.

test tool PROX. PROX is a DPDK application that can simulate VNF workloads and generate traffic; it is used for NFVI characterization.
applicability

The PROX MPLS Tagging test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For the Openstack test case an image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it.

For Baremetal test cases PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted: a traffic generator and the VNF (MPLS workload), based on the test flavor.

step 2 Yardstick is connected to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is repeated until the dropped packets are within an acceptable tolerated loss.

The KPI is the number of packets per second for a 68 byte packet size with an accepted minimal packet loss for the default configuration.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.7. Yardstick Test Case Description: NSB PROX Packet Buffering
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_buffering-{port_num}

  • context = baremetal or heat_context
  • port_num = 1
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

This test measures the impact of the condition when packets get buffered and thus stay in memory for an extended period of time, 125 ms in this case.

The Packet Buffering test cases are implemented to run in baremetal and heat context.

The test runs only on the first port of the SUT.
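
The buffering condition can be modelled as a queue that releases a packet only after it has been held for 125 ms. An illustrative toy model, not the PROX configuration:

```python
from collections import deque

def drain_ready(buffer: deque, now_ms: float, hold_ms: float = 125.0):
    """Release only the packets that have sat in the buffer for at least
    hold_ms (125 ms in this test). Entries are (enqueue_time_ms, packet)
    tuples in arrival order, so the check can stop at the first young one."""
    released = []
    while buffer and now_ms - buffer[0][0] >= hold_ms:
        released.append(buffer.popleft()[1])
    return released
```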

configuration

The Packet Buffering test cases are listed below:

  • tc_prox_baremetal_buffering-1.yaml
  • tc_prox_heat_context_buffering-1.yaml

Test duration is set as 300sec for each test. The minimum packet size for Buffering test is 64 bytes. This is set in the traffic profile and can be configured to use a higher packet size for the test.

test tool PROX. PROX is a DPDK application that can simulate VNF workloads and generate traffic; it is used for NFVI characterization.
applicability
The PROX Packet Buffering test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For the Openstack test case an image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it. The test needs multi-queue enabled in the Glance image.

For Baremetal test cases PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted: a traffic generator and the VNF (Packet Buffering workload), based on the test flavor.

step 2 Yardstick is connected to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is repeated until the dropped packets are within an acceptable tolerated loss.

The KPI in this test is the maximum number of packets that can be forwarded given the requirement that the latency of each packet is at least 125 milliseconds.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.8. Yardstick Test Case Description: NSB PROX Load Balancer
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_lb-{port_num}

  • context = baremetal or heat_context
  • port_num = 4
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

The application transmits packets on one port and receives them on 4 ports. The conventional 5-tuple is used in this test as it requires some extraction steps and allows defining enough distinct values to find the performance limits.

The load is increased (adding more ports if needed) while packets are load balanced using a hash table of 8M entries.

The number of packets per second that can be forwarded determines the KPI. The default packet size is 64 bytes.
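
Balancing on the conventional 5-tuple means hashing it to an egress port so every packet of a flow takes the same path. A minimal sketch, illustrative only (the real PROX table holds 8M entries):

```python
import hashlib

def pick_egress_port(src_ip, dst_ip, src_port, dst_port, proto, n_ports=4):
    """Hash the conventional 5-tuple to one of n_ports egress ports.
    Using a cryptographic hash here is just for the illustration; a
    real data plane uses a much cheaper hash function."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_ports
```

Because the hash is deterministic, repeated packets of one flow always land on the same port, which is what keeps flows intact across the 4 ports.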

configuration

The Load Balancer test cases are listed below:

  • tc_prox_baremetal_lb-4.yaml
  • tc_prox_heat_context_lb-4.yaml

Test duration is set as 300sec for each test. The packet size is set as 64 bytes in the traffic profile. Both can be configured.

test tool PROX. PROX is a DPDK application that can simulate VNF workloads and generate traffic; it is used for NFVI characterization.
applicability
The PROX Load Balancer test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For the Openstack test case an image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it. The test needs multi-queue enabled in the Glance image.

For Baremetal test cases PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted: a traffic generator and the VNF (Load Balancer workload), based on the test flavor.

step 2 Yardstick is connected to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is repeated until the dropped packets are within an acceptable tolerated loss.

The KPI is the number of packets per second for a 64 byte packet size with an accepted minimal packet loss for the default configuration.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
15.2.9. Yardstick Test Case Description: NSB PROX VPE
NSB PROX test for NFVI characterization
test case id

tc_prox_{context}_vpe-{port_num}

  • context = baremetal or heat_context;
  • port_num = 4;
metric
  • Network Throughput;
  • TG Packets Out;
  • TG Packets In;
  • VNF Packets Out;
  • VNF Packets In;
  • Dropped packets;
test purpose

The PROX VPE test handles packet processing, routing, QinQ encapsulation, flows, ACL rules, adds/removes MPLS tagging and performs QoS before forwarding packets to another port. The reverse applies to forwarded packets in the other direction.

The VPE test cases are implemented to run in baremetal and heat context and require a 4 port topology to run the default configuration.

configuration

The VPE test cases are listed below:

  • tc_prox_baremetal_vpe-4.yaml
  • tc_prox_heat_context_vpe-4.yaml

Test duration is set as 300sec for each test. The minimum packet size for VPE test is 68 bytes. This is set in the traffic profile and can be configured to use higher packet sizes.

test tool PROX. PROX is a DPDK application that can simulate VNF workloads and generate traffic; it is used for NFVI characterization.
applicability

The PROX VPE test cases can be configured with different:

  • packet sizes;
  • test durations;
  • tolerated loss;

Default values exist.

pre-test conditions

For the Openstack test case an image (yardstick-samplevnfs) needs to be installed into Glance with PROX and DPDK included in it.

For Baremetal test cases PROX and DPDK must be installed on the hosts where the test is executed. The pod.yaml file must have the necessary system and NIC information.

test sequence description and expected result
step 1

For Baremetal test: The TG and VNF are started on the hosts based on the pod file.

For Heat test: Two host VMs are booted: a traffic generator and the VNF (VPE workload), based on the test flavor.

step 2 Yardstick is connected to the TG and VNF using ssh. The test will resolve the topology, instantiate the VNF and TG, and collect the KPIs/metrics.
step 3

The TG will send packets to the VNF. If the number of dropped packets is more than the tolerated loss, the line rate or throughput is halved. This is repeated until the dropped packets are within an acceptable tolerated loss.

The KPI is the number of packets per second for a 68 byte packet size with an accepted minimal packet loss for the default configuration.

step 4

In Baremetal test: The test quits the application and unbinds the DPDK ports.

In Heat test: Two host VMs are deleted on test completion.

test verdict The test case will achieve a Throughput with an accepted minimal tolerated packet loss.
16. Glossary
API
Application Programming Interface
Docker
Docker provisions and manages containers. Yardstick and many other OPNFV projects are deployed in containers. Docker is required to launch the containerized versions of these projects.
DPDK
Data Plane Development Kit
DPI
Deep Packet Inspection
DSCP
Differentiated Services Code Point
IGMP
Internet Group Management Protocol
IOPS
Input/Output Operations Per Second. A performance measurement used to benchmark storage devices.
KPI
Key Performance Indicator
Kubernetes
Also known as k8s. Kubernetes is an open-source container-orchestration system for automating deployment, scaling and management of containerized applications. It is one of the contexts supported in Yardstick.
NFV
Network Function Virtualization. NFV is an initiative to take network services which were traditionally run on proprietary, dedicated hardware, and virtualize them to run on general purpose hardware.
NFVI
Network Function Virtualization Infrastructure. The servers, routers, switches, etc. on which the NFV system runs.
NIC
Network Interface Controller
OpenStack
OpenStack is a cloud operating system that controls pools of compute, storage, and networking resources. OpenStack is an open source project licensed under the Apache License 2.0.
PBFS
Packet Based per Flow State
PROX
Packet pROcessing eXecution engine
QoS
Quality of Service. The ability to guarantee certain network or storage requirements to satisfy a Service Level Agreement (SLA) between an application provider and end users. Typically includes performance requirements like networking bandwidth, latency, jitter correction and reliability, as well as storage performance in Input/Output Operations Per Second (IOPS), throttling agreements, and performance expectations at peak load.
SLA
Service Level Agreement. An SLA is an agreement between a service provider and a customer to provide a certain level of service/performance.
SR-IOV
Single Root IO Virtualization. A specification that, when implemented by a physical PCIe device, enables it to appear as multiple separate PCIe devices. This enables multiple virtualized guests to share direct access to the physical device.
SUT
System Under Test
ToS
Type of Service
VLAN
Virtual LAN (Local Area Network)
VM
Virtual Machine. An operating system instance that runs on top of a hypervisor. Multiple VMs can run at the same time on the same physical host.
VNF
Virtual Network Function
VNFC
Virtual Network Function Component
17. References

Testing Developer Guides

Testing group

Test Framework Overview
Testing developer guide
Introduction

The OPNFV testing ecosystem is wide.

This guide provides guidelines for new developers involved in test areas.

For the description of the ecosystem, see [DEV1].

Developer journey

There are several ways to join test projects as a developer. In fact you may:

  • Develop new test cases
  • Develop frameworks
  • Develop tooling (reporting, dashboards, graphs, middleware,...)
  • Troubleshoot results
  • Post-process results

These different tasks may be done within a specific project or as a shared resource across the different projects.

If you develop new test cases, the best practice is to contribute upstream as much as possible. You may contact the testing group to know which project - in OPNFV or upstream - would be the best place to host the test cases. Such contributions are usually directly connected to a specific project, more details can be found in the user guides of the testing projects.

Each OPNFV testing project provides test cases and the framework to manage them. As a developer, you can obviously contribute to them. The developer guide of the testing projects shall indicate the procedure to follow.

Tooling may be specific to a project or generic to all the projects. For specific tooling, please report to the test project user guide. The tooling used by several test projects will be detailed in this document.

The best event to meet the testing community is probably the plugfest. Such an event is organized after each release. Most of the test projects are present.

The summit is also a good opportunity to meet most of the actors [DEV4].

Be involved in the testing group

The testing group is a self-organized working group. The OPNFV projects dealing with testing are invited to participate in order to elaborate and consolidate a consistent test strategy (test case definition, scope of projects, resources for long duration tests, documentation, ...) and align tooling or best practices.

A weekly meeting is organized and the agenda may be amended by any participant. Two slots have been defined (US/Europe and APAC). Agendas and minutes are public. See [DEV3] for details. The testing group IRC channel is #opnfv-testperf.

Best practices

Not all test projects have the same maturity and/or number of contributors, and the nature of the test projects may also differ. The following best practices may not be accurate for all projects and are only indicative. Contact the testing group for further details.

Repository structure

Most of the projects have a similar structure, which can be defined as follows:

`-- home
  |-- requirements.txt
  |-- setup.py
  |-- tox.ini
  |
  |-- <project>
  |       |-- <api>
  |       |-- <framework>
  |       `-- <test cases>
  |
  |-- docker
  |     |-- Dockerfile
  |     `-- Dockerfile.aarch64.patch
  |-- <unit tests>
  `-- docs
     |-- release
     |   |-- release-notes
     |   `-- results
     `-- testing
         |-- developer
         |     `-- devguide
         |-- user
               `-- userguide
API

Test projects install tools and trigger tests. When possible, it is recommended to implement an API to perform the different actions.

Each test project should be able to expose and consume APIs from other test projects. This pseudo micro-service approach should allow a flexible use of the different projects and reduce the risk of overlap. In fact, if project A provides an API to deploy a traffic generator, it is better to reuse it rather than implementing a new way to deploy it. This approach has not been implemented yet, but the prerequisite of exposing an API has already been met by several test projects.

CLI

Most of the test projects provide a Docker container as a deliverable. Once connected, it is possible to prepare the environment and run tests through a CLI.

Dockerization

Dockerization has been introduced in Brahmaputra and adopted by most of the test projects. Docker containers are pulled onto the jumphost of an OPNFV POD. <TODO Jose/Mark/Alec>

Code quality

It is recommended to control the quality of the code of the testing projects, and more precisely to implement some verifications before any merge:

  • pep8
  • pylint
  • unit tests (python 2.7)
  • unit tests (python 3.5)

The code of the test project must be covered by unit tests. The coverage shall be reasonable and must not decrease when adding new features to the framework. The use of tox is recommended. It is possible to implement strict rules (no decrease of the pylint score or unit test coverage) on critical python classes.

Third party tooling

Several test projects integrate third party tooling for code quality check and/or traffic generation. Some of the tools can be listed as follows:

Project       Tools                                           Comments
Bottlenecks   TODO
Functest      Tempest, Rally, Refstack, RobotFramework        OpenStack test tooling; RobotFramework is used for ODL tests
QTIP          Unixbench, RAMSpeed, nDPI, openSSL, inxi
Storperf      TODO
VSPERF        TODO
Yardstick     Moongen, Trex, Pktgen, IxLoad/IxNet             Traffic generators
              SPEC, Unixbench, RAMSpeed, LMBench              Compute
              Iperf3, Netperf, Pktgen-DPDK, Testpmd, L2fwd    Network
              Fio, Bonnie++                                   Storage
Testing group configuration parameters
Testing categories

The testing group defined several categories also known as tiers. These categories can be used to group test suites.

Category Description
Healthcheck Simple and quick healthcheck test cases
Smoke Set of smoke test cases/suites to validate the release
Features Test cases that validate a specific feature on top of OPNFV. Those come from Feature projects and need a bit of support for integration
Components Tests on a specific component (e.g. OpenStack, OVS, DPDK,..) It may extend smoke tests
Performance Performance qualification
VNF Test cases related to deploy an open source VNF including an orchestrator
Stress Stress and robustness tests
In Service In service testing
Testing domains

The domains deal with the technical scope of the tests. It shall correspond to domains defined for the certification program:

  • compute
  • network
  • storage
  • hypervisor
  • container
  • vim
  • mano
  • vnf
  • ...
Testing coverage

One of the goals of the testing working group is to identify the poorly covered areas and avoid testing overlap. Ideally based on the declaration of the test cases, through the tags, domains and tier fields, it shall be possible to create heuristic maps.

Reliability, Stress and Long Duration Testing

Resiliency of NFV refers to the ability of the NFV framework to limit disruption and return to normal or at a minimum acceptable service delivery level in the face of a fault, failure, or an event that disrupts the normal operation [DEV5].

Reliability testing evaluates the ability of the SUT to recover in the face of faults, failures or disruptions in normal operation, or simply the ability of the SUT to absorb “disruptions”.

Reliability tests use different forms of faults as stimulus, and the test must measure the reaction in terms of the outage time or impairments to transmission.

Stress testing involves producing excess load as stimulus, and the test must measure the reaction in terms of unexpected outages or (more likely) impairments to transmission.

These kinds of “load” will cause “disruption” which can easily be found in system logs. The purpose is to raise such “load” to evaluate whether the SUT can provide an acceptable level of service, or level of confidence, during such circumstances. In Danube and Euphrates, we only considered the stress test with excess load over the OPNFV Platform.

In Danube, the Bottlenecks and Yardstick projects jointly implemented 2 stress tests (concurrently create/destroy VM pairs and do ping; system throughput limit), with Bottlenecks acting as the load manager calling Yardstick to execute each test iteration. These tests are designed to test for breaking points and provide a level of confidence in the system to users. Summaries of the test cases are listed at the following addresses:

Stress test cases for the OPNFV Euphrates (OS Ocata) release can be seen as extensions/enhancements of those in the D release. These tests are located in the Bottlenecks/Yardstick repos (Bottlenecks as load manager while Yardstick executes each test iteration):

  • Baseline stress test case (network usage from different VM pairs): https://wiki.opnfv.org/display/DEV/Intern+Project%3A+Baseline+Stress+Test+Case+for+Bottlenecks+E+Release

In the OPNFV E release, we also plan to do long duration testing over OS Ocata. A separate CI pipeline testing OPNFV XCI (OSA) is proposed to accomplish the job. We have applied for a specific POD for the testing. Proposals and details are listed below:

The long duration testing is supposed to be started when OPNFV E release is published. A simple monitoring module for these tests is also planned to be added: https://wiki.opnfv.org/display/DEV/Intern+Project%3A+Monitoring+Stress+Testing+for+Bottlenecks+E+Release

How TOs
Where can I find information on the different test projects?

On http://docs.opnfv.org! A section is dedicated to the testing projects. You will find the overview of the ecosystem and the links to the project documents.

Another source is the testing wiki on https://wiki.opnfv.org/display/testing

You may also contact the testing group on the IRC channel #opnfv-testperf or by mail at test-wg AT lists.opnfv.org (testing group) or opnfv-tech-discuss AT lists.opnfv.org (generic technical discussions).

How can I contribute to a test project?

As any project, the best solution is to contact the project. The project members with their email address can be found under https://git.opnfv.org/<project>/tree/INFO

You may also send a mail to the testing mailing list or use the IRC channel #opnfv-testperf

Where can I find hardware resources?

You should discuss this topic with the project you are working with. If you need access to an OPNFV community POD, it is possible to contact the infrastructure group. Depending on your needs (scenario/installer/tooling), it should be possible to find free time slots on one OPNFV community POD from the Pharos federation. Create a JIRA ticket to describe your needs on https://jira.opnfv.org/projects/INFRA. You must already be an OPNFV contributor. See https://wiki.opnfv.org/display/DEV/Developer+Getting+Started.

Please note that lots of projects have their own “how to contribute” or “get started” page on the OPNFV wiki.

How do I integrate my tests in CI?

This shall be discussed directly with the project you are working with. It is done through Jenkins jobs calling testing project files, but the way to onboard cases differs from one project to another.

How to declare my tests in the test Database?

If you have access to the test API swagger (access granted to contributors), you may use the swagger interface of the test API to declare your project. The URL is http://testresults.opnfv.org/test/swagger/spec.html.

Testing Group Test API swagger

Click on Spec; the list of available methods will be displayed.

Testing Group Test API swagger

For the declaration of a new project, use the POST /api/v1/projects method. For the declaration of new test cases in an existing project, use the POST /api/v1/projects/{project_name}/cases method.

Testing group declare new test case
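As an illustration, the two POST calls above can be prepared from a script. This is a hedged sketch only: the endpoint paths come from the swagger spec, but the payload field names ("name", "description") are assumptions that should be checked against the swagger interface.

```python
# Sketch of declaring a project and a test case through the Test API.
# Payload field names are assumptions; verify them in the swagger spec.
import json
from urllib import request

BASE_URL = "http://testresults.opnfv.org/test/api/v1"


def declare_project_request(name, description=""):
    """Build the POST request for /api/v1/projects."""
    payload = {"name": name, "description": description}
    return request.Request(
        "%s/projects" % BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST")


def declare_case_request(project, case, description=""):
    """Build the POST request for /api/v1/projects/{project_name}/cases."""
    payload = {"name": case, "description": description}
    return request.Request(
        "%s/projects/%s/cases" % (BASE_URL, project),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST")


# A contributor with API access would then send the request with
# urllib.request.urlopen(req).
req = declare_case_request("myproject", "mycase")
print(req.full_url)
```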
How to push your results into the Test Database?

The test database is used to collect test results. By default it is enabled only for CI tests from Production CI pods.

Please note that it is possible to create your own local database.

A dedicated database is for instance created for each plugfest.

The architecture and associated API are described in the previous chapter. If you want to push your results from CI, you just have to call the API at the end of your script.

You can also reuse a python function defined in functest_utils.py [DEV2]
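For example, a CI script could end with a call like the following. This is a minimal sketch: the /results endpoint and the field names follow TestAPI conventions but should be verified against the swagger spec, and all values shown are placeholders.

```python
# Minimal sketch of pushing a test result at the end of a CI script.
# Endpoint and field names are assumptions based on TestAPI conventions.
import json
from urllib import request


def build_result_payload(project, case, pod, installer, version,
                         start_date, stop_date, criteria, details):
    """Assemble the JSON body expected by POST /api/v1/results."""
    return {
        "project_name": project,
        "case_name": case,
        "pod_name": pod,
        "installer": installer,
        "version": version,
        "start_date": start_date,
        "stop_date": stop_date,
        "criteria": criteria,   # e.g. "PASS" or "FAIL"
        "details": details,     # free-form dict with raw results
    }


def push_result(api_url, payload):
    """POST the payload to the results endpoint; returns the HTTP response."""
    req = request.Request(
        api_url + "/results",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST")
    return request.urlopen(req)
```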

Where can I find the documentation on the test API?

The Test API is now documented in this document (see sections above). You may also find autogenerated documentation at http://artifacts.opnfv.org/releng/docs/testapi.html. A web portal is also under construction for certification at http://testresults.opnfv.org/test/#/

I have tests, to which category should I declare them?

See table above.

The main ambiguity could be between the feature and VNF categories. Sometimes you have to spawn VMs to demonstrate the capabilities of the feature you introduced. In that case, we recommend declaring your test in the feature category.

The VNF category is dedicated to tests that include:

  • creation of resources
  • deployment of an orchestrator/VNFM
  • deployment of the VNF
  • test of the VNFM
  • freeing of the resources

The goal is not to study a particular feature on the infrastructure but to have a complete end-to-end test of a VNF automatically deployed in CI. Moreover, VNF tests are run in weekly jobs (once a week), while feature tests are run in daily jobs and are used to compute a scenario score.

Where are the logs of CI runs?

Logs and configuration files can be pushed to artifact server from the CI under http://artifacts.opnfv.org/<project name>

References

[DEV1]: OPNFV Testing Ecosystem

[DEV2]: Python code sample to push results into the Database

[DEV3]: Testing group wiki page

[DEV4]: Conversation with the testing community, OPNFV Beijing Summit

[DEV5]: GS NFV 003

IRC support chan: #opnfv-testperf

Bottlenecks

Bottlenecks Developer Guide
Bottlenecks - A Developer Quick Start
Introduction

This document provides a general view of the project for developers, as a quick guide.

Bottlenecks - Framework Guide
Introduction

This document provides a comprehensive guide on Bottlenecks testing framework development.

Bottlenecks - Unit & Coverage Test Guide
Introduction of the Rationale and Framework
What are Unit & Coverage Tests

A unit test is an automated code-level test for a small and fairly isolated piece of functionality, mostly at the level of individual functions. Unit tests should interact with external resources as little as possible, and should include testing of every corner case, including inputs that do not work.

Unit tests should always be pretty simple, by intent. There are a couple of ways to integrate unit tests into your development style [1]:

  • Test Driven Development, where unit tests are written prior to the functionality they’re testing
  • During refactoring, where existing code – sometimes code without any automated tests to start with – is retrofitted with unit tests as part of the refactoring process
  • Bug fix testing, where bugs are first pinpointed by a targeted test and then fixed
  • Straight test enhanced development, where tests are written organically as the code evolves.

Comprehensive and integrally designed unit tests serve as valuable validators of your APIs and functionality, and of the workflow that actually makes them executable. They make it possible to deliver your code more quickly.
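As a concrete illustration, a nose-style unit test for a small, isolated function can be as simple as the following (the function and file names are made up for the example):

```python
# my_math.py -- the code under test (hypothetical example)
def safe_div(a, b):
    """Return a/b, or None when b is zero (a corner case worth testing)."""
    if b == 0:
        return None
    return a / b


# test_my_math.py -- discovered automatically by nose (test_* naming)
def test_safe_div_nominal():
    assert safe_div(6, 3) == 2


def test_safe_div_by_zero():
    # corner case: also test the input that does not "work"
    assert safe_div(1, 0) is None
```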

Meanwhile, Coverage.py is a tool for measuring code coverage of Python programs. Combined with unit tests, it monitors your program, noting which parts of the code have been executed, then analyzes the source to identify code that could have been executed but was not.

Coverage measurement is typically used to gauge the effectiveness of tests. It can show which parts of your code are being exercised by tests, and which are not.

Why We Use a Framework and Nose

People use unit test discovery and execution frameworks so that they can focus on adding tests to existing code; the tests can then be triggered and the resulting reports obtained automatically.

In addition to adding and running your tests, frameworks can run tests selectively according to your requirements, add coverage and profiling information, and generate comprehensive reports.

There are many unit test frameworks in Python, and more arise every day. It would take some time to become familiar with even the well-known ones among the ever-arising frameworks. However, to us, it always matters more that you actually write tests for your code than how you write them. Plus, nose is quite stable, has been used by many projects, and can be adapted to mimic any other unit test discovery framework pretty easily. So, why not?

Principles of the Tests

Before you actually implement test codes for your software, please keep the following principles in mind [2]

  • A testing unit should focus on one tiny bit of functionality and prove it correct.
  • Each test unit must be fully independent. This is usually handled by setUp() and tearDown() methods.
  • Try hard to make tests that run fast.
  • Learn your tools and learn how to run a single test or a test case. Then, when developing a function inside a module, run this function’s tests frequently, ideally automatically when you save the code.
  • Always run the full test suite before a coding session, and run it again after. This will give you more confidence that you did not break anything in the rest of the code.
  • It is a good idea to implement a hook that runs all tests before pushing code to a shared repository.
  • If you are in the middle of a development session and have to interrupt your work, it is a good idea to write a broken unit test about what you want to develop next. When coming back to work, you will have a pointer to where you were and get back on track faster.
  • The first step when you are debugging your code is to write a new test pinpointing the bug, although this is not always possible.
  • Use long and descriptive names for testing functions. These function names are displayed when a test fails, and should be as descriptive as possible.
  • Well-designed tests can act as an introduction for new developers (read or write tests first before going into functionality development) and as demonstrations for maintainers.
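A hypothetical example illustrating two of the principles above, full test independence via setUp()/tearDown() and long, descriptive test names:

```python
# Hypothetical example: each test rebuilds its own fixture in setUp()
# and discards it in tearDown(), so the tests pass in any order, and
# the test names describe exactly what failed.
import unittest


class ListFixtureTest(unittest.TestCase):
    def setUp(self):
        # a fresh fixture per test keeps tests fully independent
        self.items = []

    def tearDown(self):
        self.items = None

    def test_append_adds_element_at_the_end(self):
        self.items.append("a")
        self.items.append("b")
        self.assertEqual(self.items[-1], "b")

    def test_pop_on_empty_list_raises_index_error(self):
        with self.assertRaises(IndexError):
            self.items.pop()
```

Run it with nosetests or python -m unittest; because the fixture is rebuilt per test, a single test can also be run in isolation.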
Offline Test

Below is some brief guidance for developing and testing your code on your local server, assuming that you already have Python installed. For a more detailed introduction, please refer to the websites of nose and coverage [3] [4].

Install Nose

Install nose using pip (or your OS's package manager). For example:

pip install nose

As to creating tests and a quick start, please refer to [5]

Run Tests

Nose comes with a command line utility called ‘nosetests’. The simplest usage is to call nosetests from within your project directory and pass a ‘tests’ directory as an argument. For example,

nosetests tests

The outputs could be similar to the following summary:

 % nosetests tests
....
----------------------------------------------------------------------
Ran 4 tests in 0.003s

OK
Adding Code Coverage

Coverage is a metric that complements your unit tests by overseeing the test code itself. Nose supports coverage testing via Coverage.py.

pip install coverage

To generate a coverage report using the nosetests utility, simply add the --with-coverage switch. By default, coverage generates data for all modules found in the current directory.

nosetests --with-coverage

% nosetests --with-coverage --cover-package a

The --cover-package switch can be used multiple times to restrict coverage measurement to the listed packages, avoiding useless information about third-party modules.

nosetests --with-coverage --cover-package a --cover-package b
....
Name    Stmts   Miss  Cover   Missing
-------------------------------------
a           8      0   100%
----------------------------------------------------------------------
Ran 4 tests in 0.006s

OK
OPNFV CI Verify Job

Assuming that you have already got the main idea of unit testing and have started programming your own tests under the Bottlenecks repo, the most important thing to clarify is that unit tests under Bottlenecks must be executable both offline and by the OPNFV CI pipeline. When you submit patches to the Bottlenecks repo, your patch should follow certain rules to enable the tests:

  • The Bottlenecks unit tests are triggered by OPNFV verify job of CI when you upload files to “test” directory.
  • You should add your --cover-package and test directory in ./verify.sh according to the above guides

After meeting the two rules, your patch will automatically be validated by the nose tests executed by the OPNFV verify job.

Bottlenecks - Package Guide
Introduction

This document provides a comprehensive guide on the packages library for developers.

Dovetail / OPNFV Verified Program

7. OVP Test Case Requirements
7.1. OVP Test Suite Purpose and Goals

The OVP test suite is intended to provide a method for validating the interfaces and behaviors of an NFVI platform according to the expected capabilities exposed in OPNFV. The behavioral foundation evaluated in these tests should serve to provide a functional baseline for VNF deployment and portability across NFVI instances. All OVP tests are available in open source and are executed in open source test frameworks.

7.2. Test case requirements

The following requirements are mandatory for a test to be submitted for consideration in the OVP test suite:

  • All test cases must be fully documented, in a common format. Please consider the existing OVP Test Specifications as examples.
    • Clearly identifying the test procedure and expected results / metrics to determine a “pass” or “fail” result.
  • Tests must be validated for the purpose of OVP; tests should be run with both an expected positive and negative outcome.
  • At the current stage of OVP, only functional tests are eligible, performance testing is out of scope.
    • Performance test output could be built in as “for information only”, but must not carry pass/fail metrics.
  • Test cases should favor implementation of a published standard interface for validation.
    • Where no standard is available provide API support references.
    • If a standard exists and is not followed, an exemption is required. Such exemptions can be raised in the project meetings first, and if no consensus can be reached, escalated to the TSC.
  • Test cases must pass on applicable OPNFV reference deployments and release versions.
    • Tests must not require a specific NFVI platform composition or installation tool.
      • Tests and test tools must run independently of the method of platform installation and architecture.
      • Tests and test tools must run independently of specific OPNFV components allowing different components such as storage backends or SDN controllers.
    • Tests must not require un-merged patches to the relevant upstream projects.
    • Tests must not require features or code which are out of scope for the latest release of the OPNFV project.
    • Tests must have a documented history of recent successful verification in OPNFV testing programs including CI, Functest, Yardstick, Bottlenecks, Dovetail, etc. (i.e., all testing programs in OPNFV that regularly validate tests against the release, whether automated or manual).
    • Tests must be considered optional unless they have a documented history for ALL OPNFV scenarios that are both
      • applicable, i.e., support the feature that the test exercises, and
      • released, i.e., in the OPNFV release supported by the OVP test suite version.
  • Tests must run against a fully deployed and operational system under test.
  • Tests and test implementations must support stand alone OPNFV and commercial OPNFV-derived solutions.
    • There can be no dependency on OPNFV resources or infrastructure.
    • Tests must not require external resources while a test is running, e.g., connectivity to the Internet. All resources required to run a test, e.g., VM and container images, are downloaded and installed as part of the system preparation and test tool installation.
  • The following things must be documented for the test case:
    • Use case specification
    • Test preconditions
    • Basic test flow execution description and test assertions
    • Pass fail criteria
  • The following things may be documented for the test case:
    • Parameter border test cases descriptions
    • Fault/Error test case descriptions
    • Post conditions where the system state may be left changed after completion

New test case proposals should complete an OVP test case worksheet to ensure that all of these considerations are met before the test case is approved for inclusion in the OVP test suite.

7.3. Dovetail Test Suite Naming Convention

Test case naming and structuring must comply with the following conventions. The fully qualified name of a test case must comprise three sections:

<testproject>.<test_area>.<test_case_name>

  • testproject: The fully qualified test case name must identify the test project which developed and maintains the test case.
  • test_area: The fully qualified test case name must identify the test case area. The test case area is a single word identifier describing the broader functional scope of a test case, such as ha (high-availability), tempest, vnf, etc.
  • test_case_name: The fully qualified test case name must include a concise description of the purpose of the test case.

An example of a fully qualified test case name is functest.tempest.compute.
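For illustration only (this helper is not part of Dovetail), the convention can be expressed as a small checker:

```python
# Illustrative checker for the <testproject>.<test_area>.<test_case_name>
# naming convention described above. The exact character rules are an
# assumption; the convention itself only mandates the three sections.
import re

# three dot-separated sections; the area is a single word identifier
NAME_RE = re.compile(r"^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*\.[a-z][a-z0-9_.]*$")


def is_valid_ovp_name(name):
    """Return True when the fully qualified test case name is well formed."""
    return bool(NAME_RE.match(name))


print(is_valid_ovp_name("functest.tempest.compute"))  # True
print(is_valid_ovp_name("compute"))                   # False: sections missing
```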

Functest

Functest Developer Guide
Introduction

Functest is a project dealing with functional testing. The project produces its own internal test cases but can also be considered as a framework to support feature and VNF onboarding project testing.

Therefore there are many ways to contribute to Functest. You can:

  • Develop new internal test cases
  • Integrate the tests from your feature project
  • Develop the framework to ease the integration of external test cases

Additional tasks involving Functest but addressing all the test projects may also be mentioned:

  • The API / Test collection framework
  • The dashboards
  • The automatic reporting portals
  • The testcase catalog

This document describes how, as a developer, you may interact with the Functest project. The first section details the main working areas of the project, and the second part is a list of "How To"s to help you join the Functest family, whatever your field of interest is.

Functest developer areas
Functest High level architecture

Functest is a project delivering test containers dedicated to OPNFV. It includes the tools, the scripts and the test scenarios. In Euphrates, Alpine containers have been introduced in order to lighten the containers and manage test slicing. The new containers are created according to the different tiers:

Standalone Functest Docker images are maintained for Euphrates, but Alpine containers are recommended.

Functest can be described as follows:

+----------------------+
|                      |
|   +--------------+   |                  +-------------------+
|   |              |   |    Public        |                   |
|   | Tools        |   +------------------+      OPNFV        |
|   | Scripts      |   |                  | System Under Test |
|   | Scenarios    |   |                  |                   |
|   |              |   |                  |                   |
|   +--------------+   |                  +-------------------+
|                      |
|    Functest Docker   |
|                      |
+----------------------+
Functest internal test cases

The internal test cases in Euphrates are:

  • api_check
  • connection_check
  • snaps_health_check
  • vping_ssh
  • vping_userdata
  • odl
  • rally_full
  • rally_sanity
  • tempest_smoke
  • tempest_full
  • cloudify_ims

By internal, we mean that these particular test cases have been developed and/or integrated by Functest contributors and the associated code is hosted in the Functest repository. An internal case can be fully developed or be a simple integration of upstream suites (e.g. Tempest/Rally developed in OpenStack, or the odl suites, which are just integrated in Functest).

The structure of this repository is detailed in [1]. The main internal test cases are in the opnfv_tests subfolder of the repository; they can be grouped by domain:

  • sdn: odl, odl_fds
  • openstack: api_check, connection_check, snaps_health_check, vping_ssh, vping_userdata, tempest_*, rally_*
  • vnf: cloudify_ims

If you want to create a new test case you will have to create a new folder under the testcases directory (See next section for details).

Functest external test cases

The external test cases are inherited from other OPNFV projects, especially the feature projects.

The external test cases are:

  • barometer
  • bgpvpn
  • doctor
  • domino
  • fds
  • promise
  • refstack_defcore
  • snaps_smoke
  • functest-odl-sfc
  • orchestra_clearwaterims
  • orchestra_openims
  • vyos_vrouter
  • juju_vepc

External test cases integrated in previous versions but not released in Euphrates:

  • copper
  • moon
  • netready
  • security_scan

The code to run these test cases is hosted in the repository of each project. Please note that the orchestra test cases are hosted in the Functest repository and not in an orchestra repository. The vyos_vrouter and juju_vepc code is also hosted in Functest as there are no dedicated projects.

Functest framework

Functest is a framework.

Historically, Functest has been released as a Docker file, including tools, scripts and a CLI to prepare the environment and run tests. It simplifies the integration of external test suites into the CI pipeline and provides commodity tools to collect and display results.

Since Colorado, test categories, also known as tiers, have been created to group similar tests, provide consistent sub-lists and, in the end, optimize test duration for CI (see the How To section).

The definition of the tiers has been agreed by the testing working group.

The tiers are:
  • healthcheck
  • smoke
  • features
  • components
  • vnf
Functest abstraction classes

In order to harmonize test integration, abstraction classes have been introduced:

  • testcase: base for any test case
  • unit: run unit tests as test case
  • feature: abstraction for feature project
  • vnf: abstraction for vnf onboarding

The goal is to unify the way to run tests in Functest.

Feature, unit, vnf and robotframework inherit from TestCase:

            +----------------------------------------------------------------+
            |                                                                |
            |                   TestCase                                     |
            |                                                                |
            |                   - init()                                     |
            |                   - run()                                      |
            |                   - push_to_db()                               |
            |                   - is_successful()                            |
            |                                                                |
            +----------------------------------------------------------------+
               |             |                 |                           |
               V             V                 V                           V
+--------------------+   +---------+   +------------------------+   +-----------------+
|                    |   |         |   |                        |   |                 |
|    feature         |   |  unit   |   |    vnf                 |   | robotframework  |
|                    |   |         |   |                        |   |                 |
|                    |   |         |   |- prepare()             |   |                 |
|  - execute()       |   |         |   |- deploy_orchestrator() |   |                 |
| BashFeature class  |   |         |   |- deploy_vnf()          |   |                 |
|                    |   |         |   |- test_vnf()            |   |                 |
|                    |   |         |   |- clean()               |   |                 |
+--------------------+   +---------+   +------------------------+   +-----------------+
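The hierarchy above can be sketched as follows. This is a self-contained illustration only: the real classes live in the Functest repository, so the attribute names, thresholds and return codes used here are assumptions.

```python
# Illustrative sketch of the TestCase/feature abstraction; names and
# return codes are assumptions, not the actual Functest API.
class TestCase(object):
    EX_OK, EX_TESTCASE_FAILED = 0, 1

    def __init__(self, case_name=""):
        self.case_name = case_name
        self.result = 0      # percentage of success (assumption)
        self.criteria = 100  # pass threshold (assumption)

    def run(self, **kwargs):
        raise NotImplementedError

    def is_successful(self):
        if self.result >= self.criteria:
            return TestCase.EX_OK
        return TestCase.EX_TESTCASE_FAILED

    def push_to_db(self):
        pass  # would POST the result to the TestAPI


class Feature(TestCase):
    def execute(self, **kwargs):
        """Return 0 on success; overridden by each feature project."""
        raise NotImplementedError

    def run(self, **kwargs):
        self.result = 100 if self.execute(**kwargs) == 0 else 0
        return self.is_successful()


class MyFeature(Feature):
    def execute(self, **kwargs):
        return 0  # the feature test itself would run here


assert MyFeature("myfeature").run() == TestCase.EX_OK
```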
Functest util classes

In order to simplify the creation of test cases, Functest also provides some utility functions that are used by internal test cases. Several features are supported, such as logging, configuration management and OpenStack capabilities (tacker, ...). These functions can be found under <repo>/functest/utils and can be described as follows:

functest/utils/
|-- config.py
|-- constants.py
|-- decorators.py
|-- env.py
|-- functest_utils.py
|-- openstack_tacker.py
`-- openstack_utils.py

It is recommended to use the SNAPS-OO library for deploying OpenStack instances. SNAPS [4] is an OPNFV project providing OpenStack utils.

TestAPI

Functest is using the Test collection framework and the TestAPI developed by the OPNFV community. See [5] for details.

Reporting

A web page is automatically generated every day to display the status based on jinja2 templates [3].

Dashboard

Additional dashboarding is managed at the testing group level, see [6] for details.

How TOs

See How to section on Functest wiki [7]

StorPerf

1. StorPerf Dev Guide
1.1. Initial Set up
1.1.1. Getting the Code

Replace YourLFID in the command below with your actual Linux Foundation ID.

git clone ssh://YourLFID@gerrit.opnfv.org:29418/storperf
1.1.2. Virtual Environment

It is preferred to use virtualenv for Python dependencies. This way you know exactly which libraries are needed, and you can restart from a clean state at any time to ensure no library is missing. Simply running the script:

ci/verify.sh

from inside the storperf directory will automatically create a virtualenv in the home directory called ‘storperf_venv’. This will be used as the Python interpreter for the IDE.

1.1.3. Docker Version

In order to run the full set of StorPerf services, docker and docker-compose are required to be installed. This requires docker 17.05 at a minimum.

https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/
1.2. IDE

While PyCharm is an excellent IDE, some aspects of it require licensing, and the PyDev plugin for Eclipse (packaged as LiClipse) is fully open source (although donations are welcome). Therefore this section focuses on using LiClipse for StorPerf development.

1.2.1. Download
http://www.liclipse.com/download.html
1.2.2. StorPerf virtualenv Interpreter

Setting up interpreter under PyDev (LiClipse):

  • Go to Project -> Properties, PyDev Interpreter.
  • Click to configure an interpreter not listed.
  • Click New, and create a new interpreter called StorPerf that points to your virtualenv.
  • Accept the pop up that appears.
  • You can then change the interpreter to StorPerf.
1.2.3. Code Formatting

Pep8 and Flake8 rules apply. These are part of the Gerrit checks, and style guideline enforcement will start soon.

  • Go to Window -> Preferences, under PyDev, Editor, Code Style, Code Formatter and select autopep8.py for code formatting.
  • Next, under Save Actions, enable "Auto-format editor contents before saving", and "Sort imports on save".
  • And under Imports, select Delete unused imports.
  • Go to PyDev -> Editor -> Code Analysis and under pycodestyle.py (pep8), select Pep8 as Error. This flags badly formatted lines as errors; they must be fixed before Jenkins will +1 any review.
1.2.4. Import Storperf as Git Project

I prefer to do the git clone from the command line, and then import that as a local project in LiClipse.

  • From the menu: File -> Import Project, and add your local clone as a Git repository.
  • Browse to the directory where you cloned StorPerf.
  • You should now have storperf as a valid local git repo.

  • Choose Import as general project
1.3. Unit Tests
1.3.1. Running from CLI

You technically already did when you ran:

ci/verify.sh

The shortcut to running the unit tests again from the command line is:

source ~/storperf_venv/bin/activate
nosetests --with-xunit \
      --with-coverage \
      --cover-package=storperf \
      --cover-xml \
      storperf

Note

You must be in the top level storperf directory in order to run the tests.

1.3.2. Set up under LiClipse

Running the tests:

Right click on the tests folder and select Run as Python Unit Test. Chances are, you’ll get:

Traceback (most recent call last):
  File "/home/mark/Documents/EMC/git/opnfv/storperf/storperf/tests/storperf_master_test.py", line 24, in setUp
    self.storperf = StorPerfMaster()
  File "/home/mark/Documents/EMC/git/opnfv/storperf/storperf/storperf_master.py", line 38, in __init__
    template_file = open("storperf/resources/hot/agent-group.yaml")
IOError: [Errno 2] No such file or directory: 'storperf/resources/hot/agent-group.yaml'

This means we need to set the working directory of the run configuration.

  • Under the menu: Run -> Run Configurations.
  • Go to the Arguments tab and change the radio button for Working Directory to "Default".
  • On the Interpreter tab, change the interpreter to StorPerf.
  • Click Apply. From now on, the run should be clean.
1.3.3. Adding builtins

For some reason, sqlite needs to be added as a builtin.

  • Go to Window -> Preferences, PyDev -> Interpreters -> Python Interpreter and select the StorPerf interpreter.
  • Go to the Forced Builtins tab, click New and add sqlite3.
1.4. Gerrit

Installing and configuring Git and Git-Review is necessary in order to follow this guide. The Getting Started page will provide you with some help for that.

1.4.1. Committing the code with Gerrit
  • Open a terminal window and set the project’s directory to the working directory using the cd command. In this case “/home/tim/OPNFV/storperf” is the path to the StorPerf project folder on my computer. Replace this with the path of your own project.
cd /home/tim/OPNFV/storperf
  • Start a new topic for your change.
git checkout -b TOPIC-BRANCH
  • Tell Git which files you would like to take into account for the next commit. This is called ‘staging’ the files, by placing them into the staging area, using the ‘git add’ command (or the synonym ‘git stage’ command).
git add storperf/utilities/math.py
git add storperf/tests/utilities/math.py
...
  • Alternatively, you can choose to stage all files that have been modified or added since the last commit by using the -A argument.
git add -A
  • Git won’t let you push (upload) any code to Gerrit if you haven’t pulled the latest changes first. So the next step is to pull (download) the latest changes made to the project by other collaborators using the ‘pull’ command.
git pull
  • Now that you have the latest version of the project and you have staged the files you wish to push, it is time to actually commit your work to your local Git repository.
git commit --signoff -m "Title of change

Test of change that describes in high level what
was done. There is a lot of documentation in code
so you do not need to repeat it here.

JIRA: STORPERF-54"

The message that is required for the commit should follow a specific set of rules. This practice standardizes the description messages attached to commits, and eventually makes it easier to navigate among them. This document proved to be very clear and useful to get started with that.

1.4.2. Pushing the code to Git for review
  • Now that the code has been committed into your local Git repository, the next step is to push it online to Gerrit for review. The command we will use is 'git review'.
git review
  • This will automatically push your local commit into Gerrit, and the command will return a Gerrit URL for the review.
  • The OPNFV-Gerrit-Bot in #opnfv-storperf IRC channel will send a message with the URL as well.
  • Copy/paste the URL into a web browser to get to the Gerrit code review you have just generated, and click the 'add' button to add reviewers to review your changes.

Note

Check out this section if the git review command returns to you with an “access denied” error.

1.4.3. Fetching a Git review

If you want to collaborate with another developer, you can fetch their review by the Gerrit change id (which is part of the URL, and listed in the top left as Change NNNNN).

git review -d 16213

would download the patchset for change 16213. If there were a topic branch associated with it, it would switch you to that branch, allowing you to look at different patch sets locally at the same time without conflicts.

1.4.4. Modifying the code under review in Gerrit

At the same time the code is being reviewed in Gerrit, you may need to edit it to make some changes and then send it back for review. The following steps go through the procedure.

  • Once you have edited your code files in your IDE, you have to stage them. The ‘status’ command is very helpful at this point, as it provides an overview of Git’s current state.
git status
submodules/storperf/docs/testing/developer/devguide/../images/git_status.png
  • The output of the command lists the files that have been modified since the latest commit (in this case storperf/tests/utilities/math.py and storperf/utilities/math.py were modified).
  • We can now stage the files modified as part of this revision of the Gerrit code review:
git add storperf/tests/utilities/math.py
git add storperf/utilities/math.py
  • The ‘git status’ command should reflect this:
submodules/storperf/docs/testing/developer/devguide/../images/git_status_2.png
  • It is now time to commit the newly modified files, but the objective here is not to create a new commit; we simply want to fold the new changes into the previous commit. We can achieve that with the ‘--amend’ option of the ‘commit’ command:
git commit --amend
submodules/storperf/docs/testing/developer/devguide/../images/amend_commit.png
  • If the amend was successful, the ‘status’ command should no longer list the updated files as staged for commit.
  • The final step is to push the newly amended commit to Gerrit.
git review
submodules/storperf/docs/testing/developer/devguide/../images/git_review_2.png

The Gerrit code review should be updated, which results in a ‘patch set 2’ notification appearing in the history log, ‘patch set 1’ being the original code review proposal.
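
The amend-and-resubmit cycle above can be exercised end to end in a throwaway repository; the sketch below is illustrative (the file name and commit message are invented), and ‘git review’ itself is omitted since it needs a Gerrit remote:

```shell
# Create a throwaway repository to demonstrate the amend cycle.
tmpdir=$(mktemp -d)
cd "$tmpdir"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

# Initial commit -- this stands in for the change under review.
echo "def f(): pass" > math.py
git add math.py
git commit -q -m "Add math helpers"

# Edit the file, stage it, and fold the change into the same commit.
echo "def g(): pass" >> math.py
git add math.py
git commit -q --amend --no-edit

# Still a single commit; pushing it with 'git review' would create
# patch set 2 of the same Gerrit change.
git rev-list --count HEAD   # prints 1
```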

1.4.5. If Gerrit upload is denied

The ‘git review’ command might fail with an “access denied” error that looks like this:

submodules/storperf/docs/testing/developer/devguide/../images/Access_denied.png

In this case, you need to make sure your Gerrit account has been added as a member of the StorPerf contributors group: ldap/opnfv-gerrit-storperf-contributors. You also want to check that you have signed the CLA (Contributor License Agreement); if not, you can sign it in the “Agreements” section of your Gerrit account:

submodules/storperf/docs/testing/developer/devguide/../images/CLA_agreement.png

VSPERF

OPNFV VSPERF Developer Guide
Introduction

VSPERF is an OPNFV testing project.

VSPERF provides an automated test-framework and comprehensive test suite based on Industry Test Specifications for measuring NFVI data-plane performance. The data-path includes switching technologies with physical and virtual network interfaces. The VSPERF architecture is switch and traffic generator agnostic, and test cases can be easily customized. VSPERF was designed to be independent of OpenStack; therefore, OPNFV installer scenarios are not required. VSPERF can source, configure and deploy the device-under-test using specified software versions and network topology. VSPERF is used as a development tool for optimizing switching technologies, qualifying packet processing functions and evaluating data-path performance.

The Euphrates release adds new features and improvements that will help advance high performance packet processing on Telco NFV platforms. This includes new test cases, flexibility in customizing test-cases, new results display options, improved tool resiliency, additional traffic generator support and VPP support.

VSPERF provides a framework where the entire NFV Industry can learn about NFVI data-plane performance and try-out new techniques together. A new IETF benchmarking specification (RFC8204) is based on VSPERF work contributed since 2015. VSPERF is also contributing to development of ETSI NFV test specifications through the Test and Open Source Working Group.

Design Guides
1. Traffic Generator Integration Guide
1.1. Intended Audience

This document is intended to aid those who want to integrate a new traffic generator into the vsperf code. It is expected that the reader has already read the generic part of the VSPERF Design Document.

Let us create a sample traffic generator called sample_tg, step by step.

1.2. Step 1 - create a directory

Traffic generator implementations are located in the tools/pkt_gen/ directory, where every implementation has its own dedicated sub-directory. A new directory must be created for the new traffic generator implementation.

E.g.

$ mkdir tools/pkt_gen/sample_tg
1.3. Step 2 - create a trafficgen module

Every trafficgen class must inherit from the generic ITrafficGenerator interface class. During its initialization, VSPERF scans the content of the pkt_gen directory for all Python classes that inherit from ITrafficGenerator. These classes are automatically added to the list of supported traffic generators.

Example:

Let us create a draft of tools/pkt_gen/sample_tg/sample_tg.py module.

from tools.pkt_gen import trafficgen

class SampleTG(trafficgen.ITrafficGenerator):
    """
    A sample traffic generator implementation
    """
    pass

VSPERF is immediately aware of the new class:

$ ./vsperf --list-trafficgen

Output should look like:

Classes derived from: ITrafficGenerator
======

* Dummy:            A dummy traffic generator whose data is generated by the user.

* IxNet:            A wrapper around IXIA IxNetwork applications.

* Ixia:             A wrapper around the IXIA traffic generator.

* Moongen:          Moongen Traffic generator wrapper.

* TestCenter:       Spirent TestCenter

* Trex:             Trex Traffic generator wrapper.

* Xena:             Xena Traffic generator wrapper class
1.4. Step 3 - configuration

All configuration values required for correct traffic generator operation are passed from VSPERF to the traffic generator in a dictionary. Default values shared among all traffic generators are defined in conf/03_traffic.conf within the TRAFFIC dictionary. They are loaded automatically by the ITrafficGenerator interface class, so there is no need to load them explicitly. Any traffic generator specific default values should be set within the class specific __init__ function.

VSPERF passes the test specific configuration to every start and send function within the traffic dictionary. The implementation of these functions must therefore ensure that the default values are updated with the testcase specific values. A proper merge of values is ensured by calling the merge_spec function from the conf module.

Example of merge_spec usage in tools/pkt_gen/sample_tg/sample_tg.py module:

from conf import merge_spec

def start_rfc2544_throughput(self, traffic=None, duration=30):
    self._params = {}
    self._params['traffic'] = self.traffic_defaults.copy()
    if traffic:
        self._params['traffic'] = merge_spec(
            self._params['traffic'], traffic)
1.5. Step 4 - generic functions

There are some generic functions which every traffic generator should provide. Although these functions are mainly optional, at least an empty implementation must be provided, so that the developer is explicitly aware of them.

The connect function is called by the traffic generator controller from its __enter__ method. It should ensure proper connection initialization between the DUT and the traffic generator. If no such initialization is needed, an empty implementation is required.

The disconnect function should clean up any connection specific actions performed by the connect function.

Example in tools/pkt_gen/sample_tg/sample_tg.py module:

def connect(self):
    pass

def disconnect(self):
    pass
1.6. Step 5 - supported traffic types

Currently VSPERF supports three different types of tests for traffic generators. These are identified in vsperf through the traffic type:

  • RFC2544 throughput - Send fixed size packets at different rates, using traffic configuration, until the minimum rate at which no packet loss is detected is found. Implementing methods have the suffix _rfc2544_throughput.

  • RFC2544 back2back - Send fixed size packets at a fixed rate, using traffic configuration, for a specified time interval. Implementing methods have the suffix _rfc2544_back2back.

  • continuous flow - Send fixed size packets at a given framerate, using traffic configuration, for a specified time interval. Implementing methods have the suffix _cont_traffic.

In general, both synchronous and asynchronous interfaces must be implemented for each traffic type. Synchronous functions start with the prefix send_. Asynchronous functions use the prefixes start_ and wait_ for the throughput and back2back types, and start_ and stop_ for the continuous traffic type.

Example of synchronous interfaces:

def send_rfc2544_throughput(self, traffic=None, tests=1, duration=20,
                            lossrate=0.0):
def send_rfc2544_back2back(self, traffic=None, tests=1, duration=20,
                           lossrate=0.0):
def send_cont_traffic(self, traffic=None, duration=20):

Example of asynchronous interfaces:

def start_rfc2544_throughput(self, traffic=None, tests=1, duration=20,
                             lossrate=0.0):
def wait_rfc2544_throughput(self):

def start_rfc2544_back2back(self, traffic=None, tests=1, duration=20,
                            lossrate=0.0):
def wait_rfc2544_back2back(self):

def start_cont_traffic(self, traffic=None, duration=20):
def stop_cont_traffic(self):
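
One common way to implement a start_/wait_ pair is to run the blocking send on a background thread. The sketch below is illustrative only and not part of the VSPERF API; the class name, the placeholder send implementation and the result dictionary are invented for the example:

```python
import threading

class AsyncSendSketch:
    """Illustrative start_/wait_ implementation on top of a blocking send."""

    def send_rfc2544_throughput(self, traffic=None, tests=1, duration=20,
                                lossrate=0.0):
        # Placeholder for the blocking measurement; a real trafficgen
        # would drive hardware or an external tool here.
        return {'throughput_fps': 1000.0}

    def start_rfc2544_throughput(self, traffic=None, tests=1, duration=20,
                                 lossrate=0.0):
        # Run the blocking send on a worker thread and return immediately.
        self._result = None

        def _worker():
            self._result = self.send_rfc2544_throughput(
                traffic, tests, duration, lossrate)

        self._thread = threading.Thread(target=_worker)
        self._thread.start()

    def wait_rfc2544_throughput(self):
        # Block until the background measurement finishes, then return it.
        self._thread.join()
        return self._result

tg = AsyncSendSketch()
tg.start_rfc2544_throughput()
print(tg.wait_rfc2544_throughput())  # {'throughput_fps': 1000.0}
```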

Description of parameters used by send, start, wait and stop functions:

  • param traffic: A dictionary with a detailed definition of the traffic pattern. It contains the following parameters to be implemented by the traffic generator.

    Note: The traffic dictionary also has virtual switch related parameters, which are not listed below.

    Note: There are parameters specific to testing of tunnelling protocols, which are discussed in detail at Integration tests userguide.

    Note: A detailed description of the TRAFFIC dictionary can be found at Configuration of TRAFFIC dictionary.

    • param traffic_type: One of the supported traffic types, e.g. rfc2544_throughput, rfc2544_continuous, rfc2544_back2back or burst.
    • param bidir: Specifies if generated traffic will be full-duplex (true) or half-duplex (false).
    • param frame_rate: Defines desired percentage of frame rate used during continuous stream tests.
    • param burst_size: Defines a number of frames in the single burst, which is sent by burst traffic type. Burst size is applied for each direction, i.e. the total number of tx frames will be 2*burst_size in case of bidirectional traffic.
    • param multistream: Defines number of flows simulated by traffic generator. Value 0 disables MultiStream feature.
    • param stream_type: Stream Type defines ISO OSI network layer used for simulation of multiple streams. Supported values:
      • L2 - iteration of destination MAC address
      • L3 - iteration of destination IP address
      • L4 - iteration of destination port of selected transport protocol
    • param l2: A dictionary with data link layer details, e.g. srcmac, dstmac and framesize.
    • param l3: A dictionary with network layer details, e.g. srcip, dstip, proto and l3 on/off switch enabled.
    • param l4: A dictionary with transport layer details, e.g. srcport, dstport and l4 on/off switch enabled.
    • param vlan: A dictionary with vlan specific parameters, e.g. priority, cfi, id and vlan on/off switch enabled.
    • param scapy: A dictionary with definition of the frame content for both traffic directions. The frame content is defined by a SCAPY notation.
  • param tests: Number of times the test is executed.

  • param duration: Duration of continuous test or per iteration duration in case of RFC2544 throughput or back2back traffic types.

  • param lossrate: Acceptable lossrate percentage.

1.7. Step 6 - passing back results

It is expected that methods send, wait and stop will return values measured by traffic generator within a dictionary. Dictionary keys are defined in ResultsConstants implemented in core/results/results_constants.py. Please check sections for RFC2544 Throughput & Continuous and for Back2Back. The same key names should be used by all traffic generator implementations.
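
A return value might be shaped as in the following sketch; the literal key strings below are placeholders for illustration only, and a real implementation should use the attributes of ResultsConstants from core/results/results_constants.py instead:

```python
# Sketch of a send_*/wait_*/stop_* return value keyed by
# ResultsConstants-style names. The key strings are assumed for the
# example and must be replaced by the real ResultsConstants attributes.
def send_rfc2544_throughput_result():
    return {
        'throughput_rx_fps': 14880952.0,   # frames per second received
        'throughput_rx_mbps': 10000.0,     # received throughput in Mbps
        'throughput_rx_percent': 100.0,    # percentage of line rate
        'frame_loss_percent': 0.0,         # measured loss at that rate
    }

result = send_rfc2544_throughput_result()
print(sorted(result))
```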

2. VSPERF Design Document
2.1. Intended Audience

This document is intended to aid those who want to modify or extend the vsperf code - for example, to add support for new traffic generators, deployment scenarios and so on.

2.2. Usage
2.2.1. Example Connectivity to DUT

Establish connectivity to the VSPERF DUT Linux host. If the host is in an OPNFV lab, follow the steps provided by Pharos to access the POD.

The following steps establish the VSPERF environment.

2.2.2. Example Command Lines

List all the cli options:

$ ./vsperf -h

Run all tests that have tput in their name - phy2phy_tput, pvp_tput etc.:

$ ./vsperf --tests 'tput'

As above, but override the default configuration with settings in ‘10_custom.conf’. This is useful because modifying the configuration directly in the conf/NN_*.py files shows up as changes under git source control:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf --tests 'tput'

Override specific test parameters. Useful for shortening the duration of tests for development purposes:

$ ./vsperf --test-params 'TRAFFICGEN_DURATION=10;TRAFFICGEN_RFC2544_TESTS=1;' \
                         'TRAFFICGEN_PKT_SIZES=(64,)' pvp_tput
2.3. Typical Test Sequence

This is a typical flow of control for a test.

_images/vsperf.png
2.4. Configuration

The conf package contains the configuration files (*.conf) for all system components; it also provides a settings object that exposes all of these settings.

Settings are not passed from component to component. Rather they are available globally to all components once they import the conf package.

from conf import settings
...
log_file = settings.getValue('LOG_FILE_DEFAULT')

Settings files (*.conf) are valid Python code, so settings can be complex types such as lists and dictionaries as well as scalar types:

first_packet_size = settings.getValue('PACKET_SIZE_LIST')[0]
2.4.1. Configuration Procedure and Precedence

Configuration files follow a strict naming convention that allows them to be processed in a specific order. All the .conf files are named NNx_name.conf, where NN is a decimal number and x is an optional alphabetical suffix. The files are processed in order from 00_name.conf to 99_name.conf (and from 00a_name to 00z_name), so that if the same setting is given in both a lower and a higher numbered conf file, the higher numbered file provides the effective value, as it is processed after the lower numbered file.

The values in the file specified by --conf-file take precedence over all the other configuration files, and the file does not have to follow the naming convention.
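
The precedence rules can be illustrated with a small sketch that merges per-file settings dictionaries in sorted filename order; the file names and their contents are invented for the example:

```python
# Each conf file is represented here as (filename, settings-dict).
# Files are merged in sorted-name order, so higher-numbered files
# override lower-numbered ones; a --conf-file entry is applied last.
conf_files = [
    ('02_vswitch.conf', {'VSWITCH': 'OvsDpdkVhost',
                         'LOG_FILE_DEFAULT': 'vsperf.log'}),
    ('00_common.conf', {'LOG_FILE_DEFAULT': 'default.log'}),
]
custom = {'VSWITCH': 'OvsVanilla'}   # e.g. from --conf-file 10_custom.conf

settings = {}
for _name, values in sorted(conf_files):
    settings.update(values)
settings.update(custom)              # --conf-file wins over everything

print(settings['LOG_FILE_DEFAULT'])  # vsperf.log (02_ overrides 00_)
print(settings['VSWITCH'])           # OvsVanilla (--conf-file wins)
```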

2.4.2. Configuration of PATHS dictionary

VSPERF uses external tools like Open vSwitch and Qemu for the execution of testcases. These tools may be downloaded and built automatically (see Installation) or installed manually by the user from binary packages. It is also possible to use a combination of both approaches, but it is essential to set the paths to all required tools correctly. These paths are stored in a PATHS dictionary, which is evaluated before the execution of each testcase in order to set up a testcase specific environment. Values selected for testcase execution are stored internally in the TOOLS dictionary, which VSPERF uses to execute external tools, load kernel modules, etc.

The default configuration of the PATHS dictionary is spread among three different configuration files to follow the logical grouping of configuration options. A basic description of the PATHS dictionary is placed inside conf/00_common.conf. The configuration specific to DPDK and vswitches is located in conf/02_vswitch.conf. The last part, related to Qemu, is defined inside conf/04_vnf.conf. The default configuration values can be used if all required tools were downloaded and built automatically by vsperf itself. If some of the tools were installed manually from binary packages, the content of the PATHS dictionary must be modified accordingly.

The dictionary has a specific section of configuration options for every tool type:

  • PATHS['vswitch'] - contains a separate dictionary for each of the vswitches supported by VSPERF

    Example:

    PATHS['vswitch'] = {
       'OvsDpdkVhost': { ... },
       'OvsVanilla' : { ... },
       ...
    }
    
  • PATHS['dpdk'] - contains paths to the dpdk sources, kernel modules and tools (e.g. testpmd)

    Example:

    PATHS['dpdk'] = {
       'type' : 'src',
       'src': {
           'path': os.path.join(ROOT_DIR, 'src/dpdk/dpdk/'),
           'modules' : ['uio', os.path.join(RTE_TARGET, 'kmod/igb_uio.ko')],
           'bind-tool': 'tools/dpdk*bind.py',
           'testpmd': os.path.join(RTE_TARGET, 'app', 'testpmd'),
       },
       ...
    }
    
  • PATHS['qemu'] - contains paths to the qemu sources and executable file

    Example:

    PATHS['qemu'] = {
        'type' : 'bin',
        'bin': {
            'qemu-system': 'qemu-system-x86_64'
        },
        ...
    }
    

Every section specific to a particular vswitch, dpdk or qemu may contain the following types of configuration options:

  • option type - a string which defines the type of configured paths (‘src’ or ‘bin’) to be selected for a given section:

    • value src means that VSPERF will use vswitch, DPDK or QEMU built from sources, e.g. by execution of the systems/build_base_machine.sh script during VSPERF installation
    • value bin means that VSPERF will use vswitch, DPDK or QEMU binaries installed directly in the operating system, e.g. via the OS specific packaging system
  • option path - a string with a valid system path; its content is checked for existence, prefixed with the section name and stored into TOOLS for later use, e.g. TOOLS['dpdk_src'] or TOOLS['vswitch_src']

  • option modules - a list of strings with names of kernel modules; every module name from the given list is checked for a ‘.ko’ suffix. If it matches and it is not an absolute path to the module, the module name is prefixed with the value of the path option defined for the same section

    Example:

    """
    snippet of PATHS definition from the configuration file:
    """
    PATHS['vswitch'] = {
        'OvsVanilla': {
            'type' : 'src',
            'src': {
                'path': '/tmp/vsperf/src_vanilla/ovs/ovs/',
                'modules' : ['datapath/linux/openvswitch.ko'],
                ...
            },
            ...
        }
        ...
    }
    
    """
    Final content of TOOLS dictionary used during runtime:
    """
    TOOLS['vswitch_modules'] = ['/tmp/vsperf/src_vanilla/ovs/ovs/datapath/linux/openvswitch.ko']
    
  • all other options are strings with names of and paths to specific tools; if a given string contains a relative path and the path option is defined for the given section, the string content will be prefixed with the content of path. Otherwise the tool name will be searched for within the standard system directories. If the filename contains OS specific wildcards, they will be expanded to the real path. At the end of the processing, every absolute path is checked for existence. If a temporary path (i.e. a path with a _tmp suffix) does not exist, a log entry is written and vsperf continues. If any other path does not exist, vsperf execution is terminated with a runtime error.

    Example:

    """
    snippet of PATHS definition from the configuration file:
    """
    PATHS['vswitch'] = {
        'OvsDpdkVhost': {
            'type' : 'src',
            'src': {
                'path': '/tmp/vsperf/src_vanilla/ovs/ovs/',
                'ovs-vswitchd': 'vswitchd/ovs-vswitchd',
                'ovsdb-server': 'ovsdb/ovsdb-server',
                ...
            }
            ...
        }
        ...
    }
    
    """
    Final content of TOOLS dictionary used during runtime:
    """
    TOOLS['ovs-vswitchd'] = '/tmp/vsperf/src_vanilla/ovs/ovs/vswitchd/ovs-vswitchd'
    TOOLS['ovsdb-server'] = '/tmp/vsperf/src_vanilla/ovs/ovs/ovsdb/ovsdb-server'
    

Note: If the bin type is set for DPDK, TOOLS['dpdk_src'] will still be set to the value of PATHS['dpdk']['src']['path']. The reason is that VSPERF uses the downloaded DPDK sources to copy DPDK and testpmd into the GUEST, where testpmd is built. If the DPDK sources are not available, vsperf will continue with test execution, but testpmd can’t be used as a guest loopback. This is acceptable when other guest loopback applications (e.g. buildin or l2fwd) are used.

Note: On RHEL 7.3, binary package configuration is required for Vanilla OVS tests. After installation of a supported rpm for OVS, there is a section in the conf/10_custom.conf file that can be used.
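
Such a binary-package override in a custom conf file might look like the following sketch. This is an assumption for illustration: the stand-in PATHS line replaces the defaults that the real conf files already provide, and the binary names and the layout of the 'bin' sub-dictionary should be checked against conf/02_vswitch.conf:

```python
# Illustrative 10_custom.conf override for a packaged (binary) Vanilla OVS.
# In a real conf file PATHS already exists; this stand-in replaces it so
# the snippet is self-contained.
PATHS = {'vswitch': {'OvsVanilla': {}}}

PATHS['vswitch']['OvsVanilla'].update({
    'type': 'bin',
    'bin': {
        # Bare names (no 'path' option), so the tools are searched for
        # within the standard system directories.
        'ovs-vswitchd': 'ovs-vswitchd',
        'ovsdb-server': 'ovsdb-server',
        # No '.ko' suffix: the module is loaded by name, not by file path.
        'modules': ['openvswitch'],
    },
})
```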

2.4.3. Configuration of TRAFFIC dictionary

The TRAFFIC dictionary is used for the configuration of the traffic generator. Default values can be found in the configuration file conf/03_traffic.conf. These default values can be modified by (the first option has the highest priority):

  1. Parameters section of testcase definition
  2. command line options specified by --test-params argument
  3. custom configuration file

Note that with options 1 and 2 it is possible to specify only the values which should be changed. With a custom configuration file, it is required to either specify the whole TRAFFIC dictionary with all its values or explicitly call the update() method of the TRAFFIC dictionary.
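
For example, a custom configuration file could update only selected keys instead of redefining the whole dictionary. In this sketch the stand-in TRAFFIC line represents the defaults already loaded from conf/03_traffic.conf, reduced to three keys to keep the snippet self-contained:

```python
# Stand-in for the TRAFFIC defaults from conf/03_traffic.conf.
TRAFFIC = {
    'traffic_type': 'rfc2544_throughput',
    'bidir': 'False',
    'frame_rate': 100,
}

# In a custom conf file, update only the keys that should change;
# all other defaults are left intact.
TRAFFIC.update({
    'traffic_type': 'rfc2544_continuous',
    'frame_rate': 50,
})
```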

Detailed description of TRAFFIC dictionary items follows:

'traffic_type'  - One of the supported traffic types.
                  E.g. rfc2544_throughput, rfc2544_back2back,
                  rfc2544_continuous or burst
                  Data type: str
                  Default value: "rfc2544_throughput".
'bidir'         - Specifies if generated traffic will be full-duplex (True)
                  or half-duplex (False)
                  Data type: str
                  Supported values: "True", "False"
                  Default value: "False".
'frame_rate'    - Defines desired percentage of frame rate used during
                  continuous stream tests.
                  Data type: int
                  Default value: 100.
'burst_size'    - Defines a number of frames in the single burst, which is sent
                  by burst traffic type. Burst size is applied for each direction,
                  i.e. the total number of tx frames will be 2*burst_size in case of
                  bidirectional traffic.
                  Data type: int
                  Default value: 100.
'multistream'   - Defines number of flows simulated by traffic generator.
                  Value 0 disables multistream feature
                  Data type: int
                  Supported values: 0-65536 for 'L4' stream type
                                    unlimited for 'L2' and 'L3' stream types
                  Default value: 0.
'stream_type'   - Stream type is an extension of the "multistream" feature.
                  If multistream is disabled, then stream type will be
                  ignored. Stream type defines ISO OSI network layer used
                  for simulation of multiple streams.
                  Data type: str
                  Supported values:
                     "L2" - iteration of destination MAC address
                     "L3" - iteration of destination IP address
                     "L4" - iteration of destination port
                            of selected transport protocol
                  Default value: "L4".
'pre_installed_flows'
               -  Pre-installed flows is an extension of the "multistream"
                  feature. If enabled, it will implicitly insert a flow
                  for each stream. If multistream is disabled, then
                  pre-installed flows will be ignored.
                  Data type: str
                  Supported values:
                     "Yes" - flows will be inserted into OVS
                     "No"  - flows won't be inserted into OVS
                  Default value: "No".
'flow_type'     - Defines flows complexity.
                  Data type: str
                  Supported values:
                     "port" - flow is defined by ingress ports
                     "IP"   - flow is defined by ingress ports
                              and src and dst IP addresses
                  Default value: "port"
'flow_control'  - Controls flow control support by traffic generator.
                  Supported values:
                     False  - flow control is disabled
                     True   - flow control is enabled
                  Default value: False
                  Note: Currently it is supported by IxNet only
'learning_frames' - Controls learning frames support by traffic generator.
                  Supported values:
                     False  - learning frames are disabled
                     True   - learning frames are enabled
                  Default value: True
                  Note: Currently it is supported by IxNet only
'l2'            - A dictionary with l2 network layer details. Supported
                  values are:
    'srcmac'    - Specifies source MAC address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "00:00:00:00:00:00".
    'dstmac'    - Specifies destination MAC address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "00:00:00:00:00:00".
    'framesize' - Specifies default frame size. This value should not be
                  changed directly. It will be overridden during testcase
                  execution by values specified by list TRAFFICGEN_PKT_SIZES.
                  Data type: int
                  Default value: 64
'l3'            - A dictionary with l3 network layer details. Supported
                  values are:
    'enabled'   - Specifies if l3 layer should be enabled or disabled.
                  Data type: bool
                  Default value: True
                  NOTE: Supported only by IxNet trafficgen class
    'srcip'     - Specifies source IP address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "1.1.1.1".
    'dstip'     - Specifies destination IP address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "90.90.90.90".
    'proto'     - Specifies default protocol type.
                  Please check particular traffic generator implementation
                  for supported protocol types.
                  Data type: str
                  Default value: "udp".
'l4'            - A dictionary with l4 network layer details. Supported
                  values are:
    'enabled'   - Specifies if l4 layer should be enabled or disabled.
                  Data type: bool
                  Default value: True
                  NOTE: Supported only by IxNet trafficgen class
    'srcport'   - Specifies source port of selected transport protocol.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: int
                  Default value: 3000
    'dstport'   - Specifies destination port of selected transport protocol.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: int
                  Default value: 3001
'vlan'          - A dictionary with vlan encapsulation details. Supported
                  values are:
    'enabled'   - Specifies if vlan encapsulation should be enabled or
                  disabled.
                  Data type: bool
                  Default value: False
    'id'        - Specifies vlan id.
                  Data type: int (NOTE: must fit to 12 bits)
                  Default value: 0
    'priority'  - Specifies a vlan priority (PCP header field).
                  Data type: int (NOTE: must fit to 3 bits)
                  Default value: 0
    'cfi'       - Specifies if frames can or cannot be dropped during
                  congestion (DEI header field).
                  Data type: int (NOTE: must fit to 1 bit)
                  Default value: 0
'capture'       - A dictionary with traffic capture configuration.
                  NOTE: It is supported only by T-Rex traffic generator.
    'enabled'   - Specifies if traffic should be captured
                  Data type: bool
                  Default value: False
    'tx_ports'  - A list of ports, where frames transmitted towards DUT will
                  be captured. Ports have numbers 0 and 1. TX packet capture
                  is disabled if list of ports is empty.
                  Data type: list
                  Default value: [0]
    'rx_ports'  - A list of ports, where frames received from DUT will
                  be captured. Ports have numbers 0 and 1. RX packet capture
                  is disabled if list of ports is empty.
                  Data type: list
                  Default value: [1]
    'count'     - A number of frames to be captured. The same count value
                  is applied to both TX and RX captures.
                  Data type: int
                  Default value: 1
    'filter'    - An expression used to filter TX and RX packets. It uses the same
                  syntax as pcap library. See pcap-filter man page for additional
                  details.
                  Data type: str
                  Default value: ''
    'scapy'     - A dictionary with definition of a frame content for both traffic
                  directions. The frame content is defined by a SCAPY notation.
                  NOTE: It is supported only by the T-Rex traffic generator.
                  Following keywords can be used to refer to the related parts of
                  the TRAFFIC dictionary:
                       Ether_src   - refers to TRAFFIC['l2']['srcmac']
                       Ether_dst   - refers to TRAFFIC['l2']['dstmac']
                       IP_proto    - refers to TRAFFIC['l3']['proto']
                       IP_PROTO    - refers to upper case version of TRAFFIC['l3']['proto']
                       IP_src      - refers to TRAFFIC['l3']['srcip']
                       IP_dst      - refers to TRAFFIC['l3']['dstip']
                       IP_PROTO_sport - refers to TRAFFIC['l4']['srcport']
                       IP_PROTO_dport - refers to TRAFFIC['l4']['dstport']
                       Dot1Q_prio  - refers to TRAFFIC['vlan']['priority']
                       Dot1Q_id    - refers to TRAFFIC['vlan']['cfi']
                       Dot1Q_vlan  - refers to TRAFFIC['vlan']['id']
        '0'     - A string with the frame definition for the 1st direction.
                  Data type: str
                  Default value: 'Ether(src={Ether_src}, dst={Ether_dst})/'
                                 'Dot1Q(prio={Dot1Q_prio}, id={Dot1Q_id}, vlan={Dot1Q_vlan})/'
                                 'IP(proto={IP_proto}, src={IP_src}, dst={IP_dst})/'
                                 '{IP_PROTO}(sport={IP_PROTO_sport}, dport={IP_PROTO_dport})'
        '1'     - A string with the frame definition for the 2nd direction.
                  Data type: str
                  Default value: 'Ether(src={Ether_dst}, dst={Ether_src})/'
                                 'Dot1Q(prio={Dot1Q_prio}, id={Dot1Q_id}, vlan={Dot1Q_vlan})/'
                                 'IP(proto={IP_proto}, src={IP_dst}, dst={IP_src})/'
                                 '{IP_PROTO}(sport={IP_PROTO_dport}, dport={IP_PROTO_sport})',
2.4.4. Configuration of GUEST options

VSPERF is able to set up scenarios involving a number of VMs in series or in parallel. All configuration options related to a particular VM instance are defined as lists and prefixed with the GUEST_ label. It is essential that there are enough items in all GUEST_ options to cover all VM instances involved in the test. If there are not enough items, VSPERF will use the first item of the particular GUEST_ option to expand the list to the required length.

Example of option expansion for 4 VMs:

"""
Original values:
"""
GUEST_SMP = ['2']
GUEST_MEMORY = ['2048', '4096']

"""
Values after automatic expansion:
"""
GUEST_SMP = ['2', '2', '2', '2']
GUEST_MEMORY = ['2048', '4096', '2048', '2048']
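The expansion rule above can be sketched as a small helper. This is purely illustrative; the function name is invented here and does not correspond to VSPERF's internal implementation:

```python
def expand_guest_option(values, vm_count):
    """Pad a GUEST_ option list with its first item until it covers vm_count VMs."""
    if len(values) >= vm_count:
        return values[:vm_count]
    return values + [values[0]] * (vm_count - len(values))

# Mirrors the example above for 4 VMs:
print(expand_guest_option(['2'], 4))             # ['2', '2', '2', '2']
print(expand_guest_option(['2048', '4096'], 4))  # ['2048', '4096', '2048', '2048']
```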

The first item of an option can contain macros starting with # to generate VM-specific values. These macros can be used only for options of list or str types with the GUEST_ prefix.

Example of macros and their expansion for 2 VMs:

"""
Original values:
"""
GUEST_SHARE_DIR = ['/tmp/qemu#VMINDEX_share']
GUEST_BRIDGE_IP = ['#IP(1.1.1.5)/16']

"""
Values after automatic expansion:
"""
GUEST_SHARE_DIR = ['/tmp/qemu0_share', '/tmp/qemu1_share']
GUEST_BRIDGE_IP = ['1.1.1.5/16', '1.1.1.6/16']

Additional examples are available at 04_vnf.conf.

Note: If a macro is detected in the first item of the list, then all other items are ignored and the list content is created automatically.

Multiple macros can be used inside one configuration option definition, but macros cannot be nested inside other macros. The only exception is the macro #VMINDEX, which is expanded first and thus can be used inside other macros.

Following macros are supported:

  • #VMINDEX - it is replaced by the index of the VM being executed; this macro is expanded first, so it can be used inside other macros.

    Example:

    GUEST_SHARE_DIR = ['/tmp/qemu#VMINDEX_share']
    
  • #MAC(mac_address[, step]) - it will iterate the given mac_address with an optional step. If step is not defined, it defaults to 1, i.e. the first VM will use the value of mac_address, the second VM the value of mac_address increased by step, etc.

    Example:

    GUEST_NICS = [[{'mac' : '#MAC(00:00:00:00:00:01,2)'}]]
    
  • #IP(ip_address[, step]) - it will iterate the given ip_address with an optional step. If step is not defined, it defaults to 1, i.e. the first VM will use the value of ip_address, the second VM the value of ip_address increased by step, etc.

    Example:

    GUEST_BRIDGE_IP = ['#IP(1.1.1.5)/16']
    
  • #EVAL(expression) - it will evaluate the given expression as Python code. Only simple expressions should be used; function calls are not supported.

    Example:

    GUEST_CORE_BINDING = [('#EVAL(6+2*#VMINDEX)', '#EVAL(7+2*#VMINDEX)')]
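As an illustration of how such a macro expands (a hypothetical sketch using Python's ipaddress module, not VSPERF's actual parser), the #IP macro could be modeled like this:

```python
import ipaddress
import re

def expand_ip_macro(template, vm_count):
    """Expand a '#IP(addr[, step])' macro into one value per VM (illustrative only)."""
    match = re.search(r'#IP\(([^,)]+)(?:,\s*(\d+))?\)', template)
    addr = ipaddress.ip_address(match.group(1).strip())
    step = int(match.group(2)) if match.group(2) else 1
    return [template.replace(match.group(0), str(addr + i * step))
            for i in range(vm_count)]

# Matches the earlier GUEST_BRIDGE_IP example for 2 VMs:
print(expand_ip_macro('#IP(1.1.1.5)/16', 2))  # ['1.1.1.5/16', '1.1.1.6/16']
```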
    
2.4.5. Other Configuration

conf.settings also loads configuration from the command line and from the environment.

2.5. PXP Deployment

Every testcase uses one of the supported deployment scenarios to set up the test environment. The controller responsible for a given scenario configures flows in the vswitch to route traffic among physical interfaces connected to the traffic generator and virtual machines. VSPERF supports several deployments, including the PXP deployment, which can set up various scenarios with multiple VMs.

These scenarios are realized by the VswitchControllerPXP class, which can configure and execute a given number of VMs in serial or parallel configurations. Every VM can be configured with either one interface or an even number of interfaces. If a VM has more than 2 interfaces, traffic is routed among pairs of interfaces.

Example of traffic routing for VM with 4 NICs in serial configuration:

         +------------------------------------------+
         |  VM with 4 NICs                          |
         |  +---------------+    +---------------+  |
         |  |  Application  |    |  Application  |  |
         |  +---------------+    +---------------+  |
         |      ^       |            ^       |      |
         |      |       v            |       v      |
         |  +---------------+    +---------------+  |
         |  | logical ports |    | logical ports |  |
         |  |   0       1   |    |   2       3   |  |
         +--+---------------+----+---------------+--+
                ^       :            ^       :
                |       |            |       |
                :       v            :       v
+-----------+---------------+----+---------------+----------+
| vSwitch   |   0       1   |    |   2       3   |          |
|           | logical ports |    | logical ports |          |
| previous  +---------------+    +---------------+   next   |
| VM or PHY     ^       |            ^       |     VM or PHY|
|   port   -----+       +------------+       +--->   port   |
+-----------------------------------------------------------+

It is also possible to define a different number of interfaces for each VM to better simulate real scenarios.

Example of traffic routing for 2 VMs in serial configuration, where the 1st VM has 4 NICs and the 2nd VM has 2 NICs:

         +------------------------------------------+  +---------------------+
         |  1st VM with 4 NICs                      |  |  2nd VM with 2 NICs |
         |  +---------------+    +---------------+  |  |  +---------------+  |
         |  |  Application  |    |  Application  |  |  |  |  Application  |  |
         |  +---------------+    +---------------+  |  |  +---------------+  |
         |      ^       |            ^       |      |  |      ^       |      |
         |      |       v            |       v      |  |      |       v      |
         |  +---------------+    +---------------+  |  |  +---------------+  |
         |  | logical ports |    | logical ports |  |  |  | logical ports |  |
         |  |   0       1   |    |   2       3   |  |  |  |   0       1   |  |
         +--+---------------+----+---------------+--+  +--+---------------+--+
                ^       :            ^       :               ^       :
                |       |            |       |               |       |
                :       v            :       v               :       v
+-----------+---------------+----+---------------+-------+---------------+----------+
| vSwitch   |   0       1   |    |   2       3   |       |   4       5   |          |
|           | logical ports |    | logical ports |       | logical ports |          |
| previous  +---------------+    +---------------+       +---------------+   next   |
| VM or PHY     ^       |            ^       |               ^       |     VM or PHY|
|   port   -----+       +------------+       +---------------+       +---->  port   |
+-----------------------------------------------------------------------------------+

The number of VMs involved in the test and the type of their connection is defined by the deployment name as follows:

  • pvvp[number] - configures a scenario with VMs connected in series; the number of VMs is optional and defaults to 2.

    Example of 2 VMs in a serial configuration:

    +----------------------+  +----------------------+
    |   1st VM             |  |   2nd VM             |
    |   +---------------+  |  |   +---------------+  |
    |   |  Application  |  |  |   |  Application  |  |
    |   +---------------+  |  |   +---------------+  |
    |       ^       |      |  |       ^       |      |
    |       |       v      |  |       |       v      |
    |   +---------------+  |  |   +---------------+  |
    |   | logical ports |  |  |   | logical ports |  |
    |   |   0       1   |  |  |   |   0       1   |  |
    +---+---------------+--+  +---+---------------+--+
            ^       :                 ^       :
            |       |                 |       |
            :       v                 :       v
    +---+---------------+---------+---------------+--+
    |   |   0       1   |         |   3       4   |  |
    |   | logical ports | vSwitch | logical ports |  |
    |   +---------------+         +---------------+  |
    |       ^       |                 ^       |      |
    |       |       +-----------------+       v      |
    |   +----------------------------------------+   |
    |   |              physical ports            |   |
    |   |      0                         1       |   |
    +---+----------------------------------------+---+
               ^                         :
               |                         |
               :                         v
    +------------------------------------------------+
    |                                                |
    |                traffic generator               |
    |                                                |
    +------------------------------------------------+
    
  • pvpv[number] - configures a scenario with VMs connected in parallel; the number of VMs is optional and defaults to 2. The multistream feature is used to route traffic to particular VMs (or NIC pairs of every VM), i.e. VSPERF will enable the multistream feature and set the number of streams to the number of VMs and their NIC pairs. Traffic will be dispatched based on Stream Type, i.e. by UDP port, IP address or MAC address.

    Example of 2 VMs in a parallel configuration, where traffic is dispatched
    based on the UDP port.

    +----------------------+  +----------------------+
    |   1st VM             |  |   2nd VM             |
    |   +---------------+  |  |   +---------------+  |
    |   |  Application  |  |  |   |  Application  |  |
    |   +---------------+  |  |   +---------------+  |
    |       ^       |      |  |       ^       |      |
    |       |       v      |  |       |       v      |
    |   +---------------+  |  |   +---------------+  |
    |   | logical ports |  |  |   | logical ports |  |
    |   |   0       1   |  |  |   |   0       1   |  |
    +---+---------------+--+  +---+---------------+--+
            ^       :                 ^       :
            |       |                 |       |
            :       v                 :       v
    +---+---------------+---------+---------------+--+
    |   |   0       1   |         |   3       4   |  |
    |   | logical ports | vSwitch | logical ports |  |
    |   +---------------+         +---------------+  |
    |      ^         |                 ^       :     |
    |      |     ......................:       :     |
    |  UDP | UDP :   |                         :     |
    |  port| port:   +--------------------+    :     |
    |   0  |  1  :                        |    :     |
    |      |     :                        v    v     |
    |   +----------------------------------------+   |
    |   |              physical ports            |   |
    |   |    0                               1   |   |
    +---+----------------------------------------+---+
             ^                               :
             |                               |
             :                               v
    +------------------------------------------------+
    |                                                |
    |                traffic generator               |
    |                                                |
    +------------------------------------------------+
    

The PXP deployment is backward compatible with the PVP deployment: pvp is an alias for pvvp1 and executes just one VM.

The number of interfaces used by VMs is defined by the configuration option GUEST_NICS_NR. If more than one pair of interfaces is defined for a VM, then:

  • for the pvvp (serial) scenario, every NIC pair is connected in series before the connection to the next VM is created
  • for the pvpv (parallel) scenario, every NIC pair is directly connected to the physical ports and a unique traffic stream is assigned to it

Examples:

  • Deployment pvvp10 will start 10 VMs and connect them in series
  • Deployment pvpv4 will start 4 VMs and connect them in parallel
  • Deployment pvpv1 and GUEST_NICS_NR = [4] will start 1 VM with 4 interfaces; every NIC pair is directly connected to the physical ports
  • Deployment pvvp and GUEST_NICS_NR = [2, 4] will start 2 VMs; the 1st VM will have 2 interfaces and the 2nd VM 4 interfaces. These interfaces will be connected in series, i.e. traffic will flow as follows: PHY1 -> VM1_1 -> VM1_2 -> VM2_1 -> VM2_2 -> VM2_3 -> VM2_4 -> PHY2

Note: If only 1 or more than 2 NICs are configured for a VM, then testpmd should be used as the forwarding application inside the VM, as it is able to forward traffic between multiple VM NIC pairs.

Note: In case of linux_bridge, all NICs are connected to the same bridge inside the VM.

Note: If the multistream feature is configured and pre_installed_flows is set to Yes, then stream-specific flows will be inserted only for connections originating at physical ports. The rest of the flows will be based on port numbers only. The same logic applies if the flow_type TRAFFIC option is set to ip. This configuration avoids a testcase malfunction if frame headers are modified inside the VM (e.g. MAC swap or IP change).

2.6. VM, vSwitch, Traffic Generator Independence

VSPERF supports different VSwitches, Traffic Generators, VNFs and Forwarding Applications by using standard object-oriented polymorphism:

  • Support for vSwitches is implemented by a class inheriting from IVSwitch.
  • Support for Traffic Generators is implemented by a class inheriting from ITrafficGenerator.
  • Support for VNF is implemented by a class inheriting from IVNF.
  • Support for Forwarding Applications is implemented by a class inheriting from IPktFwd.

By dealing only with the abstract interfaces the core framework can support many implementations of different vSwitches, Traffic Generators, VNFs and Forwarding Applications.
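The pattern can be sketched with a minimal example (illustrative class and method names only; the real interfaces, listed below, carry more methods):

```python
from abc import ABC, abstractmethod

class IVSwitch(ABC):
    """Minimal slice of a vSwitch interface (illustrative)."""
    @abstractmethod
    def start(self): ...
    @abstractmethod
    def add_flow(self, switch_name, flow): ...

class DummySwitch(IVSwitch):
    """Stand-in implementation; a real one would drive OVS, VPP, etc."""
    def __init__(self):
        self.flows = []
    def start(self):
        pass
    def add_flow(self, switch_name, flow):
        self.flows.append((switch_name, flow))

def configure_p2p(vswitch: IVSwitch):
    """A controller-style function that works against the interface only."""
    vswitch.start()
    vswitch.add_flow('br0', {'in_port': 1, 'actions': ['output:2']})
    vswitch.add_flow('br0', {'in_port': 2, 'actions': ['output:1']})

sw = DummySwitch()
configure_p2p(sw)
print(len(sw.flows))  # 2
```

Because configure_p2p depends only on the abstract interface, any conforming vSwitch implementation can be substituted without changing the controller logic.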

2.6.1. IVSwitch
class IVSwitch:
  start(self)
  stop(self)
  add_switch(switch_name)
  del_switch(switch_name)
  add_phy_port(switch_name)
  add_vport(switch_name)
  get_ports(switch_name)
  del_port(switch_name, port_name)
  add_flow(switch_name, flow)
  del_flow(switch_name, flow=None)
2.6.2. ITrafficGenerator
class ITrafficGenerator:
  connect()
  disconnect()

  send_burst_traffic(traffic, time)

  send_cont_traffic(traffic, time, framerate)
  start_cont_traffic(traffic, time, framerate)
  stop_cont_traffic(self)

  send_rfc2544_throughput(traffic, tests, duration, lossrate)
  start_rfc2544_throughput(traffic, tests, duration, lossrate)
  wait_rfc2544_throughput(self)

  send_rfc2544_back2back(traffic, tests, duration, lossrate)
  start_rfc2544_back2back(traffic, tests, duration, lossrate)
  wait_rfc2544_back2back()

Note: send_xxx() blocks, whereas start_xxx() does not and must be followed by a subsequent call to wait_xxx().
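The blocking/non-blocking split can be sketched with a toy generator (purely illustrative; MockTrafficGenerator and its fixed result are invented for this sketch and are not part of VSPERF):

```python
import threading
import time

class MockTrafficGenerator:
    """Toy generator showing the blocking send_* vs. start_*/wait_* pattern."""
    def send_rfc2544_throughput(self, traffic, tests, duration, lossrate):
        # Blocking variant: measure and return the result directly.
        return self._measure(traffic, duration)

    def start_rfc2544_throughput(self, traffic, tests, duration, lossrate):
        # Non-blocking variant: kick off the measurement in the background.
        self._result = None
        self._thread = threading.Thread(
            target=lambda: setattr(self, '_result',
                                   self._measure(traffic, duration)))
        self._thread.start()

    def wait_rfc2544_throughput(self):
        # Collect the result of a previously started measurement.
        self._thread.join()
        return self._result

    def _measure(self, traffic, duration):
        time.sleep(0.01)  # stand-in for the real measurement
        return {'throughput_fps': 1000}

gen = MockTrafficGenerator()
blocking = gen.send_rfc2544_throughput({}, tests=1, duration=60, lossrate=0.0)
gen.start_rfc2544_throughput({}, tests=1, duration=60, lossrate=0.0)
non_blocking = gen.wait_rfc2544_throughput()
print(blocking == non_blocking)  # True
```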

2.6.3. IVnf
class IVnf:
  start(memory, cpus,
        monitor_path, shared_path_host,
        shared_path_guest, guest_prompt)
  stop()
  execute(command)
  wait(guest_prompt)
  execute_and_wait(command)
2.6.4. IPktFwd
class IPktFwd:
    start()
    stop()
2.6.5. Controllers

Controllers are used in conjunction with abstract interfaces as a way of decoupling the control of vSwitches, VNFs, Traffic Generators and Forwarding Applications from other components.

The controlled classes provide basic primitive operations. The controllers sequence and co-ordinate these primitive operations into useful actions. For instance, the vswitch_controller_p2p can be used to bring any vSwitch (that implements the primitives defined in IVSwitch) into the configuration required by the Phy-to-Phy Deployment Scenario.

In order to support a new vSwitch, only a new implementation of IVSwitch needs to be created for the new vSwitch to be capable of fulfilling all the Deployment Scenarios provided for by existing or future vSwitch Controllers.

Similarly, if a new Deployment Scenario is required, it only needs to be written once as a new vSwitch Controller and it will immediately be capable of controlling all existing and future vSwitches in that Deployment Scenario.

Similarly, the Traffic Controllers can be used to co-ordinate the basic operations provided by implementers of ITrafficGenerator into useful tests. Traffic generators generally already implement full test cases, i.e. they both generate suitable traffic and analyse returned traffic in order to implement a test that has typically been predefined in an RFC document. However, the Traffic Controller class allows for further enhancement, such as iterating over tests for various packet sizes or creating new tests.

2.6.6. Traffic Controller’s Role

_images/traffic_controller.png
2.6.7. Loader & Component Factory

The working of the Loader package (which is responsible for finding arbitrary classes based on configuration data) and of the Component Factory (which is responsible for choosing the correct class for a particular situation, e.g. a Deployment Scenario) can be seen in this diagram.

_images/factory_and_loader.png
2.7. Routing Tables

Vsperf uses a standard set of routing tables in order to allow tests to easily mix and match Deployment Scenarios (PVP, P2P topology), Tuple Matching and Frame Modification requirements.

The usage of routing tables is driven by the configuration parameter OVS_ROUTING_TABLES. Routing tables are disabled by default (i.e. the parameter is set to False) for better comparison of results among supported vSwitches (e.g. OVS vs. VPP).

+--------------+
|              |
| Table 0      |  table#0 - Match table. Flows designed to force 5 & 10
|              |  tuple matches go here.
|              |
+--------------+
       |
       |
       v
+--------------+  table#1 - Routing table. Flow entries to forward
|              |  packets between ports goes here.
| Table 1      |  The chosen port is communicated to subsequent tables by
|              |  setting the metadata value to the egress port number.
|              |  Generally this table is set up by the
+--------------+  vSwitchController.
       |
       |
       v
+--------------+  table#2 - Frame modification table. Frame modification
|              |  flow rules are isolated in this table so that they can
| Table 2      |  be turned on or off without affecting the routing or
|              |  tuple-matching flow rules. This allows the frame
|              |  modification and tuple matching required by the tests
|              |  in the VSWITCH PERFORMANCE FOR TELCO NFV test
+--------------+  specification to be independent of the Deployment
       |          Scenario set up by the vSwitchController.
       |
       v
+--------------+
|              |
| Table 3      |  table#3 - Egress table. Egress packets on the ports
|              |  setup in Table 1.
+--------------+
3. VSPERF LEVEL TEST DESIGN (LTD)
3.1. Introduction

The intention of this Level Test Design (LTD) document is to specify the set of tests to carry out in order to objectively measure the current characteristics of a virtual switch in the Network Function Virtualization Infrastructure (NFVI), as well as the test pass criteria. The detailed test cases will be defined in details-of-LTD, preceded by the doc-id-of-LTD and the scope-of-LTD.

This document is currently in draft form.

3.1.1. Document identifier

The document id will be used to uniquely identify versions of the LTD. The format of the document id is OPNFV_vswitchperf_LTD_REL_STATUS, where the status is one of: draft, reviewed, corrected or final. The document id for this version of the LTD is OPNFV_vswitchperf_LTD_Brahmaputra_REVIEWED.

3.1.2. Scope

The main purpose of this project is to specify a suite of performance tests in order to objectively measure the current packet transfer characteristics of a virtual switch in the NFVI. The intent of the project is to facilitate testing of any virtual switch. Thus, a generic suite of tests shall be developed, with no hard dependencies to a single implementation. In addition, the test case suite shall be architecture independent.

The test cases developed in this project shall not form part of a separate test framework; all of these tests may be inserted into the Continuous Integration Test Framework and/or the Platform Functionality Test Framework, if a vSwitch becomes a standard component of an OPNFV release.

3.2. Details of the Level Test Design

This section describes the features to be tested (FeaturesToBeTested-of-LTD), and identifies the sets of test cases or scenarios (TestIdentification-of-LTD).

3.2.1. Features to be tested

Characterizing virtual switches (i.e. Device Under Test (DUT) in this document) includes measuring the following performance metrics:

  • Throughput
  • Packet delay
  • Packet delay variation
  • Packet loss
  • Burst behaviour
  • Packet re-ordering
  • Packet correctness
  • Availability and capacity of the DUT
3.2.2. Test identification
3.2.2.1. Throughput tests

The following tests aim to determine the maximum forwarding rate that can be achieved with a virtual switch. The list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

3.2.2.1.1. Test ID: LTD.Throughput.RFC2544.PacketLossRatio

Title: RFC 2544 X% packet loss ratio Throughput and Latency Test

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s maximum forwarding rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time). The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

Note: Other values can be tested if required by the user.

The selected frame sizes are those previously defined under Default Test Parameters. The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result.

Expected Result: At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.
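The binary search the methodology calls for can be sketched as follows (an illustrative outline only; run_trial is a hypothetical callback that offers a load and returns the measured loss ratio):

```python
def rfc2544_binary_search(run_trial, max_loss=0.0, line_rate=100.0,
                          resolution=0.1):
    """Converge on the highest offered load (as a % of line rate) whose
    measured loss ratio does not exceed max_loss."""
    low, high = 0.0, line_rate
    best = 0.0
    while high - low > resolution:
        rate = (low + high) / 2
        loss = run_trial(rate)      # one trial (>= 60 s) at this offered load
        if loss <= max_loss:
            best, low = rate, rate  # loss acceptable: search higher
        else:
            high = rate             # loss too high: search lower
    return best

# Toy DUT model that starts dropping frames above 73.3% of line rate:
found = rfc2544_binary_search(lambda rate: 0.0 if rate <= 73.3 else 0.5)
print(round(found, 1))
```

The returned value converges on the toy DUT's 73.3% threshold to within the chosen resolution.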

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
3.2.2.1.2. Test ID: LTD.Throughput.RFC2544.PacketLossRatioFrameModification

Title: RFC 2544 X% packet loss Throughput and Latency Test with packet modification

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s maximum forwarding rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time). The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

Note: Other values can be tested if required by the user.

The selected frame sizes are those previously defined under Default Test Parameters. The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result.

During this test, the DUT must perform the following operations on the traffic flow:

  • Perform packet parsing on the DUT’s ingress port.
  • Perform any relevant address look-ups on the DUT’s ingress ports.
  • Modify the packet header before forwarding the packet to the DUT’s egress port. Packet modifications include:
    • Modifying the Ethernet source or destination MAC address.
    • Modifying/adding a VLAN tag. (Recommended).
    • Modifying/adding a MPLS tag.
    • Modifying the source or destination ip address.
    • Modifying the TOS/DSCP field.
    • Modifying the source or destination ports for UDP/TCP/SCTP.
    • Modifying the TTL.

Expected Result: The Packet parsing/modifications require some additional degree of processing resource, therefore the RFC2544 Throughput is expected to be somewhat lower than the Throughput level measured without additional steps. The reduction is expected to be greatest on tests with the smallest packet sizes (greatest header processing rates).

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss and packet modification operations being performed by the DUT.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
3.2.2.1.3. Test ID: LTD.Throughput.RFC2544.Profile

Title: RFC 2544 Throughput and Latency Profile

Prerequisite Test: N/A

Priority:

Description:

This test reveals how throughput and latency degrades as the offered rate varies in the region of the DUT’s maximum forwarding rate as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss). For example it can be used to determine if the degradation of throughput and latency as the offered rate increases is slow and graceful or sudden and severe.

The selected frame sizes are those previously defined under Default Test Parameters.

The offered traffic rate is described as a percentage delta with respect to the DUT’s RFC 2544 Throughput as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss case). A delta of 0% is equivalent to an offered traffic rate equal to the RFC 2544 Maximum Throughput; a delta of +50% indicates an offered rate half-way between the Maximum RFC2544 Throughput and line-rate, whereas a delta of -50% indicates an offered rate of half the RFC 2544 Maximum Throughput. Therefore the range of the delta figure is naturally bounded at -100% (zero offered traffic) and +100% (traffic offered at line rate).

The following deltas to the maximum forwarding rate should be applied:

  • -50%, -10%, 0%, +10% & +50%
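Mapping a delta to an absolute offered rate follows directly from the definition above; a small helper makes the arithmetic explicit (illustrative, with assumed example numbers):

```python
def offered_rate(delta_pct, max_throughput, line_rate):
    """Offered rate for a given delta: negative deltas scale down from the
    RFC 2544 maximum, positive deltas move toward line rate."""
    if delta_pct <= 0:
        return max_throughput * (1 + delta_pct / 100.0)
    return max_throughput + (line_rate - max_throughput) * delta_pct / 100.0

# Assume a 10 Gbps link where the measured RFC 2544 Throughput is 6 Gbps:
for delta in (-50, -10, 0, 10, 50):
    print(delta, offered_rate(delta, max_throughput=6.0, line_rate=10.0))
```

For these assumed numbers, +50% yields 8 Gbps (half-way between 6 Gbps and line rate) and -50% yields 3 Gbps (half the maximum), matching the definition in the text.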

Expected Result: For each packet size a profile should be produced of how throughput and latency vary with offered rate.

Metrics Collected:

The following are the metrics collected for this test:

  • The forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each delta to the maximum forwarding rate and for each frame size.
  • The average latency for each delta to the maximum forwarding rate and for each frame size.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • Any failures experienced (for example if the vSwitch crashes, stops processing packets, restarts or becomes unresponsive to commands) when the offered load is above Maximum Throughput MUST be recorded and reported with the results.
3.2.2.1.4. Test ID: LTD.Throughput.RFC2544.SystemRecoveryTime

Title: RFC 2544 System Recovery Time Test

Prerequisite Test LTD.Throughput.RFC2544.PacketLossRatio

Priority:

Description:

The aim of this test is to determine the length of time it takes the DUT to recover from an overload condition for a constant load (fixed length frames at a fixed interval time). The selected frame sizes are those previously defined under Default Test Parameters; traffic should be sent to the DUT under normal conditions. During the test, while the traffic flows are passing through the DUT, at least one situation leading to an overload condition for the DUT should occur. The time from the end of the overload condition to when the DUT returns to normal operations should be measured to determine the recovery time. Prior to overloading the DUT, one should record the average latency for 10,000 packets forwarded through the DUT.

The overload condition SHOULD be to transmit traffic at a very high frame rate to the DUT (150% of the maximum 0% packet loss rate as determined by LTD.Throughput.RFC2544.PacketLossRatio, or line-rate, whichever is lower), for at least 60 seconds, then reduce the frame rate to 75% of the maximum 0% packet loss rate. A number of time-stamps should be recorded:

  • Record the time-stamp at which the frame rate was reduced and record a second time-stamp at the time of the last frame lost. The recovery time is the difference between the two timestamps.
  • Record the average latency for 10,000 frames after the last frame loss and continue to record average latency measurements for every 10,000 frames; when latency returns to within 10% of pre-overload levels, record the time-stamp.
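The two recovery figures can be derived from the recorded time-stamps as follows (a sketch with hypothetical variable names and sample values):

```python
def recovery_metrics(rate_reduced_ts, last_loss_ts,
                     latency_samples, pre_overload_latency):
    """Compute frame-loss recovery time and latency recovery time.

    latency_samples: list of (timestamp, avg_latency) pairs, one per
    10,000-frame batch recorded after the rate was reduced."""
    loss_recovery = last_loss_ts - rate_reduced_ts
    latency_recovery = None
    for ts, latency in latency_samples:
        # Latency has recovered once it is within 10% of pre-overload levels.
        if latency <= pre_overload_latency * 1.10:
            latency_recovery = ts - rate_reduced_ts
            break
    return loss_recovery, latency_recovery

loss_t, lat_t = recovery_metrics(
    rate_reduced_ts=100.0, last_loss_ts=102.5,
    latency_samples=[(103.0, 90.0), (104.0, 60.0), (105.0, 52.0)],
    pre_overload_latency=50.0)
print(loss_t, lat_t)  # 2.5 5.0
```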

Expected Result:

Metrics collected

The following are the metrics collected for this test:

  • The length of time it takes the DUT to recover from an overload condition.
  • The length of time it takes the DUT to recover the average latency to pre-overload conditions.

Deployment scenario:

  • Physical → virtual switch → physical.
3.2.2.1.5. Test ID: LTD.Throughput.RFC2544.BackToBackFrames

Title: RFC2544 Back To Back Frames Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to characterize the ability of the DUT to process back-to-back frames. For each frame size previously defined under Default Test Parameters, a burst of traffic is sent to the DUT with the minimum inter-frame gap between each frame. If the number of received frames equals the number of frames that were transmitted, the burst size should be increased and traffic is sent to the DUT again. The value measured is the back-to-back value, that is the maximum burst size the DUT can handle without any frame loss. Please note a trial must run for a minimum of 2 seconds and should be repeated 50 times (at a minimum).
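The search for the back-to-back value can be sketched as a simple loop (illustrative only; send_burst is a hypothetical callback that returns the number of frames received for a given burst size):

```python
def back_to_back_search(send_burst, start=1000, step=1000, max_burst=100000):
    """Grow the burst size until frames are lost; return the largest
    loss-free burst (0 if even the first burst loses frames)."""
    best = 0
    burst = start
    while burst <= max_burst:
        received = send_burst(burst)  # one trial (>= 2 s) at min inter-frame gap
        if received < burst:          # loss observed: previous size is the answer
            break
        best = burst
        burst += step
    return best

# Toy DUT that can absorb bursts of up to 7,500 frames:
print(back_to_back_search(lambda n: min(n, 7500)))  # 7000
```

Per the test description, each such search would be repeated 50 times (at a minimum) and the back-to-back values averaged across the trials.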

Expected Result:

Tests of back-to-back frames with physical devices have produced unstable results in some cases. All tests should be repeated in multiple test sessions and results stability should be examined.

Metrics collected

The following are the metrics collected for this test:

  • The average back-to-back value across the trials, which is the number of frames in the longest burst that the DUT will handle without the loss of any frames.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical.
3.2.2.1.6. Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoak

Title: RFC 2889 X% packet loss Max Forwarding Rate Soak Test

Prerequisite Tests:

LTD.Throughput.RFC2544.PacketLossRatio will determine the offered load and frame size for which the maximum theoretical throughput of the interface has not been achieved. As described in RFC 2544 section 24, the final determination of the benchmark SHOULD be conducted using a full length trial, and for this purpose the duration is 5 minutes with zero loss ratio.

It is also essential to verify that the Traffic Generator has sufficient stability to conduct Soak tests. Therefore, a prerequisite is to perform this test with the DUT removed and replaced with a cross-over cable (or other equivalent very low overhead method such as a loopback in a HW switch), so that the traffic generator (and any other network involved) can be tested over the Soak period. Note that this test may be challenging for software-based traffic generators.

Priority:

Description:

The aim of this test is to understand the Max Forwarding Rate stability over an extended test duration in order to uncover any outliers. To allow for an extended test duration, the test should ideally run for 24 hours or if this is not possible, for at least 6 hours.

For this test, one frame size must be sent at the highest frame rate with X% packet loss ratio, as determined in the prerequisite test (a short trial). The loss ratio shall be measured and recorded every 5 minutes during the test (it may be sufficient to collect lost frame counts and divide by the number of frames sent in 5 minutes to see if a threshold has been crossed, and accept some small inaccuracy in the threshold evaluation, not the result). The default loss ratio is X = 0% and loss ratio > 10^-7% is the default threshold to terminate the test early (or inform the test operator of the failure status).

Note: Other values of X and loss threshold can be tested if required by the user.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • Max Forwarding Rate stability of the DUT.
    • This means reporting the number of packets lost per time interval and reporting any time intervals with packet loss. The RFC2889 Forwarding Rate shall be measured in each interval. An interval of 300s is suggested.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile, may also be collected.
3.2.2.1.7. Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoakFrameModification

Title: RFC 2889 Max Forwarding Rate Soak Test with Frame Modification

Prerequisite Test:

LTD.Throughput.RFC2544.PacketLossRatioFrameModification (0% Packet Loss) will determine the offered load and frame size for which the maximum theoretical throughput of the interface has not been achieved. As described in RFC 2544 section 24, the final determination of the benchmark SHOULD be conducted using a full length trial, and for this purpose the duration is 5 minutes with zero loss ratio.

It is also essential to verify that the Traffic Generator has sufficient stability to conduct Soak tests. Therefore, a prerequisite is to perform this test with the DUT removed and replaced with a cross-over cable (or other equivalent very low overhead method such as a loopback in a HW switch), so that the traffic generator (and any other network involved) can be tested over the Soak period. Note that this test may be challenging for software-based traffic generators.

Priority:

Description:

The aim of this test is to understand the Max Forwarding Rate stability over an extended test duration in order to uncover any outliers. To allow for an extended test duration, the test should ideally run for 24 hours or, if this is not possible, for at least 6 hours.

For this test, one frame size must be sent at the highest frame rate with X% packet loss ratio, as determined in the prerequisite test (a short trial). The loss ratio shall be measured and recorded every 5 minutes during the test (it may be sufficient to collect lost frame counts and divide by the number of frames sent in 5 minutes to see if a threshold has been crossed, and accept some small inaccuracy in the threshold evaluation, not the result). The default loss ratio is X = 0% and loss ratio > 10^-7% is the default threshold to terminate the test early (or inform the test operator of the failure status).

Note: Other values of X and loss threshold can be tested if required by the user.

During this test, the DUT must perform the following operations on the traffic flow:

  • Perform packet parsing on the DUT’s ingress port.
  • Perform any relevant address look-ups on the DUT’s ingress ports.
  • Modify the packet header before forwarding the packet to the DUT’s egress port. Packet modifications include:
    • Modifying the Ethernet source or destination MAC address.
    • Modifying/adding a VLAN tag (Recommended).
    • Modifying/adding an MPLS tag.
    • Modifying the source or destination IP address.
    • Modifying the TOS/DSCP field.
    • Modifying the source or destination ports for UDP/TCP/SCTP.
    • Modifying the TTL.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • Max Forwarding Rate stability of the DUT.
    • This means reporting the number of packets lost per time interval and reporting any time intervals with packet loss. The RFC2889 Forwarding Rate shall be measured in each interval. An interval of 300s is suggested.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile, may also be collected.
3.2.2.1.8. Test ID: LTD.Throughput.RFC6201.ResetTime

Title: RFC 6201 Reset Time Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to determine the length of time it takes the DUT to recover from a reset.

Two reset methods are defined - planned and unplanned. A planned reset requires stopping and restarting the virtual switch by the usual ‘graceful’ method defined by its documentation. An unplanned reset requires simulating a fatal internal fault in the virtual switch - for example by using kill -SIGKILL in a Linux environment.

Both reset methods SHOULD be exercised.

For each frame size previously defined under Default Test Parameters, traffic should be sent to the DUT under normal conditions. During the duration of the test and while the traffic flows are passing through the DUT, the DUT should be reset and the Reset time measured. The Reset time is the total time that a device is determined to be out of operation and includes the time to perform the reset and the time to recover from it (cf. RFC6201).

RFC6201 defines two methods to measure the Reset time:

  • Frame-Loss Method: which requires the monitoring of the number of lost frames and calculates the Reset time based on the number of frames lost and the offered rate according to the following formula:

                       Frames_lost (packets)
    Reset_time = -------------------------------------
                   Offered_rate (packets per second)
    
  • Timestamp Method: which measures the time from which the last frame is forwarded from the DUT to the time the first frame is forwarded after the reset. This involves time-stamping all transmitted frames and recording the timestamp of the last frame that was received prior to the reset and also measuring the timestamp of the first frame that is received after the reset. The Reset time is the difference between these two timestamps.
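Both RFC 6201 calculations are direct; a minimal sketch (the function names and units are assumptions, not part of the RFC):

```python
def reset_time_frame_loss(frames_lost, offered_rate_pps):
    """Frame-Loss Method: Reset_time = Frames_lost / Offered_rate.

    frames_lost is a packet count; offered_rate_pps is in packets per
    second, so the result is in seconds.
    """
    return frames_lost / offered_rate_pps


def reset_time_timestamp(last_before_reset, first_after_reset):
    """Timestamp Method: the difference between the timestamp of the last
    frame forwarded before the reset and that of the first frame
    forwarded after it (both in seconds)."""
    return first_after_reset - last_before_reset
```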

According to RFC6201 the choice of method depends on the test tool’s capability; the Frame-Loss method SHOULD be used if the test tool supports:

  • Counting the number of lost frames per stream.
  • Transmitting test frames despite the physical link status.

whereas the Timestamp method SHOULD be used if the test tool supports:

  • Timestamping each frame.
  • Monitoring received frame’s timestamp.
  • Transmitting frames only if the physical link status is up.

Expected Result:

Metrics collected

The following are the metrics collected for this test:

  • Average Reset Time over the number of trials performed.

Results of this test should include the following information:

  • The reset method used.
  • Throughput in Fps and Mbps.
  • Average Frame Loss over the number of trials performed.
  • Average Reset Time in milliseconds over the number of trials performed.
  • Number of trials performed.
  • Protocol: IPv4, IPv6, MPLS, etc.
  • Frame Size in Octets
  • Port Media: Ethernet, Gigabit Ethernet (GbE), etc.
  • Port Speed: 10 Gbps, 40 Gbps etc.
  • Interface Encapsulation: Ethernet, Ethernet VLAN, etc.

Deployment scenario:

  • Physical → virtual switch → physical.
3.2.2.1.9. Test ID: LTD.Throughput.RFC2889.MaxForwardingRate

Title: RFC2889 Forwarding Rate Test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio

Priority:

Description:

This test measures the DUT’s Max Forwarding Rate when the Offered Load is varied between the throughput and the Maximum Offered Load for fixed length frames at a fixed time interval. The selected frame sizes are those previously defined under Default Test Parameters. The throughput is the maximum offered load with 0% frame loss (measured by the prerequisite test), and the Maximum Offered Load (as defined by RFC2285) is “the highest number of frames per second that an external source can transmit to a DUT/SUT for forwarding to a specified output interface or interfaces”.

Traffic should be sent to the DUT at a particular rate (TX rate), starting with a TX rate equal to the throughput rate. The rate of frames successfully received at the destination is counted (in FPS). If the RX rate is equal to the TX rate, the TX rate should be increased by a fixed step size and the RX rate measured again until the Max Forwarding Rate is found.

The trial duration for each iteration should last for the period of time needed for the system to reach steady state for the frame size being tested. Under the RFC2889 (Sec. 5.6.3.1) test methodology, the test duration should run for a minimum period of 30 seconds, regardless of whether the system reaches steady state before the minimum duration ends.
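The stepping procedure above can be sketched as follows; the `measure_rx` hook (one ≥30 s trial per call) and the step granularity are hypothetical assumptions:

```python
def max_forwarding_rate(measure_rx, throughput, max_offered_load, step):
    """Step the TX rate up from the RFC 2544 throughput toward the
    Maximum Offered Load and record the highest forwarding rate seen.

    measure_rx(tx_rate) is a hypothetical hook returning the RX rate
    (FPS) measured over one trial at the given offered load.
    """
    best = 0.0
    tx = throughput
    while tx <= max_offered_load:
        rx = measure_rx(tx)
        best = max(best, rx)        # Max Forwarding Rate = highest measured rate
        if rx < tx:                 # DUT can no longer keep up: stop stepping
            break
        tx += step
    return best
```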

Expected Result: According to RFC2889, the Max Forwarding Rate is the highest forwarding rate of a DUT taken from an iterative set of forwarding rate measurements. The iterative set of forwarding rate measurements are made by setting the intended load transmitted from an external source and measuring the offered load (i.e. what the DUT is capable of forwarding). If the Throughput == the Maximum Offered Load, it follows that the Max Forwarding Rate is equal to the Maximum Offered Load.

Metrics Collected:

The following are the metrics collected for this test:

  • The Max Forwarding Rate for the DUT for each packet size.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical. Note: Full mesh tests with multiple ingress and egress ports are a key aspect of RFC 2889 benchmarks, and scenarios with both 2 and 4 ports should be tested. In any case, the number of ports used must be reported.
3.2.2.1.10. Test ID: LTD.Throughput.RFC2889.ForwardPressure

Title: RFC2889 Forward Pressure Test

Prerequisite Test: LTD.Throughput.RFC2889.MaxForwardingRate

Priority:

Description:

The aim of this test is to determine if the DUT transmits frames with an inter-frame gap that is less than 12 bytes. This test overloads the DUT and measures the output for forward pressure. Traffic should be transmitted to the DUT with an inter-frame gap of 11 bytes, this will overload the DUT by 1 byte per frame. The forwarding rate of the DUT should be measured.
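The 1-byte-per-frame overload can be quantified with standard Ethernet framing arithmetic (8 bytes of preamble + start-of-frame delimiter per frame); a small sketch computing the theoretical frame rate for a given inter-frame gap:

```python
PREAMBLE_SFD = 8   # bytes of preamble + start-of-frame delimiter per frame

def frame_rate(link_bps, frame_bytes, ifg_bytes=12):
    """Theoretical frames/s on an Ethernet link for a given inter-frame gap."""
    wire_bytes = PREAMBLE_SFD + frame_bytes + ifg_bytes
    return link_bps / (wire_bytes * 8)

# Overload applied by this test: 64-byte frames sent with an 11-byte gap
normal  = frame_rate(10_000_000_000, 64)                 # ~14.88 Mfps at 10 GbE
pressed = frame_rate(10_000_000_000, 64, ifg_bytes=11)
overload_pct = (pressed / normal - 1) * 100              # ~1.2 % extra offered load
```

For 64-byte frames the 11-byte gap shrinks the per-frame wire footprint from 84 to 83 bytes, i.e. roughly a 1.2% overload; larger frames are overloaded by proportionally less.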

Expected Result: The forwarding rate should not exceed the maximum forwarding rate of the DUT collected by LTD.Throughput.RFC2889.MaxForwardingRate.

Metrics collected

The following are the metrics collected for this test:

  • Forwarding rate of the DUT in FPS or Mbps.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical.
3.2.2.1.11. Test ID: LTD.Throughput.RFC2889.ErrorFramesFiltering

Title: RFC2889 Error Frames Filtering Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to determine whether the DUT will propagate any erroneous frames it receives or whether it is capable of filtering out the erroneous frames. Traffic should be sent with erroneous frames included within the flow at random intervals. Illegal frames that must be tested include:

  • Oversize Frames.
  • Undersize Frames.
  • CRC Errored Frames.
  • Dribble Bit Errored Frames.
  • Alignment Errored Frames.

The traffic flow exiting the DUT should be recorded and checked to determine if the erroneous frames were passed through the DUT.

Expected Result: Erroneous frames are not passed through the DUT.

Metrics collected

No metrics are collected in this test; instead, it determines:

  • Whether the DUT will propagate erroneous frames.
  • Whether the DUT will correctly filter out any erroneous frames from the traffic flow without removing correct frames.

Deployment scenario:

  • Physical → virtual switch → physical.
3.2.2.1.12. Test ID: LTD.Throughput.RFC2889.BroadcastFrameForwarding

Title: RFC2889 Broadcast Frame Forwarding Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to determine the maximum forwarding rate of the DUT when forwarding broadcast traffic. For each frame previously defined under Default Test Parameters, the traffic should be set up as broadcast traffic. The traffic throughput of the DUT should be measured.

The test should be conducted with at least 4 physical ports on the DUT. The number of ports used MUST be recorded.

As broadcast involves forwarding a single incoming packet to several destinations, the latency of a single packet is defined as the average of the latencies for each of the broadcast destinations.

The incoming packet is transmitted on each of the other physical ports; it is not transmitted on the port on which it was received. The test MAY be conducted using different broadcasting ports to uncover any performance differences.

Expected Result:

Metrics collected:

The following are the metrics collected for this test:

  • The forwarding rate of the DUT when forwarding broadcast traffic.
  • The minimum, average & maximum packets latencies observed.

Deployment scenario:

  • Physical → virtual switch → 3x physical. In the Broadcast rate testing, four test ports are required. One of the ports is connected to the test device, so it can send broadcast frames and listen for mis-routed frames.
3.2.2.1.13. Test ID: LTD.Throughput.RFC2544.WorstN-BestN

Title: Modified RFC 2544 X% packet loss ratio Throughput and Latency Test

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s maximum forwarding rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time). The default loss percentages to be tested are: X = 0% and X = 10^-7%.

Modified RFC 2544 throughput benchmarking methodology aims to quantify the throughput measurement variations observed during standard RFC 2544 benchmarking measurements of virtual switches and VNFs. The RFC2544 binary search algorithm is modified to use more samples per test trial to drive the binary search and yield statistically more meaningful results. This keeps the heart of the RFC2544 methodology, still relying on the binary search of throughput at specified loss tolerance, while providing more useful information about the range of results seen in testing. Instead of using a single traffic trial per iteration step, each traffic trial is repeated N times and the success/failure of the iteration step is based on these N traffic trials. Two types of revised tests are defined - Worst-of-N and Best-of-N.

Worst-of-N

Worst-of-N indicates the lowest expected maximum throughput for (packet size, loss tolerance) when repeating the test.

  1. Repeat the same test run N times at a set packet rate, record each result.
  2. Take the WORST result (highest packet loss) out of N result samples, called the Worst-of-N sample.
  3. If Worst-of-N sample has loss less than the set loss tolerance, then the step is successful - increase the test traffic rate.
  4. If Worst-of-N sample has loss greater than the set loss tolerance then the step failed - decrease the test traffic rate.
  5. Go to step 1.

Best-of-N

Best-of-N indicates the highest expected maximum throughput for (packet size, loss tolerance) when repeating the test.

  1. Repeat the same traffic run N times at a set packet rate, record each result.
  2. Take the BEST result (least packet loss) out of N result samples, called the Best-of-N sample.
  3. If Best-of-N sample has loss less than the set loss tolerance, then the step is successful - increase the test traffic rate.
  4. If Best-of-N sample has loss greater than the set loss tolerance, then the step failed - decrease the test traffic rate.
  5. Go to step 1.
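The modified search can be sketched as a single routine covering both variants; the `run_trial(rate)` hook (returning the measured loss ratio of one trial) and the rate-resolution stop condition are hypothetical assumptions:

```python
def modified_rfc2544_search(run_trial, lo, hi, tolerance, n, mode="worst",
                            resolution=1.0):
    """Binary search for throughput, driving each step with N trials.

    mode="worst" judges each step by the highest loss of the N samples
    (Worst-of-N); mode="best" judges it by the lowest (Best-of-N).
    lo/hi bound the offered rate; the search stops once the bracket is
    narrower than `resolution`.
    """
    pick = max if mode == "worst" else min
    best_rate = lo
    while hi - lo > resolution:
        rate = (lo + hi) / 2
        sample = pick(run_trial(rate) for _ in range(n))
        if sample <= tolerance:     # step succeeded: raise the rate
            best_rate = rate
            lo = rate
        else:                       # step failed: lower the rate
            hi = rate
    return best_rate
```

Running it once with `mode="worst"` and once with `mode="best"` yields the lower and upper throughput bounds described below.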

Performing both Worst-of-N and Best-of-N benchmark tests yields lower and upper bounds of expected maximum throughput under the operating conditions, giving a very good indication to the user of the deterministic performance range for the tested setup.

Expected Result: At the end of each trial series, the presence or absence of loss determines the modification of offered load for the next trial series, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • The following may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system:
    • CPU core utilization.
    • CPU cache utilization.
    • Memory footprint.
    • System bus (QPI, PCI, ...) utilization.
    • CPU cycles consumed per packet.
3.2.2.1.14. Test ID: LTD.Throughput.Overlay.Network.<tech>.RFC2544.PacketLossRatio

Title: <tech> Overlay Network RFC 2544 X% packet loss ratio Throughput and Latency Test

NOTE: Throughout this test, four interchangeable overlay technologies are covered by the same test description. They are: VXLAN, GRE, NVGRE and GENEVE.

Prerequisite Test: N/A

Priority:

Description: This test evaluates standard switch performance benchmarks for the scenario where an Overlay Network is deployed for all paths through the vSwitch. Overlay Technologies covered (replacing <tech> in the test name) include:

  • VXLAN
  • GRE
  • NVGRE
  • GENEVE

Performance will be assessed for each of the following overlay network functions:

  • Encapsulation only
  • De-encapsulation only
  • Both Encapsulation and De-encapsulation

For each native packet, the DUT must perform the following operations:

  • Examine the packet and classify its correct overlay net (tunnel) assignment
  • Encapsulate the packet
  • Switch the packet to the correct port

For each encapsulated packet, the DUT must perform the following operations:

  • Examine the packet and classify its correct native network assignment
  • De-encapsulate the packet, if required
  • Switch the packet to the correct port

The selected frame sizes are those previously defined under Default Test Parameters.

Thus, each test comprises an overlay technology, a network function, and a packet size with overlay network overhead included (but see also the discussion at https://etherpad.opnfv.org/p/vSwitchTestsDrafts).
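The resulting test matrix can be enumerated directly; the frame sizes below are assumed placeholders standing in for the actual Default Test Parameters:

```python
from itertools import product

TECHS = ["VXLAN", "GRE", "NVGRE", "GENEVE"]
FUNCTIONS = ["encapsulation", "de-encapsulation", "encapsulation+de-encapsulation"]
FRAME_SIZES = [64, 128, 256, 512, 1024, 1518]   # assumed; see Default Test Parameters

# One test case per (overlay technology, network function, frame size)
test_matrix = [
    {"tech": t, "function": f, "frame_size": s}
    for t, f, s in product(TECHS, FUNCTIONS, FRAME_SIZES)
]
```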

The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result for Throughput.

Expected Result: At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss (where the value of X is typically equal to zero). The Throughput load is re-used in related RFC2544 tests and other tests.

Metrics Collected: The following are the metrics collected for this test:

  • The maximum Throughput in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT and VNFs (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
3.2.2.1.15. Test ID: LTD.Throughput.RFC2544.MatchAction.PacketLossRatio

Title: RFC 2544 X% packet loss ratio match action Throughput and Latency Test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio

Priority:

Description:

The aim of this test is to determine the cost of carrying out match action(s) on the DUT’s RFC2544 Throughput with X% traffic loss for a constant load (fixed length frames at a fixed interval time).

Each test case requires:

  • selection of a specific match action(s),
  • specifying a percentage of total traffic that is eligible for the match action,
  • determination of the specific test configuration (number of flows, number of test ports, presence of an external controller, etc.), and
  • measurement of the RFC 2544 Throughput level with X% packet loss: Traffic shall be bi-directional and symmetric.

Note: It would be ideal to verify that all match action-eligible traffic was forwarded to the correct port; traffic forwarded to an unintended port should be considered lost.

A match action is an action that is typically carried out on a frame or packet that matches a set of flow classification parameters (typically frame/packet header fields). A match action may or may not modify a packet/frame. Match actions include [1]:

  • output: Outputs a packet to a particular port.
  • normal: Subjects the packet to traditional L2/L3 processing (MAC learning).
  • flood: Outputs the packet on all switch physical ports other than the port on which it was received and any ports on which flooding is disabled.
  • all: Outputs the packet on all switch physical ports other than the port on which it was received.
  • local: Outputs the packet on the local port, which corresponds to the network device that has the same name as the bridge.
  • in_port: Outputs the packet on the port from which it was received.
  • controller: Sends the packet and its metadata to the OpenFlow controller as a packet-in message.
  • enqueue: Enqueues the packet on the specified queue within a port.
  • drop: Discards the packet.

Modifications include [1]:

  • mod vlan: covered by LTD.Throughput.RFC2544.PacketLossRatioFrameModification
  • mod_dl_src: Sets the source Ethernet address.
  • mod_dl_dst: Sets the destination Ethernet address.
  • mod_nw_src: Sets the IPv4 source address.
  • mod_nw_dst: Sets the IPv4 destination address.
  • mod_tp_src: Sets the TCP or UDP or SCTP source port.
  • mod_tp_dst: Sets the TCP or UDP or SCTP destination port.
  • mod_nw_tos: Sets the DSCP bits in the IPv4 ToS/DSCP or IPv6 traffic class field.
  • mod_nw_ecn: Sets the ECN bits in the appropriate IPv4 or IPv6 field.
  • mod_nw_ttl: Sets the IPv4 TTL or IPv6 hop limit field.

Note: This comprehensive list requires extensive traffic generator capabilities.

The match action(s) that were applied as part of the test should be reported in the final test report.

During this test, the DUT must perform the following operations on the traffic flow:

  • Perform packet parsing on the DUT’s ingress port.
  • Perform any relevant address look-ups on the DUT’s ingress ports.
  • Carry out one or more of the match actions specified above.

The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

Other values can be tested if required by the user. The selected frame sizes are those previously defined under Default Test Parameters.

The test can also be used to determine the average latency of the traffic when a match action is applied to packets in a flow. Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result.

Expected Result:

At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.

Metrics Collected:

The following are the metrics collected for this test:

  • The RFC 2544 Throughput in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

The metrics collected can be compared to those of the prerequisite test to determine the cost of the match action(s) in the pipeline.

Deployment scenario:

  • Physical → virtual switch → physical (and others are possible)
[1] ovs-ofctl - administer OpenFlow switches
[http://openvswitch.org/support/dist-docs/ovs-ofctl.8.txt]
3.2.2.2. Packet Latency tests

These tests will measure the store and forward latency as well as the packet delay variation for various packet types through the virtual switch. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

3.2.2.2.1. Test ID: LTD.PacketLatency.InitialPacketProcessingLatency

Title: Initial Packet Processing Latency

Prerequisite Test: N/A

Priority:

Description:

In some virtual switch architectures, the first packets of a flow will take the system longer to process than subsequent packets in the flow. This test determines the latency for these packets. The test will measure the latency of the packets as they are processed by the flow-setup-path of the DUT. There are two methods for this test: a recommended method and an alternative method that can be used if it is possible to disable the fastpath of the virtual switch.

Recommended method: This test will send 64,000 packets to the DUT, each belonging to a different flow. Average packet latency will be determined over the 64,000 packets.

Alternative method: This test will send a single packet to the DUT after a fixed interval of time. The time interval will be equivalent to the amount of time it takes for a flow to time out in the virtual switch plus 10%. Average packet latency will be determined over 1,000,000 packets.

This test is intended only for non-learning virtual switches; for learning virtual switches, use RFC2889.

For this test, only unidirectional traffic is required.

Expected Result: The average latency for the initial packet of all flows should be greater than the latency of subsequent traffic.

Metrics Collected:

The following are the metrics collected for this test:

  • Average latency of the initial packets of all flows that are processed by the DUT.

Deployment scenario:

  • Physical → Virtual Switch → Physical.
3.2.2.2.2. Test ID: LTD.PacketDelayVariation.RFC3393.Soak

Title: Packet Delay Variation Soak Test

Prerequisite Tests:

LTD.Throughput.RFC2544.PacketLossRatio will determine the offered load and frame size for which the maximum theoretical throughput of the interface has not been achieved. As described in RFC 2544 section 24, the final determination of the benchmark SHOULD be conducted using a full length trial, and for this purpose the duration is 5 minutes with zero loss ratio.

It is also essential to verify that the Traffic Generator has sufficient stability to conduct Soak tests. Therefore, a prerequisite is to perform this test with the DUT removed and replaced with a cross-over cable (or other equivalent very low overhead method such as a loopback in a HW switch), so that the traffic generator (and any other network involved) can be tested over the Soak period. Note that this test may be challenging for software-based traffic generators.

Priority:

Description:

The aim of this test is to understand the distribution of packet delay variation for different frame sizes over an extended test duration and to determine if there are any outliers. To allow for an extended test duration, the test should ideally run for 24 hours or, if this is not possible, for at least 6 hours.

For this test, one frame size must be sent at the highest frame rate with X% packet loss ratio, as determined in the prerequisite test (a short trial). The loss ratio shall be measured and recorded every 5 minutes during the test (it may be sufficient to collect lost frame counts and divide by the number of frames sent in 5 minutes to see if a threshold has been crossed, and accept some small inaccuracy in the threshold evaluation, not the result). The default loss ratio is X = 0% and loss ratio > 10^-7% is the default threshold to terminate the test early (or inform the test operator of the failure status).

Note: Other values of X and loss threshold can be tested if required by the user.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The packet delay variation value for traffic passing through the DUT.
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile, for each 300s interval during the test.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
3.2.2.3. Scalability tests

The general aim of these tests is to understand the impact of large flow table size and flow lookups on throughput. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

3.2.2.3.1. Test ID: LTD.Scalability.Flows.RFC2544.0PacketLoss

Title: RFC 2544 0% loss Flow Scalability throughput test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio, IF the delta Throughput between the single-flow RFC2544 test and this test with a variable number of flows is desired.

Priority:

Description:

The aim of this test is to measure how throughput changes as the number of flows in the DUT increases. The test will measure the throughput through the fastpath; as such, the flows need to be installed on the DUT before passing traffic.

For each frame size previously defined under Default Test Parameters and for each of the following number of flows:

  • 1,000
  • 2,000
  • 4,000
  • 8,000
  • 16,000
  • 32,000
  • 64,000
  • Max supported number of flows.

This test will be conducted under two conditions following the establishment of all flows as required by RFC 2544, regarding the flow expiration time-out:

  1. The time-out never expires during each trial.

  2. The time-out expires for all flows periodically. This would require a short time-out compared with flow re-appearance for a small number of flows, and may not be possible for all flow conditions.

The maximum 0% packet loss Throughput should be determined in a manner identical to LTD.Throughput.RFC2544.PacketLossRatio.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum number of frames per second that can be forwarded at the specified number of flows and the specified frame size, with zero packet loss.
3.2.2.3.2. Test ID: LTD.MemoryBandwidth.RFC2544.0PacketLoss.Scalability

Title: RFC 2544 0% loss Memory Bandwidth Scalability test

Prerequisite Tests: LTD.Throughput.RFC2544.PacketLossRatio, IF the delta Throughput between an undisturbed RFC2544 test and this test with the Throughput affected by cache and memory bandwidth contention is desired.

Priority:

Description:

The aim of this test is to understand how the DUT’s performance is affected by cache sharing and memory bandwidth between processes.

During the test all cores not used by the vSwitch should be running a memory intensive application. This application should read and write random data to random addresses in unused physical memory. The random nature of the data and addresses is intended to consume cache, exercise main memory access (as opposed to cache) and exercise all memory buses equally. Furthermore:

  • the ratio of reads to writes should be recorded. A ratio of 1:1 SHOULD be used.
  • the reads and writes MUST be of cache-line size and be cache-line aligned.
  • in NUMA architectures memory access SHOULD be local to the core’s node. Whether only local memory or a mix of local and remote memory is used MUST be recorded.
  • the memory bandwidth (reads plus writes) used per-core MUST be recorded; the test MUST be run with a per-core memory bandwidth equal to half the maximum system memory bandwidth divided by the number of cores. The test MAY be run with other values for the per-core memory bandwidth.
  • the test MAY also be run with the memory intensive application running on all cores.
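The mandated per-core bandwidth setting above is simple arithmetic; a minimal sketch, with illustrative numbers only:

```python
def per_core_stress_bandwidth(max_system_bw: float, n_cores: int) -> float:
    """Per-core memory bandwidth (reads plus writes) for the stress
    application: half the maximum system memory bandwidth divided by
    the number of cores running the memory intensive application."""
    return (max_system_bw / 2.0) / n_cores

# e.g. a 100 GB/s system with 10 stress cores -> 5.0 GB/s per core
target = per_core_stress_bandwidth(100.0, 10)
```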

Under these conditions the DUT’s 0% packet loss throughput is determined as per LTD.Throughput.RFC2544.PacketLossRatio.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The DUT’s 0% packet loss throughput in the presence of cache sharing and memory bandwidth contention between processes.
3.2.2.3.3. Test ID: LTD.Scalability.VNF.RFC2544.PacketLossRatio
Title: VNF Scalability RFC 2544 X% packet loss ratio Throughput and
Latency Test

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s throughput rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time) when the number of VNFs on the DUT increases. The default loss percentages to be tested are X = 0% and X = 10^-7%. The minimum number of VNFs to be tested is 3.

Flow classification should be conducted with L2, L3 and L4 matching to understand the matching and scaling capability of the vSwitch. The matching fields which were used as part of the test should be reported as part of the benchmark report.

The vSwitch is responsible for forwarding frames between the VNFs.

The SUT (vSwitch and VNF daisy chain) operation should be validated before running the test. This may be completed by running a burst or continuous stream of traffic through the SUT to ensure proper operation before a test.

Note: The traffic rate used to validate SUT operation should be low enough not to stress the SUT.

Note: Other values can be tested if required by the user.

Note: The same VNF should be used in the “daisy chain” formation. Each addition of a VNF should be conducted in a new test setup (the DUT is brought down, then brought up again). An alternative approach would be to continue to add VNFs without bringing down the DUT. The approach used needs to be documented as part of the test report.

The selected frame sizes are those previously defined under Default Test Parameters. The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result for Throughput.
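The binary search methodology can be sketched as follows; `run_trial` is a hypothetical callback that reports whether a trial at the offered rate met the X% loss target, and the resolution value is an illustrative assumption:

```python
def rfc2544_search(line_rate_fps: float, run_trial,
                   resolution_fps: float = 1.0) -> float:
    """Binary search for the highest offered rate that passes a trial.
    run_trial(rate) -> True when the 60 s trial meets the X% loss
    target. A sketch only; real methodologies add trial repetition."""
    lo, hi, best = 0.0, line_rate_fps, 0.0
    while hi - lo > resolution_fps:
        mid = (lo + hi) / 2.0
        if run_trial(mid):
            best = lo = mid   # passed: search the upper half
        else:
            hi = mid          # failed: search the lower half
    return best
```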

Expected Result: At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.

If the test VNFs are rather light-weight in terms of processing, the test provides a view of multiple passes through the vswitch on logical interfaces. In other words, the test produces an optimistic count of daisy-chained VNFs, but the cumulative effect of traffic on the vSwitch is “real” (assuming that the vSwitch has some dedicated resources, and the effects on shared resources is understood).

Metrics Collected: The following are the metrics collected for this test:

  • The maximum Throughput in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT and VNFs (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
3.2.2.3.4. Test ID: LTD.Scalability.VNF.RFC2544.PacketLossProfile

Title: VNF Scalability RFC 2544 Throughput and Latency Profile

Prerequisite Test: N/A

Priority:

Description:

This test reveals how throughput and latency degrade as the number of VNFs increases and the offered rate varies in the region of the DUT’s maximum forwarding rate as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss). For example, it can be used to determine whether the degradation of throughput and latency as the number of VNFs and the offered rate increase is slow and graceful, or sudden and severe. The minimum number of VNFs to be tested is 3.

The selected frame sizes are those previously defined under Default Test Parameters.

The offered traffic rate is described as a percentage delta with respect to the DUT’s RFC 2544 Throughput as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss case). A delta of 0% is equivalent to an offered traffic rate equal to the RFC 2544 Throughput; a delta of +50% indicates an offered rate half-way between the Throughput and line-rate, whereas a delta of -50% indicates an offered rate of half the maximum rate. Therefore the range of the delta figure is naturally bounded at -100% (zero offered traffic) and +100% (traffic offered at line rate).
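The mapping from delta to offered rate can be written out as a small helper; this is an illustrative sketch of the arithmetic described above, not part of the specification:

```python
def offered_rate(delta_pct: float, throughput: float,
                 line_rate: float) -> float:
    """Offered traffic rate for a given percentage delta:
    0% -> the RFC 2544 Throughput, +100% -> line rate,
    -100% -> zero offered traffic."""
    if delta_pct >= 0:
        # interpolate between the Throughput and line rate
        return throughput + (delta_pct / 100.0) * (line_rate - throughput)
    # scale the Throughput down towards zero
    return throughput * (1.0 + delta_pct / 100.0)
```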

The following deltas to the maximum forwarding rate should be applied:

  • -50%, -10%, 0%, +10% & +50%

Note: Other values can be tested if required by the user.

Note: The same VNF should be used in the “daisy chain” formation. Each addition of a VNF should be conducted in a new test setup (the DUT is brought down, then brought up again). An alternative approach would be to continue to add VNFs without bringing down the DUT. The approach used needs to be documented as part of the test report.

Flow classification should be conducted with L2, L3 and L4 matching to understand the matching and scaling capability of the vSwitch. The matching fields which were used as part of the test should be reported as part of the benchmark report.

The SUT (vSwitch and VNF daisy chain) operation should be validated before running the test. This may be completed by running a burst or continuous stream of traffic through the SUT to ensure proper operation before a test.

Note: The traffic rate used to validate SUT operation should be low enough not to stress the SUT.

Expected Result: For each packet size a profile should be produced of how throughput and latency vary with offered rate.

Metrics Collected:

The following are the metrics collected for this test:

  • The forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each delta to the maximum forwarding rate and for each frame size.
  • The average latency for each delta to the maximum forwarding rate and for each frame size.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • Any failures experienced (for example if the vSwitch crashes, stops processing packets, restarts or becomes unresponsive to commands) when the offered load is above Maximum Throughput MUST be recorded and reported with the results.
3.2.2.4. Activation tests

The general aim of these tests is to understand the capacity of the vSwitch and the speed with which it can accommodate new flows.

3.2.2.4.1. Test ID: LTD.Activation.RFC2889.AddressCachingCapacity

Title: RFC2889 Address Caching Capacity Test

Prerequisite Test: N/A

Priority:

Description:

Please note this test is only applicable to virtual switches that are capable of MAC learning. The aim of this test is to determine the address caching capacity of the DUT for a constant load (fixed length frames at a fixed interval time). The selected frame sizes are those previously defined under Default Test Parameters.

In order to run this test, the aging time (the maximum time the DUT will keep a learned address in its flow table) and a set of initial addresses (whose number should be >= 1 and <= the maximum number supported by the implementation) must be known. Please note that if the aging time is configurable, it must be longer than the time necessary for the external source to produce frames at the specified rate. If the aging time is fixed, the frame rate must be brought down to a value such that the external source can produce the frames in less time than the aging time.

Learning Frames should be sent from an external source to the DUT to install a number of flows. The Learning Frames must have a fixed destination address and must vary the source address of the frames. The DUT should install flows in its flow table based on the varying source addresses. Frames should then be transmitted from an external source at a suitable frame rate to see if the DUT has properly learned all of the addresses. If there is no frame loss and no flooding, the number of addresses sent to the DUT should be increased and the test repeated until the maximum number of cached addresses supported by the DUT is determined.
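The search over the number of learned addresses can be sketched as follows; `send_and_check` is a hypothetical stand-in for the learning/verification traffic exchange, and the doubling strategy and limit are illustrative assumptions:

```python
def address_caching_capacity(send_and_check, start: int = 1,
                             limit: int = 1_000_000) -> int:
    """Grow the number of learned source addresses until frames are
    lost or flooded. send_and_check(n) -> True when n addresses are
    learned with no loss and no flooding. A doubling-search sketch
    only; a real RFC 2889 procedure also refines between the last
    passing and first failing counts."""
    n, last_good = start, 0
    while n <= limit and send_and_check(n):
        last_good = n
        n *= 2
    return last_good
```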

Expected Result:

Metrics collected:

The following are the metrics collected for this test:

  • Number of cached addresses supported by the DUT.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → 2 x physical (one receiving, one listening).
3.2.2.4.2. Test ID: LTD.Activation.RFC2889.AddressLearningRate

Title: RFC2889 Address Learning Rate Test

Prerequisite Test: LTD.Memory.RFC2889.AddressCachingCapacity

Priority:

Description:

Please note this test is only applicable to virtual switches that are capable of MAC learning. The aim of this test is to determine the rate of address learning of the DUT for a constant load (fixed length frames at a fixed interval time). The selected frame sizes are those previously defined under Default Test Parameters, traffic should be sent with each IPv4/IPv6 address incremented by one. The rate at which the DUT learns a new address should be measured. The maximum caching capacity from LTD.Memory.RFC2889.AddressCachingCapacity should be taken into consideration as the maximum number of addresses for which the learning rate can be obtained.

Expected Result: It may be worthwhile to report the behaviour when operating beyond address capacity - some DUTs may be more friendly to new addresses than others.

Metrics collected:

The following are the metrics collected for this test:

  • The address learning rate of the DUT.

Deployment scenario:

  • Physical → virtual switch → 2 x physical (one receiving, one listening).
3.2.2.5. Coupling between control path and datapath Tests

The following tests aim to determine how tightly coupled the datapath and the control path are within a virtual switch. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

3.2.2.5.1. Test ID: LTD.CPDPCouplingFlowAddition

Title: Control Path and Datapath Coupling

Prerequisite Test:

Priority:

Description:

The aim of this test is to understand how exercising the DUT’s control path affects datapath performance.

Initially a certain number of flow table entries are installed in the vSwitch. Then over the duration of an RFC2544 throughput test flow-entries are added and removed at the rates specified below. No traffic is ‘hitting’ these flow-entries, they are simply added and removed.

The test MUST be repeated with the following initial numbers of flow-entries installed:

  • < 10
  • 1,000
  • 100,000
  • 10,000,000 (or the maximum supported number of flow-entries)

The test MUST be repeated with the following rates of flow-entry addition and deletion per second:

  • 0
  • 1 (i.e. 1 addition plus 1 deletion)
  • 100
  • 10,000
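The mandated parameter combinations can be enumerated in a short sketch; the value 10 merely stands in for the "< 10" case, and the largest flow-entry count may instead be the maximum supported number:

```python
from itertools import product

# Parameter matrix for the coupling test (illustrative representation).
INITIAL_FLOW_ENTRIES = [10, 1_000, 100_000, 10_000_000]
CHURN_RATES_PER_SEC = [0, 1, 100, 10_000]  # additions plus deletions

# One RFC 2544 throughput test per combination.
test_matrix = list(product(INITIAL_FLOW_ENTRIES, CHURN_RATES_PER_SEC))
```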

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical.
3.2.2.6. CPU and memory consumption

The following tests will profile a virtual switch’s CPU and memory utilization under various loads and circumstances. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

3.2.2.6.1. Test ID: LTD.Stress.RFC2544.0PacketLoss

Title: RFC 2544 0% Loss CPU OR Memory Stress Test

Prerequisite Test:

Priority:

Description:

The aim of this test is to understand the overall performance of the system when a CPU or Memory intensive application is run on the same DUT as the Virtual Switch. For each frame size, an LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss) test should be performed. Throughout the entire test a CPU or Memory intensive application should be run on all cores on the system not in use by the Virtual Switch. For NUMA systems, only cores on the same NUMA node are loaded.
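Selecting the cores to load can be sketched as a small helper; this is a hypothetical illustration, and in practice the core lists would come from the platform topology (e.g. sysfs on Linux):

```python
def stress_core_set(vswitch_cores, numa_node_cores):
    """Cores to run the CPU/Memory intensive application on: every
    core of the vSwitch's NUMA node not used by the vSwitch itself."""
    return sorted(set(numa_node_cores) - set(vswitch_cores))

# e.g. vSwitch pinned to cores 0-1 of an 8-core NUMA node
cores = stress_core_set([0, 1], range(8))
```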

It is recommended that stress-ng be used for loading the non-Virtual Switch cores but any stress tool MAY be used.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • Memory and CPU utilization of the cores running the Virtual Switch.
  • The number and identity of the cores allocated to the Virtual Switch.
  • The configuration of the stress tool (for example, the command line parameters used to start it).

Note: Stress in the test ID can be replaced with the name of the component being stressed when reporting the results: LTD.CPU.RFC2544.0PacketLoss or LTD.Memory.RFC2544.0PacketLoss.
3.2.2.7. Summary List of Tests
  1. Throughput tests
  • Test ID: LTD.Throughput.RFC2544.PacketLossRatio
  • Test ID: LTD.Throughput.RFC2544.PacketLossRatioFrameModification
  • Test ID: LTD.Throughput.RFC2544.Profile
  • Test ID: LTD.Throughput.RFC2544.SystemRecoveryTime
  • Test ID: LTD.Throughput.RFC2544.BackToBackFrames
  • Test ID: LTD.Throughput.RFC2889.Soak
  • Test ID: LTD.Throughput.RFC2889.SoakFrameModification
  • Test ID: LTD.Throughput.RFC6201.ResetTime
  • Test ID: LTD.Throughput.RFC2889.MaxForwardingRate
  • Test ID: LTD.Throughput.RFC2889.ForwardPressure
  • Test ID: LTD.Throughput.RFC2889.ErrorFramesFiltering
  • Test ID: LTD.Throughput.RFC2889.BroadcastFrameForwarding
  • Test ID: LTD.Throughput.RFC2544.WorstN-BestN
  • Test ID: LTD.Throughput.Overlay.Network.<tech>.RFC2544.PacketLossRatio
  2. Packet Latency tests
  • Test ID: LTD.PacketLatency.InitialPacketProcessingLatency
  • Test ID: LTD.PacketDelayVariation.RFC3393.Soak
  3. Scalability tests
  • Test ID: LTD.Scalability.Flows.RFC2544.0PacketLoss
  • Test ID: LTD.MemoryBandwidth.RFC2544.0PacketLoss.Scalability
  • Test ID: LTD.Scalability.VNF.RFC2544.PacketLossProfile
  • Test ID: LTD.Scalability.VNF.RFC2544.PacketLossRatio
  4. Activation tests
  • Test ID: LTD.Activation.RFC2889.AddressCachingCapacity
  • Test ID: LTD.Activation.RFC2889.AddressLearningRate
  5. Coupling between control path and datapath Tests
  • Test ID: LTD.CPDPCouplingFlowAddition
  6. CPU and memory consumption
  • Test ID: LTD.Stress.RFC2544.0PacketLoss
4. VSPERF LEVEL TEST PLAN (LTP)
4.1. Introduction

The objective of the OPNFV project titled Characterize vSwitch Performance for Telco NFV Use Cases is to evaluate the performance of virtual switches to identify their suitability for a Telco Network Function Virtualization (NFV) environment. The intention of this Level Test Plan (LTP) document is to specify the scope, approach, resources, and schedule of the virtual switch performance benchmarking activities in OPNFV. The test cases will be identified in a separate document called the Level Test Design (LTD) document.

This document is currently in draft form.

4.1.1. Document identifier

The document id will be used to uniquely identify versions of the LTP. The format for the document id will be: OPNFV_vswitchperf_LTP_REL_STATUS, whereby STATUS is one of: draft, reviewed, corrected or final. The document id for this version of the LTP is: OPNFV_vswitchperf_LTP_Colorado_REVIEWED.

4.1.2. Scope

The main purpose of this project is to specify a suite of performance tests in order to objectively measure the current packet transfer characteristics of a virtual switch in the NFVI. The intent of the project is to facilitate the performance testing of any virtual switch. Thus, a generic suite of tests shall be developed, with no hard dependencies to a single implementation. In addition, the test case suite shall be architecture independent.

The test cases developed in this project shall not form part of a separate test framework; all of these tests may be inserted into the Continuous Integration Test Framework and/or the Platform Functionality Test Framework if a vSwitch becomes a standard component of an OPNFV release.

4.1.4. Level in the overall sequence

The level of testing conducted by vswitchperf in the overall testing sequence (among all the testing projects in OPNFV) is the performance benchmarking of a specific component (the vswitch) in the OPNFV platform. It’s expected that this testing will follow on from the functional and integration testing conducted by other testing projects in OPNFV, namely Functest and Yardstick.

4.1.5. Test classes and overall test conditions

A benchmark is defined by the IETF as: A standardized test that serves as a basis for performance evaluation and comparison. It’s important to note that benchmarks are not Functional tests. They do not provide PASS/FAIL criteria, and most importantly ARE NOT performed on live networks, or performed with live network traffic.

In order to determine the packet transfer characteristics of a virtual switch, the benchmarking tests will be broken down into the following categories:

  • Throughput Tests to measure the maximum forwarding rate (in frames per second or fps) and bit rate (in Mbps) for a constant load (as defined by RFC1242) without traffic loss.
  • Packet and Frame Delay Tests to measure average, min and max packet and frame delay for constant loads.
  • Stream Performance Tests (TCP, UDP) to measure bulk data transfer performance, i.e. how fast systems can send and receive data through the virtual switch.
  • Request/Response Performance Tests (TCP, UDP) to measure the transaction rate through the virtual switch.
  • Packet Delay Tests to understand latency distribution for different packet sizes and over an extended test run to uncover outliers.
  • Scalability Tests to understand how the virtual switch performs as the number of flows, active ports, and the complexity of the forwarding logic’s configuration it has to deal with increase.
  • Control Path and Datapath Coupling Tests, to understand how closely coupled the datapath and the control path are as well as the effect of this coupling on the performance of the DUT.
  • CPU and Memory Consumption Tests to understand the virtual switch’s footprint on the system, this includes:
    • CPU core utilization.
    • CPU cache utilization.
    • Memory footprint.
    • System bus (QPI, PCI, ..) utilization.
    • Memory lanes utilization.
    • CPU cycles consumed per packet.
    • Time To Establish Flows Tests.
  • Noisy Neighbour Tests, to understand the effects of resource sharing on the performance of a virtual switch.

Note: some of the tests above can be conducted simultaneously where the combined results would be insightful, for example Packet/Frame Delay and Scalability.

4.2. Details of the Level Test Plan

This section describes the following items:

  • Test items and their identifiers (TestItems)
  • Test Traceability Matrix (TestMatrix)
  • Features to be tested (FeaturesToBeTested)
  • Features not to be tested (FeaturesNotToBeTested)
  • Approach (Approach)
  • Item pass/fail criteria (PassFailCriteria)
  • Suspension criteria and resumption requirements (SuspensionResumptionReqs)

4.2.1. Test items and their identifiers

The test items/applications vsperf is trying to test are virtual switches, and in particular their performance in an NFV environment. vsperf will first try to measure the maximum achievable performance of a virtual switch and then focus on use cases that are as close to real-life deployment scenarios as possible.

4.2.2. Test Traceability Matrix

vswitchperf leverages the “3x3” matrix (introduced in https://tools.ietf.org/html/draft-ietf-bmwg-virtual-net-02) to achieve test traceability. The matrix was expanded to 3x4 to accommodate scale metrics when displaying the coverage of many metrics/benchmarks. Test case coverage in the LTD is tracked using the following categories:

                  SPEED   ACCURACY   RELIABILITY   SCALE
  Activation        X        X            X          X
  Operation         X        X            X          X
  De-activation

An X denotes a test category that has one or more test cases defined.

4.2.3. Features to be tested

Characterizing virtual switches (i.e. Device Under Test (DUT) in this document) includes measuring the following performance metrics:

  • Throughput as defined by RFC1242: The maximum rate at which none of the offered frames are dropped by the DUT. The maximum frame rate and bit rate that can be transmitted by the DUT without any error should be recorded. Note there is an equivalent bit rate and a specific layer at which the payloads contribute to the bits. Errors and improperly formed frames or packets are dropped.
  • Packet delay introduced by the DUT and its cumulative effect on E2E networks. Frame delay can be measured equivalently.
  • Packet delay variation: measured from the perspective of the VNF/application. Packet delay variation is sometimes called “jitter”. However, we will avoid the term “jitter” as the term holds different meanings for different groups of people. In this document we will simply use the term packet delay variation. The preferred form for this metric is the PDV form of delay variation defined in RFC5481. The most relevant measurement of PDV considers the delay variation of a single user flow, as this will be relevant to the size of end-system buffers needed to compensate for delay variation. The measurement system’s ability to store the delays of individual packets in the flow of interest is a key factor that determines the specific measurement method. At the outset, it is ideal to view the complete PDV distribution. Systems that can capture and store packets and their delays have the freedom to calculate the reference minimum delay and to determine various quantiles of the PDV distribution accurately (in post-measurement processing routines). Systems without storage must apply algorithms to calculate delay and statistical measurements on the fly. For example, a system may store temporary estimates of the minimum delay and the set of (100) packets with the longest delays during measurement (to calculate a high quantile), and update these sets with new values periodically. In some cases, a limited number of delay histogram bins will be available, and the bin limits will need to be set using results from repeated experiments. See section 8 of RFC5481.
  • Packet loss (within a configured waiting time at the receiver): All packets sent to the DUT should be accounted for.
  • Burst behaviour: measures the ability of the DUT to buffer packets.
  • Packet re-ordering: measures the ability of the device under test to maintain sending order throughout transfer to the destination.
  • Packet correctness: packets or Frames must be well-formed, in that they include all required fields, conform to length requirements, pass integrity checks, etc.
  • Availability and capacity of the DUT, i.e. when the DUT is fully “up” and connected, the following measurements should be captured for the DUT without any network packet load:
    • Includes average power consumption of the CPUs (in various power states) and system over specified period of time. Time period should not be less than 60 seconds.
    • Includes average per core CPU utilization over specified period of time. Time period should not be less than 60 seconds.
    • Includes the number of NIC interfaces supported.
    • Includes headroom of VM workload processing cores (i.e. available for applications).
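The PDV post-measurement processing described for the packet delay variation metric can be sketched as a simplified quantile computation, assuming all per-packet delays were captured and stored:

```python
def pdv_quantile(delays_ms, quantile=0.999):
    """PDV per RFC 5481: each packet's one-way delay minus the
    reference minimum delay of the flow; report a high quantile of
    the resulting distribution. Post-measurement sketch for systems
    that store every per-packet delay."""
    d_min = min(delays_ms)
    pdv = sorted(d - d_min for d in delays_ms)
    idx = min(int(quantile * len(pdv)), len(pdv) - 1)
    return pdv[idx]
```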
4.2.4. Features not to be tested

vsperf doesn’t intend to define or perform any functional tests. The aim is to focus on performance.

4.2.5. Approach

The testing approach adopted by the vswitchperf project is black box testing, meaning the test inputs can be generated and the outputs captured and completely evaluated from outside the System Under Test. Some metrics can be collected on the SUT, such as CPU or memory utilization, if the collection has no or minimal impact on the benchmark. This section will look at the deployment scenarios and the general methodology used by vswitchperf. In addition, this section will also specify the details of the Test Report that must be collected for each of the test cases.

4.2.5.1. Deployment Scenarios

The following represents possible deployment test scenarios which can help to determine the performance of both the virtual switch and the datapaths to physical ports (to NICs) and to logical ports (to VNFs):

4.2.5.1.1. Physical port → vSwitch → physical port
                                                     _
+--------------------------------------------------+  |
|              +--------------------+              |  |
|              |                    |              |  |
|              |                    v              |  |  Host
|   +--------------+            +--------------+   |  |
|   |   phy port   |  vSwitch   |   phy port   |   |  |
+---+--------------+------------+--------------+---+ _|
           ^                           :
           |                           |
           :                           v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
4.2.5.1.2. Physical port → vSwitch → VNF → vSwitch → physical port
                                                      _
+---------------------------------------------------+  |
|                                                   |  |
|   +-------------------------------------------+   |  |
|   |                 Application               |   |  |
|   +-------------------------------------------+   |  |
|       ^                                  :        |  |
|       |                                  |        |  |  Guest
|       :                                  v        |  |
|   +---------------+           +---------------+   |  |
|   | logical port 0|           | logical port 1|   |  |
+---+---------------+-----------+---------------+---+ _|
        ^                                  :
        |                                  |
        :                                  v         _
+---+---------------+----------+---------------+---+  |
|   | logical port 0|          | logical port 1|   |  |
|   +---------------+          +---------------+   |  |
|       ^                                  :       |  |
|       |                                  |       |  |  Host
|       :                                  v       |  |
|   +--------------+            +--------------+   |  |
|   |   phy port   |  vSwitch   |   phy port   |   |  |
+---+--------------+------------+--------------+---+ _|
           ^                           :
           |                           |
           :                           v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
4.2.5.1.3. Physical port → vSwitch → VNF → vSwitch → VNF → vSwitch → physical port
                                                   _
+----------------------+  +----------------------+  |
|   Guest 1            |  |   Guest 2            |  |
|   +---------------+  |  |   +---------------+  |  |
|   |  Application  |  |  |   |  Application  |  |  |
|   +---------------+  |  |   +---------------+  |  |
|       ^       |      |  |       ^       |      |  |
|       |       v      |  |       |       v      |  |  Guests
|   +---------------+  |  |   +---------------+  |  |
|   | logical ports |  |  |   | logical ports |  |  |
|   |   0       1   |  |  |   |   0       1   |  |  |
+---+---------------+--+  +---+---------------+--+ _|
        ^       :                 ^       :
        |       |                 |       |
        :       v                 :       v        _
+---+---------------+---------+---------------+--+  |
|   |   0       1   |         |   3       4   |  |  |
|   | logical ports |         | logical ports |  |  |
|   +---------------+         +---------------+  |  |
|       ^       |                 ^       |      |  |  Host
|       |       L-----------------+       v      |  |
|   +--------------+          +--------------+   |  |
|   |   phy ports  | vSwitch  |   phy ports  |   |  |
+---+--------------+----------+--------------+---+ _|
        ^       ^                 :       :
        |       |                 |       |
        :       :                 v       v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
4.2.5.1.4. Physical port → VNF → vSwitch → VNF → physical port
                                                    _
+----------------------+  +----------------------+   |
|   Guest 1            |  |   Guest 2            |   |
|+-------------------+ |  | +-------------------+|   |
||     Application   | |  | |     Application   ||   |
|+-------------------+ |  | +-------------------+|   |
|       ^       |      |  |       ^       |      |   |  Guests
|       |       v      |  |       |       v      |   |
|+-------------------+ |  | +-------------------+|   |
||   logical ports   | |  | |   logical ports   ||   |
||  0              1 | |  | | 0              1  ||   |
++--------------------++  ++--------------------++  _|
    ^              :          ^              :
(PCI passthrough)  |          |     (PCI passthrough)
    |              v          :              |      _
+--------++------------+-+------------++---------+   |
|   |    ||        0   | |    1       ||     |   |   |
|   |    ||logical port| |logical port||     |   |   |
|   |    |+------------+ +------------+|     |   |   |
|   |    |     |                 ^     |     |   |   |
|   |    |     L-----------------+     |     |   |   |
|   |    |                             |     |   |   |  Host
|   |    |           vSwitch           |     |   |   |
|   |    +-----------------------------+     |   |   |
|   |                                        |   |   |
|   |                                        v   |   |
| +--------------+              +--------------+ |   |
| | phy port/VF  |              | phy port/VF  | |   |
+-+--------------+--------------+--------------+-+  _|
    ^                                        :
    |                                        |
    :                                        v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
4.2.5.1.5. Physical port → vSwitch → VNF
                                                      _
+---------------------------------------------------+  |
|                                                   |  |
|   +-------------------------------------------+   |  |
|   |                 Application               |   |  |
|   +-------------------------------------------+   |  |
|       ^                                           |  |
|       |                                           |  |  Guest
|       :                                           |  |
|   +---------------+                               |  |
|   | logical port 0|                               |  |
+---+---------------+-------------------------------+ _|
        ^
        |
        :                                            _
+---+---------------+------------------------------+  |
|   | logical port 0|                              |  |
|   +---------------+                              |  |
|       ^                                          |  |
|       |                                          |  |  Host
|       :                                          |  |
|   +--------------+                               |  |
|   |   phy port   |  vSwitch                      |  |
+---+--------------+-------------------------------+ _|
           ^
           |
           :
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
4.2.5.1.6. VNF → vSwitch → physical port
                                                      _
+---------------------------------------------------+  |
|                                                   |  |
|   +-------------------------------------------+   |  |
|   |                 Application               |   |  |
|   +-------------------------------------------+   |  |
|                                          :        |  |
|                                          |        |  |  Guest
|                                          v        |  |
|                               +---------------+   |  |
|                               | logical port  |   |  |
+-------------------------------+---------------+---+ _|
                                           :
                                           |
                                           v         _
+------------------------------+---------------+---+  |
|                              | logical port  |   |  |
|                              +---------------+   |  |
|                                          :       |  |
|                                          |       |  |  Host
|                                          v       |  |
|                               +--------------+   |  |
|                     vSwitch   |   phy port   |   |  |
+-------------------------------+--------------+---+ _|
                                       :
                                       |
                                       v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
4.2.5.1.7. VNF → vSwitch → VNF → vSwitch
                                                         _
+-------------------------+  +-------------------------+  |
|   Guest 1               |  |   Guest 2               |  |
|   +-----------------+   |  |   +-----------------+   |  |
|   |   Application   |   |  |   |   Application   |   |  |
|   +-----------------+   |  |   +-----------------+   |  |
|                :        |  |       ^                 |  |
|                |        |  |       |                 |  |  Guest
|                v        |  |       :                 |  |
|     +---------------+   |  |   +---------------+     |  |
|     | logical port 0|   |  |   | logical port 0|     |  |
+-----+---------------+---+  +---+---------------+-----+ _|
                :                    ^
                |                    |
                v                    :                    _
+----+---------------+------------+---------------+-----+  |
|    |     port 0    |            |     port 1    |     |  |
|    +---------------+            +---------------+     |  |
|              :                    ^                   |  |
|              |                    |                   |  |  Host
|              +--------------------+                   |  |
|                                                       |  |
|                     vswitch                           |  |
+-------------------------------------------------------+ _|

HOST 1 (Physical port → virtual switch → VNF → virtual switch → Physical port) → HOST 2 (Physical port → virtual switch → VNF → virtual switch → Physical port)

4.2.5.1.8. HOST 1 (PVP) → HOST 2 (PVP)
                                                   _
+----------------------+  +----------------------+  |
|   Guest 1            |  |   Guest 2            |  |
|   +---------------+  |  |   +---------------+  |  |
|   |  Application  |  |  |   |  Application  |  |  |
|   +---------------+  |  |   +---------------+  |  |
|       ^       |      |  |       ^       |      |  |
|       |       v      |  |       |       v      |  |  Guests
|   +---------------+  |  |   +---------------+  |  |
|   | logical ports |  |  |   | logical ports |  |  |
|   |   0       1   |  |  |   |   0       1   |  |  |
+---+---------------+--+  +---+---------------+--+ _|
        ^       :                 ^       :
        |       |                 |       |
        :       v                 :       v        _
+---+---------------+--+  +---+---------------+--+  |
|   |   0       1   |  |  |   |   3       4   |  |  |
|   | logical ports |  |  |   | logical ports |  |  |
|   +---------------+  |  |   +---------------+  |  |
|       ^       |      |  |       ^       |      |  |  Hosts
|       |       v      |  |       |       v      |  |
|   +--------------+   |  |   +--------------+   |  |
|   |   phy ports  |   |  |   |   phy ports  |   |  |
+---+--------------+---+  +---+--------------+---+ _|
        ^       :                 :       :
        |       +-----------------+       |
        :                                 v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+

Note: For tests where the traffic generator and/or measurement receiver are implemented on a VM and connected to the virtual switch through a vNIC, the issues of shared resources and interactions between the measurement devices and the device under test must be considered.

Note: Some RFC 2889 tests require a full-mesh sending and receiving pattern involving more than two ports. This possibility is illustrated in the Physical port → vSwitch → VNF → vSwitch → VNF → vSwitch → physical port diagram above (with 2 sending and 2 receiving ports, though all ports could be used bi-directionally).

Note: When Deployment Scenarios are used in RFC 2889 address learning or cache capacity testing, an additional port from the vSwitch must be connected to the test device. This port is used to listen for flooded frames.

4.2.5.2. General Methodology

To establish the baseline performance of the virtual switch, tests would initially be run with a simple workload in the VNF (the recommended simple workload VNFs are DPDK's testpmd application forwarding packets in a VM, or vloop_vnf, a simple kernel module that forwards traffic between two network interfaces inside the virtualized environment while bypassing the networking stack). Subsequently, the tests would also be executed with a real Telco workload running in the VNF, which would exercise the virtual switch in the context of higher level Telco NFV use cases, and prove that its underlying characteristics and behaviour can be measured and validated. Suitable real Telco workload VNFs are yet to be identified.

4.2.5.2.1. Default Test Parameters

The following list identifies the default parameters for the suite of tests:

  • Reference application: Simple forwarding or Open Source VNF.
  • Frame size (bytes): 64, 128, 256, 512, 1024, 1280, 1518, 2K, 4K, OR packet size based on use case (e.g. RTP 64B, 256B), OR a mix of packet sizes as maintained by the Functest project <https://wiki.opnfv.org/traffic_profile_management>.
  • Reordering check: Tests should confirm that packets within a flow are not reordered.
  • Duplex: Unidirectional / Bidirectional. Default: full duplex with traffic transmitting in both directions, as network traffic generally does not flow in a single direction. By default the data rate of transmitted traffic should be the same in both directions. Asymmetric traffic (e.g. downlink-heavy) tests will be called out explicitly for the relevant test cases.
  • Number of Flows: The default for non-scalability tests is a single flow. For scalability tests the goal is to test with the maximum number of supported flows, testing up to 10 million flows where possible. Start with a single flow and scale up. By default flows should be added sequentially; tests that add flows simultaneously will explicitly call out their flow addition behaviour. Packets are generated across the flows uniformly with no burstiness. Multi-core tests should consider the number of packet flows based on the vSwitch/VNF multi-threading implementation and behaviour.
  • Traffic Types: UDP, SCTP, RTP and GTP traffic.
  • Deployment scenarios are:
    • Physical → virtual switch → physical.
    • Physical → virtual switch → VNF → virtual switch → physical.
    • Physical → virtual switch → VNF → virtual switch → VNF → virtual switch → physical.
    • Physical → VNF → virtual switch → VNF → physical.
    • Physical → virtual switch → VNF.
    • VNF → virtual switch → Physical.
    • VNF → virtual switch → VNF.

Tests MUST use these parameters unless otherwise stated. Test cases with non-default parameters will state them explicitly.

Note: For throughput tests unless stated otherwise, test configurations should ensure that traffic traverses the installed flows through the virtual switch, i.e. flows are installed and have an appropriate time out that doesn’t expire before packet transmission starts.

4.2.5.2.2. Flow Classification

Virtual switches classify packets into flows by processing and matching particular header fields in the packet/frame and/or the input port where the packets/frames arrived. The vSwitch then carries out an action on the group of packets that match the classification parameters. Thus a flow is considered to be a sequence of packets that have a shared set of header field values, or that have arrived on the same port and have the same action applied to them. Performance results can vary based on the parameters the vSwitch uses to match a flow. The recommended flow classification parameters for L3 vSwitch performance tests are: the input port, the source IP address, the destination IP address and the Ethernet protocol type field. It is essential to increase the flow time-out on a vSwitch before conducting any performance tests that do not measure the flow set-up time. Normally the first packet of a particular flow will install the flow in the vSwitch, which adds additional latency; subsequent packets of the same flow are not subject to this latency if the flow is already installed on the vSwitch.
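The classification behaviour described above can be sketched as a toy flow table. This is a minimal illustration only, not any particular vSwitch's implementation; the `FlowKey`/`FlowTable` names and the placeholder action are invented for this example. It shows why the first packet of a flow (the table miss) pays a set-up cost that subsequent packets avoid.

```python
# Illustrative sketch of L3 flow classification using the recommended
# match fields: input port, source IP, destination IP, EtherType.
# FlowKey, FlowTable and the action tuple are hypothetical names.

from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    in_port: int
    src_ip: str
    dst_ip: str
    eth_type: int          # e.g. 0x0800 for IPv4


class FlowTable:
    def __init__(self):
        self.flows = {}    # FlowKey -> action

    def lookup(self, pkt):
        """Return the cached action; on a miss, install the flow
        (the slow path that adds first-packet latency)."""
        key = FlowKey(pkt["in_port"], pkt["src_ip"],
                      pkt["dst_ip"], pkt["eth_type"])
        if key not in self.flows:                 # flow set-up cost
            self.flows[key] = ("output", key.in_port ^ 1)  # toy action
        return self.flows[key]
```

All packets sharing these four field values hit the same table entry, which is exactly what makes per-flow time-outs matter for throughput tests.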

4.2.5.2.3. Test Priority

Tests will be assigned a priority in order to determine which tests should be implemented immediately and which test implementations can be deferred.

Priority can be one of the following types:

  • Urgent: Must be implemented immediately.
  • High: Must be implemented in the next release.
  • Medium: May be implemented after the release.
  • Low: May or may not be implemented at all.

4.2.5.2.4. SUT Setup

The SUT should be configured to its “default” state. The SUT’s configuration or set-up must not change between tests in any way other than what is required to do the test. All supported protocols must be configured and enabled for each test set up.

4.2.5.2.5. Port Configuration

The DUT should be configured with n ports where n is a multiple of 2. Half of the ports on the DUT should be used as ingress ports and the other half of the ports on the DUT should be used as egress ports. Where a DUT has more than 2 ports, the ingress data streams should be set-up so that they transmit packets to the egress ports in sequence so that there is an even distribution of traffic across ports. For example, if a DUT has 4 ports 0(ingress), 1(ingress), 2(egress) and 3(egress), the traffic stream directed at port 0 should output a packet to port 2 followed by a packet to port 3. The traffic stream directed at port 1 should also output a packet to port 2 followed by a packet to port 3.
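The even ingress-to-egress distribution described above can be sketched as a small helper: with n ports, the first n/2 are ingress and each ingress stream cycles through all egress ports in sequence. The `egress_sequence` function is a hypothetical illustration of the rule, not part of any test framework.

```python
# Sketch of the port-configuration rule: each ingress stream outputs
# packets to the egress ports in round-robin order, giving an even
# distribution of traffic across egress ports.

from itertools import islice, cycle


def egress_sequence(n_ports, ingress_port, n_packets):
    """Egress port used for each successive packet sent into
    `ingress_port`, for a DUT with `n_ports` ports (n even)."""
    assert n_ports % 2 == 0 and ingress_port < n_ports // 2
    egress_ports = range(n_ports // 2, n_ports)   # second half = egress
    return list(islice(cycle(egress_ports), n_packets))
```

For the 4-port example in the text (ports 0 and 1 ingress, 2 and 3 egress), both ingress streams alternate between ports 2 and 3.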

4.2.5.2.6. Frame Formats

Frame formats

Layer 2 (data link layer) protocols

  • Ethernet II
+-----------------+---------+-----------+
| Ethernet Header | Payload | Check Sum |
+-----------------+---------+-----------+
|_________________|_________|___________|
      14 Bytes     46 - 1500   4 Bytes
                     Bytes

Layer 3 (network layer) protocols

  • IPv4
+-----------------+-----------+---------+-----------+
| Ethernet Header | IP Header | Payload | Checksum  |
+-----------------+-----------+---------+-----------+
|_________________|___________|_________|___________|
      14 Bytes       20 bytes  26 - 1480   4 Bytes
                                 Bytes
  • IPv6
+-----------------+-----------+---------+-----------+
| Ethernet Header | IP Header | Payload | Checksum  |
+-----------------+-----------+---------+-----------+
|_________________|___________|_________|___________|
      14 Bytes       40 bytes  26 - 1460   4 Bytes
                                 Bytes

Layer 4 (transport layer) protocols

  • TCP
  • UDP
  • SCTP
+-----------------+-----------+-----------------+---------+-----------+
| Ethernet Header | IP Header | Layer 4 Header  | Payload | Checksum  |
+-----------------+-----------+-----------------+---------+-----------+
|_________________|___________|_________________|_________|___________|
      14 Bytes      40 bytes      20 Bytes       6 - 1460   4 Bytes
                                                  Bytes

Layer 5 (application layer) protocols

  • RTP
  • GTP
+-----------------+-----------+-----------------+---------+-----------+
| Ethernet Header | IP Header | Layer 4 Header  | Payload | Checksum  |
+-----------------+-----------+-----------------+---------+-----------+
|_________________|___________|_________________|_________|___________|
      14 Bytes      20 bytes     20 Bytes        >= 6 Bytes   4 Bytes
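The payload ranges in the tables above follow from subtracting the listed header and checksum overheads from the standard 64-1518 byte Ethernet frame sizes. A small arithmetic sketch (the helper below is illustrative, not part of the specification) verifies the rows whose overheads are internally consistent:

```python
# Payload range = frame size range minus fixed overheads.
# Constants taken from the frame-format tables above.

ETH_HEADER, FCS = 14, 4          # bytes
MIN_FRAME, MAX_FRAME = 64, 1518  # standard Ethernet frame sizes


def payload_range(*extra_headers):
    """Min/max payload after Ethernet header, FCS, and any additional
    protocol headers (e.g. 20 for IPv4, 20 more for TCP/UDP/SCTP)."""
    overhead = ETH_HEADER + FCS + sum(extra_headers)
    return MIN_FRAME - overhead, MAX_FRAME - overhead


assert payload_range() == (46, 1500)        # Ethernet II row
assert payload_range(20) == (26, 1480)      # IPv4 row
assert payload_range(20, 20) == (6, 1460)   # 20B IP + 20B L4 header
```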
4.2.5.2.7. Packet Throughput

There is a difference between an Ethernet frame, an IP packet, and a UDP datagram. In the seven-layer OSI model of computer networking, packet refers to a data unit at layer 3 (network layer). The correct term for a data unit at layer 2 (data link layer) is a frame, and at layer 4 (transport layer) is a segment or datagram.

Important concepts related to 10GbE performance are frame rate and throughput. The MAC bit rate of 10GbE, defined in the IEEE standard 802.3ae, is 10 billion bits per second. Frame rate is based on the bit rate and frame format definitions. Throughput, defined in IETF RFC 1242, is the highest rate at which the system under test can forward the offered load without loss.

The frame rate for 10GbE is determined by a formula that divides the 10 billion bits per second by the preamble + frame length + inter-frame gap.

The maximum frame rate is calculated using the minimum values of the following parameters, as described in the IEEE 802.3ae standard:

  • Preamble: 8 bytes * 8 = 64 bits
  • Frame Length: 64 bytes (minimum) * 8 = 512 bits
  • Inter-frame Gap: 12 bytes (minimum) * 8 = 96 bits

Therefore, Maximum Frame Rate (64B Frames) = MAC Transmit Bit Rate / (Preamble + Frame Length + Inter-frame Gap) = 10,000,000,000 / (64 + 512 + 96) = 10,000,000,000 / 672 = 14,880,952.38 frames per second (fps)
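The calculation above generalizes to any frame size. The helper below is an illustrative sketch of the same formula, using the IEEE 802.3ae minimum preamble and inter-frame gap:

```python
# Maximum frame rate on a 10GbE link for a given frame size, per the
# formula above: bit rate / bits on the wire per frame.

def max_frame_rate(frame_bytes, bit_rate=10_000_000_000,
                   preamble_bytes=8, ifg_bytes=12):
    """Frames per second at full line rate (802.3ae minimum preamble
    and inter-frame gap)."""
    bits_on_wire = (preamble_bytes + frame_bytes + ifg_bytes) * 8
    return bit_rate / bits_on_wire


# 64-byte frames: 10e9 / 672 bits per frame = 14,880,952.38 fps,
# matching the worked example above. Larger frames give lower rates.
```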

4.2.5.3. RFCs for testing virtual switch performance

The starting point for defining the suite of tests for benchmarking the performance of a virtual switch is to take existing RFCs and standards that were designed to test their physical counterparts and adapt them for testing virtual switches. The rationale behind this is to establish a fair comparison between the performance of virtual and physical switches. This section outlines the RFCs that are used by this specification.

4.2.5.3.1. RFC 1242 Benchmarking Terminology for Network Interconnection Devices

RFC 1242 defines the terminology that is used in describing performance benchmarking tests and their results. Definitions and discussions covered include: back-to-back, bridge, bridge/router, constant load, data link frame size, frame loss rate, inter-frame gap, latency, and many more.

4.2.5.3.2. RFC 2544 Benchmarking Methodology for Network Interconnect Devices

RFC 2544 outlines a benchmarking methodology for network Interconnect Devices. The methodology results in performance metrics such as latency, frame loss percentage, and maximum data throughput.

In this document network “throughput” (measured in millions of frames per second) is based on RFC 2544, unless otherwise noted. Frame size refers to Ethernet frames ranging from smallest frames of 64 bytes to largest frames of 9K bytes.

Types of tests are:

  1. Throughput test defines the maximum number of frames per second that can be transmitted without any error, or 0% loss ratio. In some Throughput tests (and those tests with long duration), evaluation of an additional frame loss ratio is suggested. The current ratio (10^-7 %) is based on understanding the typical user-to-user packet loss ratio needed for good application performance and recognizing that a single transfer through a vswitch must contribute a tiny fraction of user-to-user loss. Further, the ratio 10^-7 % also recognizes practical limitations when measuring loss ratio.
  2. Latency test measures the time required for a frame to travel from the originating device through the network to the destination device. Please note that the RFC 2544 latency measurement will be superseded with a measurement of average latency over all successfully transferred packets or frames.
  3. Frame loss test measures the network’s response in overload conditions - a critical indicator of the network’s ability to support real-time applications in which a large amount of frame loss will rapidly degrade service quality.
  4. Burst test assesses the buffering capability of a virtual switch. It measures the maximum number of frames received at full line rate before a frame is lost. In carrier Ethernet networks, this measurement validates the excess information rate (EIR) as defined in many SLAs.
  5. System recovery to characterize speed of recovery from an overload condition.
  6. Reset to characterize speed of recovery from device or software reset. This type of test has been updated by RFC 6201; as such, the methodology used by this specification is that of RFC 6201.
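
The additional frame loss ratio of 10^-7 % mentioned for Throughput tests can be put in perspective with a short calculation (an illustrative helper, not part of RFC 2544):

```python
# How many frames may be lost at a 10^-7 % (i.e. 1e-9) loss ratio
# during a long-duration throughput run.

def allowed_lost_frames(fps, duration_s, loss_ratio=1e-9):
    """Frames that may be lost given a forwarding rate in frames per
    second, a test duration, and a fractional loss ratio."""
    return fps * duration_s * loss_ratio


# e.g. 64B frames at 10GbE line rate (~14.88 Mfps) for one hour allows
# only ~54 lost frames out of ~53.6 billion transmitted.
```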

Although not included in the defined RFC 2544 standard, another crucial measurement in Ethernet networking is packet delay variation. The definition set out by this specification comes from RFC 5481.

4.2.5.3.3. RFC 2285 Benchmarking Terminology for LAN Switching Devices

RFC 2285 defines the terminology used for benchmarking LAN switching devices. It extends RFC 1242 and defines: DUTs, SUTs, traffic orientation and distribution, bursts, loads, forwarding rates, etc.

4.2.5.3.4. RFC 2889 Benchmarking Methodology for LAN Switching

RFC 2889 outlines a benchmarking methodology for LAN switching; it extends RFC 2544. The outlined methodology gathers performance metrics for forwarding, congestion control, latency, address handling and filtering.

4.2.5.3.5. RFC 3918 Methodology for IP Multicast Benchmarking

RFC 3918 outlines a methodology for IP Multicast benchmarking.

4.2.5.3.6. RFC 4737 Packet Reordering Metrics

RFC 4737 describes metrics for identifying and counting re-ordered packets within a stream, and metrics to measure the extent each packet has been re-ordered.

4.2.5.3.7. RFC 5481 Packet Delay Variation Applicability Statement

RFC 5481 defines two common but different forms of delay variation metrics, and compares the metrics over a range of networking circumstances and tasks. The most suitable form for vSwitch benchmarking is the “PDV” form.

4.2.5.3.8. RFC 6201 Device Reset Characterization

RFC 6201 extends the methodology for characterizing the speed of recovery of the DUT from device or software reset described in RFC 2544.

4.2.6. Item pass/fail criteria

vswitchperf does not specify Pass/Fail criteria for the tests in terms of a threshold, as benchmarks do not (and should not) do this. The results/metrics for a test are simply reported. If it had to be defined, a test is considered to have passed if it successfully completed and a relevant metric was recorded/reported for the SUT.

4.2.7. Suspension criteria and resumption requirements

In the case of a throughput test, a test should be suspended if a virtual switch is failing to forward any traffic. A test should be restarted from a clean state if the intention is to carry out the test again.

4.2.8. Test deliverables

Each test should produce a test report that details SUT information as well as the test results. There are a number of parameters related to the system, DUT and tests that can affect the repeatability of test results and should be recorded. In order to minimise the variation in the results of a test, it is recommended that the test report includes the following information:

  • Hardware details including:
    • Platform details.
    • Processor details.
    • Memory information (see below)
    • Number of enabled cores.
    • Number of cores used for the test.
    • Number of physical NICs, as well as their details (manufacturer, versions, type and the PCI slot they are plugged into).
    • NIC interrupt configuration.
    • BIOS version, release date and any configurations that were modified.
  • Software details including:
    • OS version (for host and VNF)
    • Kernel version (for host and VNF)
    • GRUB boot parameters (for host and VNF).
    • Hypervisor details (Type and version).
    • Selected vSwitch, version number or commit id used.
    • vSwitch launch command line if it has been parameterised.
    • Memory allocation to the vSwitch – which NUMA node it is using, and how many memory channels.
    • Where the vswitch is built from source: compiler details including versions and the flags that were used to compile the vSwitch.
    • DPDK or any other SW dependency version number or commit id used.
    • Memory allocation to a VM - if it’s from Hugepages/elsewhere.
    • VM storage type: snapshot/independent persistent/independent non-persistent.
    • Number of VMs.
    • Number of Virtual NICs (vNICs), versions, type and driver.
    • Number of virtual CPUs and their core affinity on the host.
    • vNIC interrupt configuration.
    • Thread affinitization for the applications (including the vSwitch itself) on the host.
    • Details of Resource isolation, such as CPUs designated for Host/Kernel (isolcpu) and CPUs designated for specific processes (taskset).
  • Memory Details
    • Total memory
    • Type of memory
    • Used memory
    • Active memory
    • Inactive memory
    • Free memory
    • Buffer memory
    • Swap cache
    • Total swap
    • Used swap
    • Free swap
  • Test duration.
  • Number of flows.
  • Traffic Information:
    • Traffic type - UDP, TCP, IMIX / Other.
    • Packet Sizes.
  • Deployment Scenario.

Note: Tests that require additional parameters to be recorded will explicitly specify this.

4.2.9. Test management

This section will detail the test activities that will be conducted by vsperf as well as the infrastructure that will be used to complete the tests in OPNFV.

4.2.10. Planned activities and tasks; test progression

A key consideration when conducting any sort of benchmark is trying to ensure the consistency and repeatability of test results between runs. When benchmarking the performance of a virtual switch there are many factors that can affect the consistency of results. This section describes these factors and the measures that can be taken to limit their effects. In addition, this section will outline some system tests to validate the platform and the VNF before conducting any vSwitch benchmarking tests.

System Isolation:

When conducting a benchmarking test on any SUT, it is essential to limit (and if reasonable, eliminate) any noise that may interfere with the accuracy of the metrics collected by the test. This noise may be introduced by other hardware or software (OS, other applications), and can result in significantly varying performance metrics being collected between consecutive runs of the same test. In the case of characterizing the performance of a virtual switch, there are a number of configuration parameters that can help increase the repeatability and stability of test results, including:

  • OS/GRUB configuration:
    • maxcpus = n where n >= 0; limits the kernel to using 'n' processors. Only use exactly what you need.
    • isolcpus: Isolate CPUs from the general scheduler. Isolate all CPUs bar one, which will be used by the OS.
    • Use taskset to affinitize the forwarding application and the VNFs onto isolated cores. VNFs and the vSwitch should be allocated their own cores, i.e. they must not share cores. vCPUs for the VNF should also be affinitized to individual cores.
    • Limit the number of background applications that are running and set the OS to boot to runlevel 3. Make sure to kill any unnecessary system processes/daemons.
    • Only enable hardware that you need to use for your test – to ensure there are no other interrupts on the system.
    • Configure NIC interrupts to only use the cores that are not allocated to any other process (VNF/vSwitch).
  • NUMA configuration: Any unused sockets in a multi-socket system should be disabled.
  • CPU pinning: The vSwitch and the VNF should each be affinitized to separate logical cores using a combination of maxcpus, isolcpus and taskset.
  • BIOS configuration: BIOS should be configured for performance where an explicit option exists; sleep states should be disabled; any virtualization optimization technologies should be enabled; hyperthreading should also be enabled; turbo boost and overclocking should be disabled.
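The CPU pinning step above can be sketched with the Python standard library's equivalent of `taskset` (Linux-only). This is a minimal illustration of restricting a process to one logical core; in a real set-up the vSwitch and each VNF vCPU would be pinned to distinct cores isolated via `isolcpus`.

```python
# Minimal sketch of CPU pinning on Linux using os.sched_setaffinity,
# the standard-library counterpart of the `taskset` command.

import os


def pin_to_core(core_id):
    """Restrict the calling process to a single logical core and
    return the resulting affinity set."""
    os.sched_setaffinity(0, {core_id})   # pid 0 = calling process
    return os.sched_getaffinity(0)


# Example: pin this process to the lowest core it may currently use.
# available = os.sched_getaffinity(0)
# pin_to_core(min(available))
```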

System Validation:

System validation is broken down into two sub-categories: platform validation and VNF validation. The validation test itself involves verifying the forwarding capability and stability of the sub-system under test. The rationale behind system validation is twofold: firstly, to give a tester confidence in the stability of the platform or VNF that is being tested; and secondly, to provide base performance comparison points to understand the overhead introduced by the virtual switch.

  • Benchmark platform forwarding capability: This is an OPTIONAL test used to verify the platform and measure the base performance (maximum forwarding rate in fps and latency) that can be achieved by the platform without a vSwitch or a VNF. The following diagram outlines the set-up for benchmarking Platform forwarding capability:

                                                         __
    +--------------------------------------------------+   |
    |   +------------------------------------------+   |   |
    |   |                                          |   |   |
    |   |          l2fw or DPDK L2FWD app          |   |  Host
    |   |                                          |   |   |
    |   +------------------------------------------+   |   |
    |   |                 NIC                      |   |   |
    +---+------------------------------------------+---+ __|
               ^                           :
               |                           |
               :                           v
    +--------------------------------------------------+
    |                                                  |
    |                traffic generator                 |
    |                                                  |
    +--------------------------------------------------+
    
  • Benchmark VNF forwarding capability: This test is used to verify the VNF and measure the base performance (maximum forwarding rate in fps and latency) that can be achieved by the VNF without a vSwitch. The performance metrics collected by this test will serve as a key comparison point for NIC passthrough technologies and vSwitches. VNF in this context refers to the hypervisor and the VM. The following diagram outlines the set-up for benchmarking VNF forwarding capability:

                                                         __
    +--------------------------------------------------+   |
    |   +------------------------------------------+   |   |
    |   |                                          |   |   |
    |   |                 VNF                      |   |   |
    |   |                                          |   |   |
    |   +------------------------------------------+   |   |
    |   |          Passthrough/SR-IOV              |   |  Host
    |   +------------------------------------------+   |   |
    |   |                 NIC                      |   |   |
    +---+------------------------------------------+---+ __|
               ^                           :
               |                           |
               :                           v
    +--------------------------------------------------+
    |                                                  |
    |                traffic generator                 |
    |                                                  |
    +--------------------------------------------------+
    

Methodology to benchmark Platform/VNF forwarding capability

The recommended methodology for the platform/VNF validation and benchmark is:

  • Run the RFC 2889 Maximum Forwarding Rate test. This test will produce maximum forwarding rate and latency results that will serve as the expected values. These expected values can be used in subsequent steps or compared with in subsequent validation tests.
  • Transmit bidirectional traffic at line rate/max forwarding rate (whichever is higher) for at least 72 hours, and measure throughput (fps) and latency. (Note: traffic should be bidirectional.)
  • Establish a baseline forwarding rate for what the platform can achieve.
  • Additional validation: After the test has completed for 72 hours, run bidirectional traffic at the maximum forwarding rate once more to see if the system is still functional, and measure throughput (fps) and latency. Compare the newly obtained values with the expected values.

NOTE 1: How the Platform is configured for its forwarding capability test (BIOS settings, GRUB configuration, runlevel...) is how the platform should be configured for every test after this.

NOTE 2: How the VNF is configured for its forwarding capability test (# of vCPUs, vNICs, Memory, affinitization…) is how it should be configured for every test that uses a VNF after this.

Methodology to benchmark the VNF to vSwitch to VNF deployment scenario

vsperf has identified the following concerns when benchmarking the VNF to vSwitch to VNF deployment scenario:

  • The accuracy of the timing synchronization between VNFs/VMs.
  • The clock accuracy of a VNF/VM if they were to be used as traffic generators.
  • VNF traffic generator/receiver may be using resources of the system under test, causing at least three forms of workload to increase as the traffic load increases (generation, switching, receiving).

The recommendation from vsperf is that tests for this scenario must include an external HW traffic generator to act as the tester/traffic transmitter and receiver. The prescribed methodology to benchmark this deployment scenario with an external tester involves the following three steps:

  1. Determine the forwarding capability and latency through the virtual interface connected to the VNF/VM.


Virtual interfaces performance benchmark

  2. Determine the forwarding capability and latency through the VNF/hypervisor.

Hypervisor performance benchmark

  3. Determine the forwarding capability and latency for the VNF to vSwitch to VNF, taking the information from the previous two steps into account.

VNF to vSwitch to VNF performance benchmark

vsperf also identified an alternative configuration for the final step:


VNF to vSwitch to VNF alternative performance benchmark

4.2.11. Environment/infrastructure

VSPERF CI jobs are run using the OPNFV lab infrastructure as described by the Pharos Project (https://www.opnfv.org/community/projects/pharos). A VSPERF POD is described here: https://wiki.opnfv.org/display/pharos/VSPERF+in+Intel+Pharos+Lab+-+Pod+12

4.2.11.1. vsperf CI

vsperf CI jobs are broken down into:

  • Daily job:
    • Runs every day and takes about 10 hours to complete.
    • TESTCASES_DAILY='phy2phy_tput back2back phy2phy_tput_mod_vlan phy2phy_scalability pvp_tput pvp_back2back pvvp_tput pvvp_back2back'.
    • TESTPARAM_DAILY='--test-params TRAFFICGEN_PKT_SIZES=(64,128,512,1024,1518)'.
  • Merge job:
    • Runs whenever patches are merged to master.
    • Runs a basic Sanity test.
  • Verify job:
    • Runs every time a patch is pushed to gerrit.
    • Builds documentation.
4.2.11.2. Scripts:

There are two scripts that are part of VSPERF's CI:

  • build-vsperf.sh: Lives in the VSPERF repository in the ci/ directory and is used to run vsperf with the appropriate cli parameters.
  • vswitchperf.yml: YAML description of our Jenkins job. Lives in the RELENG repository.

More info on vsperf CI can be found here: https://wiki.opnfv.org/display/vsperf/VSPERF+CI

4.2.12. Responsibilities and authority

The group responsible for managing, designing, preparing and executing the tests listed in the LTD are the vsperf committers and contributors. The vsperf committers and contributors should work with the relevant OPNFV projects to ensure that the infrastructure is in place for testing vswitches, and that the results are published to a common endpoint (a results database).

IETF RFC 8204

The IETF Benchmarking Methodology Working Group (BMWG) was re-chartered in 2014 to include benchmarking for Virtualized Network Functions (VNFs) and their infrastructure. A version of the VSPERF test specification was summarized in an Internet Draft, Benchmarking Virtual Switches in OPNFV, and contributed to the BMWG. In June 2017 the Internet Engineering Steering Group of the IETF approved the most recent version of the draft for publication as a new test specification (RFC 8204).

VSPERF CI Test Cases

CI Test cases run daily on the VSPERF Pharos POD for master and stable branches.


Yardstick

Yardstick Developer Guide
1. Introduction

Yardstick is a project dealing with performance testing. Yardstick produces its own test cases but can also be considered as a framework to support feature project testing.

Yardstick developed a test API that can be used by any OPNFV project. Therefore there are many ways to contribute to Yardstick.

You can:

  • Develop new test cases
  • Review code
  • Develop Yardstick API / framework
  • Develop Yardstick grafana dashboards and Yardstick reporting page
  • Write Yardstick documentation

This developer guide describes how to interact with the Yardstick project. The first section details the main working areas of the project. The second part is a list of "how to" guides to help you join the Yardstick family, whatever your field of interest is.

1.1. Where can I find some help to start?

This guide is made for you. You can have a look at the user guide. There are also references on documentation, video tutorials, tips in the project wiki page. You can also directly contact us by mail with [Yardstick] prefix in the subject at opnfv-tech-discuss@lists.opnfv.org or on the IRC chan #opnfv-yardstick.

2. Yardstick developer areas
2.1. Yardstick framework

Yardstick can be considered a framework. Yardstick is released as a Docker image, including tools, scripts and a CLI to prepare the environment and run tests. It simplifies the integration of external test suites in CI pipelines and provides commodity tools to collect and display results.

Since Danube, test categories (also known as tiers) have been created to group similar tests, provide consistent sub-lists and, in the end, optimize test duration for CI (see the How To section).

The definition of the tiers has been agreed by the testing working group.

The tiers are:

  • smoke
  • features
  • components
  • performance
  • vnf
3. How-tos
3.1. How does Yardstick work?

The installation and configuration of Yardstick are described in the user guide.

3.2. How to work with test cases?
3.2.1. Sample Test cases

Yardstick provides many sample test cases, which are located in the samples directory of the repository.

Sample test cases are designed with the following goals:

  1. Helping users better understand Yardstick features (including new features and new test capacity).
  2. Helping developers debug a new feature and test case before it is officially released.
  3. Helping other developers understand and verify a new patch before the patch is merged.

Developers should also upload their sample test cases when they upload a patch that introduces a new Yardstick test case or feature.

3.2.2. OPNFV Release Test cases

OPNFV Release test cases are located at yardstick/tests/opnfv/test_cases. These test cases are run by OPNFV CI jobs, which means these test cases should be more mature than sample test cases. OPNFV scenario owners can select related test cases and add them into the test suites which represent their scenario.

3.2.3. Test case Description File

This section introduces the meaning of the test case description file. We will use ping.yaml as an example to show how to understand the test case description file. This YAML file consists of two sections: one is scenarios, the other is context:

---
  # Sample benchmark task config file
  # measure network latency using ping

  schema: "yardstick:task:0.1"

  {% set provider = provider or none %}
  {% set physical_network = physical_network or 'physnet1' %}
  {% set segmentation_id = segmentation_id or none %}
  scenarios:
  -
    type: Ping
    options:
      packetsize: 200
    host: athena.demo
    target: ares.demo

    runner:
      type: Duration
      duration: 60
      interval: 1

    sla:
      max_rtt: 10
      action: monitor

  context:
    name: demo
    image: yardstick-image
    flavor: yardstick-flavor
    user: ubuntu

    placement_groups:
      pgrp1:
        policy: "availability"

    servers:
      athena:
        floating_ip: true
        placement: "pgrp1"
      ares:
        placement: "pgrp1"

    networks:
      test:
        cidr: '10.0.1.0/24'
        {% if provider == "vlan" %}
        provider: {{provider}}
        physical_network: {{physical_network}}
          {% if segmentation_id %}
        segmentation_id: {{segmentation_id}}
          {% endif %}
        {% endif %}

The context section describes the pre-conditions of the test. As ping.yaml shows, you can configure the image, flavor, name, affinity and network of the test VMs (servers); with this section you get a pre-configured environment for testing. Yardstick automatically sets up the stack described in this section: it converts the section to a Heat template and sets up the VMs with the heat client (Yardstick can also convert this section to a Kubernetes template to set up containers).

In the example above, two test VMs (athena and ares) are configured by the keyword servers. flavor determines how many vCPUs and how much memory the test VMs get. yardstick-flavor is a basic flavor that is automatically created when you run the command yardstick env prepare; it provides 1 vCPU, 1 GB RAM and a 3 GB disk. image is the image name of the test VMs. If you use cirros.3.5.0, you need to fill the username of this image into user. The placement policy of the test VMs has two possible values (affinity and availability); availability means anti-affinity. In the network section, you can configure which provider network and physical_network you want the test VMs to use. You may need to configure segmentation_id when your network is a VLAN.
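The Jinja2 `{% set x = x or default %}` lines at the top of ping.yaml implement simple parameter defaulting: a value supplied to the task wins, otherwise the default applies. The same behaviour can be sketched in plain Python (the function name is illustrative):

```python
# Sketch of the "{% set x = x or default %}" defaulting used in ping.yaml.
# Like Jinja2's "or", any falsy supplied value falls back to the default.
def resolve_defaults(params, defaults):
    return {key: params.get(key) or default for key, default in defaults.items()}

defaults = {
    "provider": None,
    "physical_network": "physnet1",
    "segmentation_id": None,
}
resolve_defaults({"provider": "vlan"}, defaults)
# → {'provider': 'vlan', 'physical_network': 'physnet1', 'segmentation_id': None}
```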

Moreover, you can configure your own flavor as below, and Yardstick will set up the stack for you.

flavor:
  name: yardstick-new-flavor
  vcpus: 12
  ram: 1024
  disk: 2

Besides the default Heat context, Yardstick also allows you to set up two other types of context: Node and Kubernetes.

context:
  type: Kubernetes
  name: k8s

and

context:
  type: Node
  name: LF

The scenarios section describes the testing steps; you can orchestrate complex testing procedures through scenarios.

Each scenario performs one testing step. In a scenario, you can configure the type of scenario (operation), the runner type and the SLA of the scenario.

For TC002 we only have one step, which is to ping from the host VM to the target VM. In this step we also have some detailed operations implemented (such as SSH to the VM, ping from VM1 to VM2, get the latency, verify the SLA, report the result).

If you want to see the implementation details, check the scenario's Python file. For the Ping scenario, you can find it in the Yardstick repo (yardstick/yardstick/benchmark/scenarios/networking/ping.py).

After you select the type of scenario (such as Ping), you select one type of runner. There are four types of runner; Iteration and Duration are the most commonly used, and the default is Iteration.

For Iteration, you can specify the iteration number and interval of iteration.

runner:
  type: Iteration
  iterations: 10
  interval: 1

This means Yardstick will repeat the Ping test 10 times, with an interval of one second between iterations.

For Duration, you can specify the duration of this scenario and the interval of each ping test.

runner:
  type: Duration
  duration: 60
  interval: 10

This means Yardstick will run the ping test in a loop until the total time of the scenario reaches 60 s, with an interval of ten seconds between loops.
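The two runner behaviours described above can be sketched as plain loops. The function names and the callable interface are illustrative; the real Yardstick runners are more involved (separate processes, result queues, SLA handling):

```python
import time

# Illustrative sketch of the Iteration runner: repeat a step a fixed
# number of times, pausing `interval` seconds between iterations.
def run_iterations(scenario, iterations=10, interval=1):
    results = []
    for _ in range(iterations):
        results.append(scenario())   # one test step, e.g. one ping
        time.sleep(interval)
    return results

# Illustrative sketch of the Duration runner: repeat a step until the
# total elapsed time reaches `duration` seconds.
def run_for_duration(scenario, duration=60, interval=10):
    results, deadline = [], time.monotonic() + duration
    while time.monotonic() < deadline:
        results.append(scenario())
        time.sleep(interval)
    return results
```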

The SLA is the success criterion of the scenario. It depends on the scenario; different scenarios can have different SLA metrics.
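As an illustration, the sla block of the Ping scenario above (max_rtt with a monitor action) amounts to a check like the following sketch; the function name is an assumption, and only the monitor/assert action names come from Yardstick:

```python
# Illustrative SLA check mirroring the "sla: max_rtt / action" block.
# With action "monitor" a violation is only reported; with "assert"
# the test run would be aborted.
def check_sla(measured_rtt, max_rtt=10, action="monitor"):
    passed = measured_rtt <= max_rtt
    if not passed and action == "assert":
        raise AssertionError(f"SLA violated: rtt {measured_rtt} > {max_rtt}")
    return passed
```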

3.2.4. How to write a new test case

Yardstick already provides a library of testing steps (i.e. different types of scenario).

Basically, what you need to do is to orchestrate the scenario from the library.

Here, we will show two cases. One is how to write a simple test case, the other is how to write a quite complex test case.

3.2.4.1. Write a new simple test case

First, imagine a basic test case description as below.

Storage Performance

metric: IOPS (average IOs performed per second), Throughput (average disk read/write bandwidth rate), Latency (average disk read/write latency)

test purpose: The purpose of TC005 is to evaluate the IaaS storage performance with regards to IOPS, throughput and latency.

test description: The fio test is invoked in a host VM on a compute blade; a job file as well as parameters are passed to fio, and fio starts doing what the job file tells it to do.

configuration

file: opnfv_yardstick_tc005.yaml

IO type is set to read, write, randwrite, randread, rw. IO block size is set to 4KB, 64KB, 1024KB. fio is run for each combination of IO type and IO block size; each iteration runs for 30 seconds (10 for ramp time, 20 for runtime).

For SLA, minimum read/write iops is set to 100, minimum read/write throughput is set to 400 KB/s, and maximum read/write latency is set to 20000 usec.
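The SLA numbers above translate into a straightforward pass/fail check; this sketch and its function name are illustrative only, not part of the TC005 implementation:

```python
# Illustrative check of the TC005 SLA: minimum read/write IOPS 100,
# minimum throughput 400 KB/s, maximum latency 20000 usec.
def fio_sla_ok(iops, throughput_kbs, latency_usec,
               min_iops=100, min_throughput_kbs=400, max_latency_usec=20000):
    return (iops >= min_iops
            and throughput_kbs >= min_throughput_kbs
            and latency_usec <= max_latency_usec)

fio_sla_ok(iops=3500, throughput_kbs=14000, latency_usec=900)   # → True
```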

applicability

This test case can be configured with different:

  • IO types;
  • IO block size;
  • IO depth;
  • ramp time;
  • test duration.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably higher throughput and lower latency are expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read/write bandwidths are lower than this.

pre-test conditions

The test case image needs to be installed into Glance with fio included in it.

No POD specific requirements have been identified.

test sequence: description and expected result

step 1: A host VM with fio installed is booted.

step 2: Yardstick connects to the host VM using SSH. The ‘fio_benchmark’ bash script is copied from the Jump Host to the host VM via the SSH tunnel.

step 3:

‘fio_benchmark’ script is invoked. Simulated IO operations are started. IOPS, disk read/write bandwidth and latency are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4: The host VM is deleted.

test verdict: Fails only if the SLA is not passed, or if there is a test case execution problem.

TODO

3.3. How can I contribute to Yardstick?

If you are already a contributor of any OPNFV project, you can contribute to Yardstick. If you are totally new to OPNFV, you must first create your Linux Foundation account, then contact us in order to declare you in the repository database.

We distinguish two levels of contributors:

  • the standard contributor can push patches and vote +1/0/-1 on any Yardstick patch
  • the committer can vote -2/-1/0/+1/+2 and merge

Yardstick committers are promoted by the Yardstick contributors.

3.3.1. Gerrit & JIRA introduction

OPNFV uses Gerrit for web based code review and repository management for the Git Version Control System. You can access OPNFV Gerrit. Please note that you need to have Linux Foundation ID in order to use OPNFV Gerrit. You can get one from this link.

OPNFV uses JIRA for issue management. An important principle of change management is to have two-way trace-ability between issue management (i.e. JIRA) and the code repository (via Gerrit). In this way, individual commits can be traced to JIRA issues and we also know which commits were used to resolve a JIRA issue.

If you want to contribute to Yardstick, you can pick an issue from Yardstick’s JIRA dashboard or you can create your own issue and submit it to JIRA.

3.3.2. Install Git and git-review

Installing and configuring Git and git-review is necessary in order to submit code to Gerrit. The Getting to the code page will provide you with some help for that.

3.3.3. Verify your patch locally before submitting

Once you finish a patch, you can submit it to Gerrit for code review. Sending a new patch to Gerrit triggers a patch verify job on Jenkins CI. The Yardstick patch verify job includes a Python pylint check, unit tests and a code coverage test. Before you submit your patch, it is recommended to run the patch verification in your local environment first.

Open a terminal window and set the project’s directory to the working directory using the cd command. Assume that YARDSTICK_REPO_DIR is the path to the Yardstick project folder on your computer:

cd $YARDSTICK_REPO_DIR

Verify your patch:

tox

tox is used in CI but can also be run locally from the CLI.

3.3.4. Submit the code with Git

Tell Git which files you would like to take into account for the next commit. This is called ‘staging’ the files, by placing them into the staging area, using the git add command (or the synonym git stage command):

git add $YARDSTICK_REPO_DIR/samples/sample.yaml

Alternatively, you can choose to stage all tracked files that have been modified (that is, the files you have worked on) since the last time you generated a commit, by using the -u argument:

git add -u

Git won’t let you push (upload) any code to Gerrit if you haven’t pulled the latest changes first. So the next step is to pull (download) the latest changes made to the project by other collaborators using the pull command:

git pull

Now that you have the latest version of the project and you have staged the files you wish to push, it is time to actually commit your work to your local Git repository:

git commit --signoff -m "Title of change"

Text of the change that describes at a high level what was done. There is a lot of
documentation in the code, so you do not need to repeat it here.

JIRA: YARDSTICK-XXX

The message that is required for the commit should follow a specific set of rules. This practice standardizes the description messages attached to commits and eventually makes it easier to navigate among them.

This document happened to be very clear and useful to get started with that.
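The convention shown above (a short title, a descriptive body and a JIRA: YARDSTICK-XXX trailer) can be checked mechanically. The validator below is an illustrative sketch, not an official Yardstick tool:

```python
import re

# Illustrative check of the commit-message convention: a non-empty title,
# a blank line after it, and a "JIRA: YARDSTICK-<number>" trailer.
def valid_commit_message(message):
    lines = message.splitlines()
    has_title = bool(lines) and bool(lines[0].strip())
    separated = len(lines) < 2 or not lines[1].strip()
    has_jira = any(re.match(r"JIRA:\s*YARDSTICK-\d+$", l.strip()) for l in lines)
    return has_title and separated and has_jira

msg = "Title of change\n\nText of the change.\n\nJIRA: YARDSTICK-123\n"
valid_commit_message(msg)   # → True
```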

3.3.5. Push the code to Gerrit for review

Now that the code has been committed into your local Git repository, the next step is to push it online to Gerrit for it to be reviewed. The command we will use is git review:

git review

This will automatically push your local commit to Gerrit. You can add Yardstick committers and contributors to review your code.

Gerrit for code review

You can find a list of Yardstick people here, or use the yardstick-reviewers and yardstick-committers groups in Gerrit.

3.3.6. Modify the code under review in Gerrit

At the same time the code is being reviewed in Gerrit, you may need to edit it to make some changes and then send it back for review. The following steps go through the procedure.

Once you have modified/edited your code files under your IDE, you will have to stage them. The git status command is very helpful at this point as it provides an overview of Git’s current state:

git status

This command lists the files that have been modified since the last commit.

You can now stage the files that have been modified as part of the Gerrit code review addition/modification/improvement using the git add command. It is now time to commit the newly modified files, but the objective here is not to create a new commit; we simply want to inject the new changes into the previous commit. You can achieve that with the --amend option on the git commit command:

git commit --amend

If the commit was successful, the git status command should no longer list the updated files as about to be committed.

The final step consists of pushing the newly modified commit to Gerrit:

git review
4. Backporting changes to stable branches

During the release cycle, when master and the stable/<release> branch have diverged, it may be necessary to backport (cherry-pick) changes to the stable/<release> branch once they have been merged to master. These changes should be identified by the committers reviewing the patch. Changes should be backported as soon as possible after merging of the original code.

.. note::
    Besides the commit and review process below, the JIRA ticket must be updated to add dual release versions and indicate that the change is to be backported.

The process for backporting is as follows:

  • Committer A merges a change to master (process for normal changes).
  • Committer A cherry-picks the change to stable/<release> branch (if the bug has been identified for backporting).
  • The original author should review the code and verify that it still works (and give a +1).
  • Committer B reviews the change, gives a +2 and merges to stable/<release>.

A backported change needs a +1 and a +2 from a committer who didn’t propose the change (i.e. minimum 3 people involved).
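The approval rule above can be expressed as a small predicate. The roles are simplified here (everyone voting is assumed able to give the vote shown), so treat this as an illustrative sketch:

```python
# Illustrative check of the backport rule: a +1, a +2 from a committer
# other than the change owner, and at least 3 distinct people involved.
def backport_approved(owner, votes):
    """votes: list of (reviewer, vote) tuples."""
    has_plus1 = any(v == 1 for _, v in votes)
    has_plus2_other = any(v == 2 and r != owner for r, v in votes)
    people = {owner} | {r for r, _ in votes}
    return has_plus1 and has_plus2_other and len(people) >= 3

backport_approved("committer_a", [("author", 1), ("committer_b", 2)])  # → True
```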

5. Plugins

For information about Yardstick plugins, refer to the chapter Installing a plug-in into Yardstick in the user guide.

7. Prerequisites

In order to integrate PROX tests into NSB, the following prerequisites are required.

8. Sample Prox Test Hardware Architecture

The following is a diagram of a sample NSB PROX Hardware Architecture for both NSB PROX on Bare metal and on Openstack.

In this example, when running Yardstick on bare metal, Yardstick runs on the deployment node, the generator runs on the deployment node and the System Under Test (SUT) runs on the Controller Node.

Sample NSB PROX Hardware Architecture
9. Prox Test Architecture

In order to create a new test, one must understand the architecture of the test.

A NSB Prox test architecture is composed of:

  • A traffic generator. This provides blocks of data on 1 or more ports to the SUT. The traffic generator also consumes the result packets from the system under test.

  • A SUT consumes the packets generated by the packet generator, applies one or more tasks to the packets and returns the modified packets to the traffic generator.

    This is an example of a sample NSB PROX test architecture.

NSB PROX test Architecture

This diagram is of a sample NSB PROX test application.

  • Traffic Generator
    • Generator Tasks - Composed of 1 or more tasks (It is possible to have multiple tasks sending packets to the same port No. See Tasks Ai and Aii plus Di and Dii)
      • Task Ai - Generates Packets on Port 0 of Traffic Generator and sends them to Port 0 of SUT
      • Task Aii - Generates Packets on Port 0 of Traffic Generator and sends them to Port 0 of SUT
      • Task B - Generates Packets on Port 1 of Traffic Generator and sends them to Port 1 of SUT
      • Task C - Generates Packets on Port 2 of Traffic Generator and sends them to Port 2 of SUT
      • Task Di - Generates Packets on Port 3 of Traffic Generator and sends them to Port 3 of SUT
      • Task Dii - Generates Packets on Port 0 of Traffic Generator and sends them to Port 0 of SUT
    • Verifier Tasks - Composed of 1 or more tasks which receive packets from the SUT
      • Task E - Receives packets on Port 0 of Traffic Generator sent from Port 0 of SUT
      • Task F - Receives packets on Port 1 of Traffic Generator sent from Port 1 of SUT
      • Task G - Receives packets on Port 2 of Traffic Generator sent from Port 2 of SUT
      • Task H - Receives packets on Port 3 of Traffic Generator sent from Port 3 of SUT
  • SUT
    • Receiver Tasks - Composed of 1 or more tasks which consume the packets sent from the Traffic Generator
      • Task A - Receives Packets on Port 0 of System-Under-Test from Traffic Generator Port 0, and forwards packets to Task E
      • Task B - Receives Packets on Port 1 of System-Under-Test from Traffic Generator Port 1, and forwards packets to Task E
      • Task C - Receives Packets on Port 2 of System-Under-Test from Traffic Generator Port 2, and forwards packets to Task E
      • Task D - Receives Packets on Port 3 of System-Under-Test from Traffic Generator Port 3, and forwards packets to Task E
    • Processing Tasks - Composed of multiple tasks in series which carry out some processing on received packets before forwarding them to the next task.
      • Task E - This receives packets from the Receiver Tasks, carries out some operation on the data and forwards the result packets to the next task in the sequence - Task F
      • Task F - This receives packets from the previous task - Task E, carries out some operation on the data and forwards the result packets to the next task in the sequence - Task G
      • Task G - This receives packets from the previous task - Task F and distributes the result packets to the Transmitter tasks
    • Transmitter Tasks - Composed of 1 or more tasks which send the processed packets back to the Traffic Generator
      • Task H - Receives Packets from Task G of System-Under-Test and sends packets to Traffic Generator Port 0
      • Task I - Receives Packets from Task G of System-Under-Test and sends packets to Traffic Generator Port 1
      • Task J - Receives Packets from Task G of System-Under-Test and sends packets to Traffic Generator Port 2
      • Task K - Receives Packets From Task G of System-Under-Test and sends packets to Traffic Generator Port 3
10. NSB Prox Test

A NSB Prox test is composed of the following components:

  • Test Description File. Usually called tc_prox_<context>_<test>-<ports>.yaml where

    • <context> is either baremetal or heat_context
    • <test> is a one- or two-word description of the test.
    • <ports> is the number of ports used

    Example tests: tc_prox_baremetal_l2fwd-2.yaml or tc_prox_heat_context_vpe-4.yaml. This file describes the components of the test: in the case of OpenStack, the network and server descriptions; in the case of baremetal, the location of the hardware description. It also contains the name of the Traffic Generator, the SUT config file and the traffic profile description, all described below. See nsb-test-description-label

  • Traffic Profile file. Example: prox_binsearch.yaml. This describes the packet size, tolerated loss, initial line rate to start traffic at, test interval, etc. See nsb-traffic-profile-label

  • Traffic Generator Config file. Usually called gen_<test>-<ports>.cfg.

    This describes the activity of the traffic generator:

    • What each core of the traffic generator does,
    • The packets of data sent by a core on a port of the traffic generator to the system under test,
    • Which core is used to wait on which port for data from the system under test.

    Example traffic generator config file gen_l2fwd-4.cfg See nsb-traffic-generator-label

  • SUT Config file. Usually called handle_<test>-<ports>.cfg.

    This describes the activity of the SUT:

    • What each core of the SUT does,
    • Which cores receive packets from which ports,
    • Which cores perform operations on the packets and pass the packets on to another core,
    • Which cores receive packets from which cores and transmit the packets on the ports to the Traffic Verifier tasks of the Traffic Generator.

    Example SUT config file handle_l2fwd-4.cfg. See nsb-sut-generator-label

  • NSB PROX Baremetal Configuration file. Usually called prox-baremetal-<ports>.yaml

    • <ports> is the number of ports used

    This is required for baremetal only. This describes hardware, NICs, IP addresses, Network drivers, usernames and passwords. See baremetal-config-label

  • Grafana Dashboard. Usually called Prox_<context>_<test>-<port>-<DateAndTime>.json where

    • <context> Is either BM or heat
    • <test> Is a one- or two-word description of the test.
    • <port> is the number of ports used, expressed as 2Port or 4Port
    • <DateAndTime> is the Date and Time expressed as a string.

    Example grafana dashboard Prox_BM_L2FWD-4Port-1507804504588.json

Other files may be required. These are test specific files and will be covered later.
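The file-naming conventions listed above can be parsed mechanically; the regex and field names in this sketch are assumptions based on the pattern described, not an NSB utility:

```python
import re

# Parse tc_prox_<context>_<test>-<ports>.yaml per the naming convention.
NAME_RE = re.compile(r"^tc_prox_(baremetal|heat_context)_(.+)-(\d+)\.yaml$")

def parse_test_name(filename):
    match = NAME_RE.match(filename)
    if not match:
        return None
    context, test, ports = match.groups()
    return {"context": context, "test": test, "ports": int(ports)}

parse_test_name("tc_prox_baremetal_l2fwd-2.yaml")
# → {'context': 'baremetal', 'test': 'l2fwd', 'ports': 2}
```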

Test Description File

Here we will discuss the test description for both baremetal and openstack.

10.1. Test Description File for Baremetal

This section will introduce the meaning of the Test case description file. We will use tc_prox_baremetal_l2fwd-2.yaml as an example to show you how to understand the test description file.

NSB PROX Test Description File

Now let’s examine the components of the file in detail

  1. traffic_profile - This specifies the traffic profile for the test. In this case prox_binsearch.yaml is used. See nsb-traffic-profile-label

  2. topology - This is either prox-tg-topology-1.yaml, prox-tg-topology-2.yaml or prox-tg-topology-4.yaml, depending on the number of ports required.

  3. nodes - This names the Traffic Generator and the System under Test. Does not need to change.

  4. interface_speed_gbps - This is an optional parameter that defines the speed of the interfaces. If not present, the system defaults to 10 Gbps.

  5. prox_path - Location of the Prox executable on the traffic generator (Either baremetal or Openstack Virtual Machine)

  6. prox_config - This is the SUT Config File. In this case it is handle_l2fwd-2.cfg

    A number of additional parameters can be added. This example is for VPE:

    options:
      interface_speed_gbps: 10
    
      vnf__0:
        prox_path: /opt/nsb_bin/prox
        prox_config: ``configs/handle_vpe-4.cfg``
        prox_args:
          ``-t``: ````
        prox_files:
          ``configs/vpe_ipv4.lua`` : ````
          ``configs/vpe_dscp.lua`` : ````
          ``configs/vpe_cpe_table.lua`` : ````
          ``configs/vpe_user_table.lua`` : ````
          ``configs/vpe_rules.lua`` : ````
        prox_generate_parameter: True
    

    interface_speed_gbps - this specifies the speed of the interface in gigabits per second and is used to calculate pps (packets per second). If the interfaces are of different speeds, this specifies the speed of the slowest interface. This parameter is optional; if omitted, the interface speed defaults to 10 Gbps.

    prox_files - this specifies that a number of additional files need to be provided for the test to run correctly. These files could provide routing information, hashing information or a hashing algorithm and IP/MAC information.

    prox_generate_parameter - this specifies that the NSB application is required to provide information to NSB Prox in the form of a file called parameters.lua, which contains information retrieved from either the hardware or the OpenStack configuration.

  7. prox_args - this specifies the command line arguments to start prox. See prox command line.

  8. prox_config - This specifies the Traffic Generator config file.

  9. runner - This is set to ProxDuration. This specifies that the test runs for a set duration. Other runner types are available but it is recommended to use ProxDuration.

    The following parameters are supported:

    interval - (optional) - This specifies the sampling interval. Default is 1 sec

    sampled - (optional) - This specifies if sampling information is required. Default no

    duration - This is the length of the test in seconds. Default is 60 seconds.

    confirmation - This specifies the number of confirmation retests to be made before deciding to increase or decrease line speed. Default 0.

  10. context - This is context for a 2 port Baremetal configuration.

If a 4 port configuration was required then file prox-baremetal-4.yaml would be used. This is the NSB Prox baremetal configuration file.
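The pps figure derived from interface_speed_gbps (mentioned under item 6 above) follows from the standard Ethernet per-frame wire overhead of 20 bytes (8-byte preamble + 12-byte inter-frame gap); the function below is an illustrative worked sketch, not NSB code:

```python
# Line-rate packets-per-second for a given interface speed and frame size.
# Each frame occupies frame_bytes + 20 bytes on the wire (preamble + IFG).
def line_rate_pps(speed_gbps, frame_bytes):
    wire_bits_per_frame = (frame_bytes + 20) * 8
    return speed_gbps * 1e9 / wire_bits_per_frame

round(line_rate_pps(10, 64))    # → 14880952 (the classic 14.88 Mpps)
```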
10.2. Traffic Profile file

This describes the details of the traffic flow. In this case prox_binsearch.yaml is used.

NSB PROX Traffic Profile
  1. name - The name of the traffic profile. This name should match the name specified in the traffic_profile field in the Test Description File.

  2. traffic_type - This specifies the type of traffic pattern generated. This name matches the class name of the traffic generator; see:

    network_services/traffic_profile/prox_binsearch.py class ProxBinSearchProfile(ProxProfile)
    

    In this case it lowers the traffic rate until the number of packets sent equals the number of packets received (plus a tolerated loss). Once it achieves this, it increases the traffic rate in order to find the highest rate with no traffic loss.

    Custom traffic types can be created by creating a new traffic profile class.

  3. tolerated_loss - This specifies the percentage of packets that can be lost/dropped before we declare failure. Success means the number of packets transmitted by the Traffic Generator is less than or equal to the number of packets received by the Traffic Generator plus the tolerated loss.

  4. test_precision - This specifies the precision of the test results. For some tests the success criteria may never be achieved because the test precision may be greater than the successful throughput. For finer results increase the precision by making this value smaller.

  5. packet_sizes - This specifies the range of packets size this test is run for.

  6. duration - This specifies the sample duration that the test uses to check for success or failure.

  7. lower_bound - This specifies the test initial lower bound sample rate. On success this value is increased.

  8. upper_bound - This specifies the test initial upper bound sample rate. On failure this value is decreased.
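
The interaction of tolerated_loss, test_precision, lower_bound and upper_bound can be illustrated with a simplified sketch of the interval-halving search (plain Python, not the actual ProxBinSearchProfile code; the measure_loss callback is a hypothetical stand-in for a traffic run):

```python
# Simplified sketch of the binary search described above. The search keeps
# narrowing [lower_bound, upper_bound] until the window is smaller than
# test_precision, raising the lower bound on success and lowering the upper
# bound on failure.

def binary_search_rate(measure_loss, tolerated_loss=0.001,
                       lower_bound=0.0, upper_bound=100.0,
                       test_precision=1.0):
    """Return the highest rate (% of line rate) whose measured packet loss
    stays within tolerated_loss, to within test_precision."""
    best = lower_bound
    while upper_bound - lower_bound > test_precision:
        rate = (lower_bound + upper_bound) / 2.0
        if measure_loss(rate) <= tolerated_loss:
            best = rate          # success: try a higher rate
            lower_bound = rate
        else:
            upper_bound = rate   # failure: back off

    return best

# Hypothetical SUT that starts dropping packets above 37% of line rate.
result = binary_search_rate(lambda rate: 0.0 if rate <= 37.0 else 5.0,
                            test_precision=0.1)
```

With a smaller test_precision the search performs more iterations and converges closer to the true no-loss rate, which is exactly the trade-off item 4 above describes.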

Other traffic profiles exist, e.g. prox_ACL.yaml, which does not compare what is received with what is transmitted. It just sends packets at the maximum rate.

It is possible to create custom traffic profiles by creating a new file in the same folder as prox_binsearch.yaml. See prox_vpe.yaml as an example:

schema: ``nsb:traffic_profile:0.1``

name:            prox_vpe
description:     Prox vPE traffic profile

traffic_profile:
  traffic_type: ProxBinSearchProfile
  tolerated_loss: 100.0 #0.001
  test_precision: 0.01
# The minimum size of the Ethernet frame for the vPE test is 68 bytes.
  packet_sizes: [68]
  duration: 5
  lower_bound: 0.0
  upper_bound: 100.0
10.3. Test Description File for Openstack

We will use tc_prox_heat_context_l2fwd-2.yaml as an example to show how to understand the test description file.

NSB PROX Test Description File - Part 1 NSB PROX Test Description File - Part 2

Now let's examine the components of the file in detail

Sections 1 to 9 are exactly the same in Baremetal and in Heat. Section 10 is replaced with sections A to F. Section 10 described a baremetal configuration file, which has no place in a Heat configuration.

  1. image - yardstick-samplevnfs. This is the name of the image created during the installation of NSB. This is fixed.

  2. flavor - The flavor is created dynamically. However we could use an already existing flavor if required. In that case the flavor would be named:

    flavor: yardstick-flavor
    
  3. extra_specs - This allows us to specify the number of cores, sockets and hyperthreading assigned to the flavor. In this case we have 1 socket with 10 cores and hyperthreading disabled.

  4. placement_groups - default. Do not change for NSB PROX.

  5. servers - tg_0 is the traffic generator and vnf_0 is the system under test.

  6. networks - is composed of a management network labeled mgmt, one uplink network labeled uplink_0 and one downlink network labeled downlink_0 for 2 ports. If this were a 4 port configuration there would be two extra networks, uplink_1 and downlink_1. See this example from a 4 port l2fwd test:

    networks:
      mgmt:
        cidr: '10.0.1.0/24'
      uplink_0:
        cidr: '10.0.2.0/24'
        gateway_ip: 'null'
        port_security_enabled: False
        enable_dhcp: 'false'
      downlink_0:
        cidr: '10.0.3.0/24'
        gateway_ip: 'null'
        port_security_enabled: False
        enable_dhcp: 'false'
      uplink_1:
        cidr: '10.0.4.0/24'
        gateway_ip: 'null'
        port_security_enabled: False
        enable_dhcp: 'false'
      downlink_1:
        cidr: '10.0.5.0/24'
        gateway_ip: 'null'
        port_security_enabled: False
        enable_dhcp: 'false'
    
10.4. Traffic Generator Config file

This section will describe the traffic generator config file. This is the same for both baremetal and heat. See this example of gen_l2fwd_multiflow-2.cfg to explain the options.

NSB PROX Gen Config File

The configuration file is divided into multiple sections, each of which is used to define some parameters and options:

[eal options]
[variables]
[port 0]
[port 1]
[port .]
[port Z]
[defaults]
[global]
[core 0]
[core 1]
[core 2]
[core .]
[core Z]

See prox options for details

Now let’s examine the components of the file in detail

  1. [eal options] - This specifies the EAL (Environment Abstraction Layer) options. These are default values and are not changed. See the dpdk wiki page.

  2. [variables] - This section contains variables, as the name suggests: core numbers, MAC addresses, IP addresses, etc. They are assigned as key = value pairs, where the key is used in place of the value.

    Caution

    Variables with a value beginning with @@ are a special case. These values (MAC address, IP address, etc.) are dynamically updated by the NSB application at run time.

  3. [port 0] - This section describes a DPDK port. The number following the keyword port usually refers to the DPDK port id, usually starting from 0. Because you can have multiple ports this entry is usually repeated, e.g. [port 0] and [port 1] for a 2 port setup, and [port 0], [port 1], [port 2] and [port 3] for a 4 port setup:

    [port 0]
    name=p0
    mac=hardware
    rx desc=2048
    tx desc=2048
    promiscuous=yes
    
    1. In this example name=p0 assigns the name p0 to the port. Any name can be assigned to a port.
    2. mac=hardware sets the MAC address assigned by the hardware to data from this port.
    3. rx desc=2048 sets the number of descriptors to allocate for receive packets. This can be changed and can affect performance.
    4. tx desc=2048 sets the number of descriptors to allocate for transmit packets. This can be changed and can affect performance.
    5. promiscuous=yes enables promiscuous mode for this port.
  4. [defaults] - Here default operations and settings can be overwritten. In this example mempool size=4K alters the number of mbufs per task. Altering this value could affect performance. See prox options for details.

  5. [global] - Here application-wide settings are supported: things like the application name, start time, duration and memory configuration. In this example:

      [global]
      start time=5
      name=Basic Gen
    
    a. ``start time=5`` Time in seconds after which average stats will be started.
    b. ``name=Basic Gen`` Name of the configuration.
    
  6. [core 0] - This core is designated the master core. Every Prox application must have a master core. The master mode must be assigned to exactly one task, running alone on one core.:

    [core 0]
    mode=master
    
  7. [core 1] - This describes the activity on core 1. Cores can be configured by means of a set of [core #] sections, where # represents either:

    1. an absolute core number: e.g. on a 10-core, dual socket system with hyper-threading, cores are numbered from 0 to 39.

    2. PROX allows a core to be identified by a core number, the letter ‘s’, and a socket number.

      It is possible to write a baremetal and an openstack test which use the same traffic generator config file and SUT config file. In this case it is advisable not to use physical core numbering.

      However it is also possible to write NSB Prox tests that have been optimized for a particular hardware configuration. In this case it is advisable to use the core numbering. It is up to the user to make sure that cores from the right sockets are used (i.e. from the socket to which the NIC is attached), to ensure good performance (EPA).

    Each core can be assigned with a set of tasks, each running one of the implemented packet processing modes.:

    [core 1]
    name=p0
    task=0
    mode=gen
    tx port=p0
    bps=1250000000
    ; Ethernet + IP + UDP
    pkt inline=${sut_mac0} 70 00 00 00 00 01 08 00 45 00 00 1c 00 01 00 00 40 11 f7 7d 98 10 64 01 98 10 64 02 13 88 13 88 00 08 55 7b
    ; src_ip: 152.16.100.0/8
    random=0000XXX1
    rand_offset=29
    ; dst_ip: 152.16.100.0/8
    random=0000XXX0
    rand_offset=33
    random=0001001110001XXX0001001110001XXX
    rand_offset=34
    
    1. name=p0 - Name assigned to the core.

    2. task=0 - Each core can run a set of tasks, starting with 0. Task 1 can be defined later in this core or in another [core 1] section with task=1 later in the configuration file. Sometimes running multiple tasks related to the same packet on the same physical core improves performance; however sometimes it is optimal to move a task to a separate core. This is best decided by checking performance.

    3. mode=gen - Specifies the action carried out by this task on this core. Supported modes are: classify, drop, gen, lat, genl4, nop, l2fwd, gredecap, greencap, lbpos, lbnetwork, lbqinq, lb5tuple, ipv6_decap, ipv6_encap, qinqdecapv4, qinqencapv4, qos, routing, impair, mirror, unmpls, tagmpls, nat, decapnsh, encapnsh, police, acl. These correspond to:

      • Classify
      • Drop
      • Basic Forwarding (no touch)
      • L2 Forwarding (change MAC)
      • GRE encap/decap
      • Load balance based on packet fields
      • Symmetric load balancing
      • QinQ encap/decap IPv4/IPv6
      • ARP
      • QoS
      • Routing
      • Unmpls
      • Nsh encap/decap
      • Policing
      • ACL

      In the traffic generator we expect a core to generate packets (gen) and to receive packets and calculate latency (lat). This core does gen, i.e. it is a traffic generator.

      To understand what each of the modes support please see prox documentation.

    4. tx port=p0 - This specifies that the packets generated are transmitted to port p0

    5. bps=1250000000 - This indicates the rate at which to generate packets, in Bytes Per Second.

    6. ; Ethernet + IP + UDP - This is a comment. Items starting with ; are ignored.

    7. pkt inline=${sut_mac0} 70 00 00 00 ... - Defines the packet format as a sequence of bytes (each expressed in hexadecimal notation). This defines the packet that is generated. This packet begins with the hexadecimal sequence assigned to sut_mac0, followed by the remainder of the bytes in the string. This packet could now be sent, or modified by random=.. (described below) before being sent to the target.

    8. ; src_ip: 152.16.100.0/8 - Comment

    9. random=0000XXX1 - This describes a field of the packet containing random data. This string can be 8,16,24 or 32 character long and represents 1,2,3 or 4 bytes of data. In this case it describes a byte of data. Each character in string can be 0,1 or X. 0 or 1 are fixed bit values in the data packet and X is a random bit. So random=0000XXX1 generates 00000001(1), 00000011(3), 00000101(5), 00000111(7), 00001001(9), 00001011(11), 00001101(13) and 00001111(15) combinations.

    10. rand_offset=29 - Defines where to place the previously defined random field.

    11. ; dst_ip: 152.16.100.0/8 - Comment

    12. random=0000XXX0 - This is another random field which generates a byte of 00000000(0), 00000010(2), 00000100(4), 00000110(6), 00001000(8), 00001010(10), 00001100(12) and 00001110(14) combinations.

    13. rand_offset=33 - Defines where to place the previously defined random field.

    14. random=0001001110001XXX0001001110001XXX - This is another random field which generates 4 bytes.

    15. rand_offset=34 - Defines where to place the previously defined 4 byte random field.

    Core 2 executes the same scenario as Core 1. The only difference is that its packets are generated for Port 1.
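
    The random= bit templates above can be checked with a small sketch (plain Python, not PROX code) that expands a template by substituting every combination of the X bits:

```python
# Expand a PROX-style random= template: '0'/'1' are fixed bits, 'X' is a
# random bit. Enumerating every choice of the X bits reproduces the value
# lists given in items 9 and 12 above.

def expand_mask(mask):
    values = ['']
    for ch in mask:
        if ch == 'X':
            # branch on both possible bit values
            values = [v + b for v in values for b in '01']
        else:
            values = [v + ch for v in values]
    return sorted(int(v, 2) for v in values)

print(expand_mask('0000XXX1'))  # [1, 3, 5, 7, 9, 11, 13, 15]
print(expand_mask('0000XXX0'))  # [0, 2, 4, 6, 8, 10, 12, 14]
```

    A template of 8, 16, 24 or 32 characters expands to 1, 2, 3 or 4 bytes respectively, matching the description above.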

  8. [core 3] - This defines the activities on core 3. The purpose of core 3 and core 4 is to receive packets sent by the SUT.:

    [core 3]
    name=rec 0
    task=0
    mode=lat
    rx port=p0
    lat pos=42
    
    1. name=rec 0 - Name assigned to the core.
    2. task=0 - Each core can run a set of tasks, starting with 0. Task 1 can be defined later in this core or in another [core 3] section with task=1 later in the configuration file. Sometimes running multiple tasks related to the same packet on the same physical core improves performance; however sometimes it is optimal to move a task to a separate core. This is best decided by checking performance.
    3. mode=lat - Specifies the action carried out by this task on this core. Supported modes are: acl, classify, drop, gredecap, greencap, ipv6_decap, ipv6_encap, l2fwd, lbnetwork, lbpos, lbqinq, nop, police, qinqdecapv4, qinqencapv4, qos, routing, impair, lb5tuple, mirror, unmpls, tagmpls, nat, decapnsh, encapnsh, gen, genl4 and lat. This task (0) on core 3 receives packets on the port.
    4. rx port=p0 - The port to receive packets on is Port 0. Core 4 will receive packets on Port 1.
    5. lat pos=42 - Defines where in the packet the 4-byte timestamp is read from. Note that the packet length should be longer than lat pos + 4 bytes to avoid truncation of the timestamp. Note also that the SUT workload might cause the position of the timestamp to change (i.e. due to encapsulation).
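
For the Ethernet + IP + UDP packets generated above, lat pos=42 corresponds to the first byte of the UDP payload, since the headers occupy 42 bytes. This is an explanatory calculation, not NSB code:

```python
# Why lat pos=42 for plain Ethernet + IPv4 + UDP traffic: the 4-byte
# timestamp is carried in the UDP payload, which starts right after the
# L2/L3/L4 headers.
ETHERNET_HDR = 14   # dst MAC (6) + src MAC (6) + EtherType (2)
IPV4_HDR = 20       # minimal IPv4 header, no options
UDP_HDR = 8

lat_pos = ETHERNET_HDR + IPV4_HDR + UDP_HDR
print(lat_pos)  # 42
```

If the SUT adds or removes encapsulation (e.g. QinQ or GRE), the header total changes and lat pos must be adjusted accordingly, which is the caveat noted above.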
10.5. SUT Config file

This section describes the SUT (VNF) config file. This is the same for both baremetal and heat. See this example of handle_l2fwd_multiflow-2.cfg to explain the options.

NSB PROX Handle Config File

See prox options for details

Now let’s examine the components of the file in detail

  1. [eal options] - same as the Generator config file. This specifies the EAL (Environment Abstraction Layer) options. These are default values and are not changed. See the dpdk wiki page.

  2. [port 0] - This section describes a DPDK port. The number following the keyword port usually refers to the DPDK port id, usually starting from 0. Because you can have multiple ports this entry is usually repeated, e.g. [port 0] and [port 1] for a 2 port setup, and [port 0], [port 1], [port 2] and [port 3] for a 4 port setup:

    [port 0]
    name=if0
    mac=hardware
    rx desc=2048
    tx desc=2048
    promiscuous=yes
    
    1. In this example name=if0 assigns the name if0 to the port. Any name can be assigned to a port.
    2. mac=hardware sets the MAC address assigned by the hardware to data from this port.
    3. rx desc=2048 sets the number of descriptors to allocate for receive packets. This can be changed and can affect performance.
    4. tx desc=2048 sets the number of descriptors to allocate for transmit packets. This can be changed and can affect performance.
    5. promiscuous=yes enables promiscuous mode for this port.
  3. [defaults] - Here default operations and settings can be overwritten:

    [defaults]
    mempool size=8K
    memcache size=512
    
    1. In this example mempool size=8K alters the number of mbufs per task. Altering this value could affect performance. See prox options for details.
    2. memcache size=512 - the number of mbufs cached per core (the cache_size); the default is 256. Altering this value could affect performance.
  4. [global] - Here application-wide settings are supported: things like the application name, start time, duration and memory configuration. In this example:

      [global]
      start time=5
      name=Handle L2FWD Multiflow (2x)
    
    a. ``start time=5`` Time in seconds after which average stats will be started.
    b. ``name=Handle L2FWD Multiflow (2x)`` Name of the configuration.
    
  5. [core 0] - This core is designated the master core. Every Prox application must have a master core. The master mode must be assigned to exactly one task, running alone on one core.:

    [core 0]
    mode=master
    
  6. [core 1] - This describes the activity on core 1. Cores can be configured by means of a set of [core #] sections, where # represents either:

    1. an absolute core number: e.g. on a 10-core, dual socket system with hyper-threading, cores are numbered from 0 to 39.
    2. PROX allows a core to be identified by a core number, the letter 's', and a socket number. However, since NSB PROX is hardware agnostic (physical and virtual configurations are the same), it is advisable not to use physical core numbering.

    Each core can be assigned with a set of tasks, each running one of the implemented packet processing modes.:

    [core 1]
    name=none
    task=0
    mode=l2fwd
    dst mac=@@tester_mac1
    rx port=if0
    tx port=if1
    
    1. name=none - No name is assigned to the core.
    2. task=0 - Each core can run a set of tasks, starting with 0. Task 1 can be defined later in this core or in another [core 1] section with task=1 later in the configuration file. Sometimes running multiple tasks related to the same packet on the same physical core improves performance; however sometimes it is optimal to move a task to a separate core. This is best decided by checking performance.
    3. mode=l2fwd - Specifies the action carried out by this task on this core. Supported modes are: acl, classify, drop, gredecap, greencap, ipv6_decap, ipv6_encap, l2fwd, lbnetwork, lbpos, lbqinq, nop, police, qinqdecapv4, qinqencapv4, qos, routing, impair, lb5tuple, mirror, unmpls, tagmpls, nat, decapnsh, encapnsh, gen, genl4 and lat. This task does l2fwd, i.e. it performs L2 forwarding.
    4. dst mac=@@tester_mac1 - The destination MAC address of the packet will be set to the MAC address of Port 1 of the destination device (the Traffic Generator/Verifier).
    5. rx port=if0 - This specifies that packets are received from Port 0, called if0.
    6. tx port=if1 - This specifies that packets are transmitted to Port 1, called if1.

    In this example we receive a packet on a port, carry out an operation on the packet on the core, and transmit it on another port, still using the same task on the same core.

    On some implementations you may wish to use multiple tasks, like this:

    [core 1]
    name=rx_task
    task=0
    mode=l2fwd
    dst mac=@@tester_p0
    rx port=if0
    tx cores=1t1
    drop=no
    
    name=l2fwd_if0
    task=1
    mode=nop
    rx ring=yes
    tx port=if0
    drop=no
    

    In this example you can see Core 1/Task 0, called rx_task, receives the packet from if0 and performs the l2fwd. However, instead of sending the packet to a port, it sends it to a core; see tx cores=1t1. In this case it sends it to Core 1/Task 1.

    Core 1/Task 1, called l2fwd_if0, receives the packet, not from a port but from the ring; see rx ring=yes. It does not perform any operation on the packet (see mode=nop) and sends the packets to if0; see tx port=if0.
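
    The two-task hand-off can be modeled conceptually in plain Python. This is only an illustration of the ring idea, not how PROX is implemented:

```python
# Conceptual model of ring-based task chaining: task 0 pushes packets onto
# a ring (a queue here) instead of a port, and task 1 pops from the ring
# and transmits.
from collections import deque

ring = deque()                      # stands in for the DPDK ring between tasks

def rx_task(packet):
    """Core 1/Task 0: receive on if0, rewrite the packet, push to the ring."""
    ring.append(packet.upper())     # toy stand-in for the l2fwd rewrite

def tx_task():
    """Core 1/Task 1: pop from the ring (rx ring=yes), transmit on if0."""
    return ring.popleft() if ring else None

rx_task('pkt')
print(tx_task())  # PKT
```

    Chaining more tasks simply means more rings between producer and consumer tasks, which is exactly the pattern used in the BNG example below.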

    It is also possible to implement more complex operations by chaining multiple operations in sequence and using rings to pass packets from one core to another.

    In this example we show a Broadband Network Gateway (BNG) with Quality of Service (QoS). Communication from task to task is via rings.

    NSB PROX Config File for BNG_QOS
10.6. Baremetal Configuration file

This is required for baremetal testing. It describes the IP addresses of the various ports, the network device drivers and MAC addresses, and the network configuration.

In this example we will describe a 2 port configuration. This file is the same for all 2 port NSB Prox tests on the same platforms/configuration.

NSB PROX Yardstick Config

Now let's describe the sections of the file.

  1. TrafficGen - This section describes the Traffic Generator node of the test configuration. The name of the node trafficgen_1 must match the node name in the Test Description File for Baremetal mentioned earlier. The password attribute of the test needs to be configured. All other parameters can remain as default settings.
  2. interfaces - This defines the DPDK interfaces on the Traffic Generator.
  3. xe0 is DPDK Port 0. lspci and ``./dpdk-devbind.py -s`` can be used to provide the interface information. netmask and local_ip should not be changed.
  4. xe1 is DPDK Port 1. If more than 2 ports are required then xe1 section needs to be repeated and modified accordingly.
  5. vnf - This section describes the SUT of the test configuration. The name of the node vnf must match the node name in the Test Description File for Baremetal mentioned earlier. The password attribute of the test needs to be configured. All other parameters can remain as default settings
  6. interfaces - This defines the DPDK interfaces on the SUT
  7. xe0 - Same as 3 but for the SUT.
  8. xe1 - Same as 4 but for the SUT also.
  9. routing_table - All parameters should remain unchanged.
  10. nd_route_tbl - All parameters should remain unchanged.
10.7. Grafana Dashboard

The grafana dashboard visually displays the results of the tests. The steps required to produce a grafana dashboard are described here.

  1. Configure yardstick to use influxDB to store test results. See file /etc/yardstick/yardstick.conf.

    NSB PROX Yardstick Config
    1. Specify the dispatcher to use influxDB to store results.
    2. “target = .. ” - Specify the location of influxDB to store results.
       “db_name = yardstick” - name of the database. Do not change.
       “username = root” - username to use to store results. (Many tests are run as root.)
       “password = ... ” - Please set to the root user password.
  2. Deploy InfluxDB & Grafana. See grafana deployment.

  3. Generate the test data. Run the tests as follows:

    yardstick --debug task start tc_prox_<context>_<test>-ports.yaml
    

    eg.:

    yardstick --debug task start tc_prox_heat_context_l2fwd-4.yaml
    
  4. Now build the dashboard for the test you just ran. The easiest way to do this is to copy an existing dashboard and rename the test and the field names. The procedure to do so is described here. See opnfv grafana dashboard.

11. How to run NSB Prox Test on a baremetal environment

In order to run the NSB PROX test:

  1. Install NSB on the Traffic Generator node and PROX on the SUT. See NSB Installation

  2. To enter container:

    docker exec -it yardstick /bin/bash
    
  3. Install the baremetal configuration files (POD files)

    1. Go to the location of the PROX tests in the container:

      cd /home/opnfv/repos/yardstick/samples/vnf_samples/nsut/prox
      
    2. Install prox-baremetal-2.yaml and prox-baremetal-4.yaml for that topology into this directory as per baremetal-config-label

    3. Install and configure yardstick.conf

      cd /etc/yardstick/
      

      Modify /etc/yardstick/yardstick.conf as per yardstick-config-label

  4. Execute the test, e.g.:

    yardstick --debug task start ./tc_prox_baremetal_l2fwd-4.yaml
    
12. How to run NSB Prox Test on an Openstack environment

In order to run the NSB PROX test:

  1. Install NSB on the Openstack deployment node. See NSB Installation

  2. To enter container:

    docker exec -it yardstick /bin/bash
    
  3. Install configuration file

    1. Go to the location of the PROX tests in the container:

      cd /home/opnfv/repos/yardstick/samples/vnf_samples/nsut/prox
      
    2. Install and configure yardstick.conf

      cd /etc/yardstick/
      

      Modify /etc/yardstick/yardstick.conf as per yardstick-config-label

  4. Execute the test, e.g.:

    yardstick --debug task start ./tc_prox_heat_context_l2fwd-4.yaml
    
13. Frequently Asked Questions

Here is a list of frequently asked questions.

13.1. NSB Prox does not work on Baremetal, how do I resolve this?

If PROX NSB does not work on baremetal, the problem is either in the network configuration or in the test file.

Solution

  1. Verify the network configuration. Execute an existing baremetal test:

    yardstick --debug task start ./tc_prox_baremetal_l2fwd-4.yaml
    

    If the test does not work then there is an error in the network configuration.

    1. Check DPDK on the Traffic Generator and SUT via:

      /root/dpdk-17./usertools/dpdk-devbind.py
      
    2. Verify that the MAC addresses match prox-baremetal-<ports>.yaml via ifconfig and dpdk-devbind

    3. Check that your eth port is what you expect. You would not be the first person to think that the port your cable is plugged into is ethX when in fact it is ethY. Use ethtool to visually confirm that the eth port is where you expect:

      ethtool -p ethX
      

      An LED should start blinking on the port. (On both the System-Under-Test and the Traffic Generator.)

    4. Check cable.

      Install the Linux kernel network driver and ensure your ports are bound to the driver via dpdk-devbind. Bring up the port on both the SUT and the Traffic Generator and check the connection.

      1. On the SUT and on the Traffic Generator:

        ifconfig ethX/enoX up
        
      2. Check the link:

        ethtool ethX/enoX

        If Link detected is yes, the cable is good. If no, you have an issue with your cable/port.

  2. If the existing baremetal test works then the issue is with your test. Check the traffic generator gen_<test>-<ports>.cfg to ensure it is producing a valid packet.

13.2. How do I debug NSB Prox on Baremetal?

Solution

  1. Execute the test as follows:

    yardstick --debug task start ./tc_prox_baremetal_l2fwd-4.yaml
    
  2. Login to the Traffic Generator as root:

    cd
    /opt/nsb_bin/prox -f /tmp/gen_<test>-<ports>.cfg
    
  3. Login to the SUT as root:

    cd
    /opt/nsb_bin/prox -f /tmp/handle_<test>-<ports>.cfg
    
  4. Now let’s examine the Generator Output. In this case the output of gen_l2fwd-4.cfg.

    NSB PROX Traffic Generator GUI

    Now let’s examine the output

    1. Indicates the amount of data successfully transmitted on Port 0
    2. Indicates the amount of data successfully received on port 1
    3. Indicates the amount of data successfully handled for port 1

    It appears what is transmitted is received.

    Caution

    The number of packets MAY not exactly match because the ports are read in sequence.

    Caution

    What is transmitted on PORT X may not always be received on same port. Please check the Test scenario.

  5. Now let's examine the SUT output

    NSB PROX SUT GUI

    Now let's examine the output

    1. What is received on 0 is transmitted on 1, received on 1 transmitted on 0, received on 2 transmitted on 3 and received on 3 transmitted on 2.
    2. No packets failed.
    3. No packets are discarded.

We can also dump the packets being received or transmitted via the following commands.

dump                   Arguments: <core id> <task id> <nb packets>
                       Create a hex dump of <nb_packets> from <task_id> on <core_id> showing how
                       packets have changed between RX and TX.
dump_rx                Arguments: <core id> <task id> <nb packets>
                       Create a hex dump of <nb_packets> from <task_id> on <core_id> at RX
dump_tx                Arguments: <core id> <task id> <nb packets>
                       Create a hex dump of <nb_packets> from <task_id> on <core_id> at TX

eg.:

dump_tx 1 0 1
13.3. NSB Prox works on Baremetal but not in Openstack. How do I resolve this?

NSB Prox on Baremetal is a lot more forgiving than NSB Prox on Openstack. A badly formed packet may still work with PROX on Baremetal. However on Openstack the packet must be correct, with all fields of the header correct. E.g. a packet with an invalid Protocol ID would still work on Baremetal but would be rejected by Openstack.
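
One header field that strict virtual switches commonly enforce is the IPv4 header checksum. A hypothetical helper (not part of NSB) to verify the checksum bytes in a hand-crafted header such as a pkt inline template:

```python
# RFC 1071 ones-complement checksum over a 20-byte IPv4 header. The header's
# own checksum field (bytes 10-11) is treated as zero during the computation.

def ipv4_checksum(header: bytes) -> int:
    hdr = bytearray(header)
    hdr[10:12] = b'\x00\x00'                 # zero the checksum field
    total = sum(int.from_bytes(hdr[i:i + 2], 'big')
                for i in range(0, len(hdr), 2))
    while total >> 16:                       # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# A known-good 20-byte header whose checksum field holds 0xb861.
hdr = bytes.fromhex('45000073000040004011b861c0a80001c0a800c7')
print(hex(ipv4_checksum(hdr)))  # 0xb861
```

If the value computed over your template's IP header does not match the checksum bytes embedded in the template, the packet is malformed and may be silently dropped by Openstack even though it passes on baremetal.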

Solution

  1. Check the validity of the packet.
  2. Use a known good packet in your test.
  3. If using random fields in the traffic generator, disable them and retry.
13.4. How do I debug NSB Prox on Openstack?

Solution

  1. Execute the test as follows:

    yardstick --debug task start --keep-deploy ./tc_prox_heat_context_l2fwd-4.yaml
    
  2. Access docker image if required via:

    docker exec -it yardstick /bin/bash
    
  3. Install the openstack credentials.

    Depending on your openstack deployment, the location of these credentials may vary. On this platform this is done via:

    scp root@10.237.222.55:/etc/kolla/admin-openrc.sh .
    source ./admin-openrc.sh
    
  4. List Stack details

    1. Get the name of the Stack.

      NSB PROX openstack stack list
    2. Get the Floating IP of the Traffic Generator & SUT

      This generates a lot of information. Please note the floating IPs of the VNF and the Traffic Generator.

      NSB PROX openstack stack show (Top)

      From here you can see the floating IP Address of the SUT / VNF

      NSB PROX openstack stack show (Top)

      From here you can see the floating IP Address of the Traffic Generator

    3. Get the ssh identity file

      In the docker container, locate the identity file:

      cd /home/opnfv/repos/yardstick/yardstick/resources/files
      ls -lt
      
  5. Login to the SUT as Ubuntu:

    ssh -i ./yardstick_key-01029d1d ubuntu@172.16.2.158
    

    Change to root:

     sudo su
    
    Now continue as for baremetal.
    
  6. Login to the Traffic Generator as Ubuntu:

    ssh -i ./yardstick_key-01029d1d ubuntu@172.16.2.156
    

    Change to root:

     sudo su
    
    Now continue as for baremetal.
    
13.5. How do I resolve “Quota exceeded for resources”?

Solution

This usually occurs for one of two reasons when executing an openstack test.

  1. One or more stacks already exist and are consuming all resources. To resolve:

    openstack stack list
    

    Response:

    +--------------------------------------+--------------------+-----------------+----------------------+--------------+
    | ID                                   | Stack Name         | Stack Status    | Creation Time        | Updated Time |
    +--------------------------------------+--------------------+-----------------+----------------------+--------------+
    | acb559d7-f575-4266-a2d4-67290b556f15 | yardstick-e05ba5a4 | CREATE_COMPLETE | 2017-12-06T15:00:05Z | None         |
    | 7edf21ce-8824-4c86-8edb-f7e23801a01b | yardstick-08bda9e3 | CREATE_COMPLETE | 2017-12-06T14:56:43Z | None         |
    +--------------------------------------+--------------------+-----------------+----------------------+--------------+
    

    In this case 2 stacks already exist.

    To remove a stack:

    openstack stack delete yardstick-08bda9e3
    Are you sure you want to delete this stack(s) [y/N]? y
    
  2. The openstack configuration quotas are too small.

    The solution is to increase the quota. Use the command below to query existing quotas:

    openstack quota show
    

    And to set quota:

    openstack quota set <resource>
    
13.6. Openstack Cli fails or hangs. How do I resolve this?

Solution

If it fails due to:

Missing value auth-url required for auth plugin password

check your shell environment for Openstack variables. One of them should contain the authentication URL:

OS_AUTH_URL=https://192.168.72.41:5000/v3

or similar. Ensure that the openstack configuration is exported:

cat  /etc/kolla/admin-openrc.sh

Result

export OS_PROJECT_DOMAIN_NAME=default
export OS_USER_DOMAIN_NAME=default
export OS_PROJECT_NAME=admin
export OS_TENANT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=BwwSEZqmUJA676klr9wa052PFjNkz99tOccS9sTc
export OS_AUTH_URL=http://193.168.72.41:35357/v3
export OS_INTERFACE=internal
export OS_IDENTITY_API_VERSION=3
export EXTERNAL_NETWORK=yardstick-public

and visible.

If the Openstack CLI appears to hang, verify that the proxy and no_proxy variables are set correctly. They should be similar to:

FTP_PROXY="http://proxy.ir.intel.com:911/"
HTTPS_PROXY="http://proxy.ir.intel.com:911/"
HTTP_PROXY="http://proxy.ir.intel.com:911/"
NO_PROXY="localhost,127.0.0.1,10.237.222.55,10.237.223.80,10.237.222.134,.ir.intel.com"
ftp_proxy="http://proxy.ir.intel.com:911/"
http_proxy="http://proxy.ir.intel.com:911/"
https_proxy="http://proxy.ir.intel.com:911/"
no_proxy="localhost,127.0.0.1,10.237.222.55,10.237.223.80,10.237.222.134,.ir.intel.com"

Where

  1. 10.237.222.55 = IP Address of deployment node
  2. 10.237.223.80 = IP Address of Controller node
  3. 10.237.222.134 = IP Address of Compute Node
  4. ir.intel.com = local no proxy
13.7. How to Understand the Grafana output?
NSB PROX Grafana_1 NSB PROX Grafana_2 NSB PROX Grafana_3 NSB PROX Grafana_4
  1. Test Parameters - Test interval, Duration, Tolerated Loss and Test Precision
  2. Overall number of packets sent and received during the test
  3. Generator Stats - packets sent, received and attempted by Generator
  4. Packets Size
  5. No of packets received by SUT
  6. No of packets forwarded by SUT
  7. This is the number of packets sent by the generator per port, for each interval.
  8. This is the number of packets received by the generator per port, for each interval.
  9. This is the number of packets sent and received by the generator, and lost by the SUT, that meet the success criteria
  10. This shows the change in the percentage of line rate used over a test. The MAX and the MIN should converge to within the interval specified as the test-precision.
  11. This is the packet size supported during the test. If “N/A” appears in any field the result has not been decided.
  12. This is the calculated throughput in MPPS(Million Packets Per second) for this line rate.
  13. This is the actual No, of packets sent by the generator in MPPS
  14. This is the actual No. of packets received by the generator in MPPS
  15. This is the total No. of packets sent by SUT.
  16. This is the total No. of packets received by the SUT
  17. This is the total No. of packets dropped. (These packets were sent by the generator but not received back by the generator, these may be dropped by the SUT or the Generator)
  18. This is the tolerated no of packets that can be dropped.
  19. This is the test Throughput in Gbps
  20. This is the Latencey per Port
  21. This is the CPU Utilization
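The Mpps and Gbps figures above are related through the packet size. A minimal sketch of the conversion, assuming standard Ethernet on-the-wire overhead of 20 bytes per frame (7 B preamble, 1 B start-of-frame delimiter, 12 B inter-frame gap); `line_rate_mpps` is an illustrative helper, not part of NSB:

```python
def line_rate_mpps(line_rate_gbps, packet_size_bytes):
    """Theoretical packet rate in Mpps for a given line rate and packet size.

    Each frame occupies (packet_size + 20) bytes on the wire, because of
    the Ethernet preamble, start-of-frame delimiter and inter-frame gap.
    """
    bits_per_packet = (packet_size_bytes + 20) * 8
    return line_rate_gbps * 1e9 / bits_per_packet / 1e6

# 64-byte packets at 10 Gbit/s line rate: the classic 14.88 Mpps figure.
print(round(line_rate_mpps(10, 64), 2))
```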

Developer

Documentation Guide

This page intends to cover the documentation handling for OPNFV. OPNFV projects are expected to create a variety of document types, according to the nature of the project. Some of these are common to projects that develop/integrate features into the OPNFV platform, e.g. Installation Instructions and User/Configurations Guides. Other document types may be project-specific.

Getting Started with Documentation for Your Project

OPNFV documentation is automated and integrated into our git & gerrit toolchains.

We use RST document templates in our repositories and automatically render HTML and PDF versions of the documents into our artifact store. Our wiki can also integrate these rendered documents directly, allowing projects to use the revision-controlled documentation process for project information, content and deliverables. Read this page, which elaborates on how documentation is to be included within opnfvdocs.

Licensing your documentation

All contributions to the OPNFV project are made in accordance with the OPNFV licensing requirements. Documentation in OPNFV is contributed in accordance with the Creative Commons Attribution 4.0 International and `SPDX <https://spdx.org/>`_ licences. All documentation files need to be licensed using the text below. The license must be applied in the first lines of all contributed RST files:

.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. SPDX-License-Identifier: CC-BY-4.0
.. (c) <optionally add copyright holder's name>

These lines will not be rendered in the HTML and PDF files.
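A quick way to check that a contributed RST file starts with the required header is a small script like the following; this is an illustrative sketch, not part of the OPNFV toolchain:

```python
# The two mandatory license comment lines from this guide.
REQUIRED_HEADER = [
    ".. This work is licensed under a Creative Commons Attribution 4.0 International License.",
    ".. SPDX-License-Identifier: CC-BY-4.0",
]

def has_license_header(rst_text):
    """Return True if the first two lines of an RST document match the header."""
    lines = rst_text.splitlines()
    return lines[:2] == REQUIRED_HEADER

sample = "\n".join(REQUIRED_HEADER + [".. (c) Example Author", "", "Title", "====="])
print(has_license_header(sample))
```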
How and where to store the document content files in your repository

All documentation for your project should be structured and stored in the <repo>/docs/ directory. The documentation toolchain will look in these directories and be triggered on events in these directories when generating documents.

Document structure and contribution

A general structure is proposed for storing and handling documents that are common across many projects, but also for documents that may be project specific. The documentation is divided into three areas: Release, Development and Testing. Templates for these areas can be found under opnfvdocs/docs/templates/.

Project teams are encouraged to use templates provided by the opnfvdocs project to ensure consistency across the community. The following representation shows the expected structure:

docs/
├── development
│   ├── design
│   ├── overview
│   └── requirements
├── release
│   ├── configguide
│   ├── installation
│   ├── release-notes
│   ├── scenarios
│   │   └── scenario.name
│   └── userguide
├── testing
│   ├── developer
│   └── user
└── infrastructure
    ├── hardware-infrastructure
    ├── software-infrastructure
    ├── continuous-integration
    └── cross-community-continuous-integration
Release documentation

Release documentation is the set of documents that are published for each OPNFV release. These documents are created and developed following the OPNFV release process and milestones, and should reflect the content of the OPNFV release. These documents have a master index.rst file in the <opnfvdocs> repository and extract content from other repositories. To provide content for these documents, place your <content>.rst files in a directory in your repository that matches the master document, and add a reference to that file in the correct place in the corresponding index.rst file in opnfvdocs/docs/release/.

Platform Overview: opnfvdocs/docs/release/overview

  • Note this document is not a contribution driven document
  • Content for this is prepared by the Marketing team together with the opnfvdocs team

Installation Instruction: <repo>/docs/release/installation

  • Folder for documents describing how to deploy each installer and scenario descriptions
  • Release notes will be included here <To Confirm>
  • Security related documents will be included here
  • Note that this document will be compiled into ‘OPNFV Installation Instruction’

User Guide: <repo>/docs/release/userguide

  • Folder for manuals to use specific features
  • Folder for documents describing how to install/configure project specific components and features
  • Can be the directory where API references for project-specific features are stored
  • Note this document will be compiled into ‘OPNFV userguide’

Configuration Guide: <repo>/docs/release/configguide

  • Brief introduction to configuring OPNFV with its dependencies.

Release Notes: <repo>/docs/release/release-notes

  • Changes brought about in the release cycle.
  • Include version details.
Testing documentation

Documentation created by test projects can be stored under two different sub-directories: /user or /developer. Release notes will be stored under <repo>/docs/release/release-notes

User documentation: <repo>/testing/user/ Will collect the documentation of the test projects allowing the end user to perform testing toward an OPNFV SUT e.g. Functest/Yardstick/Vsperf/Storperf/Bottlenecks/Qtip installation/config & user guides.

Development documentation: <repo>/testing/developer/ Will collect documentation to explain how to create your own test case and leverage existing testing frameworks e.g. developer guides.

Development Documentation

Project specific documents such as design documentation, project overview or requirement documentation can be stored under /docs/development. Links to generated documents will be displayed under the Development Documentation section on docs.opnfv.org. You are encouraged to establish the following basic structure for your project as needed:

Requirement Documentation: <repo>/docs/development/requirements/

  • Folder for your requirement documentation
  • For details on requirements projects’ structures see the Requirements Projects page.

Design Documentation: <repo>/docs/development/design

  • Folder for your upstream design documents (blueprints, development proposals, etc..)

Project overview: <repo>/docs/development/overview

  • Folder for any project specific documentation.
Infrastructure Documentation

Infrastructure documentation can be stored under <repo>/docs/ folder of corresponding infrastructure project.

Including your Documentation

In your project repository

Add your documentation to your repository in the folder structure and according to the templates listed above. The documentation templates you will require are available in the opnfvdocs/docs/templates/ directory; you should copy the relevant templates to the <repo>/docs/ directory in your repository. For instance, if you want to document the user guide, the steps are as follows:

git clone ssh://<your_id>@gerrit.opnfv.org:29418/opnfvdocs.git
cp -p opnfvdocs/docs/userguide/* <my_repo>/docs/userguide/

You should then add the relevant information to the template that will explain the documentation. When you are done writing, you can commit the documentation to the project repository.

git add .
git commit --signoff --all
git review
In OPNFVDocs Composite Documentation
In toctree
To import project documents from project repositories, we use submodules.
Each project is stored in opnfvdocs/docs/submodules/ as follows:
[Figure: Submodules.jpg - submodule directory layout]

To include your project-specific documentation in the composite documentation, first identify where your project documentation should be included. Say your project user guide should appear in the ‘OPNFV Userguide’; then:

vim opnfvdocs/docs/release/userguide.introduction.rst

This opens the text editor. Identify where you want to add the userguide. If the userguide is to be added to the toctree, simply include the path to it, example:

.. toctree::
    :maxdepth: 1

 submodules/functest/docs/userguide/index
 submodules/bottlenecks/docs/userguide/index
 submodules/yardstick/docs/userguide/index
 <submodules/path-to-your-file>
‘doc8’ Validation

It is recommended that all RST content be validated against doc8 standards. To validate your RST files using doc8, first install doc8:

sudo pip install doc8

doc8 can now be used to check the RST files. Execute it as:

doc8 --ignore D000,D001 <file>
Testing: Build Documentation Locally
Composite OPNFVDOCS documentation

To build the whole documentation under opnfvdocs/, follow these steps:

Install virtual environment.

sudo pip install virtualenv
cd /local/repo/path/to/project

Download the OPNFVDOCS repository.

git clone https://gerrit.opnfv.org/gerrit/opnfvdocs

Change directory to opnfvdocs & install requirements.

cd opnfvdocs
sudo pip install -r etc/requirements.txt

Update submodules, build documentation using tox & then open using any browser.

cd opnfvdocs
git submodule update --init
tox -edocs
firefox docs/_build/html/index.html

Note

Make sure to run tox -edocs and not just tox.

Individual project documentation

To test how the documentation renders in HTML, follow these steps:

Install virtual environment.

sudo pip install virtualenv
cd /local/repo/path/to/project

Download the opnfvdocs repository.

git clone https://gerrit.opnfv.org/gerrit/opnfvdocs

Change directory to opnfvdocs & install requirements.

cd opnfvdocs
sudo pip install -r etc/requirements.txt

Move the conf.py file to your project folder where RST files have been kept:

mv opnfvdocs/docs/conf.py <path-to-your-folder>/

Move the static files to your project folder:

mv opnfvdocs/_static/ <path-to-your-folder>/

Build the documentation from within your project folder:

sphinx-build -b html <path-to-your-folder> <path-to-output-folder>

Your documentation will be built as HTML inside the specified output folder.

Note

Be sure to remove conf.py, the _static/ files and the output folder from <project>/docs/. This is for testing only. Commit only the RST files and related content.

Adding your project repository as a submodule

Clone the opnfvdocs repository and add your project as a submodule to .gitmodules, following the convention of the file:

cd docs/submodules/
git submodule add https://gerrit.opnfv.org/gerrit/$reponame
git submodule init $reponame/
git submodule update $reponame/
git add .
git commit -sv
git review
Removing a project repository as a submodule
git rm docs/submodules/$reponame
rm -rf .git/modules/$reponame
git config -f .git/config --remove-section submodule.$reponame 2> /dev/null
git add .
git commit -sv
git review

Submodule Transition

Moving away from submodules.

At the cost of some release-time overhead, there are several benefits the transition provides projects:

  • Local builds - Projects will be able to build and view their docs locally, as they would appear on the OPNFV Docs website.
  • Reduced build time - Patchset verification will only run against the individual project’s docs, not all projects.
  • Decoupled build failures - Any error introduced to a project’s docs will not break builds for all the other projects.
Steps

To make the transition the following steps need to be taken across three repositories:

  • Your project repository (Ex. Fuel)
  • The Releng repository
  • The OPNFV Docs repository
Adding a Local Build

In your project repo:

  1. Add the following files:

    docs/conf.py

    from docs_conf.conf import *  # noqa: F401,F403
    

    docs/conf.yaml

    ---
    project_cfg: opnfv
    project: Example
    

    docs/requirements.txt

    lfdocs-conf
    sphinx_opnfv_theme
    # Uncomment the following line if your project uses Sphinx to document
    # HTTP APIs
    # sphinxcontrib-httpdomain
    

    tox.ini

    [tox]
    minversion = 1.6
    envlist =
        docs,
        docs-linkcheck
    skipsdist = true
    
    [testenv:docs]
    deps = -rdocs/requirements.txt
    commands =
        sphinx-build -b html -n -d {envtmpdir}/doctrees ./docs/ {toxinidir}/docs/_build/html
        echo "Generated docs available in {toxinidir}/docs/_build/html"
    whitelist_externals = echo
    
    [testenv:docs-linkcheck]
    deps = -rdocs/requirements.txt
    commands = sphinx-build -b linkcheck -d {envtmpdir}/doctrees ./docs/ {toxinidir}/docs/_build/linkcheck
    

    .gitignore

    .tox/
    docs/_build/*

    docs/index.rst

    If this file doesn’t exist, it will need to be created, along with any other missing index files for directories (release, development). An example of the file’s content looks like this:

    .. This work is licensed under a Creative Commons Attribution 4.0 International License.
    .. SPDX-License-Identifier: CC-BY-4.0
    .. (c) Open Platform for NFV Project, Inc. and its contributors
    
    .. _<project-name>:
    
    ==============
    <project-name>
    ==============
    
    .. toctree::
       :numbered:
       :maxdepth: 2
    
       release/release-notes/index
       release/installation/index
       release/userguide/index
       scenarios/index
    

You can verify the build works by running:

tox -e docs
Creating a CI Job

In the releng repository:

  1. Update your project’s job file jjb/<project>/<project>-jobs.yaml with the following (taken from this guide):

    ---
    - project:
        name: PROJECT
        project: PROJECT
        project-name: 'PROJECT'
    
        project-pattern: 'PROJECT'
        rtd-build-url: RTD_BUILD_URL
        rtd-token: RTD_TOKEN
    
        jobs:
          - '{project-name}-rtd-jobs'
    

You can either send an email to helpdesk in order to get a copy of RTD_BUILD_URL and RTD_TOKEN, ping aricg or bramwelt in #opnfv-docs on Freenode, or add Aric Gardner or Trevor Bramwell to your patch as a reviewer and they will pass along the token and build URL.

Removing the Submodule

In the opnfvdocs repository:

  1. Add an intersphinx link to the opnfvdocs repo configuration:

    docs/conf.py

    intersphinx_mapping['<project>'] = ('http://opnfv-<project>.readthedocs.io', None)
    

    If the project exists on ReadTheDocs, and the previous build was merged in and ran, you can verify the linking is working correctly by finding the following line in the output of tox -e docs:

    loading intersphinx inventory from https://opnfv-<project>.readthedocs.io/en/latest/objects.inv...
    
  2. Ensure all references in opnfvdocs are using :ref: or :doc: and not directly specifying submodule files with ../submodules/<project>.

    For example:

    .. toctree::
    
       ../submodules/releng/docs/overview.rst
    

    Would become:

    .. toctree::
    
       :ref:`Releng Overview <releng:overview>`
    

    Some more examples can be seen here.

  3. Remove the submodule from opnfvdocs, replacing <project> with your project and commit the change:

    git rm docs/submodules/<project>
    git commit -s
    git review
    

Addendum

Index File

The index file must use relative references to the other RST files in that directory.

Here is an example index.rst:

*******************
Documentation Title
*******************

.. toctree::
   :numbered:
   :maxdepth: 2

   documentation-example
Source Files

Document source files have to be written in reStructuredText (RST) format. Each file will be built as an HTML page.

Here is an example source RST file:

=============
Chapter Title
=============

Section Title
=============

Subsection Title
----------------

Hello!
Writing RST Markdown

See http://sphinx-doc.org/rest.html.

Hint: You can add build-specific content by using the ‘only’ directive with a build type (‘html’ or ‘singlehtml’) for an OPNFV document. This is not encouraged, however, since it may produce diverging views of the document.

.. only:: html
    This line will be shown only in html version.
Verify Job

The verify job name is docs-verify-rtd-{branch}.

When you send document changes to Gerrit, Jenkins will build your documents in HTML formats (normal and single-page) to verify that the new document can be built successfully. Please check the Jenkins log and artifacts carefully. You can still improve your document even if the build job succeeded.

Merge Job

The merge job name is docs-merge-rtd-{branch}.

Once the patch is merged, Jenkins will automatically trigger a build of the new documentation. This might take about 15 minutes while ReadTheDocs builds the documentation. The newly built documentation will show up in the appropriate place at docs.opnfv.org/{branch}/path-to-file.

OPNFV Projects

Apex

2. OPNFV Installation instructions (Apex)

Contents:

2.1. Abstract

This document describes how to install the Fraser release of OPNFV when using Apex as a deployment tool, covering its limitations, dependencies and required system resources.

2.2. License

Fraser release of OPNFV when using Apex as a deployment tool Docs (c) by Tim Rozet (Red Hat)

Fraser release of OPNFV when using Apex as a deployment tool Docs are licensed under a Creative Commons Attribution 4.0 International License. You should have received a copy of the license along with this. If not, see <http://creativecommons.org/licenses/by/4.0/>.

2.3. Introduction

This document describes the steps to install an OPNFV Fraser reference platform using the Apex installer.

The audience is assumed to have a good background in networking and Linux administration.

2.4. Preface

Apex uses Triple-O from the RDO Project OpenStack distribution as a provisioning tool. The Triple-O image based life cycle installation tool provisions an OPNFV Target System (1 or 3 controllers, 0 or more compute nodes) with OPNFV specific configuration provided by the Apex deployment tool chain.

The Apex deployment artifacts contain the necessary tools to deploy and configure an OPNFV target system using the Apex deployment toolchain. These artifacts offer the choice of using the Apex bootable ISO (opnfv-apex-fraser.iso), which installs CentOS 7 along with the necessary materials to deploy, or the Apex RPMs (opnfv-apex*.rpm) and their associated dependencies, which expect installation on a CentOS 7 libvirt-enabled host. The RPMs contain a collection of configuration files, prebuilt disk images, and the automatic deployment script (opnfv-deploy).

An OPNFV install requires a “Jump Host” in order to operate. The bootable ISO will allow you to install a customized CentOS 7 release to the Jump Host, which includes the required packages needed to run opnfv-deploy. If you already have a Jump Host with CentOS 7 installed, you may choose to skip the ISO step and simply install the (opnfv-apex*.rpm) RPMs. The RPMs are the same RPMs included in the ISO and include all the necessary disk images and configuration files to execute an OPNFV deployment. Either method will prepare a host to the same ready state for OPNFV deployment.

opnfv-deploy instantiates a Triple-O Undercloud VM server using libvirt as its provider. This VM is then configured and used to provision the OPNFV target deployment. These nodes can be either virtual or bare metal. This guide contains instructions for both methods.

2.5. Triple-O Deployment Architecture

Apex is based on the OpenStack Triple-O project as distributed by the RDO Project. It is important to understand the basics of a Triple-O deployment to help make decisions that will assist in successfully deploying OPNFV.

Triple-O stands for OpenStack On OpenStack. This means that OpenStack will be used to install OpenStack. The target OPNFV deployment is an OpenStack cloud with NFV features built-in that will be deployed by a smaller all-in-one deployment of OpenStack. In this deployment methodology there are two OpenStack installations. They are referred to as the undercloud and the overcloud. The undercloud is used to deploy the overcloud.

The undercloud is the all-in-one installation of OpenStack that includes baremetal provisioning capability. The undercloud will be deployed as a virtual machine on a Jump Host. This VM is pre-built and distributed as part of the Apex RPM.

The overcloud is OPNFV. Configuration will be passed into undercloud and the undercloud will use OpenStack’s orchestration component, named Heat, to execute a deployment that will provision the target OPNFV nodes.

2.6. Apex High Availability Architecture
2.6.1. Undercloud

The undercloud is not Highly Available. End users do not depend on the undercloud. It is only for management purposes.

2.6.2. Overcloud

Apex will deploy three control nodes in an HA deployment. Each of these nodes will run the following services:

  • Stateless OpenStack services
  • MariaDB / Galera
  • RabbitMQ
  • OpenDaylight
  • HA Proxy
  • Pacemaker & VIPs
  • Ceph Monitors and OSDs
Stateless OpenStack services
All running stateless OpenStack services are load balanced by HA Proxy. Pacemaker monitors the services and ensures that they are running.
Stateful OpenStack services
All running stateful OpenStack services are load balanced by HA Proxy. They are monitored by pacemaker in an active/passive failover configuration.
MariaDB / Galera
The MariaDB database is replicated across the control nodes using Galera. Pacemaker is responsible for a proper start up of the Galera cluster. HA Proxy provides an active/passive failover methodology for connections to the database.
RabbitMQ
The message bus is managed by Pacemaker to ensure proper start up and establishment of clustering across cluster members.
OpenDaylight
OpenDaylight is currently installed on all three control nodes and started as an HA cluster unless otherwise noted for that scenario. OpenDaylight’s database, known as MD-SAL, breaks up pieces of the database into “shards”. Each shard has its own election take place, which determines which OpenDaylight node is the leader for that shard. The other OpenDaylight nodes in the cluster remain in standby. Every Open vSwitch node connects to every OpenDaylight node to enable HA.
HA Proxy
HA Proxy is monitored by Pacemaker to ensure it is running across all nodes and available to balance connections.
Pacemaker & VIPs
Pacemaker has relationships and restraints setup to ensure proper service start up order and Virtual IPs associated with specific services are running on the proper host.
Ceph Monitors & OSDs
The Ceph monitors run on each of the control nodes. Each control node also has a Ceph OSD running on it. By default the OSDs use an autogenerated virtual disk as their target device. A non-autogenerated device can be specified in the deploy file.

VM Migration is configured and VMs can be evacuated as needed or as invoked by tools such as heat as part of a monitored stack deployment in the overcloud.

2.7. OPNFV Scenario Architecture

OPNFV distinguishes different types of SDN controllers, deployment options, and features into “scenarios”. These scenarios are universal across all OPNFV installers, although some may or may not be supported by each installer.

The standard naming convention for a scenario is: <VIM platform>-<SDN type>-<feature>-<ha/noha>

The only supported VIM type is “OS” (OpenStack), while SDN types can be any supported SDN controller. “feature” includes things like ovs_dpdk, sfc, etc. “ha” or “noha” determines whether the deployment will be highly available. If “ha” is used, at least 3 control nodes are required.
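The naming convention above can be illustrated with a short sketch; `parse_scenario` is a hypothetical helper for illustration, not part of Apex:

```python
def parse_scenario(name):
    """Split an OPNFV scenario name of the form
    <VIM platform>-<SDN type>-<feature>-<ha/noha> into its parts."""
    vim, sdn, feature, ha = name.split("-")
    return {"vim": vim, "sdn": sdn, "feature": feature, "ha": ha == "ha"}

print(parse_scenario("os-odl-ovs_dpdk-ha"))
```

Note that the feature field uses underscores (e.g. ovs_dpdk), so a scenario name always splits into exactly four dash-separated parts.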

2.8. OPNFV Scenarios in Apex

Apex provides pre-built scenario files in /etc/opnfv-apex which a user can select from to deploy the desired scenario. Simply pass the desired file to the installer as a (-d) deploy setting. Read further in the Apex documentation to learn more about invoking the deploy command. Below is a quick reference matrix for OPNFV scenarios supported in Apex. Please refer to the respective OPNFV Docs documentation for each scenario in order to see a full scenario description. Also, please refer to the release notes for information about known issues per scenario. The following scenarios correspond to a supported <Scenario>.yaml deploy settings file:

Scenario                     Owner        Supported
os-nosdn-nofeature-ha        Apex         Yes
os-nosdn-nofeature-noha      Apex         Yes
os-nosdn-bar-ha              Barometer    Yes
os-nosdn-bar-noha            Barometer    Yes
os-nosdn-calipso-noha        Calipso      No
os-nosdn-ovs_dpdk-ha         Apex         No
os-nosdn-ovs_dpdk-noha       Apex         No
os-nosdn-fdio-ha             FDS          No
os-nosdn-fdio-noha           FDS          No
os-nosdn-kvm_ovs_dpdk-ha     KVM for NFV  No
os-nosdn-kvm_ovs_dpdk-noha   KVM for NFV  No
os-nosdn-performance-ha      Apex         No
os-odl-nofeature-ha          Apex         Yes
os-odl-nofeature-noha        Apex         Yes
os-odl-ovs_dpdk-ha           Apex         No
os-odl-ovs_dpdk-noha         Apex         No
os-odl-bgpvpn-ha             SDNVPN       Yes
os-odl-bgpvpn-noha           SDNVPN       Yes
os-odl-sriov-ha              Apex         No
os-odl-sriov-noha            Apex         No
os-odl-l2gw-ha               Apex         No
os-odl-l2gw-noha             Apex         No
os-odl-sfc-ha                SFC          No
os-odl-sfc-noha              SFC          No
os-odl-gluon-noha            Gluon        No
os-odl-csit-noha             Apex         No
os-odl-fdio-ha               FDS          No
os-odl-fdio-noha             FDS          No
os-odl-fdio_dvr-ha           FDS          No
os-odl-fdio_dvr-noha         FDS          No
os-onos-nofeature-ha         ONOSFW       No
os-onos-sfc-ha               ONOSFW       No
os-ovn-nofeature-noha        Apex         Yes
2.9. Setup Requirements
2.9.1. Jump Host Requirements

The Jump Host requirements are outlined below:

  1. CentOS 7 (from ISO or self-installed).
  2. Root access.
  3. libvirt virtualization support.
  4. A minimum of 1 network and a maximum of 5 networks; multiple NIC and/or VLAN combinations are supported. This is virtualized for a VM deployment.
  5. The Fraser Apex RPMs and their dependencies.
  6. 16 GB of RAM for a bare metal deployment, 64 GB of RAM for a Virtual Deployment.
2.9.2. Network Requirements

Network requirements include:

  1. No DHCP or TFTP server running on networks used by OPNFV.
  2. 1-5 separate networks with connectivity between Jump Host and nodes.
    • Control Plane (Provisioning)
    • Private Tenant-Networking Network*
    • External Network*
    • Storage Network*
    • Internal API Network* (required for IPv6 **)
  3. Lights-out OOB network access from the Jump Host, with IPMI enabled on each node (bare metal deployment only).
  4. The External network is a routable network from outside the cloud deployment. It is where public internet access would reside, if available.

*These networks can be combined with each other or all combined on the Control Plane network.

**The Internal API network is, by default, collapsed with the provisioning network in IPv4 deployments. This is not possible given the current lack of IPv6 PXE boot support, and therefore the API network is required to be its own network in an IPv6 deployment.

2.9.3. Bare Metal Node Requirements

Bare metal nodes require:

  1. IPMI enabled on OOB interface for power control.
  2. BIOS boot priority should be PXE first then local hard disk.
  3. BIOS PXE interface should include Control Plane network mentioned above.
2.9.4. Execution Requirements (Bare Metal Only)

In order to execute a deployment, one must gather the following information:

  1. IPMI IP addresses for the nodes.
  2. IPMI login information for the nodes (user/pass).
  3. MAC address of Control Plane / Provisioning interfaces of the overcloud nodes.
2.10. Installation High-Level Overview - Bare Metal Deployment

The setup presumes that you have 6 or more bare metal servers already set up with network connectivity on at least one network interface for all servers, via a TOR switch or other network implementation.

The physical TOR switches are not automatically configured from the OPNFV reference platform. All the networks involved in the OPNFV infrastructure, as well as the provider networks and the private tenant VLANs, need to be manually configured.

The Jump Host can be installed using the bootable ISO or by using the (opnfv-apex*.rpm) RPMs and their dependencies. The Jump Host should then be configured with an IP gateway on its admin or public interface and configured with a working DNS server. The Jump Host should also have routable access to the lights out network for the overcloud nodes.

opnfv-deploy is then executed in order to deploy the undercloud VM and to provision the overcloud nodes. opnfv-deploy uses three configuration files in order to know how to install and provision the OPNFV target system. The information gathered under section Execution Requirements (Bare Metal Only) is put into the YAML file /etc/opnfv-apex/inventory.yaml. Deployment options are put into the YAML file /etc/opnfv-apex/deploy_settings.yaml. Alternatively, there are pre-baked deploy_settings files available in /etc/opnfv-apex/. These files are named with the naming convention os-sdn_controller-enabled_feature-[no]ha.yaml. These files can be used in place of the /etc/opnfv-apex/deploy_settings.yaml file if one suits your deployment needs. Networking definitions gathered under section Network Requirements are put into the YAML file /etc/opnfv-apex/network_settings.yaml. opnfv-deploy will boot the undercloud VM and load the target deployment configuration into the provisioning toolchain. This information includes MAC addresses, IPMI, the networking environment and OPNFV deployment options.
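The pre-baked deploy settings file naming convention can be sketched as follows; `deploy_settings_filename` is a hypothetical helper for illustration, not part of opnfv-deploy:

```python
def deploy_settings_filename(sdn_controller, feature, ha):
    """Compose a pre-baked deploy settings file name following the
    os-sdn_controller-enabled_feature-[no]ha.yaml convention."""
    return "os-{}-{}-{}.yaml".format(sdn_controller, feature, "ha" if ha else "noha")

# e.g. an HA OpenDaylight deployment with no extra feature:
print(deploy_settings_filename("odl", "nofeature", True))
```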

Once configuration is loaded and the undercloud is configured it will then reboot the overcloud nodes via IPMI. The nodes should already be set to PXE boot first off the admin interface. The nodes will first PXE off of the undercloud PXE server and go through a discovery/introspection process.

Introspection boots off of custom introspection PXE images. These images are designed to look at the properties of the hardware that is being booted and report the properties of it back to the undercloud node.

After introspection the undercloud will execute a Heat Stack Deployment to continue node provisioning and configuration. The nodes will reboot and PXE from the undercloud PXE server again to provision each node using Glance disk images provided by the undercloud. These disk images include all the necessary packages and configuration for an OPNFV deployment to execute. Once the disk images have been written to node’s disks the nodes will boot locally and execute cloud-init which will execute the final node configuration. This configuration is largely completed by executing a puppet apply on each node.

2.11. Installation Guide - Bare Metal Deployment

This section goes step-by-step on how to correctly install and provision the OPNFV target system to bare metal nodes.

2.11.1. Install Bare Metal Jump Host
1a. If your Jump Host does not have CentOS 7 already on it, or you would like to do a fresh install, then download the Apex bootable ISO from the OPNFV artifacts site <http://artifacts.opnfv.org/apex.html>. There have been isolated reports of the ISO having trouble completing installation successfully. In the unexpected event the ISO does not work, work around this by downloading the CentOS 7 DVD and performing a “Virtualization Host” install. If you performed a “Minimal Install” or an install type other than “Virtualization Host”, run sudo yum -y groupinstall "Virtualization Host" followed by chkconfig libvirtd on && reboot to install virtualization support and enable libvirt on boot. If you use the CentOS 7 DVD, proceed to step 1b once the CentOS 7 install with “Virtualization Host” support is completed.
1b. Boot the ISO off of a USB or other installation media and walk through installing OPNFV CentOS 7. The ISO comes prepared to be written directly to a USB drive with dd as such:

dd if=opnfv-apex.iso of=/dev/sdX bs=4M

Replace /dev/sdX with the device assigned to your USB drive, then select the USB device as the boot media on your Jump Host.

2a. When not using the OPNFV Apex ISO, install these repos:

sudo yum install https://repos.fedorapeople.org/repos/openstack/openstack-pike/rdo-release-pike-1.noarch.rpm
sudo yum install epel-release
sudo curl -o /etc/yum.repos.d/opnfv-apex.repo http://artifacts.opnfv.org/apex/fraser/opnfv-apex.repo

The RDO Project release repository is needed to install Open vSwitch, which is a dependency of opnfv-apex. If you do not have external connectivity to use this repository, you need to download the Open vSwitch RPM from the RDO Project repositories and install it along with the opnfv-apex RPM. The opnfv-apex repo hosts all of the Apex dependencies, which will be installed automatically when installing the RPMs; when using the ISO they come pre-installed.

2b. If you chose not to use the Apex ISO, then you must download and install the Apex RPMs to the Jump Host. Download the first 3 Apex RPMs from the OPNFV downloads page, under the TripleO RPMs: https://www.opnfv.org/software/downloads. The following RPMs are available for installation:

  • opnfv-apex - OpenDaylight, OVN, and nosdn support
  • opnfv-apex-undercloud - (required) Undercloud image
  • python34-opnfv-apex - (required) OPNFV Apex Python package
  • python34-markupsafe - (required) Dependency of python34-opnfv-apex **
  • python34-jinja2 - (required) Dependency of python34-opnfv-apex **
  • python3-ipmi - (required) Dependency of python34-opnfv-apex **
  • python34-pbr - (required) Dependency of python34-opnfv-apex **
  • python34-virtualbmc - (required) Dependency of python34-opnfv-apex **
  • python34-iptables - (required) Dependency of python34-opnfv-apex **
  • python34-cryptography - (required) Dependency of python34-opnfv-apex **
  • python34-libvirt - (required) Dependency of python34-opnfv-apex **

** These RPMs are not yet distributed by CentOS or EPEL, so Apex builds and distributes them itself. Once they are carried in an upstream channel, Apex will no longer carry them and they will not need special handling for installation. You do not need to install them explicitly; they will be installed automatically when installing python34-opnfv-apex, provided the opnfv-apex.repo has previously been downloaded to /etc/yum.repos.d/.

Install the three required RPMs (replace <rpm> with the actual downloaded artifacts):

yum -y install <opnfv-apex.rpm> <opnfv-apex-undercloud.rpm> <python34-opnfv-apex.rpm>

  1. After the operating system and the opnfv-apex RPMs are installed, login to your Jump Host as root.
  2. Configure IP addresses on the interfaces that you have selected as your networks.
  3. Configure the IP gateway to the Internet, preferably on the public interface.
  4. Configure your /etc/resolv.conf to point to a DNS server (8.8.8.8 is provided by Google).
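The steps above might look like the following; the interface names, addresses and gateway are purely illustrative placeholders, so substitute the NICs and subnets chosen for your environment:

```shell
# Assign an address on the admin/provisioning network (example interface and subnet)
sudo ip addr add 192.0.2.1/24 dev eth1
# Default route out the public interface (example gateway)
sudo ip route add default via 10.10.10.1 dev eth0
# Point DNS at a reachable server (8.8.8.8 is provided by Google)
echo "nameserver 8.8.8.8" | sudo tee -a /etc/resolv.conf
```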
2.11.2. Creating a Node Inventory File

IPMI configuration information gathered in section Execution Requirements (Bare Metal Only) needs to be added to the inventory.yaml file.

  1. Copy /usr/share/doc/opnfv/inventory.yaml.example as your inventory file template to /etc/opnfv-apex/inventory.yaml.

  2. The nodes dictionary contains a definition block for each baremetal host that will be deployed. 0 or more compute nodes and 1 or 3 controller nodes are required (the example file already contains blocks for each of these). It is optional at this point to add more compute nodes into the node list. Specifying 0 compute nodes in the inventory file will automatically deploy “all-in-one” nodes, meaning the compute services run alongside the controller in a single overcloud node. Specifying 3 control nodes will result in a highly-available service model.

  3. Edit the following values for each node:

    • mac_address: MAC of the interface that will PXE boot from undercloud

    • ipmi_ip: IPMI IP Address

    • ipmi_user: IPMI username

    • ipmi_password: IPMI password

    • pm_type: Power Management driver to use for the node

      values: pxe_ipmitool (tested) or pxe_wol (untested) or pxe_amt (untested)

    • cpus: (Introspected*) CPU cores available

    • memory: (Introspected*) Memory available in MiB

    • disk: (Introspected*) Disk space available in GB

    • disk_device: (Opt***) Root disk device to use for installation

    • arch: (Introspected*) System architecture

    • capabilities: (Opt**) Node’s role in deployment

      values: profile:control or profile:compute

    * Introspection looks up the overcloud node’s resources and overrides these values. You can leave the default values, and Apex will fill in the correct ones when it runs introspection on the nodes.

    ** If a capabilities profile is not specified then Apex will select the node’s role in the OPNFV cluster in a non-deterministic fashion.

    *** disk_device declares which hard disk to use as the root device for installation. The format is a comma delimited list of devices, such as “sda,sdb,sdc”. The disk chosen will be the first device in the list which is found by introspection to exist on the system. Currently, only a single definition is allowed for all nodes. Therefore if multiple disk_device definitions occur within the inventory, only the last definition on a node will be used for all nodes.
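As an illustration only, a single node block built from the fields above might look like the following; every value here is a placeholder, and the exact surrounding structure should be taken from the provided inventory.yaml.example:

```yaml
nodes:
  node1:
    mac_address: "00:1e:67:4f:b6:81"   # placeholder: MAC of the PXE-boot interface
    ipmi_ip: 192.0.2.10                # placeholder IPMI address
    ipmi_user: admin
    ipmi_password: password
    pm_type: pxe_ipmitool
    cpus: 2                            # overridden by introspection
    memory: 8192                       # MiB; overridden by introspection
    disk: 40                           # GB; overridden by introspection
    arch: x86_64                       # overridden by introspection
    capabilities: profile:control      # or profile:compute
```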

2.11.3. Creating the Settings Files

Edit the 2 settings files in /etc/opnfv-apex/. These files have comments to help you customize them.

  1. deploy_settings.yaml This file includes basic configuration options for the deployment and also documents all available options. Alternatively, there are pre-built deploy_settings files available in /etc/opnfv-apex/, named with the convention os-sdn_controller-enabled_feature-[no]ha.yaml. If one of these suits your deployment needs, it can be used in place of /etc/opnfv-apex/deploy_settings.yaml, in which case there is no need to customize that file.
  2. network_settings.yaml This file provides Apex with the networking information that satisfies the prerequisite Network Requirements. These are specific to your environment.
2.11.4. Running opnfv-deploy

You are now ready to deploy OPNFV using Apex! opnfv-deploy will use the inventory and settings files to deploy OPNFV.

Follow the steps below to execute:

  1. Execute opnfv-deploy: sudo opnfv-deploy -n network_settings.yaml -i inventory.yaml -d deploy_settings.yaml. For more information about the options that can be passed to opnfv-deploy, use opnfv-deploy --help. -n network_settings.yaml allows you to customize your networking topology. Note that it can also be useful to run the command with the --debug argument, which enables a root login on the overcloud nodes with the password ‘opnfvapex’. It is also useful in some cases to wrap the deploy command with nohup, for example nohup <deploy command> &, which allows a deployment to continue even if ssh access to the Jump Host is lost during deployment.
  2. Wait while deployment is executed. If something goes wrong during this part of the process, start by reviewing your network or the information in your configuration files. It’s not uncommon for something small to be overlooked or mis-typed. You will also notice outputs in your shell as the deployment progresses.
  3. When the deployment is complete the undercloud IP and overcloud dashboard url will be printed. OPNFV has now been deployed using Apex.
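Putting the options from step 1 together, a full bare-metal deploy wrapped in nohup might look like the sketch below; the settings files are assumed to be in the current directory, and the log file name is arbitrary:

```shell
# nohup lets the deployment survive loss of the ssh session
nohup sudo opnfv-deploy -n network_settings.yaml -i inventory.yaml \
  -d deploy_settings.yaml > deploy.log 2>&1 &
tail -f deploy.log   # follow deployment progress
```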
2.12. Installation High-Level Overview - Virtual Deployment

Deploying virtually is an alternative deployment method to bare metal, where only a single bare metal Jump Host server is required to execute deployment. This deployment type is useful when physical resources are constrained, or there is a desire to deploy a temporary sandbox environment.

With virtual deployments, two deployment options are offered. The first is a standard deployment where the Jump Host server will host the undercloud VM along with any number of OPNFV overcloud control/compute nodes. This follows the same deployment workflow as baremetal, and can take between 1 to 2 hours to complete.

The second option is to use snapshot deployments. Snapshots are saved disk images of previously deployed OPNFV upstream. These snapshots are promoted daily and contain an already deployed OPNFV environment that has passed a series of tests. The advantage of a snapshot is that it deploys in less than 10 minutes. Another major advantage is that snapshots work on both CentOS and Fedora. Note: Fedora support is only tested via PIP installation at this time, not via RPM.

2.12.1. Standard Deployment Overview

The virtual deployment operates almost the same way as the bare metal deployment, with a few differences mainly related to power management. opnfv-deploy still deploys an undercloud VM. In addition to the undercloud VM, a collection of VMs (3 control nodes + 2 compute for an HA deployment, or 1 control node and 0 or more compute nodes for a non-HA deployment) will be defined for the target OPNFV deployment. All overcloud VMs are registered with a Virtual BMC emulator which services power management (IPMI) commands. The overcloud VMs are still provisioned with the same disk images and configuration that bare metal would use. Using 0 compute nodes for a virtual deployment will automatically deploy “all-in-one” nodes, meaning the compute services run alongside the controller in a single overcloud node. Specifying 3 control nodes will result in a highly-available service model.

To TripleO these nodes look as though they were just built and registered, the same way as bare metal nodes; the main difference is the use of a libvirt driver for power management. Finally, the default network settings file will deploy without modification. Customizations are welcome but not needed if a generic set of network settings is acceptable.

2.12.2. Snapshot Deployment Overview

Snapshot deployments use the same opnfv-deploy CLI as standard deployments. The snapshot deployment will use a cache in order to store snapshots that are downloaded from the internet at deploy time. This caching avoids re-downloading the same artifact between deployments. The snapshot deployment recreates the same network and libvirt setup as would have been provisioned by the Standard deployment, with the exception that there is no undercloud VM. The snapshot deployment will give the location of the RC file to use in order to interact with the Overcloud directly from the jump host.

Snapshots come in different topology flavors. One is able to deploy either HA (3 Control, 2 Compute), no-HA (1 Control, 2 Compute), or all-in-one (1 combined Control/Compute). The snapshot deployment itself is always done with the os-odl-nofeature-* scenario.

2.13. Installation Guide - Virtual Deployment

This section goes step-by-step on how to correctly install and provision the OPNFV target system to VM nodes.

2.13.1. Special Requirements for Virtual Deployments

In scenarios where advanced performance options or features are used, such as using huge pages with nova instances, DPDK, or IOMMU, it is required to enable nested KVM support. This allows hardware extensions to be passed to the overcloud VMs, which in turn allows the overcloud compute nodes to bring up KVM guest nova instances rather than QEMU. Enabling it is recommended even in scenarios that do not require it, as it provides a significant performance increase.

During deployment the Apex installer will detect if nested KVM is enabled, and if not, it will attempt to enable it; while printing a warning message if it cannot. Check to make sure before deployment that Nested Virtualization is enabled in BIOS, and that the output of cat /sys/module/kvm_intel/parameters/nested returns “Y”. Also verify using lsmod that the kvm_intel module is loaded for x86_64 machines, and kvm_amd is loaded for AMD64 machines.
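The checks above can be sketched as a small script; it only reports status and is not part of Apex itself (the paths are the standard kvm_intel/kvm_amd module parameter files):

```shell
# Report whether nested KVM appears enabled on this host.
# Intel hosts expose kvm_intel; AMD hosts expose kvm_amd instead.
nested_status="not enabled"
for f in /sys/module/kvm_intel/parameters/nested /sys/module/kvm_amd/parameters/nested; do
    if [ -r "$f" ]; then
        v=$(cat "$f" 2>/dev/null)
        # Depending on kernel version the value is reported as "Y" or "1"
        if [ "$v" = "Y" ] || [ "$v" = "1" ]; then
            nested_status="enabled"
        fi
    fi
done
echo "Nested KVM: $nested_status"
```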

2.13.2. Install Jump Host

Follow the instructions in the Install Bare Metal Jump Host section.

2.13.3. Running opnfv-deploy for Standard Deployment

You are now ready to deploy OPNFV! opnfv-deploy has virtual deployment capability that includes all of the configuration necessary to deploy OPNFV with no modifications.

If no modifications are made to the included configurations the target environment will deploy with the following architecture:

  • 1 undercloud VM
  • The option of 3 control and 2 or more compute VMs (HA Deploy / default) or 1 control and 0 or more compute VMs (Non-HA deploy)
  • 1-5 networks: provisioning, private tenant networking, external, storage and internal API. The API, storage and tenant networking networks can be collapsed onto the provisioning network.

Follow the steps below to execute:

  1. Execute: sudo opnfv-deploy -v [ --virtual-computes n ] [ --virtual-cpus n ] [ --virtual-ram n ] -n network_settings.yaml -d deploy_settings.yaml. Note that it can also be useful to run the command with the --debug argument, which enables a root login on the overcloud nodes with the password ‘opnfvapex’. It is also useful in some cases to wrap the deploy command with nohup, for example nohup <deploy command> &, which allows a deployment to continue even if ssh access to the Jump Host is lost during deployment. By specifying --virtual-computes 0, the deployment will proceed as all-in-one.
  2. It will take approximately 45 minutes to an hour to stand up undercloud, define the target virtual machines, configure the deployment and execute the deployment. You will notice different outputs in your shell.
  3. When the deployment is complete the IP for the undercloud and a url for the OpenStack dashboard will be displayed
2.13.4. Running opnfv-deploy for Snapshot Deployment

Deploying snapshots requires enough disk space to cache snapshot archives, as well as store VM disk images per deployment. The snapshot cache directory can be configured at deploy time. Ensure a directory is used on a partition with enough space for about 20GB. Additionally, Apex will attempt to detect the default libvirt storage pool on the jump host. This is typically ‘/var/lib/libvirt/images’. On default CentOS installations, this path will resolve to the /root partition, which is only around 50GB. Therefore, ensure that the path for the default storage pool has enough space to hold the VM backing storage (approx 4GB per VM). Note, each Overcloud VM disk size is set to 40GB, however Libvirt grows these disks dynamically. Due to this only 4GB will show up at initial deployment, but the disk may grow from there up to 40GB.

The new arguments to deploy snapshots include:

  • --snapshot: Enables snapshot deployments
  • --snap-cache: Indicates the directory to use for caching artifacts

An example deployment command is:
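(A representative invocation assembled from the arguments described in this section; the scenario file, cache path and flag values are placeholders, not a verified command line.)

```shell
# Hypothetical snapshot deployment: all flag values are illustrative
sudo opnfv-deploy --snapshot --snap-cache /home/snap-cache \
  -d /etc/opnfv-apex/os-odl-queens_upstream-noha.yaml \
  --virtual-computes 0 --no-fetch
```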

In the above example, several of the Standard Deployment arguments are still used to deploy snapshots:

  • -d: Deploy settings are used to determine OpenStack version of snapshots to use as well as the topology
  • --virtual-computes - When set to 0, it indicates to Apex to use an all-in-one snapshot
  • --no-fetch - Can be used to disable fetching latest snapshot artifact from upstream and use the latest found in --snap-cache
2.13.5. Verifying the Setup - VMs

To verify the setup you can follow the instructions in the Verifying the Setup section.

2.14. Deploying Directly from Upstream - (Beta)

In addition to deploying with OPNFV tested artifacts included in the opnfv-apex-undercloud and opnfv-apex RPMs, it is now possible to deploy directly from upstream artifacts. Essentially this deployment pulls the latest RDO overcloud and undercloud artifacts at deploy time. This option is useful for being able to deploy newer versions of OpenStack that are not included with this release, and offers some significant advantages for some users. Please note this feature is currently in beta for the Fraser release and will be fully supported in the next OPNFV release.

2.14.1. Upstream Deployment Key Features

In addition to being able to install newer versions of OpenStack, the upstream deployment option allows the use of a newer version of TripleO, which provides overcloud container support. Therefore, when deploying from upstream with an OpenStack version newer than Pike, every OpenStack service (as well as OpenDaylight) will run as a docker container. Furthermore, deploying upstream gives users the flexibility of including any upstream OpenStack patches they may need by simply adding them to the deploy settings file. The patches will be applied live during deployment.

2.15. Installation Guide - Upstream Deployment

This section goes step-by-step on how to correctly install and provision the OPNFV target system using a direct upstream deployment.

2.15.1. Special Requirements for Upstream Deployments

With upstream deployments, internet access is required. In addition, the upstream artifacts will be cached under the root partition of the Jump Host. At least 10GB of free space is required in the root partition in order to download and prepare the cached artifacts.

2.15.2. Scenarios and Deploy Settings for Upstream Deployments

Some deploy settings files are already provided which have been tested by the Apex team. These include (under /etc/opnfv-apex/):

  • os-nosdn-queens_upstream-noha.yaml
  • os-nosdn-master_upstream-noha.yaml
  • os-odl-queens_upstream-noha.yaml
  • os-odl-master_upstream-noha.yaml

Each of these scenarios has been tested by Apex over the Fraser release, but none are guaranteed to work, as upstream is a moving target and this feature is relatively new. Still, it is the goal of the Apex team to provide support and move to upstream-based deployments in the future, so please file a bug when encountering any issues.

2.15.3. Including Upstream Patches with Deployment

With upstream deployments it is possible to include any pending patch in OpenStack gerrit with the deployment. These patches are applicable to either the undercloud or the overcloud. This feature is useful in the case where a developer or user desires to pull in an unmerged patch for testing with a deployment. In order to use this feature, include the following in the deploy settings file, under “global_params” section:

patches:
  undercloud:
    - change-id: <gerrit change id>
      project: openstack/<project name>
      branch:  <branch where commit is proposed>
  overcloud:
    - change-id: <gerrit change id>
      project: openstack/<project name>
      branch:  <branch where commit is proposed>

You may include as many patches as needed. If the patch is already merged or abandoned, then it will not be included in the deployment.

2.15.4. Running opnfv-deploy

Deploying is similar to the typical method used for baremetal and virtual deployments with the addition of a few new arguments to the opnfv-deploy command. In order to use an upstream deployment, please use the --upstream argument. Also, the artifacts for each upstream deployment are only downloaded when a newer version is detected upstream. In order to explicitly disable downloading new artifacts from upstream if previous artifacts are already cached, please use the --no-fetch argument.

2.15.5. Interacting with Containerized Overcloud

Upstream deployments will use a containerized overcloud. These containers are Docker images built by the Kolla project. The containers themselves are run and controlled through Docker as the root user. In order to access logs for each service, examine the /var/log/containers directory or use the docker logs <container name> command. To see a list of services running on the node, use the docker ps command. Each container uses host networking, which means that the networking of the overcloud node behaves exactly as in a traditional deployment. In order to attach to a container, use docker exec -it <container name/id> /bin/bash, which logs you in to the container with a bash shell. Note that, unlike the traditional deployment model, the containers do not use systemd; each service is instead started as the first process in the container. To restart a service, use the docker restart <container> command.
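The commands above, collected for reference; container names vary by service, and “keystone” is used here purely as an example name:

```shell
docker ps                              # list the service containers running on the node
docker logs keystone                   # view logs for one service container
docker exec -it keystone /bin/bash     # attach to a container with a bash shell
docker restart keystone                # restart a service (containers do not use systemd)
```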

2.16. Verifying the Setup

Once the deployment has finished, the OPNFV deployment can be accessed via the undercloud node. From the Jump Host, ssh to the undercloud host and become the stack user. Alternatively, ssh keys have been set up such that the root user on the Jump Host can ssh to the undercloud directly as the stack user. For convenience, a utility script has been provided to look up the undercloud’s IP address and ssh to the undercloud, all in one command. An optional user name can be passed to indicate whether to connect as the stack or root user; stack is the default if no username is specified.

opnfv-util undercloud root
su - stack

Once connected to undercloud as the stack user look for two keystone files that can be used to interact with the undercloud and the overcloud. Source the appropriate RC file to interact with the respective OpenStack deployment.

source stackrc (undercloud)
source overcloudrc (overcloud / OPNFV)

The contents of these files include the credentials for the administrative user for undercloud and OPNFV respectively. At this point both undercloud and OPNFV can be interacted with just as any OpenStack installation can be. Start by listing the nodes in the undercloud that were used to deploy the overcloud.

source stackrc
openstack server list

The control and compute nodes will be listed in the output of this server list command. The IP addresses that are listed are the control plane addresses that were used to provision the nodes. Use these IP addresses to connect to these nodes. Initial authentication requires using the user heat-admin.

ssh heat-admin@192.0.2.7

To begin creating users, images, networks, servers, etc. in OPNFV, source the overcloudrc file, or retrieve the admin user’s credentials from the overcloudrc file and connect to the web Dashboard.

You are now able to follow the OpenStack Verification section.

2.17. OpenStack Verification

Once connected to the OPNFV Dashboard make sure the OPNFV target system is working correctly:

  1. In the left pane, click Compute -> Images, click Create Image.
  2. Insert the name “cirros” and the Image Location http://download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img.
  3. Select format “QCOW2”, select Public, then click Create Image.
  4. Now click Project -> Network -> Networks, click Create Network.
  5. Enter a name “internal”, click Next.
  6. Enter a subnet name “internal_subnet”, and enter Network Address 172.16.1.0/24, click Next.
  7. Now go to Project -> Compute -> Instances, click Launch Instance.
  8. Enter Instance Name “first_instance”, select Instance Boot Source “Boot from image”, and then select Image Name “cirros”.
  9. Click Launch; the status will cycle through a couple of states before becoming “Active”.
  10. Steps 7 through 9 can be repeated to launch more instances.
  11. Once an instance becomes “Active” its IP address will display on the Instances page.
  12. Click the name of an instance, then the “Console” tab and login as “cirros”/”cubswin:)”
  13. To verify storage is working, click Project -> Compute -> Volumes, Create Volume
  14. Give the volume a name and a size of 1 GB
  15. Once the volume becomes “Available” click the dropdown arrow and attach it to an instance.

Congratulations, you have successfully installed OPNFV!

2.18. Developer Guide and Troubleshooting

This section aims to explain in more detail the steps that Apex follows to make a deployment. It also tries to explain possible issues you might find in the process of building or deploying an environment.

After installing the Apex RPMs in the Jump Host, some files will be located around the system.

  1. /etc/opnfv-apex: this directory contains a number of scenarios to be deployed with different characteristics such as HA (High Availability), SDN controller integration (OpenDaylight/ONOS), BGPVPN, FDIO, etc. Having a look at any of these files will give you an idea of how to build a customized scenario by setting different flags.
  2. /usr/bin/: it contains the binaries for the commands opnfv-deploy, opnfv-clean and opnfv-util.
  3. /usr/share/opnfv/: contains Ansible playbooks and other non-python based configuration and libraries.
  4. /var/opt/opnfv/: contains disk images for Undercloud and Overcloud
2.18.1. Utilization of Images

As mentioned earlier in this guide, the Undercloud VM will be in charge of deploying OPNFV (Overcloud VMs). Since the Undercloud is an all-in-one OpenStack deployment, it will use Glance to manage the images that will be deployed as the Overcloud.

Any customization done to the images located on the jumpserver (/var/opt/opnfv/images) will be uploaded to the undercloud and, consequently, to the overcloud.

Make sure the customization is performed on the right image. For example, if you virt-customize the image overcloud-full-opendaylight.qcow2, but then deploy OPNFV with the following command:

sudo opnfv-deploy -n network_settings.yaml -d /etc/opnfv-apex/os-onos-nofeature-ha.yaml

it will not have any effect on the deployment, since the customized image is the OpenDaylight one, while the scenario indicates that the image to be deployed is overcloud-full-onos.qcow2.

2.18.2. Post-deployment Configuration

Post-deployment scripts will perform some configuration tasks such as ssh-key injection, network configuration, NAT, and Open vSwitch creation. They also take care of some OpenStack tasks such as the creation of endpoints, external networks, users, projects, etc.

If any of these steps fail, the execution will be interrupted. In some cases, the interruption occurs at very early stages, so a new deployment must be executed. In other cases, however, it can be worth trying to debug the failure.

  1. There is no external connectivity from the overcloud nodes:

    Post-deployment scripts will configure the routing, nameservers and a number of other things between the overcloud and the undercloud. If local connectivity, such as pinging between the different nodes, is working fine, the script most likely failed when configuring the NAT via iptables. The main rules to enable external connectivity would look like these:

    iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
    iptables -t nat -A POSTROUTING -s ${external_cidr} -o eth0 -j MASQUERADE
    iptables -A FORWARD -i eth2 -j ACCEPT
    iptables -A FORWARD -s ${external_cidr} -m state --state ESTABLISHED,RELATED -j ACCEPT
    service iptables save

    These rules must be executed as root (or sudo) in the undercloud machine.

2.18.3. OpenDaylight Integration

When a user deploys a scenario that starts with os-odl*:

OpenDaylight (ODL) SDN controller will be deployed and integrated with OpenStack. ODL will run as a systemd service, and can be managed as a regular service:

systemctl start/restart/stop opendaylight.service

This command must be executed as root on the controller node of the overcloud, where OpenDaylight is running. ODL files are located in /opt/opendaylight. ODL uses Karaf as a Java container management system that allows users to install new features, check logs and configure many things. In order to connect to Karaf’s console, use the following command:

opnfv-util opendaylight

This command is very easy to use, but in case it fails to connect to Karaf, this is the command it executes underneath:

ssh -p 8101 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no karaf@localhost

Use localhost when the command is executed on the overcloud controller; use the controller’s public IP to connect from elsewhere.

2.18.4. Debugging Failures

This section will try to gather different type of failures, the root cause and some possible solutions or workarounds to get the process continued.

  1. The output log shows post-deployment error messages:

    Heat resources will apply puppet manifests during this phase. If one of these processes fails, you can inspect the error and then re-run puppet to apply that manifest. Log into the controller (see the verification section for that) and, as root, check /var/log/messages. Search for the error you have encountered and see if you can fix it. In order to re-run the puppet manifest, search for “puppet apply” in that same log. You will have to run the last “puppet apply” before the error. It should look like this:

    FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/5b4c7a01-0d63-4a71-81e9-d5ee6f0a1f2f" \
    FACTER_fqdn="overcloud-controller-0.localdomain.com" \
    FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step4" \
    puppet apply --detailed-exitcodes -l syslog -l console \
    /var/lib/heat-config/heat-config-puppet/5b4c7a01-0d63-4a71-81e9-d5ee6f0a1f2f.pp

    As a comment, Heat will trigger the puppet run via os-apply-config and will pass a different value for step each time. There are a total of five steps. Some of these steps will not be executed, depending on the type of scenario that is being deployed.

2.18.5. Reporting a Bug

Please report bugs via the OPNFV Apex JIRA page. You may now use the log collecting utility provided by Apex in order to gather all of the logs from the overcloud after a deployment failure. To do this please use the opnfv-pyutil --fetch-logs command. The log file location will be displayed at the end of executing the script. Please attach this log to the JIRA Bug.

2.19. Frequently Asked Questions
2.20. License

All Apex and “common” entities are protected by the Apache 2.0 License.

2.21. References
2.21.1. OPNFV

OPNFV Home Page

OPNFV Apex project page

OPNFV Apex Release Notes

2.21.3. OpenDaylight

Upstream OpenDaylight provides a number of packaging and deployment options meant for consumption by downstream projects like OPNFV.

Currently, OPNFV Apex uses OpenDaylight’s Puppet module, which in turn depends on OpenDaylight’s RPM.

2.21.4. RDO Project

RDO Project website

Authors:Tim Rozet (trozet@redhat.com)
Authors:Dan Radez (dradez@redhat.com)
Version:6.0
2.22. Indices and tables

Armband

  • Armband Overview
  • Installation Guide
  • User Guide

Availability

1. High Availability Requirement Analysis and Detailed Mechanism Design
OPNFV
1.1. High Availability Requirement Analysis in OPNFV
1.1.1. 1 Introduction

This High Availability Requirement Analysis Document is used for eliciting the High Availability requirements of OPNFV. The document refines high-level High Availability goals into detailed HA mechanism designs. These HA mechanisms are related to potential failures on different layers in OPNFV. Moreover, this document can be used as a reference for HA testing scenario design. A requirement engineering model, KAOS, is used in this document.

1.1.2. 2 Terminologies and Symbols

The following concepts in KAOS will be used in the diagrams of this document.

  • Goal: The objective to be met by the target system.
  • Obstacle: Condition whose satisfaction may prevent some goals from being achieved.
  • Agent: Active Object performing operations to achieve goals.
  • Requirement: Goal assigned to an agent of the software being studied.
  • Domain Property: Descriptive assertion about objects in the environment of the software.
  • Refinement: Relationship linking a goal to other goals that are called its subgoals. Each subgoal contributes to the satisfaction of the goal it refines. There are two types of refinements: AND refinement and OR refinement, meaning the goal can be achieved by satisfying all of its subgoals or any one of its subgoals, respectively.
  • Conflict: Relationship linking an obstacle to a goal if the obstacle obstructs the goal from being satisfied.
  • Resolution: Relationship linking a goal to an obstacle if the goal can resolve the obstacle.
  • Responsibility: Relationship between an agent and a requirement. Holds when an agent is assigned the responsibility of achieving the linked requirement.

Figure 1 shows how these concepts are displayed in a KAOS diagram.

KAOS Sample

Fig 1. A KAOS Sample Diagram

1.1.3. 3 High Availability Goals of OPNFV
1.1.3.1. 3.1 Overall Goals

The final goal of OPNFV High Availability is to provide highly available VNF services, and the following objectives are required to be met:

  • There should be no single point of failure in the NFV framework.
  • All resiliency mechanisms shall be designed for a multi-vendor environment, where for example the NFVI, NFV-MANO, and VNFs may be supplied by different vendors.
  • Resiliency related information shall always be explicitly specified and communicated using the reference interfaces (including policies/templates) of the NFV framework.
1.1.3.2. 3.2 Service Level Agreements of OPNFV HA

Service Level Agreements of OPNFV HA are mainly focused on time constraints for service outage, failure detection and failure recovery. The following table outlines the SLA metrics of the different service availability levels described in ETSI GS NFV-REL 001 V1.1.1 (2015-01). Table 1 shows the time constraints of the different Service Availability Levels. In this document, SAL1 is the default benchmark that must be met.

Table 1. Time Constraints for Different Service Availability Levels

Service Availability Level   Failure Detection Time   Failure Recovery Time
SAL1                         <1s                      5-6s
SAL2                         <5s                      10-15s
SAL3                         <10s                     20-25s
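Reading the table's figures as upper bounds, a deployment's measured times can be checked against a level with a few lines (a sketch; the meets_sal helper and its exact boundary handling are our assumptions, only the numbers come from Table 1):

```python
# Upper-bound time constraints per Service Availability Level, in seconds:
# (max failure detection time, max failure recovery time), taken from Table 1.
SAL_LIMITS = {
    "SAL1": (1.0, 6.0),
    "SAL2": (5.0, 15.0),
    "SAL3": (10.0, 25.0),
}

def meets_sal(level, detection_s, recovery_s):
    """Check measured detection/recovery times against a SAL's limits."""
    max_detect, max_recover = SAL_LIMITS[level]
    return detection_s < max_detect and recovery_s <= max_recover

# SAL1 is the default benchmark in this document:
print(meets_sal("SAL1", detection_s=0.8, recovery_s=5.5))  # True
print(meets_sal("SAL1", detection_s=2.0, recovery_s=5.5))  # False: detected too late
```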
1.1.4. 4 Overall Analysis

Figure 2 shows the overall decomposition of the high availability goals. The high availability of VNF services can be refined into the high availability of the VNFs, of MANO, and of the NFVI where the VNFs are deployed; the high availability of the NFVI service can be refined into the high availability of virtual compute instances, virtual storage, and virtual network services; the high availability of a virtual instance is either container HA or VM HA, and these goals can be further decomposed according to how the NFV environment is deployed.

Overall HA Analysis of OPNFV

Fig 2. Overall HA Analysis of OPNFV

Thus the high availability requirement of VNF services can be classified into high availability requirements on different layers in OPNFV. The following layers are mainly discussed in this document:

  • VNF HA
  • MANO HA
  • Virtual Infrastructure HA (container HA or VM HA)
  • VIM HA
  • SDN HA
  • Hypervisor HA
  • Host OS HA
  • Hardware HA

The next section presents a detailed analysis of the HA requirements on these layers.

1.1.5. 5 Detailed Analysis
1.1.5.1. 5.1 VNF HA
1.1.5.2. 5.2 MANO HA
1.1.5.3. 5.3 Virtual Infrastructure HA
1.1.5.4. 5.4 VIM HA

The VIM in the NFV reference architecture contains different components of OpenStack, SDN controllers and other virtual resource controllers. VIM components can be classified into three types:

  • Entry Point Components: Components that expose VIM service interfaces to users, such as nova-api and neutron-server.
  • Middlewares: Components that provide load balancing, messaging queues, cluster management services, etc.
  • Subcomponents: Components that implement VIM functions; they are called by Entry Point Components rather than by users directly.

Table 2 shows the potential faults that may happen on the VIM layer. Currently the main focus of VIM HA is the service crash of VIM components, which may occur in all types of VIM components. To prevent VIM services from becoming unavailable, Active/Active Redundancy, Active/Passive Redundancy and the Message Queue are used for different types of VIM components, as shown in Figure 3.

Table 2. Potential Faults in VIM level

Service   Fault           Description                                    Severity
General   Service Crash   The process of a service crashes abnormally.   Critical
VIM HA Analysis

Fig 3. VIM HA Analysis

1.1.5.4.1. Active/Active Redundancy

Active/Active Redundancy runs both the main and redundant systems concurrently. If a failure occurs on a component, the backups are already online, and users are unlikely to notice that the failed VIM component is being repaired. A typical Active/Active deployment has redundant instances that are load balanced via a virtual IP address and a load balancer such as HAProxy.

When one of the redundant VIM component instances fails, the load balancer should detect the instance failure and isolate the failed instance from being called until it is recovered. The requirement decomposition of Active/Active Redundancy is shown in Figure 4.

Active/Active Redundancy Requirement Decomposition

Fig 4. Active/Active Redundancy Requirement Decomposition

The following requirements are elicited for VIM Active/Active Redundancy:

[Req 5.4.1] Redundant VIM components should be load balanced by a load balancer.

[Req 5.4.2] The load balancer should check the health status of VIM component instances.

[Req 5.4.3] The load balancer should isolate the failed VIM component instance until it is recovered.

[Req 5.4.4] The alarm information of VIM component failure should be reported.

[Req 5.4.5] Failed VIM component instances should be recovered by a cluster manager.
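The health-check and isolation behaviour captured by Req 5.4.1 through 5.4.3 can be sketched as follows (illustrative logic only; a real deployment delegates this to HAProxy and a virtual IP, and all names here are hypothetical):

```python
import random

class ActiveActivePool:
    """Sketch of a load balancer pool: health-check, isolate, reinstate."""

    def __init__(self, backends):
        # backend name -> healthy flag
        self.backends = {b: True for b in backends}

    def health_check(self, probe):
        """Probe every backend; isolate the ones that fail (Req 5.4.2/5.4.3)
        and reinstate the ones that have recovered."""
        for name in self.backends:
            self.backends[name] = probe(name)

    def dispatch(self):
        """Load-balance only across healthy instances (Req 5.4.1)."""
        healthy = [b for b, ok in self.backends.items() if ok]
        if not healthy:
            raise RuntimeError("no healthy VIM component instance available")
        return random.choice(healthy)

pool = ActiveActivePool(["nova-api-1", "nova-api-2", "nova-api-3"])
pool.health_check(lambda name: name != "nova-api-2")  # nova-api-2 has failed
assert pool.dispatch() != "nova-api-2"                # failed instance is isolated
```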

Table 3 shows the current VIM components using Active/Active Redundancy and the corresponding HA test cases to verify them.

Table 3. VIM Components using Active/Active Redundancy

Component Description Related HA Test Case
nova-api endpoint component of OpenStack Compute Service Nova yardstick_tc019
nova-novncproxy server daemon that serves the Nova noVNC Websocket Proxy service, which provides a websocket proxy that is compatible with OpenStack Nova noVNC consoles.
neutron-server endpoint component of OpenStack Networking Service Neutron yardstick_tc045
keystone component of OpenStack Identity Service Keystone yardstick_tc046
glance-api endpoint component of OpenStack Image Service Glance yardstick_tc047
glance-registry server daemon that serves image metadata through a REST-like API.
cinder-api endpoint component of OpenStack Block Storage Service Cinder yardstick_tc048
swift-proxy endpoint component of OpenStack Object Storage Swift yardstick_tc049
horizon component of OpenStack Dashboard Service Horizon
heat-api endpoint component of OpenStack Orchestration Service Heat
mysqld database service of VIM components
1.1.5.4.2. Active/Passive Redundancy

Active/Passive Redundancy maintains replacement resources that can be brought online when the active service fails. Requests are handled using a virtual IP address (VIP) that facilitates returning to service with minimal reconfiguration. A cluster manager (such as Pacemaker or Corosync) monitors these components, bringing the backup online as necessary.

When the main instance of a VIM component fails, the cluster manager should detect the failure and bring the backup instance online. The failed instance should also be recovered to serve as a new backup. The requirement decomposition of Active/Passive Redundancy is shown in Figure 5.

Active/Passive Redundancy Requirement Decomposition

Fig 5. Active/Passive Redundancy Requirement Decomposition

The following requirements are elicited for VIM Active/Passive Redundancy:

[Req 5.4.6] The cluster manager should replace the failed main VIM component instance with a backup instance.

[Req 5.4.7] The cluster manager should check the health status of VIM component instances.

[Req 5.4.8] Failed VIM component instances should be recovered by the cluster manager.

[Req 5.4.9] The alarm information of VIM component failure should be reported.
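Reqs 5.4.6 through 5.4.9 can be sketched as a toy cluster manager (illustrative only; real deployments rely on Pacemaker/Corosync, and the class and method names here are ours):

```python
class ActivePassiveCluster:
    """Sketch of cluster-manager failover for an Active/Passive pair."""

    def __init__(self, active, standbys):
        self.active = active
        self.standbys = list(standbys)
        self.alarms = []

    def monitor(self, is_healthy):
        """Health-check the active instance (Req 5.4.7); on failure, promote
        a standby (Req 5.4.6) and report an alarm (Req 5.4.9)."""
        if is_healthy(self.active):
            return
        failed = self.active
        self.alarms.append(f"{failed} failed")
        if not self.standbys:
            raise RuntimeError("no backup instance available")
        self.active = self.standbys.pop(0)   # bring the backup online
        # Req 5.4.8: the failed instance is recovered as a new standby.
        self.standbys.append(failed)

cluster = ActivePassiveCluster("rabbitmq-1", ["rabbitmq-2"])
cluster.monitor(lambda node: node != "rabbitmq-1")  # active node has failed
assert cluster.active == "rabbitmq-2"
assert cluster.alarms == ["rabbitmq-1 failed"]
```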

Table 4 shows the current VIM components using Active/Passive Redundancy and the corresponding HA test cases to verify them.

Table 4. VIM Components using Active/Passive Redundancy

Component Description Related HA Test Case
haproxy load balancer component of VIM components yardstick_tc053
rabbitmq-server messaging queue service of VIM components yardstick_tc056
corosync cluster management component of VIM components yardstick_tc057
1.1.5.4.3. Message Queue

Message Queue provides an asynchronous communication protocol. In OpenStack, some projects (like Nova and Cinder) use a Message Queue to call their subcomponents. Although a Message Queue is not an HA mechanism in itself, the way it works ensures high availability when redundant components subscribe to it. When a VIM subcomponent fails, requests can still be processed because other redundant components are subscribed to the Message Queue. Fault isolation is also achieved, since failed components no longer fetch requests. In addition, failed components must be recovered. Figure 6 shows the requirement decomposition of the Message Queue mechanism.

Message Queue Requirement Decomposition

Fig 6. Message Queue Redundancy Requirement Decomposition

The following requirements are elicited for Message Queue:

[Req 5.4.10] Redundant component instances should subscribe to the Message Queue, which is implemented by the installer.

[Req 5.4.11] Failed VIM component instances should be recovered by the cluster manager.

[Req 5.4.12] The alarm information of VIM component failure should be reported.
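How redundant subscribers make the queue fault-tolerant can be sketched with a toy queue (illustrative only; OpenStack actually uses RabbitMQ via oslo.messaging, and all names here are hypothetical):

```python
from queue import Queue

# Pending VIM requests waiting on the message queue.
requests = Queue()
for i in range(4):
    requests.put(f"boot-instance-{i}")

def drain(worker_alive):
    """Alive workers pull requests in turn; a failed worker never fetches,
    so its failure is isolated automatically."""
    workers = [name for name, alive in worker_alive if alive]
    handled = {name: [] for name in workers}
    i = 0
    while not requests.empty():
        name = workers[i % len(workers)]
        handled[name].append(requests.get())
        i += 1
    return handled

# nova-compute-2 has crashed: it simply stops subscribing to the queue.
result = drain([("nova-compute-1", True), ("nova-compute-2", False),
                ("nova-compute-3", True)])
assert sum(len(v) for v in result.values()) == 4  # all requests still served
assert "nova-compute-2" not in result             # failed worker isolated
```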

Table 5 shows the current VIM components using Message Queue and the corresponding HA test cases to verify them.

Table 5. VIM Components using Messaging Queue

Component Description Related HA Test Case
nova-scheduler OpenStack compute component that determines how to dispatch compute requests
nova-cert OpenStack compute component that serves the Nova Cert service for X509 certificates. Used to generate certificates for euca-bundle-image.
nova-conductor server daemon that serves the Nova Conductor service, which provides coordination and database query support for Nova.
nova-compute Handles all processes relating to instances (guest VMs). nova-compute is responsible for building a disk image, launching it via the underlying virtualization driver, responding to calls to check its state, attaching persistent storage, and terminating it.
nova-consoleauth OpenStack compute component for authentication of Nova consoles.
cinder-scheduler OpenStack volume storage component that decides on placement for newly created volumes and forwards the request to cinder-volume.
cinder-volume OpenStack volume storage component that receives volume management requests from cinder-api and cinder-scheduler, and routes them to storage backends using vendor-supplied drivers.
heat-engine OpenStack Heat server with an internal RPC API called by the heat-api server.
1.1.5.5. 5.5 Hypervisor HA
1.1.5.6. 5.6 Host OS HA
1.1.5.7. 5.7 Hardware HA

Barometer

OPNFV Barometer Requirements
Problem Statement

Providing carrier-grade Service Assurance is critical in the network transformation to a software-defined and virtualized network (NFV). Medium- and large-scale cloud environments comprise hundreds to hundreds of thousands of infrastructure systems. It is vital to monitor systems for malfunctions that could lead to disruption of users' application services, and to react promptly to these fault events to improve overall system performance. As the size of infrastructure and virtual resources grows, so does the effort of monitoring back-ends. SFQM aims to expose as much useful information as possible from the platform so that faults and errors in the NFVI can be detected promptly and reported to the appropriate fault management entity.

The OPNFV platform (NFVI) requires functionality to:

  • Create a low latency, high performance packet processing path (fast path) through the NFVI that VNFs can take advantage of;
  • Measure Telco Traffic and Performance KPIs through that fast path;
  • Detect and report violations that can be consumed by VNFs and higher level EMS/OSS systems

Examples of locally measurable QoS factors for Traffic Monitoring, which impact both Quality of Experience and five-nines availability, include (using the Metro Ethernet Forum Guidelines as a reference):

  • Packet loss
  • Packet Delay Variation
  • Uni-directional frame delay

Other KPIs such as Call drops, Call Setup Success Rate, Call Setup time etc. are measured by the VNF.

In addition to Traffic Monitoring, the NFVI must also support Performance Monitoring of the physical interfaces themselves (e.g. NICs), i.e. an ability to monitor and trace errors on the physical interfaces and report them.

All these traffic statistics for Traffic and Performance Monitoring must be measured in-service and must be capable of being reported by standard Telco mechanisms (e.g. SNMP traps), for potential enforcement actions.

Barometer updated scope

The scope of the project is to provide interfaces to support monitoring of the NFVI. The project will develop plugins for telemetry frameworks to enable the collection of platform stats and events and relay gathered information to fault management applications or the VIM. The scope is limited to collecting/gathering the events and stats and relaying them to a relevant endpoint. The project will not enforce or take any actions based on the gathered information.

Scope of SFQM

NOTE: The SFQM project has been replaced by Barometer. The output of the project will provide interfaces and functions to support monitoring of Packet Latency and Network Interfaces while the VNF is in service.

The DPDK interface/API will be updated to support:

  • Exposure of NIC MAC/PHY Level Counters
  • Interface for Time stamp on RX
  • Interface for Time stamp on TX
  • Exposure of DPDK events

collectd will be updated to support the exposure of DPDK metrics and events.

Specific testing and integration will be carried out to cover:

  • Unit/Integration Test plans: A sample application provided to demonstrate packet latency monitoring and interface monitoring

The following list of features and functionality will be developed:

  • DPDK APIs and functions for latency and interface monitoring
  • A sample application to demonstrate usage
  • collectd plugins

The scope of the project involves developing the relevant DPDK APIs, OVS APIs, sample applications, as well as the utilities in collectd to export all the relevant information to a telemetry and events consumer.

VNF specific processing, Traffic Monitoring, Performance Monitoring and Management Agent are out of scope.

The Proposed Interface counters include:

  • Packet RX
  • Packet TX
  • Packet loss
  • Interface errors + other stats

The Proposed Packet Latency Monitor includes:

  • Cycle accurate stamping on ingress
  • Supports latency measurements on egress

Support for failover of DPDK enabled cores is also out of scope of the current proposal. However, this is an important requirement and must-have functionality for any DPDK enabled framework in the NFVI. To that end, a second phase of this project will be to implement DPDK Keep Alive functionality that would address this and would report to a VNF-level Failover and High Availability mechanism that would then determine what actions, including failover, may be triggered.

Consumption Models

In reality many VNFs will have an existing performance or traffic monitoring utility used to monitor VNF behavior and report statistics, counters, etc.

The consumption of performance and traffic related information/events provided by this project should be a logical extension of any existing VNF/NFVI monitoring framework. It should not require a new framework to be developed. We do not see the Barometer-gathered metrics and events as a major additional effort for monitoring frameworks to consume; this project aims to be sympathetic to existing monitoring frameworks. The intention is that this project represents an interface for NFVI monitoring to be used by higher level fault management entities (see below).

Allowing the Barometer metrics and events to be handled within existing telemetry frameworks makes it simpler for overall interfacing with higher level management components in the VIM, MANO and OSS/BSS. The Barometer proposal would be complementary to the Doctor project, which addresses NFVI Fault Management support in the VIM, and the VES project, which addresses the integration of VNF telemetry-related data into automated VNF management systems. To that end, the project committers and contributors for the Barometer project wish to collaborate with the Doctor and VES projects to facilitate this.

collectd

collectd is a daemon which collects system performance statistics periodically and provides a variety of mechanisms to publish the collected metrics. It supports more than 90 different input and output plugins. Input plugins retrieve metrics and publish them to the collectd daemon, while output plugins publish the data they receive to an endpoint. collectd also has infrastructure to support thresholding and notification.

collectd statistics and Notifications

Within collectd notifications and performance data are dispatched in the same way. There are producer plugins (plugins that create notifications/metrics), and consumer plugins (plugins that receive notifications/metrics and do something with them).

Statistics in collectd consist of a value list. A value list includes:

  • Values, can be one of:
    • Derive: used for values where the change since the value was last read is of interest. Can be used to calculate and store a rate.
    • Counter: similar to derive values, but take the possibility of a counter wrap around into consideration.
    • Gauge: used for values that are stored as is.
    • Absolute: used for counters that are reset after reading.
  • Value length: the number of values in the data set.
  • Time: timestamp at which the value was collected.
  • Interval: interval at which to expect a new value.
  • Host: used to identify the host.
  • Plugin: used to identify the plugin.
  • Plugin instance (optional): used to group a set of values together, e.g. values belonging to a DPDK interface.
  • Type: unit used to measure a value. In other words used to refer to a data set.
  • Type instance (optional): used to distinguish between values that have an identical type.
  • meta data: an opaque data structure that enables the passing of additional information about a value list. “Meta data in the global cache can be used to store arbitrary information about an identifier” [7].

Host, plugin, plugin instance, type and type instance uniquely identify a collectd value.
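The practical difference between DERIVE and COUNTER values is wrap-around handling when a rate is computed. A sketch, assuming a 32-bit counter (collectd's real logic also handles 64-bit wraps; the function names are ours):

```python
WRAP_32 = 2**32  # assume the COUNTER wraps at 32 bits

def derive_rate(prev, cur, interval):
    """DERIVE: rate from the raw difference (may go negative on a reset)."""
    return (cur - prev) / interval

def counter_rate(prev, cur, interval):
    """COUNTER: like DERIVE, but a smaller current value is treated as a
    wrap-around rather than a reset."""
    diff = cur - prev if cur >= prev else (WRAP_32 - prev) + cur
    return diff / interval

print(derive_rate(1000, 1500, 10))           # 50.0 values/s
print(counter_rate(WRAP_32 - 100, 400, 10))  # 50.0 values/s despite the wrap
```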

Values lists are often accompanied by data sets that describe the values in more detail. Data sets consist of:

  • A type: a name which uniquely identifies a data set.
  • One or more data sources (entries in a data set) which include:
    • The name of the data source. If there is only a single data source this is set to “value”.
    • The type of the data source, one of: counter, gauge, absolute or derive.
    • A min and a max value.

Types in collectd are defined in types.db. Examples of types in types.db:

bitrate    value:GAUGE:0:4294967295
counter    value:COUNTER:U:U
if_octets  rx:COUNTER:0:4294967295, tx:COUNTER:0:4294967295

In the example above if_octets has two data sources: tx and rx.
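The types.db line format (a type name followed by comma-separated name:TYPE:min:max data sources, with U meaning unspecified) can be parsed in a few lines (a sketch only; collectd's own parser is written in C):

```python
def parse_types_db_line(line):
    """Parse one types.db entry into (type_name, [data sources])."""
    type_name, sources = line.split(None, 1)
    parsed = []
    for src in sources.split(","):
        name, ds_type, ds_min, ds_max = src.strip().split(":")
        # "U" marks an unspecified min/max bound in types.db.
        parsed.append({
            "name": name,
            "type": ds_type,
            "min": None if ds_min == "U" else float(ds_min),
            "max": None if ds_max == "U" else float(ds_max),
        })
    return type_name, parsed

name, sources = parse_types_db_line(
    "if_octets  rx:COUNTER:0:4294967295, tx:COUNTER:0:4294967295")
assert name == "if_octets"
assert [s["name"] for s in sources] == ["rx", "tx"]
```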

Notifications in collectd are generic messages containing:

  • An associated severity, which can be one of OKAY, WARNING, and FAILURE.
  • A time.
  • A message.
  • A host.
  • A plugin.
  • A plugin instance (optional).
  • A type.
  • A type instance (optional).
  • Meta-data.
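A consumer plugin typically filters notifications on the severity field; a sketch using the fields listed above (the helper names are ours, this is not collectd's C API):

```python
import time

SEVERITIES = ("OKAY", "WARNING", "FAILURE")  # ordered least to most severe

def make_notification(severity, message, host, plugin,
                      plugin_instance=None, type_=None, type_instance=None,
                      meta=None):
    """Build a notification carrying the fields listed above (sketch only)."""
    assert severity in SEVERITIES
    return {"severity": severity, "time": time.time(), "message": message,
            "host": host, "plugin": plugin, "plugin_instance": plugin_instance,
            "type": type_, "type_instance": type_instance, "meta": meta or {}}

def should_alert(notification, threshold="WARNING"):
    """Consumer-side filter: forward only at or above the threshold."""
    return SEVERITIES.index(notification["severity"]) >= SEVERITIES.index(threshold)

n = make_notification("FAILURE", "uncorrected memory error",
                      host="compute0", plugin="mcelog")
assert should_alert(n)
assert not should_alert(make_notification("OKAY", "ok", "compute0", "mcelog"))
```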
DPDK Enhancements

This section will discuss the Barometer features that were integrated with DPDK.

Measuring Telco Traffic and Performance KPIs

This section will discuss the Barometer features that enable Measuring Telco Traffic and Performance KPIs.

_images/stats_and_timestamps.png

Measuring Telco Traffic and Performance KPIs

  • The very first thing Barometer enabled was a call-back API in DPDK and an associated application that used the API to demonstrate how to timestamp packets and measure packet latency in DPDK (the sample app is called rxtx_callbacks). This was upstreamed to DPDK 2.0 and is represented by the interfaces 1 and 2 in Figure 1.2.
  • The second thing Barometer implemented in DPDK is the extended NIC statistics API, which exposes NIC stats including error stats to the DPDK user by reading the registers on the NIC. This is represented by interface 3 in Figure 1.2.
    • For DPDK 2.1 this API was only implemented for the ixgbe (10Gb) NIC driver, in association with a sample application that runs as a DPDK secondary process and retrieves the extended NIC stats.
    • For DPDK 2.2 the API was implemented for igb, i40e and all the Virtual Functions (VFs) for all drivers.
    • For DPDK 16.07 the API migrated from using string value pairs to using id value pairs, improving the overall performance of the API.
Monitoring DPDK interfaces

With the features Barometer enabled in DPDK for measuring Telco traffic and performance KPIs, NIC statistics, including error stats, can now be retrieved and relayed to a DPDK user. The next step is to monitor the DPDK interfaces based on the stats retrieved from the NICs, by relaying the information to a higher-level fault management entity. To enable this, Barometer has been enabling a number of plugins for collectd.

DPDK Keep Alive description

SFQM aims to enable fault detection within DPDK; the first feature to meet this goal is the DPDK Keep Alive sample app that is part of DPDK 2.2.

DPDK Keep Alive (KA) is a sample application that acts as a heartbeat/watchdog for DPDK packet processing cores, to detect application thread failure. The application supports the detection of ‘failed’ DPDK cores and notification to an HA/SA middleware. The purpose is to detect Packet Processing Core failures (e.g. an infinite loop) and ensure that the failure of a core does not result in a fault that is undetectable by a management entity.

_images/dpdk_ka.png

DPDK Keep Alive Sample Application

Essentially the app demonstrates how to detect ‘silent outages’ on DPDK packet processing cores. The application can be decomposed into two specific parts: detection and notification.

  • The detection period is programmable/configurable but defaults to 5ms if no timeout is specified.
  • Notification support is enabled through a hook (callback) function, which a fault management application with a compliant heartbeat mechanism can register.
DPDK Keep Alive Sample App Internals

This section provides some explanation of the Keep-Alive/’Liveliness’ conceptual scheme as well as the DPDK Keep Alive app. The initialization and run-time paths are very similar to those of the L2 forwarding application (see L2 Forwarding Sample Application (in Real and Virtualized Environments) for more information).

There are two types of cores: a Keep Alive Monitor Agent Core (master DPDK core) and Worker cores (Tx/Rx/Forwarding cores). The Keep Alive Monitor Agent Core will supervise worker cores and report any failure (2 successive missed pings). The Keep-Alive/’Liveliness’ conceptual scheme is:

  • DPDK worker cores mark their liveliness as they forward traffic.
  • A Keep Alive Monitor Agent Core runs a function every N Milliseconds to inspect worker core liveliness.
  • If the keep-alive agent detects a time-out, it notifies the fault management entity through a call-back function.

Note: Only the worker cores’ state is monitored. There is no mechanism or agent to monitor the Keep Alive Monitor Agent Core itself.

DPDK Keep Alive Sample App Code Internals

The following section provides some explanation of the code aspects that are specific to the Keep Alive sample application.

The heartbeat functionality is initialized with a struct rte_heartbeat and the callback function to invoke in the case of a timeout.

rte_global_keepalive_info = rte_keepalive_create(&dead_core, NULL);
if (rte_global_keepalive_info == NULL)
    rte_exit(EXIT_FAILURE, "keepalive_create() failed");

The function that issues the pings, hbeat_dispatch_pings(), is configured to run every check_period milliseconds.

if (rte_timer_reset(&hb_timer,
        (check_period * rte_get_timer_hz()) / 1000,
        PERIODICAL,
        rte_lcore_id(),
        &hbeat_dispatch_pings, rte_global_keepalive_info
        ) != 0 )
    rte_exit(EXIT_FAILURE, "Keepalive setup failure.\n");

The rest of the initialization and run-time path follows the same paths as the L2 forwarding application. The only addition to the main processing loop is the mark-alive functionality and the example random failures.

rte_keepalive_mark_alive(rte_global_keepalive_info);
cur_tsc = rte_rdtsc();

/* Die randomly within 7 secs for demo purposes. */
if (cur_tsc - tsc_initial > tsc_lifetime)
    break;

The rte_keepalive_mark_alive() function simply sets the core state to alive.

static inline void
rte_keepalive_mark_alive(struct rte_heartbeat *keepcfg)
{
    keepcfg->state_flags[rte_lcore_id()] = 1;
}

Keep Alive Monitor Agent Core Monitoring Options

The application can run on either a host or a guest. As such there are a number of options for monitoring the Keep Alive Monitor Agent Core through a Local Agent on the compute node:

Application Location   DPDK KA   LOCAL AGENT
HOST                   X         HOST/GUEST
GUEST                  X         HOST/GUEST

For the first implementation of a Local Agent SFQM will enable:

Application Location   DPDK KA   LOCAL AGENT
HOST                   X         HOST

This will be achieved by extending the dpdkstat plugin for collectd with KA functionality, and integrating the extended plugin with Monasca for high-performing, resilient, and scalable fault detection.

OPNFV Barometer configuration Guide
Barometer Configuration Guide

This document provides guidelines on how to install and configure Barometer with Apex and Compass4nfv. The deployment script installs and enables a series of collectd plugins on the compute node(s), which collect and dispatch specific metrics and events from the platform.

Pre-configuration activities - Apex

Deploying the Barometer components in Apex is done through the deploy-opnfv command by selecting a scenario-file which contains the barometer: true option. These files are located on the Jump Host in the /etc/opnfv-apex/ folder. Two scenarios are pre-defined to include Barometer, and they are: os-nosdn-bar-ha.yaml and os-nosdn-bar-noha.yaml.

$ cd /etc/opnfv-apex
$ opnfv-deploy -d os-nosdn-bar-ha.yaml -n network_settings.yaml -i inventory.yaml --debug
Pre-configuration activities - Compass4nfv

Deploying the Barometer components in Compass4nfv is done by running the deploy.sh script after exporting a scenario-file which contains the barometer: true option. Two scenarios are pre-defined to include Barometer, and they are: os-nosdn-bar-ha.yaml and os-nosdn-bar-noha.yaml. For more information, please refer to these useful links: https://github.com/opnfv/compass4nfv https://wiki.opnfv.org/display/compass4nfv/Compass+101 https://wiki.opnfv.org/display/compass4nfv/Containerized+Compass

The quickest way to deploy using Compass4nfv is given below.

$ export SCENARIO=os-nosdn-bar-ha.yml
$ curl https://raw.githubusercontent.com/opnfv/compass4nfv/master/quickstart.sh | bash
Hardware configuration

There is no specific hardware configuration required. However, the intel_rdt plugin works only on platforms with Intel CPUs.

Feature configuration

All Barometer plugins are automatically deployed on all compute nodes. There is no option to selectively install only a subset of plugins. Any custom disabling or configuration must be done directly on the compute node(s) after the deployment is completed.

Upgrading the plugins

The Barometer components are built-in in the ISO image, and respectively the RPM/Debian packages. There is no simple way to update only the Barometer plugins in an existing deployment.

Barometer post installation procedures

This document describes briefly the methods of validating the Barometer installation.

Automated post installation activities

The Barometer test-suite in Functest is called barometercollectd and is part of the Features tier. These tests are run automatically by the OPNFV deployment pipeline on the supported scenarios. The testing consists of basic verifications that each plugin is functional per its default configuration. Inside the Functest container, the detailed results can be found in /home/opnfv/functest/results/barometercollectd.log.

Barometer post configuration procedures

The functionality for each plugin (such as enabling/disabling and configuring its capabilities) is controlled as described in the User Guide through their individual .conf file located in the /etc/collectd/collectd.conf.d/ folder on the compute node(s). In order for any changes to take effect, the collectd service must be stopped and then started again.

Platform components validation - Apex

The following steps describe how to perform a simple “manual” testing of the Barometer components:

On the controller:

  1. Get a list of the available metrics:

    $ openstack metric list
    
  2. Take note of the ID of the metric of interest, and show the measures of this metric:

    $ openstack metric measures show <metric_id>
    
  3. Watch the measure list for updates to verify that metrics are being added:

    $ watch -n2 -d openstack metric measures show <metric_id>
    

More on testing and displaying metrics is shown below.

On the compute:

  1. Connect to any compute node and ensure that the collectd service is running. The log file collectd.log should contain no errors and should indicate that each plugin was successfully loaded. For example, from the Jump Host:

    $ opnfv-util overcloud compute0
    $ ls /etc/collectd/collectd.conf.d/
    $ systemctl status collectd
    $ vi /opt/stack/collectd.log
    

    The following plugins should be found loaded: aodh, gnocchi, hugepages, intel_rdt, mcelog, ovs_events, ovs_stats, snmp, virt

  2. On the compute node, induce an event monitored by the plugins; e.g. a corrected memory error:

    $ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
    $ cd mce-inject
    $ make
    $ modprobe mce-inject
    

    Modify the test/corrected script to include the following:

    CPU 0 BANK 0
    STATUS 0xcc00008000010090
    ADDR 0x0010FFFFFFF
    

    Inject the error:

    $ ./mce-inject < test/corrected
    
  3. Connect to the controller and query the monitoring services. Make sure the overcloudrc.v3 file has been copied to the controller (from the undercloud VM or from the Jump Host) in order to be able to authenticate for OpenStack services.

    $ opnfv-util overcloud controller0
    $ su
    $ source overcloudrc.v3
    $ gnocchi metric list
    $ aodh alarm list
    

    The output for the gnocchi and aodh queries should be similar to the excerpts below:

    +--------------------------------------+---------------------+------------------------------------------------------------------------------------------------------------+-----------+-------------+
    | id                                   | archive_policy/name | name                                                                                                       | unit      | resource_id |
    +--------------------------------------+---------------------+------------------------------------------------------------------------------------------------------------+-----------+-------------+
      [...]
    | 0550d7c1-384f-4129-83bc-03321b6ba157 | high                | overcloud-novacompute-0.jf.intel.com-hugepages-mm-2048Kb@vmpage_number.free                                | Pages     | None        |
    | 0cf9f871-0473-4059-9497-1fea96e5d83a | high                | overcloud-novacompute-0.jf.intel.com-hugepages-node0-2048Kb@vmpage_number.free                             | Pages     | None        |
    | 0d56472e-99d2-4a64-8652-81b990cd177a | high                | overcloud-novacompute-0.jf.intel.com-hugepages-node1-1048576Kb@vmpage_number.used                          | Pages     | None        |
    | 0ed71a49-6913-4e57-a475-d30ca2e8c3d2 | high                | overcloud-novacompute-0.jf.intel.com-hugepages-mm-1048576Kb@vmpage_number.used                             | Pages     | None        |
    | 11c7be53-b2c1-4c0e-bad7-3152d82c6503 | high                | overcloud-novacompute-0.jf.intel.com-mcelog-                                                               | None      | None        |
    |                                      |                     | SOCKET_0_CHANNEL_any_DIMM_any@errors.uncorrected_memory_errors_in_24h                                      |           |             |
    | 120752d4-385e-4153-aed8-458598a2a0e0 | high                | overcloud-novacompute-0.jf.intel.com-cpu-24@cpu.interrupt                                                  | jiffies   | None        |
    | 1213161e-472e-4e1b-9e56-5c6ad1647c69 | high                | overcloud-novacompute-0.jf.intel.com-cpu-6@cpu.softirq                                                     | jiffies   | None        |
      [...]
    
    +--------------------------------------+-------+------------------------------------------------------------------+-------+----------+---------+
    | alarm_id                             | type  | name                                                             | state | severity | enabled |
    +--------------------------------------+-------+------------------------------------------------------------------+-------+----------+---------+
    | fbd06539-45dd-42c5-a991-5c5dbf679730 | event | gauge.memory_erros(overcloud-novacompute-0.jf.intel.com-mcelog)  | ok    | moderate | True    |
    | d73251a5-1c4e-4f16-bd3d-377dd1e8cdbe | event | gauge.mcelog_status(overcloud-novacompute-0.jf.intel.com-mcelog) | ok    | moderate | True    |
      [...]
    
Barometer post installation verification for Compass4nfv

For the Fraser release, Compass4nfv integrated the barometer-collectd container from Barometer. As a result, collectd runs in a Docker container on the compute node. On the controller node, Grafana and InfluxDB are installed and configured.

The following steps describe how to perform simple “manual” testing of the Barometer components after successfully deploying a Barometer scenario using Compass4nfv:

On the compute:

  1. Connect to any compute node and ensure that the collectd container is running.

    root@host2:~# docker ps | grep collectd
    

    You should see the container opnfv/barometer-collectd running.

  2. Testing using mce-inject is similar to testing done in Apex.

On the controller:

3. Connect to the controller and query the monitoring services. Make sure to log in to the lxc-utility container before using the OpenStack CLI. Please refer to this wiki for details: https://wiki.opnfv.org/display/compass4nfv/Containerized+Compass#ContainerizedCompass-HowtouseOpenStackCLI

root@host1-utility-container-d15da033:~# source ~/openrc
root@host1-utility-container-d15da033:~# gnocchi metric list
root@host1-utility-container-d15da033:~# aodh alarm list

The output for the gnocchi and aodh queries should be similar to the excerpts shown in the section above for Apex.

4. Use a web browser to connect to Grafana at http://<serverip>:3000/, using the hostname or IP of your Ubuntu server and port 3000. Log in with admin/admin. You will see the collectd InfluxDB database listed under Data Sources. You will also see metrics arriving in several dashboards, such as CPU Usage and Host Overview.

For more details on the Barometer containers, Grafana and InfluxDB, please refer to the following documentation link: https://wiki.opnfv.org/display/fastpath/Barometer+Containers#BarometerContainers-barometer-collectdcontainer

OPNFV Barometer User Guide
Barometer collectd plugins description

Collectd is a daemon which collects system performance statistics periodically and provides a variety of mechanisms to publish the collected metrics. It supports more than 90 different input and output plugins. Input plugins retrieve metrics and publish them to the collectd daemon, while output plugins publish the data they receive to an endpoint. Collectd also has infrastructure to support thresholding and notification.
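As an illustrative sketch of this read/write pairing (the plugin names below are common collectd examples, not Barometer-specific), a minimal collectd.conf loads one input plugin and one output plugin:

```
# Minimal collectd.conf sketch: one read plugin, one write plugin.
Interval 10              # seconds between reads

LoadPlugin cpu           # read plugin: CPU usage statistics
LoadPlugin csv           # write plugin: writes each metric to a CSV file

<Plugin csv>
   DataDir "/var/lib/collectd/csv"
</Plugin>
```

Each Barometer plugin described below follows the same pattern: a LoadPlugin line plus an optional <Plugin ...> configuration block.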

Barometer has enabled the following collectd plugins:

  • dpdkstat plugin: A read plugin that retrieves stats from the DPDK extended NIC stats API.
  • dpdkevents plugin: A read plugin that retrieves DPDK link status and DPDK forwarding cores liveliness status (DPDK Keep Alive).
  • gnocchi plugin: A write plugin that pushes the retrieved stats to Gnocchi. It’s capable of pushing any stats read through collectd to Gnocchi, not just the DPDK stats.
  • aodh plugin: A notification plugin that pushes events to Aodh, and creates/updates alarms appropriately.
  • hugepages plugin: A read plugin that retrieves the number of available and free hugepages on a platform as well as what is available in terms of hugepages per socket.
  • Open vSwitch events Plugin: A read plugin that retrieves events from OVS.
  • Open vSwitch stats Plugin: A read plugin that retrieves flow and interface stats from OVS.
  • mcelog plugin: A read plugin that uses mcelog client protocol to check for memory Machine Check Exceptions and sends the stats for reported exceptions.
  • PMU plugin: A read plugin that provides performance counters data on Intel CPUs using Linux perf interface.
  • RDT plugin: A read plugin that provides the last level cache utilization and memory bandwidth utilization.
  • virt: A read plugin that uses virtualization API libvirt to gather statistics about virtualized guests on a system directly from the hypervisor, without a need to install collectd instance on the guest.
  • SNMP Agent: A write plugin that acts as an AgentX subagent, receiving and handling queries from an SNMP master agent and returning the data collected by read plugins. The SNMP Agent plugin handles requests only for OIDs specified in the configuration file. To handle SNMP queries, the plugin gets data from collectd and translates the requested values from collectd’s internal format to SNMP format. It supports SNMP get, getnext and walk requests.

All the plugins above are available on the collectd master branch, except for the Gnocchi and Aodh plugins: they are Python-based plugins, and only C plugins are accepted by the collectd community, so they live in the OpenStack repositories.

Other plugins exist as pull requests against collectd master:

  • Legacy/IPMI: A read plugin that reports platform thermals, voltages, fanspeed, current, flow, power etc. Also, the plugin monitors Intelligent Platform Management Interface (IPMI) System Event Log (SEL) and sends the appropriate notifications based on monitored SEL events.
  • PCIe AER: A read plugin that monitors PCIe standard and advanced errors and sends notifications about those errors.

Third party application in Barometer repository:

  • Open vSwitch PMD stats: An application that retrieves PMD stats from OVS. It is run through the exec plugin.

Plugins and application included in the Euphrates release:

Write Plugins: aodh plugin, SNMP agent plugin, gnocchi plugin.

Read Plugins/application: Intel RDT plugin, virt plugin, Open vSwitch stats plugin, Open vSwitch PMD stats application.

Collectd capabilities and usage

Note

Plugins included in the OPNFV E release will be built-in for Apex integration and can be configured as shown in the examples below.

The collectd plugins in OPNFV are configured with reasonable defaults, but can be overridden.

Building all Barometer upstreamed plugins from scratch

The plugins that have been merged to the collectd master branch can all be built and configured through the barometer repository.

Note

  • sudo permissions are required to install collectd.
  • These instructions are for Centos 7.

To build all the upstream plugins, clone the barometer repo:

$ git clone https://gerrit.opnfv.org/gerrit/barometer

To install collectd as a service and install all its dependencies:

$ cd barometer/systems && ./build_base_machine.sh

This will install collectd as a service and the base install directory will be /opt/collectd.

Sample configuration files can be found in ‘/opt/collectd/etc/collectd.conf.d’

Note

If you don’t want to use one of the Barometer plugins, simply remove the sample config file from ‘/opt/collectd/etc/collectd.conf.d’
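For example, disabling a plugin is just a matter of moving its sample config file out of the directory; the demo below does this in /tmp (the real directory is /opt/collectd/etc/collectd.conf.d, and the file name is illustrative):

```shell
# Disable a plugin by moving its sample config aside; keeping a copy in a
# "disabled" directory lets you re-enable it later.
mkdir -p /tmp/collectd-demo/collectd.conf.d /tmp/collectd-demo/disabled
touch /tmp/collectd-demo/collectd.conf.d/mcelog.conf
mv /tmp/collectd-demo/collectd.conf.d/mcelog.conf /tmp/collectd-demo/disabled/
ls /tmp/collectd-demo/collectd.conf.d    # now empty: plugin no longer loaded
```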

Note

If you plan on using the Exec plugin (for OVS_PMD_STATS or for executing scripts on notification generation), note that the plugin requires a non-root user to execute scripts. By default, the collectd_exec user is used in the exec.conf provided in the sample configurations directory under src/collectd in the Barometer repo. These scripts DO NOT create this user. You need to create this user, or modify the configuration in the sample configurations directory under src/collectd to use another existing non-root user, before running build_base_machine.sh.

Note

If you are using any Open vSwitch plugins you need to run:

$ sudo ovs-vsctl set-manager ptcp:6640

After this, you should be able to start collectd as a service and check that it is running:

$ sudo systemctl start collectd
$ sudo systemctl status collectd

If you want to use Grafana to display the metrics you collect, please see the Grafana guide.

For more information on configuring and installing OpenStack plugins for collectd, check out the collectd-openstack-plugins GSG.

Below is the per-plugin installation and configuration guide, for when you only want to install particular plugins.

DPDK plugins

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: DPDK (http://dpdk.org/)

Note

DPDK statistics plugin requires DPDK version 16.04 or later.

To build and install DPDK to /usr please see: https://github.com/collectd/collectd/blob/master/docs/BUILD.dpdkstat.md

Building and installing collectd:

$ git clone https://github.com/collectd/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

Note

If DPDK was installed in a non-standard location, you will need to specify paths to the header files and libraries using LIBDPDK_CPPFLAGS and LIBDPDK_LDFLAGS. You will also need to add the DPDK library symbols to the shared library path using ldconfig. Note that this update to the shared library path is not persistent (i.e. it will not survive a reboot).

Example of specifying custom paths to DPDK headers and libraries:

$ ./configure LIBDPDK_CPPFLAGS="path to DPDK header files" LIBDPDK_LDFLAGS="path to DPDK libraries"

This will install collectd to the default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the dpdkstat plugin you need to modify the configuration file to include:

LoadPlugin dpdkstat
<Plugin dpdkstat>
   Coremask "0xf"
   ProcessType "secondary"
   FilePrefix "rte"
   EnabledPortMask 0xffff
   PortName "interface1"
   PortName "interface2"
</Plugin>

To configure the dpdkevents plugin you need to modify the configuration file to include:

<LoadPlugin dpdkevents>
  Interval 1
</LoadPlugin>

<Plugin "dpdkevents">
  <EAL>
    Coremask "0x1"
    MemoryChannels "4"
    FilePrefix "rte"
  </EAL>
  <Event "link_status">
    SendEventsOnUpdate false
    EnabledPortMask 0xffff
    SendNotification true
  </Event>
  <Event "keep_alive">
    SendEventsOnUpdate false
    LCoreMask "0xf"
    KeepAliveShmName "/dpdk_keepalive_shm_name"
    SendNotification true
  </Event>
</Plugin>

Note

Currently, the DPDK library doesn’t provide any API to de-initialize the DPDK resources allocated on initialization. This means the collectd plugin cannot correctly release the allocated DPDK resources (locks/memory/PCI bindings etc.) on collectd shutdown, nor reinitialize the DPDK library if the primary DPDK process is restarted. The only way to release those resources is to terminate the process itself. For this reason, the plugin forks off a separate collectd process. This child process becomes a secondary DPDK process, which can be run on specific CPU cores configured by the user through the collectd configuration file (the “Coremask” EAL configuration option, a hexadecimal bitmask of the cores to run on).

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Note

The dpdkstat plugin’s initialization time depends on the read interval. It requires 5 read cycles to set up internal buffers and states; during that time no statistics are submitted. Also, if the plugin is running and the number of DPDK ports is increased, internal buffers are resized. That requires 3 read cycles, and no port statistics are submitted during that time.

The Address-Space Layout Randomization (ASLR) security feature in Linux should be disabled, in order for the same hugepage memory mappings to be present in all DPDK multi-process applications.

To disable ASLR:

$ sudo bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"

To fully enable ASLR:

$ sudo bash -c "echo 2 > /proc/sys/kernel/randomize_va_space"
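Both echo commands above only last until the next reboot. To persist the setting across reboots, a sysctl drop-in can be used (the file name below is illustrative):

```
# /etc/sysctl.d/99-dpdk-aslr.conf (illustrative file name)
kernel.randomize_va_space = 0
```

Apply it with 'sudo sysctl --system'.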

Warning

Disabling Address-Space Layout Randomization (ASLR) may have security implications. It is recommended to be disabled only when absolutely necessary, and only when all implications of this change have been understood.

For more information on multi-process support, please see: http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html

DPDK stats plugin limitations:

  1. The DPDK primary process application should use the same version of DPDK as the collectd DPDK plugin;
  2. Only L2 statistics are supported;
  3. The plugin has been tested on Intel NICs only.

DPDK stats known issues:

  • DPDK port visibility

    When a network port controlled by Linux is bound to the DPDK driver, the port is no longer visible in the OS. This affects the SNMP write plugin, as those ports will not be present in the standard IF-MIB. Thus, additional work is required to support DPDK ports and statistics.

Hugepages Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: None, but assumes hugepages are configured.

To configure some hugepages:

$ sudo mkdir -p /mnt/huge
$ sudo mount -t hugetlbfs nodev /mnt/huge
$ sudo bash -c "echo 14336 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages"

Building and installing collectd:

$ git clone https://github.com/collectd/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-hugepages --enable-debug
$ make
$ sudo make install

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the hugepages plugin you need to modify the configuration file to include:

LoadPlugin hugepages
<Plugin hugepages>
    ReportPerNodeHP  true
    ReportRootHP     true
    ValuesPages      true
    ValuesBytes      false
    ValuesPercentage false
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Intel PMU Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies:

To be suitable for use in the collectd plugin shared library, libjevents should be compiled as position-independent code. To do this, add the following line to pmu-tools/jevents/Makefile:

CFLAGS += -fPIC

Building and installing jevents library:

$ git clone https://github.com/andikleen/pmu-tools.git
$ cd pmu-tools/jevents/
$ make
$ sudo make install

Download the hardware event list relevant to your CPU by fetching and running the event_download.py script, which downloads the appropriate CPU event list json file:

$ wget https://raw.githubusercontent.com/andikleen/pmu-tools/master/event_download.py
$ python event_download.py

This will download the json files to $HOME/.cache/pmu-events/. If you don’t want the files downloaded to that location, set the XDG_CACHE_HOME environment variable to the directory you want the files downloaded to.
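For example, to redirect the download to a custom cache directory (the path below is illustrative):

```shell
# Event lists will be written under $XDG_CACHE_HOME/pmu-events/ instead of
# the default $HOME/.cache/pmu-events/.
export XDG_CACHE_HOME=/tmp/pmu-cache
mkdir -p "$XDG_CACHE_HOME/pmu-events"
echo "event lists will land in: $XDG_CACHE_HOME/pmu-events"
```

Remember to point the EventList plugin option (shown below) at the same directory.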

Building and installing collectd:

$ git clone https://github.com/collectd/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --with-libjevents=/usr/local --enable-debug
$ make
$ sudo make install

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the PMU plugin you need to modify the configuration file to include:

<LoadPlugin intel_pmu>
  Interval 1
</LoadPlugin>
<Plugin "intel_pmu">
  ReportHardwareCacheEvents true
  ReportKernelPMUEvents true
  ReportSoftwareEvents true
  Cores ""
</Plugin>

If you want to monitor Intel CPU-specific events, make sure to enable the two additional options shown below:

<Plugin intel_pmu>
 ReportHardwareCacheEvents true
 ReportKernelPMUEvents true
 ReportSoftwareEvents true
 EventList "$HOME/.cache/pmu-events/GenuineIntel-6-2D-core.json"
 HardwareEvents "L2_RQSTS.CODE_RD_HIT,L2_RQSTS.CODE_RD_MISS" "L2_RQSTS.ALL_CODE_RD"
 Cores ""
</Plugin>

Note

If you set XDG_CACHE_HOME to a location other than the default mentioned above, you will need to modify the path in the EventList option accordingly.

Use the “Cores” option to monitor metrics only for the configured cores. If an empty string is provided as the value for this field, the default configuration is applied: all available cores are monitored separately. To limit monitoring to cores 0-7, set the option as shown below:

Cores "[0-7]"

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Note

The plugin opens file descriptors whose quantity depends on the number of monitored CPUs and the number of monitored counters. Depending on the configuration, it might be necessary to increase the limit on the number of open file descriptors allowed. This can be done using the ‘ulimit -n’ command. If collectd is executed as a service, a ‘LimitNOFILE=’ directive should be defined in the [Service] section of the collectd.service file.
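When collectd runs as a service, the usual way to set this directive is a systemd drop-in; the path and limit value below are illustrative:

```
# /etc/systemd/system/collectd.service.d/limits.conf (illustrative path/value)
[Service]
LimitNOFILE=16384
```

Reload systemd ('systemctl daemon-reload') and restart collectd for the new limit to take effect.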

Intel RDT Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies:

Building and installing PQoS/Intel RDT library:

$ git clone https://github.com/01org/intel-cmt-cat.git
$ cd intel-cmt-cat
$ make
$ make install PREFIX=/usr

You will need to insert the msr kernel module:

$ modprobe msr

Building and installing collectd:

$ git clone https://github.com/collectd/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --with-libpqos=/usr/ --enable-debug
$ make
$ sudo make install

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the RDT plugin you need to modify the configuration file to include:

<LoadPlugin intel_rdt>
  Interval 1
</LoadPlugin>
<Plugin "intel_rdt">
  Cores ""
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

IPMI Plugin

Repo: https://github.com/collectd/collectd

Branch: feat_ipmi_events, feat_ipmi_analog

Dependencies: OpenIPMI library (http://openipmi.sourceforge.net/)

The IPMI plugin is already implemented in the latest collectd, and sensors like temperature, voltage, fanspeed and current are already supported there. The list of supported IPMI sensors has been extended, so sensors like flow and power are now supported as well. Also, a System Event Log (SEL) notification feature has been introduced.

  • The feat_ipmi_events branch includes new SEL feature support in collectd IPMI plugin. If this feature is enabled, the collectd IPMI plugin will dispatch notifications about new events in System Event Log.
  • The feat_ipmi_analog branch includes the support of extended IPMI sensors in collectd IPMI plugin.

Install dependencies

On Centos, install OpenIPMI library:

$ sudo yum install OpenIPMI ipmitool

However, it is recommended to use the latest version of the OpenIPMI library, as it includes fixes for known issues which aren’t included in the standard OpenIPMI library package. The latest version of the library can be found at https://sourceforge.net/p/openipmi/code/ci/master/tree/. Steps to install the library from source are described below.

Remove old version of OpenIPMI library:

$ sudo yum remove OpenIPMI ipmitool

Build and install OpenIPMI library:

$ git clone https://git.code.sf.net/p/openipmi/code openipmi-code
$ cd openipmi-code
$ autoreconf --install
$ ./configure --prefix=/usr
$ make
$ sudo make install

Add the directory containing OpenIPMI*.pc files to the PKG_CONFIG_PATH environment variable:

export PKG_CONFIG_PATH=/usr/lib/pkgconfig

Enable IPMI support in the kernel:

$ sudo modprobe ipmi_devintf
$ sudo modprobe ipmi_si

Note

If the HW supports IPMI, the /dev/ipmi0 character device will be created.

Clone and install the collectd IPMI plugin:

$ git clone https://github.com/collectd/collectd
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the IPMI plugin you need to modify the file to include:

LoadPlugin ipmi
<Plugin ipmi>
   <Instance "local">
     SELEnabled true # only feat_ipmi_events branch supports this
   </Instance>
</Plugin>

Note

By default, the IPMI plugin will read all available analog sensor values, dispatch them to collectd and send SEL notifications.

For more information on the IPMI plugin parameters and SEL feature configuration, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Extended analog sensors support doesn’t require additional configuration; the usual collectd IPMI plugin documentation applies.

Mcelog Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: mcelog

Start by installing mcelog.

Note

The kernel has to have CONFIG_X86_MCE enabled. For 32-bit kernels you need at least a 2.6.30 kernel.

On Centos:

$ sudo yum install mcelog

Or build from source:

$ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mcelog.git
$ cd mcelog
$ make
... become root ...
$ make install
$ cp mcelog.service /etc/systemd/system/
$ systemctl enable mcelog.service
$ systemctl start mcelog.service

Verify that /dev/mcelog exists. You can verify the daemon is running properly by executing:

$ mcelog --client

This should query the information in the running daemon. If it prints nothing, that is fine (no errors have been logged yet). More info at http://www.mcelog.org/installation.html

Modify the mcelog configuration file “/etc/mcelog/mcelog.conf” to include or enable:

socket-path = /var/run/mcelog-client
[dimm]
dimm-tracking-enabled = yes
dmi-prepopulate = yes
uc-error-threshold = 1 / 24h
ce-error-threshold = 10 / 24h

[socket]
socket-tracking-enabled = yes
mem-uc-error-threshold = 100 / 24h
mem-ce-error-threshold = 100 / 24h
mem-ce-error-log = yes

[page]
memory-ce-threshold = 10 / 24h
memory-ce-log = yes
memory-ce-action = soft

[trigger]
children-max = 2
directory = /etc/mcelog

Clone and install the collectd mcelog plugin:

$ git clone https://github.com/collectd/collectd
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the mcelog plugin you need to modify the configuration file to include:

<LoadPlugin mcelog>
  Interval 1
</LoadPlugin>
<Plugin mcelog>
  <Memory>
    McelogClientSocket "/var/run/mcelog-client"
    PersistentNotification false
  </Memory>
  #McelogLogfile "/var/log/mcelog"
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Simulating a Machine Check Exception can be done in one of 3 ways:

  • Running 'make test' in the cloned mcelog directory (the mcelog test suite)
  • Using mce-inject
  • Using mce-test

mcelog test suite:

It is always a good idea to test an error handling mechanism before it is really needed. mcelog includes a test suite. The test suite relies on mce-inject which needs to be installed and in $PATH.

You also need the mce-inject kernel module configured (with CONFIG_X86_MCE_INJECT=y), compiled, installed and loaded:

$ modprobe mce-inject

Then you can run the mcelog test suite with

$ make test

This will inject different classes of errors and check that the mcelog triggers run. There will be some kernel messages about page offlining attempts. The test will also lose a few pages of memory in your system (not significant).

Note

This test will kill any running mcelog, which needs to be restarted manually afterwards.

mce-inject:

A utility to inject corrected, uncorrected and fatal machine check exceptions

$ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
$ cd mce-inject
$ make
$ modprobe mce-inject

Modify the test/corrected script to include the following:

CPU 0 BANK 0
STATUS 0xcc00008000010090
ADDR 0x0010FFFFFFF

Inject the error:

$ ./mce-inject < test/corrected

Note

The uncorrected and fatal scripts under test/ will cause a platform reset. Only the fatal script generates the memory errors. In order to quickly emulate uncorrected memory errors and avoid a host reboot, the following test error from the mce-test suite can be injected:

$ mce-inject  mce-test/cases/coverage/soft-inj/recoverable_ucr/data/srao_mem_scrub

mce-test:

In addition, a more in-depth test of the Linux kernel machine check facilities can be done with the mce-test suite. mce-test supports testing uncorrected error handling, real error injection, handling of different soft offlining cases, and other tests.

Corrected memory error injection:

To inject corrected memory errors:

  • Remove the sb_edac and edac_core kernel modules:

$ rmmod sb_edac
$ rmmod edac_core

  • Insert the einj module:

$ modprobe einj param_extension=1

  • Inject an error by specifying its details (the last command should be repeated at least two times):

$ APEI_IF=/sys/kernel/debug/apei/einj
$ echo 0x8 > $APEI_IF/error_type
$ echo 0x01f5591000 > $APEI_IF/param1
$ echo 0xfffffffffffff000 > $APEI_IF/param2
$ echo 1 > $APEI_IF/notrigger
$ echo 1 > $APEI_IF/error_inject

  • Check the MCE statistics with 'mcelog --client', and check the mcelog log for injected error details with 'less /var/log/mcelog'.

Open vSwitch Plugins

OvS Plugins Repo: https://github.com/collectd/collectd

OvS Plugins Branch: master

OvS Events MIBs: The SNMP OVS interface link status is provided by standard IF-MIB (http://www.net-snmp.org/docs/mibs/IF-MIB.txt)

Dependencies: Open vSwitch, Yet Another JSON Library (https://github.com/lloyd/yajl)

On Centos, install the dependencies and Open vSwitch:

$ sudo yum install yajl-devel

Steps to install Open vSwitch can be found at http://docs.openvswitch.org/en/latest/intro/install/fedora/

Start the Open vSwitch service:

$ sudo service openvswitch-switch start

Configure the ovsdb-server manager:

$ sudo ovs-vsctl set-manager ptcp:6640

Clone and install the collectd ovs plugin:

$ git clone https://github.com/collectd/collectd
$ cd collectd
$ git checkout master
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the OVS events plugin you need to modify the configuration file to include:

<LoadPlugin ovs_events>
   Interval 1
</LoadPlugin>
<Plugin ovs_events>
   Port "6640"
   Address "127.0.0.1"
   Socket "/var/run/openvswitch/db.sock"
   Interfaces "br0" "veth0"
   SendNotification true
</Plugin>

To configure the OVS stats plugin you need to modify the configuration file to include:

<LoadPlugin ovs_stats>
   Interval 1
</LoadPlugin>
<Plugin ovs_stats>
   Port "6640"
   Address "127.0.0.1"
   Socket "/var/run/openvswitch/db.sock"
   Bridges "br0"
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

OVS PMD stats

Repo: https://gerrit.opnfv.org/gerrit/barometer

Prerequisites:

  1. Open vSwitch dependencies are installed.
  2. Open vSwitch service is running.
  3. The ovsdb-server manager is configured.

You can refer to the Open vSwitch Plugins section above for each one of them.

OVS PMD stats application is run through the exec plugin.

To configure the OVS PMD stats application you need to modify the exec plugin configuration to include:

<LoadPlugin exec>
   Interval 1
</LoadPlugin>
<Plugin exec>
    Exec "user:group" "<path to ovs_pmd_stat.sh>"
</Plugin>

Note

The Exec plugin configuration has to be changed to use an appropriate user before starting the collectd service.

ovs_pmd_stat.sh invokes the OVS PMD stats application with its arguments:

sudo python /usr/local/src/ovs_pmd_stats.py --socket-pid-file /var/run/openvswitch/ovs-vswitchd.pid

SNMP Agent Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: NET-SNMP library

Start by installing net-snmp and dependencies.

On Centos 7:

$ sudo yum install net-snmp net-snmp-libs net-snmp-utils net-snmp-devel
$ sudo systemctl start snmpd.service

Then go to the snmp configuration steps below.

From source:

Clone and build net-snmp:

$ git clone https://github.com/haad/net-snmp.git
$ cd net-snmp
$ ./configure --with-persistent-directory="/var/net-snmp" --with-systemd --enable-shared --prefix=/usr
$ make

Become root

$ make install

Copy default configuration to persistent folder:

$ cp EXAMPLE.conf /usr/share/snmp/snmpd.conf

Set library path and default MIB configuration:

$ cd ~/
$ echo export LD_LIBRARY_PATH=/usr/lib >> .bashrc
$ net-snmp-config --default-mibdirs
$ net-snmp-config --snmpconfpath

Configure snmpd as a service:

$ cd net-snmp
$ cp ./dist/snmpd.service /etc/systemd/system/
$ systemctl enable snmpd.service
$ systemctl start snmpd.service

Add the following line to snmpd.conf configuration file /etc/snmp/snmpd.conf to make all OID tree visible for SNMP clients:

view    systemview    included   .1

To verify that SNMP is working you can get IF-MIB table using SNMP client to view the list of Linux interfaces:

$ snmpwalk -v 2c -c public localhost IF-MIB::interfaces

Get the default MIB location:

$ net-snmp-config --default-mibdirs
/opt/stack/.snmp/mibs:/usr/share/snmp/mibs

Install Intel-specific MIBs (if needed) into the location reported by the net-snmp-config command (e.g. /usr/share/snmp/mibs):

$ git clone https://gerrit.opnfv.org/gerrit/barometer.git
$ sudo cp -f barometer/mibs/*.txt /usr/share/snmp/mibs/
$ sudo systemctl restart snmpd.service

Clone and install the collectd snmp_agent plugin:

$ cd ~
$ git clone https://github.com/collectd/collectd
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug --enable-snmp --with-libnetsnmp
$ make
$ sudo make install

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc.

The SNMP Agent plugin is a generic plugin and cannot work without configuration. To configure the snmp_agent plugin you need to modify the configuration file to include OIDs mapped to collectd types. The following example maps the scalar memAvailReal OID to the value represented as the free memory type of the memory plugin:

LoadPlugin snmp_agent
<Plugin "snmp_agent">
  <Data "memAvailReal">
    Plugin "memory"
    Type "memory"
    TypeInstance "free"
    OIDs "1.3.6.1.4.1.2021.4.6.0"
  </Data>
</Plugin>

The snmpwalk command can be used to validate the collectd configuration:

$ snmpwalk -v 2c -c public localhost 1.3.6.1.4.1.2021.4.6.0
UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 135237632 kB

Limitations

  • An object instance with the Counter64 type is not supported in SNMPv1. When a GetNext request is received, Counter64-type objects will be skipped. When a Get request is received for a Counter64-type object, an error will be returned.
  • Interfaces that are not visible to Linux, such as DPDK interfaces, cannot be retrieved using standard IF-MIB tables.

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

For more details on AgentX subagent, please see: http://www.net-snmp.org/tutorial/tutorial-5/toolkit/demon/

virt plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: libvirt (https://libvirt.org/), libxml2

On Centos, install the dependencies:

$ sudo yum install libxml2-devel libpciaccess-devel yajl-devel device-mapper-devel

Install libvirt:

Note

The libvirt version available in the package manager might be quite old and offer only limited functionality. Hence, building and installing libvirt from sources is recommended. Detailed instructions can be found at: https://libvirt.org/compiling.html

$ sudo yum install libvirt-devel

Certain metrics provided by the plugin require a minimum version of the libvirt API. File system information statistics require a Guest Agent (GA) to be installed and configured in the VM; make sure that the installed GA version supports retrieving file system information. The number of Performance monitoring events metrics depends on the version of the running libvirt daemon.

Note

Please keep in mind that RDT metrics (part of Performance monitoring events) have to be supported by hardware. For more details on hardware support, please see: https://github.com/01org/intel-cmt-cat

Additionally, perf metrics cannot be collected if the Intel RDT plugin is enabled.

The libvirt version can be checked with the following commands:

$ virsh --version
$ libvirtd --version
Extended statistics requirements

  Statistic                      Min. libvirt API version   Requires GA
  Domain reason                  0.9.2                      No
  Disk errors                    0.9.10                     No
  Job statistics                 1.2.9                      No
  File system information        1.2.11                     Yes
  Performance monitoring events  1.3.3                      No

Start libvirt daemon:

$ systemctl start libvirtd

Create domain (VM) XML configuration file. For more information on domain XML format and examples, please see: https://libvirt.org/formatdomain.html

Note

Installing additional hypervisor dependencies might be required before deploying a virtual machine.

Create domain, based on created XML file:

$ virsh define DOMAIN_CFG_FILE.xml

Start domain:

$ virsh start DOMAIN_NAME

Check if domain is running:

$ virsh list

Check list of available Performance monitoring events and their settings:

$ virsh perf DOMAIN_NAME

Enable or disable Performance monitoring events for domain:

$ virsh perf DOMAIN_NAME [--enable | --disable] EVENT_NAME --live

Clone and install the collectd virt plugin:

$ git clone $REPO
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

Where $REPO is the repository URL provided above.

This will install collectd to /opt/collectd. The collectd configuration file collectd.conf can be found at /opt/collectd/etc. To load the virt plugin, the user needs to modify the configuration file to include:

LoadPlugin virt

Additionally, the user can specify plugin configuration parameters in this file, such as the connection URL, domain name and much more. By default, extended virt plugin statistics are disabled. They can be enabled with the ExtraStats option.

<Plugin virt>
   RefreshInterval 60
   ExtraStats "cpu_util disk disk_err domain_state fs_info job_stats_background pcpu perf vcpupin"
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Installing collectd as a service

NOTE: In an OPNFV installation, collectd is installed and configured as a service.

Collectd service scripts are available in the collectd/contrib directory. To install collectd as a service:

$ sudo cp contrib/systemd.collectd.service /etc/systemd/system/
$ cd /etc/systemd/system/
$ sudo mv systemd.collectd.service collectd.service
$ sudo chmod +x collectd.service

Modify collectd.service

[Service]
ExecStart=/opt/collectd/sbin/collectd
EnvironmentFile=-/opt/collectd/etc/
CapabilityBoundingSet=CAP_SETUID CAP_SETGID

Reload

$ sudo systemctl daemon-reload
$ sudo systemctl start collectd.service
$ sudo systemctl status collectd.service

The status command should report that the service started successfully.

Additional useful plugins

Exec Plugin: Can be used to show when notifications are generated, by calling a bash script that dumps the notifications to a file (handy for debugging). Modify /opt/collectd/etc/collectd.conf:

LoadPlugin exec
<Plugin exec>
#   Exec "user:group" "/path/to/exec"
   NotificationExec "user" "<path to barometer>/barometer/src/collectd/collectd_sample_configs/write_notification.sh"
</Plugin>

write_notification.sh simply writes the notification passed from exec via STDIN to a file (/tmp/notifications):

#!/bin/bash
rm -f /tmp/notifications
while read x y
do
  echo "$x$y" >> /tmp/notifications
done

The output in /tmp/notifications should look like:

Severity:WARNING
Time:1479991318.806
Host:localhost
Plugin:ovs_events
PluginInstance:br-ex
Type:gauge
TypeInstance:link_status
uuid:f2aafeec-fa98-4e76-aec5-18ae9fc74589

linkstate of "br-ex" interface has been changed to "DOWN"

  • logfile plugin: Can be used to log collectd activity. Modify /opt/collectd/etc/collectd.conf to include:
LoadPlugin logfile
<Plugin logfile>
    LogLevel info
    File "/var/log/collectd.log"
    Timestamp true
    PrintSeverity false
</Plugin>
Monitoring Interfaces and Openstack Support
_images/monitoring_interfaces.png

Monitoring Interfaces and Openstack Support

The figure above shows the DPDK L2 forwarding application running on a compute node, sending and receiving traffic. Collectd is also running on this compute node retrieving the stats periodically from DPDK through the dpdkstat plugin and publishing the retrieved stats to OpenStack through the collectd-openstack-plugins.

To see this demo in action please checkout: Barometer OPNFV Summit demo

For more information on configuring and installing OpenStack plugins for collectd, check out the collectd-openstack-plugins GSG.

Security
VES Application User Guide

The Barometer repository contains a Python-based application for VES (VNF Event Stream) which receives collectd-specific metrics via the Kafka bus, normalizes the metric data into the VES message format and sends it to the VES collector.

The application currently supports pushing platform relevant metrics through the additional measurements field for VES.

Collectd has a write_kafka plugin that sends collectd metrics and values to a Kafka broker. The VES message formatting application, ves_app.py, receives metrics from the Kafka broker and normalizes the data into the VES message format for forwarding to the VES collector. The VES message formatting application will simply be referred to as the “VES application” within this user guide.
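
As an illustration of the normalization step, the following Python sketch maps one collectd write_kafka JSON sample into a VES-style additional measurement. The normalize_sample helper and the simplified output shape are assumptions for illustration, not the actual ves_app.py code; the input field names follow the collectd JSON output format.

```python
import json

def normalize_sample(raw):
    """Map one collectd JSON sample to a simplified VES-style measurement."""
    sample = json.loads(raw)[0]  # write_kafka emits a JSON array of samples
    return {
        "name": "%s-%s" % (sample["plugin"], sample["type"]),
        "arrayOfFields": [
            {"name": ds, "value": str(val)}
            for ds, val in zip(sample["dsnames"], sample["values"])
        ],
    }

# Example payload in the collectd write_kafka JSON format
raw = ('[{"values":[798456],"dstypes":["gauge"],"dsnames":["value"],'
       '"time":1509535573.4,"interval":10,"host":"localhost",'
       '"plugin":"memory","plugin_instance":"","type":"memory",'
       '"type_instance":"free"}]')
print(normalize_sample(raw))
```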

The VES application can be run in host mode (baremetal), hypervisor mode (on a host with a hypervisor and VMs running) or guest mode(within a VM). The main software blocks that are required to run the VES application demo are:

  1. Kafka
  2. Collectd
  3. VES Application
  4. VES Collector
Install Kafka Broker
  1. Dependencies: install JAVA & Zookeeper.

    Ubuntu 16.04:

    $ sudo apt-get install default-jre
    $ sudo apt-get install zookeeperd
    $ sudo apt-get install python-pip
    

    CentOS:

    $ sudo yum update -y
    $ sudo yum install java-1.8.0-openjdk
    $ sudo yum install epel-release
    $ sudo yum install python-pip
    $ sudo yum install zookeeper
    $ sudo yum install telnet
    $ sudo yum install wget
    

    Note

You may need to add the repository that contains zookeeper. To do so, run the command below and then try to install zookeeper again using the steps above; otherwise, skip the next step.

    $ sudo yum install
    https://archive.cloudera.com/cdh5/one-click-install/redhat/7/x86_64/cloudera-cdh-5-0.x86_64.rpm
    

    Start zookeeper:

    $ sudo zookeeper-server start
    

If you get an error message like "ZooKeeper data directory is missing at /var/lib/zookeeper" when starting zookeeper, initialize the zookeeper data directory using the command below and start zookeeper again; otherwise, skip the next step.

    $ sudo /usr/lib/zookeeper/bin/zkServer-initialize.sh
     No myid provided, be sure to specify it in /var/lib/zookeeper/myid if using non-standalone
    
  2. Test if Zookeeper is running as a daemon.

    $ telnet localhost 2181
    

    Type ‘ruok’ and hit enter. The expected response is ‘imok’, which means that Zookeeper is up and running.

  3. Install Kafka

    Note

    VES doesn’t work with version 0.9.4 of kafka-python. The recommended/tested version is 1.3.3.

    $ sudo pip install kafka-python
    $ wget "https://archive.apache.org/dist/kafka/1.0.0/kafka_2.11-1.0.0.tgz"
    $ tar -xvzf kafka_2.11-1.0.0.tgz
    $ sed -i -- 's/#delete.topic.enable=true/delete.topic.enable=true/' kafka_2.11-1.0.0/config/server.properties
    $ sudo nohup kafka_2.11-1.0.0/bin/kafka-server-start.sh \
      kafka_2.11-1.0.0/config/server.properties > kafka_2.11-1.0.0/kafka.log 2>&1 &
    

    Note

    If Kafka server fails to start, please check if the platform IP address is associated with the hostname in the static host lookup table. If it doesn’t exist, use the following command to add it.

    $ echo "$(ip route get 8.8.8.8 | awk '{print $NF; exit}') $HOSTNAME" | sudo tee -a /etc/hosts
    
  4. Test the Kafka Installation

    To test that the installation worked correctly, there are two scripts: a producer and a consumer. These allow you to see messages pushed to the broker and received from the broker.

    Producer (Publish “Hello World”):

    $ echo "Hello, World" | kafka_2.11-1.0.0/bin/kafka-console-producer.sh \
      --broker-list localhost:9092 --topic TopicTest > /dev/null
    

    Consumer (Receive “Hello World”):

    $ kafka_2.11-1.0.0/bin/kafka-console-consumer.sh --zookeeper \
      localhost:2181 --topic TopicTest --from-beginning --max-messages 1 --timeout-ms 3000
    
Install collectd

Install development tools:

Ubuntu 16.04:

$ sudo apt-get install build-essential bison autotools-dev autoconf
$ sudo apt-get install pkg-config flex libtool

CentOS:

$ sudo yum group install 'Development Tools'

Install Apache Kafka C/C++ client library:

$ git clone https://github.com/edenhill/librdkafka.git ~/librdkafka
$ cd ~/librdkafka
$ git checkout -b v0.9.5 v0.9.5
$ ./configure --prefix=/usr
$ make
$ sudo make install

Build collectd with Kafka support:

$ git clone https://github.com/collectd/collectd.git ~/collectd
$ cd ~/collectd
$ ./build.sh
$ ./configure --with-librdkafka=/usr --without-perl-bindings --enable-perl=no
$ make && sudo make install

Note

If installing from the git repository, the collectd.conf configuration file will be located in /opt/collectd/etc/. If installing via a package manager, it will be located in /etc/collectd/.

Configure collectd by modifying the collectd configuration file collectd.conf as shown in the mode-specific sections below.

Start collectd process as a service as described in Installing collectd as a service.

Setup VES application (guest mode)

In this mode Collectd runs from within a VM and sends metrics to the VES collector.

_images/ves-app-guest-mode.png

VES guest mode setup

Install dependencies:

$ sudo pip install pyyaml python-kafka

Clone Barometer repo and start the VES application:

$ git clone https://gerrit.opnfv.org/gerrit/barometer
$ cd barometer/3rd_party/collectd-ves-app/ves_app
$ nohup python ves_app.py --events-schema=guest.yaml --config=ves_app_config.conf > ves_app.stdout.log &

Modify Collectd configuration file collectd.conf as following:

LoadPlugin logfile
LoadPlugin interface
LoadPlugin memory
LoadPlugin load
LoadPlugin disk
LoadPlugin uuid
LoadPlugin cpu
LoadPlugin write_kafka

<Plugin logfile>
  LogLevel info
  File "/opt/collectd/var/log/collectd.log"
  Timestamp true
  PrintSeverity false
</Plugin>

<Plugin cpu>
  ReportByCpu true
  ReportByState true
  ValuesPercentage true
</Plugin>

<Plugin write_kafka>
  Property "metadata.broker.list" "localhost:9092"
  <Topic "collectd">
    Format JSON
  </Topic>
</Plugin>

Start collectd process as a service as described in Installing collectd as a service.

Note

The above configuration is for a localhost setup. The VES application can be configured to use a remote VES collector and a remote Kafka server. To do so, the IP addresses/host names need to be changed in the collector.conf and ves_app_config.conf files accordingly.

Setup VES application (hypervisor mode)

This mode is used to collect hypervisor statistics about guest VMs and to send those metrics to the VES collector. This mode also collects host statistics and sends them as part of the guest VES message.

_images/ves-app-hypervisor-mode.png

VES hypervisor mode setup

Running the VES application in hypervisor mode is similar to the steps described in Setup VES application (guest mode), with the following exceptions:

  • The hypervisor.yaml configuration file should be used instead of the guest.yaml file when running the VES application.
  • Collectd should be running on the hypervisor machine only.
  • Additional libvirt dependencies need to be installed on the host where the collectd daemon is running. To install those dependencies, see the virt plugin section of the Barometer user guide.
  • The following minimal configuration needs to be provided to collectd so that it can generate the VES message for the VES collector.

Note

At least one VM instance should be up and running by hypervisor on the host.

LoadPlugin logfile
LoadPlugin cpu
LoadPlugin virt
LoadPlugin write_kafka

<Plugin logfile>
  LogLevel info
  File "/opt/collectd/var/log/collectd.log"
  Timestamp true
  PrintSeverity false
</Plugin>

<Plugin virt>
  Connection "qemu:///system"
  RefreshInterval 60
  HostnameFormat uuid
  PluginInstanceFormat name
  ExtraStats "cpu_util"
</Plugin>

<Plugin write_kafka>
  Property "metadata.broker.list" "localhost:9092"
  <Topic "collectd">
    Format JSON
  </Topic>
</Plugin>

Start collectd process as a service as described in Installing collectd as a service.

Note

The above configuration is for a localhost setup. The VES application can be configured to use a remote VES collector and a remote Kafka server. To do so, the IP addresses/host names need to be changed in the collector.conf and ves_app_config.conf files accordingly.

Note

The list of plugins can be extended depending on your needs.

Setup VES application (host mode)

This mode is used to collect platform wide metrics and to send those metrics into the VES collector. It is most suitable for running within a baremetal platform.

Install dependencies:

$ sudo pip install pyyaml

Clone Barometer repo and start the VES application:

$ git clone https://gerrit.opnfv.org/gerrit/barometer
$ cd barometer/3rd_party/collectd-ves-app/ves_app
$ nohup python ves_app.py --events-schema=host.yaml --config=ves_app_config.conf > ves_app.stdout.log &
_images/ves-app-host-mode.png

VES Native mode setup

Modify collectd configuration file collectd.conf as following:

LoadPlugin interface
LoadPlugin memory
LoadPlugin disk

LoadPlugin cpu
<Plugin cpu>
  ReportByCpu true
  ReportByState true
  ValuesPercentage true
</Plugin>

LoadPlugin write_kafka
<Plugin write_kafka>
  Property "metadata.broker.list" "localhost:9092"
  <Topic "collectd">
    Format JSON
  </Topic>
</Plugin>

Start collectd process as a service as described in Installing collectd as a service.

Note

The above configuration is for a localhost setup. The VES application can be configured to use a remote VES collector and a remote Kafka server. To do so, the IP addresses/host names need to be changed in the collector.conf and ves_app_config.conf files accordingly.

Note

The list of plugins can be extended depending on your needs.

Setup VES Test Collector

Note

Test Collector setup is required only for VES application testing purposes.

Install dependencies:

$ sudo pip install jsonschema

Clone VES Test Collector:

$ git clone https://github.com/att/evel-test-collector.git ~/evel-test-collector

Modify VES Test Collector config file to point to existing log directory and schema file:

$ sed -i.back 's/^\(log_file[ ]*=[ ]*\).*/\1collector.log/' ~/evel-test-collector/config/collector.conf
$ sed -i.back 's/^\(schema_file[ ]*=.*\)event_format_updated.json$/\1CommonEventFormat.json/' ~/evel-test-collector/config/collector.conf

Start VES Test Collector:

$ cd ~/evel-test-collector/code/collector
$ nohup python ./collector.py --config ../../config/collector.conf > collector.stdout.log &
VES application configuration description

Details of the Vendor Event Listener REST service

REST resources are defined with respect to a ServerRoot:

ServerRoot = https://{Domain}:{Port}/{optionalRoutingPath}

REST resources are of the form:

{ServerRoot}/eventListener/v{apiVersion}
{ServerRoot}/eventListener/v{apiVersion}/{topicName}
{ServerRoot}/eventListener/v{apiVersion}/eventBatch

Within the VES directory (3rd_party/collectd-ves-app/ves_app) there is a configuration file called ves_app_config.conf. The configuration options are described below:

Domain “host”
VES domain name. It can be IP address or hostname of VES collector (default: 127.0.0.1)
Port port
VES port (default: 30000)
Path “path”
Used as the “optionalRoutingPath” element in the REST path (default: vendor_event_listener)
Topic “path”
Used as the “topicName” element in the REST path (default: example_vnf)
UseHttps true|false
Allow application to use HTTPS instead of HTTP (default: false)
Username “username”
VES collector user name (default: empty)
Password “passwd”
VES collector password (default: empty)
SendEventInterval interval
This configuration option controls how often (sec) collectd data is sent to Vendor Event Listener (default: 20)
ApiVersion version
Used as the “apiVersion” element in the REST path (default: 3)
KafkaPort port
Kafka Port (Default 9092)
KafkaBroker host
Kafka Broker domain name. It can be an IP address or hostname of local or remote server (default: localhost)
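
As an illustration, the resulting eventListener URL can be assembled from these options in Python. The helper functions below are hypothetical (not part of the VES application); the defaults match those listed above.

```python
def server_root(domain="127.0.0.1", port=30000,
                path="vendor_event_listener", use_https=False):
    """Build the ServerRoot from Domain, Port, Path and UseHttps options."""
    scheme = "https" if use_https else "http"
    root = "%s://%s:%d" % (scheme, domain, port)
    return "%s/%s" % (root, path) if path else root

def event_listener_url(api_version=3, topic="example_vnf", **kwargs):
    """Append the eventListener resource and optional topic to ServerRoot."""
    url = "%s/eventListener/v%s" % (server_root(**kwargs), api_version)
    return "%s/%s" % (url, topic) if topic else url

print(event_listener_url())
```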
VES YAML configuration format

The format of the VES message is generated by the VES application based on the YAML schema configuration file provided by the user via the --events-schema command-line option of the application.

Note

Use --help option of VES application to see the description of all available options

Note

The detailed installation guide of the VES application is described in the VES Application User Guide.

The message defined by the YAML schema should correspond to the format defined in the VES schema definition.

Warning

This section doesn’t explain the YAML language itself; knowledge of the YAML language is required before continuing.

Since the VES message output is in JSON format, it’s recommended to understand how a YAML document is converted to JSON before creating a new YAML definition for the VES. For example:

The following YAML document:

---
additionalMeasurements:
  plugin: somename
  instance: someinstance

will be converted to JSON like this:

{
  "additionalMeasurements": {
    "instance": "someinstance",
    "plugin": "somename"
  }
}

Note

The YAML syntax section of PyYAML documentation can be used as a reference for this.

VES message types

The VES agent can generate two types of messages which are sent to the VES collector. Each message type must be specified in the YAML configuration file using a specific YAML tag.

Measurements

This type is used to send a message defined in the YAML configuration to the VES collector at a specified interval (the default is 20 sec and it’s configurable via a command-line option of the application). This type can be specified in the configuration using the !Measurements tag. For instance:

---
# My comments
My Host Measurements: !Measurements
  ... # message definition
Events

This type is used to send a message defined in the YAML configuration to the VES collector when a collectd notification is received from the Kafka bus (collectd write_kafka plugin). This type can be specified in the configuration using the !Events tag. For instance:

---
# My comments
My Events: !Events
  ... # event definition
Collectd metrics in VES

The VES application caches collectd metrics received via the Kafka bus. The data is stored in a table format. It’s important to understand this format before mapping collectd metric values to a message defined in the YAML configuration file.

VES collectd metric cache example:

  host       plugin     plugin_instance     type       type_instance   time               value   ds_name  interval
  localhost  cpu                            percent    user            1509535512.30567   16      value    10
  localhost  memory                         memory     free            1509535573.448014  798456  value    10
  localhost  interface                      eth0       if_packets      1509544183.956496  253     rx       10
  7ec333e7   virt       Ubuntu-12.04.5-LTS  percent    virt_cpu_total  1509544402.378035  0.2     value    10
  7ec333e7   virt       Ubuntu-12.04.5-LTS  memory     rss             1509544638.55119   123405  value    10
  7ec333e7   virt       Ubuntu-12.04.5-LTS  if_octets  vnet1           1509544646.27108   67      tx       10
  cc659a52   virt       Ubuntu-16.04        percent    virt_cpu_total  1509544745.617207  0.3     value    10
  cc659a52   virt       Ubuntu-16.04        memory     rss             1509544754.329411  4567    value    10
  cc659a52   virt       Ubuntu-16.04        if_octets  vnet0           1509544760.720381  0       rx       10

It’s possible from the YAML configuration file to refer to any field of any row of the table via special YAML tags like ValueItem or ArrayItem. See the Collectd metric reference section for more details.

Note

The collectd data types file contains a mapping of each type to its ds_name fields; it can be used to find the possible values of the ds_name field. See the collectd data types description for more details on collectd data types.

Aging of collectd metric

If a metric is not updated by collectd within twice its metric interval, it is removed (aged out) from the VES internal cache.
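
A minimal Python sketch of this aging rule (illustrative only, not the actual VES application code): a cached metric is dropped once it has not been refreshed for twice its reporting interval.

```python
def age_cache(cache, now):
    """Keep only entries updated within 2 x their collectd interval."""
    return {key: m for key, m in cache.items()
            if now - m["time"] <= 2 * m["interval"]}

cache = {
    ("localhost", "cpu"):    {"time": 100.0, "interval": 10},
    ("localhost", "memory"): {"time": 85.0,  "interval": 10},
}
fresh = age_cache(cache, now=110.0)
# the cpu entry (10 s old) survives; the memory entry (25 s old) is aged out
```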

VES YAML references

There are four types of references supported by the YAML format: System, Config, Collectd metric and Collectd notification. The format of a reference is the following:

"{<ref type>.<attribute name>}"

Note

Possible values for <ref type> and <attribute name> are described in the following sections.

System reference

This reference is used to get system statistics like time, date etc. The system reference (<ref type> = system) can be used in any place of the YAML configuration file. This type of reference provides the following attributes:

hostname
The name of the host where VES application is running.
id
Unique ID during VES application runtime.
time
Current time in seconds since the Epoch. For example 1509641411.631951.
date
Current date in ISO 8601 format, YYYY-MM-DD. For instance 2017-11-02.

For example:

Date: "{system.date}"
Config reference

This reference is used to get the VES configuration described in the VES application configuration description section. The reference (<ref type> = config) can be used in any place of the YAML configuration file. This type of reference provides the following attributes:

interval
Measurements dispatch interval. It references the SendEventInterval configuration of the VES application.

For example:

Interval: "{config.interval}"
Collectd metric reference

This reference is used to get the attribute value of collectd metric from the VES cache. The reference (<ref type> = vl) can be used ONLY inside Measurements, ValueItem and ArrayItem tags. Using the reference inside a helper tag is also allowed if the helper tag is located inside the tag where the reference is allowed (e.g.: ArrayItem). The <attribute name> name corresponds to the table field name described in Collectd metrics in VES section. For example:

name: "{vl.type}-{vl.type_instance}"
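
As an illustration, resolving such a reference can be sketched in Python. The resolve helper and the row dictionary below are hypothetical, not part of the VES application; the row mirrors one entry of the cache table shown in the Collectd metrics in VES section.

```python
import re

def resolve(template, row):
    """Substitute {vl.<attr>} references with values from one cache row."""
    return re.sub(r"\{vl\.(\w+)\}", lambda m: str(row[m.group(1)]), template)

row = {"host": "localhost", "plugin": "memory", "plugin_instance": "",
       "type": "memory", "type_instance": "free", "value": 798456}
print(resolve("{vl.type}-{vl.type_instance}", row))  # memory-free
```
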
Collectd notification reference

This reference is used to get the attribute value of received collectd notification. The reference (<ref type> = n) can be used ONLY inside Events tag. Using the reference inside a helper tag is also allowed if the helper tag is located inside the Events tag. This type of reference provides the following attributes:

host
The hostname of received collectd notification.
plugin
The plugin name of received collectd notification.
plugin_instance
The plugin instance of the received collectd notification.
type
The type name of the received collectd notification.
type_instance
The type instance of the received collectd notification.
severity
The severity of the received collectd notification.
message
The message of the received collectd notification.

Note

The exact value for each attribute depends on the collectd plugin which may generate the notification. Please refer to the collectd plugin description document to get more details on the specific collectd plugin.

YAML config example:

sourceId: "{n.plugin_instance}"
Collectd metric mapping YAML tags

This section describes the YAML tags used to map collectd metric values in the YAML configuration file.

Measurements tag

This tag is a YAML map which is used to define the VES measurement message. It’s allowed to be used multiple times in the document (e.g. you can define multiple VES messages in one YAML document). This tag works in the same way as the ArrayItem tag does, and all keys have the same description/rules.

ValueItem tag

This tag is used to select a collectd metric and get its attribute value using Collectd metric reference. The type of this tag is a YAML array of maps with the possible keys described below.

SELECT (required)
Is a YAML map which describes the metric selection criteria. Each key name of the map must correspond to a table field name described in the Collectd metrics in VES section.
VALUE (optional)
Describes the value to be assigned. If not provided, the default !Number "{vl.value}" expression is used.
DEFAULT (optional)
Describes the default value which will be assigned if no metric is selected by SELECT criteria.

ValueItem tag description example:

memoryFree: !ValueItem
  - SELECT:
      plugin: memory
      type: memory
      type_instance: rss
  - VALUE: !Bytes2Kibibytes "{vl.value}"
  - DEFAULT: 0

The tag process workflow is described on the figure below.

_images/value-item-parse-workflow.png

YAML ValueItem tag process workflow

ArrayItem tag

This tag is used to select a list of collectd metrics and generate a YAML array of YAML items described by the ITEM-DESC key. If no collectd metrics are selected by the given criteria, an empty array is returned.

SELECT (optional)

Is a YAML map which describes the metric selection criteria. Each key name of the map must correspond to a table field name described in the Collectd metrics in VES section. The value of a key may be a regular expression. To enable a regular expression in the value, use a YAML string containing a / character at the beginning and at the end. For example:

plugin: "/^(?!virt).*$/" # selected all metrics except ``virt`` plugin

The VES application uses the python RE library to work with regular expression specified in the YAML configuration. Please refer to python regular expression syntax documentation for more details on a syntax used by the VES.
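
The pattern from the example above can be checked directly with the Python RE library:

```python
import re

# Negative lookahead: match every plugin name except ones starting with "virt"
pattern = re.compile(r"^(?!virt).*$")

print(bool(pattern.match("cpu")))     # True  - selected
print(bool(pattern.match("memory")))  # True  - selected
print(bool(pattern.match("virt")))    # False - filtered out
```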

Multiple SELECT keys are allowed by the tag. If neither a SELECT nor an INDEX-KEY key is specified, a VES error is generated.

INDEX-KEY (optional)
Is a YAML array which describes the unique fields to be selected by the tag. Each element of the array is a YAML string which should be one of the table field names described in the Collectd metrics in VES section. Please note, if this key is used, only fields specified by the key can be used as a collectd reference in the ITEM-DESC key.
ITEM-DESC (required)
Is a YAML map which describes each element of the YAML array generated by the tag. Using ArrayItem tags and other Helper YAML tags are also allowed in the definition of the key.

In the example below, the ArrayItem tag is used to generate an array of ITEM-DESC items for all collectd metrics except the virt plugin, with unique plugin and plugin_instance attribute values.

Measurements: !ArrayItem
  - SELECT:
      plugin: "/^(?!virt).*$/"
  - INDEX-KEY:
      - plugin
      - plugin_instance
  - ITEM-DESC:
      name: !StripExtraDash "{vl.plugin}-{vl.plugin_instance}"

The tag process workflow is described on the figure below.

_images/array-item-parse-workflow.png

YAML ArrayItem tag process workflow

Collectd event mapping YAML tags

This section describes the YAML tags used to map the collectd notification to the VES event message in the YAML configuration file.

Events tag

This tag is a YAML map which is used to define the VES event message. It’s allowed to be used multiple times in the document (e.g. you can map multiple collectd notifications into VES messages in one YAML document). The possible keys of the tag are described below.

CONDITION (optional)
Is a YAML map which describes the notification selection criteria. Each key name of the map must correspond to the name of an attribute provided by the Collectd notification reference. If no such key is provided, any collectd notification will be mapped to the defined YAML message.
ITEM-DESC (required)
Is a YAML map which describes the message generated by this tag. Only Helper YAML tags are allowed in the definition of the key.

An example of the VES event message that will be generated by the VES application when a collectd notification of the virt plugin is triggered is shown below.

---
Virt Event: !Events
  - ITEM-DESC:
      event:
        commonEventHeader:
          domain: fault
          eventType: Notification
          sourceId: &event_sourceId "{n.plugin_instance}"
          sourceName: *event_sourceId
          lastEpochMicrosec: !Number "{n.time}"
          startEpochMicrosec: !Number "{n.time}"
        faultFields:
          alarmInterfaceA: !StripExtraDash "{n.plugin}-{n.plugin_instance}"
          alarmCondition: "{n.severity}"
          faultFieldsVersion: 1.1
  - CONDITION:
      plugin: virt
Helper YAML tags

This section describes the YAML tags used as utilities for formatting the output message. The YAML configuration process workflow is described on the figure below.

_images/parse-work-flow.png

YAML configuration process workflow

Convert string to number tag

The !Number tag is used in the YAML configuration file to convert a string value into a number. For instance:

lastEpochMicrosec: !Number "3456"

The output of the tag will be the JSON number.

{
  lastEpochMicrosec: 3456
}
Convert bytes to Kibibytes tag

The !Bytes2Kibibytes tag is used in YAML configuration file to convert bytes into kibibytes (1 kibibyte = 1024 bytes). For instance:

memoryConfigured: !Bytes2Kibibytes 4098
memoryConfigured: !Bytes2Kibibytes "1024"

The output of the tag will be the JSON number.

{
  memoryConfigured: 4
  memoryConfigured: 1
}
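
As an illustration (not the actual tag implementation), the conversion can be sketched in Python, assuming the result is truncated to a whole number as in the example above:

```python
def bytes_to_kibibytes(value):
    """Convert a byte count (number or string) to kibibytes (1 KiB = 1024 B)."""
    return int(value) // 1024  # assumption: fractional kibibytes are truncated

print(bytes_to_kibibytes(4098))    # 4
print(bytes_to_kibibytes("1024"))  # 1
```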
Convert one value to another tag

The !MapValue tag is used in YAML configuration file to map one value into another value defined in the configuration. For instance:

Severity: !MapValue
  VALUE: Failure
  TO:
    Failure: Critical
    Error: Warning

The output of the tag will be the mapped value.

{
  Severity: Critical
}
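
A minimal Python sketch of this lookup (illustrative only; the map_value helper is hypothetical, not part of the VES application):

```python
def map_value(value, to):
    """Translate a value through the TO mapping, as !MapValue does."""
    return to[value]

mapping = {"Failure": "Critical", "Error": "Warning"}
print(map_value("Failure", mapping))  # Critical
```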
Strip extra dash tag

The !StripExtraDash tag is used in YAML configuration file to strip extra dashes in the string (dashes at the beginning, at the end and double dashes). For example:

name: !StripExtraDash string-with--extra-dashes-

The output of the tag will be the JSON string with extra dashes removed.

{
  name: string-with-extra-dashes
}
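
The stripping behaviour can be sketched in Python (illustrative only, not the actual tag implementation): collapse runs of dashes and trim leading/trailing ones.

```python
import re

def strip_extra_dash(s):
    """Collapse double dashes and strip dashes at the string edges."""
    return re.sub(r"-{2,}", "-", s).strip("-")

print(strip_extra_dash("string-with--extra-dashes-"))  # string-with-extra-dashes
```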
Limitations
  1. Only one document can be defined in the same YAML configuration file.
  2. The collectd notifications are not supported by the collectd Kafka plugin. Due to this limitation, collectd notifications cannot be received by the VES application, and the defined YAML events will not be generated and sent to the VES collector. Please note that the VES YAML format already supports the events definition, and the format is described in this document.
OPNFV Barometer Docker User Guide

The intention of this user guide is to outline how to install and test the Barometer project’s docker images. The OPNFV docker hub contains 5 docker images from the Barometer project:

For description of images please see section Barometer Docker Images Description

For steps to build and run Collectd image please see section Build and Run Collectd Docker Image

For steps to build and run InfluxDB and Grafana images please see section Build and Run InfluxDB and Grafana Docker Images

For steps to build and run VES and Kafka images please see section Build and Run VES and Kafka Docker Images

For overview of running VES application with Kafka please see the VES Application User Guide

Barometer Docker Images Description
Barometer Collectd Image

The barometer collectd docker image gives you a collectd installation that includes all the barometer plugins.

Note

The Dockerfile is available in the docker/barometer-collectd directory in the barometer repo. The Dockerfile builds a CentOS 7 docker image. The container MUST be run as a privileged container.

Collectd is a daemon which collects system performance statistics periodically and provides a variety of mechanisms to publish the collected metrics. It supports more than 90 different input and output plugins. Input plugins retrieve metrics and publish them to the collectd daemon, while output plugins publish the data they receive to an endpoint. Collectd also has infrastructure to support thresholding and notification.
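As an illustration of the configuration style, a minimal collectd fragment that loads one input plugin and one output plugin could look as follows (a sketch only; the Barometer image ships its own complete set of config files, and the server address below is a placeholder):

```
LoadPlugin cpu
LoadPlugin network

# Publish collected metrics to a remote collector
<Plugin network>
  Server "192.168.121.111" "25826"
</Plugin>
```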

The collectd docker image has the following collectd plugins enabled (in addition to the standard collectd plugins):

  • hugepages plugin
  • Open vSwitch events Plugin
  • Open vSwitch stats Plugin
  • mcelog plugin
  • PMU plugin
  • RDT plugin
  • virt
  • SNMP Agent
  • Kafka_write plugin

Plugins and third-party applications in the Barometer repository that are available in the docker image:

  • Open vSwitch PMD stats
  • ONAP VES application
  • gnocchi plugin
  • aodh plugin
  • Legacy/IPMI
InfluxDB + Grafana Docker Images

The Barometer project’s InfluxDB and Grafana docker images are two docker images that store and graph statistics reported by the Barometer collectd docker image. InfluxDB is an open-source time series database which stores the data from collectd for future analysis via Grafana, an open-source metrics analytics and visualisation suite which can be accessed through any browser.

VES + Kafka Docker Images

The Barometer project’s VES application and Kafka docker images are based on a CentOS 7 image. The Kafka docker image has a dependency on Zookeeper: Kafka must be able to connect and register with an instance of Zookeeper that is running on either a local or remote host. Kafka receives and stores metrics received from Collectd. The VES application pulls the latest metrics from Kafka and normalizes them into the VES format for sending to a VES collector. Please see details in the VES Application User Guide.

One Click Install with Ansible
Proxy for package manager on host

Note

This step has to be performed only if the host is behind an HTTP/HTTPS proxy.

The proxy URL has to be set in the dedicated config file:

  1. CentOS - /etc/yum.conf
proxy=http://your.proxy.domain:1234
  2. Ubuntu - /etc/apt/apt.conf
Acquire::http::Proxy "http://your.proxy.domain:1234";

After updating the config file, the apt package lists have to be updated via ‘apt-get update’:

$ sudo apt-get update
Proxy environment variables(for docker and pip)

Note

This step has to be performed only if the host is behind an HTTP/HTTPS proxy.

Configuring a proxy for the packaging system is not enough; some proxy environment variables also have to be set in the system before the ansible scripts are started. Barometer configures the docker proxy automatically via an ansible task as part of the ‘one click install’ process: the user only has to provide the proxy URL using the common shell environment variables, and ansible will automatically configure the proxies for docker (so that it can fetch the barometer images). Other components used by ansible (e.g. pip, which is used for downloading python dependencies) also benefit from having the proxy variables set properly in the system.

Proxy variables used by ansible One Click Install:
  • http_proxy
  • https_proxy
  • ftp_proxy
  • no_proxy

The variables mentioned above have to be visible to the superuser, because most actions involved in the ansible-barometer installation require root privileges. Proxy variables are commonly defined in the ‘/etc/environment’ file, but any other place is fine as long as the variables are visible to commands run via ‘su’.

Sample proxy configuration in /etc/environment:

http_proxy=http://your.proxy.domain:1234
https_proxy=http://your.proxy.domain:1234
ftp_proxy=http://your.proxy.domain:1234
no_proxy=localhost
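Whether the variables are actually exported to child processes (and therefore visible to ansible, docker and pip) can be checked from a shell. A small illustrative check, using a temporary file in place of /etc/environment:

```shell
# Write a sample KEY=VALUE file (a stand-in for /etc/environment)
cat > /tmp/proxy.env <<'EOF'
http_proxy=http://your.proxy.domain:1234
https_proxy=http://your.proxy.domain:1234
EOF

# Load it with auto-export enabled so child processes inherit the variables
set -a
. /tmp/proxy.env
set +a

# Any child process now sees the proxy settings
sh -c 'echo "$http_proxy"'   # -> http://your.proxy.domain:1234
```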
Install Ansible

Note

  • sudo permissions or root access are required to install ansible.
  • the ansible version needs to be 2.4+, because of the usage of import/include statements

The following steps have been verified with Ansible 2.6.3 on Ubuntu 16.04 and 18.04. To install Ansible 2.6.3 on Ubuntu:

$ sudo apt-get install python
$ sudo apt-get install python-pip
$ sudo pip install 'ansible==2.6.3'

The following steps have been verified with Ansible 2.6.3 on CentOS 7.5. To install Ansible 2.6.3 on CentOS:

$ sudo yum install python
$ sudo yum install epel-release
$ sudo yum install python-pip
$ sudo pip install 'ansible==2.6.3'
Clone barometer repo
$ git clone https://gerrit.opnfv.org/gerrit/barometer
$ cd barometer/docker/ansible
Edit inventory file

Edit the inventory file and add hosts: $barometer_dir/docker/ansible/default.inv

[collectd_hosts]
localhost

[collectd_hosts:vars]
install_mcelog=true
insert_ipmi_modules=true

[influxdb_hosts]
localhost

[grafana_hosts]
localhost

[prometheus_hosts]
#localhost

[kafka_hosts]
#localhost

[ves_hosts]
#localhost

Change localhost to different hosts where necessary. Hosts for influxdb and grafana are required only for collectd_service.yml. Hosts for kafka and ves are required only for collectd_ves.yml.

To change the host for kafka, edit kafka_ip_addr in ./roles/config_files/vars/main.yml.
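For example, the relevant entry in ./roles/config_files/vars/main.yml could be set as follows (the address is a placeholder; the file's other contents are omitted):

```yaml
kafka_ip_addr: 192.168.121.111
```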

Additional plugin dependencies

By default ansible will try to fulfill the dependencies for the mcelog and ipmi plugins. For the mcelog plugin it installs the mcelog daemon. For ipmi it tries to insert the ipmi_devintf and ipmi_si kernel modules. This can be changed in the inventory file using the variables install_mcelog and insert_ipmi_modules; both variables are independent:

[collectd_hosts:vars]
install_mcelog=false
insert_ipmi_modules=false

Note

On Ubuntu 18.04, to use the mcelog plugin the user has to install the mcelog daemon manually before running the ansible scripts, as the deb package is not available in the official Ubuntu 18.04 repo. This means that setting install_mcelog to true is ignored.

Configure ssh keys

Generate ssh keys if not present; otherwise move on to the next step.

$ sudo ssh-keygen

Copy the ssh key to all target hosts. This requires providing the root password. The example below is for localhost:

$ sudo ssh-copy-id root@localhost

Verify that the key has been added and that a password is not required to connect:

$ sudo ssh root@localhost

Note

Keys should be added to every target host; [localhost] is only used as an example. For a multinode installation, keys need to be copied to each node: [collectd_hostname], [influxdb_hostname], etc.

Download and run Collectd+Influxdb+Grafana containers

The One Click installation features easy and scalable deployment of Collectd, Influxdb and Grafana containers using an Ansible playbook. The following steps go through more details.

$ sudo ansible-playbook -i default.inv collectd_service.yml

Check that the three containers are running; the output of docker ps should be similar to:

$ sudo docker ps
CONTAINER ID        IMAGE                      COMMAND                  CREATED             STATUS              PORTS               NAMES
a033aeea180d        opnfv/barometer-grafana    "/run.sh"                9 days ago          Up 7 minutes                            bar-grafana
1bca2e4562ab        opnfv/barometer-influxdb   "/entrypoint.sh in..."   9 days ago          Up 7 minutes                            bar-influxdb
daeeb68ad1d5        opnfv/barometer-collectd   "/run_collectd.sh ..."   9 days ago          Up 7 minutes                            bar-collectd

To make some changes when a container is running run:

$ sudo docker exec -ti <CONTAINER ID> /bin/bash

Connect to <host_ip>:3000 with a browser and log into Grafana with admin/admin. For a short introduction, please see the Grafana guide.

The collectd configuration files can be accessed directly on the target system in ‘/opt/collectd/etc/collectd.conf.d’. They can be edited manually to make changes or to enable/disable plugins. If the configuration has been modified, collectd has to be restarted:

$ sudo docker restart bar-collectd
Download collectd+kafka+ves containers

Before running Kafka, an instance of zookeeper is required. See Run Kafka docker image for notes on how to run it. The ‘zookeeper_hostname’ and ‘broker_id’ can be set in ./roles/run_kafka/vars/main.yml.

$ sudo ansible-playbook -i default.inv collectd_ves.yml

Check that the containers are running; the output of docker ps should be similar to:

$ sudo docker ps
CONTAINER ID        IMAGE                      COMMAND                  CREATED             STATUS                     PORTS               NAMES
8b095ad94ea1        zookeeper:3.4.11           "/docker-entrypoin..."   7 minutes ago       Up 7 minutes                                   awesome_jennings
eb8bba3c0b76        opnfv/barometer-ves        "./start_ves_app.s..."   21 minutes ago      Up 6 minutes                                   bar-ves
86702a96a68c        opnfv/barometer-kafka      "/src/start_kafka.sh"    21 minutes ago      Up 6 minutes                                   bar-kafka
daeeb68ad1d5        opnfv/barometer-collectd   "/run_collectd.sh ..."   13 days ago         Up 6 minutes                                   bar-collectd

To make some changes when a container is running run:

$ sudo docker exec -ti <CONTAINER ID> /bin/bash
List of default plugins for collectd container
By default collectd is started with a default configuration which includes the following plugins:
  • csv, contextswitch, cpu, cpufreq, df, disk, ethstat, ipc, irq, load, memory, numa, processes, swap, turbostat, uuid, uptime, exec, hugepages, intel_pmu, ipmi, write_kafka, logfile, mcelog, network, intel_rdt, rrdtool, snmp_agent, syslog, virt, ovs_stats, ovs_events

Some of the plugins are loaded depending on specific system requirements and can be omitted if a dependency is not met; this is the case for:

  • hugepages, ipmi, mcelog, intel_rdt, virt, ovs_stats, ovs_events
List and description of tags used in ansible scripts

Tags can be used to run a specific part of the configuration without running the whole playbook. To run specific parts only:

$ sudo ansible-playbook -i default.inv collectd_service.yml --tags "syslog,cpu,uuid"

To disable some parts or plugins:

$ sudo ansible-playbook -i default.inv collectd_service.yml --skip-tags "en_default_all,syslog,cpu,uuid"

List of available tags:

install_docker
Install docker and required dependencies with package manager.
add_docker_proxy
Configure proxy file for docker service if proxy is set on host environment.
rm_config_dir
Remove collectd config files.
copy_additional_configs
Copy additional configuration files to target system. Path to additional configuration is stored in $barometer_dir/docker/ansible/roles/config_files/vars/main.yml as additional_configs_path.
en_default_all
Set of default read plugins: contextswitch, cpu, cpufreq, df, disk, ethstat, ipc, irq, load, memory, numa, processes, swap, turbostat, uptime.
plugins tags
The following tags can be used to enable/disable plugins: csv, contextswitch, cpu, cpufreq, df, disk, ethstat, ipc, irq, load, memory, numa, processes, swap, turbostat, uptime, exec, hugepages, ipmi, kafka, logfile, mcelogs, network, pmu, rdt, rrdtool, snmp, syslog, virt, ovs_stats, ovs_events, uuid.
Installing Docker

Note

The sections below provide steps for manual installation and configuration of the docker images. They are not necessary if the docker images were installed using the Ansible playbook.

On Ubuntu

Note

  • sudo permissions are required to install docker.
  • These instructions are for Ubuntu 16.10

To install docker:

$ sudo apt-get install curl
$ sudo curl -fsSL https://get.docker.com/ | sh
$ sudo usermod -aG docker <username>
$ sudo systemctl status docker

Replace <username> above with an appropriate user name.

On CentOS

Note

  • sudo permissions are required to install docker.
  • These instructions are for CentOS 7

To install docker:

$ sudo yum remove docker docker-common docker-selinux docker-engine
$ sudo yum install -y yum-utils  device-mapper-persistent-data  lvm2
$ sudo yum-config-manager   --add-repo    https://download.docker.com/linux/centos/docker-ce.repo
$ sudo yum-config-manager --enable docker-ce-edge
$ sudo yum-config-manager --enable docker-ce-test
$ sudo yum install docker-ce
$ sudo usermod -aG docker <username>
$ sudo systemctl status docker

Replace <username> above with an appropriate user name.

Note

If this is the first time you are installing a package from a recently added repository, you will be prompted to accept the GPG key, and the key’s fingerprint will be shown. Verify that the fingerprint is correct, and if so, accept the key. The fingerprint should match 060A 61C5 1B55 8A7F 742B 77AA C52F EB6B 621E 9F35.

Retrieving key from https://download.docker.com/linux/centos/gpg
Importing GPG key 0x621E9F35:
 Userid     : "Docker Release (CE rpm) <docker@docker.com>"
 Fingerprint: 060a 61c5 1b55 8a7f 742b 77aa c52f eb6b 621e 9f35
 From       : https://download.docker.com/linux/centos/gpg
Is this ok [y/N]: y

Manual proxy configuration for docker

Note

This applies for both CentOS and Ubuntu.

If you are behind an HTTP or HTTPS proxy server, you will need to add this configuration in the Docker systemd service file.

  1. Create a systemd drop-in directory for the docker service:
$ sudo mkdir -p /etc/systemd/system/docker.service.d

  2. Create a file called /etc/systemd/system/docker.service.d/http-proxy.conf that adds the HTTP_PROXY environment variable:

[Service]
Environment="HTTP_PROXY=http://proxy.example.com:80/"

Or, if you are behind an HTTPS proxy server, create a file called /etc/systemd/system/docker.service.d/https-proxy.conf that adds the HTTPS_PROXY environment variable:

[Service]
Environment="HTTPS_PROXY=https://proxy.example.com:443/"

Or create a single file with all the proxy configurations: /etc/systemd/system/docker.service.d/proxy.conf

[Service]
Environment="HTTP_PROXY=http://proxy.example.com:80/"
Environment="HTTPS_PROXY=https://proxy.example.com:443/"
Environment="FTP_PROXY=ftp://proxy.example.com:443/"
Environment="NO_PROXY=localhost"
  3. Flush changes:
$ sudo systemctl daemon-reload
  4. Restart Docker:
$ sudo systemctl restart docker
  5. Check docker environment variables:
$ sudo systemctl show --property=Environment docker
Test docker installation

Note

This applies for both CentOS and Ubuntu.

$ sudo docker run hello-world

The output should be something like:

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
5b0f327be733: Pull complete
Digest: sha256:07d5f7800dfe37b8c2196c7b1c524c33808ce2e0f74e7aa00e603295ca9a0972
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:

$ docker run -it ubuntu bash
Build and Run Collectd Docker Image
Download the collectd docker image

If you wish to use a pre-built barometer image, you can pull the barometer image from https://hub.docker.com/r/opnfv/barometer-collectd/

$ docker pull opnfv/barometer-collectd
Build the collectd docker image
$ git clone https://gerrit.opnfv.org/gerrit/barometer
$ cd barometer/docker/barometer-collectd
$ sudo docker build -t opnfv/barometer-collectd --build-arg http_proxy=`echo $http_proxy` \
  --build-arg https_proxy=`echo $https_proxy` -f Dockerfile .

Note

The main directory of the barometer source code (the directory that contains the ‘docker’, ‘docs’, ‘src’ and ‘systems’ sub-directories) will be referred to as <BAROMETER_REPO_DIR>.

Note

In the above docker build command, the http_proxy and https_proxy arguments need to be passed only if the system is behind an HTTP or HTTPS proxy server.

Check the docker images:

$ sudo docker images

Output should contain a barometer-collectd image:

REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
opnfv/barometer-collectd     latest              05f2a3edd96b        3 hours ago         1.2GB
centos                       7                   196e0ce0c9fb        4 weeks ago         197MB
centos                       latest              196e0ce0c9fb        4 weeks ago         197MB
hello-world                  latest              05a3bd381fc2        4 weeks ago         1.84kB
Run the collectd docker image
$ cd <BAROMETER_REPO_DIR>
$ sudo docker run -ti --net=host -v \
`pwd`/src/collectd/collectd_sample_configs:/opt/collectd/etc/collectd.conf.d \
-v /var/run:/var/run -v /tmp:/tmp --privileged opnfv/barometer-collectd

Note

The docker collectd image contains configuration for all the collectd plugins. In the command above we are overriding /opt/collectd/etc/collectd.conf.d by mounting a host directory src/collectd/collectd_sample_configs that contains only the sample configurations we are interested in running. It’s important to do this if you don’t have DPDK or RDT installed on the host. Sample configurations can be found at: https://github.com/opnfv/barometer/tree/master/src/collectd/collectd_sample_configs

Check your docker image is running

sudo docker ps

To make some changes when the container is running run:

sudo docker exec -ti <CONTAINER ID> /bin/bash
Build and Run InfluxDB and Grafana docker images
Overview

The barometer-influxdb image is based on the influxdb:1.3.7 image from the influxdb dockerhub. To view details on the base image, please visit https://hub.docker.com/_/influxdb/. The page includes details of the exposed ports and the configurable environmental variables of the base image.

The barometer-grafana image is based on the grafana:4.6.3 image from the grafana dockerhub. To view details on the base image, please visit https://hub.docker.com/r/grafana/grafana/. The page includes details of the exposed ports and the configurable environmental variables of the base image.

The barometer-grafana image includes a pre-configured source and dashboards to display statistics exposed by the barometer-collectd image. The default datasource is an influxdb database running on localhost, but the address of the influxdb server can be modified when launching the image by setting the environmental variable influxdb_host to the IP or hostname of the host on which the influxdb server is running.

Additional dashboards can be added to barometer-grafana by mapping a volume to /opt/grafana/dashboards. In cases where a folder is mounted to this volume, only files included in this folder will be visible inside barometer-grafana. To ensure all default files are also loaded, please ensure they are included in the volume folder being mounted. Appropriate examples are given in section Run the Grafana docker image.

Download the InfluxDB and Grafana docker images

If you wish to use pre-built barometer project’s influxdb and grafana images, you can pull the images from https://hub.docker.com/r/opnfv/barometer-influxdb/ and https://hub.docker.com/r/opnfv/barometer-grafana/

Note

If your preference is to build images locally please see sections Build InfluxDB Docker Image and Build Grafana Docker Image

$ docker pull opnfv/barometer-influxdb
$ docker pull opnfv/barometer-grafana

Note

If you have pulled the pre-built barometer-influxdb and barometer-grafana images, there is no requirement to complete the steps outlined in sections Build InfluxDB Docker Image and Build Grafana Docker Image, and you can proceed directly to section Run the Influxdb and Grafana Images. If you wish to run the barometer-influxdb and barometer-grafana images via Docker Compose, proceed directly to section Docker Compose.

Build InfluxDB docker image

Build influxdb image from Dockerfile

$ cd barometer/docker/barometer-influxdb
$ sudo docker build -t opnfv/barometer-influxdb --build-arg http_proxy=`echo $http_proxy` \
  --build-arg https_proxy=`echo $https_proxy` -f Dockerfile .

Note

In the above docker build command, the http_proxy and https_proxy arguments need to be passed only if the system is behind an HTTP or HTTPS proxy server.

Check the docker images:

$ sudo docker images

Output should contain an influxdb image:

REPOSITORY                   TAG                 IMAGE ID            CREATED            SIZE
opnfv/barometer-influxdb     latest              1e4623a59fe5        3 days ago         191MB
Build Grafana docker image

Build Grafana image from Dockerfile

$ cd barometer/docker/barometer-grafana
$ sudo docker build -t opnfv/barometer-grafana --build-arg http_proxy=`echo $http_proxy` \
  --build-arg https_proxy=`echo $https_proxy` -f Dockerfile .

Note

In the above docker build command, the http_proxy and https_proxy arguments need to be passed only if the system is behind an HTTP or HTTPS proxy server.

Check the docker images:

$ sudo docker images

Output should contain a grafana image:

REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
opnfv/barometer-grafana      latest              05f2a3edd96b        3 hours ago         1.2GB
Run the Influxdb and Grafana Images
Run the InfluxDB docker image
$ sudo docker run -tid -v /var/lib/influxdb:/var/lib/influxdb -p 8086:8086 -p 25826:25826  opnfv/barometer-influxdb

Check your docker image is running

sudo docker ps

To make some changes when the container is running run:

sudo docker exec -ti <CONTAINER ID> /bin/bash
Run the Grafana docker image

Connecting to an influxdb instance running on local system and adding own custom dashboards

$ cd <BAROMETER_REPO_DIR>
$ sudo docker run -tid -v /var/lib/grafana:/var/lib/grafana -v ${PWD}/docker/barometer-grafana/dashboards:/opt/grafana/dashboards \
  -p 3000:3000 opnfv/barometer-grafana

Connecting to an influxdb instance running on remote system with hostname of someserver and IP address of 192.168.121.111

$ sudo docker run -tid -v /var/lib/grafana:/var/lib/grafana -p 3000:3000 -e \
  influxdb_host=someserver --add-host someserver:192.168.121.111 opnfv/barometer-grafana

Check your docker image is running

sudo docker ps

To make some changes when the container is running run:

sudo docker exec -ti <CONTAINER ID> /bin/bash

Connect to <host_ip>:3000 with a browser and log into grafana: admin/admin

Build and Run VES and Kafka Docker Images
Download VES and Kafka docker images

If you wish to use pre-built barometer project’s VES and kafka images, you can pull the images from https://hub.docker.com/r/opnfv/barometer-ves/ and https://hub.docker.com/r/opnfv/barometer-kafka/

Note

If your preference is to build the images locally, please see sections Build Kafka docker image and Build VES docker image.

$ docker pull opnfv/barometer-kafka
$ docker pull opnfv/barometer-ves

Note

If you have pulled the pre-built images, there is no requirement to complete the steps outlined in sections Build Kafka docker image and Build VES docker image, and you can proceed directly to section Run Kafka docker image. If you wish to run the docker images via Docker Compose, proceed directly to section Docker Compose.

Build Kafka docker image

Build Kafka docker image:

$ cd barometer/docker/barometer-kafka
$ sudo docker build -t opnfv/barometer-kafka --build-arg http_proxy=`echo $http_proxy` \
  --build-arg https_proxy=`echo $https_proxy` -f Dockerfile .

Note

In the above docker build command, the http_proxy and https_proxy arguments need to be passed only if the system is behind an HTTP or HTTPS proxy server.

Check the docker images:

$ sudo docker images

Output should contain a barometer image:

REPOSITORY                TAG                 IMAGE ID            CREATED             SIZE
opnfv/barometer-kafka     latest              05f2a3edd96b        3 hours ago         1.2GB
Build VES docker image

Build VES application docker image:

$ cd barometer/docker/barometer-ves
$ sudo docker build -t opnfv/barometer-ves --build-arg http_proxy=`echo $http_proxy` \
  --build-arg https_proxy=`echo $https_proxy` -f Dockerfile .

Note

In the above docker build command, the http_proxy and https_proxy arguments need to be passed only if the system is behind an HTTP or HTTPS proxy server.

Check the docker images:

$ sudo docker images

Output should contain a barometer image:

REPOSITORY                TAG                 IMAGE ID            CREATED             SIZE
opnfv/barometer-ves       latest              05f2a3edd96b        3 hours ago         1.2GB
Run Kafka docker image

Note

Before running Kafka, an instance of Zookeeper must be running for the Kafka broker to register with. Zookeeper can be running locally or on a remote platform. Kafka’s broker_id and the address of its Zookeeper instance can be configured by setting values for the environmental variables ‘broker_id’ and ‘zookeeper_node’. In instances where ‘broker_id’ and/or ‘zookeeper_node’ is not set, the default settings of broker_id=0 and zookeeper_node=localhost are used. In instances where Zookeeper is running on the same node as Kafka and there is a one-to-one relationship between Zookeeper and Kafka, the default settings can be used. The docker argument add-host adds a hostname and IP address to the /etc/hosts file in the container.

Run zookeeper docker image:

$ sudo docker run -tid --net=host -p 2181:2181 zookeeper:3.4.11

Run the kafka docker image, connecting to a zookeeper instance running on the same node with a 1:1 relationship:

$ sudo docker run -tid --net=host -p 9092:9092 opnfv/barometer-kafka

Run the kafka docker image, connecting to a zookeeper instance running on a node with IP address 192.168.121.111, using a broker ID of 1:

$ sudo docker run -tid --net=host -p 9092:9092 --env broker_id=1 --env zookeeper_node=zookeeper --add-host \
  zookeeper:192.168.121.111 opnfv/barometer-kafka
Run VES Application docker image

Note

The VES application uses the configuration file ves_app_config.conf from the directory barometer/3rd_party/collectd-ves-app/ves_app/config/ and the host.yaml file from barometer/3rd_party/collectd-ves-app/ves_app/yaml/ by default. If you wish to use a custom config file, it should be mounted to the mount point /opt/ves/config/ves_app_config.conf. To use an alternative yaml file from the folder barometer/3rd_party/collectd-ves-app/ves_app/yaml, the name of the yaml file to use should be passed as an additional command argument. If you wish to use a custom yaml file, the file should be mounted to the mount point /opt/ves/yaml/. Please see the examples below.

Run VES docker image with default configuration

$ sudo docker run -tid --net=host opnfv/barometer-ves

Run the VES docker image with the guest.yaml file from barometer/3rd_party/collectd-ves-app/ves_app/yaml/:

$ sudo docker run -tid --net=host opnfv/barometer-ves guest.yaml

Run the VES docker image using custom config and yaml files. In the example below, the yaml/ folder contains a file named custom.yaml:

$ sudo docker run -tid --net=host -v ${PWD}/custom.config:/opt/ves/config/ves_app_config.conf \
  -v ${PWD}/yaml/:/opt/ves/yaml/ opnfv/barometer-ves custom.yaml
Build and Run LocalAgent and Redis Docker Images
Download LocalAgent docker images

If you wish to use pre-built barometer project’s LocalAgent images, you can pull the images from https://hub.docker.com/r/opnfv/barometer-localagent/

Note

If your preference is to build the images locally, please see section Build LocalAgent docker image.

$ docker pull opnfv/barometer-localagent

Note

If you have pulled the pre-built images, there is no requirement to complete the steps outlined in section Build LocalAgent docker image, and you can proceed directly to section Run LocalAgent docker image. If you wish to run the docker images via Docker Compose, proceed directly to section Docker Compose.

Build LocalAgent docker image

Build LocalAgent docker image:

$ cd barometer/docker/barometer-dma
$ sudo docker build -t opnfv/barometer-dma --build-arg http_proxy=`echo $http_proxy` \
  --build-arg https_proxy=`echo $https_proxy` -f Dockerfile .

Note

In the above docker build command, the http_proxy and https_proxy arguments need to be passed only if the system is behind an HTTP or HTTPS proxy server.

Check the docker images:

$ sudo docker images

Output should contain a barometer image:

REPOSITORY                   TAG                 IMAGE ID            CREATED             SIZE
opnfv/barometer-dma          latest              2f14fbdbd498        3 hours ago         941 MB
Run Redis docker image

Note

Before running LocalAgent, Redis must be running.

Run Redis docker image:

$ sudo docker run -tid -p 6379:6379 --name barometer-redis redis

Check your docker image is running

sudo docker ps
Run LocalAgent docker image

Run LocalAgent docker image with default configuration

$ cd barometer/docker/barometer-dma
$ sudo mkdir /etc/barometer-dma
$ sudo cp ../../src/dma/examples/config.toml /etc/barometer-dma/
$ sudo vi /etc/barometer-dma/config.toml
(edit amqp_password and os_password: the OpenStack admin password)

$ sudo su -
(When there is no key for SSH access authentication)
# ssh-keygen
(Press Enter until done)
(Backup if necessary)
# cp ~/.ssh/authorized_keys ~/.ssh/authorized_keys_org
# cat ~/.ssh/authorized_keys_org ~/.ssh/id_rsa.pub \
  > ~/.ssh/authorized_keys
# exit

$ sudo docker run -tid --net=host --name server \
  -v /etc/barometer-dma:/etc/barometer-dma \
  -v /root/.ssh/id_rsa:/root/.ssh/id_rsa \
  -v /etc/collectd/collectd.conf.d:/etc/collectd/collectd.conf.d \
  opnfv/barometer-dma /server

$ sudo docker run -tid --net=host --name infofetch \
  -v /etc/barometer-dma:/etc/barometer-dma \
  -v /var/run/libvirt:/var/run/libvirt \
  opnfv/barometer-dma /infofetch

(Execute when installing the threshold evaluation binary)
$ sudo docker cp infofetch:/threshold ./
$ sudo ln -s ${PWD}/threshold /usr/local/bin/
Docker Compose
Install docker-compose

On the node where you want to run the influxdb + grafana containers, or the node where you want to run the VES app, zookeeper and Kafka containers together:

Note

The default configuration for all these containers is to run on the localhost. If this is not the model you want to use then please make the appropriate configuration changes before launching the docker containers.

  1. Start by installing docker compose
$ sudo curl -L https://github.com/docker/compose/releases/download/1.17.0/docker-compose-`uname -s`-`uname -m` -o /usr/bin/docker-compose

Note

Use the latest Compose release number in the download command. The above command is an example, and it may become out-of-date. To ensure you have the latest version, check the Compose repository release page on GitHub.

  2. Apply executable permissions to the binary:
$ sudo chmod +x /usr/bin/docker-compose
  3. Test the installation:
$ sudo docker-compose --version
Run the InfluxDB and Grafana containers using docker compose

Launch containers:

$ cd barometer/docker/compose/influxdb-grafana/
$ sudo docker-compose up -d

Check your docker images are running

$ sudo docker ps

Connect to <host_ip>:3000 with a browser and log into grafana: admin/admin
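The compose file in that directory describes the two services together. An illustrative sketch of such a file, using the ports and the influxdb_host variable described earlier (the shipped file may differ in detail):

```yaml
version: '2'
services:
  influxdb:
    image: opnfv/barometer-influxdb
    volumes:
      - /var/lib/influxdb:/var/lib/influxdb
    ports:
      - "8086:8086"
      - "25826:25826"
  grafana:
    image: opnfv/barometer-grafana
    environment:
      # Point grafana's default datasource at the influxdb service
      - influxdb_host=influxdb
    ports:
      - "3000:3000"
```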

Run the Kafka, zookeeper and VES containers using docker compose

Launch containers:

$ cd barometer/docker/compose/ves/
$ sudo docker-compose up -d

Check your docker images are running

$ sudo docker ps

Clover

Compass4Nfv

Compass4NFV Development Overview
Introduction of Containerized Compass

Containerized Compass uses five compass containers instead of a single VM.

Each container stands for a micro-service, and the compass-core function is split into these five micro-services:

  • Compass-deck : RESTful API and DB Handlers for Compass
  • Compass-tasks : Registered tasks and MQ modules for Compass
  • Compass-cobbler : Cobbler container for Compass
  • Compass-db : Database for Compass
  • Compass-mq : Message Queue for Compass

Compass4nfv has several containers to satisfy OPNFV requirements:

  • Compass-tasks-osa : compass-task’s adapter for deploying OpenStack via OpenStack-Ansible
  • Compass-tasks-k8s : compass-task’s adapter for deploying Kubernetes
  • Compass-repo-osa-ubuntu : optional container to support OPNFV offline installation via OpenStack-Ansible
  • Compass-repo-osa-centos : optional container to support OPNFV offline installation via OpenStack-Ansible

The picture below shows the new architecture of compass4nfv:

New Architecture of Compass4nfv

Fig 1. New Architecture of Compass4nfv

Compass4nfv Installation Instructions
1. Abstract

This document describes how to install the Fraser release of OPNFV when using Compass4nfv as a deployment tool, covering its limitations, dependencies and required system resources.

2. Features
2.1. Supported Openstack Version and OS
                   OS only   OpenStack   OpenStack   OpenStack   OpenStack   OpenStack
                             Liberty     Mitaka      Newton      Ocata       Pike
  CentOS 7         yes       yes         yes         yes         no          yes
  Ubuntu trusty    yes       yes         yes         no          no          no
  Ubuntu xenial    yes       no          yes         yes         yes         yes
2.2. Supported Openstack Flavor and Features
                           OpenStack   OpenStack   OpenStack   OpenStack   OpenStack
                           Liberty     Mitaka      Newton      Ocata       Pike
  Virtual Deployment       Yes         Yes         Yes         Yes         Yes
  Baremetal Deployment     Yes         Yes         Yes         Yes         Yes
  HA                       Yes         Yes         Yes         Yes         Yes
  Ceph                     Yes         Yes         Yes         Yes         Yes
  SDN ODL/ONOS             Yes         Yes         Yes         Yes*        Yes*
  Compute Node Expansion   Yes         Yes         Yes         No          No
  Multi-Nic Support        Yes         Yes         Yes         Yes         Yes
  Boot Recovery            Yes         Yes         Yes         Yes         Yes
  SFC                      No          No          Yes         Yes         Yes
  • ONOS will not be supported in this release.
3. Compass4nfv configuration

This document provides guidelines on how to install and configure the Fraser release of OPNFV when using Compass as a deployment tool, including the required software and hardware configurations.

Installation and configuration of host OS, OpenStack, OpenDaylight, ONOS, Ceph etc. can be supported by Compass on Virtual nodes or Bare Metal nodes.

The audience of this document is assumed to have good knowledge of networking and Unix/Linux administration.

3.1. Preconditions

Before starting the installation of the Fraser release of OPNFV, some planning must be done.

3.1.1. Retrieving the installation tarball

First of all, the installation tarball is needed for deploying your OPNFV environment; it includes the packages of Compass, OpenStack, OpenDaylight, ONOS and so on.

The stable release tarball can be retrieved via OPNFV software download page

The daily build tarball can be retrieved via OPNFV artifacts repository:

http://artifacts.opnfv.org/compass4nfv.html

NOTE: Search the keyword “compass4nfv/Fraser” to locate the tarball.

E.g. compass4nfv/fraser/opnfv-2017-03-29_08-55-09.tar.gz

The tarball name includes its build time, so you can pick a daily tarball according to the build time. The git URL and SHA1 of Compass4nfv are recorded in the properties files; using these, the corresponding deployment scripts can be retrieved.
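As an aside, the build timestamp embedded in a daily tarball name can be recovered programmatically. The sketch below is illustrative only (the helper is hypothetical, not part of Compass4nfv):

```python
import re
from datetime import datetime

def tarball_build_time(name):
    """Extract the build timestamp from a daily tarball name,
    e.g. 'opnfv-2017-03-29_08-55-09.tar.gz'."""
    m = re.search(r'(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.tar\.gz$', name)
    if not m:
        raise ValueError('unrecognized tarball name: %s' % name)
    return datetime.strptime(m.group(1), '%Y-%m-%d_%H-%M-%S')

print(tarball_build_time('opnfv-2017-03-29_08-55-09.tar.gz'))  # 2017-03-29 08:55:09
```

This makes it easy to sort daily tarballs by build time when choosing which one to download.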

3.1.2. Getting the deployment scripts

To retrieve the Compass4nfv repository on the Jumphost, use the following command:

NOTE: PLEASE DO NOT GIT CLONE COMPASS4NFV IN THE ROOT DIRECTORY (INCLUDING SUBFOLDERS).

To get the stable/fraser release, you can use the following command:

  • git checkout Fraser.1.0
3.2. Setup Requirements

If you have only one bare metal server, a virtual deployment is recommended; with three or more servers, a bare metal deployment is recommended. The minimum number of servers for a bare metal deployment is three: one for the JumpServer (Jumphost), one for a controller, and one for a compute node.

3.2.1. Jumphost Requirements

The Jumphost requirements are outlined below:

  1. Ubuntu 14.04 (Pre-installed).
  2. Root access.
  3. libvirt virtualization support.
  4. Minimum 2 NICs.
    • PXE installation Network (Receiving PXE request from nodes and providing OS provisioning)
    • IPMI Network (Nodes power control and set boot PXE first via IPMI interface)
    • External Network (Optional: Internet access)
  5. 16 GB of RAM for a Bare Metal deployment, 64 GB of RAM for a Virtual deployment.
  6. CPU cores: 32, Memory: 64 GB, Hard Disk: 500 GB, (Virtual Deployment needs 1 TB Hard Disk)
3.3. Bare Metal Node Requirements

Bare Metal nodes require:

  1. IPMI enabled on OOB interface for power control.
  2. BIOS boot priority should be PXE first then local hard disk.
  3. Minimum 3 NICs.
    • PXE installation Network (Broadcasting PXE request)
    • IPMI Network (Receiving IPMI command from Jumphost)
    • External Network (OpenStack mgmt/external/storage/tenant network)
3.4. Network Requirements

Network requirements include:

  1. No DHCP or TFTP server running on networks used by OPNFV.
  2. 2-6 separate networks with connectivity between Jumphost and nodes.
    • PXE installation Network
    • IPMI Network
    • Openstack mgmt Network*
    • Openstack external Network*
    • Openstack tenant Network*
    • Openstack storage Network*
  3. Lights out OOB network access from Jumphost with IPMI node enabled (Bare Metal deployment only).
  4. External network has Internet access, meaning a gateway and DNS availability.

The networks marked with (*) can share one NIC (default configuration) or each use an exclusive NIC (reconfigured in network.yml).

3.5. Execution Requirements (Bare Metal Only)

In order to execute a deployment, one must gather the following information:

  1. IPMI IP addresses of the nodes.
  2. IPMI login information for the nodes (user/pass).
  3. MAC address of Control Plane / Provisioning interfaces of the Bare Metal nodes.
3.6. Configurations

There are three configuration files a user needs to modify for a cluster deployment: network_cfg.yaml for the OpenStack networks on hosts; the dha file for host roles, IPMI credentials and host NIC identification (MAC addresses); and deploy.sh for the OS and OpenStack version.

4. Configure network

The network_cfg.yaml file describes the network configuration for OpenStack on hosts. It specifies the host network mapping and the IP assignment of the networks to be installed on hosts. Compass4nfv includes a sample at compass4nfv/deploy/conf/network_cfg.yaml

There are three OpenStack networks to be installed: external, mgmt and storage. These three networks can be shared on one physical NIC or placed on separate NICs (multi-nic). The sample included in compass4nfv uses one NIC. For a multi-nic setup, see the multi-nic configuration.

4.1. Configure openstack network

**! All interface names in network_cfg.yaml must be identified in the dha file by MAC address !**

Compass4nfv will install the networks on each host as described in this configuration. It looks up each physical NIC on the host by the MAC address given in the dha file and renames that NIC to the configured name. Therefore, any network interface name that is not identified by a MAC address in the dha file will not be installed correctly, as compass4nfv cannot find the NIC.

Configure provider network

provider_net_mappings:
  - name: br-prv
    network: physnet
    interface: eth1
    type: ovs
    role:
      - controller
      - compute

The external NIC in the dha file must be named eth1 and identified by its MAC address. If you use a different interface name in the dha file, change eth1 to that name here. Note: you cannot use eth0 as the external interface name, since the install/PXE network is named as such.
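The naming rule above can be sanity-checked before launching a deployment. The sketch below is a hypothetical helper (not part of Compass4nfv) that verifies every interface named in provider_net_mappings appears in each host's dha interfaces list and is not eth0:

```python
def check_provider_interfaces(provider_net_mappings, dha_hosts):
    """Verify every provider-network interface is identified by MAC in the
    dha file and is not eth0 (reserved for the install/PXE network)."""
    errors = []
    for mapping in provider_net_mappings:
        nic = mapping['interface']
        if nic == 'eth0':
            errors.append('%s: eth0 is reserved for PXE' % mapping['name'])
        for host in dha_hosts:
            # each dha 'interfaces' entry is a single-key dict: {name: mac}
            named = [k for entry in host.get('interfaces', []) for k in entry]
            if nic not in named:
                errors.append('%s: %s missing on host %s'
                              % (mapping['name'], nic, host['name']))
    return errors

mappings = [{'name': 'br-prv', 'interface': 'eth1'}]
hosts = [{'name': 'host1', 'interfaces': [{'eth1': 'F8:4A:BF:55:A2:8E'}]},
         {'name': 'host2', 'interfaces': [{'eth1': 'D8:49:0B:DA:5A:B8'}]}]
print(check_provider_interfaces(mappings, hosts))  # []
```

An empty list means every provider interface is resolvable by MAC on every host.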

Configure the OpenStack mgmt & storage networks:

sys_intf_mappings:
  - name: mgmt
    interface: eth1
    vlan_tag: 101
    type: vlan
    role:
      - controller
      - compute
  - name: storage
    interface: eth1
    vlan_tag: 102
    type: vlan
    role:
      - controller
      - compute

Change the vlan_tag of mgmt and storage to the corresponding VLAN tags configured on the switch.

Note: for a virtual deployment, there is no need to modify the mgmt & storage networks.

If using the multi-nic feature, i.e. a separate NIC for the mgmt or storage network, you need to change the interface name to the desired NIC name (it must match the dha file). Please see the multi-nic configuration.

4.2. Assign IP address to networks

The ip_settings section specifies the IP assignment for the OpenStack networks.

You can use the default IP ranges for the mgmt & storage networks.

For the external network:

- name: external
   ip_ranges:
   - - "192.168.50.210"
     - "192.168.50.220"
   cidr: "192.168.50.0/24"
   gw: "192.168.50.1"
   role:
     - controller
     - compute

Provide at least as many available IPs in the external IP range as there are hosts (these IPs will be assigned to the hosts). Provide the actual CIDR and gateway in the cidr and gw fields.
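The sizing rule above can be verified with Python's standard ipaddress module. This is an illustrative sketch, not part of the installer:

```python
import ipaddress

def external_range_ok(ip_ranges, cidr, host_count):
    """Return True if the external ip_ranges lie inside cidr and provide
    at least one address per host."""
    net = ipaddress.ip_network(cidr)
    total = 0
    for start, end in ip_ranges:
        lo, hi = ipaddress.ip_address(start), ipaddress.ip_address(end)
        # every range endpoint must fall inside the subnet and be ordered
        if lo not in net or hi not in net or lo > hi:
            return False
        total += int(hi) - int(lo) + 1
    return total >= host_count

# values from the sample above: 11 addresses, enough for 3 hosts
print(external_range_ok([("192.168.50.210", "192.168.50.220")],
                        "192.168.50.0/24", 3))  # True
```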

Configure the public IP for the Horizon dashboard:

public_vip:
 ip: 192.168.50.240
 netmask: "24"
 interface: external

Provide an external IP in the ip field. This IP cannot be within the IP range assigned to the external network configured in the previous section. It will be used as the Horizon address.
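This constraint can also be checked with the standard ipaddress module. The helper below is an illustrative sketch (not part of Compass4nfv) confirming the VIP lies inside the external CIDR but outside the range handed to hosts:

```python
import ipaddress

def public_vip_ok(vip_ip, cidr, ip_ranges):
    """The Horizon VIP must be inside the external subnet but outside
    the range assigned to hosts."""
    vip = ipaddress.ip_address(vip_ip)
    if vip not in ipaddress.ip_network(cidr):
        return False
    for start, end in ip_ranges:
        if ipaddress.ip_address(start) <= vip <= ipaddress.ip_address(end):
            return False
    return True

# values from the samples above: .240 is in the /24 but outside .210-.220
print(public_vip_ok("192.168.50.240", "192.168.50.0/24",
                    [("192.168.50.210", "192.168.50.220")]))  # True
```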

See section 6.2 (Virtual) and 7.2 (Bare Metal) for graphs illustrating the network topology.

5. Installation on Bare Metal
5.1. Nodes Configuration (Bare Metal Deployment)

The file below is the inventory template for the deployment nodes:

“compass4nfv/deploy/conf/hardware_environment/huawei-pod1/dha.yml”

The “dha.yml” is a collective name for “os-nosdn-nofeature-ha.yml”, “os-ocl-nofeature-ha.yml”, “os-odl_l2-moon-ha.yml”, etc.

You can write your own IPMI IP/User/Password/MAC address/roles by referring to it.

  • name – host name for the deployment node after installation.
  • ipmiVer – IPMI interface version supported by the deployment node; IPMI 1.0 or IPMI 2.0 is available.
  • ipmiIP – IPMI IP address of the deployment node. Make sure it can be reached from the Jumphost.
  • ipmiUser – IPMI username for the deployment node.
  • ipmiPass – IPMI password for the deployment node.
  • mac – MAC address of the deployment node’s PXE NIC.
  • interfaces – host NICs, renamed according to their MAC addresses during OS provisioning.
  • roles – components deployed.

Set TYPE/FLAVOR and POWER TOOL

E.g.

TYPE: baremetal
FLAVOR: cluster
POWER_TOOL: ipmitool

Set ipmiUser/ipmiPass and ipmiVer

E.g.

ipmiUser: USER
ipmiPass: PASSWORD
ipmiVer: '2.0'

Assignment of different roles to servers

E.g. Openstack only deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute

NOTE: THE ‘ha’ ROLE MUST BE SELECTED WITH CONTROLLERS, EVEN IF THERE IS ONLY ONE CONTROLLER NODE.
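The rule in this note can be expressed as a small validation. The helper below is an illustrative sketch (hypothetical, not part of Compass4nfv):

```python
def ha_role_ok(hosts):
    """Every host carrying the 'controller' role must also carry 'ha',
    even in a single-controller cluster."""
    return all('ha' in h['roles'] for h in hosts
               if 'controller' in h['roles'])

hosts = [
    {'name': 'host1', 'roles': ['controller', 'ha']},
    {'name': 'host2', 'roles': ['compute']},
]
print(ha_role_ok(hosts))  # True
```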

E.g. Openstack and ceph deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha
      - ceph-adm
      - ceph-mon

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute
      - ceph-osd

E.g. Openstack and ODL deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha
      - odl

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute

E.g. Openstack and ONOS deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha
      - onos

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute
5.2. Network Configuration (Bare Metal Deployment)

Before deployment, some network configuration needs to be checked based on your network topology. The Compass4nfv default network configuration file is “compass4nfv/deploy/conf/hardware_environment/huawei-pod1/network.yml”. This file is an example; you can customize it according to your specific network environment.

In this network.yml there are several config sections, listed below (corresponding to the order of the config file):

5.2.1. Provider Mapping
  • name – provider network name.
  • network – defaults to physnet; do not change it.
  • interfaces – the NIC or bridge the network attaches to.
  • type – the type of the NIC or bridge (vlan for a NIC and ovs for a bridge).
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
5.2.2. System Interface
  • name – network name.
  • interfaces – the NIC or bridge the network attaches to.
  • vlan_tag – if type is vlan, add this tag before the ‘type’ tag.
  • type – the type of the NIC or bridge (vlan for a NIC and ovs for a bridge).
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
5.2.3. IP Settings
  • name – network name, corresponding one-to-one with the network names in the System Interface section.
  • ip_ranges – the IP address range provided for this network.
  • cidr – the IPv4 network address with its associated routing prefix and subnet mask.
  • gw – add this line only if the network is external.
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
5.2.4. Internal VIP (virtual or proxy IP)
  • ip – virtual or proxy IP address; must be in the same subnet as the mgmt network but not within the mgmt network’s IP range.
  • netmask – the netmask length.
  • interface – usually mgmt.
5.2.5. Public VIP
  • ip – virtual or proxy IP address; must be in the same subnet as the external network but not within the external network’s IP range.
  • netmask – the netmask length.
  • interface – usually external.
5.2.6. Public Network
  • enable – must be True (if False, you need to set up the provider network manually).
  • network – leave it as ext-net.
  • type – the type of the ext-net above, such as flat or vlan.
  • segment_id – when the type is vlan, this should be the VLAN ID.
  • subnet – leave it as ext-subnet.
  • provider_network – leave it as physnet.
  • router – leave it as router-ext.
  • enable_dhcp – must be False.
  • no_gateway – must be False.
  • external_gw – the same as gw in ip_settings.
  • floating_ip_cidr – the CIDR for floating IPs; see the explanation in ip_settings.
  • floating_ip_start – defines the floating IP range together with floating_ip_end (this range must not be included in the external IP range configured in the ip_settings section).
  • floating_ip_end – defines the floating IP range together with floating_ip_start.
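The floating IP constraint in the last two items is a standard interval-overlap test. The helper below is an illustrative sketch, not part of Compass4nfv:

```python
import ipaddress

def floating_range_ok(floating_start, floating_end, external_ip_ranges):
    """The floating IP range must not overlap the external range
    configured in ip_settings."""
    f_lo = ipaddress.ip_address(floating_start)
    f_hi = ipaddress.ip_address(floating_end)
    if f_lo > f_hi:
        return False
    for start, end in external_ip_ranges:
        lo, hi = ipaddress.ip_address(start), ipaddress.ip_address(end)
        if f_lo <= hi and lo <= f_hi:   # interval overlap test
            return False
    return True

# a floating range just above the sample external range .210-.220
print(floating_range_ok("192.168.50.221", "192.168.50.230",
                        [("192.168.50.210", "192.168.50.220")]))  # True
```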

The following figure shows the default network configuration.

+--+                          +--+     +--+
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+  Jumphost  +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+    host1   +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+    host2   +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+    host3   +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |                          |  |     |  |
+-++                          ++-+     +-++
  ^                            ^         ^
  |                            |         |
  |                            |         |
+-+-------------------------+  |         |
|      External Network     |  |         |
+---------------------------+  |         |
       +-----------------------+---+     |
       |       IPMI Network        |     |
       +---------------------------+     |
               +-------------------------+-+
               | PXE(Installation) Network |
               +---------------------------+

The following figure shows the interfaces and NICs of the JumpHost and deployment nodes in the huawei-pod1 network configuration (by default, one NIC for the OpenStack networks).

Single nic scenario

Fig 1. Single nic scenario

The following figure shows the interfaces and NICs of the JumpHost and deployment nodes in the intel-pod8 network configuration (the OpenStack networks are separated across multiple NICs).

Multiple nics scenario

Fig 2. Multiple nics scenario

5.3. Start Deployment (Bare Metal Deployment)
  1. Edit deploy.sh
1.1. Set OS version for deployment nodes.
Compass4nfv supports ubuntu and centos based openstack newton.

E.g.

# Set OS version for target hosts
# Ubuntu16.04 or CentOS7
export OS_VERSION=xenial
or
export OS_VERSION=centos7

1.2. Set tarball corresponding to your code

E.g.

# Set ISO image corresponding to your code
export ISO_URL=file:///home/compass/compass4nfv.tar.gz
1.3. Set the hardware deploy jumpserver PXE NIC (e.g. set eth1).
You do not need to set it for a virtual deployment.

E.g.

# Set hardware deploy jumpserver PXE NIC
# you need to comment it out for a virtual deploy
export INSTALL_NIC=eth1

1.4. Set scenario that you want to deploy

E.g.

nosdn-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-nosdn-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

odl_l2-moon scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-odl_l2-moon-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

odl_l2-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-odl_l2-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

odl_l3-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-odl_l3-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

odl-sfc deploy scenario sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-odl-sfc-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml
  2. Run deploy.sh
./deploy.sh
6. Installation on virtual machines
6.1. Quick Start

Only one command is needed to try a virtual deployment, if you have Internet access. Just paste it and run.

curl https://raw.githubusercontent.com/opnfv/compass4nfv/stable/fraser/quickstart.sh | bash

If you want to deploy noha with 1 controller and 1 compute, run the following command:

export SCENARIO=os-nosdn-nofeature-noha.yml
curl https://raw.githubusercontent.com/opnfv/compass4nfv/stable/fraser/quickstart.sh | bash
6.2. Nodes Configuration (Virtual Deployment)
6.2.1. virtual machine setting
  • VIRT_CPUS – the number of CPUs allocated per virtual machine.
  • VIRT_MEM – the memory size(MB) allocated per virtual machine.
  • VIRT_DISK – the disk size allocated per virtual machine.
export VIRT_CPUS=${VIRT_CPUS:-4}
export VIRT_MEM=${VIRT_MEM:-16384}
export VIRT_DISK=${VIRT_DISK:-200G}
6.2.2. roles setting

The file below is the inventory template for the deployment nodes:

”./deploy/conf/vm_environment/huawei-virtual1/dha.yml”

The “dha.yml” is a collective name for “os-nosdn-nofeature-ha.yml”, “os-ocl-nofeature-ha.yml”, “os-odl_l2-moon-ha.yml”, etc.

You can write your own names/roles by referring to it.

  • name – host name for the deployment node after installation.
  • roles – components deployed.

Set TYPE and FLAVOR

E.g.

TYPE: virtual
FLAVOR: cluster

Assignment of different roles to servers

E.g. Openstack only deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha

  - name: host2
    roles:
      - compute

NOTE: IF YOU SELECT MULTIPLE NODES AS CONTROLLERS, THE ‘ha’ ROLE MUST BE SELECTED, TOO.

E.g. Openstack and ceph deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha
      - ceph-adm
      - ceph-mon

  - name: host2
    roles:
      - compute
      - ceph-osd

E.g. Openstack and ODL deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha
      - odl

  - name: host2
    roles:
      - compute

E.g. Openstack and ONOS deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha
      - onos

  - name: host2
    roles:
      - compute
6.3. Network Configuration (Virtual Deployment)

The same as for the Bare Metal Deployment.

6.4. Start Deployment (Virtual Deployment)
  1. Edit deploy.sh
1.1. Set OS version for deployment nodes.
Compass4nfv supports ubuntu and centos based openstack pike.

E.g.

# Set OS version for target hosts
# Ubuntu16.04 or CentOS7
export OS_VERSION=xenial
or
export OS_VERSION=centos7

1.2. Set ISO image corresponding to your code

E.g.

# Set ISO image corresponding to your code
export ISO_URL=file:///home/compass/compass4nfv.tar.gz

1.3. Set scenario that you want to deploy

E.g.

nosdn-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-nosdn-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

odl_l2-moon scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-odl_l2-moon-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

odl_l2-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-odl_l2-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

odl_l3-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-odl_l3-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

odl-sfc deploy scenario sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-odl-sfc-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml
  2. Run deploy.sh
./deploy.sh
7. K8s introduction
7.1. Kubernetes Architecture

Currently, Compass can deploy Kubernetes as the NFVI in 3+2 mode by default.

The following figure shows a typical architecture of Kubernetes.

K8s architecture

Fig 3. K8s architecture

7.1.1. Kube-apiserver

Kube-apiserver exposes the Kubernetes API. It is the front-end for the Kubernetes control plane. It is designed to scale horizontally, that is, it scales by deploying more instances.

7.1.2. Etcd

Etcd is used as Kubernetes’ backing store. All cluster data is stored here. Always have a backup plan for etcd’s data for your Kubernetes cluster.

7.1.3. Kube-controller-manager

Kube-controller-manager runs controllers, which are the background threads that handle routine tasks in the cluster. Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.

These controllers include:

  • Node Controller: Responsible for noticing and responding when nodes go down.
  • Replication Controller: Responsible for maintaining the correct number of pods for every replication controller object in the system.
  • Endpoints Controller: Populates the Endpoints object (that is, joins Services & Pods).
  • Service Account & Token Controllers: Create default accounts and API access tokens for new namespaces.
7.1.4. kube-scheduler

Kube-scheduler watches newly created pods that have no node assigned, and selects a node for them to run on.

7.1.5. Kubelet

Kubelet is the primary node agent. It watches for pods that have been assigned to its node (either by apiserver or via local configuration file) and:

  • Mounts the pod’s required volumes.
  • Downloads the pod’s secrets.
  • Runs the pod’s containers via docker (or, experimentally, rkt).
  • Periodically executes any requested container liveness probes.
  • Reports the status of the pod back to the rest of the system, by creating a mirror pod if necessary.
  • Reports the status of the node back to the rest of the system.
7.1.6. Kube-proxy

Kube-proxy enables the Kubernetes service abstraction by maintaining network rules on the host and performing connection forwarding.

7.1.7. Docker

Docker is used for running containers.

7.1.8. POD

A pod is a collection of containers and their storage inside a node of a Kubernetes cluster. It is possible to create a pod with multiple containers inside it, for example keeping a database container and a data container in the same pod.

7.2. Understand Kubernetes Networking in Compass configuration

The following figure shows the Kubernetes Networking in Compass configuration.

Kubernetes Networking in Compass

Fig 4. Kubernetes Networking in Compass

8. Installation of K8s on virtual machines
8.1. Quick Start

Only one command is needed to try a virtual deployment, if you have Internet access. Just paste it and run.

curl https://raw.githubusercontent.com/opnfv/compass4nfv/master/quickstart_k8s.sh | bash

If you want to deploy noha with 1 controller and 1 compute, run the following command:

export SCENARIO=k8-nosdn-nofeature-noha.yml
export VIRT_NUMBER=2
curl https://raw.githubusercontent.com/opnfv/compass4nfv/stable/fraser/quickstart_k8s.sh | bash
9. Installation of K8s on Bare Metal
9.1. Nodes Configuration (Bare Metal Deployment)

The file below is the inventory template for the deployment nodes:

“compass4nfv/deploy/conf/hardware_environment/huawei-pod1/k8-nosdn-nofeature-ha.yml”

You can write your own IPMI IP/User/Password/MAC address/roles by referring to it.

  • name – host name for the deployment node after installation.
  • ipmiVer – IPMI interface version supported by the deployment node; IPMI 1.0 or IPMI 2.0 is available.
  • ipmiIP – IPMI IP address of the deployment node. Make sure it can be reached from the Jumphost.
  • ipmiUser – IPMI username for the deployment node.
  • ipmiPass – IPMI password for the deployment node.
  • mac – MAC address of the deployment node’s PXE NIC.
  • interfaces – host NICs, renamed according to their MAC addresses during OS provisioning.
  • roles – components deployed.

Set TYPE/FLAVOR and POWER TOOL

E.g.

TYPE: baremetal
FLAVOR: cluster
POWER_TOOL: ipmitool

Set ipmiUser/ipmiPass and ipmiVer

E.g.

ipmiUser: USER
ipmiPass: PASSWORD
ipmiVer: '2.0'

Assignment of different roles to servers

E.g. K8s only deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - kube_master
      - etcd

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - kube_node
9.2. Network Configuration (Bare Metal Deployment)

Before deployment, some network configuration needs to be checked based on your network topology. The Compass4nfv default network configuration file is “compass4nfv/deploy/conf/hardware_environment/huawei-pod1/network.yml”. This file is an example; you can customize it according to your specific network environment.

In this network.yml there are several config sections, listed below (corresponding to the order of the config file):

9.2.1. Provider Mapping
  • name – provider network name.
  • network – defaults to physnet; do not change it.
  • interfaces – the NIC or bridge the network attaches to.
  • type – the type of the NIC or bridge (vlan for a NIC and ovs for a bridge).
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
9.2.2. System Interface
  • name – network name.
  • interfaces – the NIC or bridge the network attaches to.
  • vlan_tag – if type is vlan, add this tag before the ‘type’ tag.
  • type – the type of the NIC or bridge (vlan for a NIC and ovs for a bridge).
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
9.2.3. IP Settings
  • name – network name, corresponding one-to-one with the network names in the System Interface section.
  • ip_ranges – the IP address range provided for this network.
  • cidr – the IPv4 network address with its associated routing prefix and subnet mask.
  • gw – add this line only if the network is external.
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
9.2.4. Internal VIP (virtual or proxy IP)
  • ip – virtual or proxy IP address; must be in the same subnet as the mgmt network but not within the mgmt network’s IP range.
  • netmask – the netmask length.
  • interface – usually mgmt.
9.2.5. Public VIP
  • ip – virtual or proxy IP address; must be in the same subnet as the external network but not within the external network’s IP range.
  • netmask – the netmask length.
  • interface – usually external.
9.2.6. Public Network
  • enable – must be True (if False, you need to set up the provider network manually).
  • network – leave it as ext-net.
  • type – the type of the ext-net above, such as flat or vlan.
  • segment_id – when the type is vlan, this should be the VLAN ID.
  • subnet – leave it as ext-subnet.
  • provider_network – leave it as physnet.
  • router – leave it as router-ext.
  • enable_dhcp – must be False.
  • no_gateway – must be False.
  • external_gw – the same as gw in ip_settings.
  • floating_ip_cidr – the CIDR for floating IPs; see the explanation in ip_settings.
  • floating_ip_start – defines the floating IP range together with floating_ip_end (this range must not be included in the external IP range configured in the ip_settings section).
  • floating_ip_end – defines the floating IP range together with floating_ip_start.

The following figure shows the default network configuration.

Kubernetes network configuration

Fig 5. Kubernetes network configuration

9.3. Start Deployment (Bare Metal Deployment)
  1. Edit deploy.sh
1.1. Set OS version for deployment nodes.
Compass4nfv supports ubuntu and centos based openstack newton.

E.g.

# Set OS version for target hosts
# Only CentOS7 supported now
export OS_VERSION=centos7

1.2. Set tarball corresponding to your code

E.g.

# Set ISO image corresponding to your code
export ISO_URL=file:///home/compass/compass4nfv.tar.gz
1.3. Set the hardware deploy jumpserver PXE NIC (e.g. set eth1).
You do not need to set it for a virtual deployment.

E.g.

# Set hardware deploy jumpserver PXE NIC
# you need to comment it out for a virtual deploy
export INSTALL_NIC=eth1

1.4. Set the K8s scenario that you want to deploy

E.g.

nosdn-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/k8-nosdn-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml
  2. Run deploy.sh
./deploy.sh
10. Offline Deploy

Compass4nfv uses a repo docker container as the distro and pip package source to deploy the cluster, supporting complete offline deployment on a jumphost without Internet access. Here are the offline deployment instructions:

10.1. Preparation for offline deploy
  1. Download compass.tar.gz from the OPNFV artifacts repository (search for the keyword compass4nfv in http://artifacts.opnfv.org/ and download an appropriate tarball; a tarball can also be generated by the script build.sh in the compass4nfv root directory).
  2. Download the Jumphost preparation package from our httpserver. (Download the jumphost environment package from here. Be aware that currently only Ubuntu trusty is supported as the offline jumphost OS.)
  3. Clone the compass4nfv code repository.
10.2. Steps of offline deploy
  1. Copy the compass.tar.gz, jh_env_package.tar.gz and the compass4nfv code repository to your jumphost.
  2. Export the local paths of compass.tar.gz and jh_env_package.tar.gz on the jumphost. Then you can perform deployment on an offline jumphost.

E.g.

Export the compass4nfv.iso and jh_env_package.tar.gz paths

# ISO_URL and JHPKG_URL should be absolute path
export ISO_URL=file:///home/compass/compass4nfv.iso
export JHPKG_URL=file:///home/compass/jh_env_package.tar.gz
  3. Open the OSA offline deployment switch on the jumphost.
export OFFLINE_DEPLOY=Enable
  4. Run deploy.sh
./deploy.sh
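The exports above (ISO_URL, JHPKG_URL, OFFLINE_DEPLOY) are easy to get wrong; below is a small pre-flight validation sketch. The helper name check_offline_env is hypothetical, not part of compass4nfv:

```python
from urllib.parse import urlparse

REQUIRED = {"ISO_URL", "JHPKG_URL"}

def check_offline_env(env):
    """Return a list of problems with the offline-deploy environment (empty list = OK)."""
    problems = []
    for var in sorted(REQUIRED):
        url = env.get(var)
        if not url:
            problems.append(f"{var} is not set")
            continue
        parsed = urlparse(url)
        # Both variables must be absolute file:// paths per the instructions above.
        if parsed.scheme != "file" or not parsed.path.startswith("/"):
            problems.append(f"{var} must be an absolute file:// path, got {url!r}")
    if env.get("OFFLINE_DEPLOY") != "Enable":
        problems.append("OFFLINE_DEPLOY must be set to 'Enable'")
    return problems

env = {"ISO_URL": "file:///home/compass/compass4nfv.iso",
       "JHPKG_URL": "file:///home/compass/jh_env_package.tar.gz",
       "OFFLINE_DEPLOY": "Enable"}
print(check_offline_env(env))  # []
```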
11. Expansion Guide
11.1. Edit NETWORK File

The below file is the network configuration template for deployment nodes:

"./deploy/conf/hardware_environment/huawei-pod1/network.yml"

You need to edit the same network.yml that you edited for the first deployment.

NOTE: The external subnet's ip_range should exclude the IPs that are already in use.

11.2. Edit DHA File

The below file is the inventory template of deployment nodes:

"./deploy/conf/hardware_environment/expansion-sample/hardware_cluster_expansion.yml"

You can write your own IPMI IP/User/Password/MAC address/roles with reference to it.

  • name – Host name for deployment node after installation.
  • ipmiIP – IPMI IP address of the deployment node. Make sure it is reachable from the Jumphost.
  • ipmiUser – IPMI Username for deployment node.
  • ipmiPass – IPMI Password for deployment node.
  • mac – MAC address of the deployment node's PXE NIC.

Set TYPE/FLAVOR and POWER TOOL

E.g.

TYPE: baremetal
FLAVOR: cluster
POWER_TOOL: ipmitool

Set ipmiUser/ipmiPass and ipmiVer

E.g.

ipmiUser: USER
ipmiPass: PASSWORD
ipmiVer: '2.0'

Assignment of roles to servers

E.g. Add only one compute node

hosts:
   - name: host6
     mac: 'E8:4D:D0:BA:60:45'
     interfaces:
        - eth1: '08:4D:D0:BA:60:44'
     ipmiIp: 172.16.131.23
     roles:
       - compute

E.g. Add two compute nodes

hosts:
   - name: host6
     mac: 'E8:4D:D0:BA:60:45'
     interfaces:
        - eth1: '08:4D:D0:BA:60:44'
     ipmiIp: 172.16.131.23
     roles:
       - compute

   - name: host7
     mac: 'E8:4D:D0:BA:60:78'
     interfaces:
        - eth1: '08:4D:56:BA:60:83'
     ipmiIp: 172.16.131.23
     roles:
       - compute
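When hand-editing the hosts list, it is easy to leave duplicate names or MAC addresses behind. A sketch of a pre-flight check (validate_expansion_hosts is a hypothetical helper; the host dicts mirror the YAML structure above):

```python
def validate_expansion_hosts(hosts):
    """Check that expansion host entries have unique names and MAC addresses."""
    errors = []
    seen_names, seen_macs = set(), set()
    for h in hosts:
        name, mac = h["name"], h["mac"].upper()  # MACs compared case-insensitively
        if name in seen_names:
            errors.append(f"duplicate host name: {name}")
        if mac in seen_macs:
            errors.append(f"duplicate MAC: {mac}")
        seen_names.add(name)
        seen_macs.add(mac)
        # Expansion examples in this guide add compute nodes only (an assumption here).
        if "compute" not in h.get("roles", []):
            errors.append(f"{name}: expansion nodes are expected to carry the compute role")
    return errors

hosts = [{"name": "host6", "mac": "E8:4D:D0:BA:60:45", "roles": ["compute"]},
         {"name": "host7", "mac": "E8:4D:D0:BA:60:78", "roles": ["compute"]}]
print(validate_expansion_hosts(hosts))  # []
```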
11.2.1. Start Expansion
  1. Edit the network.yml and dha.yml files

    You need to edit network.yml and virtual_cluster_expansion.yml or hardware_cluster_expansion.yml. Edit the DHA and NETWORK environment variables. The external subnet's ip_range and the management IPs should be changed, as the first 6 IPs are already taken by the first deployment.

E.g.

--- network.yml     2017-02-16 20:07:10.097878150 +0800
+++ network-expansion.yml   2017-05-03 10:01:34.537379013 +0800
@@ -38,7 +38,7 @@
 ip_settings:
   - name: mgmt
     ip_ranges:
-      - - "172.16.1.1"
+      - - "172.16.1.6"
         - "172.16.1.254"
     cidr: "172.16.1.0/24"
     role:
@@ -47,7 +47,7 @@

   - name: storage
     ip_ranges:
-      - - "172.16.2.1"
+      - - "172.16.2.6"
         - "172.16.2.254"
     cidr: "172.16.2.0/24"
     role:
@@ -56,7 +56,7 @@

   - name: external
     ip_ranges:
-      - - "192.168.116.201"
+      - - "192.168.116.206"
         - "192.168.116.221"
     cidr: "192.168.116.0/24"
     gw: "192.168.116.1"
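The diff above advances each range start past the addresses consumed by the first deployment. The arithmetic can be done with Python's stdlib ipaddress module (shift_range_start is a hypothetical helper):

```python
import ipaddress

def shift_range_start(start_ip, used_count):
    """Return a new range start that skips `used_count` addresses already taken
    by the first deployment."""
    return str(ipaddress.ip_address(start_ip) + used_count)

# The starts are advanced by 5 addresses here, matching the diff above
# (.1 -> .6 and .201 -> .206).
print(shift_range_start("172.16.1.1", 5))       # 172.16.1.6
print(shift_range_start("192.168.116.201", 5))  # 192.168.116.206
```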
  2. Edit deploy.sh
2.1. Set EXPANSION and VIRT_NUMBER.
VIRT_NUMBER decides how many virtual machines to add during a virtual expansion.

E.g.

export EXPANSION="true"
export MANAGEMENT_IP_START="10.1.0.55"
export VIRT_NUMBER=1
export DEPLOY_FIRST_TIME="false"

2.2. Set the scenario that you need to expand

E.g.

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/expansion-sample/hardware_cluster_expansion.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml
Note: The other environment variables should be the same as in your first deployment.
Please check the environment variables before you run deploy.sh.
  3. Run deploy.sh
./deploy.sh
Compass4nfv Design Guide
1. How to integrate a feature into compass4nfv

This document describes how to integrate a feature (e.g. sdn, moon, kvm, sfc) into the compass installer. By following the steps below, you can achieve the goal.

1.1. Create a role for the feature

Currently Ansible is the main package installation plugin in the adapters of Compass4nfv, which is used to deploy all the roles listed in the playbooks. (More details about ansible and playbooks can be found in the References.) The most used playbook in compass4nfv is named "HA-ansible-multinodes.yml", located in "your_path_to_compass4nfv/compass4nfv/deploy/adapters/ansible/openstack/".

Before you add your role into the playbook, create your role under the directory "your_path_to_compass4nfv/compass4nfv/deploy/adapters/ansible/roles/". For example, Fig 1 shows some roles currently existing in compass4nfv.

Existing roles in compass4nfv

Fig 1. Existing roles in compass4nfv

Let’s take a look at “moon” and understand the construction of a role. Fig 2 below presents the tree of “moon”.

Tree of moon role

Fig 2. Tree of moon role

There are five directories in moon: files, handlers, tasks, templates and vars. Almost every role has these five directories.

The "files" directory is used to store the files you want to copy to the hosts without any modification. These files can be configuration files, code files, etc. Here in moon's files directory, there are two python files and one configuration file. All three files will be copied to controller nodes for various purposes.

The "handlers" directory is used to store operations frequently used in your tasks, for example, restarting a service daemon.

The "tasks" directory is used to store the task yaml files. You need to add the yaml files including the tasks you write to deploy your role on the hosts. Note that a main.yml file must exist as the entrance for running the tasks. In Fig 2, you can find that there are four yaml files in the tasks directory of moon. The main.yml is the entrance, which calls the other three yaml files.

The "templates" directory is used to store files in which you want to replace some variables before copying them to hosts. These variables are usually defined in the "vars" directory. This avoids hard coding.

The "vars" directory is used to store the yaml files in which the packages and variables are defined. The packages defined here are generic debian or rpm packages. The script that makes the repo will scan the package names here and download them into the related PPA. For special packages, the section "Build packages for the feature" introduces how to handle them. The variables defined here are used in the files in "templates" and "tasks".

Note: you can get the special packages in the tasks like this:

- name: get the special packages' http server
  shell: awk -F'=' '/compass_server/ {print $2}' /etc/compass.conf
  register: http_server

- name: download odl package
  get_url:
    url: "http://{{ http_server.stdout_lines[0] }}/packages/odl/{{ odl_pkg_url }}"
    dest: /opt/
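The awk task above pulls the compass_server value out of /etc/compass.conf. For illustration, the same extraction in Python (assuming a simple key=value file format, as the awk command implies):

```python
def compass_server_from_conf(text):
    """Mimic `awk -F'=' '/compass_server/ {print $2}'` for the first match:
    return the value of the first line containing 'compass_server', split on '='."""
    for line in text.splitlines():
        if "compass_server" in line and "=" in line:
            return line.split("=", 1)[1].strip()
    return None

# Sample file content (illustrative, not a real /etc/compass.conf)
conf = "listen_port=8080\ncompass_server=10.1.0.12\n"
print(compass_server_from_conf(conf))  # 10.1.0.12
```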
1.2. Build packages for the feature

In the previous section, we have explained how to build the generic packages for your feature. In this section, we will talk about how to build the special packages used by your feature.

Features building directory in compass4nfv

Fig 3. Features building directory in compass4nfv

Fig 3 shows the tree of "your_path_to_compass4nfv/compass4nfv/repo/features/". The Dockerfile is used to start a docker container to run the scripts in the scripts directory. These scripts download the special feature-related packages into the container. What you need to do is write a shell script to download or build the package you want, and then put the script into "your_path_to_compass4nfv/compass4nfv/repo/features/scripts/". Note that you need to make a directory under /pkg. Take opendaylight as an example:

mkdir -p /pkg/odl

After downloading or building your feature packages, please copy all of your packages into the directory you made, e.g. /pkg/odl.

Note: If you have special requirements for the container OS or kernel version, etc., please contact us.

After all of these, come back to your_path_to_compass4nfv/compass4nfv/ directory, and run the command below:

./repo/make_repo.sh feature # To get special packages

./repo/make_repo.sh openstack # To get generic packages

When execution finishes, you will get a tar package named packages.tar.gz under "your_path_to_compass4nfv/compass4nfv/work/repo/". Your feature-related packages have been archived in this tar package. You will also get the PPA packages, which include the generic packages you defined in the role directory. The PPA packages are xenial-newton-ppa.tar.gz and centos7-newton-ppa.tar.gz, also in "your_path_to_compass4nfv/compass4nfv/work/repo/".

1.3. Build compass ISO including the feature

Before you deploy a cluster with your feature installed, you need an ISO with feature packages, generic packages and role included. This section introduces how to build the ISO you want. What you need to do are two simple things:

Configure the build configuration file

The build configuration file is located in “your_path_to_compass4nfv/compass4nfv/build/”. There are lines in the file like this:

export APP_PACKAGE=${APP_PACKAGE:-$FEATURE_URL/packages.tar.gz}

export XENIAL_NEWTON_PPA=${XENIAL_NEWTON_PPA:-$PPA_URL/xenial-newton-ppa.tar.gz}

export CENTOS7_NEWTON_PPA=${CENTOS7_NEWTON_PPA:-$PPA_URL/centos7-newton-ppa.tar.gz}

Just replace $FEATURE_URL and $PPA_URL with the directory where your packages.tar.gz is located. For example:

export APP_PACKAGE=${APP_PACKAGE:-file:///home/opnfv/compass4nfv/work/repo/packages.tar.gz}

export XENIAL_NEWTON_PPA=${XENIAL_NEWTON_PPA:-file:///home/opnfv/compass4nfv/work/repo/xenial-newton-ppa.tar.gz}

export CENTOS7_NEWTON_PPA=${CENTOS7_NEWTON_PPA:-file:///home/opnfv/compass4nfv/work/repo/centos7-newton-ppa.tar.gz}

Build the ISO

After the configuration, just run the command below to build the ISO you want for deployment.

./build.sh
1.4. References

Ansible documentation: http://docs.ansible.com/ansible/index.html

Daisy4NFV

Design Docs for Daisy4nfv
1. CI Job Introduction
1.2. Project Gating And Daily Deployment Test

To save time, Daisy4NFV currently does not run a deployment test in the gate job, which simply builds and uploads artifacts to the low confidence level repo. The project deployment test is triggered on a daily basis. If the artifact passes the test, it will be promoted to the high confidence level repo.

The low confidence level artifacts are bin files in http://artifacts.opnfv.org/daisy.html named like “daisy/opnfv-Gerrit-39495.bin”, while the high confidence level artifacts are named like “daisy/opnfv-2017-08-20_08-00-04.bin”.

The daily project deployment status can be found at

https://build.opnfv.org/ci/job/daisy-daily-master/

2. Deployment Steps

This document takes the VM all-in-one environment as an example to show what ci/deploy/deploy.sh really does.

  1. On jump host, clean up all-in-one vm and networks.
  2. On jump host, clean up daisy vm and networks.
  3. On jump host, create and start daisy vm and networks.
  4. In daisy vm, install the daisy artifact.
  5. In daisy vm, configure daisy and OpenStack default options.

6. In daisy vm, create the cluster, update the network and build the PXE server for the bootstrap kernel. In short, get ready for discovering target nodes. These tasks are done by running the following command.

python /home/daisy/deploy/tempest.py --dha /home/daisy/labs/zte/virtual1/daisy/config/deploy.yml --network /home/daisy/labs/zte/virtual1/daisy/config/network.yml --cluster 'yes'

  7. On jump host, create and start the all-in-one vm and networks.
  8. On jump host, after the all-in-one vm is up, get its mac address and write it into /home/daisy/labs/zte/virtual1/daisy/config/deploy.yml.

9. In daisy vm, check if the all-in-one vm was discovered. If it was, update its network assignment, configure OpenStack according to the OPNFV scenario, and set up PXE for OS installation. These tasks are done by running the following command.

python /home/daisy/deploy/tempest.py --dha /home/daisy/labs/zte/virtual1/daisy/config/deploy.yml --network /home/daisy/labs/zte/virtual1/daisy/config/network.yml --host yes --isbare 0 --scenario os-nosdn-nofeature-noha

Note: Current host status: os_status is “init”.

  10. On jump host, restart the all-in-one vm to install the OS.

11. In daisy vm, continue to install the OS by running the following command, which is for the VM environment only.

python /home/daisy/deploy/tempest.py --dha /home/daisy/labs/zte/virtual1/daisy/config/deploy.yml --network /home/daisy/labs/zte/virtual1/daisy/config/network.yml --install 'yes'

12. In daisy vm, run the following command to check the OS installation progress.

/home/daisy/deploy/check_os_progress.sh -d 0 -n 1

Note: Current host status: os_status is "installing" during installation, then os_status becomes "active" after the OS is successfully installed.

  13. On jump host, reboot the all-in-one vm again to get a fresh, first-booted OS.

14. In daisy vm, run the following command to check the OpenStack/ODL/... installation progress.

/home/daisy/deploy/check_openstack_progress.sh -n 1

3. Kolla Image Multicast Design
3.1. Protocol Design
  1. All Protocol headers are 1 byte long or align to 4 bytes.

2. The packet size should not exceed 1500 (MTU) bytes including the UDP/IP header, and should be aligned to 4 bytes. In the future, the MTU can be made larger than 1500 (Jumbo Frame) through a command line option to enlarge the data throughput.

/* Packet header definition (align to 4 bytes) */
struct packet_ctl {
    uint32_t seq;       // packet seq number starting from 0, unique in server life cycle
    uint32_t crc;       // checksum
    uint32_t data_size; // payload length
    uint8_t data[0];
};

/* Buffer info definition (align to 4 bytes) */
struct buffer_ctl {
    uint32_t buffer_id;      // buffer seq number starting from 0, unique in server life cycle
    uint32_t buffer_size;    // payload total length of a buffer
    uint32_t packet_id_base; // seq number of the first packet in this buffer
    uint32_t pkt_count;      // number of packets in this buffer, 0 means EOF
};
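The wire layout of packet_ctl can be exercised with Python's struct module. Byte order and the concrete checksum algorithm are not pinned down by the text above, so little-endian and zlib.crc32 are assumptions here:

```python
import struct
import zlib

# seq, crc, data_size: three uint32_t, byte order assumed little-endian
HEADER = struct.Struct("<III")

def pack_packet(seq, payload):
    """Build a packet_ctl: fixed header followed by the payload."""
    return HEADER.pack(seq, zlib.crc32(payload), len(payload)) + payload

def unpack_packet(raw):
    """Parse a packet_ctl and verify the (assumed) crc32 checksum."""
    seq, crc, data_size = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:HEADER.size + data_size]
    if zlib.crc32(payload) != crc:
        raise ValueError("checksum mismatch")
    return seq, payload

raw = pack_packet(7, b"kolla-image-chunk")
print(unpack_packet(raw))  # (7, b'kolla-image-chunk')
```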

  3. 1-byte-long header definition

Signals such as the four below are 1 byte long, to simplify the receive process (since 1 byte cannot be split).

#define CLIENT_READY 0x1
#define CLIENT_REQ   0x2
#define CLIENT_DONE  0x4
#define SERVER_SENT  0x8

Note: Please see the collaboration diagram for their meanings.

  4. Retransmission Request Header

/* Retransmission Request Header (align to 4 bytes) */
struct request_ctl {
    uint32_t req_count; // how many seqs below
    uint32_t seqs[0];   // packet seqs
};

  5. Buffer operations

// Init the buffer_ctl structure and all (say 1024) packet_ctl structures. Allocate buffer memory.
void buffer_init();
// Fill a buffer from fd, such as stdin.
long buffer_fill(int fd);
// Flush a buffer to fd, say stdout.
long buffer_flush(int fd);
// Put a packet into a buffer and return a free memory slot for the next packet.
struct packet_ctl *packet_put(struct packet_ctl *new_pkt);
// Get a packet's data in the buffer by indicating the packet seq.
struct packet_ctl *packet_get(uint32_t seq);

3.2. How to sync between server threads

If the children's aaa() operation needs to wait for the parent's init() to be done, then do it literally like this:

UDP Server                  TCP Server1                TCP Server2
TCP Server1 = spawn() ---->
init()
                            TCP Server2 = spawn() ---->
V(sem) -------------------> P(sem)
                            V(sem) ------------------> P(sem)   // No child any more
                            aaa()                      aaa()    // No need to V(sem), for no child

If the parent's send() operation needs to wait for the children's ready() to be done, then do it literally too, but in the reverse way:

UDP Server                  TCP Server1                TCP Server2
                                                       // No child any more
                            ready()                    ready()
                            P(sem) <------------------ V(sem)
P(sem) <------------------- V(sem)
send()

Note that the aaa() and ready() operations above run in parallel. If this is not allowed due to a race condition, the sequence above can be modified into the one below:

UDP Server                  TCP Server1                TCP Server2
                                                       // No child any more
                                                       ready()
                            P(sem) <------------------ V(sem)
                            ready()
P(sem) <------------------- V(sem)
send()

In order to implement such a chained/zipper sync pattern, a pair of semaphores is needed between the parent and the child. One is used by the child to wait for the parent, the other is used by the parent to wait for the child. The semaphore pair can be allocated by the parent, which passes the pointer to the child over the spawn() operation, such as pthread_create().

/* semaphore pair definition */
struct semaphores {
    sem_t wait_parent;
    sem_t wait_child;
};

Then the semaphore pair can be recorded by threads by using the semlink struct below:

struct semlink {
    struct semaphores *this;   /* used by parent to point to the struct semaphores
                                  which it created during spawning the child */
    struct semaphores *parent; /* used by child to point to the struct semaphores
                                  which was created by its parent */
};

chained/zipper sync API:

void sl_wait_child(struct semlink *sl);
void sl_release_child(struct semlink *sl);
void sl_wait_parent(struct semlink *sl);
void sl_release_parent(struct semlink *sl);

API usage is like this.

Thread1 (root parent)       Thread2 (child)             Thread3 (grandchild)
sl_wait_parent(noop op)
sl_release_child
  +-----------------------> sl_wait_parent
                            sl_release_child
                              +-----------------------> sl_wait_parent
                                                        sl_release_child(noop op)
                                                        ...
                                                        sl_wait_child(noop op)
                                                        sl_release_parent
                            sl_wait_child <-------------
                            sl_release_parent
sl_wait_child <-------------
sl_release_parent(noop op)

API implementation:

void sl_wait_child(struct semlink *sl)
{
    if (sl->this) {
        P(sl->this->wait_child);
    }
}

void sl_release_child(struct semlink *sl)
{
    if (sl->this) {
        V(sl->this->wait_parent);
    }
}

void sl_wait_parent(struct semlink *sl)
{
    if (sl->parent) {
        P(sl->parent->wait_parent);
    }
}

void sl_release_parent(struct semlink *sl)
{
    if (sl->parent) {
        V(sl->parent->wait_child);
    }
}
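The chained sync above maps directly onto semaphore pairs in any threading library. A minimal Python sketch with one parent and one child (names mirror the C API; this is an illustration, not the actual implementation):

```python
import threading

class SemLink:
    """Semaphore pair between a parent and its child, as in struct semaphores."""
    def __init__(self):
        self.wait_parent = threading.Semaphore(0)  # child blocks here until parent releases
        self.wait_child = threading.Semaphore(0)   # parent blocks here until child releases

events = []

def child(link):
    link.wait_parent.acquire()   # sl_wait_parent(): wait until parent's init() is done
    events.append("child: aaa() after parent init")
    link.wait_child.release()    # sl_release_parent(): tell parent we are done

link = SemLink()
t = threading.Thread(target=child, args=(link,))
t.start()
events.append("parent: init()")  # parent's init() happens before releasing the child
link.wait_parent.release()       # sl_release_child()
link.wait_child.acquire()        # sl_wait_child()
t.join()
print(events)  # ['parent: init()', 'child: aaa() after parent init']
```

The ordering is deterministic: the child cannot append its event before the parent has appended its own and released the semaphore.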

3.3. Client flow chart

See Collaboration Diagram

3.4. UDP thread flow chart

See Collaboration Diagram

3.5. TCP thread flow chart
S_INIT --- (UDP initialized) ---> S_ACCEPT --- (accept clients) ---+
                                                                   |
   +---------------------------------------------------------------+
   V
S_PREP --- (UDP prepared a buffer) ---> S_SYNC --- (clients CLIENT_READY) ---> S_SEND --- (clients CLIENT_DONE)
   ^                                                                              |
   +--------------------------- (bufferctl.pkt_count != 0) -----------------------+
                                                                                  |
                                                                                  V
                                                   exit() <--- (bufferctl.pkt_count == 0)

3.6. TCP using poll and message queue

TCP uses poll() to sync with the client's events as well as output events from itself, so that we can use non-blocking socket operations to reduce the latency. POLLIN means there are messages from the client and POLLOUT means we are ready to send messages/retransmission packets to the client.

poll main loop pseudo code:

void check_clients(struct server_status_data *sdata)
{
    poll_events = poll(&(sdata->ds[1]), sdata->ccount - 1, timeout);

    /* check all connected clients */
    for (sdata->cindex = 1; sdata->cindex < sdata->ccount; sdata->cindex++) {
        ds = &(sdata->ds[sdata->cindex]);
        if (!ds->revents) {
            continue;
        }

        if (ds->revents & (POLLERR|POLLHUP|POLLNVAL)) {
            handle_error_event(sdata);
        } else if (ds->revents & (POLLIN|POLLPRI)) {
            handle_pullin_event(sdata); // may set POLLOUT into ds->events
                                        // to trigger handle_pullout_event().
        } else if (ds->revents & POLLOUT) {
            handle_pullout_event(sdata);
        }
    }
}
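The same readiness-driven loop can be sketched with Python's stdlib selectors module; a socketpair stands in for an accepted client connection (illustrative only, not the server's code):

```python
import selectors
import socket

sel = selectors.DefaultSelector()
server_side, client_side = socket.socketpair()  # stand-in for an accepted TCP client
server_side.setblocking(False)                  # non-blocking, as in the poll() loop
sel.register(server_side, selectors.EVENT_READ)

client_side.sendall(b"CLIENT_READY")            # client signals readiness

received = []
for key, events in sel.select(timeout=1.0):
    if events & selectors.EVENT_READ:           # analogous to POLLIN
        received.append(key.fileobj.recv(64))   # data is already queued, recv won't block

sel.close()
server_side.close()
client_side.close()
print(received)  # [b'CLIENT_READY']
```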

For TCP, since a message from a client may be incomplete and sending data may also be interrupted due to the non-blocking fashion, there should be one send message queue and one receive message queue on the server side for each client (clients do not use non-blocking operations).

TCP message queue definition:

struct tcpq {
    struct qmsg *head, *tail;
    long count; /* message count in a queue */
    long size;  /* total data size of a queue */
};

TCP message queue item definition:

struct qmsg {
    struct qmsg *next;
    void *data;
    long size;
};

TCP message queue API:

// Allocate and init a queue.
struct tcpq *tcpq_queue_init(void);

// Free a queue.
void tcpq_queue_free(struct tcpq *q);

// Return queue length.
long tcpq_queue_dsize(struct tcpq *q);

// Queue new message to tail.
void tcpq_queue_tail(struct tcpq *q, void *data, long size);

// Queue message that cannot be sent currently back to queue head.
void tcpq_queue_head(struct tcpq *q, void *data, long size);

// Get one piece from queue head.
void *tcpq_dequeue_head(struct tcpq *q, long *size);

// Serialize all pieces of a queue, and move them out of the queue, to ease further operations on them.
void *tcpq_dqueue_flat(struct tcpq *q, long *size);

// Serialize all pieces of a queue, but do not move them out of the queue.
void *tcpq_queue_flat_peek(struct tcpq *q, long *size);
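The queue API can be prototyped with collections.deque to check the intended semantics, e.g. that tcpq_queue_head puts unsent data back in front and that flattening serializes the pieces in order (the class below is a sketch, not the server's code):

```python
from collections import deque

class TcpQ:
    """Per-client message queue mirroring struct tcpq (count/size bookkeeping included)."""
    def __init__(self):
        self.q = deque()
        self.count = 0  # message count in the queue
        self.size = 0   # total payload bytes in the queue

    def queue_tail(self, data):          # tcpq_queue_tail()
        self.q.append(data)
        self.count += 1
        self.size += len(data)

    def queue_head(self, data):          # tcpq_queue_head(): unsent data goes back in front
        self.q.appendleft(data)
        self.count += 1
        self.size += len(data)

    def dequeue_head(self):              # tcpq_dequeue_head()
        data = self.q.popleft()
        self.count -= 1
        self.size -= len(data)
        return data

    def dequeue_flat(self):              # tcpq_dqueue_flat(): serialize and drain
        flat = b"".join(self.q)
        self.q.clear()
        self.count = 0
        self.size = 0
        return flat

q = TcpQ()
q.queue_tail(b"def")
q.queue_head(b"abc")     # e.g. a piece that could not be sent is put back at the head
print(q.dequeue_flat())  # b'abcdef'
```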

Release notes for Daisy4nfv
1. Abstract

This document compiles the release notes for the Fraser release of OPNFV when using Daisy as a deployment tool.

1.1. Configuration Guide

Before installing Daisy4NFV on the jump server, you have to configure the daisy.conf file. Then put the correctly configured daisy.conf file in the /home/daisy_install/ directory.

  1. You have to fill the "daisy_management_ip" field with the management IP of your Daisy server VM.
  2. Currently the backend field "default_backend_types" only supports "kolla".
  3. The "os_install_type" field only supports "pxe" for now.
  4. Daisy now uses a PXE server to install the OS, so the "build_pxe" item must be set to "no".
  5. The "eth_name" field is the PXE server interface, and this field is required when the "build_pxe" field is set to "yes". This should be set to the interface (in the Daisy Server VM) which will be used for communicating with other target nodes on the management/PXE net plane. Default is ens3.
  6. The "ip_address" field is the IP address of the PXE server interface.
  7. The "net_mask" field is the netmask of the PXE server, which is required when "build_pxe" is set to "yes".
  8. The "client_ip_begin" and "client_ip_end" fields are the DHCP range of the PXE server.
  9. If you want to use the multicast type to deliver the kolla image to target nodes, set the "daisy_conf_mcast_enabled" field to "True".
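Several of these fields constrain each other, e.g. the DHCP range must lie inside the PXE interface's subnet. A quick sanity-check sketch with Python's ipaddress module (check_pxe_range is hypothetical; the sample values are illustrative):

```python
import ipaddress

def check_pxe_range(ip_address_field, net_mask, client_ip_begin, client_ip_end):
    """Verify that the DHCP range lies inside the PXE server's subnet."""
    net = ipaddress.ip_network(f"{ip_address_field}/{net_mask}", strict=False)
    begin = ipaddress.ip_address(client_ip_begin)
    end = ipaddress.ip_address(client_ip_end)
    return begin in net and end in net and begin <= end

# Illustrative values for a 10.20.7.0/24 PXE net plane
print(check_pxe_range("10.20.7.2", "255.255.255.0", "10.20.7.50", "10.20.7.100"))  # True
```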
2. OpenStack Configuration Guide
2.1. Before The First Deployment

When executing deploy.sh, before doing real deployment, Daisy utilizes Kolla’s service configuration functionality [1] to specify the following changes to the default OpenStack configuration which comes from Kolla as default.

a) If it is a VM deployment, set virt_type=qemu and cpu_mode=none in nova-compute.conf.

b) In nova-api.conf, set default_floating_pool to the name of the external network which will be created by Daisy after deployment.

c) In heat-api.conf and heat-engine.conf, set deferred_auth_method to trusts and unset trusts_delegated_roles.

The above changes are requirements of OPNFV or the environment's constraints, so it is not recommended to change them. But if the user wants to add more specific configurations to OpenStack services before doing the real deployment, we suggest doing it in the same way as deploy.sh does. Currently, this means hacking into deploy/prepare.sh or deploy/prepare/execute.py, then adding config files as described in [1].

Note: It is suggested to complete the first deployment first, then reconfigure and deploy again.

2.2. After The First Deployment

After the first deployment of OpenStack, its configuration can also be changed and applied by using Kolla's service configuration functionality [1]. But the user has to issue Kolla's commands to do it in this release:

[1] https://docs.openstack.org/kolla-ansible/latest/advanced-configuration.html#openstack-service-configuration-in-kolla

OPNFV Daisy4nfv Installation Guide
Abstract

This document describes how to install the Fraser release of OPNFV when using Daisy4nfv as a deployment tool, covering its limitations, dependencies and required resources.

Version history
Date Ver. Author Comment
2017-02-07 0.0.1 Zhijiang Hu (ZTE) Initial version
Daisy4nfv configuration

This document provides guidelines on how to install and configure the Fraser release of OPNFV when using Daisy as a deployment tool including required software and hardware configurations.

Installation and configuration of host OS, OpenStack etc. can be supported by Daisy on Virtual nodes and Bare Metal nodes.

The audience of this document is assumed to have good knowledge in networking and Unix/Linux administration.

Prerequisites

Before starting the installation of the Fraser release of OPNFV, some planning must be done.

Retrieve the installation iso image

First of all, the installation iso which includes packages of Daisy, OS, OpenStack, and so on is needed for deploying your OPNFV environment.

The stable release iso image can be retrieved via OPNFV software download page

The daily build iso image can be retrieved via OPNFV artifact repository:

http://artifacts.opnfv.org/daisy.html

NOTE: Search the keyword “daisy/Fraser” to locate the iso image.

E.g. daisy/opnfv-2017-10-06_09-50-23.iso

Download the iso file, then mount it to a specified directory and get the opnfv-*.bin from that directory.

The git url and sha512 checksum of iso image are recorded in properties files. According to these, the corresponding deployment scripts can be retrieved.

Retrieve the deployment scripts

To retrieve the repository of Daisy on the Jumphost, use the following command:

  • git clone https://gerrit.opnfv.org/gerrit/daisy

To get the stable Fraser release, you can use the following command:

  • git checkout opnfv.6.0
Setup Requirements

If you have only 1 Bare Metal server, Virtual deployment is recommended. If you have more than 3 servers, Bare Metal deployment is recommended. The minimum number of servers for each role in a Bare Metal deployment is listed below.

Role Number of Servers
Jump Host 1
Controller 1
Compute 1
Jumphost Requirements

The Jumphost requirements are outlined below:

  1. CentOS 7.2 (Pre-installed).
  2. Root access.
  3. Libvirt virtualization support (for virtual deployment).
  4. Minimum 1 NIC (or 2 NICs for virtual deployment).
    • PXE installation Network (Receiving PXE request from nodes and providing OS provisioning)
    • IPMI Network (Nodes power control and set boot PXE first via IPMI interface)
    • Internet access (For getting latest OS updates)
    • External Interface (for virtual deployment, exclusively used by instance traffic to access the rest of the Internet)
  5. 16 GB of RAM for a Bare Metal deployment, 64 GB of RAM for a Virtual deployment.
  6. CPU cores: 32, Memory: 64 GB, Hard Disk: 500 GB, (Virtual deployment needs 1 TB Hard Disk)
Bare Metal Node Requirements

Bare Metal nodes require:

  1. IPMI enabled on OOB interface for power control.
  2. BIOS boot priority should be PXE first then local hard disk.
  3. Minimum 1 NIC for Compute nodes, 2 NICs for Controller nodes.
    • PXE installation Network (Broadcasting PXE request)
    • IPMI Network (Receiving IPMI command from Jumphost)
    • Internet access (For getting latest OS updates)
    • External Interface (for virtual deployment, exclusively used by instance traffic to access the rest of the Internet)
Network Requirements

Network requirements include:

  1. No DHCP or TFTP server running on networks used by OPNFV.
  2. 2-7 separate networks with connectivity between Jumphost and nodes.
    • PXE installation Network
    • IPMI Network
    • Internet access Network
    • OpenStack Public API Network
    • OpenStack Private API Network
    • OpenStack External Network
    • OpenStack Tenant Network(currently, VxLAN only)
  3. Lights out OOB network access from Jumphost with IPMI node enabled (Bare Metal deployment only).
  4. Internet access Network has Internet access, meaning a gateway and DNS availability.
  5. OpenStack External Network has Internet access too if you want instances to access the Internet.

Note: All networks except the OpenStack External Network can share one NIC (default configuration) or use an exclusive NIC (reconfigured in network.yml).

Execution Requirements (Bare Metal Only)

In order to execute a deployment, one must gather the following information:

  1. IPMI IP addresses of the nodes.
  2. IPMI login information for the nodes (user/password).
Installation Guide (Bare Metal Deployment)
Nodes Configuration (Bare Metal Deployment)

The below file is the inventory template of deployment nodes:

"./deploy/config/bm_environment/zte-baremetal1/deploy.yml"

You can write your own name/roles with reference to it.

  • name – Host name for deployment node after installation.
  • roles – Components deployed. CONTROLLER_LB is for the Controller role,

COMPUTER is for the Compute role. Currently only these two roles are supported. The first CONTROLLER_LB is also used for the ODL controller. 3 hosts in the inventory will be chosen to set up the Ceph storage cluster.

Set TYPE and FLAVOR

E.g.

TYPE: virtual
FLAVOR: cluster

Assignment of different roles to servers

E.g. OpenStack only deployment roles setting

hosts:
  - name: host1
    roles:
      - CONTROLLER_LB
  - name: host2
    roles:
      - COMPUTER
  - name: host3
    roles:
      - COMPUTER

NOTE: For B/M, Daisy uses the MAC addresses defined in deploy.yml to map discovered nodes to the node items defined in deploy.yml, then assigns the role described by the node item to the discovered nodes by name pattern. Currently, controller01, controller02, and controller03 will be assigned the Controller role while computer01, computer02, computer03, and computer04 will be assigned the Compute role.

NOTE: For V/M, there is no MAC address defined in deploy.yml for each virtual machine. Instead, Daisy will fill that blank by getting the MAC from "virsh dumpxml".

Network Configuration (Bare Metal Deployment)

Before deployment, there are some network configurations to be checked based on your network topology. The default network configuration file for Daisy is ”./deploy/config/bm_environment/zte-baremetal1/network.yml”. You can write your own reference into it.

The following figure shows the default network configuration.

+-B/M--------+------------------------------+
|Jumperserver+                              |
+------------+                       +--+   |
|                                    |  |   |
|                +-V/M--------+      |  |   |
|                | Daisyserver+------+  |   |
|                +------------+      |  |   |
|                                    |  |   |
+------------------------------------|  |---+
                                     |  |
                                     |  |
      +--+                           |  |
      |  |       +-B/M--------+      |  |
      |  +-------+ Controller +------+  |
      |  |       | ODL(Opt.)  |      |  |
      |  |       | Network    |      |  |
      |  |       | CephOSD1   |      |  |
      |  |       +------------+      |  |
      |  |                           |  |
      |  |                           |  |
      |  |                           |  |
      |  |       +-B/M--------+      |  |
      |  +-------+  Compute1  +------+  |
      |  |       |  CephOSD2  |      |  |
      |  |       +------------+      |  |
      |  |                           |  |
      |  |                           |  |
      |  |                           |  |
      |  |       +-B/M--------+      |  |
      |  +-------+  Compute2  +------+  |
      |  |       |  CephOSD3  |      |  |
      |  |       +------------+      |  |
      |  |                           |  |
      |  |                           |  |
      |  |                           |  |
      +--+                           +--+
        ^                             ^
        |                             |
        |                             |
       /---------------------------\  |
       |      External Network     |  |
       \---------------------------/  |
              /-----------------------+---\
              |    Installation Network   |
              |    Public/Private API     |
              |      Internet Access      |
              |      Tenant Network       |
              |     Storage Network       |
              |     HeartBeat Network     |
              \---------------------------/

Note: For Flat External networks (which are used by default), a physical interface is needed on each compute node for recent ODL NetVirt versions. If a HeartBeat network is selected and configured in network.yml, the keepalived interface will be the heartbeat interface.

Start Deployment (Bare Metal Deployment)
(1) Clone the latest daisy4nfv code from OPNFV: "git clone https://gerrit.opnfv.org/gerrit/daisy"

(2) Download the latest bin file (such as opnfv-2017-06-06_23-00-04.bin) of Daisy from http://artifacts.opnfv.org/daisy.html and rename it to opnfv.bin. Check https://build.opnfv.org/ci/job/daisy-os-odl-nofeature-ha-baremetal-daily-master/; if the 'snaps_health_check' Functest result is 'PASS', you can use this verified bin to deploy OpenStack in your own environment.

(3) Assume the cloned directory is $workdir, laid out as below. Make sure the opnfv.bin file is in $workdir.

[root@daisyserver daisy]# ls
ci  code  deploy  deploy.log  docker  docs  INFO  known_hosts  LICENSE
requirements.txt  setup.py  templates  test-requirements.txt  tests  tools  tox.ini

(4) Enter $workdir, laid out as below, and create the folder labs/zte/pod2/daisy/config in $workdir.

[root@daisyserver daisy]# ls
ci  code  deploy  docker  docs  INFO  LICENSE  requirements.txt  setup.py
templates  test-requirements.txt  tests  tools  tox.ini

(5) Move ./deploy/config/bm_environment/zte-baremetal1/deploy.yml and ./deploy/config/bm_environment/zte-baremetal1/network.yml to the labs/zte/pod2/daisy/config directory.

Note: If SELinux is disabled on the host, delete the section below from all XML files in the templates/physical_environment/vms/ directory:

<seclabel type='dynamic' model='selinux' relabel='yes'>
  <label>system_u:system_r:svirt_t:s0:c182,c195</label>
  <imagelabel>system_u:object_r:svirt_image_t:s0:c182,c195</imagelabel>
</seclabel>

(6) Configure the bridge on the jumperserver so that the Daisy VM can connect to the target node, using the commands below (enp3s0f3 is the interface the jumperserver uses to connect to the Daisy VM):

brctl addbr br7
brctl addif br7 enp3s0f3
ifconfig br7 10.20.7.1 netmask 255.255.255.0 up
service network restart

(7) Run the script deploy.sh in daisy/ci/deploy/ with the command:

sudo ./ci/deploy/deploy.sh -L $(cd ./;pwd) -l zte -p pod2 -s os-nosdn-nofeature-noha

Note: The value after -L should be an absolute path pointing to the directory that contains the labs/zte/pod2/daisy/config directory. The value after -p (pod2) comes from the path "labs/zte/pod2". The value after -l (zte) comes from the path "labs/zte". Use -s "os-nosdn-nofeature-ha" to deploy multinode OpenStack, and -s "os-nosdn-nofeature-noha" to deploy all-in-one OpenStack.
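As an illustration, the way these arguments compose the config path can be sketched as follows. This is a minimal sketch only: the variable names and the /home/daisy base path are assumptions for illustration, not taken from deploy.sh itself.

```shell
#!/bin/sh
# Sketch of how -L, -l and -p compose the expected config location.
# /home/daisy stands in for the absolute path passed via -L; the real
# deploy.sh argument handling may differ.
LAB_CONFIG_BASE=/home/daisy   # -L: absolute path that contains labs/
LAB=zte                       # -l: lab name
POD=pod2                      # -p: pod name
CONFIG_DIR="$LAB_CONFIG_BASE/labs/$LAB/$POD/daisy/config"
echo "$CONFIG_DIR"            # deploy.yml and network.yml must live here
```
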

(8) When deployed successfully, the floating IP of OpenStack is 10.20.7.11, the login account is "admin" and the password is "keystone".

Installation Guide (Virtual Deployment)
Nodes Configuration (Virtual Deployment)

The below file is the inventory template of deployment nodes:

”./deploy/conf/vm_environment/zte-virtual1/deploy.yml”

You can write your own name/roles reference into it.

  • name – Host name for deployment node after installation.
  • roles – Components deployed.

Set TYPE and FLAVOR

E.g.

TYPE: virtual
FLAVOR: cluster

Assignment of different roles to servers

E.g., OpenStack-only deployment role setting:

hosts:
  - name: host1
    roles:
      - CONTROLLER_LB

  - name: host2
    roles:
      - COMPUTER

NOTE: For B/M, Daisy uses the MAC addresses defined in deploy.yml to map discovered nodes to the node items in deploy.yml, then assigns the role described by each node item to the discovered node by name pattern. Currently, controller01, controller02 and controller03 will be assigned the Controller role, while computer01, computer02, computer03 and computer04 will be assigned the Compute role.

NOTE: For V/M, there is no MAC address defined in deploy.yml for each virtual machine. Instead, Daisy fills that blank by getting the MAC from "virsh dumpxml".
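The MAC lookup can be illustrated with a self-contained sketch: the XML below is a hypothetical minimal libvirt domain fragment, and the sed expression pulls the MAC out the way one would from real "virsh dumpxml" output.

```shell
#!/bin/sh
# Hypothetical minimal libvirt domain XML, for illustration only.
cat > /tmp/domain-sample.xml <<'EOF'
<domain>
  <devices>
    <interface type='bridge'>
      <mac address='52:54:00:aa:bb:cc'/>
    </interface>
  </devices>
</domain>
EOF
# Extract the value of the "mac address" attribute.
MAC=$(sed -n "s/.*mac address='\([^']*\)'.*/\1/p" /tmp/domain-sample.xml)
echo "$MAC"
```
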

E.g., OpenStack and Ceph deployment role setting:

hosts:
  - name: host1
    roles:
      - controller

  - name: host2
    roles:
      - compute
Network Configuration (Virtual Deployment)

Before deployment, some network configurations need to be checked against your network topology. The default network configuration file for Daisy is "daisy/deploy/config/vm_environment/zte-virtual1/network.yml". You can write your own configuration into it.

The following figure shows the default network configuration.

+-B/M--------+------------------------------+
|Jumperserver+                              |
+------------+                       +--+   |
|                                    |  |   |
|                +-V/M--------+      |  |   |
|                | Daisyserver+------+  |   |
|                +------------+      |  |   |
|                                    |  |   |
|     +--+                           |  |   |
|     |  |       +-V/M--------+      |  |   |
|     |  +-------+ Controller +------+  |   |
|     |  |       | ODL(Opt.)  |      |  |   |
|     |  |       | Network    |      |  |   |
|     |  |       | Ceph1      |      |  |   |
|     |  |       +------------+      |  |   |
|     |  |                           |  |   |
|     |  |                           |  |   |
|     |  |                           |  |   |
|     |  |       +-V/M--------+      |  |   |
|     |  +-------+  Compute1  +------+  |   |
|     |  |       |  Ceph2     |      |  |   |
|     |  |       +------------+      |  |   |
|     |  |                           |  |   |
|     |  |                           |  |   |
|     |  |                           |  |   |
|     |  |       +-V/M--------+      |  |   |
|     |  +-------+  Compute2  +------+  |   |
|     |  |       |  Ceph3     |      |  |   |
|     |  |       +------------+      |  |   |
|     |  |                           |  |   |
|     |  |                           |  |   |
|     |  |                           |  |   |
|     +--+                           +--+   |
|       ^                             ^     |
|       |                             |     |
|       |                             |     |
|      /---------------------------\  |     |
|      |      External Network     |  |     |
|      \---------------------------/  |     |
|             /-----------------------+---\ |
|             |    Installation Network   | |
|             |    Public/Private API     | |
|             |      Internet Access      | |
|             |      Tenant Network       | |
|             |     Storage Network       | |
|             |     HeartBeat Network     | |
|             \---------------------------/ |
+-------------------------------------------+

Note: For Flat External networks (used by default), a physical interface is needed on each compute node for recent ODL NetVirt versions. If a HeartBeat network is configured in network.yml, the keepalived interface will be the heartbeat interface.

Start Deployment (Virtual Deployment)

(1) Clone the latest daisy4nfv code from OPNFV with "git clone https://gerrit.opnfv.org/gerrit/daisy" and make sure the current branch is master.

(2) Download the latest bin file (such as opnfv-2017-06-06_23-00-04.bin) of Daisy from http://artifacts.opnfv.org/daisy.html and rename it to opnfv.bin. Check https://build.opnfv.org/ci/job/daisy-os-odl-nofeature-ha-baremetal-daily-master/; if the 'snaps_health_check' Functest result is 'PASS', you can use this verified bin to deploy OpenStack in your own environment.

(3) Assume the cloned directory is $workdir, laid out as below. Make sure the opnfv.bin file is in $workdir.

[root@daisyserver daisy]# ls
ci  code  deploy  docker  docs  INFO  LICENSE  requirements.txt  setup.py
templates  test-requirements.txt  tests  tools  tox.ini

(4) Enter $workdir and create the folder labs/zte/virtual1/daisy/config in $workdir.

(5) Move deploy/config/vm_environment/zte-virtual1/deploy.yml and deploy/config/vm_environment/zte-virtual1/network.yml to the labs/zte/virtual1/daisy/config directory.

Note: The zte-virtual1 config files deploy OpenStack with five nodes (3 lb nodes and 2 computer nodes). If you want to deploy an all-in-one OpenStack, change zte-virtual1 to zte-virtual2.

Note: If SELinux is disabled on the host, delete the section below from all XML files in the templates/virtual_environment/vms/ directory:

<seclabel type='dynamic' model='selinux' relabel='yes'>
  <label>system_u:system_r:svirt_t:s0:c182,c195</label>
  <imagelabel>system_u:object_r:svirt_image_t:s0:c182,c195</imagelabel>
</seclabel>

(6) Run the script deploy.sh in daisy/ci/deploy/ with the command:

sudo ./ci/deploy/deploy.sh -L $(cd ./;pwd) -l zte -p virtual1 -s os-nosdn-nofeature-ha

Note: The value after -L should be an absolute path pointing to the directory that includes the labs/zte/virtual1/daisy/config directory. The value after -p (virtual1) comes from the path "labs/zte/virtual1". The value after -l (zte) comes from the path "labs/zte". Use -s "os-nosdn-nofeature-ha" to deploy multinode OpenStack, and -s "os-nosdn-nofeature-noha" to deploy all-in-one OpenStack.

(7) When deployed successfully, the floating IP of OpenStack is 10.20.11.11, the login account is "admin" and the password is "keystone".

Deployment Error Recovery Guide

Deployment may fail for various reasons, such as a Daisy VM creation error, target node failure during OS installation, or a Kolla deploy command error. Different errors can be grouped into several error levels. We define the Recovery Levels below to fulfil recovery requirements at different error levels.

1. Recovery Level 0

This level restarts the whole deployment. It is mainly used to retry after errors such as a Daisy VM creation failure. For example, we use the following command to do a virtual deployment (on the jump host):

sudo ./ci/deploy/deploy.sh -b ./ -l zte -p virtual1 -s os-nosdn-nofeature-ha

If the command failed because of a Daisy VM creation error, then redoing the above command will restart the whole deployment, which includes rebuilding the Daisy VM image and restarting the Daisy VM.

2. Recovery Level 1

If the Daisy VM was created successfully but bugs in the Daisy code or in the target OS software prevented the deployment from completing, the user or developer may not want to recreate the Daisy VM during the next deployment but only modify some pieces of code in it. To achieve this, redo the deployment by first deleting all clusters and hosts (in the Daisy VM):

source /root/daisyrc_admin
# Delete all clusters (the ID is the 2nd "|"-separated field of cluster-list).
for i in `daisy cluster-list | awk -F "|" '{print $2}' | sed -n '4p' | tr -d " "`;do daisy cluster-delete $i;done
# Delete all discovered hosts.
for i in `daisy host-list | awk -F "|" '{print $2}'| grep -o "[^ ]\+\( \+[^ ]\+\)*"|tail -n +2`;do daisy host-delete $i;done

Then, adjust deployment command as below and run it again(in the jump host):

sudo ./ci/deploy/deploy.sh -S -b ./ -l zte -p virtual1 -s os-nosdn-nofeature-ha

Pay attention to the "-S" argument above: it makes the deployment process skip re-creating the Daisy VM and use the existing one.

3. Recovery Level 2

If both the Daisy VM and the target node's OS are OK, but an error occurred during the OpenStack deployment, then there is no need to re-install the target OS before retrying. At this level, all we need to do is retry the Daisy deployment command as follows (in the Daisy VM):

source /root/daisyrc_admin
daisy uninstall <cluster-id>
daisy install <cluster-id>

This basically does a kolla-ansible destroy followed by a kolla-ansible deploy.

4. Recovery Level 3

If the previous deployment failed during kolla-ansible deploy (you can confirm this by checking /var/log/daisy/api.log), or if it succeeded but the default configuration is not what you want and you are willing to destroy the OPNFV software stack and re-deploy it, then you can try Recovery Level 3.

For example, in order to use external iSCSI storage, you want to enable the iSCSI Cinder backend, which is not enabled by default. First, clean up the previous deployment.

SSH into the Daisy node, then do:

[root@daisy daisy]# source /etc/kolla/admin-openrc.sh
[root@daisy daisy]# openstack server delete <all vms you created>

Note: /etc/kolla/admin-openrc.sh may not exist if the previous deployment failed during kolla deploy.
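Because the file may be missing, a cleanup script should guard the source step. A minimal sketch (the ./nonexistent-admin-openrc.sh path is a stand-in for /etc/kolla/admin-openrc.sh):

```shell
#!/bin/sh
# Guarded credential loading: if admin-openrc.sh is absent, skip the
# "openstack server delete" step and go straight to kolla-ansible destroy.
RC=./nonexistent-admin-openrc.sh   # stand-in for /etc/kolla/admin-openrc.sh
if [ -f "$RC" ]; then
    . "$RC"
    MSG="credentials loaded; VMs can be deleted first"
else
    MSG="admin-openrc.sh not found; skipping VM deletion"
fi
echo "$MSG"
```
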

[root@daisy daisy]# cd /home/kolla_install/kolla-ansible/
[root@daisy kolla-ansible]# ./tools/kolla-ansible destroy \
-i ./ansible/inventory/multinode --yes-i-really-really-mean-it

Then, edit /etc/kolla/globals.yml and append the following lines:

enable_cinder_backend_iscsi: "yes"
enable_cinder_backend_lvm: "no"
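A small sketch of applying these overrides idempotently (mktemp stands in for the real /etc/kolla/globals.yml; appending by hand works just as well):

```shell
#!/bin/sh
# Append the Cinder backend overrides only if they are not already present.
GLOBALS=$(mktemp)                  # stand-in for /etc/kolla/globals.yml
grep -q '^enable_cinder_backend_iscsi:' "$GLOBALS" || {
    printf 'enable_cinder_backend_iscsi: "yes"\n' >> "$GLOBALS"
    printf 'enable_cinder_backend_lvm: "no"\n'    >> "$GLOBALS"
}
cat "$GLOBALS"
```
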

Then, re-deploy again:

[root@daisy kolla-ansible]# ./tools/kolla-ansible prechecks -i ./ansible/inventory/multinode
[root@daisy kolla-ansible]# ./tools/kolla-ansible deploy -i ./ansible/inventory/multinode

After a successful deployment, issue the following command to generate the /etc/kolla/admin-openrc.sh file.

[root@daisy kolla-ansible]# ./tools/kolla-ansible post-deploy -i ./ansible/inventory/multinode

Finally, issue the following command to create the necessary resources; your environment is then ready for running OPNFV Functest.

[root@daisy kolla-ansible]# cd /home/daisy
[root@daisy daisy]# ./deploy/post.sh -n /home/daisy/labs/zte/virtual1/daisy/config/network.yml

Note: “zte/virtual1” in above path may vary in your environment.

OpenStack Minor Version Update Guide

Thanks to kolla-ansible's upgrade function, Daisy can update the OpenStack minor version as follows:

1. Get the new version file from the Daisy team. Since Daisy's Kolla images are built to meet OPNFV requirements and have their own file packaging layout, Daisy requires users to always use Kolla image files built by the Daisy team. Currently, they can be found at http://artifacts.opnfv.org/daisy/upstream; alternatively, see this chapter for how to build your own image.

2. Put the new version file into /var/lib/daisy/versionfile/kolla/, for example: /var/lib/daisy/versionfile/kolla/kolla-image-ocata-170811155446.tgz

3. Add the version file to Daisy's version management database, then get the version ID.

[root@daisy ~]# source /root/daisyrc_admin
[root@daisy ~]# daisy version-add kolla-image-ocata-170811155446.tgz kolla
+-------------+--------------------------------------+
| Property    | Value                                |
+-------------+--------------------------------------+
| checksum    | None                                 |
| created_at  | 2017-08-28T06:45:25.000000           |
| description | None                                 |
| id          | 8be92587-34d7-43e8-9862-a5288c651079 |
| name        | kolla-image-ocata-170811155446.tgz   |
| owner       | None                                 |
| size        | 0                                    |
| status      | unused                               |
| target_id   | None                                 |
| type        | kolla                                |
| updated_at  | 2017-08-28T06:45:25.000000           |
| version     | None                                 |
+-------------+--------------------------------------+
4. Get the cluster ID:
[root@daisy ~]# daisy cluster-list
+--------------------------------------+-------------+...
| ID                                   | Name        |...
+--------------------------------------+-------------+...
| d4c1e0d3-c4b8-4745-aab0-0510e62f0ebb | clustertest |...
+--------------------------------------+-------------+...
5. Issue the update command, passing the cluster ID and the version ID:
[root@daisy ~]# daisy update d4c1e0d3-c4b8-4745-aab0-0510e62f0ebb --update-object kolla --version-id 8be92587-34d7-43e8-9862-a5288c651079
+----------+--------------+
| Property | Value        |
+----------+--------------+
| status   | begin update |
+----------+--------------+

6. Since step 5's command is non-blocking, run the following command to track the update progress.

[root@daisy ~]# daisy host-list --cluster-id d4c1e0d3-c4b8-4745-aab0-0510e62f0ebb
...+---------------+-------------+-------------------------+
...| Role_progress | Role_status | Role_messages           |
...+---------------+-------------+-------------------------+
...| 0             | updating    | prechecking envirnoment |
...+---------------+-------------+-------------------------+

Note: The above command returns many fields; only the Role_xxx fields matter in this case.

Build Your Own Kolla Image For Daisy

The following command will build an Ocata Kolla image for Daisy based on Daisy's fork of the openstack/kolla project. This is also the method Daisy used for the Euphrates release.

The fork of openstack/kolla is used in order to backport ODL support from the pike branch to the ocata branch.

cd ./ci
./kolla-build.sh

The above command puts the Kolla image into the /tmp/kolla-build-output directory; the image version will be 4.0.2.

If you want to build an image that can update 4.0.2, run the following command:

cd ./ci
./kolla-build.sh -e 1

This time the image version will be 4.0.2.1, which is higher than 4.0.2, so it can be used to replace the old version.
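The ordering can be checked with GNU sort's version comparison (illustrative only; this is not part of the Daisy tooling):

```shell
#!/bin/sh
# GNU "sort -V" orders version strings, confirming 4.0.2.1 > 4.0.2.
NEWEST=$(printf '4.0.2\n4.0.2.1\n' | sort -V | tail -n 1)
echo "$NEWEST"
```
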

Deployment Test Guide

After a successful deployment of OpenStack, daisy4nfv uses Functest to test the OpenStack APIs. You can follow the instructions below to test the deployed OpenStack from the jumperserver.

1. Pull the latest Functest image, then run 'docker images' to confirm it is present:

docker pull opnfv/functest

2. Start the Functest container:

docker run -ti --name functest -e INSTALLER_TYPE="daisy" \
    -e INSTALLER_IP="10.20.11.2" -e NODE_NAME="zte-vtest" \
    -e DEPLOY_SCENARIO="os-nosdn-nofeature-ha" \
    -e BUILD_TAG="jenkins-functest-daisy-virtual-daily-master-1259" \
    -e DEPLOY_TYPE="virt" opnfv/functest:latest /bin/bash

Before running the above command, change the following parameters as needed: DEPLOY_SCENARIO indicates the scenario; DEPLOY_TYPE is virt or baremetal; NODE_NAME is the pod name; INSTALLER_IP is the Daisy VM node IP.

3. Log in to the Daisy VM node to get the /etc/kolla/admin-openrc.sh file, and write its contents into the /home/opnfv/functest/conf/openstack.creds file of the Functest container.

4. Run 'functest env prepare' to prepare the Functest environment.

5. Run 'functest testcase list' to list all the test cases that can be run.

6. Run 'functest testcase run testcase_name' to run a specific test case.

Doctor

Doctor Development Guide
Testing Doctor

You have two options for testing Doctor functions with the script developed for Doctor CI.

You need to install OpenStack and the other OPNFV components except the Doctor Sample Inspector, Sample Monitor and Sample Consumer, as these will be launched by this script. You are encouraged to use the official OPNFV installers, but you can also deploy all components with other tools such as devstack or by manual operation. In those cases, the versions of all components shall match the versions used in the specific OPNFV release.

Run Test Script

The Doctor project has its own testing script under doctor/doctor_tests. This test script can be used for functional testing against an OPNFV deployment.

Before running this script, make sure OpenStack env parameters are set properly (See e.g. OpenStackClient Configuration), so that Doctor Inspector can operate OpenStack services.
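For reference, the environment typically consists of the standard OpenStackClient variables shown below; every value is a placeholder to be replaced with your own cloud's endpoint and credentials.

```shell
# Placeholder OpenStackClient settings (adjust all values for your cloud).
export OS_AUTH_URL=http://<keystone-ip>:5000/v3
export OS_USERNAME=admin
export OS_PASSWORD=<password>
export OS_PROJECT_NAME=admin
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
```
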

Doctor now supports different test cases, and you may want to export TEST_CASE with different values:

#Fault management (default)
export TEST_CASE='fault_management'
#Maintenance (requires 3 compute nodes)
export TEST_CASE='maintenance'
#Run both test cases
export TEST_CASE='all'
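A hypothetical sketch of how a runner could branch on TEST_CASE (the real doctor_tests dispatch logic differs; here 'all' simply expands to both suites):

```shell
#!/bin/sh
# Dispatch on TEST_CASE; "all" expands to both suites in sequence.
TEST_CASE=all
case "$TEST_CASE" in
  fault_management) RUN="fault_management" ;;
  maintenance)      RUN="maintenance" ;;
  all)              RUN="fault_management maintenance" ;;
  *) echo "unknown TEST_CASE: $TEST_CASE" >&2; exit 1 ;;
esac
echo "will run: $RUN"
```
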
Run Python Test Script

You can run the python script as follows:

git clone https://gerrit.opnfv.org/gerrit/doctor
cd doctor && tox

You can see all the configurations with default values in the sample configuration file doctor.sample.conf. You can also modify the file to match your environment and then run the test.

In OPNFV Apex jumphost you can run Doctor testing as follows using tox:

#Before Gambia: overcloudrc.v3
source overcloudrc
export INSTALLER_IP=${INSTALLER_IP}
export INSTALLER_TYPE=${INSTALLER_TYPE}
git clone https://gerrit.opnfv.org/gerrit/doctor
cd doctor
sudo -E tox
Run Functest Suite

Functest supports Doctor testing by triggering the test script above in a Functest container. You can run the Doctor test with the following steps:

DOCKER_TAG=latest
docker pull docker.io/opnfv/functest-features:${DOCKER_TAG}
docker run --privileged=true -id \
    -e INSTALLER_TYPE=${INSTALLER_TYPE} \
    -e INSTALLER_IP=${INSTALLER_IP} \
    -e INSPECTOR_TYPE=sample \
    docker.io/opnfv/functest-features:${DOCKER_TAG} /bin/bash
docker exec <container_id> functest testcase run doctor-notification

See Functest Userguide for more information.

For testing with a stable version, change DOCKER_TAG to 'stable' or another release tag identifier.

Tips
Doctor: Fault Management and Maintenance
Project:

Doctor, https://wiki.opnfv.org/doctor

Editors:

Ashiq Khan (NTT DOCOMO), Gerald Kunzmann (NTT DOCOMO)

Authors:

Ryota Mibu (NEC), Carlos Goncalves (NEC), Tomi Juvonen (Nokia), Tommy Lindgren (Ericsson), Bertrand Souville (NTT DOCOMO), Balazs Gibizer (Ericsson), Ildiko Vancsa (Ericsson) and others.

Abstract:

Doctor is an OPNFV requirement project [DOCT]. Its scope is NFVI fault management and maintenance, and it aims at developing and realizing the consequent implementation for the OPNFV reference platform.

This deliverable introduces the use cases and operational scenarios for fault management considered in the Doctor project. From the general features, a high-level architecture describing logical building blocks and interfaces is derived. Finally, a detailed implementation is introduced, based on available open source components, and a related gap analysis is done as part of this project. The implementation plan finally discusses an initial realization of an NFVI fault management and maintenance solution in open source software.

Definition of terms

Different SDOs and communities use different terminology related to NFV/Cloud/SDN. This list tries to define an OPNFV terminology, mapping/translating the OPNFV terms to terminology used in other contexts.

ACT-STBY configuration
Failover configuration common in Telco deployments. It enables the operator to use a standby (STBY) instance to take over the functionality of a failed active (ACT) instance.
Administrator
Administrator of the system, e.g. OAM in Telco context.
Consumer
User-side Manager; consumer of the interfaces produced by the VIM; VNFM, NFVO, or Orchestrator in ETSI NFV [ENFV] terminology.
EPC
Evolved Packet Core, the main component of the core network architecture of 3GPP’s LTE communication standard.
MME
Mobility Management Entity, an entity in the EPC dedicated to mobility management.
NFV
Network Function Virtualization
NFVI
Network Function Virtualization Infrastructure; totality of all hardware and software components which build up the environment in which VNFs are deployed.
S/P-GW
Serving/PDN-Gateway, two entities in the EPC dedicated to routing user data packets and providing connectivity from the UE to external packet data networks (PDN), respectively.
Physical resource
Actual resources in NFVI; not visible to Consumer.
VNFM
Virtualized Network Function Manager; functional block that is responsible for the lifecycle management of VNF.
NFVO
Network Functions Virtualization Orchestrator; functional block that manages the Network Service (NS) lifecycle and coordinates the management of NS lifecycle, VNF lifecycle (supported by the VNFM) and NFVI resources (supported by the VIM) to ensure an optimized allocation of the necessary resources and connectivity.
VIM
Virtualized Infrastructure Manager; functional block that is responsible for controlling and managing the NFVI compute, storage and network resources, usually within one operator’s Infrastructure Domain, e.g. NFVI Point of Presence (NFVI-PoP).
Virtual Machine (VM)
Virtualized computation environment that behaves very much like a physical computer/server.
Virtual network
Virtual network routes information among the network interfaces of VM instances and physical network interfaces, providing the necessary connectivity.
Virtual resource
A Virtual Machine (VM), a virtual network, or virtualized storage; Offered resources to “Consumer” as result of infrastructure virtualization; visible to Consumer.
Virtual Storage
Virtualized non-volatile storage allocated to a VM.
VNF
Virtualized Network Function. Implementation of a Network Function that can be deployed on a Network Function Virtualization Infrastructure (NFVI).
Introduction

The goal of this project is to build an NFVI fault management and maintenance framework supporting high availability of the Network Services on top of the virtualized infrastructure. The key feature is immediate notification of unavailability of virtualized resources from VIM, to support failure recovery, or failure avoidance of VNFs running on them. Requirement survey and development of missing features in NFVI and VIM are in scope of this project in order to fulfil requirements for fault management and maintenance in NFV.

The purpose of this requirement project is to clarify the necessary features of NFVI fault management and maintenance, identify missing features in current open source implementations, provide a potential implementation architecture and plan, provide implementation guidelines to relevant upstream projects to realize those missing features, and define the VIM northbound interfaces necessary to perform NFVI fault management and maintenance in alignment with ETSI NFV [ENFV].

Problem description

A Virtualized Infrastructure Manager (VIM), e.g. OpenStack [OPSK], cannot detect certain Network Functions Virtualization Infrastructure (NFVI) faults. This feature is necessary to detect the faults and notify the Consumer in order to ensure the proper functioning of EPC VNFs like MME and S/P-GW.

  • EPC VNFs are often in active standby (ACT-STBY) configuration and need to switch from STBY mode to ACT mode as soon as relevant faults are detected in the active (ACT) VNF.
  • NFVI encompasses all elements building up the environment in which VNFs are deployed, e.g., Physical Machines, Hypervisors, Storage, and Network elements.

In addition, VIM, e.g. OpenStack, needs to receive maintenance instructions from the Consumer, i.e. the operator/administrator of the VNF.

  • Change the state of certain Physical Machines (PMs), e.g. empty the PM, so that maintenance work can be performed at these machines.

Note: Although fault management and maintenance are different operations in NFV, both are considered as part of this project as – except for the trigger – they share a very similar work and message flow. Hence, from implementation perspective, these two are kept together in the Doctor project because of this high degree of similarity.

Use cases and scenarios

Telecom services often have very high requirements on service performance. As a consequence, they often utilize redundancy and high availability (HA) mechanisms for both the service and the platform. The HA support may be built in or provided by the platform. In any case, the HA support typically has a very fast detection and reaction time to minimize service impact. The main changes proposed in this document make a clear distinction between a) fault management and recovery within the VIM/NFVI and b) High Availability support for VNFs, claiming that HA support within a VNF or as a service from the platform is outside the scope of Doctor and is discussed in the High Availability for OPNFV project. Doctor should focus on detecting and remediating faults in the NFVI. This will ensure that applications come back to a fully redundant configuration faster than before.

As an example, Telecom services can come with an Active-Standby (ACT-STBY) configuration which is a (1+1) redundancy scheme. ACT and STBY nodes (aka Physical Network Function (PNF) in ETSI NFV terminology) are in a hot standby configuration. If an ACT node is unable to function properly due to fault or any other reason, the STBY node is instantly made ACT, and affected services can be provided without any service interruption.

The ACT-STBY configuration needs to be maintained. This means, when a STBY node is made ACT, either the previously ACT node, after recovery, shall be made STBY, or, a new STBY node needs to be configured. The actual operations to instantiate/configure a new STBY are similar to instantiating a new VNF and therefore are outside the scope of this project.

The NFVI fault management and maintenance requirements aim at providing fast failure detection of physical and virtualized resources and remediation of the virtualized resources provided to Consumers according to their predefined request to enable applications to recover to a fully redundant mode of operation.

  1. Fault management/recovery using ACT-STBY configuration (Triggered by critical error)
  2. Preventive actions based on fault prediction (Preventing service stop by handling warnings)
  3. VM Retirement (Managing service during NFVI maintenance, i.e. H/W, Hypervisor, Host OS, maintenance)
Faults
Fault management using ACT-STBY configuration

In figure1, a system-wide view of relevant functional blocks is presented. OpenStack is considered as the VIM implementation (aka Controller) which has interfaces with the NFVI and the Consumers. The VNF implementation is represented as different virtual resources marked by different colors. Consumers (VNFM or NFVO in ETSI NFV terminology) own/manage the respective virtual resources (VMs in this example) shown with the same colors.

The first requirement in this use case is that the Controller needs to detect faults in the NFVI (“1. Fault Notification” in figure1) affecting the proper functioning of the virtual resources (labelled as VM-x) running on top of it. It should be possible to configure which relevant fault items should be detected. The VIM (e.g. OpenStack) itself could be extended to detect such faults. Alternatively, a third party fault monitoring tool could be used which then informs the VIM about such faults; this third party fault monitoring element can be considered as a component of VIM from an architectural point of view.

Once such fault is detected, the VIM shall find out which virtual resources are affected by this fault. In the example in figure1, VM-4 is affected by a fault in the Hardware Server-3. Such mapping shall be maintained in the VIM, depicted as the “Server-VM info” table inside the VIM.

Once the VIM has identified which virtual resources are affected by the fault, it needs to find out who is the Consumer (i.e. the owner/manager) of the affected virtual resources (Step 2). In the example shown in figure1, the VIM knows that for the red VM-4, the manager is the red Consumer through an Ownership info table. The VIM then notifies (Step 3 “Fault Notification”) the red Consumer about this fault, preferably with sufficient abstraction rather than detailed physical fault information.

Figure 1 (figure1.png): Fault management/recovery use case

The Consumer then switches to STBY configuration by switching the STBY node to ACT state (Step 4). It further initiates a process to instantiate/configure a new STBY. However, switching to STBY mode and creating a new STBY machine is a VNFM/NFVO level operation and therefore outside the scope of this project. Doctor project does not create interfaces for such VNFM level configuration operations. Yet, since the total failover time of a consumer service depends on both the delay of such processes as well as the reaction time of Doctor components, minimizing Doctor’s reaction time is a necessary basic ingredient to fast failover times in general.

Once the Consumer has switched to STBY configuration, it notifies (Step 5 “Instruction” in figure1) the VIM. The VIM can then take necessary (e.g. pre-determined by the involved network operator) actions on how to clean up the fault affected VMs (Step 6 “Execute Instruction”).

The key issue in this use case is that a VIM (OpenStack in this context) shall not take a standalone fault recovery action (e.g. migration of the affected VMs) before the ACT-STBY switching is complete, as that might violate the ACT-STBY configuration and render the node out of service.

As an extension of the 1+1 ACT-STBY resilience pattern, a STBY instance can act as backup to N ACT nodes (N+1). In this case, the basic information flow remains the same, i.e., the consumer is informed of a failure in order to activate the STBY node. However, in this case it might be useful for the failure notification to cover a number of failed instances due to the same fault (e.g., more than one instance might be affected by a switch failure). The reaction of the consumer might depend on whether only one active instance has failed (similar to the ACT-STBY case), or if more active instances are needed as well.

Preventive actions based on fault prediction

The fault management scenario explained in Fault management using ACT-STBY configuration can also be performed based on fault prediction. In such cases, the VIM contains an intelligent fault prediction module which, based on its NFVI monitoring information, can predict an imminent fault in elements of the NFVI. A simple example is the rising temperature of a Hardware Server, which might trigger a pre-emptive recovery action. The requirements of such fault prediction in the VIM are investigated in the OPNFV project “Data Collection for Failure Prediction” [PRED].

This use case is very similar to Fault management using ACT-STBY configuration. Instead of a fault detection (Step 1 “Fault Notification” in figure1), the trigger comes from a fault prediction module in the VIM, or from a third party module which notifies the VIM about an imminent fault. From Steps 2~5, the workflow is the same as in the “Fault management using ACT-STBY configuration” use case, except that in this case the Consumer of a VM/VNF switches to the STBY configuration based on a predicted fault, rather than an occurred fault.

NFVI Maintenance
VM Retirement

All network operators perform maintenance of their network infrastructure, both regularly and irregularly. Besides the hardware, virtualization is expected to increase the number of elements subject to such maintenance, as the NFVI holds new elements like the hypervisor and host OS. Maintenance of a particular resource element, e.g. hardware or hypervisor, may render a particular server hardware unusable until the maintenance procedure is complete.

However, the Consumer of VMs needs to know that such resources will be unavailable because of NFVI maintenance. The following use case is again to ensure that the ACT-STBY configuration is not violated. A stand-alone action (e.g. live migration) from VIM/OpenStack to empty a physical machine so that a consequent maintenance procedure can be performed may not only violate the ACT-STBY configuration, but also impact real-time processing scenarios where resources dedicated to virtual resources (e.g. VMs) are necessary and a pause in operation (e.g. of a vCPU) is not allowed. The Consumer is in a position to safely perform the switch between ACT and STBY nodes, or to switch to an alternative VNF forwarding graph, so that the hardware servers hosting the ACT nodes can be emptied for the upcoming maintenance operation. Once the target hardware servers are emptied (i.e. no virtual resources are running on top), the VIM can mark them with an appropriate flag (i.e. “maintenance” state) such that these servers are not considered for hosting of virtual machines until the maintenance flag is cleared (i.e. the nodes are back in “normal” status).
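The maintenance flag described above can be sketched as a simple host-state check that the scheduler consults; the state names follow the text (“normal”/“maintenance”), while the data shapes are assumed for illustration:

```python
# Minimal sketch of the "maintenance" flag; state names follow the text,
# everything else is an assumption for illustration.

host_state = {"server-1": "normal", "server-2": "normal"}
host_vms = {"server-1": [], "server-2": ["VM-2"]}  # server-1 already emptied

def set_maintenance(host):
    """Mark an emptied host so it is not considered for hosting VMs
    until the flag is cleared."""
    if host_vms[host]:
        raise RuntimeError(f"{host} still hosts virtual resources")
    host_state[host] = "maintenance"

def schedulable_hosts():
    """Only hosts in 'normal' state are candidates for new VMs."""
    return [h for h, s in host_state.items() if s == "normal"]

set_maintenance("server-1")
print(schedulable_hosts())  # server-1 is no longer considered
```

The guard in `set_maintenance` reflects the requirement that the flag is only set once the target server has been emptied.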

A high-level view of the maintenance procedure is presented in figure2. VIM/OpenStack, through its northbound interface, receives a maintenance notification (Step 1 “Maintenance Request”) from the Administrator (e.g. a network operator) including information about which hardware is subject to maintenance. Maintenance operations include replacement/upgrade of hardware, update/upgrade of the hypervisor/host OS, etc.

The consequent steps to enable the Consumer to perform ACT-STBY switching are very similar to the fault management scenario. From its internal database, VIM/OpenStack finds out which virtual resources (VM-x) are running on those particular Hardware Servers and who the managers of those virtual resources are (Step 2). The VIM then informs the respective Consumer (VNFMs or NFVO) in Step 3 “Maintenance Notification”. Based on this, the Consumer takes necessary actions (Step 4, e.g. switch to STBY configuration or switch VNF forwarding graphs) and then notifies the VIM (Step 5 “Instruction”). Upon receiving such notification, the VIM takes the necessary actions (Step 6 “Execute Instruction”) to empty the Hardware Servers so that the consequent maintenance operations can be performed. Due to the similarity of Steps 2~6, the maintenance procedure and the fault management procedure are investigated in the same project.

_images/figure2.png

Maintenance use case

High level architecture and general features
Functional overview

The Doctor project revolves around two distinct use cases: 1) management of failures of virtualized resources and 2) planned maintenance, e.g. migration, of virtualized resources. Both of them may affect a VNF/application and the network service it provides, but they differ in frequency and in how they can be handled.

Failures are spontaneous events that may or may not have an impact on the virtual resources. The Consumer should as soon as possible react to the failure, e.g., by switching to the STBY node. The Consumer will then instruct the VIM on how to clean up or repair the lost virtual resources, i.e. restore the VM, VLAN or virtualized storage. How much the applications are affected varies. Applications with built-in HA support might experience a short decrease in retainability (e.g. an ongoing session might be lost) while keeping availability (establishment or re-establishment of sessions are not affected), whereas the impact on applications without built-in HA may be more serious. How much the network service is impacted depends on how the service is implemented. With sufficient network redundancy the service may be unaffected even when a specific resource fails.

On the other hand, planned maintenance impacting virtualized resources consists of events that are known in advance. This group includes e.g. migration due to software upgrades of the OS and hypervisor on a compute host. Some of these might have been requested by the application or its management solution, but there is also a need for coordination of the actual operations on the virtual resources. There may be an impact on the applications and the service, but since these are not spontaneous events there is room for planning and coordination between the application management organization and the infrastructure management organization, including performing whatever actions are required to minimize the problems.

Failure prediction is the process of pro-actively identifying situations that may lead to a failure in the future unless acted on by means of maintenance activities. From applications’ point of view, failure prediction may impact them in two ways: either the warning time is so short that the application or its management solution does not have time to react, in which case it is equal to the failure scenario, or there is sufficient time to avoid the consequences by means of maintenance activities, in which case it is similar to planned maintenance.

Architecture Overview

NFV and the Cloud platform provide virtual resources and related control functionality to users and administrators. figure3 shows the high level architecture of NFV focusing on the NFVI, i.e., the virtualized infrastructure. The NFVI provides virtual resources, such as virtual machines (VM) and virtual networks. Those virtual resources are used to run applications, i.e. VNFs, which could be components of a network service which is managed by the consumer of the NFVI. The VIM provides functionalities of controlling and viewing virtual resources on hardware (physical) resources to the consumers, i.e., users and administrators. OpenStack is a prominent candidate for this VIM. The administrator may also directly control the NFVI without using the VIM.

Although OpenStack is the target upstream project where the new functional elements (Controller, Notifier, Monitor, and Inspector) are expected to be implemented, a particular implementation method is not assumed. Some of these elements may sit outside of OpenStack and offer a northbound interface to OpenStack.

General Features and Requirements

The following features are required for the VIM to achieve high availability of applications (e.g., MME, S/P-GW) and the Network Services:

  1. Monitoring: Monitor physical and virtual resources.
  2. Detection: Detect unavailability of physical resources.
  3. Correlation and Cognition: Correlate faults and identify affected virtual resources.
  4. Notification: Notify unavailable virtual resources to their Consumer(s).
  5. Fencing: Shut down or isolate a faulty resource.
  6. Recovery action: Execute actions to process fault recovery and maintenance.

The time interval between the instant that an event is detected by the monitoring system and the Consumer notification of unavailable resources shall be < 1 second (e.g., Step 1 to Step 4 in figure4).

_images/figure3.png

High level architecture

Monitoring

The VIM shall monitor physical and virtual resources for unavailability and suspicious behavior.

Detection

The VIM shall detect unavailability and failures of physical resources that might cause errors/faults in virtual resources running on top of them. Unavailability of a physical resource is detected by various monitoring and management tools for hardware and software components. This may also include predicting upcoming faults. Note: fault prediction is out of scope of this project and is investigated in the OPNFV “Data Collection for Failure Prediction” project [PRED].

The fault items/events to be detected shall be configurable.

The configuration shall enable Failure Selection and Aggregation. Failure aggregation means that the VIM determines unavailability of a physical resource from more than two non-critical failures related to the same resource.
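The failure selection and aggregation rule can be sketched as a per-resource counter; the severity labels and the shape of the events are assumptions, while the “more than two non-critical failures” threshold follows the text:

```python
# Sketch of failure selection and aggregation; the "more than two
# non-critical failures" threshold follows the text, data shapes are assumed.
from collections import defaultdict

NON_CRITICAL_THRESHOLD = 2  # more than two -> resource declared unavailable

non_critical_counts = defaultdict(int)

def on_raw_failure(resource, severity):
    """Return True when the resource shall be declared unavailable."""
    if severity == "critical":
        return True  # selection: a single critical failure is enough
    non_critical_counts[resource] += 1  # aggregation of non-critical failures
    return non_critical_counts[resource] > NON_CRITICAL_THRESHOLD

events = [("server-3", "non-critical")] * 3
print([on_raw_failure(r, s) for r, s in events])  # the third event trips it
```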

There are two types of unavailability - immediate and future:

  • Immediate unavailability can be detected by setting traps of raw failures on hardware monitoring tools.
  • Future unavailability can be found by receiving maintenance instructions issued by the administrator of the NFVI or by failure prediction mechanisms.
Correlation and Cognition

The VIM shall correlate each fault to the impacted virtual resource, i.e., the VIM shall identify unavailability of virtualized resources that are or will be affected by failures on the physical resources under them. Unavailability of a virtualized resource is determined by referring to the mapping of physical and virtualized resources.

VIM shall allow configuration of fault correlation between physical and virtual resources. VIM shall support correlating faults:

  • between a physical resource and another physical resource
  • between a physical resource and a virtual resource
  • between a virtual resource and another virtual resource

Failure aggregation is also required in this feature, e.g., a user may request to be only notified if failures on more than two standby VMs in an (N+M) deployment model occurred.
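The three configurable correlation types listed above can be sketched as rules over a dependency graph. The rule and dependency structures below are illustrative assumptions:

```python
# Sketch of configurable fault correlation; rule format and dependency
# graph are assumptions for illustration.
correlation_rules = [
    ("physical", "physical"),  # e.g. switch failure -> attached servers
    ("physical", "virtual"),   # e.g. host failure -> VMs on it
    ("virtual", "virtual"),    # e.g. vSwitch failure -> attached VMs
]

# resource -> list of (kind, dependent resource)
dependencies = {
    "switch-1": [("physical", "server-2"), ("physical", "server-3")],
    "server-3": [("virtual", "VM-4")],
}

def correlate(resource, kind="physical"):
    """Transitively collect resources affected by a fault, honoring the
    configured correlation rule types."""
    affected = []
    for dep_kind, dep in dependencies.get(resource, []):
        if (kind, dep_kind) in correlation_rules:
            affected.append(dep)
            affected.extend(correlate(dep, dep_kind))
    return affected

print(correlate("switch-1"))  # one switch fault cascades to servers and a VM
```

This also illustrates why a single physical fault (the switch) may yield a notification covering several affected instances, as noted for the N+1 case.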

Notification

The VIM shall notify the alarm, i.e., unavailability of virtual resource(s), to the Consumer owning it over the northbound interface, such that the Consumers impacted by the failure can take appropriate actions to recover from the failure.

The VIM shall also notify the unavailability of physical resources to its Administrator.

All notifications shall be transferred immediately in order to minimize the stalling time of the network service and to avoid over-assignment caused by delayed capability updates.

There may be multiple consumers, so the VIM has to find out the owner of a faulty resource. Moreover, there may be a large number of virtual and physical resources in a real deployment, so polling the state of all resources to the VIM would lead to heavy signaling traffic. Thus, a publication/subscription messaging model is better suited for these notifications, as notifications are only sent to subscribed consumers.

Notifications will be sent out according to the configuration provided by the consumer. The configuration includes endpoint(s), in which the consumers can specify multiple targets for the notification subscription, so that various and multiple receiver functions can consume the notification message. Also, the conditions for notifications shall be configurable, such that the consumer can set policies accordingly, e.g. whether or not it wants to receive fault notifications.

Note: the VIM should only accept notification subscriptions for each resource by its owner or administrator. Notifications to the Consumer about the unavailability of virtualized resources will include a description of the fault, preferably with sufficient abstraction rather than detailed physical fault information.
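The publication/subscription model with per-consumer filters, multiple endpoints, and ownership-scoped subscriptions described above can be sketched as follows. The subscription and alarm structures are illustrative assumptions:

```python
# Publish/subscribe sketch for northbound fault notifications;
# subscription and alarm structures are assumptions for illustration.
subscriptions = []  # (consumer, owned resources, endpoints, severity filter)

def subscribe(consumer, resources, endpoints, severities):
    """A Consumer subscribes for its own resources, optionally with
    multiple notification endpoints and a severity filter."""
    subscriptions.append((consumer, set(resources), endpoints, set(severities)))

def publish(alarm):
    """Deliver the alarm only to subscribed owners whose filter matches;
    unsubscribed consumers are never polled or notified."""
    delivered = []
    for consumer, resources, endpoints, severities in subscriptions:
        if alarm["resource"] in resources and alarm["severity"] in severities:
            delivered += [(consumer, ep, alarm) for ep in endpoints]
    return delivered

subscribe("consumer-red", ["VM-4"],
          ["http://red/alarms", "http://red/backup"], ["critical"])
print(publish({"resource": "VM-4", "severity": "critical"}))
```

Because only matching subscriptions are evaluated, no signaling traffic is generated for consumers that did not subscribe, which is the scalability argument made above.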

Fencing

Recovery actions, e.g. safe VM evacuation, have to be preceded by fencing the failed host. Fencing hereby means to isolate or shut down a faulty resource. Without fencing – when the perceived disconnection is due to some transient or partial failure – the evacuation might lead to two identical instances running simultaneously, causing a dangerous conflict.

There is a cross-project definition in OpenStack of how to implement fencing, but there has not been any progress. The general description is available here: https://wiki.openstack.org/wiki/Fencing_Instances_of_an_Unreachable_Host

OpenStack provides some mechanisms that allow fencing of faulty resources. Some are automatically invoked by the platform itself (e.g. Nova disables the compute service when libvirtd stops running, preventing new VMs to be scheduled to that node), while other mechanisms are consumer trigger-based actions (e.g. Neutron port admin-state-up). For other fencing actions not supported by OpenStack, the Doctor project may suggest ways to address the gap (e.g. through means of resourcing to external tools and orchestration methods), or documenting or implementing them upstream.

The Doctor Inspector component will be responsible for marking resources down in OpenStack, and for bringing them back up if necessary.

Recovery Action

In the basic Fault management using ACT-STBY configuration use case, no automatic actions will be taken by the VIM, but all recovery actions executed by the VIM and the NFVI will be instructed and coordinated by the Consumer.

In a more advanced use case, the VIM may be able to recover the failed virtual resources according to a pre-defined behavior for that resource. In principle this means that the owner of the resource (i.e., its consumer or administrator) can define which recovery actions shall be taken by the VIM. Examples are a restart of the VM or migration/evacuation of the VM.

High level northbound interface specification
Fault Management

This interface allows the Consumer to subscribe to fault notification from the VIM. Using a filter, the Consumer can narrow down which faults should be notified. A fault notification may trigger the Consumer to switch from ACT to STBY configuration and initiate fault recovery actions. A fault query request/response message exchange allows the Consumer to find out about active alarms at the VIM. A filter can be used to narrow down the alarms returned in the response message.

_images/figure4.png

High-level message flow for fault management

The high level message flow for the fault management use case is shown in figure4. It consists of the following steps:

  1. The VIM monitors the physical and virtual resources and the fault management workflow is triggered by a monitored fault event.
  2. Event correlation, fault detection and aggregation in VIM. Note: this may also happen after Step 3.
  3. Database lookup to find the virtual resources affected by the detected fault.
  4. Fault notification to Consumer.
  5. The Consumer switches to standby configuration (STBY).
  6. Instructions to VIM requesting certain actions to be performed on the affected resources, for example migrate/update/terminate specific resource(s). After reception of such instructions, the VIM executes the requested action, e.g., it will migrate or terminate a virtual resource.
NFVI Maintenance

The NFVI maintenance interface allows the Administrator to notify the VIM about a planned maintenance operation on the NFVI. A maintenance operation may for example be an update of the server firmware or the hypervisor. The MaintenanceRequest message contains instructions to change the state of the physical resource from ‘enabled’ to ‘going-to-maintenance’ and a timeout [1]. After receiving the MaintenanceRequest, the VIM decides on the actions to be taken based on maintenance policies predefined by the affected Consumer(s).

[1] Timeout is set by the Administrator and corresponds to the maximum time to empty the physical resources.
_images/figure5a.png

High-level message flow for maintenance policy enforcement

The high level message flow for the NFVI maintenance policy enforcement is shown in figure5a. It consists of the following steps:

  1. Maintenance trigger received from Administrator.
  2. VIM switches the affected physical resources to “going-to-maintenance” state, e.g. so that no new VMs will be scheduled on the physical servers.
  3. Database lookup to find the Consumer(s) and virtual resources affected by the maintenance operation.
  4. Maintenance policies are enforced in the VIM, e.g. affected VM(s) are shut down on the physical server(s), or affected Consumer(s) are notified about the planned maintenance operation (steps 4a/4b).

Once the affected Consumer(s) have been notified, they take specific actions (e.g. switch to standby (STBY) configuration, request to terminate the virtual resource(s)) to allow the maintenance action to be executed. After the physical resources have been emptied, the VIM puts the physical resources in “in-maintenance” state and sends a MaintenanceResponse back to the Administrator.

_images/figure5b.png

Successful NFVI maintenance

The high level message flow for a successful NFVI maintenance is shown in figure5b. It consists of the following steps:

  1. The Consumer C3 switches to standby configuration (STBY).
  2. Instructions from Consumers C2/C3 are sent to the VIM requesting certain actions to be performed (steps 6a, 6b). After receiving such instructions, the VIM executes the requested actions in order to empty the physical resources (step 6c) and informs the Consumers about the result of the actions (steps 6d, 6e).
  3. The VIM switches the physical resources to “in-maintenance” state.
  4. Maintenance response is sent from VIM to inform the Administrator that the physical servers have been emptied.
  5. The Administrator is coordinating and executing the maintenance operation/work on the NFVI. Note: this step is out of scope of Doctor project.

The requested actions to empty the physical resources may not be successful (e.g. migration fails or takes too long) and in such a case, the VIM puts the physical resources back to ‘enabled’ and informs the Administrator about the problem.

_images/figure5c.png

Example of failed NFVI maintenance

An example of a high level message flow to cover the failed NFVI maintenance case is shown in figure5c. It consists of the following steps:

  1. The Consumer C3 switches to standby configuration (STBY).
  2. Instructions from Consumers C2/C3 are sent to the VIM requesting certain actions to be performed (steps 6a, 6b). The VIM executes the requested actions and sends back a NACK to consumer C2 (step 6d) as the migration of the virtual resource(s) is not completed by the given timeout.
  3. The VIM switches the physical resources to “enabled” state.
  4. MaintenanceNotification is sent from VIM to inform the Administrator that the maintenance action cannot start.
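The successful and failed maintenance flows above differ only in whether the physical resources are emptied within the timeout, so both can be sketched in one handler. Field names and return strings beyond the states quoted in the text are assumptions:

```python
# Illustrative MaintenanceRequest handling with the timeout from [1];
# state strings follow the text, other names are assumed.
import time

host_state = {"server-1": "enabled"}

def on_maintenance_request(host, timeout_s, empty_host):
    """Move the host toward maintenance; roll back to 'enabled' if
    emptying the physical resource fails or exceeds the timeout."""
    host_state[host] = "going-to-maintenance"
    deadline = time.monotonic() + timeout_s
    # empty_host stands in for the Consumer-coordinated emptying (steps 6a-6e)
    if empty_host(host) and time.monotonic() <= deadline:
        host_state[host] = "in-maintenance"      # figure5b: success
        return "MaintenanceResponse: servers emptied"
    host_state[host] = "enabled"                 # figure5c: failure
    return "MaintenanceNotification: maintenance cannot start"

print(on_maintenance_request("server-1", 60.0, lambda h: True))
```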
Gap analysis in upstream projects

This section presents the gaps identified in existing VIM platforms, based on the features and requirements specified in Section 3.3.

VIM Northbound Interface
Immediate Notification
  • Type: ‘deficiency in performance’
  • Description
    • To-be
      • VIM has to notify unavailability of virtual resource (fault) to VIM user immediately.
      • Notification should be passed within ‘1 second’ after the fault is detected/notified by the VIM.
      • Also, the following conditions/requirement have to be met:
        • Only the owning user can receive notification of fault related to owned virtual resource(s).
    • As-is
      • OpenStack Metering ‘Ceilometer’ can notify unavailability of virtual resource (fault) to the owner of virtual resource based on alarm configuration by the user.
      • Alarm notifications are triggered by the alarm evaluator, rather than by notification agents that might receive fault events.
      • Evaluation interval should be equal to or larger than configured pipeline interval for collection of underlying metrics.
      • The interval for collection has to be set large enough, depending on the size of the deployment and the number of metrics to be collected.
      • The interval cannot be less than one second, even in small deployments. The default value is 60 seconds.
      • Alternative: OpenStack has a message bus to publish system events. The operator can allow the user to connect to this bus, but there are no functions to filter out events that should not be passed to the user or that were not requested by the user.
    • Gap
      • Fault notifications cannot be received immediately by Ceilometer.
  • Solved by
Maintenance Notification
VIM Southbound interface
Normalization of data collection models
  • Type: ‘missing’
  • Description
    • To-be
      • A normalized data format needs to be created to cope with the many data models from different monitoring solutions.
    • As-is
      • Data can be collected from many places (e.g. Zabbix, Nagios, Cacti, Zenoss). Although each solution establishes its own data models, no common data abstraction models exist in OpenStack.
    • Gap
      • Normalized data format does not exist.
  • Solved by
OpenStack
Ceilometer

OpenStack offers a telemetry service, Ceilometer, for collecting measurements of the utilization of physical and virtual resources [CEIL]. Ceilometer can collect a number of metrics across multiple OpenStack components and watch for variations and trigger alarms based upon the collected data.

Scalability of fault aggregation
  • Type: ‘scalability issue’
  • Description
    • To-be
      • Be able to scale to a large deployment, where thousands of monitoring events per second need to be analyzed.
    • As-is
      • Performance issue when scaling to medium-sized deployments.
    • Gap
      • Ceilometer seems to be unsuitable for monitoring medium and large scale NFVI deployments.
  • Solved by
Monitoring of hardware and software
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • OpenStack (as the VIM) should monitor various hardware and software components in the NFVI in order to handle faults on them, e.g. through Ceilometer.
      • OpenStack may have monitoring functionality in itself and can be integrated with third-party monitoring tools.
      • OpenStack needs to be able to detect the faults listed in the Annex.
    • As-is
      • For each deployment of OpenStack, an operator has responsibility to configure monitoring tools with relevant scripts or plugins in order to monitor hardware and software.
      • OpenStack Ceilometer does not monitor hardware and software to capture faults.
    • Gap
      • Ceilometer is not able to detect and handle all faults listed in the Annex.
  • Solved by
Nova

OpenStack Nova [NOVA] is a mature and widely known and used component in OpenStack cloud deployments. It is the main part of an “infrastructure-as-a-service” system providing a cloud computing fabric controller, supporting a wide diversity of virtualization and container technologies.

Nova has proven throughout these past years to be highly available and fault-tolerant. Featuring its own API, it also provides a compatibility API with Amazon EC2 APIs.

Correct states when compute host is down
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • The API shall support to change VM power state in case host has failed.
      • The API shall support to change nova-compute state.
      • There could be single API to change different VM states for all VMs belonging to a specific host.
      • Support external systems that monitor the infrastructure and resources and that are able to call the API quickly and reliably.
      • Resource states are reliable such that correlation actions can be fast and automated.
      • User shall be able to read states from OpenStack and trust they are correct.
    • As-is
      • When a VM goes down due to a host HW, host OS or hypervisor failure, nothing happens in OpenStack. The VMs of a crashed host/hypervisor are reported to be live and OK through the OpenStack API.
      • nova-compute state might change too slowly, or the state is not reliable if VMs are also expected to be down. This makes it possible to schedule VMs onto a failed host, and the slowness blocks evacuation.
    • Gap
      • OpenStack does not change its states fast and reliably enough.
      • The API does not allow an external system to change states, nor to trust that the states are reliable (i.e. that the external system has fenced the failed host).
      • Users cannot read all the states from OpenStack, nor trust that they are correct.
  • Solved by
Evacuate VMs in Maintenance mode
  • Type: ‘missing’
  • Description
    • To-be
      • When maintenance mode for a compute host is set, trigger VM evacuation to available compute nodes before bringing the host down for maintenance.
    • As-is
      • If a compute node is set to maintenance mode, OpenStack only schedules evacuation of all VMs to available compute nodes if the in-maintenance compute node runs the XenAPI or VMware ESX hypervisor. Other hypervisors (e.g. KVM) are not supported and, hence, guest VMs will likely stop running due to maintenance actions the administrator may perform (e.g. hardware upgrades, OS updates).
    • Gap
      • Nova libvirt hypervisor driver does not implement automatic guest VMs evacuation when compute nodes are set to maintenance mode ($ nova host-update --maintenance enable <hostname>).
Monasca

Monasca is an open-source monitoring-as-a-service (MONaaS) solution that integrates with OpenStack. Even though it is still in its early days, it is in the interest of the community that the platform be multi-tenant, highly scalable, performant and fault-tolerant. It provides a streaming alarm engine, a notification engine, and a northbound REST API users can use to interact with Monasca. Hundreds of thousands of metrics per second can be processed [MONA].

Anomaly detection
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • Detect the failure and perform a root cause analysis to filter out other alarms that may be triggered due to their cascading relation.
    • As-is
      • A mechanism to detect root causes of failures is not available.
    • Gap
      • Certain failures can trigger many alarms due to their dependency on the underlying root cause of failure. Knowing the root cause can help filter out unnecessary and overwhelming alarms.
  • Status
    • Monasca as of now lacks this feature, although the community is aware and working toward supporting it.
Sensor monitoring
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • It should support monitoring sensor data retrieval, for instance, from IPMI.
    • As-is
      • Monasca does not monitor sensor data.
    • Gap
      • Sensor monitoring is very important. It provides operators status on the state of the physical infrastructure (e.g. temperature, fans).
  • Addressed by
    • Monasca can be configured to use third-party monitoring solutions (e.g. Nagios, Cacti) for retrieving additional data.
Hardware monitoring tools
Zabbix

Zabbix is an open-source solution for monitoring availability and performance of infrastructure components (i.e. servers and network devices), as well as applications [ZABB]. It can be customized for use with OpenStack. It is a mature tool and has been proven to be able to scale to large systems with 100,000s of devices.

Delay in execution of actions
  • Type: ‘deficiency in performance’
  • Description
    • To-be
      • After detecting a fault, the monitoring tool should immediately execute the appropriate action, e.g. inform the manager through the NB I/F.
    • As-is
      • A delay of around 10 seconds was measured in two independent testbed deployments.
    • Gap
Detailed architecture and interface specification

This section describes a detailed implementation plan, which is based on the high level architecture introduced in Section 3. Section 5.1 describes the functional blocks of the Doctor architecture, which is followed by a high level message flow in Section 5.2. Section 5.3 provides a mapping of selected existing open source components to the building blocks of the Doctor architecture. Thereby, the selection of components is based on their maturity and the gap analysis executed in Section 4. Sections 5.4 and 5.5 detail the specification of the related northbound interface and the related information elements. Finally, Section 5.6 provides a first set of blueprints to address selected gaps required for the realization of the functionalities of the Doctor project.

Functional Blocks

This section introduces the functional blocks that form the VIM. OpenStack was selected as the candidate for implementation. Inside the VIM, four different building blocks are defined (see figure6).

_images/figure6.png

Functional blocks

Monitor

The Monitor module has the responsibility for monitoring the virtualized infrastructure. There are already many existing tools and services (e.g. Zabbix) to monitor different aspects of hardware and software resources which can be used for this purpose.

Inspector

The Inspector module has the ability a) to receive various failure notifications regarding physical resource(s) from Monitor module(s), b) to find the affected virtual resource(s) by querying the resource map in the Controller, and c) to update the state of the virtual resource (and physical resource).

The Inspector has drivers for different types of events and resources to integrate any type of Monitor and Controller modules. It also uses a failure policy database to decide on the failure selection and aggregation from raw events. This failure policy database is configured by the Administrator.

The reason for separation of the Inspector and Controller modules is to make the Controller focus on simple operations by avoiding a tight integration of various health check mechanisms into the Controller.
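The Inspector's role — receiving monitor events, applying the administrator-configured failure policy, querying the Controller's resource map, and updating resource states — can be sketched as follows. All class, method, and policy names are illustrative assumptions:

```python
# Inspector sketch per the description above; all names are assumptions.

# Failure policy database, configured by the Administrator: it drives the
# failure selection from raw events.
failure_policy = {"link-down": "ignore", "host-down": "mark-down"}

class Controller:
    def __init__(self):
        self.resource_map = {"server-3": ["VM-4"]}  # physical -> virtual
        self.states = {}

    def affected_virtual_resources(self, host):
        return self.resource_map.get(host, [])

    def set_state(self, resource, state):
        self.states[resource] = state

class Inspector:
    def __init__(self, controller):
        self.controller = controller

    def on_monitor_event(self, host, event_type):
        """Apply the failure policy, then mark the physical resource and
        the virtual resources on top of it as down via the Controller."""
        if failure_policy.get(event_type) != "mark-down":
            return []
        marked = [host] + self.controller.affected_virtual_resources(host)
        for resource in marked:
            self.controller.set_state(resource, "down")
        return marked

ctl = Controller()
insp = Inspector(ctl)
print(insp.on_monitor_event("server-3", "host-down"))
```

Keeping the policy lookup and event handling in the Inspector leaves the Controller with the simple state-update operations, as argued above.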

Controller

The Controller is responsible for maintaining the resource map (i.e. the mapping from physical resources to virtual resources), accepting update requests for the resource state(s) (exposed as a provider API), and sending all failure events regarding virtual resources to the Notifier. Optionally, the Controller has the ability to force the state of a given physical resource to down in the resource mapping when it receives failure notifications from the Inspector for that given physical resource. The Controller also re-calculates the capacity of the NFVI when receiving a failure notification for a physical resource.

In a real-world deployment, the VIM may have several controllers, one for each resource type, such as Nova, Neutron and Cinder in OpenStack. Each controller maintains a database of virtual and physical resources which shall be the master source for resource information inside the VIM.

Notifier

The Notifier focuses on selecting and aggregating failure events received from the Controller based on policies mandated by the Consumer. It therefore allows the Consumer to subscribe to alarms regarding virtual resources, for example via an API endpoint. After receiving a fault event from a Controller, it notifies the Consumer about the fault by referring to the alarm configuration which was defined by the Consumer earlier on.

To reduce the complexity of the Controller, it is a good approach for the Controllers to emit all notifications without any filtering mechanism and to have another service (i.e. the Notifier) handle those notifications properly. This is the general philosophy of notifications in OpenStack. Note that a fault message consumed by the Notifier is different from the fault message received by the Inspector; the former message relates to virtual resources which are visible to users with relevant ownership, whereas the latter relates to raw devices or small entities which should be handled with administrator privileges.
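
The Notifier's selection and aggregation step can be sketched as a simple match of fault events against subscribed alarm configurations; the data shapes below are hypothetical and only illustrate the filtering idea:

```python
# Hypothetical alarm configurations registered by Consumers via subscriptions.
subscriptions = [
    {"subscription_id": "sub-1", "virtual_resource_ids": {"vm-1"}, "min_severity": 3},
    {"subscription_id": "sub-2", "virtual_resource_ids": {"vm-2"}, "min_severity": 5},
]

def notify(fault_event):
    """Return the subscriptions whose alarm configuration matches a fault event."""
    matched = []
    for sub in subscriptions:
        if fault_event["virtual_resource_id"] not in sub["virtual_resource_ids"]:
            continue
        # Per the MinSeverity element, only faults with a severity higher
        # than the configured value are notified.
        if fault_event["severity"] <= sub["min_severity"]:
            continue
        matched.append(sub["subscription_id"])
    return matched

print(notify({"virtual_resource_id": "vm-1", "severity": 4}))  # ['sub-1']
```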

The northbound interface between the Notifier and the Consumer/Administrator is specified in Detailed northbound interface specification.

Sequence
Fault Management

The detailed work flow for fault management is as follows (see also figure7):

  1. Request to subscribe to monitor specific virtual resources. A query filter can be used to narrow down the alarms the Consumer wants to be informed about.
  2. Each subscription request is acknowledged with a subscribe response message. The response message contains information about the subscribed virtual resources, in particular if a subscribed virtual resource is in “alarm” state.
  3. The NFVI sends monitoring events for resources the VIM has subscribed to. Note: this subscription message exchange between the VIM and NFVI is not shown in this message flow.
  4. Event correlation, fault detection and aggregation in VIM.
  5. Database lookup to find the virtual resources affected by the detected fault.
  6. Fault notification to Consumer.
  7. The Consumer switches to standby configuration (STBY).
  8. Instructions to VIM requesting certain actions to be performed on the affected resources, for example migrate/update/terminate specific resource(s). After reception of such instructions, the VIM executes the requested action, e.g. it will migrate or terminate a virtual resource.

In addition, the Consumer can query the VIM about the status of resources at any time:

  1. Query request from Consumer to VIM to get information about the current status of a resource.
  2. Response to the query request with information about the current status of the queried resource. In case the resource is in “fault” state, information about the related fault(s) is returned.

In order to allow for quick reaction to failures, the time interval between fault detection in step 3 and the corresponding recovery actions in steps 7 and 8 shall be less than 1 second.

_images/figure7.png

Fault management work flow

_images/figure8.png

Fault management scenario

figure8 shows a more detailed message flow (Steps 4 to 6) between the four building blocks introduced in Functional Blocks.

  1. The Monitor observes a fault in the NFVI and reports the raw fault to the Inspector. The Inspector filters and aggregates the faults using pre-configured failure policies.
  2. a) The Inspector queries the Resource Map to find the virtual resources affected by the raw fault in the NFVI. b) The Inspector updates the state of the affected virtual resources in the Resource Map. c) The Controller observes a change of the virtual resource state and informs the Notifier about the state change and the related alarm(s). Alternatively, the Inspector may directly inform the Notifier about it.
  3. The Notifier performs another filtering and aggregation of the changes and alarms based on the pre-configured alarm configuration. Finally, a fault notification is sent northbound to the Consumer.
NFVI Maintenance
_images/figure9.png

NFVI maintenance work flow

The detailed work flow for NFVI maintenance is shown in figure9 and has the following steps. Note that steps 1, 2, and 5 to 8a in the NFVI maintenance work flow are very similar to the steps in the fault management work flow and share a similar implementation plan in Release 1.

  1. Subscribe to fault/maintenance notifications.
  2. Response to subscribe request.
  3. Maintenance trigger received from administrator.
  4. VIM switches NFVI resources to “maintenance” state. This means, e.g., that they should not be used for further allocation/migration requests.
  5. Database lookup to find the virtual resources affected by the detected maintenance operation.
  6. Maintenance notification to Consumer.
  7. The Consumer switches to standby configuration (STBY).
  8. Instructions from Consumer to VIM requesting certain recovery actions to be performed (step 8a). After reception of such instructions, the VIM executes the requested action in order to empty the physical resources (step 8b).
  9. Maintenance response from VIM to inform the Administrator that the physical machines have been emptied (or that the operation resulted in an error state).
  10. The Administrator coordinates and executes the maintenance operation/work on the NFVI.

In addition, the Administrator can query the VIM about the state of resources at any time:

  1. Query request from Administrator to VIM to get information about the current state of a resource.
  2. Response to the query request with information about the current state of the queried resource(s). In case the resource is in “maintenance” state, information about the related maintenance operation is returned.
_images/figure10.png

NFVI Maintenance scenario

figure10 shows a more detailed message flow (Steps 3 to 6 and 9) between the four building blocks introduced in Functional Blocks.

  1. The Administrator sends a StateChange request to the Controller residing in the VIM.

  2. The Controller queries the Resource Map to find the virtual resources affected by the planned maintenance operation.

  3. a) The Controller updates the state of the affected virtual resources in the Resource Map database.

    b) The Controller informs the Notifier about the virtual resources that will be affected by the maintenance operation.

  4. A maintenance notification is sent northbound to the Consumer.

...

  1. The Controller informs the Administrator after the physical resources have been freed.
Information elements

This section introduces all attributes and information elements used in the message exchange on the northbound interfaces between the VIM and the VNFM and NFVO.

Note: The information elements will be aligned with current work in ETSI NFV IFA working group.

Simple information elements:

  • SubscriptionID (Identifier): identifies a subscription to receive fault or maintenance notifications.
  • NotificationID (Identifier): identifies a fault or maintenance notification.
  • VirtualResourceID (Identifier): identifies a virtual resource affected by a fault or a maintenance action of the underlying physical resource.
  • PhysicalResourceID (Identifier): identifies a physical resource affected by a fault or maintenance action.
  • VirtualResourceState (String): state of a virtual resource, e.g. “normal”, “maintenance”, “down”, “error”.
  • PhysicalResourceState (String): state of a physical resource, e.g. “normal”, “maintenance”, “down”, “error”.
  • VirtualResourceType (String): type of the virtual resource, e.g. “virtual machine”, “virtual memory”, “virtual storage”, “virtual CPU”, or “virtual NIC”.
  • FaultID (Identifier): identifies the related fault in the underlying physical resource. This can be used to correlate different fault notifications caused by the same fault in the physical resource.
  • FaultType (String): Type of the fault. The allowed values for this parameter depend on the type of the related physical resource. For example, a resource of type “compute hardware” may have faults of type “CPU failure”, “memory failure”, “network card failure”, etc.
  • Severity (Integer): value expressing the severity of the fault. The higher the value, the more severe the fault.
  • MinSeverity (Integer): value used in filter information elements. Only faults with a severity higher than the MinSeverity value will be notified to the Consumer.
  • EventTime (Datetime): Time when the fault was observed.
  • EventStartTime and EventEndTime (Datetime): Datetime range that can be used in a FaultQueryFilter to narrow down the faults to be queried.
  • ProbableCause (String): information about the probable cause of the fault.
  • CorrelatedFaultID (Identifier): list of other faults correlated to this fault.
  • isRootCause (Boolean): Parameter indicating if this fault is the root for other correlated faults. If TRUE, then the faults listed in the parameter CorrelatedFaultID are caused by this fault.
  • FaultDetails (Key-value pair): provides additional information about the fault, e.g. information about the threshold, monitored attributes, indication of the trend of the monitored parameter.
  • FirmwareVersion (String): current version of the firmware of a physical resource.
  • HypervisorVersion (String): current version of a hypervisor.
  • ZoneID (Identifier): Identifier of the resource zone. A resource zone is the logical separation of physical and software resources in an NFVI deployment for physical isolation, redundancy, or administrative designation.
  • Metadata (Key-value pair): provides additional information of a physical resource in maintenance/error state.

Complex information elements (see also UML diagrams in figure13 and figure14):

  • VirtualResourceInfoClass:
    • VirtualResourceID [1] (Identifier)
    • VirtualResourceState [1] (String)
    • Faults [0..*] (FaultClass): For each resource, all faults including detailed information about the faults are provided.
  • FaultClass: The parameters of the FaultClass are partially based on ETSI TS 132 111-2 (V12.1.0) [*], which is specifying fault management in 3GPP, in particular describing the information elements used for alarm notifications.
    • FaultID [1] (Identifier)
    • FaultType [1] (String)
    • Severity [1] (Integer)
    • EventTime [1] (Datetime)
    • ProbableCause [1] (String)
    • CorrelatedFaultID [0..*] (Identifier)
    • FaultDetails [0..*] (Key-value pair)
[*]http://www.etsi.org/deliver/etsi_ts/132100_132199/13211102/12.01.00_60/ts_13211102v120100p.pdf
  • SubscribeFilterClass
    • VirtualResourceType [0..*] (String)
    • VirtualResourceID [0..*] (Identifier)
    • FaultType [0..*] (String)
    • MinSeverity [0..1] (Integer)
  • FaultQueryFilterClass: narrows down the FaultQueryRequest, for example it limits the query to certain physical resources, a certain zone, a given fault type/severity/cause, or a specific FaultID.
    • VirtualResourceType [0..*] (String)
    • VirtualResourceID [0..*] (Identifier)
    • FaultType [0..*] (String)
    • MinSeverity [0..1] (Integer)
    • EventStartTime [0..1] (Datetime)
    • EventEndTime [0..1] (Datetime)
  • PhysicalResourceStateClass:
    • PhysicalResourceID [1] (Identifier)
    • PhysicalResourceState [1] (String): mandates the new state of the physical resource.
    • Metadata [0..*] (Key-value pair)
  • PhysicalResourceInfoClass:
    • PhysicalResourceID [1] (Identifier)
    • PhysicalResourceState [1] (String)
    • FirmwareVersion [0..1] (String)
    • HypervisorVersion [0..1] (String)
    • ZoneID [0..1] (Identifier)
    • Metadata [0..*] (Key-value pair)
  • StateQueryFilterClass: narrows down a StateQueryRequest, for example it limits the query to certain physical resources, a certain zone, or a given resource state (e.g., only resources in “maintenance” state).
    • PhysicalResourceID [1] (Identifier)
    • PhysicalResourceState [1] (String)
    • ZoneID [0..1] (Identifier)
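
As a minimal sketch, the fault-related information elements above could be represented as follows; the field names and cardinalities follow the listing, while the Python representation and the example values are illustrative:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List

@dataclass
class Fault:
    # FaultClass: [1] fields are required, [0..*] fields default to empty.
    fault_id: str                 # FaultID [1] (Identifier)
    fault_type: str               # FaultType [1] (String)
    severity: int                 # Severity [1] (Integer)
    event_time: datetime          # EventTime [1] (Datetime)
    probable_cause: str           # ProbableCause [1] (String)
    correlated_fault_ids: List[str] = field(default_factory=list)  # [0..*]
    fault_details: Dict[str, str] = field(default_factory=dict)    # [0..*]

@dataclass
class VirtualResourceInfo:
    # VirtualResourceInfoClass
    virtual_resource_id: str      # VirtualResourceID [1] (Identifier)
    virtual_resource_state: str   # e.g. “normal”, “maintenance”, “down”, “error”
    faults: List[Fault] = field(default_factory=list)  # Faults [0..*]

info = VirtualResourceInfo(
    virtual_resource_id="vm-1",
    virtual_resource_state="error",
    faults=[Fault("f-1", "CPU failure", 5,
                  datetime(2016, 4, 12, 8, 0, 0), "hardware failure")],
)
```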
Detailed northbound interface specification

This section specifies the northbound interfaces for fault management and NFVI maintenance between the VIM on the one end and the Consumer and the Administrator on the other end. For each interface, all messages and related information elements are provided.

Note: The interface definition will be aligned with current work in the ETSI NFV IFA working group.

All of the interfaces described below are produced by the VIM and consumed by the Consumer or Administrator.

Fault management interface

This interface allows the VIM to notify the Consumer about a virtual resource that is affected by a fault, either within the virtual resource itself or by the underlying virtualization infrastructure. The messages on this interface are shown in figure13 and explained in detail in the following subsections.

Note: The information elements used in this section are described in detail in Section 5.4.

_images/figure13.png

Fault management NB I/F messages

SubscribeRequest (Consumer -> VIM)

Subscription from Consumer to VIM to be notified about faults of specific resources. The faults to be notified about can be narrowed down using a subscribe filter.

Parameters:

  • SubscribeFilter [1] (SubscribeFilterClass): Optional information to narrow down the faults that shall be notified to the Consumer, for example limit to specific VirtualResourceID(s), severity, or cause of the alarm.
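
For illustration, a filter limiting notifications to severe CPU faults on two virtual machines could be encoded as below; the concrete field names and values are assumptions of this sketch, as the specification only defines the information elements:

```python
# Hypothetical encoding of a SubscribeRequest carrying a SubscribeFilterClass.
subscribe_request = {
    "subscribe_filter": {
        "virtual_resource_type": ["virtual machine"],  # VirtualResourceType [0..*]
        "virtual_resource_id": ["vm-1", "vm-2"],       # VirtualResourceID [0..*]
        "fault_type": ["CPU failure"],                 # FaultType [0..*]
        "min_severity": 3,                             # MinSeverity [0..1]
    }
}
```

Per the MinSeverity element, only faults with a severity higher than min_severity would be notified to this subscriber.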
SubscribeResponse (VIM -> Consumer)

Response to a subscribe request message including information about the subscribed resources, in particular if they are in “fault/error” state.

Parameters:

  • SubscriptionID [1] (Identifier): Unique identifier for the subscription. It can be used to delete or update the subscription.
  • VirtualResourceInfo [0..*] (VirtualResourceInfoClass): Provides additional information about the subscribed resources, i.e., a list of the related resources, the current state of the resources, etc.
FaultNotification (VIM -> Consumer)

Notification about a virtual resource that is affected by a fault, either within the virtual resource itself or in the underlying virtualization infrastructure. After reception of this notification, the Consumer will decide on the optimal action to resolve the fault. This includes actions like switching to a hot standby virtual resource, migration of the faulty virtual resource to another physical machine, or termination of the faulty virtual resource and instantiation of a new virtual resource in order to provide a new hot standby resource. In some use cases the Consumer can leave virtual resources on the failed host to be booted up again after the fault is recovered. Existing resource management interfaces and messages between the Consumer and the VIM can be used for those actions, and there is no need to define additional actions on the Fault Management Interface.

Parameters:

  • NotificationID [1] (Identifier): Unique identifier for the notification.
  • VirtualResourceInfo [1..*] (VirtualResourceInfoClass): List of faulty resources with detailed information about the faults.
FaultQueryRequest (Consumer -> VIM)

Request to find out about active alarms at the VIM. A FaultQueryFilter can be used to narrow down the alarms returned in the response message.

Parameters:

  • FaultQueryFilter [1] (FaultQueryFilterClass): narrows down the FaultQueryRequest, for example it limits the query to certain physical resources, a certain zone, a given fault type/severity/cause, or a specific FaultID.
FaultQueryResponse (VIM -> Consumer)

List of active alarms at the VIM matching the FaultQueryFilter specified in the FaultQueryRequest.

Parameters:

  • VirtualResourceInfo [0..*] (VirtualResourceInfoClass): List of faulty resources. For each resource all faults including detailed information about the faults are provided.
NFVI maintenance

The NFVI maintenance interface between the Consumer and the VIM allows the Consumer to subscribe to maintenance notifications provided by the VIM. The related maintenance interface between the Administrator and the VIM allows the Administrator to issue maintenance requests to the VIM, i.e. requesting the VIM to take appropriate actions to empty physical machine(s) in order to execute maintenance operations on them. The interface also allows the Administrator to query the state of physical machines, e.g., in order to get details on the current status of a maintenance operation like a firmware update.

The messages defined in these northbound interfaces are shown in figure14 and described in detail in the following subsections.

_images/figure14.png

NFVI maintenance NB I/F messages

SubscribeRequest (Consumer -> VIM)

Subscription from Consumer to VIM to be notified about maintenance operations for specific virtual resources. The resources to be informed about can be narrowed down using a subscribe filter.

Parameters:

  • SubscribeFilter [1] (SubscribeFilterClass): Information to narrow down the maintenance notifications that shall be sent to the Consumer, for example limiting them to specific virtual resource type(s).
SubscribeResponse (VIM -> Consumer)

Response to a subscribe request message, including information about the subscribed virtual resources, in particular if they are in “maintenance” state.

Parameters:

  • SubscriptionID [1] (Identifier): Unique identifier for the subscription. It can be used to delete or update the subscription.
  • VirtualResourceInfo [0..*] (VirtualResourceInfoClass): Provides additional information about the subscribed virtual resource(s), e.g., the ID, type and current state of the resource(s).
MaintenanceNotification (VIM -> Consumer)

Notification about a physical resource switched to “maintenance” state. After reception of this request, the Consumer will decide on the optimal action to address this request, e.g., to switch to the standby (STBY) configuration.

Parameters:

  • VirtualResourceInfo [1..*] (VirtualResourceInfoClass): List of virtual resources where the state has been changed to maintenance.
StateChangeRequest (Administrator -> VIM)

Request to change the state of a list of physical resources, e.g. to “maintenance” state, in order to prepare them for a planned maintenance operation.

Parameters:

  • PhysicalResourceState [1..*] (PhysicalResourceStateClass)
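
For illustration, a StateChangeRequest moving one physical host into maintenance could be encoded as below; the concrete field names and values are assumptions of this sketch:

```python
# Hypothetical encoding of a StateChangeRequest (Administrator -> VIM).
state_change_request = {
    "physical_resource_state": [
        {
            "physical_resource_id": "compute-1",        # PhysicalResourceID [1]
            "physical_resource_state": "maintenance",   # mandated new state [1]
            "metadata": {"reason": "firmware update"},  # Metadata [0..*]
        }
    ]
}
```

The corresponding StateChangeResponse would then carry PhysicalResourceInfoClass entries once the requested resources are in maintenance state (or an error occurred).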
StateChangeResponse (VIM -> Administrator)

Response message to inform the Administrator that the requested resources are now in maintenance state (or the operation resulted in an error) and the maintenance operation(s) can be executed.

Parameters:

  • PhysicalResourceInfo [1..*] (PhysicalResourceInfoClass)
StateQueryRequest (Administrator -> VIM)

Request from the Administrator to the VIM to get information about physical machine(s), e.g. their state (“normal”, “maintenance”), firmware version, hypervisor version, update status of firmware and hypervisor, etc. It can be used to check the progress during a firmware update and for confirmation after the update. A filter can be used to narrow down the resources returned in the response message.

Parameters:

  • StateQueryFilter [1] (StateQueryFilterClass): narrows down the StateQueryRequest, for example it limits the query to certain physical resources, a certain zone, or a given resource state.
StateQueryResponse (VIM -> Administrator)

List of physical resources matching the filter specified in the StateQueryRequest.

Parameters:

  • PhysicalResourceInfo [0..*] (PhysicalResourceInfoClass): List of physical resources. For each resource, information about the current state, the firmware version, etc. is provided.
NFV IFA, OPNFV Doctor and AODH alarms

This section compares the alarm interfaces of ETSI NFV IFA with the specifications of this document and the alarm class of AODH.

ETSI NFV specifies an interface for alarms from virtualised resources in ETSI GS NFV-IFA 005 [ENFV]. The interface specifies an Alarm class and two notifications plus operations to query alarm instances and to subscribe to the alarm notifications.

The specification in this document has a structure that is very similar to the ETSI NFV specifications. The notifications differ in that an alarm notification in the NFV interface defines a single fault for a single resource, while the notification specified in this document can contain multiple faults for multiple resources. The Doctor specification lacks the detailed time stamps of the NFV specification, which are essential for synchronization of the alarm list using the query operation. The detailed time stamps are also of value in the event and alarm history DBs.

AODH defines a base class for alarms, not the notifications. This means that some of the dynamic attributes of the ETSI NFV alarm type, like alarmRaisedTime, are not applicable to the AODH alarm class but are attributes of the actual notifications. (Description of these attributes will be added later.) The AODH alarm class lacks some attributes present in the NFV specification, such as fault details and correlated alarms. Instead, the AODH alarm class has attributes for actions, rules, and user and project IDs.

In the comparison below, each attribute is listed as ETSI NFV Alarm Type / OPNFV Doctor Requirement Specs / AODH Event Alarm Notification, followed by a description/comment and, where applicable, a recommendation.

  • alarmId / FaultId / alarm_id: Identifier of an alarm.
  • - / - / alarm_name: Human readable alarm name. Recommendation: may be added in ETSI NFV Stage 3.
  • managedObjectId / VirtualResourceId / (reason): Identifier of the affected virtual resource is part of the AODH reason parameter.
  • - / - / user_id, project_id: User and project identifiers. Recommendation: may be added in ETSI NFV Stage 3.
  • alarmRaisedTime / - / -: Timestamp when the alarm was raised. Recommendation: to be added to Doctor and AODH; may be derived (e.g. in a shim layer) from the AODH alarm history.
  • alarmChangedTime / - / -: Timestamp when the alarm was changed/updated. Recommendation: see above.
  • alarmClearedTime / - / -: Timestamp when the alarm was cleared. Recommendation: see above.
  • eventTime / - / -: Timestamp when the alarm was first observed by the Monitor. Recommendation: see above.
  • - / EventTime / generated: Timestamp of the notification. Recommendation: update the parameter name in the Doctor spec; may be added in ETSI NFV Stage 3.
  • state (e.g. Fired, Updated, Cleared) / VirtualResourceState (e.g. normal, down, maintenance, error) / current (ok, alarm, insufficient_data): ETSI NFV IFA 005/006 lists example alarm states; the maintenance state is missing in AODH. Recommendation: the list of alarm states will be specified in ETSI NFV Stage 3.
  • perceivedSeverity (e.g. Critical, Major, Minor, Warning, Indeterminate, Cleared) / Severity (Integer) / severity (low (default), moderate, critical): ETSI NFV IFA 005/006 lists example perceived severity values. Recommendations:
    • the list of alarm states will be specified in ETSI NFV Stage 3;
    • OPNFV: update the OPNFV Doctor specification of Severity (Integer) to an Enum;
    • perceivedSeverity=Indetermined: remove the value Indetermined in IFA and map undefined values to “minor” severity, or add the value indetermined in AODH and make it the default value;
    • perceivedSeverity=Cleared: remove the value Cleared in IFA, as the information about a cleared alarm can be derived from the alarm state parameter, or add the value cleared in AODH and set a rule that the severity is “cleared” when the state is ok.
  • faultType / FaultType / event_type in reason_data: Type of the fault, e.g. “CPU failure” of a compute resource, in machine interpretable format. Recommendation: OpenStack Alarming (Aodh) can use fuzzy matching with a wildcard string, e.g. “compute.cpu.failure”.
  • N/A / N/A / type = “event”: Type of the notification. For fault notifications the type in AODH is “event”.
  • probableCause / ProbableCause / -: Probable cause of the alarm. Recommendation: may be provided (e.g. in a shim layer) based on Vitrage topology awareness / root cause analysis.
  • isRootCause / IsRootCause / -: Boolean indicating whether the fault is the root cause of other faults. Recommendation: see above.
  • correlatedAlarmId / CorrelatedFaultId / -: List of IDs of correlated faults. Recommendation: see above.
  • faultDetails / FaultDetails / -: Additional details about the fault/alarm. Recommendation: the FaultDetails information element will be specified in ETSI NFV Stage 3.
  • - / - / action, previous: Additional AODH alarm related parameters.

Table: Comparison of alarm attributes

The primary area of improvement should be alignment of the perceived severity. This is important for a quick and accurate evaluation of the alarm. AODH should thus also support the X.733 values Critical, Major, Minor, Warning and Indeterminate.
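
A shim layer aligning the two severity scales could be as simple as the mapping below; the exact correspondences chosen here are an assumption for illustration, not part of either specification:

```python
# Hypothetical mapping from X.733 / ETSI NFV perceived severity values
# to the three severity values currently supported by AODH.
X733_TO_AODH = {
    "Critical": "critical",
    "Major": "critical",
    "Minor": "moderate",
    "Warning": "low",
    "Indeterminate": "low",
}

def to_aodh_severity(perceived_severity):
    # Undefined values fall back to the AODH default ("low").
    return X733_TO_AODH.get(perceived_severity, "low")
```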

The detailed time stamps (raised, changed, cleared) which are essential for synchronizing the alarm list using a query operation should be added to the Doctor specification.

Another area that needs alignment is the so-called alarm state in NFV. Here we must, however, consider what can be attributes of the notification vs. what should be a property of the alarm instance. This will be analyzed later.

Detailed southbound interface specification

This section specifies the southbound interfaces for fault management between the Monitors and the Inspector. Although the southbound interfaces should be flexible enough to handle various events from different types of Monitors, we define a unified event API in order to improve interoperability between the Monitors and the Inspector. This does not limit implementations of the Monitor and Inspector, as these could be extended in order to support failures from intelligent inspection like prediction.

Note: The interface definition will be aligned with current work in ETSI NFV IFA working group.

Fault event interface

This interface allows the Monitors to notify the Inspector about an event which was captured by the Monitor and may affect resources managed in the VIM.

EventNotification

Event notification including a fault description. The entity of this notification is an event, not specifically a fault or error. This allows the use of a generic event format or framework developed outside of the Doctor project. The parameters below shall be mandatory, but keys in ‘Details’ can be optional.

Parameters:

  • Time [1]: Datetime when the fault was observed in the Monitor.
  • Type [1]: Type of the event, which will be used for correlation processing in the Inspector.
  • Details [0..1]: Details containing additional information in key-value pair style. Keys shall be defined depending on the Type of the event.

E.g.:

{
    'event': {
        'time': '2016-04-12T08:00:00',
        'type': 'compute.host.down',
        'details': {
            'hostname': 'compute-1',
            'source': 'sample_monitor',
            'cause': 'link-down',
            'severity': 'critical',
            'status': 'down',
            'monitor_id': 'monitor-1',
            'monitor_event_id': '123',
        }
    }
}

Optional parameters in ‘Details’:

  • Hostname: the hostname on which the event occurred.
  • Source: the display name of the reporter of this event. This is not limited to the monitor; other entities such as ‘KVM’ can be specified.
  • Cause: description of the cause of this event which could be different from the type of this event.
  • Severity: the severity of this event set by the monitor.
  • Status: the status of target object in which error occurred.
  • MonitorID: the ID of the monitor sending this event.
  • MonitorEventID: the ID of the event in the monitor. This can be used by the operator when tracking the monitor log.
  • RelatedTo: an array of IDs of events related to this event.
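
A Monitor producing such an event could build the body as sketched below; the helper function and its parameters are illustrative, not part of the specification:

```python
import json
from datetime import datetime, timezone

def build_event(event_type, hostname, monitor_id, **details):
    """Build a fault event body in the format defined above."""
    return {
        "event": {
            "time": datetime.now(timezone.utc).isoformat(),
            "type": event_type,
            "details": {"hostname": hostname, "monitor_id": monitor_id, **details},
        }
    }

body = json.dumps(build_event("compute.host.down", "compute-1", "monitor-1",
                              source="sample_monitor", severity="critical"))
# A Monitor would POST this body to the Inspector's event endpoint
# (the endpoint URL depends on the deployment).
```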

Also, we can have a bulk API to receive multiple events in a single HTTP POST message by using the ‘events’ wrapper as follows:

{
    'events': [
        {
            'time': '2016-04-12T08:00:00',
            'type': 'compute.host.down',
            'details': {},
        },
        {
            'time': '2016-04-12T08:00:00',
            'type': 'compute.host.nic.error',
            'details': {},
        }
    ]
}
Blueprints

This section lists a first set of blueprints that have been proposed by the Doctor project to the open source community. Further blueprints addressing other gaps identified in Section 4 will be submitted at a later stage of the OPNFV project. In this section the following definitions are used:

  • “Event” is a message emitted by other OpenStack services such as Nova and Neutron and is consumed by the “Notification Agents” in Ceilometer.
  • “Notification” is a message generated by a “Notification Agent” in Ceilometer based on an “event” and is delivered to the “Collectors” in Ceilometer that store those notifications (as “sample”) to the Ceilometer “Databases”.
Instance State Notification (Ceilometer) [†]

The Doctor project is planning to handle “events” and “notifications” regarding Resource Status: Instance State, Port State, Host State, etc. Currently, Ceilometer already receives “events” to identify the state of those resources, but it does not handle and store them yet. This is why we also need a new event definition to capture those resource states from “events” created by other services.

This BP proposes to add a new compute notification state to handle events for an instance (server) sent by Nova. It also creates a new meter “instance.state” in OpenStack.

[†]https://etherpad.opnfv.org/p/doctor_bps
Event Publisher for Alarm (Ceilometer) [‡]

Problem statement:

The existing “Alarm Evaluator” in OpenStack Ceilometer periodically queries/polls the databases in order to check all alarms independently from other processes. This adds additional delay to the fault notification sent to the Consumer, whereas one requirement of Doctor is to react to faults as fast as possible.

The existing message flow is shown in figure12: after receiving an “event”, a “notification agent” (i.e. “event publisher”) will send a “notification” to a “Collector”. The “Collector” collects the notifications and updates the Ceilometer “Meter” database, which stores information about the “sample” captured from the original “event”. The “Alarm Evaluator” periodically polls this database, querying the “Meter” database based on each alarm configuration.

_images/figure12.png

Implementation plan in Ceilometer architecture

In the current Ceilometer implementation, there is no possibility to directly trigger the “Alarm Evaluator” when a new “event” is received; the “Alarm Evaluator” will only find out that a new notification needs to be fired to the Consumer when polling the database.

Change/feature request:

This BP proposes to add a new “event publisher for alarm”, which bypasses several steps in Ceilometer in order to avoid the polling-based approach of the existing Alarm Evaluator, which makes notifications to users slow. See figure12.

After receiving an “(alarm) event” by listening on the Ceilometer message queue (“notification bus”), the new “event publisher for alarm” immediately hands a “notification” about this event to a new Ceilometer component “Notification-driven alarm evaluator” proposed in the other BP (see Section 5.6.3).

Note, the term “publisher” refers to an entity in the Ceilometer architecture (it is a “notification agent”). It offers the capability to provide notifications to other services outside of Ceilometer, but it is also used to deliver notifications to other Ceilometer components (e.g. the “Collectors”) via the Ceilometer “notification bus”.

Implementation detail

  • “Event publisher for alarm” is part of Ceilometer.
  • The standard AMQP message queue is used with a new topic string.
  • No new interfaces have to be added to Ceilometer.
  • “Event publisher for Alarm” can be configured by the Administrator of Ceilometer to be used as “Notification Agent” in addition to the existing “Notifier”.
  • Existing alarm mechanisms of Ceilometer can be used, allowing users to configure how to distribute the “notifications” transformed from “events”, e.g. there is an option whether an ongoing alarm is re-issued or not (“repeat_actions”).
[‡]https://etherpad.opnfv.org/p/doctor_bps
Notification-driven alarm evaluator (Ceilometer) [§]

Problem statement:

The existing “Alarm Evaluator” in OpenStack Ceilometer periodically queries/polls the databases in order to check all alarms independently from other processes. This adds additional delay to the fault notification sent to the Consumer, whereas one requirement of Doctor is to react to faults as fast as possible.

Change/feature request:

This BP is proposing to add an alternative “Notification-driven Alarm Evaluator” for Ceilometer that is receiving “notifications” sent by the “Event Publisher for Alarm” described in the other BP. Once this new “Notification-driven Alarm Evaluator” received “notification”, it finds the “alarm” configurations which may relate to the “notification” by querying the “alarm” database with some keys i.e. resource ID, then it will evaluate each alarm with the information in that “notification”.

After the alarm evaluation, it behaves the same way as the existing “alarm evaluator” does when firing the alarm notification to the Consumer. Similar to the existing Alarm Evaluator, this new “Notification-driven Alarm Evaluator” aggregates and correlates different alarms, which are then provided northbound to the Consumer via the OpenStack “Alarm Notifier”. The user/administrator can register the alarm configuration via the existing Ceilometer API [¶]. Thereby, they can configure whether to set an alarm or not and where to send the alarms to.

Implementation detail

  • The new “Notification-driven Alarm Evaluator” is part of Ceilometer.
  • Most of the existing source code of the “Alarm Evaluator” can be re-used to implement this BP.
  • No additional application logic is needed.
  • It will access the Ceilometer databases just like the existing “Alarm Evaluator” does.
  • Only the polling-based approach will be replaced by a listener for “notifications” provided by the “Event Publisher for Alarm” on the Ceilometer “notification bus”.
  • No new interfaces have to be added to Ceilometer.
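
The evaluation flow described above can be sketched as follows. This is a minimal, library-free illustration; all names and data shapes here are assumptions, not actual Ceilometer code. A real implementation would listen on the notification bus via oslo.messaging and query the alarm database instead of filtering a list of dicts.

```python
def matching_alarms(alarms, event):
    """Select alarms whose resource ID and event type match the event."""
    return [
        a for a in alarms
        if a["resource_id"] == event["traits"].get("resource_id")
        and a["notification_rule"]["event_type"] == event["event_type"]
    ]

def evaluate(alarm, event):
    """Return True if every query condition of the alarm holds for the event."""
    for cond in alarm["notification_rule"]["query"]:
        value = event["traits"].get(cond["field"].replace("traits.", ""))
        if cond["op"] == "eq" and value != cond["value"]:
            return False
    return True

def on_event(alarms, event, notify):
    """Evaluate each related alarm and fire the notifier for matches."""
    for alarm in matching_alarms(alarms, event):
        if evaluate(alarm, event):
            notify(alarm)
```

In the real service, `notify` would hand the fired alarm to the Alarm Notifier, exactly as the existing evaluator does.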
[§]https://etherpad.opnfv.org/p/doctor_bps
[¶]https://wiki.openstack.org/wiki/Ceilometer/Alerting
Report host fault to update server state immediately (Nova) [#]

Problem statement:

  • The Nova state change for a failed or unreachable host is slow and does not reliably indicate whether the host is down or not. This might cause the same server instance to run twice if an action is taken to evacuate the instance to another host.
  • The Nova state for the server(s) on a failed host will not change, but remains active and running. This gives the user false information about the server state.
  • The VIM northbound interface notification of host faults towards VNFM and NFVO should be in line with the OpenStack state. This fault notification is a Telco requirement defined in ETSI and will be implemented by the OPNFV Doctor project.
  • An OpenStack user cannot take HA actions fast and reliably by trusting the server state and host state.

Proposed change:

There needs to be a new API for the Admin to state that a host is down. This API is used to mark the services running on the host as down, to reflect the real situation.

An example for a compute node is:

  • When the compute node is up and running:

    vm_state: active and power_state: running
    nova-compute state: up status: enabled

  • When the compute node goes down and the new API is called to state the host is down:

    vm_state: stopped power_state: shutdown
    nova-compute state: down status: enabled
    

Alternatives:

There is no more attractive alternative for detecting all the different host faults than having an external tool detect them. For such a tool to exist, there needs to be a new API in Nova to report the fault. Currently some kind of workaround must be implemented, as one cannot trust or get the states from OpenStack fast enough.

[#]https://blueprints.launchpad.net/nova/+spec/update-server-state-immediately
Summary and conclusion

The Doctor project aimed at detailing NFVI fault management and NFVI maintenance requirements. These are indispensable operations for an Operator, and necessary to realize telco-grade high availability. High availability is a large topic; the objective of Doctor is not to realize a complete high availability architecture and implementation. Instead, Doctor limited itself to addressing fault events in the NFVI, and proposes the enhancements necessary in the VIM, e.g. OpenStack, to ensure VNF availability in such fault events, taking a Telco VNF application level management system into account.

The Doctor project performed a robust analysis of the requirements for NFVI fault management and NFVI maintenance operation, and concretely identified gaps between such requirements and the current implementation of OpenStack. A detailed architecture and interface specification has been described in this document, and the work to realize the Doctor features and fill the identified gaps in upstream communities is in the final stages of development.

Annex: NFVI Faults

Faults in the listed elements need to be immediately notified to the Consumer in order to perform an immediate action like live migration or switch to a hot standby entity. In addition, the Administrator of the host should trigger a maintenance action to, e.g., reboot the server or replace a defective hardware element.

Faults can be of different severity, i.e., critical, warning, or info. Critical faults require immediate action, as a severe degradation of the system has happened or is expected. Warnings indicate that the system performance is going down: related actions include closer (e.g. more frequent) monitoring of that part of the system or preparation for a cold migration to a backup VM. Info messages do not require any action. We also consider a type “maintenance”, which is not a real fault, but may trigger maintenance actions like a reboot of the server or replacement of a faulty, but redundant, HW.

Faults can be gathered by, e.g., enabling SNMP and installing some open source tools to catch and poll SNMP. When using, for example, Zabbix, one can also run an agent on the hosts to catch any other fault. In any case of failure, the Administrator should be notified. The following tables provide a list of high level faults that are considered within the scope of the Doctor project and require immediate action by the Consumer.

Compute/Storage

Fault | Severity | How to detect? | Comment | Immediate action to recover
Processor/CPU failure, CPU condition not ok | Critical | Zabbix | | Switch to hot standby
Memory failure/memory condition not ok | Critical | Zabbix (IPMI) | | Switch to hot standby
Network card failure, e.g. network adapter connectivity lost | Critical | Zabbix/Ceilometer | | Switch to hot standby
Disk crash | Info | RAID monitoring | Network storage is very redundant (e.g. RAID system) and can guarantee high availability | Inform OAM
Storage controller | Critical | Zabbix (IPMI) | | Live migration if storage is still accessible; otherwise hot standby
PDU/power failure, power off, server reset | Critical | Zabbix/Ceilometer | | Switch to hot standby
Power degradation, power redundancy lost, power threshold exceeded | Warning | SNMP | | Live migration
Chassis problem (e.g. fan degraded/failed, chassis power degraded), CPU fan problem, temperature/thermal condition not ok | Warning | SNMP | | Live migration
Mainboard failure | Critical | Zabbix (IPMI) | e.g. PCIe, SAS link failure | Switch to hot standby
OS crash (e.g. kernel panic) | Critical | Zabbix | | Switch to hot standby

Hypervisor

Fault | Severity | How to detect? | Comment | Immediate action to recover
System has restarted | Critical | Zabbix | | Switch to hot standby
Hypervisor failure | Warning/Critical | Zabbix/Ceilometer | | Evacuation/switch to hot standby
Hypervisor status not retrievable after certain period | Warning | Alarming service | Zabbix/Ceilometer unreachable | Rebuild VM

Network

Fault | Severity | How to detect? | Comment | Immediate action to recover
SDN/OpenFlow switch, controller degraded/failed | Critical | Ceilometer | | Switch to hot standby or reconfigure virtual network topology
Hardware failure of physical switch/router | Warning | SNMP | Redundancy of physical infrastructure is reduced or no longer available | Live migration if possible, otherwise evacuation
References and bibliography
[DOCT]OPNFV, “Doctor” requirements project, [Online]. Available at https://wiki.opnfv.org/doctor
[PRED]OPNFV, “Data Collection for Failure Prediction” requirements project [Online]. Available at https://wiki.opnfv.org/prediction
[OPSK]OpenStack, [Online]. Available at https://www.openstack.org/
[CEIL]OpenStack Telemetry (Ceilometer), [Online]. Available at https://wiki.openstack.org/wiki/Ceilometer
[NOVA]OpenStack Nova, [Online]. Available at https://wiki.openstack.org/wiki/Nova
[NEUT]OpenStack Neutron, [Online]. Available at https://wiki.openstack.org/wiki/Neutron
[CIND]OpenStack Cinder, [Online]. Available at https://wiki.openstack.org/wiki/Cinder
[MONA]OpenStack Monasca, [Online]. Available at https://wiki.openstack.org/wiki/Monasca
[OSAG]OpenStack Cloud Administrator Guide, [Online]. Available at http://docs.openstack.org/admin-guide-cloud/content/
[ZABB]ZABBIX, the Enterprise-class Monitoring Solution for Everyone, [Online]. Available at http://www.zabbix.com/
[ENFV]ETSI NFV, [Online]. Available at http://www.etsi.org/technologies-clusters/technologies/nfv
Doctor Installation Guide
Doctor Configuration

OPNFV installers install most components of the Doctor framework, including OpenStack Nova, Neutron and Cinder (Doctor Controller) and OpenStack Ceilometer and Aodh (Doctor Notifier), except the Doctor Monitor.

After the major components of OPNFV are deployed, you can set up the Doctor functions by following the instructions in this section. You can also find detailed steps for all supported installers under doctor/doctor_tests/installer.

Doctor Inspector

You need to configure one of the Doctor Inspectors below. You can also find detailed steps for all supported Inspectors under doctor/doctor_tests/inspector.

Sample Inspector

The Sample Inspector is intended to show the minimum functions of a Doctor Inspector.

The Sample Inspector is suggested to be placed on one of the controller nodes, but it can be put on any host from which it can reach and access the OpenStack Controllers (e.g. Nova, Neutron).

Make sure the OpenStack environment parameters are set properly, so that the Sample Inspector can issue admin actions such as compute host force-down and VM state updates.

Then, you can configure Sample Inspector as follows:

git clone https://gerrit.opnfv.org/gerrit/doctor
cd doctor/doctor_tests/inspector
INSPECTOR_PORT=12345
python sample.py $INSPECTOR_PORT > inspector.log 2>&1 &

Congress

OpenStack Congress is a Governance as a Service (previously Policy as a Service) project. Congress can implement the Doctor Inspector, as it can inspect a fault situation and propagate errors to other entities.

Congress is deployed by the OPNFV Apex installer. You need to enable the doctor datasource driver and set the policy rules. With the example configuration below, Congress will force down the nova-compute service when it receives a fault event for that compute host. Congress will also set the state of all VMs running on that host from ACTIVE to ERROR.

openstack congress datasource create doctor "doctor"

openstack congress datasource create --config api_version=$NOVA_MICRO_VERSION \
    --config username=$OS_USERNAME --config tenant_name=$OS_TENANT_NAME \
    --config password=$OS_PASSWORD --config auth_url=$OS_AUTH_URL \
    nova "nova21"

openstack congress policy rule create \
    --name host_down classification \
    'host_down(host) :-
        doctor:events(hostname=host, type="compute.host.down", status="down")'

openstack congress policy rule create \
    --name active_instance_in_host classification \
    'active_instance_in_host(vmid, host) :-
        nova:servers(id=vmid, host_name=host, status="ACTIVE")'

openstack congress policy rule create \
    --name host_force_down classification \
    'execute[nova:services.force_down(host, "nova-compute", "True")] :-
        host_down(host)'

openstack congress policy rule create \
    --name error_vm_states classification \
    'execute[nova:servers.reset_state(vmid, "error")] :-
        host_down(host),
        active_instance_in_host(vmid, host)'

Vitrage

OpenStack Vitrage is an RCA (Root Cause Analysis) service for organizing, analyzing and expanding OpenStack alarms and events. Vitrage can implement the Doctor Inspector, as it receives a notification that a host is down and calls the Nova force-down API. In addition, it raises alarms on the instances running on that host.

Vitrage is not deployed by OPNFV installers yet. It can be installed either on top of a devstack environment, or on top of a real OpenStack environment. See the Vitrage Installation instructions.

Doctor SB API and a Doctor datasource were implemented in Vitrage in the Ocata release. The Doctor datasource is enabled by default.

After Vitrage is installed and configured, there is a need to configure it to support the Doctor use case. This can be done in a few steps:

  1. Make sure that ‘aodh’ and ‘doctor’ are included in the list of datasource types in /etc/vitrage/vitrage.conf:
[datasources]
types = aodh,doctor,nova.host,nova.instance,nova.zone,static,cinder.volume,neutron.network,neutron.port,heat.stack
  2. Enable the Vitrage Nova notifier. Set the following line in /etc/vitrage/vitrage.conf:
[DEFAULT]
notifiers = nova
  3. Add a template that is responsible for calling Nova force-down if Vitrage receives a ‘compute.host.down’ alarm. Copy the template and place it under /etc/vitrage/templates
  4. Restart the vitrage-graph and vitrage-notifier services
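
For instance, the template copy and service restart could look as follows (the template file name is an illustrative placeholder, and the service unit names may differ per deployment, e.g. devstack@vitrage-graph):

```shell
# Copy the Doctor template (hypothetical file name) into place
sudo cp compute_host_down.yaml /etc/vitrage/templates/

# Restart the Vitrage services so the template and notifier are picked up
sudo systemctl restart vitrage-graph vitrage-notifier
```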
Doctor Monitors

The Doctor Monitors are suggested to be placed on one of the controller nodes, but they can be put on any host which can reach the target compute host and is accessible by the Doctor Inspector. You need to configure a Monitor for each compute host, one by one. You can also find detailed steps for all supported monitors under doctor/doctor_tests/monitor.

Sample Monitor

You can configure the Sample Monitor as follows (example for an Apex deployment):

git clone https://gerrit.opnfv.org/gerrit/doctor
cd doctor/doctor_tests/monitor
INSPECTOR_PORT=12345
COMPUTE_HOST='overcloud-novacompute-1.localdomain.com'
COMPUTE_IP=192.30.9.5
sudo python sample.py "$COMPUTE_HOST" "$COMPUTE_IP" \
    "http://127.0.0.1:$INSPECTOR_PORT/events" > monitor.log 2>&1 &

Collectd Monitor

Doctor User Guide
Doctor capabilities and usage

Figure 1 shows the currently implemented and tested architecture of Doctor. The implementation is based on OpenStack and related components. The Monitor can be realized by a sample Python-based implementation provided in the Doctor code repository. The Controller is realized by OpenStack Nova, Neutron and Cinder for compute, network and storage, respectively. The Inspector can be realized by OpenStack Congress, Vitrage or a sample Python-based implementation also available in the code repository of Doctor. The Notifier is realized by OpenStack Aodh.

_images/figure11.png

Implemented and tested architecture

Immediate Notification

Immediate notification can be used by creating an ‘event’ type alarm via the OpenStack Alarming (Aodh) API, with support from the relevant internal components.

See:
  • Upstream spec document: https://specs.openstack.org/openstack/ceilometer-specs/specs/liberty/event-alarm-evaluator.html
  • Aodh official documentation: https://docs.openstack.org/aodh/latest
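
For illustration, an ‘event’ type alarm on an instance error state can be created with the aodh CLI roughly as below. The alarm name, event type, query value and consumer URL are examples, not prescribed values:

```shell
# Fire an immediate notification to the consumer when the instance
# state trait of a compute.instance.update event becomes "error"
aodh alarm create --type event --name vm_error_alarm \
    --event-type "compute.instance.update" \
    --query "traits.state=string::error" \
    --alarm-action "http://localhost:12346/failure"
```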

An example of a consumer of this notification can be found in the Doctor repository. It can be executed as follows:

git clone https://gerrit.opnfv.org/gerrit/doctor
cd doctor/doctor_tests/consumer
CONSUMER_PORT=12346
python sample.py "$CONSUMER_PORT" > consumer.log 2>&1 &
Consistent resource state awareness

The resource state of a compute host can be changed/updated according to a trigger from a monitor running outside of OpenStack Compute (Nova) by using the force-down API.

See:
  • Upstream spec document: https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/mark-host-down.html
  • Upstream Compute API reference document: https://developer.openstack.org/api-ref/compute
  • Doctor Mark Host Down Manual: https://git.opnfv.org/doctor/tree/docs/development/manuals/mark-host-down_manual.rst
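
For example, an admin could mark the nova-compute service on a failed host as down with python-openstackclient (the host name is illustrative; the force-down API requires compute API microversion 2.11 or later):

```shell
# Force the nova-compute service on the failed host into "down" state
openstack --os-compute-api-version 2.11 \
    compute service set --down compute-1.example.com nova-compute
```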

Valid compute host status given to VM owner

The resource state of a compute host can be retrieved by a user with the OpenStack Compute (Nova) servers API.

See:
  • Upstream spec document: https://specs.openstack.org/openstack/nova-specs/specs/mitaka/implemented/get-valid-server-state.html
  • Upstream Compute API reference document: https://developer.openstack.org/api-ref/compute
  • Doctor Get Valid Server State Manual: https://git.opnfv.org/doctor/tree/docs/development/manuals/get-valid-server-state.rst
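
With compute API microversion 2.16 or later, the host_status field is exposed through the servers API, so a user can check it for instance as follows (server name illustrative):

```shell
# Show the status of the host the server runs on (UP, DOWN, MAINTENANCE, ...)
openstack --os-compute-api-version 2.16 server show my-vm -c host_status
```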

Port data plane status update

Port data plane status can be changed/updated in the case of issues in the underlying data plane affecting connectivity from/to Neutron ports.

See:
  • Upstream spec document: https://specs.openstack.org/openstack/neutron-specs/specs/pike/port-data-plane-status.html
  • Upstream Networking API reference document: https://developer.openstack.org/api-ref/network
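
As a sketch, an external tool could report a data plane fault on a port through this field with a REST call like the following ($NEUTRON_URL, $PORT_ID and $TOKEN are placeholders; the deployment must enable the data-plane-status extension):

```shell
# Report the data plane of a Neutron port as DOWN
curl -s -X PUT "$NEUTRON_URL/v2.0/ports/$PORT_ID" \
    -H "X-Auth-Token: $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"port": {"data_plane_status": "DOWN"}}'
```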

Doctor driver (Congress)

The Doctor driver can be notified about NFVI failures that have been detected by monitoring systems.

See:
  • Upstream spec document: https://specs.openstack.org/openstack/congress-specs/specs/mitaka/push-type-datasource-driver.html
  • Congress official documentation: https://docs.openstack.org/congress/latest

Event API (Vitrage)

With this API, monitoring systems can push events to the Doctor datasource.

See:
  • Upstream spec document: https://specs.openstack.org/openstack/vitrage-specs/specs/ocata/event-api.html
  • Vitrage official documentation: https://docs.openstack.org/vitrage/latest

Doctor datasource (Vitrage)

After receiving events from monitoring systems, the Doctor datasource identifies the affected resources based on the resource topology.

See:
  • Upstream spec document: https://specs.openstack.org/openstack/vitrage-specs/specs/ocata/doctor-datasource.html

Design Documents

This directory stores design documents, which may include draft versions of blueprints written before proposing them to upstream OSS communities such as OpenStack, in order to keep the original blueprint as reviewed in OPNFV. This means there could be outdated blueprints as a result of further refinements in the upstream OSS community. Please refer to the link in each document to find the latest version of the blueprint and the status of development in the relevant OSS community.

See also https://wiki.opnfv.org/requirements_projects .

Note

This is a specification draft of a blueprint proposed for OpenStack Nova Liberty. It was written by project member(s) and agreed within the project before submitting it upstream. No further changes to its content will be made here; please follow it upstream.

The original draft is as follows:

Report host fault to update server state immediately

https://blueprints.launchpad.net/nova/+spec/update-server-state-immediately

A new API is needed to report a host fault and change the state of the instances and the compute node immediately. This allows the evacuate API to be used without delay. The new API provides the possibility for an external monitoring system to detect any kind of host failure fast and reliably and to inform OpenStack about it. Nova then updates the compute node state and the states of the instances, so that the states in the Nova DB are in sync with the real state of the system.

Problem description
  • The Nova state change for a failed or unreachable host is slow and does not reliably indicate whether the compute node is down or not. This might cause the same instance to run twice if an action is taken to evacuate the instance to another host.
  • The Nova state for instances on a failed compute node will not change, but remains active and running. This gives the user false information about the instance state. Currently one would need to call “nova reset-state” for each instance to have them in error state.
  • An OpenStack user cannot take HA actions fast and reliably by trusting the instance state and compute node state.
  • As the compute node state changes slowly, one cannot evacuate instances.
Use Cases

The general use case is that, in case of a host fault, one should change the compute node state fast and reliably when using the DB servicegroup backend. On top of this, the use cases that are currently not covered, where instance states should be changed correctly, are:

  • Management network connectivity lost between the controller and the compute node.
  • Host HW failed.

Generic use case flow:

  • The external monitoring system detects a host fault.
  • The external monitoring system fences the host if not down already.
  • The external system calls the new Nova API to force the failed compute node into a down state, as well as the instances running on it.
  • Nova updates the compute node state and the state of the affected instances in the Nova DB.

Currently the nova-compute state will change to “down”, but it takes a long time. The server state stays at “vm_state: active” and “power_state: running”, which is not correct. By having an external tool detect host faults fast, fence the host by powering it down, and then report the host down to OpenStack, all these states would reflect the actual situation. Also, if OpenStack does not implement automatic actions for fault correlation, the external tool can do that. This could, for example, easily be configured in the server instance METADATA and be read by the external tool.

Project Priority

Liberty priorities have not yet been defined.

Proposed change

There needs to be a new API for the Admin to state that a host is down. This API is used to mark the compute node and the instances running on it as down, to reflect the real situation.

An example for a compute node is:

  • When the compute node is up and running: vm_state: active and power_state: running, nova-compute state: up, status: enabled
  • When the compute node goes down and the new API is called to state the host is down: vm_state: stopped, power_state: shutdown, nova-compute state: down, status: enabled

The vm_state values soft-delete, deleted, resized and error should not be touched. Whether task_state needs to be touched remains to be worked out.

Alternatives

There is no more attractive alternative for detecting all the different host faults than having an external tool detect them. For such a tool to exist, there needs to be a new API in Nova to report the fault. Currently some kind of workaround must have been implemented, as one cannot trust or get the states from OpenStack fast enough.

Data model impact

None

REST API impact
  • Update CLI to report host is down

    nova host-update command

    usage: nova host-update [--status <enable|disable>]
                            [--maintenance <enable|disable>]
                            [--report-host-down] <hostname>

    Update host settings.

    Positional arguments:

    <hostname> Name of host.

    Optional arguments:

    --status <enable|disable> Either enable or disable a host.

    --maintenance <enable|disable> Either put or resume host to/from maintenance.

    --report-host-down Report host down to update instance and compute node state in db.

  • Update Compute API to report host is down:

    /v2.1/{tenant_id}/os-hosts/{host_name}

    Normal response codes: 200

    Request parameters:

    Parameter | Style | Type | Description
    host_name | URI | xsd:string | The name of the host of interest to you.

    Request body:

    {
        "host": {
            "status": "enable",
            "maintenance_mode": "enable",
            "host_down_reported": "true"
        }
    }

    Response body:

    {
        "host": {
            "host": "65c5d5b7e3bd44308e67fc50f362aee6",
            "maintenance_mode": "enabled",
            "status": "enabled",
            "host_down_reported": "true"
        }
    }

  • A new method in the nova.compute.api module HostAPI class to mark the host-related instances and the compute node down: set_host_down(context, host_name)

  • The novaclient.v2.hosts.HostManager(api) method update(host, values) needs to handle reporting the host down.

  • The schema does not need changes, as in the db only the service and server states are to be changed.
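
As an illustration only (the endpoint and body follow the proposed extension above; this is not an existing Nova API), reporting a host down would amount to a PUT such as the following. The helper builds the request but intentionally does not send it; URL, token and host name are placeholders:

```python
import json
import urllib.request

def build_host_down_request(compute_url, token, host_name):
    """Build (but do not send) the proposed 'report host down' PUT request."""
    body = json.dumps({"host": {"status": "enable",
                                "maintenance_mode": "enable",
                                "host_down_reported": "true"}}).encode()
    return urllib.request.Request(
        url="%s/os-hosts/%s" % (compute_url, host_name),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json",
                 "X-Auth-Token": token},
    )

req = build_host_down_request(
    "http://controller:8774/v2.1/mytenant", "ADMIN_TOKEN", "compute-1")
# urllib.request.urlopen(req) would issue the call against a real deployment
```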

Security impact

API call needs admin privileges (in the default policy configuration).

Notifications impact

None

Other end user impact

None

Performance Impact

The only impact is that the user can get information about instance and compute node states faster. This also gives the possibility to evacuate faster. There is no impact that would slow anything down. A host going down should be a rare occurrence.

Other deployer impact

A deployer can make use of any external tool to detect host faults and report them to OpenStack.

Developer impact

None

Implementation
Assignee(s)

Primary assignee: Tomi Juvonen Other contributors: Ryota Mibu

Work Items
  • Test cases.
  • API changes.
  • Documentation.
Dependencies

None

Testing

Test cases that exist for enabling or putting a host into maintenance should be altered, or similar new cases made, to test the new functionality.

Documentation Impact

The new API needs to be documented.

References
Notification Alarm Evaluator

Note

This is a specification draft of a blueprint for OpenStack Ceilometer Liberty. To see the current version: https://review.openstack.org/172893 To track development activity: https://blueprints.launchpad.net/ceilometer/+spec/notification-alarm-evaluator

https://blueprints.launchpad.net/ceilometer/+spec/notification-alarm-evaluator

This blueprint proposes to add a new alarm evaluator for handling alarms on events passed from other OpenStack services. It provides event-driven alarm evaluation, introducing a new sequence in Ceilometer instead of the polling-based approach of the existing Alarm Evaluator, and realizes immediate alarm notification to end users.

Problem description

As an end user, I need to receive an alarm notification immediately once Ceilometer has captured an event which would make the alarm fire, so that I can perform recovery actions promptly to shorten the downtime of my service. The typical use case is that an end user sets an alarm on “compute.instance.update” in order to trigger recovery actions once the instance status has changed to ‘shutdown’ or ‘error’. It would be nice if an end user could receive the notification within 1 second after the fault is observed, the same as other health-check mechanisms can do in some cases.

The existing Alarm Evaluator periodically queries/polls the databases in order to check all alarms, independently from other processes. This is a good approach for evaluating an alarm on samples stored over a certain period. However, it is not efficient for evaluating an alarm on events which are emitted by other OpenStack services once in a while.

The periodic evaluation leads to a delay in sending alarm notifications to users. The default period of the evaluation cycle is 60 seconds. It is recommended that an operator set an interval longer than the configured pipeline interval for the underlying metrics, and also long enough to evaluate all defined alarms in a certain period, while taking into account the number of resources, users and alarms.

Proposed change

The proposal is to add a new event-driven alarm evaluator which receives messages from the Notification Agent and finds related alarms, then evaluates each alarm:

  • The new alarm evaluator can receive event notifications from the Notification Agent by adding a dedicated notifier as a publisher in pipeline.yaml (e.g. notifier://?topic=event_eval).
  • When the new alarm evaluator receives an event notification, it queries the alarm database by the Project ID and Resource ID written in the event notification.
  • The alarms found are evaluated by referring to the event notification.
  • Depending on the result of the evaluation, those alarms are fired through the Alarm Notifier, the same as the existing Alarm Evaluator does.
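
For illustration, the dedicated publisher could be added to the event pipeline configuration roughly as below. The topic name and sink layout are illustrative, not a definitive configuration:

```yaml
# event_pipeline.yaml (sketch): publish events both to the normal
# notifier and to a dedicated topic consumed by the new evaluator
sources:
    - name: event_source
      events:
          - "*"
      sinks:
          - event_sink
sinks:
    - name: event_sink
      publishers:
          - notifier://
          - notifier://?topic=event_eval
```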

This proposal also adds a new alarm type “notification” and a “notification_rule”. This enables users to create alarms on events. The separation from other alarm types (such as the “threshold” type) is intended to show the different timing of evaluation and the different format of the condition, since the new evaluator will check each event notification once it is received, whereas a “threshold” alarm can evaluate an average of values over a certain period, calculated from multiple samples.

The new alarm evaluator handles “notification” type alarms, so we have to change the existing alarm evaluator to exclude “notification” type alarms from its evaluation targets.

Alternatives

There was a similar blueprint proposal, “Alarm type based on notification”, but the approach is different. The old proposal was to add a new step (alarm evaluation) in the Notification Agent every time it receives an event from other OpenStack services, whereas this proposal intends to execute the alarm evaluation in a separate component, which minimizes the impact on the existing pipeline processing.

Another approach is an enhancement of the existing alarm evaluator by adding a notification listener. However, there are two issues: 1) this approach could cause a stall of the periodic evaluations when it receives a bulk of notifications, and 2) this could break the alarm partitioning, i.e. when the alarm evaluator receives a notification, it might have to evaluate some alarms which are not assigned to it.

Data model impact

A Resource ID will be added to the Alarm model as an optional attribute. This will help the new alarm evaluator to filter out non-related alarms while querying alarms; otherwise it would have to evaluate all alarms in the project.

REST API impact

The Alarm API will be extended as follows:

  • Add “notification” type into alarm type list
  • Add “resource_id” to “alarm”
  • Add “notification_rule” to “alarm”

Sample data of Notification-type alarm:

{
    "alarm_actions": [
        "http://site:8000/alarm"
    ],
    "alarm_id": null,
    "description": "An alarm",
    "enabled": true,
    "insufficient_data_actions": [
        "http://site:8000/nodata"
    ],
    "name": "InstanceStatusAlarm",
    "notification_rule": {
        "event_type": "compute.instance.update",
        "query" : [
            {
                "field" : "traits.state",
                "type" : "string",
                "value" : "error",
                "op" : "eq"
            }
        ]
    },
    "ok_actions": [],
    "project_id": "c96c887c216949acbdfbd8b494863567",
    "repeat_actions": false,
    "resource_id": "153462d0-a9b8-4b5b-8175-9e4b05e9b856",
    "severity": "moderate",
    "state": "ok",
    "state_timestamp": "2015-04-03T17:49:38.406845",
    "timestamp": "2015-04-03T17:49:38.406839",
    "type": "notification",
    "user_id": "c96c887c216949acbdfbd8b494863567"
}

The “resource_id” will be referred to when querying alarms; permission and project ownership of the resource will not be checked.

Security impact

None

Pipeline impact

None

Other end user impact

None

Performance/Scalability Impacts

When Ceilometer receives a large number of events from other OpenStack services in a short period, this alarm evaluator can keep working since events are queued in a messaging queue system, but it can cause a delay in alarm notifications to users and increase the number of read and write accesses to the alarm database.

“resource_id” can be optional, but making it mandatory could reduce the performance impact. If a user creates a “notification” alarm without a “resource_id”, that alarm will be evaluated every time an event occurs in the project. That may put a heavy load on the new evaluator.

Other deployer impact

A new service process has to be run.

Developer impact

Developers should be aware that events could be notified to end users and avoid passing raw infra information to end users, while defining events and traits.

Implementation
Assignee(s)
Primary assignee:
r-mibu
Other contributors:
None
Ongoing maintainer:
None
Work Items
  • New event-driven alarm evaluator
  • Add new alarm type “notification” as well as AlarmNotificationRule
  • Add “resource_id” to Alarm model
  • Modify existing alarm evaluator to filter out “notification” alarms
  • Add a new config parameter for alarm requests to check whether alarms without a specified “resource_id” are accepted or not
Future lifecycle

This proposal is a key feature for providing information about cloud resources to end users in real-time, enabling efficient integration with a user-side manager or Orchestrator, whereas currently such information is considered to be consumed by admin-side tools or services. Based on this change, we will seek orchestration scenarios including fault recovery, and add useful event definitions as well as additional traits.

Dependencies

None

Testing

New unit/scenario tests are required for this change.

Documentation Impact
  • The proposed evaluator will be described in the developer document.
  • The new alarm type and how to use it will be explained in the user guide.
References
Neutron Port Status Update

Note

This document represents a Neutron RFE reviewed in the Doctor project before submitting upstream to Launchpad Neutron space. The document is not intended to follow a blueprint format or to be an extensive document. For more information, please visit http://docs.openstack.org/developer/neutron/policies/blueprints.html

The RFE was submitted to Neutron. You can follow the discussions in https://bugs.launchpad.net/neutron/+bug/1598081

Neutron port status field represents the current status of a port in the cloud infrastructure. The field can take one of the following values: ‘ACTIVE’, ‘DOWN’, ‘BUILD’ and ‘ERROR’.

At present, if a network event occurs in the data-plane (e.g. a virtual or physical switch or one of its ports fails, a cable gets pulled unintentionally, the infrastructure topology changes, etc.), connectivity to logical ports may be affected and tenants’ services interrupted. When tenants/cloud administrators look up the status of their resources (e.g. Nova instances and the services running in them, network ports, etc.), they will wrongly see that everything looks fine. The problem is that Neutron will continue reporting the port ‘status’ as ‘ACTIVE’.

Many SDN Controllers managing network elements have the ability to detect and report network events to upper layers. This allows SDN Controllers’ users to be notified of changes and react accordingly. Such information could be consumed by Neutron so that Neutron could update the ‘status’ field of those logical ports, and additionally generate a notification message to the message bus.

However, Neutron lacks a way to receive such information, e.g. through an ML2 driver or the REST API (the ‘status’ field is read-only). There are pros and cons to both of these approaches, as well as to other possible approaches. This RFE intends to trigger a discussion on how Neutron could be improved to receive fault/change events from SDN Controllers, or even from 3rd parties not in charge of controlling the network (e.g. monitoring systems, human admins).

Port data plane status

https://bugs.launchpad.net/neutron/+bug/1598081

Neutron does not detect data plane failures affecting its logical resources. This spec addresses that issue by allowing external tools to report to Neutron faults in the data plane that are affecting the ports. A new REST API field is proposed to that end.

Problem Description

An initial description of the problem was introduced in bug #1598081 [1]. This spec focuses on capturing one (main) part of the problem described there, i.e. extending Neutron’s REST API to allow external tools to report network failures to Neutron. Out of the scope of this spec is the work to enable port status changes to be received and managed by mechanism drivers.

This spec also tries to address bug #1575146 [2]. Specifically, and argued by the Neutron driver team in [3]:

  • Neutron should not shut down the port completely upon detection of a physnet failure; connectivity between instances on the same node may still be available. External tools may or may not want to trigger a status change on the port based on their own logic and orchestration.
  • Port down is not detected when an uplink of a switch is down;
  • The physnet bridge may have multiple physical interfaces plugged; shutting down the logical port may not be needed in case network redundancy is in place.
Proposed Change

A couple of possible approaches were proposed in [1] (comment #3). This spec proposes tackling the problem via a new extension API to the port resource. The extension adds a new attribute ‘dp-down’ (data plane down) to represent the status of the data plane. The field should be read-only for tenants and read-write for admins.

Neutron should send out an event to the message bus upon toggling the data plane status value. The event is relevant for e.g. auditing.

Data Model Impact

A new attribute will be added, as an extension, to the ‘ports’ table.

Attribute Name  Type     Access                   Default Value  Validation/Conversion
dp_down         boolean  RO (tenant), RW (admin)  False          True/False
REST API Impact

A new API extension to the ports resource is going to be introduced.

EXTENDED_ATTRIBUTES_2_0 = {
    'ports': {
        'dp_down': {'allow_post': False, 'allow_put': True,
                    'default': False, 'convert_to': convert_to_boolean,
                    'is_visible': True},
    },
}
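The following sketch illustrates how such an attribute map drives request validation. convert_to_boolean is re-implemented here for illustration only (Neutron ships its own converters), and the validate_put helper is a simplification invented for this example:

```python
# Illustrative sketch: applying the extension attribute map to a PUT body.
# convert_to_boolean is a minimal stand-in for Neutron's own converter.

def convert_to_boolean(value):
    if isinstance(value, bool):
        return value
    if str(value).lower() in ("true", "1"):
        return True
    if str(value).lower() in ("false", "0"):
        return False
    raise ValueError("%r is not a valid boolean" % (value,))

EXTENDED_ATTRIBUTES_2_0 = {
    'ports': {
        'dp_down': {'allow_post': False, 'allow_put': True,
                    'default': False, 'convert_to': convert_to_boolean,
                    'is_visible': True},
    },
}

def validate_put(resource, body):
    """Apply converters and update rules from the attribute map to a PUT body."""
    attrs = EXTENDED_ATTRIBUTES_2_0[resource]
    out = {}
    for name, spec in attrs.items():
        if name in body:
            if not spec['allow_put']:
                raise ValueError("%s is not updatable" % name)
            out[name] = spec['convert_to'](body[name])
    return out

print(validate_put('ports', {'dp_down': 'true'}))  # {'dp_down': True}
```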
Examples

Updating port data plane status to down:

PUT /v2.0/ports/<port-uuid>
Accept: application/json
{
    "port": {
        "dp_down": true
    }
}
Command Line Client Impact
neutron port-update [--dp-down <True/False>] <port>
openstack port set [--dp-down <True/False>] <port>

The --dp-down argument is optional and defaults to False.

Security Impact

None

Notifications Impact

A notification (event) upon toggling the data plane status (i.e. ‘dp-down’ attribute) value should be sent to the message bus. Such events do not happen with high frequency and thus no negative impact on the notification bus is expected.

Performance Impact

None

IPv6 Impact

None

Other Deployer Impact

None

Developer Impact

None

Implementation
Assignee(s)
  • cgoncalves
Work Items
  • New ‘dp-down’ attribute in ‘ports’ database table
  • API extension to introduce new field to port
  • Client changes to allow the data plane status (i.e. the ‘dp-down’ attribute) to be set
  • Policy (tenants read-only; admins read-write)
Documentation Impact

Documentation for both administrators and end users will have to be provided. Administrators will need to know how to set/unset the data plane status field.

References
[1] RFE: Port status update, https://bugs.launchpad.net/neutron/+bug/1598081
[2] RFE: ovs port status should the same as physnet, https://bugs.launchpad.net/neutron/+bug/1575146
[3] Neutron Drivers meeting, July 21, 2016, http://eavesdrop.openstack.org/meetings/neutron_drivers/2016/neutron_drivers.2016-07-21-22.00.html
Inspector Design Guideline

Note

This is a spec draft of the design guideline for the inspector component. The JIRA ticket to track updates and collect comments is DOCTOR-73.

This document summarizes best practices for designing a high performance inspector that meets the requirements of the OPNFV Doctor project.

Problem Description

Some pitfalls have been detected during the development of the sample inspector, e.g. we suffered a significant performance degradation when listing the VMs on a host.

A patch set for caching the list has been committed to solve the issue. When a new inspector is integrated, it would be valuable to evaluate the existing design and give recommendations for improvements.

This document can be treated as a source of related blueprints in inspector projects.

Guidelines
Host specific VMs list

While the requirement in the Doctor project is to deliver a fault alarm to the consumer within one second, that is just a limit we have set in the requirements. When talking about fault management in Telco, the implementation needs to be optimal by all means, and one second is far looser than traditional Telco requirements.

One thing to optimize in the inspector is to eliminate the need to read the list of host specific VMs from the Nova API when it receives a host specific failure event. The optimal implementation would initialize this list when the Inspector starts, by reading from the Nova API, and afterwards keep it up to date via instance.update notifications received from Nova. Polling the Nova API can be used as a complementary channel, taking snapshots of the host and VM lists in order to keep the data consistent with reality.

This is an enhancement and perhaps not needed to stay under one second in a small system. Nevertheless, it would be needed for production use.

This guideline can be summarized as follows:

  • cache the host VMs mapping instead of reading it on request
  • subscribe and handle update notifications to keep the list up to date
  • make snapshot periodically to ensure data consistency
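The three guidelines above can be sketched together as a small cache. All class and callable names are illustrative, not taken from the sample inspector:

```python
# Illustrative sketch of the cached host -> VMs mapping: initialized from a
# Nova API snapshot, kept current by instance.update notifications, and
# refreshable periodically for consistency.

class HostVMCache:
    def __init__(self, nova_list_servers):
        self._nova_list = nova_list_servers  # callable returning a full snapshot
        self._host_vms = {}                  # host name -> set of instance ids
        self.refresh()

    def refresh(self):
        """Periodic snapshot via the Nova API to keep the cache consistent."""
        mapping = {}
        for server in self._nova_list():
            mapping.setdefault(server["host"], set()).add(server["id"])
        self._host_vms = mapping

    def on_instance_update(self, payload):
        """Handle an instance.update notification (moves/creates an instance)."""
        for vms in self._host_vms.values():
            vms.discard(payload["instance_id"])
        self._host_vms.setdefault(payload["host"], set()).add(payload["instance_id"])

    def vms_on(self, host):
        """O(1) lookup when a host failure event arrives -- no Nova API call."""
        return self._host_vms.get(host, set())

servers = [{"id": "vm-1", "host": "compute-1"}, {"id": "vm-2", "host": "compute-1"}]
cache = HostVMCache(lambda: servers)
cache.on_instance_update({"instance_id": "vm-2", "host": "compute-2"})
print(sorted(cache.vms_on("compute-1")))  # ['vm-1']
```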
Parallel execution

In Doctor’s architecture, the inspector is responsible for setting the error state of the affected VMs in order to notify the consumers of the failure. This is done by calling the Nova reset-state API. However, this action is a synchronous request with many underlying steps and typically costs hundreds of milliseconds. According to discussion on the mailing list, this time cost will grow linearly if the requests are sent one by one. It will become a critical issue in large scale systems.

It is recommended to introduce parallel execution for actions like reset-state that take a list of targets.
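A minimal sketch of such parallel execution, using a thread pool; the nova_reset_state callable stands in for a real novaclient reset-state call and is an assumption of this example:

```python
# Illustrative sketch: issue reset-state calls concurrently so total latency
# stays close to one round-trip instead of growing linearly with the VM count.
from concurrent.futures import ThreadPoolExecutor

def reset_state_parallel(nova_reset_state, vm_ids, max_workers=8):
    """nova_reset_state: callable taking one VM id (e.g. a novaclient wrapper)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces all futures to complete and re-raises any errors
        return list(pool.map(nova_reset_state, vm_ids))

done = []
reset_state_parallel(done.append, ["vm-1", "vm-2", "vm-3"])
print(sorted(done))  # all three VMs were processed
```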

Shortcut notification

An alternative way to improve notification performance is to take a shortcut from the inspector to the notifier, instead of triggering it from the controller. The difference between the two workflows is shown below:

conservative notification

Conservative Notification

shortcut notification

Shortcut Notification

It is worth noting that the shortcut notification has a side effect: cloud resource states could still be out of sync by the time the consumer processes the alarm notification. This is out of the scope of the inspector design, but needs to be taken into consideration at the system level.

Also, the call of “reset servers state to error” is not necessary in the alternative notification case, as long as “host forced down” is still called. “get-valid-server-state” was implemented to obtain a valid server state, whereas earlier one could not get it without calling “reset servers state to error”. Without “reset servers state to error”, states are less likely to be out of sync, as the notification and forcing down the host would run in parallel.

Appendix

A study evaluating the effect of parallel execution and shortcut notification was presented at the OPNFV Beijing Summit 2017.

notification time

Notification Time

Download the full presentation slides here.

Performance Profiler

https://goo.gl/98Osig

This blueprint proposes to create a performance profiler for doctor scenarios.

Problem Description

In the verification job for notification time, we have encountered some performance issues, such as:

1. An environment deployed by APEX meets the criteria, while in one deployed by Fuel, the performance is much poorer.
2. Significant performance degradation was spotted when we increased the total number of VMs.

It takes time to dig through the logs and analyze the cause. People have to collect timestamps at each checkpoint manually to find the bottleneck. A performance profiler will make this process automatic.

Proposed Change

Current Doctor scenario covers the inspector and notifier in the whole fault management cycle:

start                                          end
  +       +         +        +       +          +
  |       |         |        |       |          |
  |monitor|inspector|notifier|manager|controller|
  +------>+         |        |       |          |
occurred  +-------->+        |       |          |
  |     detected    +------->+       |          |
  |       |     identified   +-------+          |
  |       |               notified   +--------->+
  |       |                  |    processed  resolved
  |       |                  |                  |
  |       +<-----doctor----->+                  |
  |                                             |
  |                                             |
  +<---------------fault management------------>+

The notification time can be split into several parts and visualized as a timeline:

start                                         end
  0----5---10---15---20---25---30---35---40---45--> (x 10ms)
  +    +   +   +   +    +      +   +   +   +   +
0-hostdown |   |   |    |      |   |   |   |   |
  +--->+   |   |   |    |      |   |   |   |   |
  |  1-raw failure |    |      |   |   |   |   |
  |    +-->+   |   |    |      |   |   |   |   |
  |    | 2-found affected      |   |   |   |   |
  |    |   +-->+   |    |      |   |   |   |   |
  |    |     3-marked host down|   |   |   |   |
  |    |       +-->+    |      |   |   |   |   |
  |    |         4-set VM error|   |   |   |   |
  |    |           +--->+      |   |   |   |   |
  |    |           |  5-notified VM error  |   |
  |    |           |    +----->|   |   |   |   |
  |    |           |    |    6-transformed event
  |    |           |    |      +-->+   |   |   |
  |    |           |    |      | 7-evaluated event
  |    |           |    |      |   +-->+   |   |
  |    |           |    |      |     8-fired alarm
  |    |           |    |      |       +-->+   |
  |    |           |    |      |         9-received alarm
  |    |           |    |      |           +-->+
sample | sample    |    |      |           |10-handled alarm
monitor| inspector |nova| c/m  |    aodh   |
  |                                        |
  +<-----------------doctor--------------->+

Note: c/m = ceilometer

And a table of components sorted by time cost, from most to least:

Component  Time Cost  Percentage
inspector  160ms      40%
aodh       110ms      30%
monitor    50ms       14%
...
...

Note: data in the table is for demonstration only, not actual measurement

Timestamps can be collected from various sources:

  1. log files
  2. trace point in code

The performance profiler will be integrated into the verification job to provide detailed results of the test. It can also be deployed independently to diagnose performance issues in a specific environment.
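The core computation of such a profiler can be sketched as below: given ordered checkpoint timestamps, derive per-step durations and sort them by cost. Checkpoint names echo the timeline above; the timestamp values and the profile helper are illustrative only:

```python
# Illustrative sketch: turn a sequence of checkpoint timestamps (collected
# from log files or trace points) into a per-step cost table.

def profile(checkpoints):
    """checkpoints: ordered list of (name, timestamp_in_seconds)."""
    total = checkpoints[-1][1] - checkpoints[0][1]
    rows = []
    # each row names the checkpoint that starts the step being measured
    for (name, t0), (_, t1) in zip(checkpoints, checkpoints[1:]):
        cost = t1 - t0
        rows.append((name, cost, 100.0 * cost / total))
    return sorted(rows, key=lambda r: r[1], reverse=True)

checkpoints = [
    ("hostdown", 0.00), ("raw failure", 0.04),
    ("found affected", 0.06), ("marked host down", 0.10),
    ("set VM error", 0.20), ("notified VM error", 0.26),
]
for name, cost, pct in profile(checkpoints):
    print("%-18s %5.0fms %4.0f%%" % (name, cost * 1000, pct))
```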

Working Items
  1. PoC with limited checkpoints
  2. Integration with verification job
  3. Collect timestamp at all checkpoints
  4. Display the profiling result in console
  5. Report the profiling result to test database
  6. Independent package which can be installed to specified environment
Planned Maintenance Design Guideline

This document describes how one can implement infrastructure maintenance in interaction with the VNFM by utilizing the OPNFV Doctor project framework, while meeting the set requirements. The document concentrates on OpenStack and VMs, while the concept designed is generic for any payload or even a different VIM. The admin tool should also cover controllers and other cloud hardware, but that is not the main focus of OPNFV Doctor and should be defined in more detail in the upstream implementation. The same goes for any more detailed work to be done.

Problem Description

A Telco application needs to know when infrastructure maintenance is going to happen in order to guarantee zero downtime in its operation. It needs to be able to take its own actions to keep the application running on unaffected resources, or to give guidance for admin actions like migration. More details are defined in the requirement documentation: use cases, architecture and implementation.

Guidelines

Concepts used:

  • event: Notification to rabbitmq with particular event type.
  • state event: Notification to rabbitmq with particular event type including payload with variable defined for state.
  • project event: Notification to rabbitmq that is meant for project. Single event type is used with different payload and state information.
  • admin event: Notification to rabbitmq that is meant for admin or as for any infrastructure service. Single event type is used with different state information.
  • rolling maintenance: Node by Node rolling maintenance and upgrade where a single node at a time will be maintained after a possible application payload is moved away from the node.
  • project stands for an application in the OpenStack context, and both terms are used in this document. tenant is often used for the same.

The infrastructure admin needs to send notifications with two different event types: one meant for the admin and one for the project. The notification payload can be consumed by the application and the admin by subscribing to the corresponding event alarm through an alarming service like OpenStack AODH.

  • The infrastructure admin needs to send a notification about infrastructure maintenance including all details the application needs in order to make decisions about its affected service. The alarm payload can hold a link to the infrastructure admin tool API for replies and for other possible information. There are many steps of communication between the admin tool and the application, and the payload needed for the information passed is very similar; because of this, the same event type can be used, with a variable like state telling the application what action is needed for each event. If a project has not subscribed to the alarm, the admin tool responsible for the maintenance will assume it can do maintenance operations without interacting with the application on top of it.
  • The infrastructure admin needs to send an event about infrastructure maintenance telling when the maintenance starts and another when it ends. This admin level event should include the host name. It could be consumed by any admin level infrastructure entity. In this document we consume it in the Inspector, which in OPNFV Doctor terms is the infrastructure entity responsible for automatic host fault management. Automated actions certainly need to be disabled during planned maintenance.

Before maintenance starts, the application needs to be able to make a switch-over for its affected ACT-STBY service, move the service to an unaffected part of the infrastructure, or give a hint for an admin operation like migration that can be automatically issued by the admin tool according to an agreed policy.

There should be at least one empty host compatible with the hosts under maintenance in order to have a smooth rolling maintenance. For this to be possible, down scaling the application instances should also be possible.

The infrastructure admin should have a tool that is responsible for hosting a maintenance workflow session with the needed APIs for the admin and for the applications. The group of hosts in a single maintenance session should always have the same physical capabilities, so the rolling maintenance can be guaranteed.

The flow diagram is meant to be as high level as possible. It currently does not try to be perfect, but to show the most important interfaces needed between the VNFM and the infrastructure admin. This can be seen e.g. in the missing error handling, which can be defined later on.

Flow diagram:

Work flow in OpenStack

Flow diagram step by step:

  • The infrastructure admin creates a maintenance session to maintain and upgrade a certain group of hardware. At least the compute hardware in a single session should have the same capabilities, like the number of VCPUs, to ensure the maintenance can be done node by node in a rolling fashion. The maintenance session needs a session_id, a unique ID carried throughout all events, which can be used in the APIs needed when interacting with the session. The maintenance session needs to know when the maintenance will start and what capabilities the possible infrastructure upgrade will bring to the application payload on top of it. It is a matter of the implementation to define in more detail whether more data is needed when creating a session, or whether it is defined in the admin tool configuration.

    There can be several parallel maintenance sessions, and a single session can include the payload of multiple projects. Typically a maintenance session should include similar types of compute hardware, so that moving instances between the compute hosts is guaranteed to work.

  • State MAINTENANCE project event and reply ACK_MAINTENANCE. Immediately after a maintenance session is created, the infrastructure admin tool will send a project specific ‘notification’, which the application manager can consume by subscribing to the AODH alarm for this event. As explained earlier, all `project event`s will only be sent if the project subscribes to the alarm; otherwise the interaction with the application will simply not take place and operations could be forced.

    The state MAINTENANCE event should at least include:

    • session_id to reference correct maintenance session.
    • state as MAINTENANCE to identify event action needed.
    • instance_ids to tell the project which of its instances will be affected by the maintenance. This might be a link to the admin tool's project specific API, as AODH variables are limited to strings of 255 characters.
    • reply_url for the application to call the admin tool's project specific API to answer ACK_MAINTENANCE, including the session_id.
    • project_id to identify the project.
    • actions_at time stamp to indicate when the maintenance workflow will start. The ACK_MAINTENANCE reply is needed before that time.
    • metadata to include key-value pairs of the capabilities coming with the maintenance operation, like ‘openstack_version’: ‘Queens’
  • Optional state DOWN_SCALE project event and reply ACK_DOWN_SCALE. When it is time to start the maintenance workflow, as the time reaches the actions_at defined in the previous state event, the admin tool needs to check whether there is already an empty compute host, as needed by the rolling maintenance. If there is no empty host, the admin tool can ask the application to down scale by sending the project specific DOWN_SCALE state event.

    The state DOWN_SCALE event should at least include:

    • session_id to reference correct maintenance session.
    • state as DOWN_SCALE to identify event action needed.
    • reply_url for the application to call the admin tool's project specific API to answer ACK_DOWN_SCALE, including the session_id.
    • project_id to identify the project.
    • actions_at time stamp to indicate the last moment to send ACK_DOWN_SCALE. This means the application can have time to finish some ongoing transactions before down scaling its instances. This guarantees zero downtime for its service.
  • Optional state PREPARE_MAINTENANCE project event and reply ACK_PREPARE_MAINTENANCE. If, even after down scaling the applications, there is still no empty compute host, the admin tool needs to analyze the situation on the compute hosts under maintenance. It needs to choose the compute node that is now almost empty, or otherwise has the least critical instances running, if possible, e.g. by checking for floating IPs. When the compute host is chosen, a PREPARE_MAINTENANCE state event can be sent to the projects that have instances running on this host, asking them to migrate the instances to other compute hosts. It might also be possible to have another round of the DOWN_SCALE state event if necessary, but this is not proposed here.

    The state PREPARE_MAINTENANCE event should at least include:

    • session_id to reference correct maintenance session.
    • state as PREPARE_MAINTENANCE to identify event action needed.
    • instance_ids to tell the project which of its instances will be affected by the state event. This might be a link to the admin tool's project specific API, as AODH variables are limited to strings of 255 characters.
    • reply_url for the application to call the admin tool's project specific API to answer ACK_PREPARE_MAINTENANCE, including the session_id and instance_ids as a list of key-value pairs, with the instance_id as key and the action chosen from the allowed actions given via allowed_actions as value.
    • project_id to identify the project.
    • actions_at time stamp to indicate the last moment to send ACK_PREPARE_MAINTENANCE. This means the application can have time to finish some ongoing transactions within its instances and make a possible switch-over. This guarantees zero downtime for its service.
    • allowed_actions to tell what the admin tool supports as actions to move instances to another compute host. Typically a list like: [‘MIGRATE’, ‘LIVE_MIGRATE’]
  • Optional state INSTANCE_ACTION_DONE project event. In case the admin tool needed to take an action to move an instance, like migrating it to another compute host, this state event will be sent to tell that the operation is complete.

    The state INSTANCE_ACTION_DONE event should at least include:

    • session_id to reference correct maintenance session.
    • instance_ids to tell the project which of its instances had the admin action done.
    • project_id to identify the project.
  • At this state it is guaranteed that there is an empty compute host. It would be maintained first, through the IN_MAINTENANCE and MAINTENANCE_COMPLETE steps, but following the flow chart, PLANNED_MAINTENANCE will be explained next.

  • Optional state PLANNED_MAINTENANCE project event and reply ACK_PLANNED_MAINTENANCE. In case the compute host to be maintained has instances, the projects owning them should get this state event. When a project receives this state event, it knows that instances moved to another compute host by the resulting actions will now go to a host that is already maintained. This means the host might have new capabilities that the project can take into use. This gives the project the possibility to also upgrade its instances to support new capabilities as part of the action chosen to move them.

    The state PLANNED_MAINTENANCE event should at least include:

    • session_id to reference correct maintenance session.
    • state as PLANNED_MAINTENANCE to identify event action needed.
    • instance_ids to tell the project which of its instances will be affected by the event. This might be a link to the admin tool's project specific API, as AODH variables are limited to strings of 255 characters.
    • reply_url for the application to call the admin tool's project specific API to answer ACK_PLANNED_MAINTENANCE, including the session_id and instance_ids as a list of key-value pairs, with the instance_id as key and the action chosen from the allowed actions given via allowed_actions as value.
    • project_id to identify the project.
    • actions_at time stamp to indicate the last moment to send ACK_PLANNED_MAINTENANCE. This means the application can have time to finish some ongoing transactions within its instances and make a possible switch-over. This guarantees zero downtime for its service.
    • allowed_actions to tell what the admin tool supports as actions to move instances to another compute host. Typically a list like: [‘MIGRATE’, ‘LIVE_MIGRATE’, ‘OWN_ACTION’]. OWN_ACTION means the application may want to re-instantiate its instance, perhaps to take into use a new capability coming with the infrastructure maintenance. A re-instantiated instance will go to an already maintained host having the new capability.
    • metadata to include key-value pairs of the capabilities coming with the maintenance operation, like ‘openstack_version’: ‘Queens’
  • State IN_MAINTENANCE and MAINTENANCE_COMPLETE `admin event`s. Just before a host goes into maintenance, the IN_MAINTENANCE state event will be sent to indicate the host is entering maintenance. The host is then taken out of production and can be powered off, replaced, or rebooted during the operation. During the maintenance and upgrade, the host might be moved to the admin’s own host aggregate, so it can be tested to work before being put back into production. After maintenance is complete, the MAINTENANCE_COMPLETE state event will be sent to indicate the host is back in use. Adding or removing a host is not yet included in this concept, but can be addressed later.

    The state IN_MAINTENANCE and MAINTENANCE_COMPLETE event should at least include:

    • session_id to reference correct maintenance session.
    • state as IN_MAINTENANCE or MAINTENANCE_COMPLETE to indicate host state.
    • project_id to identify admin project needed by AODH alarm.
    • host to indicate the host name.
  • State MAINTENANCE_COMPLETE project event and reply MAINTENANCE_COMPLETE_ACK. After all compute nodes in the maintenance session have gone through the maintenance operation, this state event can be sent to all projects that had instances running on any of those nodes. If a down scale was done, the application can now up scale back to full operation.

    • session_id to reference correct maintenance session.
    • state as MAINTENANCE_COMPLETE to identify event action needed.
    • instance_ids to tell the project which of its instances are currently running on hosts maintained in this maintenance session. This might be a link to the admin tool's project specific API, as AODH variables are limited to strings of 255 characters.
    • reply_url for the application to call the admin tool's project specific API to answer MAINTENANCE_COMPLETE_ACK, including the session_id.
    • project_id to identify the project.
    • actions_at time stamp to indicate when the maintenance workflow will start.
    • metadata to include key-value pairs of the capabilities coming with the maintenance operation, like ‘openstack_version’: ‘Queens’
  • At the end, the admin tool maintenance session can enter the MAINTENANCE_COMPLETE state and the session can be removed.
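As an illustration, a project state event payload assembling the fields listed above might look as follows; all values are placeholders, and the exact encoding is left to the implementation:

```python
# Illustrative MAINTENANCE project event payload. Every value below is a
# placeholder example; field names follow the lists in this guideline.
maintenance_event = {
    "session_id": "<session uuid>",
    "state": "MAINTENANCE",
    # may be a link to the admin tool API instead of an inline list,
    # because AODH variables are limited to 255-character strings
    "instance_ids": "<admin tool URL>/maintenance/<session_id>/<project_id>",
    "reply_url": "<admin tool URL>/maintenance/<session_id>/<project_id>",
    "project_id": "<project uuid>",
    "actions_at": "<ISO 8601 time stamp>",
    "metadata": {"openstack_version": "Queens"},
}

# The application manager replies before actions_at by calling reply_url:
ack = {"session_id": maintenance_event["session_id"], "state": "ACK_MAINTENANCE"}
print(ack["state"])
```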

Benefits
  • Application is guaranteed zero downtime as it is aware of the maintenance action affecting its payload. The application is made aware of the maintenance time window to make sure it can prepare for it.
  • The application gets to know new capabilities coming with infrastructure maintenance and upgrade, and can utilize them (e.g. do its own upgrade)
  • Any application supporting the interaction being defined could be running on top of the same infrastructure provider. No vendor lock-in for application.
  • Any infrastructure component can be aware of host(s) under maintenance via `admin event`s about host state. No vendor lock-in for infrastructure components.
  • Generic messaging makes it possible to use the same concept in different types of clouds and with different application payloads. instance_ids will uniquely identify any type of instance, and a similar notification payload can be used regardless of whether we are in OpenStack. The workflow just needs to support different cloud infrastructure managers to support different clouds.
  • No additional hardware is needed during maintenance operations, as down- and up-scaling can be supported for the applications. This is optional if no extensive spare capacity is available for the maintenance, as is typically the case in Telco environments.
  • Parallel maintenance sessions for different groups of hardware. The same session should include hardware with the same capabilities to guarantee rolling maintenance actions.
  • Multi-tenancy support. Project specific messaging about maintenance.
Future considerations
  • Pluggable architecture for infrastructure admin tool to handle different clouds and payloads.
  • Pluggable architecture to handle specific maintenance/upgrade cases like OpenStack upgrade between specific versions or admin testing before giving host back to production.
  • Support for user specific details need to be taken into account in admin side actions (e.g. run a script, ...).
  • (Re-)Use existing implementations like Mistral for work flows.
  • Scaling hardware resources. Allow critical applications to be scaled at the same time in a controlled fashion, or retire applications.
POC

There was a maintenance PoC demo, ‘How to gain VNF zero down-time during Infrastructure Maintenance and Upgrade’, at the OCP and ONS summit in March 2018. A similar concept is also being implemented as a new test case scenario in the OPNFV Doctor project.

Inspector Design Guideline

Note

This is spec draft of design guideline for inspector component. JIRA ticket to track the update and collect comments: DOCTOR-73.

This document summarize the best practise in designing a high performance inspector to meet the requirements in OPNFV Doctor project.

Problem Description

Some pitfalls has be detected during the development of sample inspector, e.g. we suffered a significant performance degrading in listing VMs in a host.

A patch set for caching the list has been committed to solve issue. When a new inspector is integrated, it would be nice to have an evaluation of existing design and give recommendations for improvements.

This document can be treated as a source of related blueprints in inspector projects.

Guidelines
Host specific VMs list

While requirement in doctor project is to have alarm about fault to consumer in one second, it is just a limit we have set in requirements. When talking about fault management in Telco, the implementation needs to be by all means optimal and the one second is far from traditional Telco requirements.

One thing to be optimized in inspector is to eliminate the need to read list of host specific VMs from Nova API, when it gets a host specific failure event. Optimal way of implementation would be to initialize this list when Inspector start by reading from Nova API and after this list would be kept up-to-date by instance.update notifications received from nova. Polling Nova API can be used as a complementary channel to make snapshot of hosts and VMs list in order to keep the data consistent with reality.

This is enhancement and not perhaps something needed to keep under one second in a small system. Anyhow this would be something needed in case of production use.

This guideline can be summarized as follows:

  • cache the host VMs mapping instead of reading it on request
  • subscribe and handle update notifications to keep the list up to date
  • make snapshot periodically to ensure data consistency
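The three guidelines above can be sketched in one small cache class. This is a minimal illustration under stated assumptions, not Doctor's actual implementation: `list_all_vms` stands in for the initial Nova API query, and `on_instance_update` for a handler subscribed to Nova's instance.update notifications.

```python
import time


class HostVMCache:
    """In-memory host -> set(VM ids) mapping for an inspector (sketch)."""

    def __init__(self, list_all_vms):
        # list_all_vms: callable returning [(host, vm_id), ...] from Nova
        self._list_all_vms = list_all_vms
        self._map = {}
        self.refresh()

    def refresh(self):
        """Full snapshot from the Nova API (run periodically)."""
        new_map = {}
        for host, vm_id in self._list_all_vms():
            new_map.setdefault(host, set()).add(vm_id)
        self._map = new_map
        self._last_refresh = time.time()

    def on_instance_update(self, vm_id, old_host, new_host):
        """Apply an instance.update notification (e.g. a migration)."""
        if old_host and old_host in self._map:
            self._map[old_host].discard(vm_id)
        if new_host:
            self._map.setdefault(new_host, set()).add(vm_id)

    def vms_on_host(self, host):
        """O(1) lookup on a host failure event -- no Nova API call."""
        return self._map.get(host, set())
```

A periodic task would call `refresh()` to re-synchronize the cache, while the failure handler answers `vms_on_host()` from memory.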
Parallel execution

In Doctor’s architecture, the inspector is responsible for setting the error state of the affected VMs in order to notify consumers of the failure. This is done by calling the Nova reset-state API. However, this action is a synchronous request with many underlying steps and typically costs hundreds of milliseconds. According to the discussion on the mailing list, this time cost grows linearly if the requests are sent one by one, which becomes a critical issue in a large-scale system.

It is recommended to introduce parallel execution for actions like reset-state that take a list of targets.
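As a sketch, parallel execution can be as simple as a thread pool over the target list. `reset_state` here is a hypothetical stand-in for the per-VM Nova API call, each of which costs hundreds of milliseconds; with N workers the wall time is roughly divided by N instead of growing linearly.

```python
from concurrent.futures import ThreadPoolExecutor


def reset_state_parallel(reset_state, vm_ids, max_workers=8):
    """Issue per-VM reset-state calls concurrently instead of one by one.

    reset_state: callable taking a VM id (stand-in for the Nova request).
    Returns the results in the same order as vm_ids.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and re-raises any call's exception
        return list(pool.map(reset_state, vm_ids))
```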

Shortcut notification

An alternative way to improve notification performance is to take a shortcut from the inspector to the notifier instead of triggering it from the controller. The difference between the two workflows is shown below:

conservative notification

Conservative Notification

shortcut notification

Shortcut Notification

It is worth noting that the shortcut notification has a side effect: cloud resource states could still be out of sync by the time the consumer processes the alarm notification. This is out of scope for the inspector design, but it needs to be taken into consideration at the system level.

Also, the call to “reset servers state to error” is not necessary in the alternative notification case, where “host forced down” is still called. “get-valid-server-state” was implemented to provide a valid server state, which earlier could not be obtained without calling “reset servers state to error”. Without “reset servers state to error”, states are less likely to be out of sync, since the notification and the forcing down of the host can run in parallel.

Appendix

A study evaluating the effect of parallel execution and shortcut notification was presented at the OPNFV Beijing Summit 2017.

notification time

Notification Time

Download the full presentation slides here.

Manuals
OpenStack NOVA API for marking host down.
What the API is for
This API gives an external fault monitoring system the ability to quickly tell OpenStack Nova that a compute host is down. This immediately enables evacuation of any VM on the host, and thus faster HA actions.
What this API does
In OpenStack, the state of the nova-compute service represents the compute host state, and this API is used to force this service down. It is assumed that the caller of this API has made sure the host is also fenced or powered down. This is important so that there is no chance the same VM instance appears twice after being evacuated to a new compute host. When the host is recovered by any means, the external system is responsible for calling the API again to clear the forced_down flag and let the host's nova-compute service report the host as up again. If a network-fenced host comes up again, it should not boot the VMs it previously hosted if it determines they have been evacuated to another compute host. The decision of whether to delete or boot the VMs that used to be on the host should later be made more reliable by the Nova blueprint: https://blueprints.launchpad.net/nova/+spec/robustify-evacuate
REST API for forcing down:

Parameter explanations:

  • tenant_id: Identifier of the tenant.
  • binary: Compute service binary name.
  • host: Compute host name.
  • forced_down: Compute service forced down flag.
  • token: Token received after successful authentication.
  • service_host_ip: Serving controller node IP.

request:

PUT /v2.1/{tenant_id}/os-services/force-down
{
  "binary": "nova-compute",
  "host": "compute1",
  "forced_down": true
}

response:

200 OK
{
  "service": {
    "host": "compute1",
    "binary": "nova-compute",
    "forced_down": true
  }
}

Example:

curl -g -i -X PUT http://{service_host_ip}:8774/v2.1/{tenant_id}/os-services/force-down \
  -H "Content-Type: application/json" -H "Accept: application/json" \
  -H "X-OpenStack-Nova-API-Version: 2.11" -H "X-Auth-Token: {token}" \
  -d '{"binary": "nova-compute", "host": "compute1", "forced_down": true}'

CLI for forcing down:

nova service-force-down <hostname> nova-compute

Example: nova service-force-down compute1 nova-compute

REST API for disabling forced down:

Parameter explanations:

  • tenant_id: Identifier of the tenant.
  • binary: Compute service binary name.
  • host: Compute host name.
  • forced_down: Compute service forced down flag.
  • token: Token received after successful authentication.
  • service_host_ip: Serving controller node IP.

request:

PUT /v2.1/{tenant_id}/os-services/force-down
{
  "binary": "nova-compute",
  "host": "compute1",
  "forced_down": false
}

response:

200 OK
{
  "service": {
    "host": "compute1",
    "binary": "nova-compute",
    "forced_down": false
  }
}

Example:

curl -g -i -X PUT http://{service_host_ip}:8774/v2.1/{tenant_id}/os-services/force-down \
  -H "Content-Type: application/json" -H "Accept: application/json" \
  -H "X-OpenStack-Nova-API-Version: 2.11" -H "X-Auth-Token: {token}" \
  -d '{"binary": "nova-compute", "host": "compute1", "forced_down": false}'

CLI for disabling forced down:

nova service-force-down --unset <hostname> nova-compute

Example: nova service-force-down --unset compute1 nova-compute
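For illustration, the force-down and unset calls above can also be issued programmatically. The sketch below only builds the request tuple (method, URL, headers, body) for use with any HTTP client; the function name and return shape are our own, not part of Nova.

```python
import json


def build_force_down_request(service_host_ip, tenant_id, token, host,
                             forced_down):
    """Build the Nova os-services force-down request (microversion 2.11).

    Returns (method, url, headers, body); sending the request is left
    to the caller's HTTP client of choice.
    """
    url = ("http://%s:8774/v2.1/%s/os-services/force-down"
           % (service_host_ip, tenant_id))
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "X-OpenStack-Nova-API-Version": "2.11",
        "X-Auth-Token": token,
    }
    # forced_down=True marks the host down; False clears the flag
    body = json.dumps({"binary": "nova-compute",
                       "host": host,
                       "forced_down": forced_down})
    return "PUT", url, headers, body
```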

Get valid server state
Problem description

Previously, when the owner of a VM queried their VMs, they did not receive enough state information, the states did not change fast enough in the VIM, and they were not accurate in some scenarios. With this change, that gap is now closed.

A typical case is that, upon a fault of a host, the user of a high-availability service running on top of that host needs to make an immediate switch-over from the faulty host to an active standby host. Now, if the compute host is forced down [1] as a result of that fault, the user has to be notified about this state change so that the user can react accordingly. Similarly, a change of the host state to “maintenance” should also be notified to the users.

What is changed

A new host_status parameter is added to the /servers/{server_id} and /servers/detail endpoints in microversion 2.16. With this new parameter, the user can get additional state information about the host.

Possible host_status values, where each later value in the list overrides the earlier ones:

  • UP if nova-compute is up.
  • UNKNOWN if nova-compute status was not reported by servicegroup driver within configured time period. Default is within 60 seconds, but can be changed with service_down_time in nova.conf.
  • DOWN if nova-compute was forced down.
  • MAINTENANCE if nova-compute was disabled. MAINTENANCE in API directly means nova-compute service is disabled. Different wording is used to avoid the impression that the whole host is down, as only scheduling of new VMs is disabled.
  • Empty string indicates there is no host for server.
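The override order above can be restated as a small decision function. This is an illustrative re-statement only; Nova derives host_status internally from the servicegroup driver and the service's disabled and forced_down flags, and the parameter names here are our own.

```python
def host_status(has_host, forced_down, disabled, last_seen_ago,
                service_down_time=60):
    """Derive host_status per the precedence list above (sketch).

    last_seen_ago: seconds since the servicegroup driver last reported
    the nova-compute service (hypothetical input for illustration).
    """
    if not has_host:
        return ""             # no host for the server
    if disabled:
        return "MAINTENANCE"  # nova-compute disabled (scheduling off)
    if forced_down:
        return "DOWN"         # operator forced the service down
    if last_seen_ago > service_down_time:
        return "UNKNOWN"      # no report within the configured window
    return "UP"
```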

host_status is returned in the response if the policy permits. By default, the policy restricts it to admin only in the Nova policy.json:

"os_compute_api:servers:show:host_status": "rule:admin_api"

For an NFV use case this has to also be enabled for the owner of the VM:

"os_compute_api:servers:show:host_status": "rule:admin_or_owner"
REST API examples:

Case where nova-compute is enabled and reporting normally:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "UP",
    ...
  }
}

Case where nova-compute is enabled, but not reporting normally:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "UNKNOWN",
    ...
  }
}

Case where nova-compute is enabled, but forced_down:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "DOWN",
    ...
  }
}

Case where nova-compute is disabled:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "MAINTENANCE",
    ...
  }
}

Host Status is also visible in python-novaclient:

+-------+------+--------+------------+-------------+----------+-------------+
| ID    | Name | Status | Task State | Power State | Networks | Host Status |
+-------+------+--------+------------+-------------+----------+-------------+
| 9a... | vm1  | ACTIVE | -          | RUNNING     | xnet=... | UP          |
+-------+------+--------+------------+-------------+----------+-------------+
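For the HA switch-over use case described earlier, a consumer could react to the host_status field of a GET /servers/{server_id} response roughly as follows. This is a hypothetical helper for illustration, not part of any OPNFV component.

```python
def needs_switchover(server_json):
    """Decide whether an HA consumer should fail over to its standby,
    based on the host_status field of a /servers/{server_id} response.
    """
    status = server_json.get("server", {}).get("host_status", "")
    # DOWN (host forced down after a fault) and MAINTENANCE (service
    # disabled) both mean the active instance should be abandoned.
    return status in ("DOWN", "MAINTENANCE")
```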

Edgecloud

Edge Cloud Requirement in OPNFV
1. Introduction

This Edge Cloud Requirement Document is used for eliciting the telecom network edge cloud requirements of OPNFV, where telecom network edge clouds are edge clouds deployed into the telecommunication infrastructure. Edge clouds deployed beyond the borders of telecommunication networks are outside the scope of this document. This document defines high-level telecom network edge cloud goals, including service requirements and site conditions, and translates them into detailed requirements on edge cloud infrastructure components. Moreover, this document can be used as a reference for edge cloud testing scenario design.

2. Definitions & Terminologies

The following terminologies will be used in this document:

Core site(s): Sites that are far away from end users/ base stations, completely virtualized, and mainly host control domain services (e.g. telco services: HSS, MME, IMS, EPC, etc).

Edge site(s): Sites that are closer to end users/ base stations, and mainly host control and compute services.

E2E delay: time of the transmission process between the user equipment and the edge cloud site. It contains four parts: time of radio transmission, time of optical fiber transmission, time of GW forwarding, and time of VM forwarding.

BBU: Building Baseband Unit. It is a centralized processing unit for radio signals. Together with the RRU (Remote Radio Unit), it forms the distributed base station architecture. For example, a large stadium is usually separated into different districts. Each district is provided with an RRU, close to the users, to provide radio access. All RRUs are linked over optical fiber to a BBU, located in a remote site away from the users, which provides signal processing.

BRAS: Broadband Remote Access Server. An Ethernet-centric IP edge router, and the aggregation point for the user traffic. It performs Ethernet aggregation and packets forwarding via IP/MPLS, and supports user management, access protocols termination, QoS and policy management, etc.

UPF: User Plane Function, which is a user plane gateway for user data transmission.

SAE-GW: SAE stands for System Architecture Evolution, which is the core network architecture of 3GPP’s LTE wireless communication standard. SAE-GW includes the Serving Gateway and the PDN Gateway. The Serving Gateway (SGW) routes and forwards user data packets, and also acts as the mobility anchor for LTE and other 3GPP technologies. The PDN Gateway (PGW) provides connectivity from the UE to external packet data networks by being the point of exit and entry of traffic for the UE.

SAE-GW related definition link: https://en.wikipedia.org/wiki/System_Architecture_Evolution

CPE: In telecommunications, a customer-premises equipment or customer-provided equipment (CPE) is any terminal and associated equipment located at a subscriber’s premises and connected with a carrier’s telecommunication circuit. CPE generally refers to devices such as telephones, routers, network switches, residential gateways (RG), home networking adapters and Internet access gateways that enable consumers to access communications service providers’ services and distribute them around their house via a local area network (LAN).

CPE definition: https://en.wikipedia.org/wiki/Customer-premises_equipment

enterprise vCPE: A CPE usually provides a number of network functions such as firewall, access control, policy management and discovering/connecting devices at home. Enterprise vCPE stands for virtual CPE for enterprises, a software framework that virtualizes several CPE functions.

4. Features of Edge
4.1. Resource optimized control

As space and power resources are limited in edge sites, and the edge usually has a smaller number of servers (varying from a few to several dozen), it is unnecessary to deploy an orchestrator or VNFM there. The deployed VIM (e.g. OpenStack or Kubernetes) and SDN would be optimized for low resource usage to save resources for services. Resource optimization of VIM and SDN has not been discussed yet, but basic functions such as VM lifecycle management and automatic network management should be preserved.

4.2. Remote provisioning

As there is no professional maintenance staff at the edge, remote provisioning should be provided so that the virtual resources of distributed edge sites can be orchestrated and maintained in a unified way. The orchestrator, together with OSS/BSS, EMS and VNFM, should be deployed remotely in central offices to reduce the difficulty and cost of management as well as to increase the edge resource utilization ratio. Multi-region OpenStack could be considered as one of the VIM solutions.

4.3. Resource diversity

With various applications running at the edge, diverse resources, including VMs, containers and bare metal, could co-exist and form a diverse resource pool. These resources should be managed by edge management components as well as by core orchestration/management components.

4.4. Hardware/Software acceleration

Edge services usually require strictly low latency, high bandwidth, and fast computing and processing ability. Acceleration technology should be used at the edge to maintain good service performance. OpenStack should fully expose these acceleration capabilities to services. The usage of different acceleration technologies (including DPDK, SR-IOV, GPU, Smart NIC, FPGA, etc.) varies from service to service.

Related project about acceleration: https://wiki.openstack.org/wiki/Cyborg

5. Edge Sites Conditions/ Deployment Scenarios

Latency and distance to the customer are taken as the two main characteristics that separate different sites. The following figure shows three different sites.

Edge Sites Structure
5.1. Small Edge
  • Distance to base station: around 10 km, closest site to end users / base station
  • E2E delay(from UE to site): around 2 ms
  • Maximum bandwidth the site can provide: 50 GB/s
  • Minimum hardware specs: 1 unit of
    • 4 cores (two ARM or Xeon-D processors)
    • 8 GB RAM (4 DIMM)
    • 1 * 240 GB SSD (2 * 2.5)
  • Maximum hardware specs: 5 unit of
    • 16 cores
    • 64 GB RAM
    • 1 * 1 TB storage
  • Power for a site: < 10 kW
  • Physical access of maintainer: Rare, maintenance staff may only show up in this kind of site when machines initialize for the first time or a machine is down. Maintenance staff is skilled in mechanical engineering and not in IT.
  • Physical security: none (Optionally secure booting is needed)
  • Expected frequency of updates to hardware: 3-4 year refresh cycle
  • Expected frequency of updates to firmware: 6-12 months
  • Expected frequency of updates to control systems (e.g. OpenStack or Kubernetes controllers): ~ 12 - 24 months, has to be possible from remote management
  • Physical size: 482.6 mm (19 inch) width rack. Not all the sites will have 1000 mm (39 inch) depth capability. Some sites might be limited to 600 mm (24 inch) depth.
  • Cooling: front cooling
  • Access / cabling: front
  • NEBS 3 compliant
  • Number of edge cloud instances: depends on demands (3000+)
  • Services that might be deployed here: MEC, or other services with strict latency requirements. Services deployed in this kind of site differ greatly by region
  • Remote network connection reliability: no 100% uptime; variable connectivity expected
  • Orchestration: no orchestration component. MANO deployed in the core site provides remote orchestration
  • Degree of virtualization: it is possible that no virtualization technology is used in a small edge site if virtualization increases structure/network complexity, reduces service performance, or consumes more resources. Bare metal is common in small edge sites. Containers would also be a future choice if virtualization is needed
  • Smart NICs are supported
  • Storage: mainly local storage.
5.2. Medium Edge
  • Distance to base station: around 50 km
  • E2E delay (from UE to site): less than 2.5 ms
  • Maximum bandwidth the site can provide: 100 GB/s
  • Minimum hardware specs: 2 Rack Unit (RU)
  • Maximum hardware specs: 20 Rack Unit
  • Power for a site: 10 - 20 kW
  • Physical access of maintainer: Rare. Maintenance staff is skilled in mechanical engineering and not in IT.
  • Physical security: Medium. Probably not in a secure data center, but in a semi-physically secure environment; each device has some authentication (such as a certificate) to verify it is a legitimate piece of hardware deployed by the operator; network access is all through security-enhanced methods (VPN, connected back to a DMZ). VPN itself is not considered secure, so other mechanisms such as HTTPS should be employed as well.
  • Expected frequency of updates to hardware: 5-7 years
  • Expected frequency of updates to firmware: Never unless required to fix blocker/critical bug(s)
  • Expected frequency of updates to control systems (e.g. OpenStack or Kubernetes controllers): 12 - 24 months
  • Physical size: TBD
  • Cooling: front cooling
  • Access / cabling: front
  • NEBS 3 compliant
  • Number of edge cloud instances: 3000+
  • Services that might be deployed here: MEC, RAN, CPE, etc.
  • Remote network connection reliability: 24/7 (high uptime, but connectivity is variable); 100% uptime expected
  • Orchestration: no orchestration component. MANO deployed in the core site provides remote orchestration.
  • Degree of virtualization: depends on site conditions and service requirements. VMs and containers may form a hybrid virtualization layer. Bare metal is possible in medium sites
  • Smart NICs are supported
  • Storage: local storage and distributed storage, which depends on site conditions and services’ needs
5.3. Large Edge
  • Distance to base station: 80 - 300 km
  • E2E delay: around 4 ms
  • Maximum bandwidth the site can provide: 200 GB/s
  • Minimum hardware specs: N/A
  • Maximum hardware specs: 100+ servers
  • Power for a site: 20 - 90 kW
  • Physical access of maintainer: a professional maintainer will monitor the site. Maintenance staff is skilled in mechanical engineering and not in IT.
  • Physical security: High
  • Expected frequency of updates to hardware: 36 months
  • Expected frequency of updates to firmware: Never unless required to fix blocker/critical bug(s)
  • Expected frequency of updates to control systems (e.g. OpenStack or Kubernetes controllers): 12 - 24 months
  • Physical size: same as a normal DC
  • Cooling: front cooling
  • Access / cabling: front
  • NEBS 3 compliant
  • Number of edge cloud instances: 600+
  • Services that might be deployed here: CDN, SAE-GW, UPF, CPE, etc., which have large bandwidth requirements and relatively low latency requirements
  • Remote network connection reliability: reliable and stable
  • Orchestration: no orchestration component. MANO deployed in the core site provides remote orchestration
  • Degree of virtualization: almost completely virtualized in the form of VMs (if CDNs, which may not be virtualized, are taken into consideration, the degree of virtualization decreases in sites with CDN deployments)
  • Smart NICs are supported
  • Storage: distributed storage
6. Edge Structure

Based on requirements of telco related use cases and edge sites conditions, the edge structure has been summarized as the figure below.

Edge Structure
7. Requirements & Features on NFV Components
7.1. Hardware

Customized servers may be needed at the edge because of limits on space, power, temperature, vibration, etc. But if custom enclosures can provide environmental controls, then non-customized servers can be used, which is a cost tradeoff.

More details: TBD

7.2. Acceleration

Hardware acceleration resources and acceleration software would be necessary for edge.

More details: TBD

7.3. OpenStack

The edge OpenStack would have a hierarchical structure. Remote provisioning, such as multi-region OpenStack, would exist in large edge sites with professional maintenance staff and provide remote management of several medium/small edge sites. Medium and small edge sites would not only have their own resource management components to provide local resource and network management, but would also be under the remote provisioning of the OpenStack in large edge sites.

Hierarchical OpenStack

Optionally, for large edge sites, OpenStack would be fully deployed. Its Keystone and Horizon would provide unified tenant and UI management both for itself and for the remote medium and small edge sites. In this case, medium edge sites would run OpenStack with the necessary services such as Nova, Neutron and Glance, while small edge sites would use a resource-optimized, lightweight OpenStack.

Another option is to use different instances of the same resource-optimized OpenStack to control large, medium and small edge sites.

More details: TBD

7.4. SDN

TBD

7.5. Orchestration & Management

Orchestration and VNF lifecycle management: NFVO, VNFM, EMS exist in core cloud and provide remote lifecycle management.

More details: TBD

7.6. Container

VMs, containers and bare metal would exist as three different types of infrastructure resources. Which type of resources to use depends on services’ requirements and site conditions. The introduction of containers would be a future topic.

IPV6

2. IPv6 Installation Procedure
Abstract:

This document provides the users with the Installation Procedure to install OPNFV Gambia Release on IPv6-only Infrastructure.

2.1. Install OPNFV on IPv6-Only Infrastructure

This section provides instructions to install OPNFV on IPv6-only Infrastructure. All underlay networks and API endpoints will be IPv6-only except:

  1. The “admin” network in the underlay/undercloud still has to be IPv4.
  • This was due to the lack of support for IPMI over IPv6 and PXE over IPv6.
  • iPXE does support IPv6 now, and Ironic has added support for booting nodes with IPv6.
  • We are starting to work on enabling an IPv6-only environment for all networks. For TripleO, this work is still ongoing.
  2. The metadata server is still IPv4-only.

Apart from the limitations above, the use case scenarios of the IPv6-only infrastructure include:

  1. Support OPNFV deployment on an IPv6 only infrastructure.
  2. Horizon/ODL-DLUX access using IPv6 address from an external host.
  3. OpenStack API access using IPv6 addresses from various python-clients.
  4. Ability to create Neutron Routers, IPv6 subnets (e.g. SLAAC/DHCPv6-Stateful/ DHCPv6-Stateless) to support North-South traffic.
  5. Inter VM communication (East-West routing) when VMs are spread across two compute nodes.
  6. VNC access into a VM using IPv6 addresses.
  7. IPv6 support in OVS VxLAN (and/or GRE) tunnel endpoints with OVS 2.6+.
  8. IPv6 support in iPXE, and booting nodes with IPv6 (NEW).
2.1.1. Install OPNFV in OpenStack-Only Environment

Apex Installer:

# HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Note:
#
# 1. Parameter "-v" is mandatory for Virtual deployment
# 2. Parameter "-i <inventory file>" is mandatory for Bare Metal deployment
# 2.1 Refer to https://git.opnfv.org/cgit/apex/tree/config/inventory for examples of inventory file
# 3. You can use "-n /etc/opnfv-apex/network_settings.yaml" for deployment in IPv4 infrastructure

Please NOTE that:

  • You need to refer to installer’s documentation for other necessary parameters applicable to your deployment.
  • You need to refer to Release Notes and installer’s documentation if there is any issue in installation.
2.1.2. Install OPNFV in OpenStack with ODL-L3 Environment

Apex Installer:

# HA, Virtual deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-odl-nofeature-ha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# HA, Bare Metal deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -d /etc/opnfv-apex/os-odl-nofeature-ha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Virtual deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-odl-nofeature-noha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Bare Metal deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -d /etc/opnfv-apex/os-odl-nofeature-noha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Note:
#
# 1. Parameter "-v" is mandatory for Virtual deployment
# 2. Parameter "-i <inventory file>" is mandatory for Bare Metal deployment
# 2.1 Refer to https://git.opnfv.org/cgit/apex/tree/config/inventory for examples of inventory file
# 3. You can use "-n /etc/opnfv-apex/network_settings.yaml" for deployment in IPv4 infrastructure

Please NOTE that:

  • You need to refer to installer’s documentation for other necessary parameters applicable to your deployment.
  • You need to refer to Release Notes and installer’s documentation if there is any issue in installation.
2.1.3. Testing Methodology

There are two levels of testing to validate the deployment.

2.1.3.1. Underlay Testing for OpenStack API Endpoints

Underlay Testing is to validate that API endpoints are listening on IPv6 addresses. Currently, we are only considering the Underlay Testing for OpenStack API endpoints. The Underlay Testing for Open Daylight API endpoints is for future release.

The Underlay Testing for OpenStack API endpoints can be as simple as validating Keystone service, and as complete as validating each API endpoint. It is important to reuse Tempest API testing. Currently:

  • Apex Installer will change OS_AUTH_URL in overcloudrc during installation process. For example: export OS_AUTH_URL=http://[2001:db8::15]:5000/v2.0. OS_AUTH_URL points to Keystone and Keystone catalog.
  • When FuncTest runs Tempest for the first time, the OS_AUTH_URL is taken from the environment and placed automatically in Tempest.conf.
  • Under this circumstance, openstack catalog list will return IPv6 URL endpoints for all the services in catalog, including Nova, Neutron, etc, and covering public URLs, private URLs and admin URLs.
  • Thus, as long as the IPv6 URL is given in the overcloudrc file, all the tests will use it (including Tempest).

Therefore Tempest API testing is reused to validate API endpoints are listening on IPv6 addresses as stated above. They are part of OpenStack default Smoke Tests, run in FuncTest and integrated into OPNFV’s CI/CD environment.
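A minimal stand-in for this underlay check is to verify that a catalog endpoint URL carries an IPv6 address literal, e.g. http://[2001:db8::15]:5000/v2.0. This is illustrative only; the real validation reuses the Tempest API tests as described above.

```python
import ipaddress
from urllib.parse import urlsplit


def is_ipv6_endpoint(url):
    """Return True if the URL's host is an IPv6 address literal."""
    host = urlsplit(url).hostname  # strips the [] around IPv6 literals
    if host is None:
        return False
    try:
        return isinstance(ipaddress.ip_address(host),
                          ipaddress.IPv6Address)
    except ValueError:
        return False               # a DNS name, not an IP literal
```

Running this over every URL in `openstack catalog list` output would flag any service endpoint still bound to IPv4.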

2.1.3.2. Overlay Testing

Overlay Testing is to validate that IPv6 is supported in tenant networks, subnets and routers. Both Tempest API testing and Tempest Scenario testing are used in our Overlay Testing.

Tempest API testing validates that the Neutron API supports the creation of IPv6 networks, subnets, routers, etc:

tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_network
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_port
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_subnet
tempest.api.network.test_networks.NetworksIpV6Test.test_create_update_delete_network_subnet
tempest.api.network.test_networks.NetworksIpV6Test.test_external_network_visibility
tempest.api.network.test_networks.NetworksIpV6Test.test_list_networks
tempest.api.network.test_networks.NetworksIpV6Test.test_list_subnets
tempest.api.network.test_networks.NetworksIpV6Test.test_show_network
tempest.api.network.test_networks.NetworksIpV6Test.test_show_subnet
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_create_update_delete_network_subnet
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_external_network_visibility
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_list_networks
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_list_subnets
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_show_network
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_show_subnet
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_in_allowed_allocation_pools
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_with_no_securitygroups
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_update_delete_port
tempest.api.network.test_ports.PortsIpV6TestJSON.test_list_ports
tempest.api.network.test_ports.PortsIpV6TestJSON.test_show_port
tempest.api.network.test_routers.RoutersIpV6Test.test_add_multiple_router_interfaces
tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_port_id
tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_subnet_id
tempest.api.network.test_routers.RoutersIpV6Test.test_create_show_list_update_delete_router
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_list_update_show_delete_security_group
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_show_delete_security_group_rule
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_list_security_groups

Tempest Scenario testing validates some specific overlay IPv6 scenarios (i.e. use cases) as follows:

tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_dhcpv6_stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_slaac_from_os
tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_dhcpv6_stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
tempest.scenario.test_network_v6.TestGettingAddress.test_slaac_from_os

The above Tempest API testing and Scenario testing are quite comprehensive to validate overlay IPv6 tenant networks. They are part of OpenStack default Smoke Tests, run in FuncTest and integrated into OPNFV’s CI/CD environment.

IPv6 Configuration Guide
Abstract:

This document provides the users with the Configuration Guide to set up a service VM as an IPv6 vRouter using OPNFV Gambia Release.

1. IPv6 Configuration - Setting Up a Service VM as an IPv6 vRouter

This section provides instructions to set up a service VM as an IPv6 vRouter using OPNFV Gambia Release installers. Because Open Daylight no longer supports the L2-only option, and there is only limited IPv6 support in the L3 option of Open Daylight, setting up a service VM as an IPv6 vRouter is only available in a pure/native OpenStack environment. The deployment model may be HA or non-HA. The infrastructure may be bare metal or a virtual environment.

1.1. Pre-configuration Activities

The configuration will work only in OpenStack-only environment.

Depending on which installer will be used to deploy OPNFV, each environment may be deployed on bare metal or virtualized infrastructure. Each deployment may be HA or non-HA.

Refer to the previous installer configuration chapters, installations guide and release notes.

1.2. Setup Manual in OpenStack-Only Environment

If you intend to set up a service VM as an IPv6 vRouter in an OpenStack-only environment of the OPNFV Gambia Release, please NOTE that:

  • Because the anti-spoofing rules of the Security Group feature in OpenStack prevent a VM from forwarding packets, the Security Group feature must be disabled in the OpenStack-only environment.
  • The hostnames, IP addresses, and usernames in the instructions are examples; change them as needed to fit your environment.
  • The instructions apply both to the single-controller-node deployment model and to the HA (High Availability) deployment model, where multiple controller nodes are used.
1.2.1. Install OPNFV and Preparation

OPNFV-NATIVE-INSTALL-1: To install OpenStack-only environment of OPNFV Gambia Release:

Apex Installer:

# HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-n /etc/opnfv-apex/network_setting.yaml

# HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_setting.yaml

# Non-HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-n /etc/opnfv-apex/network_setting.yaml

# Non-HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_setting.yaml

# Note:
#
# 1. Parameter "-v" is mandatory for Virtual deployment
# 2. Parameter "-i <inventory file>" is mandatory for Bare Metal deployment
# 2.1 Refer to https://git.opnfv.org/cgit/apex/tree/config/inventory for examples of inventory file
# 3. You can use "-n /etc/opnfv-apex/network_setting_v6.yaml" for deployment in IPv6-only infrastructure

Compass Installer:

# HA deployment in OpenStack-only environment
export ISO_URL=file://$BUILD_DIRECTORY/compass.iso
export OS_VERSION=${COMPASS_OS_VERSION}
export OPENSTACK_VERSION=${COMPASS_OPENSTACK_VERSION}
export CONFDIR=$WORKSPACE/deploy/conf/vm_environment
./deploy.sh --dha $CONFDIR/os-nosdn-nofeature-ha.yml \
--network $CONFDIR/$NODE_NAME/network.yml

# Non-HA deployment in OpenStack-only environment
# Non-HA deployment is currently not supported by Compass installer

Fuel Installer:

# HA deployment in OpenStack-only environment
# Scenario Name: os-nosdn-nofeature-ha
# Scenario Configuration File: ha_heat_ceilometer_scenario.yaml
# You can use either Scenario Name or Scenario Configuration File Name in "-s" parameter
sudo ./deploy.sh -b <stack-config-uri> -l <lab-name> -p <pod-name> \
-s os-nosdn-nofeature-ha -i <iso-uri>

# Non-HA deployment in OpenStack-only environment
# Scenario Name: os-nosdn-nofeature-noha
# Scenario Configuration File: no-ha_heat_ceilometer_scenario.yaml
# You can use either Scenario Name or Scenario Configuration File Name in "-s" parameter
sudo ./deploy.sh -b <stack-config-uri> -l <lab-name> -p <pod-name> \
-s os-nosdn-nofeature-noha -i <iso-uri>

# Note:
#
# 1. Refer to http://git.opnfv.org/cgit/fuel/tree/deploy/scenario/scenario.yaml for scenarios
# 2. Refer to http://git.opnfv.org/cgit/fuel/tree/ci/README for description of
#    stack configuration directory structure
# 3. <stack-config-uri> is the base URI of stack configuration directory structure
# 3.1 Example: http://git.opnfv.org/cgit/fuel/tree/deploy/config
# 4. <lab-name> and <pod-name> must match the directory structure in stack configuration
# 4.1 Example of <lab-name>: -l devel-pipeline
# 4.2 Example of <pod-name>: -p elx
# 5. <iso-uri> could be local or remote ISO image of Fuel Installer
# 5.1 Example: http://artifacts.opnfv.org/fuel/euphrates/opnfv-euphrates.1.0.iso
#
# Please refer to Fuel Installer's documentation for further information and any update

Joid Installer:

# HA deployment in OpenStack-only environment
./deploy.sh -o mitaka -s nosdn -t ha -l default -f ipv6

# Non-HA deployment in OpenStack-only environment
./deploy.sh -o mitaka -s nosdn -t nonha -l default -f ipv6

Please NOTE that:

  • Refer to your installer’s documentation for other necessary parameters applicable to your deployment.
  • Refer to the Release Notes and your installer’s documentation if you encounter any issue during installation.

OPNFV-NATIVE-INSTALL-2: Clone the following GitHub repository to get the configuration and metadata files:

git clone https://github.com/sridhargaddam/opnfv_os_ipv6_poc.git \
/opt/stack/opnfv_os_ipv6_poc
1.2.2. Disable Security Groups in OpenStack ML2 Setup

Please NOTE that although some installers, such as devstack, disable the Security Groups feature automatically through the local.conf configuration file, it is very likely that other installers, such as Apex, Compass, Fuel or Joid, will enable the Security Groups feature after installation.

Please make sure that Security Groups are disabled in the setup.

In order to disable Security Groups globally, please make sure that the settings in OPNFV-NATIVE-SEC-1 and OPNFV-NATIVE-SEC-2 are applied, if they are not there by default.

OPNFV-NATIVE-SEC-1: Change the settings in /etc/neutron/plugins/ml2/ml2_conf.ini as follows, if they are not there by default:

# /etc/neutron/plugins/ml2/ml2_conf.ini
[securitygroup]
enable_security_group = True
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
[ml2]
extension_drivers = port_security
[agent]
prevent_arp_spoofing = False

OPNFV-NATIVE-SEC-2: Change the settings in /etc/nova/nova.conf as follows, if they are not there by default:

# /etc/nova/nova.conf
[DEFAULT]
security_group_api = neutron
firewall_driver = nova.virt.firewall.NoopFirewallDriver
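A quick scripted sanity check of the two files above can save a debugging round-trip before restarting services. The sketch below is illustrative only: `has_noop_driver` is a hypothetical helper, and it greps a temporary sample fragment standing in for the real files (on a real node, point it at /etc/neutron/plugins/ml2/ml2_conf.ini and /etc/nova/nova.conf instead):

```shell
# Return success if the file routes security groups to the Noop firewall driver.
has_noop_driver() {
    grep -q 'firewall_driver *=.*NoopFirewallDriver' "$1"
}

# Inline sample standing in for the real config file:
sample=$(mktemp)
cat > "$sample" <<'EOF'
[securitygroup]
enable_security_group = True
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
EOF

has_noop_driver "$sample" && echo "Noop firewall driver configured"
rm -f "$sample"
```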

OPNFV-NATIVE-SEC-3: After updating the settings, restart the Neutron and Nova services.

Please note that the commands for restarting Neutron and Nova vary depending on the installer. Please refer to the relevant documentation of the specific installer.

1.2.3. Set Up Service VM as IPv6 vRouter

OPNFV-NATIVE-SETUP-1: We now assume that the OpenStack multi-node setup is up and running. In this step, we source the tenant credentials on the OpenStack controller node. Please NOTE that the method of sourcing tenant credentials may vary depending on the installer. For example:

Apex installer:

# On jump host, source the tenant credentials using /bin/opnfv-util provided by Apex installer
opnfv-util undercloud "source overcloudrc; keystone service-list"

# Alternatively, you can copy the file /home/stack/overcloudrc from the installer VM called "undercloud"
# to a location in controller node, for example, in the directory /opt, and do:
# source /opt/overcloudrc

Compass installer:

# source the tenant credentials using Compass installer of OPNFV
source /opt/admin-openrc.sh

Fuel installer:

# source the tenant credentials using Fuel installer of OPNFV
source /root/openrc

Joid installer:

# source the tenant credentials using Joid installer of OPNFV
source $HOME/joid_config/admin-openrc

devstack:

# source the tenant credentials in devstack
source openrc admin demo

Please refer to relevant documentation of installers if you encounter any issue.

OPNFV-NATIVE-SETUP-2: Download the Fedora 22 image, which will be used for the vRouter:

wget https://download.fedoraproject.org/pub/fedora/linux/releases/22/Cloud/x86_64/\
Images/Fedora-Cloud-Base-22-20150521.x86_64.qcow2

OPNFV-NATIVE-SETUP-3: Import the Fedora 22 image into Glance:

glance image-create --name 'Fedora22' --disk-format qcow2 --container-format bare \
--file ./Fedora-Cloud-Base-22-20150521.x86_64.qcow2

OPNFV-NATIVE-SETUP-4: This step is informational. The OPNFV installer has taken care of it during deployment. Refer to this step only if there is an issue, or if you are using another installer.

We have to move the physical interface (i.e. the public network interface) to br-ex, including moving the public IP address and setting up a default route. Please refer to OS-NATIVE-SETUP-4 and OS-NATIVE-SETUP-5 in our more complete instructions.

OPNFV-NATIVE-SETUP-5: Create the Neutron routers ipv4-router and ipv6-router, which are needed to provide external connectivity:

neutron router-create ipv4-router
neutron router-create ipv6-router

OPNFV-NATIVE-SETUP-6: Create an external network/subnet ext-net using the appropriate values based on the data-center physical network setup.

Please NOTE that you may only need to create the subnet of ext-net, because OPNFV installers should have created an external network during installation. When creating the subnet, you must use the same external network name that the installer created. For example:

  • Apex installer: external
  • Compass installer: ext-net
  • Fuel installer: admin_floating_net
  • Joid installer: ext-net

Please refer to the documentation of the installers if there is any issue.

# This is needed only if the installer does not create an external network
# Otherwise, skip this "net-create" command
neutron net-create --router:external ext-net

# Note that the name "ext-net" may work for some installers such as Compass and Joid
# Change the name "ext-net" to match the name of external network that an installer creates
neutron subnet-create --disable-dhcp --allocation-pool start=198.59.156.251,\
end=198.59.156.254 --gateway 198.59.156.1 ext-net 198.59.156.0/24

OPNFV-NATIVE-SETUP-7: Create the Neutron networks ipv4-int-network1 and ipv6-int-network2 (port security on their ports is disabled in a later step):

neutron net-create ipv4-int-network1
neutron net-create ipv6-int-network2

OPNFV-NATIVE-SETUP-8: Create IPv4 subnet ipv4-int-subnet1 in the internal network ipv4-int-network1, and associate it to ipv4-router.

neutron subnet-create --name ipv4-int-subnet1 --dns-nameserver 8.8.8.8 \
ipv4-int-network1 20.0.0.0/24

neutron router-interface-add ipv4-router ipv4-int-subnet1

OPNFV-NATIVE-SETUP-9: Associate the ext-net to the Neutron routers ipv4-router and ipv6-router.

# Note that the name "ext-net" may work for some installers such as Compass and Joid
# Change the name "ext-net" to match the name of external network that an installer creates
neutron router-gateway-set ipv4-router ext-net
neutron router-gateway-set ipv6-router ext-net

OPNFV-NATIVE-SETUP-10: Create two subnets, one IPv4 subnet ipv4-int-subnet2 and one IPv6 subnet ipv6-int-subnet2 in ipv6-int-network2, and associate both subnets to ipv6-router

neutron subnet-create --name ipv4-int-subnet2 --dns-nameserver 8.8.8.8 \
ipv6-int-network2 10.0.0.0/24

neutron subnet-create --name ipv6-int-subnet2 --ip-version 6 --ipv6-ra-mode slaac \
--ipv6-address-mode slaac ipv6-int-network2 2001:db8:0:1::/64

neutron router-interface-add ipv6-router ipv4-int-subnet2
neutron router-interface-add ipv6-router ipv6-int-subnet2

OPNFV-NATIVE-SETUP-11: Create a keypair

nova keypair-add vRouterKey > ~/vRouterKey

OPNFV-NATIVE-SETUP-12: Create ports for the vRouter, with specific MAC addresses so that automation can know in advance which IPv6 addresses will be assigned to the ports:

neutron port-create --name eth0-vRouter --mac-address fa:16:3e:11:11:11 ipv6-int-network2
neutron port-create --name eth1-vRouter --mac-address fa:16:3e:22:22:22 ipv4-int-network1

OPNFV-NATIVE-SETUP-13: Create ports for VM1 and VM2.

neutron port-create --name eth0-VM1 --mac-address fa:16:3e:33:33:33 ipv4-int-network1
neutron port-create --name eth0-VM2 --mac-address fa:16:3e:44:44:44 ipv4-int-network1

OPNFV-NATIVE-SETUP-14: Update ipv6-router with routing information to subnet 2001:db8:0:2::/64

neutron router-update ipv6-router --routes type=dict list=true \
destination=2001:db8:0:2::/64,nexthop=2001:db8:0:1:f816:3eff:fe11:1111

OPNFV-NATIVE-SETUP-15: Boot Service VM (vRouter), VM1 and VM2

nova boot --image Fedora22 --flavor m1.small \
--user-data /opt/stack/opnfv_os_ipv6_poc/metadata.txt \
--availability-zone nova:opnfv-os-compute \
--nic port-id=$(neutron port-list | grep -w eth0-vRouter | awk '{print $2}') \
--nic port-id=$(neutron port-list | grep -w eth1-vRouter | awk '{print $2}') \
--key-name vRouterKey vRouter

nova list

# Please wait for some 10 to 15 minutes so that necessary packages (like radvd)
# are installed and vRouter is up.
nova console-log vRouter

nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny \
--user-data /opt/stack/opnfv_os_ipv6_poc/set_mtu.sh \
--availability-zone nova:opnfv-os-controller \
--nic port-id=$(neutron port-list | grep -w eth0-VM1 | awk '{print $2}') \
--key-name vRouterKey VM1

nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny \
--user-data /opt/stack/opnfv_os_ipv6_poc/set_mtu.sh \
--availability-zone nova:opnfv-os-compute \
--nic port-id=$(neutron port-list | grep -w eth0-VM2 | awk '{print $2}') \
--key-name vRouterKey VM2

nova list # Verify that all the VMs are in ACTIVE state.

OPNFV-NATIVE-SETUP-16: If all goes well, the IPv6 addresses assigned to the VMs will be as follows:

# vRouter eth0 interface would have the following IPv6 address:
#     2001:db8:0:1:f816:3eff:fe11:1111/64
# vRouter eth1 interface would have the following IPv6 address:
#     2001:db8:0:2::1/64
# VM1 would have the following IPv6 address:
#     2001:db8:0:2:f816:3eff:fe33:3333/64
# VM2 would have the following IPv6 address:
#     2001:db8:0:2:f816:3eff:fe44:4444/64
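These addresses are not arbitrary: under SLAAC, each one is derived from the port's MAC address via EUI-64 (flip the universal/local bit of the first octet and insert ff:fe in the middle). A minimal bash sketch of the derivation, using the MACs assigned in the earlier steps (the `eui64_addr` helper is hypothetical and assumes the /64 prefix is written with a trailing colon):

```shell
#!/usr/bin/env bash
# Derive the EUI-64 SLAAC address for a given /64 prefix and MAC address.
eui64_addr() {
    local prefix=$1 mac=$2
    # Flip the universal/local bit (0x02) of the first MAC octet
    local b1
    b1=$(( 0x${mac:0:2} ^ 0x02 ))
    printf '%s%02x%s:%sff:fe%s:%s%s\n' \
        "$prefix" "$b1" "${mac:3:2}" "${mac:6:2}" "${mac:9:2}" "${mac:12:2}" "${mac:15:2}"
}

eui64_addr 2001:db8:0:1: fa:16:3e:11:11:11   # vRouter eth0
eui64_addr 2001:db8:0:2: fa:16:3e:33:33:33   # VM1
```

This also explains the nexthop used in OPNFV-NATIVE-SETUP-14: it is simply the EUI-64 address of the eth0-vRouter port.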

OPNFV-NATIVE-SETUP-17: Now we need to disable port security on the ports eth0-VM1, eth0-VM2, eth0-vRouter and eth1-vRouter:

for port in eth0-VM1 eth0-VM2 eth0-vRouter eth1-vRouter
do
    neutron port-update --no-security-groups $port
    neutron port-update $port --port-security-enabled=False
    neutron port-show $port | grep port_security_enabled
done

OPNFV-NATIVE-SETUP-18: Now we can SSH into the VMs. You can execute the following commands:

# 1. Create a floatingip and associate it with VM1, VM2 and vRouter (to the port id that is passed).
#    Note that the name "ext-net" may work for some installers such as Compass and Joid
#    Change the name "ext-net" to match the name of external network that an installer creates
neutron floatingip-create --port-id $(neutron port-list | grep -w eth0-VM1 | \
awk '{print $2}') ext-net
neutron floatingip-create --port-id $(neutron port-list | grep -w eth0-VM2 | \
awk '{print $2}') ext-net
neutron floatingip-create --port-id $(neutron port-list | grep -w eth1-vRouter | \
awk '{print $2}') ext-net

# 2. To know / display the floatingip associated with VM1, VM2 and vRouter.
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth0-VM1 | awk '{print $2}') | awk '{print $2}'
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth0-VM2 | awk '{print $2}') | awk '{print $2}'
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth1-vRouter | awk '{print $2}') | awk '{print $2}'

# 3. To ssh to the vRouter, VM1 and VM2, user can execute the following command.
ssh -i ~/vRouterKey fedora@<floating-ip-of-vRouter>
ssh -i ~/vRouterKey cirros@<floating-ip-of-VM1>
ssh -i ~/vRouterKey cirros@<floating-ip-of-VM2>
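The embedded `$(neutron port-list | grep -w … | awk '{print $2}')` lookups used throughout these commands work because neutron prints an ASCII table whose second whitespace-separated field is the port UUID. A self-contained sketch with canned output (the `sample_port_list` function and the UUIDs are made up for illustration) shows what gets extracted:

```shell
# Stand-in for real "neutron port-list" output (UUIDs are hypothetical).
sample_port_list() {
cat <<'EOF'
+--------------------------------------+--------------+-------------------+
| id                                   | name         | mac_address       |
+--------------------------------------+--------------+-------------------+
| 11111111-2222-3333-4444-555555555501 | eth0-VM1     | fa:16:3e:33:33:33 |
| 11111111-2222-3333-4444-555555555502 | eth1-vRouter | fa:16:3e:22:22:22 |
+--------------------------------------+--------------+-------------------+
EOF
}

# "grep -w" matches the whole port name; awk field $2 is the UUID
# (field $1 is the leading "|" table border).
sample_port_list | grep -w eth0-VM1 | awk '{print $2}'
```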

If everything goes well, ssh will succeed and you will be logged into those VMs. Run some commands to verify that IPv6 addresses are configured on the eth0 interfaces.

OPNFV-NATIVE-SETUP-19: Verify that the eth0 interface has an IPv6 address with the prefix 2001:db8:0:2::/64:

ip address show

OPNFV-NATIVE-SETUP-20: Ping an external IPv6 address, e.g. that of ipv6-router:

ping6 2001:db8:0:1::1

If the above ping6 command succeeds, it implies that the vRouter was able to successfully forward IPv6 traffic to reach the external ipv6-router.

2. IPv6 Post Installation Procedures

Congratulations! You have completed the setup of a service VM acting as an IPv6 vRouter, and you have validated the setup based on the instructions in the previous sections. If you want to test your setup further, you can ping6 among VM1, VM2, the vRouter and ipv6-router.

This setup allows further open innovation by any third party.

2.1. Automated post installation activities

Refer to the relevant testing guides, results, and release notes of Yardstick Project.

Using IPv6 Feature of Gambia Release
Abstract:

This section provides the users with:

  • Gap Analysis regarding IPv6 feature requirements with the OpenStack Queens Official Release
  • Gap Analysis regarding IPv6 feature requirements with the OpenDaylight Oxygen Official Release
  • IPv6 Setup in Container Networking
  • Use of Neighbor Discovery (ND) Proxy to connect IPv6-only container to external network
  • Docker IPv6 Simple Cluster Topology
  • Study and recommendation regarding Docker IPv6 NAT

The gap analyses serve as feature-specific user guides and references when, as a user, you leverage the IPv6 feature in the platform and need to perform IPv6-related operations.

The IPv6 Setup in Container Networking section serves as a feature-specific user guide and reference when, as a user, you want to explore IPv6 in a Docker container environment. The use of NDP proxying is explored to connect IPv6-only containers to an external network. The Docker IPv6 simple cluster topology is studied with two hosts, each running two Docker containers. The Docker IPv6 NAT topic is also explored.

For more information, please refer to Neutron’s IPv6 document for the Queens Release.

1. IPv6 Gap Analysis with OpenStack Queens

This section provides users with an IPv6 gap analysis regarding feature requirements with OpenStack Neutron in the Queens Official Release. The following table lists the use cases / feature requirements of VIM-agnostic IPv6 functionality, covering both the infrastructure layer and the VNF (VM) layer, and their gap analysis with OpenStack Neutron in the Queens Official Release.

Please NOTE that, in terms of IPv6 support in OpenStack Neutron, there is no difference between the Queens release and prior releases, e.g. Pike and Ocata.

Use Case / Requirement Supported in Queens Notes
All topologies work in a multi-tenant environment Yes The IPv6 design is following the Neutron tenant networks model; dnsmasq is being used inside DHCP network namespaces, while radvd is being used inside Neutron routers namespaces to provide full isolation between tenants. Tenant isolation can be based on VLANs, GRE, or VXLAN encapsulation. In case of overlays, the transport network (and VTEPs) must be IPv4 based as of today.
IPv6 VM to VM only Yes It is possible to assign IPv6-only addresses to VMs. Both switching (within VMs on the same tenant network) as well as east/west routing (between different networks of the same tenant) are supported.
IPv6 external L2 VLAN directly attached to a VM Yes IPv6 provider network model; RA messages from upstream (external) router are forwarded into the VMs

IPv6 subnet routed via L3 agent to an external IPv6 network

  1. Both VLAN and overlay (e.g. GRE, VXLAN) subnet attached to VMs;
  2. Must be able to support multiple L3 agents for a given external network to support scaling (neutron scheduler to assign vRouters to the L3 agents)
  1. Yes
  2. Yes
Configuration is enhanced since Kilo to allow easier setup of the upstream gateway, without the user being forced to create an IPv6 subnet for the external network.

Ability for a NIC to support both IPv4 and IPv6 (dual stack) address.

  1. VM with a single interface associated with a network, which is then associated with two subnets.
  2. VM with two different interfaces associated with two different networks and two different subnets.
  1. Yes
  2. Yes
Dual-stack is supported in Neutron with the addition of Multiple IPv6 Prefixes Blueprint

Support IPv6 Address assignment modes.

  1. SLAAC
  2. DHCPv6 Stateless
  3. DHCPv6 Stateful
  1. Yes
  2. Yes
  3. Yes
 
Ability to create a port on an IPv6 DHCPv6 Stateful subnet and assign a specific IPv6 address to the port and have it taken out of the DHCP address pool. Yes  
Ability to create a port with fixed_ip for a SLAAC/DHCPv6-Stateless Subnet. No The following patch disables this operation: https://review.openstack.org/#/c/129144/
Support for private IPv6 to external IPv6 floating IP; Ability to specify floating IPs via Neutron API (REST and CLI) as well as via Horizon, including combination of IPv6/IPv4 and IPv4/IPv6 floating IPs if implemented. Rejected Blueprint proposed in upstream and got rejected. General expectation is to avoid NAT with IPv6 by assigning GUA to tenant VMs. See https://review.openstack.org/#/c/139731/ for discussion.
Provide IPv6/IPv4 feature parity in support for pass-through capabilities (e.g., SR-IOV). To-Do The L3 configuration should be transparent for the SR-IOV implementation. SR-IOV networking support introduced in Juno based on the sriovnicswitch ML2 driver is expected to work with IPv4 and IPv6 enabled VMs. We need to verify if it works or not.
Additional IPv6 extensions, for example: IPSEC, IPv6 Anycast, Multicast No It does not appear to be considered yet (lack of clear requirements)
VM access to the meta-data server to obtain user data, SSH keys, etc. using cloud-init with IPv6 only interfaces. No This is currently not supported. Config-drive or dual-stack IPv4 / IPv6 can be used as a workaround (so that the IPv4 network is used to obtain connectivity with the metadata service). The following blog How to Use Config-Drive for Metadata with IPv6 Network provides a neat summary on how to use config-drive for metadata with IPv6 network.
Full support for IPv6 matching (i.e., IPv6, ICMPv6, TCP, UDP) in security groups. Ability to control and manage all IPv6 security group capabilities via Neutron/Nova API (REST and CLI) as well as via Horizon. Yes Both IPTables firewall driver and OVS firewall driver support IPv6 Security Group API.
During network/subnet/router create, there should be an option to allow user to specify the type of address management they would like. This includes all options including those low priority if implemented (e.g., toggle on/off router and address prefix advertisements); It must be supported via Neutron API (REST and CLI) as well as via Horizon Yes

Two new Subnet attributes were introduced to control IPv6 address assignment options:

  • ipv6-ra-mode: to determine who sends Router Advertisements;
  • ipv6-address-mode: to determine how VM obtains IPv6 address, default gateway, and/or optional information.
Security groups anti-spoofing: Prevent VM from using a source IPv6/MAC address which is not assigned to the VM Yes  
Protect tenant and provider network from rogue RAs Yes When using a tenant network, Neutron is going to automatically handle the filter rules to allow connectivity of RAs to the VMs only from the Neutron router port; with provider networks, users are required to specify the LLA of the upstream router during the subnet creation, or otherwise manually edit the security-groups rules to allow incoming traffic from this specific address.
Support the ability to assign multiple IPv6 addresses to an interface; both for Neutron router interfaces and VM interfaces. Yes  
Ability for a VM to support a mix of multiple IPv4 and IPv6 networks, including multiples of the same type. Yes  
IPv6 Support in “Allowed Address Pairs” Extension Yes  
Support for IPv6 Prefix Delegation. Yes Partial support in Queens
Distributed Virtual Routing (DVR) support for IPv6 No In the Queens DVR implementation, IPv6 works, but all IPv6 ingress/egress traffic is routed via the centralized controller node, similar to SNAT traffic. A fully distributed IPv6 router is not yet supported in Neutron.
VPNaaS Yes VPNaaS supports IPv6. But this feature is not extensively tested.
FWaaS Yes  
BGP Dynamic Routing Support for IPv6 Prefixes Yes BGP Dynamic Routing supports peering via IPv6 and advertising IPv6 prefixes.
VxLAN Tunnels with IPv6 endpoints. Yes Neutron ML2/OVS supports configuring local_ip with IPv6 address so that VxLAN tunnels are established with IPv6 addresses. This feature requires OVS 2.6 or higher version.
IPv6 First-Hop Security, IPv6 ND spoofing Yes  
IPv6 support in Neutron Layer3 High Availability (keepalived+VRRP). Yes  
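For the "VxLAN Tunnels with IPv6 endpoints" row above, the relevant knob is local_ip in the OVS agent configuration. A fragment sketching this, assuming the ML2/OVS driver and OVS 2.6+ (the path and the address are illustrative):

```ini
# /etc/neutron/plugins/ml2/openvswitch_agent.ini (path varies by distro)
[ovs]
# IPv6 tunnel endpoint address; VXLAN tunnels are then established over IPv6
local_ip = 2001:db8:100::11
```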
2. IPv6 Gap Analysis with OpenDaylight Oxygen

This section provides users with an IPv6 gap analysis regarding feature requirements with the OpenDaylight Oxygen Official Release. The following table lists the use cases / feature requirements of VIM-agnostic IPv6 functionality, covering both the infrastructure layer and the VNF (VM) layer, and their gap analysis with the OpenDaylight Oxygen Official Release.

OpenDaylight Oxygen Status

In the OpenDaylight Oxygen official release, the legacy old Netvirt, identified by the feature odl-ovsdb-openstack, is deprecated and no longer supported. The new Netvirt, identified by the feature odl-netvirt-openstack, is used instead.

Two new features are supported in the OpenDaylight Oxygen official release:

  • “IPv6 L3VPN Dual Stack with Single router” [3]
  • “IPv6 Inter Data Center using L3VPNs” [4]
Use Case / Requirement Supported in ODL Oxygen Notes
REST API support for IPv6 subnet creation in ODL Yes

Yes, it is possible to create IPv6 subnets in ODL using Neutron REST API.

For a network which has both IPv4 and IPv6 subnets, ODL mechanism driver will send the port information which includes IPv4/v6 addresses to ODL Neutron northbound API. When port information is queried, it displays IPv4 and IPv6 addresses.

IPv6 Router support in ODL:

  1. Communication between VMs on same network
Yes  

IPv6 Router support in ODL:

  1. Communication between VMs on different networks connected to the same router (east-west)
Yes  

IPv6 Router support in ODL:

  1. External routing (north-south)
No This feature is targeted for the Fluorine Release. In the ODL Oxygen Release, the RFE “IPv6 Inter-DC L3 North-South Connectivity Using L3VPN Provider Network Types” Spec [1] is merged, but the code patch has not been merged yet. On the other hand, “IPv6 Cluster Support” is available in the Oxygen Release [2]: existing IPv6 features were enhanced to work in a three-node ODL clustered setup.

IPAM: Support for IPv6 Address assignment modes.

  1. SLAAC
  2. DHCPv6 Stateless
  3. DHCPv6 Stateful
Yes ODL IPv6 Router supports all the IPv6 Address assignment modes along with Neutron DHCP Agent.
When using ODL for L2 forwarding/tunneling, it is compatible with IPv6. Yes  
Full support for IPv6 matching (i.e. IPv6, ICMPv6, TCP, UDP) in security groups. Ability to control and manage all IPv6 security group capabilities via Neutron/Nova API (REST and CLI) as well as via Horizon Yes  
Shared Networks support Yes  
IPv6 external L2 VLAN directly attached to a VM. Yes Targeted for Fluorine Release
ODL on an IPv6 only Infrastructure. Yes Deploying OpenStack with ODL on an IPv6 only infrastructure where the API endpoints are all IPv6 addresses.
VxLAN Tunnels with IPv6 Endpoints Yes  
IPv6 L3VPN Dual Stack with Single router Yes Refer to “Dual Stack VM support in OpenDaylight” Spec [3].
IPv6 Inter Data Center using L3VPNs Yes Refer to “IPv6 Inter-DC L3 North-South connectivity using L3VPN provider network types” Spec [4].
[1]https://docs.opendaylight.org/projects/netvirt/en/stable-fluorine/specs/oxygen/ipv6-interdc-l3vpn.html
[2]http://git.opendaylight.org/gerrit/#/c/66707/
[3](1, 2) https://docs.opendaylight.org/projects/netvirt/en/stable-oxygen/specs/l3vpn-dual-stack-vms.html
[4](1, 2) https://docs.opendaylight.org/projects/netvirt/en/stable-oxygen/specs/ipv6-interdc-l3vpn.html
3. Exploring IPv6 in Container Networking

This document is a summary of how to use IPv6 with Docker.

By default, a Docker container uses the 172.17.0.0/16 subnet with 172.17.0.1 as the gateway. IPv6 networking therefore needs to be enabled and configured before we can use Docker with IPv6 traffic.

We will describe how to use IPv6 in Docker in the following five sections:

  1. Install Docker Community Edition (CE)
  2. IPv6 with Docker
  3. Design Simple IPv6 Topologies
  4. Design Solutions
  5. Challenges in Production Use
3.1. Install Docker Community Edition (CE)

Step 3.1.1: Download Docker (CE) on your system from “this link” [1].

For Ubuntu 16.04 Xenial x86_64, please refer to “Docker CE for Ubuntu” [2].

Step 3.1.2: Refer to “this link” [3] to install Docker CE on Xenial.

Step 3.1.3: Once you have installed Docker, you can verify the standalone default bridge network as follows:

$ docker network ls
NETWORK ID     NAME        DRIVER    SCOPE
b9e92f9a8390   bridge      bridge    local
74160ae686b9   host        host      local
898fbb0a0c83   my_bridge   bridge    local
57ac095fdaab   none        null      local

Note that:

  • the details may be different with different network drivers.
  • User-defined bridge networks are the best when you need multiple containers to communicate on the same Docker host.
  • Host networks are the best when the network stack should not be isolated from the Docker host, but you want other aspects of the container to be isolated.
  • Overlay networks are the best when you need containers running on different Docker hosts to communicate, or when multiple applications work together using swarm services.
  • Macvlan networks are the best when you are migrating from a VM setup or need your containers to look like physical hosts on your network, each with a unique MAC address.
  • Third-party network plugins allow you to integrate Docker with specialized network stacks. Please refer to “Docker Networking Tutorials” [4].
# This will have docker0 default bridge details showing
# ipv4 172.17.0.1/16 and
# ipv6 fe80::42:4dff:fe2f:baa6/64 entries

$ ip addr show
11: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:4d:2f:ba:a6 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:4dff:fe2f:baa6/64 scope link
valid_lft forever preferred_lft forever

Thus we see here the simple default IPv4 networking for Docker. Inspecting the output, note that docker0 carries only a link-local IPv6 address: IPv6 is enabled on the host but not yet used by the default docker0 bridge.

You can create a user-defined bridge network, such as my_bridge below, with a subnet other than the default, e.g. 172.18.0.0/24 here. Note that --ipv6 is not specified yet:

$ sudo docker network create \
              --driver=bridge \
              --subnet=172.18.0.0/24 \
              --gateway=172.18.0.1 \
              my_bridge

$ docker network inspect bridge
[
  {
    "Name": "bridge",
    "Id": "b9e92f9a839048aab887081876fc214f78e8ce566ef5777303c3ef2cd63ba712",
    "Created": "2017-10-30T23:32:15.676301893-07:00",
    "Scope": "local",
    "Driver": "bridge",
    "EnableIPv6": false,
    "IPAM": {
        "Driver": "default",
        "Options": null,
        "Config": [
            {
                "Subnet": "172.17.0.0/16",
                "Gateway": "172.17.0.1"
            }
        ]
    },
    "Internal": false,
    "Attachable": false,
    "Ingress": false,
    "ConfigFrom": {
        "Network": ""
    },
    "ConfigOnly": false,
    "Containers": {
        "ea76bd4694a8073b195dd712dd0b070e80a90e97b6e2024b03b711839f4a3546": {
        "Name": "registry",
        "EndpointID": "b04dc6c5d18e3bf4e4201aa8ad2f6ad54a9e2ea48174604029576e136b99c49d",
        "MacAddress": "02:42:ac:11:00:02",
        "IPv4Address": "172.17.0.2/16",
        "IPv6Address": ""
        }
    },
    "Options": {
        "com.docker.network.bridge.default_bridge": "true",
        "com.docker.network.bridge.enable_icc": "true",
        "com.docker.network.bridge.enable_ip_masquerade": "true",
        "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
        "com.docker.network.bridge.name": "docker0",
        "com.docker.network.driver.mtu": "1500"
    },
    "Labels": {}
  }
]

$ sudo docker network inspect my_bridge
[
  {
    "Name": "my_bridge",
    "Id": "898fbb0a0c83acc0593897f5af23b1fe680d38b804b0d5a4818a4117ac36498a",
    "Created": "2017-07-16T17:59:55.388151772-07:00",
    "Scope": "local",
    "Driver": "bridge",
    "EnableIPv6": false,
    "IPAM": {
        "Driver": "default",
        "Options": {},
        "Config": [
            {
                "Subnet": "172.18.0.0/16",
                "Gateway": "172.18.0.1"
            }
        ]
    },
    "Internal": false,
    "Attachable": false,
    "Ingress": false,
    "ConfigFrom": {
        "Network": ""
    },
    "ConfigOnly": false,
    "Containers": {},
    "Options": {},
    "Labels": {}
  }
]

Note from the network inspect outputs that IPv6 is not enabled here yet. Since Docker is so far using only IPv4, we will enable IPv6 for Docker in the next step.

3.2. IPv6 with Docker

Verifying IPv6 with Docker involves the following steps:

Step 3.2.1: Enable ipv6 support for Docker

Simply put, the first step is to enable IPv6 for Docker on the Linux host. Please refer to “this link” [5]:

  • Edit /etc/docker/daemon.json
  • Set the ipv6 key to true.

{
  "ipv6": true
}

Save the file.

Step 3.2.1.1: Set up IPv6 addressing for Docker in daemon.json

If you need IPv6 support for Docker containers, you need to enable the option in the Docker daemon configuration file daemon.json and reload its configuration, before creating any IPv6 networks or assigning containers IPv6 addresses.

When you create your network, you can specify the --ipv6 flag to enable IPv6. You can’t selectively disable IPv6 support on the default bridge network.

Step 3.2.1.2: Enable forwarding from Docker containers to the outside world

By default, traffic from containers connected to the default bridge network is not forwarded to the outside world. To enable forwarding, you need to change two settings. These are not Docker commands and they affect the Docker host’s kernel.

  • Setting 1: Configure the Linux kernel to allow IP forwarding:
$ sysctl net.ipv4.conf.all.forwarding=1
  • Setting 2: Change the policy for the iptables FORWARD policy from DROP to ACCEPT.
$ sudo iptables -P FORWARD ACCEPT

These settings do not persist across a reboot, so you may need to add them to a start-up script.
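One way to make Setting 1 persistent on distros that read /etc/sysctl.d/ at boot is a drop-in file; the filename below is illustrative:

```
# /etc/sysctl.d/99-docker-forward.conf (hypothetical filename)
net.ipv4.conf.all.forwarding = 1
```

The iptables FORWARD policy, in contrast, has to be re-applied from a start-up script or a systemd unit, since iptables rules themselves do not survive a reboot.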

Step 3.2.1.3: Use the default bridge network

The default bridge network is considered a legacy detail of Docker and is not recommended for production use. Configuring it is a manual operation, and it has technical shortcomings.

Step 3.2.1.4: Connect a container to the default bridge network

If you do not specify a network using the --network flag, and you do specify a network driver, your container is connected to the default bridge network by default. Containers connected to the default bridge network can communicate, but only by IP address, unless they are linked using the legacy --link flag.

Step 3.2.1.5: Configure the default bridge network

To configure the default bridge network, you specify options in daemon.json. Here is an example of daemon.json with several options specified. Only specify the settings you need to customize.

{
  "bip": "192.168.1.5/24",
  "fixed-cidr": "192.168.1.5/25",
  "fixed-cidr-v6": "2001:db8::/64",
  "mtu": 1500,
  "default-gateway": "10.20.1.1",
  "default-gateway-v6": "2001:db8:abcd::89",
  "dns": ["10.20.1.2","10.20.1.3"]
}

Restart Docker for the changes to take effect.
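Since a malformed daemon.json will prevent dockerd from starting, it can be worth validating the file's syntax before restarting. A minimal sketch, assuming python3 is available on the host; it is shown here against an inline sample, but you would point it at /etc/docker/daemon.json:

```shell
# Validate daemon.json syntax; python3 -m json.tool exits non-zero
# and reports the position of any JSON error.
printf '%s' '{ "ipv6": true, "fixed-cidr-v6": "2001:db8::/64" }' | python3 -m json.tool
```

On a real host: `python3 -m json.tool /etc/docker/daemon.json`.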

Step 3.2.1.6: Use IPv6 with the default bridge network

If you configure Docker for IPv6 support (see Step 3.2.1.1), the default bridge network is also configured for IPv6 automatically. Unlike user-defined bridges, you cannot selectively disable IPv6 on the default bridge.

Step 3.2.1.7: Reload the Docker configuration file

$ systemctl reload docker

Step 3.2.1.8: You can now create networks with the --ipv6 flag and assign containers IPv6 addresses.

Step 3.2.1.9: Verify your host and docker networks

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
ea76bd4694a8        registry:2          "/entrypoint.sh /e..."   x months ago        Up y months         0.0.0.0:4000->5000/tcp   registry

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
b9e92f9a8390        bridge              bridge              local
74160ae686b9        host                host                local
898fbb0a0c83        my_bridge           bridge              local
57ac095fdaab        none                null                local

Step 3.2.1.10: With the ipv6 key enabled in /etc/docker/daemon.json and the Docker configuration reloaded (Steps 3.2.1 and 3.2.1.7), you can now create networks with the --ipv6 flag and assign containers IPv6 addresses using the --ip6 flag.

$ sudo docker network create --ipv6 --driver bridge alpine-net--fixed-cidr-v6 2001:db8:1/64

# "docker network create" requires exactly 1 argument(s).
# See "docker network create --help"

Earlier, a user was allowed to create a network (or start the daemon) without specifying an IPv6 --subnet (or --fixed-cidr-v6, respectively), even when using the default built-in IPAM driver, which does not support auto-allocation of IPv6 pools. In other words, it was an incorrect configuration that had no effect on IPv6; it was a no-op.

A fix cleared that up, so that Docker now correctly consults the IPAM driver to acquire an IPv6 subnet for the bridge network when the user does not supply one.

If the IPAM driver in use is not able to provide one, network creation fails (in this case, for the default bridge network).

So what you see above is the expected behavior: remove the --ipv6 flag when you start the daemon, unless you also pass a --fixed-cidr-v6 pool.

The above behavior was observed with the following Docker installation:

$ docker info
Containers: 27
Running: 1
Paused: 0
Stopped: 26
Images: 852
Server Version: 17.06.1-ce-rc1
Storage Driver: aufs
  Root Dir: /var/lib/docker/aufs
  Backing Filesystem: extfs
  Dirs: 637
  Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
  Volume: local
  Network: bridge host macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
  apparmor
  seccomp
  Profile: default
Kernel Version: 3.13.0-88-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 11.67GiB
Name: aatiksh
ID: HS5N:T7SK:73MD:NZGR:RJ2G:R76T:NJBR:U5EJ:KP5N:Q3VO:6M2O:62CJ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
  127.0.0.0/8
Live Restore Enabled: false

Step 3.2.2: Check the network drivers

Among the 4 supported drivers, we will be using the “User-Defined Bridge Network” [6].

3.3. Design Simple IPv6 Topologies

Step 3.3.1: Creating an IPv6 user-defined subnet

Let’s create a Docker network with an IPv6 subnet:

$ sudo docker network create \
              --ipv6 \
              --driver=bridge \
              --subnet=172.18.0.0/16 \
              --subnet=fcdd:1::/48 \
              --gateway=172.20.0.1 \
              my_ipv6_bridge

# Error response from daemon:

cannot create network 8957e7881762bbb4b66c3e2102d72b1dc791de37f2cafbaff42bdbf891b54cc3 (br-8957e7881762): conflicts with network
no matching subnet for range 2002:ac14:0000::/48

# try changing to ip-address-range instead of subnet for ipv6.
# networks have overlapping IPv4

NETWORK ID          NAME                DRIVER              SCOPE
b9e92f9a8390        bridge              bridge              local
74160ae686b9        host                host                local
898fbb0a0c83        my_bridge           bridge              local
57ac095fdaab        none                null                local
no matching subnet for gateway 172.20.01

# So finally, using subnet 172.20.0.0/16 with a matching gateway 172.20.0.1 works

$ sudo docker network create \
              --ipv6 \
              --driver=bridge \
              --subnet=172.20.0.0/16 \
              --subnet=2002:ac14:0000::/48 \
              --gateway=172.20.0.1 \
              my_ipv6_bridge
898fbb0a0c83acc0593897f5af23b1fe680d38b804b0d5a4818a4117ac36498a (br-898fbb0a0c83):

Since the LXD bridge (lxdbr0) was already using an overlapping IP range on the system, there was a conflict. This brings us to the question of how we assign IPv4 and IPv6 addresses for our solutions.

3.4. Design Solutions

For best practices, please refer to “Best Practice Document” [7].

Use the IPv6 calculator at “this link” [8].

  • For IPv4 172.16.0.1 = 6to4 prefix 2002:ac10:0001::/48
  • For IPv4 172.17.0.1 = 6to4 prefix 2002:ac11:0001::/48
  • For IPv4 172.18.0.1 = 6to4 prefix 2002:ac12:0001::/48
  • For IPv4 172.19.0.1 = 6to4 prefix 2002:ac13:0001::/48
  • For IPv4 172.20.0.0 = 6to4 prefix 2002:ac14:0000::/48
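The 6to4 prefixes above are simply the four IPv4 octets written in hexadecimal after the 2002: prefix. The conversion can be sketched as a small shell function (the function name is ours):

```shell
# Build the 6to4 /48 prefix for a dotted-quad IPv4 address:
# "2002:" followed by the four octets rendered as two hex groups.
ipv4_to_6to4() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    printf '2002:%02x%02x:%02x%02x::/48\n' "$a" "$b" "$c" "$d"
}

ipv4_to_6to4 172.20.0.0   # -> 2002:ac14:0000::/48
```

For example, 172 is 0xac and 20 is 0x14, which reproduces the 2002:ac14:0000::/48 entry above.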

To avoid overlapping IPs, let’s use the .20 network in our design:

$ sudo docker network create \
              --ipv6 \
              --driver=bridge \
              --subnet=172.20.0.0/24 \
              --subnet=2002:ac14:0000::/48 \
              --gateway=172.20.0.1 \
              my_ipv6_bridge

# created ...

052da268171ce47685fcdb68951d6d14e70b9099012bac410c663eb2532a0c87

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
b9e92f9a8390        bridge              bridge              local
74160ae686b9        host                host                local
898fbb0a0c83        my_bridge           bridge              local
052da268171c        my_ipv6_bridge      bridge              local
57ac095fdaab        none                null                local

# Note that the first 12 digits of the ID we got when we created the
# network are used here as the network ID.

$ docker network  inspect my_ipv6_bridge
[
  {
    "Name": "my_ipv6_bridge",
    "Id": "052da268171ce47685fcdb68951d6d14e70b9099012bac410c663eb2532a0c87",
    "Created": "2018-03-16T07:20:17.714212288-07:00",
    "Scope": "local",
    "Driver": "bridge",
    "EnableIPv6": true,
    "IPAM": {
        "Driver": "default",
        "Options": {},
        "Config": [
            {
                "Subnet": "172.20.0.0/16",
                "Gateway": "172.20.0.1"
            },
            {
                "Subnet": "2002:ac14:0000::/48"
            }
        ]
    },
    "Internal": false,
    "Attachable": false,
    "Ingress": false,
    "ConfigFrom": {
        "Network": ""
    },
    "ConfigOnly": false,
    "Containers": {},
    "Options": {},
    "Labels": {}
  }
]

Note that:

  • The IPv6 flag is enabled, and the IPv6 range is listed beside the IPv4 gateway.
  • We are mapping IPv4 and IPv6 addresses to simplify assignments, as per the “Best Practice Document” [7].

Testing the solution and topology:

$ sudo docker run hello-world
Hello from Docker!

This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:

  1. The Docker client contacted the Docker daemon.
  2. The Docker daemon pulled the “hello-world” image from the Docker Hub.
  3. The Docker daemon created a new container from that image which runs the executable that produces the output you are currently reading.
  4. The Docker daemon streamed that output to the Docker client, which sent it to your terminal.

To try something more ambitious, you can run an Ubuntu container with:

$ docker run -it ubuntu bash

root@62b88b030f5a:/# ls
bin   dev  home  lib64  mnt  proc  run   srv  tmp  var
boot  etc  lib   media  opt  root  sbin  sys  usr

From the terminal, it appears that Docker is functioning normally.

Let’s now see whether we can use the my_ipv6_bridge network. Please refer to “User-Defined Bridge Network” [9].

3.4.1. Connect a container to a user-defined bridge

When you create a new container, you can specify one or more --network flags. This example connects a Nginx container to the my-net network. It also publishes port 80 in the container to port 8080 on the Docker host, so external clients can access that port. Any other container connected to the my-net network has access to all ports on the my-nginx container, and vice versa.

$ docker create --name my-nginx \
                --network my-net \
                --publish 8080:80 \
                nginx:latest

To connect a running container to an existing user-defined bridge, use the docker network connect command. The following command connects an already-running my-nginx container to an already-existing my_ipv6_bridge network:

$ docker network connect my_ipv6_bridge my-nginx

Now we have connected the IPv6-enabled network to the my-nginx container. Let’s start it and verify its IP address:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
df1df6ed3efb        alpine              "ash"                    4 hours ago         Up 4 hours                                   alpine1
ea76bd4694a8        registry:2          "/entrypoint.sh /e..."   9 months ago        Up 4 months         0.0.0.0:4000->5000/tcp   registry

The nginx:latest image is not running, so let’s start it and log into it.

$ docker images | grep latest
REPOSITORY                                          TAG                 IMAGE ID            CREATED             SIZE
nginx                                               latest              73acd1f0cfad        2 days ago          109MB
alpine                                              latest              3fd9065eaf02        2 months ago        4.15MB
swaggerapi/swagger-ui                               latest              e0b4f5dd40f9        4 months ago        23.6MB
ubuntu                                              latest              d355ed3537e9        8 months ago        119MB
hello-world                                         latest              1815c82652c0        9 months ago        1.84kB

Now that we have found the nginx image, let’s run it:

$ docker run -i -t nginx:latest /bin/bash
root@bc13944d22e1:/# ls
bin   dev  home  lib64  mnt  proc  run   srv  tmp  var
boot  etc  lib   media  opt  root  sbin  sys  usr
root@bc13944d22e1:/#

Open another terminal and check the networks and verify that IPv6 address is listed on the container:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED              STATUS              PORTS                    NAMES
bc13944d22e1        nginx:latest        "/bin/bash"              About a minute ago   Up About a minute   80/tcp                   loving_hawking
df1df6ed3efb        alpine              "ash"                    4 hours ago          Up 4 hours                                   alpine1
ea76bd4694a8        registry:2          "/entrypoint.sh /e..."   9 months ago         Up 4 months         0.0.0.0:4000->5000/tcp   registry

$ ping6 bc13944d22e1

# On the 2nd terminal

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
b9e92f9a8390        bridge              bridge              local
74160ae686b9        host                host                local
898fbb0a0c83        my_bridge           bridge              local
052da268171c        my_ipv6_bridge      bridge              local
57ac095fdaab        none                null                local

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 8c:dc:d4:6e:d5:4b brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.80/24 brd 10.0.0.255 scope global dynamic eno1
       valid_lft 558367sec preferred_lft 558367sec
    inet6 2601:647:4001:739c:b80a:6292:1786:b26/128 scope global dynamic
       valid_lft 86398sec preferred_lft 86398sec
    inet6 fe80::8edc:d4ff:fe6e:d54b/64 scope link
       valid_lft forever preferred_lft forever
11: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:4d:2f:ba:a6 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:4dff:fe2f:baa6/64 scope link
       valid_lft forever preferred_lft forever
20: br-052da268171c: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:5e:19:55:0d brd ff:ff:ff:ff:ff:ff
    inet 172.20.0.1/16 scope global br-052da268171c
       valid_lft forever preferred_lft forever
    inet6 2002:ac14::1/48 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::42:5eff:fe19:550d/64 scope link
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever

Note that in the 20th entry we have br-052da268171c with the global IPv6 address 2002:ac14::1/48; this is the bridge serving the container bc13944d22e1.

At this time we have been able to provide a simple Docker with IPv6 solution.

3.4.2. Disconnect a container from a user-defined bridge

If another route needs to be added to nginx, you need to modify the routes:

# using ip route commands

$ ip r
default via 10.0.0.1 dev eno1  proto static  metric 100
default via 10.0.0.1 dev wlan0  proto static  metric 600
10.0.0.0/24 dev eno1  proto kernel  scope link  src 10.0.0.80
10.0.0.0/24 dev wlan0  proto kernel  scope link  src 10.0.0.38
10.0.0.0/24 dev eno1  proto kernel  scope link  src 10.0.0.80  metric 100
10.0.0.0/24 dev wlan0  proto kernel  scope link  src 10.0.0.38  metric 600
10.0.8.0/24 dev lxdbr0  proto kernel  scope link  src 10.0.8.1
169.254.0.0/16 dev lxdbr0  scope link  metric 1000
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1
172.18.0.0/16 dev br-898fbb0a0c83  proto kernel  scope link  src 172.18.0.1
172.20.0.0/16 dev br-052da268171c  proto kernel  scope link  src 172.20.0.1
192.168.99.0/24 dev vboxnet1  proto kernel  scope link  src 192.168.99.1

If the routes are correctly updated, you should be able to see the nginx web page at http://172.20.0.1

We now have completed the exercise.

To disconnect a running container from a user-defined bridge, use the docker network disconnect command. The following command disconnects the my-nginx container from the my_ipv6_bridge network:

$ docker network disconnect my_ipv6_bridge my-nginx

The IPv6 Docker setup we used here is for demo purposes only. For real production use, we need to follow one of the IPv6 solutions we have come across.

3.5. Challenges in Production Use

“This link” [10] discusses the use of nftables, the next-generation replacement for iptables, and tries to build a production-worthy Docker setup for IPv6 usage.

4. ICMPv6 and NDP

ICMP is a control protocol that is considered to be an integral part of IP, although it is architecturally layered upon IP, i.e., it uses IP to carry its data end-to-end just as a transport protocol like TCP or UDP does. ICMP provides error reporting, congestion reporting, and first-hop gateway redirection.

To communicate on its directly-connected network, a host must implement the communication protocol used to interface to that network. We call this a link layer or media-access layer protocol.

IPv4 uses ARP for link-layer (MAC) address discovery. In contrast, IPv6 uses ICMPv6 through the Neighbor Discovery Protocol (NDP). NDP defines five ICMPv6 packet types for the purposes of router solicitation, router advertisement, neighbor solicitation, neighbor advertisement, and network redirects. Refer to RFC 4861 and RFC 3122.
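For reference, the five NDP messages map to fixed ICMPv6 type numbers (per RFC 4861); a small lookup sketch, with a function name of our own choosing:

```shell
# ICMPv6 type numbers of the five NDP message types (RFC 4861).
ndp_type() {
    case "$1" in
        router-solicitation)    echo 133 ;;
        router-advertisement)   echo 134 ;;
        neighbor-solicitation)  echo 135 ;;
        neighbor-advertisement) echo 136 ;;
        redirect)               echo 137 ;;
        *)                      echo unknown ;;
    esac
}

ndp_type neighbor-solicitation   # -> 135
```

These are the type values you will see when capturing NDP traffic with a packet analyzer.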

Contrasting with ARP, NDP includes Neighbor Unreachability Detection (NUD), thus improving the robustness of packet delivery in the presence of failing routers or links, or mobile nodes. As long as hosts used a single network interface, the isolation between the local network and remote networks was simple. With the requirements of multihoming for hosts with multiple interfaces, and multiple-destination packet transfers, the simplicity of maintaining all routing to remote gateways has disappeared.

Adding container networks to local networks and IPv6 link-local networks, together with virtual or logical routing on hosts, makes the complexity exponential. In order to keep end hosts (physical, virtual or container) simple, maintaining only sessions and remote gateways (routers), with routes independent of session state, is still desirable for scaling Internet-connected end hosts.

For more details, please refer to [1].

4.1. IPv6-only Containers & Using NDP Proxying

IPv6-only containers will need to fully depend on NDP proxying.

If your Docker host is the only part of an IPv6 subnet but does not have an IPv6 subnet assigned, you can use NDP Proxying to connect your containers to the internet via IPv6.

If the host with IPv6 address 2001:db8::c001 is part of the subnet 2001:db8::/64, and your IaaS provider allows you to configure the IPv6 addresses 2001:db8::c000 to 2001:db8::c00f, your network configuration may look like the following:

$ ip -6 addr show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536
   inet6 ::1/128 scope host
      valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
   inet6 2001:db8::c001/64 scope global
      valid_lft forever preferred_lft forever
   inet6 fe80::601:3fff:fea1:9c01/64 scope link
      valid_lft forever preferred_lft forever

To split up the configurable address range into two subnets 2001:db8::c000/125 and 2001:db8::c008/125, use the following daemon.json settings.

{
  "ipv6": true,
  "fixed-cidr-v6": "2001:db8::c008/125"
}

The first subnet will be used by non-Docker processes on the host, and the second will be used by Docker.
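The /125 split works out as follows: a /125 prefix leaves 128 - 125 = 3 host bits, i.e. 8 addresses per subnet, so 2001:db8::c000/125 covers ::c000 through ::c007 and 2001:db8::c008/125 covers ::c008 through ::c00f. A quick check of the arithmetic:

```shell
# A /125 prefix leaves 3 host bits: 2^3 = 8 addresses per subnet,
# matching the split of the ::c000-::c00f range into two halves.
addrs=$(( 1 << (128 - 125) ))
echo "$addrs addresses per /125"   # -> 8 addresses per /125
```

On the host itself, NDP proxy entries for the Docker-side addresses are typically added with sysctl net.ipv6.conf.eth0.proxy_ndp=1 and ip -6 neigh add proxy <address> dev eth0, one entry per container address.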

_images/ndp-proxying.png

Figure: Using NDP Proxying

For more details, please refer to [2].

5. Docker IPv6 Simple Cluster Topology

Using external switches or routers allows you to enable IPv6 communication between containers on different hosts. We have two physical hosts, Host1 and Host2, and we will study two scenarios: one with a switch and one with a router at the top of the hierarchy connecting those two hosts. Both hosts host a pair of containers in a cluster. The contents are borrowed from article [1] below, which can be used on any Linux distro (CentOS, Ubuntu, openSUSE etc.) with a recent kernel. A sample test is described in the blog article [2] as a variation using ESXi and an older Ubuntu 14.04.

5.1. Switched Network Environment

Using routable IPv6 addresses allows you to realize communication between containers on different hosts. Let’s have a look at a simple Docker IPv6 cluster example:

_images/docker-ipv6-cluster-example.png

Figure 1: A Docker IPv6 Cluster Example

The Docker hosts are in the 2001:db8:0::/64 subnet. Host1 is configured to provide addresses from the 2001:db8:1::/64 subnet to its containers. It has three routes configured:

  • Route all traffic to 2001:db8:0::/64 via eth0
  • Route all traffic to 2001:db8:1::/64 via docker0
  • Route all traffic to 2001:db8:2::/64 via Host2 with IP 2001:db8:0::2

Host1 also acts as a router on OSI layer 3. When one of the network clients tries to contact a target that is specified in Host1’s routing table, Host1 will forward the traffic accordingly. It acts as a router for all networks it knows: 2001:db8::/64, 2001:db8:1::/64, and 2001:db8:2::/64.

On Host2, we have nearly the same configuration. Host2’s containers will get IPv6 addresses from 2001:db8:2::/64. Host2 has three routes configured:

  • Route all traffic to 2001:db8:0::/64 via eth0
  • Route all traffic to 2001:db8:2::/64 via docker0
  • Route all traffic to 2001:db8:1::/64 via Host1 with IP 2001:db8:0::1

The difference to Host1 is that the network 2001:db8:2::/64 is directly attached to Host2 via its docker0 interface, whereas Host2 reaches 2001:db8:1::/64 via Host1’s IPv6 address 2001:db8:0::1.

This way every container can contact every other container. The containers Container1-* share the same subnet and contact each other directly. The traffic between Container1-* and Container2-* will be routed via Host1 and Host2 because those containers do not share the same subnet.

In a switched environment every host must know all routes to every subnet. You always must update the hosts’ routing tables once you add or remove a host to the cluster.

Every configuration in the diagram that is shown below the dashed line across hosts is handled by Docker, such as the docker0 bridge IP address configuration, the route to the Docker subnet on the host, the container IP addresses and the routes on the containers. The configuration above the line across hosts is up to the user and can be adapted to the individual environment.

5.2. Routed Network Environment

In a routed network environment, you replace the layer 2 switch with a layer 3 router. Now the hosts just must know their default gateway (the router) and the route to their own containers (managed by Docker). The router holds all routing information about the Docker subnets. When you add or remove a host to this environment, you just must update the routing table in the router instead of on every host.

_images/routed-network-environment.png

Figure 2: A Routed Network Environment

In this scenario, containers of the same host can communicate directly with each other. The traffic between containers on different hosts will be routed via their hosts and the router. For example, packet from Container1-1 to Container2-1 will be routed through Host1, Router, and Host2 until it arrives at Container2-1.

To keep the IPv6 addresses short, in this example a /48 network is assigned to every host. Each host uses a /64 subnet of this for its own services and another one for Docker. When adding a third host, you would add a route for the subnet 2001:db8:3::/48 in the router and configure Docker on Host3 with --fixed-cidr-v6=2001:db8:3:1::/64.

Remember the subnet for Docker containers should at least have a size of /80. This way an IPv6 address can end with the container’s MAC address and you prevent NDP neighbor cache invalidation issues in the Docker layer. So if you have a /64 for your whole environment, use /76 subnets for the hosts and /80 for the containers. This way you can use 4096 hosts with 16 /80 subnets each.
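The sizing claim in the paragraph above can be checked directly: splitting a /64 into /76 host subnets yields 2^(76-64) hosts, and each /76 splits into 2^(80-76) container subnets of size /80:

```shell
# Subnet-count arithmetic for the /64 -> /76 -> /80 plan described above.
hosts=$(( 1 << (76 - 64) ))      # number of /76 subnets inside a /64
per_host=$(( 1 << (80 - 76) ))   # number of /80 subnets inside a /76
echo "$hosts hosts with $per_host /80 subnets each"   # -> 4096 hosts with 16 /80 subnets each
```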

Every configuration in the diagram that is visualized below the dashed line across hosts is handled by Docker, such as the docker0 bridge IP address configuration, the route to the Docker subnet on the host, the container IP addresses and the routes on the containers. The configuration above the line across hosts is up to the user and can be adapted to the individual environment.

6. Docker IPv6 NAT
6.1. What is the Issue with Using IPv6 with Containers?

Initially Docker was not created with IPv6 in mind. It was added later. As a result, there are still several unresolved issues as to how IPv6 should be used in a containerized world.

Currently, you can let Docker give each container an IPv6 address from your (public) pool, but this has disadvantages (Refer to [1]):

  • Giving each container a publicly routable address means all ports (even unexposed / unpublished ports) are suddenly reachable by everyone, if no additional filtering is done.
  • By default, each container gets a random IPv6 address, making it impossible to do DNS properly. An alternative is to assign a specific IPv6 address to each container, but it is still an administrative hassle.
  • Published ports won’t work on IPv6, unless you have the userland proxy enabled (which, for now, is enabled by default in Docker)
  • The userland proxy, however, seems to be on its way out and has various issues, such as:
    • It can use a lot of RAM.
    • Source IP addresses are rewritten, making it completely unusable for many purposes, e.g. mail servers.

IPv6 for Docker can (depending on your setup) be pretty much unusable and completely inconsistent with the way IPv4 works. Docker images are mostly designed with IPv4 NAT in mind: NAT provides a layer of security by allowing only published ports through, and linking containers to user-defined networks provides inter-container communication. This does not go hand in hand with the way Docker IPv6 works, requiring image maintainers to rethink/adapt their images with IPv6 in mind.

6.2. Why not IPv6 with NAT?

So why not try to resolve the above issues by managing ip6tables to set up IPv6 NAT for your containers, the way the Docker daemon does for IPv4? This requires a locally reserved address range, just as we have private IP ranges in IPv4; in IPv6 these are called unique local IPv6 addresses. Let’s first understand the IPv6 addressing scheme.

We note that there are 3 types of IPv6 addresses, and all use the last (least significant) 64 bits as the Interface ID. It is derived by splitting the 48-bit MAC address into two 24-bit halves, inserting the hexadecimal value FFFE between them, and inverting the universal/local bit (the 7th bit of the first octet) to create an equivalent 64-bit identifier called EUI-64. Refer to [2] for details.
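The EUI-64 construction (split the MAC in half, insert FFFE, flip the universal/local bit) can be sketched as follows; the helper name is ours:

```shell
# Derive the EUI-64 interface ID from a 48-bit MAC address:
# split it into two halves, insert FFFE in the middle, and flip
# the universal/local bit (bit 0x02 of the first octet).
mac_to_eui64() {
    IFS=: read -r o1 o2 o3 o4 o5 o6 <<EOF
$1
EOF
    flipped=$(printf '%02x' $(( 0x$o1 ^ 0x02 )))
    printf '%s%s:%sff:fe%s:%s%s\n' "$flipped" "$o2" "$o3" "$o4" "$o5" "$o6"
}

# The registry container's MAC from the inspect output earlier:
mac_to_eui64 02:42:ac:11:00:02   # -> 0042:acff:fe11:0002
```

Applying it to the host NIC's MAC 8c:dc:d4:6e:d5:4b yields 8edc:d4ff:fe6e:d54b, which matches the link-local address fe80::8edc:d4ff:fe6e:d54b seen in the ip addr output earlier.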

1. Global Unicast Address

This is equivalent to IPv4’s public address, with the most significant bits of the Global Routing Prefix always set to 001. The subnet field is 16 bits, as opposed to 8 bits in IPv4.

_images/global-unicast.jpg

2. Link-Local Address

Link-local addresses are used for communication among IPv6 hosts on a single link (broadcast segment) only. These addresses are not routable. A link-local address always starts with FE80, and the 48 bits following FE80 are always set to 0. The Interface ID is the usual EUI-64 generated from the MAC address of the NIC.

_images/link-local.jpg

3. Unique-Local Address

This type of IPv6 address is globally unique, but used only in site-local communication. The second half of this address contains the Interface ID, and the first half is divided among Prefix, L bit, Global ID and Subnet ID.

_images/unique-local.jpg

The Prefix is always set to 1111 110. The L bit is set to 1 if the address is locally assigned; the meaning of the L bit set to 0 is so far not defined. Therefore, a Unique Local IPv6 address always starts with ‘FD’.
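The ‘FD’ observation follows directly from the bits: the 7-bit prefix 1111 110 with the L bit appended as 1 gives the first octet 1111 1101, i.e. 0xFD. A one-line check:

```shell
# The 7-bit ULA prefix 1111110 (0xFC) with the L bit set gives 0xFD,
# so locally assigned unique-local addresses start with "fd".
printf '%02x\n' $(( 0xFC | 0x01 ))   # -> fd
```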

IPv6 addresses of all types are assigned to interfaces, not nodes (hosts). An IPv6 unicast address refers to a single interface. Since each interface belongs to a single node (host), any of that node’s interfaces’ unicast addresses may be used as an identifier for the node (host). For IPv6 NAT, we prefer the scope to remain within the site, using unique local addresses, so that they stay private to the organization.

_images/unicast-scope.jpg

Figure 1: Scope of IPv6 Unicast Addresses

Based on the IPv6 scopes, the question now arises: what needs to be mapped to what? Is it IPv6 to IPv4, or IPv6 to IPv6 with port translation? In other words, are we talking about NAT64 with dual stack, or just NAT66? Is either a standard agreed upon in IETF RFCs? Dwelling on these questions brings us back to: should we complicate life with another docker-ipv6nat?

The conclusion is simple: it is not worth it and it is highly recommended that you go through the blog listed below [3].

6.3. Conclusion

As the IPv6 Project team in OPNFV, we conclude that IPv6 NAT is not worth the effort and should be discouraged. Accordingly, we recommend that you do not use IPv6 NAT for containers for any NFV use cases.

Joid

JOID installation instruction
1. Abstract

This document will explain how to install the Fraser release of OPNFV with JOID including installing JOID, configuring JOID for your environment, and deploying OPNFV with different SDN solutions in HA, or non-HA mode.

2. Introduction
2.1. JOID in brief

JOID (Juju OPNFV Infrastructure Deployer) allows you to deploy different combinations of OpenStack release and SDN solution in HA or non-HA mode. For OpenStack, JOID currently supports Ocata and Pike. For SDN, it supports Open vSwitch, OpenContrail, OpenDaylight, and ONOS. In addition to HA or non-HA mode, it also supports deploying from the latest development tree.

JOID heavily utilizes the technology developed in Juju and MAAS.

Juju is a state-of-the-art, open source modelling tool for operating software in the cloud. Juju allows you to deploy, configure, manage, maintain, and scale cloud applications quickly and efficiently on public clouds, as well as on physical servers, OpenStack, and containers. You can use Juju from the command line or through its beautiful GUI. (source: Juju Docs)

MAAS is Metal As A Service. It lets you treat physical servers like virtual machines (instances) in the cloud. Rather than having to manage each server individually, MAAS turns your bare metal into an elastic cloud-like resource. Machines can be quickly provisioned and then destroyed again as easily as you can with instances in a public cloud. ... In particular, it is designed to work especially well with Juju, the service and model management service. It’s a perfect arrangement: MAAS manages the machines and Juju manages the services running on those machines. (source: MAAS Docs)

2.2. Typical JOID Architecture

The MAAS server is installed and configured on the Jumphost, which runs Ubuntu 16.04 LTS server and has access to the Internet. Another VM is created to be managed by MAAS as the bootstrap node for Juju. The remaining resources, bare metal or virtual, are registered and provisioned in MAAS. Finally, the MAAS environment details are passed to Juju for use.

3. Setup Requirements
3.1. Network Requirements

Minimum 2 Networks:

  • One for the administrative network with gateway to access the Internet
  • One for the OpenStack public network to access OpenStack instances via floating IPs

JOID supports multiple isolated networks for data as well as storage, based on your network requirements for OpenStack.

No DHCP server should be up and configured on these networks. Configure gateways only on the eth0 and eth1 networks to access the network outside your lab.

3.2. Jumphost Requirements

The Jumphost requirements are outlined below:

  • OS: Ubuntu 16.04 LTS Server
  • Root access.
  • CPU cores: 16
  • Memory: 32GB
  • Hard Disk: 1× (min. 250 GB)
  • NIC: eth0 (admin, management), eth1 (external connectivity)
3.3. Physical node requirements (bare metal deployment)

Besides the Jumphost, a minimum of 5 physical servers is required for a bare metal environment.

  • CPU cores: 16
  • Memory: 32GB
  • Hard Disk: 2× (500GB) prefer SSD
  • NIC: eth0 (Admin, Management), eth1 (external network)

NOTE: The above configuration is the minimum. For better performance and usage of OpenStack, please consider higher specs for all nodes.

Make sure all servers are connected to the top-of-rack switch and configured accordingly.

4. Bare Metal Installation

Before proceeding, make sure that your hardware infrastructure satisfies the Setup Requirements.

4.1. Networking

Make sure you have at least two networks configured:

  1. Admin (management) network with gateway to access the Internet (for downloading installation resources).
  2. Public/floating network to be consumed by tenants for floating IPs.

You may configure other networks, e.g. for data or storage, based on your network options for OpenStack.

4.2. Jumphost installation and configuration
  1. Install Ubuntu 16.04 (Xenial) LTS server on Jumphost (one of the physical nodes).

    Tip

    Use ubuntu as both the username and the password, as this matches the MAAS credentials installed later.

    During the OS installation, install the OpenSSH server package to allow SSH connections to the Jumphost.

    If the size of the installation image is a problem, or if downloading it is slow (e.g. when mounted through a slow virtual console), you can also use the Ubuntu mini ISO. Install these packages: standard system utilities, basic Ubuntu server, OpenSSH server, Virtual Machine host.

    If you see a blank console after booting, set nomodeset (removing quiet splash can also be useful to see the log during booting), either through the console in recovery mode or via SSH (if installed).
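To make nomodeset persistent across reboots, one option is to edit the GRUB defaults. The following is a sketch assuming a stock Ubuntu /etc/default/grub; adjust the existing GRUB_CMDLINE_LINUX_DEFAULT value to match your file:

```shell
# Replace the default kernel command line (e.g. "quiet splash") with "nomodeset"
sudo sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="nomodeset"/' /etc/default/grub
# Regenerate /boot/grub/grub.cfg so the change takes effect on the next boot
sudo update-grub
```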

  2. Install git and bridge-utils packages

    sudo apt install git bridge-utils
    
  3. Configure bridges for each network to be used.

    Example /etc/network/interfaces file:

    source /etc/network/interfaces.d/*
    
    # The loopback network interface (set by Ubuntu)
    auto lo
    iface lo inet loopback
    
    # Admin network interface
    iface eth0 inet manual
    auto brAdmin
    iface brAdmin inet static
            bridge_ports eth0
            address 10.5.1.1
            netmask 255.255.255.0
    
    # Ext. network for floating IPs
    iface eth1 inet manual
    auto brExt
    iface brExt inet static
            bridge_ports eth1
            address 10.5.15.1
            netmask 255.255.255.0
    

    Note

    If you choose to use separate networks for management, public, data and storage, then you need to create a bridge for each interface. In case of VLAN tags, use the appropriate network on the Jumphost depending on the VLAN ID on the interface.
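For example, a VLAN-tagged admin bridge in /etc/network/interfaces might look like the following sketch (VLAN ID 102 and the addressing are assumptions; the vlan package must be installed):

```
# Hypothetical VLAN 102 carried on eth0, bridged for the admin network
auto eth0.102
iface eth0.102 inet manual

auto brAdmin
iface brAdmin inet static
        bridge_ports eth0.102
        address 10.5.1.1
        netmask 255.255.255.0
```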

    Note

    Both networks need to have Internet connectivity. If only one of your interfaces has Internet access, you can set up IP forwarding. For an example of how to accomplish that, see the script in the Nokia pod 1 deployment (labconfig/nokia/pod1/setup_ip_forwarding.sh).
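A minimal IP forwarding setup might look like the following. This is a sketch, not the Nokia pod script itself, and the interface names are assumptions:

```shell
# Forward traffic from brAdmin out through eth1 (hypothetical interface names)
sudo sysctl -w net.ipv4.ip_forward=1
sudo iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
sudo iptables -A FORWARD -i brAdmin -o eth1 -j ACCEPT
sudo iptables -A FORWARD -i eth1 -o brAdmin -m state --state RELATED,ESTABLISHED -j ACCEPT
```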

4.3. Configure JOID for your lab

All configuration for the JOID deployment is specified in a labconfig.yaml file. Here you describe all your physical nodes, their roles in OpenStack, their network interfaces, IPMI parameters etc. It’s also where you configure your OPNFV deployment and MAAS networks/spaces. You can find example configuration files from already existing nodes in the repository.

First of all, download JOID to your Jumphost. We recommend doing this in your home directory.

git clone https://gerrit.opnfv.org/gerrit/p/joid.git

Tip

You can select the stable version of your choice by specifying the git branch, for example:

git clone -b stable/fraser https://gerrit.opnfv.org/gerrit/p/joid.git

Create a directory in joid/labconfig/<company_name>/<pod_number>/ and create or copy a labconfig.yaml configuration file to that directory. For example:

# All JOID actions are done from the joid/ci directory
cd joid/ci
mkdir -p ../labconfig/your_company/pod1
cp ../labconfig/nokia/pod1/labconfig.yaml ../labconfig/your_company/pod1/

Example labconfig.yaml configuration file:

lab:
  location: your_company
  racks:
  - rack: pod1
    nodes:
    - name: rack-1-m1
      architecture: x86_64
      roles: [network,control]
      nics:
      - ifname: eth0
        spaces: [admin]
        mac: ["12:34:56:78:9a:bc"]
      - ifname: eth1
        spaces: [floating]
        mac: ["12:34:56:78:9a:bd"]
      power:
        type: ipmi
        address: 192.168.10.101
        user: admin
        pass: admin
    - name: rack-1-m2
      architecture: x86_64
      roles: [compute,control,storage]
      nics:
      - ifname: eth0
        spaces: [admin]
        mac: ["23:45:67:89:ab:cd"]
      - ifname: eth1
        spaces: [floating]
        mac: ["23:45:67:89:ab:ce"]
      power:
        type: ipmi
        address: 192.168.10.102
        user: admin
        pass: admin
    - name: rack-1-m3
      architecture: x86_64
      roles: [compute,control,storage]
      nics:
      - ifname: eth0
        spaces: [admin]
        mac: ["34:56:78:9a:bc:de"]
      - ifname: eth1
        spaces: [floating]
        mac: ["34:56:78:9a:bc:df"]
      power:
        type: ipmi
        address: 192.168.10.103
        user: admin
        pass: admin
    - name: rack-1-m4
      architecture: x86_64
      roles: [compute,storage]
      nics:
      - ifname: eth0
        spaces: [admin]
        mac: ["45:67:89:ab:cd:ef"]
      - ifname: eth1
        spaces: [floating]
        mac: ["45:67:89:ab:ce:f0"]
      power:
        type: ipmi
        address: 192.168.10.104
        user: admin
        pass: admin
    - name: rack-1-m5
      architecture: x86_64
      roles: [compute,storage]
      nics:
      - ifname: eth0
        spaces: [admin]
        mac: ["56:78:9a:bc:de:f0"]
      - ifname: eth1
        spaces: [floating]
        mac: ["56:78:9a:bc:df:f1"]
      power:
        type: ipmi
        address: 192.168.10.105
        user: admin
        pass: admin
    floating-ip-range: 10.5.15.6,10.5.15.250,10.5.15.254,10.5.15.0/24
    ext-port: "eth1"
    dns: 8.8.8.8
opnfv:
    release: d
    distro: xenial
    type: noha
    openstack: pike
    sdncontroller:
    - type: nosdn
    storage:
    - type: ceph
      disk: /dev/sdb
    feature: odl_l2
    spaces:
    - type: admin
      bridge: brAdmin
      cidr: 10.5.1.0/24
      gateway:
      vlan:
    - type: floating
      bridge: brExt
      cidr: 10.5.15.0/24
      gateway: 10.5.15.1
      vlan:

Once you have prepared the configuration file, you may begin with the automatic MAAS deployment.

4.4. MAAS Install

This section will guide you through the MAAS deployment. This is the first of two JOID deployment steps.

Note

For all the commands in this document, please do not use the root user account to run them; instead, use a non-root user account. We recommend using the ubuntu user as described above.

If you have already installed and enabled MAAS for your environment, there is no need to install or enable it again. If you have patches from a previous MAAS install, you can apply them here.

Pre-installing MAAS without using the 03-maasdeploy.sh script is not supported. We strongly suggest using the 03-maasdeploy.sh script to deploy the MAAS and Juju environment.

With the labconfig.yaml configuration file ready, you can start the MAAS deployment. In the joid/ci directory, run the following command:

# in joid/ci directory
./03-maasdeploy.sh custom <absolute path of config>/labconfig.yaml

If you prefer, you can also host your labconfig.yaml file remotely and JOID will download it from there. Just run

# in joid/ci directory
./03-maasdeploy.sh custom http://<web_site_location>/labconfig.yaml

This step will take approximately 30 minutes to a couple of hours depending on your environment. This script will do the following:

  • If this is your first time running this script, it will download all the required packages.
  • Install MAAS on the Jumphost.
  • Configure MAAS to enlist and commission a VM for Juju bootstrap node.
  • Configure MAAS to enlist and commission bare metal servers.
  • Download and load Ubuntu server images to be used by MAAS.

During deployment, once MAAS is installed, configured and launched, you can visit the MAAS web UI and observe the progress of the deployment. Simply open the IP of your Jumphost in a web browser and navigate to the /MAAS path (e.g. http://10.5.1.1/MAAS in our example). You can log in with username ubuntu and password ubuntu. On the Nodes page, you can see the bootstrap node and the bare metal servers and their status.

Hint

If you need to re-run this step, first undo the performed actions by running

# in joid/ci
./cleanvm.sh
./cleanmaas.sh
# now you can run the ./03-maasdeploy.sh script again
4.5. Juju Install

This section will guide you through the Juju and OPNFV deployment. This is the second of two JOID deployment steps.

JOID allows you to deploy different combinations of OpenStack and SDN solutions in HA or non-HA mode. For OpenStack, it supports Pike and Ocata. For SDN, it supports Open vSwitch, OpenContrail, OpenDaylight and ONOS (Open Network Operating System). In addition to HA or non-HA mode, it also supports deploying the latest from the development tree (tip).

To deploy OPNFV on the previously deployed MAAS system, use the deploy.sh script. For example:

# in joid/ci directory
./deploy.sh -d xenial -m openstack -o pike -s nosdn -f none -t noha -l custom

The above command starts an OPNFV deployment with Ubuntu Xenial (16.04) distro, OpenStack model, Pike version of OpenStack, Open vSwitch (and no other SDN), no special features, no-HA OpenStack mode and with custom labconfig. I.e. this corresponds to the os-nosdn-nofeature-noha OPNFV deployment scenario.

Note

You can see the usage info of the script by running

./deploy.sh --help

Possible script arguments are as follows.

Ubuntu distro to deploy

[-d <trusty|xenial>]
  • trusty: Ubuntu 14.04.
  • xenial: Ubuntu 16.04.

Model to deploy

[-m <openstack|kubernetes>]

JOID offers two models to deploy.

  • openstack: OpenStack, which will be used for KVM- or LXD-based workloads.
  • kubernetes: Kubernetes, which will be used for Docker-based workloads.

Version of Openstack deployed

[-o <pike|ocata>]
  • pike: Pike version of OpenStack.
  • ocata: Ocata version of OpenStack.

SDN controller

[-s <nosdn|odl|opencontrail|onos|canal>]
  • nosdn: Open vSwitch only and no other SDN.
  • odl: OpenDaylight (Boron version).
  • opencontrail: OpenContrail SDN.
  • onos: ONOS framework as SDN.
  • canal: Canal CNI plugin for Kubernetes.

Feature to deploy (comma separated list)

[-f <lxd|dvr|sfc|dpdk|ipv6|none>]
  • none: No special feature will be enabled.
  • ipv6: IPv6 will be enabled for tenants in OpenStack.
  • lxd: The hypervisor will be LXD rather than KVM.
  • dvr: Distributed virtual routing will be enabled.
  • dpdk: The DPDK feature will be enabled.
  • sfc: The SFC feature will be enabled (only supported with ONOS deployments).
  • lb: Load balancing will be enabled (Kubernetes only).
  • ceph: Ceph storage will be enabled (Kubernetes only).

Mode of Openstack deployed

[-t <noha|ha|tip>]
  • noha: No High Availability.
  • ha: High Availability.
  • tip: The latest from the development tree.

Where to deploy

[-l <custom|default|...>]
  • custom: For bare metal deployment where labconfig.yaml was provided externally and is not part of the JOID package.
  • default: For virtual deployment where installation will be done on KVM VMs created using 03-maasdeploy.sh.

Architecture

[-a <amd64|ppc64el|aarch64>]
  • amd64: Only the x86 architecture will be used. A future version will support arm64 as well.
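The mapping from deploy.sh flags to the resulting OPNFV scenario name can be sketched as a small helper (hypothetical, not part of JOID):

```shell
# Compose an OPNFV scenario name (e.g. os-nosdn-nofeature-noha) from
# deploy.sh-style arguments: <model> <sdn> <feature> <mode>
scenario_name() {
  model=$1; sdn=$2; feature=$3; mode=$4
  case $model in
    openstack)  prefix=os ;;
    kubernetes) prefix=k8 ;;
  esac
  # "-f none" shows up as "nofeature" in the scenario name
  [ "$feature" = "none" ] && feature=nofeature
  echo "${prefix}-${sdn}-${feature}-${mode}"
}

scenario_name openstack nosdn none noha   # prints os-nosdn-nofeature-noha
scenario_name kubernetes nosdn lb noha    # prints k8-nosdn-lb-noha
```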

This step may take up to a couple of hours, depending on your configuration, internet connectivity etc. You can check the status of the deployment by running this command in another terminal:

watch juju status --format tabular

Hint

If you need to re-run this step, first undo the performed actions by running

# in joid/ci
./clean.sh
# now you can run the ./deploy.sh script again
4.6. OPNFV Scenarios in JOID

The following OPNFV scenarios can be deployed using JOID. A separate YAML bundle will be created to deploy each individual scenario.

Scenario                  Owner    Known Issues
os-nosdn-nofeature-ha     Joid
os-nosdn-nofeature-noha   Joid
os-odl_l2-nofeature-ha    Joid     Floating IPs are not working on this deployment.
os-nosdn-lxd-ha           Joid     Yardstick team is working to support.
os-nosdn-lxd-noha         Joid     Yardstick team is working to support.
os-onos-nofeature-ha      ONOSFW
os-onos-sfc-ha            ONOSFW
k8-nosdn-nofeature-noha   Joid     No support from Functest and Yardstick.
k8-nosdn-lb-noha          Joid     No support from Functest and Yardstick.
4.7. Troubleshoot

By default, debug is enabled in the scripts, and error messages will be printed on the SSH terminal where you are running them.

Logs are indispensable when it comes time to troubleshoot. If you want to see all the service unit deployment logs, you can run juju debug-log in another terminal. The debug-log command shows the consolidated logs of all Juju agents (machine and unit logs) running in the environment.

To view a single service unit deployment log, use juju ssh to access the deployed unit. For example, to log into the nova-compute unit and check /var/log/juju/unit-nova-compute-0.log for more info:

ubuntu@R4N4B1:~$ juju ssh nova-compute/0
Warning: Permanently added '172.16.50.60' (ECDSA) to the list of known hosts.
Warning: Permanently added '3-r4n3b1-compute.maas' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 3.13.0-77-generic x86_64)

* Documentation:  https://help.ubuntu.com/
<skipped>
Last login: Tue Feb  2 21:23:56 2016 from bootstrap.maas
ubuntu@3-R4N3B1-compute:~$ sudo -i
root@3-R4N3B1-compute:~# cd /var/log/juju/
root@3-R4N3B1-compute:/var/log/juju# ls
machine-2.log  unit-ceilometer-agent-0.log  unit-ceph-osd-0.log  unit-neutron-contrail-0.log  unit-nodes-compute-0.log  unit-nova-compute-0.log  unit-ntp-0.log
root@3-R4N3B1-compute:/var/log/juju#

Note

By default, Juju will add the ubuntu user's keys for authentication into the deployed servers, and only SSH access will be available.

Once you resolve the error, go back to the jump host to rerun the charm hook with

$ juju resolved --retry <unit>

If you would like to start over, run juju destroy-environment <environment name> to release the resources, then you can run deploy.sh again.

To access any of the nodes or containers, use

juju ssh <service name>/<instance id>

For example:

juju ssh openstack-dashboard/0
juju ssh nova-compute/0
juju ssh neutron-gateway/0

You can see the available nodes and containers by running

juju status

All charm log files are available under /var/log/juju.


If you have questions, you can join the JOID channel #opnfv-joid on Freenode.

4.8. Common Issues

The following are the common issues we have collected from the community:

  • The right variables are not passed as part of the deployment procedure. Make sure all required arguments are supplied, for example:

    ./deploy.sh -o pike -s nosdn -t ha -l custom -f none
    
  • If you have not set up MAAS with 03-maasdeploy.sh, the ./clean.sh or juju status commands may hang because the correct MAAS API keys are not present in the cloud listing for MAAS.

    _Solution_: Please make sure you have a MAAS cloud listed using juju clouds and that the correct MAAS API key has been added.

  • Deployment times out: use the juju status command and make sure all service containers receive an IP address and are executing code. Ensure there is no service in the error state.

  • In case the cleanup process hangs, run the juju destroy-model command manually.

Direct console access via the OpenStack GUI can be quite helpful if you need to log in to a VM but cannot reach it over the network. It can be enabled by setting console-access-protocol in nova-cloud-controller to vnc. One option is to directly edit the juju-deployer bundle and set it there prior to deploying OpenStack.

nova-cloud-controller:
  options:
    console-access-protocol: vnc

To access the console, just click on the instance in the OpenStack GUI and select the Console tab.
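Alternatively, on an already-deployed model the same option can usually be set with juju config. This is a sketch; verify the option name against your charm version:

```shell
# Switch the console access protocol to VNC on a running deployment
juju config nova-cloud-controller console-access-protocol=vnc
```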

5. Virtual Installation

The virtual deployment of JOID is very simple and does not require any special configuration. To deploy a virtual JOID environment, follow these few simple steps:

  1. Install a clean Ubuntu 16.04 (Xenial) server on the machine. You can use the tips noted in the first step of the Jumphost installation and configuration for bare metal deployment. However, no specialized configuration is needed, just make sure you have Internet connectivity.

  2. Run the MAAS deployment for virtual deployment without customized labconfig file:

    # in joid/ci directory
    ./03-maasdeploy.sh
    
  3. Run the Juju/OPNFV deployment with your desired configuration parameters, but with -l default -i 1 for virtual deployment. For example to deploy the Kubernetes model:

    # in joid/ci directory
    ./deploy.sh -d xenial -s nosdn -t noha -f none -m kubernetes -l default -i 1
    

Now you should have a working JOID deployment with three virtual nodes. In case of any issues, refer to the Troubleshoot section.

6. Post Installation
6.1. Testing Your Deployment

Once the Juju deployment is complete, use juju status to verify that all deployed units are in the _Ready_ state.
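As a quick check, the JSON output of juju status can be filtered for units that are not yet active. The helper below is hypothetical and assumes the Juju 2.x status layout (applications → units → workload-status):

```shell
# Print units whose workload status is not "active" (sketch, Juju 2.x JSON layout)
check_units() {
  python3 -c "
import json, sys
status = json.load(sys.stdin)
for app in status.get('applications', {}).values():
    for name, unit in (app.get('units') or {}).items():
        current = (unit.get('workload-status') or {}).get('current')
        if current != 'active':
            print(name, current)
"
}

# Usage: juju status --format json | check_units
```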

Find the OpenStack dashboard IP address from the juju status output, and see if you can login via a web browser. The domain, username and password are admin_domain, admin and openstack.

Optionally, see if you can log in to the Juju GUI. Run juju gui to see the login details.

If you deploy OpenDaylight, OpenContrail or ONOS, find the IP address of the web UI and login. Please refer to each SDN bundle.yaml for the login username/password.

Note

If the deployment worked correctly, you can get easier access to the web dashboards with the setupproxy.sh script described in the next section.

6.2. Create proxies to the dashboards

MAAS, Juju and OpenStack/Kubernetes all come with their own web-based dashboards. However, they might be on private networks and require SSH tunnelling to reach them. To simplify access, you can use the following script to configure the Apache server on the Jumphost to work as a proxy to the Juju and OpenStack/Kubernetes dashboards. Furthermore, this script also creates a JOID deployment homepage with links to these dashboards, also listing their access credentials.

Simply run the following command after JOID has been deployed.

# run in joid/ci directory
# for OpenStack model:
./setupproxy.sh openstack
# for Kubernetes model:
./setupproxy.sh kubernetes

You can also use the -v argument for more verbose output with xtrace.

After the script has finished, it will print out the addresses and credentials to the dashboards. You can also find the JOID deployment homepage if you open the Jumphost’s IP address in your web browser.

6.3. Configuring OpenStack

At the end of the deployment, the admin-openrc with OpenStack login credentials will be created for you. You can source the file and start configuring OpenStack via CLI.

. ~/joid_config/admin-openrc

The script openstack.sh under joid/ci can be used to configure the OpenStack after deployment.

./openstack.sh <nosdn> custom xenial pike

The following command is used to set up the domain in Heat:

juju run-action heat/0 domain-setup

The following scripts upload cloud images and create a sample network to test:

joid/juju/get-cloud-images
joid/juju/joid-configure-openstack
6.4. Configuring Kubernetes

The script k8.sh under joid/ci can be used to show the Kubernetes workload and create sample pods.

./k8.sh
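Once the Kubernetes model is up, the workload can also be inspected directly with kubectl. This is a sketch and assumes the kubeconfig set up by the charms is in place:

```shell
kubectl get nodes                    # list Kubernetes worker nodes
kubectl get pods --all-namespaces    # list all pods, including the sample ones
```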
6.5. Configuring OpenStack

As mentioned above, the admin-openrc file with OpenStack login credentials is created at the end of the deployment. Its contents look like this:

cat ~/joid_config/admin-openrc
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_TENANT_NAME=admin
export OS_AUTH_URL=http://172.16.50.114:5000/v2.0
export OS_REGION_NAME=RegionOne

We have prepared some scripts to help you configure the OpenStack cloud that you just deployed. In each SDN directory, for example joid/ci/opencontrail, there is a 'scripts' folder where you can find them. These scripts are created to help you configure a basic OpenStack cloud and verify it. For more information on OpenStack cloud configuration, please refer to the OpenStack Cloud Administrator Guide: http://docs.openstack.org/user-guide-admin/. Similarly, for complete SDN configuration, please refer to the respective SDN administrator guide.

Each SDN solution requires slightly different setup. Please refer to the README in each SDN folder. Most likely you will need to modify the openstack.sh and cloud-setup.sh scripts for the floating IP range, private IP network, and SSH keys. Please go through openstack.sh, glance.sh and cloud-setup.sh and make changes as you see fit.

Let’s take a look at the scripts for Open vSwitch and briefly go through each one so you know what you need to change for your own environment.

$ ls ~/joid/juju
configure-juju-on-openstack  get-cloud-images  joid-configure-openstack
6.6. openstack.sh

Let’s first look at openstack.sh. Three functions are defined at the top: configOpenrc(), unitAddress(), and unitMachine().

configOpenrc() {
  cat <<-EOF
      export SERVICE_ENDPOINT=$4
      unset SERVICE_TOKEN
      unset SERVICE_ENDPOINT
      export OS_USERNAME=$1
      export OS_PASSWORD=$2
      export OS_TENANT_NAME=$3
      export OS_AUTH_URL=$4
      export OS_REGION_NAME=$5
EOF
}

unitAddress() {
  if [[ "$jujuver" < "2" ]]; then
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"services\"][\"$1\"][\"units\"][\"$1/$2\"][\"public-address\"]" 2> /dev/null
  else
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"applications\"][\"$1\"][\"units\"][\"$1/$2\"][\"public-address\"]" 2> /dev/null
  fi
}

unitMachine() {
  if [[ "$jujuver" < "2" ]]; then
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"services\"][\"$1\"][\"units\"][\"$1/$2\"][\"machine\"]" 2> /dev/null
  else
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"applications\"][\"$1\"][\"units\"][\"$1/$2\"][\"machine\"]" 2> /dev/null
  fi
}

The function configOpenrc() creates the OpenStack login credentials, the function unitAddress() finds the IP address of the unit, and the function unitMachine() finds the machine info of the unit.
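Note that these one-liners assume Python 2 (the print statement). A Python 3 equivalent, sketched below, uses `juju status --format json` so that only the standard library is needed; it is an assumption, not part of the JOID scripts:

```shell
# Python 3 variant of unitAddress (sketch); pipe `juju status --format json` into it
unit_address() {
  python3 -c "import json,sys; print(json.load(sys.stdin)['applications']['$1']['units']['$1/$2']['public-address'])"
}

# Example with canned status output:
printf '%s' '{"applications":{"keystone":{"units":{"keystone/0":{"public-address":"172.16.50.114"}}}}}' \
  | unit_address keystone 0    # prints 172.16.50.114
```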

create_openrc() {
   keystoneIp=$(keystoneIp)
   if [[ "$jujuver" < "2" ]]; then
       adminPasswd=$(juju get keystone | grep admin-password -A 5 | grep value | awk '{print $2}' 2> /dev/null)
   else
       adminPasswd=$(juju config keystone | grep admin-password -A 5 | grep value | awk '{print $2}' 2> /dev/null)
   fi

   configOpenrc admin $adminPasswd admin http://$keystoneIp:5000/v2.0 RegionOne > ~/joid_config/admin-openrc
   chmod 0600 ~/joid_config/admin-openrc
}

This finds the IP address of the keystone unit 0, writes the OpenStack admin credentials to a new file named admin-openrc in the ~/joid_config/ folder, and changes the permissions of the file. It’s important to change the credentials here if you use a different password in the deployment Juju charm bundle.yaml.

neutron net-show ext-net > /dev/null 2>&1 || neutron net-create ext-net \
                                               --router:external=True \
                                               --provider:network_type flat \
                                               --provider:physical_network physnet1
neutron subnet-show ext-subnet > /dev/null 2>&1 || neutron subnet-create ext-net \
  --name ext-subnet --allocation-pool start=$EXTNET_FIP,end=$EXTNET_LIP \
  --disable-dhcp --gateway $EXTNET_GW $EXTNET_NET

This section creates the ext-net network and the ext-subnet subnet used for floating IPs.

openstack congress datasource create nova "nova" \
  --config username=$OS_USERNAME \
  --config tenant_name=$OS_TENANT_NAME \
  --config password=$OS_PASSWORD \
  --config auth_url=http://$keystoneIp:5000/v2.0

This section creates the Congress datasource for various services. Each service datasource will have an entry in the file.

6.7. get-cloud-images
folder=/srv/data/
sudo mkdir $folder || true

if grep -q 'virt-type: lxd' bundles.yaml; then
   URLS=" \
   http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-lxc.tar.gz \
   http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-root.tar.gz "

else
   URLS=" \
   http://cloud-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img \
   http://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-disk1.img \
   http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-disk1.img \
   http://mirror.catn.com/pub/catn/images/qcow2/centos6.4-x86_64-gold-master.img \
   http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2 \
   http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img "
fi

for URL in $URLS
do
FILENAME=${URL##*/}
if [ -f $folder/$FILENAME ];
then
   echo "$FILENAME already downloaded."
else
   wget  -O  $folder/$FILENAME $URL
fi
done

This section of the file downloads the images to the Jumphost, if not already present, to be used with the OpenStack VIM.

Note

The image downloading and uploading might take too long and time out. In this case, use juju ssh glance/0 to log in to the glance unit 0 and run the script again, or manually run the glance commands.

6.8. joid-configure-openstack
source ~/joid_config/admin-openrc

First, source the admin-openrc file.

# Upload images to glance
glance image-create --name="Xenial LXC x86_64" --visibility=public --container-format=bare --disk-format=root-tar --property architecture="x86_64" < /srv/data/xenial-server-cloudimg-amd64-root.tar.gz
glance image-create --name="Cirros LXC 0.3" --visibility=public --container-format=bare --disk-format=root-tar --property architecture="x86_64" < /srv/data/cirros-0.3.4-x86_64-lxc.tar.gz
glance image-create --name="Trusty x86_64" --visibility=public --container-format=ovf --disk-format=qcow2 < /srv/data/trusty-server-cloudimg-amd64-disk1.img
glance image-create --name="Xenial x86_64" --visibility=public --container-format=ovf --disk-format=qcow2 < /srv/data/xenial-server-cloudimg-amd64-disk1.img
glance image-create --name="CentOS 6.4" --visibility=public --container-format=bare --disk-format=qcow2 < /srv/data/centos6.4-x86_64-gold-master.img
glance image-create --name="Cirros 0.3" --visibility=public --container-format=bare --disk-format=qcow2 < /srv/data/cirros-0.3.4-x86_64-disk.img

Upload the images into Glance to be used for creating the VM.

# adjust tiny image
nova flavor-delete m1.tiny
nova flavor-create m1.tiny 1 512 8 1

Adjust the tiny image profile as the default tiny instance is too small for Ubuntu.

# configure security groups
neutron security-group-rule-create --direction ingress --ethertype IPv4 --protocol icmp --remote-ip-prefix 0.0.0.0/0 default
neutron security-group-rule-create --direction ingress --ethertype IPv4 --protocol tcp --port-range-min 22 --port-range-max 22 --remote-ip-prefix 0.0.0.0/0 default

Open up the ICMP and SSH access in the default security group.

# import key pair
keystone tenant-create --name demo --description "Demo Tenant"
keystone user-create --name demo --tenant demo --pass demo --email demo@demo.demo

nova keypair-add --pub-key id_rsa.pub ubuntu-keypair

Create a project called ‘demo’ and create a user called ‘demo’ in this project. Import the key pair.

# configure external network
neutron net-create ext-net --router:external --provider:physical_network external --provider:network_type flat --shared
neutron subnet-create ext-net --name ext-subnet --allocation-pool start=10.5.8.5,end=10.5.8.254 --disable-dhcp --gateway 10.5.8.1 10.5.8.0/24

This section configures an external network ‘ext-net’ with a subnet called ‘ext-subnet’. In this subnet, the IP pool starts at 10.5.8.5 and ends at 10.5.8.254. DHCP is disabled. The gateway is at 10.5.8.1, and the subnet is 10.5.8.0/24. These are the public IPs that will be requested and associated with instances. Please change the network configuration according to your environment.

# create vm network
neutron net-create demo-net
neutron subnet-create --name demo-subnet --gateway 10.20.5.1 demo-net 10.20.5.0/24

This section creates a private network for the instances. Please change accordingly.

neutron router-create demo-router

neutron router-interface-add demo-router demo-subnet

neutron router-gateway-set demo-router ext-net

This section creates a router and connects this router to the two networks we just created.

# create pool of floating ips
i=0
while [ $i -ne 10 ]; do
  neutron floatingip-create ext-net
  i=$((i + 1))
done

Finally, the script will request 10 floating IPs.

6.8.1. configure-juju-on-openstack

This script can be used to bootstrap Juju on OpenStack, so that Juju can be used as a modelling tool to deploy services and VNFs on top of the OpenStack cloud deployed by JOID.

7. Appendices
7.1. Appendix A: Single Node Deployment

By default, running the script ./03-maasdeploy.sh will automatically create the KVM VMs on a single machine and configure everything for you.

if [ ! -e ./labconfig.yaml ]; then
    virtinstall=1
    labname="default"
    cp ../labconfig/default/labconfig.yaml ./
    cp ../labconfig/default/deployconfig.yaml ./

Please change joid/ci/labconfig/default/labconfig.yaml accordingly. The MAAS deployment script will do the following:

  1. Create the bootstrap VM.
  2. Install MAAS on the Jumphost.
  3. Configure MAAS to enlist and commission a VM for the Juju bootstrap node.

Later, the 03-maasdeploy.sh script will create three additional VMs and register them with the MAAS server:

if [ "$virtinstall" -eq 1 ]; then
          sudo virt-install --connect qemu:///system --name $NODE_NAME --ram 8192 --cpu host --vcpus 4 \
                   --disk size=120,format=qcow2,bus=virtio,io=native,pool=default \
                   $netw $netw --boot network,hd,menu=off --noautoconsole --vnc --print-xml | tee $NODE_NAME

          nodemac=`grep  "mac address" $NODE_NAME | head -1 | cut -d '"' -f 2`
          sudo virsh -c qemu:///system define --file $NODE_NAME
          rm -f $NODE_NAME
          maas $PROFILE machines create autodetect_nodegroup='yes' name=$NODE_NAME \
              tags='control compute' hostname=$NODE_NAME power_type='virsh' mac_addresses=$nodemac \
              power_parameters_power_address='qemu+ssh://'$USER'@'$MAAS_IP'/system' \
              architecture='amd64/generic' power_parameters_power_id=$NODE_NAME
          nodeid=$(maas $PROFILE machines read | jq -r '.[] | select(.hostname == '\"$NODE_NAME\"').system_id')
          maas $PROFILE tag update-nodes control add=$nodeid || true
          maas $PROFILE tag update-nodes compute add=$nodeid || true

fi
7.2. Appendix B: Automatic Device Discovery

If your bare metal servers support IPMI, they can be discovered and enlisted automatically by the MAAS server. You need to configure bare metal servers to PXE boot on the network interface where they can reach the MAAS server. With nodes set to boot from a PXE image, they will start, look for a DHCP server, receive the PXE boot details, boot the image, contact the MAAS server and shut down.

During this process, the MAAS server will be passed information about the node, including the architecture, MAC address and other details which will be stored in the database of nodes. You can accept and commission the nodes via the web interface. When the nodes have been accepted the selected series of Ubuntu will be installed.

7.3. Appendix C: Machine Constraints

Juju and MAAS together allow you to assign different roles to servers, so that hardware and software can be configured according to their roles. We have briefly mentioned and used this feature in our example. Please visit Juju Machine Constraints https://jujucharms.com/docs/stable/charms-constraints and MAAS tags https://maas.ubuntu.com/docs/tags.html for more information.

7.4. Appendix D: Offline Deployment

When your environment has a limited access policy, for example when only the Jump Host has Internet access and the rest of the servers do not, JOID provides tools to support offline installation.

The following package set is provided for those wishing to experiment with a ‘disconnected from the internet’ setup when deploying JOID with MAAS. These instructions provide basic guidance, but note that because MAAS currently relies on DNS, deployment behavior and success may vary depending on the infrastructure setup. An official guided setup is on the roadmap for the next release:

  1. Get the packages from here: https://launchpad.net/~thomnico/+archive/ubuntu/ubuntu-cloud-mirrors

    Note

    The mirror is quite large (700GB in size) and does not mirror the SDN repos/PPAs.

  2. Additionally, to make Juju use a private charm repository instead of an external location, configure environments.yaml to use cloudimg-base-url as described at: https://github.com/juju/docs/issues/757

JOID Configuration guide
JOID Configuration
Scenario 1: Nosdn

./deploy.sh -o pike -s nosdn -t ha -l custom -f none -d xenial -m openstack

Scenario 2: Kubernetes core

./deploy.sh -l custom -f none -m kubernetes

Scenario 3: Kubernetes Load Balancer

./deploy.sh -l custom -f lb -m kubernetes

Scenario 4: Kubernetes with OVN

./deploy.sh -s ovn -l custom -f lb -m kubernetes

Scenario 5: OpenStack with OpenContrail

./deploy.sh -o pike -s ocl -t ha -l custom -f none -d xenial -m openstack

Scenario 6: Kubernetes Load Balancer with Canal CNI

./deploy.sh -s canal -l custom -f lb -m kubernetes

Scenario 7: Kubernetes Load Balancer with Ceph

./deploy.sh -l custom -f lb,ceph -m kubernetes

JOID User Guide
1. Introduction

This document explains how to install OPNFV Fraser with JOID, including installing JOID, configuring JOID for your environment, and deploying OPNFV with different SDN solutions in HA or non-HA mode. Prerequisites include:

  • An Ubuntu 16.04 LTS Server Jumphost
  • Minimum 2 Networks per Pharos requirement
    • One for the administrative network with gateway to access the Internet
    • One for the OpenStack public network to access OpenStack instances via floating IPs
    • JOID supports multiple isolated networks for data as well as storage based on your network requirement for OpenStack.
  • Minimum 6 Physical servers for bare metal environment
    • Jump Host x 1, minimum H/W configuration:
      • CPU cores: 16
      • Memory: 32GB
      • Hard Disk: 1 (250GB)
      • NIC: eth0 (Admin, Management), eth1 (external network)
    • Control and Compute Nodes x 5, minimum H/W configuration:
      • CPU cores: 16
      • Memory: 32GB
      • Hard Disk: 2 (500GB), SSD preferred
      • NIC: eth0 (Admin, Management), eth1 (external network)

NOTE: The above configuration is the minimum. For better performance and usage of OpenStack, please consider higher specs for all nodes.

Make sure all servers are connected to top of rack switch and configured accordingly. No DHCP server should be up and configured. Configure gateways only on eth0 and eth1 networks to access the network outside your lab.

2. Orientation
2.1. JOID in brief

JOID, short for Juju OPNFV Infrastructure Deployer, allows you to deploy different combinations of OpenStack release and SDN solution in HA or non-HA mode. For OpenStack, JOID supports Ocata and Pike. For SDN, it supports Open vSwitch, OpenContrail, OpenDaylight, and ONOS. In addition to HA or non-HA mode, it also supports deploying from the latest development tree.

JOID heavily utilizes the technology developed in Juju and MAAS. Juju is a state-of-the-art, open source, universal model for service oriented architecture and service oriented deployments. Juju allows you to deploy, configure, manage, maintain, and scale cloud services quickly and efficiently on public clouds, as well as on physical servers, OpenStack, and containers. You can use Juju from the command line or through its powerful GUI. MAAS (Metal-As-A-Service) brings the dynamism of cloud computing to the world of physical provisioning and Ubuntu. Connect, commission and deploy physical servers in record time, re-allocate nodes between services dynamically, and keep them up to date; and in due course, retire them from use. In conjunction with the Juju service orchestration software, MAAS will enable you to get the most out of your physical hardware and dynamically deploy complex services with ease and confidence.

For more info on Juju and MAAS, please visit https://jujucharms.com/ and http://maas.ubuntu.com.

2.2. Typical JOID Setup

The MAAS server is installed and configured on Jumphost with Ubuntu 16.04 LTS with access to the Internet. Another VM is created to be managed by MAAS as a bootstrap node for Juju. The rest of the resources, bare metal or virtual, will be registered and provisioned in MAAS. And finally the MAAS environment details are passed to Juju for use.

3. Installation

We will use 03-maasdeploy.sh to automate the deployment of MAAS clusters for use as a Juju provider. MAAS-deployer uses a set of configuration files and simple commands to build a MAAS cluster using virtual machines for the region controller and bootstrap hosts and automatically commission nodes as required so that the only remaining step is to deploy services with Juju. For more information about the maas-deployer, please see https://launchpad.net/maas-deployer.

3.1. Configuring the Jump Host

Let’s get started on the Jump Host node.

The MAAS server will be installed and configured on the Jump Host machine. We need to create bridges on the Jump Host prior to setting up MAAS.

NOTE: Please do not run any of the commands in this document as the ‘root’ user. Please create a non-root user account; we recommend using the ‘ubuntu’ user.

Install the bridge-utils package on the Jump Host and configure a minimum of two bridges, one for the Admin network, the other for the Public network:

$ sudo apt-get install bridge-utils

$ cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

iface p1p1 inet manual

auto brAdm
iface brAdm inet static
    address 172.16.50.51
    netmask 255.255.255.0
    bridge_ports p1p1

iface p1p2 inet manual

auto brPublic
iface brPublic inet static
    address 10.10.15.1
    netmask 255.255.240.0
    gateway 10.10.10.1
    dns-nameservers 8.8.8.8
    bridge_ports p1p2

NOTE: If you choose to use separate networks for management, data, and storage, then you need to create a bridge for each interface. In the case of VLAN tags, bind the appropriate network on the jump host to the VLAN sub-interface carrying that VLAN ID.

NOTE: The Ethernet device names can vary from one installation to another. Please change the Ethernet device names according to your environment.
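
You can list the NIC names actually present on the host before editing the bridge configuration; a small sketch, capturing the list in a variable so it can be reused:

```shell
# List the NIC names on this jumphost before editing
# /etc/network/interfaces (names such as p1p1, eno1 or eth0 vary
# from machine to machine):
ifaces=$(ip -o link show | awk -F': ' '{print $2}')
echo "$ifaces"
```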

MAAS has been integrated in the JOID project. To get the JOID code, please run

$ sudo apt-get install git
$ git clone https://gerrit.opnfv.org/gerrit/p/joid.git
3.2. Setting Up Your Environment for JOID

To set up your own environment, create a directory under joid/labconfig/<company name>/<pod number>/ and copy an existing JOID environment over. For example:

$ cd joid/ci
$ mkdir -p ../labconfig/myown/pod
$ cp ../labconfig/cengn/pod2/labconfig.yaml ../labconfig/myown/pod/

Now let’s configure the labconfig.yaml file. Please modify the sections in the labconfig as per your lab configuration.

lab:
  ## Change the lab name as you want; the MAAS name is formatted from
  ## the location and rack names.
  location: myown
  racks:
  - rack: pod
    ## Fill in the following according to your lab hardware.
    # Define one network/control node and two control/compute/storage
    # nodes, and the rest for compute and storage for backward
    # compatibility. Servers with more disks should be used for
    # compute and storage only.
    nodes:
    # DCOMP4-B, 24cores, 64G, 2disk, 4TBdisk
    - name: rack-2-m1
      architecture: x86_64
      roles: [network,control]
      nics:
      - ifname: eth0
        spaces: [admin]
        mac: ["0c:c4:7a:3a:c5:b6"]
      - ifname: eth1
        spaces: [floating]
        mac: ["0c:c4:7a:3a:c5:b7"]
      power:
        type: ipmi
        address: <bmc ip>
        user: <bmc username>
        pass: <bmc password>
    ## Repeat the above section for each hardware node you have.
    ## Define the floating IP range along with the gateway IP to be
    ## used for instance floating IPs.
    # Multiple MACs separated by spaces, where the MACs are from
    # ext-ports across all network nodes.
    floating-ip-range: 172.16.120.20,172.16.120.62,172.16.120.254,172.16.120.0/24
    ## Interface name to be used for floating IPs.
    # eth1 of m4, since tags for networking are not yet implemented.
    ext-port: "eth1"
    dns: 8.8.8.8
    osdomainname:
opnfv:
  release: d
  distro: xenial
  type: noha
  openstack: pike
  sdncontroller:
  - type: nosdn
  storage:
  - type: ceph
    ## Define the largest disk available in your environment.
    disk: /dev/sdb
  feature: odl_l2
  ## Ensure the following matches the bridge configuration on your
  ## jumphost.
  spaces:
  - type: admin
    bridge: brAdm
    cidr: 10.120.0.0/24
    gateway: 10.120.0.254
    vlan:
  - type: floating
    bridge: brPublic
    cidr: 172.16.120.0/24
    gateway: 172.16.120.254

Next we will use 03-maasdeploy.sh in joid/ci to kick off the MAAS deployment.

3.3. Starting MAAS deployment

Now run the 03-maasdeploy.sh script with the environment you just created:

~/joid/ci$ ./03-maasdeploy.sh custom ../labconfig/myown/pod/labconfig.yaml

This will take approximately 30 minutes to a couple of hours depending on your environment. This script will do the following:

  1. Create 1 VM (KVM).
  2. Install MAAS on the Jumphost.
  3. Configure MAAS to enlist and commission a VM for the Juju bootstrap node.
  4. Configure MAAS to enlist and commission bare metal servers.
  5. Download and load 16.04 images to be used by MAAS.

When it’s done, you should be able to view the MAAS webpage (in our example http://172.16.50.2/MAAS) and see 1 bootstrap node and bare metal servers in the ‘Ready’ state on the nodes page.
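
As an optional sanity check, node states can also be queried from the MAAS CLI. The JSON handling is sketched below against an inline sample response so it can be read without a live MAAS; in practice you would pipe in the output of maas $PROFILE machines read:

```shell
# Count machines reported in the 'Ready' state. The sample JSON stands
# in for the output of `maas $PROFILE machines read`:
machines_json='[{"hostname":"node1","status_name":"Ready"},
 {"hostname":"node2","status_name":"Commissioning"}]'
ready=$(printf '%s' "$machines_json" | python3 -c \
  'import json,sys; print(sum(1 for m in json.load(sys.stdin) if m["status_name"] == "Ready"))')
echo "Machines in Ready state: $ready"
```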

3.4. Troubleshooting MAAS deployment

During the installation process, please carefully review the error messages.

Join the IRC channel #opnfv-joid on freenode to ask questions. After the issues are resolved, re-running 03-maasdeploy.sh will clean up the VMs created previously. There is no need to manually undo what’s been done.

3.5. Deploying OPNFV

JOID allows you to deploy different combinations of OpenStack release and SDN solution in HA or non-HA mode. For OpenStack, it supports Ocata and Pike. For SDN, it supports Open vSwitch, OpenContrail, OpenDaylight and ONOS (Open Network Operating System). In addition to HA or non-HA mode, it also supports deploying the latest from the development tree (tip).

The deploy.sh script in the joid/ci directory will do all the work for you. For example, the following deploys OpenStack Pike with Open vSwitch in HA mode:

~/joid/ci$  ./deploy.sh -o pike -s nosdn -t ha -l custom -f none -m openstack

Similarly, the following deploys Kubernetes with a load balancer on the pod:

~/joid/ci$  ./deploy.sh -m kubernetes -f lb

Take a look at the deploy.sh script. You will find we support the following for each option:

[-s]
  nosdn: Open vSwitch.
  odl: OpenDayLight Lithium version.
  opencontrail: OpenContrail.
  onos: ONOS framework as SDN.
[-t]
  noha: NO HA mode of OpenStack.
  ha: HA mode of OpenStack.
  tip: The tip of the development.
[-o]
  ocata: OpenStack Ocata version.
  pike: OpenStack Pike version.
[-l]
  default: For virtual deployment where installation will be done on KVM created using ./03-maasdeploy.sh
  custom: Install on bare metal OPNFV defined by labconfig.yaml
[-f]
  none: no special feature will be enabled.
  ipv6: IPv6 will be enabled for tenant in OpenStack.
  dpdk: dpdk will be enabled.
  lxd: virt-type will be lxd.
  dvr: DVR will be enabled.
  lb: Load balancing in case of Kubernetes will be enabled.
[-d]
  xenial: distro to be used is Xenial 16.04
[-a]
  amd64: Only x86 architecture will be used. Future version will support arm64 as well.
[-m]
  openstack: Openstack model will be deployed.
  kubernetes: Kubernetes model will be deployed.

The script will call 01-bootstrap.sh to bootstrap the Juju VM node, then it will call 02-deploybundle.sh with the corresponding parameter values.

./02-deploybundle.sh $opnfvtype $openstack $opnfvlab $opnfvsdn $opnfvfeature $opnfvdistro
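
As an illustration, the Scenario 1 invocation from the configuration guide (./deploy.sh -o pike -s nosdn -t ha -l custom -f none -d xenial -m openstack) maps onto these positional parameters as follows:

```shell
# Parameter mapping for Scenario 1 (values taken from the deploy.sh
# flags; variable names follow the 02-deploybundle.sh call above):
opnfvtype=ha; openstack=pike; opnfvlab=custom
opnfvsdn=nosdn; opnfvfeature=none; opnfvdistro=xenial
cmd="./02-deploybundle.sh $opnfvtype $openstack $opnfvlab $opnfvsdn $opnfvfeature $opnfvdistro"
echo "$cmd"
```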

The Python script GenBundle.py is used to create bundle.yaml based on the templates defined in the config_tpl/juju2/ directory.

By default debug is enabled in the deploy.sh script and error messages will be printed on the SSH terminal where you are running the scripts. It could take an hour to a couple of hours (maximum) to complete.

You can check the status of the deployment by running this command in another terminal:

$ watch juju status --format tabular

This will refresh the juju status output in tabular format every 2 seconds.

Next we will show you what Juju is deploying and to where, and how you can modify based on your own needs.

3.6. OPNFV Juju Charm Bundles

The magic behind Juju is a collection of software components called charms. They contain all the instructions necessary for deploying and configuring cloud-based services. The charms publicly available in the online Charm Store represent the distilled DevOps knowledge of experts.

A bundle is a set of services with a specific configuration and their corresponding relations that can be deployed together in a single step. Instead of deploying a single service, they can be used to deploy an entire workload, with working relations and configuration. The use of bundles allows for easy repeatability and for sharing of complex, multi-service deployments.

For OPNFV, we have created the charm bundles for each SDN deployment. They are stored in each directory in ~/joid/ci.

We use Juju to deploy a set of charms via a yaml configuration file. You can find the complete format guide for the Juju configuration file here: http://pythonhosted.org/juju-deployer/config.html

In the ‘services’ subsection, we deploy the Ubuntu Xenial charm from the Charm Store. You can deploy the same charm under a different name, such as the second service, ‘nodes-compute.’ The third service we deploy is named ‘ntp’ and is deployed from the NTP Trusty charm in the Charm Store. The NTP charm is a subordinate charm, which is designed for and deployed to the running space of another service unit.

The tag here is related to what we define in the deployment.yaml file for the MAAS. When ‘constraints’ is set, Juju will ask its provider, in this case MAAS, to provide a resource with the tags. In this case, Juju is asking one resource tagged with control and one resource tagged with compute from MAAS. Once the resource information is passed to Juju, Juju will start the installation of the specified version of Ubuntu.

In the next subsection, we define the relations between the services. The beauty of Juju and charms is you can define the relation of two services and all the service units deployed will set up the relations accordingly. This makes scaling out a very easy task. Here we add the relation between NTP and the two bare metal services.

Once the relations are established, Juju considers the deployment complete and moves on to the next section.

juju deploy bundles.yaml

This starts the deployment, processing each section of the bundle in turn, such as:

nova-cloud-controller:
  branch: lp:~openstack-charmers/charms/trusty/nova-cloud-controller/next
  num_units: 1
  options:
    network-manager: Neutron
  to:
    - "lxc:nodes-api=0"

We define a service name ‘nova-cloud-controller,’ which is deployed from the next branch of the nova-cloud-controller Trusty charm hosted on the Launchpad openstack-charmers team. The number of units to be deployed is 1. We set the network-manager option to ‘Neutron.’ This 1-service unit will be deployed to a LXC container at service ‘nodes-api’ unit 0.

To find out what other options there are for this particular charm, you can go to the code location at http://bazaar.launchpad.net/~openstack-charmers/charms/trusty/nova-cloud-controller/next/files and the options are defined in the config.yaml file.

Once the service unit is deployed, you can see the current configuration by running juju config:

$ juju config nova-cloud-controller

You can change the value with juju config, for example:

$ juju config nova-cloud-controller network-manager='FlatManager'

Charms encapsulate the operation best practices. The number of options you need to configure should be at the minimum. The Juju Charm Store is a great resource to explore what a charm can offer you. Following the nova-cloud-controller charm example, here is the main page of the recommended charm on the Charm Store: https://jujucharms.com/nova-cloud-controller/trusty/66

If you have any questions regarding Juju, please join the IRC channel #opnfv-joid on freenode for JOID related questions or #juju for general questions.

3.7. Testing Your Deployment

Once juju-deployer is complete, use juju status --format tabular to verify that all deployed units are in the ready state.

Find the openstack-dashboard IP address from the juju status output, and see if you can log in via a web browser. The username and password are admin/openstack.
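
One way to pull that address out of the status output is sketched below; a captured sample line stands in for juju status --format tabular, and the address shown is illustrative:

```shell
# Extract the public address of the openstack-dashboard unit from a
# tabular status line (field 5 holds the address in this format):
sample='openstack-dashboard/0*  active  idle  0/lxd/3  172.16.50.114  80/tcp'
dashboard_ip=$(printf '%s\n' "$sample" | awk '/^openstack-dashboard/ {print $5}')
echo "Horizon URL: http://${dashboard_ip}/horizon"
```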

Optionally, see if you can log in to the Juju GUI. The Juju GUI runs on the Juju bootstrap node, which is the second VM defined in the 03-maasdeploy.sh file. The username and password are admin/admin.

If you deploy OpenDaylight, OpenContrail or ONOS, find the IP address of the web UI and login. Please refer to each SDN bundle.yaml for the login username/password.

3.8. Troubleshooting

Logs are indispensable when it comes time to troubleshoot. If you want to see all the service unit deployment logs, you can run juju debug-log in another terminal. The debug-log command shows the consolidated logs of all Juju agents (machine and unit logs) running in the environment.

To view a single service unit’s deployment log, use juju ssh to access the deployed unit. For example, log in to the nova-compute unit and look at /var/log/juju/unit-nova-compute-0.log for more info:

$ juju ssh nova-compute/0

Example:

ubuntu@R4N4B1:~$ juju ssh nova-compute/0
Warning: Permanently added '172.16.50.60' (ECDSA) to the list of known hosts.
Warning: Permanently added '3-r4n3b1-compute.maas' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 3.13.0-77-generic x86_64)

* Documentation:  https://help.ubuntu.com/
<skipped>
Last login: Tue Feb  2 21:23:56 2016 from bootstrap.maas
ubuntu@3-R4N3B1-compute:~$ sudo -i
root@3-R4N3B1-compute:~# cd /var/log/juju/
root@3-R4N3B1-compute:/var/log/juju# ls
machine-2.log  unit-ceilometer-agent-0.log  unit-ceph-osd-0.log  unit-neutron-contrail-0.log  unit-nodes-compute-0.log  unit-nova-compute-0.log  unit-ntp-0.log
root@3-R4N3B1-compute:/var/log/juju#

NOTE: By default Juju will add the Ubuntu user keys for authentication into the deployed server and only ssh access will be available.

Once you resolve the error, go back to the jump host to rerun the charm hook with:

$ juju resolved --retry <unit>

If you would like to start over, run juju destroy-environment <environment name> to release the resources, then you can run deploy.sh again.

The following are the common issues we have collected from the community:

  • The right variables are not passed as part of the deployment procedure, e.g.:
./deploy.sh -o pike -s nosdn -t ha -l custom -f none
  • If you set up MAAS without 03-maasdeploy.sh, the ./clean.sh command could hang and the juju status command may hang, because the correct MAAS API keys are not present in the cloud listing for MAAS. Solution: make sure a MAAS cloud is listed by juju clouds and that the correct MAAS API key has been added.

  • Deployment times out:

    Use the command juju status --format=tabular and make sure all service containers receive an IP address and are executing code. Ensure there is no service in the error state.

  • In case the cleanup process hangs, run the juju destroy-model command manually.
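
For the time-out case, a quick way to spot units stuck in the error state is to filter the tabular status output; sample lines stand in for juju status --format tabular here:

```shell
# Print any unit whose workload status (second column) is 'error':
status='nova-compute/0  active  idle  1  172.16.50.60
neutron-api/0  error  idle  2  172.16.50.61'
failed=$(printf '%s\n' "$status" | awk '$2 == "error" {print $1}')
echo "Units in error state: ${failed:-none}"
```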

Direct console access via the OpenStack GUI can be quite helpful if you need to login to a VM but cannot get to it over the network. It can be enabled by setting the console-access-protocol in the nova-cloud-controller to vnc. One option is to directly edit the juju-deployer bundle and set it there prior to deploying OpenStack.

nova-cloud-controller:
  options:
    console-access-protocol: vnc

To access the console, just click on the instance in the OpenStack GUI and select the Console tab.

4. Post Installation Configuration
4.1. Configuring OpenStack

At the end of the deployment, the admin-openrc with OpenStack login credentials will be created for you. You can source the file and start configuring OpenStack via CLI.

~/joid_config$ cat admin-openrc
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_TENANT_NAME=admin
export OS_AUTH_URL=http://172.16.50.114:5000/v2.0
export OS_REGION_NAME=RegionOne

We have prepared some scripts to help you configure the OpenStack cloud that you just deployed. In each SDN directory, for example joid/ci/opencontrail, there is a ‘scripts’ folder where you can find them. These scripts are created to help you configure a basic OpenStack cloud and verify it. For more information on OpenStack cloud configuration, please refer to the OpenStack Cloud Administrator Guide: http://docs.openstack.org/user-guide-admin/. Similarly, for complete SDN configuration, please refer to the respective SDN administrator guide.

Each SDN solution requires slightly different setup. Please refer to the README in each SDN folder. Most likely you will need to modify the openstack.sh and cloud-setup.sh scripts for the floating IP range, private IP network, and SSH keys. Please go through openstack.sh, glance.sh and cloud-setup.sh and make changes as you see fit.

Let’s take a look at the scripts for Open vSwitch and briefly go through each one so you know what you need to change for your own environment.

~/joid/juju$ ls
configure-juju-on-openstack  get-cloud-images  joid-configure-openstack
4.1.1. openstack.sh

Let’s first look at ‘openstack.sh’. It defines three functions: configOpenrc(), unitAddress(), and unitMachine().

configOpenrc() {
  cat <<-EOF
      export SERVICE_ENDPOINT=$4
      unset SERVICE_TOKEN
      unset SERVICE_ENDPOINT
      export OS_USERNAME=$1
      export OS_PASSWORD=$2
      export OS_TENANT_NAME=$3
      export OS_AUTH_URL=$4
      export OS_REGION_NAME=$5
EOF
}

unitAddress() {
  if [[ "$jujuver" < "2" ]]; then
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"services\"][\"$1\"][\"units\"][\"$1/$2\"][\"public-address\"]" 2> /dev/null
  else
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"applications\"][\"$1\"][\"units\"][\"$1/$2\"][\"public-address\"]" 2> /dev/null
  fi
}

unitMachine() {
  if [[ "$jujuver" < "2" ]]; then
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"services\"][\"$1\"][\"units\"][\"$1/$2\"][\"machine\"]" 2> /dev/null
  else
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"applications\"][\"$1\"][\"units\"][\"$1/$2\"][\"machine\"]" 2> /dev/null
  fi
}

The function configOpenrc() creates the OpenStack login credentials, the function unitAddress() finds the IP address of the unit, and the function unitMachine() finds the machine info of the unit.

create_openrc() {
   keystoneIp=$(keystoneIp)
   if [[ "$jujuver" < "2" ]]; then
       adminPasswd=$(juju get keystone | grep admin-password -A 7 | grep value | awk '{print $2}' 2> /dev/null)
   else
       adminPasswd=$(juju config keystone | grep admin-password -A 7 | grep value | awk '{print $2}' 2> /dev/null)
   fi

   configOpenrc admin $adminPasswd admin http://$keystoneIp:5000/v2.0 RegionOne > ~/joid_config/admin-openrc
   chmod 0600 ~/joid_config/admin-openrc
}

This finds the IP address of keystone unit 0, writes the OpenStack admin credentials to a new file named ‘admin-openrc’ in the ‘~/joid_config/’ folder, and changes the file’s permissions. It’s important to change the credentials here if you use a different password in the deployment Juju charm bundle.yaml.

neutron net-show ext-net > /dev/null 2>&1 || neutron net-create ext-net \
                                               --router:external=True \
                                               --provider:network_type flat \
                                               --provider:physical_network physnet1

neutron subnet-show ext-subnet > /dev/null 2>&1 || neutron subnet-create ext-net \
                                               --name ext-subnet --allocation-pool start=$EXTNET_FIP,end=$EXTNET_LIP \
                                               --disable-dhcp --gateway $EXTNET_GW $EXTNET_NET

This section creates ext-net and ext-subnet, which define the range used for instance floating IPs.
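
The EXTNET_* variables come from the floating-ip-range entry in labconfig.yaml; a sketch of the mapping, assuming the field order first-ip,last-ip,gateway,cidr used in this guide:

```shell
# Split the labconfig floating-ip-range into the EXTNET_* variables
# consumed by the neutron commands above (sample range from this guide):
range="172.16.120.20,172.16.120.62,172.16.120.254,172.16.120.0/24"
IFS=',' read -r EXTNET_FIP EXTNET_LIP EXTNET_GW EXTNET_NET <<< "$range"
echo "pool ${EXTNET_FIP}-${EXTNET_LIP}, gateway ${EXTNET_GW}, net ${EXTNET_NET}"
```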

openstack congress datasource create nova "nova" \
 --config username=$OS_USERNAME \
 --config tenant_name=$OS_TENANT_NAME \
 --config password=$OS_PASSWORD \
 --config auth_url=http://$keystoneIp:5000/v2.0

This section creates the congress datasources for the various services. Each service datasource has an entry in the file.

4.1.2. get-cloud-images
folder=/srv/data/
sudo mkdir $folder || true

if grep -q 'virt-type: lxd' bundles.yaml; then
   URLS=" \
   http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-lxc.tar.gz \
   http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-root.tar.gz "

else
   URLS=" \
   http://cloud-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img \
   http://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-disk1.img \
   http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-disk1.img \
   http://mirror.catn.com/pub/catn/images/qcow2/centos6.4-x86_64-gold-master.img \
   http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2 \
   http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img "
fi

for URL in $URLS
do
FILENAME=${URL##*/}
if [ -f $folder/$FILENAME ];
then
   echo "$FILENAME already downloaded."
else
   wget  -O  $folder/$FILENAME $URL
fi
done

This section of the file downloads the images to the jumphost, if they are not already present, to be used with the OpenStack VIM.
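
The loop derives each local file name from its URL with bash parameter expansion, for example:

```shell
# ${URL##*/} strips everything up to and including the last '/':
URL="http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-disk1.img"
FILENAME=${URL##*/}
echo "$FILENAME"
```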

NOTE: The image downloading and uploading might take too long and time out. In this case, use juju ssh glance/0 to log in to the glance unit 0 and run the script again, or manually run the glance commands.

4.1.3. joid-configure-openstack
source ~/joid_config/admin-openrc

First, source the admin-openrc file.

# Upload images to glance
glance image-create --name="Xenial LXC x86_64" --visibility=public --container-format=bare --disk-format=root-tar --property architecture="x86_64" < /srv/data/xenial-server-cloudimg-amd64-root.tar.gz
glance image-create --name="Cirros LXC 0.3" --visibility=public --container-format=bare --disk-format=root-tar --property architecture="x86_64" < /srv/data/cirros-0.3.4-x86_64-lxc.tar.gz
glance image-create --name="Trusty x86_64" --visibility=public --container-format=ovf --disk-format=qcow2 < /srv/data/trusty-server-cloudimg-amd64-disk1.img
glance image-create --name="Xenial x86_64" --visibility=public --container-format=ovf --disk-format=qcow2 < /srv/data/xenial-server-cloudimg-amd64-disk1.img
glance image-create --name="CentOS 6.4" --visibility=public --container-format=bare --disk-format=qcow2 < /srv/data/centos6.4-x86_64-gold-master.img
glance image-create --name="Cirros 0.3" --visibility=public --container-format=bare --disk-format=qcow2 < /srv/data/cirros-0.3.4-x86_64-disk.img

Upload the images into Glance to be used for creating VMs.

# adjust tiny image
nova flavor-delete m1.tiny
nova flavor-create m1.tiny 1 512 8 1

Adjust the tiny image profile as the default tiny instance is too small for Ubuntu.

# configure security groups
neutron security-group-rule-create --direction ingress --ethertype IPv4 --protocol icmp --remote-ip-prefix 0.0.0.0/0 default
neutron security-group-rule-create --direction ingress --ethertype IPv4 --protocol tcp --port-range-min 22 --port-range-max 22 --remote-ip-prefix 0.0.0.0/0 default

Open up the ICMP and SSH access in the default security group.

# import key pair
keystone tenant-create --name demo --description "Demo Tenant"
keystone user-create --name demo --tenant demo --pass demo --email demo@demo.demo

nova keypair-add --pub-key id_rsa.pub ubuntu-keypair

Create a project called ‘demo’ and create a user called ‘demo’ in this project. Import the key pair.

# configure external network
neutron net-create ext-net --router:external --provider:physical_network external --provider:network_type flat --shared
neutron subnet-create ext-net --name ext-subnet --allocation-pool start=10.5.8.5,end=10.5.8.254 --disable-dhcp --gateway 10.5.8.1 10.5.8.0/24

This section configures an external network ‘ext-net’ with a subnet called ‘ext-subnet’. In this subnet, the IP pool starts at 10.5.8.5 and ends at 10.5.8.254. DHCP is disabled. The gateway is at 10.5.8.1, and the subnet is 10.5.8.0/24. These are the public IPs that will be requested and associated with instances. Please change the network configuration according to your environment.
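
As a quick sanity check on the pool definition (local arithmetic only), the allocation pool above runs from .5 to .254 inclusive:

```shell
# Number of usable floating IPs in the 10.5.8.5-10.5.8.254 pool:
pool_size=$(( 254 - 5 + 1 ))
echo "Floating IP pool size: $pool_size"
```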

# create vm network
neutron net-create demo-net
neutron subnet-create --name demo-subnet --gateway 10.20.5.1 demo-net 10.20.5.0/24

This section creates a private network for the instances. Please change accordingly.

neutron router-create demo-router
neutron router-interface-add demo-router demo-subnet
neutron router-gateway-set demo-router ext-net

This section creates a router and connects this router to the two networks we just created.

# create pool of floating ips
i=0
while [ $i -ne 10 ]; do
  neutron floatingip-create ext-net
  i=$((i + 1))
done

Finally, the script will request 10 floating IPs.

4.1.4. configure-juju-on-openstack

This script bootstraps Juju on OpenStack, so that Juju can then be used as a modelling tool to deploy services and VNFs on top of the OpenStack cloud that JOID deployed.

5. Appendix A: Single Node Deployment

By default, running the script ./03-maasdeploy.sh will automatically create the KVM VMs on a single machine and configure everything for you.

if [ ! -e ./labconfig.yaml ]; then
    virtinstall=1
    labname="default"
    cp ../labconfig/default/labconfig.yaml ./
    cp ../labconfig/default/deployconfig.yaml ./
fi

Please change joid/ci/labconfig/default/labconfig.yaml accordingly. The MAAS deployment script will do the following:

  1. Create the bootstrap VM.
  2. Install MAAS on the jumphost.
  3. Configure MAAS to enlist and commission a VM for the Juju bootstrap node.

Later, the 03-maasdeploy.sh script will create three additional VMs and register them into the MAAS server:

if [ "$virtinstall" -eq 1 ]; then
          sudo virt-install --connect qemu:///system --name $NODE_NAME --ram 8192 --cpu host --vcpus 4 \
                   --disk size=120,format=qcow2,bus=virtio,io=native,pool=default \
                   $netw $netw --boot network,hd,menu=off --noautoconsole --vnc --print-xml | tee $NODE_NAME

          nodemac=`grep  "mac address" $NODE_NAME | head -1 | cut -d '"' -f 2`
          sudo virsh -c qemu:///system define --file $NODE_NAME
          rm -f $NODE_NAME
          maas $PROFILE machines create autodetect_nodegroup='yes' name=$NODE_NAME \
              tags='control compute' hostname=$NODE_NAME power_type='virsh' mac_addresses=$nodemac \
              power_parameters_power_address='qemu+ssh://'$USER'@'$MAAS_IP'/system' \
              architecture='amd64/generic' power_parameters_power_id=$NODE_NAME
          nodeid=$(maas $PROFILE machines read | jq -r '.[] | select(.hostname == '\"$NODE_NAME\"').system_id')
          maas $PROFILE tag update-nodes control add=$nodeid || true
          maas $PROFILE tag update-nodes compute add=$nodeid || true

fi
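The jq filter in the script above extracts the system_id of the machine whose hostname matches. The same selection can be sketched in Python against a hypothetical sample of the `maas $PROFILE machines read` JSON (the real output carries many more fields):

```python
import json

# Hypothetical, trimmed-down sample of `maas $PROFILE machines read` output
machines_json = '[{"hostname": "node1", "system_id": "abc123"},' \
                ' {"hostname": "node2", "system_id": "def456"}]'

node_name = "node2"
# Equivalent of: jq -r '.[] | select(.hostname == "node2").system_id'
node_id = next(m["system_id"] for m in json.loads(machines_json)
               if m["hostname"] == node_name)
print(node_id)
```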
6. Appendix B: Automatic Device Discovery

If your bare metal servers support IPMI, they can be discovered and enlisted automatically by the MAAS server. You need to configure bare metal servers to PXE boot on the network interface where they can reach the MAAS server. With nodes set to boot from a PXE image, they will start, look for a DHCP server, receive the PXE boot details, boot the image, contact the MAAS server and shut down.

During this process, the MAAS server will be passed information about the node, including the architecture, MAC address and other details which will be stored in the database of nodes. You can accept and commission the nodes via the web interface. When the nodes have been accepted the selected series of Ubuntu will be installed.

7. Appendix C: Machine Constraints

Juju and MAAS together allow you to assign different roles to servers, so that hardware and software can be configured according to their roles. We have briefly mentioned and used this feature in our example. Please visit Juju Machine Constraints https://jujucharms.com/docs/stable/charms-constraints and MAAS tags https://maas.ubuntu.com/docs/tags.html for more information.

8. Appendix D: Offline Deployment

When you have a limited access policy in your environment, for example when only the Jump Host has Internet access but not the rest of the servers, JOID provides tools to support offline installation.

The following package set is provided for those wishing to experiment with a ‘disconnected from the internet’ setup when deploying JOID utilizing MAAS. These instructions provide basic guidance, but note that, due to the current reliance on MAAS and DNS, deployment behavior and success may vary depending on the infrastructure setup. An official guided setup is on the roadmap for the next release:

  1. Get the packages from here: https://launchpad.net/~thomnico/+archive/ubuntu/ubuntu-cloud-mirrors
NOTE: The mirror is quite large (700 GB) and does not mirror the SDN repo/ppa.
  2. Additionally, to make Juju use a private repository of charms instead of an external location, follow the guidance at the following link and configure environments.yaml to use cloudimg-base-url: https://github.com/juju/docs/issues/757

Opera

OPNFV Opera Overview
1. OPERA Project Overview

Since the OPNFV board expanded its scope to include NFV MANO last year, several upstream open source projects have been created to develop MANO solutions. Each solution has demonstrated its unique value in a specific area. The Open-Orchestrator (OPEN-O) project is one such community. Opera seeks to develop requirements for OPEN-O MANO support in the OPNFV reference platform, with the plan to eventually integrate OPEN-O in OPNFV as a non-exclusive upstream MANO. The project will benefit not only OPNFV and OPEN-O, but can also serve as a reference for other MANO integrations. In particular, this project is use case driven. Based on that, it will focus on the requirements on interfaces and data models for integration among the various components and the OPNFV platform. The requirements are designed to support integration of OPEN-O as NFVO with Juju as VNFM and OpenStack as VIM.

Currently OPNFV already includes upstream OpenStack as VIM, and Juju and Tacker are being considered as gVNFM by different OPNFV projects. OPEN-O, as the NFVO part of MANO, will interact with OpenStack and Juju. The key items required for the integration are described below.

key item

Fig 1. Key Item for Integration

2. Open-O components scoped for the integration

OPEN-O includes various components relevant to OPNFV MANO integration. The initial integration release will focus on NFV-O, Common Services and Common TOSCA. Other Open-O components will be gradually integrated into the OPNFV reference platform in later releases.

openo component

Fig 2. Deploy Overview

3. The vIMS is used as initial use case

The vIMS is the initial use case, based on which test cases will be created and aligned with the Open-O first release for the OPNFV D release.

  • Creating a scenario (os-nosdn-openo-ha) to integrate Open-O with OpenStack Newton.
  • Integrating with COMPASS as the installer and FuncTest as the testing framework.
  • Clearwater vIMS is used as the VNF; Juju is used as the VNFM.
  • Use Open-O as the orchestrator to deploy vIMS and run an end-to-end test with the following steps:
  1. deploy Open-O as the orchestrator
  2. create a tenant via Open-O in OpenStack
  3. deploy vIMS VNFs from the orchestrator based on the TOSCA blueprint and create the VNFs
  4. launch the test suite
  5. collect results and clean up
vIMS deploy

Fig 3. vIMS Deploy

OPNFV Opera Installation Instructions
1. Abstract

This document describes how to install Open-O in a deployed OpenStack environment using the Opera project.

2. Version history
Date        Ver.   Author                Comment
2017-02-16  0.0.1  Harry Huang (HUAWEI)  First draft
3. Opera Installation Instructions

This document provides guidelines on how to deploy a working Open-O environment using the Opera project.

The audience of this document is assumed to have good knowledge in OpenStack and Linux.

3.1. Preconditions

There are some preconditions to satisfy before starting the Opera deployment.

3.1.1. A functional OpenStack environment

OpenStack should be deployed before the Opera deployment.

3.1.2. Getting the deployment scripts

Retrieve the repository of Opera using the following command:
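The command itself is missing at this point in the guide; by analogy with the other OPNFV repository clones shown later in this document (an assumption, not confirmed here), it would be:

```shell
# Assumed clone command, following the gerrit.opnfv.org pattern used elsewhere
git clone https://gerrit.opnfv.org/gerrit/opera
```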

3.2. Machine requirements
  1. Ubuntu OS (pre-installed).
  2. Root access.
  3. Minimum 1 NIC (Internet access).
  4. CPU cores: 32.
  5. 64 GB free memory.
  6. 100 GB free disk.
3.3. Deploy Instruction

After the Opera deployment, Open-O dockers will be launched on the local server as the orchestrator, and a Juju VM will be launched on OpenStack as the VNFM.

3.3.1. Add OpenStack Admin Openrc file

Add the admin openrc file of your local OpenStack into the opera/conf directory with the name admin-openrc.sh.

3.3.2. Config open-o.yml

Set openo_version to specify the Open-O version.

Set openo_ip to specify an external IP used to access Open-O services. (If left unset, the local server’s external IP is used.)

Set the ports in openo_docker_net to specify Open-O’s exposed service ports.

Set enable_sdno to specify whether to use Open-O’s SDN-O services. (Setting this to false skips launching the Open-O SDN-O dockers and reduces deployment time.)

Set vnf_type to specify the VNF type to be deployed. (Currently only Clearwater is supported; if left unset, no VNF is deployed.)
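Pulling these settings together, a hypothetical open-o.yml might look as follows (key names come from the descriptions above; all values are illustrative assumptions, not taken from the Opera repository):

```yaml
# opera/conf/open-o.yml -- illustrative values only
openo_version: "1.0.0"      # Open-O release to deploy
openo_ip: 192.168.1.10      # external IP for Open-O services; omit to use this host's
openo_docker_net:
  ports: [8080, 8091]       # exposed Open-O service ports
enable_sdno: false          # skip SDN-O dockers to shorten deployment
vnf_type: clearwater        # only clearwater is currently supported; omit for none
```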

3.3.3. Run opera_launch.sh
./opera_launch.sh
OPNFV Opera Config Instructions
1. Config Guide
1.1. Add OpenStack Admin Openrc file

Add the admin openrc file of your local OpenStack into the opera/conf directory with the name admin-openrc.sh.

1.2. Config open-o.yml

Set openo_version to specify the Open-O version.

Set openo_ip to specify an external IP used to access Open-O services. (If left unset, the local server’s external IP is used.)

Set the ports in openo_docker_net to specify Open-O’s exposed service ports.

Set enable_sdno to specify whether to use Open-O’s SDN-O services. (Setting this to false skips launching the Open-O SDN-O dockers and reduces deployment time.)

Set vnf_type to specify the VNF type to be deployed. (Currently only Clearwater is supported; if left unset, no VNF is deployed.)

OPNFV Opera Design
1. OPERA Requirement and Design
  • Define Scenario OS-NOSDN-OPENO-HA and Integrate OPEN-O M Release with OPNFV D Release (with OpenStack Newton)

  • Integrate OPEN-O to OPNFV CI Process
    • Integrate automatic Open-O and Juju installation
  • Deploy Clearwater vIMS through OPEN-O
    • Test case to simulate SIP clients voice call
  • Integrate vIMS test scripts to FuncTest

2. OS-NOSDN-OPENO-HA Scenario Definition
2.1. Compass4NFV supports Open-O NFV Scenario
  • Scenario name: os-nosdn-openo-ha

  • Deployment: OpenStack + Open-O + JuJu

  • Setups:
    • Virtual deployment (one physical server as Jump Server with OS ubuntu)
    • Physical Deployment (one physical server as Jump Server, ubuntu + 5 physical Host Server)
deploy overview

Fig 1. Deploy Overview

3. Open-O is participating in the OPNFV CI Process
  • All steps are linked to OPNFV CI Process
  • Jenkins jobs remotely access OPEN-O NEXUS repository to fetch binaries
  • COMPASS deploys the scenario based on the OpenStack Newton release.
  • OPEN-O and JuJu installation scripts will be triggered in a Jenkins job after COMPASS finishes deploying OpenStack
  • Clearwater vIMS deploy scripts will be integrated into FuncTest
  • Clearwater vIMS test scripts will be integrated into FuncTest
opera ci

Fig 2. Opera CI

4. The vIMS is used as initial use case

The vIMS is the initial use case, based on which test cases will be created and aligned with the Open-O first release for the OPNFV D release.

  • Creating a scenario (os-nosdn-openo-ha) to integrate Open-O with OpenStack Newton.
  • Integrating with COMPASS as the installer and FuncTest as the testing framework.
  • Clearwater vIMS is used as the VNF; Juju is used as the VNFM.
  • Use Open-O as the orchestrator to deploy vIMS and run an end-to-end test with the following steps:
  1. deploy Open-O as the orchestrator
  2. create a tenant via Open-O in OpenStack
  3. deploy vIMS VNFs from the orchestrator based on the TOSCA blueprint and create the VNFs
  4. launch the test suite
  5. collect results and clean up
vIMS deploy

Fig 3. vIMS Deploy

5. Requirement and Tasks
5.1. OPERA Deployment Key idea
  • Keep OPEN-O deployment agnostic from an installer perspective (Apex, Compass, Fuel, Joid)
  • Breakdown deployments in single scripts (isolation)
  • Have OPNFV CI Process (Jenkins) control and monitor the execution
5.2. Tasks need to be done for OPNFV CD process
  1. Compass to deploy scenario of os-nosdn-openo-noha

  2. Automate OPEN-O installation (deployment) process

  3. Automate JuJu installation process

  4. Create vIMS TOSCA blueprint (for vIMS deployment)

  5. Automate vIMS package deployment (need helper/OPEN-O M)
    • (a) Jenkins to invoke the OPEN-O RESTful API to import & deploy the vIMS package
  6. Integrate scripts of step 2,3,4,5 with OPNFV CD Jenkins Job

5.3. FUNCTEST
  1. test case automation
    • (a) Invoke URL requests to vIMS services to verify that the deployment completed successfully.
  2. Integrate test scripts with FuncTest
    • (a)trigger these test scripts
    • (b)record test result to DB
functest

Fig 4. Functest

Parser

OPNFV Parser Installation Instruction
Parser tosca2heat Installation

Please follow the below installation steps to install tosca2heat submodule in parser.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Install the heat-translator sub project.

# uninstall pre-installed heat-translator
pip uninstall -y heat-translator

# change directory to heat-translator
cd parser/tosca2heat/heat-translator

# install requirements
pip install -r requirements.txt

# install heat-translator
python setup.py install

Step 3: Install the tosca-parser sub project.

# uninstall pre-installed tosca-parser
pip uninstall -y tosca-parser

# change directory to tosca-parser
cd parser/tosca2heat/tosca-parser

# install requirements
pip install -r requirements.txt

# install tosca-parser
python setup.py install

Notes: You must uninstall any pre-installed tosca-parser and heat-translator before installing these two components, and install heat-translator before tosca-parser, to make sure the OPNFV versions of tosca-parser and heat-translator are used rather than OpenStack’s components.

Parser yang2tosca Installation

Parser yang2tosca requires the following to be installed.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Clone pyang tool or download the zip file from the following link.

git clone https://github.com/mbj4668/pyang.git

OR

wget https://github.com/mbj4668/pyang/archive/master.zip

Step 3: Change directory to the downloaded directory and run the setup file.

cd pyang
python setup.py install
Step 4: Install python-lxml.

Please follow the below installation link. http://lxml.de/installation.html

Parser policy2tosca installation

Please follow the below installation steps to install parser - POLICY2TOSCA.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Install the policy2tosca module.

cd parser/policy2tosca
python setup.py install
Parser verigraph installation

In the present release, verigraph requires that the following software is also installed:

Please follow the below installation steps to install verigraph.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Go to the verigraph directory.

cd parser/verigraph

Step 3: Set up the execution environment, based on your operating system.

VeriGraph deployment on Apache Tomcat (Windows):

  • set the JAVA_HOME environment variable to where you installed the JDK (e.g. C:\Program Files\Java\jdk1.8.XYY);
  • set the CATALINA_HOME environment variable to the directory where you installed Apache Tomcat (e.g. C:\Program Files\Java\apache-tomcat-8.0.30);
  • open the file %CATALINA_HOME%\conf\tomcat-users.xml and, inside the tomcat-users tag, initialize a user with the roles “tomcat, manager-gui, manager-script”. An example is the following content:

    <role rolename="tomcat"/>
    <role rolename="role1"/>
    <user username="tomcat" password="tomcat" roles="tomcat,manager-gui"/>
    <user username="both" password="tomcat" roles="tomcat,role1"/>
    <user username="role1" password="tomcat" roles="role1"/>

  • edit the “to_be_defined” fields in tomcat-build.xml with the username and password previously configured in Tomcat (e.g. name="tomcatUsername" value="tomcat" and name="tomcatPassword" value="tomcat", the values set in tomcat-users.xml). Set the server.location property to the directory where you installed Apache Tomcat (e.g. C:\Program Files\Java\apache-tomcat-8.0.30);

VeriGraph deployment on Apache Tomcat (Unix):

  • sudo nano ~/.bashrc
  • set a few environment variables by pasting the following content at the end of the file:

    export CATALINA_HOME='/path/to/apache/tomcat/folder'
    export JRE_HOME='/path/to/jdk/folder'
    export JDK_HOME='/path/to/jdk/folder'

  • exec bash
  • open the file $CATALINA_HOME/conf/tomcat-users.xml and, inside the tomcat-users tag, initialize a user with the roles “tomcat, manager-gui, manager-script”. An example is the following content:

    <role rolename="tomcat"/>
    <role rolename="role1"/>
    <user username="tomcat" password="tomcat" roles="tomcat,manager-gui"/>
    <user username="both" password="tomcat" roles="tomcat,role1"/>
    <user username="role1" password="tomcat" roles="role1"/>

  • edit the “to_be_defined” fields in tomcat-build.xml with the username and password previously configured in Tomcat (e.g. name="tomcatUsername" value="tomcat" and name="tomcatPassword" value="tomcat", the values set in tomcat-users.xml). Set the server.location property to the directory where you installed Apache Tomcat (e.g. /path/to/apache-tomcat-8.0.30);

Step 4a: Deploy Verigraph in Tomcat.

ant -f build.xml deployWS

Use the Ant script build.xml to manage the Verigraph webservice with the following targets:

  • generate-war: generates the war file;
  • generate-binding: generates the JAXB classes from the XML Schema file xml_components.xsd;
  • start-tomcat: starts Apache Tomcat;
  • deployWS: deploys the verigraph.war file contained in the verigraph/war folder;
  • startWS: starts the webservice;
  • run-test: runs the tests in the tester folder. It is possible to choose the number of iterations for each verification request by launching the test with “-Diteration=n run-test”, where n is the number of iterations you want;
  • stopWS: stops the webservice;
  • undeployWS: undeploys the webservice from Apache Tomcat;
  • stop-tomcat: stops Apache Tomcat.

Step 4b: Deploy Verigraph with the gRPC interface.

ant -f build.xml generate-binding
ant -f gRPC-build.xml run-server

Use the Ant script gRPC-build.xml to manage Verigraph with the following targets:

  • build: compiles the program;
  • run: runs both client and server;
  • run-client: runs only the client;
  • run-server: runs only the server;
  • run-test: launches all tests that are present in the package.
Parser apigateway Installation

In the present release, apigateway requires that the following software is also installed:

Please follow the below installation steps to install apigateway submodule in parser.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Install the apigateway submodule.

# change directory to apigateway
cd parser/apigateway

# install requirements
pip install -r requirements.txt

# install apigateway
python setup.py install

Notes: In release D, the apigateway submodule contains only the initial framework code; more features will be provided in the next release.

OPNFV Parser Configuration Guide
Parser configuration

Parser can be used with any installer in current OPNFV; it only depends on OpenStack.

Pre-configuration activities

For parser, there are no specific pre-configuration activities.

Hardware configuration

For parser, no hardware configuration is needed for any current feature.

Feature configuration

For parser, there is no specific configuration on OpenStack.


OPNFV Parser User Guide
Parser tosca2heat Execution
nfv-heattranslator
There is only one way to call the nfv-heattranslator service: the CLI.

Step 1: Change directory to where the TOSCA yaml files are present; the example below uses the vRNC definition.

cd parser/tosca2heat/tosca-parser/toscaparser/extensions/nfv/tests/data/vRNC/Definitions

Step 2: Run the python command heat-translator with the TOSCA yaml file as an input option.

heat-translator --template-file=<input file> --template-type=tosca
                --output-file=<output hot file>

Example:

heat-translator --template-file=vRNC.yaml \
    --template-type=tosca --output-file=vRNC_hot.yaml

Notes: nfv-heattranslator will first call the ToscaTemplate class in nfv-toscaparser to validate and parse the input yaml file, then translate the file into a hot file.

nfv-toscaparser
Implementation of nfv-toscaparser, derived from the OpenStack tosca-parser, is based on the following OASIS specifications:
TOSCA Simple Profile YAML 1.2 Reference: http://docs.oasis-open.org/tosca/TOSCA-Simple-Profile-YAML/v1.2/TOSCA-Simple-Profile-YAML-v1.2.html
TOSCA Simple Profile YAML NFV 1.0 Reference: http://docs.oasis-open.org/tosca/tosca-nfv/v1.0/tosca-nfv-v1.0.html

There are three ways to call the nfv-toscaparser service: Python library, CLI and REST API.

CLI

The CLI is used to validate a TOSCA Simple Profile based service template. It can be used as:

tosca-parser --template-file=<path to the YAML template>  [--nrpv]  [--debug]
tosca-parser --template-file=<path to the CSAR zip file> [--nrpv]  [--debug]
tosca-parser --template-file=<URL to the template or CSAR>  [--nrpv]  [--debug]

options:
  --nrpv   Ignore input parameter validation when parsing the template.
  --debug  Debug mode: print more details rather than raising exceptions when errors happen.
Library(Python)

The API is used to parse a service template and get the result. It can be used as:

ToscaTemplate(path=None, parsed_params=None, a_file=True, yaml_dict_tpl=None,
              sub_mapped_node_template=None,
              no_required_paras_valid=False, debug=False)
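A minimal library usage sketch, assuming the nfv-toscaparser package from the installation guide is on the Python path; the template file name is a placeholder:

```python
# Requires the tosca-parser package installed per the Parser installation guide.
from toscaparser.tosca_template import ToscaTemplate

# Parse a service template, skipping required-input validation
# (the library equivalent of the CLI --nrpv option)
tosca = ToscaTemplate(path="vRNC.yaml", no_required_paras_valid=True)
for node in tosca.nodetemplates:
    print(node.name, node.type)
```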
REST API

The RESTful APIs are listed as follows:

List template versions

PATH: /v1/template_versions
METHOD: GET
Description: Lists all supported tosca template versions.

Response Codes

Success:
200 - OK. Request was successful.

Error:
400 - Bad Request. Some content in the request was invalid.
404 - Not Found. The requested resource could not be found.
500 - Internal Server Error. Something went wrong inside the service. This should not happen usually. If it does happen, it means the server has experienced some serious problems.

Request Parameters

None.

Response Parameters

template_versions (array): A list of tosca template version objects, each describing the type name and version information for a template version.
Validates a service template

PATH: /v1/validate
METHOD: POST
Description: Validates a service template.

Response Codes

Success:
200 - OK. Request was successful.

Error:
400 - Bad Request. Some content in the request was invalid.
500 - Internal Server Error. Something went wrong inside the service. This should not happen usually. If it does happen, it means the server has experienced some serious problems.

Request Parameters

environment (Optional, object): A JSON environment for the template service.

environment_files (Optional, object): An ordered list of names for environment files found in the files dict.

files (Optional, object): Supplies the contents of files referenced in the template or the environment. The value is a JSON object, where each key is a relative or absolute URI which serves as the name of a file, and the associated value provides the contents of the file. The following code shows the general structure of this parameter:

{
  ...
  "files": {
    "fileA.yaml": "Contents of the file",
    "file:///usr/fileB.template": "Contents of the file",
    "http://example.com/fileC.template": "Contents of the file"
  }
  ...
}

ignore_errors (Optional, string): List of comma separated error codes to ignore.

show_nested (Optional, boolean): Set to true to include the nested template service in the list.

template (Optional, object): The service template on which to perform the operation. This parameter is always provided as a string in the JSON request body. The content of the string is a JSON- or YAML-formatted service template. For example:

"template": {
  "tosca_definitions_version": "tosca_simple_yaml_1_0",
  ...
}

This parameter is required only when you omit the template_url parameter. If you specify both parameters, this value overrides the template_url parameter value.

template_url (Optional, string): A URI to the location containing the service template on which to perform the operation. See the description of the template parameter for information about the expected template content located at the URI. This parameter is required only when you omit the template parameter. If you specify both parameters, this parameter is ignored.

Request Example

{
  "template_url": "/PATH_TO_TOSCA_TEMPLATES/HelloWord_Instance.csar"
}

Response Parameters

Description (string): The description specified in the template.
Error Information (Optional, string): Error information.

Parse a service template

PATH: /v1/validate
METHOD: POST
Description: Parses a service template.
Response Codes: same as “Validates a service template”.
Request Parameters: same as “Validates a service template”.

Response Parameters

Description (string): The description specified in the template.
Input parameters (object): Input parameter list.
Service Template (object): Service template body.
Output parameters (object): Output parameter list.
Error Information (Optional, string): Error information.

Parser yang2tosca Execution

Step 1: Change directory to where the scripts are present.

cd parser/yang2tosca
Step 2: Copy the YANG file which needs to be converted into TOSCA to the
current (parser/yang2tosca) folder.

Step 3: Run the python script “parser.py” with the YANG file as an input option.

python parser.py -n "YANG filename"

Example:

python parser.py -n example.yaml
Step 4: Verify the TOSCA YAML file, which has been created with the same name
as the YANG file plus a “_tosca” suffix.
cat "YANG filename_tosca.yaml"

Example:

cat example_tosca.yaml
Parser policy2tosca Execution

Step 1: To see a list of commands available.

policy2tosca --help

Step 2: To see help for an individual command, include the command name on the command line

policy2tosca help <service>

Step 3: To inject/remove policy types/policy definitions provide the TOSCA file as input to policy2tosca command line.

policy2tosca <service> [arguments]

Example:

policy2tosca add-definition \
    --policy_name rule2 --policy_type  tosca.policies.Placement.Geolocation \
    --description "test description" \
    --properties region:us-north-1,region:us-north-2,min_inst:2 \
    --targets VNF2,VNF4 \
    --metadata "map of strings" \
    --triggers "1,2,3,4" \
    --source example.yaml

Step 4: Verify the TOSCA YAMl updated with the injection/removal executed.

cat "<source tosca file>"

Example:

cat example_tosca.yaml
Parser verigraph Execution

VeriGraph is accessible via both a RESTful API and a gRPC interface.

REST API

Step 1. Change directory to where the service graph examples are present

cd parser/verigraph/examples

Step 2. Use a REST client (e.g., cURL) to send a POST request (whose body is one of the JSON files in the directory)

curl -X POST -d @<file_name>.json \
     --header "Content-Type: application/json" \
     http://<server_address>:<server_port>/verify/api/graphs

Step 3. Use a REST client to send a GET request to check a reachability-based property between two nodes of the service graph created in the previous step.

curl -X GET "http://<server_addr>:<server_port>/verify/api/graphs/<graphID>/policy?source=<srcNodeID>&destination=<dstNodeID>&type=<propertyType>"

where:

  • <graphID> is the identifier of the service graph created at Step 2
  • <srcNodeID> is the name of the source node
  • <dstNodeID> is the name of the destination node
  • <propertyType> can be reachability, isolation or traversal

Step 4. The output is a JSON document with the overall result of the verification process and the partial result for each path that connects the source and destination nodes in the service graph.
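The query string in Step 3 can also be built programmatically, which avoids shell-quoting mistakes with the `&` separators; a small Python sketch with placeholder values (node names and server address are examples only):

```python
from urllib.parse import urlencode

# Placeholder values; substitute your server address, graph ID and node names
params = {"source": "srcNode", "destination": "dstNode", "type": "reachability"}
url = "http://localhost:8080/verify/api/graphs/1/policy?" + urlencode(params)
print(url)
```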

gRPC API

VeriGraph exposes a gRPC interface that is self-described by its Protobuf file (parser/verigraph/src/main/proto/verigraph.proto). In the current release, Verigraph lacks a module that receives service graphs in JSON format and sends the proper requests to the gRPC server. A testing client is provided as an example of how to create a service graph using the gRPC interface and trigger the verification step.

  1. Run the testing client
cd parser/verigraph
# Client source code in ``parser/verigraph/src/it/polito/verigraph/grpc/client/Client.java``
ant -f buildVeriGraph_gRPC.xml run-client

SDNVPN

7. SDN VPN

The BGPVPN feature enables creation of BGP VPNs on the Neutron API according to the OpenStack BGPVPN blueprint at Neutron Extension for BGP Based VPN.

In a nutshell, the blueprint defines a BGPVPN object and a number of ways to associate it with the existing Neutron object model, as well as a unique definition of the related semantics. The BGPVPN framework supports a backend driver model, with drivers currently available for Bagpipe, OpenContrail, Nuage and OpenDaylight. The OPNFV scenario makes use of the OpenDaylight driver and backend implementation through the ODL NetVirt project.

8. SDNVPN Testing Suite

An overview of the SDNVPN Test is depicted here. A more detailed description of each test case can be found at SDNVPN Testing.

8.1. Functest scenario specific tests
  • Test Case 1: VPN provides connectivity between subnets, using network association

    Name: VPN connecting Neutron networks and subnets. Description: VPNs provide connectivity across Neutron networks and subnets if configured accordingly. Test setup procedure: Set up VM1 and VM2 on Node1 and VM3 on Node2, all having ports in the same Neutron Network N1.

    Moreover, all ports have 10.10.10/24 addresses (this subnet is denoted SN1 in the following). Set up VM4 on Node1 and VM5 on Node2, both having ports in Neutron Network N2. Moreover, all ports have 10.10.11/24 addresses (this subnet is denoted SN2 in the following).

    Test execution:
    • Create VPN1 with eRT<>iRT (so that connected subnets should not reach each other)
    • Associate SN1 to VPN1
    • Ping from VM1 to VM2 should work
    • Ping from VM1 to VM3 should work
    • Ping from VM1 to VM4 should not work
    • Associate SN2 to VPN1
    • Ping from VM4 to VM5 should work
    • Ping from VM1 to VM4 should not work (disabled until isolation fixed upstream)
    • Ping from VM1 to VM5 should not work (disabled until isolation fixed upstream)
    • Change VPN 1 so that iRT=eRT
    • Ping from VM1 to VM4 should work
    • Ping from VM1 to VM5 should work
  • Test Case 2: Tenant separation

    Name: Using VPNs for tenant separation Description: Using VPNs to isolate tenants so that overlapping IP address ranges can be used

    Test setup procedure:
    • Set up VM1 and VM2 on Node1 and VM3 on Node2, all having ports in the same Neutron Network N1.
    • VM1 and VM2 have IP addresses in a subnet SN1 with range 10.10.10/24
    • VM1: 10.10.10.11, running an HTTP server which returns “I am VM1” for any HTTP request (or something other than an HTTP server)
    • VM2: 10.10.10.12, running an HTTP server which returns “I am VM2” for any HTTP request
    • VM3 has an IP address in a subnet SN2 with range 10.10.11/24
    • VM3: 10.10.11.13, running an HTTP server which returns “I am VM3” for any HTTP request
    • Set up VM4 on Node1 and VM5 on Node2, both having ports in Neutron Network N2
    • VM4 has an address in a subnet SN1b with range 10.10.10/24
    • VM4: 10.10.10.12 (the same as VM2), running an HTTP server which returns “I am VM4” for any HTTP request
    • VM5 has an address in a subnet SN2b with range 10.10.11/24
    • VM5: 10.10.11.13 (the same as VM3), running an HTTP server which returns “I am VM5” for any HTTP request
    Test execution:
    • Create VPN 1 with iRT=eRT=RT1 and associate N1 to it
    • HTTP from VM1 to VM2 and VM3 should work: it returns “I am VM2” and “I am VM3” respectively
    • HTTP from VM1 to VM4 and VM5 should not work: it never returns “I am VM4” or “I am VM5”
    • Create VPN2 with iRT=eRT=RT2 and associate N2 to it
    • HTTP from VM4 to VM5 should work: it returns “I am VM5”
    • HTTP from VM4 to VM1 and VM3 should not work: it never returns “I am VM1” or “I am VM3”
  • Test Case 3: Data Center Gateway integration

    Name: Data Center Gateway integration
    Description: Investigate the peering functionality of the BGP protocol, using a Zrpcd/Quagga router and the OpenDaylight Controller

    Test setup procedure:
    • Search in the pool of nodes and find one Compute node and one Controller node that have the OpenDaylight controller running
    • Start an instance using ubuntu-16.04-server-cloudimg-amd64-disk1.img image and in it run the Quagga setup script
    • Start bgp router in the Controller node, using odl:configure-bgp
    Test execution:
    • Set up a Quagga instance in a nova compute node
    • Start a BGP router with OpenDaylight in a controller node
    • Add the Quagga running in the instance as a neighbor
    • Check that bgpd is running
    • Verify that the OpenDaylight and gateway Quagga peer each other
    • Start an instance in a second nova compute node and connect it to a new network (Network 3-3).
    • Create a bgpvpn (including the parameters route-distinguisher and route-targets) and associate it with the created network
    • Define the same route-distinguisher and route-targets on the simulated Quagga side
    • Check that the routes from Network 3-3 are advertised towards the simulated Quagga VM
  • Test Case 4: VPN provides connectivity between subnets using router association

    Functest: variant of Test Case 1.
    • Set up a Router R1 with one connected network/subnet N1/S1.
    • Set up a second network N2.
    • Create VPN1 and associate Router R1 and Network N2 to it.
    • Hosts from N2 should be able to reach hosts in N1.
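    A hedged sketch of this variant with the networking-bgpvpn CLI; the router and network IDs are placeholders, and `run` only echoes the commands:

```shell
# Router-association variant (placeholder IDs, echoed commands only).
run() { echo "+ $*"; }
R1_id="<router-R1-id>"        # assumption: ID of router R1 (owns N1/S1)
net_2_id="<network-N2-id>"    # assumption: ID of network N2

run neutron bgpvpn-create --route-targets 100:100 --name VPN1
run neutron bgpvpn-router-assoc-create VPN1 --router "$R1_id"
run neutron bgpvpn-net-assoc-create VPN1 --network "$net_2_id"
# Hosts in N2 should now be able to reach hosts in N1 through VPN1
```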

    Name: VPN connecting Neutron networks and subnets using router association
    Description: VPNs provide connectivity across Neutron networks and subnets if configured accordingly.

    Test setup procedure:
    • Set up VM1 and VM2 on Node1 and VM3 on Node2
    • All VMs have ports in the same Neutron Network N1 and 10.10.10/24 addresses (this subnet is denoted SN1 in the following)
    • N1/SN1 are connected to router R1.
    • Set up VM4 on Node1 and VM5 on Node2
    • Both VMs have ports in Neutron Network N2 and 10.10.11/24 addresses (this subnet is denoted SN2 in the following)
    Test execution:
    • Create VPN1 with eRT<>iRT (so that connected subnets should not reach each other)
    • Associate R1 to VPN1. Ping from VM1 to VM2 should work. Ping from VM1 to VM3 should work. Ping from VM1 to VM4 should not work.
    • Associate SN2 to VPN1. Ping from VM4 to VM5 should work. Ping from VM1 to VM4 should not work. Ping from VM1 to VM5 should not work.
    • Change VPN1 so that iRT=eRT. Ping from VM1 to VM4 should work. Ping from VM1 to VM5 should work.
  • Test Case 7 - Network associate a subnet with a router attached to a VPN and verify floating IP functionality (disabled, because of ODL Bug 6962)

    A test for https://bugs.opendaylight.org/show_bug.cgi?id=6962

    Setup procedure:
    • Create VM1 in a subnet with a router attached.
    • Create VM2 in a different subnet with another router attached.
    • Network associate them to a VPN with iRT=eRT
    • Ping from VM1 to VM2 should work
    • Assign a floating IP to VM1
    • Pinging the floating IP should work
  • Test Case 8 - Router associate a subnet with a router attached to a VPN and verify floating IP functionality

    Setup procedure:
    • Create VM1 in a subnet with a router which is connected to the gateway
    • Create VM2 in a different subnet without a router attached.
    • Associate the two networks to a VPN with iRT=eRT
    • One via router association, the other via network association
    • Try to ping from one VM to the other
    • Assign a floating IP to the VM in the router assoc network
    • Ping it
  • Test Case 9 - Check fail mode in OVS br-int interfaces

    This test case checks that the fail mode is always ‘secure’. To accomplish this, a check is performed on all OVS br-int interfaces, for all OpenStack nodes. The test case is considered successful if all OVS br-int interfaces have fail_mode=secure
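    The check reduces to a small helper; gathering the mode from each node (via ssh and ovs-vsctl) is shown only in comments, since node addresses are deployment-specific:

```shell
# A node passes only if its br-int fail mode is exactly "secure".
check_fail_mode() {
    [ "$1" = "secure" ]
}

# On a real deployment the mode would be gathered per node, e.g.:
#   for node in $NODES; do   # NODES: assumed, site-specific node list
#       mode=$(ssh "$node" ovs-vsctl get-fail-mode br-int)
#       check_fail_mode "$mode" || echo "FAIL: $node br-int is '$mode'"
#   done
```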

  • Test Case 10 - Check the communication between a group of VMs

    This testcase investigates if communication between a group of VMs is interrupted upon deletion and creation of VMs inside this group.

    Test case flow:
    • Create 3 VMs: VM_1 on compute 1, VM_2 on compute 1, VM_3 on compute 2.
    • All VMs ping each other.
    • VM_2 is deleted.
    • Traffic still flows between VM_1 and VM_3.
    • A new VM, VM_4 is added to compute 1.
    • Traffic is not interrupted and VM_4 can be reached as well.
  • Testcase 11: test Opendaylight resync and group_add_mod feature mechanisms

    This test case exercises the Opendaylight resync and group_add_mod functionalities

    Sub-testcase 11-1:
    • Create and start 2 VMs, connected to a common Network. New groups should appear in OVS dump
    • OVS disconnects and the VMs and the networks are cleaned up. The new groups are still in the OVS dump, because OVS is no longer connected and is therefore not notified that the groups are deleted
    • OVS re-connects. The new groups should be deleted, as Opendaylight fully resyncs the groups and should remove them since the VMs are deleted.
    Sub-testcase 11-2:
    • Create and start 2 VMs, connected to a common Network. New groups should appear in OVS dump
    • OVS disconnects. The new groups are still in the OVS dump, because OVS is no longer connected and is therefore not notified that the groups are deleted
    • OVS re-connects. The new groups should still be there, as the topology remains. Opendaylight Carbon’s group_add_mod mechanism should handle the already existing group.
  • Testcase 12: Test Resync mechanism between Opendaylight and OVS

    This test case validates that flows and groups are programmed correctly after a resync, which is triggered by the OVS del-controller/set-controller commands and by adding/removing an iptables drop rule on OF port 6653.

    Sub-testcase 12-1:
    • Create and start 2 VMs, connected to a common Network. New flows and groups were added to OVS
    • Reconnect the OVS by running the del-controller and set-controller commands. The flows and groups are still intact and none of the flows/groups are removed
    • Reconnect the OVS by adding an iptables drop rule and then removing it. The flows and groups are still intact and none of the flows/groups are removed
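    The two resync triggers can be expressed with standard ovs-vsctl and iptables commands. The controller address is a placeholder and `run` echoes rather than executes, so the sketch can be reviewed before use on a compute node:

```shell
# Resync triggers for Testcase 12 (echoed commands; placeholder ODL address).
run() { echo "+ $*"; }
ODL_IP="<odl-controller-ip>"   # assumption: OpenDaylight controller address

# Trigger 1: drop and re-add the controller on br-int
run sudo ovs-vsctl del-controller br-int
run sudo ovs-vsctl set-controller br-int "tcp:$ODL_IP:6653"

# Trigger 2: block and then unblock the OpenFlow port with iptables
run sudo iptables -A OUTPUT -p tcp --dport 6653 -j DROP
run sudo iptables -D OUTPUT -p tcp --dport 6653 -j DROP

# After each trigger, dump flows/groups and verify nothing was lost
run sudo ovs-ofctl -O OpenFlow13 dump-flows br-int
run sudo ovs-ofctl -O OpenFlow13 dump-groups br-int
```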
  • Testcase 13: Test ECMP (Equal-cost multi-path routing) for the extra route

    This test case validates the spraying behavior in OVS when an extra route is configured such that it can be reached from two nova VMs in the same network.

    Setup procedure:
    • Create and start VM1 and VM2, each configured with a sub-interface set to the same IP address, connected to a common network/router.
    • Update VM1’s and VM2’s Neutron ports with allowed address pairs for the sub-interface IP/MAC addresses.
    • Create a BGPVPN with two route distinguishers.
    • Associate the router with the BGPVPN.
    • Update the router with the above sub-interface IP address, with nexthops set to the VMs’ IP addresses.
    • Create VM3 connected to the same network.
    • Ping the sub-interface IP address from VM3.
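    The allowed-address-pairs step can be sketched as follows; the shared IP and port IDs are placeholders and `run` only echoes the commands:

```shell
# Allowed-address-pairs update for Testcase 13 (placeholder IDs, echoed).
run() { echo "+ $*"; }
VIP="10.10.10.100"                # assumption: shared sub-interface IP
VM1_port="<vm1-port-id>"          # assumption: Neutron port ID of VM1
VM2_port="<vm2-port-id>"          # assumption: Neutron port ID of VM2

run neutron port-update "$VM1_port" \
    --allowed-address-pairs type=dict list=true ip_address="$VIP"
run neutron port-update "$VM2_port" \
    --allowed-address-pairs type=dict list=true ip_address="$VIP"
```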
6. SDN VPN
6.1. Introduction

This document provides an overview of how to work with the SDN VPN features in OPNFV.

6.2. Feature and API usage guidelines and example

For the details of using OpenStack BGPVPN API, please refer to the documentation at http://docs.openstack.org/developer/networking-bgpvpn/.

6.2.1. Example

In this example we show a BGPVPN associated to 2 neutron networks. The BGPVPN is configured with import and export route targets such that it imports its own routes. The outcome is that VMs sitting on these two networks have full L3 connectivity.

Some defines:

net_1="Network1"
net_2="Network2"
subnet_net1="10.10.10.0/24"
subnet_net2="10.10.11.0/24"

Create neutron networks and save network IDs:

rv=$(neutron net-create --provider:network_type=local $net_1)
export net_1_id=`echo "$rv" | grep " id " |awk '{print $4}'`
rv=$(neutron net-create --provider:network_type=local $net_2)
export net_2_id=`echo "$rv" | grep " id " |awk '{print $4}'`

Create neutron subnets:

neutron subnet-create $net_1 --disable-dhcp $subnet_net1
neutron subnet-create $net_2 --disable-dhcp $subnet_net2

Create BGPVPN:

neutron bgpvpn-create --route-distinguishers 100:100 --route-targets 100:2530 --name L3_VPN

Start VMs on both networks:

nova boot --flavor 1 --image <some-image> --nic net-id=$net_1_id vm1
nova boot --flavor 1 --image <some-image> --nic net-id=$net_2_id vm2

The VMs should not be able to see each other.

Associate to Neutron networks:

neutron bgpvpn-net-assoc-create L3_VPN --network $net_1_id
neutron bgpvpn-net-assoc-create L3_VPN --network $net_2_id

Now the VMs should be able to ping each other.

6.3. Troubleshooting

Check neutron logs on the controller:

tail -f /var/log/neutron/server.log |grep -E "ERROR|TRACE"

Check Opendaylight logs:

tail -f /opt/opendaylight/data/logs/karaf.log

Restart Opendaylight:

service opendaylight restart
4. SDN VPN feature installation
4.1. Hardware requirements

The SDNVPN scenarios can be deployed as a bare-metal environment or as a virtual environment on a single host.

4.1.1. Bare metal deployment on Pharos Lab

Hardware requirements for bare-metal deployments of the OPNFV infrastructure are specified by the Pharos project. The Pharos project provides an OPNFV hardware specification for configuring your hardware at: http://artifacts.opnfv.org/pharos/docs/pharos-spec.html.

4.1.2. Virtual deployment on a single server

To perform a virtual deployment of an OPNFV scenario on a single host, that host has to meet the hardware requirements outlined in the <missing spec>.

When ODL is used as an SDN Controller in an OPNFV virtual deployment, ODL is running on the OpenStack Controller VMs. It is therefore recommended to increase the amount of resources for these VMs.

Our recommendation is to have 2 additional virtual cores and 8GB additional virtual memory on top of the normally recommended configuration.

Together with the commonly used recommendation this sums up to:

6 virtual CPU cores
16 GB virtual memory

The installation section below has more details on how to configure this.

4.2. Installation using Fuel installer
4.3. Preparing the host to install Fuel by script

Before starting the installation of the os-odl-bgpvpn scenario, some preparation of the machine that will host the Fuel VM must be done.

4.3.1. Installation of required packages

To be able to run the basic OPNFV Fuel installation, the Jumphost (or the host which serves the VMs for the virtual deployment) needs the following packages installed:

sudo apt-get install -y git make curl libvirt-bin libpq-dev qemu-kvm \
                        qemu-system tightvncserver virt-manager sshpass \
                        fuseiso genisoimage blackbox xterm python-pip \
                        python-git python-dev python-oslo.config \
                        python-pip python-dev libffi-dev libxml2-dev \
                        libxslt1-dev libffi-dev libxml2-dev libxslt1-dev \
                        expect curl python-netaddr p7zip-full

sudo pip install GitPython pyyaml netaddr paramiko lxml scp \
                 python-novaclient python-neutronclient python-glanceclient \
                 python-keystoneclient debtcollector netifaces enum
4.3.2. Download the source code and artifact

To install the scenario os-odl-bgpvpn, one can follow the way CI deploys the scenario. First of all, the opnfv-fuel repository needs to be cloned:

git clone ssh://<user>@gerrit.opnfv.org:29418/fuel

To check out a specific version of OPNFV, checkout the appropriate branch:

cd fuel
git checkout stable/gambia

Now download the corresponding OPNFV Fuel ISO into an appropriate folder from the website https://www.opnfv.org/software/downloads/release-archives

Keep in mind that the Fuel repo version needs to match the downloaded artifact. Note: it is also possible to build the Fuel image using the tools found in the Fuel git repository, but this is out of scope of the procedure described here. Check the Fuel project documentation for more information on building the Fuel ISO.

4.4. Simplified scenario deployment procedure using Fuel

This section describes the installation of the os-odl-bgpvpn-ha or os-odl-bgpvpn-noha OPNFV reference platform stack across a server cluster or a single host as a virtual deployment.

4.4.1. Scenario Preparation

dea.yaml and dha.yaml need to be copied and changed according to the lab-name/host where you deploy. Copy the full lab config from:

cp <path-to-opnfv-fuel-repo>/deploy/config/labs/devel-pipeline/elx \
   <path-to-opnfv-fuel-repo>/deploy/config/labs/devel-pipeline/<your-lab-name>

Add at the bottom of dha.yaml

disks:
  fuel: 100G
  controller: 100G
  compute: 100G

define_vms:
  controller:
    vcpu:
      value: 4
    memory:
      attribute_equlas:
        unit: KiB
      value: 16388608
    currentMemory:
      attribute_equlas:
        unit: KiB
      value: 16388608

Check if the default settings in dea.yaml are in line with your intentions and make changes as required.

4.4.2. Installation procedures

We describe several alternative procedures in the following. First, we describe several methods that are based on the deploy.sh script, which is also used by the OPNFV CI system. It can be found in the Fuel repository.

In addition, the SDNVPN feature can also be configured manually in the Fuel GUI. This is described in the last subsection.

Before starting any of the following procedures, go to

cd <opnfv-fuel-repo>/ci
4.4.2.1. Full automatic virtual deployment High Availability Mode

The following command will deploy the high-availability flavor of SDNVPN scenario os-odl-bgpvpn-ha in a fully automatic way, i.e. all installation steps (Fuel server installation, configuration, node discovery and platform deployment) will take place without any further prompt for user input.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> -s os-odl_l2-bgpvpn-ha -i file://<path-to-fuel-iso>
4.4.2.2. Full automatic virtual deployment NO High Availability Mode

The following command will deploy the SDNVPN scenario in its non-high-availability flavor (note the different scenario name for the -s switch). Otherwise it does the same as described above.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> -s os-odl_l2-bgpvpn-noha -i file://<path-to-fuel-iso>
4.4.2.3. Automatic Fuel installation and manual scenario deployment

A useful alternative to the full automatic procedure is to only autodeploy the Fuel host and to run host selection, role assignment and SDNVPN scenario configuration manually.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> -s os-odl_l2-bgpvpn-ha -i file://<path-to-fuel-iso> -e

With the -e option the installer does not launch environment deployment, so a user can make modifications before the scenario is actually deployed. Another interesting option is the -f option, which deploys the scenario using an existing Fuel host.

The result of this installation is a Fuel server with the right config for BGPVPN. Now the deploy button on the Fuel dashboard can be used to deploy the environment. It is also possible to do the configuration manually.

4.4.2.4. Feature configuration on existing Fuel

If a Fuel server is already provided but the Fuel plugins for Opendaylight, Openvswitch and BGPVPN are not installed, install them with:

cd /opt/opnfv/
fuel plugins --install fuel-plugin-ovs-*.noarch.rpm
fuel plugins --install opendaylight-*.noarch.rpm
fuel plugins --install bgpvpn-*.noarch.rpm

If the plugins are already installed and you want to update them, use the --force flag.

Now the feature can be configured. Create a new environment with “Neutron with ML2 plugin” and in there “Neutron with tunneling segmentation”. Go to Networks/Settings/Other and check “Assign public network to all nodes”. This is required for features such as floating IP, which require the Compute hosts to have public interfaces. Then go to settings/other and check “OpenDaylight plugin”, “Use ODL to manage L3 traffic”, “BGPVPN plugin” and set the OpenDaylight package version to “5.2.0-1”. Then you should be able to check “BGPVPN extensions” in OpenDaylight plugin section.

Now the deploy button on fuel dashboard can be used to deploy the environment.

4.5. Virtual deployment using Apex installer
4.5.1. Prerequisites

For a virtual Apex deployment a host with CentOS 7 is needed. This installation was tested on centos-release-7-2.1511.el7.centos.2.10.x86_64; however, any other CentOS 7 version should be fine.

4.5.2. Build and Deploy

Download the Apex repo from opnfv gerrit and checkout stable/gambia:

git clone ssh://<user>@gerrit.opnfv.org:29418/apex
cd apex
git checkout stable/gambia

In apex/contrib you will find simple_deploy.sh:

#!/bin/bash
set -e
apex_home=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/../
export CONFIG=$apex_home/build
export LIB=$apex_home/lib
export RESOURCES=$apex_home/.build/
export PYTHONPATH=$PYTHONPATH:$apex_home/lib/python
$apex_home/ci/dev_dep_check.sh || true
$apex_home/ci/clean.sh
pushd $apex_home/build
make clean
make undercloud
make overcloud-opendaylight
popd
pushd $apex_home/ci
echo "All further output will be piped to $PWD/nohup.out"
(nohup ./deploy.sh -v -n $apex_home/config/network/network_settings.yaml -d $apex_home/config/deploy/os-odl_l3-nofeature-noha.yaml &)
tail -f nohup.out
popd

This script will:

  • “dev_dep_check.sh”: install all required packages
  • “clean.sh”: clean existing deployments
  • “make clean”: clean existing builds
  • “make undercloud”: build the undercloud image
  • “make overcloud-opendaylight”: build the overcloud image and convert it to an overcloud-with-opendaylight image
  • “deploy.sh”: deploy the os-odl_l3-nofeature-noha.yaml scenario

Edit the script and change the scenario to os-odl-bgpvpn-noha.yaml. More scenarios can be found in ./apex/config/deploy/.

Execute the script in its own screen session:

yum install -y screen
screen -S deploy
bash ./simple_deploy.sh
4.5.2.1. Accessing the undercloud

Determine the MAC address of the undercloud VM:

# virsh domiflist undercloud
-> Default network
Interface  Type       Source     Model       MAC
-------------------------------------------------------
vnet0      network    default    virtio      00:6a:9d:24:02:31
vnet1      bridge     admin      virtio      00:6a:9d:24:02:33
vnet2      bridge     external   virtio      00:6a:9d:24:02:35
# arp -n |grep 00:6a:9d:24:02:31
192.168.122.34           ether   00:6a:9d:24:02:31   C                     virbr0
# ssh stack@192.168.122.34
-> no password needed (password stack)

List overcloud deployment info:

# source stackrc
# # Compute and controller:
# nova list
# # Networks
# neutron net-list

List overcloud openstack info:

# source overcloudrc
# nova list
# ...
4.5.2.2. Access the overcloud hosts

On the undercloud:

# . stackrc
# nova list
# ssh heat-admin@<ip-of-host>
-> there is no password; the user has direct sudo rights.

SFC

1. Service Function Chaining (SFC)
1.1. Requirements

This section defines requirements for the initial OPNFV SFC implementation, including those requirements driving upstream project enhancements.

1.1.1. Minimal Viable Requirement

Deploy a complete SFC solution by integrating OpenDaylight SFC with OpenStack in an OPNFV environment.

1.1.2. Detailed Requirements

These are the Fraser specific requirements:

1. The supported Service Chaining encapsulation will be NSH VXLAN-GPE.
2. The version of OVS used must support NSH.
3. The SF VM life cycle will be managed by the Tacker VNF Manager.
4. The supported classifier is OpenDaylight NetVirt.
5. ODL will be the OpenStack Neutron backend and will handle all networking on the compute nodes.
6. Tacker will use the networking-sfc API to configure ODL.
7. ODL will use flow based tunnels to create the VXLAN-GPE tunnels.
1.1.3. Long Term Requirements

These requirements are out of the scope of the Fraser release.

1. Dynamic movement of SFs across multiple Compute nodes.
2. Load Balancing across multiple SFs.
3. Support of a different MANO component apart from Tacker.

1. SFC installation and configuration instruction
1.1. Abstract

This document provides information on how to install the OpenDaylight SFC features in OPNFV with the use of the os_odl-sfc-(no)ha scenario.

1.2. SFC feature description

For details of the scenarios and their provided capabilities refer to the scenario description documents:

  • <os-odl-sfc-ha>
  • <os-odl-sfc-noha>

The SFC feature enables the creation of Service Function Chains - an ordered list of chained network functions (e.g. firewalls, NAT, QoS)

The SFC feature in OPNFV is implemented by 3 major components:

  • OpenDaylight SDN controller
  • Tacker: Generic VNF Manager (VNFM) and a NFV Orchestrator (NFVO)
  • OpenvSwitch: The Service Function Forwarder(s)
1.3. Hardware requirements

The SFC scenarios can be deployed on a bare-metal OPNFV cluster or on a virtual environment on a single host.

1.3.1. Bare metal deployment on (OPNFV) Pharos lab

Hardware requirements for bare-metal deployments of the OPNFV infrastructure are given by the Pharos project. The Pharos project provides an OPNFV hardware specification for configuring your hardware: http://artifacts.opnfv.org/pharos/docs/pharos-spec.html

1.3.2. Virtual deployment

SFC scenarios can be deployed using the APEX installer and the XCI utility. Check their requirements in order to be able to deploy OPNFV-SFC:

Apex: https://wiki.opnfv.org/display/apex/Apex XCI: https://wiki.opnfv.org/display/INF/XCI+Developer+Sandbox

3. SFC User Guide
3.1. SFC description

The OPNFV SFC feature creates service chains and classifiers, and creates VMs for Service Functions, allowing client traffic intended for a server to first traverse the provisioned service chain.

The Service Chain creation consists of configuring the OpenDaylight SFC feature. This configuration will in turn configure Service Function Forwarders to route traffic to Service Functions. A Service Function Forwarder in the context of OPNFV SFC is the “br-int” OVS bridge on an OpenStack compute node.

The classifier(s) consist of configuring the OpenDaylight Netvirt feature. Netvirt is a Neutron backend which handles the networking for VMs. Netvirt can also create simple classification rules (5-tuples) to send specific traffic to a pre-configured Service Chain. A common example of a classification rule would be to send all HTTP traffic (tcp port 80) to a pre-configured Service Chain.

Service Function VM creation is performed via a VNF Manager. Currently, OPNFV SFC is integrated with OpenStack Tacker, which in addition to being a VNF Manager, also orchestrates the SFC configuration. In OPNFV SFC, Tacker creates service chains and classification rules, creates VMs in OpenStack for Service Functions, and then communicates the relevant configuration to OpenDaylight SFC.

3.2. SFC capabilities and usage

The OPNFV SFC feature can be deployed with either the “os-odl-sfc-ha” or the “os-odl-sfc-noha” scenario. SFC usage for both of these scenarios is the same.

As previously mentioned, Tacker is used as a VNF Manager and SFC Orchestrator. All the configuration necessary to create working service chains and classifiers can be performed using the Tacker command line. Refer to the Tacker walkthrough (step 3 and onwards) for more information.

3.2.1. SFC API usage guidelines and example

Refer to the Tacker walkthrough for Tacker usage guidelines and examples.
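A typical Tacker sequence looks roughly as follows. The template file names and resource names here are hypothetical and `run` only echoes the commands; the walkthrough linked above remains the authoritative reference:

```shell
# Hedged sketch of a Tacker-driven SFC setup (hypothetical names/files).
run() { echo "+ $*"; }

# Onboard a VNF descriptor and spawn the Service Function VM
run tacker vnfd-create --vnfd-file firewall-vnfd.yaml VNFD-firewall
run tacker vnf-create --vnfd-name VNFD-firewall firewall-SF

# Onboard the forwarding-graph descriptor (chain + classifier) and instantiate it
run tacker vnffgd-create --vnffgd-file chain-vnffgd.yaml VNFFGD-chain
run tacker vnffg-create --vnffgd-name VNFFGD-chain my-chain
```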

2. Service Function Chaining (SFC)
2.1. Introduction

The OPNFV Service Function Chaining (SFC) project aims to provide the ability to define an ordered list of network services (e.g. firewalls, NAT, QoS). These services are then “stitched” together in the network to create a service chain. This project provides the infrastructure to install the upstream ODL SFC implementation project in an NFV environment.

2.2. Definitions

Definitions of most terms used here are provided in the IETF SFC Architecture RFC. Additional terms specific to the OPNFV SFC project are defined below.

2.3. Abbreviations

NS      Network Service
NFVO    Network Function Virtualization Orchestrator
NF      Network Function
NSH     Network Services Header (Service chaining encapsulation)
ODL     OpenDaylight SDN Controller
RSP     Rendered Service Path
SDN     Software Defined Networking
SF      Service Function
SFC     Service Function Chain(ing)
SFF     Service Function Forwarder
SFP     Service Function Path
VNF     Virtual Network Function
VNFM    Virtual Network Function Manager
VNF-FG  Virtual Network Function Forwarding Graph
VIM     Virtual Infrastructure Manager
2.4. Use Cases

This section outlines the Danube use cases driving the initial OPNFV SFC implementation.

2.4.1. Use Case 1 - Two chains

This use case targets the creation of simple Service Chains using Firewall Service Functions. As can be seen in the following diagram, 2 service chains are created, each through a different Service Function Firewall. Service Chain 1 will block HTTP, while Service Chain 2 will block SSH.

(figure: OPNFV_SFC_Brahmaputra_UseCase.jpg)
2.4.2. Use Case 2 - One chain traverses two service functions

This use case creates two service functions, and a chain that makes the traffic flow through both of them. More information is available in the OPNFV-SFC wiki:

https://wiki.opnfv.org/display/sfc/Functest+SFC-ODL+-+Test+2

2.5. Architecture

This section describes the architectural approach to incorporating the upstream OpenDaylight (ODL) SFC project into the OPNFV Danube platform.

2.5.1. Service Functions

A Service Function (SF) is a Function that provides services to flows traversing a Service Chain. Examples of typical SFs include: Firewall, NAT, QoS, and DPI. In the context of OPNFV, the SF will be a Virtual Network Function. The SFs receive data packets from a Service Function Forwarder.

2.5.2. Service Function Forwarders

The Service Function Forwarder (SFF) is the core element used in Service Chaining. It is an OpenFlow switch that, in the context of OPNFV, is hosted in an OVS bridge. In OPNFV there will be one SFF per Compute Node that will be hosted in the “br-int” OpenStack OVS bridge.

The responsibility of the SFF is to steer incoming packets to the corresponding Service Function, or to the SFF in the next compute node. The flows in the SFF are programmed by the OpenDaylight SFC SDN Controller.

2.5.3. Service Chains

Service Chains are defined in the OpenDaylight SFC Controller using the following constructs:

SFC
A Service Function Chain (SFC) is an ordered list of abstract SF types.
SFP
A Service Function Path (SFP) references an SFC, and optionally provides concrete information about the SFC, like concrete SF instances. If SF instances are not supplied, then the RSP will choose them.
RSP
A Rendered Service Path (RSP) is the actual Service Chain. An RSP references an SFP, and effectively merges the information from the SFP and SFC to create the Service Chain. If concrete SF details were not provided in the SFP, then SF selection algorithms are used to choose them. When the RSP is created, the OpenFlow flows are programmed and written to the SFF(s).
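The relationship between these three constructs can be sketched in Python. This is an illustrative model only, not the ODL SFC API; the class names mirror the constructs above, while the selection logic and instance names are invented:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SFC:
    """An ordered list of abstract SF types."""
    name: str
    sf_types: list  # e.g. ["firewall", "nat"]

@dataclass
class SFP:
    """References an SFC; concrete SF instances are optional."""
    chain: SFC
    sf_instances: Optional[list] = None

def render_rsp(sfp: SFP, available: dict) -> list:
    """Build the RSP: merge the SFP and SFC into a concrete chain.
    If the SFP supplied no concrete instances, pick the first available
    instance of each abstract type (a stand-in for the real SF
    selection algorithms)."""
    if sfp.sf_instances:
        return list(sfp.sf_instances)
    return [available[t][0] for t in sfp.chain.sf_types]

chain = SFC("web-chain", ["firewall", "nat"])
rsp = render_rsp(SFP(chain), {"firewall": ["fw-1"], "nat": ["nat-1"]})
# rsp is now ["fw-1", "nat-1"]: the actual Service Chain
```

In the real controller, creating the RSP is also the point at which the OpenFlow flows are written to the SFF(s).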
2.5.4. Service Chaining Encapsulation

Service Chaining Encapsulation encapsulates traffic sent through the Service Chaining domain to facilitate easier steering of packets through Service Chains. If no Service Chaining Encapsulation is used, then packets must be classified at every hop of the chain, which would be slow and would not scale well.

In ODL SFC, Network Service Headers (NSH) are used for Service Chaining encapsulation. NSH is an IETF specification that uses two main header fields to facilitate packet steering, namely:

NSP (NSH Path)
The NSP is the Service Path ID.
NSI (NSH Index)
The NSI is the Hop in the Service Chain. The NSI starts at 255 and is decremented by every SF. If the NSI reaches 0, then the packet is dropped, which avoids loops.

NSH also has metadata fields, but that’s beyond the scope of this architecture.

In ODL SFC, NSH packets are encapsulated in VXLAN-GPE.
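The NSP/NSI steering logic described above can be sketched as follows. This is an illustrative Python model only; a real NSH header is a packed binary format (defined by the IETF), and the function here is invented for explanation:

```python
# Hypothetical sketch of NSH path steering: the NSP identifies the
# service path, and the NSI is decremented at every SF hop.
def traverse_sf(nsp: int, nsi: int):
    """Model one SF hop. A packet whose NSI reaches 0 is dropped,
    which avoids forwarding loops."""
    nsi -= 1
    if nsi == 0:
        return None  # packet dropped
    return (nsp, nsi)

state = (42, 255)   # path ID 42, initial index 255
for _ in range(3):  # packet traverses three SFs
    state = traverse_sf(*state)
# state is now (42, 252): same path, index decremented per hop
```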

2.5.5. Classifiers

A classifier is the entry point into Service Chaining. The role of the classifier is to map incoming traffic to Service Chains. In ODL SFC, this mapping is performed by matching the packets and encapsulating the packets in a VXLAN-GPE NSH tunnel.

The packet matching is specific to the classifier implementation, but can be as simple as an ACL, or can be more complex by using PCRF information or DPI.

2.5.6. VNF Manager

In OPNFV SFC, a VNF Manager is needed to spin up VMs for Service Functions. It has been decided to use the OpenStack Tacker VNF Manager to spin up and manage the life cycle of the SFs. Tacker will receive the ODL SFC configuration, manage the SF VMs, and forward the configuration to ODL SFC. The following sequence diagram details the interactions with the VNF Manager:

_images/OPNFV_SFC_Brahmaputra_SfCreation.jpg
2.5.7. OPNFV SFC Network Topology

The following image details the Network Topology used in OPNFV Danube SFC:

_images/OPNFV_SFC_Brahmaputra_NW_Topology.jpg

Infrastructure

Infrastructure Overview

OPNFV develops, operates, and maintains infrastructure which is used by the OPNFV community for development, integration, and testing purposes. The OPNFV Infrastructure Working Group (Infra WG) oversees the OPNFV infrastructure and ensures it is kept up to date and in a state that serves the community in the best possible way.

The Infra WG is working towards a model whereby we have a seamless pipeline for handling resource requests from the OPNFV community, from both development and Continuous Integration perspectives. Automation of requests and integration with existing automation tools is a primary driver in reaching this model. In the Infra WG, we imagine a model where the infrastructure requirements specified by Feature, Installer or other relevant projects within OPNFV are requested, provisioned, used, reported on, and subsequently torn down with no (or minimal) user intervention at the physical/infrastructure level.

The objectives of the Infra WG are:

  • Deliver efficiently dimensioned resources to OPNFV community needs on request, in a timely manner that ensures maximum usage (capacity) and maximum density (distribution of workloads)
  • Satisfy the needs of the twice-yearly release projects; this includes being able to handle load (the number of projects and requests) as well as need (topology and different layouts)
  • Support OPNFV community users. As the Infra WG, we are integral to all aspects of the OPNFV community (since it starts with the hardware); this can mean troubleshooting any element within the stack
  • Provide a method to expand and adapt as OPNFV community needs grow, and provide this to hosting providers (lab providers) as input for growth forecasts so they can better judge how to contribute their resources
  • Work with reporting and other groups to ensure we have adequate feedback to the end-users of the labs on how their systems, code, and features perform

The details of what is provided as part of the infrastructure can be seen in the following chapters.

Hardware Infrastructure

TBD

Software Infrastructure

Security

Continuous Integration - CI

Please see the details of CI in the chapters below.

Cross Community Continuous Integration - XCI

Please see the details of XCI in the chapters below.

Operations Supporting Tools

Calipso Release Guide

Calipso.io
Product Description and Value

Copyright (c) 2017 Koren Lev (Cisco Systems), Yaron Yogev (Cisco Systems) and others All rights reserved. This program and the accompanying materials are made available under the terms of the Apache License, Version 2.0 which accompanies this distribution, and is available at http://www.apache.org/licenses/LICENSE-2.0

image0

The low-level details, inter-connections and dependencies of virtual and physical networking in OpenStack, Docker or Kubernetes environments are currently invisible and abstracted by design, so this data is not exposed through any API or UI.

During virtual networking failures, troubleshooting takes a substantial amount of time due to manual discovery and analysis.

When maintenance work happens in the data center, virtual and physical networking (controlled or not) are impacted.

In most cases, the impact of any of the above scenarios is catastrophic.

Project “Calipso” tries to illuminate complex virtual networking with real time operational state visibility for large and highly distributed Virtual Infrastructure Management (VIM).

Customer needs during maintenance:

Visualize the networking topology, easily pinpointing the location needed for maintenance and show the impact of maintenance work needed in that location.

Administrators can plan ahead easily and report the detailed impact up the command chain; Calipso substantially lowers the admin time and overhead needed for that reporting.

Customer needs during troubleshooting:

Visualize and pinpoint the exact location of the failure in the networking chain, using a suspected ‘focal point’ (e.g. a VM that cannot communicate).

Monitor the networking location and raise alerts until the problem is resolved. Calipso also covers pinpointing the root cause.

Calipso supports multiple distributions/plugins and many virtual environment variances:

We built a fully tested unified model to deal with the many variances.

Supported in the initial release: VPP, OVS and LXB with all possible type drivers, on 5 different OS distributions, totaling more than 60 variances (see the Calipso-model guide).

New classes per object, link and topology can be programmed (see development guide).

Detailed Monitoring:

Calipso provides visible insights using smart discovery and virtual topological representation in graphs, with monitoring per object in the graph inventory, to reduce error vectors and shorten troubleshooting and maintenance cycles for VIM operators and administrators.

We believe that Stability is driven by accurate Visibility.

Table of Contents

Calipso.io Product Description and Value

1 About

1.1 Project Description

2 Main modules

2.1 High level module descriptions

2.2 High level functionality

3 Customer Requirements

3.1 Releases and Distributions

About

Project Description

Calipso interfaces with the virtual infrastructure (like OpenStack) through API, DB and CLI adapters, discovers the specific distribution/plugins in-use, their versions and based on that collects detailed data regarding running objects in the underlying workers and processes running on the different hosts. Calipso analyzes the inventory for inter-relationships and keeps them in a common and highly adaptive data model.

Calipso then represents the inter-connections as real-time topologies using automatic updates per changes in VIM, monitors the related objects and analyzes the data for impact and root-cause analysis.

This is done with the objective to lower and potentially eliminate complexity and lack of visibility from the VIM layers as well as to offer a common and coherent representation of all physical and virtual network components used under the VIM, all exposed through an API.

Calipso is developed to work with different OpenStack flavors, plugins and installers.

Calipso is developed to save network admins discovery and troubleshooting cycles for the networking aspects. Calipso helps estimate the impact of micro-failures in the infrastructure to allow appropriate resolutions.

Calipso focuses on scenarios that require VIM/OpenStack maintenance and troubleshooting enhancements using operations dashboards, i.e. connectivity, topology and related stats, as well as their correlation.

image1

Main modules

High level module descriptions

Calipso modules included with initial release:

  • Scanning: detailed inventory discovery and inter-connection analysis, smart/logical and automated learning from the VIM, based on specific environment version/type etc.
  • Listening: Attach to VIM message BUS and update changes in real time.
  • Visualization: represent the result of the discovery in browsable graph topology and tree.
  • Monitoring: Health and status for all discovered objects and inter-connections: use the discovered data to configure monitoring agents and gather monitoring results.
  • Analysis: some traffic analysis, impact and root-cause analysis for troubleshooting.
  • API: allow integration with Calipso application’s inventory and monitoring results.
  • Database: Mongo based
  • LDAP: pre-built integration for smooth attachment to corporate directories.

For Monitoring we are planning to utilize the work done by ‘Sensu’ and ‘Barometer’.

The project also develops required enhancements to individual components in OpenStack like Neutron, Telemetry API and the different OpenStack monitoring agents in order to provide a baseline for “Operations APIs”.

High level functionality

Scanning:

Calipso uses API, Database and Command-Line adapters for interfacing with the cloud infrastructure to logically discover every networking component and its relationships with others, building a smart topology and inventory.

Automated setup:

Calipso uses Sensu framework for Monitoring. It automatically deploys and configures the necessary configuration files on all hosts, writes customized checks and handlers to setup monitoring per inventory object.

Modeled analysis:

Calipso uses a unique logical model to help facilitate the topology discovery and the analysis of inter-connections and dependencies. Impact Analysis is embedded; other types of analysis are possible through a plugin framework.

Visualization:

Using its unique dependency model, Calipso visualizes the topological inventory and monitoring results in a highly customizable and modeled UI framework.

Monitoring:

After collecting the data from processes and workers provisioned by the cloud management systems, Calipso dynamically checks for health and availability, as a baseline for SLA monitoring.

Reporting:

Calipso allows networking administrators to operate, plan for maintenance or troubleshooting and provides an easy to use hierarchical representation of all the virtual networking components.

Customer Requirements

We identified an operational challenge: lack of visibility that leads to limited stability.

The lack of operational tooling, coupled with the reality of deployment tools, needs to be solved to decrease complexity and to assist not only in deploying but also in supporting OpenStack and other cloud stacks.

Calipso integrates well with installers like Apex to offer enhanced day 2 operations.

Releases and Distributions

Calipso is distributed for enterprises - ‘S’ release, through calipso.io, and for service providers - ‘P’ release, through OPNFV.

Calipso.io
Administration Guide

Copyright (c) 2017 Koren Lev (Cisco Systems), Yaron Yogev (Cisco Systems) and others All rights reserved. This program and the accompanying materials are made available under the terms of the Apache License, Version 2.0 which accompanies this distribution, and is available at http://www.apache.org/licenses/LICENSE-2.0

image0

Project “Calipso” tries to illuminate complex virtual networking with real time operational state visibility for large and highly distributed Virtual Infrastructure Management (VIM).

Calipso provides visible insights using smart discovery and virtual topological representation in graphs, with monitoring per object in the graph inventory to reduce error vectors and troubleshooting, maintenance cycles for VIM operators and administrators.

The Calipso model, described in this document, was built for multi-environment use and many VIM variances. The model was tested successfully (as of Aug 27th) against 60 different VIM variances (distributions, versions, networking drivers and types).

Table of Contents

Calipso.io Administration Guide

1 Environments config

2 UI overview

2.1 User management

2.2 Logging in and out

2.3 Messaging check

2.4 Adding a new environment

3 Preparing an environment for scanning

3.1 Where to deploy Calipso application

3.2 Environment setup

3.3 Filling the environment config data

3.4 Testing the connections

4 Links and Cliques

4.1 Adding environment clique_types

5 Environment scanning

5.1 UI scanning request

5.2 UI scan schedule request

5.3 API scanning request

5.4 CLI scanning in the calipso-scan container

5.4.1 Clique Scanning

5.4.2 Viewing results

6 Editing or deleting environments

7 Event-based scanning

7.1 Enabling event-based scanning

7.2 Event-based handling details

8 ACI scanning

9 Monitoring enablement

10 Modules data flows

Environments config

An environment is defined as a certain type of virtual infrastructure facility that runs under a single unified management (like an OpenStack facility).

Everything in the Calipso application relies on the environments config, which is maintained in the “environments_config” collection in the Calipso mongo DB.

Environment configs are pushed down to the Calipso DB either through the UI or the API (only in the OPNFV case does Calipso provide an automated program that builds all needed environments_config parameters for an ‘Apex’ distribution automatically).

When scanning and discovering items, Calipso uses this configuration document for successful scanning results. Here is an example of an environment config document:

{
  "name": "DEMO-ENVIRONMENT-SCHEME",
  "enable_monitoring": true,
  "last_scanned": "filled-by-scanning",
  "app_path": "/home/scan/calipso_prod/app",
  "type": "environment",
  "distribution": "Mirantis",
  "distribution_version": "8.0",
  "mechanism_drivers": ["OVS"],
  "type_drivers": "vxlan",
  "operational": "stopped",
  "listen": true,
  "scanned": false,
  "configuration": [
    {
      "name": "OpenStack",
      "port": "5000",
      "user": "adminuser",
      "pwd": "dummy_pwd",
      "host": "10.0.0.1",
      "admin_token": "dummy_token"
    },
    {
      "name": "mysql",
      "pwd": "dummy_pwd",
      "host": "10.0.0.1",
      "port": "3307",
      "user": "mysqluser"
    },
    {
      "name": "CLI",
      "user": "sshuser",
      "host": "10.0.0.1",
      "pwd": "dummy_pwd"
    },
    {
      "name": "AMQP",
      "pwd": "dummy_pwd",
      "host": "10.0.0.1",
      "port": "5673",
      "user": "rabbitmquser"
    },
    {
      "name": "Monitoring",
      "ssh_user": "root",
      "server_ip": "10.0.0.1",
      "ssh_password": "dummy_pwd",
      "rabbitmq_pass": "dummy_pwd",
      "rabbitmq_user": "sensu",
      "rabbitmq_port": "5671",
      "provision": "None",
      "env_type": "production",
      "ssh_port": "20022",
      "config_folder": "/local_dir/sensu_config",
      "server_name": "sensu_server",
      "type": "Sensu",
      "api_port": NumberInt(4567)
    },
    {
      "name": "ACI",
      "user": "admin",
      "host": "10.1.1.104",
      "pwd": "dummy_pwd"
    }
  ],
  "user": "wNLeBJxNDyw8G7Ssg",
  "auth": {
    "view-env": [
      "wNLeBJxNDyw8G7Ssg"
    ],
    "edit-env": [
      "wNLeBJxNDyw8G7Ssg"
    ]
  }
}

Here is a brief explanation of the purpose of major keys in this environment configuration doc:

Distribution: captures type of VIM, used for scanning of objects, links and cliques.

Distribution_version: captures version of VIM distribution, used for scanning of objects, links and cliques.

Mechanism_driver: captures virtual switch type used by the VIM, used for scanning of objects, links and cliques.

Type_driver: captures virtual switch tunneling type used by the switch, used for scanning of objects, links and cliques.

Listen: defines whether or not to use Calipso listener against the VIM BUS for updating inventory in real-time from VIM events.

Scanned: defines whether or not Calipso ran a full and a successful scan against this environment.

Last_scanned: end time of last scan.

Operational: defines whether or not VIM environment endpoints are up and running.

Enable_monitoring: defines whether or not Calipso should deploy monitoring of the inventory objects running inside all environment hosts.

Configuration-OpenStack: defines credentials for OpenStack API endpoints access.

Configuration-mysql: defines credentials for OpenStack DB access.

Configuration-CLI: defines credentials for servers CLI access.

Configuration-AMQP: defines credentials for OpenStack BUS access.

Configuration-Monitoring: defines credentials and setup for Calipso sensu server (see monitoring-guide for details).

Configuration-ACI: defines credentials for the ACI switch management API, if one exists.

User and auth: used for UI authorizations to view and edit this environment.

App-path: defines the root directory of the scanning application.
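As a sketch of how such a document could be assembled and pushed programmatically rather than through the UI: the helper below is hypothetical, and the Mongo host, database name ("calipso") and use of pymongo are assumptions, not taken from this guide:

```python
# Hypothetical helper: build an environments_config document in the
# shape shown above. Only a subset of keys is included here.
def build_env_config(name, distribution, version, mechanism_drivers,
                     type_drivers, configurations):
    return {
        "name": name,
        "type": "environment",
        "distribution": distribution,
        "distribution_version": version,
        "mechanism_drivers": mechanism_drivers,
        "type_drivers": type_drivers,
        "listen": True,
        "scanned": False,
        "operational": "stopped",
        "configuration": configurations,  # OpenStack, mysql, CLI, ...
    }

doc = build_env_config("DEMO-ENVIRONMENT-SCHEME", "Mirantis", "8.0",
                       ["OVS"], "vxlan",
                       [{"name": "OpenStack", "host": "10.0.0.1",
                         "port": "5000", "user": "adminuser",
                         "pwd": "dummy_pwd"}])

# Pushing to the DB would then look roughly like (connection details
# are placeholders):
# from pymongo import MongoClient
# MongoClient("mongodb://calipso-db:27017")["calipso"] \
#     .environments_config.insert_one(doc)
```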

This guide will help you understand how to add a new environment through the provided Calipso UI module, and then how to use this environment (and potentially many others) for scanning and real-time inventory collection.

UI overview

A cloud administrator can use the Calipso UI for his daily tasks. Once the Calipso containers are running (see quickstart-guide), the UI will be available at:

http://server-ip:80 , default login credentials: admin/123456.

Before logging in, while at the main landing page, generic information is provided.

Post login, at the main dashboard you can click on “Get started” and view a short guide for using some of the basic UI functions, available at: server-ip/getstarted.

The main areas of interest are shown in the following screenshot:

Main areas on UI:

image1

Main areas details:

Navigation Tree (1): Hierarchical searching through the inventory using object and parent details, to look up a focal point of interest for graphing or data gathering.

Main functions (2): Jumping between highest level dashboard (all environments), specific environment and some generic help is provided in this area.

Environment Summary (3): The central area where the data is exposed, either through graph or through widget-attribute-listing.

Search engine (4): Finding interesting focal points faster through basic object naming lookups, then clicking on results to get transferred directly to that specific object dashboard. Searches are conducted across all environments.

More settings (5): In this area the main collections of data are exposed, like scans, schedules, messaging, clique_types, link_types and others.

Graph or Data toggle (6): When focusing on a certain focal point, this button allows changing from graph-view to simple data-view on request. If no graph is available for a certain object, the data-view is used by default; if information is missing, try this button first to make sure the correct view is chosen.

User management

The first place an administrator might use is the users configuration, where basic RBAC is provided for authorizing access to the UI functions. Use the ‘settings’ button and choose ‘users’ to access it:

image2

Editing the admin user password is allowed here:

image3

Note:

The ‘admin’ user is allowed all functions on all environments. You shouldn’t change this behavior and you should never delete this user, or you’ll need to re-install Calipso.

Adding a new user is possible by clicking the “Create new user” option:

Creating a new user:

image4

Before environments are configured there are not a lot of options here; once environments are defined (one or more), users can be allowed to edit or view those environments.

Logging in and out


To logout and re-login with different user credentials you can click the username option and choose to sign out:

image5

Messaging check
When the calipso-scan and calipso-listen containers are running, they provide basic messages on their process status. These should be exposed through the messaging system up to the UI; to validate this, choose ‘messages’ from the settings button:

image6

Adding a new environment
As explained above, environment configuration is the prerequisite for any Calipso data gathering. Go to “My Environments” -> “Add new Environment” to start building the environment configuration scheme:

image7

Note: this is automated with the OPNFV Apex distro, where Calipso auto-discovers all credentials.

Preparing an environment for scanning

Some preparation is needed to allow Calipso to successfully gather data from the underlying systems running in the virtual infrastructure environment. This chapter explains the basic requirements and provides recommendations.
Where to deploy Calipso application

The Calipso application replaces the manual discovery steps typically done by the administrator in every maintenance and troubleshooting cycle. It needs administrator privileges and is most accurate when placed on one of the controllers or on a “jump server” deployed as part of the cloud virtual infrastructure; Calipso calls this server the “Master host”.

Consider Calipso as yet another cloud infrastructure module, similar to Neutron or Nova.

Per supported distributions we recommend installing the Calipso application at:

  1. Mirantis: on the ‘Fuel’ or ‘MCP’ server.
  2. RDO/Packstack: where the ansible playbooks are deployed.
  3. Canonical/Ubuntu: on the juju server.
  4. Triple-O/Apex: on the jump host server.
Environment setup
The following steps should be taken to enable Calipso’s scanner and listener to connect to the environment controllers and compute hosts:
  1. OpenStack API endpoints: a remote access user, accessible from the master host with the required credentials, allowed on the typical ports: 5000, 35357, 8777, 8773, 8774, 8775, 9696.

  2. OpenStack DB (MariaDB or MySQL): a remote access user, accessible from the master host on port 3306 or 3307, allowed read-only access to all databases.

  3. Master host SSH access: a remote access user with sudo privileges, accessible from the master host through either user/password or RSA keys. The master host itself should then be allowed password-less (RSA key) access to all other infrastructure hosts, all allowing sudo CLI commands over tty when commands are entered from the master host itself.

  4. AMQP message bus (like RabbitMQ): remote access allowed from the master host, to listen for all events generated, using a guest account with a password.

  5. Physical switch controller (like ACI): admin user/pass, accessed from the master host.

    Note: the current lack of operational toolsets like Calipso forces the use of the above scanning methods. The goal for Calipso is to deploy its scanning engine as an agent on all environment hosts; in such a scenario the requirements above might be deprecated and the scanning itself made more efficient.
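A quick way to sanity-check the reachability requirements above from the master host is a simple TCP probe. This is a hedged sketch only; the controller address and the exact port lists are examples, not mandated by this guide:

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

api_ports = [5000, 35357, 8777, 8773, 8774, 8775, 9696]  # OpenStack APIs
db_ports = [3306, 3307]                                  # MariaDB/MySQL
amqp_ports = [5673]                                      # RabbitMQ (example)

# Example: report which API ports are unreachable on a controller
# (placeholder address).
controller = "10.0.0.1"
unreachable = [p for p in api_ports if not port_open(controller, p)]
```

A port that shows as unreachable here will also fail Calipso's own "test connection" check later on.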

Filling the environment config data

As explained in chapter 1 above, environment configuration is the prerequisite, and all required data is modeled as described. See the api-guide for details on submitting those details through the Calipso API module. When using the UI module, follow the section tabs and fill in the needed data per the help messages and the explanations in chapter 1.

Only the AMQP, Monitoring and ACI sections in environment_config documents are optional, per the requirements detailed in this guide.

Testing the connections

Before submitting the environment_config document it is wise to test the connections. Each section tab in the environment configuration has an optional button for testing the connection, tagged “test connection”. When this button is clicked, a check is made to ensure all needed data is entered correctly, then a request is sent down to mongoDB, to the “connection_tests” collection. The Calipso scanning module then makes the required test and pushes back a response message indicating whether or not this connection is possible with the provided details and credentials.

Test connection per configuration section:

image8

With the above tool, the administrator can be assured that Calipso scanning will be successful and that the results will be an accurate representation of the state of his live environment.

Environment scanning

Once the environment is set up correctly and the environment_config data is filled in and tested, scanning can start. This can be done with one of the following four options:
  1. UI scanning request

  2. UI scan schedule request

  3. API scanning or scheduling request.

  4. CLI scanning in the calipso-scan container.

    The following sections describe those scanning options.

UI scanning request
This can be done after the environment configuration has been submitted; the environment name will be listed under “My environments” and the administrator can choose it from the list and log in to the specific environment dashboard:

image11

Once inside a specific environment dashboard, the administrator can click the scanning button to go into the scanning request wizards:

image12

In most cases, the only step needed to send a scanning request is to use all default options and click the “Submit” button:

image13

The scanning request will propagate into the “scans” collection and will be handled by scan_manager in the calipso-scan container.

Scan options:

Log level: determines the level and details of the scanning logs.

Clear data: empty historical inventories related to that specific environment, before scanning.

Only inventory: creates inventory objects without analyzing for links.

Only links: create links from pre-existing inventory, does not build graph topologies.

Only Cliques: create graph topologies from pre-existing inventory and links.
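For illustration, a scan request document with these options might be assembled as follows. The field names mirror the options listed above but are hypothetical, not the authoritative schema of the “scans” collection:

```python
import datetime

# Hypothetical helper: build a scan request in the spirit of the UI
# "Submit" action, to be placed in the "scans" collection.
def build_scan_request(env_name, log_level="INFO", clear_data=False,
                       inventory_only=False, links_only=False,
                       cliques_only=False):
    return {
        "environment": env_name,
        "log_level": log_level,          # level/detail of scanning logs
        "clear": clear_data,             # empty historical inventories first
        "inventory_only": inventory_only,
        "links_only": links_only,
        "cliques_only": cliques_only,
        "submit_timestamp": datetime.datetime.now(datetime.timezone.utc),
        "status": "submitted",           # picked up by scan_manager
    }

req = build_scan_request("Mirantis-8", clear_data=True)
```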

UI scan schedule request

Scanning can be used periodically to dynamically update the inventories per changes in the underlying virtual environment infrastructure. This can be defined using scan scheduling and can be combined with the above one time scanning request.

image14

Scheduled scans have the same options as a single scan request, while choosing a specific environment to schedule on and providing frequency details; the timer is counted from submission time. Scan scheduling requests are propagated to the “scheduled_scans” collection in the Calipso mongoDB and handled by scan_manager in the calipso-scan container.

API scanning request
Follow api-guide for details on submitting scanning request through Calipso API.
CLI scanning in the calipso-scan container
When using the UI for scanning, messages are populated in the “Messages” menu item and include several details on successful scanning and some alerts. When more detailed debugging of the scanning process is needed, the administrator can log in directly to the calipso-scan container and run the scanning manually using the CLI:
  • Login to the calipso-scan container running on the installed host:

    ssh scan@localhost -p 3002, using the default password: ‘scan’

  • Move to the calipso scan application location:

    cd /home/scan/calipso_prod/app/discover

  • Run the scan.py application with the basic default options:

    python3 ./scan.py -m /local_dir/calipso_mongo_access.conf -e Mirantis-8

    Default options: -m points to the default location of the mongoDB access details; -e points to the specific environment name, as submitted to mongoDB through the UI or API.

    Other optional scanning parameters can be used for detailed debugging.

    The scan.py script is located in the directory app/discover in the Calipso repository.
    To show the help information, run scan.py with the --help option; here are the results:

    Usage: scan.py [-h] [-c [CGI]] [-m [MONGO_CONFIG]] [-e [ENV]] [-t [TYPE]]
                   [-y [INVENTORY]] [-s] [-i [ID]] [-p [PARENT_ID]]
                   [-a [PARENT_TYPE]] [-f [ID_FIELD]] [-l [LOGLEVEL]]
                   [--inventory_only] [--links_only] [--cliques_only] [--clear]

    Optional arguments:

      -h, --help            show this help message and exit
      -c [CGI], --cgi [CGI]
                            read argument from CGI (true/false) (default: false)
      -m [MONGO_CONFIG], --mongo_config [MONGO_CONFIG]
                            name of config file with MongoDB server access details
      -e [ENV], --env [ENV]
                            name of environment to scan (default: WebEX-Mirantis@Cisco)
      -t [TYPE], --type [TYPE]
                            type of object to scan (default: environment)
      -y [INVENTORY], --inventory [INVENTORY]
                            name of inventory collection (default: 'inventory')
      -s, --scan_self       scan changes to a specific object (default: False)
      -i [ID], --id [ID]    ID of object to scan (when scan_self=true)
      -p [PARENT_ID], --parent_id [PARENT_ID]
                            ID of parent object (when scan_self=true)
      -a [PARENT_TYPE], --parent_type [PARENT_TYPE]
                            type of parent object (when scan_self=true)
      -f [ID_FIELD], --id_field [ID_FIELD]
                            name of ID field (when scan_self=true) (default: 'id',
                            use 'name' for projects)
      -l [LOGLEVEL], --loglevel [LOGLEVEL]
                            logging level (default: 'INFO')
      --inventory_only      only scan to inventory (default: False)
      --links_only          only do links creation (default: False)
      --cliques_only        only do cliques creation (default: False)
      --clear               clear all data prior to scanning (default: False)

    A simple scan.py run will look, by default, for a local MongoDB server. Assuming it is run from within the running scan container, the administrator needs to point it at the specific MongoDB server. This is done using the Mongo access config file created by the installer (see install-guide for details):

    ./scan.py -m your_mongo_access.conf

    Environment needs to be specified explicitly, no default environment is used by scanner.

    By default, the inventory collection, named ‘inventory’, along with the accompanying collections: “links”, “cliques”, “clique_types” and “clique_constraints” are used to place initial scanning data results.

    As a more granular scan example, for debugging purposes, using environment "RDO-packstack-Mitaka" and pointing scanning results to an inventory collection named "RDO":
    the accompanying collections will be automatically created and named accordingly:
    "RDO_links", "RDO_cliques", "RDO_clique_types" and "RDO_clique_constraints".

    Another parameter in use here is --clear, which is useful for development: it clears all previous data from the data collections (inventory, links and cliques).

    scan.py -m your_mongo_access.conf -e RDO-packstack-Mitaka -y RDO --clear

    Raising the log level (e.g. -l DEBUG) provides additional details for scan debugging.

Clique Scanning
For creating cliques based on the discovered objects and links, clique_types must be defined for the given environment (otherwise the default "ANY" environment clique_types will be used).
A clique type specifies the link types used in building a clique (graph topology) for a specific focal point object type.
For example, it can define that for instance objects we want to have the following link types:
  • instance-vnic

  • vnic-vconnector

  • vconnector-vedge

  • vedge-host_pnic

  • host_pnic-network

    See calipso-model guide for more details on cliques and links.
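
    The example above could be stored as a clique_type document like the following Python dict (a sketch only; field names follow common conventions, consult the calipso-model guide for the authoritative schema):

    ```python
    # Hypothetical clique_type document for focal point type "instance".
    # All values are illustrative; the real schema is in the calipso-model guide.
    instance_clique_type = {
        "environment": "ANY",            # or a specific environment name
        "focal_point_type": "instance",
        "link_types": [
            "instance-vnic",
            "vnic-vconnector",
            "vconnector-vedge",
            "vedge-host_pnic",
            "host_pnic-network",
        ],
        "name": "instance",
    }
    ```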

    Since in many cases the same clique types are used, we can simply copy the clique_types documents from an existing clique_types collection, for example using MongoChef:

  • Click the existing clique types collection

  • Right click the results area

  • Choose export

  • Click 'Next' through the wizard (JSON format, to clipboard)

  • Select JSON format and “Overwrite document with the same _id”

  • Right click the target collection

  • Choose import, then JSON and clipboard

  • Note that the target collection's name must be prefixed with your inventory collection's name: for example, if you create a collection named your_test, then your clique types collection must be named your_test_clique_types.

    Now run scan.py again to have it create cliques-only from that data.
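
    The manual MongoChef steps above can also be scripted. A minimal sketch, assuming pymongo and a reachable MongoDB server (host, DB name and collection names are placeholders):

    ```python
    # Copy clique_types documents into a prefixed target collection,
    # mirroring the manual export/import steps described above.

    def target_collection_name(prefix, suffix="clique_types"):
        # e.g. prefix "your_test" -> "your_test_clique_types"
        return "%s_%s" % (prefix, suffix)

    def copy_clique_types(db, source, prefix):
        """Copy all documents from the source clique_types collection
        into the prefixed target collection; returns the number copied."""
        docs = list(db[source].find())
        if docs:
            db[target_collection_name(prefix)].insert_many(docs)
        return len(docs)

    # Usage against a real MongoDB (requires pymongo; host is an assumption):
    # from pymongo import MongoClient
    # db = MongoClient("localhost", 27017)["Calipso"]
    # print(copy_clique_types(db, "clique_types", "your_test"))
    ```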

Viewing results

Scan results are written into the collections in the 'Calipso' DB on the MongoDB database.

In our example, we use the MongoDB server on "install-hostname", so we can connect to it with a Mongo client, such as MongoChef, and investigate the specific collections for details.
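
As a quick alternative to a GUI client, the scan collections can be inspected with a short script following the naming convention described above (a sketch; the host and DB name are assumptions):

```python
# Collections holding scan results, per the scanner's naming convention:
# the default "inventory" uses unprefixed accompanying collections,
# any other inventory name prefixes them (e.g. "RDO_links").

ACCOMPANYING = ["links", "cliques", "clique_types", "clique_constraints"]

def scan_collections(inventory="inventory"):
    """Return the collection names holding results for a given inventory."""
    if inventory == "inventory":
        return [inventory] + ACCOMPANYING
    return [inventory] + ["%s_%s" % (inventory, c) for c in ACCOMPANYING]

# Usage against a real MongoDB server (requires pymongo):
# from pymongo import MongoClient
# db = MongoClient("install-hostname", 27017)["Calipso"]
# for name in scan_collections("RDO"):
#     print(name, db[name].count_documents({}))
```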

Editing or deleting environments

Inside a specific environment dashboard optional buttons are available for deleting and editing the environment configurations:

image15

Note: Deleting an environment does not empty the inventories of previous scan results; this can be done in a future scan using the --clear option.

Event-based scanning

For dynamic discovery and real-time updates of the inventories, Calipso also provides event-based scanning through the event_manager application in the calipso-listen container.

Event_manager listens to the VIM AMQP bus and, based on the events, updates the inventories and can also kick off automatic scanning of a specific object and its dependencies.

Enabling event-based scanning
Per environment, an administrator can enable event-based scanning, using either the UI or the API to configure that parameter in the specific environment configuration:

image16

In cases where event-based scanning is not supported for a specific distribution variance, the event-based scan checkbox will be grayed out. When checked, the AMQP section becomes mandatory.

This behavior is maintained through the "supported_environments" collection and explained in more detail in the calipso-model document.

Event-based handling details

The event-based scanning module needs more work to adapt to the changes in any specific distribution variance; this is where we would like community support, to help us maintain data without the need for full or partial scanning through scheduling.

The following diagram illustrates event-based scanning module functions on top of the regular scanning module functions:

image17

The following tables explain some of the current capabilities of event handling and event-based scanning in Calipso (NOTE: see the PDF version of this guide for a better view of the tables):
# Event name AMQP event Handler Workflow Scans Notes
Instance
1 Create Instance compute.instance.create.end EventInstanceAdd
  1. Get instances_root from inventory
  2. If instance_root is None, log error, return None
  3. Create ScanInstancesRoot object.
  4. Scan instances root (and only new instance as a child)
  5. Scan from queue
  6. Get host from inventory
  7. Scan host (and only children of types 'vconnectors_folder' and 'vedges_folder')
  8. Scan from queue
  9. Scan links
  10. Scan cliques
  11. Return True

Yes

{by object id: 2,
links: 1,
cliques: 1,
from queue: ?}
 
2 Update Instance

compute.instance.rebuild.end

compute.instance.update

EventInstanceUpdate
  1. If state == ‘building’, return None
  2. If state == ‘active’ and old_state == ‘building’, call EventInstanceAdd (see #1), return None
  3. If state == ‘deleted’ and old_state == ‘active’, call EventInstanceDelete (see #2), return None
  4. Get instance from inventory
  5. If instance is None, log error, return None
  6. Update several fields in instance.
  7. If name_path has changed, update relevant names and name_path for descendants
  8. Update instance in db
  9. Return None

Yes (if #1 is used)

No (otherwise)

The only fields that are updated: name, object_name and name_path
3 Delete Instance compute.instance.delete.end EventInstanceDelete (EventDeleteBase)
  1. Extract id from payload
  2. Execute self.delete_handler()
No delete_handler() is expanded later
Instance Lifecycle
4 Instance Down

compute.instance.shutdown.start

compute.instance.power_off.start

compute.instance.suspend.start

Not implemented      
5 Instance Up

compute.instance.power_on.end

compute.instance.suspend.end

Not implemented      
Region
6 Add Region servergroup.create Not implemented      
7 Update Region

servergroup.update

servergroup.addmember

Not implemented      
8 Delete Region servergroup.delete Not implemented      
Network
9 Add Network network.create.end EventNetworkAdd
  1. If network with specified id already exists, log error and return None
  2. Parse incoming data and create a network dict
  3. Save network in db
  4. Return None
No  
10 Update Network network.update.end EventNetworkUpdate
  1. Get network_document from db
  2. If network_document doesn’t exist, log error and return None
  3. If name has changed, update relevant names and name_path for descendants
  4. Update admin_state_up from payload
  5. Update network_document in db
No The only fields that are updated: name, object_name, name_path and admin_state_up
11 Delete Network network.delete.end EventNetworkDelete (EventDeleteBase)
  1. Extract network_id from payload
  2. Execute self.delete_handler()
No delete_handler() is expanded later
Subnet
12 Add Subnet subnet.create.end EventSubnetAdd
  1. Get network_document from db
  2. If network_document doesn’t exist, log error and return None
  3. Update network_document with new subnet
  4. If dhcp_enable is True, we update parent network (*note 1*) and add the following children docs: ports_folder, port_document, network_services_folder, dhcp_document, vnic_folder and vnic_document.
  5. Add links for pnics and vservice_vnics (*note 2*)
  6. Scan cliques
  7. Return None
Yes {cliques: 1}
  1. I don’t fully understand what *these lines* do. We make sure ApiAccess.regions variable is not empty, but why? The widespread usage of static variables is not a good sign anyway.
  2. For some reason *the comment* before those lines states we “scan for links” but it looks like we just add them.
13 Update Subnet subnet.update.end EventSubnetUpdate
  1. Get network_document from db
  2. If network_document doesn’t exist, log error and return None
  3. If we don’t have a matching subnet in network_document[‘subnets’], return None
  4. If subnet has enable_dhcp set to True and it wasn’t so before:

4.1. Add dhcp document

4.2. Make sure ApiAccess.regions is not empty

4.3. Add port document

4.4. If port has been added, add vnic document, add links and scan cliques.

  5. If subnet has enable_dhcp set to False and it wasn't so before:

5.1. Delete dhcp document

5.2. Delete port binding to dhcp server if exists

  6. If name hasn't changed, update it by its key in subnets. Otherwise, set it by the new key in subnets. (*note 1*)
Yes {cliques: 1} (only if dhcp status has switched to True)
  1. If subnet name has changed, we set it in subnets object inside network_document by new key, but don’t remove the old one. A bug?
14 Delete Subnet subnet.delete.end EventSubnetDelete
  1. Get network_document from db
  2. If network_document doesn’t exist, log error and return None
  3. Delete subnet id from network_document[‘subnet_ids’]
  4. If subnet exists in network_document[‘subnets’], remove its cidr from network_document[‘cidrs’]

and remove itself from network_document[‘subnets’]

  5. Update network_document in db
  6. If no subnets are left in network_document, delete related vservice dhcp, port and vnic documents
No  
Port
15 Create Port port.create.end EventPortAdd
  1. Check if ports folder exists, create if not.
  2. Add port document to db
  3. If ‘compute’ is not in port[‘device_owner’], return None
  4. Get old_instance_doc (updated instance document) from db
  5. Get instances_root from db
  6. If instances_root is None, log error and return None (*note 1*)
  7. Use an ApiFetchHostInstances fetcher to get data for instance with id equal to the device from payload.
  8. If such instance exists, update old_instance_doc’s fields network_info, network and possibly mac_address with their counterparts from fetched instance. Update old_instance_doc in db
  9. Use a CliFetchInstanceVnics/CliFetchInstanceVnicsVpp fetcher to get vnic with mac_address equal to the port’s mac address
  10. If such vnic exists, update its data and update in db
  11. Add new links using FindLinksForInstanceVnics and FindLinksForVedges classes
  12. Scan cliques
  13. Return True

Yes {cliques: 1}

(only if ‘compute’ is in port[‘device_owner’] and instance_root is not None (see steps 3 and 6))

  1. The port and (maybe) port folder will still persist in db even if we abort the execution on step 6. See idea 1 for details.
16 Update Port port.update.end EventPortUpdate
  1. Get port from db
  2. If port doesn’t exist, log error and return None
  3. Update port data (name, admin_state_up, status, binding:vnic_type) in db
  4. Return None
No  
17 Delete Port port.delete.end EventPortDelete (EventDeleteBase)
  1. Get port from db
  2. If port doesn’t exist, log error and return None
  3. If ‘compute’ is in port[‘device_owner’], do the following:

3.1. Get instance document for the port from db. If it doesn't exist, go to step 4.

3.2. Remove port from network_info of instance

3.3. If it was the last port for network in instance doc, remove network from the doc

3.4. If port’s mac_address is equal to instance_doc’s one, then fetch an instance with the same id as instance_doc using ApiFetchHostInstances fetcher. If instance exists and ‘mac_address’ not in instance, set instance_doc’s mac_address to None

3.5. Save instance_docs in db

  4. Delete port from db
  5. Delete related vnic from db
  6. Execute self.delete_handler(vnic) for vnic
No delete_handler() is expanded later
Router
18 Add Router router.create.end EventRouterAdd
  1. Get host by id from db
  2. Fetch router_doc using a CliFetchHostVservice
  3. If router_doc contains ‘external_gateway_info’:

3.1. Add router document (with network) to db

3.2. Add children documents:

3.3. If no ports folder exists for this router, create one

3.4. Add router port to db

3.5. Add vnics folder for router to db

3.6. If port was successfully added (3.4), try to add vnic document for router to db two times (??)

3.7. If port wasn’t successfully added, try adding vnics_folder again (???) (*note 1*)

3.8. If step 3.7 returned False (*Note 2*), try to add vnic_document again (??)

  4. Add router document (without network) to db (Note 3)
  5. Add relevant links for the new router
  6. Scan cliques
  7. Return None
Yes {cliques: 1}
  1. Looks like code author confused a lot of stuff here. This class needs to be reviewed thoroughly.
  2. Step 3.7 never returns anything for some reason (a bug?)
  3. Why are we adding router document again? It shouldn’t be added again on step 4 if it was already added on step 3.1. Probably an ‘else’ clause is missing
19 Update Router router.update.end EventRouterUpdate
  1. Get router_doc from db
  2. If router_doc doesn’t exist, log error and return None
  3. If payload router data doesn’t have external_gateway_info, do the following:

3.1. If router_doc has a ‘gw_port_id’ key, delete relevant port.

3.2. If router_doc has a ‘network’:

3.2.1. If a port was deleted on step 3.1, remove its ‘network_id’ from router_doc[‘network’]

3.2.2. Delete related links

  4. If payload router data has external_gateway_info, do the following:

4.1. Add new network id to router_doc networks

4.2. Use CliFetchHostVservice to fetch gateway port and update it in router_doc

4.3. Add children documents for router (see #18 steps 3.2-3.8)

4.4. Add relevant links

  5. Update router_doc in db
  6. Scan cliques
  7. Return None
Yes {cliques: 1}  
20 Delete Router router.delete.end EventRouterDelete (EventDeleteBase)
  1. Extract router_id from payload
  2. Execute self.delete_handler()
No delete_handler() is expanded later
Router Interface
21 Add Router Interface router.interface.create EventInterfaceAdd
  1. Get network_doc from db based on subnet id from interface payload
  2. If network_doc doesn’t exist, return None
  3. Make sure ApiAccess.regions is not empty (?)
  4. Add router-interface port document in db
  5. Add vnic document for interface. If unsuccessful, try again after a small delay
  6. Update router:

6.1. If router_doc is an empty type, log an error and continue to step 7 (*Note 1*)

6.2. Add new network id to router_doc network list

6.3. If gateway port is in both router_doc and db, continue to step 6.7

6.4. Fetch router using CliFetchHostVservice, set gateway port in router_doc to the one from fetched router

6.5. Add gateway port to db

6.6. Add vnic document for router. If unsuccessful, try again after a small delay

6.7. Update router_id in db

  7. Add relevant links
  8. Scan cliques
  9. Return None
Yes {cliques: 1}
  1. Log message states that we should abort interface adding, though the code does nothing to support that. Moreover, router_doc can’t be empty at that moment because it’s referenced before.
22 Delete Router Interface router.interface.delete EventInterfaceDelete
  1. Get port_doc by payload port id from db
  2. If port_doc doesn’t exist, log an error and return None
  3. Update relevant router by removing network id of port_doc
  4. Delete port by executing EventPortDelete().delete_port()
No  

ACI scanning

For dynamic discovery and real-time updates of physical switches, and of connections between physical switch ports and host ports (pNICs), Calipso provides an option to integrate with the Cisco data center switch controller called "ACI APIC".

This is an optional parameter; once checked, details of the ACI server and API credentials need to be provided:

image18

The results of this integration (when ACI switches are used in that specific VIM environment) are extremely valuable as it maps out and monitors virtual-to-physical connectivity across the entire data center environment, both internal and external.

Example graph generated in such environments:

image19

image20

image21

Monitoring enablement

For dynamic discovery of the real-time statuses and states of physical and virtual components and their connections, Calipso provides an option to automatically integrate with the Sensu framework, customized and adapted to the Calipso model and design concepts. Follow the monitoring-guide for details on this optional module.

Enabling Monitoring through UI, using environment configuration wizard:

image22

Modules data flows

Calipso modules/containers and the VIM layers have some inter-dependencies, illustrated in the following diagram:

image23

Calipso.io
API Guide

Copyright (c) 2017 Koren Lev (Cisco Systems), Yaron Yogev (Cisco Systems) and others All rights reserved. This program and the accompanying materials are made available under the terms of the Apache License, Version 2.0 which accompanies this distribution, and is available at http://www.apache.org/licenses/LICENSE-2.0

image0

Project “Calipso” tries to illuminate complex virtual networking with real time operational state visibility for large and highly distributed Virtual Infrastructure Management (VIM).

We believe that Stability is driven by accurate Visibility.

Calipso provides visible insights using smart discovery and virtual topological representation in graphs, with monitoring per object in the graph inventory to reduce error vectors and troubleshooting, maintenance cycles for VIM operators and administrators.

Table of Contents

Calipso.io API Guide

1 Pre Requisites

1.1 Calipso API container

2 Overview

2.1 Introduction

2.2 HTTP Standards

2.3 Calipso API module Code

3 Starting the Calipso API server

3.1 Authentication

3.2 Database

3.3 Running the API Server

4 Using the Calipso API server

4.1 Authentication

4.2 Messages

4.3 Inventory

4.4 Links

4.5 Cliques

4.6 Clique_types

4.7 Clique_constraints

4.8 Scans

4.9 Scheduled_scans

4.10 Constants

4.11 Monitoring_Config_Templates

4.12 Aggregates

4.13 Environment_configs

Pre Requisites

Calipso API container

Calipso's main application is written in Python 3.5 for Linux servers, tested successfully on CentOS 7.3 and Ubuntu 16.04. When running as micro-services, the required software packages and libraries are delivered per micro-service, including for the API module. In a monolithic deployment, the dependencies below must be installed.

Here is a list of the required software packages for the API, and the official supported steps required to install them:

  1. Python3.5.x for Linux : https://docs.python.org/3.5/using/unix.html#on-linux

  2. Pip for Python3 : https://docs.python.org/3/installing/index.html

  3. Python3 packages to install using pip3 :

  • falcon (1.1.0)

  • pymongo (3.4.0)

  • gunicorn (19.6.0)

  • ldap3 (2.1.1)

  • setuptools (34.3.2)

  • python3-dateutil (2.5.3-2)

  • bcrypt (3.1.1)

    Use the pip3 package manager to install the specific version of each library. Calipso uses Python 3, so a package installation looks like this:

    pip3 install falcon==1.1.0

    The versions of the Python packages specified above are the ones that were used in the development of the API, other versions might also be compatible.

    This document describes how to set up the Calipso API container for development against the API.

Overview

Introduction

The Calipso API provides access to the Calipso data stored in the MongoDB.

The Calipso API uses the falcon (https://falconframework.org) web framework and the gunicorn (http://gunicorn.org) WSGI server.

The authentication of the Calipso API is based on LDAP (Lightweight Directory Access Protocol). It can therefore interface with any directory server that implements the LDAP protocol, e.g. OpenLDAP, Active Directory, etc. The Calipso app offers and uses the built-in LDAP container by default to make sure this integration is fully tested, but it is possible to interface with other existing directories.

HTTP Standards
The Calipso API supports standard HTTP methods described here: https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html.
At present two types of operations are supported: GET (retrieve data) and POST (create a new data object).
Calipso API module Code

The Calipso API code is currently located in the OPNFV repository.

Run the following command to get the source code:

git clone https://git.opnfv.org/calipso/

The source code of the API is located in the app/api directory sub-tree.

Starting the Calipso API server

Authentication

Calipso API uses LDAP as the protocol to implement its authentication, so you can use any LDAP directory server as the authentication backend, such as OpenLDAP or Microsoft AD. Edit the ldap.conf file, located in the app/config directory, to configure the LDAP server options (see details in the quickstart-guide):

# url for connecting to the LDAP server (customize to your own as needed):
url ldap_url
# LDAP attribute mapped to user id, must not be a multivalued attributes:
user_id_attribute CN
# LDAP attribute mapped to user password:
user_pass_attribute userPassword
# LDAP objectclass for user
user_objectclass inetOrgPerson
# Search base for users
user_tree_dn OU=Employees,OU=Example Users,DC=example,DC=com
query_scope one
# Valid options for tls_req_cert are demand, never, and allow
tls_req_cert demand
# CA certificate file path for communicating with LDAP servers.
tls_cacertfile ca_cert_file_path
group_member_attribute member

Calipso currently implements basic authentication: the client sends its query request with the username and password in the auth header; if the user can be bound to the LDAP server, authentication succeeds, otherwise it fails. Other methods will be supported in future releases.

Database
The Calipso API queries and retrieves data from the MongoDB container; the data in MongoDB comes from the results of Calipso scanning, monitoring, or user input through the API. All modules of a single Calipso instance must point to the same MongoDB used by the scanning and monitoring modules. Installation and testing of MongoDB is covered in the install-guide and quickstart-guide.
Running the API Server

The entry point (initial command) for running the Calipso API application is the server.py script in the app/api directory. Options for running the API server can be listed using: python3 server.py --help. Here are the currently available options:

-m [MONGO_CONFIG], --mongo_config [MONGO_CONFIG]
                   name of config file with mongo access details
--ldap_config [LDAP_CONFIG]
                   name of the config file with ldap server config
                   details
-l [LOGLEVEL], --loglevel [LOGLEVEL] logging level (default: 'INFO')
-b [BIND], --bind [BIND]
                   binding address of the API server (default: 127.0.0.1:8000)
-y [INVENTORY], --inventory [INVENTORY]
                   name of inventory collection (default: 'inventory')

For testing, you can simply run the API server by:

python3 app/api/server.py

This will start an HTTP server listening on http://localhost:8000. If you want to change the binding address of the server, run it using this command:

python3 server.py --bind ip_address/server_name:port_number

You can also use your own configuration files for the LDAP server and MongoDB; just add the --mongo_config and --ldap_config options to your command:

python3 server.py --mongo_config your_mongo_config_file_path --ldap_config your_ldap_config_file_path

The --inventory option sets the collection names the server uses for the API. Per the quickstart-guide, the mongo and ldap config files default to /local_dir/calipso_mongo_access.conf and /local_dir/ldap.conf mounted inside the API container.

Note: the --inventory argument can only change the collection names of the inventory, links, link_types, clique_types, clique_constraints, cliques, constants and scans collections; the names of the monitoring_config_templates, environments_config and messages collections remain fixed across releases.

Using the Calipso API server

The following covers the currently available requests and responses of the Calipso API.

Authentication

POST        /auth/tokens

Description: get token with password and username or a valid token.

Normal response code: 201

Error response code: badRequest(400), unauthorized(401)

Request

Name In Type Description
auth(Mandatory) body object An auth object that contains the authentication information
methods(Mandatory) body array The authentication methods. For password authentication, specify password, for token authentication, specify token.
credentials(Optional) body object Credentials object which contains the username and password, it must be provided when getting the token with user credentials.
token(Optional) body string The token of the user, it must be provided when getting the user with an existing valid token.

Response

Name In Type Description
token body string Token for the user.
issued-at body string

The date and time when the token was issued. The date and time format follows *ISO 8601*:

YYYY-MM-DDThh:mm:ss.sss+hhmm

expires_at body string

The date and time when the token expires. The date and time format follows *ISO 8601*:

YYYY-MM-DDThh:mm:ss.sss+hhmm

method body string The method used to obtain the token.

Examples

Get token with credentials:

POST http://korlev-osdna-staging1.cisco.com:8000/auth/tokens

{
     "auth": {
         "methods": ["credentials"],
         "credentials": {
               "username": "username",
               "password": "password"
          }
     }
}

Get token with token

POST http://korlev-calipso-staging1.cisco.com:8000/auth/tokens

{
    "auth": {
          "methods": ["token"],
          "token": "17dfa88789aa47f6bb8501865d905f13"
    }
}
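
The credentials request above can also be built programmatically with Python's standard library; a minimal sketch (the server URL and credentials are placeholders):

```python
import json
import urllib.request

def build_token_request(base_url, username, password):
    """Build a POST /auth/tokens request for credentials authentication."""
    body = {
        "auth": {
            "methods": ["credentials"],
            "credentials": {"username": username, "password": password},
        }
    }
    return urllib.request.Request(
        base_url + "/auth/tokens",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_token_request("http://localhost:8000", "username", "password")
# Sending it requires a running API server, e.g.:
# with urllib.request.urlopen(req) as resp:   # expects HTTP 201
#     token = json.load(resp)["token"]
```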

DELETE       /auth/tokens

Description: delete token with a valid token.

Normal response code: 200

Error response code: badRequest(400), unauthorized(401)

Request

Name In Type Description
X-Auth-Token header string A valid authentication token that is going to be deleted.

Response

200 OK will be returned when the delete succeeds
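
A matching sketch for revoking a token, with the token passed in the X-Auth-Token header (the server URL and token value are placeholders):

```python
import urllib.request

def build_delete_token_request(base_url, token):
    """Build a DELETE /auth/tokens request; the token to revoke
    is passed in the X-Auth-Token header."""
    return urllib.request.Request(
        base_url + "/auth/tokens",
        headers={"X-Auth-Token": token},
        method="DELETE",
    )

req = build_delete_token_request("http://localhost:8000",
                                 "17dfa88789aa47f6bb8501865d905f13")
# urllib.request.urlopen(req)  # returns 200 OK on success (server required)
```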

Messages

GET         /messages

Description: get message details with environment name and message id, or get a list of messages with filters except id.

Normal response code: 200

Error response code: badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
env_name(Mandatory) query string Environment name of the messages. e.g. “Mirantis-Liberty-API”.
id (Optional) query string ID of the message.
source_system (Optional) query string Source system of the message, e.g. “OpenStack”.
start_time (Optional) query string

Start time of the messages; when this parameter is specified, messages after that time are returned. The date and time format follows *ISO 8601*:

YYYY-MM-DDThh:mm:ss.sss+hhmm

The +hhmm value, if included, gives the time zone as an offset from UTC, for example 2017-01-25T09:45:33.000-0500. If you omit the time zone, UTC is assumed.

end_time (Optional) query string

End time of the messages; when this parameter is specified, messages before that time are returned. The date and time format follows *ISO 8601*:

YYYY-MM-DDThh:mm:ss.sss+hhmm

The +hhmm value, if included, gives the time zone as an offset from UTC, for example 2017-01-25T09:45:33.000-0500. If you omit the time zone, UTC is assumed.

level (Optional) query string The severity of the messages; the severity strings described in *RFC 5424* are accepted. Possible values are "panic", "alert", "crit", "error", "warn", "notice", "info" and "debug".
related_object (Optional) query string ID of the object related to the message.
related_object_type (Optional) query string Type of the object related to the message, possible values are “vnic”, “vconnector”, “vedge”, “instance”, “vservice”, “host_pnic”, “network”, “port”, “otep” and “agent”.
page (Optional) query int Which page is to be returned; the default is the first page. If the page number is larger than the maximum page of the query, an empty result set is returned (pages start from 0).
page_size (Optional) query int Size of each page, the default is 1000.

Response 

Name In Type Description
environment body string Environment name of the message.
id body string ID of the message.
_id body string MongoDB ObjectId of the message.
timestamp body string Timestamp of message.
viewed body boolean Indicates whether the message has been viewed.
display_context body string The content which will be displayed.
message body object Message object.
source_system body string Source system of the message, e.g. “OpenStack”.
level body string The severity of the message.
related_object body string Related object of the message.
related_object_type body string Type of the related object.
messages body array List of message ids which match the filters.

Examples

Example Get Messages 

Request:

http://korlev-calipso-testing.cisco.com:8000/messages?env_name=Mirantis-Liberty-API&start_time=2017-01-25T14:28:32.400Z&end_time=2017-01-25T14:28:42.400Z

Response:

{
    "messages": [
        {
            "level": "info",
            "environment": "Mirantis-Liberty",
            "id": "3c64fe31-ca3b-49a3-b5d3-c485d7a452e7",
            "source_system": "OpenStack"
        },
        {
            "level": "info",
            "environment": "Mirantis-Liberty",
            "id": "c7071ec0-04db-4820-92ff-3ed2b916738f",
            "source_system": "OpenStack"
        }
    ]
}
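
Authenticated queries pass the token obtained from /auth/tokens in the X-Auth-Token header. A minimal sketch building such a request (the URL and token are placeholders):

```python
import urllib.parse
import urllib.request

def build_messages_request(base_url, token, **filters):
    """Build a GET /messages request; keyword filters become query params."""
    query = urllib.parse.urlencode(filters)
    return urllib.request.Request(
        base_url + "/messages?" + query,
        headers={"X-Auth-Token": token},
    )

req = build_messages_request(
    "http://localhost:8000", "valid-token-here",
    env_name="Mirantis-Liberty-API", level="error")
# with urllib.request.urlopen(req) as resp:  # server required
#     print(resp.read())
```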

Example Get Message Details

Request

http://korlev-calipso-testing.cisco.com:8000/messages?env_name=Mirantis-Liberty-API&id=80b5e074-0f1a-4b67-810c-fa9c92d41a98

Response

{
"related_object_type": "instance",
"source_system": "OpenStack",
"level": "info",
"timestamp": "2017-01-25T14:28:33.057000",
"_id": "588926916a283a8bee15cfc6",
"viewed": true,
"display_context": "*",
"related_object": "97a1e179-6a42-4c7b-bced-4f64bd9e4b6b",
"environment": "Mirantis-Liberty-API",
"message": {
"_context_show_deleted": false,
"_context_user_name": "admin",
"_context_project_id": "a3efb05cd0484bf0b600e45dab09276d",
"_context_service_catalog": [
{
"type": "volume",
"endpoints": [
{
"region": "RegionOne"
}
],
"name": "cinder"
},
{
"type": "volumev2",
"endpoints": [
{
"region": "RegionOne"
}
],
"name": "cinderv2"
}
],
"_context_user_identity": "a864d9560b3048e9864118555bb9614c a3efb05cd0484bf0b600e45dab09276d - - -",
"_context_project_domain": null,
"_context_is_admin": true,
"_context_instance_lock_checked": false,
"_context_timestamp": "2017-01-25T22:27:08.773313",
"priority": "INFO",
"_context_project_name": "project-osdna",
"publisher_id": "compute.node-1.cisco.com",
"_context_read_only": false,
"message_id": "80b5e074-0f1a-4b67-810c-fa9c92d41a98",
"_context_user_id": "a864d9560b3048e9864118555bb9614c",
"_context_quota_class": null,
"_context_tenant": "a3efb05cd0484bf0b600e45dab09276d",
"_context_remote_address": "192.168.0.2",
"_context_request_id": "req-2955726b-f227-4eac-9826-b675f5345ceb",
"_context_auth_token": "gAAAAABYiSVcHmaq1TWwNc1_QLlKhdUeC1-M6zBebXyoXN4D0vMlxisny9Q61crBzqwSyY_Eqd_yjrL8GvxatWI1WI1uG4VeWU6axbLe_k5FaXS4RVOP83yR6eh5g_qXQtsNapQufZB1paypZm8YGERRvR-vV5Ee76aTSkytVjwOBeipr9D0dXd-wHcRnSNkTD76nFbGKTu_",
"_context_user_domain": null,
"payload": {
"image_meta": {
"container_format": "bare",
"disk_format": "qcow2",
"min_ram": "64",
"base_image_ref": "5f048984-37d1-4952-8b8a-9acb0237bad7",
"min_disk": "0"
},
"display_name": "test",
"terminated_at": "",
"access_ip_v6": null,
"architecture": null,
"audit_period_beginning": "2017-01-01T00:00:00.000000",
"metadata": {},
"node": "node-2.cisco.com",
"audit_period_ending": "2017-01-25T22:27:12.888042",
"instance_type": "m1.micro",
"ramdisk_id": "",
"availability_zone": "nova",
"kernel_id": "",
"hostname": "test",
"vcpus": 1,
"bandwidth": {},
"user_id": "a864d9560b3048e9864118555bb9614c",
"state_description": "block_device_mapping",
"old_state": "building",
"root_gb": 0,
"instance_flavor_id": "8784e0b5-7d17-4281-a509-f49d6fd102f9",
"cell_name": "",
"reservation_id": "r-zt7sh7vy",
"access_ip_v4": null,
"deleted_at": "",
"tenant_id": "a3efb05cd0484bf0b600e45dab09276d",
"disk_gb": 0,
"instance_id": "97a1e179-6a42-4c7b-bced-4f64bd9e4b6b",
"host": "node-2.cisco.com",
"memory_mb": 64,
"os_type": null,
"old_task_state": "block_device_mapping",
"state": "building",
"instance_type_id": 6,
"launched_at": "",
"ephemeral_gb": 0,
"created_at": "2017-01-25 22:27:09+00:00",
"progress": "",
"new_task_state": "block_device_mapping"
},
"_context_read_deleted": "no",
"event_type": "compute.instance.update",
"_context_roles": [
"admin",
"_member_"
],
"_context_user": "a864d9560b3048e9864118555bb9614c",
"timestamp": "2017-01-25 22:27:12.912744",
"_unique_id": "d6dff97e6f71401bb8890057f872644f",
"_context_resource_uuid": null,
"_context_domain": null
},
"id": "80b5e074-0f1a-4b67-810c-fa9c92d41a98"
}
Inventory

GET            /inventory

Description: get object details with the environment name and the id of the object, or get a list of objects using any of the filters below except id.

Normal response code: 200

Error response code:  badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
env_name (Mandatory) query string Environment of the objects, e.g. “Mirantis-Liberty-API”.
id (Optional) query string ID of the object, e.g. “node-2.cisco.com”.
parent_id (Optional) query string ID of the parent object, e.g. “nova”.
id_path (Optional) query string ID path of the object, e.g. “/Mirantis-Liberty-API/Mirantis-Liberty-API-regions/RegionOne/RegionOne-availability_zones/nova/node-2.cisco.com”.
parent_path (Optional) query string ID path of the parent object, e.g. “/Mirantis-Liberty-API/Mirantis-Liberty-API-regions/RegionOne/RegionOne-availability_zones/nova”.
sub_tree (Optional) query boolean If true and parent_path is specified, returns the whole sub-tree of that parent object, including the parent itself; if false, returns only the direct children of that parent. The default is false.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, an empty set is returned. (Pages start from 0.)
page_size (Optional) query int Size of each page; the default is 1000.

Response 

Name In Type Description
environment body string Environment name of the object.
id body string ID of the object.
_id body string MongoDB ObjectId of the object.
type body string Type of the object.
parent_type body string Type of the parent object.
parent_id body string ID of the parent object.
name_path body string Name path of the object.
last_scanned body string Time of last scanning.
name body string Name of the object.
id_path body string ID path of the object.
objects body array The list of object IDs that match the filters.

Examples

Example Get Objects 

Request

http://korlev-calipso-testing.cisco.com:8000/inventory?env_name=Mirantis-Liberty-API&parent_path=/Mirantis-Liberty-API/Mirantis-Liberty-API-regions/RegionOne&sub_tree=false

Response

{
    "objects": [
        {
            "id": "Mirantis-Liberty-regions",
            "name": "Regions",
            "name_path": "/Mirantis-Liberty/Regions"
        },
        {
            "id": "Mirantis-Liberty-projects",
            "name": "Projects",
            "name_path": "/Mirantis-Liberty/Projects"
        }
    ]
}

Examples Get Object Details

Request

http://korlev-calipso-testing.cisco.com:8000/inventory?env_name=Mirantis-Liberty-API&id=node-2.cisco.com

Response

{
    "ip_address": "192.168.0.5",
    "services": {
        "nova-compute": {
            "active": true,
            "updated_at": "2017-01-20T23:03:57.000000",
            "available": true
        }
    },
    "name": "node-2.cisco.com",
    "id_path": "/Mirantis-Liberty-API/Mirantis-Liberty-API-regions/RegionOne/RegionOne-availability_zones/nova/node-2.cisco.com",
    "show_in_tree": true,
    "os_id": "1",
    "object_name": "node-2.cisco.com",
    "_id": "588297ae6a283a8bee15cc0d",
    "host_type": [
        "Compute"
    ],
    "name_path": "/Mirantis-Liberty-API/Regions/RegionOne/Availability Zones/nova/node-2.cisco.com",
    "parent_type": "availability_zone",
    "zone": "nova",
    "parent_id": "nova",
    "host": "node-2.cisco.com",
    "last_scanned": "2017-01-20T15:05:18.501000",
    "id": "node-2.cisco.com",
    "environment": "Mirantis-Liberty-API",
    "type": "host"
}
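The requests above can be assembled programmatically from the documented query parameters. A minimal Python sketch; the `build_inventory_url` helper and the base URL are illustrative, not part of Calipso:

```python
from urllib.parse import urlencode

def build_inventory_url(base, env_name, **filters):
    """Build a GET /inventory URL; env_name is mandatory, other filters optional."""
    params = [("env_name", env_name)]
    for key, value in filters.items():
        # Boolean filters such as sub_tree are sent as lowercase strings
        if isinstance(value, bool):
            value = str(value).lower()
        params.append((key, value))
    return base + "/inventory?" + urlencode(params)

url = build_inventory_url(
    "http://calipso.example.com:8000",   # hypothetical server address
    "Mirantis-Liberty-API",
    parent_path="/Mirantis-Liberty-API/Mirantis-Liberty-API-regions/RegionOne",
    sub_tree=False,
)
```

Note that `urlencode` percent-encodes the slashes in `parent_path`; the server decodes them back to the path shown in the example above.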
Cliques

GET            /cliques

Description: get clique details with environment name and clique id, or get a list of cliques with filters except id

Normal response code: 200

Error response code: badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
env_name (Mandatory) query string Environment of the cliques, e.g. “Mirantis-Liberty-API”.
id (Optional) query string ID of the clique; it must be a string that can be converted to a MongoDB ObjectId.
focal_point (Optional) query string MongoDB ObjectId of the focal point object; it must be a string that can be converted to a MongoDB ObjectId.
focal_point_type (Optional) query string Type of the focal point object; possible values include “vnic”, “vconnector”, “vedge”, “instance”, “vservice”, “host_pnic”, “network”, “port”, “otep” and “agent”.
link_type (Optional) query string Type of the link. When this filter is specified, all cliques that contain the specified link type are returned. Possible values include “instance-vnic”, “otep-vconnector”, “otep-host_pnic”, “host_pnic-network”, “vedge-otep”, “vnic-vconnector”, “vconnector-host_pnic”, “vnic-vedge”, “vedge-host_pnic” and “vservice-vnic”.
link_id (Optional) query string MongoDB ObjectId of the link; it must be a string that can be converted to a MongoDB ObjectId. When this filter is specified, all cliques that contain that specific link are returned.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, an empty set is returned. (Pages start from 0.)
page_size (Optional) query int Size of each page; the default is 1000.

Response

Name In Type Description
id body string ID of the clique.
_id body string MongoDB ObjectId of the clique.
environment body string Environment of the clique.
focal_point body string Object ID of the focal point.
focal_point_type body string Type of the focal point object, e.g. “vservice”.
links body array List of MongoDB ObjectIds of the links in the clique.
links_detailed body array Details of the links in the clique.
constraints body object Constraints of the clique.
cliques body array The list of clique ids that match the filters.

Examples

Example Get Cliques

Request

http://10.56.20.32:8000/cliques?env_name=Mirantis-Liberty-API&link_id=58a2405a6a283a8bee15d42f

Response

{
    "cliques": [
        {
            "link_types": [
                "instance-vnic",
                "vservice-vnic",
                "vnic-vconnector"
            ],
            "environment": "Mirantis-Liberty",
            "focal_point_type": "vnic",
            "id": "576c119a3f4173144c7a75c5"
        },
        {
            "link_types": [
                "vnic-vconnector",
                "vconnector-vedge"
            ],
            "environment": "Mirantis-Liberty",
            "focal_point_type": "vconnector",
            "id": "576c119a3f4173144c7a75c6"
        }
    ]
}

Example Get Clique Details

Request

http://korlev-calipso-testing.cisco.com:8000/cliques?env_name=Mirantis-Liberty-API&id=58a2406e6a283a8bee15d43f

Response

{
    "id": "58867db16a283a8bee15cd2b",
    "focal_point_type": "host_pnic",
    "environment": "Mirantis-Liberty",
    "_id": "58867db16a283a8bee15cd2b",
    "links_detailed": [
        {
            "state": "up",
            "attributes": {
                "network": "e180ce1c-eebc-4034-9e50-b3bab1c13979"
            },
            "target": "58867cc86a283a8bee15cc92",
            "source": "58867d166a283a8bee15ccd0",
            "host": "node-1.cisco.com",
            "link_type": "host_pnic-network",
            "target_id": "e180ce1c-eebc-4034-9e50-b3bab1c13979",
            "source_id": "eno16777728.103@eno16777728-00:50:56:ac:e8:97",
            "link_weight": 0,
            "environment": "Mirantis-Liberty",
            "_id": "58867d646a283a8bee15ccf3",
            "target_label": "",
            "link_name": "Segment-None",
            "source_label": ""
        }
    ],
    "links": [
        "58867d646a283a8bee15ccf3"
    ],
    "focal_point": "58867d166a283a8bee15ccd0",
    "constraints": {}
}
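Since each clique summary carries its link_types array, a client can narrow a result list further on its own side. A short Python sketch; `cliques_with_link_type` is a hypothetical helper, and the sample data is taken from the list response above:

```python
# Sample clique summaries, as returned under "cliques" by GET /cliques
cliques = [
    {"id": "576c119a3f4173144c7a75c5", "focal_point_type": "vnic",
     "link_types": ["instance-vnic", "vservice-vnic", "vnic-vconnector"]},
    {"id": "576c119a3f4173144c7a75c6", "focal_point_type": "vconnector",
     "link_types": ["vnic-vconnector", "vconnector-vedge"]},
]

def cliques_with_link_type(cliques, link_type):
    """Return the ids of cliques whose link_types contain the given type."""
    return [c["id"] for c in cliques if link_type in c["link_types"]]
```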

Clique_types

GET        /clique_types

Description: get clique_type details with environment name and clique_type id, or get a list of clique_types with filters except id

Normal response code: 200

Error response code:  badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
env_name (Mandatory) query string Environment of the clique_types, e.g. “Mirantis-Liberty-API”.
id (Optional) query string ID of the clique_type; it must be a string that can be converted to a MongoDB ObjectId.
focal_point_type (Optional) query string Type of the focal point object; possible values include “vnic”, “vconnector”, “vedge”, “instance”, “vservice”, “host_pnic”, “network”, “port”, “otep” and “agent”.
link_type (Optional) query string Type of the link. When this filter is specified, all clique_types that contain the specified link_type in their link_types array are returned. Possible values include “instance-vnic”, “otep-vconnector”, “otep-host_pnic”, “host_pnic-network”, “vedge-otep”, “vnic-vconnector”, “vconnector-host_pnic”, “vnic-vedge”, “vedge-host_pnic” and “vservice-vnic”. Repeat link_type to specify multiple values, e.g. link_type=instance-vnic&link_type=host_pnic-network.
page_size (Optional) query int Size of each page; the default is 1000.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, an empty set is returned. (Pages start from 0.)

Response

Name In Type Description
id body string ID of the clique_type.
_id body string MongoDB ObjectId of the clique_type
environment body string Environment of the clique_type.
focal_point_type body string Type of the focal point, e.g. “vnic”.
link_types body array List of link_types of the clique_type.
name body string Name of the clique_type.
clique_types body array List of clique_type ids of clique types that match the filters.

Examples

Example Get Clique_types

Request

http://korlev-calipso-testing.cisco.com:8000/clique_types?env_name=Mirantis-Liberty-API&link_type=instance-vnic&page_size=3&link_type=host_pnic-network

Response

{
    "clique_types": [
        {
            "environment": "Mirantis-Liberty",
            "focal_point_type": "host_pnic",
            "id": "58ca73ae3a8a836d10ff3b80"
        }
    ]
}

Example Get Clique_type Details

Request

http://korlev-calipso-testing.cisco.com:8000/clique_types?env_name=Mirantis-Liberty-API&id=585b183c761b05789ee3c659

Response

{
    "id": "585b183c761b05789ee3c659",
    "focal_point_type": "vnic",
    "environment": "Mirantis-Liberty-API",
    "_id": "585b183c761b05789ee3c659",
    "link_types": [
        "instance-vnic",
        "vservice-vnic",
        "vnic-vconnector"
    ],
    "name": "vnic_clique"
}

POST           /clique_types

Description: Create a new clique_type

Normal response code: 201(Created)

Error response code: badRequest(400), unauthorized(401),  conflict(409)

Request

Name In Type Description
environment(Mandatory) body string Environment of the system, the environment must be the existing environment in the system.
focal_point_type(Mandatory) body string Type of the focal point, some possible values are “vnic”, “vconnector”, “vedge”, “instance”, “vservice”, “host_pnic”, “network”, “port”, “otep” and “agent”.
link_types(Mandatory) body array Link_types of the clique_type, some possible values of the link_type are “instance-vnic”, “otep-vconnector”, “otep-host_pnic”, “host_pnic-network”, “vedge-otep”, “vnic-vconnector”, “vconnector-host_pnic”, “vnic-vedge”, “vedge-host_pnic” and “vservice-vnic”
name(Mandatory) body string Name of the clique type, e.g. “instance_vconnector_clique”

Request Example

POST http://korlev-calipso-testing.cisco.com:8000/clique_types

{
    "environment": "RDO-packstack-Mitaka",
    "focal_point_type": "instance",
    "link_types": [
        "instance-vnic",
        "vnic-vconnector",
        "vconnector-vedge",
        "vedge-otep",
        "otep-host_pnic",
        "host_pnic-network"
    ],
    "name": "instance_vconnector_clique"
}
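Because all four body fields of POST /clique_types are mandatory, it can help to validate a payload before sending it. A Python sketch; `missing_clique_type_fields` is an illustrative helper, not part of the API:

```python
MANDATORY = ("environment", "focal_point_type", "link_types", "name")

def missing_clique_type_fields(payload):
    """List the mandatory fields absent from a POST /clique_types body."""
    return [field for field in MANDATORY if field not in payload]

payload = {
    "environment": "RDO-packstack-Mitaka",
    "focal_point_type": "instance",
    "link_types": ["instance-vnic", "vnic-vconnector"],
    "name": "instance_vconnector_clique",
}
```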

Response

Successful Example

{
    "message": "created a new clique_type for environment Mirantis-Liberty"
}
Clique_constraints

GET            /clique_constraints

Description: get clique_constraint details with clique_constraint id, or get a list of clique_constraints with filters except id.

Normal response code: 200

Error response code: badRequest(400), unauthorized(401), notFound(404)

Note: clique_constraints are not environment specific, so the query starts with a filter parameter rather than env_name (unlike all other endpoints). Example:

http://korlev-calipso-testing.cisco.com:8000/clique_constraints?focal_point_type=instance

Request

Name In Type Description
id (Optional) query string ID of the clique_constraint; it must be a string that can be converted to a MongoDB ObjectId.
focal_point_type (Optional) query string Type of the focal point; possible values include “vnic”, “vconnector”, “vedge”, “instance”, “vservice”, “host_pnic”, “network”, “port”, “otep” and “agent”.
constraint (Optional) query string Constraint of the cliques; repeat this filter to specify multiple constraints, e.g. constraint=network&constraint=host_pnic.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, the last page is returned. (Pages start from 0.)
page_size (Optional) query int Size of each page; the default is 1000.

Response 

Name In Type Description
id body string Object id of the clique constraint.
_id body string MongoDB ObjectId of the clique_constraint.
focal_point_type body string Type of the focal point object.
constraints body array Constraints of the clique.
clique_constraints body array List of clique constraints ids that match the filters.

Examples

Example Get Clique_constraints

Request

http://korlev-calipso-testing.cisco.com:8000/clique_constraints?constraint=host_pnic&constraint=network

Response

{
    "clique_constraints": [
        {
            "id": "576a4176a83d5313f21971f5"
        },
        {
            "id": "576ac7069f6ba3074882b2eb"
        }
    ]
}

Example Get Clique_constraint Details

Request

http://korlev-calipso-testing.cisco.com:8000/clique_constraints?id=576a4176a83d5313f21971f5

Response

{
    "_id": "576a4176a83d5313f21971f5",
    "constraints": [
        "network",
        "host_pnic"
    ],
    "id": "576a4176a83d5313f21971f5",
    "focal_point_type": "instance"
}
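The repeated constraint parameter shown above can be produced with standard URL encoding by passing a sequence of pairs. A short Python sketch:

```python
from urllib.parse import urlencode

# Repeat the constraint parameter once per value, as the API expects
query = urlencode([("constraint", "host_pnic"), ("constraint", "network")])
```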
Scans

GET            /scans

Description: get scan details with environment name and scan id, or get a list of scans with filters except id

Normal response code: 200

Error response code: badRequest (400), unauthorized (401), notFound(404)

Request

Name In Type Description
env_name (Mandatory) query string Environment of the scans, e.g. “Mirantis-Liberty”.
id (Optional) query string ID of the scan; it must be a string that can be converted to a MongoDB ObjectId.
base_object (Optional) query string ID of the scanned base object, e.g. “node-2.cisco.com”.
status (Optional) query string Status of the scans; possible values are “draft”, “pending”, “running”, “completed”, “failed” and “aborted”.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, an empty set is returned. (Pages start from 0.)
page_size (Optional) query int Size of each page; the default is 1000.

Response

Name In Type Description
status body string The current status of the scan, possible values are “draft”, “pending”, “running”, “completed”, “failed” and “aborted”.
log_level body string Logging level of the scanning, the possible values are “CRITICAL”, “ERROR”, “WARNING”, “INFO”, “DEBUG” and “NOTSET”.
clear body boolean Indicates whether it needs to clear all the data before scanning.
scan_only_inventory body boolean Only scan and store data in the inventory.
scan_only_links body boolean Limit the scan to find only missing links.
scan_only_cliques body boolean Limit the scan to find only missing cliques.
scan_completed body boolean Indicates if the scan completed
submit_timestamp body string Submit timestamp of the scan
environment body string Environment name of the scan
inventory body string Name of the inventory collection.
object_id body string Base object of the scan

Examples

Example Get Scans

Request

http://korlev-calipso-testing.cisco.com:8000/scans?status=completed&env_name=Mirantis-Liberty&base_object=ff

Response

{
    "scans": [
        {
            "status": "pending",
            "environment": "Mirantis-Liberty",
            "id": "58c96a075eb66a121cc4e75f",
            "scan_completed": true
        }
    ]
}

Example Get Scan Details

Request

http://korlev-calipso-testing.cisco.com:8000/scans?env_name=Mirantis-Liberty&id=589a49cf2e8f4d154386c725

Response

{
    "scan_only_cliques": true,
    "object_id": "ff",
    "start_timestamp": "2017-01-28T01:02:47.352000",
    "submit_timestamp": null,
    "clear": true,
    "_id": "589a49cf2e8f4d154386c725",
    "environment": "Mirantis-Liberty",
    "scan_only_links": true,
    "id": "589a49cf2e8f4d154386c725",
    "inventory": "update-test",
    "scan_only_inventory": true,
    "log_level": "warning",
    "status": "completed",
    "end_timestamp": "2017-01-28T01:07:54.011000"
}

POST            /scans

Description: create a new scan (ask calipso to scan an environment for detailed data gathering).

Normal response code: 201(Created)

Error response code: badRequest (400), unauthorized (401)

Request 

Name In Type Description
status (mandatory) body string The current status of the scan, possible values are “draft”, “pending”, “running”, “completed”, “failed” and “aborted”.
log_level (optional) body string Logging level of the scanning, the possible values are “critical”, “error”, “warning”, “info”, “debug” and “notset”.
clear (optional) body boolean Indicates whether it needs to clear all the data before scanning.
scan_only_inventory (optional) body boolean Only scan and store data in the inventory.
scan_only_links (optional) body boolean Limit the scan to find only missing links.
scan_only_cliques (optional) body boolean Limit the scan to find only missing cliques.
environment (mandatory) body string Environment name of the scan
inventory (optional) body string Name of the inventory collection.
object_id (optional) body string Base object of the scan

Request Example

POST http://korlev-calipso-testing.cisco.com:8000/scans

{
    "status": "pending",
    "log_level": "warning",
    "clear": true,
    "scan_only_inventory": true,
    "environment": "Mirantis-Liberty",
    "inventory": "koren",
    "object_id": "ff"
}

Response

Successful Example

{
    "message": "created a new scan for environment Mirantis-Liberty"
}
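A client building a POST /scans body might check the status value against the documented set before submitting. A hedged Python sketch; `new_scan_request` is a hypothetical helper, not part of Calipso:

```python
VALID_STATUSES = {"draft", "pending", "running", "completed", "failed", "aborted"}

def new_scan_request(environment, status="pending", **optional):
    """Assemble a POST /scans body; raises on an unknown status value."""
    if status not in VALID_STATUSES:
        raise ValueError("unknown scan status: " + status)
    body = {"environment": environment, "status": status}
    body.update(optional)  # clear, scan_only_inventory, object_id, ...
    return body

body = new_scan_request("Mirantis-Liberty", clear=True, object_id="ff")
```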
Scheduled_scans

GET            /scheduled_scans

Description: get scheduled_scan details with environment name and scheduled_scan id, or get a list of scheduled_scans with filters except id

Normal response code: 200

Error response code: badRequest (400), unauthorized (401), notFound(404)

Request

Name In Type Description
environment (Mandatory) query string Environment of the scheduled_scans, e.g. “Mirantis-Liberty”.
id (Optional) query string ID of the scheduled_scan; it must be a string that can be converted to a MongoDB ObjectId.
freq (Optional) query string Frequency of the scheduled_scans; possible values are “HOURLY”, “DAILY”, “WEEKLY”, “MONTHLY” and “YEARLY”.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, an empty set is returned. (Pages start from 0.)
page_size (Optional) query int Size of each page; the default is 1000.

Response

Name In Type Description
freq body string The frequency of the scheduled_scan, possible values are “HOURLY”, “DAILY”, “WEEKLY”, “MONTHLY”, and “YEARLY”.
log_level body string Logging level of the scheduled_scan, the possible values are “critical”, “error”, “warning”, “info”, “debug” and “notset”.
clear body boolean Indicates whether it needs to clear all the data before scanning.
scan_only_inventory body boolean Only scan and store data in the inventory.
scan_only_links body boolean Limit the scan to find only missing links.
scan_only_cliques body boolean Limit the scan to find only missing cliques.
submit_timestamp body string Submitted timestamp of the scheduled_scan
environment body string Environment name of the scheduled_scan
scheduled_timestamp body string Scheduled time of the scan; it should follow ISO 8601: YYYY-MM-DDThh:mm:ss.sss+hhmm

Examples

Example Get Scheduled_scans

Request

http://korlev-calipso-testing.cisco.com:8000/scheduled_scans?environment=Mirantis-Liberty

Response

{
    "scheduled_scans": [
        {
            "freq": "WEEKLY",
            "environment": "Mirantis-Liberty",
            "id": "58c96a075eb66a121cc4e75f",
            "scheduled_timestamp": "2017-01-28T01:07:54.011000"
        }
    ]
}

Example Get Scheduled_Scan Details

Request

http://korlev-calipso-testing.cisco.com:8000/scheduled_scans?environment=Mirantis-Liberty&id=589a49cf2e8f4d154386c725

Response

{
    "scan_only_cliques": true,
    "scheduled_timestamp": "2017-01-28T01:02:47.352000",
    "submit_timestamp": "2017-01-27T01:07:54.011000",
    "clear": true,
    "_id": "589a49cf2e8f4d154386c725",
    "environment": "Mirantis-Liberty",
    "scan_only_links": false,
    "id": "589a49cf2e8f4d154386c725",
    "scan_only_inventory": false,
    "log_level": "warning",
    "freq": "WEEKLY"
}

POST            /scheduled_scans

Description: create a new scheduled_scan (request calipso to scan at a future date).

Normal response code: 201(Created)

Error response code: badRequest (400), unauthorized (401)

Request 

Name In Type Description
log_level (optional) body string Logging level of the scheduled_scan, the possible values are “critical”, “error”, “warning”, “info”, “debug” and “notset”.
clear (optional) body boolean Indicates whether it needs to clear all the data before scanning.
scan_only_inventory (optional) body boolean Only scan and store data in the inventory.
scan_only_links (optional) body boolean Limit the scan to find only missing links.
scan_only_cliques (optional) body boolean Limit the scan to find only missing cliques.
environment (mandatory) body string Environment name of the scan
freq(mandatory) body string The frequency of the scheduled_scan, possible values are “HOURLY”, “DAILY”, “WEEKLY”, “MONTHLY”, and “YEARLY”.
submit_timestamp (mandatory) body string Submitted time of the scheduled_scan; it should follow ISO 8601: YYYY-MM-DDThh:mm:ss.sss+hhmm

POST http://korlev-calipso-testing.cisco.com:8000/scheduled_scans

{
    "freq": "WEEKLY",
    "log_level": "warning",
    "clear": true,
    "scan_only_inventory": true,
    "environment": "Mirantis-Liberty",
    "submit_timestamp": "2017-01-28T01:07:54.011000"
}

Response

Successful Example

{
    "message": "created a new scheduled_scan for environment Mirantis-Liberty"
}
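The timestamp format required above (ISO 8601, YYYY-MM-DDThh:mm:ss.sss+hhmm) can be produced from a timezone-aware datetime. The `iso8601` helper below is illustrative:

```python
from datetime import datetime, timezone, timedelta

def iso8601(dt):
    """Format an aware datetime as YYYY-MM-DDThh:mm:ss.sss+hhmm."""
    # Truncate microseconds to milliseconds, then append the +hhmm offset
    millis = f"{dt.microsecond // 1000:03d}"
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + millis + dt.strftime("%z")

ts = iso8601(datetime(2017, 1, 28, 1, 7, 54, 11000,
                      tzinfo=timezone(timedelta(hours=-8))))
# ts is "2017-01-28T01:07:54.011-0800"
```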
Constants

GET            /constants

Description: get constant details with name (constants are used by ui and event/scan managers)

Normal response code: 200

Error response code: badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
name (Mandatory) query string Name of the constant. e.g. “distributions”.

Response

Name In Type Description
id body string ID of the constant.
_id body string MongoDB ObjectId of the constant.
name body string Name of the constant.
data body array Data of the constant.

Examples

Example Get Constant Details 

Request

http://korlev-osdna-testing.cisco.com:8000/constants?name=link_states

Response

{
    "_id": "588796ac2e8f4d02b8e7aa2a",
    "data": [
        {
            "value": "up",
            "label": "up"
        },
        {
            "value": "down",
            "label": "down"
        }
    ],
    "id": "588796ac2e8f4d02b8e7aa2a",
    "name": "link_states"
}
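The data array of a constant is a list of value/label pairs, so a client (such as a UI) can turn it into a lookup table in one step. A small Python sketch using the link_states response above:

```python
# "data" array as returned by GET /constants?name=link_states
data = [
    {"value": "up", "label": "up"},
    {"value": "down", "label": "down"},
]

# Map each value to its display label
labels = {entry["value"]: entry["label"] for entry in data}
```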
Monitoring_Config_Templates

GET            /monitoring_config_templates

Description: get monitoring_config_template details with template id, or get a list of templates with filters except id (see monitoring-guide).

Normal response code: 200

Error response code: badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
id (Optional) query string ID of the monitoring config template; it must be a string that can be converted to a MongoDB ObjectId.
order (Optional) query int Order in which templates are applied; 1 is the OSDNA default template. Templates added later by the user use a higher order and override matching attributes in the default templates or add new attributes.
side (Optional) query string The side that runs the monitoring; possible values are “client” and “server”.
type (Optional) query string The name of the config file, e.g. “client.json”.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, an empty result set is returned. (Pages start from 0.)
page_size (Optional) query int Size of each page; the default is 1000.

Response 

Name In Type Description
id body string ID of the monitoring_config_template.
_id body string MongoDB ObjectId of the monitoring_config_template.
monitoring_system body string System used to do the monitoring, e.g. “Sensu”.
order body string Order in which templates are applied; 1 is the OSDNA default template. Templates added later by the user use a higher order and override matching attributes in the default templates or add new attributes.
config body object Configuration of the monitoring.
side body string The side which runs the monitoring.
type body string The name of the config file, e.g. “client.json”.

Examples

Example Get Monitoring_config_templates

Request

http://korlev-calipso-testing.cisco.com:8000/monitoring_config_templates?side=client&order=1&type=rabbitmq.json&page=0&page_size=1

Response

{
    "monitoring_config_templates": [
        {
            "type": "rabbitmq.json",
            "side": "client",
            "id": "583711893e149c14785d6daa"
        }
    ]
}

Example Get Monitoring_config_template Details

Request

http://korlev-calipso-testing.cisco.com:8000/monitoring_config_templates?id=583711893e149c14785d6daa

Response

{
    "order": "1",
    "monitoring_system": "sensu",
    "_id": "583711893e149c14785d6daa",
    "side": "client",
    "type": "rabbitmq.json",
    "config": {
        "rabbitmq": {
            "host": "{server_ip}",
            "vhost": "/sensu",
            "password": "{rabbitmq_pass}",
            "user": "{rabbitmq_user}",
            "port": 5672
        }
    },
    "id": "583711893e149c14785d6daa"
}
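The order semantics described above (templates applied in ascending order, with higher-order user templates overriding matching attributes in the defaults) can be sketched as a simple merge. `merge_templates` and the sample templates are illustrative, not Calipso code:

```python
def merge_templates(templates):
    """Apply template configs in ascending order; later attributes win."""
    merged = {}
    for tpl in sorted(templates, key=lambda t: int(t["order"])):
        merged.update(tpl["config"])
    return merged

templates = [
    {"order": "1", "config": {"host": "{server_ip}", "port": 5672}},  # default
    {"order": "2", "config": {"port": 5671}},                         # user override
]
```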
Aggregates

GET            /aggregates

Description: List some aggregated information about environment, message or constant.

Normal response code: 200

Error response code: badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
env_name (Optional) query string Environment name, if the aggregate type is “environment”, this value must be specified.
type (Optional) query string Type of aggregate, currently we support three types of aggregate, “environment”, “message” and “constant”.

Response 

Name In Type Description
type body string Type of aggregate, we support three types of aggregates now, “environment”, “message” and “constant”.
env_name (Optional) body string Environment name of the aggregate, when the aggregate type is “environment”, this attribute will appear.
aggregates body object The aggregates information.

Examples

Example Get Environment Aggregate 

Request

http://korlev-calipso-testing.cisco.com:8000/aggregates?env_name=Mirantis-Liberty-API&type=environment

Response

{
    "env_name": "Mirantis-Liberty-API",
    "type": "environment",
    "aggregates": {
        "object_types": {
            "projects_folder": 1,
            "instances_folder": 3,
            "otep": 3,
            "region": 1,
            "vedge": 3,
            "networks_folder": 2,
            "project": 2,
            "vconnectors_folder": 3,
            "availability_zone": 2,
            "vedges_folder": 3,
            "regions_folder": 1,
            "network": 3,
            "vnics_folder": 6,
            "instance": 2,
            "vservice": 4,
            "availability_zones_folder": 1,
            "vnic": 8,
            "vservices_folder": 3,
            "port": 9,
            "pnics_folder": 3,
            "network_services_folder": 3,
            "ports_folder": 3,
            "host": 3,
            "vconnector": 6,
            "network_agent": 6,
            "aggregates_folder": 1,
            "pnic": 15,
            "network_agents_folder": 3,
            "vservice_miscellenaous_folder": 1
        }
    }
}
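An environment aggregate can be post-processed on the client side, for example to total the per-type object counts. `total_objects` is an illustrative helper, and the sample below is an abbreviated version of a response like the one above:

```python
def total_objects(aggregate):
    """Sum the object counts across all object_types in an environment aggregate."""
    return sum(aggregate["aggregates"]["object_types"].values())

sample = {"aggregates": {"object_types": {"instance": 2, "vnic": 8, "host": 3}}}
```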

Example Get Messages Aggregate

Request

http://korlev-calipso-testing.cisco.com:8000/aggregates?type=message

Response

{
    "type": "message",
    "aggregates": {
        "levels": {
            "warn": 5,
            "info": 10,
            "error": 10
        },
        "environments": {
            "Mirantis-Liberty-API": 5,
            "Mirantis-Liberty": 10
        }
    }
}

Example Get Constants Aggregate

Request

http://korlev-calipso-testing.cisco.com:8000/aggregates?type=constant

Response

{
    "type": "constant",
    "aggregates": {
        "names": {
            "link_states": 2,
            "scan_statuses": 6,
            "type_drivers": 5,
            "log_levels": 6,
            "monitoring_sides": 2,
            "mechanism_drivers": 5,
            "messages_severity": 8,
            "distributions": 16,
            "link_types": 11,
            "object_types": 10
        }
    }
}
Environment_configs

GET            /environment_configs

Description: get environment_config details with name, or get a list of environments_config with filters except name

Normal response code: 200

Error response code: badRequest(400), unauthorized(401), notFound(404)

Request

Name In Type Description
name(Optional) query string Name of the environment.
distribution (Optional) query string The distribution of the OpenStack environment; it must be one of the supported distributions, e.g. “Mirantis-8.0” (you can get all the supported distributions by querying the distributions constant).
mechanism_drivers (Optional) query string The mechanism drivers of the environment; it should be one of the drivers in the mechanism_drivers constant, e.g. “ovs”.
type_drivers (Optional) query string One of “flat”, “gre”, “vlan”, “vxlan”.
user (Optional) query string Name of the environment user.
listen (Optional) query boolean Indicates whether the environment is being listened to.
scanned (Optional) query boolean Indicates whether the environment has been scanned.
monitoring_setup_done (Optional) query boolean Indicates whether the monitoring setup has been done.
operational (Optional) query string Operational status of the environment; possible statuses are “stopped”, “running” and “error”.
page (Optional) query int Which page to return. The default is the first page; if the requested page is beyond the last page of the query, an empty result set is returned. (Pages start from 0.)
page_size (Optional) query int Size of each page; the default is 1000.

Response

Name In Type Description
configuration body array List of configurations of the environment, including configurations of mysql, OpenStack, CLI, AMQP and Monitoring.
distribution body string The distribution of the OpenStack environment; it must be one of the supported distributions, e.g. “Mirantis-8.0”.
last_scanned body string The date the environment was last scanned; the format of the date is MM/DD/YY.
mechanism_drivers body array The mechanism drivers of the environment; it should be one of the drivers in the mechanism_drivers constant.
monitoring_setup_done body boolean Indicates whether the monitoring setup has been done.
name body string Name of the environment.
operational body boolean Indicates if the environment is operational.
scanned body boolean Indicates whether the environment has been scanned.
type body string Production, testing, development, etc.
type_drivers body string ‘flat’, ‘gre’, ‘vlan’, ‘vxlan’.
user body string The user of the environment.
listen body boolean Indicates whether the environment is being listened.

Examples

Example Get Environments config

Request

http://korlev-calipso-testing.cisco.com:8000/environment_configs?mechanism_drivers=ovs

Response

{
    "environment_configs": [
        {
            "distribution": "Canonical-icehouse",
            "name": "thundercloud"
        }
    ]
}

Example Environment config Details

Request

http://korlev-calipso-testing.cisco.com:8000/environment_configs?name=Mirantis-Mitaka-2

Response

{
    "type_drivers": "vxlan",
    "name": "Mirantis-Mitaka-2",
    "app_path": "/home/yarony/osdna_prod/app",
    "scanned": true,
    "type": "environment",
    "user": "test",
    "distribution": "Mirantis-9.1",
    "monitoring_setup_done": true,
    "listen": true,
    "mechanism_drivers": [
        "ovs"
    ],
    "configuration": [
        {
            "name": "mysql",
            "user": "root",
            "host": "10.56.31.244",
            "port": "3307",
            "password": "TsbQPwP2VPIUlcFShkCFwBjX"
        },
        {
            "name": "CLI",
            "user": "root",
            "host": "10.56.31.244",
            "key": "/home/ilia/Mirantis_Mitaka_id_rsa"
        },
        {
            "password": "G1VfxeJmtK5vIyNNMP4qZmXB",
            "user": "nova",
            "name": "AMQP",
            "port": "5673",
            "host": "10.56.31.244"
        },
        {
            "server_ip": "korlev-nsxe1.cisco.com",
            "name": "Monitoring",
            "port": "4567",
            "env_type": "development",
            "rabbitmq_pass": "sensuaccess",
            "rabbitmq_user": "sensu",
            "provision": "DB",
            "server_name": "devtest-sensu",
            "type": "Sensu",
            "config_folder": "/tmp/sensu_test"
        },
        {
            "user": "admin",
            "name": "OpenStack",
            "port": "5000",
            "admin_token": "qoeROniLLwFmoGixgun5AXaV",
            "host": "10.56.31.244",
            "pwd": "admin"
        }
    ],
    "_id": "582d77ee3e149c1318b3aa54",
    "operational": "yes"
}

POST            /environment_configs

Description: create a new environment configuration.

Normal response code: 201(Created)

Error response code: badRequest(400), unauthorized(401), notFound(404), conflict(409)

Request

Name In Type Description
configuration(Mandatory) body array List of configurations of the environment, including configurations of mysql (mandatory), OpenStack (mandatory), CLI (mandatory), AMQP (mandatory) and Monitoring (optional).
distribution(Mandatory) body string The distribution of the OpenStack environment; it must be one of the supported distributions, e.g. "Mirantis-8.0". (You can get all supported distributions by querying the distributions constants.)
last_scanned(Optional) body string The date and time of the last scan; it should follow ISO 8601: YYYY-MM-DDThh:mm:ss.sss+hhmm
mechanism_drivers(Mandatory) body array The mechanism drivers of the environment; each should be one of the drivers in the mechanism_drivers constants, e.g. "OVS".
name(Mandatory) body string Name of the environment.
operational(Mandatory) body string Operational status of the environment, e.g. "yes".
scanned(Optional) body boolean Indicates whether the environment has been scanned.
listen(Mandatory) body boolean Indicates whether the environment should be listened to.
user(Optional) body string The user of the environment.
app_path(Mandatory) body string The path where the app is located.
type(Mandatory) body string Production, testing, development, etc.
type_drivers(Mandatory) body string One of 'flat', 'gre', 'vlan', 'vxlan'.
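Since several fields are mandatory, a client may want to validate a payload before POSTing. A minimal sketch follows; the helper and sample payload are illustrative and not part of the Calipso API, which itself returns badRequest(400) on invalid input:

```python
# Mandatory field names taken from the request table above.
MANDATORY = {
    "configuration", "distribution", "mechanism_drivers", "name",
    "operational", "listen", "app_path", "type", "type_drivers",
}

def missing_fields(payload):
    """Return the mandatory fields absent from the payload, sorted."""
    return sorted(MANDATORY - payload.keys())

# An incomplete payload, for illustration only.
payload = {"name": "thundercloud", "distribution": "Canonical-icehouse"}
print(missing_fields(payload))  # fields still required before POST
```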

Request Example

Post http://korlev-calipso-testing:8000/environment_configs

{
    "app_path": "/home/korenlev/OSDNA/app/",
    "configuration": [
        {
            "host": "172.23.165.21",
            "name": "mysql",
            "password": "password",
            "port": NumberInt(3306),
            "user": "root",
            "schema": "nova"
        },
        {
            "name": "OpenStack",
            "host": "172.23.165.21",
            "admin_token": "TL4T0I7qYNiUifH",
            "admin_project": "admin",
            "port": "5000",
            "user": "admin",
            "pwd": "admin"
        },
        {
            "host": "172.23.165.21",
            "key": "/home/yarony/.ssh/juju_id_rsa",
            "name": "CLI",
            "user": "ubuntu"
        },
        {
            "name": "AMQP",
            "host": "10.0.0.1",
            "port": "5673",
            "user": "User",
            "password": "abcd1234"
        },
        {
            "config_folder": "/tmp/sensu_test_liberty",
            "provision": "None",
            "env_type": "development",
            "name": "Monitoring",
            "port": "4567",
            "rabbitmq_pass": "sensuaccess",
            "rabbitmq_user": "sensu",
            "server_ip": "korlev.cisco.com",
            "server_name": "devtest-sensu",
            "type": "Sensu"
        }
    ],
    "distribution": "Canonical-icehouse",
    "last_scanned": "2017-02-13T16:07:15Z",
    "listen": true,
    "mechanism_drivers": [
        "OVS"
    ],
    "name": "thundercloud",
    "operational": "yes",
    "scanned": false,
    "type": "environment",
    "type_drivers": "gre",
    "user": "WS7j8oTbWPf3LbNne"
}

Response 

Successful Example

{
    "message": "created environment_config for Mirantis-Liberty"
}
Calipso.io
Objects Model

Copyright (c) 2017 Koren Lev (Cisco Systems), Yaron Yogev (Cisco Systems) and others All rights reserved. This program and the accompanying materials are made available under the terms of the Apache License, Version 2.0 which accompanies this distribution, and is available at http://www.apache.org/licenses/LICENSE-2.0

image0

Project “Calipso” tries to illuminate complex virtual networking with real time operational state visibility for large and highly distributed Virtual Infrastructure Management (VIM).

Calipso provides visible insights using smart discovery and virtual topological representation in graphs, with monitoring per object in the graph inventory, to reduce error vectors and shorten troubleshooting and maintenance cycles for VIM operators and administrators.

The Calipso model, described in this document, was built for multi-environment use and many VIM variances; the model was tested successfully (as of Aug 27th) against 60 different VIM variances (distributions, versions, networking drivers and types).

Table of Contents

Calipso.io Objects Model

1 Environments config

2 Inventory objects

2.1 Host

2.2 physical NIC (pNIC)

2.3 Bond

2.4 Instance

2.5 virtual Service (vService)

2.6 Network

2.7 virtual NIC (vNIC)

2.8 Port

2.9 virtual Connector (vConnector)

2.10 virtual Edge (vEdge)

2.11 Overlay-Tunnel-Endpoint (OTEP)

2.12 Network_segment

2.13 Network_Agent

2.14 Looking up Calipso objects details

3 Link Objects

3.1 Link types

4 Clique objects

4.1 Clique types

5 Supported Environments

6 System collections

6.1 Attributes_for_hover_on_data

6.2 Clique_constraints

6.3 Connection_tests

6.4 Messages

6.5 Network_agent_types

6.6 Roles, Users

6.7 Statistics

6.8 Constants

6.9 Constants-env_types

6.10 Constants-log_levels

6.11 Constants-mechanism_drivers

6.12 Constants-type_drivers

6.13 Constants-environment_monitoring_types

6.14 Constants-link_states

6.15 Constants-environment_provision_types

6.16 Constants-environment_operational_status

6.17 Constants-link_types

6.18 Constants-monitoring_sides

6.19 Constants-object_types

6.20 Constants-scans_statuses

6.21 Constants-distributions

6.22 Constants-distribution_versions

6.23 Constants-message_source_systems

6.24 Constants-object_types_for_links

6.25 Constants-scan_object_types

Environments config

An environment is defined as a certain type of virtual infrastructure facility that runs under a single unified management (such as an OpenStack facility).

Everything in the Calipso application relies on the environments config, which is maintained in the "environments_config" collection in the Calipso mongo DB.

Environment configs are pushed down to the Calipso DB either through the UI or the API (in the OPNFV case, Calipso also provides an automated program that builds all needed environments_config parameters for an 'Apex' distribution automatically).

When scanning and discovering items, Calipso uses this configuration document to achieve successful scanning results. Here is an example of an environment config document:

{
    "name": "DEMO-ENVIRONMENT-SCHEME",
    "enable_monitoring": true,
    "last_scanned": "filled-by-scanning",
    "app_path": "/home/scan/calipso_prod/app",
    "type": "environment",
    "distribution": "Mirantis",
    "distribution_version": "8.0",
    "mechanism_drivers": ["OVS"],
    "type_drivers": "vxlan",
    "operational": "stopped",
    "listen": true,
    "scanned": false,
    "configuration": [
        {
            "name": "OpenStack",
            "port": "5000",
            "user": "adminuser",
            "pwd": "dummy_pwd",
            "host": "10.0.0.1",
            "admin_token": "dummy_token"
        },
        {
            "name": "mysql",
            "pwd": "dummy_pwd",
            "host": "10.0.0.1",
            "port": "3307",
            "user": "mysqluser"
        },
        {
            "name": "CLI",
            "user": "sshuser",
            "host": "10.0.0.1",
            "pwd": "dummy_pwd"
        },
        {
            "name": "AMQP",
            "pwd": "dummy_pwd",
            "host": "10.0.0.1",
            "port": "5673",
            "user": "rabbitmquser"
        },
        {
            "name": "Monitoring",
            "ssh_user": "root",
            "server_ip": "10.0.0.1",
            "ssh_password": "dummy_pwd",
            "rabbitmq_pass": "dummy_pwd",
            "rabbitmq_user": "sensu",
            "rabbitmq_port": "5671",
            "provision": "None",
            "env_type": "production",
            "ssh_port": "20022",
            "config_folder": "/local_dir/sensu_config",
            "server_name": "sensu_server",
            "type": "Sensu",
            "api_port": NumberInt(4567)
        },
        {
            "name": "ACI",
            "user": "admin",
            "host": "10.1.1.104",
            "pwd": "dummy_pwd"
        }
    ],
    "user": "wNLeBJxNDyw8G7Ssg",
    "auth": {
        "view-env": [
            "wNLeBJxNDyw8G7Ssg"
        ],
        "edit-env": [
            "wNLeBJxNDyw8G7Ssg"
        ]
    }
}

Here is a brief explanation of the purpose of the major keys in this environment configuration doc:

Distribution: captures type of VIM, used for scanning of objects, links and cliques.

Distribution_version: captures version of VIM distribution, used for scanning of objects, links and cliques.

Mechanism_driver: captures virtual switch type used by the VIM, used for scanning of objects, links and cliques.

Type_driver: captures virtual switch tunneling type used by the switch, used for scanning of objects, links and cliques.

Listen: defines whether or not to use Calipso listener against the VIM BUS for updating inventory in real-time from VIM events.

Scanned: defines whether or not Calipso ran a full and a successful scan against this environment.

Last_scanned: end time of last scan.

Operational: defines whether or not VIM environment endpoints are up and running.

Enable_monitoring: defines whether or not Calipso should deploy monitoring of the inventory objects running inside all environment hosts.

Configuration-OpenStack: defines credentials for OpenStack API endpoints access.

Configuration-mysql: defines credentials for OpenStack DB access.

Configuration-CLI: defines credentials for servers CLI access.

Configuration-AMQP: defines credentials for OpenStack BUS access.

Configuration-Monitoring: defines credentials and setup for Calipso sensu server (see monitoring-guide for details).

Configuration-ACI: defines credentials for ACI switched management API, if exists.

User and auth: used for UI authorizations to view and edit this environment.

App-path: defines the root directory of the scanning application.

Inventory objects

Calipso's success in scanning, discovering and analyzing many variances (60 as of Aug 27th 2017) of virtual infrastructure lies with its object model and relationship definitions (the model was tested even against a vSphere VMware environment).

Those objects are the real-time processes and systems that are built by workers and agents on the virtual infrastructure servers.

All Calipso objects are maintained in the “inventory” collection.

Here are the major objects defined in Calipso inventory in order to capture the real-time state of networking:

Host

It's the physical server that runs all virtual objects, typically a hypervisor or a container-hosting machine.

It's typically a bare-metal server; in some cases it might be virtual (running "nested" VMs as a second virtualization layer inside it).

physical NIC (pNIC)

It's the physical Ethernet Network Interface Card attached to the host. Typically several of these are available on a host; in some cases a few of them are grouped (bundled) together into EtherChannel bond interfaces.

For capturing data from real infrastructure devices, Calipso defines 2 types of pNICs: host_pnic (pNICs on the servers) and switch_pnic (pNICs on the physical switches). Calipso currently discovers host-to-switch physical connections only for some types of switches (Cisco ACI as of Aug 27th 2017).

Bond

It's a logical network interface that uses standard EtherChannel protocols to form a group of pNICs, providing enhanced throughput for communications to/from the host.

Calipso currently maintains bond details inside a host_pnic object.

Instance
It’s the virtual server created for running a certain application or function. Typically it’s a Virtual Machine, sometimes it’s a Container.
virtual Service (vService)
It's a process/system that provides some type of networking service to instances running on networks; some might be deployed as namespaces and some as VMs or containers. Examples: DHCP server, router, firewall, load-balancer, VPN service and others. Calipso categorizes vServices accordingly.
Network
It's an abstracted object, illustrating and representing all the components (see below) that build and provide communication services for several instances and vServices.
virtual NIC (vNIC)
There are 2 types - instance vNIC and vService vNIC:
  • Instance vNIC: It’s the virtual Network Interface Card attached to the Instance and used by it for communications from/to that instance.
  • vService vNIC: It’s the virtual Network Interface Card attached to the vService used by it for communications from/to that vService.
Port
It's an abstracted object representing the attachment point of an instance or a vService to the network; in reality it's fulfilled by deploying vNICs on hosts.
virtual Connector (vConnector)
It’s a process/system that provides layer 2 isolation for a specific network inside the host (isolating traffic from other networks). Examples: Linux Bridge, Bridge-group, port-group etc.
virtual Edge (vEdge)

It's a process/system that provides switching and routing services for instances and/or vServices running on a specific host. It functions as an edge device between the virtual components running on that host and the pNICs of that host, making sure traffic is maintained and still isolated across different networks.

Examples: Open Virtual Switch, Midonet, VPP.

Overlay-Tunnel-Endpoint (OTEP)
It’s an abstracted object representing the end-point on the host that runs a certain tunneling technology to provide isolation across networks and hosts for packets leaving and entering the pNICs of a specific host. Examples: VXLAN tunnels endpoints, GRE tunnels endpoints etc.
Network_segment

It's the specific segment used inside the "overlay tunnel" to represent traffic from a specific network; this depends on the specific encapsulation type of the OTEP.

Calipso currently maintains segments details inside a network object.

Network_Agent
It’s a controlling software running on the hosts for orchestrating the lifecycle of the above virtual components. Examples: DHCP agent, L3 agent, OVS agent, Metadata agent etc.
Looking up Calipso objects details

As explained in more detail in the Calipso admin-guide, the underlying database used is mongoDB. All major objects discovered by the Calipso scanning module are maintained in the "inventory" collection, and those documents include detailed attributes captured from the infrastructure about each object. Here are the main queries to use for grabbing each of the above object types from Calipso's inventory:

{type: "vnic"}

{type: "vservice"}

{type: "instance"}

{type: "host_pnic"}

{type: "switch_pnic"}

{type: "vconnector"}

{type: "vedge"}

{type: "network"}

{type: "network_agent"}

{type: "otep"}

{type: "host"}

{type: "port"}

All Calipso modules (visualization, monitoring and analysis) rely on those objects as baseline inventory items for any further computation.
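These queries can be run with any mongo client (with pymongo or the mongo shell, `db.inventory.find({"type": "vnic"})`). The sketch below mimics the same filter semantics over a plain Python list standing in for the "inventory" collection, so it runs without a MongoDB server; the sample documents are illustrative only:

```python
# A stand-in for the "inventory" collection (illustrative documents).
sample_inventory = [
    {"type": "host", "id": "node-1"},
    {"type": "vnic", "id": "tap-abc"},
    {"type": "vedge", "id": "ovs-node-1"},
    {"type": "vnic", "id": "tap-def"},
]

def find(collection, query):
    """Return documents matching every key/value pair in the query."""
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in query.items())]

vnics = find(sample_inventory, {"type": "vnic"})
print([doc["id"] for doc in vnics])  # ['tap-abc', 'tap-def']
```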

Here is an example of a query made using the MongoChef client application:

image1

* See Calipso API-guide for details on looking up those objects through the Calipso API.

The following simplified UML illustrates the way Calipso objects relationships are maintained in a VIM of type OpenStack:

image2

Clique objects

Cliques are lists of links. A clique represents a certain path in the virtual networking infrastructure that an administrator is interested in; this allows easier searching for and finding of certain points of interest ("focal points").
Clique types

Based on the specific VIM distribution, distribution version, mechanism driver and type driver variance, the Calipso scanning module searches for specific cliques using a model that is pre-populated in its "clique_types" collection and depends on the environment variance. Here is an example of a clique_type:

{
    "environment": "Apex-Euphrates",
    "link_types": [
        "instance-vnic",
        "vnic-vconnector",
        "vconnector-vedge",
        "vedge-otep",
        "otep-host_pnic",
        "host_pnic-network"
    ],
    "name": "instance_clique_for_opnfv",
    "focal_point_type": "instance"
}

The above model instructs the Calipso scanner to create cliques with the above list of link types for a "focal_point" that is an "instance" type of object. We believe this is a highly customizable model for analyzing dependencies in many use cases. We have included several clique types, common across the variances supported in this release.

The cliques themselves are then maintained in the “cliques” collection.

To clarify this concept, here is an example for an implementation use case in the Calipso UI module:

When the user of the UI clicks on a certain object of type=instance, they express a wish to see a graph representing the path taken by traffic from that specific instance (as the root source of traffic on that specific network) all the way down to the host pNIC and the (abstracted) network itself.

A successful completion of scanning and discovery means that all inventory objects, link objects and clique objects (based on the environment's clique types) are found and accurately represent the real-time state of virtual networking in the specific environment.
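The construction logic can be sketched as follows. This is a simplified illustration of how a clique_type's ordered link_types expand a focal point into a clique, not the actual scanner code; all names and link documents below are made up:

```python
# Ordered link types, as in the clique_type example above (truncated).
link_types = ["instance-vnic", "vnic-vconnector", "vconnector-vedge"]

# Illustrative link documents from a hypothetical "links" collection.
links = [
    {"link_type": "instance-vnic", "source": "inst-1", "target": "vnic-1"},
    {"link_type": "vnic-vconnector", "source": "vnic-1", "target": "vcon-1"},
    {"link_type": "vconnector-vedge", "source": "vcon-1", "target": "vedge-1"},
    {"link_type": "vnic-vconnector", "source": "vnic-9", "target": "vcon-9"},
]

def build_clique(focal_point, link_types, links):
    """Collect links reachable from the focal point, in link_types order."""
    nodes, clique = {focal_point}, []
    for lt in link_types:                  # follow the configured order
        for link in links:
            if link["link_type"] == lt and link["source"] in nodes:
                nodes.add(link["target"])  # extend the reachable set
                clique.append(link)
    return clique

clique = build_clique("inst-1", link_types, links)
print([l["link_type"] for l in clique])
```

Note how the unrelated "vnic-9" link is never reached from the focal point and so stays out of the clique.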

Supported Environments

As of Aug 27th 2017, the Calipso application supports 60 different VIM environment variances, and with each release the goal is to maintain that support and add more variances following the VIM development cycles. The supported variances, and the specific Calipso functions available for each, are captured in the "supported_environments" collection. Here are two examples of that 'supported' model:

1.

{
    "environment": {
        "distribution": "Apex",
        "distribution_version": ["Euphrates"],
        "mechanism_drivers": "OVS",
        "type_drivers": "vxlan"
    },
    "features": {
        "listening": true,
        "scanning": true,
        "monitoring": false
    }
}

2.

{
    "environment": {
        "distribution": "Mirantis",
        "distribution_version": ["6.0", "7.0", "8.0", "9.0", "9.1", "10.0"],
        "mechanism_drivers": "OVS",
        "type_drivers": "vxlan"
    },
    "features": {
        "listening": true,
        "scanning": true,
        "monitoring": true
    }
}

The examples above define for the Calipso application that:

  1. For an ‘Apex’ environment of version ‘Euphrates’ using OVS and vxlan, Calipso can scan/discover all details (objects, links, cliques) but is not yet monitoring those discovered objects.
  2. For a “Mirantis” environment of versions 6.0 to 10.0 using OVS and vxlan, Calipso can scan/discover all details (objects, links, cliques) and also monitor those discovered objects.

With each Calipso release, more "supported_environments" entries should be added.

System collections

Calipso uses other system collections to maintain its data for scanning, event handling and monitoring, and to help operate the API and UI modules. Here is the current list of collections not yet covered in the other guides:

Attributes_for_hover_on_data

This collection maintains a list of documents describing what will be presented in the UI popup when the user hovers over a specific object type. It details which parameters or attributes from the object's data will be shown, making this popup fully customizable.

Clique_constraints

Defines the logic by which cliques are built; currently the network is the main focus of the UI (the central point of connection for all cliques in the system), but this is customizable.

When building a clique graph, Calipso defaults to traversing all node edges (links) in the graph.

In some cases we want to limit the graph so it does not expand too much (or forever).

For example: when we build the graph for a specific instance, we limit the graph to only take objects from the network on which this instance resides; otherwise the graph would show objects related to other instances.

The constraint validation is done by checking value V from the focal point F on the links.

For example, if an instance is on network X, we check that each link we use either has network X (its "network" attribute has value X) or does not have a "network" attribute at all.
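That validation rule can be sketched as follows; the link documents are illustrative, not taken from a real environment:

```python
def link_allowed(link, focal_network):
    """A link is usable if it has no "network" attribute or the same one."""
    return link.get("network") in (None, focal_network)

links = [
    {"link_type": "instance-vnic", "network": "X"},   # same network: keep
    {"link_type": "vnic-vconnector", "network": "Y"}, # other network: skip
    {"link_type": "otep-host_pnic"},                  # no network attribute: keep
]
usable = [l for l in links if link_allowed(l, "X")]
print(len(usable))  # 2
```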

Connection_tests

This collection keeps requests from the UI or API to test the different adapters (API, DB, CLI, etc.) and their connections to the underlying VIM, making sure dynamic, real-time data will be maintained during discovery.

Messages

Aggregates all logs from the different systems; the source_system of logs is currently defined as "OpenStack" (the VIM), "Sensu" (the monitoring module) or "Calipso" (logs of the application itself). Messages have 6 levels of severity and can be browsed in the UI and through the Calipso API.

Network_agent_types

Lists the types of networking agents supported on the VIM (per distribution and version).

Roles, Users

A basic RBAC facility to authorize Calipso UI users for certain Calipso functionalities in the UI.

Statistics

Built for detailed analysis and future functionality; used today for traffic analysis (capturing samples of throughput per session on VPP-based environments).

Constants

This is an aggregated collection for many types of documents that are required mostly by the UI and basic functionality on some scanning classes (‘fetchers’).

Constants-env_types

Types of environments allowed when configuring the Sensu monitoring framework.

Constants-log_levels

Severity levels for messages generated.

Constants-mechanism_drivers

Mechanism-drivers allowed for UI users.

Constants-type_drivers

Type-drivers allowed for UI users.

Constants-environment_monitoring_types

Currently only "Sensu" is available; this might be used for other monitoring system integrations.

Constants-environment_provision_types

The types of deployment options available for monitoring (see monitoring-guide for details).

Constants-environment_operational_status

Captures the overall (aggregated) status of a certain environment.

Constants-monitoring_sides

Used for monitoring auto configurations of clients and servers.

Constants-object_types

Lists the type of objects supported through scanning (inventory objects).

Constants-scans_statuses

During scans, several statuses are shown on the UI, based on the specific stage and results.

Constants-distributions

Lists the VIM distributions.

Constants-distribution_versions

Lists the versions of the different VIM distributions.

Constants-message_source_systems

The list of systems that can generate logs and messages.

Constants-scan_object_types

Object_types used during scanning, see development-guide for details.

Constants-configuration_targets

Names of the configuration targets used in the configuration section of environment configs.

Calipso.io
Quick Start Guide

Copyright (c) 2017 Koren Lev (Cisco Systems), Yaron Yogev (Cisco Systems) and others All rights reserved. This program and the accompanying materials are made available under the terms of the Apache License, Version 2.0 which accompanies this distribution, and is available at http://www.apache.org/licenses/LICENSE-2.0

image0

Project “Calipso” tries to illuminate complex virtual networking with real time operational state visibility for large and highly distributed Virtual Infrastructure Management (VIM).

We believe that Stability is driven by accurate Visibility.

Calipso provides visible insights using smart discovery and virtual topological representation in graphs, with monitoring per object in the graph inventory, to reduce error vectors and shorten troubleshooting and maintenance cycles for VIM operators and administrators.

Table of Contents

Calipso.io Quick Start Guide

1 Getting started

1.1 Post installation tools

1.2 Calipso containers details

1.3 Calipso containers access

2 Validating Calipso app

2.1 Validating calipso-mongo module

2.2 Validating calipso-scan module

2.3 Validating calipso-listen module

2.4 Validating calipso-api module

2.5 Validating calipso-sensu module

2.6 Validating calipso-ldap module

Getting started

Post installation tools

The Calipso administrator should first complete the installation as per the install-guide document.

After all calipso containers are running, the administrator can start examining the application using the following suggested tools:

  1. MongoChef : https://studio3t.com/download/ as a useful GUI client to interact with calipso mongoDB module.
  2. Web Browser to access calipso-UI at the default location: http://server-IP
  3. SSH client to access other calipso containers as needed.
  4. Python3 toolsets for debugging and development as needed.
Calipso containers details
Calipso currently consists of the following 7 containers:
  1. Mongo: holds and maintains calipso’s data inventories.

  2. LDAP: holds and maintains calipso’s user directories.

  3. Scan: deals with automatic discovery of virtual networking from VIMs.

  4. Listen: deals with automatic updating of virtual networking into inventories.

  5. API: runs calipso’s RESTful API server.

  6. UI: runs calipso’s GUI/web server.

  7. Sensu: runs calipso’s monitoring server.

    After a successful installation, the Calipso containers should have been downloaded, registered and started. Here are the images used:

    sudo docker images

    Expected results (as of Aug 2017):

    REPOSITORY          TAG       IMAGE ID        CREATED         SIZE
    korenlev/calipso    listen    12086aaedbc3    6 hours ago     1.05GB
    korenlev/calipso    api       34c4c6c1b03e    6 hours ago     992MB
    korenlev/calipso    scan      1ee60c4e61d5    6 hours ago     1.1GB
    korenlev/calipso    sensu     a8a17168197a    6 hours ago     1.65GB
    korenlev/calipso    mongo     17f2d62f4445    22 hours ago    1.31GB
    korenlev/calipso    ui        ab37b366e812    11 days ago     270MB
    korenlev/calipso    ldap      316bc94b25ad    2 months ago    269MB

    Typically the Calipso application is fully operational at this stage, and you can jump to the 'Using Calipso' section to learn how to use it; the following explains how the containers are deployed by calipso-installer.py, for general reference.

    Checking the running containers status and ports in use:

    sudo docker ps

    Expected results and details (as of Aug 2017):

image2

The above listed TCP ports are used by default on the hosts and mapped to each calipso container; you should be familiar with these per-container port mappings.

Checking running containers entry-points (The commands used inside the container):

sudo docker inspect [container-ID]

Expected results (as of Aug 2017):

image3

Calipso containers' configuration can be listed with docker inspect, as summarized in the table above. In a non-containerized deployment (see the 'Monolithic app' install option in the install-guide), these are the individual commands needed to run calipso manually for special development needs.

The 'calipso-sensu' container is built using the sensu framework customized for the calipso monitoring design, 'calipso-ui' is built using the meteor framework, and 'calipso-ldap' is built using a pre-defined open-ldap container; as such, those three are only supported as pre-built containers.

Administrator should be aware of the following details deployed in the containers:

  1. calipso-api, calipso-sensu, calipso-scan and calipso-listen map the host directory /home/calipso as volume /local_dir inside the container.

    They use calipso_mongo_access.conf and ldap.conf files for configuration.

    They use /home/scan/calipso_prod/app as the main PYTHONPATH needed to run the different python modules per container.

  2. Calipso-sensu uses the 'supervisord' process to control all sensu server processes needed for calipso, as well as the calipso event handler on this container.

  3. Calipso-ldap can be used as standalone, but is a pre-requisite for calipso-api.

  4. Calipso-ui needs calipso-mongo with the latest scheme in order to run and offer UI services.

Calipso containers access

The different Calipso containers are also accessible over SSH with pre-defined default credentials. Here are the access details:

Calipso-listen: ssh scan@localhost -p 50022 , password = scan

Calipso-scan: ssh scan@localhost -p 30022 , password = scan

Calipso-api: ssh scan@localhost -p 40022 , password = scan

Calipso-sensu: ssh scan@localhost -p 20022 , password = scan

Calipso-ui: only accessible through web browser

Calipso-ldap: only accessible through ldap tools.

Calipso-mongo: only accessible through mongo clients like MongoChef.

Validating Calipso app

Validating calipso-mongo module

Using the MongoChef client, create a new connection pointing to the server where the calipso-mongo container is running, using port 27017 and the following default credentials:

Host IP=server_IP and TCP port=27017

Username : calipso

Password : calipso_default

Auto-DB: calipso

Defaults are also configured into /home/calipso/calipso_mongo_access.conf.
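These defaults can equally be expressed as a standard MongoDB connection URI, usable by pymongo or other mongo tools; server_IP is the same placeholder used above and must be replaced with the real host:

```python
from urllib.parse import quote_plus

# Default credentials from the quick-start guide; server_IP is a placeholder.
user, password = "calipso", "calipso_default"
host, port, auth_db = "server_IP", 27017, "calipso"

# quote_plus guards against special characters in credentials.
uri = "mongodb://%s:%s@%s:%d/%s" % (
    quote_plus(user), quote_plus(password), host, port, auth_db)
print(uri)
# with pymongo: client = pymongo.MongoClient(uri)
```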

The following is a screenshot of a correct connection setup in MongoChef:

image4

When clicking on the newly defined connection, the calipso DB should be listed:

image5

At this stage you can check out calipso-mongo collections data and validate as needed.

Validating calipso-scan module

The scan container runs the main Calipso scanning engine, which receives requests to scan a specific VIM environment. The following commands validate that the main scan_manager.py process is running and waiting for scan requests:

sudo docker ps # grab the containerID of calipso-scan

sudo docker logs bf5f2020028a #containerID for example

Expected results:

2017-08-28 06:11:39,231 INFO: Using inventory collection: inventory

2017-08-28 06:11:39,231 INFO: Using links collection: links

2017-08-28 06:11:39,231 INFO: Using link_types collection: link_types

2017-08-28 06:11:39,231 INFO: Using clique_types collection: clique_types

2017-08-28 06:11:39,231 INFO: Using clique_constraints collection: clique_constraints

2017-08-28 06:11:39,231 INFO: Using cliques collection: cliques

2017-08-28 06:11:39,232 INFO: Using monitoring_config collection: monitoring_config

2017-08-28 06:11:39,232 INFO: Using constants collection: constants

2017-08-28 06:11:39,232 INFO: Using scans collection: scans

2017-08-28 06:11:39,232 INFO: Using messages collection: messages

2017-08-28 06:11:39,232 INFO: Using monitoring_config_templates collection: monitoring_config_templates

2017-08-28 06:11:39,232 INFO: Using environments_config collection: environments_config

2017-08-28 06:11:39,232 INFO: Using supported_environments collection: supported_environments

2017-08-28 06:11:39,233 INFO: Started ScanManager with following configuration:

Mongo config file path: /local_dir/calipso_mongo_access.conf

Scans collection: scans

Environments collection: environments_config

Polling interval: 1 second(s)

The above logs show that scan_manager.py is running and listening for scan requests (should they come in through the ‘scans’ collection for a specific environment listed in the ‘environments_config’ collection; refer to the user guide for details).
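When checking such logs by hand is inconvenient, the output of sudo docker logs can be inspected with a small sketch (illustrative, not part of Calipso) that extracts the collection mappings and confirms ScanManager started:

```python
import re

def parse_scan_logs(log_text):
    """Extract the 'Using <name> collection: <value>' pairs and whether
    ScanManager reported that it started with its configuration."""
    collections = dict(re.findall(r"Using (\w+) collection: (\w+)", log_text))
    started = "Started ScanManager with following configuration" in log_text
    return collections, started

# Feed it the captured `sudo docker logs <containerID>` output;
# `started` should be True and `collections` should mirror the lines above.
```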

Validating calipso-listen module

The listen container runs the main Calipso event_manager engine, which listens for events on a specific VIM bus environment. The following output (grab the container ID with sudo docker ps and view it with sudo docker logs, as above) validates that the main event_manager.py process is running and waiting for events from the bus:

2017-08-28 06:11:35,572 INFO: Using inventory collection: inventory

2017-08-28 06:11:35,572 INFO: Using links collection: links

2017-08-28 06:11:35,572 INFO: Using link_types collection: link_types

2017-08-28 06:11:35,572 INFO: Using clique_types collection: clique_types

2017-08-28 06:11:35,572 INFO: Using clique_constraints collection: clique_constraints

2017-08-28 06:11:35,573 INFO: Using cliques collection: cliques

2017-08-28 06:11:35,573 INFO: Using monitoring_config collection: monitoring_config

2017-08-28 06:11:35,573 INFO: Using constants collection: constants

2017-08-28 06:11:35,573 INFO: Using scans collection: scans

2017-08-28 06:11:35,573 INFO: Using messages collection: messages

2017-08-28 06:11:35,573 INFO: Using monitoring_config_templates collection: monitoring_config_templates

2017-08-28 06:11:35,573 INFO: Using environments_config collection: environments_config

2017-08-28 06:11:35,574 INFO: Using supported_environments collection: supported_environments

2017-08-28 06:11:35,574 INFO: Started EventManager with following configuration:

Mongo config file path: /local_dir/calipso_mongo_access.conf

Collection: environments_config

Polling interval: 5 second(s)

The above logs show that event_manager.py is running and listening for events (should they come in from the VIM bus) for the environments listed in the ‘environments_config’ collection; refer to the user guide for details.

Validating calipso-api module

The API container runs the main Calipso API, which allows applications to integrate with the Calipso inventory and functions. The following commands validate that it is operational:

sudo docker ps # grab the containerID of calipso-api

sudo docker logs bf5f2020028c #containerID for example

Expected results:

2017-08-28 06:11:38,118 INFO: Using inventory collection: inventory

2017-08-28 06:11:38,119 INFO: Using links collection: links

2017-08-28 06:11:38,119 INFO: Using link_types collection: link_types

2017-08-28 06:11:38,119 INFO: Using clique_types collection: clique_types

2017-08-28 06:11:38,120 INFO: Using clique_constraints collection: clique_constraints

2017-08-28 06:11:38,120 INFO: Using cliques collection: cliques

2017-08-28 06:11:38,121 INFO: Using monitoring_config collection: monitoring_config

2017-08-28 06:11:38,121 INFO: Using constants collection: constants

2017-08-28 06:11:38,121 INFO: Using scans collection: scans

2017-08-28 06:11:38,121 INFO: Using messages collection: messages

2017-08-28 06:11:38,121 INFO: Using monitoring_config_templates collection: monitoring_config_templates

2017-08-28 06:11:38,122 INFO: Using environments_config collection: environments_config

2017-08-28 06:11:38,122 INFO: Using supported_environments collection: supported_environments

[2017-08-28 06:11:38 +0000] [6] [INFO] Starting gunicorn 19.4.5

[2017-08-28 06:11:38 +0000] [6] [INFO] Listening at: http://0.0.0.0:8000 (6)

[2017-08-28 06:11:38 +0000] [6] [INFO] Using worker: sync

[2017-08-28 06:11:38 +0000] [12] [INFO] Booting worker with pid: 12

The above logs show that the Calipso API is running and listening on port 8000 for requests.
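The listener on port 8000 can also be probed directly; here is a minimal sketch (an illustrative helper, not part of Calipso) that treats any HTTP answer as proof of a live server:

```python
import urllib.request
import urllib.error

def api_is_listening(host="localhost", port=8000, timeout=5):
    """Return True if an HTTP server answers on host:port (any status)."""
    try:
        urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True   # an HTTP error status still proves a listener is up
    except (urllib.error.URLError, OSError):
        return False
```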

Validating calipso-sensu module

The Sensu container runs several Sensu services (currently unified into one container for simplicity) along with the Calipso event handler (refer to the user guide for details). Here is how to validate that it is operational:

ssh scan@localhost -p 20022 # default password = scan

sudo /etc/init.d/sensu-client status

sudo /etc/init.d/sensu-server status

sudo /etc/init.d/sensu-api status

sudo /etc/init.d/uchiwa status

sudo /etc/init.d/rabbitmq-server status

Expected results:

Each of the above should return a PID and a ‘running’ state.

ls /home/scan/calipso_prod/app/monitoring/handlers # should list monitor.py module.

The above output shows that calipso-sensu is running and listening for monitoring events from the sensu-clients on the VIM hosts; refer to the user guide for details.

Validating calipso-ui module

The UI container runs several JS processes against the back-end MongoDB. It needs data to run, and by design it will not run if the connection to the DB is lost. To validate the operational state of the UI, simply point a web browser to http://server-IP:80 and expect a login page. Use admin/123456 as the default credentials to log in:

image6

Validating calipso-ldap module

The LDAP container runs a common user directory for integration with the UI and API modules; it is shipped with Calipso to validate the interaction with LDAP. The main configuration needed for communicating with it is stored by the Calipso installer in /home/calipso/ldap.conf and is accessed by the API module. In production use cases a corporate LDAP server might be used instead; in that case ldap.conf needs to be changed to point to the corporate server.

To validate LDAP container, you will need to install openldap-clients, using:

yum -y install openldap-clients # RHEL/CentOS

apt-get install openldap-clients # Debian/Ubuntu

Search all LDAP users inside that ldap server:

ldapsearch -x -H ldap://localhost -LL -b ou=Users,dc=openstack,dc=org

Admin user details on this container (user=admin, pass=password):

LDAP username : cn=admin,dc=openstack,dc=org

cn=admin,dc=openstack,dc=org’s password : password

Account BaseDN [DC=168,DC=56,DC=153:49154]: ou=Users,dc=openstack,dc=org

Group BaseDN [ou=Users,dc=openstack,dc=org]:

Add a new user (admin credentials needed to bind to ldap and add users):

Create a /tmp/adduser.ldif file; use this example (adjust the dn’s ou/dc components to your directory, make sure cn matches the dn, set userpassword to the user’s password, and note that LDIF does not allow inline comments or blank lines within an entry):

dn: cn=Myname,ou=Users,dc=openstack,dc=org
objectclass: inetOrgPerson
cn: Myname
sn: Koren
uid: korlev
userpassword: mypassword
carlicense: MYCAR123
homephone: 555-111-2222
mail: korlev@cisco.com
description: koren guy
ou: calipso Department
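Entries like the example above can also be generated programmatically; here is a minimal sketch (an illustrative helper, not part of Calipso; attribute names follow the inetOrgPerson schema used above):

```python
def to_ldif(dn, attrs):
    """Render one LDIF entry; attrs keeps insertion order (Python 3.7+)."""
    lines = [f"dn: {dn}"] + [f"{k}: {v}" for k, v in attrs.items()]
    return "\n".join(lines) + "\n"

entry = to_ldif(
    "cn=Myname,ou=Users,dc=openstack,dc=org",
    {"objectclass": "inetOrgPerson", "cn": "Myname", "sn": "Koren",
     "uid": "korlev", "userpassword": "mypassword"},
)
# Write `entry` to /tmp/adduser.ldif and feed it to ldapadd as shown below.
```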

Run this command to add the above user attributes to the LDAP server (it binds with the admin credentials, which by default are authorized to add users):

ldapadd -x -D cn=admin,dc=openstack,dc=org -w password -c -f /tmp/adduser.ldif

You should see an “adding new entry” message if successful.

Validate users against this LDAP container:

Wrong credentials:

ldapwhoami -x -D cn=Koren,ou=Users,dc=openstack,dc=org -w korlevwrong

Response: ldap_bind: Invalid credentials (49)

Correct credentials:

ldapwhoami -x -D cn=Koren,ou=Users,dc=openstack,dc=org -w korlev

Response: dn:cn=Koren,ou=Users,dc=openstack,dc=org

The ou/dc details in the reply can be used by any application (UI, API, etc.) for mapping users to application-specific groups.
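Splitting such a DN reply into its components can be sketched as follows (illustrative; the naive comma split does not handle escaped commas in attribute values):

```python
def parse_dn(dn):
    """Split a DN into its RDN components, keeping repeated keys in order."""
    parts = {}
    for rdn in dn.split(","):
        key, _, value = rdn.partition("=")
        parts.setdefault(key.strip(), []).append(value.strip())
    return parts

# parse_dn("cn=Koren,ou=Users,dc=openstack,dc=org")
# -> {'cn': ['Koren'], 'ou': ['Users'], 'dc': ['openstack', 'org']}
```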

  • If all of the above validations passed, Calipso is now fully functional; refer to the admin guide for more details.

Found a typo or any other feedback? Send an email to users@opnfv.org or talk to us on IRC.