OPNFV Documentation

Open Platform for NFV (OPNFV) facilitates the development and evolution of NFV components across various open source ecosystems. Through system level integration, deployment and testing, OPNFV creates a reference NFV platform to accelerate the transformation of enterprise and service provider networks. Participation is open to anyone, whether you are an employee of a member company or just passionate about network transformation.

Release

Platform overview

Introduction

Network Functions Virtualization (NFV) is transforming the networking industry via software-defined infrastructures and open source is the proven method for quickly developing software for commercial products and services that can move markets. Open Platform for NFV (OPNFV) facilitates the development and evolution of NFV components across various open source ecosystems. Through system level integration, deployment and testing, OPNFV constructs a reference NFV platform to accelerate the transformation of enterprise and service provider networks. As an open source project, OPNFV is uniquely positioned to bring together the work of standards bodies, open source communities, service providers and commercial suppliers to deliver a de facto NFV platform for the industry.

By integrating components from upstream projects, the community is able to conduct performance and use case-based testing on a variety of solutions to ensure the platform’s suitability for NFV use cases. OPNFV also works upstream with other open source communities to bring contributions and learnings from its work directly to those communities in the form of blueprints, patches, bugs, and new code.

OPNFV initially focused on building NFV Infrastructure (NFVI) and Virtualised Infrastructure Management (VIM) by integrating components from upstream projects such as OpenDaylight, OpenStack, Ceph Storage, KVM, Open vSwitch, and Linux. More recently, OPNFV has extended its portfolio of forwarding solutions to include fd.io and ODP, runs on both Intel and ARM commercial and white-box hardware, supports VM, container and bare-metal workloads, and includes Management and Network Orchestration (MANO) components, primarily for application composition and management, in the Danube release.

These capabilities, along with application programmable interfaces (APIs) to other NFV elements, form the basic infrastructure required for Virtualized Network Functions (VNF) and MANO components.

Concentrating on these components while also considering proposed projects on additional topics (such as the MANO components and applications themselves), OPNFV aims to enhance NFV services by increasing performance and power efficiency, improving reliability, availability and serviceability, and delivering comprehensive platform instrumentation.

OPNFV Platform Architecture

The OPNFV project addresses a number of aspects in the development of a consistent virtualisation platform including common hardware requirements, software architecture, MANO and applications.

OPNFV Platform Overview Diagram

Overview infographic of the OPNFV platform and projects.

To address these areas effectively, the OPNFV platform architecture can be decomposed into the following basic building blocks:

  • Hardware: with the Infra working group, Pharos project and associated activities
  • Software Platform: through the platform integration and deployment projects
  • MANO: through the MANO working group and associated projects
  • Applications: which affect all other areas and drive requirements for OPNFV

OPNFV Lab Infrastructure

The infrastructure working group oversees such topics as lab management, workflow, definitions, metrics and tools for OPNFV infrastructure.

Fundamental to the WG is the Pharos Specification which provides a set of defined lab infrastructures over a geographically and technically diverse federated global OPNFV lab.

Labs may instantiate bare-metal and virtual environments that are accessed remotely by the community and used for OPNFV platform and feature development, build, deploy and testing. No two labs are the same and the heterogeneity of the Pharos environment provides the ideal platform for establishing hardware and software abstractions providing well understood performance characteristics.

Community labs are hosted by OPNFV member companies on a voluntary basis. The Linux Foundation also hosts an OPNFV lab that provides centralized CI and other production resources which are linked to community labs. Future lab capabilities will include the ability to easily automate deployment and testing of any OPNFV install scenario in any lab environment, as well as on a nested “lab as a service” virtual infrastructure.

OPNFV Software Platform Architecture

The OPNFV software platform is composed exclusively of open source implementations of platform components. OPNFV draws from the rich ecosystem of NFV-related technologies available in open source, then integrates, tests, measures and improves these components in conjunction with our upstream source communities.

While the OPNFV software platform is highly complex and composed of many projects and components, a subset of these projects receive the most attention from the OPNFV community and drive the development of new technologies and capabilities.

Virtual Infrastructure Management

OPNFV derives its virtual infrastructure management from one of its largest upstream ecosystems, OpenStack. OpenStack provides a complete reference cloud management system and associated technologies. While the OpenStack community sustains a broad set of projects, not all of them are relevant in an NFV domain; the OPNFV community therefore consumes a subset of OpenStack projects, where the usage and composition may vary depending on the installer and scenario.

For details on the scenarios available in OPNFV and the specific composition of components, refer to the OPNFV User Guide & Configuration Guide.

Operating Systems

OPNFV currently uses Linux on all target machines; this can include Ubuntu, CentOS or SUSE Linux. The specific version of Linux used for any deployment is documented in the installation guide.

Networking Technologies
SDN Controllers

OPNFV, as an NFV-focused project, invests significantly in networking technologies and provides a broad variety of integrated open source reference solutions. The diversity of controllers usable in OPNFV is matched by a similarly diverse set of forwarding technologies.

There are many SDN controllers available today that are relevant to virtual environments, and the OPNFV community supports and contributes to a number of them. The controllers being worked on by the community during this release of OPNFV include:

  • Neutron: an OpenStack project to provide “network connectivity as a service” between interface devices (e.g., vNICs) managed by other OpenStack services (e.g., nova).
  • OpenDaylight: addresses multivendor, traditional and greenfield networks, establishing the industry’s de facto SDN platform and providing the foundation for networks of the future.
  • ONOS: a carrier-grade SDN network operating system designed for high availability, performance, scale-out.
Data Plane

OPNFV extends Linux virtual networking capabilities by using virtual switching and routing components. The OPNFV community proactively engages with these source communities to address performance, scale and resiliency needs apparent in carrier networks.

  • FD.io (Fast data - Input/Output): a collection of several projects and libraries to amplify the transformation that began with Data Plane Development Kit (DPDK) to support flexible, programmable and composable services on a generic hardware platform.
  • Open vSwitch: a production quality, multilayer virtual switch designed to enable massive network automation through programmatic extension, while still supporting standard management interfaces and protocols.

Deployment Architecture

A typical OPNFV deployment starts with three controller nodes running in a high availability configuration, including control plane components from OpenStack, SDN, etc., and a minimum of two compute nodes for deployment of workloads (VNFs). A detailed description of the hardware required to support this 5 node configuration can be found in the Pharos specification: Pharos Project

In addition to the deployment on a highly available physical infrastructure, OPNFV can be deployed for development and lab purposes in a virtual environment. In this case each of the hosts is provided by a virtual machine, allowing control and workload placement using nested virtualization.

The initial deployment is done using a staging server, referred to as the “jumphost”. This server, either physical or virtual, is first installed with the installation program, which then installs OpenStack and other components on the controller nodes and compute nodes. See the OPNFV User Guide & Configuration Guide for more details.

The OPNFV Testing Ecosystem

The OPNFV community has set out to address the needs of virtualization in the carrier network and as such platform validation and measurements are a cornerstone to the iterative releases and objectives.

To simplify the complex task of feature, component and platform validation and characterization the testing community has established a fully automated method for addressing all key areas of platform validation. This required the integration of a variety of testing frameworks in our CI systems, real time and automated analysis of results, storage and publication of key facts for each run as shown in the following diagram.

Overview infographic of the OPNFV testing Ecosystem

Release Verification

The OPNFV community relies on its testing community to establish release criteria for each OPNFV release. With each release cycle the testing criteria become more stringent and more representative of our feature and resiliency requirements.

As each OPNFV release establishes a set of deployment scenarios to validate, the testing infrastructure and test suites need to accommodate these features and capabilities. It is not only in the validation of the scenarios themselves that complexity increases: some test cases require multiple datacenters to execute when evaluating features, including multisite and distributed datacenter solutions.

The release criteria established by the testing teams include passing a set of test cases derived from the functional testing project ‘functest,’ a set of test cases derived from our platform system and performance test project ‘yardstick,’ and a selection of test cases for feature capabilities derived from other test projects such as bottlenecks, vsperf, cperf and storperf. The scenario needs to be able to be deployed, pass these tests, and be removed from the infrastructure iteratively (no less than 4 times) in order to fulfil the release criteria.

Functest

Functest provides a functional testing framework incorporating a number of test suites and test cases that test and verify OPNFV platform functionality. The scope of Functest and the relevant test cases can be found in the Functest User Guide.

Functest provides both feature project and component test suite integration, leveraging OpenStack and SDN controllers testing frameworks to verify the key components of the OPNFV platform are running successfully.

Yardstick

Yardstick is a testing project for verifying infrastructure compliance when running VNF applications. Yardstick benchmarks a number of characteristics and performance vectors of the infrastructure, making it a valuable pre-deployment NFVI testing tool.

Yardstick provides a flexible testing framework for launching other OPNFV testing projects.

There are two types of test cases in Yardstick:

  • Yardstick generic test cases, which include basic characteristics benchmarking in the compute/storage/network areas.
  • OPNFV feature test cases, which include basic telecom feature testing from OPNFV projects; for example nfv-kvm, sfc, ipv6, Parser, Availability and SDN VPN.

System Evaluation and compliance testing

The OPNFV community is developing a set of test suites intended to evaluate a set of reference behaviors and capabilities for NFV systems developed externally from the OPNFV ecosystem, and to measure their ability to provide the features and capabilities developed in the OPNFV ecosystem.

The Dovetail project will provide a test framework and methodology able to be used on any NFV platform, including an agreed set of test cases establishing evaluation criteria for exercising an OPNFV compatible system. The Dovetail project has begun establishing the test framework and will provide a preliminary methodology for the Danube release. Work will continue to develop these test cases to establish a stand-alone compliance evaluation solution in future releases.

Additional Testing

Besides the test suites and cases for release verification, additional testing is performed to validate specific features or characteristics of the OPNFV platform. These testing frameworks and test cases may address specific needs, such as extended measurements, additional testing stimuli, or tests simulating environmental disturbances or failures.

These additional testing activities provide a more complete evaluation of the OPNFV platform. Some of the projects focused on these testing areas include:

VSPERF

VSPERF provides an automated test framework and comprehensive test suite for measuring data-plane performance of the NFVI, including switching technology and physical and virtual network interfaces. The provided test cases and network topologies can be customized, while individual versions of the operating system, vSwitch and hypervisor can also be specified.

Bottlenecks

Bottlenecks provides a framework to find system limitations and bottlenecks, providing root cause isolation capabilities to facilitate system evaluation.

Installation

Abstract

This document provides an overview of the installation of the Danube release of OPNFV.

The Danube release can be installed making use of any of the installer projects in OPNFV: Apex, Compass4Nfv, Fuel or JOID. Each installer provides the ability to install a common OPNFV platform as well as integrating additional features delivered through a variety of scenarios by the OPNFV community.

Introduction

The OPNFV platform is comprised of a variety of upstream components that may be deployed on your infrastructure. A composition of components, tools and configurations is identified in OPNFV as a deployment scenario.

The various OPNFV scenarios provide unique features and capabilities that you may want to leverage, and it is important to understand your required target platform capabilities before installing and configuring your scenarios.

An OPNFV installation requires either a physical infrastructure environment as defined in the Pharos specification, or a virtual one. When configuring a physical infrastructure it is strongly advised to follow the Pharos configuration guidelines.

Scenarios

OPNFV scenarios are designed to host virtualised network functions (VNFs) in a variety of deployment architectures and locations. Each scenario provides specific capabilities and/or components aimed at solving specific problems for the deployment of VNFs.

A scenario may, for instance, include components such as OpenStack, OpenDaylight, OVS, KVM etc., where each scenario will include different source components or configurations.

To learn more about the scenarios supported in the Danube release refer to the scenario description documents provided:

Installation Procedure

Detailed step by step instructions for working with an installation toolchain and installing the required scenario are provided by the installation projects. The four projects providing installation support for the OPNFV Danube release are: Apex, Compass4nfv, Fuel and JOID.

The instructions for each toolchain can be found in these links:

OPNFV Test Frameworks

If you have elected to install the OPNFV platform using the deployment toolchain provided by OPNFV, your system will have been validated once the installation is completed. The basic deployment validation only addresses a small part of the capabilities provided by the platform, and you may want to execute more exhaustive tests. Some investigation will be required to select the right test suites to run on your platform.

Many of the OPNFV test projects provide user-guide documentation and installation instructions in this document.

User Guide & Configuration Guide

Abstract

OPNFV is a collaborative project aimed at providing a variety of virtualisation deployments intended to host applications serving the networking and carrier industries. This document provides guidance and instructions for using platform features designed to support these applications, made available in the OPNFV Danube release.

This document is not intended to replace or replicate documentation from other upstream open source projects such as KVM, OpenDaylight, or OpenStack, but to highlight the features and capabilities delivered through the OPNFV project.

Introduction

OPNFV provides a suite of scenarios, infrastructure deployment options, which are able to be installed to host virtualised network functions (VNFs). This Guide intends to help users of the platform leverage the features and capabilities delivered by the OPNFV project.

OPNFV’s Continuous Integration builds, deploys and tests combinations of virtual infrastructure components in what are defined as scenarios. A scenario may include components such as KVM, OpenDaylight, OpenStack, OVS, etc., where each scenario will include different source components or configurations. Scenarios are designed to enable specific features and capabilities in the platform that can be leveraged by the OPNFV user community.

Feature Overview

The following links outline the feature deliverables from participating OPNFV projects in the Danube release. Each of the participating projects provides detailed descriptions about the delivered features including use cases, implementation and configuration specifics.

The following Configuration Guides and User Guides assume that the reader already has some information about a given project’s specifics and deliverables. These Guides are intended to be used following the installation with an OPNFV installer to allow users to deploy and implement features delivered by OPNFV.

If you are unsure about the specifics of a given project, please refer to the OPNFV wiki page at http://wiki.opnfv.org, for more details.

Test Frameworks

Test Framework Overview

OPNFV testing

Introduction

Testing is one of the key activities in OPNFV and includes unit, feature, component and system level testing for development, automated deployment, performance characterization and stress testing.

Test projects are dedicated to providing frameworks, tooling and test cases categorized as functional, performance or compliance testing. Test projects fulfill different roles, such as verifying VIM functionality, benchmarking components and platforms, or analyzing measured KPIs for the scenarios released in OPNFV.

Feature projects also provide their own test suites that either run independently or within a test project.

This document details the OPNFV testing ecosystem, describes common test components used by individual OPNFV projects and provides links to project specific documentation.

OPNFV testing ecosystem
The testing projects

The OPNFV testing projects may be summarized as follows:

Overview of OPNFV Testing projects

The major testing projects are described below:

  • Bottlenecks: This project aims to find system bottlenecks by testing and verifying OPNFV infrastructure in a staging environment before committing it to a production environment. Instead of debugging a deployment in a production environment, an automatic method for executing benchmarks which validates the deployment during staging is adopted. This project forms a staging framework to find bottlenecks and to analyze the OPNFV infrastructure.
  • CPerf: SDN controller benchmarks and performance testing, applicable to controllers in general. A collaboration of upstream controller testing experts, external test tool developers and the standards community. The project primarily contributes to upstream/external tooling, then adds jobs to run those tools on OPNFV’s infrastructure.
  • Dovetail: This project intends to define and provide a set of OPNFV related validation criteria that will provide input for the evaluation of the use of OPNFV trademarks. The Dovetail project is executed with the guidance and oversight of the Compliance and Certification (C&C) committee and works to secure the goals of the C&C committee for each release. The project intends to incrementally define qualification criteria that establish the foundations of how we are able to measure the ability to utilize the OPNFV platform, how the platform itself should behave, and how applications may be deployed on the platform.
  • Functest: This project deals with the functional testing of the VIM and NFVI. It leverages several upstream test suites (OpenStack, ODL, ONOS, etc.) and can be used by feature projects to launch feature test suites in CI/CD. The project is used for scenario validation.
  • Qtip: QTIP, as the project for “Platform Performance Benchmarking” in OPNFV, aims to provide users with a simple indicator for performance, supported by comprehensive testing data and a transparent calculation formula. It provides a platform with common services for performance benchmarking which helps users build indicators by themselves with ease.
  • Storperf: The purpose of this project is to provide a tool to measure block and object storage performance in an NFVI. When complemented with a characterization of typical VNF storage performance requirements, it can provide pass/fail thresholds for test, staging, and production NFVI environments.
  • VSperf: This project provides a framework for automation of NFV data-plane performance testing and benchmarking. The NFVI fast-path includes switch technology and network with physical and virtual interfaces. VSperf can be used to evaluate the suitability of different switch implementations and features, quantify data-path performance and optimize platform configurations.
  • Yardstick: The goal of the project is to verify infrastructure compliance when running VNF applications. NFV use cases described in ETSI GS NFV 001 show a large variety of applications, each defining specific requirements and complex configurations on the underlying infrastructure and test tools. The Yardstick concept decomposes typical VNF workload performance metrics into a number of characteristics/performance vectors, each of which can be represented by distinct test cases.

The testing working group resources

The assets
Overall Architecture

The Test result management can be summarized as follows:

+-------------+    +-------------+    +-------------+
|             |    |             |    |             |
|   Test      |    |   Test      |    |   Test      |
| Project #1  |    | Project #2  |    | Project #N  |
|             |    |             |    |             |
+-------------+    +-------------+    +-------------+
         |               |               |
         V               V               V
     +---------------------------------------------+
     |                                             |
     |           Test Rest API front end           |
     |    http://testresults.opnfv.org/test        |
     |                                             |
     +---------------------------------------------+
         ^                |                     ^
         |                V                     |
         |     +-------------------------+      |
         |     |                         |      |
         |     |    Test Results DB      |      |
         |     |         Mongo DB        |      |
         |     |                         |      |
         |     +-------------------------+      |
         |                                      |
         |                                      |
   +----------------------+        +----------------------+
   |                      |        |                      |
   | Testing Dashboards   |        |      Landing page    |
   |                      |        |                      |
   +----------------------+        +----------------------+
The testing databases

A Mongo DB Database has been introduced for the Brahmaputra release. The following collections are declared in this database:

  • pods: the list of pods used for production CI
  • projects: the list of projects providing test cases
  • testcases: the test cases related to a given project
  • results: the results of the test cases
  • scenarios: the OPNFV scenarios tested in CI

This database can be used by any project through the testapi. Please note that projects may also use additional databases. This database is mainly used to collect CI results and scenario trust indicators.

This database is also cloned for OPNFV Plugfest.

The test API

The Test API is used to declare pods, projects, test cases and test results. Pods correspond to the clusters of machines (3 controller and 2 compute nodes in HA mode) used to run the tests, as defined in the Pharos project. The results pushed to the database are related to pods, projects and cases. If you try to push results of a test run on a non-referenced pod, the API will return an error message.
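
As a rough illustration of how a result reaches the database, the Python sketch below posts a single result to the Test API with the requests library. The /api/v1/results path and the field names are assumptions based on the pod/project/case/result model described here; consult the TestAPI documentation referenced later in this section for the authoritative schema.

# Minimal sketch: pushing one test result to the OPNFV Test API.
# Endpoint path and field names are assumptions based on the
# pod/project/case/result model described above.
import requests

TESTAPI_URL = "http://testresults.opnfv.org/test/api/v1/results"

result = {
    "project_name": "functest",          # must be a declared project
    "case_name": "vping_ssh",            # must be a declared test case
    "pod_name": "intel-pod1",            # must be a declared pod, or the API rejects it
    "installer": "fuel",
    "version": "danube",
    "scenario": "os-nosdn-nofeature-ha",
    "criteria": "PASS",
    "start_date": "2017-03-01 10:00:00",
    "stop_date": "2017-03-01 10:05:00",
    "details": {"duration": 300},
}

response = requests.post(TESTAPI_URL, json=result)
response.raise_for_status()               # non-2xx means the result was rejected
print("Result stored:", response.json())

If the pod, project or case has not been declared beforehand, the API returns an error, as noted above.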

An additional method, dashboard, was added in the Brahmaputra release to post-process the raw results (deprecated in the Colorado release).

The data model is very basic, 5 objects are available:
  • Pods
  • Projects
  • Testcases
  • Results
  • Scenarios

For detailed information, please go to http://artifacts.opnfv.org/releng/docs/testapi.html

The reporting

The reporting page for the test projects is http://testresults.opnfv.org/reporting/

Testing group reporting page

This page provides a reporting per OPNFV release and per testing project.

Testing group Danube reporting page

An evolution of this page is planned. It was decided to unify the reporting by creating a landing page that should give the scenario status in one glance (it was previously consolidated manually on a wiki page).

The landing page (planned for Danube 2.0) will be displayed per scenario:
  • the status of the deployment
  • the score of the test projects
  • a trust indicator

Additional filters (version, installer, test collection time window, ...) are included.

The test case catalog

Until the Colorado release, each testing project managed the list of its test cases. It was very hard to get a global view of the available test cases across the different test projects. A common view was possible through the API, but it was not very user friendly. In fact, you can list all the cases per project by calling the API:

with project_name: bottlenecks, functest, qtip, storperf, vsperf, yardstick
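
For illustration, a short Python sketch of such a per-project query is shown below; the exact endpoint path and response keys are assumptions about the TestAPI routing and should be checked against the TestAPI documentation.

# Minimal sketch: listing the test cases declared for each project via the Test API.
# The /api/v1/projects/<name>/cases path and the "testcases" response key are
# assumptions; adjust to the deployed API if it differs.
import requests

BASE = "http://testresults.opnfv.org/test/api/v1"
projects = ["bottlenecks", "functest", "qtip", "storperf", "vsperf", "yardstick"]

for name in projects:
    data = requests.get(f"{BASE}/projects/{name}/cases").json()
    cases = [case["name"] for case in data.get("testcases", [])]
    print(name, cases)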

It was decided to build a web site providing a consistent view of the test cases per project and allow any scenario owner to build his/her custom list of tests (Danube 2.0).

Other resources

wiki: https://wiki.opnfv.org/testing

mailing list: test-wg@lists.opnfv.org

IRC chan: #opnfv-testperf

weekly meeting (https://wiki.opnfv.org/display/meetings/TestPerf):
  • Usual time: Every Thursday 15:00-16:00 UTC / 7:00-8:00 PST
  • APAC time: 2nd Wednesday of the month 8:00-9:00 UTC

Testing User Guides

Bottlenecks

Bottlenecks - User Guide
POSCA Stress (Factor) Test of Performance Life-Cycle
Test Case: Bottlenecks POSCA Stress Test Ping
  • test case name: posca_posca_ping
  • description: stress test regarding life-cycle, using ping to validate the construction of VM pairs
  • configuration: config file /testsuite/posca/testcase_cfg/posca_posca_ping.yaml; stack number: 5, 10, 20, 50 ...
  • test result: PKT loss rate, success rate, test time, latency
Configuration
load_manager:
  scenarios:
    tool: ping
    test_times: 100
    package_size:
    num_stack: 5, 10, 20
    package_loss: 10%

  contexts:
    stack_create: yardstick
    flavor:
    yardstick_test_ip:
    yardstick_test_dir: "samples"
    yardstick_testcase: "ping_bottlenecks"

dashboard:
  dashboard: "y"
  dashboard_ip:
POSCA Stress (Factor) Test of System bandwidth
Test Case: Bottlenecks POSCA Stress Test Traffic
  • test case name: posca_factor_system_bandwith
  • description: stress test regarding the baseline of the system for a single user, i.e., a VM pair, while increasing the package size
  • configuration: config file /testsuite/posca/testcase_cfg/posca_factor_system_bandwith.yaml; stack number: 1
  • test result: PKT loss rate, latency, throughput, CPU usage
Configuration
test_config:
  tool: netperf
  protocol: tcp
  test_time: 20
  tx_pkt_sizes: 64, 256, 1024, 4096, 8192, 16384, 32768, 65536
  rx_pkt_sizes: 64, 256, 1024, 4096, 8192, 16384, 32768, 65536
  cpu_load: 0.9
  latency: 100000
runner_config:
  dashboard: "y"
  dashboard_ip:
  stack_create: yardstick
  yardstick_test_ip:
  yardstick_test_dir: "samples"
  yardstick_testcase: "netperf_bottlenecks"
Bottlenecks - Deprecated Test Cases
Bottlenecks Rubbos Test Case Description Basic (Bottlenecks Rubbos Basic)
  • test case name: opnfv_bottlenecks_rubbos_Basic
  • description: Rubbos platform for 1 tomcat, 1 Apache and 1 mysql.
  • configuration: config file /testsuite/rubbos/testcase_cfg/rubbos_basic.yaml; client number: 1
  • test result: throughput
Bottlenecks Rubbos Test Case Description TC1101 (Bottlenecks Rubbos TC1101)
  • test case name: opnfv_bottlenecks_rubbos_tc1101
  • description: Rubbos platform for 1 tomcat, 1 Apache and 1 mysql.
  • configuration: config file /testsuite/rubbos/testcase_cfg/rubbos_1-1-0-1.yaml; client number: 5
  • test result: throughput
Bottlenecks Rubbos Test Case Description TC1201 (Bottlenecks Rubbos TC1201)
  • test case name: opnfv_bottlenecks_rubbos_tc1201
  • description: Rubbos platform for 1 Apache, 2 tomcat and 1 mysql.
  • configuration: config file /testsuite/rubbos/testcase_cfg/rubbos_1-2-0-1.yaml; client number: 5
  • test result: throughput
Bottlenecks Rubbos Test Case Description TC1301 (Bottlenecks Rubbos TC1301)
  • test case name: opnfv_bottlenecks_rubbos_tc1301
  • description: Rubbos platform for 1 Apache, 3 tomcat and 1 mysql.
  • configuration: config file /testsuite/rubbos/testcase_cfg/rubbos_1-3-0-1.yaml; client number: 5
  • test result: throughput
Bottlenecks Rubbos Test Case Description TC1401 (Bottlenecks Rubbos TC1401)
  • test case name: opnfv_bottlenecks_rubbos_tc1401
  • description: Rubbos platform for 1 Apache, 4 tomcat and 1 mysql.
  • configuration: config file /testsuite/rubbos/testcase_cfg/rubbos_1-4-0-1.yaml; client number: 5
  • test result: throughput
Bottlenecks Rubbos Test Case Description Heavy TC1101 (Bottlenecks Rubbos TC Heavy1101)
  • test case name: opnfv_bottlenecks_rubbos_heavy_tc1101
  • description: Rubbos platform for 1 tomcat, 1 Apache and 1 mysql.
  • configuration: config file /testsuite/rubbos/testcase_cfg/rubbos-heavy_1-1-0-1.yaml; client number: 10
  • test result: throughput
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Ti1 (Bottlenecks VSTF Ti1)
  • test case name: opnfv_bottlenecks_vstf_Ti1
  • description: vSwitch test Ti1.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Ti1.yaml
  • test result: throughput & latency
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Ti2 (Bottlenecks VSTF Ti2)
  • test case name: opnfv_bottlenecks_vstf_Ti2
  • description: vSwitch test Ti2.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Ti2.yaml
  • test result: throughput & latency
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Ti3 (Bottlenecks VSTF Ti3)
  • test case name: opnfv_bottlenecks_vstf_Ti3
  • description: vSwitch test Ti3.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Ti3.yaml
  • test result: throughput & latency
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Tn1 (Bottlenecks VSTF Tn1)
  • test case name: opnfv_bottlenecks_vstf_Tn1
  • description: vSwitch test Tn1.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Tn1.yaml
  • test result: throughput & latency
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Tn2 (Bottlenecks VSTF Tn2)
  • test case name: opnfv_bottlenecks_vstf_Tn2
  • description: vSwitch test Tn2.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Tn2.yaml
  • test result: throughput & latency
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Tu1 (Bottlenecks VSTF Tu1)
  • test case name: opnfv_bottlenecks_vstf_Tu1
  • description: vSwitch test Tu1.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Tu1.yaml
  • test result: throughput & latency
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Tu2 (Bottlenecks VSTF Tu2)
  • test case name: opnfv_bottlenecks_vstf_Tu2
  • description: vSwitch test Tu2.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Tu2.yaml
  • test result: throughput & latency
Bottlenecks vSwitch Test Framework (VSTF) Test Case Description Tu3 (Bottlenecks VSTF Tu3)
  • test case name: opnfv_bottlenecks_vstf_Tu3
  • description: vSwitch test Tu3.
  • configuration: config file /testsuite/vstf/testcase_cfg/vstf_Tu3.yaml
  • test result: throughput & latency

Dovetail

OPNFV Verified Program certification workflow
Introduction

This document provides guidance for testers on how to obtain OPNFV compliance certification. The OPNFV Verified Program (OVP) is administered by the OPNFV Compliance and Certification (C&C) committee.

For further information about the workflow and general inquiries about the program, please check out the OVP web portal, or contact the C&C committee by email address verified@opnfv.org. This email address should be used for all communication with the OVP.

Step 1: Applying

A tester should start the process by completing an application. The application form can be found on the OVP web portal, and the following information should be provided:

  • Organization name
  • Organization website (if public)
  • Product name and/or identifier
  • Product specifications
  • Product public documentation
  • Product categories, choose one: (i) software and hardware (ii) software and third party hardware (please specify)
  • Primary contact name, business email, postal address and phone number. Only the primary email address should be used for official communication with OPNFV OVP.
  • User ID for OVP web portal. The OVP web portal supports the Linux Foundation user ID in the current release. If a new user ID is needed, visit https://identity.linuxfoundation.org.
  • Location where the verification testing is to be conducted. Choose one: (internal vendor lab, third-party lab)
  • If the test is to be conducted by a third-party lab, please specify name and contact information of the third-party lab, including email, address and phone number.

Please email the completed application using the primary contact email account in order to establish identity.

Once the application information is received and in order, an email response will be sent to the primary contact with confirmation and information to proceed.

[Editor’s note: No fee has been established at this time for OVP applications. Recommend we skip fee for the initial release of OVP.]

Step 2: Testing

The following documents guide testers to prepare the test environment and run tests:

A unique Test ID is generated by the Dovetail tool for each test run. Please take a note of this ID for future reference.

Step 3: Submitting Test Results

Testers can upload the test results to the OVP web portal. By default, the results are visible only to the tester who uploaded the data.

Testers can self-review the test results through the portal until they are ready to ask for OVP review. They may also update with or add new test results as needed.

Once the tester is satisfied with the test result, the tester grants access to the test result for OVP review via the portal. The test result is identified by the unique Test ID.

When a test result is made visible to the reviewers, the web portal will notify verified@opnfv.org and Cc the primary contact email that a review request has been made and reference the Test ID. This will alert the C&C Committee to start the OVP review process.

Step 4: OVP Review

Upon receiving the email notification and the Test ID, the C&C Committee conducts a peer based review of the test result. Persons employed by the same organization that submitted the test results or by affiliated organizations will not be part of the reviewers.

The primary contact may be asked via email for any missing information or clarification of the application. The reviewers will make a determination and recommend compliance or non-compliance to the C&C Committee. Normally, the outcome of the review should be communicated to the tester within 10 business days after all required information is in order.

If an application is denied, an appeal can be made to the C&C Committee.

Appendix


OPNFV Verified Program Application Form
  • Organization name: Organization name
  • Organization website: Organization website, if it is public
  • Product name and/or identifier: Product name and/or identifier
  • Product specifications: A link to the product specifications
  • Product public documentation: A link to the product public documentation
  • Product categories: Choose one: (i) software and hardware (ii) software and third party hardware
  • Primary contact name: Name
  • Primary business email: Only the business email address should be used for official communication with OPNFV OVP
  • Primary postal address: Address
  • Primary phone number: Phone number
  • User ID for OVP web portal: Choose one: (i) Linux Foundation (ii) OpenStack (iii) GitHub (iv) Google (v) Facebook ID; User ID:
  • Location: Choose one: (i) internal vendor lab (ii) third-party lab; Name and address:
  • Information of the 3rd-party lab: If the test is to be conducted by a third-party lab, include name, email, address and phone number
 
OPNFV Verified Program 2018.01 Reviewer Guide
Introduction

This reviewer guide provides detailed guidance for reviewers on how to handle the result review process. Reviewers must follow the checklist below to ensure review consistency for the OPNFV Verified Program (OVP) 2018.01 (Danube) release at a minimum.

  1. Mandatory Test Area Results - Validate that results for all mandatory test areas are present.
  2. Test-Case Count within Mandatory Test Area - Check that the expected number of test-cases is present in each mandatory test area.
  3. Test-Case Pass Percentage - Ensure all tests have passed (100% pass rate).
  4. Log File Verification - Inspect the log file for each test area (osinterop, ha, vping).
  5. SUT Info Verification - Validate the system under test (SUT) hardware and software endpoint info is present.
1. Mandatory Test Area Results

Validate that results for all mandatory test areas are included in the overall test suite. The required mandatory test areas are:

  • osinterop
  • vping
  • ha

Log in to the OVP portal at:

https://verified.opnfv.org

Click on the ‘My Results’ tab in the top-level navigation bar.

_images/ovp_top_nav.png

The OVP administrator will ask for review volunteers using the verified@opnfv.org email alias. The incoming results for review will be identified by the administrator with particular ‘Test ID’ and ‘Owner’ values. The corresponding OVP portal result will have a status of ‘review’.

_images/ovp_result_review.png

In the example above, this information will be provided as:
  • Test ID: a00c47e8
  • Owner: jtaylor

Click on the hyperlink within the ‘Test ID’ column.

Note that the ‘Test ID’ column in this view condenses the UUID used for ‘Test ID’ to eight characters, even though the ‘Test ID’ is a longer UUID in the back-end.

_images/ovp_result_overview.png

The ‘Test ID’ hyperlink toggles the view to a top-level listing of the results displayed above. Validate that osinterop, vping and ha test area results are all present within the view.

2. Test-Case Count within Mandatory Test Area

Validate the test-case count within each test area. For the OVP 2018.01 release, this must break down as outlined in the table below.

_images/ovp_test_count.png

In the diagram above (from section 1), these counts can be gleaned from the numbers to the right of the test-cases. The total number is given for the osinterop (dovetail.osinterop.tc001) test area at 205. The vping (dovetail.vping.tc00x) and ha (dovetail.ha.tc00x) test-cases are broken down separately with a line for each test-case. Directly above the ‘Test Result Overview’ listing there’s a summary labelled ‘Test Run Results’ shown below. For OVP 2018.01, a mandatory total of 215 test-cases must be present (205 osinterop + 8 ha + 2 vping).

_images/ovp_missing_ha.png

An example of a listing that should flag a negative review is shown above. The mandatory ha test area is missing one test case (dovetail.ha.tc008).

3. Test-Case Pass Percentage

All mandatory test-cases must pass. This can be validated in multiple ways. The below diagram of the ‘Test Run Results’ is one method and shows that 100% of the mandatory test-cases have passed. This value must not be lower than 100%.

_images/ovp_pass_percentage.png

Another method to check that all mandatory test-cases have passed is shown in the diagram below. The pass/total is given as a fraction and highlighted here in yellow. For the osinterop test area, the result must display [205/205] and for each of the test-cases under the vping and ha test areas [1/1] must be displayed.

_images/ovp_pass_fraction.png
4. Log File Verification

Three log files must be verified for content within each mandatory test area. The log files for each of the test areas are noted in the table below.

_images/ovp_log_files.png

The three log files can be displayed by clicking on the setup icon to the right of the results, as shown in the diagram below.

Note that while the vping and ha test areas list multiple test-cases in the below diagram, there is a single log file for all test-cases within these test areas.

_images/ovp_log_setup.png

Within the osinterop log (dovetail.osinterop.tc001.log), scroll down to the area of the log that begins to list the results of each test-case executed. This can be located by looking for lines prefaced with ‘tempest.api‘ and ending with ‘... ok‘.

_images/ovp_log_test_count.png

The number of lines within the osinterop log for test-cases must add up according to the table above, where test-cases are broken down according to compute, identity, image, network and volume, with respective counts given in the table. The ha log (yardstick.log) must contain the ‘PASS’ result for each of the eight test-cases within this test area. This can be verified by searching the log for the keyword ‘PASS’.

The eight lines to validate are listed below:

  • 2017-10-16 05:07:49,158 yardstick.benchmark.scenarios.availability.serviceha serviceha.py:81 INFO The HA test case PASS the SLA
  • 2017-10-16 05:08:31,387 yardstick.benchmark.scenarios.availability.serviceha serviceha.py:81 INFO The HA test case PASS the SLA
  • 2017-10-16 05:09:13,669 yardstick.benchmark.scenarios.availability.serviceha serviceha.py:81 INFO The HA test case PASS the SLA
  • 2017-10-16 05:09:55,967 yardstick.benchmark.scenarios.availability.serviceha serviceha.py:81 INFO The HA test case PASS the SLA
  • 2017-10-16 05:10:38,407 yardstick.benchmark.scenarios.availability.serviceha serviceha.py:81 INFO The HA test case PASS the SLA
  • 2017-10-16 05:11:00,030 yardstick.benchmark.scenarios.availability.scenario_general scenario_general.py:71 INFO [92m Congratulations, the HA test case PASS! [0m
  • 2017-10-16 05:11:22,536 yardstick.benchmark.scenarios.availability.scenario_general scenario_general.py:71 INFO [92m Congratulations, the HA test case PASS! [0m
  • 2017-10-16 05:12:07,880 yardstick.benchmark.scenarios.availability.scenario_general scenario_general.py:71 INFO [92m Congratulations, the HA test case PASS! [0m
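
Both log checks described above (counting the osinterop ‘... ok’ lines and the eight ha ‘PASS’ entries) can be automated. The following Python sketch is one possible, unofficial way to do it, assuming the log file names from the table in this section.

# Minimal sketch: automating the two log checks described above.
# The file names follow the table in this section; paths are examples only.
import re

def count_tempest_ok(path="dovetail.osinterop.tc001.log"):
    """Count osinterop test cases whose log line ends with '... ok'."""
    with open(path) as log:
        return sum(1 for line in log
                   if "tempest.api" in line and line.rstrip().endswith("... ok"))

def count_ha_pass(path="yardstick.log"):
    """Count HA test cases reported as PASS (expected: 8)."""
    with open(path) as log:
        return sum(1 for line in log if re.search(r"HA test case PASS", line))

print("osinterop passed:", count_tempest_ok())
print("ha passed:", count_ha_pass())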

The final validation is for the vping test area log file (functest.log). The two entries displayed in the diagrams below must be present in this log file.

  • vping_userdata
  • vping_ssh
_images/ovp_vping_user.png _images/ovp_vping_ssh.png
5. SUT Info Verification

SUT information must be present in the results to validate that all required endpoint services and at least two controllers were present during test execution. For the results shown below, click the ‘info‘ hyperlink in the SUT column to navigate to the SUT information page.

_images/sut_info.png

In the ‘Endpoints‘ listing shown below for the SUT VIM component, ensure that services are present for identity, compute, image, volume and network at a minimum by inspecting the ‘Service Type‘ column.

_images/sut_endpoints.png

Inspect the ‘Hosts‘ listing found below the Endpoints section of the SUT info page and ensure at least two hosts are present, as two controllers are required for the mandatory HA test-cases.

OPNFV Verified Program system preparation guide

This document provides a general guide to hardware system prerequisites and expectations for running OPNFV OVP testing. For a detailed guide on preparing software tools and configurations and conducting the test, please refer to the User Guide :ref:dovetail-testing_user_guide.

The OVP test tools expect that the hardware of the System Under Test (SUT) is compliant with the Pharos specification.

The Pharos specification itself, developed by the OPNFV community, is at this time a general guideline rather than a set of specific hard requirements. For the purpose of helping OVP testers, we summarize the main aspects of hardware to consider in preparation for OVP testing.

As described by the OVP Testing User Guide, the hardware systems involved in OVP testing include a Test Node, a System Under Test (SUT), and network connectivity between them.

The Test Node can be a bare metal machine or a virtual machine that can support a Docker container environment. If it is a bare metal machine, it needs to be x86 based at this time. Detailed information on how to configure and prepare the Test Node can be found in the User Guide.

The System Under Test (SUT) system is expected to consist of a set of general purpose servers, storage devices or systems, and networking infrastructure connecting them together. The set of servers are expected to be of the same architecture, either x86-64 or ARM-64. Mixing different architectures in the same SUT is not supported.

A minimum of 5 servers, 3 configured as controllers and 2 or more configured as compute resources, is expected. However this is not a hard requirement at this phase. The OVP 1.0 mandatory test cases only require one compute server. At least two compute servers are required to pass some of the optional test cases in the current OVP release. OVP control service high availability tests expect two or more control nodes to pass, depending on the HA mechanism implemented by the SUT.

The SUT is also expected to include components for persistent storage. The OVP testing does not expect or impose significant storage size or performance requirements.

The SUT is expected to be connected with high performance networks. These networks are expected in the SUT:

  • A management network by which the Test Node can reach all identity, image, network, and compute services in the SUT
  • A data network that supports the virtual network capabilities and data path testing

Additional networks, such as Lights Out Management or storage networks, may be beneficial and found in the SUT, but they are not a requirement for OVP testing.

OPNFV Verified Program test specification
Introduction

The OPNFV OVP provides a series of test areas aimed at evaluating the operation of an NFV system in accordance with carrier networking needs. Each test area contains a number of associated test cases which are described in detail in the associated test specification.

All tests in the OVP are required to fulfill a specific set of criteria in order that the OVP is able to provide a fair assessment of the system under test. Test requirements are described in the ‘Test Case Requirements’ document.

All tests areas addressed in the OVP are covered in the following test specification documents.

OpenStack Services HA test specification
Scope

The HA test area evaluates the ability of the System Under Test to support service continuity and recovery from component failures on part of the OpenStack controller services (“nova-api”, “neutron-server”, “keystone”, “glance-api”, “cinder-api”) and on the “load balancer” service.

The tests in this test area will emulate component failures by killing the processes of above target services, stressing the CPU load or blocking disk I/O on the selected controller node, and then check if the impacted services are still available and the killed processes are recovered on the selected controller node within a given time interval.

References

This test area references the following specifications:

Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • SUT - system under test
  • Monitor - tools used to measure the service outage time and the process outage time
  • Service outage time - the outage time (seconds) of the specific OpenStack service
  • Process outage time - the outage time (seconds) from the specific processes being killed to recovered
System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM in operation on a Pharos compliant infrastructure.

The SUT is assumed to be in a high availability configuration, which typically means more than one controller node is present in the System Under Test.

Test Area Structure

The HA test area is structured with the following test cases in a sequential manner.

Each test case is able to run independently. Preceding test case’s failure will not affect the subsequent test cases.

Preconditions of each test case will be described in the following test descriptions.

Test Descriptions
Test Case 1 - Controller node OpenStack service down - nova-api
Short name

dovetail.ha.tc001.nova-api_service_down

Use case specification

This test case verifies the service continuity capability in the face of the software process failure. It kills the processes of OpenStack “nova-api” service on the selected controller node, then checks whether the “nova-api” service is still available during the failure, by creating a VM then deleting the VM, and checks whether the killed processes are recovered within a given time interval.

Test preconditions

There is more than one controller node providing the “nova-api” service for the API end-point. One controller node is denoted as Node1 in the following configuration.

Basic test flow execution description and pass/fail criteria
Methodology for verifying service continuity and recovery

The service continuity and process recovery capabilities of the “nova-api” service are evaluated by monitoring service outage time, process outage time, and the results of nova operations.

Service outage time is measured by continuously executing the “openstack server list” command in a loop and checking if the response of the command request is returned with no failure. When the response fails, the “nova-api” service is considered to be in outage. The time between the first response failure and the last response failure is considered the service outage time.

Process outage time is measured by checking the status of the “nova-api” processes on the selected controller node. The time from the “nova-api” processes being killed to the time the “nova-api” processes are recovered is the process outage time. Process recovery is verified by checking the existence of the “nova-api” processes.

All nova operations are carried out correctly within a given time interval which suggests that the “nova-api” service is continuously available.
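
The following Python sketch illustrates the service-outage measurement described above. It is not the monitor implementation used by the test framework; it assumes the openstack CLI is installed and that admin credentials are loaded in the environment.

# Minimal sketch of a service-outage monitor, assuming the 'openstack' CLI is
# installed and OS_* credentials are exported. Illustration of the methodology
# only, not the framework's monitor.
import subprocess
import time

def monitor_service_outage(interval=0.1, duration=30):
    """Poll 'openstack server list' and return the time between the first and
    last failed response within the observation window (seconds)."""
    first_failure = last_failure = None
    deadline = time.time() + duration
    while time.time() < deadline:
        result = subprocess.run(
            ["openstack", "server", "list"],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            now = time.time()
            first_failure = first_failure or now
            last_failure = now
        time.sleep(interval)
    if first_failure is None:
        return 0.0
    return last_failure - first_failure

print("service outage time (s):", monitor_service_outage())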

Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “nova-api” processes are running on Node1
  • Test action 2: Create an image with “openstack image create test-cirros --file cirros-0.3.5-x86_64-disk.img --disk-format qcow2 --container-format bare”
  • Test action 3: Execute “openstack flavor create m1.test --id auto --ram 512 --disk 1 --vcpus 1” to create flavor “m1.test”.
  • Test action 4: Start two monitors: one for “nova-api” processes and the other for “openstack server list” command. Each monitor will run as an independent process
  • Test action 5: Connect to Node1 through SSH, and then kill the “nova-api” processes
  • Test action 6: When “openstack server list” returns with no error, calculate the service outage time, and execute the command “openstack server create --flavor m1.test --image test-cirros test-instance”
  • Test action 7: Continuously execute “openstack server show test-instance” to check if the status of VM “test-instance” is “Active”
  • Test action 8: If VM “test-instance” is “Active”, execute “openstack server delete test-instance”, then execute “openstack server list” to check if the VM is not in the list
  • Test action 9: Continuously measure process outage time from the monitor until the process outage time is more than 30s
Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The nova operations are carried out in above order and no errors occur.

A negative result will be generated if the above is not met in completion.

Post conditions

Restart the processes of “nova-api” if they are not running. Delete the image with “openstack image delete test-cirros”. Delete the flavor with “openstack flavor delete m1.test”.

Test Case 2 - Controller node OpenStack service down - neutron-server
Short name

dovetail.ha.tc002.neutron-server_service_down

Use case specification

This test verifies the high availability of the “neutron-server” service provided by OpenStack controller nodes. It kills the processes of OpenStack “neutron-server” service on the selected controller node, then checks whether the “neutron-server” service is still available, by creating a network and deleting the network, and checks whether the killed processes are recovered.

Test preconditions

There is more than one controller node providing the “neutron-server” service for the API end-point. One controller node is denoted as Node1 in the following configuration.

Basic test flow execution description and pass/fail criteria
Methodology for monitoring high availability

The high availability of “neutron-server” service is evaluated by monitoring service outage time, process outage time, and results of neutron operations.

Service outage time is tested by continuously executing the “openstack router list” command in a loop and checking if the response of the command request is returned with no failure. When the response fails, the “neutron-server” service is considered to be in outage. The time between the first response failure and the last response failure is considered the service outage time.

Process outage time is tested by checking the status of the “neutron-server” processes on the selected controller node. The time from the “neutron-server” processes being killed to the time the “neutron-server” processes are recovered is the process outage time. Process recovery is verified by checking the existence of the “neutron-server” processes.

Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “neutron-server” processes are running on Node1
  • Test action 2: Start two monitors: one for “neutron-server” process and the other for “openstack router list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “neutron-server” processes
  • Test action 4: When “openstack router list” returns with no error, calculate the service outage time, and execute “openstack network create test-network”
  • Test action 5: Continuously execute “openstack network show test-network” to check if the status of “test-network” is “Active”
  • Test action 6: If “test-network” is “Active”, execute “openstack network delete test-network”, then execute “openstack network list” to check if the “test-network” is not in the list
  • Test action 7: Continuously measure process outage time from the monitor until the process outage time is more than 30s
Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The neutron operations are carried out in above order and no errors occur.

A negative result will be generated if the above is not met in completion.

Post conditions

Restart the processes of “neutron-server” if they are not running.

Test Case 3 - Controller node OpenStack service down - keystone
Short name

dovetail.ha.tc003.keystone_service_down

Use case specification

This test verifies the high availability of the “keystone” service provided by OpenStack controller nodes. It kills the processes of the OpenStack “keystone” service on the selected controller node, then checks whether the “keystone” service is still available by executing the command “openstack user list” and whether the killed processes are recovered.

Test preconditions

There is more than one controller node providing the “keystone” service for the API end-point. Denote one of the controller nodes as Node1 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for monitoring high availability

The high availability of the “keystone” service is evaluated by monitoring service outage time and process outage time.

Service outage time is tested by continuously executing the “openstack user list” command in a loop and checking if the response of the command request is returned with no failure. When the response fails, the “keystone” service is considered in outage. The time between the first response failure and the last response failure is considered as the service outage time.

Process outage time is tested by checking the status of the “keystone” processes on the selected controller node. The time from the “keystone” processes being killed to the “keystone” processes being recovered is the process outage time. Process recovery is verified by checking the existence of “keystone” processes.

Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “keystone” processes are running on Node1
  • Test action 2: Start two monitors: one for “keystone” process and the other for “openstack user list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “keystone” processes
  • Test action 4: Calculate the service outage time and process outage time
  • Test action 5: The test passes if the process outage time is less than 30s and the service outage time is less than 5s
  • Test action 6: Continuously measure process outage time from the monitor until the process outage time is more than 30s
Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

A negative result will be generated if the above criteria are not fully met.

Post conditions

Restart the processes of “keystone” if they are not running.

Test Case 4 - Controller node OpenStack service down - glance-api
Short name

dovetail.ha.tc004.glance-api_service_down

Use case specification

This test verifies the high availability of the “glance-api” service provided by OpenStack controller nodes. It kills the processes of the OpenStack “glance-api” service on the selected controller node, then checks whether the “glance-api” service is still available, by creating and deleting an image, and checks whether the killed processes are recovered.

Test preconditions

There is more than one controller node providing the “glance-api” service for the API end-point. Denote one of the controller nodes as Node1 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for monitoring high availability

The high availability of “glance-api” service is evaluated by monitoring service outage time, process outage time, and results of glance operations.

Service outage time is tested by continuously executing the “openstack image list” command in a loop and checking if the response of the command request is returned with no failure. When the response fails, the “glance-api” service is considered in outage. The time between the first response failure and the last response failure is considered as the service outage time.

Process outage time is tested by checking the status of the “glance-api” processes on the selected controller node. The time from the “glance-api” processes being killed to the “glance-api” processes being recovered is the process outage time. Process recovery is verified by checking the existence of “glance-api” processes.

Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “glance-api” processes are running on Node1
  • Test action 2: Start two monitors: one for “glance-api” process and the other for “openstack image list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “glance-api” processes
  • Test action 4: When “openstack image list” returns with no error, calculate the service outage time, and execute “openstack image create test-image --file cirros-0.3.5-x86_64-disk.img --disk-format qcow2 --container-format bare”
  • Test action 5: Continuously execute “openstack image show test-image” and check whether the status of “test-image” is “active”
  • Test action 6: If “test-image” is “active”, execute “openstack image delete test-image”. Then execute “openstack image list” to check if “test-image” is not in the list
  • Test action 7: Continuously measure process outage time from the monitor until the process outage time is more than 30s
Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The glance operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

Post conditions

Restart the processes of “glance-api” if they are not running.

Delete image with “openstack image delete test-image”.

Test Case 5 - Controller node OpenStack service down - cinder-api
Short name

dovetail.ha.tc005.cinder-api_service_down

Use case specification

This test verifies the high availability of the “cinder-api” service provided by OpenStack controller nodes. It kills the processes of the OpenStack “cinder-api” service on the selected controller node, then checks whether the “cinder-api” service is still available by executing the command “openstack volume list” and whether the killed processes are recovered.

Test preconditions

There is more than one controller node providing the “cinder-api” service for the API end-point. Denote one of the controller nodes as Node1 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for monitoring high availability

The high availability of the “cinder-api” service is evaluated by monitoring service outage time and process outage time.

Service outage time is tested by continuously executing the “openstack volume list” command in a loop and checking if the response of the command request is returned with no failure. When the response fails, the “cinder-api” service is considered in outage. The time between the first response failure and the last response failure is considered as the service outage time.

Process outage time is tested by checking the status of the “cinder-api” processes on the selected controller node. The time from the “cinder-api” processes being killed to the “cinder-api” processes being recovered is the process outage time. Process recovery is verified by checking the existence of “cinder-api” processes.

Test execution
  • Test action 1: Connect to Node1 through SSH, and check that “cinder-api” processes are running on Node1
  • Test action 2: Start two monitors: one for “cinder-api” process and the other for “openstack volume list” command. Each monitor will run as an independent process.
  • Test action 3: Connect to Node1 through SSH, and then kill the “cinder-api” processes
  • Test action 4: Continuously measure service outage time from the monitor until the service outage time is more than 5s
  • Test action 5: Continuously measure process outage time from the monitor until the process outage time is more than 30s
Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

The cinder operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

Post conditions

Restart the processes of “cinder-api” if they are not running.

Test Case 6 - Controller Node CPU Overload High Availability
Short name

dovetail.ha.tc006.cpu_overload

Use case specification

This test verifies the availability of services when one of the controller nodes suffers from heavy CPU overload. When the CPU usage of the specified controller node reaches 100%, which breaks down the OpenStack services on this node, the OpenStack services should continue to be available. This test case stresses the CPU usage of a specific controller node to 100%, then checks whether all services provided by the SUT are still available with the monitor tools.

Test preconditions

There is more than one controller node, which is providing the “cinder-api”, “neutron-server”, “glance-api” and “keystone” services for the API end-point. Denote one of the controller nodes as Node1 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for monitoring high availability

The high availability of the related OpenStack services is evaluated by monitoring service outage time.

Service outage time is tested by continuously executing the “openstack router list”, “openstack stack list”, “openstack volume list” and “openstack image list” commands in a loop and checking if the response of each command request is returned with no failure. When a response fails, the related service is considered in outage. The time between the first response failure and the last response failure is considered as the service outage time.

Methodology for stressing CPU usage

To evaluate the high availability of the target OpenStack services under heavy CPU load, the test case first gets the number of logical CPU cores on the target controller node with a shell command, then uses that number to execute the ‘dd’ command, continuously copying from /dev/zero to /dev/null in a loop. The ‘dd’ operation uses only CPU and performs no disk I/O, which makes it ideal for stressing CPU usage.

Since the ‘dd’ command is executed continuously and drives the CPU usage rate to 100%, the scheduler will place each ‘dd’ command on a different logical CPU core, so that eventually the usage rate of all logical CPU cores reaches 100%.
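
A minimal shell sketch of this stressing step is shown below for illustration only; the actual commands used by the test framework may differ, and the clean-up step is simplified:

    # On the target controller node: start one CPU-bound 'dd' loop per logical core
    CORES=$(grep -c ^processor /proc/cpuinfo)
    for i in $(seq 1 "$CORES"); do
        dd if=/dev/zero of=/dev/null &
    done

    # ... run the monitors and take measurements, then stop the stress processes
    pkill -f 'dd if=/dev/zero of=/dev/null'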

Test execution
  • Test action 1: Start four monitors: one for “openstack image list” command, one for “openstack router list” command, one for “openstack stack list” command and the last one for “openstack volume list” command. Each monitor will run as an independent process.
  • Test action 2: Connect to Node1 through SSH, and then stress the usage rate of all logical CPU cores to 100%
  • Test action 3: Continuously measure all the service outage times until they are more than 5s
  • Test action 4: Kill the process that stresses the CPU usage
Pass / fail criteria

All the service outage times are less than 5s.

A negative result will be generated if the above criteria are not fully met.

Post conditions

No impact on the SUT.

Test Case 7 - Controller Node Disk I/O Overload High Availability
Short name

dovetail.ha.tc007.disk_I/O_overload

Use case specification

This test verifies the high availability of the controller node. When the disk I/O of a specific disk is overloaded, which breaks down the OpenStack services on this node, the read and write services should continue to be available. This test case blocks the disk I/O of the specific controller node, then checks whether the services that need to read or write the disk of the controller node are available with some monitor tools.

Test preconditions

There is more than one controller node. Denote one of the controller nodes as Node1 in the following. The controller node has at least 20GB free disk space.

Basic test flow execution description and pass/fail criteria
Methodology for monitoring high availability

The high availability of the nova service is evaluated by monitoring service outage time.

Service availability is tested by continuously executing the “openstack flavor list” command in a loop and checking if the response of the command request is returned with no failure. When the response fails, the related service is considered in outage.

Methodology for stressing disk I/O

To evaluate the high availability of the target OpenStack service under heavy I/O load, the test case executes a shell command on the selected controller node that continuously writes 8KB blocks to /test.dbf.
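
For illustration, and assuming the 20GB free disk space precondition above, the stressing step could look like the following sketch; the block count and the use of direct I/O are example choices rather than values prescribed by the test:

    # On the selected controller node: keep writing 8KB blocks to /test.dbf
    while true; do
        dd if=/dev/zero of=/test.dbf bs=8k count=1000000 oflag=direct conv=notrunc
    done &
    STRESS_PID=$!

    # ... take measurements, then stop the writer and remove the file (Test action 6)
    kill "$STRESS_PID"
    rm -f /test.dbf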

Test execution
  • Test action 1: Connect to Node1 through SSH, and then stress disk I/O by continuously writing 8kb blocks to /test.dbf
  • Test action 2: Start a monitor for the “openstack flavor list” command
  • Test action 3: Create a flavor called “test-001”
  • Test action 4: Check whether the flavor “test-001” is created
  • Test action 5: Continuously measure service outage time from the monitor until the service outage time is more than 5s
  • Test action 6: Stop writing to /test.dbf and delete file /test.dbf
Pass / fail criteria

The service outage time is less than 5s.

The nova operations are carried out in the above order and no errors occur.

A negative result will be generated if the above criteria are not fully met.

Post conditions

Delete flavor with “openstack flavor delete test-001”.

Test Case 8 - Controller Load Balance as a Service High Availability
Short name

dovetail.ha.tc008.load_balance_service_down

Use case specification

This test verifies the high availability of the “load balancer” service. When the “load balancer” service of a specified controller node is killed, it is checked whether the “load balancer” service on the other controller nodes still works and whether the controller node restarts the “load balancer” service. This test case kills the processes of the “load balancer” service on the selected controller node, then checks whether the request of the related OpenStack command is processed with no failure and whether the killed processes are recovered.

Test preconditions

There is more than one controller node providing the “load balancer” service for the REST API. Denote one of the controller nodes as Node1 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for monitoring high availability

The high availability of the “load balancer” service is evaluated by monitoring service outage time and process outage time.

Service outage time is tested by continuously executing the “openstack image list” command in a loop and checking if the response of the command request is returned with no failure. When the response fails, the “load balancer” service is considered in outage. The time between the first response failure and the last response failure is considered as the service outage time.

Process outage time is tested by checking the status of the processes of the “load balancer” service on the selected controller node. The time from those processes being killed to those processes being recovered is the process outage time. Process recovery is verified by checking the existence of the processes of the “load balancer” service.

Test execution
  • Test action 1: Connect to Node1 through SSH, and check that processes of “load balancer” service are running on Node1
  • Test action 2: Start two monitors: one for processes of “load balancer” service and the other for “openstack image list” command. Each monitor will run as an independent process
  • Test action 3: Connect to Node1 through SSH, and then kill the processes of “load balancer” service
  • Test action 4: Continuously measure service outage time from the monitor until the service outage time is more than 5s
  • Test action 5: Continuously measure process outage time from the monitor until the process outage time is more than 30s
Pass / fail criteria

The process outage time is less than 30s.

The service outage time is less than 5s.

A negative result will be generated if the above criteria are not fully met.

Post conditions

Restart the processes of “load balancer” if they are not running.

VIM compute operations test specification
Scope

The VIM compute operations test area evaluates the ability of the system under test to support VIM compute operations. The test cases documented here are the compute API test cases in the OpenStack Interop guideline 2016.8 as implemented by the RefStack client. These test cases will evaluate basic OpenStack (as a VIM) compute operations, including:

  • Image management operations
  • Basic support operations
  • API version support operations
  • Quotas management operations
  • Basic server operations
  • Volume management operations
References
Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • SUT - System Under Test
  • UUID - Universally Unique Identifier
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM deployed with a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on VIM compute API operations. Each test case is able to run independently, i.e. independent of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All these test cases are included in the test case dovetail.osinterop.tc001 of the OVP test suite.

Test Descriptions
API Used and Reference

Servers: https://developer.openstack.org/api-ref/compute/

  • create server
  • delete server
  • list servers
  • start server
  • stop server
  • update server
  • get server action
  • set server metadata
  • update server metadata
  • rebuild server
  • create image
  • delete image
  • create keypair
  • delete keypair

Block storage: https://developer.openstack.org/api-ref/block-storage

  • create volume
  • delete volume
  • attach volume to server
  • detach volume from server
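
For orientation, the operations listed above can also be exercised manually with the OpenStack CLI. The sketch below is illustrative only and is not part of the test procedure; the image, flavor, network and resource names are assumptions about the environment:

    # Basic server lifecycle (names and sizes are example values)
    openstack keypair create test-key > test-key.pem
    openstack server create --image cirros-0.3.5-x86_64-disk --flavor m1.tiny \
        --network private --key-name test-key test-vm
    openstack server list
    openstack server stop test-vm
    openstack server start test-vm

    # Volume management, attach and detach
    openstack volume create --size 1 test-vol
    openstack server add volume test-vm test-vol
    openstack server remove volume test-vm test-vol

    # Clean-up
    openstack server delete test-vm
    openstack volume delete test-vol
    openstack keypair delete test-key
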
Test Case 1 - Image operations within the Compute API
Test case specification

tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_delete_image tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_image_specify_multibyte_character_image_name

Test preconditions
  • Compute server extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a server VM1 with an image IMG1 and wait for VM1 to reach ‘ACTIVE’ status
  • Test action 2: Create a new server image IMG2 from VM1, specifying image name and image metadata. Wait for IMG2 to reach ‘ACTIVE’ status, and then delete IMG2
  • Test assertion 1: Verify IMG2 is created with the correct image name and image metadata; verify IMG1’s ‘minRam’ equals IMG2’s ‘minRam’ and IMG2’s ‘minDisk’ equals IMG1’s ‘minDisk’ or VM1’s flavor disk size
  • Test assertion 2: Verify IMG2 is deleted correctly
  • Test action 3: Create another server image IMG3 from VM1, specifying an image name with a 3 byte utf-8 character
  • Test assertion 3: Verify IMG3 is created correctly
  • Test action 4: Delete VM1, IMG1 and IMG3

This test evaluates the Compute API ability of creating an image from a server, deleting an image, and creating a server image with a multi-byte character name. Specifically, the test verifies that:

  • Compute server create image and delete image APIs work correctly.
  • Compute server image can be created with multi-byte character name.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 2 - Action operation within the Compute API
Test case specification

tempest.api.compute.servers.test_instance_actions.InstanceActionsTestJSON.test_get_instance_action tempest.api.compute.servers.test_instance_actions.InstanceActionsTestJSON.test_list_instance_actions

Test preconditions
  • Compute server extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a server VM1 and wait for VM1 to reach ‘ACTIVE’ status
  • Test action 2: Get the action details ACT_DTL of VM1
  • Test assertion 1: Verify ACT_DTL’s ‘instance_uuid’ matches VM1’s ID and ACT_DTL’s ‘action’ matches ‘create’
  • Test action 3: Create a server VM2 and wait for VM2 to reach ‘ACTIVE’ status
  • Test action 4: Delete server VM2 and wait for VM2 to reach termination
  • Test action 5: Get the action list ACT_LST of VM2
  • Test assertion 2: Verify ACT_LST’s length is 2 and the two actions are ‘create’ and ‘delete’
  • Test action 6: Delete VM1

This test evaluates the Compute API ability of getting the action details of a provided server and getting the action list of a deleted server. Specifically, the test verifies that:

  • Get the details of the action in a specified server.
  • List the actions that were performed on the specified server.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 3 - Generate, import and delete SSH keys within Compute services
Test case specification

tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_specify_keypair

Test preconditions
  • Compute server extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1 and list all existing keypairs
  • Test action 2: Create a server VM1 with KEYP1 and wait for VM1 to reach ‘ACTIVE’ status
  • Test action 3: Show details of VM1
  • Test assertion 1: Verify the value of ‘key_name’ in the details equals the name of KEYP1
  • Test action 4: Delete KEYP1 and VM1

This test evaluates the Compute API ability of creating a keypair, listing keypairs and creating a server with a provided keypair. Specifically, the test verifies that:

  • Compute create keypair and list keypair APIs work correctly.
  • While creating a server, a keypair can be specified.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 4 - List supported versions of the Compute API
Test case specification

tempest.api.compute.test_versions.TestVersions.test_list_api_versions

Test preconditions
  • Compute versions extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Get a list of versioned endpoints in the SUT
  • Test assertion 1: Verify endpoint versions start at ‘v2.0’

This test evaluates the functionality of listing all available APIs to API consumers. Specifically, the test verifies that:

  • Compute list API versions API works correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 5 - Quotas management in Compute API
Test case specification

tempest.api.compute.test_quotas.QuotasTestJSON.test_get_default_quotas tempest.api.compute.test_quotas.QuotasTestJSON.test_get_quotas

Test preconditions
  • Compute quotas extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Get the default quota set using the tenant ID
  • Test assertion 1: Verify the default quota set ID matches tenant ID and the default quota set is complete
  • Test action 2: Get the quota set using the tenant ID
  • Test assertion 2: Verify the quota set ID matches tenant ID and the quota set is complete
  • Test action 3: Get the quota set using the user ID
  • Test assertion 3: Verify the quota set ID matches tenant ID and the quota set is complete

This test evaluates the functionality of getting quota set. Specifically, the test verifies that:

  • User can get the default quota set for its tenant.
  • User can get the quota set for its tenant.
  • User can get the quota set using user ID.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 6 - Basic server operations in the Compute API
Test case specification

This test case evaluates the Compute API ability of basic server operations, including:

  • Create a server with admin password
  • Create a server with a name that already exists
  • Create a server with a numeric name
  • Create a server with metadata that exceeds the length limit
  • Create a server with a name whose length exceeds 255 characters
  • Create a server with an unknown flavor
  • Create a server with an unknown image ID
  • Create a server with an invalid network UUID
  • Delete a server using a server ID that exceeds the length limit
  • Delete a server using a negative server ID
  • Get the details of a nonexistent server
  • Verify the instance host name is the same as the server name
  • Create a server with an invalid access IPv6 address
  • List all existent servers
  • Filter the (detailed) list of servers by flavor, image, server name, server status or limit
  • Lock a server and try server stop, unlock and retry
  • Get and delete metadata from a server
  • List and set metadata for a server
  • Reboot, rebuild, stop and start a server
  • Update a server’s access addresses and server name

The reference is:

tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_server_with_admin_password tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_with_existing_server_name tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_numeric_server_name tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_server_metadata_exceeds_length_limit tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_server_name_length_exceeds_256 tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_flavor tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_image tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_network_uuid tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_delete_server_pass_id_exceeding_length_limit tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_delete_server_pass_negative_id tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_get_non_existent_server tempest.api.compute.servers.test_create_server.ServersTestJSON.test_host_name_is_same_as_server_name tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_host_name_is_same_as_server_name tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_invalid_ip_v6_address tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers_with_detail tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_list_servers tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_list_servers_with_detail tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_flavor tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_image tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_server_name tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_server_status tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_limit_results tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_flavor tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_image tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_limit tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_server_name tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_active_status tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filtered_by_name_wildcard tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_changes_since_future_date tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_changes_since_invalid_date 
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_greater_than_actual_count tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_pass_negative_value tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_pass_string tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_flavor tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_image tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_server_name tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_detail_server_is_deleted tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_status_non_existing tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_with_a_deleted_server tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_lock_unlock_server tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_delete_server_metadata_item tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_get_server_metadata_item tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_list_server_metadata tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_set_server_metadata tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_set_server_metadata_item tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_update_server_metadata tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_server_name_blank tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_reboot_server_hard tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_reboot_non_existent_server tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_rebuild_server tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_deleted_server tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_non_existent_server tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_stop_start_server tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_stop_non_existent_server tempest.api.compute.servers.test_servers.ServersTestJSON.test_update_access_server_address tempest.api.compute.servers.test_servers.ServersTestJSON.test_update_server_name tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_name_of_non_existent_server tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_server_name_length_exceeds_256 tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_server_set_empty_name tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_created_server_vcpus tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_server_details tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_created_server_vcpus 
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_server_details

Test preconditions
  • Compute quotas extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a server VM1 with an admin password ‘testpassword’
  • Test assertion 1: Verify the password returned in the response equals to ‘testpassword’
  • Test action 2: Generate a VM name VM_NAME
  • Test action 3: Create 2 servers VM2 and VM3 both with name VM_NAME
  • Test assertion 2: Verify VM2’s ID is not equal to VM3’s ID, and VM2’s name equals VM3’s name
  • Test action 4: Create a server VM4 with a numeric name ‘12345’
  • Test assertion 3: Verify creating VM4 failed
  • Test action 5: Create a server VM5 with a long metadata ‘{‘a’: ‘b’ * 260}’
  • Test assertion 4: Verify creating VM5 failed
  • Test action 6: Create a server VM6 with name length exceeding 255 characters
  • Test assertion 5: Verify creating VM6 failed
  • Test action 7: Create a server VM7 with an unknown flavor ‘-1’
  • Test assertion 6: Verify creating VM7 failed
  • Test action 8: Create a server VM8 with an unknown image ID ‘-1’
  • Test assertion 7: Verify creating VM8 failed
  • Test action 9: Create a server VM9 with an invalid network UUID ‘a-b-c-d-e-f-g-h-i-j’
  • Test assertion 8: Verify creating VM9 failed
  • Test action 10: Delete a server using a server ID that exceeds the system’s max integer limit
  • Test assertion 9: Verify deleting server failed
  • Test action 11: Delete a server using a server ID ‘-1’
  • Test assertion 10: Verify deleting server failed
  • Test action 12: Get a nonexistent server by using a randomly generated server ID
  • Test assertion 11: Verify get server failed
  • Test action 13: SSH into a provided server and get server’s hostname
  • Test assertion 12: Verify server’s host name is the same as the server name
  • Test action 14: SSH into a provided server and get server’s hostname (manual disk configuration)
  • Test assertion 13: Verify server’s host name is the same as the server name (manual disk configuration)
  • Test action 15: Create a server with an invalid access IPv6 address
  • Test assertion 14: Verify creating server failed, a bad request error is returned in response
  • Test action 16: List all existent servers
  • Test assertion 15: Verify a provided server is in the server list
  • Test action 17: List all existent servers in detail
  • Test assertion 16: Verify a provided server is in the detailed server list
  • Test action 18: List all existent servers (manual disk configuration)
  • Test assertion 17: Verify a provided server is in the server list (manual disk configuration)
  • Test action 19: List all existent servers in detail (manual disk configuration)
  • Test assertion 18: Verify a provided server is in the detailed server list (manual disk configuration)
  • Test action 20: List all existent servers in detail and filter the server list by flavor
  • Test assertion 19: Verify the filtered server list is correct
  • Test action 21: List all existent servers in detail and filter the server list by image
  • Test assertion 20: Verify the filtered server list is correct
  • Test action 22: List all existent servers in detail and filter the server list by server name
  • Test assertion 21: Verify the filtered server list is correct
  • Test action 23: List all existent servers in detail and filter the server list by server status
  • Test assertion 22: Verify the filtered server list is correct
  • Test action 24: List all existent servers in detail and filter the server list by display limit ‘1’
  • Test assertion 23: Verify the length of filtered server list is 1
  • Test action 25: List all existent servers and filter the server list by flavor
  • Test assertion 24: Verify the filtered server list is correct
  • Test action 26: List all existent servers and filter the server list by image
  • Test assertion 25: Verify the filtered server list is correct
  • Test action 27: List all existent servers and filter the server list by display limit ‘1’
  • Test assertion 26: Verify the length of filtered server list is 1
  • Test action 28: List all existent servers and filter the server list by server name
  • Test assertion 27: Verify the filtered server list is correct
  • Test action 29: List all existent servers and filter the server list by server status
  • Test assertion 28: Verify the filtered server list is correct
  • Test action 30: List all existent servers and filter the server list by server name wildcard
  • Test assertion 29: Verify the filtered server list is correct
  • Test action 31: List all existent servers and filter the server list by part of server name
  • Test assertion 30: Verify the filtered server list is correct
  • Test action 32: List all existent servers and filter the server list by a future change-since date
  • Test assertion 31: Verify the filtered server list is empty
  • Test action 33: List all existent servers and filter the server list by an invalid change-since date format
  • Test assertion 32: Verify a bad request error is returned in the response
  • Test action 34: List all existent servers and filter the server list by display limit ‘1’
  • Test assertion 33: Verify the length of filtered server list is 1
  • Test action 35: List all existent servers and filter the server list by a display limit value greater than the length of the server list
  • Test assertion 34: Verify the length of filtered server list equals to the length of server list
  • Test action 36: List all existent servers and filter the server list by display limit ‘-1’
  • Test assertion 35: Verify a bad request error is returned in the response
  • Test action 37: List all existent servers and filter the server list by a string type limit value ‘testing’
  • Test assertion 36: Verify a bad request error is returned in the response
  • Test action 38: List all existent servers and filter the server list by a nonexistent flavor
  • Test assertion 37: Verify the filtered server list is empty
  • Test action 39: List all existent servers and filter the server list by a nonexistent image
  • Test assertion 38: Verify the filtered server list is empty
  • Test action 40: List all existent servers and filter the server list by a nonexistent server name
  • Test assertion 39: Verify the filtered server list is empty
  • Test action 41: List all existent servers in detail and search the server list for a deleted server
  • Test assertion 40: Verify the deleted server is not in the server list
  • Test action 42: List all existent servers and filter the server list by a nonexistent server status
  • Test assertion 41: Verify the filtered server list is empty
  • Test action 43: List all existent servers in detail
  • Test assertion 42: Verify a provided deleted server’s id is not in the server list
  • Test action 44: Lock a provided server VM10 and retrieve the server’s status
  • Test assertion 43: Verify VM10 is in ‘ACTIVE’ status
  • Test action 45: Stop VM10
  • Test assertion 44: Verify stop VM10 failed
  • Test action 46: Unlock VM10 and stop VM10 again
  • Test assertion 45: Verify VM10 is stopped and in ‘SHUTOFF’ status
  • Test action 47: Start VM10
  • Test assertion 46: Verify VM10 is in ‘ACTIVE’ status
  • Test action 48: Delete metadata item ‘key1’ from a provided server
  • Test assertion 47: Verify the metadata item is removed
  • Test action 49: Get metadata item ‘key2’ from a provided server
  • Test assertion 48: Verify the metadata item is correct
  • Test action 50: List all metadata key/value pair for a provided server
  • Test assertion 49: Verify all metadata are retrieved correctly
  • Test action 51: Set metadata {‘meta2’: ‘data2’, ‘meta3’: ‘data3’} for a provided server
  • Test assertion 50: Verify server’s metadata are replaced correctly
  • Test action 52: Set metadata item nova’s value to ‘alt’ for a provided server
  • Test assertion 51: Verify server’s metadata are set correctly
  • Test action 53: Update metadata {‘key1’: ‘alt1’, ‘key3’: ‘value3’} for a provided server
  • Test assertion 52: Verify server’s metadata are updated correctly
  • Test action 54: Create a server with empty name parameter
  • Test assertion 53: Verify create server failed
  • Test action 55: Hard reboot a provided server
  • Test assertion 54: Verify server is rebooted successfully
  • Test action 56: Soft reboot a nonexistent server
  • Test assertion 55: Verify reboot failed, an error is returned in the response
  • Test action 57: Rebuild a provided server with new image, new server name and metadata
  • Test assertion 56: Verify server is rebuilt successfully, server image, name and metadata are correct
  • Test action 58: Create a server VM11
  • Test action 59: Delete VM11 and wait for VM11 to reach termination
  • Test action 60: Rebuild VM11 with another image
  • Test assertion 57: Verify rebuild server failed, an error is returned in the response
  • Test action 61: Rebuild a nonexistent server
  • Test assertion 58: Verify rebuild server failed, an error is returned in the response
  • Test action 62: Stop a provided server
  • Test assertion 59: Verify server reaches ‘SHUTOFF’ status
  • Test action 63: Start the stopped server
  • Test assertion 60: Verify server reaches ‘ACTIVE’ status
  • Test action 64: Stop a nonexistent server
  • Test assertion 61: Verify stop server failed, an error is returned in the response
  • Test action 65: Create a server VM12 and wait for it to reach ‘ACTIVE’ status
  • Test action 66: Update VM12’s IPv4 and IPv6 access addresses
  • Test assertion 62: Verify VM12’s access addresses have been updated correctly
  • Test action 67: Create a server VM13 and wait for it to reach ‘ACTIVE’ status
  • Test action 68: Update VM13’s server name with the non-ASCII characters ‘\u00CD\u00F1st\u00E1\u00F1c\u00E9’ (i.e. ‘Íñstáñcé’)
  • Test assertion 63: Verify VM13’s server name has been updated correctly
  • Test action 69: Update the server name of a nonexistent server
  • Test assertion 64: Verify update server name failed, an ‘object not found’ error is returned in the response
  • Test action 70: Update a provided server’s name with a 256-character long name
  • Test assertion 65: Verify update server name failed, a bad request is returned in the response
  • Test action 71: Update a provided server’s name with an empty string
  • Test assertion 66: Verify update server name failed, a bad request error is returned in the response
  • Test action 72: Get the number of vcpus of a provided server
  • Test action 73: Get the number of vcpus stated by the server’s flavor
  • Test assertion 67: Verify that the number of vcpus reported by the server matches the amount stated by the server’s flavor
  • Test action 74: Create a server VM14
  • Test assertion 68: Verify VM14’s server attributes are set correctly
  • Test action 75: Get the number of vcpus of a provided server (manual disk configuration)
  • Test action 76: Get the number of vcpus stated by the server’s flavor (manual disk configuration)
  • Test assertion 69: Verify that the number of vcpus reported by the server matches the amount stated by the server’s flavor (manual disk configuration)
  • Test action 77: Create a server VM15 (manual disk configuration)
  • Test assertion 70: Verify VM15’s server attributes are set correctly (manual disk configuration)
  • Test action 78: Delete all VMs created

This test evaluates the functionality of basic server operations. Specifically, the test verifies that:

  • If an admin password is provided on server creation, the server’s root password should be set to that password
  • Creating a server with a name that already exists is allowed
  • Creating a server with a numeric name or a name that exceeds the length limit is not allowed
  • Creating a server with metadata that exceeds the length limit is not allowed
  • Creating a server with an invalid flavor, an invalid image or an invalid network UUID is not allowed
  • Deleting a server with a server ID that exceeds the length limit or a nonexistent server ID is not allowed
  • A provided server’s host name is the same as the server name
  • Creating a server with an invalid IPv6 access address is not allowed
  • A created server is in the (detailed) list of servers
  • The (detailed) list of servers can be filtered by flavor, image, server name, server status, and display limit, respectively.
  • Filtering the list of servers by a future change-since date returns an empty list
  • Filtering the list of servers by an invalid date format, a negative display limit or a string type display limit value is not allowed
  • Filtering the list of servers by a nonexistent flavor, image, server name or server status returns an empty list
  • Deleted servers are not in the list of servers
  • Deleted servers do not show by default in the list of servers
  • A locked server is not allowed to be stopped by a non-admin user
  • Can get and delete metadata from servers
  • Can list, set and update server metadata
  • Creating a server with an empty name parameter is not allowed
  • Hard rebooting a server power cycles the server
  • Rebooting, rebuilding and stopping a nonexistent server is not allowed
  • A server can be rebuilt using the provided image and metadata
  • A server can be stopped and started again
  • A server’s name and access addresses can be updated
  • Updating the name of a nonexistent server is not allowed
  • Updating the name of a server to a name that exceeds the length limit is not allowed
  • Updating the name of a server to an empty string is not allowed
  • The number of vcpus reported by the server matches the amount stated by the server’s flavor
  • The specified server attributes are set correctly

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 7 - Retrieve volume information through the Compute API
Test case specification

This test case evaluates the Compute API ability to attach a volume to a specific server and to retrieve volume information. The reference is:

tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_attach_detach_volume tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_list_get_volume_attachments

Test preconditions
  • Compute volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a server VM1 and a volume VOL1
  • Test action 2: Attach VOL1 to VM1
  • Test assertion 1: Stop VM1 successfully and wait for VM1 to reach ‘SHUTOFF’ status
  • Test assertion 2: Start VM1 successfully and wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 3: SSH into VM1 and verify VOL1 is in VM1’s root disk devices
  • Test action 3: Detach VOL1 from VM1
  • Test assertion 4: Stop VM1 successfully and wait for VM1 to reach ‘SHUTOFF’ status
  • Test assertion 5: Start VM1 successfully and wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 6: SSH into VM1 and verify VOL1 is not in VM1’s root disk devices
  • Test action 4: Create a server VM2 and a volume VOL2
  • Test action 5: Attach VOL2 to VM2
  • Test action 6: List VM2’s volume attachments
  • Test assertion 7: Verify the length of the list is 1 and VOL2 attachment is in the list
  • Test action 7: Retrieve VM2’s volume information
  • Test assertion 8: Verify volume information is correct
  • Test action 8: Delete VM1, VM2, VOL1 and VOL2

This test evaluates the functionality of retrieving volume information. Specifically, the test verifies that:

  • Stopping and starting a server with an attached volume work correctly.
  • A server’s volume information can be retrieved correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

VIM identity operations test specification
Scope

The VIM identity test area evaluates the ability of the system under test to support VIM identity operations. The tests in this area evaluate API discovery operations within the Identity v3 API and auth operations within the Identity API.

References
Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualisation infrastructure
  • VIM - Virtual Infrastructure Manager
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on VIM identity operations. Each test case is able to run independently, i.e. independent of the state created by a previous test.

All these test cases are included in the test case dovetail.osinterop.tc001 of the OVP test suite.

Dependency Description

The VIM identity operations test cases are a part of the OpenStack interoperability tempest test cases. For the Danube-based Dovetail release, the OpenStack interoperability guidelines (version 2016.08) are adopted, which are valid for the Kilo, Liberty, Mitaka and Newton releases of OpenStack.

Test Descriptions
API discovery operations within the Identity v3 API
Use case specification

tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_version_resources tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_media_types tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_version_statuses

Test preconditions

None

Basic test flow execution description and pass/fail criteria
Test execution
  • Test action 1: Show the v3 identity api description, the test passes if keys ‘id’, ‘links’, ‘media-types’, ‘status’, ‘updated’ are all included in the description response message.
  • Test action 2: Get the value of the v3 identity api ‘media-types’, the test passes if api version 2 and version 3 are both included in the response.
  • Test action 3: Show the v3 identity api description, the test passes if ‘current’, ‘stable’, ‘experimental’, ‘supported’, ‘deprecated’ are all of the identity api ‘status’ values.
Pass / fail criteria

This test case passes if all test action steps execute successfully and all assertions are affirmed. If any test step fails to execute successfully or any of the assertions is not met, the test case fails.

Post conditions

None

Auth operations within the Identity API
Use case specification

tempest.api.identity.v3.test_tokens.TokensV3Test.test_create_token

Test preconditions

None

Basic test flow execution description and pass/fail criteria
Test execution
  • Test action 1: Get a token using the system credentials, the test passes if the returned token_id is not empty and is of string type (an illustrative request is sketched after this list).
  • Test action 2: Get the user_id from the token response message, the test passes if it is equal to the user_id which was used to get the token.
  • Test action 3: Get the user_name from the token response message, the test passes if it is equal to the user_name which was used to get the token.
  • Test action 4: Get the method from the token response message, the test passes if it is equal to ‘password’, the method which was used to get the token.
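
The following is an illustrative sketch only, not the test implementation (the test is run through Tempest): an equivalent token request issued with curl against the Identity v3 API, assuming OS_AUTH_URL points at the v3 endpoint and the user belongs to the ‘Default’ domain:

    # Request a token with the password method; the token id is returned in the
    # X-Subject-Token response header, while the user id, user name and auth
    # method appear in the JSON body
    curl -si -X POST "$OS_AUTH_URL/auth/tokens" \
         -H "Content-Type: application/json" \
         -d '{"auth": {"identity": {"methods": ["password"],
               "password": {"user": {"name": "'"$OS_USERNAME"'",
                                     "domain": {"name": "Default"},
                                     "password": "'"$OS_PASSWORD"'"}}}}}'
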
Pass / fail criteria

This test case passes if all test action steps execute successfully and all assertions are affirmed. If any test step fails to execute successfully or any of the assertions is not met, the test case fails.

Post conditions

None

VIM image operations test specification
Scope

The VIM image test area evaluates the ability of the system under test to support VIM image operations. The test cases documented here are the Image API test cases in the OpenStack Interop guideline 2016.8 as implemented by the RefStack client. These test cases will evaluate basic OpenStack (as a VIM) image operations including image creation, image list, image update and image deletion capabilities using the Glance v2 API.

References
Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • CRUD - Create, Read, Update, and Delete
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on VIM image operations. Each test case is able to run independently, i.e. independent of the state created by a previous test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All these test cases are included in the test case dovetail.osinterop.tc001 of the OVP test suite.

Test Descriptions
API Used and Reference

Images: https://developer.openstack.org/api-ref/image/v2/

  • create image
  • delete image
  • show image details
  • show images
  • show image schema
  • show images schema
  • upload binary image data
  • add image tag
  • delete image tag
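
As a quick orientation only (the tests themselves are run through Tempest, not the CLI), the operations listed above map roughly to the following OpenStack CLI calls; the image name, formats, file and tag are example values, and the image/images schema calls have no direct CLI equivalent:

    openstack image create --disk-format qcow2 --container-format bare \
        --file cirros-0.3.5-x86_64-disk.img test-image    # create image and upload binary data
    openstack image show test-image                       # show image details
    openstack image list                                  # show images
    openstack image set --tag test-tag test-image         # add image tag
    openstack image unset --tag test-tag test-image       # delete image tag
    openstack image delete test-image                     # delete image
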
Image get tests using the Glance v2 API
Test case specification

tempest.api.image.v2.test_images.ListUserImagesTest.test_get_image_schema tempest.api.image.v2.test_images.ListUserImagesTest.test_get_images_schema tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_delete_deleted_image tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_image_null_id tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_non_existent_image

Test preconditions

Glance is available.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create 6 images and store their ids in a created images list.
  • Test action 2: Use image v2 API to show image schema and check the body of the response.
  • Test assertion 1: In the body of the response, the value of the key ‘name’ is ‘image’.
  • Test action 3: Use image v2 API to show images schema and check the body of the response.
  • Test assertion 2: In the body of the response, the value of the key ‘name’ is ‘images’.
  • Test action 4: Create an image with name ‘test’, container_formats ‘bare’ and disk_formats ‘raw’. Delete this image with its id and then try to show it with its id. Delete this deleted image again with its id and check the API’s response code.
  • Test assertion 3: The operations of showing and deleting a deleted image with its id both get 404 response code.
  • Test action 5: Use a null image id to show an image and check the API’s response code.
  • Test assertion 4: The API’s response code is 404.
  • Test action 6: Generate a random uuid and use it as the image id to show the image.
  • Test assertion 5: The API’s response code is 404.
  • Test action 7: Delete the 6 images with the stored ids. Show all images and check whether the 6 images’ ids are not in the show list.
  • Test assertion 6: The 6 images’ ids are not found in the show list.

The first two test cases evaluate the ability to use the Glance v2 API to show the image and images schemas. The latter three test cases evaluate the ability to use the Glance v2 API to show images with a deleted image id, a null image id and a non-existing image id. Specifically it verifies that:

  • Glance image get API can show the image and images schema.
  • Glance image get API can’t show an image with a deleted image id.
  • Glance image get API can’t show an image with a null image id.
  • Glance image get API can’t show an image with a non-existing image id.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

None

CRUD image operations in Images API v2
Test case specification

tempest.api.image.v2.test_images.ListUserImagesTest.test_list_no_params

Test preconditions

Glance is available.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create 6 images and store their ids in a created images list.
  • Test action 2: List all images and check whether the ids listed are in the created images list.
  • Test assertion 1: The ids returned by the list images API are in the created images list.

This test case evaluates the ability to use Glance v2 API to list images. Specifically it verifies that:

  • Glance image API can show the images.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

None

Image list tests using the Glance v2 API
Test case specification

tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_container_format tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_disk_format tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_limit tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_min_max_size tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_size tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_status tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_visibility

Test preconditions

Glance is available.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create 6 images with a random size ranging from 1024 to 4096 and visibility ‘private’; set their (container_format, disk_format) pair to be (ami, ami), (ami, ari), (ami, aki), (ami, vhd), (ami, vmdk) and (ami, raw); store their ids in a list and upload the binary images data.
  • Test action 2: Use Glance v2 API to list all images whose container_format is ‘ami’ and store the response details in a list.
  • Test assertion 1: The list is not empty and all the values of container_format in the list are ‘ami’.
  • Test action 3: Use Glance v2 API to list all images whose disk_format is ‘raw’ and store the response details in a list.
  • Test assertion 2: The list is not empty and all the values of disk_format in the list are ‘raw’.
  • Test action 4: Use Glance v2 API to list one image by setting limit to be 1 and store the response details in a list.
  • Test assertion 3: The length of the list is one.
  • Test action 5: Use Glance v2 API to list images by setting size_min and size_max, and store the response images’ sizes in a list. Choose the first image’s size as the median, size_min is median-500 and size_max is median+500.
  • Test assertion 4: All sizes in the list are no less than size_min and no more than size_max.
  • Test action 6: Use Glance v2 API to show the first created image with its id and get its size from the response. Use Glance v2 API to list images whose size is equal to this size and store the response details in a list.
  • Test assertion 5: All sizes of the images in the list are equal to the size used to list the images.
  • Test action 7: Use Glance v2 API to list the images whose status is active and store the response details in a list.
  • Test assertion 6: All images in the list have status ‘active’.
  • Test action 8: Use Glance v2 API to list the images whose visibility is private and store the response details in a list.
  • Test assertion 7: All images’ values of visibility in the list are private.
  • Test action 9: Delete the 6 images with the stored ids. Show images and check whether the 6 ids are not in the show list.
  • Test assertion 8: The stored 6 ids are not found in the show list.

This test case evaluates the ability to use Glance v2 API to list images with different parameters. Specifically it verifies that:

  • Glance image API can list images filtered by container_format.
  • Glance image API can list images filtered by disk_format.
  • Glance image API can list images limited by a given number.
  • Glance image API can list images filtered by size_min and size_max.
  • Glance image API can list images filtered by size.
  • Glance image API can list images filtered by status.
  • Glance image API can list images filtered by visibility.

In order to pass this test, all test assertions listed in the test execution above need to pass.
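
As an illustrative sketch of the filtering exercised above (not the Tempest implementation itself), the Glance v2 list call can be driven through openstacksdk; the cloud name ‘mycloud’ is an assumed clouds.yaml entry, and the filters shown are passed through as image list query parameters.

    import openstack

    conn = openstack.connect(cloud='mycloud')  # assumed clouds.yaml entry

    # List private, active images whose disk_format is 'raw'; Glance applies
    # the filters server-side and only matching images are returned.
    for image in conn.image.images(visibility='private',
                                   disk_format='raw',
                                   status='active'):
        print(image.id, image.name, image.size)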

Post conditions

None

Image update tests using the Glance v2 API
Test case specification

tempest.api.image.v2.test_images.BasicOperationsImagesTest.test_update_image tempest.api.image.v2.test_images_tags.ImagesTagsTest.test_update_delete_tags_for_image tempest.api.image.v2.test_images_tags_negative.ImagesTagsNegativeTest.test_update_tags_for_non_existing_image

Test preconditions

Glance is available.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create an image with container_format ‘ami’, disk_format ‘ami’ and visibility ‘private’ and store its id returned in the response. Check whether the status of the created image is ‘queued’.
  • Test assertion 1: The status of the created image is ‘queued’.
  • Test action 2: Use the stored image id to upload the binary image data and update this image’s name. Show this image with the stored id. Check if the stored id and name used to update the image are equal to the id and name in the show list.
  • Test assertion 2: The id and name returned in the show list are equal to the stored id and name used to update the image.
  • Test action 3: Create an image with container_format ‘bare’, disk_format ‘raw’ and visibility ‘private’ and store its id returned in the response.
  • Test action 4: Use the stored id to add a tag. Show the image with the stored id and check if the tag used to add is in the image’s tags returned in the show list.
  • Test assertion 3: The tag used to add into the image is in the show list.
  • Test action 5: Use the stored id to delete this tag. Show the image with the stored id and check if the tag used to delete is not in the show list.
  • Test assertion 4: The tag used to delete from the image is not in the show list.
  • Test action 6: Generate a random uuid as the image id. Use the image id to add a tag into the image’s tags.
  • Test assertion 5: The API’s response code is 404.
  • Test action 7: Delete the images created in test action 1 and 3. Show the images and check whether the ids are not in the show list.
  • Test assertion 6: The two ids are not found in the show list.

This test case evaluates the ability to use Glance v2 API to update images with different parameters. Specifically it verifies that:

  • Glance image API can update image’s name with the existing image id.
  • Glance image API can update image’s tags with the existing image id.
  • Glance image API can’t update image’s tags with a non-existing image id.

In order to pass this test, all test assertions listed in the test execution above need to pass.
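
A minimal sketch of the update operations above, using openstacksdk instead of Tempest; the cloud name ‘mycloud’ and the image name ‘cirros’ are assumptions for illustration.

    import openstack

    conn = openstack.connect(cloud='mycloud')      # assumed clouds.yaml entry
    image = conn.image.find_image('cirros')        # assumed existing image

    # Rename the image, then add and remove a tag on it.
    conn.image.update_image(image, name='cirros-renamed')
    conn.image.add_tag(image, 'ovp-demo')
    conn.image.remove_tag(image, 'ovp-demo')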

Post conditions

None

Image deletion tests using the Glance v2 API
Test case specification

tempest.api.image.v2.test_images.BasicOperationsImagesTest.test_delete_image tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_delete_image_null_id tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_delete_non_existing_image tempest.api.image.v2.test_images_tags_negative.ImagesTagsNegativeTest.test_delete_non_existing_tag

Test preconditions

Glance is available.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create an image with container_format ‘ami’, disk_format ‘ami’ and visibility ‘private’. Use the id of the created image to delete the image. List all images and check whether this id is in the list.
  • Test assertion 1: The id of the created image is not found in the list of all images after the deletion operation.
  • Test action 2: Delete images with a null id and check the API’s response code.
  • Test assertion 2: The API’s response code is 404.
  • Test action 3: Generate a random uuid and delete images with this uuid as image id. Check the API’s response code.
  • Test assertion 3: The API’s response code is 404.
  • Test action 4: Create an image with container_format ‘bare’, disk_format ‘raw’ and visibility ‘private’. Delete this image’s tag with the image id and a random tag. Check the API’s response code.
  • Test assertion 4: The API’s response code is 404.
  • Test action 5: Delete the images created in test action 1 and 4. List all images and check whether the ids are in the list.
  • Test assertion 5: The two ids are not found in the list.

The first three test cases evaluate the ability to use Glance v2 API to delete images with an existing image id, a null image id and a non-existing image id. The last one evaluates the ability to use the API to delete a non-existing image tag. Specifically it verifies that:

  • Glance image deletion API can delete the image with an existing id.
  • Glance image deletion API can’t delete an image with a null image id.
  • Glance image deletion API can’t delete an image with a non-existing image id.
  • Glance image deletion API can’t delete a non-existing image tag.

In order to pass this test, all test assertions listed in the test execution above need to pass.
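
A short sketch of the positive deletion flow, assuming an openstacksdk connection and a throw-away image named ‘scratch-image’ (both hypothetical); after deletion the image id should no longer appear in the image list, mirroring test assertion 1.

    import openstack

    conn = openstack.connect(cloud='mycloud')          # assumed clouds.yaml entry
    image = conn.image.find_image('scratch-image')     # assumed throw-away image

    conn.image.delete_image(image, ignore_missing=False)

    # The deleted id must not show up in a fresh image listing.
    assert image.id not in [i.id for i in conn.image.images()]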

Post conditions

None

VIM network operations test specification
Scope

The VIM network test area evaluates the ability of the system under test to support VIM network operations. The test cases documented here are the network API test cases in the OpenStack Interop guideline 2016.8 as implemented by the RefStack client. These test cases will evaluate basic OpenStack (as a VIM) network operations, including basic CRUD operations on L2 networks, L2 network ports and security groups.

References
Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • CRUD - Create, Read, Update and Delete
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on VIM network operations. Each test case is able to run independently, i.e. independent of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All these test cases are included in the test case dovetail.osinterop.tc001 of the OVP test suite.

Test Descriptions
API Used and Reference

Network: http://developer.openstack.org/api-ref/networking/v2/index.html

  • create network
  • update network
  • list networks
  • show network details
  • delete network
  • create subnet
  • update subnet
  • list subnets
  • show subnet details
  • delete subnet
  • create port
  • bulk create ports
  • update port
  • list ports
  • show port details
  • delete port
  • create security group
  • update security group
  • list security groups
  • show security group
  • delete security group
  • create security group rule
  • list security group rules
  • show security group rule
  • delete security group rule
Basic CRUD operations on L2 networks and L2 network ports
Test case specification

tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_allocation_pools tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_dhcp_enabled tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_gw tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_gw_and_allocation_pools tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_host_routes_and_dns_nameservers tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_without_gateway tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_all_attributes tempest.api.network.test_networks.NetworksTest.test_create_update_delete_network_subnet tempest.api.network.test_networks.NetworksTest.test_delete_network_with_subnet tempest.api.network.test_networks.NetworksTest.test_list_networks tempest.api.network.test_networks.NetworksTest.test_list_networks_fields tempest.api.network.test_networks.NetworksTest.test_list_subnets tempest.api.network.test_networks.NetworksTest.test_list_subnets_fields tempest.api.network.test_networks.NetworksTest.test_show_network tempest.api.network.test_networks.NetworksTest.test_show_network_fields tempest.api.network.test_networks.NetworksTest.test_show_subnet tempest.api.network.test_networks.NetworksTest.test_show_subnet_fields tempest.api.network.test_networks.NetworksTest.test_update_subnet_gw_dns_host_routes_dhcp tempest.api.network.test_ports.PortsTestJSON.test_create_bulk_port tempest.api.network.test_ports.PortsTestJSON.test_create_port_in_allowed_allocation_pools tempest.api.network.test_ports.PortsTestJSON.test_create_update_delete_port tempest.api.network.test_ports.PortsTestJSON.test_list_ports tempest.api.network.test_ports.PortsTestJSON.test_list_ports_fields tempest.api.network.test_ports.PortsTestJSON.test_show_port tempest.api.network.test_ports.PortsTestJSON.test_show_port_fields tempest.api.network.test_ports.PortsTestJSON.test_update_port_with_security_group_and_extra_attributes tempest.api.network.test_ports.PortsTestJSON.test_update_port_with_two_security_groups_and_extra_attributes

Test preconditions

Neutron is available.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network and create a subnet of this network by setting allocation_pools, then check the details of the subnet and delete the subnet and network
  • Test assertion 1: The allocation_pools returned in the response is equal to the one used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 2: Create a network and create a subnet of this network by setting enable_dhcp “True”, then check the details of the subnet and delete the subnet and network
  • Test assertion 2: The enable_dhcp returned in the response is “True” and the network and subnet ids are not found after deletion
  • Test action 3: Create a network and create a subnet of this network by setting gateway_ip, then check the details of the subnet and delete the subnet and network
  • Test assertion 3: The gateway_ip returned in the response is equal to the one used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 4: Create a network and create a subnet of this network by setting allocation_pools and gateway_ip, then check the details of the subnet and delete the subnet and network
  • Test assertion 4: The allocation_pools and gateway_ip returned in the response equal to the ones used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 5: Create a network and create a subnet of this network by setting host_routes and dns_nameservers, then check the details of the subnet and delete the subnet and network
  • Test assertion 5: The host_routes and dns_nameservers returned in the response equal to the ones used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 6: Create a network and create a subnet of this network without setting gateway_ip, then delete the subnet and network
  • Test assertion 6: The network and subnet ids are not found after deletion
  • Test action 7: Create a network and create a subnet of this network by setting enable_dhcp “true”, gateway_ip, ip_version, cidr, host_routes, allocation_pools and dns_nameservers, then check the details of the subnet and delete the subnet and network
  • Test assertion 7: The values returned in the response equal to the ones used to create the subnet, and the network and subnet ids are not found after deletion
  • Test action 8: Create a network and update this network’s name, then create a subnet and update this subnet’s name, delete the subnet and network
  • Test assertion 8: The network’s status and subnet’s status are both ‘ACTIVE’ after creation, their names equal to the new names used to update, and the network and subnet ids are not found after deletion
  • Test action 9: Create a network and create a subnet of this network, then delete this network
  • Test assertion 9: The subnet has also been deleted after deleting the network
  • Test action 10: Create a network and list all networks
  • Test assertion 10: The network created is found in the list
  • Test action 11: Create a network and list networks with the id and name of the created network
  • Test assertion 11: The id and name of the listed network are equal to the created network’s id and name
  • Test action 12: Create a network and create a subnet of this network, then list all subnets
  • Test assertion 12: The subnet created is found in the list
  • Test action 13: Create a network and create a subnet of this network, then list subnets with the id and network_id of the created subnet
  • Test assertion 13: The id and network_id of the listed subnet are equal to those of the created subnet
  • Test action 14: Create a network and show network’s details with the id of the created network
  • Test assertion 14: The id and name returned in the response are equal to the created network’s id and name
  • Test action 15: Create a network and just show network’s id and name info with the id of the created network
  • Test assertion 15: The keys returned in the response are only id and name, and their values are equal to the network’s id and name
  • Test action 16: Create a network and create a subnet of this network, then show subnet’s details with the id of the created subnet
  • Test assertion 16: The id and cidr info returned in the response are equal to the created subnet’s id and cidr
  • Test action 17: Create a network and create a subnet of this network, then show subnet’s id and network_id info with the id of the created subnet
  • Test assertion 17: The keys returned in the response are just id and network_id, and their values are equal to the subnet’s id and network_id
  • Test action 18: Create a network and create a subnet of this network, then update subnet’s name, host_routes, dns_nameservers and gateway_ip
  • Test assertion 18: The name, host_routes, dns_nameservers and gateway_ip returned in the response are equal to the values used to update the subnet
  • Test action 19: Create 2 networks and bulk create 2 ports with the ids of the created networks
  • Test assertion 19: The network_id of each port is equal to the one used to create the port and the admin_state_up of each port is True
  • Test action 20: Create a network and create a subnet of this network by setting allocation_pools, then create a port with the created network’s id
  • Test assertion 20: The ip_address of the created port is in the range of the allocation_pools
  • Test action 21: Create a network and create a port with its id, then update the port’s name and set its admin_state_up to be False
  • Test assertion 21: The name returned in the response is equal to the name used to update the port and the port’s admin_state_up is False
  • Test action 22: Create a network and create a port with its id, then list all ports
  • Test assertion 22: The created port is found in the list
  • Test action 23: Create a network and create a port with its id, then list ports with the id and mac_address of the created port
  • Test assertion 23: The created port is found in the list
  • Test action 24: Create a network and create a port with its id, then show the port’s details
  • Test assertion 24: The key ‘id’ is in the details
  • Test action 25: Create a network and create a port with its id, then show the port’s id and mac_address info with the port’s id
  • Test assertion 25: The keys returned in the response are just id and mac_address, and the values of all the keys equal to port’s id and mac_address
  • Test action 26: Create a network, 2 subnets (SUBNET1 and SUBNET2) and 2 security groups (SG1 and SG2), create a port with SG1 and SUBNET1, then update the port’s security group to SG2 and its subnet_id to SUBNET2
  • Test assertion 26: The port’s subnet_id is equal to SUBNET2’s id and its security_group_ids contains SG2’s id
  • Test action 27: Create a network, 2 subnets (SUBNET1 and SUBNET2) and 3 security groups (SG1, SG2 and SG3), create a port with SG1 and SUBNET1, then update the port’s security group to SG2 and SG3 and its subnet_id to SUBNET2
  • Test assertion 27: The port’s subnet_id is equal to SUBNET2’s id and its security_group_ids contains the ids of SG2 and SG3

These test cases evaluate the ability to perform basic CRUD operations on L2 networks and L2 network ports. Specifically it verifies that:

  • Subnets can be created successfully by setting different parameters.
  • Subnets can be updated after being created.
  • Ports can be bulk created with network ids.
  • Port’s security group(s) can be updated after being created.
  • Networks/subnets/ports can be listed with their ids and other parameters.
  • All details or special fields’ info of networks/subnets/ports can be shown with their ids.
  • Networks/subnets/ports can be successfully deleted.

In order to pass this test, all test assertions listed in the test execution above need to pass.
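
The following sketch illustrates the network/subnet/port lifecycle exercised above with openstacksdk rather than Tempest; the cloud name, network name and CIDR are assumptions chosen for the example.

    import openstack

    conn = openstack.connect(cloud='mycloud')   # assumed clouds.yaml entry

    # Create a network and a subnet with an explicit allocation pool.
    net = conn.network.create_network(name='ovp-demo-net')
    subnet = conn.network.create_subnet(
        network_id=net.id,
        ip_version=4,
        cidr='192.168.199.0/24',
        gateway_ip='192.168.199.1',
        allocation_pools=[{'start': '192.168.199.10',
                           'end': '192.168.199.200'}])

    # A port on the network gets its fixed IP from the allocation pool.
    port = conn.network.create_port(network_id=net.id)
    print(port.fixed_ips)

    # Clean up in reverse order, as every test case in this area does.
    conn.network.delete_port(port)
    conn.network.delete_subnet(subnet)
    conn.network.delete_network(net)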

Post conditions

N/A

Basic CRUD operations on security groups
Test case specification

tempest.api.network.test_security_groups.SecGroupTest.test_create_list_update_show_delete_security_group tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_additional_args tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_icmp_type_code tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_protocol_integer_value tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_remote_group_id tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_remote_ip_prefix tempest.api.network.test_security_groups.SecGroupTest.test_create_show_delete_security_group_rule tempest.api.network.test_security_groups.SecGroupTest.test_list_security_groups tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_additional_default_security_group_fails tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_duplicate_security_group_rule_fails tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_ethertype tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_protocol tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_remote_ip_prefix tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_invalid_ports tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_non_existent_remote_groupid tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_non_existent_security_group tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_delete_non_existent_security_group tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_show_non_existent_security_group tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_show_non_existent_security_group_rule

Test preconditions

Neutron is available.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, list all security groups, update the name and description of SG1, show details of SG1 and delete SG1
  • Test assertion 1: SG1 is in the list, the name and description of SG1 are equal to the ones used to update it, the name and description of SG1 shown in the details are equal to the ones used to update it, and SG1’s id is not found after deletion
  • Test action 2: Create a security group SG1, and create a rule with protocol ‘tcp’, port_range_min and port_range_max
  • Test assertion 2: The values returned in the response are equal to the ones used to create the rule
  • Test action 3: Create a security group SG1, and create a rule with protocol ‘icmp’ and icmp_type_codes
  • Test assertion 3: The values returned in the response are equal to the ones used to create the rule
  • Test action 4: Create a security group SG1, and create a rule with protocol ‘17’
  • Test assertion 4: The values returned in the response are equal to the ones used to create the rule
  • Test action 5: Create a security group SG1, and create a rule with protocol ‘udp’, port_range_min, port_range_max and remote_group_id
  • Test assertion 5: The values returned in the response are equal to the ones used to create the rule
  • Test action 6: Create a security group SG1, and create a rule with protocol ‘tcp’, port_range_min, port_range_max and remote_ip_prefix
  • Test assertion 6: The values returned in the response are equal to the ones used to create the rule
  • Test action 7: Create a security group SG1, create 3 rules with protocol ‘tcp’, ‘udp’ and ‘icmp’ respectively, show details of each rule, list all rules and delete all rules
  • Test assertion 7: The values in the shown details are equal to the ones used to create the rules, all rules are found in the list, and none of the rules are found after deletion
  • Test action 8: List all security groups
  • Test assertion 8: There is one default security group in the list
  • Test action 9: Create a security group whose name is ‘default’
  • Test assertion 9: Failed to create this security group because of name conflict
  • Test action 10: Create a security group SG1, create a rule with protocol ‘tcp’, port_range_min and port_range_max, and create another tcp rule with the same parameters
  • Test assertion 10: Failed to create this security group rule because it duplicates an existing rule
  • Test action 11: Create a security group SG1, and create a rule with ethertype ‘bad_ethertype’
  • Test assertion 11: Failed to create this security group rule because of bad ethertype
  • Test action 12: Create a security group SG1, and create a rule with protocol ‘bad_protocol_name’
  • Test assertion 12: Failed to create this security group rule because of bad protocol
  • Test action 13: Create a security group SG1, and create a rule with remote_ip_prefix ‘92.168.1./24’, ‘192.168.1.1/33’, ‘bad_prefix’ and ‘256’ respectively
  • Test assertion 13: Failed to create these security group rules because of bad remote_ip_prefix
  • Test action 14: Create a security group SG1, and create a tcp rule with (port_range_min, port_range_max) (-16, 80), (80, 79), (80, 65536), (None, 6) and (-16, 65536) respectively
  • Test assertion 14: Failed to create these security group rules because of bad ports
  • Test action 15: Create a security group SG1, and create a tcp rule with remote_group_id ‘bad_group_id’ and a random uuid respectively
  • Test assertion 15: Failed to create these security group rules because of nonexistent remote_group_id
  • Test action 16: Create a security group SG1, and create a rule with a random uuid as security_group_id
  • Test assertion 16: Failed to create these security group rules because of nonexistent security_group_id
  • Test action 17: Generate a random uuid and use this id to delete security group
  • Test assertion 17: Failed to delete security group because of nonexistent security_group_id
  • Test action 18: Generate a random uuid and use this id to show security group
  • Test assertion 18: Failed to show security group because of nonexistent id of security group
  • Test action 19: Generate a random uuid and use this id to show security group rule
  • Test assertion 19: Failed to show security group rule because of nonexistent id of security group rule

These test cases evaluate the ability to perform basic CRUD operations on security groups and security group rules. Specifically it verifies that:

  • Security groups can be created, listed, updated, shown and deleted.
  • Security group rules can be created with different parameters, listed, shown and deleted.
  • Cannot create an additional default security group.
  • Cannot create duplicate security group rules.
  • Cannot create security group rules with bad ethertype, protocol, remote_ip_prefix, ports, remote_group_id and security_group_id.
  • Cannot show or delete security groups or security group rules with nonexistent ids.

In order to pass this test, all test assertions listed in the test execution above need to pass.
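
A minimal sketch of the security group and security group rule operations above, using openstacksdk; the cloud name, group name and remote prefix are assumptions for illustration.

    import openstack

    conn = openstack.connect(cloud='mycloud')   # assumed clouds.yaml entry

    sg = conn.network.create_security_group(
        name='ovp-demo-sg', description='OVP demo security group')

    # An ingress TCP/22 rule with a remote prefix, similar to the rule
    # attributes exercised by the positive test actions above.
    rule = conn.network.create_security_group_rule(
        security_group_id=sg.id,
        direction='ingress',
        ethertype='IPv4',
        protocol='tcp',
        port_range_min=22,
        port_range_max=22,
        remote_ip_prefix='10.0.0.0/24')

    conn.network.delete_security_group_rule(rule)
    conn.network.delete_security_group(sg)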

Post conditions

N/A

VIM volume operations test specification
Scope

The VIM volume operations test area evaluates the ability of the system under test to support VIM volume operations. The test cases documented here are the volume API test cases in the OpenStack Interop guideline 2016.8 as implemented by the RefStack client. These test cases will evaluate basic OpenStack (as a VIM) volume operations, including:

  • Volume attach and detach operations
  • Volume service availability zone operations
  • Volume cloning operations
  • Image copy-to-volume operations
  • Volume creation and deletion operations
  • Volume service extension listing
  • Volume metadata operations
  • Volume snapshot operations
References
Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • SUT - System Under Test
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM deployed with a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on VIM volume API operations. Each test case is able to run independently, i.e. independent of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

For brevity, the test cases in this test area are summarized together based on the operations they are testing.

All these test cases are included in the test case dovetail.osinterop.tc001 of the OVP test suite.

Test Descriptions
API Used and Reference

Block storage: https://developer.openstack.org/api-ref/block-storage

  • create volume
  • delete volume
  • update volume
  • attach volume to server
  • detach volume from server
  • create volume metadata
  • update volume metadata
  • delete volume metadata
  • list volumes
  • create snapshot
  • update snapshot
  • delete snapshot
Test Case 1 - Volume attach and detach operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_attach_detach_volume_to_instance tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_get_volume_attachment tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_attach_volumes_with_nonexistent_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_detach_volumes_with_invalid_volume_id

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a server VM1
  • Test action 2: Attach a provided VOL1 to VM1
  • Test assertion 1: Verify VOL1 is in ‘in-use’ status
  • Test action 3: Detach VOL1 from VM1
  • Test assertion 2: Verify VOL1 is in ‘available’ status
  • Test action 4: Create a server VM2
  • Test action 5: Attach a provided VOL2 to VM2 and wait for VOL2 to reach ‘in-use’ status
  • Test action 6: Retrieve VOL2’s attachment information ATTCH_INFO
  • Test assertion 3: Verify ATTCH_INFO is correct
  • Test action 7: Create a server VM3 and wait for VM3 to reach ‘ACTIVE’ status
  • Test action 8: Attach a non-existent volume to VM3
  • Test assertion 4: Verify attach volume failed, a ‘NOT FOUND’ error is returned in the response
  • Test action 9: Detach a volume from a server by using an invalid volume ID
  • Test assertion 5: Verify detach volume failed, a ‘NOT FOUND’ error is returned in the response

This test evaluates the volume API ability of attaching a volume to a server and detaching a volume from a server. Specifically, the test verifies that:

  • Volumes can be attached and detached from servers.
  • Volume attachment information can be retrieved.
  • Attach and detach a volume using an invalid volume ID is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.
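
A sketch of the attach/detach flow, assuming openstacksdk and an existing server named ‘ovp-demo-vm’; the attachment helper names and their exact signatures are assumptions and may differ between SDK releases.

    import openstack

    conn = openstack.connect(cloud='mycloud')          # assumed clouds.yaml entry
    server = conn.compute.find_server('ovp-demo-vm')   # assumed existing server

    volume = conn.block_storage.create_volume(size=1, name='ovp-demo-vol')
    conn.block_storage.wait_for_status(volume, status='available')

    # Attach the volume, wait for the 'in-use' transition, then detach.
    attachment = conn.compute.create_volume_attachment(server,
                                                       volume_id=volume.id)
    conn.block_storage.wait_for_status(volume, status='in-use')

    conn.compute.delete_volume_attachment(attachment, server)
    conn.block_storage.wait_for_status(volume, status='available')
    conn.block_storage.delete_volume(volume)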

Post conditions

N/A

Test Case 2 - Volume service availability zone operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_availability_zone.AvailabilityZoneV2TestJSON.test_get_availability_zone_list

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: List all existent availability zones
  • Test assertion 1: Verify the availability zone list length is greater than 0

This test case evaluates the volume API ability of listing availability zones. Specifically, the test verifies that:

  • Availability zones can be listed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 3 - Volume cloning operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_get.VolumesV2GetTest.test_volume_create_get_update_delete_as_clone

Test preconditions
  • Volume extension API
  • Cinder volume clones feature is enabled
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a volume VOL1
  • Test action 2: Create a volume VOL2 from source volume VOL1 with a specific name and metadata, then wait for VOL2 to reach ‘available’ status
  • Test assertion 1: Verify the name of VOL2 is correct
  • Test action 3: Retrieve VOL2’s detail information
  • Test assertion 2: Verify the retrieved volume name, ID and metadata are the same as VOL2
  • Test assertion 3: Verify VOL2’s bootable flag is ‘False’
  • Test action 4: Update the name of VOL2 with the original value
  • Test action 5: Update the name of VOL2 with a new value
  • Test assertion 4: Verify the name of VOL2 is updated successfully
  • Test action 6: Create a volume VOL3 with no name specified and a description containing the characters '@#$%^*'
  • Test assertion 5: Verify VOL3 is created successfully
  • Test action 7: Update the name of VOL3 and description with the original value
  • Test assertion 6: Verify VOL3’s bootable flag is ‘False’

This test case evaluates the volume API ability of creating a cloned volume from a source volume, getting cloned volume detail information and updating cloned volume attributes.

Specifically, the test verifies that:

  • Cloned volume can be created from a source volume.
  • Cloned volume detail information can be retrieved.
  • Cloned volume detail information can be updated.

In order to pass this test, all test assertions listed in the test execution above need to pass.
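
A minimal cloning sketch with openstacksdk; the cloud name and volume names are assumptions, and source_volume_id is the SDK attribute that maps to the API’s ‘source_volid’ field.

    import openstack

    conn = openstack.connect(cloud='mycloud')   # assumed clouds.yaml entry

    src = conn.block_storage.create_volume(size=1, name='ovp-src-vol')
    conn.block_storage.wait_for_status(src, status='available')

    # Clone the source volume with its own name and metadata.
    clone = conn.block_storage.create_volume(
        size=1,
        name='ovp-clone-vol',
        source_volume_id=src.id,
        metadata={'purpose': 'ovp-demo'})
    conn.block_storage.wait_for_status(clone, status='available')

    print(conn.block_storage.get_volume(clone.id).metadata)

    conn.block_storage.delete_volume(clone)
    conn.block_storage.delete_volume(src)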

Post conditions

N/A

Test Case 4 - Image copy-to-volume operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_volume_bootable tempest.api.volume.test_volumes_get.VolumesV2GetTest.test_volume_create_get_update_delete_from_image

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Set a provided volume VOL1’s bootable flag to ‘True’
  • Test action 2: Retrieve VOL1’s bootable flag
  • Test assertion 1: Verify VOL1’s bootable flag is ‘True’
  • Test action 3: Set a provided volume VOL1’s bootable flag to ‘False’
  • Test action 4: Retrieve VOL1’s bootable flag
  • Test assertion 2: Verify VOL1’s bootable flag is ‘False’
  • Test action 5: Create a bootable volume VOL2 from one image with a specific name and metadata
  • Test action 6: Wait for VOL2 to reach ‘available’ status
  • Test assertion 3: Verify the name of VOL2 is correct
  • Test action 7: Retrieve VOL2’s information
  • Test assertion 4: Verify the retrieved volume name, ID and metadata are the same as VOL2
  • Test assertion 5: Verify VOL2’s bootable flag is ‘True’
  • Test action 8: Update the name of VOL2 with the original value
  • Test action 9: Update the name of VOL2 with a new value
  • Test assertion 6: Verify the name of VOL2 is updated successfully
  • Test action 10: Create a volume VOL3 with no name specified and a description containing the characters '@#$%^*'
  • Test assertion 7: Verify VOL3 is created successfully
  • Test action 11: Update the name of VOL3 and description with the original value
  • Test assertion 8: Verify VOL3’s bootable flag is ‘True’

This test case evaluates the volume API ability of updating a volume’s bootable flag, creating a bootable volume from an image, getting bootable volume detail information and updating a bootable volume.

Specifically, the test verifies that:

  • Volume bootable flag can be set and retrieved.
  • Bootable volume can be created from an image.
  • Bootable volume detail information can be retrieved.
  • Bootable volume detail information can be updated.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 5 - Volume creation and deletion operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_get.VolumesV2GetTest.test_volume_create_get_update_delete tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_invalid_size tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_nonexistent_source_volid tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_nonexistent_volume_type tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_without_passing_size tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_size_negative tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_size_zero

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a volume VOL1 with a specific name and metadata
  • Test action 2: Wait for VOL1 to reach ‘available’ status
  • Test assertion 1: Verify the name of VOL1 is correct
  • Test action 3: Retrieve VOL1’s information
  • Test assertion 2: Verify the retrieved volume name, ID and metadata are the same as VOL1
  • Test assertion 3: Verify VOL1’s bootable flag is ‘False’
  • Test action 4: Update the name of VOL1 with the original value
  • Test action 5: Update the name of VOL1 with a new value
  • Test assertion 4: Verify the name of VOL1 is updated successfully
  • Test action 6: Create a volume VOL2 with no name specified and a description containing the characters '@#$%^*'
  • Test assertion 5: Verify VOL2 is created successfully
  • Test action 7: Update the name of VOL2 and description with the original value
  • Test assertion 6: Verify VOL2’s bootable flag is ‘False’
  • Test action 8: Create a volume with an invalid size ‘#$%’
  • Test assertion 7: Verify create volume failed, a bad request error is returned in the response
  • Test action 9: Create a volume with a nonexistent source volume
  • Test assertion 8: Verify create volume failed, a ‘Not Found’ error is returned in the response
  • Test action 10: Create a volume with a nonexistent volume type
  • Test assertion 9: Verify create volume failed, a ‘Not Found’ error is returned in the response
  • Test action 11: Create a volume without passing a volume size
  • Test assertion 10: Verify create volume failed, a bad request error is returned in the response
  • Test action 12: Create a volume with a negative volume size
  • Test assertion 11: Verify create volume failed, a bad request error is returned in the response
  • Test action 13: Create a volume with volume size ‘0’
  • Test assertion 12: Verify create volume failed, a bad request error is returned in the response

This test case evaluates the volume API ability of creating a volume, getting volume detail information and updating a volume. Specifically, the test verifies that:

  • Volume can be created with a specific name and metadata.
  • Volume detail information can be retrieved/updated.
  • Create a volume with an invalid size is not allowed.
  • Create a volume with a nonexistent source volume or volume type is not allowed.
  • Create a volume without passing a volume size is not allowed.
  • Create a volume with a negative volume size is not allowed.
  • Create a volume with volume size ‘0’ is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.
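
As an illustrative sketch (not the Tempest implementation itself), the positive path and one of the negative creation paths above can be reproduced with openstacksdk; the cloud name and volume attributes are assumptions.

    import openstack
    from openstack import exceptions

    conn = openstack.connect(cloud='mycloud')   # assumed clouds.yaml entry

    # Positive path: a 1 GiB volume with a name and metadata.
    vol = conn.block_storage.create_volume(
        size=1, name='ovp-demo-vol', metadata={'owner': 'ovp'})
    conn.block_storage.wait_for_status(vol, status='available')
    conn.block_storage.delete_volume(vol)

    # Negative path: a zero size is rejected with a 400 Bad Request.
    try:
        conn.block_storage.create_volume(size=0)
    except exceptions.BadRequestException:
        print('size 0 rejected, as expected')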

Post conditions

N/A

Test Case 6 - Volume service extension listing operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_extensions.ExtensionsV2TestJSON.test_list_extensions

Test preconditions
  • Volume extension API
  • At least one Cinder extension is configured
Basic test flow execution description and pass/fail criteria
  • Test action 1: List all cinder service extensions
  • Test assertion 1: Verify all extensions are listed in the extension list

This test case evaluates the volume API ability of listing all existent volume service extensions. Specifically, the test verifies that:

  • Cinder service extensions can be listed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 7 - Volume GET operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_get_invalid_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_get_volume_without_passing_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_volume_get_nonexistent_volume_id

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Retrieve a volume with an invalid volume ID
  • Test assertion 1: Verify retrieve volume failed, a ‘Not Found’ error is returned in the response
  • Test action 2: Retrieve a volume with an empty volume ID
  • Test assertion 2: Verify retrieve volume failed, a ‘Not Found’ error is returned in the response
  • Test action 3: Retrieve a volume with a nonexistent volume ID
  • Test assertion 3: Verify retrieve volume failed, a ‘Not Found’ error is returned in the response

This test case evaluates the volume API ability of getting volumes. Specifically, the test verifies that:

  • Get a volume with an invalid/an empty/a nonexistent volume ID is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 8 - Volume listing operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_by_name tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_details_by_name tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_param_display_name_and_status tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_detail_param_display_name_and_status tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_detail_param_metadata tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_details tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_param_metadata tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_by_availability_zone tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_by_status tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_details_by_availability_zone tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_details_by_status tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_detail_with_invalid_status tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_detail_with_nonexistent_name tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_with_invalid_status tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_with_nonexistent_name tempest.api.volume.v2.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_details_pagination tempest.api.volume.v2.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_details_with_multiple_params tempest.api.volume.v2.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_pagination

Test preconditions
  • Volume extension API
  • The backing file for the volume group that Nova uses has space for at least 3 1G volumes
Basic test flow execution description and pass/fail criteria
  • Test action 1: List all existent volumes
  • Test assertion 1: Verify the volume list is complete
  • Test action 2: List existent volumes and filter the volume list by volume name
  • Test assertion 2: Verify the length of filtered volume list is 1 and the retrieved volume is correct
  • Test action 3: List existent volumes in detail and filter the volume list by volume name
  • Test assertion 3: Verify the length of filtered volume list is 1 and the retrieved volume is correct
  • Test action 4: List existent volumes and filter the volume list by volume name and status ‘available’
  • Test assertion 4: Verify the name and status parameters of the fetched volume are correct
  • Test action 5: List existent volumes in detail and filter the volume list by volume name and status ‘available’
  • Test assertion 5: Verify the name and status parameters of the fetched volume are correct
  • Test action 6: List all existent volumes in detail and filter the volume list by volume metadata
  • Test assertion 6: Verify the metadata parameter of the fetched volume is correct
  • Test action 7: List all existent volumes in detail
  • Test assertion 7: Verify the volume list is complete
  • Test action 8: List all existent volumes and filter the volume list by volume metadata
  • Test assertion 8: Verify the metadata parameter of the fetched volume is correct
  • Test action 9: List existent volumes and filter the volume list by availability zone
  • Test assertion 9: Verify the availability zone parameter of the fetched volume is correct
  • Test action 10: List all existent volumes and filter the volume list by volume status ‘available’
  • Test assertion 10: Verify the status parameter of the fetched volume is correct
  • Test action 11: List existent volumes in detail and filter the volume list by availability zone
  • Test assertion 11: Verify the availability zone parameter of the fetched volume is correct
  • Test action 12: List all existent volumes in detail and filter the volume list by volume status ‘available’
  • Test assertion 12: Verify the status parameter of the fetched volume is correct
  • Test action 13: List all existent volumes in detail and filter the volume list by an invalid volume status ‘null’
  • Test assertion 13: Verify the filtered volume list is empty
  • Test action 14: List all existent volumes in detail and filter the volume list by a non-existent volume name
  • Test assertion 14: Verify the filtered volume list is empty
  • Test action 15: List all existent volumes and filter the volume list by an invalid volume status ‘null’
  • Test assertion 15: Verify the filtered volume list is empty
  • Test action 16: List all existent volumes and filter the volume list by a non-existent volume name
  • Test assertion 16: Verify the filtered volume list is empty
  • Test action 17: List all existent volumes in detail and paginate the volume list by desired volume IDs
  • Test assertion 17: Verify only the desired volumes are listed in the filtered volume list
  • Test action 18: List all existent volumes in detail and filter the volume list by volume status ‘available’ and display limit ‘2’
  • Test action 19: Sort the filtered volume list by IDs in ascending order
  • Test assertion 18: Verify the length of filtered volume list is 2
  • Test assertion 19: Verify the status of retrieved volumes is correct
  • Test assertion 20: Verify the filtered volume list is sorted correctly
  • Test action 20: List all existent volumes in detail and filter the volume list by volume status ‘available’ and display limit ‘2’
  • Test action 21: Sort the filtered volume list by IDs in descending order
  • Test assertion 21: Verify the length of filtered volume list is 2
  • Test assertion 22: Verify the status of retrieved volumes is correct
  • Test assertion 23: Verify the filtered volume list is sorted correctly
  • Test action 22: List all existent volumes and paginate the volume list by desired volume IDs
  • Test assertion 24: Verify only the desired volumes are listed in the filtered volume list

This test case evaluates the volume API ability of getting a list of volumes and filtering the volume list. Specifically, the test verifies that:

  • Get a list of volumes (in detail) successfully.
  • Get a list of volumes (in detail) and filter volumes by name/status/metadata/availability zone successfully.
  • Volume list pagination functionality is working.
  • Get a list of volumes in detail using combined conditions successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.
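
A short listing sketch with openstacksdk; the cloud name is an assumed clouds.yaml entry and the filters shown are passed to the volume list call as query parameters.

    import openstack

    conn = openstack.connect(cloud='mycloud')   # assumed clouds.yaml entry

    # Detailed listing filtered by status; the filter is applied server-side.
    for vol in conn.block_storage.volumes(details=True, status='available'):
        print(vol.id, vol.name, vol.status, vol.availability_zone)

    # Filtering by a non-existent name yields an empty list, as in
    # test assertions 14 and 16.
    assert list(conn.block_storage.volumes(name='no-such-volume')) == []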

Post conditions

N/A

Test Case 9 - Volume metadata operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volume_metadata.VolumesV2MetadataTest.test_crud_volume_metadata tempest.api.volume.test_volume_metadata.VolumesV2MetadataTest.test_update_volume_metadata_item

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create metadata for a provided volume VOL1
  • Test action 2: Get the metadata of VOL1
  • Test assertion 1: Verify the metadata of VOL1 is correct
  • Test action 3: Update the metadata of VOL1
  • Test assertion 2: Verify the metadata of VOL1 is updated
  • Test action 4: Delete one metadata item ‘key1’ of VOL1
  • Test assertion 3: Verify the metadata item ‘key1’ is deleted
  • Test action 5: Create metadata for a provided volume VOL2
  • Test assertion 4: Verify the metadata of VOL2 is correct
  • Test action 6: Update one metadata item ‘key3’ of VOL2
  • Test assertion 5: Verify the metadata of VOL2 is updated

This test case evaluates the volume API ability of creating metadata for a volume, getting the metadata of a volume, updating volume metadata and deleting a metadata item of a volume. Specifically, the test verifies that:

  • Create metadata for volume successfully.
  • Get metadata of volume successfully.
  • Update volume metadata and metadata item successfully.
  • Delete metadata item of a volume successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.
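
A sketch of the metadata CRUD flow, assuming openstacksdk; the metadata helper names below are taken from recent SDK releases and, together with the cloud and volume names, are assumptions for illustration.

    import openstack

    conn = openstack.connect(cloud='mycloud')   # assumed clouds.yaml entry

    vol = conn.block_storage.create_volume(size=1, name='ovp-meta-vol')
    conn.block_storage.wait_for_status(vol, status='available')

    # Create, read, update and delete metadata items on the volume.
    conn.block_storage.set_volume_metadata(vol, key1='value1', key2='value2')
    print(conn.block_storage.get_volume_metadata(vol).metadata)
    conn.block_storage.set_volume_metadata(vol, key2='updated')
    conn.block_storage.delete_volume_metadata(vol, keys=['key1'])

    conn.block_storage.delete_volume(vol)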

Post conditions

N/A

Test Case 10 - Verification of read-only status on volumes with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_volume_readonly_update

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Update a provided volume VOL1’s read-only access mode to ‘True’
  • Test assertion 1: Verify VOL1 is in read-only access mode
  • Test action 2: Update a provided volume VOL1’s read-only access mode to ‘False’
  • Test assertion 2: Verify VOL1 is not in read-only access mode

This test case evaluates the volume API ability of setting and updating volume read-only access mode. Specifically, the test verifies that:

  • Volume read-only access mode can be set and updated.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 11 - Volume reservation operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_reserve_unreserve_volume tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_reserve_volume_with_negative_volume_status tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_reserve_volume_with_nonexistent_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_unreserve_volume_with_nonexistent_volume_id

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Update a provided volume VOL1 as reserved
  • Test assertion 1: Verify VOL1 is in ‘attaching’ status
  • Test action 2: Update VOL1 as un-reserved
  • Test assertion 2: Verify VOL1 is in ‘available’ status
  • Test action 3: Update a provided volume VOL2 as reserved
  • Test action 4: Update VOL2 as reserved again
  • Test assertion 3: Verify update VOL2 status failed, a bad request error is returned in the response
  • Test action 5: Update VOL2 as un-reserved
  • Test action 6: Update a non-existent volume as reserved by using an invalid volume ID
  • Test assertion 4: Verify update non-existent volume as reserved failed, a ‘Not Found’ error is returned in the response
  • Test action 7: Update a non-existent volume as un-reserved by using an invalid volume ID
  • Test assertion 5: Verify update non-existent volume as un-reserved failed, a ‘Not Found’ error is returned in the response

This test case evaluates the volume API ability of reserving and un-reserving volumes. Specifically, the test verifies that:

  • Volume can be reserved and un-reserved.
  • Update a non-existent volume as reserved is not allowed.
  • Update a non-existent volume as un-reserved is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 12 - Volume snapshot creation/deletion operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_snapshot_metadata.SnapshotV2MetadataTestJSON.test_crud_snapshot_metadata tempest.api.volume.test_snapshot_metadata.SnapshotV2MetadataTestJSON.test_update_snapshot_metadata_item tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_nonexistent_snapshot_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_delete_invalid_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_delete_volume_without_passing_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_volume_delete_nonexistent_volume_id tempest.api.volume.test_volumes_snapshots.VolumesV2SnapshotTestJSON.test_snapshot_create_get_list_update_delete tempest.api.volume.test_volumes_snapshots.VolumesV2SnapshotTestJSON.test_volume_from_snapshot tempest.api.volume.test_volumes_snapshots_list.VolumesV2SnapshotListTestJSON.test_snapshots_list_details_with_params tempest.api.volume.test_volumes_snapshots_list.VolumesV2SnapshotListTestJSON.test_snapshots_list_with_params tempest.api.volume.test_volumes_snapshots_negative.VolumesV2SnapshotNegativeTestJSON.test_create_snapshot_with_nonexistent_volume_id tempest.api.volume.test_volumes_snapshots_negative.VolumesV2SnapshotNegativeTestJSON.test_create_snapshot_without_passing_volume_id

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create metadata for a provided snapshot SNAP1
  • Test action 2: Get the metadata of SNAP1
  • Test assertion 1: Verify the metadata of SNAP1 is correct
  • Test action 3: Update the metadata of SNAP1
  • Test assertion 2: Verify the metadata of SNAP1 is updated
  • Test action 4: Delete one metadata item ‘key3’ of SNAP1
  • Test assertion 3: Verify the metadata item ‘key3’ is deleted
  • Test action 5: Create metadata for a provided snapshot SNAP2
  • Test assertion 4: Verify the metadata of SNAP2 is correct
  • Test action 6: Update one metadata item ‘key3’ of SNAP2
  • Test assertion 5: Verify the metadata of SNAP2 is updated
  • Test action 7: Create a volume with a nonexistent snapshot
  • Test assertion 6: Verify create volume failed, a ‘Not Found’ error is returned in the response
  • Test action 8: Delete a volume with an invalid volume ID
  • Test assertion 7: Verify delete volume failed, a ‘Not Found’ error is returned in the response
  • Test action 9: Delete a volume with an empty volume ID
  • Test assertion 8: Verify delete volume failed, a ‘Not Found’ error is returned in the response
  • Test action 10: Delete a volume with a nonexistent volume ID
  • Test assertion 9: Verify delete volume failed, a ‘Not Found’ error is returned in the response
  • Test action 11: Create a snapshot SNAP3 from a provided volume VOL1
  • Test action 12: Retrieve SNAP3’s detail information
  • Test assertion 10: Verify SNAP3 is created from VOL1
  • Test action 13: Update the name and description of SNAP3
  • Test assertion 11: Verify the name and description of SNAP3 are updated in the response body of the update snapshot API
  • Test action 14: Retrieve SNAP3’s detail information
  • Test assertion 12: Verify the name and description of SNAP3 are correct
  • Test action 15: Delete SNAP3
  • Test action 16: Create a volume VOL2 with a volume size
  • Test action 17: Create a snapshot SNAP4 from VOL2
  • Test action 18: Create a volume VOL3 from SNAP4 with a bigger volume size
  • Test action 19: Retrieve VOL3’s detail information
  • Test assertion 13: Verify volume size and source snapshot of VOL3 are correct
  • Test action 20: List all snapshots in detail and filter the snapshot list by name
  • Test assertion 14: Verify the filtered snapshot list is correct
  • Test action 21: List all snapshots in detail and filter the snapshot list by status
  • Test assertion 15: Verify the filtered snapshot list is correct
  • Test action 22: List all snapshots in detail and filter the snapshot list by name and status
  • Test assertion 16: Verify the filtered snapshot list is correct
  • Test action 23: List all snapshots and filter the snapshot list by name
  • Test assertion 17: Verify the filtered snapshot list is correct
  • Test action 24: List all snapshots and filter the snapshot list by status
  • Test assertion 18: Verify the filtered snapshot list is correct
  • Test action 25: List all snapshots and filter the snapshot list by name and status
  • Test assertion 19: Verify the filtered snapshot list is correct
  • Test action 26: Create a snapshot from a nonexistent volume by using an invalid volume ID
  • Test assertion 20: Verify create snapshot failed, a ‘Not Found’ error is returned in the response
  • Test action 27: Create a snapshot from a volume by using an empty volume ID
  • Test assertion 21: Verify create snapshot failed, a ‘Not Found’ error is returned in the response

This test case evaluates the volume API ability of managing snapshots and snapshot metadata. Specifically, the test verifies that:

  • Create metadata for snapshot successfully.
  • Get metadata of snapshot successfully.
  • Update snapshot metadata and metadata item successfully.
  • Delete metadata item of a snapshot successfully.
  • Create a volume from a nonexistent snapshot is not allowed.
  • Delete a volume using an invalid volume ID is not allowed.
  • Delete a volume without passing the volume ID is not allowed.
  • Delete a non-existent volume is not allowed.
  • Create snapshot successfully.
  • Get snapshot’s detail information successfully.
  • Update snapshot attributes successfully.
  • Delete snapshot successfully.
  • Create a volume from a snapshot, passing a size different from the source volume, successfully.
  • List snapshot details by display_name and status filters successfully.
  • Create a snapshot from a nonexistent volume is not allowed.
  • Create a snapshot from a volume without passing the volume ID is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.
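
As an illustration of the snapshot and metadata operations exercised above, the sketch below uses the openstacksdk Python client (an assumption; the Tempest tests use their own REST clients) against a pre-existing volume. The cloud name, volume id and metadata keys are placeholders:

    import openstack
    from openstack import exceptions

    conn = openstack.connect(cloud="mycloud")          # hypothetical clouds.yaml entry

    # Create a snapshot of an existing volume and attach metadata to it (SNAP1).
    snap = conn.block_storage.create_snapshot(
        volume_id="VOL1_ID",                           # placeholder volume id
        name="SNAP1",
        metadata={"key1": "value1", "key2": "value2", "key3": "value3"},
    )
    conn.block_storage.wait_for_status(snap, status="available")

    # Read the snapshot back and check its metadata (cf. test assertion 1).
    assert conn.block_storage.get_snapshot(snap.id).metadata["key1"] == "value1"

    # Creating a volume from a nonexistent snapshot must fail with 'Not Found'
    # (cf. test assertion 6); the SDK surfaces the 404 as NotFoundException.
    try:
        conn.block_storage.create_volume(size=1, snapshot_id="nonexistent-id")
    except exceptions.NotFoundException:
        print("create volume from nonexistent snapshot correctly rejected")

    conn.block_storage.delete_snapshot(snap, ignore_missing=False)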

Post conditions

N/A

Test Case 13 - Volume update operations with the Cinder v2 API
Test case specification

tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_update_volume_with_empty_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_update_volume_with_invalid_volume_id tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_update_volume_with_nonexistent_volume_id

Test preconditions
  • Volume extension API
Basic test flow execution description and pass/fail criteria
  • Test action 1: Update a volume by using an empty volume ID
  • Test assertion 1: Verify update volume failed, a ‘Not Found’ error is returned in the response
  • Test action 2: Update a volume by using an invalid volume ID
  • Test assertion 2: Verify update volume failed, a ‘Not Found’ error is returned in the response
  • Test action 3: Update a non-existent volume by using a random generated volume ID
  • Test assertion 3: Verify update volume failed, a ‘Not Found’ error is returned in the response

This test case evaluates the volume API ability of updating volume attributes. Specifically, the test verifies that:

  • Update a volume without passing the volume ID is not allowed.
  • Update a volume using an invalid volume ID is not allowed.
  • Update a non-existent volume is not allowed.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Vping test specification
Scope

The vping test area evaluates basic NFVi capabilities of the system under test. These capabilities include creating a small number of virtual machines, establishing basic L3 connectivity between them and verifying connectivity by means of ICMP packets.

Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • ICMP - Internet Control Message Protocol
  • L3 - Layer 3
  • NFVi - Network functions virtualization infrastructure
  • SCP - Secure Copy
  • SSH - Secure Shell
  • VM - Virtual machine
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured into two separate tests, which are executed sequentially. The order of the tests is arbitrary, as there are no dependencies across the tests.

Test Descriptions
Test Case 1 - vPing using userdata provided by nova metadata service
Short name

dovetail.vping.tc001.userdata

Use case specification

This test evaluates the use case where an NFVi tenant boots up two VMs and requires L3 connectivity between those VMs. The target IP is passed to the VM that will initiate pings via a custom userdata script provided by the nova metadata service.

Test preconditions

At least one compute node is available. No further pre-configuration needed.

Basic test flow execution description and pass/fail criteria
Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IP is passed to the VM sending pings by means of a custom userdata script delivered by the nova metadata service. Whether or not a ping was successful is determined by checking the console output of the source VM.
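
A minimal sketch of this mechanism, assuming the openstacksdk Python client and pre-existing image, flavor and network identifiers (all placeholders), is shown below; the actual test implementation differs in detail but follows the same pattern:

    import openstack

    conn = openstack.connect(cloud="mycloud")          # hypothetical clouds.yaml entry

    # Ping loop executed inside VM2 at boot time; prints "vPing OK" on success.
    userdata = """#!/bin/sh
    while true; do
        ping -c 1 {target_ip} > /dev/null 2>&1 && echo "vPing OK" && break
        echo "vPing KO"
        sleep 1
    done
    """.format(target_ip="VM1_PRIVATE_IP")             # placeholder: IP of VM1

    vm2 = conn.create_server(
        name="opnfv-vping-2",
        image="IMAGE_NAME",                            # placeholder image
        flavor="FLAVOR_NAME",                          # placeholder flavor
        network="NET_ID",                              # private tenant network
        userdata=userdata,
        wait=True,
    )

    # The result is read back from the console log of VM2 (may need polling).
    print(conn.get_server_console(vm2))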

Test execution
  • Test action 1: Create a private tenant network using the neutron client; create one subnet and one router in the network; add one interface between the subnet and the router; add one gateway route to the router; store the network id in the response
  • Test assertion 1: The network id, subnet id and router id can be found in the response
  • Test action 2: Create a security group using the neutron client; store the security group id parameter from the response
  • Test assertion 2: The security group id can be found in the response
  • Test action 3: Boot VM1 using the nova client with the configured name, image and flavor, the private tenant network created in test action 1 and the security group created in test action 2
  • Test assertion 3: The VM1 object can be found in the response
  • Test action 4: Generate ping script with the IP of VM1 to be passed as userdata provided by the nova metadata service.
  • Test action 5: Boot VM2 by using nova client with configured name, image, flavor, private tenant network created in test action 1, security group created in test action 2, userdata created in test action 4
  • Test assertion 4: The VM2 object can be found in the response
  • Test action 6: Inside VM2, the ping script is executed automatically at boot; it loops, pinging until the return code is 0 or a timeout is reached. For each ping, when the return code is 0, “vPing OK” is printed in the VM2 console-log, otherwise “vPing KO” is printed. Monitor the console-log of VM2 to see the response generated by the script.
  • Test assertion 5: “vPing OK” is detected, when monitoring the console-log in VM2
  • Test action 7: Delete VM1 and VM2
  • Test assertion 6: VM1 and VM2 are not present in the VM list
  • Test action 8: Delete the security group, gateway, interface, router, subnet and network
  • Test assertion 7: The security group, gateway, interface, router, subnet and network are no longer present in the lists after deleting
Pass / fail criteria

This test evaluates basic NFVi capabilities of the system under test. Specifically, the test verifies that:

  • Neutron client network, subnet, router and interface create commands return valid “id” parameters which are shown in the create response message
  • The neutron client command adding an interface between the subnet and the router returns a success code
  • The neutron client command adding a gateway to the router returns a success code
  • Neutron client security group create command returns a valid “id” parameter which is shown in the response message
  • Nova client VM create command returns a valid VM attributes response message
  • The nova metadata service can transfer the userdata configuration at nova client VM boot time
  • The ping command from one VM to the other in the same private tenant network returns a success code
  • All items created using neutron client or nova client create commands are able to be removed by using the returned identifiers

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

None

Test Case 2 - vPing using SSH to a floating IP
Short name

dovetail.vping.tc002.ssh

Use case specification

This test evaluates the use case where an NFVi tenant boots up two VMs and requires L3 connectivity between those VMs. An SSH connection is established from the host to a floating IP associated with VM2 and ping is executed on VM2 with the IP of VM1 as target.

Test preconditions

At least one compute node is available. An OpenStack external network, from which floating IPs can be assigned, should exist.

Basic test flow execution description and pass/fail criteria
Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. To this end, the test establishes an SSH connection from the host running the test suite to a floating IP associated with VM2 and executes ping on VM2 with the IP of VM1 as target.
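
A minimal sketch of this check, assuming the paramiko SSH library, a floating IP already associated with VM2 and a key pair injected at boot (all values are placeholders):

    import paramiko

    FLOATING_IP = "203.0.113.10"        # placeholder floating IP of VM2
    VM1_IP = "10.0.0.5"                 # placeholder private IP of VM1

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(FLOATING_IP, username="cirros",
                key_filename="vping_key.pem",          # placeholder key file
                timeout=300)

    # Run a single ping from VM2 towards VM1 and inspect its exit status.
    stdin, stdout, stderr = ssh.exec_command("ping -c 1 %s" % VM1_IP)
    if stdout.channel.recv_exit_status() == 0:
        print("vPing OK")
    else:
        print("vPing KO")
    ssh.close()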

Test execution
  • Test action 1: Create a private tenant network using the neutron client; create one subnet and one router in the network; create one interface between the subnet and the router; add one gateway route to the router; store the network id in the response
  • Test assertion 1: The network id, subnet id and router id can be found in the response
  • Test action 2: Create a security group using the neutron client; store the security group id parameter from the response
  • Test assertion 2: The security group id can be found in the response
  • Test action 3: Boot VM1 by using nova client with configured name, image, flavor, private tenant network created in test action 1, security group created in test action 2
  • Test assertion 3: The VM1 object can be found in the response
  • Test action 4: Boot VM2 by using nova client with configured name, image, flavor, private tenant network created in test action 1, security group created in test action 2
  • Test assertion 4: The VM2 object can be found in the response
  • Test action 5: Create one floating IP using the neutron client, storing the floating IP address returned in the response
  • Test assertion 5: Floating IP address can be found in the response
  • Test action 6: Assign the floating IP address created in test action 5 to VM2 by using nova client
  • Test assertion 6: The assigned floating IP can be found in the VM2 console log file
  • Test action 7: Establish SSH connection between the test host and VM2 through the floating IP
  • Test assertion 7: SSH connection between the test host and VM2 is established within 300 seconds
  • Test action 8: Copy the Ping script from the test host to VM2 by using SCPClient
  • Test assertion 8: The Ping script can be found inside VM2
  • Test action 9: Inside VM2, execute the ping script to ping VM1; the ping script loops, pinging until the return code is 0 or a timeout is reached. For each ping, when the return code is 0, “vPing OK” is printed in the VM2 console-log, otherwise “vPing KO” is printed. Monitor the console-log of VM2 to see the response generated by the script.
  • Test assertion 9: “vPing OK” is detected, when monitoring the console-log in VM2
  • Test action 10: Delete VM1 and VM2
  • Test assertion 10: VM1 and VM2 are not present in the VM list
  • Test action 11: Delete the floating IP, security group, gateway, interface, router, subnet and network
  • Test assertion 11: The floating IP, security group, gateway, interface, router, subnet and network are no longer present in the lists after deleting
Pass / fail criteria

This test evaluates basic NFVi capabilities of the system under test. Specifically, the test verifies that:

  • Neutron client network, subnet, router and interface create commands return valid “id” parameters which are shown in the create response message
  • The neutron client command adding an interface between the subnet and the router returns a success code
  • The neutron client command adding a gateway to the router returns a success code
  • Neutron client security group create command returns a valid “id” parameter which is shown in the response message
  • Nova client VM create command returns a valid VM attributes response message
  • Neutron client floating IP create command returns a valid floating IP address
  • Nova client add floating IP command returns a valid response message
  • An SSH connection can be established using a floating IP
  • The ping command from one VM to another in the same private tenant network returns a success code
  • All items created using neutron client or nova client create commands are able to be removed by using the returned identifiers

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

None

IPv6 test specification
Scope

The IPv6 test area evaluates the ability of the SUT to support IPv6 tenant network features and functionality. The tests in this test area will evaluate:

  • network, subnet, port, router API CRUD operations
  • interface add and remove operations
  • security group and security group rule API CRUD operations
  • IPv6 address assignment with dual stack, dual net, multiprefix in mode DHCPv6 stateless or SLAAC
Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • CIDR - Classless Inter-Domain Routing
  • CRUD - Create, Read, Update, and Delete
  • DHCP - Dynamic Host Configuration Protocol
  • DHCPv6 - Dynamic Host Configuration Protocol version 6
  • ICMP - Internet Control Message Protocol
  • NFVI - Network Functions Virtualization Infrastructure
  • NIC - Network Interface Controller
  • RA - Router Advertisements
  • radvd - The Router Advertisement Daemon
  • SDN - Software Defined Network
  • SLAAC - Stateless Address Auto Configuration
  • TCP - Transmission Control Protocol
  • UDP - User Datagram Protocol
  • VM - Virtual Machine
  • vNIC - virtual Network Interface Card
System Under Test (SUT)

The system under test is assumed to be the NFVI and VIM deployed with a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on network, port and subnet operations. Each test case is able to run independently, i.e. independent of the state created by a previous test.

Test Descriptions
API Used and Reference

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • show network details
  • update network
  • delete network
  • list networks
  • create network
  • bulk create networks

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • list subnets
  • create subnet
  • bulk create subnet
  • show subnet details
  • update subnet
  • delete subnet

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • list routers
  • create router
  • show router details
  • update router
  • delete router
  • add interface to router
  • remove interface from router

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • show port details
  • update port
  • delete port
  • list port
  • create port
  • bulk create ports

Security groups: https://developer.openstack.org/api-ref/networking/v2/index.html#security-groups-security-groups

  • list security groups
  • create security groups
  • show security group
  • update security group
  • delete security group

Security groups rules: https://developer.openstack.org/api-ref/networking/v2/index.html#security-group-rules-security-group-rules

  • list security group rules
  • create security group rule
  • show security group rule
  • delete security group rule

Servers: https://developer.openstack.org/api-ref/compute/

  • list servers
  • create server
  • create multiple servers
  • list servers detailed
  • show server details
  • update server
  • delete server
Test Case 1 - Create and Delete Bulk Network, IPv6 Subnet and Port
Short name

dovetail.ipv6.tc001.bulk_network_subnet_port_create_delete

Use case specification

This test case evaluates the SUT API ability of creating and deleting multiple networks, IPv6 subnets, ports in one request, the reference is,

tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_network tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_subnet tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_port

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create 2 networks using bulk create, storing the “id” parameters returned in the response
  • Test action 2: List all networks, verifying the two network id’s are found in the list
  • Test assertion 1: The two “id” parameters are found in the network list
  • Test action 3: Delete the 2 created networks using the stored network ids
  • Test action 4: List all networks, verifying the network ids are no longer present
  • Test assertion 2: The two “id” parameters are not present in the network list
  • Test action 5: Create 2 networks using bulk create, storing the “id” parameters returned in the response
  • Test action 6: Create an IPv6 subnet on each of the two networks using bulk create commands, storing the associated “id” parameters
  • Test action 7: List all subnets, verify the IPv6 subnets are found in the list
  • Test assertion 3: The two IPv6 subnet “id” parameters are found in the subnet list
  • Test action 8: Delete the 2 IPv6 subnets using the stored “id” parameters
  • Test action 9: List all subnets, verify the IPv6 subnets are no longer present in the list
  • Test assertion 4: The two IPv6 subnet “id” parameters are not present in the list
  • Test action 10: Delete the 2 networks created in test action 5, using the stored network ids
  • Test action 11: List all networks, verifying the network ids are no longer present
  • Test assertion 5: The two “id” parameters are not present in the network list
  • Test action 12: Create 2 networks using bulk create, storing the “id” parameters returned in the response
  • Test action 13: Create a port on each of the two networks using bulk create commands, storing the associated “port_id” parameters
  • Test action 14: List all ports, verify the port_ids are found in the list
  • Test assertion 6: The two “port_id” parameters are found in the ports list
  • Test action 15: Delete the 2 ports using the stored “port_id” parameters
  • Test action 16: List all ports, verify port_ids are no longer present in the list
  • Test assertion 7: The two “port_id” parameters are not present in the list
  • Test action 17: Delete the 2 networks created in test action 12, using the stored network ids
  • Test action 18: List all networks, verifying the network ids are no longer present
  • Test assertion 8: The two “id” parameters are not present in the network list

This test evaluates the ability to use bulk create commands to create networks, IPv6 subnets and ports on the SUT API. Specifically it verifies that:

  • Bulk network create commands return valid “id” parameters which are reported in the list commands
  • Bulk IPv6 subnet commands return valid “id” parameters which are reported in the list commands
  • Bulk port commands return valid “port_id” parameters which are reported in the list commands
  • All items created using bulk create commands are able to be removed using the returned identifiers
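
The bulk create operations above map to a single POST request carrying a plural body. A minimal sketch using the python-neutronclient library (an assumption; any client able to send the documented plural payload would work), with placeholder credentials:

    from keystoneauth1 import loading, session
    from neutronclient.v2_0 import client

    # Placeholder credentials; in practice these come from the environment.
    loader = loading.get_plugin_loader("password")
    auth = loader.load_from_options(auth_url="http://CONTROLLER:5000/v3",
                                    username="demo", password="secret",
                                    project_name="demo",
                                    user_domain_name="Default",
                                    project_domain_name="Default")
    neutron = client.Client(session=session.Session(auth=auth))

    # Bulk create: passing the plural key creates both networks in one request.
    nets = neutron.create_network(
        {"networks": [{"name": "bulk-net-1"}, {"name": "bulk-net-2"}]}
    )["networks"]

    # Bulk create one IPv6 subnet per network, again in a single request.
    subnets = neutron.create_subnet(
        {"subnets": [{"network_id": n["id"], "ip_version": 6,
                      "cidr": "2001:db8:%d::/64" % i}
                     for i, n in enumerate(nets)]}
    )["subnets"]

    # Clean up using the returned identifiers.
    for s in subnets:
        neutron.delete_subnet(s["id"])
    for n in nets:
        neutron.delete_network(n["id"])
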
Post conditions

N/A

Test Case 2 - Create, Update and Delete an IPv6 Network and Subnet
Short name

dovetail.ipv6.tc002.network_subnet_create_update_delete

Use case specification

This test case evaluates the SUT API ability of creating, updating and deleting a network and an IPv6 subnet of that network, the reference is

tempest.api.network.test_networks.NetworksIpV6Test.test_create_update_delete_network_subnet

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” and “status” parameters returned in the response
  • Test action 2: Verify the value of the created network’s “status” is ACTIVE
  • Test assertion 1: The created network’s “status” is ACTIVE
  • Test action 3: Update this network with a new_name
  • Test action 4: Verify the network’s name equals the new_name
  • Test assertion 2: The network’s name equals to the new_name after name updating
  • Test action 5: Create an IPv6 subnet within the network, storing the “id” parameters returned in the response
  • Test action 6: Update this IPv6 subnet with a new_name
  • Test action 7: Verify the IPv6 subnet’s name equals the new_name
  • Test assertion 3: The IPv6 subnet’s name equals to the new_name after name updating
  • Test action 8: Delete the IPv6 subnet created in test action 5, using the stored subnet id
  • Test action 9: List all subnets, verifying the subnet id is no longer present
  • Test assertion 4: The IPv6 subnet “id” is not present in the subnet list
  • Test action 10: Delete the network created in test action 1, using the stored network id
  • Test action 11: List all networks, verifying the network id is no longer present
  • Test assertion 5: The network “id” is not present in the network list

This test evaluates the ability to create, update, delete network, IPv6 subnet on the SUT API. Specifically it verifies that:

  • Create network commands return ACTIVE “status” parameters which are reported in the list commands
  • Update network commands return updated “name” parameters which equals to the “name” used
  • Update subnet commands return updated “name” parameters which equals to the “name” used
  • All items created using create commands are able to be removed using the returned identifiers
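
A minimal sketch of the same create/update/delete flow, assuming the openstacksdk Python client; names and the CIDR are placeholders:

    import openstack

    conn = openstack.connect(cloud="mycloud")

    net = conn.network.create_network(name="ipv6-net")
    assert net.status == "ACTIVE"                       # cf. test assertion 1

    net = conn.network.update_network(net, name="ipv6-net-renamed")
    assert net.name == "ipv6-net-renamed"               # cf. test assertion 2

    subnet = conn.network.create_subnet(network_id=net.id, ip_version=6,
                                        cidr="2001:db8:1::/64",
                                        name="ipv6-subnet")
    subnet = conn.network.update_subnet(subnet, name="ipv6-subnet-renamed")
    assert subnet.name == "ipv6-subnet-renamed"         # cf. test assertion 3

    conn.network.delete_subnet(subnet, ignore_missing=False)
    conn.network.delete_network(net, ignore_missing=False)
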
Post conditions

None

Test Case 3 - Check External Network Visibility
Short name

dovetail.ipv6.tc003.external_network_visibility

Use case specification

This test case verifies that a user can see external networks but not their subnets, the reference is,

tempest.api.network.test_networks.NetworksIpV6Test.test_external_network_visibility

Test preconditions
  1. The SUT has at least one external network.
  2. In the external network list there is no network without an external router, i.e., all networks in this list have an external router.
  3. There is one external network with the configured public network id and there is no subnet on this network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: List all networks with an external router, storing the “id” parameters returned in the response
  • Test action 2: Verify the list in test action 1 is not empty
  • Test assertion 1: The list of networks with an external router is not empty
  • Test action 3: List all networks without an external router within the test action 1 list
  • Test action 4: Verify the list in test action 3 is empty
  • Test assertion 2: The list of networks without an external router within the external network list is empty
  • Test action 5: Verify the configured public network id is found among the “id”s stored in test action 1
  • Test assertion 3: The public network id is found in the external network “id”s
  • Test action 6: List the subnets of the external network with the configured public network id
  • Test action 7: Verify list in test action 6 is empty
  • Test assertion 4: There is no subnet of the external network with the configured public network id

This test evaluates the ability to use list commands to list the external networks and the pre-configured public network. Specifically it verifies that:

  • Network list commands to find visible networks with external router
  • Network list commands to find visible network with pre-configured public network id
  • Subnet list commands to find no subnet on the pre-configured public network
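
A minimal sketch of the visibility checks, assuming the openstacksdk Python client and a placeholder public network id:

    import openstack

    conn = openstack.connect(cloud="mycloud")
    PUBLIC_NET_ID = "PUBLIC_NETWORK_ID"                 # placeholder

    # Networks flagged with router:external=True are the external networks.
    external = list(conn.network.networks(is_router_external=True))
    assert external, "at least one external network is expected"
    assert PUBLIC_NET_ID in [n.id for n in external]

    # Per the preconditions, the pre-configured public network carries no subnets.
    assert list(conn.network.subnets(network_id=PUBLIC_NET_ID)) == []
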
Post conditions

None

Test Case 4 - List IPv6 Networks and Subnets
Short name

dovetail.ipv6.tc004.network_subnet_list

Use case specification

This test case evaluates the SUT API ability of listing networks and subnets after creating a network and an IPv6 subnet, the reference is

tempest.api.network.test_networks.NetworksIpV6Test.test_list_networks tempest.api.network.test_networks.NetworksIpV6Test.test_list_subnets

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: List all networks, verifying the network id is found in the list
  • Test assertion 1: The “id” parameter is found in the network list
  • Test action 3: Create an IPv6 subnet of the network created in test action 1, storing the “id” parameter returned in the response
  • Test action 4: List all subnets of this network, verifying the IPv6 subnet id is found in the list
  • Test assertion 2: The “id” parameter is found in the IPv6 subnet list
  • Test action 5: Delete the IPv6 subnet using the stored “id” parameters
  • Test action 6: List all subnets, verify subnet_id is no longer present in the list
  • Test assertion 3: The IPv6 subnet “id” parameter is not present in list
  • Test action 7: Delete the network created in test action 1, using the stored network ids
  • Test action 8: List all networks, verifying the network id is no longer present
  • Test assertion 4: The network “id” parameter is not present in the network list

This test evaluates the ability to use create commands to create network, IPv6 subnet, list commands to list the created networks, IPv6 subnet on the SUT API. Specifically it verifies that:

  • Create commands to create network, IPv6 subnet
  • List commands to find the created network and IPv6 subnet in the network and subnet lists
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 5 - Show Details of an IPv6 Network and Subnet
Short name

dovetail.ipv6.tc005.network_subnet_show

Use case specification

This test case evaluates the SUT API ability of showing the network, subnet details, the reference is,

tempest.api.network.test_networks.NetworksIpV6Test.test_show_network tempest.api.network.test_networks.NetworksIpV6Test.test_show_subnet

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” and “name” parameter returned in the response
  • Test action 2: Show the network id and name, verifying the network id and name equal to the “id” and “name” stored in test action 1
  • Test assertion 1: The id and name equal to the “id” and “name” stored in test action 1
  • Test action 3: Create an IPv6 subnet of the network, storing the “id” and CIDR parameter returned in the response
  • Test action 4: Show the details of the created IPv6 subnet, verifying the id and CIDR in the details are equal to the stored id and CIDR in test action 3.
  • Test assertion 2: The “id” and CIDR in show details equal to “id” and CIDR stored in test action 3
  • Test action 5: Delete the IPv6 subnet using the stored “id” parameter
  • Test action 6: List all subnets on the network, verify the IPv6 subnet id is no longer present in the list
  • Test assertion 3: The IPv6 subnet “id” parameter is not present in list
  • Test action 7: Delete the network created in test action 1, using the stored network id
  • Test action 8: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list

This test evaluates the ability to use create commands to create network, IPv6 subnet and show commands to show network, IPv6 subnet details on the SUT API. Specifically it verifies that:

  • Network show commands return correct “id” and “name” parameter which equal to the returned response in the create commands
  • IPv6 subnet show commands return correct “id” and CIDR parameter which equal to the returned response in the create commands
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 6 - Create an IPv6 Port in Allowed Allocation Pools
Short name

dovetail.ipv6.tc006.port_create_in_allocation_pool

Use case specification

This test case evaluates the SUT API ability of creating an IPv6 subnet within allowed IPv6 address allocation pool and creating a port whose address is in the range of the pool, the reference is,

tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_in_allowed_allocation_pools

Test preconditions

There should be an IPv6 CIDR configuration whose prefix length is less than 126.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Check the allocation pools configuration, verifying the prefix length of the IPv6 CIDR configuration is less than 126.
  • Test assertion 1: The prefix length of the IPv6 CIDR configuration is less than 126
  • Test action 3: Get the allocation pool by setting the start_ip and end_ip based on the IPv6 CIDR configuration.
  • Test action 4: Create an IPv6 subnet of the network within the allocation pools, storing the “id” parameter returned in the response
  • Test action 5: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 6: Verify the port’s IP address is within the range of the allocation pool obtained in test action 3
  • Test assertion 2: The port’s IP address is within the range of the allocation pool
  • Test action 7: Delete the port using the stored “id” parameter
  • Test action 8: List all ports, verify the port id is no longer present in the list
  • Test assertion 3: The port “id” parameter is not present in list
  • Test action 9: Delete the IPv6 subnet using the stored “id” parameter
  • Test action 10: List all subnets on the network, verify the IPv6 subnet id is no longer present in the list
  • Test assertion 4: The IPv6 subnet “id” parameter is not present in list
  • Test action 11: Delete the network created in test action 1, using the stored network id
  • Test action 12: List all networks, verifying the network id is no longer present
  • Test assertion 5: The “id” parameter is not present in the network list

This test evaluates the ability to use create commands to create an IPv6 subnet within allowed IPv6 address allocation pool and create a port whose address is in the range of the pool. Specifically it verifies that:

  • IPv6 subnet create command to create an IPv6 subnet within allowed IPv6 address allocation pool
  • Port create command to create a port whose IP address is within the range of the allocation pool
  • All items created using create commands are able to be removed using the returned identifiers
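
A minimal sketch of the allocation pool check, assuming the openstacksdk Python client and Python's ipaddress module; the CIDR and pool boundaries are placeholders chosen so that the pool covers a sub-range of the subnet:

    import ipaddress
    import openstack

    conn = openstack.connect(cloud="mycloud")

    net = conn.network.create_network(name="pool-net")
    pool = {"start": "2001:db8:2::10", "end": "2001:db8:2::ff"}   # placeholder pool
    subnet = conn.network.create_subnet(network_id=net.id, ip_version=6,
                                        cidr="2001:db8:2::/64",
                                        allocation_pools=[pool])

    port = conn.network.create_port(network_id=net.id)
    ip = ipaddress.ip_address(port.fixed_ips[0]["ip_address"])

    # The port address must fall inside the allocation pool (cf. assertion 2).
    assert (ipaddress.ip_address(pool["start"]) <= ip
            <= ipaddress.ip_address(pool["end"]))

    conn.network.delete_port(port)
    conn.network.delete_subnet(subnet)
    conn.network.delete_network(net)
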
Post conditions

None

Test Case 7 - Create an IPv6 Port with Empty Security Groups
Short name

dovetail.ipv6.tc007.port_create_empty_security_group

Use case specification

This test case evaluates the SUT API ability of creating port with empty security group, the reference is,

tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_with_no_securitygroups

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create an IPv6 subnet of the network, storing the “id” parameter returned in the response
  • Test action 3: Create a port of the network with an empty security group, storing the “id” parameter returned in the response
  • Test action 4: Verify the security group of the port is not none but is empty
  • Test assertion 1: the security group of the port is not none but is empty
  • Test action 5: Delete the port using the stored “id” parameter
  • Test action 6: List all ports, verify the port id is no longer present in the list
  • Test assertion 2: The port “id” parameter is not present in list
  • Test action 7: Delete the IPv6 subnet using the stored “id” parameter
  • Test action 8: List all subnets on the network, verify the IPv6 subnet id is no longer present in the list
  • Test assertion 3: The IPv6 subnet “id” parameter is not present in list
  • Test action 9: Delete the network created in test action 1, using the stored network id
  • Test action 10: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list

This test evaluates the ability to use create commands to create a port with an empty security group on the SUT API. Specifically it verifies that:

  • Port create commands to create a port with an empty security group
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 8 - Create, Update and Delete an IPv6 Port
Short name

dovetail.ipv6.tc008.port_create_update_delete

Use case specification

This test case evaluates the SUT API ability of creating, updating, deleting IPv6 port, the reference is,

tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_update_delete_port

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create a port of the network, storing the “id” and “admin_state_up” parameters returned in the response
  • Test action 3: Verify the value of port’s ‘admin_state_up’ is True
  • Test assertion 1: the value of port’s ‘admin_state_up’ is True after creating
  • Test action 4: Update the port’s name with a new_name and set port’s admin_state_up to False, storing the name and admin_state_up parameters returned in the response
  • Test action 5: Verify the stored port’s name equals to new_name and the port’s admin_state_up is False.
  • Test assertion 2: the stored port’s name equals to new_name and the port’s admin_state_up is False
  • Test action 6: Delete the port using the stored “id” parameter
  • Test action 7: List all ports, verify the port is no longer present in the list
  • Test assertion 3: The port “id” parameter is not present in list
  • Test action 8: Delete the network created in test action 1, using the stored network id
  • Test action 9: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list

This test evaluates the ability to use create/update/delete commands to create/update/delete port of the SUT API. Specifically it verifies that:

  • Port create commands return ‘admin_state_up’ as True in the response
  • Port update commands update ‘name’ to new_name and ‘admin_state_up’ to False
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 9 - List IPv6 Ports
Short name

dovetail.ipv6.tc009.port_list

Use case specification

This test case evaluates the SUT ability of creating a port on a network and finding the port in the all ports list, the reference is,

tempest.api.network.test_ports.PortsIpV6TestJSON.test_list_ports

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 3: List all ports, verify the port id is found in the list
  • Test assertion 1: The “id” parameter is found in the port list
  • Test action 4: Delete the port using the stored “id” parameter
  • Test action 5: List all ports, verify the port is no longer present in the list
  • Test assertion 2: The port “id” parameter is not present in list
  • Test action 6: Delete the network created in test action 1, using the stored network id
  • Test action 7: List all networks, verifying the network id is no longer present
  • Test assertion 3: The “id” parameter is not present in the network list

This test evaluates the ability to use list commands to list the networks and ports on the SUT API. Specifically it verifies that:

  • Port list command to list all ports, the created port is found in the list.
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 10 - Show Key/Value Details of an IPv6 Port
Short name

dovetail.ipv6.tc010.port_show_details

Use case specification

This test case evaluates the SUT ability of showing the port details, the values in the details should be equal to the values used to create the port, the reference is,

tempest.api.network.test_ports.PortsIpV6TestJSON.test_show_port

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 3: Show the details of the port, verify the stored port’s id in test action 2 exists in the details
  • Test assertion 1: The “id” parameter is found in the port shown details
  • Test action 4: Verify the values in the details of the port are the same as the values to create the port
  • Test assertion 2: The values in the details of the port are the same as the values to create the port
  • Test action 5: Delete the port using the stored “id” parameter
  • Test action 6: List all ports, verify the port is no longer present in the list
  • Test assertion 3: The port “id” parameter is not present in list
  • Test action 7: Delete the network created in test action 1, using the stored network id
  • Test action 8: List all networks, verifying the network id is no longer present
  • Test assertion 4: The “id” parameter is not present in the network list

This test evaluates the ability to use show commands to show port details on the SUT API. Specifically it verifies that:

  • Port show commands to show the details of the port, whose id is in the details
  • Port show commands to show the details of the port, whose values are the same as the values to create the port
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 11 - Add Multiple Interfaces for an IPv6 Router
Short name

dovetail.ipv6.tc011.router_add_multiple_interface

Use case specification

This test case evaluates the SUT ability of adding multiple interfaces to a router, the reference is,

tempest.api.network.test_routers.RoutersIpV6Test.test_add_multiple_router_interfaces

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create 2 networks named network01 and network02 sequentially, storing the “id” parameters returned in the response
  • Test action 2: Create an IPv6 subnet01 in network01, an IPv6 subnet02 in network02 sequentially, storing the “id” parameters returned in the response
  • Test action 3: Create a router, storing the “id” parameter returned in the response
  • Test action 4: Create interface01 with subnet01 and the router
  • Test action 5: Verify the router_id stored in test action 3 equals to the interface01’s ‘device_id’ and subnet01_id stored in test action 2 equals to the interface01’s ‘subnet_id’
  • Test assertion 1: the router_id equals to the interface01’s ‘device_id’ and subnet01_id equals to the interface01’s ‘subnet_id’
  • Test action 6: Create interface02 with subnet02 and the router
  • Test action 7: Verify the router_id stored in test action 3 equals to the interface02’s ‘device_id’ and subnet02_id stored in test action 2 equals to the interface02’s ‘subnet_id’
  • Test assertion 2: the router_id equals to the interface02’s ‘device_id’ and subnet02_id equals to the interface02’s ‘subnet_id’
  • Test action 8: Delete the interfaces, router, IPv6 subnets and networks, then list all interfaces, ports, IPv6 subnets and networks; the test passes if the deleted ones are not found in the lists.
  • Test assertion 3: The interfaces, router, IPv6 subnets and networks ids are not present in the lists after deleting

This test evaluates the ability to add multiple interfaces with different IPv6 subnets to the same router on the SUT API. Specifically it verifies that:

  • Interface create commands to create interface with IPv6 subnet and router, interface ‘device_id’ and ‘subnet_id’ should equal to the router id and IPv6 subnet id, respectively.
  • Interface create commands to create multiple interfaces with the same router and multiple IPv6 subnets.
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 12 - Add and Remove an IPv6 Router Interface with port_id
Short name

dovetail.ipv6.tc012.router_interface_add_remove_with_port

Use case specification

This test case evaluates the SUT ability of adding and removing a router interface to/from a port. The subnet_id and port_id of the interface will be checked, and the port’s device_id will be checked to see whether it equals the router_id. The reference is,

tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_port_id

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create an IPv6 subnet of the network, storing the “id” parameter returned in the response
  • Test action 3: Create a router, storing the “id” parameter returned in the response
  • Test action 4: Create a port of the network, storing the “id” parameter returned in the response
  • Test action 5: Add router interface to the port created, storing the “id” parameter returned in the response
  • Test action 6: Verify the interface’s keys include ‘subnet_id’ and ‘port_id’
  • Test assertion 1: the interface’s keys include ‘subnet_id’ and ‘port_id’
  • Test action 7: Show the port details, verify the ‘device_id’ in port details equals to the router id stored in test action 3
  • Test assertion 2: ‘device_id’ in port details equals to the router id
  • Test action 8: Delete the interface, port, router, subnet and network, then list all interfaces, ports, routers, subnets and networks, the test passes if the deleted ones are not found in the list.
  • Test assertion 3: interfaces, ports, routers, subnets and networks are not found in the lists after deleting

This test evaluates the ability to use add/remove commands to add/remove router interface to the port, show commands to show port details on the SUT API. Specifically it verifies that:

  • Router_interface add commands to add router interface to a port, the interface’s keys should include ‘subnet_id’ and ‘port_id’
  • Port show commands to show ‘device_id’ in port details, which should be equal to the router id
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 13 - Add and Remove an IPv6 Router Interface with subnet_id
Short name

dovetail.ipv6.tc013.router_interface_add_remove

Use case specification

This test case evaluates the SUT API ability of adding and removing a router interface with the IPv6 subnet id, the reference is

tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_subnet_id

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a network, storing the “id” parameter returned in the response
  • Test action 2: Create an IPv6 subnet with the network created, storing the “id” parameter returned in the response
  • Test action 3: Create a router, storing the “id” parameter returned in the response
  • Test action 4: Add a router interface with the stored ids of the router and IPv6 subnet
  • Test assertion 1: Key ‘subnet_id’ is included in the added interface’s keys
  • Test assertion 2: Key ‘port_id’ is included in the added interface’s keys
  • Test action 5: Show the port info with the stored interface’s port id
  • Test assertion 3: The stored router id is equal to the device id shown in the port info
  • Test action 6: Delete the router interface created in test action 4, using the stored subnet id
  • Test action 7: List all router interfaces, verifying the router interface is no longer present
  • Test assertion 4: The router interface with the stored subnet id is not present in the router interface list
  • Test action 8: Delete the router created in test action 3, using the stored router id
  • Test action 9: List all routers, verifying the router id is no longer present
  • Test assertion 5: The router “id” parameter is not present in the router list
  • Test action 10: Delete the subnet created in test action 2, using the stored subnet id
  • Test action 11: List all subnets, verifying the subnet id is no longer present
  • Test assertion 6: The subnet “id” parameter is not present in the subnet list
  • Test action 12: Delete the network created in test action 1, using the stored network id
  • Test action 13: List all networks, verifying the network id is no longer present
  • Test assertion 7: The network “id” parameter is not present in the network list

This test evaluates the ability to add and remove router interface with the subnet id on the SUT API. Specifically it verifies that:

  • Router interface add command returns valid ‘subnet_id’ parameter which is reported in the interface’s keys
  • Router interface add command returns valid ‘port_id’ parameter which is reported in the interface’s keys
  • All items created using create commands are able to be removed using the returned identifiers
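
A minimal sketch of the add/remove interface flow using a subnet id, assuming the openstacksdk Python client; names and the CIDR are placeholders:

    import openstack

    conn = openstack.connect(cloud="mycloud")

    net = conn.network.create_network(name="rtr-net")
    subnet = conn.network.create_subnet(network_id=net.id, ip_version=6,
                                        cidr="2001:db8:3::/64")
    router = conn.network.create_router(name="rtr")

    # Attach the subnet to the router via its subnet id.
    conn.network.add_interface_to_router(router, subnet_id=subnet.id)

    # The interface is realised as a port owned by the router on that subnet.
    ports = list(conn.network.ports(device_id=router.id))
    assert any(fixed["subnet_id"] == subnet.id
               for p in ports for fixed in p.fixed_ips)

    conn.network.remove_interface_from_router(router, subnet_id=subnet.id)
    conn.network.delete_router(router)
    conn.network.delete_subnet(subnet)
    conn.network.delete_network(net)
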
Post conditions

None

Test Case 14 - Create, Show, List, Update and Delete an IPv6 router
Short name

dovetail.ipv6.tc014.router_create_show_list_update_delete

Use case specification

This test case evaluates the SUT API ability of creating, showing, listing, updating and deleting routers, the reference is

tempest.api.network.test_routers.RoutersIpV6Test.test_create_show_list_update_delete_router

Test preconditions

There should exist an OpenStack external network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a router, setting admin_state_up to False and external_network_id to the public network id, storing the “id” parameter returned in the response
  • Test assertion 1: The created router’s admin_state_up is False
  • Test assertion 2: The created router’s external network id equals to the public network id
  • Test action 2: Show details of the router created in test action 1, using the stored router id
  • Test assertion 3: The router’s name shown is the same as the router created
  • Test assertion 4: The router’s external network id shown is the same as the public network id
  • Test action 3: List all routers and verify if created router is in response message
  • Test assertion 5: The stored router id is in the router list
  • Test action 4: Update the name of router and verify if it is updated
  • Test assertion 6: The name of router equals to the name used to update in test action 4
  • Test action 5: Show the details of router, using the stored router id
  • Test assertion 7: The router’s name shown equals to the name used to update in test action 4
  • Test action 6: Delete the router created in test action 1, using the stored router id
  • Test action 7: List all routers, verifying the router id is no longer present
  • Test assertion 8: The “id” parameter is not present in the router list

This test evaluates the ability to create, show, list, update and delete router on the SUT API. Specifically it verifies that:

  • Router create command returns valid “admin_state_up” and “id” parameters which equal to the “admin_state_up” and “id” returned in the response
  • Router show command returns valid “name” parameter which equals to the “name” returned in the response
  • Router show command returns valid “external network id” parameters which equals to the public network id
  • Router list command returns valid “id” parameter which equals to the stored router “id”
  • Router update command returns updated “name” parameters which equals to the “name” used to update
  • Router created using create command is able to be removed using the returned identifiers
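
A minimal sketch of the router create/update/delete flow, assuming the openstacksdk Python client and a placeholder public network id:

    import openstack

    conn = openstack.connect(cloud="mycloud")
    PUBLIC_NET_ID = "PUBLIC_NETWORK_ID"                 # placeholder

    router = conn.network.create_router(
        name="ipv6-router",
        is_admin_state_up=False,
        external_gateway_info={"network_id": PUBLIC_NET_ID},
    )
    assert router.is_admin_state_up is False                             # cf. assertion 1
    assert router.external_gateway_info["network_id"] == PUBLIC_NET_ID   # cf. assertion 2

    router = conn.network.update_router(router, name="ipv6-router-renamed")
    assert conn.network.get_router(router.id).name == "ipv6-router-renamed"

    conn.network.delete_router(router, ignore_missing=False)
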
Post conditions

None

Test Case 15 - Create, List, Update, Show and Delete an IPv6 security group
Short name

dovetail.ipv6.tc015.security_group_create_list_update_show_delete

Use case specification

This test case evaluates the SUT API ability of creating, listing, updating, showing and deleting security groups, the reference is

tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_list_update_show_delete_security_group

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group, storing the “id” parameter returned in the response
  • Test action 2: List all security groups and verify if created security group is there in response
  • Test assertion 1: The created security group’s “id” is found in the list
  • Test action 3: Update the name and description of this security group, using the stored id
  • Test action 4: Verify if the security group’s name and description are updated
  • Test assertion 2: The security group’s name equals to the name used in test action 3
  • Test assertion 3: The security group’s description equals to the description used in test action 3
  • Test action 5: Show details of the updated security group, using the stored id
  • Test assertion 4: The security group’s name shown equals to the name used in test action 3
  • Test assertion 5: The security group’s description shown equals to the description used in test action 3
  • Test action 6: Delete the security group created in test action 1, using the stored id
  • Test action 7: List all security groups, verifying the security group’s id is no longer present
  • Test assertion 6: The “id” parameter is not present in the security group list

This test evaluates the ability to create, list, update, show and delete security group on the SUT API. Specifically it verifies that:

  • Security group create commands return valid “id” parameter which is reported in the list commands
  • Security group update commands return valid “name” and “description” parameters which are reported in the show commands
  • Security group created using create command is able to be removed using the returned identifiers
Post conditions

None

Test Case 16 - Create, Show and Delete IPv6 security group rule
Short name

dovetail.ipv6.tc016.security_group_rule_create_show_delete

Use case specification

This test case evaluates the SUT API ability of creating, showing, listing and deleting security group rules, the reference is

tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_show_delete_security_group_rule

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group, storing the “id” parameter returned in the response
  • Test action 2: Create a rule of the security group with protocol tcp, udp and icmp, respectively, using the stored security group’s id, storing the “id” parameter returned in the response
  • Test action 3: Show details of the created security group rule, using the stored id of the security group rule
  • Test assertion 1: All the created security group rule’s values equal to the rule values shown in test action 3
  • Test action 4: List all security group rules
  • Test assertion 2: The stored security group rule’s id is found in the list
  • Test action 5: Delete the security group rule, using the stored security group rule’s id
  • Test action 6: List all security group rules, verifying the security group rule’s id is no longer present
  • Test assertion 3: The security group rule “id” parameter is not present in the list
  • Test action 7: Delete the security group, using the stored security group’s id
  • Test action 8: List all security groups, verifying the security group’s id is no longer present
  • Test assertion 4: The security group “id” parameter is not present in the list

This test evaluates the ability to create, show, list and delete security group rules on the SUT API. Specifically it verifies that:

  • Security group rule create command returns valid values which are reported in the show command
  • Security group rule created using create command is able to be removed using the returned identifiers
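
A minimal sketch of the rule create/show/delete flow, assuming the openstacksdk Python client (cloud layer); names are placeholders:

    import openstack

    conn = openstack.connect(cloud="mycloud")

    sg = conn.create_security_group("ipv6-sg", "IPv6 dovetail example group")

    # One IPv6 ingress rule per protocol, as in the test flow above.
    rules = [conn.create_security_group_rule(sg.id, protocol=proto,
                                             direction="ingress",
                                             ethertype="IPv6")
             for proto in ("tcp", "udp", "icmp")]

    # Reading the group back shows the rules that were just created.
    shown = conn.get_security_group(sg.id)
    rule_ids = [r["id"] for r in shown["security_group_rules"]]
    assert all(rule["id"] in rule_ids for rule in rules)

    for rule in rules:
        conn.delete_security_group_rule(rule["id"])
    conn.delete_security_group(sg.id)
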
Post conditions

None

Test Case 17 - List IPv6 Security Groups
Short name

dovetail.ipv6.tc017.security_group_list

Use case specification

This test case evaluates the SUT API ability of listing security groups, the reference is

tempest.api.network.test_security_groups.SecGroupIPv6Test.test_list_security_groups

Test preconditions

There should exist a default security group.

Basic test flow execution description and pass/fail criteria
  • Test action 1: List all security groups
  • Test action 2: Verify the default security group exists in the list, the test passes if the default security group exists
  • Test assertion 1: The default security group is in the list

This test evaluates the ability to list security groups on the SUT API. Specifically it verifies that:

  • Security group list command returns valid security groups which include the default security group
Post conditions

None

Test Case 18 - IPv6 Address Assignment - Dual Stack, SLAAC, DHCPv6 Stateless
Short name

dovetail.ipv6.tc018.dhcpv6_stateless

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, the guest instance obtains an IPv6 address from the OpenStack managed radvd using SLAAC and optional info from dnsmasq using DHCPv6 stateless. This test case then verifies the ping6 available VM can ping the other VM’s v4 and v6 addresses as well as the v6 subnet’s gateway ip in the same network, the reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create one IPv6 subnet of the network created in test action 1 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameter returned in the response
  • Test action 6: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete the IPv6 subnet created in test action 5, using the stored id
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verifies that the ping6 available VM can ping the other VM’s v4 and v6 addresses as well as the v6 subnet’s gateway ip in the same network. Specifically it verifies that:

  • The IPv6 addresses in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
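
For illustration, a minimal sketch of test actions 1, 2 and 5 using the OpenStack SDK for Python is shown below. The names, address ranges and cloud profile are assumptions; note that the Neutron API spells the mode values with a hyphen (‘dhcpv6-stateless’) rather than the underscore used in this document.

    # Minimal sketch: create a network with one IPv4 subnet and one IPv6 subnet
    # whose RA mode and address mode are both DHCPv6 stateless.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    net = conn.network.create_network(name="ipv6-ds-net")
    v4 = conn.network.create_subnet(
        network_id=net.id, ip_version=4, cidr="10.100.0.0/24")
    v6 = conn.network.create_subnet(
        network_id=net.id, ip_version=6, cidr="2001:db8:100::/64",
        ipv6_ra_mode="dhcpv6-stateless", ipv6_address_mode="dhcpv6-stateless")
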
Post conditions

None

Test Case 19 - IPv6 Address Assignment - Dual Net, Dual Stack, SLAAC, DHCPv6 Stateless
Short name

dovetail.ipv6.tc019.dualnet_dhcpv6_stateless

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, the guest instance obtains its IPv6 address from the OpenStack-managed radvd using SLAAC and optional info from dnsmasq using DHCPv6 stateless. This test case then verifies that the ping6 available VM can ping the other VM’s v4 address in one network and v6 address in another network as well as the v6 subnet’s gateway ip. The reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create one IPv6 subnet of network created in test action 5 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameter returned in the response
  • Test action 7: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The 1st vNIC of each VM gets one v4 address assigned and the 2nd vNIC of each VM gets one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete the IPv6 subnet created in test action 6, using the stored id
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verifies that the ping6 available VM can ping the other VM’s v4 address in one network and v6 address in another network as well as the v6 subnet’s gateway ip. Specifically it verifies that:

  • The IPv6 addresses in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 address in one network and IPv6 address in another network as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 20 - IPv6 Address Assignment - Multiple Prefixes, Dual Stack, SLAAC, DHCPv6 Stateless
Short name

dovetail.ipv6.tc020.multiple_prefixes_dhcpv6_stateless

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, the guest instance obtains its IPv6 addresses from the OpenStack-managed radvd using SLAAC and optional info from dnsmasq using DHCPv6 stateless. This test case then verifies that the ping6 available VM can ping the other VM’s one v4 address and two v6 addresses with different prefixes as well as the v6 subnets’ gateway ips in the same network. The reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_dhcpv6_stateless

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create two IPv6 subnets of the network created in test action 1 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameters returned in the response
  • Test action 6: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete two IPv6 subnets created in test action 5, using the stored ids
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verifies that the ping6 available VM can ping the other VM’s v4 address and two v6 addresses with different prefixes as well as the v6 subnets’ gateway ips in the same network. Specifically it verifies that:

  • The IPv6 addresses with different prefixes in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 21 - IPv6 Address Assignment - Dual Net, Multiple Prefixes, Dual Stack, SLAAC, DHCPv6 Stateless
Short name

dovetail.ipv6.tc021.dualnet_multiple_prefixes_dhcpv6_stateless

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’. In this case, the guest instance obtains its IPv6 addresses from the OpenStack-managed radvd using SLAAC and optional info from dnsmasq using DHCPv6 stateless. This test case then verifies that the ping6 available VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network as well as the v6 subnets’ gateway ips. The reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_dhcpv6_stateless

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create two IPv6 subnets of network created in test action 5 in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, storing the “id” parameters returned in the response
  • Test action 7: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete two IPv6 subnets created in test action 6, using the stored ids
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘dhcpv6_stateless’ and ipv6_address_mode ‘dhcpv6_stateless’, and verifies that the ping6 available VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network as well as the v6 subnets’ gateway ips. Specifically it verifies that:

  • The IPv6 addresses in mode ‘dhcpv6_stateless’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 22 - IPv6 Address Assignment - Dual Stack, SLAAC
Short name

dovetail.ipv6.tc022.slaac

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains its IPv6 address from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6 available VM can ping the other VM’s v4 and v6 addresses as well as the v6 subnet’s gateway ip in the same network. The reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_slaac_from_os

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create one IPv6 subnet of the network created in test action 1 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameter returned in the response
  • Test action 6: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete the IPv6 subnet created in test action 5, using the stored id
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6 available VM can ping the other VM’s v4 and v6 addresses as well as the v6 subnet’s gateway ip in the same network. Specifically it verifies that:

  • The IPv6 addresses in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 23 - IPv6 Address Assignment - Dual Net, Dual Stack, SLAAC
Short name

dovetail.ipv6.tc023.dualnet_slaac

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains its IPv6 address from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6 available VM can ping the other VM’s v4 address in one network and v6 address in another network as well as the v6 subnet’s gateway ip. The reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_slaac_from_os

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create one IPv6 subnet of network created in test action 5 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameter returned in the response
  • Test action 7: Connect the IPv6 subnet to the router, using the stored IPv6 subnet id
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The 1st vNIC of each VM gets one v4 address assigned and the 2nd vNIC of each VM gets one v6 address actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 address as well as the v6 subnet’s gateway ip
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete the IPv6 subnet created in test action 6, using the stored id
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6 available VM can ping the other VM’s v4 address in one network and v6 address in another network as well as the v6 subnet’s gateway ip. Specifically it verifies that:

  • The IPv6 addresses in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 address in one network and IPv6 address in another network as well as the v6 subnet’s gateway ip
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 24 - IPv6 Address Assignment - Multiple Prefixes, Dual Stack, SLAAC
Short name

dovetail.ipv6.tc024.multiple_prefixes_slaac

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains its IPv6 addresses from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6 available VM can ping the other VM’s one v4 address and two v6 addresses with different prefixes as well as the v6 subnets’ gateway ips in the same network. The reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create two IPv6 subnets of the network created in test action 1 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameters returned in the response
  • Test action 6: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 7: Boot two VMs on this network, storing the “id” parameters returned in the response
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 8: Delete the 2 VMs created in test action 7, using the stored ids
  • Test action 9: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 10: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 11: Delete two IPv6 subnets created in test action 5, using the stored ids
  • Test action 12: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 13: Delete the network created in test action 1, using the stored id
  • Test action 14: List all networks, verifying the id is no longer present
  • Test assertion 6: The “id” parameter is not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6 available VM can ping the other VM’s v4 address and two v6 addresses with different prefixes as well as the v6 subnets’ gateway ips in the same network. Specifically it verifies that:

  • The IPv6 addresses with different prefixes in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Test Case 25 - IPv6 Address Assignment - Dual Net, Dual Stack, Multiple Prefixes, SLAAC
Short name

dovetail.ipv6.tc025.dualnet_multiple_prefixes_slaac

Use case specification

This test case evaluates IPv6 address assignment in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’. In this case, the guest instance obtains its IPv6 addresses from the OpenStack-managed radvd using SLAAC. This test case then verifies that the ping6 available VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network as well as the v6 subnets’ gateway ips. The reference is

tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac

Test preconditions

There should exist a public router or a public network.

Basic test flow execution description and pass/fail criteria
  • Test action 1: Create one network, storing the “id” parameter returned in the response
  • Test action 2: Create one IPv4 subnet of the created network, storing the “id” parameter returned in the response
  • Test action 3: If there exists a public router, use it as the router. Otherwise, use the public network to create a router
  • Test action 4: Connect the IPv4 subnet to the router, using the stored IPv4 subnet id
  • Test action 5: Create another network, storing the “id” parameter returned in the response
  • Test action 6: Create two IPv6 subnets of network created in test action 5 in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, storing the “id” parameters returned in the response
  • Test action 7: Connect the two IPv6 subnets to the router, using the stored IPv6 subnet ids
  • Test action 8: Boot two VMs on these two networks, storing the “id” parameters returned in the response
  • Test action 9: Turn on 2nd NIC of each VM for the network created in test action 5
  • Test assertion 1: The vNIC of each VM gets one v4 address and two v6 addresses with different prefixes actually assigned
  • Test assertion 2: Each VM can ping the other’s v4 private address
  • Test assertion 3: The ping6 available VM can ping the other’s v6 addresses as well as the v6 subnets’ gateway ips
  • Test action 10: Delete the 2 VMs created in test action 8, using the stored ids
  • Test action 11: List all VMs, verifying the ids are no longer present
  • Test assertion 4: The two “id” parameters are not present in the VM list
  • Test action 12: Delete the IPv4 subnet created in test action 2, using the stored id
  • Test action 13: Delete two IPv6 subnets created in test action 6, using the stored ids
  • Test action 14: List all subnets, verifying the ids are no longer present
  • Test assertion 5: The “id” parameters of IPv4 and IPv6 are not present in the list
  • Test action 15: Delete the 2 networks created in test action 1 and 5, using the stored ids
  • Test action 16: List all networks, verifying the ids are no longer present
  • Test assertion 6: The two “id” parameters are not present in the network list

This test evaluates the ability to assign IPv6 addresses in ipv6_ra_mode ‘slaac’ and ipv6_address_mode ‘slaac’, and verifies that the ping6 available VM can ping the other VM’s v4 address in one network and two v6 addresses with different prefixes in another network as well as the v6 subnets’ gateway ips. Specifically it verifies that:

  • The IPv6 addresses in mode ‘slaac’ are assigned successfully
  • The VM can ping the other VM’s IPv4 and IPv6 private addresses as well as the v6 subnets’ gateway ips
  • All items created using create commands are able to be removed using the returned identifiers
Post conditions

None

Forwarding Packets in Data Path test specification
Scope

This test area evaluates the ability of the system under test to support basic packet forwarding. The test in this test area will evaluate basic packet forwarding through virtual IPv4 networks in the data path, including creating a server and verifying network connectivity to the created server with a ping operation using MTU sized packets.

References

N/A

Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • ICMP - Internet Control Message Protocol
  • MTU - Maximum Transmission Unit
  • NFVi - Network Functions Virtualization infrastructure
  • SSH - Secure Shell
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on the basic operations of forwarding packets in the data path through virtual networks. Specifically, the test performs clean-up operations which return the system to the same state as before the test.

This test case is included in the test case dovetail.tempest.tc001 of the OVP test suite.

Test Descriptions
API Used and Reference

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • delete router
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • delete server
  • add/assign floating ip

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • create port
  • delete port

Floating IPs: https://developer.openstack.org/api-ref/networking/v2/index.html#floating-ips-floatingips

  • create floating ip
  • delete floating ip
MTU Sized Frames Fit Through
Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_mtu_sized_frames

Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • Neutron net-mtu extension API
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, which has rules for allowing incoming and outgoing SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 6: Set MTU size to be the default MTU size of the SUT’s network
  • Test action 7: Host sends MTU sized ICMP packets to VM1 using ping
  • Test assertion 1: Ping FIP1 using MTU sized packets successfully
  • Test action 8: SSH to VM1 with FIP1
  • Test assertion 2: SSH to VM1 with FIP1 successfully
  • Test action 9: Delete SG1, NET1, SUBNET1, R1, VM1 and FIP1

This test evaluates the network connectivity using MTU sized frames. Specifically, the test verifies that:

  • With the Neutron net-mtu extension configured, MTU sized packets can fit through the network.

In order to pass this test, all test assertions listed in the test execution above need to pass.
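
The MTU sized ping in test actions 6 and 7 relies on a small piece of arithmetic: for IPv4, the ICMP echo payload that exactly fills an MTU sized frame is the MTU minus the 20-byte IP header and the 8-byte ICMP header. A minimal sketch is shown below; the MTU value 1500 is only an example, since the test reads the real value via the Neutron net-mtu extension.

    # Minimal sketch: compute the ICMP payload size for an MTU sized IPv4 frame
    # and print the corresponding ping command (-M do sets don't-fragment).
    mtu = 1500                 # example value; the SUT's network MTU is used in the test
    payload = mtu - 20 - 8     # 1472 bytes for a 1500-byte MTU
    print(f"ping -c 4 -M do -s {payload} FIP1")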

Post conditions

N/A

Security Group and Port Security test specification
Scope

The security group and port security test area evaluates the ability of the system under test to support packet filtering by security group and port security. The tests in this test area will evaluate preventing MAC spoofing by port security and basic security group operations, including testing cross-tenant and in-tenant traffic, testing multiple security groups, using port security to disable security groups, and updating security groups.

References

N/A

Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • ICMP - Internet Control Message Protocol
  • MAC - Media Access Control
  • NFVi - Network Functions Virtualization infrastructure
  • SSH - Secure Shell
  • TCP - Transmission Control Protocol
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on the basic operations of security group and port security. Each test case is able to run independently, i.e. irrespective of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All these test cases are included in the test case dovetail.tempest.tc002 of the OVP test suite.

Test Descriptions
API Used and Reference

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network
  • list networks
  • create floating ip
  • delete floating ip

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • delete router
  • list routers
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • list subnets
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • delete server
  • add/assign floating ip

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • update port
  • list ports
  • show port details
Test Case 1 - Port Security and MAC Spoofing
Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_port_security_macspoofing_port

Test preconditions
  • Neutron port-security extension API
  • Neutron security-group extension API
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 6: Verify can ping FIP1 successfully and can SSH to VM1 with FIP1
  • Test action 7: Create a second neutron network NET2 and subnet SUBNET2, and attach VM1 to NET2
  • Test action 8: Get VM1’s ethernet interface NIC2 for NET2
  • Test action 9: Create second server VM2 on NET2
  • Test action 10: Verify VM1 is able to communicate with VM2 via NIC2
  • Test action 11: Log in to VM1 and spoof the MAC address of NIC2 to “00:00:00:00:00:01”
  • Test action 12: Verify VM1 fails to communicate with VM2 via NIC2
  • Test assertion 1: The ping operation fails
  • Test action 13: Update ‘security_groups’ to be none for VM1’s NIC2 port
  • Test action 14: Update ‘port_security_enable’ to be False for VM1’s NIC2 port
  • Test action 15: Verify now VM1 is able to communicate with VM2 via NIC2
  • Test assertion 2: The ping operation is successful
  • Test action 16: Delete SG1, NET1, NET2, SUBNET1, SUBNET2, R1, VM1, VM2 and FIP1

This test evaluates the ability to prevent MAC spoofing by using port security. Specifically, the test verifies that:

  • With port security enabled, ICMP packets from a server with a spoofed MAC address cannot pass the port.
  • With port security disabled, ICMP packets from a server with a spoofed MAC address can pass the port.

In order to pass this test, all test assertions listed in the test execution above need to pass.
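
For illustration, test actions 13 and 14 correspond to a single Neutron port update that clears the security groups and disables port security. The sketch below uses the OpenStack SDK for Python; the resource names, the port lookup and the keyword names (which mirror the Neutron port fields) are assumptions.

    # Minimal sketch: find VM1's port on NET2 and turn off its filtering.
    import openstack

    conn = openstack.connect(cloud="mycloud")          # cloud profile is an assumption
    vm1 = conn.compute.find_server("VM1")
    net2 = conn.network.find_network("NET2")
    port = next(conn.network.ports(device_id=vm1.id, network_id=net2.id))
    conn.network.update_port(port, security_groups=[], port_security_enabled=False)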

Post conditions

N/A

Test Case 2 - Test Security Group Cross Tenant Traffic
Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_cross_tenant_traffic

Test preconditions
  • Neutron security-group extension API
  • Two tenants
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a neutron network NET1 for primary tenant
  • Test action 2: Create a primary tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2 for primary tenant
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Repeat test action 1 to 6 and create NET2, R2, SUBNET2, SG3, SG4, FIP2 and VM2 for an alt_tenant
  • Test action 8: Verify VM1 fails to communicate with VM2 through FIP2
  • Test assertion 1: The ping operation fails
  • Test action 9: Add ICMP rule to SG4
  • Test action 10: Verify VM1 is able to communicate with VM2 through FIP2
  • Test assertion 2: The ping operation is successful
  • Test action 11: Verify VM2 fails to communicate with VM1 through FIP1
  • Test assertion 3: The ping operation fails
  • Test action 12: Add ICMP rule to SG2
  • Test action 13: Verify VM2 is able to communicate with VM1 through FIP1
  • Test assertion 4: The ping operation is successful
  • Test action 14: Delete SG1, SG2, SG3, SG4, NET1, NET2, SUBNET1, SUBNET2, R1, R2, VM1, VM2, FIP1 and FIP2

This test evaluates the ability of the security group to filter packets cross tenant. Specifically, the test verifies that:

  • Without an ICMP security group rule, ICMP packets cannot be received by a server in a tenant different from the source server’s tenant.
  • With the ingress ICMP security group rule enabled only in tenant1, the server in tenant2 can ping the server in tenant1, but not in the reverse direction.
  • With the ingress ICMP security group rule also enabled in tenant2, ping works in both directions.

In order to pass this test, all test assertions listed in the test execution above need to pass.
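
For illustration, the rule additions in test actions 9 and 12 each map to a single security group rule create call. The sketch below uses the OpenStack SDK for Python; the group name SG4 and the cloud profile are assumptions.

    # Minimal sketch: add an ingress ICMP rule so the other tenant's pings are accepted.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    sg4 = conn.network.find_security_group("SG4")
    conn.network.create_security_group_rule(
        security_group_id=sg4.id, direction="ingress",
        ethertype="IPv4", protocol="icmp")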

Post conditions

N/A

Test Case 3 - Test Security Group in Tenant Traffic
Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_in_tenant_traffic

Test preconditions
  • Neutron security-group extension API
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Create second server VM2 with default security group and NET1
  • Test action 8: Verify VM1 fails to communicate with VM2 through VM2’s fixed ip
  • Test assertion 1: The ping operation fails
  • Test action 9: Add ICMP security group rule to default security group
  • Test action 10: Verify VM1 is able to communicate with VM2 through VM2’s fixed ip
  • Test assertion 2: The ping operation is successful
  • Test action 11: Delete SG1, SG2, NET1, SUBNET1, R1, VM1, VM2 and FIP1

This test evaluates the ability of the security group to filter packets in one tenant. Specifically, the test verifies that:

  • Without an ICMP security group rule, ICMP packets cannot be received by a server in the same tenant.
  • With an ICMP security group rule, ICMP packets can be received by a server in the same tenant.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 4 - Test Multiple Security Groups
Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_multiple_security_groups

Test preconditions
  • Neutron security-group extension API
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Verify failed to ping FIP1
  • Test assertion 1: The ping operation fails
  • Test action 8: Add ICMP security group rule to SG2
  • Test action 9: Verify can ping FIP1 successfully
  • Test assertion 2: The ping operation is successful
  • Test action 10: Verify can SSH to VM1 with FIP1
  • Test assertion 3: Can SSH to VM1 successfully
  • Test action 11: Delete SG1, SG2, NET1, SUBNET1, R1, VM1 and FIP1

This test evaluates the ability of multiple security groups to filter packets. Specifically, the test verifies that:

  • A server with 2 security groups, one with a TCP rule and no ICMP rule, cannot receive ICMP packets sent from the Tempest host machine.
  • A server with 2 security groups, one with a TCP rule and the other with an ICMP rule, can receive ICMP packets sent from the Tempest host machine and can be connected to via the SSH client.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 5 - Test Port Security Disable Security Group
Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_port_security_disable_security_group

Test preconditions
  • Neutron security-group extension API
  • Neutron port-security extension API
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Create second server VM2 with default security group and NET1
  • Test action 8: Update ‘security_groups’ to be none and ‘port_security_enabled’ to be True for VM2’s port
  • Test action 9: Verify VM1 fails to communicate with VM2 through VM2’s fixed ip
  • Test assertion 1: The ping operation fails
  • Test action 10: Update ‘security_groups’ to be none and ‘port_security_enabled’ to be False for VM2’s port
  • Test action 11: Verify VM1 is able to communicate with VM2 through VM2’s fixed ip
  • Test assertion 2: The ping operation is successful
  • Test action 12: Delete SG1, SG2, NET1, SUBNET1, R1, VM1, VM2 and FIP1

This test evaluates the ability of port security to disable security group. Specifically, the test verifies that:

  • The ICMP packets cannot pass the port whose ‘port_security_enabled’ is True and security_groups is none.
  • The ICMP packets can pass the port whose ‘port_security_enabled’ is False and security_groups is none.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 6 - Test Update Port Security Group
Test case specification

tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_port_update_new_security_group

Test preconditions
  • Neutron security-group extension API
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a neutron network NET1
  • Test action 2: Create a tenant router R1 which routes traffic to public network
  • Test action 3: Create a subnet SUBNET1 and add it as router interface
  • Test action 4: Create 2 empty security groups SG1 and SG2
  • Test action 5: Add a tcp rule to SG1
  • Test action 6: Create a server VM1 with SG1, SG2 and NET1, and assign a floating ip FIP1 (via R1) to VM1
  • Test action 7: Create third empty security group SG3
  • Test action 8: Add ICMP rule to SG3
  • Test action 9: Create second server VM2 with default security group and NET1
  • Test action 10: Verify VM1 fails to communicate with VM2 through VM2’s fixed ip
  • Test assertion 1: The ping operation fails
  • Test action 11: Update ‘security_groups’ to be SG3 for VM2’s port
  • Test action 12: Verify VM1 is able to communicate with VM2 through VM2’s fixed ip
  • Test assertion 2: The ping operation is successful
  • Test action 13: Delete SG1, SG2, SG3, NET1, SUBNET1, R1, VM1, VM2 and FIP1

This test evaluates the ability to update port with a new security group. Specifically, the test verifies that:

  • Without an ICMP security group rule, the VM cannot receive ICMP packets.
  • After the port is updated with a security group that has an ICMP rule, the VM can receive ICMP packets.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Dynamic Network Runtime Operations test specification
Scope

The dynamic network runtime operations test area evaluates the ability of the system under test to support dynamic network runtime operations through the life of a VNF (e.g. attach/detach, enable/disable, read stats). The tests in this test area will evaluate IPv4 network runtime operations functionality. These runtime operations include hot-plugging a network interface, detaching a floating IP from a VM, attaching a floating IP to a VM, updating a subnet’s DNS, updating a VM instance port’s admin state and updating a router’s admin state.

Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • DNS - Domain Name System
  • ICMP - Internet Control Message Protocol
  • MAC - Media Access Control
  • NIC - Network Interface Controller
  • NFVi - Network Functions Virtualization infrastructure
  • SSH - Secure Shell
  • TCP - Transmission Control Protocol
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on dynamic network runtime operations. Each test case is able to run independently, i.e. irrespective of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All these test cases are included in the test case dovetail.tempest.tc003 of the OVP test suite.

Test Descriptions
API Used and Reference

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • update router
  • delete router
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • update subnet
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • delete server
  • add/assign floating IP
  • disassociate floating IP

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • create port
  • update port
  • delete port

Floating IPs: https://developer.openstack.org/api-ref/networking/v2/index.html#floating-ips-floatingips

  • create floating IP
  • delete floating IP
Test Case 1 - Basic network operations
Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops

Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • OpenStack Nova and Neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 via FIP1 successfully
  • Test assertion 2: Ping the internal gateway from VM1 successfully
  • Test assertion 3: Ping the default gateway from VM1 using its floating IP FIP1 successfully
  • Test action 6: Detach FIP1 from VM1
  • Test assertion 4: VM1 becomes unreachable after FIP1 disassociated
  • Test action 7: Create a new server VM2 with NET1, and associate floating IP FIP1 to VM2
  • Test assertion 5: Ping FIP1 and SSH to VM2 via FIP1 successfully
  • Test action 8: Delete SG1, NET1, SUBNET1, R1, VM1, VM2 and FIP1

This test evaluates the functionality of basic network operations. Specifically, the test verifies that:

  • The Tempest host can ping the VM’s IP address. This implies, but does not guarantee (see the ssh check that follows), that the VM has been assigned the correct IP address and has connectivity to the Tempest host.
  • The Tempest host can perform key-based authentication to an ssh server hosted at the VM’s IP address. This check guarantees that the IP address is associated with the target VM.
  • The Tempest host can ssh into the VM via the IP address and successfully ping the internal gateway address, implying connectivity to another VM on the same network.
  • The Tempest host can ssh into the VM via the IP address and successfully ping the default gateway, implying external connectivity.
  • After the floating IP is detached from the VM, the VM becomes unreachable.
  • After the previously attached floating IP is associated with a new VM, the new VM becomes reachable.
  • Floating IP status is updated correctly after each change.

In order to pass this test, all test assertions listed in the test execution above need to pass.
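
For illustration, the floating IP moves in test actions 6 and 7 correspond to two updates of the floating IP’s port association. The sketch below uses the OpenStack SDK for Python; the resource names, the example address and the lookup method are assumptions.

    # Minimal sketch: detach FIP1 from VM1, then attach it to VM2's port.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    fip = next(conn.network.ips(floating_ip_address="203.0.113.10"))  # FIP1, example address
    conn.network.update_ip(fip, port_id=None)            # VM1 becomes unreachable
    vm2 = conn.compute.find_server("VM2")
    vm2_port = next(conn.network.ports(device_id=vm2.id))
    conn.network.update_ip(fip, port_id=vm2_port.id)     # VM2 now reachable via FIP1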

Post conditions

N/A

Test Case 2 - Hotplug network interface
Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_hotplug_nic

Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • Compute interface_attach feature is enabled
  • VM vnic_type is not set to ‘direct’ or ‘macvtap’
  • OpenStack Nova and Neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test action 6: Create a second neutron network NET2 and subnet SUBNET2, and attach VM1 to NET2
  • Test action 7: Get VM1’s ethernet interface NIC2 for NET2
  • Test assertion 2: Ping NET2’s internal gateway successfully
  • Test action 8: Delete SG1, NET1, NET2, SUBNET1, SUBNET2, R1, NIC2, VM1 and FIP1

This test evaluates the functionality of adding network to an active VM. Specifically, the test verifies that:

  • A new network interface can be added to an existing VM successfully.
  • The Tempest host can ssh into the VM via the IP address and successfully ping the new network’s internal gateway address, implying connectivity to the new network.

In order to pass this test, all test assertions listed in the test execution above need to pass.
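
For illustration, the hotplug in test action 6 is a single interface-attach call against the running server. The sketch below uses the OpenStack SDK for Python; the resource names and the cloud profile are assumptions.

    # Minimal sketch: attach a new interface on NET2 to the running server VM1.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    vm1 = conn.compute.find_server("VM1")
    net2 = conn.network.find_network("NET2")
    iface = conn.compute.create_server_interface(vm1, net_id=net2.id)
    print("attached port:", iface.port_id)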

Post conditions

N/A

Test Case 3 - Update subnet’s configuration
Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details

Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • DHCP client is available
  • Tenant networks should be non-shared and isolated
  • OpenStack Nova and Neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface, configure SUBNET1 with dns nameserver ‘1.2.3.4’
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test assertion 2: Retrieve VM1’s configured dns and verify it matches the one configured for SUBNET1
  • Test action 6: Update SUBNET1’s dns to ‘9.8.7.6’
  • Test assertion 3: After triggering the DHCP renew from the VM manually, retrieve VM1’s configured dns and verify it has been successfully updated
  • Test action 7: Delete SG1, NET1, SUBNET1, R1, VM1 and FIP1

This test evaluates the functionality of updating subnet’s configurations. Specifically, the test verifies that:

  • Updating the subnet’s DNS server configuration affects the VMs.

In order to pass this test, all test assertions listed in the test execution above need to pass.
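
For illustration, test action 6 is a single subnet update; the guest only observes the change after its DHCP lease is renewed (test assertion 3). The sketch below uses the OpenStack SDK for Python; the subnet name and cloud profile are assumptions.

    # Minimal sketch: replace the subnet's DNS nameserver.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    subnet1 = conn.network.find_subnet("SUBNET1")
    conn.network.update_subnet(subnet1, dns_nameservers=["9.8.7.6"])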

Post conditions

N/A

Test Case 4 - Update VM port admin state
Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_instance_port_admin_state

Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • Network port_admin_state_change feature is enabled
  • OpenStack Nova and Neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test action 6: Create a server VM2 with SG1 and NET1, and assign a floating IP FIP2 to VM2
  • Test action 7: Get an SSH client SSHCLNT1 to VM2
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test assertion 2: Ping FIP1 via SSHCLNT1 successfully
  • Test action 8: Update admin_state_up attribute of VM1 port to False
  • Test assertion 3: Ping FIP1 and SSH to VM1 with FIP1 failed
  • Test assertion 4: Ping FIP1 via SSHCLNT1 failed
  • Test action 9: Update admin_state_up attribute of VM1 port to True
  • Test assertion 5: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test assertion 6: Ping FIP1 via SSHCLNT1 successfully
  • Test action 10: Delete SG1, NET1, SUBNET1, R1, SSHCLNT1, VM1, VM2 and FIP1, FIP2

This test evaluates the VM’s public and project connectivity status by changing the VM port’s admin_state_up attribute to True and False. Specifically, the test verifies that:

  • Public and project connectivity is reachable before updating the admin_state_up attribute of the VM port to False.
  • Public and project connectivity is unreachable after updating the admin_state_up attribute of the VM port to False.
  • Public and project connectivity is reachable after updating the admin_state_up attribute of the VM port from False to True.

In order to pass this test, all test assertions listed in the test execution above need to pass.
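
For illustration, test actions 8 and 9 toggle a single attribute on VM1’s port. The sketch below uses the OpenStack SDK for Python; the lookup method and the keyword name (mirroring the Neutron port field) are assumptions.

    # Minimal sketch: disable and re-enable the admin state of VM1's port.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    vm1 = conn.compute.find_server("VM1")
    port = next(conn.network.ports(device_id=vm1.id))
    conn.network.update_port(port, admin_state_up=False)   # connectivity drops
    conn.network.update_port(port, admin_state_up=True)    # connectivity restored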

Post conditions

N/A

Test Case 5 - Update router admin state
Test case specification

tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state

Test preconditions
  • Nova has been configured to boot VMs with Neutron-managed networking
  • Multi-tenant network capabilities
  • OpenStack Nova and Neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 2: Create a neutron network NET1
  • Test action 3: Create a tenant router R1 which routes traffic to public network
  • Test action 4: Create a subnet SUBNET1 and add it as router interface
  • Test action 5: Create a server VM1 with SG1 and NET1, and assign a floating IP FIP1 (via R1) to VM1
  • Test assertion 1: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test action 6: Update admin_state_up attribute of R1 to False
  • Test assertion 2: Ping FIP1 and SSH to VM1 with FIP1 failed
  • Test action 7: Update admin_state_up attribute of R1 to True
  • Test assertion 3: Ping FIP1 and SSH to VM1 with FIP1 successfully
  • Test action 8: Delete SG1, NET1, SUBNET1, R1, VM1 and FIP1

This test evaluates the router’s public connectivity status by changing the router’s admin_state_up attribute to True and False. Specifically, the test verifies that:

  • Public connectivity is reachable before updating the admin_state_up attribute of the router to False.
  • Public connectivity is unreachable after updating the admin_state_up attribute of the router to False.
  • Public connectivity is reachable after updating the admin_state_up attribute of the router from False to True.

In order to pass this test, all test assertions listed in the test execution above need to pass.
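
For illustration, test actions 6 and 7 toggle the same attribute on the router. The sketch below uses the OpenStack SDK for Python; the router name and the keyword name (mirroring the Neutron router field) are assumptions.

    # Minimal sketch: disable and re-enable the router's admin state.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    r1 = conn.network.find_router("R1")
    conn.network.update_router(r1, admin_state_up=False)   # FIP1 becomes unreachable
    conn.network.update_router(r1, admin_state_up=True)    # FIP1 reachable again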

Post conditions

N/A

Common virtual machine life cycle events test specification
Scope

The common virtual machine life cycle events test area evaluates the ability of the system under test to behave correctly after common virtual machine life cycle events. The tests in this test area will evaluate:

  • Stop/Start a server
  • Reboot a server
  • Rebuild a server
  • Pause/Unpause a server
  • Suspend/Resume a server
  • Resize a server
  • Resize a volume-backed server
  • Sequence suspend resume
  • Shelve/Unshelve a server
  • Cold migrate a server
  • Live migrate a server
Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on common virtual machine life cycle events. Each test case is able to run independently, i.e. irrespective of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All these test cases are included in the test case dovetail.tempest.tc004 of the OVP test suite.
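
When debugging outside of Dovetail, the Tempest test names listed in the test descriptions below can also be executed directly with the Tempest CLI; a minimal sketch, assuming a working Tempest installation configured against the SUT:

$ tempest run --regex tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario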

Test Descriptions
API Used and Reference

Block storage: https://developer.openstack.org/api-ref/block-storage

  • create volume
  • delete volume
  • attach volume to server
  • detach volume from server

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • delete router
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • show server
  • delete server
  • add/assign floating IP
  • resize server
  • revert resized server
  • confirm resized server
  • pause server
  • unpause server
  • start server
  • stop server
  • reboot server
  • rebuild server
  • suspend server
  • resume suspended server
  • shelve server
  • unshelve server
  • migrate server
  • live-migrate server

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • create port
  • delete port

Floating IPs: https://developer.openstack.org/api-ref/networking/v2/index.html#floating-ips-floatingips

  • create floating IP
  • delete floating IP

Availability zone: https://developer.openstack.org/api-ref/compute/

  • get availability zone
Test Case 1 - Minimum basic scenario
Test case specification

tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario

Test preconditions
  • Nova, cinder, glance, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create an image IMG1
  • Test action 2: Create a keypair KEYP1
  • Test action 3: Create a server VM1 with IMG1 and KEYP1
  • Test assertion 1: Verify VM1 is created successfully
  • Test action 4: Create a volume VOL1
  • Test assertion 2: Verify VOL1 is created successfully
  • Test action 5: Attach VOL1 to VM1
  • Test assertion 3: Verify VOL1’s status has been updated after attached to VM1
  • Test action 6: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test assertion 4: Verify VM1’s addresses have been refreshed after associating FIP1
  • Test action 7: Create and add security group SG1 to VM1
  • Test assertion 5: Verify can SSH to VM1 via FIP1
  • Test action 8: Reboot VM1
  • Test assertion 6: Verify can SSH to VM1 via FIP1
  • Test assertion 7: Verify VM1’s disk count equals 1
  • Test action 9: Delete the floating IP FIP1 from VM1
  • Test assertion 8: Verify VM1’s addresses have been refreshed after disassociating FIP1
  • Test action 10: Delete SG1, IMG1, KEYP1, VOL1, VM1 and FIP1

This test evaluates a minimum basic scenario. Specifically, the test verifies that:

  • The server can be connected before reboot.
  • The server can be connected after reboot.

In order to pass this test, all test assertions listed in the test execution above need to pass.
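
As an informal illustration of test actions 4, 5 and 8, the volume attach and reboot steps map roughly onto the CLI commands below; the names VM1 and VOL1 follow the description above, and the 1 GB volume size is an arbitrary assumption.

$ openstack volume create --size 1 VOL1
$ openstack server add volume VM1 VOL1            # VOL1 status changes to 'in-use'
$ openstack volume show VOL1 -f value -c status
$ openstack server reboot VM1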

Post conditions

N/A

Test Case 2 - Cold migration
Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration

Test preconditions
  • At least 2 compute nodes
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Get VM1’s host info SRC_HOST
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 6: Cold migrate VM1
  • Test action 7: Wait for VM1 to reach ‘VERIFY_RESIZE’ status
  • Test action 8: Confirm resize VM1
  • Test action 9: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 10: Get VM1’s host info DST_HOST
  • Test assertion 3: Verify SRC_HOST does not equal DST_HOST
  • Test action 11: Delete KEYP1, VM1 and FIP1

This test evaluates the ability to cold migrate VMs. Specifically, the test verifies that:

  • Servers can be cold migrated from one compute node to another compute node.

In order to pass this test, all test assertions listed in the test execution above need to pass.
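
Test actions 6 to 9 can be reproduced manually with commands along the following lines; a hedged sketch only, noting that cold migration typically requires admin credentials and that the confirm step may also be issued with the nova client.

$ openstack server migrate VM1                                 # cold migrate to another host
$ openstack server show VM1 -f value -c status                 # wait for VERIFY_RESIZE
$ nova resize-confirm VM1                                      # confirm the migration
$ openstack server show VM1 -f value -c OS-EXT-SRV-ATTR:host   # admin-only host attribute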

Post conditions

N/A

Test Case 3 - Pause and unpause server
Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_pause_unpause

Test preconditions
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Pause VM1
  • Test action 5: Wait for VM1 to reach ‘PAUSED’ status
  • Test assertion 1: Verify FIP1 status is ‘ACTIVE’
  • Test assertion 2: Verify ping FIP1 failed and SSH to VM1 via FIP1 failed
  • Test action 6: Unpause VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 3: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1

This test evaluates the ability to pause and unpause VMs. Specifically, the test verifies that:

  • When paused, servers cannot be reached.
  • When unpaused, servers recover their reachability.

In order to pass this test, all test assertions listed in the test execution above need to pass.
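
The pause and unpause steps (test actions 4 to 7) correspond roughly to the following CLI calls; a sketch using the VM1 name from the description above.

$ openstack server pause VM1
$ openstack server show VM1 -f value -c status   # expect PAUSED
$ openstack server unpause VM1
$ openstack server show VM1 -f value -c status   # expect ACTIVE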

Post conditions

N/A

Test Case 4 - Reboot server
Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_reboot

Test preconditions
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Soft reboot VM1
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 6: Delete KEYP1, VM1 and FIP1

This test evaluates the ability to reboot servers. Specifically, the test verifies that:

  • After reboot, servers can still be connected.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 5 - Rebuild server
Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_rebuild

Test preconditions
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Rebuild VM1 with another image
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 6: Delete KEYP1, VM1 and FIP1

This test evaluates the ability to rebuild servers. Specifically, the test verifies that:

  • Servers can be rebuilt with a specific image correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 6 - Resize server
Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_resize

Test preconditions
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Resize VM1 with another flavor
  • Test action 5: Wait for VM1 to reach ‘VERIFY_RESIZE’ status
  • Test action 6: Confirm resize VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1

This test evaluates the ability to resize servers. Specifically, the test verifies that:

  • Servers can be resized with a specific flavor correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.
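
Test actions 4 to 6 roughly correspond to the commands below; a sketch in which the flavor name m1.small is a placeholder, and the confirm step is shown via the nova client.

$ openstack server resize --flavor m1.small VM1
$ openstack server show VM1 -f value -c status   # wait for VERIFY_RESIZE
$ nova resize-confirm VM1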

Post conditions

N/A

Test Case 7 - Stop and start server
Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_stop_start

Test preconditions
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Stop VM1
  • Test action 5: Wait for VM1 to reach ‘SHUTOFF’ status
  • Test assertion 1: Verify ping FIP1 failed and SSH to VM1 via FIP1 failed
  • Test action 6: Start VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1

This test evaluates the ability to stop and start servers. Specifically, the test verifies that:

  • When stopped, servers cannot be reached.
  • When started, servers recover their reachability.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 8 - Suspend and resume server
Test case specification

tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_suspend_resume

Test preconditions
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a server VM1 with KEYP1
  • Test action 3: Create a floating IP FIP1 and assign FIP1 to VM1
  • Test action 4: Suspend VM1
  • Test action 5: Wait for VM1 to reach ‘SUSPENDED’ status
  • Test assertion 1: Verify ping FIP1 failed and SSH to VM1 via FIP1 failed
  • Test action 6: Resume VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify can ping FIP1 successfully and can SSH to VM1 via FIP1
  • Test action 8: Delete KEYP1, VM1 and FIP1

This test evaluates the ability to suspend and resume servers. Specifically, the test verifies that:

  • When suspended, servers cannot be reached.
  • When resumed, servers recover their reachability.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 9 - Suspend and resume server in sequence
Test case specification

tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_server_sequence_suspend_resume

Test preconditions
  • Nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a server VM1
  • Test action 2: Suspend VM1
  • Test action 3: Wait for VM1 to reach ‘SUSPENDED’ status
  • Test assertion 1: Verify VM1’s status is ‘SUSPENDED’
  • Test action 4: Resume VM1
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 2: Verify VM1’s status is ‘ACTIVE’
  • Test action 6: Suspend VM1
  • Test action 7: Wait for VM1 to reach ‘SUSPENDED’ status
  • Test assertion 3: Verify VM1 status is ‘SUSPENDED’
  • Test action 8: Resume VM1
  • Test action 9: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 4: Verify VM1 status is ‘ACTIVE’
  • Test action 10: Delete VM1

This test evaluates the ability to suspend and resume servers in sequence. Specifically, the test verifies that:

  • Servers can be suspended and resumed in sequence correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 10 - Resize volume backed server
Test case specification

tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_volume_backed_server_confirm

Test preconditions
  • Nova, neutron, cinder services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a volume backed server VM1
  • Test action 2: Resize VM1 with another flavor
  • Test action 3: Wait for VM1 to reach ‘VERIFY_RESIZE’ status
  • Test action 4: Confirm resize VM1
  • Test action 5: Wait for VM1 to reach ‘ACTIVE’ status
  • Test assertion 1: VM1’s status is ‘ACTIVE’
  • Test action 6: Delete VM1

This test evaluates the ability to resize volume backed servers. Specifically, the test verifies that:

  • Volume backed servers can be resized with a specific flavor correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.
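
A volume-backed server as used in test action 1 can be created manually by booting from a volume; a rough sketch in which the image, flavor and network names are placeholders.

$ openstack volume create --image cirros-0.3.5 --size 1 BOOTVOL1
$ openstack server create --volume BOOTVOL1 --flavor m1.tiny --network NET1 VM1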

Post conditions

N/A

Test Case 11 - Shelve and unshelve server
Test case specification

tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance

Test preconditions
  • Nova, neutron, image services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a security group SG1, which has rules for allowing incoming SSH and ICMP traffic
  • Test action 3: Create a server VM1 with SG1 and KEYP1
  • Test action 4: Create a timestamp and store it in a file F1 inside VM1
  • Test action 5: Shelve VM1
  • Test action 6: Unshelve VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test action 8: Read F1 and compare the value read with the value previously written
  • Test assertion 1: Verify the values written and read are the same
  • Test action 9: Delete SG1, KEYP1 and VM1

This test evaluates the ability to shelve and unshelve servers. Specifically, the test verifies that:

  • Servers can be shelved and unshelved correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.
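
The shelve and unshelve steps (test actions 5 to 7) correspond roughly to the commands below; a sketch using the VM1 name from the description above.

$ openstack server shelve VM1
$ openstack server show VM1 -f value -c status   # expect SHELVED or SHELVED_OFFLOADED
$ openstack server unshelve VM1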

Post conditions

N/A

Test Case 12 - Shelve and unshelve volume backed server
Test case specification

tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_volume_backed_instance

Test preconditions
  • Nova, neutron, image, cinder services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Create a keypair KEYP1
  • Test action 2: Create a security group SG1, which has rules for allowing incoming and outgoing SSH and ICMP traffic
  • Test action 3: Create a volume backed server VM1 with SG1 and KEYP1
  • Test action 4: SSH to VM1 to create a timestamp T_STAMP1 and store it in a file F1 inside VM1
  • Test action 5: Shelve VM1
  • Test action 6: Unshelve VM1
  • Test action 7: Wait for VM1 to reach ‘ACTIVE’ status
  • Test action 8: SSH to VM1 to read the timestamp T_STAMP2 stored in F1
  • Test assertion 1: Verify T_STAMP1 equals to T_STAMP2
  • Test action 9: Delete SG1, KEYP1 and VM1

This test evaluates the ability to shelve and unshelve volume backed servers. Specifically, the test verifies that:

  • Volume backed servers can be shelved and unshelved correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

VM Resource Scheduling on Multiple Nodes test specification
Scope

The VM resource scheduling test area evaluates the ability of the system under test to support VM resource scheduling on multiple nodes. The tests in this test area will evaluate capabilities to schedule VMs to multiple compute nodes directly with scheduler hints, and to create server groups with affinity and anti-affinity policies.
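
As background on the mechanisms exercised here, a server group with an (anti-)affinity policy is created first and its id is then passed to the scheduler as a hint when booting servers; a minimal, hedged CLI sketch with placeholder image, flavor and network names.

$ openstack server group create --policy anti-affinity GRP1
$ GRP1_ID=$(openstack server group show GRP1 -f value -c id)
$ openstack server create --image cirros-0.3.5 --flavor m1.tiny \
    --network NET1 --hint group=$GRP1_ID VM1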

Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • API - Application Programming Interface
  • NFVi - Network Functions Virtualization infrastructure
  • VIM - Virtual Infrastructure Manager
  • VM - Virtual Machine
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured based on server group operations and server operations on multiple nodes. Each test case is able to run independently, i.e. irrespective of the state created by a previous test. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

All these test cases are included in the test case dovetail.tempest.tc005 of the OVP test suite.

Test Descriptions
API Used and Reference

Security Groups: https://developer.openstack.org/api-ref/network/v2/index.html#security-groups-security-groups

  • create security group
  • delete security group

Networks: https://developer.openstack.org/api-ref/networking/v2/index.html#networks

  • create network
  • delete network

Routers and interface: https://developer.openstack.org/api-ref/networking/v2/index.html#routers-routers

  • create router
  • delete router
  • add interface to router

Subnets: https://developer.openstack.org/api-ref/networking/v2/index.html#subnets

  • create subnet
  • delete subnet

Servers: https://developer.openstack.org/api-ref/compute/

  • create keypair
  • create server
  • show server
  • delete server
  • add/assign floating IP
  • create server group
  • delete server group
  • list server groups
  • show server group details

Ports: https://developer.openstack.org/api-ref/networking/v2/index.html#ports

  • create port
  • delete port

Floating IPs: https://developer.openstack.org/api-ref/networking/v2/index.html#floating-ips-floatingips

  • create floating IP
  • delete floating IP

Availability zone: https://developer.openstack.org/api-ref/compute/

  • get availability zone
Test Case 1 - Schedule VM to compute nodes
Test case specification

tempest.scenario.test_server_multinode.TestServerMultinode.test_schedule_to_all_nodes

Test preconditions
  • At least 2 compute nodes
  • Openstack nova, neutron services are available
  • One public network
Basic test flow execution description and pass/fail criteria
  • Test action 1: Get all availability zones AZONES1 in the SUT
  • Test action 2: Get all compute nodes in AZONES1
  • Test action 3: Get the value of ‘min_compute_nodes’, which is set by the user in the tempest configuration file and specifies the minimum number of compute nodes expected
  • Test assertion 1: Verify that SUT has at least as many compute nodes as specified by the ‘min_compute_nodes’ threshold
  • Test action 4: Create one server for each compute node, up to the ‘min_compute_nodes’ threshold
  • Test assertion 2: Verify the number of servers matches the ‘min_compute_nodes’ threshold
  • Test action 5: Get every server’s ‘hostId’ and store them in a set which has no duplicate values
  • Test assertion 3: Verify the length of the set equals the number of servers, to ensure that every server ended up on a different host
  • Test action 6: Delete the created servers

This test evaluates the functionality of VM resource scheduling. Specifically, the test verifies that:

  • VMs are scheduled to the requested compute nodes correctly.

In order to pass this test, all test assertions listed in the test execution above need to pass.
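
The host placement checked in test action 5 and test assertion 3 can also be inspected manually; a hedged sketch with placeholder server names (hostId is visible to regular users, while the admin-only OS-EXT-SRV-ATTR:host field shows the actual host name).

$ openstack availability zone list
$ openstack server show VM1 -f value -c hostId
$ openstack server show VM2 -f value -c hostId   # should differ from VM1's hostId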

Post conditions

N/A

Test Case 2 - Test create and delete multiple server groups with same name and policy
Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_multiple_server_groups_with_same_name_policy

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity
  • Test action 3: Create another server group SERG2 with N1 and policy affinity
  • Test assertion 1: The names of SERG1 and SERG2 are the same
  • Test assertion 2: The ‘policies’ of SERG1 and SERG2 are the same
  • Test assertion 3: The ids of SERG1 and SERG2 are different
  • Test action 4: Delete SERG1 and SERG2
  • Test action 5: List all server groups
  • Test assertion 4: SERG1 and SERG2 are not in the list

This test evaluates the functionality of creating and deleting server groups with the same name and policy. Specifically, the test verifies that:

  • Server groups can be created with the same name and policy.
  • Server groups with the same name and policy can be deleted successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 3 - Test create and delete server group with affinity policy
Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_server_group_with_affinity_policy

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity
  • Test assertion 1: The name of SERG1 returned in the response is the same as N1
  • Test assertion 2: The ‘policies’ of SERG1 returned in the response is affinity
  • Test action 3: Delete SERG1 and list all server groups
  • Test assertion 3: SERG1 is not in the list

This test evaluates the functionality of creating and deleting server group with affinity policy. Specifically, the test verifies that:

  • Server group can be created with affinity policy and deleted successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 4 - Test create and delete server group with anti-affinity policy
Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_server_group_with_anti_affinity_policy

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy anti-affinity
  • Test assertion 1: The name of SERG1 returned in the response is the same as N1
  • Test assertion 2: The ‘policies’ of SERG1 returned in the response is anti-affinity
  • Test action 3: Delete SERG1 and list all server groups
  • Test assertion 3: SERG1 is not in the list

This test evaluates the functionality of creating and deleting server group with anti-affinity policy. Specifically, the test verifies that:

  • Server group can be created with anti-affinity policy and deleted successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 5 - Test list server groups
Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_list_server_groups

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity
  • Test action 3: List all server groups
  • Test assertion 1: SERG1 is in the list
  • Test action 4: Delete SERG1

This test evaluates the functionality of listing server groups. Specifically, the test verifies that:

  • Server groups can be listed successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 6 - Test show server group details
Test case specification

tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_show_server_group

Test preconditions

None

Basic test flow execution description and pass/fail criteria
  • Test action 1: Generate a random name N1
  • Test action 2: Create a server group SERG1 with N1 and policy affinity, and store the details (D1) returned in the response
  • Test action 3: Show the details (D2) of SERG1
  • Test assertion 1: All values in D1 are the same as the values in D2
  • Test action 4: Delete SERG1

This test evaluates the functionality of showing server group details. Specifically, the test verifies that:

  • Server groups can be shown successfully.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

VPN test specification
Scope

The VPN test area evaluates the ability of the system under test to support VPN networking for virtual workloads. The tests in this test area will evaluate establishing VPN networks, publishing and communicating between endpoints using BGP, and tearing down the networks.

References

This test area evaluates the ability of the system to perform selected actions defined in the following specifications. Details of specific features evaluated are described in the test descriptions.

Definitions and abbreviations

The following terms and abbreviations are used in conjunction with this test area

  • BGP - Border gateway protocol
  • eRT - Export route target
  • IETF - Internet Engineering Task Force
  • iRT - Import route target
  • NFVi - Network functions virtualization infrastructure
  • Tenant - An isolated set of virtualized infrastructures
  • VM - Virtual machine
  • VPN - Virtual private network
  • VLAN - Virtual local area network
System Under Test (SUT)

The system under test is assumed to be the NFVi and VIM in operation on a Pharos compliant infrastructure.

Test Area Structure

The test area is structured in four separate tests which are executed sequentially. The order of the tests is arbitrary as there are no dependencies across the tests. Specifically, every test performs clean-up operations which return the system to the same state as before the test.

The test area evaluates the ability of the SUT to establish connectivity between Virtual Machines using an appropriate route target configuration, reconfigure the route targets to remove connectivity between the VMs, then reestablish connectivity by re-association.

Test Descriptions
Test Case 1 - VPN provides connectivity between Neutron subnets
Short name

dovetail.sdnvpn.tc001.subnet_connectivity

Use case specification

This test evaluates the use case where an NFVi tenant uses a BGPVPN to provide connectivity between VMs on different Neutron networks and subnets that reside on different hosts.

Test preconditions

2 compute nodes are available, denoted Node1 and Node2 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IPs are passed to the VMs sending pings by means of a custom user data script. Whether or not a ping was successful is determined by checking the console output of the source VMs.
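
As an illustration of this methodology, a simplified user data script might look as follows; the target IPs here are placeholders, and the actual test framework generates its own script and parses the console output automatically.

#!/bin/sh
# Hypothetical user data: ping each target IP and report the result on the console.
for ip in 10.10.10.12 10.10.10.13; do
    if ping -c 4 "$ip" > /dev/null 2>&1; then
        echo "ping $ip OK"
    else
        echo "ping $ip FAILED"
    fi
done

The console output of a source VM can then be inspected with, for example:

$ openstack console log show VM1 | grep "ping"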

Test execution
  • Create Neutron network N1 and subnet SN1 with IP range 10.10.10.0/24
  • Create Neutron network N2 and subnet SN2 with IP range 10.10.11.0/24
  • Create VM1 on Node1 with a port in network N1
  • Create VM2 on Node1 with a port in network N1
  • Create VM3 on Node2 with a port in network N1
  • Create VM4 on Node1 with a port in network N2
  • Create VM5 on Node2 with a port in network N2
  • Create VPN1 with eRT<>iRT
  • Create network association between network N1 and VPN1
  • VM1 sends ICMP packets to VM2 using ping
  • Test assertion 1: Ping from VM1 to VM2 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM3 using ping
  • Test assertion 2: Ping from VM1 to VM3 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 3: Ping from VM1 to VM4 fails: ping exits with a non-zero return code
  • Create network association between network N2 and VPN1
  • VM4 sends ICMP packets to VM5 using ping
  • Test assertion 4: Ping from VM4 to VM5 succeeds: ping exits with return code 0
  • Configure iRT=eRT in VPN1
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 5: Ping from VM1 to VM4 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM5 using ping
  • Test assertion 6: Ping from VM1 to VM5 succeeds: ping exits with return code 0
  • Delete all instances: VM1, VM2, VM3, VM4 and VM5
  • Delete all networks and subnets: networks N1 and N2 including subnets SN1 and SN2
  • Delete all network associations and VPN1
Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet have IP connectivity regardless of BGP/MPLS VPNs (test assertion 1, 2, 4)
  • VMs in different Neutron subnets do not have IP connectivity by default - in this case without associating VPNs with the same import and export route targets to the Neutron networks (test assertion 3)
  • VMs in different Neutron subnets have routed IP connectivity after associating both networks with BGP/MPLS VPNs which have been configured with the same import and export route targets (test assertion 5, 6). Hence, adjusting the ingress and egress route targets enables as well as prohibits routing.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 2 - VPNs ensure traffic separation between tenants
Short Name

dovetail.sdnvpn.tc002.tenant_separation

Use case specification

This test evaluates if VPNs provide separation of traffic such that overlapping IP ranges can be used.

Test preconditions

2 compute nodes are available, denoted Node1 and Node2 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for verifying connectivity

Connectivity between VMs is tested by establishing an SSH connection and executing the command “hostname” on the remote VM in order to retrieve its hostname. The retrieved hostname is then compared against an expected value. This is used to verify tenant traffic separation, i.e., despite overlapping IPs, a connection is made to the correct VM as determined by the hostname of the target VM.
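
In essence, the verification performed from within VM1 and VM4 resembles the following, assuming for illustration a CirrOS guest image (default user ‘cirros’) and the fixed IPs from the execution steps below.

$ ssh cirros@10.10.10.12 hostname     # expected to print the hostname of VM2
$ ssh cirros@10.10.11.13 hostname     # expected to print the hostname of VM3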

Test execution
  • Create Neutron network N1
  • Create subnet SN1a of network N1 with IP range 10.10.10.0/24
  • Create subnet SN1b of network N1 with IP range 10.10.11.0/24
  • Create Neutron network N2
  • Create subnet SN2a of network N2 with IP range 10.10.10.0/24
  • Create subnet SN2b of network N2 with IP range 10.10.11.0/24
  • Create VM1 on Node1 with a port in network N1 and IP 10.10.10.11.
  • Create VM2 on Node1 with a port in network N1 and IP 10.10.10.12.
  • Create VM3 on Node2 with a port in network N1 and IP 10.10.11.13.
  • Create VM4 on Node1 with a port in network N2 and IP 10.10.10.12.
  • Create VM5 on Node2 with a port in network N2 and IP 10.10.11.13.
  • Create VPN1 with iRT=eRT=RT1
  • Create network association between network N1 and VPN1
  • VM1 attempts to execute the command hostname on the VM with IP 10.10.10.12 via SSH.
  • Test assertion 1: VM1 can successfully connect to the VM with IP 10.10.10.12. via SSH and execute the remote command hostname. The retrieved hostname equals the hostname of VM2.
  • VM1 attempts to execute the command hostname on the VM with IP 10.10.11.13 via SSH.
  • Test assertion 2: VM1 can successfully connect to the VM with IP 10.10.11.13 via SSH and execute the remote command hostname. The retrieved hostname equals the hostname of VM3.
  • Create VPN2 with iRT=eRT=RT2
  • Create network association between network N2 and VPN2
  • VM4 attempts to execute the command hostname on the VM with IP 10.10.11.13 via SSH.
  • Test assertion 3: VM4 can successfully connect to the VM with IP 10.10.11.13 via SSH and execute the remote command hostname. The retrieved hostname equals the hostname of VM5.
  • VM4 attempts to execute the command hostname on the VM with IP 10.10.11.11 via SSH.
  • Test assertion 4: VM4 cannot connect to the VM with IP 10.10.11.11 via SSH.
  • Delete all instances: VM1, VM2, VM3, VM4 and VM5
  • Delete all networks and subnets: networks N1 and N2 including subnets SN1a, SN1b, SN2a and SN2b
  • Delete all network associations, VPN1 and VPN2
Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet (still) have IP connectivity between each other when a BGP/MPLS VPN is associated with the network (test assertion 1).
  • VMs in different Neutron subnets have routed IP connectivity between each other when BGP/MPLS VPNs with the same import and export route targets are associated with both networks (assertion 2).
  • VMs in different Neutron networks and BGP/MPLS VPNs with different import and export route targets can have overlapping IP ranges. The BGP/MPLS VPNs provide traffic separation (assertion 3 and 4).

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 3 - VPN provides connectivity between subnets using router association
Short Name

dovetail.sdnvpn.tc004.router_association

Use case specification

This test evaluates if a VPN provides connectivity between two subnets by utilizing two different VPN association mechanisms: a router association and a network association.

Specifically, the test network topology comprises two networks N1 and N2 with corresponding subnets. Additionally, network N1 is connected to a router R1. This test verifies that a VPN V1 provides connectivity between both networks when applying a router association to router R1 and a network association to network N2.

Test preconditions

2 compute nodes are available, denoted Node1 and Node2 in the following.

Basic test flow execution description and pass/fail criteria
Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IPs are passed to the VMs sending pings by means of a custom user data script. Whether or not a ping was successful is determined by checking the console output of the source VMs.

Test execution
  • Create a network N1, a subnet SN1 with IP range 10.10.10.0/24 and a connected router R1
  • Create a network N2, a subnet SN2 with IP range 10.10.11.0/24
  • Create VM1 on Node1 with a port in network N1
  • Create VM2 on Node1 with a port in network N1
  • Create VM3 on Node2 with a port in network N1
  • Create VM4 on Node1 with a port in network N2
  • Create VM5 on Node2 with a port in network N2
  • Create VPN1 with eRT<>iRT so that connected subnets should not reach each other
  • Create a router association between router R1 and VPN1
  • VM1 sends ICMP packets to VM2 using ping
  • Test assertion 1: Ping from VM1 to VM2 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM3 using ping
  • Test assertion 2: Ping from VM1 to VM3 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 3: Ping from VM1 to VM4 fails: ping exits with a non-zero return code
  • Create network association between network N2 and VPN1
  • VM4 sends ICMP packets to VM5 using ping
  • Test assertion 4: Ping from VM4 to VM5 succeeds: ping exits with return code 0
  • Change VPN1 so that iRT=eRT
  • VM1 sends ICMP packets to VM4 using ping
  • Test assertion 5: Ping from VM1 to VM4 succeeds: ping exits with return code 0
  • VM1 sends ICMP packets to VM5 using ping
  • Test assertion 6: Ping from VM1 to VM5 succeeds: ping exits with return code 0
  • Delete all instances: VM1, VM2, VM3, VM4 and VM5
  • Delete all networks, subnets and routers: networks N1 and N2 including subnets SN1 and SN2, router R1
  • Delete all network and router associations and VPN1
Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet have IP connectivity regardless of the import and export route target configuration of BGP/MPLS VPNs (test assertion 1, 2, 4)
  • VMs in different Neutron subnets do not have IP connectivity by default - in this case without associating VPNs with the same import and export route targets to the Neutron networks or connected Neutron routers (test assertion 3).
  • VMs in two different Neutron subnets have routed IP connectivity after associating the first network and a router connected to the second network with BGP/MPLS VPNs which have been configured with the same import and export route targets (test assertion 5, 6). Hence, adjusting the ingress and egress route targets enables as well as prohibits routing.
  • Network and router associations are equivalent methods for binding Neutron networks to a VPN.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

Test Case 4 - Verify interworking of router and network associations with floating IP functionality
Short Name

dovetail.sdnvpn.tc008.router_association_floating_ip

Use case specification

This test evaluates if both the router association and network association mechanisms interwork with floating IP functionality.

Specifically, the test network topology comprises two networks N1 and N2 with corresponding subnets. Additionally, network N1 is connected to a router R1. This test verifies that i) a VPN V1 provides connectivity between both networks when applying a router association to router R1 and a network association to network N2 and ii) a VM in network N1 is reachable externally by means of a floating IP.

Test preconditions

At least one compute node is available.

Basic test flow execution description and pass/fail criteria
Methodology for verifying connectivity

Connectivity between VMs is tested by sending ICMP ping packets between selected VMs. The target IPs are passed to the VMs sending pings by means of a custom user data script. Whether or not a ping was successful is determined by checking the console output of the source VMs.

Test execution
  • Create a network N1, a subnet SN1 with IP range 10.10.10.0/24 and a connected router R1
  • Create a network N2 and a subnet SN2 with IP range 10.10.20.0/24
  • Create VM1 with a port in network N1
  • Create VM2 with a port in network N2
  • Create VPN1
  • Create a router association between router R1 and VPN1
  • Create a network association between network N2 and VPN1
  • VM1 sends ICMP packets to VM2 using ping
  • Test assertion 1: Ping from VM1 to VM2 succeeds: ping exits with return code 0
  • Assign a floating IP to VM1
  • The host running the test framework sends ICMP packets to VM1 using ping
  • Test assertion 2: Ping from the host running the test framework to the floating IP of VM1 succeeds: ping exits with return code 0
  • Delete floating IP assigned to VM1
  • Delete all instances: VM1, VM2
  • Delete all networks, subnets and routers: networks N1 and N2 including subnets SN1 and SN2, router R1
  • Delete all network and router associations as well as VPN1
Pass / fail criteria

This test evaluates the capability of the NFVi and VIM to provide routed IP connectivity between VMs by means of BGP/MPLS VPNs. Specifically, the test verifies that:

  • VMs in the same Neutron subnet have IP connectivity regardless of the import and export route target configuration of BGP/MPLS VPNs (test assertion 1)
  • VMs connected to a network which has been associated with a BGP/MPLS VPN are reachable through floating IPs.

In order to pass this test, all test assertions listed in the test execution above need to pass.

Post conditions

N/A

OPNFV Verified Program Testing User Guide
Conducting OVP Testing with Dovetail
Overview

The Dovetail testing framework for OVP consists of two major parts: the testing client that executes all test cases in a lab (vendor self-testing or a third party lab), and the server system that is hosted by the OVP administrator to store and view test results based on a web API. The following diagram illustrates this overall framework.

_images/dovetail_online_mode.png

Within the tester’s lab, the Test Host is the machine where Dovetail executes all automated test cases. As it hosts the test harness, the Test Host must not be part of the System Under Test (SUT) itself. The above diagram assumes that the tester’s Test Host is situated in a DMZ, which has internal network access to the SUT and external access via the public Internet. The public Internet connection allows for easy installation of the Dovetail containers. A single compressed file that includes all the underlying results can be pulled from the Test Host and uploaded to the OPNFV OVP server. This arrangement may not be supported in some labs. Dovetail also supports an offline mode of installation that is illustrated in the next diagram.

_images/dovetail_offline_mode.png

In the offline mode, the Test Host only needs to have access to the SUT via the internal network, but does not need to connect to the public Internet. This user guide will highlight differences between the online and offline modes of the Test Host. While it is possible to run the Test Host as a virtual machine, this user guide assumes it is a physical machine for simplicity.

The rest of this guide will describe how to install the Dovetail tool as a Docker container image, go over the steps of running the OVP test suite, and then discuss how to view test results and make sense of them.

Readers interested in using Dovetail for its functionalities beyond OVP testing, e.g. for in-house or extended testing, should consult the Dovetail developer’s guide for additional information.

Installing Dovetail

In this section, we describe the procedure to install Dovetail client tool on the Test Host. The Test Host must have network access to the management network with access rights to the Virtual Infrastructure Manager’s API.

Checking the Test Host Readiness

The Test Host must have network access to the Virtual Infrastructure Manager’s API hosted in the SUT so that the Dovetail tool can exercise the API from the Test Host. It must also have ssh access to the Linux operating system of the compute nodes in the SUT. The ssh mechanism is used by some test cases to generate test events in the compute nodes. You can find out which test cases use this mechanism in the test specification document.

We have tested the Dovetail tool on the following host operating systems. Other versions or distributions of Linux may also work, but community support may be more available on these versions.

  • Ubuntu 16.04.2 LTS (Xenial) or 14.04 LTS (Trusty)
  • CentOS-7-1611
  • Red Hat Enterprise Linux 7.3
  • Fedora 24 or 25 Server

Use of Ubuntu 16.04 is highly recommended, as it has been most widely employed during testing. Non-Linux operating systems, such as Windows and Mac OS, have not been tested and are not supported.

If online mode is used, the tester should also validate that the Test Host can reach the public Internet. For example,

$ ping www.opnfv.org
PING www.opnfv.org (50.56.49.117): 56 data bytes
64 bytes from 50.56.49.117: icmp_seq=0 ttl=48 time=52.952 ms
64 bytes from 50.56.49.117: icmp_seq=1 ttl=48 time=53.805 ms
64 bytes from 50.56.49.117: icmp_seq=2 ttl=48 time=53.349 ms
...

Or, if the lab environment does not allow ping, try validating it using HTTPS instead.

$ curl https://www.opnfv.org
<!doctype html>


<html lang="en-US" class="no-js">
<head>
...
Installing Prerequisite Packages on the Test Host

The main prerequisite software for Dovetail are Python and Docker.

In the OVP test suite for the Danube release, Dovetail requires Python 2.7. Various minor versions of Python 2.7.x are known to work with Dovetail, but there are no assurances. Python 3.x is not supported at this time.

Use the following steps to check whether the right version of Python is already installed, and if not, install it.

$ python --version
Python 2.7.6

If your Test Host does not have Python installed, or the version is not 2.7, you should consult Python installation guides corresponding to the operating system in your Test Host on how to install Python 2.7.

Dovetail does not work with Docker versions prior to 1.12.3. We have validated Dovetail with Docker 17.03 CE. Other versions of Docker later than 1.12.3 may also work, but community support may be more available on Docker 17.03 CE or greater.

$ sudo docker version
Client:
Version:      17.03.1-ce
API version:  1.27
Go version:   go1.7.5
Git commit:   c6d412e
Built:        Mon Mar 27 17:10:36 2017
OS/Arch:      linux/amd64

Server:
Version:      17.03.1-ce
API version:  1.27 (minimum version 1.12)
Go version:   go1.7.5
Git commit:   c6d412e
Built:        Mon Mar 27 17:10:36 2017
OS/Arch:      linux/amd64
Experimental: false

If your Test Host does not have Docker installed, or Docker is older than 1.12.3, or you have a Docker version other than 17.03 CE and wish to change, you will need to install, upgrade, or re-install in order to run Dovetail. The Docker installation process can be more complex; you should refer to the official Docker installation guide that is relevant to your Test Host’s operating system.

The above installation steps assume that the Test Host is in the online mode. For offline testing, use the following offline installation steps instead.

In order to install or upgrade Python offline, you may download packaged Python 2.7 for your Test Host’s operating system on a connected host, copy the package to the Test Host, then install from that local copy.
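
For example, on an Ubuntu Test Host this could look roughly as follows; a hedged sketch, as package names and dependencies vary by distribution and release.

# On a host with Internet access:
$ apt-get download python2.7 python2.7-minimal libpython2.7-stdlib libpython2.7-minimal
# Copy the resulting .deb files to the Test Host, then install locally:
$ sudo dpkg -i *.deb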

In order to install Docker offline, download the Docker static binaries on a connected host and copy the tar file to the Test Host. For Ubuntu 14.04, for example, you may follow the link below to install:

https://github.com/meetyg/docker-offline-install
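
At a high level, installing from the Docker static binaries amounts to extracting the archive and placing the binaries on the PATH; a rough sketch, assuming the tar file has already been copied to the Test Host (the archive name is an example; refer to the official Docker documentation for the authoritative steps).

$ tar xzvf docker-17.03.1-ce.tgz
$ sudo cp docker/* /usr/bin/
$ sudo dockerd &          # start the Docker daemon manually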
Configuring the Test Host Environment

The Test Host needs a few environment variables set correctly in order to access the Openstack API required to drive the Dovetail tests. For convenience and as a convention, we will also create a home directory for storing all Dovetail related config files and results files:

$ mkdir -p /home/dovetail
$ export DOVETAIL_HOME=/home/dovetail

Here we set the Dovetail home directory to /home/dovetail as an example. Then create a directory named pre_config in this directory to store all Dovetail related config files:

$ mkdir -p ${DOVETAIL_HOME}/pre_config
Setting up Primary Configuration File

At this point, you will need to consult your SUT (Openstack) administrator to correctly set the configurations in a file named env_config.sh. The Openstack settings need to be configured such that the Dovetail client has all the necessary credentials and privileges to execute all test operations. If the SUT uses terms somewhat differently from the standard Openstack naming, you will need to adjust this file accordingly.

Create and edit the file ${DOVETAIL_HOME}/pre_config/env_config.sh so that all parameters are set correctly to match your SUT. Here is an example of what this file should contain.

$ cat ${DOVETAIL_HOME}/pre_config/env_config.sh

# Project-level authentication scope (name or ID), recommend admin project.
export OS_PROJECT_NAME=admin

# For identity v2, it uses OS_TENANT_NAME rather than OS_PROJECT_NAME.
export OS_TENANT_NAME=admin

# Authentication username, belongs to the project above, recommend admin user.
export OS_USERNAME=admin

# Authentication password. Use your own password
export OS_PASSWORD=xxxxxxxx

# Authentication URL, one of the endpoints of the keystone service. If this is the
# v3 version, some extra variables are needed, as shown below.
export OS_AUTH_URL='http://xxx.xxx.xxx.xxx:5000/v3'

# Default is 2.0. If using the keystone v3 API, this should be set to 3.
export OS_IDENTITY_API_VERSION=3

# Domain name or ID containing the user above.
# Command to check the domain: openstack user show <OS_USERNAME>
export OS_USER_DOMAIN_NAME=default

# Domain name or ID containing the project above.
# Command to check the domain: openstack project show <OS_PROJECT_NAME>
export OS_PROJECT_DOMAIN_NAME=default

# Special environment parameters for https.
# If using https + cacert, the path of cacert file should be provided.
# The cacert file should be put at $DOVETAIL_HOME/pre_config.
export OS_CACERT=/path/to/pre_config/cacert.pem

# If using https + no cacert, should add OS_INSECURE environment parameter.
export OS_INSECURE=True

The OS_AUTH_URL variable is key to configure correctly, as the other admin services are gleaned from the identity service. HTTPS should be configured in the SUT, in which case the final two variables should be uncommented as appropriate: OS_CACERT when a CA certificate file is used, or OS_INSECURE when it is not. However, if SSL is disabled in the SUT, comment out both the OS_CACERT and OS_INSECURE variables. Ensure the ‘/path/to/pre_config’ directory in the above file matches the directory location of the cacert file for the OS_CACERT variable.

Export all these variables into the environment by running:

$ source ${DOVETAIL_HOME}/pre_config/env_config.sh

The above line may be added to your .bashrc file for convenience when repeatedly using Dovetail.

The next three sections outline additional configuration files used by Dovetail. The tempest (tempest_conf.yaml) configuration file is required for executing the mandatory osinterop test cases and the optional ipv6/tempest test cases. The HA (pod.yaml) configuration file is required for the mandatory HA test cases and is also employed to collect SUT hardware info. The hosts.yaml is optional for hostname/IP resolution.

Configuration for Running Tempest Test Cases (Mandatory)

The test cases in the test areas osinterop (OpenStack Interoperability tests), ipv6 and tempest are based on Tempest. A SUT-specific configuration of Tempest is required in order to run those test cases successfully. The corresponding SUT-specific configuration options must be supplied in the file $DOVETAIL_HOME/pre_config/tempest_conf.yaml.

Create and edit file $DOVETAIL_HOME/pre_config/tempest_conf.yaml. Here is an example of what this file should contain.

compute:
  # The minimum number of compute nodes expected.
  # This should be no less than 2 and no larger than the compute nodes the SUT actually has.
  min_compute_nodes: 2

  # Expected device name when a volume is attached to an instance.
  volume_device_name: vdb

Use the listing above at a minimum to execute the mandatory test areas.

Configuration for Running HA Test Cases (Mandatory)

The mandatory HA test cases require OpenStack controller node info. It must include the node’s name, role, ip, as well as the user and key_filename or password to log in to the node. Users must create the file ${DOVETAIL_HOME}/pre_config/pod.yaml to store this info. This file is also used as the basis to collect SUT hardware information that is stored alongside results and uploaded to the OVP web portal. The SUT hardware information can be viewed within the ‘My Results’ view in the OVP web portal by clicking the SUT column ‘info’ link. In order to collect SUT hardware information holistically, ensure this file has an entry for each of the controller and compute nodes within the SUT.

Below is a sample with the required syntax when password is employed by the controller.

nodes:
-
    # This can not be changed and must be node1.
    name: node1

    # This must be controller.
    role: Controller

    # This is the install IP of a controller node, which is the haproxy primary node
    ip: xx.xx.xx.xx

    # User name of this node. This user must have sudo privileges.
    user: root

    # Password of the user.
    password: root

Besides the ‘password’, a ‘key_filename’ entry can be provided to login to the controller node. Users need to create file $DOVETAIL_HOME/pre_config/id_rsa to store the private key. A sample is provided below to show the required syntax when using a key file.

nodes:
-
    name: node1
    role: Controller
    ip: 10.1.0.50
    user: root

    # Private key of this node. It must be /root/.ssh/id_rsa
    # Dovetail will move the key file from $DOVETAIL_HOME/pre_config/id_rsa
    # to /root/.ssh/id_rsa of Yardstick container
    key_filename: /root/.ssh/id_rsa

Under nodes, repeat entries for name, role, ip, user and password or key file for each of the controller/compute nodes that comprise the SUT. Use a ‘-‘ to separate each of the entries. Specify the value for the role key to be either ‘Controller’ or ‘Compute’ for each node.

Configuration of Hosts File (Optional)

If your SUT uses a hosts file to translate hostnames into the IP of OS_AUTH_URL, then you need to provide the hosts info in a file $DOVETAIL_HOME/pre_config/hosts.yaml.

Create and edit file $DOVETAIL_HOME/pre_config/hosts.yaml. Below is an example of what this file should contain. Note, that multiple hostnames can be specified for each IP address, as shown in the generic syntax below the example.

$ cat ${DOVETAIL_HOME}/pre_config/hosts.yaml

---
hosts_info:
  192.168.141.101:
    - ha-vip

  <ip>:
    - <hostname1>
    - <hostname2>
Installing Dovetail on the Test Host

The Dovetail project maintains a Docker image that has Dovetail test tools preinstalled. This Docker image is tagged with versions. Before pulling the Dovetail image, check the OPNFV’s OVP web page first to determine the right tag for OVP testing.

Online Test Host

If the Test Host is online, you can directly pull Dovetail Docker image and download Ubuntu and Cirros images. All other dependent docker images will automatically be downloaded. The Ubuntu and Cirros images are used by Dovetail for image creation and VM instantiation within the SUT.

$ wget -nc http://artifacts.opnfv.org/sdnvpn/ubuntu-16.04-server-cloudimg-amd64-disk1.img -P ${DOVETAIL_HOME}/pre_config
$ wget -nc http://download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img -P ${DOVETAIL_HOME}/pre_config

$ sudo docker pull opnfv/dovetail:ovp.1.0.0
ovp.1.0.0: Pulling from opnfv/dovetail
30d541b48fc0: Pull complete
8ecd7f80d390: Pull complete
46ec9927bb81: Pull complete
2e67a4d67b44: Pull complete
7d9dd9155488: Pull complete
cc79be29f08e: Pull complete
e102eed9bf6a: Pull complete
952b8a9d2150: Pull complete
bfbb639d1f38: Pull complete
bf7c644692de: Pull complete
cdc345e3f363: Pull complete
Digest: sha256:d571b1073b2fdada79562e8cc67f63018e8d89268ff7faabee3380202c05edee
Status: Downloaded newer image for opnfv/dovetail:ovp.1.0.0

An example of the <tag> is ovp.1.0.0.

Offline Test Host

If the Test Host is offline, you will need to first pull the Dovetail Docker image, and all the dependent images that Dovetail uses, to a host that is online. The reason you need to pull all dependent images is that Dovetail normally does dependency checking at run time and automatically pulls images as needed when the Test Host is online. If the Test Host is offline, all these dependencies will need to be copied manually.

$ sudo docker pull opnfv/dovetail:ovp.1.0.0
$ sudo docker pull opnfv/functest:ovp.1.0.0
$ sudo docker pull opnfv/yardstick:danube.3.2
$ sudo docker pull opnfv/testapi:ovp.1.0.0
$ sudo docker pull mongo:3.2.1
$ sudo wget -nc http://artifacts.opnfv.org/sdnvpn/ubuntu-16.04-server-cloudimg-amd64-disk1.img -P {ANY_DIR}
$ sudo wget -nc http://download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img -P {ANY_DIR}

Once all these images are pulled, save the images, copy to the Test Host, and then load the Dovetail image and all dependent images at the Test Host. The final two lines above are to obtain the test images for transfer to the Test Host.

At the online host, save the images with the command below.

$ sudo docker save -o dovetail.tar opnfv/dovetail:ovp.1.0.0 \
  opnfv/functest:ovp.1.0.0 opnfv/yardstick:danube.3.2 \
  opnfv/testapi:ovp.1.0.0 mongo:3.2.1

The command above creates a dovetail.tar file with all the images, which can then be copied to the Test Host. To load the Dovetail images on the Test Host execute the command below.

$ sudo docker load --input dovetail.tar

Now check to see that all Docker images have been pulled or loaded properly.

$ sudo docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
opnfv/functest      ovp.1.0.0           e2b286547478        6 weeks ago         1.26 GB
opnfv/dovetail      ovp.1.0.0           5d25b289451c        8 days ago          516MB
opnfv/yardstick     danube.3.2          df830d5c2cb2        6 weeks ago         1.21 GB
opnfv/testapi       ovp.1.0.0           05c6d5ebce6c        2 months ago        448 MB
mongo               3.2.1               7e350b877a9a        19 months ago       317 MB

After copying and loading the Dovetail images on the Test Host, also copy the test images (Ubuntu, Cirros) to the Test Host: copy ubuntu-16.04-server-cloudimg-amd64-disk1.img and cirros-0.3.5-x86_64-disk.img to ${DOVETAIL_HOME}/pre_config/.
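
If the images were downloaded on a separate online host, any file transfer method will do; the commands below are only a sketch using scp, where <user>, <test_host_ip> and <DOVETAIL_HOME> are placeholders for your environment.

$ scp dovetail.tar <user>@<test_host_ip>:/tmp/
$ scp ubuntu-16.04-server-cloudimg-amd64-disk1.img \
      cirros-0.3.5-x86_64-disk.img \
      <user>@<test_host_ip>:<DOVETAIL_HOME>/pre_config/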

Starting Dovetail Docker

Regardless of whether you pulled down the Dovetail image directly online, or loaded from a static image tar file, you are now ready to run Dovetail. Use the command below to create a Dovetail container and get access to its shell.

$ sudo docker run --privileged=true -it \
          -e DOVETAIL_HOME=$DOVETAIL_HOME \
          -v $DOVETAIL_HOME:$DOVETAIL_HOME \
          -v /var/run/docker.sock:/var/run/docker.sock \
          opnfv/dovetail:<tag> /bin/bash

The -e option sets the DOVETAIL_HOME environment variable in the container and the -v options map files in the Test Host to files in the container. The latter option allows the Dovetail container to read the configuration files and write result files into DOVETAIL_HOME on the Test Host. The user should be within the Dovetail container shell, once the command above is executed.

Build Local DB and Testapi Services

The steps in this section only need to be executed if the user plans on storing consolidated results on the Test Host that can be uploaded to the OVP portal.

Dovetail needs to build the local DB and testapi service for storing and reporting results to the OVP web portal. There is a script in the Dovetail container for building the local DB. The ports 27017 and 8000 are used by the DB and testapi, respectively. If the Test Host is already using these ports for other services, remap them to unused values to avoid conflicts. Execute the commands below in the Dovetail container to remap ports, as required. This step can be skipped if there are no port conflicts on the Test Host.

$ export mongodb_port=<new_DB_port>
$ export testapi_port=<new_testapi_port>

Within the Dovetail container, navigate to the directory and execute the shell script using the commands below to build the local DB and testapi services.

$ cd /home/opnfv/dovetail/dovetail/utils/local_db/
$ ./launch_db.sh

To validate that the DB and testapi services are running, navigate to the URL http://<test_host_ip>:<testapi_port>/api/v1/results, substituting within the URL the IP address of the Test Host and the testapi port number. If you can access this URL, the services are up and running.
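
Alternatively, a quick check can be done from a shell on the Test Host with curl; this is only a sketch and assumes the default testapi port 8000 (adjust if the port was remapped above).

$ curl http://<test_host_ip>:8000/api/v1/results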

Running the OVP Test Suite

All or a subset of the available tests can be executed at any location within the Dovetail container prompt. You can refer to Dovetail Command Line Interface Reference for the details of the CLI.

$ dovetail run --testsuite <test-suite-name>

The '--testsuite' option controls, at a high level, the set of tests intended for execution. For the purposes of running the OVP test suite, the test suite name follows the format ovp.<major>.<minor>.<patch>. The latest and default test suite is ovp.1.0.0.

$ dovetail run

This command is equivalent to

$ dovetail run --testsuite ovp.1.0.0

Without any additional options, the above command will attempt to execute all mandatory and optional test cases. To restrict the breadth of the test scope, test areas can also be specified using the '--testarea' option. The test area can be specified broadly using the arguments 'mandatory' and 'optional'. The mandatory tests can be narrowed further using the test area arguments 'osinterop', 'vping' and 'ha'. The optional tests can be narrowed further using the test area arguments 'ipv6', 'sdnvpn' and 'tempest'.

$ dovetail run --testarea mandatory

By default, results are stored in local files on the Test Host at $DOVETAIL_HOME/results. Each time the 'dovetail run' command is executed, the results in the aforementioned directory are overwritten. To create a single compressed result file for upload to the OVP portal or for archival purposes, the results need to be pushed to the local DB. This can be achieved by using the '--report' option with an argument syntax as shown below. Note that the Test Host IP address and testapi port number must be substituted with appropriate values.

$ dovetail run --report http://<test_host_ip>:<testapi_port>/api/v1/results

If the Test Host is offline, the --offline option should be added so that the run uses local resources.

$ dovetail run --offline --report http://<test_host_ip>:<testapi_port>/api/v1/results

Below is an example of running the entire mandatory test area and the creation of the compressed result file on the Test Host.

$ dovetail run --offline --testarea mandatory --report http://192.168.135.2:8000/api/v1/results
2017-09-29 07:00:55,718 - run - INFO - ================================================
2017-09-29 07:00:55,718 - run - INFO - Dovetail compliance: ovp.1.0.0!
2017-09-29 07:00:55,718 - run - INFO - ================================================
2017-09-29 07:00:55,719 - run - INFO - Build tag: daily-master-f0795af6-a4e3-11e7-acc5-0242ac110004
2017-09-29 07:00:55,956 - run - INFO - >>[testcase]: dovetail.osinterop.tc001
2017-09-29 07:15:19,514 - run - INFO - Results have been pushed to database and stored with local file /home/dovetail/results/results.json.
2017-09-29 07:15:19,514 - run - INFO - >>[testcase]: dovetail.vping.tc001
2017-09-29 07:17:24,095 - run - INFO - Results have been pushed to database and stored with local file /home/dovetail/results/results.json.
2017-09-29 07:17:24,095 - run - INFO - >>[testcase]: dovetail.vping.tc002
2017-09-29 07:20:42,434 - run - INFO - Results have been pushed to database and stored with local file /home/dovetail/results/results.json.
2017-09-29 07:20:42,434 - run - INFO - >>[testcase]: dovetail.ha.tc001
...

When test execution is complete, a tar file with all result and log files is written in $DOVETAIL_HOME on the Test Host. An example filename is ${DOVETAIL_HOME}/logs_20180105_0858.tar.gz. The file is named using a timestamp that follows the convention 'YearMonthDay_HourMinute'. In this case, it was generated at 08:58 on January 5th, 2018. This tar file is what is uploaded to the OVP portal.
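
Before uploading, the contents of the archive can be listed with standard tar options; the file name below is the example from above and will differ for your run.

$ tar -tzf ${DOVETAIL_HOME}/logs_20180105_0858.tar.gz | head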

Making Sense of OVP Test Results

When a tester is performing trial runs, Dovetail stores results by default in local files on the Test Host, within the directory specified below. Note that if the '--report' option is used to execute tests, results are written to results.json and the files functest_results.txt and dovetail_ha_tcXXX.out will not be created.

cd $DOVETAIL_HOME/results
  1. Local files
    • Log file: dovetail.log
      • Review dovetail.log to see whether all important information has been captured - in default mode without DEBUG.
      • Review results.json to see all results data, including the criteria for PASS or FAIL (a quick way to inspect this file is shown after this list).
    • Example: OpenStack Interoperability test cases
      • The log details are in osinterop_logs/dovetail.osinterop.tc001.log, which contains the results of the passed, skipped and failed test cases.
      • For skipped test cases, the log gives the reason why they were skipped.
      • For failed test cases, the log contains rich debug information showing why they failed.
    • Example: vping test cases
      • The log is stored in dovetail.log.
      • The result is stored in functest_results.txt.
    • Example: ha test cases
      • The log is stored in dovetail.log.
      • The result is stored in dovetail_ha_tcXXX.out.
    • Example: ipv6, sdnvpn and tempest test cases
      • The log details are in ipv6_logs/dovetail.ipv6.tcXXX.log, sdnvpn_logs/dovetail.sdnvpn.tcXXX.log and tempest_logs/dovetail.tempest.tcXXX.log, respectively. Each contains the results of the passed, skipped and failed test cases.
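
For a quick look at the consolidated results file mentioned above, any JSON-aware tool can be used; the command below is a minimal sketch that simply pretty-prints the file with the json.tool module shipped with Python.

$ python -m json.tool ${DOVETAIL_HOME}/results/results.json | less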
OVP Portal Web Interface

The OVP portal is a public web interface for the community to collaborate on results and to submit results for official OPNFV compliance verification. The portal can be used as a resource by users and testers to navigate and inspect results more easily than by manually inspecting the log files. The portal also allows users to share results in a private manner until they are ready to submit results for peer community review.

  • Web Site URL
  • Sign In / Sign Up Links
    • Accounts are accessed through Linux Foundation or OpenStack account credentials.
    • If you already have a Linux Foundation ID, you can sign in directly with your ID.
    • If you do not have a Linux Foundation ID, you can sign up for a new one using 'Sign Up'.
  • My Results Tab
    • This is the primary view where most of the workflow occurs.
    • This page lists all results uploaded by you after signing in.
    • You can also upload results on this page with the two steps below.
    • Obtain the results tar file located in ${DOVETAIL_HOME}/, for example logs_20180105_0858.tar.gz.
    • Use the Choose File button to open a file selection dialog and choose the result file from your hard disk. Then click the Upload button; a results ID is shown once your upload succeeds.
    • Results have the status 'private' until they are submitted for review.
    • Use the Operation column drop-down option ‘submit to review’, to expose results to OPNFV community peer reviewers. Use the ‘withdraw submit’ option to reverse this action.
    • Use the Operation column drop-down option ‘share with’ to share results with other users by supplying either the login user ID or the email address associated with the share target account. The result is exposed to the share target but remains private otherwise.
  • Profile Tab
    • This page shows your account info after you sign in.
Updating Dovetail or a Test Suite

Follow the instructions in the sections Installing Dovetail on the Test Host and Running the OVP Test Suite, replacing the docker images with the new tags:

sudo docker pull opnfv/dovetail:<dovetail_new_tag>
sudo docker pull opnfv/functest:<functest_new_tag>
sudo docker pull opnfv/yardstick:<yardstick_new_tag>

This step is necessary whenever the Dovetail software or the OVP test suite has been updated.

Dovetail Command Line Interface Reference

The Dovetail command line interface (CLI) provides a simple way for users to access the functions that the Dovetail framework provides.

Commands List
dovetail --help | -h
    Show usage of command "dovetail"
dovetail --version
    Show version number

Dovetail List Commands

dovetail list --help | -h
    Show usage of command "dovetail list"
dovetail list
    List all available test suites and all test cases within each test suite
dovetail list <test_suite_name>
    List all available test areas within test suite <test_suite_name>

Dovetail Show Commands

dovetail show --help | -h
    Show usage of command "dovetail show"
dovetail show <test_case_name>
    Show the details of one test case

Dovetail Run Commands

dovetail run --help | -h
    Show usage of command "dovetail run"
dovetail run
    Run Dovetail with all test areas within the default test suite "compliance_set"
dovetail run --testsuite <test_suite_name>
    Run Dovetail with all test areas within test suite <test_suite_name>
dovetail run --testsuite <test_suite_name> --testarea <test_area_name>
    Run Dovetail with test area <test_area_name> within test suite <test_suite_name>. The test area can be chosen from (mandatory, optional, osinterop, ha, vping, ipv6, tempest, sdnvpn). Repeat the option to set multiple test areas.
dovetail run --debug | -d
    Run Dovetail in debug mode and show all debug logs
dovetail run --offline
    Run Dovetail offline, using local docker images without updating them
dovetail run --report | -r <db_url>
    Push results to the local or official DB
dovetail run --yard_tag | -y <yardstick_docker_image_tag>
    Specify yardstick's docker image tag, default is danube.3.2
dovetail run --func_tag | -f <functest_docker_image_tag>
    Specify functest's docker image tag, default is cvp.0.5.0
dovetail run --bott_tag | -b <bottlenecks_docker_image_tag>
    Specify bottlenecks' docker image tag, default is cvp.0.4.0
Commands Examples
Dovetail Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail --help
Usage: dovetail [OPTIONS] COMMAND [ARGS]...

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  list  list the testsuite details
  run   run the testcases
  show  show the testcases details
root@1f230e719e44:~/dovetail/dovetail# dovetail --version
dovetail, version 0.7.0
Dovetail List Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail list --help
Usage: dovetail list [OPTIONS] [TESTSUITE]

  list the testsuite details

Options:
  -h, --help  Show this message and exit.
root@1f230e719e44:~/dovetail/dovetail# dovetail list debug
- example
    dovetail.example.tc002
- osinterop
    dovetail.osinterop.tc001
- vping
    dovetail.vping.tc001
    dovetail.vping.tc002
Dovetail Show Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail show --help
Usage: dovetail show [OPTIONS] TESTCASE

  show the testcases details

Options:
  -h, --help  Show this message and exit.
root@1f230e719e44:~/dovetail/dovetail# dovetail show dovetail.vping.tc001
---
dovetail.vping.tc001:
  name: dovetail.vping.tc001
  objective: testing for vping using userdata
  validate:
    type: functest
    testcase: vping_userdata
  report:
    sub_testcase_list:
root@1f230e719e44:~/dovetail/dovetail# dovetail show ipv6.tc001
---
dovetail.ipv6.tc001:
  name: dovetail.ipv6.tc001
  objective: Bulk creation and deletion of IPv6 networks, ports and subnets
  validate:
    type: functest
    testcase: tempest_custom
    pre_condition:
      - 'cp /home/opnfv/userconfig/pre_config/tempest_conf.yaml /usr/local/lib/python2.7/dist-packages/functest/opnfv_tests/openstack/tempest/custom_tests/tempest_conf.yaml'
    pre_copy:
      src_file: tempest_custom.txt
      dest_path: /usr/local/lib/python2.7/dist-packages/functest/opnfv_tests/openstack/tempest/custom_tests/test_list.txt
  report:
    sub_testcase_list:
      - tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_network[id-d4f9024d-1e28-4fc1-a6b1-25dbc6fa11e2,smoke]
      - tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_port[id-48037ff2-e889-4c3b-b86a-8e3f34d2d060,smoke]
      - tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_subnet[id-8936533b-c0aa-4f29-8e53-6cc873aec489,smoke]
Dovetail Run Commands
root@1f230e719e44:~/dovetail/dovetail# dovetail run --help
Usage: run.py [OPTIONS]

Dovetail compliance test entry!

Options:
-b, --bott_tag TEXT  Overwrite tag for bottlenecks docker container (e.g. cvp.0.4.0)
-f, --func_tag TEXT  Overwrite tag for functest docker container (e.g. cvp.0.5.0)
-y, --yard_tag TEXT  Overwrite tag for yardstick docker container (e.g. danube.3.2)
--testarea TEXT      compliance testarea within testsuite
--offline            run in offline method, which means not to update the docker upstream images, functest, yardstick, etc.
-r, --report TEXT    push results to DB (e.g. --report http://192.168.135.2:8000/api/v1/results)
--testsuite TEXT     compliance testsuite.
-d, --debug          Flag for showing debug log on screen.
-h, --help           Show this message and exit.
root@1f230e719e44:~/dovetail/dovetail# dovetail run --testsuite proposed_tests --testarea vping --offline -r http://192.168.135.2:8000/api/v1/results
2017-10-12 14:57:51,278 - run - INFO - ================================================
2017-10-12 14:57:51,278 - run - INFO - Dovetail compliance: proposed_tests!
2017-10-12 14:57:51,278 - run - INFO - ================================================
2017-10-12 14:57:51,278 - run - INFO - Build tag: daily-master-b80bca76-af5d-11e7-879a-0242ac110002
2017-10-12 14:57:51,336 - run - WARNING - There is no hosts file /home/jenkins/opnfv/slave_root/workspace/dovetail-compass-huawei-pod7-proposed_tests-danube/cvp/pre_config/hosts.yaml, may be some issues with domain name resolution.
2017-10-12 14:57:51,517 - run - INFO - >>[testcase]: dovetail.vping.tc001
2017-10-12 14:58:21,325 - run - INFO - Results have been pushed to database and stored with local file /home/dovetail/results/results.json.
2017-10-12 14:58:21,337 - run - INFO - >>[testcase]: dovetail.vping.tc002
2017-10-12 14:58:48,862 - run - INFO - Results have been pushed to database and stored with local file /home/dovetail/results/results.json.

Functest

OPNFV FUNCTEST Configuration Guide
Version history
Date        Ver.   Author                          Comment
2016-08-17  1.0.0  Juha Haapavirta, Column Gaynor  Colorado release
2017-01-19  1.0.1  Morgan Richomme                 Adaptations for Danube: update testcase list, update docker command
Introduction

This document describes how to install and configure Functest in OPNFV. The Functest CLI is used during the Functest environment preparation phase. The given example commands should work in both virtual and bare metal cases.

High level architecture

The high level architecture of Functest within OPNFV can be described as follows:

CIMC/Lights+out management               Admin  Mgmt/API  Public  Storage Private
                                          PXE
+                                           +       +        +       +       +
|                                           |       |        |       |       |
|     +----------------------------+        |       |        |       |       |
|     |                            |        |       |        |       |       |
+-----+       Jumphost             |        |       |        |       |       |
|     |                            +--------+       |        |       |       |
|     |                            |        |       |        |       |       |
|     |   +--------------------+   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | Tools              |   +----------------+        |       |       |
|     |   | - Rally            |   |        |       |        |       |       |
|     |   | - Robot            |   |        |       |        |       |       |
|     |   | - TestON           |   |        |       |        |       |       |
|     |   | - RefStack         |   |        |       |        |       |       |
|     |   |                    |   |-------------------------+       |       |
|     |   | Testcases          |   |        |       |        |       |       |
|     |   | - VIM              |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | - SDN Controller   |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | - Features         |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   | - VNF              |   |        |       |        |       |       |
|     |   |                    |   |        |       |        |       |       |
|     |   +--------------------+   |        |       |        |       |       |
|     |     Functest Docker        +        |       |        |       |       |
|     |                            |        |       |        |       |       |
|     |                            |        |       |        |       |       |
|     |                            |        |       |        |       |       |
|     +----------------------------+        |       |        |       |       |
|                                           |       |        |       |       |
|    +----------------+                     |       |        |       |       |
|    |             1  |                     |       |        |       |       |
+----+ +--------------+-+                   |       |        |       |       |
|    | |             2  |                   |       |        |       |       |
|    | | +--------------+-+                 |       |        |       |       |
|    | | |             3  |                 |       |        |       |       |
|    | | | +--------------+-+               |       |        |       |       |
|    | | | |             4  |               |       |        |       |       |
|    +-+ | | +--------------+-+             |       |        |       |       |
|      | | | |             5  +-------------+       |        |       |       |
|      +-+ | |  nodes for     |             |       |        |       |       |
|        | | |  deploying     +---------------------+        |       |       |
|        +-+ |  OPNFV         |             |       |        |       |       |
|          | |                +------------------------------+       |       |
|          +-+     SUT        |             |       |        |       |       |
|            |                +--------------------------------------+       |
|            |                |             |       |        |       |       |
|            |                +----------------------------------------------+
|            +----------------+             |       |        |       |       |
|                                           |       |        |       |       |
+                                           +       +        +       +       +
             SUT = System Under Test

All the libraries and dependencies needed by all of the Functest tools are pre-installed into the Docker image. This allows running Functest on any platform on any Operating System.

The automated mechanisms inside the Functest Docker container will:

  • Retrieve OpenStack credentials
  • Prepare the environment according to the System Under Test (SUT)
  • Perform the appropriate functional tests
  • Push the test results into the OPNFV test result database

This Docker image can be integrated into CI or deployed independently.

Please note that the Functest Docker container has been designed for OPNFV; however, it would be possible to adapt it to any OpenStack based VIM + controller environment, since most of the test cases are integrated from upstream communities.

The functional test cases are described in the Functest User Guide [2].

Prerequisites

The OPNFV deployment is out of the scope of this document, but instructions can be found at http://docs.opnfv.org. The OPNFV platform is considered the SUT in this document.

Several prerequisites are needed for Functest:

  1. A Jumphost to run Functest on
  2. A Docker daemon shall be installed on the Jumphost
  3. A public/external network created on the SUT
  4. An admin/management network created on the SUT
  5. Connectivity from the Jumphost to the SUT public/external network
  6. Connectivity from the Jumphost to the SUT admin/management network

WARNING: Connectivity from the Jumphost is essential and it is of paramount importance to make sure it is working before even considering installing and running Functest. Also make sure you understand how your networking is designed to work.

NOTE: Jumphost refers to any server which meets the previous requirements. Normally it is the same server from which the OPNFV deployment was triggered.

NOTE: If your Jumphost is operating behind a company http proxy and/or firewall, please first consult the section Proxy Support, towards the end of this document. That section details some tips/tricks which may be of help in a proxified environment.

Docker installation

Docker installation and configuration needs to be done only once during the life cycle of the Jumphost.

If your Jumphost is based on Ubuntu, SUSE, RHEL or CentOS linux, please consult the references below for more detailed instructions. The commands below are offered as a short reference.

Tip: For running docker containers behind a proxy, you first need some extra configuration, which is described in the section Docker Installation on CentOS behind http proxy. Follow that section before installing the docker engine.

Docker installation needs to be done as the root user. You may use other user IDs to create and run the actual containers later, if so desired. Log on to your Jumphost as the root user and install the Docker Engine (e.g. for the CentOS family):

curl -sSL https://get.docker.com/ | sh
systemctl start docker

Tip: If you are working through a proxy, please set the https_proxy
environment variable before executing the curl command.

Add your user to docker group to be able to run commands without sudo:

sudo usermod -aG docker <your_user>

A reconnection is needed. There are two ways to do this:
  1. Re-login to your account
  2. su - <username>
References - Installing Docker Engine on different Linux Operating Systems

Public/External network on SUT

Some of the tests against the VIM (Virtual Infrastructure Manager) need connectivity through an existing public/external network in order to succeed. This is needed, for example, to create floating IPs to access VM instances through the public/external network (i.e. from the Docker container).

By default, the four OPNFV installers provide a fresh installation with a public/external network created along with a router. Make sure that the public/external subnet is reachable from the Jumphost.

Hint: For the given OPNFV Installer in use, the IP sub-net address used for the public/external network is usually a planning item and should thus be known. Consult the OPNFV Configuration guide [4], and ensure you can reach each node in the SUT from the Jumphost, using the 'ping' command with the respective IP address on the public/external network for each node in the SUT. The details of how to determine the needed IP addresses for each node in the SUT may vary according to the used installer and are therefore omitted here.
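
As a minimal sketch of this check (the address below is a placeholder for the public/external IP address of a node in your deployment):

ping -c 3 <node_public_ip>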

Connectivity to Admin/Management network on SUT

Some of the Functest tools need to have access to the OpenStack admin/management network of the controllers [1].

For this reason, check the connectivity from the Jumphost to all the controllers in the cluster, within the OpenStack admin/management network range.
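
A minimal sketch of this check, with placeholder addresses for the controllers' admin/management IPs:

ping -c 2 <controller1_admin_ip>
ping -c 2 <controller2_admin_ip>
ping -c 2 <controller3_admin_ip>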

Installation and configuration
Pulling the Docker image

Pull the Functest Docker image (‘opnfv/functest’) from the public dockerhub registry under the OPNFV account: [dockerhub], with the following docker command:

docker pull opnfv/functest:<TagIdentifier>

where <TagIdentifier> identifies a release of the Functest docker container image in the public Dockerhub registry. There are many tags created automatically by the CI mechanisms, and you must ensure you pull an image with the correct tag to match the OPNFV software release installed in your environment. All available tagged images can be seen at location [FunctestDockerTags]. For example, when running on the first official release of the OPNFV Danube system platform, the tag "danube.1.0" is needed. For the second and third releases, the tags "danube.2.0" and "danube.3.0" can be used, respectively. Pulling other tags might cause some problems while running the tests. Docker images pulled without a tag specifier bear the implicitly assigned label "latest". If you need to specifically pull the latest Functest docker image, then omit the tag argument:

docker pull opnfv/functest

After pulling the Docker image, check that it is available with the following docker command:

[functester@jumphost ~]$ docker images
REPOSITORY     TAG             IMAGE ID      CREATED       SIZE
opnfv/functest latest          8cd6683c32ae  2 weeks ago   1.321 GB
opnfv/functest danube.2.0      d2c174a91911  7 minutes ago 1.471 GB
opnfv/functest danube.1.0      13fa54a1b238  4 weeks ago   1.29 GB

The Functest docker container environment can, in principle, also be used with non-OPNFV official installers (e.g. 'devstack'), with the disclaimer that support for such environments is outside the scope and responsibility of the OPNFV project.

Accessing the Openstack credentials

OpenStack credentials are mandatory and can be retrieved in different ways. From inside the running Functest docker container, the "functest env prepare" command will automatically look for the OpenStack credentials file "/home/opnfv/functest/conf/openstack.creds" and retrieve it, unless the file already exists. This Functest environment preparation step is described later in this document.

WARNING: When the installer type is "joid" you have to have the credentials file inside the running container before initiating the functest environment preparation. For that reason you have to choose one of the options below, since the automated copying does not work for "joid".

You can also specifically pass in the needed file prior to running the environment preparation either:

  • by using the -v option when creating the Docker container. This is referred to in docker documentation as “Bind Mounting”. See the usage of this parameter in the following chapter.
  • or by creating a local file '/home/opnfv/functest/conf/openstack.creds' inside the running container with the credentials in it. Consult your installer guide for further details. This is, however, not described in this document.

NOTE: When the installer type is "fuel" and a virtualized deployment is used, you have to explicitly fetch the credentials file by executing the following sequence

  1. Create a container as described in the next chapter, but do not "Bind Mount" the credentials

  2. Log in to the container and execute the following command. Replace the IP after the "-a" parameter with the installer address:

    $REPOS_DIR/releng/utils/fetch_os_creds.sh \
    -d /home/opnfv/functest/conf/openstack.creds \
    -i fuel \
    -a 10.20.0.2 \
    -v
    ( -d specifies the full path to the Openstack credential file
    -i specifies the INSTALLER_TYPE
    -a specifies the INSTALLER_IP
    -v indicates a virtualized environment and takes no arguments )
    
  3. Continue with your testing, initiate functest environment preparation, run tests etc.

In proxified environment you may need to change the credentials file. There are some tips in chapter: Proxy support

Functest Docker parameters

This chapter explains how to run a container for executing Functest test suites. The numbered list below explains some details of the recommended parameters for invoking the docker container.

  1. It is a good practice to assign a precise container name through the --name option.

  2. Assign parameter for installer type:

    -e "INSTALLER_TYPE=<type>"
    # Use one of following apex, compass, fuel or joid
    
  3. Functest needs to know the IP of some installers:

    -e "INSTALLER_IP=<Specific IP Address>"
    
    This IP is needed to fetch RC file from deployment, fetch logs, ...
    If not provided, there is no way to fetch the RC file. It must be
    provided manually as a volume
    
  4. Credentials for accessing the Openstack. Most convenient way of passing them to container is by having a local copy of the credentials file in Jumphost and then using the -v option. In the example we have local file by the name of “overcloudrc” and we are using that as an argument:

    -v ~/overcloudrc:/home/opnfv/functest/conf/openstack.creds
    
    The credentials file needs to exist in the Docker container
    under the path: '/home/opnfv/functest/conf/openstack.creds'.
    

    WARNING: If you are using the Joid installer, you must pass the credentials using the -v option: -v /var/lib/jenkins/admin-openrc:/home/opnfv/functest/conf/openstack.creds. See the section Accessing the Openstack credentials above.

  5. Passing deployment scenario When running Functest against any of the supported OPNFV scenarios, it is recommended to include also the environment variable DEPLOY_SCENARIO. The DEPLOY_SCENARIO environment variable is passed with the format:

    -e "DEPLOY_SCENARIO=os-<controller>-<nfv_feature>-<ha_mode>"
    where:
    os = OpenStack (No other VIM choices currently available)
    controller is one of ( nosdn | odl_l2 | odl_l3 | onos | ocl)
    nfv_feature is one or more of ( ovs | kvm | sfc | bgpvpn | nofeature )
             If several features are pertinent then use the underscore
             character '_' to separate each feature (e.g. ovs_kvm)
             'nofeature' indicates no NFV feature is deployed
    ha_mode (high availability) is one of ( ha | noha )
    

    NOTE: Not all possible combinations of “DEPLOY_SCENARIO” are supported. The name passed in to the Functest Docker container must match the scenario used when the actual OPNFV platform was deployed. See release note to see the list of supported scenarios.

    NOTE: The scenario name is mainly used to automatically detect if a test suite is runnable or not (e.g. it will prevent ONOS test suite to be run on ODL scenarios). If not set, Functest will try to run the default test cases that might not include SDN controller or a specific feature

    NOTE: A HA scenario means that 3 OpenStack controller nodes are deployed. It does not necessarily mean that the whole system is HA. See installer release notes for details.

Putting all of the above together, when using the installer 'fuel' and an example INSTALLER_IP of '10.20.0.2', the recommended command to create the Functest Docker container is as follows:

docker run --name "FunctestContainer" -it \
-e "INSTALLER_IP=10.20.0.2" \
-e "INSTALLER_TYPE=fuel" \
-e "DEPLOY_SCENARIO=os-odl_l2-ovs_kvm-ha" \
-v ~/overcloudrc:/home/opnfv/functest/conf/openstack.creds \
opnfv/functest /bin/bash

After the run command, a new prompt appears which means that we are inside the container and ready to move to the next step.

For tips on how to set up container with installer Apex, see chapter Apex Installer Tips.

Finally, three additional environment variables can also be passed in to the Functest Docker Container, using the -e “<EnvironmentVariable>=<Value>” mechanism. The first two of these are only relevant to Jenkins CI invoked testing and should not be used when performing manual test scenarios:

-e "NODE_NAME=<Test POD Name>" \
-e "BUILD_TAG=<Jenkins Build Tag>" \
-e "CI_DEBUG=<DebugTraceValue>"
where:
<Test POD Name> = Symbolic name of the POD where the tests are run.
                  Visible in test results files, which are stored
                  to the database. This option is only used when
                  tests are activated under Jenkins CI control.
                  It indicates the POD/hardware where the test has
                  been run. If not specified, then the POD name is
                  defined as "Unknown" by default.
                  DO NOT USE THIS OPTION IN MANUAL TEST SCENARIOS.
<Jenkins Build tag> = Symbolic name of the Jenkins Build Job.
                      Visible in test results files, which are stored
                      to the database. This option is only set when
                      tests are activated under Jenkins CI control.
                      It enables the correlation of test results,
                      which
                      are independently pushed to the results database
                      from different Jenkins jobs.
                      DO NOT USE THIS OPTION IN MANUAL TEST SCENARIOS.
<DebugTraceValue> = "true" or "false"
                    Default = "false", if not specified
                    If "true" is specified, then additional debug trace
                    text can be sent to the test results file / log files
                    and also to the standard console output.
Apex Installer Tips

Some specific tips are useful for the Apex Installer case. If you are not using the Apex Installer, ignore this section.

In the case of a Triple-O based installer (like Apex), the docker container needs to connect to the installer VM, so it is required that some known SSH keys are present in the docker container. Since the Jumphost root SSH keys are already known, the easiest way is to use them with the 'Bind mount' method. See below for a sample parameter:

-v /root/.ssh/id_rsa:/root/.ssh/id_rsa

NOTE: You need "sudo" when creating the container in order to access the
root user's ssh credentials, even though the docker command itself might
not require it.

HINT! In the case of Triple-O installers you can find the value for the INSTALLER_IP parameter by executing the following commands and noting the returned IP address:

inst=$(sudo virsh list | grep -iEo "undercloud|instack")
sudo virsh domifaddr ${inst}

NOTE: In releases prior to Colorado, the name 'instack' was
used. Currently the name 'undercloud' is used.

You can copy the credentials file from the "stack" user's home directory in the installer VM to the Jumphost. Please check the correct IP with the command above. In the example below we are using the example IP address "192.168.122.89":

scp stack@192.168.122.89:overcloudrc .

Here is an example of the full docker command invocation for an Apex installed system, using latest Functest docker container, for illustration purposes:

sudo docker run -it --name "ApexFuncTestODL" \
-e "INSTALLER_IP=192.168.122.89" \
-e "INSTALLER_TYPE=apex" \
-e "DEPLOY_SCENARIO=os-odl_l2-nofeature-ha" \
-v /root/.ssh/id_rsa:/root/.ssh/id_rsa \
-v ~/overcloudrc:/home/opnfv/functest/conf/openstack.creds \
opnfv/functest /bin/bash
Compass installer local development env usage Tips

In a local compass-functest test case development and verification environment, some parameters must be configured during container creation so that the OpenStack services can be reached from inside the Functest container; these parameters are not obvious to a newcomer. This section provides a guideline. The parameter values given below are defaults and should be adjusted to match your own settings; the complete steps are listed so that the procedure does not appear out of context.

1, Pull Functest docker image from public dockerhub:

docker pull opnfv/functest:<Tag>

<Tag> here can be "brahmaputra.1.0", "colorado.1.0", etc. Omitting the tag means the latest docker image:

docker pull opnfv/functest

2, Functest Docker container creation

Create a file holding the required environment variables, for example 'functest-docker-env':

OS_AUTH_URL=http://172.16.1.222:35357/v2.0
OS_USERNAME=admin
OS_PASSWORD=console
OS_TENANT_NAME=admin
OS_VOLUME_API_VERSION=2
OS_PROJECT_NAME=admin
INSTALLER_TYPE=compass
INSTALLER_IP=192.168.200.2
EXTERNAL_NETWORK=ext-net

Note: please adjust the content according to the environment; for example, 'TENANT_ID' may be needed in some special cases.

Then create the Functest docker container:

docker run --privileged=true --rm -t \
--env-file functest-docker-env \
--name <Functest_Container_Name> \
opnfv/functest:<Tag> /bin/bash

3, Attach to the Functest container

Before trying to attach to the Functest container, its status can be checked with:

docker ps -a

To attach to the Functest container in the 'Up' state and start a bash shell:

docker exec -it <Functest_Container_Name> bash

4, Functest environment preparation and check

See the section Preparing the Functest environment below.

Functest docker container directory structure

Inside the Functest docker container, the following directory structure should now be in place:

`-- home
    `-- opnfv
      |-- functest
      |   |-- conf
      |   |-- data
      |   `-- results
      `-- repos
          |-- bgpvpn
          |-- copper
          |-- doctor
          |-- domino
          |-- functest
          |-- kingbird
          |-- odl_test
          |-- onos
          |-- parser
          |-- promise
          |-- rally
          |-- refstack-client
          |-- releng
          |-- sdnvpn
          |-- securityscanning
          |-- sfc
          |-- tempest
          |-- vims_test
          `-- vnfs

Underneath the ‘/home/opnfv/’ directory, the Functest docker container includes two main directories:

  • The functest directory stores configuration files (e.g. the OpenStack creds are stored in path ‘/home/opnfv/functest/conf/openstack.creds’), the data directory stores a ‘cirros’ test image used in some functional tests and the results directory stores some temporary result log files
  • The repos directory holds various repositories. The directory ‘/home/opnfv/repos/functest’ is used to prepare the needed Functest environment and to run the tests. The other repository directories are used for the installation of the needed tooling (e.g. rally) or for the retrieval of feature projects scenarios (e.g. promise)

The structure under the functest repository can be described as follows:

. |-- INFO
  |-- LICENSE
  |-- requirements.txt
  |-- run_unit_tests.sh
  |-- setup.py
  |-- test-requirements.txt
  |-- commons
  |   |-- ims
  |   |-- mobile
  |   `-- traffic-profile-guidelines.rst
  |-- docker
  |   |-- Dockerfile
  |   |-- config_install_env.sh
  |   `-- docker_remote_api
  |-- docs
  |   |-- com
  |   |-- configguide
  |   |-- devguide
  |   |-- images
  |   |-- internship
  |   |-- release-notes
  |   |-- results
  |   `-- userguide
  |-- functest
      |-- __init__.py
      |-- ci
      |   |-- __init__.py
      |   |-- check_os.sh
      |   |-- config_functest.yaml
      |   |-- config_patch.yaml
      |   |-- generate_report.py
      |   |-- prepare_env.py
      |   |-- run_tests.py
      |   |-- testcases.yaml
      |   |-- tier_builder.py
      |   `-- tier_handler.py
      |-- cli
      |   |-- __init__.py
      |   |-- cli_base.py
      |   |-- commands
      |   |-- functest-complete.sh
      |   `-- setup.py
      |-- core
      |   |-- __init__.py
      |   |-- feature_base.py
      |   |-- pytest_suite_runner.py
      |   |-- testcase.py
      |   |-- vnf_base.py
      |-- opnfv_tests
      |   |-- __init__.py
      |   |-- features
      |   |-- mano
      |   |-- openstack
      |   |-- sdn
      |   |-- security_scan
      |   `-- vnf
      |-- tests
      |   |-- __init__.py
      |   `-- unit
      `-- utils
          |-- __init__.py
          |-- config.py
          |-- constants.py
          |-- env.py
          |-- functest_logger.py
          |-- functest_utils.py
          |-- openstack
          |-- openstack_clean.py
          |-- openstack_snapshot.py
          |-- openstack_tacker.py
          `-- openstack_utils.py


  (Note: All *.pyc files removed from above list for brevity...)

We may distinguish several directories; the first level has 4 directories:

  • commons: This directory is dedicated to the storage of traffic profiles or any other test inputs that could be reused by any test project.
  • docker: This directory includes the files and tools needed to build the Functest Docker image.
  • docs: This directory includes documentation: Release Notes, User Guide, Configuration Guide and Developer Guide.
  • functest: This directory contains all the code needed to run functest internal cases and OPNFV onboarded feature or VNF test cases.
The functest directory has 6 sub-directories:
  • ci: This directory contains test structure definition files (e.g <filename>.yaml) and bash shell/python scripts used to configure and execute Functional tests. The test execution script can be executed under the control of Jenkins CI jobs.
  • cli: This directory holds the python based Functest CLI utility source code, which is based on the Python ‘click’ framework.
  • core: This directory holds the python based Functest core source code. Three abstraction classes have been created to ease the integration of internal, feature or vnf cases.
  • opnfv_tests: This directory includes the scripts required by Functest internal test cases and other feature projects test cases.
  • tests: This directory includes the functest unit tests
  • utils: this directory holds Python source code for some general purpose helper utilities, which testers can also re-use in their own test code. See for an example the Openstack helper utility: ‘openstack_utils.py’.
Useful Docker commands

Typing exit at the container prompt will exit the container and probably stop it. When a running Docker container is stopped, all the changes will be lost. There is a keyboard shortcut to quit the container without stopping it: <CTRL>-P + <CTRL>-Q. To reconnect to the running container DO NOT use the run command again (since it will create a new container); use the exec or attach command instead:

docker ps  # <check the container ID from the output>
docker exec -ti <CONTAINER_ID> /bin/bash

There are other useful Docker commands that might be needed to manage possible issues with the containers.

List the running containers:

docker ps

List all the containers including the stopped ones:

docker ps -a

Start a stopped container named “FunTest”:

docker start FunTest

Attach to a running container named “StrikeTwo”:

docker attach StrikeTwo

It is useful sometimes to remove a container if there are some problems:

docker rm <CONTAINER_ID>

Use the -f option if the container is still running; it forces the container to be destroyed:

docker rm -f <CONTAINER_ID>

Check the Docker documentation dockerdocs for more information.

Preparing the Functest environment

Once the Functest docker container is up and running, the required Functest environment needs to be prepared. A custom built functest CLI utility is available to perform the needed environment preparation action. Once the environment is prepared, the functest CLI utility can be used to run different functional tests. The usage of the functest CLI utility to run tests is described further in the Functest User Guide OPNFV_FuncTestUserGuide

Prior to commencing the Functest environment preparation, we can check the initial status of the environment. Issue the functest env status command at the prompt:

functest env status
Functest environment is not installed.

Note: When the Functest environment is prepared, the command will
return the status: "Functest environment ready to run tests."

To prepare the Functest docker container for test case execution, issue the functest env prepare command at the prompt:

functest env prepare

This script will make sure that the requirements to run the tests are met and will install the libraries and tools needed by all Functest test cases. It should be run only once, every time the Functest docker container is started from scratch. If you try to run this command on an already prepared environment, you will be prompted whether you really want to continue or not:

functest env prepare
It seems that the environment has been already prepared.
Do you want to do it again? [y|n]

(Type 'n' to abort the request, or 'y' to repeat the
 environment preparation)

To list some basic information about an already prepared Functest docker container environment, issue the functest env show command at the prompt:

functest env show
+======================================================+
| Functest Environment info                            |
+======================================================+
|  INSTALLER: apex, 192.168.122.89                     |
|   SCENARIO: os-odl_l2-nofeature-ha                   |
|        POD: localhost                                |
| GIT BRANCH: master                                   |
|   GIT HASH: 5bf1647dec6860464eeb082b2875798f0759aa91 |
| DEBUG FLAG: false                                    |
+------------------------------------------------------+
|     STATUS: ready                                    |
+------------------------------------------------------+

Where:

INSTALLER:  Displays the INSTALLER_TYPE value
            - here = "apex"
            and the INSTALLER_IP value
            - here = "192.168.122.89"
SCENARIO:   Displays the DEPLOY_SCENARIO value
            - here = "os-odl_l2-nofeature-ha"
POD:        Displays the value passed in NODE_NAME
            - here = "localhost"
GIT BRANCH: Displays the git branch of the OPNFV Functest
            project repository included in the Functest
            Docker Container.
            - here = "master"
                     (In first official colorado release
                      would be "colorado.1.0")
GIT HASH:   Displays the git hash of the OPNFV Functest
            project repository included in the Functest
            Docker Container.
            - here = "5bf1647dec6860464eeb082b2875798f0759aa91"
DEBUG FLAG: Displays the CI_DEBUG value
            - here = "false"

NOTE: In Jenkins CI runs, an additional item "BUILD TAG"
      would also be listed. The value is set by Jenkins CI.

Finally, the functest CLI has a --help option:

Some examples:

functest --help
Usage: functest [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  -h, --help Show this message and exit.

Commands:
  env
  openstack
  testcase
  tier

functest env --help
Usage: functest env [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help Show this message and exit.

Commands:
  prepare  Prepares the Functest environment.
  show     Shows information about the current...
  status   Checks if the Functest environment is ready...
Checking Openstack and credentials

It is recommended and fairly straightforward to check that Openstack and credentials are working as expected.

Once the credentials are there inside the container, they should be sourced before running any Openstack commands:

source /home/opnfv/functest/conf/openstack.creds

After this, try to run any OpenStack command to see if you get any output, for instance:

openstack user list

This will return a list of the actual users in the OpenStack deployment. In any other case, check that the credentials are sourced:

env|grep OS_

This command must show a set of environment variables starting with OS_, for example:

OS_REGION_NAME=RegionOne
OS_DEFAULT_DOMAIN=default
OS_PROJECT_NAME=admin
OS_PASSWORD=admin
OS_AUTH_STRATEGY=keystone
OS_AUTH_URL=http://172.30.10.3:5000/v2.0
OS_USERNAME=admin
OS_TENANT_NAME=admin
OS_ENDPOINT_TYPE=internalURL
OS_NO_CACHE=true

If the OpenStack command still does not show anything or complains about connectivity issues, it could be due to an incorrect URL given to the OS_AUTH_URL environment variable. Check the deployment settings.
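
A quick way to verify that the identity endpoint itself is reachable from inside the container is to query OS_AUTH_URL directly; this is only a sketch (add the -k option if the endpoint uses a self-signed certificate, see the SSL Support section below).

curl $OS_AUTH_URL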

SSL Support

If you need to connect to a server that is TLS-enabled (the auth URL begins with "https") and it uses a certificate from a private CA or a self-signed certificate, then you will need to specify the path to an appropriate CA certificate to be used to validate the server certificate, via the environment variable OS_CACERT:

echo $OS_CACERT
/etc/ssl/certs/ca.crt

However, this certificate does not exist in the container by default. It has to be copied manually from the OpenStack deployment. This can be done in 2 ways:

  1. Create that file manually and copy the contents from the OpenStack controller.

  2. (Recommended) Add the file using a Docker volume when starting the container:

    -v <path_to_your_cert_file>:/etc/ssl/certs/ca.crt
    

You might need to export OS_CACERT environment variable inside the container:

export OS_CACERT=/etc/ssl/certs/ca.crt

Certificate verification can be turned off using OS_INSECURE=true. For example, Fuel uses self-signed certificates by default, so a preliminary step would be:

export OS_INSECURE=true
Proxy support

If your Jumphost node is operating behind a http proxy, then there are 2 places where some special actions may be needed to make operations succeed:

  1. Initial installation of the docker engine: First, try following the official Docker documentation for proxy settings. Some issues were experienced on a CentOS 7 based Jumphost. Some tips are documented in the section Docker Installation on CentOS behind http proxy below.
  2. Execution of the Functest environment preparation inside the created docker container: Functest needs internet access to download some resources for some test cases. This might not work properly if the Jumphost connects to the internet through a http proxy.

If that is the case, make sure the resolv.conf and the needed http_proxy and https_proxy environment variables, as well as the ‘no_proxy’ environment variable are set correctly:

# Make double sure that the 'no_proxy=...' line in the
# 'openstack.creds' file is commented out first. Otherwise, the
# values set into the 'no_proxy' environment variable below will
# be overwritten, each time the command
# 'source ~/functest/conf/openstack.creds' is issued.

cd ~/functest/conf/
sed -i 's/export no_proxy/#export no_proxy/' openstack.creds
source ./openstack.creds

# Next calculate some IP addresses for which http_proxy
# usage should be excluded:

publicURL_IP=$(echo $OS_AUTH_URL | grep -Eo "([0-9]+\.){3}[0-9]+")

adminURL_IP=$(openstack catalog show identity | \
grep adminURL | grep -Eo "([0-9]+\.){3}[0-9]+")

export http_proxy="<your http proxy settings>"
export https_proxy="<your https proxy settings>"
export no_proxy="127.0.0.1,localhost,$publicURL_IP,$adminURL_IP"

# Ensure that "git" uses the http_proxy
# This may be needed if your firewall forbids SSL based git fetch
git config --global http.sslVerify True
git config --global http.proxy <Your http proxy settings>

Validation check: Before running ‘functest env prepare’ CLI command, make sure you can reach http and https sites from inside the Functest docker container.

For example, try to use the nc command from inside the functest docker container:

nc -v opnfv.org 80
Connection to opnfv.org 80 port [tcp/http] succeeded!

nc -v opnfv.org 443
Connection to opnfv.org 443 port [tcp/https] succeeded!

Note: In a Jumphost node based on the CentOS family OS, the nc commands might not work. You can use the curl command instead.

curl http://www.opnfv.org:80
<HTML><HEAD><meta http-equiv="content-type" . . </BODY></HTML>

curl https://www.opnfv.org:443
<HTML><HEAD><meta http-equiv="content-type" . . </BODY></HTML>

(Ignore the content. If command returns a valid HTML page, it proves the connection.)

Docker Installation on CentOS behind http proxy

This section is applicable for a CentOS family OS on a Jumphost which itself is behind a proxy server. In that case, the instructions below should be followed before installing the docker engine:

1) # Make a directory '/etc/systemd/system/docker.service.d'
   # if it does not exist
   sudo mkdir /etc/systemd/system/docker.service.d

2) # Create a file called 'env.conf' in that directory with
   # the following contents:
   [Service]
   EnvironmentFile=-/etc/sysconfig/docker

3) # Set up a file called 'docker' in directory '/etc/sysconfig'
   # with the following contents:
   HTTP_PROXY="<Your http proxy settings>"
   HTTPS_PROXY="<Your https proxy settings>"
   http_proxy="${HTTP_PROXY}"
   https_proxy="${HTTPS_PROXY}"

4) # Reload the daemon
   systemctl daemon-reload

5) # Sanity check - check the following docker settings:
   systemctl show docker | grep -i env

   Expected result:
   ----------------
   EnvironmentFile=/etc/sysconfig/docker (ignore_errors=yes)
   DropInPaths=/etc/systemd/system/docker.service.d/env.conf

Now follow the instructions in [InstallDockerCentOS] to download and install the docker-engine. The instructions conclude with a "test pull" of a sample "Hello World" docker container. This should now work with the above prerequisite actions.
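
For reference, that sanity "test pull" is typically the well-known hello-world container, for example:

sudo docker run hello-world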

Integration in CI

In CI we use the Docker image and execute the appropriate commands within the container from Jenkins.

Docker creation in set-functest-env builder [3]:

envs="-e INSTALLER_TYPE=${INSTALLER_TYPE} -e INSTALLER_IP=${INSTALLER_IP} -e NODE_NAME=${NODE_NAME}"
[...]
docker pull opnfv/functest:$DOCKER_TAG >/dev/null
cmd="sudo docker run -id ${envs} ${volumes} ${custom_params} ${TESTCASE_OPTIONS} opnfv/functest:${DOCKER_TAG} /bin/bash"
echo "Functest: Running docker run command: ${cmd}"
${cmd} >${redirect}
sleep 5
container_id=$(docker ps | grep "opnfv/functest:${DOCKER_TAG}" | awk '{print $1}' | head -1)
echo "Container ID=${container_id}"
if [ -z ${container_id} ]; then
    echo "Cannot find opnfv/functest container ID ${container_id}. Please check if it is existing."
    docker ps -a
    exit 1
fi
echo "Starting the container: docker start ${container_id}"
docker start ${container_id}
sleep 5
docker ps >${redirect}
if [ $(docker ps | grep "opnfv/functest:${DOCKER_TAG}" | wc -l) == 0 ]; then
    echo "The container opnfv/functest with ID=${container_id} has not been properly started. Exiting..."
    exit 1
fi

cmd="python ${FUNCTEST_REPO_DIR}/functest/ci/prepare_env.py start"
echo "Executing command inside the docker: ${cmd}"
docker exec ${container_id} ${cmd}

Test execution in functest-all builder [3]:

branch=${GIT_BRANCH##*/}
echo "Functest: run $FUNCTEST_SUITE_NAME on branch ${branch}"
cmd="functest testcase run $FUNCTEST_SUITE_NAME"
fi
container_id=$(docker ps -a | grep opnfv/functest | awk '{print $1}' | head -1)
docker exec $container_id $cmd
ret_value=$?
exit $ret_value

Docker cleanup in the functest-cleanup builder [3] calls docker rm and docker rmi.

References

OPNFV main site

Functest page

IRC support channel: #opnfv-functest

OPNFV FUNCTEST user guide
Version history
Date        Ver.    Author                          Comment
2016-08-17  1.0.0   Juha Haapavirta, Column Gaynor  Colorado release
2017-01-23  1.0.1   Morgan Richomme                 Adaptations for Danube
Introduction

The goal of this document is to describe the OPNFV Functest test cases and to provide a procedure to execute them. In the OPNFV Danube system release, a Functest CLI utility is introduced for an easier execution of test procedures.

IMPORTANT: It is assumed here that the Functest Docker container is already properly deployed and that all instructions described in this guide are to be performed from inside the deployed Functest Docker container.

Overview of the Functest suites

Functest is the OPNFV project primarily targeting function testing. In the Continuous Integration pipeline, it is launched after an OPNFV fresh installation to validate and verify the basic functions of the infrastructure.

The current list of test suites can be distributed over 5 main domains: VIM (Virtualised Infrastructure Manager), Controllers (i.e. SDN Controllers), Features, VNF (Virtual Network Functions) and MANO stacks.

Functest test suites are also distributed in the OPNFV testing categories: healthcheck, smoke, features, components, performance, VNF, Stress tests.

All the Healthcheck and smoke tests of a given scenario must be successful to validate the scenario for the release.

Domain Tier Test case Comments
VIM healthcheck connection_check Check OpenStack connectivity through SNAPS framework
api_check Check OpenStack API through SNAPS framework
snaps_health_check Basic instance creation, check DHCP
smoke vping_ssh NFV “Hello World” using an SSH connection to a destination VM over a created floating IP address on the SUT Public / External network. Using the SSH connection a test script is then copied to the destination VM and then executed via SSH. The script will ping another VM on a specified IP address over the SUT Private Tenant network.
vping_userdata Uses Ping with given userdata to test VM-to-VM connectivity over the SUT Private Tenant network. The correct operation of the NOVA Metadata service is also verified in this test.
tempest_smoke_serial Generate and run a relevant Tempest Test Suite in smoke mode. The generated test set depends on the OpenStack deployment environment.
rally_sanity Run a subset of the OpenStack Rally Test Suite in smoke mode
snaps_smoke Run the SNAPS smoke test suite (instance creation with and without Floating IPs; see the snaps_smoke section below)
refstack_defcore Reference RefStack suite: Tempest test selection for NFV
components tempest_full_parallel Generate and run a full set of the OpenStack Tempest Test Suite. See the OpenStack reference test suite [2]. The generated test set depends on the OpenStack deployment environment.
rally_full Run the OpenStack Rally benchmarking tool against the OpenStack modules. See the Rally documents [3].
tempest_custom Allows running a customized list of Tempest test cases
Controllers smoke odl OpenDaylight test suite: a limited suite checking basic Neutron (Layer 2) operations, mainly based on upstream testcases. See below for details
onos Test suite of ONOS L2 and L3 functions. See ONOSFW User Guide for details.
odl_netvirt Test Suite for the OpenDaylight SDN Controller when the NetVirt features are installed. It integrates some test suites from upstream using Robot as the test framework
fds Test Suite for the OpenDaylight SDN Controller when the GBP features are installed. It integrates some test suites from upstream using Robot as the test framework
Features features bgpvpn Implementation of the OpenStack bgpvpn API from the SDNVPN feature project. It allows for the creation of BGP VPNs. See SDNVPN User Guide for details
doctor The Doctor platform, as of the Colorado release, provides three features: immediate notification, consistent resource state awareness for compute host down, and valid compute host status given to the VM owner. See Doctor User Guide for details
domino Domino provides TOSCA template distribution service for network service and VNF descriptors among MANO components e.g., NFVO, VNFM, VIM, SDN-C, etc., as well as OSS/BSS functions. See Domino User Guide for details
multisite Multisite See Multisite User Guide for details
netready Testing from netready project ping using gluon
odl-sfc SFC testing for odl scenarios See SFC User Guide for details
parser Parser is an integration project which aims to provide placement/deployment template translation for the OPNFV platform, including TOSCA -> HOT, POLICY -> TOSCA and YANG -> TOSCA. It deals with a fake vRNC. See Parser User Guide for details
promise Resource reservation and management project to identify NFV related requirements and realize resource reservation for future usage by capacity management of resource pools regarding compute, network and storage. See Promise User Guide for details.
security_scan Implementation of a simple security scan. (Currently available only for the Apex installer environment)
VNF vnf cloudify_ims Example of a real VNF deployment to show the NFV capabilities of the platform. The IP Multimedia Subsystem is a typical Telco test case, referenced by ETSI. It provides a fully functional VoIP System
orchestra_ims OpenIMS deployment using Openbaton orchestrator
vyos_vrouter vRouter testing

As shown in the above table, Functest is structured into different ‘domains’, ‘tiers’ and ‘test cases’. Each ‘test case’ usually represents an actual ‘Test Suite’, comprised in turn of several test cases internally.

Test cases also have an implicit execution order. For example, if the early ‘healthcheck’ Tier testcases fail, or if there are any failures in the ‘smoke’ Tier testcases, there is little point in launching a full testcase execution round.

In Danube, the smoke and SDN controller tiers were merged into a single smoke tier.

An overview of the Functest Structural Concept is depicted graphically below:

Functest Concepts Structure

Some of the test cases are developed by Functest team members, whereas others are integrated from upstream communities or other OPNFV projects. For example, Tempest is the OpenStack integration test suite and Functest is in charge of the selection, integration and automation of the tests that are suitable for OPNFV.

The Tempest test suite is the default OpenStack smoke test suite; no new Tempest test cases have been created in OPNFV Functest.

The results produced by the tests run from CI are pushed and collected into a NoSQL database. The goal is to populate the database with results from different sources and scenarios and to show them on a Functest Dashboard. A screenshot of a live Functest Dashboard is shown below:

Functest Dashboard

Basic components (VIM, SDN controllers) are tested through their own suites. Feature projects also provide their own test suites with different ways of running their tests.

The notion of domain has been introduced in the description of the test cases stored in the database. This parameter, as well as possible tags, can be used for the test case catalog.

The vIMS test case was integrated to demonstrate the capability to deploy a relatively complex NFV scenario on top of the OPNFV infrastructure.

Functest considers OPNFV as a black box. As of the Danube release, OPNFV offers many potential combinations:

  • 3 controllers (OpenDaylight, ONOS, OpenContrail)
  • 4 installers (Apex, Compass, Fuel, Joid)

Most of the tests can run with any combination, but some tests might have restrictions imposed by the installer used or by the deployed features. The system uses the environment variables (INSTALLER_IP and DEPLOY_SCENARIO) to automatically determine the valid test cases for each given environment.
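
When running the Functest container manually, the same variables can simply be passed to the docker run command; a minimal sketch, where all values (installer type, IP, scenario and image tag) are examples to be adapted to your deployment:

export INSTALLER_TYPE=fuel
export INSTALLER_IP=10.20.0.2
export DEPLOY_SCENARIO=os-odl_l2-nofeature-ha

docker run -ti \
    -e INSTALLER_TYPE=${INSTALLER_TYPE} \
    -e INSTALLER_IP=${INSTALLER_IP} \
    -e DEPLOY_SCENARIO=${DEPLOY_SCENARIO} \
    opnfv/functest:latest /bin/bash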

A convenience Functest CLI utility is also available to simplify setting up the Functest environment, managing the OpenStack environment (e.g. resource clean-up) and executing tests. The Functest CLI organises the testcases into logical Tiers, each of which contains one or more testcases. The CLI allows execution of a single specified testcase, of all test cases in a specified Tier, or, as a special case, of ALL testcases. The Functest CLI is introduced in more detail in the section Executing the functest suites of this document.

The different test cases are described in the remaining sections of this document.

VIM (Virtualized Infrastructure Manager)
Healthcheck tests

In Danube, the healthcheck tests have been refactored and rely on SNAPS, an OPNFV middleware project.

SNAPS stands for “SDN/NFV Application development Platform and Stack”. SNAPS is an object-oriented OpenStack library packaged with tests that exercise OpenStack. More information on SNAPS can be found in [13]

Three tests are declared as healthcheck tests and can be used for gating by the installer; functionally, they cover the tests previously performed by the healthcheck test case.

The tests are:

  • connection_check
  • api_check
  • snaps_health_check

Connection_check consists of 9 test cases (test duration < 5s) checking the connectivity with Glance, Keystone, Neutron, Nova and the external network.

Api_check verifies the retrieval of the OpenStack clients (Keystone, Glance, Neutron and Nova) and may perform some simple queries. When the config value snaps.use_keystone is True, Functest must have access to the cloud’s private network. This suite consists of 49 tests (test duration < 2 minutes).

snaps_health_check creates an instance, allocates a floating IP and connects to the VM. This test replaces the previous Colorado healthcheck test.

Naturally, successful completion of the ‘healthcheck’ tests is a necessary prerequisite for the execution of all other test Tiers.
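
If needed, the three tests can also be launched individually with the Functest CLI described later in this document, for example:

functest testcase run connection_check
functest testcase run api_check
functest testcase run snaps_health_check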

vPing_ssh

Given the script ping.sh:

#!/bin/sh
# Ping the IP address given as first argument once per second
# until it answers, then print 'vPing OK' and exit.
while true; do
    ping -c 1 $1 2>&1 >/dev/null
    RES=$?
    if [ "Z$RES" = "Z0" ] ; then
        echo 'vPing OK'
        break
    else
        echo 'vPing KO'
    fi
    sleep 1
done

The goal of this test is to establish an SSH connection using a floating IP on the Public/External network and verify that 2 instances can talk over a Private Tenant network:

vPing_ssh test case
+-------------+                    +-------------+
|             |                    |             |
|             | Boot VM1 with IP1  |             |
|             +------------------->|             |
|   Tester    |                    |   System    |
|             | Boot VM2           |    Under    |
|             +------------------->|     Test    |
|             |                    |             |
|             | Create floating IP |             |
|             +------------------->|             |
|             |                    |             |
|             | Assign floating IP |             |
|             | to VM2             |             |
|             +------------------->|             |
|             |                    |             |
|             | Establish SSH      |             |
|             | connection to VM2  |             |
|             | through floating IP|             |
|             +------------------->|             |
|             |                    |             |
|             | SCP ping.sh to VM2 |             |
|             +------------------->|             |
|             |                    |             |
|             | VM2 executes       |             |
|             | ping.sh to VM1     |             |
|             +------------------->|             |
|             |                    |             |
|             |    If ping:        |             |
|             |      exit OK       |             |
|             |    else (timeout): |             |
|             |      exit Failed   |             |
|             |                    |             |
+-------------+                    +-------------+

This test can be considered as a “Hello World” example. It is the first basic use case which must work on any deployment.

vPing_userdata

This test case is similar to vPing_ssh but does not use Floating IPs or the Public/External network to transfer the ping script. Instead, it uses the Nova metadata service to pass the script to the instance at boot time. Like vPing_ssh, it checks that 2 instances can talk to each other on a Private Tenant network:

vPing_userdata test case
+-------------+                    +-------------+
|             |                    |             |
|             | Boot VM1 with IP1  |             |
|             +------------------->|             |
|             |                    |             |
|             | Boot VM2 with      |             |
|             | ping.sh as userdata|             |
|             | with IP1 as $1.    |             |
|             +------------------->|             |
|   Tester    |                    |   System    |
|             |VM2 executes ping.sh|    Under    |
|             | (ping IP1)         |     Test    |
|             +------------------->|             |
|             |                    |             |
|             | Monitor nova       |             |
|             |  console-log VM 2  |             |
|             |    If ping:        |             |
|             |      exit OK       |             |
|             |    else (timeout)  |             |
|             |      exit Failed   |             |
|             |                    |             |
+-------------+                    +-------------+

When the second VM boots, it automatically executes the script passed as userdata. The ping is detected by periodically capturing the output in the console log of the second VM.
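
The same condition can be checked manually from the Functest container once the OpenStack credentials are sourced; a sketch where the VM name is only an example:

# Poll the console log of the second VM until the userdata script reports success
until nova console-log opnfv-vping-2 2>/dev/null | grep -q "vPing OK"; do
    sleep 10
done
echo "vPing OK detected in the VM2 console log"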

Tempest

Tempest [2] is the reference OpenStack Integration test suite. It is a set of integration tests to be run against a live OpenStack cluster. Tempest has suites of tests for:

  • OpenStack API validation
  • Scenarios
  • Other specific tests useful in validating an OpenStack deployment

Functest uses Rally [3] to run the Tempest suite. Rally automatically generates the Tempest configuration file tempest.conf. Before running the actual test cases, Functest creates the needed resources (user, tenant) and updates the appropriate parameters in the configuration file.

When the Tempest suite is executed, each test duration is measured and the full console output is stored to a log file for further analysis.

The Tempest testcases are distributed across two Tiers:

  • Smoke Tier - Test Case ‘tempest_smoke_serial’
  • Components Tier - Test case ‘tempest_full_parallel’

NOTE: Test case ‘tempest_smoke_serial’ executes a defined set of tempest smoke tests with a single thread (i.e. serial mode). Test case ‘tempest_full_parallel’ executes all defined Tempest tests using several concurrent threads (i.e. parallel mode). The number of threads activated corresponds to the number of available logical CPUs.

The goal of the Tempest test suite is to check the basic functionalities of the different OpenStack components on an OPNFV fresh installation, using the corresponding REST API interfaces.

Rally bench test suites

Rally [3] is a benchmarking tool that answers the question:

How does OpenStack work at scale?

The goal of this test suite is to benchmark all the different OpenStack modules and get significant figures that could help to define Telco Cloud KPIs.

The OPNFV Rally scenarios are based on the collection of the actual Rally scenarios:

  • authenticate
  • cinder
  • glance
  • heat
  • keystone
  • neutron
  • nova
  • quotas
  • requests

A basic SLA (stop test on errors) has been implemented.

The Rally testcases are distributed across two Tiers:

  • Smoke Tier - Test Case ‘rally_sanity’
  • Components Tier - Test case ‘rally_full’

NOTE: Test case ‘rally_sanity’ executes a limited number of Rally smoke test cases. Test case ‘rally_full’ executes the full defined set of Rally tests.

Refstack-client to run Defcore testcases

Refstack-client [8] is a command line utility that allows you to execute Tempest test runs based on configurations you specify. It is the official tool to run Defcore [9] testcases, which focuses on testing interoperability between OpenStack clouds.

Refstack-client is integrated in Functest and consumed by Dovetail, which intends to define and provide a set of OPNFV-related validation criteria that will provide input for the evaluation of the use of OPNFV trademarks. This process follows the guidelines of the Compliance Verification Program (CVP).

Defcore testcases

Danube Release

The set of DefCore Tempest test cases that are required and not flagged. According to [10], some tests are still flagged due to outstanding bugs in the Tempest library, particularly tests that require SSH. Refstack developers are working on correcting these bugs upstream. Please note that although some tests are flagged because of bugs, there is still an expectation that the capabilities covered by the tests are available. The selection only contains OpenStack core compute (no object storage). The approved guidelines (2016.08) are valid for the Kilo, Liberty, Mitaka and Newton releases of OpenStack. The list can be generated using the REST API of the RefStack project: https://refstack.openstack.org/api/v1/guidelines/2016.08/tests?target=compute&type=required&alias=true&flag=false
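
For example, the list can be fetched from that API and saved locally (the output file name is arbitrary) for later use with refstack-client:

curl -s "https://refstack.openstack.org/api/v1/guidelines/2016.08/tests?target=compute&type=required&alias=true&flag=false" \
     -o defcore_2016.08.txt
wc -l defcore_2016.08.txt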

Running methods

After the integration of refstack-client into Functest, two running methods are provided: the Functest command line, and manual execution.

By default, the Defcore test cases run from the Functest command line use an automatically generated configuration file, i.e. refstack_tempest.conf. In some circumstances the automatically generated configuration file may not fully match the SUT; Functest therefore also inherits the refstack-client command line and provides a way for users to set the configuration file manually according to their own SUT.

command line

Inside the Functest container, first prepare the Functest environment:

cd /home/opnfv/repos/functest
pip install -e .
functest env prepare

then run the default Defcore testcases using refstack-client:

functest testcase run refstack_defcore

In the OPNFV Continuous Integration (CI) system, the command line method is used.

manually

Inside the Functest container, first prepare the refstack virtualenv:

cd /home/opnfv/repos/refstack-client
source .venv/bin/activate

then prepare the Tempest configuration file and the list of testcases to run against the SUT, and run them with:

./refstack-client test -c <Path of the tempest configuration file to use> -v --test-list <Path or URL of test list>

Use the help options for more information:

./refstack-client --help
./refstack-client test --help
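
As an illustration only (the file paths are hypothetical), a complete manual invocation could look like:

./refstack-client test \
    -c ~/refstack_tempest.conf \
    -v \
    --test-list ~/defcore_2016.08.txt
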
Reference tempest configuration

command line method

When the command line method is used, the default Tempest configuration file is generated by Rally.

manually

When running manually, the recommended way to generate the Tempest configuration file is:

cd /home/opnfv/repos/functest/functest/opnfv_tests/openstack/refstack_client
python tempest_conf.py

A file called tempest.conf is stored in the current path by default; users can adjust it according to the SUT:

vim refstack_tempest.conf

A reference article can be used [15].

snaps_smoke

This test case contains tests that set up and destroy environments with VMs, with and without Floating IPs, using a newly created user and project. Set the config value snaps.use_floating_ips (True|False) to toggle this functionality. When the config value snaps.use_keystone is True, Functest must have access to the cloud’s private network. This suite consists of 38 tests (test duration < 10 minutes).
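
As a sketch only (the configuration file path and key layout are assumptions based on the default repository layout inside the container, and may differ in your environment), these values can be inspected and toggled as follows:

# Show the SNAPS-related settings currently used by Functest
grep -A 5 "snaps" /home/opnfv/repos/functest/functest/ci/config_functest.yaml

# Example: disable the floating IP tests (e.g. no external network available)
sed -i 's/use_floating_ips:.*/use_floating_ips: False/' \
    /home/opnfv/repos/functest/functest/ci/config_functest.yaml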

SDN Controllers

There are currently 3 available controllers:

  • OpenDaylight (ODL)
  • ONOS
  • OpenContrail (OCL)
OpenDaylight

The OpenDaylight (ODL) test suite consists of a set of basic tests inherited from the ODL project using the Robot [11] framework. The suite verifies creation and deletion of networks, subnets and ports with OpenDaylight and Neutron.

The list of tests can be described as follows:

  • Basic Restconf test cases
    • Connect to Restconf URL
    • Check the HTTP code status
  • Neutron Reachability test cases
    • Get the complete list of neutron resources (networks, subnets, ports)
  • Neutron Network test cases
    • Check OpenStack networks
    • Check OpenDaylight networks
    • Create a new network via OpenStack and check the HTTP status code returned by Neutron
    • Check that the network has also been successfully created in OpenDaylight
  • Neutron Subnet test cases
    • Check OpenStack subnets
    • Check OpenDaylight subnets
    • Create a new subnet via OpenStack and check the HTTP status code returned by Neutron
    • Check that the subnet has also been successfully created in OpenDaylight
  • Neutron Port test cases
    • Check OpenStack Neutron for known ports
    • Check OpenDaylight ports
    • Create a new port via OpenStack and check the HTTP status code returned by Neutron
    • Check that the new port has also been successfully created in OpenDaylight
  • Delete operations
    • Delete the port previously created via OpenStack
    • Check that the port has also been successfully deleted in OpenDaylight
    • Delete the previously created subnet via OpenStack
    • Check that the subnet has also been successfully deleted in OpenDaylight
    • Delete the network created via OpenStack
    • Check that the network has also been successfully deleted in OpenDaylight

Note: the checks in OpenDaylight are based on the HTTP status codes returned by OpenDaylight.
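
For manual troubleshooting, a similar Restconf status check can be reproduced with curl; a sketch assuming the OpenDaylight default credentials (admin/admin), the default Restconf port 8181, and a controller IP exported by the user:

export SDN_CONTROLLER_IP=<controller-ip>   # replace with your OpenDaylight IP
# Print only the HTTP status code of the Restconf endpoint (200 means reachable)
curl -s -o /dev/null -w "%{http_code}\n" \
     -u admin:admin "http://${SDN_CONTROLLER_IP}:8181/restconf/modules"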

ONOS

The TestON framework is used to test the ONOS SDN controller functions. The test cases deal with L2 and L3 functions. The ONOS test suite can be run on any ONOS-compliant scenario.

The test cases are described as follows:

  • onosfunctest: The main executable file contains the initialization of the docker environment and functions called by FUNCvirNetNB and FUNCvirNetNBL3
  • FUNCvirNetNB
    • Create Network: Post Network data and check it in ONOS
    • Update Network: Update the Network and compare it in ONOS
    • Delete Network: Delete the Network and check if it’s NULL in ONOS or not
    • Create Subnet: Post Subnet data and check it in ONOS
    • Update Subnet: Update the Subnet and compare it in ONOS
    • Delete Subnet: Delete the Subnet and check if it’s NULL in ONOS or not
    • Create Port: Post Port data and check it in ONOS
    • Update Port: Update the Port and compare it in ONOS
    • Delete Port: Delete the Port and check if it’s NULL in ONOS or not
  • FUNCvirNetNBL3
    • Create Router: Post data to create a Router and check it in ONOS
    • Update Router: Update the Router and compare it in ONOS
    • Delete Router: Delete the Router data and check it in ONOS
    • Create RouterInterface: Post Router Interface data to an existing Router and check it in ONOS
    • Delete RouterInterface: Delete the RouterInterface and check the Router
    • Create FloatingIp: Post data to create a FloatingIp and check it in ONOS
    • Update FloatingIp: Update the FloatingIp and compare it in ONOS
    • Delete FloatingIp: Delete the FloatingIp and check that it is ‘NULL’ in ONOS
    • Create External Gateway: Post data to create an External Gateway for an existing Router and check it in ONOS
    • Update External Gateway: Update the External Gateway and compare the change
    • Delete External Gateway: Delete the External Gateway and check that it is ‘NULL’ in ONOS
Features

In Danube, Functest supports the integration of:

  • barometer
  • bgpvpn
  • doctor
  • domino
  • fds
  • multisite
  • netready
  • odl-sfc
  • promise
  • security_scan

Note: copper is not supported in Danube.

Please refer to the dedicated feature user guides for details.

VNF
cloudify_ims

The IP Multimedia Subsystem or IP Multimedia Core Network Subsystem (IMS) is an architectural framework for delivering IP multimedia services.

vIMS has been integrated in Functest to demonstrate the capability to deploy a relatively complex NFV scenario on the OPNFV platform. Deploying a complete, functional VNF allows testing most of the essential functions needed for an NFV platform.

The goals of this test suite are to:

  • deploy a VNF orchestrator (Cloudify)
  • deploy a Clearwater vIMS (IP Multimedia Subsystem) VNF from this orchestrator based on a TOSCA blueprint defined in [5]
  • run a suite of signaling tests on top of this VNF

The Clearwater architecture is described as follows:

vIMS architecture
orchestra_ims

The Orchestra test case deals with the deployment of OpenIMS with the OpenBaton orchestrator.

parser

See parser user guide for details: [12]

vyos-vrouter

This test case deals with the deployment and testing of a vyos vrouter with the Cloudify orchestrator. The test case verifies the interchangeability of the BGP protocol using vyos.

The workflow is as follows:
  • Deploy
    Deploy the VNF testing topology with Cloudify using a blueprint.
  • Configuration
    Set the configuration of the target VNF and the reference VNF using SSH.
  • Run
    Execute the test commands for the test items described in a YAML format file. Check VNF status and behavior.
  • Reporting
    Output a report based on the results, in JSON format.

The vyos-vrouter architecture is described in [14].

Executing the functest suites
Manual testing
This section assumes the following:
  • The Functest Docker container is running
  • The docker prompt is shown
  • The Functest environment is ready (Functest CLI command ‘functest env prepare’ has been executed)

If any of the above steps are missing, please refer to the Functest Config Guide, as they are prerequisites. All the commands explained in this section must be performed inside the container.

The Functest CLI offers two commands (functest tier ...) and (functest testcase ... ) for the execution of Test Tiers or Test Cases:

root@22e436918db0:~/repos/functest/ci# functest tier --help
Usage: functest tier [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help  Show this message and exit.

Commands:
  get-tests  Prints the tests in a tier.
  list       Lists the available tiers.
  run        Executes all the tests within a tier.
  show       Shows information about a tier.
root@22e436918db0:~/repos/functest/ci# functest testcase --help

Usage: functest testcase [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help  Show this message and exit.

Commands:
  list  Lists the available testcases.
  run   Executes a test case.
  show  Shows information about a test case.

More details on the existing Tiers and Test Cases can be seen with the ‘list’ command:

root@22e436918db0:~/repos/functest/ci# functest tier list
    - 0. healthcheck:
           ['connection_check', 'api_check', 'snaps_health_check']
    - 1. smoke:
           ['vping_ssh', 'vping_userdata', 'tempest_smoke_serial', 'odl', 'rally_sanity', 'refstack_defcore', 'snaps_smoke']
    - 2. features:
           ['doctor', 'domino', 'promise', 'security_scan']
    - 3. components:
           ['tempest_full_parallel', 'rally_full']
    - 4. vnf:
           ['cloudify_ims', 'orchestra_ims', 'vyos_vrouter']

and

root@22e436918db0:~/repos/functest/ci# functest testcase list
api_check
connection_check
snaps_health_check
vping_ssh
vping_userdata
snaps_smoke
refstack_defcore
tempest_smoke_serial
rally_sanity
odl
tempest_full_parallel
rally_full
vyos_vrouter

Note that the list of test cases depends on the installer and the scenario.

More specific details on Tiers or Test Cases can be seen with the ‘show’ command:

root@22e436918db0:~/repos/functest/ci# functest tier show smoke
+======================================================================+
| Tier:  smoke                                                         |
+======================================================================+
| Order: 1                                                             |
| CI Loop: (daily)|(weekly)                                            |
| Description:                                                         |
|    Set of basic Functional tests to validate the OpenStack           |
|    deployment.                                                       |
| Test cases:                                                          |
|    - vping_ssh                                                       |
|    - vping_userdata                                                  |
|    - tempest_smoke_serial                                            |
|    - rally_sanity                                                    |
|                                                                      |
+----------------------------------------------------------------------+

and

root@22e436918db0:~/repos/functest/ci# functest testcase  show tempest_smoke_serial
+======================================================================+
| Testcase:  tempest_smoke_serial                                      |
+======================================================================+
| Description:                                                         |
|    This test case runs the smoke subset of the OpenStack Tempest     |
|    suite. The list of test cases is generated by Tempest             |
|    automatically and depends on the parameters of the OpenStack      |
|    deployment.                                                       |
| Dependencies:                                                        |
|   - Installer:                                                       |
|   - Scenario :                                                       |
|                                                                      |
+----------------------------------------------------------------------+

To execute a Test Tier or Test Case, the ‘run’ command is used:

root@22e436918db0:~/repos/functest/ci# functest tier run healthcheck
2017-03-21 13:34:21,400 - run_tests - INFO - ############################################
2017-03-21 13:34:21,400 - run_tests - INFO - Running tier 'healthcheck'
2017-03-21 13:34:21,400 - run_tests - INFO - ############################################
2017-03-21 13:34:21,401 - run_tests - INFO -

2017-03-21 13:34:21,401 - run_tests - INFO - ============================================
2017-03-21 13:34:21,401 - run_tests - INFO - Running test case 'connection_check'...
2017-03-21 13:34:21,401 - run_tests - INFO - ============================================
test_glance_connect_fail (snaps.openstack.utils.tests.glance_utils_tests.GlanceSmokeTests) ... ok
test_glance_connect_success (snaps.openstack.utils.tests.glance_utils_tests.GlanceSmokeTests) ... ok
test_keystone_connect_fail (snaps.openstack.utils.tests.keystone_utils_tests.KeystoneSmokeTests) ... ok
test_keystone_connect_success (snaps.openstack.utils.tests.keystone_utils_tests.KeystoneSmokeTests) ... ok
test_neutron_connect_fail (snaps.openstack.utils.tests.neutron_utils_tests.NeutronSmokeTests) ... ok
test_neutron_connect_success (snaps.openstack.utils.tests.neutron_utils_tests.NeutronSmokeTests) ... ok
test_retrieve_ext_network_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronSmokeTests) ... ok
test_nova_connect_fail (snaps.openstack.utils.tests.nova_utils_tests.NovaSmokeTests) ... ok
test_nova_connect_success (snaps.openstack.utils.tests.nova_utils_tests.NovaSmokeTests) ... ok

----------------------------------------------------------------------
Ran 9 tests in 3.768s

OK
2017-03-21 13:34:26,570 - functest.core.testcase_base - INFO - connection_check OK
2017-03-21 13:34:26,918 - functest.core.testcase_base - INFO - The results were successfully pushed to DB
2017-03-21 13:34:26,918 - run_tests - INFO - Test execution time: 00:05
2017-03-21 13:34:26,918 - run_tests - INFO -

2017-03-21 13:34:26,918 - run_tests - INFO - ============================================
2017-03-21 13:34:26,918 - run_tests - INFO - Running test case 'api_check'...
2017-03-21 13:34:26,919 - run_tests - INFO - ============================================
test_create_project_minimal (snaps.openstack.utils.tests.keystone_utils_tests.KeystoneUtilsTests) ... ok
test_create_user_minimal (snaps.openstack.utils.tests.keystone_utils_tests.KeystoneUtilsTests) ... ok
test_create_delete_user (snaps.openstack.tests.create_user_tests.CreateUserSuccessTests) ... ok
test_create_user (snaps.openstack.tests.create_user_tests.CreateUserSuccessTests) ... ok
test_create_user_2x (snaps.openstack.tests.create_user_tests.CreateUserSuccessTests) ...
2017-03-21 13:34:32,684 - create_user - INFO - Found user with name - CreateUserSuccessTests-7e741e11-c9fd-489-name ok
test_create_delete_project (snaps.openstack.tests.create_project_tests.CreateProjectSuccessTests) ... ok
test_create_project (snaps.openstack.tests.create_project_tests.CreateProjectSuccessTests) ... ok
test_create_project_2x (snaps.openstack.tests.create_project_tests.CreateProjectSuccessTests) ...
2017-03-21 13:34:35,922 - create_image - INFO - Found project with name - CreateProjectSuccessTests-b38e08ce-2862-48a-name ok
test_create_project_sec_grp_one_user (snaps.openstack.tests.create_project_tests.CreateProjectUserTests) ...
2017-03-21 13:34:37,907 - OpenStackSecurityGroup - INFO - Creating security group CreateProjectUserTests-ab8801f6-dad8-4f9-name...
2017-03-21 13:34:37,907 - neutron_utils - INFO - Retrieving security group with name - CreateProjectUserTests-ab8801f6-dad8-4f9-name
2017-03-21 13:34:38,376 - neutron_utils - INFO - Creating security group with name - CreateProjectUserTests-ab8801f6-dad8-4f9-name
2017-03-21 13:34:38,716 - neutron_utils - INFO - Retrieving security group rules associate with the security group - CreateProjectUserTests-ab8801f6-dad8-4f9-name
2017-03-21 13:34:38,762 - neutron_utils - INFO - Retrieving security group with ID - 821419cb-c54c-41b4-a61b-fb30e5dd2ec5
2017-03-21 13:34:38,886 - neutron_utils - INFO - Retrieving security group with ID - 821419cb-c54c-41b4-a61b-fb30e5dd2ec5
2017-03-21 13:34:39,000 - neutron_utils - INFO - Retrieving security group with name - CreateProjectUserTests-ab8801f6-dad8-4f9-name
2017-03-21 13:34:39,307 - neutron_utils - INFO - Deleting security group rule with ID - d85fafc0-9649-45c9-a00e-452f3d5c09a6
2017-03-21 13:34:39,531 - neutron_utils - INFO - Deleting security group rule with ID - 69d79c09-bc3b-4975-9353-5f43aca51237
2017-03-21 13:34:39,762 - neutron_utils - INFO - Deleting security group with name - CreateProjectUserTests-ab8801f6-dad8-4f9-name ok
test_create_project_sec_grp_two_users (snaps.openstack.tests.create_project_tests.CreateProjectUserTests) ...
2017-03-21 13:34:43,511 - OpenStackSecurityGroup - INFO - Creating security group CreateProjectUserTests-4d9261a6-e008-44b-name...
2017-03-21 13:34:43,511 - neutron_utils - INFO - Retrieving security group with name - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:44,090 - neutron_utils - INFO - Creating security group with name - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:44,784 - neutron_utils - INFO - Retrieving security group rules associate with the security group - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:44,864 - neutron_utils - INFO - Retrieving security group with ID - 780193e4-9bd2-4f2e-a14d-b01abf74c832
2017-03-21 13:34:45,233 - neutron_utils - INFO - Retrieving security group with ID - 780193e4-9bd2-4f2e-a14d-b01abf74c832
2017-03-21 13:34:45,332 - neutron_utils - INFO - Retrieving security group with name - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:45,779 - OpenStackSecurityGroup - INFO - Creating security group CreateProjectUserTests-4d9261a6-e008-44b-name...
2017-03-21 13:34:45,779 - neutron_utils - INFO - Retrieving security group with name - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:46,112 - neutron_utils - INFO - Retrieving security group rules associate with the security group - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:46,184 - neutron_utils - INFO - Retrieving security group with ID - 780193e4-9bd2-4f2e-a14d-b01abf74c832
2017-03-21 13:34:46,296 - neutron_utils - INFO - Retrieving security group with ID - 780193e4-9bd2-4f2e-a14d-b01abf74c832
2017-03-21 13:34:46,387 - neutron_utils - INFO - Deleting security group rule with ID - 2320a573-ec56-47c5-a1ba-ec514d30114b
2017-03-21 13:34:46,636 - neutron_utils - INFO - Deleting security group rule with ID - 6186282b-db37-4e47-becc-a3886079c069
2017-03-21 13:34:46,780 - neutron_utils - INFO - Deleting security group with name - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:47,006 - neutron_utils - INFO - Deleting security group rule with ID - 2320a573-ec56-47c5-a1ba-ec514d30114b
2017-03-21 13:34:47,072 - OpenStackSecurityGroup - WARNING - Rule not found, cannot delete - Security group rule 2320a573-ec56-47c5-a1ba-ec514d30114b does not exist
Neutron server returns request_ids: ['req-d74eb2e2-b26f-4236-87dc-7255866141d9']
2017-03-21 13:34:47,072 - neutron_utils - INFO - Deleting security group rule with ID - 6186282b-db37-4e47-becc-a3886079c069
2017-03-21 13:34:47,118 - OpenStackSecurityGroup - WARNING - Rule not found, cannot delete - Security group rule 6186282b-db37-4e47-becc-a3886079c069 does not exist
Neutron server returns request_ids: ['req-8c0a5a24-be90-4844-a9ed-2a85cc6f59a5']
2017-03-21 13:34:47,118 - neutron_utils - INFO - Deleting security group with name - CreateProjectUserTests-4d9261a6-e008-44b-name
2017-03-21 13:34:47,172 - OpenStackSecurityGroup - WARNING - Security Group not found, cannot delete - Security group 780193e4-9bd2-4f2e-a14d-b01abf74c832 does not exist
Neutron server returns request_ids: ['req-c6e1a6b5-43e0-4d46-bb68-c2e1672d4d21'] ok
test_create_image_minimal_file (snaps.openstack.utils.tests.glance_utils_tests.GlanceUtilsTests) ... ok
test_create_image_minimal_url (snaps.openstack.utils.tests.glance_utils_tests.GlanceUtilsTests) ... ok
test_create_network (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsNetworkTests) ...
2017-03-21 13:35:22,275 - neutron_utils - INFO - Creating network with name NeutronUtilsNetworkTests-c06c20e0-d78f-4fa4-8401-099a7a6cab2e-pub-net
2017-03-21 13:35:23,965 - neutron_utils - INFO - Deleting network with name NeutronUtilsNetworkTests-c06c20e0-d78f-4fa4-8401-099a7a6cab2e-pub-net ok
test_create_network_empty_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsNetworkTests) ... ok
test_create_network_null_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsNetworkTests) ... ok
test_create_subnet (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSubnetTests) ...
2017-03-21 13:35:25,495 - neutron_utils - INFO - Creating network with name NeutronUtilsSubnetTests-4f440a5f-54e3-4455-ab9b-39dfe06f6d21-pub-net
2017-03-21 13:35:26,841 - neutron_utils - INFO - Creating subnet with name NeutronUtilsSubnetTests-4f440a5f-54e3-4455-ab9b-39dfe06f6d21-pub-subnet
2017-03-21 13:35:28,311 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsSubnetTests-4f440a5f-54e3-4455-ab9b-39dfe06f6d21-pub-subnet
2017-03-21 13:35:29,585 - neutron_utils - INFO - Deleting network with name NeutronUtilsSubnetTests-4f440a5f-54e3-4455-ab9b-39dfe06f6d21-pub-net ok
test_create_subnet_empty_cidr (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSubnetTests) ...
2017-03-21 13:35:31,013 - neutron_utils - INFO - Creating network with name NeutronUtilsSubnetTests-41fc0db4-71ee-47e6-bec9-316273e5bcc0-pub-net
2017-03-21 13:35:31,652 - neutron_utils - INFO - Deleting network with name NeutronUtilsSubnetTests-41fc0db4-71ee-47e6-bec9-316273e5bcc0-pub-net ok
test_create_subnet_empty_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSubnetTests) ...
2017-03-21 13:35:32,379 - neutron_utils - INFO - Creating network with name NeutronUtilsSubnetTests-1030e0cb-1714-4d18-8619-a03bac0d0257-pub-net
2017-03-21 13:35:33,516 - neutron_utils - INFO - Creating subnet with name NeutronUtilsSubnetTests-1030e0cb-1714-4d18-8619-a03bac0d0257-pub-subnet
2017-03-21 13:35:34,160 - neutron_utils - INFO - Deleting network with name NeutronUtilsSubnetTests-1030e0cb-1714-4d18-8619-a03bac0d0257-pub-net ok
test_create_subnet_null_cidr (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSubnetTests) ...
2017-03-21 13:35:35,784 - neutron_utils - INFO - Creating network with name NeutronUtilsSubnetTests-1d7522fd-3fb5-4b1c-8741-97d7c47a5f7d-pub-net
2017-03-21 13:35:36,367 - neutron_utils - INFO - Deleting network with name NeutronUtilsSubnetTests-1d7522fd-3fb5-4b1c-8741-97d7c47a5f7d-pub-net ok
test_create_subnet_null_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSubnetTests) ...
2017-03-21 13:35:37,055 - neutron_utils - INFO - Creating network with name NeutronUtilsSubnetTests-0a8ac1b2-e5d4-4522-a079-7e17945e482e-pub-net
2017-03-21 13:35:37,691 - neutron_utils - INFO - Deleting network with name NeutronUtilsSubnetTests-0a8ac1b2-e5d4-4522-a079-7e17945e482e-pub-net ok
test_add_interface_router (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:35:38,994 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-net
2017-03-21 13:35:40,311 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-subnet
2017-03-21 13:35:41,713 - neutron_utils - INFO - Creating router with name - NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-router
2017-03-21 13:35:44,131 - neutron_utils - INFO - Adding interface to router with name NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-router
2017-03-21 13:35:45,725 - neutron_utils - INFO - Removing router interface from router named NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-router
2017-03-21 13:35:47,464 - neutron_utils - INFO - Deleting router with name - NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-router
2017-03-21 13:35:48,670 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-subnet
2017-03-21 13:35:50,921 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-433818c9-4472-49a8-9241-791ad0a71d3f-pub-net ok
test_add_interface_router_null_router (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:35:52,230 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-1fc2de16-2d3e-497b-b947-022b1bf9d90c-pub-net
2017-03-21 13:35:53,662 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-1fc2de16-2d3e-497b-b947-022b1bf9d90c-pub-subnet
2017-03-21 13:35:55,203 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-1fc2de16-2d3e-497b-b947-022b1bf9d90c-pub-subnet
2017-03-21 13:35:55,694 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-1fc2de16-2d3e-497b-b947-022b1bf9d90c-pub-net ok
test_add_interface_router_null_subnet (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:35:57,392 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-2e4fb9f3-312b-4954-8015-435464fdc8b0-pub-net
2017-03-21 13:35:58,215 - neutron_utils - INFO - Creating router with name - NeutronUtilsRouterTests-2e4fb9f3-312b-4954-8015-435464fdc8b0-pub-router
2017-03-21 13:36:00,369 - neutron_utils - INFO - Adding interface to router with name NeutronUtilsRouterTests-2e4fb9f3-312b-4954-8015-435464fdc8b0-pub-router
2017-03-21 13:36:00,369 - neutron_utils - INFO - Deleting router with name - NeutronUtilsRouterTests-2e4fb9f3-312b-4954-8015-435464fdc8b0-pub-router
2017-03-21 13:36:02,742 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-2e4fb9f3-312b-4954-8015-435464fdc8b0-pub-net ok
test_create_port (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:05,010 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-dde05ce1-a2f8-4c5e-a028-e1ca0e11a05b-pub-net
2017-03-21 13:36:05,996 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-dde05ce1-a2f8-4c5e-a028-e1ca0e11a05b-pub-subnet
2017-03-21 13:36:09,103 - neutron_utils - INFO - Creating port for network with name - NeutronUtilsRouterTests-dde05ce1-a2f8-4c5e-a028-e1ca0e11a05b-pub-net
2017-03-21 13:36:10,312 - neutron_utils - INFO - Deleting port with name NeutronUtilsRouterTests-dde05ce1-a2f8-4c5e-a028-e1ca0e11a05b-port
2017-03-21 13:36:11,045 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-dde05ce1-a2f8-4c5e-a028-e1ca0e11a05b-pub-subnet
2017-03-21 13:36:14,265 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-dde05ce1-a2f8-4c5e-a028-e1ca0e11a05b-pub-net ok
test_create_port_empty_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:16,250 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-b986a259-e873-431c-bde4-b2771ace4549-pub-net
2017-03-21 13:36:16,950 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-b986a259-e873-431c-bde4-b2771ace4549-pub-subnet
2017-03-21 13:36:17,798 - neutron_utils - INFO - Creating port for network with name - NeutronUtilsRouterTests-b986a259-e873-431c-bde4-b2771ace4549-pub-net
2017-03-21 13:36:18,544 - neutron_utils - INFO - Deleting port with name NeutronUtilsRouterTests-b986a259-e873-431c-bde4-b2771ace4549-port
2017-03-21 13:36:19,582 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-b986a259-e873-431c-bde4-b2771ace4549-pub-subnet
2017-03-21 13:36:21,606 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-b986a259-e873-431c-bde4-b2771ace4549-pub-net ok
test_create_port_invalid_ip (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:23,779 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-7ab3a329-9dd8-4e6f-9d52-aafb47ea5122-pub-net
2017-03-21 13:36:25,201 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-7ab3a329-9dd8-4e6f-9d52-aafb47ea5122-pub-subnet
2017-03-21 13:36:25,599 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-7ab3a329-9dd8-4e6f-9d52-aafb47ea5122-pub-subnet
2017-03-21 13:36:26,220 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-7ab3a329-9dd8-4e6f-9d52-aafb47ea5122-pub-net ok
test_create_port_invalid_ip_to_subnet (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:27,112 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-c016821d-cd4f-4e0f-8f8c-d5cef3392e64-pub-net
2017-03-21 13:36:28,720 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-c016821d-cd4f-4e0f-8f8c-d5cef3392e64-pub-subnet
2017-03-21 13:36:29,457 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-c016821d-cd4f-4e0f-8f8c-d5cef3392e64-pub-subnet
2017-03-21 13:36:29,909 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-c016821d-cd4f-4e0f-8f8c-d5cef3392e64-pub-net ok
test_create_port_null_ip (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:31,037 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-9a86227f-6041-4b04-86a7-1701fb86baa3-pub-net
2017-03-21 13:36:31,695 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-9a86227f-6041-4b04-86a7-1701fb86baa3-pub-subnet
2017-03-21 13:36:32,305 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-9a86227f-6041-4b04-86a7-1701fb86baa3-pub-subnet
2017-03-21 13:36:33,553 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-9a86227f-6041-4b04-86a7-1701fb86baa3-pub-net ok
test_create_port_null_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:34,593 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-42efa897-4f65-4d9b-b19d-fbc61f97c966-pub-net
2017-03-21 13:36:35,217 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-42efa897-4f65-4d9b-b19d-fbc61f97c966-pub-subnet
2017-03-21 13:36:36,648 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-42efa897-4f65-4d9b-b19d-fbc61f97c966-pub-subnet
2017-03-21 13:36:37,251 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-42efa897-4f65-4d9b-b19d-fbc61f97c966-pub-net ok
test_create_port_null_network_object (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:37,885 - neutron_utils - INFO - Creating network with name NeutronUtilsRouterTests-617f4110-45c1-4900-bad1-a6204f34dd64-pub-net
2017-03-21 13:36:38,468 - neutron_utils - INFO - Creating subnet with name NeutronUtilsRouterTests-617f4110-45c1-4900-bad1-a6204f34dd64-pub-subnet
2017-03-21 13:36:40,005 - neutron_utils - INFO - Deleting subnet with name NeutronUtilsRouterTests-617f4110-45c1-4900-bad1-a6204f34dd64-pub-subnet
2017-03-21 13:36:41,637 - neutron_utils - INFO - Deleting network with name NeutronUtilsRouterTests-617f4110-45c1-4900-bad1-a6204f34dd64-pub-net ok
test_create_router_empty_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ... ok
test_create_router_null_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ... ok
test_create_router_simple (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:43,424 - neutron_utils - INFO - Creating router with name - NeutronUtilsRouterTests-b6a2dafc-38d4-4c46-bb41-2ba9e1c0084e-pub-router
2017-03-21 13:36:45,013 - neutron_utils - INFO - Deleting router with name - NeutronUtilsRouterTests-b6a2dafc-38d4-4c46-bb41-2ba9e1c0084e-pub-router ok
test_create_router_with_public_interface (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsRouterTests) ...
2017-03-21 13:36:47,829 - neutron_utils - INFO - Creating router with name - NeutronUtilsRouterTests-d268dda2-7a30-4d3d-a008-e5aa4592637d-pub-router
2017-03-21 13:36:49,448 - neutron_utils - INFO - Deleting router with name - NeutronUtilsRouterTests-d268dda2-7a30-4d3d-a008-e5aa4592637d-pub-router ok
test_create_delete_simple_sec_grp (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSecurityGroupTests) ...
2017-03-21 13:36:51,067 - neutron_utils - INFO - Creating security group with name - NeutronUtilsSecurityGroupTests-1543e861-ea38-4fbe-9723-c27552e3eb7aname
2017-03-21 13:36:51,493 - neutron_utils - INFO - Retrieving security group with name - NeutronUtilsSecurityGroupTests-1543e861-ea38-4fbe-9723-c27552e3eb7aname
2017-03-21 13:36:51,568 - neutron_utils - INFO - Deleting security group with name - NeutronUtilsSecurityGroupTests-1543e861-ea38-4fbe-9723-c27552e3eb7aname
2017-03-21 13:36:51,772 - neutron_utils - INFO - Retrieving security group with name - NeutronUtilsSecurityGroupTests-1543e861-ea38-4fbe-9723-c27552e3eb7aname ok
test_create_sec_grp_no_name (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSecurityGroupTests) ... ok
test_create_sec_grp_no_rules (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSecurityGroupTests) ...
2017-03-21 13:36:52,253 - neutron_utils - INFO - Creating security group with name - NeutronUtilsSecurityGroupTests-57c60864-f46c-4391-ba99-6acc4dd123ddname
2017-03-21 13:36:52,634 - neutron_utils - INFO - Retrieving security group with name - NeutronUtilsSecurityGroupTests-57c60864-f46c-4391-ba99-6acc4dd123ddname
2017-03-21 13:36:52,718 - neutron_utils - INFO - Deleting security group with name - NeutronUtilsSecurityGroupTests-57c60864-f46c-4391-ba99-6acc4dd123ddname ok
test_create_sec_grp_one_rule (snaps.openstack.utils.tests.neutron_utils_tests.NeutronUtilsSecurityGroupTests) ...
2017-03-21 13:36:53,082 - neutron_utils - INFO - Creating security group with name - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name
2017-03-21 13:36:53,483 - neutron_utils - INFO - Retrieving security group rules associate with the security group - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name
2017-03-21 13:36:53,548 - neutron_utils - INFO - Creating security group to security group - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name
2017-03-21 13:36:53,548 - neutron_utils - INFO - Retrieving security group with name - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name
2017-03-21 13:36:53,871 - neutron_utils - INFO - Retrieving security group with name - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name
2017-03-21 13:36:53,944 - neutron_utils - INFO - Retrieving security group rules associate with the security group - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name
2017-03-21 13:36:53,991 - neutron_utils - INFO - Retrieving security group with name - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name
2017-03-21 13:36:54,069 - neutron_utils - INFO - Deleting security group rule with ID - 7f76046c-d043-46e0-9d12-4b983525810b
2017-03-21 13:36:54,185 - neutron_utils - INFO - Deleting security group rule with ID - f18a9ed1-466f-4373-a6b2-82bd317bc838
2017-03-21 13:36:54,338 - neutron_utils - INFO - Deleting security group rule with ID - fe34a3d0-948e-47c1-abad-c3ec8d33b2fb
2017-03-21 13:36:54,444 - neutron_utils - INFO - Deleting security group with name - NeutronUtilsSecurityGroupTests-a3ac62bb-a7e8-4fc2-ba4c-e656f1f3c9a1name ok
test_create_delete_keypair (snaps.openstack.utils.tests.nova_utils_tests.NovaUtilsKeypairTests) ...
2017-03-21 13:36:54,637 - nova_utils - INFO - Creating keypair with name - NovaUtilsKeypairTests-5ce69b6f-d8d0-4b66-bd25-30a22cf3bda0 ok
test_create_key_from_file (snaps.openstack.utils.tests.nova_utils_tests.NovaUtilsKeypairTests) ...
2017-03-21 13:36:58,989 - nova_utils - INFO - Saved public key to - tmp/NovaUtilsKeypairTests-df3e848d-a467-4cc4-99d5-022eb67eee94.pub
2017-03-21 13:36:58,990 - nova_utils - INFO - Saved private key to - tmp/NovaUtilsKeypairTests-df3e848d-a467-4cc4-99d5-022eb67eee94
2017-03-21 13:36:58,990 - nova_utils - INFO - Saving keypair to - tmp/NovaUtilsKeypairTests-df3e848d-a467-4cc4-99d5-022eb67eee94.pub
2017-03-21 13:36:58,990 - nova_utils - INFO - Creating keypair with name - NovaUtilsKeypairTests-df3e848d-a467-4cc4-99d5-022eb67eee94 ok
test_create_keypair (snaps.openstack.utils.tests.nova_utils_tests.NovaUtilsKeypairTests) ...
2017-03-21 13:36:59,807 - nova_utils - INFO - Creating keypair with name - NovaUtilsKeypairTests-fc7f7ffd-80f6-43df-bd41-a3c014ba8c3d ok
test_floating_ips (snaps.openstack.utils.tests.nova_utils_tests.NovaUtilsKeypairTests) ...
2017-03-21 13:37:02,765 - nova_utils - INFO - Creating floating ip to external network - admin_floating_net ok
test_create_delete_flavor (snaps.openstack.utils.tests.nova_utils_tests.NovaUtilsFlavorTests) ... ok
test_create_flavor (snaps.openstack.utils.tests.nova_utils_tests.NovaUtilsFlavorTests) ... ok
test_create_clean_flavor (snaps.openstack.tests.create_flavor_tests.CreateFlavorTests) ... ok
test_create_delete_flavor (snaps.openstack.tests.create_flavor_tests.CreateFlavorTests) ... ok
test_create_flavor (snaps.openstack.tests.create_flavor_tests.CreateFlavorTests) ... ok
test_create_flavor_existing (snaps.openstack.tests.create_flavor_tests.CreateFlavorTests) ...
2017-03-21 13:37:18,545 - create_image - INFO - Found flavor with name - CreateFlavorTests-3befc152-4319-4f9c-82d4-75f8941d9533name ok

----------------------------------------------------------------------
Ran 48 tests in 171.000s

OK
2017-03-21 13:37:18,620 - functest.core.testcase_base - INFO - api_check OK
2017-03-21 13:37:18,977 - functest.core.testcase_base - INFO - The results were successfully pushed to DB
2017-03-21 13:37:18,977 - run_tests - INFO - Test execution time: 02:52
2017-03-21 13:37:18,981 - run_tests - INFO -

2017-03-21 13:37:18,981 - run_tests - INFO - ============================================
2017-03-21 13:37:18,981 - run_tests - INFO - Running test case 'snaps_health_check'...
2017-03-21 13:37:18,981 - run_tests - INFO - ============================================
2017-03-21 13:37:19,098 - file_utils - INFO - Attempting to read OS environment file - /home/opnfv/functest/conf/openstack.creds
2017-03-21 13:37:19,099 - openstack_tests - INFO - OS Credentials = OSCreds - username=admin, password=admin, auth_url=http://192.168.10.7:5000/v3, project_name=admin, identity_api_version=3, image_api_version=1, network_api_version=2, compute_api_version=2, user_domain_id=default, proxy_settings=None
2017-03-21 13:37:19,434 - file_utils - INFO - Attempting to read OS environment file - /home/opnfv/functest/conf/openstack.creds
2017-03-21 13:37:19,435 - openstack_tests - INFO - OS Credentials = OSCreds - username=admin, password=admin, auth_url=http://192.168.10.7:5000/v3, project_name=admin, identity_api_version=3, image_api_version=1, network_api_version=2, compute_api_version=2, user_domain_id=default, proxy_settings=None
test_check_vm_ip_dhcp (snaps.openstack.tests.create_instance_tests.SimpleHealthCheck) ...
2017-03-21 13:37:26,082 - create_image - INFO - Creating image
2017-03-21 13:37:28,793 - create_image - INFO - Image is active with name - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-image
2017-03-21 13:37:28,793 - create_image - INFO - Image is now active with name - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-image
2017-03-21 13:37:28,794 - OpenStackNetwork - INFO - Creating neutron network SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-priv-net...
2017-03-21 13:37:29,308 - neutron_utils - INFO - Creating network with name SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-priv-net
2017-03-21 13:37:30,771 - neutron_utils - INFO - Creating subnet with name SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-priv-subnet
2017-03-21 13:37:36,974 - neutron_utils - INFO - Creating port for network with name - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-priv-net
2017-03-21 13:37:38,188 - create_instance - INFO - Creating VM with name - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-inst
2017-03-21 13:37:41,538 - create_instance - INFO - Created instance with name - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-inst
2017-03-21 13:37:59,577 - create_instance - INFO - VM is - ACTIVE
2017-03-21 13:37:59,577 - create_instance_tests - INFO - Looking for expression Lease of.*obtained in the console log
2017-03-21 13:37:59,830 - create_instance_tests - INFO - DHCP lease obtained logged in console
2017-03-21 13:37:59,830 - create_instance_tests - INFO - With correct IP address
2017-03-21 13:37:59,830 - create_instance - INFO - Deleting Port - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5dport-1
2017-03-21 13:37:59,830 - neutron_utils - INFO - Deleting port with name SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5dport-1
2017-03-21 13:38:00,705 - create_instance - INFO - Deleting VM instance - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-inst
2017-03-21 13:38:01,412 - create_instance - INFO - Checking deletion status
2017-03-21 13:38:04,938 - create_instance - INFO - VM has been properly deleted VM with name - SimpleHealthCheck-23244728-5a5a-4545-9b16-50257a595e5d-inst
ok

----------------------------------------------------------------------
Ran 1 test in 46.982s

OK
2017-03-21 13:38:06,417 - functest.core.testcase_base - INFO - snaps_health_check OK
2017-03-21 13:38:06,778 - functest.core.testcase_base - INFO - The results were successfully pushed to DB
2017-03-21 13:38:06,779 - run_tests - INFO - Test execution time: 00:47
2017-03-21 13:38:06,779 - run_tests - INFO -
and

root@22e436918db0:~/repos/functest/ci# functest testcase run vping_ssh
Executing command: 'python /home/opnfv/repos/functest/ci/run_tests.py -t vping_ssh'
2016-06-30 11:50:31,861 - run_tests - INFO - Sourcing the OpenStack RC file...
2016-06-30 11:50:31,865 - run_tests - INFO - ============================================
2016-06-30 11:50:31,865 - run_tests - INFO - Running test case 'vping_ssh'...
2016-06-30 11:50:31,865 - run_tests - INFO - ============================================
2016-06-30 11:50:32,977 - vping_ssh - INFO - Creating image 'functest-vping' from '/home/opnfv/functest/data/cirros-0.3.5-x86_64-disk.img'...
2016-06-30 11:50:45,470 - vping_ssh - INFO - Creating neutron network vping-net...
2016-06-30 11:50:47,645 - vping_ssh - INFO - Creating security group  'vPing-sg'...
2016-06-30 11:50:48,843 - vping_ssh - INFO - Using existing Flavor 'm1.small'...
2016-06-30 11:50:48,927 - vping_ssh - INFO - vPing Start Time:'2016-06-30 11:50:48'
2016-06-30 11:50:48,927 - vping_ssh - INFO - Creating instance 'opnfv-vping-1'...
2016-06-30 11:51:34,664 - vping_ssh - INFO - Instance 'opnfv-vping-1' is ACTIVE.
2016-06-30 11:51:34,818 - vping_ssh - INFO - Adding 'opnfv-vping-1' to security group 'vPing-sg'...
2016-06-30 11:51:35,209 - vping_ssh - INFO - Creating instance 'opnfv-vping-2'...
2016-06-30 11:52:01,439 - vping_ssh - INFO - Instance 'opnfv-vping-2' is ACTIVE.
2016-06-30 11:52:01,439 - vping_ssh - INFO - Adding 'opnfv-vping-2' to security group 'vPing-sg'...
2016-06-30 11:52:01,754 - vping_ssh - INFO - Creating floating IP for VM 'opnfv-vping-2'...
2016-06-30 11:52:01,969 - vping_ssh - INFO - Floating IP created: '10.17.94.140'
2016-06-30 11:52:01,969 - vping_ssh - INFO - Associating floating ip: '10.17.94.140' to VM 'opnfv-vping-2'
2016-06-30 11:52:02,792 - vping_ssh - INFO - Trying to establish SSH connection to 10.17.94.140...
2016-06-30 11:52:19,915 - vping_ssh - INFO - Waiting for ping...
2016-06-30 11:52:21,108 - vping_ssh - INFO - vPing detected!
2016-06-30 11:52:21,108 - vping_ssh - INFO - vPing duration:'92.2' s.
2016-06-30 11:52:21,109 - vping_ssh - INFO - vPing OK
2016-06-30 11:52:21,153 - clean_openstack - INFO - +++++++++++++++++++++++++++++++
2016-06-30 11:52:21,153 - clean_openstack - INFO - Cleaning OpenStack resources...
2016-06-30 11:52:21,153 - clean_openstack - INFO - +++++++++++++++++++++++++++++++
Version 1 is deprecated, use alternative version 2 instead.
:
:
etc.

To list the test cases which are part of a specific Test Tier, the ‘get-tests’ command is used with ‘functest tier’:

root@22e436918db0:~/repos/functest/ci# functest tier get-tests healthcheck
Test cases in tier 'healthcheck':
 ['connection_check', 'api_check', 'snaps_health_check']

Please note that for some scenarios some test cases might not be launched. For example, an environment may display only the ‘odl’ test case because the deployment in that particular system does not support the ‘ocl’ SDN Controller test case.

Important: If you use the command ‘functest tier run <tier_name>’, the Functest CLI utility will call all valid Test Cases belonging to the specified Test Tier, as relevant to the scenario deployed in the SUT environment. The Functest CLI utility thus automatically calculates which tests can be executed and which cannot, given the environment variable DEPLOY_SCENARIO, which is passed in to the Functest docker container.

Currently, the Functest CLI command ‘functest testcase run <testcase_name>’ supports two possibilities:

*  Run a single Test Case, specified by a valid choice of <testcase_name>
*  Run ALL Test Cases (for all Tiers) by specifying <testcase_name> = 'all'
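
For example, the two possibilities above translate into the following invocations from inside the Functest container:

$ functest testcase run vping_ssh
$ functest testcase run all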

Functest includes a cleaning mechanism in order to remove all the OpenStack resources except those present before running any test. The script $REPOS_DIR/functest/functest/utils/openstack_snapshot.py is called once when setting up the Functest environment (i.e. CLI command ‘functest env prepare’) to snapshot all the OpenStack resources (images, networks, volumes, security groups, tenants, users) so that an eventual cleanup does not remove any of these defaults.

It is also called before running a test except if it is disabled by configuration in the testcases.yaml file (clean_flag=false). This flag has been added as some upstream tests already include their own cleaning mechanism (e.g. Rally).

The script openstack_clean.py, which is located in $REPOS_DIR/functest/functest/utils/, is normally called after a test execution. It is in charge of cleaning the OpenStack resources that are not specified in the defaults file generated previously, which is stored in /home/opnfv/functest/conf/openstack_snapshot.yaml in the Functest docker container.
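
As an illustration, from inside the Functest container you can inspect the snapshot file that openstack_clean.py uses as its reference, and check that both utility scripts are present (paths as given above):

cat /home/opnfv/functest/conf/openstack_snapshot.yaml
ls $REPOS_DIR/functest/functest/utils/openstack_snapshot.py $REPOS_DIR/functest/functest/utils/openstack_clean.py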

It is important to mention that any new OpenStack resources created manually after this snapshot is taken will be removed by the cleanup, unless you invoke the test case with the special method that suppresses clean-up (see the Troubleshooting section).

The reason for including this cleanup mechanism in Functest is that some test suites create a lot of resources (users, tenants, networks, volumes etc.) that are not always properly cleaned up, so this function keeps the system as clean as it was before a full Functest execution.

Although the Functest CLI provides an easy way to run any test, it is possible to do a direct call to the desired test script. For example:

python $REPOS_DIR/functest/functest/opnfv_tests/openstack/vping/vping_ssh.py
Automated testing

As mentioned previously, the Functest Docker container preparation as well as the invocation of Test Cases can be performed within the container from the Jenkins CI system. There are 3 jobs that automate the whole process. The first job runs all the tests referenced in the daily loop (i.e. tests that must be run daily), the second job runs the tests referenced in the weekly loop (usually long-duration tests run at most once a week) and the third job allows running one test suite at a time by specifying the test suite name. The user may also use any of these Jenkins jobs to execute the desired test suites.

One of the most challenging tasks in the Danube release consists in dealing with lots of scenarios and installers. Thus, when the tests are automatically started from CI, a basic algorithm has been created in order to detect whether a given test is runnable or not on the given scenario. Some Functest test suites cannot be systematically run (e.g. the ODL suite cannot be run on an ONOS scenario). The daily/weekly notion was introduced in Colorado in order to save CI time and avoid systematically running long-duration tests. It was not used in Colorado due to a CI resource shortage; the mechanism remains however as part of the CI evolution.

CI provides some useful information passed to the container as environment variables:

  • Installer (apex|compass|fuel|joid), stored in INSTALLER_TYPE
  • Installer IP of the engine or VM running the actual deployment, stored in INSTALLER_IP
  • The scenario [controller]-[feature]-[mode], stored in DEPLOY_SCENARIO with
    • controller = (odl|ocl|nosdn|onos)
    • feature = (ovs(dpdk)|kvm|sfc|bgpvpn|multisites|netready|ovs_dpdk_bar)
    • mode = (ha|noha)
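
As a sketch of how these variables reach the container, the CI scripts pass them with the -e option of docker run; the image tag and the values below are examples only:

docker run -id \
    -e INSTALLER_TYPE=fuel \
    -e INSTALLER_IP=10.20.0.2 \
    -e DEPLOY_SCENARIO=os-odl_l2-nofeature-ha \
    opnfv/functest:latest /bin/bash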

The constraints per test case are defined in the Functest configuration file /home/opnfv/repos/functest/functest/ci/testcases.yaml:

tiers:
  -
       name: smoke
       order: 1
       ci_loop: '(daily)|(weekly)'
       description : >-
           Set of basic Functional tests to validate the OpenStack deployment.
       testcases:
           -
               name: vping_ssh
               criteria: 'status == "PASS"'
               blocking: true
               description: >-
                   This test case verifies: 1) SSH to an instance using floating
                   IPs over the public network. 2) Connectivity between 2 instances
                   over a private network.
               dependencies:
                   installer: ''
                   scenario: '^((?!bgpvpn|odl_l3).)*$'
               run:
                   module: 'functest.opnfv_tests.openstack.vping.vping_ssh'
                   class: 'VPingSSH'
       ....
We may distinguish 2 levels in the test case description:
  • Tier level
  • Test case level

At the tier level, we define the following parameters:

  • ci_loop: indicates whether, in automated mode, the test case must be run in daily and/or weekly jobs
  • description: a high level view of the test case
For a given test case we define:
  • the name of the test case
  • the criteria (experimental): the criteria used to declare the test case as PASS or FAIL
  • blocking: if set to true and the test fails, the execution of the following tests is cancelled
  • clean_flag: whether the Functest internal cleaning mechanism shall be invoked after the test
  • the description of the test case
  • the dependencies: a combination of 2 regexes on the scenario and the installer name
  • run: in Danube we introduced the notion of abstract class in order to harmonize the way to run internal, feature or VNF tests

For further details on abstraction classes, see the developer guide.

Additional parameters have been added in the description in the Database. The target is to use the configuration stored in the Database and consider the local file as a backup if the Database is not reachable. The additional fields related to a test case are:

  • trust: we introduced this notion to put in place a mechanism of scenario promotion
  • version: indicates since which version you can run this test
  • domains: the main domain covered by the test suite
  • tags: a list of tags related to the test suite

The order of execution is the one defined in the file if all test cases are selected.

In CI daily job the tests are executed in the following order:

  1. healthcheck (blocking)
  2. smoke: both vPings are blocking
  3. Feature project tests cases

In CI weekly job we add 2 tiers:

  1. VNFs (vIMS)
  2. Components (Rally and Tempest long duration suites)

As explained before, at the end of an automated execution the OpenStack resources might be removed. Please note that a system snapshot is taken before any test case execution.

This testcases.yaml file is used for CI, for the CLI and for the automatic reporting.

Test results
Manual testing

In manual mode test results are displayed in the console and result files are put in /home/opnfv/functest/results.

Automated testing

In automated mode, test results are displayed in the Jenkins logs and a summary is provided at the end of the job; it can be described as follows:

+==================================================================================================================================================+
|                                                                FUNCTEST REPORT                                                                   |
+==================================================================================================================================================+
|                                                                                                                                                  |
|  Deployment description:                                                                                                                         |
|    INSTALLER: fuel                                                                                                                               |
|    SCENARIO:  os-odl_l2-nofeature-ha                                                                                                             |
|    BUILD TAG: jenkins-functest-fuel-baremetal-daily-master-324                                                                                   |
|    CI LOOP:   daily                                                                                                                              |
|                                                                                                                                                  |
+=========================+===============+============+===============+===========================================================================+
| TEST CASE               | TIER          | DURATION   | RESULT        | URL                                                                       |
+=========================+===============+============+===============+===========================================================================+
| connection_check        | healthcheck   | 00:02      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb62b34079ac000a42e3fe |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| api_check               | healthcheck   | 01:15      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb62fe4079ac000a42e3ff |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| snaps_health_check      | healthcheck   | 00:50      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb63314079ac000a42e400 |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| vping_ssh               | smoke         | 01:10      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb63654079ac000a42e401 |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| vping_userdata          | smoke         | 00:59      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb63a14079ac000a42e403 |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| tempest_smoke_serial    | smoke         | 12:57      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb66bd4079ac000a42e408 |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| rally_sanity            | smoke         | 10:22      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb692b4079ac000a42e40a |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| refstack_defcore        | smoke         | 12:28      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb6c184079ac000a42e40c |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| snaps_smoke             | smoke         | 12:04      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb6eec4079ac000a42e40e |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+
| domino                  | features      | 00:29      | PASS          | http://testresults.opnfv.org/test/api/v1/results/58cb6f044079ac000a42e40f |
+-------------------------+---------------+------------+---------------+---------------------------------------------------------------------------+

Results are automatically pushed to the test results database, and some additional result files are pushed to the OPNFV artifact web sites.

Based on the results stored in the result database, a Functest reporting portal is also automatically updated. This portal provides information on:

  • The overall status per scenario and per installer
  • Tempest: Tempest test case including reported errors per scenario and installer
  • vIMS: vIMS details per scenario and installer
Functest reporting portal Fuel status page
Troubleshooting

This section gives some guidelines about how to troubleshoot the test cases owned by Functest.

IMPORTANT: As in the previous section, the steps defined below must be executed inside the Functest Docker container and after sourcing the OpenStack credentials:

. $creds

or:

source /home/opnfv/functest/conf/openstack.creds
VIM

This section covers the test cases related to the VIM (healthcheck, vping_ssh, vping_userdata, tempest_smoke_serial, tempest_full_parallel, rally_sanity, rally_full).

vPing common

For both vPing test cases (vPing_ssh and vPing_userdata), the first steps are similar:

  • Create Glance image
  • Create Network
  • Create Security Group
  • Create Instances

After these actions, the test cases differ and will be explained in their respective section.

These test cases can be run inside the container, using new Functest CLI as follows:

$ functest testcase run vping_ssh
$ functest testcase run vping_userdata

The Functest CLI is designed to route a call to the corresponding internal python scripts, located in paths: $REPOS_DIR/functest/functest/opnfv_tests/openstack/vping/vping_ssh.py and $REPOS_DIR/functest/functest/opnfv_tests/openstack/vping/vping_userdata.py

Notes:

  1. There is one difference between the Functest CLI based test case execution and the earlier used Bash shell script, which is relevant to point out in troubleshooting scenarios:

    The Functest CLI does not yet support the option to suppress clean-up of the generated OpenStack resources, following the execution of a test case.

    Explanation: After finishing the test execution, the corresponding script will remove, by default, all created resources in OpenStack (image, instances, network and security group). When troubleshooting, it is advisable sometimes to keep those resources in case the test fails and a manual testing is needed.

    It is actually still possible to invoke test execution with suppression of OpenStack resource cleanup; however, this requires invocation of a specific Python script: ‘/home/opnfv/repos/functest/ci/run_tests.py’. The OPNFV Functest Developer Guide provides guidance on the use of that Python script in such troubleshooting cases.

Some of the common errors that can appear in this test case are:

vPing_ssh- ERROR - There has been a problem when creating the neutron network....

This means that there have been some problems with Neutron, even before creating the instances. Try to manually create a Neutron network and a subnet to see if that works. The debug messages will also help to see where it failed (subnet and router creation). Example of Neutron commands (using the 10.6.0.0/24 range):

neutron net-create net-test
neutron subnet-create --name subnet-test --allocation-pool start=10.6.0.2,end=10.6.0.100 \
--gateway 10.6.0.254 net-test 10.6.0.0/24
neutron router-create test_router
neutron router-interface-add <ROUTER_ID> subnet-test
neutron router-gateway-set <ROUTER_ID> <EXT_NET_NAME>

Another related error can occur while creating the Security Groups for the instances:

vPing_ssh- ERROR - Failed to create the security group...

In this case, proceed to create it manually. These are some hints:

neutron security-group-create sg-test
neutron security-group-rule-create sg-test --direction ingress --protocol icmp \
--remote-ip-prefix 0.0.0.0/0
neutron security-group-rule-create sg-test --direction ingress --ethertype IPv4 \
--protocol tcp --port-range-min 80 --port-range-max 80 --remote-ip-prefix 0.0.0.0/0
neutron security-group-rule-create sg-test --direction egress --ethertype IPv4 \
--protocol tcp --port-range-min 80 --port-range-max 80 --remote-ip-prefix 0.0.0.0/0

The next step is to create the instances. The image used is located in /home/opnfv/functest/data/cirros-0.3.5-x86_64-disk.img and a Glance image is created with the name functest-vping. If booting the instances fails (i.e. the status is not ACTIVE), you can check why it failed by doing:

nova list
nova show <INSTANCE_ID>

It might show some messages about the booting failure. To try that manually:

nova boot --flavor m1.small --image functest-vping --nic net-id=<NET_ID> nova-test

This will spawn a VM using the network created manually in the previous steps. In all the OPNFV scenarios tested in CI, there has never been a problem with the previous actions. Further possible problems are explained in the following sections.

vPing_SSH

This test case creates a floating IP on the external network and assigns it to the second instance, opnfv-vping-2. The purpose of this is to establish an SSH connection to that instance and SCP a script that will ping the first instance. This script is located in the repository under $REPOS_DIR/functest/functest/opnfv_tests/openstack/vping/ping.sh and takes an IP as a parameter. When the SCP is completed, the test will do an SSH call to that script inside the second instance. Some problems can happen here:
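
If you need to reproduce this step by hand, a minimal sketch looks as follows; the floating IP, the CirrOS default user and the private IP of opnfv-vping-1 are placeholders/assumptions for your environment:

scp $REPOS_DIR/functest/functest/opnfv_tests/openstack/vping/ping.sh cirros@<FLOATING_IP>:~/
ssh cirros@<FLOATING_IP> 'sh ./ping.sh <PRIVATE_IP_OF_OPNFV_VPING_1>'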

vPing_ssh- ERROR - Cannot establish connection to IP xxx.xxx.xxx.xxx. Aborting

If this is displayed, stop the test or wait for it to finish if you have used the special method of test invocation with suppression of OpenStack resource clean-up, as explained earlier. It means that the container cannot reach the Public/External IP assigned to the instance opnfv-vping-2. There are many possible reasons, and they depend heavily on the chosen scenario. For most of the ODL-L3 and ONOS scenarios this has been noticed and it is a known limitation.

First, make sure that the instance opnfv-vping-2 succeeded in getting an IP from the DHCP agent. It can be checked by doing:

nova console-log opnfv-vping-2

If the message Sending discover and No lease, failing is shown, it probably means that the Neutron dhcp-agent failed to assign an IP or even that it was not responding. At this point it does not make sense to try to ping the floating IP.

If the instance got an IP properly, try to manually ping the VM from the container:

nova list
<grab the public IP>
ping <public IP>

If the ping does not return anything, try to ping from the host where the Docker container is running. If that solves the problem, check the iptables rules because there might be some rules rejecting ICMP or TCP traffic coming/going from/to the container.

At this point, if the ping does not work either, try to reproduce the test manually with the steps described above in the vPing common section with the addition:

neutron floatingip-create <EXT_NET_NAME>
nova floating-ip-associate nova-test <FLOATING_IP>

Further troubleshooting is out of scope of this document, as it might be due to problems with the SDN controller. Contact the installer team members or send an email to the corresponding OPNFV mailing list for more information.

vPing_userdata

This test case does not create any floating IP nor establish an SSH connection. Instead, it uses the nova metadata service when creating an instance to pass the same script as before (ping.sh), but as one-line text. This script will be executed automatically when the second instance opnfv-vping-2 is booted.

The main known reason for this test to fail is the lack of support for cloud-init (the nova metadata service). Check the console of the instance:

nova console-log opnfv-vping-2

If this text or similar is shown:

checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 1.13. request failed
failed 2/20: up 13.18. request failed
failed 3/20: up 25.20. request failed
failed 4/20: up 37.23. request failed
failed 5/20: up 49.25. request failed
failed 6/20: up 61.27. request failed
failed 7/20: up 73.29. request failed
failed 8/20: up 85.32. request failed
failed 9/20: up 97.34. request failed
failed 10/20: up 109.36. request failed
failed 11/20: up 121.38. request failed
failed 12/20: up 133.40. request failed
failed 13/20: up 145.43. request failed
failed 14/20: up 157.45. request failed
failed 15/20: up 169.48. request failed
failed 16/20: up 181.50. request failed
failed 17/20: up 193.52. request failed
failed 18/20: up 205.54. request failed
failed 19/20: up 217.56. request failed
failed 20/20: up 229.58. request failed
failed to read iid from metadata. tried 20

it means that the instance failed to read from the metadata service. Contact the Functest or installer teams for more information.

NOTE: Cloud-init is not supported on scenarios dealing with ONOS and the tests have been excluded from CI in those scenarios.

Tempest

In the upstream OpenStack CI all the Tempest test cases are supposed to pass. If some test cases fail in an OPNFV deployment, the reason is very probably one of the following:

  • Error: Resources required for test case execution are missing.
    Details: Such resources could be e.g. an external network and access to the management subnet (adminURL) from the Functest docker container.
  • Error: OpenStack components or services are missing or not configured properly.
    Details: Check the running services on the controller and compute nodes (e.g. with the “systemctl” or “service” commands). Configuration parameters can be verified from the related .conf files located under the ‘/etc/<component>’ directories.
  • Error: Some resources required for test case execution are missing.
    Details: The tempest.conf file, automatically generated by Rally in Functest, does not contain all the needed parameters or some parameters are not set properly. The tempest.conf file is located in the directory ‘/home/opnfv/.rally/verification/verifier-<UUID>/for-deployment-<UUID>’ in the Functest Docker container. Use the “rally deployment list” command in order to check the UUID of the current deployment.

When a Tempest test case fails, the captured traceback and possibly also the related REST API requests/responses are output to the console. More detailed debug information can be found in the tempest.log file stored in the related Rally deployment folder.
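
For example, to locate and inspect these files inside the Functest Docker container (the UUIDs are placeholders taken from the ‘rally deployment list’ output; exact locations may vary slightly between versions):

rally deployment list
cat /home/opnfv/.rally/verification/verifier-<UUID>/for-deployment-<UUID>/tempest.conf
less /home/opnfv/.rally/verification/verifier-<UUID>/for-deployment-<UUID>/tempest.log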

Rally

The same error causes mentioned above for the Tempest test cases may also lead to errors in Rally.

Possible scenarios are:
  • authenticate
  • glance
  • cinder
  • heat
  • keystone
  • neutron
  • nova
  • quotas
  • requests
  • vm

To know more about what these scenarios are doing, see their definitions in the directory $REPOS_DIR/functest/functest/opnfv_tests/openstack/rally/scenario. For more information about Rally scenario definitions please refer to the official Rally documentation. [3]

To check any possible problems with Rally, the logs are stored under /home/opnfv/functest/results/rally/ in the Functest Docker container.

Controllers
Opendaylight

If the Basic Restconf test suite fails, check that the ODL controller is reachable and its Restconf module has been installed.

If the Neutron Reachability test fails, verify that the modules implementing Neutron requirements have been properly installed.

If any of the other test cases fails, check that Neutron and ODL have been correctly configured to work together. Check Neutron configuration files, accounts, IP addresses, etc.

ONOS

Please refer to the ONOS documentation. ONOSFW User Guide .

Features

Please refer to the dedicated feature user guides for details.

VNF
cloudify_ims

vIMS deployment may fail for several reasons; the most frequent ones are described below:

  • Keystone admin API not reachable: impossible to create the vIMS user and tenant.
  • Impossible to retrieve the admin role id: impossible to create the vIMS user and tenant.
  • Error when uploading the image from OpenStack to Glance: impossible to deploy the VNF.
  • Cinder quota cannot be updated: the default quotas are not sufficient; they are adapted in the script.
  • Impossible to create a volume: the VNF cannot be deployed.
  • SSH connection issue between the Test Docker container and the VM: if the vPing test fails, the vIMS test will fail too.
  • No Internet access from the VM: the VMs of the VNF must have external access to the Internet.
  • No access to the OpenStack API from the VM: the Orchestrator can be installed but the vIMS VNF installation fails.
References

OPNFV main site

Functest page

IRC support chan: #opnfv-functest

QTIP

QTIP Installation & Configuration
Configuration

QTIP is currently delivered and run as a Docker image. Detailed steps about setting up QTIP can be found below.

To use QTIP you should have access to an OpenStack environment, with at least Nova, Neutron, Glance, Keystone and Heat installed.

Installing QTIP using Docker
QTIP docker image

QTIP has a Docker image on Docker Hub. Pull the opnfv/qtip docker image from Docker Hub:

docker pull opnfv/qtip:stable

Verify that opnfv/qtip has been downloaded. It should be listed as an image by running the following command.

docker images
Run and enter the docker instance

1. If you want to run benchmarks:

envs="INSTALLER_TYPE={INSTALLER_TYPE} -e INSTALLER_IP={INSTALLER_IP}"
docker run --name qtip -id -e $envs opnfv/qtip
docker exec -i -t qtip /bin/bash

INSTALLER_TYPE should be one of the OPNFV installers, e.g. apex, compass, daisy, fuel or joid. Currently, QTIP only supports the fuel installer.

INSTALLER_IP is the IP address of the installer that can be accessed by QTIP.

2. If you do not want to run any benchmarks:

docker run --name qtip -id opnfv/qtip
docker exec -i -t qtip /bin/bash

Now you are inside the container; QTIP can be found in /repos/qtip and can be navigated to using the following command.

cd repos/qtip
Environment configuration
Hardware configuration

QTIP does not have specific hardware requirements, and it can run with any OPNFV installer.

Jumphost configuration

Install Docker on the Jumphost; it is used for running the QTIP image.

You can refer to these links:

Ubuntu: https://docs.docker.com/engine/installation/linux/ubuntu/

Centos: https://docs.docker.com/engine/installation/linux/centos/


QTIP User Guide
Overview

QTIP is the project for Platform Performance Benchmarking in OPNFV. It aims to provide users with a simple indicator for performance, supported by comprehensive testing data and a transparent calculation formula.

QTIP introduces a concept called QPI, a.k.a. the QTIP Performance Index, which aims to be a TRUE indicator of performance. TRUE reflects the core value of QPI in four aspects:

  • Transparent: being an open source project, users can inspect all details behind QPI, e.g. formulas, metrics and raw data
  • Reliable: the integrity of QPI is guaranteed by traceability in each step back to the raw test results
  • Understandable: QPI is broken down into section scores and workload scores in the report to help users understand it
  • Extensible: users may create their own QPI by composing the existing metrics in QTIP or extending new metrics
Benchmarks

The built-in benchmarks of QTIP are located in the <package_root>/benchmarks folder:

  • QPI: specifications about how a QPI is calculated and the sources of its metrics
  • metric: performance metrics referred to in QPI, currently categorized by performance testing tool
  • plan: executable benchmarking plan which collects metrics and calculates QPI
CLI User Manual

QTIP consists of a number of benchmarking tools or metrics, grouped under QPI’s. QPI’s map to the different components of a NFVI ecosystem, such as compute, network and storage. Depending on the type of application, a user may group them under plans.

QTIP CLI provides an interface to all of the above components. A help page provides a list of all the commands along with a short description.

qtip [-h|--help]

Typically a complete plan is executed in the target environment. QTIP ships with a number of sample plans. A list of all the available plans can be viewed with:

qtip plan list

To view the details of a specific plan:

qtip plan show <plan_name>

where plan_name is one of those listed from the previous command.

To execute a complete plan:

qtip plan run <plan_name> -p <path_to_result_directory>

QTIP does not limit result storage to a specific directory; a user may specify their own result directory as above. An important thing to remember is to provide the absolute path of the result directory.

mkdir result
qtip plan run <plan_name> -p $PWD/result

Similarly, the same commands can be used for the other two components making up the plans, i.e. QPIs and metrics. For example, in order to run a single metric:

qtip metric run <metric_name> -p $PWD/result

The same can be applied for a QPI.

QTIP also provides a utility to view benchmarking results on the console. One just needs to provide the path where the results are stored. Extending the example above:

qtip report show <metric_name> -p $PWD/result

The debug option helps identify errors by providing a detailed traceback. It can be enabled as follows:

qtip [-d|--debug] plan run <plan_name>
API User Manual

QTIP consists of a number of benchmarking tools or metrics, grouped under QPI’s. QPI’s map to the different components of an NFVI ecosystem, such as compute, network and storage. Depending on the type of application, a user may group them under plans.

QTIP API provides a RESTful interface to all of the above components. User can retrieve list of plans, QPIs and metrics and their individual information.

Running

After installing QTIP, the API server can be run on the local machine using the command qtip-api.

All the resources and their corresponding operation details can be seen at /v1.0/ui on the hosting server (0.0.0.0:5000 for the local machine).

The whole API specification in json format can be seen at /v1.0/swagger.json.

The data models are given below:

  • Plan
  • Metric
  • QPI

Plan:

{
  "name": <plan name>,
  "description": <plan profile>,
  "info": <{plan info}>,
  "config": <{plan configuration}>,
  "QPIs": <[list of qpis]>,
},

Metric:

{
  "name": <metric name>,
  "description": <metric description>,
  "links": <[links with metric information]>,
  "workloads": <[cpu workloads(single_cpu, multi_cpu]>,
},

QPI:

{
  "name": <qpi name>,
  "description": <qpi description>,
  "formula": <formula>,
  "sections": <[list of sections with different metrics and formulaes]>,
}

The API can be described as follows

Plans:

Method Path Description
GET /v1.0/plans Get the list of all plans
GET /v1.0/plans/{name} Get details of the specified plan

Metrics:

Method Path Description
GET /v1.0/metrics Get the list of all metrics
GET /v1.0/metrics/{name} Get details of specified metric

QPIs:

Method Path Description
GET /v1.0/qpis Get the list of all QPIs
GET /v1.0/qpis/{name} Get details of specified QPI
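
For illustration, assuming the API server is running locally on port 5000 as described above, the list endpoints can be exercised with curl:

curl http://127.0.0.1:5000/v1.0/plans
curl http://127.0.0.1:5000/v1.0/metrics
curl http://127.0.0.1:5000/v1.0/qpis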
Note: running the API with the connexion CLI does not require the base path (/v1.0/) in the URL.
Compute Performance Benchmarking

The compute QPI aims to benchmark the compute components of an OPNFV platform. Such components include CPU performance and memory performance.

The compute QPI consists of both synthetic and application specific benchmarks to test compute components.

All the compute benchmarks can be run in the following scenario: on bare-metal machines provisioned by an OPNFV installer (host machines).

Note: The compute benchmark suite contains relatively old benchmarks such as Dhrystone and Whetstone. The suite will be updated with better benchmarks such as Linbench for the OPNFV E release.

Getting started

Notice: All descriptions are based on QTIP container.

Inventory File

QTIP uses Ansible to trigger the benchmark tests. Ansible uses an inventory file to determine what hosts to work against. QTIP can automatically generate an inventory file via the OPNFV installer. Users can also write their own inventory information into /home/opnfv/qtip/hosts. This file is just a text file containing a list of host IP addresses. For example:

[hosts]
10.20.0.11
10.20.0.12
QTIP key Pair

QTIP uses an SSH key pair to connect to remote hosts. When users execute the compute QPI, QTIP will generate a key pair named QtipKey under /home/opnfv/qtip/ and pass the public key to the remote hosts.

If the environment variable CI_DEBUG is set to true, users should delete the key manually. If CI_DEBUG is not set or set to false, QTIP will delete the key from the remote hosts before the execution ends. Please make sure the key is deleted from the remote hosts, otherwise it can introduce a security flaw.
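
A minimal sketch of such a manual clean-up, assuming the default key location described above and an example host from the inventory (10.20.0.11); adjust to your environment:

# remove the QTIP public key from the remote host's authorized_keys
KEY=$(cut -d' ' -f2 /home/opnfv/qtip/QtipKey.pub)
ssh -i /home/opnfv/qtip/QtipKey root@10.20.0.11 "sed -i '\|${KEY}|d' /root/.ssh/authorized_keys"

# then remove the local key pair
rm -f /home/opnfv/qtip/QtipKey /home/opnfv/qtip/QtipKey.pub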

Commands

In a QTIP container, you can run compute QPI by using QTIP CLI:

mkdir result
qtip plan run <plan_name> -p $PWD/result

QTIP generates results in the $PWD/result directory; they are listed under a timestamp-named subdirectory.

You can get more details from userguide/cli.rst.

Metrics

The benchmarks include:

Dhrystone 2.1

Dhrystone is a synthetic benchmark for measuring CPU performance. It uses integer calculations to evaluate CPU capabilities. Both single-CPU and multi-CPU performance are measured.

Dhrystone, however, is a dated benchmark and has some shortcomings. Written in C, it is a small program that doesn’t test the CPU memory subsystem. Additionally, Dhrystone results can be influenced by compiler optimizations and, in some cases, by the hardware configuration.

References: http://www.eembc.org/techlit/datasheets/dhrystone_wp.pdf

Whetstone

Whetstone is a synthetic benchmark to measure CPU floating-point operation performance. Both single-CPU and multi-CPU performance are measured.

Like Dhrystone, Whetstone is a dated benchmark and has shortcomings.

References:

http://www.netlib.org/benchmark/whetstone.c

OpenSSL Speed

OpenSSL Speed can be used to benchmark compute performance of a machine. In QTIP, two OpenSSL Speed benchmarks are incorporated:

  1. RSA signatures/sec signed by a machine
  2. AES 128-bit encryption throughput for a machine for cipher block sizes

References:

https://www.openssl.org/docs/manmaster/apps/speed.html

RAMSpeed

RAMSpeed is used to measure a machine’s memory performance. The problem (array) size is large enough to ensure cache misses so that the main machine memory is used.

INTmem and FLOATmem benchmarks are executed in 4 different scenarios:

  1. Copy: a(i)=b(i)
  2. Add: a(i)=b(i)+c(i)
  3. Scale: a(i)=b(i)*d
  4. Triad: a(i)=b(i)+c(i)*d

INTmem uses integers in these four benchmarks whereas FLOATmem uses floating points for these benchmarks.

References:

http://alasir.com/software/ramspeed/

https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/W51a7ffcf4dfd_4b40_9d82_446ebc23c550/page/Untangling+memory+access+measurements

DPI

nDPI is a modified variant of OpenDPI, an open source Deep Packet Inspection library, that is maintained by ntop. An example application called pcapreader has been developed and is available for use along with nDPI.

A sample .pcap file is passed to the pcapreader application. nDPI classifies traffic in the pcap file into different categories based on string matching. The pcapreader application provides a throughput number for the rate at which traffic was classified, indicating a machine’s computational performance. The test is run 10 times and the average of the obtained numbers is taken.

nDPI may provide inconsistent results and was added in Brahmaputra for experimental purposes.

References:

http://www.ntop.org/products/deep-packet-inspection/ndpi/

http://www.ntop.org/wp-content/uploads/2013/12/nDPI_QuickStartGuide.pdf

Storperf

StorPerf User Guide
StorPerf Container Execution Guide
Planning

There are some ports that the container can expose:

  • 22 for SSHD. Username and password are root/storperf. This is used for CLI access only
  • 5000 for StorPerf ReST API.
  • 8000 for StorPerf’s Graphite Web Server
OpenStack Credentials

You must have your OpenStack Controller environment variables defined and passed to the StorPerf container. The easiest way to do this is to put the rc file contents into a clean file that looks similar to this for V2 authentication:

OS_AUTH_URL=http://10.13.182.243:5000/v2.0
OS_TENANT_ID=e8e64985506a4a508957f931d1800aa9
OS_TENANT_NAME=admin
OS_PROJECT_NAME=admin
OS_USERNAME=admin
OS_PASSWORD=admin
OS_REGION_NAME=RegionOne

For V3 authentication, use the following:

OS_AUTH_URL=http://10.13.182.243:5000/v3
OS_PROJECT_ID=32ae78a844bc4f108b359dd7320463e5
OS_PROJECT_NAME=admin
OS_USER_DOMAIN_NAME=Default
OS_USERNAME=admin
OS_PASSWORD=admin
OS_REGION_NAME=RegionOne
OS_INTERFACE=public
OS_IDENTITY_API_VERSION=3
Additionally, if you want your results published to the common OPNFV Test Results DB, add the following:

TEST_DB_URL=http://testresults.opnfv.org/testapi
Running StorPerf Container

You might want to use the local disk for storage, as the default size of the docker container is only 10 GB. This is done with the -v option, mounting under /opt/graphite/storage/whisper:

mkdir -p ~/carbon
sudo chown 33:33 ~/carbon

The recommended method of running StorPerf is to expose only the ReST and Graphite ports. The command line below shows how to run the container with local disk for the carbon database.

docker run -t --env-file admin-rc -p 5000:5000 -p 8000:8000 -v ~/carbon:/opt/graphite/storage/whisper --name storperf opnfv/storperf
Docker Exec

Instead of exposing port 5022 externally, you can use the exec method in docker. This provides a slightly more secure method of running StorPerf container without having to expose port 22.

If needed, the container can be entered with docker exec. This is not normally required.

docker exec -it storperf bash
Container with SSH

Run the StorPerf container with all ports open and a local disk for result storage. This is not recommended, as the SSH port is exposed.

docker run -t --env-file admin-rc -p 5022:22 -p 5000:5000 -p 8000:8000 -v ~/carbon:/opt/graphite/storage/whisper --name storperf opnfv/storperf

This will then permit ssh to localhost port 5022 for CLI access.

StorPerf Installation Guide
OpenStack Prerequisites

If you do not have an Ubuntu 16.04 image in Glance, you will need to add one. There are scripts in the storperf/ci directory to assist, or you can use the following code snippets:

# Put an Ubuntu Image in glance
wget -q https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.img
openstack image create "Ubuntu 16.04 x86_64" --disk-format qcow2 --public \
    --container-format bare --file ubuntu-16.04-server-cloudimg-amd64-disk1.img

# Create StorPerf flavor
openstack flavor create storperf \
    --id auto \
    --ram 8192 \
    --disk 4 \
    --vcpus 2
Planning

StorPerf is delivered as a Docker container. There are two possible methods for installation in your environment:

  1. Run container on Jump Host
  2. Run container in a VM
Running StorPerf on Jump Host

Requirements:

  • Docker must be installed
  • Jump Host must have access to the OpenStack Controller API
  • Jump Host must have internet connectivity for downloading docker image
  • Enough floating IPs must be available to match your agent count
Running StorPerf in a VM

Requirements:

  • VM has docker installed
  • VM has OpenStack Controller credentials and can communicate with the Controller API
  • VM has internet connectivity for downloading the docker image
  • Enough floating IPs must be available to match your agent count
VM Creation

The following procedure will create the VM in your environment

# Create the StorPerf VM itself.  Here we use the network ID generated by OPNFV FUEL.
ADMIN_NET_ID=`neutron net-list | grep 'admin_internal_net ' | awk '{print $2}'`

nova boot --nic net-id=$ADMIN_NET_ID --flavor m1.small --key-name=StorPerf --image 'Ubuntu 14.04' 'StorPerf Master'

At this point, you may associate a floating IP with the StorPerf master VM.

VM Docker Installation

The following procedure will install Docker on Ubuntu 14.04.

sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
cat << EOF | sudo tee /etc/apt/sources.list.d/docker.list
deb https://apt.dockerproject.org/repo ubuntu-trusty main
EOF

sudo apt-get update
sudo apt-get install -y docker-engine
sudo usermod -aG docker ubuntu
Pulling StorPerf Container
Danube

The tag for the latest stable Danube will be:

docker pull opnfv/storperf:danube.1.0
Colorado

The tag for the latest stable Colorado release is:

docker pull opnfv/storperf:colorado.0.1
Brahmaputra

The tag for the latest stable Brahmaputra release is:

docker pull opnfv/storperf:brahmaputra.1.2
Development

The tag for the latest development version is:

docker pull opnfv/storperf:master
StorPerf Test Execution Guide
Prerequisites

This guide requires StorPerf to be running and have its ReST API accessible. If the ReST API is not running on port 5000, adjust the commands provided here as needed.

Interacting With StorPerf

Once the StorPerf container has been started and the ReST API exposed, you can interact directly with it using the ReST API. StorPerf comes with a Swagger interface that is accessible through the exposed port at:

http://StorPerf:5000/swagger/index.html

The typical test execution follows this pattern:

  1. Configure the environment
  2. Initialize the cinder volumes
  3. Execute one or more performance runs
  4. Delete the environment
Configure The Environment

The following pieces of information are required to prepare the environment:

  • The number of VMs/Cinder volumes to create
  • The Glance image that holds the VM operating system to use. StorPerf has only been tested with Ubuntu 16.04
  • The name of the public network that agents will use
  • The size, in gigabytes, of the Cinder volumes to create

The ReST API is a POST to http://StorPerf:5000/api/v1.0/configurations and takes a JSON payload as follows.

{
  "agent_count": int,
  "agent_image": string,
  "public_network": string,
  "volume_size": int
}

This call will block until the stack is created, at which point it will return the OpenStack heat stack id.
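
A minimal curl sketch of this call; all values below are examples and must match your environment (e.g. the Glance image name and the public network):

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' \
     -d '{"agent_count": 1, "agent_image": "Ubuntu 16.04 x86_64", "public_network": "admin_floating_net", "volume_size": 2}' \
     http://StorPerf:5000/api/v1.0/configurations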

Initialize the Cinder Volumes

Before executing a test run for the purpose of measuring performance, it is necessary to fill the Cinder volume with random data. Failure to execute this step can result in meaningless numbers, especially for read performance. Most Cinder drivers are smart enough to know what blocks contain data, and which do not. Uninitialized blocks return “0” immediately without actually reading from the volume.

Initiating the data fill looks the same as a regular performance test, but uses the special workload called “_warm_up”. StorPerf will never push _warm_up data to the OPNFV Test Results DB, nor will it terminate the run on steady state. It is guaranteed to run to completion, which fills 100% of the volume with random data.

The ReST API is a POST to http://StorPerf:5000/api/v1.0/jobs and takes a JSON payload as follows.

{
   "workload": "_warm_up"
}

This will return a job ID as follows.

{
  "job_id": "edafa97e-457e-4d3d-9db4-1d6c0fc03f98"
}

This job ID can be used to query the state to determine when it has completed. See the section on querying jobs for more information.
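
Putting this together, the warm-up job can be submitted with curl as follows (host name and port as used in the rest of this guide):

curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' \
     -d '{"workload": "_warm_up"}' \
     http://StorPerf:5000/api/v1.0/jobs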

Execute a Performance Run

Performance runs can execute either a single workload, or iterate over a matrix of workload types, block sizes and queue depths.

Workload Types

  • rr: Read, Random. 100% read of random blocks
  • rs: Read, Sequential. 100% read of sequential blocks of data
  • rw: Read / Write Mix, Random. 70% random read, 30% random write
  • wr: Write, Random. 100% write of random blocks
  • ws: Write, Sequential. 100% write of sequential blocks
Block Sizes

A comma delimited list of the different block sizes to use when reading and writing data. Note: Some Cinder drivers (such as Ceph) cannot support block sizes larger than 16k (16384).

Queue Depths

A comma delimited list of the different queue depths to use when reading and writing data. The queue depth parameter causes FIO to keep this many I/O requests outstanding at one time. It is used to simulate traffic patterns on the system. For example, a queue depth of 4 would simulate 4 processes constantly creating I/O requests.

Deadline

The deadline is the maximum amount of time in minutes for a workload to run. If steady state has not been reached by the deadline, the workload will terminate and that particular run will be marked as not having reached steady state. Any remaining workloads will continue to execute in order.

A job request combining these parameters could look as follows:

{
   "block_sizes": "2048,16384",
   "deadline": 20,
   "queue_depths": "2,4",
   "workload": "wr,rr,rw"
}
Metadata

A job can have metadata associated with it for tagging. The following metadata is required in order to push results to the OPNFV Test Results DB:

"metadata": {
    "disk_type": "HDD or SDD",
    "pod_name": "OPNFV Pod Name",
    "scenario_name": string,
    "storage_node_count": int,
    "version": string,
    "build_tag": string,
    "test_case": "snia_steady_state"
}
Query Jobs Information

By issuing a GET to the job API http://StorPerf:5000/api/v1.0/jobs?job_id=<ID>, you can fetch information about the job as follows:

  • &type=status: to report on the status of the job.
  • &type=metrics: to report on the collected metrics.
  • &type=metadata: to report back any metadata sent with the job ReST API
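
For example, with curl (the job ID is the one returned when the job was submitted):

curl "http://StorPerf:5000/api/v1.0/jobs?job_id=<ID>&type=status"
curl "http://StorPerf:5000/api/v1.0/jobs?job_id=<ID>&type=metrics"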
Status

The Status field can be:

  • Running to indicate the job is still in progress, or
  • Completed to indicate the job is done. This could be either normal completion or manual termination via an HTTP DELETE call.

Workloads can have a value of:

  • Pending to indicate the workload has not yet started,
  • Running to indicate this is the active workload, or
  • Completed to indicate this workload has completed.

This is an example of a type=status call.

{
  "Status": "Running",
  "TestResultURL": null,
  "Workloads": {
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.1.block-size.16384": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.1.block-size.4096": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.1.block-size.512": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.4.block-size.16384": "Running",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.4.block-size.4096": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.4.block-size.512": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.8.block-size.16384": "Completed",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.8.block-size.4096": "Pending",
    "eeb2e587-5274-4d2f-ad95-5c85102d055e.ws.queue-depth.8.block-size.512": "Pending"
  }
}
Metrics

Metrics can be queried at any time during or after the completion of a run. Note that the metrics show up only after the first interval has passed, and are subject to change until the job completes.

This is a sample of a type=metrics call.

{
  "rw.queue-depth.1.block-size.512.read.bw": 52.8,
  "rw.queue-depth.1.block-size.512.read.iops": 106.76199999999999,
  "rw.queue-depth.1.block-size.512.read.lat.mean": 93.176,
  "rw.queue-depth.1.block-size.512.write.bw": 22.5,
  "rw.queue-depth.1.block-size.512.write.iops": 45.760000000000005,
  "rw.queue-depth.1.block-size.512.write.lat.mean": 21764.184999999998
}
Abort a Job

Issuing an HTTP DELETE to the job api http://StorPerf:5000/api/v1.0/jobs will force the termination of the whole job, regardless of how many workloads remain to be executed.

curl -X DELETE --header 'Accept: application/json' http://StorPerf:5000/api/v1.0/jobs
Delete the Environment

After you are done testing, you can have StorPerf delete the Heat stack by issuing an HTTP DELETE to the configurations API.

curl -X DELETE --header 'Accept: application/json' http://StorPerf:5000/api/v1.0/configurations

You may also want to delete an environment, and then create a new one with a different number of VMs/Cinder volumes to test the impact of the number of VMs in your environment.

VSPERF

VSPERF

VSPERF is an OPNFV testing project.

VSPERF provides an automated test-framework and comprehensive test suite based on industry standards for measuring data-plane performance of Telco NFV switching technologies as well as physical and virtual network interfaces (NFVI). The VSPERF architecture is switch and traffic generator agnostic and provides full control of software component versions and configurations as well as test-case customization.

The Danube release of VSPERF includes improvements in documentation and capabilities. This includes additional test-cases such as RFC 5481 Latency test and RFC-2889 address-learning-rate test. Hardware traffic generator support is now provided for Spirent and Xena in addition to Ixia. The Moongen software traffic generator is also now fully supported. VSPERF can be used in a variety of modes for configuration and setup of the network and/or for control of the test-generator and test execution.

VSPERF User Guide
1. Installing vswitchperf
Downloading vswitchperf

Vswitchperf can be downloaded from its official git repository, which is hosted by OPNFV. It is necessary to install git on your DUT before downloading vswitchperf. Installation of git is specific to the packaging system used by the Linux OS installed on the DUT.

Example of installing the git package and its dependencies:

  • in case of OS based on RedHat Linux:

    sudo yum install git
    
  • in case of Ubuntu or Debian:

    sudo apt-get install git
    

After git is successfully installed on the DUT, vswitchperf can be downloaded as follows:

git clone http://git.opnfv.org/vswitchperf

The last command will create a directory vswitchperf with a local copy of the vswitchperf repository.

Supported Operating Systems
  • CentOS 7.3
  • Fedora 24 (kernel 4.8 requires DPDK 16.11 and newer)
  • Fedora 25 (kernel 4.9 requires DPDK 16.11 and newer)
  • openSUSE 42.2
  • RedHat 7.2 Enterprise Linux
  • RedHat 7.3 Enterprise Linux
  • Ubuntu 14.04
  • Ubuntu 16.04
  • Ubuntu 16.10 (kernel 4.8 requires DPDK 16.11 and newer)
Supported vSwitches

The vSwitch must support OpenFlow 1.3 or greater.

  • Open vSwitch
  • Open vSwitch with DPDK support
  • TestPMD application from DPDK (supports p2p and pvp scenarios)
Supported Hypervisors
  • Qemu version 2.3 or greater (version 2.5.0 is recommended)
Supported VNFs

In theory, it is possible to use any VNF image that is compatible with the supported hypervisor. However, such a VNF must ensure that the appropriate number of network interfaces is configured and that traffic is properly forwarded among them. For new vswitchperf users it is recommended to start with the official vloop-vnf image, which is maintained by the vswitchperf community.

vloop-vnf

The official VM image is called vloop-vnf and is available for free download from the OPNFV artifactory. This image is based on the Ubuntu Linux distribution and supports the following applications for traffic forwarding:

  • DPDK testpmd
  • Linux Bridge
  • Custom l2fwd module

The vloop-vnf can be downloaded to DUT, for example by wget:

wget http://artifacts.opnfv.org/vswitchperf/vnf/vloop-vnf-ubuntu-14.04_20160823.qcow2

NOTE: If wget is not installed on your DUT, you can install it on an RPM based system with sudo yum install wget or on a DEB based system with sudo apt-get install wget.

Changelog of vloop-vnf:

Installation

The test suite requires Python 3.3 or newer and relies on a number of other system and python packages. These need to be installed for the test suite to function.

Installation of the required packages, preparation of the Python 3 virtual environment and compilation of OVS, DPDK and QEMU are performed by the script systems/build_base_machine.sh. It should be executed under the user account that will be used for vsperf execution.

NOTE: Password-less sudo access must be configured for the given user account before the script is executed.

$ cd systems
$ ./build_base_machine.sh

NOTE: You don't need to go into any of the systems subdirectories; simply run the top level build_base_machine.sh and your OS will be detected automatically.

The script build_base_machine.sh will install all the vsperf dependencies in terms of system packages, Python 3.x and required Python modules. In case of CentOS 7 or RHEL, it will install Python 3.3 from an additional repository provided by Software Collections. The installation script will also use virtualenv to create a vsperf virtual environment, which is isolated from the default Python environment and resides in a directory called vsperfenv in $HOME. This ensures that the system wide Python installation is not modified or broken by the VSPERF installation. The complete list of Python packages installed inside the virtualenv can be found in the file requirements.txt, which is located in the vswitchperf repository.

NOTE: For RHEL 7.3 Enterprise and CentOS 7.3 OVS Vanilla is not built from upstream source due to kernel incompatibilities. Please see the instructions in the vswitchperf_design document for details on configuring OVS Vanilla for binary package usage.

Using vswitchperf

You will need to activate the virtual environment every time you start a new shell session. Its activation is specific to your OS:

  • CentOS 7 and RHEL

    $ scl enable python33 bash
    $ source $HOME/vsperfenv/bin/activate
    
  • Fedora and Ubuntu

    $ source $HOME/vsperfenv/bin/activate
    

After the virtual environment is configured, VSPERF can be used. For example:

(vsperfenv) $ cd vswitchperf
(vsperfenv) $ ./vsperf --help
Gotcha

If you see the following error during environment activation:

$ source $HOME/vsperfenv/bin/activate
Badly placed ()'s.

then check what type of shell you are using:

$ echo $SHELL
/bin/tcsh

See what scripts are available in $HOME/vsperfenv/bin

$ ls $HOME/vsperfenv/bin/
activate          activate.csh      activate.fish     activate_this.py

source the appropriate script

$ source bin/activate.csh
Working Behind a Proxy

If you’re behind a proxy, you’ll likely want to configure this before running any of the above. For example:

export http_proxy=proxy.mycompany.com:123
export https_proxy=proxy.mycompany.com:123
Hugepage Configuration

Systems running vsperf with DPDK and/or with tests involving guests must configure enough hugepages to support these configurations. It is recommended to configure 1GB hugepages as the pagesize.

The amount of hugepages needed depends on your configuration files in vsperf. Each guest image requires 2048 MB by default according to the default settings in the 04_vnf.conf file.

GUEST_MEMORY = ['2048']

The dpdk startup parameters also require an amount of hugepages depending on your configuration in the 02_vswitch.conf file.

VSWITCHD_DPDK_ARGS = ['-c', '0x4', '-n', '4', '--socket-mem 1024,1024']
VSWITCHD_DPDK_CONFIG = {
    'dpdk-init' : 'true',
    'dpdk-lcore-mask' : '0x4',
    'dpdk-socket-mem' : '1024,1024',
}

NOTE: Option VSWITCHD_DPDK_ARGS is used for vswitchd, which supports --dpdk parameter. In recent vswitchd versions, option VSWITCHD_DPDK_CONFIG is used to configure vswitchd via ovs-vsctl calls.

With the --socket-mem argument set to use 1 hugepage on the specified sockets as seen above, the configuration will need 10 hugepages total to run all tests within vsperf if the pagesize is set correctly to 1GB.

VSPerf will verify that the required amount of hugepages is free before executing a test environment. If not enough free hugepages are available, test initialization will fail and testing will stop.

NOTE: In some instances on a test failure dpdk resources may not release hugepages used in dpdk configuration. It is recommended to configure a few extra hugepages to prevent a false detection by VSPerf that not enough free hugepages are available to execute the test environment. Normally dpdk would use previously allocated hugepages upon initialization.

Depending on your OS selection, the configuration of hugepages may vary. Please refer to your OS documentation to set hugepages correctly. It is recommended to set the required amount of hugepages to be allocated by default on reboots.
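As a hypothetical illustration only, a GRUB based system could reserve 16 1GB hugepages at boot by extending the kernel command line; the exact files and the command used to regenerate the boot configuration depend on your distribution:

# /etc/default/grub - append to the existing kernel command line
GRUB_CMDLINE_LINUX="... default_hugepagesz=1G hugepagesz=1G hugepages=16"

# regenerate the grub configuration and reboot, e.g. on RPM based systems:
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg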

Information on hugepage requirements for dpdk can be found at http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html

You can review your hugepage amounts by executing the following command

cat /proc/meminfo | grep Huge

If no hugepages are available, vsperf will try to allocate some automatically. Allocation is controlled by the HUGEPAGE_RAM_ALLOCATION configuration parameter in the 02_vswitch.conf file. The default is 2GB, resulting in either 2 1GB hugepages or 1024 2MB hugepages.

2. Upgrading vswitchperf
Generic

If VSPERF was cloned from its git repository, then it is easy to upgrade it to the newest stable version or to the development version.

You can get a list of stable releases with a git command. It is necessary to update the local git repository first.

NOTE: Git commands must be executed from the directory where the VSPERF repository was cloned, e.g. vswitchperf.

Update of local git repository:

$ git pull

List of stable releases:

$ git tag

brahmaputra.1.0
colorado.1.0
colorado.2.0
colorado.3.0
danube.1.0

You can then select which stable release should be used. For example, to select danube.1.0:

$ git checkout danube.1.0

Development version of VSPERF can be selected by:

$ git checkout master
Colorado to Danube upgrade notes
Obsoleted features

Support for the vHost Cuse interface has been removed in the Danube release. This means that it is no longer possible to select QemuDpdkVhostCuse as a VNF. The option QemuDpdkVhostUser should be used instead. Please check your configuration files and the definitions of your testcases for any occurrence of:

VNF = "QemuDpdkVhostCuse"

or

"VNF" : "QemuDpdkVhostCuse"

In case that QemuDpdkVhostCuse is found, it must be modified to QemuDpdkVhostUser.

NOTE: If execution of VSPERF is automated by scripts (e.g. for CI purposes), then these scripts must be checked and updated too. This means that any occurrence of:

./vsperf --vnf QemuDpdkVhostCuse

must be updated to:

./vsperf --vnf QemuDpdkVhostUser
Configuration

Several configuration changes were introduced during Danube release. The most important changes are discussed below.

Paths to DPDK, OVS and QEMU

VSPERF uses external tools for proper testcase execution, so it is important to configure the paths to these tools correctly. If the tools were installed by the installation scripts and are located inside the ./src directory inside the VSPERF home, then no changes are needed. On the other hand, if the path settings were changed by a custom configuration file, then the configuration must be updated accordingly. Please check your configuration files for the following configuration options:

OVS_DIR
OVS_DIR_VANILLA
OVS_DIR_USER
OVS_DIR_CUSE

RTE_SDK_USER
RTE_SDK_CUSE

QEMU_DIR
QEMU_DIR_USER
QEMU_DIR_CUSE
QEMU_BIN

If any of these options is defined, then the configuration must be updated. All paths to the tools are now stored inside the PATHS dictionary. Please refer to the Configuration of PATHS dictionary section and update your configuration where necessary.
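As a hypothetical illustration (the exact keys are described in the Configuration of PATHS dictionary section and in the default configuration files, so please verify them there), a custom QEMU path previously defined via QEMU_DIR would now be expressed through the PATHS dictionary:

# Colorado style custom configuration (obsolete):
# QEMU_DIR = '/opt/qemu/x86_64-softmmu'

# Danube style - assumed PATHS layout, verify against your ./conf files:
PATHS['qemu']['src']['path'] = '/opt/qemu/'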

Configuration change via CLI

In previous releases it was possible to modify selected configuration options (mostly VNF specific) via the command line interface, i.e. by the --test-params argument. This concept has been generalized in the Danube release and it is now possible to modify any configuration parameter via the CLI or via the Parameters section of the testcase definition. The old configuration options are obsolete and the configuration parameter name must now be specified in the same form as it is defined inside the configuration file, i.e. in uppercase. Please refer to Overriding values defined in configuration files for additional details.

NOTE: If execution of VSPERF is automated by scripts (e.g. for CI purposes), then these scripts must be checked and updated too. This means that any occurrence of

guest_loopback
vanilla_tgen_port1_ip
vanilla_tgen_port1_mac
vanilla_tgen_port2_ip
vanilla_tgen_port2_mac
tunnel_type

must be changed to the uppercase form, and the data types of the entered values must match the data types of the original values from the configuration files.
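For example, a hypothetical update of an automated script could look as follows (the old lowercase form is shown for comparison; GUEST_LOOPBACK expects a Python list, as shown later in this guide):

# Colorado style (obsolete):
./vsperf --test-params "guest_loopback=testpmd" $TESTNAME

# Danube style - uppercase parameter name and matching data type:
./vsperf --test-params "GUEST_LOOPBACK=['testpmd']" $TESTNAME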

If guest_nic1_name or guest_nic2_name was changed, then the new GUEST_NICS dictionary must be modified accordingly. Please see Configuration of GUEST options and conf/04_vnf.conf for additional details.

Traffic configuration via CLI

In previous releases it was possible to modify selected attributes of the generated traffic via the command line interface. This concept has been enhanced in the Danube release and it is now possible to modify all traffic specific options via the CLI or via the TRAFFIC dictionary in the configuration file. A detailed description is available in the Configuration of TRAFFIC dictionary section of the documentation.

Please check your automated scripts for VSPERF execution for the following CLI parameters and update them according to the documentation (an example of such an update follows the list):

bidir
duration
frame_rate
iload
lossrate
multistream
pkt_sizes
pre-installed_flows
rfc2544_tests
stream_type
traffic_type
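For example, traffic duration and packet sizes, previously set via the duration and pkt_sizes options, are now controlled by uppercase configuration parameters (a hypothetical update; please consult the documentation for the exact mapping of each option):

# Colorado style (obsolete):
$ ./vsperf --test-params "duration=10;pkt_sizes=(64,)" $TESTNAME

# Danube style:
$ ./vsperf --test-params "TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(64,)" $TESTNAME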
3. ‘vsperf’ Traffic Gen Guide
Overview

VSPERF supports the following traffic generators:

  • Dummy (a placeholder allowing the use of an external, manually driven traffic generator)
  • Ixia (IxNetwork)
  • Spirent TestCenter
  • Xena Networks
  • MoonGen

To see the list of traffic gens from the cli:

$ ./vsperf --list-trafficgens

This guide provides the details of how to install and configure the various traffic generators.

Background Information

The traffic default configuration can be found in conf/03_traffic.conf, and is configured as follows:

TRAFFIC = {
    'traffic_type' : 'rfc2544_throughput',
    'frame_rate' : 100,
    'bidir' : 'True',  # will be passed as string in title format to tgen
    'multistream' : 0,
    'stream_type' : 'L4',
    'pre_installed_flows' : 'No',           # used by vswitch implementation
    'flow_type' : 'port',                   # used by vswitch implementation

    'l2': {
        'framesize': 64,
        'srcmac': '00:00:00:00:00:00',
        'dstmac': '00:00:00:00:00:00',
    },
    'l3': {
        'proto': 'udp',
        'srcip': '1.1.1.1',
        'dstip': '90.90.90.90',
    },
    'l4': {
        'srcport': 3000,
        'dstport': 3001,
    },
    'vlan': {
        'enabled': False,
        'id': 0,
        'priority': 0,
        'cfi': 0,
    },
}

The framesize parameter can be overridden from the configuration files by adding the following to your custom configuration file 10_custom.conf:

TRAFFICGEN_PKT_SIZES = (64, 128,)

OR from the commandline:

$ ./vsperf --test-params "TRAFFICGEN_PKT_SIZES=(x,y)" $TESTNAME

You can also modify the traffic transmission duration and the number of tests run by the traffic generator by extending the example commandline above to:

$ ./vsperf --test-params "TRAFFICGEN_PKT_SIZES=(x,y);TRAFFICGEN_DURATION=10;" \
                         "TRAFFICGEN_RFC2544_TESTS=1" $TESTNAME
Dummy

The Dummy traffic generator can be used to test VSPERF installation or to demonstrate VSPERF functionality at DUT without connection to a real traffic generator.

You can also use the Dummy generator if your external traffic generator is not supported by VSPERF. In that case you can use VSPERF to set up your test scenario and then transmit the traffic manually. After the transmission is completed, you can enter values for all collected metrics and VSPERF will use them to generate the final reports.

Setup

To select the Dummy generator please add the following to your custom configuration file 10_custom.conf.

TRAFFICGEN = 'Dummy'

OR run vsperf with the --trafficgen argument

$ ./vsperf --trafficgen Dummy $TESTNAME

Where $TESTNAME is the name of the vsperf test you would like to run. This will set up the vSwitch and the VNF (if one is part of your test), print the traffic configuration and prompt you to transmit traffic when the setup is complete.

Please send 'continuous' traffic with the following stream config:
30mS, 90mpps, multistream False
and the following flow config:
{
    "flow_type": "port",
    "l3": {
        "srcip": "1.1.1.1",
        "proto": "tcp",
        "dstip": "90.90.90.90"
    },
    "traffic_type": "rfc2544_continuous",
    "multistream": 0,
    "bidir": "True",
    "vlan": {
        "cfi": 0,
        "priority": 0,
        "id": 0,
        "enabled": false
    },
    "frame_rate": 90,
    "l2": {
        "dstport": 3001,
        "srcport": 3000,
        "dstmac": "00:00:00:00:00:00",
        "srcmac": "00:00:00:00:00:00",
        "framesize": 64
    }
}
What was the result for 'frames tx'?

When your traffic generator has completed traffic transmission and provided the results please input these at the VSPERF prompt. VSPERF will try to verify the input:

Is '$input_value' correct?

Please answer with y OR n.

VSPERF will ask you to provide a value for each of the collected metrics. The list of metrics can be found at traffic-type-metrics. Finally, vsperf will print out the results for your test and generate the appropriate logs and report files.

Metrics collected for supported traffic types

Below is a list of the metrics collected by VSPERF for each of the supported traffic types.

RFC2544 Throughput and Continuous:

  • frames tx
  • frames rx
  • min latency
  • max latency
  • avg latency
  • frameloss

RFC2544 Back2back:

  • b2b frames
  • b2b frame loss %
Dummy result pre-configuration

With the Dummy traffic generator it is possible to pre-configure the test results. This is useful for the creation of demo testcases, which do not require a real traffic generator. Such a testcase can be run by any user and it will still generate all reports and result files.

Result values can be specified within the TRAFFICGEN_DUMMY_RESULTS dictionary, where each of the collected metrics must be properly defined. Please check the list of traffic-type-metrics.

Dictionary with dummy results can be passed by CLI argument --test-params or specified in Parameters section of testcase definition.

Example of testcase execution with dummy results defined by CLI argument:

$ ./vsperf back2back --trafficgen Dummy --test-params \
  "TRAFFICGEN_DUMMY_RESULTS={'b2b frames':'3000','b2b frame loss %':'0.0'}"

Example of testcase definition with pre-configured dummy results:

{
    "Name": "back2back",
    "Traffic Type": "rfc2544_back2back",
    "Deployment": "p2p",
    "biDirectional": "True",
    "Description": "LTD.Throughput.RFC2544.BackToBackFrames",
    "Parameters" : {
        'TRAFFICGEN_DUMMY_RESULTS' : {'b2b frames':'3000','b2b frame loss %':'0.0'}
    },
},

NOTE: Pre-configured results will be applied only when the Dummy traffic generator is actually in use. Otherwise the option TRAFFICGEN_DUMMY_RESULTS will be ignored.

Ixia

VSPERF can use both IxNetwork and IxExplorer TCL servers to control an Ixia chassis. However, usage of the IxNetwork TCL server is the preferred option. The following sections describe the installation and configuration of the IxNetwork components used by VSPERF.

Installation

On the system under the test you need to install IxNetworkTclClient$(VER_NUM)Linux.bin.tgz.

On the IXIA client software system you need to install IxNetwork TCL server. After its installation you should configure it as follows:

  1. Find the IxNetwork TCL server app (start -> All Programs -> IXIA -> IxNetwork -> IxNetwork_$(VER_NUM) -> IxNetwork TCL Server)

  2. Right click on IxNetwork TCL Server and select Properties. Under the Shortcut tab, in the Target field, make sure there is the argument "-tclport xxxx", where xxxx is your port number (take note of this port number as you will need it for the 10_custom.conf file).

  3. Hit Ok and start the TCL server application

VSPERF configuration

There are several configuration options specific to the IxNetwork traffic generator from IXIA. It is essential to set them correctly before VSPERF is executed for the first time.

A detailed description of the options follows; an example configuration snippet is shown after the list:

  • TRAFFICGEN_IXNET_MACHINE - IP address of server, where IxNetwork TCL Server is running
  • TRAFFICGEN_IXNET_PORT - PORT, where IxNetwork TCL Server is accepting connections from TCL clients
  • TRAFFICGEN_IXNET_USER - username, which will be used during communication with IxNetwork TCL Server and IXIA chassis
  • TRAFFICGEN_IXIA_HOST - IP address of IXIA traffic generator chassis
  • TRAFFICGEN_IXIA_CARD - identification of card with dedicated ports at IXIA chassis
  • TRAFFICGEN_IXIA_PORT1 - identification of the first dedicated port at TRAFFICGEN_IXIA_CARD at IXIA chassis; VSPERF uses two separated ports for traffic generation. In case of unidirectional traffic, it is essential to correctly connect 1st IXIA port to the 1st NIC at DUT, i.e. to the first PCI handle from WHITELIST_NICS list. Otherwise traffic may not be able to pass through the vSwitch.
  • TRAFFICGEN_IXIA_PORT2 - identification of the second dedicated port at TRAFFICGEN_IXIA_CARD at IXIA chassis; VSPERF uses two separated ports for traffic generation. In case of unidirectional traffic, it is essential to correctly connect 2nd IXIA port to the 2nd NIC at DUT, i.e. to the second PCI handle from WHITELIST_NICS list. Otherwise traffic may not be able to pass through the vSwitch.
  • TRAFFICGEN_IXNET_LIB_PATH - path to the DUT specific installation of IxNetwork TCL API
  • TRAFFICGEN_IXNET_TCL_SCRIPT - name of the TCL script, which VSPERF will use for communication with IXIA TCL server
  • TRAFFICGEN_IXNET_TESTER_RESULT_DIR - folder accessible from IxNetwork TCL server, where test results are stored, e.g. c:/ixia_results; see test-results-share
  • TRAFFICGEN_IXNET_DUT_RESULT_DIR - directory accessible from the DUT, where test results from IxNetwork TCL server are stored, e.g. /mnt/ixia_results; see test-results-share
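All values in the following hypothetical 10_custom.conf snippet are placeholders and must be adjusted to match your lab and IxNetwork installation:

TRAFFICGEN = 'IxNet'
TRAFFICGEN_IXNET_MACHINE = '10.0.0.10'     # host running the IxNetwork TCL Server
TRAFFICGEN_IXNET_PORT = '8009'             # the -tclport value configured above
TRAFFICGEN_IXNET_USER = 'vsperf'
TRAFFICGEN_IXIA_HOST = '10.0.0.20'         # IXIA chassis
TRAFFICGEN_IXIA_CARD = '1'
TRAFFICGEN_IXIA_PORT1 = '1'
TRAFFICGEN_IXIA_PORT2 = '2'
TRAFFICGEN_IXNET_LIB_PATH = '/opt/ixnet/lib/IxTclNetwork'
TRAFFICGEN_IXNET_TCL_SCRIPT = 'ixnetrfc2544.tcl'
TRAFFICGEN_IXNET_TESTER_RESULT_DIR = 'c:/ixia_results'
TRAFFICGEN_IXNET_DUT_RESULT_DIR = '/mnt/ixia_results'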
Test results share

VSPERF is not able to retrieve test results via the TCL API directly. Instead, all test results are stored on the IxNetwork TCL server, in the folder defined by the TRAFFICGEN_IXNET_TESTER_RESULT_DIR configuration parameter. The content of this folder must be shared (e.g. via the samba protocol) between the TCL server and the DUT, where VSPERF is executed. VSPERF expects that the test results will be available in the directory configured by the TRAFFICGEN_IXNET_DUT_RESULT_DIR configuration parameter.

Example of sharing configuration:

  • Create a new folder at IxNetwork TCL server machine, e.g. c:\ixia_results

  • Modify sharing options of ixia_results folder to share it with everybody

  • Create a new directory at DUT, where shared directory with results will be mounted, e.g. /mnt/ixia_results

  • Update your custom VSPERF configuration file as follows:

    TRAFFICGEN_IXNET_TESTER_RESULT_DIR = 'c:/ixia_results'
    TRAFFICGEN_IXNET_DUT_RESULT_DIR = '/mnt/ixia_results'
    

    NOTE: It is essential to use slashes ‘/’ also in path configured by TRAFFICGEN_IXNET_TESTER_RESULT_DIR parameter.

  • Install cifs-utils package.

    e.g. at rpm based Linux distribution:

    yum install cifs-utils
    
  • Mount shared directory, so VSPERF can access test results.

    e.g. by executing the mount command below (a corresponding record can also be added into /etc/fstab)

    mount -t cifs //_TCL_SERVER_IP_OR_FQDN_/ixia_results /mnt/ixia_results
          -o file_mode=0777,dir_mode=0777,nounix
    

It is recommended to verify that any new file inserted into the c:/ixia_results folder is visible on the DUT inside the /mnt/ixia_results directory.
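If you prefer a permanent mount, a hypothetical /etc/fstab record equivalent to the mount command above could look like this:

//_TCL_SERVER_IP_OR_FQDN_/ixia_results  /mnt/ixia_results  cifs  file_mode=0777,dir_mode=0777,nounix  0  0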

Spirent Setup

Spirent installation files and instructions are available on the Spirent support website at:

http://support.spirent.com

Select a version of the Spirent TestCenter software to utilize. The instructions below use Spirent TestCenter v4.57 as an example; substitute the appropriate version in place of 'v4.57' in the examples below.

On the CentOS 7 System

Download and install the following:

Spirent TestCenter Application, v4.57 for 64-bit Linux Client

Spirent Virtual Deployment Service (VDS)

Spirent VDS is required for both TestCenter hardware and virtual chassis in the vsperf environment. For installation, select the version that matches the Spirent TestCenter Application version. For v4.57, the matching VDS version is 1.0.55. Download either the ova (VMware) or qcow2 (QEMU) image and create a VM with it. Initialize the VM according to Spirent installation instructions.

Using Spirent TestCenter Virtual (STCv)

STCv is available in both ova (VMware) and qcow2 (QEMU) formats. For VMware, download:

Spirent TestCenter Virtual Machine for VMware, v4.57 for Hypervisor - VMware ESX.ESXi

Virtual test port performance is affected by the hypervisor configuration. For best practice results in deploying STCv, the following is suggested:

  • Create a single VM with two test ports rather than two VMs with one port each
  • Set STCv in DPDK mode
  • Give STCv 2*n + 1 cores, where n = the number of ports. For vsperf, cores = 5.
  • Turning off hyperthreading and pinning these cores will improve performance
  • Give STCv 2 GB of RAM

To get the highest performance and accuracy, Spirent TestCenter hardware is recommended. vsperf can run with either type of test port.

Using STC REST Client

The stcrestclient package provides the stchttp.py ReST API wrapper module. This allows simple function calls, nearly identical to those provided by StcPython.py, to be used to access TestCenter server sessions via the STC ReST API. Basic ReST functionality is provided by the resthttp module, and may be used for writing ReST clients independent of STC.

To use REST interface, follow the instructions in the Project page to install the package. Once installed, the scripts named with ‘rest’ keyword can be used. For example: testcenter-rfc2544-rest.py can be used to run RFC 2544 tests using the REST interface.

Configuration:
  1. The lab server and license server addresses. These parameters apply to all the tests and are mandatory for all tests:
TRAFFICGEN_STC_LAB_SERVER_ADDR = " "
TRAFFICGEN_STC_LICENSE_SERVER_ADDR = " "
TRAFFICGEN_STC_PYTHON2_PATH = " "
TRAFFICGEN_STC_TESTCENTER_PATH = " "
TRAFFICGEN_STC_TEST_SESSION_NAME = " "
TRAFFICGEN_STC_CSV_RESULTS_FILE_PREFIX = " "
  2. For RFC2544 tests, the following parameters are mandatory:
TRAFFICGEN_STC_EAST_CHASSIS_ADDR = " "
TRAFFICGEN_STC_EAST_SLOT_NUM = " "
TRAFFICGEN_STC_EAST_PORT_NUM = " "
TRAFFICGEN_STC_EAST_INTF_ADDR = " "
TRAFFICGEN_STC_EAST_INTF_GATEWAY_ADDR = " "
TRAFFICGEN_STC_WEST_CHASSIS_ADDR = ""
TRAFFICGEN_STC_WEST_SLOT_NUM = " "
TRAFFICGEN_STC_WEST_PORT_NUM = " "
TRAFFICGEN_STC_WEST_INTF_ADDR = " "
TRAFFICGEN_STC_WEST_INTF_GATEWAY_ADDR = " "
TRAFFICGEN_STC_RFC2544_TPUT_TEST_FILE_NAME
  3. RFC2889 tests: Currently, the forwarding, address-caching, and address-learning-rate tests of RFC2889 are supported. The testcenter-rfc2889-rest.py script implements the RFC2889 tests. The configuration for RFC2889 involves test-case definition and parameter definition, as described below. New result constants, as shown below, are added to support these tests.

Example of testcase definition for RFC2889 tests:

{
    "Name": "phy2phy_forwarding",
    "Deployment": "p2p",
    "Description": "LTD.Forwarding.RFC2889.MaxForwardingRate",
    "Parameters" : {
        "TRAFFIC" : {
            "traffic_type" : "rfc2889_forwarding",
        },
    },
}

For RFC2889 tests, specifying the locations for the monitoring ports is mandatory. Necessary parameters are:

Other configurations are:

TRAFFICGEN_STC_RFC2889_MIN_LR = 1488
TRAFFICGEN_STC_RFC2889_MAX_LR = 14880
TRAFFICGEN_STC_RFC2889_MIN_ADDRS = 1000
TRAFFICGEN_STC_RFC2889_MAX_ADDRS = 65536
TRAFFICGEN_STC_RFC2889_AC_LR = 1000

The first two values are for the address-learning test, whereas the other three values are for the address-caching capacity test. LR: Learning Rate. AC: Address Caching. The maximum value for addresses is 16777216, whereas the maximum for LR is 4294967295.

Results for RFC2889 tests: the forwarding test outputs the following values:

TX_RATE_FPS : "Transmission Rate in Frames/sec"
THROUGHPUT_RX_FPS: "Received Throughput Frames/sec"
TX_RATE_MBPS : " Transmission rate in MBPS"
THROUGHPUT_RX_MBPS: "Received Throughput in MBPS"
TX_RATE_PERCENT: "Transmission Rate in Percentage"
FRAME_LOSS_PERCENT: "Frame loss in Percentage"
FORWARDING_RATE_FPS: " Maximum Forwarding Rate in FPS"

The address caching test outputs the following values:

CACHING_CAPACITY_ADDRS = 'Number of address it can cache'
ADDR_LEARNED_PERCENT = 'Percentage of address successfully learned'

and address learning test outputs just a single value:

OPTIMAL_LEARNING_RATE_FPS = 'Optimal learning rate in fps'

Note that ‘FORWARDING_RATE_FPS’, ‘CACHING_CAPACITY_ADDRS’, ‘ADDR_LEARNED_PERCENT’ and ‘OPTIMAL_LEARNING_RATE_FPS’ are the new result-constants added to support RFC2889 tests.

Xena Networks
Installation

The Xena Networks traffic generator requires specific files and packages to be installed. It is assumed the user has access to the Xena2544.exe file, which must be placed in the VSPerf installation location under the tools/pkt_gen/xena folder. Contact Xena Networks for the latest version of this file. The user can also visit www.xenanetworks.com/downloads to obtain the file with a valid support contract.

Note VSPerf has been fully tested with version v2.43 of Xena2544.exe

To execute the Xena2544.exe file under Linux distributions the mono-complete package must be installed. To install this package follow the instructions below. Further information can be obtained from http://www.mono-project.com/docs/getting-started/install/linux/

rpm --import "http://keyserver.ubuntu.com/pks/lookup?op=get&search=0x3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF"
yum-config-manager --add-repo http://download.mono-project.com/repo/centos/
yum -y install mono-complete

To prevent gpg errors on future yum installation of packages the mono-project repo should be disabled once installed.

yum-config-manager --disable download.mono-project.com_repo_centos_
Configuration

Connection information for your Xena Chassis must be supplied inside the 10_custom.conf or 03_custom.conf file. The following parameters must be set to allow for proper connections to the chassis.

TRAFFICGEN_XENA_IP = ''
TRAFFICGEN_XENA_PORT1 = ''
TRAFFICGEN_XENA_PORT2 = ''
TRAFFICGEN_XENA_USER = ''
TRAFFICGEN_XENA_PASSWORD = ''
TRAFFICGEN_XENA_MODULE1 = ''
TRAFFICGEN_XENA_MODULE2 = ''
RFC2544 Throughput Testing

Xena traffic generator testing for rfc2544 throughput can be modified for different behaviors if needed. The default options for the following are optimized for best results.

TRAFFICGEN_XENA_2544_TPUT_INIT_VALUE = '10.0'
TRAFFICGEN_XENA_2544_TPUT_MIN_VALUE = '0.1'
TRAFFICGEN_XENA_2544_TPUT_MAX_VALUE = '100.0'
TRAFFICGEN_XENA_2544_TPUT_VALUE_RESOLUTION = '0.5'
TRAFFICGEN_XENA_2544_TPUT_USEPASS_THRESHHOLD = 'false'
TRAFFICGEN_XENA_2544_TPUT_PASS_THRESHHOLD = '0.0'

Each value modifies the behavior of rfc 2544 throughput testing. Refer to your Xena documentation to understand the behavior changes in modifying these values.

Continuous Traffic Testing

By default, Xena continuous traffic performs a 3 second learning preemption to allow the DUT to receive learning packets before a continuous test is performed. If a custom test case requires this learning to be disabled, you can disable the option or modify the length of the learning by changing the following settings.

TRAFFICGEN_XENA_CONT_PORT_LEARNING_ENABLED = False
TRAFFICGEN_XENA_CONT_PORT_LEARNING_DURATION = 3
MoonGen
Installation

MoonGen architecture overview and general installation instructions can be found here:

https://github.com/emmericp/MoonGen

  • Note: Today, MoonGen with VSPERF only supports 10Gbps line speeds.

For VSPERF use, MoonGen should be cloned from here (as opposed to the previously mentioned GitHub):

git clone https://github.com/atheurer/lua-trafficgen

and use the master branch:

git checkout master

VSPERF uses a particular Lua script with the MoonGen project:

trafficgen.lua

Follow MoonGen set up and execution instructions here:

https://github.com/atheurer/lua-trafficgen/blob/master/README.md

Note that one will need to set up passwordless ssh login between the server running MoonGen and the device under test (running the VSPERF test infrastructure). This is because VSPERF on one server uses 'ssh' to configure and run MoonGen on the other server.

One can set up this ssh access by doing the following on both servers:

ssh-keygen -b 2048 -t rsa
ssh-copy-id <other server>
Configuration

Connection information for MoonGen must be supplied inside the 10_custom.conf or 03_custom.conf file. The following parameters must be set to allow for proper connections to the host with MoonGen.

TRAFFICGEN_MOONGEN_HOST_IP_ADDR = ""
TRAFFICGEN_MOONGEN_USER = ""
TRAFFICGEN_MOONGEN_BASE_DIR = ""
TRAFFICGEN_MOONGEN_PORTS = ""
TRAFFICGEN_MOONGEN_LINE_SPEED_GBPS = ""
4. vSwitchPerf test suites userguide
General

VSPERF requires a traffic generator to run tests. Automated traffic generator support in VSPERF includes:

  • IXIA traffic generator (IxNetwork hardware) and a machine that runs the IXIA client software.
  • Spirent traffic generator (TestCenter hardware chassis or TestCenter virtual in a VM) and a VM to run the Spirent Virtual Deployment Service image, formerly known as “Spirent LabServer”.
  • Xena Network traffic generator (Xena hardware chassis) that houses the Xena Traffic generator modules.
  • Moongen software traffic generator. Requires a separate machine running moongen to execute packet generation.

If you want to use another traffic generator, please select the Dummy generator.

VSPERF Installation

To see the supported Operating Systems, vSwitches and system requirements, please follow the installation instructions <vsperf-installation>.

Traffic Generator Setup

Follow the Traffic generator instructions <trafficgen-installation> to install and configure a suitable traffic generator.

Cloning and building src dependencies

In order to run VSPERF, you will need to download DPDK and OVS. You can do this manually and build them in a preferred location, OR you could use vswitchperf/src. The vswitchperf/src directory contains makefiles that will allow you to clone and build the libraries that VSPERF depends on, such as DPDK and OVS. To clone and build simply:

$ cd src
$ make

VSPERF can be used with stock OVS (without DPDK support). When build is finished, the libraries are stored in src_vanilla directory.

The ‘make’ builds all options in src:

  • Vanilla OVS
  • OVS with vhost_user as the guest access method (with DPDK support)

The vhost_user build will reside in src/ovs/. The Vanilla OVS build will reside in vswitchperf/src_vanilla.

To delete a src subdirectory and its contents to allow you to re-clone simply use:

$ make clobber
Configure the ./conf/10_custom.conf file

The 10_custom.conf file is the configuration file that overrides default configurations in all the other configuration files in ./conf. The supplied 10_custom.conf file MUST be modified, as it contains configuration items for which there are no reasonable default values.

The configuration items that can be added are not limited to the initial contents. Any configuration item mentioned in any .conf file in the ./conf directory can be added, and that item will be overridden by the custom configuration value.
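A hypothetical minimal 10_custom.conf could therefore look as follows; the parameter names are taken from examples elsewhere in this guide and the values are placeholders for your environment:

TRAFFICGEN = 'Dummy'                               # or 'IxNet', 'Xena', ...
WHITELIST_NICS = ['0000:05:00.0', '0000:05:00.1']  # PCI addresses of the DUT test NICs
TRAFFICGEN_DURATION = 30                           # traffic transmission time in seconds
TRAFFICGEN_PKT_SIZES = (64,)                       # note the mandatory trailing comma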

Further details about configuration file evaluation and the special behaviour of options with the GUEST_ prefix can be found in the design document.

Using a custom settings file

If your 10_custom.conf doesn't reside in the ./conf directory, or if you want to use an alternative configuration file, the file can be passed to vsperf via the --conf-file argument.

$ ./vsperf --conf-file <path_to_custom_conf> ...

Note that configuration passed in via the environment (--load-env) or via another command line argument will override both the default and your custom configuration files. This “priority hierarchy” can be described like so (1 = max priority):

  1. Testcase definition section Parameters
  2. Command line arguments
  3. Environment variables
  4. Configuration file(s)

Further details about configuration file evaluation and the special behaviour of options with the GUEST_ prefix can be found in the design document.

Overriding values defined in configuration files

The configuration items can be overridden by the command line argument --test-params. In this case, the configuration items and their values should be passed in the form item=value, separated by semicolons.

Example:

$ ./vsperf --test-params "TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,);" \
                         "GUEST_LOOPBACK=['testpmd','l2fwd']" pvvp_tput

The second option is to override configuration items by Parameters section of the test case definition. The configuration items can be added into Parameters dictionary with their new values. These values will override values defined in configuration files or specified by --test-params command line argument.

Example:

"Parameters" : {'TRAFFICGEN_PKT_SIZES' : (128,),
                'TRAFFICGEN_DURATION' : 10,
                'GUEST_LOOPBACK' : ['testpmd','l2fwd'],
               }

NOTE: In both cases, configuration item names and their values must be specified in the same form as they are defined inside configuration files. Parameter names must be specified in uppercase and data types of original and new value must match. Python syntax rules related to data types and structures must be followed. For example, parameter TRAFFICGEN_PKT_SIZES above is defined as a tuple with a single value 128. In this case trailing comma is mandatory, otherwise value can be wrongly interpreted as a number instead of a tuple and vsperf execution would fail. Please check configuration files for default values and their types and use them as a basis for any customized values. In case of any doubt, please check official python documentation related to data structures like tuples, lists and dictionaries.

NOTE: Vsperf execution will terminate with a runtime error if an unknown parameter name is passed via the --test-params CLI argument or defined in the Parameters section of a test case definition. It is also forbidden to redefine the value of the TEST_PARAMS configuration item via the CLI or the Parameters section.

vloop_vnf

VSPERF uses a VM image called vloop_vnf for looping traffic in the deployment scenarios involving VMs. The image can be downloaded from http://artifacts.opnfv.org/.

Please see the installation instructions for information on vloop-vnf images.

l2fwd Kernel Module

A kernel module that provides OSI Layer 2 IPv4 termination or forwarding with support for Destination Network Address Translation (DNAT) for both the MAC and IP addresses. l2fwd can be found in <vswitchperf_dir>/src/l2fwd.

Executing tests

All examples in this document assume that the user is inside the VSPERF directory. VSPERF can, however, be executed from any directory.

Before running any tests make sure you have root permissions by adding the following line to /etc/sudoers:

username ALL=(ALL)       NOPASSWD: ALL

username in the example above should be replaced with a real username.

To list the available tests:

$ ./vsperf --list

To run a single test:

$ ./vsperf $TESTNAME

Where $TESTNAME is the name of the vsperf test you would like to run.

To run a group of tests, for example all tests with a name containing ‘RFC2544’:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf --tests="RFC2544"

To run all tests:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf

Some tests allow for configurable parameters, including test duration (in seconds) as well as packet sizes (in bytes).

$ ./vsperf --conf-file user_settings.py \
    --tests RFC2544Tput \
    --test-params "TRAFFICGEN_DURATION=10;TRAFFICGEN_PKT_SIZES=(128,)"

For all available options, check out the help dialog:

$ ./vsperf --help
Executing Vanilla OVS tests
  1. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  2. Update your 10_custom.conf file to use Vanilla OVS:

    VSWITCH = 'OvsVanilla'
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>
    

    Please note if you don’t want to configure Vanilla OVS through the configuration file, you can pass it as a CLI argument.

    $ ./vsperf --vswitch OvsVanilla
    
Executing tests with VMs

To run tests using vhost-user as guest access method:

  1. Set VSWITCH and VNF in your settings file to:

    VSWITCH = 'OvsDpdkVhost'
    VNF = 'QemuDpdkVhost'
    
  2. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf
    
Executing tests with VMs using Vanilla OVS

To run tests using Vanilla OVS:

  1. Set the following variables:

    VSWITCH = 'OvsVanilla'
    VNF = 'QemuVirtioNet'
    
    VANILLA_TGEN_PORT1_IP = n.n.n.n
    VANILLA_TGEN_PORT1_MAC = nn:nn:nn:nn:nn:nn
    
    VANILLA_TGEN_PORT2_IP = n.n.n.n
    VANILLA_TGEN_PORT2_MAC = nn:nn:nn:nn:nn:nn
    
    VANILLA_BRIDGE_IP = n.n.n.n
    

    or use --test-params option

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
               --test-params "VANILLA_TGEN_PORT1_IP=n.n.n.n;" \
                             "VANILLA_TGEN_PORT1_MAC=nn:nn:nn:nn:nn:nn;" \
                             "VANILLA_TGEN_PORT2_IP=n.n.n.n;" \
                             "VANILLA_TGEN_PORT2_MAC=nn:nn:nn:nn:nn:nn"
    
  2. If needed, recompile src for all OVS variants

    $ cd src
    $ make distclean
    $ make
    
  3. Run test:

    $ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf
    
Using vfio_pci with DPDK

To use vfio with DPDK instead of igb_uio add into your custom configuration file the following parameter:

PATHS['dpdk']['src']['modules'] = ['uio', 'vfio-pci']

NOTE: In case, that DPDK is installed from binary package, then please set PATHS['dpdk']['bin']['modules'] instead.

NOTE: Please ensure that Intel VT-d is enabled in BIOS.

NOTE: Please ensure your boot/grub parameters include the following:

iommu=pt intel_iommu=on

To check that IOMMU is enabled on your platform:

$ dmesg | grep IOMMU
[    0.000000] Intel-IOMMU: enabled
[    0.139882] dmar: IOMMU 0: reg_base_addr fbffe000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.139888] dmar: IOMMU 1: reg_base_addr ebffc000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.139893] IOAPIC id 2 under DRHD base  0xfbffe000 IOMMU 0
[    0.139894] IOAPIC id 0 under DRHD base  0xebffc000 IOMMU 1
[    0.139895] IOAPIC id 1 under DRHD base  0xebffc000 IOMMU 1
[    3.335744] IOMMU: dmar0 using Queued invalidation
[    3.335746] IOMMU: dmar1 using Queued invalidation
....
Using SRIOV support

To use virtual functions of NIC with SRIOV support, use extended form of NIC PCI slot definition:

WHITELIST_NICS = ['0000:05:00.0|vf0', '0000:05:00.1|vf3']

Where 'vf' indicates virtual function usage and the following number defines the VF to be used. If VF usage is detected, vswitchperf will enable SRIOV support for the given card and it will detect the PCI slot numbers of the selected VFs.

So in the example above, one VF will be configured for NIC '0000:05:00.0' and four VFs will be configured for NIC '0000:05:00.1'. Vswitchperf will detect the PCI addresses of the selected VFs and will use them during test execution.

At the end of vswitchperf execution, SRIOV support will be disabled.

SRIOV support is generic and it can be used in different testing scenarios. For example:

  • vSwitch tests with DPDK or without DPDK support to verify impact of VF usage on vSwitch performance
  • tests without vSwitch, where traffic is forwarded directly between VF interfaces by a packet forwarder (e.g. testpmd application)
  • tests without vSwitch, where VM accesses VF interfaces directly by PCI-passthrough to measure raw VM throughput performance.
Using QEMU with PCI passthrough support

Raw virtual machine throughput performance can be measured by execution of PVP test with direct access to NICs by PCI passthrough. To execute VM with direct access to PCI devices, enable vfio-pci. In order to use virtual functions, SRIOV-support must be enabled.

Execution of test with PCI passthrough with vswitch disabled:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
           --vswitch none --vnf QemuPciPassthrough pvp_tput

Any of supported guest-loopback-application can be used inside VM with PCI passthrough support.

Note: Qemu with PCI passthrough support can be used only with PVP test deployment.

Selection of loopback application for tests with VMs

To select the loopback applications which will forward packets inside VMs, the following parameter should be configured:

GUEST_LOOPBACK = ['testpmd']

or use --test-params CLI argument:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
      --test-params "GUEST_LOOPBACK=['testpmd']"

Supported loopback applications are:

'testpmd'       - testpmd from dpdk will be built and used
'l2fwd'         - l2fwd module provided by Huawei will be built and used
'linux_bridge'  - linux bridge will be configured
'buildin'       - nothing will be configured by vsperf; VM image must
                  ensure traffic forwarding between its interfaces

Guest loopback application must be configured, otherwise traffic will not be forwarded by VM and testcases with VM related deployments will fail. Guest loopback application is set to ‘testpmd’ by default.

NOTE: If only 1 or more than 2 NICs are configured for the VM, then 'testpmd' should be used, as it is able to forward traffic between multiple VM NIC pairs.

NOTE: In case of linux_bridge, all guest NICs are connected to the same bridge inside the guest.

Mergeable Buffers Options with QEMU

Mergeable buffers can be disabled with VSPerf within QEMU. This option can increase performance significantly when not using jumbo frame sized packets. By default VSPerf disables mergeable buffers. If you wish to enable them, you can modify the setting in a custom conf file.

GUEST_NIC_MERGE_BUFFERS_DISABLE = [False]

Then execute using the custom conf file.

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf

Alternatively you can just pass the param during execution.

$ ./vsperf --test-params "GUEST_NIC_MERGE_BUFFERS_DISABLE=[False]"
Selection of dpdk binding driver for tests with VMs

To select the DPDK binding driver, which specifies which driver the VM NICs will use for the DPDK bind, the following configuration parameter should be set:

GUEST_DPDK_BIND_DRIVER = ['igb_uio_from_src']

The supported dpdk guest bind drivers are:

'uio_pci_generic'      - Use uio_pci_generic driver
'igb_uio_from_src'     - Build and use the igb_uio driver from the dpdk src
                         files
'vfio_no_iommu'        - Use vfio with no iommu option. This requires custom
                         guest images that support this option. The default
                         vloop image does not support this driver.

Note: uio_pci_generic does not support sr-iov testcases with guests attached. This is because uio_pci_generic only supports legacy interrupts. In case uio_pci_generic is selected with the vnf as QemuPciPassthrough it will be modified to use igb_uio_from_src instead.

Note: vfio_no_iommu requires kernels equal to or greater than 4.5 and dpdk 16.04 or greater. Using this option will also taint the kernel.

Please refer to the dpdk documents at http://dpdk.org/doc/guides for more information on these drivers.

Multi-Queue Configuration

VSPerf currently supports multi-queue with the following limitations:

  1. Requires QEMU 2.5 or greater and any OVS version higher than 2.5. The default upstream package versions installed by VSPerf satisfies this requirement.

  2. Guest image must have ethtool utility installed if using l2fwd or linux bridge inside guest for loopback.

  3. If using OVS versions 2.5.0 or less enable old style multi-queue as shown in the ‘‘02_vswitch.conf’’ file.

    OVS_OLD_STYLE_MQ = True
    

To enable multi-queue for dpdk modify the ‘‘02_vswitch.conf’’ file.

VSWITCH_DPDK_MULTI_QUEUES = 2

NOTE: You should consider using the switch affinity to set a PMD CPU mask that can optimize your performance. If applicable, consider the NUMA node of the NIC in use by checking /sys/class/net/<eth_name>/device/numa_node and setting an appropriate mask to create PMD threads on the same NUMA node.
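A hypothetical illustration of that check, together with the equivalent manual Open vSwitch setting (VSPERF normally drives this through its vSwitch configuration, so prefer the relevant parameters in the ./conf files over setting it by hand during a test run):

# find the NUMA node of the NIC (interface name is a placeholder):
$ cat /sys/class/net/eth1/device/numa_node
1

# equivalent manual OVS-DPDK setting placing PMD threads on chosen cores:
$ ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x30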

When multi-queue is enabled, each dpdk or dpdkvhostuser port that is created on the switch will set the option for multiple queues. If old style multi queue has been enabled a global option for multi queue will be used instead of the port by port option.

To enable multi-queue on the guest modify the ‘‘04_vnf.conf’’ file.

GUEST_NIC_QUEUES = [2]

Enabling multi-queue at the guest will add multiple queues to each NIC port when qemu launches the guest.

In case of Vanilla OVS, multi-queue is enabled on the tuntap ports and nic queues will be enabled inside the guest with ethtool. Simply enabling the multi-queue on the guest is sufficient for Vanilla OVS multi-queue.

Testpmd should be configured to take advantage of multi-queue on the guest if using DPDKVhostUser. This can be done by modifying the ‘‘04_vnf.conf’’ file.

GUEST_TESTPMD_PARAMS = ['-l 0,1,2,3,4  -n 4 --socket-mem 512 -- '
                        '--burst=64 -i --txqflags=0xf00 '
                        '--nb-cores=4 --rxq=2 --txq=2 '
                        '--disable-hw-vlan']

NOTE: The guest SMP cores must be configured to allow for testpmd to use the optimal number of cores to take advantage of the multiple guest queues.

When using Vanilla OVS and QEMU virtio-net you can increase performance by binding vhost-net threads to CPUs. This can be done by enabling the affinity in the ''04_vnf.conf'' file. This can also be done for non multi-queue enabled configurations, as there will still be 2 vhost-net threads.

VSWITCH_VHOST_NET_AFFINITIZATION = True

VSWITCH_VHOST_CPU_MAP = [4,5,8,11]

NOTE: This method of binding would require a custom script in a real environment.

NOTE: For optimal performance guest SMPs and/or vhost-net threads should be on the same numa as the NIC in use if possible/applicable. Testpmd should be assigned at least (nb_cores +1) total cores with the cpu mask.

Executing Packet Forwarding tests

To select the applications which will forward packets, the following parameters should be configured:

VSWITCH = 'none'
PKTFWD = 'TestPMD'

or use --vswitch and --fwdapp CLI arguments:

$ ./vsperf phy2phy_cont --conf-file user_settings.py \
           --vswitch none \
           --fwdapp TestPMD

Supported Packet Forwarding applications are:

'testpmd'       - testpmd from dpdk
  1. Update your ‘‘10_custom.conf’’ file to use the appropriate variables for selected Packet Forwarder:

    # testpmd configuration
    TESTPMD_ARGS = []
    # packet forwarding mode supported by testpmd; Please see DPDK documentation
    # for comprehensive list of modes supported by your version.
    # e.g. io|mac|mac_retry|macswap|flowgen|rxonly|txonly|csum|icmpecho|...
    # Note: Option "mac_retry" has been changed to "mac retry" since DPDK v16.07
    TESTPMD_FWD_MODE = 'csum'
    # checksum calculation layer: ip|udp|tcp|sctp|outer-ip
    TESTPMD_CSUM_LAYER = 'ip'
    # checksum calculation place: hw (hardware) | sw (software)
    TESTPMD_CSUM_CALC = 'sw'
    # recognize tunnel headers: on|off
    TESTPMD_CSUM_PARSE_TUNNEL = 'off'
    
  2. Run test:

    $ ./vsperf phy2phy_tput --conf-file <path_to_settings_py>
    
Executing Packet Forwarding tests with one guest

TestPMD with DPDK 16.11 or greater can be used to forward packets as a switch to a single guest using TestPMD vdev option. To set this configuration the following parameters should be used.

VSWITCH = 'none'
PKTFWD = 'TestPMD'

or use --vswitch and --fwdapp CLI arguments:

$ ./vsperf pvp_tput --conf-file user_settings.py \
           --vswitch none \
           --fwdapp TestPMD

Guest forwarding application only supports TestPMD in this configuration.

GUEST_LOOPBACK = ['testpmd']

For optimal performance, one CPU per port plus one additional CPU should be used for TestPMD. Also set additional parameters for the packet forwarding application to use the correct number of nb-cores.

VSWITCHD_DPDK_ARGS = ['-l', '46,44,42,40,38', '-n', '4', '--socket-mem 1024,0']
TESTPMD_ARGS = ['--nb-cores=4', '--txq=1', '--rxq=1']

For guest TestPMD 3 VCpus should be assigned with the following TestPMD params.

GUEST_TESTPMD_PARAMS = ['-l 0,1,2 -n 4 --socket-mem 1024 -- '
                        '--burst=64 -i --txqflags=0xf00 '
                        '--disable-hw-vlan --nb-cores=2 --txq=1 --rxq=1']

Execution of TestPMD can be run with the following command line

./vsperf pvp_tput --vswitch=none --fwdapp=TestPMD --conf-file <path_to_settings_py>

NOTE: To achieve the best 0% loss numbers with rfc2544 throughput testing, other tunings should be applied to host and guest such as tuned profiles and CPU tunings to prevent possible interrupts to worker threads.

VSPERF modes of operation

VSPERF can be run in different modes. By default it will configure the vSwitch, the traffic generator and the VNF. However, it can also be used just for configuration and execution of the traffic generator, or for execution of all components except the traffic generator itself.

The mode of operation is driven by the configuration parameter -m or --mode:

-m MODE, --mode MODE  vsperf mode of operation;
    Values:
        "normal" - execute vSwitch, VNF and traffic generator
        "trafficgen" - execute only traffic generator
        "trafficgen-off" - execute vSwitch and VNF
        "trafficgen-pause" - execute vSwitch and VNF but wait before traffic transmission

If VSPERF is executed in "trafficgen" mode, then the configuration of the traffic generator can be modified through the TRAFFIC dictionary passed to the --test-params option. It is not necessary to specify all values of the TRAFFIC dictionary; it is sufficient to specify only the values which should be changed. A detailed description of the TRAFFIC dictionary can be found at Configuration of TRAFFIC dictionary.

Example of execution of VSPERF in “trafficgen” mode:

$ ./vsperf -m trafficgen --trafficgen IxNet --conf-file vsperf.conf \
    --test-params "TRAFFIC={'traffic_type':'rfc2544_continuous','bidir':'False','framerate':60}"
Code change verification by pylint

Every developer participating in the VSPERF project should run pylint before their Python code is submitted for review. The project specific configuration for pylint is available in 'pylintrc'.

Example of manual pylint invocation:

$ pylint --rcfile ./pylintrc ./vsperf
GOTCHAs:
Custom image fails to boot

Custom VM images may fail to boot within VSPerf PXP testing because of the drive boot and shared drive types, which could be caused by a missing SCSI driver inside the image. In case of issues you can try changing the drive boot type to ide.

GUEST_BOOT_DRIVE_TYPE = ['ide']
GUEST_SHARED_DRIVE_TYPE = ['ide']
OVS with DPDK and QEMU

If you encounter the following error during qemu initialization: "before (last 100 chars): '-path=/dev/hugepages,share=on: unable to map backing store for hugepages: Cannot allocate memory'", check the amount of hugepages on your system:

$ cat /proc/meminfo | grep HugePages

By default vswitchd is launched with 1GB of memory. To change this, modify the --socket-mem parameter in conf/02_vswitch.conf to allocate an appropriate amount of memory:

VSWITCHD_DPDK_ARGS = ['-c', '0x4', '-n', '4', '--socket-mem 1024,0']
VSWITCHD_DPDK_CONFIG = {
    'dpdk-init' : 'true',
    'dpdk-lcore-mask' : '0x4',
    'dpdk-socket-mem' : '1024,0',
}

Note: The option VSWITCHD_DPDK_ARGS is used for vswitchd, which supports the --dpdk parameter. In recent vswitchd versions, the option VSWITCHD_DPDK_CONFIG will be used to configure vswitchd via ovs-vsctl calls.

More information

For more information and details, refer to the rest of the vSwitchPerf user documentation.

5. Step driven tests

In general, test scenarios are defined by a deployment used in the particular test case definition. The chosen deployment scenario will take care of the vSwitch configuration, deployment of VNFs and it can also affect configuration of a traffic generator. In order to allow a more flexible way of testcase scripting, VSPERF supports a detailed step driven testcase definition. It can be used to configure and program vSwitch, deploy and terminate VNFs, execute a traffic generator, modify a VSPERF configuration, execute external commands, etc.

Execution of step driven tests is done on a step by step work flow starting with step 0 as defined inside the test case. Each step of the test increments the step number by one which is indicated in the log.

(testcases.integration) - Step 0 'vswitch add_vport ['br0']' start

Step driven tests can be used for both performance and integration testing. In case of integration test, each step in the test case is validated. If a step does not pass validation the test will fail and terminate. The test will continue until a failure is detected or all steps pass. A csv report file is generated after a test completes with an OK or FAIL result.

In case of performance test, the validation of steps is not performed and standard output files with results from traffic generator and underlying OS details are generated by vsperf.

Step driven testcases can be used in two different ways:

  1. Description of a full testcase - in this case the clean deployment is used to indicate that vsperf should neither configure the vSwitch nor deploy any VNF. The test shall perform all required vSwitch configuration and programming and deploy the required number of VNFs.

  2. Modification of an existing deployment - in this case, any of the supported deployments can be used to perform the initial vSwitch configuration and deployment of VNFs. Additional actions defined by TestSteps can be used to alter the vSwitch configuration or to deploy additional VNFs. After the last step is processed, the test execution will continue with traffic execution.
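A hypothetical integration testcase illustrating the first approach, i.e. a clean deployment combined with a TestSteps section (the step syntax is described in the following section), could look like this:

{
    "Name": "ex_add_del_bridge",
    "Deployment": "clean",
    "Description": "Example: create and remove a bridge",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},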
Test objects and their functions

Every test step can call a function of one of the supported test objects. The list of supported objects and their most common functions follows:

  • vswitch - provides functions for vSwitch configuration

    List of supported functions:

    • add_switch br_name - creates a new switch (bridge) with given br_name
    • del_switch br_name - deletes switch (bridge) with given br_name
    • add_phy_port br_name - adds a physical port into bridge specified by br_name
    • add_vport br_name - adds a virtual port into bridge specified by br_name
    • del_port br_name port_name - removes physical or virtual port specified by port_name from bridge br_name
    • add_flow br_name flow - adds flow specified by flow dictionary into the bridge br_name; Content of flow dictionary will be passed to the vSwitch. In case of Open vSwitch it will be passed to the ovs-ofctl add-flow command. Please see Open vSwitch documentation for the list of supported flow parameters.
    • del_flow br_name [flow] - deletes flow specified by flow dictionary from bridge br_name; In case that optional parameter flow is not specified or set to an empty dictionary {}, then all flows from bridge br_name will be deleted.
    • dump_flows br_name - dumps all flows from bridge specified by br_name
    • enable_stp br_name - enables Spanning Tree Protocol for bridge br_name
    • disable_stp br_name - disables Spanning Tree Protocol for bridge br_name
    • enable_rstp br_name - enables Rapid Spanning Tree Protocol for bridge br_name
    • disable_rstp br_name - disables Rapid Spanning Tree Protocol for bridge br_name

    Examples:

    ['vswitch', 'add_switch', 'int_br0']
    
    ['vswitch', 'del_switch', 'int_br0']
    
    ['vswitch', 'add_phy_port', 'int_br0']
    
    ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]']
    
    ['vswitch', 'add_flow', 'int_br0', {'in_port': '1', 'actions': ['output:2'],
     'idle_timeout': '0'}],
    
    ['vswitch', 'enable_rstp', 'int_br0']
    
  • vnf[ID] - provides functions for deployment and termination of VNFs; the optional alphanumeric ID is used for VNF identification when a testcase deploys multiple VNFs.

    List of supported functions:

    • start - starts a VNF based on VSPERF configuration
    • stop - gracefully terminates given VNF

    Examples:

    ['vnf1', 'start']
    ['vnf2', 'start']
    ['vnf2', 'stop']
    ['vnf1', 'stop']
    
  • trafficgen - triggers traffic generation

    List of supported functions:

    • send_traffic traffic - starts traffic generation based on the vsperf configuration and the given traffic dictionary. More details about the traffic dictionary and its possible values are available in the Traffic Generator Integration Guide

    Examples:

    ['trafficgen', 'send_traffic', {'traffic_type' : 'rfc2544_throughput'}]
    
    ['trafficgen', 'send_traffic', {'traffic_type' : 'rfc2544_back2back', 'bidir' : 'True'}]
    
  • settings - reads or modifies VSPERF configuration

    List of supported functions:

    • getValue param - returns value of given param
    • setValue param value - sets value of param to given value

    Examples:

    ['settings', 'getValue', 'TOOLS']
    
    ['settings', 'setValue', 'GUEST_USERNAME', ['root']]
    
  • namespace - creates or modifies network namespaces

    List of supported functions:

    • create_namespace name - creates new namespace with given name
    • delete_namespace name - deletes namespace specified by its name
    • assign_port_to_namespace port name [port_up] - assigns NIC specified by port into given namespace name; If optional parameter port_up is set to True, then port will be brought up.
    • add_ip_to_namespace_eth port name addr cidr - assigns an IP address addr/cidr to the NIC specified by port within namespace name
    • reset_port_to_root port name - returns given port from namespace name back to the root namespace

    Examples:

    ['namespace', 'create_namespace', 'testns']
    
    ['namespace', 'assign_port_to_namespace', 'eth0', 'testns']
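    The remaining namespace functions follow the same step format; the following lines are illustrative sketches derived from the signatures above (device names and addresses are arbitrary):

    ['namespace', 'add_ip_to_namespace_eth', 'eth0', 'testns', '10.0.0.1', 24]
    
    ['namespace', 'reset_port_to_root', 'eth0', 'testns']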
    
  • veth - manipulates eth and veth devices

    List of supported functions:

    • add_veth_port port peer_port - adds a pair of veth ports named port and peer_port
    • del_veth_port port peer_port - deletes a veth port pair specified by port and peer_port
    • bring_up_eth_port eth_port [namespace] - brings up eth_port in (optional) namespace

    Examples:

    ['veth', 'add_veth_port', 'veth', 'veth1']
    
    ['veth', 'bring_up_eth_port', 'eth1']
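    A del_veth_port step, mirroring the add_veth_port example above, would be:

    ['veth', 'del_veth_port', 'veth', 'veth1']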
    
  • tools - provides a set of helper functions

    List of supported functions:

    • Assert condition - evaluates the given condition and raises AssertionError if it is not True
    • Eval expression - evaluates the given expression as python code and returns its result
    • Exec command [regex] - executes a shell command and filters its output by the (optional) regular expression

    Examples:

    ['tools', 'exec', 'numactl -H', 'available: ([0-9]+)']
    ['tools', 'assert', '#STEP[-1][0]>1']
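    No Eval example is shown above; a trivial illustrative step, whose result could be recalled later via a step macro, would be:

    ['tools', 'eval', '1024 * 1024']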
    
  • wait - is used for test case interruption. This object doesn’t have any functions. Once reached, vsperf pauses test execution and waits for the Enter key to be pressed. It can be used for debugging during testcase design.

    Examples:

    ['wait']
    
Test Macros

Test profiles can include macros as part of a test step. Each step in the profile may return a value, such as a port name. Recall macros use #STEP to indicate the recalled value inside the returned structure. If the method called by a test step returns a value, it can be recalled later, for example:

{
    "Name": "vswitch_add_del_vport",
    "Deployment": "clean",
    "Description": "vSwitch - add and delete virtual port",
    "TestSteps": [
            ['vswitch', 'add_switch', 'int_br0'],               # STEP 0
            ['vswitch', 'add_vport', 'int_br0'],                # STEP 1
            ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],  # STEP 2
            ['vswitch', 'del_switch', 'int_br0'],               # STEP 3
         ]
}

This test profile uses the vswitch add_vport method, which returns a string value of the port added. This value is later recalled by the del_port method using the name from step 1.

It is also possible to use negative indexes in step macros. In that case #STEP[-1] refers to the result of the previous step, #STEP[-2] to the result of the step before the previous one, etc. This means that STEP 2 from the previous example could be changed as follows to achieve the same functionality:

['vswitch', 'del_port', 'int_br0', '#STEP[-1][0]'],  # STEP 2

Commonly used steps can also be defined as a separate profile.

STEP_VSWITCH_PVP_INIT = [
    ['vswitch', 'add_switch', 'int_br0'],           # STEP 0
    ['vswitch', 'add_phy_port', 'int_br0'],         # STEP 1
    ['vswitch', 'add_phy_port', 'int_br0'],         # STEP 2
    ['vswitch', 'add_vport', 'int_br0'],            # STEP 3
    ['vswitch', 'add_vport', 'int_br0'],            # STEP 4
]

This profile can then be used inside other testcases:

{
    "Name": "vswitch_pvp",
    "Deployment": "clean",
    "Description": "vSwitch - configure switch and one vnf",
    "TestSteps": STEP_VSWITCH_PVP_INIT +
                 [
                    ['vnf', 'start'],
                    ['vnf', 'stop'],
                 ] +
                 STEP_VSWITCH_PVP_FINIT
}
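The STEP_VSWITCH_PVP_FINIT profile referenced above is not listed in this guide; assuming it simply mirrors the initialization profile with the corresponding clean-up actions, it could look like the following sketch:

STEP_VSWITCH_PVP_FINIT = [
    ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
    ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
    ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],
    ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
    ['vswitch', 'del_switch', 'int_br0'],
]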
HelloWorld and other basic Testcases

The following examples are for demonstration purposes. You can run them by copying and pasting into the conf/integration/01_testcases.conf file. A command-line instruction is shown at the end of each example.

HelloWorld

The first example is a HelloWorld testcase. It simply creates a bridge with 2 physical ports, then sets up a flow to drop incoming packets from the port that was instantiated at STEP #1. There is no interaction with the traffic generator. Then the flow, the 2 ports and the bridge are deleted. The ‘add_phy_port’ method creates a ‘dpdk’ type interface that will manage the physical port. The string value returned is the port name that will be referred to by ‘del_port’ later on.

{
    "Name": "HelloWorld",
    "Description": "My first testcase",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['drop'], 'idle_timeout': '0'}],
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]

},

To run HelloWorld test:

./vsperf --conf-file user_settings.py --integration HelloWorld
Specify a Flow by the IP address

The next example shows how to explicitly set up a flow by specifying a destination IP address. All packets received from the port created at STEP #1 that have a destination IP address = 90.90.90.90 will be forwarded to the port created at the STEP #2.

{
    "Name": "p2p_rule_l3da",
    "Description": "Phy2Phy with rule on L3 Dest Addr",
    "Deployment": "clean",
    "biDirectional": "False",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_dst': '90.90.90.90', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous'}],
        ['vswitch', 'dump_flows', 'int_br0'],   # STEP 5
        ['vswitch', 'del_flow', 'int_br0'],     # STEP 6 == del-flows
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration p2p_rule_l3da
Multistream feature

The next testcase uses the multistream feature. The traffic generator will send packets with different UDP ports. That is accomplished by using the "stream_type" and "multistream" keywords of the TRAFFIC dictionary. 4 different flows are set up to forward all incoming packets.

{
    "Name": "multistream_l4",
    "Description": "Multistream on UDP ports",
    "Deployment": "clean",
    "Parameters": {
        'TRAFFIC' : {
            "multistream": 4,
            "stream_type": "L4",
        },
    },
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],   # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'], # STEP 2
        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '0', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '1', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '2', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '3', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Send mono-dir traffic
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
     ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration multistream_l4
PVP with a VM Replacement

This example launches a first VM in a PVP topology and then replaces it with another VM. When the VNF setup parameter in ./conf/04_vnf.conf is "QemuDpdkVhostUser", the ‘add_vport’ method creates a ‘dpdkvhostuser’ type port to connect a VM.
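For reference, the relevant setting assumed by this example can be placed either in ./conf/04_vnf.conf or in your user_settings.py file:

VNF = 'QemuDpdkVhostUser'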

{
    "Name": "ex_replace_vm",
    "Description": "PVP with VM replacement",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4

        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[2][1]', \
            'actions': ['output:#STEP[4][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[3][1]', \
            'actions': ['output:#STEP[1][1]'], 'idle_timeout': '0'}],

        # Start VM 1
        ['vnf1', 'start'],
        # Now we want to replace VM 1 with another VM
        ['vnf1', 'stop'],

        ['vswitch', 'add_vport', 'int_br0'],        # STEP 11    vm2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 12
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'actions': ['output:#STEP[11][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[12][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],

        # Start VM 2
        ['vnf2', 'start'],
        ['vnf2', 'stop'],
        ['vswitch', 'dump_flows', 'int_br0'],

        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],    # vm1
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[11][0]'],   # vm2
        ['vswitch', 'del_port', 'int_br0', '#STEP[12][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration ex_replace_vm
VM with a Linux bridge

This example sets up a PVP topology and routes traffic to the VM based on the destination IP address. A command-line parameter is used to select a Linux bridge as the guest loopback application. It is also possible to select the guest loopback application via the GUEST_LOOPBACK configuration option, as shown below.
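For example, the equivalent configuration file entry, matching the command-line parameter used at the end of this example, would be:

GUEST_LOOPBACK = ['linux_bridge']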

{
    "Name": "ex_pvp_rule_l3da",
    "Description": "PVP with flow on L3 Dest Addr",
    "Deployment": "clean",
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4
        # Setup Flows
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_dst': '90.90.90.90', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        # Each pkt from the VM is forwarded to the 2nd dpdk port
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Start VMs
        ['vnf1', 'start'],
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        ['vnf1', 'stop'],
        # Clean up
        ['vswitch', 'dump_flows', 'int_br0'],       # STEP 10
        ['vswitch', 'del_flow', 'int_br0'],         # STEP 11
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],  # vm1 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --test-params \
        "GUEST_LOOPBACK=['linux_bridge']" --integration ex_pvp_rule_l3da
Forward packets based on UDP port

This example launches 2 VMs connected in parallel. Incoming packets will be forwarded to one specific VM depending on the destination UDP port.

{
    "Name": "ex_2pvp_rule_l4dp",
    "Description": "2 PVP with flows on L4 Dest Port",
    "Deployment": "clean",
    "Parameters": {
        'TRAFFIC' : {
            "multistream": 2,
            "stream_type": "L4",
        },
    },
    "TestSteps": [
        ['vswitch', 'add_switch', 'int_br0'],       # STEP 0
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 1
        ['vswitch', 'add_phy_port', 'int_br0'],     # STEP 2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 3    vm1
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 4
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 5    vm2
        ['vswitch', 'add_vport', 'int_br0'],        # STEP 6
        # Setup flows to reply to ICMPv6 and similar packets, so as to
        # avoid flooding the internal port with their re-transmissions
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:01', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:02', \
            'actions': ['output:#STEP[4][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:03', \
            'actions': ['output:#STEP[5][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', \
            {'priority': '1', 'dl_src': '00:00:00:00:00:04', \
            'actions': ['output:#STEP[6][1]'], 'idle_timeout': '0'}],
        # Forward UDP packets depending on dest port
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '0', \
            'actions': ['output:#STEP[3][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[1][1]', \
            'dl_type': '0x0800', 'nw_proto': '17', 'udp_dst': '1', \
            'actions': ['output:#STEP[5][1]'], 'idle_timeout': '0'}],
        # Send VM output to phy port #2
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[4][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        ['vswitch', 'add_flow', 'int_br0', {'in_port': '#STEP[6][1]', \
            'actions': ['output:#STEP[2][1]'], 'idle_timeout': '0'}],
        # Start VMs
        ['vnf1', 'start'],                          # STEP 15
        ['vnf2', 'start'],                          # STEP 16
        ['trafficgen', 'send_traffic', \
            {'traffic_type' : 'rfc2544_continuous', \
            'bidir' : 'False'}],
        ['vnf1', 'stop'],
        ['vnf2', 'stop'],
        ['vswitch', 'dump_flows', 'int_br0'],
        # Clean up
        ['vswitch', 'del_flow', 'int_br0'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[1][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[2][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[3][0]'],  # vm1 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[4][0]'],
        ['vswitch', 'del_port', 'int_br0', '#STEP[5][0]'],  # vm2 ports
        ['vswitch', 'del_port', 'int_br0', '#STEP[6][0]'],
        ['vswitch', 'del_switch', 'int_br0'],
    ]
},

To run the test:

./vsperf --conf-file user_settings.py --integration ex_2pvp_rule_l4dp
Modification of existing PVVP deployment

This is an example of modification of a standard deployment scenario with additional TestSteps. The standard PVVP scenario is used to configure the vSwitch and to deploy two VNFs connected in series. Additional TestSteps deploy a third VNF and connect it in parallel to the already configured VNFs. The traffic generator is instructed (by the multistream feature) to send two separate traffic streams: one stream is sent to the standalone VNF and the second to the two chained VNFs.

If the test is defined as a performance test, traffic results will be collected and made available in both CSV and RST report files.

{
    "Name": "pvvp_pvp_cont",
    "Deployment": "pvvp",
    "Description": "PVVP and PVP in parallel with Continuous Stream",
    "Parameters" : {
        "TRAFFIC" : {
            "traffic_type" : "rfc2544_continuous",
            "multistream": 2,
        },
    },
    "TestSteps": [
                    ['vswitch', 'add_vport', 'br0'],
                    ['vswitch', 'add_vport', 'br0'],
                    # priority must be higher than default 32768, otherwise flows won't match
                    ['vswitch', 'add_flow', 'br0',
                     {'in_port': '1', 'actions': ['output:#STEP[-2][1]'], 'idle_timeout': '0', 'dl_type':'0x0800',
                                                  'nw_proto':'17', 'tp_dst':'0', 'priority': '33000'}],
                    ['vswitch', 'add_flow', 'br0',
                     {'in_port': '2', 'actions': ['output:#STEP[-2][1]'], 'idle_timeout': '0', 'dl_type':'0x0800',
                                                  'nw_proto':'17', 'tp_dst':'0', 'priority': '33000'}],
                    ['vswitch', 'add_flow', 'br0', {'in_port': '#STEP[-4][1]', 'actions': ['output:1'],
                                                    'idle_timeout': '0'}],
                    ['vswitch', 'add_flow', 'br0', {'in_port': '#STEP[-4][1]', 'actions': ['output:2'],
                                                    'idle_timeout': '0'}],
                    ['vswitch', 'dump_flows', 'br0'],
                    ['vnf1', 'start'],
                 ]
},

To run the test:

./vsperf --conf-file user_settings.py pvvp_pvp_cont
6. Integration tests

VSPERF includes a set of integration tests defined in conf/integration. These tests can be run by specifying --integration as a parameter to vsperf. Current tests in conf/integration include switch functionality and overlay tests.

Tests in conf/integration can be used to test scaling of different switch configurations by adding steps into the test case, as sketched below.
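Because the files in conf/integration are plain Python, repetitive steps can also be generated programmatically. The following sketch (not one of the standard testcases; names and values are illustrative) builds a list of flow-insertion steps that could be appended to a testcase's TestSteps to exercise a larger flow table:

# Illustrative only: generate 64 add_flow steps, one per UDP destination port.
SCALE_FLOW_STEPS = [
    ['vswitch', 'add_flow', 'int_br0',
     {'in_port': '1', 'dl_type': '0x0800', 'nw_proto': '17',
      'udp_dst': str(port), 'actions': ['output:2'], 'idle_timeout': '0'}]
    for port in range(64)
]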

For the overlay tests VSPERF supports the VXLAN, GRE and GENEVE tunneling protocols. Testing of these protocols is limited to unidirectional traffic and P2P (Physical to Physical) scenarios.

NOTE: The configuration for overlay tests provided in this guide is for unidirectional traffic only.

Executing Integration Tests

To execute integration tests, VSPERF is run with the --integration parameter. To view the current test list, simply execute the following command:

./vsperf --integration --list

The standard tests included are defined inside the conf/integration/01_testcases.conf file.

Executing Tunnel encapsulation tests

The VXLAN OVS DPDK encapsulation tests require IPs, MAC addresses, bridge names and WHITELIST_NICS for DPDK.

NOTE: Only Ixia traffic generators currently support the execution of the tunnel encapsulation tests. Support for other traffic generators may come in a future release.

Default values are already provided. To customize for your environment, override the following variables in your user_settings.py file:

# Variables defined in conf/integration/02_vswitch.conf
# Tunnel endpoint for Overlay P2P deployment scenario
# used for br0
VTEP_IP1 = '192.168.0.1/24'

# Used as remote_ip when adding the OVS tunnel port and
# to set the ARP entry in OVS (e.g. tnl/arp/set br-ext 192.168.240.10 02:00:00:00:00:02)
VTEP_IP2 = '192.168.240.10'

# Network to use when adding a route for inner frame data
VTEP_IP2_SUBNET = '192.168.240.0/24'

# Bridge names
TUNNEL_INTEGRATION_BRIDGE = 'br0'
TUNNEL_EXTERNAL_BRIDGE = 'br-ext'

# IP of br-ext
TUNNEL_EXTERNAL_BRIDGE_IP = '192.168.240.1/24'

# vxlan|gre|geneve
TUNNEL_TYPE = 'vxlan'

# Variables defined in conf/integration/03_traffic.conf
# For OP2P deployment scenario
TRAFFICGEN_PORT1_MAC = '02:00:00:00:00:01'
TRAFFICGEN_PORT2_MAC = '02:00:00:00:00:02'
TRAFFICGEN_PORT1_IP = '1.1.1.1'
TRAFFICGEN_PORT2_IP = '192.168.240.10'

To run VXLAN encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=vxlan' overlay_p2p_tput

To run GRE encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=gre' overlay_p2p_tput

To run GENEVE encapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=geneve' overlay_p2p_tput

To run OVS NATIVE tunnel tests (VXLAN/GRE/GENEVE):

  1. Install the OVS kernel modules:
cd src/ovs/ovs
sudo -E make modules_install
  2. Set the following variables:
VSWITCH = 'OvsVanilla'
# Specify vport_* kernel module to test.
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_vxlan',
    'vport_gre',
    'vport_geneve',
    'datapath/linux/openvswitch.ko',
]

NOTE: If Vanilla OVS is installed from a binary package, please set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  3. Run tests:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'TUNNEL_TYPE=vxlan' overlay_p2p_tput
Executing VXLAN decapsulation tests

To run VXLAN decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Set dstmac of DUT_NIC2_MAC to the MAC address of the 2nd NIC of your DUT:
DUT_NIC2_MAC = '<DUT NIC2 MAC>'
  3. Run test:
./vsperf --conf-file user_settings.py --integration overlay_p2p_decap_cont

If you want to use different values for your VXLAN frame, you may set:

VXLAN_FRAME_L3 = {'proto': 'udp',
                  'packetsize': 64,
                  'srcip': TRAFFICGEN_PORT1_IP,
                  'dstip': '192.168.240.1',
                 }
VXLAN_FRAME_L4 = {'srcport': 4789,
                  'dstport': 4789,
                  'vni': VXLAN_VNI,
                  'inner_srcmac': '01:02:03:04:05:06',
                  'inner_dstmac': '06:05:04:03:02:01',
                  'inner_srcip': '192.168.0.10',
                  'inner_dstip': '192.168.240.9',
                  'inner_proto': 'udp',
                  'inner_srcport': 3000,
                  'inner_dstport': 3001,
                 }
Executing GRE decapsulation tests

To run GRE decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Set dstmac of DUT_NIC2_MAC to the MAC address of the 2nd NIC of your DUT:
DUT_NIC2_MAC = '<DUT NIC2 MAC>'
  3. Run test:
./vsperf --conf-file user_settings.py --test-params 'TUNNEL_TYPE=gre' \
         --integration overlay_p2p_decap_cont

If you want to use different values for your GRE frame, you may set:

GRE_FRAME_L3 = {'proto': 'gre',
                'packetsize': 64,
                'srcip': TRAFFICGEN_PORT1_IP,
                'dstip': '192.168.240.1',
               }

GRE_FRAME_L4 = {'srcport': 0,
                'dstport': 0,
                'inner_srcmac': '01:02:03:04:05:06',
                'inner_dstmac': '06:05:04:03:02:01',
                'inner_srcip': '192.168.0.10',
                'inner_dstip': '192.168.240.9',
                'inner_proto': 'udp',
                'inner_srcport': 3000,
                'inner_dstport': 3001,
               }
Executing GENEVE decapsulation tests

IxNet 7.3X does not have native support for the GENEVE protocol. The template GeneveIxNetTemplate.xml_ClearText.xml should be imported into IxNetwork for this testcase to work.

To import the template:

  1. Run the IxNetwork TCL Server.
  2. Click on the Traffic menu.
  3. Click on Traffic actions and click Edit Packet Templates.
  4. In the Template editor window, click Import. Select the template located at 3rd_party/ixia/GeneveIxNetTemplate.xml_ClearText.xml and click Import.
  5. Restart the TCL Server.

To run GENEVE decapsulation tests:

  1. Set the variables used in “Executing Tunnel encapsulation tests”
  2. Set dstmac of DUT_NIC2_MAC to the MAC address of the 2nd NIC of your DUT:
DUT_NIC2_MAC = '<DUT NIC2 MAC>'
  3. Run test:
./vsperf --conf-file user_settings.py --test-params 'tunnel_type=geneve' \
         --integration overlay_p2p_decap_cont

If you want to use different values for your GENEVE frame, you may set:

GENEVE_FRAME_L3 = {'proto': 'udp',
                   'packetsize': 64,
                   'srcip': TRAFFICGEN_PORT1_IP,
                   'dstip': '192.168.240.1',
                  }

GENEVE_FRAME_L4 = {'srcport': 6081,
                   'dstport': 6081,
                   'geneve_vni': 0,
                   'inner_srcmac': '01:02:03:04:05:06',
                   'inner_dstmac': '06:05:04:03:02:01',
                   'inner_srcip': '192.168.0.10',
                   'inner_dstip': '192.168.240.9',
                   'inner_proto': 'udp',
                   'inner_srcport': 3000,
                   'inner_dstport': 3001,
                  }
Executing Native/Vanilla OVS VXLAN decapsulation tests

To run VXLAN decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_vxlan',
    'datapath/linux/openvswitch.ko',
]

DUT_NIC1_MAC = '<DUT NIC1 MAC ADDRESS>'

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

VXLAN_FRAME_L2 = {'srcmac':
                  '01:02:03:04:05:06',
                  'dstmac': DUT_NIC1_MAC
                 }

VXLAN_FRAME_L3 = {'proto': 'udp',
                  'packetsize': 64,
                  'srcip': TRAFFICGEN_PORT1_IP,
                  'dstip': '172.16.1.1',
                 }

VXLAN_FRAME_L4 = {
                  'srcport': 4789,
                  'dstport': 4789,
                  'protocolpad': 'true',
                  'vni': 99,
                  'inner_srcmac': '01:02:03:04:05:06',
                  'inner_dstmac': '06:05:04:03:02:01',
                  'inner_srcip': '192.168.1.2',
                  'inner_dstip': TRAFFICGEN_PORT2_IP,
                  'inner_proto': 'udp',
                  'inner_srcport': 3000,
                  'inner_dstport': 3001,
                 }

NOTE: If Vanilla OVS is installed from a binary package, please set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=vxlan' overlay_p2p_decap_cont
Executing Native/Vanilla OVS GRE decapsulation tests

To run GRE decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_gre',
    'datapath/linux/openvswitch.ko',
]

DUT_NIC1_MAC = '<DUT NIC1 MAC ADDRESS>'

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

GRE_FRAME_L2 = {'srcmac':
                '01:02:03:04:05:06',
                'dstmac': DUT_NIC1_MAC
               }

GRE_FRAME_L3 = {'proto': 'udp',
                'packetsize': 64,
                'srcip': TRAFFICGEN_PORT1_IP,
                'dstip': '172.16.1.1',
               }

GRE_FRAME_L4 = {
                'srcport': 4789,
                'dstport': 4789,
                'protocolpad': 'true',
                'inner_srcmac': '01:02:03:04:05:06',
                'inner_dstmac': '06:05:04:03:02:01',
                'inner_srcip': '192.168.1.2',
                'inner_dstip': TRAFFICGEN_PORT2_IP,
                'inner_proto': 'udp',
                'inner_srcport': 3000,
                'inner_dstport': 3001,
               }

NOTE: If Vanilla OVS is installed from a binary package, please set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=gre' overlay_p2p_decap_cont
Executing Native/Vanilla OVS GENEVE decapsulation tests

To run GENEVE decapsulation tests:

  1. Set the following variables in your user_settings.py file:
PATHS['vswitch']['OvsVanilla']['src']['modules'] = [
    'vport_geneve',
    'datapath/linux/openvswitch.ko',
]

DUT_NIC1_MAC = '<DUT NIC1 MAC ADDRESS>'

TRAFFICGEN_PORT1_IP = '172.16.1.2'
TRAFFICGEN_PORT2_IP = '192.168.1.11'

VTEP_IP1 = '172.16.1.2/24'
VTEP_IP2 = '192.168.1.1'
VTEP_IP2_SUBNET = '192.168.1.0/24'
TUNNEL_EXTERNAL_BRIDGE_IP = '172.16.1.1/24'
TUNNEL_INT_BRIDGE_IP = '192.168.1.1'

GENEVE_FRAME_L2 = {'srcmac':
                   '01:02:03:04:05:06',
                   'dstmac': DUT_NIC1_MAC
                  }

GENEVE_FRAME_L3 = {'proto': 'udp',
                   'packetsize': 64,
                   'srcip': TRAFFICGEN_PORT1_IP,
                   'dstip': '172.16.1.1',
                  }

GENEVE_FRAME_L4 = {'srcport': 6081,
                   'dstport': 6081,
                   'protocolpad': 'true',
                   'geneve_vni': 0,
                   'inner_srcmac': '01:02:03:04:05:06',
                   'inner_dstmac': '06:05:04:03:02:01',
                   'inner_srcip': '192.168.1.2',
                   'inner_dstip': TRAFFICGEN_PORT2_IP,
                   'inner_proto': 'udp',
                   'inner_srcport': 3000,
                   'inner_dstport': 3001,
                  }

NOTE: If Vanilla OVS is installed from a binary package, please set PATHS['vswitch']['OvsVanilla']['bin']['modules'] instead.

  2. Run test:
./vsperf --conf-file user_settings.py --integration \
         --test-params 'tunnel_type=geneve' overlay_p2p_decap_cont
Executing Tunnel encapsulation+decapsulation tests

The OVS DPDK encapsulation+decapsulation tests require IPs, MAC addresses, bridge names and WHITELIST_NICS for DPDK.

Unlike the test cases above, these test cases exercise tunnel encapsulation and decapsulation without using any ingress overlay traffic. To achieve this, OVS is configured to perform encapsulation and decapsulation in series on the same traffic stream, as shown below:

TRAFFIC-IN --> [ENCAP] --> [MOD-PKT] --> [DECAP] --> TRAFFIC-OUT

Default values are already provided. To customize for your environment, override the following variables in your user_settings.py file:

# Variables defined in conf/integration/02_vswitch.conf

# Bridge names
TUNNEL_EXTERNAL_BRIDGE1 = 'br-phy1'
TUNNEL_EXTERNAL_BRIDGE2 = 'br-phy2'
TUNNEL_MODIFY_BRIDGE1 = 'br-mod1'
TUNNEL_MODIFY_BRIDGE2 = 'br-mod2'

# IP of br-mod1
TUNNEL_MODIFY_BRIDGE_IP1 = '10.0.0.1/24'

# Mac of br-mod1
TUNNEL_MODIFY_BRIDGE_MAC1 = '00:00:10:00:00:01'

# IP of br-mod2
TUNNEL_MODIFY_BRIDGE_IP2 = '20.0.0.1/24'

# MAC of br-mod2
TUNNEL_MODIFY_BRIDGE_MAC2 = '00:00:20:00:00:01'

# vxlan|gre|geneve, Only VXLAN is supported for now.
TUNNEL_TYPE = 'vxlan'

To run VXLAN encapsulation+decapsulation tests:

./vsperf --conf-file user_settings.py --integration \
         overlay_p2p_mod_tput
7. Execution of vswitchperf testcases by Yardstick
General

Yardstick is a generic test execution framework, which is used for validation of the OPNFV platform installation. In the future, Yardstick will support two options for vswitchperf testcase execution:

  • plugin mode, which will execute native vswitchperf testcases; tests will be executed natively by vsperf, and test results will be processed and reported by yardstick.
  • traffic generator mode, which will run vswitchperf in trafficgen mode only; the Yardstick framework will be used to launch VNFs and to configure flows to ensure that traffic is properly routed. This mode will allow testing of OVS performance in real world scenarios.

In the Colorado release, only the traffic generator mode is supported.

Yardstick Installation

In order to run Yardstick testcases, you will need to prepare your test environment. Please follow the installation instructions to install yardstick.

Please note that yardstick uses OpenStack for execution of testcases. OpenStack must be installed with the Heat and Neutron services; otherwise vswitchperf testcases cannot be executed.

VM image with vswitchperf

A special VM image is required for execution of vswitchperf specific testcases by yardstick. It is possible to use a sample VM image available at the OPNFV artifactory or to build a customized image.

Sample VM image with vswitchperf

A sample VM image is available in the vswitchperf section of the OPNFV artifactory for free download:

$ wget http://artifacts.opnfv.org/vswitchperf/vnf/vsperf-yardstick-image.qcow2

This image can be used for execution of sample testcases with dummy traffic generator.

NOTE: Traffic generators might require installation of client software. This software is not included in the sample image and must be installed by the user.

NOTE: This image will be updated only when new features related to yardstick integration are added to vswitchperf.

Preparation of custom VM image

In general, any Linux distribution supported by vswitchperf can be used as a base image for vswitchperf. One possibility is to modify the vloop-vnf image, which can be downloaded from http://artifacts.opnfv.org/vswitchperf.html/ (see vloop-vnf).

Please follow the Installing vswitchperf instructions to install vswitchperf inside the vloop-vnf image. As vswitchperf will be run in trafficgen mode, it is possible to skip installation and compilation of OVS, QEMU and DPDK to keep the image size smaller.

If the selected traffic generator requires installation of additional client software, please follow the appropriate documentation. For example, in the case of IXIA, you would need to install IxOS and the IxNetwork TCL API.

VM image usage

The image with vswitchperf must be uploaded into the glance service and a vswitchperf specific flavor configured, e.g.:

$ glance --os-username admin --os-image-api-version 1 image-create --name \
  vsperf --is-public true --disk-format qcow2 --container-format bare --file \
  vsperf-yardstick-image.qcow2

$ nova --os-username admin flavor-create vsperf-flavor 100 2048 25 1
Testcase execution

After installation, yardstick is available as a python package within the yardstick specific virtual environment. This means the yardstick environment must be activated before test execution, e.g.:

source ~/yardstick_venv/bin/activate

The next step is configuration of the OpenStack environment, e.g. in case of devstack:

source /opt/openstack/devstack/openrc
export EXTERNAL_NETWORK=public

Vswitchperf testcases executable by yardstick are located in the vswitchperf repository inside the yardstick/tests directory. An example of their download and execution follows:

git clone https://gerrit.opnfv.org/gerrit/vswitchperf
cd vswitchperf

yardstick -d task start yardstick/tests/rfc2544_throughput_dummy.yaml

NOTE: Optional argument -d shows debug output.

Testcase customization

Yardstick testcases are described by YAML files. vswitchperf specific testcases are part of the vswitchperf repository and their yaml files can be found in the yardstick/tests directory. For a detailed description of the yaml file structure, please see the yardstick documentation and testcase samples. Only vswitchperf specific parts will be discussed here.

Example of yaml file:

...
scenarios:
-
  type: Vsperf
  options:
    testname: 'p2p_rfc2544_throughput'
    trafficgen_port1: 'eth1'
    trafficgen_port2: 'eth3'
    external_bridge: 'br-ex'
    test_params: 'TRAFFICGEN_DURATION=30;TRAFFIC={'traffic_type':'rfc2544_throughput'}'
    conf_file: '~/vsperf-yardstick.conf'

  host: vsperf.demo

  runner:
    type: Sequence
    scenario_option_name: frame_size
    sequence:
    - 64
    - 128
    - 512
    - 1024
    - 1518
  sla:
    metrics: 'throughput_rx_fps'
    throughput_rx_fps: 500000
    action: monitor

context:
...
Section option

The option section defines details of the vswitchperf test scenario. Many of the options are identical to the vswitchperf parameters passed through the --test-params argument. The following options are supported:

  • frame_size - a packet size for which test should be executed; Multiple packet sizes can be tested by modification of Sequence runner section inside YAML definition. Default: ‘64’
  • conf_file - sets path to the vswitchperf configuration file, which will be uploaded to VM; Default: ‘~/vsperf-yardstick.conf’
  • setup_script - sets path to the setup script, which will be executed during setup and teardown phases
  • trafficgen_port1 - specifies device name of 1st interface connected to the trafficgen
  • trafficgen_port2 - specifies device name of 2nd interface connected to the trafficgen
  • external_bridge - specifies name of external bridge configured in OVS; Default: ‘br-ex’
  • test_params - specifies a string with a list of vsperf configuration parameters, which will be passed to the --test-params CLI argument; Parameters should be stated in the form of param=value and separated by a semicolon. Configuration of traffic generator is driven by TRAFFIC dictionary, which can be also updated by values defined by test_params. Please check VSPERF documentation for details about available configuration parameters and their data types. In case that both test_params and conf_file are specified, then values from test_params will override values defined in the configuration file.

If trafficgen_port1 and/or trafficgen_port2 are defined, these interfaces will be inserted into the external_bridge of OVS. It is expected that OVS runs on the same node where the testcase is executed. In case of a more complex OpenStack installation or a need for additional OVS configuration, setup_script can be used.

NOTE It is essential to specify a configuration for the selected traffic generator. If a standalone testcase is created, the traffic generator can be selected and configured directly in the YAML file by test_params. On the other hand, if multiple testcases should be executed with the same traffic generator settings, a customized configuration file should be prepared and its name passed by the conf_file option.

Section runner

Yardstick supports several runner types. In the case of vswitchperf specific TCs, the Sequence runner type can be used to execute the testcase for a given list of frame sizes.

Section sla

If the sla section is not defined, the testcase will always be considered successful. On the other hand, it is possible to define a set of test metrics and their minimal values to evaluate test success. Any numeric value reported by vswitchperf inside the CSV result file can be used. Multiple metrics can be defined as a comma separated list of items. A minimal value must be set separately for each metric.

e.g.:

sla:
    metrics: 'throughput_rx_fps,throughput_rx_mbps'
    throughput_rx_fps: 500000
    throughput_rx_mbps: 1000

If any of the defined metrics is lower than its defined value, the testcase will be marked as failed. Based on the action policy, yardstick will either stop test execution (value assert) or run the next test (value monitor).

NOTE The throughput SLA (or any other SLA) cannot be set to a meaningful value without knowledge of the server and networking environment, possibly including prior testing in that environment to establish a baseline SLA level under well-understood circumstances.


Yardstick

Performance Testing User Guide (Yardstick)
1. Introduction

Welcome to Yardstick’s documentation!

Yardstick is an OPNFV Project.

The project’s goal is to verify infrastructure compliance, from the perspective of a Virtual Network Function (VNF).

The Project’s scope is the development of a test framework, Yardstick, test cases and test stimuli to enable Network Function Virtualization Infrastructure (NFVI) verification. The Project also includes a sample VNF, the Virtual Traffic Classifier (VTC), and its experimental framework, ApexLake!

Yardstick is used in OPNFV for verifying the OPNFV infrastructure and some of the OPNFV features. The Yardstick framework is deployed in several OPNFV community labs. It is installer, infrastructure and application independent.

See also

Pharos for information on OPNFV community labs and this Presentation for an overview of Yardstick

1.1. About This Document

This document consists of the following chapters:

1.2. Contact Yardstick

Feedback? Contact us

2. Methodology
2.1. Abstract

This chapter describes the methodology implemented by the Yardstick project for verifying the NFVI from the perspective of a VNF.

2.2. ETSI-NFV

The document ETSI GS NFV-TST001, “Pre-deployment Testing; Report on Validation of NFV Environments and Services”, recommends methods for pre-deployment testing of the functional components of an NFV environment.

The Yardstick project implements the methodology described in chapter 6, “Pre-deployment validation of NFV infrastructure”.

The methodology consists of decomposing the typical VNF work-load performance metrics into a number of characteristics/performance vectors, each of which can be represented by distinct test-cases.

The methodology includes five steps:

  • Step 1: Define Infrastructure - the Hardware, Software and corresponding
    configuration target for validation; the OPNFV infrastructure, in OPNFV community labs.
  • Step 2: Identify VNF type - the application for which the
    infrastructure is to be validated, and its requirements on the underlying infrastructure.
  • Step 3: Select test cases - depending on the workload that represents the
    application for which the infrastructure is to be validated, the relevant test cases amongst the list of available Yardstick test cases.
  • Step 4: Execute tests - define the duration and number of iterations for the
    selected test cases; test runs are automated via OPNFV Jenkins Jobs.
  • Step 5: Collect results - using the common API for result collection.

See also

Yardsticktst for material on alignment ETSI TST001 and Yardstick.

2.3. Metrics

The metrics, as defined by ETSI GS NFV-TST001, are shown in Table1, Table2 and Table3.

In the OPNFV Colorado release, generic test cases covering aspects of the listed metrics are available; further OPNFV releases will provide extended testing of these metrics. The view of available Yardstick test cases cross-referenced with the ETSI definitions in Table1, Table2 and Table3 is shown in Table4. It should be noted that the Yardstick test cases are examples; the test duration and number of iterations are configurable, as are the System Under Test (SUT) and the attributes (or, in Yardstick nomenclature, the scenario options).

Table 1 - Performance/Speed Metrics

Category Performance/Speed
Compute
  • Latency for random memory access
  • Latency for cache read/write operations
  • Processing speed (instructions per second)
  • Throughput for random memory access (bytes per second)
Network
  • Throughput per NFVI node (frames/byte per second)
  • Throughput provided to a VM (frames/byte per second)
  • Latency per traffic flow
  • Latency between VMs
  • Latency between NFVI nodes
  • Packet delay variation (jitter) between VMs
  • Packet delay variation (jitter) between NFVI nodes
Storage
  • Sequential read/write IOPS
  • Random read/write IOPS
  • Latency for storage read/write operations
  • Throughput for storage read/write operations

Table 2 - Capacity/Scale Metrics

Category Capacity/Scale
Compute
  • Number of cores and threads
  • Available memory size
  • Cache size
  • Processor utilization (max, average, standard deviation)
  • Memory utilization (max, average, standard deviation)
  • Cache utilization (max, average, standard deviation)
Network
  • Number of connections
  • Number of frames sent/received
  • Maximum throughput between VMs (frames/byte per second)
  • Maximum throughput between NFVI nodes (frames/byte per second)
  • Network utilization (max, average, standard deviation)
  • Number of traffic flows
Storage
  • Storage/Disk size
  • Capacity allocation (block-based, object-based)
  • Block size
  • Maximum sequential read/write IOPS
  • Maximum random read/write IOPS
  • Disk utilization (max, average, standard deviation)

Table 3 - Availability/Reliability Metrics

Category Availability/Reliability
Compute
  • Processor availability (Error free processing time)
  • Memory availability (Error free memory time)
  • Processor mean-time-to-failure
  • Memory mean-time-to-failure
  • Number of processing faults per second
Network
  • NIC availability (Error free connection time)
  • Link availability (Error free transmission time)
  • NIC mean-time-to-failure
  • Network timeout duration due to link failure
  • Frame loss rate
Storage
  • Disk availability (Error free disk access time)
  • Disk mean-time-to-failure
  • Number of failed storage read/write operations per second

Table 4 - Yardstick Generic Test Cases

Category Performance/Speed Capacity/Scale Availability/Reliability
Compute
  • Performance/Speed: TC003 [1] TC004 TC010 TC012 TC014 TC069
  • Capacity/Scale: TC003 [1] TC004 TC024 TC055
  • Availability/Reliability: TC013 [1] TC015 [1]
Network
  • Performance/Speed: TC001 TC002 TC009 TC011 TC042 TC043
  • Capacity/Scale: TC044 TC073 TC075
  • Availability/Reliability: TC016 [1] TC018 [1]
Storage
  • Performance/Speed: TC005
  • Capacity/Scale: TC063
  • Availability/Reliability: TC017 [1]

Note

The description in this OPNFV document is intended as a reference for users to understand the scope of the Yardstick Project and the deliverables of the Yardstick framework. For complete description of the methodology, please refer to the ETSI document.

Footnotes

[1] To be included in future deliveries.
3. Architecture
3.1. Abstract

This chapter describes the yardstick framework software architecture. We introduce it from the Use-Case View, Logical View, Process View and Deployment View, and provide further technical details along the way.

3.2. Overview
3.2.1. Architecture overview

Yardstick is mainly written in Python, and test configurations are made in YAML. Documentation is written in reStructuredText format, i.e. .rst files. Yardstick is inspired by Rally. Yardstick is intended to run on a computer with access and credentials to a cloud. The test case is described in a configuration file given as an argument.

How it works: the benchmark task configuration file is parsed and converted into an internal model. The context part of the model is converted into a Heat template and deployed into a stack. Each scenario is run using a runner, either serially or in parallel. Each runner runs in its own subprocess, executing commands in a VM using SSH. The output of each scenario is written as json records to a file, influxdb or an http server; influxdb is used as the backend, and the test results are shown with grafana.

3.2.2. Concept

Benchmark - assess the relative performance of something

Benchmark configuration file - describes a single test case in yaml format

Context - The set of Cloud resources used by a scenario, such as user names, image names, affinity rules and network configurations. A context is converted into a simplified Heat template, which is used to deploy onto the Openstack environment.

Data - Output produced by running a benchmark, written to a file in json format

Runner - Logic that determines how a test scenario is run and reported, for example the number of test iterations, input value stepping and test duration. Predefined runner types exist for re-usage, see Runner types.

Scenario - Type/class of measurement for example Ping, Pktgen, (Iperf, LmBench, ...)

SLA - Relates to what result boundary a test case must meet to pass. For example a latency limit, amount or ratio of lost packets and so on. Action based on SLA can be configured, either just to log (monitor) or to stop further testing (assert). The SLA criteria is set in the benchmark configuration file and evaluated by the runner.

3.2.3. Runner types

There exists several predefined runner types to choose between when designing a test scenario:

Arithmetic: Every test run arithmetically steps the specified input value(s) in the test scenario, adding a value to the previous input value. It is also possible to combine several input values for the same test case in different combinations.

Snippet of an Arithmetic runner configuration:

runner:
    type: Arithmetic
    iterators:
    -
      name: stride
      start: 64
      stop: 128
      step: 64

Duration: The test runs for a specific period of time before completing.

Snippet of a Duration runner configuration:

runner:
  type: Duration
  duration: 30

Sequence: The test changes a specified input value to the scenario. The input values to the sequence are specified in a list in the benchmark configuration file.

Snippet of a Sequence runner configuration:

runner:
  type: Sequence
  scenario_option_name: packetsize
  sequence:
  - 100
  - 200
  - 250

Iteration: Tests are run a specified number of times before completing.

Snippet of an Iteration runner configuration:

runner:
  type: Iteration
  iterations: 2
3.3. Use-Case View

The Yardstick Use-Case View shows two kinds of users. One is the Tester who does testing in the cloud; the other is the User who is more concerned with test results and result analyses.

Testers will run a single test case or a test case suite to verify infrastructure compliance or benchmark their own infrastructure performance. Test results are stored by the dispatcher module; three kinds of store method (file, influxdb and http) can be configured. Detailed information about scenarios and runners can be queried with the CLI by testers.

Users can check test results in four ways.

If the dispatcher module is configured as file (default), there are two ways to check test results. One is to get the result from yardstick.out (default path: /tmp/yardstick.out); the other is to get a plot of the test results, which is shown when users execute the command “yardstick-plot”.

If the dispatcher module is configured as influxdb, users can check test results on Grafana, which is most commonly used for visualizing time series data.

If the dispatcher module is configured as http, users can check test results on the OPNFV testing dashboard, which uses MongoDB as a backend.

Yardstick Use-Case View
3.4. Logical View

Yardstick Logical View describes the most important classes, their organization, and the most important use-case realizations.

Main classes:

TaskCommands - “yardstick task” subcommand handler.

HeatContext - Converts the context section of the test yaml file into a HOT (Heat Orchestration Template), and deploys and undeploys the OpenStack heat stack.

Runner - Logic that determines how a test scenario is run and reported.

TestScenario - Type/class of measurement for example Ping, Pktgen, (Iperf, LmBench, ...)

Dispatcher - Choose user defined way to store test results.

TaskCommands is the “yardstick task” subcommand’s main entry. It takes a yaml file (e.g. test.yaml) as input, and uses HeatContext to convert the yaml file’s context section to HOT. After the OpenStack heat stack is deployed by HeatContext with the converted HOT, TaskCommands uses Runner to run the specified TestScenario. During the first runner initialization, it creates the output process. The output process uses Dispatcher to push test results. The Runner also creates a process to execute the TestScenario, and there is a multiprocessing queue between each runner process and the output process, so the runner process can push real-time test results to the storage media. TestScenario is commonly connected to VMs using ssh. It sets up the VMs and runs test measurement scripts through the ssh tunnel. After all TestScenarios are finished, TaskCommands undeploys the heat stack. Then the whole test is finished.

Yardstick framework architecture in Danube
3.5. Process View (Test execution flow)

Yardstick process view shows how yardstick runs a test case. Below is the sequence graph about the test execution flow using heat context, and each object represents one module in yardstick:

Yardstick Process View

A user wants to do a test with yardstick. He can use the CLI to input the command to start a task. “TaskCommands” will receive the command and ask “HeatContext” to parse the context. “HeatContext” will then ask “Model” to convert the model. After the model is generated, “HeatContext” will inform “Openstack” to deploy the heat stack by heat template. After “Openstack” deploys the stack, “HeatContext” will inform “Runner” to run the specific test case.

Firstly, “Runner” asks “TestScenario” to process the specific scenario. “TestScenario” then logs on to the OpenStack VMs via the ssh protocol and executes the test case on the specified VMs. After the script execution finishes, “TestScenario” sends a message to inform “Runner”. When the testing job is done, “Runner” informs “Dispatcher” to output the test result via file, influxdb or http. After the result is output, “HeatContext” calls “Openstack” to undeploy the heat stack. Once the stack is undeployed, the whole test ends.

3.6. Deployment View

The Yardstick deployment view shows how the yardstick tool can be deployed onto the underlying platform. Generally, the yardstick tool is installed on a JumpServer (see 07-installation for detailed installation steps), and the JumpServer is connected with other control/compute servers by networking. Based on this deployment, yardstick can run test cases on these hosts and collect the test results for reporting.

Yardstick Deployment View
3.7. Yardstick Directory structure

yardstick/ - Yardstick main directory.

tests/ci/ - Used for continuous integration of Yardstick at different PODs and
with support for different installers.
docs/ - All documentation is stored here, such as configuration guides,
user guides and Yardstick descriptions.

etc/ - Used for test cases requiring specific POD configurations.

samples/ - test case samples are stored here, most of all scenario and
feature’s samples are shown in this directory.
tests/ - Here both Yardstick internal tests (functional/ and unit/) as
well as the test cases run to verify the NFVI (opnfv/) are stored. Also configurations of what to run daily and weekly at the different PODs is located here.
tools/ - Currently contains tools to build the image for VMs which are deployed
by Heat, including how to build the yardstick-trusty-server image with the different tools that are needed from within the image.

plugin/ - Plug-in configuration files are stored here.

vTC/ - Contains the files for running the virtual Traffic Classifier tests.

yardstick/ - Contains the internals of Yardstick: Runners, Scenario, Contexts,
CLI parsing, keys, plotting tools, dispatcher, plugin install/remove scripts and so on.
4. Yardstick Installation
4.1. Abstract

Yardstick supports installation by Docker or directly in Ubuntu. The installation procedure for Docker and direct installation are detailed in the sections below.

To use Yardstick you should have access to an OpenStack environment, with at least Nova, Neutron, Glance, Keystone and Heat installed.

The steps needed to run Yardstick are:

  1. Install Yardstick.
  2. Load OpenStack environment variables.
  3. Create Yardstick flavor.
  4. Build a guest image and load it into the OpenStack environment.
  5. Create the test configuration .yaml file and run the test case/suite.
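The following is a minimal sketch of steps 2-5 on the Jumphost; the flavor size, image file path and sample test case are illustrative assumptions, while the flavor and image names match those referenced in the verification section below.

source openrc                                 # step 2: load OpenStack environment variables
openstack flavor create --ram 512 --disk 3 --vcpus 1 yardstick-flavor    # step 3 (sizes are illustrative)
# step 4: build a guest image (see the tools/ directory) and load it into Glance, e.g.:
openstack image create --disk-format qcow2 --container-format bare \
        --file /tmp/yardstick-image.img yardstick-image
yardstick task start samples/ping.yaml        # step 5: run a sample test case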
4.2. Prerequisites

The OPNFV deployment is out of the scope of this document and can be found here. The OPNFV platform is considered as the System Under Test (SUT) in this document.

Several prerequisites are needed for Yardstick:

  1. A Jumphost to run Yardstick on
  2. A Docker daemon or a virtual environment installed on the Jumphost
  3. A public/external network created on the SUT
  4. Connectivity from the Jumphost to the SUT public/external network

NOTE: Jumphost refers to any server which meets the previous requirements. Normally it is the same server from where the OPNFV deployment has been triggered.

WARNING: Connectivity from the Jumphost is essential, and it is of paramount importance to make sure it is working before installing and running Yardstick. Also make sure you understand how your networking is designed to work.

NOTE: If your Jumphost is operating behind a company http proxy and/or firewall, please first consult the Proxy Support (Todo) section towards the end of this document. That section details some tips/tricks which may help in a proxified environment.

4.4. Install Yardstick directly in Ubuntu

Alternatively you can install Yardstick framework directly in Ubuntu or in an Ubuntu Docker image. No matter which way you choose to install Yardstick, the following installation steps are identical.

If you choose to use the Ubuntu Docker image, you can pull the Ubuntu Docker image from Docker hub:

docker pull ubuntu:16.04
4.4.1. Install Yardstick

Prerequisite preparation:

apt-get update && apt-get install -y git python-setuptools python-pip
easy_install -U setuptools==30.0.0
pip install appdirs==1.4.0
pip install virtualenv

Create a virtual environment:

virtualenv ~/yardstick_venv
export YARDSTICK_VENV=~/yardstick_venv
source ~/yardstick_venv/bin/activate

Download the source code and install Yardstick from it:

git clone https://gerrit.opnfv.org/gerrit/yardstick
export YARDSTICK_REPO_DIR=~/yardstick
cd yardstick
./install.sh
4.4.2. Configure the Yardstick environment (Todo)

When Yardstick is installed directly in Ubuntu, the yardstick env command is not available; you need to prepare the OpenStack environment variables and create the Yardstick flavor and guest image manually.
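A sketch of this manual preparation is shown below; the flavor size and image file path are assumptions for illustration only, while the flavor and image names match those used in the verification section:

source openrc
openstack flavor create --ram 512 --disk 3 --vcpus 1 yardstick-flavor
openstack image create --disk-format qcow2 --container-format bare \
        --file /tmp/yardstick-image.img yardstick-image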

4.4.3. Uninstall Yardstick

To uninstall Yardstick, just delete the virtual environment:

rm -rf ~/yardstick_venv
4.5. Verify the installation

It is recommended to verify that Yardstick was installed successfully by executing some simple commands and test samples. Before executing Yardstick test cases make sure yardstick-flavor and yardstick-image can be found in OpenStack and the openrc file is sourced. Below is an example invocation of Yardstick help command and ping.py test sample:

yardstick -h
yardstick task start samples/ping.yaml

NOTE: The above commands can be run both in the Yardstick container and directly in Ubuntu.

Each testing tool supported by Yardstick has a sample configuration file. These configuration files can be found in the samples directory.

Default location for the output is /tmp/yardstick.out.

4.6. Deploy InfluxDB and Grafana using Docker

Without InfluxDB, Yardstick stores the results of a running test case in the file /tmp/yardstick.out. However, this makes it inconvenient to retrieve and display test results. The following sections therefore show how to use InfluxDB to store the data and Grafana to display it.

4.6.2. Manually deploy InfluxDB and Grafana containers

You can also deploy the InfluxDB and Grafana containers manually on the Jumphost. The following sections show how.

4.6.2.1. Pull docker images
docker pull tutum/influxdb
docker pull grafana/grafana
4.6.2.2. Run and configure influxDB

Run influxDB:

docker run -d --name influxdb \
-p 8083:8083 -p 8086:8086 --expose 8090 --expose 8099 \
tutum/influxdb
docker exec -it influxdb bash

Configure influxDB:

influx
>CREATE USER root WITH PASSWORD 'root' WITH ALL PRIVILEGES
>CREATE DATABASE yardstick;
>use yardstick;
>show MEASUREMENTS;
4.6.2.3. Run and configure Grafana

Run Grafana:

docker run -d --name grafana -p 3000:3000 grafana/grafana

Log in to http://{YOUR_IP_HERE}:3000 using admin/admin and configure the InfluxDB data source to point at {YOUR_IP_HERE}:8086.

Grafana data source configuration
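If you prefer to script this step, Grafana's HTTP API can create the data source. The sketch below assumes the default admin/admin credentials and reuses the InfluxDB settings from the previous section:

curl -u admin:admin -X POST http://{YOUR_IP_HERE}:3000/api/datasources \
     -H "Content-Type: application/json" \
     -d '{"name": "yardstick", "type": "influxdb", "access": "proxy",
          "url": "http://{YOUR_IP_HERE}:8086", "database": "yardstick",
          "user": "root", "password": "root", "isDefault": true}'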
4.6.2.4. Configure yardstick.conf
docker exec -it yardstick /bin/bash
cp etc/yardstick/yardstick.conf.sample /etc/yardstick/yardstick.conf
vi /etc/yardstick/yardstick.conf

Modify yardstick.conf:

[DEFAULT]
debug = True
dispatcher = influxdb

[dispatcher_influxdb]
timeout = 5
target = http://{YOUR_IP_HERE}:8086
db_name = yardstick
username = root
password = root

Now you can run Yardstick test cases and store the results in influxDB.
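To quickly check that results are being stored, you can query the database from inside the InfluxDB container; the measurement name used below (ping) is only an example and depends on the scenario you ran:

docker exec -it influxdb bash
influx
>use yardstick;
>show MEASUREMENTS;
>select * from ping limit 5;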

4.7. Deploy InfluxDB and Grafana directly in Ubuntu (Todo)
4.8. Run Yardstick in a local environment

We also have a guide about how to run Yardstick in a local environment. This work was contributed by Tapio Tallgren. You can find this guide here.

4.9. Create a test suite for Yardstick

A test suite in yardstick is a yaml file which includes one or more test cases. Yardstick supports running a test suite as a task, so you can customize your own test suite and run it in one task.

tests/opnfv/test_suites is the folder where Yardstick keeps its CI test suites. A typical test suite looks like the fuel_test_suite.yaml example below:

---
# Fuel integration test task suite

schema: "yardstick:suite:0.1"

name: "fuel_test_suite"
test_cases_dir: "samples/"
test_cases:
-
  file_name: ping.yaml
-
  file_name: iperf3.yaml

As you can see, there are two test cases in the fuel_test_suite.yaml. The schema and the name must be specified. The test cases should be listed via the tag test_cases and their relative path is also marked via the tag test_cases_dir.

Yardstick test suites also support constraints and task args for each test case. Here is another sample (the os-nosdn-nofeature-ha.yaml example) to show this, excerpted from a larger test suite:

---

schema: "yardstick:suite:0.1"

name: "os-nosdn-nofeature-ha"
test_cases_dir: "tests/opnfv/test_cases/"
test_cases:
-
    file_name: opnfv_yardstick_tc002.yaml
-
    file_name: opnfv_yardstick_tc005.yaml
-
    file_name: opnfv_yardstick_tc043.yaml
    constraint:
        installer: compass
        pod: huawei-pod1
    task_args:
        huawei-pod1: '{"pod_info": "etc/yardstick/.../pod.yaml",
        "host": "node4.LF","target": "node5.LF"}'

As you can see for test case opnfv_yardstick_tc043.yaml, there are two additional tags, constraint and task_args. constraint specifies on which installer or pod the test case can be run in the CI environment. task_args specifies the task arguments for each pod.

In summary, to create a test suite in Yardstick you just need to create a yaml file and add test cases, with constraints or task arguments if necessary.
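Once the suite file is in place it can be run as a single task; the path below assumes the suite was saved under tests/opnfv/test_suites/ as described above:

yardstick task start --suite tests/opnfv/test_suites/os-nosdn-nofeature-ha.yaml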

4.10. Proxy Support (Todo)
5. Installing a plug-in into Yardstick
5.1. Abstract

Yardstick provides a plugin CLI command to support integration with other OPNFV testing projects. Below is an example invocation of Yardstick plugin command and Storperf plug-in sample.

5.2. Installing Storperf into Yardstick

Storperf is delivered as a Docker container from https://hub.docker.com/r/opnfv/storperf/tags/.

There are two possible methods for installation in your environment:

  • Run container on Jump Host
  • Run container in a VM

In this introduction we will install Storperf on Jump Host.

5.2.1. Step 0: Environment preparation

Requirements for running Storperf on the Jump Host:

  • Docker must be installed
  • Jump Host must have access to the OpenStack Controller API
  • Jump Host must have internet connectivity for downloading docker image
  • Enough floating IPs must be available to match your agent count

Before installing Storperf into yardstick you need to check your openstack environment and other dependencies:

  1. Make sure docker is installed.
  2. Make sure Keystone, Nova, Neutron, Glance and Heat are installed correctly.
  3. Make sure the Jump Host has access to the OpenStack Controller API.
  4. Make sure the Jump Host has internet connectivity for downloading the docker image.
  5. You need to know where to get basic openstack Keystone authorization info, such as OS_PASSWORD, OS_TENANT_NAME, OS_AUTH_URL, OS_USERNAME.
  6. To run a Storperf container, you need to have the OpenStack Controller environment variables defined and passed to the Storperf container. The best way to do this is to put the environment variables in a “storperf_admin-rc” file. The storperf_admin-rc file should include at least the following credential environment variables:
  • OS_AUTH_URL
  • OS_USERNAME
  • OS_PASSWORD
  • OS_TENANT_ID
  • OS_TENANT_NAME
  • OS_PROJECT_NAME
  • OS_PROJECT_ID
  • OS_USER_DOMAIN_ID

Yardstick provides a “prepare_storperf_admin-rc.sh” script that can be used to generate the “storperf_admin-rc” file. This script is located at tests/ci/prepare_storperf_admin-rc.sh:

#!/bin/bash
# Prepare storperf_admin-rc for StorPerf.
AUTH_URL=${OS_AUTH_URL}
USERNAME=${OS_USERNAME:-admin}
PASSWORD=${OS_PASSWORD:-console}

TENANT_NAME=${OS_TENANT_NAME:-admin}
TENANT_ID=`openstack project show admin|grep '\bid\b' |awk -F '|' '{print $3}'|sed -e 's/^[[:space:]]*//'`
PROJECT_NAME=${OS_PROJECT_NAME:-$TENANT_NAME}
PROJECT_ID=`openstack project show admin|grep '\bid\b' |awk -F '|' '{print $3}'|sed -e 's/^[[:space:]]*//'`
USER_DOMAIN_ID=${OS_USER_DOMAIN_ID:-default}

rm -f ~/storperf_admin-rc
touch ~/storperf_admin-rc

echo "OS_AUTH_URL="$AUTH_URL >> ~/storperf_admin-rc
echo "OS_USERNAME="$USERNAME >> ~/storperf_admin-rc
echo "OS_PASSWORD="$PASSWORD >> ~/storperf_admin-rc
echo "OS_PROJECT_NAME="$PROJECT_NAME >> ~/storperf_admin-rc
echo "OS_PROJECT_ID="$PROJECT_ID >> ~/storperf_admin-rc
echo "OS_TENANT_NAME="$TENANT_NAME >> ~/storperf_admin-rc
echo "OS_TENANT_ID="$TENANT_ID >> ~/storperf_admin-rc
echo "OS_USER_DOMAIN_ID="$USER_DOMAIN_ID >> ~/storperf_admin-rc

The generated “storperf_admin-rc” file will be stored in the home directory of the user running the script (~/storperf_admin-rc). If you installed Yardstick using Docker, this file will be located in the container. You may need to copy it to the root directory of the host where Storperf is deployed.
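A typical invocation on the Jump Host, assuming the OpenStack openrc file has already been sourced (the script queries the admin project via the openstack CLI), would be:

source openrc
bash tests/ci/prepare_storperf_admin-rc.sh
cat ~/storperf_admin-rc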

5.2.2. Step 1: Plug-in configuration file preparation

To install a plug-in, first you need to prepare a plug-in configuration file in YAML format and store it in the “plugin” directory. The plug-in configuration file works as the input to the yardstick “plugin” command. Below is a sample Storperf plug-in configuration file:

---
# StorPerf plugin configuration file
# Used for integration StorPerf into Yardstick as a plugin
schema: "yardstick:plugin:0.1"
plugins:
  name: storperf
deployment:
  ip: 192.168.23.2
  user: root
  password: root

In the plug-in configuration file, you need to specify the plug-in name and the plug-in deployment info, including node ip, node login username and password. Here the Storperf will be installed on IP 192.168.23.2 which is the Jump Host in my local environment.

5.2.3. Step 2: Plug-in install/remove scripts preparation

In the “yardstick/resource/scripts” directory there are two folders: an “install” folder and a “remove” folder. You need to store the plug-in install/remove scripts in these two folders respectively.

The detailed install and remove operations should be defined in these two scripts. The name of both the install and remove scripts should match the plug-in name that you specified in the plug-in configuration file.

For example, the install and remove scripts for Storperf are both named “storperf.bash”.

5.2.4. Step 3: Install and remove Storperf

To install Storperf, simply execute the following command:

# Install Storperf
yardstick plugin install plugin/storperf.yaml
5.2.4.1. Removing Storperf from Yardstick

To remove Storperf, simply execute the following command:

# Remove Storperf
yardstick plugin remove plugin/storperf.yaml

The yardstick plugin command uses the username and password to log into the deployment target and then executes the corresponding install or remove script.

6. Store Other Project’s Test Results in InfluxDB
6.1. Abstract

This chapter illustrates how to run plug-in test cases and store test results into community’s InfluxDB. The framework is shown in Framework.

Store Other Project's Test Results in InfluxDB
6.2. Store Storperf Test Results into Community’s InfluxDB

As shown in Framework, there are two ways to store Storperf test results into community’s InfluxDB:

  1. Yardstick executes Storperf test case (TC074), posting test job to Storperf container via ReST API. After the test job is completed, Yardstick reads test results via ReST API from Storperf and posts test data to the influxDB.
  2. Additionally, Storperf can run tests by itself and post the test result directly to the InfluxDB. The method for posting data directly to influxDB will be supported in the future.

Our plan is to support rest-api in D release so that other testing projects can call the rest-api to use yardstick dispatcher service to push data to yardstick’s influxdb database.

For now, influxdb only supports the line protocol; the json protocol is deprecated.

Take the ping test case as an example; the raw_result is in json format like this:

  "benchmark": {
      "timestamp": 1470315409.868095,
      "errors": "",
      "data": {
        "rtt": {
        "ares": 1.125
        }
      },
    "sequence": 1
    },
  "runner_id": 2625
}

With the help of “influxdb_line_protocol”, the json is transformed into a line string like the one below:

'ping,deploy_scenario=unknown,host=athena.demo,installer=unknown,pod_name=unknown,
  runner_id=2625,scenarios=Ping,target=ares.demo,task_id=77755f38-1f6a-4667-a7f3-
    301c99963656,version=unknown rtt.ares=1.125 1470315409868094976'

So, for data output in json format, you just need to transform the json into line format and call the influxdb api to post the data into the database. All of this functionality has already been implemented in Yardstick’s InfluxDB dispatcher. If you need support on this, please contact Mingjiang.

curl -i -XPOST 'http://104.197.68.199:8086/write?db=yardstick' --
  data-binary 'ping,deploy_scenario=unknown,host=athena.demo,installer=unknown, ...'

Grafana is used for visualizing the collected test data, as shown in the results visualization figure below. Grafana can be accessed via its login page (see the Grafana dashboard chapter).

results visualization
7. Grafana dashboard
7.1. Abstract

This chapter describes the Yardstick grafana dashboard. The Yardstick grafana dashboard can be found here: http://testresults.opnfv.org/grafana/

Yardstick grafana dashboard
7.2. Public access

Yardstick provides a public account for accessing the dashboard. The username and password are both set to ‘opnfv’.

7.3. Testcase dashboard

For each test case, there is a dedicated dashboard. Shown here is the dashboard of TC002.

On the top left of each test case dashboard there is a dashboard selection; you can switch to different test cases using this pull-down menu.

Underneath, we have a pod and scenario selection. All the pods and scenarios that have ever published test data to the InfluxDB will be shown here.

You can check multiple pods or scenarios.

For each test case, we have a short description and a link to detailed test case information in Yardstick user guide.

Underneath is the result presentation section. You can use the time period selection in the top right corner to zoom the chart in or out.

7.4. Administration access

For a user with administration rights it is easy to update and save any dashboard configuration. Saved updates immediately take effect and become live. This may cause issues like:

  • Changes and updates made to the live configuration in Grafana can compromise existing Grafana content in an unwanted, unpredicted or incompatible way. Grafana as such is not version controlled, there exists one single Grafana configuration per dashboard.
  • There is a risk several people can disturb each other when doing updates to the same Grafana dashboard at the same time.

Administrators should therefore make any changes with care.

7.5. Add a dashboard into yardstick grafana

Due to security concerns, users using the public opnfv account are not able to edit the yardstick grafana directly. It takes a few more steps for a non-yardstick user to add a custom dashboard into yardstick grafana.

There are 6 steps to go.

Add a dashboard into yardstick grafana
  1. You need to build a local influxdb and grafana, so you can do the work locally. You can refer to How to deploy InfluxDB and Grafana locally wiki page about how to do this.
  2. Once step one is done, you can fetch the existing grafana dashboard configuration file from the yardstick repository and import it into your local grafana. After the import is done, your grafana dashboard will be ready to use just like the community’s dashboard.
  3. The third step is running some test cases to generate test results and publishing it to your local influxdb.
  4. Now you have some data to visualize in your dashboard. In the fourth step, it is time to create your own dashboard. You can either modify an existing dashboard or try to create a new one from scratch. If you choose to modify an existing dashboard then in the curtain menu of the existing dashboard do a “Save As...” into a new dashboard copy instance, and then continue doing all updates and saves within the dashboard copy.
  5. When finished with all Grafana configuration changes in this temporary dashboard then chose “export” of the updated dashboard copy into a JSON file and put it up for review in Gerrit, in file /yardstick/dashboard/Yardstick-TCxxx-yyyyyyyyyyyyy. For instance a typical default name of the file would be “Yardstick-TC001 Copy-1234567891234”.
  6. Once you finish your dashboard, the next step is to export the configuration file and propose a patch to Yardstick. The Yardstick team will review it and merge it into the Yardstick repository. After the review is approved, the Yardstick team will do an “import” of the JSON file and also a “save dashboard” as soon as possible to replace the old live dashboard configuration.
8. Yardstick Restful API
8.1. Abstract

Yardstick supports a RESTful API as of Danube.

8.2. Available API
8.2.1. /yardstick/env/action

Description: This API is used to do some work related to environment. For now, we support:

  1. Prepare the yardstick environment (including fetching the openrc file, getting the external network and loading images)
  2. Start an InfluxDB docker container and configure yardstick to output results to InfluxDB.
  3. Start a Grafana docker container and configure it with the InfluxDB.

Which action is performed depends on the parameters passed in the POST body.

Method: POST

Prepare Yardstick Environment Example:

{
    'action': 'prepareYardstickEnv'
}

This is an asynchronous API. You need to call /yardstick/asynctask API to get the task result.
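For illustration, the POST can be issued with curl as shown below; the port 8888 matches the examples later in this chapter, while the Content-Type header is an assumption:

curl -i -X POST http://localhost:8888/yardstick/env/action \
     -H "Content-Type: application/json" \
     -d '{"action": "prepareYardstickEnv"}'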

Start and Config InfluxDB docker container Example:

{
    'action': 'createInfluxDBContainer'
}

This is an asynchronous API. You need to call /yardstick/asynctask API to get the task result.

Start and Config Grafana docker container Example:

{
    'action': 'createGrafanaContainer'
}

This is an asynchronous API. You need to call /yardstick/asynctask API to get the task result.

8.2.2. /yardstick/asynctask

Description: This API is used to get the status of asynchronous task

Method: GET

Get the status of asynchronous task Example:

http://localhost:8888/yardstick/asynctask?task_id=3f3f5e03-972a-4847-a5f8-154f1b31db8c

The returned status will be 0 (running), 1 (finished) or 2 (failed).
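The same query can be issued with curl, reusing the example task id above; piping through python -m json.tool is optional and only pretty-prints the response:

curl 'http://localhost:8888/yardstick/asynctask?task_id=3f3f5e03-972a-4847-a5f8-154f1b31db8c' | python -m json.tool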

8.2.3. /yardstick/testcases

Description: This API is used to list all release test cases now in yardstick.

Method: GET

Get a list of release test cases Example:

http://localhost:8888/yardstick/testcases
8.2.4. /yardstick/testcases/release/action

Description: This API is used to run a yardstick release test case.

Method: POST

Run a release test case Example:

{
    'action': 'runTestCase',
    'args': {
        'opts': {},
        'testcase': 'tc002'
    }
}

This is an asynchronous API. You need to call /yardstick/results to get the result.
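As a sketch, running a release test case and then fetching its results could look like the following; the response is assumed to contain a task id, which is then passed to /yardstick/results:

curl -i -X POST http://localhost:8888/yardstick/testcases/release/action \
     -H "Content-Type: application/json" \
     -d '{"action": "runTestCase", "args": {"opts": {}, "testcase": "tc002"}}'

# then, using the task id returned by the call above:
curl 'http://localhost:8888/yardstick/results?task_id=<task_id>'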

8.2.5. /yardstick/testcases/samples/action

Description: This API is used to run a yardstick sample test case.

Method: POST

Run a sample test case Example:

{
    'action': 'runTestCase',
    'args': {
        'opts': {},
        'testcase': 'ping'
    }
}

This is an asynchronous API. You need to call /yardstick/results to get the result.

8.2.6. /yardstick/testsuites/action

Description: This API is used to run a yardstick test suite.

Method: POST

Run a test suite Example:

{
    'action': 'runTestSuite',
    'args': {
        'opts': {},
        'testcase': 'smoke'
    }
}

This is an asynchronous API. You need to call /yardstick/results to get the result.

8.2.7. /yardstick/results

Description: This API is used to get the test results of certain task. If you call /yardstick/testcases/samples/action API, it will return a task id. You can use the returned task id to get the results by using this API.

Get test results of one task Example:

http://localhost:8888/yardstick/results?task_id=3f3f5e03-972a-4847-a5f8-154f1b31db8c

This API will return a list of test case result

9. Yardstick User Interface

This interface allows a user to view the test results in table format and also as values plotted on a graph.

9.1. Command
yardstick report generate <task-ID> <testcase-filename>
9.2. Description

1. When the command is triggered with the provided task-id and testcase name, the respective values are retrieved from the database (influxdb in this particular case).

2. The values are then formatted and provided to the html template, which is framed with a complete html body using the Django framework.

3. The whole template is then written into an html file.

The graph is plotted with Timestamp on the x-axis and the output values (which differ from testcase to testcase) on the y-axis, with the help of “Highcharts”.
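An illustrative invocation, reusing the example task id from the Restful API chapter and the TC002 test case file name, would be:

yardstick report generate 3f3f5e03-972a-4847-a5f8-154f1b31db8c opnfv_yardstick_tc002.yaml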

10. Virtual Traffic Classifier
10.1. Abstract

This chapter provides an overview of the virtual Traffic Classifier, a contribution to OPNFV Yardstick from the EU Project TNOVA. Additional documentation is available in TNOVAresults.

10.2. Overview

The virtual Traffic Classifier (VTC) VNF comprises a Virtual Network Function Component (VNFC). The VNFC contains both the Traffic Inspection module and the Traffic Forwarding module needed to run the VNF. The exploitation of Deep Packet Inspection (DPI) methods for traffic classification is built around two basic assumptions:

  • third parties unaffiliated with either source or recipient are able to inspect each IP packet’s payload
  • the classifier knows the relevant syntax of each application’s packet payloads (protocol signatures, data patterns, etc.).

The proposed DPI based approach will only use an indicative, small number of the initial packets from each flow in order to identify the content and not inspect each packet.

In this respect it follows the Packet Based per Flow State (PBFS) approach. This method uses a table, maintained for each flow, to track each session based on the 5-tuple (source address, destination address, source port, destination port, transport protocol).

10.3. Concepts
  • Traffic Inspection: The process of packet analysis and application identification of network traffic that passes through the VTC.
  • Traffic Forwarding: The process of packet forwarding from an incoming network interface to a pre-defined outgoing network interface.
  • Traffic Rule Application: The process of packet tagging, based on a predefined set of rules. Packet tagging may include e.g. Type of Service (ToS) field modification.

10.4. Architecture

The Traffic Inspection module is the most computationally intensive component of the VNF. It implements filtering and packet matching algorithms in order to support the enhanced traffic forwarding capability of the VNF. The component supports a flow table (exploiting hashing algorithms for fast indexing of flows) and an inspection engine for traffic classification.

The implementation used for these experiments exploits the nDPI library. The packet capturing mechanism is implemented using libpcap. When the DPI engine identifies a new flow, the flow register is updated with the appropriate information and transmitted across the Traffic Forwarding module, which then applies any required policy updates.

The Traffic Forwarding module is responsible for routing and packet forwarding. It accepts incoming network traffic, consults the flow table for classification information for each incoming flow and then applies pre-defined policies, marking e.g. ToS/Differentiated Services Code Point (DSCP) multimedia traffic for Quality of Service (QoS) enablement on the forwarded traffic. It is assumed that the traffic is forwarded using the default policy until it is identified and new policies are enforced.

The expected response delay is considered to be negligible, as only a small number of packets are required to identify each flow.

10.5. Graphical Overview
+----------------------------+
|                            |
| Virtual Traffic Classifier |
|                            |
|     Analysing/Forwarding   |
|        ------------>       |
|     ethA          ethB     |
|                            |
+----------------------------+
     |              ^
     |              |
     v              |
+----------------------------+
|                            |
|     Virtual Switch         |
|                            |
+----------------------------+
10.6. Install

run the vTC/build.sh with root privileges

10.7. Run
sudo ./pfbridge -a eth1 -b eth2

Note

The Virtual Traffic Classifier is not supported in the OPNFV Danube release.

10.8. Development Environment

Ubuntu 14.04 Ubuntu 16.04

11. Apexlake Installation Guide
11.1. Abstract

ApexLake is a framework that provides automatic execution of experiments and related data collection, enabling a user to validate infrastructure from the perspective of a Virtual Network Function (VNF).

In the context of Yardstick, a virtual Traffic Classifier (VTC) network function is utilized.

11.1.1. Framework Hardware Dependencies

In order to run the framework there are some hardware related dependencies for ApexLake.

The framework needs to be installed on the same physical node where DPDK-pktgen is installed.

The installation requires that the physical node hosting the packet generator has 2 DPDK-compatible NICs.

The 2 NICs will be connected to the switch where the OpenStack VM network is managed.

The switch used must support multicast traffic and IGMP snooping. Further details about the configuration are provided here.

The corresponding ports to which the cables are connected need to be configured as VLAN trunks using two of the VLAN IDs available for Neutron. Note the VLAN IDs used as they will be required in later configuration steps.

11.1.2. Framework Software Dependencies

Before starting the framework, a number of dependencies must first be installed. The following describes the set of instructions to be executed via the Linux shell in order to install and configure the required dependencies.

  1. Install Dependencies.

To support the framework dependencies the following packages must be installed. The example provided is based on Ubuntu and needs to be executed in root mode.

apt-get install python-dev
apt-get install python-pip
apt-get install python-mock
apt-get install tcpreplay
apt-get install libpcap-dev
  2. Source OpenStack openrc file.
source openrc
  3. Configure OpenStack Neutron

In order to support traffic generation and management by the virtual Traffic Classifier, the configuration of the port security driver extension is required for Neutron.

For further details please follow this link: PORTSEC. This step can be skipped if the target OpenStack is the Juno or Kilo release, but it is required to support Liberty. It is therefore required to indicate the release version in the configuration file located at ./yardstick/vTC/apexlake/apexlake.conf

  4. Create Two Networks based on VLANs in Neutron.

To enable network communications between the packet generator and the compute node, two networks must be created via Neutron and mapped to the VLAN IDs that were previously used in the configuration of the physical switch. The following shows the typical set of commands required to configure Neutron correctly. The physical switches need to be configured accordingly.

VLAN_1=2032
VLAN_2=2033
PHYSNET=physnet2
neutron net-create apexlake_inbound_network \
        --provider:network_type vlan \
        --provider:segmentation_id $VLAN_1 \
        --provider:physical_network $PHYSNET

neutron subnet-create apexlake_inbound_network \
        192.168.0.0/24 --name apexlake_inbound_subnet

neutron net-create apexlake_outbound_network \
        --provider:network_type vlan \
        --provider:segmentation_id $VLAN_2 \
        --provider:physical_network $PHYSNET

neutron subnet-create apexlake_outbound_network 192.168.1.0/24 \
        --name apexlake_outbound_subnet
  5. Download Ubuntu Cloud Image and load it on Glance

The virtual Traffic Classifier is supported on top of Ubuntu 14.04 cloud image. The image can be downloaded on the local machine and loaded on Glance using the following commands:

wget cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-disk1.img
glance image-create \
        --name ubuntu1404 \
        --is-public true \
        --disk-format qcow2 \
        --container-format bare \
        --file trusty-server-cloudimg-amd64-disk1.img
  6. Configure the Test Cases

The VLAN tags must also be included in the test case Yardstick yaml file as parameters for the following test cases:

11.1.2.1. Install and Configure DPDK Pktgen

Execution of the framework is based on DPDK Pktgen. If DPDK Pktgen has not been installed, it is necessary to download, install, compile and configure it. The user can create a directory and download the dpdk packet generator source code:

cd experimental_framework/libraries
mkdir dpdk_pktgen
git clone https://github.com/pktgen/Pktgen-DPDK.git

For instructions on the installation and configuration of DPDK and DPDK Pktgen please follow the official DPDK Pktgen README file. Once the installation is completed, it is necessary to load the DPDK kernel driver, as follow:

insmod uio
insmod DPDK_DIR/x86_64-native-linuxapp-gcc/kmod/igb_uio.ko

It is necessary to set the configuration file to support the desired Pktgen configuration. A description of the required configuration parameters and supporting examples is provided in the following:

[PacketGen]
packet_generator = dpdk_pktgen

# This is the directory where the packet generator is installed
# (if the user previously installed dpdk-pktgen,
# it is required to provide the directory where it is installed).
pktgen_directory = /home/user/software/dpdk_pktgen/dpdk/examples/pktgen/

# This is the directory where DPDK is installed
dpdk_directory = /home/user/apexlake/experimental_framework/libraries/Pktgen-DPDK/dpdk/

# Name of the dpdk-pktgen program that starts the packet generator
program_name = app/app/x86_64-native-linuxapp-gcc/pktgen

# DPDK coremask (see DPDK-Pktgen readme)
coremask = 1f

# DPDK memory channels (see DPDK-Pktgen readme)
memory_channels = 3

# Name of the interface of the pktgen to be used to send traffic (vlan_sender)
name_if_1 = p1p1

# Name of the interface of the pktgen to be used to receive traffic (vlan_receiver)
name_if_2 = p1p2

# PCI bus address correspondent to if_1
bus_slot_nic_1 = 01:00.0

# PCI bus address correspondent to if_2
bus_slot_nic_2 = 01:00.1

To find the parameters related to names of the NICs and the addresses of the PCI buses the user may find it useful to run the DPDK tool nic_bind as follows:

DPDK_DIR/tools/dpdk_nic_bind.py --status

Lists the NICs available on the system, and shows the available drivers and bus addresses for each interface. Please make sure to select NICs which are DPDK compatible.

11.1.2.2. Installation and Configuration of smcroute

The user is required to install smcroute which is used by the framework to support multicast communications.

The following is the list of commands required to download and install smcroute.

cd ~
git clone https://github.com/troglobit/smcroute.git
cd smcroute
git reset --hard c3f5c56
sed -i 's/aclocal-1.11/aclocal/g' ./autogen.sh
sed -i 's/automake-1.11/automake/g' ./autogen.sh
./autogen.sh
./configure
make
sudo make install
cd ..

It is required to reset to the specified commit ID. It is also required to create a configuration file using the following command:

SMCROUTE_NIC=(name of the nic)

where name of the nic is the name used previously for the variable “name_if_2”. For example:

SMCROUTE_NIC=p1p2

Then create the smcroute configuration file /etc/smcroute.conf

echo mgroup from $SMCROUTE_NIC group 224.192.16.1 > /etc/smcroute.conf

At the end of this procedure it will be necessary to perform the following actions to add the user to the sudoers:

adduser USERNAME sudo
echo "user ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
11.1.2.3. Experiment using SR-IOV Configuration on the Compute Node

To enable SR-IOV interfaces on the physical NIC of the compute node, a compatible NIC is required. NIC configuration depends on model and vendor. After proper configuration to support SR-IOV, a proper configuration of OpenStack is required. For further information, please refer to the SRIOV configuration guide

11.1.3. Finalize the installation of the framework on the system

The installation of the framework on the system requires the setup of the project. After entering into the apexlake directory, it is sufficient to run the following command.

python setup.py install

Since some elements are copied into the /tmp directory (see configuration file) it could be necessary to repeat this step after a reboot of the host.

12. Apexlake API Interface Definition
12.1. Abstract

The API interface provided by the framework to enable the execution of test cases is defined as follows.

12.2. init

static init()

Initializes the Framework

Returns None

12.3. execute_framework

static execute_framework(test_cases,
                         iterations,
                         heat_template,
                         heat_template_parameters,
                         deployment_configuration,
                         openstack_credentials)

Executes the framework according the specified inputs

Parameters

  • test_cases

    Test cases to be run with the workload (dict() of dict())

    Example:

    test_case = dict()
    test_case['name'] = 'module.Class'
    test_case['params'] = dict()
    test_case['params']['throughput'] = '1'
    test_case['params']['vlan_sender'] = '1000'
    test_case['params']['vlan_receiver'] = '1001'
    test_cases = [test_case]

  • iterations

    Number of test cycles to be executed (int)

  • heat_template

    (string) File name of the heat template corresponding to the workload to be deployed. It contains the parameters to be evaluated in the form of #parameter_name. (See heat_templates/vTC.yaml as example).

  • heat_template_parameters

    (dict) Parameters to be provided as input to the heat template. See http://docs.openstack.org/developer/heat/template_guide/hot_guide.html, section “Template input parameters”, for further info.

  • deployment_configuration

    (dict[string] = list(strings)) Dictionary of parameters representing the deployment configuration of the workload.

    The key is a string corresponding to the name of the parameter, the value is a list of strings representing the value to be assumed by a specific param. The parameters are user defined: they have to correspond to the place holders (#parameter_name) specified in the heat template.

Returns dict() containing results

13. Network Services Benchmarking (NSB)
13.1. Abstract

This chapter provides an overview of the NSB, a contribution to OPNFV Yardstick from Intel.

13.2. Overview

GOAL: Extend Yardstick to perform real world VNFs and NFVi Characterization and benchmarking with repeatable and deterministic methods.

The Network Service Benchmarking (NSB) extends the yardstick framework to do VNF characterization and benchmarking in three different execution environments - bare metal i.e. native Linux environment, standalone virtual environment and managed virtualized environment (e.g. Open stack etc.). It also brings in the capability to interact with external traffic generators both hardware & software based for triggering and validating the traffic according to user defined profiles.

NSB extension includes:

  • Generic data models of Network Services, based on the ETSI spec ETSI GS NFV-TST 001 (http://www.etsi.org/deliver/etsi_gs/NFV-TST/001_099/001/01.01.01_60/gs_nfv-tst001v010101p.pdf)

  • New Standalone context for VNF testing like SRIOV, OVS, OVS-DPDK etc

  • Generic VNF configuration models and metrics implemented with Python classes

  • Traffic generator features and traffic profiles

    • L1-L3 state-less traffic profiles
    • L4-L7 state-full traffic profiles
    • Tunneling protocol / network overlay support
  • Test case samples

    • Ping
    • Trex
    • vPE,vCGNAT, vFirewall etc - ipv4 throughput, latency etc
  • Traffic generators like Trex, ab/nginx, ixia, iperf etc

  • KPIs for a given use case:

    • System agent support for collecting NFVi KPI. This includes:

      • CPU statistic
      • Memory BW
      • OVS-DPDK Stats
    • Network KPIs, e.g., inpackets, outpackets, throughput, latency etc

    • VNF KPIs, e.g., packet_in, packet_drop, packet_fwd etc

13.3. Architecture

The Network Service (NS) defines a set of Virtual Network Functions (VNF) connected together using NFV infrastructure.

The Yardstick NSB extension can support multiple VNFs created by different vendors, including traffic generators. Every VNF being tested has its own data model. The Network Service defines the VNF model based on the network functionality it performs. Part of the data model is a set of configuration parameters, the number of connection points used and the flavor, including core count and memory amount.

ETSI defines a Network Service as a set of configurable VNFs working in some NFV Infrastructure, connected to each other using Virtual Links available through Connection Points. The ETSI MANO specification defines a set of management entities called Network Service Descriptors (NSD) and VNF Descriptors (VNFD) that define a real Network Service. The picture below gives an example of how a real Network Operator use-case can map onto the ETSI Network Service definition.

Network Service framework performs the necessary test steps. It may involve

  • Interacting with traffic generator and providing the inputs on traffic type / packet structure to generate the required traffic as per the test case. Traffic profiles will be used for this.
  • Executing the commands required for the test procedure and analysing the command output to confirm whether the command executed correctly, e.g. as per the test case, running the traffic for the given time period or waiting for the necessary time delay
  • Verify the test result.
  • Validate the traffic flow from SUT
  • Fetch the table / data from SUT and verify the value as per the test case
  • Upload the logs from SUT onto the Test Harness server
  • Read the KPI’s provided by particular VNF
13.3.1. Components of Network Service
  • Models for Network Service benchmarking: The Network Service benchmarking requires the proper modelling approach. The NSB provides models using Python files and defining of NSDs and VNFDs.

The benchmark control application being a part of OPNFV yardstick can call that python models to instantiate and configure the VNFs. Depending on infrastructure type (bare-metal or fully virtualized) that calls could be made directly or using MANO system.

  • Traffic generators in NSB: Any benchmark application requires a set of traffic generator and traffic profiles defining the method in which traffic is generated.

The Network Service benchmarking model extends the Network Service definition with a set of Traffic Generators (TG) that are treated same way as other VNFs being a part of benchmarked network service. Same as other VNFs the traffic generator are instantiated and terminated.

Every traffic generator has its own configuration, defined as a traffic profile, and a set of supported KPIs. The python model for a TG is extended with specific calls to listen for and generate traffic.

  • The stateless TREX traffic generator: The main traffic generator used as Network Service stimulus is open source TREX tool.

The TREX tool can generate any kind of stateless traffic.

+--------+      +-------+      +--------+
|        |      |       |      |        |
|  Trex  | ---> |  VNF  | ---> |  Trex  |
|        |      |       |      |        |
+--------+      +-------+      +--------+

Supported testcases scenarios:

  • Correlated UDP traffic using TREX traffic generator and replay VNF.

    • using different IMIX configurations, e.g. pure voice, pure video traffic etc
    • using different numbers of IP flows, e.g. 1 flow, 1K, 16K, 64K, 256K, 1M flows
    • using different numbers of configured rules, e.g. 1 rule, 1K, 10K rules

For UDP correlated traffic following Key Performance Indicators are collected for every combination of test case parameters:

  • RFC2544 throughput for various loss rate defined (1% is a default)
13.4. Graphical Overview

NSB testing with the yardstick framework facilitates performance testing of the various VNFs provided.

+-----------+
|           |                                                     +-----------+
|   vPE     |                                                   ->|TGen Port 0|
| TestCase  |                                                   | +-----------+
|           |                                                   |
+-----------+     +------------------+            +-------+     |
                  |                  | -- API --> |  VNF  | <--->
+-----------+     |     Yardstick    |            +-------+     |
| Test Case | --> |    NSB Testing   |                          |
+-----------+     |                  |                          |
      |           |                  |                          |
      |           +------------------+                          |
+-----------+                                                   | +-----------+
|   Traffic |                                                   ->|TGen Port 1|
|  patterns |                                                     +-----------+
+-----------+

            Figure 1: Network Service - 2 server configuration
14. Yardstick - NSB Testing -Installation
14.1. Abstract

The Network Service Benchmarking (NSB) extends the yardstick framework to do VNF characterization and benchmarking in three different execution environments viz., bare metal i.e. native Linux environment, standalone virtual environment and managed virtualized environment (e.g. Open stack etc.). It also brings in the capability to interact with external traffic generators both hardware & software based for triggering and validating the traffic according to user defined profiles.

The steps needed to run Yardstick with NSB testing are:

  • Install Yardstick (NSB Testing).
  • Setup pod.yaml describing Test topology
  • Create the test configuration yaml file.
  • Run the test case.
14.2. Prerequisites

Refer to the Yardstick Installation chapter for more information on yardstick prerequisites.

Several prerequisites are needed for Yardstick(VNF testing):

  • Python Modules: pyzmq, pika.
  • flex
  • bison
  • build-essential
  • automake
  • libtool
  • librabbitmq-dev
  • rabbitmq-server
  • collectd
  • intel-cmt-cat
14.3. Install Yardstick (NSB Testing)

Refer chapter Yardstick Installation for more information on installing Yardstick

After Yardstick is installed, execute the “nsb_setup.sh” script to set up NSB testing:

./nsb_setup.sh

It will also automatically download all the packages needed for NSB Testing setup.

14.4. System Topology:
+----------+              +----------+
|          |              |          |
|          | (0)----->(0) |   Ping/  |
|    TG1   |              |   vPE/   |
|          |              |   2Trex  |
|          | (1)<-----(1) |          |
+----------+              +----------+
trafficgen_1                   vnf
14.5. OpenStack parameters and credentials
14.5.1. Environment variables

Before running Yardstick (NSB Testing) it is necessary to export traffic generator libraries.

source ~/.bash_profile
14.5.2. Config yardstick conf
cp ./etc/yardstick/yardstick.conf.sample /etc/yardstick/yardstick.conf
vi /etc/yardstick/yardstick.conf

Add trex_path and bin_path in ‘nsb’ section.

[DEFAULT]
debug = True
dispatcher = influxdb

[dispatcher_influxdb]
timeout = 5
target = http://{YOUR_IP_HERE}:8086
db_name = yardstick
username = root
password = root

[nsb]
trex_path=/opt/nsb_bin/trex/scripts
bin_path=/opt/nsb_bin
14.5.3. Config pod.yaml describing Topology

Before executing Yardstick test cases, make sure that pod.yaml reflects the topology and update all the required fields.

cp /etc/yardstick/nodes/pod.yaml.nsb.sample /etc/yardstick/nodes/pod.yaml

Config pod.yaml

nodes:
-
    name: trafficgen_1
    role: TrafficGen
    ip: 1.1.1.1
    user: root
    password: r00t
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.0"
            driver:    i40e # default kernel driver
            dpdk_port_num: 0
            local_ip: "152.16.100.20"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:01"
        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.1"
            driver:    i40e # default kernel driver
            dpdk_port_num: 1
            local_ip: "152.16.40.20"
            netmask:   "255.255.255.0"
            local_mac: "00:00.00:00:00:02"

-
    name: vnf
    role: vnf
    ip: 1.1.1.2
    user: root
    password: r00t
    host: 1.1.1.2 #BM - host == ip, virtualized env - Host - compute node
    interfaces:
        xe0:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.0"
            driver:    i40e # default kernel driver
            dpdk_port_num: 0
            local_ip: "152.16.100.19"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:03"

        xe1:  # logical name from topology.yaml and vnfd.yaml
            vpci:      "0000:07:00.1"
            driver:    i40e # default kernel driver
            dpdk_port_num: 1
            local_ip: "152.16.40.19"
            netmask:   "255.255.255.0"
            local_mac: "00:00:00:00:00:04"
    routing_table:
    - network: "152.16.100.20"
      netmask: "255.255.255.0"
      gateway: "152.16.100.20"
      if: "xe0"
    - network: "152.16.40.20"
      netmask: "255.255.255.0"
      gateway: "152.16.40.20"
      if: "xe1"
    nd_route_tbl:
    - network: "0064:ff9b:0:0:0:0:9810:6414"
      netmask: "112"
      gateway: "0064:ff9b:0:0:0:0:9810:6414"
      if: "xe0"
    - network: "0064:ff9b:0:0:0:0:9810:2814"
      netmask: "112"
      gateway: "0064:ff9b:0:0:0:0:9810:2814"
      if: "xe1"
14.5.4. Enable yardstick virtual environment

Before executing yardstick test cases, make sure to activate yardstick python virtual environment

source /opt/nsb_bin/yardstick_venv/bin/activate
14.6. Run Yardstick - Network Service Testcases
14.6.1. NS testing - using NSBperf CLI
 source /opt/nsb_setup/yardstick_venv/bin/activate
 source ~/.bash_profile    # sets PYTHONPATH and traffic generator libraries
 cd <yardstick_repo>/yardstick/cmd

Execute command: ./NSBperf.py -h
     ./NSBperf.py --vnf <selected vnf> --test <rfc test>
     eg: ./NSBperf.py --vnf vpe --test tc_baremetal_rfc2544_ipv4_1flow_64B.yaml
14.6.2. NS testing - using yardstick CLI
source /opt/nsb_setup/yardstick_venv/bin/activate
source ~/.bash_profile    # sets PYTHONPATH and traffic generator libraries

Go to the folder of the test case type you want to execute, e.g. <yardstick repo>/samples/vnf_samples/nsut/<vnf>/, and run:

yardstick --debug task start <test_case.yaml>
15. Yardstick Test Cases
15.1. Abstract

This chapter lists available Yardstick test cases. Yardstick test cases are divided in two main categories:

  • Generic NFVI Test Cases - Test Cases developed to realize the methodology described in Methodology
  • OPNFV Feature Test Cases - Test Cases developed to verify one or more aspects of a feature delivered by an OPNFV Project, including the test cases developed for the VTC.

15.2. Generic NFVI Test Case Descriptions
15.2.1. Yardstick Test Case Description TC001
Network Performance
test case id OPNFV_YARDSTICK_TC001_NETWORK PERFORMANCE
metric Number of flows and throughput
test purpose

The purpose of TC001 is to evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

pktgen

The Linux packet generator (pktgen) is a tool for generating packets at very high speed in the kernel. pktgen is mainly used to drive and test LAN equipment and networks. It supports multi-threading and can generate UDP packets with random MAC addresses, IP addresses and port numbers, using multiple CPU processors across different PCI buses (PCI, PCIe) with Gigabit Ethernet cards. pktgen performance depends on hardware parameters such as CPU processing speed, memory latency and PCI bus speed; the transmit data rate can exceed 10 Gbit/s, which satisfies most NIC test requirements.

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

test description This test case uses Pktgen to generate packet flow between two hosts for simulating network workloads on the SUT.
traffic profile An IP table is setup on server to monitor for received packets.
configuration

file: opnfv_yardstick_tc001.yaml

Packet size is set to 60 bytes. Number of ports: 10, 50, 100, 500 and 1000, where each runs for 20 seconds. The whole sequence is run twice. The client and server are distributed on different hardware.

For SLA max_ppm is set to 1000. The amount of configured ports map to between 110 up to 1001000 flows, respectively.

applicability

Test can be configured with different:

  • packet sizes;
  • amount of flows;
  • test duration.

Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that it is acceptable to lose (i.e. not receive).

usability This test case is used for generating high network throughput to simulate certain workloads on the SUT. Hence it should work with other test cases.
references

pktgen

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected with the server VM by using ssh. The ‘pktgen_benchmark’ bash script is copied from the Jump Host to the server VM via the ssh tunnel.
step 3 An IP table is setup on server to monitor for received packets.
step 4

pktgen is invoked to generate packet flows between the server and the client to simulate network workloads on the SUT. Results are processed and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 5 Two host VMs are deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.2. Yardstick Test Case Description TC002
Network Latency
test case id OPNFV_YARDSTICK_TC002_NETWORK LATENCY
metric RTT (Round Trip Time)
test purpose

The purpose of TC002 is to do a basic verification that network latency is within acceptable boundaries when packets travel between hosts located on same or different compute blades.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

ping

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer and echoed back to the source.

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Docker image. (For example also a Cirros image can be downloaded from cirros-image, it includes ping)

test topology

Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host VM to target VM(s) to elicit ICMP ECHO_RESPONSE.

For one host VM there can be multiple target VMs. Host VM and target VM(s) can be on same or different compute blades.

configuration

file: opnfv_yardstick_tc002.yaml

Packet size 100 bytes. Test duration 60 seconds. One ping each 10 seconds. Test is iterated two times. SLA RTT is set to maximum 10 ms.

applicability

This test case can be configured with different:

  • packet sizes;
  • burst sizes;
  • ping intervals;
  • test durations;
  • test iterations.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected, and also normal to achieve in balanced L2 environments. However, to cover most configurations, both bare metal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many real time applications start to suffer badly if the RTT time is higher than this. Some may suffer bad also close to this RTT, while others may not suffer at all. It is a compromise that may have to be tuned for different configuration purposes.

usability This test case is one of Yardstick’s generic tests. Thus it is runnable on most of the scenarios.
references

Ping

ETSI-NFV-TST001

pre-test conditions

The test case image (cirros-image) needs to be installed into Glance with ping included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected with the server VM by using ssh. The ‘ping_benchmark’ bash script is copied from the Jump Host to the server VM via the ssh tunnel.
step 3

Ping is invoked. Ping packets are sent from server VM to client VM. RTT results are calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 Two host VMs are deleted.
test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
15.2.3. Yardstick Test Case Description TC004
Cache Utilization
test case id OPNFV_YARDSTICK_TC004_CACHE Utilization
metric cache hit, cache miss, hit/miss ratio, buffer size and page cache size
test purpose

The purpose of TC004 is to evaluate the IaaS compute capability with regard to cache utilization. This test case should be run in parallel with other Yardstick test cases and not run as a stand-alone test case.

This test case measures cache usage statistics, including cache hit, cache miss, hit ratio, buffer cache size and page cache size, with some workloads running on the infrastructure. Both average and maximum values are collected.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

cachestat

cachestat is a tool using Linux ftrace capabilities for showing Linux page cache hit/miss statistics.

(cachestat is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with cachestat included.)

test description The cachestat test is invoked in a host VM on a compute blade; it requires some other test cases running in the host to stimulate a workload.
configuration

File: cachestat.yaml (in the ‘samples’ directory)

Interval is set to 1: the test repeats, pausing 1 second in between. Test duration is set to 60 seconds.

SLA is not available in this test case.
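
As a hedged illustration only, the scenario section of such a cachestat task might look roughly as follows (the scenario type name, host name and runner keys are assumptions based on the values quoted above):

scenarios:
-
  type: CACHEstat
  options:
    interval: 1            # sample cache statistics every second
  host: poseidon.demo      # illustrative VM name
  runner:
    type: Duration
    duration: 60           # collect statistics for 60 seconds
# No SLA is defined for this test case.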

applicability

Test can be configured with different:

  • interval;
  • runner Duration.

Default values exist.

usability This test case is one of Yardstick’s generic test cases. Thus, it is runnable on most of the scenarios.
references

cachestat

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with cachestat included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with cachestat installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. ‘cache_stat’ bash script is copied from Jump Host to the server VM via the ssh tunnel.
step 3

‘cache_stat’ script is invoked. Raw cache usage statistics are collected and filtrated. Average and maximum values are calculated and recorded. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict None. Cache utilization results are collected and stored.
15.2.4. Yardstick Test Case Description TC005
Storage Performance
test case id OPNFV_YARDSTICK_TC005_STORAGE PERFORMANCE
metric IOPS (Average IOs performed per second), Throughput (Average disk read/write bandwidth rate), Latency (Average disk read/write latency)
test purpose

The purpose of TC005 is to evaluate the IaaS storage performance with regards to IOPS, throughput and latency.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

fio

fio is an I/O tool meant to be used both for benchmark and stress/hardware verification. It has support for 19 different types of I/O engines (sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O, forked or threaded jobs, and much more.

(fio is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with fio included.)

test description fio test is invoked in a host VM on a compute blade, a job file as well as parameters are passed to fio and fio will start doing what the job file tells it to do.
configuration

file: opnfv_yardstick_tc005.yaml

IO types are set to read, write, randwrite, randread, rw. IO block sizes are set to 4KB, 64KB, 1024KB. fio is run for each IO type and IO block size combination; each iteration runs for 30 seconds (10 for ramp time, 20 for runtime).

For SLA, minimum read/write iops is set to 100, minimum read/write throughput is set to 400 KB/s, and maximum read/write latency is set to 20000 usec.
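
For orientation, one IO type/block size combination from the configuration above could be sketched as a Fio scenario roughly as follows; the file name, host name and exact option/SLA keys are illustrative assumptions, and only the read-related SLA fields are shown:

scenarios:
-
  type: Fio
  options:
    filename: /home/ubuntu/data.raw   # illustrative test file on the VM
    bs: 4k                            # one of the IO block sizes above
    rw: randread                      # one of the IO types above
    ramp_time: 10                     # seconds of warm-up before measuring
    duration: 20                      # measured runtime in seconds
  host: fio.demo                      # illustrative VM name
  runner:
    type: Iteration
    iterations: 1
  sla:
    read_iops: 100                    # minimum accepted read IOPS
    read_bw: 400                      # minimum accepted read bandwidth, KB/s
    read_lat: 20000                   # maximum accepted read latency, usec
    action: monitor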

applicability

This test case can be configured with different:

  • IO types;
  • IO block size;
  • IO depth;
  • ramp time;
  • test duration.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably higher throughput and lower latency are expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read/write bandwidths are lower than this.

usability This test case is one of Yardstick’s generic test cases. Thus, it is runnable on most of the scenarios.
references

fio

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with fio included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with fio installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. ‘fio_benchmark’ bash script is copied from Jump Host to the host VM via the ssh tunnel.
step 3

‘fio_benchmark’ script is invoked. Simulated IO operations are started. IOPS, disk read/write bandwidth and latency are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.5. Yardstick Test Case Description TC008
Packet Loss Extended Test
test case id OPNFV_YARDSTICK_TC008_NW PERF, Packet loss Extended Test
metric Number of flows, packet size and throughput
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of packet sizes and flows matter for the throughput between VMs on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc008.yaml

Packet size: 64, 128, 256, 512, 1024, 1280 and 1518 bytes.

Number of ports: 1, 10, 50, 100, 500 and 1000. The configured numbers of ports map to between 2 and 1001000 flows, respectively. Each packet_size/port_amount combination is run ten times, for 20 seconds each. Then the next packet_size/port_amount combination is run, and so on.

The client and server are distributed on different HW.

For SLA max_ppm is set to 1000.
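
As a rough, assumption-laden sketch, a single packet_size/port_amount combination from the configuration above could be expressed as a Pktgen scenario similar to the following (VM names and some keys are illustrative; the real task file repeats this for every combination):

scenarios:
-
  type: Pktgen
  options:
    packetsize: 64           # one of the packet sizes listed above
    number_of_ports: 10      # maps to the corresponding number of flows
    duration: 20             # seconds per run
  host: demeter.demo         # traffic-generating VM, illustrative name
  target: poseidon.demo      # receiving VM, illustrative name
  runner:
    type: Iteration
    iterations: 10           # each combination is run ten times
  sla:
    max_ppm: 1000            # accepted packet loss in parts per million
    action: monitor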

test tool

pktgen

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

references

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, i.e. not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.6. Yardstick Test Case Description TC009
Packet Loss
test case id OPNFV_YARDSTICK_TC009_NW PERF, Packet loss
metric Number of flows, packets lost and throughput
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between VMs on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc009.yaml

Packet size: 64 bytes

Number of ports: 1, 10, 50, 100, 500 and 1000. The configured numbers of ports map to between 2 and 1001000 flows, respectively. Each port amount is run ten times, for 20 seconds each. Then the next port_amount is run, and so on.

The client and server are distributed on different HW.

For SLA max_ppm is set to 1000.

test tool

pktgen

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

references

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, i.e. not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.7. Yardstick Test Case Description TC010
Memory Latency
test case id OPNFV_YARDSTICK_TC010_MEMORY LATENCY
metric Memory read latency (nanoseconds)
test purpose

The purpose of TC010 is to evaluate the IaaS compute performance with regards to memory read latency. It measures the memory read latency for varying memory sizes and strides. The whole memory hierarchy is measured.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

Lmbench

Lmbench is a suite of operating system microbenchmarks. This test uses the lat_mem_rd tool from that suite. The suite includes benchmarks for:

  • Context switching
  • Networking: connection establishment, pipe, TCP, UDP, and RPC hot potato
  • File system creates and deletes
  • Process creation
  • Signal handling
  • System call overhead
  • Memory read latency

(LMbench is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with LMbench included.)

test description

LMbench lat_mem_rd benchmark measures memory read latency for varying memory sizes and strides.

The benchmark runs as two nested loops. The outer loop is the stride size. The inner loop is the array size. For each array size, the benchmark creates a ring of pointers that point backward one stride. Traversing the array is done by:

p = (char **)*p;

in a for loop (the overhead of the for loop is not significant; the loop is an unrolled loop 100 loads long). The size of the array varies from 512 bytes to (typically) eight megabytes. For the small sizes, the cache will have an effect, and the loads will be much faster. This becomes much more apparent when the data is plotted.

Only data accesses are measured; the instruction cache is not measured.

The results are reported in nanoseconds per load and have been verified accurate to within a few nanoseconds on an SGI Indy.

configuration

File: opnfv_yardstick_tc010.yaml

  • SLA (max_latency): 30 nanoseconds
  • Stride - 128 bytes
  • Stop size - 64 megabytes
  • Iterations: 10 - test is run 10 times iteratively.
  • Interval: 1 - there is 1 second delay between each iteration.

SLA is optional. The SLA in this test case serves as an example. Considerably lower read latency is expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read latency is higher than this.
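
Purely as an illustration of the values above, an Lmbench latency scenario might be sketched as follows (the scenario type, host name and option/SLA key names are assumptions and may differ from the actual opnfv_yardstick_tc010.yaml):

scenarios:
-
  type: Lmbench
  options:
    test_type: "latency"     # run lat_mem_rd rather than bw_mem
    stride: 128              # stride in bytes
    stop_size: 64.0          # largest array size, in megabytes
  host: demeter.demo         # illustrative VM name
  runner:
    type: Iteration
    iterations: 10           # test is run 10 times
    interval: 1              # 1 second delay between iterations
  sla:
    max_latency: 30          # maximum accepted latency in nanoseconds
    action: monitor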

applicability

Test can be configured with different:

  • strides;
  • stop_size;
  • iterations and intervals.

Default values exist.

SLA (optional) : max_latency: The maximum memory latency that is accepted.

usability This test case is one of Yardstick’s generic test cases. Thus, it is runnable on most of the scenarios.
references

LMbench lat_mem_rd

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with Lmbench included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with LMbench installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. ‘lmbench_latency_benchmark’ bash script is copied from Jump Host to the host VM via the ssh tunnel.
step 3

‘lmbench_latency_benchmark’ script is invoked. LMbench’s lat_mem_rd benchmark starts to measure memory read latency for varying memory sizes and strides. Memory read latencies are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Test fails if the measured memory latency is above the SLA value or if there is a test case execution problem.
15.2.8. Yardstick Test Case Description TC011
Packet delay variation between VMs
test case id OPNFV_YARDSTICK_TC011_PACKET DELAY VARIATION BETWEEN VMs
metric jitter: packet delay variation (ms)
test purpose

The purpose of TC011 is to evaluate the IaaS network performance with regards to network jitter (packet delay variation). It measures the packet delay variation sending the packets from one VM to the other.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

iperf3

iPerf3 is a tool for active measurements of the maximum achievable bandwidth on IP networks. It supports tuning of various parameters related to timing, buffers and protocols. The UDP protocol can be used to measure jitter.

(iperf3 is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with iperf3 included.)

test description

iperf3 test is invoked between a host VM and a target VM.

Jitter calculations are continuously computed by the server, as specified by RTP in RFC 1889. The client records a 64 bit second/microsecond timestamp in the packet. The server computes the relative transit time as (server’s receive time - client’s send time). The client’s and server’s clocks do not need to be synchronized; any difference is subtracted out in the jitter calculation. Jitter is the smoothed mean of differences between consecutive transit times.

configuration

File: opnfv_yardstick_tc011.yaml

  • options: protocol: udp (the protocol used by iperf3 tools); bandwidth: 20m (it will send the given number of packets without pausing).
  • runner: duration: 30 (total test duration 30 seconds).
  • SLA (optional): jitter: 10 ms (the maximum amount of jitter that is accepted).
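
A minimal sketch of how these options could appear in an Iperf3 scenario is shown below; the VM names are illustrative and the option keys follow the fields quoted above, so they may differ slightly from the actual opnfv_yardstick_tc011.yaml:

scenarios:
-
  type: Iperf3
  options:
    protocol: udp            # UDP is needed for jitter measurement
    bandwidth: 20m           # target send rate
  host: zeus.demo            # iperf3 client VM, illustrative name
  target: hera.demo          # iperf3 server VM, illustrative name
  runner:
    type: Duration
    duration: 30             # total test duration in seconds
  sla:
    jitter: 10               # maximum accepted jitter in milliseconds
    action: monitor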

applicability

Test can be configured with different:

  • bandwidth: the test case can be configured with different bandwidths;
  • duration: the test duration can be configured;
  • jitter: SLA is optional; the SLA in this test case serves as an example.
usability This test case is one of Yardstick’s generic test cases. Thus, it is runnable on most of the scenarios.
references

iperf3

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with iperf3 included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs with iperf3 installed are booted, as server and client.
step 2 Yardstick is connected with the host VM by using ssh. An iperf3 server is started on the server VM via the ssh tunnel.
step 3

iperf3 benchmark is invoked. Jitter is calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VMs are deleted.
test verdict Test should not PASS if any jitter is above the optional SLA value, or if there is a test case execution problem.
15.2.9. Yardstick Test Case Description TC012
Memory Bandwidth
test case id OPNFV_YARDSTICK_TC012_MEMORY BANDWIDTH
metric Memory read/write bandwidth (MBps)
test purpose

The purpose of TC012 is to evaluate the IaaS compute performance with regards to memory throughput. It measures the rate at which data can be read from and written to the memory (this includes all levels of memory).

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

LMbench

LMbench is a suite of operating system microbenchmarks. This test uses the bw_mem tool from that suite. The suite includes benchmarks for:

  • Cached file read
  • Memory copy (bcopy)
  • Memory read
  • Memory write
  • Pipe
  • TCP

(LMbench is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with LMbench included.)

test description LMbench bw_mem benchmark allocates twice the specified amount of memory, zeros it, and then times the copying of the first half to the second half. The benchmark is invoked in a host VM on a compute blade. Results are reported in megabytes moved per second.
configuration

File: opnfv_yardstick_tc012.yaml

  • SLA (optional): min_bw: 15000 (MBps) - the minimum amount of memory bandwidth that is accepted.
  • Size: 10 240 kB - test allocates twice that size (20 480 kB), zeros it, and then measures the time it takes to copy from one side to another.
  • Benchmark: rdwr - measures the time to read data into memory and then write data to the same location.
  • Warmup: 0 - the number of iterations to perform before taking actual measurements.
  • Iterations: 10 - test is run 10 times iteratively.
  • Interval: 1 - there is 1 second delay between each iteration.

SLA is optional. The SLA in this test case serves as an example. Considerably higher bandwidth is expected. However, to cover most configurations, both baremetal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many heavy IO applications start to suffer badly if the read/write bandwidths are lower than this.
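
As an assumption-laden sketch only, the values above could map onto an Lmbench bandwidth scenario roughly as follows (the VM name and key names are illustrative):

scenarios:
-
  type: Lmbench
  options:
    test_type: "bandwidth"   # run bw_mem rather than lat_mem_rd
    size: 10240              # amount of memory in kB; twice this is allocated
    benchmark: rdwr          # read data into memory, then write to the same location
    warmup: 0                # iterations before measurements are taken
  host: demeter.demo         # illustrative VM name
  runner:
    type: Iteration
    iterations: 10           # test is run 10 times
    interval: 1              # 1 second delay between iterations
  sla:
    min_bandwidth: 15000     # minimum accepted bandwidth in MBps
    action: monitor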

applicability

Test can be configured with different:

  • memory sizes;
  • memory operations (such as rd, wr, rdwr, cp, frd, fwr, fcp, bzero, bcopy);
  • number of warmup iterations;
  • iterations and intervals.

Default values exist.

SLA (optional) : min_bandwidth: The minimum memory bandwidth that is accepted.

usability This test case is one of Yardstick’s generic test cases. Thus, it is runnable on most of the scenarios.
references

LMbench bw_mem

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with Lmbench included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with LMbench installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. “lmbench_bandwidth_benchmark” bash script is copied from Jump Host to the host VM via ssh tunnel.
step 3

‘lmbench_bandwidth_benchmark’ script is invoked. LMbench’s bw_mem benchmark starts to measure memory read/write bandwidth. Memory read/write bandwidth results are recorded and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Test fails if the measured memory bandwidth is below the SLA value or if there is a test case execution problem.
15.2.10. Yardstick Test Case Description TC014
Processing speed
test case id OPNFV_YARDSTICK_TC014_PROCESSING SPEED
metric score of single cpu running, score of parallel running
test purpose

The purpose of TC014 is to evaluate the IaaS compute performance with regards to CPU processing speed. It measures the scores of single CPU running and of parallel running.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

UnixBench

UnixBench is a widely used CPU benchmarking software tool. It can measure the performance of bash scripts and of CPUs in multi-threaded and single-threaded operation. It can also measure the performance of parallel tasks. In addition, specific disk IO tests for small and large files are performed. It can be used to measure Linux dedicated servers and Linux VPS servers, running CentOS, Debian, Ubuntu, Fedora and other distros.

(UnixBench is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with UnixBench included.)

test description

The UnixBench runs system benchmarks in a host VM on a compute blade, getting information on the CPUs in the system. If the system has more than one CPU, the tests will be run twice – once with a single copy of each test running at once, and once with N copies, where N is the number of CPUs.

UnixBench will process a set of results from a single test by averaging the individual pass results into a single final value.

configuration

file: opnfv_yardstick_tc014.yaml

run_mode: Run unixbench in quiet mode or verbose mode.
test_type: dhry2reg, whetstone and so on.

For the SLA, both single_score and parallel_score can be set by the user; the default is NA.
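
A minimal, hedged sketch of a corresponding UnixBench scenario is given below; the VM name and the SLA values are purely illustrative (the defaults are NA, as noted above):

scenarios:
-
  type: UnixBench
  options:
    run_mode: 'verbose'      # or 'quiet'
    test_type: 'dhry2reg'    # e.g. dhry2reg, whetstone
  host: ares.demo            # illustrative VM name
  runner:
    type: Iteration
    iterations: 1
  sla:
    single_score: "100"      # illustrative value; default is NA
    parallel_score: "500"    # illustrative value; default is NA
    action: monitor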

applicability

Test can be configured with different:

  • test types;
  • dhry2reg;
  • whetstone.

Default values exist.

SLA (optional) : min_score: The minimum UnixBench score that is accepted.

usability This test case is one of Yardstick’s generic test cases. Thus, it is runnable on most of the scenarios.
references

unixbench

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with unixbench included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 A host VM with UnixBench installed is booted.
step 2 Yardstick is connected with the host VM by using ssh. “unixbench_benchmark” bash script is copied from Jump Host to the host VM via ssh tunnel.
step 3

UnixBench is invoked. All the tests are executed using the “Run” script in the top-level of UnixBench directory. The “Run” script will run a standard “index” test, and save the report in the “results” directory. Then the report is processed by “unixbench_benchmark” and checked against the SLA.

Result: Logs are stored.

step 4 The host VM is deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.11. Yardstick Test Case Description TC024
CPU Load
test case id OPNFV_YARDSTICK_TC024_CPU Load
metric CPU load
test purpose To evaluate the CPU load performance of the IaaS. This test case should be run in parallel to other Yardstick test cases and not run as a stand-alone test case. Average, minimum and maximum values are obtained. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: cpuload.yaml (in the ‘samples’ directory)

  • interval: 1 - repeat, pausing 1 second in between.
  • count: 10 - display statistics 10 times, then exit.
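
As a loose illustration of these two options, the scenario portion of such a CPU load task could look roughly as follows (the scenario type and host name are assumptions; no SLA is defined, since results are only collected):

scenarios:
-
  type: CPULoad
  options:
    interval: 1              # sampling interval in seconds
    count: 10                # number of samples before exiting
  host: pluto.demo           # illustrative VM name
  runner:
    type: Iteration
    iterations: 1
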
test tool

mpstat

(mpstat is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. However, if mpstat is not present the TC instead uses /proc/stat as source to produce “mpstat” output.)

references man-pages
applicability

Test can be configured with different:

  • interval;
  • count;
  • runner Iteration and intervals.

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance with mpstat included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed. The related TC, or TCs, is invoked and mpstat logs are produced and stored.

Result: Stored logs

test verdict None. CPU load results are fetched and stored.
15.2.12. Yardstick Test Case Description TC037
Latency, CPU Load, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC037_LATENCY,CPU LOAD,THROUGHPUT, PACKET LOSS
metric Number of flows, latency, throughput, packet loss CPU utilization percentage, CPU interrupt per second
test purpose

The purpose of TC037 is to evaluate the IaaS compute capacity and network performance with regards to CPU utilization, packet flows and network throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades, and the CPU load variation.

Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

Ping, Pktgen, mpstat

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer that are echoed back to the source.

The Linux packet generator (pktgen) is a kernel-space tool for generating packets at very high speed; it is mainly used to drive and test network drivers and LAN equipment. pktgen supports multi-threading and can generate UDP packets with random MAC addresses, IP addresses and port numbers, using multiple CPU processors and NICs on different PCI/PCIe buses. Its performance depends on hardware parameters such as CPU processing speed, memory delay and PCI bus speed; transmit data rates larger than 10 Gbit/s can be reached, which satisfies most NIC test requirements.

The mpstat command writes to standard output activities for each available processor, processor 0 being the first one. Global average activities among all processors are also reported. The mpstat command can be used both on SMP and UP machines, but in the latter, only global average activities will be printed.

(Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Docker image. For example, a Cirros image, which includes ping, can be downloaded from cirros-image.

Pktgen and mpstat are not always part of a Linux distribution, hence they need to be installed. They are part of the Yardstick Docker image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen and mpstat included.)

test description This test case uses Pktgen to generate packet flow between two hosts for simulating network workloads on the SUT. Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from a host VM to the target VM(s) to elicit ICMP ECHO_RESPONSE, meanwhile CPU activities are monitored by mpstat.
configuration

file: opnfv_yardstick_tc037.yaml

Packet size is set to 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured numbers of ports map to between 2 and 1001000 flows, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test the CPU load on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different hardware. mpstat monitoring interval is set to 1 second. Ping packet size is set to 100 bytes. For SLA max_ppm is set to 1000.
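
TC037 combines background monitoring with foreground traffic generation. A heavily simplified sketch of how two such scenarios might be combined in one task file is shown below; the run_in_background flag, VM names and most keys are assumptions, and the real opnfv_yardstick_tc037.yaml contains additional scenarios and options:

scenarios:
-
  type: CPULoad
  run_in_background: true    # monitor CPU load while traffic is generated
  options:
    interval: 1              # mpstat monitoring interval in seconds
  host: demeter.demo         # illustrative VM name
-
  type: Pktgen
  options:
    packetsize: 64           # pktgen packet size
    number_of_ports: 10      # one of the port amounts listed above
    duration: 20             # seconds per run
  host: demeter.demo
  target: poseidon.demo      # illustrative VM name
  runner:
    type: Iteration
    iterations: 2            # each port amount is run two times
  sla:
    max_ppm: 1000            # accepted packet loss in parts per million
    action: monitor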

applicability

Test can be configured with different:

  • pktgen packet sizes;
  • amount of flows;
  • test duration;
  • ping packet size;
  • mpstat monitor interval.

Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, i.e. not received.

references

Ping

mpstat

pktgen

ETSI-NFV-TST001

pre-test conditions

The test case image needs to be installed into Glance with pktgen, mpstat included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected with the server VM by using ssh. ‘pktgen_benchmark’ and “ping_benchmark” bash scripts are copied from Jump Host to the server VM via the ssh tunnel.
step 3 An iptables rule is set up on the server to monitor for received packets.
step 4

pktgen is invoked to generate packet flow between the server and the client for simulating network workloads on the SUT. Ping is invoked. Ping packets are sent from server VM to client VM. mpstat is invoked, recording activities for each available processor. Results are processed and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

step 5 Two host VMs are deleted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.13. Yardstick Test Case Description TC038
Latency, CPU Load, Throughput, Packet Loss (Extended measurements)
test case id OPNFV_YARDSTICK_TC038_Latency,CPU Load,Throughput,Packet Loss
metric Number of flows, latency, throughput, CPU load, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc038.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured numbers of ports map to between 2 and 1001000 flows, respectively. Each port amount is run ten times, for 20 seconds each. Then the next port_amount is run, and so on. During the test the CPU load on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For SLA max_ppm is set to 1000.

test tool

pktgen

(Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image, which includes ping, can be downloaded.)

mpstat

(Mpstat is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image.)

references

Ping and Mpstat man pages

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, i.e. not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.14. Yardstick Test Case Description TC042
Network Performance
test case id OPNFV_YARDSTICK_TC042_DPDK pktgen latency measurements
metric L2 Network Latency
test purpose Measure L2 network latency when DPDK is enabled between hosts on different compute blades.
configuration

file: opnfv_yardstick_tc042.yaml

  • Packet size: 64 bytes
  • SLA (max_latency): 100 usec
test tool

DPDK Pktgen-dpdk

(DPDK and Pktgen-dpdk are not part of a Linux distribution, hence they need to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with DPDK and pktgen-dpdk included.)

references

DPDK

Pktgen-dpdk

ETSI-NFV-TST001

applicability Test can be configured with different packet sizes. Default values exist.
pre-test conditions

The test case image needs to be installed into Glance with DPDK and pktgen-dpdk included in it.

The NICs of the compute nodes in the POD must support DPDK.

Hugepages must be set up on at least the compute nodes.

To achieve a high performance result, it is recommended to use NUMA, CPU pinning, OVS and so on.

test sequence description and expected result
step 1 The hosts are installed on different blades, as server and client. Both server and client have three interfaces. The first one is for management, e.g. ssh. The other two are used by DPDK.
step 2 Testpmd is invoked with configurations to forward packets from one DPDK port to the other on the server.
step 3

Pktgen-dpdk is invoked with configurations as a traffic generator and logs are produced and stored on client.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.15. Yardstick Test Case Description TC043
Network Latency Between NFVI Nodes
test case id OPNFV_YARDSTICK_TC043_LATENCY_BETWEEN_NFVI_NODES
metric RTT (Round Trip Time)
test purpose

The purpose of TC043 is to do a basic verification that network latency is within acceptable boundaries when packets travel between different NFVI nodes.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

ping

Ping is a computer network administration software utility used to test the reachability of a host on an Internet Protocol (IP) network. It measures the round-trip time for packets sent from the originating host to a destination computer that are echoed back to the source.

test topology Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host node to target node to elicit ICMP ECHO_RESPONSE.
configuration

file: opnfv_yardstick_tc043.yaml

Packet size 100 bytes. Total test duration 600 seconds. One ping each 10 seconds. SLA RTT is set to maximum 10 ms.

applicability

This test case can be configured with different:

  • packet sizes;
  • burst sizes;
  • ping intervals;
  • test durations;
  • test iterations.

Default values exist.

SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected, and also normal to achieve in balanced L2 environments. However, to cover most configurations, both bare metal and fully virtualized ones, this value should be possible to achieve and acceptable for black box testing. Many real time applications start to suffer badly if the RTT time is higher than this. Some may also suffer close to this RTT, while others may not suffer at all. It is a compromise that may have to be tuned for different configuration purposes.

references

Ping

ETSI-NFV-TST001

pre_test conditions Each pod node must have ping included in it.
test sequence description and expected result
step 1 Yardstick is connected with the NFVI node by using ssh. ‘ping_benchmark’ bash script is copied from Jump Host to the NFVI node via the ssh tunnel.
step 2

Ping is invoked. Ping packets are sent from server node to client node. RTT results are calculated and checked against the SLA. Logs are produced and stored.

Result: Logs are stored.

test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
15.2.16. Yardstick Test Case Description TC044
Memory Utilization
test case id OPNFV_YARDSTICK_TC044_Memory Utilization
metric Memory utilization
test purpose To evaluate the IaaS compute capability with regards to memory utilization. This test case should be run in parallel to other Yardstick test cases and not run as a stand-alone test case. Measure the memory usage statistics including used memory, free memory, buffer, cache and shared memory. Both average and maximum values are obtained. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

File: memload.yaml (in the ‘samples’ directory)

  • interval: 1 - repeat, pausing 1 second in between.
  • count: 10 - display statistics 10 times, then exit.
test tool

free

free provides information about unused and used memory and swap space on any computer running Linux or another Unix-like operating system. free is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

man-pages

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • interval;
  • count;
  • runner Iteration and intervals.

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance with free included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed as client. The related TC, or TCs, is invoked and free logs are produced and stored.

Result: logs are stored.

test verdict None. Memory utilization results are fetched and stored.
15.2.17. Yardstick Test Case Description TC055
Compute Capacity
test case id OPNFV_YARDSTICK_TC055_Compute Capacity
metric Number of cpus, number of cores, number of threads, available memory size and total cache size.
test purpose To evaluate the IaaS compute capacity with regards to hardware specification, including number of cpus, number of cores, number of threads, available memory size and total cache size. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc055.yaml

There are no additional configurations to be set for this TC.

test tool

/proc/cpuinfo

This TC uses /proc/cpuinfo as source to produce compute capacity output.

references

/proc/cpuinfo

ETSI-NFV-TST001

applicability None.
pre-test conditions No POD specific requirements have been identified.
test sequence description and expected result
step 1

The hosts are installed, TC is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict None. Hardware specifications are fetched and stored.
15.2.18. Yardstick Test Case Description TC061
Network Utilization
test case id OPNFV_YARDSTICK_TC061_Network Utilization
metric Network utilization
test purpose To evaluate the IaaS network capability with regards to network utilization, including Total number of packets received per second, Total number of packets transmitted per second, Total number of kilobytes received per second, Total number of kilobytes transmitted per second, Number of compressed packets received per second (for cslip etc.), Number of compressed packets transmitted per second, Number of multicast packets received per second, Utilization percentage of the network interface. This test case should be run in parallel to other Yardstick test cases and not run as a stand-alone test case. Measure the network usage statistics from the network devices. Average, minimum and maximum values are obtained. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

File: netutilization.yaml (in the ‘samples’ directory)

  • interval: 1 - repeat, pausing 1 second in between.
  • count: 1 - display statistics 1 time, then exit.
test tool

sar

The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. sar is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

man-pages

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • interval;
  • count;
  • runner Iteration and intervals.

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance with sar included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed as client. The related TC, or TCs, is invoked and sar logs are produced and stored.

Result: logs are stored.

test verdict None. Network utilization results are fetched and stored.
15.2.19. Yardstick Test Case Description TC063
Storage Capacity
test case id OPNFV_YARDSTICK_TC063_Storage Capacity
metric Storage/disk size, block size Disk Utilization
test purpose This test case checks the parameters that select between several measurement models; each model has its own specified task to measure. The test purposes are to measure disk size, block size and disk utilization. With the test results, we can evaluate the storage capacity of the host.
configuration

file: opnfv_yardstick_tc063.yaml

  • test_type: “disk_size”
  • runner: type: Iteration, iterations: 1 - test is run 1 time iteratively.

test tool

fdisk A command-line utility that provides disk partitioning functions

iostat This is a computer system monitor tool used to collect and show operating system storage input and output statistics.

references

iostat fdisk

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • test_type: “disk size”, “block size”, “disk utilization”
  • interval: 1 - how often to stat disk utilization (type: int, unit: seconds)
  • count: 15 - how many times to stat disk utilization (type: int, unit: na)

There are default values for each above-mentioned option. Run in background with other test cases.

pre-test conditions

The test case image needs to be installed into Glance.

No POD specific requirements have been identified.

test sequence The specified storage capacity and disk information are output, in sequence, into a file.
step 1

The pod is available and the hosts are installed. Node5 is used and logs are produced and stored.

Result: Logs are stored.

test verdict None.
15.2.20. Yardstick Test Case Description TC069
Memory Bandwidth
test case id OPNFV_YARDSTICK_TC069_Memory Bandwidth
metric Megabyte per second (MBps)
test purpose To evaluate the IaaS compute performance with regards to memory bandwidth. Measure the maximum possible cache and memory performance while reading and writing certain blocks of data (starting from 1Kb and further in powers of 2) continuously through ALU and FPU respectively. Measure different aspects of memory performance via synthetic simulations. Each simulation consists of four performances (Copy, Scale, Add, Triad). Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

File: opnfv_yardstick_tc069.yaml

  • SLA (optional): min_bandwidth: 7000 (MBps) - the minimum amount of memory bandwidth that is accepted.
  • type_id: 1 - runs a specified benchmark (by an ID number): 1 – INTmark [writing]; 2 – INTmark [reading]; 3 – INTmem; 4 – FLOATmark [writing]; 5 – FLOATmark [reading]; 6 – FLOATmem.
  • block_size: 64 Megabytes - the maximum block size per array.
  • load: 32 Gigabytes - the amount of data load per pass.
  • iterations: 5 - test is run 5 times iteratively.
  • interval: 1 - there is 1 second delay between each iteration.
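
As an assumption-laden sketch of these values, a RAMspeed scenario might look roughly as follows (the scenario type name, host name and key names are illustrative):

scenarios:
-
  type: Ramspeed
  options:
    type_id: 1               # INTmark [writing]; see the ID list above
    load: 32                 # data load per pass, in Gigabytes
    block_size: 64           # maximum block size per array, in Megabytes
  host: kratos.demo          # illustrative VM name
  runner:
    type: Iteration
    iterations: 5            # test is run 5 times
    interval: 1              # 1 second delay between iterations
  sla:
    min_bandwidth: 7000      # minimum accepted bandwidth in MBps
    action: monitor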

test tool

RAMspeed

RAMspeed is a free open source command line utility to measure cache and memory performance of computer systems. RAMspeed is not always part of a Linux distribution, hence it needs to be installed in the test image.

references

RAMspeed

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • benchmark operations (such as INTmark [writing], INTmark [reading], FLOATmark [writing], FLOATmark [reading], INTmem, FLOATmem);
  • block size per array;
  • load per pass;
  • number of batch run iterations;
  • iterations and intervals.

There are default values for each above-mentioned option.

pre-test conditions

The test case image needs to be installed into Glance with RAMspeed included in the image.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host is installed as client. RAMspeed is invoked and logs are produced and stored.

Result: logs are stored.

test verdict Test fails if the measured memory bandwidth is below the SLA value or if there is a test case execution problem.
15.2.21. Yardstick Test Case Description TC070
Latency, Memory Utilization, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC070_Latency, Memory Utilization, Throughput,Packet Loss
metric Number of flows, latency, throughput, Memory Utilization, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc070.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured numbers of ports map to between 2 and 1001000 flows, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test Memory Utilization on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For SLA max_ppm is set to 1000.

test tool

pktgen

Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. (As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image, which includes ping, can be downloaded.)

free

free provides information about unused and used memory and swap space on any computer running Linux or another Unix-like operating system. free is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

Ping and free man pages

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.22. Yardstick Test Case Description TC071
Latency, Cache Utilization, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC071_Latency, Cache Utilization, Throughput,Packet Loss
metric Number of flows, latency, throughput, Cache Utilization, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc071.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured numbers of ports map to between 2 and 1001000 flows, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test Cache Utilization on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For SLA max_ppm is set to 1000.

test tool

pktgen

Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. (As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image, which includes ping, can be downloaded.)

cachestat

cachestat is not always part of a Linux distribution, hence it needs to be installed.

references

Ping man pages

pktgen

cachestat

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.23. Yardstick Test Case Description TC072
Latency, Network Utilization, Throughput, Packet Loss
test case id OPNFV_YARDSTICK_TC072_Latency, Network Utilization, Throughput,Packet Loss
metric Number of flows, latency, throughput, Network Utilization, packet loss
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of flows matter for the throughput between hosts on different compute blades. Typically e.g. the performance of a vSwitch depends on the number of flows running through it. Also performance of other equipment or entities can depend on the number of flows or the packet sizes used. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc072.yaml

Packet size: 64 bytes. Number of ports: 1, 10, 50, 100, 300, 500, 750 and 1000. The configured numbers of ports map to between 2 and 1001000 flows, respectively. Each port amount is run two times, for 20 seconds each. Then the next port_amount is run, and so on. During the test Network Utilization on both client and server, and the network latency between the client and server, are measured. The client and server are distributed on different HW. For SLA max_ppm is set to 1000.

test tool

pktgen

Pktgen is not always part of a Linux distribution, hence it needs to be installed. It is part of the Yardstick Glance image. (As an example see the /yardstick/tools/ directory for how to generate a Linux image with pktgen included.)

ping

Ping is normally part of any Linux distribution, hence it doesn’t need to be installed. It is also part of the Yardstick Glance image. (For example, a Cirros image, which includes ping, can be downloaded.)

sar

The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. sar is normally part of a Linux distribution, hence it doesn’t need to be installed.

references

Ping and sar man pages

pktgen

ETSI-NFV-TST001

applicability

Test can be configured with different packet sizes, amount of flows and test duration. Default values exist.

SLA (optional): max_ppm: The number of packets per million packets sent that are acceptable to lose, not received.

pre-test conditions

The test case image needs to be installed into Glance with pktgen included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The hosts are installed, as server and client. pktgen is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.24. Yardstick Test Case Description TC073
Throughput per NFVI node test
test case id OPNFV_YARDSTICK_TC073_Network latency and throughput between nodes
metric Network latency and throughput
test purpose To evaluate the IaaS network performance with regards to flows and throughput, such as if and how different amounts of packet sizes and flows matter for the throughput between nodes in one pod.
configuration

file: opnfv_yardstick_tc073.yaml

Packet size: default 1024 bytes.

Test length: default 20 seconds.

The client and server are distributed on different nodes.

For SLA max_mean_latency is set to 100.
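
Purely illustratively, and with the caveat that the scenario type and all key names here are assumptions, the configuration above could correspond to a netperf scenario along these lines:

scenarios:
-
  type: NetperfNode
  options:
    testname: 'UDP_STREAM'   # illustrative netperf test name
    send_msg_size: 1024      # packet size in bytes
    duration: 20             # test length in seconds
  host: node1.LF             # illustrative node name
  target: node2.LF           # illustrative node name
  runner:
    type: Iteration
    iterations: 1
  sla:
    mean_latency: 100        # maximum accepted mean latency
    action: monitor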

test tool

netperf

Netperf is a software application that provides network bandwidth testing between two hosts on a network. It supports Unix domain sockets, TCP, SCTP, DLPI and UDP via BSD Sockets. Netperf provides a number of predefined tests e.g. to measure bulk (unidirectional) data transfer or request/response performance. (netperf is not always part of a Linux distribution, hence it needs to be installed.)

references

netperf Man pages

ETSI-NFV-TST001
applicability

Test can be configured with different packet sizes and test duration. Default values exist.

SLA (optional): max_mean_latency

pre-test conditions The POD can be reached by an external IP and logged on to via ssh.
test sequence description and expected result
step 1 Install the netperf tool on each specified node; one acts as the server, and the other as the client.
step 2 Log on to the client node and use the netperf command to execute the network performance test.
step 3 The throughput results are stored.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.2.25. Yardstick Test Case Description TC075
Network Capacity and Scale Testing
test case id OPNFV_YARDSTICK_TC075_Network_Capacity_and_Scale_testing
metric Number of connections, Number of frames sent/received
test purpose To evaluate the network capacity and scale with regards to connections and frames.
configuration

file: opnfv_yardstick_tc075.yaml

There is no additional configuration to be set for this TC.

test tool

netstat

Netstat is normally part of any Linux distribution, hence it doesn’t need to be installed.

references

Netstat man page

ETSI-NFV-TST001

applicability This test case is mainly for evaluating network performance.
pre_test conditions Each pod node must have netstat included in it.
test sequence description and expected result
step 1

The pod is available. Netstat is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict None. Number of connections and frames are fetched and stored.
15.2.26. Yardstick Test Case Description TC076
Monitor Network Metrics
test case id OPNFV_YARDSTICK_TC076_Monitor_Network_Metrics
metric IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate
test purpose

The purpose of TC076 is to evaluate the IaaS network reliability with regards to IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate.

TC076 monitors network metrics provided by the Linux kernel in a host and calculates IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate.

The purpose is also to be able to spot the trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.

test tool

nstat

nstat is a simple tool to monitor kernel snmp counters and network interface statistics.

(nstat is not always part of a Linux distribution, hence it needs to be installed. nstat is provided by the iproute2 collection, which is usually also the name of the package in many Linux distributions. As an example see the /yardstick/tools/ directory for how to generate a Linux image with iproute2 included.)

test description

Ping packets (ICMP protocol’s mandatory ECHO_REQUEST datagram) are sent from host VM to target VM(s) to elicit ICMP ECHO_RESPONSE.

nstat is invoked on the target VM to monitor network metrics provided by the Linux kernel.

configuration

file: opnfv_yardstick_tc076.yaml

There is no additional configuration to be set for this TC.

references

nstat man page

ETSI-NFV-TST001

applicability This test case is mainly for monitoring network metrics.
pre-test conditions

The test case image needs to be installed into Glance with nstat included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1 Two host VMs are booted, as server and client.
step 2 Yardstick is connected to the server VM via SSH. The ‘ping_benchmark’ bash script is copied from the Jump Host to the server VM through the SSH tunnel.
step 3

Ping is invoked. Ping packets are sent from the server VM to the client VM. RTT results are calculated and checked against the SLA. nstat is invoked on the client VM to monitor network metrics provided by the Linux kernel. IP datagram error rate, ICMP message error rate, TCP segment error rate and UDP datagram error rate are calculated. Logs are produced and stored.

Result: Logs are stored.

step 4 Two host VMs are deleted.
test verdict None.
15.3. OPNFV Feature Test Cases
15.3.1. HA
15.3.1.1. Yardstick Test Case Description TC019
Control Node Openstack Service High Availability
test case id OPNFV_YARDSTICK_TC019_HA: Control node Openstack service down
test purpose This test case will verify the high availability of the services provided by OpenStack (like nova-api, neutron-server) on a control node.
test method This test case kills the processes of a specific Openstack service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “nova-api” -host: node1

monitors

In this test case, two kinds of monitor are needed:

1. the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. 3) host: which is the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “openstack server list” monitor2: -monitor_type: “process” -process_name: “nova-api” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc019.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.
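Putting the pieces above together, a condensed sketch of the test case file could look like this (the scenario type name, SLA key and exact nesting are assumptions; the shipped opnfv_yardstick_tc019.yaml is authoritative):

  scenarios:
  -
    type: ServiceHA              # assumed scenario type name
    options:
      attackers:
      - fault_type: "kill-process"
        process_name: "nova-api"
        host: node1
      monitors:
      - monitor_type: "openstack-cmd"
        command_name: "openstack server list"
      - monitor_type: "process"
        process_name: "nova-api"
        host: node1
      waiting_time: 10           # assumed key name, matching the description above
    nodes:
      node1: node1.LF            # resolved against pod.yaml (hypothetical node label)
    runner:
      type: Iteration
      iterations: 1
    sla:
      outage_time: 5             # assumed SLA key (maximum service_outage_time in seconds)
      action: monitor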

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the kill process script with param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so the next test cases can run.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.2. Yardstick Test Case Description TC025
OpenStack Controller Node abnormally shutdown High Availability
test case id OPNFV_YARDSTICK_TC025_HA: OpenStack Controller Node abnormally shutdown
test purpose This test case will verify the high availability of the controller node. When one of the controller nodes is abnormally shut down, the services provided by it should still be OK.
test method This test case shuts down a specified controller node with some fault injection tools, then checks whether all services provided by the controller node are OK with some monitor tools.
attackers

In this test case, an attacker called “host-shutdown” is needed. This attacker includes two parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should be always set to “host-shutdown” in this test case. 2) host: the name of a controller node being attacked.

e.g. -fault_type: “host-shutdown” -host: node1

monitors

In this test case, one kind of monitor is needed: the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

There are four instances of the “openstack-cmd” monitor: monitor1: -monitor_type: “openstack-cmd” -api_name: “nova image-list” monitor2: -monitor_type: “openstack-cmd” -api_name: “neutron router-list” monitor3: -monitor_type: “openstack-cmd” -api_name: “heat stack-list” monitor4: -monitor_type: “openstack-cmd” -api_name: “cinder list”

metrics In this test case, there is one metric: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc025.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the host being shut down to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute shutdown script on the host

Result: The host will be shutdown.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: All monitor result will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It restarts the specified controller node if it is not restarted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.3. Yardstick Test Case Description TC045
Control Node Openstack Service High Availability - Neutron Server
test case id OPNFV_YARDSTICK_TC045: Control node Openstack service down - neutron server
test purpose This test case will verify the high availability of the network service provided by OpenStack (neutron-server) on a control node.
test method This test case kills the processes of neutron-server service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “neutron-server”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “neutron-server” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request. In this case, the command name should be a neutron related command.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. 3) host: which is the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “neutron agent-list” monitor2: -monitor_type: “process” -process_name: “neutron-server” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc045.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the kill process script with param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so the next test cases can run.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.4. Yardstick Test Case Description TC046
Control Node Openstack Service High Availability - Keystone
test case id OPNFV_YARDSTICK_TC046: Control node Openstack service down - keystone
test purpose This test case will verify the high availability of the user service provided by OpenStack (keystone) on control node.
test method This test case kills the processes of keystone service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “keystone”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “keystone” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request. In this case, the command name should be a keystone related command.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. 3) host: which is the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “keystone user-list” monitor2: -monitor_type: “process” -process_name: “keystone” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc046.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the kill process script with param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so the next test cases can run.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.5. Yardstick Test Case Description TC047
Control Node Openstack Service High Availability - Glance Api
test case id OPNFV_YARDSTICK_TC047: Control node Openstack service down - glance api
test purpose This test case will verify the high availability of the image service provided by OpenStack (glance-api) on control node.
test method This test case kills the processes of glance-api service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “glance-api”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “glance-api” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request. In this case, the command name should be a glance related command.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. 3) host: which is the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “glance image-list” monitor2: -monitor_type: “process” -process_name: “glance-api” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc047.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the kill process script with param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so the next test cases can run.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.6. Yardstick Test Case Description TC048
Control Node Openstack Service High Availability - Cinder Api
test case id OPNFV_YARDSTICK_TC048: Control node Openstack service down - cinder api
test purpose This test case will verify the high availability of the volume service provided by OpenStack (cinder-api) on control node.
test method This test case kills the processes of cinder-api service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “cinder-api”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “cinder-api” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request. In this case, the command name should be a cinder related command.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. 3) host: which is the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “cinder list” monitor2: -monitor_type: “process” -process_name: “cinder-api” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc048.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the kill process script with param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so the next test cases can run.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.7. Yardstick Test Case Description TC049
Control Node Openstack Service High Availability - Swift Proxy
test case id OPNFV_YARDSTICK_TC049: Control node Openstack service down - swift proxy
test purpose This test case will verify the high availability of the storage service provided by OpenStack (swift-proxy) on control node.
test method This test case kills the processes of swift-proxy service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “swift-proxy”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “swift-proxy” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request. In this case, the command name should be a swift related command.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. 3) host: which is the name of the node running the process.

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “swift stat” monitor2: -monitor_type: “process” -process_name: “swift-proxy” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc049.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the kill process script with param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so the next test cases can run.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.8. Yardstick Test Case Description TC050
OpenStack Controller Node Network High Availability
test case id OPNFV_YARDSTICK_TC050: OpenStack Controller Node Network High Availability
test purpose This test case will verify the high availability of the control node. When one of the controllers loses its network connections, the OpenStack services on this node break down. These OpenStack services should still be accessible via other controller nodes, and the services on the failed controller node should be isolated.
test method This test case turns off the network interfaces of a specified control node, then checks whether all services provided by the control node are OK with some monitor tools.
attackers

In this test case, an attacker called “close-interface” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should be always set to “close-interface” in this test case. 2) host: which is the name of a control node being attacked. 3) interface: the network interface to be turned off.

There are four instances of the “close-interface” attacker: attacker1 (for the public network): -fault_type: “close-interface” -host: node1 -interface: “br-ex” attacker2 (for the management network): -fault_type: “close-interface” -host: node1 -interface: “br-mgmt” attacker3 (for the storage network): -fault_type: “close-interface” -host: node1 -interface: “br-storage” attacker4 (for the private network): -fault_type: “close-interface” -host: node1 -interface: “br-mesh”
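Written out in YAML form, the four attacker instances above look roughly as follows (the nesting is illustrative; the shipped opnfv_yardstick_tc050.yaml is authoritative):

  attackers:
  - fault_type: "close-interface"
    host: node1
    interface: "br-ex"        # public network
  - fault_type: "close-interface"
    host: node1
    interface: "br-mgmt"      # management network
  - fault_type: "close-interface"
    host: node1
    interface: "br-storage"   # storage network
  - fault_type: "close-interface"
    host: node1
    interface: "br-mesh"      # private network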

monitors

In this test case, the monitor named “openstack-cmd” is needed. The monitor needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

There are four instances of the “openstack-cmd” monitor: monitor1: -monitor_type: “openstack-cmd” -command_name: “nova image-list” monitor2: -monitor_type: “openstack-cmd” -command_name: “neutron router-list” monitor3: -monitor_type: “openstack-cmd” -command_name: “heat stack-list” monitor4: -monitor_type: “openstack-cmd” -command_name: “cinder list”

metrics In this test case, there is one metric: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc050.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the attack being started to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the turnoff network interface script with param value specified by “interface”.

Result: Network interfaces will be turned down.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It turns the network interfaces of the control node back up if they are not already up.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.9. Yardstick Test Case Description TC051
OpenStack Controller Node CPU Overload High Availability
test case id OPNFV_YARDSTICK_TC051: OpenStack Controller Node CPU Overload High Availability
test purpose This test case will verify the high availability of the control node. When the CPU usage of a specified controller node is stressed to 100%, the OpenStack services on this node break down. These OpenStack services should still be accessible via other controller nodes, and the services on the failed controller node should be isolated.
test method This test case stresses the CPU usage of a specified control node to 100%, then checks whether all services provided by the environment are OK with some monitor tools.
attackers In this test case, an attacker called “stress-cpu” is needed. This attacker includes two parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should be always set to “stress-cpu” in this test case. 2) host: which is the name of a control node being attacked. e.g. -fault_type: “stress-cpu” -host: node1
monitors

In this test case, the monitor named “openstack-cmd” is needed. The monitor needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

There are four instances of the “openstack-cmd” monitor: monitor1: -monitor_type: “openstack-cmd” -command_name: “nova image-list” monitor2: -monitor_type: “openstack-cmd” -command_name: “neutron router-list” monitor3: -monitor_type: “openstack-cmd” -command_name: “heat stack-list” monitor4: -monitor_type: “openstack-cmd” -command_name: “cinder list”

metrics In this test case, there is one metric: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc051.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the attack being started to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the stress cpu script on the host.

Result: The CPU usage of the host will be stressed to 100%.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It kills the process that stresses the CPU usage.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.10. Yardstick Test Case Description TC052
OpenStack Controller Node Disk I/O Block High Availability
test case id OPNFV_YARDSTICK_TC052: OpenStack Controller Node Disk I/O Block High Availability
test purpose This test case will verify the high availability of the control node. When the disk I/O of a specified disk is blocked, the OpenStack services on this node break down. Read and write services should still be accessible via other controller nodes, and the services on the failed controller node should be isolated.
test method This test case blocks the disk I/O of a specified control node, then checks whether the services that need to read or write the disk of the control node are OK with some monitor tools.
attackers In this test case, an attacker called “disk-block” is needed. This attacker includes two parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should be always set to “disk-block” in this test case. 2) host: which is the name of a control node being attacked. e.g. -fault_type: “disk-block” -host: node1
monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly request a specific Openstack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should be always set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for request.

e.g. -monitor_type: “openstack-cmd” -command_name: “nova flavor-list”

2. the second monitor verifies the read and write function by an “operation” and a “result checker”. The “operation” has two parameters: 1) operation_type: which is used for finding the operation class and related scripts. 2) action_parameter: parameters for the operation. The “result checker” has three parameters: 1) checker_type: which is used for finding the result checker class and related scripts. 2) expectedValue: the expected value for the output of the checker script. 3) condition: whether the expected value is contained in the output of the checker script or is exactly the same as the output.

In this case, the “operation” adds a flavor and the “result checker” checks whether this flavor is created. Their parameters are shown as follows: operation: -operation_type: “nova-create-flavor” -action_parameter:

flavorconfig: “test-001 test-001 100 1 1”

result checker: -checker_type: “check-flavor” -expectedValue: “test-001” -condition: “in”
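In YAML form, the operation and result checker above look roughly like this (the exact nesting inside the shipped opnfv_yardstick_tc052.yaml may differ):

  operation:
    operation_type: "nova-create-flavor"
    action_parameter:
      flavorconfig: "test-001 test-001 100 1 1"
  result_checker:
    checker_type: "check-flavor"
    expectedValue: "test-001"
    condition: "in"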

metrics In this test case, there is one metric: 1)service_outage_time: which indicates the maximum outage time (seconds) of the specified Openstack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc052.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the attack being started to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

do attacker: connect the host through SSH, and then execute the block disk I/O script on the host.

Result: The disk I/O of the host will be blocked

step 2

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 3 do operation: add a flavor
step 4 do result checker: check whether the flavor is created
step 5

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 6

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It executes the release disk I/O script to release the blocked I/O.
test verdict Fails if the monitor SLA is not passed or the result checker is not passed, or if there is a test case execution problem.
15.3.1.11. Yardstick Test Case Description TC053
OpenStack Controller Load Balance Service High Availability
test case id OPNFV_YARDSTICK_TC053: OpenStack Controller Load Balance Service High Availability
test purpose This test case will verify the high availability of the load balance service (currently HAProxy) that supports OpenStack on the controller node. When the load balance service of a specified controller node is killed, it is checked whether the load balancers on other controller nodes still work, and whether the controller node restarts the killed load balancer.
test method This test case kills the processes of load balance service on a selected control node, then checks whether the request of the related Openstack command is OK and the killed processes are recovered.
attackers

In this test case, an attacker called “kill-process” is needed. This attacker includes three parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should always be set to “kill-process” in this test case. 2) process_name: which is the process name of the specified OpenStack service. If multiple processes use the same name on the host, all of them are killed by this attacker. In this case, this parameter should always be set to “haproxy”. 3) host: which is the name of the control node being attacked.

e.g. -fault_type: “kill-process” -process_name: “haproxy” -host: node1

monitors

In this test case, two kinds of monitor are needed: 1. the “openstack-cmd” monitor constantly requests a specific OpenStack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for the request.

2. the “process” monitor checks whether a process is running on a specific node, which needs three parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should always be set to “process” for this monitor. 2) process_name: which is the process name to monitor. 3) host: which is the name of the node running the process. In this case, the command_name of monitor1 should be a service that is supported by the load balancer and the process_name of monitor2 should be “haproxy”, for example:

e.g. monitor1: -monitor_type: “openstack-cmd” -command_name: “nova image-list” monitor2: -monitor_type: “process” -process_name: “haproxy” -host: node1

metrics In this test case, there are two metrics: 1) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request. 2) process_recover_time: which indicates the maximum time (seconds) from the process being killed until it is recovered.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc053.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the process being killed to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the kill process script with param value specified by “process_name”

Result: Process will be killed.

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It will check the status of the specified process on the host, and restart the process if it is not running, so the next test cases can run.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.1.12. Yardstick Test Case Description TC054
OpenStack Virtual IP High Availability
test case id OPNFV_YARDSTICK_TC054: OpenStack Virtual IP High Availability
test purpose This test case will verify the high availability of the virtual IP in the environment. When the master node of the virtual IP is abnormally shut down, the connection to the virtual IP and the services bound to it should still be OK.
test method This test case shuts down the virtual IP master node with some fault injection tools, then checks whether the virtual IPs can be pinged and the services bound to the virtual IP are OK with some monitor tools.
attackers

In this test case, an attacker called “control-shutdown” is needed. This attacker includes two parameters: 1) fault_type: which is used for finding the attacker’s scripts. It should be always set to “control-shutdown” in this test case. 2) host: which is the name of a control node being attacked.

In this case the host should be the virtual IP master node, which means the host IP is the virtual IP, for example: -fault_type: “control-shutdown” -host: node1 (the VIP Master node)

monitors

In this test case, two kinds of monitor are needed: 1. the “ip_status” monitor that pings a specific ip to check the connectivity of this ip, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should be always set to “ip_status” for this monitor. 2) ip_address: The ip to be pinged. In this case, ip_address should be the virtual IP.

2. the “openstack-cmd” monitor constantly request a specific Openstack command, which needs two parameters: 1) monitor_type: which is used for finding the monitor class and related scripts. It should be always set to “openstack-cmd” for this monitor. 2) command_name: which is the command name used for request.

e.g. monitor1: -monitor_type: “ip_status” -host: 192.168.0.2 monitor2: -monitor_type: “openstack-cmd” -command_name: “nova image-list”
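In YAML form, the two monitors above look roughly as follows (the nesting is illustrative; opnfv_yardstick_tc054.yaml is authoritative):

  monitors:
  - monitor_type: "ip_status"
    host: 192.168.0.2          # the virtual IP to be pinged
  - monitor_type: "openstack-cmd"
    command_name: "nova image-list"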

metrics In this test case, there are two metrics: 1) ping_outage_time: which indicates the maximum outage time to ping the specified host. 2) service_outage_time: which indicates the maximum outage time (seconds) of the specified OpenStack command request.
test tool Developed by the project. Please see folder: “yardstick/benchmark/scenarios/availability/ha_tools”
references ETSI NFV REL001
configuration

This test case needs two configuration files: 1) test case file: opnfv_yardstick_tc054.yaml -Attackers: see above “attackers” description -waiting_time: which is the time (seconds) from the attack being started to stopping the monitors -Monitors: see above “monitors” description -SLA: see above “metrics” description

2) POD file: pod.yaml The POD configuration should be recorded in pod.yaml first. The “host” item in this test case will use the node name in the pod.yaml.

test sequence description and expected result
step 1

start monitors: each monitor will run in an independent process

Result: The monitor info will be collected.

step 2

do attacker: connect the host through SSH, and then execute the shutdown script on the VIP master node.

Result: VIP master node will be shutdown

step 3

stop monitors after a period of time specified by “waiting_time”

Result: The monitor info will be aggregated.

step 4

verify the SLA

Result: The test case is passed or not.

post-action This is the action taken when the test case exits. It restarts the original VIP master node if it is not restarted.
test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.2. IPv6
15.3.2.1. Yardstick Test Case Description TC027
IPv6 connectivity between nodes on the tenant network
test case id OPNFV_YARDSTICK_TC027_IPv6 connectivity
metric RTT, Round Trip Time
test purpose To do a basic verification that IPv6 connectivity is within acceptable boundaries when ipv6 packets travel between hosts located on same or different compute blades. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration

file: opnfv_yardstick_tc027.yaml

Packet size 56 bytes. SLA RTT is set to maximum 30 ms. The IPv6 test case can be configured as three independent modules (setup, run, teardown). If you only want to set up the IPv6 testing environment and do some tests as you want, “run_step” of the task yaml file should be configured as “setup”. If you want to set up and run the ping6 testing automatically, “run_step” should be configured as “setup, run”. And if you already have an environment which has been set up and only want to verify the connectivity of the IPv6 network, “run_step” should be “run”. By default, the three modules run sequentially.
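For example, to only verify the connectivity of an already prepared IPv6 environment, the task file could be reduced to something like the following sketch (the key placement and scenario type name are assumptions; opnfv_yardstick_tc027.yaml is authoritative):

  run_in_parallel: false
  run_step: "run"              # or "setup", "setup,run", default "setup,run,teardown"
  scenarios:
  -
    type: Ping6                # assumed scenario type name
    options:
      packetsize: 56
    runner:
      type: Iteration
      iterations: 1
    sla:
      max_rtt: 30              # milliseconds
      action: monitor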

test tool

ping6

Ping6 is normally part of Linux distribution, hence it doesn’t need to be installed.

references

ipv6

ETSI-NFV-TST001

applicability The test case can be configured with different run steps; you can run setup, run benchmark and teardown independently. SLA is optional. The SLA in this test case serves as an example. Considerably lower RTT is expected.
pre-test conditions

The test case image needs to be installed into Glance with ping6 included in it.

For Brahmaputra, a compass_os_nosdn_ha deploy scenario is needed. More installers and more SDN deploy scenarios will be supported soon.

test sequence description and expected result
step 1 To setup IPV6 testing environment: 1. disable security group 2. create (ipv6, ipv4) router, network and subnet 3. create VRouter, VM1, VM2
step 2 To run ping6 to verify IPV6 connectivity : 1. ssh to VM1 2. Ping6 to ipv6 router from VM1 3. Get the result(RTT) and logs are stored
step 3 To teardown IPV6 testing environment 1. delete VRouter, VM1, VM2 2. delete (ipv6, ipv4) router, network and subnet 3. enable security group
test verdict Test should not PASS if any RTT is above the optional SLA value, or if there is a test case execution problem.
15.3.3. KVM
15.3.3.1. Yardstick Test Case Description TC028
KVM Latency measurements
test case id OPNFV_YARDSTICK_TC028_KVM Latency measurements
metric min, avg and max latency
test purpose To evaluate the IaaS KVM virtualization capability with regards to min, avg and max latency. The purpose is also to be able to spot trends. Test results, graphs and similar shall be stored for comparison reasons and product evolution understanding between different OPNFV versions and/or configurations.
configuration file: samples/cyclictest-node-context.yaml
test tool

Cyclictest

(Cyclictest is not always part of a Linux distribution, hence it needs to be installed. As an example see the /yardstick/tools/ directory for how to generate a Linux image with cyclictest included.)

references Cyclictest
applicability This test case is mainly for kvm4nfv project CI verification. Upgrade the host Linux kernel, boot a guest VM and update its Linux kernel, and then run cyclictest to verify that the new kernel works well.
pre-test conditions

The test kernel rpm, test sequence scripts and test guest image need to be put into the right folders as specified in the test case yaml file. The test guest image needs to have cyclictest included in it.

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The host and guest os kernel is upgraded. Cyclictest is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict Fails only if SLA is not passed, or if there is a test case execution problem.
15.3.4. Parser
15.3.4.1. Yardstick Test Case Description TC040
Verify Parser Yang-to-Tosca
test case id OPNFV_YARDSTICK_TC040 Verify Parser Yang-to-Tosca
metric
  1. tosca file which is converted from yang file by Parser
  2. result whether the output is same with expected outcome
test purpose To verify the function of Yang-to-Tosca in Parser.
configuration

file: opnfv_yardstick_tc040.yaml

yangfile: the path of the yang file which you want to convert. toscafile: the path of the tosca file which is your expected outcome.
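A minimal sketch of such a configuration, with hypothetical file paths (the scenario type name is also an assumption; opnfv_yardstick_tc040.yaml is authoritative):

  scenarios:
  -
    type: Parser                 # assumed scenario type name
    options:
      yangfile: /tmp/example.yang          # hypothetical path of the yang file to convert
      toscafile: /tmp/expected_tosca.yaml  # hypothetical path of the expected tosca output
    runner:
      type: Iteration
      iterations: 1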

test tool

Parser

(Parser is not part of a Linux distribution, hence it needs to be installed. As an example see /yardstick/benchmark/scenarios/parser/parser_setup.sh for how to install it manually. It will be installed and uninstalled automatically when you run this test case with yardstick.)

references Parser
applicability Test can be configured with different paths of yangfile and toscafile to fit your real environment when verifying Parser.
pre-test conditions No POD specific requirements have been identified. It can be run without a VM.
test sequence description and expected result
step 1

Parser is installed without a VM; the Yang-to-Tosca module is run to convert the yang file to a tosca file, and the output is validated against the expected outcome.

Result: Logs are stored.

test verdict Fails only if the output is different from the expected outcome or if there is a test case execution problem.

15.3.4.2. Yardstick Test Case Description TC074
Storperf
test case id OPNFV_YARDSTICK_TC074_Storperf
metric Storage performance
test purpose

Storperf integration with yardstick. The purpose of StorPerf is to provide a tool to measure block and object storage performance in an NFVI. When complemented with a characterization of typical VF storage performance requirements, it can provide pass/fail thresholds for test, staging, and production NFVI environments.

The benchmarks developed for block and object storage will be sufficiently varied to provide a good preview of expected storage performance behavior for any type of VNF workload.

configuration

file: opnfv_yardstick_tc074.yaml

  • agent_count: 1 - the number of VMs to be created
  • agent_image: “Ubuntu-14.04” - image used for creating VMs
  • public_network: “ext-net” - name of public network
  • volume_size: 2 - cinder volume size
  • block_sizes: “4096” - data block size
  • queue_depths: “4”
  • StorPerf_ip: “192.168.200.2”
  • query_interval: 10 - state query interval
  • timeout: 600 - maximum allowed job time
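A condensed YAML sketch of these options (the scenario type name and exact nesting are assumptions; opnfv_yardstick_tc074.yaml is authoritative):

  scenarios:
  -
    type: StorPerf               # assumed scenario type name
    options:
      agent_count: 1
      agent_image: "Ubuntu-14.04"
      public_network: "ext-net"
      volume_size: 2
      block_sizes: "4096"
      queue_depths: "4"
      StorPerf_ip: "192.168.200.2"
      query_interval: 10
      timeout: 600
    runner:
      type: Iteration
      iterations: 1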
test tool

Storperf

StorPerf is a tool to measure block and object storage performance in an NFVI.

StorPerf is delivered as a Docker container from https://hub.docker.com/r/opnfv/storperf/tags/.

references

Storperf

ETSI-NFV-TST001

applicability

Test can be configured with different:

  • agent_count

  • volume_size

  • block_sizes

  • queue_depths

  • query_interval

  • timeout

  • target=[device or path] The path to either an attached storage device (/dev/vdb, etc) or a directory path (/opt/storperf) that will be used to execute the performance test. In the case of a device, the entire device will be used. If not specified, the current directory will be used.

  • workload=[workload module] If not specified, the default is to run all workloads. The workload types are:

    • rs: 100% Read, sequential data
    • ws: 100% Write, sequential data
    • rr: 100% Read, random access
    • wr: 100% Write, random access
    • rw: 70% Read / 30% write, random access
  • nossd: Do not perform SSD style preconditioning.

  • nowarm: Do not perform a warmup prior to measurements.

  • report= [job_id] Query the status of the supplied job_id and report on metrics. If a workload is supplied, will report on only that subset.

    There are default values for each above-mentioned option.

pre-test conditions

If you do not have an Ubuntu 14.04 image in Glance, you will need to add one. A key pair for launching agents is also required.

Storperf is required to be installed in the environment. There are two possible methods for Storperf installation:

  • Run container on Jump Host
  • Run container in a VM

Running StorPerf on Jump Host Requirements:

  • Docker must be installed
  • Jump Host must have access to the OpenStack Controller API
  • Jump Host must have internet connectivity for downloading docker image
  • Enough floating IPs must be available to match your agent count

Running StorPerf in a VM Requirements:

  • VM has docker installed
  • VM has OpenStack Controller credentials and can communicate with the Controller API
  • VM has internet connectivity for downloading the docker image
  • Enough floating IPs must be available to match your agent count

No POD specific requirements have been identified.

test sequence description and expected result
step 1

The Storperf is installed and Ubuntu 14.04 image is stored in glance. TC is invoked and logs are produced and stored.

Result: Logs are stored.

test verdict None. Storage performance results are fetched and stored.
15.3.5. virtual Traffic Classifier
15.3.5.1. Yardstick Test Case Description TC006
Network Performance
test case id OPNFV_YARDSTICK_TC006_Virtual Traffic Classifier Data Plane Throughput Benchmarking Test.
metric Throughput
test purpose To measure the throughput supported by the virtual Traffic Classifier according to the RFC2544 methodology for a user-defined set of vTC deployment configurations.
configuration

file: opnfv_yardstick_tc006.yaml

packet_size: size of the packets to be used during the
throughput calculation. Allowe values: [64, 128, 256, 512, 1024, 1280, 1518]
vnic_type: type of VNIC to be used.
Allowed values are:
  • normal: for default OvS port configuration
  • direct: for SR-IOV port configuration

Default value: None

vtc_flavor: OpenStack flavor to be used for the vTC
Default available values are: m1.small, m1.medium, and m1.large, but the user can create his/her own flavor and give it as input. Default value: None
vlan_sender: vlan tag of the network on which the vTC will
receive traffic (VLAN Network 1). Allowed values: range (1, 4096)
vlan_receiver: vlan tag of the network on which the vTC
will send traffic back to the packet generator (VLAN Network 2). Allowed values: range (1, 4096)
default_net_name: Neutron name of the default network that
is used for access to the internet from the vTC (vNIC 1).
default_subnet_name: subnet name for vNIC1
(information available through Neutron).
vlan_net_1_name: Neutron Name for VLAN Network 1
(information available through Neutron).
vlan_subnet_1_name: Subnet Neutron name for VLAN Network 1
(information available through Neutron).
vlan_net_2_name: Neutron Name for VLAN Network 2
(information available through Neutron).
vlan_subnet_2_name: Subnet Neutron name for VLAN Network 2
(information available through Neutron).
test tool

DPDK pktgen

DPDK Pktgen is not part of a Linux distribution, hence it needs to be installed by the user.

references

DPDK Pktgen: DPDKpktgen

ETSI-NFV-TST001

RFC 2544: rfc2544

applicability Test can be configured with different flavors, vNIC type and packet sizes. Default values exist as specified above. The vNIC type and flavor MUST be specified by the user.
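
As a sketch only: assuming opnfv_yardstick_tc006.yaml exposes these options as Jinja2 template variables (see the Templates section), the mandatory vNIC type and flavor could be passed on the command line as task arguments.

# Sketch: pass the mandatory options as task arguments; the variable names
# vnic_type and vtc_flavor mirror the configuration options listed above.
yardstick task start opnfv_yardstick_tc006.yaml \
    --task-args '{"vnic_type": "normal", "vtc_flavor": "m1.medium"}'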
pre-test

The vTC has been successfully instantiated and configured. The user has correctly assigned the values to the deployment

configuration parameters.
  • Multicast traffic MUST be enabled on the network.
    The Data network switches need to be configured in order to manage multicast traffic.
  • In the case of SR-IOV vNICs use, SR-IOV compatible NICs
    must be used on the compute node.
  • Yardstick needs to be installed on a host connected to the
    data network and the host must have 2 DPDK-compatible NICs. Proper configuration of DPDK and DPDK pktgen is required before running the test case. (For further instructions please refer to the ApexLake documentation).
test sequence Description and expected results
step 1 The vTC is deployed, according to the user-defined configuration
step 2 The vTC is correctly deployed and configured as necessary The initialization script has been correctly executed and vTC is ready to receive and process the traffic.
step 3 Test case is executed with the selected parameters: - vTC flavor - vNIC type - packet size The traffic is sent to the vTC using the maximum available traffic rate for 60 seconds.
step 4

The vTC instance forwards all the packets back to the packet generator for 60 seconds, as specified by RFC 2544.

Steps 3 and 4 are executed multiple times, with different rates, in order to find the maximum supported traffic rate according to the current definition of throughput in RFC 2544.

test verdict The result of the test is a number between 0 and 100 which represents the throughput in terms of percentage of the available pktgen NIC bandwidth.
15.3.5.2. Yardstick Test Case Description TC007
Network Performance
test case id
OPNFV_YARDSTICK_TC007_Virtual Traffic Classifier Data Plane
Throughput Benchmarking Test in Presence of Noisy neighbours
metric Throughput
test purpose To measure the throughput supported by the virtual Traffic Classifier according to the RFC2544 methodology for a user-defined set of vTC deployment configurations in the presence of noisy neighbours.
configuration

file: opnfv_yardstick_tc007.yaml

packet_size: size of the packets to be used during the
throughput calculation. Allowed values: [64, 128, 256, 512, 1024, 1280, 1518]
vnic_type: type of VNIC to be used.
Allowed values are:
  • normal: for default OvS port configuration
  • direct: for SR-IOV port configuration
vtc_flavor: OpenStack flavor to be used for the vTC
Default available values are: m1.small, m1.medium, and m1.large, but the user can create his/her own flavor and give it as input
num_of_neighbours: Number of noisy neighbours (VMs) to be
instantiated during the experiment. Allowed values: range (1, 10)
amount_of_ram: RAM to be used by each neighbor.
Allowed values: [‘250M’, ‘1G’, ‘2G’, ‘3G’, ‘4G’, ‘5G’,
‘6G’, ‘7G’, ‘8G’, ‘9G’, ‘10G’]

Default value: 256M

number_of_cores: Number of cores to be allocated to each noisy
neighbour. Allowed values: range (1, 10) Default value: 1
vlan_sender: vlan tag of the network on which the vTC will
receive traffic (VLAN Network 1). Allowed values: range (1, 4096)
vlan_receiver: vlan tag of the network on which the vTC
will send traffic back to the packet generator (VLAN Network 2). Allowed values: range (1, 4096)
default_net_name: Neutron name of the default network that
is used for access to the internet from the vTC (vNIC 1).
default_subnet_name: subnet name for vNIC1
(information available through Neutron).
vlan_net_1_name: Neutron Name for VLAN Network 1
(information available through Neutron).
vlan_subnet_1_name: Subnet Neutron name for VLAN Network 1
(information available through Neutron).
vlan_net_2_name: Neutron Name for VLAN Network 2
(information available through Neutron).
vlan_subnet_2_name: Subnet Neutron name for VLAN Network 2
(information available through Neutron).
test tool

DPDK pktgen

DPDK Pktgen is not part of a Linux distribution, hence it needs to be installed by the user.

references

DPDKpktgen

ETSI-NFV-TST001

rfc2544

applicability Test can be configured with different flavors, vNIC type and packet sizes. Default values exist as specified above. The vNIC type and flavor MUST be specified by the user.
pre-test

The vTC has been successfully instantiated and configured. The user has correctly assigned the values to the deployment

configuration parameters.
  • Multicast traffic MUST be enabled on the network.
    The Data network switches need to be configured in order to manage multicast traffic.
  • In the case of SR-IOV vNICs use, SR-IOV compatible NICs
    must be used on the compute node.
  • Yardstick needs to be installed on a host connected to the
    data network and the host must have 2 DPDK-compatible NICs. Proper configuration of DPDK and DPDK pktgen is required before running the test case. (For further instructions please refer to the ApexLake documentation).
test sequence Description and expected results
step 1 The noisy neighbours are deployed as required by the user.
step 2 The vTC is deployed, according to the configuration required by the user
step 3 The vTC is correctly deployed and configured as necessary. The initialization script has been correctly executed and the vTC is ready to receive and process the traffic.
step 4

Test case is executed with the parameters specified by the user:

  • vTC flavor
  • vNIC type
  • packet size
The traffic is sent to the vTC using the maximum available
traffic rate
step 5

The vTC instance forwards all the packets back to the packet generator for 60 seconds, as specified by RFC 2544.

Steps 4 and 5 are executed multiple times, with different traffic rates, in order to find the maximum supported traffic rate according to the current definition of throughput in RFC 2544.

test verdict The result of the test is a number between 0 and 100 which represents the throughput in terms of percentage of the available pktgen NIC bandwidth.
15.3.5.3. Yardstick Test Case Description TC020
Network Performance
test case id OPNFV_YARDSTICK_TC0020_Virtual Traffic Classifier Instantiation Test
metric Failure
test purpose To verify that a newly instantiated vTC is ‘alive’ and functional and its instantiation is correctly supported by the infrastructure.
configuration

file: opnfv_yardstick_tc020.yaml

vnic_type: type of VNIC to be used.
Allowed values are:
  • normal: for default OvS port configuration
  • direct: for SR-IOV port configuration

Default value: None

vtc_flavor: OpenStack flavor to be used for the vTC
Default available values are: m1.small, m1.medium, and m1.large, but the user can create his/her own flavor and give it as input. Default value: None
vlan_sender: vlan tag of the network on which the vTC will
receive traffic (VLAN Network 1). Allowed values: range (1, 4096)
vlan_receiver: vlan tag of the network on which the vTC
will send traffic back to the packet generator (VLAN Network 2). Allowed values: range (1, 4096)
default_net_name: Neutron name of the default network that
is used for access to the internet from the vTC (vNIC 1).
default_subnet_name: subnet name for vNIC1
(information available through Neutron).
vlan_net_1_name: Neutron Name for VLAN Network 1
(information available through Neutron).
vlan_subnet_1_name: Subnet Neutron name for VLAN Network 1
(information available through Neutron).
vlan_net_2_name: Neutron Name for VLAN Network 2
(information available through Neutron).
vlan_subnet_2_name: Subnet Neutron name for VLAN Network 2
(information available through Neutron).
test tool

DPDK pktgen

DPDK Pktgen is not part of a Linux distribution, hence it needs to be installed by the user.

references

DPDKpktgen

ETSI-NFV-TST001

rfc2544

applicability Test can be configured with different flavors, vNIC type and packet sizes. Default values exist as specified above. The vNIC type and flavor MUST be specified by the user.
pre-test

The vTC has been successfully instantiated and configured. The user has correctly assigned the values to the deployment

configuration parameters.
  • Multicast traffic MUST be enabled on the network.
    The Data network switches need to be configured in order to manage multicast traffic. Installation and configuration of smcroute is required before running the test case. (For further instructions please refer to the ApexLake documentation).
  • In the case of SR-IOV vNICs use, SR-IOV compatible NICs
    must be used on the compute node.
  • Yardstick needs to be installed on a host connected to the
    data network and the host must have 2 DPDK-compatible NICs. Proper configuration of DPDK and DPDK pktgen is required before running the test case. (For further instructions please refer to the ApexLake documentation).
test sequence Description and expected results
step 1 The vTC is deployed, according to the configuration provided by the user.
step 2 The vTC is correctly deployed and configured as necessary. The initialization script has been correctly executed and the vTC is ready to receive and process the traffic.
step 3 Test case is executed with the parameters specified by the user: - vTC flavor - vNIC type. Constant rate traffic is sent to the vTC for 10 seconds.
step 4

The vTC instance tags all the packets and sends them back to the packet generator for 10 seconds.

The framework checks that the packet generator receives back all the packets with the correct tag from the vTC.

test verdict The vTC is deemed to be successfully instantiated if all packets are sent back with the right tag as requested, else it is deemed DoA (Dead on arrival)
15.3.5.4. Yardstick Test Case Description TC021
Network Performance
test case id OPNFV_YARDSTICK_TC0021_Virtual Traffic Classifier Instantiation Test in Presence of Noisy Neighbours
metric Failure
test purpose To verify that a newly instantiated vTC is ‘alive’ and functional and its instantiation is correctly supported by the infrastructure in the presence of noisy neighbours.
configuration

file: opnfv_yardstick_tc021.yaml

vnic_type: type of VNIC to be used.
Allowed values are:
  • normal: for default OvS port configuration
  • direct: for SR-IOV port configuration

Default value: None

vtc_flavor: OpenStack flavor to be used for the vTC
Default available values are: m1.small, m1.medium, and m1.large, but the user can create his/her own flavor and give it as input. Default value: None
num_of_neighbours: Number of noisy neighbours (VMs) to be
instantiated during the experiment. Allowed values: range (1, 10)
amount_of_ram: RAM to be used by each neighbor.
Allowed values: [‘250M’, ‘1G’, ‘2G’, ‘3G’, ‘4G’, ‘5G’,
‘6G’, ‘7G’, ‘8G’, ‘9G’, ‘10G’]

Default value: 256M

number_of_cores: Number of cores to be allocated to each noisy
neighbour. Allowed values: range (1, 10) Default value: 1
vlan_sender: vlan tag of the network on which the vTC will
receive traffic (VLAN Network 1). Allowed values: range (1, 4096)
vlan_receiver: vlan tag of the network on which the vTC
will send traffic back to the packet generator (VLAN Network 2). Allowed values: range (1, 4096)
default_net_name: Neutron name of the default network that
is used for access to the internet from the vTC (vNIC 1).
default_subnet_name: subnet name for vNIC1
(information available through Neutron).
vlan_net_1_name: Neutron Name for VLAN Network 1
(information available through Neutron).
vlan_subnet_1_name: Subnet Neutron name for VLAN Network 1
(information available through Neutron).
vlan_net_2_name: Neutron Name for VLAN Network 2
(information available through Neutron).
vlan_subnet_2_name: Subnet Neutron name for VLAN Network 2
(information available through Neutron).
test tool

DPDK pktgen

DPDK Pktgen is not part of a Linux distribution, hence it needs to be installed by the user.

references

DPDK Pktgen: DPDKpktgen

ETSI-NFV-TST001

RFC 2544: rfc2544

applicability Test can be configured with different flavors, vNIC type and packet sizes. Default values exist as specified above. The vNIC type and flavor MUST be specified by the user.
pre-test

The vTC has been successfully instantiated and configured. The user has correctly assigned the values to the deployment

configuration parameters.
  • Multicast traffic MUST be enabled on the network.
    The Data network switches need to be configured in order to manage multicast traffic. Installation and configuration of smcroute is required before running the test case. (For further instructions please refer to the ApexLake documentation).
  • In the case of SR-IOV vNICs use, SR-IOV compatible NICs
    must be used on the compute node.
  • Yardstick needs to be installed on a host connected to the
    data network and the host must have 2 DPDK-compatible NICs. Proper configuration of DPDK and DPDK pktgen is required before running the test case. (For further instructions please refer to the ApexLake documentation).
test sequence Description and expected results
step 1 The noisy neighbours are deployed as required by the user.
step 2 The vTC is deployed, according to the configuration provided by the user.
step 3 The vTC is correctly deployed and configured as necessary. The initialization script has been correctly executed and the vTC is ready to receive and process the traffic.
step 4 Test case is executed with the selected parameters: - vTC flavor - vNIC type. Constant rate traffic is sent to the vTC for 10 seconds.
step 5

The vTC instance tags all the packets and sends them back to the packet generator for 10 seconds.

The framework checks if the packet generator receives back all the packets with the correct tag from the vTC.

test verdict The vTC is deemed to be successfully instantiated if all packets are sent back with the right tag as requested, else it is deemed DoA (Dead on arrival)
15.4. Templates
15.4.1. Yardstick Test Case Description TCXXX
test case slogan e.g. Network Latency
test case id e.g. OPNFV_YARDSTICK_TC001_NW Latency
metric what will be measured, e.g. latency
test purpose describe what is the purpose of the test case
configuration what .yaml file to use, state SLA if applicable, state test duration, list and describe the scenario options used in this TC and also list the options using default values.
test tool e.g. ping
references e.g. RFCxxx, ETSI-NFVyyy
applicability describe variations of the test case which can be performed, e.g. run the test for different packet sizes
pre-test conditions describe configuration in the tool(s) used to perform the measurements (e.g. fio, pktgen), POD-specific configuration required to enable running the test
test sequence description and expected result
step 1

use this to describe tests that require several steps, e.g. collect logs.

Result: what happens in this step e.g. logs collected

step 2

remove interface

Result: interface down.

step N

what is done in step N

Result: what happens

test verdict expected behavior, or SLA, pass/fail criteria
15.4.2. Task Template Syntax
15.4.2.1. Basic template syntax

A nice feature of the input task format used in Yardstick is that it supports the template syntax based on Jinja2. This turns out to be extremely useful when, say, you have a fixed structure of your task but you want to parameterize this task in some way. For example, imagine your input task file (task.yaml) runs a set of Ping scenarios:

# Sample benchmark task config file
# measure network latency using ping
schema: "yardstick:task:0.1"

scenarios:
-
  type: Ping
  options:
    packetsize: 200
  host: athena.demo
  target: ares.demo

  runner:
    type: Duration
    duration: 60
    interval: 1

  sla:
    max_rtt: 10
    action: monitor

context:
    ...

Let’s say you want to run the same set of scenarios with the same runner/context/sla, but you want to try another packetsize to compare the performance. The most elegant solution is then to turn the packetsize name into a template variable:

# Sample benchmark task config file
# measure network latency using ping

schema: "yardstick:task:0.1"
scenarios:
-
  type: Ping
  options:
    packetsize: {{packetsize}}
  host: athena.demo
  target: ares.demo

  runner:
    type: Duration
    duration: 60
    interval: 1

  sla:
    max_rtt: 10
    action: monitor

context:
    ...

and then pass the argument value for {{packetsize}} when starting a task with this configuration file. Yardstick provides you with different ways to do that:

1. Pass the argument values directly in the command-line interface (with either a JSON or YAML dictionary):

yardstick task start samples/ping-template.yaml \
--task-args '{"packetsize":"200"}'

2. Refer to a file that specifies the argument values (JSON/YAML):

yardstick task start samples/ping-template.yaml --task-args-file args.yaml
15.4.2.2. Using the default values

Note that the Jinja2 template syntax allows you to set the default values for your parameters. With default values set, your task file will work even if you don’t parameterize it explicitly while starting a task. The default values should be set using the {% set ... %} clause (task.yaml). For example:

# Sample benchmark task config file
# measure network latency using ping
schema: "yardstick:task:0.1"
{% set packetsize = packetsize or "100" %}
scenarios:
-
  type: Ping
  options:
    packetsize: {{packetsize}}
  host: athena.demo
  target: ares.demo

  runner:
    type: Duration
    duration: 60
    interval: 1
  ...

If you don’t pass the value for {{packetsize}} while starting a task, the default one will be used.
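
For example, with the default set above, both of the following invocations work; the first uses the default packetsize of 100 and the second overrides it:

# Uses the default packetsize ("100") set via the {% set ... %} clause
yardstick task start samples/ping-template.yaml

# Overrides the default packetsize
yardstick task start samples/ping-template.yaml --task-args '{"packetsize":"200"}'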

15.4.2.3. Advanced templates

Yardstick makes it possible to use all the power of Jinja2 template syntax, including the mechanism of built-in functions. As an example, let us make up a task file that will do a block storage performance test. The input task file (fio-template.yaml) below uses the Jinja2 for-endfor construct to accomplish that:

#Test block sizes of 4KB, 8KB, 64KB, 1MB
#Test 5 workloads: read, write, randwrite, randread, rw
schema: "yardstick:task:0.1"

scenarios:
{% for bs in ['4k', '8k', '64k', '1024k' ] %}
  {% for rw in ['read', 'write', 'randwrite', 'randread', 'rw' ] %}
-
  type: Fio
  options:
    filename: /home/ubuntu/data.raw
    bs: {{bs}}
    rw: {{rw}}
    ramp_time: 10
  host: fio.demo
  runner:
    type: Duration
    duration: 60
    interval: 60

  {% endfor %}
{% endfor %}
context:
    ...
16. Glossary
API
Application Programming Interface
DPDK
Data Plane Development Kit
DPI
Deep Packet Inspection
DSCP
Differentiated Services Code Point
IGMP
Internet Group Management Protocol
IOPS
Input/Output Operations Per Second
NFVI
Network Function Virtualization Infrastructure
NIC
Network Interface Controller
PBFS
Packet Based per Flow State
QoS
Quality of Service
SR-IOV
Single Root IO Virtualization
SUT
System Under Test
ToS
Type of Service
VLAN
Virtual LAN
VM
Virtual Machine
VNF
Virtual Network Function
VNFC
Virtual Network Function Component
VTC
Virtual Traffic Classifier
17. References

Testing Developer Guides

Bottlenecks

Bottlenecks - Testing Guide
Project Testing Guide

For each test suite, you can set up either a test story or a test case to run a certain test. A test story can include several test cases as a set in one configuration file. You can then call the test story or test case by using the Bottlenecks CLI or the Python build process. Details are shown in the following sections.

Brief Introduction of the Test Suites in Project Releases

Brahmaputra: Rubbos is introduced, which is an end-to-end NFVI performance tool. The Virtual Switch Test Framework (VSTF) is also introduced, which is a test framework used for vswitch performance tests.

Colorado: Rubbos is refactored using Puppet, which makes it flexible to configure with different numbers of load generators (Client) and workers (Tomcat). VSTF is refactored by extracting the test cases' configuration information.

Danube: the POSCA testsuite is introduced to implement stress (factor), scenario and tuning tests in a parametric manner. Two test cases are developed and integrated into the community CI pipeline. Rubbos and VSTF are no longer supported.

Integration Description
Release Integrated Installer Supported Testsuite
Brahmaputra Fuel Rubbos, VSTF
Colorado Compass Rubbos, VSTF
Danube Compass POSCA
Test suite & Test case Description
POSCA posca_factor_ping
posca_factor_system_bandwidth
Rubbos rubbos_basic
rubbos_TC1101
rubbos_TC1201
rubbos_TC1301
rubbos_TC1401
rubbos_heavy_TC1101
vstf vstf_Ti1
vstf_Ti2
vstf_Ti3
vstf_Tn1
vstf_Tn2
vstf_Tu1
vstf_Tu2
vstf_Tu3
POSCA Testsuite Guide
POSCA Introduction

The POSCA (Parametric Bottlenecks Testing Catalogue) testsuite classifies the bottleneck test cases and results into 5 categories. The results are then analyzed and bottlenecks are searched for among these categories.

The POSCA testsuite aims to locate the bottlenecks in a parametric manner and to decouple the bottlenecks from the deployment requirements. It provides a user-friendly way to profile and understand the E2E system behavior and deployment requirements.

Goals of the POSCA testsuite:
  1. Automatically locate the bottlenecks in an iterative manner.
  2. Automatically generate the testing report for bottlenecks in different categories.
  3. Implement automated staging.
Scopes of the POSCA testsuite:
  1. Modeling, Testing and Test Result analysis.
  2. Parameters choosing and Algorithms.
Test stories of POSCA testsuite:
  1. Factor test (Stress test): base test cases that the Feature and Optimization tests depend on.
  2. Feature test: test cases for features/scenarios.
  3. Optimization test: test to tune the system parameter.

The detailed workflow is illustrated below.

Preinstall Packages
if [ -f /usr/local/bin/docker-compose ]; then
    rm -f /usr/local/bin/docker-compose
fi
curl -L https://github.com/docker/compose/releases/download/1.11.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
Run POSCA Locally

The POSCA testsuite is highly automated regarding test environment preparation, installing testing tools, executing tests and showing the report/analysis. A few steps are needed to run it locally.

It is presumed that the user deploys OPNFV Danube using Compass4nfv and logs in to the jump server as root.

Downloading Bottlenecks Software
mkdir /home/opnfv
cd /home/opnfv
git clone https://gerrit.opnfv.org/gerrit/bottlenecks
cd bottlenecks
Preparing the Python Virtual Environment
. pre_virt_env.sh
Executing a Specified Testcase

Bottlenecks provides a CLI interface to run the tests, which is one of the most convenient ways since it is close to natural language. A GUI interface with a REST API will also be provided in a later update.

bottlenecks [testcase run <testcase>] [teststory run <teststory>]

For the testcase command, the testcase name should be the same as the name of the test case configuration file located in testsuites/posca/testcase_cfg. For stress tests in Danube, testcase should be replaced by either posca_factor_ping or posca_factor_system_bandwidth. For the teststory command, a user can specify the test cases to be executed by defining them in a teststory configuration file located in testsuites/posca/testsuite_story. There is also an example there named posca_factor_test.
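
For example, using the Danube stress test cases and the sample test story named above, the invocations look like this:

# Run a single stress test case by name
bottlenecks testcase run posca_factor_ping
bottlenecks testcase run posca_factor_system_bandwidth

# Run the example test story defined in testsuites/posca/testsuite_story
bottlenecks teststory run posca_factor_test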

There are also two other ways to run test cases and test stories. The first is using a shell script.

bash run_tests.sh [-h|--help] [-s <testsuite>] [-c <testcase>]

The second is using the Python interpreter.

docker-compose -f docker/bottleneck-compose/docker-compose.yml up -d
docker pull tutum/influxdb:0.13
sleep 5
POSCA_SCRIPT="/home/opnfv/bottlenecks/testsuites/posca"
docker exec bottleneckcompose_bottlenecks_1 python ${POSCA_SCRIPT}/run_posca.py [testcase <testcase>] [teststory <teststory>]
Showing Report

Bottlenecks uses ELK to illustrate the testing results. Assuming the IP of the SUT (System Under Test) is denoted as ipaddr, the address of Kibana is http://[ipaddr]:5601. One can visit this address to see the visualizations. The address of Elasticsearch is http://[ipaddr]:9200. One can use any REST tool to access the testing data stored in Elasticsearch.
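
For instance, the generic Elasticsearch search endpoint can be queried with curl to inspect the stored test documents (a sketch; the index names depend on the Bottlenecks version):

# Sketch: list stored test documents via the Elasticsearch search API;
# replace [ipaddr] with the address of the SUT as above.
curl "http://[ipaddr]:9200/_search?pretty"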

Cleaning Up Environment
. rm_virt_env.sh

If you want to clean up the Docker containers created during the test, you can execute the additional commands below.

docker-compose -f docker/bottleneck-compose/docker-compose.yml down
docker ps -a | grep 'influxdb' | awk '{print $1}' | xargs docker rm -f >/dev/stdout

Or you can just run the following command

bash run_tests.sh --cleanup

Note that you can also add the cleanup parameter when you run a test case; the environment will then be automatically cleaned up when the test completes.
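
For example, combining a single test case run with automatic cleanup (a sketch based on the options shown above):

# Sketch: run one stress test case and clean up the environment afterwards
bash run_tests.sh -c posca_factor_ping --cleanup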

Run POSCA through Community CI

POSCA test cases are now run by OPNFV CI. See https://build.opnfv.org for details of the build jobs. Each build job is set up to execute a single test case. The test results/logs are printed on the web page and reported automatically to the community MongoDB. There are two ways to report the results.

  1. Report testing result by shell script
bash run_tests.sh [-h|--help] [-s <testsuite>] [-c <testcase>] --report
  2. Report testing result by python interpreter
docker-compose -f docker/bottleneck-compose/docker-compose.yml up -d
docker pull tutum/influxdb:0.13
sleep 5
REPORT="True"
POSCA_SCRIPT="/home/opnfv/bottlenecks/testsuites/posca"
docker exec bottleneckcompose_bottlenecks_1 python ${POSCA_SCRIPT}/run_posca.py [testcase <testcase>] [teststory <teststory>] REPORT
Test Result Description
Dashboard Guide
Scope

This document provides an overview of the results of test cases developed by the OPNFV Bottlenecks Project, executed on OPNFV community labs.

The OPNFV CI (Continuous Integration) system provides automated build, deploy and testing for the software developed in OPNFV. Unless stated, the reported tests are automated via Jenkins Jobs.

Test results are visible in the following dashboard:

  • Testing dashboard: uses MongoDB to store test results and Bitergia for visualization; it includes the Rubbos and VSTF test results.
Bottlenecks - Deprecated Testing Guide
Rubbos Testsuite Guide
Rubbos Introduction

Rubbos is a bulletin board benchmark modeled after an online news forum like Slashdot. It is open source middleware with an n-tier system model, deployed on multiple physical nodes to measure the overall performance of the OPNFV platform. Rubbos can deploy Apache, Tomcat and the DB. Based on this deployment, Rubbos puts load on the whole system; when the system reaches its peak, the throughput stops growing. This test case helps to understand the bottlenecks of the OPNFV platform and to improve its performance.

The detailed workflow is illustrated below.

Bottlenecks Framework Setup
Preinstall Packages

Some packages need to be installed before running Rubbos: gcc, gettext, g++, libaio1, libaio-dev, make and git are necessary. When Rubbos runs on the OPNFV community continuous integration (CI) system, the required packages are installed automatically, as listed in the code repository under /utils/infra_setup/vm_dev_setup/packages.conf; besides, the packages can be encapsulated in the images initially. If someone wants to use Rubbos locally, he/she has to install them by hand, for example in Ubuntu 14.04:

apt-get update
apt-get install gettext
How does Rubbos Integrate into Installers

1. Community CI System

Rubbos has been successfully integrated into Fuel and Compass with the NOSDN scenario in the OPNFV community CI system.

Heat is used to create 9 instances, as shown in /utils/infra_setup/heat_template/HOT_create_instance.sh; the 9 instances are used for Apache, Tomcat, MySQL, Control, Benchmark and 4 Clients. Tools such as Rubbos, sysstat and OProfile are installed in these instances to perform the test. The test results are stored in the Benchmark instance initially, then copied to the Rubbos_result instance, and finally transferred to the community dashboard.

Large packages (megabytes to gigabytes in size), such as the images, JDK, apache-ant and apache-tomcat, need to be stored somewhere; the OPNFV community storage system, Google Cloud Storage, is used, and the packages can be downloaded from https://artifacts.opnfv.org/bottlenecks/rubbos.

2. Local Deployment

If someone wants to run Rubbos in his/her own environment, he/she can follow the steps below.

2.1 Start up instances by using Heat, Nova or libvirt. In an OpenStack environment, the Heat script /utils/infra_setup/heat_template/HOT_create_instance.sh can be used as a reference; if the OpenStack deployment doesn't support the Heat module, the script /utils/infra_setup/create_instance.sh can be used. Without OpenStack, instances can be set up by using libvirt; the scripts are shown under the directory /utils/rubbos_dev_env_setup.

The image can be downloaded from the community cloud storage

curl --connect-timeout 10 -o bottlenecks-trusty-server.img \
     http://artifacts.opnfv.org/bottlenecks/rubbos/bottlenecks-trusty-server.img

2.2 SSH into the control node and clone the Bottlenecks code to the root directory.

git clone https://git.opnfv.org/bottlenecks /bottlenecks

2.3 Download the packages and decompress them into the proper directory.

curl --connect-timeout 10 -o app_tools.tar.gz \
     http://artifacts.opnfv.org/bottlenecks/rubbos/app_tools.tar.gz
curl --connect-timeout 10 -o rubbosMulini6.tar.gz \
     http://artifacts.opnfv.org/bottlenecks/rubbos/rubbosMulini6.tar.gz
tar zxf app_tools.tar.gz -C /bottlenecks/rubbos
tar zxf rubbosMulini6.tar.gz -C /bottlenecks/rubbos/rubbos_scripts

2.4 SSH into the Control node and run the script

source /bottlenecks/rubbos/rubbos_scripts/1-1-1/scripts/run.sh

2.5 Check the test results under the directory /bottlenecks/rubbos/rubbos_results in the Control node. The results are stored in XML format; open them in a browser such as Chrome to view them.

Test Result Description

In the OPNFV community, the result is shown in the following format:

[{'client': 200, 'throughput': 27},
 {'client': 700, 'throughput': 102},
 {'client': 1200, 'throughput': 177},
 {'client': 1700, 'throughput': 252},
 {'client': 2200, 'throughput': 323},
 {'client': 2700, 'throughput': 399},
 {'client': 3200, 'throughput': 473}]

The results are transferred to the community database and a graph is drawn on the dashboard. As the number of clients grows, the throughput increases at first and then reaches an inflection point, which is caused by the bottlenecks of the measured system.

VSTF Testsuite Guide
VSTF Introduction

VSTF (Virtual Switch Test Framework) is a system-level testing framework in the area of network virtualization. It helps you estimate the switching capability of the system and find network bottlenecks via the main KPIs (bandwidth, latency, resource usage and so on). VSTF has a methodology to define test scenarios and test cases; currently the Tu test cases are supported in the OpenStack environment, and more scenarios and cases will be added.

VSTF TestScenario
  1. Tu - VM to VM
  2. Tn - Physical Nic loopback
  3. TnV - VNF loopback
  4. Ti - VM to Physical Nic
Pre-install Packages on the Ubuntu 14.04 VM
VSTF VM Preparation Steps
  1. Create an Ubuntu 14.04 VM
  2. Install the dependencies inside the VM
  3. Install the vstf python package inside the VM
VM preparation

Install Python 2.7 and git

sudo apt-get install python2.7
sudo apt-get install git

Download Bottlenecks package

cd /home/
sudo git clone https://gerrit.opnfv.org/gerrit/bottlenecks

Install the dependencies

sudo apt-get install python-pip
sudo pip install --upgrade pip
sudo dpkg-reconfigure dash
sudo apt-get install libjpeg-dev
sudo apt-get install libpng-dev
sudo apt-get install python-dev
sudo apt-get install python-testrepository
sudo apt-get install git
sudo apt-get install python-pika
sudo apt-get install python-oslo.config
sudo pip install -r /home/bottlenecks/vstf/requirements.txt

Install vstf package

sudo mkdir -p /var/log/vstf/
sudo cp -r /home/bottlenecks/vstf/etc/vstf/ /etc/
sudo mkdir -p /opt/vstf/
cd /home/bottlenecks; sudo rm -rf build/
sudo python setup.py install
Image on the Cloud
Name vstf-image
URL http://artifacts.opnfv.org/bottlenecks/vstf-manager-new.img
Format QCOW2
Size 5G
User root
Passwd root

There is a complete VSTF image on the cloud. You can download it and use it to deploy and run cases; in that case the VM preparation steps are not needed.

How is VSTF Integrated into Installers
VM requirements
Name FLAVOR IMAGE_NAME NETWORK
vstf-manager m1.large vstf-image control-plane=XX.XX.XX.XX
vstf-tester m1.large vstf-image control-plane(eth0)=XX.XX.XX.XX test-plane(eth1)=XX.XX.XX.XX
vstf-target m1.large vstf-image control-plane(eth0)=XX.XX.XX.XX test-plane(eth1)=XX.XX.XX.XX

m1.large means 4 vCPUs and 4 GB RAM; the target image size is 5 GB. The network used by the VMs needs two planes: one is the control plane and the other is the test plane.

OPNFV community Usage in the CI system
Project Name Project Category
bottlenecks-daily-fuel-vstf-lf-master bottlenecks

OPNFV community jenkins Project info

Main entrance for the CI test:

cd /home/bottlenecks/ci;
bash -x vstf_run.sh
Test Locally (OpenStack Environment)

download the image file

curl --connect-timeout 10 -o /tmp/vstf-manager.img \
     http://artifacts.opnfv.org/bottlenecks/vstf-manager-new.img -v

create the image in Glance

glance image-create --name $MANAGER_IMAGE_NAME \
      --disk-format qcow2 \
      --container-format bare \
      --file /tmp/vstf-manager.img

create the keypair for the image (any keypair will be OK)

cd /home/bottlenecks/utils/infra_setup/bottlenecks_key
nova keypair-add --pub_key $KEY_PATH/bottlenecks_key.pub $KEY_NAME

create the three VSTF VMs in OpenStack using Heat

cd /home/bottlenecks/utils/infra_setup/heat_template/vstf_heat_template
heat stack-create vstf -f bottleneck_vstf.yaml

launch the vstf process inside the vstf-manager, vstf-tester and vstf-target VMs

cd /home/bottlenecks/utils/infra_setup/heat_template/vstf_heat_template
bash -x launch_vstf.sh

edit the test scenario and test packet list in vstf_test.sh; Tu-1/2/3 are currently supported

function fn_testing_scenario(){
    ...
    local test_length_list="64 128 256 512 1024"
    local test_scenario_list="Tu-1 Tu-3"
    ...
}

launch the vstf script

cd /home/bottlenecks/utils/infra_setup/heat_template/vstf_heat_template
bash -x vstf_test.sh
Test Result Description
Result Format

For example, after the test the result will be displayed in the following format:

{ u'64': { u'AverageLatency': 0.063,
           u'Bandwidth': 0.239,
           u'CPU': 0.0,
           u'Duration': 20,
           u'MaximumLatency': 0.063,
           u'MinimumLatency': 0.063,
           u'MppspGhz': 0,
           u'OfferedLoad': 100.0,
           u'PercentLoss': 22.42,
           u'RxFrameCount': 4309750.0,
           u'RxMbps': 198.28,
           u'TxFrameCount': 5555436.0,
           u'TxMbps': 230.03}}
Option Description
Option Name Description
AverageLatency The average latency during the packet transmission (Unit: microsecond)
Bandwidth Network bandwidth (Unit: million packets per second)
CPU Total CPU resource usage (Unit: GHz)
Duration Test time (Unit: second)
MaximumLatency The maximum packet latency during the packet transmission (Unit: microsecond)
MinimumLatency The minimum packet latency during the packet transmission (Unit: microsecond)
MppspGhz Million packets per second per GHz of CPU resource (Unit: Mpps/GHz)
OfferedLoad The offered network load
PercentLoss The frame loss rate in percent
RxFrameCount The total number of frames received on the NIC
RxMbps The receive bandwidth in Mbps
TxFrameCount The total number of frames transmitted on the NIC
TxMbps The transmit bandwidth in Mbps

Dovetail

OPNFV Verified Program test case requirements
OVP Test Suite Purpose and Goals

The OVP test suite is intended to provide a method for validating the interfaces and behaviors of an NFVi platform according to the expected capabilities exposed in OPNFV. The behavioral foundation evaluated in these tests should serve to provide a functional baseline for VNF deployment and portability across NFVi instances. All OVP tests are available in open source and are executed in open source test frameworks.

Test case requirements

The following requirements are mandatory for a test to be submitted for consideration in the OVP test suite:

  • All test cases must be fully documented, in a common format. Please consider the existing OPNFV Verified Program test specification as examples.
    • Clearly identifying the test procedure and expected results / metrics to determine a “pass” or “fail” result.
  • Tests must be validated for the purpose of OVP; tests should be run with both an expected positive and negative outcome.
  • At the current stage of OVP, only functional tests are eligible; performance testing is out of scope.
    • Performance test output could be built in as “for information only”, but must not carry pass/fail metrics.
  • Test cases should favor implementation of a published standard interface for validation.
    • Where no standard is available provide API support references.
    • If a standard exists and is not followed, an exemption is required. Such exemptions can be raised in the project meetings first, and if no consensus can be reached, escalated to the TSC.
  • Test cases must pass on applicable OPNFV reference deployments and release versions.
    • Tests must not require a specific NFVi platform composition or installation tool.
      • Tests and test tools must run independently of the method of platform installation and architecture.
      • Tests and test tools must run independently of specific OPNFV components allowing different components such as storage backends or SDN controllers.
    • Tests must not require un-merged patches to the relevant upstream projects.
    • Tests must not require features or code which are out of scope for the latest release of the OPNFV project.
    • Tests must have a documented history of recent successful verification in OPNFV testing programs including CI, Functest, Yardstick, Bottlenecks, Dovetail, etc. (i.e., all testing programs in OPNFV that regularly validate tests against the release, whether automated or manual).
    • Tests must be considered optional unless they have a documented history for ALL OPNFV scenarios that are both
      • applicable, i.e., support the feature that the test exercises, and
      • released, i.e., in the OPNFV release supported by the OVP test suite version.
  • Tests must run against a fully deployed and operational system under test.
  • Tests and test implementations must support stand alone OPNFV and commercial OPNFV-derived solutions.
    • There can be no dependency on OPNFV resources or infrastructure.
    • Tests must not require external resources while a test is running, e.g., connectivity to the Internet. All resources required to run a test, e.g., VM and container images, are downloaded and installed as part of the system preparation and test tool installation.
  • The following things must be documented for the test case:
    • Use case specification
    • Test preconditions
    • Basic test flow execution description and test assertions
    • Pass/fail criteria
  • The following things may be documented for the test case:
    • Parameter border test cases descriptions
    • Fault/Error test case descriptions
    • Post conditions where the system state may be left changed after completion

New test case proposals should complete an OVP test case worksheet to ensure that all of these considerations are met before the test case is approved for inclusion in the OVP test suite.

Dovetail Test Suite Naming Convention

Test case naming and structuring must comply with the following conventions. The fully qualified name of a test case must comprise three sections:

<community>.<test_area>.<test_case_name>

  • community: The fully qualified test case name must identify the community or upstream project which developed and maintains the test case. For test cases originating in OPNFV projects, the community identifier is ‘opnfv’. Test cases consumed from the OpenStack tempest test suite are named ‘tempest’, for example.
  • test_area: The fully qualified test case name must identify the test case area. For test cases originating in OPNFV projects, the test case area must identify the project name.
  • test_case_name: The fully qualified test case name must include a concise description of the purpose of the test case.

An example of a fully qualified test case name is opnfv.sdnvpn.router_association_floating_ip.

Compliance and Verification program accepted test cases
Mandatory OVP Test Areas
Test Area VIM Operations - Compute
Image operations within the Compute API
tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_delete_image
tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_create_image_specify_multibyte_character_image_name
Basic support Compute API for server actions such as reboot, rebuild, resize
tempest.api.compute.servers.test_instance_actions.InstanceActionsTestJSON.test_get_instance_action
tempest.api.compute.servers.test_instance_actions.InstanceActionsTestJSON.test_list_instance_actions
Generate, import, and delete SSH keys within Compute services
tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_specify_keypair
List supported versions of the Compute API
tempest.api.compute.test_versions.TestVersions.test_list_api_versions
Quotas management in Compute API
tempest.api.compute.test_quotas.QuotasTestJSON.test_get_default_quotas
tempest.api.compute.test_quotas.QuotasTestJSON.test_get_quotas
Basic server operations in the Compute API
tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_server_with_admin_password
tempest.api.compute.servers.test_servers.ServersTestJSON.test_create_with_existing_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_numeric_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_server_metadata_exceeds_length_limit
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_server_name_length_exceeds_256
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_flavor
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_image
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_create_with_invalid_network_uuid
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_delete_server_pass_id_exceeding_length_limit
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_delete_server_pass_negative_id
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_get_non_existent_server
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_host_name_is_same_as_server_name
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_host_name_is_same_as_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_invalid_ip_v6_address
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers_with_detail
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_list_servers
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_list_servers_with_detail
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_flavor
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_image
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_server_name
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_filter_by_server_status
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_detailed_limit_results
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_flavor
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_image
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_limit
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_server_name
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filter_by_server_status
tempest.api.compute.servers.test_list_server_filters.ListServerFiltersTestJSON.test_list_servers_filtered_by_name_wildcard
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_changes_since_future_date
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_changes_since_invalid_date
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_greater_than_actual_count
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_pass_negative_value
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_limits_pass_string
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_flavor
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_image
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_by_non_existing_server_name
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_detail_server_is_deleted
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_status_non_existing
tempest.api.compute.servers.test_list_servers_negative.ListServersNegativeTestJSON.test_list_servers_with_a_deleted_server
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_lock_unlock_server
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_delete_server_metadata_item
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_get_server_metadata_item
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_list_server_metadata
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_set_server_metadata
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_set_server_metadata_item
tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_update_server_metadata
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_server_name_blank
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_reboot_server_hard
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_reboot_non_existent_server
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_rebuild_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_deleted_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_rebuild_non_existent_server
tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_stop_start_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_stop_non_existent_server
tempest.api.compute.servers.test_servers.ServersTestJSON.test_update_access_server_address
tempest.api.compute.servers.test_servers.ServersTestJSON.test_update_server_name
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_name_of_non_existent_server
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_server_name_length_exceeds_256
tempest.api.compute.servers.test_servers_negative.ServersNegativeTestJSON.test_update_server_set_empty_name
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_created_server_vcpus
tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_server_details
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_created_server_vcpus
tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_server_details
Retrieve volume information through the Compute API
tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_attach_detach_volume
tempest.api.compute.volumes.test_attach_volume.AttachVolumeTestJSON.test_list_get_volume_attachments
Test Area VIM Operations - Identity
API discovery operations within the Identity v3 API
tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_media_types
tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_version_resources
tempest.api.identity.v3.test_api_discovery.TestApiDiscovery.test_api_version_statuses
Auth operations within the Identity API
tempest.api.identity.v3.test_tokens.TokensV3Test.test_create_token
Test Area VIM Operations - Image
Image deletion tests using the Glance v2 API
tempest.api.image.v2.test_images.BasicOperationsImagesTest.test_delete_image
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_delete_image_null_id
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_delete_non_existing_image
tempest.api.image.v2.test_images_tags_negative.ImagesTagsNegativeTest.test_delete_non_existing_tag
Image get tests using the Glance v2 API
tempest.api.image.v2.test_images.ListImagesTest.test_get_image_schema
tempest.api.image.v2.test_images.ListImagesTest.test_get_images_schema
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_delete_deleted_image
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_image_null_id
tempest.api.image.v2.test_images_negative.ImagesNegativeTest.test_get_non_existent_image
tempest.api.image.v2.test_images.ListUserImagesTest.test_get_image_schema
tempest.api.image.v2.test_images.ListUserImagesTest.test_get_images_schema
CRUD image operations in Images API v2
tempest.api.image.v2.test_images.ListImagesTest.test_list_no_params
tempest.api.image.v2.test_images.ListImagesTest.test_index_no_params
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_no_params
Image list tests using the Glance v2 API
tempest.api.image.v2.test_images.ListImagesTest.test_list_images_param_container_format
tempest.api.image.v2.test_images.ListImagesTest.test_list_images_param_disk_format
tempest.api.image.v2.test_images.ListImagesTest.test_list_images_param_limit
tempest.api.image.v2.test_images.ListImagesTest.test_list_images_param_min_max_size
tempest.api.image.v2.test_images.ListImagesTest.test_list_images_param_size
tempest.api.image.v2.test_images.ListImagesTest.test_list_images_param_status
tempest.api.image.v2.test_images.ListImagesTest.test_list_images_param_visibility
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_container_format
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_disk_format
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_limit
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_min_max_size
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_size
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_status
tempest.api.image.v2.test_images.ListUserImagesTest.test_list_images_param_visibility
Image update tests using the Glance v2 API
tempest.api.image.v2.test_images.BasicOperationsImagesTest.test_update_image
tempest.api.image.v2.test_images_tags.ImagesTagsTest.test_update_delete_tags_for_image
tempest.api.image.v2.test_images_tags_negative.ImagesTagsNegativeTest.test_update_tags_for_non_existing_image
Test Area VIM Operations - Network
Basic CRUD operations on L2 networks and L2 network ports
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_allocation_pools
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_dhcp_enabled
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_gw
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_gw_and_allocation_pools
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_with_host_routes_and_dns_nameservers
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_without_gateway
tempest.api.network.test_networks.NetworksTest.test_create_delete_subnet_all_attributes
tempest.api.network.test_networks.NetworksTest.test_create_update_delete_network_subnet
tempest.api.network.test_networks.NetworksTest.test_delete_network_with_subnet
tempest.api.network.test_networks.NetworksTest.test_list_networks
tempest.api.network.test_networks.NetworksTest.test_list_networks_fields
tempest.api.network.test_networks.NetworksTest.test_list_subnets
tempest.api.network.test_networks.NetworksTest.test_list_subnets_fields
tempest.api.network.test_networks.NetworksTest.test_show_network
tempest.api.network.test_networks.NetworksTest.test_show_network_fields
tempest.api.network.test_networks.NetworksTest.test_show_subnet
tempest.api.network.test_networks.NetworksTest.test_show_subnet_fields
tempest.api.network.test_networks.NetworksTest.test_update_subnet_gw_dns_host_routes_dhcp
tempest.api.network.test_ports.PortsTestJSON.test_create_bulk_port
tempest.api.network.test_ports.PortsTestJSON.test_create_port_in_allowed_allocation_pools
tempest.api.network.test_ports.PortsTestJSON.test_create_update_delete_port
tempest.api.network.test_ports.PortsTestJSON.test_list_ports
tempest.api.network.test_ports.PortsTestJSON.test_list_ports_fields
tempest.api.network.test_ports.PortsTestJSON.test_show_port
tempest.api.network.test_ports.PortsTestJSON.test_show_port_fields
tempest.api.network.test_ports.PortsTestJSON.test_update_port_with_security_group_and_extra_attributes
tempest.api.network.test_ports.PortsTestJSON.test_update_port_with_two_security_groups_and_extra_attributes
tempest.api.network.test_networks.NetworksTestJSON.test_create_delete_subnet_with_allocation_pools
tempest.api.network.test_networks.NetworksTestJSON.test_create_delete_subnet_with_dhcp_enabled
tempest.api.network.test_networks.NetworksTestJSON.test_create_delete_subnet_with_gw
tempest.api.network.test_networks.NetworksTestJSON.test_create_delete_subnet_with_gw_and_allocation_pools
tempest.api.network.test_networks.NetworksTestJSON.test_create_delete_subnet_with_host_routes_and_dns_nameservers
tempest.api.network.test_networks.NetworksTestJSON.test_create_delete_subnet_without_gateway
tempest.api.network.test_networks.NetworksTestJSON.test_create_delete_subnet_all_attributes
tempest.api.network.test_networks.NetworksTestJSON.test_create_update_delete_network_subnet
tempest.api.network.test_networks.NetworksTestJSON.test_delete_network_with_subnet
tempest.api.network.test_networks.NetworksTestJSON.test_list_networks
tempest.api.network.test_networks.NetworksTestJSON.test_list_networks_fields
tempest.api.network.test_networks.NetworksTestJSON.test_list_subnets
tempest.api.network.test_networks.NetworksTestJSON.test_list_subnets_fields
tempest.api.network.test_networks.NetworksTestJSON.test_show_network
tempest.api.network.test_networks.NetworksTestJSON.test_show_network_fields
tempest.api.network.test_networks.NetworksTestJSON.test_show_subnet
tempest.api.network.test_networks.NetworksTestJSON.test_show_subnet_fields
tempest.api.network.test_networks.NetworksTestJSON.test_update_subnet_gw_dns_host_routes_dhcp
Basic CRUD operations on security groups
tempest.api.network.test_security_groups.SecGroupTest.test_create_list_update_show_delete_security_group
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_additional_args
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_icmp_type_code
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_protocol_integer_value
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_remote_group_id
tempest.api.network.test_security_groups.SecGroupTest.test_create_security_group_rule_with_remote_ip_prefix
tempest.api.network.test_security_groups.SecGroupTest.test_create_show_delete_security_group_rule
tempest.api.network.test_security_groups.SecGroupTest.test_list_security_groups
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_additional_default_security_group_fails
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_duplicate_security_group_rule_fails
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_ethertype
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_protocol
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_bad_remote_ip_prefix
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_invalid_ports
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_non_existent_remote_groupid
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_create_security_group_rule_with_non_existent_security_group
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_delete_non_existent_security_group
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_show_non_existent_security_group
tempest.api.network.test_security_groups_negative.NegativeSecGroupTest.test_show_non_existent_security_group_rule
Test Area VIM Operations - Volume
Volume attach and detach operations with the Cinder v2 API
tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_attach_detach_volume_to_instance
tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_get_volume_attachment
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_attach_volumes_with_nonexistent_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_detach_volumes_with_invalid_volume_id
Volume service availability zone operations with the Cinder v2 API
tempest.api.volume.test_availability_zone.AvailabilityZoneV2TestJSON.test_get_availability_zone_list
tempest.api.volume.test_availability_zone.AvailabilityZoneTestJSON.test_get_availability_zone_list
Volume cloning operations with the Cinder v2 API
tempest.api.volume.test_volumes_get.VolumesV2GetTest.test_volume_create_get_update_delete_as_clone
tempest.api.volume.test_volumes_get.VolumesGetTest.test_volume_create_get_update_delete_as_clone
Image copy-to-volume operations with the Cinder v2 API
tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_volume_bootable
tempest.api.volume.test_volumes_get.VolumesV2GetTest.test_volume_create_get_update_delete_from_image
tempest.api.volume.test_volumes_get.VolumesActionsTest.test_volume_bootable
tempest.api.volume.test_volumes_get.VolumesGetTest.test_volume_create_get_update_delete_from_image
Volume creation and deletion operations with the Cinder v2 API
tempest.api.volume.test_volumes_get.VolumesV2GetTest.test_volume_create_get_update_delete
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_invalid_size
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_nonexistent_source_volid
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_nonexistent_volume_type
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_out_passing_size
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_size_negative
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_size_zero
tempest.api.volume.test_volumes_get.VolumesGetTest.test_volume_create_get_update_delete
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_invalid_size
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_nonexistent_source_volid
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_nonexistent_volume_type
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_without_passing_size
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_without_passing_size
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_size_negative
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_size_zero
Volume service extension listing operations with the Cinder v2 API
tempest.api.volume.test_extensions.ExtensionsV2TestJSON.test_list_extensions
tempest.api.volume.test_extensions.ExtensionsTestJSON.test_list_extensions
Volume GET operations with the Cinder v2 API
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_get_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_get_volume_without_passing_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_volume_get_nonexistent_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_get_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_get_volume_without_passing_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_volume_get_nonexistent_volume_id
Volume listing operations with the Cinder v2 API
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_by_name
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_details_by_name
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_param_display_name_and_status
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_detail_param_display_name_and_status
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_detail_param_metadata
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_details
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_with_param_metadata
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_by_availability_zone
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_by_status
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_details_by_availability_zone
tempest.api.volume.test_volumes_list.VolumesV2ListTestJSON.test_volumes_list_details_by_status
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_detail_with_invalid_status
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_detail_with_nonexistent_name
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_with_invalid_status
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_list_volumes_with_nonexistent_name
tempest.api.volume.v2.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_details_pagination
tempest.api.volume.v2.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_details_with_multiple_params
tempest.api.volume.v2.test_volumes_list.VolumesV2ListTestJSON.test_volume_list_pagination
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_by_name
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_details_by_name
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_param_display_name_and_status
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_detail_param_display_name_and_status
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_detail_param_metadata
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_details
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_with_param_metadata
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_by_availability_zone
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_by_status
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_details_by_availability_zone
tempest.api.volume.test_volumes_list.VolumesListTestJSON.test_volume_list_details_by_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_detail_with_invalid_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_detail_with_nonexistent_name
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_with_invalid_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_list_volumes_with_nonexistent_name
tempest.api.volume.v2.test_volumes_list.VolumesListTestJSON.test_volume_list_details_pagination
tempest.api.volume.v2.test_volumes_list.VolumesListTestJSON.test_volume_list_details_with_multiple_params
tempest.api.volume.v2.test_volumes_list.VolumesListTestJSON.test_volume_list_pagination
Volume metadata operations with the Cinder v2 API
tempest.api.volume.test_volume_metadata.VolumesV2MetadataTest.test_create_get_delete_volume_metadata
tempest.api.volume.test_volume_metadata.VolumesV2MetadataTest.test_update_volume_metadata_item
tempest.api.volume.test_volume_metadata.VolumesMetadataTest.test_crud_volume_metadata
tempest.api.volume.test_volume_metadata.VolumesV2MetadataTest.test_crud_volume_metadata
tempest.api.volume.test_volume_metadata.VolumesMetadataTest.test_update_volume_metadata_item
tempest.api.volume.test_volume_metadata.VolumesMetadataTest.test_update_show_volume_metadata_item
Verification of read-only status on volumes with the Cinder v2 API
tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_volume_readonly_update
tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_volume_readonly_update
Volume reservation operations with the Cinder v2 API
tempest.api.volume.test_volumes_actions.VolumesV2ActionsTest.test_reserve_unreserve_volume
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_reserve_volume_with_negative_volume_status
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_reserve_volume_with_nonexistent_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_unreserve_volume_with_nonexistent_volume_id
tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_reserve_unreserve_volume
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_reserve_volume_with_negative_volume_status
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_reserve_volume_with_nonexistent_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_unreserve_volume_with_nonexistent_volume_id
Volume snapshot creation/deletion operations with the Cinder v2 API
tempest.api.volume.test_snapshot_metadata.SnapshotV2MetadataTestJSON.test_create_get_delete_snapshot_metadata
tempest.api.volume.test_snapshot_metadata.SnapshotV2MetadataTestJSON.test_update_snapshot_metadata_item
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_create_volume_with_nonexistent_snapshot_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_delete_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_delete_volume_without_passing_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_volume_delete_nonexistent_volume_id
tempest.api.volume.test_volumes_snapshots.VolumesV2SnapshotTestJSON.test_snapshot_create_get_list_update_delete
tempest.api.volume.test_volumes_snapshots.VolumesV2SnapshotTestJSON.test_volume_from_snapshot
tempest.api.volume.test_volumes_snapshots.VolumesV2SnapshotTestJSON.test_snapshots_list_details_with_params
tempest.api.volume.test_volumes_snapshots.VolumesV2SnapshotTestJSON.test_snapshots_list_with_params
tempest.api.volume.test_volumes_snapshots_negative.VolumesV2SnapshotNegativeTestJSON.test_create_snapshot_with_nonexistent_volume_id
tempest.api.volume.test_volumes_snapshots_negative.VolumesV2SnapshotNegativeTestJSON.test_create_snapshot_without_passing_volume_id
tempest.api.volume.test_snapshot_metadata.SnapshotMetadataTestJSON.test_crud_snapshot_metadata
tempest.api.volume.test_snapshot_metadata.SnapshotV2MetadataTestJSON.test_crud_snapshot_metadata
tempest.api.volume.test_snapshot_metadata.SnapshotMetadataTestJSON.test_update_snapshot_metadata_item
tempest.api.volume.test_snapshot_metadata.SnapshotMetadataTestJSON.test_update_show_snapshot_metadata_item
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_create_volume_with_nonexistent_snapshot_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_delete_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_delete_volume_without_passing_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_volume_delete_nonexistent_volume_id
tempest.api.volume.test_volumes_snapshots.VolumesSnapshotTestJSON.test_snapshot_create_get_list_update_delete
tempest.api.volume.test_volumes_snapshots.VolumesSnapshotTestJSON.test_volume_from_snapshot
tempest.api.volume.test_volumes_snapshots_list.VolumesSnapshotListTestJSON.test_snapshots_list_details_with_params
tempest.api.volume.test_volumes_snapshots_list.VolumesV2SnapshotListTestJSON.test_snapshots_list_details_with_params
tempest.api.volume.test_volumes_snapshots_list.VolumesSnapshotListTestJSON.test_snapshots_list_with_params
tempest.api.volume.test_volumes_snapshots_list.VolumesV2SnapshotListTestJSON.test_snapshots_list_with_params
tempest.api.volume.test_volumes_snapshots_negative.VolumesSnapshotNegativeTestJSON.test_create_snapshot_with_nonexistent_volume_id
tempest.api.volume.test_volumes_snapshots_negative.VolumesSnapshotNegativeTestJSON.test_create_snapshot_without_passing_volume_id
Volume update operations with the Cinder v2 API
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_update_volume_with_empty_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_update_volume_with_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesV2NegativeTest.test_update_volume_with_nonexistent_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_update_volume_with_empty_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_update_volume_with_invalid_volume_id
tempest.api.volume.test_volumes_negative.VolumesNegativeTest.test_update_volume_with_nonexistent_volume_id
Test Area High Availability
Verify high availability of OpenStack controller services
opnfv.ha.tc001.nova-api_service_down
opnfv.ha.tc003.neutron-server_service_down
opnfv.ha.tc004.keystone_service_down
opnfv.ha.tc005.glance-api_service_down
opnfv.ha.tc006.cinder-api_service_down
opnfv.ha.tc009.cpu_overload
opnfv.ha.tc010.disk_I/O_block
opnfv.ha.tc011.load_balance_service_down
Test Area vPing - Basic VNF Connectivity
opnfv.vping.userdata
opnfv.vping.ssh
Optional OVP Test Areas
Test Area BGP VPN
Verify association and disassociation of a node using route targets
opnfv.sdnvpn.subnet_connectivity
opnfv.sdnvpn.tenant_separation
opnfv.sdnvpn.router_association
opnfv.sdnvpn.router_association_floating_ip
IPv6 Compliance Testing Methodology and Test Cases
Test Case 1: Create and Delete an IPv6 Network, Port and Subnet
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_network
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_port
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_subnet
Test Case 2: Create, Update and Delete an IPv6 Network and Subnet
tempest.api.network.test_networks.NetworksIpV6Test.test_create_update_delete_network_subnet
Test Case 3: Check External Network Visibility
tempest.api.network.test_networks.NetworksIpV6Test.test_external_network_visibility
Test Case 4: List IPv6 Networks and Subnets of a Tenant
tempest.api.network.test_networks.NetworksIpV6Test.test_list_networks
tempest.api.network.test_networks.NetworksIpV6Test.test_list_subnets
Test Case 5: Show Information of an IPv6 Network and Subnet
tempest.api.network.test_networks.NetworksIpV6Test.test_show_network
tempest.api.network.test_networks.NetworksIpV6Test.test_show_subnet
Test Case 6: Create an IPv6 Port in Allowed Allocation Pools
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_in_allowed_allocation_pools
Test Case 7: Create an IPv6 Port without Security Groups
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_with_no_securitygroups
Test Case 8: Create, Update and Delete an IPv6 Port
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_update_delete_port
Test Case 9: List IPv6 Ports of a Tenant
tempest.api.network.test_ports.PortsIpV6TestJSON.test_list_ports
Test Case 10: Show Information of an IPv6 Port
tempest.api.network.test_ports.PortsIpV6TestJSON.test_show_port
Test Case 11: Add Multiple Interfaces for an IPv6 Router
tempest.api.network.test_routers.RoutersIpV6Test.test_add_multiple_router_interfaces
Test Case 12: Add and Remove an IPv6 Router Interface with port_id
tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_port_id
Test Case 13: Add and Remove an IPv6 Router Interface with subnet_id
tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_subnet_id
Test Case 14: Create, Update, Delete, List and Show an IPv6 Router
tempest.api.network.test_routers.RoutersIpV6Test.test_create_show_list_update_delete_router
Test Case 15: Create, Update, Delete, List and Show an IPv6 Security Group
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_list_update_show_delete_security_group
Test Case 16: Create, Delete and Show Security Group Rules
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_show_delete_security_group_rule
Test Case 17: List All Security Groups
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_list_security_groups
Test Case 18: IPv6 Address Assignment - Dual Stack, SLAAC, DHCPv6 Stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
Test Case 19: IPv6 Address Assignment - Dual Net, Dual Stack, SLAAC, DHCPv6 Stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
Test Case 20: IPv6 Address Assignment - Multiple Prefixes, Dual Stack, SLAAC, DHCPv6 Stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_dhcpv6_stateless
Test Case 21: IPv6 Address Assignment - Dual Net, Multiple Prefixes, Dual Stack, SLAAC, DHCPv6 Stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_dhcpv6_stateless
Test Case 22: IPv6 Address Assignment - Dual Stack, SLAAC
tempest.scenario.test_network_v6.TestGettingAddress.test_slaac_from_os
Test Case 23: IPv6 Address Assignment - Dual Net, Dual Stack, SLAAC
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_slaac_from_os
Test Case 24: IPv6 Address Assignment - Multiple Prefixes, Dual Stack, SLAAC
tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
Test Case 25: IPv6 Address Assignment - Dual Net, Dual Stack, Multiple Prefixes, SLAAC
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac
Filtering Packets Based on Security Rules and Port Security in Data Path
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_port_security_macspoofing_port
tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_cross_tenant_traffic
tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_in_tenant_traffic
tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_multiple_security_groups
tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_port_security_disable_security_group
tempest.scenario.test_security_groups_basic_ops.TestSecurityGroupsBasicOps.test_port_update_new_security_group
Dynamic Network Runtime Operations Through the Life of a VNF
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_hotplug_nic
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_subnet_details
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_instance_port_admin_state
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_update_router_admin_state
Correct Behavior after Common Virtual Machine Life Cycle Events
tempest.scenario.test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration_revert
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_pause_unpause
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_reboot
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_rebuild
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_resize
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_stop_start
tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_suspend_resume
tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_server_sequence_suspend_resume
tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_volume_backed_server_confirm
tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_volume_backed_instance
Simple Virtual Machine Resource Scheduling on Multiple Nodes
tempest.scenario.test_server_multinode.TestServerMultinode.test_schedule_to_all_nodes
tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_multiple_server_groups_with_same_name_policy
tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_server_group_with_affinity_policy
tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_create_delete_server_group_with_anti_affinity_policy
tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_list_server_groups
tempest.api.compute.servers.test_server_group.ServerGroupTestJSON.test_show_server_group
Forwarding Packets Through Virtual Networks in Data Path
tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_mtu_sized_frames

Functest

OPNFV FUNCTEST developer guide
Introduction

Functest is a project dealing with functional testing. Functest produces its own internal test cases but can also be considered as a framework to support feature and VNF onboarding project testing. Functest developed a TestAPI and defined a test collection framework that can be used by any OPNFV project.

Therefore there are many ways to contribute to Functest. You can:

  • Develop new internal test cases
  • Integrate the tests from your feature project
  • Develop the framework to ease the integration of external test cases
  • Develop the API / Test collection framework
  • Develop dashboards or automatic reporting portals

This document describes how, as a developer, you may interact with the Functest project. The first section details the main working areas of the project. The second part is a list of “How to” guides to help you join the Functest family, whatever your field of interest is.

Functest developer areas
Functest High level architecture

Functest is a project delivering a test container dedicated to OPNFV. It includes the tools, the scripts and the test scenarios.

Functest can be described as follows:

+----------------------+
|                      |
|   +--------------+   |                  +-------------------+
|   |              |   |    Public        |                   |
|   | Tools        |   +------------------+      OPNFV        |
|   | Scripts      |   |                  | System Under Test |
|   | Scenarios    |   +------------------+                   |
|   |              |   |    Management    |                   |
|   +--------------+   |                  +-------------------+
|                      |
|    Functest Docker   |
|                      |
+----------------------+
Functest internal test cases

The internal test cases in Danube are:

  • api_check
  • cloudify_ims
  • connection_check
  • vping_ssh
  • vping_userdata
  • odl
  • rally_full
  • rally_sanity
  • snaps_health_check
  • tempest_full_parallel
  • tempest_smoke_serial

By internal, we mean that these particular test cases have been developed and/or integrated by Functest contributors and the associated code is hosted in the Functest repository. An internal case can be fully developed or a simple integration of upstream suites (e.g. Tempest/Rally, developed in OpenStack, are just integrated in Functest). The structure of this repository is detailed in [1]. The main internal test cases are in the opnfv_tests subfolder of the repository; the internal test cases are:

  • sdn: odl, onos
  • openstack: api_check, connection_check, snaps_health_check, vping_ssh, vping_userdata, tempest_*, rally_*, snaps_smoke
  • vnf: cloudify_ims

If you want to create a new test case you will have to create a new folder under the testcases directory.

Functest external test cases

The external test cases are inherited from other OPNFV projects, especially the feature projects.

The external test cases are:

  • barometer
  • bgpvpn
  • doctor
  • domino
  • odl-netvirt
  • onos
  • fds
  • multisite
  • netready
  • orchestra_ims
  • parser
  • promise
  • refstack_defcore
  • security_scan
  • snaps_smoke
  • sfc-odl
  • vyos_vrouter

The code to run these test cases may be directly in the repository of the project. There is also a features sub-directory under the opnfv_tests directory that may be used (it can be useful if you want to reuse the Functest library).

Functest framework

Functest can be considered as a framework. Functest is released as a Docker image, including tools, scripts and a CLI to prepare the environment and run tests. It simplifies the integration of external test suites in the CI pipeline and provides commodity tools to collect and display results.

Since Colorado, test categories, also known as tiers, have been created to group similar tests, provide consistent sub-lists and ultimately optimize test duration for CI (see the How To section).

The definition of the tiers has been agreed by the testing working group.

The tiers are:
  • healthcheck
  • smoke
  • features
  • components
  • performance
  • vnf
  • stress
Functest abstraction classes

In order to harmonize test integration, 3 abstraction classes have been introduced in Danube:

  • testcase: base for any test case
  • feature_base: abstraction for feature project
  • vnf_base: abstraction for vnf onboarding

The goal is to unify the way tests are run from Functest.

feature_base and vnf_base inherit from testcase:

     +-----------------------------------------+
     |                                         |
     |         Testcase_base                   |
     |                                         |
     |         - init()                        |
     |         - run()                         |
     |         - publish_report()              |
     |         - check_criteria()              |
     |                                         |
     +-----------------------------------------+
            |                       |
            V                       V
+--------------------+   +--------------------------+
|                    |   |                          |
|    feature_base    |   |      vnf_base            |
|                    |   |                          |
|  - prepare()       |   |  - prepare()             |
|  - execute()       |   |  - deploy_orchestrator() |
|  - post()          |   |  - deploy_vnf()          |
|  - parse_results() |   |  - test_vnf()            |
|                    |   |  - clean()               |
|                    |   |  - execute()             |
|                    |   |                          |
+--------------------+   +--------------------------+
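
As a rough illustration of this pattern (the class and method names below are simplified stand-ins for illustration, not the actual Functest code), a feature-style case only fills in the hooks while the base class drives the run:

class TestcaseBase(object):
    """Simplified stand-in for the Functest test case base."""

    def run(self):
        self.prepare()
        status = self.execute()
        self.post()
        return 'PASS' if status == 0 else 'FAIL'


class MyFeature(TestcaseBase):
    def prepare(self):
        pass          # e.g. source credentials, create prerequisite resources

    def execute(self):
        return 0      # 0 means success for the calling framework

    def post(self):
        pass          # e.g. parse results, clean up, publish results
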
Functest util classes

In order to simplify the creation of test cases, Functest provides some functions that can be used by any feature or internal test case. Several features are supported, such as logging, configuration management and OpenStack capabilities (snapshot, clean, tacker, ...). These functions can be found under <repo>/functest/utils and can be described as follows:

functest/utils/
|-- config.py
|-- constants.py
|-- env.py
|-- functest_logger.py
|-- functest_utils.py
|-- openstack_clean.py
|-- openstack_snapshot.py
|-- openstack_tacker.py
`-- openstack_utils.py

Note that for OpenStack, keystone v3 is now deployed by default by compass, fuel and joid in Danube. All installers still support keystone v2 (deprecated in the next version).
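
As a hedged usage sketch (the helper names Logger and get_nova_client are assumptions based on the modules listed above and may differ in your Functest version):

import functest.utils.functest_logger as ft_logger
import functest.utils.openstack_utils as os_utils

# Assumed helpers: a project-wide logger and an authenticated nova client.
logger = ft_logger.Logger("my_case").getLogger()
nova_client = os_utils.get_nova_client()
logger.info("Hypervisors: %s", nova_client.hypervisors.list())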

Test collection framework

The OPNFV testing group created a test collection database to collect the test results from CI:

Any test project running on any lab integrated in CI can push the results to this database. This database can be used to see the evolution of the tests and compare the results versus the installers, the scenarios or the labs.

Overall Architecture

The Test result management can be summarized as follows:

+-------------+    +-------------+    +-------------+
|             |    |             |    |             |
|   Test      |    |   Test      |    |   Test      |
| Project #1  |    | Project #2  |    | Project #N  |
|             |    |             |    |             |
+-------------+    +-------------+    +-------------+
         |               |               |
         V               V               V
     +-----------------------------------------+
     |                                         |
     |         Test Rest API front end         |
     |  http://testresults.opnfv.org/test      |
     |                                         |
     +-----------------------------------------+
         A                |
         |                V
         |     +-------------------------+
         |     |                         |
         |     |    Test Results DB      |
         |     |         Mongo DB        |
         |     |                         |
         |     +-------------------------+
         |
         |
   +----------------------+
   |                      |
   |    test Dashboard    |
   |                      |
   +----------------------+
TestAPI description

The TestAPI is used to declare pods, projects, test cases and test results. Pods are the test platforms used to run the tests. The results pushed to the database are related to pods, projects and cases. If you try to push results of a test run on a non-referenced pod, the API will return an error message.

An additional method, dashboard, was added in the Brahmaputra release to post-process the raw results (it was deprecated in Colorado).

The data model is very basic; 5 objects are created:

  • Pods
  • Projects
  • Testcases
  • Results
  • Scenarios

The code of the API is hosted in the releng repository [6]. The static documentation of the API can be found at [17]. The TestAPI has been dockerized and may be installed locally in your lab. See [15] for details.

The deployment of the TestAPI has been automated. A Jenkins job manages:

  • the unit tests of the TestAPI
  • the creation of a new Docker image
  • the deployment of the new TestAPI
  • the archive of the old TestAPI
  • the backup of the Mongo DB
TestAPI Authorization

PUT/DELETE/POST operations of the TestAPI now require token-based authorization. The token needs to be added to the request using the 'X-Auth-Token' header in order to access the database.

e.g.:
headers['X-Auth-Token']

The value of the header, i.e. the token, can be accessed via the Jenkins environment variable TestApiToken. The token value is added as a masked password.

headers['X-Auth-Token'] = os.environ.get('TestApiToken')

The above example is in Python. Token-based authentication has been added so that only the Jenkins jobs of CI pods have access to the database.

Please note that currently token authorization is implemented but is not yet enabled.
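
A hedged sketch of a result POST carrying the token (the endpoint below is illustrative; the real URL is the db_url configured for your deployment):

import json
import os

import requests

url = "http://testresults.opnfv.org/test/api/v1/results"   # illustrative endpoint
headers = {'Content-Type': 'application/json',
           'X-Auth-Token': os.environ.get('TestApiToken')}
payload = {"project_name": "functest", "case_name": "vping_ssh"}
requests.post(url, data=json.dumps(payload), headers=headers)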

An automatic reporting page has been created in order to provide a consistent view of the scenarios. On this page, each scenario is evaluated according to test criteria. The code for the automatic reporting is available at [8].

The results are collected from the centralized database every day and, per scenario, a score is calculated based on the results from the last 10 days. This score is the sum of the individual test scores. Each test case has a success criterion reflected in the criteria field of the results.

Considering an instance of the scenario os-odl_l2-nofeature-ha, the score is the sum of the scores of all the runnable tests from the categories (tiers healthcheck, smoke and features) corresponding to this scenario.

Test                   Apex   Compass   Fuel   Joid
vPing_ssh               X       X        X      X
vPing_userdata          X       X        X      X
tempest_smoke_serial    X       X        X      X
rally_sanity            X       X        X      X
odl                     X       X        X      X
promise                                  X      X
doctor                  X                X
security_scan           X
parser                                   X
copper                  X                       X

src: colorado (see release note for the last matrix version)

All the test cases listed in the table are runnable on os-odl_l2-nofeature scenarios. If no result is available or if all the results are failed, the test case gets 0 points. If it was successful at least once but not during the last 4 runs, the case gets 1 point (it worked once). If at least 3 of the last 4 runs were successful, the case gets 2 points. If the last 4 runs of the test are successful, the test gets 3 points.

In the example above, the target score for fuel/os-odl_l2-nofeature-ha is 3x6 = 18 points.

The scenario is validated per installer when we get 3 points for all individual test cases (e.g. 18/18). Please note that complex or long-duration tests are not considered for the scoring. Their success criteria are not always easy to define and may require specific hardware configurations. These results however provide a good level of trust in the scenario.
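
As an illustrative reading of these rules (this is not the actual reporting code, which lives in [8]), the per-test score can be sketched as:

def test_case_score(window_results):
    """window_results: list of statuses ('PASS'/'FAIL') from the sliding
    window, oldest first."""
    if not window_results or 'PASS' not in window_results:
        return 0                      # no result available or never successful
    last_four = window_results[-4:]
    passes = last_four.count('PASS')
    if passes == 4:
        return 3                      # last 4 runs all successful
    if passes >= 3:
        return 2                      # at least 3 of the last 4 runs successful
    return 1                          # worked at least once, but not recently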

A web page is automatically generated every day to display the status. This page can be found at [9]. For the status, click on the Status menu; you may also get feedback for the vims and tempest_smoke_serial test cases.

Any validated scenario is stored in a local file on the web server. In fact, as we are using a sliding window to get results, it may happen that a successful scenario is no longer run (because it is considered stable) and then the number of iterations (4 needed) would not be sufficient to get the green status.

Please note that other test cases (e.g. sfc_odl, bgpvpn) also need ODL configuration add-ons and, as a consequence, specific scenarios. They are not considered as runnable on the generic odl_l2 scenario.

Dashboard

The dashboard is used to provide a consistent view of the results collected in CI. The results shown on the dashboard are post-processed from the database, which only contains raw results.

In Brahmaputra, we created a basic dashboard. Since Colorado, it was decided to adopt the ELK framework. Mongo DB results are extracted to feed the Elasticsearch database ([7]).

A script was developed to build the Elasticsearch data set. This script can be found in [16].

For the next versions, it was decided to integrate the Bitergia dashboard. Bitergia already provides a dashboard for code and infrastructure. A new Test tab will be added. The dataset will be built by consuming the TestAPI.

How TOs
How does Functest work?

The installation and configuration of the Functest docker image is described in [1].

The procedure to start tests is described in [2].

How can I contribute to Functest?

If you are already a contributor to any OPNFV project, you can contribute to Functest. If you are totally new to OPNFV, you must first create your Linux Foundation account, then contact us so that we can declare you in the repository database.

We distinguish 2 levels of contributors:

  • the standard contributor can push patches and vote +1/0/-1 on any Functest patch
  • the committer can vote -2/-1/0/+1/+2 and merge

Functest committers are promoted by the Functest contributors.

Where can I find some help to start?

This guide is made for you. You can also have a look at the project wiki page [10]. There are references to documentation, video tutorials, tips...

You can also directly contact us by mail with [Functest] prefix in the title at opnfv-tech-discuss@lists.opnfv.org or on the IRC chan #opnfv-functest.

What kind of testing do you do in Functest?

Functest focuses on functional testing. The results must be PASS or FAIL. We do not deal with performance and/or qualification tests. We consider OPNFV as a black box and execute our tests from the jumphost according to the Pharos reference technical architecture.

Upstream test suites are integrated (Rally/Tempest/ODL/ONOS, ...). If needed, Functest may temporarily bootstrap testing activities if they are identified but not yet covered by an existing testing project (e.g. security_scan before the creation of the security repository).

How are test constraints defined?

Test constraints are defined according to 2 parameters:

  • The scenario (DEPLOY_SCENARIO env variable)
  • The installer (INSTALLER_TYPE env variable)

A scenario is a formal description of the system under test. The rules to define a scenario are described in [4].

These 2 constraints are considered to determine whether the test is runnable or not (e.g. there is no need to run the onos suite on an odl scenario).

In the test declaration for CI, the test owner shall indicate these 2 constraints. The file testcases.yaml [5] must be patched in git to include new test cases. A more elaborate, template-based system is planned for future releases.

For each dependency, it is possible to define a regex:

name: promise
criteria: 'success_rate == 100%'
description: >-
    Test suite from Promise project.
dependencies:
    installer: '(fuel)|(joid)'
    scenario: ''

In the example above, it means that the promise test will be runnable only with the joid or fuel installers, on any scenario.

The vims dependencies mean any installer, and exclude onos scenarios as well as odl scenarios with bgpvpn:

name: vims
criteria: 'status == "PASS"'
description: >-
    This test case deploys an OpenSource vIMS solution from Clearwater
    using the Cloudify orchestrator. It also runs some signaling traffic.
dependencies:
    installer: ''
    scenario: '(ocl)|(nosdn)|^(os-odl)((?!bgpvpn).)*$'
How to write and check constraint regexes?

Regexes are standard Python regexes. You can have a look at [11].

You can also easily test your regex via an online regex checker such as [12]. Put your scenario in the TEST STRING window (e.g. os-odl_l3-ovs-ha), put your regex in the REGULAR EXPRESSION window, then you can test your rule.
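
The same check can also be done locally with the standard re module (the scenario strings below are just examples):

import re

scenario_regex = r'(ocl)|(nosdn)|^(os-odl)((?!bgpvpn).)*$'      # vims example above
print(bool(re.search(scenario_regex, 'os-odl_l3-ovs-ha')))      # True: runnable
print(bool(re.search(scenario_regex, 'os-odl_l2-bgpvpn-ha')))   # False: excluded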

How to know which test I can run?

You can use the API [13]. The static declaration is in git [5]

If you are in a Functest docker container (assuming that the environment has been prepared), just use the CLI.

You can get the list per Test cases or by Tier:

# functest testcase list
healthcheck
vping_ssh
vping_userdata
tempest_smoke_serial
rally_sanity
odl
doctor
security_scan
tempest_full_parallel
rally_full
vims
# functest tier list
- 0. healthcheck:
['healthcheck']
- 1. smoke:
['vping_ssh', 'vping_userdata', 'tempest_smoke_serial', 'rally_sanity']
- 2. sdn_suites:
['odl']
- 3. features:
['doctor', 'security_scan']
- 4. openstack:
['tempest_full_parallel', 'rally_full']
- 5. vnf:
['vims']
How to manually start Functest tests?

Assuming that you are connected to the jumphost and that the system is “Pharos compliant”, i.e. the technical architecture is compatible with the one defined in the Pharos project:

# docker pull opnfv/functest:latest
# envs="-e INSTALLER_TYPE=fuel -e INSTALLER_IP=10.20.0.2 -e DEPLOY_SCENARIO=os-odl_l2-nofeature-ha -e CI_DEBUG=true"
# sudo docker run --privileged=true -id ${envs} opnfv/functest:latest /bin/bash

Then you must connect to the docker container and source the credentials:

# docker ps (copy the id)
# docker exec -ti <container_id> bash
# source $creds

You must first check if the environment is ready:

# functest env status
Functest environment ready to run tests.

If not ready, prepare the env by launching:

# functest env prepare
Functest environment ready to run tests.

Once the Functest env is ready, you can use the CLI to start tests.

You can run test cases per test case or per tier:
# functest testcase run <case name> or # functest tier run <tier name>

e.g:

# functest testcase run tempest_smoke_serial
# functest tier run features

If you want to run all the tests you can type:

# functest testcase run all

If you want to run all the tiers (which is ultimately the same as running all the test cases) you can type:

# functest tier run all
How to declare my tests in Functest?

If you want to add new internal test cases, you can submit patch under the testcases directory of Functest repository.

For feature test integration, the code can be kept into your own repository. The Functest files to be modified are:

  • functest/docker/Dockerfile: get your code in Functest container
  • functest/ci/testcases.yaml: reference your test and its associated constraints
Dockerfile

This file lists the repositories (internal or external) to be cloned in the Functest container. You can also add external packages:

RUN git clone https://gerrit.opnfv.org/gerrit/<your project> ${REPOS_DIR}/<your project>
testcases.yaml

All the test cases that must be run from CI / CLI must be declared in ci/testcases.yaml.

This file is used to get the constraints related to the test:

name: <my_super_test_case>
criteria: <not used yet in Colorado, could be > 'PASS', 'rate > 90%'
description: >-
    <the description of your super test suite>
dependencies:
    installer: regex related to the installer, e.g. 'fuel', '(apex)|(joid)'
    scenario: regex related to the scenario e.g. 'ovs*no-ha'

You must declare your test case in one of the categories (tiers).

If you are integrating test suites from a feature project, the default category is features.

How to select my list of tests for CI?

Functest can be run automatically from CI; a Jenkins job is usually called after a fresh OPNFV installation. By default we try to run all the possible tests (see [14], called from the Functest Jenkins job):

cmd="python ${FUNCTEST_REPO_DIR}/ci/run_tests.py -t all ${flags}"

Each case can be configured as daily and/or weekly task. Weekly tasks are used for long duration or experimental tests. Daily tasks correspond to the minimum set of test suites to validate a scenario.

When executing run_tests.py, a check based on the Jenkins build tag is performed to detect whether it is a daily and/or a weekly run.
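
A minimal sketch of this kind of check (the real parsing in run_tests.py may differ):

import os

build_tag = os.environ.get('BUILD_TAG', '')
if 'weekly' in build_tag:
    run_type = 'weekly'     # long-duration or experimental suites included
else:
    run_type = 'daily'      # minimum set of suites needed to validate a scenario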

In your CI you can customize the list of tests you want to run, by case or by tier; just change the line:

cmd="python ${FUNCTEST_REPO_DIR}/ci/run_tests.py -t <whatever you want> ${flags}"

e.g.:

cmd="python ${FUNCTEST_REPO_DIR}/ci/run_tests.py -t healthcheck,smoke ${flags}"

This command will run all the test cases of the first 2 tiers, i.e. healthcheck, connection_check, api_check, vping_ssh, vping_userdata, snaps_smoke, tempest_smoke_serial and rally_sanity.

How to push your results into the Test Database

The test database is used to collect test results. By default it is enabled only for CI tests from Production CI pods.

The architecture and associated API is described in previous chapter. If you want to push your results from CI, you just have to call the API at the end of your script.

You can also reuse a python function defined in functest_utils.py:

import json

import requests


# get_installer_type() below is another helper defined in functest_utils.py
def push_results_to_db(db_url, case_name, logger, pod_name, version, payload):
  """
  POST results to the Result target DB
  """
  url = db_url + "/results"
  installer = get_installer_type(logger)
  params = {"project_name": "functest", "case_name": case_name,
            "pod_name": pod_name, "installer": installer,
            "version": version, "details": payload}

  headers = {'Content-Type': 'application/json'}
  try:
      r = requests.post(url, data=json.dumps(params), headers=headers)
      if logger:
          logger.debug(r)
      return True
  except Exception, e:
      print "Error [push_results_to_db('%s', '%s', '%s', '%s', '%s')]:" \
          % (db_url, case_name, pod_name, version, payload), e
      return False
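
A hedged usage sketch of the helper above (the db_url, pod and version values are placeholders for illustration):

pushed = push_results_to_db(db_url="http://testresults.opnfv.org/test/api/v1",
                            case_name="my_case",
                            logger=None,
                            pod_name="my-pod",
                            version="danube",
                            payload={"status": "PASS", "duration": 42})
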
Where can I find the documentation on the test API?

http://artifacts.opnfv.org/releng/docs/testapi.html

How to exclude Tempest case from default Tempest smoke suite?

The Tempest default smoke suite deals with 165 test cases. Since Colorado the success criterion is 100%, i.e. if 1 test fails, the success criterion is not met for the scenario.

It is necessary to exclude some test cases that are expected to fail due to known upstream bugs (see release notes).

A file has been created for such operation: https://git.opnfv.org/cgit/functest/tree/functest/opnfv_tests/openstack/tempest/custom_tests/blacklist.txt.

It can be described as follows:

-
    scenarios:
        - os-odl_l2-bgpvpn-ha
        - os-odl_l2-bgpvpn-noha
    installers:
        - fuel
        - apex
    tests:
        - tempest.api.compute.servers.test_create_server.ServersTestJSON.test_list_servers
        - tempest.api.compute.servers.test_create_server.ServersTestJSON.test_verify_server_details
        - tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_list_servers
        - tempest.api.compute.servers.test_create_server.ServersTestManualDisk.test_verify_server_details
        - tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_reboot_server_hard
        - tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops
        - tempest.scenario.test_server_basic_ops.TestServerBasicOps.test_server_basic_ops
        - tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern.test_volume_boot_pattern
        - tempest.scenario.test_volume_boot_pattern.TestVolumeBootPatternV2.test_volume_boot_pattern

Please note that each exclusion must be justified. The goal is not to exclude test cases just because they do not pass. Several scenarios have reached the 100% criterion, so the patch submitted to exclude cases is expected to indicate the reasons for the exclusion.

How do I know the Functest status of a scenario?

A Functest automatic reporting page is generated daily. This page is dynamically created through a cron job and is based on the results stored in the Test DB. You can access this reporting page: http://testresults.opnfv.org/reporting

See https://wiki.opnfv.org/pages/viewpage.action?pageId=6828617 for details.

I have tests, to which category should I declare them?

CATEGORIES/TIERS description:

healthcheck: simple OpenStack healthcheck test cases that validate the basic operations in OpenStack
smoke: set of smoke test cases/suites to validate the most common OpenStack and SDN controller operations
features: test cases that validate a specific feature on top of OPNFV. Those come from feature projects and need a bit of support for integration
components: advanced OpenStack tests: full Tempest, full Rally
performance: out of Functest scope
vnf: test cases related to the deployment of an open source VNF, including an orchestrator

The main ambiguity could be between features and VNF. In fact, sometimes you have to spawn VMs to demonstrate the capabilities of the feature you introduced. We recommend declaring your test in the features category.

The VNF category is really dedicated to tests that include:

  • creation of resources
  • deployment of an orchestrator/VNFM
  • deployment of the VNF
  • test of the VNFM
  • freeing of resources

The goal is not to study a particular feature on the infrastructure but to have a complete end-to-end test of a VNF automatically deployed in CI. Moreover, VNF tests are run in weekly jobs (once a week), while feature tests are in daily jobs and are used to compute a scenario score.

Where are the logs?

Functest deals with internal and external testcases. Each testcase can generate logs.

Since Colorado we have introduced the possibility to push the logs to the artifact repository. A new script (https://git.opnfv.org/releng/tree/utils/push-test-logs.sh) has been created for CI.

When called, and assuming that the POD is authorized to push the logs to artifacts, the script will push all the results or logs locally stored under /home/opnfv/functest/results/.

If the POD is not connected to CI, logs are not pushed. But in both cases, logs are stored in /home/opnfv/functest/results in the container. Projects are encouraged to push their logs here.

Since Colorado it is also easy for feature projects to use this mechanism by adding the log file as the output_file parameter when calling execute_command from the functest_utils library:

ret_val = functest_utils.execute_command(cmd, output_file=log_file)
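
For instance (a hedged sketch where the command and log path below are illustrative placeholders):

import functest.utils.functest_utils as functest_utils

cmd = './run_my_feature_suite.sh'                            # hypothetical command
log_file = '/home/opnfv/functest/results/my_feature.log'
ret_val = functest_utils.execute_command(cmd, output_file=log_file)
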
How does Functest deal with VNF onboarding?

VNF onboarding was introduced in Brahmaputra through the automation of a Clearwater vIMS deployment using the Cloudify orchestrator.

This automation has been described at OpenStack summit Barcelona: https://youtu.be/Jr4nG74glmY

The goal of Functest consists in testing OPNFV from a functional perspective: the NFVI and/or the features developed in OPNFV. Feature test suites are provided by the feature project. Functest just simplifies the integration of the suite into the CI and gives a consolidated view of the tests per scenario.

Functest does not develop VNFs.

Functest does not test any MANO stack.

OPNFV projects dealing with VNF onboarding

Testing VNFs is not the main goal; however, it gives interesting and realistic feedback on OPNFV as a Telco cloud.

Onboarding a VNF also allows testing a full stack: orchestrator + VNF.

Functest is VNF and MANO stack agnostic.

An internship has been initiated to reference open source VNFs: Intern Project Open Source VNF catalog.

New projects dealing with orchestrators or VNFs are candidates for Danube.

The 2 projects dealing with orchestration are:

  • orchestra (Openbaton)
  • opera (Open-O)

The Models project addresses various goals for promoting the availability and convergence of information and/or data models related to NFV service/VNF management, as being defined in standards bodies (SDOs) and as developed in open source projects.

Functest VNF onboarding

In order to simplify VNF onboarding a new abstraction class has been developed in Functest.

This class is based on vnf_base and can be described as follows:

+-------------+        +-------------+
|  test_base  |------->|  vnf_base   |
+-------------+        +-------------+
                           |_ prepare
                           |_ deploy_orchestrator (optional)
                           |_ deploy_vnf
                           |_ test_vnf
                           |_ clean

Several methods are declared in vnf_base:

  • prepare
  • deploy_orchestrator
  • deploy_vnf
  • test_vnf
  • clean

deploy_vnf and test_vnf are mandatory.

prepare will create a user and a project.

How to declare your orchestrator/VNF?
  1. test declaration

You must declare your testcase in the file <Functest repo>/functest/ci/testcases.yaml

  2. configuration

You can specify some configuration parameters in config_functest.yaml.

  3. implement your test

Create your own VnfOnboarding file

You must create your entry point through a Python class, as referenced in the configuration file.

e.g. aaa => creation of the file <Functest repo>/functest/opnfv_tests/vnf/aaa/aaa.py

The class shall inherit vnf_base. You must implement the methods deploy_vnf() and test_vnf() and may implement deploy_orchestrator() (a minimal skeleton is sketched after this list).

You can call the code from your repository (but you need to add the repository to Functest if it is not already the case).

  4. success criteria

So far we consider the test as PASS if both deploy_vnf and test_vnf are PASS (see the example in aaa).
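
A minimal, self-contained skeleton of this flow (VnfBase below is a stand-in for the real vnf_base abstraction, not the actual Functest class):

class VnfBase(object):
    """Stand-in for vnf_base: drives the onboarding sequence."""

    def run(self):
        self.prepare()
        deployed = self.deploy_vnf()
        tested = self.test_vnf()
        self.clean()
        return 'PASS' if (deployed and tested) else 'FAIL'

    def prepare(self):
        pass                      # vnf_base creates a user and a project here

    def clean(self):
        pass                      # release the created resources


class AaaVnf(VnfBase):            # e.g. functest/opnfv_tests/vnf/aaa/aaa.py
    def deploy_vnf(self):         # mandatory
        return True

    def test_vnf(self):           # mandatory
        return True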

References

[1]: http://artifacts.opnfv.org/functest/docs/configguide/index.html Functest configuration guide

[2]: http://artifacts.opnfv.org/functest/docs/userguide/index.html functest user guide

[3]: https://wiki.opnfv.org/opnfv_test_dashboard Brahmaputra dashboard

[4]: https://wiki.opnfv.org/display/INF/CI+Scenario+Naming

[5]: https://git.opnfv.org/cgit/functest/tree/ci/testcases.yaml

[6]: https://git.opnfv.org/cgit/releng/tree/utils/test/result_collection_api

[7]: https://git.opnfv.org/cgit/releng/tree/utils/test/scripts

[8]: https://git.opnfv.org/cgit/releng/tree/utils/test/reporting/functest

[9]: http://testresults.opnfv.org/reporting/

[10]: https://wiki.opnfv.org/opnfv_functional_testing

[11]: https://docs.python.org/2/howto/regex.html

[12]: https://regex101.com/

[13]: http://testresults.opnfv.org/test/api/v1/projects/functest/cases

[14]: https://git.opnfv.org/cgit/releng/tree/jjb/functest/functest-daily.sh

[15]: https://git.opnfv.org/cgit/releng/tree/utils/test/result_collection_api/README.rst

[16]: https://git.opnfv.org/cgit/releng/tree/utils/test/scripts/mongo_to_elasticsearch.py

[17]: http://artifacts.opnfv.org/releng/docs/testapi.html

OPNFV main site: http://www.opnfv.org

OPNFV functional test page: https://wiki.opnfv.org/opnfv_functional_testing

IRC support chan: #opnfv-functest

OpenRC: http://docs.openstack.org/user-guide/common/cli_set_environment_variables_using_openstack_rc.html

Rally installation procedure: https://rally.readthedocs.org/en/latest/tutorial/step_0_installation.html

config_functest.yaml : https://git.opnfv.org/cgit/functest/tree/functest/ci/config_functest.yaml

QTIP

QTIP Developer Guide
Overview

QTIP uses Python as the primary programming language and builds the framework from the following packages:

Module     Package
api        Connexion - API first applications with OpenAPI/Swagger and Flask
cli        Click - the “Command Line Interface Creation Kit”
template   Jinja2 - a full featured template engine for Python
docs       sphinx - a tool that makes it easy to create intelligent and beautiful documentation
testing    pytest - a mature full-featured Python testing tool that helps you write better programs
Source Code

The structure of the repository is based on the recommended sample in The Hitchhiker’s Guide to Python.

Path            Content
./benchmarks/   builtin benchmark assets including plan, QPI and metrics
./contrib/      independent project/plugin/code contributed to QTIP
./docker/       configuration for building the Docker image for QTIP deployment
./docs/         release notes, user and developer documentation, design proposals
./legacy/       legacy obsoleted code that is unmaintained but kept for reference
./opt/          optional components, e.g. scripts to set up infrastructure services for QTIP
./qtip/         the actual package
./tests/        package functional and unit tests
./third-party/  third-party code included in the QTIP project
Coding Style

QTIP follows the OpenStack Style Guidelines for source code and commit messages.

Specifically, it is recommended to link each patch set with a JIRA issue. Put:

JIRA: QTIP-n

in commit message to create an automatic link.

Testing

All testing-related code is stored in ./tests/

Path Content
./tests/data/ data fixtures for testing
./tests/unit/ unit test for each module, follow the same layout as ./qtip/
./conftest.py pytest configuration in project scope

tox is used to automate the testing tasks:

cd <project_root>
pip install tox
tox

The test cases are written in pytest. You may run them selectively with:

pytest tests/unit/reporter
Architecture

In Danube, QTIP releases its standalone mode, which is also known as solo:

QTIP standalone mode

The runner can be launched from the CLI (command line interface) or the API (application programming interface) and drives the testing jobs. The generated data, including raw performance data and the testing environment, are fed to the collector. Performance metrics are parsed from the raw data and used for QPI calculation. The benchmark report is then rendered with the benchmarking results.

The execution can be detailed in the diagram below:

QTIP execution sequence
CLI - Command Line Interface

QTIP consists of different tools (metrics) to benchmark the NFVI. These metrics fall under different NFVI subsystems (QPIs) such as compute, storage and network. A plan consists of one or more QPIs, depending upon how the end user wants to measure performance. The CLI is designed to help the user execute benchmarks and view the respective scores.

Framework

QTIP CLI has been created using the Python package Click, the “Command Line Interface Creation Kit”. It has been chosen for a number of reasons: it presents the user with a simple yet powerful API to build complex applications, and one of its most striking features is command nesting.

As explained, QTIP consists of metrics, QPIs and plans. The CLI is designed to provide an interface to all these components. It is responsible for execution, as well as providing listings and details of each individual element making up these components.

Design

The CLI’s entry point extends Click’s built-in MultiCommand class. Two of its methods are overridden to provide custom configurations.

import os
import sys

import click

# directory containing the cmd_*.py command modules (assumed location)
cmd_folder = os.path.abspath(os.path.join(os.path.dirname(__file__), 'commands'))


class QtipCli(click.MultiCommand):

    def list_commands(self, ctx):
        rv = []
        for filename in os.listdir(cmd_folder):
            if filename.endswith('.py') and \
                    filename.startswith('cmd_'):
                rv.append(filename[4:-3])
        rv.sort()
        return rv

    def get_command(self, ctx, name):
        try:
            if sys.version_info[0] == 2:
                name = name.encode('ascii', 'replace')
            mod = __import__('qtip.cli.commands.cmd_' + name,
                             None, None, ['cli'])
        except ImportError:
            return
        return mod.cli

Commands and subcommands will then be loaded by the get_command method above.

Extending the Framework

The framework can be easily extended to meet the user's requirements. One example is to override the built-in configurations with user-defined ones. These can be written in a file, loaded via a Click Context and passed through to all the commands.

import ConfigParser  # use configparser on Python 3


class Context:

    def __init__(self):
        self.config = ConfigParser.ConfigParser()
        self.config.read('path/to/configuration_file')

    def get_paths(self):
        paths = self.config.get('section', 'path')
        return paths

The above example loads configuration from user-defined paths, which then need to be provided to the actual command definitions.

from qtip.cli.entry import Context

pass_context = click.make_pass_decorator(Context, ensure=False)

@cli.command('list', help='List the Plans')
@pass_context
def list(ctx):
    plans = Plan.list_all(ctx.paths())
    table = utils.table('Plans', plans)
    click.echo(table)
API - Application Programming Interface

QTIP consists of different tools (metrics) to benchmark the NFVI. These metrics fall under different NFVI subsystems (QPIs) such as compute, storage and network. A plan consists of one or more QPIs, depending upon how the end user wants to measure performance. The API is designed to expose a RESTful interface to the user for executing benchmarks and viewing the respective scores.

Framework

QTIP API has been created using the Python package Connexion. It has been chosen for a number of reasons: it follows the API-First approach to create micro-services, so the API specification is defined first from the client-side perspective, followed by the implementation of the micro-service. It also decouples the business logic from routing and resource mapping, making the design and implementation cleaner.

It has two major components:

API Specifications

The API specification is defined in a YAML or JSON file. Connexion follows the OpenAPI specification to determine the design and maps the endpoints to Python methods.
Micro-service Implementation
Connexion maps the operationId corresponding to every operation in the API specification to a Python method which handles requests and responses.

As explained, QTIP consists of metrics, QPIs and plans. The API is designed to provide a RESTful interface to all these components. It is responsible for providing listings and details of each individual element making up these components.

Design
Specification

The API’s entry point (main) runs a connexion App class object after adding the API specification using the App.add_api method. It loads the specification from the swagger.yaml file by specifying specification_dir.

Connexion reads the API’s endpoints (paths), operations, their request and response parameter details and response definitions from the API specification, i.e. swagger.yaml in this case.

The following example demonstrates the specification for the resource plans.

paths:
  /plans/{name}:
    get:
      summary: Get a plan by plan name
      operationId: qtip.api.controllers.plan.get_plan
      tags:
        - Plan
        - Standalone
      parameters:
        - name: name
          in: path
          description: Plan name
          required: true
          type: string
      responses:
        200:
          description: Plan information
          schema:
            $ref: '#/definitions/Plan'
        404:
          description: Plan not found
          schema:
            $ref: '#/definitions/Error'
        501:
          description: Resource not implemented
          schema:
            $ref: '#/definitions/Error'
        default:
          description: Unexpected error
          schema:
            $ref: '#/definitions/Error'
definitions:
  Plan:
    type: object
    required:
      - name
    properties:
      name:
        type: string
      description:
        type: string
      info:
        type: object
      config:
        type: object

Every operationId in the above operations corresponds to a method in the controllers. QTIP has three controller modules, one each for plan, QPI and metric. Connexion will read these mappings and automatically route endpoints to the business logic.

The Swagger Editor can be used to explore more such examples and to validate the specification.

Controllers

The request is handled through these methods and the response is sent back to the client. Connexion takes care of data validation.

@common.check_endpoint_for_error(resource='Plan')
def get_plan(name):
    plan_spec = plan.Plan(name)
    return plan_spec.content

In the above code, get_plan takes a plan name and returns its content.

The decorator check_endpoint_for_error, defined in common, is used to handle errors and return a suitable error response.

During development the server can be run by passing the specification file (swagger.yaml in this case) to the Connexion CLI:

connexion run <path_to_specification_file> -v
Extending the Framework
Modifying Existing API:

The API can be modified by adding entries in swagger.yaml and adding the corresponding controller method mapped from the operationId.

Adding endpoints:

New endpoints can be defined in the paths section of swagger.yaml. To add a new resource dummy:

paths:
  /dummies:
    get:
      summary: Get all dummies
      operationId: qtip.api.controllers.dummy.get_dummies
      tags:
        - dummy
      responses:
        200:
          description: Dummy information
          schema:
            $ref: '#/definitions/Dummy'
        default:
          description: Unexpected error
          schema:
            $ref: '#/definitions/Error'

And then the model of the resource can be defined in the definitions section.

definitions:
  Dummy:
    type: object
    required:
      - name
    properties:
      name:
        type: string
      description:
        type: string
      id:
        type: string
Adding controller methods:

Methods for handling requests and responses for every operation of the added endpoint can be implemented in the controllers.

In controllers.dummy

def get_dummies():
    all_dummies = [<code to get all dummies>]
    return all_dummies, httplib.OK
Adding error responses

Decorators for handling errors are defined in common.py in api.

from qtip.api import common

@common.check_endpoint_for_error(resource='dummy', operation='get')
def get_dummies():
    all_dummies = [<code to get all dummies>]
    return all_dummies
Adding new API:

The API can easily be extended by adding more APIs to the connexion.App class object using the add_api method.

In __main__

def get_app():
    app = connexion.App(__name__, specification_dir=swagger_dir)
    app.add_api('swagger.yaml', base_path='/v1.0', strict_validation=True)
    return app

It can be extended to add new APIs. The new API should have all endpoints mapped using operationId.

from qtip.api import __main__
my_app = __main__.get_app()
my_app.add_api('new_api.yaml', base_path='api2', strict_validation=True)
my_app.run(host="0.0.0.0", port=5000)
Compute QPI

The compute QPI gives the user an overall score for system compute performance.

Summary

The compute QPI is calibrated using a ZTE E9000 server as a baseline with a score of 2500 points. Higher scores are better, with double the score indicating double the performance. The compute QPI provides three different kinds of scores:

  • Workload Scores
  • Section Scores
  • Compute QPI Scores
Baseline

ZTE E9000 server with 2 deca-core Intel Xeon processors and 128560.0 MB of memory.

Workload Scores

Each time a workload is executed, QTIP calculates a score based on the computer’s performance compared to the baseline performance.

Section Scores

QTIP uses a number of different tests, or workloads, to measure performance. The workloads are divided into five different sections:

Section Detail Indication
Integer Integer workloads measure the integer instruction performance of the host or VM by running the Dhrystone test. All applications rely on integer performance.
Floating point Floating point workloads measure floating point performance by running the Whetstone test. Floating point performance is especially important in video games and digital content creation applications.
Memory Memory workloads measure memory bandwidth by running the RamSpeed test. Software working with large amounts of data relies on memory performance.
DPI DPI workloads measure deep-packet inspection speed by running the nDPI test. Software performing network packet analysis relies on DPI performance.
SSL SSL workloads measure cipher speeds by using the OpenSSL tool. Software ciphering large amounts of data relies on SSL performance.

A section score is the geometric mean of all the workload scores for workloads that are part of the section. These scores are useful for determining the performance of the computer in a particular area.

Compute QPI Scores

The compute QPI score is the weighted arithmetic mean of the five section scores. The compute QPI score provides a way to quickly compare performance across different computers and different platforms without getting bogged down in details.
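
The scoring model can be illustrated with the following sketch (the weights and workload scores below are invented for illustration and are not QTIP defaults):

from functools import reduce


def section_score(workload_scores):
    """Geometric mean of the workload scores belonging to one section."""
    product = reduce(lambda x, y: x * y, workload_scores)
    return product ** (1.0 / len(workload_scores))


def compute_qpi(section_scores, weights):
    """Weighted arithmetic mean of the section scores."""
    return sum(s * w for s, w in zip(section_scores, weights)) / sum(weights)


# hypothetical workload scores; a baseline system would score 2500 everywhere
sections = [section_score([2400.0, 2600.0]),   # Integer
            section_score([2500.0]),           # Floating point
            section_score([2550.0, 2450.0]),   # Memory
            section_score([2500.0]),           # DPI
            section_score([2500.0])]           # SSL
print(compute_qpi(sections, weights=[1, 1, 1, 1, 1]))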

VSPERF

VSPERF

VSPERF is an OPNFV testing project.

VSPERF provides an automated test-framework and comprehensive test suite based on industry standards for measuring data-plane performance of Telco NFV switching technologies as well as physical and virtual network interfaces (NFVI). The VSPERF architecture is switch and traffic generator agnostic and provides full control of software component versions and configurations as well as test-case customization.

The Danube release of VSPERF includes improvements in documentation and capabilities. This includes additional test-cases such as RFC 5481 Latency test and RFC-2889 address-learning-rate test. Hardware traffic generator support is now provided for Spirent and Xena in addition to Ixia. The Moongen software traffic generator is also now fully supported. VSPERF can be used in a variety of modes for configuration and setup of the network and/or for control of the test-generator and test execution.

VSPERF Developer Guide
1. Traffic Generator Integration Guide
Intended Audience

This document is intended to aid those who want to integrate a new traffic generator into the vsperf code. It is expected that the reader has already read the generic part of the VSPERF Design Document.

Let us create a sample traffic generator called sample_tg, step by step.

Step 1 - create a directory

Traffic generator implementations are located in the tools/pkt_gen/ directory, where every implementation has its own dedicated sub-directory. A new directory must be created for each new traffic generator implementation.

E.g.

$ mkdir tools/pkt_gen/sample_tg
Step 2 - create a trafficgen module

Every trafficgen class must inherit from the generic ITrafficGenerator interface class. During its initialization VSPERF scans the content of the pkt_gen directory for all Python modules that inherit from ITrafficGenerator. These modules are automatically added to the list of supported traffic generators.

Example:

Let us create a draft of tools/pkt_gen/sample_tg/sample_tg.py module.

from tools.pkt_gen import trafficgen

class SampleTG(trafficgen.ITrafficGenerator):
    """
    A sample traffic generator implementation
    """
    pass

VSPERF is immediately aware of the new class:

$ ./vsperf --list-trafficgen

Output should look like:

Classes derived from: ITrafficGenerator
======

* Ixia:             A wrapper around the IXIA traffic generator.

* IxNet:            A wrapper around IXIA IxNetwork applications.

* Dummy:            A dummy traffic generator whose data is generated by the user.

* SampleTG:         A sample traffic generator implementation

* TestCenter:       Spirent TestCenter
Step 3 - configuration

All configuration values required for correct traffic generator operation are passed from VSPERF to the traffic generator in a dictionary. Default values shared among all traffic generators are defined in conf/03_traffic.conf within the TRAFFIC dictionary. Default values are loaded by the ITrafficGenerator interface class automatically, so it is not necessary to load them explicitly. If there are any traffic generator specific default values, they should be set within the class specific __init__ function.

VSPERF passes the test specific configuration within the traffic dictionary to every start and send function. The implementation of these functions must therefore ensure that the default values are updated with the testcase specific values. A proper merge of the values is assured by calling the merge_spec function from the conf module.

Example of merge_spec usage in tools/pkt_gen/sample_tg/sample_tg.py module:

from conf import merge_spec

def start_rfc2544_throughput(self, traffic=None, duration=30):
    self._params = {}
    self._params['traffic'] = self.traffic_defaults.copy()
    if traffic:
        self._params['traffic'] = merge_spec(
            self._params['traffic'], traffic)
Step 4 - generic functions

There are some generic functions which every traffic generator should provide. Although these functions are mainly optional, at least an empty implementation must be provided. This is required so that the developer is explicitly aware of these functions.

The connect function is called by the traffic generator controller from its __enter__ method. This function should ensure proper connection initialization between the DUT and the traffic generator. In case such an implementation is not needed, an empty implementation is required.

The disconnect function should perform a clean-up of any connection specific actions called from the connect function.

Example in tools/pkt_gen/sample_tg/sample_tg.py module:

def connect(self):
    pass

def disconnect(self):
    pass
Step 5 - supported traffic types

Currently VSPERF supports three different types of tests for traffic generators; these are identified in vsperf through the traffic type and include:

  • RFC2544 throughput - Send fixed size packets at different rates, using
    traffic configuration, until the minimum rate at which no packet loss is detected is found. Methods implementing it have the suffix _rfc2544_throughput.
  • RFC2544 back2back - Send fixed size packets at a fixed rate, using traffic
    configuration, for a specified time interval. Methods implementing it have the suffix _rfc2544_back2back.
  • continuous flow - Send fixed size packets at a given framerate, using traffic
    configuration, for a specified time interval. Methods implementing it have the suffix _cont_traffic.

In general, both synchronous and asynchronous interfaces must be implemented for each traffic type. Synchronous functions start with the prefix send_. Asynchronous functions use the prefixes start_ and wait_ for the throughput and back2back types, and start_ and stop_ for the continuous traffic type.

Example of synchronous interfaces:

def send_rfc2544_throughput(self, traffic=None, tests=1, duration=20,
                            lossrate=0.0):
def send_rfc2544_back2back(self, traffic=None, tests=1, duration=20,
                           lossrate=0.0):
def send_cont_traffic(self, traffic=None, duration=20):

Example of asynchronous interfaces:

def start_rfc2544_throughput(self, traffic=None, tests=1, duration=20,
                             lossrate=0.0):
def wait_rfc2544_throughput(self):

def start_rfc2544_back2back(self, traffic=None, tests=1, duration=20,
                            lossrate=0.0):
def wait_rfc2544_back2back(self):

def start_cont_traffic(self, traffic=None, duration=20):
def stop_cont_traffic(self):

Description of parameters used by send, start, wait and stop functions:

  • param traffic: A dictionary with detailed definition of traffic pattern. It contains following parameters to be implemented by traffic generator.

    Note: Traffic dictionary has also virtual switch related parameters, which are not listed below.

    Note: There are parameters specific to testing of tunnelling protocols, which are discussed in detail at Integration tests userguide.

    • param traffic_type: One of the supported traffic types, e.g. rfc2544_throughput, rfc2544_continuous or rfc2544_back2back.
    • param frame_rate: Defines desired percentage of frame rate used during continuous stream tests.
    • param bidir: Specifies if generated traffic will be full-duplex (true) or half-duplex (false).
    • param multistream: Defines number of flows simulated by traffic generator. Value 0 disables MultiStream feature.
    • param stream_type: Stream Type defines ISO OSI network layer used for simulation of multiple streams. Supported values:
      • L2 - iteration of destination MAC address
      • L3 - iteration of destination IP address
      • L4 - iteration of destination port of selected transport protocol
    • param l2: A dictionary with data link layer details, e.g. srcmac, dstmac and framesize.
    • param l3: A dictionary with network layer details, e.g. srcip, dstip and proto.
    • param l4: A dictionary with transport layer details, e.g. srcport, dstport.
    • param vlan: A dictionary with vlan specific parameters, e.g. priority, cfi, id and vlan on/off switch enabled.
  • param tests: Number of times the test is executed.

  • param duration: Duration of continuous test or per iteration duration in case of RFC2544 throughput or back2back traffic types.

  • param lossrate: Acceptable lossrate percentage.

Step 6 - passing back results

It is expected that the send, wait and stop methods will return the values measured by the traffic generator within a dictionary. The dictionary keys are defined in ResultsConstants, implemented in core/results/results_constants.py. Please check the sections for RFC2544 Throughput & Continuous and for Back2Back. The same key names should be used by all traffic generator implementations.
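
As a sketch of how a caller typically drives these interfaces (tg and my_traffic are illustrative names; the exact result keys are those defined in ResultsConstants):

# synchronous call: blocks until the measurement has finished
results = tg.send_rfc2544_throughput(traffic=my_traffic, tests=1,
                                     duration=20, lossrate=0.0)

# asynchronous calls: start the measurement, do other work, then collect results
tg.start_rfc2544_throughput(traffic=my_traffic, tests=1,
                            duration=20, lossrate=0.0)
# ... interact with the DUT while traffic is running ...
results = tg.wait_rfc2544_throughput()

# 'results' is a dictionary keyed by the constants defined in
# core/results/results_constants.py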

2. VSPERF Design Document
Intended Audience

This document is intended to aid those who want to modify the vsperf code, or to extend it - for example to add support for new traffic generators, deployment scenarios and so on.

Usage
Example Connectivity to DUT

Establish connectivity to the VSPERF DUT Linux host. If this is in an OPNFV lab, follow the steps provided by Pharos.

The following steps establish the VSPERF environment.

Example Command Lines

List all the cli options:

$ ./vsperf -h

Run all tests that have tput in their name - phy2phy_tput, pvp_tput etc.:

$ ./vsperf --tests 'tput'

As above but override default configuration with settings in ‘10_custom.conf’. This is useful as modifying configuration directly in the configuration files in conf/NN_*.py shows up as changes under git source control:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf --tests 'tput'

Override specific test parameters. Useful for shortening the duration of tests for development purposes:

$ ./vsperf --test-params 'TRAFFICGEN_DURATION=10;TRAFFICGEN_RFC2544_TESTS=1;' \
                         'TRAFFICGEN_PKT_SIZES=(64,)' pvp_tput
Typical Test Sequence

This is a typical flow of control for a test.

_images/vsperf1.png
Configuration

The conf package contains the configuration files (*.conf) for all system components; it also provides a settings object that exposes all of these settings.

Settings are not passed from component to component. Rather they are available globally to all components once they import the conf package.

from conf import settings
...
log_file = settings.getValue('LOG_FILE_DEFAULT')

Settings files (*.conf) are valid python code so can be set to complex types such as lists and dictionaries as well as scalar types:

first_packet_size = settings.getValue('PACKET_SIZE_LIST')[0]
Configuration Procedure and Precedence

Configuration files follow a strict naming convention that allows them to be processed in a specific order. All the .conf files are named NN_name.conf, where NN is a decimal number. The files are processed in order from 00_name.conf to 99_name.conf so that if the name setting is given in both a lower and higher numbered conf file then the higher numbered file is the effective setting as it is processed after the setting in the lower numbered file.

The values in the file specified by --conf-file take precedence over all the other configuration files and do not have to follow the naming convention.
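
For example (the file names and value below are illustrative, not shipped defaults):

# conf/01_example.conf - processed first
LOG_FILE_DEFAULT = 'vsperf.log'

# conf/99_example.conf - processed later, so this value becomes effective
LOG_FILE_DEFAULT = '/var/log/vsperf.log'

# a value set in the file passed via --conf-file would override both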

Configuration of PATHS dictionary

VSPERF uses external tools like Open vSwitch and Qemu for the execution of testcases. These tools may be downloaded and built automatically (see Installation) or installed manually by the user from binary packages. It is also possible to use a combination of both approaches, but it is essential to correctly set the paths to all required tools. These paths are stored within a PATHS dictionary, which is evaluated before the execution of each testcase in order to set up a testcase specific environment. Values selected for testcase execution are internally stored inside the TOOLS dictionary, which is used by VSPERF to execute external tools, load kernel modules, etc.

The default configuration of the PATHS dictionary is spread among three different configuration files to follow the logical grouping of configuration options. A basic description of the PATHS dictionary is placed inside conf/00_common.conf. The configuration specific to DPDK and vswitches is located in conf/02_vswitch.conf. The last part, related to Qemu, is defined inside conf/04_vnf.conf. The default configuration values can be used in case all required tools were downloaded and built automatically by vsperf itself. In case some of the tools were installed manually from binary packages, it will be necessary to modify the content of the PATHS dictionary accordingly.

The dictionary has a specific section of configuration options for every tool type:

  • PATHS['vswitch'] - contains a separate dictionary for each of the vswitches supported by VSPERF

    Example:

    PATHS['vswitch'] = {
       'OvsDpdkVhost': { ... },
       'OvsVanilla' : { ... },
       ...
    }
    
  • PATHS['dpdk'] - contains paths to the dpdk sources, kernel modules and tools (e.g. testpmd)

    Example:

    PATHS['dpdk'] = {
       'type' : 'src',
       'src': {
           'path': os.path.join(ROOT_DIR, 'src/dpdk/dpdk/'),
           'modules' : ['uio', os.path.join(RTE_TARGET, 'kmod/igb_uio.ko')],
           'bind-tool': 'tools/dpdk*bind.py',
           'testpmd': os.path.join(RTE_TARGET, 'app', 'testpmd'),
       },
       ...
    }
    
  • PATHS['qemu'] - contains paths to the qemu sources and executable file

    Example:

    PATHS['qemu'] = {
        'type' : 'bin',
        'bin': {
            'qemu-system': 'qemu-system-x86_64'
        },
        ...
    }
    

Every section specific to the particular vswitch, dpdk or qemu may contain following types of configuration options:

  • option type - is a string, which defines the type of configured paths (‘src’ or ‘bin’) to be selected for a given section:

    • value src means, that VSPERF will use vswitch, DPDK or QEMU built from sources e.g. by execution of systems/build_base_machine.sh script during VSPERF installation
    • value bin means, that VSPERF will use vswitch, DPDK or QEMU binaries installed directly in the operating system, e.g. via OS specific packaging system
  • option path - is a string with a valid system path; Its content is checked for existence, prefixed with section name and stored into TOOLS for later use e.g. TOOLS['dpdk_src'] or TOOLS['vswitch_src']

  • option modules - is list of strings with names of kernel modules; Every module name from given list is checked for a ‘.ko’ suffix. In case that it matches and if it is not an absolute path to the module, then module name is prefixed with value of path option defined for the same section

    Example:

    """
    snippet of PATHS definition from the configuration file:
    """
    PATHS['vswitch'] = {
        'OvsVanilla' = {
            'type' : 'src',
            'src': {
                'path': '/tmp/vsperf/src_vanilla/ovs/ovs/',
                'modules' : ['datapath/linux/openvswitch.ko'],
                ...
            },
            ...
        }
        ...
    }
    
    """
    Final content of TOOLS dictionary used during runtime:
    """
    TOOLS['vswitch_modules'] = ['/tmp/vsperf/src_vanilla/ovs/ovs/datapath/linux/openvswitch.ko']
    
  • all other options are strings with names of and paths to specific tools; if a given string contains a relative path and the option path is defined for the given section, then the string content will be prefixed with the content of path. Otherwise the name of the tool will be searched for within the standard system directories. In case the filename contains OS specific wildcards, they will be expanded to the real path. At the end of the processing, every absolute path is checked for existence. If a temporary path (i.e. a path with a _tmp suffix) does not exist, a log message will be written and vsperf will continue. If any other path does not exist, vsperf execution will be terminated with a runtime error.

    Example:

    """
    snippet of PATHS definition from the configuration file:
    """
    PATHS['vswitch'] = {
        'OvsDpdkVhost': {
            'type' : 'src',
            'src': {
                'path': '/tmp/vsperf/src_vanilla/ovs/ovs/',
                'ovs-vswitchd': 'vswitchd/ovs-vswitchd',
                'ovsdb-server': 'ovsdb/ovsdb-server',
                ...
            }
            ...
        }
        ...
    }
    
    """
    Final content of TOOLS dictionary used during runtime:
    """
    TOOLS['ovs-vswitchd'] = '/tmp/vsperf/src_vanilla/ovs/ovs/vswitchd/ovs-vswitchd'
    TOOLS['ovsdb-server'] = '/tmp/vsperf/src_vanilla/ovs/ovs/ovsdb/ovsdb-server'
    

Note: In case the bin type is set for DPDK, then TOOLS['dpdk_src'] will be set to the value of PATHS['dpdk']['src']['path']. The reason is that VSPERF uses the downloaded DPDK sources to copy DPDK and testpmd into the GUEST, where testpmd is built. In case the DPDK sources are not available, vsperf will continue with test execution, but testpmd can’t be used as a guest loopback. This is useful in case other guest loopback applications (e.g. buildin or l2fwd) are used.

Note: In case of RHEL 7.3 OS usage, binary package configuration is required for Vanilla OVS tests. With the installation of a supported rpm for OVS there is a section in the conf/10_custom.conf file that can be used.

Configuration of TRAFFIC dictionary

The TRAFFIC dictionary is used for configuration of the traffic generator. Default values can be found in the configuration file conf/03_traffic.conf. These default values can be modified by (the first option has the highest priority):

  1. the Parameters section of the testcase definition
  2. command line options specified by the --test-params argument
  3. a custom configuration file

Note that in case of options 1 and 2, it is possible to specify only the values which should be changed. In case of a custom configuration file, it is required either to specify the whole TRAFFIC dictionary with all its values or to explicitly call the update() method of the TRAFFIC dictionary, as shown in the example below.
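
For example, a custom configuration file could change only selected items by calling update() on the TRAFFIC dictionary (the values below are illustrative):

# snippet for a custom configuration file passed via --conf-file
TRAFFIC.update({
    'frame_rate': 50,
    'multistream': 1000,
    'stream_type': 'L3',
})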

Detailed description of TRAFFIC dictionary items follows:

'traffic_type'  - One of the supported traffic types.
                  E.g. rfc2544_throughput, rfc2544_back2back
                  or rfc2544_continuous
                  Data type: str
                  Default value: "rfc2544_throughput".
'bidir'         - Specifies if generated traffic will be full-duplex (True)
                  or half-duplex (False)
                  Data type: str
                  Supported values: "True", "False"
                  Default value: "False".
'frame_rate'    - Defines desired percentage of frame rate used during
                  continuous stream tests.
                  Data type: int
                  Default value: 100.
'multistream'   - Defines number of flows simulated by traffic generator.
                  Value 0 disables multistream feature
                  Data type: int
                  Supported values: 0-65535
                  Default value: 0.
'stream_type'   - Stream type is an extension of the "multistream" feature.
                  If multistream is disabled, then stream type will be
                  ignored. Stream type defines ISO OSI network layer used
                  for simulation of multiple streams.
                  Data type: str
                  Supported values:
                     "L2" - iteration of destination MAC address
                     "L3" - iteration of destination IP address
                     "L4" - iteration of destination port
                            of selected transport protocol
                  Default value: "L4".
'pre_installed_flows'
               -  Pre-installed flows is an extension of the "multistream"
                  feature. If enabled, it will implicitly insert a flow
                  for each stream. If multistream is disabled, then
                  pre-installed flows will be ignored.
                  Note: It is supported only for p2p deployment scenario.
                  Data type: str
                  Supported values:
                     "Yes" - flows will be inserted into OVS
                     "No"  - flows won't be inserted into OVS
                  Default value: "No".
'flow_type'     - Defines flows complexity.
                  Data type: str
                  Supported values:
                     "port" - flow is defined by ingress ports
                     "IP"   - flow is defined by ingress ports
                              and src and dst IP addresses
                  Default value: "port"
'l2'            - A dictionary with l2 network layer details. Supported
                  values are:
    'srcmac'    - Specifies source MAC address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "00:00:00:00:00:00".
    'dstmac'    - Specifies destination MAC address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "00:00:00:00:00:00".
    'framesize' - Specifies default frame size. This value should not be
                  changed directly. It will be overridden during testcase
                  execution by values specified by list TRAFFICGEN_PKT_SIZES.
                  Data type: int
                  Default value: 64
'l3'            - A dictionary with l3 network layer details. Supported
                  values are:
    'srcip'     - Specifies source IP address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "1.1.1.1".
    'dstip'     - Specifies destination IP address filled by traffic generator.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: str
                  Default value: "90.90.90.90".
    'proto'     - Specifies default protocol type.
                  Please check particular traffic generator implementation
                  for supported protocol types.
                  Data type: str
                  Default value: "udp".
'l4'            - A dictionary with l4 network layer details. Supported
                  values are:
    'srcport'   - Specifies source port of selected transport protocol.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: int
                  Default value: 3000
    'dstport'   - Specifies destination port of selected transport protocol.
                  NOTE: It can be modified by vsperf in some scenarios.
                  Data type: int
                  Default value: 3001
'vlan'          - A dictionary with vlan encapsulation details. Supported
                  values are:
    'enabled'   - Specifies if vlan encapsulation should be enabled or
                  disabled.
                  Data type: bool
                  Default value: False
    'id'        - Specifies vlan id.
                  Data type: int (NOTE: must fit to 12 bits)
                  Default value: 0
    'priority'  - Specifies a vlan priority (PCP header field).
                  Data type: int (NOTE: must fit to 3 bits)
                  Default value: 0
    'cfi'       - Specifies if frames can or cannot be dropped during
                  congestion (DEI header field).
                  Data type: int (NOTE: must fit to 1 bit)
                  Default value: 0
Configuration of GUEST options

VSPERF is able to set up scenarios involving a number of VMs in series or in parallel. All configuration options related to a particular VM instance are defined as lists and prefixed with the GUEST_ label. It is essential that there are enough items in all GUEST_ options to cover all VM instances involved in the test. In case there are not enough items, VSPERF will use the first item of the particular GUEST_ option to expand the list to the required length.

Example of option expansion for 4 VMs:

"""
Original values:
"""
GUEST_SMP = ['2']
GUEST_MEMORY = ['2048', '4096']

"""
Values after automatic expansion:
"""
GUEST_SMP = ['2', '2', '2', '2']
GUEST_MEMORY = ['2048', '4096', '2048', '2048']

The first option can contain macros starting with # to generate VM specific values. These macros can be used only for options of list or str types with the GUEST_ prefix.

Example of macros and their expansion for 2 VMs:

"""
Original values:
"""
GUEST_SHARE_DIR = ['/tmp/qemu#VMINDEX_share']
GUEST_BRIDGE_IP = ['#IP(1.1.1.5)/16']

"""
Values after automatic expansion:
"""
GUEST_SHARE_DIR = ['/tmp/qemu0_share', '/tmp/qemu1_share']
GUEST_BRIDGE_IP = ['1.1.1.5/16', '1.1.1.6/16']

Additional examples are available at 04_vnf.conf.

Note: In case a macro is detected in the first item of the list, all other items are ignored and the list content is created automatically.

Multiple macros can be used inside one configuration option definition, but macros cannot be nested inside other macros. The only exception is the macro #VMINDEX, which is expanded first and thus can be used inside other macros.

Following macros are supported:

  • #VMINDEX - it is replaced by index of VM being executed; This macro is expanded first, so it can be used inside other macros.

    Example:

    GUEST_SHARE_DIR = ['/tmp/qemu#VMINDEX_share']
    
  • #MAC(mac_address[, step]) - it will iterate given mac_address with optional step. In case that step is not defined, then it is set to 1. It means, that first VM will use the value of mac_address, second VM value of mac_address increased by step, etc.

    Example:

    GUEST_NICS = [[{'mac' : '#MAC(00:00:00:00:00:01,2)'}]]
    
  • #IP(ip_address[, step]) - it will iterate given ip_address with optional step. In case that step is not defined, then it is set to 1. It means, that first VM will use the value of ip_address, second VM value of ip_address increased by step, etc.

    Example:

    GUEST_BRIDGE_IP = ['#IP(1.1.1.5)/16']
    
  • #EVAL(expression) - it will evaluate the given expression as Python code; only simple expressions should be used. Function calls are not supported.

    Example:

    GUEST_CORE_BINDING = [('#EVAL(6+2*#VMINDEX)', '#EVAL(7+2*#VMINDEX)')]
    
Other Configuration

conf.settings also loads configuration from the command line and from the environment.

PXP Deployment

Every testcase uses one of the supported deployment scenarios to set up the test environment. The controller responsible for a given scenario configures flows in the vswitch to route traffic among the physical interfaces connected to the traffic generator and the virtual machines. VSPERF supports several deployments, including the PXP deployment, which can set up various scenarios with multiple VMs.

These scenarios are realized by the VswitchControllerPXP class, which can configure and execute a given number of VMs in serial or parallel configurations. Every VM can be configured with just one or an even number of interfaces. In case a VM has more than 2 interfaces, traffic is properly routed among pairs of interfaces.

Example of traffic routing for VM with 4 NICs in serial configuration:

         +------------------------------------------+
         |  VM with 4 NICs                          |
         |  +---------------+    +---------------+  |
         |  |  Application  |    |  Application  |  |
         |  +---------------+    +---------------+  |
         |      ^       |            ^       |      |
         |      |       v            |       v      |
         |  +---------------+    +---------------+  |
         |  | logical ports |    | logical ports |  |
         |  |   0       1   |    |   2       3   |  |
         +--+---------------+----+---------------+--+
                ^       :            ^       :
                |       |            |       |
                :       v            :       v
+-----------+---------------+----+---------------+----------+
| vSwitch   |   0       1   |    |   2       3   |          |
|           | logical ports |    | logical ports |          |
| previous  +---------------+    +---------------+   next   |
| VM or PHY     ^       |            ^       |     VM or PHY|
|   port   -----+       +------------+       +--->   port   |
+-----------------------------------------------------------+

It is also possible to define a different number of interfaces for each VM to better simulate real scenarios.

Example of traffic routing for 2 VMs in serial configuration, where 1st VM has 4 NICs and 2nd VM 2 NICs:

         +------------------------------------------+  +---------------------+
         |  1st VM with 4 NICs                      |  |  2nd VM with 2 NICs |
         |  +---------------+    +---------------+  |  |  +---------------+  |
         |  |  Application  |    |  Application  |  |  |  |  Application  |  |
         |  +---------------+    +---------------+  |  |  +---------------+  |
         |      ^       |            ^       |      |  |      ^       |      |
         |      |       v            |       v      |  |      |       v      |
         |  +---------------+    +---------------+  |  |  +---------------+  |
         |  | logical ports |    | logical ports |  |  |  | logical ports |  |
         |  |   0       1   |    |   2       3   |  |  |  |   0       1   |  |
         +--+---------------+----+---------------+--+  +--+---------------+--+
                ^       :            ^       :               ^       :
                |       |            |       |               |       |
                :       v            :       v               :       v
+-----------+---------------+----+---------------+-------+---------------+----------+
| vSwitch   |   0       1   |    |   2       3   |       |   4       5   |          |
|           | logical ports |    | logical ports |       | logical ports |          |
| previous  +---------------+    +---------------+       +---------------+   next   |
| VM or PHY     ^       |            ^       |               ^       |     VM or PHY|
|   port   -----+       +------------+       +---------------+       +---->  port   |
+-----------------------------------------------------------------------------------+

The number of VMs involved in the test and the type of their connection is defined by deployment name as follows:

  • pvvp[number] - configures a scenario with VMs connected in series, with an optional number of VMs. In case the number is not specified, 2 VMs will be used.

    Example of 2 VMs in a serial configuration:

    +----------------------+  +----------------------+
    |   1st VM             |  |   2nd VM             |
    |   +---------------+  |  |   +---------------+  |
    |   |  Application  |  |  |   |  Application  |  |
    |   +---------------+  |  |   +---------------+  |
    |       ^       |      |  |       ^       |      |
    |       |       v      |  |       |       v      |
    |   +---------------+  |  |   +---------------+  |
    |   | logical ports |  |  |   | logical ports |  |
    |   |   0       1   |  |  |   |   0       1   |  |
    +---+---------------+--+  +---+---------------+--+
            ^       :                 ^       :
            |       |                 |       |
            :       v                 :       v
    +---+---------------+---------+---------------+--+
    |   |   0       1   |         |   3       4   |  |
    |   | logical ports | vSwitch | logical ports |  |
    |   +---------------+         +---------------+  |
    |       ^       |                 ^       |      |
    |       |       +-----------------+       v      |
    |   +----------------------------------------+   |
    |   |              physical ports            |   |
    |   |      0                         1       |   |
    +---+----------------------------------------+---+
               ^                         :
               |                         |
               :                         v
    +------------------------------------------------+
    |                                                |
    |                traffic generator               |
    |                                                |
    +------------------------------------------------+
    
  • pvpv[number] - configures a scenario with VMs connected in parallel, with an optional number of VMs. In case the number is not specified, 2 VMs will be used. The multistream feature is used to route traffic to particular VMs (or NIC pairs of every VM). It means that VSPERF will enable the multistream feature and set the number of streams to the number of VMs and their NIC pairs. Traffic will be dispatched based on the Stream Type, i.e. by UDP port, IP address or MAC address.

    Example of 2 VMs in a parallel configuration, where traffic is dispatched based on the UDP port.

    +----------------------+  +----------------------+
    |   1st VM             |  |   2nd VM             |
    |   +---------------+  |  |   +---------------+  |
    |   |  Application  |  |  |   |  Application  |  |
    |   +---------------+  |  |   +---------------+  |
    |       ^       |      |  |       ^       |      |
    |       |       v      |  |       |       v      |
    |   +---------------+  |  |   +---------------+  |
    |   | logical ports |  |  |   | logical ports |  |
    |   |   0       1   |  |  |   |   0       1   |  |
    +---+---------------+--+  +---+---------------+--+
            ^       :                 ^       :
            |       |                 |       |
            :       v                 :       v
    +---+---------------+---------+---------------+--+
    |   |   0       1   |         |   3       4   |  |
    |   | logical ports | vSwitch | logical ports |  |
    |   +---------------+         +---------------+  |
    |      ^         |                 ^       :     |
    |      |     ......................:       :     |
    |  UDP | UDP :   |                         :     |
    |  port| port:   +--------------------+    :     |
    |   0  |  1  :                        |    :     |
    |      |     :                        v    v     |
    |   +----------------------------------------+   |
    |   |              physical ports            |   |
    |   |    0                               1   |   |
    +---+----------------------------------------+---+
             ^                               :
             |                               |
             :                               v
    +------------------------------------------------+
    |                                                |
    |                traffic generator               |
    |                                                |
    +------------------------------------------------+
    

PXP deployment is backward compatible with PVP deployment, where pvp is an alias for pvvp1 and it executes just one VM.

The number of interfaces used by the VMs is defined by the configuration option GUEST_NICS_NR. In case more than one pair of interfaces is defined for a VM, then:

  • for pvvp (serial) scenario every NIC pair is connected in serial before connection to next VM is created
  • for pvpv (parallel) scenario every NIC pair is directly connected to the physical ports and unique traffic stream is assigned to it

Examples:

  • Deployment pvvp10 will start 10 VMs and connect them in series
  • Deployment pvpv4 will start 4 VMs and connect them in parallel
  • Deployment pvpv1 and GUEST_NICS_NR = [4] will start 1 VM with 4 interfaces, and every NIC pair is directly connected to the physical ports
  • Deployment pvvp and GUEST_NICS_NR = [2, 4] will start 2 VMs; the 1st VM will have 2 interfaces and the 2nd VM 4 interfaces. These interfaces will be connected in series, i.e. traffic will flow as follows: PHY1 -> VM1_1 -> VM1_2 -> VM2_1 -> VM2_2 -> VM2_3 -> VM2_4 -> PHY2

Note: In case only 1 or more than 2 NICs are configured for a VM, testpmd should be used as the forwarding application inside the VM, as it is able to forward traffic between multiple VM NIC pairs.

Note: In case of linux_bridge, all NICs are connected to the same bridge inside the VM.

VM, vSwitch, Traffic Generator Independence

VSPERF supports different vSwitches, Traffic Generators, VNFs and Forwarding Applications by using standard object-oriented polymorphism:

  • Support for vSwitches is implemented by a class inheriting from IVSwitch.
  • Support for Traffic Generators is implemented by a class inheriting from ITrafficGenerator.
  • Support for VNF is implemented by a class inheriting from IVNF.
  • Support for Forwarding Applications is implemented by a class inheriting from IPktFwd.

By dealing only with the abstract interfaces the core framework can support many implementations of different vSwitches, Traffic Generators, VNFs and Forwarding Applications.

IVSwitch
class IVSwitch:
  start(self)
  stop(self)
  add_switch(switch_name)
  del_switch(switch_name)
  add_phy_port(switch_name)
  add_vport(switch_name)
  get_ports(switch_name)
  del_port(switch_name, port_name)
  add_flow(switch_name, flow)
  del_flow(switch_name, flow=None)
ITrafficGenerator
class ITrafficGenerator:
  connect()
  disconnect()

  send_burst_traffic(traffic, numpkts, time, framerate)

  send_cont_traffic(traffic, time, framerate)
  start_cont_traffic(traffic, time, framerate)
  stop_cont_traffic()

  send_rfc2544_throughput(traffic, tests, duration, lossrate)
  start_rfc2544_throughput(traffic, tests, duration, lossrate)
  wait_rfc2544_throughput()

  send_rfc2544_back2back(traffic, tests, duration, lossrate)
  start_rfc2544_back2back(traffic, tests, duration, lossrate)
  wait_rfc2544_back2back()

Note send_xxx() blocks whereas start_xxx() does not and must be followed by a subsequent call to wait_xxx().

IVnf
class IVnf:
  start(memory, cpus,
        monitor_path, shared_path_host,
        shared_path_guest, guest_prompt)
  stop()
  execute(command)
  wait(guest_prompt)
  execute_and_wait (command)
IPktFwd
class IPktFwd:
    start()
    stop()
Controllers

Controllers are used in conjunction with abstract interfaces as a way of decoupling the control of vSwitches, VNFs, Traffic Generators and Forwarding Applications from other components.

The controlled classes provide basic primitive operations. The Controllers sequence and co-ordinate these primitive operations into useful actions. For instance the vswitch_controller_p2p can be used to bring any vSwitch (that implements the primitives defined in IVSwitch) into the configuration required by the Phy-to-Phy Deployment Scenario.

In order to support a new vSwitch, only a new implementation of IVSwitch needs to be created for the new vSwitch to be capable of fulfilling all the Deployment Scenarios provided for by existing or future vSwitch Controllers.
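
A minimal sketch of such an implementation, assuming IVSwitch can be imported from the vswitches package (the module path is an assumption and only a few methods are shown):

from vswitches.vswitch import IVSwitch  # assumed import path


class MyVSwitch(IVSwitch):
    """Support for a hypothetical new virtual switch."""

    def start(self):
        pass  # start the vswitch process(es)

    def stop(self):
        pass  # stop the vswitch and clean up

    def add_switch(self, switch_name):
        pass  # create a new logical switch instance

    def add_phy_port(self, switch_name):
        pass  # attach a physical port to the switch

    def add_vport(self, switch_name):
        pass  # attach a virtual port for a VNF

    def add_flow(self, switch_name, flow):
        pass  # install a flow entry for the deployment scenario

    # remaining IVSwitch methods (del_switch, get_ports, del_port, del_flow)
    # would be implemented in the same way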

Similarly if a new Deployment Scenario is required it only needs to be written once as a new vSwitch Controller and it will immediately be capable of controlling all existing and future vSwitches in to that Deployment Scenario.

Similarly the Traffic Controllers can be used to co-ordinate the basic operations provided by implementers of ITrafficGenerator to provide useful tests. Traffic generators generally already implement full test cases, i.e. they both generate suitable traffic and analyse the returned traffic in order to implement a test which has typically been predefined in an RFC document. However, the Traffic Controller class allows for the possibility of further enhancement, such as iterating over tests for various packet sizes or creating new tests.

Traffic Controller’s Role _images/traffic_controller1.png
Loader & Component Factory

The working of the Loader package (which is responsible for finding arbitrary classes based on configuration data) and the Component Factory which is responsible for choosing the correct class for a particular situation - e.g. Deployment Scenario can be seen in this diagram.

_images/factory_and_loader1.png
Routing Tables

Vsperf uses a standard set of routing tables in order to allow tests to easily mix and match Deployment Scenarios (PVP, P2P topology), Tuple Matching and Frame Modification requirements.

+--------------+
|              |
| Table 0      |  table#0 - Match table. Flows designed to force 5 & 10
|              |  tuple matches go here.
|              |
+--------------+
       |
       |
       v
+--------------+  table#1 - Routing table. Flow entries to forward
|              |  packets between ports go here.
| Table 1      |  The chosen port is communicated to subsequent tables by
|              |  setting the metadata value to the egress port number.
|              |  Generally this table is set up by the
+--------------+  vSwitchController.
       |
       |
       v
+--------------+  table#2 - Frame modification table. Frame modification
|              |  flow rules are isolated in this table so that they can
| Table 2      |  be turned on or off without affecting the routing or
|              |  tuple-matching flow rules. This allows the frame
|              |  modification and tuple matching required by the tests
|              |  in the VSWITCH PERFORMANCE FOR TELCO NFV test
+--------------+  specification to be independent of the Deployment
       |          Scenario set up by the vSwitchController.
       |
       v
+--------------+
|              |
| Table 3      |  table#3 - Egress table. Egress packets on the ports
|              |  setup in Table 1.
+--------------+
3. VSPERF LEVEL TEST DESIGN (LTD)
Introduction

The intention of this Level Test Design (LTD) document is to specify the set of tests to carry out in order to objectively measure the current characteristics of a virtual switch in the Network Function Virtualization Infrastructure (NFVI) as well as the test pass criteria. The detailed test cases will be defined in details-of-LTD, preceded by the doc-id-of-LTD and the scope-of-LTD.

This document is currently in draft form.

Document identifier

The document id will be used to uniquely identify versions of the LTD. The format for the document id will be: OPNFV_vswitchperf_LTD_REL_STATUS, where the status is one of: draft, reviewed, corrected or final. The document id for this version of the LTD is: OPNFV_vswitchperf_LTD_Brahmaputra_REVIEWED.

Scope

The main purpose of this project is to specify a suite of performance tests in order to objectively measure the current packet transfer characteristics of a virtual switch in the NFVI. The intent of the project is to facilitate testing of any virtual switch. Thus, a generic suite of tests shall be developed, with no hard dependencies to a single implementation. In addition, the test case suite shall be architecture independent.

The test cases developed in this project shall not form part of a separate test framework; all of these tests may be inserted into the Continuous Integration Test Framework and/or the Platform Functionality Test Framework - if a vSwitch becomes a standard component of an OPNFV release.

Details of the Level Test Design

This section describes the features to be tested (FeaturesToBeTested-of-LTD), and identifies the sets of test cases or scenarios (TestIdentification-of-LTD).

Features to be tested

Characterizing virtual switches (i.e. Device Under Test (DUT) in this document) includes measuring the following performance metrics:

  • Throughput
  • Packet delay
  • Packet delay variation
  • Packet loss
  • Burst behaviour
  • Packet re-ordering
  • Packet correctness
  • Availability and capacity of the DUT
Test identification
Throughput tests

The following tests aim to determine the maximum forwarding rate that can be achieved with a virtual switch. The list is not exhaustive but should indicate the types of tests required. It is expected that more will be added.

Test ID: LTD.Throughput.RFC2544.PacketLossRatio

Title: RFC 2544 X% packet loss ratio Throughput and Latency Test

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s maximum forwarding rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time). The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

Note: Other values can be tested if required by the user.

The selected frame sizes are those previously defined under Default Test Parameters. The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result.

Expected Result: At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.
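
A simplified sketch of the binary search described above; run_trial is a hypothetical helper that offers traffic at the given rate (in % of line rate) for one trial and returns the measured loss ratio:

def rfc2544_search(run_trial, max_loss=0.0, resolution=0.1):
    """Return the highest offered load whose loss does not exceed max_loss."""
    low, high = 0.0, 100.0
    best = 0.0
    while high - low > resolution:
        rate = (low + high) / 2.0
        loss = run_trial(rate)   # one trial, e.g. 60 seconds
        if loss <= max_loss:
            best = rate          # acceptable loss: try a higher rate
            low = rate
        else:
            high = rate          # too much loss: try a lower rate
    return best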

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
Test ID: LTD.Throughput.RFC2544.PacketLossRatioFrameModification

Title: RFC 2544 X% packet loss Throughput and Latency Test with packet modification

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s maximum forwarding rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time). The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

Note: Other values can be tested if required by the user.

The selected frame sizes are those previously defined under Default Test Parameters. The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result.

During this test, the DUT must perform the following operations on the traffic flow:

  • Perform packet parsing on the DUT’s ingress port.
  • Perform any relevant address look-ups on the DUT’s ingress ports.
  • Modify the packet header before forwarding the packet to the DUT’s egress port. Packet modifications include:
    • Modifying the Ethernet source or destination MAC address.
    • Modifying/adding a VLAN tag. (Recommended).
    • Modifying/adding an MPLS tag.
    • Modifying the source or destination IP address.
    • Modifying the TOS/DSCP field.
    • Modifying the source or destination ports for UDP/TCP/SCTP.
    • Modifying the TTL.

Expected Result: Packet parsing/modification requires some additional processing resources; therefore, the RFC2544 Throughput is expected to be somewhat lower than the Throughput level measured without these additional steps. The reduction is expected to be greatest on tests with the smallest packet sizes (highest header processing rates).

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss and packet modification operations being performed by the DUT.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
Test ID: LTD.Throughput.RFC2544.Profile

Title: RFC 2544 Throughput and Latency Profile

Prerequisite Test: N/A

Priority:

Description:

This test reveals how throughput and latency degrade as the offered rate varies in the region of the DUT’s maximum forwarding rate as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss). For example, it can be used to determine whether the degradation of throughput and latency as the offered rate increases is slow and graceful or sudden and severe.

The selected frame sizes are those previously defined under Default Test Parameters.

The offered traffic rate is described as a percentage delta with respect to the DUT’s RFC 2544 Throughput as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss case). A delta of 0% is equivalent to an offered traffic rate equal to the RFC 2544 Maximum Throughput; a delta of +50% indicates an offered rate half-way between the Maximum RFC2544 Throughput and line-rate, whereas a delta of -50% indicates an offered rate of half the RFC 2544 Maximum Throughput. Therefore the range of the delta figure is naturally bounded at -100% (zero offered traffic) and +100% (traffic offered at line rate).

The following deltas to the maximum forwarding rate should be applied:

  • -50%, -10%, 0%, +10% & +50%
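
The mapping from a delta percentage to an absolute offered rate can be expressed as the following sketch; the helper name and the example Throughput and line-rate values are illustrative assumptions only.

  def offered_rate_for_delta(delta_pct, throughput_fps, line_rate_fps):
      """Translate a +/- delta percentage into an offered rate in FPS:
      0% is the RFC 2544 Throughput, +100% is line rate, -100% is zero."""
      if delta_pct >= 0:
          return throughput_fps + (line_rate_fps - throughput_fps) * delta_pct / 100.0
      return throughput_fps * (1.0 + delta_pct / 100.0)

  # Example: the deltas listed above for a 10 Mfps Throughput on a 14.88 Mfps link
  rates = [offered_rate_for_delta(d, 10e6, 14.88e6) for d in (-50, -10, 0, 10, 50)]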

Expected Result: For each packet size a profile should be produced of how throughput and latency vary with offered rate.

Metrics Collected:

The following are the metrics collected for this test:

  • The forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each delta to the maximum forwarding rate and for each frame size.
  • The average latency for each delta to the maximum forwarding rate and for each frame size.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • Any failures experienced (for example if the vSwitch crashes, stops processing packets, restarts or becomes unresponsive to commands) when the offered load is above Maximum Throughput MUST be recorded and reported with the results.
Test ID: LTD.Throughput.RFC2544.SystemRecoveryTime

Title: RFC 2544 System Recovery Time Test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio

Priority:

Description:

The aim of this test is to determine the length of time it takes the DUT to recover from an overload condition for a constant load (fixed length frames at a fixed interval time). The selected frame sizes are those previously defined under Default Test Parameters; traffic should be sent to the DUT under normal conditions. During the duration of the test and while the traffic flows are passing through the DUT, at least one situation leading to an overload condition for the DUT should occur. The time from the end of the overload condition to when the DUT returns to normal operations should be measured to determine recovery time. Prior to overloading the DUT, one should record the average latency for 10,000 packets forwarded through the DUT.

The overload condition SHOULD be to transmit traffic at a very high frame rate to the DUT (150% of the maximum 0% packet loss rate as determined by LTD.Throughput.RFC2544.PacketLossRatio, or line-rate, whichever is lower) for at least 60 seconds, then reduce the frame rate to 75% of the maximum 0% packet loss rate. A number of time-stamps should be recorded:

  • Record the time-stamp at which the frame rate was reduced and record a second time-stamp at the time of the last frame lost. The recovery time is the difference between the two timestamps.
  • Record the average latency for 10,000 frames after the last frame loss and continue to record average latency measurements for every 10,000 frames; when latency returns to within 10% of pre-overload levels, record the time-stamp.
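
A minimal sketch of this recovery-time bookkeeping is shown below; the variable names and the measurement interface are assumptions, not part of the test definition.

  def recovery_metrics(rate_reduced_ts, loss_timestamps, latency_samples,
                       pre_overload_avg_latency):
      """Derive the two recovery metrics from recorded timestamps.

      rate_reduced_ts : time at which the frame rate was reduced to 75%
      loss_timestamps : timestamps of all lost frames observed
      latency_samples : list of (timestamp, average latency per 10,000 frames)
      """
      last_loss_ts = max(loss_timestamps)
      recovery_time = last_loss_ts - rate_reduced_ts
      # Time-stamp at which average latency returns to within 10% of the
      # pre-overload level (here reported relative to the last frame loss;
      # the reference point is a test-tool choice).
      latency_ok_ts = next(ts for ts, lat in latency_samples
                           if ts > last_loss_ts and lat <= 1.1 * pre_overload_avg_latency)
      return recovery_time, latency_ok_ts - last_loss_ts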

Expected Result:

Metrics collected

The following are the metrics collected for this test:

  • The length of time it takes the DUT to recover from an overload condition.
  • The length of time it takes the DUT to recover the average latency to pre-overload conditions.

Deployment scenario:

  • Physical → virtual switch → physical.
Test ID: LTD.Throughput.RFC2544.BackToBackFrames

Title: RFC2544 Back To Back Frames Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to characterize the ability of the DUT to process back-to-back frames. For each frame size previously defined under Default Test Parameters, a burst of traffic is sent to the DUT with the minimum inter-frame gap between each frame. If the number of received frames equals the number of frames that were transmitted, the burst size should be increased and traffic is sent to the DUT again. The value measured is the back-to-back value, that is the maximum burst size the DUT can handle without any frame loss. Please note a trial must run for a minimum of 2 seconds and should be repeated 50 times (at a minimum).
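
The burst-size search described above can be sketched as follows; send_burst(n) is a hypothetical helper that transmits n back-to-back frames and returns the number received, and the start and step sizes are illustrative assumptions.

  def back_to_back_value(send_burst, start_size=1000, step=1000, max_size=10_000_000):
      """Largest burst the DUT forwards without loss (single trial)."""
      best, size = 0, start_size
      while size <= max_size:
          if send_burst(size) == size:
              best = size
              size += step          # no loss: increase the burst size
          else:
              break                 # loss observed: previous size is the result
      return best

  def average_back_to_back(send_burst, trials=50):
      """Average the back-to-back value over the (minimum) 50 trials."""
      return sum(back_to_back_value(send_burst) for _ in range(trials)) / trials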

Expected Result:

Tests of back-to-back frames with physical devices have produced unstable results in some cases. All tests should be repeated in multiple test sessions and results stability should be examined.

Metrics collected

The following are the metrics collected for this test:

  • The average back-to-back value across the trials, which is the number of frames in the longest burst that the DUT will handle without the loss of any frames.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical.
Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoak

Title: RFC 2889 X% packet loss Max Forwarding Rate Soak Test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio

Priority:

Description:

The aim of this test is to understand the Max Forwarding Rate stability over an extended test duration in order to uncover any outliers. To allow for an extended test duration, the test should ideally run for 24 hours or, if this is not possible, for at least 6 hours. For this test, each frame size must be sent at the highest Throughput rate with X% packet loss, as determined in the prerequisite test. The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

Note: Other values can be tested if required by the user.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • Max Forwarding Rate stability of the DUT.
    • This means reporting the number of packets lost per time interval and reporting any time intervals with packet loss. The RFC2889 Forwarding Rate shall be measured in each interval. An interval of 60s is suggested.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile.
Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoakFrameModification

Title: RFC 2889 Max Forwarding Rate Soak Test with Frame Modification

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatioFrameModification (0% Packet Loss)

Priority:

Description:

The aim of this test is to understand the Max Forwarding Rate stability over an extended test duration in order to uncover any outliers. To allow for an extended test duration, the test should ideally run for 24 hours or, if this is not possible, for at least 6 hours. For this test, each frame size must be sent at the highest Throughput rate with 0% packet loss, as determined in the prerequisite test.

During this test, the DUT must perform the following operations on the traffic flow:

  • Perform packet parsing on the DUT’s ingress port.
  • Perform any relevant address look-ups on the DUT’s ingress ports.
  • Modify the packet header before forwarding the packet to the DUT’s egress port. Packet modifications include:
    • Modifying the Ethernet source or destination MAC address.
    • Modifying/adding a VLAN tag (Recommended).
    • Modifying/adding an MPLS tag.
    • Modifying the source or destination IP address.
    • Modifying the TOS/DSCP field.
    • Modifying the source or destination ports for UDP/TCP/SCTP.
    • Modifying the TTL.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • Max Forwarding Rate stability of the DUT.
    • This means reporting the number of packets lost per time interval and reporting any time intervals with packet loss. The RFC2889 Forwarding Rate shall be measured in each interval. An interval of 60s is suggested.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile.
Test ID: LTD.Throughput.RFC6201.ResetTime

Title: RFC 6201 Reset Time Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to determine the length of time it takes the DUT to recover from a reset.

Two reset methods are defined - planned and unplanned. A planned reset requires stopping and restarting the virtual switch by the usual ‘graceful’ method defined by its documentation. An unplanned reset requires simulating a fatal internal fault in the virtual switch - for example by using kill -SIGKILL in a Linux environment.

Both reset methods SHOULD be exercised.

For each frame size previously defined under Default Test Parameters, traffic should be sent to the DUT under normal conditions. During the duration of the test and while the traffic flows are passing through the DUT, the DUT should be reset and the Reset time measured. The Reset time is the total time that a device is determined to be out of operation and includes the time to perform the reset and the time to recover from it (cf. RFC6201).

RFC6201 defines two methods to measure the Reset time:

  • Frame-Loss Method: which requires the monitoring of the number of lost frames and calculates the Reset time based on the number of frames lost and the offered rate according to the following formula:

                       Frames_lost (packets)
    Reset_time = -------------------------------------
                   Offered_rate (packets per second)
    
  • Timestamp Method: which measures the time from which the last frame is forwarded from the DUT to the time the first frame is forwarded after the reset. This involves time-stamping all transmitted frames and recording the timestamp of the last frame that was received prior to the reset and also measuring the timestamp of the first frame that is received after the reset. The Reset time is the difference between these two timestamps.

According to RFC6201 the choice of method depends on the test tool’s capability; the Frame-Loss method SHOULD be used if the test tool supports:

  • Counting the number of lost frames per stream.
  • Transmitting test frames despite the physical link status.

whereas the Timestamp method SHOULD be used if the test tool supports:

  • Timestamping each frame.
  • Monitoring received frame’s timestamp.
  • Transmitting frames only if the physical link status is up.
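
Both reset-time calculations can be sketched as follows (a minimal illustration of the two methods; timestamp units are assumed to be seconds):

  def reset_time_frame_loss(frames_lost, offered_rate_pps):
      """RFC 6201 Frame-Loss method: Reset_time = Frames_lost / Offered_rate."""
      return frames_lost / offered_rate_pps

  def reset_time_timestamps(last_rx_before_reset, first_rx_after_reset):
      """RFC 6201 Timestamp method: difference between the two timestamps."""
      return first_rx_after_reset - last_rx_before_reset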

Expected Result:

Metrics collected

The following are the metrics collected for this test:

  • Average Reset Time over the number of trials performed.

Results of this test should include the following information:

  • The reset method used.
  • Throughput in FPS and Mbps.
  • Average Frame Loss over the number of trials performed.
  • Average Reset Time in milliseconds over the number of trials performed.
  • Number of trials performed.
  • Protocol: IPv4, IPv6, MPLS, etc.
  • Frame Size in Octets
  • Port Media: Ethernet, Gigabit Ethernet (GbE), etc.
  • Port Speed: 10 Gbps, 40 Gbps etc.
  • Interface Encapsulation: Ethernet, Ethernet VLAN, etc.

Deployment scenario:

  • Physical → virtual switch → physical.
Test ID: LTD.Throughput.RFC2889.MaxForwardingRate

Title: RFC2889 Forwarding Rate Test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio

Priority:

Description:

This test measures the DUT’s Max Forwarding Rate when the Offered Load is varied between the throughput and the Maximum Offered Load for fixed length frames at a fixed time interval. The selected frame sizes are those previously defined under Default Test Parameters. The throughput is the maximum offered load with 0% frame loss (measured by the prerequisite test), and the Maximum Offered Load (as defined by RFC2285) is “the highest number of frames per second that an external source can transmit to a DUT/SUT for forwarding to a specified output interface or interfaces”.

Traffic should be sent to the DUT at a particular rate (TX rate), starting with a TX rate equal to the throughput rate. The rate of frames successfully received at the destination is counted (in FPS). If the RX rate is equal to the TX rate, the TX rate should be increased by a fixed step size and the RX rate measured again, until the Max Forwarding Rate is found.

The trial duration for each iteration should last for the period of time needed for the system to reach steady state for the frame size being tested. Under the RFC2889 (Sec. 5.6.3.1) test methodology, the test duration should run for a minimum period of 30 seconds, regardless of whether the system reaches steady state before the minimum duration ends.
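
The iterative search can be sketched as below; run_trial(tx_rate_fps, duration_s) is a hypothetical helper returning the measured RX rate, and the step size is an assumption rather than a value required by RFC 2889.

  def rfc2889_max_forwarding_rate(run_trial, throughput_fps, max_offered_load_fps,
                                  step_fps=10_000, duration_s=30):
      """Step the offered load from the Throughput towards the Maximum Offered
      Load and return the highest forwarding rate measured at any step."""
      max_fwd_rate, tx_rate = 0.0, throughput_fps
      while tx_rate <= max_offered_load_fps:
          rx_rate = run_trial(tx_rate, duration_s)
          max_fwd_rate = max(max_fwd_rate, rx_rate)
          if rx_rate < tx_rate:     # DUT no longer keeps up with the offered load
              break
          tx_rate += step_fps
      return max_fwd_rate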

Expected Result: According to RFC2889, the Max Forwarding Rate is the highest forwarding rate of a DUT taken from an iterative set of forwarding rate measurements. The iterative set of forwarding rate measurements is made by setting the intended load transmitted from an external source and measuring the offered load (i.e. what the DUT is capable of forwarding). If the Throughput == the Maximum Offered Load, it follows that the Max Forwarding Rate is equal to the Maximum Offered Load.

Metrics Collected:

The following are the metrics collected for this test:

  • The Max Forwarding Rate for the DUT for each packet size.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical. Note: Full mesh tests with multiple ingress and egress ports are a key aspect of RFC 2889 benchmarks, and scenarios with both 2 and 4 ports should be tested. In any case, the number of ports used must be reported.
Test ID: LTD.Throughput.RFC2889.ForwardPressure

Title: RFC2889 Forward Pressure Test

Prerequisite Test: LTD.Throughput.RFC2889.MaxForwardingRate

Priority:

Description:

The aim of this test is to determine if the DUT transmits frames with an inter-frame gap that is less than 12 bytes. This test overloads the DUT and measures the output for forward pressure. Traffic should be transmitted to the DUT with an inter-frame gap of 11 bytes; this will overload the DUT by 1 byte per frame. The forwarding rate of the DUT should be measured.

Expected Result: The forwarding rate should not exceed the maximum forwarding rate of the DUT collected by LTD.Throughput.RFC2889.MaxForwardingRate.

Metrics collected

The following are the metrics collected for this test:

  • Forwarding rate of the DUT in FPS or Mbps.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical.
Test ID: LTD.Throughput.RFC2889.ErrorFramesFiltering

Title: RFC2889 Error Frames Filtering Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to determine whether the DUT will propagate any erroneous frames it receives or whether it is capable of filtering out the erroneous frames. Traffic should be sent with erroneous frames included within the flow at random intervals. Illegal frames that must be tested include:

  • Oversize Frames.
  • Undersize Frames.
  • CRC Errored Frames.
  • Dribble Bit Errored Frames.
  • Alignment Errored Frames.

The traffic flow exiting the DUT should be recorded and checked to determine if the erroneous frames were passed through the DUT.

Expected Result: Broken frames are not passed!

Metrics collected

No metrics are collected in this test; instead it determines:

  • Whether the DUT will propagate erroneous frames.
  • Or whether the DUT will correctly filter out any erroneous frames from the traffic flow without removing correct frames.

Deployment scenario:

  • Physical → virtual switch → physical.
Test ID: LTD.Throughput.RFC2889.BroadcastFrameForwarding

Title: RFC2889 Broadcast Frame Forwarding Test

Prerequisite Test: N/A

Priority:

Description:

The aim of this test is to determine the maximum forwarding rate of the DUT when forwarding broadcast traffic. For each frame size previously defined under Default Test Parameters, the traffic should be set up as broadcast traffic. The traffic throughput of the DUT should be measured.

The test should be conducted with at least 4 physical ports on the DUT. The number of ports used MUST be recorded.

As broadcast involves forwarding a single incoming packet to several destinations, the latency of a single packet is defined as the average of the latencies for each of the broadcast destinations.

The incoming packet is transmitted on each of the other physical ports; it is not transmitted on the port on which it was received. The test MAY be conducted using different broadcasting ports to uncover any performance differences.

Expected Result:

Metrics collected:

The following are the metrics collected for this test:

  • The forwarding rate of the DUT when forwarding broadcast traffic.
  • The minimum, average & maximum packet latencies observed.

Deployment scenario:

  • Physical → virtual switch → 3x physical. In the Broadcast rate testing, four test ports are required. One of the ports is connected to the test device, so it can send broadcast frames and listen for mis-routed frames.
Test ID: LTD.Throughput.RFC2544.WorstN-BestN

Title: Modified RFC 2544 X% packet loss ratio Throughput and Latency Test

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s maximum forwarding rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time). The default loss percentages to be tested are: X = 0%, X = 10^-7%

Modified RFC 2544 throughput benchmarking methodology aims to quantify the throughput measurement variations observed during standard RFC 2544 benchmarking measurements of virtual switches and VNFs. The RFC2544 binary search algorithm is modified to use more samples per test trial to drive the binary search and yield statistically more meaningful results. This keeps the heart of the RFC2544 methodology, still relying on the binary search of throughput at specified loss tolerance, while providing more useful information about the range of results seen in testing. Instead of using a single traffic trial per iteration step, each traffic trial is repeated N times and the success/failure of the iteration step is based on these N traffic trials. Two types of revised tests are defined - Worst-of-N and Best-of-N.

Worst-of-N

Worst-of-N indicates the lowest expected maximum throughput for (packet size, loss tolerance) when repeating the test.

  1. Repeat the same test run N times at a set packet rate, record each result.
  2. Take the WORST result (highest packet loss) out of N result samples, called the Worst-of-N sample.
  3. If Worst-of-N sample has loss less than the set loss tolerance, then the step is successful - increase the test traffic rate.
  4. If Worst-of-N sample has loss greater than the set loss tolerance then the step failed - decrease the test traffic rate.
  5. Go to step 1.

Best-of-N

Best-of-N indicates the highest expected maximum throughput for (packet size, loss tolerance) when repeating the test.

  1. Repeat the same traffic run N times at a set packet rate, record each result.
  2. Take the BEST result (least packet loss) out of N result samples, called the Best-of-N sample.
  3. If Best-of-N sample has loss less than the set loss tolerance, then the step is successful - increase the test traffic rate.
  4. If Best-of-N sample has loss greater than the set loss tolerance, then the step failed - decrease the test traffic rate.
  5. Go to step 1.

Performing both Worst-of-N and Best-of-N benchmark tests yields lower and upper bounds of expected maximum throughput under the operating conditions, giving a very good indication to the user of the deterministic performance range for the tested setup.
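
The decision rule for one iteration step of the modified search can be sketched as follows; run_trial(rate_fps) is an assumed helper returning the loss ratio of a single traffic trial.

  def n_sample_step_passes(run_trial, rate_fps, n, loss_tolerance, mode="worst"):
      """Worst-of-N / Best-of-N step evaluation for the modified binary search.

      The step result is the worst (highest) or best (lowest) loss of N trials;
      True means the step passed and the traffic rate should be increased."""
      losses = [run_trial(rate_fps) for _ in range(n)]
      sample = max(losses) if mode == "worst" else min(losses)
      return sample <= loss_tolerance

Running the outer binary search once with mode="worst" and once with mode="best" yields the lower and upper bounds on expected maximum throughput described above.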

Expected Result: At the end of each trial series, the presence or absence of loss determines the modification of offered load for the next trial series, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • Following may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system:
  • CPU core utilization.
  • CPU cache utilization.
  • Memory footprint.
  • System bus (QPI, PCI, ...) utilization.
  • CPU cycles consumed per packet.
Test ID: LTD.Throughput.Overlay.Network.<tech>.RFC2544.PacketLossRatio

Title: <tech> Overlay Network RFC 2544 X% packet loss ratio Throughput and Latency Test

NOTE: Throughout this test, four interchangeable overlay technologies are covered by the same test description. They are: VXLAN, GRE, NVGRE and GENEVE.

Prerequisite Test: N/A

Priority:

Description: This test evaluates standard switch performance benchmarks for the scenario where an Overlay Network is deployed for all paths through the vSwitch. Overlay Technologies covered (replacing <tech> in the test name) include:

  • VXLAN
  • GRE
  • NVGRE
  • GENEVE

Performance will be assessed for each of the following overlay network functions:

  • Encapsulation only
  • De-encapsulation only
  • Both Encapsulation and De-encapsulation

For each native packet, the DUT must perform the following operations:

  • Examine the packet and classify its correct overlay net (tunnel) assignment
  • Encapsulate the packet
  • Switch the packet to the correct port

For each encapsulated packet, the DUT must perform the following operations:

  • Examine the packet and classify its correct native network assignment
  • De-encapsulate the packet, if required
  • Switch the packet to the correct port

The selected frame sizes are those previously defined under Default Test Parameters.

Thus, each test comprises an overlay technology, a network function, and a packet size with overlay network overhead included (but see also the discussion at https://etherpad.opnfv.org/p/vSwitchTestsDrafts).

The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result for Throughput.

Expected Result: At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss (where the value of X is typically equal to zero). The Throughput load is re-used in related RFC2544 tests and other tests.

Metrics Collected: The following are the metrics collected for this test:

  • The maximum Throughput in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT and VNFs (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
Test ID: LTD.Throughput.RFC2544.MatchAction.PacketLossRatio

Title: RFC 2544 X% packet loss ratio match action Throughput and Latency Test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio

Priority:

Description:

The aim of this test is to determine the cost of carrying out match action(s) on the DUT’s RFC2544 Throughput with X% traffic loss for a constant load (fixed length frames at a fixed interval time).

Each test case requires:

  • selection of a specific match action(s),
  • specifying a percentage of total traffic that is eligible for the match action,
  • determination of the specific test configuration (number of flows, number of test ports, presence of an external controller, etc.), and
  • measurement of the RFC 2544 Throughput level with X% packet loss: Traffic shall be bi-directional and symmetric.

Note: It would be ideal to verify that all match-action-eligible traffic was forwarded to the correct port; if forwarded to an unintended port, it should be considered lost.

A match action is an action that is typically carried out on a frame or packet that matches a set of flow classification parameters (typically frame/packet header fields). A match action may or may not modify a packet/frame. Match actions include [1]:

  • output: Outputs a packet to a particular port.
  • normal: Subjects the packet to traditional L2/L3 processing (MAC learning).
  • flood: Outputs the packet on all switch physical ports other than the port on which it was received and any ports on which flooding is disabled.
  • all: Outputs the packet on all switch physical ports other than the port on which it was received.
  • local: Outputs the packet on the local port, which corresponds to the network device that has the same name as the bridge.
  • in_port: Outputs the packet on the port from which it was received.
  • controller: Sends the packet and its metadata to the OpenFlow controller as a packet-in message.
  • enqueue: Enqueues the packet on the specified queue within a port.
  • drop: Discards the packet.

Modifications include [1]:

  • mod vlan: covered by LTD.Throughput.RFC2544.PacketLossRatioFrameModification
  • mod_dl_src: Sets the source Ethernet address.
  • mod_dl_dst: Sets the destination Ethernet address.
  • mod_nw_src: Sets the IPv4 source address.
  • mod_nw_dst: Sets the IPv4 destination address.
  • mod_tp_src: Sets the TCP or UDP or SCTP source port.
  • mod_tp_dst: Sets the TCP or UDP or SCTP destination port.
  • mod_nw_tos: Sets the DSCP bits in the IPv4 ToS/DSCP or IPv6 traffic class field.
  • mod_nw_ecn: Sets the ECN bits in the appropriate IPv4 or IPv6 field.
  • mod_nw_ttl: Sets the IPv4 TTL or IPv6 hop limit field.

Note: This comprehensive list requires extensive traffic generator capabilities.

The match action(s) that were applied as part of the test should be reported in the final test report.
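
For an Open vSwitch DUT, flows implementing a match action can be installed with ovs-ofctl [1]; the bridge name, ports and addresses in this sketch are illustrative placeholders, not values mandated by this test.

  import subprocess

  def add_example_match_action_flow(bridge="br0"):
      """Install a flow that matches on the IPv4 destination address and
      rewrites the destination MAC before forwarding (mod_dl_dst + output)."""
      flow = ("in_port=1,dl_type=0x0800,nw_dst=10.0.0.2,"
              "actions=mod_dl_dst:00:00:00:00:00:02,output:2")
      subprocess.check_call(["ovs-ofctl", "add-flow", bridge, flow])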

During this test, the DUT must perform the following operations on the traffic flow:

  • Perform packet parsing on the DUT’s ingress port.
  • Perform any relevant address look-ups on the DUT’s ingress ports.
  • Carry out one or more of the match actions specified above.

The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

Other values can be tested if required by the user. The selected frame sizes are those previously defined under Default Test Parameters.

The test can also be used to determine the average latency of the traffic when a match action is applied to packets in a flow. Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result.

Expected Result:

At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.

Metrics Collected:

The following are the metrics collected for this test:

  • The RFC 2544 Throughput in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

The metrics collected can be compared to that of the prerequisite test to determine the cost of the match action(s) in the pipeline.

Deployment scenario:

  • Physical → virtual switch → physical (and others are possible)
[1] ovs-ofctl - administer OpenFlow switches
[http://openvswitch.org/support/dist-docs/ovs-ofctl.8.txt]
Packet Latency tests

These tests will measure the store and forward latency as well as the packet delay variation for various packet types through the virtual switch. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

Test ID: LTD.PacketLatency.InitialPacketProcessingLatency

Title: Initial Packet Processing Latency

Prerequisite Test: N/A

Priority:

Description:

In some virtual switch architectures, the first packets of a flow will take the system longer to process than subsequent packets in the flow. This test determines the latency for these packets. The test will measure the latency of the packets as they are processed by the flow-setup-path of the DUT. There are two methods for this test: a recommended method and an alternative method that can be used if it is possible to disable the fastpath of the virtual switch.

Recommended method: This test will send 64,000 packets to the DUT, each belonging to a different flow. Average packet latency will be determined over the 64,000 packets.

Alternative method: This test will send a single packet to the DUT after a fixed interval of time. The time interval will be equivalent to the amount of time it takes for a flow to time out in the virtual switch plus 10%. Average packet latency will be determined over 1,000,000 packets.

This test is intended only for non-learning virtual switches; For learning virtual switches use RFC2889.

For this test, only unidirectional traffic is required.

Expected Result: The average latency for the initial packet of all flows should be greater than the latency of subsequent traffic.

Metrics Collected:

The following are the metrics collected for this test:

  • Average latency of the initial packets of all flows that are processed by the DUT.

Deployment scenario:

  • Physical → Virtual Switch → Physical.
Test ID: LTD.PacketDelayVariation.RFC3393.Soak

Title: Packet Delay Variation Soak Test

Prerequisite Tests: LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss)

Priority:

Description:

The aim of this test is to understand the distribution of packet delay variation for different frame sizes over an extended test duration and to determine if there are any outliers. To allow for an extended test duration, the test should ideally run for 24 hours or, if this is not possible, for at least 6 hours. For this test, each frame size must be sent at the highest possible throughput with 0% packet loss, as determined in the prerequisite test.
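
The RFC5481 PDV figure reported below can be computed per measurement interval roughly as follows (a minimal sketch; the inputs are assumed to be one-way delays collected for a single flow during the interval):

  def pdv_quantile(one_way_delays, quantile=0.99):
      """RFC 5481 PDV: each packet's delay relative to the minimum delay of the
      flow; returns the requested quantile (99th percentile by default)."""
      delays = sorted(one_way_delays)
      idx = min(len(delays) - 1, int(round(quantile * (len(delays) - 1))))
      return delays[idx] - delays[0]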

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The packet delay variation value for traffic passing through the DUT.
  • The RFC5481 PDV form of delay variation on the traffic flow, using the 99th percentile, for each 60s interval during the test.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
Scalability tests

The general aim of these tests is to understand the impact of large flow table size and flow lookups on throughput. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

Test ID: LTD.Scalability.Flows.RFC2544.0PacketLoss

Title: RFC 2544 0% loss Flow Scalability throughput test

Prerequisite Test: LTD.Throughput.RFC2544.PacketLossRatio, IF the delta Throughput between the single-flow RFC2544 test and this test with a variable number of flows is desired.

Priority:

Description:

The aim of this test is to measure how throughput changes as the number of flows in the DUT increases. The test will measure the throughput through the fastpath; as such, the flows need to be installed on the DUT before passing traffic.

For each frame size previously defined under Default Test Parameters and for each of the following number of flows:

  • 1,000
  • 2,000
  • 4,000
  • 8,000
  • 16,000
  • 32,000
  • 64,000
  • Max supported number of flows.

This test will be conducted under two conditions following the establishment of all flows as required by RFC 2544, regarding the flow expiration time-out:

  1. The time-out never expires during each trial.
  2. The time-out expires for all flows periodically. This would require a short time-out compared with flow re-appearance for a small number of flows, and may not be possible for all flow conditions.

The maximum 0% packet loss Throughput should be determined in a manner identical to LTD.Throughput.RFC2544.PacketLossRatio.

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum number of frames per second that can be forwarded at the specified number of flows and the specified frame size, with zero packet loss.
Test ID: LTD.MemoryBandwidth.RFC2544.0PacketLoss.Scalability

Title: RFC 2544 0% loss Memory Bandwidth Scalability test

Prerequisite Tests: LTD.Throughput.RFC2544.PacketLossRatio, IF the delta Throughput between an undisturbed RFC2544 test and this test with the Throughput affected by cache and memory bandwidth contention is desired.

Priority:

Description:

The aim of this test is to understand how the DUT’s performance is affected by cache sharing and memory bandwidth between processes.

During the test all cores not used by the vSwitch should be running a memory intensive application. This application should read and write random data to random addresses in unused physical memory. The random nature of the data and addresses is intended to consume cache, exercise main memory access (as opposed to cache) and exercise all memory buses equally. Furthermore:

  • the ratio of reads to writes should be recorded. A ratio of 1:1 SHOULD be used.
  • the reads and writes MUST be of cache-line size and be cache-line aligned.
  • in NUMA architectures memory access SHOULD be local to the core’s node. Whether only local memory or a mix of local and remote memory is used MUST be recorded.
  • the memory bandwidth (reads plus writes) used per-core MUST be recorded; the test MUST be run with a per-core memory bandwidth equal to half the maximum system memory bandwidth divided by the number of cores. The test MAY be run with other values for the per-core memory bandwidth.
  • the test MAY also be run with the memory intensive application running on all cores.

Under these conditions the DUT’s 0% packet loss throughput is determined as per LTD.Throughput.RFC2544.PacketLossRatio.
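
The per-core bandwidth target for the memory-intensive application follows directly from the rule above; the example figures are illustrative assumptions.

  def per_core_memory_bw_target(max_system_bw_gb_s, stressed_cores):
      """Half the maximum system memory bandwidth divided by the number of cores."""
      return (max_system_bw_gb_s / 2.0) / stressed_cores

  # e.g. 100 GB/s peak system bandwidth spread over 20 stressed cores -> 2.5 GB/s each
  target = per_core_memory_bw_target(100.0, 20)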

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The DUT’s 0% packet loss throughput in the presence of cache sharing and memory bandwidth between processes.
Test ID: LTD.Scalability.VNF.RFC2544.PacketLossRatio
Title: VNF Scalability RFC 2544 X% packet loss ratio Throughput and
Latency Test

Prerequisite Test: N/A

Priority:

Description:

This test determines the DUT’s throughput rate with X% traffic loss for a constant load (fixed length frames at a fixed interval time) when the number of VNFs on the DUT increases. The default loss percentages to be tested are:

  • X = 0%
  • X = 10^-7%

The minimum number of VNFs to be tested is 3.

Flow classification should be conducted with L2, L3 and L4 matching to understand the matching and scaling capability of the vSwitch. The matching fields which were used as part of the test should be reported as part of the benchmark report.

The vSwitch is responsible for forwarding frames between the VNFs

The SUT (vSwitch and VNF daisy chain) operation should be validated before running the test. This may be completed by running a burst or continuous stream of traffic through the SUT to ensure proper operation before a test.

Note: The traffic rate used to validate SUT operation should be low enough not to stress the SUT.

Note: Other values can be tested if required by the user.

Note: The same VNF should be used in the “daisy chain” formation. Each addition of a VNF should be conducted in a new test setup (the DUT is brought down, then the DUT is brought up again). An alternative approach would be to continue to add VNFs without bringing down the DUT. The approach used needs to be documented as part of the test report.

The selected frame sizes are those previously defined under Default Test Parameters. The test can also be used to determine the average latency of the traffic.

Under the RFC2544 test methodology, the test duration will include a number of trials; each trial should run for a minimum period of 60 seconds. A binary search methodology must be applied for each trial to obtain the final result for Throughput.

Expected Result: At the end of each trial, the presence or absence of loss determines the modification of offered load for the next trial, converging on a maximum rate, or RFC2544 Throughput with X% loss. The Throughput load is re-used in related RFC2544 tests and other tests.

If the test VNFs are rather light-weight in terms of processing, the test provides a view of multiple passes through the vswitch on logical interfaces. In other words, the test produces an optimistic count of daisy-chained VNFs, but the cumulative effect of traffic on the vSwitch is “real” (assuming that the vSwitch has some dedicated resources, and the effects on shared resources are understood).

Metrics Collected: The following are the metrics collected for this test:

  • The maximum Throughput in Frames Per Second (FPS) and Mbps of the DUT for each frame size with X% packet loss.
  • The average latency of the traffic flow when passing through the DUT and VNFs (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
Test ID: LTD.Scalability.VNF.RFC2544.PacketLossProfile

Title: VNF Scalability RFC 2544 Throughput and Latency Profile

Prerequisite Test: N/A

Priority:

Description:

This test reveals how throughput and latency degrade as the number of VNFs increases and the offered rate varies in the region of the DUT’s maximum forwarding rate as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss). For example, it can be used to determine if the degradation of throughput and latency as the number of VNFs and offered rate increases is slow and graceful, or sudden and severe. The minimum number of VNFs to be tested is 3.

The selected frame sizes are those previously defined under Default Test Parameters.

The offered traffic rate is described as a percentage delta with respect to the DUT’s RFC 2544 Throughput as determined by LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss case). A delta of 0% is equivalent to an offered traffic rate equal to the RFC 2544 Throughput; a delta of +50% indicates an offered rate half-way between the Throughput and line-rate, whereas a delta of -50% indicates an offered rate of half the maximum rate. Therefore the range of the delta figure is naturally bounded at -100% (zero offered traffic) and +100% (traffic offered at line rate).

The following deltas to the maximum forwarding rate should be applied:

  • -50%, -10%, 0%, +10% & +50%

Note: Other values can be tested if required by the user.

Note: The same VNF should be used in the “daisy chain” formation. Each addition of a VNF should be conducted in a new test setup (the DUT is brought down, then the DUT is brought up again). An alternative approach would be to continue to add VNFs without bringing down the DUT. The approach used needs to be documented as part of the test report.

Flow classification should be conducted with L2, L3 and L4 matching to understand the matching and scaling capability of the vSwitch. The matching fields which were used as part of the test should be reported as part of the benchmark report.

The SUT (vSwitch and VNF daisy chain) operation should be validated before running the test. This may be completed by running a burst or continuous stream of traffic through the SUT to ensure proper operation before a test.

Note: The traffic rate used to validate SUT operation should be low enough not to stress the SUT.

Expected Result: For each packet size a profile should be produced of how throughput and latency vary with offered rate.

Metrics Collected:

The following are the metrics collected for this test:

  • The forwarding rate in Frames Per Second (FPS) and Mbps of the DUT for each delta to the maximum forwarding rate and for each frame size.
  • The average latency for each delta to the maximum forwarding rate and for each frame size.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.
  • Any failures experienced (for example if the vSwitch crashes, stops processing packets, restarts or becomes unresponsive to commands) when the offered load is above Maximum Throughput MUST be recorded and reported with the results.
Activation tests

The general aim of these tests is to understand the capacity of, and the speed with which, the vSwitch can accommodate new flows.

Test ID: LTD.Activation.RFC2889.AddressCachingCapacity

Title: RFC2889 Address Caching Capacity Test

Prerequisite Test: N/A

Priority:

Description:

Please note this test is only applicable to virtual switches that are capable of MAC learning. The aim of this test is to determine the address caching capacity of the DUT for a constant load (fixed length frames at a fixed interval time). The selected frame sizes are those previously defined under Default Test Parameters.

In order to run this test, the aging time (that is, the maximum time the DUT will keep a learned address in its flow table) and a set of initial addresses (whose number should be >= 1 and <= the maximum number supported by the implementation) must be known. Please note that if the aging time is configurable, it must be longer than the time necessary to produce frames from the external source at the specified rate. If the aging time is fixed, the frame rate must be brought down to a value that the external source can produce in a time that is less than the aging time.

Learning Frames should be sent from an external source to the DUT to install a number of flows. The Learning Frames must have a fixed destination address and must vary the source address of the frames. The DUT should install flows in its flow table based on the varying source addresses. Frames should then be transmitted from an external source at a suitable frame rate to see if the DUT has properly learned all of the addresses. If there is no frame loss and no flooding, the number of addresses sent to the DUT should be increased and the test repeated until the maximum number of cached addresses supported by the DUT is determined.
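
One consequence of the aging-time constraint above is a lower bound on the Learning Frame rate, sketched here with an illustrative safety margin (the margin is an assumption, not a requirement of this test):

  def min_learning_frame_rate(num_addresses, aging_time_s, margin=1.1):
      """Lowest frame rate at which one Learning Frame per address can be sent
      before the first learned address ages out of the flow table."""
      return margin * num_addresses / aging_time_s

  # e.g. 16,000 addresses with a fixed 300 s aging time -> roughly 59 fps minimum
  rate_fps = min_learning_frame_rate(16000, 300)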

Expected Result:

Metrics collected:

The following are the metrics collected for this test:

  • Number of cached addresses supported by the DUT.
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → 2 x physical (one receiving, one listening).
Test ID: LTD.Activation.RFC2889.AddressLearningRate

Title: RFC2889 Address Learning Rate Test

Prerequisite Test: LTD.Activation.RFC2889.AddressCachingCapacity

Priority:

Description:

Please note this test is only applicable to virtual switches that are capable of MAC learning. The aim of this test is to determine the rate of address learning of the DUT for a constant load (fixed length frames at a fixed interval time). The selected frame sizes are those previously defined under Default Test Parameters; traffic should be sent with each IPv4/IPv6 address incremented by one. The rate at which the DUT learns a new address should be measured. The maximum caching capacity from LTD.Activation.RFC2889.AddressCachingCapacity should be taken into consideration as the maximum number of addresses for which the learning rate can be obtained.

Expected Result: It may be worthwhile to report the behaviour when operating beyond address capacity - some DUTs may be more friendly to new addresses than others.

Metrics collected:

The following are the metrics collected for this test:

  • The address learning rate of the DUT.

Deployment scenario:

  • Physical → virtual switch → 2 x physical (one receiving, one listening).
Coupling between control path and datapath Tests

The following tests aim to determine how tightly coupled the datapath and the control path are within a virtual switch. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

Test ID: LTD.CPDPCouplingFlowAddition

Title: Control Path and Datapath Coupling

Prerequisite Test:

Priority:

Description:

The aim of this test is to understand how exercising the DUT’s control path affects datapath performance.

Initially a certain number of flow table entries are installed in the vSwitch. Then, over the duration of an RFC2544 throughput test, flow-entries are added and removed at the rates specified below. No traffic is ‘hitting’ these flow-entries; they are simply added and removed.

The test MUST be repeated with the following initial number of flow-entries installed:

  • < 10
  • 1000
  • 100,000
  • 10,000,000 (or the maximum supported number of flow-entries)

The test MUST be repeated with the following rates of flow-entry addition and deletion per second:

  • 0
  • 1 (i.e. 1 addition plus 1 deletion)
  • 100
  • 10,000
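
A sketch of the control-path churn for an Open vSwitch DUT is shown below; the bridge name, match fields and pacing loop are illustrative assumptions, and at the higher rates a batched or controller-driven interface would be needed rather than one ovs-ofctl process per operation.

  import subprocess, time

  def churn_flows(bridge, adds_per_second, duration_s):
      """Add and then delete one unused flow entry at the requested rate,
      exercising the control path while the RFC2544 test is running."""
      period = 1.0 / adds_per_second
      deadline = time.time() + duration_s
      i = 0
      while time.time() < deadline:
          match = "dl_type=0x0800,nw_dst=192.0.2.%d" % (i % 250 + 1)
          subprocess.check_call(["ovs-ofctl", "add-flow", bridge, match + ",actions=drop"])
          subprocess.check_call(["ovs-ofctl", "del-flows", bridge, match])
          i += 1
          time.sleep(period)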

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • The maximum forwarding rate in Frames Per Second (FPS) and Mbps of the DUT.
  • The average latency of the traffic flow when passing through the DUT (if testing for latency, note that this average is different from the test specified in Section 26.3 of RFC2544).
  • CPU and memory utilization may also be collected as part of this test, to determine the vSwitch’s performance footprint on the system.

Deployment scenario:

  • Physical → virtual switch → physical.
CPU and memory consumption

The following tests will profile a virtual switch’s CPU and memory utilization under various loads and circumstances. The following list is not exhaustive but should indicate the type of tests that should be required. It is expected that more will be added.

Test ID: LTD.Stress.RFC2544.0PacketLoss

Title: RFC 2544 0% Loss CPU OR Memory Stress Test

Prerequisite Test:

Priority:

Description:

The aim of this test is to understand the overall performance of the system when a CPU or Memory intensive application is run on the same DUT as the Virtual Switch. For each frame size, an LTD.Throughput.RFC2544.PacketLossRatio (0% Packet Loss) test should be performed. Throughout the entire test, a CPU or Memory intensive application should be run on all cores on the system not in use by the Virtual Switch. For NUMA systems, only cores on the same NUMA node are loaded.

It is recommended that stress-ng be used for loading the non-Virtual Switch cores but any stress tool MAY be used.
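
As an illustration, stress-ng can be pinned to the non-vSwitch cores with taskset; the core list, worker count and stressor parameters below are example assumptions only.

  import subprocess

  def start_noise(cores="4-15", workers=12, mode="cpu"):
      """Launch a CPU or memory stressor on the given cores and return the process."""
      stressor = (["--cpu", str(workers)] if mode == "cpu"
                  else ["--vm", str(workers), "--vm-bytes", "1G"])
      return subprocess.Popen(["taskset", "-c", cores, "stress-ng"] + stressor)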

Expected Result:

Metrics Collected:

The following are the metrics collected for this test:

  • Memory and CPU utilization of the cores running the Virtual Switch.
  • The number and identity of the cores allocated to the Virtual Switch.
  • The configuration of the stress tool (for example, the command line parameters used to start it).
Note: When reporting the results, Stress in the test ID can be replaced with the name of the component being stressed: LTD.CPU.RFC2544.0PacketLoss or LTD.Memory.RFC2544.0PacketLoss.
Summary List of Tests
  1. Throughput tests
  • Test ID: LTD.Throughput.RFC2544.PacketLossRatio
  • Test ID: LTD.Throughput.RFC2544.PacketLossRatioFrameModification
  • Test ID: LTD.Throughput.RFC2544.Profile
  • Test ID: LTD.Throughput.RFC2544.SystemRecoveryTime
  • Test ID: LTD.Throughput.RFC2544.BackToBackFrames
  • Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoak
  • Test ID: LTD.Throughput.RFC2889.MaxForwardingRateSoakFrameModification
  • Test ID: LTD.Throughput.RFC6201.ResetTime
  • Test ID: LTD.Throughput.RFC2889.MaxForwardingRate
  • Test ID: LTD.Throughput.RFC2889.ForwardPressure
  • Test ID: LTD.Throughput.RFC2889.ErrorFramesFiltering
  • Test ID: LTD.Throughput.RFC2889.BroadcastFrameForwarding
  • Test ID: LTD.Throughput.RFC2544.WorstN-BestN
  • Test ID: LTD.Throughput.Overlay.Network.<tech>.RFC2544.PacketLossRatio
  • Test ID: LTD.Throughput.RFC2544.MatchAction.PacketLossRatio
  2. Packet Latency tests
  • Test ID: LTD.PacketLatency.InitialPacketProcessingLatency
  • Test ID: LTD.PacketDelayVariation.RFC3393.Soak
  3. Scalability tests
  • Test ID: LTD.Scalability.Flows.RFC2544.0PacketLoss
  • Test ID: LTD.MemoryBandwidth.RFC2544.0PacketLoss.Scalability
  • Test ID: LTD.Scalability.VNF.RFC2544.PacketLossProfile
  • Test ID: LTD.Scalability.VNF.RFC2544.PacketLossRatio
  4. Activation tests
  • Test ID: LTD.Activation.RFC2889.AddressCachingCapacity
  • Test ID: LTD.Activation.RFC2889.AddressLearningRate
  5. Coupling between control path and datapath Tests
  • Test ID: LTD.CPDPCouplingFlowAddition
  6. CPU and memory consumption
  • Test ID: LTD.Stress.RFC2544.0PacketLoss
4. VSPERF LEVEL TEST PLAN (LTP)
Introduction

The objective of the OPNFV project titled Characterize vSwitch Performance for Telco NFV Use Cases is to evaluate the performance of virtual switches to identify their suitability for a Telco Network Function Virtualization (NFV) environment. The intention of this Level Test Plan (LTP) document is to specify the scope, approach, resources, and schedule of the virtual switch performance benchmarking activities in OPNFV. The test cases will be identified in a separate document called the Level Test Design (LTD) document.

This document is currently in draft form.

Document identifier

The document id will be used to uniquely identify versions of the LTP. The format for the document id will be: OPNFV_vswitchperf_LTP_REL_STATUS, where the status is one of: draft, reviewed, corrected or final. The document id for this version of the LTP is: OPNFV_vswitchperf_LTP_Colorado_REVIEWED.

Scope

The main purpose of this project is to specify a suite of performance tests in order to objectively measure the current packet transfer characteristics of a virtual switch in the NFVI. The intent of the project is to facilitate the performance testing of any virtual switch. Thus, a generic suite of tests shall be developed, with no hard dependencies to a single implementation. In addition, the test case suite shall be architecture independent.

The test cases developed in this project shall not form part of a separate test framework; all of these tests may be inserted into the Continuous Integration Test Framework and/or the Platform Functionality Test Framework, if a vSwitch becomes a standard component of an OPNFV release.

Level in the overall sequence

The level of testing conducted by vswitchperf in the overall testing sequence (among all the testing projects in OPNFV) is the performance benchmarking of a specific component (the vswitch) in the OPNFV platform. It is expected that this testing will follow on from the functional and integration testing conducted by other testing projects in OPNFV, namely Functest and Yardstick.

Test classes and overall test conditions

A benchmark is defined by the IETF as: A standardized test that serves as a basis for performance evaluation and comparison. It’s important to note that benchmarks are not Functional tests. They do not provide PASS/FAIL criteria, and most importantly ARE NOT performed on live networks, or performed with live network traffic.

In order to determine the packet transfer characteristics of a virtual switch, the benchmarking tests will be broken down into the following categories:

  • Throughput Tests to measure the maximum forwarding rate (in frames per second or fps) and bit rate (in Mbps) for a constant load (as defined by RFC1242) without traffic loss.
  • Packet and Frame Delay Tests to measure average, min and max packet and frame delay for constant loads.
  • Stream Performance Tests (TCP, UDP) to measure bulk data transfer performance, i.e. how fast systems can send and receive data through the virtual switch.
  • Request/Response Performance Tests (TCP, UDP) to measure the transaction rate through the virtual switch.
  • Packet Delay Tests to understand latency distribution for different packet sizes and over an extended test run to uncover outliers.
  • Scalability Tests to understand how the virtual switch performs as the number of flows, active ports, and the complexity of the forwarding logic’s configuration it has to deal with increase.
  • Control Path and Datapath Coupling Tests, to understand how closely coupled the datapath and the control path are as well as the effect of this coupling on the performance of the DUT.
  • CPU and Memory Consumption Tests to understand the virtual switch’s footprint on the system; this includes:
    • CPU core utilization.
    • CPU cache utilization.
    • Memory footprint.
    • System bus (QPI, PCI, ..) utilization.
    • Memory lanes utilization.
    • CPU cycles consumed per packet.
  • Time To Establish Flows Tests.
  • Noisy Neighbour Tests, to understand the effects of resource sharing on the performance of a virtual switch.

Note: some of the tests above can be conducted simultaneously where the combined results would be insightful, for example Packet/Frame Delay and Scalability.

Details of the Level Test Plan

This section describes the following items:

  • Test items and their identifiers (TestItems)
  • Test Traceability Matrix (TestMatrix)
  • Features to be tested (FeaturesToBeTested)
  • Features not to be tested (FeaturesNotToBeTested)
  • Approach (Approach)
  • Item pass/fail criteria (PassFailCriteria)
  • Suspension criteria and resumption requirements (SuspensionResumptionReqs)

Test items and their identifiers

The test items/applications vsperf is trying to test are virtual switches, and in particular their performance in an NFV environment. vsperf will first try to measure the maximum achievable performance of a virtual switch and will then focus on use cases that are as close to real-life deployment scenarios as possible.

Test Traceability Matrix

vswitchperf leverages the “3x3” matrix (introduced in https://tools.ietf.org/html/draft-ietf-bmwg-virtual-net-02) to achieve test traceability. The matrix was expanded to 3x4 to accommodate scale metrics when displaying the coverage of many metrics/benchmarks. Test case coverage in the LTD is tracked using the following categories:

                  SPEED   ACCURACY   RELIABILITY   SCALE
  Activation        X         X           X          X
  Operation         X         X           X          X
  De-activation

X denotes a test category that has 1 or more test cases defined.

Features to be tested

Characterizing virtual switches (i.e. Device Under Test (DUT) in this document) includes measuring the following performance metrics:

  • Throughput as defined by RFC1242: The maximum rate at which none of the offered frames are dropped by the DUT. The maximum frame rate and bit rate that can be transmitted by the DUT without any error should be recorded. Note there is an equivalent bit rate and a specific layer at which the payloads contribute to the bits. Errors and improperly formed frames or packets are dropped.
  • Packet delay introduced by the DUT and its cumulative effect on E2E networks. Frame delay can be measured equivalently.
  • Packet delay variation: measured from the perspective of the VNF/application. Packet delay variation is sometimes called “jitter”. However, we will avoid the term “jitter” as the term holds different meanings for different groups of people. In this document we will simply use the term packet delay variation. The preferred form for this metric is the PDV form of delay variation defined in RFC5481. The most relevant measurement of PDV considers the delay variation of a single user flow, as this will be relevant to the size of end-system buffers to compensate for delay variation. The measurement system’s ability to store the delays of individual packets in the flow of interest is a key factor that determines the specific measurement method. At the outset, it is ideal to view the complete PDV distribution. Systems that can capture and store packets and their delays have the freedom to calculate the reference minimum delay and to determine various quantiles of the PDV distribution accurately (in post-measurement processing routines). Systems without storage must apply algorithms to calculate delay and statistical measurements on the fly. For example, a system may store temporary estimates of the minimum delay and the set of (100) packets with the longest delays during measurement (to calculate a high quantile), and update these sets with new values periodically. In some cases, a limited number of delay histogram bins will be available, and the bin limits will need to be set using results from repeated experiments. See section 8 of RFC5481.
  • Packet loss (within a configured waiting time at the receiver): All packets sent to the DUT should be accounted for.
  • Burst behaviour: measures the ability of the DUT to buffer packets.
  • Packet re-ordering: measures the ability of the device under test to maintain sending order throughout transfer to the destination.
  • Packet correctness: packets or Frames must be well-formed, in that they include all required fields, conform to length requirements, pass integrity checks, etc.
  • Availability and capacity of the DUT, i.e. when the DUT is fully “up” and connected, the following measurements should be captured for the DUT without any network packet load:
    • Includes average power consumption of the CPUs (in various power states) and system over specified period of time. Time period should not be less than 60 seconds.
    • Includes average per core CPU utilization over specified period of time. Time period should not be less than 60 seconds.
    • Includes the number of NIC interfaces supported.
    • Includes headroom of VM workload processing cores (i.e. available for applications).

Features not to be tested

vsperf doesn’t intend to define or perform any functional tests. The aim is to focus on performance.

Approach

The testing approach adopted by the vswitchperf project is black box testing, meaning the test inputs can be generated and the outputs captured and completely evaluated from outside of the System Under Test. Some metrics can be collected on the SUT, such as CPU or memory utilization, if the collection has no or minimal impact on the benchmark. This section will look at the deployment scenarios and the general methodology used by vswitchperf. In addition, this section will also specify the details of the Test Report that must be collected for each of the test cases.

Deployment Scenarios

The following represents possible deployment test scenarios which can help to determine the performance of both the virtual switch and the datapaths to physical ports (to NICs) and to logical ports (to VNFs):

Physical port → vSwitch → physical port
                                                     _
+--------------------------------------------------+  |
|              +--------------------+              |  |
|              |                    |              |  |
|              |                    v              |  |  Host
|   +--------------+            +--------------+   |  |
|   |   phy port   |  vSwitch   |   phy port   |   |  |
+---+--------------+------------+--------------+---+ _|
           ^                           :
           |                           |
           :                           v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
Physical port → vSwitch → VNF → vSwitch → physical port
                                                      _
+---------------------------------------------------+  |
|                                                   |  |
|   +-------------------------------------------+   |  |
|   |                 Application               |   |  |
|   +-------------------------------------------+   |  |
|       ^                                  :        |  |
|       |                                  |        |  |  Guest
|       :                                  v        |  |
|   +---------------+           +---------------+   |  |
|   | logical port 0|           | logical port 1|   |  |
+---+---------------+-----------+---------------+---+ _|
        ^                                  :
        |                                  |
        :                                  v         _
+---+---------------+----------+---------------+---+  |
|   | logical port 0|          | logical port 1|   |  |
|   +---------------+          +---------------+   |  |
|       ^                                  :       |  |
|       |                                  |       |  |  Host
|       :                                  v       |  |
|   +--------------+            +--------------+   |  |
|   |   phy port   |  vSwitch   |   phy port   |   |  |
+---+--------------+------------+--------------+---+ _|
           ^                           :
           |                           |
           :                           v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
Physical port → vSwitch → VNF → vSwitch → VNF → vSwitch → physical port
                                                   _
+----------------------+  +----------------------+  |
|   Guest 1            |  |   Guest 2            |  |
|   +---------------+  |  |   +---------------+  |  |
|   |  Application  |  |  |   |  Application  |  |  |
|   +---------------+  |  |   +---------------+  |  |
|       ^       |      |  |       ^       |      |  |
|       |       v      |  |       |       v      |  |  Guests
|   +---------------+  |  |   +---------------+  |  |
|   | logical ports |  |  |   | logical ports |  |  |
|   |   0       1   |  |  |   |   0       1   |  |  |
+---+---------------+--+  +---+---------------+--+ _|
        ^       :                 ^       :
        |       |                 |       |
        :       v                 :       v        _
+---+---------------+---------+---------------+--+  |
|   |   0       1   |         |   3       4   |  |  |
|   | logical ports |         | logical ports |  |  |
|   +---------------+         +---------------+  |  |
|       ^       |                 ^       |      |  |  Host
|       |       L-----------------+       v      |  |
|   +--------------+          +--------------+   |  |
|   |   phy ports  | vSwitch  |   phy ports  |   |  |
+---+--------------+----------+--------------+---+ _|
        ^       ^                 :       :
        |       |                 |       |
        :       :                 v       v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
Physical port → VNF → vSwitch → VNF → physical port
                                                    _
+----------------------+  +----------------------+   |
|   Guest 1            |  |   Guest 2            |   |
|+-------------------+ |  | +-------------------+|   |
||     Application   | |  | |     Application   ||   |
|+-------------------+ |  | +-------------------+|   |
|       ^       |      |  |       ^       |      |   |  Guests
|       |       v      |  |       |       v      |   |
|+-------------------+ |  | +-------------------+|   |
||   logical ports   | |  | |   logical ports   ||   |
||  0              1 | |  | | 0              1  ||   |
++--------------------++  ++--------------------++  _|
    ^              :          ^              :
(PCI passthrough)  |          |     (PCI passthrough)
    |              v          :              |      _
+--------++------------+-+------------++---------+   |
|   |    ||        0   | |    1       ||     |   |   |
|   |    ||logical port| |logical port||     |   |   |
|   |    |+------------+ +------------+|     |   |   |
|   |    |     |                 ^     |     |   |   |
|   |    |     L-----------------+     |     |   |   |
|   |    |                             |     |   |   |  Host
|   |    |           vSwitch           |     |   |   |
|   |    +-----------------------------+     |   |   |
|   |                                        |   |   |
|   |                                        v   |   |
| +--------------+              +--------------+ |   |
| | phy port/VF  |              | phy port/VF  | |   |
+-+--------------+--------------+--------------+-+  _|
    ^                                        :
    |                                        |
    :                                        v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
Physical port → vSwitch → VNF
                                                      _
+---------------------------------------------------+  |
|                                                   |  |
|   +-------------------------------------------+   |  |
|   |                 Application               |   |  |
|   +-------------------------------------------+   |  |
|       ^                                           |  |
|       |                                           |  |  Guest
|       :                                           |  |
|   +---------------+                               |  |
|   | logical port 0|                               |  |
+---+---------------+-------------------------------+ _|
        ^
        |
        :                                            _
+---+---------------+------------------------------+  |
|   | logical port 0|                              |  |
|   +---------------+                              |  |
|       ^                                          |  |
|       |                                          |  |  Host
|       :                                          |  |
|   +--------------+                               |  |
|   |   phy port   |  vSwitch                      |  |
+---+--------------+-------------------------------+ _|
           ^
           |
           :
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
VNF → vSwitch → physical port
                                                      _
+---------------------------------------------------+  |
|                                                   |  |
|   +-------------------------------------------+   |  |
|   |                 Application               |   |  |
|   +-------------------------------------------+   |  |
|                                          :        |  |
|                                          |        |  |  Guest
|                                          v        |  |
|                               +---------------+   |  |
|                               | logical port  |   |  |
+-------------------------------+---------------+---+ _|
                                           :
                                           |
                                           v         _
+------------------------------+---------------+---+  |
|                              | logical port  |   |  |
|                              +---------------+   |  |
|                                          :       |  |
|                                          |       |  |  Host
|                                          v       |  |
|                               +--------------+   |  |
|                     vSwitch   |   phy port   |   |  |
+-------------------------------+--------------+---+ _|
                                       :
                                       |
                                       v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+
VNF → vSwitch → VNF → vSwitch
                                                         _
+-------------------------+  +-------------------------+  |
|   Guest 1               |  |   Guest 2               |  |
|   +-----------------+   |  |   +-----------------+   |  |
|   |   Application   |   |  |   |   Application   |   |  |
|   +-----------------+   |  |   +-----------------+   |  |
|                :        |  |       ^                 |  |
|                |        |  |       |                 |  |  Guest
|                v        |  |       :                 |  |
|     +---------------+   |  |   +---------------+     |  |
|     | logical port 0|   |  |   | logical port 0|     |  |
+-----+---------------+---+  +---+---------------+-----+ _|
                :                    ^
                |                    |
                v                    :                    _
+----+---------------+------------+---------------+-----+  |
|    |     port 0    |            |     port 1    |     |  |
|    +---------------+            +---------------+     |  |
|              :                    ^                   |  |
|              |                    |                   |  |  Host
|              +--------------------+                   |  |
|                                                       |  |
|                     vswitch                           |  |
+-------------------------------------------------------+ _|

HOST 1(Physical port → virtual switch → VNF → virtual switch → Physical port) → HOST 2(Physical port → virtual switch → VNF → virtual switch → Physical port)

HOST 1 (PVP) → HOST 2 (PVP)
                                                   _
+----------------------+  +----------------------+  |
|   Guest 1            |  |   Guest 2            |  |
|   +---------------+  |  |   +---------------+  |  |
|   |  Application  |  |  |   |  Application  |  |  |
|   +---------------+  |  |   +---------------+  |  |
|       ^       |      |  |       ^       |      |  |
|       |       v      |  |       |       v      |  |  Guests
|   +---------------+  |  |   +---------------+  |  |
|   | logical ports |  |  |   | logical ports |  |  |
|   |   0       1   |  |  |   |   0       1   |  |  |
+---+---------------+--+  +---+---------------+--+ _|
        ^       :                 ^       :
        |       |                 |       |
        :       v                 :       v        _
+---+---------------+--+  +---+---------------+--+  |
|   |   0       1   |  |  |   |   3       4   |  |  |
|   | logical ports |  |  |   | logical ports |  |  |
|   +---------------+  |  |   +---------------+  |  |
|       ^       |      |  |       ^       |      |  |  Hosts
|       |       v      |  |       |       v      |  |
|   +--------------+   |  |   +--------------+   |  |
|   |   phy ports  |   |  |   |   phy ports  |   |  |
+---+--------------+---+  +---+--------------+---+ _|
        ^       :                 :       :
        |       +-----------------+       |
        :                                 v
+--------------------------------------------------+
|                                                  |
|                traffic generator                 |
|                                                  |
+--------------------------------------------------+

Note: For tests where the traffic generator and/or measurement receiver are implemented on a VM and connected to the virtual switch through a vNIC, the issues of shared resources and interactions between the measurement devices and the device under test must be considered.

Note: Some RFC 2889 tests require a full-mesh sending and receiving pattern involving more than two ports. This possibility is illustrated in the Physical port → vSwitch → VNF → vSwitch → VNF → vSwitch → physical port diagram above (with 2 sending and 2 receiving ports, though all ports could be used bi-directionally).

Note: When Deployment Scenarios are used in RFC 2889 address learning or cache capacity testing, an additional port from the vSwitch must be connected to the test device. This port is used to listen for flooded frames.

General Methodology:

To establish the baseline performance of the virtual switch, tests would initially be run with a simple workload in the VNF (the recommended simple workload VNF would be DPDK’s testpmd application forwarding packets in a VM, or vloop_vnf, a simple kernel module that forwards traffic between two network interfaces inside the virtualized environment while bypassing the networking stack). Subsequently, the tests would also be executed with a real Telco workload running in the VNF, which would exercise the virtual switch in the context of higher level Telco NFV use cases, and prove that its underlying characteristics and behaviour can be measured and validated. Suitable real Telco workload VNFs are yet to be identified.

Default Test Parameters

The following list identifies the default parameters for the suite of tests:

  • Reference application: Simple forwarding or Open Source VNF.
  • Frame size (bytes): 64, 128, 256, 512, 1024, 1280, 1518, 2K, 4K OR Packet size based on use-case (e.g. RTP 64B, 256B) OR Mix of packet sizes as maintained by the Functest project (https://wiki.opnfv.org/traffic_profile_management).
  • Reordering check: Tests should confirm that packets within a flow are not reordered.
  • Duplex: Unidirectional / Bidirectional. Default: Full duplex with traffic transmitting in both directions, as network traffic generally does not flow in a single direction. By default the data rate of transmitted traffic should be the same in both directions; please note that asymmetric traffic (e.g. downlink-heavy) tests will be mentioned explicitly for the relevant test cases.
  • Number of Flows: Default for non-scalability tests is a single flow. For scalability tests the goal is to test with the maximum number of supported flows, but where possible tests will go up to 10 million flows. Start with a single flow and scale up. By default flows should be added sequentially; tests that add flows simultaneously will explicitly call out their flow addition behaviour. Packets are generated across the flows uniformly with no burstiness. Multi-core tests should consider the number of packet flows based on the vSwitch/VNF multi-thread implementation and behaviour.
  • Traffic Types: UDP, SCTP, RTP and GTP traffic.
  • Deployment scenarios are:
    • Physical → virtual switch → physical.
    • Physical → virtual switch → VNF → virtual switch → physical.
    • Physical → virtual switch → VNF → virtual switch → VNF → virtual switch → physical.
    • Physical → VNF → virtual switch → VNF → physical.
    • Physical → virtual switch → VNF.
    • VNF → virtual switch → physical.
    • VNF → virtual switch → VNF.

Tests MUST have these parameters unless otherwise stated. Test cases with non-default parameters will be stated explicitly.

Note: For throughput tests unless stated otherwise, test configurations should ensure that traffic traverses the installed flows through the virtual switch, i.e. flows are installed and have an appropriate time out that doesn’t expire before packet transmission starts.
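
Purely as an illustrative sketch, a test case could state its non-default parameters in the Python-style configuration syntax used by vsperf. TRAFFICGEN_PKT_SIZES appears later in this document (vsperf CI section); the duration parameter name below is an assumption used only for illustration:

    # Illustrative, hypothetical override of default test parameters.
    # TRAFFICGEN_PKT_SIZES is referenced in the vsperf CI section of this
    # document; TRAFFICGEN_DURATION is assumed here for illustration only.
    TRAFFICGEN_PKT_SIZES = (64, 128, 256, 512, 1024, 1518)  # frame sizes in bytes
    TRAFFICGEN_DURATION = 60                                 # seconds per trial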

Flow Classification

Virtual switches classify packets into flows by processing and matching particular header fields in the packet/frame and/or the input port where the packets/frames arrived. The vSwitch then carries out an action on the group of packets that match the classification parameters. Thus a flow is considered to be a sequence of packets that have a shared set of header field values or have arrived on the same port and have the same action applied to them. Performance results can vary based on the parameters the vSwitch uses to match for a flow. The recommended flow classification parameters for L3 vSwitch performance tests are: the input port, the source IP address, the destination IP address and the Ethernet protocol type field. It is essential to increase the flow time-out time on a vSwitch before conducting any performance tests that do not measure the flow set-up time. Normally the first packet of a particular flow will install the flow in the vSwitch which adds an additional latency, subsequent packets of the same flow are not subject to this latency if the flow is already installed on the vSwitch.
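
To make the notion of a flow concrete, the following minimal Python sketch groups packets by the recommended L3 classification parameters. The field names are hypothetical and not tied to any particular vSwitch implementation:

    # Illustrative sketch only (not part of the LTD): group packets into flows
    # using the recommended L3 classification parameters. Field names are
    # hypothetical and not tied to any particular vSwitch implementation.
    from collections import defaultdict

    def flow_key(pkt):
        """Tuple of input port, source IP, destination IP and EtherType."""
        return (pkt["in_port"], pkt["ip_src"], pkt["ip_dst"], pkt["eth_type"])

    def group_into_flows(packets):
        flows = defaultdict(list)
        for pkt in packets:
            flows[flow_key(pkt)].append(pkt)
        return flows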

Test Priority

Tests will be assigned a priority in order to determine which tests should be implemented immediately and which test implementations can be deferred.

Priority can be one of the following types:

  • Urgent: Must be implemented immediately.
  • High: Must be implemented in the next release.
  • Medium: May be implemented after the release.
  • Low: May or may not be implemented at all.

SUT Setup

The SUT should be configured to its “default” state. The SUT’s configuration or set-up must not change between tests in any way other than what is required to do the test. All supported protocols must be configured and enabled for each test set up.

Port Configuration

The DUT should be configured with n ports where n is a multiple of 2. Half of the ports on the DUT should be used as ingress ports and the other half of the ports on the DUT should be used as egress ports. Where a DUT has more than 2 ports, the ingress data streams should be set-up so that they transmit packets to the egress ports in sequence so that there is an even distribution of traffic across ports. For example, if a DUT has 4 ports 0(ingress), 1(ingress), 2(egress) and 3(egress), the traffic stream directed at port 0 should output a packet to port 2 followed by a packet to port 3. The traffic stream directed at port 1 should also output a packet to port 2 followed by a packet to port 3.
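
As an illustration of the distribution described above (the only assumption is the example 4-port DUT from the text), the egress port chosen for each successive packet of an ingress stream can be generated round-robin:

    # Illustrative sketch of the even traffic distribution described above for a
    # DUT with ingress ports 0,1 and egress ports 2,3 (the example values from
    # the text).
    from itertools import cycle

    def egress_sequence(egress_ports, n_packets):
        """Egress port chosen for each successive packet of one ingress stream."""
        ports = cycle(egress_ports)
        return [next(ports) for _ in range(n_packets)]

    print(egress_sequence([2, 3], 4))   # [2, 3, 2, 3] for both ingress streams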

Frame Formats

Frame formats Layer 2 (data link layer) protocols

  • Ethernet II
+---------------------------+-----------+
| Ethernet Header | Payload | Check Sum |
+-----------------+---------+-----------+
|_________________|_________|___________|
      14 Bytes     46 - 1500   4 Bytes
                     Bytes

Layer 3 (network layer) protocols

  • IPv4
+-----------------+-----------+---------+-----------+
| Ethernet Header | IP Header | Payload | Checksum  |
+-----------------+-----------+---------+-----------+
|_________________|___________|_________|___________|
      14 Bytes       20 bytes  26 - 1480   4 Bytes
                                 Bytes
  • IPv6
+-----------------+-----------+---------+-----------+
| Ethernet Header | IP Header | Payload | Checksum  |
+-----------------+-----------+---------+-----------+
|_________________|___________|_________|___________|
      14 Bytes       40 bytes  26 - 1460   4 Bytes
                                 Bytes

Layer 4 (transport layer) protocols

  • TCP
  • UDP
  • SCTP
+-----------------+-----------+-----------------+---------+-----------+
| Ethernet Header | IP Header | Layer 4 Header  | Payload | Checksum  |
+-----------------+-----------+-----------------+---------+-----------+
|_________________|___________|_________________|_________|___________|
      14 Bytes      40 bytes      20 Bytes       6 - 1460   4 Bytes
                                                  Bytes

Layer 5 (application layer) protocols

  • RTP
  • GTP
+-----------------+-----------+-----------------+---------+-----------+
| Ethernet Header | IP Header | Layer 4 Header  | Payload | Checksum  |
+-----------------+-----------+-----------------+---------+-----------+
|_________________|___________|_________________|_________|___________|
      14 Bytes      20 bytes     20 Bytes        >= 6 Bytes   4 Bytes

Packet Throughput

There is a difference between an Ethernet frame, an IP packet, and a UDP datagram. In the seven-layer OSI model of computer networking, packet refers to a data unit at layer 3 (network layer). The correct term for a data unit at layer 2 (data link layer) is a frame, and at layer 4 (transport layer) is a segment or datagram.

Important concepts related to 10GbE performance are frame rate and throughput. The MAC bit rate of 10GbE, defined in the IEEE 802.3ae standard, is 10 billion bits per second. Frame rate is based on the bit rate and frame format definitions. Throughput, defined in IETF RFC 1242, is the highest rate at which the system under test can forward the offered load, without loss.

The frame rate for 10GbE is determined by a formula that divides the 10 billion bits per second by the preamble + frame length + inter-frame gap.

The maximum frame rate is calculated using the minimum values of the following parameters, as described in the IEEE 802.3ae standard:

  • Preamble: 8 bytes * 8 = 64 bits
  • Frame Length: 64 bytes (minimum) * 8 = 512 bits
  • Inter-frame Gap: 12 bytes (minimum) * 8 = 96 bits

Therefore, Maximum Frame Rate (64B Frames) = MAC Transmit Bit Rate / (Preamble + Frame Length + Inter-frame Gap) = 10,000,000,000 / (64 + 512 + 96) = 10,000,000,000 / 672 = 14,880,952.38 frames per second (fps)
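
The same formula can be evaluated for any frame size. The short Python sketch below is only a worked check of the calculation above and is not part of the test suite:

    # Worked check of the maximum frame rate formula above (10GbE line rate).
    MAC_BIT_RATE = 10_000_000_000       # bits per second
    PREAMBLE_BITS = 8 * 8               # 8-byte preamble
    IFG_BITS = 12 * 8                   # 12-byte inter-frame gap

    def max_frame_rate(frame_bytes):
        """Theoretical maximum frame rate (fps) for a given frame size."""
        return MAC_BIT_RATE / (PREAMBLE_BITS + frame_bytes * 8 + IFG_BITS)

    print(round(max_frame_rate(64), 2))     # 14880952.38 fps for 64B frames
    print(round(max_frame_rate(1518), 2))   # 812743.82 fps for 1518B frames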

RFCs for testing virtual switch performance

The starting point for defining the suite of tests for benchmarking the performance of a virtual switch is to take existing RFCs and standards that were designed to test their physical counterparts and adapt them for testing virtual switches. The rationale behind this is to establish a fair comparison between the performance of virtual and physical switches. This section outlines the RFCs that are used by this specification.

RFC 1242 Benchmarking Terminology for Network Interconnection Devices

RFC 1242 defines the terminology that is used in describing performance benchmarking tests and their results. Definitions and discussions covered include: back-to-back, bridge, bridge/router, constant load, data link frame size, frame loss rate, inter-frame gap, latency, and many more.

RFC 2544 Benchmarking Methodology for Network Interconnect Devices

RFC 2544 outlines a benchmarking methodology for network Interconnect Devices. The methodology results in performance metrics such as latency, frame loss percentage, and maximum data throughput.

In this document network “throughput” (measured in millions of frames per second) is based on RFC 2544, unless otherwise noted. Frame size refers to Ethernet frames ranging from smallest frames of 64 bytes to largest frames of 9K bytes.

Types of tests are:

  1. Throughput test defines the maximum number of frames per second that can be transmitted without any error, or 0% loss ratio. In some Throughput tests (and those tests with long duration), evaluation of an additional frame loss ratio is suggested. The current ratio (10^-7 %) is based on understanding the typical user-to-user packet loss ratio needed for good application performance and recognizing that a single transfer through a vswitch must contribute only a tiny fraction of user-to-user loss. Further, the ratio 10^-7 % also recognizes practical limitations when measuring loss ratio (a worked interpretation of this ratio is sketched after this list).
  2. Latency test measures the time required for a frame to travel from the originating device through the network to the destination device. Please note that RFC2544 Latency measurement will be superseded with a measurement of average latency over all successfully transferred packets or frames.
  3. Frame loss test measures the network’s response in overload conditions - a critical indicator of the network’s ability to support real-time applications in which a large amount of frame loss will rapidly degrade service quality.
  4. Burst test assesses the buffering capability of a virtual switch. It measures the maximum number of frames received at full line rate before a frame is lost. In carrier Ethernet networks, this measurement validates the excess information rate (EIR) as defined in many SLAs.
  5. System recovery to characterize speed of recovery from an overload condition.
  6. Reset to characterize speed of recovery from device or software reset. This type of test has been updated by RFC 6201; as such, the methodology defined by this specification will be that of RFC 6201.
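
As a purely illustrative interpretation of the 10^-7 % ratio mentioned in the Throughput test above (the 60-second duration and 64B line rate are assumptions chosen for this example, not requirements of this plan):

    # Illustrative interpretation of the 10^-7 % loss ratio; the 60 s duration
    # and 64B line rate are assumptions used only for this worked example.
    LOSS_RATIO = 1e-7 / 100            # 10^-7 percent expressed as a fraction
    frame_rate_fps = 14_880_952        # 64B maximum frame rate on 10GbE (see above)
    duration_s = 60

    frames_sent = frame_rate_fps * duration_s
    allowed_lost = frames_sent * LOSS_RATIO
    print(frames_sent, round(allowed_lost, 2))   # 892857120 frames, ~0.89 frames

At that rate and duration the ratio corresponds to less than one lost frame, which illustrates why longer test durations are suggested when evaluating it.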

Although not included in the defined RFC 2544 standard, another crucial measurement in Ethernet networking is packet delay variation. The definition set out by this specification comes from RFC5481.

RFC 2285 Benchmarking Terminology for LAN Switching Devices

RFC 2285 defines the terminology used for benchmarking LAN switching devices. It extends RFC 1242 and defines: DUTs, SUTs, traffic orientation and distribution, bursts, loads, forwarding rates, etc.

RFC 2889 Benchmarking Methodology for LAN Switching

RFC 2889 outlines a benchmarking methodology for LAN switching; it extends RFC 2544. The outlined methodology gathers performance metrics for forwarding, congestion control, latency, address handling and filtering.

RFC 3918 Methodology for IP Multicast Benchmarking

RFC 3918 outlines a methodology for IP Multicast benchmarking.

RFC 4737 Packet Reordering Metrics

RFC 4737 describes metrics for identifying and counting re-ordered packets within a stream, and metrics to measure the extent each packet has been re-ordered.

RFC 5481 Packet Delay Variation Applicability Statement

RFC 5481 defines two common but different forms of delay variation metrics, and compares the metrics over a range of networking circumstances and tasks. The most suitable form for vSwitch benchmarking is the “PDV” form.
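
As an illustration of the PDV form, and assuming per-packet delays have been stored (the post-measurement processing case discussed in the Features to be tested section), a quantile of the PDV distribution could be computed as in the following sketch; it is not a normative procedure:

    # Illustrative post-processing sketch for the PDV form of delay variation:
    # each packet's delay is referenced to the minimum delay of the flow and a
    # quantile of the resulting distribution is reported (per RFC 5481).
    def pdv_quantile(delays, quantile=0.99):
        """Quantile of (delay - minimum delay) for one flow's stored delays."""
        d_min = min(delays)
        pdv = sorted(d - d_min for d in delays)
        idx = min(int(quantile * len(pdv)), len(pdv) - 1)
        return pdv[idx]

    delays_us = [102.0, 101.5, 103.2, 150.7, 101.9]   # example one-way delays (us)
    print(round(pdv_quantile(delays_us, 0.99), 1))    # 49.2 (worst case in this sample)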

RFC 6201 Device Reset Characterization

RFC 6201 extends the methodology for characterizing the speed of recovery of the DUT from device or software reset described in RFC 2544.

Item pass/fail criteria

vswitchperf does not specify pass/fail criteria for the tests in terms of a threshold, as benchmarks do not (and should not) do this. The results/metrics for a test are simply reported. If it had to be defined, a test is considered to have passed if it successfully completed and a relevant metric was recorded/reported for the SUT.

Suspension criteria and resumption requirements

In the case of a throughput test, a test should be suspended if a virtual switch is failing to forward any traffic. A test should be restarted from a clean state if the intention is to carry out the test again.

Test deliverables

Each test should produce a test report that details SUT information as well as the test results. There are a number of parameters related to the system, the DUT and the tests that can affect the repeatability of test results and should be recorded. In order to minimise the variation in the results of a test, it is recommended that the test report includes the following information (an illustrative sketch of such a report structure follows the list below):

  • Hardware details including:
    • Platform details.
    • Processor details.
    • Memory information (see below)
    • Number of enabled cores.
    • Number of cores used for the test.
    • Number of physical NICs, as well as their details (manufacturer, versions, type and the PCI slot they are plugged into).
    • NIC interrupt configuration.
    • BIOS version, release date and any configurations that were modified.
  • Software details including:
    • OS version (for host and VNF)
    • Kernel version (for host and VNF)
    • GRUB boot parameters (for host and VNF).
    • Hypervisor details (Type and version).
    • Selected vSwitch, version number or commit id used.
    • vSwitch launch command line if it has been parameterised.
    • Memory allocation to the vSwitch – which NUMA node it is using, and how many memory channels.
    • Where the vswitch is built from source: compiler details including versions and the flags that were used to compile the vSwitch.
    • DPDK or any other SW dependency version number or commit id used.
    • Memory allocation to a VM - if it’s from hugepages/elsewhere.
    • VM storage type: snapshot/independent persistent/independent non-persistent.
    • Number of VMs.
    • Number of Virtual NICs (vNICs), versions, type and driver.
    • Number of virtual CPUs and their core affinity on the host.
    • vNIC interrupt configuration.
    • Thread affinitization for the applications (including the vSwitch itself) on the host.
    • Details of Resource isolation, such as CPUs designated for Host/Kernel (isolcpu) and CPUs designated for specific processes (taskset).
  • Memory Details
    • Total memory
    • Type of memory
    • Used memory
    • Active memory
    • Inactive memory
    • Free memory
    • Buffer memory
    • Swap cache
    • Total swap
    • Used swap
    • Free swap
  • Test duration.
  • Number of flows.
  • Traffic Information:
    • Traffic type - UDP, TCP, IMIX / Other.
    • Packet Sizes.
  • Deployment Scenario.

Note: Tests that require additional parameters to be recorded will explicitly specify this.
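
The exact report format is not mandated by this plan. Purely as an illustration of how the information above might be captured in a machine-readable form, the following Python sketch uses hypothetical field names (they are not a vsperf API):

    # Hypothetical sketch of how a test report's parameters might be recorded;
    # the field names are illustrative and not a vsperf API.
    test_report = {
        "hardware": {"platform": "example-server", "enabled_cores": 16},
        "software": {"vswitch": "OVS", "vswitch_version": "<commit id>",
                     "dpdk_version": "<version>"},
        "traffic": {"type": "UDP", "packet_sizes": [64, 128, 512, 1024, 1518]},
        "test_duration_s": 60,
        "number_of_flows": 1,
        "deployment_scenario": "Phy2Phy",
    }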

Test management

This section will detail the test activities that will be conducted by vsperf as well as the infrastructure that will be used to complete the tests in OPNFV.

Planned activities and tasks; test progression

A key consideration when conducting any sort of benchmark is trying to ensure the consistency and repeatability of test results between runs. When benchmarking the performance of a virtual switch there are many factors that can affect the consistency of results. This section describes these factors and the measures that can be taken to limit their effects. In addition, this section will outline some system tests to validate the platform and the VNF before conducting any vSwitch benchmarking tests.

System Isolation:

When conducting a benchmarking test on any SUT, it is essential to limit (and if reasonable, eliminate) any noise that may interfere with the accuracy of the metrics collected by the test. This noise may be introduced by other hardware or software (OS, other applications), and can result in significantly varying performance metrics being collected between consecutive runs of the same test. In the case of characterizing the performance of a virtual switch, there are a number of configuration parameters that can help increase the repeatability and stability of test results, including:

  • OS/GRUB configuration:
    • maxcpus = n where n >= 0; limits the kernel to using ‘n’ processors. Only use exactly what you need.
    • isolcpus: Isolate CPUs from the general scheduler. Isolate all CPUs bar one which will be used by the OS.
    • use taskset to affinitize the forwarding application and the VNFs onto isolated cores. VNFs and the vSwitch should be allocated their own cores, i.e. must not share the same cores. vCPUs for the VNF should be affinitized to individual cores also.
    • Limit the amount of background applications that are running and set OS to boot to runlevel 3. Make sure to kill any unnecessary system processes/daemons.
    • Only enable hardware that you need to use for your test – to ensure there are no other interrupts on the system.
    • Configure NIC interrupts to only use the cores that are not allocated to any other process (VNF/vSwitch).
  • NUMA configuration: Any unused sockets in a multi-socket system should be disabled.
  • CPU pinning: The vSwitch and the VNF should each be affinitized to separate logical cores using a combination of maxcpus, isolcpus and taskset.
  • BIOS configuration: BIOS should be configured for performance where an explicit option exists; sleep states should be disabled; any virtualization optimization technologies should be enabled; hyperthreading should also be enabled; turbo boost and overclocking should be disabled.

System Validation:

System validation is broken down into two sub-categories: platform validation and VNF validation. The validation test itself involves verifying the forwarding capability and stability of the sub-system under test. The rationale behind system validation is twofold: firstly, to give a tester confidence in the stability of the platform or VNF that is being tested; and secondly, to provide base performance comparison points to understand the overhead introduced by the virtual switch.

  • Benchmark platform forwarding capability: This is an OPTIONAL test used to verify the platform and measure the base performance (maximum forwarding rate in fps and latency) that can be achieved by the platform without a vSwitch or a VNF. The following diagram outlines the set-up for benchmarking Platform forwarding capability:

                                                         __
    +--------------------------------------------------+   |
    |   +------------------------------------------+   |   |
    |   |                                          |   |   |
    |   |          l2fw or DPDK L2FWD app          |   |  Host
    |   |                                          |   |   |
    |   +------------------------------------------+   |   |
    |   |                 NIC                      |   |   |
    +---+------------------------------------------+---+ __|
               ^                           :
               |                           |
               :                           v
    +--------------------------------------------------+
    |                                                  |
    |                traffic generator                 |
    |                                                  |
    +--------------------------------------------------+
    
  • Benchmark VNF forwarding capability: This test is used to verify the VNF and measure the base performance (maximum forwarding rate in fps and latency) that can be achieved by the VNF without a vSwitch. The performance metrics collected by this test will serve as a key comparison point for NIC passthrough technologies and vSwitches. VNF in this context refers to the hypervisor and the VM. The following diagram outlines the set-up for benchmarking VNF forwarding capability:

                                                         __
    +--------------------------------------------------+   |
    |   +------------------------------------------+   |   |
    |   |                                          |   |   |
    |   |                 VNF                      |   |   |
    |   |                                          |   |   |
    |   +------------------------------------------+   |   |
    |   |          Passthrough/SR-IOV              |   |  Host
    |   +------------------------------------------+   |   |
    |   |                 NIC                      |   |   |
    +---+------------------------------------------+---+ __|
               ^                           :
               |                           |
               :                           v
    +--------------------------------------------------+
    |                                                  |
    |                traffic generator                 |
    |                                                  |
    +--------------------------------------------------+
    

Methodology to benchmark Platform/VNF forwarding capability

The recommended methodology for the platform/VNF validation and benchmark is:

  • Run the RFC 2889 Maximum Forwarding Rate test. This test will produce maximum forwarding rate and latency results that will serve as the expected values. These expected values can be used in subsequent steps or compared with in subsequent validation tests.
  • Transmit bidirectional traffic at line rate/max forwarding rate (whichever is higher) for at least 72 hours, and measure throughput (fps) and latency.
  • Note: Traffic should be bidirectional.
  • Establish a baseline forwarding rate for what the platform can achieve.
  • Additional validation: After the test has completed for 72 hours, run bidirectional traffic at the maximum forwarding rate once more to see if the system is still functional, and measure throughput (fps) and latency. Compare the newly obtained values with the expected values (an illustrative comparison is sketched below).
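
The form of that final comparison is left to the tester. As a minimal sketch only, assuming a 5% tolerance (the tolerance value is an assumption, not a requirement of this plan):

    # Illustrative comparison only: check post-soak measurements against the
    # expected baseline from the RFC 2889 step; the 5% tolerance is an assumption.
    def within_tolerance(measured, expected, tolerance=0.05):
        """True if measured is within +/- tolerance (as a fraction) of expected."""
        return abs(measured - expected) <= tolerance * expected

    print(within_tolerance(14.7e6, 14.88e6))   # True: within 5% of the baseline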

NOTE 1: How the Platform is configured for its forwarding capability test (BIOS settings, GRUB configuration, runlevel...) is how the platform should be configured for every test after this.

NOTE 2: How the VNF is configured for its forwarding capability test (# of vCPUs, vNICs, Memory, affinitization…) is how it should be configured for every test that uses a VNF after this.

Methodology to benchmark the VNF to vSwitch to VNF deployment scenario

vsperf has identified the following concerns when benchmarking the VNF to vSwitch to VNF deployment scenario:

  • The accuracy of the timing synchronization between VNFs/VMs.
  • The clock accuracy of a VNF/VM if they were to be used as traffic generators.
  • VNF traffic generator/receiver may be using resources of the system under test, causing at least three forms of workload to increase as the traffic load increases (generation, switching, receiving).

The recommendation from vsperf is that tests for this scenario must include an external HW traffic generator to act as the tester/traffic transmitter and receiver. The prescribed methodology to benchmark this deployment scenario with an external tester involves the following three steps:

  1. Determine the forwarding capability and latency through the virtual interface connected to the VNF/VM.

Virtual interfaces performance benchmark

  2. Determine the forwarding capability and latency through the VNF/hypervisor.

Hypervisor performance benchmark

  3. Determine the forwarding capability and latency for the VNF to vSwitch to VNF scenario, taking the information from the previous two steps into account.

VNF to vSwitch to VNF performance benchmark

vsperf also identified an alternative configuration for the final step:


VNF to vSwitch to VNF alternative performance benchmark

Environment/infrastructure

VSPERF CI jobs are run using the OPNFV lab infrastructure as described by the Pharos Project (https://www.opnfv.org/community/projects/pharos). A VSPERF POD is described here: https://wiki.opnfv.org/display/pharos/VSPERF+in+Intel+Pharos+Lab+-+Pod+12

vsperf CI

vsperf CI jobs are broken down into:

  • Daily job:
    • Runs every day and takes about 10 hours to complete.
    • TESTCASES_DAILY='phy2phy_tput back2back phy2phy_tput_mod_vlan phy2phy_scalability pvp_tput pvp_back2back pvvp_tput pvvp_back2back'.
    • TESTPARAM_DAILY='--test-params TRAFFICGEN_PKT_SIZES=(64,128,512,1024,1518)'.
  • Merge job:
    • Runs whenever patches are merged to master.
    • Runs a basic Sanity test.
  • Verify job:
    • Runs every time a patch is pushed to gerrit.
    • Builds documentation.

Scripts:

There are 2 scripts that are part of VSPERF's CI:

  • build-vsperf.sh: Lives in the VSPERF repository in the ci/ directory and is used to run vsperf with the appropriate CLI parameters.
  • vswitchperf.yml: YAML description of our Jenkins job. Lives in the RELENG repository.

More info on vsperf CI can be found here: https://wiki.opnfv.org/display/vsperf/VSPERF+CI

Responsibilities and authority

The group responsible for managing, designing, preparing and executing the tests listed in the LTD are the vsperf committers and contributors. The vsperf committers and contributors should work with the relevant OPNFV projects to ensure that the infrastructure is in place for testing vswitches, and that the results are published to a common endpoint (a results database).

VSPERF IETF Internet Draft

This IETF Internet Draft on Benchmarking Virtual Switches in OPNFV was developed by VSPERF contributors and is maintained in the IETF repository at https://tools.ietf.org/htm

VSPERF Scenarios and CI Results
1. VSPERF Test Scenarios

Predefined Tests run with CI:

Test                    Definition
phy2phy_tput            PacketLossRatio for Phy2Phy
back2back               BackToBackFrames for Phy2Phy
phy2phy_tput_mod_vlan   PacketLossRatioFrameModification for Phy2Phy
phy2phy_cont            Phy2Phy blast vswitch at x% TX rate and measure throughput
pvp_cont                PVP blast vswitch at x% TX rate and measure throughput
pvvp_cont               PVVP blast vswitch at x% TX rate and measure throughput
phy2phy_scalability     Scalability0PacketLoss for Phy2Phy
pvp_tput                PacketLossRatio for PVP
pvp_back2back           BackToBackFrames for PVP
pvvp_tput               PacketLossRatio for PVVP
pvvp_back2back          BackToBackFrames for PVVP
phy2phy_cpu_load        CPU0PacketLoss for Phy2Phy
phy2phy_mem_load        Same as CPU0PacketLoss but using a memory intensive app

Deployment topologies:

  • Phy2Phy: Physical port -> vSwitch -> Physical port.
  • PVP: Physical port -> vSwitch -> VNF -> vSwitch -> Physical port.
  • PVVP: Physical port -> vSwitch -> VNF -> vSwitch -> VNF -> vSwitch -> Physical port.

Loopback applications in the Guest:

Supported traffic generators:

  • Spirent Testcenter
  • Ixia: IxOS and IxNet.
  • Xena
  • MoonGen
  • Dummy
2. OPNFV Test Results

VSPERF CI jobs are run daily and sample results can be found at https://wiki.opnfv.org/display/vsperf/Vsperf+Results

The following example maps the results in the test dashboard to the appropriate test case in the VSPERF Framework and specifies the metric the vertical/Y axis is plotting. Please note, the presence of dpdk within a test name signifies that the vswitch under test was OVS with DPDK, while its absence indicates that the vswitch under test was stock OVS.

Dashboard Test          Framework Test          Metric               Guest Interface
tput_ovsdpdk            phy2phy_tput            Throughput (FPS)     N/A
tput_ovs                phy2phy_tput            Throughput (FPS)     N/A
b2b_ovsdpdk             back2back               Back-to-back value   N/A
b2b_ovs                 back2back               Back-to-back value   N/A
tput_mod_vlan_ovs       phy2phy_tput_mod_vlan   Throughput (FPS)     N/A
tput_mod_vlan_ovsdpdk   phy2phy_tput_mod_vlan   Throughput (FPS)     N/A
scalability_ovs         phy2phy_scalability     Throughput (FPS)     N/A
scalability_ovsdpdk     phy2phy_scalability     Throughput (FPS)     N/A
pvp_tput_ovsdpdkuser    pvp_tput                Throughput (FPS)     vhost-user
pvp_tput_ovsvirtio      pvp_tput                Throughput (FPS)     virtio-net
pvp_b2b_ovsdpdkuser     pvp_back2back           Back-to-back value   vhost-user
pvp_b2b_ovsvirtio       pvp_back2back           Back-to-back value   virtio-net
pvvp_tput_ovsdpdkuser   pvvp_tput               Throughput (FPS)     vhost-user
pvvp_tput_ovsvirtio     pvvp_tput               Throughput (FPS)     virtio-net
pvvp_b2b_ovsdpdkuser    pvvp_back2back          Throughput (FPS)     vhost-user
pvvp_b2b_ovsvirtio      pvvp_back2back          Throughput (FPS)     virtio-net

The loopback application used in the VNF for the PVP and PVVP scenarios was DPDK testpmd.


Yardstick

OPNFV Yardstick developer guide
1. Introduction

Yardstick is a project dealing with performance testing. Yardstick produces its own test cases but can also be considered as a framework to support feature project testing.

Yardstick developed a test API that can be used by any OPNFV project. Therefore there are many ways to contribute to Yardstick.

You can:

  • Develop new test cases
  • Review code
  • Develop Yardstick API / framework
  • Develop Yardstick grafana dashboards and Yardstick reporting page
  • Write Yardstick documentation

This developer guide describes how to interact with the Yardstick project. The first section details the main working areas of the project. The second part is a list of “How to” guides to help you join the Yardstick community, whatever your field of interest is.

1.1. Where can I find some help to start?

This guide is made for you. You can have a look at the user guide. There are also references to documentation, video tutorials and tips on the project wiki page. You can also contact us directly by mail, with the [Yardstick] prefix in the title, at opnfv-tech-discuss@lists.opnfv.org, or on the IRC channel #opnfv-yardstick.

2. Yardstick developer areas
2.1. Yardstick framework

Yardstick can be considered as a framework. Yardstick is released as a Docker image, including tools, scripts and a CLI to prepare the environment and run tests. It simplifies the integration of external test suites in the CI pipeline and provides commodity tools to collect and display results.

Since Danube, test categories, also known as tiers, have been created to group similar tests, provide consistent sub-lists and, in the end, optimize test duration for CI (see the How To section).

The definition of the tiers has been agreed by the testing working group.

The tiers are:

  • smoke
  • features
  • components
  • performance
  • vnf
3. How Todos?
3.1. How Yardstick works?

The installation and configuration of Yardstick is described in the user guide.

3.2. How can I contribute to Yardstick?

If you are already a contributor of any OPNFV project, you can contribute to Yardstick. If you are totally new to OPNFV, you must first create your Linux Foundation account, then contact us in order to declare you in the repository database.

We distinguish 2 levels of contributors:

  • the standard contributor can push patches and vote +1/0/-1 on any Yardstick patch
  • The committer can vote -2/-1/0/+1/+2 and merge

Yardstick committers are promoted by the Yardstick contributors.

3.2.1. Gerrit & JIRA introduction

OPNFV uses Gerrit for web-based code review and repository management for the Git Version Control System. You can access OPNFV Gerrit. Please note that you need to have a Linux Foundation ID in order to use OPNFV Gerrit. You can get one from this link.

OPNFV uses JIRA for issue management. An important principle of change management is to have two-way trace-ability between issue management (i.e. JIRA) and the code repository (via Gerrit). In this way, individual commits can be traced to JIRA issues and we also know which commits were used to resolve a JIRA issue.

If you want to contribute to Yardstick, you can pick an issue from Yardstick’s JIRA dashboard or you can create your own issue and submit it to JIRA.

3.2.2. Install Git and Git-review

Installing and configuring Git and Git-Review is necessary in order to submit code to Gerrit. The Getting to the code page will provide you with some help for that.

3.2.3. Verify your patch locally before submitting

Once you finish a patch, you can submit it to Gerrit for code review. A developer sending a new patch to Gerrit will trigger the patch verify job on Jenkins CI. The Yardstick patch verify job includes a Python flake8 check, unit tests and a code coverage test. Before you submit your patch, it is recommended to run the patch verification in your local environment first.

Open a terminal window and set the project’s directory to the working directory using the cd command. Assume that YARDSTICK_REPO_DIR is the path to the Yardstick project folder on your computer:

cd $YARDSTICK_REPO_DIR

Verify your patch:

./run_tests.sh

This script is used both in CI and by the CLI.

3.2.4. Submit the code with Git

Tell Git which files you would like to take into account for the next commit. This is called ‘staging’ the files, by placing them into the staging area, using the git add command (or the synonym git stage command):

git add $YARDSTICK_REPO_DIR/samples/sample.yaml

Alternatively, you can choose to stage all files that have been modified (that is, the files you have worked on) since the last time you generated a commit, by using the -A argument:

git add -A

Git won’t let you push (upload) any code to Gerrit if you haven’t pulled the latest changes first. So the next step is to pull (download) the latest changes made to the project by other collaborators using the pull command:

git pull

Now that you have the latest version of the project and you have staged the files you wish to push, it is time to actually commit your work to your local Git repository:

git commit --signoff -m "Title of change

Text of change that describes at a high level what was done. There is a lot of
documentation in code so you do not need to repeat it here.

JIRA: YARDSTICK-XXX"

The message that is required for the commit should follow a specific set of rules. This practice standardizes the description messages attached to commits and, eventually, makes it easier to navigate among them.

This document happened to be very clear and useful to get started with that.

3.2.5. Push the code to Gerrit for review

Now that the code has been committed into your local Git repository, the following step is to push it online to Gerrit for it to be reviewed. The command we will use is git review:

git review

This will automatically push your local commit into Gerrit. You can add Yardstick committers and contributors to review your code.

Gerrit for code review

You can find Yardstick people info here.

3.2.6. Modify the code under review in Gerrit

At the same time the code is being reviewed in Gerrit, you may need to edit it to make some changes and then send it back for review. The following steps go through the procedure.

Once you have modified/edited your code files under your IDE, you will have to stage them. The ‘status’ command is very helpful at this point as it provides an overview of Git’s current state:

git status

The output of the command provides us with the files that have been modified after the latest commit.

You can now stage the files that have been modified as part of the Gerrit code review edit/modification/improvement using the git add command. It is now time to commit the newly modified files, but the objective here is not to create a new commit; we simply want to inject the new changes into the previous commit. You can achieve that with the ‘--amend’ option of the git commit command:

git commit --amend

If the commit was successful, the git status command should no longer list the updated files as staged for commit.

The final step consists of pushing the newly amended commit to Gerrit:

git review
4. Plugins

For information about Yardstick plugins, refer to the chapter Installing a plug-in into Yardstick in the user guide.

Developer

Documentation Guide

Documentation Guide

This page intends to cover the documentation handling for OPNFV. OPNFV projects are expected to create a variety of document types, according to the nature of the project. Some of these are common to projects that develop/integrate features into the OPNFV platform, e.g. Installation Instructions and User/Configurations Guides. Other document types may be project-specific.

Getting Started with Documentation for Your Project

OPNFV documentation is automated and integrated into our git & gerrit toolchains.

We use RST document templates in our repositories and automatically render HTML and PDF versions of the documents in our artifact store. Our wiki is also able to integrate these rendered documents directly, allowing projects to use the revision-controlled documentation process for project information, content and deliverables. Read this page, which elaborates on how documentation is to be included within opnfvdocs.

Licencing your documentation

All contributions to the OPNFV project are made in accordance with the OPNFV licensing requirements. Documentation in OPNFV is contributed in accordance with the Creative Commons Attribution 4.0 and the SPDX (https://spdx.org/) licences. All documentation files need to be licensed using the text below. The license may be applied in the first lines of all contributed RST files:

.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. SPDX-License-Identifier: CC-BY-4.0
.. (c) <optionally add copyright holder's name>

These lines will not be rendered in the html and pdf files.
How and where to store the document content files in your repository

All documentation for your project should be structured and stored in the <repo>/docs/ directory. The documentation toolchain will look in these directories and be triggered on events in these directories when generating documents.

Document structure and contribution

A general structure is proposed for storing and handling documents that are common across many projects but also for documents that may be project specific. The documentation is divided into three areas: Release, Development and Testing. Templates for these areas can be found under opnfvdocs/docs/templates/.

Project teams are encouraged to use templates provided by the opnfvdocs project to ensure that there is consistency across the community. The following representation shows the expected structure:

docs/
├── development
│   ├── design
│   ├── overview
│   └── requirements
├── release
│   ├── configguide
│   ├── installation
│   ├── release-notes
│   ├── scenarios
│   │   └── scenario.name
│   └── userguide
└── testing
    ├── developer
    └── user
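
As an illustration, the skeleton above can be created in a project repository with a single command (a sketch, assuming a bash shell with brace expansion; replace scenario.name with your actual scenario directory):

mkdir -p docs/development/{design,overview,requirements} \
         docs/release/{configguide,installation,release-notes,scenarios/scenario.name,userguide} \
         docs/testing/{developer,user}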
Release documentation

Release documentation is the set of documents that are published for each OPNFV release. These documents are created and developed following the OPNFV release process and milestones and should reflect the content of the OPNFV release. These documents have a master index.rst file in the <opnfvdocs> repository and extract content from other repositories. To provide content into these documents place your <content>.rst files in a directory in your repository that matches the master document and add a reference to that file in the correct place in the corresponding index.rst file in opnfvdocs/docs/release/.

Platform Overview: opnfvdocs/docs/release/overview

  • Note this document is not a contribution driven document
  • Content for this is prepared by the Marketing team together with the opnfvdocs team

Installation Instruction: <repo>/docs/release/installation

  • Folder for documents describing how to deploy each installer and scenario descriptions
  • Release notes will be included here <To Confirm>
  • Security related documents will be included here
  • Note that this document will be compiled into ‘OPNFV Installation Instruction’

User Guide: <repo>/docs/release/userguide

  • Folder for manuals to use specific features
  • Folder for documents describing how to install/configure project specific components and features
  • Can be the directory where API reference for project specific features are stored
  • Note this document will be compiled into ‘OPNFV userguide’

Configuration Guide: <repo>/docs/release/configguide

  • Brief introduction to configure OPNFV with its dependencies.

Release Notes: <repo>/docs/release/release-notes

  • Changes brought about in the release cycle.
  • Include version details.
Testing documentation

Documentation created by test projects can be stored under two different sub-directories, /user or /developer. Release notes will be stored under <repo>/docs/release/release-notes

User documentation: <repo>/docs/testing/user/ Will collect the documentation of the test projects allowing the end user to perform testing towards an OPNFV SUT, e.g. Functest/Yardstick/Vsperf/Storperf/Bottlenecks/Qtip installation/config & user guides.

Development documentation: <repo>/docs/testing/developer/ Will collect documentation to explain how to create your own test case and leverage existing testing frameworks, e.g. developer guides.

Development Documentation

Project specific documents such as design documentation, project overview or requirement documentation can be stored under /docs/development. Links to generated documents will be displayed under the Development Documentation section on docs.opnfv.org. You are encouraged to establish the following basic structure for your project as needed:

Requirement Documentation: <repo>/docs/development/requirements/

  • Folder for your requirement documentation
  • For details on requirements projects’ structures see the Requirements Projects page.

Design Documentation: <repo>/docs/development/design

  • Folder for your upstream design documents (blueprints, development proposals, etc.)

Project overview: <repo>/docs/development/overview

  • Folder for any project specific documentation.

Including your Documentation

In your project repository

Add your documentation to your repository in the folder structure and according to the templates listed above. The documentation templates you will require are available under opnfvdocs/docs/templates/ in the opnfvdocs repository; you should copy the relevant templates to the <repo>/docs/ directory in your repository. For instance, if you want to document the userguide, the steps are as follows:

git clone ssh://<your_id>@gerrit.opnfv.org:29418/opnfvdocs.git
cp -p opnfvdocs/docs/userguide/* <my_repo>/docs/userguide/

You should then add the relevant information to the template that will explain the documentation. When you are done writing, you can commit the documentation to the project repository.

git add .
git commit --signoff --all
git review
In OPNFVDocs Composite Documentation
In toctree
To import project documents from project repositories, we use submodules.
Each project is stored in opnfvdocs/docs/submodule/ as follows:
_images/Submodules.jpg

To include your project specific documentation in the composite documentation, first identify where your project documentation should be included. Say your project userguide should appear in the ‘OPNFV Userguide’; then:

vim opnfvdocs/docs/release/userguide.introduction.rst

This opens the text editor. Identify where you want to add the userguide. If the userguide is to be added to the toctree, simply include the path to it, for example:

.. toctree::
    :maxdepth: 1

    submodules/functest/docs/userguide/index
    submodules/bottlenecks/docs/userguide/index
    submodules/yardstick/docs/userguide/index
    <submodules/path-to-your-file>
‘doc8’ Validation

It is recommended that all RST content is validated against doc8 standards. To validate your RST files using doc8, first install doc8:

sudo pip install doc8

doc8 can now be used to check the RST files. Execute it as:

doc8 --ignore D000,D001 <file>
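
To check every RST file under docs/ in one go, doc8 can also be pointed at a directory (a sketch; the ignore list matches the single-file example above):

doc8 --ignore D000,D001 docs/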
Testing: Build Documentation Locally
Composite OPNFVDOCS documentation

To build whole documentation under opnfvdocs/, follow these steps:

Install virtual environment.

sudo pip install virtualenv
cd /local/repo/path/to/project

Download the OPNFVDOCS repository.

git clone https://gerrit.opnfv.org/gerrit/opnfvdocs

Change directory to opnfvdocs & install requirements.

cd opnfvdocs
sudo pip install -r etc/requirements.txt

Update submodules, build documentation using tox & then open using any browser.

cd opnfvdocs
git submodule update --init
tox -edocs
firefox docs/_build/html/index.html

Note

Make sure to run tox -edocs and not just tox.

Individual project documentation

To test how the documentation renders in HTML, follow these steps:

Install virtual environment.

sudo pip install virtualenv
cd /local/repo/path/to/project

Download the opnfvdocs repository.

git clone https://gerrit.opnfv.org/gerrit/opnfvdocs

Change directory to opnfvdocs & install requirements.

cd opnfvdocs
sudo pip install -r etc/requirements.txt

Move the conf.py file to your project folder where RST files have been kept:

mv opnfvdocs/docs/conf.py <path-to-your-folder>/

Move the static files to your project folder:

mv opnfvdocs/_static/ <path-to-your-folder>/

Build the documentation from within your project folder:

sphinx-build -b html <path-to-your-folder> <path-to-output-folder>

Your documentation will be built as HTML inside the specified output folder.
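
For example, assuming your content and the copied conf.py live in <repo>/docs/release/userguide and you want the rendered pages in a temporary folder, the invocation could look like this (paths are illustrative):

sphinx-build -b html <repo>/docs/release/userguide /tmp/docs-build
firefox /tmp/docs-build/index.html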

Note

Be sure to remove the conf.py file, the _static/ files and the output folder from <project>/docs/ afterwards. This is for local testing only; commit only the RST files and related content.

Addendum

Index File

The index file must reference your other RST files in that directory using relative paths.

Here is an example index.rst :

*******************
Documentation Title
*******************

.. toctree::
   :numbered:
   :maxdepth: 2

   documentation-example
Source Files

Document source files have to be written in reStructuredText (RST) format. Each file will be built as an HTML page.

Here is an example source rst file :

=============
Chapter Title
=============

Section Title
=============

Subsection Title
----------------

Hello!
Writing RST Markup

See http://sphinx-doc.org/rest.html .

Hint: You can add dedicated content for a specific build type (‘html’ and ‘singlehtml’) by using the ‘only’ directive. However, this is discouraged since it may produce different views of the same document.

.. only:: html
    This line will be shown only in html version.
Verify Job

The verify job name is docs-verify-rtd-{branch}.

When you send document changes to Gerrit, Jenkins will build your documents in HTML formats (normal and single-page) to verify that the new document can be built successfully. Please check the Jenkins log and artifacts carefully. You can still improve your document even if the build job succeeded.

Merge Job

The merge job name is docs-merge-rtd-{branch}.

Once the patch is merged, Jenkins will automatically trigger a build of the new documentation. This might take about 15 minutes while readthedocs builds the documentation. The newly built documentation will show up at the appropriate place under docs.opnfv.org/{branch}/path-to-file.

OPNFV Projects

Apex

OPNFV Installation instructions (Apex)

Contents:

1. Abstract

This document describes how to install the Danube release of OPNFV when using Apex as a deployment tool, covering its limitations, dependencies and required system resources.

2. License

Danube release of OPNFV when using Apex as a deployment tool Docs (c) by Tim Rozet (Red Hat) and Dan Radez (Red Hat)

Danube release of OPNFV when using Apex as a deployment tool Docs are licensed under a Creative Commons Attribution 4.0 International License. You should have received a copy of the license along with this. If not, see <http://creativecommons.org/licenses/by/4.0/>.

3. Introduction

This document describes the steps to install an OPNFV Danube reference platform, as defined by the Genesis Project using the Apex installer.

The audience is assumed to have a good background in networking and Linux administration.

4. Preface

Apex uses Triple-O from the RDO Project OpenStack distribution as a provisioning tool. The Triple-O image based life cycle installation tool provisions an OPNFV Target System (3 controllers, 2 or more compute nodes) with OPNFV specific configuration provided by the Apex deployment tool chain.

The Apex deployment artifacts contain the necessary tools to deploy and configure an OPNFV target system using the Apex deployment toolchain. These artifacts offer the choice of using the Apex bootable ISO (opnfv-apex-danube.iso), which installs both CentOS 7 and the materials necessary to deploy, or the Apex RPMs (opnfv-apex*.rpm) and their associated dependencies, which expect installation onto a CentOS 7, libvirt-enabled host. The RPM contains a collection of configuration files, prebuilt disk images, and the automatic deployment script (opnfv-deploy).

An OPNFV install requires a “Jumphost” in order to operate. The bootable ISO will allow you to install a customized CentOS 7 release to the Jumphost, which includes the required packages needed to run opnfv-deploy. If you already have a Jumphost with CentOS 7 installed, you may choose to skip the ISO step and simply install the (opnfv-apex*.rpm) RPMs. The RPMs are the same RPMs included in the ISO and include all the necessary disk images and configuration files to execute an OPNFV deployment. Either method will prepare a host to the same ready state for OPNFV deployment.

opnfv-deploy instantiates a Triple-O Undercloud VM server using libvirt as its provider. This VM is then configured and used to provision the OPNFV target deployment (3 controllers, n compute nodes). These nodes can be either virtual or bare metal. This guide contains instructions for installing either method.

5. Triple-O Deployment Architecture

Apex is based on the OpenStack Triple-O project as distributed by the RDO Project. It is important to understand the basics of a Triple-O deployment to help make decisions that will assist in successfully deploying OPNFV.

Triple-O stands for OpenStack On OpenStack. This means that OpenStack will be used to install OpenStack. The target OPNFV deployment is an OpenStack cloud with NFV features built-in that will be deployed by a smaller all-in-one deployment of OpenStack. In this deployment methodology there are two OpenStack installations. They are referred to as the undercloud and the overcloud. The undercloud is used to deploy the overcloud.

The undercloud is the all-in-one installation of OpenStack that includes baremetal provisioning capability. The undercloud will be deployed as a virtual machine on a jumphost. This VM is pre-built and distributed as part of the Apex RPM.

The overcloud is OPNFV. Configuration will be passed into undercloud and the undercloud will use OpenStack’s orchestration component, named Heat, to execute a deployment that will provision the target OPNFV nodes.

6. Apex High Availability Architecture
6.1. Undercloud

The undercloud is not Highly Available. End users do not depend on the undercloud. It is only for management purposes.

6.2. Overcloud

Apex will deploy three control nodes in an HA deployment. Each of these nodes will run the following services:

  • Stateless OpenStack services
  • MariaDB / Galera
  • RabbitMQ
  • OpenDaylight
  • HA Proxy
  • Pacemaker & VIPs
  • Ceph Monitors and OSDs
Stateless OpenStack services
All running stateless OpenStack services are load balanced by HA Proxy. Pacemaker monitors the services and ensures that they are running.
Stateful OpenStack services
All running stateful OpenStack services are load balanced by HA Proxy. They are monitored by pacemaker in an active/passive failover configuration.
MariaDB / Galera
The MariaDB database is replicated across the control nodes using Galera. Pacemaker is responsible for a proper start up of the Galera cluster. HA Proxy provides an active/passive failover methodology for connections to the database.
RabbitMQ
The message bus is managed by Pacemaker to ensure proper start up and establishment of clustering across cluster members.
OpenDaylight
OpenDaylight is currently installed on all three control nodes and started as an HA cluster unless otherwise noted for that scenario. OpenDaylight’s database, known as MD-SAL, breaks up pieces of the database into “shards”. Each shard will have its own election take place, which will determine which OpenDaylight node is the leader for that shard. The other OpenDaylight nodes in the cluster will be in standby. Every Open vSwitch node connects to every OpenDaylight node to enable HA.
HA Proxy
HA Proxy is monitored by Pacemaker to ensure it is running across all nodes and available to balance connections.
Pacemaker & VIPs
Pacemaker has relationships and constraints set up to ensure a proper service start-up order and that the Virtual IPs associated with specific services are running on the proper host.
Ceph Monitors & OSDs
The Ceph monitors run on each of the control nodes. Each control node also has a Ceph OSD running on it. By default the OSDs use an autogenerated virtual disk as their target device. A non-autogenerated device can be specified in the deploy file.
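
As an illustration, selecting a physical disk for the OSDs instead of the autogenerated virtual disk would be done with an entry along the following lines in the deploy settings file (a hypothetical snippet; the device path is only an example and the exact option name should be checked against the comments in the shipped deploy file):

deploy_options:
  ceph_device: /dev/sdb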

VM Migration is configured and VMs can be evacuated as needed or as invoked by tools such as heat as part of a monitored stack deployment in the overcloud.

7. OPNFV Scenario Architecture

OPNFV distinguishes different types of SDN controllers, deployment options, and features into “scenarios”. These scenarios are universal across all OPNFV installers, although some may or may not be supported by each installer.

The standard naming convention for a scenario is: <VIM platform>-<SDN type>-<feature>-<ha/noha>

The only supported VIM type is “OS” (OpenStack), while SDN types can be any supported SDN controller. “feature” includes things like ovs_dpdk, sfc, etc. “ha” or “noha” determines if the deployment will be highly available. If “ha” is used, at least 3 control nodes are required.
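
For example, the scenario name os-odl_l3-fdio-ha from the matrix in the next section breaks down as follows:

  • os - VIM platform: OpenStack
  • odl_l3 - SDN type: OpenDaylight (Layer 3)
  • fdio - feature: fd.io based forwarding (owned by the FDS project)
  • ha - highly available deployment (at least 3 control nodes)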

8. OPNFV Scenarios in Apex

Apex provides pre-built scenario files in /etc/opnfv-apex which a user can select from to deploy the desired scenario. Simply pass the desired file to the installer as a (-d) deploy setting. Read further in the Apex documentation to learn more about invoking the deploy command. Below is a quick reference matrix for OPNFV scenarios supported in Apex. Please refer to the respective OPNFV Docs documentation for each scenario in order to see a full scenario description. Also, please refer to release-notes for information about known issues per scenario. The following scenarios correspond to a supported <Scenario>.yaml deploy settings file:

Scenario                     Owner         Supported
os-nosdn-nofeature-ha        Apex          Yes
os-nosdn-nofeature-noha      Apex          Yes
os-nosdn-ovs-ha              OVS for NFV   Yes
os-nosdn-ovs-noha            OVS for NFV   Yes
os-nosdn-fdio-ha             FDS           No
os-nosdn-fdio-noha           FDS           No
os-nosdn-kvm-ha              KVM for NFV   Yes
os-nosdn-kvm-noha            KVM for NFV   Yes
os-nosdn-performance-ha      Apex          Yes
os-odl_l3-nofeature-ha       Apex          Yes
os-odl_l3-nofeature-noha     Apex          Yes
os-odl_l3-ovs-ha             OVS for NFV   Yes
os-odl_l3-ovs-noha           OVS for NFV   Yes
os-odl-bgpvpn-ha             SDNVPN        Yes
os-odl-bgpvpn-noha           SDNVPN        Yes
os-odl-gluon-noha            GluOn         Yes
os-odl_l3-csit-noha          Apex          Yes
os-odl_l3-fdio-ha            FDS           Yes
os-odl_l3-fdio-noha          FDS           Yes
os-odl_l2-fdio-ha            FDS           Yes
os-odl_l2-fdio-noha          FDS           Yes
os-odl_l2-sfc-noha           SFC           No
os-onos-nofeature-ha         ONOSFW        No
os-onos-sfc-ha               ONOSFW        No
os-ovn-nofeature-noha        Apex          Yes
9. Setup Requirements
9.1. Jumphost Requirements

The Jumphost requirements are outlined below:

  1. CentOS 7 (from ISO or self-installed).
  2. Root access.
  3. libvirt virtualization support.
  4. A minimum of 1 network and a maximum of 5 networks; multiple NIC and/or VLAN combinations are supported. This is virtualized for a VM deployment.
  5. The Danube Apex RPMs and their dependencies.
  6. 16 GB of RAM for a bare metal deployment, 64 GB of RAM for a VM deployment.
9.2. Network Requirements

Network requirements include:

  1. No DHCP or TFTP server running on networks used by OPNFV.
  2. 1-5 separate networks with connectivity between Jumphost and nodes.
    • Control Plane (Provisioning)
    • Private Tenant-Networking Network*
    • External Network*
    • Storage Network*
    • Internal API Network* (required for IPv6 **)
  3. Lights out OOB network access from Jumphost with IPMI node enabled (bare metal deployment only).
  4. The External network is a routable network from outside the cloud deployment. The External network is where public internet access would reside, if available.

*These networks can be combined with each other or all combined on the Control Plane network.

**The Internal API network is, by default, collapsed with the provisioning network in IPv4 deployments. This is not possible in IPv6 deployments due to the current lack of IPv6 PXE boot support, and therefore the API network is required to be its own network in an IPv6 deployment.

9.3. Bare Metal Node Requirements

Bare metal nodes require:

  1. IPMI enabled on OOB interface for power control.
  2. BIOS boot priority should be PXE first then local hard disk.
  3. BIOS PXE interface should include Control Plane network mentioned above.
9.4. Execution Requirements (Bare Metal Only)

In order to execute a deployment, one must gather the following information:

  1. IPMI IP addresses for the nodes.
  2. IPMI login information for the nodes (user/pass).
  3. MAC address of Control Plane / Provisioning interfaces of the overcloud nodes.
10. Installation High-Level Overview - Bare Metal Deployment

The setup presumes that you have 6 or more bare metal servers already set up with network connectivity on at least 1 or more network interfaces for all servers via a TOR switch or other network implementation.

The physical TOR switches are not automatically configured from the OPNFV reference platform. All the networks involved in the OPNFV infrastructure as well as the provider networks and the private tenant VLANs need to be manually configured.

The Jumphost can be installed using the bootable ISO or by using the (opnfv-apex*.rpm) RPMs and their dependencies. The Jumphost should then be configured with an IP gateway on its admin or public interface and configured with a working DNS server. The Jumphost should also have routable access to the lights out network for the overcloud nodes.

opnfv-deploy is then executed in order to deploy the undercloud VM and to provision the overcloud nodes. opnfv-deploy uses three configuration files in order to know how to install and provision the OPNFV target system. The information gathered under section Execution Requirements (Bare Metal Only) is put into the YAML configuration file /etc/opnfv-apex/inventory.yaml. Deployment options are put into the YAML file /etc/opnfv-apex/deploy_settings.yaml. Alternatively, there are pre-baked deploy_settings files available in /etc/opnfv-apex/, named with the convention os-sdn_controller-enabled_feature-[no]ha.yaml. These files can be used in place of the /etc/opnfv-apex/deploy_settings.yaml file if one suits your deployment needs. Networking definitions gathered under section Network Requirements are put into the YAML file /etc/opnfv-apex/network_settings.yaml. opnfv-deploy will boot the undercloud VM and load the target deployment configuration into the provisioning toolchain. This information includes MAC address, IPMI, Networking Environment and OPNFV deployment options.

Once configuration is loaded and the undercloud is configured it will then reboot the overcloud nodes via IPMI. The nodes should already be set to PXE boot first off the admin interface. The nodes will first PXE off of the undercloud PXE server and go through a discovery/introspection process.

Introspection boots off of custom introspection PXE images. These images are designed to look at the properties of the hardware that is being booted and report the properties of it back to the undercloud node.

After introspection the undercloud will execute a Heat Stack Deployment to continue node provisioning and configuration. The nodes will reboot and PXE from the undercloud PXE server again to provision each node using Glance disk images provided by the undercloud. These disk images include all the necessary packages and configuration for an OPNFV deployment to execute. Once the disk images have been written to node’s disks the nodes will boot locally and execute cloud-init which will execute the final node configuration. This configuration is largely completed by executing a puppet apply on each node.

11. Installation High-Level Overview - VM Deployment

The VM nodes deployment operates almost the same way as the bare metal deployment with a few differences mainly related to power management. opnfv-deploy still deploys an undercloud VM. In addition to the undercloud VM a collection of VMs (3 control nodes + 2 compute for an HA deployment or 1 control node and 1 or more compute nodes for a Non-HA Deployment) will be defined for the target OPNFV deployment. The part of the toolchain that executes IPMI power instructions calls into libvirt instead of the IPMI interfaces on baremetal servers to operate the power management. These VMs are then provisioned with the same disk images and configuration that baremetal would be.

To Triple-O these nodes look like they have just been built and registered in the same way as bare metal nodes; the main difference is the use of a libvirt driver for the power management.

12. Installation Guide - Bare Metal Deployment

This section goes step-by-step on how to correctly install and provision the OPNFV target system to bare metal nodes.

12.1. Install Bare Metal Jumphost
1a. If your Jumphost does not have CentOS 7 already on it, or you would like to do a fresh install, then download the Apex bootable ISO from the OPNFV artifacts site <http://artifacts.opnfv.org/apex.html>. There have been isolated reports of problems with the ISO having trouble completing installation successfully. In the unexpected event the ISO does not work, please work around this by downloading the CentOS 7 DVD and performing a “Virtualization Host” install. If you perform a “Minimal Install” or an install type other than “Virtualization Host”, run sudo yum groupinstall "Virtualization Host" && chkconfig libvirtd on && reboot to install virtualization support and enable libvirt on boot. If you use the CentOS 7 DVD, proceed to step 1b once the CentOS 7 installation with “Virtualization Host” support is completed.
1b. If your Jump host already has CentOS 7 with libvirt running on it, then install the RDO Newton Release RPM and epel-release:

sudo yum install https://repos.fedorapeople.org/repos/openstack/openstack-newton/rdo-release-newton-4.noarch.rpm
sudo yum install epel-release

The RDO Project release repository is needed to install OpenVSwitch, which is a dependency of opnfv-apex. If you do not have external connectivity to use this repository you need to download the OpenVSwitch RPM from the RDO Project repositories and install it with the opnfv-apex RPM.

2a. Boot the ISO off of a USB or other installation media and walk through installing OPNFV CentOS 7. The ISO comes prepared to be written directly to a USB drive with dd as such:

dd if=opnfv-apex.iso of=/dev/sdX bs=4M

Replace /dev/sdX with the device assigned to your USB drive, then select the USB device as the boot media on your Jumphost.

2b. If your Jump host already has CentOS 7 with libvirt running on it, then install the opnfv-apex RPMs using the OPNFV artifacts yum repo. This yum repo is created at release; it will not exist before release day.

sudo yum install http://artifacts.opnfv.org/apex/danube/opnfv-apex-release-danube.noarch.rpm

Once you have installed the repo definitions for Apex, RDO and EPEL, install Apex with yum:

sudo yum install opnfv-apex

If ONOS will be used, install the ONOS rpm instead of the opnfv-apex rpm.

sudo yum install opnfv-apex-onos

2c. If you choose not to use the Apex yum repo or you choose to use pre-released RPMs, you can download and install the required RPMs from the artifacts site <http://artifacts.opnfv.org/apex.html>. The following RPMs are available for installation:

  • opnfv-apex - OpenDaylight L2 / L3 and ODL SFC support *
  • opnfv-apex-onos - ONOS support *
  • opnfv-apex-undercloud - (required) Undercloud Image
  • opnfv-apex-common - (required) Supporting config files and scripts
  • python34-markupsafe - (required) Dependency of opnfv-apex-common **
  • python3-jinja2 - (required) Dependency of opnfv-apex-common **
  • python3-ipmi - (required) Dependency of opnfv-apex-common **

* One or more of these RPMs is required. Only one of opnfv-apex or opnfv-apex-onos is needed. It is safe to leave the unneeded SDN controller’s RPM uninstalled if you do not intend to use it.

** These RPMs are not yet distributed by CentOS or EPEL. Apex builds and distributes them itself until CentOS and EPEL carry them. Once they are carried in an upstream channel, Apex will no longer carry them and they will not need special handling for installation.

The EPEL and RDO yum repos are still required:

sudo yum install epel-release
sudo yum install https://repos.fedorapeople.org/repos/openstack/openstack-newton/rdo-release-newton-4.noarch.rpm

Once the Apex RPMs are downloaded, install them by passing the file names directly to yum:

sudo yum install python34-markupsafe-<version>.rpm python3-jinja2-<version>.rpm python3-ipmi-<version>.rpm
sudo yum install opnfv-apex-<version>.rpm opnfv-apex-undercloud-<version>.rpm opnfv-apex-common-<version>.rpm

  3. After the operating system and the opnfv-apex RPMs are installed, log in to your Jumphost as root.
  4. Configure IP addresses on the interfaces that you have selected as your networks.
  5. Configure the IP gateway to the Internet, preferably on the public interface.
  6. Configure your /etc/resolv.conf to point to a DNS server (8.8.8.8 is provided by Google).
12.2. Creating a Node Inventory File

IPMI configuration information gathered in section Execution Requirements (Bare Metal Only) needs to be added to the inventory.yaml file.

  1. Copy /usr/share/doc/opnfv/inventory.yaml.example as your inventory file template to /etc/opnfv-apex/inventory.yaml.

  2. The nodes dictionary contains a definition block for each baremetal host that will be deployed. 1 or more compute nodes and 3 controller nodes are required. (The example file contains blocks for each of these already). It is optional at this point to add more compute nodes into the node list.

  3. Edit the following values for each node:

    • mac_address: MAC of the interface that will PXE boot from undercloud
    • ipmi_ip: IPMI IP Address
    • ipmi_user: IPMI username
    • ipmi_password: IPMI password
    • pm_type: Power Management driver to use for the node
      values: pxe_ipmitool (tested) or pxe_wol (untested) or pxe_amt (untested)
    • cpus: (Introspected*) CPU cores available
    • memory: (Introspected*) Memory available in MiB
    • disk: (Introspected*) Disk space available in GB
    • disk_device: (Opt***) Root disk device to use for installation
    • arch: (Introspected*) System architecture
    • capabilities: (Opt**) Node’s role in deployment
      values: profile:control or profile:compute

    * Introspection looks up the overcloud node’s resources and overrides these values. You can leave default values and Apex will get the correct values when it runs introspection on the nodes.

    ** If the capabilities profile is not specified, then Apex will select the nodes’ roles in the OPNFV cluster in a non-deterministic fashion.

    *** disk_device declares which hard disk to use as the root device for installation. The format is a comma delimited list of devices, such as “sda,sdb,sdc”. The disk chosen will be the first device in the list which is found by introspection to exist on the system. Currently, only a single definition is allowed for all nodes. Therefore if multiple disk_device definitions occur within the inventory, only the last definition on a node will be used for all nodes.
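
For reference, a single node definition using the fields above could look like the following sketch (MAC address, IP and credentials are placeholders; compare with the shipped /usr/share/doc/opnfv/inventory.yaml.example for the exact surrounding structure):

node1:
  mac_address: "00:1e:67:00:00:01"
  ipmi_ip: 192.168.20.11
  ipmi_user: admin
  ipmi_password: password
  pm_type: "pxe_ipmitool"
  cpus: 2
  memory: 8192
  disk: 40
  arch: "x86_64"
  capabilities: "profile:control"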

12.3. Creating the Settings Files

Edit the 2 settings files in /etc/opnfv-apex/. These files have comments to help you customize them.

  1. deploy_settings.yaml This file includes basic configuration options for the deployment and also documents all available options. Alternatively, there are pre-built deploy_settings files available in /etc/opnfv-apex/, named with the convention os-sdn_controller-enabled_feature-[no]ha.yaml. If one of these suits your deployment needs, it can be used in place of /etc/opnfv-apex/deploy_settings.yaml and no further customization of that file is needed.
  2. network_settings.yaml This file provides Apex with the networking information that satisfies the prerequisite Network Requirements. These are specific to your environment.
12.4. Running opnfv-deploy

You are now ready to deploy OPNFV using Apex! opnfv-deploy will use the inventory and settings files to deploy OPNFV.

Follow the steps below to execute:

  1. Execute opnfv-deploy: sudo opnfv-deploy -n network_settings.yaml -i inventory.yaml -d deploy_settings.yaml. If you need more information about the options that can be passed to opnfv-deploy, use opnfv-deploy --help. The -n network_settings.yaml option allows you to customize your networking topology.
  2. Wait while deployment is executed. If something goes wrong during this part of the process, start by reviewing your network or the information in your configuration files. It’s not uncommon for something small to be overlooked or mis-typed. You will also notice outputs in your shell as the deployment progresses.
  3. When the deployment is complete the undercloud IP and overcloud dashboard URL will be printed. OPNFV has now been deployed using Apex.
13. Installation High-Level Overview - Virtual Deployment

The VM nodes deployment operates almost the same way as the bare metal deployment with a few differences. opnfv-deploy still deploys an undercloud VM. In addition to the undercloud VM a collection of VMs (3 control nodes + 2 compute for an HA deployment or 1 control node and 1 or more compute nodes for a non-HA Deployment) will be defined for the target OPNFV deployment. The part of the toolchain that executes IPMI power instructions calls into libvirt instead of the IPMI interfaces on baremetal servers to operate the power management. These VMs are then provisioned with the same disk images and configuration that baremetal would be. To Triple-O these nodes look like they have just built and registered the same way as bare metal nodes, the main difference is the use of a libvirt driver for the power management. Finally, the default network_settings file will deploy without modification. Customizations are welcome but not needed if a generic set of network_settings are acceptable.

14. Installation Guide - Virtual Deployment

This section goes step-by-step on how to correctly install and provision the OPNFV target system to VM nodes.

14.1. Special Requirements for Virtual Deployments

In scenarios where advanced performance options or features are used, such as huge pages with nova instances, DPDK, or IOMMU, it is required to enable nested KVM support. This allows hardware extensions to be passed to the overcloud VMs, which allows the overcloud compute nodes to bring up nova instances as KVM guests rather than QEMU. This also provides a significant performance increase even in scenarios that do not require it, and enabling it is recommended.

During deployment the Apex installer will detect if nested KVM is enabled, and if not, it will attempt to enable it, printing a warning message if it cannot. Before deployment, check that Nested Virtualization is enabled in the BIOS and that the output of cat /sys/module/kvm_intel/parameters/nested returns “Y”. Also verify using lsmod that the kvm_intel module is loaded on Intel x86_64 machines, and kvm_amd on AMD machines.
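
On an Intel-based jumphost the pre-deployment checks described above can be run as follows (substitute kvm_amd for kvm_intel on AMD machines):

cat /sys/module/kvm_intel/parameters/nested
lsmod | grep kvm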

14.2. Install Jumphost

Follow the instructions in the Install Bare Metal Jumphost section.

14.3. Running opnfv-deploy

You are now ready to deploy OPNFV! opnfv-deploy has virtual deployment capability that includes all of the configuration necessary to deploy OPNFV with no modifications.

If no modifications are made to the included configurations the target environment will deploy with the following architecture:

  • 1 undercloud VM
  • The option of 3 control and 2 or more compute VMs (HA Deploy / default) or 1 control and 1 or more compute VM (Non-HA deploy / pass -n)
  • 1-5 networks: provisioning, private tenant networking, external, storage and internal API. The API, storage and tenant networking networks can be collapsed onto the provisioning network.

Follow the steps below to execute:

  1. sudo opnfv-deploy -v [ --virtual-computes n ] [ --virtual-cpus n ] [ --virtual-ram n ] -n network_settings.yaml -d deploy_settings.yaml
  2. It will take approximately 45 minutes to an hour to stand up undercloud, define the target virtual machines, configure the deployment and execute the deployment. You will notice different outputs in your shell.
  3. When the deployment is complete the IP for the undercloud and a url for the OpenStack dashboard will be displayed
14.4. Verifying the Setup - VMs

To verify the setup you can follow the instructions in the Verifying the Setup section.

15. Verifying the Setup

Once the deployment has finished, the OPNFV deployment can be accessed via the undercloud node. From the jump host, ssh to the undercloud host and become the stack user. Alternatively, ssh keys have been set up such that the root user on the jump host can ssh to the undercloud directly as the stack user. For convenience, a utility script has been provided to look up the undercloud’s IP address and ssh to the undercloud all in one command. An optional user name can be passed to indicate whether to connect as the stack or root user; the stack user is the default if a username is not specified.

opnfv-util undercloud root
su - stack

Once connected to undercloud as the stack user look for two keystone files that can be used to interact with the undercloud and the overcloud. Source the appropriate RC file to interact with the respective OpenStack deployment.

source stackrc (undercloud)
source overcloudrc (overcloud / OPNFV)

The contents of these files include the credentials for the administrative user for the undercloud and OPNFV respectively. At this point both the undercloud and OPNFV can be interacted with just as any OpenStack installation can be. Start by listing the nodes in the undercloud that were used to deploy the overcloud.

source stackrc
openstack server list

The control and compute nodes will be listed in the output of this server list command. The IP addresses that are listed are the control plane addresses that were used to provision the nodes. Use these IP addresses to connect to these nodes. Initial authentication requires using the user heat-admin.

ssh heat-admin@192.0.2.7

To begin creating users, images, networks, servers, etc in OPNFV source the overcloudrc file or retrieve the admin user’s credentials from the overcloudrc file and connect to the web Dashboard.

You are now able to follow the OpenStack Verification section.

16. OpenStack Verification

Once connected to the OPNFV Dashboard make sure the OPNFV target system is working correctly:

  1. In the left pane, click Compute -> Images, click Create Image.
  2. Insert a name “cirros”, Insert an Image Location http://download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img.
  3. Select format “QCOW2”, select Public, then click Create Image.
  4. Now click Project -> Network -> Networks, click Create Network.
  5. Enter a name “internal”, click Next.
  6. Enter a subnet name “internal_subnet”, and enter Network Address 172.16.1.0/24, click Next.
  7. Now go to Project -> Compute -> Instances, click Launch Instance.
  8. Enter Instance Name “first_instance”, select Instance Boot Source “Boot from image”, and then select Image Name “cirros”.
  9. Click Launch; the status will cycle through a couple of states before becoming “Active”.
  10. Steps 7 through 9 can be repeated to launch more instances.
  11. Once an instance becomes “Active”, its IP addresses will be displayed on the Instances page.
  12. Click the name of an instance, then the “Console” tab and login as “cirros”/”cubswin:)”
  13. To verify storage is working, click Project -> Compute -> Volumes, Create Volume
  14. Give the volume a name and a size of 1 GB
  15. Once the volume becomes “Available” click the dropdown arrow and attach it to an instance.

Congratulations, you have successfully installed OPNFV!
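
The same verification can also be sketched from the command line after sourcing overcloudrc (illustrative only; the exact openstack client options depend on the client version installed, and the m1.tiny flavor is just an example that may need to be created first):

source overcloudrc
wget http://download.cirros-cloud.net/0.3.5/cirros-0.3.5-x86_64-disk.img
openstack image create --disk-format qcow2 --container-format bare --public \
  --file cirros-0.3.5-x86_64-disk.img cirros
openstack network create internal
openstack subnet create --network internal --subnet-range 172.16.1.0/24 internal_subnet
openstack server create --image cirros --flavor m1.tiny --network internal first_instance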

17. Developer Guide and Troubleshooting

This section aims to explain in more detail the steps that Apex follows to make a deployment. It also tries to explain possible issues you might find in the process of building or deploying an environment.

After installing the Apex RPMs in the jumphost, some files will be located around the system.

  1. /etc/opnfv-apex: this directory contains a number of scenarios to be deployed with different characteristics such as HA (High Availability), SDN controller integration (OpenDaylight/ONOS), BGPVPN, FDIO, etc. Having a look at any of these files will give you an idea of how to make a customized scenario by setting different flags.
  2. /usr/bin/: it contains the binaries for the commands opnfv-deploy, opnfv-clean and opnfv-util.
  3. /var/opt/opnfv/: it contains several files and directories.

3.1. images/: this folder contains the images that will be deployed according to the chosen scenario.

3.2. lib/: a collection of scripts that will be executed in the different phases of deployment.

17.1. Utilization of Images

As mentioned earlier in this guide, the Undercloud VM will be in charge of deploying OPNFV (Overcloud VMs). Since the Undercloud is an all-in-one OpenStack deployment, it will use Glance to manage the images that will be deployed as the Overcloud.

So whatever customization is done to the images located on the jumpserver (/var/opt/opnfv/images) will be uploaded to the undercloud and, consequently, to the overcloud.

Make sure the customization is performed on the right image. For example, suppose you virt-customize the image overcloud-full-opendaylight.qcow2, but then deploy OPNFV with the following command:

sudo opnfv-deploy -n network_settings.yaml -d /etc/opnfv-apex/os-onos-nofeature-ha.yaml

The customization will not have any effect on the deployment, since the customized image is the OpenDaylight one, while the scenario indicates that the image to be deployed is overcloud-full-onos.qcow2.
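
For example, a hypothetical customization of the OpenDaylight overcloud image could be done with virt-customize before running opnfv-deploy (the installed package is only illustrative):

virt-customize -a /var/opt/opnfv/images/overcloud-full-opendaylight.qcow2 --install tcpdump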

17.2. Post-deployment Configuration

Post-deployment scripts will perform some configuration tasks such as ssh-key injection, network configuration, NAT and Open vSwitch creation. They will also take care of some OpenStack tasks such as the creation of endpoints, external networks, users, projects, etc.

If any of these steps fail, the execution will be interrupted. In some cases, the interruption occurs at very early stages, so a new deployment must be executed. However, in other cases it may be worth trying to debug the problem.

  1. There is no external connectivity from the overcloud nodes:

    Post-deployment scripts will configure the routing, nameservers and a number of other things between the overcloud and the undercloud. If local connectivity, like pinging between the different nodes, is working fine, the script probably failed when configuring the NAT via iptables. The main rules to enable external connectivity would look like these:

    iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
    iptables -t nat -A POSTROUTING -s ${external_cidr} -o eth0 -j MASQUERADE
    iptables -A FORWARD -i eth2 -j ACCEPT
    iptables -A FORWARD -s ${external_cidr} -m state --state ESTABLISHED,RELATED -j ACCEPT
    service iptables save

    These rules must be executed as root (or sudo) in the undercloud machine.

17.3. OpenDaylight Integration

When a user deploys a scenario that starts with os-odl*:

OpenDaylight (ODL) SDN controller will be deployed and integrated with OpenStack. ODL will run as a systemd service, and can be managed as a regular service:

systemctl start/restart/stop opendaylight.service

This command must be executed as root in the controller node of the overcloud, where OpenDaylight is running. ODL files are located in /opt/opendaylight. ODL uses karaf as a Java container management system that allows the users to install new features, check logs and configure a lot of things. In order to connect to Karaf’s console, use the following command:

opnfv-util opendaylight

This command is very easy to use, but in case it is not connecting to Karaf, this is the command that it executes underneath:

ssh -p 8101 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no karaf@localhost

Of course, use localhost when the command is executed on the overcloud controller; use its public IP to connect from elsewhere.
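
Once connected, a couple of Karaf console commands that are typically useful (a sketch; the installed feature names vary per scenario):

feature:list -i | grep odl
log:tail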

17.4. Debugging Failures

This section gathers different types of failures, their root causes and some possible solutions or workarounds to get the process moving again.

  1. I can see post-deployment error messages in the output log:

    Heat resources will apply puppet manifests during this phase. If one of these processes fails, you can look at the error and then re-run puppet to apply that manifest. Log into the controller (see the verification section for that) and check /var/log/messages as root. Search for the error you have encountered and see if you can fix it. In order to re-run the puppet manifest, search for “puppet apply” in that same log. You will have to run the last “puppet apply” before the error. It should look like this:

    FACTER_heat_outputs_path="/var/run/heat-config/heat-config-puppet/5b4c7a01-0d63-4a71-81e9-d5ee6f0a1f2f" \
    FACTER_fqdn="overcloud-controller-0.localdomain.com" \
    FACTER_deploy_config_name="ControllerOvercloudServicesDeployment_Step4" \
    puppet apply --detailed-exitcodes -l syslog -l console \
        /var/lib/heat-config/heat-config-puppet/5b4c7a01-0d63-4a71-81e9-d5ee6f0a1f2f.pp

    As a comment, Heat will trigger the puppet run via os-apply-config and it will pass a different value for step each time. There is a total of five steps. Some of these steps will not be executed depending on the type of scenario that is being deployed.

18. Frequently Asked Questions
19. License

All Apex and “common” entities are protected by the Apache 2.0 License.

20. References
20.3. OpenDaylight

Upstream OpenDaylight provides a number of packaging and deployment options meant for consumption by downstream projects like OPNFV.

Currently, OPNFV Apex uses OpenDaylight’s Puppet module, which in turn depends on OpenDaylight’s RPM.

20.4. RDO Project

RDO Project website

Authors: Tim Rozet (trozet@redhat.com)
Authors: Dan Radez (dradez@redhat.com)
Version: 4.0

Armband

Installation instruction for Fuel@OPNFV on AArch64
1. Abstract

This document describes how to install the Danube release of OPNFV when using Fuel as a deployment tool, with an AArch64 (only) target node pool. It covers its usage, limitations, dependencies and required system resources.

2. Introduction

This document provides guidelines on how to install and configure the Danube release of OPNFV when using Fuel as a deployment tool, with an AArch64 (only) target node pool, including required software and hardware configurations.

Although the available installation options give a high degree of freedom in how the system is set-up, including architecture, services and features, etc., said permutations may not provide an OPNFV compliant reference architecture. This instruction provides a step-by-step guide that results in an OPNFV Danube compliant deployment.

The audience of this document is assumed to have good knowledge in networking and Unix/Linux administration.

3. Preface

Before starting the installation of the AArch64 Danube release of OPNFV, using Fuel as a deployment tool, some planning must be done.

3.1. Retrieving the ISO image

First of all, the Fuel deployment ISO image needs to be retrieved; the ArmbandFuel .iso image of the AArch64 Danube release can be found at OPNFV Downloads.

3.2. Building the ISO image

Alternatively, you may build the Armband Fuel .iso from source by cloning the opnfv/armband git repository. To retrieve the repository for the AArch64 Danube release use the following command:

$ git clone https://gerrit.opnfv.org/gerrit/armband

Check out the Danube release tag to set the HEAD to the baseline required to replicate the Danube release:

$ git checkout danube.3.0

Go to the armband directory and build the .iso:

$ cd armband; make all

For more information on how to build, please see Build instruction for Fuel@OPNFV

3.3. Other preparations

Next, familiarize yourself with Fuel by reading the following documents:

Prior to installation, a number of deployment-specific parameters must be collected; those are:

  1. Provider sub-net and gateway information
  2. Provider VLAN information
  3. Provider DNS addresses
  4. Provider NTP addresses
  5. Network overlay you plan to deploy (VLAN, VXLAN, FLAT)
  6. How many nodes and what roles you want to deploy (Controllers, Storage, Computes)
  7. Monitoring options you want to deploy (Ceilometer, Syslog, etc.).
  8. Other options not covered in the document are available in the links above

This information will be needed for the configuration procedures provided in this document.

4. Hardware requirements

The following minimum hardware requirements must be met for the installation of AArch64 Danube using Fuel:

HW Aspect Requirement

# of AArch64 nodes
  Minimum 5 (3 for non redundant deployment):
    • 1 Fuel deployment master (may be virtualized)
    • 3(1) Controllers (1 colocated mongo/ceilometer role, 2 Ceph-OSD roles)
    • 1 Compute (1 co-located Ceph-OSD role)
CPU
  Minimum 1 socket AArch64 (ARMv8) with Virtualization support
RAM
  Minimum 16GB/server (Depending on VNF work load)
Firmware
  UEFI compatible (e.g. EDK2) with PXE support
Disk
  Minimum 256GB 10kRPM spinning disks
Networks
  4 Tagged VLANs (PUBLIC, MGMT, STORAGE, PRIVATE)
  1 Un-Tagged VLAN for PXE Boot - ADMIN Network
  Note: These can be allocated to a single NIC or spread out over multiple NICs as your hardware supports.
1 x86_64 node
  • 1 Fuel deployment master, x86 (may be virtualized)
5. Help with Hardware Requirements

Calculate hardware requirements:

For information on compatible hardware types available for use, please see Fuel OpenStack Hardware Compatibility List.

When choosing the hardware on which you will deploy your OpenStack environment, you should think about:

  • CPU – Consider the number of virtual machines that you plan to deploy in your cloud environment and the CPU per virtual machine.
  • Memory – Depends on the amount of RAM assigned per virtual machine and the controller node.
  • Storage – Depends on the local drive space per virtual machine, remote volumes that can be attached to a virtual machine, and object storage.
  • Networking – Depends on the Choose Network Topology, the network bandwidth per virtual machine, and network storage.
6. Top of the rack (TOR) Configuration requirements

The switching infrastructure provides connectivity for the OPNFV infrastructure operations, tenant networks (East/West) and provider connectivity (North/South); it also provides needed connectivity for the Storage Area Network (SAN). To avoid traffic congestion, it is strongly suggested that three physically separated networks are used, that is: one physical network for administration and control, one physical network for tenant private and public networks, and one physical network for SAN. The switching connectivity can (but does not need to) be fully redundant; in such case it comprises a redundant 10GE switch pair for each of the three physically separated networks.

The physical TOR switches are not automatically configured from the Fuel OPNFV reference platform. All the networks involved in the OPNFV infrastructure as well as the provider networks and the private tenant VLANs need to be manually configured.

Manual configuration of the Danube hardware platform should be carried out according to the OPNFV Pharos Specification.

7. OPNFV Software installation and deployment

This section describes the installation of the OPNFV installation server (Fuel master) as well as the deployment of the full OPNFV reference platform stack across a server cluster.

7.1. Install Fuel master
  1. Mount the Danube Armband Fuel ISO file/media as a boot device to the jump host server.

  2. Reboot the jump host to establish the Fuel server.

    • The system now boots from the ISO image.
    • Select “Fuel Install (Static IP)” (See figure below)
    • Press [Enter].
    _images/grub-1.png
  3. Wait until the Fuel setup screen is shown (Note: This can take up to 30 minutes).

  4. In the “Fuel User” section - Confirm/change the default password (See figure below)

    • Enter “admin” in the Fuel password input
    • Enter “admin” in the Confirm password input
    • Select “Check” and press [Enter]
    _images/fuelmenu1.png
  5. In the “Network Setup” section - Configure DHCP/Static IP information for your FUEL node - For example, ETH0 is 10.20.0.2/24 for FUEL booting and ETH1 is DHCP in your corporate/lab network (see figure below).

    • Configure eth1 or other network interfaces here as well (if you have them present on your FUEL server).
    _images/fuelmenu2.png
    _images/fuelmenu2a.png
  6. In the “PXE Setup” section (see figure below) - Change the following fields to appropriate values (example below):

    • DHCP Pool Start 10.20.0.4
    • DHCP Pool End 10.20.0.254
    • DHCP Pool Gateway 10.20.0.2 (IP address of Fuel node)
    _images/fuelmenu3.png
  7. In the “DNS & Hostname” section (see figure below) - Change the following fields to appropriate values:

    • Hostname
    • Domain
    • Search Domain
    • External DNS
    • Hostname to test DNS
    • Select <Check> and press [Enter]
    _images/fuelmenu4.png
  8. OPTION TO ENABLE PROXY SUPPORT - In the “Bootstrap Image” section (see figure below), edit the following fields to define a proxy. (NOTE: cannot be used in tandem with local repository support)

    • Navigate to “HTTP proxy” and enter your http proxy address
    • Select <Check> and press [Enter]
    _images/fuelmenu5.png
  9. In the “Time Sync” section (see figure below) - Change the following fields to appropriate values:

    • NTP Server 1 <Customer NTP server 1>
    • NTP Server 2 <Customer NTP server 2>
    • NTP Server 3 <Customer NTP server 3>
    _images/fuelmenu6.png
  10. In the “Feature groups” section - Enable “Experimental features” if you plan on using Ceilometer and/or MongoDB.

    NOTE: Ceilometer and MongoDB are experimental features starting with Danube.1.0.

  11. Start the installation.

    NOTE: Saving each section and hitting <F8> does not apply all settings!

    • Select Quit Setup and press Save and Quit.
    • The installation will now start, wait until the login screen is shown.
7.2. Boot the Node Servers

After the Fuel Master node has rebooted from the above steps and is at the login prompt, you should boot the Node Servers (Your Compute/Control/Storage blades, nested or real) with a PXE booting scheme so that the FUEL Master can pick them up for control.

NOTE: AArch64 target nodes are expected to support PXE booting an EFI binary, i.e. an EFI-stubbed GRUB2 bootloader.

NOTE: UEFI (EDK2) firmware is highly recommended, becoming the de facto standard for ARMv8 nodes.

  1. Enable PXE booting

    • For every controller and compute server: enable PXE Booting as the first boot device in the UEFI (EDK2) boot order menu, and hard disk as the second boot device in the same menu.
  2. Reboot all the control and compute blades.

  3. Wait for the availability of nodes showing up in the Fuel GUI.

    • Connect to the FUEL UI via the URL provided in the Console (default: https://10.20.0.2:8443)
    • Wait until all nodes are displayed in top right corner of the Fuel GUI: Total nodes and Unallocated nodes (see figure below).
    _images/nodes.png
7.3. Install additional Plugins/Features on the FUEL node
  1. SSH to your FUEL node (e.g. root@10.20.0.2 pwd: r00tme)

  2. Select wanted plugins/features from the /opt/opnfv/ directory.

  3. Install the wanted plugin with the command

    $ fuel plugins --install /opt/opnfv/<plugin-name>-<version>.<arch>.rpm
    

    Expected output (see figure below):

    Plugin ....... was successfully installed.
    
    _images/plugin_install.png

    NOTE: AArch64 Danube 3.0 ships only with ODL, OVS, BGPVPN, SFC and Tacker plugins, see Reference 15.
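
    If you want all the plugins shipped in /opt/opnfv, a small shell loop (a convenience sketch only, equivalent to running the install command above once per RPM) can be used:

    $ for plugin in /opt/opnfv/*.rpm; do fuel plugins --install "$plugin"; done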

7.4. Create an OpenStack Environment
  1. Connect to Fuel WEB UI with a browser (default: https://10.20.0.2:8443) (login: admin/admin)

  2. Create and name a new OpenStack environment, to be installed.

    _images/newenv.png
  3. Select “<Newton on Ubuntu 16.04 (aarch64)>” and press <Next>

  4. Select “compute virtualization method”.

    • Select “QEMU-KVM as hypervisor” and press <Next>
  5. Select “network mode”.

    • Select “Neutron with ML2 plugin”
    • Select “Neutron with tunneling segmentation” (Required when using the ODL plugin)
    • Press <Next>
  6. Select “Storage Back-ends”.

    • Select “Ceph for block storage” and press <Next>
  7. Select “additional services” you wish to install.

    • Check option “Install Ceilometer and Aodh” and press <Next>
  8. Create the new environment.

    • Click <Create> Button
7.5. Configure the network environment
  1. Open the environment you previously created.

  2. Open the networks tab and select the “default” Node Networks group on the left pane (see figure below).

    _images/network.png
  3. Update the Public network configuration and change the following fields to appropriate values:

    • CIDR to <CIDR for Public IP Addresses>
    • IP Range Start to <Public IP Address start>
    • IP Range End to <Public IP Address end>
    • Gateway to <Gateway for Public IP Addresses>
    • Check <VLAN tagging>.
    • Set appropriate VLAN id.
  4. Update the Storage Network Configuration

    • Set CIDR to appropriate value (default 192.168.1.0/24)
    • Set IP Range Start to appropriate value (default 192.168.1.1)
    • Set IP Range End to appropriate value (default 192.168.1.254)
    • Set vlan to appropriate value (default 102)
  5. Update the Management network configuration.

    • Set CIDR to appropriate value (default 192.168.0.0/24)
    • Set IP Range Start to appropriate value (default 192.168.0.1)
    • Set IP Range End to appropriate value (default 192.168.0.254)
    • Check <VLAN tagging>.
    • Set appropriate VLAN id. (default 101)
  6. Update the Private Network Information

    • Set CIDR to appropriate value (default 192.168.2.0/24)
    • Set IP Range Start to appropriate value (default 192.168.2.1)
    • Set IP Range End to appropriate value (default 192.168.2.254)
    • Check <VLAN tagging>.
    • Set appropriate VLAN tag (default 103)
  7. Select the “Neutron L3” Node Networks group on the left pane.

    _images/neutronl3.png
  8. Update the Floating Network configuration.

    • Set the Floating IP range start (default 172.16.0.130)
    • Set the Floating IP range end (default 172.16.0.254)
    • Set the Floating network name (default admin_floating_net)
  9. Update the Internal Network configuration.

    • Set Internal network CIDR to an appropriate value (default 192.168.111.0/24)
    • Set Internal network gateway to an appropriate value
    • Set the Internal network name (default admin_internal_net)
  10. Update the Guest OS DNS servers.

    • Set Guest OS DNS Server values appropriately
  11. Save Settings.

  12. Select the “Other” Node Networks group on the left pane (see figure below).

    _images/other.png
  13. Update the Public network assignment.

    • Check the box for “Assign public network to all nodes” (Required by OpenDaylight)
  14. Update Host OS DNS Servers.

    • Provide the DNS server settings
  15. Update Host OS NTP Servers.

    • Provide the NTP server settings
7.6. Select Hypervisor type
  1. In the FUEL UI of your Environment, click the “Settings” Tab

  2. Select “Compute” on the left side pane (see figure below)

    • Check the KVM box and press “Save settings”
    _images/compute.png
7.7. Enable Plugins
  1. In the FUEL UI of your Environment, click the “Settings” Tab

  2. Select Other on the left side pane (see figure below)

    • Enable and configure the plugins of your choice
    _images/plugins_aarch64.png
7.8. Allocate nodes to environment and assign functional roles
  1. Click on the “Nodes” Tab in the FUEL WEB UI (see figure below).

    _images/addnodes.png
  2. Assign roles (see figure below).

    • Click on the <+Add Nodes> button
    • Check <Controller>, <Telemetry - MongoDB> and optionally an SDN Controller role (OpenDaylight controller) in the “Assign Roles” Section.
    • Check one node which you want to act as a Controller from the bottom half of the screen
    • Click <Apply Changes>.
    • Click on the <+Add Nodes> button
    • Check the <Controller> and <Storage - Ceph OSD> roles.
    • Check the next two nodes you want to act as Controllers from the bottom half of the screen
    • Click <Apply Changes>
    • Click on <+Add Nodes> button
    • Check the <Compute> and <Storage - Ceph OSD> roles.
    • Check the Nodes you want to act as Computes from the bottom half of the screen
    • Click <Apply Changes>.
    _images/computelist.png
  3. Configure interfaces (see figure below).

    • Check Select <All> to select all allocated nodes
    • Click <Configure Interfaces>
    • Assign interfaces (bonded) for mgmt-, admin-, private-, public- and storage networks
    • Click <Apply>
    _images/interfaceconf.png
7.9. OPTIONAL - Set Local Mirror Repos

NOTE: Support for local mirrors is incomplete in Danube 3.0. You may opt in for it to fetch fewer packages from the Internet during deployment, but an Internet connection is still required.

The following steps must be executed if you are in an environment with no connection to the Internet. The Fuel server delivers a local repo that can be used for installation / deployment of OpenStack.

  1. In the Fuel UI of your Environment, click the Settings Tab and select General from the left pane.
    • Replace the URI values for the “Name” values outlined below:
    • “ubuntu” URI=”deb http://<ip-of-fuel-server>:8080/mirrors/ubuntu/ xenial main”
    • “mos” URI=”deb http://<ip-of-fuel-server>:8080/newton-10.0/ubuntu/x86_64 mos10.0 main restricted”
    • “Auxiliary” URI=”deb http://<ip-of-fuel-server>:8080/newton-10.0/ubuntu/auxiliary auxiliary main restricted”
    • Click <Save Settings> at the bottom to Save your changes
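
As an optional sanity check, you can verify from a machine with access to the Fuel server that the local “ubuntu” mirror URI configured above is actually being served; an HTTP 2xx/3xx status in the first response line indicates the repo is reachable:

$ curl -sI http://<ip-of-fuel-server>:8080/mirrors/ubuntu/ | head -n 1
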
7.10. Target specific configuration
  1. [AArch64 specific] Configure MySQL WSREP SST provider

    NOTE: This option is only available for ArmbandFuel@OPNFV, since it currently only affects AArch64 targets (see Reference 15).

    When using some AArch64 platforms as controller nodes, WSREP SST synchronisation using the default backend provider (xtrabackup-v2) used to fail, so a mechanism that allows selecting a different WSREP SST provider has been introduced.

    In the FUEL UI of your Environment, click the <Settings> tab, click <OpenStack Services> on the left side pane (see figure below), then select one of the following options:

    • xtrabackup-v2 (default provider, AArch64 stability issues);
    • rsync (AArch64 validated, better or comparable speed to xtrabackup, takes the donor node offline during state transfer);
    • mysqldump (untested);
    _images/fuelwsrepsst.png
  2. [AArch64 specific] Using a different kernel

    NOTE: By default, a 4.8 based kernel is used, for enabling experimental GICv3 features (e.g. live migration) and SFC support (required by OVS-NSH).

    To use Ubuntu Xenial LTS generic kernel (also available in offline mirror), in the FUEL UI of your Environment, click the <Settings> tab, click <General> on the left side pane, then at the bottom of the page, in the <Provision> subsection, amend the package list:

    • add <linux-headers-generic-lts-xenial>;
    • add <linux-image-generic-lts-xenial>;
    • add <linux-image-extra-lts-xenial> (optional);
    • remove <linux-image-4.8.0-9944-generic>;
    • remove <linux-headers-4.8.0-9944-generic>;
    • remove <linux-image-extra-4.8.0-9944-generic>;
  3. Set up targets for provisioning with non-default “Offloading Modes”

    Some target nodes may require additional configuration after they are PXE booted (bootstrapped); the most frequent changes are in defaults for ethernet devices’ “Offloading Modes” settings (e.g. some targets’ ethernet drivers may strip VLAN traffic by default).

    If your target ethernet drivers have wrong “Offloading Modes” defaults, in “Configure interfaces” page (described above), expand affected interface’s “Offloading Modes” and [un]check the relevant settings (see figure below):

    _images/offloadingmodes.png
  4. Set up targets for “Verify Networks” with non-default “Offloading Modes”

    NOTE: Check Reference 15 for an updated and comprehensive list of known issues and/or limitations, including “Offloading Modes” not being applied during “Verify Networks” step.

    Setting custom “Offloading Modes” in the Fuel GUI will only apply those settings during provisioning and not during “Verify Networks”, so if your targets need this change, you have to apply the “Offloading Modes” settings by hand to the bootstrapped nodes.

    E.g.: Our driver has “rx-vlan-filter” default “on” (expected “off”) on the OpenStack interface(s) “eth1”, preventing VLAN traffic from passing during “Verify Networks”.

    • From Fuel master console identify target nodes admin IPs (see figure below):

      $ fuel nodes
      
      _images/fuelconsole1.png
    • SSH into each of the target nodes and disable “rx-vlan-filter” on the affected physical interface(s) allocated for OpenStack traffic (eth1):

      $ ssh root@10.20.0.6 ethtool -K eth1 rx-vlan-filter off
      
    • Repeat the step above for all affected nodes/interfaces in the POD.
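
    If many nodes are affected, the per-node commands above can be scripted from the Fuel master console. The sketch below assumes the admin IP is the fifth '|'-separated column of the fuel nodes output and that eth1 on the 10.20.0.0/24 admin network is the affected interface on every target; adjust both to your environment:

      for ip in $(fuel nodes 2>/dev/null | awk -F'|' 'NR>2 {gsub(/ /,"",$5); print $5}' | grep '^10\.20\.'); do
          ssh root@"$ip" ethtool -K eth1 rx-vlan-filter off
      done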

7.11. Verify Networks

It is important that the Verify Networks action is performed as it will verify that communication works for the networks you have set up, and check that the packages needed for a successful deployment can be fetched.

  1. From the FUEL UI in your Environment, Select the Networks Tab and select “Connectivity check” on the left pane (see figure below)

    • Select <Verify Networks>
    • Continue to fix your topology (physical switch, etc) until the “Verification Succeeded” and “Your network is configured correctly” message is shown
    _images/verifynet.png
7.12. Deploy Your Environment
  1. Deploy the environment.

    • In the Fuel GUI, click on the “Dashboard” Tab.
    • Click on <Deploy Changes> in the “Ready to Deploy?” section
    • Examine any information notice that pops up and click <Deploy>

    Wait for your deployment to complete, you can view the “Dashboard” Tab to see the progress and status of your deployment.

8. Installation health-check
  1. Perform system health-check (see figure below)

    • Click the “Health Check” tab inside your Environment in the FUEL Web UI
    • Check <Select All> and Click <Run Tests>
    • Allow tests to run and investigate results where appropriate
    • Check Reference 15 for known issues / limitations on AArch64
    _images/health.png
9. Release Notes

Please refer to the Release Notes article.

Availability

PROPOSAL: Reach-thru Guest Monitoring and Services for High Availability
1   Overview
author: Greg Waines
organization: Wind River Systems
organization: OPNFV - High Availability
status: Draft - PROPOSAL
date: March 2017
revision: 1.5
abstract: This document presents a PROPOSAL for a set of new optional capabilities where the OpenStack Cloud messages into the Guest VMs in order to provide improved Availability of the hosted VMs. The initial set of new capabilities includes: enabling the detection of and recovery from internal VM faults, and providing a simple out-of-band messaging service to prevent scenarios such as split brain.
2   Introduction

This document provides an overview and rationale for a PROPOSAL of a set of new capabilities where the OpenStack Cloud messages into the Guest VMs in order to provide improved Availability of the hosted VMs.

The initial set of new capabilities specifically include:

  • VM Heartbeating and Health Checking
  • VM Peer State Notification and Messaging

All of these capabilities leverage Host-to-Guest Messaging Interfaces / APIs built on a messaging service between the OpenStack Host and the Guest VM. This service uses a simple low-bandwidth datagram messaging capability in the hypervisor, therefore has no dependency on OpenStack Networking, and is available very early after the VM is spawned.

For each capability, the document outlines the interaction with the Guest VM, any key technologies involved, the integration into the larger OpenStack and OPNFV Architectures (e.g. interactions with VNFM), specific OPNFV HA Team deliverables, and the use cases for how availability of the hosted VM is improved.

The intent is for the OPNFV HA Team to review the proposals of this document with the related other teams in OPNFV (Doctor and Management & Orchestration (MANO)) and OpenStack (Nova).

3   Messaging Layer

The Host-to-Guest messaging APIs used by the services discussed in this document use a JSON-formatted application messaging layer on top of a ‘virtio serial device’ between QEMU on the OpenStack Host and the Guest VM. JSON formatting provides a simple, human-readable messaging format which can be easily parsed and formatted using any high level programming language being used in the Guest VM (e.g. C/C++, Python, Java, etc.). Use of the ‘virtio serial device’ provides a simple, direct communication channel between host and guest which is independent of the Guest’s L2/L3 networking.

The upper layer JSON messaging format is actually structured as a hierarchical JSON format containing a Base JSON Message Layer and an Application JSON Message Layer:

  • the Base Layer provides the ability to multiplex different groups of message types on top of a single ‘virtio serial device’, e.g.

    • heartbeating and healthchecks,
    • server group messaging,

    and

  • the Application Layer provides the specific message types and fields of a particular group of message types.

4   VM Heartbeating and Health Checking

Normally OpenStack monitoring of the health of a Guest VM is limited to a black-box approach of simply monitoring the presence of the QEMU/KVM PID containing the VM.

VM Heartbeating and Health Checking provides a heartbeat service to monitor the health of guest application(s) within a VM running under the OpenStack Cloud. Loss of heartbeat or a failed health check status will result in a fault event being reported to OPNFV’s DOCTOR infrastructure for alarm identification, impact analysis and reporting. This would then enable VNF Managers (VNFMs) listening to OPNFV’s DOCTOR External Alarm Reporting through Telemetry’s AODH, to initiate any required fault recovery actions.

submodules/availability/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Guest_Heartbeat-FIGURE-1.png

The VM Heartbeating and Health Checking functionality is enabled on a VM through a new flavor extraspec indicating that the VM supports and wants to enable Guest Heartbeating. An extension to Nova Compute uses this extraspec to setup the required ‘virtio serial device’ for Host-to-Guest messaging, on the QEMU/KVM instance created for the VM.

A daemon within the Guest VM will register with the OpenStack Guest Heartbeat Service on the compute node to initiate the heartbeating on itself (i.e. the Guest VM). The OpenStack Compute Node will start heartbeating the Guest VM, and if the heartbeat fails, the OpenStack Compute Node will report the VM Fault through DOCTOR; the VNFM will ultimately see this through Nova VM State Change Notifications via AODH. I.e. the VNFM would see VM Heartbeat Failure events in the same way it sees all other VM Faults, through DOCTOR-initiated VM state changes.

Part of the Guest VM’s registration process is the specification of the heartbeat interval in msecs. I.e. the registering Guest VM specifies the heartbeating interval.

Guest heartbeat works on a challenge response model. The OpenStack Guest Heartbeat Service on the compute node will challenge the registered Guest VM daemon with a message each interval. The registered Guest VM daemon must respond prior to the next interval with a message indicating good health. If the OpenStack Host does not receive a valid response, or if the response specifies that the VM is in ill health, then a fault event for the Guest VM is reported to the OpenStack Guest Heartbeat Service on the controller node which will report the event to OPNFV’s DOCTOR (i.e. thru the OpenStack Vitrage data source APIs).

The registered Guest VM daemon’s response to the challenge can be as simple as just immediately responding with OK. This alone allows for detection of a failed or hung QEMU/KVM instance, or a failure of the OS within the VM to schedule the registered Guest VM’s daemon or failure to route basic IO within the Guest VM.

However the registered Guest VM daemon’s response to the challenge can be more complex, running anything from a quick simple sanity check of the health of applications running in the Guest VM, to a more thorough audit of the application state and data. In either case returning the status of the health check enables the OpenStack host to detect and report the event in order to initiate recovery from application level errors or failures within the Guest VM.
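
To make the challenge/response flow concrete, below is a minimal guest-side sketch. It is illustrative only: the virtio serial port name and the JSON field names are assumptions for this example, not the actual OPNFV HA Guest API specification, and my_app_healthcheck stands in for whatever application-level check the Guest chooses to run.

#!/bin/bash
# Hypothetical guest-side responder: port name and message fields are assumed.
DEV=/dev/virtio-ports/guest.heartbeat.0
exec 3<> "$DEV"                       # open the host-to-guest channel read/write
while read -r challenge <&3; do       # one challenge arrives per heartbeat interval
    if my_app_healthcheck; then       # placeholder for the Guest's own health audit
        status="healthy"
    else
        status="unhealthy"
    fi
    # Respond before the next interval; otherwise the host reports a fault.
    printf '{"type":"heartbeat-response","status":"%s"}\n' "$status" >&3
done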

In summary, the deliverables of this activity would be:

  • Host Deliverables: (OpenStack and OPNFV blueprints and implementation)
  • an OpenStack Nova or libvirt extension to interpret the new flavor extraspec and if present setup the required ‘virtio serial device’ for Host-to-Guest heartbeat / health-check messaging, on the QEMU/KVM instance created for the VM,
  • an OPNFV Base Host-to-Guest Msging Layer Agent for multiplexing of Application Layer messaging over the ‘virtio serial device’ to the VM,
  • an OPNFV Heartbeat / Health-Check Compute Agent for local heartbeating of VM and reporting of failures to the OpenStack Controller,
  • an OPNFV Heartbeat / Health-check Server on the OpenStack Controller for receiving VM failure notifications and reporting these to Vitrage thru Vitrage’s Data Source API,
  • Guest Deliverables:
  • a Heartbeat / Health-Check Message Specification covering

    • Heartbeat / Health-Check Application Layer JSON Protocol,
    • Base Host-to-Guest JSON Protocol,
    • Details on the use of the underlying ‘virtio serial device’,
  • a Reference Implementation of the Guest-side support of Heartbeat / Health-check containing the peer protocol layers within the Guest.

    • will provide code and compile instructions,
    • Guest will compile based on its specific OS.

This proposal requires review with OPNFV’s Doctor and Management & Orchestration teams, and OpenStack’s Nova Team.

5   VM Peer State Notification and Messaging

Server Group State Notification and Messaging is a service to provide simple low-bandwidth datagram messaging and notifications for servers that are part of the same server group. This messaging channel is available regardless of whether IP networking is functional within the server, and it requires no knowledge within the server about the other members of the group.

NOTE: A Server Group here is the OpenStack Nova Server Group concept where VMs are grouped together for purposes of scheduling. E.g. A specific Server Group instance can specify whether the VMs within the group should be scheduled to run on the same compute host or different compute hosts. A ‘peer’ VM in the context of this section refers to a VM within the same Nova Server Group.

This Server Group Messaging service provides three types of messaging:

  • Broadcast: this allows a server to send a datagram (size of up to 3050 bytes) to all other servers within the server group.
  • Notification: this provides servers with information about changes to the (Nova) state of other servers within the server group.
  • Status: this allows a server to query the current (Nova) state of all servers within the server group (including itself).

A Server Group Messaging entity on both the controller node and the compute nodes manages the routing of VM-to-VM messages through the platform, leveraging Nova to determine Server Group membership and the compute node locations of VMs. The Server Group Messaging entity on the controller also listens to Nova VM state change notifications and queries VM state data from Nova, in order to provide the VM query and notification functionality of this service.

submodules/availability/docs/development/overview/OPNFV_HA_Guest_APIs-Overview_HLD-Peer_Messaging-FIGURE-2.png

This service is not intended for high bandwidth or low-latency operations. It is best-effort, not reliable. Applications should do end-to-end acks and retries if they care about reliability.

This service provides building-block type capabilities for the Guest VMs that contribute to higher availability of the VMs in the Guest VM Server Group. Notifications of VM status changes potentially provide a faster and more accurate indication of failed peer VMs than traditional peer VM monitoring over Tenant Networks, while the Broadcast Messaging mechanism provides an out-of-band channel to monitor and control a peer VM under fault conditions, e.g. providing the ability to avoid potential split brain scenarios between 1:1 VMs when faults in Tenant Networking occur.

In summary, the deliverables for Server Group Messaging would be:

  • Host Deliverables:
  • a Nova or libvirt extension to interpret the new flavor extraspec and if present setup the required ‘virtio serial device’ for Host-to-Guest Server Group Messaging, on the QEMU/KVM instance created for the VM,
  • [ leveraging the Base Host-to-Guest Msging Layer Agent from previous section ],
  • a Server Group Messaging Compute Agent for implementing the Application Layer Server Group Messaging JSON Protocol with the VM, and forwarding the messages to/from the Server Group Messaging Server on the Controller,
  • a Server Group Messaging Server on the Controller for routing broadcast messages to the proper Computes and VMs, as well as listening for Nova VM State Change Notifications and forwarding these to applicable Computes and VMs,
  • Guest Deliverables:
  • a Server Group Messaging Message Specification covering

    • Server Group Messaging Application Layer JSON Protocol,
    • [ leveraging Base Host-to-Guest JSON Protocol from previous section ],
    • [ leveraging Details on the use of the underlying ‘virtio serial device’ from previous section ],
  • a Reference Implementation of the Guest-side support of Server Group Messaging containing the peer protocol layers and Guest Application hooks within the Guest.

This proposal requires review with OPNFV’s Management & Orchestration team and OpenStack’s Nova Team.

6   Conclusion

The PROPOSAL of Reach-thru Guest Monitoring and Services described in this document leverages Host-to-Guest messaging to provide a number of extended capabilities that improve the Availability of the hosted VMs. These new capabilities enable detection of and recovery from internal VM faults, and provide a simple out-of-band messaging service to prevent scenarios such as split brain.

The integration of these proposed new capabilities into the larger OpenStack and OPNFV Architectures needs to be reviewed with the other related teams in OPNFV (Doctor and Management & Orchestration (MANO)) and OpenStack (Nova).


Barometer

OPNFV Barometer Requirements
Problem Statement

Providing carrier grade Service Assurance is critical in the network transformation to a software defined and virtualized network (NFV). Medium- and large-scale cloud environments consist of anywhere from hundreds to hundreds of thousands of infrastructure systems. It is vital to monitor systems for malfunctions that could lead to disruption of users’ application services and promptly react to these fault events to facilitate improving overall system performance. As the size of infrastructure and virtual resources grow, so does the effort of monitoring back-ends. SFQM aims to expose as much useful information as possible off the platform so that faults and errors in the NFVI can be detected promptly and reported to the appropriate fault management entity.

The OPNFV platform (NFVI) requires functionality to:

  • Create a low latency, high performance packet processing path (fast path) through the NFVI that VNFs can take advantage of;
  • Measure Telco Traffic and Performance KPIs through that fast path;
  • Detect and report violations that can be consumed by VNFs and higher level EMS/OSS systems

Examples of local measurable QoS factors for Traffic Monitoring which impact both Quality of Experience and five 9’s availability would be (using Metro Ethernet Forum Guidelines as reference):

  • Packet loss
  • Packet Delay Variation
  • Uni-directional frame delay

Other KPIs such as Call drops, Call Setup Success Rate, Call Setup time etc. are measured by the VNF.

In addition to Traffic Monitoring, the NFVI must also support Performance Monitoring of the physical interfaces themselves (e.g. NICs), i.e. an ability to monitor and trace errors on the physical interfaces and report them.

All these traffic statistics for Traffic and Performance Monitoring must be measured in-service and must be capable of being reported by standard Telco mechanisms (e.g. SNMP traps), for potential enforcement actions.

Barometer updated scope

The scope of the project is to provide interfaces to support monitoring of the NFVI. The project will develop plugins for telemetry frameworks to enable the collection of platform stats and events and relay gathered information to fault management applications or the VIM. The scope is limited to collecting/gathering the events and stats and relaying them to a relevant endpoint. The project will not enforce or take any actions based on the gathered information.

Scope of SFQM

NOTE: The SFQM project has been replaced by Barometer. The output of the project will provide interfaces and functions to support monitoring of Packet Latency and Network Interfaces while the VNF is in service.

The DPDK interface/API will be updated to support:

  • Exposure of NIC MAC/PHY Level Counters
  • Interface for Time stamp on RX
  • Interface for Time stamp on TX
  • Exposure of DPDK events

collectd will be updated to support the exposure of DPDK metrics and events.

Specific testing and integration will be carried out to cover:

  • Unit/Integration Test plans: A sample application provided to demonstrate packet latency monitoring and interface monitoring

The following list of features and functionality will be developed:

  • DPDK APIs and functions for latency and interface monitoring
  • A sample application to demonstrate usage
  • collectd plugins

The scope of the project involves developing the relevant DPDK APIs, OVS APIs, sample applications, as well as the utilities in collectd to export all the relevant information to a telemetry and events consumer.

VNF specific processing, Traffic Monitoring, Performance Monitoring and Management Agent are out of scope.

The Proposed Interface counters include:

  • Packet RX
  • Packet TX
  • Packet loss
  • Interface errors + other stats

The proposed Packet Latency Monitor includes:

  • Cycle accurate stamping on ingress
  • Supports latency measurements on egress

Support for failover of DPDK enabled cores is also out of scope of the current proposal. However, this is an important requirement and must-have functionality for any DPDK enabled framework in the NFVI. To that end, a second phase of this project will be to implement DPDK Keep Alive functionality that would address this and would report to a VNF-level Failover and High Availability mechanism that would then determine what actions, including failover, may be triggered.

Consumption Models

In reality many VNFs will have an existing performance or traffic monitoring utility used to monitor VNF behavior and report statistics, counters, etc.

The consumption of performance and traffic related information/events provided by this project should be a logical extension of any existing VNF/NFVI monitoring framework. It should not require a new framework to be developed. We do not see the Barometer-gathered metrics and events as a major additional effort for monitoring frameworks to consume; this project would be sympathetic to existing monitoring frameworks. The intention is that this project represents an interface for NFVI monitoring to be used by higher level fault management entities (see below).

Allowing the Barometer metrics and events to be handled within existing telemetry frameworks makes it simpler for overall interfacing with higher level management components in the VIM, MANO and OSS/BSS. The Barometer proposal would be complementary to the Doctor project, which addresses NFVI Fault Management support in the VIM, and the VES project, which addresses the integration of VNF telemetry-related data into automated VNF management systems. To that end, the project committers and contributors for the Barometer project wish to collaborate with the Doctor and VES projects to facilitate this.

collectd

collectd is a daemon which collects system performance statistics periodically and provides a variety of mechanisms to publish the collected metrics. It supports more than 90 different input and output plugins. Input plugins retrieve metrics and publish them to the collectd daemon, while output plugins publish the data they receive to an end point. collectd also has infrastructure to support thresholding and notification.

collectd statistics and Notifications

Within collectd notifications and performance data are dispatched in the same way. There are producer plugins (plugins that create notifications/metrics), and consumer plugins (plugins that receive notifications/metrics and do something with them).

Statistics in collectd consist of a value list. A value list includes:

  • Values, can be one of:
    • Derive: used for values where a change in the value since it was last read is of interest. Can be used to calculate and store a rate.
    • Counter: similar to derive values, but take the possibility of a counter wrap around into consideration.
    • Gauge: used for values that are stored as is.
    • Absolute: used for counters that are reset after reading.
  • Value length: the number of values in the data set.
  • Time: timestamp at which the value was collected.
  • Interval: interval at which to expect a new value.
  • Host: used to identify the host.
  • Plugin: used to identify the plugin.
  • Plugin instance (optional): used to group a set of values together, e.g. values belonging to a DPDK interface.
  • Type: unit used to measure a value. In other words used to refer to a data set.
  • Type instance (optional): used to distinguish between values that have an identical type.
  • meta data: an opaque data structure that enables the passing of additional information about a value list. “Meta data in the global cache can be used to store arbitrary information about an identifier” [7].

Host, plugin, plugin instance, type and type instance uniquely identify a collectd value.

Values lists are often accompanied by data sets that describe the values in more detail. Data sets consist of:

  • A type: a name which uniquely identifies a data set.
  • One or more data sources (entries in a data set) which include:
    • The name of the data source. If there is only a single data source this is set to “value”.
    • The type of the data source, one of: counter, gauge, absolute or derive.
    • A min and a max value.

Types in collectd are defined in types.db. Examples of types in types.db:

bitrate    value:GAUGE:0:4294967295
counter    value:COUNTER:U:U
if_octets  rx:COUNTER:0:4294967295, tx:COUNTER:0:4294967295

In the example above if_octets has two data sources: tx and rx.
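
As an illustration of how host, plugin, plugin instance, type and type instance combine into an identifier, the line below uses the PUTVAL format understood by collectd’s exec and unixsock plugins (the kind of line an exec-plugin script would print); “node-1” and “eth0” are just placeholder names, and the two values map to the rx and tx data sources of if_octets:

$ echo 'PUTVAL "node-1/interface-eth0/if_octets" interval=10 N:123456:654321'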

Notifications in collectd are generic messages containing:

  • An associated severity, which can be one of OKAY, WARNING, and FAILURE.
  • A time.
  • A message.
  • A host.
  • A plugin.
  • A plugin instance (optional).
  • A type.
  • A type instance (optional).
  • Meta-data.
DPDK Enhancements

This section will discuss the Barometer features that were integrated with DPDK.

Measuring Telco Traffic and Performance KPIs

This section will discuss the Barometer features that enable Measuring Telco Traffic and Performance KPIs.

_images/stats_and_timestamps.png

Measuring Telco Traffic and Performance KPIs

  • The very first thing Barometer enabled was a call-back API in DPDK and an associated application that used the API to demonstrate how to timestamp packets and measure packet latency in DPDK (the sample app is called rxtx_callbacks). This was upstreamed to DPDK 2.0 and is represented by the interfaces 1 and 2 in Figure 1.2.
  • The second thing Barometer implemented in DPDK is the extended NIC statistics API, which exposes NIC stats including error stats to the DPDK user by reading the registers on the NIC. This is represented by interface 3 in Figure 1.2.
    • For DPDK 2.1 this API was only implemented for the ixgbe (10Gb) NIC driver, in association with a sample application that runs as a DPDK secondary process and retrieves the extended NIC stats.
    • For DPDK 2.2 the API was implemented for igb, i40e and all the Virtual Functions (VFs) for all drivers.
    • For DPDK 16.07 the API migrated from using string value pairs to using id value pairs, improving the overall performance of the API.
Monitoring DPDK interfaces

With the features Barometer enabled in DPDK to enable measuring Telco traffic and performance KPIs, we can now retrieve NIC statistics including error stats and relay them to a DPDK user. The next step is to enable monitoring of the DPDK interfaces based on the stats that we are retrieving from the NICs, by relaying the information to a higher level Fault Management entity. To enable this Barometer has been enabling a number of plugins for collectd.

DPDK Keep Alive description

SFQM aims to enable fault detection within DPDK, the very first feature to meet this goal is the DPDK Keep Alive Sample app that is part of DPDK 2.2.

DPDK Keep Alive or KA is a sample application that acts as a heartbeat/watchdog for DPDK packet processing cores, to detect application thread failure. The application supports the detection of ‘failed’ DPDK cores and notification to a HA/SA middleware. The purpose is to detect Packet Processing Core fails (e.g. infinite loop) and ensure the failure of the core does not result in a fault that is not detectable by a management entity.

_images/dpdk_ka.png

DPDK Keep Alive Sample Application

Essentially the app demonstrates how to detect ‘silent outages’ on DPDK packet processing cores. The application can be decomposed into two specific parts: detection and notification.

  • The detection period is programmable/configurable but defaults to 5ms if no timeout is specified.
  • Notification support is enabled by a hook function that can serve as call-back support for a fault management application with a compliant heartbeat mechanism.
DPDK Keep Alive Sample App Internals

This section provides some explanation of the Keep-Alive/’Liveliness’ conceptual scheme as well as the DPDK Keep Alive App. The initialization and run-time paths are very similar to those of the L2 forwarding application (see L2 Forwarding Sample Application (in Real and Virtualized Environments) for more information).

There are two types of cores: a Keep Alive Monitor Agent Core (master DPDK core) and Worker cores (Tx/Rx/Forwarding cores). The Keep Alive Monitor Agent Core will supervise worker cores and report any failure (2 successive missed pings). The Keep-Alive/’Liveliness’ conceptual scheme is:

  • DPDK worker cores mark their liveliness as they forward traffic.
  • A Keep Alive Monitor Agent Core runs a function every N Milliseconds to inspect worker core liveliness.
  • If keep-alive agent detects time-outs, it notifies the fault management entity through a call-back function.

Note: Only the worker cores state is monitored. There is no mechanism or agent to monitor the Keep Alive Monitor Agent Core.

DPDK Keep Alive Sample App Code Internals

The following section provides some explanation of the code aspects that are specific to the Keep Alive sample application.

The heartbeat functionality is initialized with a struct rte_heartbeat and the callback function to invoke in the case of a timeout.

rte_global_keepalive_info = rte_keepalive_create(&dead_core, NULL);
if (rte_global_keepalive_info == NULL)
    rte_exit(EXIT_FAILURE, "keepalive_create() failed");

The function that issues the pings hbeat_dispatch_pings() is configured to run every check_period milliseconds.

if (rte_timer_reset(&hb_timer,
        (check_period * rte_get_timer_hz()) / 1000,
        PERIODICAL,
        rte_lcore_id(),
        &hbeat_dispatch_pings, rte_global_keepalive_info
        ) != 0 )
    rte_exit(EXIT_FAILURE, "Keepalive setup failure.\n");

The rest of the initialization and run-time path follows the same paths as the L2 forwarding application. The only addition to the main processing loop is the mark-alive functionality and the example random failures.

rte_keepalive_mark_alive(&rte_global_hbeat_info);
cur_tsc = rte_rdtsc();

/* Die randomly within 7 secs for demo purposes.. */
if (cur_tsc - tsc_initial > tsc_lifetime)
break;

The rte_keepalive_mark_alive() function simply sets the core state to alive.

static inline void
rte_keepalive_mark_alive(struct rte_heartbeat *keepcfg)
{
    keepcfg->state_flags[rte_lcore_id()] = 1;
}

Keep Alive Monitor Agent Core Monitoring Options

The application can run on either a host or a guest. As such there are a number of options for monitoring the Keep Alive Monitor Agent Core through a Local Agent on the compute node:

Application Location    DPDK KA    LOCAL AGENT
HOST                    X          HOST/GUEST
GUEST                   X          HOST/GUEST

For the first implementation of a Local Agent SFQM will enable:

Application Location    DPDK KA    LOCAL AGENT
HOST                    X          HOST

This will be done by extending the dpdkstat plugin for collectd with KA functionality, and integrating the extended plugin with Monasca for high-performing, resilient, and scalable fault detection.

OPNFV Barometer configuration Guide
Barometer Configuration

This document provides guidelines on how to install and configure the Barometer plugin when using Fuel as a deployment tool. The plugin name is: Collectd Ceilometer Plugin. This plugin installs collectd on a compute node and enables a number of collectd plugins to collect metrics and events from the platform and send them to ceilometer.

Pre-configuration activities

The Barometer Fuel plugin can be found in /opt/opnfv on the fuel master. To enable this plugin:

$ cd /opt/opnfv
$ fuel plugins --install fuel-plugin-collectd-ceilometer-1.0-1.0.0-1.noarch.rpm

On the Fuel UI, create a new environment.

  • In Settings > OpenStack Services
    • Enable “Install Ceilometer and Aodh”
  • In Settings > Other
    • Enable “Deploy Collectd Ceilometer Plugin”
    • Enable the barometer plugins you’d like to deploy using the checkboxes
  • Continue with environment configuration and deployment as normal.

Hardware configuration

There is no specific hardware configuration required for the Barometer Fuel plugin.

Feature configuration

Describe the procedures to configure your feature on the platform in order that it is ready to use according to the feature instructions in the platform user guide. Where applicable you should add content in the postinstall.rst to validate the feature is configured for use. (checking components are installed correctly etc...)

Upgrading the plugin

From time to time new versions of the plugin may become available.

The plugin cannot be upgraded if an active environment is using the plugin.

In order to upgrade the plugin:

  • Copy the updated plugin file to the fuel-master.
  • On the Fuel UI, reset the environment.
  • On the Fuel CLI run “fuel plugins --update <fuel-plugin-file>”
  • On the Fuel UI, re-deploy the environment.
Barometer post installation procedures

Add a brief introduction to the methods of validating the installation according to this specific installer or feature.

Automated post installation activities

Describe specific post installation activities performed by the OPNFV deployment pipeline including testing activities and reports. Refer to the relevant testing guides, results, and release notes.

note: this section should be singular and derived from the test projects once we have one test suite to run for all deploy tools. This is not the case yet so each deploy tool will need to provide (hopefully very similar) documentation of this.

Barometer post configuration procedures

The Fuel plugin installs collectd and its plugins on compute nodes, with separate config files for each of the collectd plugins. These configuration files can be found on the compute node in the /etc/collectd/collectd.conf.d/ directory. Each collectd plugin has its own configuration file with a default configuration. You can override any of the plugin configurations by modifying the configuration file and restarting the collectd service on the compute node.
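
For example, to tune one plugin on a compute node you might edit its file and restart the service (the file name below is only an example, and the service name on the deployed node is assumed to be collectd):

$ vi /etc/collectd/collectd.conf.d/hugepages.conf
$ service collectd restart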

Platform components validation
  1. SSH to a compute node and ensure that the collectd service is running.
  2. On the compute node, you need to inject a corrected memory error:
$ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
$ cd mce-inject
$ make
$ modprobe mce-inject

Modify the test/corrected script to include the following:

CPU 0 BANK 0
STATUS 0xcc00008000010090
ADDR 0x0010FFFFFFF

Inject the error:

$ ./mce-inject < test/corrected
  3. SSH to an OpenStack controller node and query the Ceilometer DB:
$ source openrc
$ ceilometer sample-list -m interface.if_packets
$ ceilometer sample-list -m hugepages.vmpage_number
$ ceilometer sample-list -m ovs_events.gauge
$ ceilometer sample-list -m mcelog.errors

As you run each command above, you should see output listing the corresponding samples.

OPNFV Barometer User Guide
Barometer collectd plugins description

collectd is a daemon which collects system performance statistics periodically and provides a variety of mechanisms to publish the collected metrics. It supports more than 90 different input and output plugins. Input plugins retrieve metrics and publish them to the collectd daemon, while output plugins publish the data they receive to an end point. collectd also has infrastructure to support thresholding and notification.

Barometer has enabled the following collectd plugins:

  • dpdkstat plugin: A read plugin that retrieves stats from the DPDK extended NIC stats API.
  • dpdkevents plugin: A read plugin that retrieves DPDK link status and DPDK forwarding cores liveliness status (DPDK Keep Alive).
  • ceilometer plugin: A write plugin that pushes the retrieved stats to Ceilometer. It’s capable of pushing any stats read through collectd to Ceilometer, not just the DPDK stats.
  • hugepages plugin: A read plugin that retrieves the number of available and free hugepages on a platform as well as what is available in terms of hugepages per socket.
  • Open vSwitch events Plugin: A read plugin that retrieves events from OVS.
  • Open vSwitch stats Plugin: A read plugin that retrieves flow and interface stats from OVS.
  • mcelog plugin: A read plugin that uses the mcelog client protocol to check for memory Machine Check Exceptions and sends the stats for reported exceptions.
  • RDT plugin: A read plugin that provides the last level cache utilization and memory bandwidth utilization.

All the plugins above are available on the collectd master, except for the ceilometer plugin as it’s a python based plugin and only C plugins are accepted by the collectd community. The ceilometer plugin lives in the OpenStack repositories.

Other plugins existing as a pull request into collectd master:

  • SNMP Agent: A write plugin that will act as an AgentX subagent that receives and handles queries from the SNMP master agent and returns the data collected by read plugins. The SNMP Agent plugin handles requests only for OIDs specified in the configuration file. To handle SNMP queries the plugin gets data from collectd and translates requested values from collectd’s internal format to SNMP format. Supports SNMP: get, getnext and walk requests.
  • Legacy/IPMI: A read plugin that reports platform thermals, voltages, fanspeed, current, flow, power etc. Also, the plugin monitors the Intelligent Platform Management Interface (IPMI) System Event Log (SEL) and sends notifications for new SEL events.

Plugins included in the Danube release:

  • Hugepages
  • Open vSwitch Events
  • Ceilometer
  • Mcelog
collectd capabilities and usage

Note

Plugins included in the OPNFV D release will be built-in to the fuel plugin and available in the /opt/opnfv directory on the fuel master. You don’t need to clone the barometer/collectd repos to use these, but you can configure them as shown in the examples below.

The collectd plugins in OPNFV are configured with reasonable defaults, but can be overridden.

Building all Barometer upstreamed plugins from scratch

The plugins that have been merged to the collectd master branch can all be built and configured through the barometer repository.

Note

  • sudo permissions are required to install collectd.
  • These are instructions for Ubuntu 16.04

To build and install these dependencies, clone the barometer repo:

$ git clone https://gerrit.opnfv.org/gerrit/barometer

Install the build dependencies

$ ./src/install_build_deps.sh

To install collectd as a service and install all its dependencies:

$ cd barometer/src && sudo make && sudo make install

This will install collectd as a service and the base install directory will be /opt/collectd.

Sample configuration files can be found in ‘/opt/collectd/etc/collectd.conf.d’

Note

  • If you plan on using the Exec plugin, note that it requires a non-root user to execute scripts. By default, the collectd_exec user is used. Barometer scripts do not create this user; it needs to be added manually, or the Exec plugin configuration has to be changed to use another existing user, before starting the collectd service.
  • If you don’t want to use one of the Barometer plugins, simply remove the sample config file from ‘/opt/collectd/etc/collectd.conf.d’
  • If you are using any Open vSwitch plugins you need to run:
$ sudo ovs-vsctl set-manager ptcp:6640

Below is the per plugin installation and configuration guide, if you only want to install some/particular plugins.

DPDK statistics plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: DPDK (http://dpdk.org/) Min_Version: 16.04

To build and install DPDK to /usr please see: https://github.com/collectd/collectd/blob/master/docs/BUILD.dpdkstat.md

Building and installing collectd:

$ git clone https://github.com/collectd/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

This will install collectd to /opt/collectd The collectd configuration file can be found at /opt/collectd/etc To configure the dpdkstats plugin you need to modify the configuration file to include:

LoadPlugin dpdkstat
<Plugin "dpdkstat">
    <EAL>
        Coremask "0x2"
        MemoryChannels "4"
        ProcessType "secondary"
        FilePrefix "rte"
    </EAL>
    EnabledPortMask 0xffff
    PortName "interface1"
    PortName "interface2"
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Note

If you are not building and installing DPDK system-wide you will need to specify the specific paths to the header files and libraries using LIBDPDK_CPPFLAGS and LIBDPDK_LDFLAGS. You will also need to add the DPDK library symbols to the shared library path using ldconfig. Note that this update to the shared library path is not persistent (i.e. it will not survive a reboot).

DPDK events plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: DPDK (http://dpdk.org/)

To build and install DPDK to /usr please see: https://github.com/collectd/collectd/blob/master/docs/BUILD.dpdkstat.md

Building and installing collectd:

$ git clone https://github.com/maryamtahhan/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

This will install collectd to /opt/collectd The collectd configuration file can be found at /opt/collectd/etc To configure the dpdkevents plugin you need to modify the configuration file to include:

LoadPlugin dpdkevents
<Plugin "dpdkevents">
    Interval 1
    <EAL>
        Coremask "0x1"
        MemoryChannels "4"
        ProcessType "secondary"
        FilePrefix "rte"
    </EAL>
    <Event "link_status">
        SendEventsOnUpdate true
        EnabledPortMask 0xffff
        PortName "interface1"
        PortName "interface2"
        SendNotification false
    </Event>
    <Event "keep_alive">
        SendEventsOnUpdate true
        LCoreMask "0xf"
        KeepAliveShmName "/dpdk_keepalive_shm_name"
        SendNotification false
    </Event>
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Note

If you are not building and installing DPDK system-wide you will need to specify the specific paths to the header files and libraries using LIBDPDK_CPPFLAGS and LIBDPDK_LDFLAGS. You will also need to add the DPDK library symbols to the shared library path using ldconfig. Note that this update to the shared library path is not persistent (i.e. it will not survive a reboot).

$ ./configure LIBDPDK_CPPFLAGS="path to DPDK header files" LIBDPDK_LDFLAGS="path to DPDK libraries"
Hugepages Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: None, but assumes hugepages are configured.

To configure some hugepages:

sudo mkdir -p /mnt/huge
sudo mount -t hugetlbfs nodev /mnt/huge
sudo echo 14336 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
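
To confirm the hugepages were actually reserved before starting collectd, you can check (values will differ per system):

$ grep -i HugePages /proc/meminfo
$ cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages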

Building and installing collectd:

$ git clone https://github.com/collectd/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-hugepages --enable-debug
$ make
$ sudo make install

This will install collectd to /opt/collectd The collectd configuration file can be found at /opt/collectd/etc To configure the hugepages plugin you need to modify the configuration file to include:

LoadPlugin hugepages
<Plugin hugepages>
    ReportPerNodeHP  true
    ReportRootHP     true
    ValuesPages      true
    ValuesBytes      false
    ValuesPercentage false
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

Intel RDT Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies:

Building and installing PQoS/Intel RDT library:

$ git clone https://github.com/01org/intel-cmt-cat.git
$ cd intel-cmt-cat
$ make
$ make install PREFIX=/usr

You will need to insert the msr kernel module:

$ modprobe msr
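
To verify the prerequisites are in place, you can check that the msr module is loaded and that the PQoS library is visible to the dynamic linker:

$ lsmod | grep msr
$ ldconfig -p | grep pqos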

Building and installing collectd:

$ git clone https://github.com/collectd/collectd.git
$ cd collectd
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --with-libpqos=/usr/ --enable-debug
$ make
$ sudo make install

This will install collectd to /opt/collectd The collectd configuration file can be found at /opt/collectd/etc To configure the RDT plugin you need to modify the configuration file to include:

<LoadPlugin intel_rdt>
  Interval 1
</LoadPlugin>
<Plugin "intel_rdt">
  Cores ""
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod

IPMI Plugin

Repo: https://github.com/maryamtahhan/collectd

Branch: feat_ipmi_events, feat_ipmi_analog

Dependencies: OpenIPMI library

The IPMI plugin is already implemented in the latest collectd and sensors like temperature, voltage, fanspeed, current are already supported there. The list of supported IPMI sensors has been extended and sensors like flow, power are supported now. Also, a System Event Log (SEL) notification feature has been introduced.

  • The feat_ipmi_events branch includes new SEL feature support in collectd IPMI plugin. If this feature is enabled, the collectd IPMI plugin will dispatch notifications about new events in System Event Log.
  • The feat_ipmi_analog branch includes the support of extended IPMI sensors in collectd IPMI plugin.

On Ubuntu, install the dependencies:

$ sudo apt-get install libopenipmi-dev

Enable IPMI support in the kernel:

$ sudo modprobe ipmi_devintf
$ sudo modprobe ipmi_si

Note: If HW supports IPMI, the /dev/ipmi0 character device will be created.

Clone and install the collectd IPMI plugin:

$ git clone  https://github.com/maryamtahhan/collectd
$ cd collectd
$ git checkout $BRANCH
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

Where $BRANCH is feat_ipmi_events or feat_ipmi_analog.

This will install collectd to default folder /opt/collectd. The collectd configuration file (collectd.conf) can be found at /opt/collectd/etc. To configure the IPMI plugin you need to modify the file to include:

LoadPlugin ipmi
<Plugin ipmi>
   SELEnabled true # only feat_ipmi_events branch supports this
</Plugin>

Note: By default, IPMI plugin will read all available analog sensor values, dispatch the values to collectd and send SEL notifications.

For more information on the IPMI plugin parameters and SEL feature configuration, please see: https://github.com/maryamtahhan/collectd/blob/feat_ipmi_events/src/collectd.conf.pod

Extended analog sensors support doesn’t require additional configuration. The usual collectd IPMI documentation can be used:

IPMI documentation:

Mcelog Plugin

Repo: https://github.com/collectd/collectd

Branch: master

Dependencies: mcelog

Start by installing mcelog. Note: The kernel has to have CONFIG_X86_MCE enabled. For 32-bit kernels you need at least a 2.6.30 kernel.

On ubuntu:

$ apt-get update && apt-get install mcelog

Or build from source

$ git clone git://git.kernel.org/pub/scm/utils/cpu/mce/mcelog.git
$ cd mcelog
$ make
... become root ...
$ make install
$ cp mcelog.service /etc/systemd/system/
$ systemctl enable mcelog.service
$ systemctl start mcelog.service

Verify that /dev/mcelog exists. You can verify the daemon is running correctly by running:

$ mcelog --client

This should query the information in the running daemon. If it prints nothing that is fine (no errors logged yet). More info @ http://www.mcelog.org/installation.html

Modify the mcelog configuration file “/etc/mcelog/mcelog.conf” to include or enable:

socket-path = /var/run/mcelog-client
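
After changing the configuration, restart the mcelog daemon so that the client socket is created, and check that the socket exists (the systemd unit name assumes the service file installed above):

$ systemctl restart mcelog.service
$ ls -l /var/run/mcelog-client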

Clone and install the collectd mcelog plugin:

$ git clone  https://github.com/maryamtahhan/collectd
$ cd collectd
$ git checkout feat_ras
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

This will install collectd to /opt/collectd The collectd configuration file can be found at /opt/collectd/etc To configure the mcelog plugin you need to modify the configuration file to include:

<LoadPlugin mcelog>
  Interval 1
</LoadPlugin>
<Plugin "mcelog">
   McelogClientSocket "/var/run/mcelog-client"
</Plugin>

For more information on the plugin parameters, please see: https://github.com/maryamtahhan/collectd/blob/feat_ras/src/collectd.conf.pod

Simulating a Machine Check Exception can be done in one of 3 ways:

  • Running $make test in the mcelog cloned directory - mcelog test suite
  • using mce-inject
  • using mce-test

mcelog test suite:

It is always a good idea to test an error handling mechanism before it is really needed. mcelog includes a test suite. The test suite relies on mce-inject which needs to be installed and in $PATH.

You also need the mce-inject kernel module configured (with CONFIG_X86_MCE_INJECT=y), compiled, installed and loaded:

$ modprobe mce-inject

Then you can run the mcelog test suite with

$ make test

This will inject different classes of errors and check that the mcelog triggers run. There will be some kernel messages about page offlining attempts. The test will also lose a few pages of memory in your system (not significant). Note this test will kill any running mcelog, which needs to be restarted manually afterwards.

mce-inject:

A utility to inject corrected, uncorrected and fatal machine check exceptions

$ git clone https://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
$ cd mce-inject
$ make
$ modprobe mce-inject

Modify the test/corrected script to include the following:

CPU 0 BANK 0
STATUS 0xcc00008000010090
ADDR 0x0010FFFFFFF

Inject the error:

$ ./mce-inject < test/corrected

Note: the uncorrected and fatal scripts under test will cause a platform reset. Only the fatal script generates the memory errors. In order to quickly emulate uncorrected memory errors and avoid a host reboot, the following test error from the mce-test suite can be injected:

$ mce-inject  mce-test/cases/coverage/soft-inj/recoverable_ucr/data/srao_mem_scrub

mce-test:

In addition, a more in-depth test of the Linux kernel machine check facilities can be done with the mce-test test suite. mce-test supports testing uncorrected error handling, real error injection, handling of different soft offlining cases, and other tests.

Corrected memory error injection:

To inject corrected memory errors:

  • Remove the sb_edac and edac_core kernel modules: rmmod sb_edac; rmmod edac_core
  • Insert einj module: modprobe einj param_extension=1
  • Inject an error by specifying details (last command should be repeated at least two times):
$ APEI_IF=/sys/kernel/debug/apei/einj
$ echo 0x8 > $APEI_IF/error_type
$ echo 0x01f5591000 > $APEI_IF/param1
$ echo 0xfffffffffffff000 > $APEI_IF/param2
$ echo 1 > $APEI_IF/notrigger
$ echo 1 > $APEI_IF/error_inject
  • Check the MCE statistics: mcelog --client. Check the mcelog log for injected error details: less /var/log/mcelog.
Open vSwitch Plugins

OvS Events Repo: https://github.com/collectd/collectd

OvS Stats Repo: https://github.com/maryamtahhan/collectd

OvS Events Branch: master

OvS Stats Branch: feat_ovs_stats

OvS Events MIBs: The SNMP OVS interface link status is provided by standard IF-MIB (http://www.net-snmp.org/docs/mibs/IF-MIB.txt)

Dependencies: Open vSwitch, Yet Another JSON Library (https://github.com/lloyd/yajl)

On Ubuntu, install the dependencies:

$ sudo apt-get install libyajl-dev openvswitch-switch

Start the Open vSwitch service:

$ sudo service openvswitch-switch start

Configure the ovsdb-server manager:

$ sudo ovs-vsctl set-manager ptcp:6640

Clone and install the collectd ovs plugin:

$ git clone $REPO
$ cd collectd
$ git checkout $BRANCH
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug
$ make
$ sudo make install

where $REPO is one of the repos listed at the top of this section.

Where $BRANCH is master or feat_ovs_stats.

This will install collectd to /opt/collectd. The collectd configuration file can be found at /opt/collectd/etc. To configure the OVS events plugin you need to modify the configuration file to include:

<LoadPlugin ovs_events>
   Interval 1
</LoadPlugin>
<Plugin "ovs_events">
   Port 6640
   Socket "/var/run/openvswitch/db.sock"
   Interfaces "br0" "veth0"
   SendNotification false
   DispatchValues true
</Plugin>

To configure the OVS stats plugin you need to modify the configuration file to include:

<LoadPlugin ovs_stats>
   Interval 1
</LoadPlugin>
<Plugin ovs_stats>
   Port "6640"
   Address "127.0.0.1"
   Socket "/var/run/openvswitch/db.sock"
   Bridges "br0" "br_ext"
</Plugin>

For more information on the plugin parameters, please see: https://github.com/collectd/collectd/blob/master/src/collectd.conf.pod and https://github.com/maryamtahhan/collectd/blob/feat_ovs_stats/src/collectd.conf.pod

SNMP Agent Plugin

Repo: https://github.com/maryamtahhan/collectd/

Branch: feat_snmp

Dependencies: NET-SNMP library

Start by installing net-snmp and dependencies.

On ubuntu:

$ apt-get install snmp snmp-mibs-downloader snmpd libsnmp-dev
$ systemctl start snmpd.service

Or build from source

Become root to install net-snmp dependencies

$ apt-get install libperl-dev

Clone and build net-snmp

$ git clone https://github.com/haad/net-snmp.git
$ cd net-snmp
$ ./configure --with-persistent-directory="/var/net-snmp" --with-systemd --enable-shared --prefix=/usr
$ make

Become root

$ make install

Copy default configuration to persistent folder

$ cp EXAMPLE.conf /usr/share/snmp/snmpd.conf

Set library path and default MIB configuration

$ cd ~/
$ echo export LD_LIBRARY_PATH=/usr/lib >> .bashrc
$ net-snmp-config --default-mibdirs
$ net-snmp-config --snmpconfpath

Configure snmpd as a service

$ cd net-snmp
$ cp ./dist/snmpd.service /etc/systemd/system/
$ systemctl enable snmpd.service
$ systemctl start snmpd.service

Add the following line to snmpd.conf configuration file “/usr/share/snmp/snmpd.conf” to make all OID tree visible for SNMP clients:

view   systemonly  included   .1

To verify that SNMP is working you can get IF-MIB table using SNMP client to view the list of Linux interfaces:

$ snmpwalk -v 2c -c public localhost IF-MIB::interfaces

Clone and install the collectd snmp_agent plugin:

$ git clone  https://github.com/maryamtahhan/collectd
$ cd collectd
$ git checkout feat_snmp
$ ./build.sh
$ ./configure --enable-syslog --enable-logfile --enable-debug --enable-snmp --with-libnetsnmp
$ make
$ sudo make install

This will install collectd to /opt/collectd. The collectd configuration file can be found at /opt/collectd/etc. The SNMP Agent plugin is a generic plugin and cannot work without configuration. To configure the snmp_agent plugin you need to modify the configuration file to include OIDs mapped to collectd types. The following example maps the scalar memAvailReal OID to the value represented as the free memory type of the memory plugin:

LoadPlugin snmp_agent
<Plugin "snmp_agent">
  <Data "memAvailReal">
    Plugin "memory"
    Type "memory"
    TypeInstance "free"
    OIDs "1.3.6.1.4.1.2021.4.6.0"
  </Data>
</Plugin>
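
With this mapping in place, the exported value can be queried directly by OID using an SNMP client (a sketch; it assumes snmpd and collectd with the snmp_agent plugin are both running locally and the community string configured earlier):

$ snmpget -v 2c -c public localhost 1.3.6.1.4.1.2021.4.6.0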

For more information on the plugin parameters, please see: https://github.com/maryamtahhan/collectd/blob/feat_snmp/src/collectd.conf.pod

For more details on AgentX subagent, please see: http://www.net-snmp.org/tutorial/tutorial-5/toolkit/demon/

Installing collectd as a service

NOTE: In an OPNFV installation, collectd is installed and configured as a service.

Collectd service scripts are available in the collectd/contrib directory. To install collectd as a service:

$ sudo cp contrib/systemd.collectd.service /etc/systemd/system/
$ cd /etc/systemd/system/
$ sudo mv systemd.collectd.service collectd.service
$ sudo chmod +x collectd.service

Modify collectd.service

[Service]
ExecStart=/opt/collectd/sbin/collectd
EnvironmentFile=-/opt/collectd/etc/
EnvironmentFile=-/opt/collectd/etc/
CapabilityBoundingSet=CAP_SETUID CAP_SETGID

Reload

$ sudo systemctl daemon-reload
$ sudo systemctl start collectd.service
$ sudo systemctl status collectd.service

The status output should show success.

Additional useful plugins
  • Exec Plugin: Can be used to show you when notifications are being generated by calling a bash script that dumps notifications to a file (handy for debugging). Modify /opt/collectd/etc/collectd.conf:
LoadPlugin exec
<Plugin exec>
#   Exec "user:group" "/path/to/exec"
   NotificationExec "user" "<path to barometer>/barometer/src/collectd/collectd_sample_configs/write_notification.sh"
</Plugin>

write_notification.sh (just writes the notification passed from exec through STDIN to a file (/tmp/notifications)):

#!/bin/bash
rm -f /tmp/notifications
while read x y
do
  echo $x$y >> /tmp/notifications
done

output to /tmp/notifications should look like:

Severity:WARNING
Time:1479991318.806
Host:localhost
Plugin:ovs_events
PluginInstance:br-ex
Type:gauge
TypeInstance:link_status
uuid:f2aafeec-fa98-4e76-aec5-18ae9fc74589

linkstate of "br-ex" interface has been changed to "DOWN"
  • logfile plugin: Can be used to log collectd activity. Modify /opt/collectd/etc/collectd.conf to include:
LoadPlugin logfile
<Plugin logfile>
    LogLevel info
    File "/var/log/collectd.log"
    Timestamp true
    PrintSeverity false
</Plugin>
Monitoring Interfaces and Openstack Support
Figure: Monitoring Interfaces and OpenStack Support

The figure above shows the DPDK L2 forwarding application running on a compute node, sending and receiving traffic. collectd is also running on this compute node retrieving the stats periodically from DPDK through the dpdkstat plugin and publishing the retrieved stats to Ceilometer through the ceilometer plugin.

To see this demo in action please check out: Barometer OPNFV Summit demo

collectd VES plugin User Guide

The Barometer repository contains a python based write plugin for VES.

The plugin currently supports pushing platform relevant metrics through the additional measurements field for VES.

Please note: Hardcoded configuration values will be modified so that they are configurable through the configuration file.

Installation Instructions:
  1. Clone this repo
  2. Install collectd
$ sudo apt-get install collectd
  3. Modify the collectd configuration script: /etc/collectd/collectd.conf
<LoadPlugin python>
  Globals true
</LoadPlugin>

<Plugin python>
  ModulePath "/path/to/your/python/modules"
  LogTraces true
  Interactive false
  Import "ves_plugin"
<Module ves_plugin>
# VES plugin configuration (see next section below)
</Module>
</Plugin>

where “/path/to/your/python/modules” is the path to where you cloned this repo

VES python plugin configuration description:

Note: Details of the Vendor Event Listener REST service

REST resources are defined with respect to a ServerRoot:

ServerRoot = https://{Domain}:{Port}/{optionalRoutingPath}

REST resources are of the form:

{ServerRoot}/eventListener/v{apiVersion}
{ServerRoot}/eventListener/v{apiVersion}/{topicName}
{ServerRoot}/eventListener/v{apiVersion}/eventBatch

Domain "host" – VES domain name. It can be an IP address or hostname of the VES collector (default: 127.0.0.1)

Port port – VES port (default: 30000)

Path "path" – Used as the "optionalRoutingPath" element in the REST path (default: empty)

Topic "path" – Used as the "topicName" element in the REST path (default: empty)

UseHttps true|false – Allow the plugin to use HTTPS instead of HTTP (default: false)

Username "username" – VES collector user name (default: empty)

Password "passwd" – VES collector password (default: empty)

FunctionalRole "role" – Used as the 'functionalRole' field of the 'commonEventHeader' event (default: Collectd VES Agent)

GuestRunning true|false – Should be set to true if collectd is running on a guest machine (default: false)
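
Putting the options together, a minimal <Module ves_plugin> block built only from the options and documented defaults listed above might look like this (a sketch; adjust the values for your VES collector):

<Module ves_plugin>
  Domain "127.0.0.1"
  Port 30000
  Path ""
  Topic ""
  UseHttps false
  Username ""
  Password ""
  FunctionalRole "Collectd VES Agent"
  GuestRunning false
</Module>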

Other collectd.conf configurations

Please ensure that FQDNLookup is set to false

FQDNLookup   false

Please ensure that the virt plugin is enabled and configured as follows. This configuration is required only on the host side ('GuestRunning' = false).

LoadPlugin virt

<Plugin virt>
        Connection "qemu:///system"
        RefreshInterval 60
        HostnameFormat uuid
</Plugin>

Please ensure that the cpu plugin is enabled and configured as follows

LoadPlugin cpu

<Plugin cpu>
    ReportByCpu false
    ValuesPercentage true
</Plugin>

Please ensure that the aggregation plugin is enabled and configured as follows

LoadPlugin aggregation

<Plugin aggregation>
    <Aggregation>
            Plugin "cpu"
            Type "percent"
            GroupBy "Host"
            GroupBy "TypeInstance"
            SetPlugin "cpu-aggregation"
            CalculateAverage true
    </Aggregation>
</Plugin>

If the plugin is running on the guest side, it is important to enable the uuid plugin too. In this case the hostname in the event message will be represented as a UUID instead of the system host name.

LoadPlugin uuid

If custom UUID needs to be provided, the following configuration is required in collectd.conf file:

<Plugin uuid>
    UUIDFile "/etc/uuid"
</Plugin>

Where “/etc/uuid” is a file containing custom UUID.

Please also ensure that the following plugins are enabled:

LoadPlugin disk
LoadPlugin interface
LoadPlugin memory
VES plugin notification example

A good example of a collectd notification is monitoring of CPU load on a host or guest using the 'threshold' plugin. The following configuration will set up the VES plugin to send a 'Fault' event every time the CPU idle value is out of range (e.g.: WARNING: CPU-IDLE < 50%, CRITICAL: CPU-IDLE < 30%) and send a 'Fault' NORMAL event when the CPU idle value is back to normal.

LoadPlugin threshold

<Plugin "threshold">
     <Plugin "cpu-aggregation">
        <Type "percent">
          WarningMin    50.0
          WarningMax   100.0
          FailureMin    30.0
          FailureMax   100.0
          Instance "idle"
          Hits 1
        </Type>
    </Plugin>
</Plugin>

More detailed information on how to configure collectd thresholds (memory, cpu, etc.) can be found at https://collectd.org/documentation/manpages/collectd-threshold.5.shtml

Compass4Nfv

Compass4nfv Installation Instructions
1. Abstract

This document describes how to install the Danube release of OPNFV when using Compass4nfv as a deployment tool, covering its limitations, dependencies and required system resources.

2. Version history
Date Ver. Author Comment
2017-02-21 3.0.0 Justin chi (HUAWEI) Changes for D release
2016-09-13 2.1.0 Yuenan Li (HUAWEI) Adjusted the docs structure
2016-09-12 2.0.0 Yuenan Li (HUAWEI) Rewritten for Compass4nfv C release
2016-01-17 1.0.0 Justin chi (HUAWEI) Rewritten for Compass4nfv B release
2015-12-16 0.0.2 Matthew Li (HUAWEI) Minor changes & formatting
2015-09-12 0.0.1 Chen Shuai (HUAWEI) First draft
3. Features
3.1. Supported Openstack Version and OS
                OS only   OpenStack Liberty   OpenStack Mitaka   OpenStack Newton
CentOS 7        yes       yes                 yes                yes
Ubuntu trusty   yes       yes                 yes                no
Ubuntu xenial   yes       no                  yes                yes
3.2. Supported Openstack Flavor and Features
                         OpenStack Liberty   OpenStack Mitaka   OpenStack Newton
Virtual Deployment       Yes                 Yes                Yes
Baremetal Deployment     Yes                 Yes                Yes
HA                       Yes                 Yes                Yes
Ceph                     Yes                 Yes                Yes
SDN ODL/ONOS             Yes                 Yes                Yes*
Compute Node Expansion   Yes                 Yes                Yes
Multi-Nic Support        Yes                 Yes                Yes
Boot Recovery            Yes                 Yes                Yes
  • ONOS support will be released in Danube 2.0 or 3.0
4. Compass4nfv configuration

This document provides guidelines on how to install and configure the Danube release of OPNFV when using Compass as a deployment tool, including the required software and hardware configurations.

Installation and configuration of host OS, OpenStack, OpenDaylight, ONOS, Ceph etc. can be supported by Compass on Virtual nodes or Bare Metal nodes.

The audience of this document is assumed to have good knowledge in networking and Unix/Linux administration.

4.1. Preconditions

Before starting the installation of the Danube release of OPNFV, some planning must be done.

4.1.1. Retrieving the installation ISO image

First of all, the installation ISO is needed for deploying your OPNFV environment; it includes packages for Compass, OpenStack, OpenDaylight, ONOS and so on.

The stable release ISO can be retrieved via OPNFV software download page

The daily build ISO can be retrieved via OPNFV artifacts repository:

http://artifacts.opnfv.org/compass4nfv.html

NOTE: Search the keyword “compass4nfv/Danube” to locate the ISO image.

E.g. compass4nfv/danube/opnfv-2017-03-29_08-55-09.iso

The name of the ISO image includes the time of the ISO build, so you can pick the daily ISO according to its build time. The git URL and SHA1 of Compass4nfv are recorded in the properties files; from these, the corresponding deployment scripts can be retrieved.
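
For example, a daily ISO can be fetched with wget (a sketch; it assumes the example image name above is still present under the artifacts repository):

$ wget http://artifacts.opnfv.org/compass4nfv/danube/opnfv-2017-03-29_08-55-09.iso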

4.1.2. Getting the deployment scripts

To retrieve the repository of Compass4nfv on Jumphost use the following command:

NOTE: PLEASE DO NOT GIT CLONE COMPASS4NFV IN ROOT DIRECTORY(INCLUDE SUBFOLDERS).

To get the stable Danube release, you can use the following command:

  • git checkout Danube.1.0
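
Putting the two steps together (a sketch; the gerrit URL below is the usual upstream location for Compass4nfv and is an assumption here):

$ git clone https://gerrit.opnfv.org/gerrit/compass4nfv
$ cd compass4nfv
$ git checkout Danube.1.0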
4.2. Setup Requirements

If you have only one bare metal server, a virtual deployment is recommended. If you have three or more servers, a bare metal deployment is recommended. The minimum number of servers for a bare metal deployment is 3: 1 for the JumpServer (Jumphost), 1 for a controller, and 1 for a compute node.

4.2.1. Jumphost Requirements

The Jumphost requirements are outlined below:

  1. Ubuntu 14.04 (Pre-installed).
  2. Root access.
  3. libvirt virtualization support.
  4. Minimum 2 NICs.
    • PXE installation Network (Receiving PXE request from nodes and providing OS provisioning)
    • IPMI Network (Nodes power control and set boot PXE first via IPMI interface)
    • External Network (Optional: Internet access)
  5. 16 GB of RAM for a Bare Metal deployment, 64 GB of RAM for a Virtual deployment.
  6. CPU cores: 32, Memory: 64 GB, Hard Disk: 500 GB (a Virtual Deployment needs a 1 TB Hard Disk)
4.3. Bare Metal Node Requirements

Bare Metal nodes require:

  1. IPMI enabled on OOB interface for power control.
  2. BIOS boot priority should be PXE first then local hard disk.
  3. Minimum 3 NICs.
    • PXE installation Network (Broadcasting PXE request)
    • IPMI Network (Receiving IPMI command from Jumphost)
    • External Network (OpenStack mgmt/external/storage/tenant network)
4.4. Network Requirements

Network requirements include:

  1. No DHCP or TFTP server running on networks used by OPNFV.
  2. 2-6 separate networks with connectivity between Jumphost and nodes.
    • PXE installation Network
    • IPMI Network
    • Openstack mgmt Network*
    • Openstack external Network*
    • Openstack tenant Network*
    • Openstack storage Network*
  3. Lights out OOB network access from Jumphost with IPMI node enabled (Bare Metal deployment only).
  4. External network has Internet access, meaning a gateway and DNS availability.

The networks marked with (*) can share one NIC (default configuration) or use an exclusive NIC (reconfigured in network.yml).

4.5. Execution Requirements (Bare Metal Only)

In order to execute a deployment, one must gather the following information:

  1. IPMI IP addresses of the nodes.
  2. IPMI login information for the nodes (user/pass).
  3. MAC address of Control Plane / Provisioning interfaces of the Bare Metal nodes.
4.6. Configurations

There are three configuration files a user needs to modify for a cluster deployment: network_cfg.yaml for the openstack networks on the hosts, the dha file for host roles, IPMI credentials and host NIC identification (MAC addresses), and deploy.sh for the OS and openstack version.

5. Configure network

network_cfg.yaml file describes networks configuration for openstack on hosts. It specifies host network mapping and ip assignment of networks to be installed on hosts. Compass4nfv includes a sample network_cfg.yaml under compass4nfv/deploy/conf/network_cfg.yaml

There are three openstack networks to be installed: external, mgmt and storage. These three networks can be shared on one physical nic or on separate nics (multi-nic). The sample included in compass4nfv uses one nic. For multi-nic configuration, see multi-nic configuration.

5.1. Configure openstack network

! All interface names in network_cfg.yaml must be identified in the dha file by MAC address !

Compass4nfv will install networks on the hosts as described in this configuration. It looks for a physical NIC on each host by the MAC address given in the dha file and renames that NIC to the name associated with that MAC address. Therefore, any network interface name that is not identified by a MAC address in the dha file will not be installed correctly, as compass4nfv cannot find the NIC.

Configure provider network

provider_net_mappings:
  - name: br-prv
    network: physnet
    interface: eth1
    type: ovs
    role:
      - controller
      - compute

The external NIC in the dha file must be named eth1, identified by its MAC address. If the user uses a different interface name in the dha file, change eth1 to that name here. Note: the user cannot use eth0 as the external interface name because the install/PXE network is named as such.

Configure openstack mgmt&storage network:

sys_intf_mappings:
  - name: mgmt
    interface: eth1
    vlan_tag: 101
    type: vlan
    role:
      - controller
      - compute
  - name: storage
    interface: eth1
    vlan_tag: 102
    type: vlan
    role:
      - controller
      - compute

Change vlan_tag of mgmt and storage to corresponding vlan tag configured on switch.

Note: for virtual deployment, there is no need to modify mgmt&storage network.

If using the multi-nic feature, i.e. a separate NIC for the mgmt or storage network, the user needs to change the name to the desired NIC name (it needs to match the dha file). Please see the multi-nic configuration.

5.2. Assign IP address to networks

The ip_settings section specifies the IP assignment for the openstack networks.

Users can use the default IP ranges for the mgmt & storage networks.

For external networks:

- name: external
   ip_ranges:
   - - "192.168.50.210"
     - "192.168.50.220"
   cidr: "192.168.50.0/24"
   gw: "192.168.50.1"
   role:
     - controller
     - compute

Provide an external IP range with at least as many IPs as there are hosts (these IPs will be assigned to each host). Provide the actual CIDR and gateway in the cidr and gw fields.

Configure the public IP for the horizon dashboard:

public_vip:
 ip: 192.168.50.240
 netmask: "24"
 interface: external

Provide an external IP in the ip field. This IP cannot be within the IP range assigned to the external network configured in the previous section. It will be used as the horizon address.

See section 6.2 (Bare Metal) and 7.2 (Virtual) for graphs illustrating the network topology.

6. Installation on Bare Metal
6.1. Nodes Configuration (Bare Metal Deployment)

The below file is the inventory template of deployment nodes:

“compass4nfv/deploy/conf/hardware_environment/huawei-pod1/dha.yml”

The "dha.yml" is a collective name for "os-nosdn-nofeature-ha.yml", "os-ocl-nofeature-ha.yml", "os-odl_l2-moon-ha.yml", etc.

You can write your own IPMI IP/User/Password/MAC address/roles with reference to it.

  • name – Host name for deployment node after installation.
  • ipmiVer – IPMI interface version for deployment node support. IPMI 1.0 or IPMI 2.0 is available.
  • ipmiIP – IPMI IP address for the deployment node. Make sure it can be accessed from the Jumphost.
  • ipmiUser – IPMI Username for deployment node.
  • ipmiPass – IPMI Password for deployment node.
  • mac – MAC Address of deployment node PXE NIC.
  • interfaces – Host NICs renamed according to their MAC addresses during OS provisioning.
  • roles – Components deployed.

Set TYPE/FLAVOR and POWER TOOL

E.g.

TYPE: baremetal
FLAVOR: cluster
POWER_TOOL: ipmitool

Set ipmiUser/ipmiPass and ipmiVer

E.g.

ipmiUser: USER
ipmiPass: PASSWORD
ipmiVer: '2.0'

Assignment of different roles to servers

E.g. Openstack only deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute

NOTE: THE 'ha' ROLE MUST BE SELECTED WITH CONTROLLERS, EVEN IF THERE IS ONLY ONE CONTROLLER NODE.

E.g. Openstack and ceph deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha
      - ceph-adm
      - ceph-mon

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute
      - ceph-osd

E.g. Openstack and ODL deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha
      - odl

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute

E.g. Openstack and ONOS deployment roles setting

hosts:
  - name: host1
    mac: 'F8:4A:BF:55:A2:8D'
    interfaces:
       - eth1: 'F8:4A:BF:55:A2:8E'
    ipmiIp: 172.16.130.26
    roles:
      - controller
      - ha
      - onos

  - name: host2
    mac: 'D8:49:0B:DA:5A:B7'
    interfaces:
      - eth1: 'D8:49:0B:DA:5A:B8'
    ipmiIp: 172.16.130.27
    roles:
      - compute
6.2. Network Configuration (Bare Metal Deployment)

Before deployment, there are some network configurations to be checked based on your network topology. The Compass4nfv default network configuration file is "compass4nfv/deploy/conf/hardware_environment/huawei-pod1/network.yml". This file is an example; you can customize it according to your specific network environment.

In this network.yml there are several config sections, listed below in the order they appear in the config file:

6.2.1. Provider Mapping
  • name – provider network name.
  • network – default as physnet, do not change it.
  • interfaces – the NIC or Bridge attached by the Network.
  • type – the type of the NIC or Bridge (vlan for a NIC, ovs for a Bridge).
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
6.2.2. System Interface
  • name – Network name.
  • interfaces – the NIC or Bridge attached by the Network.
  • vlan_tag – if type is vlan, add this tag before ‘type’ tag.
  • type – the type of the NIC or Bridge (vlan for a NIC, ovs for a Bridge).
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
6.2.3. IP Settings
  • name – network name, corresponding one-to-one to the network names in the System Interface section.
  • ip_ranges – the IP address range provided for this network.
  • cidr – the IPv4 network address with its associated routing prefix and subnet mask.
  • gw – need to add this line only if the network is external.
  • roles – all the possible roles of the host machines connected by this network (usually both controller and compute).
6.2.4. Internal VIP(virtual or proxy IP)
  • ip – virtual or proxy IP address; it must be in the same subnet as the mgmt network but must not be within the mgmt network's IP range.
  • netmask – the length of netmask
  • interface – mostly mgmt.
6.2.5. Public VIP
  • ip – virtual or proxy IP address; it must be in the same subnet as the external network but must not be within the external network's IP range.
  • netmask – the length of netmask
  • interface – mostly external.
6.2.6. ONOS NIC
  • the NIC for ONOS, if there is no ONOS configured, leave it unchanged.
6.2.7. Public Network
  • enable – must be True (if False, you need to set up the provider network manually).
  • network – leave it as ext-net.
  • type – the type of the ext-net above, such as flat or vlan.
  • segment_id – when the type is vlan, this should be the VLAN id.
  • subnet – leave it as ext-subnet.
  • provider_network – leave it as physnet.
  • router – leave it as router-ext.
  • enable_dhcp – must be False.
  • no_gateway – must be False.
  • external_gw – same as gw in ip_settings.
  • floating_ip_cidr – CIDR for floating IPs; see the explanation in ip_settings.
  • floating_ip_start – defines the floating IP range together with floating_ip_end (this range must not overlap the external IP range configured in the ip_settings section).
  • floating_ip_end – defines the floating IP range together with floating_ip_start.

The following figure shows the default network configuration.

+--+                          +--+     +--+
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+  Jumphost  +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+    host1   +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+    host2   +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |      +------------+      |  |     |  |
|  +------+    host3   +------+  |     |  |
|  |      +------+-----+      |  |     |  |
|  |             |            |  |     |  |
|  |             +------------+  +-----+  |
|  |                          |  |     |  |
|  |                          |  |     |  |
+-++                          ++-+     +-++
  ^                            ^         ^
  |                            |         |
  |                            |         |
+-+-------------------------+  |         |
|      External Network     |  |         |
+---------------------------+  |         |
       +-----------------------+---+     |
       |       IPMI Network        |     |
       +---------------------------+     |
               +-------------------------+-+
               | PXE(Installation) Network |
               +---------------------------+

The following figure shows the interfaces and NICs of the JumpHost and deployment nodes in the huawei-pod1 network configuration (by default, one NIC for the openstack networks).

+--------------JumpHost-------------+
|                                   |
|   +-+Compass+-+                   |
|   |           +     +--------+    |    External-network
|   |         eth2+---+br-ext  +-+eth0+----------------------+
|   |           +     +--------+    |                        |
|   |           |                   |                        |
|   |           |                   |                        |
|   |           +     +--------+    |    Install-network     |
|   |         eth1+---+install +-+eth1+-----------------+    |
|   |           +     +--------+    |                   |    |
|   |           |                   |                   |    |
|   |           |                   |                   |    |
|   |           +                   |    IPMI-network   |    |
|   |         eth0                eth2+-----------+     |    |
|   |           +                   |             |     |    |
|   +---+VM+----+                   |             |     |    |
+-----------------------------------+             |     |    |
                                                  |     |    |
                                                  |     |    |
                                                  |     |    |
                                                  |     |    |
+---------------Host1---------------+             |     |    |
|                                   |             |     |    |
|                                  eth0+----------------+    |
|                                   |             |     |    |
|                   mgmt +--------+ |             |     |    |
|                                 | |             |     |    |
|                +-----------+    | |             |     |    |
|   external+----+  br-prv   +----+eth1+---------------------+
|                +-----------+    | |             |     |    |
|                                 | |             |     |    |
|                   storage +-----+ |             |     |    |
|                                   |             |     |    |
+-----------------------------------+             |     |    |
|                                 IPMI+-----------+     |    |
+-----------------------------------+             |     |    |
                                                  |     |    |
                                                  |     |    |
                                                  |     |    |
+---------------Host2---------------+             |     |    |
|                                   |             |     |    |
|                                  eth0+----------------+    |
|                                   |             |          |
|                   mgmt +--------+ |             |          |
|                                 | |             |          |
|                +-----------+    | |             |          |
|   external+----+  br-prv   +----+eth1+---------------------+
|                +-----------+    | |             |
|                                 | |             |
|                   storage +-----+ |             |
|                                   |             |
+-----------------------------------+             |
|                                 IPMI+-----------+
+-----------------------------------+

The following figure shows the interfaces and NICs of the JumpHost and deployment nodes in the intel-pod8 network configuration (the openstack networks are separated onto multiple NICs).

+-------------+JumpHost+------------+
|                                   |
|   +-+Compass+-+                   |
|   |           +     +--------+    |    External-network
|   |         eth2+---+br-ext  +-+eth0+----------------------+
|   |           +     +--------+    |                        |
|   |           |                   |                        |
|   |           |                   |                        |
|   |           +     +--------+    |    Install-network     |
|   |         eth1+---+install +-+eth1+-----------------+    |
|   |           +     +--------+    |                   |    |
|   |           |                   |                   |    |
|   |           |                   |                   |    |
|   |           +                   |    IPMI-network   |    |
|   |         eth0                eth2+-----------+     |    |
|   |           +                   |             |     |    |
|   +---+VM+----+                   |             |     |    |
+-----------------------------------+             |     |    |
                                                  |     |    |
                                                  |     |    |
                                                  |     |    |
                                                  |     |    |
+--------------+Host1+--------------+             |     |    |
|                                   |             |     |    |
|                                  eth0+----------------+    |
|                                   |             |     |    |
|                      +--------+   |             |     |    |
|         external+----+br-prv  +-+eth1+---------------------+
|                      +--------+   |             |     |    |
|         storage +---------------+eth2+-------------------------+
|                                   |             |     |    |   |
|         Mgmt    +---------------+eth3+----------------------------+
|                                   |             |     |    |   |  |
|                                   |             |     |    |   |  |
+-----------------------------------+             |     |    |   |  |
|                                 IPMI+-----------+     |    |   |  |
+-----------------------------------+             |     |    |   |  |
                                                  |     |    |   |  |
                                                  |     |    |   |  |
                                                  |     |    |   |  |
                                                  |     |    |   |  |
+--------------+Host2+--------------+             |     |    |   |  |
|                                   |             |     |    |   |  |
|                                  eth0+----------------+    |   |  |
|                                   |             |          |   |  |
|                      +--------+   |             |          |   |  |
|         external+----+br-prv  +-+eth1+---------------------+   |  |
|                      +--------+   |             |              |  |
|         storage +---------------+eth2+-------------------------+  |
|                                   |             | storage-network |
|         Mgmt    +---------------+eth3+----------------------------+
|                                   |             | mgmt-network
|                                   |             |
+-----------------------------------+             |
|                                 IPMI+-----------+
+-----------------------------------+
6.3. Start Deployment (Bare Metal Deployment)
  1. Edit deploy.sh
1.1. Set OS version for deployment nodes.
Compass4nfv supports ubuntu and centos based openstack newton.

E.g.

# Set OS version for target hosts
# Ubuntu16.04 or CentOS7
export OS_VERSION=xenial
or
export OS_VERSION=centos7

1.2. Set ISO image corresponding to your code

E.g.

# Set ISO image corresponding to your code
export ISO_URL=file:///home/compass/compass4nfv.iso

1.3. Set the hardware deploy jumpserver PXE NIC (e.g. eth1).
You do not need to set this for a virtual deployment.

E.g.

# Set hardware deploy jumpserver PXE NIC
# you need to comment out it when virtual deploy
export INSTALL_NIC=eth1

1.4. Set scenario that you want to deploy

E.g.

nosdn-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-nosdn-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

ocl-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-ocl-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network_ocl.yml

odl_l2-moon scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-odl_l2-moon-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

odl_l2-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-odl_l2-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

odl_l3-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-odl_l3-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml

onos-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-onos-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network_onos.yml

onos-sfc deploy scenario sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/huawei-pod1/os-onos-sfc-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network_onos.yml
  2. Run deploy.sh
./deploy.sh
7. Installation on virtual machines
7.1. Nodes Configuration (Virtual Deployment)
7.1.1. virtual machine setting
  • VIRT_NUMBER – the number of nodes for virtual deployment.
  • VIRT_CPUS – the number of CPUs allocated per virtual machine.
  • VIRT_MEM – the memory size(MB) allocated per virtual machine.
  • VIRT_DISK – the disk size allocated per virtual machine.
export VIRT_NUMBER=${VIRT_NUMBER:-5}
export VIRT_CPUS=${VIRT_CPU:-4}
export VIRT_MEM=${VIRT_MEM:-16384}
export VIRT_DISK=${VIRT_DISK:-200G}
7.1.2. roles setting

The below file is the inventory template of deployment nodes:

”./deploy/conf/vm_environment/huawei-virtual1/dha.yml”

The “dha.yml” is a collectively name for “os-nosdn-nofeature-ha.yml os-ocl-nofeature-ha.yml os-odl_l2-moon-ha.yml etc”.

You can write your own address/roles reference to it.

  • name – Host name for deployment node after installation.
  • roles – Components deployed.

Set TYPE and FLAVOR

E.g.

TYPE: virtual
FLAVOR: cluster

Assignment of different roles to servers

E.g. Openstack only deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha

  - name: host2
    roles:
      - compute

NOTE: IF YOU SELECT MULTIPLE NODES AS CONTROLLERS, THE 'ha' ROLE MUST BE SELECTED, TOO.

E.g. Openstack and ceph deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha
      - ceph-adm
      - ceph-mon

  - name: host2
    roles:
      - compute
      - ceph-osd

E.g. Openstack and ODL deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha
      - odl

  - name: host2
    roles:
      - compute

E.g. Openstack and ONOS deployment roles setting

hosts:
  - name: host1
    roles:
      - controller
      - ha
      - onos

  - name: host2
    roles:
      - compute
7.2. Network Configuration (Virtual Deployment)

The same as for the Bare Metal Deployment.

7.3. Start Deployment (Virtual Deployment)
  1. Edit deploy.sh
1.1. Set OS version for deployment nodes.
Compass4nfv supports ubuntu and centos based openstack newton.

E.g.

# Set OS version for target hosts
# Ubuntu16.04 or CentOS7
export OS_VERSION=xenial
or
export OS_VERSION=centos7

1.2. Set ISO image corresponding to your code

E.g.

# Set ISO image corresponding to your code
export ISO_URL=file:///home/compass/compass4nfv.iso

1.3. Set scenario that you want to deploy

E.g.

nosdn-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-nosdn-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

ocl-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-ocl-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network_ocl.yml

odl_l2-moon scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-odl_l2-moon-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

odl_l2-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-odl_l2-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

odl_l3-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-odl_l3-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network.yml

onos-nofeature scenario deploy sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-onos-nofeature-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network_onos.yml

onos-sfc deploy scenario sample

# DHA is your dha.yml's path
export DHA=./deploy/conf/vm_environment/os-onos-sfc-ha.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/vm_environment/huawei-virtual1/network_onos.yml
  2. Run deploy.sh
./deploy.sh
8. Offline Deploy

Compass4nfv uses an offline approach to deploy the cluster and supports complete offline deployment on a jumphost without internet access. Here are the offline deployment instructions:

8.1. Preparation for offline deploy
  1. Download compass.iso from OPNFV artifacts repository (Search compass4nfv in http://artifacts.opnfv.org/ and download an appropriate ISO. ISO can also be generated by script build.sh in compass4nfv root directory.)
  2. Download the Jumphost preparation package from our http server. (Download the jumphost environment package from here. It should be noted that currently only Ubuntu trusty is supported as the offline jumphost OS.)
  3. Clone the compass4nfv code repository.
8.2. Steps of offline deploy
  1. Copy the compass.iso, jh_env_package.tar.gz and the compass4nfv code repository to your jumphost.
  2. Export the local paths of the compass.iso and jh_env_package.tar.gz on the jumphost. Then you can perform deployment on an offline jumphost.

E.g.

Export the compass4nfv.iso and jh_env_package.tar.gz path

# ISO_URL and JHPKG_URL should be absolute path
export ISO_URL=file:///home/compass/compass4nfv.iso
export JHPKG_URL=file:///home/compass/jh_env_package.tar.gz

Run deploy.sh

./deploy.sh
9. Expansion Guide
9.1. Edit NETWORK File

The below file is the inventory template of deployment nodes:

”./deploy/conf/hardware_environment/huawei-pod1/network.yml”

You need to edit the network.yml which you had edited the first deployment.

NOTE: The external subnet's ip_range should exclude the IPs that have already been used.

9.2. Edit DHA File

The below file is the inventory template of deployment nodes:

”./deploy/conf/hardware_environment/expansion-sample/hardware_cluster_expansion.yml”

You can write your own IPMI IP/User/Password/MAC address/roles with reference to it.

  • name – Host name for deployment node after installation.
  • ipmiIP – IPMI IP address for the deployment node. Make sure it can be accessed from the Jumphost.
  • ipmiUser – IPMI Username for deployment node.
  • ipmiPass – IPMI Password for deployment node.
  • mac – MAC address of the deployment node PXE NIC.

Set TYPE/FLAVOR and POWER TOOL

E.g.

TYPE: baremetal
FLAVOR: cluster
POWER_TOOL: ipmitool

Set ipmiUser/ipmiPass and ipmiVer

E.g.

ipmiUser: USER
ipmiPass: PASSWORD
ipmiVer: '2.0'

Assignment of roles to servers

E.g. Only increase one compute node

hosts:
   - name: host6
     mac: 'E8:4D:D0:BA:60:45'
     interfaces:
        - eth1: '08:4D:D0:BA:60:44'
     ipmiIp: 172.16.131.23
     roles:
       - compute

E.g. Increase two compute nodes

hosts:
   - name: host6
     mac: 'E8:4D:D0:BA:60:45'
     interfaces:
        - eth1: '08:4D:D0:BA:60:44'
     ipmiIp: 172.16.131.23
     roles:
       - compute

   - name: host6
     mac: 'E8:4D:D0:BA:60:78'
     interfaces:
        - eth1: '08:4D:56:BA:60:83'
     ipmiIp: 172.16.131.23
     roles:
       - compute
9.2.1. Start Expansion
  1. Edit network.yml and dha.yml file

    You need to edit network.yml and virtual_cluster_expansion.yml or hardware_cluster_expansion.yml, and edit the DHA and NETWORK environment variables. The external subnet's ip_range and the management IP should be changed, as the first 6 IPs are already taken by the first deployment.

E.g.

--- network.yml     2017-02-16 20:07:10.097878150 +0800
+++ network_expansion.yml   2017-02-17 11:40:08.734480478 +0800
@@ -56,7 +56,7 @@
   - name: external
     ip_ranges:
-      - - "192.168.116.201"
+      - - "192.168.116.206"
         - "192.168.116.221"
     cidr: "192.168.116.0/24"
     gw: "192.168.116.1"
  2. Edit deploy.sh
2.1. Set EXPANSION and VIRT_NUMBER.
VIRT_NUMBER decides how many virtual machines to add during a virtual expansion.

E.g.

export EXPANSION="true"
export MANAGEMENT_IP_START="10.1.0.55"
export VIRT_NUMBER=1
export DEPLOY_FIRST_TIME="false"

2.2. Set scenario that you need to expansion

E.g.

# DHA is your dha.yml's path
export DHA=./deploy/conf/hardware_environment/expansion-sample/hardware_cluster_expansion.yml

# NETWORK is your network.yml's path
export NETWORK=./deploy/conf/hardware_environment/huawei-pod1/network.yml
Note: Other environment variables should be the same as in your first deployment.
Please check the environment variables before you run deploy.sh.
  3. Run deploy.sh
./deploy.sh
Compass4nfv Design Guide
1. How to integrate a feature into compass4nfv

This document describes how to integrate a feature (e.g. sdn, moon, kvm, sfc) into the compass installer. By following the steps below, you can achieve this goal.

1.1. Create a role for the feature

Currently Ansible is the main package installation plugin in the adapters of Compass4nfv; it is used to deploy all the roles listed in the playbooks. (More details about Ansible and playbooks can be found in the References.) The most commonly used playbook in compass4nfv is named "HA-ansible-multinodes.yml", located in "your_path_to_compass4nfv/compass4nfv/deploy/adapters/ansible/openstack/".

Before you add your role into the playbook, create your role under the directory "your_path_to_compass4nfv/compass4nfv/deploy/adapters/ansible/roles/". For example, Fig 1 shows some roles that currently exist in compass4nfv.

Fig 1. Existing roles in compass4nfv

Let’s take a look at “moon” and understand the construction of a role. Fig 2 below presents the tree of “moon”.

Fig 2. Tree of moon role

There are five directories in moon, which are files, handlers, tasks, templates and vars. Almost every role has such five directories.

For "files", it is used to store the files you want to copy to the hosts without any modification. These files can be configuration files, code files, etc. Here in moon's files directory there are two python files and one configuration file. All three files will be copied to the controller nodes for specific purposes.

For “handlers”, it is used to store some operations frequently used in your tasks. For example, restart the service daemon.

For "tasks", it is used to store the task yaml files. You need to add the yaml files containing the tasks you write to deploy your role on the hosts. Please note that a main.yml file must exist as the entry point for running the tasks. In Fig 2, you can see that there are four yaml files in the tasks directory of moon. The main.yml is the entry point which calls the other three yaml files.

For “templates”, it is used to store the files that you want to replace some variables in them before copying to hosts. These variables are usually defined in “vars” directory. This can avoid hard coding.

For "vars", it is used to store the yaml files in which the packages and variables are defined. The packages defined here are generic debian or rpm packages. The script that builds the repo will scan the package names here and download them into the related PPA. For special packages, the section "Build packages for the feature" introduces how to handle them. The variables defined here are used in the files under "templates" and "tasks".
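
As a sketch, the skeleton of a new (hypothetical) role named "myfeature" could be created as follows before filling in the yaml files described above:

$ cd your_path_to_compass4nfv/compass4nfv/deploy/adapters/ansible/roles/
$ mkdir -p myfeature/{files,handlers,tasks,templates,vars}
$ touch myfeature/tasks/main.yml   # entry point that will include the other task files
$ touch myfeature/vars/main.yml    # generic packages and variables for the role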

Note: you can get the special packages in the tasks like this:

- name: get the special packages' http server
  shell: awk -F'=' '/compass_server/ {print $2}' /etc/compass.conf
  register: http_server

- name: download odl package
  get_url:
    url: "http://{{ http_server.stdout_lines[0] }}/packages/odl/{{ odl_pkg_url }}"
    dest: /opt/
1.2. Build packages for the feature

In the previous section, we have explained how to build the generic packages for your feature. In this section, we will talk about how to build the special packages used by your feature.

Fig 3. Features building directory in compass4nfv

Fig 3 shows the tree of "your_path_to_compass4nfv/compass4nfv/repo/features/". The Dockerfile is used to start a docker container that runs the scripts in the scripts directory. These scripts download the special feature-related packages into the container. What you need to do is write a shell script to download or build the package you want, and then put the script into "your_path_to_compass4nfv/compass4nfv/repo/features/scripts/". Note that you need to make a directory under /pkg. Take opendaylight as an example:

mkdir -p /pkg/odl

After downloading or building your feature packages, please copy all of your packages into the directory you made, e.g. /pkg/odl.
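
A minimal sketch of such a script for a hypothetical feature (the package URL below is a placeholder used only for illustration):

#!/bin/bash
set -e
# create the per-feature directory under /pkg as described above
mkdir -p /pkg/myfeature
# download (or build) the special package; replace the URL with the real source
wget -O /pkg/myfeature/myfeature.tar.gz "http://example.com/myfeature.tar.gz"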

Note: If you have special requirements for the container OS or kernel version, etc., please contact us.

After all of these, come back to your_path_to_compass4nfv/compass4nfv/ directory, and run the command below:

./repo/make_repo.sh feature # To get special packages

./repo/make_repo.sh openstack # To get generic packages

When execution finishes, you will get a tar package named packages.tar.gz under "your_path_to_compass4nfv/compass4nfv/work/repo/". Your feature-related packages have been archived in this tar package. You will also get the PPA packages, which include the generic packages you defined in the role directory. The PPA packages are xenial-newton-ppa.tar.gz and centos7-newton-ppa.tar.gz, also in "your_path_to_compass4nfv/compass4nfv/work/repo/".

1.3. Build compass ISO including the feature

Before you deploy a cluster with your feature installed, you need an ISO that includes the feature packages, the generic packages and the role. This section introduces how to build the ISO you want. You need to do two simple things:

Configure the build configuration file

The build configuration file is located in “your_path_to_compass4nfv/compass4nfv/build/”. There are lines in the file like this:

export APP_PACKAGE=${APP_PACKAGE:-$FEATURE_URL/packages.tar.gz}

export XENIAL_NEWTON_PPA=${XENIAL_NEWTON_PPA:-$PPA_URL/xenial-newton-ppa.tar.gz}

export CENTOS7_NEWTON_PPA=${CENTOS7_NEWTON_PPA:-$PPA_URL/centos7-newton-ppa.tar.gz}

Just replace $FEATURE_URL and $PPA_URL with the directory where your packages.tar.gz is located. For example:

export APP_PACKAGE=${APP_PACKAGE:-file:///home/opnfv/compass4nfv/work/repo/packages.tar.gz}

export XENIAL_NEWTON_PPA=${XENIAL_NEWTON_PPA:-file:///home/opnfv/compass4nfv/work/repo/xenial-newton-ppa.tar.gz}

export CENTOS7_NEWTON_PPA=${CENTOS7_NEWTON_PPA:-file:///home/opnfv/compass4nfv/work/repo/centos7-newton-ppa.tar.gz}

Build the ISO

After the configuration, just run the command below to build the ISO you want for deployment.

./build.sh
1.4. References

Ansible documentation: http://docs.ansible.com/ansible/index.html

Copper

OPNFV Copper Danube Overview
OPNFV Danube Copper Overview
Introduction

The OPNFV Copper project aims to help ensure that virtualized infrastructure and application deployments comply with goals of the NFV service provider or the VNF designer/user.

This is the third ("Danube") release of the Copper project. The documentation provided here focuses on the overall goals of the Copper project and the specific features supported in the Danube release.

Overall Goals for Configuration Policy

As focused on by Copper, configuration policy helps ensure that the NFV service environment meets the requirements of the variety of stakeholders which will provide or use NFV platforms.

These requirements can be expressed as an intent of the stakeholder, in specific terms or more abstractly, but at the highest level they express:

  • what I want
  • what I don’t want

Using road-based transportation as an analogy, some examples of this are shown below:

Configuration Intent Example
Who I Am         What I Want                                       What I Don't Want
user             a van, wheelchair-accessible, electric powered    someone driving off with my van
road provider    keep drivers moving at an optimum safe speed      four-way stops
public safety    shoulder warning strips, center median barriers   speeding, tractors on the freeway

According to their role, service providers may apply more specific configuration requirements than users, since service providers are more likely to be managing specific types of infrastructure capabilities.

Developers and users may also express their requirements more specifically, based upon the type of application or how the user intends to use it.

For users, a high-level intent can be also translated into a more or less specific configuration capability by the service provider, taking into consideration aspects such as the type of application or its constraints.

Examples of such translation are:

Intent Translation into Configuration Capability
Intent                             Configuration Capability
network security                   firewall, DPI, private subnets
compute/storage security           vulnerability monitoring, resource access controls
high availability                  clustering, auto-scaling, anti-affinity, live migration
disaster recovery                  geo-diverse anti-affinity
high compute/storage performance   clustering, affinity
high network performance           data plane acceleration
resource reclamation               low-usage monitoring

Although such intent-to-capability translation is conceptually useful, it is unclear how it can address the variety of aspects that may affect the choice of an applicable configuration capability.

For that reason, the Copper project will initially focus on more specific configuration requirements as fulfilled by specific configuration capabilities, as well as how those requirements and capabilities are expressed in VNF and service design and packaging or as generic policies for the NFV Infrastructure.

OPNFV Copper Danube Requirements
Requirements

This section outlines general requirements for configuration policies, per the two main aspects in the Copper project scope:

  • Ensuring resource requirements of VNFs and services are applied per VNF designer, service, and tenant intent
  • Ensuring that generic policies are not violated, e.g. networks connected to VMs must either be public or owned by the VM owner
Resource Requirements

Resource requirements describe the characteristics of virtual resources (compute, storage, network) that are needed for VNFs and services, and how those resources should be managed over the lifecycle of a VNF/service. Upstream projects already include multiple ways in which resource requirements can be expressed and fulfilled, e.g.:

  • OpenStack Nova
    • the Image feature, enabling “VM templates” to be defined for NFs and referenced by name as a specific NF version to be used
    • the Flavor feature, addressing basic compute and storage requirements, with extensibility for custom attributes
  • OpenStack Heat
    • the Heat Orchestration Template feature, enabling a variety of VM aspects to be defined and managed by Heat throughout the VM lifecycle, notably
      • alarm handling (requires Ceilometer)
      • attached volumes (requires Cinder)
      • domain name assignment (requires Designate)
      • images (requires Glance)
      • autoscaling
      • software configuration associated with VM lifecycle hooks (CREATE, UPDATE, SUSPEND, RESUME, DELETE)
      • wait conditions and signaling for sequencing orchestration steps
      • orchestration service user management (requires Keystone)
      • shared storage (requires Manila)
      • load balancing (requires Neutron LBaaS)
      • firewalls (requires Neutron FWaaS)
      • various Neutron-based network and security configuration items
      • Nova flavors
      • Nova server attributes including access control
      • Nova server group affinity and anti-affinity
      • “Data-intensive application clustering” (requires Sahara)
      • DBaaS (requires Trove)
      • “multi-tenant cloud messaging and notification service” (requires Zaqar)
  • OpenStack Group-Based Policy
    • API-based grouping of endpoints with associated contractual expectations for data flow processing and service chaining
  • OpenStack Tacker
    • “a fully functional ETSI MANO based general purpose NFV Orchestrator and VNF Manager for OpenStack”
  • OpenDaylight Group-Based Policy
    • model-based grouping of endpoints with associated contractual expectations for data flow processing
  • OpenDaylight Service Function Chaining (SFC)
    • model-based management of “service chains” and the infrastructure that enables them
  • Additional projects that are commonly used for configuration management, implemented as client-server frameworks using model-based, declarative, or scripted configuration management data.
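
As an illustration of how some of the Nova-related requirements above can be fulfilled, the following is a minimal sketch using the OpenStack CLI; it assumes an image named vnf-image has already been uploaded, and the flavor name, server group name and extra-spec value are hypothetical choices that depend on deployment support:

# define a flavor expressing basic compute/storage requirements, with a custom extra spec
openstack flavor create --vcpus 2 --ram 4096 --disk 40 \
  --property hw:cpu_policy=dedicated vnf.medium

# express an anti-affinity requirement (e.g. for high availability) via a server group
openstack server group create --policy anti-affinity vnf-ha-group

# boot a VM using the flavor and the anti-affinity group (replace <group-uuid> with the group id)
openstack server create --flavor vnf.medium --image vnf-image \
  --hint group=<group-uuid> vnf-instance-1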
Generic Policy Requirements

Generic policy requirements address conditions related to resource state and events which need to be monitored for, and optionally responded to or prevented. These conditions are typically expected to be VNF/service-independent, as VNF/service-dependent condition handling (e.g. scale in/out) is considered to be addressed by VNFM/NFVO/VIM functions as described under Resource Requirements, or as FCAPS-related functions. However, the general capabilities below can be applied to VNF/service-specific policy handling as well, or in particular to invocation of VNF/service-specific management/orchestration actions. The high-level required capabilities include:

  • Polled monitoring: Exposure of state via request-response APIs.
  • Notifications: Exposure of state via pub-sub APIs.
  • Realtime/near-realtime notifications: Notifications that occur in actual or near realtime.
  • Delegated policy: CRUD operations on policies that are distributed to specific components for local handling, including one/more of monitoring, violation reporting, and enforcement.
  • Violation reporting: Reporting of conditions that represent a policy violation.
  • Reactive enforcement: Enforcement actions taken in response to policy violation events.
  • Proactive enforcement: Enforcement actions taken in advance of policy violation events, e.g. blocking actions that could result in a policy violation.
  • Compliance auditing: Periodic auditing of state against policies.

Upstream projects already include multiple ways in which configuration conditions can be monitored and responded to (a brief Congress example follows the list below):

  • OpenStack Congress provides a table-based mechanism for state monitoring and proactive/reactive policy enforcement, including data obtained from internal databases of OpenStack core and optional services. The Congress design approach is also extensible to other VIMs (e.g. SDNCs) through development of data source drivers for the new monitored state information.
  • OpenStack Aodh provides means to trigger alarms upon a wide variety of conditions derived from its monitored OpenStack analytics.
  • Nagios “offers complete monitoring and alerting for servers, switches, applications, and services”.
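
For example, a generic policy along the lines of the “Reserved Subnet” test case mentioned later in this document could be expressed in Congress roughly as follows. This is a hedged sketch only: the policy name, table name and CIDR are hypothetical, and the exact datasource table and column names depend on the installed neutronv2 driver version.

# create a user policy and a Datalog rule flagging use of a reserved subnet
openstack congress policy create copper_example
openstack congress policy rule create copper_example \
  'reserved_subnet_error(x) :- neutronv2:subnets(id=x, cidr="10.7.1.0/24")'

# list any current violations (rows of the error table)
openstack congress policy row list copper_example reserved_subnet_error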
Requirements Validation Approach

The Copper project will assess the completeness of the upstream project solutions for requirements in scope through a process of:

  • developing configuration policy use cases to focus solution assessment tests
  • integrating the projects into the OPNFV platform for testing
  • executing functional and performance tests for the solutions
  • assessing overall requirements coverage and gaps in the most complete upstream solutions

Depending upon the priority of discovered gaps, new requirements will be submitted to upstream projects for the next available release cycle.

OPNFV Copper Installation
OPNFV Copper Installation Guide

This document describes how to install Copper, its dependencies and required system resources.

Version History

Date       | Ver. | Author         | Comment
2017 Feb 7 | 1.0  | Bryan Sullivan |
Introduction

The Congress service is automatically configured as required by the JOID and Apex installers, including creation of datasources per the installed datasource drivers. This release includes default support for the following datasource drivers:

  • nova
  • neutronv2
  • ceilometer
  • cinder
  • glancev2
  • keystone

For JOID, Congress is installed through a Juju Charm, and for Apex through a Puppet Module. Both the Charm and Module are being upstreamed to OpenStack for future maintenance.

Other project installer support (e.g. Doctor) may install additional datasource drivers once Congress is installed.

Manual Installation

NOTE: This section describes a manual install procedure that was tested under the JOID and Apex base installs prior to the integration of native installer support through Juju (JOID) and Puppet (Apex). This procedure is being maintained as a basis for additional installer support in future releases. However, since Congress is pre-installed for JOID and Apex, this procedure is not necessary and is not recommended when Congress is already installed.

Copper provides a set of bash scripts to automatically install Congress based upon a JOID or Apex install which does not already have Congress installed. These scripts are in the Copper repo at:

  • components/congress/install/bash/install_congress_1.sh
  • components/congress/install/bash/install_congress_2.sh

Prerequisites to using these scripts:

  • OPNFV installed via JOID or Apex
  • For Apex installs, on the jumphost, ssh to the undercloud VM and “su stack”.
  • For JOID installs, admin-openrc.sh saved from Horizon to ~/admin-openrc.sh
  • Retrieve the Copper install scripts as below, optionally specifying the branch to use as a URL parameter, e.g. ?h=stable%2Fbrahmaputra

To invoke the procedure, enter the following shell commands, optionally specifying the branch identifier to use for OpenStack.

cd ~
# download both install scripts (install_congress_1.sh is the one invoked directly)
wget https://git.opnfv.org/cgit/copper/plain/components/congress/install/bash/install_congress_1.sh
wget https://git.opnfv.org/cgit/copper/plain/components/congress/install/bash/install_congress_2.sh
# optionally pass an OpenStack branch identifier as the first argument
bash install_congress_1.sh [openstack-branch]
OPNFV Copper Configuration Guide
Copper Configuration

This release includes installer support for the OpenStack Congress service under JOID and Apex installers. Congress is installed by default for all JOID and Apex scenarios. Support for other OPNFV installer deployed environments is planned for the next release.

Hardware Configuration

There is no specific hardware configuration required for the Copper project.

Copper Post Installation Procedures

This section describes optional procedures for verifying that the Congress service is operational.

Copper Functional Tests

This release includes the following test cases which are integrated into OPNFV Functest for the JOID and Apex installers:

  • DMZ Placement: dmz.sh
  • SMTP Ingress: smtp_ingress.sh
  • Reserved Subnet: reserved_subnet.sh

These scripts, related scripts that clean up the OpenStack environment afterward, and a combined test runner (run.sh) are in the Copper repo under the “tests” folder. Instructions for using the tests are provided as script comments.

Further description of the tests is provided on the Copper wiki at https://wiki.opnfv.org/display/copper/testing.
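
A minimal way to run these tests from the jumphost, assuming the Copper repository can be cloned from the usual OPNFV Gerrit location and that the prerequisites described in the script comments are met, might look like:

git clone https://gerrit.opnfv.org/gerrit/copper
cd copper/tests
bash dmz.sh     # run an individual test case
bash run.sh     # or run the combined test runner (see the script comments for any required arguments)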

Congress Test Webapp

This release also provides a webapp that can be automatically installed in a Docker container on the OPNFV jumphost. This script is in the Copper repo at:

  • components/congress/test-webapp/setup/install_congress_testserver.sh

Prerequisites for using this script:

  • OPNFV installed per JOID or Apex installer
  • For Apex installs, on the jumphost, ssh to the undercloud VM and “su stack”

To invoke the procedure, enter the following shell commands, optionally specifying the branch identifier to use for Copper:

wget https://git.opnfv.org/cgit/copper/plain/components/congress/test-webapp/setup/install_congress_testserver.sh
bash install_congress_testserver.sh [copper-branch]
Using the Test Webapp

Browse to the webapp IP address provided at the end of the install procedure.

Interactive options are meant to be self-explanatory given a basic familiarity with the Congress service and data model.

Removing the Test Webapp

The webapp can be removed by running this script from the Copper repo:

  • components/congress/test-webapp/setup/clean_congress_testserver.sh
OPNFV Copper User Guide

This release focuses on use of the OpenStack Congress service for managing configuration policy. See the Congress intro guide for general information on the capabilities and usage of Congress.

Examples of Congress API usage can be found in the Copper tests as described on the OPNFV wiki at https://wiki.opnfv.org/display/copper/testing.
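
For instance, once the OpenStack CLI environment is sourced, the Congress service can be inspected with commands along these lines (the policy and table names are placeholders):

openstack congress datasource list                              # confirm the installed datasources (nova, neutronv2, ...)
openstack congress policy list                                  # list built-in and user-created policies
openstack congress policy rule list <policy-name>               # show the Datalog rules of a policy
openstack congress policy row list <policy-name> <table-name>   # show current rows, e.g. policy violations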

OPNFV Copper Danube Design
Definitions

Term  | Meaning
State | Information that can be used to convey or imply the state of something, e.g. an application, resource, entity, etc. This can include data held inside OPNFV components, “events” that have occurred (e.g. “policy violation”), etc.
Event | An item of significance to the policy engine, for which the engine has become aware through some method of discovery e.g. polling or notification.
Abbreviations

Term  | Meaning
CRUD  | Create, Read, Update, Delete (database operation types)
FCAPS | Fault, Configuration, Accounting, Performance, Security
NF    | Network Function
SFC   | Service Function Chaining
VNF   | Virtual Network Function
NFVI  | Network Function Virtualization Infrastructure
Architecture
Architectural Concept

The following example diagram illustrates a “relationship diagram” type view of an NFVI platform, in which the roles of components focused on policy management, services, and infrastructure are shown.

This view illustrates that a large-scale deployment of NFVI may leverage multiple components of the same “type” (e.g. SDN Controller), which fulfill specific purposes for which they are optimized. For example, a global SDN controller and cloud orchestrator can act as directed by a service orchestrator in the provisioning of VNFs per intent, while various components at a local and global level handle policy-related events directly and/or feed them back through a closed-loop policy design that responds as needed, directly or through the service orchestrator.

policy_architecture.png

(source of the diagram above: https://git.opnfv.org/cgit/copper/plain/design_docs/images/policy_architecture.pptx)

Architectural Aspects
  • Policies are reflected in two high-level goals
    • Ensure resource requirements of VNFs and services are applied per VNF designer, service, and tenant intent
    • Ensure that generic policies are not violated, e.g. networks connected to VMs must either be public or owned by the VM owner
  • Policies are distributed through two main means
    • As part of VNF packages, customized if needed by Service Design tools, expressing intent of the VNF designer and service provider, and possibly customized or supplemented by service orchestrators per the intent of specific tenants
    • As generic policies provisioned into VIMs (SDN controllers and cloud orchestrators), expressing intent of the service provider regarding what states/events need to be policy-governed independently of specific VNFs
  • Policies are applied locally and in closed-loop systems per the capabilities of the local policy enforcer and the impact of the related state/event conditions
    • VIMs should be able to execute most policies locally
    • VIMs may need to pass policy-related state/events to a closed-loop system, where those events are relevant to other components in the architecture (e.g. service orchestrator), or some additional data/arbitration is needed to resolve the state/event condition
  • Policies are localized as they are distributed/delegated
    • High-level policies (e.g. expressing “intent”) can be translated into VNF package elements or generic policies, perhaps using distinct syntaxes
    • Delegated policy syntaxes are likely VIM-specific, e.g. Datalog (Congress)
  • Closed-loop policy and VNF-lifecycle event handling are somewhat distinct
    • Closed-loop policy is mostly about resolving conditions that can’t be handled locally, but as above in some cases the conditions may be of relevance and either delivered directly or forwarded to service orchestrators
    • VNF-lifecycle events that can’t be handled by the VIM locally are delivered directly to the service orchestrator
  • Some events/analytics need to be collected into a more “open-loop” system which can enable other actions, e.g.
    • audits and manual interventions
    • machine-learning focused optimizations of policies (largely a future objective)

Issues to be investigated as part of establishing an overall cohesive/adaptive policy architecture:

  • For the various components which may fulfill a specific purpose, what capabilities (e.g. APIs) do they have/need to
    • handle events locally
    • enable closed-loop policy handling components to subscribe/optimize policy-related events that are of interest
  • For global controllers and cloud orchestrators
    • How do they support correlation of events impacting resources in different scopes (network and cloud)
    • What event/response flows apply to various policy use cases
  • What specific policy use cases can/should fall into each overall class
    • locally handled by NFVI components
    • handled by a closed-loop policy system, either VNF/service-specific or VNF-independent

Doctor

Doctor Development Guide
Testing Doctor

You have two options to test Doctor functions with the script developed for doctor CI.

You need to install OpenStack and other OPNFV components except the Doctor Sample Inspector, Sample Monitor and Sample Consumer, as these will be launched by this script. You are encouraged to use the OPNFV official installers, but you can also deploy all components with other installers such as devstack or by manual operation. In those cases, the versions of all components should match the versions used in the corresponding OPNFV release.

Run Test Script

The Doctor project has its own test script under doctor/tests. This test script can be used for functional testing against an OPNFV deployment.

Before running this script, make sure the OpenStack environment parameters are set properly following the OpenStack CLI manual, so that the Doctor Inspector can operate OpenStack services.

Then, you can run the script as follows:

git clone https://gerrit.opnfv.org/gerrit/doctor
cd doctor/tests
export INSTALLER_TYPE=local
export INSPECTOR_TYPE=sample
./run.sh

INSTALLER_TYPE can be ‘apex’, ‘fuel’, ‘joid’ or ‘local’ (default). If you are not using an OPNFV installer, choose ‘local’. INSPECTOR_TYPE can be either ‘sample’ (default) or ‘congress’.

For testing with a stable version, check out the stable branch of the doctor repo before running ‘./run.sh’.
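
For example, assuming the stable branch follows the usual stable/<release> naming:

git checkout stable/danube   # assumed branch name; adjust to the release being tested
./run.sh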

Run Functest Suite

Functest supports Doctor testing by triggering the test script above in a Functest container. You can run the Doctor test with the following steps:

# INSTALLER_TYPE and INSTALLER_IP are assumed to be exported in the environment beforehand
DOCKER_TAG=latest
docker pull opnfv/functest:${DOCKER_TAG}
docker run --privileged=true -id \
    -e INSTALLER_TYPE=${INSTALLER_TYPE} \
    -e INSTALLER_IP=${INSTALLER_IP} \
    -e INSPECTOR_TYPE=sample \
    opnfv/functest:${DOCKER_TAG} /bin/bash
docker exec <container_id> python /home/opnfv/repos/functest/functest/ci/prepare_env.py start
docker exec <container_id> functest testcase run doctor
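
The <container_id> placeholder above can be obtained from the running container, for example:

docker ps -q --filter ancestor=opnfv/functest:${DOCKER_TAG}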

See Functest Userguide for more information.

For testing with stable version, change DOCKER_TAG to ‘stable’ or other release tag identifier.

Tips
Doctor: Fault Management and Maintenance
Project:

Doctor, https://wiki.opnfv.org/doctor

Editors:

Ashiq Khan (NTT DOCOMO), Gerald Kunzmann (NTT DOCOMO)

Authors:

Ryota Mibu (NEC), Carlos Goncalves (NEC), Tomi Juvonen (Nokia), Tommy Lindgren (Ericsson), Bertrand Souville (NTT DOCOMO), Balazs Gibizer (Ericsson), Ildiko Vancsa (Ericsson) and others.

Abstract:

Doctor is an OPNFV requirement project [DOCT]. Its scope is NFVI fault management and maintenance, and it aims at developing and realizing the consequent implementation for the OPNFV reference platform.

This deliverable introduces the use cases and operational scenarios for Fault Management considered in the Doctor project. From the general features, a high level architecture describing logical building blocks and interfaces is derived. Finally, a detailed implementation is introduced, based on available open source components, and a related gap analysis is done as part of this project. The implementation plan finally discusses an initial realization for an NFVI fault management and maintenance solution in open source software.

Definition of terms

Different SDOs and communities use different terminology related to NFV/Cloud/SDN. This list tries to define an OPNFV terminology, mapping/translating the OPNFV terms to terminology used in other contexts.

ACT-STBY configuration
Failover configuration common in Telco deployments. It enables the operator to use a standby (STBY) instance to take over the functionality of a failed active (ACT) instance.
Administrator
Administrator of the system, e.g. OAM in Telco context.
Consumer
User-side Manager; consumer of the interfaces produced by the VIM; VNFM, NFVO, or Orchestrator in ETSI NFV [ENFV] terminology.
EPC
Evolved Packet Core, the main component of the core network architecture of 3GPP’s LTE communication standard.
MME
Mobility Management Entity, an entity in the EPC dedicated to mobility management.
NFV
Network Function Virtualization
NFVI
Network Function Virtualization Infrastructure; totality of all hardware and software components which build up the environment in which VNFs are deployed.
S/P-GW
Serving/PDN-Gateway, two entities in the EPC dedicated to routing user data packets and providing connectivity from the UE to external packet data networks (PDN), respectively.
Physical resource
Actual resources in NFVI; not visible to Consumer.
VNFM
Virtualized Network Function Manager; functional block that is responsible for the lifecycle management of VNF.
NFVO
Network Functions Virtualization Orchestrator; functional block that manages the Network Service (NS) lifecycle and coordinates the management of NS lifecycle, VNF lifecycle (supported by the VNFM) and NFVI resources (supported by the VIM) to ensure an optimized allocation of the necessary resources and connectivity.
VIM
Virtualized Infrastructure Manager; functional block that is responsible for controlling and managing the NFVI compute, storage and network resources, usually within one operator’s Infrastructure Domain, e.g. NFVI Point of Presence (NFVI-PoP).
Virtual Machine (VM)
Virtualized computation environment that behaves very much like a physical computer/server.
Virtual network
Virtual network routes information among the network interfaces of VM instances and physical network interfaces, providing the necessary connectivity.
Virtual resource
A Virtual Machine (VM), a virtual network, or virtualized storage; Offered resources to “Consumer” as result of infrastructure virtualization; visible to Consumer.
Virtual Storage
Virtualized non-volatile storage allocated to a VM.
VNF
Virtualized Network Function. Implementation of a Network Function that can be deployed on a Network Function Virtualization Infrastructure (NFVI).
Introduction

The goal of this project is to build an NFVI fault management and maintenance framework supporting high availability of the Network Services on top of the virtualized infrastructure. The key feature is immediate notification of unavailability of virtualized resources from VIM, to support failure recovery, or failure avoidance of VNFs running on them. Requirement survey and development of missing features in NFVI and VIM are in scope of this project in order to fulfil requirements for fault management and maintenance in NFV.

The purpose of this requirement project is to clarify the necessary features of NFVI fault management and maintenance, identify missing features in the current open source implementations, provide a potential implementation architecture and plan, provide implementation guidelines in relevant upstream projects to realize those missing features, and define the VIM northbound interfaces necessary to perform the tasks of NFVI fault management and maintenance in alignment with ETSI NFV [ENFV].

Problem description

A Virtualized Infrastructure Manager (VIM), e.g. OpenStack [OPSK], cannot detect certain Network Functions Virtualization Infrastructure (NFVI) faults. The ability to detect such faults and notify the Consumer is necessary to ensure the proper functioning of EPC VNFs like MME and S/P-GW.

  • EPC VNFs are often in active standby (ACT-STBY) configuration and need to switch from STBY mode to ACT mode as soon as relevant faults are detected in the active (ACT) VNF.
  • NFVI encompasses all elements building up the environment in which VNFs are deployed, e.g., Physical Machines, Hypervisors, Storage, and Network elements.

In addition, VIM, e.g. OpenStack, needs to receive maintenance instructions from the Consumer, i.e. the operator/administrator of the VNF.

  • Change the state of certain Physical Machines (PMs), e.g. empty the PM, so that maintenance work can be performed at these machines.

Note: Although fault management and maintenance are different operations in NFV, both are considered as part of this project as – except for the trigger – they share a very similar work and message flow. Hence, from implementation perspective, these two are kept together in the Doctor project because of this high degree of similarity.

Use cases and scenarios

Telecom services often have very high requirements on service performance. As a consequence they often utilize redundancy and high availability (HA) mechanisms for both the service and the platform. The HA support may be built-in or provided by the platform. In any case, the HA support typically has a very fast detection and reaction time to minimize service impact. The main changes proposed in this document are about making a clear distinction between, on the one hand, fault management and recovery within the VIM/NFVI and, on the other hand, High Availability support for VNFs, claiming that HA support within a VNF or as a service from the platform is outside the scope of Doctor and is discussed in the High Availability for OPNFV project. Doctor should focus on detecting and remediating faults in the NFVI. This will ensure that applications come back to a fully redundant configuration faster than before.

As an example, Telecom services can come with an Active-Standby (ACT-STBY) configuration which is a (1+1) redundancy scheme. ACT and STBY nodes (aka Physical Network Function (PNF) in ETSI NFV terminology) are in a hot standby configuration. If an ACT node is unable to function properly due to fault or any other reason, the STBY node is instantly made ACT, and affected services can be provided without any service interruption.

The ACT-STBY configuration needs to be maintained. This means, when a STBY node is made ACT, either the previously ACT node, after recovery, shall be made STBY, or, a new STBY node needs to be configured. The actual operations to instantiate/configure a new STBY are similar to instantiating a new VNF and therefore are outside the scope of this project.

The NFVI fault management and maintenance requirements aim at providing fast failure detection of physical and virtualized resources and remediation of the virtualized resources provided to Consumers according to their predefined request to enable applications to recover to a fully redundant mode of operation.

  1. Fault management/recovery using ACT-STBY configuration (Triggered by critical error)
  2. Preventive actions based on fault prediction (Preventing service stop by handling warnings)
  3. VM Retirement (Managing service during NFVI maintenance, i.e. H/W, Hypervisor, Host OS, maintenance)
Faults
Fault management using ACT-STBY configuration

In figure1, a system-wide view of relevant functional blocks is presented. OpenStack is considered as the VIM implementation (aka Controller) which has interfaces with the NFVI and the Consumers. The VNF implementation is represented as different virtual resources marked by different colors. Consumers (VNFM or NFVO in ETSI NFV terminology) own/manage the respective virtual resources (VMs in this example) shown with the same colors.

The first requirement in this use case is that the Controller needs to detect faults in the NFVI (“1. Fault Notification” in figure1) affecting the proper functioning of the virtual resources (labelled as VM-x) running on top of it. It should be possible to configure which relevant fault items should be detected. The VIM (e.g. OpenStack) itself could be extended to detect such faults. Alternatively, a third party fault monitoring tool could be used which then informs the VIM about such faults; this third party fault monitoring element can be considered as a component of VIM from an architectural point of view.

Once such a fault is detected, the VIM shall find out which virtual resources are affected by this fault. In the example in figure1, VM-4 is affected by a fault in the Hardware Server-3. Such a mapping shall be maintained in the VIM, depicted as the “Server-VM info” table inside the VIM.

Once the VIM has identified which virtual resources are affected by the fault, it needs to find out who is the Consumer (i.e. the owner/manager) of the affected virtual resources (Step 2). In the example shown in figure1, the VIM knows that for the red VM-4, the manager is the red Consumer through an Ownership info table. The VIM then notifies (Step 3 “Fault Notification”) the red Consumer about this fault, preferably with sufficient abstraction rather than detailed physical fault information.

_images/figure1.png

Fault management/recovery use case

The Consumer then switches to STBY configuration by switching the STBY node to ACT state (Step 4). It further initiates a process to instantiate/configure a new STBY. However, switching to STBY mode and creating a new STBY machine is a VNFM/NFVO level operation and therefore outside the scope of this project. Doctor project does not create interfaces for such VNFM level configuration operations. Yet, since the total failover time of a consumer service depends on both the delay of such processes as well as the reaction time of Doctor components, minimizing Doctor’s reaction time is a necessary basic ingredient to fast failover times in general.

Once the Consumer has switched to STBY configuration, it notifies (Step 5 “Instruction” in figure1) the VIM. The VIM can then take necessary (e.g. pre-determined by the involved network operator) actions on how to clean up the fault affected VMs (Step 6 “Execute Instruction”).

The key issue in this use case is that a VIM (OpenStack in this context) shall not take a standalone fault recovery action (e.g. migration of the affected VMs) before the ACT-STBY switching is complete, as that might violate the ACT-STBY configuration and render the node out of service.

As an extension of the 1+1 ACT-STBY resilience pattern, a STBY instance can act as backup to N ACT nodes (N+1). In this case, the basic information flow remains the same, i.e., the consumer is informed of a failure in order to activate the STBY node. However, in this case it might be useful for the failure notification to cover a number of failed instances due to the same fault (e.g., more than one instance might be affected by a switch failure). The reaction of the consumer might depend on whether only one active instance has failed (similar to the ACT-STBY case), or if more active instances are needed as well.

Preventive actions based on fault prediction

The fault management scenario explained in Fault management using ACT-STBY configuration can also be performed based on fault prediction. In such cases, in VIM, there is an intelligent fault prediction module which, based on its NFVI monitoring information, can predict an imminent fault in the elements of NFVI. A simple example is raising temperature of a Hardware Server which might trigger a pre-emptive recovery action. The requirements of such fault prediction in the VIM are investigated in the OPNFV project “Data Collection for Failure Prediction” [PRED].

This use case is very similar to Fault management using ACT-STBY configuration. Instead of a fault detection (Step 1 “Fault Notification” in figure1), the trigger comes from a fault prediction module in the VIM, or from a third party module which notifies the VIM about an imminent fault. From Steps 2~5, the work flow is the same as in the “Fault management using ACT-STBY configuration” use case, except in this case, the Consumer of a VM/VNF switches to STBY configuration based on a predicted fault, rather than an occurred fault.

NFVI Maintenance
VM Retirement

All network operators perform maintenance of their network infrastructure, both regularly and irregularly. Besides the hardware, virtualization is expected to increase the number of elements subject to such maintenance as NFVI holds new elements like the hypervisor and host OS. Maintenance of a particular resource element e.g. hardware, hypervisor etc. may render a particular server hardware unusable until the maintenance procedure is complete.

However, the Consumer of VMs needs to know that such resources will be unavailable because of NFVI maintenance. The following use case is again to ensure that the ACT-STBY configuration is not violated. A stand-alone action (e.g. live migration) from VIM/OpenStack to empty a physical machine so that a consequent maintenance procedure could be performed may not only violate the ACT-STBY configuration, but also have impact on real-time processing scenarios where resources dedicated to virtual resources (e.g. VMs) are necessary and a pause in operation (e.g. of a vCPU) is not allowed. The Consumer is in a position to safely perform the switch between ACT and STBY nodes, or switch to an alternative VNF forwarding graph, so the hardware servers hosting the ACT nodes can be emptied for the upcoming maintenance operation. Once the target hardware servers are emptied (i.e. no virtual resources are running on top), the VIM can mark them with an appropriate flag (i.e. “maintenance” state) such that these servers are not considered for hosting of virtual machines until the maintenance flag is cleared (i.e. nodes are back in “normal” status).

A high-level view of the maintenance procedure is presented in figure2. VIM/OpenStack, through its northbound interface, receives a maintenance notification (Step 1 “Maintenance Request”) from the Administrator (e.g. a network operator) including information about which hardware is subject to maintenance. Maintenance operations include replacement/upgrade of hardware, update/upgrade of the hypervisor/host OS, etc.

The consequent steps to enable the Consumer to perform ACT-STBY switching are very similar to the fault management scenario. From VIM/OpenStack’s internal database, it finds out which virtual resources (VM-x) are running on those particular Hardware Servers and who are the managers of those virtual resources (Step 2). The VIM then informs the respective Consumer (VNFMs or NFVO) in Step 3 “Maintenance Notification”. Based on this, the Consumer takes necessary actions (Step 4, e.g. switch to STBY configuration or switch VNF forwarding graphs) and then notifies (Step 5 “Instruction”) the VIM. Upon receiving such notification, the VIM takes necessary actions (Step 6 “Execute Instruction”) to empty the Hardware Servers so that consequent maintenance operations could be performed. Due to the similarity of Steps 2~6, the maintenance procedure and the fault management procedure are investigated in the same project.

_images/figure2.png

Maintenance use case

High level architecture and general features
Functional overview

The Doctor project centers on two distinct use cases: 1) management of failures of virtualized resources and 2) planned maintenance, e.g. migration, of virtualized resources. Both of them may affect a VNF/application and the network service it provides, but there is a difference in frequency and in how they can be handled.

Failures are spontaneous events that may or may not have an impact on the virtual resources. The Consumer should as soon as possible react to the failure, e.g., by switching to the STBY node. The Consumer will then instruct the VIM on how to clean up or repair the lost virtual resources, i.e. restore the VM, VLAN or virtualized storage. How much the applications are affected varies. Applications with built-in HA support might experience a short decrease in retainability (e.g. an ongoing session might be lost) while keeping availability (establishment or re-establishment of sessions are not affected), whereas the impact on applications without built-in HA may be more serious. How much the network service is impacted depends on how the service is implemented. With sufficient network redundancy the service may be unaffected even when a specific resource fails.

On the other hand, planned maintenance impacting virtualized resources consists of events that are known in advance. This group includes e.g. migration due to software upgrades of OS and hypervisor on a compute host. Some of these might have been requested by the application or its management solution, but there is also a need for coordination on the actual operations on the virtual resources. There may be an impact on the applications and the service, but since these are not spontaneous events there is room for planning and coordination between the application management organization and the infrastructure management organization, including performing whatever actions would be required to minimize the problems.

Failure prediction is the process of pro-actively identifying situations that may lead to a failure in the future unless acted on by means of maintenance activities. From applications’ point of view, failure prediction may impact them in two ways: either the warning time is so short that the application or its management solution does not have time to react, in which case it is equal to the failure scenario, or there is sufficient time to avoid the consequences by means of maintenance activities, in which case it is similar to planned maintenance.

Architecture Overview

NFV and the Cloud platform provide virtual resources and related control functionality to users and administrators. figure3 shows the high level architecture of NFV focusing on the NFVI, i.e., the virtualized infrastructure. The NFVI provides virtual resources, such as virtual machines (VM) and virtual networks. Those virtual resources are used to run applications, i.e. VNFs, which could be components of a network service which is managed by the consumer of the NFVI. The VIM provides functionalities of controlling and viewing virtual resources on hardware (physical) resources to the consumers, i.e., users and administrators. OpenStack is a prominent candidate for this VIM. The administrator may also directly control the NFVI without using the VIM.

Although OpenStack is the target upstream project where the new functional elements (Controller, Notifier, Monitor, and Inspector) are expected to be implemented, a particular implementation method is not assumed. Some of these elements may sit outside of OpenStack and offer a northbound interface to OpenStack.

General Features and Requirements

The following features are required for the VIM to achieve high availability of applications (e.g., MME, S/P-GW) and the Network Services:

  1. Monitoring: Monitor physical and virtual resources.
  2. Detection: Detect unavailability of physical resources.
  3. Correlation and Cognition: Correlate faults and identify affected virtual resources.
  4. Notification: Notify unavailable virtual resources to their Consumer(s).
  5. Fencing: Shut down or isolate a faulty resource.
  6. Recovery action: Execute actions to process fault recovery and maintenance.

The time interval between the instant that an event is detected by the monitoring system and the Consumer notification of unavailable resources shall be < 1 second (e.g., Step 1 to Step 4 in figure4).

_images/figure3.png

High level architecture

Monitoring

The VIM shall monitor physical and virtual resources for unavailability and suspicious behavior.

Detection

The VIM shall detect unavailability and failures of physical resources that might cause errors/faults in virtual resources running on top of them. Unavailability of physical resource is detected by various monitoring and managing tools for hardware and software components. This may include also predicting upcoming faults. Note, fault prediction is out of scope of this project and is investigated in the OPNFV “Data Collection for Failure Prediction” project [PRED].

The fault items/events to be detected shall be configurable.

The configuration shall enable Failure Selection and Aggregation. Failure aggregation means the VIM determines unavailability of physical resource from more than two non-critical failures related to the same resource.

There are two types of unavailability - immediate and future:

  • Immediate unavailability can be detected by setting traps of raw failures on hardware monitoring tools.
  • Future unavailability can be found by receiving maintenance instructions issued by the administrator of the NFVI or by failure prediction mechanisms.
Correlation and Cognition

The VIM shall correlate each fault to the impacted virtual resource, i.e., the VIM shall identify unavailability of virtualized resources that are or will be affected by failures on the physical resources under them. Unavailability of a virtualized resource is determined by referring to the mapping of physical and virtualized resources.

VIM shall allow configuration of fault correlation between physical and virtual resources. VIM shall support correlating faults:

  • between a physical resource and another physical resource
  • between a physical resource and a virtual resource
  • between a virtual resource and another virtual resource

Failure aggregation is also required in this feature, e.g., a user may request to be only notified if failures on more than two standby VMs in an (N+M) deployment model occurred.

Notification

The VIM shall notify the alarm, i.e., unavailability of virtual resource(s), to the Consumer owning it over the northbound interface, such that the Consumers impacted by the failure can take appropriate actions to recover from the failure.

The VIM shall also notify the unavailability of physical resources to its Administrator.

All notifications shall be transferred immediately in order to minimize the stalling time of the network service and to avoid over-assignment caused by delayed capability updates.

There may be multiple consumers, so the VIM has to find out the owner of a faulty resource. Moreover, there may be a large number of virtual and physical resources in a real deployment, so polling the VIM for the state of all resources would lead to heavy signaling traffic. Thus, a publication/subscription messaging model is better suited for these notifications, as notifications are only sent to subscribed consumers.

Notifications will be sent out according to the configuration provided by the consumer. The configuration includes endpoint(s) in which the consumers can specify multiple targets for the notification subscription, so that various and multiple receiver functions can consume the notification message. Also, the conditions for notifications shall be configurable, such that the consumer can set corresponding policies, e.g. whether it wants to receive fault notifications or not.

Note: the VIM should only accept notification subscriptions for each resource by its owner or administrator. Notifications to the Consumer about the unavailability of virtualized resources will include a description of the fault, preferably with sufficient abstraction rather than detailed physical fault information.

Fencing

Recovery actions, e.g. safe VM evacuation, have to be preceded by fencing the failed host. Fencing hereby means to isolate or shut down a faulty resource. Without fencing – when the perceived disconnection is due to some transient or partial failure – the evacuation might lead into two identical instances running together and having a dangerous conflict.

There is a cross-project definition in OpenStack of how to implement fencing, but there has not been any progress. The general description is available here: https://wiki.openstack.org/wiki/Fencing_Instances_of_an_Unreachable_Host

OpenStack provides some mechanisms that allow fencing of faulty resources. Some are automatically invoked by the platform itself (e.g. Nova disables the compute service when libvirtd stops running, preventing new VMs from being scheduled on that node), while other mechanisms are consumer trigger-based actions (e.g. Neutron port admin-state-up). For other fencing actions not supported by OpenStack, the Doctor project may suggest ways to address the gap (e.g. by resorting to external tools and orchestration methods), or document or implement them upstream.

The Doctor Inspector component will be responsible for marking resources down in OpenStack, and back up if necessary.
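
As a hedged illustration of the kind of state-correction actions an Inspector could issue once a compute host has been fenced, assuming a failed host named compute-3 and a Nova API version that supports forcing a service down:

openstack compute service set --disable compute-3 nova-compute   # stop scheduling new VMs onto the failed host
nova service-force-down compute-3 nova-compute                    # force the nova-compute service state to 'down'
nova reset-state <affected-server-id>                             # mark an affected VM as 'error' (the default target state)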

Recovery Action

In the basic Fault management using ACT-STBY configuration use case, no automatic actions will be taken by the VIM, but all recovery actions executed by the VIM and the NFVI will be instructed and coordinated by the Consumer.

In a more advanced use case, the VIM may be able to recover the failed virtual resources according to a pre-defined behavior for that resource. In principle this means that the owner of the resource (i.e., its consumer or administrator) can define which recovery actions shall be taken by the VIM. Examples are a restart of the VM or migration/evacuation of the VM.

High level northbound interface specification
Fault Management

This interface allows the Consumer to subscribe to fault notifications from the VIM. Using a filter, the Consumer can narrow down which faults should be notified. A fault notification may trigger the Consumer to switch from ACT to STBY configuration and initiate fault recovery actions. A fault query request/response message exchange allows the Consumer to find out about active alarms at the VIM. A filter can be used to narrow down the alarms returned in the response message.
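
One possible realization of such a subscription in an OpenStack-based VIM is an Aodh event alarm. The following sketch is illustrative only; the event type, query syntax and consumer endpoint are assumptions that depend on the deployed telemetry configuration:

aodh alarm create --type event --name vm_down_alarm \
  --event-type "compute.instance.update" \
  --query 'traits.state=string::error' \
  --alarm-action 'http://<consumer-endpoint>/failure'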

_images/figure4.png

High-level message flow for fault management

The high level message flow for the fault management use case is shown in figure4. It consists of the following steps:

  1. The VIM monitors the physical and virtual resources and the fault management workflow is triggered by a monitored fault event.
  2. Event correlation, fault detection and aggregation in VIM. Note: this may also happen after Step 3.
  3. Database lookup to find the virtual resources affected by the detected fault.
  4. Fault notification to Consumer.
  5. The Consumer switches to standby configuration (STBY).
  6. Instructions to VIM requesting certain actions to be performed on the affected resources, for example migrate/update/terminate specific resource(s). After reception of such instructions, the VIM is executing the requested action, e.g., it will migrate or terminate a virtual resource.
NFVI Maintenance

The NFVI maintenance interface allows the Administrator to notify the VIM about a planned maintenance operation on the NFVI. A maintenance operation may for example be an update of the server firmware or the hypervisor. The MaintenanceRequest message contains instructions to change the state of the physical resource from ‘enabled’ to ‘going-to-maintenance’ and a timeout [1]. After receiving the MaintenanceRequest, the VIM decides on the actions to be taken based on maintenance policies predefined by the affected Consumer(s).

[1] Timeout is set by the Administrator and corresponds to the maximum time to empty the physical resources.
_images/figure5a.png

High-level message flow for maintenance policy enforcement

The high level message flow for the NFVI maintenance policy enforcement is shown in figure5a. It consists of the following steps:

  1. Maintenance trigger received from Administrator.
  2. VIM switches the affected physical resources to “going-to-maintenance” state e.g. so that no new VM will be scheduled on the physical servers.
  3. Database lookup to find the Consumer(s) and virtual resources affected by the maintenance operation.
  4. Maintenance policies are enforced in the VIM, e.g. affected VM(s) are shut down on the physical server(s), or affected Consumer(s) are notified about the planned maintenance operation (steps 4a/4b).

Once the affected Consumer(s) have been notified, they take specific actions (e.g. switch to standby (STBY) configuration, request to terminate the virtual resource(s)) to allow the maintenance action to be executed. After the physical resources have been emptied, the VIM puts the physical resources in “in-maintenance” state and sends a MaintenanceResponse back to the Administrator.

_images/figure5b.png

Successful NFVI maintenance

The high level message flow for a successful NFVI maintenance is shown in figure5b. It consists of the following steps:

  1. The Consumer C3 switches to standby configuration (STBY).
  2. Instructions from Consumers C2/C3 are shared to VIM requesting certain actions to be performed (steps 6a, 6b). After receiving such instructions, the VIM executes the requested action in order to empty the physical resources (step 6c) and informs the Consumer about the result of the actions (steps 6d, 6e).
  3. The VIM switches the physical resources to “in-maintenance” state.
  4. Maintenance response is sent from VIM to inform the Administrator that the physical servers have been emptied.
  5. The Administrator is coordinating and executing the maintenance operation/work on the NFVI. Note: this step is out of scope of Doctor project.

The requested actions to empty the physical resources may not be successful (e.g. migration fails or takes too long) and in such a case, the VIM puts the physical resources back to ‘enabled’ and informs the Administrator about the problem.

_images/figure5c.png

Example of failed NFVI maintenance

An example of a high level message flow to cover the failed NFVI maintenance case is shown in figure5c. It consists of the following steps:

  1. The Consumer C3 switches to standby configuration (STBY).
  2. Instructions from Consumers C2/C3 are shared to VIM requesting certain actions to be performed (steps 6a, 6b). The VIM executes the requested actions and sends back a NACK to consumer C2 (step 6d) as the migration of the virtual resource(s) is not completed by the given timeout.
  3. The VIM switches the physical resources to “enabled” state.
  4. MaintenanceNotification is sent from VIM to inform the Administrator that the maintenance action cannot start.
Gap analysis in upstream projects

This section presents the gaps identified in existing VIM platforms. The focus was to identify gaps based on the features and requirements specified in Section 3.3. The gaps determined by this analysis are presented here.

VIM Northbound Interface
Immediate Notification
  • Type: ‘deficiency in performance’
  • Description
    • To-be
      • The VIM has to notify unavailability of a virtual resource (fault) to the VIM user immediately.
      • The notification should be passed within ‘1 second’ after the fault is detected/notified by the VIM.
      • Also, the following conditions/requirement have to be met:
        • Only the owning user can receive notification of fault related to owned virtual resource(s).
    • As-is
      • OpenStack Metering ‘Ceilometer’ can notify unavailability of virtual resource (fault) to the owner of virtual resource based on alarm configuration by the user.
      • Alarm notifications are triggered by the alarm evaluator instead of by the notification agents that actually receive the faults
      • Evaluation interval should be equal to or larger than configured pipeline interval for collection of underlying metrics.
      • The interval for collection has to be set large enough, which depends on the size of the deployment and the number of metrics to be collected.
      • The interval may not be less than one second, even in small deployments. The default value is 60 seconds.
      • Alternative: OpenStack has a message bus to publish system events. The operator can allow the user to connect to it, but there are no functions to filter out events that should not be passed to the user or which were not requested by the user.
    • Gap
      • Fault notifications cannot be received immediately by Ceilometer.
  • Solved by
Maintenance Notification
VIM Southbound interface
Normalization of data collection models
  • Type: ‘missing’
  • Description
    • To-be
      • A normalized data format needs to be created to cope with the many data models from different monitoring solutions.
    • As-is
      • Data can be collected from many places (e.g. Zabbix, Nagios, Cacti, Zenoss). Although each solution establishes its own data models, no common data abstraction models exist in OpenStack.
    • Gap
      • Normalized data format does not exist.
  • Solved by
OpenStack
Ceilometer

OpenStack offers a telemetry service, Ceilometer, for collecting measurements of the utilization of physical and virtual resources [CEIL]. Ceilometer can collect a number of metrics across multiple OpenStack components and watch for variations and trigger alarms based upon the collected data.

Scalability of fault aggregation
  • Type: ‘scalability issue’
  • Description
    • To-be
      • Be able to scale to a large deployment, where thousands of monitoring events per second need to be analyzed.
    • As-is
      • Performance issue when scaling to medium-sized deployments.
    • Gap
      • Ceilometer seems to be unsuitable for monitoring medium and large scale NFVI deployments.
  • Solved by
Monitoring of hardware and software
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • OpenStack (as the VIM) should monitor various hardware and software in the NFVI to handle faults on them via Ceilometer.
      • OpenStack may have monitoring functionality in itself and can be integrated with third party monitoring tools.
      • OpenStack needs to be able to detect the faults listed in the Annex.
    • As-is
      • For each deployment of OpenStack, an operator has responsibility to configure monitoring tools with relevant scripts or plugins in order to monitor hardware and software.
      • OpenStack Ceilometer does not monitor hardware and software to capture faults.
    • Gap
      • Ceilometer is not able to detect and handle all faults listed in the Annex.
  • Solved by
Nova

OpenStack Nova [NOVA] is a mature and widely known and used component in OpenStack cloud deployments. It is the main part of an “infrastructure-as-a-service” system providing a cloud computing fabric controller, supporting a wide diversity of virtualization and container technologies.

Nova has proven throughout these past years to be highly available and fault-tolerant. Featuring its own API, it also provides a compatibility API with Amazon EC2 APIs.

Correct states when compute host is down
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • The API shall support changing the VM power state in case the host has failed.
      • The API shall support changing the nova-compute state.
      • There could be a single API to change different VM states for all VMs belonging to a specific host.
      • Support external systems that monitor the infrastructure and resources and that are able to call the API quickly and reliably.
      • Resource states are reliable such that correlation actions can be fast and automated.
      • The user shall be able to read states from OpenStack and trust that they are correct.
    • As-is
      • When a VM goes down due to a host HW, host OS or hypervisor failure, nothing happens in OpenStack. The VMs of a crashed host/hypervisor are reported to be live and OK through the OpenStack API.
      • nova-compute state might change too slowly, or the state is not reliable if VMs are also expected to be down. This makes it possible to schedule VMs onto a failed host, and the slowness blocks evacuation.
    • Gap
      • OpenStack does not change its states fast and reliably enough.
      • The API does not support having an external system change states, and trusting that the states are reliable (i.e. the external system has fenced the failed host).
      • Users cannot read all the states from OpenStack nor trust that they are correct.
  • Solved by
Evacuate VMs in Maintenance mode
  • Type: ‘missing’
  • Description
    • To-be
      • When maintenance mode for a compute host is set, trigger VM evacuation to available compute nodes before bringing the host down for maintenance.
    • As-is
      • If setting a compute node to maintenance mode, OpenStack only schedules evacuation of all VMs to available compute nodes if the in-maintenance compute node runs the XenAPI or VMware ESX hypervisor. Other hypervisors (e.g. KVM) are not supported and, hence, guest VMs will likely stop running due to maintenance actions the administrator may perform (e.g. hardware upgrades, OS updates).
    • Gap
      • Nova libvirt hypervisor driver does not implement automatic guest VMs evacuation when compute nodes are set to maintenance mode ($ nova host-update --maintenance enable <hostname>).
Monasca

Monasca is an open-source monitoring-as-a-service (MONaaS) solution that integrates with OpenStack. Even though it is still in its early days, it is the interest of the community that the platform be multi-tenant, highly scalable, performant and fault-tolerant. It provides a streaming alarm engine, a notification engine, and a northbound REST API users can use to interact with Monasca. Hundreds of thousands of metrics per second can be processed [MONA].

Anomaly detection
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • Detect the failure and perform a root cause analysis to filter out other alarms that may be triggered due to their cascading relation.
    • As-is
      • A mechanism to detect root causes of failures is not available.
    • Gap
      • Certain failures can trigger many alarms due to their dependency on the underlying root cause of failure. Knowing the root cause can help filter out unnecessary and overwhelming alarms.
  • Status
    • Monasca as of now lacks this feature, although the community is aware and working toward supporting it.
Sensor monitoring
  • Type: ‘missing (lack of functionality)’
  • Description
    • To-be
      • It should support monitoring sensor data retrieval, for instance, from IPMI.
    • As-is
      • Monasca does not monitor sensor data
    • Gap
      • Sensor monitoring is very important. It provides operators status on the state of the physical infrastructure (e.g. temperature, fans).
  • Addressed by
    • Monasca can be configured to use third-party monitoring solutions (e.g. Nagios, Cacti) for retrieving additional data.
Hardware monitoring tools
Zabbix

Zabbix is an open-source solution for monitoring availability and performance of infrastructure components (i.e. servers and network devices), as well as applications [ZABB]. It can be customized for use with OpenStack. It is a mature tool and has been proven to be able to scale to large systems with 100,000s of devices.

Delay in execution of actions
  • Type: ‘deficiency in performance’
  • Description
    • To-be
      • After detecting a fault, the monitoring tool should immediately execute the appropriate action, e.g. inform the manager through the NB I/F
    • As-is
      • A delay of around 10 seconds was measured in two independent testbed deployments
    • Gap
Detailed architecture and interface specification

This section describes a detailed implementation plan, which is based on the high level architecture introduced in Section 3. Section 5.1 describes the functional blocks of the Doctor architecture, which is followed by a high level message flow in Section 5.2. Section 5.3 provides a mapping of selected existing open source components to the building blocks of the Doctor architecture. Thereby, the selection of components is based on their maturity and the gap analysis executed in Section 4. Sections 5.4 and 5.5 detail the specification of the related northbound interface and the related information elements. Finally, Section 5.6 provides a first set of blueprints to address selected gaps required for the realization of the Doctor functionalities.

Functional Blocks

This section introduces the functional blocks that form the VIM. OpenStack was selected as the candidate for implementation. Inside the VIM, 4 different building blocks are defined (see figure6).

_images/figure6.png

Functional blocks

Monitor

The Monitor module has the responsibility for monitoring the virtualized infrastructure. There are already many existing tools and services (e.g. Zabbix) to monitor different aspects of hardware and software resources which can be used for this purpose.

Inspector

The Inspector module has the ability a) to receive various failure notifications regarding physical resource(s) from Monitor module(s), b) to find the affected virtual resource(s) by querying the resource map in the Controller, and c) to update the state of the virtual resource (and physical resource).

The Inspector has drivers for different types of events and resources to integrate any type of Monitor and Controller modules. It also uses a failure policy database to decide on the failure selection and aggregation from raw events. This failure policy database is configured by the Administrator.

The reason for separation of the Inspector and Controller modules is to make the Controller focus on simple operations by avoiding a tight integration of various health check mechanisms into the Controller.
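To make the Inspector's role (a to c above) more concrete, the following is a minimal sketch of its reaction path, assuming the python-novaclient library and compute API microversion 2.11 or later for force_down; the credentials, host name and the omitted failure policy handling are illustrative placeholders and not part of the specification:

from novaclient import client

def handle_host_failure(nova, hostname):
    # a) a raw failure notification for 'hostname' has been received from a Monitor
    # b) find the affected virtual resources, i.e. the servers on the failed host
    servers = nova.servers.list(search_opts={'host': hostname, 'all_tenants': 1})
    # c) update the state of the physical resource and of the affected virtual resources
    nova.services.force_down(hostname, 'nova-compute', True)
    for server in servers:
        nova.servers.reset_state(server, 'error')

# Illustrative credentials; any supported authentication method can be used.
nova = client.Client('2.11', 'admin', 'password', 'admin',
                     'http://keystone:5000/v2.0')
handle_host_failure(nova, 'compute-1')
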

Controller

The Controller is responsible for maintaining the resource map (i.e. the mapping from physical resources to virtual resources), accepting update requests for the resource state(s) (exposed as a provider API), and sending all failure events regarding virtual resources to the Notifier. Optionally, the Controller has the ability to force the state of a given physical resource to down in the resource mapping when it receives failure notifications from the Inspector for that given physical resource. The Controller also re-calculates the capacity of the NFVI when receiving a failure notification for a physical resource.

In a real-world deployment, the VIM may have several controllers, one for each resource type, such as Nova, Neutron and Cinder in OpenStack. Each controller maintains a database of virtual and physical resources which shall be the master source for resource information inside the VIM.

Notifier

The focus of the Notifier is on selecting and aggregating failure events received from the Controller based on policies mandated by the Consumer. Therefore, it allows the Consumer to subscribe to alarms regarding virtual resources using a method such as an API endpoint. After receiving a fault event from a Controller, it notifies the Consumer of the fault by referring to the alarm configuration which was defined by the Consumer earlier on.

To reduce complexity of the Controller, it is a good approach for the Controllers to emit all notifications without any filtering mechanism and have another service (i.e. Notifier) handle those notifications properly. This is the general philosophy of notifications in OpenStack. Note that a fault message consumed by the Notifier is different from the fault message received by the Inspector; the former message is related to virtual resources which are visible to users with relevant ownership, whereas the latter is related to raw devices or small entities which should be handled with an administrator privilege.

The northbound interface between the Notifier and the Consumer/Administrator is specified in Detailed northbound interface specification.

Sequence
Fault Management

The detailed work flow for fault management is as follows (see also figure7):

  1. Request to subscribe to monitor specific virtual resources. A query filter can be used to narrow down the alarms the Consumer wants to be informed about.
  2. Each subscription request is acknowledged with a subscribe response message. The response message contains information about the subscribed virtual resources, in particular if a subscribed virtual resource is in “alarm” state.
  3. The NFVI sends monitoring events for resources the VIM has subscribed to. Note: this subscription message exchange between the VIM and NFVI is not shown in this message flow.
  4. Event correlation, fault detection and aggregation in VIM.
  5. Database lookup to find the virtual resources affected by the detected fault.
  6. Fault notification to Consumer.
  7. The Consumer switches to standby configuration (STBY)
  8. Instructions to VIM requesting certain actions to be performed on the affected resources, for example migrate/update/terminate specific resource(s). After reception of such instructions, the VIM is executing the requested action, e.g. it will migrate or terminate a virtual resource.
  1. Query request from Consumer to VIM to get information about the current status of a resource.
  2. Response to the query request with information about the current status of the queried resource. In case the resource is in “fault” state, information about the related fault(s) is returned.

In order to allow for quick reaction to failures, the time interval between fault detection in step 3 and the corresponding recovery actions in steps 7 and 8 shall be less than 1 second.

_images/figure7.png

Fault management work flow

_images/figure8.png

Fault management scenario

figure8 shows a more detailed message flow (Steps 4 to 6) between the 4 building blocks introduced in Functional Blocks.

  1. The Monitor observes a fault in the NFVI and reports the raw fault to the Inspector. The Inspector filters and aggregates the faults using pre-configured failure policies.
  2. a) The Inspector queries the Resource Map to find the virtual resources affected by the raw fault in the NFVI. b) The Inspector updates the state of the affected virtual resources in the Resource Map. c) The Controller observes a change of the virtual resource state and informs the Notifier about the state change and the related alarm(s). Alternatively, the Inspector may directly inform the Notifier about it.
  3. The Notifier is performing another filtering and aggregation of the changes and alarms based on the pre-configured alarm configuration. Finally, a fault notification is sent to northbound to the Consumer.
NFVI Maintenance
_images/figure9.png

NFVI maintenance work flow

The detailed work flow for NFVI maintenance is shown in figure9 and has the following steps. Note that steps 1, 2, and 5 to 8a in the NFVI maintenance work flow are very similar to the steps in the fault management work flow and share a similar implementation plan in Release 1.

  1. Subscribe to fault/maintenance notifications.
  2. Response to subscribe request.
  3. Maintenance trigger received from administrator.
  4. VIM switches NFVI resources to “maintenance” state. This means, e.g., that they should not be used for further allocation/migration requests.
  5. Database lookup to find the virtual resources affected by the detected maintenance operation.
  6. Maintenance notification to Consumer.
  7. The Consumer switches to standby configuration (STBY)
  8. Instructions from Consumer to VIM requesting certain recovery actions to be performed (step 8a). After reception of such instructions, the VIM is executing the requested action in order to empty the physical resources (step 8b).
  9. Maintenance response from VIM to inform the Administrator that the physical machines have been emptied (or the operation resulted in an error state).
  10. Administrator is coordinating and executing the maintenance operation/work on the NFVI.
  1. Query request from Administrator to VIM to get information about the current state of a resource.
  2. Response to the query request with information about the current state of the queried resource(s). In case the resource is in “maintenance” state, information about the related maintenance operation is returned.
_images/figure10.png

NFVI Maintenance scenario

figure10 shows a more detailed message flow (Steps 3 to 6 and 9) between the 4 building blocks introduced in Section 5.1.

  1. The Administrator is sending a StateChange request to the Controller residing in the VIM.

  2. The Controller queries the Resource Map to find the virtual resources affected by the planned maintenance operation.

  3. a) The Controller updates the state of the affected virtual resources in the Resource Map database.

    b) The Controller informs the Notifier about the virtual resources that will be affected by the maintenance operation.

  4. A maintenance notification is sent to northbound to the Consumer.

...

  1. The Controller informs the Administrator after the physical resources have been freed.
Information elements

This section introduces all attributes and information elements used in the message exchange on the northbound interfaces between the VIM and the NFVO and VNFM.

Note: The information elements will be aligned with current work in ETSI NFV IFA working group.

Simple information elements:

  • SubscriptionID (Identifier): identifies a subscription to receive fault or maintenance notifications.
  • NotificationID (Identifier): identifies a fault or maintenance notification.
  • VirtualResourceID (Identifier): identifies a virtual resource affected by a fault or a maintenance action of the underlying physical resource.
  • PhysicalResourceID (Identifier): identifies a physical resource affected by a fault or maintenance action.
  • VirtualResourceState (String): state of a virtual resource, e.g. “normal”, “maintenance”, “down”, “error”.
  • PhysicalResourceState (String): state of a physical resource, e.g. “normal”, “maintenance”, “down”, “error”.
  • VirtualResourceType (String): type of the virtual resource, e.g. “virtual machine”, “virtual memory”, “virtual storage”, “virtual CPU”, or “virtual NIC”.
  • FaultID (Identifier): identifies the related fault in the underlying physical resource. This can be used to correlate different fault notifications caused by the same fault in the physical resource.
  • FaultType (String): Type of the fault. The allowed values for this parameter depend on the type of the related physical resource. For example, a resource of type “compute hardware” may have faults of type “CPU failure”, “memory failure”, “network card failure”, etc.
  • Severity (Integer): value expressing the severity of the fault. The higher the value, the more severe the fault.
  • MinSeverity (Integer): value used in filter information elements. Only faults with a severity higher than the MinSeverity value will be notified to the Consumer.
  • EventTime (Datetime): Time when the fault was observed.
  • EventStartTime and EventEndTime (Datetime): Datetime range that can be used in a FaultQueryFilter to narrow down the faults to be queried.
  • ProbableCause (String): information about the probable cause of the fault.
  • CorrelatedFaultID (Identifier): list of other faults correlated to this fault.
  • isRootCause (Boolean): Parameter indicating if this fault is the root for other correlated faults. If TRUE, then the faults listed in the parameter CorrelatedFaultID are caused by this fault.
  • FaultDetails (Key-value pair): provides additional information about the fault, e.g. information about the threshold, monitored attributes, indication of the trend of the monitored parameter.
  • FirmwareVersion (String): current version of the firmware of a physical resource.
  • HypervisorVersion (String): current version of a hypervisor.
  • ZoneID (Identifier): Identifier of the resource zone. A resource zone is the logical separation of physical and software resources in an NFVI deployment for physical isolation, redundancy, or administrative designation.
  • Metadata (Key-value pair): provides additional information of a physical resource in maintenance/error state.

Complex information elements (see also UML diagrams in figure13 and figure14):

  • VirtualResourceInfoClass:
    • VirtualResourceID [1] (Identifier)
    • VirtualResourceState [1] (String)
    • Faults [0..*] (FaultClass): For each resource, all faults including detailed information about the faults are provided.
  • FaultClass: The parameters of the FaultClass are partially based on ETSI TS 132 111-2 (V12.1.0) [*], which is specifying fault management in 3GPP, in particular describing the information elements used for alarm notifications.
    • FaultID [1] (Identifier)
    • FaultType [1] (String)
    • Severity [1] (Integer)
    • EventTime [1] (Datetime)
    • ProbableCause [1] (String)
    • CorrelatedFaultID [0..*] (Identifier)
    • FaultDetails [0..*] (Key-value pair)
[*]http://www.etsi.org/deliver/etsi_ts/132100_132199/13211102/12.01.00_60/ts_13211102v120100p.pdf
  • SubscribeFilterClass
    • VirtualResourceType [0..*] (String)
    • VirtualResourceID [0..*] (Identifier)
    • FaultType [0..*] (String)
    • MinSeverity [0..1] (Integer)
  • FaultQueryFilterClass: narrows down the FaultQueryRequest, for example it limits the query to certain physical resources, a certain zone, a given fault type/severity/cause, or a specific FaultID.
    • VirtualResourceType [0..*] (String)
    • VirtualResourceID [0..*] (Identifier)
    • FaultType [0..*] (String)
    • MinSeverity [0..1] (Integer)
    • EventStartTime [0..1] (Datetime)
    • EventEndTime [0..1] (Datetime)
  • PhysicalResourceStateClass:
    • PhysicalResourceID [1] (Identifier)
    • PhysicalResourceState [1] (String): mandates the new state of the physical resource.
    • Metadata [0..*] (Key-value pair)
  • PhysicalResourceInfoClass:
    • PhysicalResourceID [1] (Identifier)
    • PhysicalResourceState [1] (String)
    • FirmwareVersion [0..1] (String)
    • HypervisorVersion [0..1] (String)
    • ZoneID [0..1] (Identifier)
    • Metadata [0..*] (Key-value pair)
  • StateQueryFilterClass: narrows down a StateQueryRequest, for example it limits the query to certain physical resources, a certain zone, or a given resource state (e.g., only resources in “maintenance” state).
    • PhysicalResourceID [1] (Identifier)
    • PhysicalResourceState [1] (String)
    • ZoneID [0..1] (Identifier)
Detailed northbound interface specification

This section specifies the northbound interfaces for fault management and NFVI maintenance between the VIM on one end and the Consumer and the Administrator on the other end. For each interface, all messages and related information elements are provided.

Note: The interface definition will be aligned with current work in the ETSI NFV IFA working group.

All of the interfaces described below are produced by the VIM and consumed by the Consumer or Administrator.

Fault management interface

This interface allows the VIM to notify the Consumer about a virtual resource that is affected by a fault, either within the virtual resource itself or by the underlying virtualization infrastructure. The messages on this interface are shown in figure13 and explained in detail in the following subsections.

Note: The information elements used in this section are described in detail in Section 5.4.

_images/figure13.png

Fault management NB I/F messages

SubscribeRequest (Consumer -> VIM)

Subscription from Consumer to VIM to be notified about faults of specific resources. The faults to be notified about can be narrowed down using a subscribe filter.

Parameters:

  • SubscribeFilter [1] (SubscribeFilterClass): Optional information to narrow down the faults that shall be notified to the Consumer, for example limit to specific VirtualResourceID(s), severity, or cause of the alarm.
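As an illustration, a SubscribeRequest carrying such a filter could be encoded as follows; the JSON rendering and all field values are assumptions of this document, as the interface specification itself is transport-agnostic:

{
    'subscribeRequest': {
        'subscribeFilter': {
            'virtualResourceType': ['virtual machine'],
            'virtualResourceId': ['vm-0001', 'vm-0002'],
            'faultType': ['CPU failure', 'memory failure'],
            'minSeverity': 3
        }
    }
}
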
SubscribeResponse (VIM -> Consumer)

Response to a subscribe request message including information about the subscribed resources, in particular if they are in “fault/error” state.

Parameters:

  • SubscriptionID [1] (Identifier): Unique identifier for the subscription. It can be used to delete or update the subscription.
  • VirtualResourceInfo [0..*] (VirtualResourceInfoClass): Provides additional information about the subscribed resources, i.e., a list of the related resources, the current state of the resources, etc.
FaultNotification (VIM -> Consumer)

Notification about a virtual resource that is affected by a fault, either within the virtual resource itself or by the underlying virtualization infrastructure. After reception of this notification, the Consumer will decide on the optimal action to resolve the fault. This includes actions like switching to a hot standby virtual resource, migration of the faulty virtual resource to another physical machine, or termination of the faulty virtual resource and instantiation of a new virtual resource in order to provide a new hot standby resource. In some use cases the Consumer can leave the virtual resources on the failed host to be booted up again after the fault has been recovered. Existing resource management interfaces and messages between the Consumer and the VIM can be used for those actions, and there is no need to define additional actions on the Fault Management Interface.

Parameters:

  • NotificationID [1] (Identifier): Unique identifier for the notification.
  • VirtualResourceInfo [1..*] (VirtualResourceInfoClass): List of faulty resources with detailed information about the faults.
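For illustration, a FaultNotification carrying one faulty virtual resource could look as follows; the encoding and the values are examples only, while the field names follow the information elements defined in Section 5.4:

{
    'faultNotification': {
        'notificationId': 'notification-0001',
        'virtualResourceInfo': [{
            'virtualResourceId': 'vm-0001',
            'virtualResourceState': 'error',
            'faults': [{
                'faultId': 'fault-0001',
                'faultType': 'compute hardware failure',
                'severity': 5,
                'eventTime': '2016-04-12T08:00:00',
                'probableCause': 'link-down',
                'faultDetails': {'hostname': 'compute-1'}
            }]
        }]
    }
}
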
FaultQueryRequest (Consumer -> VIM)

Request to find out about active alarms at the VIM. A FaultQueryFilter can be used to narrow down the alarms returned in the response message.

Parameters:

  • FaultQueryFilter [1] (FaultQueryFilterClass): narrows down the FaultQueryRequest, for example it limits the query to certain physical resources, a certain zone, a given fault type/severity/cause, or a specific FaultID.
FaultQueryResponse (VIM -> Consumer)

List of active alarms at the VIM matching the FaultQueryFilter specified in the FaultQueryRequest.

Parameters:

  • VirtualResourceInfo [0..*] (VirtualResourceInfoClass): List of faulty resources. For each resource all faults including detailed information about the faults are provided.
NFVI maintenance

The Consumer-VIM NFVI maintenance interface allows the Consumer to subscribe to maintenance notifications provided by the VIM. The related Administrator-VIM maintenance interface allows the Administrator to issue maintenance requests to the VIM, i.e. requesting the VIM to take appropriate actions to empty physical machine(s) in order to execute maintenance operations on them. The interface also allows the Administrator to query the state of physical machines, e.g., in order to get details on the current status of a maintenance operation like a firmware update.

The messages defined in these northbound interfaces are shown in figure14 and described in detail in the following subsections.

_images/figure14.png

NFVI maintenance NB I/F messages

SubscribeRequest (Consumer -> VIM)

Subscription from Consumer to VIM to be notified about maintenance operations for specific virtual resources. The resources to be informed about can be narrowed down using a subscribe filter.

Parameters:

  • SubscribeFilter [1] (SubscribeFilterClass): Information to narrow down the faults that shall be notified to the Consumer, for example limit to specific virtual resource type(s).
SubscribeResponse (VIM -> Consumer)

Response to a subscribe request message, including information about the subscribed virtual resources, in particular if they are in “maintenance” state.

Parameters:

  • SubscriptionID [1] (Identifier): Unique identifier for the subscription. It can be used to delete or update the subscription.
  • VirtualResourceInfo [0..*] (VirtualResourceInfoClass): Provides additional information about the subscribed virtual resource(s), e.g., the ID, type and current state of the resource(s).
MaintenanceNotification (VIM -> Consumer)

Notification about a physical resource switched to “maintenance” state. After reception of this notification, the Consumer will decide on the optimal action to address it, e.g., to switch to the standby (STBY) configuration.

Parameters:

  • VirtualResourceInfo [1..*] (VirtualResourceInfoClass): List of virtual resources where the state has been changed to maintenance.
StateChangeRequest (Administrator -> VIM)

Request to change the state of a list of physical resources, e.g. to “maintenance” state, in order to prepare them for a planned maintenance operation.

Parameters:

  • PhysicalResourceState [1..*] (PhysicalResourceStateClass)
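A StateChangeRequest putting one compute host into maintenance could, for example, be encoded as follows; the encoding and the values are illustrative only:

{
    'stateChangeRequest': {
        'physicalResourceState': [{
            'physicalResourceId': 'compute-1',
            'physicalResourceState': 'maintenance',
            'metadata': {'reason': 'firmware update'}
        }]
    }
}
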
StateChangeResponse (VIM -> Administrator)

Response message to inform the Administrator that the requested resources are now in maintenance state (or the operation resulted in an error) and the maintenance operation(s) can be executed.

Parameters:

  • PhysicalResourceInfo [1..*] (PhysicalResourceInfoClass)
StateQueryRequest (Administrator -> VIM)

In this procedure, the Administrator would like to get the information about physical machine(s), e.g. their state (“normal”, “maintenance”), firmware version, hypervisor version, update status of firmware and hypervisor, etc. It can be used to check the progress during firmware update and the confirmation after update. A filter can be used to narrow down the resources returned in the response message.

Parameters:

  • StateQueryFilter [1] (StateQueryFilterClass): narrows down the StateQueryRequest, for example it limits the query to certain physical resources, a certain zone, or a given resource state.
StateQueryResponse (VIM -> Administrator)

List of physical resources matching the filter specified in the StateQueryRequest.

Parameters:

  • PhysicalResourceInfo [0..*] (PhysicalResourceInfoClass): List of physical resources. For each resource, information about the current state, the firmware version, etc. is provided.
NFV IFA, OPNFV Doctor and AODH alarms

This section compares the alarm interfaces of ETSI NFV IFA with the specifications of this document and the alarm class of AODH.

ETSI NFV specifies an interface for alarms from virtualised resources in ETSI GS NFV-IFA 005 [ENFV]. The interface specifies an Alarm class and two notifications plus operations to query alarm instances and to subscribe to the alarm notifications.

The specification in this document has a structure that is very similar to the ETSI NFV specifications. The notifications differ in that an alarm notification in the NFV interface defines a single fault for a single resource, while the notification specified in this document can contain multiple faults for multiple resources. The Doctor specification is lacking the detailed time stamps of the NFV specification, which are essential for synchronization of the alarm list using the query operation. The detailed time stamps are also of value in the event and alarm history DBs.

AODH defines a base class for alarms, not the notifications. This means that some of the dynamic attributes of the ETSI NFV alarm type, like alarmRaisedTime, are not applicable to the AODH alarm class but are attributes of the actual notifications. (A description of these attributes will be added later.) The AODH alarm class lacks some attributes present in the NFV specification, namely fault details and correlated alarms. Instead, the AODH alarm class has attributes for actions, rules, and user and project IDs.

ETSI NFV Alarm Type OPNFV Doctor Requirement Specs AODH Event Alarm Notification Description / Comment Recommendations
alarmId FaultId alarm_id Identifier of an alarm. -
- - alarm_name Human readable alarm name. May be added in ETSI NFV Stage 3.
managedObjectId VirtualResourceId (reason) Identifier of the affected virtual resource is part of the AODH reason parameter. -
- - user_id, project_id User and project identifiers. May be added in ETSI NFV Stage 3.
alarmRaisedTime - - Timestamp when alarm was raised. To be added to Doctor and AODH. May be derived (e.g. in a shimlayer) from the AODH alarm history.
alarmChangedTime - - Timestamp when alarm was changed/updated. see above
alarmClearedTime - - Timestamp when alarm was cleared. see above
eventTime - - Timestamp when alarm was first observed by the Monitor. see above
- EventTime generated Timestamp of the Notification. Update parameter name in Doctor spec. May be added in ETSI NFV Stage 3.
state: E.g. Fired, Updated, Cleared VirtualResourceState: E.g. normal, down, maintenance, error current: ok, alarm, insufficient_data ETSI NFV IFA 005/006 lists example alarm states. Maintenance state is missing in AODH. List of alarm states will be specified in ETSI NFV Stage 3.
perceivedSeverity: E.g. Critical, Major, Minor, Warning, Indeterminate, Cleared Severity (Integer) Severity: low (default), moderate, critical ETSI NFV IFA 005/006 lists example perceived severity values.

List of alarm states will be specified in ETSI NFV Stage 3.

OPNFV: Severity (Integer):
  • update OPNFV Doctor specification to Enum
perceivedSeverity=Indeterminate:
  • remove value Indeterminate in IFA and map undefined values to “minor” severity, or
  • add value indeterminate in AODH and make it the default value.
perceivedSeverity=Cleared:
  • remove value Cleared in IFA as the information about a cleared alarm can be derived from the alarm state parameter, or
  • add value cleared in AODH and set a rule that the severity is “cleared” when the state is ok.
faultType FaultType event_type in reason_data Type of the fault, e.g. “CPU failure” of a compute resource, in machine interpretable format. OpenStack Alarming (Aodh) can use a fuzzy matching with wildcard string, “compute.cpu.failure”.
N/A N/A type = “event” Type of the notification. For fault notifications the type in AODH is “event”. -
probableCause ProbableCause - Probable cause of the alarm. May be provided (e.g. in a shimlayer) based on Vitrage topology awareness / root-cause-analysis.
isRootCause IsRootCause - Boolean indicating whether the fault is the root cause of other faults. see above
correlatedAlarmId CorrelatedFaultId - List of IDs of correlated faults. see above
faultDetails FaultDetails - Additional details about the fault/alarm. FaultDetails information element will be specified in ETSI NFV Stage 3.
- - action, previous Additional AODH alarm related parameters. -

Table: Comparison of alarm attributes

The primary area of improvement should be alignment of the perceived severity. This is important for a quick and accurate evaluation of the alarm. AODH should thus also support the X.733 values Critical, Major, Minor, Warning and Indeterminate.

The detailed time stamps (raised, changed, cleared) which are essential for synchronizing the alarm list using a query operation should be added to the Doctor specification.

Another area that needs alignment is the so-called alarm state in NFV. Here, however, we must consider what can be an attribute of the notification vs. what should be a property of the alarm instance. This will be analyzed later.

Detailed southbound interface specification

This section specifies the southbound interfaces for fault management between the Monitors and the Inspector. Although the southbound interfaces should be flexible enough to handle various events from different types of Monitors, we define a unified event API in order to improve interoperability between the Monitors and the Inspector. This does not limit the implementation of Monitor and Inspector, as these could be extended in order to support failures from intelligent inspection like prediction.

Note: The interface definition will be aligned with current work in ETSI NFV IFA working group.

Fault event interface

This interface allows the Monitors to notify the Inspector about an event which was captured by the Monitor and may affect resources managed in the VIM.

EventNotification

Event notification including a fault description. The entity of this notification is an event, not a fault or error specifically. This allows the use of a generic event format or framework outside of the Doctor project. The parameters below shall be mandatory, but keys in ‘Details’ can be optional.

Parameters:

  • Time [1]: Datetime when the fault was observed in the Monitor.
  • Type [1]: Type of event that will be used to process correlation in Inspector.
  • Details [0..1]: Details containing additional information with Key-value pair style. Keys shall be defined depending on the Type of the event.

E.g.:

{
    'event': {
        'time': '2016-04-12T08:00:00',
        'type': 'compute.host.down',
        'details': {
            'hostname': 'compute-1',
            'source': 'sample_monitor',
            'cause': 'link-down',
            'severity': 'critical',
            'status': 'down',
            'monitor_id': 'monitor-1',
            'monitor_event_id': '123',
        }
    }
}

Optional parameters in ‘Details’:

  • Hostname: the hostname on which the event occurred.
  • Source: the display name of the reporter of this event. This is not limited to the monitor; another entity such as ‘KVM’ can be specified.
  • Cause: a description of the cause of this event, which could be different from the type of this event.
  • Severity: the severity of this event set by the monitor.
  • Status: the status of the target object in which the error occurred.
  • MonitorID: the ID of the monitor sending this event.
  • MonitorEventID: the ID of the event in the monitor. This can be used by the operator when tracking the monitor log.
  • RelatedTo: an array of IDs related to this event.

Also, a bulk API can receive multiple events in a single HTTP POST message by using the ‘events’ wrapper as follows:

{
    'events': [
        {
            'event': {
                'time': '2016-04-12T08:00:00',
                'type': 'compute.host.down',
                'details': {},
            }
        },
        {
            'event': {
                'time': '2016-04-12T08:00:00',
                'type': 'compute.host.nic.error',
                'details': {},
            }
        }
    ]
}
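For illustration, a Monitor could deliver a single event to the Inspector over HTTP as sketched below; the requests library, the endpoint URL and all field values are assumptions in line with the sample implementation described later, not part of the interface specification:

import datetime
import json

import requests

def report_host_down(inspector_url, hostname):
    # Build an event following the EventNotification format above.
    event = {
        'time': datetime.datetime.utcnow().isoformat(),
        'type': 'compute.host.down',
        'details': {
            'hostname': hostname,
            'source': 'sample_monitor',
            'cause': 'link-down',
            'severity': 'critical',
            'status': 'down',
            'monitor_id': 'monitor-1',
            'monitor_event_id': '123',
        }
    }
    headers = {'content-type': 'application/json'}
    requests.post(inspector_url, data=json.dumps({'event': event}),
                  headers=headers)

report_host_down('http://127.0.0.1:12345/events', 'compute-1')
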
Blueprints

This section lists a first set of blueprints that have been proposed by the Doctor project to the open source community. Further blueprints addressing other gaps identified in Section 4 will be submitted at a later stage of the OPNFV project. In this section the following definitions are used:

  • “Event” is a message emitted by other OpenStack services such as Nova and Neutron and is consumed by the “Notification Agents” in Ceilometer.
  • “Notification” is a message generated by a “Notification Agent” in Ceilometer based on an “event” and is delivered to the “Collectors” in Ceilometer that store those notifications (as “sample”) to the Ceilometer “Databases”.
Instance State Notification (Ceilometer) [†]

The Doctor project is planning to handle “events” and “notifications” regarding Resource Status; Instance State, Port State, Host State, etc. Currently, Ceilometer already receives “events” to identify the state of those resources, but it does not handle and store them yet. This is why we also need a new event definition to capture those resource states from “events” created by other services.

This BP proposes to add a new compute notification state to handle events from an instance (server) from nova. It also creates a new meter “instance.state” in OpenStack.

[†]https://etherpad.opnfv.org/p/doctor_bps
Event Publisher for Alarm (Ceilometer) [‡]

Problem statement:

The existing “Alarm Evaluator” in OpenStack Ceilometer is periodically querying/polling the databases in order to check all alarms independently from other processes. This adds additional delay to the fault notification sent to the Consumer, whereas one requirement of Doctor is to react to faults as fast as possible.

The existing message flow is shown in figure12: after receiving an “event”, a “notification agent” (i.e. “event publisher”) will send a “notification” to a “Collector”. The “Collector” collects the notifications and updates the Ceilometer “Meter” database, which stores information about the “sample” captured from the original “event”. The “Alarm Evaluator” periodically polls this database, querying the “Meter” database based on each alarm configuration.

_images/figure12.png

Implementation plan in Ceilometer architecture

In the current Ceilometer implementation, there is no possibility to directly trigger the “Alarm Evaluator” when a new “event” is received; the “Alarm Evaluator” only finds out that a new notification has to be fired to the Consumer when polling the database.

Change/feature request:

This BP proposes to add a new “event publisher for alarm”, which bypasses several steps in Ceilometer in order to avoid the polling-based approach of the existing Alarm Evaluator, which makes notifications to users slow. See figure12.

After receiving an “(alarm) event” by listening on the Ceilometer message queue (“notification bus”), the new “event publisher for alarm” immediately hands a “notification” about this event to a new Ceilometer component “Notification-driven alarm evaluator” proposed in the other BP (see Section 5.6.3).

Note, the term “publisher” refers to an entity in the Ceilometer architecture (it is a “notification agent”). It offers the capability to provide notifications to other services outside of Ceilometer, but it is also used to deliver notifications to other Ceilometer components (e.g. the “Collectors”) via the Ceilometer “notification bus”.

Implementation detail

  • “Event publisher for alarm” is part of Ceilometer
  • The standard AMQP message queue is used with a new topic string.
  • No new interfaces have to be added to Ceilometer.
  • “Event publisher for Alarm” can be configured by the Administrator of Ceilometer to be used as “Notification Agent” in addition to the existing “Notifier”
  • Existing alarm mechanisms of Ceilometer can be used allowing users to configure how to distribute the “notifications” transformed from “events”, e.g. there is an option whether an ongoing alarm is re-issued or not (“repeat_actions”).
[‡]https://etherpad.opnfv.org/p/doctor_bps
Notification-driven alarm evaluator (Ceilometer) [§]

Problem statement:

The existing “Alarm Evaluator” in OpenStack Ceilometer is periodically querying/polling the databases in order to check all alarms independently from other processes. This adds additional delay to the fault notification sent to the Consumer, whereas one requirement of Doctor is to react to faults as fast as possible.

Change/feature request:

This BP proposes to add an alternative “Notification-driven Alarm Evaluator” for Ceilometer that receives the “notifications” sent by the “Event Publisher for Alarm” described in the other BP. Once this new “Notification-driven Alarm Evaluator” receives a “notification”, it finds the “alarm” configurations which may relate to the “notification” by querying the “alarm” database with some keys, i.e. the resource ID, and then it evaluates each alarm with the information in that “notification”.

After the alarm evaluation, it behaves the same way as the existing “alarm evaluator” does when firing an alarm notification to the Consumer. Similar to the existing Alarm Evaluator, this new “Notification-driven Alarm Evaluator” aggregates and correlates different alarms, which are then provided northbound to the Consumer via the OpenStack “Alarm Notifier”. The user/administrator can register the alarm configuration via the existing Ceilometer API [¶], thereby configuring whether to set an alarm or not and where to send the alarms.

Implementation detail

  • The new “Notification-driven Alarm Evaluator” is part of Ceilometer.
  • Most of the existing source code of the “Alarm Evaluator” can be re-used to implement this BP
  • No additional application logic is needed
  • It will access the Ceilometer Databases just like the existing “Alarm evaluator”
  • Only the polling-based approach will be replaced by a listener for “notifications” provided by the “Event Publisher for Alarm” on the Ceilometer “notification bus”.
  • No new interfaces have to be added to Ceilometer.
[§]https://etherpad.opnfv.org/p/doctor_bps
[¶]https://wiki.openstack.org/wiki/Ceilometer/Alerting
Report host fault to update server state immediately (Nova) [#]

Problem statement:

  • The Nova state change for a failed or unreachable host is slow and does not reliably indicate whether the host is down or not. This might cause the same server instance to run twice if an action is taken to evacuate the instance to another host.
  • The Nova state for server(s) on a failed host will not change, but remains active and running. This gives the user false information about the server state.
  • The VIM northbound interface notification of host faults towards VNFM and NFVO should be in line with the OpenStack state. This fault notification is a Telco requirement defined in ETSI and will be implemented by the OPNFV Doctor project.
  • OpenStack users cannot perform HA actions fast and reliably by trusting the server state and host state.

Proposed change:

A new API is needed for the Admin to declare that a host is down. This API is used to mark the services running on the host as down, to reflect the real situation.

An example for a compute node is:

  • When compute node is up and running::

    vm_state: active and power_state: running
    nova-compute state: up status: enabled
    
  • When compute node goes down and new API is called to state host is down::

    vm_state: stopped power_state: shutdown
    nova-compute state: down status: enabled
    

Alternatives:

There is no attractive alternative for detecting all the different host faults other than having an external tool detect them. For this kind of tool to exist, there needs to be a new API in Nova to report the fault. Currently some kind of workaround must be implemented, as one cannot trust or obtain the states from OpenStack fast enough.

[#]https://blueprints.launchpad.net/nova/+spec/update-server-state-immediately
Summary and conclusion

The Doctor project aimed at detailing NFVI fault management and NFVI maintenance requirements. These are indispensable operations for an Operator and are essential to realize telco-grade high availability. High availability is a large topic; the objective of Doctor is not to realize a complete high availability architecture and implementation. Instead, Doctor limited itself to addressing fault events in the NFVI, and proposes the enhancements necessary in the VIM, e.g. OpenStack, to ensure VNF availability in such fault events, taking a Telco VNF application-level management system into account.

The Doctor project performed a thorough analysis of the requirements for NFVI fault management and NFVI maintenance operations and identified concrete gaps between those requirements and the current implementation of OpenStack. A detailed architecture and interface specification has been described in this document, and the work to realize the Doctor features and close the identified gaps in the upstream communities is in the final stages of development.

Annex: NFVI Faults

Faults in the listed elements need to be immediately notified to the Consumer in order to perform an immediate action like live migration or switch to a hot standby entity. In addition, the Administrator of the host should trigger a maintenance action to, e.g., reboot the server or replace a defective hardware element.

Faults can be of different severity, i.e., critical, warning, or info. Critical faults require immediate action as a severe degradation of the system has happened or is expected. Warnings indicate that the system performance is going down: related actions include closer (e.g. more frequent) monitoring of that part of the system or preparation for a cold migration to a backup VM. Info messages do not require any action. We also consider a type “maintenance”, which is not a real fault, but may trigger maintenance actions like a reboot of the server or the replacement of faulty but redundant HW.

Faults can be gathered by, e.g., enabling SNMP and installing open source tools to catch and poll SNMP data. When using, for example, Zabbix, one can also run an agent on the hosts to catch any other fault. In any case of failure, the Administrator should be notified. The following tables provide a list of high level faults that are considered within the scope of the Doctor project and require immediate action by the Consumer.

Compute/Storage

Fault Severity How to detect? Comment Immediate action to recover
Processor/CPU failure, CPU condition not ok Critical Zabbix   Switch to hot standby
Memory failure/ Memory condition not ok Critical Zabbix (IPMI)   Switch to hot standby
Network card failure, e.g. network adapter connectivity lost Critical Zabbix/ Ceilometer   Switch to hot standby
Disk crash Info RAID monitoring Network storage is very redundant (e.g. RAID system) and can guarantee high availability Inform OAM
Storage controller Critical Zabbix (IPMI)   Live migration if storage is still accessible; otherwise hot standby
PDU/power failure, power off, server reset Critical Zabbix/ Ceilometer   Switch to hot standby
Power degradation, power redundancy lost, power threshold exceeded Warning SNMP   Live migration
Chassis problem (e.g. fan degraded/failed, chassis power degraded), CPU fan problem, temperature/ thermal condition not ok Warning SNMP   Live migration
Mainboard failure Critical Zabbix (IPMI) e.g. PCIe, SAS link failure Switch to hot standby
OS crash (e.g. kernel panic) Critical Zabbix   Switch to hot standby

Hypervisor

Fault Severity How to detect? Comment Immediate action to recover
System has restarted Critical Zabbix   Switch to hot standby
Hypervisor failure Warning/ Critical Zabbix/ Ceilometer   Evacuation/switch to hot standby
Hypervisor status not retrievable after certain period Warning Alarming service Zabbix/ Ceilometer unreachable Rebuild VM

Network

Fault Severity How to detect? Comment Immediate action to recover
SDN/OpenFlow switch, controller degraded/failed Critical Ceilometer   Switch to hot standby or reconfigure virtual network topology
Hardware failure of physical switch/router Warning SNMP Redundancy of physical infrastructure is reduced or no longer available Live migration if possible otherwise evacuation
References and bibliography
[DOCT] OPNFV, “Doctor” requirements project, [Online]. Available at https://wiki.opnfv.org/doctor
[PRED] OPNFV, “Data Collection for Failure Prediction” requirements project, [Online]. Available at https://wiki.opnfv.org/prediction
[OPSK] OpenStack, [Online]. Available at https://www.openstack.org/
[CEIL] OpenStack Telemetry (Ceilometer), [Online]. Available at https://wiki.openstack.org/wiki/Ceilometer
[NOVA] OpenStack Nova, [Online]. Available at https://wiki.openstack.org/wiki/Nova
[NEUT] OpenStack Neutron, [Online]. Available at https://wiki.openstack.org/wiki/Neutron
[CIND] OpenStack Cinder, [Online]. Available at https://wiki.openstack.org/wiki/Cinder
[MONA] OpenStack Monasca, [Online]. Available at https://wiki.openstack.org/wiki/Monasca
[OSAG] OpenStack Cloud Administrator Guide, [Online]. Available at http://docs.openstack.org/admin-guide-cloud/content/
[ZABB] ZABBIX, the Enterprise-class Monitoring Solution for Everyone, [Online]. Available at http://www.zabbix.com/
[ENFV] ETSI NFV, [Online]. Available at http://www.etsi.org/technologies-clusters/technologies/nfv
Doctor Installation Guide
Doctor Configuration

OPNFV installers install most components of the Doctor framework, including OpenStack Nova, Neutron and Cinder (Doctor Controller) and OpenStack Ceilometer and Aodh (Doctor Notifier), but not the Doctor Monitor.

After the major components of OPNFV are deployed, you can set up the Doctor functions by following the instructions in this section. You can also learn the detailed steps in setup_installer() under doctor/tests.

Doctor Inspector

You need to configure one of the Doctor Inspectors below.

Doctor Sample Inspector

The Sample Inspector is intended to show the minimum functions of a Doctor Inspector.

The Doctor Sample Inspector is suggested to be placed on one of the controller nodes, but it can be put on any host that the Doctor Monitor can reach and that can access the OpenStack Controller (Nova).

Make sure OpenStack env parameters are set properly, so that Doctor Inspector can issue admin actions such as compute host force-down and state update of VM.

Then, you can configure Doctor Inspector as follows:

git clone https://gerrit.opnfv.org/gerrit/doctor -b stable/danube
cd doctor/tests
INSPECTOR_PORT=12345
python inspector.py $INSPECTOR_PORT > inspector.log 2>&1 &

Congress

OpenStack Congress provides Governance as a Service (previously Policy as a Service). Congress can implement the Doctor Inspector, as it can inspect a fault situation and propagate errors to other entities.

Congress is deployed by the OPNFV installers. You need to enable the doctor datasource driver and set the policy rules. With the example configuration below, Congress will force down the nova-compute service when it receives a fault event for that compute host. Congress will also set the state of all VMs running on that host from ACTIVE to ERROR.

openstack congress datasource create doctor doctor

openstack congress policy rule create \
    --name host_down classification \
    'host_down(host) :-
        doctor:events(hostname=host, type="compute.host.down", status="down")'

openstack congress policy rule create \
    --name active_instance_in_host classification \
    'active_instance_in_host(vmid, host) :-
        nova:servers(id=vmid, host_name=host, status="ACTIVE")'

openstack congress policy rule create \
    --name host_force_down classification \
    'execute[nova:services.force_down(host, "nova-compute", "True")] :-
        host_down(host)'

openstack congress policy rule create \
    --name error_vm_states classification \
    'execute[nova:servers.reset_state(vmid, "error")] :-
        host_down(host),
        active_instance_in_host(vmid, host)'
Doctor Monitor

Doctor Sample Monitor

Doctor Monitors are suggested to be placed on one of the controller nodes, but they can be put on any host that can reach the target compute host and that is accessible by the Doctor Inspector. You need to configure a Monitor for each compute host, one by one.

Make sure OpenStack env parameters are set properly, so that Doctor Inspector can issue admin actions such as compute host force-down and state update of VM.

Then, you can configure the Doctor Monitor as follows (Example for Apex deployment):

git clone https://gerrit.opnfv.org/gerrit/doctor -b stable/danube
cd doctor/tests
INSPECTOR_PORT=12345
COMPUTE_HOST='overcloud-novacompute-1.localdomain.com'
COMPUTE_IP=192.30.9.5
sudo python monitor.py "$COMPUTE_HOST" "$COMPUTE_IP" \
    "http://127.0.0.1:$INSPECTOR_PORT/events" > monitor.log 2>&1 &
Doctor User Guide
Doctor capabilities and usage

figure1 shows the currently implemented and tested architecture of Doctor. The implementation is based on OpenStack and related components. The Monitor can be realized by a sample Python-based implementation provided in the Doctor code repository. The Controller is realized by OpenStack Nova, Neutron and Cinder for compute, network and storage, respectively. The Inspector can be realized by OpenStack Congress or a sample Python-based implementation also available in the code repository of Doctor. The Notifier is realized by OpenStack Aodh.

_images/figure11.png

Implemented and tested architecture

Immediate Notification

Immediate notification can be used by creating an ‘event’ type alarm via the OpenStack Alarming (Aodh) API, with support from the relevant internal components.

See the upstream spec document: http://specs.openstack.org/openstack/ceilometer-specs/specs/liberty/event-alarm-evaluator.html

An example of a consumer of this notification can be found in the Doctor repository. It can be executed as follows:

git clone https://gerrit.opnfv.org/gerrit/doctor -b stable/danube
cd doctor/tests
CONSUMER_PORT=12346
python consumer.py "$CONSUMER_PORT" > consumer.log 2>&1 &
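
For reference, an ‘event’ type alarm that points at such a consumer can be created roughly as follows; the exact option names depend on the Aodh client release, and the alarm name, VM UUID, port and URL are placeholders:

aodh alarm create --type event --name vm_down_alarm \
    --event-type "compute.instance.update" \
    --query "traits.state=string::error; traits.instance_id=string::<vm uuid>" \
    --alarm-action "http://127.0.0.1:12346/failure"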
Consistent resource state awareness

The resource state of a compute host can be changed/updated according to a trigger from a monitor running outside of OpenStack Compute (Nova) by using the force-down API.

See http://artifacts.opnfv.org/doctor/danube/manuals/mark-host-down_manual.html for more detail.
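
As an example, an external monitor (or operator) can mark the nova-compute service on a failed host as down with the Nova client, assuming compute API microversion 2.11 or later; the host name is a placeholder:

nova --os-compute-api-version 2.11 service-force-down \
    overcloud-novacompute-1.localdomain.com nova-compute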

Valid compute host status given to VM owner

The resource state of a compute host can be retrieved by a user with the OpenStack Compute (Nova) servers API.

See http://artifacts.opnfv.org/doctor/danube/manuals/get-valid-server-state.html for more detail.
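
For example, with compute API microversion 2.16 or later the server details include a host_status field reflecting the state of the underlying compute host (subject to policy); the command form and server ID are placeholders:

nova --os-compute-api-version 2.16 show <server-id> | grep host_status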

Design Documents

This directory stores design documents, which may include draft versions of blueprints written before they were proposed to upstream OSS communities such as OpenStack, in order to keep the original blueprints as reviewed in OPNFV. That means there could be outdated blueprints as a result of further refinements in the upstream OSS communities. Please refer to the link in each document to find the latest version of the blueprint and the status of development in the relevant OSS community.

See also https://wiki.opnfv.org/requirements_projects .

Note

This is a specification draft of a blueprint proposed for OpenStack Nova Liberty. It was written by project member(s) and agreed within the project before submitting it upstream. No further changes to its content will be made here anymore; please follow it upstream:

The original draft is as follows:

Report host fault to update server state immediately

https://blueprints.launchpad.net/nova/+spec/update-server-state-immediately

A new API is needed to report a host fault to change the state of the instances and compute node immediately. This allows usage of evacuate API without a delay. The new API provides the possibility for external monitoring system to detect any kind of host failure fast and reliably and inform OpenStack about it. Nova updates the compute node state and states of the instances. This way the states in the Nova DB will be in sync with the real state of the system.

Problem description
  • The Nova state change for a failed or unreachable host is slow and does not reliably indicate whether the compute node is down or not. This might cause the same instance to run twice if an action is taken to evacuate the instance to another host.
  • The Nova state for instances on a failed compute node will not change, but remains active and running. This gives the user false information about the instance state. Currently one would need to call “nova reset-state” for each instance to put them into error state.
  • OpenStack users cannot perform HA actions fast and reliably by trusting the instance state and compute node state.
  • As the compute node state changes slowly, one cannot evacuate instances.
Use Cases

The general use case is that, in case of a host fault, the compute node state should be changed fast and reliably when using the DB servicegroup backend. On top of this, the following cases are currently not covered with respect to having instance states changed correctly:

  • Management network connectivity lost between controller and compute node.
  • Host HW failed.

Generic use case flow:

  • The external monitoring system detects a host fault.
  • The external monitoring system fences the host if not down already.
  • The external system calls the new Nova API to force the failed compute node into down state as well as instances running on it.
  • Nova updates the compute node state and state of the effected instances to Nova DB.

Currently the nova-compute state will change to “down”, but it takes a long time. The server state stays at “vm_state: active” and “power_state: running”, which is not correct. By having an external tool detect host faults fast, fence the host by powering it down, and then report the host as down to OpenStack, all these states would reflect the actual situation. Also, if OpenStack does not implement automatic actions for fault correlation, an external tool can do that. This could, for example, easily be configured in the server instance METADATA and be read by the external tool.

Project Priority

Liberty priorities have not yet been defined.

Proposed change

A new API is needed for the Admin to declare that a host is down. This API is used to mark the compute node and the instances running on it as down, to reflect the real situation.

An example for a compute node is:

  • When compute node is up and running: vm_state: active and power_state: running nova-compute state: up status: enabled
  • When compute node goes down and new API is called to state host is down: vm_state: stopped power_state: shutdown nova-compute state: down status: enabled

The vm_state values soft-delete, deleted, resized and error should not be touched. The effect on task_state needs to be worked out, if it needs to be touched at all.

Alternatives

There are no attractive alternatives for detecting all the different host faults other than having an external tool detect them. For this kind of tool to exist, there needs to be a new API in Nova to report the fault. Currently some kind of workaround must have been implemented, as one cannot trust or obtain the states from OpenStack fast enough.

Data model impact

None

REST API impact
  • Update CLI to report host is down

    nova host-update command

    usage: nova host-update [--status <enable|disable>]
        [--maintenance <enable|disable>] [--report-host-down] <hostname>

    Update host settings.

    Positional arguments

    <hostname> Name of host.

    Optional arguments

    --status <enable|disable> Either enable or disable a host.

    --maintenance <enable|disable> Either put or resume host to/from maintenance.

    --down Report host down to update instance and compute node state in db.

  • Update Compute API to report host is down:

    /v2.1/{tenant_id}/os-hosts/{host_name}

    Normal response codes: 200

    Request parameters

    Parameter Style Type Description
    host_name URI xsd:string The name of the host of interest to you.

    Example request body:

    {
        "host": {
            "status": "enable",
            "maintenance_mode": "enable",
            "host_down_reported": "true"
        }
    }

    Example response body:

    {
        "host": {
            "host": "65c5d5b7e3bd44308e67fc50f362aee6",
            "maintenance_mode": "enabled",
            "status": "enabled",
            "host_down_reported": "true"
        }
    }

  • A new method in the nova.compute.api module HostAPI class to mark the compute node and the instances related to the host as down: set_host_down(context, host_name)

  • The novaclient.v2.hosts.HostManager(api) class method update(host, values) needs to handle reporting a host as down.

  • The schema does not need changes, as only the service and server states are to be changed in the DB.
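
A hypothetical curl invocation of the extended request described above is sketched below; the host_down_reported field exists only in this proposal (not in the current Nova API), and the endpoint, tenant and token handling are placeholders:

curl -X PUT "http://controller:8774/v2.1/{tenant_id}/os-hosts/compute-1" \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $TOKEN" \
    -d '{"host": {"status": "enable", "maintenance_mode": "enable",
                  "host_down_reported": "true"}}'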

Security impact

API call needs admin privileges (in the default policy configuration).

Notifications impact

None

Other end user impact

None

Performance Impact

The only impact is that the user can get information about the instance and compute node state faster. This also makes faster evacuation possible. There is no impact that would slow anything down; a host going down should be a rare occurrence.

Other deployer impact

A deployer can make use of any external tool to detect host faults and report them to OpenStack.

Developer impact

None

Implementation
Assignee(s)

Primary assignee: Tomi Juvonen Other contributors: Ryota Mibu

Work Items
  • Test cases.
  • API changes.
  • Documentation.
Dependencies

None

Testing

Test cases that exist for enabling a host or putting it into maintenance should be altered, or similar new cases created, to test the new functionality.

Documentation Impact

New API needs to be documented:

References
Notification Alarm Evaluator

Note

This is a spec draft of a blueprint for OpenStack Ceilometer Liberty. To see the current version: https://review.openstack.org/172893 To track development activity: https://blueprints.launchpad.net/ceilometer/+spec/notification-alarm-evaluator

https://blueprints.launchpad.net/ceilometer/+spec/notification-alarm-evaluator

This blueprint proposes to add a new alarm evaluator for handling alarms on events passed from other OpenStack services. It provides event-driven alarm evaluation, introducing a new sequence in Ceilometer instead of the polling-based approach of the existing Alarm Evaluator, and realizes immediate alarm notification to end users.

Problem description

As an end user, I need to receive an alarm notification immediately once Ceilometer has captured an event which would make an alarm fire, so that I can perform recovery actions promptly to shorten the downtime of my service. The typical use case is that an end user sets an alarm on “compute.instance.update” in order to trigger recovery actions once the instance status has changed to ‘shutdown’ or ‘error’. Ideally, an end user can receive the notification within 1 second after the fault is observed, as other health-check mechanisms can do in some cases.

The existing Alarm Evaluator periodically queries/polls the databases in order to check all alarms independently from other processes. This is a good approach for evaluating an alarm on samples stored over a certain period. However, it is not efficient for evaluating an alarm on events which are emitted by other OpenStack services only once in a while.

The periodical evaluation leads to a delay in sending alarm notifications to users. The default evaluation cycle is 60 seconds. It is recommended that an operator set an interval longer than the configured pipeline interval for the underlying metrics, and also long enough to evaluate all defined alarms within that period, taking into account the number of resources, users and alarms.

Proposed change

The proposal is to add a new event-driven alarm evaluator which receives messages from the Notification Agent, finds the related alarms and then evaluates each alarm (see the sketch after the list below):

  • The new alarm evaluator can receive event notifications from the Notification Agent by adding a dedicated notifier as a publisher in pipeline.yaml (e.g. notifier://?topic=event_eval).
  • When the new alarm evaluator receives an event notification, it queries the alarm database by the Project ID and Resource ID written in the event notification.
  • The found alarms are evaluated by referring to the event notification.
  • Depending on the result of the evaluation, those alarms are fired through the Alarm Notifier, the same way the existing Alarm Evaluator does.
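
A minimal sketch of this flow; the helper functions for the alarm database and the alarm notifier, as well as the event dictionary layout, are illustrative assumptions rather than Ceilometer code:

# Minimal sketch of the event-driven evaluation flow described above.
# get_alarms(), fire_alarm() and the event dictionary layout are assumptions.
def evaluate_event(event, get_alarms, fire_alarm):
    """Evaluate "notification" alarms against a single event notification."""
    project_id = event.get("project_id")
    resource_id = event.get("traits", {}).get("resource_id")

    # 1. Find candidate alarms by project and (optionally) resource.
    alarms = get_alarms(project_id=project_id, resource_id=resource_id,
                        alarm_type="notification")

    for alarm in alarms:
        rule = alarm["notification_rule"]
        # 2. The alarm only applies to the event type it was created for.
        if rule["event_type"] != event["event_type"]:
            continue
        # 3. Check the "eq" conditions of the rule's query against the event traits
        #    (other operators are omitted in this sketch).
        traits = event.get("traits", {})
        matched = all(
            str(traits.get(cond["field"].replace("traits.", ""))) == cond["value"]
            for cond in rule["query"] if cond["op"] == "eq")

        # 4. Fire the alarm through the Alarm Notifier, as the existing evaluator does.
        if matched:
            fire_alarm(alarm, reason="event %s matched" % event["event_type"])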

This proposal also adds a new alarm type “notification” and a “notification_rule”. This enables users to create alarms on events. The separation from other alarm types (such as the “threshold” type) is intended to reflect the different timing of evaluation and the different format of the condition, since the new evaluator checks each event notification as soon as it is received, whereas a “threshold” alarm can evaluate the average of values over a certain period calculated from multiple samples.

The new alarm evaluator handles notification-type alarms, so the existing alarm evaluator has to be changed to exclude “notification” type alarms from its evaluation targets.

Alternatives

There was a similar blueprint proposal, “Alarm type based on notification”, but the approach is different. The old proposal was to add a new step (alarm evaluation) in the Notification Agent every time it receives an event from other OpenStack services, whereas this proposal intends to execute alarm evaluation in a separate component, which minimizes the impact on existing pipeline processing.

Another approach is to enhance the existing alarm evaluator by adding a notification listener. However, there are two issues: 1) this approach could stall the periodical evaluations when a bulk of notifications is received, and 2) it could break alarm partitioning, i.e. when the alarm evaluator receives a notification, it might have to evaluate alarms which are not assigned to it.

Data model impact

Resource ID will be added to the Alarm model as an optional attribute. This would help the new alarm evaluator filter out non-related alarms while querying alarms; otherwise it would have to evaluate all alarms in the project.

REST API impact

The Alarm API will be extended as follows:

  • Add “notification” type into alarm type list
  • Add “resource_id” to “alarm”
  • Add “notification_rule” to “alarm”

Sample data of Notification-type alarm:

{
    "alarm_actions": [
        "http://site:8000/alarm"
    ],
    "alarm_id": null,
    "description": "An alarm",
    "enabled": true,
    "insufficient_data_actions": [
        "http://site:8000/nodata"
    ],
    "name": "InstanceStatusAlarm",
    "notification_rule": {
        "event_type": "compute.instance.update",
        "query" : [
            {
                "field" : "traits.state",
                "type" : "string",
                "value" : "error",
                "op" : "eq",
            },
        ]
    },
    "ok_actions": [],
    "project_id": "c96c887c216949acbdfbd8b494863567",
    "repeat_actions": false,
    "resource_id": "153462d0-a9b8-4b5b-8175-9e4b05e9b856",
    "severity": "moderate",
    "state": "ok",
    "state_timestamp": "2015-04-03T17:49:38.406845",
    "timestamp": "2015-04-03T17:49:38.406839",
    "type": "notification",
    "user_id": "c96c887c216949acbdfbd8b494863567"
}

“resource_id” will be referred to when querying alarms; permission and project ownership of the resource will not be checked.

Security impact

None

Pipeline impact

None

Other end user impact

None

Performance/Scalability Impacts

When Ceilometer receives a large number of events from other OpenStack services in a short period, this alarm evaluator can keep working since events are queued in a messaging queue system, but it can delay alarm notifications to users and increase the number of read and write accesses to the alarm database.

“resource_id” can be optional, but making it mandatory could reduce the performance impact. If a user creates a “notification” alarm without “resource_id”, that alarm will be evaluated every time an event occurs in the project, which may make the new evaluator heavily loaded.

Other deployer impact

A new service process has to be run.

Developer impact

When defining events and traits, developers should be aware that events can be notified to end users, and should avoid passing raw infrastructure information to end users.

Implementation
Assignee(s)
Primary assignee:
r-mibu
Other contributors:
None
Ongoing maintainer:
None
Work Items
  • New event-driven alarm evaluator
  • Add new alarm type “notification” as well as AlarmNotificationRule
  • Add “resource_id” to Alarm model
  • Modify existing alarm evaluator to filter out “notification” alarms
  • Add new config parameter for alarm request check whether accepting alarms without specifying “resource_id” or not
Future lifecycle

This proposal is a key feature for providing information about cloud resources to end users in real time, which enables efficient integration with a user-side manager or orchestrator, whereas currently that information is considered to be consumed by admin-side tools or services. Based on this change, we will seek orchestration scenarios including fault recovery and add useful event definitions as well as additional traits.

Dependencies

None

Testing

New unit/scenario tests are required for this change.

Documentation Impact
  • Proposed evaluator will be described in the developer document.
  • New alarm type and how to use will be explained in user guide.
References
Neutron Port Status Update

Note

This document represents a Neutron RFE reviewed in the Doctor project before submitting upstream to Launchpad Neutron space. The document is not intended to follow a blueprint format or to be an extensive document. For more information, please visit http://docs.openstack.org/developer/neutron/policies/blueprints.html

The RFE was submitted to Neutron. You can follow the discussions in https://bugs.launchpad.net/neutron/+bug/1598081

Neutron port status field represents the current status of a port in the cloud infrastructure. The field can take one of the following values: ‘ACTIVE’, ‘DOWN’, ‘BUILD’ and ‘ERROR’.

At present, if a network event occurs in the data plane (e.g. a virtual or physical switch or one of its ports fails, a cable gets pulled unintentionally, the infrastructure topology changes, etc.), connectivity to logical ports may be affected and tenants’ services interrupted. When tenants/cloud administrators look up their resources’ status (e.g. Nova instances and the services running in them, network ports, etc.), they will wrongly see that everything looks fine. The problem is that Neutron will continue reporting the port ‘status’ as ‘ACTIVE’.

Many SDN Controllers managing network elements have the ability to detect and report network events to upper layers. This allows SDN Controllers’ users to be notified of changes and react accordingly. Such information could be consumed by Neutron so that Neutron could update the ‘status’ field of those logical ports, and additionally generate a notification message to the message bus.

However, Neutron lacks a way to receive such information, e.g. through an ML2 driver or the REST API (the ‘status’ field is read-only). There are pros and cons to both of these approaches, as well as to other possible approaches. This RFE intends to trigger a discussion on how Neutron could be improved to receive fault/change events from SDN Controllers or even also from 3rd parties not in charge of controlling the network (e.g. monitoring systems, human admins).

Port data plane status

https://bugs.launchpad.net/neutron/+bug/1598081

Neutron does not detect data plane failures affecting its logical resources. This spec addresses that issue by means of allowing external tools to report to Neutron about faults in the data plane that are affecting the ports. A new REST API field is proposed to that end.

Problem Description

An initial description of the problem was introduced in bug #1598081 [1]. This spec focuses on capturing one (main) part of the problem described there, i.e. extending Neutron’s REST API to cover the scenario of allowing external tools to report network failures to Neutron. Out of scope of this spec is work to enable port status changes to be received and managed by mechanism drivers.

This spec also tries to address bug #1575146 [2]. Specifically, and argued by the Neutron driver team in [3]:

  • Neutron should not shut down the port completely upon detection of a physnet failure; connectivity between instances on the same node may still work. External tools may or may not want to trigger a status change on the port based on their own logic and orchestration.
  • Port down is not detected when an uplink of a switch is down;
  • The physnet bridge may have multiple physical interfaces plugged; shutting down the logical port may not be needed in case network redundancy is in place.
Proposed Change

A couple of possible approaches were proposed in [1] (comment #3). This spec proposes tackling the problem via a new extension API to the port resource. The extension adds a new attribute ‘dp-down’ (data plane down) to represent the status of the data plane. The field should be read-only for tenants and read-write for admins.

Neutron should send out an event to the message bus upon toggling the data plane status value. The event is relevant for e.g. auditing.

Data Model Impact

A new attribute as extension will be added to the ‘ports’ table.

Attribute Name   Type      Access                    Default Value   Validation/Conversion   Description
dp_down          boolean   RO (tenant), RW (admin)   False           True/False
REST API Impact

A new API extension to the ports resource is going to be introduced.

EXTENDED_ATTRIBUTES_2_0 = {
    'ports': {
        'dp_down': {'allow_post': False, 'allow_put': True,
                    'default': False, 'convert_to': convert_to_boolean,
                    'is_visible': True},
    },
}
Examples

Updating port data plane status to down:

PUT /v2.0/ports/<port-uuid>
Accept: application/json
{
    "port": {
        "dp_down": true
    }
}
Command Line Client Impact
neutron port-update [--dp-down <True/False>] <port>
openstack port set [--dp-down <True/False>] <port>

Argument --dp-down is optional. Defaults to False.

Security Impact

None

Notifications Impact

A notification (event) upon toggling the data plane status (i.e. ‘dp-down’ attribute) value should be sent to the message bus. Such events do not happen with high frequency and thus no negative impact on the notification bus is expected.

Performance Impact

None

IPv6 Impact

None

Other Deployer Impact

None

Developer Impact

None

Implementation
Assignee(s)
  • cgoncalves
Work Items
  • New ‘dp-down’ attribute in ‘ports’ database table
  • API extension to introduce new field to port
  • Client changes to allow for the data plane status (i.e. ‘dp-down’ attribute) being set
  • Policy (tenants read-only; admins read-write)
Documentation Impact

Documentation for both administrators and end users will have to be considered. Administrators will need to know how to set/unset the data plane status field.

References
[1]RFE: Port status update, https://bugs.launchpad.net/neutron/+bug/1598081
[2]RFE: ovs port status should the same as physnet https://bugs.launchpad.net/neutron/+bug/1575146
[3]Neutron Drivers meeting, July 21, 2016 http://eavesdrop.openstack.org/meetings/neutron_drivers/2016/neutron_drivers.2016-07-21-22.00.html
Inspector Design Guideline

Note

This is a draft spec of the design guideline for the inspector component. JIRA ticket to track updates and collect comments: DOCTOR-73.

This document summarizes the best practices for designing a high-performance inspector that meets the requirements of the OPNFV Doctor project.

Problem Description

Some pitfalls have been detected during the development of the sample inspector, e.g. we suffered a significant performance degradation when listing the VMs on a host.

A patch set for caching the list has been committed to solve this issue. When a new inspector is integrated, it would be good to evaluate the existing design and give recommendations for improvements.
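
As an illustration of the caching idea mentioned above, the sketch below keeps a per-host VM list with a short time-to-live so that the inspector does not query the cloud API on every notification; the Nova client call and the TTL value are assumptions, not the sample inspector's actual code.

# Minimal sketch (assumption): cache the host-to-VMs mapping with a short TTL.
import time

class HostVMCache(object):
    def __init__(self, nova_client, ttl=10.0):
        self.nova = nova_client
        self.ttl = ttl
        self._cache = {}          # host name -> (timestamp, [servers])

    def servers_on_host(self, hostname):
        now = time.time()
        entry = self._cache.get(hostname)
        if entry and now - entry[0] < self.ttl:
            return entry[1]       # fresh enough, avoid another API round trip
        servers = self.nova.servers.list(
            search_opts={"host": hostname, "all_tenants": 1})
        self._cache[hostname] = (now, servers)
        return servers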

This document can be treated as a source of related blueprints in inspector projects.

Guidelines
Host specific VMs list

TBD, see DOCTOR-76.

Parallel execution

TBD, see discussion in mailing list.

Performance Profiler

https://goo.gl/98Osig

This blueprint proposes to create a performance profiler for doctor scenarios.

Problem Description

In the verification job for notification time, we have encountered some performance issues, such as

  1. In an environment deployed by Apex the criteria are met, while in one deployed by Fuel the performance is much poorer.
  2. Significant performance degradation was spotted when we increased the total number of VMs.

It takes time to dig through the logs and analyse the reason. People have to collect timestamps at each checkpoint manually to find the bottleneck. A performance profiler will make this process automatic.

Proposed Change

Current Doctor scenario covers the inspector and notifier in the whole fault management cycle:

start                                          end
  +       +         +        +       +          +
  |       |         |        |       |          |
  |monitor|inspector|notifier|manager|controller|
  +------>+         |        |       |          |
occurred  +-------->+        |       |          |
  |     detected    +------->+       |          |
  |       |     identified   +-------+          |
  |       |               notified   +--------->+
  |       |                  |    processed  resolved
  |       |                  |                  |
  |       +<-----doctor----->+                  |
  |                                             |
  |                                             |
  +<---------------fault management------------>+

The notification time can be split into several parts and visualized as a timeline:

start                                         end
  0----5---10---15---20---25---30---35---40---45--> (x 10ms)
  +    +   +   +   +    +      +   +   +   +   +
0-hostdown |   |   |    |      |   |   |   |   |
  +--->+   |   |   |    |      |   |   |   |   |
  |  1-raw failure |    |      |   |   |   |   |
  |    +-->+   |   |    |      |   |   |   |   |
  |    | 2-found affected      |   |   |   |   |
  |    |   +-->+   |    |      |   |   |   |   |
  |    |     3-marked host down|   |   |   |   |
  |    |       +-->+    |      |   |   |   |   |
  |    |         4-set VM error|   |   |   |   |
  |    |           +--->+      |   |   |   |   |
  |    |           |  5-notified VM error  |   |
  |    |           |    +----->|   |   |   |   |
  |    |           |    |    6-transformed event
  |    |           |    |      +-->+   |   |   |
  |    |           |    |      | 7-evaluated event
  |    |           |    |      |   +-->+   |   |
  |    |           |    |      |     8-fired alarm
  |    |           |    |      |       +-->+   |
  |    |           |    |      |         9-received alarm
  |    |           |    |      |           +-->+
sample | sample    |    |      |           |10-handled alarm
monitor| inspector |nova| c/m  |    aodh   |
  |                                        |
  +<-----------------doctor--------------->+

Note: c/m = ceilometer

And a table of components sorted by time cost, from most to least:

Component   Time Cost   Percentage
inspector   160ms       40%
aodh        110ms       30%
monitor     50ms        14%
...
...

Note: data in the table is for demonstration only, not actual measurement

Timestamps can be collected from various sources

  1. log files
  2. trace point in code

The performance profiler will be integrated into the verification job to provide detailed results of the test. It can also be deployed independently to diagnose performance issues in a specific environment.
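
A minimal sketch of the log-based approach, assuming each component writes a line containing an epoch timestamp and a checkpoint name; the log format and checkpoint names are assumptions for illustration.

# Minimal sketch (assumption): collect checkpoint timestamps from log lines and
# print the time spent between consecutive checkpoints.
import re

CHECKPOINTS = ["hostdown", "raw failure", "found affected", "marked host down",
               "set VM error", "notified VM error", "transformed event",
               "evaluated event", "fired alarm", "received alarm", "handled alarm"]

LINE = re.compile(r"^(?P<ts>\d+(\.\d+)?)\s+(?P<checkpoint>.+)$")  # "<epoch> <name>"

def profile(log_lines):
    stamps = {}
    for line in log_lines:
        m = LINE.match(line.strip())
        if m and m.group("checkpoint") in CHECKPOINTS:
            stamps.setdefault(m.group("checkpoint"), float(m.group("ts")))
    previous = None
    for name in CHECKPOINTS:
        if name not in stamps:
            continue
        if previous is not None:
            print("%-20s +%6.0f ms" % (name, (stamps[name] - previous) * 1000))
        previous = stamps[name]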

Working Items
  1. PoC with limited checkpoints
  2. Integration with verification job
  3. Collect timestamp at all checkpoints
  4. Display the profiling result in console
  5. Report the profiling result to test database
  6. Independent package which can be installed to specified environment
Manuals
OpenStack NOVA API for marking host down.
What the API is for
This API gives an external fault monitoring system the possibility to tell OpenStack Nova quickly that a compute host is down. This immediately enables evacuation of any VM on the host and thus enables faster HA actions.
What this API does
In OpenStack the nova-compute service state can represent the compute host state, and this new API is used to force this service down. It is assumed that the caller of this API has made sure the host is also fenced or powered down. This is important so that there is no chance the same VM instance appears twice in case it is evacuated to a new compute host. When the host is recovered by any means, the external system is responsible for calling the API again to disable the forced_down flag and let the host's nova-compute service report the host being up again. If a network-fenced host comes up again, it should not boot the VMs it had if it figures out they have been evacuated to another compute host. The decision of deleting or booting the VMs that used to be on the host should be made more reliable later by the Nova blueprint: https://blueprints.launchpad.net/nova/+spec/robustify-evacuate
REST API for forcing down:

Parameter explanations:

  tenant_id: Identifier of the tenant.
  binary: Compute service binary name.
  host: Compute host name.
  forced_down: Compute service forced down flag.
  token: Token received after successful authentication.
  service_host_ip: IP of the serving controller node.

request:

PUT /v2.1/{tenant_id}/os-services/force-down
{
    "binary": "nova-compute",
    "host": "compute1",
    "forced_down": true
}

response: 200 OK

{
    "service": {
        "host": "compute1",
        "binary": "nova-compute",
        "forced_down": true
    }
}

Example:

curl -g -i -X PUT http://{service_host_ip}:8774/v2.1/{tenant_id}/os-services/force-down \
-H "Content-Type: application/json" -H "Accept: application/json" \
-H "X-OpenStack-Nova-API-Version: 2.11" -H "X-Auth-Token: {token}" \
-d '{"binary": "nova-compute", "host": "compute1", "forced_down": true}'

CLI for forcing down:

nova service-force-down <hostname> nova-compute

Example: nova service-force-down compute1 nova-compute

REST API for disabling forced down:

Parameter explanations:

  tenant_id: Identifier of the tenant.
  binary: Compute service binary name.
  host: Compute host name.
  forced_down: Compute service forced down flag.
  token: Token received after successful authentication.
  service_host_ip: IP of the serving controller node.

request:

PUT /v2.1/{tenant_id}/os-services/force-down
{
    "binary": "nova-compute",
    "host": "compute1",
    "forced_down": false
}

response: 200 OK

{
    "service": {
        "host": "compute1",
        "binary": "nova-compute",
        "forced_down": false
    }
}

Example:

curl -g -i -X PUT http://{service_host_ip}:8774/v2.1/{tenant_id}/os-services/force-down \
-H "Content-Type: application/json" -H "Accept: application/json" \
-H "X-OpenStack-Nova-API-Version: 2.11" -H "X-Auth-Token: {token}" \
-d '{"binary": "nova-compute", "host": "compute1", "forced_down": false}'

CLI for disabling forced down:

nova service-force-down --unset <hostname> nova-compute

Example: nova service-force-down --unset compute1 nova-compute

Get valid server state
Problem description

Previously, when the owner of a VM queried his VMs, he did not receive enough state information, states did not change fast enough in the VIM and they were not accurate in some scenarios. With this change, this gap is now closed.

A typical case is that, in case of a host fault, the user of a high availability service running on top of that host needs to make an immediate switch-over from the faulty host to an active standby host. Now, if the compute host is forced down [1] as a result of that fault, the user has to be notified about this state change so that the user can react accordingly. Similarly, a change of the host state to “maintenance” should also be notified to the users.

What is changed

A new host_status parameter is added to the /servers/{server_id} and /servers/detail endpoints in microversion 2.16. With this new parameter the user can get additional state information about the host.

Possible host_status values, where a later value in the list overrides the earlier ones (a minimal sketch of this ordering follows the list below):

  • UP if nova-compute is up.
  • UNKNOWN if the nova-compute status was not reported by the servicegroup driver within the configured time period. The default is 60 seconds, but it can be changed with service_down_time in nova.conf.
  • DOWN if nova-compute was forced down.
  • MAINTENANCE if nova-compute was disabled. MAINTENANCE in API directly means nova-compute service is disabled. Different wording is used to avoid the impression that the whole host is down, as only scheduling of new VMs is disabled.
  • Empty string indicates there is no host for server.
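
As an illustration of the override order above (not Nova source code), a later condition in the sketch below replaces the status set by the earlier ones:

# Illustration only: derive host_status following the override order listed above.
def host_status(service):
    if service is None:
        return ""                        # no host for the server
    status = "UP"
    if not service["reported_in_time"]:  # outside the service_down_time window
        status = "UNKNOWN"
    if service["forced_down"]:
        status = "DOWN"
    if service["disabled"]:
        status = "MAINTENANCE"
    return status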

host_status is returned in the response if the policy permits. By default the policy allows admin only in the Nova policy.json:

"os_compute_api:servers:show:host_status": "rule:admin_api"

For an NFV use case this has to also be enabled for the owner of the VM:

"os_compute_api:servers:show:host_status": "rule:admin_or_owner"
REST API examples:

Case where nova-compute is enabled and reporting normally:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "UP",
    ...
  }
}

Case where nova-compute is enabled, but not reporting normally:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "UNKNOWN",
    ...
  }
}

Case where nova-compute is enabled, but forced_down:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "DOWN",
    ...
  }
}

Case where nova-compute is disabled:

GET /v2.1/{tenant_id}/servers/{server_id}

200 OK
{
  "server": {
    "host_status": "MAINTENANCE",
    ...
  }
}
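
The GET examples above assume that microversion 2.16 or later is requested; a minimal sketch with placeholder endpoint, token and server ID (the header value follows the microversion mentioned earlier in this section):

# Minimal sketch (assumption): request host_status with microversion 2.16.
import requests

NOVA_API = "http://controller:8774/v2.1/<tenant_id>"   # placeholder
resp = requests.get(
    "%s/servers/%s" % (NOVA_API, "<server_id>"),
    headers={"X-Auth-Token": "<token>",
             "X-OpenStack-Nova-API-Version": "2.16"},   # enables the host_status field
)
print(resp.json()["server"].get("host_status"))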

Host Status is also visible in python-novaclient:

+-------+------+--------+------------+-------------+----------+-------------+
| ID    | Name | Status | Task State | Power State | Networks | Host Status |
+-------+------+--------+------------+-------------+----------+-------------+
| 9a... | vm1  | ACTIVE | -          | RUNNING     | xnet=... | UP          |
+-------+------+--------+------------+-------------+----------+-------------+

Domino

Domino Project Overview
Domino Description

Domino provides a distribution service for Network Service Descriptors (NSDs) and Virtual Network Function Descriptors (VNFDs) that are composed using the TOSCA Simple Profile for Network Functions Virtualization (http://docs.oasis-open.org/tosca/tosca-nfv/v1.0/tosca-nfv-v1.0.html). The Domino service is targeted towards supporting many Software Defined Network (SDN) controllers, Service Orchestrators (SOs), VNF Managers (VNFMs), Virtual Infrastructure Managers (VIMs), and Operation and Business Support Systems that produce and/or consume NSDs and VNFDs.

Producers of NSDs and VNFDs use Domino Service through Service Access Points (SAPs) or End Points (EPs) to publish these descriptors. Consumers of NSDs and VNFDs subscribe with the Domino Service through the same SAPs/EPs and declare their resource capabilities to onboard and perform Life Cycle Management (LCM) for Network Services (NSs) and Virtual Network Functions (VNFs). Thus, Domino acts as a service broker for NSs and VNFs modeled in a Tosca template.

Domino Capabilities and Usage
Labels in Domino

Domino’s pub/sub architecture is based on labels (see Fig. 1 below). Each Template Producer and Template Consumer is expected to run a local Domino Client to publish templates and subscribe for labels.

Fig. 1: Domino provides a pub/sub server for NSDs and VNFDs

Domino Service does not interpret what the labels mean. Domino derives labels directly from the normative definitions in the TOSCA Simple YAML Profile for NFV. Domino parses the policy rules included in the NSD/VNFD, forms “policy” labels, and determines which resources are associated with which set of labels. Domino identifies which Domino Clients can host which resource based on the label subscriptions by these clients. Once the mapping of resources to clients is done, new NSDs/VNFDs are created based on the mapping. These new NSDs/VNFDs are translated and delivered to the clients.

Label Format and Examples

Domino supports policy labels in the following form:

<policytype>:properties:<key:value>

Orchestrators, controllers, and managers use Domino service to announce their capabilities by defining labels in this form and subscribing for these labels with the Domino Server.

For instance, a particular VIM that is capable of performing affinity-based VNF or VDU placement at host machine granularity can specify a label in the form:

tosca.policies.Placement.affinity:properties:granularity:hostlevel

When the VIM registers with the Domino Service and subscribes for that label, Domino views this VIM as a candidate location that can host a VNF or VDU requesting an affinity-based placement policy at host machine granularity.

Another use case is the announcement of lifecycle management capabilities for VNFs and VNF Forwarding Graphs (VNFFG) by different SDN Controllers (SDN-Cs), VNFMs, or VIMs. For instance

tosca.policies.Scaling.VNFFG:properties:session_continuity:true

can be used as a label to indicate that when a scaling operation on a VNFFG (e.g., adding more VNFs into the graph) is requested, existing sessions can still be enforced to go through the same chain of VNF instances.

To utilize Domino’s domain mapping services for virtual network resources (e.g., VNF, VDU, VNFFG, CP, VL, etc.), a network service or network function request must include policy rules that are composed of policy types and property values that match the label announcements made by these domains. For instance, when a TOSCA template includes a policy rule with type “tosca.policies.Scaling.VNFFG” and property field “session_continuity” set as “true” targeting one or more VNFFGs, this serves as the hint for the Domino Server to identify all the Domain Clients that subscribed to the label “tosca.policies.Scaling.VNFFG:properties:session_continuity:true”.

Template Example for Label Extraction

Consider the following NSD TOSCA template:

tosca_definitions_version: tosca_simple_profile_for_nfv_1_0_0
description: Template for deploying a single server with predefined properties.
metadata:
  template_name: TOSCA NFV Sample Template
policy_types:
  tosca.policies.Placement.Geolocation:
    description: Geolocation policy
    derived_from: tosca.policies.Placement
topology_template:
  node_templates:
    VNF1:
      type: tosca.nodes.nfv.VNF
      properties:
        id: vnf1
        vendor: acmetelco
        version: 1.0
    VNF2:
      type: tosca.nodes.nfv.VNF
      properties:
        id: vnf2
        vendor: ericsson
        version: 1.0
    VNF3:
      type: tosca.nodes.nfv.VNF
      properties:
        id: vnf3
        vendor: huawei
        version: 1.0
  policies:
    - rule1:
        type: tosca.policies.Placement.Geolocation
        targets: [ VNF1 ]
        properties:
          region: [ us-west-1 ]
    - rule2:
        type: tosca.policies.Placement.Geolocation
        targets: [ VNF2, VNF3 ]
        properties:
          region: [ us-west-1 , us-west-2 ]

Domino Server extracts all possible policy labels by exhaustively concatenating key-value pairs under the properties section of the policy rules to the policy type of these rules:

tosca.policies.Placement.Geolocation:properties:region:us-west-1
tosca.policies.Placement.Geolocation:properties:region:us-west-2

Furthermore, Domino Server iterates over the targets specified under policy rules to generate a set of labels for each target node:

required_labels['VNF1'] = { tosca.policies.Placement.Geolocation:properties:region:us-west-1 }
required_labels['VNF2'] = { tosca.policies.Placement.Geolocation:properties:region:us-west-1 , tosca.policies.Placement.Geolocation:properties:region:us-west-2}
required_labels['VNF3'] = { tosca.policies.Placement.Geolocation:properties:region:us-west-1 , tosca.policies.Placement.Geolocation:properties:region:us-west-2}
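
A simplified illustration of this extraction step, assuming the NSD is parsed with PyYAML (illustrative code, not the Domino source):

# Simplified illustration of policy label extraction from a TOSCA NSD.
import yaml

def extract_labels(tosca_text):
    nsd = yaml.safe_load(tosca_text)
    required_labels = {}
    for rule in nsd["topology_template"].get("policies", []):
        for body in rule.values():                       # e.g. {"rule1": {...}}
            ptype = body["type"]
            for key, values in body.get("properties", {}).items():
                if not isinstance(values, list):
                    values = [values]
                for value in values:
                    label = "%s:properties:%s:%s" % (ptype, key, value)
                    for target in body.get("targets", []):
                        required_labels.setdefault(target, set()).add(label)
    return required_labels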

When a Template Consuming site (e.g., VNFM or VIM) registers with the Domino Server using Domino Client, it becomes an eligible candidate for template distribution with an initially empty set of label subscriptions. Suppose three different Domino Clients register with the Domino Server and subscribe for some or none of the policy labels such that the Domino Server has the current subscription state as follows:

subscribed_labels[site-1] = { } #this is empty set
subscribed_labels[site-2] = { tosca.policies.Placement.Geolocation:properties:region:us-west-1 }
subscribed_labels[site-3] = { tosca.policies.Placement.Geolocation:properties:region:us-west-1 ,  tosca.policies.Placement.Geolocation:properties:region:us-west-2}

Based on the TOSCA example and the hypothetical label subscriptions above, Domino Server identifies that all the VNFs can be hosted by Site-3, while VNF1 can be hosted by both Site-2 and Site-3. Note that Site-1 cannot host any of the VNFs listed in the TOSCA file. When a VNF can be hosted by multiple sites, Domino Server picks the site that can host the largest number of VNFs. When not all VNFs can be hosted on the same site, the TOSCA file is partitioned into multiple files, one for each site. These files share a common part (e.g., meta-data, policy-types, version, description, virtual resources that are not targeted by any policy rule, etc.). Each site-specific file also has a non-common part that only appears in that file (i.e., virtual resources explicitly assigned to that site and the policy rules that accompany those virtual resources).

In the current Domino convention, if a VNF (or any virtual resource) does not have a policy rule (i.e., it is not specified as a target in any of the policy rules) and it is also not dependent on any VNF (or any virtual resource) that is assigned to another site, that resource is wildcarded by default and treated as part of the “common part”. Also note that currently Domino does not support all-or-nothing semantics: if some of the virtual resources are not mappable to any domain because they are targets of policy rules that are not supported by any site, these portions will be excluded, while the remaining virtual resources will still be part of one or more template files to be distributed to hosting sites. When NSDs and VNFDs are prepared, these conventions must be kept in mind. In future releases, these conventions may change based on new use cases.

For the example above, no partitioning would occur as all VNFs are mapped onto site-3; Domino Server simply delivers the Tosca file to Domino Client hosted on site-3. When TOSCA cannot be consumed by a particular site directly, Domino Server can utilize existing translators (e.g., heat-translator) to first translate the template before delivery.

Internal Processing Pipeline at Domino Server

Fig. 2 shows the block diagram for the processing stages of a published TOSCA template. Domino Client issues an RPC call publish(tosca file). Domino Server passes the received tosca file to Label Extractor that outputs resource labels. Domain Mapper uses the extracted labels and tosca file to find mappings from resources to domains as well as the resource dependencies. Resource to domain mappings and resource dependencies are utilized to partition the orchestration template into individual resource orchestration templates (one for each domain). If a translation is required (e.g., TOSCA to HOT), individual resource orchestration templates are first translated and then placed on a template distribution workflow based on resource dependencies. Message Sender block in the server takes one distribution task at a time from the workflow generator and pushes the orchestration template to the corresponding Domino Client.

Fig. 2: Domino Service Processing Pipeline

Resource Scheduling

Domino Service currently supports a maximum packing strategy when a virtual resource type can be hosted on multiple candidate sites. Initially, the Domino Scheduler identifies virtual resources that have only one feasible site for hosting. Each such virtual resource is trivially assigned to its only feasible site. The remaining virtual resources with multiple candidate locations are sequentially allocated to whichever of their candidate locations has the most virtual resource assignments so far. Note that wildcarded resources are assigned to all sites. To prevent wildcarding within the current release, (i) all sites must subscribe to a base policy with a dummy key-value pair defined under the properties tab and (ii) all the independent resources must be specified as targets of that policy in the NSD or VNFD file.
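
A simplified illustration of the maximum packing assignment described above; the mapping from resources to candidate sites is assumed to be already computed from the label matching (illustrative code, not the Domino source):

# Simplified illustration of the maximum packing strategy.
def schedule(candidates):
    """candidates: dict mapping resource name -> set of feasible site names."""
    assignment = {}
    load = {}                                   # site -> number of resources assigned so far

    # First pass: resources with a single feasible site are assigned trivially.
    for resource, sites in candidates.items():
        if len(sites) == 1:
            site = next(iter(sites))
            assignment[resource] = site
            load[site] = load.get(site, 0) + 1

    # Second pass: remaining resources go to their most packed candidate site.
    for resource, sites in candidates.items():
        if resource in assignment:
            continue
        site = max(sites, key=lambda s: load.get(s, 0))
        assignment[resource] = site
        load[site] = load.get(site, 0) + 1
    return assignment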

Domino Installation Instruction
Domino Installation

Note: The steps below are tested for Ubuntu (16.04, 14.04) and OS X El Capitan.

Prerequisites
  • git
  • python-pip
  • python (version =2.7)
  • tosca-parser (version >=0.4.0)
  • heat-translator (version >=0.5.0)
Installation Steps (Single Node)
  • Step-0: Prepare Environment
> $sudo pip install tosca-parser
> $sudo pip install heat-translator
> $sudo pip install requests
  • Step-1: Get the Domino code
git clone https://gerrit.opnfv.org/gerrit/domino -b stable/danube
  • Step-2: Go to the main domino directory
cd domino

You should see DominoClient.py, DominoServer.py, and domino-cli.py as executables.

Installation Steps (Multiple Node)

Repeat the single-node installation steps on each node. The script run_on_remotenodes.sh under the ./domino/tests directory deploys the Domino code on three hosts from a deployment node and tests the RPC calls. In the script, the private key location and remote host IP addresses must be entered manually. IS_IPandKEY_CONFIGURED should be set to true, i.e., IS_IPandKEY_CONFIGURED=true.

Domino Configuration Guide
Domino Configuration

Domino Server and Clients can be configured via (i) passing command line options (see API documentation) and (ii) the configuration file “domino_conf.py” under the main directory.

  • By default no log file is set and the log level is set to “WARNING”.
Domino Server
  • The default server unique user ID is set as 0 in the configuration file.
  • The default TCP port for RPC calls is set as 9090 in the configuration file.
  • The default database file for Domino Server is set as “dominoserver.db” under the main directory
  • The default folder for keeping published TOSCA files and pushed parts is set as “toscafiles” in the configuration file via variable TOSCADIR.
  • The default log level (WARNING) can be changed by passing the flags --log or -l followed by a log level, e.g., ERROR, WARNING, INFO, or DEBUG.
Domino Client
  • The default mode of the CLI is non-interactive (i.e., the Domino CLI Utility is used). Passing --iac=TRUE starts the client in interactive mode.
  • The default log level (WARNING) can be changed by passing the flags --log or -l followed by a log level, e.g., ERROR, WARNING, INFO, or DEBUG.
  • The default Domino Server IP is set as “localhost”. This can be overwritten at the time of launching DominoClient via the option flags -i or --ipaddr followed by the IP address of the actual server hosting the Domino Server.
  • The default Domino Client TCP port for RPC calls is set as 9091 in the configuration file. It can be overwritten when the DominoClient is launched by passing the flags --port or -p followed by the port number.
  • The default folder for keeping received TOSCA files is set as “toscafiles” in the configuration file via variable TOSCA_RX_DIR.
Domino User Guide
Domino API Usage Guidelines and Examples
Using domino-cli Client

Prerequisites:

  1. Make sure that domino-cli.py is in +x mode.
  2. Change directory to where domino-cli.py, DominoClient.py and DominoServer.py are located or include file path in the PATH environment variable.
  3. Start the Domino Server:
./DominoServer.py --log=debug
  1. Start the Domino Client:
./DominoClient.py -p <portnumber> --cliport <cli-portnumber> --log=debug

Note1: The default log level is WARNING and omitting the --log option will lead to minimal/no logging on the console

Note2: domino_conf.py file includes most of the default values

  • Registration Command

Command line input:

./domino-cli.py <cli-portnumber> register

This message has the following fields that are automatically filled in.

Message Type (= REGISTER)
DESIRED UDID (= if not allocated, this will be assigned as Unique Domino ID)
Sequence Number (=incremented after each RPC call)
IP ADDR (= IP address of DOMINO Client to be used by DOMINO Server for future RPC Calls to this client)
TCP PORT (= TCP port of DOMINO Client to be used by DOMINO Server for future RPC Calls to this client)
Supported Templates (= Null, this field not used currently)
  • Heart Beat Command

Command line input:

./domino-cli.py <cli-portnumber> heartbeat

This message has the following fields that are automatically filled in.

Message Type (= HEART_BEAT)
UDID (= Unique Domino ID assigned during registration)
Sequence Number (=incremented after each RPC call)
  • Label and Template Type Subscription Command
./domino-cli.py <cli-portnumber> subscribe -l <labelname> -t <templatetype>

Note that -l can be substituted by --label and -t can be substituted by --ttype.

More than one label or template type can be subscribed within the same command line as comma separated labels or template types

./domino-cli.py <cli-portnumber> subscribe -l <label1>,<label2>,<labeln> -t <ttype1>,<ttype2>,<ttypen>

To subscribe more than one label or template type, one can also repeat the options -l and -t, e.g.:

./domino-cli.py <cli-portnumber> subscribe -l <label1> -l <label2> -l <labeln> -t <ttype1> -t <ttype2> -t <ttypen>

It is safe to call subscribe command multiple times with duplicate labels.

This message has the following fields that are automatically filled in.

Message Type (= SUBSCRIBE)
UDID (= Unique Domino ID assigned during registration)
Sequence Number (=incremented after each RPC call)
Template Operation (= APPEND)
Label Operation (= APPEND)

The following fields are filled in based on arguments passed in via the -l/--label and -t/--ttype flags

Subscribe RPC also supports options for labels using
--lop=APPEND/DELETE/OVERWRITE
and for supported template types using
--top=APPEND/DELETE/OVERWRITE.

When unspecified, the default is APPEND. DELETE deletes existing labels (template types) specified in the current call via key -l/--label (-t/--ttype). OVERWRITE removes the current set of labels (template types) and sets it to the new set of values passed in the same RPC call.
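
The three operations amount to simple set updates, as in the following short illustration (not Domino source code):

# Illustration of the APPEND/DELETE/OVERWRITE semantics described above.
def update_labels(current, new, op="APPEND"):
    if op == "APPEND":
        return current | new        # add the new labels to the existing set
    if op == "DELETE":
        return current - new        # remove only the labels passed in this call
    if op == "OVERWRITE":
        return set(new)             # replace the whole subscription set
    raise ValueError("unknown label operation: %s" % op)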

By default, no translation service is provided. Currently, only TOSCA to Heat Orchestration Template (HOT) translation is supported using OpenStack heat-translator library. A domain that requires HOT files must subscribe HOT template type using

./domino-cli.py <cli-portnumber> subscribe -t hot
  • Template Publishing Command
./domino-cli.py <cli-portnumber> publish -t <toscafile>

Note that -t can be substituted with --tosca-file.

If the -t or --tosca-file flag is used multiple times, the last TOSCA file passed as input will be used. This usage is not recommended, as undefined/unintended results may emerge because the Domino Client will still continue to publish.

This message has the following fields that are automatically filled in.

Message Type (= PUBLISH)
UDID (= Unique Domino ID assigned during registration)
Sequence Number (=incremented after each RPC call)
Template Type (= TOSCA)
Template File

Since Danube release, Domino Server supports stateful updates for template publishing. The following command can be used to update the service template for an existing Template Unique ID (TUID):

./domino-cli.py <cli-portnumber> publish -t <toscafile> -k <TUID>

Note that -k can be substituted with --tuid. When Domino Server receives this command, it verifies whether the client previously published the provided TUID. If such a TUID does not exist, Domino Server returns a FAILED response back to the client. If such a TUID exists, Domino Server recomputes which resources are mapped onto which domains and updates each domain with the new VNF and NS descriptors. If a previously utilized domain is no longer targeted, it is updated with a null descriptor.

  • Template Listing Command
./domino-cli.py <cli-portnumber> list-tuids

Queries all the Template Unique IDs (TUIDs) published by the Domino Client from the Domino Server.

Interactive CLI mode

To enter this mode, start the Domino Client with the interactive console option set to true, i.e., --iac=true:

./DominoClient -p <portnumber> --iac=true --log=DEBUG

The rest of the API calls are the same as in the case of using domino-cli.py, except that at the prompt there is no need to write “domino-cli.py <cli-portnumber>”, e.g.:

>>register
>>heartbeat
>>subscribe -l <label1> -t <ttype1>
>>publish -t <toscafile>

The interactive CLI mode is mainly supported for manual testing.

IPV6

IPv6 Installation Procedure
Abstract:

This document provides the users with the Installation Procedure to install OPNFV Danube Release on IPv6-only Infrastructure.

1. Install OPNFV on IPv6-Only Infrastructure

This section provides instructions to install OPNFV on IPv6-only Infrastructure. All underlay networks and API endpoints will be IPv6-only except:

  1. “admin” network in underlay/undercloud still has to be IPv4, due to lack of support of IPMI over IPv6 or PXE over IPv6.
  2. OVS VxLAN (or GRE) tunnel endpoint is still IPv4 only, although IPv6 traffic can be encapsulated within the tunnel.
  3. Metadata server is still IPv4 only.

Except the limitations above, the use case scenario of the IPv6-only infrastructure includes:

  1. Support OPNFV deployment on an IPv6 only infrastructure.
  2. Horizon/ODL-DLUX access using IPv6 address from an external host.
  3. OpenStack API access using IPv6 addresses from various python-clients.
  4. Ability to create Neutron Routers, IPv6 subnets (e.g. SLAAC/DHCPv6-Stateful/ DHCPv6-Stateless) to support North-South traffic.
  5. Inter VM communication (East-West routing) when VMs are spread across two compute nodes.
  6. VNC access into a VM using IPv6 addresses.
1.1. Install OPNFV in OpenStack-Only Environment

Apex Installer:

# HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Note:
#
# 1. Parameter ""-v" is mandatory for Virtual deployment
# 2. Parameter "-i <inventory file>" is mandatory for Bare Metal deployment
# 2.1 Refer to https://git.opnfv.org/cgit/apex/tree/config/inventory for examples of inventory file
# 3. You can use "-n /etc/opnfv-apex/network_settings.yaml" for deployment in IPv4 infrastructure

Please NOTE that:

  • You need to refer to installer’s documentation for other necessary parameters applicable to your deployment.
  • You need to refer to Release Notes and installer’s documentation if there is any issue in installation.
1.2. Install OPNFV in OpenStack with ODL-L3 Environment

Apex Installer:

# HA, Virtual deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-odl_l3-nofeature-ha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# HA, Bare Metal deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -d /etc/opnfv-apex/os-odl_l3-nofeature-ha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Virtual deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-odl_l3-nofeature-noha.yaml \
-n /etc/opnfv-apex/network_settings_v6.yaml

# Non-HA, Bare Metal deployment in OpenStack with Open Daylight L3 environment
./opnfv-deploy -d /etc/opnfv-apex/os-odl_l3-nofeature-noha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_settings_v6.yaml

# Note:
#
# 1. Parameter ""-v" is mandatory for Virtual deployment
# 2. Parameter "-i <inventory file>" is mandatory for Bare Metal deployment
# 2.1 Refer to https://git.opnfv.org/cgit/apex/tree/config/inventory for examples of inventory file
# 3. You can use "-n /etc/opnfv-apex/network_settings.yaml" for deployment in IPv4 infrastructure

Please NOTE that:

  • You need to refer to installer’s documentation for other necessary parameters applicable to your deployment.
  • You need to refer to Release Notes and installer’s documentation if there is any issue in installation.
1.3. Testing Methodology

There are 2 levels of testing to validate the deployment.

1.3.1. Underlay Testing for OpenStack API Endpoints

Underlay Testing is to validate that API endpoints are listening on IPv6 addresses. Currently, we are only considering the Underlay Testing for OpenStack API endpoints. The Underlay Testing for Open Daylight API endpoints is for future release.

The Underlay Testing for OpenStack API endpoints can be as simple as validating Keystone service, and as complete as validating each API endpoint. It is important to reuse Tempest API testing. Currently:

  • Apex Installer will change OS_AUTH_URL in overcloudrc during installation process. For example: export OS_AUTH_URL=http://[2001:db8::15]:5000/v2.0. OS_AUTH_URL points to Keystone and Keystone catalog.
  • When FuncTest runs Tempest for the first time, the OS_AUTH_URL is taken from the environment and placed automatically in Tempest.conf.
  • Under this circumstance, openstack catalog list will return IPv6 URL endpoints for all the services in the catalog, including Nova, Neutron, etc., covering public URLs, private URLs and admin URLs.
  • Thus, as long as the IPv6 URL is given in the overcloudrc, all the tests will use it (including Tempest).

Therefore Tempest API testing is reused to validate that API endpoints are listening on IPv6 addresses as stated above. These tests are part of the OpenStack default Smoke Tests, run in FuncTest and integrated into OPNFV’s CI/CD environment.
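
As a quick manual check before running the full Tempest suite, the following minimal sketch (an assumption, not part of FuncTest) verifies that OS_AUTH_URL contains an IPv6 literal and that the Keystone endpoint answers on it:

# Minimal sketch (assumption): confirm the Keystone endpoint from the environment
# uses an IPv6 literal and responds over IPv6.
import os
import requests

auth_url = os.environ["OS_AUTH_URL"]          # e.g. http://[2001:db8::15]:5000/v2.0
assert "[" in auth_url, "OS_AUTH_URL does not contain an IPv6 literal"

resp = requests.get(auth_url, timeout=10)     # Keystone answers with its version document
print(auth_url, "->", resp.status_code)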

1.3.2. Overlay Testing

Overlay Testing is to validate that IPv6 is supported in tenant networks, subnets and routers. Both Tempest API testing and Tempest Scenario testing are used in our Overlay Testing.

Tempest API testing validates that the Neutron API supports the creation of IPv6 networks, subnets, routers, etc:

tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_network
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_port
tempest.api.network.test_networks.BulkNetworkOpsIpV6Test.test_bulk_create_delete_subnet
tempest.api.network.test_networks.NetworksIpV6Test.test_create_update_delete_network_subnet
tempest.api.network.test_networks.NetworksIpV6Test.test_external_network_visibility
tempest.api.network.test_networks.NetworksIpV6Test.test_list_networks
tempest.api.network.test_networks.NetworksIpV6Test.test_list_subnets
tempest.api.network.test_networks.NetworksIpV6Test.test_show_network
tempest.api.network.test_networks.NetworksIpV6Test.test_show_subnet
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_create_update_delete_network_subnet
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_external_network_visibility
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_list_networks
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_list_subnets
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_show_network
tempest.api.network.test_networks.NetworksIpV6TestAttrs.test_show_subnet
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_in_allowed_allocation_pools
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_port_with_no_securitygroups
tempest.api.network.test_ports.PortsIpV6TestJSON.test_create_update_delete_port
tempest.api.network.test_ports.PortsIpV6TestJSON.test_list_ports
tempest.api.network.test_ports.PortsIpV6TestJSON.test_show_port
tempest.api.network.test_routers.RoutersIpV6Test.test_add_multiple_router_interfaces
tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_port_id
tempest.api.network.test_routers.RoutersIpV6Test.test_add_remove_router_interface_with_subnet_id
tempest.api.network.test_routers.RoutersIpV6Test.test_create_show_list_update_delete_router
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_list_update_show_delete_security_group
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_create_show_delete_security_group_rule
tempest.api.network.test_security_groups.SecGroupIPv6Test.test_list_security_groups

Tempest Scenario testing validates some specific overlay IPv6 scenarios (i.e. use cases) as follows:

tempest.scenario.test_network_v6.TestGettingAddress.test_dhcp6_stateless_from_os
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_dhcp6_stateless_from_os
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_dhcpv6_stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_multi_prefix_slaac
tempest.scenario.test_network_v6.TestGettingAddress.test_dualnet_slaac_from_os
tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_dhcpv6_stateless
tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac
tempest.scenario.test_network_v6.TestGettingAddress.test_slaac_from_os

The above Tempest API testing and Scenario testing are quite comprehensive to validate overlay IPv6 tenant networks. They are part of OpenStack default Smoke Tests, run in FuncTest and integrated into OPNFV’s CI/CD environment.

IPv6 Configuration Guide
Abstract:

This document provides the users with the Configuration Guide to set up a service VM as an IPv6 vRouter using OPNFV Danube Release.

1. IPv6 Configuration - Setting Up a Service VM as an IPv6 vRouter

This section provides instructions to set up a service VM as an IPv6 vRouter using OPNFV Danube Release installers. The environment may be pure OpenStack option or Open Daylight L2-only option. The deployment model may be HA or non-HA. The infrastructure may be bare metal or virtual environment.

1.1. Pre-configuration Activities

The configuration will work in 2 environments:

  1. OpenStack-only environment
  2. OpenStack with Open Daylight L2-only environment

Depending on which installer will be used to deploy OPNFV, each environment may be deployed on bare metal or virtualized infrastructure. Each deployment may be HA or non-HA.

Refer to the previous installer configuration chapters, installations guide and release notes.

1.2. Setup Manual in OpenStack-Only Environment

If you intend to set up a service VM as an IPv6 vRouter in OpenStack-only environment of OPNFV Danube Release, please NOTE that:

  • Because the anti-spoofing rules of Security Group feature in OpenStack prevents a VM from forwarding packets, we need to disable Security Group feature in the OpenStack-only environment.
  • The hostnames, IP addresses and usernames in the instructions are examples. Please change them as needed to fit your environment.
  • The instructions apply to both the single-controller-node deployment model and the HA (High Availability) deployment model where multiple controller nodes are used.
1.2.1. Install OPNFV and Preparation

OPNFV-NATIVE-INSTALL-1: To install OpenStack-only environment of OPNFV Danube Release:

Apex Installer:

# HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-n /etc/opnfv-apex/network_setting.yaml

# HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-ha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_setting.yaml

# Non-HA, Virtual deployment in OpenStack-only environment
./opnfv-deploy -v -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-n /etc/opnfv-apex/network_setting.yaml

# Non-HA, Bare Metal deployment in OpenStack-only environment
./opnfv-deploy -d /etc/opnfv-apex/os-nosdn-nofeature-noha.yaml \
-i <inventory file> -n /etc/opnfv-apex/network_setting.yaml

# Note:
#
# 1. Parameter ""-v" is mandatory for Virtual deployment
# 2. Parameter "-i <inventory file>" is mandatory for Bare Metal deployment
# 2.1 Refer to https://git.opnfv.org/cgit/apex/tree/config/inventory for examples of inventory file
# 3. You can use "-n /etc/opnfv-apex/network_setting_v6.yaml" for deployment in IPv6-only infrastructure

Compass Installer:

# HA deployment in OpenStack-only environment
export ISO_URL=file://$BUILD_DIRECTORY/compass.iso
export OS_VERSION=${{COMPASS_OS_VERSION}}
export OPENSTACK_VERSION=${{COMPASS_OPENSTACK_VERSION}}
export CONFDIR=$WORKSPACE/deploy/conf/vm_environment
./deploy.sh --dha $CONFDIR/os-nosdn-nofeature-ha.yml \
--network $CONFDIR/$NODE_NAME/network.yml

# Non-HA deployment in OpenStack-only environment
# Non-HA deployment is currently not supported by Compass installer

Fuel Installer:

# HA deployment in OpenStack-only environment
# Scenario Name: os-nosdn-nofeature-ha
# Scenario Configuration File: ha_heat_ceilometer_scenario.yaml
# You can use either Scenario Name or Scenario Configuration File Name in "-s" parameter
sudo ./deploy.sh -b <stack-config-uri> -l <lab-name> -p <pod-name> \
-s os-nosdn-nofeature-ha -i <iso-uri>

# Non-HA deployment in OpenStack-only environment
# Scenario Name: os-nosdn-nofeature-noha
# Scenario Configuration File: no-ha_heat_ceilometer_scenario.yaml
# You can use either Scenario Name or Scenario Configuration File Name in "-s" parameter
sudo ./deploy.sh -b <stack-config-uri> -l <lab-name> -p <pod-name> \
-s os-nosdn-nofeature-noha -i <iso-uri>

# Note:
#
# 1. Refer to http://git.opnfv.org/cgit/fuel/tree/deploy/scenario/scenario.yaml for scenarios
# 2. Refer to http://git.opnfv.org/cgit/fuel/tree/ci/README for description of
#    stack configuration directory structure
# 3. <stack-config-uri> is the base URI of stack configuration directory structure
# 3.1 Example: http://git.opnfv.org/cgit/fuel/tree/deploy/config
# 4. <lab-name> and <pod-name> must match the directory structure in stack configuration
# 4.1 Example of <lab-name>: -l devel-pipeline
# 4.2 Example of <pod-name>: -p elx
# 5. <iso-uri> could be local or remote ISO image of Fuel Installer
# 5.1 Example: http://artifacts.opnfv.org/fuel/danube/opnfv-danube.1.0.iso
#
# Please refer to Fuel Installer's documentation for further information and any update

Joid Installer:

# HA deployment in OpenStack-only environment
./deploy.sh -o mitaka -s nosdn -t ha -l default -f ipv6

# Non-HA deployment in OpenStack-only environment
./deploy.sh -o mitaka -s nosdn -t nonha -l default -f ipv6

Please NOTE that:

  • You need to refer to installer’s documentation for other necessary parameters applicable to your deployment.
  • You need to refer to Release Notes and installer’s documentation if there is any issue in installation.

OPNFV-NATIVE-INSTALL-2: Clone the following GitHub repository to get the configuration and metadata files

git clone https://github.com/sridhargaddam/opnfv_os_ipv6_poc.git \
/opt/stack/opnfv_os_ipv6_poc
1.2.2. Disable Security Groups in OpenStack ML2 Setup

Please NOTE that although the Security Groups feature has been disabled automatically through the local.conf configuration file by some installers such as devstack, it is very likely that other installers such as Apex, Compass, Fuel or Joid will enable the Security Groups feature after installation.

Please make sure that Security Groups are disabled in the setup

In order to disable Security Groups globally, please make sure that the settings in OPNFV-NATIVE-SEC-1 and OPNFV-NATIVE-SEC-2 are applied, if they are not there by default.

OPNFV-NATIVE-SEC-1: Change the settings in /etc/neutron/plugins/ml2/ml2_conf.ini as follows, if they are not there by default

# /etc/neutron/plugins/ml2/ml2_conf.ini
[securitygroup]
enable_security_group = True
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
[ml2]
extension_drivers = port_security
[agent]
prevent_arp_spoofing = False

OPNFV-NATIVE-SEC-2: Change the settings in /etc/nova/nova.conf as follows, if they are not there by default.

# /etc/nova/nova.conf
[DEFAULT]
security_group_api = neutron
firewall_driver = nova.virt.firewall.NoopFirewallDriver

OPNFV-NATIVE-SEC-3: After updating the settings, you will have to restart the Neutron and Nova services.

Please note that the commands for restarting Neutron and Nova vary depending on the installer. Please refer to the relevant documentation of the specific installer.
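As an illustration only, on a controller or compute node where the services are managed by systemd, the restart might look like the sketch below; the service names (neutron-server, neutron-openvswitch-agent, nova-api, nova-compute) are assumptions that differ between distributions and installers.

# Illustrative sketch; service names vary by installer and distribution
sudo systemctl restart neutron-server
sudo systemctl restart neutron-openvswitch-agent   # on nodes running the OVS agent
sudo systemctl restart nova-api
sudo systemctl restart nova-compute                # on compute nodes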

1.2.3. Set Up Service VM as IPv6 vRouter

OPNFV-NATIVE-SETUP-1: Now we assume that the OpenStack multi-node setup is up and running. In this step we have to source the tenant credentials on the OpenStack controller node. Please NOTE that the method of sourcing tenant credentials may vary depending on the installer. For example:

Apex installer:

# On jump host, source the tenant credentials using /bin/opnfv-util provided by Apex installer
opnfv-util undercloud "source overcloudrc; keystone service-list"

# Alternatively, you can copy the file /home/stack/overcloudrc from the installer VM called "undercloud"
# to a location in controller node, for example, in the directory /opt, and do:
# source /opt/overcloudrc

Compass installer:

# source the tenant credentials using Compass installer of OPNFV
source /opt/admin-openrc.sh

Fuel installer:

# source the tenant credentials using Fuel installer of OPNFV
source /root/openrc

Joid installer:

# source the tenant credentials using Joid installer of OPNFV
source $HOME/joid_config/admin-openrc

devstack:

# source the tenant credentials in devstack
source openrc admin demo

Please refer to relevant documentation of installers if you encounter any issue.

OPNFV-NATIVE-SETUP-2: Download the Fedora 22 image which will be used for the vRouter

wget https://download.fedoraproject.org/pub/fedora/linux/releases/22/Cloud/x86_64/\
Images/Fedora-Cloud-Base-22-20150521.x86_64.qcow2

OPNFV-NATIVE-SETUP-3: Import Fedora22 image to glance

glance image-create --name 'Fedora22' --disk-format qcow2 --container-format bare \
--file ./Fedora-Cloud-Base-22-20150521.x86_64.qcow2

OPNFV-NATIVE-SETUP-4: This step is informational. The OPNFV installer has taken care of it during deployment. You may refer to this step only if there is an issue, or if you are using another installer.

We have to move the physical interface (i.e. the public network interface) to br-ex, including moving the public IP address and setting up the default route. Please refer to OS-NATIVE-SETUP-4 and OS-NATIVE-SETUP-5 in our more complete instructions.

OPNFV-NATIVE-SETUP-5: Create Neutron routers ipv4-router and ipv6-router which need to provide external connectivity.

neutron router-create ipv4-router
neutron router-create ipv6-router

OPNFV-NATIVE-SETUP-6: Create an external network/subnet ext-net using the appropriate values based on the data-center physical network setup.

Please NOTE that you may only need to create the subnet of ext-net because OPNFV installers should have created an external network during installation. When you create the subnet, you must use the same name as the external network that the installer created. For example:

  • Apex installer: external
  • Compass installer: ext-net
  • Fuel installer: admin_floating_net
  • Joid installer: ext-net

Please refer to the documentation of installers if there is any issue
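If you are unsure which external network your installer created, one quick check (assuming the neutron CLI is available after sourcing the credentials) is to list the external networks:

# List external networks to find the name to use instead of "ext-net"
neutron net-external-list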

# This is needed only if installer does not create an external network
# Otherwise, skip this command "net-create"
neutron net-create --router:external ext-net

# Note that the name "ext-net" may work for some installers such as Compass and Joid
# Change the name "ext-net" to match the name of external network that an installer creates
neutron subnet-create --disable-dhcp --allocation-pool start=198.59.156.251,\
end=198.59.156.254 --gateway 198.59.156.1 ext-net 198.59.156.0/24

OPNFV-NATIVE-SETUP-7: Create Neutron networks ipv4-int-network1 and ipv6-int-network2 with port_security disabled

neutron net-create ipv4-int-network1
neutron net-create ipv6-int-network2

OPNFV-NATIVE-SETUP-8: Create IPv4 subnet ipv4-int-subnet1 in the internal network ipv4-int-network1, and associate it to ipv4-router.

neutron subnet-create --name ipv4-int-subnet1 --dns-nameserver 8.8.8.8 \
ipv4-int-network1 20.0.0.0/24

neutron router-interface-add ipv4-router ipv4-int-subnet1

OPNFV-NATIVE-SETUP-9: Associate the ext-net to the Neutron routers ipv4-router and ipv6-router.

# Note that the name "ext-net" may work for some installers such as Compass and Joid
# Change the name "ext-net" to match the name of external network that an installer creates
neutron router-gateway-set ipv4-router ext-net
neutron router-gateway-set ipv6-router ext-net

OPNFV-NATIVE-SETUP-10: Create two subnets, one IPv4 subnet ipv4-int-subnet2 and one IPv6 subnet ipv6-int-subnet2 in ipv6-int-network2, and associate both subnets to ipv6-router

neutron subnet-create --name ipv4-int-subnet2 --dns-nameserver 8.8.8.8 \
ipv6-int-network2 10.0.0.0/24

neutron subnet-create --name ipv6-int-subnet2 --ip-version 6 --ipv6-ra-mode slaac \
--ipv6-address-mode slaac ipv6-int-network2 2001:db8:0:1::/64

neutron router-interface-add ipv6-router ipv4-int-subnet2
neutron router-interface-add ipv6-router ipv6-int-subnet2

OPNFV-NATIVE-SETUP-11: Create a keypair

nova keypair-add vRouterKey > ~/vRouterKey
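Note that ssh will refuse to use a private key file with open permissions, so you will likely need to restrict the permissions of the saved key before using it in the later SSH steps:

chmod 600 ~/vRouterKey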

OPNFV-NATIVE-SETUP-12: Create ports for the vRouter with specific MAC addresses, basically so that automation knows in advance the IPv6 addresses that will be assigned to the ports.

neutron port-create --name eth0-vRouter --mac-address fa:16:3e:11:11:11 ipv6-int-network2
neutron port-create --name eth1-vRouter --mac-address fa:16:3e:22:22:22 ipv4-int-network1

OPNFV-NATIVE-SETUP-13: Create ports for VM1 and VM2.

neutron port-create --name eth0-VM1 --mac-address fa:16:3e:33:33:33 ipv4-int-network1
neutron port-create --name eth0-VM2 --mac-address fa:16:3e:44:44:44 ipv4-int-network1

OPNFV-NATIVE-SETUP-14: Update ipv6-router with routing information to subnet 2001:db8:0:2::/64

neutron router-update ipv6-router --routes type=dict list=true \
destination=2001:db8:0:2::/64,nexthop=2001:db8:0:1:f816:3eff:fe11:1111

OPNFV-NATIVE-SETUP-15: Boot Service VM (vRouter), VM1 and VM2

nova boot --image Fedora22 --flavor m1.small \
--user-data /opt/stack/opnfv_os_ipv6_poc/metadata.txt \
--availability-zone nova:opnfv-os-compute \
--nic port-id=$(neutron port-list | grep -w eth0-vRouter | awk '{print $2}') \
--nic port-id=$(neutron port-list | grep -w eth1-vRouter | awk '{print $2}') \
--key-name vRouterKey vRouter

nova list

# Please wait for some 10 to 15 minutes so that necessary packages (like radvd)
# are installed and vRouter is up.
nova console-log vRouter

nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny \
--user-data /opt/stack/opnfv_os_ipv6_poc/set_mtu.sh \
--availability-zone nova:opnfv-os-controller \
--nic port-id=$(neutron port-list | grep -w eth0-VM1 | awk '{print $2}') \
--key-name vRouterKey VM1

nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny \
--user-data /opt/stack/opnfv_os_ipv6_poc/set_mtu.sh \
--availability-zone nova:opnfv-os-compute \
--nic port-id=$(neutron port-list | grep -w eth0-VM2 | awk '{print $2}') \
--key-name vRouterKey VM2

nova list # Verify that all the VMs are in ACTIVE state.

OPNFV-NATIVE-SETUP-16: If all goes well, the IPv6 addresses assigned to the VMs will be as shown below:

# vRouter eth0 interface would have the following IPv6 address:
#     2001:db8:0:1:f816:3eff:fe11:1111/64
# vRouter eth1 interface would have the following IPv6 address:
#     2001:db8:0:2::1/64
# VM1 would have the following IPv6 address:
#     2001:db8:0:2:f816:3eff:fe33:3333/64
# VM2 would have the following IPv6 address:
#     2001:db8:0:2:f816:3eff:fe44:4444/64

OPNFV-NATIVE-SETUP-17: Now we need to disable port security on the eth0-VM1, eth0-VM2, eth0-vRouter and eth1-vRouter ports

for port in eth0-VM1 eth0-VM2 eth0-vRouter eth1-vRouter
do
    neutron port-update --no-security-groups $port
    neutron port-update $port --port-security-enabled=False
    neutron port-show $port | grep port_security_enabled
done

OPNFV-NATIVE-SETUP-18: Now we can SSH to the VMs. You can execute the following commands.

# 1. Create a floatingip and associate it with VM1, VM2 and vRouter (to the port id that is passed).
#    Note that the name "ext-net" may work for some installers such as Compass and Joid
#    Change the name "ext-net" to match the name of external network that an installer creates
neutron floatingip-create --port-id $(neutron port-list | grep -w eth0-VM1 | \
awk '{print $2}') ext-net
neutron floatingip-create --port-id $(neutron port-list | grep -w eth0-VM2 | \
awk '{print $2}') ext-net
neutron floatingip-create --port-id $(neutron port-list | grep -w eth1-vRouter | \
awk '{print $2}') ext-net

# 2. To know / display the floatingip associated with VM1, VM2 and vRouter.
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth0-VM1 | awk '{print $2}') | awk '{print $2}'
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth0-VM2 | awk '{print $2}') | awk '{print $2}'
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth1-vRouter | awk '{print $2}') | awk '{print $2}'

# 3. To ssh to the vRouter, VM1 and VM2, user can execute the following command.
ssh -i ~/vRouterKey fedora@<floating-ip-of-vRouter>
ssh -i ~/vRouterKey cirros@<floating-ip-of-VM1>
ssh -i ~/vRouterKey cirros@<floating-ip-of-VM2>
1.3. Setup Manual in OpenStack with Open Daylight L2-Only Environment

If you intend to set up a service VM as an IPv6 vRouter in an environment of OpenStack and Open Daylight L2-only of OPNFV Danube Release, please NOTE that:

  • We SHOULD use the odl-ovsdb-openstack version of Open Daylight Boron in OPNFV Danube Release. Please refer to our Gap Analysis for more information.
  • The hostnames, IP addresses, and usernames in the instructions are examples. Please change them as needed to fit your environment.
  • The instructions apply to both the single-controller-node deployment model and the HA (High Availability) deployment model where multiple controller nodes are used.
  • However, in case of HA, when ipv6-router is created in step SETUP-SVM-11, it could be created on any of the controller nodes. Thus you need to identify on which controller node ipv6-router is created in order to manually spawn the radvd daemon inside the ipv6-router namespace in steps SETUP-SVM-24 through SETUP-SVM-30.
1.3.1. Install OPNFV and Preparation

OPNFV-INSTALL-1: To install OpenStack with Open Daylight L2-only environment of OPNFV Danube Release:

Apex Installer:

Please NOTE that:

  • In OPNFV Danube Release, Apex Installer no longer supports Open Daylight L2-only environment or odl-ovsdb-openstack. Instead, it supports Open Daylight L3 deployment with odl-netvirt-openstack.
  • IPv6 features are not fully supported in Open Daylight L3 with odl-netvirt-openstack yet. It is still a work in progress.
  • Thus we cannot realize Service VM as an IPv6 vRouter using Apex Installer under OpenStack + Open Daylight L3 with odl-netvirt-openstack environment.

Compass Installer:

# HA deployment in OpenStack with Open Daylight L2-only environment
export ISO_URL=file://$BUILD_DIRECTORY/compass.iso
export OS_VERSION=${COMPASS_OS_VERSION}
export OPENSTACK_VERSION=${COMPASS_OPENSTACK_VERSION}
export CONFDIR=$WORKSPACE/deploy/conf/vm_environment
./deploy.sh --dha $CONFDIR/os-odl_l2-nofeature-ha.yml \
--network $CONFDIR/$NODE_NAME/network.yml

# Non-HA deployment in OpenStack with Open Daylight L2-only environment
# Non-HA deployment is currently not supported by Compass installer

Fuel Installer:

# HA deployment in OpenStack with Open Daylight L2-only environment
# Scenario Name: os-odl_l2-nofeature-ha
# Scenario Configuration File: ha_odl-l2_heat_ceilometer_scenario.yaml
# You can use either Scenario Name or Scenario Configuration File Name in "-s" parameter
sudo ./deploy.sh -b <stack-config-uri> -l <lab-name> -p <pod-name> \
-s os-odl_l2-nofeature-ha -i <iso-uri>

# Non-HA deployment in OpenStack with Open Daylight L2-only environment
# Scenario Name: os-odl_l2-nofeature-noha
# Scenario Configuration File: no-ha_odl-l2_heat_ceilometer_scenario.yaml
# You can use either Scenario Name or Scenario Configuration File Name in "-s" parameter
sudo ./deploy.sh -b <stack-config-uri> -l <lab-name> -p <pod-name> \
-s os-odl_l2-nofeature-noha -i <iso-uri>

# Note:
#
# 1. Refer to http://git.opnfv.org/cgit/fuel/tree/deploy/scenario/scenario.yaml for scenarios
# 2. Refer to http://git.opnfv.org/cgit/fuel/tree/ci/README for description of
#    stack configuration directory structure
# 3. <stack-config-uri> is the base URI of stack configuration directory structure
# 3.1 Example: http://git.opnfv.org/cgit/fuel/tree/deploy/config
# 4. <lab-name> and <pod-name> must match the directory structure in stack configuration
# 4.1 Example of <lab-name>: -l devel-pipeline
# 4.2 Example of <pod-name>: -p elx
# 5. <iso-uri> could be local or remote ISO image of Fuel Installer
# 5.1 Example: http://artifacts.opnfv.org/fuel/danube/opnfv-danube.1.0.iso
#
# Please refer to Fuel Installer's documentation for further information and any update

Joid Installer:

# HA deployment in OpenStack with Open Daylight L2-only environment
./deploy.sh -o mitaka -s odl -t ha -l default -f ipv6

# Non-HA deployment in OpenStack with Open Daylight L2-only environment
./deploy.sh -o mitaka -s odl -t nonha -l default -f ipv6

Please NOTE that:

  • You need to refer to installer’s documentation for other necessary parameters applicable to your deployment.
  • You need to refer to Release Notes and installer’s documentation if there is any issue in installation.

OPNFV-INSTALL-2: Clone the following GitHub repository to get the configuration and metadata files

git clone https://github.com/sridhargaddam/opnfv_os_ipv6_poc.git \
/opt/stack/opnfv_os_ipv6_poc
1.3.2. Disable Security Groups in OpenStack ML2 Setup

Please NOTE that although the Security Groups feature is disabled automatically through the local.conf configuration file by some installers such as devstack, other installers such as Apex, Compass, Fuel or Joid are very likely to enable the Security Groups feature after installation.

Please make sure that Security Groups are disabled in the setup.

In order to disable Security Groups globally, please make sure that the settings in OPNFV-SEC-1 and OPNFV-SEC-2 are applied, if they are not there by default.

OPNFV-SEC-1: Change the settings in /etc/neutron/plugins/ml2/ml2_conf.ini as follows, if they are not there by default.

# /etc/neutron/plugins/ml2/ml2_conf.ini
[securitygroup]
enable_security_group = True
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
[ml2]
extension_drivers = port_security
[agent]
prevent_arp_spoofing = False

OPNFV-SEC-2: Change the settings in /etc/nova/nova.conf as follows, if they are not there by default.

# /etc/nova/nova.conf
[DEFAULT]
security_group_api = neutron
firewall_driver = nova.virt.firewall.NoopFirewallDriver

OPNFV-SEC-3: After updating the settings, you will have to restart the Neutron and Nova services.

Please note that the commands for restarting Neutron and Nova vary depending on the installer. Please refer to the relevant documentation of the specific installer.

1.3.3. Source the Credentials in OpenStack Controller Node

SETUP-SVM-1: Log in to the OpenStack Controller Node. Start a new terminal, and change directory to where OpenStack is installed.

SETUP-SVM-2: We have to source the tenant credentials in this step. Please NOTE that the method of sourcing tenant credentials may vary depending on installers. For example:

Apex installer:

# On jump host, source the tenant credentials using /bin/opnfv-util provided by Apex installer
opnfv-util undercloud "source overcloudrc; keystone service-list"

# Alternatively, you can copy the file /home/stack/overcloudrc from the installer VM called "undercloud"
# to a location in controller node, for example, in the directory /opt, and do:
# source /opt/overcloudrc

Compass installer:

# source the tenant credentials using Compass installer of OPNFV
source /opt/admin-openrc.sh

Fuel installer:

# source the tenant credentials using Fuel installer of OPNFV
source /root/openrc

Joid installer:

# source the tenant credentials using Joid installer of OPNFV
source $HOME/joid_config/admin-openrc

devstack:

# source the tenant credentials in devstack
source openrc admin demo

Please refer to relevant documentation of installers if you encounter any issue.

1.3.4. Informational Note: Move Public Network from Physical Network Interface to br-ex

SETUP-SVM-3: Move the physical interface (i.e. the public network interface) to br-ex

SETUP-SVM-4: Verify setup of br-ex

These two steps are informational. The OPNFV installer has taken care of them during deployment. You may refer to them only if there is an issue, or if you are using another installer.

We have to move the physical interface (i.e. the public network interface) to br-ex, including moving the public IP address and setting up the default route. Please refer to SETUP-SVM-3 and SETUP-SVM-4 in our more complete instructions.
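For reference only, a minimal sketch of what such a move typically involves is shown below; the interface name (eth2) and the addresses are assumptions chosen to match the example ext-net subnet used in this guide, and the exact procedure depends on your installer and host networking.

# Illustrative sketch only; interface name and addresses are assumptions
sudo ovs-vsctl add-port br-ex eth2                          # attach the public NIC to br-ex
sudo ip addr del 198.59.156.10/24 dev eth2                  # remove the public IP from the NIC
sudo ip addr add 198.59.156.10/24 dev br-ex                 # re-add it on br-ex
sudo ip link set br-ex up
sudo ip route replace default via 198.59.156.1 dev br-ex    # restore the default route via br-ex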

1.3.5. Create IPv4 Subnet and Router with External Connectivity

SETUP-SVM-5: Create a Neutron router ipv4-router which needs to provide external connectivity.

neutron router-create ipv4-router

SETUP-SVM-6: Create an external network/subnet ext-net using the appropriate values based on the data-center physical network setup.

Please NOTE that you may only need to create the subnet of ext-net because OPNFV installers should have created an external network during installation. When you create the subnet, you must use the same name as the external network that the installer created. For example:

  • Apex installer: external
  • Compass installer: ext-net
  • Fuel installer: admin_floating_net
  • Joid installer: ext-net

Please refer to the documentation of installers if there is any issue

# This is needed only if installer does not create an external network
# Otherwise, skip this command "net-create"
neutron net-create --router:external ext-net

# Note that the name "ext-net" may work for some installers such as Compass and Joid
# Change the name "ext-net" to match the name of external network that an installer creates
neutron subnet-create --disable-dhcp --allocation-pool start=198.59.156.251,\
end=198.59.156.254 --gateway 198.59.156.1 ext-net 198.59.156.0/24

Please note that the IP addresses in the command above are examples. Please replace them with the IP addresses of your actual network.

SETUP-SVM-7: Associate the ext-net to the Neutron router ipv4-router.

# Note that the name "ext-net" may work for some installers such as Compass and Joid
# Change the name "ext-net" to match the name of external network that an installer creates
neutron router-gateway-set ipv4-router ext-net

SETUP-SVM-8: Create an internal/tenant IPv4 network ipv4-int-network1

neutron net-create ipv4-int-network1

SETUP-SVM-9: Create an IPv4 subnet ipv4-int-subnet1 in the internal network ipv4-int-network1

neutron subnet-create --name ipv4-int-subnet1 --dns-nameserver 8.8.8.8 \
ipv4-int-network1 20.0.0.0/24

SETUP-SVM-10: Associate the IPv4 internal subnet ipv4-int-subnet1 to the Neutron router ipv4-router.

neutron router-interface-add ipv4-router ipv4-int-subnet1
1.3.6. Create IPv6 Subnet and Router with External Connectivity

Now, let us create a second neutron router where we can “manually” spawn a radvd daemon to simulate an external IPv6 router.

SETUP-SVM-11: Create a second Neutron router ipv6-router which needs to provide external connectivity

neutron router-create ipv6-router

SETUP-SVM-12: Associate the ext-net to the Neutron router ipv6-router

# Note that the name "ext-net" may work for some installers such as Compass and Joid
# Change the name "ext-net" to match the name of external network that an installer creates
neutron router-gateway-set ipv6-router ext-net

SETUP-SVM-13: Create a second internal/tenant IPv4 network ipv4-int-network2

neutron net-create ipv4-int-network2

SETUP-SVM-14: Create an IPv4 subnet ipv4-int-subnet2 for the ipv6-router internal network ipv4-int-network2

neutron subnet-create --name ipv4-int-subnet2 --dns-nameserver 8.8.8.8 \
ipv4-int-network2 10.0.0.0/24

SETUP-SVM-15: Associate the IPv4 internal subnet ipv4-int-subnet2 to the Neutron router ipv6-router.

neutron router-interface-add ipv6-router ipv4-int-subnet2
1.3.7. Prepare Image, Metadata and Keypair for Service VM

SETUP-SVM-16: Download the Fedora 22 image which will be used as the vRouter

wget https://download.fedoraproject.org/pub/fedora/linux/releases/22/Cloud/x86_64/\
Images/Fedora-Cloud-Base-22-20150521.x86_64.qcow2

glance image-create --name 'Fedora22' --disk-format qcow2 --container-format bare \
--file ./Fedora-Cloud-Base-22-20150521.x86_64.qcow2

SETUP-SVM-17: Create a keypair

nova keypair-add vRouterKey > ~/vRouterKey

SETUP-SVM-18: Create ports for vRouter and both the VMs with some specific MAC addresses.

neutron port-create --name eth0-vRouter --mac-address fa:16:3e:11:11:11 ipv4-int-network2
neutron port-create --name eth1-vRouter --mac-address fa:16:3e:22:22:22 ipv4-int-network1
neutron port-create --name eth0-VM1 --mac-address fa:16:3e:33:33:33 ipv4-int-network1
neutron port-create --name eth0-VM2 --mac-address fa:16:3e:44:44:44 ipv4-int-network1
1.3.8. Boot Service VM (vRouter) with eth0 on ipv4-int-network2 and eth1 on ipv4-int-network1

Let us boot the service VM (vRouter) with eth0 interface on ipv4-int-network2 connecting to ipv6-router, and eth1 interface on ipv4-int-network1 connecting to ipv4-router.

SETUP-SVM-19: Boot the vRouter using Fedora22 image on the OpenStack Compute Node with hostname opnfv-os-compute

nova boot --image Fedora22 --flavor m1.small \
--user-data /opt/stack/opnfv_os_ipv6_poc/metadata.txt \
--availability-zone nova:opnfv-os-compute \
--nic port-id=$(neutron port-list | grep -w eth0-vRouter | awk '{print $2}') \
--nic port-id=$(neutron port-list | grep -w eth1-vRouter | awk '{print $2}') \
--key-name vRouterKey vRouter

Please note that /opt/stack/opnfv_os_ipv6_poc/metadata.txt is used to enable the vRouter to automatically spawn a radvd, and

  • Act as an IPv6 vRouter which advertises the RA (Router Advertisements) with prefix 2001:db8:0:2::/64 on its internal interface (eth1).
  • Forward IPv6 traffic from internal interface (eth1)
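For illustration, a radvd configuration that achieves this behaviour might look like the sketch below; the actual metadata.txt in the cloned repository is authoritative, and this snippet is only an example of the kind of configuration it sets up on the vRouter.

# Illustrative radvd configuration for the vRouter's internal interface (eth1)
interface eth1
{
    AdvSendAdvert on;
    MinRtrAdvInterval 3;
    MaxRtrAdvInterval 10;
    prefix 2001:db8:0:2::/64
    {
        AdvOnLink on;
        AdvAutonomous on;
    };
};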

SETUP-SVM-20: Verify that Fedora22 image boots up successfully and vRouter has ssh keys properly injected

nova list
nova console-log vRouter

Please note that it may take a few minutes for the necessary packages to get installed and ssh keys to be injected.

# Sample Output
[  762.884523] cloud-init[871]: ec2: #############################################################
[  762.909634] cloud-init[871]: ec2: -----BEGIN SSH HOST KEY FINGERPRINTS-----
[  762.931626] cloud-init[871]: ec2: 2048 e3:dc:3d:4a:bc:b6:b0:77:75:a1:70:a3:d0:2a:47:a9   (RSA)
[  762.957380] cloud-init[871]: ec2: -----END SSH HOST KEY FINGERPRINTS-----
[  762.979554] cloud-init[871]: ec2: #############################################################
1.3.9. Boot Two Other VMs in ipv4-int-network1

In order to verify that the setup is working, let us create two cirros VMs with their eth0 interface on ipv4-int-network1, i.e. connecting to the vRouter's eth1 interface for the internal network.

We may have to configure an appropriate MTU on the VMs' interfaces, taking into account the tunneling overhead and any physical switch requirements. If so, push the MTU to the VM either using DHCP options or via meta-data.
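As one hedged example of the DHCP-option approach, the Neutron DHCP agent can be pointed at a custom dnsmasq configuration that forces the interface-MTU option (option 26); the file path and MTU value below are assumptions, and the DHCP agent must be restarted for the change to take effect.

# /etc/neutron/dnsmasq-neutron.conf  (illustrative)
dhcp-option-force=26,1450

# /etc/neutron/dhcp_agent.ini  (illustrative)
[DEFAULT]
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf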

SETUP-SVM-21: Create VM1 on OpenStack Controller Node with hostname opnfv-os-controller

nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny \
--user-data /opt/stack/opnfv_os_ipv6_poc/set_mtu.sh \
--availability-zone nova:opnfv-os-controller \
--nic port-id=$(neutron port-list | grep -w eth0-VM1 | awk '{print $2}') \
--key-name vRouterKey VM1

SETUP-SVM-22: Create VM2 on OpenStack Compute Node with hostname opnfv-os-compute

nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny \
--user-data /opt/stack/opnfv_os_ipv6_poc/set_mtu.sh \
--availability-zone nova:opnfv-os-compute \
--nic port-id=$(neutron port-list | grep -w eth0-VM2 | awk '{print $2}') \
--key-name vRouterKey VM2

SETUP-SVM-23: Confirm that both the VMs are successfully booted.

nova list
nova console-log VM1
nova console-log VM2
1.3.10. Spawn RADVD in ipv6-router

Let us manually spawn a radvd daemon inside ipv6-router namespace to simulate an external router. First of all, we will have to identify the ipv6-router namespace and move to the namespace.

Please NOTE that in case of the HA (High Availability) deployment model where multiple controller nodes are used, the ipv6-router created in step SETUP-SVM-11 could be on any of the controller nodes. Thus you need to identify on which controller node ipv6-router is created in order to manually spawn the radvd daemon inside the ipv6-router namespace in steps SETUP-SVM-24 through SETUP-SVM-30. The following Neutron command will display the controller on which ipv6-router is hosted.

neutron l3-agent-list-hosting-router ipv6-router

Then log in to that controller and execute steps SETUP-SVM-24 through SETUP-SVM-30.

SETUP-SVM-24: Identify the ipv6-router namespace and move to the namespace

sudo ip netns exec qrouter-$(neutron router-list | grep -w ipv6-router | \
awk '{print $2}') bash

SETUP-SVM-25: Upon successful execution of the above command, you will be in the router namespace. Now let us configure the IPv6 address on the <qr-xxx> interface.

export router_interface=$(ip a s | grep -w "global qr-*" | awk '{print $7}')
ip -6 addr add 2001:db8:0:1::1 dev $router_interface

SETUP-SVM-26: Update the sample file /opt/stack/opnfv_os_ipv6_poc/scenario2/radvd.conf with $router_interface.

cp /opt/stack/opnfv_os_ipv6_poc/scenario2/radvd.conf /tmp/radvd.$router_interface.conf
sed -i 's/$router_interface/'$router_interface'/g' /tmp/radvd.$router_interface.conf

SETUP-SVM-27: Spawn a radvd daemon to simulate an external router. This radvd daemon advertises an IPv6 subnet prefix of 2001:db8:0:1::/64 using RA (Router Advertisement) on its $router_interface so that eth0 interface of vRouter automatically configures an IPv6 SLAAC address.

$radvd -C /tmp/radvd.$router_interface.conf -p /tmp/br-ex.pid.radvd -m syslog

SETUP-SVM-28: Add an IPv6 downstream route pointing to the eth0 interface of vRouter.

ip -6 route add 2001:db8:0:2::/64 via 2001:db8:0:1:f816:3eff:fe11:1111

SETUP-SVM-29: The routing table should now look similar to the one shown below.

ip -6 route show
2001:db8:0:1::1 dev qr-42968b9e-62 proto kernel metric 256
2001:db8:0:1::/64 dev qr-42968b9e-62 proto kernel metric 256 expires 86384sec
2001:db8:0:2::/64 via 2001:db8:0:1:f816:3eff:fe11:1111 dev qr-42968b9e-62 proto ra metric 1024 expires 29sec
fe80::/64 dev qg-3736e0c7-7c proto kernel metric 256
fe80::/64 dev qr-42968b9e-62 proto kernel metric 256

SETUP-SVM-30: If all goes well, the IPv6 addresses assigned to the VMs will be as shown below:

# vRouter eth0 interface would have the following IPv6 address:
#     2001:db8:0:1:f816:3eff:fe11:1111/64
# vRouter eth1 interface would have the following IPv6 address:
#     2001:db8:0:2::1/64
# VM1 would have the following IPv6 address:
#     2001:db8:0:2:f816:3eff:fe33:3333/64
# VM2 would have the following IPv6 address:
#     2001:db8:0:2:f816:3eff:fe44:4444/64
1.3.11. Testing to Verify Setup Complete

Now, let us SSH to those VMs, e.g. VM1 and / or VM2 and / or the vRouter, to confirm that they have successfully configured their IPv6 addresses using SLAAC with prefix 2001:db8:0:2::/64 from the vRouter.

We use the floating IP mechanism to achieve SSH access.

SETUP-SVM-31: Now we can SSH to the VMs. You can execute the following commands.

# 1. Create a floatingip and associate it with VM1, VM2 and vRouter (to the port id that is passed).
#    Note that the name "ext-net" may work for some installers such as Compass and Joid
#    Change the name "ext-net" to match the name of external network that an installer creates
neutron floatingip-create --port-id $(neutron port-list | grep -w eth0-VM1 | \
awk '{print $2}') ext-net
neutron floatingip-create --port-id $(neutron port-list | grep -w eth0-VM2 | \
awk '{print $2}') ext-net
neutron floatingip-create --port-id $(neutron port-list | grep -w eth1-vRouter | \
awk '{print $2}') ext-net

# 2. To know / display the floatingip associated with VM1, VM2 and vRouter.
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth0-VM1 | awk '{print $2}') | awk '{print $2}'
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth0-VM2 | awk '{print $2}') | awk '{print $2}'
neutron floatingip-list -F floating_ip_address -F port_id | grep $(neutron port-list | \
grep -w eth1-vRouter | awk '{print $2}') | awk '{print $2}'

# 3. To ssh to the vRouter, VM1 and VM2, user can execute the following command.
ssh -i ~/vRouterKey fedora@<floating-ip-of-vRouter>
ssh -i ~/vRouterKey cirros@<floating-ip-of-VM1>
ssh -i ~/vRouterKey cirros@<floating-ip-of-VM2>

If everything goes well, ssh will be successful and you will be logged into those VMs. Run some commands to verify that IPv6 addresses are configured on the eth0 interface.

SETUP-SVM-32: Show an IPv6 address with a prefix of 2001:db8:0:2::/64

ip address show

SETUP-SVM-33: ping some external IPv6 address, e.g. ipv6-router

ping6 2001:db8:0:1::1

If the above ping6 command succeeds, it implies that vRouter was able to successfully forward the IPv6 traffic to reach external ipv6-router.

2. IPv6 Post Installation Procedures

Congratulations, you have completed the setup of using a service VM to act as an IPv6 vRouter. You have validated the setup based on the instructions in the previous sections. If you want to test your setup further, you can ping6 among VM1, VM2, the vRouter and ipv6-router.
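For example, from inside VM1 you could run something like the following, using the addresses listed in SETUP-SVM-30 (adjust them to the addresses actually assigned in your setup):

ping6 -c 4 2001:db8:0:2:f816:3eff:fe44:4444   # VM2
ping6 -c 4 2001:db8:0:1::1                    # ipv6-router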

This setup allows further open innovation by any third party. For more instructions and documentation, please refer to:

  1. IPv6 Configuration Guide (HTML): http://artifacts.opnfv.org/ipv6/docs/setupservicevm/index.html
  2. IPv6 User Guide (HTML): http://artifacts.opnfv.org/ipv6/docs/gapanalysis/index.html
2.1. Automated post installation activities

Refer to the relevant testing guides, results, and release notes of Yardstick Project.

Using IPv6 Feature of Danube Release
Abstract:

This section provides users with a gap analysis of IPv6 feature requirements against the OpenStack Newton Official Release and the Open Daylight Boron Official Release. The gap analysis serves as a feature-specific user guide and reference for users who leverage the IPv6 feature in the platform and need to perform IPv6-related operations.

For more information, please refer to Neutron's IPv6 document for the Newton Release.

1. IPv6 Gap Analysis with OpenStack Newton

This section provides users with an IPv6 gap analysis of feature requirements against OpenStack Neutron in the Newton Official Release. The following table lists the use cases / feature requirements of VIM-agnostic IPv6 functionality, covering both the infrastructure layer and the VNF (VM) layer, and their gap analysis with OpenStack Neutron in the Newton Official Release.

Use Case / Requirement Supported in Newton Notes
All topologies work in a multi-tenant environment Yes The IPv6 design is following the Neutron tenant networks model; dnsmasq is being used inside DHCP network namespaces, while radvd is being used inside Neutron routers namespaces to provide full isolation between tenants. Tenant isolation can be based on VLANs, GRE, or VXLAN encapsulation. In case of overlays, the transport network (and VTEPs) must be IPv4 based as of today.
IPv6 VM to VM only Yes It is possible to assign IPv6-only addresses to VMs. Both switching (within VMs on the same tenant network) as well as east/west routing (between different networks of the same tenant) are supported.
IPv6 external L2 VLAN directly attached to a VM Yes IPv6 provider network model; RA messages from upstream (external) router are forwarded into the VMs

IPv6 subnet routed via L3 agent to an external IPv6 network

  1. Both VLAN and overlay (e.g. GRE, VXLAN) subnet attached to VMs;
  2. Must be able to support multiple L3 agents for a given external network to support scaling (neutron scheduler to assign vRouters to the L3 agents)
  1. Yes
  2. Yes
Configuration is enhanced since Kilo to allow easier setup of the upstream gateway, without the user being forced to create an IPv6 subnet for the external network.

Ability for a NIC to support both IPv4 and IPv6 (dual stack) address.

  1. VM with a single interface associated with a network, which is then associated with two subnets.
  2. VM with two different interfaces associated with two different networks and two different subnets.
  1. Yes
  2. Yes
Dual-stack is supported in Neutron with the addition of Multiple IPv6 Prefixes Blueprint

Support IPv6 Address assignment modes.

  1. SLAAC
  2. DHCPv6 Stateless
  3. DHCPv6 Stateful
  1. Yes
  2. Yes
  3. Yes
 
Ability to create a port on an IPv6 DHCPv6 Stateful subnet and assign a specific IPv6 address to the port and have it taken out of the DHCP address pool. Yes  
Ability to create a port with fixed_ip for a SLAAC/DHCPv6-Stateless Subnet. No The following patch disables this operation: https://review.openstack.org/#/c/129144/
Support for private IPv6 to external IPv6 floating IP; Ability to specify floating IPs via Neutron API (REST and CLI) as well as via Horizon, including combination of IPv6/IPv4 and IPv4/IPv6 floating IPs if implemented. Rejected Blueprint proposed in upstream and got rejected. General expectation is to avoid NAT with IPv6 by assigning GUA to tenant VMs. See https://review.openstack.org/#/c/139731/ for discussion.
Provide IPv6/IPv4 feature parity in support for pass-through capabilities (e.g., SR-IOV). To-Do The L3 configuration should be transparent for the SR-IOV implementation. SR-IOV networking support introduced in Juno based on the sriovnicswitch ML2 driver is expected to work with IPv4 and IPv6 enabled VMs. We need to verify if it works or not.
Additional IPv6 extensions, for example: IPSEC, IPv6 Anycast, Multicast No It does not appear to be considered yet (lack of clear requirements)
VM access to the meta-data server to obtain user data, SSH keys, etc. using cloud-init with IPv6 only interfaces. No This is currently not supported. Config-drive or dual-stack IPv4 / IPv6 can be used as a workaround (so that the IPv4 network is used to obtain connectivity with the metadata service). The following blog How to Use Config-Drive for Metadata with IPv6 Network provides a neat summary on how to use config-drive for metadata with IPv6 network.
Full support for IPv6 matching (i.e., IPv6, ICMPv6, TCP, UDP) in security groups. Ability to control and manage all IPv6 security group capabilities via Neutron/Nova API (REST and CLI) as well as via Horizon. Yes Both IPTables firewall driver and OVS firewall driver support IPv6 Security Group API.
During network/subnet/router create, there should be an option to allow user to specify the type of address management they would like. This includes all options including those low priority if implemented (e.g., toggle on/off router and address prefix advertisements); It must be supported via Neutron API (REST and CLI) as well as via Horizon Yes

Two new Subnet attributes were introduced to control IPv6 address assignment options:

  • ipv6-ra-mode: to determine who sends Router Advertisements;
  • ipv6-address-mode: to determine how VM obtains IPv6 address, default gateway, and/or optional information.
Security groups anti-spoofing: Prevent VM from using a source IPv6/MAC address which is not assigned to the VM Yes  
Protect tenant and provider network from rogue RAs Yes When using a tenant network, Neutron is going to automatically handle the filter rules to allow connectivity of RAs to the VMs only from the Neutron router port; with provider networks, users are required to specify the LLA of the upstream router during the subnet creation, or otherwise manually edit the security-groups rules to allow incoming traffic from this specific address.
Support the ability to assign multiple IPv6 addresses to an interface; both for Neutron router interfaces and VM interfaces. Yes  
Ability for a VM to support a mix of multiple IPv4 and IPv6 networks, including multiples of the same type. Yes  
IPv6 Support in “Allowed Address Pairs” Extension Yes  
Support for IPv6 Prefix Delegation. Yes Partial support in Newton
Distributed Virtual Routing (DVR) support for IPv6 No In Newton DVR implementation, IPv6 works. But all the IPv6 ingress/ egress traffic is routed via the centralized controller node, i.e. similar to SNAT traffic. A fully distributed IPv6 router is not yet supported in Neutron.
VPNaaS Yes VPNaaS supports IPv6. But this feature is not extensively tested.
FWaaS Yes  
BGP Dynamic Routing Support for IPv6 Prefixes Yes BGP Dynamic Routing supports peering via IPv6 and advertising IPv6 prefixes.
VxLAN Tunnels with IPv6 endpoints. Yes Neutron ML2/OVS supports configuring local_ip with an IPv6 address so that VxLAN tunnels are established with IPv6 addresses. This feature requires OVS 2.6 or higher. An illustrative configuration snippet follows this table.
IPv6 First-Hop Security, IPv6 ND spoofing Yes  
IPv6 support in Neutron Layer3 High Availability (keepalived+VRRP). Yes  
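As a concrete illustration of the "VxLAN Tunnels with IPv6 endpoints" row above, the OVS agent's local_ip can be set to an IPv6 address; the file path and address below are examples only, and OVS 2.6 or higher is required as noted in the table.

# /etc/neutron/plugins/ml2/openvswitch_agent.ini  (illustrative)
[ovs]
local_ip = 2001:db8:100::10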
2. IPv6 Gap Analysis with Open Daylight Boron

This section provides users with an IPv6 gap analysis of feature requirements against the Open Daylight Boron Official Release. The following table lists the use cases / feature requirements of VIM-agnostic IPv6 functionality, covering both the infrastructure layer and the VNF (VM) layer, and their gap analysis with the Open Daylight Boron Official Release.

Open Daylight Boron Status

There are two options in Open Daylight Boron to provide virtualized networks:

  1. Old Netvirt: the netvirt implementation used in the Open Daylight Beryllium Release, identified by the feature odl-ovsdb-openstack.
  2. New Netvirt: the netvirt implementation which will replace the Old Netvirt in future releases, based on a more modular design. It is identified by the feature odl-netvirt-openstack.
Use Case / Requirement | Supported in ODL Boron: Old Netvirt (odl-ovsdb-openstack) / New Netvirt (odl-netvirt-openstack) | Notes

In the rows below, the first value indicates support in the Old Netvirt (odl-ovsdb-openstack) and the second value indicates support in the New Netvirt (odl-netvirt-openstack).

REST API support for IPv6 subnet creation in ODL Yes Yes

Yes, it is possible to create IPv6 subnets in ODL using Neutron REST API.

For a network which has both IPv4 and IPv6 subnets, ODL mechanism driver will send the port information which includes IPv4/v6 addresses to ODL Neutron northbound API. When port information is queried, it displays IPv4 and IPv6 addresses.

IPv6 Router support in ODL:

  1. Communication between VMs on same network
No Yes  

IPv6 Router support in ODL:

  1. Communication between VMs on different networks connected to the same router (east-west)
No Yes  

IPv6 Router support in ODL:

  1. External routing (north-south)
No Work in Progress Work in progress.

IPAM: Support for IPv6 Address assignment modes.

  1. SLAAC
  2. DHCPv6 Stateless
  3. DHCPv6 Stateful
No Yes ODL IPv6 Router supports all the IPv6 Address assignment modes along with Neutron DHCP Agent.
When using ODL for L2 forwarding/tunneling, it is compatible with IPv6. Yes Yes  
Full support for IPv6 matching (i.e. IPv6, ICMPv6, TCP, UDP) in security groups. Ability to control and manage all IPv6 security group capabilities via Neutron/Nova API (REST and CLI) as well as via Horizon Partial Yes Security Groups for IPv6 is supported in the new NetVirt.
Shared Networks support Yes Yes  
IPv6 external L2 VLAN directly attached to a VM. ToDo ToDo  
ODL on an IPv6 only Infrastructure. No Work in Progress Deploying OpenStack with ODL on an IPv6 only infrastructure where the API endpoints are all IPv6 addresses.
VxLAN Tunnels with IPv6 Endpoints No Work in Progress The necessary patches are under review to support this use case for OVS 2.6 or higher version.

Joid

JOID installation instruction
Bare Metal Installations:
Requirements as per Pharos:
Networking:

Minimum 2 networks:

1. First, an Admin/Management network with a gateway to access the external network
2. Second, a floating IP network for tenants to consume floating IPs

NOTE: JOID supports multiple isolated networks for API, data, and storage, based on your network options for OpenStack.

Minimum 6 physical servers

  1. Jump host server:
       Minimum H/W Spec needed
       CPU cores: 16
       Memory: 32 GB
       Hard Disk: 1 (250 GB)
       NIC: if0 (Admin, Management), if1 (external network)
  2. Node servers (minimum 5):
       Minimum H/W Spec
       CPU cores: 16
       Memory: 32 GB
       Hard Disk: 2 (1 TB, SSD preferred); this includes the space for Ceph as well
       NIC: if0 (Admin, Management), if1 (external network)

NOTE: The above configuration is the minimum. For better performance and usage of OpenStack, please consider higher specs for each node.

Make sure all servers are connected to the top-of-rack switch and configured accordingly. No DHCP server should be up and configured. Only the gateways on the eth0 and eth1 networks should be configured to access the network outside your lab.

Jump node configuration:

1. Install the Ubuntu 16.04.1 LTS server version of the OS on the first server.
2. Install the git and bridge-utils packages on the server and configure a minimum of two bridges (brAdm and brExt) on the jump host, for example in /etc/network/interfaces:

   # The loopback network interface
   auto lo
   iface lo inet loopback

   iface if0 inet manual

   auto brAdm
   iface brAdm inet static
       address 10.5.1.1
       netmask 255.255.255.0
       bridge_ports if0

   iface if1 inet manual

   auto brExt
   iface brExt inet static
       address 10.5.15.1
       netmask 255.255.255.0
       bridge_ports if1

NOTE: If you choose to use separate networks for management, public, data and storage, then you need to create a bridge for each interface. In case of VLAN tags, use the appropriate network on the jump host depending upon the VLAN ID on the interface.

Configure JOID for your lab

Get the JOID code from Gerrit

git clone https://gerrit.opnfv.org/gerrit/joid.git

Enable MAAS (labconfig.yaml is mandatory and is the basis for MAAS installation and scenario deployment)

If you have already enabled MAAS for your environment and installed it, then there is no need to enable or install it again. If you have patches from a previous MAAS enablement, you can apply them here.

NOTE: A MAAS that was pre-installed without 03-maasdeploy.sh is not supported. We strongly suggest using 03-maasdeploy.sh to deploy the MAAS and Juju environment.

If enabling MAAS for the first time, then proceed as follows.

  • Create a directory in joid/labconfig/<company name>/<pod number>/, for example:

mkdir joid/labconfig/intel/pod7/

  • Copy labconfig.yaml from an existing pod (e.g. pod5) to pod7:

cp joid/labconfig/intel/pod5/* joid/labconfig/intel/pod7/

labconfig.yaml file
Prerequisite:

1. Make sure the jump host node has been configured with bridges on each interface, so that the appropriate MAAS and Juju bootstrap VMs can be created. For example, if you have three networks (admin, data and floating IP), we suggest naming the bridges brAdm, brData and brExt, respectively.
2. You have the node MAC addresses and power management details (IPMI IP, username, password) of the nodes used for deployment.

modify labconfig.yaml

This file is used to configure your MAAS and the bootstrap node in a VM. Comments in the file are self-explanatory; please fill in the information to match your lab infrastructure. A sample labconfig.yaml can be found at https://gerrit.opnfv.org/gerrit/gitweb?p=joid.git;a=blob;f=labconfig/intel/pod6/labconfig.yaml

lab:
  location: intel
  racks:
  - rack: pod6
    nodes:
    - name: rack-6-m1
      architecture: x86_64
      roles: [network,control]
      nics:
      - ifname: eth1
        spaces: [public]
        mac: ["xx:xx:xx:xx:xx:xx"]
      power:
        type: ipmi
        address: xx.xx.xx.xx
        user: xxxx
        pass: xxxx
    # ... repeat the node entry above for each of the remaining servers ...
    floating-ip-range: 10.5.15.6,10.5.15.250,10.5.15.254,10.5.15.0/24
    ext-port: "eth1"
    dns: 8.8.8.8
opnfv:
  release: d
  distro: xenial
  type: nonha
  openstack: newton
  sdncontroller:
  - type: nosdn
  storage:
  - type: ceph
    disk: /dev/sdb
  feature: odl_l2
  spaces:
  - type: floating
    bridge: brExt
    cidr: 10.5.15.0/24
    gateway: 10.5.15.254
    vlan:
  - type: admin
    bridge: brAdm
    cidr: 10.5.1.0/24
    gateway:
    vlan:
Deployment of OPNFV using JOID:

Once you have made the changes described in the section above, run the following commands to do the automatic deployment.

MAAS Install

After integrating the changes as mentioned above, run the commands below to start the MAAS deployment.

./03-maasdeploy.sh custom <absolute path of config>/labconfig.yaml

or

./03-maasdeploy.sh custom http://<web site location>/labconfig.yaml

For deployment of the Danube release on KVM, please use the following command.

./03-maasdeploy.sh default

OPNFV Install

./deploy.sh -o newton -s nosdn -t nonha -l custom -f none -d xenial -m openstack

NOTE: Possible options are as follows:

Choose which SDN controller to use:
  [-s <nosdn|odl|opencontrail|onos>]
  nosdn: Open vSwitch only and no other SDN.
  odl: OpenDayLight Boron version.
  opencontrail: OpenContrail SDN.
  onos: ONOS framework as SDN.

Mode of OpenStack deployed:
  [-t <nonha|ha|tip>]
  nonha: non-HA mode of OpenStack.
  ha: HA mode of OpenStack.

Which version of OpenStack is deployed:
  [-o <newton|mitaka>]
  newton: Newton version of OpenStack.
  mitaka: Mitaka version of OpenStack.

Where to deploy:
  [-l <custom|default>]
  custom: for bare metal deployment, where labconfig.yaml is provided externally and is not part of JOID.
  default: for virtual deployment, where installation will be done on KVM created using 03-maasdeploy.sh.

Which features to deploy (comma separated list):
  [-f <lxd|dvr|sfc|dpdk|ipv6|none>]
  none: no special feature will be enabled.
  ipv6: IPv6 will be enabled for tenants in OpenStack.
  lxd: with this feature the hypervisor will be LXD rather than KVM.
  dvr: will enable distributed virtual routing.
  dpdk: will enable the DPDK feature.
  sfc: will enable the SFC feature; only supported with an ONOS deployment.

Which Ubuntu distro to use:
  [-d <trusty|xenial>]

Which model to deploy (JOID introduces various models, apart from OpenStack, for Docker-based container workloads):
  [-m <openstack|kubernetes>]
  openstack: OpenStack, which will be used for KVM/LXD container based workloads.
  kubernetes: the Kubernetes model will be used for Docker-based workloads.

OPNFV Scenarios in JOID

The following OPNFV scenarios can be deployed using JOID. A separate YAML bundle will be created to deploy each individual scenario.

Scenario                   Owner    Known Issues
os-nosdn-nofeature-ha      Joid
os-nosdn-nofeature-noha    Joid
os-odl_l2-nofeature-ha     Joid     Floating IPs are not working on this deployment.
os-nosdn-lxd-ha            Joid     Yardstick team is working to support.
os-nosdn-lxd-noha          Joid     Yardstick team is working to support.
os-onos-nofeature-ha       ONOSFW
os-onos-sfc-ha             ONOSFW
k8-nosdn-nofeature-nonha   Joid     No support from Functest and Yardstick.
k8-nosdn-lb-nonha          Joid     No support from Functest and Yardstick.

Troubleshoot

By default, debug is enabled in the script and error messages will be printed on the SSH terminal where you are running the scripts.

To access any control or compute node, use juju ssh <service name>/<instance id>; for example, to log in to the openstack-dashboard container:

juju ssh openstack-dashboard/0
juju ssh nova-compute/0
juju ssh neutron-gateway/0

All charm log files are available under /var/log/juju.

By default, Juju will add the current user's keys for authentication into the deployed servers, and only SSH access will be available.

JOID Configuration guide
JOID Configuration
Scenario 1: ODL L2

./deploy.sh -o newton -s odl -t ha -l custom -f none -d xenial -m openstack

Scenario 2: Nosdn

./deploy.sh -o newton -s nosdn -t ha -l custom -f none -d xenial -m openstack

Scenario 3: ONOS nofeature

./deploy.sh -o newton -s onos -t ha -l custom -f none -d xenial -m openstack

Scenario 4: ONOS with SFC

./deploy.sh -o newton -s onos -t ha -l custom -f sfc -d xenial -m openstack

Scenario 5: Kubernetes core

./deploy.sh -l custom -f none -m kubernetes

JOID User Guide
1. Introduction

This document will explain how to install OPNFV Danube with JOID, including installing JOID, configuring JOID for your environment, and deploying OPNFV with different SDN solutions in HA or non-HA mode. Prerequisites include:

  • An Ubuntu 16.04 LTS Server Jumphost
  • Minimum 2 Networks per Pharos requirement
    • One for the administrative network with gateway to access the Internet
    • One for the OpenStack public network to access OpenStack instances via floating IPs
    • JOID supports multiple isolated networks for data as well as storage based on your network requirement for OpenStack.
  • Minimum 6 Physical servers for bare metal environment
    • Jump Host x 1, minimum H/W configuration:
      • CPU cores: 16
      • Memory: 32GB
      • Hard Disk: 1 (250GB)
      • NIC: eth0 (Admin, Management), eth1 (external network)
    • Control and Compute Nodes x 5, minimum H/W configuration:
      • CPU cores: 16
      • Memory: 32GB
      • Hard Disk: 2 (500GB) prefer SSD
      • NIC: eth0 (Admin, Management), eth1 (external network)

NOTE: The above configuration is the minimum. For better performance and usage of OpenStack, please consider higher specs for all nodes.

Make sure all servers are connected to the top-of-rack switch and configured accordingly. No DHCP server should be up and configured. Configure gateways only on the eth0 and eth1 networks to access the network outside your lab.

2. Orientation
2.1. JOID in brief

JOID, as the Juju OPNFV Infrastructure Deployer, allows you to deploy different combinations of OpenStack release and SDN solution in HA or non-HA mode. For OpenStack, JOID supports Newton and Mitaka. For SDN, it supports Open vSwitch, OpenContrail, OpenDayLight, and ONOS. In addition to HA or non-HA mode, it also supports deploying from the latest development tree.

JOID heavily utilizes the technology developed in Juju and MAAS. Juju is a state-of-the-art, open source, universal model for service oriented architecture and service oriented deployments. Juju allows you to deploy, configure, manage, maintain, and scale cloud services quickly and efficiently on public clouds, as well as on physical servers, OpenStack, and containers. You can use Juju from the command line or through its powerful GUI. MAAS (Metal-As-A-Service) brings the dynamism of cloud computing to the world of physical provisioning and Ubuntu. Connect, commission and deploy physical servers in record time, re-allocate nodes between services dynamically, and keep them up to date; and in due course, retire them from use. In conjunction with the Juju service orchestration software, MAAS will enable you to get the most out of your physical hardware and dynamically deploy complex services with ease and confidence.

For more info on Juju and MAAS, please visit https://jujucharms.com/ and http://maas.ubuntu.com.

2.2. Typical JOID Setup

The MAAS server is installed and configured on Jumphost with Ubuntu 16.04 LTS with access to the Internet. Another VM is created to be managed by MAAS as a bootstrap node for Juju. The rest of the resources, bare metal or virtual, will be registered and provisioned in MAAS. And finally the MAAS environment details are passed to Juju for use.

3. Installation

We will use 03-maasdeploy.sh to automate the deployment of MAAS clusters for use as a Juju provider. MAAS-deployer uses a set of configuration files and simple commands to build a MAAS cluster using virtual machines for the region controller and bootstrap hosts and automatically commission nodes as required so that the only remaining step is to deploy services with Juju. For more information about the maas-deployer, please see https://launchpad.net/maas-deployer.

3.1. Configuring the Jump Host

Let’s get started on the Jump Host node.

The MAAS server is going to be installed and configured on a Jumphost machine. We need to create bridges on the Jump Host prior to setting up the MAAS.

NOTE: For all the commands in this document, please do not use a ‘root’ user account to run. Please create a non root user account. We recommend using the ‘ubuntu’ user.

Install the bridge-utils package on the Jump Host and configure a minimum of two bridges, one for the Admin network, the other for the Public network:

$ sudo apt-get install bridge-utils

$ cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

iface p1p1 inet manual

auto brAdm
iface brAdm inet static
    address 172.16.50.51
    netmask 255.255.255.0
    bridge_ports p1p1

iface p1p2 inet manual

auto brPublic
iface brPublic inet static
    address 10.10.15.1
    netmask 255.255.240.0
    gateway 10.10.10.1
    dns-nameservers 8.8.8.8
    bridge_ports p1p2

NOTE: If you choose to use separate networks for management, data, and storage, then you need to create a bridge for each interface. In case of VLAN tags, make the appropriate network on jump-host depend upon VLAN ID on the interface.

NOTE: The Ethernet device names can vary from one installation to another. Please change the Ethernet device names according to your environment.

MAAS has been integrated in the JOID project. To get the JOID code, please run

$ sudo apt-get install git
$ git clone https://gerrit.opnfv.org/gerrit/p/joid.git
3.2. Setting Up Your Environment for JOID

To set up your own environment, create a directory in joid/labconfig/<company name>/<pod number>/ and copy an existing JOID environment over. For example:

$ cd joid/ci
$ mkdir -p ../labconfig/myown/pod
$ cp ../labconfig/cengn/pod2/labconfig.yaml ../labconfig/myown/pod/

Now let's configure the labconfig.yaml file. Please modify the sections in the labconfig as per your lab configuration.

lab:
  ## Change the name of the lab you want; the MAAS name will be formatted as per location and rack name ##
  location: myown
  racks:
  - rack: pod
    ## Based on your lab hardware, please fill in the nodes accordingly. ##
    # Define one network-and-control node and two control, compute and storage nodes,
    # and the rest as compute and storage for backward compatibility. Again,
    # servers with more disks should be used for compute and storage only.
    nodes:
    # DCOMP4-B, 24cores, 64G, 2disk, 4TBdisk
    - name: rack-2-m1
      architecture: x86_64
      roles: [network,control]
      nics:
      - ifname: eth0
        spaces: [admin]
        mac: ["0c:c4:7a:3a:c5:b6"]
      - ifname: eth1
        spaces: [floating]
        mac: ["0c:c4:7a:3a:c5:b7"]
      power:
        type: ipmi
        address: <bmc ip>
        user: <bmc username>
        pass: <bmc password>
    ## Repeat the above section for the number of hardware nodes you have ##

    ## Define the floating IP range along with the gateway IP to be used for instance floating IPs ##
    floating-ip-range: 172.16.120.20,172.16.120.62,172.16.120.254,172.16.120.0/24
    # Multiple MACs separated by space, where MACs are from ext-ports across all network nodes.
    ## Interface name to be used for floating IPs ##
    # eth1 of m4 since tags for networking are not yet implemented.
    ext-port: "eth1"
    dns: 8.8.8.8
    osdomainname:
opnfv:
  release: d
  distro: xenial
  type: nonha
  openstack: newton
  sdncontroller:
  - type: nosdn
  storage:
  - type: ceph
    ## Define the maximum disk possible in your environment ##
    disk: /dev/sdb
  feature: odl_l2
  ## Ensure the following configuration matches the bridge configuration on your jumphost ##
  spaces:
  - type: admin
    bridge: brAdm
    cidr: 10.120.0.0/24
    gateway: 10.120.0.254
    vlan:
  - type: floating
    bridge: brPublic
    cidr: 172.16.120.0/24
    gateway: 172.16.120.254

Next we will use 03-maasdeploy.sh in joid/ci to kick off the MAAS deployment.

3.3. Starting MAAS deployment

Now run the 03-maasdeploy.sh script with the environment you just created:

~/joid/ci$ ./03-maasdeploy.sh custom ../labconfig/myown/pod/labconfig.yaml

This will take approximately 30 minutes to a couple of hours, depending on your environment. The script will do the following:

  1. Create 1 VM (KVM).
  2. Install MAAS on the Jumphost.
  3. Configure MAAS to enlist and commission a VM for the Juju bootstrap node.
  4. Configure MAAS to enlist and commission bare metal servers.
  5. Download and load 16.04 images to be used by MAAS.

When it’s done, you should be able to view the MAAS webpage (in our example http://172.16.50.2/MAAS) and see 1 bootstrap node and bare metal servers in the ‘Ready’ state on the nodes page.
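If you prefer the command line, the node states can also be checked through the MAAS CLI. The profile name below is an assumption (it is whatever profile you logged in with via maas login), and the exact JSON field names may differ between MAAS versions.

# List hostname and status of all machines known to MAAS (profile name "maas" is illustrative)
maas maas machines read | jq -r '.[] | "\(.hostname) \(.status_name)"'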

3.4. Troubleshooting MAAS deployment

During the installation process, please carefully review the error messages.

Join the IRC channel #opnfv-joid on freenode to ask questions. After the issues are resolved, re-run 03-maasdeploy.sh; it will clean up the VMs created previously, so there is no need to manually undo what has been done.

3.5. Deploying OPNFV

JOID allows you to deploy different combinations of OpenStack release and SDN solution in HA or non-HA mode. For OpenStack, it supports Mitaka and Newton. For SDN, it supports Open vSwitch, OpenContrail, OpenDaylight and ONOS (Open Network Operating System). In addition to HA or non-HA mode, it also supports deploying from the latest development tree (tip).

The deploy.sh script in the joid/ci directory will do all the work for you. For example, the following deploys OpenStack Newton with Open vSwitch in HA mode.

~/joid/ci$  ./deploy.sh -o newton -s nosdn -t ha -l custom -f none -m openstack

Similarly, the following deploys Kubernetes with a load balancer on the pod.

~/joid/ci$  ./deploy.sh -m kubernetes -f lb

Take a look at the deploy.sh script. You will find we support the following for each option:

[-s]
  nosdn: Open vSwitch.
  odl: OpenDayLight Lithium version.
  opencontrail: OpenContrail.
  onos: ONOS framework as SDN.
[-t]
  nonha: NO HA mode of OpenStack.
  ha: HA mode of OpenStack.
  tip: The tip of the development.
[-o]
  mitaka: OpenStack Mitaka version.
  newton: OpenStack Newton version.
[-l]
  default: For virtual deployment where installation will be done on KVM created using ./03-maasdeploy.sh
  custom: Install on bare metal OPNFV defined by labconfig.yaml
[-f]
  none: no special feature will be enabled.
  ipv6: IPv6 will be enabled for tenant in OpenStack.
  dpdk: dpdk will be enabled.
  lxd: virt-type will be lxd.
  dvr: DVR will be enabled.
  lb: Load balancing in case of Kubernetes will be enabled.
[-d]
  xenial: distro to be used is Xenial 16.04
[-a]
  amd64: Only x86 architecture will be used. Future version will support arm64 as well.
[-m]
  openstack: Openstack model will be deployed.
  kubernetes: Kubernetes model will be deployed.

The script will call 01-bootstrap.sh to bootstrap the Juju VM node, then it will call 02-deploybundle.sh with the corresponding parameter values.

./02-deploybundle.sh $opnfvtype $openstack $opnfvlab $opnfvsdn $opnfvfeature $opnfvdistro
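For illustration only, with the option values used elsewhere in this guide substituted in, the call would look roughly like the following; normally deploy.sh performs this call for you.

# Hypothetical example: HA Newton on the custom (bare metal) lab, no SDN, no extra feature, Xenial
./02-deploybundle.sh ha newton custom nosdn none xenial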

The Python script GenBundle.py is used to create bundle.yaml based on the templates defined in the config_tpl/juju2/ directory.

By default debug is enabled in the deploy.sh script and error messages will be printed on the SSH terminal where you are running the scripts. It could take an hour to a couple of hours (maximum) to complete.

You can check the status of the deployment by running this command in another terminal:

$ watch juju status --format tabular

This will refresh the juju status output in tabular format every 2 seconds.

Next we will show you what Juju is deploying and to where, and how you can modify based on your own needs.

3.6. OPNFV Juju Charm Bundles

The magic behind Juju is a collection of software components called charms. They contain all the instructions necessary for deploying and configuring cloud-based services. The charms publicly available in the online Charm Store represent the distilled DevOps knowledge of experts.

A bundle is a set of services with a specific configuration and their corresponding relations that can be deployed together in a single step. Instead of deploying a single service, they can be used to deploy an entire workload, with working relations and configuration. The use of bundles allows for easy repeatability and for sharing of complex, multi-service deployments.

For OPNFV, we have created the charm bundles for each SDN deployment. They are stored in each directory in ~/joid/ci.

We use Juju to deploy a set of charms via a yaml configuration file. You can find the complete format guide for the Juju configuration file here: http://pythonhosted.org/juju-deployer/config.html

In the ‘services’ subsection, we deploy the Ubuntu Xenial charm from the Charm Store. You can deploy the same charm under a different name, such as the second service, ‘nodes-compute’. The third service is named ‘ntp’ and is deployed from the NTP Trusty charm in the Charm Store. The NTP charm is a subordinate charm, which is designed for, and deployed into, the running space of another service unit.

The tags here relate to what we define in the deployment.yaml file for MAAS. When ‘constraints’ is set, Juju will ask its provider, in this case MAAS, to provide a resource with those tags. In this case, Juju is asking MAAS for one resource tagged with control and one resource tagged with compute. Once the resource information is passed to Juju, Juju will start the installation of the specified version of Ubuntu.

In the next subsection, we define the relations between the services. The beauty of Juju and charms is you can define the relation of two services and all the service units deployed will set up the relations accordingly. This makes scaling out a very easy task. Here we add the relation between NTP and the two bare metal services.
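As a hedged sketch (not the exact content of any shipped bundle), the ‘services’ and ‘relations’ subsections described above might look like the following; charm revisions, tag names and relation endpoints depend on your bundle and MAAS configuration.

services:
  nodes:
    charm: "cs:xenial/ubuntu"
    num_units: 1
    constraints: tags=control
  nodes-compute:
    charm: "cs:xenial/ubuntu"
    num_units: 1
    constraints: tags=compute
  ntp:
    charm: "cs:trusty/ntp"
relations:
  - [ "ntp:juju-info", "nodes:juju-info" ]
  - [ "ntp:juju-info", "nodes-compute:juju-info" ]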

Once the relations are established, Juju considers the deployment complete and moves on to the next section.

juju  deploy bundles.yaml

It will start the deployment, walking through the bundle section by section, such as the following:

nova-cloud-controller:
  branch: lp:~openstack-charmers/charms/trusty/nova-cloud-controller/next
  num_units: 1
  options:
    network-manager: Neutron
  to:
    - "lxc:nodes-api=0"

We define a service named ‘nova-cloud-controller’, which is deployed from the ‘next’ branch of the nova-cloud-controller Trusty charm hosted by the openstack-charmers team on Launchpad. The number of units to be deployed is 1. We set the network-manager option to ‘Neutron’. This single service unit will be deployed into an LXC container on unit 0 of the ‘nodes-api’ service.

To find out what other options there are for this particular charm, you can go to the code location at http://bazaar.launchpad.net/~openstack-charmers/charms/trusty/nova-cloud-controller/next/files and the options are defined in the config.yaml file.

Once the service unit is deployed, you can see the current configuration by running juju config:

$ juju config nova-cloud-controller

You can change the value with juju config, for example:

$ juju config nova-cloud-controller network-manager='FlatManager'

Charms encapsulate the operation best practices. The number of options you need to configure should be at the minimum. The Juju Charm Store is a great resource to explore what a charm can offer you. Following the nova-cloud-controller charm example, here is the main page of the recommended charm on the Charm Store: https://jujucharms.com/nova-cloud-controller/trusty/66

If you have any questions regarding Juju, please join the IRC channel #opnfv-joid on freenode for JOID related questions or #juju for general questions.

3.7. Testing Your Deployment

Once juju-deployer is complete, use juju status --format tabular to verify that all deployed units are in the ready state.

Find the openstack-dashboard IP address from the juju status output, and see if you can log in via a web browser. The username and password are admin/openstack.
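One way to locate the address from the command line is to filter the status output by application; the application name ‘openstack-dashboard’ below is an assumption and should match the name used in your bundle.

juju status openstack-dashboard --format tabular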

Optionally, see if you can log in to the Juju GUI. The Juju GUI is on the Juju bootstrap node, which is the second VM you defined in the 03-maasdeploy.sh file. The username and password are admin/admin.

If you deploy OpenDaylight, OpenContrail or ONOS, find the IP address of the web UI and login. Please refer to each SDN bundle.yaml for the login username/password.

3.8. Troubleshooting

Logs are indispensable when it comes time to troubleshoot. If you want to see all the service unit deployment logs, you can run juju debug-log in another terminal. The debug-log command shows the consolidated logs of all Juju agents (machine and unit logs) running in the environment.

To view a single service unit’s deployment log, use juju ssh to access the deployed unit. For example, to log in to the nova-compute unit, run the command below and then look at /var/log/juju/unit-nova-compute-0.log for more info.

$ juju ssh nova-compute/0

Example:

ubuntu@R4N4B1:~$ juju ssh nova-compute/0
Warning: Permanently added '172.16.50.60' (ECDSA) to the list of known hosts.
Warning: Permanently added '3-r4n3b1-compute.maas' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 3.13.0-77-generic x86_64)

* Documentation:  https://help.ubuntu.com/
<skipped>
Last login: Tue Feb  2 21:23:56 2016 from bootstrap.maas
ubuntu@3-R4N3B1-compute:~$ sudo -i
root@3-R4N3B1-compute:~# cd /var/log/juju/
root@3-R4N3B1-compute:/var/log/juju# ls
machine-2.log  unit-ceilometer-agent-0.log  unit-ceph-osd-0.log  unit-neutron-contrail-0.log  unit-nodes-compute-0.log  unit-nova-compute-0.log  unit-ntp-0.log
root@3-R4N3B1-compute:/var/log/juju#

NOTE: By default Juju will add the Ubuntu user keys for authentication into the deployed server and only ssh access will be available.

Once you resolve the error, go back to the jump host to rerun the charm hook with:

$ juju resolved --retry <unit>

If you would like to start over, run juju destroy-environment <environment name> to release the resources, then you can run deploy.sh again.

The following are the common issues we have collected from the community:

  • The right variables are not passed as part of the deployment procedure. Make sure you pass the required options, for example:
./deploy.sh -o newton -s nosdn -t ha -l custom -f none
  • If MAAS was not set up with 03-maasdeploy.sh, the ./clean.sh and juju status commands may hang because the correct MAAS API keys are not listed in the Juju cloud configuration for MAAS. Solution: make sure a MAAS cloud is listed by juju clouds and that the correct MAAS API key has been added (see the sketch after this list).
  • Deployment times out:
    Use the command juju status --format=tabular and make sure all service containers receive an IP address and that they are executing code. Ensure there is no service in the error state.
  • In case the cleanup process hangs, run the juju destroy-model command manually.
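The following is a rough sketch of how to check and, if needed, register the MAAS cloud and its API key with Juju; the cloud name ‘mymaas’ and the definition file name are illustrative only.

# Check what Juju already knows about
juju clouds
juju credentials
# Register a MAAS cloud and its API key if missing (names/files are examples)
juju add-cloud mymaas maas-cloud.yaml
juju add-credential mymaas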

Direct console access via the OpenStack GUI can be quite helpful if you need to login to a VM but cannot get to it over the network. It can be enabled by setting the console-access-protocol in the nova-cloud-controller to vnc. One option is to directly edit the juju-deployer bundle and set it there prior to deploying OpenStack.

nova-cloud-controller:
options:
  console-access-protocol: vnc

To access the console, just click on the instance in the OpenStack GUI and select the Console tab.

4. Post Installation Configuration
4.1. Configuring OpenStack

At the end of the deployment, the admin-openrc with OpenStack login credentials will be created for you. You can source the file and start configuring OpenStack via CLI.

~/joid_config$ cat admin-openrc
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_TENANT_NAME=admin
export OS_AUTH_URL=http://172.16.50.114:5000/v2.0
export OS_REGION_NAME=RegionOne
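A quick sanity check after sourcing the file might look like the following, assuming the legacy nova/neutron CLI clients used elsewhere in this guide are installed on the Jump Host.

source ~/joid_config/admin-openrc
nova service-list      # all OpenStack services should show as enabled/up
neutron net-list       # lists the networks once they have been created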

We have prepared some scripts to help you configure the OpenStack cloud that you just deployed. In each SDN directory, for example joid/ci/opencontrail, there is a ‘scripts’ folder where you can find the scripts. These scripts are created to help you configure a basic OpenStack Cloud to verify the cloud. For more information on OpenStack Cloud configuration, please refer to the OpenStack Cloud Administrator Guide: http://docs.openstack.org/user-guide-admin/. Similarly, for complete SDN configuration, please refer to the respective SDN administrator guide.

Each SDN solution requires slightly different setup. Please refer to the README in each SDN folder. Most likely you will need to modify the openstack.sh and cloud-setup.sh scripts for the floating IP range, private IP network, and SSH keys. Please go through openstack.sh, glance.sh and cloud-setup.sh and make changes as you see fit.

Let’s take a look at those for Open vSwitch and briefly go through each script so you know what to change for your own environment.

~/joid/juju$ ls
configure-juju-on-openstack  get-cloud-images  joid-configure-openstack
4.1.1. openstack.sh

Let’s first look at ‘openstack.sh’. Three functions are defined: configOpenrc(), unitAddress(), and unitMachine().

configOpenrc() {
  cat <<-EOF
      export SERVICE_ENDPOINT=$4
      unset SERVICE_TOKEN
      unset SERVICE_ENDPOINT
      export OS_USERNAME=$1
      export OS_PASSWORD=$2
      export OS_TENANT_NAME=$3
      export OS_AUTH_URL=$4
      export OS_REGION_NAME=$5
EOF
}

unitAddress() {
  if [[ "$jujuver" < "2" ]]; then
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"services\"][\"$1\"][\"units\"][\"$1/$2\"][\"public-address\"]" 2> /dev/null
  else
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"applications\"][\"$1\"][\"units\"][\"$1/$2\"][\"public-address\"]" 2> /dev/null
  fi
}

unitMachine() {
  if [[ "$jujuver" < "2" ]]; then
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"services\"][\"$1\"][\"units\"][\"$1/$2\"][\"machine\"]" 2> /dev/null
  else
      juju status --format yaml | python -c "import yaml; import sys; print yaml.load(sys.stdin)[\"applications\"][\"$1\"][\"units\"][\"$1/$2\"][\"machine\"]" 2> /dev/null
  fi
}

The function configOpenrc() creates the OpenStack login credentials, the function unitAddress() finds the IP address of the unit, and the function unitMachine() finds the machine info of the unit.
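For illustration, the helpers could be exercised manually as below; ‘keystone’ and ‘nova-compute’ are just example application names, and setting jujuver from juju --version is an assumption about how the script determines the Juju version.

jujuver=$(juju --version)
unitAddress keystone 0       # prints the public address of keystone/0
unitMachine nova-compute 0   # prints the machine number of nova-compute/0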

create_openrc() {
   keystoneIp=$(keystoneIp)
   if [[ "$jujuver" < "2" ]]; then
       adminPasswd=$(juju get keystone | grep admin-password -A 5 | grep value | awk '{print $2}' 2> /dev/null)
   else
       adminPasswd=$(juju config keystone | grep admin-password -A 5 | grep value | awk '{print $2}' 2> /dev/null)
   fi

   configOpenrc admin $adminPasswd admin http://$keystoneIp:5000/v2.0 RegionOne > ~/joid_config/admin-openrc
   chmod 0600 ~/joid_config/admin-openrc
}

This finds the IP address of keystone unit 0, writes the OpenStack admin credentials to a new file named ‘admin-openrc’ in the ‘~/joid_config/’ folder, and changes the permissions of the file. It’s important to change the credentials here if you use a different password in the deployed Juju charm bundle.yaml.

neutron net-show ext-net > /dev/null 2>&1 || neutron net-create ext-net \
                                               --router:external=True \
                                               --provider:network_type flat \
                                               --provider:physical_network physnet1

neutron subnet-show ext-subnet > /dev/null 2>&1 || neutron subnet-create ext-net \
  --name ext-subnet --allocation-pool start=$EXTNET_FIP,end=$EXTNET_LIP \
  --disable-dhcp --gateway $EXTNET_GW $EXTNET_NET

This section creates ext-net and ext-subnet, which define the range used for floating IPs.

openstack congress datasource create nova "nova" \
 --config username=$OS_USERNAME \
 --config tenant_name=$OS_TENANT_NAME \
 --config password=$OS_PASSWORD \
 --config auth_url=http://$keystoneIp:5000/v2.0

This section creates the Congress datasource for the various services. Each service datasource has an entry in the file.
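As a hedged example of such an entry, a datasource for Neutron could be added following the same pattern; the driver name ‘neutronv2’ is the upstream Congress driver name and is given here only for illustration.

openstack congress datasource create neutronv2 "neutronv2" \
 --config username=$OS_USERNAME \
 --config tenant_name=$OS_TENANT_NAME \
 --config password=$OS_PASSWORD \
 --config auth_url=http://$keystoneIp:5000/v2.0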

4.1.2. get-cloud-images
folder=/srv/data/
sudo mkdir $folder || true

if grep -q 'virt-type: lxd' bundles.yaml; then
   URLS=" \
   http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-lxc.tar.gz \
   http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-root.tar.gz "

else
   URLS=" \
   http://cloud-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img \
   http://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-disk1.img \
   http://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-disk1.img \
   http://mirror.catn.com/pub/catn/images/qcow2/centos6.4-x86_64-gold-master.img \
   http://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud.qcow2 \
   http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img "
fi

for URL in $URLS
do
FILENAME=${URL##*/}
if [ -f $folder/$FILENAME ];
then
   echo "$FILENAME already downloaded."
else
   wget  -O  $folder/$FILENAME $URL
fi
done

This section of the file downloads the images to the jumphost, if they are not already present, to be used with the OpenStack VIM.

NOTE: The image downloading and uploading might take too long and time out. In this case, use juju ssh glance/0 to log in to the glance unit 0 and run the script again, or manually run the glance commands.

4.1.3. joid-configure-openstack
source ~/joid_config/admin-openrc

First, source the admin-openrc file.

# Upload images to glance
glance image-create --name="Xenial LXC x86_64" --visibility=public --container-format=bare --disk-format=root-tar --property architecture="x86_64" < /srv/data/xenial-server-cloudimg-amd64-root.tar.gz
glance image-create --name="Cirros LXC 0.3" --visibility=public --container-format=bare --disk-format=root-tar --property architecture="x86_64" < /srv/data/cirros-0.3.4-x86_64-lxc.tar.gz
glance image-create --name="Trusty x86_64" --visibility=public --container-format=ovf --disk-format=qcow2 < /srv/data/trusty-server-cloudimg-amd64-disk1.img
glance image-create --name="Xenial x86_64" --visibility=public --container-format=ovf --disk-format=qcow2 < /srv/data/xenial-server-cloudimg-amd64-disk1.img
glance image-create --name="CentOS 6.4" --visibility=public --container-format=bare --disk-format=qcow2 < /srv/data/centos6.4-x86_64-gold-master.img
glance image-create --name="Cirros 0.3" --visibility=public --container-format=bare --disk-format=qcow2 < /srv/data/cirros-0.3.4-x86_64-disk.img

This uploads the images into Glance to be used for creating the VMs.

# adjust tiny image
nova flavor-delete m1.tiny
nova flavor-create m1.tiny 1 512 8 1

Adjust the tiny image profile as the default tiny instance is too small for Ubuntu.

# configure security groups
neutron security-group-rule-create --direction ingress --ethertype IPv4 --protocol icmp --remote-ip-prefix 0.0.0.0/0 default
neutron security-group-rule-create --direction ingress --ethertype IPv4 --protocol tcp --port-range-min 22 --port-range-max 22 --remote-ip-prefix 0.0.0.0/0 default

Open up the ICMP and SSH access in the default security group.

# import key pair
keystone tenant-create --name demo --description "Demo Tenant"
keystone user-create --name demo --tenant demo --pass demo --email demo@demo.demo

nova keypair-add --pub-key id_rsa.pub ubuntu-keypair

Create a project called ‘demo’ and create a user called ‘demo’ in this project. Import the key pair.

# configure external network
neutron net-create ext-net --router:external --provider:physical_network external --provider:network_type flat --shared
neutron subnet-create ext-net --name ext-subnet --allocation-pool start=10.5.8.5,end=10.5.8.254 --disable-dhcp --gateway 10.5.8.1 10.5.8.0/24

This section configures an external network ‘ext-net’ with a subnet called ‘ext-subnet’. In this subnet, the IP pool starts at 10.5.8.5 and ends at 10.5.8.254. DHCP is disabled. The gateway is 10.5.8.1, and the subnet is 10.5.8.0/24. These are the public IPs that will be requested and associated to the instances. Please change the network configuration according to your environment.

# create vm network
neutron net-create demo-net
neutron subnet-create --name demo-subnet --gateway 10.20.5.1 demo-net 10.20.5.0/24

This section creates a private network for the instances. Please change accordingly.

neutron router-create demo-router

neutron router-interface-add demo-router demo-subnet

neutron router-gateway-set demo-router ext-net

This section creates a router and connects this router to the two networks we just created.

# create pool of floating ips
i=0
while [ $i -ne 10 ]; do
  neutron floatingip-create ext-net
  i=$((i + 1))
done

Finally, the script will request 10 floating IPs.

4.1.4. configure-juju-on-openstack

This script can be used to bootstrap Juju on OpenStack, so that Juju can be used as the modelling tool to deploy services and VNFs on top of the OpenStack cloud deployed by JOID.
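At a high level, the steps the script automates are similar to the following sketch; the cloud name, definition file and controller name are illustrative and not the script’s actual values.

# Register the freshly deployed OpenStack as a Juju cloud, add a credential and bootstrap
juju add-cloud myopenstack openstack-cloud.yaml
juju add-credential myopenstack
juju bootstrap myopenstack openstack-controller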

5. Appendix A: Single Node Deployment

By default, running the script ./03-maasdeploy.sh will automatically create the KVM VMs on a single machine and configure everything for you.

if [ ! -e ./labconfig.yaml ]; then
    virtinstall=1
    labname="default"
    cp ../labconfig/default/labconfig.yaml ./
    cp ../labconfig/default/deployconfig.yaml ./

Please change joid/ci/labconfig/default/labconfig.yaml accordingly. The MAAS deployment script will do the following:

  1. Create the bootstrap VM.
  2. Install MAAS on the jumphost.
  3. Configure MAAS to enlist and commission a VM for the Juju bootstrap node.

Later, the 03-maasdeploy.sh script will create three additional VMs and register them into the MAAS Server:

if [ "$virtinstall" -eq 1 ]; then
          sudo virt-install --connect qemu:///system --name $NODE_NAME --ram 8192 --cpu host --vcpus 4 \
                   --disk size=120,format=qcow2,bus=virtio,io=native,pool=default \
                   $netw $netw --boot network,hd,menu=off --noautoconsole --vnc --print-xml | tee $NODE_NAME

          nodemac=`grep  "mac address" $NODE_NAME | head -1 | cut -d '"' -f 2`
          sudo virsh -c qemu:///system define --file $NODE_NAME
          rm -f $NODE_NAME
          maas $PROFILE machines create autodetect_nodegroup='yes' name=$NODE_NAME \
              tags='control compute' hostname=$NODE_NAME power_type='virsh' mac_addresses=$nodemac \
              power_parameters_power_address='qemu+ssh://'$USER'@'$MAAS_IP'/system' \
              architecture='amd64/generic' power_parameters_power_id=$NODE_NAME
          nodeid=$(maas $PROFILE machines read | jq -r '.[] | select(.hostname == '\"$NODE_NAME\"').system_id')
          maas $PROFILE tag update-nodes control add=$nodeid || true
          maas $PROFILE tag update-nodes compute add=$nodeid || true

fi
6. Appendix B: Automatic Device Discovery

If your bare metal servers support IPMI, they can be discovered and enlisted automatically by the MAAS server. You need to configure bare metal servers to PXE boot on the network interface where they can reach the MAAS server. With nodes set to boot from a PXE image, they will start, look for a DHCP server, receive the PXE boot details, boot the image, contact the MAAS server and shut down.

During this process, the MAAS server will be passed information about the node, including the architecture, MAC address and other details which will be stored in the database of nodes. You can accept and commission the nodes via the web interface. When the nodes have been accepted the selected series of Ubuntu will be installed.

7. Appendix C: Machine Constraints

Juju and MAAS together allow you to assign different roles to servers, so that hardware and software can be configured according to their roles. We have briefly mentioned and used this feature in our example. Please visit Juju Machine Constraints https://jujucharms.com/docs/stable/charms-constraints and MAAS tags https://maas.ubuntu.com/docs/tags.html for more information.
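As a brief illustration of the mechanism, a MAAS tag can be requested directly through a Juju constraint when deploying a charm; the tag name ‘control’ is taken from the earlier examples.

juju deploy ubuntu nodes --constraints tags=control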

8. Appendix D: Offline Deployment

When you have limited access policy in your environment, for example, when only the Jump Host has Internet access, but not the rest of the servers, we provide tools in JOID to support the offline installation.

The following package set is provided to those wishing to experiment with a ‘disconnected from the internet’ setup when deploying JOID utilizing MAAS. These instructions provide basic guidance on how to accomplish the task, but it should be noted that, due to the current reliance of MAAS on DNS, the behavior and success of the deployment may vary depending on the infrastructure setup. An official guided setup is in the roadmap for the next release:

  1. Get the packages from here: https://launchpad.net/~thomnico/+archive/ubuntu/ubuntu-cloud-mirrors
NOTE: The mirror is quite large (700GB in size), and does not mirror the SDN repo/ppa.
  2. Additionally, to make Juju use a private repository of charms instead of an external location, configure environments.yaml to use cloudimg-base-url, as described at: https://github.com/juju/docs/issues/757

Kvmfornfv

KVM4NFV Requirements
1. Kvm4nfv Requirements
1.1. Introduction

The NFV hypervisors provide crucial functionality in the NFV Infrastructure (NFVI). The existing hypervisors, however, are not necessarily designed or targeted to meet the requirements of the NFVI.

This document specifies the list of requirements that need to be met as part of this “NFV Hypervisors-KVM” project in Danube release.

As part of this project, we make collaborative efforts towards enabling NFV features.

1.2. Scope and Purpose

The main purpose of this project is to enhance the KVM hypervisor for NFV, by looking at the following areas initially:

  • Minimal Interrupt latency variation for data plane VNFs:
    • Minimal Timing Variation for Timing correctness of real-time VNFs
    • Minimal packet latency variation for data-plane VNFs
  • Inter-VM communication
  • Fast live migration

The output of this project is a list of performance goals, comprehensive instructions for the system configurations, and tools to measure performance and interrupt latency.

1.3. Methods and Instrumentation

The above areas would require software development and/or specific hardware features, and some need just configurations information for the system (hardware, BIOS, OS, etc.).

The right configuration is critical for improving NFV performance/latency. Even on the same code base, different configurations can produce completely different performance/latency results. The configurations that can be tuned as part of this project for a specific scenario are:

  1. Platform Configuration: hardware features such as Power Management, Hyper-Threading, Legacy USB Support/Port 60/64 Emulation and SMI can be configured.
  2. Operating System Configuration: features such as CPU isolation, memory allocation, IRQ affinity, device assignment for VMs, tickless operation, TSC, idle handling, _RCU_NOCB_, disabling RT throttling and NUMA can be configured (see the kernel command line sketch after this list).
  3. Performance/Latency Tuning: application-level configurations such as timers, making the vfio MSI interrupt non-threaded, and enabling Cache Allocation Technology (CAT) can be tuned to improve NFV performance/latency.
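As a sketch of the Operating System Configuration item above, the kernel command line on the host might carry parameters along these lines; the CPU list and hugepage count are purely illustrative and must match your topology.

# Example /etc/default/grub entry (illustrative CPU list and hugepage count)
GRUB_CMDLINE_LINUX="isolcpus=2-7 nohz_full=2-7 rcu_nocbs=2-7 intel_iommu=on hugepagesz=1G hugepages=8"
# Regenerate the grub configuration and reboot for the change to take effect, e.g. on CentOS:
# grub2-mkconfig -o /boot/grub2/grub.cfg && reboot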
1.4. Features to be tested
The tests that need to be conducted to make sure that latency is addressed are:
  1. Timer test
  2. Device Interrupt Test
  3. Packet forwarding (DPDK OVS)
  4. Packet Forwarding (SR-IOV)
  5. Bare-metal Packet Forwarding
1.5. Dependencies
  1. OPNFV Project: “Characterize vSwitch Performance for Telco NFV Use Cases” (VSPERF) for performance evaluation of ivshmem vs. vhost-user.
  2. OPNFV Project: “Pharos” for Test Bed Infrastructure, and possibly “Yardstick” for infrastructure verification.
  3. There are currently no similar projects underway in OPNFV or in an upstream project
  4. The relevant upstream project to be influenced here is QEMU/KVM and libvirt.
  5. In terms of HW dependencies, the aim is to use standard IA Server hardware for this project, as provided by OPNFV Pharos.
KVM4NFV Installation instruction
1. Abstract

This document gives instructions to the user on how to deploy the available KVM4NFV build scenarios verified for the Danube release of the OPNFV platform.

2. KVM4NFV Installation Instruction
2.1. Preparing the installation

The OPNFV project- KVM4NFV (https://gerrit.opnfv.org/gerrit/kvmfornfv.git) is cloned first, to make the build scripts for Qemu & Kernel, Rpms and Debians available.

2.2. HW requirements

These build scripts are triggered on the Jenkins slave build server. Currently, Intel POD10 is used as the test environment for kvm4nfv to execute cyclictest. As part of this test environment, Intel pod10-jump is configured as a Jenkins slave and all the latest build artifacts are downloaded onto it. Intel pod10-node1 is the host on which a guest VM will be launched as part of running cyclictest through Yardstick.

2.3. Build instructions

Builds are possible for the following packages-

kvmfornfv source code

The ./ci/build.sh is the main script used to trigger the Rpms (on ‘centos’) and Debians (on ‘ubuntu’) builds in this case.

  • How to build Kernel/Qemu Rpms- To build rpm packages, build.sh script is run with -p and -o option (i.e. if -p package option is passed as “centos” or in default case). Example:
cd kvmfornfv/

For Kernel/Qemu RPMs,
sh ./ci/build.sh -p centos -o build_output
  • How to build Kernel/Qemu Debians- To build debian packages, build.sh script is run with -p and -o option (i.e. if -p package option is passed as “ubuntu”). Example:
cd kvmfornfv/

For Kernel/Qemu Debians,
sh ./ci/build.sh -p ubuntu -o build_output
  • How to build all Kernel & Qemu, Rpms & Debians- To build both debian and rpm packages, build.sh script is run with -p and -o option (i.e. if -p package option is passed as “both”). Example:
cd kvmfornfv/

For Kernel/Qemu RPMs and Debians,
sh ./ci/build.sh -p both -o build_output

Note

Kvm4nfv can be installed in two ways

  1. As part of a scenario deployment
  2. As a stand alone component

For installation of kvmfornfv as part of a scenario deployment, use the link below:

http://artifacts.opnfv.org/kvmfornfv/docs/index.html#document-scenarios/kvmfornfv.scenarios.description
2.4. Installation instructions

Installation can be done in the following ways-

1. From kvmfornfv source code - The build packages that are prepared in the above section are installed differently depending on the platform.

Please visit the links for each-

2. Using Fuel installer-

  • Please refer to the document present at /fuel-plugin/README.md
2.5. Post-installation activities

After the packages are built, test them by executing the scripts present in ci/envs, which configure the host and the guest respectively.
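A hedged example of that flow, using the host and guest scripts described later in this document (run the host scripts on the host and the guest scripts inside the guest VM):

cd kvmfornfv/ci/envs
sudo ./host-setup0.sh    # on the host, then reboot
sudo ./host-setup1.sh    # on the host, after the reboot
# inside the guest VM: guest-setup0.sh, reboot, then guest-setup1.sh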

3. Release Note for KVM4NFV CICD
3.1. Abstract

This document contains the release notes for the Danube release of OPNFV when using KVM4NFV CICD process.

3.2. Introduction

This section provides a brief introduction to how this configuration is used in the OPNFV release using the KVM4NFV CICD scenario.

Be sure to reference your scenario installation instruction.

3.3. Release Data
Project: NFV Hypervisors-KVM
Repo/tag: kvmfornfv
Release designation:
Release date: 2017-03-27
Purpose of the delivery:
  • Automate the KVM4NFV CICD scenario
  • Executing latency test cases
  • Collection of logs for debugging
3.4. Document version change
The following documents are added-
  • configurationguide
  • installationprocedure
  • userguide
  • overview
  • glossary
  • releasenotes
3.5. Reason for new version
3.5.1. Feature additions
JIRA REFERENCE SLOGAN
JIRA: NFV Hypervisors-KVMFORNFV-34
JIRA: NFV Hypervisors-KVMFORNFV-57
JIRA: NFV Hypervisors-KVMFORNFV-58
JIRA: NFV Hypervisors-KVMFORNFV-59
JIRA: NFV Hypervisors-KVMFORNFV-60
3.6. Known issues

JIRA TICKETS:

JIRA REFERENCE SLOGAN
JIRA: NFV Hypervisors-KVMFORNFV-75
3.7. Workarounds

See JIRA: https://jira.opnfv.org/projects

For more information on the OPNFV Danube release, please visit http://www.opnfv.org/danube

Kvm4nfv Configuration Guide
Danube 1.0
1. Configuration Abstract

This document provides guidance for the configurations available in the Danube release of OPNFV.

The release includes four installer tools leveraging different technologies; Apex, Compass4nfv, Fuel and JOID, which deploy components of the platform.

This document also includes the selection of tools and components including guidelines for how to deploy and configure the platform to an operational state.

2. Configuration Options

OPNFV provides a variety of virtual infrastructure deployments called scenarios designed to host virtualised network functions (VNFs). KVM4NFV scenarios provide specific capabilities and/or components aimed at solving specific problems for the deployment of VNFs. A KVM4NFV scenario includes components such as OpenStack and KVM, with different source components or configurations.

Note

  • Each KVM4NFV scenario provides unique features and capabilities; it is important to understand your target platform capabilities before installing and configuring. This configuration guide outlines how to configure components in order to enable the required features.
  • More details of the kvm4nfv scenarios' installation and description can be found in the scenario guide of the kvm4nfv docs.
4. Low Latency Feature Configuration Description
4.1. Introduction

In KVM4NFV project, we focus on the KVM hypervisor to enhance it for NFV, by looking at the following areas initially

  • Minimal Interrupt latency variation for data plane VNFs:
    • Minimal Timing Variation for Timing correctness of real-time VNFs
    • Minimal packet latency variation for data-plane VNFs
  • Inter-VM communication,
  • Fast live migration
4.2. Configuration of Cyclictest

Cyclictest measures Latency of response to a stimulus. Achieving low latency with the KVM4NFV project requires setting up a special test environment. This environment includes the BIOS settings, kernel configuration, kernel parameters and the run-time environment.

4.2.1. Pre-configuration activities

Intel POD10 is currently used as the OPNFV-KVM4NFV test environment. The RPM packages from the latest build are downloaded onto the Intel POD10 jump server from the artifact repository. Yardstick, running in an Ubuntu Docker container on the Intel POD10 jump server, configures the host (Intel POD10 node1/node2, based on the job type) and the guest, and triggers cyclictest on the guest using the sample yaml files below.

For IDLE-IDLE test,

host_setup_seqs:
- "host-setup0.sh"
- "reboot"
- "host-setup1.sh"
- "host-run-qemu.sh"

guest_setup_seqs:
- "guest-setup0.sh"
- "reboot"
- "guest-setup1.sh"
_images/idle-idle-test.png
For [CPU/Memory/IO]Stress-IDLE tests,

host_setup_seqs:
- "host-setup0.sh"
- "reboot"
- "host-setup1.sh"
- "stress_daily.sh" [cpustress/memory/io]
- "host-run-qemu.sh"

guest_setup_seqs:
- "guest-setup0.sh"
- "reboot"
- "guest-setup1.sh"
_images/stress-idle-test.png

The following scripts are used for configuring host and guest to create a special test environment and achieve low latency.

Note: host-setup0.sh, host-setup1.sh and host-run-qemu.sh are run on the host, followed by guest-setup0.sh and guest-setup1.sh scripts on the guest VM.

host-setup0.sh: Running this script will install the latest kernel RPM on the host and make the following changes to create the special test environment.

  • Isolates CPUs from the general scheduler
  • Stops timer ticks on isolated CPUs whenever possible
  • Stops RCU callbacks on isolated CPUs
  • Enables intel iommu driver and disables DMA translation for devices
  • Sets HugeTLB pages to 1GB
  • Disables machine check
  • Disables clocksource verification at runtime

host-setup1.sh: Running this script will make the following test environment changes.

  • Disabling watchdogs to reduce overhead
  • Disabling RT throttling
  • Reroute interrupts bound to isolated CPUs to CPU 0
  • Change the iptable so that we can ssh to the guest remotely

stress_daily.sh: This script gets triggered only for stress-idle tests. Running it makes the following environment changes.

  • Triggers stress_script.sh, which runs the stress command with necessary options
  • CPU,Memory or IO stress can be applied based on the test type
  • Applying stress only on the Host is handled in D-Release
  • For Idle-Idle test the stress script is not triggered
  • Stress is applied only on the free cores to prevent load on qemu process
Note:

  • On NUMA Node 1, cores 22 and 23 are allocated for the QEMU process
  • Cores 24-43 are used for applying stress

host-run-qemu.sh: Running this script will launch a guest VM on the host.
Note: the guest disk image is downloaded from artifactory.

guest-setup0.sh: Running this script on the guest VM will install the latest build kernel RPM and cyclictest, and make the following configuration changes on the guest VM.

  • Isolates CPUs from the general scheduler
  • Stops timer ticks on isolated CPUs whenever possible
  • Uses polling idle loop to improve performance
  • Disables clocksource verification at runtime

guest-setup1.sh: Running this script on guest vm will do the following configurations.

  • Disable watchdogs to reduce overhead
  • Routes device interrupts to non-RT CPU
  • Disables RT throttling
4.2.2. Hardware configuration

Currently Intel POD10 is used as test environment for kvm4nfv to execute cyclictest. As part of this test environment Intel pod10-jump is configured as jenkins slave and all the latest build artifacts are downloaded on to it.

3. Scenariomatrix

Scenarios are implemented as deployable compositions through integration with an installation tool. OPNFV supports multiple installation tools and for any given release not all tools will support all scenarios. While our target is to establish parity across the installation tools to ensure they can provide all scenarios, the practical challenge of achieving that goal for any given feature and release results in some disparity.

3.1. Danube scenario overview

The following table provides an overview of the installation tools and available scenarios in the Danube release of OPNFV.

Scenario status is indicated by a weather pattern icon. All scenarios listed with a weather pattern are possible to deploy and run in your environment or a Pharos lab, however they may have known limitations or issues as indicated by the icon.

Weather pattern icon legend:

Weather Icon Scenario Status
_images/weather-clear.jpg Stable, no known issues
_images/weather-few-clouds.jpg Stable, documented limitations
_images/weather-overcast.jpg Deployable, stability or feature limitations
_images/weather-dash.jpg Not deployed with this installer

Scenarios that are not yet in a state of “Stable, no known issues” will continue to be stabilised and updates will be made on the stable/danube branch. While we intend that all Danube scenarios should be stable it is worth checking regularly to see the current status. Due to our dependency on upstream communities and code some issues may not be resolved prior to the D release.

3.2. Scenario Naming

In OPNFV scenarios are identified by short scenario names, these names follow a scheme that identifies the key components and behaviours of the scenario. The rules for scenario naming are as follows:

os-[controller]-[feature]-[mode]-[option]

Details of the fields are

  • [os]: mandatory
    • Refers to the platform type used
    • possible value: os (OpenStack)
  • [controller]: mandatory
    • Refers to the SDN controller integrated in the platform
    • example values: nosdn, ocl, odl, onos
  • [feature]: mandatory
    • Refers to the feature projects supported by the scenario
    • example values: nofeature, kvm, ovs, sfc
  • [mode]: mandatory
    • Refers to the deployment type, which may include for instance high availability
    • possible values: ha, noha
  • [option]: optional
    • Used for scenarios that do not fit into the naming scheme.
    • The optional field in the short scenario name should not be included if there is no optional scenario.

Some examples of supported scenario names are:

  • os-nosdn-kvm-noha
    • This is an OpenStack based deployment using neutron including the OPNFV enhanced KVM hypervisor
  • os-onos-nofeature-ha
    • This is an OpenStack deployment in high availability mode including ONOS as the SDN controller
  • os-odl_l2-sfc
    • This is an OpenStack deployment using OpenDaylight and OVS enabled with SFC features
  • os-nosdn-kvm_nfv_ovs_dpdk-ha
    • This is an Openstack deployment with high availability using OVS, DPDK including the OPNFV enhanced KVM hypervisor
    • This deployment has 3-Controller and 2-Compute nodes
  • os-nosdn-kvm_nfv_ovs_dpdk-noha
    • This is an Openstack deployment without high availability using OVS, DPDK including the OPNFV enhanced KVM hypervisor
    • This deployment has 1-Controller and 3-Compute nodes
  • os-nosdn-kvm_nfv_ovs_dpdk_bar-ha
    • This is an Openstack deployment with high availability using OVS, DPDK including the OPNFV enhanced KVM hypervisor and Barometer
    • This deployment has 3-Controller and 2-Compute nodes
  • os-nosdn-kvm_nfv_ovs_dpdk_bar-noha
    • This is an Openstack deployment without high availability using OVS, DPDK including the OPNFV enhanced KVM hypervisor and Barometer
    • This deployment has 1-Controller and 3-Compute nodes
3.3. Installing your scenario

There are two main methods of deploying your target scenario, one method is to follow this guide which will walk you through the process of deploying to your hardware using scripts or ISO images, the other method is to set up a Jenkins slave and connect your infrastructure to the OPNFV Jenkins master.

For the purposes of evaluation and development a number of Danube scenarios are able to be deployed virtually to mitigate the requirements on physical infrastructure. Details and instructions on performing virtual deployments can be found in the installer specific installation instructions.

To set up a Jenkins slave for automated deployment to your lab, refer to the Jenkins slave connect guide.

KVM4NFV User Guide
1. Userguide Abstract

In KVM4NFV project, we focus on the KVM hypervisor to enhance it for NFV, by looking at the following areas initially-

  • Minimal Interrupt latency variation for data plane VNFs:
    • Minimal Timing Variation for Timing correctness of real-time VNFs
    • Minimal packet latency variation for data-plane VNFs
  • Inter-VM communication
  • Fast live migration
2. Userguide Introduction
2.1. Overview

The project “NFV Hypervisors-KVM” makes collaborative efforts to enable NFV features for existing hypervisors, which are not necessarily designed or targeted to meet the requirements for the NFVI. The KVM4NFV scenario consists of Continuous Integration builds, deployments and testing combinations of virtual infrastructure components.

2.2. KVM4NFV Features

Using this project, the following areas are targeted-

  • Minimal Interrupt latency variation for data plane VNFs:
    • Minimal Timing Variation for Timing correctness of real-time VNFs
    • Minimal packet latency variation for data-plane VNFs
  • Inter-VM communication
  • Fast live migration

Some of the above items would require software development and/or specific hardware features, and some need just configurations information for the system (hardware, BIOS, OS, etc.).

We include a requirements gathering stage as a formal part of the project. For each subproject, we will start with an organized requirement stage so that we can determine specific use cases (e.g. what kind of VMs should be live migrated) and requirements (e.g. interrupt latency, jitters, Mpps, migration-time, down-time, etc.) to set out the performance goals.

Potential future projects would include:

  • Dynamic scaling (via scale-out) using VM instantiation
  • Fast live migration for SR-IOV

The user guide outlines how to work with key components and features in the platform, each feature description section will indicate the scenarios that provide the components and configurations required to use it.

The configuration guide details which scenarios are best for you and how to install and configure them.

2.3. General usage guidelines

The user guide for KVM4NFV features and capabilities provide step by step instructions for using features that have been configured according to the installation and configuration instructions.

2.4. Scenarios User Guide

The procedure to deploy/test KVM4NFV scenarios in a nested virtualization environment or on bare metal is described at the link below. The kvm4nfv user guide can be found at docs/scenarios.

http://artifacts.opnfv.org/kvmfornfv/docs/index.html#kvmfornfv-scenarios-overview-and-description

The deployment has been verified for os-nosdn-kvm-ha, os-nosdn-kvm-noha, os-nosdn-kvm_ovs_dpdk-ha, os-nosdn-kvm_ovs_dpdk-noha and os-nosdn-kvm_ovs_dpdk_bar-ha, os-nosdn-kvm_ovs_dpdk_bar-noha test scenarios.

For brief view of the above scenarios use:

http://artifacts.opnfv.org/kvmfornfv/docs/index.html#scenario-abstract
3. Using common platform components

This section outlines basic usage principles and methods for some of the commonly deployed components of supported OPNFV scenarios in Danube. The subsections provide an outline of how these components are commonly used and how to address them in an OPNFV deployment. The components derive from autonomous upstream communities and, where possible, this guide will provide direction to the relevant documentation made available by those communities to better help you navigate the OPNFV deployment.

4. Using Danube Features

The following sections of the user guide provide feature specific usage guidelines and references for KVM4NFV project.

  • <project>/docs/userguide/low_latency.userguide.rst
  • <project>/docs/userguide/live_migration.userguide.rst
  • <project>/docs/userguide/tuning.userguide.rst
5. FTrace Debugging Tool
5.1. About Ftrace

Ftrace is an internal tracer designed to find what is going on inside the kernel. It can be used for debugging or analyzing latencies and performance related issues that take place outside of user-space. Although ftrace is typically considered the function tracer, it is really a framework of several assorted tracing utilities.

One of the most common uses of ftrace is the event tracing.

Note:

  • For KVM4NFV, Ftrace is preferred as it is an in-built kernel tool
  • It is more stable compared to other debugging tools

5.2. Version Features
Release Features
Colorado
  • Ftrace Debugging tool is not implemented in Colorado release of KVM4NFV
Danube
  • Ftrace aids in debugging the KVM4NFV 4.4-linux-kernel level issues
  • Option to disable if not required
5.3. Implementation of Ftrace

Ftrace uses the debugfs file system to hold the control files as well as the files to display output.

When debugfs is configured into the kernel (which selecting any ftrace option will do) the directory /sys/kernel/debug will be created. To mount this directory, you can add to your /etc/fstab file:

debugfs       /sys/kernel/debug          debugfs defaults        0       0

Or you can mount it at run time with:

mount -t debugfs nodev /sys/kernel/debug

Some configurations for Ftrace are used for other purposes, like finding latency or analyzing the system. For the purpose of debugging, the kernel configuration parameters that should be enabled are:

CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_STACK_TRACER=y
CONFIG_DYNAMIC_FTRACE=y

The above parameters must be enabled in /boot/config-4.4.0-el7.x86_64, i.e. the kernel config file, for ftrace to work. If they are not enabled, change the parameters to y and run the following.

On CentOS
grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot

Re-check the parameters after reboot before running ftrace.
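One way to re-check them, assuming the kernel config path from the example above:

grep -E 'CONFIG_(FUNCTION_TRACER|FUNCTION_GRAPH_TRACER|STACK_TRACER|DYNAMIC_FTRACE)=' /boot/config-4.4.0-el7.x86_64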

5.4. Files in Ftrace:

The below is a list of few major files in Ftrace.

current_tracer:

This is used to set or display the current tracer that is configured.

available_tracers:

This holds the different types of tracers that have been compiled into the kernel. The tracers listed here can be configured by echoing their name into current_tracer.

tracing_on:

This sets or displays whether writing to the trace ring buffer is enabled. Echo 0 into this file to disable the tracer or 1 to enable it.

trace:

This file holds the output of the trace in a human readable format.

tracing_cpumask:

This is a mask that lets the user only trace on specified CPUs. The format is a hex string representing the CPUs.

events:

It holds event tracepoints (also known as static tracepoints) that have been compiled into the kernel. It shows what event tracepoints exist and how they are grouped by system.
5.5. Available Tracers

Here is the list of current tracers that may be configured based on usage.

  • function
  • function_graph
  • irqsoff
  • preemptoff
  • preemptirqsoff
  • wakeup
  • wakeup_rt

Brief about a few:

function:

Function call tracer to trace all kernel functions.

function_graph:

Similar to the function tracer except that the function tracer probes the functions on their entry whereas the function graph tracer traces on both entry and exit of the functions.

nop:

This is the “trace nothing” tracer. To remove tracers from tracing simply echo “nop” into current_tracer.

Examples:

To list available tracers:
[tracing]# cat available_tracers
function_graph function wakeup wakeup_rt preemptoff irqsoff preemptirqsoff nop

Usage:
[tracing]# echo function > current_tracer
[tracing]# cat current_tracer
function

To view output:
[tracing]# cat trace | head -10

To Stop tracing:
[tracing]# echo 0 > tracing_on

To Start/restart tracing:
[tracing]# echo 1 > tracing_on;
5.6. Ftrace in KVM4NFV

Ftrace is part of the KVM4NFV D-Release. The KVM4NFV-built 4.4 Linux kernel will be tested by executing cyclictest and analyzing the generated results/latency values (max, min, avg). Ftrace, or the function tracer, is a stable in-built kernel debugging tool which traces the real-time kernel and outputs a log. These output logs are useful in the following ways.

  • Kernel Debugging.
  • Helps in Kernel code optimization and
  • Can be used to better understand the kernel level code flow

Ftrace logs for KVM4NFV can be found here:

5.7. Ftrace Usage in KVM4NFV Kernel Debugging:

Kvm4nfv has two scripts in /ci/envs to provide ftrace tool:

  • enable_trace.sh
  • disable_trace.sh
They can be found at:
$ cd kvmfornfv/ci/envs
5.8. Enabling Ftrace in KVM4NFV

The enable_trace.sh script is triggered by changing the ftrace_enable value in the test_kvmfornfv.sh script to 1 (it is zero by default). Change it as below to enable Ftrace.

ftrace_enable=1

Note:

  • Ftrace is enabled before
5.9. Details of enable_trace script
  • CPU coremask is calculated using getcpumask()
  • All the required events are enabled by,
    echoing “1” to $TRACEDIR/events/event_name/enable file

Example,

$TRACEDIR = /sys/kernel/debug/tracing/
sudo bash -c "echo 1 > $TRACEDIR/events/irq/enable"
sudo bash -c "echo 1 > $TRACEDIR/events/task/enable"
sudo bash -c "echo 1 > $TRACEDIR/events/syscalls/enable"

The set_event file contains all the enabled events list

  • The function tracer is selected. It may be changed to other available tracers based on the requirement:
sudo bash -c "echo function > $TRACEDIR/current_tracer"
  • When tracing is turned on by setting tracing_on=1, the trace file keeps getting appended with the traced data until tracing_on=0, after which the ftrace buffer gets cleared.
To Stop/Pause,
echo 0 >tracing_on;

To Start/Restart,
echo 1 >tracing_on;
  • Once tracing is disabled, the disable_trace.sh script is triggered.
5.10. Details of disable_trace Script

In disable trace script the following are done:

  • The trace file is copied and moved to /tmp folder based on timestamp
  • The current tracer file is set to nop
  • The set_event file is cleared i.e., all the enabled events are disabled
  • Kernel Ftrace is disabled/unmounted
5.11. Publishing Ftrace logs:

The generated trace log is pushed to artifacts by the kvmfornfv-upload-artifact.sh script available in releng, which is triggered as a part of the kvm4nfv daily job. The relevant part of the script is:

echo "Uploading artifacts for future debugging needs...."
gsutil cp -r $WORKSPACE/build_output/log-*.tar.gz $GS_LOG_LOCATION > $WORKSPACE/gsutil.log 2>&1
6. KVM4NFV Dashboard Guide
6.1. Dashboard for KVM4NFV Daily Test Results
6.2. Abstract

This chapter explains the procedure to configure InfluxDB and Grafana on Node1 or Node2, depending on the test type, to publish KVM4NFV test results. The cyclictest cases are executed and the results are published on the Yardstick Dashboard (Grafana). InfluxDB is the database that stores the cyclictest results, and Grafana is a visualisation suite used to view the maximum, minimum and average values of the time series data of cyclictest results. The framework is shown in the image below.

_images/dashboard-architecture.png
6.3. Version Features
Release Features
Colorado
  • Data published in Json file format
  • No database support to store the test’s latency values of cyclictest
  • For each run, the previous run’s output file is replaced with a new file containing the current latency values.
Danube
  • Test results are stored in Influxdb
  • Graphical representation of the latency values using Grafana suite. (Dashboard)
  • Supports graphical view for multiple testcases and test-types (Stress/Idle)
6.4. Installation Steps:

To configure Yardstick, InfluxDB and Grafana for the KVM4NFV project, the following sequence of steps is followed:

Note:

All the below steps are done as per the script, which is a part of CICD integration of kvmfornfv.

For Yardstick:
git clone https://gerrit.opnfv.org/gerrit/yardstick

For InfluxDB:
docker pull tutum/influxdb
docker run -d --name influxdb -p 8083:8083 -p 8086:8086 --expose 8090 --expose 8099 tutum/influxdb
docker exec -it influxdb bash
$influx
>CREATE USER root WITH PASSWORD 'root' WITH ALL PRIVILEGES
>CREATE DATABASE yardstick;
>use yardstick;
>show MEASUREMENTS;

For Grafana:
docker pull grafana/grafana
docker run -d --name grafana -p 3000:3000 grafana/grafana

The Yardstick document for Grafana and InfluxDB configuration can be found here.

6.5. Configuring the Dispatcher Type:

The dispatcher type needs to be configured in /etc/yardstick/yardstick.conf, depending on the dispatcher method used to store the cyclictest results. A sample yardstick.conf can be found at /yardstick/etc/yardstick.conf.sample, which can be copied to /etc/yardstick.

mkdir -p /etc/yardstick/
cp /yardstick/etc/yardstick.conf.sample /etc/yardstick/yardstick.conf

Dispatcher Modules:

Three types of dispatcher methods are available to store the cyclictest results.

  • File
  • InfluxDB
  • HTTP

1. File: The default dispatcher module is file. If the dispatcher module is configured as a file, the test results are stored in a temporary file yardstick.out (default path: /tmp/yardstick.out). The dispatcher module of the “Verify Job” is the default, so the results of verify jobs are stored in the yardstick.out file. Storing all the verify jobs in the InfluxDB database would cause redundancy of latency values; hence, the file output format is preferred.

[DEFAULT]
debug = False
dispatcher = file

[dispatcher_file]
file_path = /tmp/yardstick.out
max_bytes = 0
backup_count = 0

2. Influxdb: If the dispatcher module is configured as influxdb, the test results are stored in InfluxDB. Users can check the test results stored in InfluxDB on Grafana, which is used to visualize the time series data.

To configure influxdb, the following content in /etc/yardstick/yardstick.conf needs to be updated:

[DEFAULT]
debug = False
dispatcher = influxdb

[dispatcher_influxdb]
timeout = 5
target = http://127.0.0.1:8086  ##Mention the IP where influxdb is running
db_name = yardstick
username = root
password = root

The dispatcher module of the “Daily Job” is influxdb, so the results are stored in InfluxDB and then published to the Dashboard.

3. HTTP: If the dispatcher module is configured as http, users can check the test results on the OPNFV testing dashboard, which uses MongoDB as a backend.

[DEFAULT]
debug = False
dispatcher = http

[dispatcher_http]
timeout = 5
target = http://127.0.0.1:8000/results
_images/UseCaseDashboard.png
6.5.1. Detailing the dispatcher module in verify and daily Jobs:

KVM4NFV updates the dispatcher module in the yardstick configuration file (/etc/yardstick/yardstick.conf) depending on the job type (Verify/Daily). Once the test is completed, results are published to the respective dispatcher modules.

Dispatcher module is configured for each Job type as mentioned below.

  1. Verify Job : The default “DISPATCHER_TYPE”, i.e. file (/tmp/yardstick.out), is used. The user can also see the test results in the Jenkins console log.
*"max": "00030", "avg": "00006", "min": "00006"*
  2. Daily Job : The OPNFV InfluxDB URL is configured as the dispatcher module.
DISPATCHER_TYPE=influxdb
DISPATCHER_INFLUXDB_TARGET="http://104.197.68.199:8086"

Influxdb only supports line protocol, and the json protocol is deprecated.

For example, the raw_result of cyclictest in json format is:
"benchmark": {
     "timestamp": 1478234859.065317,
     "errors": "",
     "data": {
        "max": "00012",
        "avg": "00008",
        "min": "00007"
     },
   "sequence": 1
   },
  "runner_id": 23
}
With the help of “influxdb_line_protocol”, the JSON is transformed into a line string:
'kvmfornfv_cyclictest_idle_idle,deploy_scenario=unknown,host=kvm.LF,
installer=unknown,pod_name=unknown,runner_id=23,scenarios=Cyclictest,
task_id=e7be7516-9eae-406e-84b6-e931866fa793,version=unknown
avg="00008",max="00012",min="00007" 1478234859065316864'

The InfluxDB API then posts the data in line-protocol format into the database.
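
For illustration, an equivalent line-protocol point could be written manually through InfluxDB's HTTP write endpoint (most tags trimmed here for brevity):

curl -i -XPOST 'http://127.0.0.1:8086/write?db=yardstick' --data-binary \
  'kvmfornfv_cyclictest_idle_idle,host=kvm.LF avg="00008",max="00012",min="00007" 1478234859065316864'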

Displaying Results on Grafana dashboard:

  • Once the test results are stored in InfluxDB, the dashboard configuration file (JSON) used to display the cyclictest results on Grafana needs to be created by following the Grafana procedure and then pushed into the yardstick repo.

  • Grafana can be accessed using the credentials opnfv/opnfv and is used for visualizing the collected test data, as shown in the screenshots below.
_images/Dashboard-screenshot-1.png
_images/Dashboard-screenshot-2.png
6.6. Understanding Kvm4nfv Grafana Dashboard

The Kvm4nfv dashboard found at http://testresults.opnfv.org/ currently supports a graphical view of cyclictest. To view the Kvm4nfv dashboard, use:

http://testresults.opnfv.org/grafana/dashboard/db/kvmfornfv-cyclictest

The login details are:

    Username: opnfv
    Password: opnfv
The JSON of the kvmfornfv-cyclictest dashboard can be found at:

$ git clone https://gerrit.opnfv.org/gerrit/yardstick.git
$ cd yardstick/dashboard
$ cat KVMFORNFV-Cyclictest

The Dashboard has four tables, each representing a specific test-type of the cyclictest case:

  • Kvmfornfv_Cyclictest_Idle-Idle
  • Kvmfornfv_Cyclictest_CPUstress-Idle
  • Kvmfornfv_Cyclictest_Memorystress-Idle
  • Kvmfornfv_Cyclictest_IOstress-Idle

Note:

  • For all graphs, the X-axis is marked with time stamps and the Y-axis with latency values in microseconds.

A brief about what each graph of the dashboard represents:

6.6.1. 1. Idle-Idle Graph

Idle-Idle graph displays the Average, Maximum and Minimum latency values obtained by running Idle_Idle test-type of the cyclictest. Idle_Idle implies that no stress is applied on the Host or the Guest.

_images/Idle-Idle.png
6.6.2. 2. CPU_Stress-Idle Graph

Cpu_Stress-Idle graph displays the Average, Maximum and Minimum latency values obtained by running Cpu-stress_Idle test-type of the cyclictest. Cpu-stress_Idle implies that CPU stress is applied on the Host and no stress on the Guest.

_images/Cpustress-Idle.png
6.6.3. 3. Memory_Stress-Idle Graph

Memory_Stress-Idle graph displays the Average, Maximum and Minimum latency values obtained by running Memory-stress_Idle test-type of the Cyclictest. Memory-stress_Idle implies that Memory stress is applied on the Host and no stress on the Guest.

_images/Memorystress-Idle.png
6.6.4. 4. IO_Stress-Idle Graph

IO_Stress-Idle graph displays the Average, Maximum and Minimum latency values obtained by running IO-stress_Idle test-type of the Cyclictest. IO-stress_Idle implies that IO stress is applied on the Host and no stress on the Guest.

_images/IOstress-Idle.png
6.7. Future Scope

The future work will include adding the kvmfornfv_Packet-forwarding test results into Grafana and influxdb.

7. Low Latency Environment

Achieving low latency with the KVM4NFV project requires setting up a special test environment. This environment includes the BIOS settings, kernel configuration, kernel parameters and the run-time environment.

7.1. Hardware Environment Description

BIOS setup plays an important role in achieving real-time latency. A collection of relevant settings, used on the platform where the baseline performance data was collected, is detailed below:

7.1.1. CPU Features

Some special CPU features, like the TSC-deadline timer, invariant TSC and posted interrupts, are helpful for latency reduction.

7.1.2. CPU Topology

NUMA topology is also important for latency reduction.

7.1.3. BIOS Setup

Careful BIOS setup is important in achieving real time latency. Different platforms have different BIOS setups, below are the important BIOS settings on the platform used to collect the baseline performance data.

7.2. Software Environment Setup

Both the host and the guest environment need to be configured properly to reduce latency variations. Below are some suggested kernel configurations. The ci/envs/ directory gives detailed implementation on how to setup the environment.

7.2.1. Kernel Parameter

Please check the default kernel configuration in the source code at: kernel/arch/x86/configs/opnfv.config.

Below is a host kernel boot line example:

isolcpus=11-15,31-35 nohz_full=11-15,31-35 rcu_nocbs=11-15,31-35
iommu=pt intel_iommu=on default_hugepagesz=1G hugepagesz=1G mce=off idle=poll
intel_pstate=disable processor.max_cstate=1 pcie_aspm=off tsc=reliable

Below is a guest kernel boot line example:

isolcpus=1 nohz_full=1 rcu_nocbs=1 mce=off idle=poll default_hugepagesz=1G
hugepagesz=1G

Please refer to tuning.userguide for more explanation.
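The effective host settings can be verified with standard interfaces, for example (a sketch; the isolated-CPU sysfs file is available on recent kernels):

cat /proc/cmdline                      # confirm isolcpus, nohz_full, rcu_nocbs and hugepage options
cat /sys/devices/system/cpu/isolated   # CPUs isolated from the kernel scheduler
grep -i huge /proc/meminfo             # confirm the 1G hugepages were reserved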

7.2.2. Run-time Environment Setup

Not only are special kernel parameters needed, but a special run-time environment is also required. Please refer to the tuning.userguide for more explanation.

7.3. Test cases to measure Latency

The performance of kvm4nfv is assessed by the latency values. The Cyclictest and Packet Forwarding test cases produce real-time latency values: average, minimum and maximum.

  • Cyclictest
  • Packet Forwarding test
7.4. 1. Cyclictest case

Cyclictest results are the most frequently cited real-time Linux metric. The core concept of Cyclictest is very simple. In KVM4NFV, cyclictest is implemented on the Guest-VM with the 4.4-kernel RPM installed. It generates Max, Min and Avg values which help in assessing the kernel used. Cyclictest is currently divided into the following test types:

  • Idle-Idle
  • CPU_stress-Idle
  • Memory_stress-Idle
  • IO_stress-Idle

Future scope of work may include the below test-types,

  • CPU_stress-CPU_stress
  • Memory_stress-Memory_stress
  • IO_stress-IO_stress
7.4.1. Understanding the naming convention
[Host-Type ] - [Guest-Type]
  • Host-Type : Mentions the type of stress applied on the kernel of the Host
  • Guest-Type : Mentions the type of stress applied on the kernel of the Guest

Example:

Idle - CPU_stress

The above name signifies that,

  • No Stress is applied on the Host kernel
  • CPU Stress is applied on the Guest kernel

Note:

  • Stress is applied using the stress tool, which is installed as part of the deployment. Stress can be applied on CPU, Memory and Input-Output (Read/Write) operations using the stress tool; illustrative invocations are shown below.
7.4.2. Version Features
Test Name                 Colorado    Danube
Idle - Idle               Y           Y
Cpustress - Idle                      Y
Memorystress - Idle                   Y
IOstress - Idle                       Y
7.4.3. Idle-Idle test-type

Cyclictest is run on the Guest VM when the Host and Guest are not under any kind of stress. This is the basic cyclictest of the KVM4NFV project. It outputs Avg, Min and Max latency values.

_images/idle-idle-test-type.png
7.4.4. CPU_Stress-Idle test-type

Here, the host is under CPU stress: the sqrt() function is called repeatedly in the kernel, which results in increased CPU load. The cyclictest runs on the guest, where the guest is under no stress. It outputs Avg, Min and Max latency values.

_images/cpu-stress-idle-test-type.png
7.4.5. Memory_Stress-Idle test-type

In this type, the host is under memory stress, where continuous memory operations are performed to increase the memory stress (buffer stress). The cyclictest runs on the guest, where the guest is under no stress. It outputs Avg, Min and Max latency values.

_images/memory-stress-idle-test-type.png
7.4.6. IO_Stress-Idle test-type

The host is under constant Input/Output stress, i.e., multiple read-write operations are invoked to increase stress. Cyclictest runs on the guest VM that is launched on the same host, where the guest is under no stress. It outputs Avg, Min and Max latency values.

_images/io-stress-idle-test-type.png
7.4.7. CPU_Stress-CPU_Stress test-type

Not implemented for Danube release.

7.4.8. Memory_Stress-Memory_Stress test-type

Not implemented for Danube release.

7.4.9. IO_Stress-IO_Stress test type

Not implemented for Danube release.

7.5. 2. Packet Forwarding Test cases

Packet forwarding is another test case of Kvm4nfv. It measures the time taken by a packet to return to the source after reaching its destination. This test case uses the automated test framework provided by the OPNFV VSWITCHPERF project and a traffic generator (IXIA is used for kvm4nfv). Only the test cases that generate latency results are triggered as part of the kvm4nfv daily job.

Latency test measures the time required for a frame to travel from the originating device through the network to the destination device. Please note that RFC2544 Latency measurement will be superseded with a measurement of average latency over all successfully transferred packets or frames.

Packet forwarding test cases currently supports the following test types:

  • Packet forwarding to Host
  • Packet forwarding to Guest
  • Packet forwarding to Guest using SRIOV

The testing approach adopted is black box testing, meaning the test inputs can be generated and the outputs captured and completely evaluated from outside of the System Under Test (SUT).

7.5.1. Packet forwarding to Host

This is also known as the Physical port → vSwitch → physical port deployment. This test measures the time taken by the packet/frame generated by the traffic generator (phy) to travel through the network to the destination device (phy). The test results in min, avg and max latency values, which signify the performance of the installed kernel.

Packet flow,

_images/host_pk_fw.png
7.5.2. Packet forwarding to Guest

This is also known as Physical port → vSwitch → VNF → vSwitch → physical port deployment.

This test measures the time taken by the packet/frame generated by the traffic generator (phy) to travel through the network, involving a guest, to the destination device (phy). The test results in min, avg and max latency values, which signify the performance of the installed kernel.

Packet flow,

_images/guest_pk_fw.png
7.5.3. Packet forwarding to Guest using SRIOV

This test is used to verify the VNF and measure the base performance (maximum forwarding rate in fps and latency) that can be achieved by the VNF without a vSwitch. The performance metrics collected by this test will serve as a key comparison point for NIC passthrough technologies and vSwitches. VNF in this context refers to the hypervisor and the VM.

Note: The Vsperf running on the host is still required.

Packet flow,

_images/sriov_pk_fw.png
8. Fast Live Migration

The NFV project requires fast live migration. The specific requirement is total live migration time < 2Sec, while keeping the VM down time < 10ms when running DPDK L2 forwarding workload.

We measured the baseline data of migrating an 8GiB guest running a DPDK L2 forwarding workload and observed that the total live migration time was 2271ms while the VM downtime was 26ms. Both of these indicators failed to satisfy the requirements.

8.1. Current Challenges

The following 4 features have been developed over the years to make the live migration process faster.

  • XBZRLE:
    Helps to reduce the network traffic by just sending the compressed data.
  • RDMA:
    Uses a specific NIC to increase the efficiency of data transmission.
  • Multi thread compression:
    Compresses the data before transmission.
  • Auto convergence:
    Reduces the data rate of dirty pages.

Tests show none of the above features can satisfy the requirement of NFV. XBZRLE and Multi thread compression do the compression entirely in software and they are not fast enough in a 10Gbps network environment. RDMA is not flexible because it has to transport all the guest memory to the destination without zero page optimization. Auto convergence is not appropriate for NFV because it will impact guest’s performance.

So we need to find other ways for optimization.

8.2. Optimizations
  1. Delay non-emergency operations: By profiling, it was discovered that some of the cleanup operations during the stop-and-copy stage are the main reason for the long VM down time. The cleanup operations include stopping the dirty page logging, which is a time consuming operation. By deferring these operations until the data transmission is completed, the VM down time is reduced to about 5-7ms.
  2. Optimize zero page checking: Currently QEMU uses the SSE2 instruction to optimize zero page checking. The SSE2 instruction can process 16 bytes per instruction. By using the AVX2 instruction, we can process 32 bytes per instruction. Testing shows that using AVX2 can speed up the zero page checking process by about 25%.
  3. Remove unnecessary context synchronization: The CPU context was being synchronized twice during live migration. Removing this unnecessary synchronization shortened the VM downtime by about 100us.
8.3. Test Environment

The source and destination hosts have the same hardware and OS:

Host: HSW-EP
CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
RAM: 64G
OS: RHEL 7.1
Kernel: 4.2
QEMU: v2.4.0

Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)

QEMU parameters:

${qemu} -smp ${guest_cpus} -monitor unix:${qmp_sock},server,nowait -daemonize \
  -cpu host,migratable=off,+invtsc,+tsc-deadline,pmu=off -realtime mlock=on \
  -mem-prealloc -enable-kvm -m 1G -mem-path /mnt/hugetlbfs-1g \
  -drive file=/root/minimal-centos1.qcow2,cache=none,aio=threads \
  -netdev user,id=guest0,hostfwd=tcp:5555-:22 -device virtio-net-pci,netdev=guest0 \
  -nographic -serial /dev/null -parallel /dev/null

Network connection

live migration network connection
8.4. Test Result

The down time is set to 10ms when doing the test. We use pktgen to send packets to the guest; the packet size is 64 bytes and the line rate is 2013 Mbps.

  1. Total live migration time

    The total live migration time before and after optimization is shown in the chart below. For an idle guest, we can reduce the total live migration time from 2070ms to 401ms. For a guest running the DPDK L2 forwarding workload, the total live migration time is reduced from 2271ms to 654ms.

total live migration time
  2. VM downtime

    The VM down time before and after optimization is shown in the chart below. For an idle guest, we can reduce the VM down time from 29ms to 9ms. For a guest running the DPDK L2 forwarding workload, the VM down time is reduced from 26ms to 5ms.

vm downtime
9. Danube OpenStack User Guide

OpenStack is a cloud operating system developed and released by the OpenStack project. OpenStack is used in OPNFV for controlling pools of compute, storage, and networking resources in a Pharos compliant infrastructure.

OpenStack is used in Danube to manage tenants (known in OpenStack as projects), users, services, images, flavours, and quotas across the Pharos infrastructure. The OpenStack interface provides the primary interface for an operational Danube deployment, and it is from the “horizon console” that an OPNFV user will perform the majority of administrative and operational activities on the deployment.

9.1. OpenStack references

The OpenStack user guide provides details and descriptions of how to configure and interact with the OpenStack deployment.This guide can be used by lab engineers and operators to tune the OpenStack deployment to your liking.

Once you have configured OpenStack to your purposes, or the Danube deployment meets your needs as deployed, an operator, or administrator, will find the best guidance for working with OpenStack in the OpenStack administration guide.

9.2. Connecting to the OpenStack instance

Once familiar with the basics of working with OpenStack, you will want to connect to the OpenStack instance via the Horizon Console. The Horizon console provides a Web based GUI that will allow you to operate the deployment. To do this you should open a browser on the JumpHost to the following address and enter the username and password:

http://{Controller-VIP}:80/index.html  username: admin  password: admin

Other methods of interacting with and configuring OpenStack, like the REST API and CLI, are also available in the Danube deployment; see the OpenStack administration guide for more information on using those interfaces.
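
A minimal CLI sketch follows, assuming the standard OpenStack client environment variables and that Keystone is reachable on port 5000 of the controller VIP; adjust all values to your deployment:

export OS_AUTH_URL=http://{Controller-VIP}:5000/v3
export OS_USERNAME=admin
export OS_PASSWORD=admin
export OS_PROJECT_NAME=admin
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
openstack project list      # list tenants/projects
openstack flavor list       # list flavours available to the deployment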

10. Packet Forwarding
10.1. About Packet Forwarding

Packet Forwarding is a test suite of KVM4NFV. These latency tests measure the time taken by a packet generated by the traffic generator to travel from the originating device through the network to the destination device. Packet Forwarding is implemented using the test framework of the OPNFV VSWITCHPERF project and an IXIA Traffic Generator.

10.2. Version Features
Release Features
Colorado
  • Packet Forwarding is not part of Colorado release of KVM4NFV
Danube
  • Packet Forwarding is a testcase in KVM4NFV
  • Implements three scenarios (Host/Guest/SRIOV) as part of testing in KVM4NFV
  • Uses automated test framework of OPNFV VSWITCHPERF software (PVP/PVVP)
  • Works with IXIA Traffic Generator
10.3. VSPERF

VSPerf is an OPNFV testing project. VSPerf will develop a generic and architecture agnostic vSwitch testing framework and associated tests, that will serve as a basis for validating the suitability of different vSwitch implementations in a Telco NFV deployment environment. The output of this project will be utilized by the OPNFV Performance and Test group and its associated projects, as part of OPNFV Platform and VNF level testing and validation.

For complete VSPERF documentation go to link.

10.3.1. Installation

Guidelines for installing VSPERF.

10.3.2. Supported Operating Systems
  • CentOS 7
  • Fedora 20
  • Fedora 21
  • Fedora 22
  • RedHat 7.2
  • Ubuntu 14.04
10.3.3. Supported vSwitches

The vSwitch must support OpenFlow 1.3 or greater.

  • OVS (built from source).
  • OVS with DPDK (built from source).
10.3.4. Supported Hypervisors
  • Qemu version 2.6.
10.3.5. Other Requirements

The test suite requires Python 3.3 and relies on a number of other packages. These need to be installed for the test suite to function.

Installation of required packages, preparation of the Python 3 virtual environment and compilation of OVS, DPDK and QEMU is performed by the script systems/build_base_machine.sh. It should be executed under the user account which will be used for vsperf execution.

Please Note: Password-less sudo access must be configured for the given user before the script is executed.

Execution of installation script:

$ cd vswitchperf
$ cd systems
$ ./build_base_machine.sh

The script build_base_machine.sh will install all the vsperf dependencies in terms of system packages, Python 3.x and required Python modules. In the case of CentOS 7 it will install Python 3.3 from an additional repository provided by Software Collections. In the case of RedHat 7 it will install Python 3.4 as an alternate installation in /usr/local/bin. The installation script will also use virtualenv to create a vsperf virtual environment, which is isolated from the default Python environment. This environment will reside in a directory called vsperfenv in $HOME.

You will need to activate the virtual environment every time you start a new shell session. Its activation is specific to your OS:

For running testcases, VSPERF is installed on Intel pod1-node2, on which the CentOS operating system is installed. Only VSPERF installation on CentOS is discussed here. For installation steps on other operating systems please refer to here.

10.3.6. For CentOS 7

Python 3 Packages

To avoid file permission errors and Python version issues, use virtualenv to create an isolated environment with Python3. The required Python 3 packages can be found in the requirements.txt file in the root of the test suite. They can be installed in your virtual environment like so:

scl enable python33 bash
# Create virtual environment
virtualenv vsperfenv
cd vsperfenv
source bin/activate
pip install -r requirements.txt

You need to activate the virtual environment every time you start a new shell session. To activate, simply run:

scl enable python33 bash
cd vsperfenv
source bin/activate
10.3.7. Working Behind a Proxy

If you’re behind a proxy, you’ll likely want to configure this before running any of the above. For example:

export http_proxy="http://<username>:<password>@<proxy>:<port>/";
export https_proxy="https://<username>:<password>@<proxy>:<port>/";
export ftp_proxy="ftp://<username>:<password>@<proxy>:<port>/";
export socks_proxy="socks://<username>:<password>@<proxy>:<port>/";

For other OS specific activation click this link:

10.4. Traffic-Generators

VSPERF supports many Traffic-generators. For configuring VSPERF to work with the available traffic-generator go through this.

VSPERF supports the following traffic generators:

  • Dummy (DEFAULT): Allows you to use your own external traffic generator.
  • IXIA (IxNet and IxOS)
  • Spirent TestCenter
  • Xena Networks
  • MoonGen

To see the list of traffic gens from the cli:

$ ./vsperf --list-trafficgens

This guide provides the details of how to install and configure the various traffic generators.

As KVM4NFV uses only IXIA traffic generator, it is discussed here. For complete documentation regarding traffic generators please follow this link.

10.5. IXIA Setup
10.5.1. Hardware Requirements

VSPERF requires the following hardware to run tests: IXIA traffic generator (IxNetwork), a machine that runs the IXIA client software and a CentOS Linux release 7.1.1503 (Core) host.

10.5.2. Installation

Follow the installation instructions to install.

10.5.3. On the CentOS 7 system

You need to install IxNetworkTclClient$(VER_NUM)Linux.bin.tgz.

10.5.4. On the IXIA client software system
Find the IxNetwork TCL server app (start -> All Programs -> IXIA -> IxNetwork -> IxNetwork_$(VER_NUM) -> IxNetwork TCL Server)
  • Right click on IxNetwork TCL Server, select properties
  • Under shortcut tab in the Target dialogue box make sure there is the argument “-tclport xxxx”

where xxxx is your port number (take note of this port number you will need it for the 10_custom.conf file).

_images/IXIA1.png
  • Hit Ok and start the TCL server application
10.6. VSPERF configuration

There are several configuration options specific to the IxNetwork traffic generator from IXIA. It is essential to set them correctly before VSPERF is executed for the first time.

Detailed description of options follows:

  • TRAFFICGEN_IXNET_MACHINE - IP address of server, where IxNetwork TCL Server is running
  • TRAFFICGEN_IXNET_PORT - PORT, where IxNetwork TCL Server is accepting connections from TCL clients
  • TRAFFICGEN_IXNET_USER - username, which will be used during communication with IxNetwork TCL Server and IXIA chassis
  • TRAFFICGEN_IXIA_HOST - IP address of IXIA traffic generator chassis
  • TRAFFICGEN_IXIA_CARD - identification of card with dedicated ports at IXIA chassis
  • TRAFFICGEN_IXIA_PORT1 - identification of the first dedicated port at TRAFFICGEN_IXIA_CARD at IXIA chassis; VSPERF uses two separated ports for traffic generation. In case of unidirectional traffic, it is essential to correctly connect 1st IXIA port to the 1st NIC at DUT, i.e. to the first PCI handle from WHITELIST_NICS list. Otherwise traffic may not be able to pass through the vSwitch.
  • TRAFFICGEN_IXIA_PORT2 - identification of the second dedicated port at TRAFFICGEN_IXIA_CARD at IXIA chassis; VSPERF uses two separated ports for traffic generation. In case of unidirectional traffic, it is essential to correctly connect 2nd IXIA port to the 2nd NIC at DUT, i.e. to the second PCI handle from WHITELIST_NICS list. Otherwise traffic may not be able to pass through the vSwitch.
  • TRAFFICGEN_IXNET_LIB_PATH - path to the DUT specific installation of IxNetwork TCL API
  • TRAFFICGEN_IXNET_TCL_SCRIPT - name of the TCL script, which VSPERF will use for communication with IXIA TCL server
  • TRAFFICGEN_IXNET_TESTER_RESULT_DIR - folder accessible from IxNetwork TCL server, where test results are stored, e.g. c:/ixia_results; see test-results-share
  • TRAFFICGEN_IXNET_DUT_RESULT_DIR - directory accessible from the DUT, where test results from IxNetwork TCL server are stored, e.g. /mnt/ixia_results; see test-results-share
10.6.1. Test results share

VSPERF is not able to retrieve test results via the TCL API directly. Instead, all test results are stored at the IxNetwork TCL server. Results are stored in the folder defined by the TRAFFICGEN_IXNET_TESTER_RESULT_DIR configuration parameter. The content of this folder must be shared (e.g. via the samba protocol) between the TCL Server and the DUT, where VSPERF is executed. VSPERF expects that test results will be available in the directory configured by the TRAFFICGEN_IXNET_DUT_RESULT_DIR configuration parameter.

Example of sharing configuration:

  • Create a new folder at IxNetwork TCL server machine, e.g. c:\ixia_results

  • Modify sharing options of ixia_results folder to share it with everybody

  • Create a new directory at DUT, where shared directory with results will be mounted, e.g. /mnt/ixia_results

  • Update your custom VSPERF configuration file as follows:

    TRAFFICGEN_IXNET_TESTER_RESULT_DIR = 'c:/ixia_results'
    TRAFFICGEN_IXNET_DUT_RESULT_DIR = '/mnt/ixia_results'
    

    Note: It is essential to use slashes ‘/’ also in path configured by TRAFFICGEN_IXNET_TESTER_RESULT_DIR parameter.

  • Install cifs-utils package.

    e.g. at rpm based Linux distribution:

yum install cifs-utils
  • Mount shared directory, so VSPERF can access test results.

    e.g. by adding new record into /etc/fstab

mount -t cifs //_TCL_SERVER_IP_OR_FQDN_/ixia_results /mnt/ixia_results
      -o file_mode=0777,dir_mode=0777,nounix

It is recommended to verify that any new file inserted into the c:/ixia_results folder is visible at the DUT inside the /mnt/ixia_results directory.
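
A persistent /etc/fstab record equivalent to the mount command above could look like the following sketch (credential options such as guest or username=/password= may be required depending on how the share is exported):

//_TCL_SERVER_IP_OR_FQDN_/ixia_results  /mnt/ixia_results  cifs  file_mode=0777,dir_mode=0777,nounix  0  0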

10.6.2. Cloning and building src dependencies

In order to run VSPERF, you will need to download DPDK and OVS. You can do this manually and build them in a preferred location, or you could use vswitchperf/src. The vswitchperf/src directory contains makefiles that will allow you to clone and build the libraries that VSPERF depends on, such as DPDK and OVS. To clone and build simply:

cd src
make

To delete a src subdirectory and its contents to allow you to re-clone simply use:

make cleanse
10.6.3. Configure the ./conf/10_custom.conf file

The supplied 10_custom.conf file must be modified, as it contains configuration items for which there are no reasonable default values.

The configuration items that can be added are not limited to the initial contents. Any configuration item mentioned in any .conf file in the ./conf directory can be added, and its default value will be overridden by the custom configuration value.
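
A hypothetical excerpt of ./conf/10_custom.conf for the IXIA setup described in section 10.6 is shown below; the parameter names come from that section, while the TRAFFICGEN selection and every value are placeholders to be replaced for your environment:

TRAFFICGEN = 'IxNet'                              # placeholder: use the IxNetwork backend
TRAFFICGEN_IXNET_MACHINE = '10.10.120.6'          # host running IxNetwork TCL Server
TRAFFICGEN_IXNET_PORT = '8009'                    # the -tclport value noted earlier
TRAFFICGEN_IXNET_USER = 'vsperf_user'
TRAFFICGEN_IXIA_HOST = '10.10.120.7'              # IXIA chassis
TRAFFICGEN_IXIA_CARD = '1'
TRAFFICGEN_IXIA_PORT1 = '1'
TRAFFICGEN_IXIA_PORT2 = '2'
TRAFFICGEN_IXNET_LIB_PATH = '/opt/ixnet/ixnetwork/lib'
TRAFFICGEN_IXNET_TCL_SCRIPT = 'ixnetrfc2544.tcl'  # placeholder script name
TRAFFICGEN_IXNET_TESTER_RESULT_DIR = 'c:/ixia_results'
TRAFFICGEN_IXNET_DUT_RESULT_DIR = '/mnt/ixia_results'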

10.6.4. Using a custom settings file

Alternatively a custom settings file can be passed to vsperf via the --conf-file argument.

./vsperf --conf-file <path_to_settings_py> ...

Note that configuration passed in via the environment (--load-env) or via another command line argument will override both the default and your custom configuration files. This “priority hierarchy” can be described like so (1 = max priority):

  1. Command line arguments
  2. Environment variables
  3. Configuration file(s)
10.6.5. vloop_vnf

VSPERF uses a VM image called vloop_vnf for looping traffic in the deployment scenarios involving VMs. The image can be downloaded from http://artifacts.opnfv.org/.

Please see the installation instructions for information on vloop-vnf images.

10.6.6. l2fwd Kernel Module

A kernel module that provides OSI Layer 2 IPv4 termination or forwarding with support for Destination Network Address Translation (DNAT) for both the MAC and IP addresses. l2fwd can be found in <vswitchperf_dir>/src/l2fwd.

10.6.7. Executing tests

Before running any tests make sure you have root permissions by adding the following line to /etc/sudoers:

username ALL=(ALL) NOPASSWD: ALL

username in the example above should be replaced with a real username.

To list the available tests:

./vsperf --list-tests

To run a group of tests, for example all tests with a name containing ‘RFC2544’:

./vsperf --conf-file=user_settings.py --tests="RFC2544"

To run all tests:

./vsperf --conf-file=user_settings.py

Some tests allow for configurable parameters, including test duration (in seconds) as well as packet sizes (in bytes).

./vsperf --conf-file user_settings.py \
    --tests RFC2544Tput \
    --test-params "rfc2544_duration=10;packet_sizes=128"

For all available options, check out the help dialog:

./vsperf --help
10.7. Testcases

Available Tests in VSPERF are:

  • phy2phy_tput
  • phy2phy_forwarding
  • back2back
  • phy2phy_tput_mod_vlan
  • phy2phy_cont
  • pvp_cont
  • pvvp_cont
  • pvpv_cont
  • phy2phy_scalability
  • pvp_tput
  • pvp_back2back
  • pvvp_tput
  • pvvp_back2back
  • phy2phy_cpu_load
  • phy2phy_mem_load
10.8. VSPERF modes of operation

VSPERF can be run in different modes. By default it will configure vSwitch, traffic generator and VNF. However it can be used just for configuration and execution of traffic generator. Another option is execution of all components except traffic generator itself.

The mode of operation is driven by the configuration parameter -m or --mode:

-m MODE, --mode MODE  vsperf mode of operation;
   Values:
        "normal" - execute vSwitch, VNF and traffic generator
        "trafficgen" - execute only traffic generator
        "trafficgen-off" - execute vSwitch and VNF
        "trafficgen-pause" - execute vSwitch and VNF but wait before traffic transmission

In case VSPERF is executed in “trafficgen” mode, the configuration of the traffic generator can be modified through the TRAFFIC dictionary passed to the --test-params option. It is not needed to specify all values of the TRAFFIC dictionary; it is sufficient to specify only the values which should be changed. A detailed description of the TRAFFIC dictionary can be found at: ref:configuration-of-traffic-dictionary.

Example of execution of VSPERF in “trafficgen” mode:

$ ./vsperf -m trafficgen --trafficgen IxNet --conf-file vsperf.conf \
    --test-params "TRAFFIC={'traffic_type':'rfc2544_continuous','bidir':'False','framerate':60}"
10.9. Packet Forwarding Test Scenarios

KVM4NFV currently implements three scenarios as part of testing:

  • Host Scenario
  • Guest Scenario.
  • SR-IOV Scenario.
10.9.1. Packet Forwarding Host Scenario

Here the host DUT has VSPERF installed on it and is properly configured to use the IXIA traffic generator by providing the IXIA card, ports and lib paths along with the IP. Please refer to figure 2.

_images/Host_Scenario.png
10.9.2. Packet Forwarding Guest Scenario

Here the guest is a Virtual Machine (VM) launched by using vloop_vnf, provided by the vsperf project, on the host/DUT using QEMU. In this latency test, the time taken by the frame/packet to travel from the originating device through the network, involving a guest, to the destination device is calculated. The resulting latency values define the performance of the installed kernel.

_images/Guest_Scenario.png
10.9.3. Packet Forwarding SRIOV Scenario

In this test the packet generated at the IXIA is forwarded to the Guest VM launched on the host by implementing an SR-IOV interface at the NIC level of the host, i.e., the DUT. The time taken by the packet to travel through the network back to its destination, the IXIA traffic generator, is calculated and published as a test result for this scenario.

SR-IOV support is described below; it details how to use SR-IOV.

_images/SRIOV_Scenario.png
10.9.4. Using vfio_pci with DPDK

To use vfio with DPDK instead of igb_uio, add the following parameter into your custom configuration file:

PATHS['dpdk']['src']['modules'] = ['uio', 'vfio-pci']

NOTE: In case DPDK is installed from a binary package, please set PATHS['dpdk']['bin']['modules'] instead.

NOTE: Please ensure that Intel VT-d is enabled in BIOS.

NOTE: Please ensure your boot/grub parameters include the following:

iommu=pt intel_iommu=on

To check that IOMMU is enabled on your platform:

 $ dmesg | grep IOMMU
 [    0.000000] Intel-IOMMU: enabled
 [    0.139882] dmar: IOMMU 0: reg_base_addr fbffe000 ver 1:0 cap d2078c106f0466 ecap f020de
 [    0.139888] dmar: IOMMU 1: reg_base_addr ebffc000 ver 1:0 cap d2078c106f0466 ecap f020de
 [    0.139893] IOAPIC id 2 under DRHD base  0xfbffe000 IOMMU 0
 [    0.139894] IOAPIC id 0 under DRHD base  0xebffc000 IOMMU 1
 [    0.139895] IOAPIC id 1 under DRHD base  0xebffc000 IOMMU 1
 [    3.335744] IOMMU: dmar0 using Queued invalidation
 [    3.335746] IOMMU: dmar1 using Queued invalidation
....
10.9.5. Using SRIOV support

To use virtual functions of NIC with SRIOV support, use extended form of NIC PCI slot definition:

WHITELIST_NICS = ['0000:03:00.0|vf0', '0000:03:00.1|vf3']

Where vf is an indication of virtual function usage and the following number defines the VF to be used. In case VF usage is detected, vswitchperf will enable SRIOV support for the given card and it will detect the PCI slot numbers of the selected VFs.

So in the example above, one VF will be configured for NIC ‘0000:03:00.0’ and four VFs will be configured for NIC ‘0000:03:00.1’. Vswitchperf will detect the PCI addresses of the selected VFs and will use them during test execution.

At the end of vswitchperf execution, SRIOV support will be disabled.
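
While a test is running, VF creation can be cross-checked with standard sysfs/lspci queries, for example (using the first NIC from the WHITELIST_NICS example above):

lspci | grep -i "Virtual Function"
cat /sys/bus/pci/devices/0000:03:00.0/sriov_numvfs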

SRIOV support is generic and it can be used in different testing scenarios. For example:

  • vSwitch tests with DPDK or without DPDK support to verify impact of VF usage on vSwitch performance
  • tests without vSwitch, where traffic is forwarded directly between VF interfaces by a packet forwarder (e.g. the testpmd application)
  • tests without vSwitch, where VM accesses VF interfaces directly by PCI-passthrough to measure raw VM throughput performance.
10.9.5.1. Using QEMU with PCI passthrough support

Raw virtual machine throughput performance can be measured by execution of PVP test with direct access to NICs by PCI passthrough. To execute VM with direct access to PCI devices, enable vfio-pci. In order to use virtual functions, SRIOV-support must be enabled.

Execution of test with PCI passthrough with vswitch disabled:

$ ./vsperf --conf-file=<path_to_custom_conf>/10_custom.conf \
           --vswitch none --vnf QemuPciPassthrough pvp_tput

Any of the supported guest-loopback applications can be used inside the VM with PCI passthrough support.

Note: Qemu with PCI passthrough support can be used only with PVP test deployment.

10.9.6. Results

The results for the packet forwarding test cases are uploaded to artifacts. The link for the same can be found below

http://artifacts.opnfv.org/kvmfornfv.html
11. PCM Utility in KVM4NFV
11.1. Collecting Memory Bandwidth Information using PCM utility

This chapter describes how the PCM utility is used in kvm4nfv to collect memory bandwidth information.

11.2. About PCM utility

The Intel® Performance Counter Monitor provides sample C++ routines and utilities to estimate the internal resource utilization of the latest Intel® Xeon® and Core™ processors and gain a significant performance boost. The Intel PCM toolset contains a pcm-memory.x tool, which is used for observing memory traffic intensity.

11.3. Version Features
Release Features
Colorado
  • In the Colorado release, memory bandwidth information was not collected through the cyclictest testcases.
Danube
  • pcm-memory.x will be executed before the execution of every testcase
  • pcm-memory.x provides the memory bandwidth data throughout the testcases
  • used for all test-types (stress/idle)
  • Generated memory bandwidth logs are published to the KVMFORNFV artifacts
11.3.1. Implementation of pcm-memory.x:

The tool measures the memory bandwidth observed for every channel, reporting separate throughput for reads from memory and writes to memory. The pcm-memory.x tool tends to report values slightly higher than the application’s own measurement.

Command:

sudo ./pcm-memory.x  [Delay]/[external_program]

Parameters

  • pcm-memory can be called with either a delay or an external_program/application as a parameter
  • If the delay is given as 5, then the output will be produced with a refresh every 5 seconds.
  • If external_program is a script/application, then the output will be produced after the execution of the application or script passed as a parameter.

Sample Output:

The output is produced with the default refresh of 1 second.
Socket 0 - Memory Performance Monitoring
  Mem Ch 0: Reads (MB/s): 6870.81   Writes (MB/s): 1805.03
  Mem Ch 1: Reads (MB/s): 6873.91   Writes (MB/s): 1810.86
  Mem Ch 2: Reads (MB/s): 6866.77   Writes (MB/s): 1804.38
  Mem Ch 3: Reads (MB/s): 6867.47   Writes (MB/s): 1805.53

  NODE0 Mem Read (MB/s)  : 27478.96
  NODE0 Mem Write (MB/s) : 7225.79
  NODE0 P. Write (T/s)   : 214810
  NODE0 Memory (MB/s)    : 34704.75

Socket 1 - Memory Performance Monitoring
  Mem Ch 0: Reads (MB/s): 7406.36   Writes (MB/s): 1951.25
  Mem Ch 1: Reads (MB/s): 7411.11   Writes (MB/s): 1957.73
  Mem Ch 2: Reads (MB/s): 7403.39   Writes (MB/s): 1951.42
  Mem Ch 3: Reads (MB/s): 7403.66   Writes (MB/s): 1950.95

  NODE1 Mem Read (MB/s)  : 29624.51
  NODE1 Mem Write (MB/s) : 7811.36
  NODE1 P. Write (T/s)   : 238294
  NODE1 Memory (MB/s)    : 37435.87

  • System Read Throughput (MB/s)  : 57103.47
  • System Write Throughput (MB/s) : 15037.15
  • System Memory Throughput (MB/s): 72140.62
11.3.2. pcm-memory.x in KVM4NFV:

pcm-memory is a part of KVM4NFV in the D release. pcm-memory.x is executed with a delay of 60 seconds before starting every testcase, to monitor the memory traffic intensity; this is handled in the collect_MBWInfo function. The memory bandwidth information is collected into the logs throughout the testcase, updating every 60 seconds.

Pre-requisites:

1. Check for the processors supported by PCM. The latest pcm utility version (2.11) supports the Intel® Xeon® E5 v4 processor family.

2. Disable the NMI watchdog.

3. Install the MSR registers (msr kernel module). Example commands for steps 2 and 3 are sketched after this list.
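
A sketch of these two prerequisites using standard Linux interfaces (to be adapted per distribution):

echo 0 > /proc/sys/kernel/nmi_watchdog    # disable the NMI watchdog on the running kernel
modprobe msr                              # load the MSR kernel module used by PCM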

Memory Bandwidth logs for KVM4NFV can be found here:

http://artifacts.opnfv.org/kvmfornfv.html

Details of the function implemented:

In install_Pcm function, it handles the installation of pcm utility and the required prerequisites for pcm-memory.x tool to execute.

$ git clone https://github.com/opcm/pcm
$ cd pcm
$ make

In the collect_MBWInfo function, the command below is executed on the node and its output is collected into the logs with the timestamp and testType. The function is called at the beginning of each testcase, and a signal is passed to terminate the pcm-memory process which runs throughout the cyclictest testcase.

$ pcm-memory.x 60 &>/root/MBWInfo/MBWInfo_${testType}_${timeStamp}

where,
${testType} = verify (or) daily
11.4. Future Scope

PCM information will be added to cyclictest of kvm4nfv in yardstick.

12. Low Latency Tunning Suggestion

The correct configuration is critical for improving NFV performance/latency. Even working on the same codebase, configurations can cause wildly different performance/latency results.

There are many combinations of configurations, from hardware configuration to Operating System configuration and application level configuration. And there is no one simple configuration that works for every case. To tune a specific scenario, it’s important to know the behaviors of different configurations and their impact.

12.1. Platform Configuration

Some hardware features can be configured through firmware interface(like BIOS) but others may not be configurable (e.g. SMI on most platforms).

  • Power management: Most power management related features save power at the expense of latency. These features include: Intel® Turbo Boost Technology, Enhanced Intel® SpeedStep, Processor C states and P states. Normally they should be disabled but, depending on the real-time application design and latency requirements, there might be some features that can be enabled if the impact on deterministic execution of the workload is small.
  • Hyper-Threading: Logical cores that share resources with other logical cores can introduce latency, so the recommendation is to disable this feature for realtime use cases.
  • Legacy USB Support/Port 60/64 Emulation: These features involve some emulation in firmware and can introduce random latency. It is recommended that they are disabled.
  • SMI (System Management Interrupt): SMI runs outside of the kernel code and can potentially cause latency. It is a pity there is no simple way to disable it. Some vendors may provide related switches in BIOS but most machines do not have this capability.
12.2. Operating System Configuration
  • CPU isolation: To achieve deterministic latency, dedicated CPUs should be allocated for realtime application. This can be achieved by isolating cpus from kernel scheduler. Please refer to http://lxr.free-electrons.com/source/Documentation/kernel-parameters.txt#L1608 for more information.
  • Memory allocation: Memory should be reserved for realtime applications and usually hugepages should be used to reduce page faults/TLB misses.
  • IRQ affinity: All the non-realtime IRQs should be affinitized to non-realtime CPUs to reduce the impact on realtime CPUs. Some OS distributions contain an irqbalance daemon which balances the IRQs among all the cores dynamically; it should be disabled as well (illustrative commands follow this list).
  • Device assignment for VM: If a device is used in a VM, then device passthrough is desirable. In this case,the IOMMU should be enabled.
  • Tickless: Frequent clock ticks cause latency. CONFIG_NOHZ_FULL should be enabled in the linux kernel. With CONFIG_NOHZ_FULL, the physical CPU will trigger many fewer clock tick interrupts(currently, 1 tick per second). This can reduce latency because each host timer interrupt triggers a VM exit from guest to host which causes performance/latency impacts.
  • TSC: Mark the TSC clock source as reliable. A TSC clock source that seems to be unreliable causes the kernel to continuously enable the clock source watchdog to check whether the TSC frequency is still correct. On recent Intel platforms with Constant TSC/Invariant TSC/Synchronized TSC, the TSC is reliable, so the watchdog is useless and only causes latency.
  • Idle: The poll option forces a polling idle loop that can slightly improve the performance of waking up an idle CPU.
  • RCU_NOCB: RCU is a kernel synchronization mechanism. Refer to http://lxr.free-electrons.com/source/Documentation/RCU/whatisRCU.txt for more information. With RCU_NOCB, the impact from RCU to the VNF will be reduced.
  • Disable the RT throttling: RT Throttling is a Linux kernel mechanism that occurs when a process or thread uses 100% of the core, leaving no resources for the Linux scheduler to execute the kernel/housekeeping tasks. RT Throttling increases the latency so should be disabled.
  • NUMA configuration: To achieve the best latency, the CPU/memory and devices allocated for a realtime application/VM should be in the same NUMA node.
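
Illustrative host commands for some of the items above are given below; the IRQ number and CPU mask are arbitrary examples and must be chosen per platform:

systemctl stop irqbalance                         # stop dynamic IRQ balancing
systemctl disable irqbalance
echo 0f > /proc/irq/24/smp_affinity               # example: pin IRQ 24 to CPUs 0-3 (non-realtime cores)
echo -1 > /proc/sys/kernel/sched_rt_runtime_us    # disable RT throttling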
OPNFV Glossary
Danube 1.0
Contents

This glossary provides a common definition of phrases and words commonly used in OPNFV.


A

Arno

A river running through Tuscany and the name of the first OPNFV release.

API

Application Programming Interface

AVX2

Advanced Vector Extensions 2 is an instruction set extension for x86.

B

Brahmaputra

A river running through Asia and the name of the Second OPNFV release.

Bios

Basic Input/Output System

Builds

Build in Jenkins is a version of a program.

Bogomips

Bogomips is the number of million times per second a processor can do absolutely nothing.

C

CAT

Cache Allocation Technology

CentOS

Community Enterprise Operating System is a Linux distribution

CICD

Continuous Integration and Continuous Deployment

CLI

Command Line Interface

Colorado

A river in Argentina and the name of the Third OPNFV release.

Compute

Compute is an OpenStack service which offers many configuration options which may be deployment specific.

Console

A console is a display screen.

CPU

Central Processing Unit

D

Danube

Danube is the fourth release of OPNFV and also a river in Europe

Data plane

The data plane is the part of a network that carries user traffic.

Debian/deb

Debian is a Unix-like computer operating system that is composed entirely of free software.

Docs

Documentation/documents

DPDK

Data Plane Development Kit

DPI

Deep Packet Inspection

DSCP

Differentiated Services Code Point

F

Flavors

Flavors are templates used to define VM configurations.

Fuel

Provides an intuitive, GUI-driven experience for deployment and management of OpenStack

H

Horizon

Horizon is an OpenStack service which serves as a UI.

Hypervisor

A hypervisor, also called a virtual machine manager, is a program that allows multiple operating systems to share a single hardware host.

I

IGMP

Internet Group Management Protocol

IOMMU

Input-Output Memory Management Unit

IOPS

Input/Output Operations Per Second

IRQ

Interrupt ReQuest is an interrupt request sent from the hardware level to the CPU.

IRQ affinity

IRQ affinity is the set of CPU cores that can service that interrupt.

J

Jenkins

Jenkins is an open source continuous integration tool written in Java.

JIRA

JIRA is a bug tracking software.

Jitter

Time difference in packet inter-arrival time to their destination can be called jitter.

JumpHost

A jump host or jump server or jumpbox is a computer on a network typically used to manage devices in a separate security zone.

K

Kernel

The kernel is a computer program that constitutes the central core of a computer’s operating system.

L

Latency

The amount of time it takes a packet to travel from source to destination is Latency.

libvirt

libvirt is an open source API, daemon and management tool for managing platform virtualization.

M

Migration

Migration is the process of moving from the use of one operating environment to another operating environment.

N

NFV

Network Functions Virtualisation, an industry initiative to leverage virtualisation technologies in carrier networks.

NFVI

Network Function Virtualization Infrastructure

NIC

Network Interface Controller

NUMA

Non-Uniform Memory Access

O

OPNFV

Open Platform for NFV, an open source project developing an NFV reference platform and features.

P

Pharos

Pharos is a lighthouse and the name of a project that deals with developing an OPNFV lab infrastructure that is geographically and technically diverse.

Pipeline

A suite of plugins in Jenkins that lets you orchestrate automation.

Platform

OPNFV provides an open source platform for deploying NFV solutions that leverages investments from a community of developers and solution providers.

Pools

A Pool is a set of resources that are kept ready to use, rather than acquired on use and released afterwards.

Q

Qemu

QEMU is a free and open-source hosted hypervisor that performs hardware virtualization.

R

RDMA

Remote Direct Memory Access (RDMA)

Rest-Api

REST (REpresentational State Transfer) is an architectural style, and an approach to communications that is often used in the development of web services

S

Scaling

Refers to altering the size.

Slave

A slave works with/for a master, where the master has unidirectional control over one or more other devices.

SR-IOV

Single Root I/O Virtualization.

Spin locks

A spinlock is a lock which causes a thread trying to acquire it to simply wait in a loop while repeatedly checking if the lock is available.

Storage

Refers to computer components which store some data.

T

Tenant

A Tenant is a group of users who share a common access with specific privileges to the software instance.

Tickless

A tickless kernel is an operating system kernel in which timer interrupts do not occur at regular intervals, but are only delivered as required.

TSC

Technical Steering Committee

V

VLAN

A virtual local area network, typically an isolated ethernet network.

VM

Virtual machine, an emulation in software of a computer system.

VNF

Virtual network function, typically a networking application or function running in a virtual environment.

X

XBZRLE

Helps to reduce the network traffic by just sending the updated data

Y

Yardstick

Yardstick is an infrastructure verification tool. It is an OPNFV testing project.
KVM4NFV Design
1. KVM4NFV design description

This design focuses on the enhancement of the following areas for the KVM hypervisor:

  • Minimal Interrupt latency variation for data plane VNFs:
    • Minimal Timing Variation for Timing correctness of real-time VNFs
    • Minimal packet latency variation for data-plane VNFs
  • Fast live migration

Minimal Interrupt latency variation for data plane VNFs

Processing performance and latency depend on a number of factors, including the CPUs (frequency, power management features, etc.), micro-architectural resources, the cache hierarchy and sizes, memory (and hierarchy, such as NUMA) and speed, inter-connects, I/O and I/O NUMA, devices, etc.

There are two separate types of latencies to minimize:

  1. Minimal Timing Variation for Timing correctness of real-time VNFs – timing correctness for scheduling operations(such as Radio scheduling)
  2. Minimal packet latency variation for data-plane VNFs – packet delay variation, which applies to packet processing.

For a VM, interrupt latency (time between arrival of H/W interrupt and invocation of the interrupt handler in the VM), for example, can be either of the above or both, depending on the type of the device. Interrupt latency with a (virtual) timer can cause timing correctness issues with real-time VNFs even if they only use polling for packet processing.

We assume that the VNFs are implemented properly to minimize interrupt latency variation within the VMs, but we have additional causes of latency variation on KVM:

  • Asynchronous (e.g. external interrupts) and synchronous(e.g. instructions) VM exits and handling in KVM (and kernel routines called), which may have loops and spin locks
  • Interrupt handling in the host Linux and KVM, scheduling and virtual interrupt delivery to VNFs
  • Potential VM exit (e.g. EOI) in the interrupt service routines in VNFs
  • Exit to the user-level (e.g. QEMU)
_images/kvm1.png
1.1. Design Considerations

The latency variation and jitters can be minimized with the below steps (with some in parallel):

  1. Statically and exclusively assign hardware resources (CPUs, memory, caches) to the VNFs.
  2. Pre-allocate huge pages (e.g. 1 GB/2MB pages) and guest-to-host mapping, e.g. EPT (Extended Page Table) page tables, to minimize or mitigate latency from misses in caches.
  3. Use a host Linux configured for hard real-time and low packet latency, and check the set of virtual devices used by the VMs to optimize or eliminate virtualization overhead if applicable.
  4. Use advanced hardware virtualization features that can reduce or eliminate VM exits, if present.
  5. Inspect the code paths in KVM and associated kernel services to eliminate code that can cause latencies (e.g. loops and spin locks).
  6. Measure latencies intensively. We leverage the existing testing methods. OSADL, for example, defines industry tests for timing correctness.
1.2. Goals and Guidelines

The output of this project will provide :

  1. A list of the performance goals, which will be obtained by the OPNFV members (as described above)
  2. A set of comprehensive instructions for the system configurations (hardware features, BIOS setup, kernel parameters, VM configuration, options to QEMU/KVM, etc.)
  3. The above features to the upstream of Linux, the real-time patch set, KVM, QEMU, libvirt, and
  4. Performance and interrupt latency measurement tools
1.3. Test plan

The tests that need to be conducted to make sure that all components from OPNFV meet the requirement are mentioned below:

Timer test: This test utilizes cyclictest (https://rt.wiki.kernel.org/index.php/Cyclictest) to test the guest timer latency (the latency from the time the guest timer should be triggered to the time the guest timer is really triggered).

_images/TimerTest.png

Device Interrupt Test: A device on the hardware platform triggers an interrupt every millisecond and the device interrupt is delivered to the VNF. This test covers the latency from the moment the interrupt occurs on the hardware to the time the interrupt is handled in the VNF.

_images/DeviceInterruptTest.png

Packet forwarding (DPDK OVS): A packet is sent from the TG (Traffic Generator) to a VNF. The VNF, after processing the packet, forwards it to another NIC and in the end the packet is received back by the traffic generator. The test checks the latency from the time the packet is sent out by the TG to the time it is received back by the TG.

_images/PacketforwardingDPDK_OVS.png

Packet Forwarding (SR-IOV): This test is similar to Packet Forwarding (DPDK OVS). However, instead of using virtio NIC devices on the guest, a PCI NIC or a PCI VF NIC is assigned to the guest for network access.

Bare-metal Packet Forwarding:This is used to compare with the above packet forwarding scenario.

_images/Bare-metalPacketForwarding.png

Multisite

Multisite Installation procedure
1. Kingbird installation instruction
1.1. Abstract

This document will give the user instructions on how to deploy available scenarios verified for the Danube release of OPNFV platform.

1.2. Preparing the installation

Kingbird is a centralized synchronization service for multi-region OpenStack deployments. Kingbird provides a centralized quota management feature. At least two OpenStack regions with a shared KeyStone should be installed first.

Kingbird includes kingbird-api and kingbird-engine, which talk to each other through the message bus, and both services access the database. Kingbird-api receives the RESTful API requests for quota management and forwards them to kingbird-engine to do quota synchronization and other tasks.

Therefore install Kingbird on the controller nodes of one of the OpenStack regions; the two services can be deployed on the same node or on different nodes. Both kingbird-api and kingbird-engine can run on multiple nodes in multi-worker mode. It is up to you how many nodes you deploy kingbird-api and kingbird-engine on, and they can work on the same node or on different nodes.

1.3. HW requirements

No special hardware requirements

1.4. Installation instruction

In the Colorado release, Kingbird is recommended to be installed in a Python virtual environment, so install and activate virtualenv first.

sudo pip install virtualenv
virtualenv venv
source venv/bin/activate

Get the latest code of Kingbird from git repository:

git clone https://github.com/openstack/kingbird.git
cd kingbird/
pip install -e .

or get the stable release from PyPI repository:

pip install kingbird

In case the database packages are not installed, you may need to install them:

pip install mysql
pip install pymysql

In the Kingbird root folder, where you can find the source code of Kingbird, generate the configuration sample file for Kingbird:

oslo-config-generator --config-file=./tools/config-generator.conf

Prepare the folders used for cache, log and configuration for Kingbird:

sudo rm -rf /var/cache/kingbird
sudo mkdir -p /var/cache/kingbird
sudo chown `whoami` /var/cache/kingbird
sudo rm -rf /var/log/kingbird
sudo mkdir -p /var/log/kingbird
sudo chown `whoami` /var/log/kingbird
sudo rm -rf /etc/kingbird
sudo mkdir -p /etc/kingbird
sudo chown `whoami` /etc/kingbird

Copy the sample configuration to the configuration folder /etc/kingbird:

cp etc/kingbird/kingbird.conf.sample /etc/kingbird/kingbird.conf

Before editing the configuration file, prepare the database info for Kingbird.

mysql -uroot -e "CREATE DATABASE $kb_db CHARACTER SET utf8;"
mysql -uroot -e "GRANT ALL PRIVILEGES ON $kb_db.* TO '$kb_db_user'@'%' IDENTIFIED BY '$kb_db_pwd';"

For example, the following command will create database “kingbird”, and grant the privilege for the db user “kingbird” with password “password”:

mysql -uroot -e "CREATE DATABASE kingbird CHARACTER SET utf8;"
mysql -uroot -e "GRANT ALL PRIVILEGES ON kingbird.* TO 'kingbird'@'%' IDENTIFIED BY 'password';"

Create the service user in OpenStack:

source openrc admin admin
openstack user create --project=service --password=$kb_svc_pwd $kb_svc_user
openstack role add --user=$kb_svc_user --project=service admin

For example, the following command will create service user “kingbird”, and grant the user “kingbird” with password “password” the role of admin in service project:

source openrc admin admin
openstack user create --project=service --password=password kingbird
openstack role add --user=kingbird --project=service admin

Then edit the configuration file for Kingbird:

vim /etc/kingbird/kingbird.conf

By default, the bind_host of kingbird-api is localhost (127.0.0.1) and the service port is 8118; you can leave these at their defaults if there is no port conflict.

In the examples below, replace the Kingbird service address "127.0.0.1" with the address you get from the OpenStack Kingbird endpoint.

To make Kingbird work properly, you have to edit these configuration items. The [cache] section is used by the kingbird engine to access the quota information of Nova, Cinder and Neutron in each region. Replace auth_uri with the address of the keystone service in your environment, especially if the keystone service is not located on the same node, and configure the account used to access Nova, Cinder and Neutron in each region. In the following configuration, the user "admin" with password "password" in the tenant "admin" is configured to access Nova, Cinder and Neutron in each region:

[cache]
auth_uri = http://127.0.0.1:5000/v3
admin_tenant = admin
admin_password = password
admin_username = admin

Configure the [database] section with the service user "kingbird" and its password to access the database "kingbird". For detailed [database] section configuration, please refer to http://docs.openstack.org/developer/oslo.db/opts.html and adjust the following configuration based on your environment.

[database]
connection = mysql+pymysql://$kb_db_user:$kb_db_pwd@127.0.0.1/$kb_db?charset=utf8

For example, if the database is “kingbird”, and the db user “kingbird” with password “password”, then the configuration is as following:

[database]
connection = mysql+pymysql://kingbird:password@127.0.0.1/kingbird?charset=utf8

The [keystone_authtoken] section is used by keystonemiddleware for token validation during API requests to kingbird-api. Please refer to http://docs.openstack.org/developer/keystonemiddleware/middlewarearchitecture.html for details on how to configure the keystone_authtoken section for keystonemiddleware, and adjust the following configuration based on your environment:

If KeyStone is deployed in multiple regions, please specify the region_name in which you want tokens to be validated.

[keystone_authtoken]
signing_dir = /var/cache/kingbird
cafile = /opt/stack/data/ca-bundle.pem
auth_uri = http://127.0.0.1:5000/v3
project_domain_name = Default
project_name = service
user_domain_name = Default
password = $kb_svc_pwd
username = $kb_svc_user
auth_url = http://127.0.0.1:35357/v3
auth_type = password
region_name = RegionOne

For example, if the service user is "kingbird" and the password for the user is "password", then the configuration will look like this:

[keystone_authtoken]
signing_dir = /var/cache/kingbird
cafile = /opt/stack/data/ca-bundle.pem
auth_uri = http://127.0.0.1:5000/v3
project_domain_name = Default
project_name = service
user_domain_name = Default
password = password
username = kingbird
auth_url = http://127.0.0.1:35357/v3
auth_type = password
region_name = RegionOne

Also configure the message bus connection; you can refer to the message bus configuration in the Nova, Cinder or Neutron configuration files.

[DEFAULT]
transport_url = rabbit://stackrabbit:password@127.0.0.1:5672/

After these basic configuration items are set, create the database schema for "kingbird":

python kingbird/cmd/manage.py --config-file=/etc/kingbird/kingbird.conf db_sync

Then create the service and endpoints for Kingbird; please change the endpoint URL according to your cloud planning:

openstack service create --name=kingbird synchronization
openstack endpoint create --region=RegionOne kingbird public http://127.0.0.1:8118/v1.0
openstack endpoint create --region=RegionOne kingbird admin http://127.0.0.1:8118/v1.0
openstack endpoint create --region=RegionOne kingbird internal http://127.0.0.1:8118/v1.0

Now it’s ready to run kingbird-api and kingbird-engine:

nohup python kingbird/cmd/api.py --config-file=/etc/kingbird/kingbird.conf &
nohup python kingbird/cmd/engine.py --config-file=/etc/kingbird/kingbird.conf &

Run the following command to check whether kingbird-api and kingbird-engine are running:

ps aux|grep python
1.5. Post-installation activities

Run the following commands to check whether kingbird-api is serving; please replace $mytoken with the token you get from "openstack token issue":

openstack token issue
curl  -H "Content-Type: application/json"  -H "X-Auth-Token: $mytoken" \
http://127.0.0.1:8118/

If the response looks like the following: {"versions": [{"status": "CURRENT", "updated": "2016-03-07", "id": "v1.0", "links": [{"href": "http://127.0.0.1:8118/v1.0/", "rel": "self"}]}]}, then kingbird-api is working normally.

Run the following commands to check whether kingbird-engine is serving; please replace $mytoken with the token you get from "openstack token issue", and $admin_project_id with the admin project ID in your environment:

curl  -H "Content-Type: application/json"  -H "X-Auth-Token: $mytoken" \
-X PUT \
http://127.0.0.1:8118/v1.0/$admin_project_id/os-quota-sets/$admin_project_id/sync

If the response looks like the following: "triggered quota sync for 0320065092b14f388af54c5bd18ab5da", then kingbird-engine is working normally.

Multisite Configuration Guide
1. Multisite identity service management
1.1. Goal

A user should, using a single authentication point, be able to manage virtual resources spread over multiple OpenStack regions.

1.2. Before you read

This chapter does not intend to cover all aspects of configuring KeyStone and the other OpenStack services to work together with KeyStone.

It focuses only on the configuration aspects that need to be taken into account in a multi-site scenario.

Please read the configuration documentation related to identity management of OpenStack for all configuration items.

http://docs.openstack.org/liberty/config-reference/content/ch_configuring-openstack-identity.html

How to configure the database cluster for synchronous or asynchronous replication in a multi-site scenario is out of scope of this document. The only reminder is that for the synchronization or replication, only the Keystone database is required. If you are using MySQL, you can configure it like this (see also the illustrative my.cnf sketch after this example):

In the master:

binlog-do-db=keystone

In the slave:

replicate-do-db=keystone
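
As an illustration, the corresponding my.cnf fragments might look like the following sketch, assuming a standard MySQL binlog-based master/slave setup (server IDs and log names are placeholders):

# master my.cnf (illustrative)
[mysqld]
server-id     = 1
log_bin       = mysql-bin
binlog-do-db  = keystone

# slave my.cnf (illustrative)
[mysqld]
server-id        = 2
relay_log        = mysql-relay-bin
replicate-do-db  = keystone
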
1.3. Deployment options

For a detailed description of each deployment option, please refer to the admin user guide.

  • Distributed KeyStone service with PKI token

    In KeyStone configuration file, PKI token format should be configured

    provider = pki
    

    or

    provider = pkiz
    

    In the [keystone_authtoken] section of each OpenStack service configuration file in each site, configure identity_uri and auth_uri to point to the address of the KeyStone service

    identity_uri = https://keystone.your.com:35357/
    auth_uri = http://keystone.your.com:5000/v2.0
    

    It's better to use a domain name for the KeyStone service rather than an IP address directly, especially if you deployed the KeyStone service in at least two sites for site-level high availability.

  • Distributed KeyStone service with Fernet token

  • Distributed KeyStone service with Fernet token + Async replication ( star-mode).

    In these two deployment options, the token validation is planned to be done in the local site.

    In KeyStone configuration file, Fernet token format should be configured

    provider = fernet
    

    In the [keystone_authtoken] section of each OpenStack service configuration file in each site, configure identity_uri and auth_uri to point to the address of the local KeyStone service

    identity_uri = https://local-keystone.your.com:35357/
    auth_uri = http://local-keystone.your.com:5000/v2.0
    

    and in particular, configure region_name to your local region name. For example, if you are configuring services in RegionOne and there is a local KeyStone service in RegionOne, then

    region_name = RegionOne
    
2. Configuration of Multisite.Kingbird

A brief introduction to configuring the Multisite Kingbird service. Only the configuration items for Kingbird are described here. Logging, messaging, database, keystonemiddleware and similar configuration items, which are generated from the OpenStack oslo libraries, are not described here because they are common to Nova, Cinder and Neutron; please refer to the corresponding descriptions for those services.

2.1. Configuration in [DEFAULT]
2.1.1. configuration items for kingbird-api
2.1.1.1. bind_host
  • default value: bind_host = 0.0.0.0
  • description: The host IP to bind for kingbird-api service
2.1.1.2. bind_port
  • default value: bind_port = 8118
  • description: The port to bind for kingbird-api service
2.1.1.3. api_workers
  • default value: api_workers = 2
  • description: Number of kingbird-api workers
2.1.2. configuration items for kingbird-engine
2.1.2.1. host
  • default value: host = localhost
  • description: The host name kingbird-engine service is running on
2.1.2.2. workers
  • default value: workers = 1
  • description: Number of kingbird-engine workers
2.1.2.3. report_interval
  • default value: report_interval = 60
  • description: Seconds between running periodic reporting tasks to keep the engine alive in the DB. If the engine doesn't report its aliveness to the DB for more than two intervals, the lock acquired by the engine will be removed by other engines.
2.1.3. common configuration items for kingbird-api and kingbird-engine
2.1.3.1. use_default_quota_class
  • default value: use_default_quota_class = true
  • description: Enables or disables use of default quota class with default quota, boolean value
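
Putting the above together, a kingbird.conf [DEFAULT] fragment using the documented default values would look roughly like this sketch (adjust bind_host, host and worker counts to your deployment):

[DEFAULT]
# kingbird-api
bind_host = 0.0.0.0
bind_port = 8118
api_workers = 2
# kingbird-engine
host = localhost
workers = 1
report_interval = 60
# common
use_default_quota_class = true
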
2.2. Configuration in [kingbird_global_limit]

For quota limit, a negative value means unlimited.

2.2.1. configuration items for kingbird-api and kingbird-engine
2.2.1.1. quota_instances
  • default value: quota_instances = 10
  • description: Number of instances allowed per project, integer value.
2.2.1.2. quota_cores
  • default value: quota_cores = 20
  • description: Number of instance cores allowed per project, integer value.
2.2.1.3. quota_ram
  • default value: quota_ram = 512
  • description: Megabytes of instance RAM allowed per project, integer value.
2.2.1.4. quota_metadata_items
  • default value: quota_metadata_items = 128
  • description: Number of metadata items allowed per instance, integer value.
2.2.1.5. quota_key_pairs
  • default value: quota_key_pairs = 10
  • description: Number of key pairs per user, integer value.
2.2.1.6. quota_fixed_ips
  • default value: quota_fixed_ips = -1
  • description: Number of fixed IPs allowed per project, this should be at least the number of instances allowed, integer value.
2.2.1.7. quota_security_groups
  • default value: quota_security_groups = 10
  • description: Number of security groups per project, integer value.
2.2.1.8. quota_floating_ips
  • default value: quota_floating_ips = 10
  • description: Number of floating IPs allowed per project, integer value.
2.2.1.9. quota_network
  • default value: quota_network = 10
  • description: Number of networks allowed per project, integer value.
2.2.1.10. quota_subnet
  • default value: quota_subnet = 10
  • description: Number of subnets allowed per project, integer value.
2.2.1.11. quota_port
  • default value: quota_port = 50
  • description: Number of ports allowed per project, integer value.
2.2.1.12. quota_security_group
  • default value: quota_security_group = 10
  • description: Number of security groups allowed per project, integer value.
2.2.1.13. quota_security_group_rule
  • default value: quota_security_group_rule = 100
  • description: Number of security group rules allowed per project, integer value.
2.2.1.14. quota_router
  • default value: quota_router = 10
  • description: Number of routers allowed per project, integer value.
2.2.1.15. quota_floatingip
  • default value: quota_floatingip = 50
  • description: Number of floating IPs allowed per project, integer value.
2.2.1.16. quota_volumes
  • default value: quota_volumes = 10
  • description: Number of volumes allowed per project, integer value.
2.2.1.17. quota_snapshots
  • default value: quota_snapshots = 10
  • description: Number of snapshots allowed per project, integer value.
2.2.1.18. quota_gigabytes
  • default value: quota_gigabytes = 1000
  • description: Total amount of storage, in gigabytes, allowed for volumes and snapshots per project, integer value.
2.2.1.19. quota_backups
  • default value: quota_backups = 10
  • description: Number of volume backups allowed per project, integer value.
2.2.1.20. quota_backup_gigabytes
  • default value: quota_backup_gigabytes = 1000
  • description: Total amount of storage, in gigabytes, allowed for volume backups per project, integer value.
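
As an illustration, a [kingbird_global_limit] fragment using the documented default values looks like this sketch:

[kingbird_global_limit]
quota_instances = 10
quota_cores = 20
quota_ram = 512
quota_metadata_items = 128
quota_key_pairs = 10
quota_fixed_ips = -1
quota_security_groups = 10
quota_floating_ips = 10
quota_network = 10
quota_subnet = 10
quota_port = 50
quota_security_group = 10
quota_security_group_rule = 100
quota_router = 10
quota_floatingip = 50
quota_volumes = 10
quota_snapshots = 10
quota_gigabytes = 1000
quota_backups = 10
quota_backup_gigabytes = 1000
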
2.3. Configuration in [cache]

The [cache] section is used by kingbird engine to access the quota information for Nova, Cinder, Neutron in each region in order to reduce the KeyStone load while retrieving the endpoint information each time.

2.3.1. configuration items for kingbird-engine
2.3.1.1. auth_uri
  • default value:
  • description: Auth URI of the KeyStone identity service used by kingbird-engine to retrieve endpoint and quota information in each region, for example, http://127.0.0.1:5000/v3.
2.3.1.2. admin_username
  • default value:
  • description: Username of admin account, for example, admin.
2.3.1.3. admin_password
  • default value:
  • description: Password for admin account, for example, password.
2.3.1.4. admin_tenant
  • default value:
  • description: Tenant name of admin account, for example, admin.
2.3.1.5. admin_user_domain_name
  • default value: admin_user_domain_name = Default
  • description: User domain name of admin account.
2.3.1.6. admin_project_domain_name
  • default value: admin_project_domain_name = Default
  • description: Project domain name of admin account.
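
As an illustration, a [cache] fragment consistent with the installation example earlier might look like the following sketch (the KeyStone address, credentials and domain names are placeholders for your environment):

[cache]
auth_uri = http://127.0.0.1:5000/v3
admin_username = admin
admin_password = password
admin_tenant = admin
admin_user_domain_name = Default
admin_project_domain_name = Default
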
2.4. Configuration in [scheduler]

The [scheduler] section is used by kingbird engine to periodically synchronize and rebalance the quota for each project.

2.4.1. configuration items for kingbird-engine
2.4.1.1. periodic_enable
  • default value: periodic_enable = True
  • description: Boolean value for enable/disable periodic tasks.
2.4.1.2. periodic_interval
  • default value: periodic_interval = 900
  • description: Periodic time interval for automatic quota sync job, unit is seconds.
2.5. Configuration in [batch]

The [batch] section is used by the kingbird engine to control batch processing when it periodically synchronizes and rebalances the quota for each project.

  • default value: batch_size = 3
  • description: Batch size number of projects will be synced at a time.
2.6. Configuration in [locks]

The [locks] section is used by the kingbird engine to control lock handling while it periodically synchronizes and rebalances the quota for each project (see the combined sketch below).

  • default value: lock_retry_times = 3
  • description: Number of times trying to grab a lock.
  • default value: lock_retry_interval = 10
  • description: Number of seconds between lock retries.
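
For completeness, a sketch of the [scheduler], [batch] and [locks] sections using the documented default values:

[scheduler]
periodic_enable = True
periodic_interval = 900

[batch]
batch_size = 3

[locks]
lock_retry_times = 3
lock_retry_interval = 10
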
Multisite Admin User Guide
1. Multisite identity service management
1.1. Goal

A user should, using a single authentication point, be able to manage virtual resources spread over multiple OpenStack regions.

1.2. Token Format

There are 3 types of token formats supported by OpenStack KeyStone:

  • FERNET
  • UUID
  • PKI/PKIZ

It's very important to understand these token formats before we begin multisite identity service management. Please refer to the OpenStack official site for identity management: http://docs.openstack.org/admin-guide-cloud/identity_management.html

Please note that PKI/PKIZ token format has been deprecated.

1.3. Key consideration in multisite scenario

A user is provided with a single authentication URL to the Identity (Keystone) service. Using that URL, the user authenticates with Keystone by requesting a token typically using username/password credentials. Keystone server validates the credentials, possibly with an external LDAP/AD server and returns a token to the user. The user sends a request to a service in a selected region including the token. Now the service in the region, say Nova needs to validate the token. The service uses its configured keystone endpoint and service credentials to request token validation from Keystone. After the token is validated by KeyStone, the user is authorized to use the service.

The key considerations for token validation in multisite scenario are:
  • Site level failure: impact on authN and authZ should be as minimal as possible
  • Scalable: as more and more sites added, no bottleneck in token validation
  • Amount of inter region traffic: should be kept as little as possible

Hence, Keystone token validation should preferably be done in the same region as the service itself.

The challenge of distributing the KeyStone service into each region is the KeyStone backend. Different token formats have different data persisted in the backend.

  • Fernet: tokens are non-persistent, cryptography-based tokens validated online by the Keystone service. Fernet tokens are more lightweight than PKI tokens and have a fixed size. Fernet tokens require Keystone to be deployed in a distributed manner, again to avoid inter-region traffic. The data synchronization cost for the Keystone backend is smaller due to the non-persisted tokens.
  • UUID: UUID tokens have a fixed size. Tokens are persistently stored and create a lot of database traffic; the persistence of tokens is for revocation purposes. UUID tokens are validated online by Keystone: every call to a service triggers a token validation request to Keystone, which can become a bottleneck in a large system. Due to this, the UUID token type is not suitable for use in multi-region clouds, regardless of whether the Keystone database is replicated or not.

Cryptographic tokens bring new (compared to UUID tokens) issues/use-cases like key rotation, certificate revocation. Key management is out of scope for this use case.

1.4. Database deployment as the backend for KeyStone service
Database replication:
  • Master/slave asynchronous: supported by the database server itself (MySQL/MariaDB etc.), works over WAN and is more scalable. But only the master provides write functionality (domain/project/role provisioning).
  • Multi-master synchronous: Galera (or others like Percona), not as scalable, for multi-master writing, and needs more parameter tuning for WAN latency. It can provide the capability for a limited multi-site multi-write function for a distributed KeyStone service.
  • Symmetrical/asymmetrical: data is replicated to all regions or to a subset; in the latter case some regions need to access Keystone in another region.

Database server sharing: In an OpenStack controller, normally many databases from different services are provided from the same database server instance. For HA reasons, the database server is usually synchronously replicated to a few other nodes (controllers) to form a cluster. Note that _all_ databases are replicated in this case, for example when Galera synchronous replication is used.

Only the Keystone database can be replicated to other sites. Replicating databases for other services would cause those services to get out of sync and malfunction.

Since only the Keystone database is to be replicated synchronously or asynchronously to another region/site, it's better to deploy the Keystone database on its own database server, with its own networking, cluster or replication configuration. How installers support this is out of scope.

The database server can be shared when asynchronous master/slave replication is used, if global transaction identifiers (GTIDs) are enabled.
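
As a sketch, enabling GTID-based replication in MySQL typically involves settings like the following in my.cnf on both master and slave (illustrative only; consult the documentation of your MySQL/MariaDB version):

[mysqld]
gtid_mode                = ON
enforce_gtid_consistency = ON
log_bin                  = mysql-bin
log_slave_updates        = ON
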

1.5. Deployment options

Distributed KeyStone service with Fernet token

The Fernet token is a fairly new format, introduced only recently. The biggest gains of this token format are: 1) it is lightweight, so its size is small enough to be carried in API requests, unlike PKI tokens (as the number of sites increases, the endpoint list grows and the PKI token becomes too long to carry in an API request); 2) there is no token persistence, so the database changes little and holds only lightweight data (project, role, domain, endpoint, etc.). The drawback of the Fernet token is that the token has to be validated by KeyStone for each API request.

This allows the KeyStone database to work as a cluster across multiple sites (for example, using a MySQL Galera cluster). That means installing a KeyStone API server in each site while sharing the same backend DB cluster. Because the DB cluster synchronizes data in real time across the sites, all KeyStone servers see the same data.

Because each site has KeyStone installed and all data is kept identical, token validation can be done locally in the same site.

The challenge for this solution is how many sites the DB cluster can support. When this question was raised with the MySQL Galera developers, their answer was that there is no limitation on the number of sites, distance or network latency in the code; in practice, however, they have seen a MySQL cluster deployed across 5 data centers, each with 3 nodes.

This solution works well for a limited number of sites that the DB cluster can cover.

Distributed KeyStone service with Fernet token + Async replication (star-mode)

One master KeyStone cluster with Fernet tokens is deployed in one or two sites (for site-level high availability); the other sites are installed with at least 2 slave nodes, each configured with asynchronous DB replication from the master cluster members. The asynchronous replication data sources are preferably different members of the master cluster; if the master cluster spans two sites, it is better that the source members for asynchronous replication are located in different sites.

Only the master cluster nodes are allowed to write; the slave nodes wait for replication from the master cluster members (with very little delay).

Pros:
  • Deploying a database cluster in the master sites provides more master nodes, which allows more slaves to be fed by asynchronous replication in parallel. Using two sites for the master cluster provides higher (site-level) reliability for write requests while reducing the maintenance challenge by limiting the spread of the cluster over too many sites.
  • Multiple slaves in other sites are used because a slave has no knowledge of other slaves; multiple slaves in one site are easier to manage than a cluster, and they work independently while still providing multi-instance redundancy (like a cluster, but independent).
Cons:
  • Need to be aware of the challenge of key distribution and rotation for Fernet tokens (see the sketch below).
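
As an illustration of that challenge, a minimal sketch of how Fernet keys might be created on the master site and distributed to the other sites, assuming the default key repository /etc/keystone/fernet-keys and rsync for distribution (not a complete rotation strategy):

# on the master site: create and periodically rotate the Fernet key repository
keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
keystone-manage fernet_rotate --keystone-user keystone --keystone-group keystone

# distribute the key repository to the KeyStone nodes in the other sites
rsync -a --delete /etc/keystone/fernet-keys/ keystone-node-siteB:/etc/keystone/fernet-keys/
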
2. Multisite VNF Geo site disaster recovery
2.1. Goal

A VNF (telecom application) should be able to be restored in another site when a catastrophic failure happens.

2.2. Key consideration in multisite scenario

Geo-site disaster recovery deals with catastrophic failures (flood, earthquake, propagating software fault) where loss of calls, or even temporary loss of service, is acceptable. It also seems more common to accept or expect manual administrator intervention to drive the process, not least because you don't want to trigger the transfer by mistake.

In terms of coordination/replication or backup/restore between geographic sites, discussion often (but not always) seems to focus on limited application-level data/config replication, as opposed to backup/restore or replication of the cloud infrastructure itself between different sites.

And finally, the lack of a requirement to do fast media transfer (without resignalling) generally removes the need for special networking behavior, with slower DNS-style redirection being acceptable.

The following options concern the cloud infrastructure level capabilities needed to support VNF geo-site disaster recovery.

2.3. Option1, Consistency application backup

The disaster recovery process will work like this:

  1. DR (Geo site disaster recovery) software gets the volumes for each VM in the VNF from Nova
  2. DR software calls the Nova quiesce API to guarantee quiescing the VMs in the desired order
  3. DR software takes snapshots of these volumes in Cinder (NOTE: because storage often provides fast snapshots, the duration between quiesce and unquiesce is a short interval)
  4. DR software calls the Nova unquiesce API to unquiesce the VMs of the VNF in reverse order
  5. DR software creates volumes from the snapshots just taken in Cinder
  6. DR software creates backups (incremental) for these volumes to remote backup storage (Swift or Ceph, etc.) in Cinder
  7. If this site fails:
     1. DR software restores these backup volumes in the remote Cinder in the backup site.
     2. DR software boots VMs from the bootable volumes from the remote Cinder in the backup site and attaches the corresponding data volumes.

Note: Quiesce/Unquiesce spec was approved in Mitaka, but code not get merged in time, https://blueprints.launchpad.net/nova/+spec/expose-quiesce-unquiesce-api The spec was rejected in Newton when it was reproposed: https://review.openstack.org/#/c/295595/. So this option will not work any more.

2.4. Option2, Virtual Machine Snapshot
  1. DR software creates a VM snapshot in Nova
  2. Nova quiesces the VM internally (NOTE: the upper level application or DR software should take care of avoiding infra-level-outage-induced VNF outage)
  3. Nova creates an image in Glance
  4. Nova creates a snapshot of the VM, including volumes
  5. If the VM is a volume-backed VM, a volume snapshot is created in Cinder; no image is uploaded to Glance, but the snapshot is added to the metadata of the image in Glance
  6. DR software gets the snapshot information from Glance
  7. DR software creates volumes from these snapshots
  8. DR software creates backups (incremental) for these volumes to backup storage (Swift or Ceph, etc.) in Cinder
  9. If this site fails:
     1. DR software restores these backup volumes to Cinder in the backup site.
     2. DR software boots VMs from the bootable volumes from Cinder in the backup site and attaches the data volumes.

This option only provides single VM level consistency disaster recovery.

This feature is already available in current OPNFV release.
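
As a rough illustration of this flow with standard OpenStack CLI commands (server and volume names are placeholders; a real DR tool would call the corresponding Nova, Cinder and Glance APIs directly, and incremental backup support depends on the Cinder backup driver):

# create a snapshot image of the VM (Nova quiesces the guest if it supports it)
openstack server image create --name vnf1-snap vnf1

# for a volume-backed VM, snapshot the volume and back it up incrementally
openstack volume snapshot create --volume vnf1-vol vnf1-vol-snap
openstack volume create --snapshot vnf1-vol-snap vnf1-vol-copy
openstack volume backup create --incremental vnf1-vol-copy
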

2.5. Option3, Consistency volume replication
  1. DR software creates a datastore (Block/Cinder, Object/Swift, App Custom storage) with replication enabled at the relevant scope, used to selectively backup/replicate the desired data to the GR backup site
  2. DR software gets the reference of the storage in the remote site storage
  3. If the primary site fails:
     1. DR software managing recovery in the backup site gets references to the relevant storage and passes them to new software instances
     2. Software attaches (or has attached) the replicated storage, in the case of volumes promoting them to writable.
Pros:
  • Replication will be done in the storage level automatically, no need to create backup regularly, for example, daily.
  • Application selection of a limited amount of data to replicate reduces the risk of replicating failed state and generates less overhead.
  • Type of replication and model (active/backup, active/active, etc) can be tailored to application needs
Cons:
  • Applications need to be designed with support in mind, including both selection of data to be replicated and consideration of consistency
  • "Standard" support in OpenStack for disaster recovery is currently fairly limited, though there is active work in this area.

Note: Volume replication v2.1 supports project-level replication.

3. VNF high availability across VIM
3.1. Goal

A VNF (telecom application) should be able to realize a high availability deployment across OpenStack instances.

3.2. Key consideration in multisite scenario

Most telecom applications have already been designed as Active-Standby/Active-Active/N-Way to achieve high availability (99.999%, corresponding to 5.26 minutes of unplanned downtime per year); typically state replication or heartbeat between the Active-Standby/Active-Active/N-Way instances (directly, or via replicated database services, or via privately designed message formats) is required.

We have to accept the currently limited availability (99.99%) of a given OpenStack instance and intend to provide the availability of the telecom application by spreading its functions across multiple OpenStack instances. To help with this, many people appear willing to provide multiple "independent" OpenStack instances in a single geographic site, with special networking (L2/L3) between clouds in that physical site.

The telecom application often has different networking planes for different purposes:

  1. external network plane: used for communication with other telecom applications.
  2. components inter-communication plane: one VNF often consists of several components; this plane is designed for the components to inter-communicate with each other
  3. backup plane: this plane is used for the heartbeat or state replication between the components' active/standby or active/active or N-way cluster.
  4. management plane: this plane is mainly for management purposes, like configuration

Generally these planes are separated from each other. And for a legacy telecom application, each internal plane will have its fixed or flexible IP addressing plan.

There are some interesting/hard requirements on the networking (L2/L3) between OpenStack instances. To make a VNF work in HA mode across different OpenStack instances in one site (but not limited to one site), at least the backup plane needs to span the different OpenStack instances:

1) L2 networking across OpenStack instances for heartbeat or state replication. Overlay L2 networking or shared L2 provider networks can work as the backup plane for heartbeat or state replication. An overlay L2 network is preferred, for the following reasons:

  1. Legacy compatibility: some telecom applications have a built-in internal L2 network; to make it easy to move these applications to VNFs, it is better to provide an L2 network.
  2. An isolated L2 network simplifies the security management between the different network planes.
  3. It is easy to support IP/MAC floating across OpenStack instances.
  4. IP overlapping is supported: multiple VNFs may have overlapping IP addresses for cross-OpenStack-instance networking.

Therefore, a feature providing overlay L2 networking across Neutron/OpenStack instances is required in OpenStack.

2) L3 networking across OpenStack instances for heartbeat or state replication. For L3 networking, we can leverage the floating IP provided in current Neutron, or use VPN or BGPVPN (networking-bgpvpn) to set up the connection.

L3 networking to support VNF HA will consume more resources and needs to take more security factors into consideration, which makes the networking more complex. Also, L3 networking cannot provide IP floating across OpenStack instances.

3) The IP address used by the VNF to connect with other VNFs should be able to float across OpenStack instances. For example, if the master fails, the IP address should be usable by the standby which is running in another OpenStack instance. Methods like VRRP/GARP can help with the movement of the external IP, so no new feature needs to be added to OpenStack.

Several projects are addressing the networking requirements, deployment should consider the factors mentioned above.

4. Multisite.Kingbird user guide
4.1. Quota management for OpenStack multi-region deployments

Kingbird is a centralized synchronization service for multi-region OpenStack deployments. In the OPNFV Colorado release, Kingbird provides a centralized quota management feature. An administrator can set quota per project in Kingbird and sync the quota limits to the multi-region OpenStack deployment periodically or on demand. A tenant can check the total quota limit and usage from Kingbird for all regions. An administrator can also manage the default quota through the quota class setting.

The following quota items can be managed in Kingbird:

  • instances: Number of instances allowed per project.
  • cores: Number of instance cores allowed per project.
  • ram: Megabytes of instance RAM allowed per project.
  • metadata_items: Number of metadata items allowed per instance.
  • key_pairs: Number of key pairs per user.
  • fixed_ips: Number of fixed IPs allowed per project, valid if Nova Network is used.
  • security_groups: Number of security groups per project, valid if Nova Network is used.
  • floating_ips: Number of floating IPs allowed per project, valid if Nova Network is used.
  • network: Number of networks allowed per project, valid if Neutron is used.
  • subnet: Number of subnets allowed per project, valid if Neutron is used.
  • port: Number of ports allowed per project, valid if Neutron is used.
  • security_group: Number of security groups allowed per project, valid if Neutron is used.
  • security_group_rule: Number of security group rules allowed per project, valid if Neutron is used.
  • router: Number of routers allowed per project, valid if Neutron is used.
  • floatingip: Number of floating IPs allowed per project, valid if Neutron is used.
  • volumes: Number of volumes allowed per project.
  • snapshots: Number of snapshots allowed per project.
  • gigabytes: Total amount of storage, in gigabytes, allowed for volumes and snapshots per project.
  • backups: Number of volume backups allowed per project.
  • backup_gigabytes: Total amount of storage, in gigabytes, allowed for volume backups per project.

Key pair is the only resource type supported in resource synchronization.

Only RESTful APIs are provided for Kingbird in the Colorado release, so curl or another HTTP client can be used to call the Kingbird API.

Before using the following commands, get a token, the project ID and the Kingbird service endpoint first. $kb_token represents the token, $admin_tenant_id the administrator project ID, $tenant_id the target project ID for quota management, and $kb_ip_addr the Kingbird service endpoint IP address.

Note: To view all tenants (projects), run:

openstack project list

To get token, run:

openstack token issue

To get Kingbird service endpoint, run:

openstack endpoint list
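
For convenience, these values can be captured in shell variables along the following lines (a sketch; the project name used for $tenant_id and the service name used to filter the endpoint list depend on your deployment):

kb_token=$(openstack token issue -f value -c id)
admin_tenant_id=$(openstack project show admin -f value -c id)
tenant_id=$(openstack project show <target-project> -f value -c id)
# note the Kingbird endpoint address to use as $kb_ip_addr
openstack endpoint list --service kingbird
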
4.2. Quota Management API
  1. Update global limit for a tenant

    Use python-kingbirdclient:

    kingbird quota update b8eea2ceda4c47f1906fda7e7152a322 --port 10 --security_groups 10
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X PUT \
    -d '{"quota_set":{"cores": 10,"ram": 51200, "metadata_items": 100,"key_pairs": 100, "network":20,"security_group": 20,"security_group_rule": 20}}' \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-sets/$tenant_id
    
  2. Get global limit for a tenant

    Use python-kingbirdclient:

    kingbird quota show --tenant $tenant_id
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-sets/$tenant_id
    
  3. A tenant can also get the global limit for itself

    Use python-kingbirdclient:

    kingbird quota show
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    http://$kb_ip_addr:8118/v1.0/$tenant_id/os-quota-sets/$tenant_id
    
  4. Get defaults limits

    Use python-kingbirdclient:

    kingbird quota defaults
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-sets/defaults
    
  5. Get total usage for a tenant

    Use python-kingbirdclient:

    kingbird quota detail --tenant $tenant_id
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X GET \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-sets/$tenant_id/detail
    
  6. A tenant can also get the total usage for itself

    Use python-kingbirdclient:

    kingbird quota detail
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X GET \
    http://$kb_ip_addr:8118/v1.0/$tenant_id/os-quota-sets/$tenant_id/detail
    
  7. On demand quota sync

    Use python-kingbirdclient:

    kingbird quota sync $tenant_id
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X PUT \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-sets/$tenant_id/sync
    
  8. Delete specific global limit for a tenant

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X DELETE \
    -d '{"quota_set": [ "cores", "ram"]}' \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-sets/$tenant_id
    
  9. Delete all kingbird global limit for a tenant

    Use python-kingbirdclient:

    kingbird quota delete $tenant_id
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X DELETE \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-sets/$tenant_id
    
4.3. Quota Class API
  1. Update default quota class

    Use python-kingbirdclient:

    kingbird quota-class update --port 10 --security_groups 10 <quota class>
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X PUT \
    -d '{"quota_class_set":{"cores": 100, "network":50,"security_group": 50,"security_group_rule": 50}}' \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-class-sets/default
    
  2. Get default quota class

    Use python-kingbirdclient:

    kingbird quota-class show default
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-class-sets/default
    
  3. Delete default quota class

    Use python-kingbirdclient:

    kingbird quota-class delete default
    

    Use curl:

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X DELETE \
    http://$kb_ip_addr:8118/v1.0/$admin_tenant_id/os-quota-class-sets/default
    
4.4. Resource Synchronization API
  1. Create synchronization job

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X POST -d \
    '{"resource_set":{"resources": ["<Keypair_name>"],"force":<True/False>,"resource_type": "keypair","source": <"Source_Region">,"target": [<"List_of_target_regions">]}}' \
    http://$kb_ip_addr:8118/v1.0/$tenant_id/os-sync
    
  2. Get synchronization job

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    http://$kb_ip_addr:8118/v1.0/$tenant_id/os-sync/
    
  3. Get active synchronization job

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    http://$kb_ip_addr:8118/v1.0/$tenant_id/os-sync/active
    
  4. Get detail information of a synchronization job

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    http://$kb_ip_addr:8118/v1.0/$tenant_id/os-sync/$job_id
    
  5. Delete a synchronization job

    curl \
    -H "Content-Type: application/json" \
    -H "X-Auth-Token: $kb_token" \
    -X DELETE \
     http://$kb_ip_addr:8118/v1.0/$tenant_id/os-sync/$job_id
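
As a usage illustration, a filled-in request to create a synchronization job might look like the following sketch, assuming a keypair named "kp1" is to be copied from RegionOne to RegionTwo (keypair and region names are placeholders, and the exact type of the "force" value follows the template in item 1):

curl \
-H "Content-Type: application/json" \
-H "X-Auth-Token: $kb_token" \
-X POST -d \
'{"resource_set":{"resources": ["kp1"],"force":"True","resource_type": "keypair","source": "RegionOne","target": ["RegionTwo"]}}' \
http://$kb_ip_addr:8118/v1.0/$tenant_id/os-sync
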
    
5. Multisite.Tricircle user guide

Tricircle is an OpenStack big-tent project. All user-guide-related documents can be found on the OpenStack website.

Netready

NetReady: Network Readiness
Project:

NetReady, https://wiki.opnfv.org/display/netready/NetReady

Editors:

Georg Kunz (Ericsson)

Authors:

Bin Hu (AT&T), Gergely Csatari (Nokia), Georg Kunz (Ericsson) and others

Abstract:

OPNFV provides an infrastructure with different SDN controller options to realize NFV functionality on the platform it builds. As OPNFV uses OpenStack as a VIM, we need to analyze the capabilities this component offers us. The networking functionality is provided by a component called Neutron, which provides a pluggable architecture and specific APIs for integrating different networking backends, for instance SDN controllers. As NFV wasn't taken into consideration at the time when Neutron was designed, we are already facing several bottlenecks and architectural shortcomings while implementing NFV use cases.

The NetReady project aims at evolving OpenStack networking step-by-step to find the most efficient way to fulfill the requirements of the identified NFV use cases, taking into account the NFV mindset and the capabilities of SDN controllers.

History:
Date Description
22.03.2016 Project creation
19.04.2016 Initial version of the deliverable uploaded to Gerrit
22.07.2016 First version ready for sharing with the community
22.09.2016 Version accompanying the OPNFV Colorado release
1. Introduction

This document represents and describes the results of the OPNFV NetReady (Network Readiness) project. Specifically, the document comprises a selection of NFV-related networking use cases and their networking requirements. For every use case, it furthermore presents a gap analysis of the aforementioned requirements with respect to the current OpenStack networking architecture. Finally it provides a description of potential solutions and improvements.

1.1. Scope

NetReady is a project within the OPNFV initiative. Its focus is on NFV (Network Function Virtualization) related networking use cases and their requirements on the underlying NFVI (Network Function Virtualization Infrastructure).

The NetReady project addresses the OpenStack networking architecture, specifically OpenStack Neutron, from a NFV perspective. Its goal is to identify gaps in the current OpenStack networking architecture with respect to NFV requirements and to propose and evaluate improvements and potential complementary solutions.

1.2. Problem Description

The telco ecosystem's movement towards the cloud domain results in Network Function Virtualization, which is discussed and specified in ETSI NFV. This movement opens up many green field areas which are full of potential growth in both business and technology. This new NFV domain brings new business opportunities and new market segments, as well as emerging technologies that are exploratory and experimental in nature, especially in NFV networking.

It is often stated that NFV imposes additional requirements on the networking architecture and feature set of the underlying NFVI beyond those of data center networking. For instance, the NFVI needs to establish and manage connectivity beyond the data center to the WAN (Wide Area Network). Moreover, NFV networking use cases often abstract from L2 connectivity and instead focus on L3-only connectivity. Hence, the NFVI networking architecture needs to be flexible enough to be able to meet the requirements of NFV-related use cases in addition to traditional data center networking.

Traditionally, OpenStack networking, represented typically by the OpenStack Neutron project, targets virtualized data center networking. This comprises originally establishing and managing layer 2 network connectivity among VMs (Virtual Machines). Over the past releases of OpenStack, Neutron has grown to provide an extensive feature set, covering both L2 as well as L3 networking services such as virtual routers, NATing, VPNaaS and BGP VPNs.

It is an ongoing debate how well the current OpenStack networking architecture can meet the additional requirements of NFV networking. Hence, a thorough analysis of NFV networking requirements and their relation to the OpenStack networking architecture is needed.

Besides the current additional use cases and requirements of NFV networking, and more importantly because of the green field nature of NFV, it is foreseen that there will be more and more new NFV networking use cases and services, which will bring new business, in the near future. The challenges for the telco ecosystem are to:

  • Quickly catch the new business opportunity;
  • Execute it in an agile way so that we can accelerate the time-to-market and improve business agility in offering our customers innovative NFV services.

Therefore, it is critically important for the telco ecosystem to quickly develop and deploy new NFV networking APIs on demand based on market need.

1.3. Goals

The goals of the NetReady project and correspondingly this document are the following:

  • This document comprises a collection of relevant NFV networking use cases and clearly describes their requirements on the NFVI. These requirements are stated independently of a particular implementation, for instance OpenStack Neutron. Instead, requirements are formulated in terms of APIs (Application Programming Interfaces) and data models needed to realize a given NFV use case.
  • The list of use cases is not considered to be all-encompassing but it represents a carefully selected set of use cases that are considered to be relevant at the time of writing. More use cases may be added over time. The authors are very open to suggestions, reviews, clarifications, corrections and feedback in general.
  • This document contains a thorough analysis of the gaps in the current OpenStack networking architecture with respect to the requirements imposed by the selected NFV use cases. To this end, we analyze existing functionality in OpenStack networking.
  • Beyond current list of use cases and gap analysis in the document, more importantly, it is the future of NFV networking that needs to be made easy to innovate, quick to develop, and agile to deploy and operate. A model-driven, extensible framework is expected to achieve agility for innovations in NFV networking.
  • This document will in future revisions describe the proposed improvements and complementary solutions needed to enable OpenStack to fulfill the identified NFV requirements.
2. Use Cases

The following sections address networking use cases that have been identified to be relevant in the scope of NFV and NetReady.

2.1. Multiple Networking Backends
2.1.1. Description

Network Function Virtualization (NFV) brings the need of supporting multiple networking back-ends in virtualized infrastructure environments.

First of all, a Service Providers' virtualized network infrastructure will consist of multiple SDN Controllers from different vendors for obvious business reasons. Those SDN Controllers may be managed within one cloud or multiple clouds. Jointly, those VIMs (e.g. OpenStack instances) and SDN Controllers need to work together in an interoperable framework to create NFV services in the Service Providers' virtualized network infrastructure. Hence, one VIM (e.g. an OpenStack instance) needs to be able to support multiple SDN Controllers as back-ends.

Secondly, a Service Providers’ virtualized network infrastructure will serve multiple, heterogeneous administrative domains, such as mobility domain, access networks, edge domain, core networks, WAN, enterprise domain, etc. The architecture of virtualized network infrastructure needs different types of SDN Controllers that are specialized and targeted for specific features and requirements of those different domains. The architectural design may also include global and local SDN Controllers. Importantly, multiple local SDN Controllers may be managed by one VIM (e.g. OpenStack instance).

Furthermore, even within one administrative domain, NFV services could also be quite diversified. Specialized NFV services require specialized and dedicated SDN Controllers. Thus a Service Provider needs to use multiple APIs and back-ends simultaneously in order to provide users with diversified services at the same time. At the same time, for a particular NFV service, the new networking APIs need to be agnostic of the back-ends.

2.1.2. Requirements

Based on the use cases described above, we derive the following requirements.

It is expected that in NFV networking service domain:

  • One OpenStack instance shall support multiple SDN Controllers simultaneously
  • New networking API shall be integrated flexibly and quickly
  • New NFV Networking APIs shall be agnostic of back-ends
  • Interoperability is needed among multi-vendor SDN Controllers at back-end
2.1.3. Current Implementation

In the current implementation of OpenStack networking, SDN controllers are hooked up to Neutron by means of dedicated plugins. A plugin translates requests coming in through the Neutron northbound API, e.g. the creation of a new network, into the appropriate northbound API calls of the corresponding SDN controller.

There are multiple different plugin mechanisms currently available in Neutron, each targeting a different purpose. In general, there are core plugins, covering basic networking functionality, and service plugins, providing layer 3 connectivity and advanced networking services such as FWaaS or LBaaS.

2.1.3.1. Core and ML2 Plugins

The Neutron core plugins cover basic Neutron functionality, such as creating networks and ports. Every core plugin implements the functionality needed to cover the full range of the Neutron core API. A special instance of a core plugin is the ML2 core plugin, which in turn allows for using sub-drivers - separated again into type drivers (VLAN, VxLAN, GRE) or mechanism drivers (OVS, OpenDaylight, etc.). This allows using dedicated sub-drivers for dedicated functionality.

In practice, different SDN controllers use both plugin mechanisms to integrate with Neutron. For instance, OpenDaylight uses an ML2 mechanism plugin driver whereas OpenContrail integrates by means of a full core plugin.

In its current implementation, only one Neutron core plugin can be active at any given time. This means that if a SDN controller utilizes a dedicated core plugin, no other SDN controller can be used at the same time for the same type of service.

In contrast, the ML2 plugin allows for using multiple mechanism drivers simultaneously. In principle, this enables a parallel deployment of multiple SDN controllers if and only if all SDN controllers integrate through a ML2 mechanism driver.
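
As an illustration, such a parallel deployment via ML2 corresponds to a Neutron configuration roughly like the following sketch (the driver names are examples and depend on the controllers and plugins actually installed):

# neutron.conf
[DEFAULT]
core_plugin = ml2

# ml2_conf.ini
[ml2]
type_drivers = vxlan,vlan
tenant_network_types = vxlan
mechanism_drivers = opendaylight,openvswitch
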

2.1.3.2. Neutron Service Plugins

Neutron service plugins target L3 services and advanced networking services, such as BGPVPN or LBaaS. Typically, a service itself provides a driver plugin mechanism which needs to be implemented for every SDN controller. As the architecture of the driver mechanism is up to the community developing the service plugin, it needs to be analyzed for every driver plugin mechanism individually if and how multiple back-ends are supported.

2.1.4. Gaps in the current solution

Given the use case description and the current implementation of OpenStack Neutron, we identify the following gaps:

  • [MB-GAP1] Limited support for multiple back-ends

    As pointed out above, the Neutron core plugin mechanism only allows for one active plugin at a time. The ML2 plugin allows for running multiple mechanism drivers in parallel, however, successful inter-working strongly depends on the individual driver.

    Moreover, the ML2 plugin and its API is - by design - very layer 2 focused. For NFV networking use cases beyond layer 2, for instance L3VPNs, a more flexible API is required.

2.1.5. Conclusion

We conclude that a complementary method of integrating multiple SDN controllers into a single OpenStack deployment is needed to fulfill the needs of operators.

2.2. L3VPN Use Cases

L3VPNs are virtual layer 3 networks described in multiple standards and RFCs, such as [RFC4364] and [RFC7432]. Connectivity as well as traffic separation is achieved by exchanging routes between VRFs (Virtual Routing and Forwarding).

Moreover, a Service Providers’ virtualized network infrastructure may consist of one or more SDN Controllers from different vendors. Those SDN Controllers may be managed within one cloud or multiple clouds. Jointly, those VIMs (e.g. OpenStack instances) and SDN Controllers work together in an interoperable framework to create L3 services in the Service Providers’ virtualized network infrastructure.

While interoperability between SDN controllers and the corresponding data planes is ensured based on standardized protocols (e.g., [RFC4364] and [RFC7432]), the integration and management of different SDN domains from the VIM is not clearly defined. Hence, this section analyses three L3VPN use cases involving multiple SDN Controllers.

2.2.1. Any-to-Any Base Case
2.2.1.1. Description

This any-to-any use case is the base scenario, providing layer 3 connectivity between VNFs in the same L3VPN while separating the traffic and IP address spaces of different L3VPNs belonging to different tenants.

There are 2 hosts (compute nodes). SDN Controller A and vForwarder A are provided by Vendor A and run on host A. SDN Controller B and vForwarder B are provided by Vendor B, and run on host B.

There are 2 tenants. Tenant 1 creates L3VPN Blue with 2 subnets: 10.1.1.0/24 and 10.3.7.0/24. Tenant 2 creates L3VPN Red with 1 subnet and an overlapping address space: 10.1.1.0/24. The network topology is shown in l3vpn-any2any-figure.

_images/l3vpn-any2any.png

In L3VPN Blue, VMs G1 (10.1.1.5) and G2 (10.3.7.9) are spawned on host A, and attached to 2 subnets (10.1.1.0/24 and 10.3.7.0/24) and assigned IP addresses respectively. VMs G3 (10.1.1.6) and G4 (10.3.7.10) are spawned on host B, and attached to 2 subnets (10.1.1.0/24 and 10.3.7.0/24) and assigned IP addresses respectively.

In L3VPN Red, VM G5 (10.1.1.5) is spawned on host A, and attached to subnet 10.1.1.0/24. VM G6 (10.1.1.6) is spawned on host B, and attached to the same subnet 10.1.1.0/24.

2.2.1.2. Derived Requirements
2.2.1.2.1. Northbound API / Workflow

An example of the desired workflow is as follows:

  1. Create Network
  2. Create Network VRF Policy Resource Any-to-Any

     2.1. This policy causes the following configuration when a VM of this tenant is spawned on a host:

          2.1.1. There will be an RD assigned per VRF

          2.1.2. There will be an RT used for the common any-to-any communication

  3. Create Subnet
  4. Create Port (subnet, network VRF policy resource). This causes the controller to:

     4.1. Create a VRF in the vForwarder's FIB, or update the VRF if it already exists

     4.2. Install an entry for the guest's host route in the FIBs of the vForwarders serving this tenant's virtual network

     4.3. Announce the guest host route to the WAN-GW via MP-BGP

2.2.1.3. Current implementation

Support for creating and managing L3VPNs is available in OpenStack Neutron by means of the [BGPVPN] project. In order to create the L3VPN network configuration described above using the [BGPVPN] API, the following workflow is needed:

  1. Create Neutron networks for tenant "Blue"

     neutron net-create --tenant-id Blue net1
     neutron net-create --tenant-id Blue net2

  2. Create subnets for the Neutron networks for tenant "Blue"

     neutron subnet-create --tenant-id Blue --name subnet1 net1 10.1.1.0/24
     neutron subnet-create --tenant-id Blue --name subnet2 net2 10.3.7.0/24

  3. Create Neutron ports in the corresponding networks for tenant "Blue"

     neutron port-create --tenant-id Blue --name G1 --fixed-ip subnet_id=subnet1,ip_address=10.1.1.5 net1
     neutron port-create --tenant-id Blue --name G2 --fixed-ip subnet_id=subnet1,ip_address=10.1.1.6 net1
     neutron port-create --tenant-id Blue --name G3 --fixed-ip subnet_id=subnet2,ip_address=10.3.7.9 net2
     neutron port-create --tenant-id Blue --name G4 --fixed-ip subnet_id=subnet2,ip_address=10.3.7.10 net2

  4. Create a Neutron network for tenant "Red"

     neutron net-create --tenant-id Red net3

  5. Create a subnet for the Neutron network of tenant "Red"

     neutron subnet-create --tenant-id Red --name subnet3 net3 10.1.1.0/24

  6. Create Neutron ports in the network of tenant "Red"

     neutron port-create --tenant-id Red --name G5 --fixed-ip subnet_id=subnet3,ip_address=10.1.1.5 net3
     neutron port-create --tenant-id Red --name G7 --fixed-ip subnet_id=subnet3,ip_address=10.1.1.6 net3

  7. Create an L3VPN by means of the BGPVPN API for tenant "Blue"

     neutron bgpvpn-create --tenant-id Blue --route-targets AS:100 --name vpn1

  8. Associate the L3VPN of tenant "Blue" with the previously created networks

     neutron bgpvpn-net-assoc-create --tenant-id Blue --network net1 --name vpn1
     neutron bgpvpn-net-assoc-create --tenant-id Blue --network net2 --name vpn1

  9. Create an L3VPN by means of the BGPVPN API for tenant "Red"

     neutron bgpvpn-create --tenant-id Red --route-targets AS:200 --name vpn2

  10. Associate the L3VPN of tenant "Red" with the previously created networks

      neutron bgpvpn-net-assoc-create --tenant-id Red --network net3 --name vpn2

Comments:

  • In this configuration only one BGPVPN for each tenant is created.
  • The ports are associated indirectly to the VPN through their networks.
  • The BGPVPN backend takes care of distributing the /32 routes to the vForwarder instances and assigning appropriate RD values.
2.2.1.4. Gaps in the current solution

In terms of the functionality provided by the BGPVPN project, there are no gaps preventing this particular use case from an L3VPN perspective.

However, in order to support the multi-vendor aspects of this use case, better support for integrating multiple backends is needed (see previous use case).

2.2.2. L3VPN: ECMP Load Splitting Case (Anycast)
2.2.2.1. Description

In this use case, multiple instances of a VNF are reachable through the same IP. The networking infrastructure is then responsible for spreading the network load across the VNF instances using Equal-Cost Multi-Path (ECMP) or performing a fail-over in case of a VNF failure.

There are 2 hosts (compute nodes). SDN Controller A and vForwarder A are provided by Vendor A, and run on host A. SDN Controller B and vForwarder B are provided by Vendor B, and run on host B.

There is one tenant. Tenant 1 creates L3VPN Blue with subnet 10.1.1.0/24.

The network topology is shown in l3vpn-ecmp-figure:

_images/l3vpn-ecmp.png

In L3VPN Blue, VNF1.1 and VNF1.2 are spawned on host A, attached to subnet 10.1.1.0/24 and assigned the same IP address 10.1.1.5. VNF1.3 is spawned on host B, attached to subnet 10.1.1.0/24 and assigned the same IP address 10.1.1.5. VNF 2 and VNF 3 are spawned on hosts A and B respectively, attached to subnet 10.1.1.0/24, and assigned different IP addresses 10.1.1.6 and 10.1.1.3 respectively.

Here, the Network VRF Policy Resource is ECMP/AnyCast. Traffic to the anycast IP 10.1.1.5 can be load split from either WAN GW or another VM like G5.

2.2.2.2. Current implementation

Support for creating and managing L3VPNs is, in general, available in OpenStack Neutron by means of the BGPVPN project [BGPVPN]. However, the BGPVPN project does not yet fully support ECMP as described in the following.

There are (at least) two different approaches to configuring ECMP:

  1. Using Neutron ports with identical IP addresses, or
  2. Using Neutron ports with unique IP addresses and creating static routes to a common IP prefix with next hops pointing to the unique IP addresses.
2.2.2.2.1. Ports with identical IP addresses

In this approach, multiple Neutron ports using the same IP address are created. In the current Neutron architecture, a port has to reside in a specific Neutron network. However, re-using the same IP address multiple times in a given Neutron network is not possible as this would create an IP collision. As a consequence, creating one Neutron network for each port is required.

Given multiple Neutron networks, the BGPVPN API allows for associating those networks with the same VPN. It is then up to the networking backend to implement ECMP load balancing. This behavior and the corresponding API for configuring it are currently not available. It is nevertheless on the road map of the BGPVPN project.
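As an illustration only, and assuming a backend that implements the intended ECMP behavior, the identical-IP approach would use one Neutron network per VNF instance sharing the anycast IP, re-using the command style from the previous examples (all network, subnet and port names below are made up for this sketch):

# one network and subnet per VNF instance that shares the anycast IP
neutron net-create --tenant-id Blue anycast-net-1
neutron net-create --tenant-id Blue anycast-net-2
neutron subnet-create --tenant-id Blue --name anycast-subnet-1 anycast-net-1 10.1.1.0/24
neutron subnet-create --tenant-id Blue --name anycast-subnet-2 anycast-net-2 10.1.1.0/24

# the same IP address can now be used once per network
neutron port-create --tenant-id Blue --name VNF1.1 --fixed-ip subnet_id=anycast-subnet-1,ip_address=10.1.1.5 anycast-net-1
neutron port-create --tenant-id Blue --name VNF1.2 --fixed-ip subnet_id=anycast-subnet-2,ip_address=10.1.1.5 anycast-net-2

# associate all networks with the same VPN; the ECMP/fail-over behavior is left to the backend
neutron bgpvpn-create --tenant-id Blue --route-targets AS:100 --name vpn1
neutron bgpvpn-net-assoc-create --tenant-id Blue --network anycast-net-1 --name vpn1
neutron bgpvpn-net-assoc-create --tenant-id Blue --network anycast-net-2 --name vpn1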

2.2.2.2.2. Static Routes to ports with unique IP addresses

In this approach, Neutron ports are assigned unique IPs and static routes pointing to the same ECMP load-balanced prefix are created. The static routes define the unique Neutron port IPs as next-hop addresses.

Currently, the API for static routes is not yet available in the BGPVPN project, but it is on the road map. The following work flow shows how to realize this particular use case under the assumption that support for static routes is available in the BGPVPN API.

  1. Create Neutron network for tenant “Blue”
neutron net-create --tenant-id Blue net1
  2. Create subnet for the network of tenant “Blue”
neutron subnet-create --tenant-id Blue --name subnet1 net1 5.1.1.0/24
  3. Create Neutron ports in the network of tenant “Blue”

neutron port-create --tenant-id Blue --name G1 --fixed-ip subnet_id=subnet1,ip_address=5.1.1.1 net1

neutron port-create --tenant-id Blue --name G2 --fixed-ip subnet_id=subnet1,ip_address=5.1.1.2 net1

neutron port-create --tenant-id Blue --name G3 --fixed-ip subnet_id=subnet1,ip_address=5.1.1.3 net1

neutron port-create --tenant-id Blue --name G4 --fixed-ip subnet_id=subnet1,ip_address=5.1.1.4 net1

neutron port-create --tenant-id Blue --name G5 --fixed-ip subnet_id=subnet1,ip_address=5.1.1.5 net1

neutron port-create --tenant-id Blue --name G6 --fixed-ip subnet_id=subnet1,ip_address=5.1.1.6 net1

  4. Create a L3VPN for tenant “Blue”
neutron bgpvpn-create --tenant-id Blue --route-target AS:100 vpn1
  5. Associate the BGPVPN with the network of tenant “Blue”
neutron bgpvpn-network-associate --tenant-id Blue --network-id net1 vpn1
  6. Create static routes which point to the same target

neutron bgpvpn-static-route-add --tenant-id Blue --cidr 10.1.1.5/32 --nexthop-ip 5.1.1.1 vpn1

neutron bgpvpn-static-route-add --tenant-id Blue --cidr 10.1.1.5/32 --nexthop-ip 5.1.1.2 vpn1

neutron bgpvpn-static-route-add --tenant-id Blue --cidr 10.1.1.5/32 --nexthop-ip 5.1.1.3 vpn1

2.2.2.3. Gaps in the current solution

Given the use case description and the implementation currently available in OpenStack through the BGPVPN project, we identify the following gaps:

  • [L3VPN-ECMP-GAP1] Static routes are not yet supported by the BGPVPN project.

    Currently, no API for configuring static routes is available in the BGPVPN project. This feature is on the road map, however.

  • [L3VPN-ECMP-GAP2] Behavior not defined for multiple Neutron ports with the same IP

    The Neutron and BGPVPN APIs allow for creating multiple ports with the same IP in different networks and associating the networks with the same VPN. The exact behavior of this configuration is, however, not defined, and an API for configuring the behavior (load-balancing or fail-over) is missing. Development of this feature is on the road map of the project, however.

  • [L3VPN-ECMP-GAP3] It is not possible to assign the same IP to multiple Neutron ports within the same Neutron subnet.

    This is due to the fundamental requirement of avoiding IP collisions within the L2 domain which is a Neutron network.

2.2.2.4. Conclusions

In the context of the ECMP use case, three gaps have been identified. Gaps [L3VPN-ECMP-GAP1] and [L3VPN-ECMP-GAP2] represent missing or undefined functionality in the BGPVPN project. There is no architectural hindrance preventing the implementation of the missing features in the BGPVPN project or in Neutron.

The third gap [L3VPN-ECMP-GAP3] is based on the fact that Neutron ports always have to exist in a Neutron network. As a consequence, in order to create ports with the same IP, multiple networks must be used. This port-network binding will most likely not be relaxed in future releases of Neutron in order to retain backwards compatibility. A clean alternative to Neutron could instead provide more modeling flexibility.

2.2.3. Hub and Spoke Case
2.2.3.1. Description

In a traditional hub-and-spoke topology there are two types of network entities: a central hub and multiple spokes. The corresponding VRFs of the hub and the spokes are configured to import and export routes such that all traffic is directed through the hub. As a result, spokes cannot communicate with each other directly, but only indirectly via the central hub. Hence, the hub typically hosts central network functions such as firewalls.

Furthermore, there is no layer 2 connectivity between the VNFs.

In addition, in this use case, the deployed network infrastructure comprises equipment from two different vendors, Vendor A and Vendor B. There are 2 hosts (compute nodes). SDN Controller A and vForwarder A are provided by Vendor A, and run on host A. SDN Controller B and vForwarder B are provided by Vendor B, and run on host B.

There is 1 tenant. Tenant 1 creates L3VPN Blue with 2 subnets: 10.1.1.0/24 and 10.3.7.0/24.

The network topology is shown in l3vpn-hub-spoke-figure:

_images/l3vpn-hub-spoke.png

In L3VPN Blue, vFW(H), a virtual firewall, acts as the hub. The other 3 VNF VMs are spokes. vFW(H) and VNF1(S) are spawned on host A, and VNF2(S) and VNF3(S) are spawned on host B. vFW(H) (10.1.1.5) and VNF2(S) (10.1.1.6) are attached to subnet 10.1.1.0/24. VNF1(S) (10.3.7.9) and VNF3(S) (10.3.7.10) are attached to subnet 10.3.7.0/24.

2.2.3.2. Derived Requirements
2.2.3.2.1. Northbound API / Workflow

An example workflow is as follows:

  1. Create Network
  2. Create VRF Policy Resource

2.1. Hub and Spoke

  3. Create Subnet
  4. Create Port

4.1. Subnet

4.2. VRF Policy Resource, [H | S]

2.2.3.2.2. Current implementation

Different APIs have been developed to support creating an L3 network topology and directing network traffic through specific network elements in a specific order, for example [BGPVPN] and [NETWORKING-SFC]. We analyzed those APIs with regard to the Hub-and-Spoke use case.

2.2.3.2.2.1. BGPVPN

Support for creating and managing L3VPNs is in general available in OpenStack Neutron by means of the BGPVPN API [BGPVPN]. The [BGPVPN] API currently supports the concepts of network- and router-associations. An association maps Neutron network objects (networks and routers) to a VRF with the following semantics:

  • A network association interconnects all subnets and ports of a Neutron network by binding them to a given VRF.
  • A router association interconnects all networks, and hence indirectly all ports, connected to a Neutron router by binding them to a given VRF.

It is important to notice that these associations apply to entire Neutron networks including all ports connected to a network. This is due to the fact that in Neutron, ports can only exist within a network and not individually. Furthermore, Neutron networks were originally designed to represent layer 2 domains. As a result, ports within the same Neutron network typically have layer 2 connectivity among each other. There are efforts to relax this original design assumption, e.g. routed networks, which, however, do not solve the problem at hand here (see the gap analysis further down below).

In order to realize the hub-and-spoke topology outlined above, VRFs need to be created on a per-port basis. Specifically, ports belonging to the same network should not be interconnected except through a corresponding configuration of a per-port VRF. This configuration includes setting up the next-hop routing table, labels, import and export route targets (I-RT and E-RT), etc., in order to direct traffic from the hub to the spokes.

It may be argued that given the current network- and router-association mechanisms, the following workflow establishes a network topology which aims to achieve the desired traffic flow from Hub to Spokes. The basic idea is to model separate VRFs per VM by creating a dedicated Neutron network with two subnets for each VRF in the Hub-and-Spoke topology.

  1. Create Neutron network “hub”
neutron net-create --tenant-id Blue hub
  2. Create a separate Neutron network for every “spoke”
neutron net-create --tenant-id Blue spoke-i
  3. For every network (hub and spokes), create two subnets

neutron subnet-create <hub/spoke-i UUID> --tenant-id Blue 10.1.1.0/24

neutron subnet-create <hub/spoke-i UUID> --tenant-id Blue 10.3.7.0/24

  4. Create the Neutron ports in the corresponding networks

neutron port-create --tenant-id Blue --name vFW(H) --fixed-ip subnet_id=<hub UUID>,ip_address=10.1.1.5

neutron port-create --tenant-id Blue --name VNF1(S) --fixed-ip subnet_id=<spoke-i UUID>,ip_address=10.3.7.9

neutron port-create --tenant-id Blue --name VNF2(S) --fixed-ip subnet_id=<spoke-i UUID>,ip_address=10.1.1.6

neutron port-create --tenant-id Blue --name VNF3(S) --fixed-ip subnet_id=<spoke-i UUID>,ip_address=10.3.7.10

  5. Create a BGPVPN object (VRF) for the hub network with the corresponding import and export targets
neutron bgpvpn-create --name hub-vrf --import-targets <RT-hub RT-spoke> --export-targets <RT-hub>
  6. Create a BGPVPN object (VRF) for every spoke network with the corresponding import and export targets
neutron bgpvpn-create --name spoke-i-vrf --import-targets <RT-hub> --export-targets <RT-spoke>
  7. Associate the hub network with the hub VRF
bgpvpn-net-assoc-create hub --network <hub network-UUID>
  8. Associate each spoke network with the corresponding spoke VRF
bgpvpn-net-assoc-create spoke-i --network <spoke-i network-UUID>
  9. Add static route to direct all traffic to vFW VNF running at the hub.

    Note: Support for static routes not yet available.

neutron bgpvpn-static-route-add --tenant-id Blue --cidr 0/0 --nexthop-ip 10.1.1.5 hub

After step 9, VMs can be booted with the corresponding ports.

The resulting network topology intends to resemble the target topology as shown in l3vpn-hub-spoke-figure and to achieve the desired traffic direction from hub to spokes. However, it deviates significantly from the essence of the Hub-and-Spoke use case as described above in terms of the desired network topology, i.e. one L3VPN with multiple VRFs associated with vFW(H) and the other VNFs(S) separately. Moreover, this method of using the current network- and router-association mechanisms does not scale when there is a large number of spokes, or in case of scale-in and scale-out of hubs and spokes.

The gap analysis in the next section describes the technical reasons for this.

2.2.3.2.2.2. Network SFC

Support for Service Function Chaining is in general available in OpenStack Neutron through the Neutron API for Service Insertion and Chaining project [NETWORKING-SFC]. However, the [NETWORKING-SFC] API is focused on creating service chains through NSH at L2, although it intends to be agnostic of the backend implementation. It is unclear whether or not the service chain from vFW(H) to the VNFs(S) can be created using the L3VPN-based VRF policy approach through the [NETWORKING-SFC] API.

Hence, it is currently not possible to configure the networking use case as described above.

2.2.3.2.3. Gaps in the Current Solution

Given the use case description and the currently available implementation in OpenStack provided by [BGPVPN] project and [NETWORKING-SFC] project, we identify the following gaps:

  • [L3VPN-HS-GAP1] No means to disable the layer 2 semantics of Neutron networks

    Neutron networks were originally designed to represent layer 2 broadcast domains. As such, all ports connected to a network are in principle inter-connected on layer 2 (not considering security rules here). In contrast, in order to realize L3VPN use cases such as the hub-and-spoke topology, connectivity among ports must be controllable on a per port basis on layer 3.

    There are ongoing efforts to relax this design assumption, for instance by means of routed networks ([NEUTRON-ROUTED-NETWORKS]). In a routed network, a Neutron network is a layer 3 domain which is composed of multiple layer 2 segments. A routed network only provides layer 3 connectivity across segments; layer 2 connectivity across segments is optional. This means that, depending on the particular networking backend and segmentation technique used, there might or might not be layer 2 connectivity across segments. A new flag l2_adjacency indicates whether or not a user can expect layer 2 connectivity across segments.

    This flag, however, is read-only and cannot be used to override or disable the layer 2 semantics of a Neutron network.

  • [L3VPN-HS-GAP2] No port-association available in the BGPVPN project yet

    Due to gap [L3VPN-HS-GAP1], the [BGPVPN] project has not yet been able to implement the concept of a port association. A port association would allow associating individual ports with VRFs and thereby controlling layer 3 connectivity on a per-port basis.

    The workflow described above intends to mimic port associations by means of separate Neutron networks. Hence, the resulting workflow is overly complicated and unintuitive, as it requires creating additional Neutron entities (networks) which are not present in the target topology. Moreover, creating large numbers of Neutron networks limits scalability.

    Port associations are on the road map of the [BGPVPN] project, however, no design that overcomes the problems outlined above has been specified yet. Consequently, the time-line for this feature is unknown.

    As a result, creating a clean Hub-and-Spoke topology is currently not supported by the [BGPVPN] API.

  • [L3VPN-HS-GAP3] No support for static routes in the BGPVPN project yet

    In order to realize the hub-and-spoke use case, a static route is needed to attract the traffic at the hub to the corresponding VNF (direct traffic to the firewall). Support for static routes in the BGPVPN project is available for the router association by means of the Neutron router extra routes feature. However, there is no support for static routes for network and port associations yet.

    Design work for supporting static routes for network associations has started, but no final design has been proposed yet.

2.2.4. Conclusion

Based on the gap analyses of the three specific L3VPN use cases we conclude that there are gaps in both the functionality provided by the BGPVPN project as well as the support for multiple backends in Neutron.

Some of the identified gaps [L3VPN-ECMP-GAP1, L3VPN-ECMP-GAP2, L3VPN-HS-GAP3] in the BGPVPN project are merely missing functionality which can be integrated in the existing OpenStack networking architecture.

Other gaps, such as the inability to explicitly disable the layer 2 semantics of Neutron networks [L3VPN-HS-GAP1] or the tight integration of ports and networks [L3VPN-HS-GAP2] hinder a clean integration of the needed functionality. In order to close these gaps, fundamental changes in Neutron or alternative approaches need to be investigated.

2.3. Service Binding Design Pattern
2.3.1. Description

This use case aims at binding multiple networks or network services to a single vNIC (port) of a given VM. There are several specific application scenarios for this use case:

  • Shared Service Functions: A service function connects to multiple networks of a tenant by means of a single vNIC.

    Typically, a vNIC is bound to a single network. Hence, in order to directly connect a service function to multiple networks at the same time, multiple vNICs are needed - each vNIC binds the service function to a separate network. For service functions requiring connectivity to a large number of networks, this approach does not scale as the number of vNICs per VM is limited and additional vNICs occupy additional resources on the hypervisor.

    A more scalable approach is to bind multiple networks to a single vNIC and let the service function, which is now shared among multiple networks, handle the separation of traffic itself.

  • Multiple network services: A service function connects to multiple different network types such as a L2 network, a L3(-VPN) network, a SFC domain or services such as DHCP, IPAM, firewall/security, etc.

In order to achieve a flexible binding of multiple services to vNICs, a logical separation between a vNIC (instance port) - that is, the entity that is used by the compute service as hand-off point between the network and the VM - and a service interface - that is, the interface a service binds to - is needed.

Furthermore, binding network services to service interfaces instead of to the vNIC directly enables a more dynamic management of the network connectivity of network functions as there is no need to add or remove vNICs.

2.3.2. Requirements
2.3.2.1. Data model

This section describes a general concept for a data model and a corresponding API. It is not intended that these entities are to be implemented exactly as described. Instead, they are meant to show a design pattern for future network service models and their corresponding APIs. For example, the “service” entity should hold all required attributes for a specific service, for instance a given L3VPN service. Hence, there would be no entity “service” but rather “L3VPN”.

  • instance-port

    An instance port object represents a vNIC which is bindable to an OpenStack instance by the compute service (Nova).

    Attributes: Since an instance-port is a layer 2 device, its attributes include the MAC address, MTU and others.

  • interface

    An interface object is a logical abstraction of an instance-port. It allows building hierarchies of interfaces by means of a reference to a parent interface. Each interface represents a subset of the packets traversing a given port or parent interface after applying a layer 2 segmentation mechanism specific to the interface type.

    Attributes: The attributes are specific to the type of interface.

    Examples: trunk interface, VLAN interface, VxLAN interface, MPLS interface

  • service

    A service object represents a specific networking service.

    Attributes: The attributes of the service objects are service specific and valid for a given service instance.

    Examples: L2, L3VPN, SFC

  • service-port

    A service port object binds an interface to a service.

    Attributes: The attributes of a service-port are specific for the bound service.

    Examples: port services (IPAM, DHCP, security), L2 interfaces, L3VPN interfaces, SFC interfaces.

2.3.2.2. Northbound API

An exemplary API for manipulating the data model is described below. As for the data model, this API is not intended to be a concrete API, but rather an example of a design pattern that clearly separates ports from services and service bindings; a brief usage sketch follows the list.

  • instance-port-{create,delete} <name>

    Creates or deletes an instance port object that represents a vNIC in a VM.

  • interface-{create,delete} <name> [interface type specific parameters]

    Creates or deletes an interface object.

  • service-{create,delete} <name> [service specific parameters]

    Create a specific service object, for instance a L3VPN, a SFC domain, or a L2 network.

  • service-port-{create,delete} <service-id> <interface-id> [service specific parameters]

    Creates a service port object, thereby binding an interface to a given service.
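As an illustration only, the following sketch strings these exemplary calls together for the shared service function scenario: one instance-port is created for the VM, interfaces are derived from it, and each interface is bound to a service. All command names, parameters and object names are the hypothetical ones introduced above and do not correspond to an existing CLI.

# create the vNIC that the compute service hands to the VM
instance-port-create vnic0

# derive interfaces from the instance port (hypothetical parameters)
interface-create if-default --parent vnic0 --type trunk
interface-create if-vlan100 --parent if-default --type vlan --segmentation-id 100

# create two independent services
service-create l2-net-a
service-create l3vpn-blue

# bind each interface to a service via a service port
service-port-create l2-net-a if-default
service-port-create l3vpn-blue if-vlan100

In this sketch, untagged traffic is handled by the L2 service while VLAN 100 traffic is handed to the L3VPN; adding a further service only requires another interface and service-port, not another vNIC.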

2.3.2.3. Orchestration

None.

2.3.2.4. Dependencies on other resources

The compute service needs to be able to consume instance ports instead of classic Neutron ports.

2.3.3. Current Implementation

The core Neutron API does not follow the service binding design pattern. For example, a port has to exist in a Neutron network - specifically it has to be created for a particular Neutron network. It is not possible to create just a port and assign it to a network later on as needed. As a result, a port cannot be moved from one network to another, for instance.

Regarding the shared service function use case outlined above, there is an ongoing activity in Neutron [VLAN-AWARE-VMs]. The solution proposed by this activity allows for creating a trunk port and multiple sub-ports per Neutron port, which can be bound to multiple networks (one network per sub-port). This allows for binding a single vNIC to multiple networks and lets the corresponding VM handle the network segmentation (VLAN-tagged traffic) itself. While this is a step in the direction of binding multiple services (networks) to a port, it is limited by the fundamental assumption of Neutron that a port has to exist on a given network.
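For illustration, with the trunk extension that emerged from the [VLAN-AWARE-VMs] activity, the workflow looks roughly as follows (recent python-openstackclient syntax; the exact commands as well as the network and port names used here are examples and depend on the OpenStack release):

# parent port on the management network; this is the single vNIC handed to the VM
openstack port create --network mgmt-net parent-port

# ports on the additional networks the service function should reach
openstack port create --network tenant-net-1 subport-1
openstack port create --network tenant-net-2 subport-2

# create the trunk and attach the sub-ports with their VLAN segmentation IDs
openstack network trunk create --parent-port parent-port trunk0
openstack network trunk set --subport port=subport-1,segmentation-type=vlan,segmentation-id=101 trunk0
openstack network trunk set --subport port=subport-2,segmentation-type=vlan,segmentation-id=102 trunk0

The VM then receives traffic for tenant-net-1 and tenant-net-2 as VLAN 101 and VLAN 102 tagged frames on its single vNIC and has to handle the separation itself.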

There are extensions of Neutron that follow the service binding design pattern more closely. An example is the BGPVPN project. A rough mapping of the service binding design pattern to the data model of the BGPVPN project is as follows:

  • instance-port -> Neutron port
  • service -> VPN
  • service-port -> network association

This example shows that extensions of Neutron can in fact follow the described design pattern in their respective data model and APIs.

2.3.4. Conclusions

In conclusion, the design decisions taken for the core Neutron API and data model do not follow the service binding model. As a result, it is hard to implement certain use cases which rely on a flexible binding of services to ports. Due to the need for backwards compatibility with the large amount of existing Neutron code, it is unlikely that the core Neutron API will adapt to this design pattern.

New extensions to Neutron, however, are relatively free to choose their data model and API - within the architectural boundaries of Neutron, of course. In order to provide the needed flexibility, extensions shall aim to follow the service binding design pattern where possible.

For the same reason, new networking frameworks complementing Neutron, such as Gluon, shall follow this design pattern and create the foundation for implementing networking services accordingly.

2.4. Georedundancy

Georedundancy refers to a configuration which ensures the service continuity of the VNFs even if a whole datacenter fails.

It is possible that the VNF application layer provides additional redundancy with VNF pooling on top of the georedundancy functionality described here.

It is possible that either the VNFCs of a single VNF are spread across several datacenters (this case is covered by the OPNFV multi-site project [MULTISITE]), or that different, redundant VNFs are started in different datacenters.

When different, redundant VNFs are started in different datacenters, redundancy can be achieved with the spare VNF in one of the following standby states, in a datacenter different from the one running the active VNF:

  • hot standby: the spare VNF is running, and its configuration and internal state are synchronized with the active VNF
  • warm standby: the spare VNF is running, and its configuration is synchronized with the active VNF
  • cold standby: the spare VNF is not running; the active VNF’s configuration is stored in a persistent, central store and applied to the spare VNF during its activation

The synchronization and data transfer can be handled by the application or by the infrastructure.

In all of these georedundancy setups there is a need for a network connection between the datacenter running the active VNF and the datacenter running the spare VNF.

In case of a distributed cloud it is possible that the georedundant datacenter of an application is not predefined, or that it changes, and such a change requires configuration in the underlay networks when the network operator uses network isolation. Isolation of the traffic between the datacenters might be needed due to the multi-tenant usage of the NFVI/VIM or due to the IP pool management of the network operator.

This set of georedundancy use cases is about enabling the possibility to select a datacenter as backup datacenter and build the connectivity between the NFVIs in the different datacenters in a programmable way.

The focus of these use cases is on the functionality of OpenStack. How the provisioning of physical resources is handled by the SDN controllers to interconnect the two datacenters is not considered here.

As an example the following picture (georedundancy-before) shows a multi-cell cloud setup where the underlay network is not fully meshed.

_images/georedundancy-before.png

Each datacenter (DC) is a separate OpenStack cell, region or instance. Let’s assume that a new VNF is started in DC b with a Redundant VNF in DC d. In this case a direct underlay network connection is needed between DC b and DC d. The configuration of this connection should be programmable in both DC b and DC d. The result of the deployment is shown in the following figure (georedundancy-after):

_images/georedundancy-after.png
2.4.1. Connection between different OpenStack cells
2.4.1.1. Description

There should be an API to manage the infrastructure networks between two OpenStack cells. (Note: in the Mitaka release of OpenStack, cells v1 is considered experimental, while the cells v2 functionality is under implementation.) Cells are considered problematic from a maintainability perspective, as the sub-cells use only the internal message bus and there is no API (or CLI) for maintenance actions in case of a network connectivity problem between the main cell and the sub-cells.

The following figure (cells-architecture) shows the architecture of the most relevant OpenStack components in multi cell OpenStack environment.

_images/cells-architecture.png

The functionality behind the API depends on the underlying network providers (SDN controllers) and the networking setup. (For example, OpenDaylight has an API to add a new BGP neighbor.)

OpenStack Neutron should provide an abstracted API for this functionality which calls the underlying SDN controller’s API.

2.4.1.2. Derived Requirements
  • Possibility to define a remote and a local endpoint
  • As the nova-api service is shared in case of cells, it should be possible to identify the cell in the API calls
2.4.1.2.1. Northbound API / Workflow
  • An infrastructure network management API is needed
  • API call to define the remote and local infrastructure endpoints
  • When the endpoints are created, Neutron is configured to use the new network.
2.4.1.2.2. Dependencies on compute services
None.
2.4.1.2.3. Data model objects
  • local and remote endpoint objects (most probably IP addresses with some additional properties); an illustrative sketch of such an API is given below
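Purely as an illustration of the missing API, a hypothetical CLI for such an infrastructure network management extension could look as follows. None of these commands exist today; all command names and parameters are invented for this sketch:

# define the local endpoint of the underlay connection in cell DC-b
neutron infra-endpoint-create --cell DC-b --ip 192.0.2.10 --name local-ep

# define the remote endpoint representing cell DC-d
neutron infra-endpoint-create --cell DC-b --ip 198.51.100.20 --remote --name remote-ep

# connect the endpoints; Neutron would instruct the underlying SDN controller
# (e.g. via its BGP neighbor API) to build the underlay connection
neutron infra-connection-create --local local-ep --remote remote-ep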
2.4.1.3. Current implementation
The current OpenStack implementation provides no way to set up the underlay network connection. The OpenStack Tricircle project [TRICIRCLE] has plans to build up inter-datacenter L2 and L3 networks.
2.4.1.4. Gaps in the current solution
An infrastructure management API is missing from Neutron where the local and remote endpoints of the underlay network could be configured.
2.4.2. Connection between different OpenStack regions or cloud instances
2.4.2.1. Description

There should be an API to manage the infrastructure networks between two OpenStack regions or instances.

The following figure (instances-architecture) shows the architecture of the most relevant OpenStack components in multi instance OpenStack environment.

_images/instances-architecture.png

The functionality behind the API depends on the underlying network providers (SDN controllers) and the networking setup. (For example both OpenDaylight and ONOS have an API to add new BGP neighbors.)

OpenStack Neutron should provide an abstracted API for this functionality which calls the underlying SDN controller’s API.

2.4.2.2. Derived Requirements
  • Possibility to define a remote and a local endpoint
  • As the nova-api service is shared in case of cells, it should be possible to identify the cell in the API calls
2.4.2.2.1. Northbound API / Workflow
  • An infrastructure network management API is needed
  • API call to define the remote and local infrastructure endpoints
  • When the endpoints are created, Neutron is configured to use the new network.
2.4.2.2.2. Data model objects
  • local and remote endpoint objects (most probably IP addresses with some additional properties, like local or remote Autonomous Systems (AS)).
2.4.2.3. Current implementation
The current OpenStack implementation provides no way to set up the underlay network connection. The OpenStack Tricircle project [TRICIRCLE] has plans to build up inter-datacenter L2 and L3 networks.
2.4.2.4. Gaps in the current solution
An infrastructure management API is missing from Neutron where the local and remote endpoints of the underlay network could be configured.
2.4.3. Conclusion
An API is needed which provides the possibility to set up the local and remote endpoints for the underlay network. Such an API is present in the SDN solutions, but OpenStack does not provide an abstracted API for this functionality to hide the differences between the SDN solutions.
3. Retired Use Cases

The following use cases have previously been analyzed in OPNFV Netready. Since then, the identified gaps have been addressed and/or closed in the upstream community.

These use cases are not removed from the document for the sake of completeness, but moved to a separate chapter to keep the document structure clean.

3.1. Programmable Provisioning of Provider Networks
3.1.1. Description

In an NFV environment the VNFMs (Virtual Network Function Managers) are consumers of the OpenStack IaaS API. They are often deployed without administrative rights on top of the NFVI platform. Furthermore, in the telco domain provider networks are often used. However, when a provider network is created, administrative rights are needed, which in the case of a VNFM without administrative rights requires additional manual configuration work. It shall be possible to configure provider networks without administrative rights. It should be possible to assign the capability to create provider networks to any role.

The following figure (api-users) shows the possible users of an OpenStack API and the relation of the OpenStack and ETSI NFV components. Boxes with solid lines are the ETSI NFV components, while boxes with broken lines are the OpenStack components.

_images/api-users.png
3.1.2. Requirements
  • Authorize the possibility of provider network creation based on policy
  • There should be a new entry in policy.json which controls the provider network creation
  • Default policy of this new entry should be rule:admin_or_owner.
  • This policy should be respected by the Neutron API
3.1.2.1. Northbound API / Workflow
  • No changes in the API
3.1.2.2. Data model objects
  • No changes in the data model
3.1.3. Current implementation

Only admin users can manage provider networks [OS-NETWORKING-GUIDE-ML2].

3.1.4. Potential implementation
  • Policy engine shall be able to handle a new provider network creation and modification related policy.
  • When a provider network is created or modified neutron should check the authority with the policy engine instead of requesting administrative rights.
3.1.5. Solution in upstream community

A bug report has been submitted to the upstream OpenStack community to highlight this gap: https://bugs.launchpad.net/neutron/+bug/1630880

This bug report revealed that this use case has already been addressed in the upstream community. Specifically, it is possible to specify the roles (e.g., admin, regular user) in the Neutron policy.json file which are able to create and update provider networks.
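For illustration, the relevant entries in the Neutron policy.json look roughly like the sketch below; the exact policy target names depend on the Neutron release. By default these actions are restricted to the admin role, and relaxing the rules (e.g. to rule:admin_or_owner or a dedicated role) allows non-admin users to create and update provider networks:

# excerpt of /etc/neutron/policy.json (illustrative; verify the keys for your release)
#   "create_network:provider:network_type":     "rule:admin_only"
#   "create_network:provider:physical_network": "rule:admin_only"
#   "create_network:provider:segmentation_id":  "rule:admin_only"
# after editing the rules, restart the Neutron server so the new policy takes effect
sudo systemctl restart neutron-server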

However, the OpenStack user guide wrongly stated that only administrators can create and update provider type networks. Hence, a correction has been submitted to the OpenStack documentation repository, clarifying the possibility to change this behavior based on policies: https://review.openstack.org/#/c/390359/

In conclusion, this use case has been retired as the corresponding gaps have been closed in the upstream community.

4. Summary and Conclusion

This document presented the results of the OPNFV NetReady (Network Readiness) project ([NETREADY]). It described a selection of NFV-related networking use cases and their corresponding networking requirements. Moreover, for every use case, it presented an associated gap analysis which examined the aforementioned networking requirements with respect to the current OpenStack networking architecture.

The contents of the current document are the selected use cases and their derived requirements and identified gaps for OPNFV C release.

OPNFV NetReady is open to take any further use cases under analysis in later OPNFV releases. The project backlog ([NETREADY-JIRA]) lists the use cases and topics planned to be developed in future releases of OPNFV.

Based on the gap analyses, we draw the following conclusions:

  • Besides current requirements and gaps identified in support of NFV networking, more and more new NFV networking services are to be innovated in the near future. Those innovations will bring additional requirements, and more significant gaps will be expected. On the other hand, NFV networking business requires it to be made easy to innovate, quick to develop, and agile to deploy and operate. Therefore, a model-driven, extensible framework is expected to support NFV networking on-demand in order to accelerate time-to-market and achieve business agility for innovations in NFV networking business.
  • Neutron networks are implicitly, because of their reliance on subnets, L2 domains. L2 network overlays are the only way to implement Neutron networks because of their semantics. However, L2 networks are inefficient ways to implement cloud networking, and while this is not necessarily a problem for enterprise use cases with moderate traffic it can add expense to the infrastructure of NFV cases where networking is heavily used and efficient use of capacity is key.
  • In NFV environment it should be possible to execute network administrator tasks without OpenStack administrator rights.
  • In a multi-site setup it should be possible to manage the connection between the sites in a programmable way.

The latest version of this document can be found at [SELF].

5. Definition of terms

Different standards developing organizations and communities use different terminology related to Network Function Virtualization, Cloud Computing, and Software Defined Networking. This list defines the terminology in the contexts of this document.

API
Application Programming Interface.
Cloud Computing
A model that enables access to a shared pool of configurable computing resources, such as networks, servers, storage, applications, and services, that can be rapidly provisioned and released with minimal management effort or service provider interaction.
Edge Computing
Edge computing pushes applications, data and computing power (services) away from centralized points to the logical extremes of a network.
Instance
Refers in OpenStack terminology to a running VM, or a VM in a known state such as suspended, that can be used like a hardware server.
NFV
Network Function Virtualization.
NFVI
Network Function Virtualization Infrastructure. Totality of all hardware and software components which build up the environment in which VNFs are deployed.
SDN
Software-Defined Networking. Emerging architecture that decouples the network control and forwarding functions, enabling the network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services.
Server
Computer that provides explicit services to the client software running on that system, often managing a variety of computer operations. In OpenStack terminology, a server is a VM instance.
vForwarder
vForwarder is used as a generic and vendor-neutral term for a software packet forwarder. Concrete examples include OpenContrail vRouter, Open vSwitch, and Cisco VTF.
VIM
Virtualized Infrastructure Manager. Functional block that is responsible for controlling and managing the NFVI compute, storage and network resources, usually within one operator’s Infrastructure Domain, e.g. NFVI Point of Presence (NFVI-PoP).
Virtual network
Virtual network routes information among the network interfaces of VM instances and physical network interfaces, providing the necessary connectivity.
VM
Virtual Machine. Virtualized computation environment that behaves like a physical computer/server by modeling the computing architecture of a real or hypothetical computer.
VNF
Virtualized Network Function. Implementation of a Network Function that can be deployed on a Network Function Virtualization Infrastructure (NFVI).
VNFC
Virtualized Network Function Component. A VNF may be composed of multiple components, jointly providing the functionality of the VNF.
WAN
Wide Area Network.
OPNFV NetReady Installation
OPNFV Gluon Installation Guide

The Gluon framework can be installed by means of the os-odl-gluon-noha scenario and the Apex installer. Please visit the Apex installer documentation for details on how to install the os-odl-gluon-noha scenario in a virtual or a bare-metal environment.

Quick start guide

The easiest way to set up Gluon is to create a virtual deployment. In a nutshell, these are the installation steps:

  1. install a bare-metal CentOS jumphost
  2. install the Apex RPM packages
  3. create the virtual deployment by running the following command
opnfv-deploy -v -n network_settings.yaml  \
             -d os-odl-gluon-noha.yaml \
             --virtual-computes 3
OPNFV NetReady Configuration Guide
Gluon Configuration

The configuration of the Gluon framework is entirely handled by the corresponding scenario os-odl-gluon-noha available for the Apex installer. In general, the installer installs and configures all components so that no additional configuration steps are needed after installing the aforementioned scenario.

Pre-configuration activities

No pre-configuration steps are needed in addition to the pre-configuration needed for an Apex virtual or bare metal deployment. Please review the Apex installation instructions for further details.

Hardware configuration

No specific hardware configuration is needed for running the os-odl-gluon-noha scenario providing the Gluon framework in addition to the hardware requirements listed for Apex-based scenarios.

Feature configuration

No specific additional configuration is needed after installing the os-odl-gluon-noha scenario using the Apex installer.

Gluon Post Installation Procedure

The configuration of the Gluon framework is entirely handled by the corresponding scenario os-odl-gluon-noha available for the Apex installer. In general, Apex installs and configures all components so that no additional configuration steps are needed after deploying the aforementioned scenario.

Automated post installation activities

An overview of all test suites run by the OPNFV pipeline against the os-odl-gluon-noha scenario as well as the test results can be found at the Functest test result overview page.

Gluon post configuration procedures

No post configuration procedures need to be performed after deploying the os-odl-gluon-noha scenario using the Apex installer.

Platform components validation

As described in the Gluon scenario description, the Gluon framework consists of five software components. This section describes how to validate their successful installation; a short command sketch follows the list.

  • Gluon core plugin: Check in the file /etc/neutron/neutron.conf that the Neutron core plugin is set to gluon.
  • Proton server: Check that the process proton-server is running.
  • Proton client: Verify that the protonclient tool is installed and executable.
  • etcd: Verify that the etcd key-value-store is installed and running by means of the etcdctl tool.
  • Proton shim layer for OpenDaylight: Verify that the proton-shim-server process is running.
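A minimal sketch of these checks on a controller node is shown below; exact file locations, process names and etcdctl syntax may differ depending on the deployment:

# Gluon core plugin configured in Neutron (expect: core_plugin = gluon)
grep "^core_plugin" /etc/neutron/neutron.conf

# Proton server and OpenDaylight shim layer processes
pgrep -af proton-server
pgrep -af proton-shim-server

# Proton client available
which protonclient && protonclient --help

# etcd installed and reachable
etcdctl cluster-health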
NetReady User Guide
OPNFV NetReady User Guide
Gluon Description

Gluon brings a Networking Service Framework that enables Telecom Service Providers to provide their customers with networking services on-demand. Gluon uses a model-driven approach to generate Networking Service APIs (including objects, database schema, and RESTful API endpoints) from a YAML file which models the Networking Service. When a Telecom Service Provider needs to launch a new Networking Service, it only needs to model the new service in a YAML file. The Gluon framework generates the APIs accordingly. Thus Gluon helps Telecom Service Providers accelerate the time-to-market and achieve business agility through its extensibility and scalability in generating APIs for new use-cases and services.

Gluon Capabilities and Usage

Gluon is the port arbiter that maintains a list of ports and bindings of different networking backends. A Proton is a set of APIs of a particular NFV Networking Service. A Proton Server is the API server that hosts multiple Protons, i.e. multiple sets of APIs. Gluon uses backend drivers to interact with the Proton Server for port binding and other operations.

A Proton is created by a Particle Generator based on a YAML file modeled for this particular NFV Networking Service. When a Proton is created, the objects, database schema, and RESTful APIs of this Proton are created. Then the Proton specific driver would be loaded into Gluon.

When the Proton Server receives port binding and other operation requests, it broadcasts those requests to etcd. The shim layers of the respective SDN Controllers listen to etcd and get the notifications from etcd. Based on the type of operation, the parameter data, and its own deployment and policy configuration, each SDN Controller acts accordingly. This mechanism is similar to Neutron’s Hierarchical Port Binding (HPB), and provides flexibility and scalability when a port operation needs to be supported by multiple SDN Controllers in a collaborative and interoperable way.

Gluon API Guidelines and Examples

This section shows you how to use Proton to create the needed objects, and then use nova boot to bind the port to a VM. It is assumed that you have already installed the Gluon package, including etcd and the Gluon plugin, and started the Proton Server. If not, please refer to the Installation guide.

Getting Help

Just typing the protonclient --help command gives you general help information:

$ protonclient --help

Usage: protonclient --api <api_name> [OPTIONS] COMMAND[ARGS]...

Options:
--api TEXT      Name of API, one of ['net-l3vpn', 'test']
--port INTEGER  Port of endpoint (OS_PROTON_PORT)
--host TEXT     Host of endpoint (OS_PROTON_HOST)
--help          Show this message and exit.
Mandatory Parameters

--api <api_name> is a mandatory parameter. For example, --api net-l3vpn.

Just typing the protonclient command shows you that those mandatory parameters are required, and gives you general help information too.

$ protonclient
--api is not specified!

Usage: protonclient --api <api_name> [OPTIONS] COMMAND[ARGS]...

Options:
--api TEXT      Name of API, one of ['net-l3vpn', 'test']
--port INTEGER  Port of endpoint (OS_PROTON_PORT)
--host TEXT     Host of endpoint (OS_PROTON_HOST)
--help          Show this message and exit.
Using L3VPN Proton

NOTE that there is a KNOWN BUG in the Usage message where the mandatory parameters --api net-l3vpn are missing.

$ protonclient --api net-l3vpn
Usage: protonclient [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  interface-create
  interface-delete
  interface-list
  interface-show
  interface-update
  port-create
  port-delete
  port-list
  port-show
  port-update
  vpn-create
  vpn-delete
  vpn-list
  vpn-show
  vpn-update
  vpnafconfig-create
  vpnafconfig-delete
  vpnafconfig-list
  vpnafconfig-show
  vpnafconfig-update
  vpnbinding-create
  vpnbinding-delete
  vpnbinding-list
  vpnbinding-show
  vpnbinding-update

The following sections give you the general work flow of how to use Proton to create and configure an L3VPN.

For more details and examples, please refer to the Gluon upstream user guide.

Work Flow of Using L3VPN

The work flow of using L3VPN includes:

  • Step 1: Create ``Port`` Object
$ protonclient --api net-l3vpn port-create --help
$ protonclient --api net-l3vpn port-create [ARGS] ...

Please NOTE: a default interface object is automatically created too when a Port is created, and this default interface object is attached to this Port object. The UUID of this default Interface object will be the same as the UUID of the parent Port object.

  • Step 2 (Optional): Create ``Interface`` Object
$ protonclient --api net-l3vpn interface-create --help
$ protonclient --api net-l3vpn interface-create [ARGS] ...

Please NOTE: This step is optional because a default Interface object was already automatically created when a Port object was created at Step 1.

  • For example: list the default ``Interface`` Object:
$ protonclient --api net-l3vpn interface-list
  • Step 3 (Optional): Create ``VPNAFConfig`` Object
$ protonclient --api net-l3vpn vpnafconfig-create --help
$ protonclient --api net-l3vpn vpnafconfig-create [ARGS] ...

Please NOTE: This step is optional because all parameters needed for an L3VPN (route specifiers) are also present in creating a VPN service object at Step 4. This part of the API needs to be aligned in the future.

  • Step 4: Create ``VPN`` Object
$ protonclient --api net-l3vpn vpn-create --help
$ protonclient --api net-l3vpn vpn-create [ARGS] ...

At this point you have a Port object, default Interface object and a VPN service object created.

  • View VPN and Port Objects

You can view the values with the following commands:

$ protonclient --api net-l3vpn vpn-list
$ protonclient --api net-l3vpn port-list
  • Step 5: Create ``VPNBinding`` Object

You need to create a VPNBinding object to tie the Interface and the Service together in order to achieve service binding.

  $ protonclient --api net-l3vpn vpnbinding-create --help
  $ protonclient --api net-l3vpn vpnbinding-create [ARGS] ...

  • View ``VPNBinding`` Objects
$ protonclient --api net-l3vpn vpnbinding-list

At this point you have all of the information needed for an L3VPN Port in Proton.

  • Step 6: Create VM and Bind our L3VPN Port
$ nova --debug boot --flavor 1 --image cirros --nic port-id=<port-id> <VM-Name>

Opera

OPNFV Opera Overview
1. OPERA Project Overview

Since the OPNFV board expanded its scope to include NFV MANO last year, several upstream open source projects have been created to develop MANO solutions. Each solution has demonstrated its unique value in a specific area. The Open-Orchestrator (OPEN-O) project is one such community. Opera seeks to develop requirements for OPEN-O MANO support in the OPNFV reference platform, with the plan to eventually integrate OPEN-O in OPNFV as a non-exclusive upstream MANO. The project will benefit not only OPNFV and OPEN-O, but can also be referenced by other MANO integrations. This project is use case driven. Based on that, it focuses on the requirements for interfaces and data models for integration among various components and the OPNFV platform. The requirements are designed to support integration among OPEN-O as NFVO, Juju as VNFM and OpenStack as VIM.

Currently OPNFV already includes upstream OpenStack as the VIM, and Juju and Tacker have been considered as generic VNFMs by different OPNFV projects. OPEN-O, as the NFVO part of MANO, will interact with OpenStack and Juju. The key items required for the integration can be described as follows.

key item

Fig 1. Key Item for Integration

2. Open-O is scoped for the integration

OPEN-O includes various components for OPNFV MANO integration. The initial release of the integration will focus on NFV-O, Common Service and Common TOSCA. Other components of OPEN-O will be gradually integrated into the OPNFV reference platform in later releases.

openo component

Fig 2. Deploy Overview

3. The vIMS is used as initial use case

Test cases will be created based on this use case and aligned with the first Open-O release for the OPNFV D release.

  • Creating a scenario (os-nosdn-openo-ha) to integrate Open-O with OpenStack Newton.
  • Integrating with COMPASS as the installer and FuncTest as the testing framework.
  • Clearwater vIMS is used as the VNFs; Juju is used as the VNFM.
  • Use Open-O as the orchestrator to deploy vIMS and run an end-to-end test with the following steps.
  1. deploy Open-O as orchestrator
  2. create a tenant in OpenStack via Open-O
  3. deploy vIMS VNFs from the orchestrator based on the TOSCA blueprint and create VNFs
  4. launch the test suite
  5. collect results and clean up
vIMS deploy

Fig 3. vIMS Deploy

OPNFV Opera Installation Instructions
1. Abstract

This document describes how to install Open-O in a deployed OpenStack environment using the Opera project.

2. Version history
Date Ver. Author Comment
2017-02-16 0.0.1 Harry Huang (HUAWEI) First draft
3. Opera Installation Instructions

This document provides guidelines on how to deploy a working Open-O environment using the Opera project.

The audience of this document is assumed to have good knowledge of OpenStack and Linux.

3.1. Preconditions

There are some preconditions to be met before starting the Opera deployment.

3.1.1. A functional OpenStack environment

OpenStack should be deployed before the Opera deployment.

3.1.2. Getting the deployment scripts

Retrieve the repository of Opera using the following command:
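The repository URL is not stated in this guide; assuming Opera follows the same Gerrit layout as the other OPNFV projects referenced in this document, the command would be:

git clone https://gerrit.opnfv.org/gerrit/opera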

3.2. Machine requirements
  1. Ubuntu OS (Pre-installed).
  2. Root access.
  3. Minimum 1 NIC (internet access)
  4. CPU cores: 32
  5. 64 GB free memory
  6. 100G free disk
3.3. Deploy Instruction

After the Opera deployment, the Open-O Docker containers will be launched on the local server as the orchestrator, and a Juju VM will be launched on OpenStack as the VNFM.

3.3.1. Add OpenStack Admin Openrc file

Add the admin openrc file of your local OpenStack into the opera/conf directory with the name admin-openrc.sh.

3.3.2. Config open-o.yml

Set openo_version to specify the Open-O version.

Set openo_ip to specify an external IP to access the Open-O services. (Leaving the value unset will use the local server’s external IP.)

Set the ports in openo_docker_net to specify Open-O’s exposed service ports.

Set enable_sdno to specify whether to use Open-O’s sdno services. (Setting this value to false will not launch the Open-O sdno dockers and reduces the deployment duration.)

Set vnf_type to specify the VNF type to be deployed. (Currently only the Clearwater deployment is supported; leaving this unset will not deploy any VNF.) An illustrative example of the resulting file is sketched below.
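For illustration only, a minimal open-o.yml could look like the sketch below. The file location, the exact key layout (in particular of openo_docker_net) and all values are assumptions and must be adapted to your environment:

# illustrative example; keys and values are assumptions, adjust before use
cat > opera/conf/open-o.yml << 'EOF'
openo_version: 1.0.0          # Open-O version to deploy
openo_ip: 192.0.2.10          # external IP used to reach the Open-O services
openo_docker_net:
  ports: [8080, 8443]         # exposed service ports (example values)
enable_sdno: false            # skip the sdno dockers to shorten deployment
vnf_type: clearwater          # currently the only supported VNF type
EOF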

3.3.3. Run opera_launch.sh
./opera_launch.sh
OPNFV Opera Config Instructions
1. Config Guide
1.1. Add OpenStack Admin Openrc file

Add the admin openrc file of your local OpenStack into the opera/conf directory with the name admin-openrc.sh.

1.2. Config open-o.yml

Set openo_version to specify the Open-O version.

Set openo_ip to specify an external IP to access the Open-O services. (Leaving the value unset will use the local server’s external IP.)

Set the ports in openo_docker_net to specify Open-O’s exposed service ports.

Set enable_sdno to specify whether to use Open-O’s sdno services. (Setting this value to false will not launch the Open-O sdno dockers and reduces the deployment duration.)

Set vnf_type to specify the VNF type to be deployed. (Currently only the Clearwater deployment is supported; leaving this unset will not deploy any VNF.)

OPNFV Opera Design
1. OPERA Requirement and Design
  • Define Scenario OS-NOSDN-OPENO-HA and Integrate OPEN-O M Release with OPNFV D Release (with OpenStack Newton)
  • Integrate OPEN-O to OPNFV CI Process
    • Integrate automatic Open-O and Juju installation
  • Deploy Clearwater vIMS through OPEN-O
    • Test case to simulate SIP clients voice call
  • Integrate vIMS test scripts to FuncTest
2. OS-NOSDN-OPENO-HA Scenario Definition
2.1. Compass4NFV supports Open-O NFV Scenario
  • Scenario name: os-nosdn-openo-ha
  • Deployment: OpenStack + Open-O + JuJu
  • Setups:
    • Virtual deployment (one physical server as Jump Server with OS ubuntu)
    • Physical Deployment (one physical server as Jump Server, ubuntu + 5 physical Host Server)
deploy overview

Fig 1. Deploy Overview

3. Open-O is participating OPNFV CI Process
  • All steps are linked to OPNFV CI Process
  • Jenkins jobs remotely access the OPEN-O NEXUS repository to fetch binaries
  • COMPASS deploys the scenario based on the OpenStack Newton release
  • OPEN-O and JuJu installation scripts will be triggered in a Jenkins job after COMPASS finishes deploying OpenStack
  • Clearwater vIMS deploy scripts will be integrated into FuncTest
  • Clearwater vIMS test scripts will be integrated into FuncTest
opera ci

Fig 2. Opera Ci

4. The vIMS is used as initial use case

Test cases will be created based on this use case and aligned with the first Open-O release for the OPNFV D release.

  • Creating a scenario (os-nosdn-openo-ha) to integrate Open-O with OpenStack Newton.
  • Integrating with COMPASS as the installer and FuncTest as the testing framework.
  • Clearwater vIMS is used as the VNFs; Juju is used as the VNFM.
  • Use Open-O as the orchestrator to deploy vIMS and run an end-to-end test with the following steps.
  1. deploy Open-O as orchestrator
  2. create a tenant in OpenStack via Open-O
  3. deploy vIMS VNFs from the orchestrator based on the TOSCA blueprint and create VNFs
  4. launch the test suite
  5. collect results and clean up
vIMS deploy

Fig 3. vIMS Deploy

5. Requirement and Tasks
5.1. OPERA Deployment Key idea
  • Keep OPEN-O deployment agnostic from an installer perspective (Apex, Compass, Fuel, Joid)
  • Breakdown deployments in single scripts (isolation)
  • Have OPNFV CI Process (Jenkins) control and monitor the execution
5.2. Tasks need to be done for OPNFV CD process
  1. Compass to deploy scenario of os-nosdn-openo-noha
  2. Automate OPEN-O installation (deployment) process
  3. Automate JuJu installation process
  4. Create vIMS TOSCA blueprint (for vIMS deployment)
  5. Automate vIMS package deployment (need helper/OPEN-O M)
    • (a) Jenkins to invoke the OPEN-O RESTful API to import & deploy the vIMS package
  6. Integrate scripts of step 2,3,4,5 with OPNFV CD Jenkins Job
5.3. FUNCTEST
  1. test case automation
    • (a) Invoke URL requests to vIMS services to test that the deployment completed successfully.
  2. Integrate test scripts with FuncTest
    • (a)trigger these test scripts
    • (b)record test result to DB
functest

Fig 4. Functest

Parser

OPNFV Parser Installation Instruction
Parser tosca2heat Installation

Please follow the installation steps below to install the tosca2heat submodule in parser.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Install the heat-translator sub project.

# uninstall pre-installed heat-translator
pip uninstall -y heat-translator

# change directory to heat-translator
cd parser/tosca2heat/heat-translator

# install requirements
pip install -r requirements.txt

# install heat-translator
python setup.py install

Step 3: Install the tosca-parser sub project.

# uninstall pre-installed tosca-parser
pip uninstall -y tosca-parser

# change directory to tosca-parser
cd parser/tosca2heat/tosca-parser

# install requirements
pip install -r requirements.txt

# install tosca-parser
python setup.py install

Notes: Any pre-installed tosca-parser and heat-translator must be uninstalled before installing these two components, and heat-translator must be installed before tosca-parser. This makes sure that the OPNFV version of tosca-parser and heat-translator is used instead of the upstream OpenStack components.

Parser yang2tosca Installation

Parser yang2tosca requires the following to be installed.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Clone pyang tool or download the zip file from the following link.

git clone https://github.com/mbj4668/pyang.git

OR

wget https://github.com/mbj4668/pyang/archive/master.zip

Step 3: Change directory to the downloaded directory and run the setup file.

cd pyang
python setup.py install

Step 4: Install python-lxml.

Please follow the below installation link. http://lxml.de/installation.html

Parser policy2tosca installation

Please follow the below installation steps to install parser - POLICY2TOSCA.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Install the policy2tosca module.

cd parser/policy2tosca
python setup.py install
Parser verigraph installation

In the present release, verigraph requires additional software to be installed; the list of dependencies is given in the README.rst file referenced in Step 3 below.

Please follow the below installation steps to install verigraph.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Go to the verigraph directory.

cd parser/verigraph

Step 3: Follow the instructions in README.rst for downloading verigraph dependencies and for installing verigraph.

Parser apigateway Installation

In the present release, apigateway requires the Python packages listed in its requirements.txt, which are installed in Step 2 below.

Please follow the installation steps below to install the apigateway submodule in parser.

Step 1: Clone the parser project.

git clone https://gerrit.opnfv.org/gerrit/parser

Step 2: Install the apigateway submodule.

# change directory to apigateway
cd parser/apigateway

# install requirements
pip install -r requirements.txt

# install apigateway
python setup.py install

Notes: In release D, the apigateway submodule contains only initial framework code; more features will be provided in the next release.

OPNFV Parser Configuration Guide
Parser configuration

Parser can be used with any current OPNFV installer; it depends only on OpenStack.

Pre-configuration activities

For parser, there are no specific pre-configuration activities.

Hardware configuration

For parser, no hardware configuration is needed for any current feature.

Feature configuration

For parser, no specific configuration on OpenStack is needed.

Parser Post Installation Procedure

Add a brief introduction to the methods of validating the installation according to this specific installer or feature.

Automated post installation activities

Describe specific post installation activities performed by the OPNFV deployment pipeline including testing activities and reports. Refer to the relevant testing guides, results, and release notes.

Note: this section should be singular and derived from the test projects once we have one test suite to run for all deploy tools. This is not the case yet, so each deploy tool will need to provide (hopefully very similar) documentation of this.

Parser post configuration procedures

Describe any deploy tool or feature specific scripts, tests or procedures that should be carried out on the deployment post install and configuration in this section.

Platform components validation

Describe any component specific validation procedures necessary for your deployment tool in this section.

OPNFV Parser User Guide
Parser tosca2heat Execution

Step 1: Change directory to where the tosca yaml files are present; an example is below with the vRNC definition.

cd parser/tosca2heat/tosca-parser/toscaparser/extensions/nfv/tests/data/vRNC/Definitions

Step 2: Run the python command heat-translator with the TOSCA yaml file as an input option.

heat-translator --template-file=<input file> --template-type=tosca
                --output-file=<output hot file>

Example:

heat-translator --template-file=vRNC.yaml \
    --template-type=tosca --output-file=vRNC_hot.yaml

Notes: heat-translator first calls the ToscaTemplate class in tosca-parser to validate and parse the input YAML file, then translates the file into a HOT file. If you only want to validate or check the input file and do not want to translate it, use tosca-parser as follows:

tosca-parser --template-file=<input yaml file>

Example:

tosca-parser --template-file=vRNC.yaml
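
For validation from Python code rather than the command line, the snippet below is a minimal sketch using the ToscaTemplate class mentioned above; the template file name assumes the vRNC example from Step 1, so adjust the path to your checkout.

from toscaparser.common.exception import ValidationError
from toscaparser.tosca_template import ToscaTemplate

# Assumed to run from the Definitions directory used in Step 1.
TEMPLATE = "vRNC.yaml"

try:
    # Constructing ToscaTemplate parses and validates the template.
    tosca = ToscaTemplate(path=TEMPLATE)
    print("Validation OK, node templates:",
          [node.name for node in tosca.nodetemplates])
except ValidationError as err:
    print("Template is invalid:", err)

The translation into a HOT file can then be performed with the heat-translator command shown above.
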
Parser yang2tosca Execution

Step 1: Change directory to where the scripts are present.

cd parser/yang2tosca
Step 2: Copy the YANG file which needs to be converted into TOSCA to the
current (parser/yang2tosca) folder.

Step 3: Run the python script “parser.py” with the YANG file as an input option.

python parser.py -n "YANG filename"

Example:

python parser.py -n example.yaml
Step 4: Verify that the TOSCA YAML file has been created with the same name
as the YANG file plus a "_tosca" suffix.

cat "YANG filename_tosca.yaml"

Example:

cat example_tosca.yaml
Parser policy2tosca Execution

Step 1: To see a list of commands available.

policy2tosca --help

Step 2: To see help for an individual command, include the command name on the command line

policy2tosca help <service>

Step 3: To inject/remove policy types/policy definitions provide the TOSCA file as input to policy2tosca command line.

policy2tosca <service> [arguments]

Example:

policy2tosca add-definition \
    --policy_name rule2 --policy_type  tosca.policies.Placement.Geolocation \
    --description "test description" \
    --properties region:us-north-1,region:us-north-2,min_inst:2 \
    --targets VNF2,VNF4 \
    --metadata "map of strings" \
    --triggers "1,2,3,4" \
    --source example.yaml

Step 4: Verify that the TOSCA YAML has been updated with the injection/removal executed.

cat "<source tosca file>"

Example:

cat example_tosca.yaml
Parser verigraph Execution

VeriGraph is accessible via both a RESTful API and a gRPC interface.

REST API

Step 1. Change directory to where the service graph examples are present

cd parser/verigraph/examples

Step 2. Use a REST client (e.g., cURL) to send a POST request (whose body is one of the JSON files in the directory).

curl -X POST -d @<file_name>.json \
     --header "Content-Type:application/json" \
     http://<server_address>:<server_port>/verify/api/graphs

Step 3. Use a REST client to send a GET request to check a reachability-based property between two nodes of the service graph created in the previous step.

curl -X GET "http://<server_addr>:<server_port>/verify/api/graphs/<graphID>/policy?source=<srcNodeID>&destination=<dstNodeID>&type=<propertyType>"

where:

  • <graphID> is the identifier of the service graph created at Step 2
  • <srcNodeID> is the name of the source node
  • <dstNodeID> is the name of the destination node
  • <propertyType> can be reachability, isolation or traversal

Step 4. The output is a JSON document with the overall result of the verification process and the partial result for each path that connects the source and destination nodes in the service graph.
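
The same two calls can also be scripted, for example with Python and the requests library as sketched below; the server address and port, the example file name, the node names and the way the graph identifier is read from the response are all assumptions to be adapted to the actual verigraph deployment.

import requests

BASE = "http://127.0.0.1:8080/verify/api"    # assumed verigraph server address and port

# Step 2 equivalent: create a service graph from one of the example JSON files.
with open("example_graph.json") as f:        # placeholder for a file in parser/verigraph/examples
    resp = requests.post(BASE + "/graphs", data=f.read(),
                         headers={"Content-Type": "application/json"})
resp.raise_for_status()
graph_id = resp.json().get("id")             # assumed field name in the response body

# Step 3 equivalent: check a reachability-based property between two nodes.
result = requests.get(
    BASE + "/graphs/{}/policy".format(graph_id),
    params={"source": "user1",               # placeholder source node name
            "destination": "webserver",      # placeholder destination node name
            "type": "reachability"})
print(result.json())
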

gRPC API

VeriGraph exposes a gRPC interface that is self-descriptive through its Protobuf file (parser/verigraph/src/main/proto/verigraph.proto). In the current release, Verigraph lacks a module that receives service graphs in JSON format and sends the proper requests to the gRPC server. A testing client is provided as an example of how to create a service graph using the gRPC interface and trigger the verification step.

  1. Run the testing client
cd parser/verigraph
# Client source code in ``parser/verigraph/src/main/it/polito/grpc/Client.java``
ant -f buildVeriGraph_gRPC.xml run-client

Promise

Promise: Resource Management
Project: Promise, https://wiki.opnfv.org/promise
Editors: Ashiq Khan (NTT DOCOMO), Bertrand Souville (NTT DOCOMO)
Authors: Ravi Chunduru (ClearPath Networks), Peter Lee (ClearPath Networks), Gerald Kunzmann (NTT DOCOMO), Ryota Mibu (NEC), Carlos Goncalves (NEC), Arturo Martin De Nicolas (Ericsson)
Abstract: Promise is an OPNFV requirement project. Its objective is to realize ETSI NFV defined resource reservation and NFVI capacity features within the scope of OPNFV. Promise provides the details of the requirements on resource reservation, NFVI capacity management at VIM, specification of the northbound interfaces from VIM relevant to these features, and implementation plan to realize these features in OPNFV.
1. Definition of terms

Different SDOs and communities use different terminology related to NFV/Cloud/SDN. This list tries to define an OPNFV terminology, mapping/translating the OPNFV terms to terminology used in other contexts.

Administrator
Administrator of the system, e.g. OAM in Telco context.
Consumer
User-side Manager; consumer of the interfaces produced by the VIM; VNFM, NFVO, or Orchestrator in ETSI NFV [NFV003] terminology.
NFV
Network Function Virtualization
NFVI
Network Function Virtualization Infrastructure; totality of all hardware and software components which build up the environment in which VNFs are deployed.
NFVO
Network Functions Virtualization Orchestrator; functional block that manages the Network Service (NS) lifecycle and coordinates the management of NS lifecycle, VNF lifecycle (supported by the VNFM) and NFVI resources (supported by the VIM) to ensure an optimized allocation of the necessary resources and connectivity.
Physical resource
Actual resources in NFVI; not visible to Consumer.
Resource zone
A set of NFVI hardware and software resources logically grouped according to physical isolation and redundancy capabilities or to certain administrative policies for the NFVI [NFVIFA010]
VIM
Virtualized Infrastructure Manager; functional block that is responsible for controlling and managing the NFVI compute, storage and network resources, usually within one operator’s Infrastructure Domain, e.g. NFVI Point of Presence (NFVI-PoP).
Virtual Machine (VM)
Virtualized computation environment that behaves very much like a physical computer/server.
Virtual network
Virtual network routes information among the network interfaces of VM instances and physical network interfaces, providing the necessary connectivity.
Virtual resource
A Virtual Machine (VM), a virtual network, or virtualized storage; Offered resources to “Consumer” as result of infrastructure virtualization; visible to Consumer.
Virtual Storage
Virtualized non-volatile storage allocated to a VM.
VNF
Virtualized Network Function. Implementation of a Network Function that can be deployed on a Network Function Virtualization Infrastructure (NFVI).
VNFM
Virtualized Network Function Manager; functional block that is responsible for the lifecycle management of VNF.
2. Introduction

Resource reservation is a basic function for the operation of a virtualized telecom network. In resource reservation, VIM reserves resources for a certain period as requested by the NFVO. A resource reservation will have a start time which could be into the future. Therefore, the reserved resources shall be available for the NFVO requested purpose (e.g. for a VNF) at the start time for the duration asked by NFVO. Resources include all three resource types in an NFVI i.e. compute, storage and network.

Besides, NFVO requires abstracted NFVI resource capacity information in order to take decisions on VNF placement and other operations related to the virtual resources. VIM is required to inform the NFVO of NFVI resource state information for this purpose. Promise project aims at delivering the detailed requirements on these two features defined in ETSI NFV MAN GS [NFVMAN], the list of gaps in upstream projects, potential implementation architecture and plan, and the VIM northbound interface specification for resource reservation and capacity management.

2.1. Problem description

OpenStack, a prominent candidate for the VIM, cannot reserve resources for future use. OpenStack requires immediate instantiation of Virtual Machines (VMs) in order to occupy resources intended to be reserved. Blazar can reserve compute resources for future use by keeping the VMs in shelved mode. However, such reserved resources can also be used for scaling out rather than for new VM instantiation. Blazar does not support network and storage resource reservation yet.

Besides, OpenStack does not provide a northbound interface through which it can notify an upper layer management entity e.g. NFVO about capacity changes in its NFVI, periodically or in an event driven way. Capacity management is a feature defined in ETSI NFV MAN GS [NFVMAN] and is required in network operation.

3. Use cases and scenarios
3.1. Use cases

Resource reservation is a basic feature in any virtualization-based network operation. In order to perform such resource reservation from NFVO to VIM, NFVI capacity information is also necessary at the NFVO side. Four use cases showing typical requirements and solutions for capacity management and resource reservation are presented below. A typical use case as considered for the Brahmaputra release is described in ANNEX A: Use case for OPNFV Brahmaputra.

  1. Resource capacity management
  2. Resource reservation for immediate use
  3. Resource reservation for future use
  4. Co-existence of reservations and allocation requests without reservation
3.1.1. Resource capacity management

The NFVO takes the first decision on the NFVI in which it would instantiate a VNF. Along with the NFVI's resource attributes (e.g. availability of hardware accelerators, particular CPU architectures etc.), the NFVO needs to know the available capacity of an NFVI in order to make an informed decision on selecting a particular NFVI. Such capacity information shall be at a coarser granularity than that held by the respective VIM, as the VIM maintains capacity information of its NFVI in fine detail. However, a very coarse granularity, like simply the number of available virtual CPU cores, may not be sufficient. In order to allow the NFVO to make well-founded allocation decisions, an appropriate level at which to expose the available capacity may be per flavor. Capacity information may be required for the complete NFVI, per partition or availability zone, or at other granularities. Therefore, the VIM is required to inform the NFVO about the available capacity of its NFVI at a pre-determined abstraction, either by query-response, in an event-based manner, or periodically.

3.1.2. Resource reservation for immediate use

Reservation is inherently for the future. Even if some reserved resources are to be consumed instantly, there is a network latency between the issuance of a resource reservation request from the NFVO, a response from the VIM, and the actual allocation of the requested resources to a VNF/VNFM. Within such latency, resource capacity in the NFVI in question could change, e.g. due to a failure or due to allocation to a different request. Therefore, the response from a VIM to the NFVO for a resource reservation request for immediate use should have a validity period, which indicates until when the VIM can hold the requested resources. During this time, the NFVO should proceed to allocation if it wishes to consume the reserved resources. If allocation is not performed within the validity period, the response from the VIM to that particular resource reservation request becomes invalid and the VIM is no longer liable to provide those resources to the NFVO/VNFM. Reservation requests for immediate use do not have a start time but may have an end time.

3.1.3. Resource reservation for future use

Network operators may want to reserve extra resources for future use. Such necessity could arise from predicted congestion in telecom nodes e.g. due to local traffic spikes for concerts, natural disasters etc. In such a case, the NFVO, while sending a resource reservation request to the VIM, shall include a start time (and an end time if necessary). The start time indicates at what time the reserved resource shall be available to a designated consumer e.g. a VNF/VNFM. Here, the requirement is that the reserved resources shall be available when the start time arrives. After the start time has arrived, the reserved resources are allocated to the designated consumer(s). An explicit allocation request is needed. How actually these requested resources are held by the VIM for the period in between the arrival of the resource reservation request and the actual allocation is outside the scope of this requirement project.

3.1.4. Co-existence of reservations and allocation requests without reservation

In a real environment VIM will have to handle allocation requests without any time reference, i.e. time-unbound, together with time-bound reservations and allocation requests with an explicitly indicated end-time. A granted reservation for the future will effectively reduce the available capacity for any new time-unbound allocation request. The consequence is that reservations, even those far in the future, may result in denial of service for new allocation requests.

To alleviate this problem several approaches can be taken. They imply an implicit or explicit priority scheme:

  • Allocation requests without reservation and which are time-unbound will be granted resources in a best-effort way: if capacity is available at that instant, resources are allocated, but they may later be withdrawn when a previously granted reservation reaches its start time
  • Both allocation requests and reservation requests contain a priority which may be related to SLAs and contractual conditions between the tenant and the NFVI provider. Interactions may look like:
    • A reservation request for future use may cancel another, not yet started, reservation with lower priority
    • An allocation request without reservations and time-unbound [1] may be granted resources and prevent a future reservation with lower priority from getting resources at start time
    • A reservation request may result in terminating resources allocated to a request with no reservation, if the latter has lower priority
[1]In this case, the consumer (VNFM or NFVO) requests to immediately instantiate and assign virtualized resources without having reserved the resources beforehand
3.2. Scenarios

This section describes the expected behavior of the system in different scenarios.

As we are targeting a cloud platform with the above use cases, it is crucial to keep the system flexible. This means it is hard to explicitly block hardware resources for the future and ensure their availability for allocation; it can therefore happen that the reservation plan cannot be fulfilled due to certain changes in the underlying environment. We need to ensure that we are prepared for error and edge cases during the design period.

4. High level architecture and general features
4.1. Architecture Overview

Resource Reservation Architecture

figure1 shows the high level architecture for the resource reservation use cases. Reserved resources are guaranteed for a given user/client for the period expressed by start and end time. The user/client represents the requestor and the subsequent consumer of the reserved resources and corresponds to the NFVO or VNFM in ETSI NFV terminology.

Note: in this document only reservation requests from NFVO are considered.

4.2. General Features

This section provides a list of features that need to be developed in the Promise project.

  • Resource capacity management
    • Discovery of available resource capacity in resource providers
    • Monitoring of available resource capacity in resource providers
    • Update available resource capacity as a result of new or expired reservations, addition/removal of resources. Note: this is a VIM internal function, not an operation in the VIM northbound interface.
  • Resource reservation
    • Set start time and end time for allocation
    • Increase/decrease reserved resource’s capacity
    • Update resource reservations, e.g. add/remove reserved resources
    • Terminate an allocated resource due to the end time of a reservation
  • VIM northbound interfaces
    • Receive/Reply resource reservation requests
    • Receive/Reply resource capacity management requests
    • Receive/Reply resource allocation requests for reserved resources when start time arrives
    • Subscribe/Notify resource reservation event
      • Notify reservation error or process completion prior to reservation start
      • Notify remaining time until termination of a resource due to the end time of a reservation
      • Notify termination of a resource due to the end time of a reservation
    • Receive/Reply queries on available resource capacity
    • Subscribe/Notify changes in available resource capacity
4.3. High level northbound interface specification
4.3.1. Resource Capacity Management

Resource capacity management message flow: notification of capacity change

figure2 shows a high level flow for a use case of resource capacity management. In this example, the VIM notifies the NFVO of capacity change after having received an event regarding a change in capacity (e.g. a fault notification) from the NFVI. The NFVO can also retrieve detailed capacity information using the Query Capacity Request interface operation.


Resource capacity management message flow: query of capacity density

figure3 shows a high level flow for another use case of resource capacity management. In this example, the NFVO queries the VIM about the actual capacity to instantiate a certain resource according to a certain template, for example a VM according to a certain flavor. In this case the VIM responds with the number of VMs that could be instantiated according to that flavor with the currently available capacity.

4.3.2. Resource Reservation

Resource reservation flow

figure4 shows a high level flow for a use case of resource reservation. The main steps are:

  • The NFVO sends a resource reservation request to the VIM using the Create Resource Reservation Request interface operation.
  • The NFVO gets a reservation identifier associated with this request in the reply message
  • Using the reservation identifier, the NFVO can query/update/terminate a resource reservation using the corresponding interface operations
  • The NFVO is notified that the resource reservation is terminated due to the end time of the reservation
4.4. Information elements
4.4.1. Resource Capacity Management
4.4.1.1. Notify Capacity Change Event

The notification change message shall include the following information elements:

  • Notification (Identifier): Identifier issued by the VIM for the capacity change event notification
  • Zone (Identifier): Identifier of the zone where capacity has changed
  • Used/Reserved/Total Capacity (List): Used, reserved and total capacity information regarding the resource items subscribed for notification for which the capacity change event occurred
4.4.1.2. Query Resource Capacity Request

The capacity management query request message shall include the following information elements:

  • Zone (Identifier): Identifier of the zone where capacity is requested
  • Attributes (List): Attributes of resource items to be notified regarding capacity change events
  • Resources (List): Identifiers of existing resource items to be queried regarding capacity info (such as images, flavors, virtual containers, networks, physical machines, etc.)

The capacity management query request message may also include the following information element:

  • Flavor (Identifier): Identifier that is passed in the request to obtain information of the number of virtual resources that can be instantiated according to this flavor with the available capacity
4.4.1.3. Query Resource Capacity Reply

The capacity management query reply message shall include the following information elements:

  • Zone (Identifier): Identifier of the zone where capacity is requested
  • Used/Reserved/Total Capacity (List): Used, reserved and total capacity information regarding each of the resource items requested to check for capacity

The detailed specification of the northbound interface for Capacity Management is provided in section 5.1.1.

4.4.2. Resource Reservation
4.4.2.1. Create Resource Reservation Request

The create resource reservation request message shall include the following information elements:

  • Start (Timestamp): Start time for consumption of the reserved resources
  • End (Timestamp): End time for consumption of the reserved resources
  • Expiry (Timestamp): If not all reserved resources are allocated between start time and expiry, the VIM shall release the corresponding resources [1]
  • Amount (Number): Amount of the resources per resource item type (i.e. compute/network/storage) that need to be reserved
  • Zone (Identifier): The zone where the resources need(s) to be reserved
  • Attributes (List): Attributes of the resources to be reserved such as DPDK support, hypervisor, network link bandwidth, affinity rules, etc.
  • Resources (List): Identifiers of existing resource items to be reserved (such as images, flavors, virtual containers, networks, physical machines, etc.)
[1] Expiry is a period around the start time within which the allocation process must take place. If the allocation process does not start within the expiry period, the reservation becomes invalid and the VIM should release the resources.
4.4.2.2. Create Resource Reservation Reply

The create resource reservation reply message shall include the following information elements:

  • Reservation (Identifier): Identification of the reservation instance. It can be used by a consumer to modify the reservation later, and to request the allocation of the reserved resources.
  • Message (Text): Output message that provides additional information about the create resource reservation request (e.g. may be a simple ACK if the request is being background processed by the VIM)
4.4.2.3. Notify Reservation Event

The notification reservation event message shall include the following information elements:

  • Reservation (Identifier): Identification of the reservation instance triggering the event
  • Notification (Identifier): Identification of the resource event notification issued by the VIM
  • Message (Text): Message describing the event

The detailed specification of the northbound interface for Resource Reservation is provided in section 5.1.2.

5. Gap analysis in upstream projects

This section provides a list of gaps in upstream projects for realizing resource reservation and management. The gap analysis work focuses on the current OpenStack Blazar project [BLAZAR] in this first release.

5.1. OpenStack
5.1.1. Resource reservation for future use
5.1.2. Resource reservation update
  • Category: Blazar
  • Type: ‘missing’ (lack of functionality)
  • Description:
    • To-be: Have the possibility of adding/removing resources to an existing reservation, e.g. in case of NFVI failure
    • As-is: Currently in Blazar, a reservation can only be modified in terms of start/end time
  • Related blueprints: N/A
5.1.3. Give me an offer
  • Category: Blazar
  • Type: ‘missing’ (lack of functionality)
  • Description:
    • To-be: To have the possibility of giving a quotation to a requesting user and an expiration time. Reserved resources shall be released if they are not claimed before this expiration time.
    • As-is: Blazar can already send notification e.g. to inform a given user that a reservation is about to expire
  • Related blueprints: N/A
5.1.4. StormStack StormForge
5.1.4.1. Stormify
  • Stormify enables rapid web applications construction
  • Based on Ember.js style Data stores
  • Developed on Node.js using coffeescript/javascript
  • Auto RESTful API generation based on Data Models
  • Development starts with defining Data Models
  • Code hosted at github : http://github.com/stormstack/stormify
5.1.4.2. StormForge
  • Data Model driven management of Resource Providers
  • Based on Stormify Framework and implemented as per the OPNFV Promise requirements
  • Data Models and RESTful API code are auto-generated from YANG schemas
  • Currently planned key services include Resource Capacity Management Service and Resource Reservation Service
  • List of YANG schemas for Promise project is attached in the Appendix
  • Code hosted at github: http://github.com/stormstack/stormforge
5.1.4.3. Resource Discovery
  • Category: StormForge
  • Type: ‘planning’ (lack of functionality)
  • Description
    • To-be: To be able to discover resources in real time from OpenStack components. Planning to add OpenStack Project to interface with Promise for real time updates on capacity or any failures
    • As-is: Currently, resource capacity is learnt using NB APIs related to quota
  • Related Blueprints: N/A
6. Detailed architecture and message flows

Within the Promise project we consider two different architectural options, i.e. a shim-layer based architecture and an architecture targeting full OpenStack integration.

6.1. Shim-layer architecture

The shim-layer architecture is using a layer on top of OpenStack to provide the capacity management, resource reservation, and resource allocation features.

6.1.1. Detailed Message Flows

Note that only selected parameters for the messages are shown. Refer to the detailed northbound interface specification and ANNEX B: Promise YANG schema based on YangForge for a full set of message parameters.

6.1.1.1. Resource Capacity Management

Capacity Management Scenario

figure5 shows a detailed message flow between the consumers and the capacity management functional blocks inside the shim-layer. It has the following steps:

  • Step 1a: The Consumer sends a query-capacity request to Promise using some filter like time-windows or resource type. The capacity is looked up in the shim-layer capacity map.
  • Step 1b: The shim-layer will respond with information about the total, available, reserved, and used (allocated) capacities matching the filter.
  • Step 2a: The Consumer can send increase/decrease-capacity requests to update the capacity available to the reservation system. It can be 100% of available capacity in the given provider/source or only a subset, i.e., it can allow for leaving some “buffer” in the actual NFVI to be used outside the Promise shim-layer or for a different reservation service instance. It can also be used to inform the reservation system that from a certain time in the future, additional resources can be reserved (e.g. due to a planned upgrade of the capacity), or the available capacity will be reduced (e.g. due to a planned downtime of some of the resources).
  • Step 2b: The shim-layer will respond with an ACK/NACK message.
  • Step 3a: Consumers can subscribe for capacity-change events using a filter.
  • Step 3b: Each successful subscription is responded with a subscription_id.
  • Step 4: The shim-layer monitors the capacity information for the various types of resources by periodically querying the various Controllers (e.g. Nova, Neutron, Cinder) or by creating event alarms in the VIM (e.g. with Ceilometer for OpenStack) and updates capacity information in its capacity map.
  • Step 5: Capacity changes are notified to the Consumer.
6.1.1.2. Resource Reservation

Resource Reservation for Future Use Scenario

figure6 shows a detailed message flow between the Consumer and the resource reservation functional blocks inside the shim-layer. It has the following steps:

  • Step 1a: The Consumer creates a resource reservation request for future use by setting a start and end time for the reservation as well as more detailed information about the resources to be reserved. The Promise shim-layer will check the free capacity in the given time window and in case sufficient capacity exists to meet the reservation request, will mark those resources “reserved” in its reservation map.
  • Step 1b: If the reservation was successful, a reservation_id and status of the reservation will be returned to the Consumer. In case the reservation cannot be met, the shim-layer may return information about the maximum capacity that could be reserved during the requested time window and/or a potential time window where the requested (amount of) resources would be available.
  • Step 2a: Reservations can be updated using an update-reservation request, providing the reservation_id and the new reservation_data. The Promise Reservation Manager will check the feasibility of updating the reservation as requested.
  • Step 2b: If the reservation was updated successfully, a reservation_id and status of the reservation will be returned to the Consumer. Otherwise, an appropriate error message will be returned.
  • Step 3a: A cancel-reservation request can be used to withdraw an existing reservation. Promise will update the reservation map by removing the reservation as well as the capacity map by adding the freed capacity.
  • Step 3b: The response message confirms the cancelation.
  • Step 4a: Consumers can also issue query-reservation requests to receive a list of reservations. An input filter can be used to narrow down the query, e.g., only provide reservations in a given time window. Promise will query its reservation map to identify reservations matching the input filter.
  • Step 4b: The response message contains information about all reservations matching the input filter. It also provides information about the utilization in the requested time window.
  • Step 5a: Consumers can subscribe for reservation-change events using a filter.
  • Step 5b: Each successful subscription is responded with a subscription_id.
  • Step 6a: Promise synchronizes the available and used capacity with the underlying VIM.
  • Step 6b: In certain cases, e.g., due to a failure in the underlying hardware, some reservations can no longer be kept and have to be updated or canceled. The shim-layer will identify affected reservations among its reservation records.
  • Step 7: Subscribed Consumers will be informed about the updated reservations. The notification contains the updated reservation_data and new status of the reservation. It is then up to the Consumer to take appropriate actions in order to ensure high priority reservations are favored over lower priority reservations.
6.1.1.3. Resource Allocation

Resource Allocation

figure7 shows a detailed message flow between the Consumer, the functional blocks inside the shim-layer, and the VIM. It has the following steps:

  • Step 1a: The Consumer sends a create-instance request providing information about the resources to be reserved, i.e., provider_id (optional in case of only one provider), name of the instance, the requested flavour and image, etc. If the allocation is against an existing reservation, the reservation_id has to be provided.
  • Step 1b: If a reservation_id was provided, Promise checks if a reservation with that ID exists, the reservation start time has arrived (i.e. the reservation is active), and the required capacity for the requested flavor is within the available capacity of the reservation. If those conditions are met, Promise creates a record for the allocation (VMState=”INITIALIZED”) and updates its databases. If no reservation_id was provided in the allocation request, Promise checks whether the required capacity to meet the request can be provided from the available, non-reserved capacity. If yes, Promise creates a record for the allocation and updates its databases. In any other case, Promise rejects the create-instance request.
  • Step 2: In case the create-instance request was rejected, Promise responds with “status=rejected”, providing the reason for the rejection. This will help the Consumer to take appropriate actions, e.g., send an updated create-instance request. The allocation workflow terminates at this step and the steps below are not executed.
  • Step 3a: If the create-instance request was accepted and a related allocation record has been created, the shim-layer issues a createServer request to the VIM Controller providing all information to create the server instance.
  • Step 3b: The VIM Controller sends an immediate reply with an instance_id and starts the VIM-internal allocation process.
  • Step 4: The Consumer gets an immediate response message with allocation status “in progress” and the assigned instance_id.
  • Step 5a+b: The consumer subscribes to receive notifications about allocation events related to the requested instance. Promise responds with an acknowledgment including a subscribe_id.
  • Step 6: In parallel to the previous step, Promise shim-layer creates an alarm in Aodh to receive notifications about all changes to the VMState for instance_id.
  • Step 7a: The VIM Controller notifies all instance related events to Ceilometer. After the allocation has been completed or failed, it sends an event to Ceilometer. This triggers the OpenStack alarming service Aodh to notify the new VMState (e.g. ACTIVE and ERROR) to the shim-layer that updates its internal allocation records.
  • Step 7b: Promise sends a notification message to the subscribed Consumer with information on the allocated resources including their new VMState.
  • Step 8a+b: Allocated instances can be terminated by the Consumer by sending a destroy-instance request to the shim-layer. Promise responds with an acknowledgment and the new status “DELETING” for the instance.
  • Step 9a: Promise sends a deleteServer request for the instance_id to the VIM Controller.
  • Step 10a: After the instance has been deleted, an event alarm is sent to the shim-layer that updates its internal allocation records and capacity utilization.
  • Step 10b: The shim-layer also notifies the subscribed Consumer about the successfully destroyed instance.
6.1.2. Internal operations

Note

This section is to be updated

In the following, the internal logic and operations of the shim-layer will be explained in more detail, e.g. the “check request” (step 1b in figure7 of the allocation work flow).

6.2. Integrated architecture

The integrated architecture aims at full integration with OpenStack. This means that it is planned to use the already existing OpenStack APIs extended with the reservation capabilities.

The advantage of this approach is that we don’t need to re-model the complex resource structure we have for the virtual machines and the corresponding infrastructure.

The atomic item is the virtual machine with the minimum set of resources it requires to be able to start up. It is important to state that resource reservation is handled at the VM instance level as opposed to standalone resources like CPU, memory and so forth. Because placement is essential for actually using the reserved resources, it imposes the constraint that resources are handled in groups.

The placement constraint also makes it impossible to use a quota management system to solve the base use case described earlier in this document.

OpenStack had a project called Blazar, which was created in order to provide resource reservation functionality in cloud environments. It uses the Shelve API of Nova, which provides a sub-optimal solution. Because this feature blocks the reserved resources, this solution cannot be considered final. Further work is needed to reach a more optimal solution, in which the Nova scheduler is used to schedule resources for future use in order to realize the reservations.

6.2.1. Phases of the work

The work has two main stages to reach the final solution. The following main work items are on the roadmap for this approach:

  1. Sub-optimal solution by using the shelve API of Nova through the Blazar project:

    • Fix the code base of the Blazar project:

      Due to integration difficulties the Blazar project got suspended. Since the last activities in that repository the OpenStack code base and environment changed significantly, which means that the project’s code base needs to be updated to the latest standards and has to be able to interact with the latest version of the other OpenStack services.

    • Update the Blazar API:

      The REST API needs to be extended to contain the attributes for the reservation defined in this document. This activity shall include testing towards the new API.

  2. Use Nova scheduler to avoid blocking the reserved resources:

    • Analyze the Nova scheduler:

      The status and the possible interface between the resource reservation system and the Nova scheduler needs to be identified. It is crucial to achieve a much more optimal solution than what the current version of Blazar can provide. The goal is to be able to use the reserved resources before the reservation starts. In order to be able to achieve this we need the scheduler to do scheduling for the future considering the reservation intervals that are specified in the request.

    • Define a new design based on the analysis and start the work on it:

      The design for the more optimal solution can be defined only after analyzing the structure and capabilities of the Nova scheduler.

    • This phase can be started in parallel with the previous one.

6.2.2. Detailed Message Flows

Note

to be done

6.2.2.1. Resource Reservation

Note

to be specified

7. Detailed northbound interface specification

Note

This is Work in Progress.

7.1. ETSI NFV IFA Information Models
7.1.1. Compute Flavor

A compute flavor includes information about number of virtual CPUs, size of virtual memory, size of virtual storage, and virtual network interfaces [NFVIFA005].

_images/computeflavor.png
7.2. Virtualised Compute Resources
7.2.1. Compute Capacity Management
7.2.1.1. Subscribe Compute Capacity Change Event

Subscription from Consumer to VIM to be notified about compute capacity changes

POST /capacity/compute/subscribe

Example request:

POST /capacity/compute/subscribe HTTP/1.1
Accept: application/json

{
   "zoneId": "12345",
   "computeResourceTypeId": "vcInstances",
   "threshold": {
      "thresholdType" : "absoluteValue",
      "threshold": {
          "capacity_info": "available",
          "condition": "lt",
          "value": 5
      }
   }
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
   "created": "2015-09-21T00:00:00Z",
   "capacityChangeSubscriptionId": "abcdef-ghijkl-123456789"
}
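
As a minimal consumer-side sketch of this operation, the Python snippet below posts the example subscription shown above using the requests library; the VIM endpoint address is an assumption.

import requests

VIM_API = "http://vim.example.com:8080"   # assumed VIM northbound endpoint

subscription = {
    "zoneId": "12345",
    "computeResourceTypeId": "vcInstances",
    "threshold": {
        "thresholdType": "absoluteValue",
        "threshold": {"capacity_info": "available", "condition": "lt", "value": 5},
    },
}

# POST /capacity/compute/subscribe as defined in this section.
resp = requests.post(VIM_API + "/capacity/compute/subscribe", json=subscription)
resp.raise_for_status()
print("subscription id:", resp.json()["capacityChangeSubscriptionId"])
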
7.2.1.2. Query Compute Capacity for a defined resource type

Request to find out about available, reserved, total and allocated compute capacity.

GET /capacity/compute/query

Example request:

GET /capacity/compute/query HTTP/1.1
Accept: application/json

{
  "zoneId": "12345",
  "computeResourceTypeId": "vcInstances",
  "timePeriod":  {
       "startTime": "2015-09-21T00:00:00Z",
       "stopTime": "2015-09-21T00:05:30Z"
  }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
   "zoneId": "12345",
   "lastUpdate": "2015-09-21T00:03:20Z",
   "capacityInformation": {
      "available": 4,
      "reserved": 17,
      "total": 50,
      "allocated": 29
   }
}
Query Parameters:
 
  • limit – Default is 10.
7.2.1.3. Query Compute Capacity with required attributes

Request to find out available compute capacity with given characteristics

GET /capacity/compute/query

Example request:

GET /capacity/compute/query HTTP/1.1
Accept: application/json

{
  "zoneId": "12345",
  "resourceCriteria":  {
       "virtualCPU": {
           "cpuArchitecture": "x86",
           "numVirtualCpu": 8
       }
  },
  "attributeSelector":  "available",
  "timePeriod":  {
       "startTime": "2015-09-21T00:00:00Z",
       "stopTime": "2015-09-21T00:05:30Z"
  }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
   "zoneId": "12345",
   "lastUpdate": "2015-09-21T00:03:20Z",
   "capacityInformation": {
      "available": 50
   }
}
Query Parameters:
 
  • limit – Default is 10.
7.2.1.4. Notify Compute Capacity Change Event

Notification about compute capacity changes

POST /capacity/compute/notification

Example notification:

Content-Type: application/json

{
     "zoneId": "12345",
     "notificationId": "zyxwvu-tsrqpo-987654321",
     "capacityChangeTime": "2015-09-21T00:03:20Z",
     "resourceDescriptor": {
        "computeResourceTypeId": "vcInstances"
     },
     "capacityInformation": {
        "available": 4,
        "reserved": 17,
        "total": 50,
        "allocated": 29
     }
}
7.2.2. Compute Resource Reservation
7.2.2.1. Create Compute Resource Reservation

Request the reservation of compute resource capacity

POST /reservation/compute/create

Example request:

POST /reservation/compute/create HTTP/1.1
Accept: application/json

{
    "startTime": "2015-09-21T01:00:00Z",
    "computePoolReservation": {
        "numCpuCores": 20,
        "numVcInstances": 5,
        "virtualMemSize": 10
    }
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
   "reservationData": {
      "startTime": "2015-09-21T01:00:00Z",
      "reservationStatus": "initialized",
      "reservationId": "xxxx-yyyy-zzzz",
      "computePoolReserved": {
          "numCpuCores": 20,
          "numVcInstances": 5,
          "virtualMemSize": 10,
          "zoneId": "23456"
      }
   }
}

or, for virtualization containers:

POST reservation/compute/create

Example request:

POST /reservation/compute/create HTTP/1.1
Accept: application/json

{
  "startTime": "2015-10-05T15:00:00Z",
  "virtualizationContainerReservation": [
    {
       "containerId": "myContainer",
       "containerFlavor": {
          "flavorId": "myFlavor",
          "virtualCpu": {
             "numVirtualCpu": 2,
             "cpuArchitecture": "x86"
          },
          "virtualMemory": {
              "numaEnabled": "False",
              "virtualMemSize": 16
          },
          "storageAttributes": {
              "typeOfStorage": "volume",
              "sizeOfStorage": 16
          }
       }
    }
  ]
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
   "reservationData": {
      "startTime": "2015-10-05T15:00:00Z",
      "reservationId": "aaaa-bbbb-cccc",
      "reservationStatus": "initialized",
      "virtualizationContainerReserved": [
          {
             "containerId": "myContainer",
             "flavorId": "myFlavor",
             "virtualCpu": {
                 "numVirtualCpu": 2,
                 "cpuArchitecture": "x86"
             },
             "virtualMemory": {
                 "numaEnabled": "False",
                 "virtualMemSize": 16
             },
             "virtualDisks": {
                 "storageId": "myStorage",
                 "flavourId": "myStorageFlavour",
                 "typeOfStorage": "volume",
                 "sizeOfStorage": 16,
                 "operationalState": "enabled"
             }
          }
      ]
   }
}
7.2.2.2. Query Compute Resource Reservation

Request to find out about reserved compute resources that the consumer has access to.

GET /reservation/compute/query

Example request:

GET /reservation/compute/query HTTP/1.1
Accept: application/json

{
   "queryReservationFilter": [
       {
           "reservationId": "xxxx-yyyy-zzzz"
       }
   ]

}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
   "queryResult":
   {
      "startTime": "2015-09-21T01:00:00Z",
      "reservationStatus": "active",
      "reservationId": "xxxx-yyyy-zzzz",
      "computePoolReserved":
      {
          "numCpuCores": 20,
          "numVcInstances": 5,
          "virtualMemSize": 10,
          "zoneId": "23456"
      }
   }
}
7.2.2.3. Update Compute Resource Reservation

Request to update compute resource reservation

POST /reservation/compute/update

Example request:

POST /reservation/compute/update HTTP/1.1
Accept: application/json

{
    "startTime": "2015-09-14T16:00:00Z",
    "reservationId": "xxxx-yyyy-zzzz"
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
  "reservationData": {
      "startTime": "2015-09-14TT16:00:00Z",
      "reservationStatus": "active",
      "reservationId": "xxxx-yyyy-zzzz",
      "computePoolReserved": {
          "numCpuCores": 20,
          "numVcInstances": 5,
          "virtualMemSize": 10,
          "zoneId": "23456"
      }
   }
}
7.2.2.4. Terminate Compute Resource Reservation

Request to terminate a compute resource reservation

DELETE /reservation/compute/(reservation_id)

Example response:

HTTP/1.1 200
Content-Type: application/json

{
   "reservationId": "xxxx-yyyy-zzzz",
}
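
Putting the compute reservation operations of this section together, a consumer could drive the create/query/terminate cycle as in the minimal Python sketch below; the endpoint address is an assumption and the payloads follow the examples above.

import requests

VIM_API = "http://vim.example.com:8080"   # assumed VIM northbound endpoint

# Create a compute pool reservation starting at the given time (7.2.2.1).
create = requests.post(VIM_API + "/reservation/compute/create", json={
    "startTime": "2015-09-21T01:00:00Z",
    "computePoolReservation": {
        "numCpuCores": 20,
        "numVcInstances": 5,
        "virtualMemSize": 10,
    },
})
reservation_id = create.json()["reservationData"]["reservationId"]

# Query the reservation status (7.2.2.2).
query = requests.get(VIM_API + "/reservation/compute/query", json={
    "queryReservationFilter": [{"reservationId": reservation_id}],
})
print(query.json()["queryResult"]["reservationStatus"])

# Terminate the reservation when it is no longer needed (7.2.2.4).
requests.delete(VIM_API + "/reservation/compute/" + reservation_id)
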
7.2.2.5. Subscribe Resource Reservation Change Event

Subscription from Consumer to VIM to be notified about changes related to a reservation or to the resources associated to it.

POST /reservation/subscribe

Example request:

 POST /reservation/subscribe HTTP/1.1
 Accept: application/json

 {
    "inputFilter": [
        {
           "reservationId": "xxxx-yyyy-zzzz",
        }
    ]
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
   "created": "2015-09-21T00:00:00Z",
   "reservationChangeSubscriptionId": "abcdef-ghijkl-123456789"
}
7.2.2.6. Notify Resource Reservation Change Event

Notification about changes in a compute resource reservation

POST /capacity/compute/notification

Example notification:

Content-Type: application/json

{
     "changeId": "aaaaaa-btgxxx-987654321",
     "reservationId": "xxxx-yyyy-zzzz",
     "vimId": "vim-CX-03"
     "changeType": "Reservation time change"
     "changedReservationData": {
        "endTime": "2015-10-14TT16:00:00Z",
     }
}
7.3. Virtualised Network Resources
7.3.1. Network Capacity Management
7.3.1.1. Subscribe Network Capacity Change Event

Subscription from Consumer to VIM to be notified about network capacity changes

POST /capacity/network/subscribe

Example request:

POST /capacity/network/subscribe HTTP/1.1
Accept: application/json

{
    "networkResourceTypeId": "publicIps",
    "threshold": {
       "thresholdType": "absoluteValue",
       "threshold": {
           "capacity_info": "available",
           "condition": "lt",
           "value": 5
       }
    }
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
   "created": "2015-09-28T00:00:00Z",
   "capacityChangeSubscriptionId": "bcdefg-hijklm-234567890"
}
7.3.1.2. Query Network Capacity

Request to find out about available, reserved, total and allocated network capacity.

GET /capacity/network/query

Example request:

GET /capacity/network/query HTTP/1.1
Accept: application/json

{
    "networkResourceTypeId": "publicIps",
    "timePeriod":  {
        "startTime": "2015-09-28T00:00:00Z",
        "stopTime": "2015-09-28T00:05:30Z"
    }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "lastUpdate": "2015-09-28T00:02:10Z",
    "capacityInformation": {
        "available": 4,
        "reserved": 10,
        "total": 64,
        "allocated": 50
    }
}
7.3.1.3. Notify Network Capacity Change Event

Notification about network capacity changes

POST /capacity/network/notification

Example notification:

Content-Type: application/json

{
    "notificationId": "yxwvut-srqpon-876543210",
    "capacityChangeTime": "2015-09-28T00:02:10Z",
    "resourceDescriptor": {
        "networkResourceTypeId": "publicIps"
    },
    "capacityInformation": {
        "available": 4,
        "reserved": 10,
        "total": 64,
        "allocated": 50
    }
}
7.3.2. Network Resource Reservation
7.3.2.1. Create Network Resource Reservation

Request the reservation of network resource capacity and/or virtual networks, network ports

POST /reservation/network/create

Example request:

POST /reservation/network/create HTTP/1.1
Accept: application/json

{
    "startTime": "2015-09-28T01:00:00Z",
    "networkReservation": {
        "numPublicIps": 2
    }
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
    "reservationData": {
        "startTime": "2015-09-28T01:00:00Z",
        "reservationStatus": "initialized",
        "reservationId": "wwww-xxxx-yyyy",
        "publicIps": [
            "10.2.91.60",
            "10.2.91.61"
        ]
    }
}
7.3.2.2. Query Network Resource Reservation

Request to find out about reserved network resources that the consumer has access to.

GET /reservation/network/query

Example request:

GET /reservation/network/query HTTP/1.1
Accept: application/json

{
    "queryReservationFilter": [
        {
            "reservationId": "wwww-xxxx-yyyy"
        }
    ]
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "queryResult": {
        "startTime": "2015-09-28T01:00:00Z",
        "reservationStatus": "active",
        "reservationId": "wwww-xxxx-yyyy",
        "publicIps": [
            "10.2.91.60",
            "10.2.91.61"
        ]
    }
}
7.3.2.3. Update Network Resource Reservation

Request to update network resource reservation

POST /reservation/network/update

Example request:

POST /reservation/network/update HTTP/1.1
Accept: application/json

{
    "startTime": "2015-09-21T16:00:00Z",
    "reservationId": "wwww-xxxx-yyyy"
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
    "reservationData": {
        "startTime": "2015-09-21T16:00:00Z",
        "reservationStatus": "active",
        "reservationId": "wwww-xxxx-yyyy",
        "publicIps": [
           "10.2.91.60",
           "10.2.91.61"
        ]
    }
}
7.3.2.4. Terminate Network Resource Reservation

Request to terminate a network resource reservation

DELETE /reservation/network/(reservation_id)

Example response:

HTTP/1.1 200
Content-Type: application/json

{
   "reservationId": "xxxx-yyyy-zzzz",
}
7.4. Virtualised Storage Resources
7.4.1. Storage Capacity Management
7.4.1.1. Subscribe Storage Capacity Change Event

Subscription from Consumer to VIM to be notified about storage capacity changes

POST /capacity/storage/subscribe

Example request:

POST /capacity/storage/subscribe HTTP/1.1
Accept: application/json

{
   "storageResourceTypeId": "volumes",
   "threshold": {
      "thresholdType": "absoluteValue",
      "threshold": {
          "capacity_info": "available",
          "condition": "lt",
          "value": 3
       }
   }
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
    "created": "2015-09-28T12:00:00Z",
    "capacityChangeSubscriptionId": "cdefgh-ijklmn-345678901"
}
7.4.1.2. Query Storage Capacity for a defined resource type

Request to find out about available, reserved, total and allocated storage capacity.

GET /capacity/storage/query

Example request:

GET /capacity/storage/query HTTP/1.1
Accept: application/json

{
    "storageResourceTypeId": "volumes",
    "timePeriod":  {
        "startTime": "2015-09-28T12:00:00Z",
        "stopTime": "2015-09-28T12:04:45Z"
    }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "lastUpdate": "2015-09-28T12:01:35Z",
    "capacityInformation": {
        "available": 2,
        "reserved": 4,
        "total": 10,
        "allocated": 4
    }
}
7.4.1.3. Query Storage Capacity with required attributes

Request to find out available capacity.

GET /capacity/storage/query

Example request:

GET /capacity/storage/query HTTP/1.1
Accept: application/json

{
    "resourceCriteria": {
        "typeOfStorage" : "volume",
        "sizeOfStorage" : 200,
        "rdmaSupported" : "True",
    },
    "attributeSelector": "available",
    "timePeriod":  {
        "startTime": "2015-09-28T12:00:00Z",
        "stopTime": "2015-09-28T12:04:45Z"
    }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "lastUpdate": "2015-09-28T12:01:35Z",
    "capacityInformation": {
        "available": 2
    }
}
7.4.1.4. Notify Storage Capacity Change Event

Notification about storage capacity changes

POST /capacity/storage/notification

Example notification:

 Content-Type: application/json

 {
     "notificationId": "xwvuts-rqponm-765432109",
     "capacityChangeTime": "2015-09-28T12:01:35Z",
     "resourceDescriptor": {
         "storageResourceTypeId": "volumes"
     },
     "capacityInformation": {
         "available": 2,
         "reserved": 4,
         "total": 10,
         "allocated": 4
     }
}
7.4.2. Storage Resource Reservation
7.4.2.1. Create Storage Resource Reservation

Request the reservation of storage resource capacity

POST /reservation/storage/create

Example request:

POST /reservation/storage/create HTTP/1.1
Accept: application/json

{
    "startTime": "2015-09-28T13:00:00Z",
    "storagePoolReservation": {
        "storageSize": 10,
        "numSnapshots": 3,
        "numVolumes": 2
    }
}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
    "reservationData": {
        "startTime": "2015-09-28T13:00:00Z",
        "reservationStatus": "initialized",
        "reservationId": "vvvv-wwww-xxxx",
        "storagePoolReserved": {
            "storageSize": 10,
            "numSnapshots": 3,
            "numVolumes": 2
        }
    }
}
7.4.2.2. Query Storage Resource Reservation

Request to find out about reserved storage resources that the consumer has access to.

GET /reservation/storage/query

Example request:

GET /reservation/storage/query HTTP/1.1
Accept: application/json

{
    "queryReservationFilter": [
        {
            "reservationId": "vvvv-wwww-xxxx"
        }
    ]
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
    "queryResult": {
        "startTime": "2015-09-28T13:00:00Z",
        "reservationStatus": "active",
        "reservationId": "vvvv-wwww-xxxx",
        "storagePoolReserved": {
            "storageSize": 10,
            "numSnapshots": 3,
            "numVolumes": 2
        }
    }
}
7.4.2.3. Update Storage Resource Reservation

Request to update storage resource reservation

POST /reservation/storage/update

Example request:

POST /reservation/storage/update HTTP/1.1
Accept: application/json


{
    "startTime": "2015-09-20T23:00:00Z",
    "reservationId": "vvvv-wwww-xxxx"

}

Example response:

HTTP/1.1 201 CREATED
Content-Type: application/json

{
    "reservationData": {
        "startTime": "2015-09-20T23:00:00Z",
        "reservationStatus": "active",
        "reservationId": "vvvv-wwww-xxxx",
        "storagePoolReserved": {
            "storageSize": 10,
            "numSnapshots": 3,
            "numVolumes": 2
        }
    }
}
7.4.2.4. Terminate Storage Resource Reservation

Request to terminate a storage resource reservation

DELETE /reservation/storage/(reservation_id)

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
   "reservationId": "xxxx-yyyy-zzzz"
}
8. Summary and conclusion

Resource Reservation and Resource Capacity Management are features to be supported by the VIM and exposed to the consumer via the VIM NBI. These features have been specified by ETSI NFV.

This document has described several use cases and corresponding high level flows where Resource Reservation and Capacity Management are of great benefit for the consumer of the virtualised resource management interface: the NFVO or the VNFM. The use cases include:

  • Notification of changes in capacity in the NFVI
  • Query of available resource capacity
  • Reservation of a resource or set of resources for immediate use
  • Reservation of a resource or set of resources for future use

The Promise project has performed a gap analysis in order to identify how the required functionality can be fulfilled. Based on the gap analysis, an implementation plan, an architecture, an information model and a northbound interface have been specified.

9. References and bibliography
[PROMISE] OPNFV, “Promise” requirements project, [Online]. Available at https://wiki.opnfv.org/promise
[BLAZAR] OpenStack Blazar Project, [Online]. Available at https://wiki.openstack.org/wiki/Blazar
[PROMOSS] Promise reference implementation, [Online]. Available at https://github.com/opnfv/promise
[YANGFO] Yangforge Project, [Online]. Available at https://github.com/opnfv/yangforge
[NFVMAN] ETSI GS NFV MAN 001, “Network Function Virtualisation (NFV); Management and Orchestration”
[NFV003] ETSI GS NFV 003, “Network Function Virtualisation (NFV); Terminology for Main Concepts in NFV”
[NFVIFA010] ETSI GS NFV IFA 010, “Network Function Virtualisation (NFV); Management and Orchestration; Functional Requirements Specification”
[NFVIFA005] ETSI GS NFV IFA 005, “Network Function Virtualisation (NFV); Management and Orchestration; Or-Vi reference point - Interface and Information Model Specification”
[NFVIFA006] ETSI GS NFV IFA 006, “Network Function Virtualisation (NFV); Management and Orchestration; Vi-Vnfm reference point - Interface and Information Model Specification”
[ETSINFV] ETSI NFV, [Online]. Available at http://www.etsi.org/technologies-clusters/technologies/nfv
10. ANNEX A: Use case for OPNFV Brahmaputra

A basic resource reservation use case to be realized for the OPNFV B-release may look as follows (a sketch of this flow against the northbound intent API from ANNEX C is shown after the steps):

  • Step 0: Shim-layer is monitoring/querying available capacity at NFVI
    • Step 0a: Cloud operator creates a new OpenStack tenant user and updates quota values for this user
    • Step 0b: The tenant user is creating and instantiating a simple VNF (e.g. 1 network, 2 VMs)
    • Step 0c: OpenStack is notifying shim-layer about capacity change for this new tenant user
    • Step 0d: Cloud operator can visualize the changes using the GUI
  • Step 1: Consumer(NFVO) is sending a reservation request for future use to shim-layer
  • Step 2: Shim-layer is checking for available capacity at the given time window
  • Step 3: Shim-layer is responding with reservation identifier
  • Step 4 (optional): Consumer(NFVO) is sending an update reservation request to shim-layer (startTime set to now) -> continue with Steps 2 and 3.
  • Step 5: Consumer(VNFM) is requesting the allocation of virtualised resources using the reservation identifier in Step 3
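
The following is a minimal sketch of Steps 1 to 5, assuming the Promise shim-layer exposes the intent API described in ANNEX C at http://localhost:8080 (the base URL as well as the image and flavor identifiers are assumptions and must be adapted to the actual deployment):

import requests

BASE = "http://localhost:8080"  # assumed shim-layer endpoint

# Step 1: request a reservation for future use
reservation = requests.post(BASE + "/create-reservation", json={
    "start": "2016-02-02T00:00:00Z",
    "end": "2016-02-03T00:00:00Z",
    "capacity": {"cores": "5", "ram": "25600", "instances": "3", "addresses": "3"}
}).json()

# Steps 2/3: the shim-layer has checked the available capacity and returned an identifier
reservation_id = reservation["reservation-id"]

# Step 4 (optional): update the reservation, e.g. move its start time closer to "now"
requests.post(BASE + "/update-reservation", json={
    "reservation-id": reservation_id,
    "start": "2016-02-01T12:00:00Z"
})

# Step 5: allocate a server instance against the reservation
instance = requests.post(BASE + "/create-instance", json={
    "name": "vm1",
    "image": "<image-id>",    # placeholder, replace with a real Glance image id
    "flavor": "<flavor-id>",  # placeholder, replace with a real Nova flavor id
    "reservation-id": reservation_id
}).json()
print(instance["result"], instance.get("message"))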
11. ANNEX B: Promise YANG schema based on YangForge
module opnfv-promise {
namespace "urn:opnfv:promise";
prefix promise;

import complex-types { prefix ct; }
import ietf-yang-types { prefix yang; }
import ietf-inet-types { prefix inet; }
import access-control-models { prefix acm; }
import nfv-infrastructure { prefix nfvi; }
import nfv-mano { prefix mano; }

description
  "OPNFV Promise Resource Reservation/Allocation controller module";

revision 2015-10-05 {
  description "Complete coverage of reservation related intents";
}

revision 2015-08-06 {
  description "Updated to incorporate YangForge framework";
}

revision 2015-04-16 {
  description "Initial revision.";
}

feature reservation-service {
  description "When enabled, provides resource reservation service";
}

feature multi-provider {
  description "When enabled, provides resource management across multiple providers";
}

typedef reference-identifier {
  description "defines valid formats for external reference id";
  type union {
    type yang:uuid;
    type inet:uri;
    type uint32;
  }
}

grouping resource-utilization {
  container capacity {
    container total     { description 'Conceptual container that should be extended'; }
    container reserved  { description 'Conceptual container that should be extended';
                          config false; }
    container usage     { description 'Conceptual container that should be extended';
                          config false; }
    container available { description 'Conceptual container that should be extended';
                          config false; }
  }
}

grouping temporal-resource-collection {
  description
    "Information model capturing resource-collection with start/end time window";

  leaf start { type yang:date-and-time; }
  leaf end   { type yang:date-and-time; }

  uses nfvi:resource-collection;
}

grouping resource-usage-request {
  description
    "Information model capturing available parameters to make a resource
     usage request.";
  reference "OPNFV-PROMISE, Section 3.4.1";

  uses temporal-resource-collection {
    refine elements {
      description
        "Reference to a list of 'pre-existing' resource elements that are
       required for fulfillment of the resource-usage-request.

       It can contain any instance derived from ResourceElement,
       such as ServerInstances or even other
       ResourceReservations. If the resource-usage-request is
       accepted, the ResourceElement(s) listed here will be placed
       into 'protected' mode as to prevent accidental removal.

       If any of these resource elements become 'unavailable' due to
       environmental or administrative activity, a notification will
       be issued informing of the issue.";
    }
  }

  leaf zone {
    description "Optional identifier to an Availability Zone";
    type instance-identifier { ct:instance-type nfvi:AvailabilityZone; }
  }
}

grouping query-start-end-window {
  container window {
    description "Matches entries that are within the specified start/end time window";
    leaf start { type yang:date-and-time; }
    leaf end   { type yang:date-and-time; }
    leaf scope {
      type enumeration {
        enum "exclusive" {
          description "Matches entries that start AND end within the window";
        }
        enum "inclusive" {
          description "Matches entries that start OR end within the window";
        }
      }
      default "inclusive";
    }
  }
}

grouping query-resource-collection {
  uses query-start-end-window {
    description "Match for ResourceCollection(s) that are within the specified
                 start/end time window";
  }
  leaf-list without {
    description "Excludes specified collection identifiers from the result";
    type instance-identifier { ct:instance-type ResourceCollection; }
  }
  leaf show-utilization { type boolean; default true; }
  container elements {
    leaf-list some {
      description "Query for ResourceCollection(s) that contain some or more of
                   these element(s)";
      type instance-identifier { ct:instance-type nfvi:ResourceElement; }
    }
    leaf-list every {
      description "Query for ResourceCollection(s) that contain all of
                   these element(s)";
      type instance-identifier { ct:instance-type nfvi:ResourceElement; }
    }
  }
}

grouping common-intent-output {
  leaf result {
    type enumeration {
      enum "ok";
      enum "conflict";
      enum "error";
    }
  }
  leaf message { type string; }
}

grouping utilization-output {
  list utilization {
    key 'timestamp';
    leaf timestamp { type yang:date-and-time; }
    leaf count { type int16; }
    container capacity { uses nfvi:resource-capacity; }
  }
}

ct:complex-type ResourceCollection {
  ct:extends nfvi:ResourceContainer;
  ct:abstract true;

  description
    "Describes an abstract ResourceCollection data model, which represents
     a grouping of capacity and elements available during a given
     window in time which must be extended by other resource
     collection related models";

  leaf start { type yang:date-and-time; }
  leaf end   { type yang:date-and-time; }

  leaf active {
    config false;
    description
      "Provides current state of this record whether it is enabled and within
       specified start/end time";
    type boolean;
  }
}

ct:complex-type ResourcePool {
  ct:extends ResourceCollection;

  description
    "Describes an instance of an active ResourcePool record, which
     represents total available capacity and elements from a given
     source.";

  leaf source {
    type instance-identifier {
      ct:instance-type nfvi:ResourceContainer;
      require-instance true;
    }
    mandatory true;
  }

  refine elements {
    // following 'must' statement applies to each element
    // NOTE: just a non-working example for now...
    must "boolean(/source/elements/*[@id=id])" {
      error-message "One or more of the ResourceElement(s) does not exist in
                     the provider to be reserved";
    }
  }
}

ct:complex-type ResourceReservation {
  ct:extends ResourceCollection;

  description
    "Describes an instance of an accepted resource reservation request,
     created usually as a result of 'create-reservation' request.

     A ResourceReservation is a derived instance of a generic
     ResourceCollection which has additional parameters to map the
     pool(s) that were referenced to accept this reservation as well
     as to track allocations made referencing this reservation.

     Contains the capacities of various resource attributes being
     reserved along with any resource elements that are needed to be
     available at the time of allocation(s).";

  reference "OPNFV-PROMISE, Section 3.4.1";

  leaf created-on  { type yang:date-and-time; config false; }
  leaf modified-on { type yang:date-and-time; config false; }

  leaf-list pools {
    config false;
    description
      "Provides list of one or more pools that were referenced for providing
       the requested resources for this reservation.  This is an
       important parameter for informing how/where allocation
       requests can be issued using this reservation since it is
       likely that the total reserved resource capacity/elements are
       made available from multiple sources.";
    type instance-identifier {
      ct:instance-type ResourcePool;
      require-instance true;
    }
  }

  container remaining {
    config false;
    description
      "Provides visibility into total remaining capacity for this
       reservation based on allocations that took effect utilizing
       this reservation ID as a reference.";

    uses nfvi:resource-capacity;
  }

  leaf-list allocations {
    config false;
    description
      "Reference to a collection of consumed allocations referencing
       this reservation.";
    type instance-identifier {
      ct:instance-type ResourceAllocation;
      require-instance true;
    }
  }
}

ct:complex-type ResourceAllocation {
  ct:extends ResourceCollection;

  description
    "A ResourceAllocation record denotes consumption of resources from a
     referenced ResourcePool.

     It does not reflect an accepted request but is created to
     represent the actual state about the ResourcePool. It is
     created once the allocation(s) have successfully taken effect
     on the 'source' of the ResourcePool.

     The 'priority' state indicates the classification for dealing
     with resource starvation scenarios. Lower priority allocations
     will be forcefully terminated to allow for higher priority
     allocations to be fulfilled.

     Allocations without reference to an existing reservation will
     receive the lowest priority.";

  reference "OPNFV-PROMISE, Section 3.4.3";

  leaf reservation {
    description "Reference to an existing reservation identifier (optional)";

    type instance-identifier {
      ct:instance-type ResourceReservation;
      require-instance true;
    }
  }

  leaf pool {
    description "Reference to an existing resource pool from which allocation is drawn";

    type instance-identifier {
      ct:instance-type ResourcePool;
      require-instance true;
    }
  }

  container instance-ref {
    config false;
    description
      "Reference to actual instance identifier of the provider/server
      for this allocation";
    leaf provider {
      type instance-identifier { ct:instance-type ResourceProvider; }
    }
    leaf server { type yang:uuid; }
  }

  leaf priority {
    config false;
    description
      "Reflects current priority level of the allocation according to
       classification rules";
    type enumeration {
      enum "high"   { value 1; }
      enum "normal" { value 2; }
      enum "low"    { value 3; }
    }
    default "normal";
  }
}

ct:complex-type ResourceProvider {
  ct:extends nfvi:ResourceContainer;

  key "name";
  leaf token { type string; mandatory true; }

  container services { // read-only
    config false;
    container compute {
      leaf endpoint { type inet:uri; }
      ct:instance-list flavors { ct:instance-type nfvi:ComputeFlavor; }
    }
  }

  leaf-list pools {
    config false;
    description
      "Provides list of one or more pools that are referencing this provider.";

    type instance-identifier {
      ct:instance-type ResourcePool;
      require-instance true;
    }
  }
}

// MAIN CONTAINER
container promise {

  uses resource-utilization {
    description "Describes current state info about capacity utilization info";

    augment "capacity/total"     { uses nfvi:resource-capacity; }
    augment "capacity/reserved"  { uses nfvi:resource-capacity; }
    augment "capacity/usage"     { uses nfvi:resource-capacity; }
    augment "capacity/available" { uses nfvi:resource-capacity; }
  }

  ct:instance-list providers {
    if-feature multi-provider;
    description "Aggregate collection of all registered ResourceProvider instances
                 for Promise resource management service";
    ct:instance-type ResourceProvider;
  }

  ct:instance-list pools {
    if-feature reservation-service;
    description "Aggregate collection of all ResourcePool instances";
    ct:instance-type ResourcePool;
  }

  ct:instance-list reservations {
    if-feature reservation-service;
    description "Aggregate collection of all ResourceReservation instances";
    ct:instance-type ResourceReservation;
  }

  ct:instance-list allocations {
    description "Aggregate collection of all ResourceAllocation instances";
    ct:instance-type ResourceAllocation;
  }

  container policy {
    container reservation {
      leaf max-future-start-range {
        description
          "Enforce reservation request 'start' time is within allowed range from now";
        type uint16 { range 0..365; }
        units "days";
      }
      leaf max-future-end-range {
        description
          "Enforce reservation request 'end' time is within allowed range from now";
        type uint16 { range 0..365; }
        units "days";
      }
      leaf max-duration {
        description
          "Enforce reservation duration (end-start) does not exceed specified threshold";
        type uint16;
        units "hours";
        default 8760; // for now cap it at max one year as default
      }
      leaf expiry {
        description
          "Duration in minutes from start when unallocated reserved resources
           will be released back into the pool";
        type uint32;
        units "minutes";
      }
    }
  }
}

//-------------------
// INTENT INTERFACE
//-------------------

// RESERVATION INTENTS
rpc create-reservation {
  if-feature reservation-service;
  description "Make a request to the reservation system to reserve resources";
  input {
    uses resource-usage-request;
  }
  output {
    uses common-intent-output;
    leaf reservation-id {
      type instance-identifier { ct:instance-type ResourceReservation; }
    }
  }
}

rpc update-reservation {
  description "Update reservation details for an existing reservation";
  input {
    leaf reservation-id {
      type instance-identifier {
        ct:instance-type ResourceReservation;
        require-instance true;
      }
      mandatory true;
    }
    uses resource-usage-request;
  }
  output {
    uses common-intent-output;
  }
}

rpc cancel-reservation {
  description "Cancel the reservation and be a good steward";
  input {
    leaf reservation-id {
      type instance-identifier { ct:instance-type ResourceReservation; }
      mandatory true;
    }
  }
  output {
    uses common-intent-output;
  }
}

rpc query-reservation {
  if-feature reservation-service;
  description "Query the reservation system to return matching reservation(s)";
  input {
    leaf zone { type instance-identifier { ct:instance-type nfvi:AvailabilityZone; } }
    uses query-resource-collection;
  }
  output {
    leaf-list reservations { type instance-identifier
                             { ct:instance-type ResourceReservation; } }
    uses utilization-output;
  }
}

// CAPACITY INTENTS
rpc increase-capacity {
  description "Increase total capacity for the reservation system
               between a window in time";
  input {
    uses temporal-resource-collection;
    leaf source {
      type instance-identifier {
        ct:instance-type nfvi:ResourceContainer;
      }
    }
  }
  output {
    uses common-intent-output;
    leaf pool-id {
      type instance-identifier { ct:instance-type ResourcePool; }
    }
  }
}

rpc decrease-capacity {
  description "Decrease total capacity for the reservation system
               between a window in time";
  input {
    uses temporal-resource-collection;
    leaf source {
      type instance-identifier {
        ct:instance-type nfvi:ResourceContainer;
      }
    }
  }
  output {
    uses common-intent-output;
    leaf pool-id {
      type instance-identifier { ct:instance-type ResourcePool; }
    }
  }
}

rpc query-capacity {
  description "Check available capacity information about a specified
               resource collection";
  input {
    leaf capacity {
      type enumeration {
        enum 'total';
        enum 'reserved';
        enum 'usage';
        enum 'available';
      }
      default 'available';
    }
    leaf zone { type instance-identifier { ct:instance-type nfvi:AvailabilityZone; } }
    uses query-resource-collection;
    // TBD: additional parameters for query-capacity
  }
  output {
    leaf-list collections { type instance-identifier
                            { ct:instance-type ResourceCollection; } }
    uses utilization-output;
  }
}

// ALLOCATION INTENTS (should go into VIM module in the future)
rpc create-instance {
  description "Create an instance of specified resource(s) utilizing capacity
               from the pool";
  input {
    leaf provider-id {
      if-feature multi-provider;
      type instance-identifier { ct:instance-type ResourceProvider;
                                 require-instance true; }
    }
    leaf name   { type string; mandatory true; }
    leaf image  {
      type reference-identifier;
      mandatory true;
    }
    leaf flavor {
      type reference-identifier;
      mandatory true;
    }
    leaf-list networks {
      type reference-identifier;
      description "optional, will assign default network if not provided";
    }

    // TODO: consider supporting a template-id (such as HEAT) for more complex instantiation

    leaf reservation-id {
      type instance-identifier { ct:instance-type ResourceReservation;
                                 require-instance true; }
    }
  }
  output {
    uses common-intent-output;
    leaf instance-id {
      type instance-identifier { ct:instance-type ResourceAllocation; }
    }
  }
}

rpc destroy-instance {
  description "Destroy an instance of resource utilization and release it
               back to the pool";
  input {
    leaf instance-id {
      type instance-identifier { ct:instance-type ResourceAllocation;
                                 require-instance true; }
    }
  }
  output {
    uses common-intent-output;
  }
}

// PROVIDER INTENTS (should go into VIM module in the future)
rpc add-provider {
  description "Register a new resource provider into reservation system";
  input {
    leaf provider-type {
      description "Select a specific resource provider type";
      mandatory true;
      type enumeration {
        enum openstack;
        enum hp;
        enum rackspace;
        enum amazon {
          status planned;
        }
        enum joyent {
          status planned;
        }
        enum azure {
          status planned;
        }
      }
      default openstack;
    }
    uses mano:provider-credentials {
      refine endpoint {
        default "http://localhost:5000/v2.0/tokens";
      }
    }
    container tenant {
      leaf id { type string; }
      leaf name { type string; }
    }
  }
  output {
    uses common-intent-output;
    leaf provider-id {
      type instance-identifier { ct:instance-type ResourceProvider; }
    }
  }
}

// TODO...
notification reservation-event;
notification capacity-event;
notification allocation-event;
}
12. ANNEX C: Supported APIS
12.1. Add Provider

Register a new resource provider (e.g. OpenStack) into reservation system.

Request parameters

Name Type Description
provider-type Enumeration Name of the resource provider
endpoint URI Target URL endpoint for the resource provider
username String User name
password String Password
region String Specified region for the provider
tenant.id String Id of the tenant
tenant.name String Name of the tenant

Response parameters

Name Type Description
provider-id String Id of the new resource provider
result Enumeration Result info
POST /add-provider

Example request:

POST /add-provider HTTP/1.1
Accept: application/json

{
  "provider-type": "openstack",
  "endpoint": "http://10.0.2.15:5000/v2.0/tokens",
  "username": "promise_user",
  "password": "******",
  "tenant": {
     "name": "promise"
  }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "provider-id": "f25ed9cb-de57-43d5-9b4a-a389a1397302",
  "result": "ok"
}
12.2. Create Reservation

Make a request to the reservation system to reserve resources.

Request parameters

Name Type Description
zone String Id to an availability zone
start DateTime Timestamp when the consumption of reserved resources can begin
end DateTime Timestamp when the consumption of reserved resources should end
capacity.cores int16 Amount of cores to be reserved
capacity.ram int32 Amount of RAM to be reserved
capacity.instances int16 Amount of instances to be reserved
capacity.addresses int32 Amount of public IP addresses to be reserved
elements ResourceElement List of pre-existing resource elements to be reserved

Response parameters

Name Type Description
reservation-id String Id of the reservation
result Enumeration Result info
message String Output message
POST /create-reservation

Example request:

POST /create-reservation HTTP/1.1
Accept: application/json

{
   "capacity": {
      "cores": "5",
      "ram": "25600",
      "addresses": "3",
      "instances": "3"
   },
   "start": "2016-02-02T00:00:00Z",
   "end": "2016-02-03T00:00:00Z"
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "reservation-id": "269b2944-9efc-41e0-b067-6898221e8619",
  "result": "ok",
  "message": "reservation request accepted"
}
12.3. Update Reservation

Update reservation details for an existing reservation.

Request parameters

Name Type Description
reservation-id String Id of the reservation to be updated
zone String Id to an availability zone
start DateTime Updated timestamp when the consumption of reserved resources can begin
end DateTime Updated timestamp when the consumption of reserved resources should end
capacity.cores int16 Updated amount of cores to be reserved
capacity.ram int32 Updated amount of RAM to be reserved
capacity.instances int16 Updated amount of instances to be reserved
capacity.addresses int32 Updated amount of public IP addresses to be reserved
elements ResourceElement Updated list of pre-existing resource elements to be reserved

Response parameters

Name Type Description
result Enumeration Result info
message String Output message
POST /update-reservation

Example request:

POST /update-reservation HTTP/1.1
Accept: application/json

{
   "reservation-id": "269b2944-9efv-41e0-b067-6898221e8619",
   "capacity": {
      "cores": "1",
      "ram": "5120",
      "addresses": "1",
      "instances": "1"
   }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "result": "ok",
  "message": "reservation update successful"
}
12.4. Cancel Reservation

Cancel the reservation.

Request parameters

Name Type Description
reservation-id String Id of the reservation to be canceled

Response parameters

Name Type Description
result Enumeration Result info
message String Output message
POST /cancel-reservation

Example request:

POST /cancel-reservation HTTP/1.1
Accept: application/json

{
  "reservation-id": "269b2944-9efv-41e0-b067-6898221e8619"
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "result": "ok",
  "message": "reservation canceled"
}
12.5. Query Reservation

Query the reservation system to return matching reservation(s).

Request parameters

Name Type Description
zone String Id to an availability zone
show-utilization Boolean Show capacity utilization
without ResourceCollection Excludes specified collection identifiers from the result
elements.some ResourceElement Query for ResourceCollection(s) that contain some or more of these element(s)
elements.every ResourceElement Query for ResourceCollection(s) that contain all of these element(s)
window.start DateTime Matches entries that are within the specified start/end window
window.end DateTime  
window.scope Enumeration Matches entries that start {and/or} end within the time window

Response parameters

Name Type Description
reservations ResourceReservation List of matching reservations
utilization CapacityUtilization Capacity utilization over time
POST /query-reservation

Example request:

POST /query-reservation HTTP/1.1
Accept: application/json

{
   "show-utilization": false,
   "window": {
      "start": "2016-02-01T00:00:00Z",
      "end": "2016-02-04T00:00:00Z"
   }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "reservations": [
    "269b2944-9efv-41e0-b067-6898221e8619"
  ],
  "utilization": []
}
12.6. Create Instance

Create an instance of specified resource(s) utilizing capacity from the pool.

Request parameters

Name Type Description
provider-id String Id of the resource provider
reservation-id String Id of the resource reservation
name String Name of the instance
image String Id of the image
flavor String Id of the flavor
networks Uuid List of network uuids

Response parameters

Name Type Description
instance-id String Id of the instance
result Enumeration Result info
message String Output message
POST /create-instance

Example request:

POST /create-instance HTTP/1.1
Accept: application/json

{
  "provider-id": "f25ed9cb-de57-43d5-9b4a-a389a1397302",
  "name": "vm1",
  "image": "ddffc6f5-5c86-4126-b0fb-2c71678633f8",
  "flavor": "91bfdf57-863b-4b73-9d93-fc311894b902"
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "instance-id": "82572779-896b-493f-92f6-a63008868250",
  "result": "ok",
  "message": "created-instance request accepted"
}
12.7. Destroy Instance

Destroy an instance of resource utilization and release it back to the pool.

Request parameters

Name Type Description
instance-id String Id of the instance to be destroyed

Response parameters

Name Type Description
result Enumeration Result info
message String Output message
POST /destroy-instance

Example request:

POST /destroy-instance HTTP/1.1
Accept: application/json

{
   "instance-id": "82572779-896b-493f-92f6-a63008868250"
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "result": "ok",
  "message": "instance destroyed and resource released back to pool"
}
12.8. Decrease Capacity

Decrease total capacity for the reservation system for a given time window.

Request parameters

Name Type Description
source String Id of the resource container
start DateTime Start/end defines the time window when total capacity is decreased
end DateTime  
capacity.cores int16 Decreased amount of cores
capacity.ram int32 Decreased amount of RAM
capacity.instances int16 Decreased amount of instances
capacity.addresses int32 Decreased amount of public IP addresses

Response parameters

Name Type Description
pool-id String Id of the resource pool
result Enumeration Result info
message String Output message
POST /decrease-capacity

Example request:

POST /decrease-capacity HTTP/1.1
Accept: application/json

{
   "source": "ResourcePool:4085f0da-8030-4252-a0ff-c6f93870eb5f",
   "capacity": {
      "cores": "3",
      "ram": "5120",
      "addresses": "1"
   }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
   "pool-id": "c63b2a41-bcc6-42f6-8254-89d633e1bd0b",
   "result": "ok",
   "message": "capacity decrease successful"
}
12.9. Increase Capacity

Increase total capacity for the reservation system for a given time window.

Request parameters

Name Type Description
source String Id of the resource container
start DateTime Start/end defines the time window when total capacity is increased
end DateTime  
capacity.cores int16 Increased amount of cores
capacity.ram int32 Increased amount of RAM
capacity.instances int16 Increased amount of instances
capacity.addresses int32 Increased amount of public IP addresses

Response parameters

Name Type Description
pool-id String Id of the resource pool
result Enumeration Result info
message String Output message
POST /increase-capacity

Example request:

POST /increase-capacity HTTP/1.1
Accept: application/json

{
   "source": "ResourceProvider:f6f13fe3-0126-4c6d-a84f-15f1ab685c4f",
   "capacity": {
       "cores": "20",
       "ram": "51200",
       "instances": "10",
       "addresses": "10"
   }
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
   "pool-id": "279217a4-7461-4176-bf9d-66770574ca6a",
   "result": "ok",
   "message": "capacity increase successful"
}
12.10. Query Capacity

Query for capacity information about a specified resource collection.

Request parameters

Name Type Description
capacity Enumeration Return total or reserved or available or usage capacity information
zone String Id to an availability zone
show-utilization Boolean Show capacity utilization
without ResourceCollection Excludes specified collection identifiers from the result
elements.some ResourceElement Query for ResourceCollection(s) that contain some or more of these element(s)
elements.every ResourceElement Query for ResourceCollection(s) that contain all of these element(s)
window.start DateTime Matches entries that are within the specified start/end window
window.end DateTime  
window.scope Enumeration Matches entries that start {and/or} end within the time window

Response parameters

Name Type Description
collections ResourceCollection List of matching collections
utilization CapacityUtilization Capacity utilization over time
POST /query-capacity

Example request:

POST /query-capacity HTTP/1.1
Accept: application/json

{
  "show-utilization": false
}

Example response:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "collections": [
    "ResourcePool:279217a4-7461-4176-bf9d-66770574ca6a"
  ],
  "utilization": []
}
Promise installation and configuration
Promise installation

Install nodejs, npm and promise

curl -sL https://deb.nodesource.com/setup_4.x | sudo -E bash -
sudo apt-get install -y nodejs
sudo npm -g install npm@latest
git clone https://gerrit.opnfv.org/gerrit/promise
cd promise/source
npm install

Please note that the last command, ‘npm install’, will install all needed dependencies for Promise (including yangforge and mocha).

Validation

Please perform the following preparation steps:

  1. Set OpenStack environment parameters properly (e.g. source openrc admin demo in DevStack)
  2. Create OpenStack tenant (e.g. promise) and tenant user (e.g. promiser)
  3. Create a flavor in Nova with 1 vCPU and 512 MB RAM
  4. Create a private network, subnet and router in Neutron
  5. Create an image in Glance

Once done, the promise test script can be invoked as follows (as a single line command):

NODE_ENV=mytest \
OS_TENANT_NAME=promise \
OS_USERNAME=promiser \
OS_PASSWORD=<user password from Step 2> \
OS_TEST_FLAVOR=<flavor ID from Step 3> \
OS_TEST_NETWORK=<network ID from Step 4> \
OS_TEST_IMAGE=<image ID from Step 5> \
npm run -s test -- --reporter json > promise-results.json

The results of the tests will be stored in the promise-results.json file.

The results can also be seen in the console by running “npm run -s test”.


If all 33 tests pass, Promise has been successfully installed and configured.

Promise user guide
Abstract

Promise is a resource reservation and management project that identifies NFV-related requirements and realizes resource reservation for future usage through capacity management of compute, network and storage resource pools.

The following are the key features provided by this module:

  • Capacity Management
  • Reservation Management
  • Allocation Management

The following sections provide details on the Promise capabilities and its API usage.

Promise capabilities and usage

The Danube implementation of Promise is built with the YangForge data modeling framework [2] , using a shim-layer on top of OpenStack to provide the Promise features. This approach requires communication between Consumers/Administrators and OpenStack to pass through the shim-layer. The shim-layer intercepts the message flow to manage the allocation requests based on existing reservations and available capacities in the providers. It also extracts information from the intercepted messages in order to update its internal databases. Furthermore, Promise provides additional intent-based APIs to allow a Consumer or Administrator to perform capacity management (i.e. add providers, update the capacity, and query the current capacity and utilization of a provider), reservation management (i.e. create, update, cancel, query reservations), and allocation management (i.e. create, destroy instances).

Detailed information about Promise use cases, features, interface specifications, work flows, and the underlying Promise YANG schema can be found in the Promise requirement document [1] .

Promise features and API usage guidelines and examples

This section lists the Promise features and API implemented in OPNFV Danube.

Note: The listed parameters are optional unless explicitly marked as “mandatory”.

Reservation management

Reservation management allows a Consumer to request reservations for resource capacity. Reservations can be for now or for a later time window. After the start time of a reservation has arrived, the Consumer can issue create server instance requests against the reserved capacity. Note that a reservation will expire after a predefined expiry time if no allocation referring to the reservation is requested.

The implemented workflow is well aligned with the described workflow in the Promise requirement document [1] (Section 6.1) except for the “multi-provider” scenario as described in (Multi-)provider management .

create-reservation

This operation allows making a request to the reservation system to reserve resources.

The operation takes the following input parameters:

  • start: start time of the requested reservation
  • end: end time of the requested reservation
  • capacity.instances: amount of instances to be reserved
  • capacity.cores: amount of cores to be reserved
  • capacity.ram: amount of ram in MB to be reserved

Promise checks the available capacity in the given time window and, if sufficient capacity exists to meet the reservation request, marks those resources as “reserved” in its reservation map.
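
As a rough illustration, assuming the shim-layer's northbound REST API is reachable at http://localhost:8080 (the actual address depends on the deployment), a reservation request could be issued as follows; the payload mirrors the create-reservation example in the Promise requirement document [1]:

import requests

resp = requests.post("http://localhost:8080/create-reservation", json={  # assumed endpoint
    "start": "2016-02-02T00:00:00Z",
    "end": "2016-02-03T00:00:00Z",
    "capacity": {"cores": "5", "ram": "25600", "instances": "3"}
})
print(resp.json())  # e.g. {"reservation-id": "...", "result": "ok", "message": "reservation request accepted"}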

update-reservation

This operation updates the reservation details of an existing reservation.

It takes the same input parameters as create-reservation but additionally requires a mandatory reference to the reservation-id of the reservation to be updated.
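
For example, shrinking the reservation created above to a single instance could look like this (same endpoint assumption as before):

import requests

requests.post("http://localhost:8080/update-reservation", json={  # assumed endpoint
    "reservation-id": "269b2944-9efc-41e0-b067-6898221e8619",     # id returned by create-reservation
    "capacity": {"cores": "1", "ram": "5120", "instances": "1"}
})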

cancel-reservation

This operation is used to cancel an existing reservation.

The operation takes the following input parameter:

  • reservation-id (mandatory): identifier of the reservation to be canceled.
query-reservation

The operation queries the reservation system to return reservation(s) matching the specified query filter, e.g., reservations that are within a specified start/end time window.

The operation takes the following input parameters to narrow down the query results (a query sketch is shown after the list):

  • without: excludes specified collection identifiers from the result
  • elements.some: query for ResourceCollection(s) that contain some or more of these element(s)
  • elements.every: query for ResourceCollection(s) that contain all of these element(s)
  • window.start / window.end: matches entries that are within the specified start/end time window
  • window.scope: if set to ‘exclusive’, only reservations with start AND end time within the time window are returned. Otherwise (‘inclusive’), all reservations starting OR ending in the time window are returned.
  • show-utilization: boolean value that specifies whether to also return the resource utilization in the queried time window
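
A minimal query sketch, again assuming the shim-layer endpoint used in the previous examples, returning all reservations that overlap a two-day window:

import requests

result = requests.post("http://localhost:8080/query-reservation", json={  # assumed endpoint
    "show-utilization": False,
    "window": {"start": "2016-02-01T00:00:00Z",
               "end": "2016-02-04T00:00:00Z",
               "scope": "inclusive"}
}).json()
print(result["reservations"])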
Allocation management
create-instance

This operation is used to create an instance of specified resource(s) for immediate use, utilizing capacity from the pool. Create-instance requests can be issued against an existing reservation, but allocations without a reference to an existing reservation are also allowed. In case the allocation request specifies a reservation identifier, Promise checks that a reservation with that ID exists, that the reservation start time has arrived (i.e. the reservation is ‘active’), and that the required capacity for the requested flavor is within the available capacity of the reservation. If those conditions are met, Promise creates a record for the allocation (VMState=”INITIALIZED”) and updates its databases. If no reservation_id was provided in the allocation request, Promise checks whether the required capacity to meet the request can be provided from the available, non-reserved capacity. If so, Promise creates a record for the allocation with a unique instance-id and updates its databases. In any other case, Promise rejects the create-instance request.

In case the create-instance request is rejected, Promise responds with “status=rejected”, providing the reason for the rejection. This helps the Consumer to take appropriate actions, e.g., send an updated create-instance request. In case the create-instance request was accepted and a related allocation record has been created, the shim-layer issues a createServer request to the VIM Controller (i.e. Nova), providing all information needed to create the server instance. A request sketch is shown after the parameter list below.

The operation takes the following input parameters:

  • name (mandatory): Assigned name for the instance to be created
  • image (mandatory): the image to be booted in the new instance
  • flavor (mandatory): the flavor of the requested server instance
  • networks: the list of network uuids of the requested server instance
  • provider-id: identifier of the provider where the instance shall be created
  • reservation-id: identifier of a resource reservation the create-instance request shall be issued against
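
A sketch of an allocation request against an active reservation, using the same assumed endpoint as in the previous examples; the image and flavor identifiers are placeholders:

import requests

resp = requests.post("http://localhost:8080/create-instance", json={  # assumed endpoint
    "name": "vm1",
    "image": "<glance-image-id>",   # placeholder
    "flavor": "<nova-flavor-id>",   # placeholder
    "reservation-id": "269b2944-9efc-41e0-b067-6898221e8619"
}).json()
if resp["result"] != "ok":
    print("allocation rejected:", resp.get("message"))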

The Danube implementation of Promise has the following limitations:

  • All create server instance requests shall pass through the Promise shim-layer such that Promise can keep track of all allocation requests. This is necessary as, in the current release, the synchronization between the VIM Controller and Promise on the available capacity is not yet implemented.
  • Create-allocation requests are limited to “simple” allocations, i.e., the current workflow only supports the Nova compute service and create-allocation requests are limited to creating one server instance at a time.
  • Prioritization of reservations and allocations is not yet implemented. Future versions may allow certain policy-based conflict resolution where, e.g., a new allocation request with high priority can “forcefully” terminate lower priority allocations.
destroy-instance

This operation requests to destroy an existing server instance and release it back to the pool.

The operation takes the following input parameter:

  • instance-id: identifier of the server instance to be destroyed
Capacity management

The capacity management feature allows the Consumer or Administrator to do capacity planning, i.e. the capacity available to the reservation management can differ from the actual capacity in the registered provider(s). This feature can, e.g., be used to limit the available capacity for a given time window due to a planned downtime of some of the resources, or increase the capacity available to the reservation system in case of a planned upgrade of the available capacity.

increase/decrease-capacity

These operations allow increasing/decreasing the total capacity that is made available to the Promise reservation service within a specified time window. They do NOT increase the actual capacity of a given resource provider, but are used for capacity management inside Promise.

This feature can be used in different ways, like

  • Limit the capacity available to the reservation system to a value below 100% of the available capacity in the VIM, e.g., in order to leave “buffer” in the actual NFVI to be used outside the Promise reservation service.
  • Inform the reservation system that, from a given time in the future, additional resources can be reserved, e.g., due to a planned upgrade of the available capacity of the provider.
  • Similarly, the “decrease-capacity” operation can be used to reduce the consumable resources in a given time window, e.g., to prepare for a planned downtime of some of the resources.
  • Expose multiple reservation service instances to different consumers sharing the same resource provider.

The operation takes the following input parameters:

  • start: start time for the increased/decreased capacity
  • end: end time for the increased/decreased capacity
  • capacity.cores: Increased/decreased amount of cores
  • capacity.ram: Increased/decreased amount of RAM
  • capacity.instances: Increased/decreased amount of instances

Note that increasing/decreasing the capacity in Promise is completely transparent to the VIM. As such, when increasing the virtual capacity in Promise (e.g. for a planned upgrade of the capacity), it is the responsibility of the Consumer/Administrator to ensure that sufficient resources are available in the VIM at the appropriate time, in order to prevent allocations against reservations from failing due to a lack of resources. Therefore, these operations should be used with care.
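
As an illustration, reducing the capacity offered to the reservation system by three cores for a maintenance window could look as follows (the endpoint and the source pool identifier are assumptions):

import requests

requests.post("http://localhost:8080/decrease-capacity", json={  # assumed endpoint
    "source": "ResourcePool:<pool-id>",                          # placeholder pool identifier
    "start": "2016-03-01T00:00:00Z",
    "end": "2016-03-02T00:00:00Z",
    "capacity": {"cores": "3"}
})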

query-capacity

This operation is used to query the available capacity information of the specified resource collection. A filter attribute can be specified to narrow down the query results.

The current implementation supports the following filter criteria (a query sketch is shown after the list):

  • time window: returns reservations matching the specified window
  • window scope: if set to ‘exclusive’, only reservations with start AND end time within the time window are returned. Otherwise, all reservations starting OR ending in the time window are returned.
  • metric: query for one of the following capacity metrics:
    • ‘total’: resource pools
    • ‘reserved’: reserved resources
    • ‘usage’: resource allocations
    • ‘available’: remaining capacity, i.e. neither reserved nor allocated
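
A minimal sketch querying the remaining (neither reserved nor allocated) capacity, under the same endpoint assumption as in the previous examples:

import requests

result = requests.post("http://localhost:8080/query-capacity", json={  # assumed endpoint
    "capacity": "available",
    "show-utilization": True
}).json()
for sample in result.get("utilization", []):
    print(sample["timestamp"], sample["capacity"])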
(Multi-)provider management

This API towards OpenStack allows a Consumer/Administrator to add and remove resource providers in Promise. Note that Promise is designed to support a multi-provider configuration; however, for Danube, multi-provider support is not yet complete.

add-provider

This operation is used to register a new resource provider into the Promise reservation system.

Note, for Danube, the add-provider operation should only be used to register one provider with the Promise shim-layer. Further note that currently only OpenStack is supported as a provider.

The operation takes the following input parameters (a registration sketch is shown after the list):

  • provider-type (mandatory) = ‘openstack’: select a specific resource provider type.
  • endpoint (mandatory): target URL endpoint for the resource provider.
  • username (mandatory)
  • password (mandatory)
  • region: specified region for the provider
  • tenant.id: id of the OpenStack tenant/project
  • tenant.name: name of the OpenStack tenant/project
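
A registration sketch for a single OpenStack provider; the Keystone endpoint, credentials and tenant name below are examples only and must match the actual OpenStack installation (shim-layer endpoint assumption as in the previous examples):

import requests

resp = requests.post("http://localhost:8080/add-provider", json={  # assumed shim-layer endpoint
    "provider-type": "openstack",
    "endpoint": "http://10.0.2.15:5000/v2.0/tokens",  # Keystone token endpoint of the provider
    "username": "promise_user",
    "password": "********",
    "tenant": {"name": "promise"}
}).json()
print(resp["provider-id"])
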
[1] Promise requirement document, http://artifacts.opnfv.org/promise/docs/requirements/index.html
[2] YangForge framework, http://github.com/opnfv/yangforge

SDNVPN


SDN VPN
Introduction

This document will provide an overview of how to work with the SDN VPN features in OPNFV.

SDN VPN feature description

A high-level description of the scenarios is provided in this section. For details of the scenarios and their provided capabilities refer to the scenario description document: http://artifacts.opnfv.org/danube/sdnpvn/scenarios/os-odl_l2-bgpvpn/index.html

The BGPVPN feature enables creation of BGP VPNs on the Neutron API according to the OpenStack BGPVPN blueprint at https://blueprints.launchpad.net/neutron/+spec/neutron-bgp-vpn. In a nutshell, the blueprint defines a BGPVPN object and a number of ways how to associate it with the existing Neutron object model, as well as a unique definition of the related semantics. The BGPVPN framework supports a backend driver model with currently available drivers for Bagpipe, OpenContrail, Nuage and OpenDaylight. The OPNFV scenario makes use of the OpenDaylight driver and backend implementation through the ODL NetVirt project.

Hardware requirements

The SDNVPN scenarios can be deployed as a bare-metal or a virtual environment on a single host.

Bare metal deployment on Pharos Lab

Hardware requirements for bare-metal deployments of the OPNFV infrastructure are specified by the Pharos project. The Pharos project provides an OPNFV hardware specification for configuring your hardware at: http://artifacts.opnfv.org/pharos/docs/pharos-spec.html.

Virtual deployment hardware requirements

To perform a virtual deployment of an OPNFV scenario on a single host, that host has to meet the hardware requirements outlined in the <missing spec>.

When ODL is used as an SDN Controller in an OPNFV virtual deployment, ODL is running on the OpenStack Controller VMs. It is therefore recommended to increase the amount of resources for these VMs.

Our recommendation is to have 2 additional virtual cores and 8GB additional virtual memory on top of the normally recommended configuration.

Together with the commonly used recommendation this sums up to:

6 virtual cores
16 GB virtual memory

See in Installation section below how to configure this.

[FUEL] Preparing the host to install Fuel by script

Before starting the installation of the os-odl_l2-bgpvpn scenario, some preparation of the machine that will host the Fuel VM must be done.

[FUEL] Installation of required packages

To be able to run the basic OPNFV Fuel installation, the Jumphost (or the host which serves the VMs for the virtual deployment) needs the following packages installed:

sudo apt-get install -y git make curl libvirt-bin libpq-dev qemu-kvm \
                        qemu-system tightvncserver virt-manager sshpass \
                        fuseiso genisoimage blackbox xterm python-pip \
                        python-git python-dev python-oslo.config \
                        libffi-dev libxml2-dev libxslt1-dev \
                        expect python-netaddr p7zip-full

sudo pip install GitPython pyyaml netaddr paramiko lxml scp \
                 python-novaclient python-neutronclient python-glanceclient \
                 python-keystoneclient debtcollector netifaces enum
[FUEL] Download the source code and artifact

To be able to install the scenario os-odl_l2-bgpvpn one can follow the way CI is deploying the scenario. First of all the opnfv-fuel repository needs to be cloned:

git clone ssh://<user>@gerrit.opnfv.org:29418/fuel

This command downloads the whole fuel repository. To check out a specific version of OPNFV, check out the appropriate branch:

cd fuel
git checkout stable/<colorado|danube>

Now download the corresponding OPNFV Fuel ISO into an appropriate folder from the website

https://www.opnfv.org/software/downloads/release-archives

Keep in mind that the fuel repository version needs to match the downloaded artifact.

[FUEL] Simplified scenario deployment procedure using Fuel

This section describes the installation of the os-odl_l2-bgpvpn-ha or os-odl_l2-bgpvpn-noha OPNFV reference platform stack across a server cluster or a single host as a virtual deployment.

[FUEL] Scenario Preparation

dea.yaml and dha.yaml need to be copied and changed according to the lab-name/host where you deploy. Copy the full lab config from:

cp <path-to-opnfv-fuel-repo>/deploy/config/labs/devel-pipeline/elx \
   <path-to-opnfv-fuel-repo>/deploy/config/labs/devel-pipeline/<your-lab-name>

Add at the bottom of dha.yaml

disks:
  fuel: 100G
  controller: 100G
  compute: 100G

define_vms:
  controller:
    vcpu:
      value: 4
    memory:
      attribute_equlas:
        unit: KiB
      value: 16388608
    currentMemory:
      attribute_equlas:
        unit: KiB
      value: 16388608

Check if the default settings in dea.yaml are in line with your intentions and make changes as required.

[FUEL] Installation procedures

We describe several alternative procedures in the following. First, we describe several methods that are based on the deploy.sh script, which is also used by the OPNFV CI system. It can be found in the Fuel repository.

In addition, the SDNVPN feature can also be configured manually in the Fuel GUI. This is described in the last subsection.

Before starting any of the following procedures, go to

cd <opnfv-fuel-repo>/ci
[FUEL] Full automatic virtual deployment High Availability Mode

The following command will deploy the high-availability flavor of SDNVPN scenario os-odl_l2-bgpvpn-ha in a fully automatic way, i.e. all installation steps (Fuel server installation, configuration, node discovery and platform deployment) will take place without any further prompt for user input.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> -s os-odl_l2-bgpvpn-ha -i file://<path-to-fuel-iso>
[FUEL] Full automatic virtual deployment NO High Availability Mode

The following command will deploy the SDNVPN scenario in its non-high-availability flavor (note the different scenario name for the -s switch). Otherwise it does the same as described above.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> -s os-odl_l2-bgpvpn-noha -i file://<path-to-fuel-iso>
[FUEL] Automatic Fuel installation and manual scenario deployment

A useful alternative to the full automatic procedure is to only autodeploy the Fuel host and to run host selection, role assignment and SDNVPN scenario configuration manually.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> -s os-odl_l2-bgpvpn-ha -i file://<path-to-fuel-iso> -e

With the -e option the installer does not launch the environment deployment, so a user can make modifications before the scenario is actually deployed. Another interesting option is the -f option, which deploys the scenario using an existing Fuel host.

The result of this installation is a Fuel server with the right configuration for BGPVPN. Now the deploy button on the Fuel dashboard can be used to deploy the environment. It is also possible to do the configuration manually.

[FUEL] Feature configuration on existing Fuel

If a Fuel server is already provided but the Fuel plugins for OpenDaylight, Open vSwitch and BGPVPN are not installed, install them by:

cd /opt/opnfv/
fuel plugins --install fuel-plugin-ovs-*.noarch.rpm
fuel plugins --install opendaylight-*.noarch.rpm
fuel plugins --install bgpvpn-*.noarch.rpm

If the plugins are already installed and you want to update them, use the --force flag.

Now the feature can be configured. Create a new environment with “Neutron with ML2 plugin” and in there “Neutron with tunneling segmentation”. Go to Networks/Settings/Other and check “Assign public network to all nodes”. This is required for features such as floating IP, which require the Compute hosts to have public interfaces. Then go to settings/other and check “OpenDaylight plugin”, “Use ODL to manage L3 traffic”, “BGPVPN plugin” and set the OpenDaylight package version to “5.2.0-1”. Then you should be able to check “BGPVPN extensions” in OpenDaylight plugin section.

Now the deploy button on fuel dashboard can be used to deploy the environment.

[APEX] Virtual deployment

For a virtual Apex deployment, a host with CentOS 7 is needed. This installation was tested on centos-release-7-2.1511.el7.centos.2.10.x86_64; however, any other CentOS 7 version should be fine.

[APEX] Build and Deploy

Download the Apex repo from the OPNFV Gerrit and check out stable/danube:

git clone ssh://<user>@gerrit.opnfv.org:29418/apex
cd apex
git checkout stable/danube

In apex/contrib you will find simple_deploy.sh:

#!/bin/bash
set -e
apex_home=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )/../
export CONFIG=$apex_home/build
export LIB=$apex_home/lib
export RESOURCES=$apex_home/.build/
export PYTHONPATH=$PYTHONPATH:$apex_home/lib/python
$apex_home/ci/dev_dep_check.sh || true
$apex_home/ci/clean.sh
pushd $apex_home/build
make clean
make undercloud
make overcloud-opendaylight
popd
pushd $apex_home/ci
echo "All further output will be piped to $PWD/nohup.out"
(nohup ./deploy.sh -v -n $apex_home/config/network/network_settings.yaml -d $apex_home/config/deploy/os-odl_l3-nofeature-noha.yaml &)
tail -f nohup.out
popd

This script will:

  • “dev_dep_check.sh” installs all required packages
  • “clean.sh” cleans up existing deployments
  • “make clean” cleans up existing builds
  • “make undercloud” builds the undercloud image
  • “make overcloud-opendaylight” builds the overcloud image and converts it into an overcloud image with OpenDaylight
  • “deploy.sh” deploys the os-odl_l3-nofeature-noha.yaml scenario

Edit the script and change the scenario to os-odl-bgpvpn-noha.yaml. More scenarios can be found in ./apex/config/deploy/.
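
For example, the scenario file referenced in the deploy.sh call inside the script can be swapped with a one-line edit (a simple sketch, run from the directory containing simple_deploy.sh; adjust the paths to your checkout):

sed -i 's|os-odl_l3-nofeature-noha.yaml|os-odl-bgpvpn-noha.yaml|' simple_deploy.sh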

Execute the script in its own screen session:

yum install -y screen
screen -S deploy
bash ./simple_deploy.sh
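
Since the deployment takes a while, you can detach from the screen session at any time with Ctrl-a d and later re-attach to follow the progress (standard screen usage, noted here for convenience):

screen -r deploy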

Determine the MAC address of the undercloud VM:

# virsh domiflist undercloud
-> Default network
Interface  Type       Source     Model       MAC
-------------------------------------------------------
vnet0      network    default    virtio      00:6a:9d:24:02:31
vnet1      bridge     admin      virtio      00:6a:9d:24:02:33
vnet2      bridge     external   virtio      00:6a:9d:24:02:35
# arp -n |grep 00:6a:9d:24:02:31
192.168.122.34           ether   00:6a:9d:24:02:31   C                     virbr0
# ssh stack@192.168.122.34
-> no password needed (the password is stack, if prompted)

List overcloud deployment info:

# source stackrc
# # Compute and controller:
# nova list
# # Networks
# neutron net-list

List overcloud OpenStack info:

# source overcloudrc
# nova list
# ...

On the undercloud:

# . stackrc
# nova list
# ssh heat-admin@<ip-of-host>
-> there is no password; the user has direct sudo rights.
Feature and API usage guidelines and example [For Apex and Fuel]

For details on using the OpenStack BGPVPN API, please refer to the documentation at http://docs.openstack.org/developer/networking-bgpvpn/.
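
Before using the API it can be useful to verify that the bgpvpn extension is actually loaded in Neutron (an optional sanity check using the standard extension listing):

neutron ext-list | grep -i bgpvpn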

Example

In this example we show a BGPVPN associated with two Neutron networks. The BGPVPN has its import and export route targets set so that it imports its own routes. The outcome is that VMs on these two networks have full L3 connectivity with each other.

Some definitions:

net_1="Network1"
net_2="Network2"
subnet_net1="10.10.10.0/24"
subnet_net2="10.10.11.0/24"

Create neutron networks and save network IDs:

rv=$(neutron net-create --provider:network_type=local $net_1)
export net_1_id=`echo "$rv" | grep " id " | awk '{print $4}'`
rv=$(neutron net-create --provider:network_type=local $net_2)
export net_2_id=`echo "$rv" | grep " id " | awk '{print $4}'`

Create neutron subnets:

neutron subnet-create $net_1 --disable-dhcp $subnet_net1
neutron subnet-create $net_2 --disable-dhcp $subnet_net2

Create BGPVPN:

neutron bgpvpn-create --route-distinguishers 100:100 --route-targets 100:2530 --name L3_VPN
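
Optionally, verify that the VPN object was created with the expected route targets (an illustrative check using the list/show commands of the same networking-bgpvpn CLI extension used above):

neutron bgpvpn-list
neutron bgpvpn-show L3_VPN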

Start VMs on both networks:

nova boot --flavor 1 --image <some-image> --nic net-id=$net_1_id vm1
nova boot --flavor 1 --image <some-image> --nic net-id=$net_2_id vm2

The VMs should not be able to see each other.

Associate to Neutron networks:

neutron bgpvpn-net-assoc-create L3_VPN --network $net_1_id
neutron bgpvpn-net-assoc-create L3_VPN --network $net_2_id

Now the VMs should be able to ping each other.
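
To check this, one can look up the fixed IP of one VM and ping it from the other (a hypothetical verification flow using standard nova commands; how you log into a VM depends on the image used):

nova show vm2 | grep network
nova console-log vm1

Note vm2's fixed IP from the first command, then log into vm1 (for example via its console) and ping that address.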

Troubleshooting

Check neutron logs on the controller:

tail -f /var/log/neutron/server.log |grep -E "ERROR|TRACE"

Check OpenDaylight logs:

tail -f /opt/opendaylight/data/logs/karaf.log
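
To narrow this down to problems only, the same log can be filtered (illustrative, analogous to the Neutron log check above):

tail -f /opt/opendaylight/data/logs/karaf.log | grep -iE "ERROR|Exception"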

Restart OpenDaylight:

service opendaylight restart

SFC

SFC installation and configuration instruction
1. Abstract

This document provides information on how to install the OpenDaylight SFC features in OPNFV using the os_odl-l2_sfc-(no)ha scenario.

2. SFC feature description

For details of the scenarios and their provided capabilities refer to the scenario description documents:

The SFC feature enables the creation of Service Function Chains - an ordered list of chained network functions (e.g. firewalls, NAT, QoS).

The SFC feature in OPNFV is implemented by 3 major components:

  • OpenDaylight SDN controller
  • Tacker: Generic VNF Manager (VNFM) and an NFV Orchestrator (NFVO)
  • Open vSwitch: the Service Function Forwarder(s)
3. Hardware requirements

The SFC scenarios can be deployed on a bare-metal OPNFV cluster or on a virtual environment on a single host.

3.1. Bare metal deployment on (OPNFV) Pharos lab

Hardware requirements for bare-metal deployments of the OPNFV infrastructure are given by the Pharos project. The Pharos project provides an OPNFV hardware specification for configuring your hardware: http://artifacts.opnfv.org/pharos/docs/pharos-spec.html

3.2. Virtual deployment

To perform a virtual deployment of an OPNFV SFC scenario on a single host, that host has to meet the following hardware requirements:

  • SandyBridge-compatible CPU with virtualization support
  • capable of hosting 5 virtual cores (at least 5 physical cores)
  • 8-12 GB of RAM for each virtual host (controller, compute), at least 48 GB in total
  • 128 GiB of disk space for each virtual host (controller, compute) plus 64 GiB for the Fuel master, at least 576 GiB in total
  • Ubuntu Trusty Tahr - 14.04(.5) server operating system with at least the ssh service selected at installation
  • Internet connection (preferably without an HTTP proxy)
4. Pre-configuration activities - Preparing the host to install Fuel by script

Before starting the installation of the SFC scenarios some preparation of the machine that will host the Danube Fuel cluster must be done.

4.1. Installation of required packages

To be able to run the basic OPNFV Fuel installation, the Jumphost (or the host which serves the VMs for the virtual deployment) needs the following packages installed:

sudo apt-get install -y git make curl libvirt-bin libpq-dev qemu-kvm \
                        qemu-system tightvncserver virt-manager sshpass \
                        fuseiso genisoimage blackbox xterm python-pip \
                        python-git python-dev python-oslo.config \
                        libffi-dev libxml2-dev libxslt1-dev \
                        expect python-netaddr p7zip-full

sudo pip install GitPython pyyaml netaddr paramiko lxml scp \
                 scp pycrypto ecdsa debtcollector netifaces enum

During the libvirt installation the user is added to the libvirtd group, so you have to log out and then log back in for the group membership to take effect.
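
After logging back in, you can verify (illustrative check only) that libvirt is usable without root privileges:

groups | grep libvirtd
virsh list --all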

4.2. Download the installer source code and artifact

To install the os_odl-l2_sfc-(no)ha scenario, one can follow the way CI deploys the scenario. First of all, the opnfv-fuel repository needs to be cloned:

git clone -b 'stable/danube' ssh://<user>@gerrit.opnfv.org:29418/fuel

This command clones the stable/danube branch of the fuel repository.

Now download the appropriate OPNFV Fuel ISO into an appropriate folder:

wget http://artifacts.opnfv.org/fuel/danube/opnfv-danube.1.0.iso

The exact name of the ISO image may change. Check https://www.opnfv.org/opnfv-danube-fuel-users to get the latest ISO.

5. Simplified scenario deployment procedure using Fuel

This section describes the installation of the os-odl-l2_sfc or os-odl-l2_sfc-noha OPNFV reference platform stack across a server cluster or a single host as a virtual deployment.

5.1. Scenario Preparation

dea.yaml and dha.yaml need to be copied and changed according to the lab name/host where you deploy. Copy the full lab config with:

cp -r <path-to-opnfv-fuel-repo>/deploy/config/labs/devel-pipeline/elx \
   <path-to-opnfv-fuel-repo>/deploy/config/labs/devel-pipeline/<your-lab-name>

Add the following at the bottom of dha.yaml:

disks:
  fuel: 64G
  controller: 128G
  compute: 128G

define_vms:
  controller:
    vcpu:
      value: 2
    memory:
      attribute_equlas:
        unit: KiB
      value: 12521472
    currentMemory:
      attribute_equlas:
        unit: KiB
      value: 12521472
  compute:
    vcpu:
      value: 2
    memory:
      attribute_equlas:
        unit: KiB
      value: 8388608
    currentMemory:
      attribute_equlas:
        unit: KiB
      value: 8388608
  fuel:
    vcpu:
      value: 2
    memory:
      attribute_equlas:
        unit: KiB
      value: 2097152
    currentMemory:
      attribute_equlas:
        unit: KiB
      value: 2097152
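
After editing dha.yaml, it can be worth sanity-checking that the file still parses as valid YAML; a minimal check, assuming the PyYAML package installed earlier:

python -c "import yaml; yaml.safe_load(open('dha.yaml')); print('dha.yaml parses OK')"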

Check if the default settings in dea.yaml are in line with your intentions and make changes as required.

5.2. Installation procedures

We describe several alternatives here. First, we describe methods that are based on the deploy.sh script, which is used by the OPNFV CI system and can be found in the Fuel repository.

In addition, the SFC feature can also be configured manually in the Fuel GUI, which we will show in the last subsection.

Before starting any of the following procedures, go to

cd <opnfv-fuel-repo>/ci
5.2.1. Full automatic virtual deployment, High Availability mode

This example will deploy the high-availability flavor of SFC scenario os_odl-l2_sfc-ha in a fully automatic way, i.e. all installation steps (Fuel server installation, configuration, node discovery and platform deployment) will take place without any further prompt for user input.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> \
-s os_odl-l2_sfc-ha -i file://<path-to-fuel-iso>
5.2.2. Full automatic virtual deployment, non High Availability mode

The following command will deploy the SFC scenario with non-high-availability flavor (note the different scenario name for the -s switch). Otherwise it does the same as described above.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> \
-s os_odl-l2_sfc-noha -i file://<path-to-fuel-iso>
5.2.3. Automatic Fuel installation and manual scenario deployment

A useful alternative to the full automatic procedure is to only deploy the Fuel host and to run host selection, role assignment and SFC scenario configuration manually.

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> -s os_odl-l2_sfc-ha -i file://<path-to-fuel-iso> -e

With the -e option the installer will skip environment deployment, so a user can make modifications before the scenario is actually deployed. Another useful option is -f, which deploys the scenario using an existing Fuel host.
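
For example, to deploy the scenario against a Fuel host that was installed with the -e option in the previous step, the command could look like the following sketch (it reuses the parameters from above; depending on the installer version the ISO option may not be needed together with -f):

sudo bash ./deploy.sh -b file://<path-to-opnfv-fuel-repo>/config/ -l devel-pipeline -p <your-lab-name> \
-s os_odl-l2_sfc-ha -i file://<path-to-fuel-iso> -f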

The result of this installation is a fully configured Fuel server. The Deploy button on the Fuel dashboard can then be used to initiate the deployment. A user may perform manual post-configuration as well.

5.2.4. Feature configuration on existing Fuel

If a Fuel server is already provisioned but the Fuel plugins for OpenDaylight and Open vSwitch are not installed, install them with:

cd /opt/opnfv/
fuel plugins --install fuel-plugin-ovs-*.noarch.rpm
fuel plugins --install opendaylight-*.noarch.rpm

If the plugins are already installed and you want to update them, use the --force flag.
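
For example (illustrative only), to list the plugins already present on the Fuel master and force-update the OVS plugin:

fuel plugins --list
fuel plugins --install fuel-plugin-ovs-*.noarch.rpm --force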

Note that one may copy other Danube-compatible plugins to the Fuel master host using scp:

scp <plugin>.rpm root@10.20.0.2:<plugin>.rpm

Now the feature can be configured. Create a new environment with the networking setup "OpenDaylight with tunneling segmentation". Then go to Settings/Other and check "OpenDaylight plugin, SFC enabled" and "Install Openvswitch with NSH/DPDK, with NSH enabled". During node provisioning, remember to assign the OpenDaylight role to the (primary) controller.

Now the Deploy button on the Fuel dashboard can be used to deploy the environment.

SFC User Guide
1. SFC description

The OPNFV SFC feature creates service chains and classifiers, and creates VMs for Service Functions, allowing client traffic destined for a server to first traverse the provisioned service chain.

The Service Chain creation consists of configuring the OpenDaylight SFC feature. This configuration will in turn configure Service Function Forwarders to route traffic to Service Functions. A Service Function Forwarder, in the context of OPNFV SFC, is the "br-int" OVS bridge on an OpenStack compute node.

The classifiers are created by configuring the OpenDaylight NetVirt feature. NetVirt is a Neutron backend which handles the networking for VMs. NetVirt can also create simple classification rules (5-tuples) to send specific traffic to a pre-configured Service Chain. A common example of a classification rule would be to send all HTTP traffic (TCP port 80) to a pre-configured Service Chain.

Service Function VM creation is performed via a VNF Manager. Currently, OPNFV SFC is integrated with OpenStack Tacker, which, in addition to being a VNF Manager, also orchestrates the SFC configuration. In OPNFV SFC, Tacker creates service chains and classification rules, creates VMs in OpenStack for the Service Functions, and then communicates the relevant configuration to OpenDaylight SFC.

2. SFC capabilities and usage

The OPNFV SFC feature can be deployed with either the “os-odl_l2-sfc-ha” or the “os-odl_l2-sfc-noha” scenario. SFC usage for both of these scenarios is the same.

As previously mentioned, Tacker is used as a VNF Manager and SFC Orchestrator. All the configuration necessary to create working service chains and classifiers can be performed using the Tacker command line. Refer to the Tacker walkthrough (step 3 and onwards) for more information.

2.1. SFC API usage guidelines and example

Refer to the Tacker walkthrough for Tacker usage guidelines and examples.
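
To give a rough idea of the workflow, the walkthrough boils down to steps similar to the following sketch (the VNFD file, VNF, chain and classifier names are hypothetical, and the exact CLI options depend on the Tacker version shipped with the scenario):

tacker vnfd-create --vnfd-file test-vnfd1.yaml
tacker vnf-create --name testVNF1 --vnfd-name test-vnfd1
tacker sfc-create --name red --chain testVNF1
tacker sfc-classifier-create --name red_http --chain red --match source_port=0,dest_port=80,protocol=6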

Service Function Chaining (SFC)
1. Introduction

The OPNFV Service Function Chaining (SFC) project aims to provide the ability to define an ordered list of network services (e.g. firewalls, NAT, QoS). These services are then "stitched" together in the network to create a service chain. This project provides the infrastructure to install the upstream ODL SFC implementation project in an NFV environment.

2. Definitions

Definitions of most terms used here are provided in the IETF SFC Architecture RFC. Additional terms specific to the OPNFV SFC project are defined below.

3. Abbreviations

Abbreviation   Term
NS             Network Service
NFVO           Network Function Virtualization Orchestrator
NF             Network Function
NSH            Network Services Header (Service Chaining encapsulation)
ODL            OpenDaylight SDN Controller
RSP            Rendered Service Path
SDN            Software Defined Networking
SF             Service Function
SFC            Service Function Chain(ing)
SFF            Service Function Forwarder
SFP            Service Function Path
VNF            Virtual Network Function
VNFM           Virtual Network Function Manager
VNF-FG         Virtual Network Function Forwarding Graph
VIM            Virtual Infrastructure Manager
4. Use Cases

This section outlines the Danube use cases driving the initial OPNFV SFC implementation.

4.1. Use Case 1 - Two chains

This use case is targeted on creating simple Service Chains using Firewall Service Functions. As can be seen in the following diagram, 2 service chains are created, each through a different Service Function Firewall. Service Chain 1 will block HTTP, while Service Chain 2 will block SSH.

[Figure: Use Case 1 - two service chains, each through a different firewall SF (OPNFV_SFC_Brahmaputra_UseCase.jpg)]
4.2. Use Case 2 - One chain traverses two service functions

This use case creates two service functions, and a chain that makes the traffic flow through both of them. More information is available in OPNFV-SFC wiki:

https://wiki.opnfv.org/display/sfc/Functest+SFC-ODL+-+Test+2

5. Architecture

This section describes the architectural approach to incorporating the upstream OpenDaylight (ODL) SFC project into the OPNFV Danube platform.

5.1. Service Functions

A Service Function (SF) is a Function that provides services to flows traversing a Service Chain. Examples of typical SFs include: Firewall, NAT, QoS, and DPI. In the context of OPNFV, the SF will be a Virtual Network Function. The SFs receive data packets from a Service Function Forwarder.

5.2. Service Function Forwarders

The Service Function Forwarder (SFF) is the core element used in Service Chaining. It is an OpenFlow switch that, in the context of OPNFV, is hosted in an OVS bridge. In OPNFV there will be one SFF per Compute Node that will be hosted in the “br-int” OpenStack OVS bridge.

The responsibility of the SFF is to steer incoming packets to the corresponding Service Function, or to the SFF in the next compute node. The flows in the SFF are programmed by the OpenDaylight SFC SDN Controller.
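
A hedged way to see the result of this programming is to dump the OpenFlow tables of the "br-int" bridge on a compute node (the NSH match field names to grep for depend on the patched OVS version):

ovs-ofctl -O OpenFlow13 dump-flows br-int | grep -iE "nsp|nsi"
ovs-vsctl show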

5.3. Service Chains

Service Chains are defined in the OpenDaylight SFC Controller using the following constructs:

SFC
A Service Function Chain (SFC) is an ordered list of abstract SF types.
SFP
A Service Function Path (SFP) references an SFC, and optionally provides concrete information about the SFC, like concrete SF instances. If SF instances are not supplied, then the RSP will choose them.
RSP
A Rendered Service Path (RSP) is the actual Service Chain. An RSP references an SFP, and effectively merges the information from the SFP and SFC to create the Service Chain. If concrete SF details were not provided in the SFP, then SF selection algorithms are used to choose them. When the RSP is created, the OpenFlow flows are programmed and written to the SFF(s).
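
These constructs can be inspected through the OpenDaylight RESTCONF API; a read-only sketch (the default admin/admin credentials, port 8181 and the controller address placeholder are assumptions):

curl -s -u admin:admin http://<odl-controller-ip>:8181/restconf/config/service-function-chain:service-function-chains/ | python -m json.tool
curl -s -u admin:admin http://<odl-controller-ip>:8181/restconf/operational/rendered-service-path:rendered-service-paths/ | python -m json.tool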
5.4. Service Chaining Encapsulation

Service Chaining Encapsulation encapsulates traffic sent through the Service Chaining domain to facilitate easier steering of packets through Service Chains. If no Service Chaining Encapsulation is used, then packets must be classified at every hop of the chain, which would be slow and would not scale well.

In ODL SFC, Network Service Headers (NSH) is used for Service Chaining encapsulation. NSH is an IETF specification that uses 2 main header fields to facilitate packet steering, namely:

NSP (NSH Path)
The NSP is the Service Path ID.
NSI (NSH Index)
The NSI is the hop index in the Service Chain. The NSI starts at 255 and is decremented by every SF. If the NSI reaches 0, the packet is dropped, which prevents infinite loops.

NSH also has metadata fields, but that’s beyond the scope of this architecture.

In ODL SFC, NSH packets are encapsulated in VXLAN-GPE.

5.5. Classifiers

A classifier is the entry point into Service Chaining. The role of the classifier is to map incoming traffic to Service Chains. In ODL SFC, this mapping is performed by matching the packets and encapsulating the packets in a VXLAN-GPE NSH tunnel.

The packet matching is specific to the classifier implementation, but can be as simple as an ACL, or can be more complex by using PCRF information or DPI.

5.6. VNF Manager

In OPNFV SFC, a VNF Manager is needed to spin up VMs for Service Functions. It has been decided to use the OpenStack Tacker VNF Manager to spin up and manage the life cycle of the SFs. Tacker will receive the ODL SFC configuration, manage the SF VMs, and forward the configuration to ODL SFC. The following sequence diagram details the interactions with the VNF Manager:

[Figure: Sequence diagram of the VNF Manager interactions during SF creation (OPNFV_SFC_Brahmaputra_SfCreation.jpg)]
5.7. OPNFV SFC Network Topology

The following image details the Network Topology used in OPNFV Danube SFC:

[Figure: OPNFV Danube SFC network topology (OPNFV_SFC_Brahmaputra_NW_Topology.jpg)]
5.8. OVS NSH patch workaround

When using NSH with VXLAN tunnels, it is important that the VXLAN tunnel is terminated in the SF VM. This allows the SF to see the NSH header, allowing it to decrement the NSI and also to use the NSH metadata. When using VXLAN with OpenStack, the tunnels are not terminated in the VM, but in the "br-int" OVS bridge. There is ongoing work in the upstream OVS community to implement NSH encapsulation. To get around the way OpenStack handles VXLAN tunnels, the OVS work will also include the ability to encapsulate/decapsulate VXLAN tunnels from OpenFlow rules, instead of relying on the VTEP ports. The ongoing upstream OVS work will not be finished by the time OPNFV Danube is released, so a workaround has been created. This workaround uses a private branch of OVS that has a preliminary version of NSH implemented.

The following diagram illustrates how packets will be sent to an SF, when the SFF has processed the packet and wants to send it to the SF:

[Figure: OVS NSH workaround - packet path from the SFF to the SF (OPNFV_SFC_BrahmaputraOvsNshWorkaround_toSf.jpg)]

The following diagram illustrates how packets will be sent from an SF to an SFF, once the SF has processed a packet:

[Figure: OVS NSH workaround - packet path from the SF back to the SFF (OPNFV_SFC_BrahmaputraOvsNshWorkaround_fromSf.jpg)]
6. Requirements

This section defines requirements for the initial OPNFV SFC implementation, including those requirements driving upstream project enhancements.

6.1. Minimal Viable Requirement

Deploy a complete SFC solution by integrating OpenDaylight SFC with OpenStack in an OPNFV environment.

6.2. Detailed Requirements

These are the Danube specific requirements:

1 The supported Service Chaining encapsulation will be NSH VXLAN-GPE.

2 The version of OVS used must support NSH.

3 The SF VM life cycle will be managed by the Tacker VNF Manager.

4 The supported classifier is OpenDaylight NetVirt.

5 ODL will be the OpenStack Neutron backend and will handle all networking on the compute nodes.

6.3. Long Term Requirements

These requirements are out of the scope of the Danube release.

1 Dynamic movement of SFs across multiple Compute nodes.

2 Load balancing across multiple SFs.