3. Set maintenance period

https://blueprints.launchpad.net/nova/+spec/set-maintenance-period

When a compute host is going into maintenance, it must be possible to set the planned maintenance start and end times.

3.1. Problem description

An admin needs to be able to set the planned maintenance period so that server owners can also prepare for the maintenance. A server owner might want different actions for a server depending on when the maintenance takes place and how long it lasts. This is especially important for services that must keep running without downtime during the maintenance.

3.1.1. Use Cases

As an admin, I want to set a maintenance period for a certain host.

As the owner of a server, I want to prepare for maintenance to minimize downtime and keep capacity at the needed level.

As the owner of a server, I want to have different actions for different servers depending on the application running on the server and on when and how long the maintenance will last. To achieve this, information is needed about when the maintenance is going to happen. Actions for servers are not in the scope of this spec. Defining auto recovery or other configured actions might be future work, but for now that is left outside the scope of Nova. This can be complex: a server might be moved, removed, or even left on the host while the maintenance takes place.

3.2. Proposed change

A new admin API endpoint, PUT /v2.1/{tenant_id}/os-services/maintenance, should be added, taking host, binary, maintenance_start and maintenance_end parameters in the request. The API tells when the actual maintenance of the host will happen, and it is also used to unset the maintenance period. Disabling nova-compute only stops scheduling to the host; this new API gives the time period during which host maintenance is actually performed and servers can therefore also be down.

If maintenance_start is defined but maintenance_end is not, the host will be removed. Calling the API with empty timestamps in both maintenance_start and maintenance_end indicates that there is no ongoing maintenance; this is also the default value for these new fields in the Nova service DB. Service enable/disable is expected to be called separately before and after maintenance.
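The three cases above can be summarized in a small sketch. The function and enum names are hypothetical and not part of Nova; treating both an empty string and None as "not set" is an assumption, as is rejecting an end time without a start time, which the spec does not describe.

```python
from enum import Enum
from typing import Optional


class MaintenanceState(Enum):
    NONE = "no maintenance scheduled"
    WINDOW = "maintenance window"
    REMOVAL = "host removal"


def classify_maintenance(maintenance_start: Optional[str],
                         maintenance_end: Optional[str]) -> MaintenanceState:
    """Map the two timestamp fields onto the cases described in the spec."""
    if not maintenance_start and not maintenance_end:
        # Both empty: the default, no ongoing maintenance.
        return MaintenanceState.NONE
    if maintenance_start and not maintenance_end:
        # Start without end: the host will be removed.
        return MaintenanceState.REMOVAL
    if maintenance_start and maintenance_end:
        return MaintenanceState.WINDOW
    # Assumption: an end without a start is not covered by the spec.
    raise ValueError("maintenance_end set without maintenance_start")
```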

The service.update notification should get a new version that includes the maintenance_start and maintenance_end information. Calling the new maintenance API should also trigger this notification.

An alarm should be raised for server owners, listing their servers that are on a host going into maintenance. The maintenance_start and maintenance_end should also be visible in the alarm. The host should not be exposed to the owner, so the alarm should not contain that information. There should also be an alarm when the maintenance is over, and it needs to contain information that maps it to the corresponding start-of-maintenance alarm. The alarm is to be defined in more detail in the notifications section (tbd).
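Since the alarm format is still to be defined, the following is only an illustrative sketch of the grouping described above: one alarm per owner, listing that owner's servers on the affected host and the maintenance times, without the host name. The function name, the server dictionaries and the alarm fields are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_owner_alarms(servers: Iterable[dict], host: str,
                       maintenance_start: str,
                       maintenance_end: str) -> List[dict]:
    """Build one alarm body per project for servers on the given host."""
    by_project: Dict[str, List[str]] = defaultdict(list)
    for server in servers:
        if server["host"] == host:
            by_project[server["project_id"]].append(server["id"])
    # The host is used only for filtering and is never copied into the alarm.
    return [
        {
            "project_id": project_id,
            "servers": sorted(server_ids),
            "maintenance_start": maintenance_start,
            "maintenance_end": maintenance_end,
        }
        for project_id, server_ids in sorted(by_project.items())
    ]
```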

Maintenance should also be shown to the server owner by adding OS-EXT-SRV-ATTR:maintenance_start and OS-EXT-SRV-ATTR:maintenance_end parameters to the /servers/{server_id} and /servers/detail endpoints. A new policy should be added to control visibility of the new parameters:

"os_compute_api:servers:show:maintenance": "rule:admin_or_owner"

Currently the only related capability is the host_status parameter, which shows a disabled nova-compute service as MAINTENANCE, but the policy for it defaults to admin. In NFV it will also be configured for the owner.

3.2.1. Alternatives

tbd

3.2.2. Data model impact

Nova Service DB will have new fields maintenance_start and maintenance_end.

3.2.3. REST API impact

A new API microversion is needed to add the new API: PUT /v2.1/{tenant_id}/os-services/maintenance. At the same time, the responses of GET /v2.1/{tenant_id}/servers/{server_id} and GET /v2.1/{tenant_id}/servers/detail gain the new parameters OS-EXT-SRV-ATTR:maintenance_start and OS-EXT-SRV-ATTR:maintenance_end. Visibility of these parameters is controlled by the new policy.

Example of setting a two-hour maintenance period:

PUT /v2.1/{tenant_id}/os-services/maintenance
{
    "host": "compute1",
    "binary": "nova-compute",
    "maintenance_start": "2016-03-22T01:00:00",
    "maintenance_end": "2016-03-22T03:00:00"
}

200 OK
{
    "service": {
        "binary": "nova-compute",
        "host": "compute1",
        "maintenance_start": "2016-03-22T01:00:00.000000",
        "maintenance_end": "2016-03-22T03:00:00.000000"
    }
}

Example to unset maintenance:

PUT /v2.1/{tenant_id}/os-services/maintenance
{
    "host": "compute1",
    "binary": "nova-compute",
    "maintenance_start": "",
    "maintenance_end": ""
}

200 OK
{
    "service": {
        "binary": "nova-compute",
        "host": "compute1",
        "maintenance_start": "",
        "maintenance_end": ""
    }
}

Example of indicating host removal:

PUT /v2.1/{tenant_id}/os-services/maintenance
{
    "host": "compute1",
    "binary": "nova-compute",
    "maintenance_start": "2016-03-22T01:00:00",
    "maintenance_end": ""
}

200 OK
{
    "service": {
        "binary": "nova-compute",
        "host": "compute1",
        "maintenance_start": "2016-03-22T01:00:00.000000",
        "maintenance_end": ""
    }
}

Setting timestamps in the past is considered an error, and maintenance_end has to be either after maintenance_start or empty. Otherwise the API returns:

400 Bad Request
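A minimal sketch of these validation rules follows. The function name is hypothetical and this is not Nova code; the timestamp format is taken from the request examples above, timestamps are assumed to be naive UTC, and rejecting an end time without a start time is an assumption. A ValueError here stands for the 400 Bad Request response.

```python
from datetime import datetime, timezone
from typing import Optional

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"  # format used in the request examples


def validate_maintenance_period(maintenance_start: str,
                                maintenance_end: str,
                                now: Optional[datetime] = None) -> None:
    """Raise ValueError if the request would be answered with 400."""
    if now is None:
        # Assumption: timestamps are naive UTC values.
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    if not maintenance_start:
        if maintenance_end:
            # Assumption: this case is not described by the spec.
            raise ValueError("maintenance_end given without maintenance_start")
        return  # both empty: unset maintenance
    start = datetime.strptime(maintenance_start, TIME_FORMAT)
    if start < now:
        raise ValueError("maintenance_start is in the past")
    if not maintenance_end:
        return  # start only: host removal
    end = datetime.strptime(maintenance_end, TIME_FORMAT)
    if end <= start:
        raise ValueError("maintenance_end must be after maintenance_start")
```

Because the end time must be after the start time and the start time must not be in the past, an end time in the past is rejected without a separate check.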

3.2.4. Security impact

None

3.2.5. Notifications impact

A new version of the service.update notification needs to carry the new parameters maintenance_start and maintenance_end, and the notification needs to be triggered when the new maintenance API is called:

{
    "priority":"INFO",
    "event_type":"service.update",
    "timestamp":"2016-03-22 00:46:25.211575",
    "publisher_id":"nova-compute:controller",
    "payload":{
        "nova_object.version":"1.0",
        "nova_object.name":"ServiceStatusPayload",
        "nova_object.namespace":"nova",
        "nova_object.data":{
            "binary":"nova-compute",
            "report_count":1,
            "topic":"compute",
            "host":"controller",
            "version":3,
            "disabled":true,
            "forced_down":false,
            "last_seen_up":"2016-03-22T00:46:25Z",
            "disabled_reason":"Going to maintenance",
            "maintenance_start": "2016-03-22T01:00:00Z",
            "maintenance_end": "2016-03-22T03:00:00Z"
        }
    },
    "message_id":"8516b5c7-c6a7-43a4-9ad1-df447f318afb"
}
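As an illustration of how a consumer might read the new fields from such a notification, here is a short sketch. The function name is hypothetical, the sample is a trimmed copy of the notification above, and the timestamp format is assumed to be the one shown in the example payload.

```python
import json
from datetime import datetime
from typing import Optional, Tuple

PAYLOAD_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # as in the example payload


def _parse(value: Optional[str]) -> Optional[datetime]:
    """Parse a payload timestamp; empty or missing values become None."""
    return datetime.strptime(value, PAYLOAD_TIME_FORMAT) if value else None


def maintenance_window(
        notification: dict
) -> Tuple[str, Optional[datetime], Optional[datetime]]:
    """Return (host, start, end) from a service.update notification."""
    data = notification["payload"]["nova_object.data"]
    return (data["host"],
            _parse(data.get("maintenance_start")),
            _parse(data.get("maintenance_end")))


# Trimmed copy of the example notification, for illustration only.
SAMPLE = json.loads("""
{
    "event_type": "service.update",
    "payload": {
        "nova_object.data": {
            "binary": "nova-compute",
            "host": "controller",
            "disabled": true,
            "maintenance_start": "2016-03-22T01:00:00Z",
            "maintenance_end": "2016-03-22T03:00:00Z"
        }
    }
}
""")

host, start, end = maintenance_window(SAMPLE)
duration = end - start  # length of the maintenance window as a timedelta
```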

tbd, maintenance alarm.

3.2.6. Other end user impact

None

3.2.7. Performance Impact

None

3.2.8. Other deployer impact

None

3.2.9. Developer impact

None

3.3. Implementation

3.3.1. Assignee(s)

Primary assignee:
tomi-juvonen-q

3.3.2. Work Items

Nova service DB changes. New maintenance API. Notification. Alarm.

3.4. Dependencies

Continues work started in Mitaka:

https://blueprints.launchpad.net/nova/+spec/get-valid-server-state

3.5. Testing

Unit and functional tests will be added.

3.6. Documentation Impact

API changes need to be documented for new microversion. Maintenance documentation should be updated.

3.7. References

Requirements of OPNFV Doctor project: http://artifacts.opnfv.org/doctor/docs/requirements/requirements.pdf

3.8. History

Table 3.1 Revisions
Release Name    Description
Newton          Introduced