Add Support for Migration Statistics API Call #43

Merged
phip1611 merged 9 commits into cyberus-technology:gardenlinux from phip1611:poc-migration-statistics
Feb 19, 2026

@phip1611 phip1611 commented Nov 25, 2025

TL;DR

This extends the internal API with a vm_progress function and adds a vm.migration-progress HTTP endpoint including support in ch-remote via ch-remote migration-process to query the latest migration progress.

Motivation

Having live and frequently refreshing statistics/metrics about an
ongoing live migration is especially interesting for debugging and
monitoring, such as checking the actual network throughput. With the
proposed changes, for the first time, we will be able to see how live
migrations behave and create benchmarking infrastructure around it.

The ch driver in libvirt will use this information to populate its
virsh domjobinfo output.

Design

We will add a new API endpoint to query information for ongoing live
migrations. The new endpoint will also serve to query information about
any previously failed or canceled migrations. The SendMigration call
will no longer block until the migration is done; instead, it will just
dispatch the migration.

This aligns the behavior with that of QEMU and simplifies management
software.

When one queries the endpoint, a frequently refreshed snapshot of the
migration statistics and progress will be returned. The data will not be
assembled on the fly.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster philipp.schuster@cyberus-technology.de
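
The "frequently refreshed snapshot" idea from the Design section could look roughly like the following. This is only a minimal sketch, not the code from this PR: `ProgressSnapshot`, `ProgressSlot`, and the field names are hypothetical and chosen for illustration. The point it shows is that the migration worker writes the latest numbers into a shared slot, and the query path only copies what is already stored.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical snapshot type; the field names are illustrative only.
#[derive(Clone, Debug, PartialEq)]
pub struct ProgressSnapshot {
    pub memory_iteration: u64,
    pub bytes_transferred: u64,
}

/// Shared slot holding the most recent snapshot, if any (hypothetical helper).
#[derive(Clone, Default)]
pub struct ProgressSlot(Arc<Mutex<Option<ProgressSnapshot>>>);

impl ProgressSlot {
    /// Called by the migration worker whenever it has fresh numbers.
    pub fn publish(&self, snapshot: ProgressSnapshot) {
        *self.0.lock().unwrap() = Some(snapshot);
    }

    /// Called by the API handler: returns a copy of the stored snapshot
    /// without computing anything at query time.
    pub fn latest(&self) -> Option<ProgressSnapshot> {
        self.0.lock().unwrap().clone()
    }
}
```

Because a query only clones the stored value, the worker holds the lock just long enough to swap in a new snapshot.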

Steps to Merge

  • test it locally using ch-remote
  • add libvirt-tests testcase and verify everything works
  • finish refactoring for "non-blocking send-migration" (2026-02-03)
  • deploy it on a node in SAP land and see if it works
  • merge this when we know the libvirt part is also fine (I however think that this is 99% already)
  • wait for the latest libvirt/libvirt-tests fixes

@phip1611 phip1611 self-assigned this Nov 25, 2025
@phip1611 phip1611 force-pushed the poc-migration-statistics branch from e29356d to 4101b11 Compare November 28, 2025 08:46
@phip1611 phip1611 force-pushed the poc-migration-statistics branch 5 times, most recently from f49e577 to fdf5858 Compare December 9, 2025 11:42
@phip1611 phip1611 force-pushed the poc-migration-statistics branch from f612cf6 to e9a3321 Compare December 15, 2025 14:24
@phip1611 phip1611 force-pushed the poc-migration-statistics branch from e9a3321 to ecb5f45 Compare January 8, 2026 14:31
@phip1611 phip1611 changed the base branch from gardenlinux-v48 to gardenlinux January 8, 2026 14:32
@phip1611 phip1611 force-pushed the poc-migration-statistics branch 3 times, most recently from e6c80dd to fe8cd0d Compare January 12, 2026 16:30
@phip1611 phip1611 marked this pull request as ready for review January 12, 2026 16:30
@phip1611 phip1611 changed the title from "WIP XXX Migration Statistics" to "Add Support for Migration Statistics API Call" Jan 12, 2026
@phip1611 phip1611 marked this pull request as draft January 12, 2026 16:35
@phip1611 phip1611 force-pushed the poc-migration-statistics branch from fe8cd0d to a87fe95 Compare January 12, 2026 16:41
@phip1611 phip1611 requested a review from tpressure January 12, 2026 16:42
@phip1611 phip1611 force-pushed the poc-migration-statistics branch from a87fe95 to 6d68c0c Compare January 12, 2026 16:43
@phip1611 phip1611 force-pushed the poc-migration-statistics branch 2 times, most recently from c934e0e to 78699eb Compare January 13, 2026 10:42
@phip1611 phip1611 force-pushed the poc-migration-statistics branch from 3d56d3c to 28fdbcc Compare February 9, 2026 16:09

olivereanderson commented Feb 10, 2026

Unimportant nit, feel free to ignore :) I noticed that commit "9dd759c87a09e7113dc32ef75141620ca58a578d" includes "these information" in the commit message. It should be "this information", as "information" is an uncountable noun and thus always takes the singular form.

Again, not important, and I don't really care, as it is perfectly understandable.

@olivereanderson olivereanderson left a comment

LGTM

Thanks for all the work you put into this important feature!

@phip1611 phip1611 force-pushed the poc-migration-statistics branch 10 times, most recently from 5d28f05 to f64cc96 Compare February 13, 2026 14:48
@phip1611 phip1611 marked this pull request as draft February 17, 2026 15:52
@phip1611 phip1611 marked this pull request as ready for review February 19, 2026 08:29
The logging is neither very spammy nor costly (iterations take seconds
to dozens of minutes) and is clearly a win for us when debugging.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This is the first commit in a series of commits to introduce a new API
endpoint in Cloud Hypervisor to report progress and live-insights about
an ongoing live migration.

# Motivation

Having live and frequently refreshing statistics/metrics about an
ongoing live migration is especially interesting for debugging and
monitoring, such as checking the actual network throughput. With the
proposed changes, for the first time, we will be able to see how live
migrations behave and create benchmarking infrastructure around it.

The ch driver in libvirt will use this information to populate its
`virsh domjobinfo` output.

# Design

We will add a new API endpoint to query information for ongoing live
migrations. The new endpoint will also serve to query information about
any previously failed or canceled migrations. The SendMigration call
will no longer block until the migration is done; instead, it will just
dispatch the migration.

This aligns the behavior with that of QEMU and simplifies management
software.

When one queries the endpoint, a frequently refreshed snapshot of the
migration statistics and progress will be returned. The data will not be
assembled on the fly.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This is part of the commit series to enable live updates about an
ongoing live migration. See the first commit for an introduction.

We decided to use an Option<> rather than a Result<> as
there isn't really an error that can happen when we query this endpoint.
A previous snapshot may either be there or not. It also doesn't make
sense here to check if the current VM is running, as users should always
be able to query information about the past (failed or canceled) live
migration.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
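
To illustrate the Option<> versus Result<> reasoning above, here is a minimal sketch under stated assumptions. `Vmm`, `Snapshot`, `Outcome`, and `summarize` are hypothetical names, not the types introduced by this commit. The query is infallible and ignores whether the VM is currently running, so a caller only distinguishes "nothing recorded" from "something recorded".

```rust
/// Hypothetical outcome of a migration as recorded in the last snapshot.
#[derive(Clone, Debug, PartialEq)]
pub enum Outcome {
    Ongoing,
    Completed,
    Failed(String),
    Canceled,
}

/// Hypothetical snapshot; a real one would carry statistics as well.
#[derive(Clone, Debug, PartialEq)]
pub struct Snapshot {
    pub outcome: Outcome,
}

/// Hypothetical stand-in for the VMM state.
pub struct Vmm {
    pub vm_running: bool,
    pub last_snapshot: Option<Snapshot>,
}

impl Vmm {
    /// Infallible query: returns whatever was recorded last and
    /// deliberately does not look at `vm_running`.
    pub fn migration_progress(&self) -> Option<Snapshot> {
        self.last_snapshot.clone()
    }
}

/// Example consumer that turns the optional snapshot into a message.
pub fn summarize(progress: Option<Snapshot>) -> String {
    match progress.map(|s| s.outcome) {
        None => "no migration recorded".to_string(),
        Some(Outcome::Ongoing) => "migration in progress".to_string(),
        Some(Outcome::Completed) => "migration completed".to_string(),
        Some(Outcome::Failed(reason)) => format!("migration failed: {reason}"),
        Some(Outcome::Canceled) => "migration canceled".to_string(),
    }
}
```

With this shape, information about a failed migration remains available even when the VM is no longer running.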
This is part of the commit series to enable live updates about an
ongoing live migration. See the first commit for an introduction.

In this commit, we add the HTTP endpoint to export ongoing VM
live-migration progress.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This is part of the commit series to enable live updates about an
ongoing live migration. See the first commit for an introduction.

This commit prepares for avoiding naming clashes in the following
commits.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
This is part of the commit series to enable live updates about an
ongoing live migration. See the first commit for an introduction.

This commit actually brings all the functionality together. The first
version has the limitation that we only populate the latest snapshot
once per memory iteration, even though the memory transfer is by far the
most interesting part. In a follow-up, we can make this more
fine-grained.

We guarantee that as soon as SendMigration returns, migration progress
can be fetched as the underlying data source is populated.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
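
One way to obtain the guarantee stated above (progress is available as soon as SendMigration returns) is to seed the shared snapshot before the worker thread is spawned. The following is a minimal sketch under that assumption; `Slot`, `Snapshot`, and `dispatch_migration` are hypothetical names and the loop body stands in for the real memory transfer.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical snapshot carrying only the current memory iteration.
#[derive(Clone, Debug, PartialEq)]
pub struct Snapshot {
    pub memory_iteration: u64,
}

/// Shared slot read by the progress query (hypothetical).
pub type Slot = Arc<Mutex<Option<Snapshot>>>;

/// Dispatches a migration and returns immediately.
pub fn dispatch_migration(slot: &Slot, iterations: u64) -> JoinHandle<()> {
    // Seed the slot *before* spawning the worker, so that a progress query
    // issued right after this function returns never observes `None`.
    *slot.lock().unwrap() = Some(Snapshot { memory_iteration: 0 });

    let worker_slot = Arc::clone(slot);
    thread::spawn(move || {
        for i in 1..=iterations {
            // A real worker would transfer dirty memory here; this sketch
            // only refreshes the snapshot once per memory iteration.
            *worker_slot.lock().unwrap() = Some(Snapshot { memory_iteration: i });
        }
    })
}
```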
On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
Time has proven that the previous design was not optimal. Now, the
SendMigration call is not blocking for the duration of the migration.
Instead, it just triggers the migration. Using the new MigrationProgress
endpoint, management software can query the state of the migration and
also find information about failed migrations.

A new `keep_alive` parameter for SendMigration will keep the VMM alive
and usable after the migration to ensure management software can fetch
the final state. The management software is then supposed to send a
ShutdownVmm command.

With this, we are finally able to query the migration progress API
endpoint during an ongoing live migration.

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
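
From the perspective of management software, the flow described above (dispatch with `keep_alive`, poll the progress endpoint, then send ShutdownVmm) might be sketched as follows. The `VmmApi` trait, the `State` enum, and `migrate_and_wait` are assumptions made for illustration; they do not mirror the actual client API or wire format.

```rust
use std::time::Duration;

/// Hypothetical migration states reported by the progress endpoint.
#[derive(Clone, Debug, PartialEq)]
pub enum State {
    Ongoing,
    Completed,
    Failed(String),
    Canceled,
}

/// Hypothetical abstraction over the three API calls involved.
pub trait VmmApi {
    fn send_migration(&mut self, keep_alive: bool);
    fn migration_progress(&mut self) -> Option<State>;
    fn shutdown_vmm(&mut self);
}

/// Dispatches a migration, polls until it is no longer ongoing, and then
/// shuts down the VMM that was kept alive. Returns the final state.
pub fn migrate_and_wait<A: VmmApi>(api: &mut A, poll_interval: Duration) -> Option<State> {
    // Non-blocking dispatch; keep the VMM alive so the final state
    // can still be fetched afterwards.
    api.send_migration(true);

    let final_state = loop {
        match api.migration_progress() {
            Some(State::Ongoing) => std::thread::sleep(poll_interval),
            other => break other,
        }
    };

    // The management software is responsible for the final shutdown.
    api.shutdown_vmm();
    final_state
}
```

A production client would likely add a timeout and error handling around each call; both are omitted here to keep the control flow visible.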
We preserve the old behavior in ch-remote: SendMigration is blocking. A
new `--dispatch` flag however ensures that one can just dispatch the
migration without waiting for it to finish (or fail).

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
@phip1611 phip1611 force-pushed the poc-migration-statistics branch from f64cc96 to 87dae99 Compare February 19, 2026 08:29
@phip1611 phip1611 enabled auto-merge (rebase) February 19, 2026 08:29
@phip1611 phip1611 merged commit 415c957 into cyberus-technology:gardenlinux Feb 19, 2026
12 checks passed
@phip1611 phip1611 deleted the poc-migration-statistics branch February 19, 2026 08:36

4 participants