Commit a56ab5a

doc: Add sxm mux design
Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
1 parent bd6cab7 commit a56ab5a

3 files changed

Lines changed: 110 additions & 10 deletions

doc/content/xapi/storage/sxm.md

Lines changed: 102 additions & 10 deletions
@@ -2,9 +2,101 @@
22
Title: Storage migration
33
---
44

5+
- [Overview](#overview)
6+
- [SXM Multiplexing](#sxm-multiplexing)
7+
- [Motivation](#motivation)
8+
- [But we have storage\_mux.ml](#but-we-have-storage_muxml)
9+
- [Thought experiments on an alternative design](#thought-experiments-on-an-alternative-design)
10+
- [Design](#design)
11+
- [SMAPIv1 Migration](#smapiv1-migration)
12+
- [Receiving SXM](#receiving-sxm)
13+
- [Xapi code](#xapi-code)
14+
- [Storage code](#storage-code)
15+
- [Copying a VDI](#copying-a-vdi)
16+
- [Mirroring a VDI](#mirroring-a-vdi)
17+
- [Code walkthrough](#code-walkthrough)
18+
- [DATA.copy](#datacopy)
19+
- [DATA.copy\_into](#datacopy_into)
20+
- [DATA.MIRROR.start](#datamirrorstart)
21+
22+
523
## Overview
624

7-
{{<mermaid align="left">}}
25+
26+
## SXM Multiplexing
27+
28+
This section describes the design behind the additional layer of multiplexing used specifically
for Storage Xen Motion (SXM) from SRs using SMAPIv3. Before reading it, you should read the
[introduction doc](_index.md) for the storage layer to understand how storage
multiplexing is done between SMAPIv2 and SMAPI{v1, v3}.
32+
33+
34+
### Motivation
35+
36+
The existing SXM code was designed to work only with SMAPIv1 SRs, and therefore
does not take into account the dramatic difference in the ways SXM is done between
SMAPIv1 and SMAPIv3. The exact difference will be covered later in this doc; for this section
it is sufficient to assume that the two have different ways of doing migration. Therefore,
we need different code paths for migration from SMAPIv1 and SMAPIv3.
41+
42+
#### But we have storage_mux.ml
43+
44+
Indeed, storage_mux.ml is responsible for multiplexing and forwarding requests to
the correct storage backend, based on the SR type that the caller specifies. In
fact, for inbound SXM to SMAPIv3 (i.e. migrating into a SMAPIv3 SR, GFS2 for example),
storage_mux does the heavy lifting of multiplexing between different storage
backends. Every time a `Remote.` call is invoked, it goes through the SMAPIv2
layer to the remote host and gets multiplexed on the destination host, based on
whether we are migrating into a SMAPIv1 or SMAPIv3 SR (see the diagram below).
Inbound SXM is then supported by implementing the existing SMAPIv2 -> SMAPIv3 calls
(see `import_activate` for example) which may not have been implemented before.
54+
55+
![mux for inbound](sxm_mux_inbound.svg)
56+
57+
While this works fine for inbound SXM, it does not work for outbound SXM. An SXM
can be any of four combinations of source SR type (v1/v3) and destination SR
type (v1/v3). We have already covered the
destination multiplexing (v1/v3) by utilising storage_mux, and at this point we
have run out of multiplexers for multiplexing on the source. In other words, we
can only multiplex once for each SMAPIv2 call, and we can use that chance for
either the source or the destination, and we have already used it for the latter.
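
The "multiplex once per call" limitation can be illustrated with a minimal sketch. This is a toy model, not the real storage_mux.ml: the `backend` type, the `backend_of_sr` table, the SR names and the string results are all assumptions made up for illustration.

```ocaml
(* Hypothetical model of a mux; names and SRs are invented for this sketch. *)
type backend = Smapiv1 | Smapiv3

(* Assumed lookup from an SR name to the backend type that serves it. *)
let backend_of_sr = function
  | "lvm_sr" -> Smapiv1
  | "gfs2_sr" -> Smapiv3
  | sr -> failwith ("unknown SR: " ^ sr)

(* A SMAPIv2-style call is keyed on a single SR, so the mux gets exactly
   one chance to pick an implementation for each call. *)
let vdi_attach ~sr ~vdi =
  match backend_of_sr sr with
  | Smapiv1 -> Printf.sprintf "smapiv1: attach %s" vdi
  | Smapiv3 -> Printf.sprintf "smapiv3: attach %s" vdi
```

In this model the dispatch key is the one `sr` argument. Once the call has been routed on the destination SR, nothing is left to route on the source SR.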
64+
65+
66+
#### Thought experiments on an alternative design
67+
68+
To make this more concrete, let us consider an example: the mirroring logic in
SXM differs depending on the source SR type of the SXM call. You might imagine
defining a function like `MIRROR.start v3_sr v1_sr` that will be multiplexed
by storage_mux based on the source SR type, and forwarded to storage_smapiv3_migrate,
or even just xapi-storage-script, which is indeed quite possible.
At this point we have already done the multiplexing, but we still wish to
multiplex operations on destination SRs. For example, we might want to attach a
VDI belonging to a SMAPIv1 SR on the remote host. But as we have already done the
multiplexing and are now inside xapi-storage-script, we have lost any chance of doing
any further multiplexing :(
78+
79+
### Design
80+
81+
The idea of this new design is to introduce an additional multiplexing layer that
is specific for multiplexing calls based on the source SR type. For example, in
the diagram below, `send_start src_sr dest_sr` takes both the source SR and the
destination SR as parameters. Since the mirroring logic is different for different
types of source SRs (i.e. SMAPIv1 or SMAPIv3), the storage migration code must
choose the right code path based on the source SR type, and this is
exactly what is done in this additional multiplexing layer. The respective logic
for doing {v1,v3}-specific mirroring, for example, will stay in storage_smapi{v1,v3}_migrate.ml.
89+
90+
![mux for outbound](sxm_mux_outbound.svg)
91+
92+
Note that later on storage_smapi{v1,v3}_migrate.ml will still have the flexibility
93+
to call remote SMAPIv2 functions, such as `Remote.VDI.attach dest_sr vdi`, and
94+
it will be handled just as before.
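
The two levels of dispatch described in this section could be sketched roughly as follows. This is only an illustration under stated assumptions: the `MIGRATE` signature, the module names `Smapiv1_migrate` and `Smapiv3_migrate`, the `sr_type_of` table, the SR names and the string return values are all hypothetical, and `remote_vdi_attach` is a local stand-in for a remote SMAPIv2 call such as `Remote.VDI.attach`.

```ocaml
(* Hypothetical sketch; none of these definitions are taken from xapi. *)
type sr_type = Smapiv1 | Smapiv3

(* Assumed table mapping SR names to their SMAPI version. *)
let sr_type_of = function
  | "v1_sr" | "remote_v1_sr" -> Smapiv1
  | "v3_sr" | "remote_v3_sr" -> Smapiv3
  | sr -> invalid_arg ("unknown SR: " ^ sr)

(* Stand-in for a remote SMAPIv2 call: multiplexed on the destination SR,
   which models the job storage_mux already does. *)
let remote_vdi_attach ~dest_sr =
  match sr_type_of dest_sr with
  | Smapiv1 -> "attach via SMAPIv1"
  | Smapiv3 -> "attach via SMAPIv3"

(* Interface shared by the per-source-type migration modules. *)
module type MIGRATE = sig
  val send_start : src_sr:string -> dest_sr:string -> string
end

module Smapiv1_migrate : MIGRATE = struct
  (* v1-specific mirroring; remote calls are still available. *)
  let send_start ~src_sr:_ ~dest_sr =
    "v1 mirror, then " ^ remote_vdi_attach ~dest_sr
end

module Smapiv3_migrate : MIGRATE = struct
  (* v3-specific mirroring; remote calls are still available. *)
  let send_start ~src_sr:_ ~dest_sr =
    "v3 mirror, then " ^ remote_vdi_attach ~dest_sr
end

(* The additional layer: choose the code path from the source SR type. *)
let send_start ~src_sr ~dest_sr =
  match sr_type_of src_sr with
  | Smapiv1 -> Smapiv1_migrate.send_start ~src_sr ~dest_sr
  | Smapiv3 -> Smapiv3_migrate.send_start ~src_sr ~dest_sr
```

In this sketch, migrating from a v3 source to a v1 destination goes through the v3 mirroring path first, and the attach on the destination is then routed separately on `dest_sr`, so both dispatches remain possible.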
95+
96+
97+
## SMAPIv1 Migration
98+
99+
```mermaid
8100
sequenceDiagram
9101
participant local_tapdisk as local tapdisk
10102
participant local_smapiv2 as local SMAPIv2
@@ -129,7 +221,7 @@ opt post_detach_hook
129221
end
130222
Note over xapi: memory image migration by xenopsd
131223
Note over xapi: destroy the VM record
132-
{{< /mermaid >}}
224+
```
133225

134226
### Receiving SXM
135227

@@ -162,7 +254,7 @@ the receiving end of storage motion:
162254

163255
This is how xapi coordinates storage migration. We'll do it as a code walkthrough through the two layers: xapi and storage-in-xapi (SMAPIv2).
164256

165-
## Xapi code
257+
### Xapi code
166258

167259
The entry point is in [xapi_vm_migrate.ml](https://github.com/xapi-project/xen-api/blob/f75d51e7a3eff89d952330ec1a739df85a2895e2/ocaml/xapi/xapi_vm_migrate.ml#L786)
168260

@@ -1056,7 +1148,7 @@ We also try to remove the VM record from the destination if we managed to send i
10561148
Finally we check for mirror failure in the task - this is set by the events thread watching for events from the storage layer, in [storage_access.ml](https://github.com/xapi-project/xen-api/blob/f75d51e7a3eff89d952330ec1a739df85a2895e2/ocaml/xapi/storage_access.ml#L1169-L1207)
10571149

10581150

1059-
## Storage code
1151+
### Storage code
10601152

10611153
The part of the code that is conceptually in the storage layer, but physically in xapi, is located in
10621154
[storage_migrate.ml](https://github.com/xapi-project/xen-api/blob/f75d51e7a3eff89d952330ec1a739df85a2895e2/ocaml/xapi/storage_migrate.ml). There are logically a few separate parts to this file:
@@ -1069,7 +1161,7 @@ The part of the code that is conceptually in the storage layer, but physically i
10691161

10701162
Let's start by considering the way the storage APIs are intended to be used.
10711163

1072-
### Copying a VDI
1164+
#### Copying a VDI
10731165

10741166
`DATA.copy` takes several parameters:
10751167

@@ -1119,7 +1211,7 @@ The implementation uses the `url` parameter to make SMAPIv2 calls to the destina
11191211
The implementation tries to minimize the amount of data copied by looking for related VDIs on the destination SR. See below for more details.
11201212

11211213

1122-
### Mirroring a VDI
1214+
#### Mirroring a VDI
11231215

11241216
`DATA.MIRROR.start` takes a similar set of parameters to that of copy:
11251217

@@ -1156,11 +1248,11 @@ Note that state is a list since the initial phase of the operation requires both
11561248

11571249
Additionally the mirror can be cancelled using the `MIRROR.stop` API call.
11581250

1159-
### Code walkthrough
1251+
#### Code walkthrough
11601252

11611253
Let's go through the implementation of `copy`:
11621254

1163-
#### DATA.copy
1255+
##### DATA.copy
11641256

11651257
```ocaml
11661258
let copy ~task ~dbg ~sr ~vdi ~dp ~url ~dest =
@@ -1296,7 +1388,7 @@ Finally we snapshot the remote VDI to ensure we've got a VDI of type 'snapshot'
12961388

12971389
The exception handler does nothing - so we leak remote VDIs if the exception happens after we've done our cloning :-(
12981390

1299-
#### DATA.copy_into
1391+
##### DATA.copy_into
13001392

13011393
Let's now look at the data-copying part. This is common code shared between `VDI.copy`, `VDI.copy_into` and `MIRROR.start` and hence has some duplication of the calls made above.
13021394

@@ -1467,7 +1559,7 @@ The last thing we do is to set the local and remote content_id. The local set_co
14671559
Here we perform the list of cleanup operations. Theoretically. It seems we don't ever actually set this to anything, so this is dead code.
14681560

14691561

1470-
#### DATA.MIRROR.start
1562+
##### DATA.MIRROR.start
14711563

14721564
```ocaml
14731565
let start' ~task ~dbg ~sr ~vdi ~dp ~url ~dest =

doc/content/xapi/storage/sxm_mux_inbound.svg

Lines changed: 4 additions & 0 deletions

doc/content/xapi/storage/sxm_mux_outbound.svg

Lines changed: 4 additions & 0 deletions
