:orphan:

Generalized eigendecomposition in decoding
==========================================

.. NOTE: part of this file is included in doc/overview/implementation.rst.
   Changes here are reflected there. If you want to link to this content, link
   to :ref:`ged` to link to that section of the implementation.rst page.
   The next line is a target for :start-after: so we can omit the title from
   the include:
   ged-begin-content

This section describes the mathematical formulation and application of
Generalized Eigendecomposition (GED), often used in spatial filtering
and source separation algorithms, such as :class:`mne.decoding.CSP`,
:class:`mne.decoding.SPoC`, :class:`mne.decoding.SSD` and
:class:`mne.preprocessing.Xdawn`.

The core principle of GED is to find a set of channel weights (a spatial filter)
that maximizes the ratio of signal power between two data features.
These features are defined by the researcher and are represented by two covariance matrices:
a "signal" matrix :math:`S` and a "reference" matrix :math:`R`.
For example, :math:`S` could be the covariance of data from a task time interval,
and :math:`R` could be the covariance from a baseline time interval. For more details, see :footcite:`Cohen2022`.

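As a minimal illustration of how such a pair of covariance matrices could be estimated, here is a plain NumPy sketch with made-up interval boundaries and random data standing in for a recording (illustrative only, not MNE-Python code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_times = 8, 2000
sfreq = 250.0  # hypothetical sampling frequency in Hz
X = rng.standard_normal((n_channels, n_times))  # continuous recording

# Hypothetical interval boundaries: baseline before the task period
baseline = X[:, : int(1.0 * sfreq)]               # first second
task = X[:, int(1.0 * sfreq) : int(4.0 * sfreq)]  # seconds 1 to 4

S = np.cov(task)      # "signal" covariance, shape (n_channels, n_channels)
R = np.cov(baseline)  # "reference" covariance
```

In practice :math:`S` and :math:`R` come from the epochs and conditions relevant to the chosen algorithm; the interval choice here is purely illustrative.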
Algebraic formulation of GED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A few definitions first:
Let :math:`n \in \mathbb{N}^+` be the number of channels.
Let :math:`\text{Symm}_n(\mathbb{R}) \subset M_n(\mathbb{R})` be the vector space of real symmetric matrices.
Let :math:`S^n_+, S^n_{++} \subset \text{Symm}_n(\mathbb{R})` be the sets of real positive semidefinite and positive definite matrices, respectively.
Let :math:`S, R \in S^n_+` be covariance matrices estimated from electrophysiological data :math:`X_S \in M_{n \times t_S}(\mathbb{R})` and :math:`X_R \in M_{n \times t_R}(\mathbb{R})`.

GED (or simultaneous diagonalization by congruence) of :math:`S` and :math:`R`
is possible when :math:`R` is full rank (and thus :math:`R \in S^n_{++}`):

.. math::

   SW = RWD,

where :math:`W \in M_n(\mathbb{R})` is an invertible matrix of eigenvectors
of :math:`(S, R)` and :math:`D` is a diagonal matrix of eigenvalues :math:`\lambda_i`.

Each eigenvector :math:`\mathbf{w} \in W` is a spatial filter that solves
an optimization problem of the form:

.. math::

   \operatorname{argmax}_{\mathbf{w}} \frac{\mathbf{w}^t S \mathbf{w}}{\mathbf{w}^t R \mathbf{w}}

That is, applying the spatial filters :math:`W` to the time-series :math:`X \in M_{n \times t}(\mathbb{R})`:

.. math::

   A = W^t X,

results in "activation" time-series :math:`A` of the estimated "sources",
such that the ratio of their variances,
:math:`\frac{\text{Var}(\mathbf{w}^t X_S)}{\text{Var}(\mathbf{w}^t X_R)} = \frac{\mathbf{w}^t S \mathbf{w}}{\mathbf{w}^t R \mathbf{w}}`,
is sequentially maximized by the spatial filters :math:`\mathbf{w}_i`, sorted according to :math:`\lambda_i`.

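Numerically, this is a standard generalized symmetric eigenproblem. A minimal SciPy sketch with illustrative random data (not MNE-Python's implementation):

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative random data standing in for two recording conditions
rng = np.random.default_rng(0)
n, t = 4, 1000
X_S = rng.standard_normal((n, t))
X_R = rng.standard_normal((n, t))
S, R = np.cov(X_S), np.cov(X_R)  # R is full rank here

# Solve S W = R W D; scipy returns eigenvalues in ascending order
evals, W = eigh(S, R)
order = np.argsort(evals)[::-1]  # sort filters by descending eigenvalue
evals, W = evals[order], W[:, order]

# The first filter maximizes the variance ratio, which equals lambda_1
w = W[:, 0]
ratio = (w @ S @ w) / (w @ R @ w)
```

Note that :func:`scipy.linalg.eigh` normalizes the eigenvectors so that :math:`W^t R W = I`.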
GED in the principal subspace
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Unfortunately, :math:`R` might not be full rank, depending on the data :math:`X_R` (for example, due to average reference, removed PCA/ICA components, etc.).
In such cases, GED can be performed on :math:`S` and :math:`R` in the principal subspace :math:`Q = \operatorname{Im}(C_{ref}) \subset \mathbb{R}^n` of some reference
covariance :math:`C_{ref}` (in the Common Spatial Pattern (CSP) algorithm, for example, :math:`C_{ref}=\frac{1}{2}(S+R)` and GED is performed on :math:`S` and :math:`R'=S+R`).

More formally:
Let :math:`r \leq n` be the rank of :math:`C_{ref} \in S^n_+`.
Let :math:`Q=\operatorname{Im}(C_{ref})` be the principal subspace of :math:`C_{ref}`.
Let :math:`P \in M_{n \times r}(\mathbb{R})` be an isometry formed by an orthonormal basis of :math:`Q`.
Let :math:`f:S^n_+ \to S^r_+`, :math:`A \mapsto P^t A P` be a "restricting" map that restricts the quadratic form
:math:`q_A:\mathbb{R}^n \to \mathbb{R}` to :math:`q_{A|_Q}:Q \to \mathbb{R}` (in practical terms, :math:`q_A` maps
spatial filters to the variance of the spatially filtered data :math:`X_A`).

Then, the GED of :math:`S` and :math:`R` in the principal subspace :math:`Q` of :math:`C_{ref}` is performed as follows:

1. :math:`S` and :math:`R` are transformed to :math:`S_Q = f(S) = P^t S P` and :math:`R_Q = f(R) = P^t R P`,
   such that :math:`S_Q` and :math:`R_Q` are matrix representations of the restricted :math:`q_{S|_Q}` and :math:`q_{R|_Q}`.
2. GED is performed on :math:`S_Q` and :math:`R_Q`: :math:`S_Q W_Q = R_Q W_Q D`.
3. Eigenvectors :math:`W_Q` of :math:`(S_Q, R_Q)` are transformed back to :math:`\mathbb{R}^n`
   by :math:`W = P W_Q \in M_{n \times r}(\mathbb{R})` to obtain :math:`r` spatial filters.

Note that the solution to the original optimization problem is preserved:

.. math::

   \frac{\mathbf{w_Q}^t S_Q \mathbf{w_Q}}{\mathbf{w_Q}^t R_Q \mathbf{w_Q}} = \frac{\mathbf{w_Q}^t (P^t S P) \mathbf{w_Q}}{\mathbf{w_Q}^t (P^t R P) \mathbf{w_Q}} = \frac{\mathbf{w}^t S \mathbf{w}}{\mathbf{w}^t R \mathbf{w}} = \lambda

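The three steps above can be sketched in plain NumPy/SciPy on a rank-deficient toy example (average-referenced random data; illustrative only, not MNE-Python's implementation):

```python
import numpy as np
from scipy.linalg import eigh

# Toy average-referenced data: channel means removed, so covariances are singular
rng = np.random.default_rng(1)
n, t = 5, 500
X_S = rng.standard_normal((n, t))
X_S -= X_S.mean(axis=0)
X_R = rng.standard_normal((n, t))
X_R -= X_R.mean(axis=0)
S, R = np.cov(X_S), np.cov(X_R)

# CSP-style reference covariance and its principal subspace Im(C_ref)
C_ref = 0.5 * (S + R)
evals_ref, evecs_ref = np.linalg.eigh(C_ref)
keep = evals_ref > 1e-10 * evals_ref.max()
P = evecs_ref[:, keep]  # isometry: orthonormal basis of Im(C_ref), shape (n, r)

# 1. Restrict the quadratic forms to the principal subspace
S_Q, R_Q = P.T @ S @ P, P.T @ R @ P
# 2. GED in the subspace (R_Q is now full rank)
evals, W_Q = eigh(S_Q, R_Q)
# 3. Map the r filters back to R^n
W = P @ W_Q
```

For each back-projected filter, the variance ratio on the original matrices still equals the corresponding eigenvalue, as shown in the identity above.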
In addition to restriction, :math:`q_S` and :math:`q_R` can be rescaled based on the whitened :math:`C_{ref}`.
In this case the whitening map :math:`f_{wh}:S^n_+ \to S^r_+`,
:math:`A \mapsto P_{wh}^t A P_{wh}`, transforms :math:`A` into the matrix representation of :math:`q_{A|_Q}` rescaled according to :math:`\Lambda^{-1/2}`,
where :math:`\Lambda` is the diagonal matrix of eigenvalues of :math:`C_{ref}` and so :math:`P_{wh} = P \Lambda^{-1/2}`.

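As a quick sanity check of this definition, :math:`P_{wh}` maps :math:`C_{ref}` itself to the identity on its principal subspace, since :math:`P^t C_{ref} P = \Lambda`. A NumPy sketch with an illustrative rank-deficient matrix (not MNE-Python code):

```python
import numpy as np

# A rank-deficient positive semidefinite "reference" covariance (rank n - 1)
rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n - 1))
C_ref = A @ A.T

evals, evecs = np.linalg.eigh(C_ref)
keep = evals > 1e-10 * evals.max()
P, lam = evecs[:, keep], evals[keep]

# Whitening map: P_wh = P @ Lambda^{-1/2} (columns scaled by 1/sqrt(eigenvalue))
P_wh = P / np.sqrt(lam)

# Restricting *and* rescaling C_ref yields the identity on the r-dim subspace
C_wh = P_wh.T @ C_ref @ P_wh
```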
In MNE-Python, the matrix :math:`P` of the restricting map can be obtained using
::

    _, ref_evecs, mask = mne.cov._smart_eigh(C_ref, ..., proj_subspace=True, ...)
    restr_mat = ref_evecs[mask]

while :math:`P_{wh}` can be obtained using
::

    restr_mat = compute_whitener(C_ref, ..., pca=True, ...)