Is it feasible to use cell2location to distinguish between malignant and non-malignant states of the same cell type?

## Please use the template below to post a question to https://discourse.scverse.org/c/ecosytem/cell2location/. 

### Problem


After basic annotation of the scRNA-seq data, I used infercnvpy to identify tumor cells, with immune cells as the reference. In my data, ductal cells were split into malignant and non-malignant types. I then used cell2location to learn the cell-type expression signatures from this scRNA-seq reference and to estimate the cell abundance at each spot in the spatial transcriptomics data of the tumor samples. I'm wondering whether cell2location can really estimate tumor cell abundance accurately. In single-cell analysis, inferCNV can identify tumor cells at the level of individual cells, but in spatial transcriptomics each spot may contain a mixture of multiple cell types. My data has a spot size of 55 micrometers, which is low resolution, so applying the single-cell inferCNV approach directly to the spots doesn't seem very suitable.

As shown in the figures below, in some samples the spatial distribution and abundance of malignant and non-malignant ductal cells are very similar, which is what prompted this question. Although the differences are clear in other samples, I would still like to understand what causes this and how to address it.

**This is an example of similar distribution and cell abundance.**

<img width="934" height="452" alt="Image" src="https://github.com/user-attachments/assets/e27b388e-5ec0-4b0f-a290-f3e52359d5a5" />

<img width="1500" height="3900" alt="Image" src="https://github.com/user-attachments/assets/28c9a85c-902c-48b0-9785-2a9f224235fe" />

<img width="866" height="412" alt="Image" src="https://github.com/user-attachments/assets/96c18ffd-5d28-4fe2-b43e-a5ddc3eaf0f4" />

<img width="1500" height="3900" alt="Image" src="https://github.com/user-attachments/assets/f34c917b-46de-4aa0-a06b-a2c66c4e42b3" />

**This is an example where the distribution and cell abundance are not entirely similar.**

<img width="883" height="417" alt="Image" src="https://github.com/user-attachments/assets/27f288fa-dfa2-43be-a620-b25144cdf1d4" />

<img width="1500" height="3900" alt="Image" src="https://github.com/user-attachments/assets/30ef7a44-54bc-4e85-8830-02b08425d2f5" />

**This is an example of significant differences in distribution and cell abundance.**

<img width="884" height="416" alt="Image" src="https://github.com/user-attachments/assets/fce94691-8def-4ef3-a4d9-838c9426324b" />

<img width="1500" height="3900" alt="Image" src="https://github.com/user-attachments/assets/f42764d3-1260-49bb-9da5-71bfb0d2b478" />

<img width="871" height="413" alt="Image" src="https://github.com/user-attachments/assets/3c2643bb-e1f2-49a4-a2a5-883ae2419c53" />

<img width="1500" height="3900" alt="Image" src="https://github.com/user-attachments/assets/58e09a20-30ea-4758-bd71-9b4b0d70e089" />
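One way to put a number on the similarity seen in the maps above is to compute, for each spot, the share of ductal-lineage abundance that is assigned to the malignant state. This is only a minimal sketch: the function name `malignant_fraction`, the `min_total` cutoff, and the example values are hypothetical, and the inputs are assumed to be the per-spot estimates for `Ductal` and `Ductal_tumor` (in the tutorial workflow these are exported to `adata_vis.obsm['q05_cell_abundance_w_sf']`).

```python
def malignant_fraction(ductal, ductal_tumor, min_total=0.5):
    """Per-spot share of ductal-lineage abundance assigned to the malignant state.

    Spots whose combined ductal abundance is below `min_total` get None,
    because the ratio is not meaningful when almost no ductal cells are present.
    """
    fractions = []
    for normal, tumor in zip(ductal, ductal_tumor):
        total = normal + tumor
        fractions.append(tumor / total if total >= min_total else None)
    return fractions


# Hypothetical per-spot abundances for four spots (not real data).
ductal = [4.0, 0.1, 3.0, 0.0]
ductal_tumor = [4.0, 0.2, 1.0, 6.0]

fractions = malignant_fraction(ductal, ductal_tumor)
print(fractions)  # [0.5, None, 0.25, 1.0]
```

If most spots in a sample sit close to 0.5, the model may be splitting the ductal signal evenly between the two states rather than resolving them; fractions that spread toward 0 and 1 would be more consistent with the samples where the maps clearly differ.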

- [x] I follow the instructions from the [cell2location tutorial (using on scvi-tools)](https://cell2location.readthedocs.io/en/latest/notebooks/cell2location_tutorial.html).
- [x] I have adjusted required hyperparameters to my dataset and tissue `N_cells_per_location` and `detection_alpha`.
  - `N_cells_per_location=25`
  - `detection_alpha=20`

  To choose `N_cells_per_location`, I followed the approach described in a related issue and manually estimated the average number of cells per spot. For `detection_alpha`, I followed the tutorial's suggestion and set it to 20.

- [x] I have provided 10X reaction/inlet as `batch_key` for reference NB regression.
- [x] I have checked [scverse Discourse](https://discourse.scverse.org/c/ecosytem/cell2location/) and [old Cell2location Community Forum](https://github.com/BayraktarLab/cell2location/discussions), and did not find a solution.


### Description of the data input and hyperparameters





10X Visium

#### Single cell reference data: number of cells, number of cell types, number of genes



My single-cell reference data:

AnnData object with n_obs × n_vars = 71154 × 16932
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'patients', 'major_cell_type', 'auto_majority_voting', '_indices', '_scvi_batch', '_scvi_labels'
    var: 'symbol', 'MT_gene', 'n_cells', 'nonz_mean'
    uns: '_scvi_manager_uuid', '_scvi_uuid', 'mod'
    obsm: 'MT'
    varm: 'means_per_cluster_mu_fg', 'q05_per_cluster_mu_fg', 'q95_per_cluster_mu_fg', 'stds_per_cluster_mu_fg'

My cell types (18):
['CD4T', 'Bcell', 'CD8T', 'Macro', 'NK', 'DC', 'Mast', 'Mono', 'ILC', 'gdT', 'Acinar', 'CAF', 'Ductal', 'Ductal_tumor', 'Endocrine', 'Endothelial', 'Schwann', 'Stellate']
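Because `Ductal` and `Ductal_tumor` are states of the same lineage, it may help to check how many genes actually separate their reference signatures. The sketch below is an illustration under stated assumptions, not part of cell2location: the helper names, the `pseudocount`, the log2 fold-change threshold, and the gene names and values are all hypothetical. The two inputs are assumed to be per-gene mean expression for each cell type, such as the columns stored in `varm['means_per_cluster_mu_fg']` of the reference object above.

```python
import math


def log2_fold_changes(sig_a, sig_b, pseudocount=1e-3):
    """log2 ratio of signature A over signature B for every shared gene."""
    return {
        gene: math.log2((sig_a[gene] + pseudocount) / (sig_b[gene] + pseudocount))
        for gene in sig_a
        if gene in sig_b
    }


def discriminating_genes(sig_a, sig_b, min_abs_log2fc=1.0):
    """Genes whose expression differs at least 2-fold (by default) between signatures."""
    lfc = log2_fold_changes(sig_a, sig_b)
    return sorted(gene for gene, value in lfc.items() if abs(value) >= min_abs_log2fc)


# Hypothetical signature values (mean expression per gene), not real data.
ductal_sig = {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 0.5, "GENE_D": 1.0}
tumor_sig = {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 2.0, "GENE_D": 1.5}

markers = discriminating_genes(ductal_sig, tumor_sig)
print(markers)  # ['GENE_A', 'GENE_C']
```

A very short list of discriminating genes would suggest that the two signatures are nearly collinear, in which case similar abundance maps for the two states would not be surprising.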

#### Single cell reference data: technology type (e.g. mix of 10X 3' and 5')


High-viability (>80%) sorted single-cell suspensions were used for single-cell RNA sequencing (scRNA-seq). Cells were encapsulated into droplets using the 10x Genomics Chromium Controller, targeting ~10,000 cells. cDNA was amplified by 13 PCR cycles and used for library preparation. Final libraries were quality-checked with a Bioanalyzer 2200 and quantified with a Qubit 3.0, then normalized, pooled, and sequenced on an Illumina NovaSeq 6000 with 150 bp paired-end reads.

#### Spatial data: number of locations numbers, technology type (e.g. Visium, ISS, Nanostring WTA)


For fresh tissue specimens, optimal cutting temperature (OCT)-embedded tissues were cryosectioned at 10 μm and placed on the 6.5 x 6.5 mm capture area of the Visium Spatial Gene Expression Slide following the Visium Spatial Protocols-Tissue Preparation Guide (10x Genomics, CG00039 Rev D). Sections were fixed in methanol, stained with H&E, and imaged at ×20 magnification. Tissues were then permeabilized for 12 min, and cDNA libraries were constructed following the Visium Spatial Gene Expression Reagent Kits User Guide. Libraries were sequenced on the NovaSeq 6000 system.
For FFPE samples, 4 x 10-μm FFPE sections were collected for RNA extraction with the RNeasy FFPE Kit (Qiagen, 73504). The DV200 of the extracted RNA was evaluated using the Agilent RNA 6000 Pico Kit. Samples with DV200 > 50% were selected for sectioning, deparaffinization, and H&E staining on the Visium Spatial Slide with a 6.5 x 6.5 mm capture area. Probe hybridization, probe ligation, probe release and extension, and library construction were conducted following the Visium Spatial Gene Expression Reagent Kit for FFPE User Guide (10x Genomics, CG000407 Rev A).
