Zebrafish melanoma

You can download them from STMiner-test-data. You can also download the raw dataset from GEO.

Import package

from STMiner.SPFinder import SPFinder

Load data

file_path = 'Path/to/your/h5ad/file'
sp = SPFinder()
sp.read_h5ad(file=file_path, bin_size=1)

The parameter bin_size specifies the size of merged cells (spots). If not specified, no merging is performed. If set to 50, 50x50 cells/spots will be merged into a single cell/spot. Due to low sequencing depth in some datasets, cells/spots are often merged during analysis (e.g., stereo-seq). However, 10x data typically does not require merging.
The ST datasets was storaged in .adata object of sp, you can use sp.adata to check them:

sp.adata

Besides, sp Obj has many useful attributes which can be used for visualization or integrated into other pipelines (such as scanpy). See API for more details.

Find SVG

sp.get_genes_csr_array(min_cells=200, log1p=False)
sp.spatial_high_variable_genes(thread=6)

The parameter min_cells was used to filter genes that are too sparse to generate a reliable spatial distribution.
The parameter log1p was used to avoid extreme values affecting the results. For most open-source h5ad files, log1p has already been executed, so the default value here is False.
You can perform STMiner in your interested gene sets. Use parameter gene_list to input the gene list to STMiner. Then, STMiner will only calculate the given gene set of the dataset.
You can see output while computing as follows:

Parsing distance array...: 100%|██████████| 10762/10762 [01:12<00:00, 149.11it/s]
Computing ot distances...:  10%|▉         | 1069/10762 [03:04<31:11,  6.12it/s]  

You can check the spatial varitation of each gene by:

sp.global_distance

Gene	Distance	z-score
myha	1.35E+08	2.771493
vmhcl	1.01E+08	2.470881
zgc:101560	9.95E+07	2.458787
pvalb1	9.82E+07	2.445257
myhz2	9.75E+07	2.437787
...	...	...
rps17	2.61E+05	-3.63207
rpl13	2.48E+05	-3.68506
rpl32	2.43E+05	-3.70327
rsl24d1	2.27E+05	-3.7757
rpl22	1.83E+05	-3.99332

Fit GMM

sp.fit_pattern(n_comp=20, gene_list=list(sp.global_distance[:2000]['Gene']))

You can see output while computing as follows:

Fitting GMM...:  10%|▉         | 190/2000 [00:42<04:36,  6.54it/s] 

n_comp： Number of components for each GMM model
gene_list: Gene list to fit GMM model

Build distance array

sp.build_distance_array()

This step calculates the distance between genes’ spatial distributions. You can visualize the distance array by:

import seaborn as sns
sns.clustermap(sp.genes_distance_array)

build distance matrix & clustering

sp.cluster(n_clusters=6)

n_clusters: Number of cluster

Result & Visualization

The result are stored in genes_labels:

spf.genes_labels

The output looks like the following:

	gene_id	labels
0	Cldn5	2
1	Fyco1	2
2	Pmepa1	2
3	Arhgap5	0
4	Apc	5
..	...	...
95	Cyp2a5	0
96	X5730403I07Rik	0
97	Ltbp2	2
98	Rbp4	4
99	Hist1h1e	4

To visualize the patterns by heatmap:

sp.get_pattern_array()
sp.plot.plot_pattern(heatmap=False,
                     s=5,
                     rotate=True,
                     reverse_y=True,
                     reverse_x=True,
                     vmax=95,
                     cmap='Spectral_r',
                     output_path='./')

heatmap: If True, plot a heatmap. If False, plot a scatterplot. False is the default.
s: Spot size
rotate\reverse_y\reverse_x: Adjust the axis of plot.
cmap: cmap of plot
vmax: The percentage of the highest value of plots. Avoid the effect of large values for visualization.
output_path: If set, save the figure to path To visualize the genes by labels:

sp.plot.plot_genes(label=0, n_gene=8, s=5, reverse_y=True, reverse_x=True)

n_gene: Number of genes to visualize

To visualize the specific gene (such as BRAFhuman):

hcc2l.plot.plot_gene('BRAFhuman', 
                     spot_size=10,
                     global_matrix_spot_size=10,
                     rotate=True, 
                     reverse_y=True, 
                     reverse_x=True, 
                     vmax=95, 
                     cmap='Spectral_r',
                     figsize=(5,5),
                     save_path='./',
                     format='png')

reverse_y, reverse_x, rotate is optional, they are used to adjust coordinate here.