Commit 85a7b69

Improvements and feature additions for version 2.0
1 parent 16020ee commit 85a7b69

9 files changed

Lines changed: 108 additions & 32 deletions

README.md

Lines changed: 13 additions & 10 deletions
@@ -191,6 +191,7 @@ There are currently available the following commands:
 - `-include_geojson=<yes or no to include cell segmentation as regions>` : example -> -include_geojson=yes. **OBS**: this includes the cell segmentation as regions in the TissUUmaps project. If this is not set, no regions will be included
 - `-compress_geojson=<yes or no to compress geojson regions into pbf>` : example -> -compress_geojson=yes. **OBS**: this includes the cell segmentation regions as a compressed pbf file
 - `-include_html=<yes or no to export html page for sharing the TissUUmaps project on the web>` : example -> -include_html=yes. **OBS**: this includes the html page for sharing the TissUUmaps project on the web. A web server is needed to visualize the exported web page
+- `-launch=<yes or no to automatically open the exported web page in a browser>` : example -> -launch=yes. **OBS**: requires `-include_html=yes`. Starts a local HTTP server on the first available port (starting at 8080) serving the `TissUUmaps_webexport` folder and opens it in the default browser. The server keeps running until the process is stopped with Ctrl+C
 
 *Example 2*: contents of `pipex_batch_list.txt` for the images from *example 1*
 <code>
@@ -291,7 +292,9 @@ If you add the `generate_tissuumaps` command to PIPEX command list a `anndata_Ti
 - Install TissUUmaps (https://tissuumaps.github.io/TissUUmaps-docs/docs/intro/installation.html)
 - Load the `anndata_TissUUmaps.h5ad` file in TissUUmaps
 
-If you add the `include_html=yes` parameter to the `generate_tissuumaps` command, a `TissUUmaps_webexport` folder will be generated in your analysis/downstream sub-folder. You can share this file on a web server, and access it from any web browser.
+If you add the `include_html=yes` parameter to the `generate_tissuumaps` command, a `TissUUmaps_webexport` folder will be generated in your analysis/downstream sub-folder. You can share this folder on a web server, and access it from any web browser.
+
+If you also add `launch=yes`, PIPEX will automatically start a local HTTP server and open the result in your default browser once the export is complete; no separate web server setup is needed. The server runs on the first available port starting at 8080 and can be stopped with `Ctrl+C`.
 
 **NOTE**: TissUUmaps requires your images to be in `TIFF` format and be named exactly as your markers (for example: `DAPI.tif`, `CPEP.tif`, etc...)
 
@@ -475,7 +478,7 @@ Annex 4: Cluster refinement procedure
 
 PIPEX's analysis step includes the possibility to perform multiple refinements of the unsupervised clustering results (leiden and/or kmeans). This can help you with the manual annotation and merging of the clusters automatically discovered.
 
-The idea behind the cluster refinement algorithm is to explore the ranked genes associated to each cluster and try to match them with rules stated by the user. The algorithm then assigns a confidence score per cluster and rule, depending how close its ranked genes are to the rule/s definition/s. Finally, the refinement picks per cluster the annotated cluster with higher confidence (ties are solved by row order).
+The idea behind the cluster refinement algorithm is to explore the ranked genes associated with each cluster and try to match them against rules stated by the user. The algorithm then assigns a confidence score per cluster and rule, depending on how close its ranked genes are to the rule definitions. Finally, the refinement picks per cluster the first annotation whose confidence meets or exceeds its threshold (ties are solved by row order).
 
 To use the cluster refinement, you have to create a `cell_types.csv` file with rows containing the following information:
 - `ref_id`: used as a suffix for the manually annotated cluster name. The final cluster name will be `leiden_ref[ref_id]` or `kmeans_ref[ref_id]`. Each unique `ref_id` group is an independent parallel refinement — it produces its own output column and JSON report, it does not filter the results of a previous ref_id. A typical use is a first ref_id with strict rules (`high` level, higher `min_confidence`) for well-defined populations, and a second ref_id with looser rules to catch remaining ambiguous clusters.
@@ -490,14 +493,14 @@ Here's an example of how a `cell_types.csv` file usually looks:
 <code>
 
 ref_id,cell_group,cell_type,cell_subtype,rank_filter,min_confidence,marker1,rule1,marker2,rule2,marker3,rule3
-1,artifact,fold,unknown,all,10,CBS,high,CHGA,high,AMY2B,high
-1,endocrine,islet,all,positive_only,10,CHGA,high,CPEP,high,AMY2B,low
-1,exocrine,acinar,unknown1,all,10,CBS,high,AMY2B,high
-1,endothelial,vessels,all,positive_only,30,CD31,high,aSMA,high
-1,epithelial,ductal,unknown,all,10,KRT19,high,PANCK,high
-1,immune,potential,artifact,all,10,HLADR,high,NPDC1,high,aSMA,low
-2,immune,potential,artifact,all,0,HLADR,medium,NPDC1,medium
-2,epithelial,ductal,unknown,all,0,KRT19,medium,PANCK,medium
+1,artifact,fold,unknown,all,25,CBS,high,CHGA,high,AMY2B,high
+1,endocrine,islet,all,positive_only,25,CHGA,high,CPEP,high,AMY2B,low
+1,exocrine,acinar,unknown1,all,25,CBS,high,AMY2B,high,,
+1,endothelial,vessels,all,positive_only,30,CD31,high,aSMA,high,,
+1,epithelial,ductal,unknown,all,25,KRT19,high,PANCK,high,,
+1,immune,potential,artifact,all,25,HLADR,high,NPDC1,high,aSMA,low
+2,immune,potential,artifact,all,10,HLADR,medium,NPDC1,medium,,
+2,epithelial,ductal,unknown,all,10,KRT19,medium,PANCK,medium,,
 
 </code>
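The scoring behind these rows is implemented in `analysis.py`'s `refine_clustering`. As a rough illustration only (the function below is a simplified stand-in written for this note, not PIPEX's exact algorithm), a row's (marker, rule) pairs can be matched against a cluster's ranked genes like this:

```python
# Hypothetical, simplified sketch of rule matching: each (marker, level) pair is
# checked against a cluster's ranked genes, and the fraction of satisfied pairs
# stands in for PIPEX's confidence score. Not the exact PIPEX implementation.

def rule_confidence(ranked_genes, rules, top_n=5):
    """ranked_genes: markers ordered by rank score, highest first.
    rules: list of (marker, level) pairs, level in {'high', 'medium', 'low'}."""
    top = set(ranked_genes[:top_n])
    bottom = set(ranked_genes[-top_n:])
    hits = 0
    for marker, level in rules:
        if level == 'high' and marker in top:
            hits += 1
        elif level == 'low' and marker in bottom:
            hits += 1
        elif level == 'medium' and marker not in top and marker not in bottom:
            hits += 1
    return 100 * hits // len(rules)  # integer percentage, like min_confidence

ranked = ['CHGA', 'CPEP', 'KRT19', 'CD31', 'aSMA', 'HLADR', 'CBS', 'AMY2B']
print(rule_confidence(ranked, [('CHGA', 'high'), ('CPEP', 'high'), ('AMY2B', 'low')], top_n=3))  # → 100
```

A cluster ranking CHGA and CPEP at the top and AMY2B at the bottom fully satisfies the islet row above, so it would clear any `min_confidence` up to 100.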

analysis.py

Lines changed: 24 additions & 12 deletions
@@ -332,13 +332,13 @@ def calculate_cluster_info(adata, cluster_type, markers):
         sq.pl.nhood_enrichment(adata, cluster_key=cluster_type, method="single", show=False,
                                save='nhood_enrichment_' + cluster_type + '.jpg')
     except Exception as e:
-        log('Neighborhood calculations failed for cluster ' + cluster_type)
+        log('Neighborhood calculations failed for cluster ' + cluster_type + ': ' + str(e))
 
     try:
         sq.gr.interaction_matrix(adata, cluster_key=cluster_type)
         sq.pl.interaction_matrix(adata, cluster_key=cluster_type, show=False, save='interaction_matrix_' + cluster_type + '.jpg')
     except Exception as e:
-        log('Interaction matrix analysis failed for cluster ' + cluster_type)
+        log('Interaction matrix analysis failed for cluster ' + cluster_type + ': ' + str(e))
 
     try:
         sc.tl.rank_genes_groups(adata, cluster_type, method='t-test')
@@ -350,6 +350,19 @@ def calculate_cluster_info(adata, cluster_type, markers):
                                 save='_' + cluster_type)
 
 
+def _sort_json_keys(obj):
+    if isinstance(obj, dict):
+        def _key(k):
+            try:
+                return (0, int(k), '')
+            except (ValueError, TypeError):
+                return (1, 0, str(k))
+        return {k: _sort_json_keys(v) for k, v in sorted(obj.items(), key=lambda x: _key(x[0]))}
+    if isinstance(obj, list):
+        return [_sort_json_keys(i) for i in obj]
+    return obj
+
+
 def refine_clustering(adata, cluster_type, curr_ref_id, cell_types_ref):
     clustering_merge_data = {}
     clustering_merge_data['scores'] = {}
@@ -406,20 +419,18 @@ def refine_clustering(adata, cluster_type, curr_ref_id, cell_types_ref):
         best_candidate = None
         best_real_confidence = 0
         for curr_cell_type in clustering_merge_data['cell_types'][cluster_id]:
-            if (best_candidate is None or best_candidate['prob'] < curr_cell_type['prob']) and curr_cell_type['prob'] >= int(curr_cell_type['confidence_threshold']):
-                best_candidate = { 'cell_type': curr_cell_type['cell_type'], 'prob' : curr_cell_type['prob'] / 100 } #, 'real_confidence' : '{:.1%}'.format(curr_cell_type['prob'])} # * len(clustering_merge_data['cell_types'][cluster_id])) / 100.0) }
+            if curr_cell_type['prob'] >= int(curr_cell_type['confidence_threshold']):
+                best_candidate = { 'cell_type': curr_cell_type['cell_type'], 'prob' : curr_cell_type['prob'] / 100 }
                 best_real_confidence = curr_cell_type['prob']
+                break
 
         if best_real_confidence > 0:
             clustering_merge_data['candidates'][cluster_id] = best_candidate
             adata.obs.loc[adata.obs[cluster_type + "_ref" + curr_ref_id] == cluster_id, cluster_type + "_ref" + curr_ref_id] = best_candidate['cell_type']
             adata.obs.loc[adata.obs[cluster_type + "_ref" + curr_ref_id + "_p"] == cluster_id, cluster_type + "_ref" + curr_ref_id + "_p"] = '{:.1%}'.format(best_candidate['prob']) #best_candidate['real_confidence'][:-1]
 
-    clustering_merge_data["scores"] = OrderedDict(sorted(clustering_merge_data["scores"].items()))
-    clustering_merge_data["cell_types"] = OrderedDict(sorted(clustering_merge_data["cell_types"].items()))
-    clustering_merge_data["candidates"] = OrderedDict(sorted(clustering_merge_data["candidates"].items()))
     with open(os.path.join(data_folder, 'analysis', 'downstream', 'cell_types_result_' + cluster_type + curr_ref_id + '.json'), 'w') as outfile:
-        json.dump(clustering_merge_data, outfile, indent = 4)
+        json.dump(_sort_json_keys(clustering_merge_data), outfile, indent = 4)
 
 
 #Function to perform different cluster methods
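The loop change above switches the selection from "highest probability that clears its threshold" to "first row that clears its threshold", which is what makes row order in `cell_types.csv` act as a priority. The new behavior in isolation (the flat dict structure here is a simplified assumption for the demo):

```python
def pick_first_match(candidates):
    # New behavior: candidates are tried in CSV row order, and the first one
    # whose probability meets its own confidence threshold wins, so row order
    # is a priority; the old code kept scanning for the highest probability.
    for c in candidates:
        if c['prob'] >= int(c['confidence_threshold']):
            return c['cell_type']
    return None

candidates = [
    {'cell_type': 'islet', 'prob': 40, 'confidence_threshold': '25'},
    {'cell_type': 'acinar', 'prob': 90, 'confidence_threshold': '25'},
]
print(pick_first_match(candidates))  # → islet (the old best-probability logic would pick acinar)
```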
@@ -449,7 +460,7 @@ def clustering(df_norm, markers):
         log("Dataset too big to create spatial plots per marker")
 
     #We calculate PCA, neighbors and UMAP for the anndata
-    sc.pp.pca(adata, n_comps=min(len(markers), 50))
+    sc.pp.pca(adata, n_comps=min(len(markers), 50, adata.n_obs - 1, adata.n_vars - 1))
 
     pca_loadings = adata.varm['PCs']
     loadings_df = pd.DataFrame(pca_loadings, index=adata.var_names, columns=[f'PC{i + 1}' for i in range(pca_loadings.shape[1])])
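The extra `adata.n_obs - 1` / `adata.n_vars - 1` terms keep `n_comps` valid on small datasets, since PCA cannot produce more components than `min(n_obs, n_vars)`. The clamp in isolation:

```python
def clamp_n_comps(n_markers, n_obs, n_vars, cap=50):
    # PCA yields at most min(n_obs, n_vars) components; staying one below that
    # bound (and under the cap) avoids solver errors on tiny test datasets
    return min(n_markers, cap, n_obs - 1, n_vars - 1)

print(clamp_n_comps(n_markers=30, n_obs=12, n_vars=30))  # → 11
```

With only 12 cells, the old `min(len(markers), 50)` would have requested 30 components and failed; the clamp reduces it to 11.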
@@ -474,6 +485,9 @@ def clustering(df_norm, markers):
     sc.pp.neighbors(adata, n_neighbors=num_neighbors)
     log("Neighbors graph calculated")
 
+    sq.gr.spatial_neighbors(adata, coord_type="generic", n_neighs=num_neighbors)
+    log("Spatial neighbors graph calculated")
+
     sc.tl.umap(adata)
     log("UMAP calculated")
     sc.pl.umap(adata, show=False, save='_base')
@@ -583,8 +597,6 @@
     if neigh_cluster_id not in adata.obs:
         adata.obs[neigh_cluster_id] = df_norm[neigh_cluster_id].astype('category')
     try:
-        sq.gr.spatial_neighbors(adata, coord_type="generic", n_neighs=num_neighbors)
-        log("Spatial neighbors graph calculated")
         sq.gr.centrality_scores(adata, neigh_cluster_id)
         sq.pl.centrality_scores(adata, neigh_cluster_id, save=(neigh_cluster_id + "_centrality_scores.jpg"))
         log("Neighborhood centrality scores calculated")
@@ -672,7 +684,7 @@
 def neighborhood_cell_type_analysis(adata, neigh_cluster_id, k_values, density_threshold, data_folder, image_size):
     k_values = sorted(set(k_values))[:3]
     cell_types = adata.obs[neigh_cluster_id].astype(str).values
-    unique_types = sorted(set(cell_types))
+    unique_types = sorted(set(cell_types), key=lambda x: (0, int(x), '') if x.lstrip('-').isdigit() else (1, 0, x))
     n_types = len(unique_types)
     type_to_idx = {t: i for i, t in enumerate(unique_types)}
     cell_type_idx = np.array([type_to_idx[t] for t in cell_types])
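A caveat on sorting labels with a key that returns `int` for numeric strings and the bare `str` otherwise: Python 3 cannot order mixed `int`/`str` keys, so a label set mixing cluster numbers with annotated names would raise `TypeError`. A tuple-shaped key (the same trick `_sort_json_keys` uses) sorts numeric labels first, numerically, and is safe for mixed sets:

```python
def natural_label_key(label):
    # numeric labels sort first and numerically; any other label sorts
    # alphabetically after them, so int and str are never compared directly
    return (0, int(label), '') if label.lstrip('-').isdigit() else (1, 0, label)

print(sorted({'10', '2', 'islet', 'acinar'}, key=natural_label_key))  # → ['2', '10', 'acinar', 'islet']
```

This matters here because `neigh_cluster_id` may point at a refined cluster column whose values are cell-type names rather than numbers.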

changelog.md

Lines changed: 5 additions & 0 deletions
@@ -41,6 +41,11 @@ Changelog
 - **LMD export**
   New output mode for `generate_filtered_masks.py` that produces an XML cutting file compatible with Leica's Laser Microdissection software. Four parameters control the output geometry: `-shape_dilation` expands each cell outline by a given number of pixels, `-convolution_smoothing` controls contour smoothness, `-path_optimization` selects the cutting path order strategy (none, Hilbert, or greedy), and `-distance_heuristic` merges nearby shapes into a single cutting group to reduce stage movements.
 
+#### TissUUmaps export
+
+- **`-launch` parameter for `generate_tissuumaps.py`**
+  When set to `yes`, automatically starts a local HTTP server serving the `TissUUmaps_webexport` folder after export completes and opens the result in the default browser. The server runs on the first available port starting at 8080 and keeps running until the process is interrupted with `Ctrl+C`. Requires `include_html=yes` to have generated the webexport folder first.
+
 #### Extra scripts
 
 - **`extra/` folder**

generate_filtered_masks.py

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
 import os
 import argparse
 import datetime
+import matplotlib
+matplotlib.use('Agg')
 import pandas as pd
 import numpy as np
 from skimage.io import imsave

generate_geojson.py

Lines changed: 1 addition & 1 deletion
@@ -106,7 +106,7 @@ def options(argv):
     for marker in markers:
         cell_data["properties"]["measurements"].append({
             "name" : marker,
-            "value" : float(cell[marker])
+            "value" : float(cell[marker]) if pd.notna(cell[marker]) else 0.0
         })
 
     #if cluster_id parameter is selected, add cluster_id and cluster_color
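The guard matters because `float('nan')` survives Python's `json.dump` as the literal token `NaN`, which is not valid JSON and which strict parsers (including browsers' `JSON.parse`) reject. The same guard with only the standard library, `math.isnan` standing in for `pd.notna`:

```python
import json
import math

def safe_float(value):
    # mirror of the pd.notna guard above: NaN (or unconvertible) measurement
    # values become 0.0 so the serialized geojson stays parseable everywhere
    try:
        v = float(value)
    except (TypeError, ValueError):
        return 0.0
    return 0.0 if math.isnan(v) else v

measurements = [{"name": m, "value": safe_float(v)}
                for m, v in [("DAPI", 12.5), ("CPEP", float("nan"))]]
print(json.dumps(measurements))  # → [{"name": "DAPI", "value": 12.5}, {"name": "CPEP", "value": 0.0}]
```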

generate_tissuumaps.py

Lines changed: 40 additions & 1 deletion
@@ -1,12 +1,17 @@
+import os
+os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'
 import scanpy as sc
 import scipy
-import os
 import fnmatch
 import json
 import copy
 import datetime
 import sys
 import argparse
+import threading
+import webbrowser
+import http.server
+import socketserver
 from skimage.measure import approximate_polygon
 import numpy as np
 from pipex_utils import log
@@ -16,6 +21,7 @@
 include_geojson = "no"
 compress_geojson = "no"
 include_html = "no"
+launch = "no"
 
 def find_marker_file(folder, marker):
     """Return the filename of the first image in folder whose name ends with marker."""
@@ -173,6 +179,8 @@ def options(argv):
                         help='compress geojson regions into pbf : example -> -compress_geojson=yes')
     parser.add_argument('--include_html', choices=['yes', 'no'], default='no',
                         help='export html page for web sharing : example -> -include_html=yes')
+    parser.add_argument('--launch', choices=['yes', 'no'], default='no',
+                        help='launch local web server and open browser after export : example -> -launch=yes')
     if not argv:
         parser.print_help()
         sys.exit()
@@ -185,6 +193,7 @@ def options(argv):
     include_geojson = args.include_geojson
     compress_geojson = args.compress_geojson
     include_html = args.include_html
+    launch = args.launch
 
     pidfile_filename = './RUNNING'
     if "PIPEX_WORK" in os.environ:
@@ -199,4 +208,34 @@ def options(argv):
 
     exporting_tissuumaps()
 
+    if launch == "yes":
+        webexport_path = os.path.join(data_folder, 'analysis', 'downstream', 'TissUUmaps_webexport')
+        if not os.path.isdir(webexport_path):
+            print(">>> WARNING: TissUUmaps webexport directory not found, cannot launch", flush=True)
+        else:
+            port = 8080
+            while port < 8200:
+                try:
+                    handler = http.server.SimpleHTTPRequestHandler
+                    handler.log_message = lambda *args: None
+                    httpd = socketserver.TCPServer(("", port), handler)
+                    break
+                except OSError:
+                    port += 1
+            else:
+                print(">>> WARNING: could not find a free port to launch TissUUmaps", flush=True)
+                httpd = None
+            if httpd:
+                os.chdir(webexport_path)
+                thread = threading.Thread(target=httpd.serve_forever, daemon=True)
+                thread.start()
+                url = f"http://localhost:{port}"
+                print(f">>> TissUUmaps running at {url} — press Ctrl+C to stop", flush=True)
+                webbrowser.open(url)
+                try:
+                    thread.join()
+                except KeyboardInterrupt:
+                    httpd.shutdown()
+                    print(">>> TissUUmaps server stopped", flush=True)
+
     log("End time exporting tissuumaps")
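The launch block's port scan leans on Python's `while ... else`: the `else` branch runs only when the loop exhausts its range without hitting `break`. The same scan in isolation, probing with a bare socket instead of constructing the server:

```python
import socket

def find_free_port(start=8080, end=8200):
    # first port we can bind wins; None signals the whole range is taken
    # (same shape as the loop/else in options(): else fires only without break)
    for port in range(start, end):
        try:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.bind(("", port))
            return port
        except OSError:
            continue
    return None
```

One difference worth noting: probing and then binding later is racy, since another process can grab the port in between; constructing the `TCPServer` directly on each candidate, as the commit does, claims the port atomically.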
