11---
22title : " Few-Shot Learning for Rooftop Detection in Satellite Imagery"
33subtitle : " Deep Learning Tutorial"
4- author : " Giorgio Coppala , Nadine Daum, Elena Dreyer, Nico Reichardt"
4+ author : " Giorgio Coppola , Nadine Daum, Elena Dreyer, Nico Reichardt"
55bibliography : refs.bib
66
77
88resources :
99 - img/**
10- - figures/**
1110
1211format :
1312 revealjs :
@@ -29,86 +28,151 @@ format:
2928---
3029
3130
31+ ## Policy Relevance
32+
33+ - Many public authorities face the problem of **limited labeled data**
34+ - Annotation is expensive, slow, or requires domain expertise
35+
36+ - **Applications:**
37+ - Medical sector: **rare disease detection**
38+ - Emergency management: **flood extent mapping**
39+ - Climate & energy: **solar PV rooftop assessment**
40+ - Urban planning: **building footprints & infrastructure mapping**
41+
42+ - **Few-shot learning (FSL)** can help:
43+ - Learns to **generalize** from *1–5 labeled support examples per class*
44+ - (In our case) learns **feature embeddings** and constructs **class prototypes**
45+ - Enables segmentation in a **new city** with *minimal additional annotation*
46+
47+
3248
3349## Problem Setting
3450
35- ::: columns
36- ::: column
37- - Cities need accurate rooftop maps to plan and scale ** solar PV installations**
51+ ::: {.columns}
3852
39- - Manual rooftop labeling is ** slow ** and ** costly **
53+ ::: {.column width="55%"}
4054
41- - Every city looks different → traditional models do not ** generalize** well
55+ - Goal of the tutorial: apply **Prototypical Networks** to
56+ rooftop segmentation using only a few labeled tiles
4257
43- ### Idea:
58+ - **Few-shot segmentation** allows the model to learn characteristic
59+ rooftop shapes and textures from a small Geneva subset
4460
45- - ** Few-shot learning ** makes segmentation possible with only a handful of labeled examples
61+ - Demonstrates how rooftop maps can be produced for solar potential estimation in a **new geographic setting** with limited labels
4662
4763:::
4864
49- ::: column
65+ ::: {.column width="45%"}
66+
5067![](figures/picture_use_case.png){width=90% style="margin-left: 30%;"}
68+
5169:::
70+
5271:::
5372
5473
55- ## Dataset: [ Rooftops of Geneva] ( https://huggingface.co/datasets/raphaelattias/overfitteam-geneva-satellite-images )
5674
57- ::: {.two-col-60-40}
58- ::: {.col-left}
75+ ## Dataset: [Roofs of Geneva](https://huggingface.co/datasets/raphaelattias/overfitteam-geneva-satellite-images)
5976
60- - ** Satellite Images** : High-resolution RGB satellite images of Geneva available on Huggingface
6177- **Size**: 1,050 labeled image-mask pairs
78+
6279- **Task**: Binary segmentation masks (rooftop vs background)
80+
6381- **Geographic splits**: 3 grids/neighborhoods (North, Center, South)
82+
6483- **Image size**: 250x250 pixels
84+
6585- **Categories**: Industrial, Residential
6686
67- :::
6887
69- ::: {.col-right}
88+ ## Inside the dataset
7089
71- ![ ] ( figures/grids_animation.gif ) {width=80%}
72- ![ ] ( figures/geneva-map-gif.gif ) {width=80%}
90+ <div style="text-align:center;">
91+ ![](figures/grids_animation.gif){width="50%"}
92+ </div>
93+
94+ <div style="font-size:0.75rem; text-align:center; color:#666; margin-top:0.5rem;">
95+ Geneva animation: raw image → rooftop overlay → binary mask
96+ </div>
7397
74- :::
75- :::
7698
7799
78100## Few-Shot Learning in General
79101
80- tbd
102+ #### Few-Shot Learning (FSL)
103+ - Learning new **tasks, labels, or segmentations** from very few labeled examples
104+ *(N-way, K-shot: N classes, K labeled examples per class)*
81105
106+ #### Few-Shot Semantic Segmentation (FSSS)
107+ - **Goal**: Segment novel object classes using only a few annotated examples
108+ - Assigns a class label to **every pixel**
82109
83- ## Prototypical Network
84110
85- ::: {.two-col-80-20}
86- ::: {.col-left}
87- ![ ] ( figures/illustration_prototypical_network.png ) {width=100%}
111+ ---
88112
89- * (modified from) * [ SRPNet ] ( https://arxiv.org/abs/2210.16829 )
113+ ## Prototypical Networks (ProtNets)
90114
91- :::
115+ * Learn a shared **embedding space** via a backbone model
116+ * Pixels belonging to the same class are **close in feature space**
117+ * Class representations are formed as **prototypes**
118+ * Training follows an **episodic framework**
119+ * Each episode consists of:
120+ - **Support set**:
121+ Few images with **pixel-level masks**
122+ Defines the target classes
123+ - **Query image**:
124+ Image where the model must segment the target classes
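The episode structure above can be sketched in a few lines. This is a minimal illustration, not the tutorial's actual data loader: the function name `sample_episode`, the tile IDs, and the choice of one query tile per episode are all assumptions made for the example.

```python
import random


def sample_episode(tile_ids, k_shot, n_query=1, seed=None):
    """Draw one 1-way, K-shot episode: K support tiles plus disjoint query tiles.

    `tile_ids` is a hypothetical list of identifiers for labeled image-mask pairs.
    """
    if k_shot + n_query > len(tile_ids):
        raise ValueError("not enough tiles for the requested episode size")
    rng = random.Random(seed)
    # Sampling without replacement keeps support and query tiles disjoint.
    drawn = rng.sample(tile_ids, k_shot + n_query)
    return drawn[:k_shot], drawn[k_shot:]


# Hypothetical tile IDs standing in for labeled Geneva tiles.
tiles = [f"tile_{i:03d}" for i in range(20)]
support, query = sample_episode(tiles, k_shot=5, seed=0)
```

Keeping the support and query tiles disjoint means the model is always evaluated on an image it did not build its prototype from.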
92125
93- ::: {.col-right}
94- - high-level schematic (support → prototype → similarity → segmentation)
95- - 1-way-1-shot → explain what it means
96- - Data preprocessing (augmentation, geographic splits)
97- - Model architecture (feature extraction, CNN layers, backbone)
98- - Training strategy
99- - Loss function
100- - Evaluation metrics
126+ ## Prototypical Network Overview
101127
102- :::
103- :::
128+ #### Workflow
129+ * Support Image → Prototype → Similarity → Query Segmentation
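One way to read the workflow above in code is shown below. It is a sketch under stated assumptions: the slides do not say how prototypes or similarities are computed, so masked average pooling, cosine similarity, and the extra background prototype are choices made for illustration, and all function names are hypothetical. Feature maps are plain NumPy arrays of shape `(C, H, W)` standing in for backbone outputs.

```python
import numpy as np


def masked_average_prototype(features, mask):
    """Average the (C, H, W) features over the pixels selected by the (H, W) mask."""
    mask = mask.astype(features.dtype)
    total = mask.sum()
    if total == 0:
        raise ValueError("mask selects no pixels")
    return (features * mask).sum(axis=(1, 2)) / total


def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between every pixel embedding and the prototype -> (H, W)."""
    dot = np.tensordot(prototype, features, axes=([0], [0]))
    norms = np.linalg.norm(features, axis=0) * np.linalg.norm(prototype)
    return dot / (norms + eps)


def segment_query(support_feats, support_mask, query_feats):
    """Label a query pixel as rooftop if it is closer to the rooftop prototype."""
    fg = masked_average_prototype(support_feats, support_mask)
    bg = masked_average_prototype(support_feats, 1 - support_mask)  # assumed background prototype
    sim_fg = cosine_similarity_map(query_feats, fg)
    sim_bg = cosine_similarity_map(query_feats, bg)
    return (sim_fg > sim_bg).astype(np.uint8)


# Toy 2-channel features: channel 0 fires on rooftops, channel 1 on background.
support_mask = np.array([[1, 0], [1, 0]])
support_feats = np.stack([support_mask, 1 - support_mask]).astype(float)
query_truth = np.array([[0, 1], [1, 1]])
query_feats = np.stack([query_truth, 1 - query_truth]).astype(float)
pred = segment_query(support_feats, support_mask, query_feats)
```

In this toy setup the embedding is perfectly separable, so `pred` reproduces `query_truth`. With real backbone features the two similarity maps would typically be turned into per-pixel class scores for the loss.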
130+
131+
132+ #### Feature Extraction
133+ * ** Backbone:** ResNet-18 CNN, pretrained on ImageNet
134+ * ** Projection:** feature maps → embedding dimension (256 channels)
135+
136+
137+ #### Evaluation Metric
138+ $$
139+ \mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}
140+ $$
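As a small worked example of the metric, take $A$ to be the set of pixels predicted as rooftop and $B$ the set of ground-truth rooftop pixels. The helper name `iou` and the convention for two empty masks are assumptions of this sketch, not taken from the tutorial code.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, true).sum() / union)


pred = [[1, 1], [0, 0]]
true = [[1, 0], [1, 0]]
score = iou(pred, true)  # one shared pixel out of three covered pixels
```

A mean IoU over a test set is then the average of this score across all test tiles.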
141+
142+
143+ ---
144+
145+ ## Prototypical Network Overview
146+
147+ ![ ] ( figures/illustration_prototypical_network.png ) {width=100% fig-align="center"}
148+
149+ <div style =" font-size :0.75rem ; text-align :center ; color :#666 ; margin-top :0.5rem ;" >
150+ Modified figure from <a href =" https://arxiv.org/abs/2210.16829 " >(Ding et al. 2022)</a >
151+ </div >
152+
153+
154+ ---
155+
156+ ## (Preliminary) Results
157+
158+ #### (1) Meta training loss
159+
160+ The “avg episode loss” at each epoch is the average cross-entropy loss over all support–query tasks in that epoch. A decreasing loss indicates that the encoder is learning a feature space in which prototype-based segmentation works increasingly well.
161+
162+ ![](figures/meta_training_loss.png){width="50%" fig-align="center"}
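The averaging behind the plotted curve can be sketched as follows. This assumes a binary (rooftop vs background) pixel-wise cross-entropy on predicted rooftop probabilities. The tutorial's actual loss may be formulated differently, and the function names and toy numbers are hypothetical.

```python
import math


def pixel_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over the pixels of one query mask."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def avg_episode_loss(episodes):
    """Average the per-episode loss over all episodes of one epoch."""
    losses = [pixel_cross_entropy(probs, labels) for probs, labels in episodes]
    return sum(losses) / len(losses)


# Two toy episodes: (predicted rooftop probabilities, ground-truth labels) per pixel.
episodes = [([0.9, 0.2, 0.8], [1, 0, 1]), ([0.6, 0.4], [1, 0])]
loss = avg_episode_loss(episodes)
```

Each episode contributes equally to the epoch average here, regardless of how many pixels its query image has.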
163+
164+
165+ ---
104166
105167## (Preliminary) Results
106168
107- - Show performance for 1-shot / 5-shot / full-data comparison
169+ #### (2) Predicted masks
108170
109- - Show predicted masks
171+ With 5-shot learning, the predicted masks reach a mean IoU of 0.485 over 102 test samples.
110172
173+ Here is an example:
111174
175+ ![](figures/predicted_mask.png){width=80% fig-align="center"}
112176
113177## Discussion
114178
141205</div >
142206
143207
144-
145208## References
146209
147210::: {.refs-super-small}
174237:::
175238
176239
177-
178-
179-
180-
181-
182-
183-