Commit 404226a

Modified the graph video
1 parent 200e80a commit 404226a

3 files changed

Lines changed: 106 additions & 88 deletions

.gitignore

Lines changed: 0 additions & 2 deletions
@@ -1,2 +0,0 @@
1-
*.mov
2-
*.mp4

index.html

Lines changed: 106 additions & 86 deletions
@@ -157,7 +157,7 @@ <h1 class="title is-1 publication-title"><span class="gradient-text">DAGDiff</sp
157157
<div class="container is-max-desktop">
158158
<div class="columns has-text-centered">
159159
<div class="column is-full-width">
160-
<h2 class="title is-3">Video Explanation</h2>
160+
<h2 class="title is-2">Video Explanation</h2>
161161
<div class="columns is-centered video-container">
162162
<video controls muted poster="./static/images/video_thumbnail.png" preload="none"
163163
src="./static/videos/intro_video.mp4">
@@ -172,14 +172,14 @@ <h2 class="title is-3">Video Explanation</h2>
172172
<div class="container is-max-desktop">
173173
<div class="columns is-centered has-text-centered">
174174
<div class="column is-four-fifths">
175-
<h2 class="title is-3 mt-3">Abstract</h2>
175+
<h2 class="title is-2 mt-3">Abstract</h2>
176176
<div class="content has-text-justified">
177177
Reliable dual-arm grasping is essential for manipulating large and complex objects but remains a
178178
challenging problem due to stability, collision, and generalization requirements. Prior methods
179179
typically decompose the task into two independent grasp proposals, relying on region priors or
180180
heuristics that limit generalization and provide no principled guarantee of stability. We
181-
propose DAGDiff, an end-to-end framework that directly denoises to grasp pairs in the $SE(3)
182-
\times SE(3)$ space. Our key insight is that stability and collision can be enforced more
181+
propose DAGDiff, an end-to-end framework that directly denoises to grasp pairs in the \(SE(3)
182+
\times SE(3)\) space. Our key insight is that stability and collision can be enforced more
183183
effectively by guiding the diffusion process with classifier signals, rather than relying on
184184
explicit region detection or object priors. To this end, DAGDiff integrates geometry-,
185185
stability-, and collision-aware guidance terms that steer the generative process toward grasps
@@ -209,14 +209,14 @@ <h2 class="title is-3 mt-3">Abstract</h2>
209209

210210

211211

212-
<section class="section" style="background-color: rgb(255, 255, 255); margin-bottom:20px">
212+
<section class="section" style="background-color: rgb(255, 255, 255); margin-bottom:0px">
213213
<div class="container is-max-desktop">
214214
<div class="columns has-text-centered">
215215
<div class="column is-full-width">
216-
<h2 class="title is-3">Model Architecture</h2>
217-
<img src="./static/images/pipeline.svg">
216+
<h2 class="title is-2">Model Architecture</h2>
217+
<img class="mt-4" src="./static/images/pipeline.svg">
218218
<div class="content has-text-justified my-4">
219-
<b>Overview of the proposed method</b>: <b>(a)</b> Given an object point cloud P , our network
219+
<b>Overview of the proposed method</b>: <b>(a)</b> Given an object point cloud \(P\), our network
220220
encodes
221221
geometric features into dense feature maps. Next,
222222
randomly initialized dual-arm grasps \(H\) are used to transform a fixed query cloud into query
@@ -240,115 +240,135 @@ <h2 class="title is-3">Model Architecture</h2>
240240

241241
<h4 class="title is-4 has-text-centered">\(SE(3) \times SE(3) \longleftrightarrow \mathbb{R}^{12}\)</h4>
242242

243-
<div class="columns has-text-justified mt-4">
243+
<div class="columns has-text-justified mt-2 mb-5">
244244
<div class="column is-full-width is-flex is-justify-content-center is-align-items-center"">
245245
<img src=" ./static/images/logmap2.svg">
246246
</div>
247247

248248
<div class="column is-full-width">
249-
Additionally, dual-arm grasp poses are represented as pairs of rigid-body transformations
249+
<b>Denoising in the dual-arm grasp space:</b> Additionally, dual-arm grasp poses are represented as pairs of rigid-body transformations
250250
in \(SE(3) \times SE(3)\), which are mapped into a \(12\text{D}\) Euclidean space for diffusion and
251251
back.
252252
Each \(SE(3)\) element is
253253
first projected into its \(6\text{D}\) Lie algebra representation via the <u>logarithmic map</u>
254254
\((\operatorname{Logmap_{2}})\), and
255-
concatenated to form a vector in \(\mathbb{R}^{12}\).
256-
<br/> <br/>
255+
concatenated to form a vector in \(\mathbb{R}^{12}\).
256+
<br /> <br />
257257
The diffusion process is then carried out in
258258
this Euclidean space. To obtain valid grasp poses, the <u>exponential map</u>
259259
\((\operatorname{Expmap_{2}})\) maps vectors in \(\mathbb{R}^{12}\) back to
260260
\(SE(3) \times SE(3)\). This bidirectional mapping enables diffusion while ensuring grasps remain
261261
consistent with rigid-body motion.
262262

263263
</div>
264+
</div>
265+
<hr />
264266

267+
<h4 class="title is-4 has-text-centered">\(\text{Denoising using Classifier Guidance}\)</h4>
265268

269+
<div class="columns has-text-centered">
270+
<div class="column is-full-width mt-2">
271+
<video autoplay loop muted poster="" preload="none" style="width:100%;">
272+
<source src="./static/videos/only_graph_cropped3.mp4">
273+
</video>
274+
</div>
275+
</div>
276+
277+
<!-- Colormap bar -->
278+
<div class="columns has-text-centered my-5">
279+
<div class="column is-full-width">
280+
<div style="
281+
background: linear-gradient(to right, rgb(255, 85, 85), rgb(63, 255, 63));
282+
height: 12px;
283+
border-radius: 30px;
284+
margin: 0 auto;
285+
width: 70%;
286+
position: relative;">
287+
</div>
288+
<div style="display: flex; justify-content: space-between; width: 70%; margin: 5px auto 0 auto; font-size: 0.9rem;">
289+
<span style="color: rgb(182, 1, 1); font-weight: 500;">Noisy Grasp Pairs</span>
290+
<span style="color: rgb(45, 150, 45); font-weight: 500;">Stable Grasp Pairs</span>
291+
</div>
292+
</div>
266293
</div>
294+
295+
<div class="content has-text-justified my-4">
296+
<b>Overview of the denoising process:</b> The above clip shows the joint denoising process step by step. As the
297+
time progresses, the <span style="color: rgb(182, 1, 1);">Energy \((E_\alpha)\)</span> gradually
298+
decreases, which means grasps are moving towards the object and
299+
not just floating in free space. At the same time, the <span
300+
style="color:rgb(11, 33, 158)">Force-Closure Probability \((C_{\beta}^{\text{fc}})\)</span> steadily
301+
increases,
302+
highlighting how the grasp becomes more stable and reliable over time. Finally, in the later stages of
303+
denoising, colliding grasps are
304+
refined for a small number of iterations using <span style="color:rgb(45, 150, 45)">Collision Classifier
305+
\((C_{\gamma}^{\text{col}})\)</span>, resulting in dual-arm grasps that are force-closure stable as
306+
well as collision-free.
307+
</div>
308+
309+
267310
</div>
268311
</section>
269312

270-
<!--
271313
<section class="section" style="background-color: rgb(252, 252, 252);">
272314
<div class="container is-max-desktop">
273315
<div class="columns has-text-centered">
274316
<div class="column is-full-width">
275-
<h2 class="title is-3">Results (Coming soon)</h2>
276-
</div>
277-
</div>
278-
</div>
279-
</section> -->
317+
<h2 class="title is-3">
318+
Real Life Results <sup style="font-size: 15px;">&dagger;</sup>
319+
</h2>
320+
321+
<p class="is-size-7 has-text-grey mt-4 has-text-right">
322+
<sup>&dagger;</sup> Unseen object categories
323+
</p>
324+
325+
<div class="columns is-multiline is-centered">
326+
<div class="column is-half my-5">
327+
<video autoplay loop muted poster="" preload="none"
328+
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
329+
<source src="./static/videos/real_life_bucket.webm">
330+
</video>
331+
<h4 class="title is-5">(a) Bucket</h4>
332+
</div>
280333

281-
<section class="section" style="background-color: rgb(255, 255, 255);">
282-
<div class="container is-max-desktop">
283-
<div class="columns has-text-centered">
284-
<div class="column is-full-width">
285-
<video autoplay loop muted poster="" preload="none" style="width:100%;">
286-
<source src="./static/videos/only_graph.webm">
287-
</video>
288-
</div>
289-
</div>
290-
</div>
291-
</section>
334+
<div class="column is-half my-5">
335+
<video autoplay loop muted poster="" preload="none"
336+
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
337+
<source src="./static/videos/real_life_tray.webm">
338+
</video>
339+
<h4 class="title is-5">(b) Tray</h4>
340+
</div>
292341

293-
<section class="section" style="background-color: rgb(252, 252, 252);">
294-
<div class="container is-max-desktop">
295-
<div class="columns has-text-centered">
296-
<div class="column is-full-width">
297-
<h2 class="title is-3">
298-
Real Life Results <sup style="font-size: 15px;">&dagger;</sup>
299-
</h2>
300-
301-
<p class="is-size-7 has-text-grey mt-4 has-text-right">
302-
<sup>&dagger;</sup> Unseen object categories
303-
</p>
304-
305-
<div class="columns is-multiline is-centered">
306-
<div class="column is-half my-5">
307-
<video autoplay loop muted poster="" preload="none"
308-
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
309-
<source src="./static/videos/real_life_bucket.webm">
310-
</video>
311-
<h4 class="title is-5">(a) Bucket</h4>
312-
</div>
313-
314-
<div class="column is-half my-5">
315-
<video autoplay loop muted poster="" preload="none"
316-
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
317-
<source src="./static/videos/real_life_tray.webm">
318-
</video>
319-
<h4 class="title is-5">(b) Tray</h4>
320-
</div>
321-
322-
<div class="column is-half my-5">
323-
<video autoplay loop muted poster="" preload="none"
324-
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
325-
<source src="./static/videos/real_life_drone.webm">
326-
</video>
327-
<h4 class="title is-5">(c) Drone</h4>
328-
</div>
329-
330-
<div class="column is-half my-5">
331-
<video autoplay loop muted poster="" preload="none"
332-
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
333-
<source src="./static/videos/real_life_frypan.webm">
334-
</video>
335-
<h4 class="title is-5">(d) Frypan</h4>
336-
</div>
337-
338-
<div class="column is-half my-5">
339-
<video autoplay loop muted poster="" preload="none"
340-
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
341-
<source src="./static/videos/real_life_saucepan.webm">
342-
</video>
343-
<h4 class="title is-5">(e) Saucepan</h4>
342+
<div class="column is-half my-5">
343+
<video autoplay loop muted poster="" preload="none"
344+
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
345+
<source src="./static/videos/real_life_drone.webm">
346+
</video>
347+
<h4 class="title is-5">(c) Drone</h4>
348+
</div>
349+
350+
<div class="column is-half my-5">
351+
<video autoplay loop muted poster="" preload="none"
352+
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
353+
<source src="./static/videos/real_life_frypan.webm">
354+
</video>
355+
<h4 class="title is-5">(d) Frypan</h4>
356+
</div>
357+
358+
<div class="column is-half my-5">
359+
<video autoplay loop muted poster="" preload="none"
360+
style="width:100%; border: 2px solid #ddd; border-radius: 10px;">
361+
<source src="./static/videos/real_life_saucepan.webm">
362+
</video>
363+
<h4 class="title is-5">(e) Saucepan</h4>
364+
</div>
365+
</div>
366+
367+
<!-- footnote -->
344368
</div>
345-
</div>
346-
347-
<!-- footnote -->
348369
</div>
349-
</div>
350370
</div>
351-
</section>
371+
</section>
352372

353373
<!-- <section class="section" id="BibTeX" style="margin-bottom: 1rem;">
354374
<div class="container is-max-desktop content">
@@ -385,4 +405,4 @@ <h2 class="title">BibTeX</h2>
385405

386406
</body>
387407

388-
</html>
408+
</html>
1.36 MB
Binary file not shown.
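
Note on the mapping described in the index.html text above: the page says each \(SE(3)\) element is projected to a 6D Lie algebra vector by a logarithmic map, the two vectors are concatenated into \(\mathbb{R}^{12}\), and an exponential map brings the result back to \(SE(3) \times SE(3)\). The following is a minimal NumPy sketch of that round trip, not the authors' implementation. The function names (`se3_exp`, `se3_log`, `logmap2`, `expmap2`), the (translation, rotation) ordering of the 6D vector, and the 4x4 homogeneous-matrix representation are all assumptions made for illustration. The logarithm used here is also not robust for rotation angles close to \(\pi\).

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def left_jacobian(w):
    """Matrix V that links the twist translation rho to the pose translation t = V @ rho."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:  # first-order approximation near the identity
        return np.eye(3) + 0.5 * K
    b = (1.0 - np.cos(theta)) / theta**2
    c = (theta - np.sin(theta)) / theta**3
    return np.eye(3) + b * K + c * (K @ K)

def se3_exp(xi):
    """Exponential map: 6D twist (rho, omega) -> 4x4 homogeneous transform (assumed ordering)."""
    rho, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        R = np.eye(3) + K
    else:  # Rodrigues' rotation formula
        R = (np.eye(3) + (np.sin(theta) / theta) * K
             + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = left_jacobian(w) @ rho
    return T

def se3_log(T):
    """Logarithmic map: 4x4 transform -> 6D twist. Not robust for angles near pi."""
    R, t = T[:3, :3], T[:3, 3]
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    skew = R - R.T
    v = np.array([skew[2, 1], skew[0, 2], skew[1, 0]])
    w = 0.5 * v if theta < 1e-8 else (theta / (2.0 * np.sin(theta))) * v
    rho = np.linalg.solve(left_jacobian(w), t)
    return np.concatenate([rho, w])

def logmap2(T_left, T_right):
    """Hypothetical pair map: two grasp poses -> one vector in R^12."""
    return np.concatenate([se3_log(T_left), se3_log(T_right)])

def expmap2(x):
    """Hypothetical inverse pair map: vector in R^12 -> two grasp poses."""
    return se3_exp(x[:6]), se3_exp(x[6:])
```

Under these assumptions, a diffusion step can add noise to or denoise the 12D vector with ordinary Euclidean arithmetic, and `expmap2` then returns two valid rigid-body transforms regardless of the vector's values, which is the property the page attributes to the bidirectional mapping.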
