Commit 81fc1b6

Update docs for semaphore-aware Sequence evalAsync overload
1 parent 570453d commit 81fc1b6

3 files changed

Lines changed: 35 additions & 14 deletions

docs/overview/advanced-examples.rst

Lines changed: 16 additions & 3 deletions
@@ -162,6 +162,9 @@ Async/Await Example
 
 A simple example of asynchronous submission can be found below.
 
+You can also use async submissions with Vulkan semaphores to synchronize
+Kompute-generated submits with user-managed queue submits.
+
 First we are able to create the manager as we normally would.
 
 .. code-block:: cpp
@@ -233,15 +236,25 @@ The parameter provided is the maximum amount of time to wait in nanoseconds. Whe
 
     auto sq = mgr.sequence();
 
-    // Run Async Kompute operation on the parameters provided
-    sq->evalAsync<kp::OpAlgoDispatch>(algo);
+    // Optional: pass submit-level synchronization primitives so this submit
+    // waits/signals alongside user-managed queue work
+    std::vector<vk::Semaphore> waitSemaphores = { externalWaitSemaphore };
+    std::vector<vk::PipelineStageFlags> waitDstStageMasks = {
+        vk::PipelineStageFlagBits::eComputeShader
+    };
+    std::vector<vk::Semaphore> signalSemaphores = { externalSignalSemaphore };
+    auto opAlgo = std::make_shared<kp::OpAlgoDispatch>(algo);
+    sq->evalAsync(opAlgo, waitSemaphores, waitDstStageMasks, signalSemaphores);
 
     // Here we can do other work
 
-    // When we're ready we can wait
+    // When we're ready we can wait
     // The default wait time is UINT64_MAX
     sq->evalAwait();
 
+``evalAwait()`` must be called before invoking ``evalAsync()`` again on the
+same ``Sequence``.
+
 
 Finally, below you can see that we can also run syncrhonous commands without having to change anything.
 

docs/overview/custom-operations.rst

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ Below you
    * - preEval()
      - When the Sequence is Evaluated this preEval is called across all operations before dispatching the batch of recorded commands to the GPU. This is useful for example if you need to copy data from local to host memory.
    * - postEval()
-     - After the sequence is Evaluated this postEval is called across all operations. When running asynchronously the postEval is called when you call `evalAwait()`, which is why it's important to always run evalAwait() to ensure the process doesn't go into inconsistent state.
+     - After the sequence is Evaluated this postEval is called across all operations. In asynchronous flows postEval is called when you run `evalAwait()`, and `evalAwait()` must be called before triggering `evalAsync()` again on the same sequence to avoid inconsistent state.
 
 
 Simple Operation Extending OpAlgoBase

src/include/kompute/Sequence.hpp

Lines changed: 18 additions & 10 deletions
@@ -154,18 +154,22 @@ class Sequence : public std::enable_shared_from_this<Sequence>
 
     /**
      * Eval Async sends all the recorded and stored operations in the vector of
-     * operations into the gpu as a submit job without a barrier. EvalAwait()
-     * must ALWAYS be called after to ensure the sequence is terminated
-     * correctly.
+     * operations into the gpu as a submit job without a barrier.
+     *
+     * evalAwait() must be called before invoking evalAsync() again on this same
+     * Sequence to complete the previous async run and reset internal state.
      *
      * @return Boolean stating whether execution was successful.
      */
     std::shared_ptr<Sequence> evalAsync();
     /**
      * Eval Async sends all recorded operations as a submit job and allows
      * submit-level GPU synchronization by providing wait and signal semaphores.
-     * EvalAwait() must ALWAYS be called after to ensure the sequence is
-     * terminated correctly.
+     *
+     * This overload is useful for synchronizing Kompute submissions with
+     * user-managed queue submissions without forcing CPU-side synchronization.
+     * evalAwait() must be called before invoking evalAsync() again on this same
+     * Sequence to complete the previous async run and reset internal state.
      *
      * @param waitSemaphores Semaphores that must be signaled before this submit
      * starts executing.
@@ -182,18 +186,22 @@ class Sequence : public std::enable_shared_from_this<Sequence>
       const std::vector<vk::Semaphore>& signalSemaphores);
     /**
      * Clears currnet operations to record provided one in the vector of
-     * operations into the gpu as a submit job without a barrier. EvalAwait()
-     * must ALWAYS be called after to ensure the sequence is terminated
-     * correctly.
+     * operations into the gpu as a submit job without a barrier.
+     *
+     * evalAwait() must be called before invoking evalAsync() again on this same
+     * Sequence to complete the previous async run and reset internal state.
      *
      * @return Boolean stating whether execution was successful.
      */
     std::shared_ptr<Sequence> evalAsync(std::shared_ptr<OpBase> op);
     /**
      * Clears current operations, records the provided one and submits with
      * optional wait/signal semaphores for submit-level GPU synchronization.
-     * EvalAwait() must ALWAYS be called after to ensure the sequence is
-     * terminated correctly.
+     *
+     * This overload is useful for synchronizing Kompute submissions with
+     * user-managed queue submissions without forcing CPU-side synchronization.
+     * evalAwait() must be called before invoking evalAsync() again on this same
+     * Sequence to complete the previous async run and reset internal state.
      *
      * @param op Operation to record prior to submit.
      * @param waitSemaphores Semaphores that must be signaled before this submit
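
The ordering rule this commit documents (evalAwait() must be called before evalAsync() is invoked again on the same Sequence) can be illustrated with a minimal sketch. `MockSequence` below is a hypothetical stand-in, not Kompute's `kp::Sequence`: it models only an "async run in flight" flag and touches no Vulkan objects. Throwing on a second evalAsync() is an assumption made for illustration; the commit only says the previous run must be completed and internal state reset first.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for kp::Sequence. It models only the "async run in
// flight" flag; no Vulkan objects, fences or semaphores are involved.
class MockSequence
{
  public:
    // Starts an async run. Throwing on a second call is an assumption of this
    // sketch, not behaviour confirmed by the commit.
    MockSequence& evalAsync(const std::string& opName)
    {
        if (running_) {
            throw std::runtime_error("evalAsync() called before evalAwait()");
        }
        running_ = true;
        log_.push_back("submit:" + opName);
        return *this;
    }

    // Completes the pending run and resets state. waitFor mirrors the
    // nanosecond timeout described in the docs and is unused here.
    MockSequence& evalAwait(
      std::uint64_t waitFor = std::numeric_limits<std::uint64_t>::max())
    {
        (void)waitFor;
        if (running_) {
            log_.push_back("await");
            running_ = false;
        }
        return *this;
    }

    bool isRunning() const { return running_; }
    const std::vector<std::string>& log() const { return log_; }

  private:
    bool running_ = false;
    std::vector<std::string> log_;
};

// Returns true when a second evalAsync() without evalAwait() is rejected.
bool secondAsyncIsRejected()
{
    MockSequence sq;
    sq.evalAsync("first");
    try {
        sq.evalAsync("second");
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

// Runs two async rounds with the await in between, which is the documented
// ordering, and returns the recorded call log.
std::vector<std::string> twoRoundsWithAwait()
{
    MockSequence sq;
    sq.evalAsync("first").evalAwait();
    sq.evalAsync("second").evalAwait();
    assert(!sq.isRunning());
    return sq.log();
}
```

In this sketch the await is what clears the in-flight flag, so interleaving submit and await is the only sequence that lets a second submit through. With the real library the same ordering applies to the semaphore-aware overload, since the semaphores synchronize GPU-side work and do not replace the CPU-side evalAwait() call.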
