Commit d9c556d

ptx: Fix synchronisation of stencil border kernels
The border kernels synchronised with the parent stream at the _end_ of each border kernel (so that the Future holding the result of the stencil operation also depends on the borders), but not at the _start_ of the border kernel (so the borders could begin executing in parallel with the computation of the argument of the stencil operation). The latter is clearly wrong, and this commit fixes it. This resolves a stencil nondeterminism bug that we were seeing (nondet-stencil in https://github.com/tomsmeding/accelerate-tests).
1 parent dca0f75 commit d9c556d

1 file changed

Lines changed: 7 additions & 2 deletions

accelerate-llvm-ptx/src/Data/Array/Accelerate/LLVM/PTX/Execute.hs

@@ -709,6 +709,7 @@ stencilCore repr@(ArrayR shr _) exe gamma aenv halo shOut paramsR params =
     future  <- new
     result  <- allocateRemote repr shOut
     parent  <- asks ptxStream
+    parentStartPoint <- liftPar (Event.waypoint parent)
 
     -- interior (no bounds checking)
     let paramsRinside = TupRsingle (ParamRshape shr) `TupRpair` TupRsingle (ParamRarray repr) `TupRpair` paramsR
@@ -719,15 +720,19 @@ stencilCore repr@(ArrayR shr _) exe gamma aenv halo shOut paramsR params =
     -- and each other, as individually they will not saturate the device
     forM_ (stencilBorders (arrayRshape repr) shOut halo) $ \(u, v) ->
       fork $ do
+        -- synchronise with start of stencil computation, so that the arguments
+        -- are available
+        child <- asks ptxStream
+        liftIO (Event.after parentStartPoint child)
+
         -- launch in a separate stream
         let sh = trav (-) v u
         let paramsRborder = TupRsingle (ParamRshape shr) `TupRpair` TupRsingle (ParamRshape shr)
                                                          `TupRpair` TupRsingle (ParamRarray repr)
                                                          `TupRpair` paramsR
         executeOp border gamma aenv shr sh paramsRborder (((u, sh), result), params)
 
-        -- synchronisation with main stream
-        child <- asks ptxStream
+        -- make remainder of the parent stream depend on the border results
         event <- liftPar (Event.waypoint child)
         ready <- liftIO (Event.query event)
         if ready then return ()
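
The ordering that this change enforces can be illustrated with a toy model. The sketch below does not use the accelerate-llvm-ptx API: it assumes that an `MVar ()` can stand in for a stream event, that an `IORef` can stand in for the stencil argument, and that plain Haskell threads can stand in for the forked border streams. The names `runBorders`, `argument`, and `done` are hypothetical. Each border task blocks on `parentStartPoint` before it reads the argument, which is the role `Event.after parentStartPoint child` plays in the diff.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM, when)
import Data.IORef (newIORef, readIORef, writeIORef)

-- Toy model only: threads play the role of child streams, an MVar plays the
-- role of the event recorded on the parent stream.
runBorders :: Int -> [Int] -> IO [Int]
runBorders input offsets = do
  argument         <- newIORef 0      -- stencil argument, not yet computed
  parentStartPoint <- newEmptyMVar    -- filled once the argument is ready

  -- Fork one task per border region.
  dones <- forM offsets $ \o -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      readMVar parentStartPoint       -- wait for the parent's start point
      x <- readIORef argument         -- safe: the argument is now available
      putMVar done (x + o)            -- "border result" for this region
    return done

  -- Parent: compute the argument, then signal the start point.
  writeIORef argument input
  putMVar parentStartPoint ()

  -- The remainder of the parent depends on every border result.
  mapM takeMVar dones

main :: IO ()
main = do
  results <- runBorders 42 [1, 2, 3, 4]
  when (results /= [43, 44, 45, 46]) $ error "border ran before its argument"
  print results
```

Without the `readMVar parentStartPoint` line, a border task could read `argument` before the parent writes it, which is the kind of race the commit message describes.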
