Thanks for all this work! It's the only XNOR GPU kernel I have found so far.
However, I noticed this in the code:
# This is not a binary op right now...
h_conv2 = tf.nn.relu(self.conv2d(h_pool1_bin, Wb_conv2))
#h_conv2 = tf.nn.relu(self.conv2d(h_pool1, W_conv2))
I plan to implement a binary conv2d (binconv2) based on this repo's GPU kernel. Is there some difficulty here that is hard to solve? Is it because of this limitation of the kernel:
Limitations
XNOR GEMM op currently only works for square matrices that are powers of 2, with smallest N being 512.
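To make the question concrete, here is a rough sketch of the workaround I have in mind: lower the convolution to a GEMM with im2col, then pad both operands to a square power-of-2 size of at least 512. This is a NumPy emulation only, not the repo's kernel. All function names (`next_pow2_at_least`, `xnor_gemm_reference`, `im2col`, `binconv2d`) are hypothetical, and it assumes a single image, stride 1, no spatial padding, and sign binarization with `>= 0` mapped to +1.

```python
import numpy as np

def next_pow2_at_least(n, minimum=512):
    """Smallest power of 2 that is >= n and >= minimum (minimum must be a power of 2)."""
    size = minimum
    while size < n:
        size *= 2
    return size

def xnor_gemm_reference(a_bits, b_bits):
    """Emulate an XNOR GEMM on square boolean matrices.

    A True bit stands for +1 and a False bit for -1, so each +/-1 dot product
    equals 2 * (number of matching bits) - N.
    """
    n = a_bits.shape[1]
    a = a_bits.astype(np.int32)
    b = b_bits.astype(np.int32)
    matches = a @ b + (1 - a) @ (1 - b)  # positions where both bits agree
    return 2 * matches - n

def im2col(x, kh, kw):
    """Unfold an (H, W, C) image into rows of flattened kh x kw x C patches."""
    h, w, c = x.shape
    oh, ow = h - kh + 1, w - kw + 1
    cols = np.empty((oh * ow, kh * kw * c), dtype=x.dtype)
    idx = 0
    for i in range(oh):
        for j in range(ow):
            cols[idx] = x[i:i + kh, j:j + kw, :].ravel()
            idx += 1
    return cols, (oh, ow)

def binconv2d(x, w):
    """Binary conv of x (H, W, C) with filters w (kh, kw, C, F) via a padded XNOR GEMM."""
    kh, kw, c, f = w.shape
    cols, (oh, ow) = im2col(x, kh, kw)
    a_bits = cols >= 0                          # (M, K) binarized patches
    b_bits = w.reshape(kh * kw * c, f) >= 0     # (K, F) binarized filters
    m, k = a_bits.shape
    n = next_pow2_at_least(max(m, k, f))
    # Pad with +1 bits so both operands become N x N.
    a_pad = np.ones((n, n), dtype=bool)
    b_pad = np.ones((n, n), dtype=bool)
    a_pad[:m, :k] = a_bits
    b_pad[:k, :f] = b_bits
    out = xnor_gemm_reference(a_pad, b_pad)[:m, :f]
    # Every padded K position contributed (+1) * (+1); remove that constant.
    out = out - (n - k)
    return out.reshape(oh, ow, f)
```

If this is roughly right, the size limitation looks like a matter of wasted work rather than correctness: a small layer (say M=36, K=27, F=4) would still be padded up to 512 x 512. I would like to know whether there is a harder obstacle than that.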