Skip to content

Latest commit

 

History

History
405 lines (284 loc) · 14.1 KB

File metadata and controls

405 lines (284 loc) · 14.1 KB

dgemm

Perform the matrix-matrix operation C = α*op(A)*op(B) + β*C where op(X) is one of the op(X) = X, or op(X) = X^T.

Usage

var dgemm = require( '@stdlib/blas/base/dgemm' );

dgemm( ord, ta, tb, M, N, K, α, A, lda, B, ldb, β, C, ldc )

Performs the matrix-matrix operation C = α*op(A)*op(B) + β*C where op(X) is either op(X) = X or op(X) = X^T, α and β are scalars, A, B, and C are matrices, with op(A) an M by K matrix, op(B) a K by N matrix, and C an M by N matrix.

var Float64Array = require( '@stdlib/array/float64' );

var A = new Float64Array( [ 1.0, 2.0, 3.0, 4.0 ] );
var B = new Float64Array( [ 1.0, 1.0, 0.0, 1.0 ] );
var C = new Float64Array( [ 1.0, 2.0, 3.0, 4.0 ] );

dgemm( 'row-major', 'no-transpose', 'no-transpose', 2, 2, 2, 1.0, A, 2, B, 2, 1.0, C, 2 );
// C => <Float64Array>[ 2.0, 5.0, 6.0, 11.0 ]

The function has the following parameters:

  • ord: storage layout.
  • ta: specifies whether A should be transposed, conjugate-transposed, or not transposed.
  • tb: specifies whether B should be transposed, conjugate-transposed, or not transposed.
  • M: number of rows in the matrix op(A) and in the matrix C.
  • N: number of columns in the matrix op(B) and in the matrix C.
  • K: number of columns in the matrix op(A) and number of rows in the matrix op(B).
  • α: scalar constant.
  • A: first input matrix stored in linear memory as a Float64Array.
  • lda: stride of the first dimension of A (leading dimension of A).
  • B: second input matrix stored in linear memory as a Float64Array.
  • ldb: stride of the first dimension of B (leading dimension of B).
  • β: scalar constant.
  • C: third input matrix stored in linear memory as a Float64Array.
  • ldc: stride of the first dimension of C (leading dimension of C).

The stride parameters determine how elements in the input arrays are accessed at runtime. For example, to perform matrix multiplication of two subarrays,

var Float64Array = require( '@stdlib/array/float64' );

var A = new Float64Array( [ 1.0, 2.0, 0.0, 0.0, 3.0, 4.0, 0.0, 0.0 ] );
var B = new Float64Array( [ 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0 ] );
var C = new Float64Array( [ 1.0, 2.0, 3.0, 4.0 ] );

dgemm( 'row-major', 'no-transpose', 'no-transpose', 2, 2, 2, 1.0, A, 4, B, 4, 1.0, C, 2 );
// C => <Float64Array>[ 2.0, 5.0, 6.0, 11.0 ]

Note that indexing is relative to the first index. To introduce an offset, use typed array views.

var Float64Array = require( '@stdlib/array/float64' );

// Initial arrays (with extra leading element to be skipped):
var A0 = new Float64Array( [ 0.0, 1.0, 2.0, 3.0, 4.0 ] );
var B0 = new Float64Array( [ 0.0, 1.0, 1.0, 0.0, 1.0 ] );
var C0 = new Float64Array( [ 0.0, 1.0, 2.0, 3.0, 4.0 ] );

// Create offset views...
var A1 = new Float64Array( A0.buffer, A0.BYTES_PER_ELEMENT*1 ); // start at 2nd element
var B1 = new Float64Array( B0.buffer, B0.BYTES_PER_ELEMENT*1 ); // start at 2nd element
var C1 = new Float64Array( C0.buffer, C0.BYTES_PER_ELEMENT*1 ); // start at 2nd element

dgemm( 'row-major', 'no-transpose', 'no-transpose', 2, 2, 2, 1.0, A1, 2, B1, 2, 1.0, C1, 2 );
// C0 => <Float64Array>[ 0.0, 2.0, 5.0, 6.0, 11.0 ]

dgemm.ndarray( ta, tb, M, N, K, α, A, sa1, sa2, oa, B, sb1, sb2, ob, β, C, sc1, sc2, oc )

Performs the matrix-matrix operation C = α*op(A)*op(B) + β*C, using alternative indexing semantics and where op(X) is either op(X) = X or op(X) = X^T, α and β are scalars, A, B, and C are matrices, with op(A) an M by K matrix, op(B) a K by N matrix, and C an M by N matrix.

var Float64Array = require( '@stdlib/array/float64' );

var A = new Float64Array( [ 1.0, 2.0, 3.0, 4.0 ] );
var B = new Float64Array( [ 1.0, 1.0, 0.0, 1.0 ] );
var C = new Float64Array( [ 1.0, 2.0, 3.0, 4.0 ] );

dgemm.ndarray( 'no-transpose', 'no-transpose', 2, 2, 2, 1.0, A, 2, 1, 0, B, 2, 1, 0, 1.0, C, 2, 1, 0 );
// C => <Float64Array>[ 2.0, 5.0, 6.0, 11.0 ]

The function has the following additional parameters:

  • sa1: stride of the first dimension of A.
  • sa2: stride of the second dimension of A.
  • oa: starting index for A.
  • sb1: stride of the first dimension of B.
  • sb2: stride of the second dimension of B.
  • ob: starting index for B.
  • sc1: stride of the first dimension of C.
  • sc2: stride of the second dimension of C.
  • oc: starting index for C.

While typed array views mandate a view offset based on the underlying buffer, the offset parameters support indexing semantics based on starting indices. For example,

var Float64Array = require( '@stdlib/array/float64' );

var A = new Float64Array( [ 0.0, 0.0, 1.0, 3.0, 2.0, 4.0 ] );
var B = new Float64Array( [ 0.0, 1.0, 0.0, 1.0, 1.0 ] );
var C = new Float64Array( [ 0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 4.0 ] );

dgemm.ndarray( 'no-transpose', 'no-transpose', 2, 2, 2, 1.0, A, 1, 2, 2, B, 1, 2, 1, 1.0, C, 1, 2, 3 );
// C => <Float64Array>[ 0.0, 0.0, 0.0, 2.0, 6.0, 5.0, 11.0 ]

Notes

  • dgemm() corresponds to the BLAS level 3 function dgemm.

Examples

var discreteUniform = require( '@stdlib/random/array/discrete-uniform' );
var dgemm = require( '@stdlib/blas/base/dgemm' );

var opts = {
    'dtype': 'float64'
};

var M = 3;
var N = 4;
var K = 2;

var A = discreteUniform( M*K, 0, 10, opts ); // 3x2
var B = discreteUniform( K*N, 0, 10, opts ); // 2x4
var C = discreteUniform( M*N, 0, 10, opts ); // 3x4

dgemm( 'row-major', 'no-transpose', 'no-transpose', M, N, K, 1.0, A, K, B, N, 1.0, C, N );
console.log( C );

dgemm.ndarray( 'no-transpose', 'no-transpose', M, N, K, 1.0, A, K, 1, 0, B, N, 1, 0, 1.0, C, N, 1, 0 );
console.log( C );

C APIs

Usage

#include "stdlib/blas/base/dgemm.h"

c_dgemm( layout, transA, transB, M, N, K, alpha, *A, LDA, *B, LDB, beta, *C, LDC )

Performs the matrix-matrix operation C = alpha*op(A)*op(B) + beta*C, where op(X) is either op(X) = X or op(X) = X^T, alpha and beta are scalars, and A, B, and C are matrices, with op(A) an M-by-K matrix, op(B) a K-by-N matrix, and C an M-by-N matrix.

#include "stdlib/blas/base/shared.h"

double A[ 2*3 ] = {
    1.0, 2.0, 3.0,
    4.0, 5.0, 6.0
};
double B[ 3*2 ] = {
    7.0,  8.0,
    9.0,  10.0,
    11.0, 12.0
};
double C[ 2*2 ] = {
    0.0, 0.0,
    0.0, 0.0
};

c_dgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 2, 3, 1.0, A, 3, B, 2, 0.0, C, 2 );

The function accepts the following arguments:

  • layout: [in] CBLAS_LAYOUT storage layout.
  • transA: [in] CBLAS_TRANSPOSE specifies whether A should be transposed, conjugate-transposed, or not transposed.
  • transB: [in] CBLAS_TRANSPOSE specifies whether B should be transposed, conjugate-transposed, or not transposed.
  • M: [in] CBLAS_INT number of rows in the matrix op(A) and in the matrix C.
  • N: [in] CBLAS_INT number of columns in the matrix op(B) and in the matrix C.
  • K: [in] CBLAS_INT number of columns in the matrix op(A) and number of rows in the matrix op(B).
  • alpha: [in] double scalar constant.
  • A: [in] double* first input matrix.
  • LDA: [in] CBLAS_INT stride of the first dimension of A (a.k.a., leading dimension of the matrix A).
  • B: [in] double* second input matrix.
  • LDB: [in] CBLAS_INT stride of the first dimension of B (a.k.a., leading dimension of the matrix B).
  • beta: [in] double scalar constant.
  • C: [inout] double* result matrix.
  • LDC: [in] CBLAS_INT stride of the first dimension of C (a.k.a., leading dimension of the matrix C).
void c_dgemm( const CBLAS_LAYOUT layout, const CBLAS_TRANSPOSE transA, const CBLAS_TRANSPOSE transB, const CBLAS_INT M, const CBLAS_INT N, const CBLAS_INT K, const double alpha, const double *A, const CBLAS_INT LDA, const double *B, const CBLAS_INT LDB, const double beta, double *C, const CBLAS_INT LDC );

c_dgemm_ndarray( transA, transB, M, N, K, alpha, *A, sa1, sa2, oa, *B, sb1, sb2, ob, beta, *C, sc1, sc2, oc )

Performs the matrix-matrix operation C = alpha*op(A)*op(B) + beta*C, using alternative indexing semantics and where op(X) is either op(X) = X or op(X) = X^T, alpha and beta are scalars, and A, B, and C are matrices, with op(A) an M-by-K matrix, op(B) a K-by-N matrix, and C an M-by-N matrix.

#include "stdlib/blas/base/shared.h"

double A[ 2*3 ] = {
    1.0, 2.0, 3.0,
    4.0, 5.0, 6.0
};
double B[ 3*2 ] = {
    7.0,  8.0,
    9.0,  10.0,
    11.0, 12.0
};
double C[ 2*2 ] = {
    0.0, 0.0,
    0.0, 0.0
};

c_dgemm_ndarray( CblasNoTrans, CblasNoTrans, 2, 2, 3, 1.0, A, 3, 1, 0, B, 2, 1, 0, 0.0, C, 2, 1, 0 );

The function accepts the following arguments:

  • transA: [in] CBLAS_TRANSPOSE specifies whether A should be transposed, conjugate-transposed, or not transposed.
  • transB: [in] CBLAS_TRANSPOSE specifies whether B should be transposed, conjugate-transposed, or not transposed.
  • M: [in] CBLAS_INT number of rows in the matrix op(A) and in the matrix C.
  • N: [in] CBLAS_INT number of columns in the matrix op(B) and in the matrix C.
  • K: [in] CBLAS_INT number of columns in the matrix op(A) and number of rows in the matrix op(B).
  • alpha: [in] double scalar constant.
  • A: [in] double* first input matrix.
  • sa1: [in] CBLAS_INT stride of the first dimension of A.
  • sa2: [in] CBLAS_INT stride of the second dimension of A.
  • oa: [in] CBLAS_INT starting index for A.
  • B: [in] double* second input matrix.
  • sb1: [in] CBLAS_INT stride of the first dimension of B.
  • sb2: [in] CBLAS_INT stride of the second dimension of B.
  • ob: [in] CBLAS_INT starting index for B.
  • beta: [in] double scalar constant.
  • C: [inout] double* result matrix.
  • sc1: [in] CBLAS_INT stride of the first dimension of C.
  • sc2: [in] CBLAS_INT stride of the second dimension of C.
  • oc: [in] CBLAS_INT starting index for C.
void c_dgemm_ndarray( const CBLAS_TRANSPOSE transA, const CBLAS_TRANSPOSE transB, const CBLAS_INT M, const CBLAS_INT N, const CBLAS_INT K, const double alpha, const double *A, const CBLAS_INT strideA1, const CBLAS_INT strideA2, const CBLAS_INT offsetA, const double *B, const CBLAS_INT strideB1, const CBLAS_INT strideB2, const CBLAS_INT offsetB, const double beta, double *C, const CBLAS_INT strideC1, const CBLAS_INT strideC2, const CBLAS_INT offsetC );

Examples

#include "stdlib/blas/base/dgemm.h"
#include "stdlib/blas/base/shared.h"
#include <stdio.h>

int main( void ) {
    // Define a 2x2 output matrix stored in row-major order:
    double C[ 2*2 ] = {
        0.0, 0.0,
        0.0, 0.0
    };

    // Define a 2x3 matrix `A` stored in row-major order:
    const double A[ 2*3 ] = {
        1.0, 2.0, 3.0,
        4.0, 5.0, 6.0
    };

    // Define a 3x2 matrix `B` stored in row-major order:
    const double B[ 3*2 ] = {
        7.0,  8.0,
        9.0,  10.0,
        11.0, 12.0
    };

    // Specify matrix dimensions:
    const int M = 2; // rows of op(A) and C
    const int N = 2; // columns of op(B) and C
    const int K = 3; // columns of op(A) and rows of op(B)

    // Perform operation: C = 1.0*A*B + 0.0*C
    c_dgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, M, N, K, 1.0, A, K, B, N, 0.0, C, N );

    // Print the result:
    for ( int i = 0; i < M; i++ ) {
        for ( int j = 0; j < N; j++ ) {
            printf( "C[%i,%i] = %lf\n", i, j, C[ (i*N)+j ] );
        }
    }
}