To bind or not to bind, that is the question 🧻 #590

@paolopas

Is it possible to run a benchmark to compare the performance of the native binding (builtin NATIVE_RULE) with that of the new binding API (b2::jam::jam_binder)? I think so, and here's how I did it.

The idea is to implement the same rule both ways, call it in a Jam loop, and measure the time taken for the same number of iterations. The following Jam file (legacy.jam) benchmarks the native binding:

NATIVE_RULE benchmark : timed ;
IMPORT benchmark : timed : : benchmark.timed ;
while true { benchmark.timed ; }

The API binding is benchmarked with this Jamroot:

import benchmark ;
while true { benchmark.timed ; }

This of course requires the build system, unlike the NATIVE_RULE version.

The probe

For the measurements I used the following probe:

#include "startup.h"
#include "output.h"

#include <ctime>
#include <cstdlib>

/*
 * Report collected data on stderr every 1<<shift tick() calls.
 */
template<unsigned shift>
struct WatchedCounter
{
    static_assert(shift < 32, "");
    // Shift a size_t, not an int: 1 << 31 would overflow a 32-bit int.
    static constexpr size_t mask = size_t(1) << shift;
    bool last_masked = false;
    size_t counter = 0;
    clock_t t0;
    size_t exit_count;

    /*
     * Exit program after num reports.
     */
    WatchedCounter(size_t num = 1) : exit_count(num << shift) { t0 = clock(); }

    void tick()
    {
        if (bool(++counter & mask) != last_masked)
        {
            last_masked = !last_masked;
            clock_t t1 = clock();
            double dur = 1000.0 * (t1 - t0) / CLOCKS_PER_SEC;
            // counter is a size_t, so use %zu rather than %ld.
            err_printf("%zu\t%.3f\n", counter, dur);
        }
        if (counter == exit_count) b2::clean_exit(EXIT_SUCCESS);
    }
};
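
To look at the reporting logic in isolation, here is a minimal standalone sketch of the bit-toggle check that tick() relies on. It is not part of the b2 sources: ToggleCounter and report_points are hypothetical names, and the calls to err_printf and b2::clean_exit are replaced by recording the counter value in a vector.

```cpp
#include <cstddef>
#include <vector>

// Illustration only: same toggle test as WatchedCounter::tick(), but each
// "report" is stored instead of being printed, and nothing exits the program.
template <unsigned shift>
struct ToggleCounter
{
    static_assert(shift < 32, "");
    static constexpr std::size_t mask = std::size_t(1) << shift;
    bool last_masked = false;
    std::size_t counter = 0;
    std::vector<std::size_t> reports;

    void tick()
    {
        // Bit `shift` of the counter flips once every 1<<shift increments,
        // so this condition is true exactly at multiples of 1<<shift.
        if (bool(++counter & mask) != last_masked)
        {
            last_masked = !last_masked;
            reports.push_back(counter);
        }
    }
};

// Hypothetical helper: run n ticks and return the counter values at which
// a report would have been emitted.
template <unsigned shift>
std::vector<std::size_t> report_points(std::size_t n)
{
    ToggleCounter<shift> c;
    for (std::size_t i = 0; i < n; ++i) c.tick();
    return c.reports;
}
```

With shift = 13, report_points<13>(16384) yields 8192 and 16384, which are the two counts that WatchedCounter<13> wc(2) prints before exiting.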

Native binding implementation

The Jam timed rule is implemented by the function

LIST * native_timed(FRAME * frame, int flags)
{
    static WatchedCounter<13> wc(2);
    wc.tick();
    return L0;
}

It is bound by the following function,

/*
 * Legacy binding style.
 */
void init_benchmark()
{
    //char const * args[] = { "any", "*", 0 }; // only used to check for call syntax
    char const * * args = nullptr; // no signature given: call syntax is not checked
    declare_native_rule(
        "benchmark",
        "timed",
        args,
        native_timed,
        1
    );
}

which is called at the end of load_builtins in builtins.cpp.

New binding API implementation

I added the following mod_bind_benchmark.h

namespace b2 {

//void timed_no_args(); // alternate version
value_ref timed_no_args();

/*
 * New binding style.
 */
struct benchmark_module : b2::bind::module_<benchmark_module>
{
	const char * module_name = "benchmark";

	template <class Binder>
	void def(Binder & binder)
	{
		binder.def(&b2::timed_no_args, "timed");
		binder.loaded();
	}
};
} // namespace b2

This header is included and used in bindjam.cpp. The function that implements timed is defined in that same source file, along with everything else:

/*
 * New binding style.
 */
namespace b2 {
// NOTE: returning void seems slower
//void timed_no_args()
value_ref timed_no_args()
{
    static WatchedCounter<13> wc(2);
    wc.tick();
    return value_ref();
}
} // namespace b2

Results

The key parameters are the shift template argument and the number of reports passed to the WatchedCounter constructor. With the values shown above (shift 13, 2 reports), here is what I get on my poor laptop (b2 release, gcc 10.3.1). Each line gives the call count and the processor time in milliseconds as measured by clock(), averaged over 10 runs:

> b2 -flegacy.jam
8192	15.786
16384	30.463
> b2
8192	25.695
16384	52.523

To pass an argument from Jam, I use the following implementation (binding API) of the timed rule

value_ref timed_any_args(list_cref args)
{
    static WatchedCounter<13> wc(2);
    wc.tick();
    return value_ref();
}

and pass an argument in both Jam files (qwerty in both cases). I then get (average values over 10 runs):

> b2 -flegacy.jam
8192	25.876
16384	43.933
> b2
8192	28.125
16384	57.759

Well, it seems that tons of templates don't make the code faster (they certainly don't make it simpler or more readable), although the binding API seems to scale better when arguments are passed to rules: its times barely change, while those of the native binding grow noticeably.
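
To put numbers on that comparison, here is a small sketch that turns the probe output into a per-call cost and a slowdown ratio. The Sample struct and the two functions are hypothetical helpers written for this post, not b2 code, and they only rearrange the averages listed above.

```cpp
// One line of probe output: number of calls and processor time in ms.
struct Sample
{
    double calls;
    double ms;
};

// Average cost of one loop iteration, in microseconds.
double us_per_call(Sample s)
{
    return 1000.0 * s.ms / s.calls;
}

// How many times slower the API binding is than the native binding.
// Only meaningful when both samples cover the same number of calls.
double slowdown(Sample native, Sample api)
{
    return api.ms / native.ms;
}
```

Feeding in the averages above, slowdown gives about 1.63 (8192 calls) and 1.72 (16384 calls) without arguments, and about 1.09 and 1.31 with one argument. us_per_call({8192, 15.786}) comes out at roughly 1.93 microseconds per iteration for the native binding without arguments.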

Of course, this data is insufficient to draw any conclusions.
