Hacking JSCL

Newbie guide

  • Load slime from the root directory.
  • C-c C-l jscl.lisp to load the whole project
  • (jscl-xc:bootstrap "out/" "jscl") will generate out/jscl.js
  • Add tests
  • Open tests.html in your browser to see your failed tests

Code organization, style, etc.

Every definition should include documentation and unit tests.

Definitions are organized in different files following CLHS chapters.

Inside every file, the ordering of definitions should follow the corresponding CLHS dictionary index. This should make it easier to find what has already been defined and what has not been defined yet.

Definitions should follow CLHS naming (e.g., the cons definition should be (defun cons (object-1 object-2) …), not (defun cons (x y)) or (defun cons (obj1 obj2))).

Tests should follow the same organization as definitions.

CLHS examples can be used as tests.

Documentation strings should not be taken from CLHS (due to license issues). It is recommended to take them from SBCL instead of reinventing them.

Hacking the compiler

Interactive development

  • Load slime and bootstrap JSCL as explained in the newbie guide.
  • Work in package JSCL-XC with (in-package #:jscl-xc).
  • Use the compile-toplevel function to prepare the JavaScript code associated with an S-expression SEXP, compiled as a toplevel form.
  • Use the process-toplevel function to prepare the JavaScript AST associated with an S-expression SEXP, compiled as a toplevel form.
JSCL-XC> (process-toplevel '(+ 1 2))
(PROGN (SELFCALL (PROGN) (RETURN (+ 1 2))))

JSCL-XC> (compile-toplevel '(+ 1 2))
"(function(){return 1+2;
})();
"

Compiler anatomy

This section outlines a few rough elements of the compiler. A compilation environment keeps track of macros, identifier numbers, etc.

Code generation. The codegen.lisp file implements the details of JavaScript code generation. The file's entry point is the js function, which takes as input an S-expression representing a JavaScript program as an AST in Common Lisp notation and emits the corresponding JavaScript code to the *js-output* output channel, which can be set to t to target stdout. For instance:

JSCL-XC> (js '(+ 1 2))
1+2;

The representation accepted by js uses call to denote function calls and named-function to distinguish function expressions featuring a function name.

JSCL-XC> (js '(call (property |console| "log") "A message."))
console['log']('A message.');

JSCL-XC> (js '(call (get |console| |log|) "A message."))
console.log('A message.');

JSCL-XC> (js '(call (get |console| "log") "A message."))
console.log('A message.');


JSCL-XC> (js '(function (a b) (return (+ a b))))
(function(A,B){return A+B;
});

JSCL-XC> (js '(named-function "add2" (a b) (return (+ a b))))
(function add2(A,B){return A+B;
});
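
To make the idea of printing an AST concrete, here is a toy emitter written in JavaScript over nested arrays. It is only a sketch of the general technique and is not how codegen.lisp is implemented: the emit function and the id helper are hypothetical names, only a handful of operators are handled, and string literals come out double-quoted rather than single-quoted.

```javascript
// Toy emitter on nested arrays; this is NOT codegen.lisp, only the same idea.
// Identifiers are wrapped with id() (hypothetical helper); plain strings are literals.
const id = (name) => ({ id: name });

function emit(node) {
  if (typeof node === "number") return String(node);
  if (typeof node === "string") return JSON.stringify(node);
  if (!Array.isArray(node)) return node.id; // identifier wrapper
  const [op, ...args] = node;
  switch (op) {
    case "+":
      return args.map(emit).join("+");
    case "get": { // dotted member access
      const name = typeof args[1] === "string" ? args[1] : args[1].id;
      return emit(args[0]) + "." + name;
    }
    case "property": // bracketed member access
      return emit(args[0]) + "[" + emit(args[1]) + "]";
    case "call":
      return emit(args[0]) + "(" + args.slice(1).map(emit).join(",") + ")";
    case "return":
      return "return " + emit(args[0]) + ";";
    default:
      throw new Error("unknown operator: " + op);
  }
}

const sum = emit(["+", 1, 2]);
const dotted = emit(["call", ["get", id("console"), "log"], "A message."]);
const bracketed = emit(["call", ["property", id("console"), "log"], "A message."]);
```

The get and property cases mirror the distinction shown in the examples above: the first produces dotted access, the second bracketed access.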

JavaScript driver. The file prelude.js is a JavaScript driver defining various low-level functions, such as trampolines to call Common Lisp functions from JavaScript and the handling of Cons cells and Symbol cells.

It defines a global object JSCL populated with a handful of substructures, such as packages, which owns the Lisp packages, and internal, where the internal functions called by the emitted JavaScript are kept.

It is worth mentioning that these internal functions are neither used nor referred to in the code generation.

Compiler Macros, Compilations and Builtins. Compilations and Builtins are used to enrich the Common Lisp vocabulary understood by the process-toplevel function. Compilations and Builtins differ in how they handle their arguments, so that Compilations can define special forms and Builtins can define primitive functions. Compilations and Builtins can use Compiler Macros to factorise or inline parts of the JavaScript AST they need to generate. Compiler Macros should not be mistaken for host Common Lisp macros.

JSCL-XC> (define-builtin mod (x y)
  `(selfcall
    (if (== ,y 0)
        (throw "Division by zero in mod"))
    (return (% ,x ,y))))

JSCL-XC> (with-compilation-environment (process-toplevel '(mod 7 2)))
(PROGN
 (SELFCALL
  (IF (== 2 0)
      (THROW "Division by zero in mod"))
  (RETURN (% 7 2))))
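
For orientation, the AST above would presumably be rendered as an immediately invoked function. The JavaScript below is a hand-written approximation of that shape, not captured compiler output; modSketch is a hypothetical name, and x and y stand in for the compiled arguments.

```javascript
// Hand-written approximation of what the SELFCALL form computes.
// modSketch is a hypothetical wrapper; x and y stand for the compiled arguments.
function modSketch(x, y) {
  return (function () {
    if (y == 0) throw "Division by zero in mod";
    return x % y;
  })();
}

const remainder = modSketch(7, 2);
```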

Multiple Values

Every Common Lisp function is represented as a JavaScript function with an extra first argument called values. Functions can return multiple values by returning the result of calling this argument, e.g.:

(lambda () (values 1 2 3))

could be compiled to something like

function (values) {
  return values(1,2,3)
}

There are two possible values for this argument: pv (primary value), which returns only its first argument, and mv, which returns all of them as a tagged array.

The compiler automatically passes pv or mv, depending on the context in which the function is used. For example, in (+ 1 (f x)), f is called with pv. However, if all the values are relevant, as in:

(lambda ()
  (f 0))

then f is called passing the values from the parent function. So this would compile to

function (values) {
  return f(values, 0);
}
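
The convention can be tried out in plain JavaScript. In the sketch below, pv and mv are simplified stand-ins written for illustration (the tag name multipleValues in particular is an assumption, not necessarily what prelude.js uses), and threeValues and forward are hypothetical functions mirroring the two lambdas above.

```javascript
// Hypothetical stand-ins for the two possible `values` arguments.
function pv(first) {
  return first; // primary value only
}
function mv(...all) {
  const tagged = all.slice();
  tagged.multipleValues = true; // tag name is an assumption
  return tagged;
}

// (lambda () (values 1 2 3))
function threeValues(values) {
  return values(1, 2, 3);
}

// (lambda () (f 0)) style forwarding: the caller's `values` is passed through.
function forward(values) {
  return threeValues(values);
}

// (+ 1 (f)) style use: only the primary value matters, so pv is passed.
const sum = 1 + forward(pv);
// A context that wants every value passes mv instead.
const all = forward(mv);
```

Because forward simply hands its own values argument down, the innermost call decides how many values survive without any intermediate function needing to know.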

Hacking the bootstrap code

Functions are easy

It is okay to forward-reference functions. That allows us to write a large part of the code without worrying about bootstrap.

So:

(defun f () (g))
(defun g () (print "hello"))

is fine, because the compiler can generate a function call to g without knowing what g is. Of course, you cannot invoke f until after the definition of g.
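
The same late-binding behaviour can be illustrated in JavaScript. This is a sketch under the assumption that calls are resolved through some lookup at call time; the functions table is a hypothetical stand-in for wherever function definitions are stored.

```javascript
// Hypothetical table of function cells, looked up at call time.
const functions = {};

// (defun f () (g)): compiles fine even though g has no definition yet.
functions.f = function () {
  return functions.g();
};

let earlyCallFailed = false;
try {
  functions.f(); // g is still undefined here
} catch (e) {
  earlyCallFailed = true;
}

// (defun g () (print "hello")), simplified to return the string.
functions.g = function () {
  return "hello";
};

const result = functions.f();
```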

To avoid collisions with the host's definitions, a lot of JSCL code looks like:

(defun !standard-function () ...)
#+jscl (fset 'standard-function '!standard-function)

This means we can use any standard CL function in JSCL itself (provided JSCL has implemented it).

Macros are more complicated

Macros are processed in two stages:

  1. Their macroexpander code runs in the host system, producing an expansion
  2. The expansion is compiled into the target system.

Unlike functions, macros must be defined before they are used inside functions, because we need to call the macroexpander immediately.

Additionally we cannot compile built-in macros from the host system, because their expansion might use functions that are not available in JSCL.

Inside the macros, another macro can be used in two ways:

a) As part of the macroexpansion itself.

b) To compute some intermediate value that does not show up in the macroexpansion.

Again, any code can be used in the macroexpansion, as long as we define it later, but only already defined macros can be used in it.
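
The ordering constraint can be seen in a toy expander. The sketch below models forms as nested arrays and macroexpanders as JavaScript functions in a table; macros and macroexpand are hypothetical names for illustration and do not correspond to JSCL internals.

```javascript
// Stage 1 state: expanders that run in the "host".
const macros = new Map();

function macroexpand(form) {
  if (!Array.isArray(form)) return form;
  const [head, ...args] = form;
  if (typeof head === "string" && macros.has(head)) {
    // Run the expander now, then keep expanding its result.
    return macroexpand(macros.get(head)(...args));
  }
  return form.map((sub) => macroexpand(sub));
}

// Used before its definition: treated as an ordinary call and left alone.
const tooEarly = macroexpand(["unless", "done", ["work"]]);

macros.set("unless", (test, body) => ["if", ["not", test], body]);

// Used after its definition: stage 1 rewrites it; stage 2 would compile the result.
const expanded = macroexpand(["unless", "done", ["work"]]);
```

A function call left unexpanded is harmless because it is resolved at run time, whereas a macro call left unexpanded is compiled as if it were a function call, which is why the definition has to come first.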

Loading order

With this in mind, the loading / compilation phases are:

  • Load all :host/:both files in the host system. This adds a bunch of code that does not collide with the host system: def!struct, parse-macro, !loop, … can all live alongside the standard ones.

This code is easier to write and debug, because you can develop it in the host system: you can macroexpand any macro, run the compiler to see the output of forms, etc.

This includes the compiler/compiler.lisp.

  • Compile :target/:both files to the runtime system with the jscl compiler.

The compiler compiles function calls as described, along with special forms and primitive functions, generating code in the output jscl.js.

It starts with boot.lisp, to define the basic macros defmacro, defun, etc. These macros can use complicated functions like parse-macro, as they were loaded in the previous phase.

Macro definitions are used to extend both the host and target environments. Macro usages use only the host macroexpansions during bootstrap.

Runtime

We have jscl.js with all the JSCL code cross-compiled, and macro definitions are recorded as their code. This includes the compiler, so we can enter expressions in the REPL, and the compiler will produce additional JS code and continue extending the environment with new macros.

The bootstrap magic: why and how

In an ideal world, the compiler implements a (hopefully small) set of primitives, then we compile the bootstrapping files one by one (the *source* list in jscl.lisp) to get a full Lisp system. This means later files in the bootstrapping sequence (i.e. the *source* list) depend only on previous files, and src/boot.lisp depends only on the primitives. However, this is not practical. Look at the first form in src/boot.lisp: we define defmacro, and defmacro requires… at least a pattern matcher for the macro lambda list! It would be pages of unreadable code if we attempted to write the pattern matcher only from the primitives, inside the first top-level form of src/boot.lisp! (Remember we can’t define macros, because we’re defining defmacro.)

In practice, we have a bootstrap process (that has certain magical bits) so that we can use definitions that come later in *source* at compile time, and specifically in defmacro. A good example is again the first form in src/boot.lisp. The definition relies on destructuring-bind and parse-macro, defined later in src/lambda-list.lisp, and backquotes, defined very late in src/backquote.lisp. Much other bootstrap code has such “apparent” forward references, and this makes writing bootstrap code much easier.

Overall, the build process has three phases:

Phase 1
The source files (with :host or :both tag in *source*) are executed in the host CL.
Phase 2
The source files (with :target or :both tag in *source*) are compiled for the target in the host (cross compilation).

Note that compilation of a Lisp file involves both compile-time evaluation of some subforms (for the :compile-toplevel eval-when situation), and compilation of some subforms (for the :load-toplevel eval-when situation). The compile-time evaluation happens in the host and uses the host global environment, macro definitions, etc. It’s the only choice because JSCL is a compiler-only implementation. The compilation, by contrast, is completely controlled by us and uses our macroexpander, compiler and global environment *environment* representation. There are some tricky bits (i.e. the “bootstrap magic”) to make this really work, which will be explained later.

Phase 2a
Near the end of phase 2, the global lexical environment *environment* is dumped (serialized) by dump-global-environment in jscl.lisp. This is part of the “bootstrap magic”.
Phase 3
The output (JavaScript) from Phase 2 is loaded (eval in JS) in the target.
Phase 3a
Near the end of phase 3, the lexical environment dumped by Phase 2a is reinstalled as *environment* in the target. This is part of the “bootstrap magic”.
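
The way the tags select files for Phases 1 and 2 can be sketched as a simple filter. Everything in the block below is a placeholder chosen for illustration: the file names, their tags, and the filesFor helper are hypothetical and do not reflect the real contents of *source*.

```javascript
// Hypothetical *source*-like list; file names and tags are placeholders.
const source = [
  { file: "a.lisp", tag: "both" },
  { file: "b.lisp", tag: "host" },
  { file: "c.lisp", tag: "target" },
];

// A file takes part in a phase if it carries that phase's tag or "both".
function filesFor(phaseTag) {
  return source
    .filter((entry) => entry.tag === phaseTag || entry.tag === "both")
    .map((entry) => entry.file);
}

const phase1 = filesFor("host");   // executed in the host CL
const phase2 = filesFor("target"); // cross-compiled for the target
```

Files tagged as both appear in each list, which is what lets a definition exist in the host during Phase 1 and again in the target output of Phase 2.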

It’s now clear how those “apparent” forward uses at compile time work. The compile time evaluation of Phase 2 happens in the host and has access to all the definitions entered into the host in Phase 1, even if they occur apparently later in the *source* sequence.

However, there’s still a problem with defmacro. Look at the final definition of defmacro we use (in src/toplevel.lisp at the end of bootstrap); it expands to

`(eval-when (:compile-toplevel :load-toplevel :execute)
   (%compile-defmacro ',name ,expander))

This has both compile-time (:compile-toplevel) and load-time (:load-toplevel) side effects! A defmacro not only needs to be evaluated at compile time, so that the macro definition is available for compiling later forms; it also needs to be compiled itself, so that it produces the desired load-time side effect in the target (making the macro available to later files and the REPL). “Apparent” forward uses are not problematic for the evaluation, since they in fact refer to definitions in the host environment entered earlier in Phase 1; but they are problematic for the compilation. We don’t yet have the target macroexpanders to expand them, because these are entered into *environment* only later in Phase 2.

The solution is the “bootstrap magic”. During bootstrap, we use a special definition of defmacro (the first form in src/boot.lisp, yet again) that does not have the usual load-time side effect (the :load-toplevel situation is missing). It simply records the expander (in S-expression representation) in *environment*. We delay compilation of the macroexpanders to Phase 2a, at which time we have all the target macroexpanders (in S-expression representation) in *environment*. dump-global-environment then compiles all the target macroexpanders.
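
The record-now, compile-later idea can be modelled in a few lines of JavaScript. This is a loose analogy rather than JSCL's implementation: environment is a plain Map, expanders are stored as JavaScript source strings instead of S-expressions, and bootstrapDefmacro and toyCompile are hypothetical names.

```javascript
// Hypothetical model of *environment*: macro name -> expander record.
const environment = new Map();

// Bootstrap-style defmacro: record the expander source, compile nothing yet.
function bootstrapDefmacro(name, expanderSource) {
  environment.set(name, { source: expanderSource, compiled: null });
}

// Stand-in for the target compiler; here it just evaluates JavaScript source.
function toyCompile(source) {
  return new Function("return (" + source + ");")();
}

// Analogue of dump-global-environment: compile every recorded expander at once.
function dumpGlobalEnvironment() {
  for (const entry of environment.values()) {
    entry.compiled = toyCompile(entry.source);
  }
}

bootstrapDefmacro("unless", "(test, body) => ['if', ['not', test], body]");
const compiledBeforeDump = environment.get("unless").compiled;
dumpGlobalEnvironment();
const expansion = environment.get("unless").compiled("done", ["work"]);
```

Deferring the compile step means every expander is already recorded by the time any of them is compiled, so the order in which they were defined no longer matters.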

This has some extra consequences. The bootstrap definition of defmacro has non-standard semantics. The load-time side effect is missing (waiting to be reproduced later by dump-global-environment), and the macroexpanders are always evaluated in the null (global) lexical environment. We seldom really need closures as macroexpanders, so these discrepancies are not really problematic for bootstrap code (which we fully control ourselves). But this makes it unsuitable for user code. Therefore, we replace it with the standard definition summarily after dump-global-environment, in src/toplevel.lisp.