Currently, we have this error:
fun evil(e: Example) -> int {
    return e.a1 + 20;
}
class Example {
    //? Cannot use 'self' in direct member initializer.
    var a1 = evil(self);
    var b = 10;
}
fun init() {
    var e = new Example{};
}
As can be seen, this prevents us from accidentally reading from a1 in its own initializer.
But we don't yet handle the following:
class Example {
    // Not rejected yet: oops() returns self, so evil() still reads a1 before it is initialized.
    var a1 = evil(oops());
    var b = 10;
    fun oops() -> Example { self }
}
To be consistent, and to prevent this uninitialized-read scenario (which would be Very Bad in the case of e.g. a pointer, even if we zero-initialized it first), we probably need to prevent any use of self, including implicit uses like calling a method, in a "direct member initializer."
But here's the problem: that seems really annoying for some use cases.
The example that comes to mind: what if we want to store a function that, by default, refers to self?
class Button {
    var on_click = fun() { do_something_external(self); };
}
See, this is actually safe! The function is only stored, not called, so nothing reads self during initialization, and the reference to self here seems like it should be allowed. The trouble is, if we allow this, then what about the following:
class Button {
    var on_click = fun() -> int { do_something_external(self) };
    var oops = on_click();
}
fun do_something_external(b: Button) -> int { return b.oops; }
And now we have a problem again.
Now, with enough control-flow analysis, it would in theory be possible to find all the scenarios where this is unsafe (conservatively, at least, so some safe code would get rejected too). For example,
class Button {
    var on_click = fun() -> int { do_something_external(self) };
    var oops = {
        var my_fun = on_click;
        if coin_flip() { my_fun = other_fun; }
        my_fun(); // Doesn't compile because it could be on_click, which refers to self
    };
}
But this seems like a lot of work.
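To get a feel for how conservative the simplest version of this could be, here's a rough sketch in Python (not PonieScript, and not how the compiler works today). It assumes each initializer has already been summarized into two facts: whether its value is a closure that mentions self, and the set of names it might call with all branches merged. The Initializer class and unsafe_initializers are made-up names for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Initializer:
    """Hypothetical per-member summary a compiler pass might produce."""
    name: str
    # True if the initializer's value is a closure that mentions self.
    captures_self: bool = False
    # Names that might be called while this initializer runs. Both sides
    # of every branch are merged, so this over-approximates.
    may_call: Set[str] = field(default_factory=set)


def unsafe_initializers(members: List[Initializer]) -> List[str]:
    """Return members whose initializer might call a self-capturing closure."""
    tainted = {m.name for m in members if m.captures_self}
    return [m.name for m in members if m.may_call & tainted]


# Mirrors the Button example: my_fun could be on_click or other_fun.
button = [
    Initializer("on_click", captures_self=True),
    Initializer("oops", may_call={"on_click", "other_fun"}),
]
print(unsafe_initializers(button))  # ['oops']
```

The hard part this skips is computing may_call in the first place, and following calls transitively through ordinary functions. That's where the "lot of work" is.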
The other option is, we could make this a runtime check. Basically, while an object is being initialized, any call to a function that refers to self would cause a panic. The main problem is that I'm not sure how to do this without it being super inefficient. Essentially, any function that could be transitively called by an initializer would either need the check hard-coded in, or need to be compiled twice: once as an "initializer" version and once as the normal version.
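Here's a small Python sketch of the "check hard-coded in" variant, just to make the cost concrete. Everything here is an assumption for illustration: the initializing flag, the check_initialized helper, and InitPanic (standing in for a real panic) are not existing PonieScript features.

```python
class InitPanic(RuntimeError):
    """Stands in for a language-level panic."""


def check_initialized(obj):
    # The check every self-referring function would have to carry.
    if obj.initializing:
        raise InitPanic("self used while the object is still being initialized")


def do_something_external(b):
    return b.oops


class Button:
    def __init__(self, call_during_init):
        self.initializing = True

        def on_click():
            check_initialized(self)  # paid on every call, not only during init
            return do_something_external(self)

        self.on_click = on_click
        # Storing on_click is fine; calling it during initialization panics.
        self.oops = self.on_click() if call_during_init else 10
        self.initializing = False
```

Button(False) constructs normally and its on_click() returns 10 afterwards, while Button(True) raises InitPanic before oops is ever read. The downside is visible in on_click: the check runs on every call for the lifetime of the object.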
One other option to consider... we could make the "direct member initializers" two-step. Basically, we could let the user provide an initial (dummy) value, which has some extreme restrictions, plus the real default value, which has no restrictions and is assigned once every member holds something. That way, it would not be possible to read an uninitialized value, but things would still usually work out nicely. Maybe something like:
class Button {
    @delayed_initialize(dummy = fun() -> int { 0 })
    var on_click = fun() -> int { do_something_external(self) };
    var oops = {
        var my_fun = on_click;
        // Doesn't compile because we can't refer to delayed_initialize properties, which makes sense
        if coin_flip() { my_fun = other_fun; }
    };
}
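As a rough picture of the order of operations this would imply, here's the same idea written out by hand in Python. This is only a sketch under my own assumptions about how @delayed_initialize would desugar. Python also can't express the compile-time rule that other initializers may not refer to the delayed member, so oops is just a plain value here.

```python
def do_something_external(b):
    return b.oops


class Button:
    def __init__(self):
        # Step 1: the restricted dummy value, which may not mention self.
        self.on_click = lambda: 0
        # Ordinary member initializers run next.
        self.oops = 10
        # Step 2: the unrestricted default, assigned only after every
        # member already holds some value.
        self.on_click = lambda: do_something_external(self)
```

After construction, Button().on_click() returns 10, and at no point was there a member with no value at all.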
Ultimately, I do feel like my preference would be some sort of data-flow analysis, even if it was really conservative. This is, of course, pretty annoying to implement, but it would give the most user-friendly result.
(I think in general I do need to improve the data-flow analysis of these sorts of cyclic initialization checks, as the experience of using globals in PonieScript has been... Not Great so far).