Replace OrdF64 type in the compiler with a first-class PositiveF64 type#988
Replace OrdF64 type in the compiler with a first-class PositiveF64 type#988apoelstra wants to merge 13 commits into
OrdF64 type in the compiler with a first-class PositiveF64 type#988Conversation
Update the type name to better reflect its domain invariants. The underlying floats are guaranteed to be both positive and not-NaN, making `PositiveF64` a more accurate and descriptive name than `OrdF64`, which only highlighted the `Ord` trait implementation.
Code move only. Retain the public constructor. We will eliminate its use over the following commits. Also, publicly expose the type in src/lib.rs, since we are going to start enforcing invariants on it and making it generally useful.
beea8eb to
dae7ba4
Compare
Our parser only allows nonzero u32s, so just use the type. This adds a bit of noise to the current code, but when we move the probability computations from f64 to PositiveF64, it will pay dividends. Also eliminates a couple panic paths where we were using .sum or bare addition to add up probabilities; now we always convert to f64 before doing the addition, which will lose precision but not panic if the numbers get too big.
…values Once we remove the unchecked constructor(s) from PositiveF64, we won't be able to do these "sum everything then divide by sum" constructions. Instead, encapsulate them into PositiveF64::normalized_iter. This is mildly annoying to use, but I wasn't able to come up with a nicer solution that avoided gratuitous clones, gave sane compiler errors if you misuse them, and had readable code.
Look ma, no type conversions!
By changing this one function and chasing compiler errors, I found I was able to (and required to) replace a *ton* of f64s with PositiveF64s. Amazingly, this did not require adding any new functionality to the PositiveF64 type, or any conversions. Instead, I was able to *delete* a ton of conversions, wrappers, unwrapping, etc. Now concrete.rs has literally no instances of 'f64' which is awesome. The next step is compiler.rs.
There is a ton of over-abstraction in AstElemeExt. Our original goal was to save on code by having one function, AstElemExt::type_check_common, which would match on a Terminal and do all the "fragment-specific" logic. But the consequence is that this needs to be really general to handle all the data that each fragment might need. In particular, disjunctions need "weights" which are provided by the user to indicate which branches are likely to be taken. No other kind of astelem needs extra data, but all the wrappers need infrastructure to carry this stuff around. By eliminating the terminal() function in place of direct constructors for each terminal type, we simplify the code and are able to also get rid of the type_check function. The next commits will eventually get rid of type_check_common, the branch_prob field of AstElemExt, a bunch of mutability and unwrap paths, and possibly more.
… functions Conjunctions don't need to set 'branch_prob', but we do set them, leading to confusing logic. By getting rid of these, we can make the conjunction logic simpler and avoid setting branch_prob to Some(0.0).
Had to move the logic into a separate function to avoid stack overflows, because we are recursing very large/complex functions. Other than that I tried to preserve the existing logic; I will double-back and refactor in a different commit. I also removed all the unused code from AstElemExt::type_check, since it would no longer compile -- I changed the disjunction code to take the left/right weights as parameters rather than storing them in the AstElemExt object. The goal here is to remove the `branch_prob` field entirely, which will then let us make AstElemExt immutable, which will unlock some further refactorings. But until I'm done this, I have to just shuffle around the existing logic and the result is pretty big/ugly.
This totally eliminates the type_check method, and stops using the branch_prob field of AstElemExt. The next commit will delete a ton of unused stuff, which I didn't do here to minimize the amount of noise in this commit. Then I will start converting to use PositiveF64 rather than f64 everywhere. Then I will double-back and clean up this repetitive code.
Costs remain using f64, since they may be zero (e.g. the satisfactio cost of a timelock). But all probilities in the compiler should be positive: if they are zero, the corresponding path is impossible and we represent that with an option.
It is now impossible to create an invalid PositiveF64. This eliminates any possibility of division by 0 in the compiler.
dae7ba4 to
424f23b
Compare
Abeeujah
left a comment
There was a problem hiding this comment.
This is well executed, I identified a few nits which are optional or could be done in a follow up.
| /// Attempts to create a [`PositiveF64`] from an ordinary `f64`. | ||
| pub fn new(f: f64) -> Option<Self> { | ||
| // Can likely make this const in Rust 1.83 | ||
| if f > 0.0 { |
There was a problem hiding this comment.
nit: Pending the MSRV upgrade for this to be made const, this can be rewritten to use the then_some method available on bool.
(f > 0.0).then_some(Self(f))There was a problem hiding this comment.
It would be better to use then rather than then_some, I suppose, because strictly speaking what you wrote results in a temporarily-existing invalid Self in the case that f <= 0. But good idea.
There was a problem hiding this comment.
I believe, then_some returns Some(T) if the bool is true , or None otherwise; which looks very much like what the function is doing.
| /// Returns the sum (or original value). Does not modify in-place. | ||
| #[must_use] | ||
| pub fn conditional_add(self, other: Option<Self>) -> Self { | ||
| if let Some(other) = other { |
There was a problem hiding this comment.
nit: This can also be written to make use of the map_or method available on Option
other.map_or(self, |i| i + self)| { | ||
| // Compute the sum of all the items in the iterator. Because all items in | ||
| // the iterator are positive, this will be 0 iff the iterator is empty. | ||
| let sum = iter.clone().map(|x| x.0).sum::<f64>(); |
There was a problem hiding this comment.
Yeah true, today's years old when I learned that.
| /// Returns `None` if the return value would be 0, which is impermissible for the | ||
| /// [`PositiveF64`] type. | ||
| pub fn one_minus_k_over_n<const MAX: usize, T>(t: &Threshold<T, MAX>) -> Option<Self> { | ||
| if t.is_and() { |
There was a problem hiding this comment.
nit: This could also use the then_some method, if adopted, the tradeoffs here would be conciseness over clarity
(!t.is_and()).then_some(Self(1.0 - (t.k() as f64 / t.n() as f64)))There was a problem hiding this comment.
Hah, I think this is too noisy.
| let rw = PositiveF64::from(subs[1].0) / total; | ||
|
|
||
| //and-or | ||
| if let (Concrete::And(x), _) = (subs[0].1.as_ref(), subs[1].1.as_ref()) { |
There was a problem hiding this comment.
I think this pattern here, could be simplified to just take the reference being used in this if block, and the one unused should not be evaluated at all, same for Line 158 below
There was a problem hiding this comment.
Sometimes we want to execute both sides of this. So there is no way to combine the two clauses into one.
| } | ||
| } | ||
| }; | ||
| if let (_, Concrete::And(x)) = (&subs[0].1.as_ref(), subs[1].1.as_ref()) { |
There was a problem hiding this comment.
Same as stated above, the 0th value should not be evaluated since it is unused.
| } | ||
|
|
||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
This function could benefit from Macros, it would greatly reduce the repetition.
There was a problem hiding this comment.
The whole compiler needs to be rewritten to be non-recursive. So I don't want to make the code harder to modify ahead of doing that.
There was a problem hiding this comment.
Great planning ahead, hopefully we get to eliminate a couple of the double nested loops too in the planned rewrite.
|
@Abeeujah can you stand on your head? |
I try, I try @yancyribbens |
|
How many digits of pi can you calculate? |
I usually just end up with a massive headache after 3.14 🤣 |
Ha! ok, you pass the Turing test :P |
Supercedes #972.
The
OrdF64type in the compiler is weaker than it needs to be. Nominally it exists just to force an ordering onf64, with a code comment that it "will panic if given NaN". But it doesn't actually enforce that you can't put NaN into it, or any other value. There are also lots of places where we do reckless division that could plausibly turn into 0.0 / 0.0, and where we add values asu32s (which might overflow and panic) before converting to f64 or OrdF64. Plus there are 'as f64` casts all over the place, which is code smell.In fact, all the probabilities used by the compiler, and parsed from policies, are required to be positive numbers (the policy values are even integers). So we can greatly improve this situation by renaming
OrdF64toPostiveF64, moving it to its own module, and enforcing that the contained value is always positive. This lets us:Infis a permissible value forPositiveF64)However, to do this we need to significantly refactor the compiler logic, which currently uses
Option<f64>to represent "branch probabilities" and does so in an incoherent way, whereNonerepresents "not yet set or unneeded" andSome(0.0)roughly represents "required to be set but unneeded" and there are.expects all over the place. The compiler also stores these branch probabilities in theAstElemExtstructure, which is totally unnecessary and results in a ton of mutation and arc-cloning.