Feature Request
I think this is a good idea, though I am not quite sure how one would implement it beyond a binary warning (e.g. "deterministic model unlikely to be valid").
If I have two independent variables, e.g. two dice, I can use PySR to look for a function that predicts the second roll from the score of the first roll.
PySR returns y = 3.5, which is the expectation value of the second die. My feature suggestion is that PySR should report some measure of independence in cases like this, highlighting that there is unlikely to be a deterministic relationship between x and y.
This is a silly example of course, but I have seen students use PySR to look for patterns in data that are effectively the same thing in higher dimensions, where the nonsensical part is not as obvious.
Maybe something like a reduced chi-squared could be flagged? Minimal example below:
import numpy as np
from pysr import PySRRegressor

# Two independent dice: x is the first roll, y is the second roll.
x = np.random.randint(6, size=10000) + 1
y = np.random.randint(6, size=10000) + 1

Dice = PySRRegressor(
    maxsize=25,
    niterations=100,
    binary_operators=["+", "*"],
    elementwise_loss="LPDistLoss(2)",
    batching=True,
    batch_size=128,
    parsimony=1e-10,
    model_selection="score",
)

# PySR expects a 2D feature array, so reshape x into a single column.
Dice.fit(x.reshape(-1, 1), y)
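To make the suggestion a bit more concrete, here is a rough sketch of one possible check, not something PySR does today as far as I know. It compares the mean squared error of a model's predictions with that of the best constant model (the mean of y), which is just the usual R^2. The function names and the 0.01 threshold are my own assumptions, and a reduced chi-squared or a proper independence test might well be a better choice.

```python
import numpy as np


def explained_variance_fraction(y_true, y_pred):
    """Return 1 - MSE(model) / MSE(constant mean), i.e. the usual R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = np.mean((y_true - y_pred) ** 2)
    baseline = np.var(y_true)  # MSE of always predicting mean(y)
    if baseline == 0.0:
        # y is constant, so there is no variance to explain.
        return 0.0
    return 1.0 - residual / baseline


def looks_independent(y_true, y_pred, threshold=0.01):
    """Hypothetical warning flag: True if the model barely beats a constant.

    The threshold is an arbitrary illustrative value.
    """
    return explained_variance_fraction(y_true, y_pred) < threshold


rng = np.random.default_rng(0)
x = rng.integers(1, 7, size=10000)  # first die
y = rng.integers(1, 7, size=10000)  # second die, independent of x

# The dice case: the "model" is the constant 3.5.
constant_pred = np.full(y.shape, 3.5)
print(looks_independent(y, constant_pred))

# A genuinely deterministic relationship, for contrast.
dependent_y = 2 * x + 1
print(looks_independent(dependent_y, 2 * x + 1))
```

With the fitted model from the example above, the predictions could come from `Dice.predict(x.reshape(-1, 1))` instead of the hand-written constant. A check like this only says that the model explains almost none of the variance; it does not prove that x and y are independent.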