Nix handles early cutoff with fixed output derivations. These must be manually kept up to date.

Because the dependency graph in Bramble is (currently) constructed using the hashes of input derivations, any change to a build input results in a full rebuild of everything downstream, even if the input's output is unchanged.

When thinking about a solution here, it's important to remember that all build inputs are either filesystem files or `fetch` derivations, so we shouldn't get too creative about storing build state.
### Always use the build output as the derivation hash

Currently, when injecting a derivation into a child derivation, we use a format like this: `{{ tmb75glr3iqxaso2gn27ytrmr4ufkv6d-.drv:out }}`. Alternatively, we could replace that with the hash of the output.
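
As a rough illustration of that replacement, here is a small sketch in plain Python. It is not Bramble's code: the reference grammar is guessed from the single example above, and `DRV_REF`, `substitute_outputs`, and the `output_hashes` mapping are hypothetical names.

```python
import re

# Matches references shaped like "{{ <hash>-<name>.drv:<output> }}".
# ASSUMPTION: this grammar is inferred from the one example in the text.
DRV_REF = re.compile(r"\{\{ ([a-z0-9]+-[^\s:]*\.drv):(\w+) \}\}")


def substitute_outputs(text, output_hashes):
    """Replace each derivation reference with the hash of its built output.

    output_hashes maps (drv filename, output name) -> output hash.
    References whose output is not known yet are left untouched.
    """
    def repl(match):
        key = (match.group(1), match.group(2))
        return output_hashes.get(key, match.group(0))

    return DRV_REF.sub(repl, text)
```

A reference is only rewritten once its output exists, so a partially built graph can still carry the old form for derivations that have not been built yet.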
I think this is what the steps would look like with this approach:

1. Calculate the entire derivation graph. If we already have build outputs computed, use them as the named hash.
2. Find the first derivation that needs to be built. Build it, then replace its derivation hash in all child derivations with the output hash. Continue building.

What's the cost here? We wouldn't know the derivations we're going to build without building them. That only means that we don't know in advance which things we're going to have to build. This seems fine (and I think it is mandatory for this feature).
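
The steps above can be sketched as a toy model. This is plain Python under stated assumptions, not Bramble's implementation: `Derivation`, `build_all`, the `build_fn` callback, and the in-memory `cache` are all hypothetical, and a "build" is simulated by a function that returns a string.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Derivation:
    name: str
    script: str
    inputs: list = field(default_factory=list)  # names of input derivations


def sha(text):
    return hashlib.sha256(text.encode()).hexdigest()[:16]


def build_all(drvs, build_fn, cache):
    """Build in dependency order, keying each build on the *output* hashes
    of its inputs instead of their derivation hashes.

    drvs: dict of name -> Derivation (assumed acyclic)
    build_fn: callable(Derivation) -> str, stands in for a real build
    cache: dict of build key -> output hash, kept between runs
    Returns (output hash per name, names that were actually rebuilt).
    """
    outputs, rebuilt = {}, []

    def visit(name):
        if name in outputs:
            return outputs[name]
        drv = drvs[name]
        input_hashes = [visit(dep) for dep in drv.inputs]
        key = sha(drv.script + "|" + ",".join(input_hashes))
        if key not in cache:
            # Only here do we learn that this derivation needed building.
            cache[key] = sha(build_fn(drv))
            rebuilt.append(name)
        outputs[name] = cache[key]
        return outputs[name]

    for name in drvs:
        visit(name)
    return outputs, rebuilt
```

In this model, editing an input's script forces that input to rebuild, but if its output bytes are identical the child's key is unchanged and the child is skipped. That is also why the set of derivations to build cannot be listed up front.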
Nix and Bazel don't allow dynamic dependencies. I think there is an argument to be made that this is the reason their ergonomics are so poor. Nix libraries that are intended to build arbitrary projects in a given language rely heavily on code generation. Arguably, this is a form of dynamic dependency.

I think it would be interesting to explore first-class support for dynamic dependencies in Bramble. Maybe if they are easy to use and set up, we can limit the number of derivations that need network access. If you can generate arbitrary calls to `fetch_url` within a derivation, then maybe you can get away with just that.

One idea:

There is a specific type of derivation that outputs Starlark code. It is a different color than regular derivations (so we can detect it statically) and outputs nothing but Starlark code. This Starlark code is run once the derivation is done building. We would need to update the dependency graph as we build.

This has some weird implications because we would still need to reference the build output. Do we need to ensure that the generated code outputs just a single derivation?
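
To picture what "updating the dependency graph as we build" could mean, here is a minimal worklist sketch in plain Python. Everything in it is an assumption for illustration: `build_dynamic` is a hypothetical name, and the `build` callback stands in for building a derivation and evaluating whatever Starlark it emitted, returning the names of any newly discovered derivations.

```python
from collections import deque


def build_dynamic(roots, build):
    """Build a queue of derivations that can grow while it is processed.

    roots: names of the derivations known before building starts
    build: callable(name) -> list of new derivation names; a regular
           derivation returns an empty list, a code-generating one
           returns whatever its generated code declared
    Returns the names in the order they were built.
    """
    queue = deque(roots)
    seen = set(roots)
    order = []
    while queue:
        name = queue.popleft()
        order.append(name)
        for new_name in build(name):
            # Newly generated derivations join the graph mid-build.
            if new_name not in seen:
                seen.add(new_name)
                queue.append(new_name)
    return order
```

The sketch sidesteps the open question above: it never wires the generated derivations' outputs back into the derivation that requested them.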
So let's think about numpy.

```python
def foo():
    pip_install("numpy")
```
There is no real way to do this because numpy will need to download its own dependencies. So we could either:

1. Download them within the derivation using the network, but then other dependencies might pull in their own independent dependencies, which would be duplicated and might conflict.
2. Generate code for each dependency, which totally works, but doesn't have first-class support.

```python
def foo():
    pip_install("numpy")

def pip_install(name):
    # Look up the package's dependency list (placeholder URL).
    deps = fetch_url("dependency_finder.gov/" + name)
    # Code-generating derivation: the script emits one fetch_url call per dependency.
    derivation(script="""
out = ""
for dep in deps:
    out += 'fetch_url("%s")\\n' % dep
return out
""")
```
Terrible pseudo-code, but basically this derivation returns starlark code with the deps we need to download.

If we go this route, we would need to be able to check if we've generated this derivation on the fly. I think if we don't do that, it would be very hard to do things like validating current URL hashes without actually building.

We could just stick the outputted Starlark into the store somewhere, but it might be better to generate code and keep it in the source of the project. If the interface is just like in the example above, `fetch_url("numpy")`, then what if a new version of numpy is published? Any time there was a rebuild, the URL hash would mismatch. I think ideally, if we want to replicate something like a `Cargo.lock`, we would need the output of the code-generating derivation to be placed within the project tree. That way, the generated file could reference very specific versions of software to fetch. If the end user wanted to fetch a new version, they would simply delete the generated file.
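
The "delete the generated file to update" behavior could look roughly like the sketch below. It is plain Python, and the names are hypothetical: `load_or_generate` is not a Bramble function, and `generate` stands in for running the code-generating derivation.

```python
from pathlib import Path


def load_or_generate(lock_path, generate):
    """Return the pinned generated source, generating it only if missing.

    lock_path: where the generated file lives in the project tree
               (the location and naming are assumptions)
    generate: callable() -> str, stands in for running the
              code-generating derivation, which may use the network
    """
    path = Path(lock_path)
    if path.exists():
        # Pinned: reuse the exact versions recorded earlier, no network.
        return path.read_text()
    source = generate()
    path.write_text(source)
    return source
```

With this shape, a rebuild never re-resolves versions on its own. Removing the file is the only thing that triggers a fresh resolution.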
This doesn't really remove the need for derivations that access the network. The code generating derivation would still need to make a request for the `numpy` source in order to calculate dependencies.

This kind of thing would mean that you could truly write a derivation like `pip_install("numpy")` without a separate code generation step that requires its own setup.

We could also prevent code generated by a code-generating derivation from calling another code-generating derivation, at least at the start, to limit all kinds of gnarly behavior.

----

More thoughts.
This might actually be a good idea. We could then move forward with limiting network functionality only to derivations that use the network. That way we could be sure that after we've processed all derivations initially we can proceed from there without ever using the network.

One complication here is that Bramble libraries could use this as well, so we can't generate Bramble code and then store it next to the initial source file. Generated code could also depend on passed parameters, so we wouldn't be able to check a file's lock file just by analyzing the source of that file; we'd need to be sure we were testing it with whatever parameters were passed to functions in that file.

Wait, maybe not: the generating function will always be called without arguments. So maybe we just put the code in the project next to the file that calls the function.
Ok, either way, that needs to be sorted out, and we might want to consider just adding generated code to the lockfile.