|
| 1 | +# pycobytes[31] := Programmers Are Lazy |
| 2 | +<!-- #SQUARK live! |
| 3 | +| dest = issues/(issue)/31 |
| 4 | +| title = Programmers Are Lazy |
| 5 | +| head = Programmers Are Lazy |
| 6 | +| index = 31 |
| 7 | +| tags = keywords |
| 8 | +| date = 2025 June 5 |
| 9 | +--> |
| 10 | + |
| 11 | +> *There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. The other is to make it so complicated that there are no obvious deficiencies.* |
| 12 | +
|
| 13 | +Hey pips! |
| 14 | + |
| 15 | +Check out this list comprehension: |
| 16 | + |
| 17 | +```py |
| 18 | +>>> [(x**n) % (n-1) for n in range(2, 10**10) for x in range(2, 1000)] |
| 19 | +... |
| 20 | +``` |
| 21 | + |
| 22 | +Now that’s a massive list right there. Computing it is gonna take a while, to say the least, and it’s almost certainly not fitting in memory. |
| 23 | + |
| 24 | +We’ve seen that `range` is very efficient even with large numbers because it computes values on the fly. We want the same thing here, where the iterable only produces the values when they’re needed. |
| 25 | + |
| 26 | +This is a form of **lazy evaluation**, and Python has a neat construct just for it. Turn your square brackets into round parentheses, and you’ve got yourself what’s known as a **generator expression**. |
| 27 | + |
| 28 | +```py |
| 29 | +>>> ((x**n) % (n-1) for n in range(2, 10**10) for x in range(2, 1000)) |
| 30 | +<generator object <genexpr> at 0x00000204F8F45030> |
| 31 | +``` |
| 32 | + |
| 33 | +Like its name suggests, this object *generates* values one-by-one as they’re needed. The key difference between a generator and, say, a `list`, is that a generator doesn’t store the actual values – it only computes them when asked for them. |
| 34 | + |
| 35 | +```py |
| 36 | +firsts = (row.get_first() for row in table) |
| 37 | +``` |
| 38 | + |
| 39 | +What does this mean? Well, it makes generators very memory-efficient, especially for long collections – only a single element has to be kept in memory, instead of the entire collection. In fact, this also allows generators to be infinite! You can just keep asking them to compute values indefinitely. |
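
Here's a rough sketch of both claims, nothing rigorous. `sys.getsizeof` only measures the container object itself (not the integers inside it), and the names `squares_list`, `squares_gen` and `evens` are made up for the demo.

```python
import sys
from itertools import count

# A list stores every value up front; a generator stores only its paused state.
squares_list = [n * n for n in range(100_000)]
squares_gen = (n * n for n in range(100_000))

list_size = sys.getsizeof(squares_list)  # grows with the number of elements
gen_size = sys.getsizeof(squares_gen)    # small and fixed, whatever the length
print(list_size > gen_size)

# count() never ends, so this generator is infinite; values appear only on demand.
evens = (n for n in count() if n % 2 == 0)
first_three = [next(evens) for _ in range(3)]
print(first_three)
```

The exact byte counts vary between Python versions, so the sketch only compares them rather than printing numbers to rely on.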
| 40 | + |
| 41 | +So, how do we fetch those values? |
| 42 | + |
| 43 | +Well, generators are iterators (there is a distinction, it’s very technical tho icl), which means you can call `next()` on them to access the next element in the iterable: |
| 44 | + |
| 45 | +```py |
| 46 | +>>> netherlands = (n ** 0.5 / 2 for n in range(0, 314)) |
| 47 | +>>> next(netherlands) |
| 48 | +0.0 |
| 49 | +>>> next(netherlands) |
| 50 | +0.5 |
| 51 | +``` |
| 52 | + |
| 53 | +But 99.7% (3 sf) of the time, you’re just iterating over them with a `for` loop. |
| 54 | + |
| 55 | +```py |
| 56 | +>>> netherlands = (n ** 0.5 / 2 for n in range(0, 314)) |
| 57 | +>>> for n in netherlands: |
| 58 | +    if n > 1: |
| 59 | +        break |
| 60 | +    print(n) |
| 61 | +0.0 |
| 62 | +0.5 |
| 63 | +0.7071067811865476 |
| 64 | +0.8660254037844386 |
| 65 | +1.0 |
| 66 | +``` |
| 67 | + |
| 68 | +> [!Tip] |
| 69 | +> Under the hood, the `for` loop calls `iter()` on the iterable once, then repeatedly calls `next()` on the resulting iterator until it’s exhausted (signalled by a `StopIteration` exception). |
| 70 | +
|
| 71 | +Generator expressions are convenient, but they’re actually syntactic sugar for the more fundamental underlying mechanism – **generator functions**. |
| 72 | + |
| 73 | +A generator function, instead of `return`ing a single value, uses a different keyword `yield` to return multiple values. |
| 74 | + |
| 75 | +```py |
| 76 | +def generate(): |
| 77 | +    yield "never" |
| 78 | +    yield "gonna" |
| 79 | +    yield "give" |
| 80 | +    yield "you" |
| 81 | +``` |
| 82 | + |
| 83 | +If it makes it easier to understand at first, you can think of `yield` like a `return` that doesn’t stop execution – instead the function keeps running, and each `yield` adds another value to an output iterable. |
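
Under that simplified mental model, collecting everything the function yields looks like this. It's just a quick sketch reusing the `generate()` function from above; the `words` variable is mine.

```python
def generate():
    yield "never"
    yield "gonna"
    yield "give"
    yield "you"

# list() keeps asking for values until the generator has none left.
words = list(generate())
print(words)  # ['never', 'gonna', 'give', 'you']
```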
| 84 | + |
| 85 | +But um, akshually, generator functions are a lot weirder than that... What really happens is `yield` returns the value, and then *pauses* the function. |
| 86 | + |
| 87 | +Yeah, it’s wack. |
| 88 | + |
| 89 | +> [!Tip] |
| 90 | +> This is difficult to explain, so you may wanna do your own research to help you understand it. |
| 91 | +
|
| 92 | +We’ll start by adding some print statements to a generator function to track what’s happening: |
| 93 | + |
| 94 | +```py |
| 95 | +def source(): |
| 96 | +    print("gen: STARTING UP") |
| 97 | + |
| 98 | +    print("gen: YIELDING 1") |
| 99 | +    yield 1 |
| 100 | +    print("gen: YIELDED") |
| 101 | + |
| 102 | +    print("gen: YIELDING 2") |
| 103 | +    yield 2 |
| 104 | +    print("gen: YIELDED") |
| 105 | + |
| 106 | +    print("gen: WRAPPING UP") |
| 107 | +``` |
| 108 | + |
| 109 | +When we iterate over the generator this function returns in a `for` loop, the `for` loop is continually calling `next()` on it: |
| 110 | + |
| 111 | +```py |
| 112 | +for each in source(): |
| 113 | +    do_shenanigans() |
| 114 | + |
| 115 | +# essentially: |
| 116 | +items = source() |
| 117 | +each = next(items) |
| 118 | +do_shenanigans() |
| 119 | +each = next(items) |
| 120 | +do_shenanigans() |
| 121 | +... |
| 122 | +``` |
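
The expansion above trails off with `...`, so here's a slightly fuller sketch of how the loop knows when to stop. It uses a cut-down `source()` without the prints, and the `received` list is only there so there's something to look at. This is an approximation of what `for` does, not the interpreter's actual implementation.

```python
def source():
    yield 1
    yield 2

received = []

items = iter(source())  # a generator is its own iterator, so this returns it unchanged
while True:
    try:
        each = next(items)
    except StopIteration:
        break  # the generator function finished; the loop ends quietly
    received.append(each)

print(received)  # [1, 2]
```

When the generator function runs off the end of its body, the next `next()` call raises `StopIteration`, which the `for` loop swallows for you.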
| 123 | + |
| 124 | +Each time it calls `next()`, the generator function resumes execution until it encounters a `yield`. It then passes that yielded value to the `for` loop, which assigns it to your iterating variable. |
| 125 | + |
| 126 | +```py |
| 127 | +def source(): |
| 128 | +    print("gen: STARTING UP") |
| 129 | + |
| 130 | +    print("gen: YIELDING 1") |
| 131 | +    yield 1 # returns and pauses |
| 132 | +    print("gen: YIELDED") # not run! |
| 133 | + |
| 134 | +    ... |
| 135 | +``` |
| 136 | + |
| 137 | +```py |
| 138 | +for each in items: |
| 139 | +    ...  # first value becomes 1 |
| 140 | +``` |
| 141 | + |
| 142 | +Then the body of the `for` loop runs, and when it’s done, the `for` loop calls `next()` on the generator again. The generator picks up where it left off (and remembers all the state!), runs until it `yield`s the next value, passes it to the `for` loop, and so on. |
| 143 | + |
| 144 | +```py |
| 145 | +>>> for each in source(): |
| 146 | +    print(f"for: RECEIVED {each}") |
| 147 | + |
| 148 | +gen: STARTING UP |
| 149 | +gen: YIELDING 1 |
| 150 | +for: RECEIVED 1 |
| 151 | +gen: YIELDED |
| 152 | +gen: YIELDING 2 |
| 153 | +for: RECEIVED 2 |
| 154 | +gen: YIELDED |
| 155 | +gen: WRAPPING UP |
| 156 | +``` |
| 157 | + |
| 158 | +Notice how the `gen: YIELDED` lines don’t print *before* the `for` loop says it received the value, but *after*. When the generator function `yield`s a value, that’s when execution is suspended and control is passed back to its caller. |
| 159 | +
|
| 160 | +```py |
| 161 | +gen: STARTING UP |
| 162 | +gen: YIELDING 1 # generator yields and pauses |
| 163 | +for: RECEIVED 1 |
| 164 | +gen: YIELDED # generator continues |
| 165 | +``` |
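
To see the "remembers all the state" part on its own, here's a small sketch. `running_total` is a made-up example function, not anything built in; its local variable `total` survives across every pause.

```python
def running_total(numbers):
    total = 0  # local state, kept alive between yields
    for number in numbers:
        total += number
        yield total  # pause here; total is still around on resume

totals = running_total([3, 4, 5])
first = next(totals)
second = next(totals)
third = next(totals)
print(first, second, third)  # 3 7 12
```

A regular function would lose `total` the moment it returned, which is why this pattern needs `yield`.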
| 166 | +
|
| 167 | +Whew. Lot to take in. If you manage to understand it (don’t worry, it takes time), this pretty intuitively explains why generators can only produce values *in order*. |
| 168 | +
|
| 169 | +It’s got that sequence of `yield` statements with potentially anything in between, so each yielded value can depend on everything that ran before it. You can’t use random access[^rand-access] (indexing) like with a list, because computing element `42` requires computing elements `0` through `41` too. |
| 170 | + |
| 171 | +[^rand-access]: I thought I’d lied because I hadn’t told you about `itertools.islice()`, but it turns out that doesn’t do random access, it literally computes all the values leading up to the first value you want 💀 |
| 172 | + |
| 173 | +> [!Warning] |
| 174 | +> This also explains why `range()` isn’t a generator, just a lazy sequence – it allows indexing, so it doesn’t have to be iterated over in order. (`zip()` isn’t a generator either, but it’s a plain iterator, so it can’t be indexed.) |
| 175 | + |
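
A quick sketch of the difference, assuming nothing beyond the standard library. The variable names are mine, and `subscript_failed` only exists so the failure is visible.

```python
from itertools import islice

lazy_range = range(10**10)
print(lazy_range[42])  # 42, worked out directly without iterating

squares = (n * n for n in range(10**10))
subscript_failed = False
try:
    squares[42]
except TypeError as error:
    subscript_failed = True
    print("generators can't be indexed:", error)

# islice gets to item 42 by consuming items 0 through 41 first.
item_42 = next(islice(squares, 42, None))
print(item_42)  # 1764

# The generator has moved on for good; the next value is item 43.
following = next(squares)
print(following)  # 1849
```

Note that the skipped values are gone afterwards, since a generator can't rewind.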
| 176 | +All in all, I’m sure you’ll come to appreciate the beauty of generators as you solidify your understanding of how they work. At heart, they’re just lazy-loading iterables – it’s literally like Minecraft loading in chunks as you move in the world. |
| 177 | + |
| 178 | +In fact, you’ve probably already seen a nonzero number of built-ins in Python that work like generators. `open()` is a common one, returning a file object that lazily yields the lines of the file when iterated over (strictly speaking it’s an iterator rather than a generator, but it behaves the same way in a `for` loop). Reading a 270,000-line .csv file into memory all at once could be disastrous, so it’s a good thing the file object only hands you one line at a time! |
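
Here's a minimal sketch of that idea, chaining a generator expression onto a file. The temporary file and its contents are invented for the demo, so swap in your own path.

```python
import os
import tempfile

# Write a small stand-in file; the name and contents are made up for this demo.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as handle:
    handle.write("id,name\n1,ada\n2,grace\n3,linus\n")
    path = handle.name

with open(path) as file:
    # The file hands over one line at a time, and the generator expression stays lazy too.
    names = (line.strip().split(",")[1] for line in file)
    header = next(names)     # pulls only the first line
    collected = list(names)  # pulls the rest, still one line at a time

os.remove(path)
print(header, collected)
```

The generator has to be consumed inside the `with` block, because once the file is closed there's nothing left to read from.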
| 179 | + |
| 180 | + |
| 181 | +<br> |
| 182 | + |
| 183 | + |
| 184 | +## Further Reading |
| 185 | + |
| 186 | +- High-quality tutorial on generator functions and expressions – [mCoding<sup>↗</sup>](https://youtube.com/watch?v=tmeKsb2Fras) |
| 187 | +- Generators across programming languages – [Wikipedia<sup>↗</sup>](https://wikipedia.org/wiki/Generator_(computer_programming)) |
| 188 | + |
| 189 | + |
| 190 | +<br> |
| 191 | + |
| 192 | + |
| 193 | +## Challenge |
| 194 | + |
| 195 | +Can you write a 1-line expression that generates [perfect](https://wikipedia.org/wiki/Perfect_number) numbers? |
| 196 | + |
| 197 | +```py |
| 198 | +>>> perf = (your_expression) |
| 199 | + |
| 200 | +>>> [next(perf) for i in range(4)] |
| 201 | +[6, 28, 496, 8128] |
| 202 | +``` |
| 203 | + |
| 204 | +Your expression should be able to keep generating numbers indefinitely, in theory. |
| 205 | + |
| 206 | + |
| 207 | +<br> |
| 208 | + |
| 209 | + |
| 210 | +--- |
| 211 | + |
| 212 | +<div align="center"> |
| 213 | + |
| 214 | +[](http://thecodelesscode.com/case/66) |
| 215 | + |
| 216 | +[*The Codeless Code*, Case 66](http://thecodelesscode.com/case/66) |
| 217 | + |
| 218 | +</div> |