| 1 | +--- |
| 2 | +title: 2.5 Type annotations |
| 3 | +teaching: 15 |
| 4 | +exercises: 45 |
| 5 | +--- |
| 6 | + |
| 7 | +::::::::::::::::::::::::::::::::::::::: objectives |
| 8 | + |
| 9 | +- Understand the advantages of type annotations |
| 10 | +- List the most important type checkers |
| 11 | +- Apply type annotations to simple functions |
| 12 | +- Read parametric types like `list[int]` or `set[str]` |
| 13 | +- Understand the use of type annotations in libraries like `pydantic`, `cattrs` or `msgspec`. |
| 14 | + |
| 15 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 16 | + |
| 17 | +:::::::::::::::::::::::::::::::::::::::: questions |
| 18 | + |
| 19 | +- What are type annotations? |
| 20 | +- Why are types needed? Our scripts run fine without them! |
| 21 | + |
| 22 | +:::::::::::::::::::::::::::::::::::::::::::::::::: |
| 23 | + |
| 24 | +## Why? |
| 25 | + |
| 26 | +- Check your code for correctness before you run it. |
| 27 | +- Force you to handle edge cases. |
| 28 | +- Have documentation that is always correct. |
| 29 | +- Better autocompletion. |
| 30 | +- Automatic serialization. |
| 31 | + |
| 32 | +## Tools |
| 33 | + |
| 34 | +There are many type checkers available: |
| 35 | + |
| 36 | +MyPy |
| 37 | +: The original Python type checker, created by Jukka Lehtosalo and developed further with significant involvement from Guido van Rossum. |
| 38 | + |
| 39 | +PyRight |
| 40 | +: Often more feature-rich than MyPy; developed by Microsoft and used as the engine behind its Pylance language server. |
| 41 | + |
| 42 | +Ty |
| 43 | +: By Astral. It is not yet feature complete, but Astral also brought us `ruff` and `uv`, both stellar tools, so this could become the type checker of the future. |
| 44 | + |
| 45 | +::: info |
| 46 | +Tip: try several of these on some of the examples below. Which type checker has the nicest error messages? Did it find all of the bugs? |
| 47 | +::: |
| 48 | + |
| 49 | +::: challenge |
| 50 | +### Exercise: type errors |
| 51 | + |
| 52 | +Can you spot the mistake in the following code? |
| 53 | + |
| 54 | +```python |
| 55 | +def logistic_map(r, x): |
| 56 | +    return r * x * (1 - x) |
| 57 | + |
| 58 | +logistic_map("hello", 2.4) |
| 59 | +``` |
| 60 | + |
| 61 | +What does the error message say went wrong? Do you think this is a good message? |
| 62 | + |
| 63 | +:::: solution |
| 64 | +The error message reads `can't multiply sequence by non-int of type 'float'` and points to the sub-expression `r * x`. The actual mistake, however, is made at the call site, where a `str` is passed as the first argument. |
| 65 | +:::: |
| 66 | + |
| 67 | +Change the function signature to: |
| 68 | + |
| 69 | +```python |
| 70 | +def logistic_map(r: float, x: float) -> float: |
| 71 | +    return r * x * (1 - x) |
| 72 | +``` |
| 73 | + |
| 74 | +Do you see any effect on the erroneous call in your editor? |
| 75 | +::: |
| 76 | + |
| 77 | +### Abstract types |
| 78 | + |
| 79 | +We don't always care about the precise type of an object. For instance, to write a `for` loop we only need an iterable, and sometimes we want to express that `Any` object will do: |
| 80 | + |
| 81 | +```python |
| 82 | +from typing import Any |
| 83 | +from collections.abc import Iterable |
| 84 | + |
| 85 | +def print_numbered_list(items: Iterable[Any]): |
| 86 | +    for i, v in enumerate(items): |
| 87 | +        print(i, v) |
| 88 | +``` |
| 89 | + |
| 90 | +There are many abstract types available in `collections.abc`. |
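
As a minimal sketch of what the abstract type buys us, the same function accepts a list, a tuple, or a generator, because all of them are iterables. The helper `numbered` is a hypothetical variant of the function above that returns strings instead of printing, so the result is easy to inspect.

```python
from collections.abc import Iterable
from typing import Any


def numbered(items: Iterable[Any]) -> list[str]:
    """Return one 'index value' string per item (hypothetical helper)."""
    return [f"{i} {v}" for i, v in enumerate(items)]


# A list, a tuple and a generator are all valid Iterable[Any] arguments.
from_list = numbered(["a", "b"])
from_tuple = numbered(("a", "b"))
from_generator = numbered(c for c in "ab")
```

Had the parameter been annotated as `list[Any]`, a type checker would reject the tuple and generator calls even though they run fine.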
| 91 | + |
| 92 | +### Completion |
| 93 | + |
| 94 | +Write a function that changes all commas to semi-colons. Start by entering the following: |
| 95 | + |
| 96 | +```python |
| 97 | +def semicolonize(s: str) -> str: |
| 98 | +    return s |
| 99 | +``` |
| 100 | + |
| 101 | +Type a `.` after the `s`. Can you see the completion? |
| 102 | + |
| 103 | +## Data classes |
| 104 | + |
| 105 | +::: info |
| 106 | +### Data before classes |
| 107 | +In many languages, structures or records are considered more primitive than classes. Not so in Python. We will learn more about classes and their place in software design in part 3. In this section we'll only consider data classes as a means of grouping data. |
| 108 | +::: |
| 109 | + |
| 110 | +Type annotations go really well together with data classes, a means of combining elements into a larger data structure. Python supports creating such classes from type annotations like so: |
| 111 | + |
| 112 | +```python |
| 113 | +from dataclasses import dataclass |
| 114 | + |
| 115 | +@dataclass |
| 116 | +class Address: |
| 117 | +    street: str |
| 118 | +    number: int |
| 119 | +    suffix: str | None = None |
| 120 | + |
| 121 | +address = Address("Science Park", 402, "Matrix THREE") |
| 122 | + |
| 123 | +print(f"{address.street} {address.number}") |
| 124 | +``` |
| 125 | + |
| 126 | +Now you don't need to define an `__init__` method. There are nice packages that use this technique to allow automatic serialisation and deserialisation. Check out the [`msgspec` package](https://jcristharif.com/msgspec/index.html). |
| 127 | + |
| 128 | +::: challenge |
| 129 | +### Autocompletion |
| 130 | + |
| 131 | +Write a function that prints an address. How is your IDE behaving with and without type annotation? |
| 132 | + |
| 133 | +```python |
| 134 | +def print_address(a: Address): |
| 135 | +    ... |
| 136 | +``` |
| 137 | + |
| 138 | +:::: solution |
| 139 | +With the type annotation, the editor knows that `a` is an `Address` and can offer its fields as completions. Without it, the editor has nothing to go on. |
| 140 | +:::: |
| 141 | +::: |
| 142 | + |
| 143 | +::: challenge |
| 144 | +### Serialization |
| 145 | + |
| 146 | +Install `msgspec` and try writing and reading back an `Address` object to JSON. Can you think of the advantages of using this approach over Python native `json.dump`? |
| 147 | + |
| 148 | +:::: solution |
| 149 | +- less code |
| 150 | +- automatic validation |
| 151 | +- user friendly error reporting |
| 152 | +- high performance |
| 153 | +:::: |
| 154 | +::: |
| 155 | + |
| 156 | +## Optional: Generics and protocols |
| 157 | + |
| 158 | +How would we type a function that returns the first element in a list? Suppose that we know that the list contains integers. Then: |
| 159 | + |
| 160 | +```python |
| 161 | +def first(lst: list[int]) -> int: |
| 162 | +    ... |
| 163 | +``` |
| 164 | + |
| 165 | +But we like to be more generic than that: hence generic types. |
| 166 | + |
| 167 | +```python |
| 168 | +def first[T](lst: list[T]) -> T: |
| 169 | +    return lst[0] |
| 170 | + |
| 171 | +first(["a", "b", "c"]) |
| 172 | +``` |
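
The `def first[T]` syntax requires Python 3.12 or newer. On older versions the same signature can be written with an explicit `TypeVar`; the sketch below uses nothing beyond the standard `typing` module.

```python
from typing import TypeVar

T = TypeVar("T")


def first(lst: list[T]) -> T:
    # T is resolved per call: list[str] in, str out.
    return lst[0]


letter = first(["a", "b", "c"])  # a type checker infers str
number = first([1, 2, 3])  # a type checker infers int
```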
| 173 | + |
| 174 | +::: challenge |
| 175 | +Try running `first([])`. Does the type checker complain? Write a version of `first` that returns `None` on an empty list. What should the type signature be? |
| 176 | + |
| 177 | +:::: solution |
| 178 | +Use `Optional[T]` or `T | None`. |
| 179 | + |
| 180 | +```python |
| 181 | +def first[T](lst: list[T]) -> T | None: |
| 182 | +    if not lst: |
| 183 | +        return None |
| 184 | +    return lst[0] |
| 185 | +``` |
| 186 | +:::: |
| 187 | +::: |
| 188 | + |
| 189 | + |
| 190 | +## Optional: Complete the `binary_search` example |
| 191 | + |
| 192 | +We still haven't typed our `binary_search` algorithm completely. We'd like to express the fact that we can only search a list for values of the same type that it contains! We can introduce a type-variable as follows: |
| 193 | + |
| 194 | +```python |
| 195 | +def binary_search[T](lst: list[T], value: T) -> int | None: |
| 196 | +    ... |
| 197 | +``` |
| 198 | + |
| 199 | +This reads as: `binary_search` introduces an unknown type `T`, such that we expect `list[T]` and `T` to be the types of the arguments to this function. |
| 200 | + |
| 201 | +::: challenge |
| 202 | +Update your `binary_search` function to use the above type signature. What does `mypy` say? |
| 203 | + |
| 204 | +:::: solution |
| 205 | +We haven't taught the type checker that our type should be able to handle comparison operations. When a type defines comparison like that, we say that the type is **ordered**. |
| 206 | +:::: |
| 207 | +::: |
| 208 | + |
| 209 | +There is no built-in type constraint for ordered types, so we'll have to define our own. |
| 210 | + |
| 211 | +```python |
| 212 | +from typing import Protocol, Self |
| 213 | + |
| 214 | +class Ord(Protocol): |
| 215 | +    def __lt__(self: Self, other: Self) -> bool: |
| 216 | +        ... |
| 217 | +``` |
| 218 | + |
| 219 | +The full type definition of `binary_search`: |
| 220 | + |
| 221 | +```python |
| 222 | +def binary_search[T: Ord](lst: list[T], value: T) -> int | None: |
| 223 | +    low: int = 0 |
| 224 | +    high: int = len(lst)-1 |
| 225 | +    while low <= high: |
| 226 | +        mid: int = (low+high) // 2 |
| 227 | +        if lst[mid] > value: |
| 228 | +            high = mid-1 |
| 229 | +        elif lst[mid] < value: |
| 230 | +            low = mid+1 |
| 231 | +        else: |
| 232 | +            return mid |
| 233 | +    return None |
| 234 | +``` |
| 235 | + |
| 236 | +::: challenge |
| 237 | +Try to call `binary_search` in ways that still make the type checker fail. Can you think of properties that we can't express in the type system? |
| 238 | + |
| 239 | +:::: solution |
| 240 | +It is surprisingly hard to find a type in Python that doesn't support the `<` operator. Sometimes this operator doesn't quite capture the meaning of orderedness. In the case of a `set`, the `<` operator checks that one is a subset of the other (a partial but not total order). Types that don't have comparison: `dict`, `complex`. |
| 241 | + |
| 242 | +Even when all the types are satisfied, there's no way that the type system can check that our input list is actually sorted. We'd have to subtype `list` and ensure that on each mutation the list remains sorted; not impossible, but at this point most of us should agree that we're taking this silly example a bit too far. |
| 243 | +:::: |
| 244 | +::: |
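
The run-time behaviour behind these observations can be checked directly. In the sketch below, `supports_less_than` is a hypothetical helper that only reports whether `a < b` can be evaluated at all.

```python
# Sets: `<` means "is a proper subset of", which is only a partial order.
subset = {1} < {1, 2}
# Neither set is below the other, yet they are not equal.
incomparable = not ({1} < {2}) and not ({2} < {1}) and {1} != {2}


def supports_less_than(a, b) -> bool:
    """Return True if `a < b` can be evaluated at run time (hypothetical helper)."""
    try:
        a < b
    except TypeError:
        return False
    return True


dicts_ok = supports_less_than({"a": 1}, {"b": 2})
complex_ok = supports_less_than(1 + 2j, 3 + 4j)
```

A binary search over a list of sets would therefore pass the type checker but could still give meaningless results.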
| 245 | + |
| 246 | +## For the curious: Algebraic data types |
| 247 | + |
| 248 | +Now that we know about type unions and type products (tuples, named tuples, or data classes), we have all the ingredients to write [algebraic data types](https://en.wikipedia.org/wiki/Algebraic_data_type). For instance, we can define a linked list: |
| 249 | + |
| 250 | +```python |
| 251 | +type List[T] = tuple[T, List[T]] | None |
| 252 | + |
| 253 | +def make_list[T](*args: T) -> List[T]: |
| 254 | +    match args: |
| 255 | +        case (first, *rest): |
| 256 | +            return (first, make_list(*rest)) |
| 257 | +        case _: |
| 258 | +            return None |
| 259 | + |
| 260 | +def list_to_str[T](lst: List[T]) -> str: |
| 261 | +    match lst: |
| 262 | +        case None: |
| 263 | +            return "None" |
| 264 | +        case (a, rest): |
| 265 | +            return str(a) + " : " + list_to_str(rest) |
| 266 | + |
| 267 | +l: List[int] = make_list(1, 2, 3) |
| 268 | +print(l) |
| 269 | +print(list_to_str(l)) |
| 270 | +``` |
| 271 | + |
| 272 | +The linked list may seem a bit silly, but we can also define tree structures and use `match/case` to traverse a tree. Data structures can become highly complex, but the type system helps us write correct code here. |
| 273 | + |