Commit 82bca27

Merge pull request #6 from esciencecenter-digital-skills/467-type-annotation
467 type annotation
2 parents 8168ef3 + a7f9ed9 commit 82bca27

5 files changed

Lines changed: 286 additions & 2 deletions

config.yaml

Lines changed: 2 additions & 1 deletion
@@ -82,7 +82,8 @@ episodes:
 - 22-scaling-up-unit-testing.md
 - 23-continuous-integration-automated-testing.md
 - 24-diagnosing-issues-improving-robustness.md
-- 25-section2-optional-exercises.md
+- 25-type-annotation.md
+- 26-section2-optional-exercises.md
 - 30-section3-intro.md
 - 31-software-requirements.md
 - 32-software-architecture-design.md

episodes/22-scaling-up-unit-testing.md

Lines changed: 1 addition & 1 deletion
@@ -354,7 +354,7 @@ and allows others to verify against correct behaviour.
 ## Optional exercises

 Checkout
-[these optional exercises](25-section2-optional-exercises.md)
+[these optional exercises](26-section2-optional-exercises.md)
 to learn more about code coverage.

episodes/25-type-annotation.md

Lines changed: 273 additions & 0 deletions
@@ -0,0 +1,273 @@
---
title: 2.5 Type annotations
teaching: 15
exercises: 45
---

::::::::::::::::::::::::::::::::::::::: objectives

- Understand the advantages of type annotations
- List the most important type checkers
- Apply type annotations to simple functions
- Read parametric types like `list[int]` or `set[str]`
- Understand the use of type annotations in libraries like `pydantic`, `cattrs` or `msgspec`

::::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::: questions

- What are type annotations?
- Why are types needed? Our scripts run fine without them!

::::::::::::::::::::::::::::::::::::::::::::::::::

## Why?

- Check your code for correctness before you run it.
- Be forced to handle edge cases.
- Have documentation that a tool checks, so it cannot silently go out of date.
- Better autocompletion.
- Automatic serialization.

## Tools

There are many type checkers available:

mypy
: The original type checker, created by Jukka Lehtosalo; Guido van Rossum has been a long-time contributor.

Pyright
: Often more feature rich than mypy. Developed by Microsoft; it powers the Pylance language server.

ty
: By Astral. Not yet feature complete, but Astral also brought us `ruff` and `uv`, both stellar tools, so this could become the type checker of the future.

::: info
Tip: try several of these on some of the examples below. Which type checker has the nicest error messages? Did it find all of the bugs?
:::

::: challenge
### Exercise: type errors

Can you spot the mistake in the following code?

```python
def logistic_map(r, x):
    return r * x * (1 - x)

logistic_map("hello", 2.4)
```

What does the error message say went wrong? Do you think this is a good message?

:::: solution
The error says we can't multiply a sequence by a non-int of type float and points to the sub-expression `r * x`. However, the mistake is made at the call site, by passing a `str` argument in the first place.
::::

Change the function signature to:

```python
def logistic_map(r: float, x: float) -> float:
    return r * x * (1 - x)
```

Do you see any effect on the erroneous call in your editor?
:::
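
To make the contrast concrete, here is a minimal sketch that reproduces the untyped call and captures the runtime error instead of crashing. The variable `message` is introduced only for this illustration. Note that the error appears only once the offending line actually executes, and that its wording talks about sequences rather than about the wrong argument; a type checker run on the annotated version should instead point at the call itself.

```python
def logistic_map(r, x):
    return r * x * (1 - x)

try:
    # The bug is here: a str is passed where a number is expected.
    logistic_map("hello", 2.4)
except TypeError as err:
    # Python only notices inside the function, at `r * x`.
    message = str(err)

print(message)
```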

### Abstract types

We don't always care about the precise type of an object. For instance, if we only want to loop over a collection, any `Iterable` will do. Sometimes we even want to express that `Any` object is acceptable:

```python
from typing import Any
from collections.abc import Iterable

def print_numbered_list(items: Iterable[Any]):
    for i, v in enumerate(items):
        print(i, v)
```

There are many abstract types available in `collections.abc`.
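
As a rough illustration of a few other abstract types from `collections.abc`, the sketch below uses `Mapping`, `Sequence` and `Callable`. The function names (`describe`, `last`, `apply_twice`) are made up for this example and are not part of the lesson's code base.

```python
from collections.abc import Callable, Mapping, Sequence

def describe(scores: Mapping[str, float]) -> list[str]:
    # Mapping accepts a dict, but also any other read-only key/value view.
    return [f"{name}: {score}" for name, score in scores.items()]

def last(items: Sequence[str]) -> str:
    # Sequence promises len() and indexing, so lists and tuples both work.
    return items[-1]

def apply_twice(f: Callable[[int], int], x: int) -> int:
    # Callable[[int], int] reads as "a function from int to int".
    return f(f(x))

print(describe({"alice": 9.5}))
print(last(("x", "y")))
print(apply_twice(lambda n: n + 1, 0))
```

Annotating with the most general type that the function body actually needs keeps the function usable with more kinds of input.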

### Completion

Write a function that changes all commas to semi-colons. Start by entering the following:

```python
def semicolonize(s: str) -> str:
    return s
```

Type a `.` after the `s`. Can you see the completion?
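
One possible way to finish the function, assuming you picked `replace` from the completion list, is sketched here. Because the editor knows that `s` is a `str`, it can offer `replace` together with its signature.

```python
def semicolonize(s: str) -> str:
    # str.replace returns a new string; the original is left untouched.
    return s.replace(",", ";")

print(semicolonize("a,b,c"))
```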

## Data classes

::: info
### Data before classes
In many languages, structures or records are considered more primitive than classes; not so in Python. We will learn more about classes and their place in software design in part 3. In this section we'll only consider data classes as a means of grouping data.
:::

Type annotations go really well together with data classes, a means of combining elements into a larger data structure. Python supports creating such classes from type annotations like so:

```python
from dataclasses import dataclass

@dataclass
class Address:
    street: str
    number: int
    suffix: str | None = None

address = Address("Science Park", 402, "Matrix THREE")

print(f"{address.street} {address.number}")
```

The `@dataclass` decorator generates the `__init__` method for you, so you don't need to write it yourself. There are nice packages that use this technique to allow automatic serialisation and deserialisation. Check out the [`msgspec` package](https://jcristharif.com/msgspec/index.html).
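
Besides `__init__`, the decorator also generates `__repr__` and `__eq__`. The following sketch shows this using only the standard library. It spells the optional field as `Optional[str]`, which is equivalent to `str | None` but also runs on older Python versions; the variables `a` and `b` are only for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Optional

@dataclass
class Address:
    street: str
    number: int
    suffix: Optional[str] = None

a = Address("Science Park", 402)
b = Address(street="Science Park", number=402)

print(a)          # uses the generated __repr__
print(a == b)     # generated __eq__ compares field by field
print(asdict(a))  # converts the instance to a plain dict
```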

::: challenge
### Autocompletion

Write a function that prints an address. How is your IDE behaving with and without type annotation?

```python
def print_address(a: Address):
    ...
```

:::: solution
When you use type-annotation, you'll have better auto-completion.
::::
:::

::: challenge
### Serialization

Install `msgspec` and try writing and reading back an `Address` object to JSON. Can you think of the advantages of using this approach over Python native `json.dump`?

:::: solution
- less code
- automatic validation
- user friendly error reporting
- high performance
::::
:::
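
For comparison, here is a sketch of the same round trip using only the standard library `json` module. It is meant as a baseline, not as the `msgspec` solution to the exercise. The names `text`, `restored` and `unchecked` are made up for this example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

@dataclass
class Address:
    street: str
    number: int
    suffix: Optional[str] = None

address = Address("Science Park", 402, "Matrix THREE")

# Writing: convert to a plain dict by hand before handing it to json.
text = json.dumps(asdict(address))

# Reading: json returns a dict, so we rebuild the object ourselves.
restored = Address(**json.loads(text))
print(restored == address)

# Nothing validates the field types: this "address" has a str house number.
unchecked = Address(**json.loads('{"street": "Science Park", "number": "four"}'))
print(type(unchecked.number).__name__)
```

The last two lines show the gap that annotation-aware libraries aim to close: a plain data class does not check its annotations at runtime, so malformed input passes through silently.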

## Optional: Generics and protocols

How would we type a function that returns the first element in a list? Suppose that we know that the list contains integers. Then:

```python
def first(lst: list[int]) -> int:
    ...
```

But we would like to be more generic than that: hence generic types.

```python
def first[T](lst: list[T]) -> T:
    return lst[0]

first(["a", "b", "c"])
```
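
The `def first[T](...)` syntax requires Python 3.12 or newer. In older code you will often see the same idea spelled with an explicit `TypeVar`. The sketch below uses that older spelling so that it runs on more Python versions; the variables `name` and `count` are only for illustration.

```python
from typing import TypeVar

T = TypeVar("T")  # plays the role of the `[T]` in `def first[T](...)`

def first(lst: list[T]) -> T:
    return lst[0]

name = first(["a", "b", "c"])  # a type checker infers `str` here
count = first([1, 2, 3])       # and `int` here

# Because the result types are known, both operations below type-check.
print(name.upper(), count + 1)
```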

::: challenge
Try running `first([])`. Does the type checker complain? Write a version of `first` that returns `None` on an empty list. What should the type signature be?

:::: solution
Use `Optional[T]` or `T | None`.

```python
def first[T](lst: list[T]) -> T | None:
    if not lst:
        return None
    return lst[0]
```
::::
:::

## Optional: Complete the `binary_search` example

We still haven't typed our `binary_search` algorithm completely. We'd like to express the fact that we can only search a list for values of the same type that it contains! We can introduce a type-variable as follows:

```python
def binary_search[T](lst: list[T], value: T) -> int | None:
    ...
```

This reads as: `binary_search` introduces an unknown type `T`, such that we expect `list[T]` and `T` to be the types of the arguments to this function.

::: challenge
Change your `binary_search` function with the above type definition. What does `mypy` say?

:::: solution
We haven't taught the type checker that our type should be able to handle comparison operations. When a type defines comparison like that, we say that the type is **ordered**.
::::
:::

There is no built-in type constraint for ordered types, so we'll have to define our own.

```python
from typing import Protocol, Self

class Ord(Protocol):
    def __lt__(self: Self, other: Self) -> bool:
        ...
```
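
A protocol is matched structurally: any class with a compatible `__lt__` counts as `Ord`, without inheriting from it. The sketch below illustrates this with a made-up `Version` class and a made-up `smallest` function. As an assumption to keep it runnable on Python versions older than 3.11, it annotates `other` as `Any` instead of `Self` and uses the `TypeVar(..., bound=...)` spelling of `[T: Ord]`.

```python
from typing import Any, Protocol, TypeVar

class Ord(Protocol):
    def __lt__(self, other: Any) -> bool:
        ...

T = TypeVar("T", bound=Ord)  # older spelling of `[T: Ord]`

def smallest(items: list[T]) -> T:
    result = items[0]
    for item in items[1:]:
        if item < result:
            result = item
    return result

class Version:
    # Hypothetical class: it never mentions Ord, yet satisfies it.
    def __init__(self, major: int, minor: int) -> None:
        self.major, self.minor = major, minor

    def __lt__(self, other: "Version") -> bool:
        return (self.major, self.minor) < (other.major, other.minor)

print(smallest([3, 1, 2]))
print(smallest([Version(1, 2), Version(1, 0)]).minor)
```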

The full type definition of `binary_search`:

```python
def binary_search[T: Ord](lst: list[T], value: T) -> int | None:
    low: int = 0
    high: int = len(lst) - 1
    while low <= high:
        mid: int = (low + high) // 2
        if lst[mid] > value:
            high = mid - 1
        elif lst[mid] < value:
            low = mid + 1
        else:
            return mid
    # Not found: return None, as the `int | None` return type suggests.
    # (-1 would be a valid list index in Python and easy to misuse.)
    return None
```

::: challenge
Try to call `binary_search` in ways that still make the type checker fail. Can you think of properties that we can't express in the type system?

:::: solution
It is surprisingly hard to find a type in Python that doesn't support the `<` operator. Sometimes this operator doesn't quite capture the meaning of orderedness. In the case of a `set`, the `<` operator checks whether one set is a proper subset of the other (a partial but not total order). Types that don't have comparison: `dict`, `complex`.

Even when all the types are satisfied, there's no way that the type system can check that our input list is actually sorted. We'd have to subtype `list` and ensure that on each mutation the list remains sorted; not impossible, but at this point most of us should agree that we're taking this silly example a bit too far.
::::
:::
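
The points in the solution can be checked at runtime with a short sketch. The helper `is_sorted` and the variable names are hypothetical and only serve this illustration; the helper shows the kind of runtime check that would be needed for a property the type system cannot express.

```python
# Sets: `<` means "is a proper subset of", so two sets can be incomparable.
a, b = {1, 2}, {2, 3}
print(a < b, b < a, a == b)

# Complex numbers define no ordering at all.
try:
    (1 + 2j) < (3 + 4j)
    complex_is_ordered = True
except TypeError:
    complex_is_ordered = False
print(complex_is_ordered)

def is_sorted(lst: list[int]) -> bool:
    # Sortedness is a property of the values, not of the type `list[int]`.
    return all(x <= y for x, y in zip(lst, lst[1:]))

# Both lists have the same type, but only one is valid binary-search input.
print(is_sorted([1, 2, 3]), is_sorted([3, 1, 2]))
```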

## For the curious: Algebraic data types

Now that we know about type unions and type products (tuples, named tuples, or data classes), we have all the ingredients to write [algebraic data types](https://en.wikipedia.org/wiki/Algebraic_data_type). For instance, we can define a linked list:

```python
type List[T] = tuple[T, List[T]] | None

def make_list[T](*args: T) -> List[T]:
    match args:
        case (first, *rest):
            return (first, make_list(*rest))
        case _:
            return None

def list_to_str[T](lst: List[T]) -> str:
    match lst:
        case None:
            return "None"
        case (a, rest):
            return str(a) + " : " + list_to_str(rest)

l: List[int] = make_list(1, 2, 3)
print(l)
print(list_to_str(l))
```

The linked list may seem a bit silly, but we can also define tree structures and use `match/case` to traverse a tree. Data structures can become highly complex, but the type system helps us write correct code here.
File renamed without changes.

pyproject.toml

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
[project]
name = "python-intermediate-development"
version = "0.1.0"
requires-python = ">=3.13"
dependencies = [
    "mypy>=1.16.1",
    "pyrefly>=0.20.2",
    "pyright>=1.1.402",
    "ty>=0.0.1a11",
]
