# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#       jupytext_version: 1.17.1
#   kernelspec:
#     display_name: Python 3
#     name: python3
# ---
# %% [markdown]
# # 📝 Exercise M4.01
#
# The aim of this exercise is two-fold:
#
# * understand the parametrization of a linear model;
# * quantify the fitting accuracy of a set of such models.
#
# We will reuse part of the code from the course to:
#
# * load data;
# * create the function representing a linear model.
#
# ## Prerequisites
#
# ### Data loading
# %% [markdown]
# ```{note}
# If you want a deeper overview regarding this dataset, you can refer to the
# Appendix - Datasets description section at the end of this MOOC.
# ```
# %%
import pandas as pd
penguins = pd.read_csv("../datasets/penguins_regression.csv")
feature_name = "Flipper Length (mm)"
target_name = "Body Mass (g)"
data, target = penguins[[feature_name]], penguins[[target_name]]
# %% [markdown]
# ### Model definition
# %%
def linear_model_flipper_mass(
    flipper_length, weight_flipper_length, intercept_body_mass
):
    """Linear model of the form y = a * x + b"""
    body_mass = weight_flipper_length * flipper_length + intercept_body_mass
    return body_mass
# %% [markdown]
# ## Main exercise
#
# Define a vector `weights = [...]` and a vector `intercepts = [...]` of the
# same length. Each pair of entries `(weights[i], intercepts[i])` defines a
# different model. Use these vectors along with the vector
# `flipper_length_range` to plot several linear models that could possibly fit
# our data. Use the helper function `linear_model_flipper_mass` defined above
# to visualize both the models and the real samples.
# %%
import numpy as np
flipper_length_range = np.linspace(data.min(), data.max(), num=300)
# %%
# Write your code here.
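# %% [markdown]
# As a hedged sketch of how this step could be approached, the cell below
# draws a few candidate lines on made-up inputs rather than on the penguins
# data. The names `example_weights`, `example_intercepts` and `example_range`
# and all of their values are assumptions chosen for illustration; they are
# not the expected answer. The lines are computed with the same formula as
# `linear_model_flipper_mass`, written inline so that the cell runs on its own.
# %%
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical candidate models: one (weight, intercept) pair per line.
example_weights = [-40, 45, 90]
example_intercepts = [15000, -5000, -14000]

# Made-up range of flipper lengths (mm), standing in for `flipper_length_range`.
example_range = np.linspace(170, 235, num=300)

# Evaluate y = a * x + b for each pair, same form as the helper function.
example_predictions = [
    weight * example_range + intercept
    for weight, intercept in zip(example_weights, example_intercepts)
]

fig, ax = plt.subplots()
for weight, intercept, predicted in zip(
    example_weights, example_intercepts, example_predictions
):
    label = f"{weight:.2f} (g / mm) * flipper length + {intercept:.2f} (g)"
    ax.plot(example_range, predicted, label=label)
ax.set_xlabel("Flipper Length (mm)")
ax.set_ylabel("Body Mass (g)")
_ = ax.legend()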
# %% [markdown]
# In the previous question, you were asked to create several linear models. The
# visualization allowed you to qualitatively assess whether one model was
# better than another.
#
# Now, you should come up with a quantitative measure that indicates the
# goodness of fit of each linear model and allows you to select the best model.
# Define a function `goodness_fit_measure(true_values, predictions)` that takes
# as inputs the true target values and the predictions and returns a single
# scalar as output.
# %%
# Write your code here.
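# %% [markdown]
# As a hedged illustration only (the exercise leaves the choice of measure
# open), one common option is the mean absolute error (MAE); the mean squared
# error would be another. The sketch below flattens both inputs with
# `np.ravel` before subtracting, because subtracting two pandas dataframes
# aligns them on their column labels, and `target` and the predictions
# computed from `data` carry different column names. The values `toy_true` and
# `toy_predicted` are made-up numbers, not taken from the penguins dataset.
# %%
import numpy as np


def goodness_fit_measure(true_values, predictions):
    """Mean absolute error between true values and predictions (one option)."""
    # Flatten to plain 1-D arrays so that no pandas label alignment happens.
    errors = np.ravel(true_values) - np.ravel(predictions)
    # Absolute values keep positive and negative errors from cancelling out.
    return np.mean(np.abs(errors))


# Made-up values for illustration only.
toy_true = [3000, 4000, 5000]
toy_predicted = [3100, 3900, 5200]
toy_error = goodness_fit_measure(toy_true, toy_predicted)
print(f"Toy MAE: {toy_error:.3f} g")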
# %% [markdown]
# You can now copy and paste the code below to show the goodness of fit for each
# model.
#
# ```python
# for model_idx, (weight, intercept) in enumerate(zip(weights, intercepts)):
# target_predicted = linear_model_flipper_mass(data, weight, intercept)
# print(f"Model #{model_idx}:")
# print(f"{weight:.2f} (g / mm) * flipper length + {intercept:.2f} (g)")
# print(f"Error: {goodness_fit_measure(target, target_predicted):.3f}\n")
# ```
# %%
# Write your code here.