Commit cb5ff8c

committed
Second set of updates
1 parent 389d0c3 commit cb5ff8c

1 file changed

Lines changed: 58 additions & 3 deletions

episodes/visualizing-matplotlib.md

@@ -170,7 +170,7 @@ As arguments in this function, we add the kind of plot we want, and how our data
170170
With the following code, we will make a scatter plot (argument `kind = "scatter"`) to analyze the relationship between the weight (plotted on the x axis, argument `x = "weight"`) and the hindfoot length (on the y axis, argument `y = "hindfoot_length"`) of the animals sampled at the study site.
171171

172172
```python
173-
scatter_plot = complete_old.plot(x = "weight", y = "hindfoot_length", kind="scatter")
173+
scatter_plot = complete_old.plot(x = "weight", y = "hindfoot_length", kind = "scatter")
174174
```
175175

176176
This scatter plot suggests a positive relationship between weight and hindfoot length: heavier animals tend to have longer hind feet.
@@ -196,11 +196,66 @@ As shown in the previous image, the x-axis labels overlap with each other, which
196196
At this point we realize that we need more fine-grained control over our graph, and this is where Matplotlib comes into the picture.
197197

198198

199+
200+
199201
## Advanced plots with Matplotlib
200202

201-
[Matplotlib](https://matplotlib.org/) is a Python library that is widely used throughout the scientific Python community to create high-quality and publication-ready graphics. It supports a wide range of raster and vector graphics formats including PNG, PostScript, EPS, PDF and SVG.
203+
[Matplotlib](https://matplotlib.org/) is a Python library that is widely used throughout the scientific Python community to create high-quality and publication-ready graphics.
204+
It supports a wide range of raster and vector graphics formats including PNG, PostScript, EPS, PDF and SVG.
205+
206+
Moreover, Matplotlib is the actual engine behind the plotting capabilities of Pandas and of other plotting libraries like seaborn and plotnine.
207+
For example, when we call the `.plot()` method on Pandas data objects, we are actually using the Matplotlib library behind the scenes.
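
One way to see this relationship for yourself is to check what Pandas hands back from `.plot()`. The sketch below is only an illustration: the `toy` data frame and its values are made up to stand in for `complete_old`, and the `matplotlib.use("Agg")` line is there just so the code runs without a display (it can be dropped in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, only so this runs without a display
import matplotlib.axes
import pandas as pd

# Hypothetical stand-in for complete_old; the values are illustrative only
toy = pd.DataFrame({"weight": [10, 20, 30, 40],
                    "hindfoot_length": [12, 18, 25, 31]})

scatter_plot = toy.plot(x = "weight", y = "hindfoot_length", kind = "scatter")

# The object returned by Pandas is a Matplotlib Axes
print(type(scatter_plot))
print(isinstance(scatter_plot, matplotlib.axes.Axes))
```

Because the returned object is a Matplotlib `Axes`, anything we can do with Matplotlib axes should also work on a plot that was started with Pandas.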
208+
209+
Let's start by recreating our scatter plot, but this time, using matplotlib.
210+
We'll do it one step at a time.
211+
The first thing we need is to create our figure and our axes (or plots), using the `plt.subplots()` function.
212+
The `fig` object we are creating is the entire plot area, which can contain one or multiple axes.
213+
In this case, we will have only one set of axes, which is the `ax` object.
214+
On its own, this line of code produces an empty plot.
215+
We show our plot with the `plt.show()` function.
216+
217+
```python
218+
fig, ax = plt.subplots()
219+
plt.show()
220+
```
221+
222+
If we want more than one plot in the same figure, we could specify the number of rows (`nrows` argument) and the number of columns (`ncols`) in this function.
223+
For example, let's say we want two plots (or axes), organized in two columns and one row.
224+
225+
```python
226+
fig, ax = plt.subplots(nrows = 1, ncols = 2)
227+
plt.show()
228+
```
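
When more than one set of axes is requested, the second object returned by `plt.subplots()` is no longer a single axes object but a NumPy array of them. The following is a minimal sketch of how the individual plots can be reached by position. The name `axes` and the titles are choices made for this illustration, and the `matplotlib.use("Agg")` line is only there so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, only so this runs without a display
import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows = 1, ncols = 2)

# With one row and two columns, we get a one-dimensional array holding two axes
print(axes.shape)

# Each set of axes is selected by its position and can be modified on its own
axes[0].set_title("Left plot")
axes[1].set_title("Right plot")
```

In a notebook, calling `plt.show()` afterwards displays the two empty plots side by side, each with its own title.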
229+
230+
For now, let's focus only on making our scatter plot, so we need just one set of axes.
231+
To draw it, we call the `.scatter()` method on the `ax` object we created.
232+
On the x axis, we'll use the `weight` column, so we use the argument `x = complete_old["weight"]`.
233+
On the y axis, we'll use the `hindfoot_length` column, so we use the argument `y = complete_old["hindfoot_length"]`.
234+
235+
```python
236+
fig, ax = plt.subplots()
237+
ax.scatter(x = complete_old["weight"], y = complete_old["hindfoot_length"])
238+
plt.show()
239+
```
240+
241+
We can further adjust our scatter plot, for example by changing the transparency of the points (`alpha` argument).
242+
243+
## Changing aesthetics
244+
Making plots is often an iterative process, so we’ll continue developing the scatter plot we just made.
245+
You may have noticed that parts of our scatter plot have many overlapping points, making it difficult to see all the data.
246+
We can adjust the transparency of the points using the `alpha` argument, which takes a value between 0 and 1.
247+
Additionally, we can change the color of the points with the `c` argument.
248+
249+
```python
250+
fig, ax = plt.subplots()
251+
ax.scatter(x = complete_old["weight"], y = complete_old["hindfoot_length"], alpha = 0.2, c = "green")
252+
plt.show()
253+
```
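
To judge what `alpha` actually changes, it can help to put the opaque and the transparent versions next to each other. The sketch below does this with two sets of axes in one figure. It uses randomly generated numbers rather than the survey data, so the variables `weight` and `hindfoot_length` here are made-up stand-ins, and the `matplotlib.use("Agg")` line is only there so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, only so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up data with many overlapping points (not the survey data)
rng = np.random.default_rng(seed = 0)
weight = rng.normal(loc = 40, scale = 10, size = 2000)
hindfoot_length = 0.5 * weight + rng.normal(loc = 10, scale = 3, size = 2000)

fig, axes = plt.subplots(nrows = 1, ncols = 2)

# Left: default, fully opaque points
opaque = axes[0].scatter(x = weight, y = hindfoot_length, c = "green")
axes[0].set_title("Default (opaque)")

# Right: same points, mostly transparent, so dense regions look darker
faded = axes[1].scatter(x = weight, y = hindfoot_length, alpha = 0.2, c = "green")
axes[1].set_title("alpha = 0.2")
```

In the transparent version, regions where many points overlap appear darker, which gives a rough impression of where most of the observations lie.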
254+
255+
202256

203-
Moreover, matplotlib is the actual engine behind the plotting capabilities of Pandas, and other plotting libraries like seaborn and plotnine. For example, when we call the `.plot()` methods on Pandas data objects, we are actually using the matplotlib library in the backstage.
257+
## Alternative
258+
All with pandas.
204259

205260
::::::::::::::::::::::::::::::::::::::::: callout
206261
