Commit 9c9f523

Update analyze-baseball-stats-with-pandas-and-matplotlib.mdx
1 parent b5aff34 commit 9c9f523

1 file changed

Lines changed: 26 additions & 19 deletions

projects/analyze-baseball-stats-with-pandas-and-matplotlib/analyze-baseball-stats-with-pandas-and-matplotlib.mdx

@@ -346,27 +346,27 @@ We can now find the "value" of a player by calculating their OBP divided by thei

 ```py
 value_df = batting_with_salary[
-(batting_with_salary["salary"] > 0) &
-(batting_with_salary["OBP"] > 0) &
-(batting_with_salary["AB"] >= 200)
+(batting_with_salary['salary'] > 0) &
+(batting_with_salary['OBP'] > 0) &
+(batting_with_salary['AB'] >= 200)
 ].copy()
 ```

 We now have a DataFrame named `value_df` that contains only the rows we're interested in. Let's calculate each player's value and sort by the highest value players! We'll display only the columns that are relevant to us.

 ```py
 value_df_sorted = value_df.sort_values(
-by="OBP_per_dollar",
-ascending=False
+by='OBP_per_dollar',
+ascending=False
 )

 value_df_sorted [[
-"playerID",
-"yearID",
-"teamID",
-"OBP",
-"salary",
-"OBP_per_dollar"
+'playerID',
+'yearID',
+'teamID',
+'OBP',
+'salary',
+'OBP_per_dollar'
 ]].head()
 ```

@@ -376,21 +376,25 @@ If we wanted to see if our calculations were working correctly, we could choose

 ```py
 value_df_sorted[value_df_sorted['yearID'] == 2010][[
-"playerID",
-"yearID",
-"teamID", 'AB',
-"OBP",
-"salary",
-"OBP_per_dollar"
+'playerID',
+'yearID',
+'teamID', 'AB',
+'OBP',
+'salary',
+'OBP_per_dollar'
 ]].head()
 ```

-I went to look up `heywaja01` on baseball-reference.com, and it turns out that this data was from a player named Jason Heyward. In 2010, he was an All Star and got 2nd place in voting for Rookie of the Year! It certainly sounds like a player that was high value. You can also confirm that our calculation of OBP was correct!
+I went to look up `heywaja01` on [baseball-reference.com](https://baseball-reference.com), and it turns out that this data was from a player named Jason Heyward. In 2010, he was an All Star and got 2nd place in voting for Rookie of the Year! It certainly sounds like a player that was high value. You can also confirm that our calculation of OBP was correct!


 ## Recap

-Clearly there is a _ton_ that you can do with this dataset. In this project, we practiced the following skills in Pandas:
+Congrats on making it to the end!
+
+Clearly there is a _ton_ that you can do with this dataset.
+
+To recap, in this project tutorial, we practiced the following skills in Pandas:

 - Initial data exploration using `.describe()` to see summary statistics.
 - Filtering the dataset by boolean values (for example, `[value_df_sorted['yearID'] == 2010`)
@@ -399,3 +403,6 @@ Clearly there is a _ton_ that you can do with this dataset. In this project, we
 - Using `.merge()` to join two tables together.

 Do you have any favorite players or teams? Shohei Ohtani? Aaron Judge? The Chicago Cubs? We hope that you come up with your own questions about baseball and use your Python and Pandas skills to answer those questions!
+
+### More Resources
+
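Review note: the first hunk sorts by `OBP_per_dollar`, but the visible context never shows where that column is created. It may well be defined outside these hunks. For readers following along from the diff alone, here is a minimal sketch of the step the prose describes (OBP divided by salary), followed by the 2010 spot check. It assumes the column names that appear in the diff. The small DataFrame is invented purely for illustration and stands in for the real `batting_with_salary`.

```py
import pandas as pd

# Hypothetical stand-in for batting_with_salary; these rows are made up.
batting_with_salary = pd.DataFrame({
    'playerID': ['aaa01', 'bbb01', 'ccc01', 'ddd01'],
    'yearID': [2010, 2010, 2011, 2011],
    'teamID': ['AAA', 'BBB', 'AAA', 'BBB'],
    'AB': [520, 150, 410, 300],
    'OBP': [0.400, 0.450, 0.300, 0.350],
    'salary': [400_000, 500_000, 3_000_000, 0],
})

# Same filter as the diff: positive salary, positive OBP, at least 200 at-bats.
value_df = batting_with_salary[
    (batting_with_salary['salary'] > 0) &
    (batting_with_salary['OBP'] > 0) &
    (batting_with_salary['AB'] >= 200)
].copy()

# Assumed definition of "value": on-base percentage per dollar of salary.
value_df['OBP_per_dollar'] = value_df['OBP'] / value_df['salary']

# Highest-value players first.
value_df_sorted = value_df.sort_values(by='OBP_per_dollar', ascending=False)

# Spot-check a single season with a boolean filter.
top_2010 = value_df_sorted[value_df_sorted['yearID'] == 2010]
print(top_2010[['playerID', 'yearID', 'OBP', 'salary', 'OBP_per_dollar']])
```

The `salary > 0` filter runs before the division, so rows with a zero salary never produce an infinite value. Separately, the recap bullet's example `[value_df_sorted['yearID'] == 2010` is missing its closing bracket. The complete expression is the one used for `top_2010` in the sketch.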