Commit 486c776

Merge pull request #188 from rstudio/docs-article-metadata
docs: article on using custom metadata
2 parents c94823d + 9a904da commit 486c776

4 files changed

Lines changed: 104 additions & 1 deletion

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ jobs:
       - uses: actions/setup-python@v2

         with:
-          python-version: 3.8
+          python-version: "3.10"
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip

docs/_toc.yml

Lines changed: 1 addition & 0 deletions
@@ -5,4 +5,5 @@ format: jb-book
 root: intro
 chapters:
 - file: getting_started
+- file: articles/index.rst
 - file: api/index.rst

docs/articles/customize-pins-metadata.Rmd

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
---
jupyter:
  jupytext:
    text_representation:
      extension: .Rmd
      format_name: rmarkdown
      format_version: '1.2'
      jupytext_version: 1.13.6
  kernelspec:
    display_name: venv-pins-python
    language: python
    name: venv-pins-python
---

# Using custom metadata

The `metadata` argument in pins is flexible and can hold any metadata that you can express as a `dict` (convertible to JSON).
In some situations, you may want to read and write with _consistent_ customized metadata;
you can create functions to wrap `pin_write()` and `pin_read()` for your particular use case.
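
Since the metadata has to be convertible to JSON, it can help to check a `dict` before writing it. The following is a minimal sketch using only the standard library; `is_json_convertible()` is a hypothetical helper for illustration and is not part of pins.

```python
import json


def is_json_convertible(metadata: dict) -> bool:
    """Return True if `metadata` can be serialized to JSON.

    Hypothetical helper for illustration; it is not part of pins.
    """
    try:
        json.dumps(metadata)
    except (TypeError, ValueError):
        # TypeError: unsupported type (e.g. a set); ValueError: circular reference
        return False
    return True


# Dicts of lists, strings, numbers, and booleans are fine.
good = is_json_convertible({"categories": ["a", "b", "c"], "ordered": False})

# A set has no JSON representation, so this metadata would need converting first.
bad = is_json_convertible({"categories": {"a", "b"}})
```

Converting pandas or NumPy objects to plain lists, as the wrapper functions in this article do, is usually enough to pass such a check.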

We'll begin by creating a temporary board for demonstration:

```{python setup}
import pins
import pandas as pd

from pprint import pprint

board = pins.board_temp()
```

## A function to store pandas Categoricals

Say you want to store a pandas Categorical object as JSON, together with its _categories_ in the metadata.

For example, here is a simple categorical and its categories:

```{python}
some_cat = pd.Categorical(["a", "a", "b"])

some_cat.categories
```

Notice that the `categories` attribute here is just the unique values in the categorical.
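
When pandas infers the categories, they are the unique values, as in the example above. A categorical can also declare categories that never occur in its values, which is one reason to store the categories explicitly. A minimal sketch using only pandas; the variable names are illustrative:

```python
import pandas as pd

# Categories inferred from the data: exactly the unique values.
inferred = pd.Categorical(["a", "a", "b"])
inferred_categories = inferred.categories.tolist()

# Categories declared up front: "c" is a valid category even though
# it never appears in the values.
declared = pd.Categorical(["a", "a", "b"], categories=["a", "b", "c"])
declared_categories = declared.categories.tolist()

# The values alone cannot tell the two categoricals apart.
same_values = inferred.tolist() == declared.tolist()
```

Here the two objects hold the same values but different categories, so the values alone are not enough to rebuild `declared`.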

We can write a function wrapping `pin_write()` that stores the categories in the metadata, so we can easily re-create the categorical from them.

```{python}
def pin_write_cat_json(
    board,
    x: pd.Categorical,
    name,
    **kwargs
):
    # Keep the categories in the pin's custom metadata
    metadata = {"categories": x.categories.to_list()}
    # The values themselves are written as a plain JSON list
    json_data = x.to_list()
    board.pin_write(json_data, name=name, type="json", metadata=metadata, **kwargs)
```

We can use this new function to write a pin as JSON with our specific metadata:

```{python}
some_cat = pd.Categorical(["a", "a", "b", "c"])
pin_write_cat_json(board, some_cat, name="some-cat")
```

## A function to read categoricals

It's possible to read this pin using the regular `pin_read()` function, but the object we get is no longer a categorical!

```{python}
board.pin_read("some-cat")
```

However, notice that if we use `board.pin_meta()`, the information we stored on categories is in the `.user` field.

```{python}
pprint(
    board.pin_meta("some-cat")
)
```

This lets us write a dedicated reading function that reconstructs the categorical from the categories stashed in the metadata:

```{python}
def pin_read_cat_json(board, name, version=None, hash=None, **kwargs):
    data = board.pin_read(name=name, version=version, hash=hash, **kwargs)
    meta = board.pin_meta(name=name, version=version, **kwargs)
    # Custom metadata passed to pin_write() is available under `meta.user`
    return pd.Categorical(data, categories=meta.user["categories"])

pin_read_cat_json(board, "some-cat")
```
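
The same pattern can carry more than the categories. The sketch below is an illustration under stated assumptions, not part of pins or of the helpers above: `cat_to_payload()` and `cat_from_payload()` are hypothetical names, and a JSON round trip through the standard library stands in for the board so the example runs without one. It also stores the `ordered` flag.

```python
import json

import pandas as pd


def cat_to_payload(x: pd.Categorical) -> tuple:
    """Split a categorical into JSON-friendly data and metadata (hypothetical helper)."""
    metadata = {"categories": x.categories.tolist(), "ordered": bool(x.ordered)}
    return x.tolist(), metadata


def cat_from_payload(data: list, metadata: dict) -> pd.Categorical:
    """Rebuild the categorical from stored data and metadata (hypothetical helper)."""
    return pd.Categorical(
        data, categories=metadata["categories"], ordered=metadata["ordered"]
    )


original = pd.Categorical(
    ["low", "high", "low"], categories=["low", "mid", "high"], ordered=True
)
data, metadata = cat_to_payload(original)

# Stand-in for storage: both pieces pass through JSON text and back.
stored_data = json.loads(json.dumps(data))
stored_metadata = json.loads(json.dumps(metadata))

restored = cat_from_payload(stored_data, stored_metadata)
```

With a real board, `data` and `metadata` would be the values passed to `pin_write()`, and the stored values would come back from `pin_read()` and `pin_meta().user`, as in the functions above.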

For an example of how this approach is used in a real project, look at how the vetiver package wraps these functions to [write](https://github.com/rstudio/vetiver-python/blob/main/vetiver/pin_read_write.py) and [read](https://github.com/rstudio/vetiver-python/blob/main/vetiver/vetiver_model.py) model binaries as pins.

docs/articles/index.rst

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
Articles
========

.. toctree::
   customize-pins-metadata.Rmd
