Commit f608d49

chore: add README to vertexai.preview

PiperOrigin-RevId: 578234377

jaycee-li authored and copybara-github committed
1 parent bcf48da

1 file changed: vertexai/preview/README.md (331 additions, 0 deletions)

# Vertex AI SDK User guide

## Introduction
Vertex AI SDK adds a new usability layer to the [Vertex SDK](https://cloud.google.com/python/docs/reference/aiplatform/latest) with the goal of substantially improving traditional data-to-model workflows. It lets users shift from thinking about the mechanics of calling Vertex services to thinking in terms of building models, seamlessly interleaving the [Vertex SDK](https://cloud.google.com/python/docs/reference/aiplatform/latest) throughout their workflow.

## Setup
Vertex AI SDK is currently available in preview under the `vertexai` package. Install `google-cloud-aiplatform[preview]` using pip to enable the full functionality.

We also recommend installing the package in a [virtualenv](https://virtualenv.pypa.io/en/latest/) using pip. [virtualenv](https://virtualenv.pypa.io/en/latest/) is a tool that creates isolated Python environments, which helps manage dependencies and versions and, indirectly, permissions.

### Mac/Linux

```shell
pip install virtualenv
virtualenv <your-env>
source <your-env>/bin/activate
# Quote the requirement so shells such as zsh do not expand the brackets
<your-env>/bin/pip install "google-cloud-aiplatform[preview]"
```

### Windows

```shell
pip install virtualenv
virtualenv <your-env>
<your-env>\Scripts\activate
<your-env>\Scripts\pip.exe install "google-cloud-aiplatform[preview]"
```

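To confirm the installation succeeded, you can check from Python that the `vertexai` module is importable in the active environment. The snippet below is a minimal sketch using only the standard library; the helper names `is_installed` and `installed_version` are illustrative and not part of the SDK.

```python
import importlib.util
from importlib import metadata


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located on the current Python path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for some malformed or partially installed packages
        return False


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("vertexai importable:", is_installed("vertexai"))
print("google-cloud-aiplatform version:", installed_version("google-cloud-aiplatform"))
```

If the first line prints `False`, the most likely cause is that pip installed into a different environment than the one running Python.
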
## Remote training
With the remote training feature in Vertex AI SDK, you can write your machine learning code as usual, and with a few small changes the code is automatically executed on [Vertex AI CustomJob](https://cloud.google.com/vertex-ai/docs/training/create-custom-job). This feature lets you seamlessly access Vertex resources while reducing the time needed to learn how to interact with Vertex services.

### Supported frameworks
1. scikit-learn
2. tensorflow
3. pytorch
4. lightning

### User journey
```py
import vertexai
from sklearn.linear_model import LogisticRegression

# Wrap classes to enable Vertex remote execution
LogisticRegression = vertexai.preview.remote(LogisticRegression)

# Init vertexai and switch to remote mode for training
vertexai.init(project="my-project", location="my-location")
vertexai.preview.init(remote=True)

model = LogisticRegression()

# Model will be trained on Vertex
# (X and y are your training features and labels)
model.fit(X, y)
```

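The wrap-then-switch pattern above can be hard to picture at first. The following is a conceptual sketch only, not the SDK's implementation: the `remote` and `init` functions, the `_REMOTE_MODE` flag, and the `MeanModel` class are all hypothetical stand-ins that show how wrapping a class lets a single `fit` call be routed differently depending on a mode flag.

```python
# Conceptual sketch only: this is NOT how the Vertex AI SDK is implemented.

_REMOTE_MODE = {"enabled": False}  # hypothetical stand-in for the SDK's remote switch


def init(remote: bool) -> None:
    """Toggle the (simulated) remote mode."""
    _REMOTE_MODE["enabled"] = remote


def remote(cls):
    """Return a subclass of `cls` whose `fit` records where it would run."""

    class Wrapped(cls):
        def fit(self, *args, **kwargs):
            # A real system would package and submit a job here;
            # this sketch only records the decision and runs locally.
            if _REMOTE_MODE["enabled"]:
                self.executed_on = "remote (simulated)"
            else:
                self.executed_on = "local"
            return super().fit(*args, **kwargs)

    Wrapped.__name__ = cls.__name__
    Wrapped.__qualname__ = cls.__qualname__
    return Wrapped


class MeanModel:
    """Toy estimator that predicts the mean of the labels."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


MeanModel = remote(MeanModel)
init(remote=True)
model = MeanModel().fit([[1], [2], [3]], [1.0, 2.0, 3.0])
print(model.executed_on, model.mean_)
```

The point of the design is that the training call itself does not change; only the wrapping step and the mode switch are added.
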
### (Optional) Remote job configuration
Vertex configures the remote job for you based on the model you use. You can also customize the configuration, e.g., display name, staging bucket, and machine type.

```py
# Set the config before training the model
model.fit.vertex.remote_config.display_name = "my-sklearn-training"
model.fit.vertex.remote_config.staging_bucket = "gs://my-bucket"
model.fit.vertex.remote_config.machine_type = "n1-highmem-64"

# Model will be trained on Vertex
model.fit(X, y)
```

[Here](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/preview/_workflow/shared/configs.py#L22-L104) is the full list of supported configurations.

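If you set several options at once, a small helper can keep the assignments in one place and catch an obviously malformed bucket URI before a job is submitted. This is a sketch under assumptions: `apply_remote_config` is a hypothetical helper, not an SDK function, and it assumes the config object accepts plain attribute assignment as in the example above. A `SimpleNamespace` stands in for `model.fit.vertex.remote_config` so the snippet runs without a Vertex project.

```python
from types import SimpleNamespace


def apply_remote_config(remote_config, **settings):
    """Set each keyword argument as an attribute on `remote_config`.

    Raises ValueError early if `staging_bucket` is not a Cloud Storage URI.
    """
    bucket = settings.get("staging_bucket")
    if bucket is not None and not bucket.startswith("gs://"):
        raise ValueError(f"staging_bucket must start with 'gs://', got {bucket!r}")
    for name, value in settings.items():
        setattr(remote_config, name, value)
    return remote_config


# Stand-in object (assumption). With the SDK you would pass
# `model.fit.vertex.remote_config` instead.
config = SimpleNamespace()
apply_remote_config(
    config,
    display_name="my-sklearn-training",
    staging_bucket="gs://my-bucket",
    machine_type="n1-highmem-64",
)
print(config)
```
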
## Remote GPU training
This is an extra feature on top of remote training. It lets you train supported models remotely on GPU, even if you have no GPU resources on your local device. Please check [here](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/preview/_workflow/shared/configs.py#L63-L73) for more information.

### Supported frameworks
1. tensorflow
2. pytorch

### User journey
```py
import vertexai
from tensorflow import keras

# Wrap classes to enable Vertex remote execution
keras.Sequential = vertexai.preview.remote(keras.Sequential)

# Init vertexai and switch to remote mode for training
vertexai.init(project="my-project", location="my-location")
vertexai.preview.init(remote=True)

# Instantiate model
model = keras.Sequential(
    [keras.layers.Dense(5, input_shape=(4,)), keras.layers.Softmax()]
)
model.compile(optimizer="adam", loss="mean_squared_error")

# Set `enable_cuda` to True in remote config
model.fit.vertex.remote_config.enable_cuda = True

# Model will be trained on Vertex with GPU
model.fit(dataset, epochs=10)
```

## Remote distributed training
This feature extends remote training by letting you train supported models remotely on multi-worker CPU or GPU machines, regardless of the resources available on your local device. Please check [here](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/preview/_workflow/shared/configs.py#L74-L86) for more information.

### Supported frameworks
1. tensorflow
2. pytorch

### User journey
```py
import torch
import vertexai
from vertexai.preview import VertexModel
from vertexai.preview.developer import remote_specs

# Define the custom model with `VertexModel` to enable remote execution
class TorchLogisticRegression(VertexModel, torch.nn.Module):

    def __init__(self, input_size: int, output_size: int):
        torch.nn.Module.__init__(self)
        VertexModel.__init__(self)
        self.linear = torch.nn.Linear(input_size, output_size)
        self.softmax = torch.nn.Softmax(dim=1)

    def forward(self, x):
        return self.softmax(self.linear(x))

    # Add this train decorator to allow remote training
    @vertexai.preview.developer.mark.train()
    def train(self, dataloader, num_epochs, lr):
        # Add this line to enable distributed training for pytorch
        self = remote_specs.setup_pytorch_distributed_training(self)
        criterion = torch.nn.CrossEntropyLoss()
        optimizer = torch.optim.SGD(self.parameters(), lr=lr)

        for t in range(num_epochs):
            for idx, batch in enumerate(dataloader):
                device = next(self.parameters()).device
                x, y = batch[0].to(device), batch[1].to(device)
                optimizer.zero_grad()
                pred = self(x)
                loss = criterion(pred, y)
                loss.backward()
                optimizer.step()


# Init vertexai and switch to remote mode for training
vertexai.init(project="my-project", location="my-location")
vertexai.preview.init(remote=True)

# Instantiate model
model = TorchLogisticRegression(4, 3)

# Set `enable_distributed` to True in remote config.
# The config hangs off the decorated training method, which is `train` here.
model.train.vertex.remote_config.enable_distributed = True

# Model will be trained on Vertex in distributed mode
model.train(dataloader, num_epochs=100, lr=0.05)
```

## Remote training with Autologging
The [autologging feature](https://cloud.google.com/vertex-ai/docs/experiments/autolog-data) is available in remote training. Metrics (summary metrics, time series metrics, etc.) and parameters for your remote training can be logged automatically to [Vertex Experiments](https://cloud.google.com/vertex-ai/docs/experiments/intro-vertex-ai-experiments). This feature lets you easily track, compare, and analyze training runs with different setups.

### User journey
```py
# Init with experiment and autolog
vertexai.init(
    project="my-project",
    location="my-location",
    experiment="my-exp",
)
vertexai.preview.init(remote=True, autolog=True)
...
# Set a service account in remote config; this is required for autologging
model.fit.vertex.remote_config.service_account = "GCE"

# Model will be trained on Vertex with data automatically logged
model.fit(X, y)
```

## Uptraining
Once you finish training a model, you can register the trained model to the [Vertex Model Registry](https://cloud.google.com/vertex-ai/docs/model-registry/introduction). You can later pull the pre-trained model from the Model Registry and uptrain it with new training data.

### Supported frameworks
1. scikit-learn
2. tensorflow
3. pytorch

### User journey

#### Register to Model Registry
After a model is trained, you can register it with the `register()` method. It returns an [aiplatform.Model](https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/models.py#L2610) object.

```py
# Model could be trained locally or remotely in previous steps
registered_model = vertexai.preview.register(model)
```

#### Pull pre-trained model from Model Registry
You can then use the `from_pretrained()` method with the resource name of an [aiplatform.Model](https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/models.py#L2610) or an [aiplatform.CustomJob](https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/jobs.py#L1171) to retrieve the model object.

```py
# You can get the model resource name through SDK or Google Cloud console.
pulled_model = vertexai.preview.from_pretrained(
    model_name=registered_model.resource_name
)
```

```py
# You can get the custom job resource name through Google Cloud console.
pulled_model = vertexai.preview.from_pretrained(
    custom_job_name="1234567890123456789"
)
```

#### Uptraining
You can proceed with uptraining (remotely or locally) using new input training data.

```py
pulled_model.fit(X_retrain, y_retrain)
```

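Conceptually, uptraining means continuing from saved model state rather than starting from scratch. The toy sketch below illustrates that idea only and does not use the SDK: `RunningMean` is a hypothetical model, and a JSON string stands in for the Model Registry.

```python
import json


class RunningMean:
    """Toy model whose `fit` updates existing state instead of resetting it."""

    def __init__(self, count=0, mean=0.0):
        self.count = count
        self.mean = mean

    def fit(self, y):
        # Incremental mean update: earlier training data is not needed again
        for value in y:
            self.count += 1
            self.mean += (value - self.mean) / self.count
        return self

    def to_json(self):
        """Serialize the learned state (stand-in for registering a model)."""
        return json.dumps({"count": self.count, "mean": self.mean})

    @classmethod
    def from_json(cls, blob):
        """Rebuild a model from saved state (stand-in for pulling a model)."""
        return cls(**json.loads(blob))


# Initial training, then "register"
model = RunningMean().fit([1.0, 2.0, 3.0])
saved = model.to_json()

# "Pull" the saved model and continue training on new data only
pulled = RunningMean.from_json(saved)
pulled.fit([4.0, 5.0])
print(pulled.count, pulled.mean)
```

After the second `fit`, the pulled model reflects all five values even though it only saw the two new ones. Whether a given framework's `fit` continues from existing state or restarts depends on that framework.
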
## Remote training with Built-in Models
In addition to custom machine learning code, we also support remote training with built-in algorithms through pre-built containers. You can run training jobs on your data without writing any code for a training application.

### Supported frameworks
1. [TabNetTrainer](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/preview/tabular_models/tabnet_trainer.py#L52)

### User journey
```py
import vertexai
from vertexai.preview.tabular_models import TabNetTrainer

# Init vertexai and switch to remote mode for training
vertexai.init(project="my-project", location="my-location")
vertexai.preview.init(remote=True)

trainer = TabNetTrainer(
    model_type="classification",
    target_column="target",
    learning_rate=0.01,
    max_train_secs=1800,
)

trainer.fit(training_data, validation_data)
```

## Vizier Hyperparameter Tuning
Vertex AI SDK supports local and remote hyperparameter tuning using [Vertex AI Vizier](https://cloud.google.com/vertex-ai/docs/vizier/overview). Local tuning creates trials locally and runs training locally, while remote tuning creates trials locally and runs training remotely on [Vertex AI CustomJob](https://cloud.google.com/vertex-ai/docs/training/create-custom-job), with each CustomJob corresponding to one trial.

### Supported frameworks
1. scikit-learn
2. tensorflow
3. pytorch
4. lightning

### User journey
```py
import vertexai
from vertexai.preview.hyperparameter_tuning import VizierHyperparameterTuner

# Init vertexai and switch to remote mode for tuning
vertexai.init(project="my-project", location="my-location")
vertexai.preview.init(remote=True)


def get_model_func(C: float):
    from sklearn.linear_model import LogisticRegression

    # Wrap the class to train models on Vertex
    LogisticRegression = vertexai.preview.remote(LogisticRegression)
    # Instantiate model. C will be tuned.
    model = LogisticRegression(C=C)

    return model


tuner = VizierHyperparameterTuner(
    get_model_func=get_model_func,
    max_trial_count=4,
    parallel_trial_count=2,
    hparam_space=[
        {
            "parameter_id": "C",
            "discrete_value_spec": {"values": [0.1, 0.5, 1.0]},
        }
    ],
    metric_id="custom",
    metric_goal="MAXIMIZE",
)

# Tune model using Vizier. Tuning runs locally while trials run on Vertex.
tuner.fit(X_train, y_train)

# Get the best model after tuning is done.
best_model = tuner.get_best_models()[0]
```

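Each `hparam_space` entry is a plain dictionary, so it can be built and checked with ordinary Python before tuning. In the sketch below, `discrete_param` and `best_discrete_value` are hypothetical helpers, not SDK functions, and `toy_metric` is a made-up objective. The search is a plain enumeration that runs locally; it is only a stand-in to show how `max_trial_count` and `metric_goal` shape the outcome, not how Vizier selects trials.

```python
from typing import Callable, Dict, Sequence


def discrete_param(parameter_id: str, values: Sequence[float]) -> Dict:
    """Build one `hparam_space` entry in the dictionary format shown above."""
    if not values:
        raise ValueError("values must not be empty")
    return {
        "parameter_id": parameter_id,
        "discrete_value_spec": {"values": list(values)},
    }


def best_discrete_value(
    spec: Dict,
    objective: Callable[[float], float],
    max_trial_count: int,
    metric_goal: str = "MAXIMIZE",
) -> float:
    """Score at most `max_trial_count` candidate values and return the best one.

    Plain enumeration (assumption for illustration), not Vizier's strategy.
    """
    candidates = spec["discrete_value_spec"]["values"][:max_trial_count]
    pick = max if metric_goal == "MAXIMIZE" else min
    return pick(candidates, key=objective)


def toy_metric(c: float) -> float:
    """Made-up metric that peaks at C = 0.5."""
    return -((c - 0.5) ** 2)


hparam_space = [discrete_param("C", [0.1, 0.5, 1.0])]
best_c = best_discrete_value(hparam_space[0], toy_metric, max_trial_count=4)
print(hparam_space)
print("best C:", best_c)
```
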
## Sample Notebooks
The notebooks below showcase different uses of Vertex AI SDK.

- [remote_training_sklearn.ipynb](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/vertex_ai_sdk/remote_training_sklearn.ipynb)
  - Remote training
  - Uptraining

- [remote_training_pytorch.ipynb](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/vertex_ai_sdk/remote_training_pytorch.ipynb)
  - Remote training
  - Remote GPU training
  - Uptraining

- [remote_training_tensorflow_with_autologging.ipynb](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/vertex_ai_sdk/remote_training_tensorflow_with_autologging.ipynb)
  - Remote training
  - Remote GPU training
  - Remote training with Autologging
  - Uptraining

- [remote_training_lightning.ipynb](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/vertex_ai_sdk/remote_training_lightning.ipynb)
  - Remote training

- [remote_hyperparameter_tuning.ipynb](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/vertex_ai_sdk/remote_hyperparameter_tuning.ipynb)
  - Vizier Hyperparameter Tuning
