Commit b87c42d

Add dbm-example custom Jupyter app template
Creates a new custom app template based on Jupyter base-notebook with integrated Workbench tools (gcsfuse, wb CLI) and cloud CLIs (AWS, GCP). Configured for user rishbahc with Java 17 support.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent d1a9abc commit b87c42d

68 files changed

Lines changed: 6275 additions & 0 deletions

src/dbm-example/.devcontainer.json

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
{
  "name": "dbm-example",
  "dockerComposeFile": "docker-compose.yaml",
  "service": "app",
  "shutdownAction": "none",
  "workspaceFolder": "/workspace",
  "postCreateCommand": [
    "./startupscript/post-startup.sh",
    "rishbahc",
    "/home/rishabhc",
    "${templateOption:cloud}",
    "${templateOption:login}"
  ],
  "postStartCommand": [
    "./startupscript/remount-on-restart.sh",
    "rishbahc",
    "/home/rishabhc",
    "${templateOption:cloud}",
    "${templateOption:login}"
  ],
  "features": {
    "ghcr.io/devcontainers/features/java:1": {
      "version": "17"
    },
    "ghcr.io/devcontainers/features/aws-cli:1": {},
    "ghcr.io/dhoeric/features/google-cloud-cli:1": {}
  },
  "remoteUser": "root"
}
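Both `postCreateCommand` and `postStartCommand` pass four positional arguments to their script: a user name, a home directory, and the two template options. The startup scripts themselves are not shown in this view, so how they consume these values is an assumption. As a minimal sketch, a hypothetical helper `check_user_home` (not part of the repository) can confirm that the last path component of the home directory matches the user name:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: succeeds only when the
# last path component of the home directory equals the user name.
check_user_home() {
  local user="$1"
  local home_dir="$2"
  if [[ "$(basename "${home_dir}")" == "${user}" ]]; then
    echo "ok: ${home_dir} matches user ${user}"
  else
    echo "mismatch: ${home_dir} does not end in ${user}" >&2
    return 1
  fi
}
```

Called as `check_user_home rishbahc /home/rishabhc`, the helper returns a non-zero status because the two spellings differ. The commit does not state which spelling is intended.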

src/dbm-example/README.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# dbm-example

Custom Workbench application based on quay.io/jupyter/base-notebook.

## Configuration

- **Image**: quay.io/jupyter/base-notebook
- **Port**: 8888
- **User**: rishbahc
- **Home Directory**: /home/rishabhc

## Access

Once deployed in Workbench, access the Jupyter server at the app URL (port 8888).

For local testing:
1. Create Docker network: `docker network create app-network`
2. Run the app: `devcontainer up --workspace-folder .`
3. Access at: `http://localhost:8888`

## Customization

Edit the following files to customize your app:

- `.devcontainer.json` - Devcontainer configuration and features
- `docker-compose.yaml` - Docker Compose configuration (image, port, volumes)
- `devcontainer-template.json` - Template options and metadata

## Testing

To test this app template:

```bash
cd test
./test.sh dbm-example
```

## Usage

1. Fork the repository
2. Modify the configuration files as needed
3. In Workbench UI, create a custom app pointing to your forked repository
4. Select this app template (dbm-example)
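Step 1 of the local testing instructions reports an error when the `app-network` network already exists, which is likely on a second run. A minimal sketch of a repeatable version follows; the function name `ensure_app_network` is hypothetical, and it relies only on the standard `docker network inspect` and `docker network create` subcommands:

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the external Docker network only if it is
# missing, so the local-testing step can be repeated safely.
ensure_app_network() {
  local name="${1:-app-network}"
  if docker network inspect "${name}" >/dev/null 2>&1; then
    echo "network ${name} already exists"
  else
    docker network create "${name}" >/dev/null
    echo "network ${name} created"
  fi
}
```

After `ensure_app_network`, the remaining steps (`devcontainer up --workspace-folder .` and opening `http://localhost:8888`) are unchanged.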
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
{
  "id": "dbm-example",
  "version": "1.0.0",
  "name": "dbm-example",
  "description": "Custom Workbench app: dbm-example (Image: quay.io/jupyter/base-notebook, Port: 8888, User: rishbahc)",
  "options": {
    "cloud": {
      "type": "string",
      "enum": ["gcp", "aws"],
      "default": "gcp",
      "description": "Cloud provider (gcp or aws)"
    },
    "login": {
      "type": "string",
      "description": "Whether to log in to workbench CLI",
      "proposals": ["true", "false"],
      "default": "false"
    }
  }
}
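The two options are declared differently: `cloud` uses `enum`, which limits the value to the listed entries, while `login` uses `proposals`, which in the dev container template format only suggests values and still accepts free-form input. A script that receives these options may therefore want to validate them itself. The validators below are a hypothetical sketch and are not taken from the repository's startup scripts:

```shell
#!/usr/bin/env bash
# Hypothetical validators, not part of the repository.

# Accept only the values listed in the "cloud" enum.
validate_cloud() {
  case "$1" in
    gcp|aws) return 0 ;;
    *) echo "unsupported cloud: $1" >&2; return 1 ;;
  esac
}

# "login" is free-form in the template, so reject anything that is not
# one of the two proposed values.
validate_login() {
  case "$1" in
    true|false) return 0 ;;
    *) echo "login must be \"true\" or \"false\", got: $1" >&2; return 1 ;;
  esac
}
```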
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
services:
  app:
    # The container name must be "application-server"
    container_name: "application-server"
    # This can be either a pre-existing image or built from a Dockerfile
    image: "quay.io/jupyter/base-notebook"
    # build:
    #   context: .
    restart: always
    volumes:
      - .:/workspace:cached
      - work:/home/rishabhc/work
    # The port specified here will be forwarded and accessible from the
    # Workbench UI.
    ports:
      - 8888:8888
    # The service must be connected to the "app-network" Docker network
    networks:
      - app-network
    # SYS_ADMIN and fuse are required to mount workspace resources into the
    # container.
    cap_add:
      - SYS_ADMIN
    devices:
      - /dev/fuse
    security_opt:
      - apparmor:unconfined

volumes:
  work:

networks:
  # The Docker network must be named "app-network". This is an external network
  # that is created outside of this docker-compose file.
  app-network:
    external: true
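The comments in this Compose file list several fixed requirements: the container name, the external `app-network` network, the `SYS_ADMIN` capability, and the `/dev/fuse` device. As a rough sketch, a hypothetical `check_compose_requirements` helper can look for those settings before an app is deployed. It does plain text matching rather than YAML parsing, so it is a quick sanity check and not a validator:

```shell
#!/usr/bin/env bash
# Hypothetical lint helper: reports which of the documented requirements
# are missing from a Compose file. Uses fixed-string matching only.
check_compose_requirements() {
  local file="$1"
  local missing=0
  local pattern
  for pattern in \
    'container_name: "application-server"' \
    '- app-network' \
    '- SYS_ADMIN' \
    '- /dev/fuse' \
    'external: true'; do
    if ! grep -qF -- "${pattern}" "${file}"; then
      echo "missing: ${pattern}"
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, `check_compose_requirements docker-compose.yaml` prints one `missing:` line per absent setting and returns a non-zero status if any are absent.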
Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
# Developer guide for (post)startup script

Verily Workbench provisions VMs post-creation to install Workbench-specific tools (such as the CLI, gcsfuse, and SSH keys for git).

Currently there are three flavors of startup script:

- Vertex AI user-managed notebook
- Dataproc cluster
- general GCE instance (in the startupscript/ folder)

## How to test your change?

### Option 1

If it's a single-line change, you can just create an environment in the devel environment and run the command.

### Option 2

If it's a complex change, you can point the VM to your new script and test it end-to-end.

#### Vertex AI

- Step 1

  Make your change and push to a branch.

- Step 2

  ```text
  wb resource create gcp-notebook --id=jupyterNotebookForTesting --post-startup-script=https://raw.githubusercontent.com/verily-src/workbench-app-devcontainers/<your-branch>/startupscript/vertex-ai-user-managed-notebook/post-startup.sh
  ```

- Step 3
  Go to the UI, wait until the notebook spins up, and verify that it is running.

#### Dataproc

- Step 1

  Make your change and push to a branch.

- Step 2

  ```text
  wb resource create dataproc-cluster --name=dataprocForTesting --metadata=startup-script-url=https://raw.githubusercontent.com/verily-src/workbench-app-devcontainers/<your-branch>/startupscript/dataproc/startup.sh
  ```

  Pick a workspace in which you have previously created a Dataproc cluster so you can reuse the buckets.

- Step 3
  Go to the UI, wait until the cluster spins up, and verify that it is running.

#### GCE

- Step 1

  Clone this repo and put it in a public repo you own. Make the change.

- Step 2
  In the UI, create a custom r-analysis app pointing at your personal repo.

- Step 3
  Wait for the app to spin up and go to the instance. Check .workbench/post-startup-output.txt to see whether the script succeeded.

## Linting and Style

Shell code in this repo will be checked with `shellcheck` as part of pull request testing.

The `shellcheck` tool [can be installed locally](https://github.com/koalaman/shellcheck?tab=readme-ov-file#installing).
Additionally, VS Code has a [ShellCheck extension](https://marketplace.visualstudio.com/items?itemName=timonwong.shellcheck).

To configure `shellcheck` locally with the same configuration as the PR lint tasks, create a
`~/.shellcheckrc` file with the following content:

```shell
disable=SC1090,SC1091
```

This disables checks [SC1090](https://www.shellcheck.net/wiki/SC1090) and
[SC1091](https://www.shellcheck.net/wiki/SC1091), to work around limitations in
`shellcheck` around dynamic file handling.

In addition to `shellcheck`-enforced logic, it is highly recommended that variables and functions
be made `readonly` to prevent overriding or unsetting.

For trivial variable assignment this can be done on a single line:

```shell
readonly FOO="foo"
```

However, for assignments that involve calling a command, `readonly` can mask error responses; in
these cases the variable should be marked `readonly` as a subsequent step. This is enforced by
`shellcheck` rule [SC2155](https://www.shellcheck.net/wiki/SC2155).

```shell
FOO="$(command)"
readonly FOO
```
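
To make the masking concrete, the sketch below compares the exit status seen after each form. The function names `combined_status` and `split_status` are hypothetical, `false` stands in for any failing command, and each body runs in a subshell so the `readonly` variable does not leak into the caller:

```shell
#!/usr/bin/env bash
# Combined form: the status observed is that of the `readonly` builtin,
# so the failure of the command substitution is hidden.
combined_status() {
  (
    rc=0
    readonly VALUE="$(false)" || rc=$?
    echo "${rc}"
  )
}

# Split form: a plain assignment reports the status of the command
# substitution, so the failure is visible before `readonly` is applied.
split_status() {
  (
    rc=0
    VALUE="$(false)" || rc=$?
    readonly VALUE
    echo "${rc}"
  )
}
```

`combined_status` prints `0` even though the command failed, while `split_status` prints `1`.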

Functions follow this syntax:

```shell
function my_function {
  # ... function logic ...
}
readonly -f my_function
```
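
As a small illustration of what `readonly -f` protects against, the sketch below marks a function read-only and then attempts to redefine it. The names `greet` and `try_redefine` are hypothetical, and the redefinition runs in a subshell so the resulting error stays contained:

```shell
#!/usr/bin/env bash
function greet {
  echo "hello"
}
readonly -f greet

# Attempt to redefine greet inside a subshell and report the outcome.
function try_redefine {
  if ( function greet { echo "overridden"; } ) 2>/dev/null; then
    echo "redefinition succeeded"
  else
    echo "redefinition rejected"
  fi
}
```

Bash refuses the second definition, so `try_redefine` reports a rejection and `greet` keeps its original body.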

## General debugging tips

- Check /home/**\<user\>**/.workbench/post-startup-output.txt to see where the script failed.
  The user is `jupyter` for Vertex AI, `dataproc` for a Dataproc cluster, and varies by app for a GCE instance.

- If the proxy URL doesn't work, you can SSH into the VM.
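
The log location described in the first tip can be captured in a small helper. This is a sketch only: the function name `post_startup_log_path` and the flavor labels `vertex-ai`, `dataproc`, and `gce` are hypothetical, while the user names and the file path come from the tip itself:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the expected post-startup log path for a
# given startup script flavor.
post_startup_log_path() {
  local flavor="$1"
  local user
  case "${flavor}" in
    vertex-ai) user="jupyter" ;;
    dataproc)  user="dataproc" ;;
    gce)
      # The user varies by app, so the caller has to supply it.
      if [[ -z "${2:-}" ]]; then
        echo "gce requires a user name" >&2
        return 1
      fi
      user="$2"
      ;;
    *)
      echo "unknown flavor: ${flavor}" >&2
      return 1
      ;;
  esac
  echo "/home/${user}/.workbench/post-startup-output.txt"
}
```

On the VM, something like `tail -n 50 "$(post_startup_log_path vertex-ai)"` would then show the end of the log.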
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@

# Prompt user to authenticate with Workbench CLI if they are not already
# authenticated. Note the lack of a shebang is intentional. This file will be
# appended to the user's .bash_profile file, and is kept separate so that this
# code can be linted with shellcheck.

if [[ "$("${WORKBENCH_INSTALL_PATH}" auth status --format json | jq .loggedIn)" == false ]]; then
  echo "User must log into Workbench to continue."
  "${WORKBENCH_INSTALL_PATH}" auth login
  configure_workbench
fi
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@

# Workbench helper functions for AWS. Note the lack of a shebang is
# intentional. This file will be appended to the user's .bashrc file, and is
# kept separate so that these functions can be linted with shellcheck.

function configure_workspace() {
  "${WORKBENCH_INSTALL_PATH}" workspace set --uuid "${WORKBENCH_WORKSPACE_UUID}"
  "${WORKBENCH_INSTALL_PATH}" workspace configure-aws --cache-with-aws-vault=true
  "${WORKBENCH_INSTALL_PATH}" resource mount
}
readonly -f configure_workspace

function configure_ssh() {
  local USER_SSH_DIR="${HOME}/.ssh"
  mkdir -p "${USER_SSH_DIR}"
  local USER_SSH_KEY
  USER_SSH_KEY="$("${WORKBENCH_INSTALL_PATH}" security ssh-key get --include-private-key --format=JSON)"
  echo "${USER_SSH_KEY}" | jq -r '.privateSshKey' > "${USER_SSH_DIR}"/id_rsa
  echo "${USER_SSH_KEY}" | jq -r '.publicSshKey' > "${USER_SSH_DIR}"/id_rsa.pub
  chmod 0600 "${USER_SSH_DIR}"/id_rsa*
  ssh-keyscan -H github.com >> "${USER_SSH_DIR}/known_hosts"
}
readonly -f configure_ssh

function configure_git() {
  mkdir -p "${WORKBENCH_GIT_REPOS_DIR}"
  pushd "${WORKBENCH_GIT_REPOS_DIR}" || return
  "${WORKBENCH_INSTALL_PATH}" resource list --type=GIT_REPO --format json | \
    jq -c .[] | \
    while read -r ITEM; do
      local GIT_REPO_NAME
      GIT_REPO_NAME="$(echo "$ITEM" | jq -r .id)"
      local GIT_REPO_URL
      GIT_REPO_URL="$(echo "$ITEM" | jq -r .gitRepoUrl)"
      if [[ ! -d "${GIT_REPO_NAME}" ]]; then
        git clone "${GIT_REPO_URL}" "${GIT_REPO_NAME}"
      fi
    done
  popd || return
}
readonly -f configure_git

function configure_workbench() {
  configure_workspace
  configure_ssh
  configure_git
}
readonly -f configure_workbench
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
#!/bin/bash -x

# configure-aws-vault.sh
#
# Performs additional one-time configuration for applications running in AWS
# EC2 instances.
#
# Note that this script is intended to be sourced from the "post-startup.sh"
# script and is dependent on some functions and variables already being set up
# and some packages already installed:
#
# - emit (function)
# - RUN_AS_LOGIN_USER: run command as app user
# - WORKBENCH_INSTALL_PATH: path to CLI executable

readonly AWS_VAULT_INSTALL_PATH="/usr/bin/aws-vault"
readonly AWS_VAULT_BINARY_PATH="/usr/bin/_aws-vault"
readonly AWS_VAULT_EXE_URL="https://github.com/ByteNess/aws-vault/releases/download/v7.9.13/aws-vault-linux-amd64"

if [[ -f "${AWS_VAULT_INSTALL_PATH}" ]]; then
  emit "aws-vault already installed"
else
  ##########################################
  # Install aws-vault for credential caching
  ##########################################
  emit "installing aws-vault"
  curl --no-progress-meter --location --output "${AWS_VAULT_BINARY_PATH}" "${AWS_VAULT_EXE_URL}"

  cat <<EOF > "${AWS_VAULT_INSTALL_PATH}"
export AWS_VAULT_BACKEND="file"
export AWS_VAULT_FILE_PASSPHRASE=""

# aws-vault's keyring dependency creates dbus-daemon processes without cleaning
# them up (https://github.com/99designs/keyring/issues/103). By setting
# DBUS_SESSION_BUS_ADDRESS to /dev/null, we can prevent aws-vault from creating
# these processes. dbus is only needed for the "secretservice" backend, which we
# do not use.
export DBUS_SESSION_BUS_ADDRESS="/dev/null"

exec "${AWS_VAULT_BINARY_PATH}" "\$@"
EOF

  chmod 755 "${AWS_VAULT_INSTALL_PATH}"
  chmod 755 "${AWS_VAULT_BINARY_PATH}"

  #####################################
  # Set up aws-vault credential caching
  #####################################
  ${RUN_AS_LOGIN_USER} "${WORKBENCH_INSTALL_PATH} config set cache-with-aws-vault true"
  ${RUN_AS_LOGIN_USER} "${WORKBENCH_INSTALL_PATH} config set wb-path --path ${WORKBENCH_INSTALL_PATH}"
  ${RUN_AS_LOGIN_USER} "${WORKBENCH_INSTALL_PATH} config set aws-vault-path --path ${AWS_VAULT_INSTALL_PATH}"
fi
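
The wrapper is written with an unquoted here-document delimiter, so `${AWS_VAULT_BINARY_PATH}` is expanded when the file is written, while the escaped `\$@` is stored literally and expands only when the wrapper runs. The sketch below reproduces that pattern in a temporary directory. All paths and names (`demo_dir`, `real_binary`, `wrapper`, `DEMO_BACKEND`) are hypothetical, the stand-in binary is a tiny script, and the shebang in the generated wrapper is an addition of this example:

```shell
#!/usr/bin/env bash
# Hypothetical paths in a temporary directory; the real script writes to /usr/bin.
demo_dir="$(mktemp -d)"
real_binary="${demo_dir}/_tool"
wrapper="${demo_dir}/tool"

# Stand-in for the downloaded binary: prints each argument on its own line.
# The quoted delimiter keeps "$@" literal in the generated file.
cat <<'EOF' > "${real_binary}"
#!/bin/bash
printf '%s\n' "$@"
EOF

# Unquoted delimiter: ${real_binary} expands now, while \$@ is written as a
# literal $@ and expands only when the wrapper is executed.
cat <<EOF > "${wrapper}"
#!/bin/bash
export DEMO_BACKEND="file"
exec "${real_binary}" "\$@"
EOF

chmod 755 "${real_binary}" "${wrapper}"

# Arguments, including one with a space, pass through the wrapper unchanged.
wrapper_output="$("${wrapper}" "first arg" second)"
echo "${wrapper_output}"
```

Inspecting the generated wrapper with `cat "${wrapper}"` shows the resolved binary path next to a literal `"$@"`.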
