gemini-cli-extensions
diff --git a/skills/bigquery-data-transfer-service/SKILL.md b/skills/bigquery-data-transfer-service/SKILL.md
Lines changed: 66 additions & 30 deletions
diff --git a/skills/bigquery-data-transfer-service/scripts/bigquery_dts.py b/skills/bigquery-data-transfer-service/scripts/bigquery_dts.py
Lines changed: 90 additions & 61 deletions
diff --git a/skills/building-data-apps/SKILL.md b/skills/building-data-apps/SKILL.md
Lines changed: 4 additions & 2 deletions
diff --git a/skills/developing-with-bigquery/resources/BIGFRAMES.md b/skills/developing-with-bigquery/resources/BIGFRAMES.md
Lines changed: 1 addition & 0 deletions
@@ -31,7 +31,24 @@ metadata related to ingestion when needed.
 
 ## Workflow
 
-### Step 0: Check for Existing Transfers
+### Step 0: Discover Environment Parameters
+
+Before generating configurations, discover the actual values for the target
+project and region.
+
+> [!TIP]
+> If `deployment.yaml` already exists in the repository root, prioritize
+> extracting `project` and `region` from the target environment configuration
+> (e.g., `dev`).
+
+1.  **Project**: `gcloud config get project`
+2.  **Region**: `gcloud config get compute/region`
+
+> [!TIP]
+> Use these commands to replace placeholders like `<PROJECT_ID>` with actual
+> values. Always remove associated comments that start with TODO once replaced.
+
+### Step 1: Check for Existing Transfers
 
 Before assuming a new transfer is needed, check for existing ones in the target
 region.
@@ -40,18 +57,19 @@ region.
 
     ```bash
     bq ls --transfer_config \
-      --transfer_location=[LOCATION] \
-      --project_id=[PROJECT_ID]
+      --transfer_location=<REGION> \
+      --project_id=<PROJECT_ID>
     ```
 
-2.  **Evaluate Results**:
+2.  **Analyze Existing Transfers**:
 
     -   **Single Transfer Found**:
 
         -   Check if the transfer has at least one successful run: `bq ls
-            --transfer_run --transfer_config=[RESOURCE_NAME]`
-        -   If found: Use existing or manage via deployment framework.
-        -   If not found: Guess tables from config.
+            --transfer_run --transfer_config=<RESOURCE_NAME>`
+        -   If found: Use existing transfer config.
+        -   If not found: Ask the user to confirm before triggering a
+            transfer run.
 
     -   **Multiple Transfers Found**:
 
@@ -61,66 +79,84 @@ region.
     -   **Disabled Transfers Found**:
 
         -   Ask user if they want to enable it or create a new one.
-        -   Enable: `bq update --disabled=false
-            --transfer_config=[RESOURCE_NAME]`
+        -   To Enable: Instruct the user to update the transfer configuration
+            within their `deployment.yaml` file by setting the `disabled`
+            field to `false` for the specific transfer resource.
 
     -   **No Transfers Found**: Proceed to create new if needed.
 
-### Step 1: Discover & Validate Parameters (New Transfers)
+### Step 2: Discover & Validate Parameters (New Transfers)
 
 If creating a new transfer, discover the required parameters using the REST API
 and validate them with the user.
 
-> [!TIP] If `DATA_SOURCE_ID` is unknown, run `bq show --transfer_data_sources`
-> `--location=[LOCATION] --project_id=[PROJECT_ID]` to list available source IDs
-> (e.g., `google_cloud_storage`, `salesforce`).
+> [!TIP] If `<DATA_SOURCE_ID>` is unknown, run the discovery script
+> without the `<DATA_SOURCE_ID>` argument to list available source IDs
+> (e.g., `google_cloud_storage`).
+> It uses the project and region derived in Step 0.
+> ```bash
+> python3 scripts/bigquery_dts.py --project_id=<PROJECT_ID>
+> ```
 
 1.  **Run Discovery Script**: Use the `bigquery_dts.py` script to inspect Data
     Source parameters via the REST API.
 
     ```bash
-    # Use the path to the script in your workspace
-    python3 scripts/bigquery_dts.py --project_id [PROJECT_ID] [DATA_SOURCE_ID] [LOCATION]
+    # Passes the derived project and region to the script.
+    python3 scripts/bigquery_dts.py --project_id=<PROJECT_ID> <DATA_SOURCE_ID> <REGION>
     ```
 
     > [!IMPORTANT] Run this command every time a new transfer is being planned.
 
-2.  **Mandatory User Questionnaire (CRITICAL)**:
-
-    -   Identify mandatory parameters.
-    -   Present them to the user BEFORE generating config files.
-    -   Ask for verification of assets/tables.
-
-3.  **Wait for User Response**: Do NOT proceed until parameters are confirmed.
-
-### Step 2: Extract Transfer Config Data
+2.  > [!CAUTION] **Mandatory User Questionnaire (CRITICAL)**:
+
+    -   **Explicitly identify ALL specific parameters** returned by the
+        discovery script. **You MUST NOT generalize or vaguely summarize them.**
+    -   **OAuth Authorization (Google Data Sources)**: For Google ecosystem data
+        sources (Google Ads, YouTube, etc.), if the user configures the DTS
+        transfer config with End User Credentials (EUC) rather than a service
+        account, generate an OAuth URI and ask the user to visit it to
+        authorize. Once the user provides the `versionInfo` code, set it as
+        `definition.versionInfo` in `deployment.yaml`, then proceed.
+    -   If any parameters are related to authentication, explicitly ask the
+        user to provide the Secret Manager Resource ID (e.g.,
+        `projects/my-project/secrets/my-secret`) for these parameters.
+    -   Present every required parameter to the user BEFORE generating
+        config files.
+    -   Ask for verification of assets/tables to be ingested.
+
+3.  **Wait for User Response**: You **MUST NOT** proceed until parameters are
+    confirmed.
+
+### Step 3: Extract Transfer Config Data
 
 Retrieve the configuration details for the selected transfer.
 
 ```bash
-bq show --format=prettyjson --transfer_config [RESOURCE_NAME]
+bq show --format=prettyjson --transfer_config <RESOURCE_NAME>
 ```
 
-### Step 3: Trigger and Verify Transfer
+### Step 4: Trigger and Verify Transfer
 
 After the transfer is deployed via the resource provisioning framework, you MUST
 ensure there is at least a single successful run before proceeding with the rest
 of the tasks.
 
-1.  **Trigger a Manual Run**: If no successful runs are found, or the transfer
-    was just created, trigger a manual run for the current time.
+1.  **Trigger a Manual Run**: If no successful or ongoing runs are found, or
+    the transfer was just created, trigger a manual run for the current time.
 
     ```bash
     bq mk --transfer_run \
-      --transfer_config=[RESOURCE_NAME] \
+      --transfer_config=<RESOURCE_NAME> \
       --run_time=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
     ```
 
 2.  **Poll for Completion (5-Minute Rule)**: Attempt to check the status of the
     run every 30-60 seconds for up to **5 minutes**.
 
     ```bash
-    bq ls --transfer_run --transfer_config=[RESOURCE_NAME]
+    bq ls --format=prettyjson --transfer_run --transfer_config=<RESOURCE_NAME>
     ```
 
     -   **Success**: If the run completes successfully, proceed with the rest of
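
The 5-minute rule in Step 4 can be sketched as a small polling helper. This is a minimal sketch and not part of the skill: `poll_transfer_run` and `check_state` are hypothetical names, and `check_state` stands in for whatever wraps `bq ls --format=prettyjson --transfer_run` and returns the latest run state. The terminal state names are assumed from the DTS `TransferState` enum.

```python
import time
from typing import Callable, Optional

# Terminal states, assumed from the DTS TransferState enum.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def poll_transfer_run(
    check_state: Callable[[], str],
    interval_s: float = 30.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
  """Polls until a terminal state is seen or the timeout elapses.

  Returns the terminal state, or None if the run is still in progress
  once another wait would exceed timeout_s.
  """
  deadline = clock() + timeout_s
  while True:
    state = check_state()
    if state in TERMINAL_STATES:
      return state
    # Stop rather than sleep past the deadline.
    if clock() + interval_s > deadline:
      return None
    sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the helper testable without real waiting; a `None` result corresponds to the "still running after 5 minutes" case.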
 
@@ -19,71 +19,65 @@
 import os
 import subprocess
 import sys
+from typing import Optional
 import urllib.error
+import urllib.parse
 import urllib.request
 
 
-def get_project_id():
-  """Retrieves the active Google Cloud project ID from the environment or CLI."""
-  project_id = os.environ.get("PROJECT_ID")
-  if project_id:
-    return project_id
-  try:
-    result = subprocess.run(
-        ["gcloud", "config", "get-value", "project"],
-        capture_output=True,
-        text=True,
-        check=True,
-    )
-    return result.stdout.strip()
-  except subprocess.CalledProcessError:
-    print("Error: Could not determine PROJECT_ID.", file=sys.stderr)
-    sys.exit(1)
-  except FileNotFoundError:
+def _run_gcloud(*args: str) -> Optional[str]:
+  """Executes a gcloud command and returns the stripped stdout, or None."""
+  for cmd in ["gcloud", "gcloud.cmd"]:
     try:
       result = subprocess.run(
-          ["gcloud.cmd", "config", "get-value", "project"],
+          [cmd, *args],
           capture_output=True,
           text=True,
           check=True,
       )
       return result.stdout.strip()
-    except Exception:  # pylint: disable=broad-exception-caught
-      print("Error: gcloud command not found.", file=sys.stderr)
-      sys.exit(1)
+    except (subprocess.CalledProcessError, FileNotFoundError):
+      continue
+  return None
+
+
+def get_project_id() -> str:
+  """Retrieves the active Google Cloud project ID."""
+  project_id = os.environ.get("PROJECT_ID")
+  if project_id:
+    return project_id
 
+  val = _run_gcloud("config", "get-value", "project")
+  if val and "(unset)" not in val:
+    return val
 
-def get_token():
+  print("Error: Could not determine PROJECT_ID.", file=sys.stderr)
+  sys.exit(1)
+
+
+def get_token() -> str:
   """Retrieves the access token using the gcloud CLI."""
-  try:
-    result = subprocess.run(
-        ["gcloud", "auth", "print-access-token"],
-        capture_output=True,
-        text=True,
-        check=True,
-    )
-    return result.stdout.strip()
-  except FileNotFoundError:
-    try:
-      result = subprocess.run(
-          ["gcloud.cmd", "auth", "print-access-token"],
-          capture_output=True,
-          text=True,
-          check=True,
-      )
-      return result.stdout.strip()
-    except Exception:  # pylint: disable=broad-exception-caught
-      print("Error: gcloud command not found.", file=sys.stderr)
-      sys.exit(1)
-  except subprocess.CalledProcessError:
-    print(
-        "Error: Could not obtain access token. Are you logged in?",
-        file=sys.stderr,
-    )
-    sys.exit(1)
+  token = _run_gcloud("auth", "print-access-token")
+  if token:
+    return token
 
+  print(
+      "Error: Could not obtain access token. Are you logged in?",
+      file=sys.stderr,
+  )
+  sys.exit(1)
 
-def main():
+
+def get_region() -> str:
+  """Retrieves the default compute region from gcloud config."""
+  val = _run_gcloud("config", "get-value", "compute/region")
+  if val and "(unset)" not in val:
+    return val
+
+  return "us"
+
+
+def main() -> None:
   """Main entry point for the script."""
   parser = argparse.ArgumentParser(
       description=(
@@ -92,9 +86,11 @@ def main():
       )
   )
   parser.add_argument("--project_id", help="The GCP project ID to use")
-  parser.add_argument("data_source_id", help="The DATA_SOURCE_ID to inspect")
   parser.add_argument(
-      "region", nargs="?", default="us", help="The GCP region (default: us)"
+      "data_source_id", nargs="?", help="The DATA_SOURCE_ID to inspect"
+  )
+  parser.add_argument(
+      "region", nargs="?", help="The GCP region (default: derived or us)"
   )
   args = parser.parse_args()
 
@@ -106,16 +102,27 @@ def main():
     )
     sys.exit(1)
 
-  print(
-      f"Retrieving Data Source parameters for: {args.data_source_id} "
-      f"in {args.region}..."
-  )
+  region = args.region or get_region() or "us"
 
-  base_url = (
-      "https://bigquerydatatransfer.googleapis.com/v1/"
-      f"projects/{project_id}/locations/{args.region}"
-  )
-  url = f"{base_url}/dataSources/{args.data_source_id}"
+  if args.data_source_id:
+    print(
+        f"Retrieving Data Source parameters for: {args.data_source_id} "
+        f"in {region}..."
+    )
+    url = (
+        "https://bigquerydatatransfer.googleapis.com/v1/"
+        f"projects/{project_id}/locations/{region}/dataSources/"
+        f"{args.data_source_id}"
+    )
+  else:
+    print(
+        f"Listing available Data Sources in {region} for project "
+        f"{project_id}..."
+    )
+    url = (
+        "https://bigquerydatatransfer.googleapis.com/v1/"
+        f"projects/{project_id}/locations/{region}/dataSources"
+    )
 
   token = get_token()
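
The endpoint selection in this hunk (describe one data source when an ID is given, otherwise list all of them) can be isolated into a pure function. The sketch below is illustrative only: `build_data_sources_url` and `API_ROOT` are hypothetical names that do not appear in the PR, though the URL shapes mirror the ones the script builds.

```python
from typing import Optional

API_ROOT = "https://bigquerydatatransfer.googleapis.com/v1"


def build_data_sources_url(
    project_id: str, region: str, data_source_id: Optional[str] = None
) -> str:
  """Returns the describe URL when an ID is given, else the list URL."""
  base = f"{API_ROOT}/projects/{project_id}/locations/{region}/dataSources"
  # An empty or missing ID falls through to the list endpoint.
  return f"{base}/{data_source_id}" if data_source_id else base
```

Factoring the URL out this way would make both branches checkable without a network call or a `gcloud` login.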
 
@@ -124,9 +131,31 @@ def main():
   req.add_header("Content-Type", "application/json")
 
   try:
-    with urllib.request.urlopen(req) as response:
+    with urllib.request.urlopen(req, timeout=30) as response:
       data = json.loads(response.read().decode("utf-8"))
       print(json.dumps(data, indent=4))
+
+      # Generate OAuth authorization URI for Google data sources
+      client_id = data.get("clientId")
+      scopes = data.get("scopes")
+      if client_id and scopes:
+        print("\n" + "=" * 40)
+        print("MANDATORY OAUTH AUTHORIZATION STEP")
+        print("=" * 40)
+        print(
+            "This Data Source requires user authorization. "
+            "Please follow the URL below to authorize:"
+        )
+        params = {
+            "redirect_uri": "urn:ietf:wg:oauth:2.0:oob",
+            "response_type": "version_info",
+            "client_id": client_id,
+            "scope": " ".join(scopes),
+        }
+        query_string = urllib.parse.urlencode(params)
+        auth_url = f"https://bigquery.cloud.google.com/datatransfer/oauthz/auth?{query_string}"
+        print(f"\n{auth_url}\n")
+        print("=" * 40 + "\n")
   except urllib.error.HTTPError as e:
     print(f"HTTP Error: {e.code} {e.reason}", file=sys.stderr)
     print(e.read().decode("utf-8"), file=sys.stderr)
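
The OAuth step added in this hunk amounts to URL-encoding four parameters onto a fixed endpoint whenever the data source response carries both `clientId` and `scopes`. A minimal sketch of that logic as a standalone function follows; `build_auth_url` and `AUTH_ENDPOINT` are hypothetical names, and the parameter values are copied from the diff rather than verified independently.

```python
import urllib.parse
from typing import Optional

AUTH_ENDPOINT = "https://bigquery.cloud.google.com/datatransfer/oauthz/auth"


def build_auth_url(data_source: dict) -> Optional[str]:
  """Returns the authorization URL, or None if OAuth is not required."""
  client_id = data_source.get("clientId")
  scopes = data_source.get("scopes")
  if not (client_id and scopes):
    return None
  params = {
      "redirect_uri": "urn:ietf:wg:oauth:2.0:oob",
      "response_type": "version_info",
      "client_id": client_id,
      # Scopes are joined with spaces; urlencode escapes them.
      "scope": " ".join(scopes),
  }
  return f"{AUTH_ENDPOINT}?{urllib.parse.urlencode(params)}"
```

Returning `None` for sources without `clientId` or `scopes` matches the script's behavior of printing the authorization banner only when both fields are present.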
 
@@ -6,7 +6,7 @@ description: |
   Analytics chat integration for data analytics.
 
   Relevant when any of the following conditions are true:
-    1. The user explicitly requests to build a data dashboard, data application, or visualization UI, and the UI pulls data from a GCP database (e.g., BigQuery, Spanner).
+    1. The user explicitly requests to build a data dashboard, data application, or visualization UI, and the UI pulls data from a GCP database (defaulting to BigQuery unless an alternative is specified).
     2. You need to generate a frontend web application to interact with, query, and visualize data from GCP data sources.
     3. The user wants to build a "chat with your data" experience or integrate the Gemini Data Analytics chat API into a web interface.
 
@@ -16,7 +16,7 @@ description: |
     3. The web application is not data-centric or does not involve visualizing/querying data from GCP sources.
 license: Apache-2.0
 metadata:
-  version: v1
+  version: v2
   publisher: google
 ---
 
@@ -26,6 +26,8 @@ metadata:
 
 -   **Framework:** React in Vite
 -   **Styling:** Tailwind CSS (Dark Mode by default)
+-   **Database:** BigQuery (Default database unless the user specifies an
+    alternative)
 -   **Icons:** `lucide-react`
 -   **Date Formatting:** `date-fns`
 -   **Data Fetching:** Axios (REST API calls)
 
@@ -27,3 +27,4 @@ Guidelines for generating valid code with the BigFrames (BigQuery DataFrame) lib
     - Sort data chronologically and split around a timepoint before training.
     - Prediction horizon must be less than or equal to training horizon.
 * **PCA**: BigFrames' PCA class lacks simple `transform()` method. Use `predict()` instead.
+* **Model Persistence**: To persist a model, use `model.to_gbq()`. To load a persisted model, use `bpd.read_gbq_model()`.