2 changes: 1 addition & 1 deletion config/config.json
@@ -195,4 +195,4 @@
"reporting": "REPORTING_OracleEBS"
}
}
}
}
8 changes: 4 additions & 4 deletions docs/deprecated/README.md
@@ -321,7 +321,7 @@ Navigate to [Cloud Storage](https://console.cloud.google.com/storage/create-buck
**Alternatively**, you can use the following command to create a bucket from the Cloud Shell:

```bash
gsutil mb -l <REGION/MULTI-REGION> gs://<BUCKET NAME>
gcloud storage buckets create --location <REGION/MULTI-REGION> gs://<BUCKET NAME>
```

Navigate to the _Permissions_ tab. Grant `Storage Object Creator` to the user executing the Build command or to the Service account you created for impersonation.
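
**Alternatively**, the role can be granted from the Cloud Shell. The following is only a sketch — the bucket name and the `--member` value are placeholders for the user or service account that will run the Build command:

```bash
# Grant Storage Object Creator on the build bucket.
# <BUCKET NAME> and the member email are placeholders.
gcloud storage buckets add-iam-policy-binding gs://<BUCKET NAME> \
  --member="user:<USER EMAIL>" \
  --role="roles/storage.objectCreator"
# For an impersonated service account, use --member="serviceAccount:<SERVICE ACCOUNT EMAIL>" instead.
```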
@@ -333,7 +333,7 @@ You can create a specific bucket for the Cloud Build process to store the logs.
**Alternatively**, here is the command line to create this bucket:

```bash
gsutil mb -l <REGION/MULTI-REGION> gs://<BUCKET NAME>
gcloud storage buckets create --location <REGION/MULTI-REGION> gs://<BUCKET NAME>
```

You will need to grant `Object Admin` permissions to the Cloud Build service account.
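
A minimal sketch of that grant, assuming the default Cloud Build service account (`<PROJECT NUMBER>@cloudbuild.gserviceaccount.com`) and a placeholder logs bucket name:

```bash
# Grant Object Admin on the logs bucket to the Cloud Build service account.
# <LOGS BUCKET NAME> and <PROJECT NUMBER> are placeholders.
gcloud storage buckets add-iam-policy-binding gs://<LOGS BUCKET NAME> \
  --member="serviceAccount:<PROJECT NUMBER>@cloudbuild.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
```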
@@ -560,8 +560,8 @@ We recommend pasting the generated SQL into BigQuery to identify and correct the
If you opted to generate integration or CDC files and have an instance of Cloud Composer (Airflow), you can move them into their final bucket with the following command:

```bash
gsutil -m cp -r gs://<output bucket>/dags/ gs://<composer DAG bucket>/
gsutil -m cp -r gs://<output bucket>/data/ gs://<composer DAG bucket>/
gcloud storage cp --recursive gs://<output bucket>/dags/ gs://<composer DAG bucket>/
gcloud storage cp --recursive gs://<output bucket>/data/ gs://<composer DAG bucket>/
```
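
If you are not sure which bucket backs your Composer environment, one way to look it up is shown below as a sketch — `<COMPOSER ENVIRONMENT>` and `<REGION>` are placeholders, and the `data/` folder sits next to the returned `dags/` prefix in the same bucket:

```bash
# Print the gs:// prefix of the Composer environment's DAG folder.
gcloud composer environments describe <COMPOSER ENVIRONMENT> \
  --location <REGION> \
  --format="value(config.dagGcsPrefix)"
```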

## Test, customize and prepare for upgrade
4 changes: 2 additions & 2 deletions docs/deprecated/RELEASE_NOTES.md
@@ -202,11 +202,11 @@ The following changes are on our roadmap and are planned to happen **no sooner t
* Option for `sequential` deployment (TURBO=false) for SFDC incorporated.
* Salesforce integration now updates the RAW tables with changes, merging changes on landing. This removes the need for additional CDC processing. Deltas are captured using SystemModstamp provided by Salesforce APIs. See details in README.
* `IsArchived` flag is removed from CDC processing for Salesforce.
* Errors originating from gsutil steps in cloudbuild.sfdc.yaml not finding files to copy are now caught and surfaced gracefully.
* Errors originating from gcloud storage steps in cloudbuild.sfdc.yaml not finding files to copy are now caught and surfaced gracefully.
* Removing some `substitution` defaults (e.g., LOCATION) from the cloudbuild.yaml file so all configurations are either passed from the command line or read from `config/config.json`. 🚨🔪🚨[TL;DR submit sample call](https://github.com/GoogleCloudPlatform/cortex-data-foundation#tldr-for-setup) was updated to default these flags. These parameters will be removed from substitution defaults in future releases. 🚨🔪🚨
* Detecting version for Airflow in DAG templates to use updated libraries for Airflow v2 in SAP and Salesforce. This removes some deprecation warnings but may require additional libraries to be installed in your Airflow instance.
* Fix for test harness data not loading in an intended location when the location is not passed as a substitution.
* Checking existence of DAG-generated files before attempting to copy with `gsutil` to avoid errors.
* Checking existence of DAG-generated files before attempting to copy with `gcloud storage cp` to avoid errors.
* **NOTE**: 🚨🚨Structure of RAW landed tables has changed🚨🚨 to not require additional DAG processing. Please check the documentation on mapping and use of the new extraction process before upgrading to avoid disruption. We recommend pausing the replication, making a backup copy of any loaded tables, modifying the schemata of existing loaded tables, and testing that the new DAGs work with the new columns. The DAG will start fetching records using the last SystemModstamp present in RAW.
## December 2022 - Release 4.0
* **🎆Welcome Salesforce.com to Cortex Data Foundation🎆🐈🦄**: New [module for Salesforce](https://github.com/GoogleCloudPlatform/cortex-salesforce), to be implemented alongside the SAP models or on its own. The module includes optional integration and CDC scripts and reporting views for Leads Capture & Conversion, Opportunity Trends & Pipeline, Sales Activity and Engagement, Case Overview and Trends, Case Management & Resolution, Accounts with Cases. See the [entity-relationship diagram](images/erd_sfdc.png) for a list of tables and views. Check the [Looker repository](https://github.com/looker-open-source/block-cortex-salesforce) for sample dashboards.
4 changes: 2 additions & 2 deletions src/OracleEBS/src/common/materializer/deploy.sh
@@ -220,8 +220,8 @@ echo "generate_dependent_dags.py completed successfully."
if [[ $(find generated_materializer_dag_files/*/*/task_dep_dags -type f 2> /dev/null | wc -l) -gt 0 ]]
then
echo "Copying DAG files to GCS bucket..."
echo "gsutil -m cp -r 'generated_materializer_dag_files/*' gs://${GCS_TGT_BUCKET}/dags/"
gsutil -m cp -r 'generated_materializer_dag_files/*' "gs://${GCS_TGT_BUCKET}/dags/"
echo "gcloud storage cp --recursive 'generated_materializer_dag_files/*' gs://${GCS_TGT_BUCKET}/dags/"
gcloud storage cp --recursive 'generated_materializer_dag_files/*' "gs://${GCS_TGT_BUCKET}/dags/"
else
echo "No task dependent DAG files to copy to GCS bucket!"
fi
@@ -24,6 +24,8 @@
from datetime import timedelta

import airflow
from airflow import __version__ as airflow_version
from packaging.version import Version
from airflow.operators.empty import EmptyOperator
from airflow.providers.google.cloud.operators.bigquery import \
BigQueryInsertJobOperator
@@ -34,18 +36,23 @@

default_dag_args = {
"depends_on_past": False,
"start_date": datetime(${year}, ${month}, ${day}),
"start_date": datetime(int("${year}"), int("${month}"), int("${day}")),
"catchup": False,
"retries": 1,
"retry_delay": timedelta(minutes=30),
}

if Version(airflow_version) >= Version("2.4.0"):
schedule_kwarg = {"schedule": "${load_frequency}"}
else:
schedule_kwarg = {"schedule_interval": "${load_frequency}"}

with airflow.DAG("${dag_full_name}",
default_args=default_dag_args,
catchup=False,
max_active_runs=1,
schedule_interval="${load_frequency}",
tags=${tags}) as dag:
tags=ast.literal_eval("${tags}"),
**schedule_kwarg) as dag:
start_task = EmptyOperator(task_id="start")
refresh_table = BigQueryInsertJobOperator(
task_id="refresh_table",
@@ -24,6 +24,8 @@
from datetime import timedelta

import airflow
from airflow import __version__ as airflow_version
from packaging.version import Version
from airflow.operators.empty import EmptyOperator
from airflow.providers.google.cloud.operators.bigquery import \
BigQueryInsertJobOperator
@@ -36,18 +38,23 @@

default_dag_args = {
"depends_on_past": False,
"start_date": datetime(${year}, ${month}, ${day}),
"start_date": datetime(int("${year}"), int("${month}"), int("${day}")),
"catchup": False,
"retries": 1,
"retry_delay": timedelta(minutes=30),
}

if Version(airflow_version) >= Version("2.4.0"):
schedule_kwarg = {"schedule": "${load_frequency}"}
else:
schedule_kwarg = {"schedule_interval": "${load_frequency}"}

with airflow.DAG("${dag_full_name}",
default_args=default_dag_args,
catchup=False,
max_active_runs=1,
schedule_interval="${load_frequency}",
tags=${tags}) as dag:
tags=ast.literal_eval("${tags}"),
**schedule_kwarg) as dag:

start_task = EmptyOperator(task_id="start")

@@ -88,8 +88,8 @@ steps:
if [[ $(find generated_materializer_dag_files -type f 2> /dev/null | wc -l) -gt 0 ]]
then
echo "Copying DAG files to GCS bucket..."
echo "gsutil -m cp -r 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/"
gsutil -m cp -r 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/
echo "gcloud storage cp --recursive 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/"
gcloud storage cp --recursive 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/
else
echo "No files to copy to GCS bucket!"
fi
6 changes: 3 additions & 3 deletions src/OracleEBS/src/common/py_libs/k9_deployer.py
@@ -82,10 +82,10 @@ def _simple_process_and_upload(k9_id: str, k9_dir: str, jinja_dict: dict,
if "__init__.py" not in [str(p.relative_to(k9_dir)) for p in k9_files]:
with open(f"{tmp_dir}/__init__.py", "w", encoding="utf-8") as f:
f.writelines([
"import os",
"import sys",
"import os\n",
"import sys\n",
("sys.path.append("
"os.path.dirname(os.path.realpath(__file__)))")
"os.path.dirname(os.path.realpath(__file__)))\n")
])

if data_source == "k9":
@@ -128,7 +128,7 @@ def validate_resources(
if isinstance(ex, NotFound):
logging.error("🛑 Storage bucket `%s` doesn't exist. 🛑",
bucket.name)
elif isinstance(ex, Unauthorized, Forbidden):
elif isinstance(ex, (Unauthorized, Forbidden)):
if checking_on_writing:
logging.error("🛑 Storage bucket `%s` "
"is not writable. 🛑", bucket.name)
4 changes: 4 additions & 0 deletions src/SAP/SAP_CDC/cdc_settings.yaml
@@ -312,3 +312,7 @@ data_to_replicate:
load_frequency: "@weekly"
- base_table: mkol
load_frequency: "@weekly"
## CORTEX-CUSTOMER: Uncomment if you need optional address notes/remarks in AddressMD view.
# - base_table: adrt
# load_frequency: "@daily"

4 changes: 2 additions & 2 deletions src/SAP/SAP_CDC/cloudbuild.cdc.yaml
@@ -57,7 +57,7 @@ steps:
generated_files=$(shopt -s nullglob dotglob; echo ./generated_dag/*.py)
if (( ${#generated_files} ))
then
gsutil -m cp -r './generated_dag/*.py' gs://${_TGT_BUCKET_}/dags/
gcloud storage cp --recursive './generated_dag/*.py' gs://${_TGT_BUCKET_}/dags/
else
echo "🔪🔪🔪No Python files found under generated_dag folder or the folder does not exist. Skipping copy.🔪🔪🔪"
fi
@@ -78,7 +78,7 @@ steps:
generated_files=$(shopt -s nullglob dotglob; echo ./generated_sql/*.sql)
if (( ${#generated_files} ))
then
gsutil -m cp -r './generated_sql/*.sql' gs://${_TGT_BUCKET_}/data/bq_data_replication/
gcloud storage cp --recursive './generated_sql/*.sql' gs://${_TGT_BUCKET_}/data/bq_data_replication/
else
echo "🔪No SQL files found under generated_sql folder or the folder does not exist. Skipping copy.🔪"
fi
4 changes: 2 additions & 2 deletions src/SAP/SAP_CDC/common/materializer/deploy.sh
@@ -220,8 +220,8 @@ echo "generate_dependent_dags.py completed successfully."
if [[ $(find generated_materializer_dag_files/*/*/task_dep_dags -type f 2> /dev/null | wc -l) -gt 0 ]]
then
echo "Copying DAG files to GCS bucket..."
echo "gsutil -m cp -r 'generated_materializer_dag_files/*' gs://${GCS_TGT_BUCKET}/dags/"
gsutil -m cp -r 'generated_materializer_dag_files/*' "gs://${GCS_TGT_BUCKET}/dags/"
echo "gcloud storage cp --recursive 'generated_materializer_dag_files/*' gs://${GCS_TGT_BUCKET}/dags/"
gcloud storage cp --recursive 'generated_materializer_dag_files/*' "gs://${GCS_TGT_BUCKET}/dags/"
else
echo "No task dependent DAG files to copy to GCS bucket!"
fi
@@ -24,6 +24,8 @@
from datetime import timedelta

import airflow
from airflow import __version__ as airflow_version
from packaging.version import Version
from airflow.operators.empty import EmptyOperator
from airflow.providers.google.cloud.operators.bigquery import \
BigQueryInsertJobOperator
@@ -34,18 +36,23 @@

default_dag_args = {
"depends_on_past": False,
"start_date": datetime(${year}, ${month}, ${day}),
"start_date": datetime(int("${year}"), int("${month}"), int("${day}")),
"catchup": False,
"retries": 1,
"retry_delay": timedelta(minutes=30),
}

if Version(airflow_version) >= Version("2.4.0"):
schedule_kwarg = {"schedule": "${load_frequency}"}
else:
schedule_kwarg = {"schedule_interval": "${load_frequency}"}

with airflow.DAG("${dag_full_name}",
default_args=default_dag_args,
catchup=False,
max_active_runs=1,
schedule_interval="${load_frequency}",
tags=${tags}) as dag:
tags=ast.literal_eval("${tags}"),
**schedule_kwarg) as dag:
start_task = EmptyOperator(task_id="start")
refresh_table = BigQueryInsertJobOperator(
task_id="refresh_table",
@@ -24,6 +24,8 @@
from datetime import timedelta

import airflow
from airflow import __version__ as airflow_version
from packaging.version import Version
from airflow.operators.empty import EmptyOperator
from airflow.providers.google.cloud.operators.bigquery import \
BigQueryInsertJobOperator
@@ -36,18 +38,23 @@

default_dag_args = {
"depends_on_past": False,
"start_date": datetime(${year}, ${month}, ${day}),
"start_date": datetime(int("${year}"), int("${month}"), int("${day}")),
"catchup": False,
"retries": 1,
"retry_delay": timedelta(minutes=30),
}

if Version(airflow_version) >= Version("2.4.0"):
schedule_kwarg = {"schedule": "${load_frequency}"}
else:
schedule_kwarg = {"schedule_interval": "${load_frequency}"}

with airflow.DAG("${dag_full_name}",
default_args=default_dag_args,
catchup=False,
max_active_runs=1,
schedule_interval="${load_frequency}",
tags=${tags}) as dag:
tags=ast.literal_eval("${tags}"),
**schedule_kwarg) as dag:

start_task = EmptyOperator(task_id="start")

@@ -88,8 +88,8 @@ steps:
if [[ $(find generated_materializer_dag_files -type f 2> /dev/null | wc -l) -gt 0 ]]
then
echo "Copying DAG files to GCS bucket..."
echo "gsutil -m cp -r 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/"
gsutil -m cp -r 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/
echo "gcloud storage cp --recursive 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/"
gcloud storage cp --recursive 'generated_materializer_dag_files/*' gs://${_GCS_TGT_BUCKET}/dags/
else
echo "No files to copy to GCS bucket!"
fi
6 changes: 3 additions & 3 deletions src/SAP/SAP_CDC/common/py_libs/k9_deployer.py
@@ -82,10 +82,10 @@ def _simple_process_and_upload(k9_id: str, k9_dir: str, jinja_dict: dict,
if "__init__.py" not in [str(p.relative_to(k9_dir)) for p in k9_files]:
with open(f"{tmp_dir}/__init__.py", "w", encoding="utf-8") as f:
f.writelines([
"import os",
"import sys",
"import os\n",
"import sys\n",
("sys.path.append("
"os.path.dirname(os.path.realpath(__file__)))")
"os.path.dirname(os.path.realpath(__file__)))\n")
])

if data_source == "k9":
@@ -128,7 +128,7 @@ def validate_resources(
if isinstance(ex, NotFound):
logging.error("🛑 Storage bucket `%s` doesn't exist. 🛑",
bucket.name)
elif isinstance(ex, Unauthorized, Forbidden):
elif isinstance(ex, (Unauthorized, Forbidden)):
if checking_on_writing:
logging.error("🛑 Storage bucket `%s` "
"is not writable. 🛑", bucket.name)
2 changes: 1 addition & 1 deletion src/SAP/SAP_CDC/src/copy.sh
@@ -14,4 +14,4 @@
# limitations under the License.

bucket=$1
gsutil cp -r ../generated_dag/ $bucket
gcloud storage cp --recursive ../generated_dag/ $bucket