PolyBase allows you to read/write data in external storage using T-SQL. This lab demonstrates how to load data into a dedicated SQL pool table in parallel from a data lake file.
-
Log in to the Azure portal using the provided lab credentials in an incognito or private browser window.
-
Navigate to Azure Synapse Analytics.
-
In the upper left corner, click "Create."
-
Configure the workspace:
- Resource group: Select the existing resource group.
- Workspace name, for mine : Enter "fodessynapse."
- Region: Ensure "East US 2" is selected.
-
Configure Data Lake Storage Gen2:
- Account name: Enter "fodestorage."
- File system name: Enter "fodefsn."
-
Click on "acgmsstorage."
-
In the left navigation menu, under Data storage, click "Containers."
-
Click "+ Container."
-
Enter "taxidata" for Name and click "Create."
-
Navigate back to the Resource group page and click on "acgmssynapse."
-
Under "Open Synapse Studio," click "Open."
-
In Synapse workspace, click on the "Manage" icon in the left navigation menu.
-
Select "SQL pools" under the expanded Manage menu.
-
Click "+New" to create a new SQL pool.
-
Configure the SQL pool:
- Dedicated SQL pool name: Enter "TaxiRidesWarehouse."
- Performance level: Adjust to "DW100c."
-
Complete the setup and wait for deployment to finish.
-
In the Synapse workspace, click on the "Data" icon in the left navigation menu.
-
Under Data, click on the "Linked" tab.
-
Expand Azure Data Lake Storage Gen2 > acgmssynapse > taxidata container.
-
Upload the "TaxiRides.parquet" file.
-
Right-click on "TaxiRides.parquet" file, hover over "New SQL script," and select "Create external table."
-
Select the target database:
- Select SQL pool: Choose "TaxiRidesWarehouse."
- External table name: Enter "ExternalTaxiRides."
-
Leave other settings at default and click "Open script."
-
Review the script and execute it.
Use the CTAS (CREATE TABLE AS SELECT) command to create a new table named TaxiRides. Replace <External Table Name> with the name of the external table you created in the previous objective:
-- Create table TaxiRides, using CTAS (Polybase)
-- Create table TaxiRides, using CTAS (Polybase)
CREATE TABLE TaxiRides
WITH
(
DISTRIBUTION = ROUND_ROBIN,
CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT *
FROM dbo.FodeExternalTaxiRides -- Replace name of external table
SELECT * FROM TaxiRides



