Warning
Untested Implementation
This connection type has been implemented based on Azure Synapse's ODBC driver documentation but has not been tested against a live Synapse workspace. The implementation should work as designed, but users should verify functionality in their environment.
Securely store and connect to Azure Synapse Analytics with encrypted credentials.
pip install "pypangolin[synapse]"Additional Requirement: You must also install the Microsoft ODBC Driver for SQL Server on your system.
Ubuntu/Debian:
curl https://packages.microsoft.com/keys/microsoft.asc | sudo apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17macOS:
brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
brew update
brew install msodbcsql17Windows: Download and install from Microsoft Download Center
from pypangolin import PangolinClient
from pypangolin.assets.connections import SynapseAsset
from cryptography.fernet import Fernet
client = PangolinClient(uri="http://localhost:8080")
client.login("username", "password")
# User-managed encryption (recommended)
encryption_key = Fernet.generate_key().decode('utf-8')
SynapseAsset.register(
client,
catalog="data_sources",
namespace="warehouses",
name="prod_synapse",
connection_string="synapse://myworkspace.sql.azuresynapse.net",
credentials={
"server": "tcp:myworkspace.sql.azuresynapse.net,1433",
"database": "mySynapseDW",
"username": "synapse_user",
"password": "securepassword123",
"driver": "ODBC Driver 17 for SQL Server" # Optional, defaults to this
},
encryption_key=encryption_key,
store_key=False,
description="Production Synapse Analytics"
)import os
encryption_key = os.getenv("SYNAPSE_ENCRYPTION_KEY")
conn = SynapseAsset.connect(
client,
catalog="data_sources",
namespace="warehouses",
name="prod_synapse",
encryption_key=encryption_key
)
# Use the connection
cursor = conn.cursor()
cursor.execute("SELECT @@VERSION")
print(cursor.fetchone())
cursor.close()
conn.close()- server (required) - Synapse server endpoint (format:
tcp:server.sql.azuresynapse.net,1433) - database (required) - Database name
- username (required) - SQL authentication username
- password (required) - SQL authentication password
- driver - ODBC driver name (default: "ODBC Driver 17 for SQL Server")
synapse://workspace.sql.azuresynapse.net
credentials={
"server": "tcp:myworkspace.sql.azuresynapse.net,1433",
"database": "mydb",
"username": "sql_user",
"password": "password123"
}Currently not implemented. For Azure AD auth, you would need to extend the implementation to support authentication tokens.
conn = SynapseAsset.connect(client, "data_sources", "warehouses", "prod_synapse", encryption_key=key)
cursor = conn.cursor()
# Query Synapse
cursor.execute("""
SELECT
CAST(order_date AS DATE) as date,
COUNT(*) as order_count,
SUM(total_amount) as revenue
FROM dbo.orders
WHERE order_date >= DATEADD(day, -30, GETDATE())
GROUP BY CAST(order_date AS DATE)
ORDER BY date DESC
""")
for row in cursor:
print(f"Date: {row[0]}, Orders: {row[1]}, Revenue: ${row[2]:,.2f}")
cursor.close()
conn.close()import pandas as pd
conn = SynapseAsset.connect(client, "data_sources", "warehouses", "prod_synapse", encryption_key=key)
query = """
SELECT TOP 1000
customer_id,
product_name,
quantity,
price
FROM dbo.sales
WHERE sale_date >= DATEADD(month, -1, GETDATE())
"""
df = pd.read_sql(query, conn)
print(df.head())
conn.close()- Use Azure AD authentication when possible (requires custom implementation)
- Enable firewall rules to restrict access
- Use Azure Private Link for secure connectivity
- Rotate SQL credentials regularly
- Use user-managed encryption keys
- Enable Azure Synapse audit logging
- Use CTAS (CREATE TABLE AS SELECT) for large data loads
- Leverage distribution and partitioning strategies
- Use PolyBase for external data access
- Consider result set caching
- Be aware of DWU (Data Warehouse Units) allocation
- Pause compute when not in use to save costs
- Use workload management for query prioritization
Error: [IM002] [Microsoft][ODBC Driver Manager] Data source name not found
Solution: Install Microsoft ODBC Driver 17 for SQL Server
- Verify firewall rules allow your IP address
- Check if Synapse workspace is paused
- Ensure correct server endpoint format
- Verify username and password are correct
- Check if user has access to the specified database
- Ensure SQL authentication is enabled (not just Azure AD)
- Synapse uses T-SQL (Transact-SQL) syntax
- Supports both dedicated SQL pools and serverless SQL pools
- This implementation targets dedicated SQL pools
- Connection string format differs from standard SQL Server
- Test thoroughly in your environment before production deployment