Forum Discussion
bspadp2
Aug 17, 2023 (Copper Contributor)
Loading Parquet and Delta files into Azure Synapse using ADB or Azure Synapse?
I have the following scenario.
We use Azure Databricks to pull data from several sources, generate Parquet and Delta files, and load them into our ADLS Gen2 containers.
We are now planning to build our data warehouse in Azure Synapse SQL pools, where we will create external tables over the Delta files for the dimension tables and hash-distributed fact tables from the Parquet files.
Now the question is: which method is better for automating this data warehouse load? Should we write the transformation logic that creates the dim and fact tables in Azure Databricks and load them regularly into Azure Synapse SQL pools, or should we write that transformation logic in Azure Synapse itself and load the tables into the SQL pools from there?
Please help.
- Charbelhanna (Brass Contributor)
I believe both ways would technically work. However, to choose the best one, I would advise looking at the whole strategy and objective and asking questions like:
Who would be responsible for the lifecycle maintenance of this transformation (access, responsibility, management, updates)?
What would be the difference in cost between implementing this transformation logic in Synapse and implementing it in Databricks?
What is your ultimate objective? Are you moving away from Databricks in the long term?
Personally, and in theory, I would consider centralizing all transformation and loading activities on a single platform.
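Whichever platform ends up orchestrating the load, the step that fills a hash-distributed fact table in a dedicated SQL pool typically comes down to a CTAS statement. Here is a minimal sketch of that statement, assuming an external table over the Parquet files already exists; the helper `build_fact_ctas` and the names `dbo.FactSales`, `ext.FactSales`, and `CustomerKey` are hypothetical placeholders, not taken from the question.

```python
def build_fact_ctas(target_table: str, source_table: str, hash_column: str) -> str:
    """Build a CTAS statement for a hash-distributed fact table.

    All table and column names passed in are illustrative assumptions;
    the function only formats a string and does not connect to Synapse.
    """
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (DISTRIBUTION = HASH({hash_column}), CLUSTERED COLUMNSTORE INDEX)\n"
        f"AS SELECT * FROM {source_table};"
    )


# Hypothetical names: the target fact table, the external table over the
# Parquet files, and the column chosen as the distribution key.
sql = build_fact_ctas("dbo.FactSales", "ext.FactSales", "CustomerKey")
print(sql)
```

The generated statement could be submitted from a Databricks job or from a Synapse pipeline, so it does not settle the platform question by itself; it only shows that the load step looks the same in both options.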
Regards,
Charbel HANNA