Microsoft Fabric Data Factory

What is Fabric Data Factory?
Fabric Data Factory is largely derived from Azure Data Factory (ADF), with some subtle differences.
For example, ADF concepts such as integration runtimes (IR) and linked services don't exist in Fabric Data Factory.
Fabric data pipelines are designed primarily for seamless integration with the Lakehouse and Data Warehouse in the Fabric ecosystem.
You can check out the differences between the two here.
In this blog, we will look at how to create a Fabric data pipeline that copies files and folders from one ADLS2 (Azure Data Lake Storage Gen2) location to another while preserving the source folder hierarchy.
The setup
On ADLS2, the storage container has two folders: Source and Destination.
We will copy the files and subfolders from the Source folder to the Destination folder.

The Source folder contains the following files and subfolders.

Log in to your Fabric tenant and, in your workspace, select the Data pipeline option.

Then add a Get Metadata activity to the data pipeline canvas.

In the Settings tab of the Get Metadata activity, create a connection to the ADLS2 source location.


Once the connection is created, set the file path in the Get Metadata activity settings.

Next, add a ForEach activity and, in its Settings tab, set Items to:
@activity('Get Metadata').output.childItems

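The childItems array returned by the Get Metadata activity is the collection that the ForEach activity iterates over. Make sure Child items is included in the Field list of the Get Metadata activity; otherwise the output will not contain childItems.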
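
To make the shape of that data concrete, here is a minimal Python sketch of what the ForEach activity receives. The list of name/type objects follows the documented Get Metadata output, but the sample entries (file1.csv, SubFolder1) are hypothetical, and the helper function is purely illustrative; it is not part of any Fabric SDK.

```python
# Hypothetical Get Metadata output for the Source folder.
# The entry names are made up for illustration.
sample_output = {
    "childItems": [
        {"name": "file1.csv", "type": "File"},
        {"name": "SubFolder1", "type": "Folder"},
    ]
}


def for_each_item(activity_output):
    """Yield (name, type) pairs one at a time, much like ForEach exposes each entry as item()."""
    for item in activity_output.get("childItems", []):
        yield item["name"], item["type"]


if __name__ == "__main__":
    for name, kind in for_each_item(sample_output):
        print(f"{kind}: {name}")
```

Inside the pipeline, the entry for the current iteration is available through the @item() expression, which corresponds to each pair yielded above.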
Now add a Copy activity inside the ForEach activity.

Set the source connection for the Copy activity.

Note that the wildcard path is set to /Source, which copies all files and subfolders within the Source folder. If we change it to /Source/*, only the subfolders and their contents are copied; files located in the root of the Source folder are excluded.
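
The difference between the two settings can be modelled with a small Python sketch. This is only a model of the behaviour described in this post, not the service's actual wildcard matching logic; the function name select_for_copy and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def select_for_copy(paths, wildcard_path):
    """Return the paths that would be copied, per the behaviour described in the post.

    "/Source"   -> everything under Source (root files and subfolder contents)
    "/Source/*" -> only the contents of subfolders; root-level files are skipped
    """
    folder = wildcard_path.strip("/")
    subfolders_only = folder.endswith("/*")
    if subfolders_only:
        folder = folder[:-2]  # drop the trailing "/*"
    root = PurePosixPath(folder)

    selected = []
    for p in paths:
        path = PurePosixPath(p)
        if root not in path.parents:
            continue  # the file lives outside the chosen folder
        # A depth of 1 means the file sits directly in the root folder.
        depth = len(path.relative_to(root).parts)
        if subfolders_only and depth == 1:
            continue
        selected.append(p)
    return selected


if __name__ == "__main__":
    # Hypothetical container listing.
    listing = [
        "Source/a.txt",
        "Source/Sub1/b.txt",
        "Source/Sub1/Deep/c.txt",
        "Destination/x.txt",
    ]
    print(select_for_copy(listing, "/Source"))
    print(select_for_copy(listing, "/Source/*"))
```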
Next, set the destination connection for the Copy activity.

To copy the subfolders along with the files, make sure the file format is set to Binary in both the source and destination settings of the Copy activity.
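
As a rough local analogy for what a binary, hierarchy-preserving copy does, the Python sketch below copies every file byte for byte and recreates its relative path under the destination. It runs against local folders rather than ADLS2, and the function name copy_tree_binary and the sample files are hypothetical; it illustrates the idea and is not how the Copy activity is implemented.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tree_binary(source: Path, destination: Path) -> None:
    """Copy every file under source to destination, keeping relative paths."""
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue  # folders are created on demand below
        target = destination / src_file.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        # Binary mode: the bytes are streamed as-is, never parsed or re-encoded.
        with src_file.open("rb") as fin, target.open("wb") as fout:
            shutil.copyfileobj(fin, fout)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "Source"
        (source / "Sub1").mkdir(parents=True)
        (source / "root.bin").write_bytes(b"\x00\x01")
        (source / "Sub1" / "nested.txt").write_bytes(b"hello")

        destination = Path(tmp) / "Destination"
        copy_tree_binary(source, destination)
        for copied in sorted(destination.rglob("*")):
            print(copied.relative_to(destination).as_posix())
```

Because the content is treated as opaque bytes, the file type does not matter, which is why a binary copy suits a mixed folder of files and subfolders.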
Now run the pipeline.

Check the Destination folder; you should see all the folders, subfolders and files copied over.
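
If you want something stricter than a visual check, one option is to compare the two trees programmatically. The Python sketch below assumes both folders are reachable as local paths (for example, after downloading them); that assumption, along with the helper names tree_fingerprint and trees_match, is ours and not part of the pipeline.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_fingerprint(root: Path) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def trees_match(source: Path, destination: Path) -> bool:
    """True when both trees hold the same files, at the same relative paths, with the same content."""
    return tree_fingerprint(source) == tree_fingerprint(destination)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build two tiny hypothetical trees with identical content.
        for folder in ("Source", "Destination"):
            sub = Path(tmp) / folder / "Sub1"
            sub.mkdir(parents=True)
            (sub / "nested.txt").write_bytes(b"hello")
        print(trees_match(Path(tmp) / "Source", Path(tmp) / "Destination"))
```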

Closing Notes
As shown above, moving folders and files with the Copy data activity is simple. One limitation, however, is that this approach preserves the source hierarchy in the destination.
If you would rather not keep the source hierarchy and instead want to place all the files in a single designated folder in the destination, the process becomes considerably more complex. I will explain how to achieve that in another blog. Until then, stay tuned.




