# Access Blob & ADLS2 data in Azure Synapse Dedicated pool

In my previous [article](https://www.azureguru.net/access-blob-adls2-data-in-azure-synapse-serverless-pool), we explored various access options for ADLS2 and Blob storage in Azure Synapse serverless pool, focusing on how to access data using Managed Identity and Service Principal.

In this article, we will examine how to use Managed Identity and Service Principal to access data in Blob and ADLS2 storage through different endpoints and protocols in Azure Synapse dedicated pool.

### SetUp

To get started we will first create a dedicated pool called `SQLDB` in the Synapse workspace.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733242915805/6c82d0f4-18ed-44e8-aee9-1f6de587b269.png align="left")

Once created you can see the pool in SSMS.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733243271817/f0f158c6-76ef-46ab-bbd3-5e1eac8c87ea.png align="left")

We will use the same sample files that we had uploaded to the Blob and the ADLS2 storages in my previous [article](https://www.azureguru.net/access-blob-adls2-data-in-azure-synapse-serverless-pool#heading-setup)

Files on a blob storage are under a directory `Customers`

![Synapse SQL Serverless pool](https://cdn.hashnode.com/res/hashnode/image/upload/v1733076566898/53276a88-013c-4cf1-b617-593d15bac7e7.png?auto=compress,format&format=webp align="left")

and so are the files on ADSL2

![Synapse SQL Serverless pool](https://cdn.hashnode.com/res/hashnode/image/upload/v1733076602853/10683b6c-b6ff-4b62-843e-f861226b97d5.png?auto=compress,format&format=webp align="left")

We will use the same Managed Identity and Service Principal as the one in my previous article.

***Service Principal***

![Synapse SQL Serverless pool](https://cdn.hashnode.com/res/hashnode/image/upload/v1733076929637/b95673bd-46c0-4257-8190-f7272dd9cafd.png?auto=compress,format&format=webp align="left")

***Managed Identity***

![Synapse SQL Serverless pool](https://cdn.hashnode.com/res/hashnode/image/upload/v1733082344756/e1b1541e-f569-4999-aee6-6a5f90dc9c4a.png?auto=compress,format&format=webp align="left")

Create a master key in the pool

```sql
CREATE MASTER KEY ENCRYPTION BY PASSWORD='Any Strong Password'
GO
```

Please note that unlike serverless pool where you create the key in master database, you have to create the master key in the dedicated sql pool or in the session.

In dedicated pool we CANNOT use OPENROWSET syntax the way we use in serverless pool. You have to use [External tables](https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-table-transact-sql?view=sql-server-ver16&tabs=dedicated) with [External file](https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-file-format-transact-sql?view=sql-server-ver16) format.

*External File Format:*

```sql
CREATE EXTERNAL FILE FORMAT [CsvFormatWithHeader] WITH (
    FORMAT_TYPE = DELIMITEDTEXT, 
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',', 
        FIRST_ROW  = 2,
        STRING_DELIMITER = '"',
        USE_TYPE_DEFAULT = False
        )
)
```

Grant the synapse workspace `Storage Blob Data Contributor` role to the blob storage.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733264681703/b9230e38-9611-4e6e-9e2f-0816e74278c8.png align="left")

### **Access Blob with Managed Identity and** https protocol

First, create a database scoped credential

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY = 'MANAGED IDENTITY'; -- MANAGED IDENTITY IS THE KEYWORD
```

Then create an External data source. You can read about external data source [**here**](https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-data-source-transact-sql?view=sql-server-ver16).

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
    -- location should be pointing to the container on the blob storage
      location = 'https://storageaccount.blob.core.windows.net/container',
      CREDENTIAL = MyCredential
);
```

Execute the query to the source `Customers`

```sql
CREATE EXTERNAL TABLE dbo.Customers
( [Customerid] [int] ,
	[Name] [varchar](40) ,
	[Email] [varchar](40) ,
	[city] [varchar](40)
	)
    WITH (
            LOCATION = 'Customers',
            DATA_SOURCE = SampleSource,
			FILE_FORMAT = [CsvFormatWithHeader]
            ) 

	SELECT * FROM  dbo.Customers
```

The query errors out : `Msg 105097, Level 16, State 1, Line 1 Managed Service Identity authentication is only supported by abfss scheme.`

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733245816929/bee10b31-3b33-4997-9f6b-5a035cd529b2.png align="left")

This is because Manage Identity cannot access blob or ADLS2 using `https` on Synapse dedicated pool.

Well then you might ask , how to then access blob storage using Managed Identity through `https` ?

There is one way, albeit a very cheeky one..by using the `COPY` command. Basically what you do is `COPY` the data from the blob storage to table in the dedicated pool.

Note it has to be a physical table not and external one.

Create a Customers table

```sql
CREATE TABLE dbo.Customers
( [Customerid] [int] ,
	[Name] [varchar](40) ,
	[Email] [varchar](40) ,
	[city] [varchar](40)
)
```

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733256819186/0eeb83cc-62db-4b2f-b9ca-52b095c57557.png align="left")

Then use the `COPY` command.

```sql
COPY INTO  Customers  FROM 'https://storageaccount.blob.core.windows.net/container/Customers' 
WITH ( FILE_TYPE='CSV', FIRSTROW =2,
CREDENTIAL=(IDENTITY= 'Managed Identity')
)
```

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733266040885/58720557-67e4-442f-8d79-4cf61c3cdc94.png align="left")

### **Access Blob with Managed Identity and** abfss protocol

Ok now lets try Managed Identity with `abfss` endpoint incase `COPY` is not an option for you.

Create database scoped credential

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY = 'MANAGED IDENTITY'; -- MANAGED IDENTITY IS THE KEYWORD
```

We modify the External data source to use `abfss`.

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
    -- location should be pointing to the container on the blob storage
   	location = 'abfss://container@storageaccount.blob.core.windows.net',
    CREDENTIAL = MyCredential
);
```

We get an access error and not the protocol error.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733254548522/a83fc861-d83f-4555-a36f-bd18c2e41486.png align="center")

Error `path does not exist` is quite strange.

I suspected that the issue must be with SSMS so I double checked the query with Synapse Studio and I get the same error.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733254712173/fc17c652-b1c1-4c8a-baf6-89c1fc2bb80f.png align="center")

But it was surprising that when I query for an individual file the query succeeds.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733247974654/fd94de39-d239-4832-9a1d-b702e8dbe7c2.png align="left")

Then I thought that the issue must be because of the `blob` endpoint that I used in the External Data Source.

I dropped the existing Data Source and

```sql
DROP EXTERNAL DATA SOURCE SampleSource
```

I changed the endpoint from `blob` to `dfs`.

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
   	location = 'abfss://container@storageaccount.dfs.core.windows.net',
    CREDENTIAL = MyCredential
);
```

and it worked because `dfs` endpoint generally respect the hierarchical namespace.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733248415325/bd5f3ddf-b4b2-4915-b4dd-64fa27d263be.png align="left")

We have now established that Managed Identity does not work with `https` protocol.

If you dont want to through the route of the `COPY` option, you have to use `dfs` endpoint with `abfss` protocol to query the entire data and not just individual files.

### **Access ADLS2 with Managed Identity and** https protocol

Lets now check ways to access ADLS2 using Managed Identity

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY = 'MANAGED IDENTITY';

CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
   	location = 'https://conatiner@adlsstorageaccount.dfs.core.windows.net/',
    CREDENTIAL = MyCredential
);
```

the query errors out

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733249394968/82ea35d1-aa42-4990-b437-c825d165dfb5.png align="left")

It means that irrespective of the storage types(Blob or ADLS2), Managed Identity is NOT supported with `https` in Synapse dedicated pool.

You can instead use the `COPY` option instead if you really would want to use `https`

```sql
COPY INTO  Customers  FROM 'https://adlsstorageaccount.dfs.core.windows.net/container/Customers' 
WITH ( FILE_TYPE='CSV', FIRSTROW =2,
CREDENTIAL=(IDENTITY= 'Managed Identity')
)
```

### **Access ADLS2 with Managed Identity and** abfss protocol

Lets now try accessing ADLS2 through `abfss` endpoint with Managed Identity.

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY = 'MANAGED IDENTITY';

CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
       
    location = 'abfss://container@adlsstorgaeaccount.dfs.core.windows.net/',
    CREDENTIAL = MyCredential
);
```

As expected, this query succeeds.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733250423430/a94ed1dd-70b2-4daf-bbde-12fdef9c6024.png align="left")

### **Access Blob with Service Principal and** https protocol

Drop the existing Data Source and Credentials

```sql
DROP EXTERNAL DATA SOURCE SampleSource
DROP DATABASE SCOPED CREDENTIAL MyCredential
```

As mentioned earlier, we will use the existing service principal called `Fabric OAUTH2`.

![Synapse SQL Serverless pool](https://cdn.hashnode.com/res/hashnode/image/upload/v1733088976995/9e3e6943-e5d0-4a4f-81c8-9a7b9f5e82fc.png?auto=compress,format&format=webp align="left")

Ensure that you Grant the service principal `Storage Blob Data Contributor` role for the storage account.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733095079985/25a4be74-0d7a-4881-91ec-2d722844b226.png align="left")

Create database scoped credentials that uses Service Principal.

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY='ClientId@https://login.microsoftonline.com/tenantId/oauth2/v2.0/token' , 
SECRET='secret'
```

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
   	location = 'https://storageaccount.blob.core.windows.net/container',
    CREDENTIAL = MyCredential
);
```

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733264079025/bf35afaa-645a-4fe9-8f56-d02312343d2b.png align="left")

The query errors out. We used `https` protocol with `blob` endpoints to access Blob storage. But is it possible to access blob storage using `https` in Service Principal ?

As was the case with Managed Identity where we used `COPY` option with `https`, we can use the same approach for Service Principal as well.

First create Customers table.

```sql
CREATE TABLE dbo.Customers
( [Customerid] [int] ,
	[Name] [varchar](40) ,
	[Email] [varchar](40) ,
	[city] [varchar](40)
)
```

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733256819186/0eeb83cc-62db-4b2f-b9ca-52b095c57557.png align="left")

and use the `COPY` option with the service principal URI

```sql
COPY INTO  Customers  FROM 'https://storageaccount.blob.core.windows.net/container/Customers' 
WITH ( FILE_TYPE='CSV', FIRSTROW =2,
CREDENTIAL=(IDENTITY= 'ClientId@https://login.microsoftonline.com/tenantId/oauth2/v2.0/token', 
SECRET='secret')
)
```

So, Service principal can access blob storage using `https` but it can do only using `COPY` option and not through Data Source. The only option is to use `COPY` if you really want to do it through the use of `https`.

### Access Blob with Service Principal and abfss protocol

Lets try with `abfss` protocol with `blob` endpoints.

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY='ClientId@https://login.microsoftonline.com/tenantId/oauth2/v2.0/token' , 
SECRET='secret'
```

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
    location = 'abfss://container@storageaccount.blob.core.windows.net/',
    CREDENTIAL = MyCredential
);
```

Once done ,the query did not error out and neither did it return any data which is quite interesting.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733253917811/e722f3cd-e00e-4ec5-a29a-0151c2bdfe56.png align="left")

But querying individual files returns data

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733254121992/5356c3f5-1e53-4b1d-a9f1-6444dedd1b7b.png align="left")

So lets change the endpoint from `blob` to `dfs`

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (
    location = 'abfss://container@storageaccount.dfs.core.windows.net/',
    CREDENTIAL = MyCredential
);
```

Re executing the query returns all data across the directory.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733266536653/62a90407-1fd6-4afc-a7fa-528568d85535.png align="left")

Recall that earlier we had used `blob` endpoint through Managed Identity and the query errored out but when using `blob` endpoint with Service Principal the query did not error out and neither did it return any data. This would mean that the `blob` endpoint does not work with Service Principal as well even if you use `abfss` protocol.

You will have to use `dfs` endpoint alongside the `abfss` protocol.

### **Access** ADLS2 **with Service Principal and** https protocol

Next lets try accessing ADLS2 storage with `https` protocol with `blob` endpoints.

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY='ClientId@https://login.microsoftonline.com/tenantId/oauth2/v2.0/token' , 
SECRET='secret'
```

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (   
    location = 'https://container@adlsstorageaccount.blob.core.windows.net/',
    CREDENTIAL = MyCredential
);
```

As expected it would error out as we were using `https` to access ADLS2 storage.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733253345229/3bbde81f-7059-4298-9a01-6c86f8ebb12f.png align="left")

### Access ADLS2 with Service Principal and abfss protocol

```sql
CREATE DATABASE SCOPED CREDENTIAL MyCredential
WITH IDENTITY='ClientId@https://login.microsoftonline.com/tenantId/oauth2/v2.0/token' , 
SECRET='secret'
```

```sql
CREATE EXTERNAL DATA SOURCE SampleSource
WITH (   
    location = 'abfss://container@adlsstorageaccount.dfs.core.windows.net/',
    CREDENTIAL = MyCredential
);
```

The query works

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1733255735645/0aa8a057-121c-492c-91fd-f7ab57e85db3.png align="left")

### Conclusion

In conclusion, both Managed Identity and Service Principal provide secure and flexible ways to access data in Azure Synapse dedicated pool. Understanding the differences in access methods across different endpoints and protocols is crucial for optimizing data workflows in Synapse dedicate pool. Implementing these options can help you tailor specific implementations for your custom solutions.

Thanks for reading !!!
