Access ADLS from Databricks with Service Principal

This document provides a step-by-step guide to accessing a CSV file stored in Azure Data Lake Storage (ADLS) from Databricks using a service principal. It covers the creation of a service principal in Microsoft Entra ID, the assignment of the necessary permissions, and the code to access the storage account. The document also notes that with the introduction of Unity Catalog the process has become simpler, although this method remains relevant for projects without it.


CHENCHU’S

🚀 Mastering PySpark and Databricks 🚀
Part-22

Accessing a Storage Account (ADLS) from Databricks using a Service Principal

C. R. Anil Kumar Reddy


Associate Developer for Apache Spark 3.0

Anil Reddy Chenchu


Follow me on Linkedin
[Link]/in/chenchuanil
CHENCHU’S

We will access a CSV file stored in a storage account (ADLS) from a Databricks notebook.

Navigate to Microsoft Entra ID (formerly known as Azure Active Directory); this is where we will create the service principal.

Entra ID → Manage → App Registrations → New Registration


Provide any name and click Register


Now copy the Application (client) ID and the Directory (tenant) ID.

Next, click on Certificates & Secrets, create a secret, and copy the secret value.

Manage → Certificates & Secrets → New client secret → provide a name → Add


Now copy the secret value; it is shown only once, so make sure you copy it.

So now we have:

1. Application (client) ID
2. Directory (tenant) ID
3. Client secret value

Copy this code from the Microsoft storage tutorial: [Link]us/azure/databricks/connect/storage/tutorial-azure-storage#--step-3-grant-the-service-principal-access-to-azure-data-lake-storage-gen2

Now replace all the details, including the storage account name.
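The code from that tutorial configures Spark to authenticate to ADLS with the service principal's OAuth credentials. A sketch with placeholder values (the angle-bracketed names are placeholders to be replaced with the IDs and secret copied in the earlier steps):

```python
# Placeholder values -- replace with the IDs and secret copied earlier.
storage_account = "<storage-account-name>"
client_id = "<application-client-id>"        # Application (client) ID
tenant_id = "<directory-tenant-id>"          # Directory (tenant) ID
client_secret = "<client-secret-value>"      # Client secret value

# Tell Spark to use OAuth client-credentials auth for this storage account.
spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")
```

These settings are session-scoped: they only apply to the notebook's current Spark session.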


We have now created the service principal, but we have not yet given it permission to access the storage account. Let us provide access now.

Go to the storage account, click on Access control (IAM), and add a role assignment.


Search for Storage Blob Data Contributor, select it, and click Next.


Click on Select members, search for the service principal name given while registering (refer to slides 2 & 3 if you have any confusion), and click Select.


Click Review + assign twice.

The role has now been assigned.
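The portal role-assignment steps above can also be scripted with the Azure CLI. A sketch, with placeholder IDs in angle brackets (subscription, resource group, and account names are assumptions to be filled in):

```shell
# Equivalent of the portal steps: assign Storage Blob Data Contributor
# to the service principal, scoped to the storage account.
az role assignment create \
  --assignee "<application-client-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
```

Role assignments can take a few minutes to propagate before the first read succeeds.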


Now let us access the CSV file stored in ADLS as a DataFrame.
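A sketch of the read, using the abfss:// URI scheme (header=True and inferSchema=True are assumptions about the CSV, not requirements):

```python
# abfss://<container>@<storage-account>.dfs.core.windows.net/<folder>/
path = "abfss://test@deltadbstorage19.dfs.core.windows.net/Sample/"

# Read every CSV file under the folder into a DataFrame.
df = spark.read.csv(path, header=True, inferSchema=True)
display(df)
```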

In the above code, test is the container name, deltadbstorage19 is the storage account name, and Sample is the folder name.


This is how we access a storage account from Databricks using a service principal in real-world projects. With the introduction of Unity Catalog this has become much simpler, but in projects where Unity Catalog is not yet implemented, this remains the approach for accessing ADLS from Databricks. The only difference in production is that the client secret is stored in Azure Key Vault.
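With Key Vault, the secret is fetched at runtime through a Key Vault-backed Databricks secret scope instead of being pasted into the notebook. A sketch with hypothetical scope and key names:

```python
# "kv-scope" and "sp-client-secret" are hypothetical names; the scope must be
# a Key Vault-backed secret scope created beforehand in Databricks.
client_secret = dbutils.secrets.get(scope="kv-scope", key="sp-client-secret")
```

The value is redacted in notebook output, so the secret never appears in plain text.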


ANIL REDDY CHENCHU


Torture the data, and it will confess to anything

DATA ANALYTICS

Happy Learning

SHARE IF YOU LIKE THE POST

Let's connect to discuss more on Data

[Link]/in/chenchuanil
