Azure AI Search for Fabric OneLake unstructured data

Since it was difficult to explain all the steps clearly with screenshots in the article, I decided to create a video that walks through the process and makes it easier to understand.

https://youtu.be/nk93ev3vF_o

We all know that we can use Fabric data agents to communicate with structured data in lakehouses. However, data agents cannot talk to unstructured data in lakehouses, such as .pdf, .docx, or .txt files.

With Azure AI Search, we can take a no-code approach to querying unstructured data on Fabric OneLake. In fact, the approach is not limited to OneLake. We can use it to query data stored in:

  • Azure Cosmos DB

  • Azure Blob Storage

  • Azure SQL DB

  • Azure Data Lake Storage (ADLS) Gen2

This article will focus on Microsoft Fabric OneLake storage. In an upcoming article, I will show how we can query this data through REST APIs and MCP endpoints.

Before we move on, it is important to understand what Azure AI Search is.

Azure AI Search is a dedicated search engine that also stores all of your searchable content, for agentic, full-text, and vector search scenarios. It also includes optional integrated AI to extract text and structure from raw content, and to chunk and vectorize content for vector search.

There are 4 main components involved:

  • Azure AI Search Service

  • Indexes

  • Knowledge Source

  • Knowledge Base

Azure AI Search is a service from Microsoft that helps you search unstructured data quickly and easily, through both classical search and agentic search.

We can also query or expose this data through REST APIs and MCP. I have an upcoming article on this topic.

Let's take the example of an e-commerce site where you are searching for running shoes. On a site with classical search, your search is limited to keywords. When you search for "running shoes", the underlying search engine looks into the index, finds the products containing "running" and "shoes", and returns them.

On an e-commerce site that implements agentic search, your search can be "Best running shoes under 200 dollars for beginners". The AI search engine uses an LLM to reason about what the user is asking for, and then queries the underlying source to return results. Under the hood, agentic search still uses classical search to return the results once it has reasoned about and understood the user input.
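To make the difference concrete, here is a small toy sketch in plain Python. It is not how Azure AI Search is implemented. The product catalog is made up, and `plan_query` is a hypothetical stand-in for the LLM reasoning step (a real system would call a model there). It only illustrates the idea that agentic search first turns the user's sentence into keywords and filters, and then runs a classical search.

```python
import re

# Hypothetical product catalog; names and prices are made up for illustration.
PRODUCTS = [
    {"name": "Trail running shoes for beginners", "price": 120},
    {"name": "Pro racing running shoes", "price": 260},
    {"name": "Leather office shoes", "price": 90},
]


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classical_search(keywords, max_price=None):
    """Return products whose name contains every keyword, optionally price-capped."""
    wanted = tokenize(keywords)
    hits = []
    for product in PRODUCTS:
        matches_all = wanted <= tokenize(product["name"])
        within_budget = max_price is None or product["price"] < max_price
        if matches_all and within_budget:
            hits.append(product["name"])
    return hits


def plan_query(user_text):
    """Stand-in for the LLM step: pull out a price limit and the useful keywords."""
    price = re.search(r"under (\d+)", user_text.lower())
    max_price = int(price.group(1)) if price else None
    known_terms = {"running", "shoes", "beginners"}  # assumed vocabulary
    keywords = " ".join(sorted(tokenize(user_text) & known_terms))
    return keywords, max_price


def agentic_search(user_text):
    """Reason about the request first, then fall back to classical search."""
    keywords, max_price = plan_query(user_text)
    return classical_search(keywords, max_price)


question = "Best running shoes under 200 dollars for beginners"
print(classical_search("running shoes"))  # plain keyword match
print(classical_search(question))         # full sentence used as keywords
print(agentic_search(question))           # sentence is planned first
```

In this toy, passing the full sentence straight to the keyword search finds nothing, because no product name contains words like "best" or "dollars". The planned query finds the beginner shoe that is under the price limit.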

Conceptually, the idea behind Azure AI Search is pretty simple.

You have a Data Source → Indexer → Index (with AI Skills) → Knowledge Source → Knowledge Base → Query.

To start with Azure AI Search, you first connect your data from underlying sources like Blob Storage, SQL, OneLake, or Cosmos DB. Then you create an index, which is like a structured version of your data that is optimized for searching.
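For readers who prefer to see what an index looks like, here is a minimal sketch of an index definition written as a Python dictionary, in the shape the Azure AI Search REST API accepts for creating an index. The index name and the field names (`id`, `title`, `content`, `last_modified`) are assumptions chosen for illustration. When you use the no-code wizard, the portal generates a definition like this for you.

```python
import json


def build_index_definition(index_name):
    """Build a small index schema; field names here are hypothetical."""
    return {
        "name": index_name,
        "fields": [
            # Every index needs exactly one key field that identifies a document.
            {"name": "id", "type": "Edm.String", "key": True},
            # Searchable fields are analyzed for full-text search.
            {"name": "title", "type": "Edm.String", "searchable": True},
            {"name": "content", "type": "Edm.String", "searchable": True},
            # Filterable and sortable fields support filters and ordering.
            {
                "name": "last_modified",
                "type": "Edm.DateTimeOffset",
                "filterable": True,
                "sortable": True,
            },
        ],
    }


definition = build_index_definition("onelake-docs")  # hypothetical index name
print(json.dumps(definition, indent=2))
```

The point of the sketch is that an index is a schema: each field declares how it can be used (searched, filtered, sorted), which is what makes it "optimized for searching".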

Then you can use indexers to automatically pull data from your source and keep the index up to date.

In other words, an indexer is a crawler that reads data from the source and loads it into the index. Once the data is indexed, you can run search and AI queries on it using keywords, filters, or more advanced options.
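As a rough illustration of a keyword query with a filter, the sketch below builds (but does not send) a request to the Search Documents REST endpoint of an Azure AI Search service. The service name, index name, key placeholder, and the `file_type` field are all assumptions. The `2024-07-01` API version is one stable version; check which version your service supports.

```python
import json
import urllib.request


def build_search_request(service, index, api_key, text, odata_filter=None, top=5):
    """Prepare a POST request for a keyword search; nothing is sent here."""
    url = (
        f"https://{service}.search.windows.net"
        f"/indexes/{index}/docs/search?api-version=2024-07-01"
    )
    body = {"search": text, "top": top}
    if odata_filter:
        # Filters use OData syntax and only work on fields marked filterable.
        body["filter"] = odata_filter
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


# Hypothetical values: replace them with your own service, index, and query key.
request = build_search_request(
    service="my-search-service",
    index="onelake-docs",
    api_key="<query-key>",
    text="quarterly sales report",
    odata_filter="file_type eq 'pdf'",
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

To actually run the query, you would pass `request` to `urllib.request.urlopen` with a real service name and key, and read the JSON response.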

During indexing, you can also use built-in AI skills to extract key phrases, detect language, or analyze text.
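As a sketch of how such skills are wired together, the dictionary below follows the general shape of a skillset definition in the Azure AI Search REST API, using the built-in language detection and key phrase extraction skills. The skillset name and the `/document/content` source path are assumptions, and the exact inputs and outputs should be checked against the skill reference before use.

```python
import json


def build_skillset(name):
    """Build a two-skill skillset: detect language, then extract key phrases."""
    return {
        "name": name,
        "description": "Detect language and extract key phrases (illustrative).",
        "skills": [
            {
                "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
                "context": "/document",
                # Assumes the extracted text is available at /document/content.
                "inputs": [{"name": "text", "source": "/document/content"}],
                "outputs": [{"name": "languageCode", "targetName": "language"}],
            },
            {
                "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
                "context": "/document",
                "inputs": [
                    {"name": "text", "source": "/document/content"},
                    # Reuses the language detected by the first skill.
                    {"name": "languageCode", "source": "/document/language"},
                ],
                "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
            },
        ],
    }


skillset = build_skillset("onelake-skillset")  # hypothetical skillset name
print(json.dumps(skillset, indent=2))
```

The design idea is that each skill reads from a path in the enriched document and writes its output back to it, so a later skill can consume what an earlier one produced.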

Overall, Azure AI Search is useful when you have a lot of unstructured data and want users to quickly find what they are looking for without building everything from scratch.

Conclusion

In conclusion, Azure AI Search makes it easy to work with unstructured data in OneLake and beyond. With its ability to combine classical and agentic search, you can quickly build powerful, intelligent search experiences without heavy coding. This approach helps users find the right information faster and unlock more value from their data.

Thanks for reading!