Observability in Microsoft Agent Framework through OpenTelemetry with Azure Application Insights, KQL and Aspire on Docker

In a typical agentic AI or workflow setup, questions start piling up as soon as something goes wrong:
Which agent handled the request?
Which tool was invoked?
How long did each step take?
Where did the failure occur?
Why is the workflow slower than expected?
Without proper observability, answering these questions is very difficult.
This is where OpenTelemetry, Azure Application Insights and .NET Aspire come together. They provide a powerful way to gain visibility into what's happening inside your Microsoft Agent Framework applications. Instead of relying on console logs and guesswork, you can trace requests end-to-end, monitor agent interactions, measure performance and quickly identify bottlenecks.
In this article, we'll build a Microsoft Agent Framework application, integrate it with Azure Application Insights and use the Aspire dashboard running in Docker to visualize telemetry in real time.
Use Case
We will use a use case from this post, where an agent runs as a background service and the input to the agent is pushed from Azure Service Bus.
Setup
Before we move to the OpenTelemetry implementation, we have to set up the Aspire dashboard in Docker, plus a Log Analytics workspace and an Application Insights resource on Azure.
Open Docker Desktop and run the following command in its terminal.
    docker run --rm -it -p 18888:18888 -p 4317:18889 -p 4318:18890 -d --name aspire-dashboard mcr.microsoft.com/dotnet/aspire-dashboard:latest
Once the container is running, navigate to http://localhost:18888/login. The login page prompts for a token value.
In Docker Desktop, open the Containers tab and select the aspire-dashboard container. Under Logs you will find the dashboard URL, the login URL with the embedded token, and the corresponding gRPC and HTTP URLs.
Alternatively, print the same log output by running the following command in the Docker terminal.
    docker logs aspire-dashboard
Enter the token value and you will land on the Aspire dashboard.
To set up the Log Analytics workspace and the Application Insights resource, search for each of them in the Azure Marketplace.
While creating the Application Insights resource, you will be asked to select the Log Analytics workspace that backs it.
The overall setup for these two resources is pretty straightforward.
If required, you can later change the Log Analytics workspace linked to the Application Insights resource.
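In the next step, add the following NuGet packages to the project. AddOpenTelemetry() comes from OpenTelemetry.Extensions.Hosting, and each exporter and instrumentation used below ships in its own package.
    dotnet add package OpenTelemetry
    dotnet add package OpenTelemetry.Extensions.Hosting
    dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol
    dotnet add package OpenTelemetry.Exporter.Console
    dotnet add package OpenTelemetry.Instrumentation.Http
    dotnet add package Azure.Monitor.OpenTelemetry.Exporter
The OpenTelemetry.Logs, OpenTelemetry.Metrics, OpenTelemetry.Resources and OpenTelemetry.Trace namespaces imported later ship inside the OpenTelemetry package, and System.Diagnostics is part of .NET itself, so they need no packages of their own.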
Then register the following telemetry services:
    builder.Services.AddOpenTelemetry()
        .WithTracing(tracing =>
        {
            tracing
                .SetSampler(new AlwaysOnSampler())
                .SetResourceBuilder(
                    ResourceBuilder.CreateDefault()
                        .AddService("MyApp"))
                // Must match the name of the ActivitySource defined in the Worker class.
                .AddSource("MyApp.Worker")
                .AddOtlpExporter(options => options.Endpoint = new Uri("http://localhost:4317"))
                .AddHttpClientInstrumentation()
                .AddConsoleExporter()
                .AddAzureMonitorTraceExporter(options =>
                {
                    options.ConnectionString = "<Application Insights connection string>";
                });
        });
    builder.Logging.ClearProviders();
    builder.Logging.AddConsole();
    builder.Logging.SetMinimumLevel(LogLevel.Trace);
Let's break down the major aspects of the above code:
The following call sets the service name that appears in the telemetry. Without a service name, the app's traces appear under a generic or auto-generated name.
    .AddService("MyApp"))
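Next, subscribe the tracer to the activity source MyApp.Worker so that the activities created in it are collected. The name passed here must match the name of the ActivitySource created in the application code, otherwise its activities are never recorded.
    .AddSource("MyApp.Worker")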
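Why the source name has to match can be seen with nothing but System.Diagnostics. The following is a minimal sketch, not the OpenTelemetry implementation: an ActivityListener stands in for AddSource and the sampler, and the class and method names (SourceNameDemo, IsRecorded) are hypothetical helpers introduced only for illustration.

```csharp
using System.Diagnostics;

public static class SourceNameDemo
{
    // Returns true when an activity started on `sourceName` is actually created,
    // given a listener that subscribes only to `listenedName`.
    public static bool IsRecorded(string listenedName, string sourceName)
    {
        using var listener = new ActivityListener
        {
            // Simplified stand-in for AddSource(listenedName): subscribe by source name.
            ShouldListenTo = source => source.Name == listenedName,
            // Simplified stand-in for AlwaysOnSampler: record everything listened to.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllDataAndRecorded
        };
        ActivitySource.AddActivityListener(listener);

        using var source = new ActivitySource(sourceName);
        // StartActivity returns null when no listener is interested in this source.
        using Activity? activity = source.StartActivity("Demo.Operation");
        return activity is not null;
    }
}
```

Calling IsRecorded("MyApp.Worker", "MyApp.Worker") yields an activity, while IsRecorded("MyApp.Source", "MyApp.Worker") does not. This is also why the worker code guards its SetTag calls with the null-conditional operator.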
Now configure OpenTelemetry to export the traces to an OTLP endpoint. In our case the endpoint is the Aspire dashboard running in Docker.
    .AddOtlpExporter(options => options.Endpoint = new Uri("http://localhost:4317"))
You might wonder where port 4317 came from.
Recall the port mappings we configured while starting the Aspire Docker container: host port 4317 maps to container port 18889 (OTLP over gRPC) and host port 4318 maps to container port 18890 (OTLP over HTTP). The exporter therefore targets host port 4317.
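Hard-coding the endpoint is fine for a local run, but it can help to make it overridable. The sketch below is one possible approach, assuming the standard OTEL_EXPORTER_OTLP_ENDPOINT environment variable is used as the override; OtlpEndpoint and Resolve are hypothetical names, not part of any library.

```csharp
using System;

public static class OtlpEndpoint
{
    // Host port published by the docker run command (4317 -> 18889 in the container).
    public const string LocalAspireGrpc = "http://localhost:4317";

    // Picks the configured endpoint when it is a valid http/https URI;
    // otherwise falls back to the local Aspire dashboard.
    public static Uri Resolve(string? configured)
    {
        if (!string.IsNullOrWhiteSpace(configured) &&
            Uri.TryCreate(configured, UriKind.Absolute, out Uri? uri) &&
            (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps))
        {
            return uri;
        }
        return new Uri(LocalAspireGrpc);
    }
}
```

It could then be wired in as options.Endpoint = OtlpEndpoint.Resolve(Environment.GetEnvironmentVariable("OTEL_EXPORTER_OTLP_ENDPOINT")).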
The next call instruments outgoing HTTP requests made through HttpClient. Whenever the app calls another service over HTTP, OpenTelemetry automatically creates a span for the request and attaches context (metadata) such as the HTTP method, URL and status code.
    .AddHttpClientInstrumentation()
The next call writes the traces to the console window during execution. It is optional, as I don't feel it adds much value beyond making the traces visible while the app runs.
    .AddConsoleExporter()
The following code exports the traces to the Azure Application Insights resource that we created earlier.
    .AddAzureMonitorTraceExporter(options =>
    {
        options.ConnectionString = "<Application Insights connection string>";
    });
The connection string is available on the Overview page of the Application Insights resource.
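Since the connection string is a secret-like value, it is usually better kept out of source code. The sketch below shows one way to sanity-check a value read from configuration before handing it to the exporter. It assumes the usual "InstrumentationKey=...;IngestionEndpoint=..." shape of the string; AppInsightsConnection and its methods are hypothetical helpers.

```csharp
using System;
using System.Collections.Generic;

public static class AppInsightsConnection
{
    // Splits "Key1=Value1;Key2=Value2" into a case-insensitive dictionary.
    public static Dictionary<string, string> Parse(string connectionString)
    {
        var parts = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string segment in connectionString.Split(';', StringSplitOptions.RemoveEmptyEntries))
        {
            int eq = segment.IndexOf('=');
            if (eq <= 0) continue; // skip segments without a key
            parts[segment[..eq].Trim()] = segment[(eq + 1)..].Trim();
        }
        return parts;
    }

    // True when the string carries a non-empty InstrumentationKey entry.
    public static bool LooksValid(string? connectionString) =>
        !string.IsNullOrWhiteSpace(connectionString) &&
        Parse(connectionString).TryGetValue("InstrumentationKey", out string? key) &&
        key.Length > 0;
}
```

A typical use would be to read the value from the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable and fail fast at startup when LooksValid returns false, instead of silently exporting nothing.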
In the following code, we first clear the default logging providers that the host registers. We then add console logging and set the minimum logging level to Trace, the most verbose level, which means everything from Trace and Debug through Information, Warning, Error and Critical is logged.
    builder.Logging.ClearProviders();
    builder.Logging.AddConsole();
    builder.Logging.SetMinimumLevel(LogLevel.Trace);
In the background service, which in our case is named Worker, we create activities (spans) at different levels.
First, define an ActivitySource called MyApp.Worker as a static field of the Worker class. Its name must match the one passed to AddSource.
    private static readonly ActivitySource Activity = new("MyApp.Worker");
Next, in StartAsync, start an activity called Worker.Start and attach tags to it. StartActivity returns null when nothing is listening to the source, hence the null-conditional calls.
    using var activity_start = Activity.StartActivity("Worker.Start", ActivityKind.Server);
    activity_start?.SetTag("worker.name", "Worker_1");
    activity_start?.SetTag("Activity Name", activity_start.DisplayName);
    activity_start?.SetTag("Source Name", activity_start.Source.Name);
The RunAsync method, which runs as the background loop, starts a separate activity called Worker.Run with its own set of tags.
    using var activity = Activity.StartActivity("Worker.Run", ActivityKind.Server);
    activity?.SetTag("receiver.name", "promptqueue");
    activity?.SetTag("receiver.activity", "servicebusdata");
    activity?.SetTag("operation.type", "background-job");
    activity?.SetTag("operationstatus", "start");
    activity?.SetTag("servicebus.ingestion.start", "promptqueue");
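The tags above describe the happy path. To also answer "where did the failure occur?", the activity can be marked as failed when the work throws. The following is a minimal sketch using only System.Diagnostics; ActivityErrors and RunTraced are hypothetical names, and how the status is surfaced depends on the exporter in use.

```csharp
using System;
using System.Diagnostics;

public static class ActivityErrors
{
    // Runs `work`; on failure marks the activity as failed and rethrows.
    public static void RunTraced(Activity? activity, Action work)
    {
        try
        {
            work();
            activity?.SetStatus(ActivityStatusCode.Ok);
        }
        catch (Exception ex)
        {
            // The description is kept only for the Error status code.
            activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
            activity?.SetTag("exception.type", ex.GetType().FullName);
            throw;
        }
    }
}
```

Inside the loop this would wrap the agent call, so a failing message leaves a span with an Error status and the exception type instead of a span that simply ends.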
Finally, record the agent's token usage as tags on the activity.
    AgentResponse response = await agent.RunAsync(message.Body.ToString(), session);
    Console.WriteLine(response.Text);
    activity?.SetTag("InputTokenCount", response.Usage?.InputTokenCount);
    activity?.SetTag("OutputTokenCount", response.Usage?.OutputTokenCount);
    activity?.SetTag("TotalTokenCount", response.Usage?.TotalTokenCount);
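Token counts may be missing on some responses, so it can be worth isolating the tagging in a small helper. This is a sketch under assumptions: TokenUsage is a hypothetical stand-in for the usage object on the agent response (it is not the framework's type), and TokenTags.Apply is an illustrative helper.

```csharp
using System.Diagnostics;

// Hypothetical stand-in for the usage details returned with an agent response.
public sealed record TokenUsage(long? InputTokenCount, long? OutputTokenCount, long? TotalTokenCount);

public static class TokenTags
{
    // Copies the token counts onto the activity; does nothing when either is missing.
    public static void Apply(Activity? activity, TokenUsage? usage)
    {
        if (activity is null || usage is null) return;

        // SetTag ignores null values, so unreported counts simply produce no tag.
        activity.SetTag("InputTokenCount", usage.InputTokenCount);
        activity.SetTag("OutputTokenCount", usage.OutputTokenCount);

        // Fall back to input + output when the total is not reported.
        long? total = usage.TotalTokenCount ?? (usage.InputTokenCount + usage.OutputTokenCount);
        activity.SetTag("TotalTokenCount", total);
    }
}
```

Keeping the tag names in one place also avoids spelling drift between the code and the KQL queries that filter on them.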
Complete Code
Program.cs >>
    using Azure;
    using Azure.AI.OpenAI;
    using Azure.Messaging.ServiceBus;
    using Azure.Monitor.OpenTelemetry.Exporter;
    using Microsoft.Agents.AI;
    using Microsoft.Extensions.AI;
    using Microsoft.Extensions.Configuration;
    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Hosting;
    using Microsoft.Extensions.Logging;
    using OpenTelemetry.Logs;
    using OpenTelemetry.Metrics;
    using OpenTelemetry.Resources;
    using OpenTelemetry.Trace;
    using System.Diagnostics;

    internal class Program
    {
        private static async Task Main(string[] args)
        {
            HostApplicationBuilder builder = Host.CreateApplicationBuilder(args);
            var configuration = new ConfigurationBuilder()
                .SetBasePath(Directory.GetCurrentDirectory())
                .AddJsonFile("appsettings.json", optional: false)
                .Build();

            // Chat client backed by the Azure OpenAI deployment.
            var credential = new AzureKeyCredential(configuration["AppSettings:ApiKey"]);
            builder.Services.AddKeyedChatClient("ChatClient", (sp => new AzureOpenAIClient(
                    new Uri(configuration["AppSettings:EndPoint"]), credential)
                .GetChatClient(configuration["AppSettings:Chat_DeploymentName"])
                .AsIChatClient()));

            // The agent that processes the prompts.
            builder.Services.AddSingleton<AIAgent>(sp =>
            {
                Func<ChatClientAgentOptions> func = () =>
                {
                    return new ChatClientAgentOptions
                    {
                        ChatOptions = new ChatOptions
                        {
                            Instructions = "You are a helpful stock market analysis assistant"
                        }
                    };
                };
                return new ChatClientAgent(sp.GetRequiredKeyedService<IChatClient>("ChatClient"), options: func());
            });

            // Service Bus client that delivers the agent input.
            builder.Services.AddSingleton(sp =>
            {
                return new ServiceBusClient(configuration["AppSettings:AzureQueue"], new ServiceBusClientOptions
                {
                    TransportType = ServiceBusTransportType.AmqpTcp
                });
            });

            // Tracing: export to the Aspire dashboard (OTLP), the console and Application Insights.
            builder.Services.AddOpenTelemetry()
                .WithTracing(tracing =>
                {
                    tracing
                        .SetSampler(new AlwaysOnSampler())
                        .SetResourceBuilder(
                            ResourceBuilder.CreateDefault()
                                .AddService("MyApp"))
                        .AddSource("MyApp.Worker")
                        .AddOtlpExporter(options => options.Endpoint = new Uri("http://localhost:4317"))
                        .AddHttpClientInstrumentation()
                        .AddConsoleExporter()
                        .AddAzureMonitorTraceExporter(options =>
                        {
                            options.ConnectionString = "<Application Insights connection string>";
                        });
                });

            builder.Logging.ClearProviders();
            builder.Logging.AddConsole();
            builder.Logging.SetMinimumLevel(LogLevel.Trace);

            builder.Services.AddHostedService<Worker>();
            using IHost host = builder.Build();
            await host.RunAsync().ConfigureAwait(false);
        }

        internal sealed class Worker(AIAgent agent, ServiceBusClient servicebusClient, IHostApplicationLifetime appLifetime, ILogger<Worker> logger, IHost host) : IHostedService
        {
            private AgentSession? session;
            private Task? backgroundTask;

            // The name must match the one passed to AddSource above.
            private static readonly ActivitySource Activity = new("MyApp.Worker");

            public async Task StartAsync(CancellationToken cancellationToken)
            {
                // StartActivity returns null when nothing listens to the source.
                using var activity_start = Activity.StartActivity("Worker.Start", ActivityKind.Server);
                activity_start?.SetTag("worker.name", "Worker_1");
                activity_start?.SetTag("Activity Name", activity_start.DisplayName);
                activity_start?.SetTag("Source Name", activity_start.Source.Name);

                session = await agent.CreateSessionAsync(cancellationToken);
                backgroundTask = RunAsync(appLifetime.ApplicationStopping);
            }

            public async Task RunAsync(CancellationToken cancellationToken)
            {
                await Task.Delay(1000, cancellationToken);
                var receiver = servicebusClient!.CreateReceiver("promptqueue");
                while (!cancellationToken.IsCancellationRequested)
                {
                    // One Worker.Run activity per loop iteration.
                    using var activity = Activity.StartActivity("Worker.Run", ActivityKind.Server);
                    activity?.SetTag("receiver.name", "promptqueue");
                    activity?.SetTag("receiver.activity", "servicebusdata");
                    activity?.SetTag("operation.type", "background-job");
                    activity?.SetTag("operationstatus", "start");
                    activity?.SetTag("servicebus.ingestion.start", "promptqueue");

                    ServiceBusReceivedMessage message = await receiver.ReceiveMessageAsync(cancellationToken: cancellationToken);
                    activity?.SetTag("servicebus.ingestion.end", "promptqueue");
                    if (message == null)
                    {
                        // No message arrived within the wait time; poll again.
                        continue;
                    }

                    Console.WriteLine("----------------------------------------------------------------------------------------------------------------------------");
                    Console.WriteLine("");
                    activity?.SetTag("AgentInput", message.Body.ToString());
                    activity?.SetTag("SessionId", session?.ToString());

                    AgentResponse response = await agent.RunAsync(message.Body.ToString(), session);
                    Console.WriteLine(response.Text);

                    // Record the token usage reported for this run.
                    activity?.SetTag("InputTokenCount", response.Usage?.InputTokenCount);
                    activity?.SetTag("OutputTokenCount", response.Usage?.OutputTokenCount);
                    activity?.SetTag("TotalTokenCount", response.Usage?.TotalTokenCount);

                    Console.WriteLine("");
                    Console.WriteLine("----------------------------------------------------------------------------------------------------------------------------");
                    await receiver.CompleteMessageAsync(message);
                    activity?.SetTag("operationstatus", "end");
                }
            }

            public async Task StopAsync(CancellationToken cancellationToken)
            {
                if (backgroundTask != null)
                {
                    await backgroundTask;
                }
            }
        }
    }
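One detail worth noting in the code above: StopAsync awaits the background loop, and that loop may end with an OperationCanceledException when the stopping token fires during Task.Delay or a receive call. The sketch below shows one way to treat that cancellation as a normal exit; Shutdown and AwaitIgnoringCancellationAsync are hypothetical names, and this is a suggestion rather than part of the original program.

```csharp
using System;
using System.Threading.Tasks;

public static class Shutdown
{
    // Awaits the background loop and treats cancellation as a normal exit.
    // Any other exception still propagates to the caller.
    public static async Task AwaitIgnoringCancellationAsync(Task? backgroundTask)
    {
        if (backgroundTask is null) return;
        try
        {
            await backgroundTask.ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            // Expected when the stopping token fires while the loop is waiting.
        }
    }
}
```

StopAsync could then be reduced to a single call: await Shutdown.AwaitIgnoringCancellationAsync(backgroundTask).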
Run the app, then open Logs in the Application Insights resource to see the traces.
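The queries in the next section read from the requests table because both activities are started with ActivityKind.Server. The sketch below captures the mapping as I understand it for the Azure Monitor exporter; treat it as an assumption to verify against the exporter's documentation. AppInsightsTables and TableFor are hypothetical names.

```csharp
using System.Diagnostics;

public static class AppInsightsTables
{
    // Assumed mapping from activity kind to Application Insights table:
    // incoming work lands in requests, everything else in dependencies.
    public static string TableFor(ActivityKind kind) => kind switch
    {
        ActivityKind.Server or ActivityKind.Consumer => "requests",
        _ => "dependencies" // Client, Producer and Internal
    };
}
```

Under this assumption, the outgoing HTTP calls captured by AddHttpClientInstrumentation (ActivityKind.Client) would show up under dependencies rather than requests.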
KQL Queries
We can use KQL to query these traces and extract detailed insights.
    union traces, requests, dependencies
    | order by timestamp desc
Returns all telemetry, newest first. Expanding customDimensions on a row shows the token usage details.
    requests
    | where name == "Worker.Start"
    | order by timestamp desc
Gets the traces for the Worker.Start activity.
    requests
    | where customDimensions["worker.name"] == "Worker_1"
    | order by timestamp desc
Gets the trace details for the worker Worker_1.
    requests
    | project timestamp, name, duration, success
    | order by duration desc
Gets the duration and success status of each request, slowest first.
    requests
    | where tolong(customDimensions["TotalTokenCount"]) > 100
    | order by timestamp desc
Gets the requests that consumed more than 100 tokens. Values in customDimensions are typically stored as strings, so the value is converted with tolong() before the numeric comparison.
    requests
    | where duration > 60000
    | order by duration desc
Gets the requests that took longer than 60,000 ms (60 seconds); duration is recorded in milliseconds.
Aspire Dashboard
Navigate to the Aspire dashboard that was configured earlier at http://localhost:18888/traces to view the same traces.
Execution >>
Conclusion
In this article, I demonstrated two practical ways to implement observability for Microsoft Agent Framework applications with OpenTelemetry: Azure Application Insights and the Aspire dashboard.
I personally prefer Azure Application Insights for observability in MAF because of its strong KQL support, which gives a query flexibility that the Aspire dashboard lacks. In addition, Azure Application Insights integrates with Grafana for building rich dashboards, which I may cover in another article.
I hope this article helps you get started with OpenTelemetry for Microsoft Agent Framework.
Thanks for reading!



