Introduction

Cloud costs can be a real money waster. If you log something you will never need, who is responsible for that? When you look at your costs in Azure, you may find that Application Insights is a big cost driver. In this blog post, I will show you how to get in control of your Application Insights costs.

Prior knowledge

In an earlier post, I showed how to fix duplicate logging and explained how Log Analytics sits on top of Application Insights.

Read the blog post on How to fix duplicate logging in Application Insights.

All KQL queries in this post run against the Log Analytics workspace.

Identify the biggest cost tables

Log Analytics consists of tables, each with a specific purpose. For example, AppTraces holds application traces, while tables like SynapseIntegrationTriggerRuns log Synapse trigger runs.

The first step is to identify the biggest cost tables. You can do this by running the following query in your Log Analytics workspace resource. You pay per GB ingested, so we want to identify the largest tables.

Log Analytics Workspace - Logs - Kusto Query Language

With knowledge about the biggest cost tables, you can start optimizing your logging. In the next sections, I will show you example queries to give insights into logging costs.

QueryByTable.kusto
// Total billed size per table, largest first
union withsource = TableName *
| summarize Size = sum(_BilledSize) by TableName, _IsBillable
// Human-readable size next to the raw byte count
| extend FormattedSize = format_bytes(Size, 2)
| order by Size desc
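
As a cross-check, the billable volume per table can also be read from the Usage table, which is usually much cheaper to query than a union over every table. This is a minimal sketch: the 30-day window is an arbitrary choice, and it relies on Quantity being reported in MB, so dividing by 1000 gives an approximate figure in GB.

```kusto
Usage
// Assumed window; align it with your billing period
| where TimeGenerated > ago(30d)
// Only data you actually pay for
| where IsBillable == true
// Quantity is reported in MB
| summarize BillableGB = sum(Quantity) / 1000 by DataType
| order by BillableGB desc
```

If the two queries disagree strongly, check that both cover the same time range.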

Azure Diagnostic Logs

On many Azure resources, you can configure a Log Analytics workspace as the destination for diagnostic logs. But did you know that this can lead to many logs you have to pay for? A colleague of mine used this query to track down and cut 90% of their costs: by disabling the Azure Diagnostic Logs for Power BI, they saved a lot of money. Running this query gives you insight into the number of log records ingested per resource per day.

QueryTableByResourceId.kusto
AzureDiagnostics
| where TimeGenerated > ago(32d)
// Number of log records per resource per day
| summarize LogCount = count() by bin(TimeGenerated, 1d), _ResourceId
| render columnchart
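
Record counts do not always match cost, because a few large records can outweigh many small ones. The variant below is a sketch that ranks resources by billed volume instead. It uses the standard _BilledSize column (in bytes) and the same 32-day window as the query above.

```kusto
AzureDiagnostics
| where TimeGenerated > ago(32d)
// _BilledSize is in bytes; convert the total to GB
| summarize BilledGB = sum(_BilledSize) / 1024 / 1024 / 1024 by _ResourceId
| order by BilledGB desc
```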

Application traces

Traces are good for hunting bugs. But when a system is running, do you need all Debug logs? Do you even think every log is important?

The query below groups identical log messages per resource and sorts them by cost, with the most expensive logs on top. The magic number 2.52 was the price in euros per GB ingested for Log Analytics. When you ingest more than 100 GB per day, which is a lot, you can get discounted pricing. When you run the query, think about its scope: other environments may log the same trace as well.

Make sure you configure your log levels correctly, in appsettings.json or host.json.

AppTracesByCosts.kusto
AppTraces
// Group identical messages per application, operation, severity and resource
| summarize
    Count = count(),
    BilledTotalSize = sum(_BilledSize)
    by AppRoleName, OperationName, Message, SeverityLevel, _ResourceId
// _BilledSize is in bytes; convert to GB
| extend GbSize = BilledTotalSize / 1024 / 1024 / 1024
// 2.52 was the price in euros per GB ingested; replace it with your own rate
| extend EuroCost = GbSize * 2.52
// Last segment of the resource ID is the resource name
| extend ResourceName = tostring(split(_ResourceId, "/")[-1])
| project ResourceName, EuroCost, Count, SeverityLevel, Message
| order by EuroCost desc
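
To judge whether your log levels are configured sensibly, it can help to see how the trace volume splits across severity levels. The sketch below assumes a 30-day window. In the AppTraces table, SeverityLevel is a number, where 0 is Verbose, 1 is Information, 2 is Warning, 3 is Error and 4 is Critical.

```kusto
AppTraces
// Assumed window; adjust to your needs
| where TimeGenerated > ago(30d)
| summarize
    Count = count(),
    BilledBytes = sum(_BilledSize)
    by SeverityLevel, AppRoleName
// Human-readable size next to the raw byte count
| extend BilledSize = format_bytes(BilledBytes, 2)
| order by BilledBytes desc
```

If most of the volume sits at level 0 or 1 for a production application, raising the minimum log level is likely the cheapest fix.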

Application dependencies

Dependencies are really important. But logging too many of them can lead to unwanted costs. This query shows which dependencies have the largest cost footprint in your Log Analytics workspace. EuroCost is the sum of _BilledSize over all dependencies in a group, converted to GB and multiplied by 2.52.

The DataTotalSize field is the total size of the Data column, which can contain, for example, the database query when that is enabled in your logging. If this value is large and the count for this dependency is high, it might be a hotspot to act on.

AppDependenciesByCosts.kusto
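AppDependencies
// Size of the Data column, for example the SQL text when that is logged
| extend DataSize = strlen(Data)
| summarize
    Count = count(),
    BilledTotalSize = sum(_BilledSize),
    DataTotalSize = sum(DataSize)
    by AppRoleName, OperationName, Data, _ResourceId, DependencyType
// _BilledSize is in bytes; convert to GB
| extend GbSize = BilledTotalSize / 1024 / 1024 / 1024
// 2.52 was the price in euros per GB ingested; replace it with your own rate
| extend EuroCost = GbSize * 2.52
| extend ResourceName = tostring(split(_ResourceId, "/")[-1])
// DependencyType (SQL, HTTP, ...) replaces Type, which only holds the table name
| project ResourceName, OperationName, EuroCost, Count, DataTotalSize, Data, DependencyType
| order by EuroCost desc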
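
The cost calculation used in the queries above can be checked in isolation with a single made-up value. This sketch takes a hypothetical 5 GB of billed data (5 × 1024³ bytes) and the same price of 2.52 per GB, which should come out at roughly 12.60 euros.

```kusto
// Hypothetical input: 5 GB expressed in bytes
print BilledTotalSize = 5368709120.0
// Bytes to GB
| extend GbSize = BilledTotalSize / 1024 / 1024 / 1024
// Assumed price per GB ingested; replace it with your own rate
| extend EuroCost = GbSize * 2.52
```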

Health checks

A special mention goes to health checks: do you need the full trace and dependency tree for every health check call? Make sure to exclude those unwanted requests and dependencies. Consider keeping health check logging only when the health check fails, and then only the health check result.
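
Before excluding anything, it can be useful to measure what health checks cost today. The query below is a sketch under two assumptions: health check requests can be recognised by "health" appearing in the request URL, and 30 days is a representative window. It collects the operation IDs of those requests and sums the billed size of every request, dependency and trace that belongs to them.

```kusto
// Operation IDs of requests that look like health checks
// (assumption: the URL contains "health"; adjust to your endpoint)
let healthOperations = AppRequests
    | where TimeGenerated > ago(30d)
    | where Url contains "health"
    | distinct OperationId;
union AppRequests, AppDependencies, AppTraces
| where TimeGenerated > ago(30d)
| where OperationId in (healthOperations)
// Type holds the name of the table each record came from
| summarize Count = count(), BilledBytes = sum(_BilledSize) by Type
// Same assumed price per GB as in the other queries
| extend EuroCost = BilledBytes / 1024 / 1024 / 1024 * 2.52
| order by EuroCost desc
```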

Dashboard

By putting the data in a dashboard, you give your team an easy way to access these metrics. My screenshot below shows two of the most important queries: the application traces and the tables.

Make sure to set the dashboard to a sensible time range.

Tracing costs dashboard

Conclusion

When turning on diagnostics, make sure it helps the business. Revisit your diagnostic settings and make sure you are in control of your costs. Also be critical about the diagnostic settings during development. Once turned on, they won’t be turned off soon, because now you’re the expert!

Further reading

To build a custom processor in code for conditional logging, make sure to visit the blog of my mate Thomas Vieveen.