Forum Discussion
Ashok42470 (Copper Contributor)
Feb 20, 2025
Need assistance on KQL query for pulling AKS Pod logs
I am trying to pull historical pod logs using the KQL query below. It looks like the join between the ContainerLog and KubePodInventory tables didn't go well, as I see a lot of duplicates in the output.
ContainerLog
//| project TimeGenerated, ContainerID, LogEntry
| join kind=inner (
KubePodInventory
| where ServiceName == "<<servicename>>"
)
on ContainerID
| project TimeGenerated, Namespace, ContainerID, ServiceName, LogEntrySource, LogEntry, Name1
| sort by TimeGenerated asc
Can someone suggest a better query?
Take this:
ContainerLog
| join kind=inner (
    KubePodInventory
    | where ServiceName == "<<servicename>>"
    ) on ContainerID
| project TimeGenerated, Namespace, ContainerID, ServiceName, LogEntrySource, LogEntry, Name1
| distinct TimeGenerated, Namespace, ContainerID, ServiceName, LogEntrySource, LogEntry, Name1
| sort by TimeGenerated asc
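For reference, the same idea with an explicit lookback window on both tables keeps the scanned data and the result set smaller (a sketch; the 30-minute window is only an example value):
ContainerLog
| where TimeGenerated > ago(30m)
| join kind=inner (
    KubePodInventory
    | where TimeGenerated > ago(30m)
    | where ServiceName == "<<servicename>>"
    ) on ContainerID
| distinct TimeGenerated, Namespace, ContainerID, ServiceName, LogEntrySource, LogEntry, Name1
| sort by TimeGenerated asc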
- luchete (Steel Contributor)
Hi Ashok42470,
The join between ContainerLog and KubePodInventory is causing duplicates because KubePodInventory stores periodic inventory snapshots, so a single ContainerID appears in many rows and each log line matches several of them. To avoid duplicates, try adding the PodName or ContainerName as an additional filter on the inventory side of the join. You can also use summarize to aggregate the logs per pod/container if needed. Here’s an adjusted query:
ContainerLog
| join kind=inner (
    KubePodInventory
    | where ServiceName == "<<servicename>>"
    ) on $left.ContainerID == $right.ContainerID
| project TimeGenerated, Namespace, ContainerID, ServiceName, LogEntrySource, LogEntry, Name1
| summarize Logs = make_list(LogEntry) by ContainerID, TimeGenerated, ServiceName, Namespace
| sort by TimeGenerated asc
Here the query groups the log lines per container and timestamp, which should remove the duplicate rows.
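If duplicates still show up after that, a variation worth trying (a sketch, assuming you only need the latest inventory record per container; PodName is just a rename of the inventory Name column for clarity) is to collapse KubePodInventory to one row per ContainerID before the join:
ContainerLog
| join kind=inner (
    KubePodInventory
    | where ServiceName == "<<servicename>>"
    | summarize arg_max(TimeGenerated, *) by ContainerID // keep only the latest inventory snapshot per container
    | project ContainerID, PodName = Name, Namespace, ServiceName
    ) on ContainerID
| project TimeGenerated, Namespace, ContainerID, ServiceName, LogEntrySource, LogEntry, PodName
| sort by TimeGenerated asc
Because the inventory side now has a single row per ContainerID, each log line appears only once in the output.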
Give it a try and let me know how it goes.
Regards!
- Ashok42470 (Copper Contributor)
Thanks for the reply, luchete. The reason I am filtering by service name is that pods are ephemeral, right? Sometimes I have to pull logs for pods that were already killed, whose names I may not know.
In my case, I just selected logs for the last 30 minutes and it returned 30k+ rows of data, which shouldn't be the case. Looking at the results, comma-separated strings in LogEntry are getting split into multiple rows. Not sure how to tackle that?
- luchete (Steel Contributor)
Hello Ashok42470,
I understand the challenge with ephemeral pods. To prevent logs from being split across rows, you can aggregate them using summarize. Here’s an updated query that groups the logs into a single entry per container:
ContainerLog
| join kind=inner (
    KubePodInventory
    | where ServiceName == "<<servicename>>"
    ) on $left.ContainerID == $right.ContainerID
| summarize Logs = make_list(LogEntry, 1000) by ContainerID, TimeGenerated, ServiceName, Namespace
| extend CombinedLogs = strcat_array(Logs, " ") // joins the log lines into a single string
| project TimeGenerated, Namespace, ContainerID, ServiceName, CombinedLogs
| sort by TimeGenerated asc
This should eliminate duplicates and keep the log messages intact. If you’re pulling logs for killed pods, consider adjusting your time range or adding filters for PodName.
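If you only know the workload's pod name prefix rather than the exact pod, here is a rough sketch along those lines (the "<<podnameprefix>>" placeholder and the 1-hour window are hypothetical values to adjust):
ContainerLog
| where TimeGenerated > ago(1h)
| join kind=inner (
    KubePodInventory
    | where TimeGenerated > ago(1h)
    | where Name startswith "<<podnameprefix>>" // matches pods from the same workload, including ones already killed
    | summarize arg_max(TimeGenerated, *) by ContainerID
    | project ContainerID, PodName = Name, Namespace
    ) on ContainerID
| project TimeGenerated, Namespace, ContainerID, PodName, LogEntrySource, LogEntry
| sort by TimeGenerated asc
The logs of killed pods stay queryable for as long as the workspace retention keeps the ContainerLog rows.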
Regards!