Architecture
Every request that hits kube-apiserver (e.g. kubectl get pods) can generate an audit event. There’s no separate “audit service” sitting beside the apiserver; auditing is a feature built directly into the apiserver’s request handling pipeline.
As that request moves through its lifecycle, it passes through up to four stages, and the apiserver can emit an audit event at each one.
The four stages
RequestReceived
Generated the moment the audit handler receives the request, before it’s been processed or authorized.
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "stage": "RequestReceived",
  "requestURI": "/api/v1/namespaces/default/pods",
  "verb": "create",
  "user": { "username": "deploy-bot", "groups": ["system:serviceaccounts"] },
  "requestReceivedTimestamp": "2026-06-23T10:15:30.100000Z"
}
This stage exists, but it isn’t very useful in most cases: it roughly doubles event volume while telling you nothing you don’t already get from ResponseComplete.
ResponseStarted
Fired once the response headers go out, but before the response body is sent. This only applies to long-running requests - watch connections are the main case. A kubectl get pods -w or a controller’s informer watch will hit this stage; a normal kubectl get pods won’t.
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "stage": "ResponseStarted",
  "requestURI": "/api/v1/namespaces/default/pods?watch=true",
  "verb": "watch",
  "user": { "username": "system:kube-controller-manager" }
}
ResponseComplete
The response body has been fully sent - nothing more is coming. This is the stage that matters most for almost every use case (security review, compliance evidence, “who deleted this?”, etc.), because it carries the complete picture: what was asked, who asked, and what the server actually did about it.
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Request",
  "auditID": "a1b2c3d4-e5f6-7890-abcd-ef0123456789",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/production/pods/my-app-pod",
  "verb": "delete",
  "user": {
    "username": "shubham@company.com",
    "groups": ["developers", "system:authenticated"]
  },
  "sourceIPs": ["10.0.0.50"],
  "objectRef": {
    "resource": "pods",
    "namespace": "production",
    "name": "my-app-pod",
    "apiVersion": "v1"
  },
  "responseStatus": { "code": 200 },
  "requestReceivedTimestamp": "2026-06-23T10:15:30.123456Z",
  "stageTimestamp": "2026-06-23T10:15:30.234567Z"
}
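To make the “who deleted this?” use case concrete, here is a minimal Python sketch that scans audit events for completed deletes of one object. It assumes the events are available as one JSON object per line; the function name who_deleted and the sample lines are made up for illustration, and only fields shown in the event above are read.

```python
import json

def who_deleted(lines, resource, namespace, name):
    """Yield (username, stageTimestamp, status code) for completed deletes of one object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only ResponseComplete tells us what the server actually did.
        if event.get("stage") != "ResponseComplete" or event.get("verb") != "delete":
            continue
        ref = event.get("objectRef") or {}
        if (ref.get("resource"), ref.get("namespace"), ref.get("name")) != (resource, namespace, name):
            continue
        yield (
            (event.get("user") or {}).get("username"),
            event.get("stageTimestamp"),
            (event.get("responseStatus") or {}).get("code"),
        )

# Hypothetical sample: one delete and one unrelated get on the same pod.
sample = [
    json.dumps({
        "stage": "ResponseComplete", "verb": "delete",
        "user": {"username": "shubham@company.com"},
        "objectRef": {"resource": "pods", "namespace": "production", "name": "my-app-pod"},
        "responseStatus": {"code": 200},
        "stageTimestamp": "2026-06-23T10:15:30.234567Z",
    }),
    json.dumps({
        "stage": "ResponseComplete", "verb": "get",
        "user": {"username": "jane.doe"},
        "objectRef": {"resource": "pods", "namespace": "production", "name": "my-app-pod"},
        "responseStatus": {"code": 200},
        "stageTimestamp": "2026-06-23T10:16:02.000000Z",
    }),
]

for username, when, code in who_deleted(sample, "pods", "production", "my-app-pod"):
    print(f"{username} deleted the pod at {when} (HTTP {code})")
```

Checking the status code matters here: a delete that came back 403 is an attempt, not a deletion.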
Panic
Generated only when the apiserver hits an internal panic while handling the request. Rare, but useful as a signal of something seriously wrong in the request path, so it’s worth keeping no matter how aggressive your policy is elsewhere.
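Events emitted at different stages of the same request share one auditID, so you can stitch a request’s lifecycle back together. The sketch below groups events by auditID and lists the stages seen for each. It assumes one JSON event per line; the helper name stages_by_request and the sample IDs are illustrative only.

```python
import json
from collections import defaultdict

def stages_by_request(lines):
    """Map each auditID to the list of stages seen for it, in log order."""
    seen = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        seen[event.get("auditID", "<missing>")].append(event.get("stage"))
    return dict(seen)

# Hypothetical sample: one short-lived get and one long-running watch.
sample = [
    '{"auditID": "req-1", "stage": "RequestReceived", "verb": "get"}',
    '{"auditID": "req-1", "stage": "ResponseComplete", "verb": "get"}',
    '{"auditID": "req-2", "stage": "RequestReceived", "verb": "watch"}',
    '{"auditID": "req-2", "stage": "ResponseStarted", "verb": "watch"}',
    '{"auditID": "req-2", "stage": "ResponseComplete", "verb": "watch"}',
]

for audit_id, stages in stages_by_request(sample).items():
    print(audit_id, "->", " > ".join(stages))
```

In this sample only the watch passes through ResponseStarted, which matches the long-running-request behaviour described above.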
Audit levels - how much detail gets captured
None
Nothing is logged. The request is matched by a rule and dropped entirely - no event written at all.
- level: None
  users: ["system:kube-proxy"]
  verbs: ["watch"]
Metadata
Logs the request metadata - who, what verb, what resource, timestamp, source IP, response code - but never the request or response body.
{
  "kind": "Event",
  "level": "Metadata",
  "stage": "ResponseComplete",
  "verb": "get",
  "user": { "username": "jane.doe", "groups": ["developers"] },
  "objectRef": { "resource": "secrets", "namespace": "production", "name": "db-creds" },
  "responseStatus": { "code": 403, "reason": "Forbidden" }
}
For anything that touches secrets, Metadata is the only level you should use. At Request or RequestResponse the event includes the request or response body, which contains the base64-encoded secret data, and base64 is an encoding that anyone can reverse, not encryption. Metadata never records those bodies.
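To see why base64 offers no protection, here is a small Python sketch that decodes the data field of a Secret body as it could appear in a requestObject if secrets were logged above Metadata. The object and its value are entirely made up for illustration, and decode_secret_data is a hypothetical helper.

```python
import base64

# Hypothetical requestObject fragment for a Secret logged at Request level.
# The password value is invented for this example.
leaked_request_object = {
    "kind": "Secret",
    "metadata": {"name": "db-creds", "namespace": "production"},
    "data": {"password": "aHVudGVyMg=="},
}

def decode_secret_data(obj):
    """Return the Secret's data map with every base64 value decoded to text."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in obj.get("data", {}).items()
    }

print(decode_secret_data(leaked_request_object))
```

Anyone with read access to the audit log could do the same, which is why the secrets rule should pin the level to Metadata.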
Request
Metadata plus the full request body. No response body.
{
  "kind": "Event",
  "level": "Request",
  "stage": "ResponseComplete",
  "verb": "create",
  "user": { "username": "ci-bot" },
  "objectRef": { "resource": "configmaps", "namespace": "kube-system", "name": "feature-flags" },
  "requestObject": {
    "kind": "ConfigMap",
    "apiVersion": "v1",
    "metadata": { "name": "feature-flags", "namespace": "kube-system" },
    "data": { "new-checkout-flow": "true" }
  }
}
You see what was sent, but not what the server returned.
RequestResponse
The most verbose - metadata, request body, and response body. Useful for high-value resources where you want to know exactly what changed and exactly what the server’s final state ended up being, but expensive in storage if you apply it broadly.
{
  "kind": "Event",
  "level": "RequestResponse",
  "stage": "ResponseComplete",
  "verb": "create",
  "user": { "username": "kubernetes-admin", "groups": ["system:masters"] },
  "objectRef": { "resource": "rolebindings", "namespace": "production", "name": "ci-deploy-binding" },
  "requestObject": {
    "kind": "RoleBinding",
    "roleRef": { "kind": "Role", "name": "deploy-role" },
    "subjects": [{ "kind": "ServiceAccount", "name": "ci-bot", "namespace": "production" }]
  },
  "responseObject": {
    "kind": "RoleBinding",
    "metadata": { "uid": "rb-uid-456", "resourceVersion": "9821" }
  }
}
This is the level you want for RBAC changes, NetworkPolicy changes, and anything where “what did it end up looking like after” matters as much as “what was requested.”
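One way to use both bodies together is to look at what the server added on top of what was requested. The sketch below is a rough illustration, not a full diff: it only reports keys that exist in responseObject but not in requestObject, walking nested objects. The helper name added_paths and the sample bodies (which extend the RoleBinding example with a metadata block in the request) are assumptions for this example.

```python
def added_paths(request_obj, response_obj, prefix=""):
    """List dotted paths present in the response body but absent from the request body."""
    paths = []
    for key, value in response_obj.items():
        path = f"{prefix}{key}"
        if key not in request_obj:
            paths.append(path)
        elif isinstance(value, dict) and isinstance(request_obj[key], dict):
            # Recurse into nested objects that exist on both sides.
            paths.extend(added_paths(request_obj[key], value, prefix=f"{path}."))
    return paths

# Hypothetical bodies modelled on the RoleBinding event above.
request_obj = {
    "kind": "RoleBinding",
    "metadata": {"name": "ci-deploy-binding", "namespace": "production"},
    "roleRef": {"kind": "Role", "name": "deploy-role"},
    "subjects": [{"kind": "ServiceAccount", "name": "ci-bot", "namespace": "production"}],
}
response_obj = {
    "kind": "RoleBinding",
    "metadata": {
        "name": "ci-deploy-binding",
        "namespace": "production",
        "uid": "rb-uid-456",
        "resourceVersion": "9821",
    },
    "roleRef": {"kind": "Role", "name": "deploy-role"},
    "subjects": [{"kind": "ServiceAccount", "name": "ci-bot", "namespace": "production"}],
}

print(added_paths(request_obj, response_obj))
```

This deliberately ignores changed values and list contents; a real review would want a proper structural diff.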
Writing an audit policy
A policy is just a list of rules evaluated top to bottom - the first matching rule decides the level for that request, and evaluation stops there. Order matters a lot here; a broad rule placed too early will swallow requests you meant to catch with a more specific rule below it.
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:
  # Drop noisy, low-value system traffic first
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get", "list", "watch"]
  - level: None
    nonResourceURLs:
      - "/healthz*"
      - "/readyz*"
      - "/metrics"
  # Never write secret bodies, no matter the verb
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Full detail for the resources that actually matter for security review
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
      - group: "networking.k8s.io"
        resources: ["networkpolicies"]
      - group: ""
        resources: ["pods/exec", "pods/attach"]
  # Request body for general mutations elsewhere
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
  # Catch-all
  - level: Metadata
A rule can filter on any combination of users, userGroups, verbs, resources (with group / resources / resourceNames), namespaces, and nonResourceURLs. The catch-all rule at the end covers any request that doesn’t match one of the rules above it.
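The first-match behaviour is easier to reason about with a toy evaluator. The Python sketch below is a simplified model, not the apiserver’s implementation: Rule, matches, and level_for are hypothetical names, it handles only users, userGroups, verbs, and group/resource pairs, and it ignores namespaces, resourceNames, nonResourceURLs, and wildcards. An empty field matches everything, and a request that matches no rule falls back to None.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    level: str
    users: list = field(default_factory=list)
    user_groups: list = field(default_factory=list)
    verbs: list = field(default_factory=list)
    # (group, resource) pairs; resource=None means "every resource in the group".
    resources: list = field(default_factory=list)

def matches(rule, req):
    """True if every non-empty field of the rule matches the request."""
    if rule.users and req["user"] not in rule.users:
        return False
    if rule.user_groups and not set(rule.user_groups) & set(req["groups"]):
        return False
    if rule.verbs and req["verb"] not in rule.verbs:
        return False
    if rule.resources:
        return any(
            group == req["group"] and (resource is None or resource == req["resource"])
            for group, resource in rule.resources
        )
    return True

def level_for(rules, req):
    """Return the level of the first matching rule; None if nothing matches."""
    for rule in rules:
        if matches(rule, req):
            return rule.level
    return "None"

policy = [
    Rule("None", users=["system:kube-proxy"], verbs=["watch"]),
    Rule("Metadata", resources=[("", "secrets")]),
    Rule("RequestResponse", resources=[("rbac.authorization.k8s.io", None)]),
    Rule("Request", verbs=["create", "update", "patch", "delete"]),
    Rule("Metadata"),  # catch-all
]

secret_create = {
    "user": "ci-bot", "groups": ["system:authenticated"],
    "verb": "create", "group": "", "resource": "secrets",
}

# Same rules, but the broad mutation rule is moved above the secrets rule.
swapped = [policy[0], policy[3], policy[1], policy[2], policy[4]]

print(level_for(policy, secret_create))   # secrets rule wins
print(level_for(swapped, secret_create))  # broad rule swallows the request
```

With the original order the secret create is logged at Metadata; with the broad mutation rule moved up, the same request is logged at Request and the secret body ends up in the log.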
How to tune this in practice
There isn’t one universal policy - what gets logged, and at what level, depends heavily on the cluster’s role.
It depends on environment (prod vs. dev/staging):
- Production clusters generally run RequestResponse on RBAC, NetworkPolicy, and exec/attach - these are the things you need full forensic detail on if something goes wrong.
- Dev and staging clusters usually drop to Metadata-only almost everywhere, or even disable auditing for low-value namespaces, since the security stakes are lower and the volume isn’t worth the storage cost.
It depends on compliance requirements:
- If you’re working toward ISO 27001, SOC 2, or NIS2 (relevant to anything classified as an Important/Essential Entity), auditors generally want to see who changed access control and network policy objects, with enough detail to reconstruct the change - which pushes RBAC and NetworkPolicy rules toward RequestResponse regardless of environment.
- Without a compliance driver, most teams are comfortable with Metadata as the default and only escalate specific resource types.
It depends on log volume and retention budget:
- A high-traffic cluster with thousands of requests per second at RequestResponse level can generate enormous log volume fast. Teams with tighter storage or shipping budgets push more rules toward Metadata and reserve Request / RequestResponse for a smaller number of resource types.
- This is also why many SRE teams set omitStages: ["RequestReceived"] - in most cases it’s close to a “50% volume cut with no loss of information”, since ResponseComplete already carries requestReceivedTimestamp.
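A back-of-the-envelope estimate helps when budgeting for this. The sketch below is plain arithmetic with invented inputs: 1,000 requests per second, an average event size of 1 KiB, and one or two events per request. It also assumes every event is the same size, which is a simplification. Plug in numbers measured from your own cluster.

```python
SECONDS_PER_DAY = 86_400

def daily_volume_gib(requests_per_second, avg_event_bytes, events_per_request):
    """Rough audit log volume per day in GiB."""
    events_per_day = requests_per_second * events_per_request * SECONDS_PER_DAY
    return events_per_day * avg_event_bytes / 2**30

# Hypothetical inputs, not measurements.
with_received = daily_volume_gib(1000, 1024, events_per_request=2)
without_received = daily_volume_gib(1000, 1024, events_per_request=1)

print(f"RequestReceived + ResponseComplete: {with_received:.1f} GiB/day")
print(f"ResponseComplete only:              {without_received:.1f} GiB/day")
```

Under these assumptions, dropping RequestReceived takes the estimate from roughly 165 GiB/day to roughly 82 GiB/day.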
It depends on what’s already noisy in your cluster:
- Health checks (system:kube-probe), kube-proxy watches, kubelet node status updates, and reconciler loops (ArgoCD, controller-manager, your GitOps tooling) account for a large fraction of total apiserver traffic. Filtering these to None early in the rule list is a good practice - without it, signals will get buried in noise.
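To find out what is actually noisy before writing None rules, you can count events per caller. The sketch below tallies (username, verb, resource) combinations from audit events, assuming one JSON event per line; top_talkers and the sample users are illustrative names, not output from a real cluster.

```python
import json
from collections import Counter

def top_talkers(lines, n=5):
    """Return the n most frequent (username, verb, resource) combinations."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        username = (event.get("user") or {}).get("username", "<unknown>")
        resource = (event.get("objectRef") or {}).get("resource", "<non-resource>")
        counts[(username, event.get("verb"), resource)] += 1
    return counts.most_common(n)

def _event(username, verb, resource):
    """Build one hypothetical audit log line."""
    return json.dumps({
        "stage": "ResponseComplete",
        "verb": verb,
        "user": {"username": username},
        "objectRef": {"resource": resource},
    })

# Hypothetical sample traffic.
sample = (
    [_event("system:kube-proxy", "watch", "endpoints")] * 3
    + [_event("system:node:node-1", "get", "nodes")] * 2
    + [_event("jane.doe", "get", "pods")]
)

for (username, verb, resource), count in top_talkers(sample, n=3):
    print(f"{count:>4}  {username}  {verb}  {resource}")
```

The combinations at the top of this list are the first candidates for a None rule, provided they carry no security value.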
So what you should do is:
- None for known system/health noise, placed first
- Metadata-only, forced, for secrets
- RequestResponse for RBAC objects, netpols, and pods/exec / pods/attach
- Request for general mutating verbs (create/update/patch/delete) on everything else
- Metadata catch-all at the bottom