Fix kmsg timestamps #288
Open
Sayan- wants to merge 1 commit into
Overview
`system_oom_kill` events were stamped with the kmsg envelope timestamp, which is `CLOCK_MONOTONIC`-derived. That clock freezes while a unikraft VM is suspended (scale-to-zero), so on any VM that had been in standby, the OOM timestamps skewed backward by the accumulated suspended duration, sometimes by hours.
The `sysmon` kmsg reader now stamps each record with wall-clock read time (`time.Now()`) instead of the envelope timestamp. We only ever read live records (the source seeks to end first), so read time is an accurate event time.
Documented the rule on `events.Event.Ts` and added a gotcha (#13) in `AGENTS.md`: all event timestamps must be wall-clock at emit/observe, never a monotonic/source clock.
Test
`kmsgparserSource` stamps observation time, not the envelope's monotonic time
Note
Low Risk
Localized change to kmsg timestamp stamping and documentation; behavior improves correctness on suspended VMs with minimal blast radius.
Overview
Fixes `system_oom_kill` (and related OOM parsing) timestamps that could appear hours in the past on scale-to-zero VMs because kmsg envelope times are monotonic-derived and freeze while the VM is suspended.
`kmsgparserSource` now sets each `KmsgMessage.Timestamp` to wall-clock read time (`time.Now()` at forward), not the parser's envelope timestamp; comments on `OomInstance.TimeOfDeath` and `KmsgMessage` describe that contract.
Documents the same rule on `events.Event.Ts` and adds AGENTS.md gotcha #13 so in-process telemetry producers stamp at emit/observe. Adds a linux-gated unit test that the forwarded stamp is observation time, not the envelope time.
Reviewed by Cursor Bugbot for commit 0b909ed.