fix: [SDK-4794] prewarm dispatchers at cold-start entry points#2664
Conversation
📊 Diff Coverage ReportDiff Coverage Report (Changed Lines Only)Gate: aggregate coverage on changed executable lines must be ≥ 80% (JaCoCo line data for lines touched in the diff). Changed Files Coverage
Overall (aggregate gate)22/49 touched executable lines covered (44.9% — requires ≥ 80%) Per-file detail (informational; gate is aggregate above):
❌ Coverage Check FailedAggregate coverage on touched lines is 44.9% (minimum 80%). |
|
@claude review |
There was a problem hiding this comment.
LGTM — mechanical extension of the existing idempotent prewarm() to the common ThreadUtils entry points.
Extended reasoning...
Overview
This PR adds a single OneSignalDispatchers.prewarm() call to six ThreadUtils entry points (suspendifyOnMain, suspendifyOnSerialIO, suspendifyWithCompletion, suspendifyWithErrorHandling, launchOnIO, launchOnDefault) and one explicit call at FCMBroadcastReceiver.onReceive before goAsync(). KDoc on prewarm() is updated to document the explicit early callsites. Tests are added/updated in ThreadUtilsFeatureFlagTests, SyncJobServiceTests, and a new FCMBroadcastReceiverTests.
Security risks
None. This is a performance/startup optimization that shifts dispatcher cold-init off the main thread. No auth, crypto, permissions, IPC parsing, or data exposure surface is touched. The FCM receiver continues to short-circuit on google.com/iid token-update intents before the prewarm call, matching prior behavior.
Level of scrutiny
Low. Although the file touches threading infrastructure (which would normally warrant high scrutiny), each individual edit is a one-line invocation of a function that already exists, is documented and verified idempotent (@Volatile prewarmStarted guard + double-checked lock), is fire-and-forget (work runs on a separate OneSignal-prewarm daemon thread at NORM_PRIORITY - 2), and swallows failures with a logged fallback. The behavioral change is just "warm dispatchers a bit earlier in more code paths."
Other factors
- The only bug surfaced is a test-consistency Nit about pre-existing
BackgroundManagerTestsnot stubbing the newprewarm()call onmockkObject(OneSignalDispatchers)— the inline comment notes the race is impractical and CI passes; not blocking. - The diff-coverage gate failed at 57.1% (FF-off branches of the new ThreadUtils sites aren't directly verified, and the FCM-receiver explicit prewarm shares behavior with the ThreadUtils prewarm it duplicates by design). The uncovered lines are low-risk one-liners.
- New test coverage is solid for the FF-on contract on each entry point and for the FCM receiver's filter behavior on token-update intents.
|
Thanks for chasing this. I think there’s one core issue to address before merge: the new Since Can we either:
I’d also like to see coverage or trace evidence that the first cold helper call no longer initializes Small follow-ups:
|
Production OTel showed the first launchOnIO/launchOnSerialIO on a cold start constructing the ThreadPoolExecutor + dispatcher + scope synchronously on the main thread (Activity trampoline, JobService.onStartJob, FCM goAsync), blocking it for seconds (SDK-4507 family). prewarm() shifts that construction onto the low-priority OneSignal-prewarm daemon. Call prewarm() at the cold-start entry points so the daemon gets a head start before the first dispatch: - receivers: FCMBroadcastReceiver, NotificationDismissReceiver, BootUpReceiver, UpgradeReceiver (before goAsync()) - ADM: ADMMessageHandler, ADMMessageHandlerJob - HMS: OneSignalHmsEventBridge.onNewToken / onMessageReceived - notification-open trampolines: NotificationOpenedActivityBase, NotificationOpenedActivityHMS These join the existing prewarm() calls in OneSignalImp.initWithContext* and SyncJobService.onStartJob. prewarm() is idempotent and fire-and-forget, so the calls are cheap. It is a best-effort head start, not a hard guarantee: a caller that dispatches immediately can still win the lazy-init race, so prewarm is placed where there is real lead time before the first dispatch. - reword prewarm() KDoc as a best-effort head start with the entry-point table - stub OneSignalDispatchers.prewarm() in IOMockHelper so mockkObject specs don't spawn the daemon - add PrewarmEntryPointTests, SyncJobServiceTests and FCMBroadcastReceiverTests coverage for the explicit prewarm calls Co-authored-by: Cursor <cursoragent@cursor.com>
d3bbefd to
c51092c
Compare
|
A few things to fix before merge:
|
…p guard)
Addresses fadi-george's pre-merge review on the prewarm cold-start PR:
- prewarm(): wrap Thread construction, priority assignment, and start() in a
try/catch (Throwable) so a daemon-startup failure (e.g. OOM "unable to create
new native thread", SecurityManager, InternalError) can never propagate onto a
cold-start entry point's caller thread. Reset the started guard on failure so a
later entry point can retry; dispatchers keep their lazy fallbacks. Also widen
the in-thread catch to Throwable.
- Tests now assert ordering, not just that prewarm() was called: prewarm() must
run before the first suspendifyOnIO dispatch (verifyOrder) in
PrewarmEntryPointTests, FCMBroadcastReceiverTests, and SyncJobServiceTests.
- Replace afterAny { unmockkAll() } with teardown of only the spec-owned mock
(unmockkObject(OneSignal)). unmockkAll() was stripping IOMockHelper's
spec-scoped static/object mocks after the first test. Per-test prewarm() call
counts are isolated via clearMocks(OneSignalDispatchers, answers = false).
- Remove internal ticket IDs from source/test comments and test names.
- Reword the stale IOMockHelper comment that claimed ThreadUtils helpers call
prewarm() (the wide approach was rejected; only entry points call it).
Co-authored-by: Cursor <cursoragent@cursor.com>
processIntent() now calls OneSignalDispatchers.prewarm() at this entry point. The Robolectric spec drives setup() on the test thread without IOMockHelper, so the real prewarm() ran — spawning the OneSignal-prewarm daemon and lazy-initing the IO/Default/SerialIO executors + scopes, which persist across the JVM for later test classes. Register listener(IOMockHelper) so prewarm() is stubbed to a no-op, matching the pattern already used in FCMBroadcastReceiverTests and PrewarmEntryPointTests. Co-authored-by: Cursor <cursoragent@cursor.com>
|
A few remaining things to fix/consider:
|
Address the 12 weighted issues failing :OneSignal:core:detekt: - Add KDoc to IFeatureManager/isEnabled, FeatureFlag.isEnabledIn, and the public JwtTokenStore listener/getter functions (UndocumentedPublicClass/Function). - Extract notInitializedMessage() and awaitInitCompletion() from waitUntilInitInternal() to get under the LongMethod threshold. - Suppress TooGenericExceptionCaught on prewarm(), where catching Throwable is intentional so a prewarm failure never propagates onto a cold-start caller. Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: AR Abdul Azeez <abdul@onesignal.com> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Fadi George <fadii925@gmail.com>
Summary
Call
OneSignalDispatchers.prewarm()at every cold-start entry point so the IO/Default/SerialIO dispatcher + executor + scope construction happens on the low-priorityOneSignal-prewarmdaemon instead of on the thread that handles the wake-up. Entry points covered:FCMBroadcastReceiver,NotificationDismissReceiver,BootUpReceiver,UpgradeReceiver(beforegoAsync())ADMMessageHandler,ADMMessageHandlerJob(each callback)OneSignalHmsEventBridge.onNewToken/onMessageReceivedNotificationOpenedActivityBase,NotificationOpenedActivityHMSOneSignalImp.initWithContext/initWithContextSuspend,SyncJobService.onStartJobprewarm()is idempotent and fire-and-forget, so the calls are cheap. Theprewarm()KDoc is reworded to describe it as a best-effort head start, list the full entry-point table, and tell future contributors to prewarm any new entry point.Why this approach over the two alternatives
Three options were considered for SDK-4507/SDK-4794 cold-start ANRs (the production OTel showed the first
launchOnIO/launchOnSerialIOconstructing theThreadPoolExecutorsynchronously on the main thread):1. Wide — prewarm inside the
ThreadUtilshelpers (suspendifyOnIO,launchOnIO, …). One chokepoint, automatic for all background work.Rejected:
prewarm()is fire-and-forget, so a helper that prewarms and then immediately dispatches can still win the lazy-init race and construct the executor on the caller thread. It made the race more likely won but didn't actually fix it, and coupled every background dispatch to prewarm.2. Durable — make the dispatcher defer cold construction off the caller thread (queue the first dispatch and replay it on the daemon so construction can never land on a caller thread, regardless of timing).
Rejected for now: it's a guaranteed fix, but it adds ~70 lines of lock/queue/
CompletableJobconcurrency code to the most safety-critical file in the SDK, with subtle serial-ordering /join()/ cancellation semantics and a permanent review + maintenance cost. That is a large charge to close a gap we have not proven still fires.3. Chosen — prewarm at the cold-start entry points. The expensive construction can only first hit a thread at a small, stable, enumerable set of doorways (the receivers / jobs / activities / bridges above); deep callers run after init on background threads. Prewarming at each doorway — where there is real lead time before the first dispatch (a
goAsync()handoff orinitWithContextwork) — covers the actual production surface (the OTel ANRs were all at entry points) with ~10 one-line calls and zero new concurrency code.Trade-off accepted: it is best-effort (a pathologically slow device could still lose the race), and a new entry point must remember to call
prewarm()— both documented in theprewarm()KDoc. If telemetry later shows the race still firing, option #2 remains available as a follow-up.Test plan
SyncJobServiceTests—onStartJobcallsprewarm()beforesuspendifyOnIOFCMBroadcastReceiverTests— prewarm fires for real FCM payloads, skipped forgoogle.com/iidtoken-update intentsPrewarmEntryPointTests—BootUpReceiver/UpgradeReceiver/NotificationDismissReceiverprewarm before dispatchOneSignalDispatchersTests— existing off-thread + idempotent prewarm coverage still passesADMMessageHandler(Job)andOneSignalHmsEventBridgehave no unit-test path (the Amazon/Huawei base classes aren't on the test classpath); those calls are verified by inspection.Made with Cursor