Add option to set worker healthcheck timeout #2500

Open · wants to merge 1 commit into master

Conversation


@mmac-m3a commented Nov 1, 2024

Summary

Adds a new command-line and config option, worker_healthcheck_timeout, which sets the timeout the supervisor uses when checking worker liveness while multiple workers are in use. The default timeout is unchanged, as is the frequency of health checks.
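
As a rough illustration of the intended usage: the option name above comes from this PR, but the exact CLI flag spelling and the uvicorn.run() keyword below are assumptions, not a confirmed API.

```python
# Hypothetical usage sketch; exact spelling depends on how this PR lands.
# The CLI form would presumably be:
#   uvicorn app.main:app --workers 4 --worker-healthcheck-timeout 30
import uvicorn

if __name__ == "__main__":
    uvicorn.run(
        "app.main:app",                   # made-up app path
        workers=4,                        # health checks only matter with multiple workers
        worker_healthcheck_timeout=30.0,  # proposed option: seconds the supervisor waits for a liveness response
    )
```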

Rationale

Applications with CPU-intensive synchronous startup may starve the worker process of CPU cycles and cause the pong thread to respond too late, which in turn makes the supervisor kill and relaunch the worker.
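
To make the failure mode concrete, here is a rough reproduction sketch (not part of this PR): an app that does several seconds of pure-CPU work at import time, while the worker is still starting, which can delay the liveness response past the supervisor's timeout. The app and workload are made up.

```python
# Made-up reproduction sketch: CPU-heavy synchronous work at module import,
# i.e. while the worker process is still starting up. Under the GIL this can
# starve the health-check (pong) thread long enough to miss the deadline.
def build_lookup_table() -> list[int]:
    return [sum(i * j for j in range(2_000)) for i in range(50_000)]

LOOKUP = build_lookup_table()  # blocks for several seconds before the worker is ready

async def app(scope, receive, send):
    # Minimal ASGI app so the sketch runs under uvicorn without extra dependencies.
    assert scope["type"] == "http"
    await send({"type": "http.response.start", "status": 200,
                "headers": [(b"content-type", b"text/plain")]})
    await send({"type": "http.response.body", "body": str(len(LOOKUP)).encode()})
```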

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

Rationale for no explicit test

I was not able to create a simple unit test that reliably triggers the health check timeout. I can trigger it 100% reliably in my application, and I've also verified that a longer health check timeout resolves the problem.

@omid-jf commented Dec 5, 2024

I was receiving many "Child Process Died" messages in my FastAPI application. Manually increasing the timeout mentioned in this PR fixed my issue.

@racinmat

Hi, is anything blocking this PR from being merged? It would solve #2450 for many cases, because you could pass a higher timeout.

@vjeranc commented May 20, 2025

This was discussed a while ago, and the fix of increasing the timeout was known then, so I'm not sure what the blocker is.

IMO, there should be no pinging or ponging. The OS is not stupid: either a process is suspended or it is not. If a process is stuck, whether in a deadlock or in a CPU-heavy operation, and would therefore be considered "inactive", that's a separate issue.

Anyone hosting web apps should have their own health checks and restart the process/container.

Similarly, worker processes in uvicorn are created with a very expensive fork. It would be better to just invoke uvicorn N times at the web-app level and remove the "workers" concept entirely. The savings at the socket level (all processes reading from the same socket) are abysmal, whereas a cheaper fork would allow a lower memory footprint, a much bigger saving, which is not achieved now.

If a cheaper fork were used instead, it would make some sense to manage process liveness at the uvicorn level. With the current expensive fork, nothing is gained by the workers=N argument; as we see in this case, it just creates more bugs.
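
For reference, a rough sketch of the "invoke uvicorn N times" approach described above; the launcher, app path, and port range are hypothetical. In practice a reverse proxy or load balancer would spread traffic across the ports, and an external supervisor would handle restarts.

```python
# Hypothetical launcher: run N independent uvicorn processes instead of workers=N.
# App path and ports are made up for illustration.
import subprocess
import sys

N = 4
procs = [
    subprocess.Popen(
        [sys.executable, "-m", "uvicorn", "app.main:app", "--port", str(8000 + i)]
    )
    for i in range(N)
]

for proc in procs:
    proc.wait()  # liveness and restarts are left to an external supervisor (systemd, k8s, ...)
```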
