-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: key components health check #2510
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very good, Ming. Thank you!
logger.exception(e) | ||
|
||
if response.has_error(): | ||
raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=response.model_dump()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we dump this, should the parameter include the error?
e.g.
try:
create_default_folder_if_it_doesnt_exist(session, uuid.UUID(user_id))
response.folder = "ok"
except Exception as e:
logger.exception(e)
response.folder = e.message()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. I will do another PR to improve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This portion of the code needs to be changed. This makes the health check not idempotent and not able to run in parallel, which is problematic in many scenarios. For example, in k8s liveness and readiness probe could run in parallel.
It's ok to write/read to the cache but there's no need to verify the correctness, we should just check the availability.
await chat.set_cache("health_check", str(test_id))
if v := await chat.get_cache("health_check"):
if v.get("result") == str(test_id):
response.chat = "ok"
else:
logger.error("chat service get incorrect value")
- I don't see the value of testing the variable service. If variables are on the db, you're testing the DB (again!) and if they are on k8s it's very expensive and not worth it (if you're on k8s, k8s is up..)
Also you're leaking the variable in case the deletion fails.
I would completely remove the variable section.
- Again for the folder section, there's no need to test the DB connectivity again.
key components health check
key components health check (cherry picked from commit a2ec87b)
Since /health is also supported by uvicorn, so until the langflow's router is initialized.
/health
returns the status of uvicorn. That does not reflect the readiness of the langflow.This PR introduces a new endpoint
/health_check
that also test key services like chat, session, and variable services that make sure cache , DB, or kubernetes backend are working. It offers more reliable service type of check rather a simple route basedhealth
check. Only a few selected services to save the health check time.Once this PR is merged and released, Kubernetes's probe should use the /health_check endpoint.