Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No uncaught service exceptions are logged out of the box. #4324

Closed
trustin opened this issue Jun 28, 2022 · 5 comments · Fixed by #4687
Closed

No uncaught service exceptions are logged out of the box. #4324

trustin opened this issue Jun 28, 2022 · 5 comments · Fixed by #4687
Labels
discussion new feature sprint Issues for OSS Sprint participants
Milestone

Comments

@trustin
Copy link
Member

trustin commented Jun 28, 2022

When a new user writes an Armeria server, the user often doesn't realize they have to add LoggingService to see the actual exception in the log. This default behavior can be very confusing. It would be nice if we can log the uncaught service exceptions by default somehow without worrying too much about double notification.

One idea:

  • Add a new writable property RequestContext.shouldLogUncaughtException which is true by default.
  • Modify LoggingClient and LoggingService to set false to shouldLogUncaughtException.

Thoughts?

@ikhoon
Copy link
Contributor

ikhoon commented Jun 28, 2022

Overall it is a useful feature. That being said, as an error of a client request is propagated via the returned HttpResponse, I prefer to add shouldLogUncaughtException only to ServiceRequestContext.

@alexc-db
Copy link
Contributor

This uncaught exception log can be setup as a periodic task that logs summary of the uncaught exceptions, something like

[LOG_LEVEL] Observed X uncaught exceptions in last PERIOD seconds. If you don't see error logs - please use LoggingService decorator.

LOG_LEVEL and PERIOD can be configured via server builder, use e.g. ERROR and 60 by default, respectively. This way users can opt out by changing log level or setting PERIOD to e.g. -1.

Or, if we'd like to generalize this even more - we could have a log message that returns multiple stats like that

[INFO] Server stats:
...
- Unhandled errors: X (LoggingService disabled)
...

In my experience periodic summaries like that are super useful for debugging because they can store detailed internal state summary for the service, easy to find, provides insight even when other metrics (e.g. Prometheus) are missing.

@ikhoon ikhoon modified the milestones: 1.17.0, 1.18.0 Jun 29, 2022
@trustin
Copy link
Member Author

trustin commented Jun 29, 2022

If we log the counter message, we could uncount the uncaught exceptions whose ctx.shouldLogUncaughtException is false to make the counter log message disappears when all ervices are covered by LoggingService or other handlers that call ctx.setShouldLogUncaughtException(false).

Given logging every uncaught exceptions can be very spammy, we could:

  • Log the counter message as suggested by @alexc-db; and
  • Maybe log the most recently uncaught exception as an example when logging the counter? (This should be useful when a user is testing their server.)

@ikhoon ikhoon modified the milestones: 1.18.0, 1.19.0 Jul 22, 2022
@ikhoon ikhoon modified the milestones: 1.19.0, 1.20.0 Aug 30, 2022
@minwoox minwoox modified the milestones: 1.20.0, 1.21.0 Sep 29, 2022
@ikhoon ikhoon modified the milestones: 1.21.0, 1.22.0 Dec 12, 2022
@jrhee17 jrhee17 modified the milestones: 1.22.0, 1.23.0 Jan 27, 2023
@trustin trustin added the sprint Issues for OSS Sprint participants label Feb 16, 2023
@seonWKim
Copy link
Contributor

seonWKim commented Feb 17, 2023

Hi, I've few questions while trying to solve this issue.

  • If we have to add shouldLogUncaughtException to the ServiceRequestContext(which is a public interface), we have to add some getter and setter methods for shouldLogUncaughtException. Will it be okay to add methods on this public interface?
    • ex) boolean shouldLogUncaughtException() or void setShouldLogUncaughtException(boolean value)
    • There's only 2 implementations of ServiceRequestContext in armeria project so it's not a lot of work to add new methods to implementations.
  • Should we report uncaught exceptions periodically by using some sort of daemon threads?
    • Or how about logging uncaught exceptions information when a new exception occurs and some specific time has passed after last logged time? This will be less noisy and log only when some exceptions occur.

@trustin
Copy link
Member Author

trustin commented Feb 17, 2023

  • Yes, you can add the public methods.
  • You don't need a dedicated thread for doing this. You can get one of the boss or worker event loop from ServerConfig when Server starts itself and schedule a periodic task using scheduleWithFixedDelay(). You need to keep the ScheduledFuture returned by scheduleWithFixedDelay() so you can cancel it when Server stops itself.

minwoox pushed a commit that referenced this issue Mar 27, 2023
#4687)

Motivation:

If `LoggingService` is not added to services, exceptions(without appropriate exception handlers) thrown by those services are not caught. This default behavior can be confusing for users because, even though exceptions are being thrown, there are no logs to indicate that anything is wrong.
The purpose of this feature is to enable the reporting of unhandled exceptions by default, so that users can be alerted to any issues and take appropriate action.

Modifications:

- `com.linecorp.armeria.defaultUnhandledExceptionsReportIntervalMillis` vm option can be used to configure intervals between reporting unhandled exceptions in milliseconds. 
- Add `shouldReportUnhandledExceptions` to `DefaultServiceRequestContext` which determines whether to log unhandled exceptions.
- Add `ExceptionReportingServerErrorHandler` to report unhandled exceptions. 

Result:

- [Closes #<4324>](#4324). (If this resolves the issue.)
- User will be able to check the number of uncaught exceptions and thrown exception periodically by default. 
- If multiple servers with same `MeterRegistry` exist, `armeria.server.exceptions.unhandled` metrics will be the sum of all the number of unhandled exceptions on those servers. However, individual servers will be reported with logs that include total number of unhandled exceptions specific to that server only. 

<b>Example</b>
- User can change default settings by using `ServerBuilder` 
```
final ServerBuilder sb = Server.builder();
sb.http(port);
sb.unhandledExceptionsReportIntervalMillis(1000L);
configureServices(sb);
return sb.build();
```
- Result 
```
[armeria-common-worker-nio-2-1] WARN com.linecorp.armeria.server.ExceptionReportingServerErrorHandler - Observed 1 unhandled exceptions in last 1000ms(1000000000ns). Please consider adding the LoggingService decorator to get detailed error logs. One of the thrown exceptions:
example.armeria.server.annotated.ExceptionHandlerService$LocallySpecificException
	at example.armeria.server.annotated.ExceptionHandlerService.exception1(ExceptionHandlerService.java:36)
	at com.linecorp.armeria.internal.server.annotation.AnnotatedService.invoke(AnnotatedService.java:355)
	at com.linecorp.armeria.internal.server.annotation.AnnotatedService.lambda$serve0$8(AnnotatedService.java:324)
	at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:684)
...
```
Closes #4324
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion new feature sprint Issues for OSS Sprint participants
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants