Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lifecycle control #965

Merged
merged 5 commits into from
May 20, 2016
Merged

Fix lifecycle control #965

merged 5 commits into from
May 20, 2016

Conversation

tagomoris
Copy link
Member

This change includes some changes and bug fixes about lifecycle of process and plugins.

  • plugins should call super in start and shutdown
  • plugins sometimes raise non-standard errors: these should be rescued and logged
  • ServerEngine sends SIGTERM to stop worker process, but it wasn't rescued
    • shutdown was called in ensure clause, and it invokes shutdown sequence unexpectedly

This change also includes removing default_loop from Engine. It wasn't used in anywhere, so I simplified main loop by removing it.

@sonots
Copy link
Member

sonots commented May 18, 2016

I never rescue Exception (non StandardError) because we can do nothing for NoMemoryError, SystemStackError, SystemExit, and so on.

plugins sometimes raise non-standard errors: these should be rescued and logged

What did they raise? I guess they are doing something wrong.

@tagomoris
Copy link
Member Author

tagomoris commented May 18, 2016

@sonots I did it mainly for SignalException, which can be raised from anywhere. And I also thought about kind of ScriptError (we can raise it by eval at anywhere in ruby).

@sonots
Copy link
Member

sonots commented May 18, 2016

@tagomoris Then, I think we should rescue only SignalException and StandardError.

@tagomoris
Copy link
Member Author

@sonots If plugin.shutdown occurs infinite recursion call in 3rd party plugins, then it raises SystemStackError, but it can be rescued and we can execute shutdown for another plugins.
Same with exit 0 in #shutdown of 3rd party plugins. Isn't it?

@sonots
Copy link
Member

sonots commented May 19, 2016

Do we rescue exit? If I write exit 0 on plugin (I never wite though), I expect that fluentd dies immediately without rescuing.

@sonots
Copy link
Member

sonots commented May 19, 2016

Also, it looks root_agent.rb#L166 does not propagate error, so it recsues and ignores SystemExit. Ignoring SystemExit is not intuitive for me.

@tagomoris
Copy link
Member Author

Yes, that is what I will do. I think that we should not forgive plugins to exit.
When a plugin will do exit, other plugins may have memory buffer with events which will be lost with exit.

@sonots
Copy link
Member

sonots commented May 19, 2016

Someone may want to write exit just for developing purpose. Non standard, non-intuitive behavior brings troubles.

@sonots
Copy link
Member

sonots commented May 19, 2016

@repeatedly
Copy link
Member

plugins should be sandbox so calling exit should not affect other plugin behaviour.
If we support exit like feature, it should be provided via plugin helper like mechanizm.

@repeatedly
Copy link
Member

repeatedly commented May 19, 2016

If we go with rescue Exception, we should write a comment because other developers can't understand why catch all exceptions instead of standard recoverable errors.

@tagomoris
Copy link
Member Author

Generally speaking, I say "Don't rescue Exception" if this is a Rails application.
But this is middleware, and making data safe is most important.

How many non-standard exceptions If we don't rescue Exception? Non-standard exceptions are:

  • NoMemoryError
  • ScriptError (LoadError, NotImplementedError, SyntaxError)
  • SignalException
  • SystemExit
  • SystemStackError
  • fatal

#shutdown of plugins will do:

  • infinite recursive call (SystemStackError)
  • unexpected exit (SystemExit)
  • unexpected deletion of signal handler (SignalException)
  • evaluate script with syntax error or something else (ScriptError)
  • huge object generation in #write at shutdown (NoMemoryError)

@tagomoris
Copy link
Member Author

@repeatedly okay, will do.

@sonots
Copy link
Member

sonots commented May 19, 2016

I think that we should not forgive plugins to exit.

I agree this. So, I think we do not need to take care of SystemExit. Rescuing SystemExit rather looks as forgiving plugins to call exit as a API design.

@tagomoris
Copy link
Member Author

I meant that "we should not forgive plugins to exit entire Fluentd process", not about just calling "exit" method.

@sonots
Copy link
Member

sonots commented May 19, 2016

Okay, you meant that API supports (takes cares safely) plugins call exit method. If it is the design, rescuing SystemExit is okay. But, I also think we need to document as repeatedly says.

@tagomoris
Copy link
Member Author

I added a comment.

* ServerEngine just send SIGTERM, but it was not handled appropriately
  * default signal handler raises SignalException with SIGTERM from `@default_loop.run`
  * and then, `shutdown` in ensure clause kicked (almost unexpectedly)
@tagomoris tagomoris merged commit 5512f83 into master May 20, 2016
@tagomoris
Copy link
Member Author

Merged.

@tagomoris tagomoris deleted the fix-lifecycle-control branch June 2, 2016 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants