-
Notifications
You must be signed in to change notification settings - Fork 687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correctly handle pathnames by closing opened files again #1090
Conversation
b6bf792
to
b8db82a
Compare
Wouldn't it be easier to change Also, please add specs. |
@pointlessone Thanks a lot for the feedback! I absolutely agree, directly reading the content is a lot simpler. Do you know of a way to test open file handles from inside Ruby, without resorting to OS tools? I've added a file double way of testing whether the block syntax is used for File.open, and a version that uses lsof and only runs on systems that use that, but I'm not perfectly happy with either... |
lib/prawn/images.rb
Outdated
end | ||
# String or Pathname | ||
io_or_path = Pathname.new(io_or_path) | ||
raise ArgumentError, "#{io_or_path} not found" unless io_or_path.file? | ||
io = io_or_path.open('rb') | ||
io | ||
File.open(io_or_path, 'rb') do |file_io| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File.read(io_or_path)
Instead of the block open
would work as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would have to be #binread
though, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right. That is closer to the original code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because Pathname
supports a lot of class-level methods from File you could also just call binread
on the Pathname
instance directly
File.open(io_or_path, 'rb') do |file_io| | |
io_or_path.binread |
You could then simplify your no-files-left-open spec by passing a known-non-existent file path in plus a correctly configured test double/spy and verify that the spy had binread.
Something like this:
it 'uses binread, which closes files opened from pathnames' do
pathname_double = instance_double("Pathname", binread: "1234567")
expect(Pathname).to receive(:new).with(filename) { pathname_double }
_info = pdf.image(filename)
expect(pathname_double).to have_received(:binread)
end
Not 100% happy with that, but hopefully you see the point - use of a non-existent path ensures that file system accesses outside calling #binread
on the double would fail with a system error, so you don't need to verify through something like ObjectSpace.
Well, this issue was never considered a serious one because Ruby is quite good a closing files for you. And does so fairly quickly. It does so when it GCs the file object. I imagine, you can write a reliable spec by preventing that from Happening. Let say, you can disable GC for one spec so that opened file object could live long enough for you to do all the assertion you want. |
Unfortunately, it happens to be serious for us because the files we open live on an SMB share, who complains about open files fairly quickly. The processes using Prawn are Rails workers in Passenger, who doesn't appear to close the files until the process dies and gets replaced by a new one, which can be minutes... Both specs are reliable, but I'd prefer to just ask File whether it knows of any open handles. It doesn't look like Ruby has anything of that sort though. |
No, Ruby doesn't keep a list of all open files. You may be observing this more often in Passenger because many web-servers tune GC to not happen during request. However, it should happen in between requests so I don't know why it has to die to free up file handles. Additionally, the files might be retained by something other than requests and so, leaking. |
Leaking is of course possible, but the issue arises in the places where we feed Prawn just pathnames as strings. I'm not sure how long it's being retained, the fact is that it's sometimes minutes. We tried to fix the symptoms by always passing file instances instead of filenames (and then closing those ourselves), but figured that's silly when we can just help fixing the cause instead of digging through all our code and maintaining lists of files in dozens of places... |
2680130
to
4c57766
Compare
I found a way to write the spec that I'm happy with. What do you think, @pointlessone? |
f1d2505
to
2c87a2e
Compare
Isn't |
I wasn't aware. What do you think of running that particular spec only for MRI? |
I'm not opposed to the idea of impl-specific specs. As long as there are corresponding specs for other implementations. |
767bc2b
to
6b3fc50
Compare
I've rebased it all into one commit. Thanks again for your feedback! |
spec/prawn/images_spec.rb
Outdated
it 'closes opened files again after getting Pathname objects' do | ||
if RUBY_PLATFORM == 'java' | ||
# JRuby doesn't support ObjectSpace | ||
expect(RUBY_PLATFORM).to eq('java') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There needs to be a spec here. Otherwise we can't be sure it works on Ruby.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to come up with something... The spec that used lsof
isn't really great either, that doesn't test anything for Windows platforms. Or would that work for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now we don't run tests on Windows but we don't have any platform-specific code either. Everything is Ruby-only so in theory we shouldn't have any major issues with adding Windows. It at all possible I'd prefer it stayed that way.
spec/prawn/images_spec.rb
Outdated
it 'accepts Pathname objects' do | ||
info = pdf.image(Pathname.new(filename)) | ||
expect(info.height).to eq(453) | ||
end | ||
|
||
context 'with File message spy' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This spec feels a bit too coupled to the implementation and doesn't describe the actual intent particularly well but since I can't give any useful advice on how to improve it, let's keep it. However, tag it with issue. Here's an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I don't like testing the implementation (and only implicitly the logic) either. Something like the version with ObjectSpace
but working on every platform would've been awesome...
3cf0b13
to
e91658b
Compare
I think it would make sense to add the For reference, it 'closes opened files again after getting Pathname objects' do
system_has_lsof = system('lsof -v > /dev/null 2>&1')
system_has_grep = system('grep --version > /dev/null 2>&1')
if system_has_lsof && system_has_grep
gc_was_disabled = GC.disable # make sure GC doesn't remove File instance and close file
open_files_before = `lsof -c ruby | grep "#{filename}"`
_info = pdf.image(Pathname.new(filename))
open_files_after = `lsof -c ruby | grep "#{filename}"`
GC.enable unless gc_was_disabled
expect(open_files_after).to eq(open_files_before)
else
expect(system_has_lsof && system_has_grep).to be_falsy # just for Rubocop
end
end ...and it 'closes opened files again after getting Pathname objects' do
if <guard ObjectSpace availability>
gc_was_disabled = GC.disable # make sure GC doesn't remove File instance and close file
open_files_before = ObjectSpace.each_object(File).count { |f| !f.closed? }
_info = pdf.image(Pathname.new(filename))
open_files_after = ObjectSpace.each_object(File).count { |f| !f.closed? }
GC.enable unless gc_was_disabled
expect(open_files_after).to eq(open_files_before)
else
expect(<tautology>).to be_truthy # just for Rubocop
end
end If I add these to the PR, even without corresponding specs when the tools aren't available, the logic is tested from three non-perfect angles instead of just one:
If anybody has an idea for a logic-based cross-platform spec after all, I'm all ears. But without that, personally I'd feel happier with these three specs and a comment explaining their raison d'être than just one... |
Sounds reasonable. However, instead of meaningless expectations in |
d4f0d82
to
7170bed
Compare
spec/prawn/images_spec.rb
Outdated
end | ||
end | ||
|
||
it 'closes opened files, spec with lsof' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An easier approach to these kind of specs is to wrap the definition with the platform check, which means you avoid the sometimes-this-spec-is-pending-but-not-always problem:
describe "situation" do
system_has_lsof = system('lsof -v > /dev/null 2>&1')
system_has_grep = system('grep --version > /dev/null 2>&1')
if system_has_lsof && system_has_grep
it 'closes opened files, spec with lsof' do
...
end
end
if RUBY_PLATFORM != 'java'
it 'closes opened files, spec with ObjectSpace' do
...
end
end
end
2946c13
to
45cd3b9
Compare
The PR is still in the status "changes requested" ... the specs are as fitting as I think is possible now, right? Is there something more I can / should do? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution.
This PR fixes #975 by using a standard
File.open
block for handling path names as images.The io is
yield
ed within that block instead of returned from the method, and thus is correctly closed after it has been used to read from.