Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce size of wheels by stripping debug information #108

Merged
merged 1 commit into from
Feb 18, 2022

Conversation

marcelm
Copy link
Contributor

@marcelm marcelm commented Feb 18, 2022

According to marcelm/dnaio#47, it seems you like smaller wheels, so I’d like to suggest using CFLAGS=-g0 to reduce their size.

See pypa/cibuildwheel#331

I’m using this option already in dnaio and Cutadapt and it works well (wheel size is now ~150 kB instead of ~500 kB). I wasn’t able to test this locally for python-isal, so the savings may be smaller, but running strip (which I assume does something similar as -g0 would do) on the uncompressed .so files on the wheels available from PyPI reduces their size by 90%.

It is possible that the option is not necessary or has no effect on Windows and/or macOS; I just added it for completeness.

Checklist

  • Pull request details were added to CHANGELOG.rst
  • Documentation was updated (if needed)

@codecov
Copy link

codecov bot commented Feb 18, 2022

Codecov Report

Merging #108 (addcce8) into develop (0dd01a2) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop     #108   +/-   ##
========================================
  Coverage    95.18%   95.18%           
========================================
  Files            2        2           
  Lines          249      249           
  Branches        65       65           
========================================
  Hits           237      237           
  Misses           5        5           
  Partials         7        7           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0dd01a2...addcce8. Read the comment docs.

@rhpvorderman
Copy link
Collaborator

rhpvorderman commented Feb 18, 2022

@marcelm This is very important when building containers. Having unhelpful information in the binaries bloats the binary. I expect exactly nobody who is installed python-isal to run gdb on it, so -g0 makes sense. Thanks!

I checked and it seems that CFLAGS does not override any options when building, but merely adds the flags to the compile command. That is a good thing, that limits any unforeseen side effects.

I did some size comparisions:
Normal compilation:

904K    src/isal/igzip_lib.cpython-39-x86_64-linux-gnu.so
28K     src/isal/_isal.cpython-39-x86_64-linux-gnu.so
944K    src/isal/isal_zlib.cpython-39-x86_64-linux-gnu.so

With CFLAGS=-g0

848K    src/isal/igzip_lib.cpython-39-x86_64-linux-gnu.so
20K     src/isal/_isal.cpython-39-x86_64-linux-gnu.so
860K    src/isal/isal_zlib.cpython-39-x86_64-linux-gnu.so

Better! No let's run strip on it

252K    src/isal/igzip_lib.cpython-39-x86_64-linux-gnu.so
16K     src/isal/_isal.cpython-39-x86_64-linux-gnu.so
260K    src/isal/isal_zlib.cpython-39-x86_64-linux-gnu.so

That's quite a difference! Note how for the very very simple _isal module, which only stores a few constants, the difference is not that bug (40% reduction from original) while for the parts that are statically linked to isa-l it is a huge difference. I wonder what 'useless' parts strip removes.

I will do some investigating before I merge this. There is no haste as I am waiting for the next release of ISA-L to release 1.0.0.

@marcelm
Copy link
Contributor Author

marcelm commented Feb 18, 2022

(It’s mentioned in the linked issue:) It appears that -g0 helps more more with Cython extensions. Possibly you want to try LDFLAGS="-Wl,--strip-all" or LDFLAGS="-Wl,--strip-debug". I think running strip is equivalent to passing --strip-all to the linker.

@rhpvorderman
Copy link
Collaborator

My bad! I still had the build cache on for python-isal (that way it only builds ISA-L once). The build options are passed trough to ISA-L. So I rebuild ISA-L as well now with -g0:

364K    src/isal/igzip_lib.cpython-39-x86_64-linux-gnu.so
20K     src/isal/_isal.cpython-39-x86_64-linux-gnu.so
376K    src/isal/isal_zlib.cpython-39-x86_64-linux-gnu.so

Much better!

Strip looks even better though, but I do need a better understanding of this before I dare to apply it. For now -g0 seems to have most of the gains already. I will merge this.

@rhpvorderman rhpvorderman merged commit fb70d0e into develop Feb 18, 2022
@rhpvorderman
Copy link
Collaborator

An excellent addition, thanks @marcelm !

@rhpvorderman rhpvorderman deleted the smaller-wheels branch February 18, 2022 11:14
@marcelm
Copy link
Contributor Author

marcelm commented Feb 18, 2022

Great! Then the savings are similar to what I see in Cutadapt and dnaio.

@rhpvorderman
Copy link
Collaborator

For posteriety this does not work well with the windows build of ISA-L. So the flag was removed for Windows.

@rhpvorderman
Copy link
Collaborator

I also checked LDFLAGS="-Wl,--strip-debug" now. That does not matter for the size as ISA-L is already build with the same settings as the extension. (with -g0 so there is no debugging information). Stripping of the debug information at the linking stage is therefore not necessary.

As for strip, it removes the complete symbol table and relocation information. I have no idea what the impact of that is. I do know that numpy is content with just stripping of the debug information so I will keep it at that too. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants