COPY static release build artifacts into alpine Docker - DO NOT MERGE #3877

Closed


Contributor

@jkneubuh jkneubuh commented Dec 19, 2022

DO NOT MERGE

Signed-off-by: Josh Kneubuhl <jkneubuh@us.ibm.com>

Type of change

  • New feature

Description

This PR resolves the many permutations of SIGSEGV encountered when running multi-arch Fabric 2.5 binaries in Docker.

Additional details

There have been several unsuccessful efforts to unwind the dependencies between multi-stage Docker builds, alpine libc, pkcs11, and CGO native binaries for Fabric. This has recently appeared as a critical issue for users of Fabric running on an M1 / arm64 system, where the Docker-based builds regularly SIGSEGV due to the linking of golang-alpine's libc/musl into the binaries.

This PR resolves the cross-platform issues by moving the Fabric binary build out of Docker, relying on golang's multi-arch support to prepare statically linked binaries in an upstream step of the build. With this simplified model, the reference / release binaries are simply COPY steps, removing any and all dependencies on alpine's mangled support for libc.
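The host-side build step described above could look roughly like the following. This is a minimal sketch and not the Fabric Makefile: the `release_dir` and `build_static` helper names, the `./cmd/peer` package path, and the use of `CGO_ENABLED=0` are all assumptions made for illustration (binaries that need CGO, such as pkcs11 or sqlite builds, would need a different link setup).

```shell
#!/bin/sh
# Sketch only. Helper names, package path, and CGO_ENABLED=0 are assumptions.

# Directory holding the static binaries for one target, e.g. release/linux-arm64.
release_dir() {
    printf 'release/%s-%s\n' "$1" "$2"
}

# Cross-compile one package on the host with the Go toolchain.
# -trimpath strips local filesystem paths from the resulting binary.
build_static() {
    goos=$1
    goarch=$2
    pkg=$3
    out="$(release_dir "$goos" "$goarch")"
    mkdir -p "$out"
    CGO_ENABLED=0 GOOS="$goos" GOARCH="$goarch" \
        go build -trimpath -o "$out/" "$pkg"
}

# Example call (hypothetical package path):
#   build_static linux arm64 ./cmd/peer
#
# The Docker image then only needs a COPY of that directory, for example
# (destination path is an assumption):
#   COPY release/linux-arm64/ /usr/local/bin/
```

With this layout, nothing is compiled inside the alpine image, so its libc is never linked into the binaries.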

Related issues

Signed-off-by: Josh Kneubuhl <jkneubuh@us.ibm.com>
@jkneubuh jkneubuh requested a review from a team as a code owner December 19, 2022 16:50
@C0rWin
Contributor

C0rWin commented Dec 20, 2022

This PR resolves the cross-platform issues by moving the Fabric binary build out of Docker, relying on golang's multi-arch support to prepare statically linked binaries in an upstream step of the build. With this simplified model, the reference / release binaries are simply COPY steps, removing any and all dependencies on alpine's mangled support for libc.

@jkneubuh so you are basically moving away from reproducible and repeatable builds? Can you please elaborate on why building within Docker can no longer work?

@jkneubuh
Contributor Author

Hi @C0rWin, thank you for the review.

No that is incorrect. This is not moving away from reproducible builds, but is shifting the build pipeline out of Docker and up into the host / reference builder. The problem has to do with golang's (experimental) bundling of libc runtimes in the golang:alpine base images.

There is a discussion thread opened at #3876 with some additional details and elaboration. Can you please check that thread and see if it helps to explain the context for the migration?

@C0rWin
Contributor

C0rWin commented Dec 20, 2022

Hi @C0rWin, thank you for the review.

No that is incorrect. This is not moving away from reproducible builds, but is shifting the build pipeline out of Docker and up into the host / reference builder. The problem has to do with golang's (experimental) bundling of libc runtimes in the golang:alpine base images.

I do not understand, in the discussion you say the following:

make will run: go build ...

make release will run: GOOS=${GOOS} GOARCH=${GOARCH} go build ...

make docker will execute make release and then COPY the static release/${GOOS}-${GOARCH}/ binaries onto Alpine.

Hence the result of the build depends on the host machine where you run make docker, which makes it not reproducible. I think we are running away from fixing the problem by patching the symptom.

It might be time to consider building our own slim image without relying on externals, given that the problem is with the alpine image in use.

@jkneubuh
Contributor Author

The build is still reproducible, relying on the golang system compiler to generate the statically linked binaries. The reference release builds are generated with a specific version of golang, ubuntu, and in the case of the CA (for sqlite), a musl cross-compiler.

Are there specific concerns related to the revision of libc that you are considering? Where are the cases in which the reproducibility of the build is envisioned to cause trouble? I would sincerely like to understand if this is a theoretical point, or if there are areas where the use of a revision-locked build of golang will cause differences in the build outputs.

The idiomatic go practice is to rely on the golang system compiler to generate cross-arch binaries in a predictable and repeatable environment. In which areas will this technique not be applicable to Fabric?

We did look into the use of an alternate base image, based on ubuntu. That is also an option, but has the trade-off of increasing (significantly) the image sizes.
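Whether a revision-locked toolchain actually produces differing outputs across hosts can be checked by comparing digests of the same binary built in two environments. The following is a minimal sketch, assuming GNU `sha256sum` is available (on macOS, `shasum -a 256` is the usual equivalent); the `digest` and `same_build` helper names and the example paths are hypothetical.

```shell
#!/bin/sh
# Sketch only. Helper names and example paths are assumptions.

# Print the SHA-256 digest of a file (first field of the sha256sum output).
digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Succeed only if two build outputs are byte-for-byte identical.
same_build() {
    [ "$(digest "$1")" = "$(digest "$2")" ]
}

# Example (hypothetical paths to the same binary built on two hosts):
#   same_build host-a/release/linux-amd64/peer host-b/release/linux-amd64/peer \
#       && echo "identical" || echo "outputs differ"
```

A mismatch would point to a concrete input that still varies between hosts, which is more actionable than arguing the point in the abstract.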

@C0rWin
Contributor

C0rWin commented Dec 20, 2022

@jkneubuh, I think you are missing my point. Before this change, the compiled binary produced by running make docker was similar across different environments, regardless of the contents of the host machine where you ran it. Now you have changed this, and I do not think this is the right way to solve it.

@jkneubuh
Contributor Author

Hi @C0rWin, respectfully, I don't believe I am missing your point. I understand completely that running the build on the host system is subject to variances between environments, and assert that the use of a reference build system, use of the golang cross-compiler, and official release binaries generated at GH satisfy the goal. My question is: does pinning the revision of the golang runtime on the host system, and optionally applying a libc cross-compiler to pin libc functions at compile time, introduce issues that will cause problems for Fabric? Effectively the builds are reproducible, in that we are relying on the "go way" to enforce equivalence between runtimes. Software projects, for aeons, have been using reference build systems, compiler toolchains, and distribution of static binaries as a mechanism to enforce the reproducibility of builds.

There are several ways to solve the alpine linking problems; the challenge here is to find a compromise that satisfies all of the different ways in which people are building and running Fabric. This is the thread that was being pulled on in the discussion #3876 -- there are many competing interests, not all of which can be solved without compromising some element in the system.

I agree that this is a different technique for establishing consistency of builds, by offloading out to golang. My question still remains: is this a theoretical concern? Or are there issues or places whereby the mechanical migration of the compiler out of Docker will introduce regressions, errors, difficulty in diagnosis, etc?

Two other techniques we could employ to achieve a similar result would be:

  • Re-apply the updates in PR #3872 ("Run a full external link for CGO against musl when building in Docker"), running a full-static link with gcc-musl in the Docker FROM alpine. (This has the side-effect of breaking virtually everything in the pkcs11 builds, which require a static runtime.)

  • Ditch alpine and build FROM ubuntu, UBI, or some other similar base image. (We did try this, and the images are ... large.)

What would you propose as an alternate technique?

@jkneubuh jkneubuh marked this pull request as draft December 21, 2022 12:18
@jkneubuh jkneubuh changed the title from "COPY static release build artifacts into alpine Docker" to "COPY static release build artifacts into alpine Docker - DO NOT MERGE" Dec 21, 2022
@jkneubuh
Contributor Author

Alternate approach: PR #3881
