Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Rosetta] Prioritize rosetta over qemu-user-static by creating binfmt.d(5) configuration #2474

Merged
merged 5 commits into from
Jul 17, 2024

Conversation

norio-nomura
Copy link
Contributor

This change aims to ensure that Rosetta is prioritized over qemu-user-static when both are installed. The key steps and intentions of this change include:

  1. Creating a binfmt.d(5) Configuration for Rosetta: This configuration ensures that Rosetta is registered with systemd-binfmt, taking precedence over qemu-user-static.

  2. Early Mounting of Rosetta Volume: Using user-data to mount the Rosetta volume earlier in the boot process, preventing errors related to the Rosetta interpreter not being found by systemd-binfmt.service(8).

  3. Duplicate Registration Check: Adding a check to avoid errors from duplicate registrations in binfmt_misc during subsequent startups, ensuring smooth operation when Rosetta is already registered.

The expected methods for installing qemu-user-static are as follows:

  • Running docker run --rm --privileged multiarch/qemu-user-static --reset -p yes on a Docker rootful VM.
  • Executing apt-get install qemu-user-static within the provision script in lima.yaml.

Both methods create configuration files for binfmt.d(5).

Thanks,

@jandubois
Copy link
Member

I haven't read the PR yet, but I have 2 questions so far:

  1. What is the difference between multiarch/qemu-user-static and tonistiigi/binfmt, and what are the reasons to choose one over the other?

  2. What is the effect of this PR on non-systemd based machines (like Alpine)?

@norio-nomura
Copy link
Contributor Author

norio-nomura commented Jul 10, 2024

  1. What is the difference between

tonistiigi/binfmt installs the qemu-user-static binaries, but the binfmt registration needed to use them is lost after a host reboot.
multiarch/qemu-user-static not only installs the qemu-user-static binaries and registers them with binfmt but also offers an option to create configuration files for binfmt.d(5). This ensures that the binfmt registration is persistent even after a host reboot.

what are the reasons to choose one

I am trying to create an aarch64 binary of SwiftLint in my docker-swiftlint repo. While I was able to cross-compile the aarch64 binary using the x86_64 version of the Swift Toolchain, I encountered issues with crashes and hangs when running that aarch64 binary within a Dockerfile using qemu-user-static for emulation. Upon investigation, it seems that this issue occurs generally when running executables generated by Swift (including the Swift Toolchain itself) under qemu-user-static emulation.

To avoid this issue, I tried various methods of installing qemu-user-static other than using the docker/setup-qemu-action (which uses tonistiigi/binfmt). These methods included:

  • Installing via Ubuntu's apt
  • Building and installing from the qemu-9.0.1 source code
  • Installing qemu-9.0.1 from Debian experimental

After none of these methods resolved the issue, I tried using multiarch/qemu-user-static, which was used to create arm64 binaries on ci.swift.org. With this, the error completely disappeared, and the runtime tests passed successfully.

After your question, I investigated which distribution of qemu-user-static caused crashes and hangs with the aarch64 version of SwiftLint. You can find the results here. Sorry for the delayed response.

I'll answer the second question in the next post.

@norio-nomura norio-nomura force-pushed the prioritize-rosetta-over-qemu branch 2 times, most recently from 6f54d36 to 003f82a Compare July 10, 2024 01:56
@norio-nomura
Copy link
Contributor Author

norio-nomura commented Jul 10, 2024

  1. What is the effect of this PR on non-systemd based machines (like Alpine)?

I have fixed the issue with the user-data template causing test failures on Alpine, but it seems one test is still failing due to inability to download the image.

2024-07-10T02:01:59.4749003Z time="2024-07-10T02:01:59Z" level=fatal msg="failed to download "https://github.com/lima-vm/alpine-lima/releases/download/v0.2.39/alpine-lima-std-3.20.0-x86_64.iso\": Get "https://github.com/lima-vm/alpine-lima/releases/download/v0.2.39/alpine-lima-std-3.20.0-x86_64.iso\": dial tcp 140.82.114.4:443: i/o timeout"

Could you please rerun that test?

@vasileknik76
Copy link
Contributor

Another difference between multiarch/qemu-user-static and tonistiigi/binfmt, that tonistiigi uses qemu with patches. One of meaningful patches is set CPU to max instead of default. This allows run programs with AVX instructions.

@norio-nomura
Copy link
Contributor Author

  1. What is the effect of this PR on non-systemd based machines (like Alpine)?

I fixed the mount error that was occurring on Alpine due to this PR.

@norio-nomura
Copy link
Contributor Author

  1. What is the effect of this PR on non-systemd based machines (like Alpine)?

Regarding binfmt, the behavior should remain unchanged on non-systemd machines.

@norio-nomura
Copy link
Contributor Author

I cannot reproduce the kernel panic that occurred in the failed integration test on my local x86_64 machine.
https://github.com/lima-vm/lima/actions/runs/9889834439/job/27316746959?pr=2474#step:11:78

==> /Users/runner/.lima/default/serial.log <==
[ 0.016000] setup_IO_APIC+0x2c3/0x370
[ 0.016000] ? enable_IO_APIC+0x1af/0x290
[ 0.016000] apic_intr_mode_init+0x61/0x130
[ 0.016000] x86_late_time_init+0x24/0x40
[ 0.016000] start_kernel+0x2be/0x450
[ 0.016000] x86_64_start_reservations+0x18/0x30
[ 0.016000] x86_64_start_kernel+0xbf/0x110
[ 0.016000] secondary_startup_64_no_verify+0x184/0x18b
[ 0.016000]
[ 0.016000] ---[ end Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option. ]---

@norio-nomura norio-nomura force-pushed the prioritize-rosetta-over-qemu branch 2 times, most recently from 837b249 to b6580e4 Compare July 13, 2024 00:55
@AkihiroSuda
Copy link
Member

I cannot reproduce the kernel panic that occurred in the failed integration test on my local x86_64 machine. https://github.com/lima-vm/lima/actions/runs/9889834439/job/27316746959?pr=2474#step:11:78

==> /Users/runner/.lima/default/serial.log <==
[ 0.016000] setup_IO_APIC+0x2c3/0x370
[ 0.016000] ? enable_IO_APIC+0x1af/0x290
[ 0.016000] apic_intr_mode_init+0x61/0x130
[ 0.016000] x86_late_time_init+0x24/0x40
[ 0.016000] start_kernel+0x2be/0x450
[ 0.016000] x86_64_start_reservations+0x18/0x30
[ 0.016000] x86_64_start_kernel+0xbf/0x110
[ 0.016000] secondary_startup_64_no_verify+0x184/0x18b
[ 0.016000]
[ 0.016000] ---[ end Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option. ]---

Me too, this issue seems specific to the infra of GHA

Sometime `qemu-user-static` interpreters are registered with `systemd-binfmt.service(8)` by creating configuration files for `binfmt.d(5)`.
By creating a `binfmt.d(5)` configuration for Rosetta, Rosetta will be prioritized over `qemu-user-static` regardless of the execution timing of `systemd-binfmt.service(8)`.
In subsequent startups, Rosetta will already be registered with binfmt_misc at this stage, so a check has been added to prevent errors due to duplicate registrations.

Signed-off-by: Norio Nomura <norio.nomura@gmail.com>
After creating the `binfmt.d(5)` configuration for Rosetta, `systemd-binfmt.service(8)` attempts to register at an earlier stage in subsequent boots. To prevent errors from not finding the Rosetta interpreter, `user-data` is used to mount the Rosetta volume earlier.

Signed-off-by: Norio Nomura <norio.nomura@gmail.com>
Signed-off-by: Norio Nomura <norio.nomura@gmail.com>
Since the mount option workaround for selinux to vz-rosetta will be covered in 05-lima-mounts.sh.

Signed-off-by: Norio Nomura <norio.nomura@gmail.com>
If `binfmt.d/rosetta.conf` already exists, Rosetta might already be registered by the time this script is called. To apply the settings in `lima.yml`, if Rosetta is already registered, the script will remove `binfmt.d/rosetta.conf` and unregister it from `binfmt_misc`.

Signed-off-by: Norio Nomura <norio.nomura@gmail.com>
@norio-nomura
Copy link
Contributor Author

rebased

Copy link
Member

@jandubois jandubois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@jandubois jandubois merged commit 598f45f into lima-vm:master Jul 17, 2024
27 checks passed
@norio-nomura norio-nomura deleted the prioritize-rosetta-over-qemu branch July 17, 2024 01:51
@jandubois
Copy link
Member

Another difference between multiarch/qemu-user-static and tonistiigi/binfmt, that tonistiigi uses qemu with patches. One of meaningful patches is set CPU to max instead of default. This allows run programs with AVX instructions.

And there are additional patches that are only applied to the Docker Desktop version: https://github.com/tonistiigi/binfmt/blob/015fe68a09f75b84e7cf84cc54466d3add6d01c5/docker-bake.hcl#L80-L86

I've not had time to look into why these patches are relevant to Docker Desktop but not to other uses of qemu-static.

I wish somebody could figure out what the ideal version would be for Lima and add it to our docs!

@norio-nomura
Copy link
Contributor Author

Thanks! 🙏

tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Aug 21, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [lima-vm/lima](https://github.com/lima-vm/lima) | minor | `v0.22.0` -> `v0.23.1` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>lima-vm/lima (lima-vm/lima)</summary>

### [`v0.23.1`](https://github.com/lima-vm/lima/releases/tag/v0.23.1)

[Compare Source](lima-vm/lima@v0.23.0...v0.23.1)

#### Changes

-   Fixed the CI to generate the release note ([#&#8203;2555](lima-vm/lima#2555))

#### Usage

```console
[macOS]$ limactl create
[macOS]$ limactl start
...
INFO[0029] READY. Run `lima` to open the shell.

[macOS]$ lima uname
Linux
```

***

The binaries were built automatically on GitHub Actions.
The build log is available for 90 days: https://github.com/lima-vm/lima/actions/runs/10441930092

The sha256sum of the SHA256SUMS file itself is `e93a48f3a011c25367da50ab3609bb28437fcde259371f005f8b234caa46efff` .

***

Release manager: [@&#8203;AkihiroSuda](https://github.com/AkihiroSuda)

### [`v0.23.0`](https://github.com/lima-vm/lima/releases/tag/v0.23.0)

[Compare Source](lima-vm/lima@v0.22.0...v0.23.0)

-   YAML:
    -   Add a `param` field for defining variables ([#&#8203;2498](lima-vm/lima#2498), thanks to [@&#8203;norio-nomura](https://github.com/norio-nomura))

-   vz:
    -   Prioritize rosetta over qemu-user-static ([#&#8203;2474](lima-vm/lima#2474), thanks to [@&#8203;norio-nomura](https://github.com/norio-nomura))
    -   Configura AOT caching options using an abstract socket ([#&#8203;2489](lima-vm/lima#2489), thanks to [@&#8203;norio-nomura](https://github.com/norio-nomura))

-   Templates:
    -   add `alpine-image` ([#&#8203;2360](lima-vm/lima#2360), thanks to [@&#8203;jandubois](https://github.com/jandubois))
    -   remove `centos-stream-8`, `deprecated/centos-7` ([#&#8203;2457](lima-vm/lima#2457))
    -   update to the latest revisions ([#&#8203;2553](lima-vm/lima#2553))

-   Governance:
    -   MAINTAINERS: invite Oleksandr Redko ([@&#8203;alexandear](https://github.com/alexandear)) as a Reviewer  ([#&#8203;2383](lima-vm/lima#2383))

Full changes: https://github.com/lima-vm/lima/milestone/46?closed=1
Thanks to [@&#8203;AdamKorcz](https://github.com/AdamKorcz) [@&#8203;AmedeeBulle](https://github.com/AmedeeBulle) [@&#8203;SmartManoj](https://github.com/SmartManoj) [@&#8203;afbjorklund](https://github.com/afbjorklund) [@&#8203;alexandear](https://github.com/alexandear) [@&#8203;danchr](https://github.com/danchr) [@&#8203;fwilhe2](https://github.com/fwilhe2) [@&#8203;jandubois](https://github.com/jandubois) [@&#8203;norio-nomura](https://github.com/norio-nomura) [@&#8203;tcooper](https://github.com/tcooper) [@&#8203;why168](https://github.com/why168)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy40NDAuNyIsInVwZGF0ZWRJblZlciI6IjM3LjQ0MC43IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants