Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call getpwuid only after chroot is set #3742

Merged
merged 1 commit into from
Feb 7, 2024

Conversation

zedi-pramodh
Copy link

There is a regression in recent commit where getpwuid() was added before chroot is set. Due to that any container which runs with non-root userid will fail to start.

Testing done

Before this fix grafana container which runs with user id 472 was failing to start. It works with this fix.

@eriknordmark
Copy link
Contributor

Can you add a reference to the commit/PR which caused the regression (so it is easier to track and to figure out if this fix needs to be back-ported).

@zedi-pramodh zedi-pramodh added the stable Should be backported to stable release(s) label Feb 6, 2024
@zedi-pramodh
Copy link
Author

Yes this need to be back ported. I labeled as stable.

@zedi-pramodh
Copy link
Author

commit 53e2d24
Author: Shahriyar Jalayeri shahriyar@zededa.com
Date: Wed Oct 25 10:16:56 2023 +0000

initrd : check for errors and do initgroups on chroot

check for errors on security sensitive calls and call initgroups() to remove
root from the new uid supplementary groups.

Signed-off-by: Shahriyar Jalayeri <shahriyar@zededa.com>

Copy link
Contributor

@rouming rouming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original intention of the change was to be sure shim vm has the same user, but let @shjala review that.

My only request is please add a proper commit message describing the problem and what exact commit this fix targets. Thanks.

While starting a cmd from the container rootfs, initrd script tries to chroot to the rootfs and executes the cmd.
The eve specific chroot (chroot2.c) tries to set root to the container rootfs and execute the command.
/chroot2 /mnt/rootfs $ug $pidfile $cmd
The issue here is that chroot2.c is calling getpwuid() on the userid even before setting chroot.
That makes getpwuid to process the user id in eve context than the user container context.
This works fine with userid root, but for non-root users it fails since that user may (will not) be present in eve context.

This fix moves the getpwuid() call after chroot().

Signed-off-by: Pramodh Pallapothu <pramodh@zededa.com>
@zedi-pramodh
Copy link
Author

Updated the commit message

Copy link
Member

@OhmSpectator OhmSpectator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the grafana container mentioned in the PR description? I've tried to find it in the scripts and cannot find it... Should it even use this util? I think the use case should be discussed with @shjala.

Until now the only use case that I'm aware about is here:

eval /chroot2 /mnt/rootfs "${WORKDIR:-/}" $ug $pid_file $cmd 2>&1 | tee -i /dev/hvc0

@zedi-pramodh
Copy link
Author

What is the grafana container mentioned in the PR description? I've tried to find it in the scripts and cannot find it... Should it even use this util? I think the use case should be discussed with @shjala.

Until now the only use case that I'm aware about is here:

I did not understand your comment. Customers can use any container image, grafana is just an app running in the container. It happens that it runs with non-root userid. You can check alpha for that image if you are interested.

@OhmSpectator
Copy link
Member

OhmSpectator commented Feb 6, 2024

Customers can use any container image, grafana is just an app running in the container.

So, is it just a regular container Application deployed on a node?

Copy link

codecov bot commented Feb 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (2dcaf10) 19.69% compared to head (4e48fca) 19.72%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3742      +/-   ##
==========================================
+ Coverage   19.69%   19.72%   +0.02%     
==========================================
  Files         235      235              
  Lines       51708    51708              
==========================================
+ Hits        10185    10198      +13     
+ Misses      40782    40771      -11     
+ Partials      741      739       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zedi-pramodh
Copy link
Author

So, is it just a regular container Application deployed on a node?

Yes.

@OhmSpectator
Copy link
Member

OhmSpectator commented Feb 6, 2024

So, it means EVE has been failing to run any container App since 11.3.0... Wow.
It's good that we do not have the original code in any stable/LTS branch, so no customer is affected, and the backports are not needed.
I still wanna understand how we use this chroot2 tool in Pillar... I didn't know we use it when starting apps. Looks hacky.
UPD... Argh... It's a part of the ramdisk that we use in any container app...

@OhmSpectator
Copy link
Member

Another question: why our regression tests did not detect this problem... @uncleDecart, do you know whether we have any tests for the container-based apps?

@zedi-pramodh
Copy link
Author

Its not all containers fail to start. Only containers that run the apps with non-root user. May be most of the test cases run the apps in the container as root user.

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kick off tests; @shjala should review before we merge this.

@milan-zededa
Copy link
Contributor

Another question: why our regression tests did not detect this problem... @uncleDecart, do you know whether we have any tests for the container-based apps?

@OhmSpectator It was detected but Shah is busy this week with high-priority TPM issues: lf-edge/eden#944

@eriknordmark
Copy link
Contributor

Based on the discussion in lf-edge/eden#944 this fix is needed.
(And we need to get the eden tests to pass regularly to not miss a test failure like this in the noise of other stochastic failures.)

@eriknordmark eriknordmark merged commit f33609a into lf-edge:master Feb 7, 2024
33 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stable Should be backported to stable release(s)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants