Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git-style globbing #51

Open
wants to merge 2 commits into
base: next
Choose a base branch
from
Open

git-style globbing #51

wants to merge 2 commits into from

Conversation

z33ky
Copy link

@z33ky z33ky commented May 23, 2014

This adds git-style globbing, which also matches / in paths names as requested in #37.

My autogen is a newer version than the one used on the master branch, so I added the update in a separate commit (without the globbing change) in order to supply the changes to the build-files introduced by the globbing addition properly.

z33ky added 2 commits May 23, 2014 19:02
This adds gits wildmatch.c and adapts it to build outside the git-tree.

wildmatch allows the ** wildcard, which also matches / in paths.
@IgnorantGuru
Copy link
Owner

Thanks for contributing. In freeing is_glob_special from the git tree, please explain your replacing:

#define GIT_GLOB_SPECIAL 0x08
const unsigned char sane_ctype[256];
#define GIT_SPACE 0x01
#define GIT_DIGIT 0x02
#define GIT_ALPHA 0x04
#define GIT_GLOB_SPECIAL 0x08
#define GIT_REGEX_SPECIAL 0x10
#define GIT_PATHSPEC_MAGIC 0x20
#define GIT_CNTRL 0x40
#define GIT_PUNCT 0x80
enum {
S = GIT_SPACE,
A = GIT_ALPHA,
D = GIT_DIGIT,
G = GIT_GLOB_SPECIAL, /* *, ?, [, \\ */
R = GIT_REGEX_SPECIAL, /* $, (, ), +, ., ^, {, | */
P = GIT_PATHSPEC_MAGIC, /* other non-alnum, except for ] and } */
X = GIT_CNTRL,
U = GIT_PUNCT,
Z = GIT_CNTRL | GIT_SPACE
};
const unsigned char sane_ctype[256] = {
X, X, X, X, X, X, X, X, X, Z, Z, X, X, Z, X, X, /* 0.. 15 */
X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, /* 16.. 31 */
S, P, P, P, R, P, P, P, R, R, G, R, P, P, R, P, /* 32.. 47 */
D, D, D, D, D, D, D, D, D, D, P, P, P, P, P, G, /* 48.. 63 */
P, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, /* 64.. 79 */
A, A, A, A, A, A, A, A, A, A, A, G, G, U, R, P, /* 80.. 95 */
P, A, A, A, A, A, A, A, A, A, A, A, A, A, A, A, /* 96..111 */
A, A, A, A, A, A, A, A, A, A, A, R, R, U, P, X, /* 112..127 */
/* Nothing in the 128.. range */
};
#define sane_istest(x,mask) ((sane_ctype[(unsigned char)(x)] & (mask)) != 0)
#define is_glob_special(x) sane_istest(x,GIT_GLOB_SPECIAL)

with:

static int is_glob_special(char c)
{
    static const char globs[] = { '$', '(', ')', '+', '.', '^', '{', '|' };
    size_t i;
    for(i = 0; i < sizeof(globs); ++i) {
        if(globs[i] == c)
            return 1;
    }
    return 0;
}

What was your source for your modified is_glob_special? They don't seem equivalent. What behavioral change is introduced?

Also, globs in udevil have a prominent role in security. Has git's wildmatch.c performed in a security-related role, is it well-debugged, etc? Why should I trust this code in an suid program which handles system security via globbing? Thanks.

I have pulled your changes in my git-glob branch for consideration and testing. However, I changed is_glob_special back to the original - is there a reason why this won't work as-is?

@IgnorantGuru
Copy link
Owner

Review Notes:

  • Running 'git help ignore' and reading the PATTERN FORMAT section, it seems this function may do more than handle ** What are the exact differences? Implications?
  • Difference noted above between is_glob_special - which is preferred and why?
  • Pending some of the above, I might be inclined to add this conditionally: if the pattern contains "**", udevil will run the wildmatch function, otherwise it will revert to the current method. Thus use of git-style globs will be optional and exceptional, but we should still make sure the implementation is secure for this use.
  • Lots of string handling code here, which can contain tricky bugs, but this code is not run as root, has been around awhile, and udevil validates all arguments as UTF-8.
  • Is there a simpler function for this support, such as simply accepting a /** suffix as meaning all files beneath? (This would be trivial to implement so long as no other wildcards were accepted in the pattern - simple path prefix test.) Do we really need any other use of ** to warrant introducing so much code and potential behavior change?
  • Which patterns in udevil.conf really need this function?

@OmegaPhil
Copy link

I see IG has already taken this apart, but since I've been asked for input, what is the use case for matching '/'? Do you have some really complicated hierarchies to manage?

@IgnorantGuru
Copy link
Owner

Please see this comment on next branch minimal changes for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants