Use non-local OS error strings (en-US) #34422

Closed

liigo
Contributor

@liigo liigo commented Jun 23, 2016

Closes #34318

@rust-highfive
Collaborator

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@tbu-
Contributor

tbu- commented Jun 23, 2016

Is there precedent for other programming languages doing this? Off the top of my head, I only know that Python doesn't.

I don't think this is a good idea; we shouldn't impose English on the user. If you have, say, a German Windows installation, you expect all strings from the operating system to be in German.

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jun 23, 2016
@liigo
Contributor Author

liigo commented Jun 24, 2016

@tbu- Yes, I'm Chinese, and I would prefer reading OS error strings in Chinese. But Rust doesn't actually print OS error strings in Chinese; it prints them as \u{xxxx}\u{yyyy}..., which I can't read. English is at least better than \u{xxxx}\u{yyyy}.... See #34318.

@tbu-
Contributor

tbu- commented Jun 25, 2016

Maybe we should instead escape fewer characters in the Debug implementation of &str.

@retep998
Member

I'd rather disable escaping only for OS error strings specifically, not for all strings.

@tbu-
Contributor

tbu- commented Jun 25, 2016

@retep998 I believe Python has a good strategy here, and they don't escape most Unicode symbols, except for weird ones like zero-width space (U+200B):

Python 3.5.1 (default, <timestamp>) 
[GCC 5.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> "ä", "\u00e4" , "\u200b"
('ä', 'ä', '\u200b')

Equivalent Rust:

fn main() {
    println!("{:?} {:?}", "ä", "\u{200b}");
}

Outputs:

"\u{e4}" "\u{200b}"
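
For comparison, here is a minimal sketch of the two extremes that do not depend on how `Debug` behaves: `Display` (`{}`), which writes the string unchanged, and `str::escape_default`, which escapes everything outside printable ASCII and so produces the same escapes as the output above. The helper names `shown` and `ascii_escaped` are hypothetical and only serve the illustration; the sketch assumes a toolchain where `str::escape_default` is available.

```rust
/// Hypothetical helper: render a string through `Display` (`{}`),
/// which never escapes anything.
fn shown(s: &str) -> String {
    format!("{}", s)
}

/// Hypothetical helper: escape every character outside printable ASCII
/// using the standard `str::escape_default` iterator.
fn ascii_escaped(s: &str) -> String {
    s.escape_default().to_string()
}

fn main() {
    // Display keeps the letter; escape_default turns it into an escape.
    println!("{} {}", shown("ä"), ascii_escaped("ä")); // prints: ä \u{e4}
    // The zero-width space is escaped as well.
    println!("{}", ascii_escaped("\u{200b}")); // prints: \u{200b}
}
```

The question in this thread is where `{:?}` should sit between these two behaviours.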

@alexcrichton
Member

Thanks for the PR @liigo! The libs team got a chance to talk about this today and the conclusion was that we're going to go with a solution like #34485 instead to keep localized error strings but try to escape fewer characters.

bors added a commit that referenced this pull request Jul 28, 2016
Escape fewer Unicode codepoints in `Debug` impl of `str`

Use the same procedure as Python to determine whether a character is
printable, described in [PEP 3138]. In particular, this means that the
following character classes are escaped:

- Cc (Other, Control)
- Cf (Other, Format)
- Cs (Other, Surrogate), even though they can't appear in Rust strings
- Co (Other, Private Use)
- Cn (Other, Not Assigned)
- Zl (Separator, Line)
- Zp (Separator, Paragraph)
- Zs (Separator, Space), except for the ASCII space `' '` `0x20`

This allows for user-friendly inspection of strings that are not
English (e.g. compare `"\u{e9}\u{e8}\u{ea}"` to `"éèê"`).

Fixes #34318.
CC #34422.

[PEP 3138]: https://www.python.org/dev/peps/pep-3138/
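
The effect of the rule in this commit message can be checked with a short sketch, assuming a toolchain that includes the change. The helper `debug_repr` is hypothetical and simply wraps `format!("{:?}", ...)`; the three inputs cover a printable non-ASCII string, a Cf character, and the exempt ASCII space.

```rust
/// Hypothetical helper: the `Debug` (`{:?}`) rendering of a string.
fn debug_repr(s: &str) -> String {
    format!("{:?}", s)
}

fn main() {
    // Printable non-ASCII letters pass through unescaped.
    println!("{}", debug_repr("éèê")); // prints: "éèê"
    // U+200B ZERO WIDTH SPACE is in category Cf, so it stays escaped.
    println!("{}", debug_repr("\u{200b}")); // prints: "\u{200b}"
    // The ASCII space is the one Zs character that is exempt.
    println!("{}", debug_repr("a b")); // prints: "a b"
}
```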