Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat extern statics just like statics in the "const pointer to static" representation #67630

Merged
merged 1 commit into from
Jan 9, 2020

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Dec 26, 2019

fixes #67612

r? @spastorino

cc @RalfJung this does not affect runtime promotion at all. This is just about promotion within static item bodies.

@oli-obk oli-obk added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Dec 26, 2019
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 26, 2019
@Centril Centril added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Dec 26, 2019
@RalfJung
Copy link
Member

I don't understand what this does, but intuitively it makes sense to me that extern statics would be more like static mut than static. So why is it okay to let the two diverge here?

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 26, 2019

I don't understand what this does, but intuitively it makes sense to me that extern statics would be more like static mut than static. So why is it okay to let the two diverge here?

And this is where I learn that there is such a thing as extern { static mut FOO: T; }. I thought these were just immutable ones. Looking at

fn static_mutability(tcx: TyCtxt<'_>, def_id: DefId) -> Option<hir::Mutability> {
match tcx.hir().get_if_local(def_id) {
Some(Node::Item(&hir::Item { kind: hir::ItemKind::Static(_, mutbl, _), .. }))
| Some(Node::ForeignItem(&hir::ForeignItem {
kind: hir::ForeignItemKind::Static(_, mutbl),
..
})) => Some(mutbl),
Some(_) => None,
_ => bug!("static_mutability applied to non-local def-id {:?}", def_id),
}
}
makes this PR perfectly alright (albeit by accident).

We now treat extern static mut just like static mut and extern static just like static.

@RalfJung
Copy link
Member

So extern static is guaranteed to be read-only (except for interior mutability)?

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 27, 2019

So extern static is guaranteed to be read-only (except for interior mutability)?

I... have no clue. Neither the nomicon (https://doc.rust-lang.org/nomicon/ffi.html?highlight=static#accessing-foreign-globals) nor the reference (https://doc.rust-lang.org/nightly/reference/items/external-blocks.html?highlight=extern#external-blocks) helped me clear this up.

What I think should be true: It's up to the user to guarantee this. If C code modifies an extern static, that's UB. C code may modify extern statics and Rust code must be able to handle this by choosing appropriate interior mutability types. This may be confusing to users and there may be much silent UB out there, because they think the unsafe block around reading from extern static is what "permits" you to read from a changing static item. But the unsafe block only exists due to it being unclear whether there is a valid bit pattern for the type at the statics address.

What I know is true: It's unsafe to read from extern static and impossible to write to extern static. It's unsafe to read and write from/to extern static mut.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 27, 2019

The issue fixed by this PR shows that rustc thought it is legal to create an immutable reference to an extern static just like it is legal to do so to a normal static. Inside the initializer of a static, these even got promoted (which is what this PR fixes so it happens again).

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 27, 2019

Alternatively we can teach promotion to promote &*raw_ptr "reborrows" within statics and not just &*reference reborrows.

@RalfJung
Copy link
Member

RalfJung commented Dec 30, 2019

It seems reasonable for rustc to (unsafely) permit creating shared references to extern static if they should be immutable... right? Well at least that is consistent with what you wrote above. So I guess this change makes sense, too.

Cc @rust-lang/lang

@Centril Centril added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Jan 1, 2020
@nikomatsakis
Copy link
Contributor

I agree with @oli-obk's summary. I do think we should clearly write this up -- probably in the reference and nominicon too.

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 2, 2020

I'll adjust the reference and nomicon to clear this up

@nikomatsakis
Copy link
Contributor

We looked at this in the @rust-lang/lang team meeting but we were feeling a bit confused, actually -- @oli-obk maybe you can step back and give just a bit more context about this PR? e.g., what is the behavior now that you are changing?

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 2, 2020

This is a bugfix for a wrong assumption I made when I wrote that code. I assumed that extern statics are inherently unsafe to access, not just because of type punning, but also because the underlying static may change its value even though it's an immutable static from the rust perspective. The C code defining the extern static may change the static and the rust code would have to be able to cope with that. Raw pointers allow this, because they require the user to uphold the aliasing invariants themselves. So I chose to make any access to an extern static (so any user mention of EXTERNSTATIC) become *&raw EXTERNSTATIC. This caused a bug when such a static was used inside an aggregate which itself was promoted. Before my PR this used to work, with this PR it will work again. The regression this PR caused was entirely about promoting values accessing extern statics

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 2, 2020

Taking the example from the issue:

extern "C" {
    static X: i32;
}

static mut FOO: *const &'static i32 = [unsafe { &X }].as_ptr();

is essentially

extern "C" {
    static X: i32;
}

static mut FOO: *const &'static i32 = {
    const CONST_PTR_TO_X: *const i32 = &X;
    let _tmp0: *const i32 = CONST_PTR_TO_X;
    let _tmp1: &'static i32 = unsafe { &*X };
	let _tmp2: [&'static i32; 1] = [_tmp1];
    let _tmp3: &[&'static i32; 1] = &_tmp2;
    let _tmp4: *const &'static i32 = _tmp3.as_ptr();
    _return_location = _tmp4;
    StorageDead(_tmp4);
    StorageDead(_tmp3);
    StorageDead(_tmp2); // this is where the pointer becomes dangling
    StorageDead(_tmp1);
};

Before #66587 the code was

extern "C" {
    static X: i32;
}

static mut FOO: *const &'static i32 = {
    let _tmp1: &'static i32 = unsafe { &X };
	let _tmp2: [&'static i32; 1] = [_tmp1];
    let _tmp3: &[&'static i32; 1] = &_tmp2;
    let _tmp4: *const &'static i32 = _tmp3.as_ptr();
    _return_location = _tmp4;
    StorageDead(_tmp4);
    StorageDead(_tmp3);
    // by not emitting a StorageDead here, we leak the value
    // to the const evaluator, which will intern all memory
    // reachable from the `_return_location` value and then drop
    // all leftover memory.
    StorageDead(_tmp1);
};

The reason we started emitting a StorageDead is that promotion found the access to X to be of the form

    const CONST_PTR_TO_X: *const i32 = &X;
    let _tmp0: *const i32 = CONST_PTR_TO_X;
    let _tmp1: &'static i32 = unsafe { &*X };

and dereferencing raw pointers will stop promotion from happening. I believe that my decision in #66587 to use raw pointers was wrong to start out with.

@spastorino
Copy link
Member

spastorino commented Jan 3, 2020

@oli-obk I've approved but didn't r+ given that you see you're discussing the issue with t-lang. If everyone is happy feel free to r=me.

BTW, once this is in we can place back the zst perf optimizations we were trying to Place on #67000 that we needed to revert when we hit this issue.

@nikomatsakis
Copy link
Contributor

@oli-obk I see, so you see this as regarding the status quo, in other words. I'm inclined to agree.

I do think we should document this requirement. The current reference https://doc.rust-lang.org/nightly/reference/items/external-blocks.html doesn't really say much one way or the other.

I do remember that we added the code to make them unsafe to access relatively late, which implies that (initially) we thought of them exactly as statics. Might be interesting to find the PR that did that, but it'd require some significant archaeology.

If I understand, essentially we want to say the following:


An "extern static" declares a static variable found in external code, typically C. Note that extern statics must meet the same requirements as ordinary Rust statics, and thus that

  • They must be initialized before they are used.
  • They must have an accurate Rust type that is compatible with the C type.
  • They must not be mutated once initialized, apart from interior mutability.

Does that sound correct? Each of that statements would ideally link to a section that explains what it means, but for now perhaps we can link to issues on the unsafe code guidelines repository.

@RalfJung
Copy link
Member

RalfJung commented Jan 6, 2020

They must be initialized before they are used.

Initializing them is a kind of use, though, isn't it?^^
Rust statics always have an initial value baked into the binary. So I guess we should require something like "they must be initialized before the first Rust function runs" or so?

@nikomatsakis
Copy link
Contributor

Rust statics always have an initial value baked into the binary. So I guess we should require something like "they must be initialized before the first Rust function runs" or so?

Yes, better. This is what I meant.

@nikomatsakis
Copy link
Contributor

Well, actually, there was this question that I kind of ignored but did ponder in the back of my head. Must they be initialized before you invoke Rust code -- or just before Rust code tries to access them...? Probably safe enough to say "before invoking Rust code that could access them" or something like that.

@RalfJung
Copy link
Member

RalfJung commented Jan 6, 2020

Yeah, good question. I don't know the answer. That's why I picked the conservative option.

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Jan 6, 2020

Would we need to be careful around these extern statics, and for example not apply dereferenceable to them (presuming such an attribute exists on statics, I'm not sure).

I believe the conservative option indicates that code like the following is not fine, and as such we are permitted (for example) to move the access of the static up above the C call. That might not be very intuitive, but would be fine I think (since the access is unsafe). Maybe we can call it out explicitly in documentation, though.

However, AFAICT, the proposed conservative definition essentially makes any Rust library that uses such an extern static generally unusable -- it must declare to the world that it cannot be packaged or otherwise used "inside C" or "as if it was C code" -- right? That feels not great, but maybe there's not much we can do without relaxing our proposed definition too much.

extern {
    static SOME_STATIC: &u32;
}

fn foo() -> Option<u32> {
    if call_c_init() {
         // okay, C initialized it
         *SOME_STATIC
    } else { None }
}

@RalfJung
Copy link
Member

RalfJung commented Jan 6, 2020

@Mark-Simulacrum Why would one not use static mut here, or AtomicPtr<i32>, or so? Clearly that static is mutated, so it ought to be marked as such.

I don't know how static initializers in C work, but presumably their constructors run before main, so as long as no Rust code gets run during the constructor phase, a truly immutable C static should be fine for Rust. If the static is mutated, on the other hand, then why would should we allow lying about that?

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 8, 2020

Would we need to be careful around these extern statics, and for example not apply dereferenceable to them (presuming such an attribute exists on statics, I'm not sure).

That is essentially what I thought first, and why we are in this situation. I made uses of extern statics be a raw pointer deref (not "just like", but really, the MIR of a raw pointer deref and of an extern static access were 100% equivalent before this PR).

But considering the fact that there's lots of user code out there that creates &'static T references to extern statics (unsafely creating them of course) and then exposes them safely, I think the ship has sailed.

I see, so you see this as regarding the status quo, in other words.

Yes, see above. We can discuss whether to break this, whether to start start linting it or whether it's correct, but I think we should merge this PR sooner than later because otherwise we turn a beta regression into a stable regression just because I made an uninformed decision in a previous PR.

@petrochenkov
Copy link
Contributor

I do remember that we added the code to make them unsafe to access relatively late, which implies that (initially) we thought of them exactly as statics. Might be interesting to find the PR that did that, but it'd require some significant archaeology.

#36173

@Mark-Simulacrum
Copy link
Member

@oli-obk I guess the distinction is that extern statics can't be dereferenced "spuriously" (i.e. by the compiler) whereas normal statics can be, right? The user can of course go to a reference if they want.

I think I'm mostly in favor of otherwise treating them purely as statics. I think going with "they're literally just statics" might also make sense (per what @RalfJung notes - if you need special behavior, use static mut).

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 8, 2020

I guess the distinction is that extern statics can't be dereferenced "spuriously" (i.e. by the compiler) whereas normal statics can be, right? The user can of course go to a reference if they want.

Well, since even writing &EXTERN_STATIC is unsafe, you were never able to create references (or raw pointers for that matter) to extern statics without unsafe code. If you never write &STATIC then you would never get "spurious" reads even from normal statics, because the static would be unreferenced in all sources.

The only reason extern statics are unsafe (as far as I can tell from historic PRs and issues) is that they could be used to transmute between types. I was unable to find a discussion saying (or even hinting at) that immutable extern statics can in fact be mutable and should thus be treated as a fancy way to write *(ADDRESS_OF_EXTERN_STATIC as *const T).

@Mark-Simulacrum
Copy link
Member

I guess it depends on if we want to go with Niko's proposal of "must be initialized" exactly like normal statics. I myself am fine with that proposal, I guess, though like @RalfJung I think I'm not entirely certain exactly what that means (i.e. any rust code or not is the question). But maybe it's not an important distinction - i.e. a definition that they must be essentially in rodata in the executable is fine.

@RalfJung
Copy link
Member

RalfJung commented Jan 8, 2020

If you never write &STATIC then you would never get "spurious" reads even from normal statics, because the static would be unreferenced in all sources.

I don't think that is decided yet. We permit spurious reads from &i32, which not also from static FOO: i32?

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 8, 2020

Well I can also implement a conservative alternative keeping the pointer representation but expanding promotion to allow borrows of extern statics

@RalfJung
Copy link
Member

RalfJung commented Jan 8, 2020

No, I think making static and extern static the same here makes sense. But these spurious reads are the justification for why I said "have to be initialized before any Rust code runs".

@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 8, 2020

ok XD then I declare the discussion as resolved and will open the reference and nomicon PRs

@bors r=spastorino

@bors
Copy link
Contributor

bors commented Jan 8, 2020

📌 Commit 097e14d has been approved by spastorino

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 8, 2020
Centril added a commit to Centril/rust that referenced this pull request Jan 8, 2020
…torino

Treat extern statics just like statics in the "const pointer to static" representation

fixes rust-lang#67612

r? @spastorino

cc @RalfJung this does not affect runtime promotion at all. This is just about promotion within static item bodies.
bors added a commit that referenced this pull request Jan 9, 2020
Rollup of 12 pull requests

Successful merges:

 - #67630 (Treat extern statics just like statics in the "const pointer to static" representation)
 - #67747 (Explain that associated types and consts can't be accessed directly on the trait's path)
 - #67884 (Fix incremental builds of core by allowing unused attribute.)
 - #67966 (Use matches macro in libcore and libstd)
 - #67979 (Move `intravisit` => `rustc_hir` + misc cleanup)
 - #67986 (Display more informative ICE)
 - #67990 (slice patterns: harden match-based borrowck tests)
 - #68005 (Improve E0184 explanation)
 - #68009 (Spell check librustc_error_codes)
 - #68023 (Fix issue #68008)
 - #68024 (Remove `-Z continue-parse-after-error`)
 - #68026 (Small improvements in lexical_region_resolve)

Failed merges:

r? @ghost
@bors bors merged commit 097e14d into rust-lang:master Jan 9, 2020
@pnkfelix
Copy link
Member

pnkfelix commented Jan 9, 2020

discussed in T-compiler meeting. beta-approved. Leaving nomination label for T-lang.

@pnkfelix pnkfelix added the beta-accepted Accepted for backporting to the compiler in the beta channel. label Jan 9, 2020
bors added a commit that referenced this pull request Jan 14, 2020
[Beta] Backports

I did not include #67134 and #67289 since they seem to be on beta already.

* Fix up Command Debug output when arg0 is specified. #67219
* Do not ICE on unnamed future #67289
* Don't suppress move errors for union fields #67314
* Reenable static linking of libstdc++ on windows-gnu #67410
* Use the correct type for static qualifs #67621
* Treat extern statics just like statics in the "const pointer to static" representation #67630
* Do not ICE on lifetime error involving closures #67687
@jonas-schievink jonas-schievink removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jan 17, 2020
@oli-obk oli-obk deleted the extern_ptr_dangling branch March 16, 2021 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beta-accepted Accepted for backporting to the compiler in the beta channel. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

failed to evaluate constant