Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve shallow Clone deriving #36384

Merged
merged 2 commits into from
Sep 15, 2016
Merged

Improve shallow Clone deriving #36384

merged 2 commits into from
Sep 15, 2016

Conversation

petrochenkov
Copy link
Contributor

@petrochenkov petrochenkov commented Sep 10, 2016

Copy unions now support #[derive(Clone)].
Less code is generated for #[derive(Clone, Copy)].
+
Unions now support #[derive(Eq)].
Less code is generated for #[derive(Eq)].


Example of code reduction:

#[derive(Copy, Clone)]
enum E {
    A { a: u8, b: u16 },
    B { c: [u8; 100] },
}

Before:

fn clone(&self) -> E {
    match (&*self,) {
        (&E::A { a: ref __self_0, b: ref __self_1 },) => {
            ::std::clone::assert_receiver_is_clone(&(*__self_0));
            ::std::clone::assert_receiver_is_clone(&(*__self_1));
            *self
        }
        (&E::B { c: ref __self_0 },) => {
            ::std::clone::assert_receiver_is_clone(&(*__self_0));
            *self
        }
    }
}

After:

fn clone(&self) -> E {
    {
        let _: ::std::clone::AssertParamIsClone<u8>;
        let _: ::std::clone::AssertParamIsClone<u16>;
        let _: ::std::clone::AssertParamIsClone<[u8; 100]>;
        *self
    }
}

All the matches are removed, bound assertions are more lightweight.
let _: Checker<CheckMe>;, unlike checker(&check_me);, doesn't have to be translated by rustc_trans and then inlined by LLVM, it doesn't even exist in MIR, this means faster compilation.


Union impls are generated like this:

union U {
    a: u8,
    b: u16,
    c: [u8; 100],
}
fn clone(&self) -> U {
    {
        let _: ::std::clone::AssertParamIsCopy<Self>;
        *self
    }
}

Fixes #36043
cc @durka
r? @alexcrichton

@petrochenkov
Copy link
Contributor Author

Extended with the same improvements to #[derive(Eq)]

@durka
Copy link
Contributor

durka commented Sep 11, 2016

Hmm, why is it sometimes AssertParamIsCopy<Self> and sometimes checking each field?

@petrochenkov
Copy link
Contributor Author

@durka
The situation is a bit different for #[derive(Copy, Clone)] struct/enum and #[derive(Clone)] union.

In the first case we already know that Self: Copy and the per-field assertions are required only to reject pathological cases FieldTy: Copy + !Clone. Personally, I'd remove them because Copy + !Clone is a bug, but in this PR I preserve status quo and follow the decision from #31414.

In the second case we don't know if Self: Copy or not beforehand.
The only way to derive Clone without doing unsafe memcpy's is *self and *self in derive gives cryptic error messages when used with non-Copy types. So AssertParamIsCopy<Self> is not strictly necessary too, its only purpose is to fail during type checking and give nicer error messages for non-Copy types ("bound Copy is not satisfied" as opposed to "cannot move out of dereference").

@durka
Copy link
Contributor

durka commented Sep 11, 2016

Why not impl Clone for U where U: Copy?

On Sun, Sep 11, 2016 at 11:06 AM, Vadim Petrochenkov <
notifications@github.com> wrote:

@durka https://github.com/durka
The situation is a bit different for #[derive(Copy, Clone)] struct/enum
and #[derive(Clone)] union.

In the first case we already know that Self: Copy and the per-field
assertions are required only to reject pathological cases FieldTy: Copy +
!Clone. Personally, I'd remove them because Copy + !Clone is a bug, but
in this PR I preserve status quo and follow the decision from #31414
#31414.

In the second case we don't know if Self: Copy or not beforehand.
The only way to derive Clone without doing unsafe memcpy's is *self and
*self in derive gives cryptic error messages when used with non-Copy
types. So AssertParamIsCopy is not strictly necessary too, its only
purpose is to fail during type checking and give nicer error messages for
non-Copy types ("bound Copy is not satisfied" as opposed to "cannot move
out of dereference").


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#36384 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAC3n2UTLoxBOqsDyGIs-a25OiVM7n8bks5qpBj7gaJpZM4J5wXq
.

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Sep 11, 2016

Why not impl Clone for U where U: Copy?

I'm surprised this works actually. I expected impls with always-false predicates to be ignored without errors (i.e. impl Trait for T NEVER {} is not an error), which is not desirable behavior for erroneous derive, but they turn out to report errors.
I need to experiment with generic unions a bit. If this works, then this is certainly a better solution.

@alexcrichton
Copy link
Member

Awesome wins @petrochenkov! Seems like a great idea to me and is a great way to cut down on the amount of code generated as well.

I've only been personally following unions at a distance, though, and don't feel qualified enough to say whether or not it should be allowed to have these #[derive] modes on unions. To me (which this may be naive) it seems super unsafe and almost not what you'd ever want because if accessing a field is unsafe why should copying the whole struct be safe? In any case though I'd be fine r+'ing without that bit, but otherwise I'll cc @joshtriplett to see if we can get the ball rolling with thoughts about #[derive] and union.

@petrochenkov
Copy link
Contributor Author

Why not impl Clone for U where U: Copy?

I think I prefer the current implementation after all.
The main argument is that

#[derive(Clone)]
union U<T> {
    a: T,
}

compiles with "where Copy" implementation of derive, but despite that for any T U<T>: !Clone.
This is not how other derives work with generics. "AssertParamIsCopy" implementation gives an error, which seems better.

@joshtriplett
Copy link
Member

joshtriplett commented Sep 12, 2016

So, syntactically, unions should allow having an attached #[derive(...)], and custom derive implementations should work with unions. However, I don't think unions should support deriving any of the standard traits, including Eq and Copy, because doing so should involve an "unsafe" somewhere, and we don't currently have an "unsafe derive" mechanism. Deriving Eq or Copy presumably assumes memcmp or memcpy respectively, which doesn't seem safe.

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Sep 12, 2016

Deriving Eq ... presumably assumes memcmp, which doesn't seem safe.

Nah, Eq is only an assertion that the comparison done by PartialEq is well behaving. Deriving PartialEq, doing the comparison itself, is not supported, it has to be written by a user using unsafe code.

#[derive(Copy)] / #[derive(Clone)] is a sugar for perfectly safe impl Copy for U {} / impl Clone for U { fn clone(&self) -> Self { *self }}, not sure why it shouldn't be supported, writing these impls by hand is quite tedious.

@joshtriplett
Copy link
Member

@petrochenkov Ah, I didn't realize that about Eq; in that case, deriving Eq (but not PartialEq) seems fine.

Deriving Copy seems plausible but potentially problematic. Suppose I have a union U { x: u8, y: u32 }, and I let u = U { x: 12 };. How can I the compiler Copy that, and does doing so invoke undefined behavior?

UB aside, copying padding shouldn't do any harm. so if the Copy implementation doesn't invoke UB, deriving it for unions containing entirely Copy fields seems OK.

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Sep 12, 2016

@joshtriplett
The padding copying problem already exists for structs.
Copying is done by memcpying all the struct/union storage, I'm not actually sure how exactly Rust + LLVM deal with it formally, but it seems to work somehow.

@joshtriplett
Copy link
Member

@petrochenkov Logically, derive(Copy) for a struct acts as though it copies all the fields individually. The compiler might turn that into a memcpy or inlined copy for efficiency, but that doesn't invoke UB. For a union, though, I don't think you can actually write the equivalent of Copy in Rust code (and definitely not safe Rust code) without potentially invoking UB. Hence my concern.

@joshtriplett
Copy link
Member

CCing @ubsan for review here.

@retep998
Copy link
Member

Padding isn't UB, it merely has the value undef which is okay to memcpy (which results in the destination padding also being undef). It's only if you try to actually do anything based on the value of the padding that UB steps in and ruins your day. So implementing Copy for unions where all fields are Copy should be entirely okay.

@joshtriplett
Copy link
Member

@retep998 Thanks! That makes sense.

Based on that, I agree that derive(Copy) makes sense for unions where every field implements Copy. And derive(Clone) makes sense for any type implementing Copy.

@alexcrichton
Copy link
Member

@petrochenkov is a particular reason to allow #[derive(Eq)] on a union? Although technically possible, it seems surprising that we'd allow that but always require you to write a manual PartialEq implementation anyway.

For Copy, Clone this seems reasonable now. Thinking again a move is always safe, which is a memcpy, so it seems reasonable that a copy must always be safe as well.

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Sep 12, 2016

is a particular reason to allow #[derive(Eq)] on a union?

No, I just went through all derivable traits and supported everything that can be supported.
After minimizing generated code (removing matches) #[derive(Eq)] for unions can be enabled by simply switching a flag, so it doesn't add any kind of maintenance burden.

@alexcrichton
Copy link
Member

Perhaps it could be left out just to be conservative? I can see how #[derive(Copy, Clone)] is useful to get a Copy union, but it seems like we'd be adding Eq "just because" which we tend to try to avoid in terms of exposing APIs at least.

@strega-nil
Copy link
Contributor

@retep998 That's only the logic for LLVM, not for Rust. It's not okay in Rust to read/write undef, but I'd argue it should be, at least for bytes.

@petrochenkov
Copy link
Contributor Author

@alexcrichton

Perhaps it could be left out just to be conservative?

I don't think there are any reasons to be conservative in this particular case.
If Eq is dropped now, then it's only a matter of time until someone opens an issue "why it's not supported, it can clearly be supported".

@petrochenkov
Copy link
Contributor Author

petrochenkov commented Sep 13, 2016

An example of equality comparable union:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa383713(v=vs.85).aspx
If #[derive(Eq)] sugar for impl Eq for T {} makes sense at all, then it seems reasonable to use it for such unions.

#[derive(Clone, Copy, Eq)]
union LARGE_INTEGER {
    ...
}

impl PartialEq for LARGE_INTEGER { ... }

@alexcrichton
Copy link
Member

@petrochenkov yeah I'm certainly not disagreeing that it's possible, nor that it's desired in some use cases. I just tend to far prefer conservative approaches. We can always add the feature but we can basically never take it out.

For me the surprise factor is so high (#[derive(PartialEq)] doesn't work) that I would err on the side of removing it in the initial implementation to perhaps be added later.

@petrochenkov
Copy link
Contributor Author

This reminds me the joke about docking a dog's tail slice by slice, so the poor dog doesn't lose it all at once.

I know, the Eq issue itself is tiny, but as a part of a larger picture I'd prefer to have a complete feature - derive support for unions, rather than a feature with random missing part and an issue "consider supporting Eq" with unclear schedule and preconditions.
Is it a blocker for merging the PR?

@alexcrichton
Copy link
Member

It seems like the concrete motivation for this PR was improving #[derive(Copy, Clone)] codegen, which then quickly snowballed into allowing #[derive(Copy, Clone)] for unions, when then finally extended to PartialEq. Supporting all #[derive] modes is not possible on unions today, and as you've said yourself flipping the switch to enable it is a one line change that can be made at any time.

Due to the limited utility today, the fact that it was just tacked onto this PR, and that it's easy to enable at any time (perhaps in a more principled fashion with other #[derive] modes if at all), makes me think that we should stick to being conservative and leave it out of this PR.

@petrochenkov
Copy link
Contributor Author

It seems like the concrete motivation for this PR was improving #[derive(Copy, Clone)] codegen

The motivation was to finish my work on unions that was blocked by matches generated by generic derive (they can match on >1 fields and that is not possible with unions).
So I removed the matches where possible and codegen improved as well, becoming (arguably) the most useful part of the PR. But it doesn't mean that union support was "tacked onto".

@petrochenkov
Copy link
Contributor Author

We can always add the feature but we can basically never take it out.

FWIW, we can take it out - unions are unstable (but I can't imagine why would we do that).

@alexcrichton
Copy link
Member

Ok that sounds reasonable to me, let's see what bors thinks!

@bors: r+ 62cb751

Manishearth added a commit to Manishearth/rust that referenced this pull request Sep 15, 2016
Improve shallow `Clone` deriving

`Copy` unions now support `#[derive(Clone)]`.
Less code is generated for `#[derive(Clone, Copy)]`.
+
Unions now support `#[derive(Eq)]`.
Less code is generated for `#[derive(Eq)]`.

---
Example of code reduction:
```
enum E {
	A { a: u8, b: u16 },
	B { c: [u8; 100] },
}
```
Before:
```
fn clone(&self) -> E {
    match (&*self,) {
        (&E::A { a: ref __self_0, b: ref __self_1 },) => {
            ::std::clone::assert_receiver_is_clone(&(*__self_0));
            ::std::clone::assert_receiver_is_clone(&(*__self_1));
            *self
        }
        (&E::B { c: ref __self_0 },) => {
            ::std::clone::assert_receiver_is_clone(&(*__self_0));
            *self
        }
    }
}
```
After:
```
fn clone(&self) -> E {
    {
        let _: ::std::clone::AssertParamIsClone<u8>;
        let _: ::std::clone::AssertParamIsClone<u16>;
        let _: ::std::clone::AssertParamIsClone<[u8; 100]>;
        *self
    }
}
```

All the matches are removed, bound assertions are more lightweight.
`let _: Checker<CheckMe>;`, unlike `checker(&check_me);`, doesn't have to be translated by rustc_trans and then inlined by LLVM, it doesn't even exist in MIR, this means faster compilation.

---
Union impls are generated like this:
```
union U {
	a: u8,
	b: u16,
	c: [u8; 100],
}
```
```
fn clone(&self) -> U {
    {
        let _: ::std::clone::AssertParamIsCopy<Self>;
        *self
    }
}
```

Fixes rust-lang#36043
cc @durka
r? @alexcrichton
bors added a commit that referenced this pull request Sep 15, 2016
Rollup of 9 pull requests

- Successful merges: #36384, #36405, #36425, #36429, #36438, #36454, #36459, #36461, #36463
- Failed merges: #36444
@bors bors merged commit 62cb751 into rust-lang:master Sep 15, 2016
@petrochenkov petrochenkov deleted the derclone branch September 21, 2016 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants