Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C# Compiler emits calls via the _least_ derived override. This is unintuitive and contradicts the model presented by the IDE. #24347

Closed
jcouv opened this issue Jan 19, 2018 · 10 comments
Assignees
Milestone

Comments

@jcouv
Copy link
Member

jcouv commented Jan 19, 2018

@yaakov-h commented on Sun Jan 14 2018

My understanding of the C# spec is that ref and out are treated separately for method overload resolution:

For other purposes of signature matching (e.g., hiding or overriding), ref and out are considered part of the signature and do not match each other. - Signatures and overloading

Given the following code, Baz.M(ref object) should override Foo.M(ref object), but at runtime it appears to override Bar(out object) instead.

"Go To Definition" agrees with me, or has made the same misunderstanding (see the comment below).

using System;

public static class Program
{
    public static void Main()
    {
        object o = null;
        new Baz().M(ref o); // Should print "Baz"
        new Baz().M(out o); // Should print "Bar"
        new Baz().N(0); // Should print "Baz"
    }
}

class Foo
{
    public virtual void M(ref object o)
    {
        Console.WriteLine("Foo");
    }

    public virtual void N(int i)
    {
        Console.WriteLine("Foo");
    }
}

class Bar : Foo
{
    // New method, does not override Foo.M()        
    public virtual void M(out object o)
    {
        o = null;
        Console.WriteLine("Bar");
    }

    // New method, does not override Foo.N()
    public virtual void N(int i)
    {
        Console.WriteLine("Bar");
    }
}

class Baz : Bar
{
    // Should override Foo.M()
    public override void M(ref object o)
    {
        // If you uncomment this line, then Go To Definition goes to Foo.M()
        // base.M(ref o);

        Console.WriteLine("Baz");
    }

    // Should override Bar.N()
    public override void N(int i)
    {
        Console.WriteLine("Baz");
    }
}

Expected Output:

Baz
Bar
Baz

Actual Output:

Foo
Baz
Baz

Is this:

  • A complete misunderstanding on my part?
  • A bug in Go To Definition?
  • An error in the language spec?
  • A bug in Roslyn?
  • A bug in the .NET Framework and .NET Core runtimes?

Thanks.


@Thaina commented on Sun Jan 14 2018

Interesting. I also play with it a little more and found out that

image

Might be related


@yaakov-h commented on Sun Jan 14 2018

Yep, the spec mentions that too, right before the bit above:

Although out and ref parameter modifiers are considered part of a signature, members declared in a single type cannot differ in signature solely by ref and out. A compile-time error occurs if two members are declared in the same type with signatures that would be the same if all parameters in both methods with out modifiers were changed to ref modifiers.


@Thaina commented on Sun Jan 14 2018

@yaakov-h If so then I don't think this is a bug. public virtual void M(out object o) in Bar just has the effect as putting new keyword to replace the original Foo.M because it has exactly same signature

Maybe a bug in compiler warning that it not warn anything about this


@yaakov-h commented on Sun Jan 14 2018

Maybe, except the spec explicitly says that for the purposes of hiding or overriding - which is exactly what we are doing here - "ref and out are considered part of the signature and do not match each other."


@Thaina commented on Sun Jan 14 2018

@yaakov-h I see then I think it was a bug from the point that Bar.M was not error from being hiding Foo.M


@MkazemAkhgary commented on Mon Jan 15 2018

have you tried this on older c# compilers? I'm wondering for how long it was unnoticed.


@yaakov-h commented on Mon Jan 15 2018

According to a coworker this behaviour has been around for years.


@stakx commented on Mon Jan 15 2018

have you tried this on older c# compilers? I'm wondering for how long it was unnoticed.

I just tried this with Mono's C# compiler, and with http://rextester.com which claims to use C# compiler version 4.0.30319.17929. (If I understand correctly, this is a pre-Roslyn version of the C# compiler.) They all produce the same output as shown above. So this behaviour has probably been around for a long time.

It appears to me that when it comes to "signatures", the C# spec differs from the underlying platform's notion of signatures. As far as I can tell, CLI signatures (at least the standalone ones) do not distinguish between ref and out—there are both simply by-ref parameters. (See ECMA-335, §22.2.10.) The CLI considers optional or required modifiers as being part of a signature, but C# doesn't use these to mark ref or out parameters. And custom attributes aren't considered part of a signature, so distinguishing using the [Out] pseudo-attribute is out of the question, too.

The C# spec OTOH explicitly states (as mentioned in the initial post) that ref and out don't match each other with regard to overriding / method hiding.

Perhaps the C# compiler simply takes a shortcut here and implements its overloading / method hiding rules by directly employing the CLI's hidebysig mechanism unaltered (which might be good enough in practice, but not really sufficient given the differences in the respective specifications)? The C# spec suggests just that in §3.6 Signatures and overloading:

This restriction [not being able to overload based on ref or out alone in the context of a single type] is to allow C# programs to be easily translated to run on the Common Language Infrastructure (CLI), which does not provide a way to define methods that differ solely in ref and out.
...
F(ref int) and F(out int) cannot be declared within the same interface because their signatures differ solely by ref and out.

Could it be that the spec writers simply forgot to consider cases where such signature "collisions" appear, not in a single type, but in a type hierarchy?


@brian-reichle commented on Mon Jan 15 2018

have you tried this on older c# compilers? I'm wondering for how long it was unnoticed.

I first noticed it around the .NET 4.0/4.5 days, I'm fairly sure I also tested it on a machine that only had .NET 3.5 and VisualStudio 2008 installed (I don't recall if csc was distributed with the framework or visual studio back then). My interpretation of ECMA-334 at the time was that having the methods in the same type was prohibited but different types in the same inheritance chain was ok (if you don't mind crazy code).

As @stakx pointed out, the ref and out are distinguished by an attribute and attributes are not considered part of the signature by the runtime, though I have seen various tools behave as though it were.

I believe this could be resolved with a MethodImpl record, but I think that might be heading down a problematic path and I would be happy if the C# spec were simply 'clarified' to prohibit this :)


@stakx commented on Mon Jan 15 2018

the ref and out are distinguished by an attribute

(@brian-reichle - I need to correct myself there: AFAIK, it isn't actually OutAttribute itself. That is only a pseudo-custom attribute and doesn't even get emitted as such; it is turned into a particular flag (ParamAttributes.Out) in the Params metadata table, see ECMA-335 §20.2.1 and §22.1.12. The information is there in IL metadata, but the runtime probably doesn't use it. I cannot find the exact definition of the hidebysig signature matching rules right now, but I suspect it doesn't take any ParamAttributes into account, so the end result is the same.)


@brian-reichle commented on Mon Jan 15 2018

@stakx, I think you might be right, but either way it's not using a modopt or modreq so the end result is the same. IIRC, matching (for the purpose of overriding) is based only on the method name, and signature blob, ParamAttributes.Out is definitely not part of the signature blob.

hidebysig is ignored by the runtime (Partition II section 15.4.2), I believe it's supposed to provide guidance for tools so they can match language behaviour.


@brian-reichle commented on Tue Jan 16 2018

Just for the record, here is a/the error report on connect from 2014

https://connect.microsoft.com/VisualStudio/feedback/details/815996/c-compiler-out-vs-ref-override-added-to-wrong-vtable-slot


@tannergooding commented on Wed Jan 17 2018

FYI. @VSadov, @jcouv


@jcouv commented on Fri Jan 19 2018

From the readonly-ref spec:

It is not permitted to overload on ref/out/in differences.
It is permitted to overload on ordinary byval and in differences.

Moving to roslyn repo as a compiler bug.

@jcouv jcouv added this to the 15.6 milestone Jan 19, 2018
@VSadov
Copy link
Member

VSadov commented Jan 19, 2018

This is a long standing issue with a design choice that virtual calls in C# are emitted via "least derived" instead of the "most derived" method.

It is confusing in cases where inheritance hierarchy as seen by CLR is not the same as seen by C#.
Here CLR does not understand ref/out differences, but C# does.

So when user sees call to Baz.M, which is intuitive and is supported by the VS and GTD, in reality a call to Foo.M is emitted.
Essentially C# traces the target method back to its least derived override, hoping that CLR will trace it back to the most derived one in the context of the call. However, since C# and CLR disagree on the hierarchy, CLR arrives to to a different method and calls that.

With introduction of named parameters, optional parameters, tuples and other features where the information from the most derived statically known method wins over, such behavior is getting more and more confusing.

Every time we see a bug like this we consider a change where we emit calls via the most derived override. In most cases the difference would be unobservable, but the confusing cases like this would be "fixed" and that in itself could be considered a breaking change.

I think the evidence is mounting that we should make the fix.

This should go back to the dotnet/csharplang.

CC:@gafter

@VSadov VSadov modified the milestones: 15.6, 15.7 Jan 19, 2018
@VSadov VSadov changed the title Ref/Out overrides don't follow spec C# Compiler emits calls via the _least_ derived override. This is unintuitive and contradicts the model presented by the IDE. Jan 19, 2018
@stakx
Copy link

stakx commented Jan 19, 2018

I think I've run into this issue as well (see below). I originally found out about it in https://stackoverflow.com/q/16419168/240733.

@VSadov, if the behavior were changed from "least derived" to "most derived" method, would this extend to how the C# compiler captures methods as MethodInfo in LINQ expression trees?

I recently dealt with a bug over devlooped/moq#497 that seems directly related to the "least derived" behavior. I added reflection code in devlooped/moq@f996915 to find the "most derived" method given the base declaration; this would perhaps become obsolete after the change you describe?

If the change is made, it might be particularly noticeable in reflection-heavy code such as Moq 4, where it could surface as new bugs. I'm not opposing the change (it would perhaps be for the better), I'd just like to mention LINQ expression trees and reflection in general (and perhaps even Roslyn analyzers?) as places where people might have to modify their code after the change to maintain the previous behavior.

If the change is not made, perhaps the current behaviour could be documented somewhere?

/cc @kzu

stakx referenced this issue in devlooped/moq Jan 19, 2018
It turns out that the C# compiler puts the wrong `PropertyInfo` in an
`Expression` for properties in derived classes. Apparently, it always
puts in the `PropertyInfo` corresponding to the property's base dec-
laration instead of the `PropertyInfo` of the accessed type.

This commit adds a clause to the `expression.ToPropertyInfo` extension
method to counteract this peculiar compiler behavior.
@jcouv
Copy link
Member Author

jcouv commented Jan 19, 2018

I'll re-open the csharplang issue with added comments, and close the present issue.

@jcouv jcouv closed this as completed Jan 19, 2018
@VSadov
Copy link
Member

VSadov commented Jan 19, 2018

@stakx - the expression tree situation would be more directly observable than the change in IL.
I doubt that expression trees would change.

@stakx
Copy link

stakx commented Jan 19, 2018

@VSadov - In a way this would be a pity, as it would mean that expression trees get even more out of sync with the C# language as whole. Thanks for taking the time to reply, though!

@VSadov
Copy link
Member

VSadov commented Jan 19, 2018

Technically C# spec specifies binding to be happening in terms of least derived methods. That is not very visible and does not arguably prescribe how actual calls are emitted.

Indeed - The actual method that gets to run may be neither the least nor the most derived (as we see statically), since there could be even more derived overrides that we do not see at compile time.
As long as the method that gets to run matches what spec imply - the IL is correct.
From that point, emitting a call to the most derived is actually more representative of the binding choices since there are fewer chances that CLR will mess up something and an unexpected method gets called.

On the other hand the expression tree, exists to represents the "bound tree". One could argue that the tree must respect the binding spec and literally have a call to the least derived method, while IL should match the execution semantics, where most derived could yield more stable behavior.
(the actual shape of the IL is not bound by particular contracts as long as the right methods get called)

What I mean here is that - an argument could be made for the IL to change, while expression trees to stay the same.

@VSadov
Copy link
Member

VSadov commented Jan 19, 2018

In reality the decision around the expression trees will be driven by the backward compatibility.

Changing shape of trees often breaks people in unexpected ways. The value vs. risk will not be in the favor of the change there.

@stakx
Copy link

stakx commented Jan 19, 2018

I totally appreciate backward compatibility concerns and am content to leave it at that.

I do not fully understand your previous posts however, I'd be grateful if you could make it clearer to me:

the actual shape of the IL is not bound by particular contracts as long as the right methods get called

Here you seem to be saying that the spec doesn't proscribe the exact IL that needs to be emitted; the exact IL is an implementation detail (i. e. in Roslyn's territory).

Every time we see a bug like this we consider a change where we emit calls via the most derived override.

This suggests that you could just change the IL without having the spec changed. Yet, moving the issue back to dotnet/csharplang suggests to me that you propose a change to the spec instead (regarding the part that defines how method binding should work).

Technically C# spec specifies binding to be happening in terms of least derived methods. [...] On the other hand the expression tree, exists to represents the "bound tree". One could argue that the tree must respect the binding spec and literally have a call to the least derived method [...]

Given the above, this argument doesn't make sense to me: If you're going to change the spec so binding happens in terms of "more derived" instead of "least derived", then couldn't a case be made with the same argument, i. e. that expression trees should respect the binding spec and therefore represent "more derived" methods?

So in order to keep backward compatibility in expression trees, you'd have to disregard the binding spec, if it gets changed?

@VSadov
Copy link
Member

VSadov commented Jan 20, 2018

This was discussed before. It is not a straightforward issue.

  • It is not clear whether the change requires spec adjustments or not. I think it does not, but I know that there are other opinions. We may not change the spec at the end, but there must be an agreement on not changing ;-) . If we do the change at all.
  • There would be an observable change in behavior. It must be examined from the compatibility standpoint.

So in order to keep backward compatibility in expression trees, you'd have to disregard the binding spec, if it gets changed?

Or amend the spec in a way that expression trees do not need to change.
Or make a note in compiler spec/implementation that spec is intentionally not followed for compat reasons.

Compatibility is treated very seriously and expression trees is one of the areas historically known to be very sensitive to changes.

@stakx
Copy link

stakx commented Jan 20, 2018

I see. Thank you for spelling it out so clearly for me. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants