Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize DISTINCT over collections of 0 or 1 element #34702

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ranma42
Copy link
Contributor

@ranma42 ranma42 commented Sep 17, 2024

This does some progress towards fixing #34482, but in several cases the optimization does not trigger because of another intermediate operation.

I will investigate whether it makes sense to combine these operations in a separate PR or if they can be easily handled in this one.

The Single/First[OrDefault] case seems to be reasonably tested; for the Take(1) case it might be better to add a new test (it is tested, but AFAICT only in combination with Single/First[OrDefault]).

@@ -255,6 +255,11 @@ public void ApplyDistinct()
throw new InvalidOperationException(RelationalStrings.DistinctOnCollectionNotSupported);
}

if (Limit is SqlConstantExpression { Value: 0 or 1 })
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to handle the case of 0? AFAICT SELECT expressions with a LIMIT 0 could be optimized away heavily (basically only the projection/typing matters; offset, predicate, distinct, ... are all irrelevant)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree - if we want to optimize the LIMIT 0 case, we should probably do that separately and go much farther than just removing DISTINCT... Any thoughts on cases where LIMIT 0 could be an actually useful thing to do (and therefore worth working on)?

@@ -1901,6 +1906,10 @@ public void ApplyLimit(SqlExpression sqlExpression)
}

Limit = sqlExpression;
if (Offset is null && Limit is SqlConstantExpression { Value: 0 or 1 })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
if (Offset is null && Limit is SqlConstantExpression { Value: 0 or 1 })
if (Offset is null && Limit is SqlConstantExpression { Value: 0 or 1 })

@@ -255,6 +255,11 @@ public void ApplyDistinct()
throw new InvalidOperationException(RelationalStrings.DistinctOnCollectionNotSupported);
}

if (Limit is SqlConstantExpression { Value: 0 or 1 })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree - if we want to optimize the LIMIT 0 case, we should probably do that separately and go much farther than just removing DISTINCT... Any thoughts on cases where LIMIT 0 could be an actually useful thing to do (and therefore worth working on)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants