Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #27149 -- Added Subquery and Exists database expressions. #6478

Merged
merged 1 commit into from
Jan 14, 2017

Conversation

schinckel
Copy link
Contributor

@schinckel schinckel commented Apr 20, 2016

Sort-of related to:

https://code.djangoproject.com/ticket/16603
https://code.djangoproject.com/ticket/25789

but this is dealing with .annotate(), rather than .filter() methods
on a queryset.

https://code.djangoproject.com/ticket/27149

@schinckel
Copy link
Contributor Author

@jarshwah provided a heap of guidance on how to make this better than it was, but it seems to be coming along nicely now. Would love to have some others provide further feedback.

# Circular reference if we do it earlier.
from django.db.models.lookups import Exact

# Is there a single relationship between subquery and query?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This join generation concerns me - not that it won't work just that it's kinda magical and ugly. It would be awesome if we could use the relationship name somewhere. Perhaps SubQuery(rel_name, qs=BLAH) which is a similar API to Prefetch? I don't know how easy that would be to get to work as the rel object would probably need to do some of the transformations. It may allow a wider variety of rel objects to work though - e.g. subquery on a M2M field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the magic here is probably the thing I like least (although the where.children.append() also feels nasty, but maybe there isn't another way to do that).

I played around with different ways of passing stuff around (but that was before I was pointed towards resolve_expression)

@charettes
Copy link
Member

Nice patch Matthew! EXISTS() is the last remaining reason I had to use RawSQL().

I was wondering if it had been considered to allow passing a boolean to the __exists lookup instead of a queryset. I feel like Author.objects.filter(books__exists=True) is more intuitive than Author.objects.filter(books__exists=Books.objects.all()).

@schinckel
Copy link
Contributor Author

I was taking the approach that you might want to filter on filtered queryset.

Using the True as a shortcut for RelatedModel.objects.all() may be nice though.

@schinckel
Copy link
Contributor Author

Building on @charettes suggestion, I think maybe there should also be special-cased versions:

>>> Model.objects.filter(related__exists=True)
>>> Model.objects.filter(related__exists=False)

Or should this collapse to a single case, where you can use .exclude() instead?

@jarshwah
Copy link
Member

Or should this collapse to a single case, where you can use .exclude() instead?

FWIW I really dislike exclude. I think Django could solve a lot of (internal) problems by ditching exclude and implementing everything in terms of filter (not that that can happen now, but no need to continue on that path).

If you choose to special case True, then do so for False also.

@schinckel
Copy link
Contributor Author

So .exclude() changes everything, and I'm not totally sure I understand how.

I've managed to implement .filter(related__exists=<boolean>), but none of the .exclude() stuff comes close: it appears to be doing an extra subquery.

@schinckel
Copy link
Contributor Author

Hmm. Looking at the documentation, it may be possible to use the EXISTS () query type automatically on a query when a queryset is provided on relation__in. Good idea?

@akaariai
Copy link
Member

akaariai commented Apr 25, 2016

The implementation of .exclude(cond) isn't anything more than .filter(~Q(cond)), and we definitely can't get rid of negated Q() objects.

There are multiple problems with the way negated filters are implemented currently in Django. I'll got a bit into why we have problems in this comment even if this PR might not be the right forum.

First, the definition of a negated query is pretty much this: If obj in qs and obj not in qs.filter(cond), then obj in qs.filter(~Q(cond)).

Now, this is pretty nice definition and we can make it work nicely, except for the case where we are filtering along a join generating multiple entries (aka a multivalued relation). Say, we have a book with three authors, aged 40, 50 and 60. If we do qs.filter(authors__age__gte=50), we generate a query with a join to authors, and filter to a set where the author's age is 50 or more. So, we actually have the same book in the queryset two times. To implement the negated query qs.filter(~Q(authors__age__gte=50)) we can't use a join. If we used a join and just negated the WHERE condition, we would filter out all rows for which the author's age is 50 or more, which leaves us one row for the book, that where the authors age is 40. This breaks the definition of negated querysets.

The correct way to write both the filter() and negated filter() query is to use a subquery. But, we can't use a subquery for the .filter() case, as qs.filter(authors__age__gte=50).annotate(avg=Avg('authors__age')) is defined to calculate the average age of the book's authors for those authors who are older than 50. If we used a subquery in the WHERE condition, we of course can't select the average from the subquery. For this reason, currently qs.exclude(authors__age__gte=50).annotate(avg=Avg('authors__age')) doesn't calculate the average over authors whose age is less than 50. We can't use both a subquery for .exclude() and an annotation for the same relation.

Extra fun is generated because qs.filter(authors__age__gte=50).filter(authors__height__gte=180) is different from qs.filter(authors__age__gte=50, authors__height__gte=180). The first one finds books where there is at least a single author older than 50 and taller than 180, the second one finds books where there is an author aged more than 50 and (possibly different) author taller than 180. That is, the first query uses a single join, the second one uses two joins. Again, due to implementation difficulties, negated filters do not actually work like this.

So, the tl;dr version of this is that we do have a problem with negated filters, but there isn't an easy fix in sight.

The fix might be that we change the definition of filter against a multivalued join to use a subquery (thus breaking .filter().annotate()), or that we change the definition of .filter(~Q(cond)) to not use a subquery (thus breaking the current definition of negated filtering). Both of these will break silently a lot of correctly written projects, so we need something else than just a release note for this. We could also try to fix the code, and I had some PRs a while ago trying to do exactly that, but making sure the code works for complex combinations of negated and non-negated filters over multiple multivalued relations and with annotations over the same relations is extremely hard.

@akaariai
Copy link
Member

akaariai commented Apr 25, 2016

For the specific problem in this PR with exclude, we should push the joins into the subquery. That is, instead of generating a query like

select * from authors join friends on authors.id = friends.author1_id
where exists (select 1 from authors where author.id = friends.author2_id)

the query should be

select * from authors where exists (
    select 1 from authors a1
        join friends on author1_id = a1.id where friends.author2_id = authors.id)

Given models Book and Chapters(book=ForeignKey(Book)), I'd start with making sure the following queryset works: Book.objects.annotate(last_chapter=SubQuery(Chapter.objects.filter(book=F('outerq__pk')).order_by('-chapter_no')[:1]))

Building on F('outerq__') will make it possible to automatically generate the where conditions like done in this patch. And, having the ability to connect the inner and outer query in arbitrary ways would be a handy feature in itself.

@schinckel
Copy link
Contributor Author

@akaariai Thanks for the input. Lots there: I think I understand some of it.

I assume the last comment is basically stating that we shouldn't push the where automatically into the sub-query? Instead, there should be a pseudo-field called outerq? Is this what you mean?

@akaariai
Copy link
Member

The way negated querysets and multivalued relations work is complicated, unfortunately I don't see any easy way to make it simpler.

What I am after for SubQuery and Exists is that if you issue:

Book.objects.annotate(last_chapter=SubQuery(Chapter.objects.filter(book=F('outerq__pk')).order_by('-chapter_no')[:1]))

then what should happen is that the F('outerq__pk') gets resolved against the outer query (the Book query). So, F('outerq__pk') is replaced by book.id in the generated query. Similarly, if you did:

Book.objects.annotate(last_chapter=SubQuery(Chapter.objects.values('pages').filter(pk=F('outerq__publisher__pk')).order_by('-chapter_no')[:1]))

then the F('outerq...') would be resolved against the book query, generating a join to the publisher of the book and fetching the pk of the publisher. So, the query would look something like:

select *, (select pages from chapter u1 where u1.book_id = publisher.pk) as last_chapter from book inner join publisher on publisher.id = book.publisher_id

If we have that working, we are pretty far in getting all kinds of interesting subqueries implemented. You can then built on this for automatically generating the connecting condition in the subqueries.

Does this make sense?

@akaariai
Copy link
Member

Hmmh, it might be we can't easily implement the F('outerq__') style of connecting subqueries to their parent queries. We'd need to resolve the F('outerq__') lazily as at the point the filter() method is called for the inner query, we don't yet have a reference to the outer query.

An idea is to add a Query.delayed_expressions list. When the inner query is connected to the outer query, we resolve all the delayed expressions, and similarly to how relabeled_clone is implemented, replace the delayed expression with a resolved expression.

So, for example, condition filter(pk=F('outerq__publisher__pk')) would resolve to Equal(Col(innerq.pk), F('outerq__publisher__pk')), and delayed_expressions would be [F('outerq__publisher__pk')]. When the inner query is connected to the outer query, we resolve delayed expressions, and get a map of {F('outerq__publisher__pk'): Col('publisher.pk')}. We replace all keys of the map with the value pointed by the map, very similarly to how relabeled_clone replaces aliases in the query.

This isn't exactly easy, but I believe it should work reliably.

@schinckel
Copy link
Contributor Author

After starting to use Exists() and SubQuery() in a real context today, I think I agree. It's going to be much simpler if it's possible to refer to the outer queryset. For instance, I had a queryset that I needed to indicate if this was the first one (ordered according to a timestamp) for a given generically related object. This required a reference to a queryset of the same model, joined by both content-type and object-id.

In the end, I wound up writing it using RawSQL, because it just needed to be done. I'm not convinced it's easy/possible to do it without referring to the outer queryset using an F(outerq__)

@schinckel
Copy link
Contributor Author

Hmm. resolve_expression is not provided with the lookups, only value (and query).

This seems like the reasonable place to resolve the expression to the delayed expression.

@schinckel
Copy link
Contributor Author

Okay, pretty happy with how this went tonight. I've managed to implement delayed F() expressions (although the delayed resolution only occurs in the SubQuery and Exists expressions at this point: maybe that's the only place it makes sense to though).

It makes the syntax for using these expressions slightly more cumbersome:

Model.objects.annotate(foo=Exists(Related.objects.filter(inner_field=F('outerq__outer_field'))))

However, this explicit nature is better, because it means that you can have an arbitrary relationship to the outer query (and potentially more than one), or indeed no relationship at all.

I needed to not grab a outer_query.table_alias(), because for some reason that then caused the outer query's table to drop out of the FROM clause.

@schinckel
Copy link
Contributor Author

Oh, @akaariai, I didn't use an explicit delayed_reference on the query: but rather an implicit agreement that when resolving the subquery, any where clauses that contained an F('outerq__...') expression would be resolved to referring to the outer query.

Thanks for your help.

[resolve(_child) for _child in child.children]
if hasattr(child, 'rhs') and isinstance(child.rhs, F):
fieldname = child.rhs.name.split('__')[1]
if fieldname == 'pk':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can likely just resolve the F() expression using resolve_expression() once you have removed the outerq prefix from the F obj.

@schinckel
Copy link
Contributor Author

Apart from the .exclude() stuff, it's coming along pretty well. Really not sure how to start on that.

'ORDER BY "lookup_author"."name"'
)
print(expected)
result = str(Author.objects.exclude(article__exists=Article.objects.filter(author=OuterRef('pk'))).query)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem here is using the lookup machinery in the first place. It will try to create a subquery for you...

Could you add a hack where exists and not exists are implemented as annotation + adding that annotation to the where clause directly? That might work, though no guarantees. Btw why do you need not exists, isn't ~Q(field__exists) enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking about a way to use the annotation version: I'm currently doing some stuff in a test project where I annotate and then filter on those annotations.

The rationale for the not_exists was they are just temporary, so I can see what the query is supposed to look like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had even thought about Model.objects.filter(article=Exists(...)) as the filtering syntax, but I'm not sure how that will work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just Model.objects.filter(Exists(...))? It seems you have the outerq-innerq connection already manually added.

It might be a good choice to skip the lookup support for the initial patch and go with Exists and SubQuery only. Usually if it is possible to write a patch in two parts, it is better to do so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I'll have a quick look at the bare .filter(Exists(...)), and if that isn't super-simple, scale down the patch.

Also, I'm not sure how to handle evaluation of a queryset that contains an unresolved OuterRef() (or the resolved F() expression that refers to an outer query).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep OuterRef as OuterRef instead of resolving to F, and add as_sql to OuterRef throwing an exception with info of what went wrong.

@schinckel
Copy link
Contributor Author

Best thing about this: all of this functionality only adds classes, doesn't actually require any changes to any other django code. That's got to be a win.

@schinckel schinckel force-pushed the orm-subquery branch 2 times, most recently from 1a61598 to 24bda5b Compare May 1, 2016 04:30
@mjtamlyn
Copy link
Member

mjtamlyn commented May 4, 2016

FWIW I've used this today. I managed to get a bit of an odd issue with sql params being a tuple, in most of the expressions it's a list. This may just be a bug in my code for SearchVectorExact which blindly adds lhs params and rhs params, but we should be consistent.

@akaariai or @jarshwah - what type should params have?

@akaariai
Copy link
Member

akaariai commented May 4, 2016

Lists. We often use all_params.extend(part_params) when combining parts of SQL together.

I'm not sure if we have actually documented this anywhere - if not, we probably should.

@mjtamlyn
Copy link
Member

mjtamlyn commented May 4, 2016

Hmm ok, so I'm seeing self.subquery.query.sql_with_params() return a (empty) tuple in my use case here. My Subquery has an aggregate in it, perhaps the bug is here:

params = tuple(params)

For now I've fixed it locally by forcing SubQuery to return a list of params.

@schinckel
Copy link
Contributor Author

I thought i'd only CamelCased SubQuery when talking about the class (or instances of it): otherwise I'd used subquery.

I'm not exactly sure when I'll get enough time to look at the Oracle traceback stuff: I don't have VirtualBox set up (it crashed when then using xhyve), and I have after-work commitments today.

@timgraham
Copy link
Member

My question was: why camel case the class name?

@jarshwah
Copy link
Member

buildbot, test on oracle

@jarshwah
Copy link
Member

For what it's worth I prefer the CamelCased SubQuery visually. Subquery looks strange to me, even if it technically can be one word.

@jarshwah
Copy link
Member

Oracle errors are due to Oracle not supporting EXISTS() as a select column. It's necessary to wrap with Case/When.

# No Work

SELECT
EXISTS(SELECT 1 FROM dual)
FROM DUAL

# Works
SELECT
case when EXISTS(SELECT 1 FROM dual) then 1 else null end
FROM DUAL

Simply wrapping in a Case/When expression won't work either, since When expects a LHS/Condition (just like filter does). You'll need to change the template and hardcode the CASE WHEN SQL.

I'll have a go doing this later tonight.

I think there are other issues with how the SQL is being compiled, but that might just be my setup or how I'm printing SQL. I'll look into it further.

@jarshwah
Copy link
Member

OK, the CASE WHEN .. fixed the initial problem.

diff --git a/django/db/models/expressions.py b/django/db/models/expressions.py
index d5901b9120..f273d91dbb 100644
--- a/django/db/models/expressions.py
+++ b/django/db/models/expressions.py
@@ -1013,6 +1013,15 @@ class Exists(SubQuery):
             sql = 'NOT {}'.format(sql)
         return sql, params

+    def as_oracle(self, compiler, connection, template=None, **extra_context):
+        # Oracle doesn't allow EXISTS() in the SELECT list, so we must wrap it
+        # with a CASE WHEN expression. Since Django's When expression requires
+        # a left hand side (column) to compare against, we must change the
+        # template ourselves.
+        sql, params = self.as_sql(compiler, connection, template, **extra_context)
+        sql = 'CASE WHEN {} THEN 1 ELSE NULL END'.format(sql)
+        return sql, params
+

But now running into problems with how Oracle does LIMITs. Oracle requires a subquery for doing limits. It pushes the original query into a subquery, then wraps it with a rownum and filters on that. The problem is that the rownum outer query does a SELECT *, which results in "too many values to unpack" error.

TestCase: test_nested_subquery_outer_ref_2:

SELECT "EXPRESSIONS_TIME"."ID",
         "EXPRESSIONS_TIME"."TIME",
         
    (SELECT *
    FROM 
        (SELECT "_SUB".*,
         ROWNUM AS "_RN"
        FROM 
            (SELECT 
                (SELECT u0."TIME"
                FROM DJANGO_TEST."EXPRESSIONS_TIME" U0
                WHERE (U0."ID" = (V0."START_ID")
                        AND u0."TIME" = ("EXPRESSIONS_TIME"."TIME"))) AS "OTHER"
                FROM DJANGO_TEST."EXPRESSIONS_SIMULATIONRUN" V0) "_SUB"
                WHERE ROWNUM <= 1)
                WHERE "_RN" > 0) AS "OTHER"
            FROM DJANGO_TEST."EXPRESSIONS_TIME" 

One way I've fixed this SQL manually (in the sql shell), is by aliasing the outer subquery (that does ROWNUM <= 1), then changing the "SELECT *" to "SELECT newalias.OUTER".

I don't think that's the right path though. It requires the outer most query to have knowledge of the inner most query select list, and to return just those values. A pretty shitty situation. I'd be interested in hearing ways to solve this particular problem in SQL, so that it can be translated more easily into python.

@jarshwah
Copy link
Member

Oh, will need a test to make sure NOT EXISTS is still working correctly on Oracle.

@jarshwah
Copy link
Member

This diff fixes the Too many values error, but test_subquery is failing with mismatched data now. I'm unsure if this will break other tests though. It also comes with the caveat that subqueries on oracle can not support OFFSET, though they obviously support LIMIT. I think this is a good enough compromise for now to get merged, provided other standards are met.

diff --git a/django/db/backends/oracle/compiler.py b/django/db/backends/oracle/compiler.py
index 3ae567669f..dfb79b0dda 100644
--- a/django/db/backends/oracle/compiler.py
+++ b/django/db/backends/oracle/compiler.py
@@ -31,10 +31,17 @@ class SQLCompiler(compiler.SQLCompiler):
             high_where = ''
             if self.query.high_mark is not None:
                 high_where = 'WHERE ROWNUM <= %d' % (self.query.high_mark,)
-            sql = (
-                'SELECT * FROM (SELECT "_SUB".*, ROWNUM AS "_RN" FROM (%s) '
-                '"_SUB" %s) WHERE "_RN" > %d' % (sql, high_where, self.query.low_mark)
-            )
+
+            if self.query.low_mark:
+                sql = (
+                    'SELECT * FROM (SELECT "_SUB".*, ROWNUM AS "_RN" FROM (%s) '
+                    '"_SUB" %s) WHERE "_RN" > %d' % (sql, high_where, self.query.low_mark)
+                )
+            else:
+                # There's no offset, so simplify the query to support subqueries
+                sql = (
+                    'SELECT * FROM (SELECT "_SUB".* FROM (%s) "_SUB" %s)' % (sql, high_where)
+                )

         return sql, params

@jarshwah
Copy link
Member

This sure is promising. I ran a bunch of test modules, and the test_subquery is the only test to fail with the above patches. I'll try to solve this too though. hold my beer.

@jarshwah
Copy link
Member

OK, all tests passing with a PR I opened up against @schinckel branch: schinckel#1

If you can merge this, we can kick off the oracle build to verify, and then Tim can finish review.

@schinckel
Copy link
Contributor Author

Can I put your beer down yet?


def as_sql(self, *args, **kwargs):
raise FieldError(
'This queryset contains an unresolved reference to an outer query '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test for this error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually sure how to get this method to execute.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove it?

alias for alias in change_map.values() if alias not in clone.subquery.query.tables)
return clone

def as_sql(self, compiler, connection, template=None, **extra_context):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usage of template=None, **extra_context params seems untested. Not sure if it's important.


template = template or template_params.get('template', self.template)
sql = template % template_params
sql = connection.ops.unification_cast_sql(self.output_field) % sql
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test for this?

``SubQuery()`` expressions
--------------------------

.. currentmodule:: django.db.models.expressions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you feel this adds some value? You added aliases in django.db.models so I don't see why these expressions would be treated differently from the rest in this document. I didn't want to change it without asking.

@timgraham timgraham force-pushed the orm-subquery branch 3 times, most recently from a1f76e8 to 9acac32 Compare January 12, 2017 17:41

.. currentmodule:: django.db.models.expressions

.. class:: Subquery(queryset, output_field=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryset is called subquery in the code -- should it be consistent one way or the other?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be queryset, because that shows what type of argument is expected.

.. versionadded:: 1.11

When a value is required that is dependent upon values from within a related
object, it's possible to use an explicit ``Subquery``. For example::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to describe in words what the querysets are computing. Also, I'm not sure how valuable the examples are without the model definitions.

Copy link
Contributor Author

@schinckel schinckel Jan 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think maybe I'll rewrite the examples to use an existing model structure, or at least something with "known" semantics. Perhaps Author, Post, Comment?

@jarshwah
Copy link
Member

buildbot, test on oracle.

@schinckel
Copy link
Contributor Author

So I've had a crack at https://code.djangoproject.com/ticket/16603, however I can't get it to work (because of the M2M stuff, and I'm not sure I understand exactly what the regression test is supposed to be testing).

I've also had a look at https://code.djangoproject.com/ticket/25789, and I think I'm just not going to claim that they can be solved by these classes. I'm not convinced they can't, but I don't think we have time to investigate further before feature freeze.

@schinckel schinckel force-pushed the orm-subquery branch 2 times, most recently from fe0adac to d1cf201 Compare January 13, 2017 12:49
Copy link
Member

@timgraham timgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please check my edits.


def as_sql(self, *args, **kwargs):
raise FieldError(
'This queryset contains an unresolved reference to an outer query '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove it?


>>> from django.db.models import Exists, OuterRef
>>> recent_comments = Comment.objects.filter(post=OuterRef('pk'), created_at__gte=one_day_ago)
>>> Post.objects.filter(recent_comment=Exists(recent_comments))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, this doesn't work (and the next section says so -- FieldError: Cannot resolve keyword 'recent_comment' into field.). I revised it as I saw fit.

self.assertCountEqual(qs.values('pk').filter(is_point_of_contact=True), expected_result)
# A less elegant way to write the same query: this uses a LEFT OUTER
# JOIN and an IS NULL which is probably less efficient than EXISTS.
self.assertCountEqual(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MySQL/PostgreSQL test failure suggests this query isn't equivalent (unless it really only cares about .count() as you had previously). Let me know what you want to do here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.

I think maybe the expected_result is incorrect, because the test value matches the one in the next query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact: that's testing a company, not an employee.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing I'm not 100% happy about with this is that the query that is generated contains a superfluous = true:

SELECT "expressions_employee"."id" 
  FROM "expressions_employee" 
 WHERE EXISTS(SELECT U0."id", 
                     U0."name", 
                     U0."num_employees", 
                     U0."num_chairs", 
                     U0."ceo_id", 
                     U0."point_of_contact_id" 
                FROM "expressions_company" U0 
               WHERE U0."point_of_contact_id" = ("expressions_employee"."id")) = True 
            ORDER BY "expressions_employee"."firstname" ASC

(And we are selecting superfluous columns with the EXISTS (SELECT ...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the = true is a byproduct of filter requiring a left and right hand side. This can be fixed when that restriction is removed via filter/expressions.

@schinckel
Copy link
Contributor Author

Yeah, your edits are great!

@timgraham timgraham changed the title Fixed #27149 -- Added SubQuery and Exists database expressions. Fixed #27149 -- Added Subquery and Exists database expressions. Jan 14, 2017
@timgraham timgraham merged commit 236ebe9 into django:master Jan 14, 2017
@schinckel schinckel deleted the orm-subquery branch May 12, 2017 11:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants