Description
Version Information
Server Version: 2.22.0
Environment
All
What is the current behaviour?
Consider the following setup:
-- SCHEMA
CREATE SCHEMA hasura_demo;
CREATE TABLE hasura_demo.customer (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE hasura_demo.events (
id SERIAL PRIMARY KEY,
customer_id SERIAL NOT NULL REFERENCES hasura_demo.customer(id) ON DELETE CASCADE,
event TEXT NOT NULL
);
CREATE INDEX hasura_demo_events_c_id ON hasura_demo.events(customer_id, id);
-- SETUP
INSERT INTO hasura_demo.customer (id, name)
SELECT i id, ('customer_' || i) name
FROM generate_series(0, 999) as t(i);
INSERT INTO hasura_demo.events (customer_id, event)
SELECT MOD(i, 1000) customer_id, ('event_' || i) name
FROM generate_series(0, 999999) as t(i);
Track these tables, then add a new role with a select permission on hasura_demo.events.
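Judging from the generated query further down, the select permission for that role amounts to a row filter comparing customer_id to the x-hasura-demo-customer-id session variable, i.e. something like:
{
  "customer_id": {
    "_eq": "x-hasura-demo-customer-id"
  }
}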
And then issue a GQL subscription request like the following:
subscription Events {
hasura_demo_events_stream(batch_size:100, cursor:{initial_value:{id:123}}) {
event
}
}
Hasura generates roughly the following query:
SELECT
"__subs"."result_id",
"__fld_resp"."root" AS "result",
"__fld_resp"."cursor" AS "cursor"
FROM
unnest(
(ARRAY['0b3a2fc8-ef5c-11ed-a05b-0242ac120003'])::uuid[],
(ARRAY['{"session": {"x-hasura-demo-customer-id": "1"}, "cursor": {"id": "999000"}}'])::json[]
) AS "__subs" ("result_id", "result_vars")
LEFT OUTER JOIN LATERAL (
SELECT
json_build_object('hasura_demo_events_stream', "_hasura_demo_events_stream"."root") AS "root",
to_json("_hasura_demo_events_stream"."cursor") AS "cursor"
FROM (
SELECT
coalesce(json_agg("root" ORDER BY "root.pg.id" ASC), '[]') AS "root",
json_build_object('id', (max("root.pg.id"))::text) AS "cursor"
FROM (
SELECT
"_root.base"."id" AS "root.pg.id",
row_to_json((
SELECT
"_e"
FROM (
SELECT
"_root.base"."id" AS "id") AS "_e")) AS "root"
FROM (
SELECT
*
FROM
"hasura_demo"."events"
WHERE (((("hasura_demo"."events"."customer_id") = ((("__subs"."result_vars" #>> ARRAY['session', 'x-hasura-demo-customer-id']))::integer))
OR ((("hasura_demo"."events"."customer_id") IS NULL)
AND (((("__subs"."result_vars" #>> ARRAY['session', 'x-hasura-demo-customer-id']))::integer) IS NULL))
)
AND (("hasura_demo"."events"."id") > ((("__subs"."result_vars" #>> ARRAY['cursor', 'id']))::bigint)))
ORDER BY
"id" ASC
LIMIT 1000) AS "_root.base"
ORDER BY
"root.pg.id" ASC) AS "_root") AS "_hasura_demo_events_stream") AS "__fld_resp" ON ('true');The issue comes down to the extra clause in the where clause:
OR ((("hasura_demo"."events"."customer_id") IS NULL)
AND (((("__subs"."result_vars" #>> ARRAY['session', 'x-hasura-demo-customer-id']))::integer) IS NULL))In our case we don't want this check and it shouldn't be necessary anyways given the column isn't nullable. The annoying thing is Postgres generates a slightly unoptimized query plan:
Nested Loop Left Join (cost=1135.92..1135.97 rows=1 width=80) (actual time=7.035..7.047 rows=1 loops=1)
-> Function Scan on __subs (cost=0.01..0.01 rows=1 width=48) (actual time=0.019..0.022 rows=1 loops=1)
-> Subquery Scan on _hasura_demo_events_stream (cost=1135.91..1135.94 rows=1 width=64) (actual time=7.006..7.014 rows=1 loops=1)
-> Aggregate (cost=1135.91..1135.93 rows=1 width=64) (actual time=6.991..6.997 rows=1 loops=1)
-> Subquery Scan on "_root.base" (cost=1122.59..1130.91 rows=333 width=36) (actual time=5.275..6.487 rows=999 loops=1)
-> Limit (cost=1122.59..1123.42 rows=333 width=40) (actual time=5.244..5.426 rows=999 loops=1)
-> Sort (cost=1122.59..1123.42 rows=333 width=40) (actual time=5.241..5.321 rows=999 loops=1)
Sort Key: events.id
Sort Method: quicksort Memory: 71kB
-> Bitmap Heap Scan on events (cost=16.38..1108.64 rows=333 width=40) (actual time=0.383..5.007 rows=999 loops=1)
Recheck Cond: (((customer_id = ((__subs.result_vars #>> '{session,x-hasura-demo-customer-id}'::text[]))::integer) AND (id > ((__subs.result_vars #>> '{cursor,id}'::text[]))::bigint)) OR ((customer_id IS NULL) AND (id > ((__subs.result_vars #>> '{cursor,id}'::text[]))::bigint)))
Filter: ((customer_id = ((__subs.result_vars #>> '{session,x-hasura-demo-customer-id}'::text[]))::integer) OR ((customer_id IS NULL) AND (((__subs.result_vars #>> '{session,x-hasura-demo-customer-id}'::text[]))::integer IS NULL)))
Heap Blocks: exact=999
-> BitmapOr (cost=16.38..16.38 rows=333 width=0) (actual time=0.247..0.249 rows=0 loops=1)
-> Bitmap Index Scan on hasura_demo_events_c_id (cost=0.00..11.77 rows=333 width=0) (actual time=0.210..0.211 rows=999 loops=1)
Index Cond: ((customer_id = ((__subs.result_vars #>> '{session,x-hasura-demo-customer-id}'::text[]))::integer) AND (id > ((__subs.result_vars #>> '{cursor,id}'::text[]))::bigint))
-> Bitmap Index Scan on hasura_demo_events_c_id (cost=0.00..4.44 rows=1 width=0) (actual time=0.014..0.015 rows=0 loops=1)
Index Cond: ((customer_id IS NULL) AND (id > ((__subs.result_vars #>> '{cursor,id}'::text[]))::bigint))
SubPlan 1
-> Result (cost=0.00..0.01 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=999)
Planning Time: 0.514 ms
Execution Time: 7.201 ms
If I remove this redundant NULL check I get this:
Nested Loop Left Join (cost=27.59..27.65 rows=1 width=80) (actual time=1.683..1.686 rows=1 loops=1)
-> Function Scan on __subs (cost=0.01..0.01 rows=1 width=48) (actual time=0.007..0.008 rows=1 loops=1)
-> Subquery Scan on _hasura_demo_events_stream (cost=27.59..27.62 rows=1 width=64) (actual time=1.673..1.675 rows=1 loops=1)
-> Aggregate (cost=27.59..27.61 rows=1 width=64) (actual time=1.666..1.667 rows=1 loops=1)
-> Subquery Scan on "_root.base" (cost=0.44..22.59 rows=333 width=36) (actual time=0.044..1.295 rows=999 loops=1)
-> Limit (cost=0.44..15.10 rows=333 width=40) (actual time=0.038..0.348 rows=999 loops=1)
-> Index Only Scan using hasura_demo_events_c_id on events (cost=0.44..15.10 rows=333 width=40) (actual time=0.030..0.203 rows=999 loops=1)
Index Cond: ((customer_id = ((__subs.result_vars #>> '{session,x-hasura-demo-customer-id}'::text[]))::integer) AND (id > ((__subs.result_vars #>> '{cursor,id}'::text[]))::bigint))
Heap Fetches: 0
SubPlan 1
-> Result (cost=0.00..0.01 rows=1 width=32) (actual time=0.000..0.000 rows=1 loops=999)
Planning Time: 0.179 ms
Execution Time: 1.734 ms
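For reference, the only change behind that second plan is dropping the OR branch quoted above, leaving the WHERE clause as just the permission check plus the cursor comparison, roughly:
WHERE (("hasura_demo"."events"."customer_id") = ((("__subs"."result_vars" #>> ARRAY['session', 'x-hasura-demo-customer-id']))::integer))
  AND (("hasura_demo"."events"."id") > ((("__subs"."result_vars" #>> ARRAY['cursor', 'id']))::bigint))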
The BitmapOr that Postgres seems to use in the NULL-check case is quite a bit slower in both planning and execution time. We have a lot of subscriptions on our system, so this overhead adds up quickly. I'm surprised that Postgres doesn't optimize the query further, given that it knows the column is not nullable.
What is the expected behaviour?
Add a mechanism to skip the NULL check for permissions, especially when the column isn't nullable.
Any possible solutions/workarounds you're aware of?
No