I’m not quite understanding the documentation well on looker query cache vs PDT rebuilt trigger within datagroup .
I
if I define a datagroup(with both sql_trigger and max_cache_age) like below and use it in my model persists_with, when will cache in the model be cleared? Is it every 24 hours or when the max(id) change ? Suppose max(id) changes every hour, does my model cached queries be invalidated every hour or 24 hours ?
datagroup: orders_datagroup {
sql_trigger: SELECT max(id) FROM my_tablename ;;
max_cache_age: "24 hours"
}
Is it possible to use one datagroup that can perform the following?
-
sql_trigger to rebuild PDT data every morning at 6 AM.
-
max_cache_age to refresh cached results ONLY after more than 30 hours. I don’t want max_cache_age to trigger rebuild of PDT.
1 Like
Hi @crazyrevol !
If max(id) changes every hour, then query cache is cleared every hour, because sql_trigger fires more often than max_cache_age.
You can’t separate PDT rebuild and cache expiration using one datagroup — both use the same logic.
To achieve your goal, use two datagroups:
– One with sql_trigger for PDT refresh at 6 AM
– One with max_cache_age for query cache (e.g., 30h)
@a_aleinikov - thank you for your reply! very helpful.
Three additional question -
-
If I use a dataGroup on my PDT with max_cache_age = 1h and sql_trigger for PDT refresh at 6AM. Will my PDT be built every hour?
-
if I apply a dataGroup to an explore in the model file, will the dataGroup be applied to derived table regeneration as well ? In other words, applying dataGroup to an explore make the derived_table a PDT ?
-
what sql_trigger query to write to make sure my PDT refreshes every weekday only at/after 6AM CST ?