I’m not quite understanding the documentation well on looker query cache vs PDT rebuilt trigger within datagroup .

I
if I define a datagroup(with both sql_trigger and max_cache_age) like below and use it in my model persists_with, when will cache in the model be cleared? Is it every 24 hours or when the max(id) change ? Suppose max(id) changes every hour, does my model cached queries be invalidated every hour or 24 hours ?

datagroup: orders_datagroup {
  sql_trigger: SELECT max(id) FROM my_tablename ;;
  max_cache_age: "24 hours"
}

Is it possible to use one datagroup that can perform the following?

  1. sql_trigger to rebuild PDT data every morning at 6 AM.

  2. max_cache_age to refresh cached results ONLY after more than 30 hours. I don’t want max_cache_age to trigger rebuild of PDT.

1 Like

Hi @crazyrevol !
If max(id) changes every hour, then query cache is cleared every hour, because sql_trigger fires more often than max_cache_age.

You can’t separate PDT rebuild and cache expiration using one datagroup — both use the same logic.

To achieve your goal, use two datagroups:
– One with sql_trigger for PDT refresh at 6 AM
– One with max_cache_age for query cache (e.g., 30h)

@a_aleinikov - thank you for your reply! very helpful.

Three additional question -

  1. If I use a dataGroup on my PDT with max_cache_age = 1h and sql_trigger for PDT refresh at 6AM. Will my PDT be built every hour?

  2. if I apply a dataGroup to an explore in the model file, will the dataGroup be applied to derived table regeneration as well ? In other words, applying dataGroup to an explore make the derived_table a PDT ?

  3. what sql_trigger query to write to make sure my PDT refreshes every weekday only at/after 6AM CST ?