Welcome to your new dbt project!
Try running the following commands:
- dbt run
- dbt test
├── README.md
├── analyses
├── dbt_packages
├── dbt_project.yml             # Existence of this file == a dbt project
│                               # This file defines various project configurations
├── logs
│   └── dbt.log
├── macros
├── models
│   ├── device_event.sql        # This file is called a model;
│   │                           # there should be an entry named `device_event` under the key `models` in
│   │                           # schema.yml
│   └── schema.yml              # This file should be under the models/ dir
│                               # This file defines data sources, tables, models, tests on them, etc.
├── profiles.yml                # By default, dbt expects the profiles.yml file to be located in the ~/.dbt/ directory.
│                               # This file stores the different profiles of the project
│                               # (basically, env configs);
│                               # it contains all the details required to connect to your data warehouse.
├── seeds
├── snapshots
├── target
│   ├── compiled
│   │   └── sample
│   │       └── models
│   │           └── schema.yml
│   ├── graph.gpickle
│   ├── manifest.json
│   ├── partial_parse.msgpack
│   ├── run
│   │   └── sample
│   │       └── models
│   │           └── schema.yml
│   └── run_results.json
└── tests
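As a rough sketch of how `models/device_event.sql` and `models/schema.yml` from the tree above might fit together, the snippet below writes both files into a scratch directory. Only the model name `device_event` and the three vars (`source_iot_telemetry`, `table_device`, `table_event`) come from this README; the source name `iot_telemetry`, the column names and the join are assumptions made purely for illustration.

```shell
# Sketch only: files go to a scratch directory, not into your project.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/models"

# models/schema.yml: declares one source with two tables, plus the model.
# The source name and the column are hypothetical.
cat > "$demo_dir/models/schema.yml" <<'EOF'
version: 2

sources:
  - name: iot_telemetry                          # hypothetical source name
    schema: "{{ var('source_iot_telemetry') }}"  # dataset, passed via --vars
    tables:
      - name: device
        identifier: "{{ var('table_device') }}"
      - name: event
        identifier: "{{ var('table_event') }}"

models:
  - name: device_event                           # matches device_event.sql
    columns:
      - name: device_id                          # hypothetical column
        tests:
          - not_null
EOF

# models/device_event.sql: the model itself; note there is no trailing
# statement terminator. Columns and join condition are hypothetical.
cat > "$demo_dir/models/device_event.sql" <<'EOF'
select
    d.device_id,
    e.event_type
from {{ source('iot_telemetry', 'device') }} as d
join {{ source('iot_telemetry', 'event') }} as e
    on e.device_id = d.device_id
EOF

ls "$demo_dir/models"
```

The file name (`device_event.sql`) is what ties the model to its `name: device_event` entry in `schema.yml`.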
# --profile: override the profile set, if any, in dbt_project.yml
# --target:  override the target set, if any, in profiles.yml
# --vars:    override the vars set, if any, in dbt_project.yml
# --select:  specify the particular model to test
~/dbt-eg on main! ⌚ 18:59:47
$ dbt test \
  --profile bigquery \
  --target default \
  --vars "{source_iot_telemetry: iot_telemetry_dev, table_device: device, table_event: event}" \
  --select device_event
zsh: correct 'test' to 'tests' [nyae]? n
13:29:54 Running with dbt=1.4.4
13:29:54 Found 1 model, 4 tests, 0 snapshots, 0 analyses, 334 macros, 0 operations, 0 seed files, 2 sources, 0 exposures, 0 metrics
13:29:54
13:29:54 Nothing to do. Try checking your model configs and model specification args
(.venv)
~/dbt-eg on main! ⌚ 18:59:47
$ dbt run \
--profile bigquery \
--vars "{source_iot_telemetry: iot_telemetry_dev, table_device: device, table_event: event}" \
--select device_event
13:44:44 Running with dbt=1.4.4
13:44:44 Found 1 model, 4 tests, 0 snapshots, 0 analyses, 334 macros, 0 operations, 0 seed files, 2 sources, 0 exposures, 0 metrics
13:44:44
13:44:49 Concurrency: 1 threads (target='staging')
13:44:49
13:44:49 1 of 1 START sql view model iot_telemetry_dev.device_event ..................... [RUN]
13:44:51 1 of 1 OK created sql view model iot_telemetry_dev.device_event ................ [CREATE VIEW (0 processed) in 1.94s]
13:44:51
13:44:51 Finished running 1 view model in 0 hours 0 minutes and 7.23 seconds (7.23s).
13:44:51
13:44:51 Completed successfully
13:44:51
13:44:51 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
Run model device_event_fast and its upstream dependencies
(.venv)
# --target is passed explicitly this time, just to be verbose
~/dbt-eg on main! ⌚ 18:59:47
$ dbt run \
  --profile bigquery \
  --target default \
  --vars "{source_iot_telemetry: iot_telemetry_dev, table_device: device, table_event: event}" \
  --select "+device_event_fast"
Run pipeline device and its upstream dependencies
(.venv)
~/dbt-eg on main! ⌚ 18:59:47
$ dbt run \
--profile bigquery \
--target default \
--vars "{source_iot_telemetry: iot_telemetry_dev, table_device: device, table_event: event}" \
--select "+device"
- resources
- models
- snapshots
- seeds
- tests
https://docs.getdbt.com/docs/build/sources
- Incremental models
- https://docs.getdbt.com/reference/resource-configs/bigquery-configs
- configs
- dbt_project.yml
- properties
- source
- macro
https://docs.getdbt.com/docs/build/analyses
In dbt, libraries (to install) are called packages.
https://docs.getdbt.com/docs/build/packages
- Hub
- Git
- Private
- Local
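To make the four package kinds above concrete, here is a minimal sketch of a `packages.yml` with one entry per kind. Only the `dbt-labs/dbt_utils` hub entry refers to a real package; the `example-org` repositories, the `DBT_ENV_SECRET_GIT_TOKEN` environment variable, the revisions and the local path are placeholders, not anything this project is known to use.

```shell
# Sketch only: the file is written to a scratch directory.
demo_dir="$(mktemp -d)"

cat > "$demo_dir/packages.yml" <<'EOF'
packages:
  # Hub: a package published on hub.getdbt.com
  - package: dbt-labs/dbt_utils
    version: 1.0.0

  # Git: a public repository (placeholder URL and tag)
  - git: "https://github.com/example-org/example-dbt-package.git"
    revision: v0.1.0

  # Private: a git repository that needs credentials; the token is read
  # from a hypothetical environment variable
  - git: "https://{{ env_var('DBT_ENV_SECRET_GIT_TOKEN') }}@github.com/example-org/private-dbt-package.git"
    revision: main

  # Local: a package on the local filesystem (placeholder path)
  - local: ../shared-dbt-package
EOF

cat "$demo_dir/packages.yml"
```

In a real project `packages.yml` sits next to `dbt_project.yml`, and `dbt deps` installs the listed packages into `dbt_packages/`.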
https://docs.getdbt.com/docs/build/tests
- Singular test
- Generic tests
- Open-Sourced Custom tests
- Custom tests
- Store test failures in table
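The test kinds listed above could look roughly like the sketch below: built-in generic tests and a custom generic test declared in a property file, one of them storing its failures, and a singular test as a standalone SQL file. The column names, the `is_positive` test and the singular test's file name are all hypothetical; the `tests:` key is the one used by dbt 1.4, the version shown in the logs here.

```shell
# Sketch only: files go to a scratch directory, not into your project.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/models" "$demo_dir/tests/generic"

# Generic tests: declared on columns in a property file.
cat > "$demo_dir/models/schema.yml" <<'EOF'
version: 2

models:
  - name: device_event
    columns:
      - name: device_id                # hypothetical column
        tests:
          - not_null                   # built-in generic test
      - name: event_id                 # hypothetical column
        tests:
          - unique:
              config:
                store_failures: true   # keep failing rows in a table
      - name: battery_level            # hypothetical column
        tests:
          - is_positive                # custom generic test, defined below
EOF

# Custom generic test: a parametrised query; returned rows are failures.
cat > "$demo_dir/tests/generic/is_positive.sql" <<'EOF'
{% test is_positive(model, column_name) %}
select *
from {{ model }}
where {{ column_name }} <= 0
{% endtest %}
EOF

# Singular test: a one-off query against one model; it passes when the
# query returns zero rows.
cat > "$demo_dir/tests/assert_no_future_events.sql" <<'EOF'
select *
from {{ ref('device_event') }}
where event_time > current_timestamp()
EOF

find "$demo_dir" -type f | sort
```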
- Misc
- https://docs.getdbt.com/docs/get-started/connection-profiles
- https://docs.getdbt.com/reference/profiles.yml
- Node selection
- dbt Cloud
- Airflow
- Prefect
- Dagster
- CI/Jenkins
- Cron
- `;` at the end of a SQL statement is not accepted in `<model>.sql` files
- `dbt_project`: model config is namespaced by project name, and then (optionally) by directory (NOT package) names
- `profile` naming convention could be per warehouse, which suits better
  - Under which, `targets` could be the different environments of the warehouse
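The profile/target convention described above might look like the following `profiles.yml` sketch: one profile named after the warehouse (`bigquery`, as passed to `--profile` in this README) with one target per environment (`default` and `staging`, the target names that appear in the commands and logs here). The GCP project id, the dataset names and the choice of `oauth` as the auth method are assumptions for illustration.

```shell
# Sketch only: the file is written to a scratch directory.
demo_dir="$(mktemp -d)"

cat > "$demo_dir/profiles.yml" <<'EOF'
bigquery:                  # profile: one per warehouse
  target: staging          # target used when --target is not given
  outputs:
    default:               # target: one environment of the warehouse
      type: bigquery
      method: oauth
      project: example-gcp-project      # hypothetical GCP project id
      dataset: iot_telemetry_dev        # hypothetical dataset
      threads: 1
    staging:
      type: bigquery
      method: oauth
      project: example-gcp-project      # hypothetical GCP project id
      dataset: iot_telemetry_staging    # hypothetical dataset
      threads: 1
EOF

cat "$demo_dir/profiles.yml"
```

dbt looks for this file in `~/.dbt/` by default; `--profiles-dir` can point it at another directory.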
- How to modularize code?
  - Just use (sub)directories, there is no concept of package - same as Go/Python
  - File names within the `models` dir should be unique; otherwise dbt raises a compilation error.
- How to import across modularized models [defined within (sub)directories]?
  - As model names should be unique throughout the project, use `ref("<model name>")` to reference a model.
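A small sketch of the point above: two models in different subdirectories, where one references the other by name only. The `staging/` and `reports/` directories, the `device_event_count` model and the columns are hypothetical. The last command lists model file names that occur more than once under `models/`, which is the situation that would make dbt raise a compilation error; here it prints nothing.

```shell
# Sketch only: files go to a scratch directory, not into your project.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/models/staging" "$demo_dir/models/reports"

# A model in one subdirectory (hypothetical content).
cat > "$demo_dir/models/staging/device_event.sql" <<'EOF'
select 1 as device_id, 'boot' as event_type
EOF

# A model in another subdirectory: ref() takes the model name only,
# never a directory path.
cat > "$demo_dir/models/reports/device_event_count.sql" <<'EOF'
select
    device_id,
    count(*) as event_count
from {{ ref('device_event') }}
group by device_id
EOF

# List duplicated model file names across all subdirectories.
find "$demo_dir/models" -name '*.sql' -exec basename {} \; | sort | uniq -d
```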
- How to run/test a model along with its upstream dependencies as well?
  - Refer to the model selection CLI syntax
  - You can use the upstream (prefix the model name with `+`) syntax
- What's the difference between `config` and `property`?
  - Property
    - It describes/defines the resources and their structure
    - Any `.yml` file under any `resource` (models, seeds, analyses, snapshots etc.) directory is called a property file
  - Config
    - `config` is a special type of property that changes the configuration of operations on the resources (e.g. how to materialize)
    - `config` could be defined:
      - As a `config` macro within a `.sql` file
      - As a `config` key/node within any `.yml` property file
      - As a `<resource_name>` key/node (e.g. `models`, `seeds`) within the `dbt_project.yml` file
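The three places a config can be defined, as listed above, can be sketched side by side for a single setting (`materialized`). The project name `sample` is taken from the `target/compiled/sample` path in this README; the `staging` subdirectory and the model body are hypothetical. When the same setting appears in more than one place, the most specific definition wins: the `config()` macro in the `.sql` file, then the `.yml` property file, then `dbt_project.yml`.

```shell
# Sketch only: files go to a scratch directory, not into your project.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/models"

# 1. config() macro at the top of the model's .sql file
cat > "$demo_dir/models/device_event.sql" <<'EOF'
{{ config(materialized='view') }}

select 1 as device_id
EOF

# 2. config key in a .yml property file
cat > "$demo_dir/models/schema.yml" <<'EOF'
version: 2

models:
  - name: device_event
    config:
      materialized: view
EOF

# 3. resource-type key (models:) in dbt_project.yml, namespaced by
#    project name and then, optionally, by directory
cat > "$demo_dir/dbt_project.yml" <<'EOF'
name: sample
version: "1.0.0"
config-version: 2
profile: bigquery

models:
  sample:                   # project name
    +materialized: view     # applies to every model in the project
    staging:                # hypothetical models/staging/ directory
      +materialized: table  # applies only to models under that directory
EOF

grep -rn "materialized" "$demo_dir"
```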
- How to partition/cluster BigQuery models using config/property?
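One possible answer to the partition/cluster question above, sketched both ways: as a `config()` macro in the model file and as a `config` key in a property file, using the `partition_by` and `cluster_by` settings of the BigQuery adapter. Applying it to `device_event_fast`, and the `event_time` and `device_id` columns, are assumptions made for illustration; partitioning and clustering apply to table or incremental materializations, not to views.

```shell
# Sketch only: files go to a scratch directory, not into your project.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/models"

# Option A: config() macro inside the model file.
cat > "$demo_dir/models/device_event_fast.sql" <<'EOF'
{{ config(
    materialized='table',
    partition_by={
      "field": "event_time",
      "data_type": "timestamp",
      "granularity": "day"
    },
    cluster_by=["device_id"]
) }}

select *
from {{ ref('device_event') }}
EOF

# Option B: the same settings as a config key in a property file.
# In practice you would pick one of the two places, not both.
cat > "$demo_dir/models/schema.yml" <<'EOF'
version: 2

models:
  - name: device_event_fast
    config:
      materialized: table
      partition_by:
        field: event_time
        data_type: timestamp
        granularity: day
      cluster_by: ["device_id"]
EOF

cat "$demo_dir/models/device_event_fast.sql" "$demo_dir/models/schema.yml"
```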
- https://docs.getdbt.com/guides/best-practices/how-we-structure/1-guide-overview
- https://medium.com/geekculture/how-to-structure-your-dbt-project-c62103deceb4
- https://cloudacademy.com/lab/best-practices-organizing-dbt-models/
- https://towardsdatascience.com/dbt-models-structure-c31c8977b5fc
- https://towardsdatascience.com/the-most-efficient-way-to-organize-dbt-models-244e23c17072
- https://www.databricks.com/blog/2022/12/15/best-practices-super-powering-your-dbt-project-databricks.html
- https://airbyte.com/blog/best-practices-dbt-style-guide
- https://www.databricks.com/blog/2022/12/15/best-practices-super-powering-your-dbt-project-databricks.html (lint/format)
- Learn more about dbt in the docs
- Check out Discourse for commonly asked questions and answers
- Join the chat on Slack for live discussions and support
- Find dbt events near you
- Check out the blog for the latest news on dbt's development and best practices