Utility Functions

DbtOperator

This operator runs a DBT command from the command line. It lets you handle test failures gracefully rather than failing the pipeline, and it avoids the need for trigger rules such as "All done".

Args:

command (str): The name of your indexsource as written in the config file, e.g. ProviderMerge

profile_dir (str): The directory of the DBT profile

project_dir (str): The directory of the DBT project

fail_on_test (str): Set to True if you want the Airflow task to fail when any DBT test fails. Defaults to False.

append_options (list): A list of string options to append to the DBT command line. Defaults to [].

environment_variables (object): Will set all the properties that start with DBT_ on this object to environment variables in the DBT environment. Defaults to .

enable_debug_mode (bool): Set to True to run DBT in debug mode, which shows all the queries that run. Setting the Airflow variable dbt_enable_debug_mode overrides this parameter. Defaults to False.

Returns: Returns an Airflow operator

Raises: KeyError: Raises an exception.
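
The following is a minimal sketch of how the arguments above could map onto a DBT command line and environment. It is not the operator's actual implementation: the helper names build_dbt_command and dbt_environment are hypothetical, and it assumes that command holds a dbt subcommand such as run or test. The flags --profiles-dir, --project-dir and --debug are standard dbt CLI options.

```python
import shlex
from types import SimpleNamespace
from typing import Dict, List, Optional


def build_dbt_command(
    command: str,
    profile_dir: str,
    project_dir: str,
    append_options: Optional[List[str]] = None,
    enable_debug_mode: bool = False,
) -> List[str]:
    """Assemble an argv list for the dbt CLI (hypothetical helper)."""
    argv = ["dbt"]
    if enable_debug_mode:
        # --debug is a global dbt flag, so it is placed before the subcommand.
        argv.append("--debug")
    # Assumption: command is a dbt subcommand, possibly with its own arguments.
    argv += shlex.split(command)
    argv += ["--profiles-dir", profile_dir, "--project-dir", project_dir]
    # Extra options are appended last, in the order given.
    argv += list(append_options or [])
    return argv


def dbt_environment(settings: object) -> Dict[str, str]:
    """Keep only attributes whose names start with DBT_ (hypothetical helper)."""
    return {
        name: str(value)
        for name, value in vars(settings).items()
        if name.startswith("DBT_")
    }


if __name__ == "__main__":
    argv = build_dbt_command(
        "test", "/opt/dbt/profiles", "/opt/dbt/project",
        append_options=["--select", "providers"],
        enable_debug_mode=True,
    )
    print(argv)
    print(dbt_environment(SimpleNamespace(DBT_TARGET="dev", UNRELATED="x")))
```

Building the command as a list rather than a single string keeps each option as a separate argument, so values containing spaces do not need shell quoting.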

MonarchStatusOperator

This operator updates the Monarch database with the current status. The statuses defined here drive the updates shown on the status page. To surface an ingestion source's progress on the dashboard status page, use this operator in your custom ingestion.

Args:

source_id (str): The value of the source column to link data to this metadata

name (str): The name of the source that should be displayed on the UI

status (str): The current status of the target or source. Options are Imported, Transformed, Merged, Pulled, Reviewing, Pushed.

schema (str): Name of the database schema that contains tin_list_table.

tin_list_table (str): Name of the table that has a tin column, which is used to determine the unique count.

tin_column (str): Name of the column in tin_list_table to use.

url (str): OPTIONAL

Assumes: utilities.source_status is set up by running step 1

Returns: Returns an Airflow operator

Raises: KeyError: Raises an exception.
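
To illustrate how schema, tin_list_table and tin_column fit together, here is a minimal sketch of a query that counts unique values in the tin column. The function name unique_tin_count_query, the default column name tin, and the exact SQL are assumptions for illustration; the operator's real query is not documented here.

```python
import re

# Plain SQL identifiers only: letters, digits and underscores, not starting with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def unique_tin_count_query(schema: str, tin_list_table: str, tin_column: str = "tin") -> str:
    """Build a COUNT(DISTINCT ...) query for the tin column (hypothetical helper)."""
    for name in (schema, tin_list_table, tin_column):
        # Identifiers cannot be bound as query parameters, so validate them instead.
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    return f"SELECT COUNT(DISTINCT {tin_column}) FROM {schema}.{tin_list_table}"


if __name__ == "__main__":
    print(unique_tin_count_query("staging", "provider_tins"))
```

The identifier check is there because schema, table and column names are interpolated directly into the SQL text.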

Statuses

Imported - When a source moves data into raw** schema

Transformed - When a source moves data into transformed** schema to match the standard data format and configurations are validated

Merged - When a source merges data into staging schema

Pulled - When a target gets the latest Salesforce metadata and data to put into landing schema

MergedAll - When all sources are merged into the target

Reviewing - When a target gets its data source explorer indexed and final reports generated

Pushed - When a target pushes data into Salesforce
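
The statuses listed above can be captured as an enumeration so that a misspelled status is caught before the operator runs. This is only a sketch: the Status class and parse_status function are hypothetical names and are not part of the documented API.

```python
from enum import Enum


class Status(str, Enum):
    """Status values as listed in the Statuses section."""
    IMPORTED = "Imported"
    TRANSFORMED = "Transformed"
    MERGED = "Merged"
    PULLED = "Pulled"
    MERGED_ALL = "MergedAll"
    REVIEWING = "Reviewing"
    PUSHED = "Pushed"


def parse_status(value: str) -> Status:
    """Return the matching Status, or raise ValueError listing the allowed values."""
    try:
        return Status(value)
    except ValueError:
        allowed = ", ".join(s.value for s in Status)
        raise ValueError(f"unknown status {value!r}; expected one of: {allowed}") from None


if __name__ == "__main__":
    print(parse_status("Merged").value)
```

Lookup is case-sensitive, so "merged" is rejected while "Merged" is accepted.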