Utility Functions
DbtOperator
This operator runs a DBT command from the command line. It lets you handle test failures gracefully, rather than failing the pipeline or relying on trigger rules such as "All done".
Args:
command (str): The DBT command to run.
profile_dir (str): The directory of the DBT profile
project_dir (str): The directory of the DBT project
fail_on_test (bool): Set to True to fail the Airflow task if any DBT tests fail. Defaults to False.
append_options (list): A list of string options to append to the DBT command line. Defaults to [].
environment_variables (object): Will set all the properties that start with DBT_ on this object to environment variables in the DBT environment. Defaults to .
enable_debug_mode (bool): Set to True to run DBT in debug mode, which shows all the queries that run. The Airflow variable dbt_enable_debug_mode overrides this parameter. Defaults to False.
Returns: An Airflow operator.
Raises: KeyError: Raises an exception.
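As a rough illustration of how the arguments above could translate into a DBT invocation, the sketch below assembles a command line and collects the DBT_-prefixed environment variables. It is not the operator's actual implementation: the helper names build_dbt_command and collect_dbt_env, the example paths, and the mapping of profile_dir and project_dir onto dbt's --profiles-dir and --project-dir flags are all assumptions made for illustration.

```python
from types import SimpleNamespace


def build_dbt_command(command, profile_dir, project_dir,
                      append_options=None, enable_debug_mode=False):
    """Assemble a dbt invocation as an argument list (hypothetical helper)."""
    args = ["dbt"]
    if enable_debug_mode:
        # --debug is a global dbt flag, so it is placed before the subcommand.
        args.append("--debug")
    args.append(command)
    # Assumption: the two directory arguments map onto these dbt flags.
    args += ["--profiles-dir", profile_dir, "--project-dir", project_dir]
    args += list(append_options or [])
    return args


def collect_dbt_env(environment_variables):
    """Return the DBT_-prefixed attributes of an object as a str-to-str dict."""
    if environment_variables is None:
        return {}
    return {
        name: str(value)
        for name, value in vars(environment_variables).items()
        if name.startswith("DBT_")
    }


# Hypothetical values, for illustration only.
settings = SimpleNamespace(DBT_TARGET="dev", OTHER="ignored")
dbt_args = build_dbt_command("test", "/opt/dbt/profiles", "/opt/dbt/project",
                             append_options=["--select", "staging"])
dbt_env = collect_dbt_env(settings)
```

Returning an argument list rather than a single string means the result could be passed to subprocess.run without shell quoting concerns.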
MonarchStatusOperator
This operator updates the Monarch database with the current status. The definitions here drive the status updates on the status page. To surface an ingestion source's progress on the dashboard status page, use this operator in your custom ingestion.
Args:
source_id (str): The value of the source column to link data to this metadata
name (str): The name of the source that should be displayed on the UI
status (str): The current status of the target or source. Options are Imported, Transformed, Merged, Pulled, Reviewing, Pushed.
schema (str): Name of the database schema that contains tin_list_table.
tin_list_table (str): Name of the table that has a column tin that will be used to determine the unique count.
tin_column (str): Name of the column in tin_list_table to use.
url (str): Optional.
Assumes: utilities.source_status is set up by running step 1
Returns: An Airflow operator.
Raises: KeyError: Raises an exception.
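The schema, tin_list_table, and tin_column arguments are described as feeding a unique count. One way such a count could be computed is sketched below, assuming a plain COUNT(DISTINCT ...) query. The helper name unique_tin_count_sql and the staging.providers demo table are hypothetical, and the operator itself may compute the count differently.

```python
import sqlite3


def unique_tin_count_sql(schema, tin_list_table, tin_column):
    """Build a distinct-count query (hypothetical helper, not the operator's code)."""
    for identifier in (schema, tin_list_table, tin_column):
        # Identifiers cannot be bound as query parameters, so only accept safe names.
        if not identifier.isidentifier():
            raise ValueError(f"unsafe identifier: {identifier!r}")
    return f"SELECT COUNT(DISTINCT {tin_column}) FROM {schema}.{tin_list_table}"


# Demo against an in-memory SQLite database attached under a schema-like name.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS staging")
conn.execute("CREATE TABLE staging.providers (tin TEXT)")
conn.executemany("INSERT INTO staging.providers VALUES (?)",
                 [("111",), ("222",), ("111",)])
query = unique_tin_count_sql("staging", "providers", "tin")
unique_tins = conn.execute(query).fetchone()[0]
```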
Statuses
Imported - When a source moves data into raw** schema
Transformed - When a source moves data into transformed** schema to match the standard data format and configurations are validated
Merged - When a source merges data into staging schema
Pulled - When a target gets the latest Salesforce metadata and data to put into the landing schema
MergedAll - When all sources are merged into the target
Reviewing - When a target gets its data source explorer indexed and final reports generated
Pushed - When a target pushes data into Salesforce
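The statuses listed above can be validated before they are passed to the operator. The sketch below is one possible approach, assuming the status strings are matched exactly and case-sensitively. The SourceStatus enum and the parse_status helper are hypothetical names and are not part of the operator.

```python
from enum import Enum


class SourceStatus(Enum):
    """The statuses from the list above (hypothetical class name)."""
    IMPORTED = "Imported"
    TRANSFORMED = "Transformed"
    MERGED = "Merged"
    PULLED = "Pulled"
    MERGED_ALL = "MergedAll"
    REVIEWING = "Reviewing"
    PUSHED = "Pushed"


def parse_status(value):
    """Return the matching SourceStatus, or raise ValueError listing valid values."""
    try:
        return SourceStatus(value)
    except ValueError:
        allowed = ", ".join(status.value for status in SourceStatus)
        raise ValueError(
            f"unknown status {value!r}; expected one of: {allowed}"
        ) from None


status = parse_status("Merged")
```

Rejecting an unknown status up front gives a clearer error than letting a misspelled value reach the status page.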