
Running Monarch in FHIR Mode

This guide walks you through setting up Monarch in FHIR mode. It covers setup through loading sample data into staging, the FHIR API, and background on FHIR.

FHIR vs V12 mode

In FHIR mode, the primary results of ingesting and transforming data are FHIR resources, which can be accessed through GET endpoints on the Monarch FHIR server. FHIR mode is completely separate from Salesforce. This contrasts with V12 mode, which primarily serves to transform input data into the V12 data model for loading into Salesforce.

Note: You can also run FHIR mode using V12 data as the base data model. This approach allows you to leverage existing V12 data structures while still generating FHIR resources for API access.

Setup

Read the Database Migrations docs for some background on what the setup DAG does. This guide only covers how to load configuration into the right places for installing Monarch in FHIR mode. It also assumes you have already deleted all schemas other than public from your database.

Option 1: Standard FHIR Setup

Loading Config

FHIR mode comes with CSVs that are used to populate the monarch_config tables during the setup process. To ensure everything gets loaded correctly, run the following commands before running the setup DAG:

cat custom/fhir_module/config/object.csv > custom/config/object.csv
cat custom/fhir_module/config/object_field.csv > custom/config/object_field.csv
cat custom/fhir_module/config/lookup.csv > custom/config/lookup.csv
cat custom/fhir_module/config/source-explorer/source_explorer.csv > custom/config/source_explorer.csv
cat custom/fhir_module/config/source-explorer/se_object.csv > custom/config/se_object.csv
cat custom/fhir_module/config/source-explorer/se_object_field.csv > custom/config/se_object_field.csv
cat custom/fhir_module/config/source-explorer/se_object_lookup.csv > custom/config/se_object_lookup.csv
cat custom/fhir_module/config/test_to_report_mapping.csv > custom/config/test_to_report_mapping.csv
cat custom/fhir_module/config/report_master.csv > custom/config/report_master.csv

Part of the setup DAG reads data from custom/schema/object_meta.json. To load the FHIR configuration into this file, run the command:

cat custom/fhir_module/schema/object_meta.json > custom/schema/object_meta.json
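
For repeat installs, the copy steps above could be scripted. The following is a minimal Python sketch, not part of Monarch: the function name copy_fhir_config and the repo_root argument are assumptions made for illustration, while the source and destination paths are taken from the commands above.

```python
import shutil
from pathlib import Path

# Source -> destination pairs, taken from the cat commands in this section.
FHIR_CONFIG_COPIES = {
    "custom/fhir_module/config/object.csv": "custom/config/object.csv",
    "custom/fhir_module/config/object_field.csv": "custom/config/object_field.csv",
    "custom/fhir_module/config/lookup.csv": "custom/config/lookup.csv",
    "custom/fhir_module/config/source-explorer/source_explorer.csv": "custom/config/source_explorer.csv",
    "custom/fhir_module/config/source-explorer/se_object.csv": "custom/config/se_object.csv",
    "custom/fhir_module/config/source-explorer/se_object_field.csv": "custom/config/se_object_field.csv",
    "custom/fhir_module/config/source-explorer/se_object_lookup.csv": "custom/config/se_object_lookup.csv",
    "custom/fhir_module/config/test_to_report_mapping.csv": "custom/config/test_to_report_mapping.csv",
    "custom/fhir_module/config/report_master.csv": "custom/config/report_master.csv",
    "custom/fhir_module/schema/object_meta.json": "custom/schema/object_meta.json",
}


def copy_fhir_config(repo_root: Path) -> list[Path]:
    """Overwrite each destination file with its FHIR source; return the files written."""
    written = []
    for src, dst in FHIR_CONFIG_COPIES.items():
        src_path = repo_root / src
        dst_path = repo_root / dst
        if not src_path.is_file():
            # Fail early rather than leave the config half-updated without notice.
            raise FileNotFoundError(f"missing FHIR config file: {src_path}")
        # Unlike `cat >`, this also creates the destination directory if it is absent.
        dst_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src_path, dst_path)
        written.append(dst_path)
    return written
```

Calling copy_fhir_config(Path(".")) from the repository root would have the same effect as running the commands by hand.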

Running the DAG

To complete setup, navigate to localhost:8080/airflow/home and trigger dag01_setup_provider_schema with the following parameters:

  • Pull new SF metadata - False
  • Generate and Run Migrations - True
  • Load custom configuration - True

Validation

Once the setup DAG has finished running, check your landing or staging schemas. You should see tables named practitioner, practitioner_role, organization, organization_affiliation, and location, among others. If you still see V12 tables such as account or contact, make sure that you copied the data from custom/fhir_module/schema/object_meta.json and ran the DAG with the Pull new SF metadata parameter set to False.
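
The check described above can be expressed as a small helper. This is a sketch under assumptions: check_fhir_schema is a hypothetical name, and how you obtain the list of table names (for example, by querying your database's catalog) is left out. The table names are the ones listed in this section and are not exhaustive.

```python
# Table names taken from the Validation paragraph; both sets are partial.
EXPECTED_FHIR_TABLES = {
    "practitioner",
    "practitioner_role",
    "organization",
    "organization_affiliation",
    "location",
}
V12_TABLES = {"account", "contact"}


def check_fhir_schema(table_names):
    """Report expected FHIR tables that are missing and V12 tables that remain."""
    present = set(table_names)
    return {
        "missing_fhir": sorted(EXPECTED_FHIR_TABLES - present),
        "leftover_v12": sorted(V12_TABLES & present),
    }
```

An empty list under both keys suggests the setup DAG ran with the FHIR configuration. Any entry under leftover_v12 points back to the object_meta.json copy step or the Pull new SF metadata parameter.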

Option 2: FHIR Mode with V12 Data Base

If you want to run FHIR mode using V12 data as the base data model, follow these steps:

1. Set Up V12 Data Model

First, ensure you have V12 data loaded into your staging schema using the standard V12 ingestion process. This will create the typical V12 tables like account, contact, etc.

2. Configure Source Ingestion for FHIR Target

When setting up your source ingestion, configure the target type as "FHIR" instead of "V12". This tells the system to transform the V12 data into FHIR resources rather than loading it into Salesforce.

3. Copy Config and Adjust the fhir_setup_custom_queries DAG

To transform V12 data into FHIR resources and use Source Explorer, you need to do two things. First, copy the config files in custom/fhir_module/config/v12. Second, add the names of the queries from src/custom/fhir_module/queries/v12 with the _all suffix to the list in src/custom/fhir_module/dags/setup_fhir_custom_query.py.

4. Load Custom V12 Queries

To enable FHIR API access to your V12 data, you'll need to load custom queries that transform V12 data into FHIR resource format. Use the fhir_setup_custom_queries DAG to load these queries:

  1. Navigate to Airflow
  2. Trigger the fhir_setup_custom_queries DAG
  3. This DAG will load custom queries from src/custom/fhir_module/queries/v12 into the monarch_config.custom_query table

5. Create FHIR Resource Schema

After loading the custom queries, re-run the 04_data_quality_and_mastering DAG. This will:

  1. Copy data from the staging schema into a new fhir_resource schema
  2. Transform the V12 data into FHIR resource format
  3. Make the data accessible through the Source Explorer endpoints for FHIR Explorer

6. Enable Source Explorer for FHIR

Once the fhir_resource schema is created, the FHIR Source Explorer UI will become accessible. This allows you to browse and explore your FHIR resources through the web interface.

Validation for V12-Based FHIR

After completing the setup, you should see:

  • V12 tables in the staging schema (account, contact, etc.)
  • FHIR resource tables in the fhir_resource schema
  • Custom queries in monarch_config.custom_query with fhir_ prefixes
  • FHIR Source Explorer UI accessible for browsing resources

Endpoints

To serve FHIR resources, we have 5 FHIR server GET endpoints to retrieve individual FHIR resources. The queries run by these endpoints are managed in the monarch_config.custom_query table.

In general, sending a GET request to /fhir/{resource_type}/{resource_id} fetches the custom_query record with the name fhir_{resource_type} and passes in resource_id as the parameter.
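
The routing rule above can be illustrated with a short sketch. The function name resolve_fhir_request is hypothetical and this is not the server's actual code. It assumes the resource type is used verbatim in the query name; any case or naming normalization the real server applies is not shown.

```python
def resolve_fhir_request(path: str) -> tuple[str, dict[str, str]]:
    """Map /fhir/{resource_type}/{resource_id} to (custom_query name, query params)."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "fhir" or not parts[1] or not parts[2]:
        raise ValueError(f"not a FHIR resource path: {path!r}")
    _, resource_type, resource_id = parts
    # The query is looked up by name; resource_id is passed in as its parameter.
    return f"fhir_{resource_type}", {"resource_id": resource_id}
```

For example, resolve_fhir_request("/fhir/practitioner/abc") yields the query name fhir_practitioner with resource_id set to abc.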

Custom Query

The custom query table holds information related to what SQL query to run for a specific label. The columns in monarch_config.custom_query are detailed below:

  • name - The name of the query. This is used to identify which query should be run. For FHIR, all query names should have the fhir_ prefix, e.g. fhir_practitioner or fhir_location.
  • query - The query to be run. For the FHIR endpoints, these should all be SELECT statements against the various FHIR tables in the staging schema.
  • required_params - Parameters that are required for the query to run. At a minimum, FHIR queries should require resource_id as a parameter, to filter the result set down to a single FHIR resource.
  • response_schema - The expected schema of the query result. This is used for validation after the query has been run, to ensure that required fields are present in the final result.
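
To make the role of each column concrete, here is a minimal sketch of how a record might be checked before and after its query runs. It is illustrative only. The CustomQuery class, the helper names, and the example SQL are hypothetical, and response_schema is simplified to a list of required field names, which may not match the format Monarch actually stores.

```python
from dataclasses import dataclass, field


@dataclass
class CustomQuery:
    name: str
    query: str
    required_params: list[str] = field(default_factory=list)
    # Assumption: modelled here as required field names only.
    response_schema: list[str] = field(default_factory=list)


def missing_params(cq: CustomQuery, params: dict) -> list[str]:
    """Required parameters the caller did not supply."""
    return [p for p in cq.required_params if p not in params]


def missing_fields(cq: CustomQuery, row: dict) -> list[str]:
    """Required fields that are absent (or null) in a result row."""
    return [f for f in cq.response_schema if row.get(f) is None]


# Hypothetical record; the SQL text and column names are placeholders.
example = CustomQuery(
    name="fhir_practitioner",
    query="SELECT * FROM staging.practitioner WHERE resource_id = %(resource_id)s",
    required_params=["resource_id"],
    response_schema=["id", "resourceType"],
)
```

A request would be rejected when missing_params returns anything, and a result would fail validation when missing_fields returns anything.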

Loading Custom Queries

To load data into the custom_query table, use the DAG named fhir_setup_custom_queries. The source code for this DAG is located at src/custom/fhir_module/dags/setup_fhir_custom_query.py. This DAG pulls the queries from src/custom/fhir_module/queries and loads them into custom_query under the expected names.

Available Resources

We currently support 5 FHIR resources in Monarch. For more information, please check out the following FHIR documentation

Sample Data

We have a couple different sets of sample data for use in Monarch's FHIR mode.

Setting Up Small Sample Data

We have a set of sample data available from the Defacto Health Website. This sample defacto data is already loaded into the e2e_testing/defacto_sample_data folder. The command cp e2e_testing/defacto_sample_data/*.json ../importdata will load this data to where it needs to be for the next DAG to load it.

NOTE: You may need to modify the filenames to read in src/custom/dags/dag_import_and_transform_defacto_health_to_fhir.py

Setting Up Large Sample Data

We have a large set of sample data provided directly by Defacto Health that is available from X by 2's SharePoint here. The DAG is configured to read locations_1.json, practitionerroles_1.json, practitioners_1.json, organizations_1.json, and organizationaffiliations_1.json once those are loaded into the importdata directory.
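
Before triggering the DAG, it can help to confirm that all five files are in place. The helper below is a hypothetical sketch, not part of Monarch. The filenames come from the paragraph above, and the location of the importdata directory is passed in by the caller.

```python
from pathlib import Path

# Filenames the DAG is configured to read, per the paragraph above.
EXPECTED_IMPORT_FILES = [
    "locations_1.json",
    "practitionerroles_1.json",
    "practitioners_1.json",
    "organizations_1.json",
    "organizationaffiliations_1.json",
]


def missing_import_files(import_dir, expected=EXPECTED_IMPORT_FILES):
    """Return the expected filenames that are not present in import_dir."""
    return [name for name in expected if not (Path(import_dir) / name).is_file()]
```

An empty list means every expected file was found. Pass a different expected list if you have changed the filenames in the DAG.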

Setting Up Sample Merge Data

We have a function that generates sample data in the same format as the defacto data to emulate duplicate practitioners. Run these steps to generate it:

cd test_data_generator/
python3 generate_fhir_merge_test_data 100

The 100 is a parameter telling the function to generate 100 match pairs. You can adjust the number as needed.

Once the function runs, it will create the following files:

  • merge_sample_locations.json
  • merge_sample_practitionerroles.json
  • merge_sample_practitioners.json
  • merge_sample_organizations.json
  • merge_sample_organizationaffiliations.json

Running the command cp merge_sample_*.json ../../importdata will load this data to where it needs to be for the DAG to load it.

NOTE: You will need to modify the filenames to read in src/custom/dags/dag_import_and_transform_defacto_health_to_fhir.py. With this data set, you will also need to omit the schema_file_name and path_to_schema_properties parameters in the JSONtoRawOperator call, because the merge data set contains additional Monarch-specific fields that are not part of the Defacto schema.

Transforming Sample Data

Once data is in the importdata directory and the Defacto Health DAG has been updated to match the filenames, you can run 02_import_and_conform_defacto_to_fhir.

This DAG creates a raw_defacto schema to store the raw Defacto data loaded into importdata. It then loads the importdata files into raw_defacto and runs the DBT transformations in src/custom/dbt/dbt_transform_defacto_to_fhir. These transformations primarily do two things: load data into tables in dbt_transformed_defacto, and create source explorer views in fhir_source_explorer. Once data is loaded into staging, these views will contain data and the Source Explorer UI can be used to view records for FHIR resources.

Staging Sample Data

To get data into staging, run 03_import_data_into_landing, then 04_data_quality_and_mastering. 03_import_data_into_landing uses the script located at src/custom/copy_scripts/defacto_to_fhir_final_copy.sql to load data from dbt_transformed_defacto into landing. Some tasks in 04_data_quality_and_mastering may fail. This does not matter as long as the failures occur after the task that writes from landing to staging.

FHIR Data Model

Most of the FHIR Data Model is taken directly from the FHIR specification for different FHIR resources. Because FHIR is document data, and we use Postgres for relational data modeling, parts of the FHIR spec had to be translated to a relational model. We therefore need several junction tables between FHIR Resources and related details. The primary example of this is seen with the contact_detail tables.

Each FHIR resource has its own table for data related to that specific FHIR resource. There is also a parent resource table that tracks all FHIR resources and resource IDs. Most FHIR resources can also have 1 or more sets of contact information, which is stored in the contact_detail table and related to a FHIR resource through a junction table. For example, Practitioner contact information would be tracked through the practitioner_contact_detail table.
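
The relationship between a resource table, contact_detail, and its junction table can be sketched as follows. This is an illustration under assumptions, not Monarch's actual schema. It uses an in-memory SQLite database as a stand-in for Postgres, and every column name (resource_id, family_name, system, value, and so on) is invented for the example. Only the table names come from the text above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE resource (
    resource_id   TEXT PRIMARY KEY,
    resource_type TEXT NOT NULL
);
CREATE TABLE practitioner (
    resource_id TEXT PRIMARY KEY REFERENCES resource(resource_id),
    family_name TEXT
);
CREATE TABLE contact_detail (
    contact_detail_id INTEGER PRIMARY KEY,
    system TEXT,
    value  TEXT
);
CREATE TABLE practitioner_contact_detail (
    practitioner_id   TEXT    REFERENCES practitioner(resource_id),
    contact_detail_id INTEGER REFERENCES contact_detail(contact_detail_id),
    PRIMARY KEY (practitioner_id, contact_detail_id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the demo schema with one practitioner and two contact details."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO resource VALUES ('p-1', 'Practitioner')")
    conn.execute("INSERT INTO practitioner VALUES ('p-1', 'Example')")
    conn.executemany(
        "INSERT INTO contact_detail VALUES (?, ?, ?)",
        [(1, "phone", "555-0100"), (2, "email", "p1@example.org")],
    )
    conn.executemany(
        "INSERT INTO practitioner_contact_detail VALUES (?, ?)",
        [("p-1", 1), ("p-1", 2)],
    )
    return conn


def contact_details_for(conn: sqlite3.Connection, practitioner_id: str):
    """Follow the junction table from a practitioner to its contact details."""
    return conn.execute(
        """
        SELECT cd.system, cd.value
        FROM practitioner_contact_detail AS pcd
        JOIN contact_detail AS cd
          ON cd.contact_detail_id = pcd.contact_detail_id
        WHERE pcd.practitioner_id = ?
        ORDER BY cd.contact_detail_id
        """,
        (practitioner_id,),
    ).fetchall()
```

The junction table is what lets one practitioner carry several contact entries, which a single FHIR document would hold as a nested array.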

FHIR background knowledge

For general FHIR knowledge, the best place to reference is the FHIR spec itself - https://build.fhir.org/modules.html

This section gives a quick, general overview of what the FHIR spec is for. More detailed information about the individual FHIR resources is linked above in the Available Resources section.

What is FHIR?

FHIR (Fast Healthcare Interoperability Resources) is a data standard developed by HL7 to improve healthcare data communication. By aligning healthcare systems to a single standard for healthcare data, communication between these systems is streamlined.

If everyone is using FHIR, then no one has to worry about mapping their data from System A so that it is compatible with System B!

What are FHIR Resources?

A FHIR Resource is a data object (in the case of Monarch's implementation, we use JSON representations of FHIR resources) that represents some concept in the healthcare space. Each FHIR Resource has fields defined in the FHIR spec, but FHIR also supports extensions for custom fields if needed. The primary identifier of a FHIR Resource is the Resource ID, which is a unique ID generated at the time of creation. Monarch's implementation uses UUIDs for these IDs, but that isn't strictly necessary as long as each Resource has a unique ID in the space of all Resources.

Internal sample data may have non-unique IDs, for example a Practitioner and an Organization both sharing an ID of 1. When we pull this data into Monarch, we assign the Practitioner and the Organization each their own Resource ID and store the existing ID as a source ID value for mapping back to raw data later.
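
The ID handling described above can be sketched in a few lines. This is an illustration, not Monarch's ingestion code. The function name assign_resource_ids and the source_id key are assumptions. The use of UUIDs matches the description above.

```python
import uuid


def assign_resource_ids(records):
    """Give each record a fresh UUID as its Resource ID and keep the old ID as source_id."""
    assigned = []
    for record in records:
        updated = dict(record)  # copy so the raw input is left unchanged
        updated["source_id"] = record["id"]
        updated["id"] = str(uuid.uuid4())
        assigned.append(updated)
    return assigned
```

Given a Practitioner and an Organization that both arrive with an ID of 1, each result gets a distinct Resource ID while source_id stays 1 for both.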

FHIR Resource References

While most fields in FHIR resources are standard data types (int, string, etc.) or collections of data types (e.g., a HumanName element in a FHIR resource is just a collection of strings for surname, given name, etc.), some are references to other FHIR resources.

These references are meant to point to a FHIR resource that exists on the FHIR server. Monarch's implementation uses strings for references, in the format {resource type}/{resource id}. For example, a Location resource may have a reference to another Location resource with Resource ID 1 as its partOf element. The FHIR resource we generate would include "partOf": "Location/1".
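
As a small illustration of the reference format, the helpers below build and split reference strings. Both function names are hypothetical, and the parsing rule (exactly one slash, both parts non-empty) is an assumption made for the sketch.

```python
def make_reference(resource_type: str, resource_id: str) -> str:
    """Build a reference string in the {resource type}/{resource id} format."""
    return f"{resource_type}/{resource_id}"


def parse_reference(reference: str) -> tuple[str, str]:
    """Split a reference string back into (resource type, resource id)."""
    resource_type, separator, resource_id = reference.partition("/")
    if not separator or not resource_type or not resource_id or "/" in resource_id:
        raise ValueError(f"not a valid reference: {reference!r}")
    return resource_type, resource_id
```

For the example above, make_reference("Location", "1") produces "Location/1", and parse_reference reverses it.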