data-load-config-dev-notes
Dev Notes Overview
These are issues we've run into again and again while working through data loads.
Behavior of Object_Field vs Lookup Configurations
There is some confusion about how a field behaves depending on whether it is enabled in the object_field and lookup config tables. This section describes the behavior at each step for each combination.
1. object_field enabled ✅, lookup enabled ✅
- Smart Copy
- The field gets copied from staging -> staging_to_upload
- The lookup is traversed to get to the next table
- Data load Main step
- The field is not added directly to the main field list
- If the lookup is not deferred, join on the other table to load that record's Id as the field value. If it is deferred, skip it in this step
- Data load Defer step (only for deferred lookup)
- Join on the other table to load in that record's Id as the field value
2. object_field disabled ❌, lookup enabled ✅
- Note: previously, lookup fields were omitted from the object_field config entirely. However, disabling a field yields different behavior than omitting its row from the table.
- Smart Copy
- The field is removed from staging_to_upload
- This is problematic. For example, if Id or ParentId is disabled, the staging_to_upload schema will be missing the field entirely
- A better approach is to enable both the object_field and lookup rows, or to remove the row from object_field entirely
- The lookup is traversed to get to the next table
- Data load Main step
- The field is removed from the main fields
- If the lookup is not deferred, join on the other table to load that record's Id as the field value. If it is deferred, skip it in this step
- Data load Defer step (only for deferred lookup)
- Join on the other table to load in that record's Id as the field value
- In general, please AVOID this situation. To disable a field, follow configuration 4.
3. object_field enabled ✅, lookup disabled ❌
- Smart Copy
- The field gets copied from staging -> staging_to_upload
- Data load Main step
- The field is added directly to the main field list without the join. The value loaded is what you see in the db for that field.
- Data load Defer step: none (the lookup is disabled, so there is nothing to defer)
- One use case of this setting is when loading the OwnerId field. If the table being loaded already contains the exact OwnerId value for the field, there is no need to join on the User table to fetch it.
- More broadly, when an object has sf_use_source_column = False, foreign key fields that reference it will contain its raw value (e.g. Id) rather than its migration id. This means that whenever a field is a foreign key to another table, and that other table has sf_use_source_column = False, the field can be treated as a regular field and its lookup can be disabled.
4. object_field disabled ❌, lookup disabled ❌
- Smart Copy
- The field is removed from staging_to_upload
- Data load Main step
- The field is removed from the main fields
- Data load Defer step: none (the lookup is disabled, so there is nothing to defer)
- This is the correct way to fully ignore a field (unlike configuration 2)
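The four configurations above can be read as a small decision table. The following is a minimal sketch of that table, not the loader's actual code: `FieldBehavior`, `describe_field_behavior`, and `is_discouraged` are hypothetical names introduced here, and they only encode the rules stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldBehavior:
    """Hypothetical summary of what happens to one field at each step."""
    copied_to_staging_to_upload: bool  # Smart Copy keeps the column
    lookup_traversed: bool             # Smart Copy follows the lookup to the next table
    in_main_field_list: bool           # Main step loads the raw db value directly
    joined_in_main_step: bool          # Main step joins to fetch the related record's Id
    joined_in_defer_step: bool         # Defer step joins to fetch the related record's Id


def describe_field_behavior(object_field_enabled: bool,
                            lookup_enabled: bool,
                            deferred: bool = False) -> FieldBehavior:
    """Map one (object_field, lookup) combination to the behavior described above."""
    return FieldBehavior(
        copied_to_staging_to_upload=object_field_enabled,
        lookup_traversed=lookup_enabled,
        # The raw value is loaded directly only when no lookup join replaces it.
        in_main_field_list=object_field_enabled and not lookup_enabled,
        joined_in_main_step=lookup_enabled and not deferred,
        joined_in_defer_step=lookup_enabled and deferred,
    )


def is_discouraged(object_field_enabled: bool, lookup_enabled: bool) -> bool:
    """Configuration 2: the column is dropped from staging_to_upload but still joined."""
    return lookup_enabled and not object_field_enabled
```

A check like `is_discouraged` could be run over the config rows before a load to flag configuration 2 early, assuming the rows expose both flags.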
Lookup Recommendations
When configuring object_field and lookup, here are our recommendations on what to enable or disable:
Object_Field
- Disable fields that are updated by Salesforce:
- CreatedById
- CreatedDate
- LastActivityDate
- LastModifiedById
- LastModifiedDate
- LastReferencedDate
- LastViewedDate
- SystemModstamp
- Make sure the object_field.csv field name casing matches that of object-meta.json
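The two Object_Field recommendations can be checked mechanically. The sketch below assumes the config rows are already loaded as dictionaries with hypothetical `field` and `enabled` keys, and that the field names from object-meta.json are available as a plain list. The real column names and the JSON layout are not described in these notes, so adjust accordingly.

```python
# Fields maintained by Salesforce itself, as listed above.
SALESFORCE_MANAGED_FIELDS = {
    "CreatedById",
    "CreatedDate",
    "LastActivityDate",
    "LastModifiedById",
    "LastModifiedDate",
    "LastReferencedDate",
    "LastViewedDate",
    "SystemModstamp",
}


def disable_managed_fields(rows):
    """Return copies of the rows with Salesforce-managed fields disabled.

    Each row is a dict with the hypothetical keys 'field' and 'enabled'.
    The input rows are not modified.
    """
    return [
        {**row, "enabled": False} if row["field"] in SALESFORCE_MANAGED_FIELDS else dict(row)
        for row in rows
    ]


def find_casing_mismatches(config_fields, meta_fields):
    """Return (config_name, meta_name) pairs that differ only by letter casing."""
    meta_by_lower = {name.lower(): name for name in meta_fields}
    mismatches = []
    for name in config_fields:
        meta_name = meta_by_lower.get(name.lower())
        if meta_name is not None and meta_name != name:
            mismatches.append((name, meta_name))
    return mismatches
```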
Lookup
- Disable lookups to user and recordtype tables
- In other words, disable lookups to objects where sf_use_source_column = False
- For every lookup that goes to account, set its relation to Parent
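As a rough illustration, the lookup recommendations could be applied to a set of lookup rows like this. The row keys `target_object`, `enabled`, and `relation`, and the `source_column_by_object` mapping, are hypothetical names chosen for this sketch. Objects missing from the mapping are assumed to use a source column, so their lookups are left untouched.

```python
def apply_lookup_recommendations(lookups, source_column_by_object):
    """Return copies of the lookup rows with the recommendations applied.

    lookups: list of dicts with hypothetical keys 'target_object',
        'enabled', and 'relation'.
    source_column_by_object: dict mapping a lowercase object name to its
        sf_use_source_column flag.
    """
    result = []
    for row in lookups:
        updated = dict(row)  # never mutate the caller's rows
        target = updated["target_object"].lower()

        # Lookups to objects with sf_use_source_column = False (for example
        # user and recordtype) already hold the raw value, so no join is needed.
        if source_column_by_object.get(target, True) is False:
            updated["enabled"] = False

        # Every lookup that goes to account uses the Parent relation.
        if target == "account":
            updated["relation"] = "Parent"

        result.append(updated)
    return result
```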