How to deal with the merging of 2 sources?

How to configure the unification of two sources?

Let’s imagine we have 2 product repositories Catalog_1 and Catalog_2, does it make sense to merge them?
πŸ’‘
Are they different products in each catalog or are they the same products with a common identifier (such as product_id)? β†’ According to the answer, the process will be different.
  • Case 1: they are the same products and they have no common attribute.
    • Take as the deduplication key the id common to the 2 sources.
    • Map from source to dataset each attribute that is specific to a single source.
  • Case 2: they are the same products having, sometimes common attributes, sometimes attributes specific to each source. We must then unify the 2 sources by treating them in the manner of multi-source contacts:
    • Take as a dedup key the id common to the 2 sources.
    • Map from source to dataset each attribute that is specific to a single source.
    • Define a strategy for the fields common to the 2 sources:
    • Map either only one source (the simplest, too bad for the other) or establish a hierarchy of sources:
      • Create for each field concerned in the dataset: field_X_source_1 / field_X_source_2 and add a calculated field field_X_master which will take the first non-zero value among the two sources with a rule of prioritization of one of the two sources.
  • Case 3: they are not the same products and they have a common key.
    • Unify the two sources by taking the common key as the deduplication key.
    • Map all the fields of each source.
  • Case 4: they are not the same products and they do not have a common key.
    • Unify the two sources by putting in the dataset a multiple key = id_source_1 OR id_source_2.
    • Create a field calculated at the dataset level to create a product_id_master field based on (prefix1_id_source_1 OR prefix2_source2).
    • Create a product_id_master field in the order items table according to the same principle.