Commit

docs: warnings about risks of using incremental (MERGE) replication method

RuslanBergenov committed Jan 25, 2022
1 parent f3eb383 commit 67ac76c
Showing 2 changed files with 2 additions and 3 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -153,7 +153,7 @@ sample [target-config.json](/sample_config/target-config-exchange-rates-api.json
 * `truncate`: Deleting all previous rows and uploading the new ones to the table
 * `incremental`: **Upserting** new rows into the table, using the **primary key** given by the tap connector
   (if it finds an old row with same key, updates it. Otherwise it inserts the new row)
-    - WARNING: we do not recommend using `incremental` option as it might result in loss of production data. We recommend using `append` option instead which will preserve historical data.
+    - WARNING: We do not recommend using the `incremental` option (which uses a `MERGE` SQL statement). It might result in loss of production data, because historical records get updated. Instead, we recommend using the `append` replication method, which will preserve historical data.

Sample **target-config.json** file:

3 changes: 1 addition & 2 deletions target_bigquery/processhandler.py
@@ -266,8 +266,7 @@ def _do_temp_table_based_load(self, rows):
         incremental_success = False
         if self.incremental:
             self.logger.info(f"Copy {tmp_table_name} to {self.tables[stream]} by INCREMENTAL")
-            #TODO: reword the warning about this replication method
-            self.logger.warning(f"INCREMENTAL replication method might result in data loss because we are editing the production data during the sync operation. We recommend that you use APPEND target-bigquery replication instead.")
+            self.logger.warning(f"INCREMENTAL replication method (MERGE SQL statement) is not recommended. It might result in loss of production data, because historical records get updated during the sync operation. Instead, we recommend using the APPEND replication method, which will preserve historical data.")
             table_id = f"{self.project_id}.{self.dataset.dataset_id}.{self.tables[stream]}"
             try:
                 self.client.get_table(table_id)
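The data-loss risk this commit warns about can be illustrated with a toy Python model of the two replication methods. This is a simplified sketch, not the actual `MERGE` statement target-bigquery issues against BigQuery; the function names and the `"id"` key are illustrative assumptions:

```python
# Toy model: a table is a list of row dicts; "id" stands in for the
# primary key supplied by the tap connector.

def incremental_load(table, new_rows, key="id"):
    """MERGE-like upsert: a new row with an existing key overwrites the
    old row, so the historical value is lost."""
    by_key = {row[key]: row for row in table}
    for row in new_rows:
        by_key[row[key]] = row  # update if key exists, insert otherwise
    return list(by_key.values())

def append_load(table, new_rows):
    """Append: new rows are simply added; every historical row survives."""
    return table + list(new_rows)

history = [{"id": 1, "price": 100}]
update = [{"id": 1, "price": 120}]

print(incremental_load(history, update))  # [{'id': 1, 'price': 120}] -- old price gone
print(append_load(history, update))       # both versions of the row preserved
```

With `incremental_load` the original `price: 100` record is unrecoverable after the sync, which is exactly why the commit steers users toward `append`.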
