Tdl 15459 update bookmarking for transactions #141

namrata270998 · 2022-03-16T05:17:37Z

Description of change

Updated the bookmarking for the transactions stream so that it no longer filters records based on created_at

QA steps

Verify that the tap returns all the transactions for updated orders

Risks

Rollback steps

revert this branch

tap_shopify/__init__.py

savan-chovatiya · 2022-03-24T10:27:04Z

tests/test_bookmarks_updated.py

-                    self.assertGreaterEqual(replication_key_value, simulated_bookmark_value, msg="Second sync records do not respect the previous                                                  bookmark")
-                    # verify the 2nd sync bookmark value is the max replication key value for a given stream
-                    self.assertLessEqual(replication_key_value, second_bookmark_value_utc, msg="Second sync bookmark was set incorrectly, a record with a greater replication key value was synced")
+                # The `transactions` stream is a child of th `orders` stream. Hence the bookmark for transactions is solely dependent on the value of bookmark in 'transaction_orders' which stores the parent record's bookmark.


Suggested change

# The `transactions` stream is a child of th `orders` stream. Hence the bookmark for transactions is solely dependent on the value of bookmark in 'transaction_orders' which stores the parent record's bookmark.

# The `transactions` stream is a child of the `orders` stream. Hence the bookmark for transactions is solely dependent on the value of bookmark in 'transaction_orders' which stores the parent record's bookmark.

savan-chovatiya · 2022-03-24T10:28:27Z

tests/base.py

+            # `transactions` is child stream of `orders` stream which is incremental.
+                # We are writing a separate bookmark for the child stream in which we are storing 
+                # the bookmark based on the parent's replication key.
+                # But, we are not using any fields from the child record for it.
+                # That's why the `transactions` stream does not have replication_key but still it is incremental.


Suggested change

# `transactions` is child stream of `orders` stream which is incremental.
# We are writing a separate bookmark for the child stream in which we are storing
# the bookmark based on the parent's replication key.
# But, we are not using any fields from the child record for it.
# That's why the `transactions` stream does not have replication_key but still it is incremental.

# `transactions` is child stream of `orders` stream which is incremental.
# We are writing a separate bookmark for the child stream in which we are storing
# the bookmark based on the parent's replication key.
# But, we are not using any fields from the child record for it.
# That's why the `transactions` stream does not have replication_key but still it is incremental.

Also, format this comment at every place in the code.
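
The comment above describes a child stream whose bookmark is derived only from the parent's replication key. A minimal sketch of that idea follows, assuming in-memory sample data instead of the Shopify API. The names `ORDERS`, `TRANSACTIONS` and `sync_transactions` are hypothetical and are not taken from tap-shopify; only the bookmark key `transaction_orders` appears in this PR.

```python
from datetime import datetime

# Hypothetical sample data; a real tap would fetch these from the API.
ORDERS = [
    {"id": 1, "updated_at": "2022-03-01T00:00:00+00:00"},
    {"id": 2, "updated_at": "2022-03-10T00:00:00+00:00"},
]
TRANSACTIONS = {
    1: [{"id": 101, "order_id": 1, "created_at": "2022-02-01T00:00:00+00:00"}],
    2: [
        {"id": 201, "order_id": 2, "created_at": "2022-01-15T00:00:00+00:00"},
        {"id": 202, "order_id": 2, "created_at": "2022-03-10T00:00:00+00:00"},
    ],
}


def parse(timestamp):
    """Parse an ISO 8601 timestamp with an explicit UTC offset."""
    return datetime.fromisoformat(timestamp)


def sync_transactions(state, bookmark_key="transaction_orders"):
    """Return (records, new_state) for the child stream.

    Parents are selected by their own replication key (updated_at).
    Every child of a selected parent is emitted, with no filtering on
    the child's created_at, and the bookmark is taken from the parent.
    """
    start = parse(state.get(bookmark_key, "1970-01-01T00:00:00+00:00"))
    records = []
    max_parent_value = start
    for order in ORDERS:
        order_updated = parse(order["updated_at"])
        if order_updated < start:
            continue  # parent not updated since the bookmark
        records.extend(TRANSACTIONS.get(order["id"], []))
        max_parent_value = max(max_parent_value, order_updated)
    new_state = dict(state)
    new_state[bookmark_key] = max_parent_value.isoformat()
    return records, new_state
```

In this sketch, transaction 201 is still returned for an updated order even though its created_at is older than the bookmark, which is the behaviour the QA step asks to verify.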

tap_shopify/streams/transactions.py

KrisPersonal

Let us walk through the code on Monday

RushiT0122

We can generalise the hardcoded tuple ('transactions') used at multiple locations in the tests. I found 9 such instances in total across different test cases.
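
One way to generalise the check is a single shared constant. The sketch below assumes a set named `SKIPPED_STREAMS` (the name appears in a later commit message of this PR; its value and location here are assumptions) and a hypothetical helper `is_skipped`. It also shows a Python detail worth noting: `('transactions')` without a trailing comma is a plain string, so `stream not in ('transactions')` performs a substring test rather than a membership test.

```python
# Assumed shared constant; the real definition in the test suite may differ.
SKIPPED_STREAMS = {"transactions"}


def is_skipped(stream):
    """Return True when the stream should be skipped for an assertion."""
    return stream in SKIPPED_STREAMS


# Parentheses alone do not create a tuple; the trailing comma does.
assert isinstance(('transactions'), str)
assert isinstance(('transactions',), tuple)

# With a string on the right-hand side, `in` is a substring check.
assert 'actions' in ('transactions')
assert 'actions' not in ('transactions',)
```

Using a set keeps the check an exact match and gives the tests one place to update when the list of skipped streams changes.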

tests/base.py

kspeer825 · 2022-04-20T14:20:24Z

tests/test_start_date.py

-                    if target_value:
+                if target_value and stream not in ('transactions'):


Looks like the indentation level is off here.

Fixed the indentation

kspeer825 · 2022-04-21T14:49:03Z

tests/test_bookmarks.py

        incremental_streams = {key for key, value in self.expected_replication_method().items()
-                               if value == self.INCREMENTAL and key in testable_streams}
+                               if value == self.INCREMENTAL and key in testable_streams and key not in ('transactions')}

        # Our test data sets for Shopify do not have any abandoned_checkouts


The abandoned_checkouts stream should remain under test. It is incorrectly removed on crest master. See base.py init

The abandoned_checkouts stream is added back in another PR, in this commit, which is not merged yet.

kspeer825 · 2022-04-21T14:49:50Z

tests/test_bookmarks_updated.py

+                        # verify the 2nd sync replication key value is greater or equal to the 1st sync bookmarks
+                        self.assertGreaterEqual(replication_key_value, simulated_bookmark_value, msg="Second sync records do not respect the previous                                                  bookmark")
+                        # verify the 2nd sync bookmark value is the max replication key value for a given stream
+                        self.assertLessEqual(replication_key_value, second_bookmark_value_utc, msg="Second sync bookmark was set incorrectly, a record with a greater replication key value was synced")

                # verify that we get less data in the 2nd sync
                # collects has all the records with the same value of replication key, so we are removing from this assertion


I think the manipulated state should be altered so this does not happen.

Here, for the transactions stream, we had 2 bookmarks earlier, i.e. transactions and transaction_orders (which stores the bookmark of the parent, i.e. the orders bookmark). However, as this card suggested removing the filtering of the transactions based on the transactions bookmark, we have now removed the transaction_orders completely. Hence this assertion, which checks the replication key value against the bookmark value, would now actually check the transaction_orders bookmark value against the replication key value of the transactions. Thus, we have skipped this assertion for the stream.

Ah, I was not clear: I meant the collections stream, not transactions. Can more data be generated, and the state injected for that stream be moved so that not less records are coming through on the second sync?

We have generated more data for the collections stream and updated the simulated state. Hence we don't have to skip it anymore!
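
The exchange above is about choosing a simulated state that makes the second sync meaningfully smaller than the first. A rough sketch of one way to pick such a value follows, assuming the tap re-syncs records whose replication key is greater than or equal to the bookmark. The helper name `pick_simulated_bookmark` and the sample records are hypothetical and not part of the test suite.

```python
from datetime import datetime


def parse(timestamp):
    """Parse an ISO 8601 timestamp with an explicit UTC offset."""
    return datetime.fromisoformat(timestamp)


def pick_simulated_bookmark(first_sync_records, replication_key):
    """Pick a bookmark strictly above the smallest replication key value.

    Returns None when every record shares one value, because no bookmark
    can then produce a smaller, non-empty second sync under a >= filter.
    """
    values = sorted({parse(r[replication_key]) for r in first_sync_records})
    if len(values) < 2:
        return None
    return values[len(values) // 2].isoformat()


# Hypothetical first-sync data with three distinct replication key values.
records = [
    {"id": 1, "updated_at": "2022-01-01T00:00:00+00:00"},
    {"id": 2, "updated_at": "2022-02-01T00:00:00+00:00"},
    {"id": 3, "updated_at": "2022-03-01T00:00:00+00:00"},
]
bookmark = pick_simulated_bookmark(records, "updated_at")
second_sync = [r for r in records if parse(r["updated_at"]) >= parse(bookmark)]
```

When all records share the same replication key value, as the test comment notes for collects, the helper returns None, which signals that more varied data is needed before the "fewer records" assertion can apply.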

kspeer825 · 2022-04-21T14:56:37Z

tests/test_bookmarks_updated.py

                    self.assertLess(second_sync_count, first_sync_count,
                                    msg="Second sync does not have less records, bookmark usage not verified")

                # verify that we get atleast 1 record in the second sync
-                if stream not in ('collects'):
+                if stream not in ('collects', 'transactions'):


If there is no data replicated for these streams on the 2nd sync, then I don't think there is much benefit to having them in this test. They should be marked as not under test and removed from expected_streams. The expected_streams or some variable like that should drive the whole test including table selection, assertions, etc. That way adding/removing streams to the test in the future is just a matter of updating a set or list in this test file. It should be done for each test not in base.py.

My bad, we had data for transactions, hence added back the stream for that particular assertion
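
The suggestion that a single expected_streams set should drive selection and assertions could look roughly like the sketch below. It uses only the standard `unittest` module; the class name, the stream names in `ALL_STREAMS`, the excluded stream and the `run_sync` stand-in are all hypothetical and do not reflect the actual tap-tester base class.

```python
import unittest


class BookmarkTestSketch(unittest.TestCase):
    # Hypothetical stream names, for illustration only.
    ALL_STREAMS = {"orders", "transactions", "collects", "stream_without_data"}
    # Streams excluded from this particular test, declared in the test file.
    STREAMS_NOT_UNDER_TEST = {"stream_without_data"}

    def expected_streams(self):
        """The single set that drives selection and assertions."""
        return self.ALL_STREAMS - self.STREAMS_NOT_UNDER_TEST

    def run_sync(self, streams):
        """Stand-in for selecting tables and running a sync.

        Returns a record count per selected stream.
        """
        return {stream: 1 for stream in streams}

    def test_second_sync_has_records(self):
        streams = self.expected_streams()
        counts = self.run_sync(streams)
        # Only the expected streams were selected and synced.
        self.assertEqual(set(counts), streams)
        for stream in sorted(streams):
            with self.subTest(stream=stream):
                self.assertGreater(counts[stream], 0)
```

With this shape, adding or removing a stream from the test is a one-line change to `STREAMS_NOT_UNDER_TEST`, and no per-stream conditionals are needed inside the assertions.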

namrata270998 added 8 commits March 16, 2022 05:05

updated bookmark strategy for transactions

e0c55a4

updated bookmakr for transactions

df05032

removed unused var

432068e

removed unused import

5a9d583

updated integration tests

ff0a963

rempoved the replication key for transactions

012c578

changed rep key from None to []

c925f2f

removed the rep key for transactions in catalog

3c97292

namrata270998 requested review from dbshah1212 and savan-chovatiya March 22, 2022 13:00

namrata270998 added 3 commits March 22, 2022 13:07

changed back rep key to None

8af075b

test cci run

1814b1c

removed the replication key and method which was written outside the …

0675c47

…metadata

savan-chovatiya suggested changes Mar 24, 2022

dbshah1212 reviewed Mar 24, 2022

tap_shopify/streams/transactions.py (resolved)

namrata270998 added 2 commits March 24, 2022 13:53

resolved comments

eafd58e

resolved comments

5546bfc

namrata270998 requested review from savan-chovatiya and dbshah1212 March 28, 2022 06:23

fixed cci failure

43ee4b1

savan-chovatiya approved these changes Mar 28, 2022

dbshah1212 approved these changes Mar 28, 2022

dbshah1212 requested review from KrisPersonal, kspeer825 and RushiT0122 April 7, 2022 05:40

KrisPersonal reviewed Apr 10, 2022

hpatel41 mentioned this pull request Apr 11, 2022

TDL-17512: Add missing tap-tester tests #134

Open

2 tasks

Merge branch 'crest-work' of https://github.com/singer-io/tap-shopify …

baa88bc

…into TDL-15459-update-bookmarking-for-transactions

namrata270998 requested a review from KrisPersonal April 14, 2022 12:43

RushiT0122 requested changes Apr 18, 2022

tests/base.py (outdated, resolved)

resolved comments

910567f

namrata270998 requested a review from RushiT0122 April 20, 2022 13:33

used SKIPPED_STREAMS for transactions in tap-tester

c989cfb

kspeer825 suggested changes Apr 21, 2022

namrata270998 added 2 commits April 21, 2022 15:46

fixed indentation

792562e

added an assertion for transactions

3f82903

namrata270998 requested a review from kspeer825 April 22, 2022 12:34

namrata270998 added 2 commits April 29, 2022 12:32

included the collects stream for tests

037ee46

updated simulated state for collects stream

515640b

kspeer825 approved these changes May 6, 2022

RushiT0122 approved these changes May 17, 2022
