Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

source-mssql tests back to green! #35616

Closed

Conversation

stephane-airbyte
Copy link
Contributor

@stephane-airbyte stephane-airbyte commented Feb 25, 2024

We are getting CI back to green. There's several aspects to this

  1. some tests are failing on CI and localhost. We're disabling them for now (SshKeyMssqlSourceAcceptanceTest.java,
    SshPasswordMssqlSourceAcceptanceTest.java, and one part of AbstractMssqlSourceDatatypeTest.java)

  2. some tests are only failing on CI. We attempt to fix those:
    a) some tests are failing to enable CDC for tables. This is due to a timing issue. We moved that logic into its own function that will try for a total of 5 minutes before giving up.
    b) some tests are failing when trying to read the minLsn. There is a 1sec wait implemented in the production code. Instead we introduce a busy loop that will wait for a total of 5 minutes for records to appear in the CDC table before giving up. That function is implemented in test code.

  3. some tests are still failing in CI. We're disabling them for now ()

In the process of fixing (2), I noticed that there's a rare case that would cause a sync to fail if no records were created in the source DB after enablement of CDC. This is fixed by using Lsn.NULL instead of throwing an exception

Copy link

vercel bot commented Feb 25, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 27, 2024 5:09am

Copy link
Contributor Author

stephane-airbyte commented Feb 25, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @stephane-airbyte and the rest of your teammates on Graphite Graphite

@stephane-airbyte stephane-airbyte force-pushed the stephane/02-21-remove_all_display_names branch from 6d8c75d to 6af37a0 Compare February 26, 2024 23:40
@stephane-airbyte stephane-airbyte force-pushed the stephane/02-24-move_source-mssql_to_latest_CDK branch 2 times, most recently from e45df64 to 029a4d3 Compare February 26, 2024 23:42
@stephane-airbyte stephane-airbyte force-pushed the stephane/02-21-remove_all_display_names branch from 6af37a0 to 137374f Compare February 27, 2024 00:00
@stephane-airbyte stephane-airbyte force-pushed the stephane/02-24-move_source-mssql_to_latest_CDK branch 2 times, most recently from bbe3176 to 0f65fb8 Compare February 27, 2024 00:06
Base automatically changed from stephane/02-21-remove_all_display_names to master February 27, 2024 00:08
@stephane-airbyte stephane-airbyte force-pushed the stephane/02-24-move_source-mssql_to_latest_CDK branch 2 times, most recently from dd7b58e to 90275fb Compare February 27, 2024 00:30
@octavia-squidington-iii octavia-squidington-iii added the CDK Connector Development Kit label Feb 27, 2024
@stephane-airbyte stephane-airbyte changed the title move source-mssql to latest CDK source-mssql tests back to green! Feb 27, 2024
@stephane-airbyte stephane-airbyte force-pushed the stephane/02-24-move_source-mssql_to_latest_CDK branch from 9792da7 to 0dca925 Compare February 27, 2024 05:05
@octavia-squidington-iii octavia-squidington-iii removed the CDK Connector Development Kit label Feb 27, 2024
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done. Marking as "needs changes" mostly because of the commented code block.

I don't mind disabling the test cases and merging this change as long as you either know how to re-enable them or you intend to find out immediately.

* .airbyteType(JsonSchemaType.NUMBER) .addInsertValues("'123'", "'1234567890.1234567'", "null")
* .addExpectedValues("123.0", "1.23456794E9", null) .createTablePatternSql(CREATE_TABLE_SQL)
* .build());
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detritus

@@ -46,7 +46,7 @@ public AirbyteMessage saveState(final Map<String, String> offset, final SchemaHi

final JsonNode asJson = Jsons.jsonNode(state);

LOGGER.info("debezium state: {}", asJson);
LOGGER.info("debezium state offset: {}", Jsons.jsonNode(offset));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a debug log? Just curious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure. I just know we never want to print the full debezium state. It can be several megs, and might contain PII

}
LOGGER.info("identified target lsn: " + maxLsn);
return new MssqlCdcTargetPosition(maxLsn);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there situations in our tests where the maxLsn is legitimately NULL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not that I know off. And not by design

@@ -178,4 +173,17 @@ private List<AirbyteStateMessage> filterStateMessages(final List<AirbyteMessage>
.collect(Collectors.toList());
}

@Test
@Disabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a string explaining why it's disabled, other this might more easily end up being disabled for ever.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment applies below.

@Disabled
public void testUpdate() throws Exception {

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test cases that you're disabling in this block are actually super meaningful tests. I'm not comfortable with declaring victory unless you have a downstream PR ready in your stack that re-enables them meaningfully. It's OK to disable them to get the CI back to green but just to be clear let's not kid ourselves that we're done here.

}
}
throw new RuntimeException("Couldn't enable CDC for table %s.%s".formatted(schemaName, tableName));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.
Do we want to only do this waiting at most once per container, perhaps? Any time after the first call the stored procedure should just work on the first try.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question. As far as I can tell, once the procedure worked for a DB, it'll work for the next one. One caveat is that the same procedure can't be run twice at the same time for the same DB, hence the synchronized keyword

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with just leaving it up to a global timeout to handle this edge case. I noticed the synchronized, nicely done!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some more testing, and I don't think it's always working after the 1st table of a container has been initialized. I really think it's at least once per DB...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/mssql
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants