-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support multi-row inserts #107
Conversation
Bump.: ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for the PR.
I had a quick initial look and left some comments.
I wasn't comfortable adding multi-row support to GenericDatabaseDialect, since I wasn't able to actually confirm that all databases support it.
What would happen if the multi-row support was added to all dialects and some DB didn't support it? The inserts would fail and things would crash?
Would it make sense to include the support "globally" and then it would be a "user error" if his DB didn't support it?
It seems that e.g. MySQL supports multi-row insert. I wouldn't like to take the route of not supporting it for a DB that has the feature, e.g. MySQL.
So, either we should double check which ones support it and add support for those or just add the support globally and it wouldn't just work if the underlying DB didn't support it.
build.gradle
Outdated
@@ -117,6 +118,7 @@ dependencies { | |||
|
|||
implementation "com.google.guava:guava:27.1-jre" | |||
implementation "org.slf4j:slf4j-api:$slf4jVersion" | |||
implementation "com.amazon.redshift:redshift-jdbc42:2.1.0.1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really needed in implementation
-scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might have even escaped from another PR I plan to open (a Redshift dialect). I think I can remove it completely.
src/test/resources/log4j.properties
Outdated
@@ -22,4 +23,4 @@ log4j.appender.stdout.layout=org.apache.log4j.PatternLayout | |||
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c:%L)%n | |||
|
|||
log4j.logger.org.apache.kafka=ERROR | |||
log4j.logger.io.aiven.connect=ERROR | |||
log4j.logger.io.aiven.connect=DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this? I think if you face errors while developing, you can change the log level. I'd revert the change in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. 👍
@@ -1,6 +1,7 @@ | |||
/* | |||
* Copyright 2020 Aiven Oy | |||
* Copyright 2018 Confluent Inc. | |||
* Copyright 2021 Meroxa, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong year. Applies to all files where this line appears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good catch. Written last year.: )
Hey, I saw your comments, but apparently no new commits? What about the more important question:
|
Hey Juha! Sorry for the lack of activity here, unfortunately I wasn't able to get to it. However, I think today/tomorrow I'll be done with the comments here. The reason why I believed it might make sense for this to be turned off by default is because I simply wasn't sure what would make sense. Now that you mentioned MySQL (I simply forgot checking it), I do believe it should be the default. Edit: I've just checked this: https://insights.stackoverflow.com/survey/2021#most-popular-technologies-database-prof. The top 4 mentioned here do support multi-row inserts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some very minor comments.
Is the idea to git rid of MultiInsertDialect
, move the code to GenericDatabaseDialect
and make all dbs support multi insert?
break; | ||
default: | ||
throw new AssertionError(); | ||
|
||
} | ||
statement.addBatch(); | ||
// in a multi-row insert, all records are a single item in the batch | ||
if (!(insertMode == MULTI)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps?
if (insertMode != MULTI) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup!
import static org.mockito.Mockito.when; | ||
|
||
public class BufferedRecordsTest { | ||
|
||
private final SqliteHelper sqliteHelper = new SqliteHelper(getClass().getSimpleName()); | ||
private String dbUrl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be final
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, now that you ask, I think yes. I put it into setUp(), since I thought sqliteHelper.sqliteUri()
can be called only after sqliteHelper
is actually set up, but that doesn't appear to be the case.
connection); | ||
|
||
final Schema schema = newPlanetSchema(); | ||
for (int i = 1; i <= 5; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea to create five rows, but since the batch size is two, there is not a flush for the last row?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. Basically, a test for the flushing logic too. I'll add a comment, since it looks like it's not that obvious.: )
That's correct. The multi-insert support is part of GenericDatabaseDialect, but I forgot deleting MultiInsertDialect. I'll do that now and address the other feedback you left. Thanks a lot! |
src/main/java/io/aiven/connect/jdbc/dialect/DatabaseDialect.java
Outdated
Show resolved
Hide resolved
OK, so for some my IntelliJ doesn't show Checkstyle errors anymore, and that's why I missed those above. I'll fix soon. |
@akudiyar @juha-aiven I force-pushed the branch, to have the commits signed. When you have some time, it'd be great if you could check this PR again.: ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some minor comments, let's address these and this PR should be ready.
any idea on when redshift dialect would be supported? |
Hello, Aiven team!
This is in relation to #40 (and, originally #34). It takes the spike in #39 from @ahmeroxa as a basis.
Multi-row insert is implemented in a separate dialect and added to only Pg and SQLite. I wasn't comfortable adding multi-row support to GenericDatabaseDialect, since I wasn't able to actually confirm that all databases support it. I found that for example, PostgreSQL didn't support this until 8.2 But if we can, for example, assume that it's only the minority of DBs (and versions) which do not support it, then that could be a reason for putting multi-row inserts into GenericDatabaseDialect.
Best wishes,
Haris
Update: Since it looks like the most used DBs (MySQL, Pg, SQL Server) all support multi-row inserts, the default for the dialects is to support multi-row inserts.