Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple queries from AbstractTextTableWriter? #71

Closed
vforchi opened this issue Dec 18, 2024 · 2 comments
Closed

Multiple queries from AbstractTextTableWriter? #71

vforchi opened this issue Dec 18, 2024 · 2 comments

Comments

@vforchi
Copy link

vforchi commented Dec 18, 2024

I am trying to upgrade from stil 4.0 to 4.3, and my tests failed when trying to access a ResultSet that was already closed in a custom StarTable I wrote.

I digged a little bit and I found that uk.ac.starlink.table.formats.AbstractTextTableWriter#writeStarTable invokes getRowSequence() twice, and uk.ac.starlink.table.jdbc.JDBCStarTable#getRowSequence (haven't checked the others) creates a DB connection and runs a query everytime it's invoked.

This looks very inefficient: is it possible to read the data once in writeStarTable? I could change my implemementation to do something similar to what the JDBCStarTable does, but it doesn't look like the best solution.

@mbtaylor
Copy link
Member

Some writers need to make multiple passes through the table row stream to do the write, others can get away with a single pass. The FITS writer for instance needs to know the number of rows up front. AbstractTextTableWriter does it so it knows how many characters are needed for each column so it can line the columns up when it writes them (it doesn't necessarily read all the rows to do that, you can configure how many with the setSampledRows method). For some StarTable implementations performing multiple iterations through the row stream is not particularly expensive (especially if you only take the first few rows on some passes), but for others, like JDBCStarTable, it is. The writer implementations don't know about the performance characteristics of the StarTable instances they are asked to output.

If you write the JDBCStarTable to an output handler that doesn't need multiple passes (e.g. VOTable, CSV) it will only execute the SQL once. But if you want to support output handlers that do/might need multiple passes, then you can stream the ResultSet-based table into some temporary StarTable instance that's cheap to iterate over, and then feed that to the StarTableWriter.

Machinery is provided for this kind of stream-to-cache operation. The most straighforward way to do it is to call uk.ac.starlink.table.Tables.randomTable on the JDBC table. If you want a bit more control you can use a uk.ac.starlink.table.StoragePolicy instance directly and call copyTable on it. If you want to go straight from a ResultSet to a StarTable in this way, with an assurance that you know when the SQL is actually being executed, you can use uk.ac.starlink.table.jdbc.SequentialResultSetStarTable, which is a hobbled StarTable implementation, but it's adequate for feeding to a StoragePolicy.

For instance:

import java.io.IOException;
import java.io.OutputStream;
import java.sql.ResultSet;
import java.sql.SQLException;
import uk.ac.starlink.table.StarTable;
import uk.ac.starlink.table.StarTableOutput;
import uk.ac.starlink.table.StarTableWriter;
import uk.ac.starlink.table.Tables;
import uk.ac.starlink.table.jdbc.SequentialResultSetStarTable;

public class Write {

    public static void writeResultSet(ResultSet rset, StarTableWriter writer, OutputStream out)
            throws IOException, SQLException {
        StarTable oneShotTable = new SequentialResultSetStarTable(rset);
        StarTable multiShotTable = Tables.randomTable(oneShotTable);
        new StarTableOutput().writeStarTable(multiShotTable, out, writer);
    }
}

@vforchi
Copy link
Author

vforchi commented Dec 18, 2024

Thanks, randomTable seems to have fixed the issue.

@vforchi vforchi closed this as completed Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants