It appears that #9739 added RESOURCE_EXHAUSTED to the list of status codes to retry, but it fails to distinguish the RESOURCE_EXHAUSTED that gRPC generates locally when a request payload is larger than the configured limit from a RESOURCE_EXHAUSTED returned by the Spanner backend.
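For context on what "distinguish" could mean in practice, here is a minimal sketch (not the library's actual retry predicate; isClientSideMessageTooLarge is a made-up name, and matching on grpc-go's message text is a fragile heuristic tied to an implementation detail) of telling the locally generated error apart from a server-side one:

package sizecheck

import (
	"strings"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// isClientSideMessageTooLarge reports whether err looks like grpc-go's
// locally generated "message larger than max" RESOURCE_EXHAUSTED rather
// than one returned by the server. The string match is illustrative
// only and may not survive grpc-go version changes.
func isClientSideMessageTooLarge(err error) bool {
	st, ok := status.FromError(err)
	if !ok || st.Code() != codes.ResourceExhausted {
		return false
	}
	return strings.Contains(st.Message(), "trying to send message larger than max")
}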
Client
spanner v1.60.0+
Environment
any, but Linux 6.1.0 (Ubuntu 23.10)
Code and Dependencies
Creating a KeySet that's too large for a single gRPC request.
Here's an example main package that exits immediately with an error with v1.59.0 and hangs indefinitely (due to internal retries) with v1.60.0+.
package main

import (
	"context"
	"fmt"
	"os"
	"strconv"

	"cloud.google.com/go/spanner"
	"cloud.google.com/go/spanner/spannertest"
	"cloud.google.com/go/spanner/spansql"
	"google.golang.org/api/option"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func run() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	srv, testSpanErr := spannertest.NewServer("localhost:0")
	if testSpanErr != nil {
		return fmt.Errorf("failed to create new test spanner: %w", testSpanErr)
	}

	const tblName = "fizzlebat"
	const keyColumn = "foobar"
	const dataColumn = "foolbit"
	if createErr := srv.UpdateDDL(&spansql.DDL{List: []spansql.DDLStmt{
		&spansql.CreateTable{
			Name: tblName,
			Columns: []spansql.ColumnDef{
				{
					Name:    keyColumn,
					NotNull: true,
					Type: spansql.Type{
						Array: false,
						Len:   spansql.MaxLen,
						Base:  spansql.String,
					},
				},
				{
					Name:    dataColumn,
					NotNull: true,
					Type: spansql.Type{
						Array: false,
						Len:   spansql.MaxLen,
						Base:  spansql.String,
					},
				},
			},
			PrimaryKey: []spansql.KeyPart{{Column: keyColumn, Desc: false}},
		}}}); createErr != nil {
		panic(fmt.Errorf("failed to create table %q: %s", tblName, createErr))
	}
	defer srv.Close()

	conn, cErr := grpc.NewClient(srv.Addr, grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultCallOptions(grpc.MaxCallSendMsgSize(1_000_000)))
	if cErr != nil {
		return fmt.Errorf("failed to dial test spanner: %w", cErr)
	}

	const dbName = "projects/vimeo-starlord-dev-inmem/instances/inmem/databases/foobar"
	client, clientErr := spanner.NewClient(ctx, dbName, option.WithGRPCConn(conn))
	if clientErr != nil {
		return fmt.Errorf("failed to init spanner client: %w", clientErr)
	}
	// Make sure everything gets cleaned up from the client
	defer client.Close()

	// start counting at 100B so we get 11 digits (in decimal)
	// use sequential IDs because spantest is naively implemented and keeps
	// a map with the keys->data plus a separate sorted slice with primary
	// keys. Since it sorts after every key insertion this becomes
	// O((n logn)^2) (and possibly worse) for random key insertions, while
	// if we use sequential keys, it doesn't need to re-sort (just verify
	// that the slice is sorted, which is O(n)), making this just O(n^2)
	// overall.
	const lowID = 100_000_000_000
	dbids := [100_000]spanner.Key{}
	for z := range dbids {
		dbids[z] = spanner.Key{strconv.FormatInt(int64(z)+lowID, 10)}
	}
	ks := spanner.KeySetFromKeys(dbids[:]...)

	iter := client.Single().Read(ctx, tblName, ks, []string{keyColumn, dataColumn})
	if iterErr := iter.Do(func(*spanner.Row) error { return nil }); iterErr != nil {
		return fmt.Errorf("failed to iterate: %w", iterErr)
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintf(os.Stderr, "failure: %s\n", err)
		os.Exit(1)
	}
}
Expected behavior
The call should fail with some wrapping of the following error (preferably with status RESOURCE_EXHAUSTED):
"trying to send message larger than max (1710063 vs. 1000000)"
In this case, the error I get with v1.59.0 is:
failure: failed to iterate: spanner: code = "ResourceExhausted", desc = "trying to send message larger than max (1800054 vs. 1000000)"
Actual behavior
The client gets into a retry loop and stalls.
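As a stopgap (not a fix), bounding the call keeps the retry loop from stalling forever. A sketch that slots into the repro above (readWithDeadline is a hypothetical helper; it additionally needs the "time" import, and the 30-second budget is an arbitrary choice):

// readWithDeadline bounds the Read with a deadline so the internal
// retry loop cannot stall indefinitely.
func readWithDeadline(ctx context.Context, client *spanner.Client, table string, ks spanner.KeySet, cols []string) error {
	ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
	defer cancel()
	iter := client.Single().Read(ctx, table, ks, cols)
	// Under v1.60.0+ this surfaces an error once the budget is spent
	// (typically wrapping the deadline expiry) instead of hanging.
	return iter.Do(func(*spanner.Row) error { return nil })
}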
Additional context
It appears that this is a regression in v1.60.0.
Note: we discovered this in our tests for code that handles this case, and have already rearranged the code to do proactive checking of the keyset size rather than optimistically making the request first and only checking the size on error. It's more important to us that the fix be robust than fast. (I realize that it may take some time to side-channel additional info from the spanner frontend)
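For anyone wanting to do a similar proactive check, a minimal sketch might look like the following (splitKeys and the budget value are illustrative choices, not our actual implementation; it approximates request size by summing key-string lengths and ignores per-key proto framing, which is why the budget should sit well below the 1 MB send limit):

// splitKeys partitions string keys into batches whose approximate
// encoded size stays under budget, so each Read fits in one gRPC
// request. A conservative budget (e.g. 900_000 bytes against a 1 MB
// send limit) compensates for the ignored framing overhead.
func splitKeys(keys []spanner.Key, budget int) [][]spanner.Key {
	var batches [][]spanner.Key
	var cur []spanner.Key
	size := 0
	for _, k := range keys {
		klen := 0
		for _, part := range k {
			if s, ok := part.(string); ok {
				klen += len(s)
			}
		}
		if len(cur) > 0 && size+klen > budget {
			batches = append(batches, cur)
			cur, size = nil, 0
		}
		cur = append(cur, k)
		size += klen
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

Each batch then goes through spanner.KeySetFromKeys and its own Read call.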