Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go api: use byte slices for deck parsing - reuse deck parsing results for subsequent reqs #1

Conversation

benw10-1
Copy link

As listed in this pull request, the optimization can be taken a bit further.

Added deck struct to manage parsing logic.
Converted string parsing functions into their byte counterparts.
Added functionality for reusing parsed deck data between reqs.

Added tests for the bytes deck parsing logic (left the old for now, but can remove have just been using for comparisons for now).
Added tests for the deck API handler (bumped up coverage by a bit)

Do not have as good of a computer for the benchmarks, but here is the comparison from my machine:

goos: windows
goarch: amd64
pkg: github.com/MaT1g3R/slaytherelics/api
cpu: AMD Ryzen 7 3800XT 8-Core Processor
                              │   old.out    │               new.out                │
                              │    sec/op    │    sec/op     vs base                │
DecompressDeck/Simple_deck-16   1.947µ ± ∞ ¹   2.224µ ± ∞ ¹       ~ (p=1.000 n=1) ²
DecompressDeck/Big_deck-16      74.90µ ± ∞ ¹   59.58µ ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                         12.08µ         11.51µ        -4.68%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                              │    old.out    │                new.out                 │
                              │     B/op      │     B/op       vs base                 │
DecompressDeck/Simple_deck-16     841.0 ± ∞ ¹     554.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
DecompressDeck/Big_deck-16      220.5Ki ± ∞ ¹   116.4Ki ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                         13.46Ki         7.935Ki        -41.04%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                              │   old.out   │               new.out                │
                              │  allocs/op  │  allocs/op   vs base                 │
DecompressDeck/Simple_deck-16   17.00 ± ∞ ¹   15.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
DecompressDeck/Big_deck-16      165.0 ± ∞ ¹   112.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                         52.96         40.99        -22.61%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

Unsure of why the small deck is faster as of writing this, but the main change I wanted to implement was reuse of the request results.

Seems like even in the "large deck" cases we end up not storing that much in the decompressed buffer, was <2Kb for the large case per-item.

Copy link
Owner

@zdylag zdylag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing all this! Really exciting stuff! Mostly left some opinionated formatting thoughts for the WIP.


// ensures work is only done once even when racing for deck parse. Once its loaded, will be checked
// using atomic.LoadUint32 instead of a mutex lock
parseOnce sync.Once
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope: shame we aren't on go1.21, its the next version up and would give us https://pkg.go.dev/sync#OnceValue

Given the work-in-progress part of this PR and our total control over the repo I wonder if it's worth a quick bump.


// region deck parsing unstable

// deck designed to be parsed once and then used for lookups. The load of the parsing is in the request context as to
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit throughout: Go comments (and especially GoDoc comments) should be complete sentences ending in punctuation https://go.dev/wiki/CodeReviewComments#comment-sentences.

// not block the sub
type deck struct {
// raw data not parsed, after parsing, raw data buf is freed and replaced by ready-to-use result
buf []byte
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit throughout: GoDoc comments should begin with the name of the thing being described https://go.dev/wiki/CodeReviewComments#comment-sentences.

return err
}

parts := bytes.Split(decompressed, []byte(";;;"))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for throughout: is it safe to assume we never have unicode things to worry about? In general I believe this is unsafe, but I haven't thought too deeply about if these particular byte sequences we use around this file will be safe (they very well might be!).

}

slices.SortFunc(keys, func(i, j string) bool {
// may allocate some extra space in some cases, but we will shrink after we are done formatting
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing it but where is this shrink? (I don't think there's any benefit to shrinking here probably, but just making sure I'm following).


for i := len(compressionDict) - 1; i >= 0; i-- {
word := compressionDict[i]
// TODO: this is the source of lots of allocs and CPU cycles, probably no need for regexp here
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was the most naive 1-1 move haha. You inspired me to see how the naive string replacer would do and I put that up here: Spireblight#20

return text, nil
}

type delimCB func(val []byte) error
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: for type names especially could we type out fuller names? I think this stands for callback, but it took me a bit to guess.

}

// no guarentee of not copying here in map lookup, so unsafe it out. Is valid as long as `decompressed` is still valid and static
cardStr := unsafe.String(&card[0], len(card))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe is pretty spooky. For a single string copy is this measurably important?

}

// Bytes use to get parsed deck
func (d *deck) Bytes() (res []byte, err error) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it is fairly obvious when the expensive part of this happens: is it worth doing something (maybe even just renaming this) to make it clear this is the expensive call?

@benw10-1 benw10-1 closed this Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants