Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NDT5 download a.minRTT is suspect #1094

Open
mattmathis opened this issue Jun 12, 2022 · 8 comments
Open

NDT5 download a.minRTT is suspect #1094

mattmathis opened this issue Jun 12, 2022 · 8 comments
Assignees

Comments

@mattmathis
Copy link
Contributor

Dave Clark reported that a.minRTT does not agree with other sources of RTT data. In particular _internal202205.raw.S2C.TCPInfo.MinRTT is often smaller.

# MinRTT test query

SELECT 
  a.MinRTT,
  _internal202205.raw.S2C.TCPInfo.MinRTT / 1000.0 AS TiMinRTT,
FROM `mlab-collaboration.mm_preproduction.extended_ndt5_downloads`
WHERE date = '2022-06-1'
LIMIT 1000

I did not check other tables.

Note that #945 applies

@autolabel autolabel bot added the review/triage Team should review and assign priority label Jun 12, 2022
@stephen-soltesz
Copy link
Contributor

There are indeed multiple RTT values. The second one below is what's being used in the a.MinRTT calculation: https://github.com/m-lab/etl/blob/master/parser/ndt5_result.go#L171

I'm puzzled why there is no fraction from these..

SELECT 
  a.MinRTT,
  _internal202205.raw.S2C.TCPInfo.MinRTT / 1000.0 AS TiMinRTT,
  _internal202205.raw.S2C.MinRTT / 1000.0 / 1000.0 AS TiMinRTT2,

FROM `mlab-collaboration.mm_preproduction.extended_ndt5_downloads`
WHERE date = '2022-06-01'
LIMIT 1000

@stephen-soltesz
Copy link
Contributor

Looks like raw.S2C.MinRTT is the synthetic "minrtt" calculated and sent to the client to emulate the original web100 minrtt behavior...

https://github.com/m-lab/ndt-server/blob/3fc9f3d218c98e919798f58738734eff67b4aed3/ndt5/web100/web100_linux.go#L20-L35

@stephen-soltesz
Copy link
Contributor

Hmm, was S2C.TCPInfo.MinRTT always there? it may not have been in earlier versions of ndt5...

@stephen-soltesz
Copy link
Contributor

stephen-soltesz commented Jun 13, 2022

So, rather than using the synthetic minrtt unconditionally, we could:

  • use TCPInfo when available (true for recent data)
  • use synthetic minRTT when TCPInfo is not available (true for earlier data), pending a gardener JOIN fix to pull in tcpinfo into ndt5 rows.

Or, only use TCPInfo when available, and mark a.MinRTT undefined otherwise until it can be fixed in the join.

@mattmathis
Copy link
Contributor Author

The kernel has similar logic: if BBR is present the kernel reports its value for MirRTT.

@stephen-soltesz
Copy link
Contributor

This morning we agreed that:

  • I would fix the parser to use TCPInfo when present.
  • you agreed to fix the unified views to use the joined TCPInfo unconditionally (until the gardener join could materialize the same).

@stephen-soltesz
Copy link
Contributor

stephen-soltesz commented Nov 28, 2023

The summary MinRTT was calculated incorrectly from microseconds to seconds (rather than milliseconds). This is likely due to confusion from the mixture of sources and types in the NDT5 result structure, including:

  • the synthetic MinRTT, natively recorded as time.Duration and loaded to BigQuery as an INTEGER in nanoseconds.
  • the raw TCPInfo metrics, natively uint32 as microseconds and in BigQuery as an INTEGER in microseconds.
  • the summary record, converted from either the synthetic MinRTT or raw TCPinfo metric to uint32 value.

This bug affects all parsed data from June 18 2020 to present until the fix is deployed and the historical data reprocessed.

SELECT
  a.MinRTT,
  raw.S2C.MinRTT,
  raw.S2C.TCPInfo.MinRTT
FROM `mlab-sandbox.ndt.ndt5`
WHERE date = "2023-11-20" and raw.S2C is not null
LIMIT 1000
Row MinRTT (s not ms) MinRTT_1 (ns) MinRTT_2 (us)
1 0.041616 41000000 41617
2 0.041675 41000000 41675
3 0.041571 41000000 41571
4 0.041344 41000000 41344
5 0.041371 41000000 41371
6 ...

@stephen-soltesz
Copy link
Contributor

stephen-soltesz commented Nov 28, 2023

After #1129 is merged:

@stephen-soltesz stephen-soltesz self-assigned this Nov 28, 2023
@stephen-soltesz stephen-soltesz removed the review/triage Team should review and assign priority label Nov 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants