Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Daily count stats bug #2244

Open
srallen opened this issue Jun 14, 2021 · 19 comments · Fixed by #2359 or #6399
Open

Daily count stats bug #2244

srallen opened this issue Jun 14, 2021 · 19 comments · Fixed by #2359 or #6399
Labels
bug Something isn't working

Comments

@srallen
Copy link
Contributor

srallen commented Jun 14, 2021

Package app-project

Describe the bug

A clear and concise description of what the bug is.

Found while doing follow up testing for #2238. I was testing to make sure I didn't introduce a regression and found a bug that is pre-existing on production. I'm wondering if this is a bug having to do with the localization happening in DailyClassificationsChartContainer here:

const stats = thisWeek.map(stat => {
const day = new Date(stat.period)
const locale = counterpart.getLocale()
const count = (day.getDay() === TODAY.getDay()) ? counts.today : stat.count
const longLabel = day.toLocaleDateString(locale, { weekday: 'long' })
const alt = `${longLabel}: ${count}`
const label = day.toLocaleDateString(locale, { weekday: 'narrow' })
return Object.assign({}, stat, { alt, count, label, longLabel })
})

Anyway, I made a classification on Monday and the bar chart shows I have one on Sunday and Monday even though I didn't actually on Sunday:

Screen Shot 2021-06-14 at 1 41 57 PM

What it's rendering:

Screen Shot 2021-06-14 at 1 38 07 PM

The original store data is correct:

Screen Shot 2021-06-14 at 1 44 05 PM

And in the response from the stats service:

Screen Shot 2021-06-14 at 1 54 28 PM

So the culprit is likely the above noted function.

To Reproduce

Steps to reproduce the behavior:

  1. Go to https://fe-project.zooniverse.org/projects/blicksam/transcription-task-testing/classify/workflow/13898
  2. Make a classification (on a Monday in a US timezone maybe?)
  3. Check out the daily classifications bar chart
  4. There's a count for Sunday and Monday.

Expected behavior

A clear and concise description of what you expected to happen.

Only counts render on the days a volunteer has made them. Note the data from the stats service is correct. This is just a rendering bug.

@srallen srallen added the bug Something isn't working label Jun 14, 2021
@srallen
Copy link
Contributor Author

srallen commented Jun 15, 2021

Today, Tuesday, it appears that it is rendering the classification I made on Monday as being made on Sunday:

Screen Shot 2021-06-15 at 1 39 40 PM

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Jun 21, 2021

Weekly stats should start on Monday and end on Sunday, but 2021-06-14T00:00:00Z is a Sunday in Chicago, 6 hours earlier.

describe('weekly classification stats', function () {
it('should be created', function () {
expect(rootStore.user.personalization.stats.thisWeek.length).to.equal(7)
})
it('should start on Monday', function () {
expect(rootStore.user.personalization.stats.thisWeek[0]).to.deep.equal({ count: 12, period: '2019-09-30' })
})
it('should end on Sunday', function () {
expect(rootStore.user.personalization.stats.thisWeek[6]).to.deep.equal({ count: 15, period: '2019-10-06' })
})
})

I'm not sure why the first day's classifications are being shown on Sunday and also on Monday, but I suspect that's a bug in the check for the current day here:

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Jun 21, 2021

Maybe we should do all our time calculations for the graph in Zulu time (UTC) and ignore the JS locale. getDay uses your local time. What's happening in the first screenshot (I think) is day.getDay() is returning 1, but the day index in the stats store is 0.

@eatyourgreens
Copy link
Contributor

See also zooniverse/Panoptes-Front-End#3734 where this has come up before (local timezones being off with respect to the stats timestamps.)

@srallen
Copy link
Contributor Author

srallen commented Jun 22, 2021

  1. I made the classification middle of the day on Monday Chicago time, if the stats time stamp is reporting that as Sunday, then that's incorrect.
  2. Weeks start on Sunday in the US, so the stats chart isn't going to be localized in this way?

@eatyourgreens
Copy link
Contributor

It turns out we’ve run into this problem before. #1128 (comment)

@eatyourgreens
Copy link
Contributor

I think replacing getDay with getUTCDay should fix it.

Note: It's important to keep in mind that while the time value at the heart of a Date object is UTC, the basic methods to fetch the date and time or its components all work in the local (i.e. host system) time zone and offset.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date

@eatyourgreens
Copy link
Contributor

It's hard to debug this in a UK time zone, but I think getDay() is causing problems when it's used to calculate the current day.

const monday = new Date("2021-08-16T00:00:00Z")
// this is 1 (Monday) in the UK, 0 (Sunday) in a US time zone.
console.log(monday.getDay())
// I think this will be 1 (Monday) anywhere in the world.
console.log(monday.getUTCDay())

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Aug 18, 2021

The day names for the graph labels also need changes.

const monday = new Date("2021-08-16T00:00:00Z")
// This prints Monday in the UK, Sunday in a US time zone.
console.log(monday.toLocaleDateString('en', { weekday: 'long' }))
// This prints Monday anywhere in the world.
console.log(console.log(monday.toLocaleDateString('en', { timeZone: 'UTC', weekday: 'long' })))

@eatyourgreens
Copy link
Contributor

#2359 should fix this by explicitly using UTC for the dates sent back by the stats API. Local time is only used when figuring out the current day of the week, so that we can update your classification count for today after each new classification.

@eatyourgreens
Copy link
Contributor

I'm not sure what happens if you classify after midnight UTC eg. 9pm CDT (2am UTC.) I assume the stats service adds those classifications to tomorrow's count?

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Oct 19, 2024

@goplayoutside3 this bug has regressed. See https://www.zooniverse.org/talk/17/3466158?comment=5709216

It was fixed, with tests, in #2359, but those tests were removed in #5210. The missing tests are the tests that check the day name and classification count for each bar of the chart.

I think the bug is this line here, which can return the wrong result for today when your local clock isn’t UTC.

Eg. during the evening in Chicago, your local clock could be Wednesday but the UTC day would be Thursday. If the code updates Wednesday, when you press Done, it will actually increment yesterday’s count.

@goplayoutside3
Copy link
Contributor

goplayoutside3 commented Oct 29, 2024

This behavior hasn't been fixed. I made two classifications on Daily Minor Planet today. It's now evening in Chicago and I don't see accurate counts for today's classifications in YourStats. It just says 0.

When I log the period, dayTime, and weekDay from calculateWeeklyStats(), I see:
Screenshot 2024-10-28 at 8 32 48 PM

@eatyourgreens
Copy link
Contributor

Should this line be TODAY.getUTCDay()?

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Oct 29, 2024

Just noticed, looking at the mocks in the test, that the move to ERAS changed period for each count from a UTC timestamp to just the date. I don’t know if that would introduce a bug.

Looking at the screenshots at the top of this issue, the original code was written to use midnight UTC for each weekday, eg. count: 1, period: 2021-06-14T00:00:00Z.

EDIT: Looks like the mocks don't match real API data. Real responses do have full UTC timestamps.

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Oct 29, 2024

I've opened #6412 to add the missing timestamps to the mock ERAS data, and run the tests with a time in the -05:00 timezone. Changing timezone in the tests doesn't trigger a bug, though. Weekly counts still begin on Monday and end on Sunday.

Here's period, dayNumber and weekDay from calculateWeeklyStats when that test runs. weekDay is correctly set to UTC, even though the local clock is CDT. In your screenshot, the weekday is CDT in your browser, so that's probably where the bug is. I'm not sure if it's possible to run a test that overrides the time zone of the machine that's running the test, so I'm not sure if the tests can catch this.

{
  period: '2019-09-30',
  dayNumber: 1,
  weekDay: 2019-09-30T01:00:00.000Z,
  count: 12
}
{
  period: '2019-10-01',
  dayNumber: 2,
  weekDay: 2019-10-01T01:00:00.000Z,
  count: 13
}
{
  period: '2019-10-02',
  dayNumber: 3,
  weekDay: 2019-10-02T01:00:00.000Z,
  count: 14
}
{
  period: '2019-10-03',
  dayNumber: 4,
  weekDay: 2019-10-03T01:00:00.000Z,
  count: 10
}
{
  period: '2019-10-04',
  dayNumber: 5,
  weekDay: 2019-10-04T01:00:00.000Z,
  count: 11
}
{
  period: '2019-10-05',
  dayNumber: 6,
  weekDay: 2019-10-05T01:00:00.000Z,
  count: 8
}
{
  period: '2019-10-06',
  dayNumber: 0,
  weekDay: 2019-10-06T01:00:00.000Z,
  count: 15
}

@goplayoutside3
Copy link
Contributor

Some context on stats and classifications across timezones:

Stats bucketed by days always use UTC and not a user’s local timezone. ERAS classifications come from Panoptes' classification created_at (which is in UTC) through the Kinesis stream. Panoptes (aka our source of truth) recording in UTC has been true since panoptes existence. Therefore, the stats time buckets are 1day UTC because classification metadata is stored as UTC.

A user scenario where the user is in a different day than UTC:

  • It’s already tomorrow (Oct 30) in Guam. I can replicate this by setting my local machine time/date to Guam’s timezone.
  • I make a classification. The classification is recorded in panoptes with metadata of UTC, not my computer’s local timezone.
  • When time buckets of stats are fetched from ERAS, the time buckets are in UTC.
  • The bar chart in app-project’s YourStats consumes the bucketed stats data, but it cannot be converted to the user’s local computer’s timezone because the smallest bucket of stats is 1 day as determined by UTC.
  • Therefore, the bar charts say I made a classification yesterday because it’s Oct 29 UTC when I make the classification (even though I’m in Guam).

The labeling on Your Stats and its design were pre-ERAS, but regardless of the stats data, the component’s goal of accurately showing a user the number of classifications they’ve submitted “today” is really difficult to keep track of and display in a bar chart of week days accurately.

Rather than bug fixing the current YourStats code, some changes to the design are being discussed to bring the component in line with a user's /stats page, and remove the assumption we can provide a user an accurate count of their "today" classifications specific to their timezone.

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Oct 29, 2024

I think I've fixed the bug that was returning the wrong weekday as the start of the week. Tested by running the tests in Guam, Kuala Lumpur, Chicago, and London. The tests in #6412 passed each time.

I missed this line when I converted the weekly stats to use UTC. It should use getUTCDay instead of getDay.

@eatyourgreens
Copy link
Contributor

eatyourgreens commented Oct 30, 2024

The labeling on Your Stats and its design were pre-ERAS, but regardless of the stats data, the component’s goal of accurately showing a user the number of classifications they’ve submitted “today” is really difficult to keep track of and display in a bar chart of week days accurately.

Since #2359, the days shown in the daily stats chart have been UTC for just this reason. That also brought them in line with the UTC days shown on the project stats page. “Today” on the daily stats chart and the project stats page is whatever the current day is in Greenwich. Every now and again there’s a Talk comment from someone working in the US in the evening, asking why the project stats graphs are showing “tomorrow” but I think the mods mostly know that all the stats are UTC now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants