CWA difference #369

Closed
muschellij2 opened this issue Oct 21, 2020 · 19 comments

muschellij2 commented Oct 21, 2020

I'm opening this issue here and a matching one at @activityMonitoring for https://github.com/activityMonitoring/biobankAccelerometerAnalysis.

Below I show 3 things:

  1. Some implausibly high values from GGIR
  2. GGIR and biobankAccelerometerAnalysis do not return the same values
  3. Skips in the python3 reader from @activityMonitoring

GGIR

library(readr)
library(GGIR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fname = "data/example_90001_0_0.cwa.gz"
tfile = R.utils::gunzip(fname, 
                        remove = FALSE, temporary = TRUE)

out = GGIR::g.cwaread(tfile, end = Inf)
data = out$data[, c("time", "x", "y", "z")]
data$time = as.POSIXct(data$time, origin = "1970-01-01")
dim(data)
#> [1] 29864812        4
data[22656010:22656030,]
#>                         time        x         y          z
#> 22656010 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656011 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656012 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656013 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656014 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656015 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656016 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656017 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656018 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656019 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656020 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656021 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656022 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656023 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 22656024 2015-06-22 00:56:04 -0.50000  0.359375   0.609375
#> 22656025 2015-06-22 00:56:04 48.11477 76.729466 123.842931
#> 22656026 2015-06-22 00:56:04 48.09753 76.702373 123.799213
#> 22656027 2015-06-22 00:56:04 48.08029 76.675281 123.755496
#> 22656028 2015-06-22 00:56:04 48.06305 76.648189 123.711779
#> 22656029 2015-06-22 00:56:04 48.04581 76.621096 123.668062
#> 22656030 2015-06-22 00:56:04 48.02857 76.594004 123.624344

accProcess from biobankAccelerometerAnalysis

After running

python3 accProcess.py --skipCalibration=True --rawOutput=True data/example_90001_0_0.cwa.gz

we get the CSV, and read it in:

csv = readr::read_csv(sub("cwa", "csv", fname))
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   time = col_character(),
#>   x = col_double(),
#>   y = col_double(),
#>   z = col_double()
#> )
dim(csv)
#> [1] 29856000        4
head(csv)
#> # A tibble: 6 x 4
#>   time                                              x     y      z
#>   <chr>                                         <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03.771+0100 [Europe/London] -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03.781+0100 [Europe/London] -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03.791+0100 [Europe/London] -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03.801+0100 [Europe/London] -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03.811+0100 [Europe/London] -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03.821+0100 [Europe/London] -0.375 0.328  0.734

Making time an actual time object

csv = csv %>% 
  mutate(time = substr(time, 1, 23),
         time = lubridate::as_datetime(time, 
                                       tz = "Europe/London"))

Between consecutive rows the time jumps from 12:56 AM on 2015-06-22 to 12:46 PM on 2015-06-23, a gap of almost 36 hours. I'm not sure why this would be or whether it is correct, but it seems odd.

csv[22655995:22656020,]
#> # A tibble: 26 x 4
#>    time                     x      y      z
#>    <dttm>               <dbl>  <dbl>  <dbl>
#>  1 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  2 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  3 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  4 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  5 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  6 2015-06-22 00:56:03 -0.484  0.359  0.609
#>  7 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#>  8 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#>  9 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.5  
#> # … with 16 more rows
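
For anyone wanting to locate such jumps without scanning rows by eye, here is a minimal base R sketch. `find_time_gaps` and its `max_gap_sec` argument are hypothetical names introduced for illustration, and the toy timestamps only mimic the jump shown above (in UTC rather than Europe/London). It is not part of GGIR or biobankAccelerometerAnalysis.

```r
# Hypothetical helper: report every place where two consecutive timestamps
# are more than `max_gap_sec` seconds apart.
find_time_gaps <- function(time, max_gap_sec = 1) {
  step <- diff(as.numeric(time))      # seconds between consecutive rows
  idx <- which(step > max_gap_sec)    # row index just before each gap
  data.frame(
    row_before = idx,
    from = time[idx],
    to = time[idx + 1],
    gap_hours = step[idx] / 3600
  )
}

# Toy timestamps (assumption: 100 Hz spacing, values made up for the demo).
t0 <- as.POSIXct("2015-06-22 00:56:03", tz = "UTC")
t1 <- as.POSIXct("2015-06-23 12:46:03", tz = "UTC")
toy_time <- c(t0 + (0:5) * 0.01, t1 + (0:3) * 0.01)

gaps <- find_time_gaps(toy_time)
print(gaps)
```

On the real data this would be called as `find_time_gaps(csv$time)` after the time column has been converted to a date-time.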

Created on 2020-10-21 by the reprex package (v0.3.0.9001)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.6      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-10-21                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                           
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 4.0.0)                   
#>  backports     1.1.10     2020-09-15 [1] CRAN (R 4.0.2)                   
#>  cli           2.1.0      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 4.0.0)                   
#>  data.table    1.13.0     2020-07-24 [2] CRAN (R 4.0.2)                   
#>  digest        0.6.26     2020-10-17 [1] CRAN (R 4.0.2)                   
#>  dplyr       * 1.0.2      2020-08-18 [2] CRAN (R 4.0.2)                   
#>  ellipsis      0.3.1      2020-05-15 [2] CRAN (R 4.0.0)                   
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.0)                   
#>  fansi         0.4.1      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  fs            1.5.0      2020-07-31 [2] CRAN (R 4.0.2)                   
#>  generics      0.0.2      2018-11-29 [2] CRAN (R 4.0.0)                   
#>  GGIR        * 2.0-1      2020-06-01 [2] Github (wadpac/GGIR@2bd3c76)     
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
#>  highr         0.8        2019-03-20 [2] CRAN (R 4.0.0)                   
#>  hms           0.5.3      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  htmltools     0.5.0      2020-06-16 [2] CRAN (R 4.0.0)                   
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.2)                   
#>  lifecycle     0.2.0      2020-03-06 [2] CRAN (R 4.0.0)                   
#>  lubridate     1.7.9      2020-06-08 [2] CRAN (R 4.0.0)                   
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 4.0.0)                   
#>  pillar        1.4.6      2020-07-10 [2] CRAN (R 4.0.2)                   
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.0)                   
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.0)                   
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.utils       2.10.1     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R6            2.4.1      2019-11-12 [2] CRAN (R 4.0.0)                   
#>  Rcpp          1.0.5      2020-07-06 [2] CRAN (R 4.0.0)                   
#>  readr       * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                   
#>  reprex        0.3.0.9001 2020-09-30 [1] Github (tidyverse/reprex@d3fc4b8)
#>  rlang         0.4.8.9000 2020-10-20 [1] Github (r-lib/rlang@011cb4c)     
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 4.0.2)                   
#>  rstudioapi    0.11       2020-02-07 [2] CRAN (R 4.0.0)                   
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.0)                   
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                   
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.0)                   
#>  styler        1.3.2      2020-02-23 [2] CRAN (R 4.0.0)                   
#>  tibble        3.0.4      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  tidyselect    1.1.0      2020-05-11 [2] CRAN (R 4.0.0)                   
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 4.0.0)                   
#>  vctrs         0.3.4      2020-08-29 [1] CRAN (R 4.0.2)                   
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 4.0.2)                   
#>  xfun          0.18       2020-09-29 [1] CRAN (R 4.0.2)                   
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.0)                   
#> 
#> [1] /Users/johnmuschelli/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
@muschellij2
Contributor Author

I added the output from cwa-convert from https://github.com/digitalinteraction/openmovement/tree/master/Software/AX3/cwa-convert to show that even the first 3 rows have differences:

# python3 accProcess.py --skipCalibration=True --rawOutput=True data/example_90001_0_0.cwa.gz

setwd("~/Dropbox/Packages/biobankAccelerometerAnalysis/")
library(readr)
library(GGIR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fname = "data/example_90001_0_0.cwa.gz"
xyz = c("x", "y", "z")
tfile = R.utils::gunzip(fname, 
                        remove = FALSE, temporary = TRUE)

out = GGIR::g.cwaread(tfile, end = Inf)
data = out$data[, c("time", xyz)]
data = tibble::as_tibble(data)
data$time = as.POSIXct(data$time, origin = "1970-01-01")
dim(data)
#> [1] 29864812        4
as.data.frame(data[22656010:22656030,])
#>                   time        x         y          z
#> 1  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 2  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 3  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 4  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 5  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 6  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 7  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 8  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 9  2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 10 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 11 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 12 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 13 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 14 2015-06-22 00:56:03 -0.50000  0.359375   0.609375
#> 15 2015-06-22 00:56:04 -0.50000  0.359375   0.609375
#> 16 2015-06-22 00:56:04 48.11477 76.729466 123.842931
#> 17 2015-06-22 00:56:04 48.09753 76.702373 123.799213
#> 18 2015-06-22 00:56:04 48.08029 76.675281 123.755496
#> 19 2015-06-22 00:56:04 48.06305 76.648189 123.711779
#> 20 2015-06-22 00:56:04 48.04581 76.621096 123.668062
#> 21 2015-06-22 00:56:04 48.02857 76.594004 123.624344

csv = readr::read_csv(sub("cwa", "csv", fname))
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   time = col_character(),
#>   x = col_double(),
#>   y = col_double(),
#>   z = col_double()
#> )
dim(csv)
#> [1] 29856000        4
head(csv)
#> # A tibble: 6 x 4
#>   time                                              x     y      z
#>   <chr>                                         <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03.771+0100 [Europe/London] -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03.781+0100 [Europe/London] -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03.791+0100 [Europe/London] -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03.801+0100 [Europe/London] -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03.811+0100 [Europe/London] -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03.821+0100 [Europe/London] -0.375 0.328  0.734

csv = csv %>% 
  mutate(time = substr(time, 1, 23),
         time = lubridate::as_datetime(time, 
                                       tz = "Europe/London"))

as.data.frame(csv[22655995:22656020,])
#>                   time      x      y      z
#> 1  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 2  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 3  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 4  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 5  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 6  2015-06-22 00:56:03 -0.484  0.359  0.609
#> 7  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 8  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 9  2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 11 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 12 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 13 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 14 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 15 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 16 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 17 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 18 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 19 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 20 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 21 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 22 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 23 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 24 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 25 2015-06-23 12:46:03 -0.922 -0.328 -0.500
#> 26 2015-06-23 12:46:03 -0.922 -0.328 -0.500

con_fname = sub("cwa", "csv", fname)
con_fname = file.path(dirname(con_fname), paste0("cwaconvert_", basename(con_fname)))
con = readr::read_csv(con_fname, col_names = FALSE)
#> 
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   X1 = col_datetime(format = ""),
#>   X2 = col_double(),
#>   X3 = col_double(),
#>   X4 = col_double()
#> )
colnames(con) = c("time", xyz)
dim(con)
#> [1] 29245200        4
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734

as.data.frame(con[22656010:22656030,])
#>                   time        x        y        z
#> 1  2015-06-23 14:29:32 0.484375 0.656250 0.406250
#> 2  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 3  2015-06-23 14:29:32 0.484375 0.640625 0.406250
#> 4  2015-06-23 14:29:32 0.484375 0.640625 0.421875
#> 5  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 6  2015-06-23 14:29:32 0.484375 0.625000 0.406250
#> 7  2015-06-23 14:29:32 0.500000 0.640625 0.390625
#> 8  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 9  2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 10 2015-06-23 14:29:32 0.500000 0.625000 0.406250
#> 11 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 12 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 13 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 14 2015-06-23 14:29:32 0.515625 0.625000 0.390625
#> 15 2015-06-23 14:29:32 0.515625 0.625000 0.406250
#> 16 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 17 2015-06-23 14:29:32 0.531250 0.625000 0.406250
#> 18 2015-06-23 14:29:32 0.531250 0.609375 0.390625
#> 19 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 20 2015-06-23 14:29:32 0.546875 0.609375 0.406250
#> 21 2015-06-23 14:29:32 0.546875 0.609375 0.406250

head(csv)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.313  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
head(con)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
round(head(con[, xyz]) - head(csv[, xyz]), 3)
#>   x      y z
#> 1 0  0.000 0
#> 2 0  0.000 0
#> 3 0  0.000 0
#> 4 0  0.000 0
#> 5 0 -0.001 0
#> 6 0  0.000 0
round(head(con[, xyz]) - head(data[, xyz]), 3)
#>        x      y     z
#> 1  0.000  0.000 0.000
#> 2 -0.002 -0.008 0.023
#> 3  0.000  0.001 0.001
#> 4  0.001  0.001 0.000
#> 5  0.001  0.001 0.000
#> 6  0.001  0.001 0.000
head(data)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.420 0.258  0.680
#> 3 2015-06-19 10:00:03 -0.422 0.280  0.733
#> 4 2015-06-19 10:00:03 -0.407 0.296  0.734
#> 5 2015-06-19 10:00:03 -0.392 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.376 0.327  0.734

Created on 2020-10-21 by the reprex package (v0.3.0.9001)


@Mirkes
Contributor

Mirkes commented Oct 21, 2020 via email

@muschellij2
Contributor Author

This data is sampled at 100 Hz. I understand it may have a varying effective sampling rate, but that still doesn't explain why the 3rd record is different, or why there are large differences from this other open-source software for getting the raw data. Also, the values are way too high for acceleration (140g) in the GGIR output but not in the other solutions.

@vincentvanhees
Member

vincentvanhees commented Oct 21, 2020

Thanks @muschellij2 - this issue may overlap with #361 which I closed last week after making pull request #367. The issue was that sample frequency was not correctly recognised from the page headers.

I see you are using an older GGIR version, does the issue persist when you use the GGIR master branch here on GitHub?

In January I expanded the g.cwaread functionality to also be able to handle AX6 cwa file formats and with help from Dan Jackson we resolved some issues in the numUnpack function affecting AX3 data with non-default configuration settings.

@vincentvanhees
Member

Just to reassure any PA/sleep researcher reading this: the GGIR code flags every large QC window (default 15 minutes) as invalid if it contains even a single raw acceleration value larger than dynamic range x 1.5, positive or negative. This acts as a safety net in case something goes wrong with reading the data or files get corrupted. That said, it would be good to investigate what is going on.
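
To make the safety net concrete, the rule can be sketched in base R as below. This is an illustration only, not GGIR's actual code: `flag_invalid_windows` and its arguments are hypothetical, the dynamic range of 8 g is an assumption about the sensor configuration, and the toy call uses a short 2-second window at 10 Hz so the output stays small.

```r
# Hypothetical sketch of the rule: a window is invalid if any axis of any
# sample in it exceeds dynrange * margin in absolute value.
flag_invalid_windows <- function(acc, sf = 100, window_sec = 15 * 60,
                                 dynrange = 8, margin = 1.5) {
  acc <- as.matrix(acc)                          # columns x, y, z in g
  limit <- dynrange * margin                     # 12 g under the assumed 8 g range
  bad_sample <- rowSums(abs(acc) > limit) > 0    # any axis out of range
  window_id <- (seq_len(nrow(acc)) - 1) %/% (sf * window_sec) + 1
  tapply(bad_sample, window_id, any)             # TRUE = window flagged
}

# Toy data: 60 samples of plausible values with one extreme row inserted.
set.seed(1)
toy <- matrix(runif(180, -1, 1), ncol = 3,
              dimnames = list(NULL, c("x", "y", "z")))
toy[25, ] <- c(48.1, 76.7, 123.8)

flags <- flag_invalid_windows(toy, sf = 10, window_sec = 2)
print(flags)
```

With 20 samples per window, the extreme row 25 falls in the second of three windows, so only that window is flagged.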

@vincentvanhees
Member

... even the first 3 rows have differences ...

Different packages may handle the start of the file differently. For example, there is a known issue with AX3 values in some specific firmware versions being 'stuck' for the first couple of seconds. Some packages may intentionally skip these values while other packages simply include them in the output. So, I think it is best not to rely too heavily on row or page numbers but to focus on matching timestamps instead.
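
A minimal base R sketch of comparing by timestamp rather than by row number follows. `compare_by_time` and `tick_ms` are hypothetical names, the timestamps are toy values on a regular 10 ms grid, and the second "reader" simply drops the first two rows to imitate a package that skips the start of the file.

```r
# Hypothetical helper: join two readers' outputs on time (rounded to ticks of
# `tick_ms` milliseconds) and return the per-axis differences.
compare_by_time <- function(a, b, tick_ms = 10) {
  a$key <- round(as.numeric(a$time) * 1000 / tick_ms)
  b$key <- round(as.numeric(b$time) * 1000 / tick_ms)
  m <- merge(a, b, by = "key", suffixes = c(".a", ".b"))
  data.frame(
    time = m$time.a,
    dx = m$x.a - m$x.b,
    dy = m$y.a - m$y.b,
    dz = m$z.a - m$z.b
  )
}

# Toy data (assumption: both readers agree once aligned in time).
t0 <- as.POSIXct("2015-06-19 10:00:03", tz = "UTC")
reader_a <- data.frame(
  time = t0 + (0:4) * 0.01,
  x = c(-0.281, -0.422, -0.422, -0.406, -0.391),
  y = c(0.781, 0.250, 0.281, 0.297, 0.313),
  z = c(-0.750, 0.703, 0.734, 0.734, 0.734)
)
reader_b <- reader_a[3:5, ]   # imitates a reader that skips the first rows

by_row <- reader_a$x[1:3] - reader_b$x        # row-wise: spurious differences
by_time <- compare_by_time(reader_a, reader_b) # time-wise: no differences
print(by_row)
print(by_time)
```

The row-wise comparison reports differences that come purely from the offset in row numbers, while the time-based join does not.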

@muschellij2 if you could also send me a copy of the problematic file that would be much appreciated. In the meantime I am trying to retrieve a copy of a file from another group who had a similar problem a while ago, which I now realize could have been the same issue.

@vincentvanhees
Member

  1. Where can I find accProcess.py?

The last CSV is from cwaconvert https://github.com/digitalinteraction/openmovement/tree/master/Software/AX3/cwa-convert

  2. Which of the cwaconvert tools in that directory did you use?

@muschellij2
Contributor Author

muschellij2 commented Oct 22, 2020 via email

@vincentvanhees
Member

vincentvanhees commented Oct 22, 2020

re. minor differences in acceleration values

When I plot the acceleration signal extracted with the OMGUI csv export (which internally uses cwaconvert) and compare it with the GGIR::g.cwaread output, I see a small, approximately constant offset between the two signals, which otherwise follow the same pattern in time. This seems to match John's observation.
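
One way to put a number on such an offset, sketched in base R under the assumption that the two signals are already aligned sample by sample: `offset_summary` is a hypothetical helper and the signals below are synthetic, not taken from the file under discussion.

```r
# Hypothetical helper: summarise the difference between two aligned signals.
# A constant offset shows up as a non-zero median with a spread near zero.
offset_summary <- function(ref, test) {
  d <- test - ref
  c(offset = median(d), spread = mad(d))
}

# Synthetic signals: same pattern in time, shifted by a made-up constant.
ref_signal <- sin(seq(0, 2 * pi, length.out = 200))
test_signal <- ref_signal + 0.015

s <- offset_summary(ref_signal, test_signal)
print(s)
```

A spread that is clearly larger than zero would instead suggest the difference varies over time rather than being a constant shift.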

This evening I reviewed all the code in GGIR's g.cwaread and numUnpack relative to the cwa file format documentation in https://github.com/digitalinteraction/openmovement/blob/master/Docs/ax3/cwa.h
and I could not find any obvious discrepancies.

The part of the code I found more difficult to review is the resampling step because this is not cwa format specific:
GGIR's resampling function looks similar to what is done in OMGUI (see the LinearInterpolate function), although I am not entirely sure this is the actual code used by cwaconvert, and I am also not familiar with C, so maybe I am misjudging this.

The resampling in biobankAccelerometerAnalysis looks different from what GGIR does... or is it mathematically equivalent?
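
For readers unfamiliar with the resampling step, here is a generic base R sketch of linear interpolation onto a fixed-rate time grid using `stats::approx`. It is not the code of GGIR, OMGUI or biobankAccelerometerAnalysis, and whether any of them is equivalent to it is exactly the open question. `resample_linear` is a hypothetical name and the input values are made up.

```r
# Generic sketch: resample irregularly timed samples to a fixed rate `sf`
# (in Hz) by linear interpolation between neighbouring raw samples.
resample_linear <- function(time_sec, values, sf = 100) {
  grid <- seq(from = time_sec[1], to = time_sec[length(time_sec)], by = 1 / sf)
  cols <- lapply(as.data.frame(values), function(v) {
    stats::approx(x = time_sec, y = v, xout = grid)$y
  })
  data.frame(time = grid, cols)
}

# Toy input: slightly irregular timestamps around a nominal 100 Hz.
raw_time <- c(0, 0.012, 0.019, 0.031, 0.040)
raw <- data.frame(
  x = 2 * raw_time,                       # linear in time
  y = c(0.78, 0.25, 0.28, 0.30, 0.31)     # large step between first two rows
)

res <- resample_linear(raw_time, raw, sf = 100)
print(res)
```

A signal that is linear in time (`x`) is reproduced on the new grid, whereas a large step between two raw samples (`y`) produces resampled values that match neither neighbour. This is one reason resampled output need not equal the raw stored values row for row.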

@Mirkes would you mind having a look at these resample functions and check whether you can spot differences that could explain the minor differences in acceleration values?

re. extreme values

In the current version of GGIR these are flagged and ignored (not in g.cwaread but more upstream in GGIR). Hopefully it will be possible to retrieve the problematic example file in the upcoming weeks to investigate further.

@muschellij2
Contributor Author

I've requested the ability to share the UK Biobank file from the UKBB team.

@vincentvanhees
Member

vincentvanhees commented Oct 23, 2020

@danielgjackson Is there a difference in cwa file format between (1) the cwa files used in UK Biobank and the demofile on the Axivity website and (2) cwa files extracted from commercially sold AX3 sensors?

I observe that commercial AX3 sensor output is identical across OMGUI exported csv files and GGIR. The small offset in values only seems to appear in UK Biobank data and the Axivity demofile:

[Image: compare_OMGUI_GGIR]

Further, I think the resampling inside GGIR is definitely not the problem, because when I turn it off and compare output with GGIR-based resampled data the signal aligns well as it should.

@Mirkes
Contributor

Mirkes commented Oct 23, 2020 via email

@muschellij2
Contributor Author

muschellij2 commented Oct 23, 2020

I don't think the resampling is the problem. By the way, I wrapped that software in https://github.com/muschellij2/read.cwa so that you can run:

remotes::install_github("muschellij2/read.cwa")
library(read.cwa)
out = read_cwa(file)

@muschellij2
Contributor Author

muschellij2 commented Oct 28, 2020

For easy comparison, I also wrapped the software from @activityMonitoring in https://github.com/muschellij2/pycwa so that you can run:

remotes::install_github("muschellij2/pycwa")
library(pycwa)
out = py_read_cwa(file)

@aidendoherty

@vincentvanhees
Member

vincentvanhees commented Oct 28, 2020

Thanks @muschellij2. For my own cwa test file, collected with a commercial AX3 sensor, the pycwa function output looks consistent with g.cwaread and the OMGUI export. So, this indicates that the observed small offset in raw data truly is specific to the combination of GGIR::g.cwaread and older cwa data.

I am currently not working on paid projects related to cwa data but am happy to invest a bit more time on it once someone has been able to pinpoint the cause of the issue (or contracts me to dedicate more time to it). Hopefully my input here has at least been helpful to inform further investigation by others.

@muschellij2
Contributor Author

muschellij2 commented Oct 30, 2020

So I made 2 other packages for reading CWA files: https://github.com/muschellij2/pycwa and https://github.com/muschellij2/read.cwa for comparison with GGIR and potential checks.

setwd("~/Dropbox/Packages/biobankAccelerometerAnalysis/")
library(readr)
library(GGIR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(read.cwa)
library(pycwa)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
file = "data/example_90001_0_0.cwa"


xyz = c("x", "y", "z")
py = py_read_cwa(file, verbose = FALSE)

om = read_cwa(file, verbose = FALSE)
gg = GGIR::g.cwaread(file, end = Inf, progressBar = TRUE, desiredtz = "UTC")
#> [progress bar output omitted]
 |  22%  |                                                                              |================                                                      |  22%  |                                                                              |================                                                      |  23%  |                                                                              |================                                                      |  24%  |                                                                              |=================                                                     |  24%  |                                                                              |=================                                                     |  25%  |                                                                              |==================                                                    |  25%  |                                                                              |==================                                                    |  26%  |                                                                              |===================                                                   |  26%  |                                                                              |===================                                                   |  27%  |                                                                              |===================                                                   |  28%  |                                                                              |====================                                                  |  28%  |                                                                              |====================                                                  |  29%  |                                                                              
|=====================                                                 |  29%  |                                                                              |=====================                                                 |  30%  |                                                                              |=====================                                                 |  31%  |                                                                              |======================                                                |  31%  |                                                                              |======================                                                |  32%  |                                                                              |=======================                                               |  32%  |                                                                              |=======================                                               |  33%  |                                                                              |=======================                                               |  34%  |                                                                              |========================                                              |  34%  |                                                                              |========================                                              |  35%  |                                                                              |=========================                                             |  35%  |                                                                              |=========================                                             |  36%  |                                                                              |==========================                                            |  36%  |                        
                                                      |==========================                                            |  37%  |                                                                              |==========================                                            |  38%  |                                                                              |===========================                                           |  38%  |                                                                              |===========================                                           |  39%  |                                                                              |============================                                          |  39%  |                                                                              |============================                                          |  40%  |                                                                              |============================                                          |  41%  |                                                                              |=============================                                         |  41%  |                                                                              |=============================                                         |  42%  |                                                                              |==============================                                        |  42%  |                                                                              |==============================                                        |  43%  |                                                                              |==============================                                        |  44%  |                                                                              |===============================                  
                     |  44%  |                                                                              |===============================                                       |  45%  |                                                                              |================================                                      |  45%  |                                                                              |================================                                      |  46%  |                                                                              |=================================                                     |  46%  |                                                                              |=================================                                     |  47%  |                                                                              |=================================                                     |  48%  |                                                                              |==================================                                    |  48%  |                                                                              |==================================                                    |  49%  |                                                                              |===================================                                   |  49%  |                                                                              |===================================                                   |  50%  |                                                                              |===================================                                   |  51%  |                                                                              |====================================                                  |  51%  |                                                                          
    |====================================                                  |  52%  |                                                                              |=====================================                                 |  52%  |                                                                              |=====================================                                 |  53%  |                                                                              |=====================================                                 |  54%  |                                                                              |======================================                                |  54%  |                                                                              |======================================                                |  55%  |                                                                              |=======================================                               |  55%  |                                                                              |=======================================                               |  56%  |                                                                              |========================================                              |  56%  |                                                                              |========================================                              |  57%  |                                                                              |========================================                              |  58%  |                                                                              |=========================================                             |  58%  |                                                                              |=========================================                             |  59%  |                    
                                                          |==========================================                            |  59%  |                                                                              |==========================================                            |  60%  |                                                                              |==========================================                            |  61%  |                                                                              |===========================================                           |  61%  |                                                                              |===========================================                           |  62%  |                                                                              |============================================                          |  62%  |                                                                              |============================================                          |  63%  |                                                                              |============================================                          |  64%  |                                                                              |=============================================                         |  64%  |                                                                              |=============================================                         |  65%  |                                                                              |==============================================                        |  65%  |                                                                              |==============================================                        |  66%  |                                                                              
|===============================================                       |  66%  |                                                                              |===============================================                       |  67%  |                                                                              |===============================================                       |  68%  |                                                                              |================================================                      |  68%  |                                                                              |================================================                      |  69%  |                                                                              |=================================================                     |  69%  |                                                                              |=================================================                     |  70%  |                                                                              |=================================================                     |  71%  |                                                                              |==================================================                    |  71%  |                                                                              |==================================================                    |  72%  |                                                                              |===================================================                   |  72%  |                                                                              |===================================================                   |  73%  |                                                                              |===================================================                   |  74%  |                        
                                                      |====================================================                  |  74%  |                                                                              |====================================================                  |  75%  |                                                                              |=====================================================                 |  75%  |                                                                              |=====================================================                 |  76%  |                                                                              |======================================================                |  76%  |                                                                              |======================================================                |  77%  |                                                                              |======================================================                |  78%  |                                                                              |=======================================================               |  78%  |                                                                              |=======================================================               |  79%  |                                                                              |========================================================              |  79%  |                                                                              |========================================================              |  80%  |                                                                              |========================================================              |  81%  |                                                                              
|=========================================================             |  81%  |                                                                              |=========================================================             |  82%
gg = gg$data
gg = tibble::as_tibble(gg)
om = om$data

colnames(gg) = tolower(colnames(gg))
colnames(om) = tolower(colnames(om))

gg = gg[, c("time", xyz)]
gg$time = as.POSIXct(gg$time, tz = "UTC", origin = "1970-01-01")
om = om[, c("time", xyz)]
py = py[, c("time", xyz)]

nrow(py)
#> [1] 29856000
nrow(om)
#> [1] 29245200
nrow(gg)
#> [1] 29864812

cr = function(x) {
  sapply(x[xyz], range)
}
cr(py)
#>           x      y      z
#> [1,] -8.000 -8.000 -8.000
#> [2,]  7.984  7.563  7.984
cr(om)
#>              x         y         z
#> [1,] -8.000000 -8.000000 -8.000000
#> [2,]  7.984375  7.984375  7.984375
cr(gg)
#>              x         y         z
#> [1,] -1737.464 -1192.311 -1979.984
#> [2,]  3779.179  2105.408  1938.802

head(py)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 09:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 09:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 09:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 09:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 09:00:03 -0.391 0.313  0.734
#> 6 2015-06-19 09:00:03 -0.375 0.328  0.734
head(om)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.422 0.25   0.703
#> 3 2015-06-19 10:00:03 -0.422 0.281  0.734
#> 4 2015-06-19 10:00:03 -0.406 0.297  0.734
#> 5 2015-06-19 10:00:03 -0.391 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.375 0.328  0.734
head(gg)
#> # A tibble: 6 x 4
#>   time                     x     y      z
#>   <dttm>               <dbl> <dbl>  <dbl>
#> 1 2015-06-19 10:00:03 -0.281 0.781 -0.75 
#> 2 2015-06-19 10:00:03 -0.420 0.258  0.680
#> 3 2015-06-19 10:00:03 -0.422 0.280  0.733
#> 4 2015-06-19 10:00:03 -0.407 0.296  0.734
#> 5 2015-06-19 10:00:03 -0.392 0.312  0.734
#> 6 2015-06-19 10:00:03 -0.376 0.327  0.734
fd = function(df) {
  df %>% mutate(time_sec = lubridate::floor_date(time, "1 second"))
}
py = fd(py)
gg = fd(gg)
om = fd(om)

btt = function(df) {
  dt = as_datetime("2015-06-22 00:56:03")
  ind = which(lubridate::floor_date(df$time, "1 second") == dt)
  ind = seq(max(ind), max(ind) + 1000)
  df[ind,]
}

# need to add because of DST
py$time = py$time + as.period(1, "hour")

btt(py)
#> # A tibble: 1,001 x 5
#>    time                     x      y      z time_sec           
#>    <dttm>               <dbl>  <dbl>  <dbl> <dttm>             
#>  1 2015-06-22 00:56:03 -0.484  0.359  0.609 2015-06-21 23:56:03
#>  2 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#>  3 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#>  4 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#>  5 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#>  6 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#>  7 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#>  8 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#>  9 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#> 10 2015-06-23 12:46:03 -0.922 -0.328 -0.5   2015-06-23 11:46:03
#> # … with 991 more rows
btt(om)
#> # A tibble: 1,001 x 5
#>    time                     x      y      z time_sec           
#>    <dttm>               <dbl>  <dbl>  <dbl> <dttm>             
#>  1 2015-06-22 00:56:03 -0.484  0.359  0.609 2015-06-22 00:56:03
#>  2 2015-06-23 12:46:32 -0.922 -0.328 -0.5   2015-06-23 12:46:32
#>  3 2015-06-23 12:46:32 -0.875 -0.312 -0.469 2015-06-23 12:46:32
#>  4 2015-06-23 12:46:32 -0.922 -0.328 -0.5   2015-06-23 12:46:32
#>  5 2015-06-23 12:46:32 -0.922 -0.312 -0.484 2015-06-23 12:46:32
#>  6 2015-06-23 12:46:32 -0.922 -0.328 -0.5   2015-06-23 12:46:32
#>  7 2015-06-23 12:46:32 -0.922 -0.328 -0.5   2015-06-23 12:46:32
#>  8 2015-06-23 12:46:32 -0.922 -0.328 -0.5   2015-06-23 12:46:32
#>  9 2015-06-23 12:46:32 -0.922 -0.328 -0.5   2015-06-23 12:46:32
#> 10 2015-06-23 12:46:32 -0.922 -0.328 -0.5   2015-06-23 12:46:32
#> # … with 991 more rows
btt(gg)
#> # A tibble: 1,001 x 5
#>    time                    x      y       z time_sec           
#>    <dttm>              <dbl>  <dbl>   <dbl> <dttm>             
#>  1 2015-06-22 00:56:03  -0.5  0.359   0.609 2015-06-22 00:56:03
#>  2 2015-06-22 00:56:04  -0.5  0.359   0.609 2015-06-22 00:56:04
#>  3 2015-06-22 00:56:04  48.1 76.7   124.    2015-06-22 00:56:04
#>  4 2015-06-22 00:56:04  48.1 76.7   124.    2015-06-22 00:56:04
#>  5 2015-06-22 00:56:04  48.1 76.7   124.    2015-06-22 00:56:04
#>  6 2015-06-22 00:56:04  48.1 76.6   124.    2015-06-22 00:56:04
#>  7 2015-06-22 00:56:04  48.0 76.6   124.    2015-06-22 00:56:04
#>  8 2015-06-22 00:56:04  48.0 76.6   124.    2015-06-22 00:56:04
#>  9 2015-06-22 00:56:04  48.0 76.6   124.    2015-06-22 00:56:04
#> 10 2015-06-22 00:56:04  48.0 76.5   124.    2015-06-22 00:56:04
#> # … with 991 more rows

Created on 2020-10-30 by the reprex package (v0.3.0.9001)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.6      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-10-30                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                           
#>  assertthat    0.2.1      2019-03-21 [2] CRAN (R 4.0.0)                   
#>  backports     1.1.10     2020-09-15 [1] CRAN (R 4.0.2)                   
#>  cli           2.1.0      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  crayon        1.3.4      2017-09-16 [2] CRAN (R 4.0.0)                   
#>  data.table    1.13.2     2020-10-19 [1] CRAN (R 4.0.2)                   
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.0.2)                   
#>  dplyr       * 1.0.2      2020-08-18 [2] CRAN (R 4.0.2)                   
#>  ellipsis      0.3.1      2020-05-15 [2] CRAN (R 4.0.0)                   
#>  evaluate      0.14       2019-05-28 [2] CRAN (R 4.0.0)                   
#>  fansi         0.4.1      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  fs            1.5.0      2020-07-31 [2] CRAN (R 4.0.2)                   
#>  generics      0.0.2      2018-11-29 [2] CRAN (R 4.0.0)                   
#>  GGIR        * 2.1-3      2020-10-22 [1] Github (wadpac/GGIR@49aedcd)     
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
#>  highr         0.8        2019-03-20 [2] CRAN (R 4.0.0)                   
#>  hms           0.5.3      2020-01-08 [2] CRAN (R 4.0.0)                   
#>  htmltools     0.5.0      2020-06-16 [2] CRAN (R 4.0.0)                   
#>  jsonlite      1.7.1      2020-09-07 [1] CRAN (R 4.0.2)                   
#>  knitr         1.30       2020-09-22 [1] CRAN (R 4.0.2)                   
#>  lattice       0.20-41    2020-04-02 [2] CRAN (R 4.0.2)                   
#>  lifecycle     0.2.0      2020-03-06 [2] CRAN (R 4.0.0)                   
#>  lubridate   * 1.7.9      2020-06-08 [2] CRAN (R 4.0.0)                   
#>  magrittr      1.5        2014-11-22 [2] CRAN (R 4.0.0)                   
#>  Matrix        1.2-18     2019-11-27 [2] CRAN (R 4.0.2)                   
#>  pillar        1.4.6      2020-07-10 [2] CRAN (R 4.0.2)                   
#>  pkgconfig     2.0.3      2019-09-22 [2] CRAN (R 4.0.0)                   
#>  purrr         0.3.4      2020-04-17 [2] CRAN (R 4.0.0)                   
#>  pycwa       * 0.1.0      2020-10-30 [1] local                            
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R.utils       2.10.1     2020-08-26 [1] CRAN (R 4.0.2)                   
#>  R6            2.4.1      2019-11-12 [2] CRAN (R 4.0.0)                   
#>  Rcpp          1.0.5      2020-07-06 [2] CRAN (R 4.0.0)                   
#>  read.cwa    * 0.2.1      2020-10-26 [1] local                            
#>  readr       * 1.4.0      2020-10-05 [1] CRAN (R 4.0.2)                   
#>  reprex        0.3.0.9001 2020-09-30 [1] Github (tidyverse/reprex@d3fc4b8)
#>  reticulate    1.18       2020-10-25 [1] CRAN (R 4.0.2)                   
#>  rlang         0.4.8.9000 2020-10-22 [1] Github (r-lib/rlang@7a36238)     
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 4.0.2)                   
#>  rstudioapi    0.11       2020-02-07 [2] CRAN (R 4.0.0)                   
#>  sessioninfo   1.1.1      2018-11-05 [2] CRAN (R 4.0.0)                   
#>  stringi       1.5.3      2020-09-09 [1] CRAN (R 4.0.2)                   
#>  stringr       1.4.0      2019-02-10 [2] CRAN (R 4.0.0)                   
#>  styler        1.3.2      2020-02-23 [2] CRAN (R 4.0.0)                   
#>  tibble        3.0.4      2020-10-12 [1] CRAN (R 4.0.2)                   
#>  tidyselect    1.1.0      2020-05-11 [2] CRAN (R 4.0.0)                   
#>  utf8          1.1.4      2018-05-24 [2] CRAN (R 4.0.0)                   
#>  vctrs         0.3.4      2020-08-29 [1] CRAN (R 4.0.2)                   
#>  withr         2.3.0      2020-09-22 [1] CRAN (R 4.0.2)                   
#>  xfun          0.18       2020-09-29 [1] CRAN (R 4.0.2)                   
#>  yaml          2.2.1      2020-02-01 [2] CRAN (R 4.0.0)                   
#> 
#> [1] /Users/johnmuschelli/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

@danielgjackson

@danielgjackson Is there a difference in cwa file format between (1) the cwa files used in UK Biobank and the demofile on the Axivity website and (2) cwa files extracted from commercially sold AX3 sensors?

There shouldn't be any difference that would explain an offset.

Very early firmware could not measure fractional timestamps, but it did indicate the sample on which the second rolled over. This allowed timestamps to be recreated to a precision of roughly the sample interval. Fractional timestamps were later added in a relatively complicated way so that they remained backwards-compatible with this method. Perhaps the code path for non-fractional times is somehow different in GGIR?
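As a rough illustration of the non-fractional scheme described above, the following R sketch rebuilds per-sample timestamps from a whole-second block timestamp plus the index of the sample on which that second began. The function name `reconstruct_times`, its arguments, and the assumption of a constant nominal sample rate within a block are all hypothetical; this is not the code path used by GGIR, OMGUI, or the device firmware.

```r
# Hypothetical helper, for illustration only (not taken from GGIR or OMGUI).
# block_time   : whole-second timestamp (seconds since epoch) stored with the block
# rollover_idx : 1-based index of the sample on which that second began
# n_samples    : number of samples in the block
# rate_hz      : nominal sample rate, assumed constant within the block
reconstruct_times = function(block_time, rollover_idx, n_samples, rate_hz) {
  # Samples before the rollover get negative offsets, samples after get positive ones
  offsets = (seq_len(n_samples) - rollover_idx) / rate_hz
  block_time + offsets
}

times = reconstruct_times(block_time = 1000, rollover_idx = 3,
                          n_samples = 5, rate_hz = 100)
print(times)
```

With this approach the reconstructed times are only as accurate as the nominal rate, which is why the precision is roughly one sample interval.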

The small offset in values only seems to appear in UK Biobank data and the Axivity demofile

If the above does not turn up anything useful, I think a "raw export" (cwa-convert) comparison is needed to really see what's going on.

Also: Biobank data is in the "packed" mode, so it's worth ensuring that any comparison is on the same basis.

@vincentvanhees
Member

vincentvanhees commented Dec 23, 2020

I have had another look at this and now believe that the problem is not in GGIR::g.cwaread but in the resampling algorithm as implemented in OMGUI, and probably also in the resampling step of biobankAccelerometerAnalysis.

Motivation:
To demonstrate this I used OMGUI to export the .cwa AX3 demofile from the Axivity website to two different .csv file types:

  • File1 - With resampling in g-units
  • File2 - Without resampling in g-units

When I plot the content of File1 and File2 on top of each other I get the exact same offset as I showed earlier in this thread. So, the offset does not originate from GGIR::g.cwaread but from OMGUI's own resampling step @danielgjackson. As biobankAccelerometerAnalysis had consistent output with OMGUI this may indicate that the same issue applies there too. @aidendoherty: To test this you will have to plot the resampled and original acceleration signal on top of each other as a function of timestamp.
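A minimal sketch of such a comparison in R, assuming both signals are available as timestamp and value vectors. The helper `signal_offset` is hypothetical, and the data below are synthetic stand-ins rather than the actual File1 and File2 exports. It aligns one signal onto the other's timestamps by linear interpolation and reports the mean difference.

```r
# Hypothetical helper: mean difference between two signals after aligning
# signal b onto the timestamps of signal a by linear interpolation.
signal_offset = function(t_a, x_a, t_b, x_b) {
  # approx() returns NA outside the range of t_b, so those points drop out
  x_b_on_a = approx(t_b, x_b, xout = t_a)$y
  mean(x_a - x_b_on_a, na.rm = TRUE)
}

# Synthetic stand-ins for the two exports (not real AX3 data)
t_orig = seq(0, 10, by = 0.0104)             # slightly off-nominal sampling
x_orig = sin(2 * pi * 0.5 * t_orig)          # "original" signal in g
t_res  = seq(0, 10, by = 0.01)               # regular 100 Hz grid
x_res  = sin(2 * pi * 0.5 * t_res) + 0.02    # "resampled" signal, shifted by 0.02 g

off = signal_offset(t_res, x_res, t_orig, x_orig)
print(off)
```

For the visual check, `plot(t_orig, x_orig, type = "l")` followed by `lines(t_res, x_res, col = "red")` overlays the two signals against their timestamps.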

GGIR's resampling code is here.

Unrelated extra checks on GGIR

Following Dan's advice I also looked closer at GGIR. For this I generated a third file (File3) with OMGUI: Without resampling in raw values (1-256). Next, I edited GGIR::g.cwaread to not scale the acceleration signal such that it exports only the raw unscaled values. Next, I plotted this signal on top of the content of File3, which looks identical. So, the raw value extraction seems to be working fine in GGIR::g.cwaread, even for these older cwa-recordings. Further, I see that GGIR::g.cwaread correctly identifies the scaling factor to be 4096.

@vincentvanhees
Member

vincentvanhees commented Jan 6, 2021

I am now closing this issue:

  • I have created a separate issue for the extreme values, "g.cwaread reading AX3 cwa data when sensor was plugged in laptop by participant" (#380), to help focus follow-up efforts on that.
  • Differences in the start of the recordings are expected in the order of seconds. What matters most is that patterns in data match when plotted against timestamps. There we see that data is in sync across the software tools (see plot earlier in the thread), which is good.
  • I think the offset in values between the software tools is explained now (see my previous post above). I will open a separate issue in openmovement to help make sure this receives attention.
