how to use filter_by_commit_interval #339
-
In filter.R I see the function filter_by_commit_interval. I should be able to use this in the dv8_showcase to look at just a portion of the git log by specifying start and end commits, right? Would I add it to the block at lines 67-76? e.g. project_git <- project_git %>%
-
Honestly, I would just manipulate the table directly. The table should have a column for dates: use the lubridate R package to convert the strings to dates, then subset accordingly. I don't recall whether the filter functions have all been normalized to work with the pipe operator. You could also search the Kaiaulu repo for usages of the function across the notebooks to find an example. Unfortunately, this function is years old and I no longer remember its usage, so it is best to treat this as a lack of clarity in the documentation instead.
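A minimal sketch of that date-based subset, for illustration only. The table `git_log` and its column names (`commit_hash`, `commit_date`) are made up here and are assumptions; the real parsed git log may use different column names and a different date string format, in which case the parsing call would need to change.

```r
library(lubridate)

# Toy stand-in for the parsed git log. Column names are assumptions,
# not necessarily the ones Kaiaulu produces.
git_log <- data.frame(
  commit_hash = c("a1", "b2", "c3", "d4"),
  commit_date = c("2021-01-05 10:00:00", "2021-02-10 12:30:00",
                  "2021-03-15 09:15:00", "2021-04-20 18:45:00"),
  stringsAsFactors = FALSE
)

# Convert the date strings to date-times so they can be compared.
git_log$commit_date <- ymd_hms(git_log$commit_date, tz = "UTC")

# Boundaries of the period of interest.
start_date <- ymd("2021-02-01", tz = "UTC")
end_date   <- ymd("2021-04-01", tz = "UTC")

# Keep only the rows whose date falls inside the period.
subset_log <- git_log[git_log$commit_date >= start_date &
                      git_log$commit_date <= end_date, ]
```

With this toy data, `subset_log` keeps the rows for commits `b2` and `c3`.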
-
How is this best achieved in R? Do I use a while loop and skip all rows before and after my target start/end hashes?
-
But this is not a subset in the sense we normally use when talking about a data table (i.e. select only those rows where column 3 has the value "Kansas"). We need to start the subset at a particular commit hash, and then whatever rows follow, up to the ending commit hash, are included in the subset. There is no natural ordering of commit hashes, so I cannot subset by taking all the rows that are numerically between the start and end hash. The ordering is simply whatever order they appear in the git log. That is why I needed a loop.
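For what it is worth, a positional subset like the one described can be sketched without an explicit loop, by looking up the row positions of the two hashes and slicing everything in between. The table and column names below (`git_log`, `commit_hash`, `file_pathname`) are assumptions for illustration, and the sketch relies on the table still being in git log order.

```r
# Toy stand-in for the parsed git log: one row per changed file,
# so the same commit hash can appear on several rows.
git_log <- data.frame(
  commit_hash   = c("f00d", "f00d", "beef", "cafe", "cafe", "dead", "0a0a"),
  file_pathname = c("a.R", "b.R", "a.R", "c.R", "d.R", "b.R", "e.R"),
  stringsAsFactors = FALSE
)

start_hash <- "beef"
end_hash   <- "dead"

# Row positions at which each boundary hash appears.
start_rows <- which(git_log$commit_hash == start_hash)
end_rows   <- which(git_log$commit_hash == end_hash)
stopifnot(length(start_rows) > 0, length(end_rows) > 0)

# Take the outermost positions so that every row of both boundary
# commits is kept, whichever of the two appears first in the table.
row_range  <- range(c(start_rows, end_rows))
subset_log <- git_log[row_range[1]:row_range[2], ]
```

Here `subset_log` holds rows 3 to 6 of the toy table, i.e. the commits `beef`, `cafe` (two rows), and `dead`.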
I'm not sure we have the same mental model here. If you are trying to filter_by_commit_hash, what happens under the hood is that the two commit hashes are used to obtain their respective timestamps. Then the time range between those two commits is subset using the date field (you can perform greater-than or less-than operations on dates).
The only alternative is to manually specify every commit hash associated with the time period, which I don't think you want.
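To make that mental model concrete, here is a rough sketch of the idea in base R. This is not the actual Kaiaulu implementation: `filter_between_commits` is a hypothetical helper, and the column names `commit_hash` and `commit_date` are assumptions.

```r
# Toy log, deliberately not sorted by date, to show that the subset
# depends on timestamps rather than on row position.
git_log <- data.frame(
  commit_hash = c("c3", "a1", "e5", "b2", "d4"),
  commit_date = as.POSIXct(
    c("2021-03-15 09:15:00", "2021-01-05 10:00:00", "2021-05-25 08:00:00",
      "2021-02-10 12:30:00", "2021-04-20 18:45:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the timestamps of the two boundary
# commits, then keep every row dated between them (inclusive).
filter_between_commits <- function(log, start_hash, end_hash) {
  stopifnot(start_hash %in% log$commit_hash,
            end_hash %in% log$commit_hash)
  start_time <- min(log$commit_date[log$commit_hash == start_hash])
  end_time   <- max(log$commit_date[log$commit_hash == end_hash])
  log[log$commit_date >= start_time & log$commit_date <= end_time, ]
}

result <- filter_between_commits(git_log, "b2", "d4")
```

With this toy data, `result` contains the rows for `c3`, `b2`, and `d4`, in their original table order.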
(*) Note even though you may do a git_checkout to get a source code folder to a particular point in time, it would be incorrect to subset the git log by just said commit hash. You would only get the files for that…