Currently tsumugu follows a simple algorithm to determine whether a path should be completely excluded, partially excluded, or included:
- When parsing a regex, a `rev_inner` regex is generated by replacing variables (`${UBUNTU_LTS}`, etc.) with `(?<distro_ver>.+)` (that is, match everything). The `rev_inner` regex is used like this:

      pub fn is_others_match(&self, text: &str) -> bool {
          !self.inner.is_match(text) && self.rev_inner.is_match(text)
      }

- First, users' exclusions and inclusions are preprocessed. Every exclusion that is a prefix of any inclusion is put into `list_only_regexes`; all other exclusions are put into `instant_stop_regexes`. All inclusions go into `include_regexes`.
- While worker threads are handling listing requests:
  - The path is checked against `instant_stop_regexes` and `include_regexes`:

        for regex in &self.instant_stop_regexes {
            if regex.is_match(text) {
                return Comparison::Stop;
            }
        }
        for regex in &self.include_regexes {
            if regex.is_match(text) {
                return Comparison::Ok;
            }
        }

  - Then, the path is checked against the `rev_inner` regex by `is_others_match()`, and is also completely excluded if it matches (a fast shortcut). This is used for cases like Fedora: it has many versions (currently from 1 to 40), and listing version folders other than `${FEDORA_CURRENT}` is a waste of time and network traffic. With this trick, these unmatched versions can be skipped.
  - Finally, if the path matches `list_only_regexes`, files under this directory are ignored (unless they are matched by `include_regexes`), but subdirectories are still listed. Paths that are not matched by any regex are included as usual.
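The variable replacement that produces the `rev_inner` pattern can be illustrated with plain string handling. This is only a sketch: `rev_inner_pattern` is a hypothetical helper, the pattern `^fedora/${FEDORA_CURRENT}/` is a made-up input, and tsumugu's real parser is not shown in this document and may work differently.

```rust
/// Replace every `${NAME}` placeholder in a user pattern with a catch-all
/// named group, giving the pattern text for a `rev_inner`-style regex.
/// Hypothetical helper for illustration only.
fn rev_inner_pattern(user_pattern: &str) -> String {
    let mut out = String::new();
    let mut rest = user_pattern;
    while let Some(start) = rest.find("${") {
        // Look for the closing brace that ends this placeholder.
        match rest[start..].find('}') {
            Some(len) => {
                out.push_str(&rest[..start]);
                out.push_str("(?<distro_ver>.+)");
                rest = &rest[start + len + 1..];
            }
            // Unterminated placeholder: leave the remainder untouched.
            None => break,
        }
    }
    out.push_str(rest);
    out
}
```

Under these assumptions, `rev_inner_pattern("^fedora/${FEDORA_CURRENT}/")` yields `^fedora/(?<distro_ver>.+)/`, which matches every version directory, while the original pattern matches only the current ones. A path that matches the former but not the latter is what `is_others_match()` detects.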
In this process, some unnecessary paths will still be listed. However, this logic suits the needs of filtering OS versions well.
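The preprocessing and the order of checks described above can be sketched as follows. To stay dependency-free, the sketch replaces regexes with plain path prefixes, assumes the "prefix of any inclusion" test is a string-prefix comparison, and leaves out the `rev_inner` shortcut. The `Filters` type, its field names, and the `ListOnly` variant are hypothetical and not taken from tsumugu's source.

```rust
#[derive(Debug, PartialEq)]
enum Comparison {
    Stop,     // skip the path entirely
    Ok,       // include the path
    ListOnly, // ignore files here, but keep listing subdirectories
}

/// Simplified filter set: plain prefixes stand in for anchored regexes.
struct Filters {
    instant_stop: Vec<String>,
    list_only: Vec<String>,
    include: Vec<String>,
}

impl Filters {
    /// An exclusion that is a prefix of some inclusion becomes "list only";
    /// every other exclusion stops listing immediately.
    fn new(exclusions: &[&str], inclusions: &[&str]) -> Self {
        let mut instant_stop = Vec::new();
        let mut list_only = Vec::new();
        for &ex in exclusions {
            if inclusions.iter().any(|inc| inc.starts_with(ex)) {
                list_only.push(ex.to_string());
            } else {
                instant_stop.push(ex.to_string());
            }
        }
        let include = inclusions.iter().map(|s| s.to_string()).collect();
        Filters { instant_stop, list_only, include }
    }

    /// Check order: instant stop, then include, then list only.
    /// Anything unmatched is included as usual.
    fn classify(&self, path: &str) -> Comparison {
        if self.instant_stop.iter().any(|p| path.starts_with(p.as_str())) {
            return Comparison::Stop;
        }
        if self.include.iter().any(|p| path.starts_with(p.as_str())) {
            return Comparison::Ok;
        }
        if self.list_only.iter().any(|p| path.starts_with(p.as_str())) {
            return Comparison::ListOnly;
        }
        Comparison::Ok
    }
}
```

For example, with exclusions `/fedora/` and `/debian/` and the inclusion `/fedora/39/`, the sketch puts `/fedora/` into the list-only set (it is a prefix of the inclusion) and `/debian/` into the instant-stop set. `/fedora/38/` is then still listed but its files are ignored, which is the kind of unnecessary listing mentioned above.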
Also note that the following function is currently used to generate the relative path for comparison:

    pub fn relative_to_str(relative: &[String], filename: Option<&str>) -> String {
        let mut r = relative.join("/");
        // Make sure the path starts with exactly one '/'.
        if r.starts_with('/') {
            warn!("unexpected / at the beginning of relative ({r})");
        } else {
            r.insert(0, '/');
        }
        // Unless r is just "/", make sure it ends with '/'.
        if r.len() != 1 {
            if r.ends_with('/') {
                warn!("unexpected / at the end of relative ({r})")
            } else {
                r.push('/')
            }
        }
        // here r already has / at the end
        match filename {
            None => r,
            Some(filename) => {
                assert!(!filename.starts_with('/') && !filename.ends_with('/'));
                format!("{}{}", r, filename)
            }
        }
    }

As a result:
- All relative paths for comparison start with "/".
- Directory paths end with "/", and file paths don't.
Examples:
- `http://example.com/file` => `/file`
- `http://example.com/dir` => `/dir/`
- `http://example.com/dir/file` => `/dir/file`
Note that, for compatibility reasons, the following trick is applied: a user regex that starts with `^` but not with `^/` has its leading `^` replaced with `^/` (this might break some very rare regexes).
So you can write `/something$` to exclude ALL files and directories named `something`, instead of using two regexes (`^something$` and `/something$`, to match `something` at the root and elsewhere, respectively).
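The `^` to `^/` rewrite can be pictured as a small string transformation. This is a sketch only: `anchor_to_root` is a hypothetical name, and the actual implementation in tsumugu may differ in its details.

```rust
/// Rewrite a leading `^` to `^/` unless the pattern already starts with `^/`.
/// Patterns without a leading `^` are returned unchanged.
/// Hypothetical helper for illustration only.
fn anchor_to_root(user_regex: &str) -> String {
    match user_regex.strip_prefix('^') {
        Some(rest) if !rest.starts_with('/') => format!("^/{rest}"),
        _ => user_regex.to_string(),
    }
}
```

With this sketch, `^something$` becomes `^/something$`, while `^/something$` and `/something$` are left as they are, so regexes written before comparison paths gained their leading "/" keep matching at the root.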
Also, the upstream path itself is NOT included when comparing. So if your upstream is set to `https://some.example.com/dir/`, you need to exclude `^something/` (not `^dir/something/`) to exclude `https://some.example.com/dir/something/`.
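To make the upstream rule concrete, the sketch below derives a comparison path from a full URL by stripping the upstream prefix. `comparison_path` is a hypothetical helper written for this illustration; tsumugu itself builds the path from a list of segments (see `relative_to_str` above), not from URL strings.

```rust
/// Return the path used for comparison: the part of `url` after the
/// upstream, always starting with '/'. Returns None when `url` is not
/// under `upstream`. Hypothetical helper for illustration only.
fn comparison_path(upstream: &str, url: &str) -> Option<String> {
    // Drop trailing slashes so the remainder keeps its leading '/'.
    let base = upstream.trim_end_matches('/');
    let rest = url.strip_prefix(base)?;
    if rest.is_empty() {
        return Some("/".to_string());
    }
    // Reject look-alike prefixes such as ".../dirx/" for upstream ".../dir/".
    rest.starts_with('/').then(|| rest.to_string())
}
```

Under these assumptions, the upstream `https://some.example.com/dir/` and the URL `https://some.example.com/dir/something/` give `/something/`. The user regex `^something/` (rewritten to `^/something/`) matches that path, whereas `^dir/something/` does not.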
If in doubt, test with `tsumugu list`.