-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Treat as same site #68
base: future
Are you sure you want to change the base?
Conversation
Areas targetted are ones likely to change when new features are added. - Arrays: Make sure one entry per line, and always trail with a comma - CSS selectors: One selector per line, a no-op fake selector to ensure every real selector ends with a trailing comma Both of the above mean when doing merges, the only lines changed are ones added/removed - not the final one in the list just because a comma was added to extend the list. index.html: Fixed up some indenting to make editing in future clearer.
…ion of which parts of a domain should be used to identify a site vs it's subdomains or base domains. Update both options pages to include this, with instructions how to use it. Default to sane values which preserve previous functionality (hardcoded recognition of .co.* domains) Rewrite domain matching routine to use new configuration option using regexes. This patch fixes an issue where countries that use second-level domains (.net.nz, .co.nz, .org.nz, .gov.nz) were misidentified with the second level domain part being used as the site name. It has sane defaults, but allows configuration so that users can customise for the particulars of their country and other websites that use a two-level base domain system. This should be fully future proof now.
Note I don't have the boilerplate installed to run your tests (and have not modified or created any tests for this feature) I also run windows so have trouble building the icons and the scryptasm - to get by I just downloaded the latest release and ripped the files out. |
let significant_domain = domain_parts.slice(-significant_parts) | ||
return [domain_parts, significant_domain]; | ||
if (!url) | ||
throw "extractDomainFromUrl: No url provided!"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throwing here is not such a great Idea.. everything that follows extractDomainFromUrl
will be skipped, in particular loadSites and updateUIForDomainSettings.
So instead of having access to all your sites, it appears to have lost all data.
probably better to return some sane default
{ | ||
let protocolsIgnored = ["about", "resource", "moz-extension", "chrome-extension", "chrome", "edge", "brave"]; | ||
if (protocolsIgnored.includes(urlParsed.protocol)) | ||
throw "extractDomainFromUrl: Invalid url protocol provided"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially here. about:blank is a perfectly legal page to open to just quickly lookup a password.
}); | ||
|
||
if (domainMatched == null) | ||
throw 'extractDomainFromUrl: Could not match any sites! Check your configuration settings!'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing about throw
white-space: nowrap; | ||
} | ||
|
||
/* FIXME: I don't understand how the left alignment of inputs works here, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats alright :) makes me remember I want to fix this cleanly with flexbox (which was very new back in 2015)
/([^\.]+\.[^\.]+)$/i, // Default matcher matches two-part domain names (.* pattern) i.e. github.com | ||
]; | ||
|
||
config.treat_as_same_site.split('\n').forEach(function(pattern) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I'm often guilty of this myself, but hey, now I'm in review mode and trying to understand someone else's code ;)
Could it be better to make it a named function and do something like
let matchRegEx = config.treat_as_same_site.split('\n').map(translate_match_string_to_regex); rxDomainMatchers.push(...matchRegEx.map(pattern => new RegExp(pattern, 'i')))
I'm left to wondering if we're over-engineering this (ref my comments in masterpassword-chrome). The user interface for this is rather advanced (hence the need for a longish explanatory help text) Also it doesn't seem to work as intended in Firefox. I can't make a new line, and all the defaults end up on the same line. about over engineering: The remaining problems to solve are
|
@@ -37,6 +37,11 @@ | |||
input[type=checkbox] { | |||
width:1em; | |||
} | |||
#treat_as_same_site { | |||
height: 15em; | |||
white-space: nowrap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the nowrap is the problem.. seems like pre
would be the correct(?) choice
ref https://bugzilla.mozilla.org/show_bug.cgi?id=1137650
(porting comment by @ttyridal over from masterpassword-chrome PR)
|
I think the functionality of both of our solutions to managing 2LD's is actually pretty much the same - the implementation is just a bit different. We both try to expand the 'sane defaults' of what a 2LD is so that MP 'just works' for most people by listing out a bunch of 2LD's. There are two main differences in our implementations:
I do concede that figuring out how to operate this feature may be beyond the average user - which is why sane defaults are so important. However, on the second point; hopefully we can agree that a list of 2LD's isn't code but configuration data. Once you adjust thinking to be in line with that; it isn't a far stretch to imagine an 'advanced settings (here be dragons)' area or some such, not unlike about:config in the browser - that would list internal configuration settings that are really not meant to be changed by most users - but that could be for those that understand enough to realise changing them would fix an issue for them (i.e. me!). In that vein; an advanced config area need not concern itself with baby-proofing, and I think has some right to be a bit more verbose in it's descriptions and method of config entry. Once you get to this point thinking through what I am saying; you can see my solution basically does exactly this.
The only thing extra that need be done to complete the idea above is have a tickbox for 'show advanced options' or some such in the config UI, and hide the config by default unless the tickbox is ticked. What do you think? |
Sorry further to this - understand what you are saying as far as having resolved this issue by changing the way site matching works. Presumably you are doing the same thing I am doing in the domain matching method - attempting match against all sites then scoring based on parts matched - which would be why you no longer support multiple configuratons per site. If that's the case, then this patch only fixes up the initial domain identification for new unstored sites. If you want to turf it given that then fine. Up to you. |
Actually on properly taking stock of the fact that you have effectively solved the issue I was trying to solve already; I have to concede that this patch had two things going for it for me - pride in an elegant solution, and added/fixed functionality. The functionality has already been fixed which takes that out of the equation. The solution is no longer elegant but redundant given the former. Which leaves pride for what was an elegant but no longer elegant solution - which should not be a reason for accepting a PR! I can see that storing the 2LD list as a config option itself could cause issues for future updates - as it may get sucked into storage and then even if the defaults are updated in future to expand coverage, they will not be used. So that's another negative of this solution that would require engineering to fix. So the only other thing to discuss from my POV coming from masterpassword-chrome with my fix applied, is what are the drawbacks/functionality loss of shifting to the your new model.
|
You are writing out my thoughts :) Having slept on this, I really really dislike the user experience part, but your implementation is a I would be perfectly happy with your approach if we burry it in some –ignore-for-most-users- advanced It would be better to get the stored-url just right, because it does play a part for proposing the Regarding “multiple configurations per site”. This is still possible, ie having user1@site and user2@site. So yes, you can still have user1@ and user2@ selectable in the dropdown. Since they match on the url |
ok.. I stumbled across https://publicsuffix.org/ ... that's 230K bytes of config data to get this right 😬 |
LOL, we swapped positions. You came around to my thoughts and I came around
to yours.
Yeesh, cool that there is a public suffix list - but 230K, and is it even
up to date?
Funny, I bet the browser already has a copy of this local somewhere.
Personally I think I'd accept we can't win completely and go for a more
compact list like we have that covers as many core cases as possible.
I don't think we can win trying to manage much more than 2LD's without the
user's help giving hints for their specific country or frequented sites.
It's a pity we can't come at it from another angle and as a secondary check
peek at the sites cookies - as the domain field in the cookie is already
set up to define the cookie to traverse the entire site including
subdomains.
So where do we go from here?
I guess the other review points need to be tided up aside from the config
data and visibility issue.
Main one coming to mind is the error throwing vs empty url return.
|
https://wiki.mozilla.org/Public_Suffix_List/Uses#document.domain
|
https://issueexplorer.com/issue/w3c/webextensions/58 Sorry for spam. |
Tested this and it won't work. Similarly browsing an article on worldbuilding.stackexchange.com yields document.domain == worldbuilding.stackexchange.com StackExchange is a good test case; as it effectively uses one site with global logins to a bunch of subdomains that are basically just categorical. In this case; as I mentioned using cookies as a way to glean this info, sure enough their cookies are set to .stackexchange.com in all cases. I guess that means that same origin policy is no good for our use - as technically since document.domain differs for these subdomains, they should be being treated as separate security contexts which cannot share data without CORS set up. Tested in Firefox and behaviour is identical. I guess that article is just no longer accurate. |
I think you answered most of the questions yourself :) Yes, there are valid arguments against PSL. On the other hand, it is what browsers are using. And it would definitively cover more cases than a custom list you and I compile here can ever do. I think it is the best we can do today. I did a quick implementation of PSL in branch wip_getDomainRight. Got the list down to ~80k with very simple measures, and working on an idea that would reach ~60k. Compression Streams API would yield ~20k which I think is absolutely acceptable. Unfortunately it is very new and not supported in Firefox yet. Since I just yesterday figured out how to avoid a ~100k growth of scrypt-asm.js that's still a 40k win :) Possibly, one could offer user configurable overrides.. but then we're back on how to present that to the user.. One thing I'm definetively holding back on is going online / making requests somehere.. I think it is a (security) feature that the extension does not communicate with the outside world. |
It's nice to pick up the terminology that has been laid out, at least; Sound right? Beyond that in the case where different private subdomains actually use different logins or are different sites; it is then up to the user to manually correct this, and after that the scoring system will ensure it works fine unlike the old implementation.
For efficient lookups of the data, did you look at tldts? |
I think so. All your examples will be identified as (associated with the) Website.com.au (url) I did browse through a couple of libs. Maybe I misunderstood, but I got the impression they where quite large and/or requires some build system/module loader. I think convenient and small is more important than efficient. We're doing one lookup when the pop-up opens. As long as it's in the ms range, we're good. |
going a bit crazy here. Browsers don't support gzip/deflate data yet (waiting for the Compression Streams API) and other compression schemes where reasonable libs are available simply don't cut it on the compression rate. in the mean time, png is lossless and deflate compression - exactly what we need :) So this patch pre-process theh PSL list for easy lookup (and removes a lot of reduntant text) and export the result as a json dictionary. this is then converted to png by imagemagick. The browser loads the image, we access the pixel values and end up with our desired json dict. GH-68
So what happens when I browse staging.website.com.au, having previously amended the sitename to be staging.website.com.au? I guess I can test this myself. As for comments re libs/efficiency - yeah again that would be my thought also. We're not engineering a kernel here lol :) What timezone are you in, by the way? I'm in NZ (+13) |
Hmm I just tested this as if codegolf.stackexchange.com and worldbuilding.stackexchange.com were independent websites. This puts me back in the starting position that this PR was meant to fix - when you come across sites that use private subdomains that are actually different sites for purposes of login. For a real world example of this; take for example wordpress.com - which you can host your own website on at yourwebsitename.wordpress.com - each of which is it's own website with it's own logins etc. I guess a workaround for this would be if you allow sitename to be editable in the config screen - perhaps only if double-clicked. Also, there appears to be autofill enabled on the sitename box, which wants to suggest all prevously entered sitenames. |
Norway (+1) so yeah :-D
yes, that's by design. sort by score the ones that (partially) match the url, then a separator and finally every other site. Do you think it was better before? or is it the autocomplete thing that's giving you pain? I'm not sure I'm 100% satisfied with how it works myself to be honest. Maybe it's better to just filter the list and not doing suggestions/type-ahead |
You're right.. that's a bug in the match-url-to-sitename thing.. When I added that it was primarily to catch this case: regarding wordpress.. I'm surprised they are not in the public_suffix_list. maybe worth sending in a pr there= (hmm.. if the browsers use psl to limit cross domain cookies.... 😈)
You already can. what you are editing in the config page is the url to match the record/sitename/siteconfig to. So in the two stackexchange cases, you will get the correct default after editing. but it would still show the other one as "related" (which clearly could be considered a bug: {sitename: worldbuilding.stackexchange.com, url: worldbuilding.stackexchange.com} should definitively score low when visiting codegolf.stackexchange.com, even if it's still stachexchange. |
clearly some missing tests in this area still 🙄 and maybe some contradicting goals.. I still want my apple.com/icloud.com match :) |
Sure. That's a good feature that I also use. Same goes for gmail.com and google.com Perhaps I should do a PR for some test cases that cover issues I have. They can then be turned into issues to work on individually. |
The sitename field on the config page is readonly for me. It's just a td, not an input. Ok so two issues here:
I guess this PR is starting to branch off a bit. I should submit these things as individual issues/features.
Not so worried about that. Bug or undocumented feature? :) |
Oh ok it's the mpcombobox I'm seeing. I thought it was the browser doing it's own autofill thing. I think the mpcombo is fine. I now also understand what you meant about requiring the promise chain to continue even with unmatched domains (url='') so that you can select from your sites. Yes this is a great feature, means I don't have to keep browsing to a site just to get it's password to use in another app (like battle.net, etc) I think the idea of 'free lookup' mode is a good one - but I feel like it confuses things trying to make the same UI have two functions - one being creation of entries for the current site (or recall if already in place), and another being free-lookup of any site's password. There is some simplicity of cognitive load that one knows that when they click the MP icon, what they get is directly related to the site they are viewing. Making it able to look up unrelated sites kind of breaks an unwritten rule in my head that instantly adds cognitive overhead - if you get what I mean. I have shown family how to use your plugin and it took some work to get them to understand that it was not a general password lookup app that they typed the sitename into manually each time to get a password - but was meant to be automatic based on the site you are on. (if they do do what I said, they end up with all these random unrelated sites attached to whatever domain they happened to be on when they used the plugin this way) So adding a mode where what they are supposed to do to look up a random unrelated site is do it that way kind of makes it even harder for a novice to 'get it' Perhaps a better UI idiom for this would be a dedicated 'free lookup' mode to disassociate the two modes of UI operation. Of course; I am just pontificating at this point. You know what they say about opinions. They are like aholes - everyone has one! The reality is what you have done works fine aside from the teething bugs. Maybe I am just expecting it to operate like the old version - if I give it some time maybe the new idiom is better. I dunno. I'm waffling at this point. |
bug, no bug. hard to say. It will work it hitting enter or tab. cancel the edit on Escape. now, if you click outside, is it ack or cancel? in most situations on the web, it's ack. so -> bug. Don't remember why I wanted to disallow changing the sitename from config. It was problematic somehow with the old storage format, but I can't see why it couldn't be allowed now.
An alternative is to show all sitenames only on about:blank etc pages.. and some "see all" button in the other case?
The button is there. if you have more sitenames for the same url (to give a clue about just that)
That's mostly how I use it and why I implemented the default login-name like I did. :) When trying it, I liked yours better, as it was somewhat "less intrusive".. It also fails when one have to increment the counter. Not sure I understood how you've explained your family to use it. I tried to keep the default behaviour the same as before (but failed?) |
That seems more clear at a UX level to me.
For the less technically inclined (read: luddites), the idiom they are familiar with is an 'app'.
The result of this is that 'email', or 'facebook' is assocated with whatever random domain they are on, and it is unlikely to even recall automatically in the future. So educating them away from this behaviour can be a bit of a task. It takes them energy to get their head around the way the popup associates to the website they are on. It can be done but it takes some gymnastics. So given that; all of which applies to the old masterpassword-chrome; now imagine trying to teach the users the same thing, except it has been made even easier for them to use the plugin the 'wrong' way by providiing a lookup mechanism for sites that don't relate to the current domain. This can lead to them not understanding why sites they want to select are not in their stored sites - the reason being that they never actually fill in their details for that site on the actual domain because they just don't get that it works that way. |
I meant to reply to this at the time but forgot. |
How is it different to blogspot or various dynamic dns sites (eg dyndns) - that are on the list? I see.. so I'm about to make the extension too limit-less and non-tech-user-friendly 😮 can't have any of that 🤣🤣 Interesting part is that it is kind of the use-case I'm after. Quick lookup for using outside the browser (some native-app appstore thing etc). It would have been great if it could do both. Be a "generic" password lookup app - with context sensitiveness, so the first option is the one you most likely want. But you are also correct about the significant drawback: stored sites gets polluted. I have tried to squeeze an idea out of my head for a heuristic to decide if we're changing the sitename purposefully, or just doing a lookup (and thus should not save the url). without luck so far. The only alternative I can come up with is a save button.. And that's so 1999 :) It could work if you mostly use the context feature: eg when visiting gmail.com and changing it to google.com, ask: "do you often use your google.com account on gmail.com?" or something.. ugh.. definitively not as a popup... Using about:blank as a mode switch for this lookup-mode is perhaps the best choice. And it will not add url to the stored site. I actually have about:blank as my default page so it would work pretty great. I guess many (most?) have google.com, their email, facebook (shrug) or some corporate homepage though. Leads me to another, kind of related, issue. To save or not to save the default suggested site (so an export will contain all sites you've used the extension on). I know several users make and revert a change, just to get it saved. Part of the reason why I liked your default login name solution better... It forces an interaction and gives us a hint that this site is worth saving. |
Areas targetted are ones likely to change when new features are added. - Arrays: Make sure one entry per line, and always trail with a comma - CSS selectors: One selector per line, a no-op fake selector to ensure every real selector ends with a trailing comma Both of the above mean when doing merges, the only lines changed are ones added/removed - not the final one in the list just because a comma was added to extend the list. index.html: Fixed up some indenting to make editing in future clearer. (cherry picked from merge request GH-68 )
Fair point.
Hah. Hope you don't take any of my comments the wrong way. It's your baby! Only sharing my experience with a couple of users.
Yep as I replied above.
Yeah tough.
Yeah I was just writing that above then read the next paragraph. HAH.
By definition the only URL's available in this mode would be stored, right - unless a user manually typed one in.
Yeah I'd lean toward no (not saving default generated sites) - but this multimode operation creates some use cases that create extra challenges. The only ideas I can come up with off the top of my head here are:
But again here we are starting to flesh out engineering an entire other system and the scope of this discussion thread trails off :) Not that I find that a problem at all - happy to bounce ideas around - just I am a fan of biting off things in little chunks and refining. This one; design of the free-lookup mode, and the use cases/systems which support it, I think is worthy of it's own feature/issue and may generate multiple PR's (or internal commits) to interate over? |
Add new sync'd option 'treat_as_same_site', which allows configuration of which parts of a domain should be used to identify a site vs it's subdomains or base domains.
Update both options pages to include this, with instructions how to use it. Default to sane values which preserve previous functionality (hardcoded recognition of .co.* domains)
Rewrite domain matching routine to use new configuration option using regexes.
This patch fixes an issue where countries that use second-level domains (.net.nz, .co.nz, .org.nz, .gov.nz) were misidentified with the second level domain part being used as the site name.
It has sane defaults, but allows configuration so that users can customise for the particulars of their country and other websites that use a two-level base domain system.
This should be fully future proof now.
This is an adaption of my feature of the same name from masterpassword-chrome