% TEMPLATE for Usenix papers, specifically to meet requirements of
% USENIX '05
% originally a template for producing IEEE-format articles using LaTeX.
% written by Matthew Ward, CS Department, Worcester Polytechnic Institute.
% adapted by David Beazley for his excellent SWIG paper in Proceedings,
% Tcl 96
% turned into a smartass generic template by De Clarke, with thanks to
% both the above pioneers
% use at your own risk. Complaints to /dev/null.
% make it two column with no page numbering, default is 10 point
% Munged by Fred Douglis <[email protected]> 10/97 to separate
% the .sty file from the LaTeX source template, so that people can
% more easily include the .sty file into an existing document. Also
% changed to more closely follow the style guidelines as represented
% by the Word sample file.
% Note that since 2010, USENIX does not require endnotes. If you want
% foot of page notes, don't include the endnotes package in the
% usepackage command, below.
% This version uses the latex2e styles, not the very ancient 2.09 stuff.
\documentclass[letterpaper,twocolumn,10pt]{article}
\usepackage{usenix,epsfig,endnotes}
\usepackage{url}
\let\footnote=\endnote
\begin{document}
%don't want date printed
\date{}
%make title bold and 14 pt font (Latex default is non-bold, 16 pt)
\title{\Large \bf Mouse and Keyboard Tracking Detection}
%for single author (just remove % characters)
\author{
{\rm Nathan G. Feldman}\\%in case you're wondering, there's already a few other Nathan Feldmans on Google Scholar, so I need to distinguish myself in scholarly work
University of California, San Diego
\and
{\rm Scott Finney}\\
University of California, San Diego
% copy the following lines to add more authors
% \and
% {\rm Name}\\
%Name Institution
} % end author
\maketitle
% Use the following at camera-ready time to suppress page numbers.
% Comment it out when you first submit the paper for review.
\thispagestyle{empty}
%1. What closely related work, if any, did you find and read?
%2. What things did you try on the project since you proposed it?
%3. What findings did #2 lead to?
%4. How did the related work you read and your findings change your proposal, if at all?
%5. What is your plan for the remainder of the quarter?
%6. What, if anything, do you need from me and Keaton to help you succeed
\subsection*{Abstract}
We present the design and implementation of a technique for detecting when websites store data about mouse movement and keyboard presses and send it out over the internet, either back to the website's own server or out to the server of a third-party service. We have built a web crawler to visit as many of the top Alexa websites as possible and use this technique on each of them to find out which ones do mouse and keyboard tracking.
We analyze our results to determine as precisely as possible what information each site sends, to whom, and for what purpose. In doing so, we aim to discover the extent to which typical web users' expectations of privacy on the web are being violated.
\section{Introduction}
The core of this project is to design and implement a technique for detecting mouse and keyboard tracking on the web. The technique should allow us to find out when a website's client-side script is sending information about the user's mouse movement and keyboard presses out over the internet. Mouse movement tracking, while not a particularly severe security concern, may violate the typical user's expectation of privacy on the web. Keyboard press detection raises the same concern, but may also reveal sensitive information about the user, especially her password, to malicious sites she visited by accident or by coercion. (Of course, any website where the user has to log in can see the user's password for that site, which is most likely used by the user for other sites as well. We will discuss in more detail in our final report why keyboard detection gives the adversary more of an ability to steal a user's password or other sensitive information.)
We performed a web crawl applying the technique to the top 10,000 Alexa websites. This allows us to determine which of the most ``popular'' websites are tracking mouse movement and keyboard presses. Additionally, because we collect the network traffic sent by the websites' client-side scripts, our data indicates to whom the mouse and keyboard information is being sent.
After gathering the data from the web crawl, we performed an analysis to try to find out how this mouse and keyboard information is being used. Our hypothesis was that this kind of tracking is being done quite widely, but mostly just for UX purposes. Web developers and UX designers have reason to learn exactly how users are using their site so that they may continuously rework the website design to make it easier to interact with. We would like to find out how many websites are using this information for illegitimate purposes. We gain some insight into the web developers' reasons for mouse and keyboard tracking by looking at the addresses to which the information is being sent. It is reasonable for the information to be sent back to the website's own server for usability testing purposes. Additionally, we are aware of some third-party mouse and keyboard tracking services that are commonly used by web developers for usability testing, such as ClickTale~\cite{clicktale}, Crazy Egg~\cite{crazyegg}, Inspectlet~\cite{inspectlet}, and Mouseflow~\cite{mouseflow}. %should those be endnotes instead of citations?
%Talk about what privacy violation means here?
\section{Related Work}
Our goal is similar in structure to fingerprinting detection. We wish to load a number of sites and somehow analyze their scripts to decide if they are tracking certain information from the browser.
Acar et al.~\cite{fpdetective} built a framework called FPDetective for crawling the web for instances of web-based fingerprinting. They made slight modifications to the JavaScript engines inside the browsers used for crawling in order to log certain JavaScript events, and they use heuristics to decide whether a particular script is fingerprinting the browser. Primarily, they suggest that a script that attempts to load a large number of fonts is probably fingerprinting the browser, since a website is unlikely to have any other reason for loading that many fonts. They used PhantomJS~\cite{phantomjs}, a ``headless'' web browser, to load each of the top million Alexa sites and determine from the behavior of the loaded scripts whether they were performing any fingerprinting. They found that 404 of the top million sites were doing fingerprinting in JavaScript, and an extra 95 sites of the top 10,000 were doing Flash-based fingerprinting (found through separate methods).
Mayer and Mitchell~\cite{fourthparty} rely on a framework called FourthParty, a Mozilla Firefox extension that also does JavaScript event logging for web measurement, in order to inform a discussion on web tracking policy. The framework is more general-purpose but less powerful than FPDetective: it is meant to be used for any type of measurement and analysis done over the web, but it is only a browser extension and not a browser modification.
Jang et al.~\cite{tainttracking} built a much more powerful tool to detect several forms of malice in JavaScript, such as cookie stealing, history sniffing, and mouse and keyboard behavior tracking, which was run over the top 50,000 Alexa sites. The tool is a modified version of Google Chrome instrumented with an ``information flow engine,'' which is capable of tracking how accessed data values are passed through the script. For instance, it is capable of ``tainting'' the $x$- and $y$-coordinate values taken from a mouse move event and detecting if those values are ever sent to a method that sends an HTTP request out over the network. Mouse movement and key presses were not their focus, however. In fact, the only events whose values they discuss in their analysis are clicks, mouseovers, and clipboard copies. While they did find evidence of mouse event values being passed to network functions, they do not discuss how basic mouse movement or keyboard press information is being used, or if it is being used at all.
While building upon the taint-tracking system of Jang et al.~\cite{tainttracking} would have been an excellent idea for our experiments, the code was not in a working condition, and fixing it and building upon it would have taken more work than we could afford in one school quarter. We could have built on top of FPDetective or FourthParty as well, but both were built on versions of browsers that are now outdated, and there would have been a lot of overhead in getting them set up and getting acquainted with their codebases.
Instead, we felt our technique (described in the next section) was simple enough that building our own crawler directly on PhantomJS would be the quickest and easiest way to produce results by the end of the quarter. This also guarantees us the ability to collect exactly the data that we need and use the external tools we want, whereas the frameworks mentioned above might have confined us to specific API calls. Finally, by using our own single-purpose script, our web crawl and analysis should run faster, without the overhead of all the other measurements taken in the frameworks above.
\section{Technique}
We wrote a PhantomJS script that visits a target webpage, triggers mouse and keyboard events, and records relevant network traffic. The script was run over the top 10,000 Alexa websites for detecting mouse tracking, and the top 5,000 Alexa sites for detecting keyboard tracking. The metadata from network requests and responses (JavaScript XMLHttpRequests) is stored in a MongoDB database~\cite{mongodb}, to be processed by separate analysis scripts after the data collection is complete. This network traffic analysis is meant to detect which sites are performing covert mouse and keyboard tracking.
Essentially, our heuristic is to look for correlation between mouse and keyboard events and network traffic. We first filter out the first several hundred milliseconds of HTTP requests and responses, since this traffic is only the initial page and resource load, and we do not trigger any mouse or keyboard events during this time. After the initial loads are done, we hypothesize that only two types of network traffic remain. The first type is periodic background traffic. This includes messages to let the server know the user is still online, and requests for updates to the page content. Because the browser does not allow the server to send messages to the client without a request from the browser, content cannot be directly ``pushed'' to the client once it is created; it must be explicitly requested by the client. As an example, a social networking site might have the browser send HTTP requests to the server every five seconds to ask for new posts to the user's news feed, new messages and notifications, and just to let the server know that the user is still online for chat. This all happens because the server is not capable of sending messages to the browser once new posts are created or messages or notifications are received. We suspect that almost all HTTP requests of this type are sent at fixed intervals and with roughly constant size.
The second type of traffic is that triggered by user interaction. Because all user interaction is controlled by our scripts, any variation in the frequency and size of HTTP requests sent by the browser after the initial loads should be a result of our triggered mouse and keyboard events. Therefore, we should be able to detect a simple pattern of network traffic over the first few seconds after the initial loads. After this, we should only see a disruption in the pattern as a result of mouse and keyboard events. Moreover, we think that patterns in sequences of mouse and keyboard events will result in detectable patterns of HTTP requests if the site is performing some type of mouse or keyboard tracking.
We believe that there is no reason for mouse and keyboard events to trigger network traffic in this fashion except for covert tracking, unless the page changes as a result of the network traffic. For instance, as mentioned in~\cite{tainttracking}, moving the mouse to certain locations could cause an image to load, and so this would not be covert. In order for this to be the case, the server needs to send something back to the browser to be loaded into the page. Thus, we will filter out sites where most HTTP requests of the second type result in large HTTP responses from the server. We deduce that all remaining sites with the second type of traffic are doing covert tracking. In fact, we do not expect to find many of these false positives, where HTTP responses resulting from mouse and keyboard events contain real substance, because we believe that most traffic of this type will come from mouseover events, which we are not generating.
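
The response-size filter described above can be sketched as follows. This is only an illustration in Python, not our analysis code. The text does not fix what counts as a ``large'' response, so the cutoff {\tt large\_bytes} and the function name {\tt mostly\_large\_responses} are hypothetical, and ``most'' is read here as ``more than half.''

```python
def mostly_large_responses(response_sizes, large_bytes=1024):
    """Return True if more than half of the responses are 'large'.

    response_sizes: sizes (in bytes) of the responses to requests that
    followed triggered events on one site.
    large_bytes: hypothetical cutoff; the text does not specify a value.
    """
    if not response_sizes:
        return False  # no event-driven traffic, so nothing to filter out
    large = sum(1 for size in response_sizes if size >= large_bytes)
    return large > len(response_sizes) / 2


# Example: a site whose event-driven requests mostly return real content
# would be excluded from the set of suspected covert trackers.
excluded = mostly_large_responses([2000, 3000, 10])
```

A site for which this returns true would be treated as loading visible content rather than tracking covertly.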
\section{Implementation}
JavaScript's concurrency model is based on an event loop. Two instances of functions may not be run at the same time, but once the main script has run, events are triggered that cause their handler functions to run. These events are essentially queued up as they occur. When an event is processed, the corresponding handler functions are run to completion, and then the next event is processed if there is one waiting.
Our PhantomJS script takes a queue of $N$ sites and processes a user-specified number $C$ of them concurrently. It keeps an array of $C$ pages, loads them, and whenever one page has finished being processed, loads the next site waiting in the queue into that array slot. This continues until all $N$ sites have been processed. The PhantomJS script needs to be run in intervals by a separate shell script; otherwise, memory leaks in PhantomJS cause it to hang or crash.
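
The slot-reuse scheme can be illustrated with a small simulation. The sketch below is written in Python for readability and is not our PhantomJS script: {\tt simulate\_crawl} and {\tt duration\_of} are hypothetical names, and page processing is replaced by a made-up duration per site so that the order in which slots free up is easy to follow.

```python
import heapq
from collections import deque


def simulate_crawl(sites, concurrency, duration_of):
    """Simulate processing `sites` with at most `concurrency` pages open.

    duration_of(site) is a hypothetical stand-in for how long a page takes.
    Returns (site, slot, start, finish) tuples in completion order.
    """
    queue = deque(sites)            # the N sites still waiting
    slots = [None] * concurrency    # the array of C page slots
    running = []                    # min-heap of (finish_time, slot)
    finished = []

    def load_next(slot, now):
        site = queue.popleft()
        slots[slot] = (site, now)
        heapq.heappush(running, (now + duration_of(site), slot))

    # Fill every slot up front (fewer if there are fewer sites than slots).
    for slot in range(min(concurrency, len(queue))):
        load_next(slot, 0.0)

    # Whenever a page finishes, reuse its slot for the next queued site.
    while running:
        now, slot = heapq.heappop(running)
        site, start = slots[slot]
        finished.append((site, slot, start, now))
        slots[slot] = None
        if queue:
            load_next(slot, now)
    return finished


# Five sites, two slots; site "a" is slow, so slot 1 handles b, c, and d.
result = simulate_crawl(["a", "b", "c", "d", "e"], 2,
                        lambda s: 3 if s == "a" else 1)
```

With these made-up durations the sites complete in the order b, c, a, d, e, which shows that a slow page occupies only its own slot while the other slot keeps draining the queue.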
For each site, the PhantomJS script does the following. Once PhantomJS has determined that the static content has finished loading, the script waits for one second. After that, the first wave of mouse movements begins, which lasts 5 seconds. Then there is a pause in mouse movements for 5 seconds, and finally a second wave of mouse movements is triggered, lasting 4 seconds\endnote{This was meant to be 5 seconds as well, but we made an arithmetic error trying to divide the time spent on the page into thirds.}. During each wave, a function is triggered every 10 milliseconds that triggers 20 mouse movements. The locations to which the mouse is moved alternate between two parts of the screen but are chosen randomly within a certain range.
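
The resulting timeline and the nominal number of triggered movements follow directly from these figures, as the Python sketch below works out. The constant and function names are ours for illustration only, and the counts are nominal: they assume the function really is called exactly every 10 milliseconds, which a busy event loop may not achieve.

```python
# Durations from the text, in milliseconds.
WAIT_MS, WAVE1_MS, PAUSE_MS, WAVE2_MS = 1000, 5000, 5000, 4000
TICK_MS = 10          # interval between calls during a wave
MOVES_PER_TICK = 20   # mouse movements triggered per call


def build_schedule(load_done_ms):
    """Return (name, start, end) windows relative to the load-finished time."""
    windows = []
    start = load_done_ms
    for name, length in (("settle", WAIT_MS), ("wave1", WAVE1_MS),
                         ("pause", PAUSE_MS), ("wave2", WAVE2_MS)):
        windows.append((name, start, start + length))
        start += length
    return windows


def nominal_moves(length_ms):
    """Movements triggered in a wave if every 10 ms call fires on time."""
    return (length_ms // TICK_MS) * MOVES_PER_TICK


schedule = build_schedule(0)
first_wave_moves = nominal_moves(WAVE1_MS)    # 500 calls * 20 movements
second_wave_moves = nominal_moves(WAVE2_MS)   # 400 calls * 20 movements
```

Under these assumptions each page is observed for 15 seconds after the static content loads, with a nominal 10,000 movements in the first wave and 8,000 in the second.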
The script was run again, triggering keyboard presses instead of mouse movements. This worked exactly the same, with the waves of keyboard presses timed the same as waves of mouse movements, except that keys are chosen uniformly at random from the set of capital letters.
Handler functions are registered for two kinds of events: the JavaScript running on a site sending an XMLHttpRequest (or ``XHR'') to a server, and a response being received from the server. Each response corresponds to a request, and corresponding requests and responses share a unique {\tt id} field in the metadata collected by PhantomJS. At each of these events, the handler logs the metadata of the HTTP message provided by PhantomJS, along with an identifier for the site being processed, the time (in milliseconds since the epoch) at which the handler was called, and a string that indicates whether the message was a request or a response. This data is temporarily stored in a text file, which is picked up by a separate script to be stored in the database\endnote{Trying to access the database straight from our PhantomJS script proved difficult, given the amount of data that was being produced so quickly and the system call overhead, on top of the large amounts of memory being used.}. We also stored the times at which each of the waves of mouse movements/keyboard presses started and ended.
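
To make the role of the shared {\tt id} field concrete, the following Python sketch pairs logged requests with their responses. It is an illustration under assumptions, not our import script: the text file is assumed to hold one JSON object per line, and the field names {\tt site}, {\tt kind}, and {\tt time} are hypothetical. Records are keyed by site as well as {\tt id}, on the assumption that an {\tt id} is only unique within one page.

```python
import json


def pair_messages(log_lines):
    """Group logged messages as {(site, id): {"request": rec, "response": rec}}."""
    pairs = {}
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the log
        rec = json.loads(line)
        key = (rec["site"], rec["id"])
        # Keep the first record seen of each kind for this (site, id).
        pairs.setdefault(key, {}).setdefault(rec["kind"], rec)
    return pairs


def unanswered(pairs):
    """Return the sorted (site, id) keys that have no logged response."""
    return sorted(key for key, msgs in pairs.items() if "response" not in msgs)


log = [
    '{"site": "example.com", "id": 1, "kind": "request", "time": 1000}',
    '{"site": "example.com", "id": 1, "kind": "response", "time": 1040}',
    '{"site": "example.com", "id": 2, "kind": "request", "time": 1100}',
]
pairs = pair_messages(log)
missing = unanswered(pairs)
```

Here request 2 has no logged response, so it is the only key reported by {\tt unanswered}.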
We have two PHP scripts used for analysis. The first script looks at all the requests generated and corresponding responses for each site. It counts the number of requests sent during each relevant time window: the one second after the static content is loaded, the first wave of mouse movements/keyboard presses, the pause between waves, and the second wave of mouse movements/keyboard presses. It keeps track of the domains to which the requests are being sent during the two waves. It also counts the total amount of data sent during each window. This is measured by the number of characters after the top-level domain in the URL to which the request is being sent (covering {\tt GET} requests), plus the number of bytes in the message body, taken from the {\tt Content-Length} HTTP header (for {\tt POST} requests). However, we exclude from these counts requests that seem to be targeting image files, and requests to {\tt gstatic.com} (a Google domain that serves static content), since these are irrelevant to our results.
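
The data measure can be sketched as follows. This is a Python illustration rather than our PHP script, and several details are assumptions: ``characters after the top-level domain'' is taken to mean the path plus the query string, image requests are recognized by a hypothetical list of file extensions, and headers are passed as a plain dictionary.

```python
from urllib.parse import urlsplit

# Hypothetical list; the text only says "seem to be targeting image files".
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".ico", ".svg", ".bmp", ".webp")


def request_size(url, headers=None):
    """Characters after the host in the URL, plus the Content-Length value."""
    lowered = {k.lower(): v for k, v in (headers or {}).items()}
    parts = urlsplit(url)
    after_host = parts.path + ("?" + parts.query if parts.query else "")
    return len(after_host) + int(lowered.get("content-length", 0))


def is_ignored(url):
    """True for requests to gstatic.com or for apparent image files."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == "gstatic.com" or host.endswith(".gstatic.com"):
        return True
    return parts.path.lower().endswith(IMAGE_SUFFIXES)


def bytes_per_window(requests, windows):
    """Sum request sizes into the (name, start, end) window containing each."""
    totals = {name: 0 for name, _, _ in windows}
    for req in requests:
        if is_ignored(req["url"]):
            continue
        for name, start, end in windows:
            if start <= req["time"] < end:
                totals[name] += request_size(req["url"], req.get("headers"))
                break
    return totals


windows = [("wave1", 1000, 6000), ("pause", 6000, 11000)]
requests = [
    {"url": "http://example.com/t?x=1", "time": 1500},
    {"url": "http://example.com/i.gif", "time": 2000},
    {"url": "http://a.com/p", "time": 7000, "headers": {"Content-Length": "10"}},
]
totals = bytes_per_window(requests, windows)
```

In this example the first request contributes the 6 characters of {\tt /t?x=1} to the first wave, the image request is skipped, and the third contributes 2 URL characters plus a 10-byte body to the pause.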
The script lists the sites that are flagged as suspicious by our heuristic, along with the associated XHR counts, data counts, and the list of domains to which requests are made. A site is flagged if, during each of the two waves, at least 250 more bytes of information were sent than during the pause between waves. Additionally, we require that at most 1,000 more bytes of information were sent during the first wave than during the second wave. This prevents the script from flagging sites that were still loading initial data during the first wave, because this would have skewed our results\endnote{This would not have been an issue if we had let the page sit for more time before triggering the mouse and keyboard events, but by the time we realized this was an issue, it was too late to re-run the crawl. It's likely we also have a lot of false negatives because of this, and perhaps this part of the heuristic could be refined more.}. Finally, the script outputs a combined list of the domains to which the flagged sites sent data, along with a count of the total number of messages sent to each one.
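
The flagging rule and the combined domain tally can be written compactly. The sketch below is in Python for illustration (our script is in PHP); the function and parameter names are hypothetical, while the thresholds of 250 and 1,000 bytes are the ones stated above.

```python
from collections import Counter


def is_flagged(wave1, pause, wave2, min_excess=250, max_wave_gap=1000):
    """Apply the heuristic to the byte counts of the three windows."""
    # Both waves must carry at least `min_excess` more bytes than the pause.
    exceeds_pause = (wave1 - pause >= min_excess) and (wave2 - pause >= min_excess)
    # A first wave far heavier than the second suggests initial loading.
    still_loading = (wave1 - wave2) > max_wave_gap
    return exceeds_pause and not still_loading


def tally_domains(domains_by_site):
    """Combine per-site {domain: message count} maps into one ranked list."""
    totals = Counter()
    for counts in domains_by_site.values():
        totals.update(counts)
    return totals.most_common()


flagged = is_flagged(wave1=1000, pause=100, wave2=900)
ranking = tally_domains({
    "site-a.example": {"tracker.example": 40, "cdn.example": 2},
    "site-b.example": {"tracker.example": 25},
})
```

In this made-up example the site is flagged, since both waves exceed the pause by well over 250 bytes and differ from each other by only 100 bytes, and {\tt tracker.example} tops the combined list with 65 messages.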
The second PHP script simply outputs a list of websites that send any messages to any domain in a list of known mouse-tracking services.
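
A minimal Python sketch of this lookup is given below. The domain list is illustrative only: it uses what we take to be the domains of the services named in the introduction, and it is not necessarily the list used by our script. The function name and the example URLs are hypothetical as well.

```python
from urllib.parse import urlsplit

# Illustrative list; assumed domains of the services named in the introduction.
KNOWN_TRACKERS = ("clicktale.net", "crazyegg.com", "inspectlet.com", "mouseflow.com")


def sites_using_known_trackers(requests_by_site, trackers=KNOWN_TRACKERS):
    """Map each site to the set of known tracking domains it sent requests to."""
    hits = {}
    for site, urls in requests_by_site.items():
        for url in urls:
            host = (urlsplit(url).hostname or "").lower()
            for tracker in trackers:
                # Match the domain itself or any subdomain of it.
                if host == tracker or host.endswith("." + tracker):
                    hits.setdefault(site, set()).add(tracker)
    return hits


hits = sites_using_known_trackers({
    "shop.example": ["https://cdn.mouseflow.com/projects/x.js",
                     "https://shop.example/api"],
    "news.example": ["https://news.example/feed"],
    "blog.example": ["http://notmouseflow.com/x"],
})
```

Matching on the full domain or a dot-separated suffix keeps a lookalike host such as {\tt notmouseflow.com} from being counted.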
{\footnotesize \bibliographystyle{acm}
\bibliography{bib}}
%\newpage
%\begingroup
%\parindent 0pt
%\parskip 2ex
%\def\enotesize{\normalsize}
\theendnotes
%\endgroup
\end{document}