first commit
dougb5 committed May 5, 2020
0 parents commit 174375f
Showing 12 changed files with 980 additions and 0 deletions.
11 changes: 11 additions & 0 deletions README.md
@@ -0,0 +1,11 @@
This is the code for the Google search history exploration / intersection
demo hosted at https://prototypes.cortico.ai/search-history/.
This is a single-page app and can run locally, without a Web server.
Just open search_history.html from your browser.

Third-party libraries used (in the vendor/ subdirectory)

- js-sha256: https://cdnjs.com/libraries/js-sha256
- FileSaver: https://github.com/eligrey/FileSaver.js/blob/master/dist/FileSaver.min.js.map
- jszip: https://stuk.github.io/jszip/
- datatables: https://datatables.net/
51 changes: 51 additions & 0 deletions compact_word_vectors.js
@@ -0,0 +1,51 @@

// Javascript functions for operating on compact word vectors
// Requires that "word_vectors" has been set by including a data file such
// as word_vector_data_mikolov1.js

var CWV_MAX_DISTANCE = 300;

function get_word_vector(w) {
  var vec = word_vectors[w.toLowerCase()];
  if (!vec) return null;
  return decode64(vec);
}

function word_distance(word1, word2) {
  var bytes1 = get_word_vector(word1);
  var bytes2 = get_word_vector(word2);
  if (!bytes1 || !bytes2) return CWV_MAX_DISTANCE; // A word wasn't found
  return word_vector_distance(bytes1, bytes2);
}

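// Hamming distance: the number of bit positions at which the two
// decoded vectors differ.
function word_vector_distance(bytes1, bytes2) {
  var bitsDifferent = 0;
  for (var i=0; i<bytes1.length; i++) {
    var b1 = bytes1.charCodeAt(i);
    var b2 = bytes2.charCodeAt(i);
    var res = b1 ^ b2; // set bits mark the positions where the bytes differ
    while (res !== 0) {
      bitsDifferent += res & 1;
      res = res >> 1;
    }
  }
  return bitsDifferent;
}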

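// Returns [word, distance] pairs for every known word within the distance
// threshold of word1, sorted nearest first.
function related_words(word1) {
  var results = [];
  var bytes1 = get_word_vector(word1);
  if (bytes1) {
    // get_word_vector() looks words up in lowercase, so compare in lowercase
    // too; otherwise "Cat" would list "cat" as its own neighbor.
    var self = word1.toLowerCase();
    Object.keys(word_vectors).forEach(function(key) {
      if (key !== self) {
        var bytes2 = get_word_vector(key);
        var distance = word_vector_distance(bytes1, bytes2);
        if (distance < 110) { // relatedness threshold, in differing bits
          results.push([key, distance]);
        }
      }
    });
  }
  results.sort(function(a, b) { return a[1] - b[1]; });
  return results;
}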
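
The distance used above is a Hamming distance over the decoded bit vectors. The following standalone sketch illustrates the idea on two made-up 2-byte vectors; the name `hammingDistance` and the sample bytes are hypothetical and are not part of this commit.

```javascript
// Hypothetical standalone illustration; not part of compact_word_vectors.js.
// Counts the bit positions at which two binary strings differ.
function hammingDistance(bytes1, bytes2) {
  var differing = 0;
  var n = Math.min(bytes1.length, bytes2.length);
  for (var i = 0; i < n; i++) {
    // XOR leaves a 1 in every position where the two bytes disagree.
    var x = bytes1.charCodeAt(i) ^ bytes2.charCodeAt(i);
    while (x !== 0) {
      differing += x & 1;
      x >>>= 1;
    }
  }
  return differing;
}

// Made-up sample vectors (assumption: one character per byte, as after decode64).
var a = String.fromCharCode(0b10110000, 0b00001111);
var b = String.fromCharCode(0b10010000, 0b00001100);
console.log(hammingDistance(a, b)); // 3
```

A smaller result means more similar vectors, which is why `related_words` keeps only words below a fixed threshold and sorts them in ascending order.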
Binary file added screenshot.png
128 changes: 128 additions & 0 deletions search_history.html
@@ -0,0 +1,128 @@
<!DOCTYPE html>
<html>
<head>
<title>Google Search History Inspector / Intersector</title>
<link rel="stylesheet" type="text/css" href="vendor/datatables.min.css"/>
<link rel="stylesheet" href="https://ajax.googleapis.com/ajax/libs/jqueryui/1.11.3/themes/smoothness/jquery-ui.css" />
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
<script src="vendor/sha256.min.js"></script>
<script src="vendor/FileSaver.min.js"></script>
<script src="vendor/decode64.js"></script>
<script src="vendor/jszip.min.js"></script>
<script type="text/javascript" src="vendor/datatables.min.js"></script>
<meta property="og:description" content="You tell Google a lot about who you are through your searches. This tool helps you inspect and reflect on your Google search history so that you can get a sense of what's been on your mind over the years and what Google remembers about you. Two people can use it together to reveal searches that they have in common with each other and discover hidden shared interests.">
</head>
<style>
body {
margin: 50px;
font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
}
.docs {
max-width: 1000px;
margin: 10px auto;
padding: 20px;
line-height: 1.4;
border-radius: 5px;
border: 2px solid #999;
}
.dataTables_filter {
float: left !important;
}
#intersectFile { display: none; }
div#hash-password-form { display: none; }
thead th { cursor: pointer; }
#filterAddRelatedWordsButton { display: none; padding-left: 10px; }
#loadingMessage { padding-bottom: 10px; }
#searchCountMessage { padding-left: 20px; color: grey; }
.resultLink {text-decoration: none; color: black;}
#downloadPlainDiv { margin: 5px; }
#intersectionDiv { margin: 5px; }
</style>
<body>

<div class="docs">
<center><h1>Google Search History Inspector / Intersector</h1>

<a href="screenshot.png"><img src="screenshot.png" width=400></a>

</center>

<p>
You tell Google a lot about who you are through your searches. This
tool helps you inspect and reflect on your Google search history so that you
can get a sense of what's been on your mind over the years and what
Google remembers about you. Two people can use it together to
reveal searches that they have in common with each other and
discover hidden shared interests.
</p>

<p><b>Warning: Your search history is very sensitive data and it will
show up on this screen when you follow the instructions below. This
tool processes your search history entirely on your computer and does not send
your data to this or any other web site or service. (You can go offline or
disconnect your Internet connection after step 1 below if you're
skeptical of that!)</b>

<p>Ready? Here's what to do:

<ol>
<li>Go to <a target="_blank" href="https://takeout.google.com/settings/takeout">Google Takeout</a> to download an archive of your search data for your primary Google account. If you have more than one Google account that you frequently use, do this for each of the accounts.
<ul>
<li>Under "Select data to include", choose only the "My Activity" option and turn everything else off. (It won't hurt to include other data, but it will take much longer to export and download.) If "My Activity" is not available, you might have Google's search history turned off, or you might be using a Google G-Suite domain whose administrator has restricted data export.
<li>(Optional) To limit the activity to only searches, you can click on the "My Activity" line, select "Select specific Activity Data", and then untoggle everything except "Search". This will reduce the size of the Takeout download.
<li>On the next step, keep all the default choices (".zip" file type) and select "Create archive". Wait for the export to finish&mdash;this can take several minutes&mdash;and then download the resulting zip file to your computer.
</ul>
<li>Select the "Choose files" button below and select the zip file(s) you downloaded above.
<li>After a few moments a table with your searches will be displayed on the screen. From there you can page through your searches or explore them using the keyword filter and sorting features:
<ul>
<li>Click on the "Number of searches" header to prioritize the searches you've made most frequently.
<li>Click on the "Years searched" header to prioritize the searches you've made in the greatest number of years.
<li>Click on the "Most recent search" header to show them in time order, most recent first.
<li>Type something into the "Filter by keyword" box to show only the searches that contain a particular keyword. Then click on "Also show related searches" to broaden the filter to include topics related to the keyword you typed.
<li>Click the "Download to a .tsv file" button beneath the table to get a file that you can import into a spreadsheet program.
</ul>
<li>To learn what you have in common with a friend:
<ul>
<li>Click the "Download hashes for all searches to a file" button at the bottom of the table and choose a password to protect the data in the file. After a few moments, a file will be downloaded that you can safely send to your friend without revealing your searches.
<li>Your friend should do steps 1 to 3 above on their own computer and then select "Intersect with a friend's hashes". Your friend will be prompted to type in the password you chose above and then to select the hash file you sent them. At that point your friend's searches will be narrowed down to show only the ones that match a search you've made as well. <b>(Be aware that, although this will only reveal searches that you have in common with your friend, the results have the potential to be uncomfortable or surprising to one or both of you.)</b>
</ul>
<li>When you're done, close this browser tab and delete the Google Takeout zip file from your computer. If you'd like to delete items from your Google search history or change your search history settings based on what you've learned, head over to <a href="https://myactivity.google.com/">myactivity.google.com</a>.
</ol>
</p>

<i><a href="https://docs.google.com/presentation/d/1SfbfnuNSlDRtUOpN1KNc2RMtw030defCDME6p_268fQ">Here's a
recent presentation about this work</a>. Please send comments and questions about this tool
to <a href="mailto:[email protected]">Doug Beeferman</a>.</i>

</div>

<center>
<input class="inputfile" type="file" id="file" multiple />
</center>

<br><br>

<script src="word_vector_data_mikolov1.js?v=1"></script>
<script src="compact_word_vectors.js?v=1"></script>
<script src="search_history.js?v=7"></script>

<div id="loadingMessage"></div>
<div id="intersectionMessage"></div>
<div id="result_table">
<table id="search_table" width=100% class="display compact" cellspacing="0">
</table>
</div>

<div id="hash-password-message"></div>
<div id="hash-password-form">
<form>
<fieldset>
<label for="password" id="hash-password-title">Password</label>
<input type="password" name="password" id="hash-password" value="" class="text ui-widget-content ui-corner-all">
<input type="submit">
</fieldset>
</form>
</div>

</body>
</html>
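
The page above describes sharing password-protected hashes so that two people can find searches in common without revealing the rest. The code that does this (`search_history.js`) is not shown in this excerpt, so the following is only a minimal sketch of one way such an intersection could work. It assumes each search is normalized and hashed together with the password using SHA-256; the function names and the normalization rule are hypothetical. It uses Node's built-in `crypto` module so that it runs standalone, whereas the page itself loads `vendor/sha256.min.js`.

```javascript
// Hypothetical sketch; the real logic lives in search_history.js (not shown).
const crypto = require("crypto");

// Assumption: searches are trimmed and lowercased, then hashed with the
// shared password mixed in, so the hash file alone does not reveal the queries.
function hashSearch(query, password) {
  const normalized = query.trim().toLowerCase();
  return crypto
    .createHash("sha256")
    .update(password + "\n" + normalized)
    .digest("hex");
}

// What the first person would export: a set of hashes, not the searches.
function exportHashes(searches, password) {
  return new Set(searches.map((q) => hashSearch(q, password)));
}

// What the friend would compute: their own searches whose hashes appear
// in the received set.
function intersect(mySearches, friendHashes, password) {
  return mySearches.filter((q) => friendHashes.has(hashSearch(q, password)));
}

const sent = exportHashes(["sourdough starter ", "tax forms"], "pw");
const shared = intersect(["Sourdough starter", "cheap flights"], sent, "pw");
console.log(shared); // [ 'Sourdough starter' ]
```

With a wrong password, none of the friend's hashes match and the intersection comes back empty, which is consistent with the page asking the friend to enter the password before loading the hash file.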