One of my favorite authors ever, in one of my favorite books ever, wrote:
Just because you’re paranoid doesn’t mean they aren’t after you.
- Joseph Heller in Catch-22
And so to the paranoid among you, and those who should be, I present a quick lesson in how truly un-anonymous you are online, and how much more anonymous you can become.
The first thing everyone needs to know, but most people don’t fully appreciate, is that your activity online is like your activity in real life: without unusual precautions you’ll leave your virtual DNA everywhere. For most of you a collective “so what” is a reasonable reaction. You’ve got lives to live, and you don’t anticipate that anyone would be interested in where on the web you go. But anonymity (and the privacy it brings) can be important in many reasonable situations. And for some there is a general principle involved: that we have a fundamental right to privacy, which should not be abridged just because doing so is easy and provides a thin veneer of national or regional security.
On that note! Let me show you how, how much, and where you shed your virtual DNA…
Let’s examine the simplest thing you’re likely to do on the web. You just watched a Discovery Channel show about pandas and find yourself curious about how pandas procreate. What happens when you do a quick Google search for “panda bear sex” and click the first result? Here’s what happens:
|Your Action or its Result||Who Sees Your Data?||What Data do They See?||Why do they want it?||How long do they hold it?|
|You type your keywords “panda bear sex” in your browser’s search box and hit “enter”.||Browser (its search form history)||
||Your browser’s search form remembers this search phrase to make your life easier.||Indefinite. Usually until you have so many phrases that it prunes the list. Even after that the data is still on your hard drive, recoverable until over-written.|
|Your browser plugins (also known as browser helper objects and add-ons) act on the URL, if applicable.||Anti-virus / Browser Plugins||
||Most software firewall/anti-virus suites include a browser plugin that can check every site you want to visit against a list of potentially harmful sites. This can mean (depending on implementation) that they are passing information to their backend about every single browser request you make. It’s like you are cc:ing them on every URL you request. Plugins make the experience of the web much richer, but each one has access to the URLs you visit, the content of those URLs, and anything else it wants on your computer (files, data, webcam, microphone, GPS, etc.).||Indefinite. Could be anything.|
|Your web browser contacts Google search via your internet service provider (ISP) to get the Google search results.||Your ISP (the government, etc.)||
||Your ISP is the only one who knows exactly which IP corresponds to which household. And for that household they have a name, phone number, address, perhaps a credit card and social security number, etc. In no way am I suggesting you should be afraid of your ISP, per se. They will not divulge the identity behind your IP to just anyone, but in this new age of loosely targeted warrantless government wiretaps, RIAA anti-piracy monitoring and lawsuits, etc., ISPs are giving up your identity with and without legal necessity. And ISPs have installed government packet-sniffing NarusInsight nodes at their facilities which can analyze all network traffic passing by, looking for activity they deem “suspicious”. And suspicious likely includes the use of keywords, phrases, website URLs, etc. that may have worrying or innocent uses. Also note that other ISPs are involved as your internet traffic crosses various networks. The ISPs in between can record the traffic they see.||ISPs are legally required to retain information about you for 6 months to 2 years, specifically to help law enforcement. What they retain is left somewhat open-ended, but it is at least the information about who had what IP when. ISPs have also in the past generated revenue by selling traffic information to third-party companies, helping search engines and advertisers know what web sites are popular; they would not directly include your IP, but poorly written sites can leak some data through URLs.|
|Google receives and responds to the request, via the ISP.||Google||
||They record the URLs you visit to improve their search results, and also to provide you with features. If you have a Gmail account, a Google account, a YouTube account, etc. and you have cookies enabled, Google knows specifically who you are with every search you do and can do things like show you (optionally) your search history.||They say they keep data at least 9 months. (Presumably they keep the data indefinitely if they have your permission as part of a feature of theirs, or if they disassociate it from your IP.)|
|Your browser receives the results from Google, but won’t show them to you quite yet. First your browser stores the file it received on the hard drive, adding it to your browser’s cache.||Browser Cache||
||What you see when you view a web page is many text, style, video, and audio assets combined into one rendered document. Each asset is fetched separately, and stored separately in the cache. Many of these assets are re-used between different pages on a site (for example the images in the header and footer of a page). It would be wasteful for the browser to request these re-used assets every time you visit another page on the same site. The cache saves the remote server work, saves your local browser work, and lets you click from one page to another more quickly (since it already has most of the assets you need).||Indefinite. Lifetime of the cache, then as long as it takes for the info on the disk to be over-written.|
|Your browser records the URL of your search results in its history. It still won’t show you the page yet; still a few steps away!||Browser History||
||Your browser’s history can be your good friend or your worst enemy. It is useful when you want to revisit a site whose name you can’t remember, but it can be an awful snitch if you plan to cheat on your wife or husband via an online dating site.||Indefinite. You can modify the retention time in your browser settings, but keep in mind the data on a hard drive is not destroyed until it is over-written (and not always even then). A URL you visited 2 weeks ago may disappear from the list because you set a 2-week limit, but the URL is still on the hard drive and can be recovered, until the disk happens to re-use that space. If you tell your browser you want NO history, this doesn’t necessarily mean it wasn’t recorded on the disk. Many browsers still record to disk and only delete the entries when you close the browser. But deleting is not destroying.|
|Your browser plugins act on the document of results from Google, if applicable. Nothing is shown yet, but we’re getting close!||Anti-virus / Plugins||See above on Anti-virus, plugins.||See above on Anti-virus, plugins.||See above on Anti-virus, plugins.|
|Next your browser sets cookies that Google requested. Almost there!||Cookies||
||Cookies are vital for site personalization and authentication. They are benign except that they can contain data which could be found and used to tie you to sessions on other servers and to topics you are interested in (based on searches, ads clicked, etc.).||Indefinite. Lifetime of the cookie, then as long as it takes for the info on the disk to be over-written.|
|Now you see Google results!||None*||n/a||* In the case of Google, where all the advertisements are Google’s, this final step of viewing the page doesn’t open you up to any new privacy leakage… but see the next few steps, which mention the anonymity risks that regular ads, Java, Flash, and other things pose…||n/a|
|You click on the first search result and your browser sends a request to Google via your ISP to redirect you to the first search result, “PandaLovingInfo.com”.||Google, ISP||See above on Google and ISP.||Google wants to know which results people click on.||See above on Google and ISP.|
|Your browser is redirected to PandaLovingInfo.com.||Website, ISP||
||Websites want to know where their inbound traffic comes from, how many users they have, what their users do, etc. They can collect this anonymously and then tie it to an account you create later. And see above on ISP.||Indefinite. No universal rule, they can keep the data as long as they like. And see above on ISP.|
|You now see the webpage on PandaLovingInfo.com, where all your questions will surely be answered!||Cache, plugins, and cookies||See above on cache, anti-virus, plugins, and cookies.||See above on cache, anti-virus, plugins, and cookies.||See above on cache, anti-virus, plugins, and cookies.|
|You are shown advertisements on PandaLovingInfo.com offering many wonderfully peculiar items.||Advertisers on the website||
||Advertisers want to know where you live, what you’re interested in, and anything else they can. They can track you between sites, so they know you are the same person who was interested in zebra mating rituals last week.||Indefinite. Whatever they want it to be.|
The above is about as simple a web experience as you can get. You do one search and view one result, and look how many parties are given access to what you’re searching for and, to varying degrees, to who you are, what you like, etc. If you want to be truly anonymous, every single “leak” listed above must be plugged.
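To make one of these leaks concrete, here is a minimal Python sketch of why a recorded URL is so revealing: the search phrase travels inside the URL itself. The URL shape and the `q` parameter below are simplifying assumptions for illustration (real search URLs carry more parameters), and the function names `build_search_url` and `recover_search_phrase` are hypothetical, not part of any browser or search engine.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(phrase: str) -> str:
    """Build a simplified search URL (assumed shape, for illustration only)."""
    # urlencode turns spaces into '+' and escapes unsafe characters.
    return "https://www.google.com/search?" + urlencode({"q": phrase})


def recover_search_phrase(url: str) -> Optional[str]:
    """Pull the search phrase back out of a stored or observed URL."""
    # parse_qs decodes the query string into a dict of lists.
    params = parse_qs(urlsplit(url).query)
    values = params.get("q")
    return values[0] if values else None


url = build_search_url("panda bear sex")
print(url)
print(recover_search_phrase(url))
```

Anything that stores or sees the full URL, such as the browser history or a plugin from the table above, can run the equivalent of `recover_search_phrase` and read back exactly what you typed.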
In the next installment I’ll talk about the dangers posed by these traces you leave, and in the final installment what you can do about them.