Benscomputer.no-ip.org

Creative Commons
License

Please see footer for details of the license applied to this article.





Streetview and WiFi - What really happened



Readers are unlikely to have missed the current row over Google 'accidentally' collecting WiFi data with their Streetview cars. What does seem to have been missed (judging by portions of the Internet) is what the problem actually is. So let's take a look at what happened.

From the Beginning

Streetview is a project to photographically map large swathes of the globe. It involves driving cars fitted with 360-degree cameras around at about 30MPH.

In addition to this, Google also fitted the cars with WiFi equipment in order to collect data on Wireless Access Points: the MAC address and SSID (network name). This is obtained by intercepting packets as they fly through the air and looking at the headers. Even on an encrypted network, the frame headers needed for this part of the project are sent in the clear, so they are likely to be available. It's possible that Google may have elected to only record open networks, but that's not clear.
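As a rough illustration of how little is needed for this part of the project, here is a minimal Python sketch that pulls the MAC address (BSSID) and SSID out of a raw 802.11 beacon frame. It is not Google's code: the function name and the `AccessPoint` type are made up for the example, and it assumes any radiotap header has already been stripped from the capture. Nothing beyond the frame header and the SSID element is ever looked at.

```python
import struct
from typing import NamedTuple, Optional


class AccessPoint(NamedTuple):
    """Hypothetical record type: the two values the survey is after."""
    bssid: str  # MAC address of the access point
    ssid: str   # network name


def parse_beacon(frame: bytes) -> Optional[AccessPoint]:
    """Extract BSSID and SSID from a raw 802.11 beacon frame.

    Returns None for anything that is not a beacon (e.g. data frames).
    """
    # 24-byte MAC header + 12 bytes of fixed beacon fields
    # (timestamp, beacon interval, capability info).
    if len(frame) < 36:
        return None

    frame_control = struct.unpack_from("<H", frame, 0)[0]
    frame_type = (frame_control >> 2) & 0x3
    subtype = (frame_control >> 4) & 0xF
    if frame_type != 0 or subtype != 8:  # management frame, beacon subtype
        return None

    # Address 3 (bytes 16..21) carries the BSSID in a beacon.
    bssid = ":".join(f"{b:02x}" for b in frame[16:22])

    # Walk the tagged elements: 1 byte id, 1 byte length, then the value.
    pos = 36
    while pos + 2 <= len(frame):
        tag, length = frame[pos], frame[pos + 1]
        value = frame[pos + 2 : pos + 2 + length]
        if tag == 0:  # element id 0 is the SSID
            return AccessPoint(bssid, value.decode("utf-8", errors="replace"))
        pos += 2 + length

    return AccessPoint(bssid, "")  # beacon without an SSID element
```

A collector built along these lines keeps two short strings per access point and throws the rest of the frame away.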

The reason Google wanted to record MAC/SSID information is to aid in geolocation. By using GPS to record where a specific MAC/SSID was seen, it becomes possible (using the resulting dataset) to identify where a device (say an Android-powered mobile phone) is located based on the wireless networks in range. If there are a number of networks in range, you can triangulate the user's rough location based on which networks are available and the distance from each access point (estimated from signal strength).
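To make the idea concrete, here's a small Python sketch of one simple way to do it: a weighted centroid, where each known access point pulls the estimate towards itself in proportion to its received signal power. Strictly speaking that isn't true triangulation, and I'm not suggesting it's what Google uses. The survey table, the function names and the coordinates are all invented for the example.

```python
from typing import Dict, Iterable, Optional, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical survey data: BSSID -> where the survey car's GPS saw it.
KNOWN_APS: Dict[str, Position] = {
    "00:11:22:33:44:55": (52.0000, 1.0000),
    "66:77:88:99:aa:bb": (52.0010, 1.0020),
}


def rssi_to_weight(rssi_dbm: float) -> float:
    """Convert a signal strength in dBm to linear power (mW).

    Stronger (less negative) readings get a larger weight.
    """
    return 10 ** (rssi_dbm / 10.0)


def estimate_position(
    scan: Iterable[Tuple[str, float]],
    known_aps: Dict[str, Position],
) -> Optional[Position]:
    """Estimate a device position from a scan of (bssid, rssi_dbm) pairs.

    Access points missing from the survey are ignored. Returns None if the
    scan contains no known access point at all.
    """
    total = lat_sum = lon_sum = 0.0
    for bssid, rssi_dbm in scan:
        if bssid not in known_aps:
            continue
        weight = rssi_to_weight(rssi_dbm)
        lat, lon = known_aps[bssid]
        lat_sum += weight * lat
        lon_sum += weight * lon
        total += weight
    if total == 0.0:
        return None
    return lat_sum / total, lon_sum / total


if __name__ == "__main__":
    # A phone that hears the first network much more strongly than the second
    # should be placed closer to the first.
    print(estimate_position(
        [("00:11:22:33:44:55", -50.0), ("66:77:88:99:aa:bb", -80.0)],
        KNOWN_APS,
    ))
```

The more surveyed networks are in range, the better the estimate gets, which is why the dataset is worth collecting in the first place.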

You may not agree with this aim, but it is both a legal and legitimate project, and one which I believe you can opt out of (and yes, I'd prefer opt-in!).


Where it went wrong

Rather than writing the necessary code from scratch, Google re-used a piece of code that one of its engineers had written for a slightly different purpose.

For whatever reason, it wasn't noticed that this piece of code also collects payload data (i.e. the information you are sending over the network). Perhaps the description in their code repository didn't mention this, so the code only appeared to collect MAC/SSID. Who knows?

As a result, Google 'unknowingly' drove around collecting payloads as well as the intended data (MAC/SSID) until it was noticed and they went public. At that point lawyers, newspapers and FUD spreaders all went mad.


Common FUD debunked

Q: Surely they should have noticed that the size of the data collected was far bigger than expected?

We're not talking about accidentally downloading files here; the size of a packet is small. They'd only have collected a few packets per network, given the range of WiFi, the speed the cars travel at and the fact that their equipment changes channel several times per second (roughly five, according to Google's statement). They may have collected a few extra packets if they were stuck in traffic, at traffic lights etc., but it's still going to be a pretty tiny amount.
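A quick back-of-envelope sketch in Python shows why. Every number passed in at the bottom is an assumption picked for illustration (radio range, number of channels, how busy the network is); none of them come from Google. The point is simply that the car is only in range for a few seconds, and only listening on your channel for a fraction of that.

```python
MPH_TO_METRES_PER_SECOND = 0.44704


def seconds_in_range(range_m: float, speed_mph: float) -> float:
    """Time a car spends within range_m of an access point it drives straight past."""
    return (2 * range_m) / (speed_mph * MPH_TO_METRES_PER_SECOND)


def seconds_on_channel(range_m: float, speed_mph: float, channels: int) -> float:
    """Share of that time spent tuned to one particular channel.

    Assumes the receiver cycles evenly through every channel.
    """
    return seconds_in_range(range_m, speed_mph) / channels


def fragment_seconds(hops_per_second: float) -> float:
    """Longest unbroken stretch of listening on one channel before the next hop."""
    return 1.0 / hops_per_second


def payload_bytes(
    range_m: float,
    speed_mph: float,
    channels: int,
    busy_bytes_per_second: float,
) -> float:
    """Upper-bound estimate of payload captured from one network in one pass.

    Assumes the network is transmitting flat out for the whole drive-by,
    which is deliberately pessimistic.
    """
    return seconds_on_channel(range_m, speed_mph, channels) * busy_bytes_per_second


if __name__ == "__main__":
    # Assumed figures: 50 m usable range, 30 MPH, 13 channels, a network
    # pushing 125,000 bytes (1 Mbit) per second the entire time.
    print(f"in range:   {seconds_in_range(50, 30):.1f} s")
    print(f"on channel: {seconds_on_channel(50, 30, 13):.2f} s")
    print(f"payload:    {payload_bytes(50, 30, 13, 125_000) / 1024:.0f} KiB at most")
```

Note that the hop rate doesn't change the total; it just chops whatever is captured into shorter, less coherent fragments.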


Q: How can you accidentally fit WiFi equipment to a car?

They didn't. The WiFi equipment was deliberately fitted for an intended purpose (the collection of MAC/SSIDs). The accident was keeping the payload.


Q: They profit from compiling data, this wasn't a mistake

Yes, they do profit from collecting data on us, but the data likely to be collected through this is inconsequential. Consider the following:
  • Most E-mail connections are encrypted with SSL/TLS (there are exceptions)
  • Internet Banking is encrypted
  • The amount of data collected is tiny (see above)
  • Far more information can be gained by using less insidious methods

Also, it's highly likely that if you were affected by this, your wireless network is unencrypted (see below for notes on WEP). If this is the case, then your next door neighbour has a far better opportunity to compile data on you. A few bytes of information from your network are of no use to Google.


Q: Why wasn't this picked up in code review? Either their internal processes suck, or this was deliberate

The collection of Access Point information isn't (generally) likely to pose any security risks, so if timelines are tight, many companies may actually skip the code review on this kind of project. Some companies just don't afford their employees the time to complete the due diligence that would ideally be carried out on all projects (which would, admittedly, help prevent situations such as this from occurring).



Open/Encrypted Networks

Not much has been said on this area of the issue, so I'm speculating here.

It's likely that only open (i.e. unencrypted) networks were affected by this data collection. It is, however, possible that WEP encrypted networks were also affected. If you consider the following, you'll see that it is plausible:

  • The original code was written, but never deployed, which suggests a Proof of Concept (POC)
  • WEP encryption can easily be broken; multiple scripts/libraries exist to decrypt WEP encrypted frames
  • The collection of encrypted data is ethical in some situations (where you own the network, have permission, security testing etc.)
  • The ability to collect WEP encrypted data is an interesting curiosity to include in a POC
  • No code review was undertaken

So, it's plausible that the original author included (or wrote) a function to crack WEP encrypted frames in his Proof of Concept. This would have been included in the deployed code (as no-one knew it functioned in that manner) and so would have collected data from both Open and WEP encrypted networks. It is, however, unlikely. Even if it turns out to be true, the use of WPA/WPA2 would likely mean that Google were unable to decrypt any frames collected from those networks.
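For the curious, this is roughly what 'decrypting a WEP frame' involves once the key is known. It's a Python sketch for illustration only and has nothing to do with whatever Google's code actually did. WEP runs RC4 keyed with the 3-byte per-packet IV followed by the shared key, and appends a CRC-32 checksum (the ICV) to the plaintext before encrypting. The function names are mine, and recovering the key in the first place (the 'cracking' part) is a separate step that isn't shown.

```python
import struct
import zlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4: the same call encrypts and decrypts."""
    # Key scheduling
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    # Keystream generation, XORed over the data
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def icv(plaintext: bytes) -> bytes:
    """WEP integrity check value: CRC-32 of the plaintext, little-endian."""
    return struct.pack("<I", zlib.crc32(plaintext) & 0xFFFFFFFF)


def wep_encrypt(iv: bytes, key: bytes, plaintext: bytes) -> bytes:
    """Encrypt a frame body the way WEP does (included so the sketch can be tested)."""
    return rc4(iv + key, plaintext + icv(plaintext))


def wep_decrypt(iv: bytes, key: bytes, ciphertext: bytes) -> bytes:
    """Decrypt a WEP frame body with a known key and verify its ICV."""
    data = rc4(iv + key, ciphertext)
    plaintext, received_icv = data[:-4], data[-4:]
    if icv(plaintext) != received_icv:
        raise ValueError("ICV mismatch: wrong key or corrupted frame")
    return plaintext
```

The IV travels in the clear with every frame, so anyone holding the key can do this to any frame they capture. With WPA/WPA2 there is no equivalent shortcut, which is why those networks are a different matter.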



So was it a mistake?

I'm always very cynical about corporate 'mistakes', but given the facts above, I'd say it was an honest mistake.

Google stand to gain nothing from this behaviour. Even if they could gain something, the fallout (as we are seeing) would be so great as to negate any benefit. They can collect so much more data about us through the use of cookies and free services than they ever could by sniffing wireless networks.

Google are far from a best friend to privacy, but this cannot have been a deliberate act. There's just no reasonable motive. I'm not claiming that the mistake should go unpunished, but it truly is a bit of a storm in a tea-cup.

I do think those that have launched a class action lawsuit over the issue may well be in for a shock. The question I'd ask is: given the tiny amount of data that could possibly have been collected (per network), did a privacy invasion really happen? If the data's not usable, is it an invasion of privacy? The only person able to answer that will be the judge in the case.

I can't help but feel that part of the reason this has been whipped into such a big issue is that Google have deep pockets.


Bootnote:

On a completely unrelated topic: I've used WiFi rather than Wifi because I think it looks better. It's not through a belief that this is the correct way to do it!

Also, apologies to any who feel I have oversimplified the facts. I've done so because many of those who believe the FUD seem to be incapable of inferring many of the facts for themselves. Many of them also seem to display a lack of technical understanding! Obviously some or all of this article could be incorrect; it's simply based upon what has been released so far.










This page is copyright to me, Ben Tasker.

This work is licensed under a Creative Commons Licence.

All Images operate under a separate license.
Please read this page for more information. The Full Image License can be read here


RIPA NOTICE: NO CONSENT IS GIVEN FOR INTERCEPTION OF PAGE TRANSMISSION

DISCLAIMER:

Note: all views expressed on this site are my own, and do not necessarily represent the views of my friends, family or employer.