Breaking the Google Addiction one step at a time

Google isn't your friend. Google isn't my friend. Google is, and always has been, a data-whore.

But, still we use them and allow them to slurp up more and more data about us.

They're a bit like Amazon in that respect - you know they're an increasingly terrible company, but they're just so convenient and you keep on using them whilst ignoring the power they're amassing over the market.

But, it is something that's been concerning me more and more over the years.

We install adblockers, no-script and other extensions to add a fig-leaf to our privacy, or to try and avoid Google's user-hostile changes, yet we keep on using the same services. Even when they completely change the UI around on us, for no good reason, we still keep using their services.

I decided, quite a while ago, it was time I made a change, but then did very little, at least until recently.

As great as a "clean-break" might sound, going cold turkey off Google's services is never going to work - no model of user behaviour supports making massive jarring changes.

So I decided to start with the most obvious interaction with Google - their search engine. I don't have Google Home or similar, so my most frequent interaction with Google is search.

 

Not that simple

I've tried switching search engine before, and ultimately come back to Google. Why? Because Google was always better at returning the results I need.

I noted, though, when I set out on this that Google has been slipping in this respect recently - could their stumble yield an opportunity for a competitor to usurp them within my search habits?

 

Choice of Search Engine

There's not actually that much choice in underlying search engine (really it's Google, Bing, Yandex or Baidu), but there's quite a market in search engines which use these core few as a provider - providing a layer of separation between you and the data-hungry search provider.

Previous attempts to replace Google have been with providers like DuckDuckGo, who use Google as the underlying search service. But, despite seeming like an easy win, every one of those attempts has ultimately failed.

So, I decided to try a different approach and settled on Ecosia instead.

They use ad revenue in order to plant trees, which is nice, though that wasn't really a consideration either way for me. My biggest concern, really, was that they were using Bing as their search service provider.

My impression of Bing has always been: it's great for porn (and bypassing network level filters) but not so good at finding what you need, when you need it.

Still, it's been quite some time since I last tried anything Bing based, so maybe things have moved on.

So I started off by configuring Firefox Android to use Ecosia, before gradually enabling it on my desktop, and experimenting by creating a site-search with it.

 

What about Privacy?

The whole point of this exercise is, of course, to improve my privacy, so questions have to be asked about Ecosia's privacy posture.

Their privacy page is fairly unequivocal:

  • Doesn't store your searches permanently
  • Anonymises all search data within one week
  • Doesn't sell user or search data to advertisers
  • Searches are encrypted (seems to just be they use HTTPS, but OK)
  • They don't use external tracking tools like Google Analytics
  • All their tracking can be disabled (and honours Do Not Track)

This is all good, so long as it's true. Plus, Ecosia is based in Germany, so if they aren't being suitably upfront, they can be battered using GDPR (depending on what the hell happens with Brexit. I won't recount my views again here).

Being somewhat cynical and untrusting, I decided to spend a bit of time in developer tools looking at what they're actually doing.

 

Page Resources

The number of resources pulled in for a search is refreshingly light for someone used to staring, in horror, at the result of a Google search (load times are because I had some big uploads running in the background)

One cause for concern, though, is those images.

They're used for thumbnails in results.

That in itself isn't an issue, but rather than proxying them through, Ecosia loads them directly from Bing. 

This gives Bing a potential opportunity to associate me to searches. However, Ecosia have at least removed the simplest opportunity for Bing to do this:

<meta name="referrer" content="origin-when-crossorigin">

Modern browsers, therefore, won't include a full referer header, and will just send https://www.ecosia.org when requesting the images. Older browsers that don't honour the policy will still send the full URL, though, delivering the search term (in the form of the query string) straight into Bing's hands, straight from my IP.
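To make the difference concrete, here's a rough sketch of what a browser honouring that policy would put in the header. It's a simplification of the real referrer-policy rules (it ignores things like embedded credentials), the function name is my own, and the search page URL is an assumed example rather than something captured from Ecosia.

```python
from urllib.parse import urlsplit


def referrer_for(page_url, target_url):
    """Approximate the Referer value under an origin-when-cross-origin policy."""
    page, target = urlsplit(page_url), urlsplit(target_url)
    same_origin = (page.scheme, page.hostname, page.port) == (
        target.scheme,
        target.hostname,
        target.port,
    )
    if same_origin:
        # Same-origin requests still get the full URL (minus any fragment)
        return page._replace(fragment="").geturl()
    # Cross-origin requests only get the origin, so the query string is dropped
    return f"{page.scheme}://{page.netloc}/"


# Assumed example URLs, for illustration only
search_page = "https://www.ecosia.org/search?q=hello+world"
thumbnail = "https://www.bing.com/th?id=example"

print(referrer_for(search_page, thumbnail))
```

The search term only survives in the header when the request stays on the same origin, which is exactly what you'd want here.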

It's only one vector by which Bing could track users though - they could instead set a unique cookie when the image is requested (they currently don't), or vary the image URL slightly to associate with a received search - they don't appear to be doing the latter either, but with URLs like https://www.bing.com/th?id=AMMS_27d93d5191f80594e1908e1b8dd7c8bc&w=110&h=110&c=7&rs=1&qlt=80&cdv=1&pid=16.1 it'd be hard for the average user to spot.
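One way to check for that sort of URL variation is to grab the URL of the same thumbnail from two separate searches and diff the query strings. A minimal sketch: the helper names are mine, and the second URL (with its `sid` parameter) is entirely made up to show what a per-search identifier might look like. It is not something I've seen Bing serve.

```python
from urllib.parse import parse_qs, urlsplit


def query_params(url):
    """Return the query string as a flat dict (first value per key)."""
    return {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}


def differing_params(url_a, url_b):
    """Return the parameters whose values differ between the two URLs."""
    a, b = query_params(url_a), query_params(url_b)
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(set(a) | set(b))
        if a.get(k) != b.get(k)
    }


first = (
    "https://www.bing.com/th?id=AMMS_27d93d5191f80594e1908e1b8dd7c8bc"
    "&w=110&h=110&c=7&rs=1&qlt=80&cdv=1&pid=16.1"
)
# Hypothetical: the same thumbnail with an invented per-search identifier
second = first + "&sid=abc123"

print(differing_params(first, first))   # {}
print(differing_params(first, second))  # {'sid': (None, 'abc123')}
```

An empty result across a handful of searches doesn't prove anything, but a parameter that changes every time would be worth a closer look.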

It's not a significant flaw, but given how unequivocal Ecosia are in their privacy page, it's a pity this vector has been allowed to slip through. Bing seem to be behaving themselves at the moment, but it'd be nice for them not to be given the opportunity to do otherwise.

 

Analytics

The privacy page did mention they use analytics, and that it's not via a third party provider (like Google Analytics).

Turning my adblock off so that their analytics would load revealed that Ecosia are running their own instance of Snowplow.

In practice, that means their little JS agent sends Ecosia some rather nastily nested JSON (with some values being base64 strings which are, themselves, encoded JSON objects). Some of the data being submitted is a straight duplication of data elsewhere in the same submission, but yeah.

The format's all quite yuck, but the data they collect is basically the standard analytics fare - browser version, os version, have you got flash etc. Ecosia have also bolted on some additional data relating to the results themselves:

{
    "data": {
        "data": {
            "ads_shown": "ad_displayed",
            "comp_shown": "no_widget",
            "entity_shown": true,
            "flights_shown": false,
            "green_domains_shown": false,
            "hotel_shown": false,
            "map_snippet_shown": false,
            "page_num": null,
            "query": "hello world",
            "search_type": "search"
        },
        "schema": "iglu:org.ecosia/search_event/jsonschema/1-0-5"
    },
    "schema": "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"
}

No real concerns there, it all seems reasonable enough. Plus, my adblocker picked up on their analytics script from the outset, because they've been honest and not tried to obfuscate filenames or similar.
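If you want to poke at this sort of payload yourself, unwrapping the base64-inside-JSON layers only takes a few lines. This is a sketch under assumptions: the `payload_b64` field name is hypothetical (I'm not reproducing Snowplow's actual field names), I've assumed URL-safe base64 with the padding stripped, and the sample is built locally from a cut-down version of the event above rather than captured off the wire.

```python
import base64
import json


def decode_b64_json(value):
    """Decode a URL-safe base64 string (padding optional) containing JSON."""
    padded = value + "=" * (-len(value) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Build a sample locally: a trimmed-down version of the event shown above
inner = {
    "schema": "iglu:org.ecosia/search_event/jsonschema/1-0-5",
    "data": {"query": "hello world", "search_type": "search"},
}
encoded = base64.urlsafe_b64encode(json.dumps(inner).encode()).decode().rstrip("=")

# "payload_b64" is a hypothetical field name, used for illustration only
submission = {"payload_b64": encoded}

decoded = decode_b64_json(submission["payload_b64"])
print(decoded["data"]["query"])  # hello world
```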

I did add Ecosia's analytics domain to my DNS level blocking though, so that I could be sure I was excluded from their analytics as I started using it on more devices.

 

Ads

As I was doing a dive into their behaviour, the obvious next step was to look at how they handle ads. Would ad-supported searches result in random, unknown bits of third-party JavaScript being run, exposing the user to malvertising?

The first thing I noticed here, is that I couldn't actually see any ads, even with all adblocking off. Their analytics call said ads were enabled/displayed though.

The root cause of this was that my search term "Hello World" didn't match any ads at their end, so they didn't send/load any.

That's a point in their favour - rather than throwing any old ad at me, they instead sent no ads.

So, I changed my search to "shoes". As a result I got some fairly unobtrusive (in the sense they fit in) ads.

The images at the top are all ads.

My one criticism here, really, is that every result above the fold is an Ad (they are all marked as such) so I had to scroll to get to the organic results. On the other hand, I searched for Shoes and was presented with a range of shoes.

Looking at the requests being made, the ads are entirely images or text. There are no scripts, third party or otherwise, at play here. That was briefly another point for Ecosia.

 

Unusable Ads

That point was lost though when I found an annoyance in this regard.

Because the ads were image/text only my ad-blocker wasn't blocking them when I re-enabled it. That's fine in principle (the ads are relevant to my search, and aren't of a form that exposes me to malvertising), but quickly falls apart when you click on some ads.

This can be demonstrated using a different search term "Mens White Vest":

That very first result is "Men's White Basic Vest" from Burtons for £6 with £3.95 shipping. I couldn't buy it if I wanted to.

You see, unlike other advertisers, Burton in their wisdom decided not to have the ad go straight to their site (for me to buy the item), but instead to point to that bastion of privacy clickserve.dartsearch.net.

That domain, for very good reason, is blocked in my DNS.
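For anyone unfamiliar with how DNS-level blocking tends to behave, the matching is usually done on the domain and everything beneath it. Here's a minimal sketch of that idea. It isn't how any particular resolver implements it, the function name is mine, and I'm assuming the blocklist entry is the parent domain dartsearch.net rather than the specific hostname.

```python
def is_blocked(hostname, blocklist):
    """Return True if hostname, or any parent domain of it, is in blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check the full name, then each parent: a.b.c -> b.c -> c
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


# Assumed blocklist entry, for illustration
blocklist = {"dartsearch.net"}

print(is_blocked("clickserve.dartsearch.net", blocklist))  # True
print(is_blocked("notdartsearch.net", blocklist))          # False
print(is_blocked("example.org", blocklist))                # False
```

Matching on whole labels rather than raw string suffixes is what stops an unrelated domain that merely ends in the same characters from being caught.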

Burtons, if you're reading this - I hope you had to pay Bing for every single time I clicked that advert to see if the behaviour differed with things tweaked. Maybe keep better company in future?

Whilst that's not Ecosia's fault directly (especially as advertisers work with Bing to create the adverts), it does mean that their ads are suddenly less useful.

There's no value to me in having screen real-estate taken up by something I can't actually buy even if I really, really, really want to click on it. Not all advertisers are that dense though (shout out to Matalan, who have the good sense not to pimp out their customers' data).

At the time, I figured I'd put up with it and see how I got on, noting:

On the other hand, I guess it's a self-selecting pool: those who insist on associating with the likes of dartsearch (like Burton's above) will get no custom (but presumably still have to pay Bing for clicks, allowing Ecosia to plant more trees), while those (like Matalan above) who do not take the piss can still potentially get custom. 

It's not been an issue since, so I guess it really is just Burtons that lose out in that respect.

 

Index Coverage

So, Burtons aside, Ecosia had passed my various tests at this point - they appear to be honouring their privacy commitments (absent an oversight with images).

All that, though, is meaningless if the results aren't up to scratch, and the thing is powered by Bing: I am male, I am on the internet, but the majority of my searches are not for porn, and I need them to work properly.

I ran some searches for my own content, and did find that some of it was missing, which felt like something of a bad omen.

This was probably because I've tended to ignore Bing in favour of Google when optimising content, but therein lies the rub: most webmasters probably do the same, meaning Bing may not have indexed the content I need.

But, at the same time, I also figured that sitting and trying to find specific terms that Ecosia failed on was neither helpful, nor representative of what my actual experiences would be. So I pressed on, gradually moving devices and browsers over to using Ecosia, as my confidence grew.

As a side note, I took the opportunity to sit and improve Bing and Yandex's awareness of my content. It turns out Bing have been busy, and their Webmaster tools is now far superior to Google's offering (though that's helped by Google's recent step backward in the form of their UI changes).

 

The Verdict

I predicted quite early on, that if things went well, I'd forget about my little experiment, and forget to update the ticket. But if things went poorly, I'd be on there all the time bitching about how crap the results were.

There are gaps of 10-14 days between some of the comments on the ticket, with the subsequent updates simply saying that there haven't been any issues - exactly as predicted, I'd largely forgotten about the changes and just got on with things that needed doing.

I did note in a later comment that my own behaviour had changed a little, suggesting that the results aren't perfect, but certainly haven't been poor enough for me to give up and search with Google (the exception being to see where a result appeared for the purposes of a ticket comment).

So, ultimately, whilst it's possible to find terms and pages that perform better in Google, there really hasn't been enough real-world impact to motivate me to switch back.

 

But... Bing... Privacy...

This is an obvious enough question that I thought it was worth addressing head on.

What have I gained here?

I wanted to move my searches off Google in order to protect my privacy, but what I'm now doing is sending it (albeit indirectly) to Microsoft/Bing.

The important point here is that "albeit indirectly". There's an additional layer in between us which helps to disassociate my searches from each other as well as from me. In principle, I could even search my own name and then search something else, and Bing should have no way to link those two searches together (that one earlier concern aside).

But, in practice, the disassociation and its effects are actually far broader than that.

If I go to Google now and search for something, that search can be associated with data in other Google services to allow Google to build a profile of me.

Am I just searching "butt plug" for a picture? If I've a bunch of emails from sex toy shops in my inbox, and/or Google Analytics has seen me on bigstretchers.com (don't want to test if that's a real domain now...) then it's much more likely I've a deeper interest than it is that it's just a passing interest.

Because of the disassociation in my searches, Bing cannot link my searches to either me or my Bing Analytics persona. Even if they could somehow get past that, they can't link it to the purchase history in my Gmail inbox (because they don't have access).

 

It's easy to argue that it really shouldn't matter if Google knows this info, but it's a lazy argument which ignores the underlying issues: you haven't actually been given a say in whether Google builds these profiles of you, or what topics/attributes it decides to profile. It also fails to consider that what you may be fine with now, may later become a source of regret or embarrassment.

 

Conclusion

I've avoided using Google Search for a month now, without even really noticing the change. Ecosia's worked well for me both at home and at work.

There are still Google services that I use, and some of those will be much harder to supplant.

Although we're a long way past the point where these moves are warranted - Google has been extremely data-hungry for a long time - something else does seem to have changed of late.

Google's attitude and public persona seem to have shifted into something much more aggressive:

Google's approach no longer seems to be one of a good netizen working alongside others, so much as deciding they're going to do something with a wide-impact and everyone else can just live with it.

There have always been issues (for users) with Google discontinuing products, so over-reliance on any given Google product has always been a little unwise, but now, I don't know, something just feels different.

I already block most of Google's properties when they're a third party in the browsing session, so it's the properties that I directly interact with which likely provide Google with the most information about me.

Hopefully, as time passes, I'll find good alternatives to each of those services without accidentally centralising onto someone else.

Some of that, though, will probably be reliant on various markets changing: I've long bemoaned the state of the iPhone. Getting one is too big a trade-off for me, so instead I'm stuck on Android with its privacy issues and poor availability of security patches. If Apple could just make the iPhone less jarring and restrictive for me to use, pretty please?

Although email is still a vector for me, I did move my purchase history out of Google's view a little while ago - by reconfiguring Amazon, eBay etc to use an address on my own mailserver. That was as a result of seeing this page (you need to be logged in). While it's a given the information is available to Google if it's in your inbox, seeing it extracted and listed out like that really drove a point home.

Now my future search history is also out of Google's hands, another step in the right direction.