Screenshot Social Media, Don't embed

Ben Tasker

2019-09-22 11:55 (updated 2023-07-24 10:11)

Ever since the web was born, there have been concerns about preserving what's published on there for future generations. That's why things like the Wayback Machine exist. Our approach to, and concerns around, online privacy have also evolved with time.

But, the way we communicate on the web has changed pretty dramatically. Personal blogs are still a thing, but humanity has increasingly leaned towards communicating via social media - Twitter, Facebook etc.

Now, we increasingly see news reports with embedded posts containing expert commentary about the topic of the news, and even reports about something someone has posted.

Those expert commentators are even occasionally being asked to change the way they tweet to make it easier for news sites to embed those tweets into their own stories (that request turned out to be from Sky News btw).

For all their many, many faults, the social media networks are a big part of how we communicate now, and posts on them are embedded all over the place.

This brings with it a number of avoidable, but major issues.

The aim of this post is to discuss those, and explain why you should instead be posting a screenshot of the tweet/post.

I'm going to refer to "Twitter" and "Tweets" a lot, purely because it's shorter than "Facebook" or "Social Media", but the concerns here apply across the board.

What Do I Mean by Embed

The social media companies much prefer it when you use their embed functionality to display SM posts elsewhere. That is, rather than copy/pasting the tweet (or screenshotting it), you write something like the following into the markup of your page:

<blockquote class="twitter-tweet">
  <p lang="en" dir="ltr">
      As a parent of young kids I believe in parenting and using the tools that are available.
      <br><br>Course, I&#39;m not trying to win votes with soundbites that ignore reality 
      <a href="https://t.co/pNK2WlmD3j">https://t.co/pNK2WlmD3j</a>
  </p>&mdash; Ben Tasker (@bentasker) 
  <a href="https://twitter.com/bentasker/status/1175368613894795265?ref_src=twsrc%5Etfw">
     September 21, 2019
  </a>
</blockquote> 
<script async 
    src="https://platform.twitter.com/widgets.js" 
    charset="utf-8">
</script>

This will result in readers seeing something like this:

As a parent of young kids I believe in using the tools that are available. Course I'm not trying to win votes with soundbites that ignore reality

It presents really nicely, and in many ways can add to an article quite well.

So, what's the problem?

Preservation of History's Context

I'll open this section by noting that Twitter - in particular - have got much better in this respect, though the underlying concern isn't entirely gone.

Although the embed code now includes the wording from the tweet (it didn't previously), you're still relying on a third party continuing to support embedding, or (in the worst case) to exist at all, in order to have the content shown in the way, and in the context, you wanted.

The problem is, when Twitter goes away, what happens then? True, they may not vanish entirely and might instead continue on as a shadow of their former self. Even then there's risk - Myspace lost all data older than 2016 in a botched server migration.

Whatever the cause, if Twitter goes away, then you're left with just the text in the <blockquote> portion. For a lot of tweets, that won't matter too much, but in the tweet above I'm quoting someone else. That's a pretty important piece of context that will no longer be available, and its absence can entirely change the way a post is read.
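As a rough illustration of what "just the text in the <blockquote> portion" means, the sketch below uses Python's standard `html.parser` to pull out the text a reader is left with when the widgets script never runs. The markup is the embed code shown earlier; the class and function names are made up for this example, and a real browser's fallback rendering will differ in layout.

```python
from html.parser import HTMLParser


class BlockquoteText(HTMLParser):
    """Collect the text inside <blockquote> elements and ignore the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0   # number of <blockquote> elements currently open
        self.parts = []  # text fragments found inside them

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def fallback_text(markup: str) -> str:
    """Return roughly what is left to read when the embed script does not load."""
    parser = BlockquoteText()
    parser.feed(markup)
    parser.close()
    # Collapse the whitespace left over from the markup's indentation
    return " ".join("".join(parser.parts).split())


EMBED = """
<blockquote class="twitter-tweet">
  <p lang="en" dir="ltr">
      As a parent of young kids I believe in parenting and using the tools that are available.
      <br><br>Course, I&#39;m not trying to win votes with soundbites that ignore reality
      <a href="https://t.co/pNK2WlmD3j">https://t.co/pNK2WlmD3j</a>
  </p>&mdash; Ben Tasker (@bentasker)
  <a href="https://twitter.com/bentasker/status/1175368613894795265?ref_src=twsrc%5Etfw">
     September 21, 2019
  </a>
</blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
"""

print(fallback_text(EMBED))
```

The quoted tweet is reduced to a bare t.co link: the words survive, but the thing they were responding to does not.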

The loss of that context can occur even without Twitter suddenly disappearing:

Getting a bit meta but you should ignore both of these

It's a bit of a contrived tweet, but it's fairly obvious here that "both of these" refers to the quoted tweet. If I'd tweeted "everything here is a lie", you'd understand exactly what was a lie.

But, what happens if the person being quoted didn't like being called out on their lie, and so deletes that tweet (or blocks the person quoting them)?

The context is gone.

If your news article was about my arguing with myself, the tweet you embedded is no longer particularly useful for highlighting that. If I went one further and deleted the tweet you embedded then:

Now you've lost both the context and the styling that made you embed in the first place.

If it was important to support your post (or even the focus of it) the value of your content has dropped fairly dramatically.

When you embed, you're ultimately relying on a single source - the social media network - for the content that you're embedding, despite the fact that the users featured within it have the ability to click "delete".

This may seem unimportant given the silliness of the tweets above, but remember that Tweets are being embedded to add context to stories about the Supreme Court challenge to Boris Johnson's Prorogation of Parliament, with commentators such as Joshua Rozenberg having tweets embedded left right and centre.

Whatever the judgement in that case, the events we're witnessing at the moment are going to become part of our constitutional history, and as a result are quite likely to be something of a historical curiosity in future. There are various bits of analysis and commentary that are at risk as a result of simply embedding tweets rather than actively preserving them.

Privacy

Some may consider this an unimportant factor, but if you're publishing content then you're presumably hoping for quite a number of eyeballs to look at it.

When you embed a tweet, each and every one of those views will mean that the user's browser has had to contact the relevant social media network. And what do social media companies like to do? Track everyone as much as they can.

By embedding tweets, you're allowing a third party to track your visitors without those visitors' consent. Twitter now know that that user has visited your site.

In fact, if you dig down into the technical details a little more, it's worse than that. Twitter are deliberately collecting information on the site.

Let's say that you want to protect your users' privacy a bit, so you add a Referrer-Policy header or meta tag to your site, preventing visitors' browsers from including a Referer header when contacting Twitter.

It makes absolutely no difference, because the JavaScript they embed deliberately sends the information back in the query string:

That's just the initial loading stages too; later in the process they POST themselves a lot more.

POST out to twitter showing the exact page I'm on

They've got a bunch of JavaScript events attached too, to submit more information based on what your visitors do around that tweet.
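To see why a referrer policy doesn't help here, it can be useful to model the two channels separately. The sketch below is a simplified Python model, not Twitter's actual code: the endpoint, the `origin_page` parameter name and both function names are hypothetical, and only the `no-referrer` policy is modelled (anything else is treated as sending the full URL). It shows that suppressing the Referer header does nothing about a URL that a script has copied into its own request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def browser_headers(page_url: str, referrer_policy: str) -> dict:
    """Headers the browser would attach to a third-party request.

    Simplification: only 'no-referrer' is modelled; any other policy
    is treated as sending the full page URL.
    """
    if referrer_policy == "no-referrer":
        return {}
    return {"Referer": page_url}


def widget_request_url(page_url: str, endpoint: str) -> str:
    """Mimic an embedded script that reads the page address and appends it.

    'origin_page' is a made-up parameter name used for illustration.
    """
    return f"{endpoint}?{urlencode({'origin_page': page_url})}"


page = "https://example.com/articles/something-private"

headers = browser_headers(page, "no-referrer")
url = widget_request_url(page, "https://thirdparty.example/widget")
leaked = parse_qs(urlsplit(url).query)["origin_page"][0]

print("Referer header sent:", "Referer" in headers)
print("Page URL in query string:", leaked)
```

The policy controls what the browser volunteers; it has no say over what a script running in your page chooses to send.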

This is all information that's being aggregated in the background, and your visitors don't really get a say in it. Those that decide they want to prevent this, and (like me) use an adblocker to prevent the embeds from loading are taken back to the issues described above.

This is what I see when I visit a page using embeds

There's a complete loss of context. If you visit the tweet itself, the user has quoted one of their earlier tweets, and their comment is in relation to that.

That example is taken from a page titled "15 tweets everyone should've read this week", and the majority of the tweets on the page rely on some context - a quote or an image - that isn't visible when the embed fails. As a result, that page, rather than being a (potentially) funny read is a waste of time.

If you're serving content to Europeans, then this could present a real risk to you under GDPR. It's already been found by the European Court of Justice that Facebook's "Like" button is problematic, and that website owners who embed it can be held liable for Facebook's processing of data as a result of a button being embedded.

Although an embedded tweet offers a little more value to the user than an embedded "like" button, the principle remains that both could just as easily just be an image served from your server instead.

Safety

By embedding, you're also building in a hook out to third-party JavaScript which may, or may not, be safe at any given point. Social media companies get compromised too, and you'll notice that the embed code I posted above doesn't make use of any safety features like Subresource Integrity (SRI) checks - because the SM networks want to be able to update the code you're pulling in.

Of course, far off in the distant future, twitter.com might not even be owned by Twitter any more - meaning the new owner of that domain can inject whatever JavaScript they want into your pages.

Alternatively, it may not be that far in the distant future, and instead might be a government poisoning DNS so that they can serve up modified tracking code as broadly as possible - the SSL cert may only be viewed as a hurdle by some governments. I've not heard of any instances of (say) platform.twitter.com being targeted by DNS poisoning in this manner, but it'd certainly make a juicy target.
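For comparison, this is roughly what an SRI check involves when you do pin a third-party script. The sketch below computes an `integrity` value (a base64-encoded SHA-384 digest prefixed with `sha384-`) using Python's standard library. The CDN URL and script contents are placeholders, and the function names are made up for this example.

```python
import base64
import hashlib
from html import escape


def sri_value(script_bytes: bytes) -> str:
    """Return a Subresource Integrity value (SHA-384) for the given content."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def pinned_script_tag(src: str, script_bytes: bytes) -> str:
    """Build a <script> tag that pins the exact bytes the page expects."""
    return (
        f'<script src="{escape(src, quote=True)}" '
        f'integrity="{sri_value(script_bytes)}" '
        'crossorigin="anonymous"></script>'
    )


# Placeholder content standing in for a third-party library
original = b"console.log('v1');"
tampered = b"console.log('v1'); doSomethingElse();"

print(pinned_script_tag("https://cdn.example/lib.js", original))
print("Hashes match after tampering:", sri_value(original) == sri_value(tampered))
```

A browser that supports SRI refuses to run a script whose digest doesn't match, which is exactly why a provider that wants to ship silent updates can't offer it.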

All Avoidable

Even if you're feeling slightly dubious about the severity of the claims above, the reality is that they're all trivially avoidable.

Simply screenshot the post (as I've done above) instead, and include some/all of the post in the image's alt attribute. You can even link it out to the tweet if you desire:

<a href="https://twitter.com/bentasker/status/1175368613894795265">   
   <img src="/images/BlogItems/Screenshot_20190922_110537.png" 
   alt="As a parent of young kids I believe in using the tools that are available. Course I'm not trying to win votes with soundbites that ignore reality" />
</a>

That way, the text is still indexed by search engines, the context within the tweet/post can never be lost from your content, and visitors to your page don't wind up sending a ping into the data-hungry social media networks.
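If you're generating this markup from a template or script, the one thing to watch is escaping: post text often contains quotes and ampersands, which would otherwise break the alt attribute. A minimal sketch in Python follows, with a hypothetical helper name, using the standard library's `html.escape`:

```python
from html import escape


def screenshot_markup(image_src: str, post_url: str, post_text: str) -> str:
    """Build a linked <img> whose alt attribute carries the post's text."""
    return (
        f'<a href="{escape(post_url, quote=True)}">\n'
        f'   <img src="{escape(image_src, quote=True)}"\n'
        f'   alt="{escape(post_text, quote=True)}" />\n'
        "</a>"
    )


print(screenshot_markup(
    "/images/BlogItems/Screenshot_20190922_110537.png",
    "https://twitter.com/bentasker/status/1175368613894795265",
    "As a parent of young kids I believe in using the tools that are available. "
    "Course I'm not trying to win votes with soundbites that ignore reality",
))
```

The image itself still needs to be captured and uploaded by hand (or by whatever tooling you prefer); this only covers the markup around it.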

Conclusion

Obviously, what I've laid out here cannot apply to all content - you've little choice but to embed a YouTube video, for example.

For text and static image posts though, using SM embeds really does seem like an incredibly short-sighted, even potentially harmful, move.

Given that you included the tweet in your content, you obviously felt that it provided important context. Yet, by embedding you've inserted a reliance on a single point of failure, leaving yourself open to the risk that that context could easily be removed.

Even if you feel the tweet doesn't really add much value, what's the cost of a visitor getting frustrated at the broken tweets and wandering off to try and find a less broken source, rather than exploring your other content?

For a subset of your visitors, that desirable context will already have been removed - either because the user blocks Twitter (for privacy reasons), or because they're on a network where the admin have blocked it "for" them. That "admin" may very well be an oppressive government.

For those visitors who don't have Twitter blocked, you've also exposed their browsing habits to the social media networks, whilst increasing the possibility that malicious code will be run in their browser.

And all this, for what? In the vast majority of cases, you've gained nothing compared to hitting "Print Screen" on your keyboard.

Personally, I'd argue that if we care about the content we're writing, then we should also care about ensuring that the context in which it's written is preserved.

That's far too important a task to trust to a bunch of unknown third-parties, so even if you're not that bothered about your visitors' privacy, you should be using screenshots instead of embed code. If the cost/benefit of that seems too high, what was the perceived benefit in including the tweet in the first place?

Update: 2023-07-24

When I wrote this post, I suggested that - one day - Twitter could disappear. Even when writing that, I never quite envisaged Twitter's downfall happening in the way that it has transpired. As someone else wrote, Musk hasn't killed Twitter, he's just made it irrelevant.

In 2022, Twitter was bought (against his own will) by the world's richest man - Elon Musk - following him having publicly offered a stupid price and then trying to back away after the prospect of his involvement drove advertisers away, tanking Twitter's value. Musk eventually completed the purchase and took control of Twitter in October 2022.

What followed was a period of lay-offs and abject mismanagement (losing more advertisers, emboldening racists and causing outages), some of which you can read more about in "A Week on Twitter".

One of the predictable results of these changes to Twitter was that significant portions of its userbase started leaving (Infosec discussions, for example, no longer generally happen on Twitter having moved to the fediverse) with some deleting their tweets as they left.

Since then, Musk has also made changes to the accessibility of content on Twitter: users need to be logged in to view tweets (although this doesn't seem to be consistently enforced). Even when logged in, users face rate limits controlling how many tweets can be viewed per day, transforming Twitter from a public square into a walled (and incredibly limited) garden.

In fact, so deep are the changes, that even the Twitter brand is going away, to be replaced by "X" (x.com/bentasker currently seems to redirect to twitter.com/bentasker).

Screenshot of the @twitter account carrying the X logo. They've not changed the handle though

This re-brand is likely to be more about Musk's earlier ousting from X.com than it is an attempt at reputation-washing in the way that Facebook became Meta (which, incidentally, didn't work out very well).

Needless to say, one of the results of Twitter having been wrapped so tightly in a Musk branded ball-of-fail is that existing Twitter embeds into news stories and posts are now broken and seem quite likely to remain that way (even before he took over, Musk had suggested he might charge for Tweet embeds).

Whatever context those embeds had added is now lost. The extent to which that matters obviously depends on each embed's relation to the post itself. It's easy to shrug and say that Tweets can't be that important but, just 3 years ago, the US had a president who effectively ruled via Twitter (and was even accused of inciting an insurrection via that same medium).

Most tweets may be throw-away, but some almost certainly carry historical significance, leading to something of value having been lost when the embeds broke.

So, whichever social media service you're quoting, whether it's Mastodon, Post, Bluesky, Threads, Facebook or something else - screenshot those posts, don't embed them.