Joomla Performance Tweaks for Busy Websites

Joomla! now runs a fair proportion of websites, and its interface obviously appeals to a great many users (over 30 million downloads as of April 2012). For big, busy sites, however, the performance isn't always as good as it could be. It's not bad by any means, but it can certainly be improved upon.

To clarify: by busy, I mean numerous visitors all hitting at more or less exactly the same time.

Aimed at developers and owners of large Joomla sites, the tweaks in this documentation will help you improve the performance of your site. However, they should be considered advice rather than step-by-step instructions: if your site isn't that busy, the database tweaks in particular may actually hinder performance slightly.

If visitors are reporting long load times, especially during busy periods, then these tweaks may be of use to you.

Database

Joomla (generally) uses MySQL, and beyond that the MyISAM database engine. MyISAM gives good, fast read times, but on busy sites it can be something of a performance bottleneck due to one limitation: MyISAM uses table-level locking.

So if 5 of us hit the site at the exact same time, 4 of us will have to wait whilst the sessions table is updated for the first, and then we'll each take our turn. As a visitor, it isn't obvious that this is the issue; the page just takes a long time to load and might even time out.
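If you want some evidence that lock contention is the culprit before changing anything, MySQL's own counters can give a rough indication. The following is only a sketch: the exact wording of the process states varies between MySQL versions, so treat the comments as a guide rather than a guarantee.

```sql
-- Table_locks_waited counts requests that had to wait for a table lock;
-- a value that keeps climbing relative to Table_locks_immediate suggests contention
SHOW GLOBAL STATUS LIKE 'Table_locks%';

-- During a busy period, look for queries whose State shows them waiting
-- on a table lock (older versions simply report "Locked")
SHOW FULL PROCESSLIST;
```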

It's actually quite easily remedied, though: we just switch the engine for the relevant tables to InnoDB, which supports row-level locking (see below for why we don't just switch everything). To do so, run the following SQL queries in MySQL (through phpMyAdmin if you don't like the CLI), and don't forget to swap pref for your database prefix.

SHOW STATUS;
-- Look at the output and ensure it includes entries relating to InnoDB
-- some hosts do disable it

USE `yourdatabasename`;
CREATE TABLE tmp_session LIKE pref_session;
ALTER TABLE tmp_session ENGINE=InnoDB;
INSERT INTO tmp_session SELECT * FROM pref_session;
RENAME TABLE pref_session TO myisam_session;
RENAME TABLE tmp_session TO pref_session;

There, it's that simple. The reason we don't run ALTER TABLE directly against the existing table is so we have room for manoeuvre if something goes wrong: the original table is still there as myisam_session. This way takes a second longer, but we can do it whilst the site is live with minimal disruption.
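As a rough sketch of what that room for manoeuvre looks like in practice, you can confirm the engine of the live table and, if needed, swap the original back in. The name innodb_session is just an assumed name for the set-aside copy; pref is your database prefix as before.

```sql
-- Check the Engine column to confirm the live table is now InnoDB
SHOW TABLE STATUS LIKE 'pref_session';

-- Hypothetical rollback: put the original MyISAM table back in place.
-- Both renames happen in a single statement.
RENAME TABLE pref_session TO innodb_session,
             myisam_session TO pref_session;

-- If everything is fine, the old copy can eventually be dropped instead:
-- DROP TABLE myisam_session;
```

Bear in mind that any rows written to the new table since the switch won't be in the old copy. For sessions that usually just means some visitors have to log in again.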

Another area (if you use them) that suffers from table-level locking is banners, so we run:

USE `yourdatabasename`;
CREATE TABLE tmp_banners LIKE pref_banners;
ALTER TABLE tmp_banners ENGINE=InnoDB;
INSERT INTO tmp_banners SELECT * FROM pref_banners;
RENAME TABLE pref_banners TO myisam_banners;
RENAME TABLE tmp_banners TO pref_banners;
CREATE TABLE tmp_banner_tracks LIKE pref_banner_tracks;
ALTER TABLE tmp_banner_tracks ENGINE=InnoDB;
INSERT INTO tmp_banner_tracks SELECT * FROM pref_banner_tracks;
RENAME TABLE pref_banner_tracks TO myisam_banner_tracks;
RENAME TABLE tmp_banner_tracks TO pref_banner_tracks;

And we've now switched those two over as well.

Why don't we use InnoDB across the Database?

You may have noticed that we omitted the table #__banner_clients, but why? Basically, it's a case of assessing which engine best suits which table. Banner_clients has far more reads than it has writes, and there's little chance of a huge number of writes being attempted simultaneously. More to the point, even if this were to happen, the people left waiting are the admins in the back-end, not the customers/readers on the website.

Changing to InnoDB only makes sense where a table will potentially need to be updated by multiple people at once, and even then you need to assess the regularity. Taking our changes above as an example:

Table              When written to                                                           Primarily written from  Engine
-----------------  ------------------------------------------------------------------------  ----------------------  ------
#__banners         Every page load (if page has banners)                                     Front-end               InnoDB
#__banner_tracks   Every page load (if page has banners and impression tracking is enabled)  Front-end               InnoDB
#__banner_clients  When a client is added                                                    Back-end                MyISAM
#__session         Every page load                                                           Front-end               InnoDB
#__content         When item viewed (hits updated), when item edited or saved                Back-end                MyISAM

Even the above table is subject to caveats - if your site isn't particularly busy, the slower read speed of InnoDB might lead to a performance decrease, and the way in which your users use the site needs to play a part too. There may also be other tables that benefit, but the three in the examples are the core ones that can really slow a busy site down.
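To help with that assessment, it can be useful to see which engine each table currently uses. A minimal sketch, assuming the same placeholder database name as the earlier queries:

```sql
-- List every table in the database with its current engine.
-- TABLE_ROWS is only an estimate for InnoDB tables.
SELECT TABLE_NAME, ENGINE, TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'yourdatabasename'
ORDER BY TABLE_NAME;
```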

 

Caching

This shouldn't need saying, but where possible, have Joomla's caching enabled - if there are modules that you don't want to cache then go in and turn caching off in their settings.

Sometimes, though, using Joomla's cache just isn't an option - I've worked on some sites where the hitcount needs to be as accurate as possible, and delayed updates to items are considered unacceptable.

Depending on your needs, it's reasonably easy to adjust the model of a module so that you can force it to cache despite Joomla's caching being disabled. It's something I've had to do in the past, but not something you should do lightly: if a security update comes out, you need to migrate your changes to the newer version. Sometimes, though, the performance gain is worth the effort.

The best compromise I've found is to add the unchanged module to a Git repo, then commit your changes. When an update is released, you can simply merge the changes into your repo and then package and install the update with your modifications intact.

 

Spread the load

You may or may not be aware that webservers (and, per hostname, browsers themselves) tend to limit the number of simultaneous connections an individual client can establish (as well as a much bigger global limit), so sometimes it's not possible to download everything simultaneously.

If the limit is set to 10, and there are 30 files I need to retrieve (could be CSS, JS, images, fonts etc), then my load time is going to be slower as a result (especially if some of the files are big and I'm on a slow connection).

Some Joomla! templates and extensions will automatically combine all JS into one file, and then do the same for the CSS, but I don't particularly like this as it introduces a serious single point of failure, and some implementations completely prevent browser-side caching.

The route I've largely settled on is to install and configure the JoomlArt Amazon S3 extension, for a number of reasons:

  • Load times directly from S3 are good, and from CloudFront very good worldwide
  • If correctly configured, you can parallelise downloads across multiple hostnames

It does take a little bit of setting up the first time, but the JA documentation is pretty good if you do get stuck. The main things to note (from a performance perspective) are:

  • Set up an individual bucket for each area of your site that hosts static content (e.g. one for templates, one for /media, one for /images etc)
  • Set the component to use subdomains (e.g. btasker.s3.amazonaws.com, not s3.amazonaws.com/btasker)

What this does is provide a different hostname for each of the content areas. So imagine that, out of the 30 items I needed to load, there were:

  • 8 in each of the three buckets
  • the remaining 6 on the server

My browser would be able to download them all at once, rather than just 10 at a time.

There's also little reason to load MooTools and jQuery from your server, so install the Google Ajax plugin and have visitors load the libraries from GoogleCode. If they've visited any other site that does the same, the library will probably already be in their browser cache. Even if it isn't, it's still one less request for your server to have to handle.

 

Choose extensions carefully

The quality of extensions is always important, but the bigger the site, the more drastic an impact an individual extension or customisation can have. As an example, I worked on a (largely) K2-based site with well over 100,000 K2 items. The customer wanted a multi-category extension to K2, so we located one and installed it.

From that point on, performance well and truly sucked. 

One of the major issues was that, in order to support multi-category, the extension had changed #__k2_items.catid from INT to a varchar (to store comma-separated category IDs) and then updated various queries to use FIND_IN_SET. On a smaller site (indeed, before we imported their existing data) you probably wouldn't have noticed, but on a site of that size it caused serious issues. Despite running on a 12-core server, the site was giving (at peak periods) load times of up to 30 seconds, though it was sharing the server with a few massive sites, most of which had caching disabled. To make things worse, it was using all the available MySQL connections and so slowing the other sites too.

Ultimately the route I took was to adjust the extension so that it used a mapping table, allowing database indexes to be used more effectively (although indexes were there, a FIND_IN_SET lookup against a comma-separated column can't make proper use of them, so the queries were in effect scanning the entire table).
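To illustrate the idea, here's a minimal sketch of what such a mapping table might look like. The table name pref_k2_item_categories, its columns and the category ID 42 are purely illustrative assumptions, not the schema of any particular extension.

```sql
-- Hypothetical mapping table: one row per (item, category) pair
CREATE TABLE pref_k2_item_categories (
    item_id INT UNSIGNED NOT NULL,
    catid   INT UNSIGNED NOT NULL,
    PRIMARY KEY (item_id, catid),
    KEY idx_catid (catid)
) ENGINE=InnoDB;

-- Before: comma-separated IDs mean every row has to be examined
-- SELECT * FROM pref_k2_items WHERE FIND_IN_SET('42', catid);

-- After: the category lookup can be satisfied from the index on catid
SELECT i.*
FROM pref_k2_items AS i
JOIN pref_k2_item_categories AS m ON m.item_id = i.id
WHERE m.catid = 42;
```

The trade-off is an extra join, but the item-to-category relationship becomes something the database can index, and catid in the items table can go back to being a plain integer.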

Between implementing that and changing to InnoDB, the big sites now load in less than 3 seconds, and less than 1% of queries are taking longer than 1 second to complete. Still room for improvement, certainly, but it shows the difference that a few carefully thought out changes can make.

The moral being, when working on a large/busy site (this one wasn't live!), it's well worth looking at how each extension works and considering the performance implications. Some things just don't scale particularly well. Some of the Component Content Management Systems (CCMS) are great from a feature perspective, but don't all scale particularly well, especially if you're using them in a slightly different way to the intended purpose.

 

Don't limit yourself

This is more of a personal thing than a tip really: when planning the project, make sure you allow time for performance tweaking. Joomla is incredibly flexible, and there's no reason you shouldn't be able to do anything you want with it, so avoid tying yourself into such a tight deadline that you have to compromise because a desired feature destroys performance.

In the multi-category example above, the alternative was that the site owner could either maintain a copy of the item for every category they wanted it in, or only post to 1 category. Both pretty poor alternatives really, but with a little work, the original aim was achieved without adversely affecting performance.

 

Don't hack the core

There may be occasions when it's acceptable to hack about with Joomla's core, but I can't think of any. If you make changes to Joomla's core files you'll be less able to deploy updates (at least not without losing your changes), which in practice means there's a good chance the site won't be updated and will be subject to whatever vulnerabilities are discovered.

From a performance perspective, the changes you make today may well degrade performance tomorrow. Requirements change, and you need to maintain the flexibility to react easily. Core hacks often get forgotten about, at least until the next upgrade overwrites them!

 

Conclusion

Joomla is a very capable Content Management System, but as with any system, careful consideration needs to be given when planning projects of a large scale. There is no one-size-fits-all solution; you need to analyse the requirements of your site, the capabilities of the server and the expected usage pattern.

Using good analytics can help you to make these decisions: if you know how users actually use your site, it's easier to assess what will improve performance. There's very little point focusing on optimising category pages if 99% of your visitors never load them, but unless you know the usage patterns this is probably exactly what you'll end up doing.