Making your Joomla Site Fly with NGinx Reverse Proxy Caching

I've written previously about configuring NGinx to act as a reverse proxy for Apache, as well as some of the specific tweaks you need to make if you're serving a Joomla! based site. In this documentation, we're going to look at how to use NGinx's Reverse Proxy caching feature to make your site really fly.

There are a small number of technical hurdles which we'll overcome to ensure that the user experience is fast and smooth without losing interactivity on those sites which demand it.

 

Contents

  1. Introduction
  2. Technical Challenges
  3. What we're going to do
  4. Configuring the plugin
  5. Setting Cookies
  6. Configuring NGinx
  7. When Not To Cache
  8. Conclusion

 

Introduction

Joomla! includes its own cache, but even with that enabled, the time to first response can still be a little too high on some sites (especially if you have a lot of content and are using a third party SEF component, and/or have a templating framework installed).

The way around this is to cache a little further downstream. In this documentation we'll be assuming that you've already followed the steps to have a (non-caching) NGinx Reverse Proxy installed and configured for use.

 

Technical Challenges

They won't affect all sites, but there are a number of technical hurdles to overcome;

  1. We don't want all pages to be cached (especially the back-end)
  2. We don't want to change what we tell browsers about caching
  3. We want to be state-aware

From a security perspective, items 1 & 3 are especially important. If a logged-in user views their profile page we don't want their details to be shown to every user that tries to access that URL!

Item 2 can also be important. With NGinx's caching we have the option to quickly and easily flush the cache; if a browser caches the response, it's not so easy. So we probably want to continue to send our existing 'no-cache' value for Cache-Control.

 

What We're Going To Do

In order to be aware of the various states of the system, we need to do a little configuration at the Joomla! end as well as configuring NGinx appropriately. NGinx, of course, has no understanding of Joomla! and so can't check whether a user is logged in or not. It'd be silly to try and give NGinx a direct understanding of Joomla, so we're going to communicate with it using custom HTTP headers.

To aid in this, I've created a plugin allowing us to conditionally send custom HTTP headers. Before we start though, we really need to summarise exactly what it is we want;

Send a custom header AND set a cookie IF (user is logged in) OR (requested page is in /administrator) OR (requested page is in a specific component).

Taking this site as an example, the shop contains both a cart and a login module. We don't want to present a cached version of either, so we want to ensure that the relevant pages are always retrieved from the Origin. Similarly, we don't want to present information that's only available to logged in users to one who's not logged in - which is exactly what will happen if we allow it to be cached.
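
The rule above can be sketched as a small shell function. This is purely illustrative: the function name should_bypass_cache and its three arguments are hypothetical, and it is not how the plugin itself is implemented. The component names are the ones used as examples in this documentation.

```shell
#!/bin/sh
# Hypothetical sketch of the decision rule, not the plugin's actual code.
# Usage: should_bypass_cache <logged_in: yes|no> <request_path> <component>
# Returns 0 (true) when the response must NOT be cached.
should_bypass_cache() {
    logged_in=$1
    path=$2
    component=$3

    # Rule 1: a logged-in user must never be served from, or written to, the cache
    [ "$logged_in" = "yes" ] && return 0

    # Rule 2: anything under /administrator is the back-end
    case "$path" in
        /administrator|/administrator/*) return 0 ;;
    esac

    # Rule 3: components we always want fetched from the Origin
    case "$component" in
        com_users|com_jshopping) return 0 ;;
    esac

    # Nothing matched, so the page is safe to cache
    return 1
}
```

For example, `should_bypass_cache no /shop com_jshopping && echo "send header"` would print "send header", whereas an anonymous request for an ordinary article would not.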

 

Configuring the Plugin

Once the plugin is installed, visit Plugin Manager -> Send Custom Header and use the following settings;

  • Status: Enabled
  • Headers: X-Dont-Cache-Me=no-cache
  • Cookies: JNoCache=True
  • Debug Mode: On
  • SSL Connections: Obey Rules
  • Run On: Front and Back-End
  • Send When: User Logged In
  • Always Run On: com_users, com_jshopping     <- I'm excluding the shop here
  • Never Run On: empty
  • Always on URLs: /administrator,/administrator/,/administrator/index.php
  • Never on URLs: empty

Save the plugin and you should notice no immediate difference on the site, but the change is in the HTTP headers. To view these, either use Live HTTP headers in Firefox, Developer Tools in Chrome or run the command below (use your domain and a valid URL).

telnet www.bentasker.co.uk 80
GET /shop HTTP/1.1
host: www.bentasker.co.uk

Notice the blank line (double carriage return) after the host header. You'll get a large amount of output, but at the very top you should see something like:

HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Sun, 18 Aug 2013 16:53:51 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
X-Dont-Cache-Me: no-cache
Set-Cookie: JNoCache=True
X-Always-Rule-Applies: True
Cache-Control: no-cache
Pragma: no-cache
Vary: Accept-Encoding

In the example above, the bits we're interested in are

X-Dont-Cache-Me: no-cache
Set-Cookie: JNoCache=True
X-Always-Rule-Applies: True

The first two are the headers we've asked the plugin to set (the second indirectly, as we told it to set a cookie). The last is part of the debug output; in this instance the component behind the URL used was in the Always Run On list. We could expect to see any of the following, though:

  • X-Disabled-On-SSL: If present, it means the connection is over SSL but the plugin has been disabled for SSL connections
  • X-Force-Enabled-On-SSL: If present, the connection is over SSL and the plugin has been set to force enabled for these connections
  • X-Always-Rule-Applies: The Requested resource correlates with an entry in Always Run On or Always Run on URLs
  • X-Never-Rule-Applies: The Requested resource correlates with an entry in Never run on or Never run on URLs or is a back-end resource when Run On is set to Front-End (or vice-versa)
  • X-User-Login-State-Incorrect: The plugin is set to only run when a user is logged in, and no user is logged in (or vice versa)
  • X-All-Rules-Satisfied: The plugin worked through all rules successfully, the header was sent but not because of being explicitly forced to

 We obviously wouldn't expect to see the SSL headers when telnetting to port 80, but if you're using Firefox/Chrome to access via HTTPS you might.
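
If you'd rather script this check than read through telnet output, something along these lines may help. It is a minimal sketch: the function name check_nocache_headers is made up for illustration, it assumes the header and cookie names configured above, and the commented usage assumes curl is installed.

```shell
#!/bin/sh
# Hypothetical helper: reads raw HTTP response headers on stdin and reports
# whether the plugin has marked the response as one that must not be cached.
check_nocache_headers() {
    # Strip carriage returns, then look for our custom header or cookie
    # (header names are matched case-insensitively)
    if tr -d '\r' | grep -Eqi '^(X-Dont-Cache-Me:|Set-Cookie:[[:space:]]*JNoCache=)'; then
        echo "not cacheable"
    else
        echo "cacheable"
    fi
}

# Example usage (substitute your own domain and a valid URL):
#   curl -s -D - -o /dev/null http://www.example.com/shop | check_nocache_headers
```

Bear in mind that once NGinx has been told to hide the custom header from browsers, a check like this would need to be run against the Origin (Apache) directly rather than through the proxy.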

We're now happy that Joomla is configured to behave as we want, so the next step is to configure NGinx to take the headers into account.

 

Setting Cookies

There is one thing to note with setting a cookie as we've done above. If a user visits a page that has been explicitly included (such as the shop on my site), a cookie will be set which will also disable caching. However, this will have the consequence of bypassing caching for all pages the user visits on your site until the cookie expires at the end of their session.

The reason I've used it in the example above is simply to show that it can be done. Generally speaking, it would be better to rely solely on the headers.

 

Configuring NGinx

We now need to configure NGinx to both cache and take the headers into account. To begin with we need to tell NGinx where to store cache files

nano /etc/nginx/nginx.conf

# We need to add the following within the http block
proxy_cache_path /var/www/cache levels=1:2 keys_zone=my-cache:8m max_size=1000m inactive=600m;
proxy_temp_path /var/www/cache/tmp;

Save and close. Next we need to configure our server block so that the proxied requests are cached. Assuming we have the following

server {
    listen 80;

    root /var/www/vhosts/example.com/public_html;
    index index.php index.html index.htm;

    server_name www.example.com;

    location / {
        try_files $uri $uri/ /index.php;
    }

    location ~ \.php$ {

        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
        proxy_pass http://127.0.0.1:8080;

    }
    location ~ /\.ht {
        deny all;
    }
}

We want to adjust this to add the cache, but only store a response if our custom header isn't present in it. Finally, we're going to hide our custom header from the browser, as there's no real need to send it on.

server {
    listen 80;

    root /var/www/vhosts/example.com/public_html;
    index index.php index.html index.htm;

    server_name www.example.com;

    location / {
        try_files $uri $uri/ /index.php;
    }

    location ~ \.php$ {

        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
        proxy_pass http://127.0.0.1:8080;

        # Define which cache to use
        proxy_cache my-cache;
        # Define which responses to cache and for how long
        proxy_cache_valid 200 302 60m;
        proxy_cache_valid 404 5m;

        # Don't cache if our headers (or cookie) are present
        proxy_no_cache $upstream_http_x_dont_cache_me $cookie_jnocache;
        proxy_cache_bypass $upstream_http_x_dont_cache_me $cookie_jnocache;

        # Ignore the standard no-cache headers - these will still be sent to the browser
        proxy_ignore_headers X-Accel-Expires Expires Cache-Control;

        # Don't send our custom header to the browser
        proxy_hide_header X-Dont-Cache-Me;

        # This next line is important if we're planning on caching for more than one site on the server
        proxy_cache_key "$scheme$host$request_uri";
    }
    location ~ /\.ht {
        deny all;
    }
}

We've now configured our Virtual Host to do the following

  1. Write cache files to 'my-cache'
  2. Cache requests resulting in an HTTP 200 or 302 for 60 minutes
  3. Cache requests resulting in a 404 for 5 minutes
  4. Don't write to the cache if our custom header and/or cookie are present (proxy_no_cache)
  5. Don't use existing cache content if our custom header and/or cookie are present (proxy_cache_bypass)
  6. Ignore the cache-control headers sent by the Origin server (but still pass to the browser)
  7. Don't send our custom header to the browser

Now, all we need to do is restart NGinx and we should be good to go. 

service nginx restart
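
Once NGinx is caching, you may occasionally want to clear out what it has stored. One blunt but commonly used approach is to delete the files under the cache path. The sketch below assumes the proxy_cache_path (/var/www/cache with levels=1:2) and proxy_cache_key ("$scheme$host$request_uri") used above; NGinx names each cache file after the MD5 of the cache key. The function names are hypothetical, md5sum is assumed to be available, and you'll need sufficient permissions on the cache directory.

```shell
#!/bin/sh
# Illustrative helpers only; the function names are made up for this example.

# Print the path of the cache file for a single URL.
# Usage: cache_file_for <cache_dir> <scheme> <host> <request_uri>
cache_file_for() {
    cache_dir=${1%/}
    # The file name is the MD5 of the cache key ($scheme$host$request_uri)
    hash=$(printf '%s' "$2$3$4" | md5sum | cut -d' ' -f1)
    # levels=1:2 -> the first directory is the last character of the hash,
    # the second directory is the two characters before that
    level1=$(printf '%s' "$hash" | cut -c32)
    level2=$(printf '%s' "$hash" | cut -c30-31)
    printf '%s/%s/%s/%s\n' "$cache_dir" "$level1" "$level2" "$hash"
}

# Delete every cached response, leaving the temp directory untouched.
# Usage: flush_nginx_cache <cache_dir>
flush_nginx_cache() {
    cache_dir=${1%/}
    [ -d "$cache_dir" ] || return 1
    find "$cache_dir" -path "$cache_dir/tmp" -prune -o -type f -exec rm -f {} +
}
```

To refresh just the shop page you might run `rm -f "$(cache_file_for /var/www/cache http www.example.com /shop)"`, while `flush_nginx_cache /var/www/cache` empties the whole cache. The next request for an affected URL should then be fetched from the Origin and cached afresh.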

 

When Not To Cache

We've already identified a few circumstances where we wouldn't want pages to cache, but it's important to think carefully about what your requirements are and the effect these have on caching. In the following scenarios you'll either want to exclude caching from certain URLs/components, not use caching or change the way your site works;

  1. The accuracy of Joomla's hit counts is important to you
  2. You use com_banners for paid advertising
  3. You use login modules rather than a login page
  4. You use anything that includes an anti-CSRF token
  5. Users are required to login to access any content on your site
  6. You geo-locate and serve different content to different regions of the world

The basic rule of thumb is that, unless you exclude it, something that could be presented to one user will (potentially) be presented to all. That's why the shop is excluded on my site: if you log in, I need to be sure the next user won't see your details!

Hit counts are affected because content will be served from the cache, so Joomla will never see, let alone act upon, the request. The hit count will only increment when the cached copy is stale and the request reaches Joomla.

Com_banners suffers from a similar issue, in the sense that banners won't rotate whilst the content is being served from the cache. It'll only be when the cache is refreshed that they'll rotate. Any advertising systems using JavaScript (such as Google Adsense) aren't affected by this, as the ad content is loaded by the browser.

Anti Cross Site Request Forgery tokens tend not to work well with caching. The anti-CSRF token a user submits needs to be correct, but it won't be if it's been loaded from a cache, as the tokens change for every request. Joomla login forms in particular use anti-CSRF (which is why we excluded /administrator when configuring the plugin above).

If users aren't permitted to access content until they've logged in, there's very little point in enabling this type of caching as you'll have to exclude everything!

Finally, login modules can be problematic. They generally contain an anti-CSRF token, so every page they appear on needs to be excluded from caching. An alternative method is to replace them with a link to a login page; the login page itself can then be excluded.

 

Conclusion

Depending on the performance of your site (and visiting pattern), setting up this type of caching may well be worthwhile, as you can easily reduce the time to first packet dramatically. It's reasonably easy to set up, but does require a little thought beforehand to ensure that information isn't accidentally leaked.

There are situations where you might decide that caching in this manner isn't an option, and I haven't yet figured out a way to use headers to force NGinx to flush its cache (as an AfterSave plugin might attempt), but for many sites it could lead to a marked increase in performance whilst also improving on the potential capacity of the server.

 

Next: Joomla and NGinx Reverse Proxy Caching: Keeping your dynamic content fresh