Using Django's cache framework with Wagtail

Tags: wagtail, cache

It's always nice to have the opportunity to get stuck into a part of Django/Wagtail that I haven't used much before. Django, yup.  Wagtail, yup.  Caching in Wagtail, not so much.

This was partly an exercise in learning but also meant I could optimise server response for this website.

The situation

The website is hosted on a small droplet and is (so far!) low traffic.  Like most blogs, each post can require a lot of database hits - for its content, author, categories, tags, comments etc. Ditto for a blog index page.  I also use wagtail's StreamFields a lot which can add extra rendering time.

Don't get me wrong, before caching, the site loaded pretty fast (sub 2 seconds, ~350-400ms server response time) but I wanted to get the server response consistently down below 200ms (as Google recommends).

I also wanted to avoid the first user (after cache expiration) having a slow page load experience due to the page being freshly rendered and cached.  It's a low traffic site and I want all my visitors to have a great experience - including googlebot!

The strategy

Cache all bits of the site that change infrequently.

Don't cache dynamic bits like contact forms, comments and the comment form (CSRF tokens require a 'Vary' header, and comments might change frequently).

Cache content for a long time (I chose 8 days) and use cron to refresh the cache periodically (I chose every 7 days, so the cache never actually naturally expires). Add a way to manually refresh the cache after I have updated content.

Use Django's database cache backend.  Memcached, Redis etc are much better but more complex, and I didn't have the space on the droplet. The default LocMem backend is process-specific, so it uses quite a lot of memory (each process keeps its own cache) and it is hard to 'pre-load' the cache for every process. The database backend is easy to set up (the table is created with the createcachetable management command), is not process-specific, and its contents are easy to inspect - it is just another database table.
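
The timings in the strategy above are easy to sanity-check with a little arithmetic. This is only a sketch; the constant names are my own for illustration and don't appear anywhere in the site's code.

```python
from datetime import timedelta

# Illustrative names, not taken from the site's code.
CACHE_TIMEOUT = timedelta(days=8)      # how long a cached fragment stays valid
REFRESH_INTERVAL = timedelta(days=7)   # how often cron rebuilds the cache

# The {% cache %} template tag takes its timeout in seconds.
timeout_seconds = int(CACHE_TIMEOUT.total_seconds())

# Slack between a scheduled refresh and the natural expiry of an entry.
safety_margin = CACHE_TIMEOUT - REFRESH_INTERVAL

print(timeout_seconds)  # 691200
print(safety_margin)    # 1 day, 0:00:00
```

As long as the margin is positive, a visitor should never be the one to trigger a fresh render.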

Implementation

Per view caching is tricky with wagtail, as the standard django 'view' is tucked away in wagtail's Page.serve method.  I dug around on google and drew inspiration from this discussion - thanks guys!

UPDATE

I dug into per-view caching with wagtail and it is actually quite easy - check out Per-View cache with Django & Wagtail CMS

END UPDATE

I went with template fragment caching.  It gives granular control over which parts of a page are cached and for how long.  I could cache each blog post, but not their comments or the comment form.  I could cache the menu block for each url.  I could omit caching the contact page, so the form would work as expected (but the menu would be cached).  I could have a 'messages' block on each page, uncached - you get the idea.

Implementing template caching is quite easy (django docs).  Here's an example:


<!-- blog index page template example -->
{% extends "base.html" %}
{% load blog_tags cache %}

{% block content %}
{% cache 691200 blog_index request.get_full_path request.preview_timestamp %}
<!-- 691200 = 8 days. The next 2 arguments make sure the fragment
 is cached for a specific URL (there are index pages for each tag and category)
  and can be identified in the cache table easily. The last argument is to allow wagtail page previews  -->
     {% include 'spotlight_hero.html' %}
     {% category_menu_strip category %}

<section>
  <div class="container">
    <div class="row">
      <div class="col-sm-12">

    {% for rl in self.related_links.all %}
    <!-- NOTE: this is where the queryset is evaluated, so this database hit will be cached -->
        <p>{{ rl.title }}: <a href='{{ rl.link_page.url }}'>{{ rl.link_page }}</a></p>
    {% endfor %}

        <div class="row">
        {% for blog in blogs %}
                {% include 'blog/blog_post.html' %}
        {% endfor %}
        </div>

      </div>
    </div>
  </div>
</section>
{% endcache %}
{% endblock %}
    

Wagtail page preview

We add a timestamp to the cache signature when previewing so that we actually see the preview rather than a cached one for the same url :-)  This is how:


# Add this to each of your page models to override their serve_preview method
from datetime import datetime

from wagtail.wagtailcore.models import Page


class MyPage(Page):
    # your other class stuff here
    def serve_preview(self, request, mode_name):
        # a unique timestamp gives the preview its own cache key
        request.preview_timestamp = datetime.now()
        return super(MyPage, self).serve_preview(request, mode_name)
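
To see why the timestamp helps, here is a rough sketch of how a fragment's cache key depends on the extra arguments passed to the cache tag. It is a simplified stand-in written with the standard library only, not Django's actual implementation (Django has its own helper, make_template_fragment_key, and the exact hashing differs between versions). The path and timestamp values are made up.

```python
import hashlib


def fragment_key(fragment_name, vary_on):
    """Simplified stand-in: fragment name plus a hash of the vary-on values."""
    hasher = hashlib.md5()
    for value in vary_on:
        hasher.update(str(value).encode("utf-8"))
        hasher.update(b":")
    return "template.cache.%s.%s" % (fragment_name, hasher.hexdigest())


# Normal request: preview_timestamp is not set, so it contributes an empty value.
live_key = fragment_key("blog_index", ["/blog/", ""])

# Preview request: the (made up) timestamp makes the key unique to this preview.
preview_key = fragment_key("blog_index", ["/blog/", "2017-01-01 12:00:00"])

print(live_key != preview_key)  # True
```

Because the two keys differ, the preview is rendered fresh and the cached live fragment is left untouched.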
    

Refreshing the cache

We need a way to trigger caching.  In other words we need to have a function that requests each page/url of our site that we want to cache.


# in my utils.py
# I will want to use this in a management command as well as a view
import requests
from wagtail.wagtailcore.models import Page
from blog.models import BlogCategory, BlogTag

def request_all_pages():
    # get all pages that are not 'Root' ie are a subclass of Page -this will include the home page
    pages = Page.objects.not_exact_type(Page).live()
    print("Requesting " + str(len(pages)) + " pages...")
    for page in pages:
        print(page.full_url)
        requests.get(page.full_url)  # this will create a cache for the page
    # now get all category and tag index page urls
    # in my set up they are regular models and not Page subclasses :-)
    site = pages[0].get_site()
    base_url = site.root_url + "/blog/%s/%s/"
    cats = BlogCategory.objects.all()
    for cat in cats:
        print(base_url % ('category', cat.slug))
        requests.get(base_url % ('category', cat.slug))
    tags = BlogTag.objects.all()
    for tag in tags:
        print(base_url % ('tag', tag.slug))
        requests.get(base_url % ('tag', tag.slug))
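
One weakness of the function above is that a single failed request stops the whole run. Below is a hedged sketch of one way round that: build the URL list first, then request each URL and collect the failures. I have swapped requests for the standard library's urllib so the sketch has no dependencies, and the function names (build_index_urls, fetch_status, warm_urls) are my own inventions for illustration.

```python
from urllib.request import urlopen


def build_index_urls(root_url, category_slugs, tag_slugs):
    """Return the category and tag index URLs for a site root."""
    base_url = root_url.rstrip("/") + "/blog/%s/%s/"
    urls = [base_url % ("category", slug) for slug in category_slugs]
    urls += [base_url % ("tag", slug) for slug in tag_slugs]
    return urls


def fetch_status(url, timeout=30):
    """Default fetcher: GET the URL and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


def warm_urls(urls, fetch=fetch_status):
    """Request every URL and return the ones that failed, instead of stopping early."""
    failed = []
    for url in urls:
        try:
            if fetch(url) != 200:
                failed.append(url)
        except OSError:  # URLError, HTTPError and socket timeouts are all OSError subclasses
            failed.append(url)
    return failed


# Usage with a stand-in fetcher, so nothing here touches the network.
urls = build_index_urls("http://example.com", ["django"], ["cache"])
failed = warm_urls(urls, fetch=lambda url: 200)
print(failed)  # []
```

Passing the fetcher in as an argument also makes the warm-up easy to test without a running site.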
    

We also need a way of expiring the cache.  I looked at my database cache_table and saw that cache_keys for template fragments were named something like this:

:1:template.cache.blog_index.gfhjshj464546474748h....

I use django-compressor and its cache_keys looked a bit like:

:1:django-compressor.js.6575jjdkdk777ll....

So I could identify my template fragment caches and delete them whilst leaving other cached items intact:


# in my utils.py
# cache_table does not map to django's ORM (we have no model) so we use raw sql.
from django.db import connection

def clear_page_cache():
    # just clears all template fragment caches
    sql = "DELETE FROM cache_table WHERE cache_key LIKE '%template.cache.%'"
    with connection.cursor() as cursor:
        cursor.execute(sql)
    print("Page Cache deleted...")
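
The key names make more sense once you know that Django stores each key as prefix:version:key by default. The sketch below picks the template fragment keys out of a list on that basis. It assumes the default key function and a KEY_PREFIX without colons (empty in my case), and the hashes are made up.

```python
def split_cache_key(stored_key):
    """Split a stored key of the form 'prefix:version:key' into its three parts."""
    prefix, version, key = stored_key.split(":", 2)
    return prefix, int(version), key


def is_template_fragment(stored_key):
    """True if the key looks like one written by the {% cache %} template tag."""
    _, _, key = split_cache_key(stored_key)
    return key.startswith("template.cache.")


keys = [
    ":1:template.cache.blog_index.0123abcd",  # made up hash
    ":1:django-compressor.js.4567ef89",       # made up hash
]
fragment_keys = [k for k in keys if is_template_fragment(k)]
print(fragment_keys)  # [':1:template.cache.blog_index.0123abcd']
```

This check is a little stricter than the LIKE pattern in the SQL, which matches 'template.cache.' anywhere in the key.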
    

UPDATE:

Django's template caching, by default, tries to use a cache backend called 'template_fragments', so I could just create another cache table and add it to my CACHES settings config as 'template_fragments'. That way I could just clear that cache like so:


# in settings.py
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'my_cache_table',
    },
    'template_fragments': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'my_other_cache_table',
    }
}

# in utils.py
from django.core.cache import caches

def clear_page_cache():
    cache = caches['template_fragments']
    cache.clear()
    

We can now use these functions in a cronjob once a week to refresh the cache, in deployment (say using Fabric) and manually via a view:


# in app/management/commands/clear_page_cache.py
from django.core.management.base import BaseCommand
import kronos  # kronos will pick up management commands as well as tasks in cron.py
from my_django_project.utils import clear_page_cache


@kronos.register('0 5 * * 0')  # once per week
class Command(BaseCommand):
    help = 'Clear all page caches; leave compress and others intact.'

    def handle(self, *args, **options):
        clear_page_cache()
    

# in app/management/commands/request_all_pages.py
from django.core.management.base import BaseCommand
import kronos
from my_django_project.utils import request_all_pages


@kronos.register('5 5 * * 0')  # just after clearing cache
class Command(BaseCommand):
    help = 'Make a GET request to all live pages on site - to trigger caching.'

    def handle(self, *args, **options):
        request_all_pages()
    

# fabfile.py
# I could run this in deployment
# Note: I had to move imports into the clear_page_cache/request_all_pages functions to run successfully.
from fabric.api import cd, run  # Fabric 1.x API

def deploy():
    # other deployment stuff
    with cd('/path/to/project/'):
        run('python manage.py clear_page_cache')
        run('python manage.py request_all_pages')
    

Next, add a 'Refresh Cache' menu item to the Wagtail Settings menu (will refresh the cache when clicked):


# in views.py
from django.contrib import messages
from django.shortcuts import redirect
from django.contrib.auth.decorators import login_required
from my_django_project.utils import request_all_pages, clear_page_cache

@login_required()
def RefreshPageCache(request):
    try:
        clear_page_cache()
        request_all_pages()
        messages.success(request, 'Page Cache has been refreshed!')
    except Exception:  # report any failure to the admin user rather than a 500 page
        messages.error(request, 'There was a problem refreshing the Page Cache')
    return redirect('/admin/')
    

# in app/wagtail_hooks.py
from django.core.urlresolvers import reverse  # django.urls on Django 1.10+
from wagtail.wagtailadmin.menu import MenuItem
from wagtail.wagtailcore import hooks

@hooks.register('register_settings_menu_item')
def register_refresh_cache_menu_item():
    return MenuItem('Refresh Cache', reverse('refresh-page-cache'), classnames='icon icon-folder-inverse', order=1)

# don't forget to add an appropriate url in urls.py
    

Some thoughts

Optimisation is never ending! You have to stop somewhere (I got a 40-50% improvement in server response time).  This implementation works for now but isn't set in stone.  If things change in the future, it's not much work to change cache timeouts or add other template fragments to the cache with short timeouts, such as a post's comments.

Equally, if I needed to cache fragments or pages that are different for each user (say, an area of the site that users must log in to) it is easy to tweak the cache identifier with the user's unique username/email:


{% extends "base.html" %}
{% load wagtailcore_tags cache %}


{% block content %}
    {% cache 691200 blog_post request.get_full_path request.preview_timestamp request.user.username %}
        {% include 'blog/blog_post.html' with blog=self %}
    {% endcache %}

    {% if COMMENTS_APP %}
    {# add cache for comments here #}    
        {% include 'blog/blog_post_comments.html' with blog=self %}
    {% endif %}

{% endblock %}
    

Finally, I thought I would mention that the mileage you get out of template fragment caching will vary.  Database calls made in the view, before the template stage, will not be cached.  However, it is often possible to move the execution of a queryset into the template layer if you want to - querysets are lazy!
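
Here is a toy illustration of that last point, with no Django involved. LazyRows is a made up stand-in for a queryset that only "hits the database" when iterated, and render_fragment is a made up stand-in for a cached template fragment.

```python
class LazyRows:
    """Stand-in for a lazy queryset: the 'database' is only hit on iteration."""

    def __init__(self, rows):
        self._rows = rows
        self.hits = 0

    def __iter__(self):
        self.hits += 1  # counts simulated database queries
        return iter(self._rows)


fragment_cache = {}


def render_fragment(key, rows):
    """Render a list fragment, touching `rows` only on a cache miss."""
    if key not in fragment_cache:
        fragment_cache[key] = ", ".join(rows)
    return fragment_cache[key]


posts = LazyRows(["First post", "Second post"])
first = render_fragment("blog_index", posts)   # miss: iterates, one query
second = render_fragment("blog_index", posts)  # hit: rows are never touched
print(first == second, posts.hits)  # True 1
```

If the view had turned the rows into a list before rendering, the query would have run on every request, cached fragment or not.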
