Looking at coronavirus.data.gov.uk

Update 5th May: They have removed stuff they don’t need from the JSON, so it is now down to 700K (3.3MB total, still up from when I originally wrote this blog post). Hopefully they will get gzipping working on it soon.


Update 28th April: They have added lower-tier and male/female data to the JSON data, adding over a megabyte to the JSON file request, it’s now 1633K, total transfer of 4.2MB.


Last weekend I got annoyed by an issue that had been closed on the GitHub repo of the UK’s coronavirus dashboard. The issue was about the site not working at all without JavaScript and whether it could do so, and the closure comment said: “That’s unfortunately impossible. The entire service has been implemented in JavaScript (not just the map).” Bit of a red rag to a bull, so I wrote a version that worked fine without JavaScript in a few hours on Sunday morning, and also used far less bandwidth/CPU/etc, stuck it up on my website, and tweeted about it.

I will go into how my version works and the differences between them further down, but firstly to be clear, this is not to do down anyone who works on the official site. I am sure the site was put together under a lot of pressure and constraints. The issue I mentioned above got reopened later the same day, with the comment: “By impossible, I meant that we did not have resources allocated to it at the time, nor did we have it as a priority. Now we do, so we're getting there.” This speaks to the actual reason I made my version – if you only have very limited resources (very understandable), it is important that you should spend those resources wisely, using the most appropriate technology for your aims, thinking of the users that will want to use your service, and the site’s performance and resilience. I worry that more and more nowadays, people jump to JavaScript frameworks because that is what they know or have been taught, even though they are entirely inappropriate for a wide array of things and can often produce poor results. Look at this recent article by Tim Kadlec, The Cost of JavaScript Frameworks.

The UK government already has a lot of good guidance on this, e.g. Building a resilient frontend using progressive enhancement, and How to test frontend performance, and a lovely blog post by Matt Hobbs at Why we focus on frontend performance. Having these guidelines be followed by everyone across all the variations of procurement and delivery, however, is much trickier than implementing them, in my book!

The current live site

So, the first thing we need to do is look at the current site and see how it works. https://coronavirus.data.gov.uk is a static site, fetching and displaying remote data client-side using JavaScript. There appears to be no server-side activity – something somewhere is generating the new data files each day, but that is presumably independent of the website itself. It is a 100% client-side JavaScript React site; nothing is displayed without JavaScript, and all the HTML is constructed client-side by the JavaScript.

I have just loaded the page here (afternoon of 26th April 2020) on my laptop, and here is the Network tab of Developer Tools in Firefox:

Screenshot of Firefox’s Developer Tools Network Tab loading coronavirus.data.gov.uk.

For those of you who haven’t seen one before, this displays each resource requested for a webpage, where it came from, what triggered the request, what type of resource it is, its transfer size and actual size (transfer can be compressed), and a waterfall display of the timings of the responses.

  1. First off, we request the web page itself. Being a 100% client-side JavaScript site, there is not much to this, some bare bones HTML with header metadata, and the HTML elements to load some CSS and JavaScript.
  2. As you can see, the browser then requests all the CSS and JavaScript files simultaneously (two CSS and two JavaScript). The CSS is 43K in transfer, the JavaScript 659K (c. 1.75M uncompressed). Nothing can be displayed until the JavaScript downloads and is run; the gap after the download is presumably the browser parsing the new JavaScript and working out what to do.
  3. Favicons (8K) and the GOV.UK font (64K) come next; the header/footer of the page are probably starting to display now, and then the JavaScript starts requesting the external data to construct the main part of the body.
  4. That involves a 588K JSON file containing the data (not gzipped by the server, which would reduce it by about 90%; reported to them), and then three GeoJSON files for the map boundaries and circle positions. The three files total 365K, but the site requests the third one twice, and because both requests are made simultaneously the second cannot use a cached copy from the first, so the actual transfer is 531K (also reported to them). Only one GeoJSON file is needed for the initial display.
  5. The JavaScript can now construct all the HTML necessary for display, and does so. The JavaScript also handles changing the tab from Nations to Regions, updating the map/list when clicked, switching between charts and tables, and dealing with display changes if the browser size is changed.
  6. The map, once that is set up by the JavaScript, is a vector map, using OpenMapTiles. It loads its style/metadata (27K), then a bit later fetches its vector data (4.pbf and 5.pbf, 245K) and then the remaining .pbf requests are various different glyphs for one font the map uses (totalling 1097K).
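
To give a rough feel for what gzipping the JSON in step 4 could save, here is a minimal Python sketch that compresses a synthetic, repetitive JSON payload. The field names and values are invented for illustration and are not the dashboard's real schema, so the exact ratio will differ from the real file.

```python
import gzip
import json


def gzip_saving(payload: bytes) -> float:
    """Return the fraction of bytes saved by gzipping payload (0.0 to 1.0)."""
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)


# Synthetic stand-in for the data file: many records sharing the same keys.
# These keys and values are made up; they are NOT the real schema.
records = [
    {"areaName": f"Area {i}", "date": "2020-04-26",
     "dailyCases": i % 50, "totalCases": i * 3}
    for i in range(5000)
]
raw = json.dumps(records).encode("utf-8")

if __name__ == "__main__":
    print(f"raw: {len(raw)} bytes, saved by gzip: {gzip_saving(raw):.0%}")
```

JSON with the same keys repeated in every record tends to compress very well, which is why serving it uncompressed costs so much.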

In total, on a cold page load of the official site, a desktop browser will transfer 3.19MB of data. On mobile the map is not displayed, so it “only” loads 1.33MB of data. Ignoring the data, the stuff necessary to display anything at all (even just a header and footer) comes to c. 768K.

Resource transfer of official site, desktop

| Resource | Total transfer size |
|---|---|
| Map font | 1097K |
| JavaScript | 659K |
| Page data | 588K |
| Map overlay data | 531K |
| Map tile data | 272K |
| Page font | 64K |
| CSS | 43K |
| Favicon | 8K |
| HTML | 2K |

Resource transfer of official site, mobile

| Resource | Total transfer size |
|---|---|
| JavaScript | 659K |
| Page data | 588K |
| Page font | 64K |
| CSS | 43K |
| Favicon | 8K |
| HTML | 2K |
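
As a quick check of the totals quoted above, the figures from the two tables can be summed in a few lines of Python. This sketch assumes 1MB = 1024K. The breakdown of the c. 768K figure (HTML, CSS, JavaScript and page font) is one reading that happens to reproduce the number, not something stated explicitly above.

```python
# Transfer sizes in K, copied from the desktop table above.
desktop = {
    "Map font": 1097, "JavaScript": 659, "Page data": 588,
    "Map overlay data": 531, "Map tile data": 272, "Page font": 64,
    "CSS": 43, "Favicon": 8, "HTML": 2,
}
# On mobile the map is not displayed, so the three map rows drop out.
mobile = {name: size for name, size in desktop.items()
          if not name.startswith("Map")}


def total_mb(sizes: dict) -> float:
    """Total transfer in MB, assuming 1MB = 1024K."""
    return sum(sizes.values()) / 1024


# Assumed breakdown of what must arrive before anything can be displayed.
shell_k = sum(desktop[name] for name in ("HTML", "CSS", "JavaScript", "Page font"))

if __name__ == "__main__":
    print(f"desktop: {total_mb(desktop):.2f}MB")
    print(f"mobile:  {total_mb(mobile):.2f}MB")
    print(f"shell:   {shell_k}K")
```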

In the videos I posted on Twitter, made using the very useful WebPageTest, the site took 18 seconds to load content on 3G. Note this does not just apply to the main page – the about page, which is only static text, is similarly generated client-side only after downloading all the JavaScript. That took just as long to load on 3G. Given the presumed popularity of this site, having it load quickly is quite important – not just for users but also for whoever is paying/donating the bandwidth costs involved.

My version

My version can be seen at http://dracos.co.uk/made/coronavirus.data.gov.uk/. Please note that in the below I am not saying any of this is “best practice”; I was doing this in a few hours on a Sunday, but it is the intent and the structure that I think are important. Here is a network diagram for my version:

Screenshot of Firefox’s Developer Tools Network Tab loading my version of coronavirus.data.gov.uk.

My version transfers 407KB in total on load, including the needed data:

Resource transfer of my version

| Resource | Total transfer size | Official site transfer size | Difference |
|---|---|---|---|
| Map overlay data | 141K | 531K | -390K |
| JavaScript | 106K (40K map, 67K chart) | 659K | -553K |
| Map tile/font data | 86K | 1369K | -1283K |
| CSS | 48K | 43K | +5K |
| HTML | 18K | 2K | +16K |
| Favicon | 8K | 8K | 0K |
| Page data | 0 | 588K | -588K |
| Total | 407K | 3200K | -2793K (-87%) |

[I am not allowed to use the same font as GOV.UK; if I were, my version would need another 64K to load it, the same as the official site. I have not included the font in the table above.]
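
For anyone wanting to check the differences and the -87% figure, here is a short Python sketch that recomputes them from the per-resource sizes in the comparison table. The numbers are copied from the table, and the page font is left out on both sides, as in the note above.

```python
# (resource, my version in K, official site in K), copied from the table above.
rows = [
    ("Map overlay data", 141, 531),
    ("JavaScript", 106, 659),
    ("Map tile/font data", 86, 1369),
    ("CSS", 48, 43),
    ("HTML", 18, 2),
    ("Favicon", 8, 8),
    ("Page data", 0, 588),
]


def compare(rows):
    """Return (my total, official total, difference, relative change)."""
    mine = sum(m for _, m, _ in rows)
    official = sum(o for _, _, o in rows)
    return mine, official, mine - official, (mine - official) / official


mine_total, official_total, diff, change = compare(rows)

if __name__ == "__main__":
    for name, mine, official in rows:
        print(f"{name:<20}{mine:>6}K{official:>7}K{mine - official:>+7}K")
    print(f"{'Total':<20}{mine_total:>6}K{official_total:>7}K{diff:>+7}K ({change:+.0%})")
```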

As you can guess from the difference in size, my version loads much quicker because it is a lot smaller in total; because it has a lot less JavaScript, so your computer or phone has to do a lot less processing; and because it gives the browser something to display immediately.

What I changed, front end

HTML
The HTML is actually quite a bit bigger – because it contains the actual HTML that you want to see, which the browser can show immediately, incrementally, as browsers are designed to do. That HTML, in its data tables, also includes all the data needed by the JavaScript map and charts, which can read the data out of them rather than fetch a large JSON file and get the right bits from there.
JavaScript
I dropped all the JavaScript the official site had, and added back the two main libraries needed – Leaflet for the maps, and chart.js for the charts. I used their default installations with no customisation (on FixMyStreet we customised chart.js to shrink it down to 13K, for example). I added the minimal extra JavaScript needed for the tab behaviour, and the chart/table toggle. I did this all inline – mainly because I was hacking it together, but it would also save the loading of another external file which could matter for a high-performance site.
CSS
The CSS is a little bigger, because I include a new CSS file containing the JavaScript-generated CSS that would otherwise be part of the JavaScript.
Map
The map – well, I didn’t want to spend time installing my own tiles or similar, so used an existing free tileserver from Stamen. This is a raster tile server, but even ignoring the megabyte font used by the official site, the raster tiles are quite a bit smaller than the vector ones. In overview situations, if there isn’t a particular need for vector tiles, raster tiles may well be a better choice.
Map overlay data
I set it up to only load the one GeoJSON file needed to display the on-load map. The other GeoJSON files are requested when you click the tabs that change the map to need that data. If you were worried about that taking a while, you could load them once the main page had finished loading, or when someone starts to hover over the table, or similar.
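
The idea under HTML above, that the data tables in the page are themselves the data source for the map and charts, can be illustrated outside the browser too. In my version this is done by JavaScript reading the DOM. The Python sketch below is not that code, just the same idea using the standard library's html.parser. The table markup, column names and values in SAMPLE are hypothetical.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html: str) -> list:
    """Turn a simple table (first row = headers) into a list of dicts."""
    reader = TableReader()
    reader.feed(html)
    header, *body = reader.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical markup; the real page's tables have different columns.
SAMPLE = """
<table>
  <tr><th>Area</th><th>Cases</th></tr>
  <tr><td>Area A</td><td>120</td></tr>
  <tr><td>Area B</td><td>45</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_records(SAMPLE))
```

Because the numbers are already in the HTML the user has downloaded, a chart or map script that reads them this way needs no separate data request.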

Back end

On the server, it currently works as follows. Every 10 minutes, it asks the official server if the latest data JSON file has changed since it last got it. If it has, it downloads the new file, and generates new static HTML files by requesting a dynamic PHP file and saving the output to disc. (Originally, and if you were not expecting much traffic, say from a blog post like this one, you could simply have the PHP file produce the output directly as a normal index.php.) The use of PHP is not important; any language, even JavaScript, could be used to generate the static HTML from the data however you would wish. The important thing is to have a resilient base layer of HTML and CSS, and then to enhance that with JavaScript. Doing so here actually made it much smaller and quicker, because the JavaScript could read the existing HTML to do its thing.
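
Since the language does not matter, here is a minimal Python sketch of that poll-and-regenerate loop. It is not the PHP behind my version. It assumes the "has it changed?" check is an HTTP conditional request using ETag/If-None-Match, which is only one way of doing it. The names render_page, refresh and http_fetch, and the flat area-to-cases JSON shape, are all hypothetical.

```python
import html
import json
import urllib.error
import urllib.request
from pathlib import Path
from typing import Callable, Optional, Tuple

# fetch(etag) returns (new_etag, body); body is None when nothing has changed.
Fetch = Callable[[Optional[str]], Tuple[Optional[str], Optional[bytes]]]


def http_fetch(url: str) -> Fetch:
    """Build a fetcher that does a conditional GET against url."""
    def fetch(etag):
        headers = {"If-None-Match": etag} if etag else {}
        request = urllib.request.Request(url, headers=headers)
        try:
            with urllib.request.urlopen(request) as response:
                return response.headers.get("ETag"), response.read()
        except urllib.error.HTTPError as err:
            if err.code == 304:  # Not Modified: keep what we already have
                return etag, None
            raise
    return fetch


def render_page(data: dict) -> str:
    """Render a (hypothetical) {area: cases} mapping as a static HTML table."""
    rows = "".join(
        f"<tr><td>{html.escape(str(area))}</td><td>{int(cases)}</td></tr>"
        for area, cases in data.items()
    )
    return ("<!doctype html><title>Cases</title>"
            f"<table><tr><th>Area</th><th>Cases</th></tr>{rows}</table>")


def refresh(fetch: Fetch, etag: Optional[str], out_file: Path) -> Optional[str]:
    """Regenerate out_file if the data changed; return the ETag to remember."""
    new_etag, body = fetch(etag)
    if body is None:
        return etag
    page = render_page(json.loads(body))
    tmp = out_file.with_suffix(".tmp")
    tmp.write_text(page, encoding="utf-8")
    tmp.replace(out_file)  # swap in the new page in one step
    return new_etag
```

Something like cron would call refresh every 10 minutes, passing in the ETag returned by the previous run. Writing to a temporary file and then renaming means visitors never see a half-written page.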

I stress resilient base layer above, because it is possible to try to “fix” a website such as this by adding more JavaScript that server-side renders your page, providing a quicker initial paint time of the website and some non-JavaScript content without having to change the front-end of the site. Whilst this might and can work if done carefully, doing so while treating the server-side render as separate, or a fallback, could lead to the perverse situation of increasing the data transfer for most people, if they get an HTML page that then gets totally ignored by all the existing client-side JavaScript. The server and the client have to work in tandem in this case, the server generating content that the client can hook into and enhance.

The future

I am keeping abreast of changes made to the official site at present – text tweaks, the new stacked graph, and so on – but it’s not something I wish to keep doing for a long time, unlike, say, traintimes.org.uk :) (Performance notes on that available too!) I have hopefully contributed something back, a couple of bug reports and one pull request (to make the table headers sticky); sadly making a slimline static version of the existing site using its own code was not really possible for someone outside to do, unlike small bugs/features, because it would involve team buy-in, server maintainability and so on.

I hope the official site improves in future so that mine is unnecessary. I also hope that people consider the cost of JavaScript frameworks, and indeed JavaScript, and whether you can do without or with less.