How I Built This Blog

I have received a number of compliments on my blog's style or theme and even more requests for details on the blogging environment. So here's how I built my blog.

When my post Why I Keep a Research Blog was discussed on Hacker News, a number of commenters mentioned my blog’s simple style and asked for the details of the blogging environment. Since then, I have received many emails along the same lines. However, until now, I haven’t put time into making my blog’s theme accessible for a simple reason: I want to focus on clear and thorough technical writing, not meta-posts about writing. In fact, I think a failure mode of writing is using the question of how to write to delay the act itself. Think of a person who spends half a Saturday researching blogging platforms rather than, you know, blogging. But after yet another email, I wondered: maybe I’m ignoring an opportunity to be useful because I want to be useful in some other way. So whatever. Here’s how I built this site.

I assume the reader is technically competent but may not have much web development experience. Apologies if this is too basic. For the eager, here’s my GitHub repo.

Static pages

First, I don’t use a blogging platform or theme. I just use Jekyll, which is a static site generator, with my own custom layout. I like Jekyll because it allows me to write plaintext Markdown files, which it then compiles into static web pages. At a high level, all you need to do to replicate my site is to compile your own Markdown files using Jekyll while including my layout and CSS. Then host the files however you’d like. My hunch is that the people who like my website like the simplicity. Blogging platforms are often overstylized, and dynamic sites can distract from the content.

Jekyll uses Liquid, a templating language so straightforward that you just need to see an example. Here is a simplified version of my index.html file, which generates what you see when you visit my landing page:


<div id='posts' class='section'>
    {% for post in site.posts %}
        <div class='post-row'>
            <p class='post-title'>
                <a href="{{ post.url }}">
                    {{ post.title }}
                </a>
            </p>
            <p class='post-date'>
                {{ post.date | date_to_long_string }}
            </p>
        </div>
        <p class='post-subtitle'>
            {{ post.subtitle }}
        </p>
    {% endfor %}
</div>

As you can see, logic and control flow go inside curly braces plus percent signs, {% and %}, and variables go inside double curly braces, {{ and }}. Jekyll gives you access to the site.posts object, which is iterable, and each post object has properties that can be dot-accessed. These properties are just the metadata of the Markdown file associated with that post. For example, here is the top of my Markdown file for my post on the reparameterization trick:

---
title: The Reparameterization Trick
subtitle: A common explanation for the reparameterization trick with variational 
  autoencoders is that we cannot backpropagate through a stochastic node. I provide a 
  more formal justification.
layout: default
date: 2018-04-29
keywords: reparameterization trick, stochastic gradient variational bayes, variational 
  autoencoder, vae, variational inference, autoencoder, graphical models, neural 
  networks, deep learning
published: true
---

## Informal justifications

In _Auto-Encoding Variational Bayes_, {% cite kingma2013auto %}, Kingma presents an
unbiased, differentiable, and scalable estimator for the ELBO in variational inference.
A key idea behind this estimator is the _reparameterization trick_. But why do we need 
this trick in the first place? When first learning about variational autoencoders 
(VAEs), I tried to find an answer online but found the explanations too informal...

For example, the keywords metadata allows me to statically inject keywords into each post.
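
A minimal sketch of how such an injection can work, assuming the layout has a head section like the one below. This is illustrative and not necessarily the exact markup in the repo. Jekyll exposes a post's front matter on the page object, so the keywords item is available as page.keywords:

<head>
    <title>{{ page.title }}</title>
    <!-- Emit the meta tag only when the post defines keywords in its front matter -->
    {% if page.keywords %}
        <meta name="keywords" content="{{ page.keywords | escape }}">
    {% endif %}
</head>

The escape filter guards against quotes in the keywords breaking the attribute.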

Jekyll does all the hard work of converting this to HTML. It knows how to lay out the content, for example where the subtitle goes and which CSS classes should be assigned to it, using the layout metadata item, which names the custom layout I mentioned earlier. If you have a more complex site, you could have different layouts for different pages.
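
To make this concrete, here is a rough sketch of what the body of a layout file such as _layouts/default.html could look like. The class names are borrowed from the index.html example above and are an assumption here; the real layout in the repo may be organized differently. The one fixed piece is {{ content }}, which is where Jekyll inserts the post's rendered Markdown:

<body>
    <div class='post'>
        <!-- Metadata comes from the post's front matter -->
        <h1 class='post-title'>{{ page.title }}</h1>
        <p class='post-subtitle'>{{ page.subtitle }}</p>
        <p class='post-date'>{{ page.date | date_to_long_string }}</p>
        <!-- The Markdown body, already converted to HTML by Jekyll -->
        {{ content }}
    </div>
</body>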

Math

To render math, I use two different JavaScript libraries, MathJax and KaTeX. MathJax renders in the browser after the web page is delivered, meaning that math-heavy posts often take a long time to load. For example, this post on matrices loads very slowly. Note that the slow loading also causes the page to reflow, so content jumps around as the math appears. (Note to future readers: I may fix this one day.)

This delayed rendering and the subsequent page reflows annoy me, and I’ve started using this awesome Jekyll plugin, Jekyll-KaTeX, which lets me preprocess the math when building the site. It takes longer to compile the static pages now, but I think it’s worth it. For contrast, see how quickly the math renders on this post on Bayesian linear regression. Also, no reflow!
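
For reference, here is a small sketch of what math looks like in a post's Markdown with this plugin, assuming the katex block tag described in the Jekyll-KaTeX README (check the documentation for the version you install, since tag names and options may differ). The formulas themselves are just placeholders:

Inline math: {% katex %} \mathbf{w}^{\top} \mathbf{x} {% endkatex %}

Display math:

{% katex display %}
p(\mathbf{w} \mid \mathbf{y}) \propto p(\mathbf{y} \mid \mathbf{w}) \, p(\mathbf{w})
{% endkatex %}

Because the tags are expanded at build time, the generated page contains the finished HTML for each formula rather than raw LaTeX waiting on a script.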

If I could do it again, I’d write every post in KaTeX. However, one person did email me to say that MathJax is useful because it allows you to inspect the underlying LaTeX.

Citations, RSS feed, and sitemap

I use Jekyll plugins for maintaining citations, an RSS feed, and a sitemap. For citations, I use Jekyll-Scholar. For example, here is a citation of (Bishop, 2006). You just keep all your BibTeX-style citations in a separate file, e.g.


@book{bishop2006pattern,
  title={Pattern Recognition and Machine Learning},
  author={Bishop, Christopher M},
  year={2006},
  publisher={Springer}
}

cite them using the cite command (see the reparameterization trick snippet for an example), and then let Jekyll-Scholar do the rest. My template plus CSS organizes the citations in a footer and styles them appropriately. For what it’s worth, Markdown also supports footnotes, but I’ve never used one on the site, and my layout and CSS will not style them.
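
A minimal sketch of the two pieces involved, using Jekyll-Scholar's cite and bibliography tags. The wrapper class name references is hypothetical, and the sketch assumes the plugin's default setup, in which the BibTeX file lives at _bibliography/references.bib:

<!-- In a post's Markdown: cite an entry by its BibTeX key -->
For background, see {% cite bishop2006pattern %}.

<!-- In the layout: list only the references cited on the current page -->
<div class='references'>
    {% bibliography --cited %}
</div>

The --cited flag keeps the footer limited to works the post actually references instead of printing the whole BibTeX file.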

For RSS, I use Jekyll-Feed. This plugin automatically builds my RSS feed. I don’t use an RSS reader myself, but I like supporting early web technologies that are simple and non-invasive.
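
As a small sketch, Jekyll-Feed also provides a feed_meta helper tag that can be placed in a layout's head so that browsers and feed readers can discover the feed automatically. Whether a given layout includes it is a choice; the feed file is generated either way:

<head>
    <!-- Expands to a link tag pointing at the generated feed -->
    {% feed_meta %}
</head>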

For a sitemap, I use Jekyll-Sitemap. This plugin automatically builds my sitemap, which is useful for web search engines.

Figures

To include a figure in a post, use this HTML snippet as a template:

<div class='figure'>
    <img src="/image/sampling/rejection_sampling_diagram.png"
         style="width: 60%; display: block; margin: 0 auto;"/>
    <div class='caption'>
        <span class='caption-label'>Figure 1.</span> Diagram of rejection sampling. The 
        density $q(\mathbf{z})$ must be always greater than $p(\mathbf{x})$. A new sample 
        is rejected if it falls in the gray region and accepted otherwise. These accepted 
        samples are distributed according to $p(\mathbf{x})$. This is achieved by sampling 
        $z_i$ from $q(\mathbf{z})$, and then sampling uniformly from $[0, k q(z_i)]$. 
        Samples under the curve $p(z_i)$ are accepted.
    </div>
</div>

My CSS file will appropriately style the figure and caption classes. I size each figure manually, e.g. with style="width: 60%;". You’ll need to host the images yourself and specify the image paths appropriately.
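
If you find yourself pasting this snippet often, one possible alternative (not something this site does) is to wrap it in a Jekyll include. The sketch below assumes a hypothetical file _includes/figure.html; the parameter names src, width, number, and caption are made up for illustration:

<!-- _includes/figure.html (hypothetical) -->
<div class='figure'>
    <img src="{{ include.src }}"
         style="width: {{ include.width | default: '60%' }}; display: block; margin: 0 auto;"/>
    <div class='caption'>
        <span class='caption-label'>Figure {{ include.number }}.</span> {{ include.caption }}
    </div>
</div>

A post would then insert a figure with a single line:

{% include figure.html src="/image/sampling/rejection_sampling_diagram.png" width="60%" number="1" caption="Diagram of rejection sampling." %}

The trade-off is one more file to maintain in exchange for shorter posts and a single place to change the figure markup.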

A dozen people have also asked me how I make my figures. I often use Adobe Illustrator, which is great for vector graphics. I have an old version of this program from my architecture school days. However, I wouldn’t recommend it to others because it’s expensive and time-consuming to learn. I use it because it’s what I have and what I know.

I don’t think the program matters that much in this context. In a pinch (for example, if I’m traveling and do not have access to my old laptop), I have used Google Drawings, Photopea, and GIMP. Instead, I’d recommend focusing on a few simple rules:

And these are just instances of a more general rule (see below).

Consistency

Feel free to copy my site. However, I think it’s more important to understand a basic design principle so that you can modify my template as you please. My underlying principle is: make things consistent as a rule and differentiate for emphasis. Why? We notice difference. It’s why Strunk and White recommend parallel construction in writing. Changing the construction of a sentence or phrase in a list is subtly jarring. This is true visually as well. Do an image search for “stand out”, and note how often designers emphasize the outlier with color and form. But this only works if you’re consistent when you don’t intend to emphasize.

At a subconscious level, lots of differentiation is distracting because you don’t know what to focus on. While not letting the eyes rest can be used to great artistic effect, I don’t think that is relevant for technical blogging. This is why people find sites like mine easy on the eyes. Don’t distract through arbitrary differentiation. Let the mind focus on what matters, which in this case is the writing.

1. Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.