There are 4 entries in this Category.

Using Jekyll for static web site building on a Mac


Martin Pilkington occasionally mentioned the Jekyll static web site builder in his tweets, so when I wanted to start a new web site, I thought I’d give it a try, as I’ve been uncomfortable with having to back up WordPress sites in separate steps to get both the database and the image and movie assets.

Installing Jekyll

Make sure you have Xcode installed and its command line tools. Then just open Terminal and type:

sudo gem install jekyll

Creating a new site

To create a site, do

jekyll new example.com

This creates a folder named example.com in the current directory to hold the files for the new site.

Open the folder and in it the _config.yml file. Change the entry title: to what you want to call your new site, and url: to your domain name (in our above example, http://example.com). Feel free to change what few other settings there are, but you don’t need to.

Two neat _config.yml tips:

  1. You can leave entries empty, e.g. the email: or github_username:, and they will just disappear, including their icons.
  2. You can make Jekyll copy files and folders that it would usually skip (like .htaccess) by adding an entry like: include: [.htaccess, .well-known, _foo.xml]. You can also exclude files from copying/processing in the same way, e.g. exclude: Readme.md.

Testing the site

Jekyll includes its own web server. Simply type

cd example.com
jekyll serve

The site is now available at http://localhost:4000. The server automatically watches the folder for changes and rebuilds the site. If you need to trigger a rebuild manually (e.g. to deploy your site without launching the server), just use

jekyll build

Adding your own pages

Like WordPress, Jekyll has pages and blog posts. Any file whose name doesn’t start with an underscore is treated as a page, be it index.html or about.md. Let’s edit about.md to describe our site, not Jekyll. Open it in a text editor.

The file starts with a section like this:

---
layout: page
title: About
permalink: /about/
---

called the “front matter”, followed by regular Markdown, as you’d expect from a .md file. This section specifies the info that Jekyll needs about the page, and tells Jekyll to substitute placeholders in the file. You can leave this section empty; its mere presence (just the two lines of three dashes) tells Jekyll to process the file, so it’s easy to create a new page. You can also make up your own settings here, if you want.

The ones in this example are standard ones Jekyll knows out of the box:

layout: specifies that the file _layouts/page.html should be wrapped around this file, and this file’s contents should be inserted where that file says {{ content }}. This is how Jekyll applies themes and shows navigation on each page.

permalink: specifies the address at which the page will end up in the generated web site. So in our example, you’d find this page at http://example.com/about/ instead of at http://example.com/about.html.

title: is actually just a variable used by the _layouts/default.html template. Any variable you define can be used on the page by writing e.g. {{ page.title }}. So if you added a line temperature: 40 Centigrade you could put it on the page as {{ page.temperature }}.

Other interesting variables are categories, tags and published.
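
To make the mechanism a bit more concrete, here is a minimal Python sketch of the two steps described above: splitting the front matter off the file, then substituting the double-brace page placeholders. It is only an illustration and not how Jekyll itself works internally (Jekyll is written in Ruby and uses real YAML and Liquid parsers). The function names split_front_matter and render are made up for this example, and the toy parser only understands flat key: value lines.

```python
import re


def split_front_matter(text):
    """Return (variables, body) for a file that starts with a '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at all: leave the file untouched
    variables = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: everything after it is the page body.
            return variables, "\n".join(lines[index + 1:])
        if ":" in line:
            key, value = line.split(":", 1)
            variables[key.strip()] = value.strip()
    return {}, text  # opening '---' without a closing one


def render(body, page):
    """Replace {{ page.name }} placeholders with values from the front matter."""
    def lookup(match):
        return str(page.get(match.group(1), ""))
    return re.sub(r"\{\{\s*page\.(\w+)\s*\}\}", lookup, body)


source = """---
layout: page
title: About
temperature: 40 Centigrade
---
# {{ page.title }}
It is {{ page.temperature }} outside."""

page, body = split_front_matter(source)
print(render(body, page))
```

Unknown variables simply render as an empty string in this sketch, which is a simplification chosen for the example.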

Blogging with Jekyll

Jekyll offers special blogging support. Mainly this just involves saving pages into the _posts folder and prefixing the file names with an ISO date, e.g. 2015-01-30-My First Post.md. But it also has special functions to make it easier to link between posts in a stable fashion, and to generate lists of posts with teaser text etc. The Official Jekyll docs on blogging cover this well.
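
As a rough illustration of the naming convention, this Python sketch pulls the date and title out of a post file name. The regular expression is my own approximation of the convention described above, not Jekyll's actual matcher, and parse_post_filename is a hypothetical helper.

```python
import re
from datetime import date

# Assumed pattern: ISO date, a dash, the title, then a known extension.
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-(.+)\.(md|markdown|html)$")


def parse_post_filename(filename):
    """Return (date, title) for a post file name, or None if it does not match."""
    match = POST_NAME.match(filename)
    if match is None:
        return None
    year, month, day, title, _extension = match.groups()
    return date(int(year), int(month), int(day)), title


print(parse_post_filename("2015-01-30-My First Post.md"))
```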

Importing from WordPress

Jekyll has support for importing from WordPress. First, install the importer:

sudo gem install jekyll-import
sudo gem install sequel
sudo gem install unidecode
sudo gem install htmlentities
sudo gem install mysql2

and then do what Jekyll’s WordPress Importer docs say.

How to build a good restaurant web site

The typical restaurant web site, I’ve found, is completely useless and a waste of money. Here’s a short list why:

  • Most of them are 100% Flash. Nobody who owns a smart phone can view them. At all. So if I’m on the road and want to know whether your restaurant is open, I can’t see that, just because you wanted a photo slideshow with crossfades.
  • Most of them are missing the opening hours and/or the address. Those that have them often hide them in lots of prose. Someone on the road with their phone will want to know that information first.
  • Most of them are missing the menus. While some of them have the permanent menu, particularly the daily lunch deals or weekly changing menus are why a prospective customer might come back to your web site.

Of course, everyone can moan and complain, so here’s my short and sweet summary of how to make a good web site for a restaurant:

  • Put the following on your front page: Your address (including the city and country, this is the internet, after all!), your opening hours, and a tag line like “Greek taverna” or “Italian kitchen” or “exclusive 4-course dining in séparées” or something else that helps a first-time visitor immediately get an idea of your restaurant.
    And no, your address as the “legally responsible party” on your web site’s imprint page doesn’t count. That could be an office building for a restaurant chain. Make sure it’s clear where to go. Put a small picture of your front entrance on there so they recognize it.
  • Don’t use Flash. People on cell phones can’t see Flash, they just get a lego brick icon and that’s it.
    If someone is in your general area and wants to know where to go, they will call up your site on their smart phone to check the opening hours. Make it easy for them. You’re wasting money if half your interested customers can’t see your site.
  • Put your daily menu and specials on the site. This is easier than it sounds. You don’t have to pay a web designer every time. Pay them to make you one editable page where you can just log in with a password and edit the text from any browser. You probably already type up the daily menu and print it every day. Just copy it over there, click “Save” and anyone on the internet (potential customers sitting at work thinking where to go for lunch together, for instance) can immediately see what you have to offer.
    Your permanent menu is nice, but people who’ve been at your place a couple times probably have a general idea what’s on it already. The specials change daily or weekly, everyone has to look those up.
  • Bonus points: Include a phone number (or even better, a web form) where people can make reservations. Ideally they’d be hooked up to your reservation system and immediately give feedback. Otherwise, make sure you check your e-mail often and confirm reservations in a timely manner.
    If you want to provide prose or an image gallery, put them on extra pages, so cell phone visitors don’t have to download all of that over a mobile connection.
    And finally: Pay for a professional web designer and professional photographer. It will show in the end result.

That’s my short list of how to make a good, useful restaurant web site. I hope it will help restaurant owners get the right thing from their web designers.

HTTP Auth with PHP in CGI-mode (e.g. on Dreamhost)

Dreamhost is different from my previous web hosters in that it runs PHP as a CGI, and not as mod_php integrated with Apache. This mostly makes no difference, but it has caused me one problem: HTTP AUTH password authentication using $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] doesn’t work anymore. Since it took me a while to find a workaround, here’s the skinny:

The workaround is to pass the username and password info as an environment variable to the script. You do this using mod_rewrite in your .htaccess file in the following way:

RewriteEngine on
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

This defines an environment variable named “HTTP_AUTHORIZATION” and stores all HTTP AUTH credentials in there before your script gets run. Next step is to grab your login info from there in your PHP script:

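// split the user/pass parts (only needed when PHP runs as CGI)
if( !isset($_SERVER['PHP_AUTH_USER']) ) {
    $parts = explode(':', base64_decode(substr($_SERVER['HTTP_AUTHORIZATION'] ?? '', 6)), 2); // skip "Basic ", split at the first colon only
    list($_SERVER['PHP_AUTH_USER'], $_SERVER['PHP_AUTH_PW']) = array_pad($parts, 2, '');
    // no credentials sent: unset both, so isset() checks behave as under mod_php
    if( strlen($_SERVER['PHP_AUTH_USER']) == 0 || strlen($_SERVER['PHP_AUTH_PW']) == 0 )
        unset($_SERVER['PHP_AUTH_USER'], $_SERVER['PHP_AUTH_PW']);
}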

Here we first check if we need to do anything at all. If we get a username, we must obviously be running under mod_php, so there is no need to do anything. However, if we don’t have a username, we try to get one from the environment variable, to work around the PHP CGI limitation. The variable holds the word “Basic”, a space, and the Base64-encoded “username:password” string, which is why the code skips the first 6 characters before decoding. We extract the result into the $_SERVER['PHP_AUTH_USER'] and $_SERVER['PHP_AUTH_PW'] variables, where they would be on a regular mod_php system. But sadly, we may get two empty strings from this process if no password was passed, so in that case we unset the two variables again. That way it behaves just like it usually would if no password was provided, and all our code can check it with isset($_SERVER['PHP_AUTH_USER']) as usual.
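
The decoding step itself is not specific to PHP. For comparison, here is the same idea as a small Python sketch: take the value of the Authorization header, strip the “Basic” scheme, Base64-decode the rest and split it at the first colon. parse_basic_auth is a name I made up for this example, and like the PHP version it treats an empty username or password as “no credentials”.

```python
import base64
import binascii


def parse_basic_auth(header_value):
    """Return (username, password) from an 'Authorization: Basic ...' value, or None."""
    scheme, _, encoded = (header_value or "").partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None  # not valid Base64, or not valid UTF-8 text
    # Split at the first colon only, so passwords may contain colons.
    username, separator, password = decoded.partition(":")
    if not separator or not username or not password:
        return None
    return username, password


header = "Basic " + base64.b64encode(b"uli:s3cret").decode("ascii")
print(parse_basic_auth(header))
```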

Building a distributed Twitter

With the goings-on at Twitter HQ, Brent Simmons started thinking about a distributed Twitter. Now considering he’s the author of NetNewsWire, a great RSS reader, I’m sure this has crossed his mind as well, but I’d like to lay out a possible distributed Twitter redesign based on RSS before you:

RSS is ideal. It’s XML, so it’s extensible. It is widely supported. There are libraries for reading it for pretty much every programming language. And it was intended to be polled for new, current information. It also deals in items, which can be what each Tweet will be. And finally, in its simplest form, an RSS feed is just a text file on a server, so implementations can be very simple, and can happen on CDNs and other “stupid” web servers, if needed. I’ll first go into the technical infrastructure, and then I’ll illustrate how this would actually look to the end-user.

[Illustration: a few accounts and a directory]

How do we store our tweet database in a distributed fashion?

Every user would have an RSS feed on a web server somewhere. This feed contains all the tweets they posted. The URL of this feed is that user’s “user ID”, the globally-unique name that identifies that user, and what other users need to subscribe to their feed. This RSS feed would contain one tweet per feed item (in the description). @mentions would be encoded as links, with a special attribute indicating this is an account name. If you reply to a tweet, the RSS item would contain an additional in-reply-to attribute in this link that holds the GUID of the feed item for the referenced tweet.

So, for example:

<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>RSS Example</title>
<description>This is an example of an RSS feed</description>
<lastBuildDate>Mon, 28 Aug 2006 11:12:55 -0400</lastBuildDate>
<pubDate>Tue, 29 Aug 2006 09:00:00 -0400</pubDate>

<item>
<description>&lt;a href="http://www.example.com/ulistweets.rss" feedaccount="yes" inreplytoitem="17576"&gt;@uli&lt;/a&gt; Remember to bring along that Dr. Who boxset</description>
<guid isPermaLink="false">1102345</guid>
<pubDate>Tue, 29 Aug 2006 09:00:00 -0400</pubDate>
</item>
</channel>
</rss>


In this example, the GUID is a well-formed number, but of course, like with every RSS feed, it could be an arbitrary string (or even the URL of the blog post this item was generated from, if someone uses a blogging engine to generate their tweet RSS feed).
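
To show that such a feed is easy to consume, here is a hedged Python sketch that reads the tweet items and picks out the account links. It uses only the standard library; read_tweets and MentionParser are names invented for this example, and the feedaccount and inreplytoitem attributes are the ones proposed above, not part of any existing standard.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

FEED = """<rss version="2.0">
<channel>
<title>RSS Example</title>
<item>
<description>&lt;a href="http://www.example.com/ulistweets.rss" feedaccount="yes" inreplytoitem="17576"&gt;@uli&lt;/a&gt; Remember to bring along that Dr. Who boxset</description>
<guid isPermaLink="false">1102345</guid>
<pubDate>Tue, 29 Aug 2006 09:00:00 -0400</pubDate>
</item>
</channel>
</rss>"""


class MentionParser(HTMLParser):
    """Collects links that carry the proposed feedaccount="yes" attribute."""

    def __init__(self):
        super().__init__()
        self.mentions = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("feedaccount") == "yes":
            self.mentions.append({
                "feed": attributes.get("href"),
                "in_reply_to": attributes.get("inreplytoitem"),
            })


def read_tweets(feed_xml):
    """Return one dict per feed item with its GUID and the accounts it mentions."""
    tweets = []
    for item in ET.fromstring(feed_xml).iter("item"):
        parser = MentionParser()
        # The XML parser has already unescaped the description into HTML text.
        parser.feed(item.findtext("description", default=""))
        parser.close()
        tweets.append({
            "guid": item.findtext("guid", default="").strip(),
            "mentions": parser.mentions,
        })
    return tweets


print(read_tweets(FEED))
```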

How do I ‘follow’ someone?

In its most basic form, following someone in this distributed Twitter system simply means subscribing to their RSS feed. You need their URL, add it to your feed reader and you see what tweets they send. But of course the typical user will have a specialized “distributed Twitter” interface for this: A script on a server somewhere that provides a Twitter-like web interface. It will keep a list of the URLs of all the people you follow, will let you add new ones, and will show you your “personal timeline”, i.e. an aggregated list of the tweets of everyone you follow and your own. Even better, it can vend this personal timeline as RSS again, so you can use any RSS reader to view your personal timeline.

And of course it will have a little form where you can type in your message, and it will add that message to your personal RSS feed.
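
The aggregation step could look roughly like this Python sketch. It assumes the feeds have already been downloaded and parsed into simple dictionaries; the text and pubDate keys and the personal_timeline function are assumptions made for this example. Dates are parsed with their time zones, so tweets from different servers sort correctly.

```python
from email.utils import parsedate_to_datetime


def personal_timeline(feeds):
    """Merge {feed_url: [items]} into one list of tweets, newest first."""
    merged = []
    for feed_url, items in feeds.items():
        for item in items:
            merged.append({
                "author": feed_url,  # the feed URL doubles as the user ID
                "text": item["text"],
                "posted": parsedate_to_datetime(item["pubDate"]),
            })
    return sorted(merged, key=lambda tweet: tweet["posted"], reverse=True)


feeds = {
    "http://www.example.com/ulistweets.rss": [
        {"text": "A", "pubDate": "Tue, 29 Aug 2006 09:00:00 -0400"},
        {"text": "C", "pubDate": "Mon, 28 Aug 2006 11:12:55 -0400"},
    ],
    "http://firstname.lastname.com/microblog.rss": [
        {"text": "B", "pubDate": "Tue, 29 Aug 2006 14:30:00 +0200"},
    ],
}

for tweet in personal_timeline(feeds):
    print(tweet["posted"].isoformat(), tweet["author"], tweet["text"])
```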

Writing tweets

For the user, writing a tweet would pretty much work like before. But behind the scenes, we would need a bit more smarts: We’d need some way to turn a short name like uli into the actual URL, and we need to encode this in the RSS feed somehow. Dave Winer had a great idea here: Why not just use DNS? If I write @firstname.lastname.com, it would automatically know to go to the server firstname.lastname.com and look there for an RSS file, say, microblog.rss. It would then encode the target user’s name into the RSS item’s description as <a href="http://firstname.lastname.com/microblog.rss" feedaccount="yes">@firstname.lastname.com</a>. That way, every search engine and RSS reader sees it as a link, to users it looks almost like before, and it uniquely identifies that user across the entire internet. The feedaccount attribute helps dedicated ‘distributed Twitter’ clients recognize these links as account/tweet links (different from other links in a post).
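
A posting script might do this rewriting roughly as follows. This is only a sketch in Python: the file name microblog.rss is the example name from above, link_mentions and feed_url_for are made-up helpers, and the regular expression is a simple assumption about what counts as a domain-style account name.

```python
import html
import re

# "@" followed by at least two dot-separated labels, not preceded by a word
# character or a dot (so e-mail addresses are left alone).
MENTION = re.compile(r"(?<![\w.])@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def feed_url_for(account, feed_name="microblog.rss"):
    """Map a domain-style account name to the URL of its feed."""
    return "http://{}/{}".format(account, feed_name)


def link_mentions(text):
    """Escape the tweet text as HTML and turn @mentions into account links."""
    escaped = html.escape(text, quote=False)

    def to_link(match):
        account = match.group(1)
        return '<a href="{}" feedaccount="yes">@{}</a>'.format(
            feed_url_for(account), account)

    return MENTION.sub(to_link, escaped)


print(link_mentions("Hi @firstname.lastname.com, see you <soon>."))
```

The result is HTML, so it still has to be XML-escaped once more when it is written into the description element of the feed.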

Even better: We could integrate various existing services by providing subdomains on their servers. So my Twitter account would be uliwitness.twitter.com, and my App.net account would be uli.app.net. And the scripts on those servers could even know about their home and let me use a shorter version of the name to reference people on the same server (i.e. @uliwitness or @uli). Since the RSS contains a full link, people on other servers reading these tweets will know which @uli is meant, and their clients could even rewrite the linked text so the user always sees the same name for the same account.

So, you see, even with an RSS-based, distributed Twitter, you can still follow people, reference them quickly and easily, and view your aggregated timeline. Moreover, since there is a script generating your timeline from the other users’ full post histories, you will even keep the ability to filter your timeline. Be it to ignore tweets by someone, even when they are retweeted by someone you’re following, or to filter out replies to messages from users you’re not following, or whatever else you can think of.
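
Such filters are simple once the timeline is a plain list. In this Python sketch, the author, retweet_of and reply_to keys are hypothetical field names that the timeline script would have filled in from the account links in each item.

```python
def filter_timeline(timeline, muted=(), following=()):
    """Drop tweets by muted accounts (even when retweeted) and replies to strangers."""
    muted = set(muted)
    following = set(following)
    visible = []
    for tweet in timeline:
        if tweet["author"] in muted or tweet.get("retweet_of") in muted:
            continue
        reply_to = tweet.get("reply_to")
        if reply_to is not None and reply_to not in following:
            continue
        visible.append(tweet)
    return visible


timeline = [
    {"author": "a", "text": "1"},
    {"author": "b", "text": "2", "retweet_of": "spam"},
    {"author": "spam", "text": "3"},
    {"author": "a", "text": "4", "reply_to": "stranger"},
    {"author": "a", "text": "5", "reply_to": "b"},
]
print(filter_timeline(timeline, muted=["spam"], following=["a", "b"]))
```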

What about retweets, DMs and protected accounts?

Retweets work essentially the same way as replies. The Retweet item would contain a copy of the message, and the user name following the “RT” prefix would be a link pointing back at the tweet that is being retweeted. That way you don’t have to hit each account whose message has been retweeted (because you already have a copy), but you can. And again, the user just chooses “Retweet” in their script’s web interface, and all the magic will happen behind the scenes.

Protected accounts are easy as well: You don’t want those messages just lying around, readable for everyone who happens to figure out the URL, so we use HTTP AUTH to prevent access. Whenever someone wants to follow our protected account, we have to approve them anyway. So we just have the script generate a username/password for them and that gets sent back as a reply in a standardized form (e.g. as a DM).
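
Generating those credentials is the easy part. Here is a minimal Python sketch, assuming the script keeps its approved followers in a dictionary keyed by the follower's feed URL; issue_follower_credentials is a made-up name, and a real script would store only a hash of the password rather than the password itself.

```python
import secrets


def issue_follower_credentials(follower_feed_url, approved):
    """Return the (username, password) pair for a follower, creating it on approval."""
    if follower_feed_url not in approved:
        username = "follower-" + secrets.token_hex(4)
        password = secrets.token_urlsafe(24)  # unguessable random password
        approved[follower_feed_url] = (username, password)
    return approved[follower_feed_url]


approved = {}
first = issue_follower_credentials("http://uli.app.net/microblog.rss", approved)
again = issue_follower_credentials("http://uli.app.net/microblog.rss", approved)
print(first == again)
```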

DMs are essentially like protected accounts. A DM can be implemented as a special, additional RSS feed to which the sender posts direct messages to you. Since you have to follow someone for them to be able to DM you, whenever you check if they have new messages, you can also check if they have left new DMs for you. The only problem here is how to generate and distribute the username and password. I don’t want to have to generate and keep on file a username/password pair for every person that follows me, just in case I want to DM them one distant day. Moreover, so far following happened completely without me having to do anything; it was just someone else occasionally hitting my RSS feed. Distributing passwords means I suddenly need a script on the server and a database, not just a text file or two.

Asymmetric cryptography to the rescue! To implement DMs, what we can do is publish a public key with every RSS feed. When I want to DM someone, I grab their public key, and encrypt my message using their public key. The only way to read this message is to use the private key, which only the destination account’s owner has. Of course, the little script that I use to follow someone, show me my timeline etc. will take care of all that transparently behind the scenes, so that, again, I just click “DM” to send a direct message.
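
To illustrate the principle only, here is textbook RSA with tiny numbers in Python. This is a toy and completely insecure (tiny primes, no padding, one byte at a time); a real client would use a vetted cryptography library. The function names are made up, and the primes 61 and 53 are just the usual classroom example. It needs Python 3.8 or later for the modular inverse.

```python
def make_toy_keypair(p=61, q=53, e=17):
    """Return (public_key, private_key) for textbook RSA with tiny primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (Python 3.8+)
    return (e, n), (d, n)


def encrypt(message, public_key):
    """Anyone who has the recipient's public key can do this."""
    e, n = public_key
    return [pow(byte, e, n) for byte in message.encode("utf-8")]


def decrypt(ciphertext, private_key):
    """Only the owner of the private key can undo the encryption."""
    d, n = private_key
    return bytes(pow(number, d, n) for number in ciphertext).decode("utf-8")


public_key, private_key = make_toy_keypair()  # public key is published with the feed
secret = encrypt("Hi @uli", public_key)
print(decrypt(secret, private_key))
```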

But, wouldn’t using RSS unduly burden popular tweeters?

They would have to make sure their hoster can cope with their bandwidth demands, yes. But that’s how the web already works. You register a domain. Either you have enough bandwidth, or you need to spend money on bandwidth and load balancing. There would be inexpensive hosters for small users (like WordPress.com or Tumblr), and there would be big hosters for big companies with lots of traffic (like Akamai or S3 or Slicehost or whatever…).

Beyond that, I’ve intentionally chosen RSS as the lowest common denominator. There is nothing keeping you from implementing HTTP 304 “Not Modified” status codes or PuSH (as someone in the comments suggests) or similar standards to allow people to get change notifications without having to download the entire feed and maybe even to be able to just get the changed feed items. Similarly, you can keep a list of all feeds your users subscribe to at the moment and periodically pull copies of all messages into your database, for faster search and presentation of the personal timeline. It would almost be like NNTP, where messages get distributed and cached locally across a network of servers. Just that your script and your users decide which accounts (with their tweets, DMs etc.) you look at. But even if you don’t do that, every client should be able to deal with a simple, stupid RSS feed at the least. It’s well-supported, well-known and well-understood.
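
As a sketch of the 304 approach, a client can remember the ETag and Last-Modified values from its previous download and send them back as conditional request headers. The Python below uses only urllib from the standard library; the two function names are made up for this example, and it assumes the server actually answers conditional requests.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that asks for the feed only if it has changed."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if the feed is unchanged."""
    request = build_conditional_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: keep using the cached copy
            return None, etag, last_modified
        raise
```

A polling script would store the returned ETag and Last-Modified values next to its cached copy of each feed and pass them in again on the next poll.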

So. Distributed Twitter.

Yeah. It’s definitely a possibility. Most features will map across fairly easily, and if we manage to set up an independent directory and a search engine, the remaining features would be very possible as well. For those who care, here’s a rough implementation of parts of this design that I knocked out in PHP a while ago: Chirp distributed Twitter server