Why Did I Break A Perfectly Good Website?

20 May, 2016 · by Karl Auerbach · Read in about 3 min · (551 Words)
blog post hugo website archiving

I have reworked the old, Joomla-based CaveBear website. It took a lot of work. A lot of URLs got changed, thus breaking external links. And I am sure that a lot of small adjustments remain to be done.

The old one was not broken.

So why did I break a perfectly good website?

Well, I’ll tell you why.

It all begins with the idea that much of the content of today’s world-wide-web will disappear.

My old web is almost certainly going to fade into oblivion when I get older and die.

It might be my personal vanity, but I consider much of what I have written to have potential value in future years when people look back and ask “How did the internet get started?”

And I’d like it if my experiences and opinions were visible to those people in the future.

The old website depended on the long-term stability of domain names - something that is not possible under ICANN’s arbitrary and capricious rule that domain names can be acquired for a maximum of ten years. What a stupid rule!

And even if the cavebear.com domain name could be locked in for a long time, who is going to maintain the DNS records? It would be silly to expect the website to remain at the same IPv4/IPv6 addresses forever.

And the old website depended on the proper operation and configuration of an Apache web server, a bunch of PHP (and we all know how troublesome PHP can be), and a MySQL database.

Moving the old website from one machine to another was a significant effort - databases had to be dumped, new databases created and loaded, Apache configurations updated, files moved, SELinux pacified, and so forth.

And on top of that the old website was filled with absolute URLs - meaning that it would break if moved to a new domain name.
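
To make the absolute-URL problem concrete, here is a minimal Python sketch of one way such links could be turned into root-relative ones. It is an illustration under stated assumptions, not the procedure used for this site: the function name `make_root_relative` and the sample path `/archive/index.html` are hypothetical, and a regular expression is only a rough stand-in for a real HTML parser.

```python
import re


def make_root_relative(html: str, host: str) -> str:
    """Rewrite href/src values that point at `host` into root-relative paths.

    Hypothetical helper for illustration; links to other hosts are left alone.
    """
    pattern = re.compile(
        r'(?P<attr>\b(?:href|src)\s*=\s*)(?P<quote>["\'])'
        r'https?://(?:www\.)?' + re.escape(host) +
        r'(?P<path>/[^"\']*)?(?P=quote)',
        re.IGNORECASE,
    )

    def replace(match):
        # A bare "http://host" with no path becomes the site root "/".
        path = match.group("path") or "/"
        quote = match.group("quote")
        return f'{match.group("attr")}{quote}{path}{quote}'

    return pattern.sub(replace, html)


# The sample link below is made up for the example.
sample = '<a href="http://www.cavebear.com/archive/index.html">Archive</a>'
print(make_root_relative(sample, "cavebear.com"))
# prints: <a href="/archive/index.html">Archive</a>
```

A page whose links look like the rewritten form keeps working no matter which domain name serves it.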

So I said enough is enough.

And I went to a static website generation system - Hugo.

So now the CaveBear website consists of two sets of directories - one in which I have the content organized roughly along the web page hierarchy, and another that contains the final hierarchy of HTML files and directories perfect for dropping into Apache or Nginx or any other web server.

The old content remains in HTML; new content is done in Markdown.

It takes less than two seconds for Hugo to build the publishing hierarchy from the content hierarchy - and that’s more than 2200 content files!

And because the publishing hierarchy is finished HTML files, the pages load much faster. About the only external file that is fetched along with the base file for a page is a CSS file, which is probably cached by the user’s browser if the user has visited any other pages on the site.

If I want to move to another domain, I just move the hierarchies. Only one file needs to be tweaked to update the website name (and even that tweak seems unnecessary).
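
One way to check a claim like this before a move is to scan the publishing hierarchy for files that still mention the old domain name. The Python sketch below is only an illustration: the function name `files_mentioning` is hypothetical, and it assumes the generated pages are UTF-8 text files ending in `.html` or `.css`.

```python
from pathlib import Path


def files_mentioning(root, needle, suffixes=(".html", ".css")):
    """Return relative paths of files under `root` whose text contains `needle`.

    Hypothetical helper for illustration; only files with the given
    suffixes are read, and undecodable bytes are replaced rather than fatal.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            if needle in text:
                hits.append(path.relative_to(root).as_posix())
    return hits
```

Called as `files_mentioning("public", "cavebear.com")`, where `public` is an assumed name for the publishing directory, an empty list would suggest that nothing in the generated pages is tied to the old domain name.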

And because the website is now just a hierarchy of directories, HTML, and CSS files, the whole thing can be captured with great fidelity by places like the Internet Archive.

Or to put it more bluntly - this is an effort to preserve my digital legacy after I am gone.