The Wiki Will Likely Return Shortly (Sans the PHP Back End That Caused Terrible Load Spikes)
Goodbye to MediaWiki?
TODAY was a very busy day. A very productive day, too! We did lots of site- and capsule-related work (the latter refers to Gemini, which not many people bother to check out because it's outside their "comfort zone", at least for now).
This evening we made a likely 'breakthrough' that will possibly enable the imminent return of the wiki in a static pages 'mode'. Well, technically not a wiki at all. Not anymore. Many lessons can be learned from all this.
The first time I installed a wiki it was 2004 or 2005. I installed it on my own site. I then installed another... and another. Over the years I used and installed a lot of wiki software, not always the same software. Here at Techrights we've had a wiki for nearly 15 years.
Wikis are a huge technical debt and an overhead. Installing them may be easy, just like buying a puppy. The storage requirements vary depending on the implementation and performance depends on various factors. As noted here before, in late summer our wiki was being banged on constantly by bots, usually at a pace of about 300-400 page requests per minute. Each request yields a "payload", resulting in RAM usage, CPU usage, and traffic that goes to waste (bots like spiders and "Hey Hi" are useless).
In our case, the wiki stores many versions of the same pages over and over again (almost 18,000 revisions of wiki pages in total). Instead of compressing them or storing just the changes (incrementally), it keeps adding up - to the point where one of the wiki's tables exceeded 1 gigabyte in size. For a wiki with fewer than 2,000 pages in total, most of them rather short, this space usage is unacceptable.
The cost-benefit analysis said we needed to salvage our data, evacuate it from the wiki, and go static permanently. We've planned this for a long time (years), but plans are easier than practice. SQL hacking and data processing (syntax for wikis is typically not the same as plain HTML) take time and then there's meticulous testing. We don't want to produce lousy conversions because plenty of time (many years) was spent putting these pages together and refining them. They are valuable resources and they catalog blog posts that are otherwise scattered and exceedingly sporadic.
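To put the point about revision storage in rough terms, here is a minimal sketch comparing "keep every revision whole" with "keep the first revision plus the changes". It is an illustration only, not how our wiki software actually stores anything; the sample page, the function names and the 50 made-up revisions are all assumptions.

```python
import difflib


def full_copy_size(revisions):
    """Bytes used when every revision is stored as a complete copy."""
    return sum(len(text.encode("utf-8")) for text in revisions)


def delta_size(revisions):
    """Bytes used when the first revision is stored whole and every
    later revision is stored only as a unified diff against its predecessor."""
    if not revisions:
        return 0
    total = len(revisions[0].encode("utf-8"))
    for old, new in zip(revisions, revisions[1:]):
        diff = "".join(
            difflib.unified_diff(
                old.splitlines(keepends=True),
                new.splitlines(keepends=True),
                n=0,  # no context lines, only the changed lines themselves
            )
        )
        total += len(diff.encode("utf-8"))
    return total


# Hypothetical page: an index of 200 links that gains one line per revision.
base = [f"* Link to blog post number {i}\n" for i in range(200)]
revisions = [
    "".join(base + [f"* Added entry {k}\n" for k in range(n)])
    for n in range(50)
]

print("full copies:", full_copy_size(revisions), "bytes")
print("deltas only:", delta_size(revisions), "bytes")
```

For a page that changes a little at a time, the full-copy total grows with (page size × number of revisions), while the delta total grows roughly with the size of the edits. With a static site the history can live in something like Git, which handles this for us.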
What are our conclusions?
- If you're going to install some wiki software, think carefully ahead (a priori) about the operational toll, including upgrades, maintenance, moderation and so on.
- Consider how long you plan for this to run, as underlying stacks will change and require plenty of manual intervention over the years.
- Check if the software is likely to even be around in a few years (security patches, compatibility fixes as per the previous point).
- Consider the cost (metaphorically and literally) of serving pages by regenerating them over and over again. Hosting isn't free. If you think it is free, someone likely tricked you into feeling that way.
- Check the database schema if the wiki software uses a relational database (some use plain files or Git), because one day you might have to wrestle with it just to get your data out. Plan ahead.
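As a rough illustration of the last two points (serve pages that are generated once, and know the schema well enough to get your data out), here is a minimal sketch that pulls the latest revision of each page from a database and writes one static HTML file per page. The two-table schema, the sample rows and the function names are hypothetical and far simpler than any real wiki's schema; it also skips the hard part, which is converting wiki syntax to HTML.

```python
import html
import sqlite3
import tempfile
from pathlib import Path


def build_demo_db():
    """Create an in-memory database with a made-up, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE revision (
            rev_id  INTEGER PRIMARY KEY,
            page_id INTEGER NOT NULL,
            body    TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?)", [(1, "Main_Page"), (2, "Gemini")]
    )
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?, ?)",
        [
            (1, 1, "First draft"),
            (2, 1, "Second draft"),
            (3, 2, "Capsule notes <old>"),
            (4, 1, "Final text & links"),
        ],
    )
    return conn


# Only the newest revision of each page is needed for the static copy.
LATEST_SQL = """
SELECT p.title, r.body
FROM page AS p
JOIN revision AS r ON r.page_id = p.page_id
WHERE r.rev_id = (SELECT MAX(rev_id) FROM revision WHERE page_id = p.page_id)
ORDER BY p.title
"""


def export_static(conn, out_dir):
    """Write one HTML file per page and return the list of paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for title, body in conn.execute(LATEST_SQL):
        # Demo titles are filesystem-safe; real titles would need sanitising.
        target = out_dir / f"{title}.html"
        target.write_text(
            '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
            f"<title>{html.escape(title)}</title></head>\n"
            f"<body><h1>{html.escape(title)}</h1>\n"
            f"<pre>{html.escape(body)}</pre></body></html>\n",
            encoding="utf-8",
        )
        written.append(target)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in export_static(build_demo_db(), tmp):
            print("wrote", path.name)
```

Once the files exist, every request is just a file read by the web server. Nothing gets regenerated per visit, no matter how many bots show up.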
I started my computing days in the 'DOS generation'. Things were a lot simpler back then. As things become more bloated they become more complex and thus expensive to maintain. Remember what Theo de Raadt said yesterday. It makes sense that they try to keep OpenBSD as simple as they can get away with. Complexity is an enemy of security or, put another way, complexity and security are mutually incompatible.
Moving from one piece of wiki software to another (I did this several times in the past, with 4 pieces of wiki software thrown in the mix) is not solving the issue but leaping from one pile of technical debt to another. It's like loan "consolidation". Go static instead and leave the worries behind.
We're gratified that in ~15 years MediaWiki very seldom broke itself or needed intensive repair. Some years ago I needed to restore things from a nightly database dump due to mass spamming attacks (manual rollbacks would have taken way too long), but that's about it. Techrights never suffered data losses.
Thank you for the fish, MediaWiki, but it's time to move on. █