Peace of mind is priceless. For that reason, Miraheze stores both internal and external backups of all wikis. We also upload dumps of all public wikis to archive.org every month so that users can download them, which provides further peace of mind.
In total, that makes three types of backups. In addition, users can generate their own backups quickly and on demand with our DataDump tool.
Miraheze takes three types of backups to ensure as much resiliency as possible.
- Internal backups are kept on hand so that the Site Reliability Engineering team can quickly bring the entire site back up after a catastrophic failure. They include full database dumps, which contain user account information and CheckUser information. See the schedule below for more information.
- External backups are automatic backups kept on servers that we control but that sit with a different host and in a different country. This ensures that a failure at one host, in one country's power grid, or similar doesn't cause extended downtime or data loss for our users. These backups cover critical parts of our infrastructure such as the databases of all wikis, private Git repository data, Phabricator configurations, and much more. See the schedule below for more information.
- Public backups are XML backups of all public wikis, which we upload to archive.org every month. We do this so that a reliable backup of all wikis exists on an external site, and so that users have the peace of mind of seeing a backup that is readily available to both us and them.
General backup schedules
Up to date as of 12 January, 2023
Miraheze automatically runs the following backups for disaster recovery purposes:
- Weekly: Private Git repository for configuration management secrets and SSL keys
- Weekly: mhglobal (CreateWiki, ManageWiki, global tables) and reports (TSPortal) databases
- Fortnightly: All other databases in SQL format for all wikis and other services
- Fortnightly: Phabricator images and database
- Monthly: piwik (Matomo) database
- Monthly: XML dump of all private wikis
- 3-monthly: Full XML dumps of all wikis, including private wikis
- On demand: XML backups of all wikis scheduled for deletion
- Not currently run: Static images for all wikis
- Monthly: All public wikis; XML dumps uploaded to archive.org
On top of our internal, external, and public backups, users may generate their own backups in several ways.
DataDump is a Miraheze-developed extension that allows wiki administrators to easily and quickly create database dumps. Administrators can create XML backups (which hold all pages and revisions), image backups, and ManageWiki setting backups. It is the quickest, easiest and most convenient solution.
To use DataDump, go to Special:DataDump on your wiki and select the type of backup you want. Once you submit your request, the backup will be generated. Depending on the size of the wiki, generating a dump may take anywhere from a few seconds to a few hours.
DataDump offers an API module which lets users use DataDump via the command line. As of yet, there are no scripts that make use of this.
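As a rough starting point for such a script, the sketch below only builds a request URL for the MediaWiki Action API. It rests on assumptions: the `/w/api.php` path, the `example.miraheze.org` hostname, and the module name `viewdumps` are illustrative and not confirmed by this page. Check your wiki's `api.php` help output for the module names and parameters that DataDump actually registers.

```python
from urllib.parse import urlencode


def build_api_url(wiki_base: str, action: str, **params: str) -> str:
    """Build a MediaWiki Action API URL for the given module.

    wiki_base is the wiki's root URL; the /w/api.php path is an assumption
    and may differ on your wiki.
    """
    query = {"action": action, "format": "json", **params}
    return f"{wiki_base.rstrip('/')}/w/api.php?{urlencode(query)}"


# "viewdumps" is a hypothetical module name used for illustration only.
url = build_api_url("https://example.miraheze.org", "viewdumps")
print(url)
```

Actions that create or delete dumps would normally also need an authenticated session and a CSRF token, which this sketch deliberately leaves out.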
While we strongly recommend DataDump because it is the most convenient option, you may also generate a database dump using less interactive command-line scripts. We neither recommend nor endorse any script in particular. One well-known example, however, is MediaWiki Client Tools' MediaWiki Scraper, a Python 3 script based on the original WikiTeam Python 2.7 script.
User account information will not be preserved. The XML dump can include full or only most recent page history. The images dump will contain all file types with associated descriptions. The siteinfo.json file will contain information about wiki features such as the installed extensions and skins.
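To check whether an XML dump holds full history or only the most recent revisions, you can count the revisions stored per page. The sketch below assumes the standard MediaWiki export layout (`<page>` elements containing a `<title>` and one or more `<revision>` elements) and uses a small inline sample rather than a real dump. The function name and the sample data are illustrative.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip the XML namespace prefix, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def revisions_per_page(xml_text: str) -> dict[str, int]:
    """Map each page title in an export to its number of stored revisions."""
    counts: dict[str, int] = {}
    for page in ET.fromstring(xml_text):
        if local(page.tag) != "page":
            continue  # skip <siteinfo> and any other top-level elements
        title = None
        revisions = 0
        for child in page:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "revision":
                revisions += 1
        if title is not None:
            counts[title] = revisions
    return counts


# Tiny illustrative sample; a real dump carries far more metadata.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Main Page</title>
    <revision><id>1</id></revision>
    <revision><id>2</id></revision>
  </page>
  <page><title>Help</title>
    <revision><id>3</id></revision>
  </page>
</mediawiki>"""

print(revisions_per_page(SAMPLE))  # {'Main Page': 2, 'Help': 1}
```

If every page reports exactly one revision, the dump most likely holds only the latest history. For large dumps, a streaming parser such as `xml.etree.ElementTree.iterparse` would be a better fit than loading the whole file into memory.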
- Bacula (former backup system)