Tech:Deploy-mediawiki

deploy-mediawiki is a new deployment tool, currently available only on test3, introduced as part of the rollout of a scap-like deploy process.

In future, the rollout will allow all deploy commands to be run from a single server for production (mw11).

The --servers argument is currently required. On test3, append --servers=skip to every command, or use --servers=all for most deployments. You can also pass a comma-separated list to deploy to specific servers.
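For example, a comma-separated list deploys to specific servers only (the server names below are illustrative):

```shell
# Deploy MediaWiki code to two specific servers only.
# Server names are illustrative; use your actual hostnames.
deploy-mediawiki --world --servers=mw11,mw12
```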

Logging
All errors and warnings will log failure messages to SAL. A deployment will abort if the commands to copy data from staging to the canary server fail.

You can add --no-log to direct output to your terminal rather than to logsalmsg.

The main parameters
To deploy config changes (this should happen automatically once puppet runs, and is logged):
 * deploy-mediawiki --config --servers=skip

To deploy MediaWiki with no i18n/l10n changes:
 * deploy-mediawiki --world --servers=skip

To deploy MediaWiki with i18n/l10n changes (equivalent to running MergeMessageLists and RebuildLC):
 * deploy-mediawiki --world --l10n --servers=skip

To deploy MediaWiki with gitinfo changes (this should be done when updating or installing extensions):
 * deploy-mediawiki --world --gitinfo --servers=skip

You can use any mix of the four parameters: --world, --config, --gitinfo, and --l10n.
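For instance, a full deploy touching config, code, l10n, and gitinfo can be run in one pass:

```shell
# Combine all four parameters in a single deployment.
deploy-mediawiki --world --config --l10n --gitinfo --servers=all
```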

To deploy a change to only a single file without syncing everything:
 * deploy-mediawiki --files=w/index.php,w/api.php --servers=skip

To sync a folder:
 * deploy-mediawiki --folders=w/extensions/Echo,w/skins/Vector --servers=skip

If a canary check fails
If you see 'DEPLOY ABORTED' in SAL, this means that one of the servers in the list failed the checks that ensure MediaWiki is not down. This is a fairly basic check, so it probably means the server is returning fatal errors outright.

If you need to bypass the check, use --force. The check will still run, but failures will be shown in your terminal output instead and ignored.
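A forced deploy looks like any other, with --force appended:

```shell
# Deploy despite canary check failures; failures are printed
# to the terminal and ignored rather than aborting the deploy.
deploy-mediawiki --world --servers=skip --force
```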

The ABORTED message will show which server failed; hopefully it will be 'localhost', which means the server you are deploying from.

If Icinga alerts do not show the failed server as depooled, you should get someone to depool it or roll back very quickly. You may want to deploy the rollback to only the broken server first, before deploying it everywhere.

Failing over a canary server
A short loss of service on the active canary server won't cause issues; you just won't be able to deploy new stuff. Eventually, though, you might need to fail over to a replacement. Follow these steps if you do:
 * 1) Ensure the replacement is fully up to date.
 * 2) Enable use_staging and is_canary on the new server, disable remote_sync, and switch default_sync to 'all' in puppet.
 * 3) Run puppet on the new server - this should add the staging clones, move the ssh access from the current canary to the new one, and set the config to sync everywhere.
 * 4) Add the ssh key to the ssh agent and ensure the agent is started. If a new ssh key is needed, generate one and update puppet to use the new key.

If the old server is being moved to a normal setup:
 * 1) Disable is_canary and use_staging, enable remote_sync, and set default_sync to 'skip' (the defaults in puppet).
 * 2) Run puppet. This will enable remote access from the new canary. It will not wipe the old staging folder.

If the server is being decommissioned:
 * 1) Follow the normal server lifecycle procedure.
 * 2) Wipe /srv/mediawiki-staging/ and disable the ssh agent to wipe the key.
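The ssh-agent part of step 4 can be sketched as follows; the key path and comment are illustrative, not the actual production paths:

```shell
# Start the ssh agent if it is not already running.
eval "$(ssh-agent -s)"

# Load the deploy key into the agent (path is illustrative).
ssh-add /path/to/deploy-key

# If a new key is needed, generate one (no passphrase, so deploys
# can run unattended) and update puppet with the public half.
ssh-keygen -t ed25519 -N '' -f /path/to/deploy-key -C "mediawiki-deploy"
```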