Tech:Graylog

Graylog is a log management solution for logs stored on the servers. As part of a central logging project, Miraheze is exploring Graylog for central log storage. The web interface is available at https://graylog.miraheze.org/. Access is restricted to Site Reliability Engineering department personnel, who can authenticate with their LDAP credentials. The point of contact for this service is User:Southparkfan.

Architecture
Graylog currently runs on graylog1.miraheze.org. Three daemons run there: graylog-server for the actual log management, elasticsearch for storing the logs and mongod for storing Graylog's configuration.

[Diagram: test2.miraheze.org runs MediaWiki, NGINX and syslog-ng, with /dev/log carrying kernel logs, etc. syslog-ng forwards logs over a TLS-encrypted connection on 12210/tcp to graylog-server on graylog1.miraheze.org, which also runs elasticsearch, mongod and NGINX. A Miraheze user appears on the test2 side and a Tech Team member on the graylog1 side.]

In the example above, test2 runs syslog-ng, which is responsible for receiving the logs locally and sending them to graylog-server. By setting  to 'syslog_ng' in Puppet, base::syslog will install syslog-ng and configure it to listen on   (for anything on the server that sends its logs to that destination, such as MediaWiki and NGINX) and on 'system' for services such as ssh and kernel logs.
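To make the local hand-off more concrete, the following is a minimal sketch of how a single log line could be formatted before a service passes it to syslog-ng. It assumes RFC 5424 message formatting; this page does not state which wire format is actually used, and the facility, host name and application name below are illustrative placeholders rather than values taken from the Miraheze configuration.

```python
from datetime import datetime, timezone

# Standard syslog codes (RFC 5424, section 6.2.1).
FACILITY_LOCAL0 = 16   # placeholder facility, not taken from this page
SEVERITY_WARNING = 4


def syslog_priority(facility: int, severity: int) -> int:
    """Return the PRI value: facility * 8 + severity."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity


def format_rfc5424(facility: int, severity: int, hostname: str,
                   app_name: str, message: str, timestamp: datetime) -> str:
    """Format one RFC 5424 line; `timestamp` must be timezone-aware."""
    pri = syslog_priority(facility, severity)
    stamp = timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # PROCID, MSGID and STRUCTURED-DATA are left as the nil value "-".
    return f"<{pri}>1 {stamp} {hostname} {app_name} - - - {message}"


if __name__ == "__main__":
    # Hypothetical host and application names, for illustration only.
    print(format_rfc5424(FACILITY_LOCAL0, SEVERITY_WARNING, "test2",
                         "mediawiki", "example warning",
                         datetime.now(timezone.utc)))
```

In practice syslog-ng takes care of framing, TLS and delivery to graylog-server; the sketch only shows what information a single message carries.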

Streams
Streams are Graylog's categories of data. By default, the  stream is the stream for every message sent to Graylog. Streams are useful to limit access for certain members. For example, MediaWiki Administrators can only access the streams for MediaWiki and NGINX logs.

Querying the data
Graylog has a search syntax that's close to Lucene's syntax. For MediaWiki and NGINX, custom fields have been defined: go to https://graylog.miraheze.org/search and click on 'Fields' on your left. Using these fields, you can query the data. For example:


 * View NGINX logs for your IP address:
 * View all SSH logs:
 * View all MediaWiki errors and warnings:
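
As a rough sketch of how such queries are composed, the helper below builds Lucene-style `field:"value"` terms and joins them with `AND` and `OR`. The field names (`remote_addr`, `application_name`, `level`) and the example values are hypothetical; check the 'Fields' panel for the names actually defined on graylog.miraheze.org.

```python
def term(field: str, value) -> str:
    """Build a quoted field:"value" term, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def all_of(*clauses: str) -> str:
    """Join clauses so that every one of them must match."""
    return " AND ".join(clauses)


def any_of(*clauses: str) -> str:
    """Join clauses so that at least one must match, grouped in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


if __name__ == "__main__":
    # Hypothetical field names and values, for illustration only.
    print(term("remote_addr", "203.0.113.7"))
    print(term("application_name", "sshd"))
    print(all_of(term("application_name", "mediawiki"),
                 any_of(term("level", 3), term("level", 4))))
```

The resulting strings can be pasted into the search bar at https://graylog.miraheze.org/search once the field names have been replaced with real ones.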

Access
For security reasons, the Graylog interface is not accessible without a SOCKS5 proxy, just like the Proxmox interface. To make using tunnels as easy as possible, please install SmartProxy (available for Chrome and Firefox). This page uses local port 8089 on your desktop or laptop for the SOCKS5 proxy over SSH, although other ports will work too. If you have access to graylog1, tunnel through graylog1.miraheze.org; otherwise, use any of the MediaWiki servers (mw*.miraheze.org).

In SmartProxy, create a proxy server: Proxy Server > Add server > Name = "Miraheze Proxy", Address = "127.0.0.1", Port = "8089", Protocol = "SOCKS5" > Save. Afterwards, create a proxy rule: Proxy Rules > Add rule > Rule type = "Search Domain and SubDomain", Domain = "graylog.miraheze.org", then "Apply Proxy" to "Miraheze Proxy" > Save and then click "Save" on the bottom of the page as well.

You can also watch this quick video to see what the SmartProxy configuration looks like.

OpenSSH
If using OpenSSH, you can use.
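
As a hedged illustration of what a tunnel invocation could look like, the sketch below assembles an `ssh` command line using OpenSSH's `-D` option (dynamic, SOCKS-style port forwarding) and `-N` option (do not run a remote command). The port 8089 and the host graylog1.miraheze.org come from the Access section; the username `exampleuser` and the helper function name are placeholders, and this is not necessarily the exact command Miraheze recommends.

```python
import shlex


def socks_tunnel_command(user: str, host: str, local_port: int = 8089) -> list:
    """Return the argv for an SSH session that opens a local SOCKS proxy."""
    if not 1 <= local_port <= 65535:
        raise ValueError("local_port must be between 1 and 65535")
    # -N: do not execute a remote command; -D: dynamic (SOCKS) forwarding
    # bound to the loopback address so only this machine can use the proxy.
    return ["ssh", "-N", "-D", f"127.0.0.1:{local_port}", f"{user}@{host}"]


if __name__ == "__main__":
    # "exampleuser" is a placeholder; substitute your own account name.
    argv = socks_tunnel_command("exampleuser", "graylog1.miraheze.org")
    print(shlex.join(argv))
```

While that session stays open, the SmartProxy server entry pointing at 127.0.0.1 port 8089 routes requests for graylog.miraheze.org through the tunnel.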

PuTTY
It is recommended to save this config to a session. Choose a server you would like to connect to. Go to Connection > SSH > Tunnels, enter  in   and select the radio buttons   and. If you are planning to use Graylog for an extended period of time, without using PuTTY for executing commands on servers (idle state), you may hit a timeout: see this for a fix.

Administration
Graylog is configured through a combination of Puppet and the web interface (configuration made in the web interface is ultimately stored in MongoDB on graylog1.miraheze.org). role::graylog holds graylog1's configuration, and base::syslog contains the configuration for every server logging to Graylog.

Future

 * Building a graylog2 server for replication/redundancy
 * Applying authentication mechanisms + encryption to Elasticsearch for Grafana <-> Elasticsearch integration