
GRAYLOG

This is a short write-up covering the deployment of a Graylog stack in Docker containers.

This approach uses a full Graylog configuration file in lieu of environment variables; the foundations for this setup can be found in the official Graylog Docker documentation.

In a departure from the documented single-file layout, there are three separate docker-compose files, one per service.
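All three files attach to the same external Docker network, which must exist before any of the services are started. A minimal sketch of creating it, assuming a /24 subnet that covers the static addresses used below:

$ docker network create --driver bridge --subnet 172.40.0.0/24 hilldocker0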

Graylog docker-compose

version: '3.7'

networks:
  hilldocker0:
    external: true

services:
  graylog:
    image: graylog/graylog:4.0
    container_name: graylog
    hostname: graylog

    networks:
      hilldocker0:
        ipv4_address: 172.40.0.22

    restart: unless-stopped

    ports:
      # Graylog web interface and REST API
      - "9001:9000"
      # Syslog TCP
      - "1520:1520"
      - "1514:1514"
      - "1515:1515"
      # Syslog UDP
      - "1520:1520/udp"
      - "1514:1514/udp"
      - "1515:1515/udp"
      # GELF TCP
      - "12201:12201"
      # GELF UDP
      - "12201:12201/udp"

    volumes:
      - ./persistent/var_lib:/usr/share/graylog/data
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro    

Mongo docker-compose

version: '3.7'

networks:
  hilldocker0:
    external: true

services:
  mongo:
    image: mongo:latest
    container_name: mongo
    hostname: mongo

    restart: unless-stopped

    volumes:
      - ./persistent/var_lib:/data/db

    networks:
      hilldocker0:
        ipv4_address: 172.40.0.20
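One caveat with mongo:latest: it will eventually pull a MongoDB major version newer than what Graylog 4.0 supports. Pinning the tag is safer, for example (4.2 is an assumption here; check the Graylog 4.0 compatibility notes for the currently supported versions):

    # Pinned to a release assumed to work with Graylog 4.0
    image: mongo:4.2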

Elasticsearch docker-compose

version: '3.7'

networks:
  hilldocker0:
    external: true

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.0
    container_name: elasticsearch
    hostname: elasticsearch

    restart: unless-stopped

    volumes:
      - ./persistent/var_lib:/usr/share/elasticsearch/data

    environment:
      - cluster.name=doc_log
      - node.name=docnode-1
      - http.host=172.40.0.21
      - transport.host=172.40.0.21
      - network.host=172.40.0.21
      # Binding to a non-loopback address puts Elasticsearch into production mode,
      # where the bootstrap checks require a discovery configuration; single-node
      # mode satisfies that for this one-node cluster.
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"

    ulimits:
      memlock:
        soft: -1
        hard: -1

    # Note: with plain docker-compose (outside swarm mode), run
    # `docker-compose --compatibility up` for this memory limit to take effect.
    deploy:
      resources:
        limits:
          memory: 2g

    networks:
      hilldocker0:
        ipv4_address: 172.40.0.21
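Once this container is up, Elasticsearch can be sanity-checked from the Docker host (addresses on a user-defined bridge network are reachable from the host itself):

$ curl 'http://172.40.0.21:9200/_cluster/health?pretty'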

Graylog configuration file

Note

Graylog is running as USER graylog with the ID 1000 in Docker. That ID needs to be able to read the configuration files you place into the container.

This is only a partial file indicating the changed values. To get started, create the new configuration directory next to the docker-compose.yml file and copy the default files from GitHub:

$ mkdir -p ./graylog/config
$ cd ./graylog/config
$ wget https://raw.githubusercontent.com/Graylog2/graylog-docker/4.0/config/graylog.conf
$ wget https://raw.githubusercontent.com/Graylog2/graylog-docker/4.0/config/log4j2.xml
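Note that the Graylog compose file above mounts ./persistent/var_lib over /usr/share/graylog/data, so the two files fetched here need to end up in ./persistent/var_lib/config to be picked up. Alternatively, mount them explicitly; a sketch of the amended volumes block:

    volumes:
      - ./persistent/var_lib:/usr/share/graylog/data
      # Explicit config mounts, as an alternative to copying the files
      # into ./persistent/var_lib/config
      - ./graylog/config/graylog.conf:/usr/share/graylog/data/config/graylog.conf:ro
      - ./graylog/config/log4j2.xml:/usr/share/graylog/data/config/log4j2.xml:ro
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro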

From this base make changes similar to the following, aligned with your own network configuration.

############################
# GRAYLOG CONFIGURATION FILE
############################
#
# If you are running more than one instance of Graylog server, you have to select one of these
# instances as master. The master will perform some periodical tasks that non-masters won't perform.
is_master = true

# The auto-generated node ID will be stored in this file and read after restarts.
node_id_file = /usr/share/graylog/data/config/node-id

# You MUST set a secret to secure/pepper the stored user passwords here. Use at least 64 characters.
# Generate one by using for example: pwgen -N 1 -s 96
password_secret =

# The default root user is named 'admin'
root_username = admin

# Default password: admin
# CHANGE THIS!
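# Create the hash with, for example: echo -n yourpassword | shasum -a 256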
root_password_sha2 =

# The email address of the root user.
# Default is empty
root_email = "user@domain.co.nz"

# The time zone setting of the root user. See http://www.joda.org/joda-time/timezones.html for a list of valid time zones.
# Default is UTC
root_timezone = Pacific/Auckland

# Set plugin directory here (relative or absolute)
plugin_dir = /usr/share/graylog/plugin

###############
# HTTP settings
###############

#### HTTP bind address
#
# The network interface used by the Graylog HTTP interface.
#
# Default: 127.0.0.1:9000
http_bind_address = 172.40.0.22:9000
#http_bind_address = [2001:db8::1]:9000
#http_bind_address = 0.0.0.0:9001

#### HTTP publish URI
#
# The HTTP URI of this Graylog node which is used to communicate with the other Graylog nodes in the cluster and by all
# clients using the Graylog web interface.
#
# Default: http://$http_bind_address/
#http_publish_uri = http://$http_bind_address/

#### External Graylog URI
#
# The public URI of Graylog which will be used by the Graylog web interface to communicate with the Graylog REST API.
#
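# In this setup, 192.168.0.10 is assumed to be the Docker host's LAN address; 9001 is
# the published host port that maps to the container's 9000.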
http_external_uri = http://192.168.0.10:9001/

# List of Elasticsearch hosts Graylog should connect to.
# Need to be specified as a comma-separated list of valid URIs for the http ports of your elasticsearch nodes.
# If one or more of your elasticsearch hosts require authentication, include the credentials in each node URI that
# requires authentication.
#
# Default: http://127.0.0.1:9200
elasticsearch_hosts = http://172.40.0.21:9200

# Do you want to allow searches with leading wildcards? This can be extremely resource hungry and should only
# be enabled with care. See also: http://docs.graylog.org/en/2.1/pages/queries.html
allow_leading_wildcard_searches = false

# Do you want to allow searches to be highlighted? Depending on the size of your messages this can be memory hungry and
# should only be enabled after making sure your Elasticsearch cluster has enough memory.
allow_highlighting = false

# Added by RAH 10 Feb 2020
elasticsearch_shards = 2
elasticsearch_replicas = 0
elasticsearch_index_prefix = glhd

# ("outputbuffer_processors" variable)
output_batch_size = 500

# Flush interval (in seconds) for the Elasticsearch output. This is the maximum amount of time between two
# batches of messages written to Elasticsearch. It is only effective at all if your minimum number of messages
# for this time period is less than output_batch_size * outputbuffer_processors.
output_flush_interval = 1

# As stream outputs are loaded only on demand, an output which is failing to initialize will be tried over and
# over again. To prevent this, the following configuration options define after how many faults an output will
# not be tried again for an also configurable amount of seconds.
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30

# The number of parallel running processors.
# Raise this number if your buffers are filling up.
processbuffer_processors = 5
outputbuffer_processors = 3
# Wait strategy describing how buffer processors wait on a cursor sequence. (default: sleeping)
processor_wait_strategy = blocking

# Size of internal ring buffers. Raise this if raising outputbuffer_processors does not help anymore.
# For optimum performance your LogMessage objects in the ring buffer should fit in your CPU L3 cache.
# Must be a power of 2. (512, 1024, 2048, ...)
ring_size = 65536

inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking

# Enable the disk based message journal.
message_journal_enabled = true

# How many seconds to wait between marking node as DEAD for possible load balancers and starting the actual
# shutdown process. Set to 0 if you have no status checking load balancers in front.
lb_recognition_period_seconds = 3

# MongoDB connection string
# See https://docs.mongodb.com/manual/reference/connection-string/ for details
mongodb_uri = mongodb://172.40.0.20:27017/graylog

# Increase this value according to the maximum connections your MongoDB server can handle from a single client
# if you encounter MongoDB connection problems.
mongodb_max_connections = 100

# Number of threads allowed to be blocked by MongoDB connections multiplier. Default: 5
# If mongodb_max_connections is 100, and mongodb_threads_allowed_to_block_multiplier is 5,
# then 500 threads can block. More than that and an exception will be thrown.
# http://api.mongodb.com/java/current/com/mongodb/MongoOptions.html#threadsAllowedToBlockForConnectionMultiplier
mongodb_threads_allowed_to_block_multiplier = 5

# Automatically load content packs in "content_packs_dir" on the first start of Graylog.
content_packs_loader_enabled = true

# The directory which contains content packs which should be loaded on the first start of Graylog.
content_packs_dir = /usr/share/graylog/data/contentpacks

# A comma-separated list of content packs (files in "content_packs_dir") which should be applied on
# the first start of Graylog.
# Default: empty
content_packs_auto_load = grok-patterns.json

# For some cluster-related REST requests, the node must query all other nodes in the cluster. This is the maximum number
# of threads available for this. Increase it, if '/cluster/*' requests take long to complete.
# Should be http_thread_pool_size * average_cluster_size if you have a high number of concurrent users.
proxied_requests_thread_pool_size = 32
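With the configuration in place, the stack can be brought up. Because the services live in three separate compose files, depends_on cannot express their ordering, so start MongoDB and Elasticsearch before Graylog. A sketch, assuming each file sits in its own directory (the paths are illustrative):

$ docker-compose -f mongo/docker-compose.yml up -d
$ docker-compose -f elasticsearch/docker-compose.yml up -d
$ docker-compose -f graylog/docker-compose.yml up -d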

Mounting /etc/localtime and /etc/timezone into the Graylog container is an attempt to set the correct timezone as the container is brought up. If this doesn't work, proceed with the next step.

Once the containers have started and Graylog is healthy, log into the Graylog container as root and run dpkg-reconfigure tzdata. This ensures time is unified across the admin user, the browser, and the Graylog server.
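A minimal way to do that, assuming the container name graylog from the compose file above:

$ docker exec -it --user root graylog /bin/bash
# dpkg-reconfigure tzdata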

