Monitoring GCE and Nginx with Stackdriver Monitoring

This blog runs on WordPress on GCE.

I want to monitor this blog with Stackdriver, which integrates well with GCP.
As long as the log volume stays within 50 GB per month, it fits within the free tier.

Stackdriver setup
What is collected by default?
Monitoring Nginx
References

Stackdriver setup

In the GCP console, select Stackdriver and create a Stackdriver account.

Install the Stackdriver monitoring agent on the target GCE instance.

curl -O https://repo.stackdriver.com/stack-install.sh
sudo bash stack-install.sh --write-gcm

The installer creates the following directory.

ls -la /etc/stackdriver/

In the Stackdriver console, go to:

Uptime Checks
Uptime Checks Overview
Create Uptime Checks
On the New Uptime Check screen, change the following settings and click Save.
Title : blog
Check Type : HTTPS
Resource Type : Instance
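
As a rough manual counterpart to the uptime check configured above, the HTTPS endpoint can also be probed from a shell. This is only a minimal sketch: the helper names `fetch_status` and `is_success_status` and the URL `https://blog.example.com/` are placeholders that do not come from the original setup, and treating only 2xx responses as "up" is an assumption, not necessarily the rule Stackdriver applies.

```shell
# Return success (0) if the given HTTP status code is in the 2xx range.
# Assumption: only 2xx counts as "up" for this manual check.
is_success_status() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print only the HTTP status code of a URL; the response body is discarded.
fetch_status() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Usage with a placeholder URL (replace with the blog's real address):
#   is_success_status "$(fetch_status https://blog.example.com/)" && echo up || echo down
```

Splitting the fetch from the classification keeps the 2xx rule easy to adjust if redirects should also count as healthy.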

After about 10 minutes, graphs start to appear in the Stackdriver console.

The graphs can be viewed under Resources → Instances → your instance name.

What is collected by default?

The configuration file is located at the path below, so let's take a look.

less /etc/stackdriver/collectd.conf

Below is the default configuration (the formatting is rather messy...).

Interval 60

# Explicitly set hostname to "" to indicate the default resource.
Hostname ""

# The Stackdriver agent does not use fully qualified domain names.
FQDNLookup false

LoadPlugin syslog
<Plugin "syslog">
  LogLevel "info"
</Plugin>
# if you uncomment this, you will get collectd logs separate from syslog
#LoadPlugin logfile
#<Plugin "logfile">
#  LogLevel "info"
#  File "/var/log/collectd.log"
#  Timestamp true
#</Plugin>

LoadPlugin df
<Plugin "df">
  FSType "devfs"
  IgnoreSelected true
  ReportByDevice true
  ValuesPercentage true
</Plugin>

LoadPlugin cpu
<Plugin "cpu">
  ValuesPercentage true
  ReportByCpu false
</Plugin>
LoadPlugin swap
<Plugin "swap">
  ValuesPercentage true
</Plugin>
LoadPlugin interface
LoadPlugin disk
LoadPlugin load
LoadPlugin memory
<Plugin "memory">
  ValuesPercentage true
</Plugin>
LoadPlugin processes
LoadPlugin tcpconns
LoadPlugin write_gcm
LoadPlugin match_regex
LoadPlugin match_throttle_metadata_keys
LoadPlugin stackdriver_agent

<Plugin "processes">
  ProcessMatch "all" ".*"
  Detail "ps_cputime"
  Detail "ps_disk_octets"
  Detail "ps_rss"
  Detail "ps_vm"
</Plugin>

<Plugin "disk">
  # No config - collectd fails parsing configuration if tag is empty.
</Plugin>

<Plugin "tcpconns">
  AllPortsSummary true
</Plugin>

LoadPlugin exec
# Monitor the Stackdriver Logging agent. This should fail gracefully if for any
# reason the metrics endpoint for the Logging agent isn't reachable.
<Plugin "exec">
  # The script doesn't need any privileges, so run as 'nobody'.
  Exec "nobody" "/opt/stackdriver/collectd/bin/stackdriver-read_agent_logging" "http://localhost:24231/metrics"
</Plugin>

LoadPlugin aggregation
LoadPlugin "match_regex"
<Plugin "memory">
  ValuesPercentage true
</Plugin>

PostCacheChain "PostCache"
<Chain "PostCache">
  <Rule "processes">
    <Match "regex">
      Plugin "^processes$"
      Type "^(ps_cputime|disk_octets|ps_rss|ps_vm)$"
    </Match>
    <Target "jump">
      Chain "MaybeThrottleProcesses"
    </Target>
    Target "stop"
  </Rule>
  <Rule "otherwise">
    <Match "throttle_metadata_keys">
      OKToThrottle false
    </Match>
    <Target "write">
      Plugin "write_gcm"
    </Target>
  </Rule>
</Chain>

<Chain "MaybeThrottleProcesses">
  <Rule "default">
    <Match "throttle_metadata_keys">
      OKToThrottle true
      TrackedMetadata "processes:pid"
      TrackedMetadata "processes:command"
      TrackedMetadata "processes:command_line"
      TrackedMetadata "processes:owner"
    </Match>
    <Target "write">
       Plugin "write_gcm"
    </Target>
  </Rule>
</Chain>

# if you have other config, especially for plugins, you can drop them
# into this directory
Include "/opt/stackdriver/collectd/etc/collectd.d"
Include "/etc/stackdriver/collectd.d"

From a quick read, the agent collects general server metrics at 1-minute intervals (Interval 60) through plugins such as the following.

syslog
df
cpu
swap
processes
exec
tcpconns
memory
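
Instead of reading the file by eye, the loaded plugins can be listed mechanically. The following is a minimal sketch; the function name `list_loaded_plugins` is made up for illustration, and it only looks at uncommented `LoadPlugin` lines in the single file it is given (files pulled in through `Include` are not followed).

```shell
# List the unique plugin names that a collectd config file loads.
# Handles both `LoadPlugin df` and `LoadPlugin "match_regex"` spellings,
# and skips commented-out lines such as `#LoadPlugin logfile`.
list_loaded_plugins() {
  grep -E '^[[:space:]]*LoadPlugin[[:space:]]' "$1" \
    | awk '{print $2}' \
    | tr -d '"' \
    | sort -u
}

# Usage (path taken from the article):
#   list_loaded_plugins /etc/stackdriver/collectd.conf
```

Run against the default configuration shown above, this should also surface plugins such as interface, disk, and load, which are loaded without their own configuration blocks.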

Monitoring Nginx

To collect application metrics, a plugin has to be installed for the agent.

This time, I'll monitor the Nginx instance that serves WordPress.

First, the Nginx status handler has to be enabled on the Nginx side.

Do this with root privileges.

sudo su

Place status.conf in the Nginx conf.d directory.

(cd /etc/nginx/conf.d/ && curl -O https://raw.githubusercontent.com/Stackdriver/stackdriver-agent-service-configs/master/etc/nginx/conf.d/status.conf)

Restart Nginx.

systemctl restart nginx
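
After the restart, it may be worth checking that the status handler actually responds before moving on. The sketch below parses the standard `stub_status` output format; the function name `active_connections` is hypothetical, and the URL `http://127.0.0.1/nginx_status` is an assumption about what the downloaded status.conf exposes, so check the file for the actual address and path.

```shell
# Extract the "Active connections" value from stub_status output on stdin.
# The first line of stub_status output looks like: "Active connections: 3"
active_connections() {
  awk '/^Active connections:/ {print $3}'
}

# Usage, assuming the status page is served at this URL (an assumption;
# confirm it in /etc/nginx/conf.d/status.conf):
#   curl -s http://127.0.0.1/nginx_status | active_connections
```

If the command prints nothing, the status handler is probably not reachable at that URL.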

On the Stackdriver side, place the Nginx monitoring plugin configuration.

(cd /opt/stackdriver/collectd/etc/collectd.d/ && curl -O https://raw.githubusercontent.com/Stackdriver/stackdriver-agent-service-configs/master/etc/collectd.d/nginx.conf)

Restart the Stackdriver agent.

systemctl restart stackdriver-agent
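
If Nginx does not show up later, a quick first check is whether both downloaded files ended up where the steps above put them. This is a small sketch with a made-up helper name, `check_files`; the two paths in the usage comment are the ones used in this article.

```shell
# Report whether each given file exists; return non-zero if any is missing.
check_files() {
  missing=0
  for f in "$@"; do
    if [ -f "$f" ]; then
      echo "ok      $f"
    else
      echo "missing $f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (paths from the steps above):
#   check_files /etc/nginx/conf.d/status.conf \
#               /opt/stackdriver/collectd/etc/collectd.d/nginx.conf
# The agent's service state can be checked separately with:
#   systemctl is-active stackdriver-agent
```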

Nginx can now be selected under Resources in the Stackdriver console.

References

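[Introduction to GCP, Part 15] Introducing Google Stackdriver, which can monitor everything from GCP to AWS!

[Introduction to GCP, Part 16] Introducing Stackdriver Monitoring, which lets you check application performance visually!

[Introduction to GCP, Part 17] Let's monitor Google Compute Engine with Stackdriver Monitoring!

Nginx plugin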