FluentD in Practice: Nginx Access Log
FROM fluent/fluentd:v1.8.1-1.0

# Use root account to use apk
USER root

# The RUN below installs a few plugins as examples; elasticsearch is not required.
# Customize the list of plugins as you wish.
RUN apk add --no-cache --update --virtual .build-deps \
        sudo build-base ruby-dev \
 && apk add mariadb-dev \
 && sudo gem install fluent-plugin-elasticsearch \
 && sudo gem install fluent-plugin-mongo \
 && sudo gem install fluent-plugin-sql \
 && sudo gem install mysql2 -v 0.5.2 \
 && sudo gem sources --clear-all \
 && apk del .build-deps \
 && rm -rf /home/fluent/.gem/ruby/2.5.0/cache/*.gem

VOLUME ["/fluentd/etc","/fluentd/log","/var/log"]


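version: '3'
services:
  fluentd:                      # service name assumed; the original key is missing
    build:
      context: .
      dockerfile: ./Dockerfile
    image: my/fluentd:latest
    ports:
      - 24224:24224
      - 24224:24224/udp
    container_name: fluentd
    volumes:
      - /docker-data/fluentd/etc:/fluentd/etc
      - /docker-data/fluentd/log:/fluentd/log
      - /var/log:/var/log
    networks:
      - mynet
networks:
  mynet:
    external:                   # assumed parent key of "name"; the original key is missing
      name: my-net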

FluentD Configuration

Config File Syntax
1. source
2. match
3. filter
4. system
5. label
6. @include

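source - where the data comes from

FluentD lets you choose its input sources; the standard input plugins include http and forward.
http makes FluentD listen for HTTP requests, while forward accepts events over TCP.

A source submits "events" to FluentD's routing engine.
An "event" consists of three entities: tag, time, and record.
eg. http://this.host:9880/myapp.access?json={"event":"data"}
tag: myapp.access
time: (current time)
record: {"event":"data"}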
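
The mapping from the example URL to the three entities can be sketched in Python. This is only an illustration of the idea, not how the http input plugin is implemented; the function name event_from_url is hypothetical.

```python
import json
import time
from urllib.parse import urlsplit, parse_qs

def event_from_url(url):
    """Derive (tag, time, record) from a request URL like the one above."""
    parts = urlsplit(url)
    tag = parts.path.lstrip("/")                  # the URL path becomes the tag
    payload = parse_qs(parts.query)["json"][0]    # the json= parameter carries the record
    record = json.loads(payload)
    return tag, time.time(), record               # time is simply "now" in this sketch

tag, ts, record = event_from_url(
    'http://this.host:9880/myapp.access?json={"event":"data"}'
)
print(tag, record)
```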

match - tells FluentD what to do

match looks for "events" whose tag matches its pattern and processes them; the standard output plugins include file and forward.

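How match works

* matches a single tag part
    ex. a.*
    matches a.b
    does not match a or a.b.c
** matches zero or more tag parts
    ex. a.**
    matches a or a.b or a.b.c
{X,Y,Z} matches X or Y or Z
    ex. {a,b}
    matches a or b
    does not match c
#{...} lets you use a Ruby expression (I don't know Ruby ...)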
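
The three wildcard rules above can be checked with a small Python sketch. It is an approximation written for this post, not Fluentd's own matcher: the names expand_braces and tag_matches are hypothetical, a wildcard must be a whole tag part, and #{...} is not handled.

```python
import re

def expand_braces(pattern):
    """Expand {X,Y,Z} groups into a list of plain patterns (no nesting)."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if not m:
        return [pattern]
    expanded = []
    for alt in m.group(1).split(","):
        expanded += expand_braces(pattern[:m.start()] + alt + pattern[m.end():])
    return expanded

def _match_parts(pat, tag):
    """Match a list of pattern parts against a list of tag parts."""
    if not pat:
        return not tag                      # both exhausted -> match
    head, rest = pat[0], pat[1:]
    if head == "**":
        # ** may absorb zero or more tag parts
        return any(_match_parts(rest, tag[i:]) for i in range(len(tag) + 1))
    if tag and head in ("*", tag[0]):
        # * consumes exactly one part; a literal must equal the part
        return _match_parts(rest, tag[1:])
    return False

def tag_matches(pattern, tag):
    return any(_match_parts(p.split("."), tag.split("."))
               for p in expand_braces(pattern))

for pattern, tag in [("a.*", "a.b"), ("a.*", "a.b.c"), ("a.**", "a"), ("{a,b}", "c")]:
    print(pattern, tag, tag_matches(pattern, tag))
```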

match is order-sensitive, so configure it carefully.
eg. put myapp.access before **; if myapp.access comes after **, it will never be reached.
If one "event" needs multiple outputs, consider the copy plugin.
<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>

# Capture all unmatched tags. Good :)
<match **>
  @type blackhole_plugin
</match>
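
Why the order matters can be shown with a first-match-wins loop. This is a simplified Python sketch: the function route is hypothetical and only understands exact tags and the catch-all **.

```python
def route(tag, rules):
    """Return the output of the first rule whose pattern accepts the tag."""
    for pattern, output in rules:
        # Simplified matcher: only exact tags and the catch-all ** are handled.
        if pattern == "**" or pattern == tag:
            return output
    return None

good_order = [("myapp.access", "file"), ("**", "blackhole_plugin")]
bad_order = [("**", "blackhole_plugin"), ("myapp.access", "file")]

print(route("myapp.access", good_order))  # file
print(route("myapp.access", bad_order))   # blackhole_plugin
```

In the second ordering the file output is never reached, because ** has already claimed the event.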

filter - the "event" processing pipeline

Input -> filter 1 -> ... -> filter n -> output
A filter must come before the match for the same tag; otherwise it will never run either.

eg. http://this.host:9880/myapp.access?json={"event":"data"}

The "event" received:
tag: myapp.access,
time: (current time),
record: {"event":"data"}
<source>
  @type http
  port 9880
</source>

record_transformer then adds host_param to the record:
{"event":"data"} => {"event":"data","host_param":"webserver1"}
<filter myapp.access>
  @type record_transformer
  <record>
    host_param "#{Socket.gethostname}"
  </record>
</filter>

<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>
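
The Input -> filter -> output flow can be mimicked in a few lines of Python. This is a sketch of the idea only: record_transformer and run_pipeline here are hypothetical stand-ins, and Python's socket.gethostname() plays the role of the Ruby expression.

```python
import socket

def record_transformer(record):
    """Mimic the record_transformer filter: add host_param to a copy of the record."""
    return {**record, "host_param": socket.gethostname()}

def run_pipeline(record, filters):
    """Apply each filter in order; the result is what the output would receive."""
    for apply_filter in filters:
        record = apply_filter(record)
    return record

result = run_pipeline({"event": "data"}, [record_transformer])
print(result)
```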

system - system-level settings

Most of these settings can also be set from the command line.

label - groups filters and outputs

"Events" from forward go through record_transformer -> elasticsearch
<source>
  @type forward
</source>

"Events" from tail carry @label @SYSTEM, so they go through grep -> s3
<source>
  @type tail
  @label @SYSTEM
</source>

<filter access.**>
  @type record_transformer
  <record>
    # ...
  </record>
</filter>
<match **>
  @type elasticsearch
  # ...
</match>

<label @SYSTEM>
  <filter var.log.middleware.**>
    @type grep
    # ...
  </filter>
  <match **>
    @type s3
    # ...
  </match>
</label>
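
The effect of @label can be summarized as a lookup: a source with a label is handled by that label's section, and everything else by the top-level filter and match blocks. The Python sketch below is a deliberately reduced model of that idea; the names TOP_LEVEL, LABELS, and pipeline_for are hypothetical, and tag patterns are ignored.

```python
# Plugin chains from the example configuration, in processing order.
TOP_LEVEL = ["record_transformer", "elasticsearch"]
LABELS = {"@SYSTEM": ["grep", "s3"]}

def pipeline_for(source):
    """Pick the chain an event from this source would travel through."""
    label = source.get("@label")
    return LABELS[label] if label else TOP_LEVEL

print(pipeline_for({"@type": "forward"}))
print(pipeline_for({"@type": "tail", "@label": "@SYSTEM"}))
```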

@include - reuse configuration files


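Since Fluentd v1.4.0 you can embed Ruby with #{...}

After editing fluent.conf, you can check from the command line whether the configuration has problems:
fluentd --dry-run -c fluent.conf

Inside a double-quoted string, \ is the escape character, so you can use \r \n \t ....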




A <parse> section can appear inside <source>, <match>, and <filter>.
Parse plugin
The parse parameters are as follows:
Parameter               Default  Description
types                            Not the same as @type; specifies types for converting fields into other types
time_key                nil      Time field for the event time
null_value_pattern      nil      Specify the null value pattern
null_empty_string       false    If true, an empty string field is replaced with nil
estimate_current_event  true     If true, use Fluent::EventTime.now (current time) as the timestamp when time_key is specified
keep_time_key           false    If true, keep the time field in the record
timeout                 nil      Specify a timeout for parse processing; this is mainly for detecting a wrong regexp pattern
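
To make types and null_empty_string more concrete, here is a Python sketch of what such conversions amount to. It is not the parser plugin's code: the helper names are hypothetical, the field names are borrowed from the Nginx example, only the simple field:type form is handled, and the set of strings treated as true for bool is an assumption.

```python
def apply_null_empty_string(record):
    """null_empty_string true: empty string fields become None (nil)."""
    return {k: (None if v == "" else v) for k, v in record.items()}

def convert_types(record, types):
    """Convert fields according to a 'field:type,field:type' spec."""
    casts = {
        "integer": int,
        "float": float,
        "string": str,
        # Assumption: these strings count as true, anything else as false.
        "bool": lambda v: str(v).lower() in ("true", "yes", "1"),
    }
    out = dict(record)
    for item in types.split(","):
        field, type_name = item.strip().split(":", 1)
        if out.get(field) is not None:       # leave missing or nil fields alone
            out[field] = casts[type_name](out[field])
    return out

raw = {"code": "200", "size": "512", "resptime": "0.004", "token": ""}
parsed = convert_types(apply_null_empty_string(raw),
                       "code:integer,size:integer,resptime:float")
print(parsed)
```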


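A <buffer> section must be placed inside <match>.
buffer accepts @type file or memory (default).
file is usually recommended for better durability.
An output plugin collects "events" and builds buffer chunks from them.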

# ...
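Multiple chunk keys are separated by commas; the argument may also be left blank.

When no buffer chunk key is specified, the output plugin appends all "events" to the same chunk until it is full.

<match tag.**>
  # ...
  <buffer>
    # ...
  </buffer>
</match>

# No chunk keys: All events will be appended into the same chunk.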

11:59:30 web.access {"key1":"yay","key2":100}  --|
12:00:01 web.access {"key1":"foo","key2":200}  --|---> CHUNK_A
12:00:25 ssh.login  {"key1":"yay","key2":100}  --|

When tag is used as the buffer chunk key, the output plugin writes events into separate chunks per tag.

<match tag.**>
  # ...
  <buffer tag>
    # ...
  </buffer>
</match>

# Tag chunk key: events will be separated per tags

11:59:30 web.access {"key1":"yay","key2":100}  --|
                                                 |---> CHUNK_A
12:00:01 web.access {"key1":"foo","key2":200}  --|

12:00:25 ssh.login  {"key1":"yay","key2":100}  ------> CHUNK_B

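When time is specified as the buffer chunk key, events are written into separate chunks per timekey interval.

<match tag.**>
  # ...
  <buffer time>
    timekey      1h # chunks per hours ("3600" also available)
    timekey_wait 5m # 5mins delay for flush ("300" also available)
  </buffer>
</match>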

# Time chunk key: events will be separated for hours (by timekey 3600)

11:59:30 web.access {"key1":"yay","key2":100}  ------> CHUNK_A

12:00:01 web.access {"key1":"foo","key2":200}  --|
                                                 |---> CHUNK_B
12:00:25 ssh.login  {"key1":"yay","key2":100}  --|

Other keys

<match tag.**>
  # ...
  <buffer key1>
    # ...
  </buffer>
</match>

# Chunk keys: events will be separated by values of "key1"

11:59:30 web.access {"key1":"yay","key2":100}  --|---> CHUNK_A
12:00:01 web.access {"key1":"foo","key2":200}  -)|(--> CHUNK_B
12:00:25 ssh.login  {"key1":"yay","key2":100}  --|

Combining multiple chunk keys

# <buffer tag,time>

11:58:01 ssh.login  {"key1":"yay","key2":100}  ------> CHUNK_A

11:59:13 web.access {"key1":"yay","key2":100}  --|
                                                 |---> CHUNK_B
11:59:30 web.access {"key1":"yay","key2":100}  --|

12:00:01 web.access {"key1":"foo","key2":200}  ------> CHUNK_C

12:00:25 ssh.login  {"key1":"yay","key2":100}  ------> CHUNK_D
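
The chunk diagrams in this section all follow one rule: events with the same chunk-key values land in the same chunk. The Python sketch below reproduces that grouping for the five sample events. It models only the grouping (not chunk size limits or flushing); the function names are hypothetical and times are plain seconds since midnight.

```python
from collections import defaultdict

def hms(text):
    """'HH:MM:SS' -> seconds since midnight."""
    h, m, s = map(int, text.split(":"))
    return h * 3600 + m * 60 + s

def chunk_key(event, keys, timekey=3600):
    """Build the tuple of chunk-key values that identifies an event's chunk."""
    ts, tag, record = event
    parts = []
    for key in keys:
        if key == "tag":
            parts.append(tag)
        elif key == "time":
            parts.append(ts - ts % timekey)   # start of the timekey window
        else:
            parts.append(record.get(key))     # any other key reads the record
    return tuple(parts)

def group_into_chunks(events, keys, timekey=3600):
    chunks = defaultdict(list)
    for event in events:
        chunks[chunk_key(event, keys, timekey)].append(event)
    return dict(chunks)

events = [
    (hms("11:58:01"), "ssh.login",  {"key1": "yay", "key2": 100}),
    (hms("11:59:13"), "web.access", {"key1": "yay", "key2": 100}),
    (hms("11:59:30"), "web.access", {"key1": "yay", "key2": 100}),
    (hms("12:00:01"), "web.access", {"key1": "foo", "key2": 200}),
    (hms("12:00:25"), "ssh.login",  {"key1": "yay", "key2": 100}),
]

for keys in ([], ["tag"], ["time"], ["key1"], ["tag", "time"]):
    print(keys, len(group_into_chunks(events, keys)), "chunk(s)")
```

With tag,time and a one-hour timekey the five events fall into four chunks, matching CHUNK_A to CHUNK_D above.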

Once chunk keys are specified, their values can be extracted and used in other configuration parameters.

<match log.*>
  @type file
  path  /data/${tag}/access.log  #=> "/data/log.map/access.log"
  <buffer tag>
    # ...
  </buffer>
</match>

When timekey is among the buffer chunk keys, strptime placeholders can be used:
<match log.*>
  @type file
  path  /data/${tag[1]}/access.%Y-%m-%d.%H%M.log #=> "/data/map/access.2017-02-28.2048.log"
  <buffer tag,time>
    timekey 1m
  </buffer>
</match>
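
How such a path template resolves can be traced with a short Python sketch. It imitates the placeholder substitution for ${tag}, ${tag[N]}, and the time format codes only; expand_path is a hypothetical helper, and the date used is the one from the example comment.

```python
import re
from datetime import datetime

def expand_path(template, tag, chunk_time):
    """Resolve ${tag}, ${tag[N]} and %-style time codes in a path template."""
    parts = tag.split(".")
    # ${tag[N]} -> the N-th dot-separated part of the tag
    path = re.sub(r"\$\{tag\[(\d+)\]\}",
                  lambda m: parts[int(m.group(1))], template)
    # ${tag} -> the whole tag
    path = path.replace("${tag}", tag)
    # %Y, %m, %d, %H, %M -> formatted from the chunk's time
    return chunk_time.strftime(path)

t = datetime(2017, 2, 28, 20, 48)
print(expand_path("/data/${tag}/access.log", "log.map", t))
print(expand_path("/data/${tag[1]}/access.%Y-%m-%d.%H%M.log", "log.map", t))
```

Because the format is %H%M, hour and minute are rendered back to back (2048), with no colon between them.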



Creating a Grafana Dashboard

Create your own dashboard
# The auto-completion is quite good: type a keyword and it suggests metric labels
# See also the Prometheus query basics: https://prometheus.io/docs/prometheus/latest/querying/basics/
# Or copy a panel from an existing dashboard into the new dashboard
ex: node_memory_MemTotal_bytes # fetch the server's total memory
# Add filter conditions inside {}
ex: node_memory_MemTotal_bytes{instance="${server 1}:9100"} # fetch data for a specific server
# Define Variables under Settings
ex: node_memory_MemTotal_bytes{instance=~"$node"} # variable name: node

Create an Alert
. The Visualization must be Graph

FluentD in Practice: Nginx Access Log, Supplement

The previous post gave examples for installing FluentD and configuring the Nginx access log format. This post adds:
1. Storing access_log in MySQL
2. Processing the input, e.g. parsing the path into separate fields, before passing it to the output

Further reading
FluentD parameter reference
FluentD in Practice: Nginx Access Log

Storing access_log in MySQL
<worker 0>
  <source>
    ... (omitted)
  </source>
  <match nginx.web.access>
    @type copy
    ... (omitted)
    <store>
      @type sql
      host ${MySQL Host address}
      port ${MySQL Port}
      adapter mysql2
      database ${MySQL Database}
      username ${MySQL User Name}
      password ${MySQL Password}
      <table>
        table ${MySQL table}
        column_mapping 'logtime:logtime,method:method,path:path,code:code,size:size,resptime:resptime,token:token,path_url:path_url,timestamp:created_at'
      </table>
    </store>
  </match>
</worker>

Processing the input, e.g. parsing the path into separate fields, before passing it to the output
Scenario: in the access log sample below, the query parameters need to be split out and stored in new fields to make analysis easier.
[27/Dec/2019:07:14:10 ...