FluentD
An efficient, unified log collector
Further reading:
FluentD in practice: Nginx Access Log
FluentD in practice: Nginx Access Log (supplement)
FluentD Installation
Dockerfile
FROM fluent/fluentd:v1.8.1-1.0
# Use root account to use apk
USER root
# The RUN below installs plugins as examples; elasticsearch is not required
# you may customize the installed plugins as you wish
RUN apk add --no-cache --update --virtual .build-deps \
        sudo build-base ruby-dev \
 && apk add mariadb-dev \
 && sudo gem install fluent-plugin-elasticsearch \
 && sudo gem install fluent-plugin-mongo \
 && sudo gem install fluent-plugin-sql \
 && sudo gem install mysql2 -v 0.5.2 \
 && sudo gem sources --clear-all \
 && apk del .build-deps \
 && rm -rf /home/fluent/.gem/ruby/2.5.0/cache/*.gem
VOLUME ["/fluentd/etc","/fluentd/log","/var/log"]
docker-compose.yml
version: '3'
services:
  fluentd:
    build:
      context: .
      dockerfile: ./Dockerfile
    image: my/fluentd:latest
    ports:
      - 24224:24224
      - 24224:24224/udp
    container_name: fluentd
    volumes:
      - /docker-data/fluentd/etc:/fluentd/etc
      - /docker-data/fluentd/log:/fluentd/log
      - /var/log:/var/log
    networks:
      - mynet
networks:
  mynet:
    external:
      name: my-net
FluentD Configuration
Config File Syntax
A configuration file is typically composed of the following directives:
1. source
2. match
3. filter
4. system
5. label
6. @include
source - where the data comes from
FluentD lets you choose the data source. The standard input sources include http and forward: http makes FluentD listen for HTTP messages, while forward listens on TCP.
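A minimal sketch of the two standard sources; the port values shown are the commonly documented defaults:

```
# Listen for HTTP requests on port 9880
<source>
  @type http
  port 9880
</source>

# Listen for forward (TCP) messages on port 24224
<source>
  @type forward
  port 24224
</source>
```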
Routing
A source submits "events" to FluentD's routing engine.
An "event" consists of three entities: tag, time, and record.
eg. http://this.host:9880/myapp.access?json={"event":"data"}
tag: myapp.access
time: (current time)
record: {"event":"data"}
match - tells FluentD what to do
match looks for events with a matching tag and processes them. The standard outputs include file and forward.
How match patterns work
* matches a single tag part
ex. a.*
matches a.b
does not match a or a.b.c
** matches zero or more tag parts
ex. a.**
matches a, a.b, or a.b.c
{X,Y,Z} matches X or Y or Z
ex. {a,b}
matches a or b
does not match c
#{...} evaluates a Ruby expression (don't know Ruby ...)
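As an aside, the Ruby expression inside #{...} is evaluated when the config is loaded; for example, the "#{Socket.gethostname}" used later in this post expands to the machine's hostname. A quick way to see what such an expression evaluates to:

```ruby
require 'socket'

# This is what "#{Socket.gethostname}" expands to in a Fluentd config string.
puts Socket.gethostname
```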
match directives are order-sensitive, so configure them carefully.
eg. myapp.access must come before **; if myapp.access comes after **, it will never be reached.
If your "event" needs multiple outputs, consider the copy plugin.
<match myapp.access>
@type file
path /var/log/fluent/access
</match>
# Capture all unmatched tags. Good :)
<match **>
@type blackhole_plugin
</match>
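For the multiple-outputs case mentioned above, a copy sketch; the elasticsearch host and port here are placeholders:

```
<match myapp.access>
  @type copy
  <store>
    @type file
    path /var/log/fluent/access
  </store>
  <store>
    @type elasticsearch
    host example-es-host   # placeholder
    port 9200
  </store>
</match>
```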
filter - the event processing pipeline
Input -> filter 1 -> ... -> filter n -> Output
A filter must come before its match, otherwise it will never run either.
eg. http://this.host:9880/myapp.access?json={"event":"data"}
The event received:
tag: myapp.access,
time: (current time),
record: {"event":"data"}
<source>
@type http
port 9880
</source>
record_transformer then adds host_param to the record:
{"event":"data"} => {"event":"data","host_param":"webserver1"}
<filter myapp.access>
@type record_transformer
<record>
host_param "#{Socket.gethostname}"
</record>
</filter>
Finally, output to a file:
<match myapp.access>
@type file
path /var/log/fluent/access
</match>
system - system-wide settings
Most of these settings can also be given on the command line.
label - groups filters and outputs
Events from the forward source go through record_transformer -> elasticsearch:
<source>
@type forward
</source>
Events from the tail source, because of @label, go through grep -> s3:
<source>
@type tail
@label @SYSTEM
</source>
<filter access.**>
@type record_transformer
<record>
# ...
</record>
</filter>
<match **>
@type elasticsearch
# ...
</match>
<label @SYSTEM>
<filter var.log.middleware.**>
@type grep
# ...
</filter>
<match **>
@type s3
# ...
</match>
</label>
@include - reuse configuration files
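A sketch of @include usage; the included paths here are hypothetical:

```
# Absolute paths, relative paths, and glob patterns are all supported
@include /fluentd/etc/common.conf
@include config.d/*.conf
```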
Reminders
Since Fluentd v1.4.0 you can embed Ruby with #{...}.
After modifying fluentd.conf, you can check from the command line whether the conf has problems:
fluentd --dry-run -c fluent.conf
Inside a double-quoted " string, \ is the escape character, so you can use \r, \n, \t, and so on.
Other Sections
Parse
Buffer
Format
Extract
Inject
Parse
Parse analyzes raw input data. A <parse> section can appear inside source, match, and filter.
Parse plugin
The Parse parameters are as follows:
| Parameter | Default | Description |
| --- | --- | --- |
| types | | Not to be confused with @type; specifies types for converting fields into other types |
| time_key | nil | Specify the time field for the event time |
| null_value_pattern | nil | Specify null value pattern |
| null_empty_string | false | If true, an empty string field is replaced with nil |
| estimate_current_event | true | If true, use Fluent::EventTime.now (current time) as the timestamp when time_key is specified |
| keep_time_key | false | If true, keep the time field in the record |
| timeout | nil | Specify a timeout for parse processing; mainly for detecting a wrong regexp pattern |
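A sketch tying a few of these parameters together; the log path and regexp below are made up for illustration:

```
<source>
  @type tail
  path /var/log/myapp/app.log    # hypothetical log file
  tag myapp.log
  <parse>
    @type regexp
    expression /^(?<time>[^ ]+) (?<code>\d+) (?<message>.*)$/
    time_key time                # use the "time" field as the event time
    keep_time_key true           # also keep "time" in the record
    types code:integer           # convert "code" from string to integer
  </parse>
</source>
```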
Buffer
A <buffer> section must live inside match. buffer accepts @type file or memory (default).
Using file is generally recommended for durability.
Output plugins build buffer chunks from the events they collect.
eg.
<buffer ARGUMENT_CHUNK_KEYS>
# ...
</buffer>
Use , or whitespace to separate multiple chunk keys.
When no buffer chunk key is specified, the output plugin appends all events into the same chunk until it can hold no more.
<match tag.**>
# ...
<buffer>
# ...
</buffer>
</match>
No chunk keys: All events will be appended into the same chunk.
11:59:30 web.access {"key1":"yay","key2":100} --|
|
12:00:01 web.access {"key1":"foo","key2":200} --|---> CHUNK_A
|
12:00:25 ssh.login {"key1":"yay","key2":100} --|
When tag is the buffer chunk key, the output plugin writes events into separate chunks per tag.
<match tag.**>
# ...
<buffer tag>
# ...
</buffer>
</match>
Tag chunk key: events will be separated per tags
11:59:30 web.access {"key1":"yay","key2":100} --|
|---> CHUNK_A
12:00:01 web.access {"key1":"foo","key2":200} --|
12:00:25 ssh.login {"key1":"yay","key2":100} ------> CHUNK_B
When time is specified as the buffer chunk key:
<match tag.**>
# ...
<buffer time>
timekey 1h # chunks per hours ("3600" also available)
timekey_wait 5m # 5mins delay for flush ("300" also available)
</buffer>
</match>
# Time chunk key: events will be separated for hours (by timekey 3600)
11:59:30 web.access {"key1":"yay","key2":100} ------> CHUNK_A
12:00:01 web.access {"key1":"foo","key2":200} --|
|---> CHUNK_B
12:00:25 ssh.login {"key1":"yay","key2":100} --|
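The grouping above can be reproduced with a small sketch of the timekey window arithmetic (Fluentd's actual chunking logic is more involved; this only models the window calculation):

```python
from datetime import datetime, timezone

def timekey_bucket(ts: datetime, timekey: int = 3600) -> int:
    """Return the start (epoch seconds) of the timekey window the event falls into."""
    epoch = int(ts.timestamp())
    return epoch - (epoch % timekey)

# The three events from the diagram above (the date is arbitrary)
a = timekey_bucket(datetime(2017, 2, 28, 11, 59, 30, tzinfo=timezone.utc))
b = timekey_bucket(datetime(2017, 2, 28, 12, 0, 1, tzinfo=timezone.utc))
c = timekey_bucket(datetime(2017, 2, 28, 12, 0, 25, tzinfo=timezone.utc))

print(a != b)  # True: 11:59:30 falls in CHUNK_A's hour window
print(b == c)  # True: both 12:00:xx events share CHUNK_B's window
```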
Other keys
<match tag.**>
# ...
<buffer key1>
# ...
</buffer>
</match>
# Chunk keys: events will be separated by values of "key1"
11:59:30 web.access {"key1":"yay","key2":100} --|---> CHUNK_A
|
12:00:01 web.access {"key1":"foo","key2":200} -)|(--> CHUNK_B
|
12:00:25 ssh.login {"key1":"yay","key2":100} --|
Combining multiple chunk keys
# <buffer tag,time>
11:58:01 ssh.login {"key1":"yay","key2":100} ------> CHUNK_A
11:59:13 web.access {"key1":"yay","key2":100} --|
|---> CHUNK_B
11:59:30 web.access {"key1":"yay","key2":100} --|
12:00:01 web.access {"key1":"foo","key2":200} ------> CHUNK_C
12:00:25 ssh.login {"key1":"yay","key2":100} ------> CHUNK_D
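Putting this together, a match block keyed on both tag and time might look like the following; the path is illustrative:

```
<match tag.**>
  @type file
  path /data/access   # illustrative path
  <buffer tag,time>
    timekey 1h
    timekey_wait 5m
  </buffer>
</match>
```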
Placeholders
When chunk keys are specified, their values can be extracted and used as values for other configuration parameters:
<match log.*>
@type file
path /data/${tag}/access.log #=> "/data/log.map/access.log"
<buffer tag>
# ...
</buffer>
</match>
When timekey is among the buffer chunk keys, strptime format specifiers can be used:
<match log.*>
@type file
path /data/${tag[1]}/access.%Y-%m-%d.%H%M.log #=> "/data/map/access.2017-02-28.2048.log"
<buffer tag,time>
timekey 1m
</buffer>
</match>
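The ${tag[1]} above indexes into the dot-separated tag parts; a sketch of that lookup in Python (the expansion itself is done by Fluentd):

```python
tag = "log.map"
parts = tag.split(".")  # ${tag} -> "log.map", ${tag[0]} -> "log", ${tag[1]} -> "map"

print(parts[0])  # log
print(parts[1])  # map
```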