Network Traffic Analytics: GoFlow2 + Kafka + ClickHouse + Grafana

Enterprise NOC Network Traffic Analytics Pipeline with GoFlow2, Kafka, ClickHouse, and Grafana

This guide walks through a Network Traffic Analytics pipeline end to end, built on a modern, lightweight, and automated architecture.

📋 System Requirements

  • OS: Ubuntu Server 22.04 / 24.04, fresh install
  • Minimum hardware: 4 CPU cores, 8 GB RAM; 16 GB is recommended for traffic above 10 Gbps
  • Storage: SSD/NVMe
  • Software: Docker, Docker Compose, Git, Python 3
  • Permissions: your user has been added to the docker group

Install Docker and Docker Compose

sudo apt update
sudo apt install docker.io docker-compose-v2 -y
sudo systemctl enable --now docker

Add Your User to the docker Group

sudo usermod -aG docker $USER
newgrp docker

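Proxmox LXC Configuration (If Running Inside a Container)

On the Proxmox host, open the container's configuration file, replacing CTID with your container ID:

nano /etc/pve/lxc/CTID.conf

Make sure the file contains these two lines:

unprivileged: 0
lxc.apparmor.profile: unconfined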

🚀 Stage 1: Installing the Core Services with Docker Compose

The stack runs Kafka in KRaft mode (no ZooKeeper) to save RAM, GoFlow2 to collect sFlow, ClickHouse for forensic analytics, and Grafana for visualization.

Create the Project Directory

mkdir -p ~/flow-monitor/data/kafka
cd ~/flow-monitor

Create the docker-compose.yml File

nano docker-compose.yml

Contents of docker-compose.yml

version: '3.8'

services:
  goflow2:
    image: netsampler/goflow2:latest
    restart: unless-stopped
    command: -format=json -transport=kafka -transport.kafka.brokers=kafka:29092 -transport.kafka.topic=flows
    ports:
      - "6343:6343/udp"
    depends_on:
      - kafka

  kafka:
    image: confluentinc/cp-kafka:7.6.0
    restart: unless-stopped
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka:29093'
      CLUSTER_ID: 'ciWo7IWazngRchmPES6q5A'
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://127.0.0.1:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_HEAP_OPTS: "-Xmx1G -Xms512M"
      KAFKA_LOG_RETENTION_HOURS: 2
      KAFKA_LOG_RETENTION_BYTES: 1073741824
      TZ: Asia/Jakarta
    volumes:
      - ./data/kafka:/var/lib/kafka/data
    ports:
      - "127.0.0.1:9092:9092"

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    restart: unless-stopped
    ports:
      - "8123:8123"
      - "9000:9000"
    volumes:
      - ./data/clickhouse:/var/lib/clickhouse
    depends_on:
      - kafka

  grafana:
    image: grafana/grafana:latest
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - ./data/grafana:/var/lib/grafana
    depends_on:
      - clickhouse

Start the Whole Stack

docker compose up -d
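
Once the containers are up, a quick reachability check can confirm that the published TCP ports answer. The following is a minimal sketch, not part of the original setup: the helper names tcp_port_open and check_services are hypothetical, and the port list is simply copied from the compose file above. GoFlow2 listens on UDP 6343, which a TCP connect cannot probe, so it is left out.

```python
import socket

def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# TCP ports published by the compose file (hypothetical labels, real port numbers).
SERVICES = {
    "kafka": 9092,
    "clickhouse-http": 8123,
    "clickhouse-native": 9000,
    "grafana": 3000,
}

def check_services(host: str = "127.0.0.1") -> dict:
    """Map each service label to whether its port accepts connections."""
    return {name: tcp_port_open(host, port) for name, port in SERVICES.items()}
```

Calling check_services() on the server returns a dictionary such as {"kafka": True, ...}; any False entry is a hint to look at docker compose logs for that service.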

🗄️ Stage 2: Configuring the ClickHouse Pipeline

Open the ClickHouse client to create the database, the Kafka queue table, the materialized views, and the lookup dictionaries.

docker exec -it flow-monitor-clickhouse-1 clickhouse-client

A. Database and Access Setup

CREATE DATABASE IF NOT EXISTS netflow;
CREATE USER IF NOT EXISTS grafana IDENTIFIED WITH no_password;
GRANT SELECT ON netflow.* TO grafana;
GRANT dictGet ON netflow.* TO grafana;

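B. Router and Sampling Dictionary

-- Source table for the dictionary. It must be a regular table (not ENGINE = Dictionary)
-- so that rows can be inserted into it.
CREATE TABLE netflow.routers (
    sampler_address String,
    name String,
    sampling UInt32
) ENGINE = MergeTree()
ORDER BY sampler_address;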

CREATE DICTIONARY netflow.router_dict (
    sampler_address String,
    name String,
    sampling UInt32
) PRIMARY KEY sampler_address
SOURCE(CLICKHOUSE(DB 'netflow' TABLE 'routers'))
LIFETIME(MIN 300 MAX 360) LAYOUT(COMPLEX_KEY_HASHED());

-- Register your router's IP address here along with its sampling rate
INSERT INTO netflow.routers VALUES ('103.170.100.226', 'Router-Core-1', 4096);

C. Main Pipeline for Live Traffic

-- Main table
CREATE TABLE netflow.flows (
    Date Date,
    TimeReceived DateTime,
    sampler_address String,
    src_addr String,
    dst_addr String,
    src_port UInt32,
    dst_port UInt32,
    tcp_flags UInt32,
    bytes UInt64,
    packets UInt64,
    proto String
) ENGINE = MergeTree()
PARTITION BY Date
ORDER BY (TimeReceived, sampler_address)
TTL Date + INTERVAL 7 DAY;

-- Kafka consumer table
CREATE TABLE netflow.kafka_flows (
    sampler_address String,
    src_addr String,
    dst_addr String,
    src_port UInt32,
    dst_port UInt32,
    tcp_flags UInt32,
    proto String,
    bytes UInt64,
    packets UInt64
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:29092',
         kafka_topic_list = 'flows',
         kafka_group_name = 'clickhouse_reader',
         kafka_format = 'JSONEachRow';
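
With JSONEachRow, each Kafka message is one JSON object, and only the keys that match a column of netflow.kafka_flows end up in the table. The sketch below mimics that mapping in Python under two assumptions: that keys without a matching column are ignored, and that the sample message looks roughly like this. The sample values and the extra keys type and sequence_num are made up for illustration, not captured GoFlow2 output.

```python
import json

# Column names of netflow.kafka_flows, in table order.
KAFKA_FLOWS_COLUMNS = (
    "sampler_address", "src_addr", "dst_addr", "src_port", "dst_port",
    "tcp_flags", "proto", "bytes", "packets",
)

def to_row(message: str) -> dict:
    """Keep only the keys declared as columns; anything else is dropped.

    A missing key becomes None here (ClickHouse would use the column default).
    """
    record = json.loads(message)
    return {col: record.get(col) for col in KAFKA_FLOWS_COLUMNS}

# Illustrative message only; the values are invented.
sample = json.dumps({
    "type": "SFLOW_5", "sequence_num": 42,
    "sampler_address": "103.170.100.226",
    "src_addr": "192.0.2.10", "dst_addr": "198.51.100.20",
    "src_port": 443, "dst_port": 51234, "tcp_flags": 16,
    "proto": "TCP", "bytes": 1500, "packets": 1,
})

row = to_row(sample)
print(row["src_addr"], row["bytes"])
```

If a column stays empty in ClickHouse, comparing the key names in a real message from the flows topic against this column list is a reasonable first check.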

-- Materialized view that links Kafka to the main table and multiplies counters by the sampling rate
CREATE MATERIALIZED VIEW netflow.flows_mv TO netflow.flows AS
SELECT
    toDate(now()) AS Date,
    now() AS TimeReceived,
    sampler_address,
    src_addr,
    dst_addr,
    src_port,
    dst_port,
    tcp_flags,
    bytes * dictGetOrDefault('netflow.router_dict', 'sampling', tuple(sampler_address), toUInt32(1)) AS bytes,
    packets * dictGetOrDefault('netflow.router_dict', 'sampling', tuple(sampler_address), toUInt32(1)) AS packets,
    proto
FROM netflow.kafka_flows;
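
The multiplication in this view turns sampled counters into traffic estimates: with a sampling rate of 4096, each sampled packet stands for roughly 4096 packets on the wire. A small worked sketch of the same arithmetic follows; the scale helper is hypothetical, the rate table mirrors the single netflow.routers row inserted earlier, and unknown exporters fall back to a rate of 1 just like the dictGetOrDefault call.

```python
# Sampling rate per exporter, mirroring the row inserted into netflow.routers.
SAMPLING = {"103.170.100.226": 4096}

def scale(sampler_address: str, sampled_bytes: int, sampled_packets: int) -> tuple[int, int]:
    """Estimate real traffic by multiplying sampled counters by the exporter's rate."""
    rate = SAMPLING.get(sampler_address, 1)  # same fallback of 1 as dictGetOrDefault
    return sampled_bytes * rate, sampled_packets * rate

print(scale("103.170.100.226", 1500, 1))  # (6144000, 4096)
print(scale("192.0.2.1", 1500, 1))        # (1500, 1) for an unregistered exporter
```

An exporter that is missing from netflow.routers is therefore stored unscaled, which shows up as traffic that looks thousands of times too small.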

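D. ASN Dictionary and 1-Year Hourly Aggregate Table

-- ASN source table. A regular table (not ENGINE = Dictionary) so the nightly
-- job can TRUNCATE it and INSERT fresh rows.
CREATE TABLE netflow.asn_data (
    ip_range String,
    asn UInt32,
    asn_name String
) ENGINE = MergeTree()
ORDER BY ip_range;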

-- ASN dictionary (prefix lookup via the ip_trie layout)
CREATE DICTIONARY netflow.asn_dict (
    ip_range String,
    asn UInt32,
    asn_name String
) PRIMARY KEY ip_range
SOURCE(CLICKHOUSE(DB 'netflow' TABLE 'asn_data'))
LIFETIME(MIN 3600 MAX 7200) LAYOUT(ip_trie());
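
The ip_trie layout resolves an address to the most specific CIDR prefix that contains it. The sketch below illustrates that longest-prefix rule with Python's ipaddress module; it is only a model of the lookup, not how ClickHouse implements it. The rows use documentation prefixes and private-use AS numbers and are entirely hypothetical.

```python
import ipaddress

# Hypothetical rows in the shape of netflow.asn_data: (ip_range, asn, asn_name).
ASN_ROWS = [
    ("192.0.2.0/24", 64500, "EXAMPLE-NET"),
    ("192.0.2.128/25", 64501, "EXAMPLE-CUSTOMER"),
]

def lookup_asn_name(ip: str, default: str = "Unknown ASN") -> str:
    """Return the asn_name of the longest prefix containing ip, or the default."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, name) of the most specific match so far
    for cidr, _asn, name in ASN_ROWS:
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, name)
    return best[1] if best else default

print(lookup_asn_name("192.0.2.200"))   # EXAMPLE-CUSTOMER (the /25 wins over the /24)
print(lookup_asn_name("192.0.2.10"))    # EXAMPLE-NET
print(lookup_asn_name("198.51.100.1"))  # Unknown ASN
```

This is why ip_range must hold CIDR notation such as 192.0.2.0/24 rather than start and end addresses.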

-- Lightweight historical table
CREATE TABLE netflow.flows_hourly (
    Date Date,
    Hour DateTime,
    sampler_address String,
    src_asn_name String,
    dst_asn_name String,
    proto String,
    total_bytes UInt64,
    total_packets UInt64
) ENGINE = SummingMergeTree()
ORDER BY (Date, Hour, sampler_address, src_asn_name, dst_asn_name, proto)
TTL Date + INTERVAL 1 YEAR;

-- Materialized view that resolves IPs to ASN names before storing
CREATE MATERIALIZED VIEW netflow.flows_hourly_mv TO netflow.flows_hourly AS
SELECT
    toDate(TimeReceived) AS Date,
    toStartOfHour(TimeReceived) AS Hour,
    sampler_address,
    dictGetOrDefault('netflow.asn_dict', 'asn_name', tuple(toIPv4OrDefault(src_addr)), 'Unknown ASN') AS src_asn_name,
    dictGetOrDefault('netflow.asn_dict', 'asn_name', tuple(toIPv4OrDefault(dst_addr)), 'Unknown ASN') AS dst_asn_name,
    proto,
    sum(bytes) AS total_bytes,
    sum(packets) AS total_packets
FROM netflow.flows
GROUP BY Date, Hour, sampler_address, src_asn_name, dst_asn_name, proto;
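
Conceptually, the hourly table keeps one running total per combination of hour, exporter, source ASN, destination ASN, and protocol. The following sketch reproduces that rollup in plain Python to make the grouping key explicit; the rollup function and the three sample flows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def rollup(flows):
    """Sum bytes and packets per (date, hour, sampler, src_asn, dst_asn, proto)."""
    totals = defaultdict(lambda: [0, 0])
    for ts, sampler, src_asn, dst_asn, proto, nbytes, npackets in flows:
        hour = ts.replace(minute=0, second=0, microsecond=0)  # like toStartOfHour
        key = (hour.date(), hour, sampler, src_asn, dst_asn, proto)
        totals[key][0] += nbytes
        totals[key][1] += npackets
    return {key: tuple(value) for key, value in totals.items()}

# Invented sample flows: two in the 10:00 hour, one in the 11:00 hour.
flows = [
    (datetime(2024, 1, 1, 10, 5), "103.170.100.226", "AS-A", "AS-B", "TCP", 1000, 2),
    (datetime(2024, 1, 1, 10, 55), "103.170.100.226", "AS-A", "AS-B", "TCP", 500, 1),
    (datetime(2024, 1, 1, 11, 0), "103.170.100.226", "AS-A", "AS-B", "TCP", 700, 1),
]
hourly = rollup(flows)
print(len(hourly))  # 2
```

SummingMergeTree collapses rows with the same sorting key only when parts are merged in the background, so queries against netflow.flows_hourly should still use sum(total_bytes) with GROUP BY rather than assume one row per key.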

Type exit to leave the ClickHouse client.

exit

🔄 Stage 3: Automating ASN Updates with a Cron Job

This script downloads a global BGP database and loads it into ClickHouse every night.

Create the Bash Script

nano ~/flow-monitor/update_asn.sh

Contents of update_asn.sh

#!/bin/bash
# Stop at the first failing command so a failed CSV build never empties the table.
set -euo pipefail

# cron may not set $USER, so rely on $HOME instead
cd "$HOME/flow-monitor"
echo "ASN update started: $(date)"

# 1. Run the Python script that generates asn_data.csv
/usr/bin/python3 update_asn.py

# 2. Load into ClickHouse (no sudo needed because the user is in the docker group)
docker exec -i flow-monitor-clickhouse-1 clickhouse-client --query="TRUNCATE TABLE netflow.asn_data"
docker exec -i flow-monitor-clickhouse-1 clickhouse-client --query="INSERT INTO netflow.asn_data FORMAT CSV" < asn_data.csv

# 3. Reload the in-memory dictionary
docker exec -i flow-monitor-clickhouse-1 clickhouse-client --query="SYSTEM RELOAD DICTIONARY netflow.asn_dict"

echo "ASN update finished: $(date)"
echo "----------------------------------------"
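
The shell script calls update_asn.py, which is not listed in this post. The following is a minimal sketch of what such a generator might look like, under loudly stated assumptions: the input is a tab-separated file with the columns range_start, range_end, AS number, country code, and AS description; the file name ip2asn-v4.tsv is hypothetical; AS number 0 is assumed to mark unrouted space; and the download step is left out because the data source is not named here. Start/end ranges are converted to CIDR prefixes, since the ip_trie dictionary keys on CIDR notation.

```python
import csv
import ipaddress

def ranges_to_rows(tsv_lines):
    """Yield (ip_range, asn, asn_name) rows with CIDR prefixes for netflow.asn_data."""
    for line in tsv_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip malformed or empty lines
        start, end, asn, _country, name = fields[:5]
        if asn == "0":  # assumption: 0 marks address space that is not routed
            continue
        first = ipaddress.ip_address(start)
        last = ipaddress.ip_address(end)
        # One start/end range may need several CIDR blocks to cover it exactly.
        for net in ipaddress.summarize_address_range(first, last):
            yield (str(net), int(asn), name)

def write_csv(rows, path="asn_data.csv"):
    """Write rows in the column order of netflow.asn_data."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)

def main(src_path="ip2asn-v4.tsv", out_path="asn_data.csv"):
    """Convert the (hypothetical) TSV dump into the CSV the shell script loads."""
    with open(src_path, encoding="utf-8") as src:
        write_csv(ranges_to_rows(src), out_path)
```

To run it as python3 update_asn.py, call main() at the bottom of the file under an if __name__ == "__main__": guard, and add whatever download step matches your data source before the conversion.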

Make the Script Executable

chmod +x ~/flow-monitor/update_asn.sh

Schedule the Cron Job for 02:00 Every Night

crontab -e

Add the following line at the very bottom. Replace /home/user with the home directory of your server user.

0 2 * * * /home/user/flow-monitor/update_asn.sh >> /home/user/flow-monitor/asn_update.log 2>&1

📊 Stage 4: Grafana Visualization

Open Grafana at the address below, replacing <IP_SERVER> with your server's IP address.

http://<IP_SERVER>:3000
  • Default login: admin / admin
  • Go to Connections > Data sources > Add data source
  • Search for ClickHouse
  • If the plugin is missing, install the ClickHouse plugin for Grafana
  • Click Save & Test once the configuration is complete

Install the ClickHouse Plugin If It Is Missing

docker exec -it flow-monitor-grafana-1 grafana-cli plugins install grafana-clickhouse-datasource
docker restart flow-monitor-grafana-1

ClickHouse Data Source Configuration

  • Server address: clickhouse
  • Server port: 9000
  • Username: grafana
  • Default database: netflow
Note: Once the data source works, import the final dashboard JSON, which includes a router dropdown variable and splits inbound and outbound traffic based on your internal IP prefixes.
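
The dashboard JSON itself is not included here, but the inbound/outbound split it relies on can be sketched: a flow is inbound when only its destination falls inside your internal prefixes, and outbound when only its source does. The example below is one plausible version of that rule, not the dashboard's actual query; the prefix 198.51.100.0/24 and the labels "internal" and "transit" are assumptions for illustration.

```python
import ipaddress

# Hypothetical internal prefix; replace with your own address space.
INTERNAL_PREFIXES = [ipaddress.ip_network("198.51.100.0/24")]

def is_internal(ip: str) -> bool:
    """True if ip falls inside any of the internal prefixes."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_PREFIXES)

def direction(src_addr: str, dst_addr: str) -> str:
    """Classify a flow by which side of it belongs to the internal prefixes."""
    src_in, dst_in = is_internal(src_addr), is_internal(dst_addr)
    if src_in and not dst_in:
        return "outbound"
    if dst_in and not src_in:
        return "inbound"
    return "internal" if src_in else "transit"

print(direction("198.51.100.10", "192.0.2.1"))  # outbound
print(direction("192.0.2.1", "198.51.100.10"))  # inbound
```

In a Grafana panel the same test would be expressed in SQL against src_addr and dst_addr, with the prefix list supplied by you.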

✅ Wrap-Up

With this setup, the server receives sFlow traffic through GoFlow2, ships it to Kafka, processes it in ClickHouse, and displays it in Grafana. Raw data is kept for 7 days, while hourly aggregates are kept for 1 year.
