Enterprise NOC Network Traffic Analytics Pipeline with GoFlow2, Kafka, ClickHouse, and Grafana
This guide walks through a Network Traffic Analytics pipeline from start to finish, using a modern, lightweight, and automated architecture.
📋 System Requirements
- OS: Ubuntu Server 22.04 / 24.04, fresh install
- Minimum hardware: 4 CPU cores, 8GB RAM. 16GB is recommended for traffic above 10Gbps
- Storage: SSD/NVMe
- Software: Docker, Docker Compose, Git, Python 3
- Access: the user has been added to the docker group
Install Docker and Docker Compose
sudo apt update
sudo apt install docker.io docker-compose-v2 -y
sudo systemctl enable --now docker
Add the User to the Docker Group
sudo usermod -aG docker $USER
newgrp docker
LXC Configuration on Proxmox, If Running in a Container
On the Proxmox host, open the container's configuration file, replacing CTID with your container ID.
nano /etc/pve/lxc/CTID.conf
Make sure the file contains the two lines below. They make the container privileged and remove AppArmor confinement so Docker can run inside it.
unprivileged: 0
lxc.apparmor.profile: unconfined
🚀 Stage 1: Installing the Core Services with Docker Compose
This stack uses Kafka in KRaft mode (no ZooKeeper) to save RAM, GoFlow2 to capture sFlow, ClickHouse for forensic analytics, and Grafana for visualization.
Create the Project Directory
mkdir -p ~/flow-monitor/data/kafka
cd ~/flow-monitor
Create the docker-compose.yml File
nano docker-compose.yml
Contents of docker-compose.yml
version: '3.8'
services:
  goflow2:
    image: netsampler/goflow2:latest
    restart: unless-stopped
    command: -format=json -transport=kafka -transport.kafka.brokers=kafka:29092 -transport.kafka.topic=flows
    ports:
      - "6343:6343/udp"
    depends_on:
      - kafka
  kafka:
    image: confluentinc/cp-kafka:7.6.0
    restart: unless-stopped
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: 'broker,controller'
      KAFKA_CONTROLLER_LISTENER_NAMES: 'CONTROLLER'
      KAFKA_CONTROLLER_QUORUM_VOTERS: '1@kafka:29093'
      CLUSTER_ID: 'ciWo7IWazngRchmPES6q5A'
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:29092,EXTERNAL://127.0.0.1:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_HEAP_OPTS: "-Xmx1G -Xms512M"
      KAFKA_LOG_RETENTION_HOURS: 2
      KAFKA_LOG_RETENTION_BYTES: 1073741824
      TZ: Asia/Jakarta
    volumes:
      - ./data/kafka:/var/lib/kafka/data
    ports:
      - "127.0.0.1:9092:9092"
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    restart: unless-stopped
    ports:
      - "8123:8123"
      - "9000:9000"
    volumes:
      - ./data/clickhouse:/var/lib/clickhouse
    depends_on:
      - kafka
  grafana:
    image: grafana/grafana:latest
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - ./data/grafana:/var/lib/grafana
    depends_on:
      - clickhouse
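The two Kafka retention settings in this file bound how much flow data the broker buffers: log segments are deleted after 2 hours, or once a partition's log grows past 1073741824 bytes (1 GiB), whichever limit is reached first. The sketch below estimates which limit applies at a given load. The message rate and average message size are made-up illustration values, not measurements, and the result is approximate because Kafka deletes whole segments rather than individual messages.

```python
RETENTION_HOURS = 2                # KAFKA_LOG_RETENTION_HOURS
RETENTION_BYTES = 1_073_741_824    # KAFKA_LOG_RETENTION_BYTES (1 GiB, per partition)


def buffered_seconds(msgs_per_sec: float, avg_msg_bytes: float) -> float:
    """Approximate seconds of flow data one partition holds before a limit applies."""
    time_limit = RETENTION_HOURS * 3600
    if msgs_per_sec <= 0 or avg_msg_bytes <= 0:
        # No traffic: only the time limit matters
        return float(time_limit)
    size_limit = RETENTION_BYTES / (msgs_per_sec * avg_msg_bytes)
    return min(float(time_limit), size_limit)


# Hypothetical load: 2,000 flow records per second at about 300 bytes each.
print(buffered_seconds(2000, 300))
```

If the size limit turns out to be the smaller of the two, ClickHouse has less than the full 2 hours to catch up after an outage before unread flows are dropped.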
Start the Whole Stack
docker compose up -d
🗄️ Stage 2: Configuring the ClickHouse Pipeline
Open the ClickHouse client to create the database, the queue table, the materialized views, and the lookup dictionaries.
docker exec -it flow-monitor-clickhouse-1 clickhouse-client
A. Database Setup and Security
CREATE DATABASE IF NOT EXISTS netflow;
CREATE USER IF NOT EXISTS grafana IDENTIFIED WITH no_password;
GRANT SELECT ON netflow.* TO grafana;
GRANT dictGet ON netflow.* TO grafana;
B. Router and Sampling Dictionary
-- Source table for the dictionary: a regular MergeTree table, so rows can be inserted and updated
CREATE TABLE netflow.routers (
sampler_address String,
name String,
sampling UInt32
) ENGINE = MergeTree()
ORDER BY sampler_address;
CREATE DICTIONARY netflow.router_dict (
sampler_address String,
name String,
sampling UInt32
) PRIMARY KEY sampler_address
SOURCE(CLICKHOUSE(DB 'netflow' TABLE 'routers'))
LIFETIME(MIN 300 MAX 360) LAYOUT(COMPLEX_KEY_HASHED());
-- Register your router IPs here together with their sampling rates
INSERT INTO netflow.routers VALUES ('103.170.100.226', 'Router-Core-1', 4096);
C. Main Real-Traffic Pipeline
-- Main table
CREATE TABLE netflow.flows (
Date Date,
TimeReceived DateTime,
sampler_address String,
src_addr String,
dst_addr String,
src_port UInt32,
dst_port UInt32,
tcp_flags UInt32,
bytes UInt64,
packets UInt64,
proto String
) ENGINE = MergeTree()
PARTITION BY Date
ORDER BY (TimeReceived, sampler_address)
TTL Date + INTERVAL 7 DAY;
-- Kafka ingest table
CREATE TABLE netflow.kafka_flows (
sampler_address String,
src_addr String,
dst_addr String,
src_port UInt32,
dst_port UInt32,
tcp_flags UInt32,
proto String,
bytes UInt64,
packets UInt64
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:29092',
kafka_topic_list = 'flows',
kafka_group_name = 'clickhouse_reader',
kafka_format = 'JSONEachRow';
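With kafka_format = 'JSONEachRow', each Kafka message is one JSON object whose keys are matched to the column names of netflow.kafka_flows. The sketch below imitates that mapping in Python so you can sanity-check a message by hand. It assumes the message carries the field names declared in the table; the sample message is hypothetical, not captured GoFlow2 output, and dropping unknown keys and zero-filling missing ones is a simplification of ClickHouse's configurable behavior.

```python
import json

# Column names and types copied from netflow.kafka_flows.
KAFKA_FLOWS_COLUMNS = {
    "sampler_address": str, "src_addr": str, "dst_addr": str,
    "src_port": int, "dst_port": int, "tcp_flags": int,
    "proto": str, "bytes": int, "packets": int,
}


def to_row(message: str) -> dict:
    """Map one JSON message onto the kafka_flows columns, dropping other keys."""
    record = json.loads(message)
    row = {}
    for column, cast in KAFKA_FLOWS_COLUMNS.items():
        # A missing key falls back to the type's zero value ("" or 0)
        row[column] = cast(record[column]) if column in record else cast()
    return row


# Hypothetical message; the values are illustrative only.
sample = (
    '{"sampler_address":"192.0.2.1","src_addr":"198.51.100.7",'
    '"dst_addr":"203.0.113.9","src_port":51514,"dst_port":443,'
    '"tcp_flags":16,"proto":"TCP","bytes":1500,"packets":1,'
    '"extra_field":"ignored"}'
)
print(to_row(sample))
```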
-- Connecting view that multiplies by the sampling rate automatically
CREATE MATERIALIZED VIEW netflow.flows_mv TO netflow.flows AS
SELECT
toDate(now()) AS Date,
now() AS TimeReceived,
sampler_address,
src_addr,
dst_addr,
src_port,
dst_port,
tcp_flags,
bytes * dictGetOrDefault('netflow.router_dict', 'sampling', tuple(sampler_address), toUInt32(1)) AS bytes,
packets * dictGetOrDefault('netflow.router_dict', 'sampling', tuple(sampler_address), toUInt32(1)) AS packets,
proto
FROM netflow.kafka_flows;
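The materialized view scales every sampled counter by the router's sampling rate, falling back to 1 when the sampler address is not registered in netflow.routers. A minimal Python sketch of the same arithmetic, using the router registered above; the function name and the dict standing in for the dictionary are illustrative only:

```python
# Stands in for netflow.router_dict: sampler address -> sampling rate.
ROUTER_SAMPLING = {"103.170.100.226": 4096}


def scale(sampler_address: str, sampled_bytes: int, sampled_packets: int) -> tuple[int, int]:
    """Estimate real traffic from sampled counters, like dictGetOrDefault(..., 1)."""
    rate = ROUTER_SAMPLING.get(sampler_address, 1)
    return sampled_bytes * rate, sampled_packets * rate


# One sampled 1500-byte packet from the registered router
print(scale("103.170.100.226", 1500, 1))
# An unregistered router is stored unscaled
print(scale("10.0.0.1", 1500, 1))
```

Because the default is 1, traffic from a router that has not been added to netflow.routers is stored at its sampled size and will look far smaller than it really is.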
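D. ASN Dictionary and 1-Year Aggregated History Table
-- ASN source table: a regular MergeTree table, so the nightly job can truncate and reload it
CREATE TABLE netflow.asn_data (
ip_range String,
asn UInt32,
asn_name String
) ENGINE = MergeTree()
ORDER BY ip_range;
-- ASN dictionary with prefix lookup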
CREATE DICTIONARY netflow.asn_dict (
ip_range String,
asn UInt32,
asn_name String
) PRIMARY KEY ip_range
SOURCE(CLICKHOUSE(DB 'netflow' TABLE 'asn_data'))
LIFETIME(MIN 3600 MAX 7200) LAYOUT(ip_trie());
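The ip_trie layout stores CIDR prefixes and, when prefixes overlap, resolves an address to the most specific one. The sketch below reproduces that longest-prefix match in plain Python to show what the dictionary lookup returns. The rows use documentation prefixes and documentation ASNs as hypothetical data, and the linear scan is for illustration only, not how ClickHouse implements the lookup.

```python
import ipaddress

# Hypothetical rows shaped like netflow.asn_data (ip_range, asn, asn_name).
ASN_ROWS = [
    ("198.51.100.0/24", 64500, "Example-Net-A"),
    ("198.51.100.128/25", 64501, "Example-Net-B"),
]


def lookup_asn_name(ip: str, default: str = "Unknown ASN") -> str:
    """Return the asn_name of the longest prefix containing ip, else the default."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return default
    best = None  # (prefix length, name) of the most specific match so far
    for prefix, _asn, name in ASN_ROWS:
        net = ipaddress.ip_network(prefix)
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0]:
                best = (net.prefixlen, name)
    return best[1] if best else default


print(lookup_asn_name("198.51.100.200"))  # inside both prefixes, /25 is more specific
print(lookup_asn_name("203.0.113.1"))     # no matching prefix
```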
-- Lightweight historical table
CREATE TABLE netflow.flows_hourly (
Date Date,
Hour DateTime,
sampler_address String,
src_asn_name String,
dst_asn_name String,
proto String,
total_bytes UInt64,
total_packets UInt64
) ENGINE = SummingMergeTree()
ORDER BY (Date, Hour, sampler_address, src_asn_name, dst_asn_name, proto)
TTL Date + INTERVAL 1 YEAR;
-- Materialized view that translates IPs to ASN names before storing
CREATE MATERIALIZED VIEW netflow.flows_hourly_mv TO netflow.flows_hourly AS
SELECT
toDate(TimeReceived) AS Date,
toStartOfHour(TimeReceived) AS Hour,
sampler_address,
dictGetOrDefault('netflow.asn_dict', 'asn_name', tuple(toIPv4OrDefault(src_addr)), 'Unknown ASN') AS src_asn_name,
dictGetOrDefault('netflow.asn_dict', 'asn_name', tuple(toIPv4OrDefault(dst_addr)), 'Unknown ASN') AS dst_asn_name,
proto,
sum(bytes) AS total_bytes,
sum(packets) AS total_packets
FROM netflow.flows
GROUP BY Date, Hour, sampler_address, src_asn_name, dst_asn_name, proto;
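SummingMergeTree adds up rows that share the same sorting key only during background merges, so netflow.flows_hourly can temporarily hold several rows for one hour and ASN pair. Queries should therefore sum total_bytes themselves. The sketch below shows that re-aggregation and the conversion from bytes per hour to an average bit rate (bytes × 8 / 3600); the row shape and sample values are assumptions for illustration.

```python
from collections import defaultdict


def hourly_average_bps(rows):
    """rows: iterable of (hour, src_asn_name, total_bytes) as read from flows_hourly.

    Returns {(hour, src_asn_name): average bits per second over that hour}.
    """
    totals = defaultdict(int)
    for hour, src_asn_name, total_bytes in rows:
        # Unmerged parts may contribute several rows for the same key
        totals[(hour, src_asn_name)] += total_bytes
    return {key: total * 8 / 3600 for key, total in totals.items()}


# Hypothetical rows: two not-yet-merged rows for the same hour and ASN.
rows = [
    ("2024-01-01 10:00:00", "Example-Net-A", 450_000_000),
    ("2024-01-01 10:00:00", "Example-Net-A", 450_000_000),
    ("2024-01-01 10:00:00", "Example-Net-B", 3600),
]
print(hourly_average_bps(rows))
```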
Type exit to leave ClickHouse.
exit
🔄 Stage 3: Automating ASN Updates with a Cron Job
This script downloads the global BGP database and loads it into ClickHouse every night. It expects update_asn.py in ~/flow-monitor to write asn_data.csv with columns in the same order as netflow.asn_data.
Create the Bash Script
nano ~/flow-monitor/update_asn.sh
Contents of update_asn.sh
#!/bin/bash
# Stop at the first failed step so a failed download never empties the table
set -euo pipefail
# cron sets HOME but not necessarily USER, so build the path from HOME
cd "$HOME/flow-monitor"
echo "Starting ASN update: $(date)"
# 1. Run the Python script that generates asn_data.csv
/usr/bin/python3 update_asn.py
# 2. Load the CSV into ClickHouse without sudo
docker exec -i flow-monitor-clickhouse-1 clickhouse-client --query="TRUNCATE TABLE netflow.asn_data"
docker exec -i flow-monitor-clickhouse-1 clickhouse-client --query="INSERT INTO netflow.asn_data FORMAT CSV" < asn_data.csv
# 3. Reload the in-memory dictionary
docker exec -i flow-monitor-clickhouse-1 clickhouse-client --query="SYSTEM RELOAD DICTIONARY netflow.asn_dict"
echo "Finished ASN update: $(date)"
echo "----------------------------------------"
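The guide does not include update_asn.py itself. As a rough sketch of its final step, the function below turns address ranges into the ip_range, asn, asn_name CSV rows that netflow.asn_data expects. It assumes the downloaded data has already been parsed into (first_ip, last_ip, asn, asn_name) tuples; that input shape, the function name, and the sample values are hypothetical, and the download and parsing steps are deliberately left out because the data source is not specified here.

```python
import csv
import io
import ipaddress


def ranges_to_csv(rows) -> str:
    """rows: iterable of (first_ip, last_ip, asn, asn_name).

    Returns CSV text with one line per CIDR block: ip_range,asn,asn_name.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for first_ip, last_ip, asn, asn_name in rows:
        first = ipaddress.ip_address(first_ip)
        last = ipaddress.ip_address(last_ip)
        # A range may need several CIDR blocks to be covered exactly
        for net in ipaddress.summarize_address_range(first, last):
            writer.writerow([str(net), int(asn), asn_name])
    return out.getvalue()


# Hypothetical input using a documentation prefix and a documentation ASN.
print(ranges_to_csv([("198.51.100.0", "198.51.100.255", 64500, "Example-Net-A")]), end="")
```

In a real update_asn.py, the returned text would be written to asn_data.csv in ~/flow-monitor so that the shell script can load it.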
Make the Script Executable
chmod +x ~/flow-monitor/update_asn.sh
Schedule the Cron Job for 02:00 Every Night
crontab -e
Add the following line at the very bottom. Replace /home/user with your server user's home directory.
0 2 * * * /home/user/flow-monitor/update_asn.sh >> /home/user/flow-monitor/asn_update.log 2>&1
📊 Stage 4: Grafana Visualization
Open Grafana at the address below. Replace <IP_SERVER> with your server's IP address.
http://<IP_SERVER>:3000
- Default login: admin / admin
- Go to Connections > Data sources > Add data source
- Search for ClickHouse
- If the plugin is not listed, install the ClickHouse plugin for Grafana
- Click Save & Test once the configuration is complete
Install the ClickHouse Plugin If It Is Missing
docker exec -it flow-monitor-grafana-1 grafana-cli plugins install grafana-clickhouse-datasource
docker restart flow-monitor-grafana-1
Configure the ClickHouse Data Source
- Server address: clickhouse
- Server port: 9000
- Username: grafana
- Default database: netflow
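If Save & Test fails, it can help to confirm separately that the grafana user can read the data. The sketch below does this over ClickHouse's HTTP interface on port 8123, which the compose file publishes to the host (Grafana itself uses the native port 9000 inside the Docker network). The function names are illustrative, and count_flows only works when run on a machine that can reach the server.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_check_url(host: str, port: int = 8123) -> str:
    """Build a ClickHouse HTTP URL that counts rows in netflow.flows as user grafana."""
    params = {"user": "grafana", "query": "SELECT count() FROM netflow.flows"}
    # quote_via=quote encodes spaces as %20 instead of '+'
    return f"http://{host}:{port}/?{urlencode(params, quote_via=quote)}"


def count_flows(host: str) -> int:
    """Run the count query; raises an error if the server is unreachable or access is denied."""
    with urlopen(build_check_url(host), timeout=5) as resp:
        return int(resp.read().decode().strip())


# Show the URL only; call count_flows("<IP_SERVER>") on a host that can reach ClickHouse.
print(build_check_url("127.0.0.1"))
```

A count that keeps growing between two calls indicates that flows are moving from Kafka into netflow.flows.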
✅ Wrap-Up
With this configuration, the server receives sFlow traffic through GoFlow2, sends the data to Kafka, processes it in ClickHouse, and displays it in Grafana. Raw data is kept for 7 days automatically, while hourly aggregates are kept for 1 year.
