第193集网络监控与性能分析实战 | 字数总计: 9.1k | 阅读时长: 48分钟 | 阅读量:
1. 网络监控概述 网络监控是现代IT基础设施运维的核心组成部分,通过实时监控网络流量、连接状态、性能指标和故障情况,可以确保网络服务的稳定性和高效性。本文将详细介绍网络监控的核心指标、监控工具、性能分析方法以及最佳实践。
1.1 网络监控的重要性
服务保障 : 确保网络服务的可用性和稳定性
性能优化 : 识别网络瓶颈,优化网络配置
安全防护 : 检测异常流量和潜在安全威胁
容量规划 : 为网络扩容提供数据支持
故障诊断 : 快速定位和解决网络问题
1.2 核心监控指标
带宽使用率 : 网络接口的流量使用情况
连接状态 : TCP/UDP连接的数量和状态
延迟时间 : 网络延迟和响应时间
丢包率 : 网络数据包丢失情况
错误率 : 网络错误和重传情况
吞吐量 : 网络数据传输速率
1.3 监控层次
物理层 : 网络接口状态、链路状态
数据链路层 : MAC地址、VLAN信息
网络层 : IP地址、路由信息
传输层 : TCP/UDP连接状态
应用层 : 应用协议性能
2. 网络监控工具详解 2.1 netstat命令详解 netstat是基础的网络连接查看工具,提供网络连接和统计信息。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 netstat -an netstat -ant netstat -anu netstat -tlnp netstat -rn netstat -i netstat -s netstat -an | awk '/^tcp/ {print $6}' | sort | uniq -c netstat -an | grep :80 netstat -an | grep 192.168.1.100
2.2 ss命令详解 ss是netstat的现代替代品,提供更快的网络连接信息。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 ss -a ss -t ss -u ss -l ss -p ss -s ss -tlnp | grep :80 ss -t state established ss -i ss -tulpn
2.3 iftop流量监控 iftop提供实时网络流量监控。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 yum install iftop apt install iftop iftop iftop -i eth0 iftop -P iftop -n iftop -p iftop -f "host 192.168.1.100" iftop -t -s 10
2.4 nload流量监控 nload提供网络接口流量监控。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 yum install nload apt install nload nload nload eth0 nload -t 2000 nload -a nload -t 1000 -i 10000
2.5 网络监控脚本 创建自定义的网络监控脚本:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 #!/bin/bash LOG_FILE="/var/log/network_monitor.log" ALERT_BANDWIDTH_THRESHOLD=80 ALERT_CONNECTION_THRESHOLD=1000 CHECK_INTERVAL=60 log_message () { echo "$(date '+%Y-%m-%d %H:%M:%S') - $1 " >> $LOG_FILE } check_interface_status () { for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then status=$(cat /sys/class/net/$interface /operstate) if [ "$status " != "up" ]; then log_message "WARNING: Interface $interface is $status " fi fi done } check_bandwidth_usage () { for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_bytes=$(cat /sys/class/net/$interface /statistics/rx_bytes) tx_bytes=$(cat /sys/class/net/$interface /statistics/tx_bytes) total_bytes=$((rx_bytes + tx_bytes)) if [ $total_bytes -gt 0 ]; then log_message "Interface $interface : RX=${rx_bytes} bytes, TX=${tx_bytes} bytes" fi fi done } check_connection_count () { tcp_connections=$(ss -t | wc -l) udp_connections=$(ss -u | wc -l) total_connections=$((tcp_connections + udp_connections)) if [ $total_connections -gt $ALERT_CONNECTION_THRESHOLD ]; then log_message "WARNING: High connection count: $total_connections " fi log_message "Connection count: TCP=$tcp_connections , UDP=$udp_connections , Total=$total_connections " } check_network_errors () { for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_errors=$(cat /sys/class/net/$interface /statistics/rx_errors) tx_errors=$(cat /sys/class/net/$interface /statistics/tx_errors) rx_dropped=$(cat /sys/class/net/$interface /statistics/rx_dropped) tx_dropped=$(cat /sys/class/net/$interface /statistics/tx_dropped) if [ $rx_errors -gt 0 ] || [ $tx_errors -gt 0 ] || [ $rx_dropped -gt 0 ] || [ $tx_dropped -gt 0 ]; then log_message "WARNING: Interface $interface errors - RX_ERR:$rx_errors , TX_ERR:$tx_errors , RX_DROP:$rx_dropped , TX_DROP:$tx_dropped " fi fi done } check_network_latency () { gateway=$(ip route | grep default | awk '{print $3}' | head -1) if [ -n "$gateway " ]; then latency=$(ping -c 3 -W 1 $gateway 2>/dev/null | grep "avg" | awk -F'/' '{print $5}' ) if [ -n "$latency " ]; then if (( $(echo "$latency > 100 " | bc -l) )); then log_message "WARNING: High latency to gateway: ${latency} ms" else log_message "Latency to gateway: ${latency} ms" fi fi fi } main () { log_message "Network monitor started" while true ; do check_interface_status check_bandwidth_usage check_connection_count check_network_errors check_network_latency sleep $CHECK_INTERVAL done } main
3. 网络性能分析 3.1 带宽分析 3.1.1 带宽使用率分析 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 INTERFACE="eth0" INTERVAL=1 while true ; do current_time=$(date +%s) rx_bytes=$(cat /sys/class/net/$INTERFACE /statistics/rx_bytes) tx_bytes=$(cat /sys/class/net/$INTERFACE /statistics/tx_bytes) if [ -n "$prev_time " ]; then time_diff=$((current_time - prev_time)) rx_diff=$((rx_bytes - prev_rx_bytes)) tx_diff=$((tx_bytes - prev_tx_bytes)) rx_rate=$((rx_diff / time_diff)) tx_rate=$((tx_diff / time_diff)) rx_mbps=$(echo "scale=2; $rx_rate * 8 / 1024 / 1024" | bc) tx_mbps=$(echo "scale=2; $tx_rate * 8 / 1024 / 1024" | bc) echo "$(date '+%H:%M:%S') RX: ${rx_mbps} Mbps TX: ${tx_mbps} Mbps" fi prev_time=$current_time prev_rx_bytes=$rx_bytes prev_tx_bytes=$tx_bytes sleep $INTERVAL done
3.1.2 带宽趋势分析 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 #!/bin/bash INTERFACE="eth0" LOG_FILE="/var/log/bandwidth_trend.log" INTERVAL=60 log_bandwidth () { local timestamp=$(date '+%Y-%m-%d %H:%M:%S' ) local rx_bytes=$(cat /sys/class/net/$INTERFACE /statistics/rx_bytes) local tx_bytes=$(cat /sys/class/net/$INTERFACE /statistics/tx_bytes) echo "$timestamp ,$rx_bytes ,$tx_bytes " >> $LOG_FILE } generate_trend_report () { echo "=== Bandwidth Trend Report ===" echo "Time,RX_Bytes,TX_Bytes,RX_MBps,TX_MBps" tail -n 100 $LOG_FILE | while IFS=',' read timestamp rx_bytes tx_bytes; do if [ -n "$prev_rx_bytes " ]; then rx_diff=$((rx_bytes - prev_rx_bytes)) tx_diff=$((tx_bytes - prev_tx_bytes)) rx_mbps=$(echo "scale=2; $rx_diff / 1024 / 1024" | bc) tx_mbps=$(echo "scale=2; $tx_diff / 1024 / 1024" | bc) echo "$timestamp ,$rx_bytes ,$tx_bytes ,$rx_mbps ,$tx_mbps " fi prev_rx_bytes=$rx_bytes prev_tx_bytes=$tx_bytes done } while true ; do log_bandwidth sleep $INTERVAL done
3.2 连接分析 3.2.1 连接状态分析 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 #!/bin/bash echo "=== Connection Analysis ===" echo "Connection Status Distribution:" ss -s echo -e "\nTCP Connection States:" ss -t -a | awk 'NR>1 {print $1}' | sort | uniq -c | sort -nr echo -e "\nTop 10 Local Ports:" ss -tln | awk 'NR>1 {print $4}' | cut -d: -f2 | sort | uniq -c | sort -nr | head -10 echo -e "\nTop 10 Remote IPs:" ss -tn | awk 'NR>1 {print $4}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -10 echo -e "\nEstablished Connections by Port:" ss -t state established | awk 'NR>1 {print $4}' | cut -d: -f2 | sort | uniq -c | sort -nr echo -e "\nTime Wait Connections:" ss -t state time-wait | wc -l echo -e "\nClose Wait Connections:" ss -t state close-wait | wc -l
3.2.2 连接性能分析 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 #!/bin/bash echo "=== Connection Performance Analysis ===" echo "Connection Establishment Time:" time nc -z google.com 80 echo -e "\nConnection Keep-Alive Analysis:" ss -t -o | grep keepalive | wc -l echo -e "\nRetransmission Analysis:" cat /proc/net/snmp | grep Tcp | awk '{print "RetransSegs:", $13}' echo -e "\nWindow Size Analysis:" ss -t -i | grep -o "wscale:[0-9]*" | sort | uniq -c echo -e "\nRTT Analysis:" ss -t -i | grep -o "rtt:[0-9.]*" | sort | uniq -c
3.3 延迟分析 3.3.1 网络延迟监控 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 #!/bin/bash TARGETS=("8.8.8.8" "1.1.1.1" "google.com" "baidu.com" ) INTERVAL=10 while true ; do echo "=== Latency Check $(date) ===" for target in "${TARGETS[@]} " ; do echo -n "Pinging $target : " ping_result=$(ping -c 3 -W 1 $target 2>/dev/null) if [ $? -eq 0 ]; then avg_latency=$(echo "$ping_result " | grep "avg" | awk -F'/' '{print $5}' ) min_latency=$(echo "$ping_result " | grep "avg" | awk -F'/' '{print $4}' ) max_latency=$(echo "$ping_result " | grep "avg" | awk -F'/' '{print $6}' ) echo "Min: ${min_latency} ms, Avg: ${avg_latency} ms, Max: ${max_latency} ms" if (( $(echo "$avg_latency > 100 " | bc -l) )); then echo "WARNING: High latency detected for $target " fi else echo "FAILED" fi done echo "" sleep $INTERVAL done
3.3.2 延迟趋势分析 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 #!/bin/bash TARGET="8.8.8.8" LOG_FILE="/var/log/latency_trend.log" INTERVAL=60 while true ; do timestamp=$(date '+%Y-%m-%d %H:%M:%S' ) ping_result=$(ping -c 5 -W 1 $TARGET 2>/dev/null) if [ $? -eq 0 ]; then avg_latency=$(echo "$ping_result " | grep "avg" | awk -F'/' '{print $5}' ) packet_loss=$(echo "$ping_result " | grep "packet loss" | awk '{print $6}' | tr -d '%' ) echo "$timestamp ,$avg_latency ,$packet_loss " >> $LOG_FILE echo "$timestamp : Latency=${avg_latency} ms, Loss=${packet_loss} %" else echo "$timestamp : Ping failed" >> $LOG_FILE echo "$timestamp : Ping failed" fi sleep $INTERVAL done
4. 网络问题诊断 4.1 连接问题诊断 4.1.1 连接超时诊断 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 #!/bin/bash TARGET_HOST="$1 " TARGET_PORT="$2 " if [ -z "$TARGET_HOST " ] || [ -z "$TARGET_PORT " ]; then echo "Usage: $0 <host> <port>" exit 1 fi echo "=== Connection Timeout Diagnosis ===" echo "Target: $TARGET_HOST :$TARGET_PORT " echo -e "\n1. DNS Resolution:" nslookup $TARGET_HOST echo -e "\n2. Route Trace:" traceroute $TARGET_HOST echo -e "\n3. Port Connectivity:" timeout 10 nc -zv $TARGET_HOST $TARGET_PORT echo -e "\n4. Firewall Check:" iptables -L -n | grep $TARGET_PORT echo -e "\n5. Local Port Check:" ss -tlnp | grep :$TARGET_PORT echo -e "\n6. System Resources:" echo "Memory usage:" free -h echo "CPU usage:" top -bn1 | grep "Cpu(s)" echo "File descriptors:" lsof | wc -l
4.1.2 连接重置诊断 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 #!/bin/bash echo "=== Connection Reset Diagnosis ===" echo "Connection Reset Statistics:" cat /proc/net/snmp | grep Tcp | awk '{print "RstSegs:", $14}' echo -e "\nNetwork Interface Errors:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_errors=$(cat /sys/class/net/$interface /statistics/rx_errors) tx_errors=$(cat /sys/class/net/$interface /statistics/tx_errors) echo "$interface : RX_ERR=$rx_errors , TX_ERR=$tx_errors " fi done echo -e "\nTCP Parameters:" echo "tcp_keepalive_time: $(cat /proc/sys/net/ipv4/tcp_keepalive_time) " echo "tcp_keepalive_intvl: $(cat /proc/sys/net/ipv4/tcp_keepalive_intvl) " echo "tcp_keepalive_probes: $(cat /proc/sys/net/ipv4/tcp_keepalive_probes) " echo -e "\nConnection States:" ss -s echo -e "\nTIME_WAIT Connections:" ss -t state time-wait | wc -l
4.2 性能问题诊断 4.2.1 带宽瓶颈诊断 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 #!/bin/bash echo "=== Bandwidth Bottleneck Diagnosis ===" echo "Network Interface Status:" ip link show echo -e "\nInterface Statistics:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then echo "Interface: $interface " cat /sys/class/net/$interface /statistics/rx_bytes cat /sys/class/net/$interface /statistics/tx_bytes cat /sys/class/net/$interface /statistics/rx_packets cat /sys/class/net/$interface /statistics/tx_packets echo "" fi done echo "Network Queue Status:" tc qdisc show echo -e "\nNetwork Buffer Status:" cat /proc/sys/net/core/rmem_maxcat /proc/sys/net/core/wmem_maxcat /proc/sys/net/core/rmem_defaultcat /proc/sys/net/core/wmem_defaultecho -e "\nTCP Window Size:" cat /proc/sys/net/ipv4/tcp_window_scalingcat /proc/sys/net/ipv4/tcp_rmemcat /proc/sys/net/ipv4/tcp_wmem
4.2.2 延迟问题诊断 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 #!/bin/bash TARGET="$1 " if [ -z "$TARGET " ]; then TARGET="8.8.8.8" fi echo "=== Latency Diagnosis for $TARGET ===" echo "1. Basic Ping Test:" ping -c 10 $TARGET echo -e "\n2. Detailed Ping Test:" ping -c 20 -i 0.2 $TARGET echo -e "\n3. Route Trace:" traceroute $TARGET echo -e "\n4. MTU Check:" ping -M do -s 1472 $TARGET ping -M do -s 1500 $TARGET echo -e "\n5. Interface MTU:" ip link show | grep mtu echo -e "\n6. TCP Parameters:" echo "tcp_congestion_control: $(cat /proc/sys/net/ipv4/tcp_congestion_control) " echo "tcp_no_delay_ack: $(cat /proc/sys/net/ipv4/tcp_no_delay_ack) " echo "tcp_low_latency: $(cat /proc/sys/net/ipv4/tcp_low_latency) "
4.3 安全威胁诊断 4.3.1 异常流量检测 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 #!/bin/bash echo "=== Anomaly Traffic Detection ===" echo "1. Connection Count Analysis:" total_connections=$(ss -t | wc -l) echo "Total TCP connections: $total_connections " if [ $total_connections -gt 1000 ]; then echo "WARNING: High connection count detected" fi echo -e "\n2. Port Analysis:" ss -tln | awk 'NR>1 {print $4}' | cut -d: -f2 | sort | uniq -c | sort -nr | head -10 echo -e "\n3. IP Analysis:" ss -tn | awk 'NR>1 {print $4}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -10 echo -e "\n4. SYN Flood Check:" syn_count=$(ss -t state syn-sent | wc -l) echo "SYN connections: $syn_count " if [ $syn_count -gt 100 ]; then echo "WARNING: Potential SYN flood detected" fi echo -e "\n5. Connection State Analysis:" ss -t -a | awk 'NR>1 {print $1}' | sort | uniq -c | sort -nr
4.3.2 网络攻击检测 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 #!/bin/bash echo "=== Network Attack Detection ===" echo "1. Port Scan Detection:" ss -tln | awk 'NR>1 {print $4}' | cut -d: -f2 | sort | uniq -c | sort -nr | head -20 echo -e "\n2. Connection Pattern Analysis:" ss -tn | awk 'NR>1 {print $4}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -20 echo -e "\n3. DDoS Detection:" ss -tn | awk 'NR>1 {print $4}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -5 echo -e "\n4. Network Error Analysis:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_errors=$(cat /sys/class/net/$interface /statistics/rx_errors) tx_errors=$(cat /sys/class/net/$interface /statistics/tx_errors) rx_dropped=$(cat /sys/class/net/$interface /statistics/rx_dropped) tx_dropped=$(cat /sys/class/net/$interface /statistics/tx_dropped) if [ $rx_errors -gt 0 ] || [ $tx_errors -gt 0 ] || [ $rx_dropped -gt 0 ] || [ $tx_dropped -gt 0 ]; then echo "$interface : ERR_RX=$rx_errors , ERR_TX=$tx_errors , DROP_RX=$rx_dropped , DROP_TX=$tx_dropped " fi fi done
5. 网络优化策略 5.1 带宽优化 5.1.1 带宽分配优化 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 #!/bin/bash echo "=== Bandwidth Optimization ===" echo "1. Current Bandwidth Configuration:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then speed=$(cat /sys/class/net/$interface /speed 2>/dev/null || echo "Unknown" ) duplex=$(cat /sys/class/net/$interface /duplex 2>/dev/null || echo "Unknown" ) echo "$interface : Speed=$speed , Duplex=$duplex " fi done echo -e "\n2. Network Buffer Optimization:" echo "Current buffer settings:" cat /proc/sys/net/core/rmem_maxcat /proc/sys/net/core/wmem_maxecho "Setting larger buffers..." echo 16777216 > /proc/sys/net/core/rmem_maxecho 16777216 > /proc/sys/net/core/wmem_maxecho 16777216 > /proc/sys/net/core/rmem_defaultecho 16777216 > /proc/sys/net/core/wmem_defaultecho -e "\n3. TCP Parameter Optimization:" echo "Current TCP settings:" cat /proc/sys/net/ipv4/tcp_rmemcat /proc/sys/net/ipv4/tcp_wmemecho "4096 87380 16777216" > /proc/sys/net/ipv4/tcp_rmemecho "4096 65536 16777216" > /proc/sys/net/ipv4/tcp_wmemecho 1 > /proc/sys/net/ipv4/tcp_window_scalingecho 1 > /proc/sys/net/ipv4/tcp_timestampsecho 1 > /proc/sys/net/ipv4/tcp_sack
5.1.2 流量控制优化 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 #!/bin/bash echo "=== Traffic Control Optimization ===" echo "1. Current Queue Rules:" tc qdisc show echo -e "\n2. Setting HTB Queue Rules:" INTERFACE="eth0" tc qdisc del dev $INTERFACE root 2>/dev/null tc qdisc add dev $INTERFACE root handle 1: htb default 30 tc class add dev $INTERFACE parent 1: classid 1:1 htb rate 1000mbit tc class add dev $INTERFACE parent 1:1 classid 1:10 htb rate 800mbit ceil 1000mbit tc class add dev $INTERFACE parent 1:1 classid 1:20 htb rate 150mbit ceil 200mbit tc class add dev $INTERFACE parent 1:1 classid 1:30 htb rate 50mbit ceil 100mbit tc filter add dev $INTERFACE parent 1: protocol ip prio 1 u32 match ip dport 80 0xffff flowid 1:10 tc filter add dev $INTERFACE parent 1: protocol ip prio 2 u32 match ip dport 443 0xffff flowid 1:10 tc filter add dev $INTERFACE parent 1: protocol ip prio 3 u32 match ip dport 22 0xffff flowid 1:20 echo "Traffic control rules configured successfully"
5.2 延迟优化 5.2.1 网络延迟优化 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 #!/bin/bash echo "=== Latency Optimization ===" echo "1. TCP Parameter Optimization:" echo "Current TCP settings:" cat /proc/sys/net/ipv4/tcp_no_delay_ackcat /proc/sys/net/ipv4/tcp_low_latencyecho 1 > /proc/sys/net/ipv4/tcp_no_delay_ackecho 1 > /proc/sys/net/ipv4/tcp_low_latencyecho "2. TCP Congestion Control Optimization:" echo "Current congestion control:" cat /proc/sys/net/ipv4/tcp_congestion_controlecho bbr > /proc/sys/net/ipv4/tcp_congestion_controlecho -e "\n3. Network Interface Optimization:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then echo 1 > /sys/class/net/$interface /gro_flush_timeout ethtool -K $interface tso on 2>/dev/null ethtool -K $interface gso on 2>/dev/null echo "Optimized interface: $interface " fi done echo -e "\n4. Interrupt Handling Optimization:" echo "Current interrupt settings:" cat /proc/interrupts | grep eth0echo 2 > /proc/irq/24/smp_affinity
5.2.2 应用层延迟优化 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 #!/bin/bash echo "=== Application Latency Optimization ===" echo "1. DNS Resolution Optimization:" echo "Current DNS settings:" cat /etc/resolv.confecho "nameserver 127.0.0.1" > /etc/resolv.conf.localecho "nameserver 8.8.8.8" >> /etc/resolv.conf.localecho "nameserver 1.1.1.1" >> /etc/resolv.conf.localecho -e "\n2. Connection Pool Optimization:" echo "Current connection limits:" cat /proc/sys/net/core/somaxconncat /proc/sys/net/ipv4/tcp_max_syn_backlogecho 65535 > /proc/sys/net/core/somaxconnecho 65535 > /proc/sys/net/ipv4/tcp_max_syn_backlogecho -e "\n3. TCP Connection Reuse Optimization:" echo "Current TCP reuse settings:" cat /proc/sys/net/ipv4/tcp_tw_reusecat /proc/sys/net/ipv4/tcp_tw_recycleecho 1 > /proc/sys/net/ipv4/tcp_tw_reuseecho -e "\n4. TCP Fast Open Optimization:" echo "Current TCP fast open settings:" cat /proc/sys/net/ipv4/tcp_fastopenecho 3 > /proc/sys/net/ipv4/tcp_fastopen
5.3 安全优化 5.3.1 网络安全加固 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 #!/bin/bash echo "=== Network Security Hardening ===" echo "1. Firewall Configuration:" iptables -F iptables -X iptables -t nat -F iptables -t nat -X iptables -P INPUT DROP iptables -P FORWARD DROP iptables -P OUTPUT ACCEPT iptables -A INPUT -i lo -j ACCEPT iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT iptables -A INPUT -p tcp --dport 22 -j ACCEPT iptables -A INPUT -p tcp --dport 80 -j ACCEPT iptables -A INPUT -p tcp --dport 443 -j ACCEPT iptables -A INPUT -p tcp --dport 80 -m limit --limit 25/minute --limit-burst 100 -j ACCEPT iptables -A INPUT -p tcp --syn -m limit --limit 1/s --limit-burst 3 -j ACCEPT echo "Firewall rules configured" echo -e "\n2. TCP Parameter Hardening:" echo 1 > /proc/sys/net/ipv4/tcp_syncookiesecho 1 > /proc/sys/net/ipv4/tcp_syn_retriesecho 30 > /proc/sys/net/ipv4/tcp_fin_timeoutecho 1 > /proc/sys/net/ipv4/tcp_tw_reuseecho -e "\n3. IP Forwarding Configuration:" echo "Current IP forwarding:" cat /proc/sys/net/ipv4/ip_forwardecho 0 > /proc/sys/net/ipv4/ip_forwardecho 0 > /proc/sys/net/ipv4/conf/all/accept_redirectsecho 0 > /proc/sys/net/ipv4/conf/all/send_redirectsecho 0 > /proc/sys/net/ipv4/conf/all/accept_source_route
5.3.2 网络监控安全 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 #!/bin/bash echo "=== Network Monitoring Security ===" echo "1. Network Monitoring Log Configuration:" LOG_FILE="/var/log/network_security.log" monitor_connections () { while true ; do timestamp=$(date '+%Y-%m-%d %H:%M:%S' ) connection_count=$(ss -t | wc -l) if [ $connection_count -gt 1000 ]; then echo "$timestamp - WARNING: High connection count: $connection_count " >> $LOG_FILE fi ss -tln | awk 'NR>1 {print $4}' | cut -d: -f2 | sort | uniq -c | sort -nr | head -5 | while read count port; do if [ $count -gt 100 ]; then echo "$timestamp - WARNING: High connection count on port $port : $count " >> $LOG_FILE fi done sleep 60 done } monitor_errors () { while true ; do timestamp=$(date '+%Y-%m-%d %H:%M:%S' ) for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_errors=$(cat /sys/class/net/$interface /statistics/rx_errors) tx_errors=$(cat /sys/class/net/$interface /statistics/tx_errors) if [ $rx_errors -gt 0 ] || [ $tx_errors -gt 0 ]; then echo "$timestamp - WARNING: Interface $interface errors - RX:$rx_errors , TX:$tx_errors " >> $LOG_FILE fi fi done sleep 60 done } monitor_connections & monitor_errors & echo "Network security monitoring started" echo "Log file: $LOG_FILE "
6. 网络监控最佳实践 6.1 监控策略 6.1.1 分层监控策略 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 #!/bin/bash echo "=== Layered Network Monitoring Strategy ===" monitor_physical_layer () { echo "1. Physical Layer Monitoring:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then status=$(cat /sys/class/net/$interface /operstate) speed=$(cat /sys/class/net/$interface /speed 2>/dev/null || echo "Unknown" ) duplex=$(cat /sys/class/net/$interface /duplex 2>/dev/null || echo "Unknown" ) echo "Interface: $interface , Status: $status , Speed: $speed , Duplex: $duplex " if [ "$status " != "up" ]; then echo "WARNING: Interface $interface is $status " fi fi done } monitor_data_link_layer () { echo -e "\n2. Data Link Layer Monitoring:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then mac_address=$(cat /sys/class/net/$interface /address) mtu=$(cat /sys/class/net/$interface /mtu) echo "Interface: $interface , MAC: $mac_address , MTU: $mtu " fi done } monitor_network_layer () { echo -e "\n3. Network Layer Monitoring:" echo "Routing table:" ip route show echo -e "\nIP addresses:" ip addr show echo -e "\nARP table:" ip neigh show } monitor_transport_layer () { echo -e "\n4. Transport Layer Monitoring:" echo "TCP connections:" ss -t -s echo -e "\nUDP connections:" ss -u -s echo -e "\nConnection states:" ss -t -a | awk 'NR>1 {print $1}' | sort | uniq -c | sort -nr } monitor_application_layer () { echo -e "\n5. Application Layer Monitoring:" echo "Listening ports:" ss -tlnp | head -20 echo -e "\nTop connections by port:" ss -tln | awk 'NR>1 {print $4}' | cut -d: -f2 | sort | uniq -c | sort -nr | head -10 } monitor_physical_layer monitor_data_link_layer monitor_network_layer monitor_transport_layer monitor_application_layer
6.1.2 关键指标监控 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 #!/bin/bash echo "=== Key Network Metrics Monitoring ===" monitor_bandwidth () { echo "1. Bandwidth Usage Monitoring:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_bytes=$(cat /sys/class/net/$interface /statistics/rx_bytes) tx_bytes=$(cat /sys/class/net/$interface /statistics/tx_bytes) rx_packets=$(cat /sys/class/net/$interface /statistics/rx_packets) tx_packets=$(cat /sys/class/net/$interface /statistics/tx_packets) echo "Interface: $interface " echo " RX: $rx_bytes bytes, $rx_packets packets" echo " TX: $tx_bytes bytes, $tx_packets packets" fi done } monitor_connections () { echo -e "\n2. Connection Count Monitoring:" tcp_connections=$(ss -t | wc -l) udp_connections=$(ss -u | wc -l) established_connections=$(ss -t state established | wc -l) listening_ports=$(ss -tln | wc -l) echo "TCP connections: $tcp_connections " echo "UDP connections: $udp_connections " echo "Established connections: $established_connections " echo "Listening ports: $listening_ports " } monitor_latency () { echo -e "\n3. Latency Monitoring:" targets=("8.8.8.8" "1.1.1.1" ) for target in "${targets[@]} " ; do echo -n "Pinging $target : " ping_result=$(ping -c 3 -W 1 $target 2>/dev/null) if [ $? -eq 0 ]; then avg_latency=$(echo "$ping_result " | grep "avg" | awk -F'/' '{print $5}' ) echo "${avg_latency} ms" else echo "FAILED" fi done } monitor_errors () { echo -e "\n4. Error Rate Monitoring:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_errors=$(cat /sys/class/net/$interface /statistics/rx_errors) tx_errors=$(cat /sys/class/net/$interface /statistics/tx_errors) rx_dropped=$(cat /sys/class/net/$interface /statistics/rx_dropped) tx_dropped=$(cat /sys/class/net/$interface /statistics/tx_dropped) if [ $rx_errors -gt 0 ] || [ $tx_errors -gt 0 ] || [ $rx_dropped -gt 0 ] || [ $tx_dropped -gt 0 ]; then echo "Interface $interface :" echo " RX Errors: $rx_errors , TX Errors: $tx_errors " echo " RX Dropped: $rx_dropped , TX Dropped: $tx_dropped " fi fi done } monitor_bandwidth monitor_connections monitor_latency monitor_errors
6.2 告警机制 6.2.1 网络告警脚本 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 #!/bin/bash ALERT_EMAIL="admin@company.com" ALERT_WEBHOOK="https://hooks.slack.com/services/xxx" LOG_FILE="/var/log/network_alerts.log" log_alert () { echo "$(date '+%Y-%m-%d %H:%M:%S') - $1 " >> $LOG_FILE } send_email_alert () { local subject="$1 " local message="$2 " echo "$message " | mail -s "$subject " $ALERT_EMAIL log_alert "Email alert sent: $subject " } send_slack_alert () { local message="$1 " curl -X POST -H 'Content-type: application/json' \ --data "{\"text\":\"$message \"}" \ $ALERT_WEBHOOK log_alert "Slack alert sent: $message " } check_interface_status () { for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then status=$(cat /sys/class/net/$interface /operstate) if [ "$status " != "up" ]; then local alert_msg="CRITICAL: Interface $interface is $status " send_email_alert "Network Interface Down" "$alert_msg " send_slack_alert "$alert_msg " fi fi done } check_bandwidth_usage () { for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then speed=$(cat /sys/class/net/$interface /speed 2>/dev/null) if [ -n "$speed " ] && [ "$speed " != "Unknown" ]; then rx_bytes=$(cat /sys/class/net/$interface /statistics/rx_bytes) tx_bytes=$(cat /sys/class/net/$interface /statistics/tx_bytes) if [ $rx_bytes -gt 1000000000 ]; then local alert_msg="WARNING: High bandwidth usage on interface $interface " send_email_alert "High Bandwidth Usage" "$alert_msg " send_slack_alert "$alert_msg " fi fi fi done } check_connection_count () { total_connections=$(ss -t | wc -l) if [ $total_connections -gt 1000 ]; then local alert_msg="WARNING: High connection count: $total_connections " send_email_alert "High Connection Count" "$alert_msg " send_slack_alert "$alert_msg " fi } check_network_latency () { gateway=$(ip route | grep default | awk '{print $3}' | head -1) if [ -n "$gateway " ]; then latency=$(ping -c 3 -W 1 $gateway 2>/dev/null | grep "avg" | awk -F'/' '{print $5}' ) if [ -n "$latency " ]; then if (( $(echo "$latency > 100 " | bc -l) )); then local alert_msg="WARNING: High latency to gateway: ${latency} ms" send_email_alert "High Network Latency" "$alert_msg " send_slack_alert "$alert_msg " fi fi fi } check_network_errors () { for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_errors=$(cat /sys/class/net/$interface /statistics/rx_errors) tx_errors=$(cat /sys/class/net/$interface /statistics/tx_errors) if [ $rx_errors -gt 0 ] || [ $tx_errors -gt 0 ]; then local alert_msg="WARNING: Network errors on interface $interface - RX:$rx_errors , TX:$tx_errors " send_email_alert "Network Errors" "$alert_msg " send_slack_alert "$alert_msg " fi fi done } main () { log_alert "Network alert check started" check_interface_status check_bandwidth_usage check_connection_count check_network_latency check_network_errors log_alert "Network alert check completed" } main
6.3 自动化运维 6.3.1 网络自动恢复 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 #!/bin/bash LOG_FILE="/var/log/network_recovery.log" MAX_RETRIES=3 RETRY_INTERVAL=60 log_recovery () { echo "$(date '+%Y-%m-%d %H:%M:%S') - $1 " >> $LOG_FILE } restart_interface () { local interface="$1 " local retry_count=0 while [ $retry_count -lt $MAX_RETRIES ]; do log_recovery "Attempting to restart interface $interface (attempt $((retry_count + 1) ))" ip link set $interface down sleep 5 ip link set $interface up sleep 10 status=$(cat /sys/class/net/$interface /operstate) if [ "$status " = "up" ]; then log_recovery "Interface $interface restarted successfully" return 0 fi retry_count=$((retry_count + 1 )) sleep $RETRY_INTERVAL done log_recovery "Failed to restart interface $interface after $MAX_RETRIES attempts" return 1 } restart_network_service () { log_recovery "Restarting network service" systemctl restart network sleep 30 if systemctl is-active --quiet network; then log_recovery "Network service restarted successfully" return 0 else log_recovery "Failed to restart network service" return 1 fi } check_connectivity () { local target="$1 " if [ -z "$target " ]; then target="8.8.8.8" fi ping -c 3 -W 1 $target > /dev/null 2>&1 return $? } main () { log_recovery "Network auto-recovery started" if ! check_connectivity; then log_recovery "Network connectivity check failed" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then status=$(cat /sys/class/net/$interface /operstate) if [ "$status " != "up" ]; then log_recovery "Interface $interface is $status , attempting restart" restart_interface "$interface " fi fi done if ! check_connectivity; then log_recovery "Still no connectivity, restarting network service" restart_network_service fi else log_recovery "Network connectivity is normal" fi log_recovery "Network auto-recovery completed" } main
7. 网络监控工具集成 7.1 Prometheus集成 7.1.1 Node Exporter网络指标 1 2 3 4 5 6 7 8 9 10 global: scrape_interval: 15s scrape_configs: - job_name: 'node-exporter' static_configs: - targets: ['localhost:9100' ] scrape_interval: 5s metrics_path: /metrics
7.1.2 自定义网络监控指标 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 #!/bin/bash METRICS_FILE="/tmp/network_metrics.prom" METRICS_PORT=8080 generate_metrics () { cat > $METRICS_FILE << EOF # HELP network_interface_up Interface up status # TYPE network_interface_up gauge EOF for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then status=$(cat /sys/class/net/$interface /operstate) up_value=0 if [ "$status " = "up" ]; then up_value=1 fi echo "network_interface_up{interface=\"$interface \"} $up_value " >> $METRICS_FILE fi done cat >> $METRICS_FILE << EOF # HELP network_interface_rx_bytes_total Total bytes received # TYPE network_interface_rx_bytes_total counter EOF for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_bytes=$(cat /sys/class/net/$interface /statistics/rx_bytes) echo "network_interface_rx_bytes_total{interface=\"$interface \"} $rx_bytes " >> $METRICS_FILE fi done cat >> $METRICS_FILE << EOF # HELP network_interface_tx_bytes_total Total bytes transmitted # TYPE network_interface_tx_bytes_total counter EOF for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then tx_bytes=$(cat /sys/class/net/$interface /statistics/tx_bytes) echo "network_interface_tx_bytes_total{interface=\"$interface \"} $tx_bytes " >> $METRICS_FILE fi done cat >> $METRICS_FILE << EOF # HELP network_tcp_connections_total Total TCP connections # TYPE network_tcp_connections_total gauge EOF tcp_connections=$(ss -t | wc -l) echo "network_tcp_connections_total $tcp_connections " >> $METRICS_FILE cat >> $METRICS_FILE << EOF # HELP network_udp_connections_total Total UDP connections # TYPE network_udp_connections_total gauge EOF udp_connections=$(ss -u | wc -l) echo "network_udp_connections_total $udp_connections " >> $METRICS_FILE } start_metrics_server () { while true ; do generate_metrics sleep 10 done & python3 -c " import http.server import socketserver import os class MetricsHandler(http.server.SimpleHTTPRequestHandler): def do_GET(self): if self.path == '/metrics': self.send_response(200) self.send_header('Content-type', 'text/plain') self.end_headers() with open('$METRICS_FILE ', 'r') as f: self.wfile.write(f.read().encode()) else: self.send_response(404) self.end_headers() with socketserver.TCPServer(('', $METRICS_PORT ), MetricsHandler) as httpd: httpd.serve_forever() " } start_metrics_server
7.2 Grafana仪表板 7.2.1 网络监控仪表板配置 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 { "dashboard" : { "title" : "Network Monitoring Dashboard" , "panels" : [ { "title" : "Network Interface Status" , "type" : "stat" , "targets" : [ { "expr" : "network_interface_up" , "legendFormat" : "{{interface}}" } ] } , { "title" : "Network Traffic" , "type" : "graph" , "targets" : [ { "expr" : "rate(network_interface_rx_bytes_total[5m])" , "legendFormat" : "{{interface}} RX" } , { "expr" : "rate(network_interface_tx_bytes_total[5m])" , "legendFormat" : "{{interface}} TX" } ] } , { "title" : "TCP Connections" , "type" : "graph" , "targets" : [ { "expr" : "network_tcp_connections_total" , "legendFormat" : "TCP Connections" } ] } , { "title" : "UDP Connections" , "type" : "graph" , "targets" : [ { "expr" : "network_udp_connections_total" , "legendFormat" : "UDP Connections" } ] } ] } }
8. 实战案例 8.1 Web服务器网络监控 8.1.1 Nginx网络监控 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 #!/bin/bash echo "=== Nginx Network Monitoring ===" nginx_pids=$(pgrep nginx) if [ -z "$nginx_pids " ]; then echo "ERROR: Nginx is not running!" exit 1 fi echo "Nginx PIDs: $nginx_pids " echo -e "\nListening Ports:" ss -tlnp | grep nginx echo -e "\nConnection Statistics:" echo "HTTP connections (port 80):" ss -t | grep :80 | wc -l echo "HTTPS connections (port 443):" ss -t | grep :443 | wc -l echo -e "\nConnection States:" ss -t | awk 'NR>1 {print $1}' | sort | uniq -c | sort -nr echo -e "\nNetwork Errors:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_errors=$(cat /sys/class/net/$interface /statistics/rx_errors) tx_errors=$(cat /sys/class/net/$interface /statistics/tx_errors) if [ $rx_errors -gt 0 ] || [ $tx_errors -gt 0 ]; then echo "Interface $interface : RX_ERR=$rx_errors , TX_ERR=$tx_errors " fi fi done echo -e "\nBandwidth Usage:" for interface in $(ip link show | grep -E "^[0-9]+:" | cut -d: -f2 | tr -d ' ' ); do if [ "$interface " != "lo" ]; then rx_bytes=$(cat /sys/class/net/$interface /statistics/rx_bytes) tx_bytes=$(cat /sys/class/net/$interface /statistics/tx_bytes) rx_mbps=$(echo "scale=2; $rx_bytes * 8 / 1024 / 1024" | bc) tx_mbps=$(echo "scale=2; $tx_bytes * 8 / 1024 / 1024" | bc) echo "Interface $interface : RX=${rx_mbps} Mbps, TX=${tx_mbps} Mbps" fi done
8.1.2 Apache网络监控 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 #!/bin/bash echo "=== Apache Network Monitoring ===" apache_pids=$(pgrep httpd) if [ -z "$apache_pids " ]; then echo "ERROR: Apache is not running!" exit 1 fi echo "Apache PIDs: $apache_pids " echo -e "\nListening Ports:" ss -tlnp | grep httpd echo -e "\nConnection Statistics:" echo "HTTP connections (port 80):" ss -t | grep :80 | wc -l echo "HTTPS connections (port 443):" ss -t | grep :443 | wc -l echo -e "\nConnection States:" ss -t | awk 'NR>1 {print $1}' | sort | uniq -c | sort -nr echo -e "\nModule Status:" httpd -M 2>/dev/null | head -20 echo -e "\nPerformance Statistics:" if [ -f /var/log/httpd/access_log ]; then echo "Recent requests:" tail -10 /var/log/httpd/access_log fi
8.2 数据库网络监控 8.2.1 MySQL网络监控 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 #!/bin/bash echo "=== MySQL Network Monitoring ===" mysql_pids=$(pgrep mysqld) if [ -z "$mysql_pids " ]; then echo "ERROR: MySQL is not running!" exit 1 fi echo "MySQL PIDs: $mysql_pids " echo -e "\nListening Ports:" ss -tlnp | grep mysql echo -e "\nConnection Statistics:" mysql -e "SHOW STATUS LIKE 'Connections';" 2>/dev/null mysql -e "SHOW STATUS LIKE 'Threads_connected';" 2>/dev/null mysql -e "SHOW STATUS LIKE 'Threads_running';" 2>/dev/null echo -e "\nNetwork Connections:" ss -t | grep :3306 | wc -l echo -e "\nConnection States:" ss -t | grep :3306 | awk '{print $1}' | sort | uniq -c | sort -nr echo -e "\nNetwork Errors:" mysql -e "SHOW STATUS LIKE 'Aborted_connects';" 2>/dev/null mysql -e "SHOW STATUS LIKE 'Aborted_clients';" 2>/dev/null echo -e "\nNetwork Statistics:" mysql -e "SHOW STATUS LIKE 'Bytes_received';" 2>/dev/null mysql -e "SHOW STATUS LIKE 'Bytes_sent';" 2>/dev/null
8.2.2 Redis网络监控 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 #!/bin/bash echo "=== Redis Network Monitoring ===" redis_pids=$(pgrep redis-server) if [ -z "$redis_pids " ]; then echo "ERROR: Redis is not running!" exit 1 fi echo "Redis PIDs: $redis_pids " echo -e "\nListening Ports:" ss -tlnp | grep redis echo -e "\nConnection Statistics:" redis-cli info clients 2>/dev/null echo -e "\nNetwork Connections:" ss -t | grep :6379 | wc -l echo -e "\nConnection States:" ss -t | grep :6379 | awk '{print $1}' | sort | uniq -c | sort -nr echo -e "\nNetwork Statistics:" redis-cli info stats 2>/dev/null | grep -E "(total_connections_received|total_commands_processed|instantaneous_ops_per_sec)" echo -e "\nMemory Usage:" redis-cli info memory 2>/dev/null | grep -E "(used_memory|used_memory_peak|used_memory_rss)"
8.3 负载均衡器网络监控 8.3.1 HAProxy网络监控 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 #!/bin/bash echo "=== HAProxy Network Monitoring ===" haproxy_pids=$(pgrep haproxy) if [ -z "$haproxy_pids " ]; then echo "ERROR: HAProxy is not running!" exit 1 fi echo "HAProxy PIDs: $haproxy_pids " echo -e "\nListening Ports:" ss -tlnp | grep haproxy echo -e "\nConnection Statistics:" echo "HAProxy connections:" ss -t | grep haproxy | wc -l echo -e "\nBackend Server Status:" if [ -f /var/run/haproxy/admin.sock ]; then echo "show stat" | socat stdio /var/run/haproxy/admin.sock | head -20 fi echo -e "\nLoad Balancing Statistics:" if [ -f /var/run/haproxy/admin.sock ]; then echo "show stat" | socat stdio /var/run/haproxy/admin.sock | grep -E "(srv|backend)" fi
9. 总结 网络监控是现代IT基础设施运维的核心技能,通过合理的监控策略和工具使用,可以:
保障服务可用性 : 及时发现网络故障,确保服务稳定运行
优化网络性能 : 识别网络瓶颈,优化网络配置
提升安全防护 : 检测异常流量和潜在安全威胁
支持容量规划 : 为网络扩容提供数据支持
快速故障诊断 : 快速定位和解决网络问题
9.1 关键要点
全面监控 : 覆盖物理层到应用层的全方位监控
实时告警 : 建立完善的告警机制,及时发现问题
自动化运维 : 实现网络自动恢复和健康检查
性能分析 : 使用专业工具进行深度性能分析
持续优化 : 根据监控数据持续优化网络配置
9.2 最佳实践
分层监控 : 从物理层到应用层的全方位监控
阈值设置 : 合理设置监控阈值,避免误报
工具集成 : 集成多种监控工具,提供统一视图
文档记录 : 详细记录监控配置和故障处理流程
定期演练 : 定期进行故障演练,验证监控有效性
通过本文的学习和实践,您将掌握企业级网络监控的核心技能,能够有效监控和管理生产环境中的各种网络设备和应用,确保网络服务稳定高效运行。