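1. Overview of Redis cloud deployment

Redis cloud deployment moves a traditional Redis architecture into a cloud environment and uses the platform's capabilities to provide elastic scaling, high availability, and automated operations. The one-master, one-replica topology (called "slave" in the identifiers below) is a common high-availability layout for such deployments: the replica keeps a second copy of the data and can serve reads, which makes read/write splitting possible.

1.1 Advantages of Redis cloud deployment

  1. Elastic scaling: adjust resources dynamically as business demand changes
  2. High availability: backed by the infrastructure of the cloud platform
  3. Automated operations: lower manual operations cost
  4. Cost optimization: pay-as-you-go pricing lowers total cost
  5. Security and reliability: the security mechanisms of the cloud platform
  6. Comprehensive monitoring: monitoring and alerting services provided by the cloud platform

1.2 Characteristics of the one-master, one-replica architecture

  • Master node: handles all writes and some reads
  • Replica node: handles reads and holds a backup copy of the data
  • Failover: when the master fails, the replica can be promoted. Plain replication does not do this by itself; automatic promotion requires Redis Sentinel, an orchestration layer, or a managed cloud service
  • Data synchronization: replication is asynchronous, so the replica follows the master with a small lag and consistency is eventual rather than strict
  • Load balancing: read/write splitting improves overall throughput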
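
As a rough illustration of the read/write split described above, the sketch below maps a command name to one of the two nodes. The endpoint names `redis-master:6379` and `redis-slave:6379` and the short list of read commands are assumptions chosen for illustration; in practice a client library or proxy usually makes this decision.

```shell
#!/bin/sh
# Hypothetical endpoints; substitute the addresses of your own deployment.
MASTER_ENDPOINT="redis-master:6379"
REPLICA_ENDPOINT="redis-slave:6379"

# pick_endpoint CMD: print the endpoint that should serve the given command.
# Only a small, illustrative set of read-only commands goes to the replica;
# writes and unknown commands always go to the master.
pick_endpoint() {
    cmd=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case "$cmd" in
        GET|MGET|EXISTS|TTL|HGET|HGETALL|LRANGE|SMEMBERS|ZRANGE)
            printf '%s\n' "$REPLICA_ENDPOINT" ;;
        *)
            printf '%s\n' "$MASTER_ENDPOINT" ;;
    esac
}

pick_endpoint get    # prints redis-slave:6379
pick_endpoint SET    # prints redis-master:6379
```

Because replication is asynchronous, a read routed to the replica may briefly return an older value than the master holds; reads that must see the latest write are safer on the master.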

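1.3 Types of cloud deployment

  1. Containerized deployment: run Redis in Docker containers
  2. Kubernetes deployment: manage the Redis nodes with Kubernetes
  3. Managed cloud service: use the Redis service offered by a cloud platform
  4. Hybrid cloud deployment: combine public and private cloud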

2. Containerized deployment with Docker

2.1 Building the Docker image

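# Dockerfile
FROM redis:7.0-alpine

# Install helper tools (bash is required by entrypoint.sh)
RUN apk add --no-cache bash curl

# The official image already creates the redis user and group, so they are
# not created again here (addgroup would fail on the existing group)

# Make sure the data directory exists and belongs to the redis user
RUN mkdir -p /data && chown redis:redis /data

# Copy the configuration file and the startup script
# (redis.conf must exist in the build context; Docker Compose later mounts
# the role-specific file over it)
COPY redis.conf /usr/local/etc/redis/redis.conf
COPY entrypoint.sh /usr/local/bin/entrypoint.sh

# Make the startup script executable
RUN chmod +x /usr/local/bin/entrypoint.sh

# Run as the non-root redis user
USER redis

# Expose the Redis port
EXPOSE 6379

# Run the startup script, which then starts Redis with the copied configuration
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
CMD ["redis-server", "/usr/local/etc/redis/redis.conf"]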

2.2 Redis master configuration

# redis-master.conf
port 6379
bind 0.0.0.0
protected-mode no
daemonize no
logfile ""
loglevel notice

# Persistence (RDB snapshots)
save 900 1
save 300 10
save 60 10000
dbfilename dump.rdb
dir /data

# Memory
maxmemory 2gb
maxmemory-policy allkeys-lru

# Replication
repl-diskless-sync yes
repl-diskless-sync-delay 5
repl-backlog-size 1mb
repl-backlog-ttl 3600

# Security (sample password only; replace it before any real use)
requirepass redis123

2.3 Redis replica configuration

# redis-slave.conf
port 6379
bind 0.0.0.0
protected-mode no
daemonize no
logfile ""
loglevel notice

# Persistence (RDB snapshots)
save 900 1
save 300 10
save 60 10000
dbfilename dump.rdb
dir /data

# Memory
maxmemory 2gb
maxmemory-policy allkeys-lru

# Replication: follow redis-master and authenticate with its password
replicaof redis-master 6379
masterauth redis123
replica-read-only yes
repl-diskless-sync yes
repl-diskless-sync-delay 5

# Security (sample password only; replace it before any real use)
requirepass redis123
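
Once both nodes run with these settings, one way to confirm that the replica is actually attached is to inspect the `role` and `master_link_status` fields of `INFO replication`. The sketch below parses that output from standard input; the canned sample text stands in for a live server and is an assumption for illustration.

```shell
#!/bin/sh
# replica_link_ok: read `INFO replication` output on stdin and succeed only
# if the node reports role:slave and master_link_status:up.
replica_link_ok() {
    info=$(tr -d '\r')   # redis-cli output uses CRLF line endings
    printf '%s\n' "$info" | grep -qx 'role:slave' || return 1
    printf '%s\n' "$info" | grep -qx 'master_link_status:up' || return 1
    return 0
}

# Canned output standing in for a live replica (illustrative only).
sample='# Replication
role:slave
master_host:redis-master
master_port:6379
master_link_status:up'

if printf '%s\n' "$sample" | replica_link_ok; then
    echo "replica is connected to its master"
fi
```

Against a running replica the same check could be fed with `redis-cli -h redis-slave -a redis123 info replication | replica_link_ok`, using the host name and sample password from this article.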

2.4 Deployment with Docker Compose

# docker-compose.yml
version: '3.8'

services:
  redis-master:
    build: .
    container_name: redis-master
    ports:
      - "6379:6379"
    volumes:
      - redis-master-data:/data
      - ./redis-master.conf:/usr/local/etc/redis/redis.conf
    environment:
      - REDIS_ROLE=master
    networks:
      - redis-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "redis123", "ping"]
      interval: 30s
      timeout: 10s
      retries: 3

  redis-slave:
    build: .
    container_name: redis-slave
    ports:
      - "6380:6379"
    volumes:
      - redis-slave-data:/data
      - ./redis-slave.conf:/usr/local/etc/redis/redis.conf
    environment:
      - REDIS_ROLE=slave
      - REDIS_MASTER_HOST=redis-master
    depends_on:
      - redis-master
    networks:
      - redis-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "redis123", "ping"]
      interval: 30s
      timeout: 10s
      retries: 3

volumes:
  redis-master-data:
  redis-slave-data:

networks:
  redis-network:
    driver: bridge

2.5 Startup script

#!/bin/bash
# entrypoint.sh

set -e

# Choose the startup path from the REDIS_ROLE environment variable
if [ "$REDIS_ROLE" = "slave" ]; then
    echo "Starting Redis as slave..."
    # Wait until the master answers PING
    while ! redis-cli -h "$REDIS_MASTER_HOST" -p 6379 -a redis123 ping; do
        echo "Waiting for master to be ready..."
        sleep 2
    done
    echo "Master is ready, starting slave..."
fi

# Start Redis; exec replaces the shell so Redis receives signals directly
exec "$@"
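
The wait loop in the script retries forever, so a replica container hangs silently if the master never comes up. A bounded variant is sketched below; the helper name `wait_for` and its argument order are hypothetical, not part of the original script.

```shell
#!/bin/sh
# wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it
# succeeds, giving up after MAX_ATTEMPTS tries.
# Returns 0 on success and 1 when all attempts failed.
wait_for() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

Inside the entrypoint this could be used as `wait_for 30 2 redis-cli -h "$REDIS_MASTER_HOST" -p 6379 -a redis123 ping || exit 1`, which lets the container fail after roughly a minute so that the restart policy or an operator can react.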

3. Deployment on Kubernetes

3.1 Redis master Deployment

# redis-master-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-master
  labels:
    app: redis
    role: master
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
      role: master
  template:
    metadata:
      labels:
        app: redis
        role: master
    spec:
      containers:
        - name: redis
          image: redis:7.0-alpine
          # Start Redis with the mounted configuration file; without these
          # arguments the image runs redis-server with built-in defaults
          args: ["redis-server", "/usr/local/etc/redis/redis.conf"]
          ports:
            - containerPort: 6379
          env:
            - name: REDIS_ROLE
              value: "master"
          volumeMounts:
            - name: redis-data
              mountPath: /data
            - name: redis-config
              mountPath: /usr/local/etc/redis/redis.conf
              subPath: redis.conf
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            exec:
              command:
                - redis-cli
                - -a
                - redis123
                - ping
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - redis-cli
                - -a
                - redis123
                - ping
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: redis-data
          persistentVolumeClaim:
            claimName: redis-master-pvc
        - name: redis-config
          configMap:
            name: redis-master-config
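
The Deployment limits the container to 2Gi of memory, while the Redis configuration used in this article sets `maxmemory 2gb`, which is the same number of bytes. Redis also needs memory beyond the dataset (replication buffers, copy-on-write pages during BGSAVE), so it can be worth checking how much headroom the two settings leave. The sketch below does that arithmetic; the function names are hypothetical.

```shell
#!/bin/sh
# redis_size_to_bytes SIZE: convert a redis.conf size such as "2gb" or "512mb"
# to bytes. Redis treats kb/mb/gb as powers of 1024 and k/m/g as powers of 1000.
redis_size_to_bytes() {
    value=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    number=${value%%[a-z]*}      # leading digits
    unit=${value#"$number"}      # remaining unit suffix, possibly empty
    case "$unit" in
        "")  factor=1 ;;
        k)   factor=1000 ;;
        kb)  factor=1024 ;;
        m)   factor=1000000 ;;
        mb)  factor=1048576 ;;
        g)   factor=1000000000 ;;
        gb)  factor=1073741824 ;;
        *)   return 1 ;;
    esac
    echo $((number * factor))
}

# headroom_percent MAXMEMORY LIMIT_BYTES: share of the container limit that
# remains above maxmemory, as an integer percentage.
headroom_percent() {
    used=$(redis_size_to_bytes "$1") || return 1
    echo $(( (($2 - used) * 100) / $2 ))
}

limit_bytes=$((2 * 1024 * 1024 * 1024))   # Kubernetes "2Gi"
headroom_percent 2gb "$limit_bytes"       # prints 0
headroom_percent 1536mb "$limit_bytes"    # prints 25
```

With zero headroom the container risks being OOM-killed under load before Redis starts evicting keys; how much margin is appropriate depends on the workload and is not something this sketch decides.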

3.2 Redis replica Deployment

# redis-slave-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-slave
  labels:
    app: redis
    role: slave
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
      role: slave
  template:
    metadata:
      labels:
        app: redis
        role: slave
    spec:
      containers:
        - name: redis
          image: redis:7.0-alpine
          # Start Redis with the mounted configuration file; without these
          # arguments the image runs redis-server with built-in defaults
          args: ["redis-server", "/usr/local/etc/redis/redis.conf"]
          ports:
            - containerPort: 6379
          env:
            - name: REDIS_ROLE
              value: "slave"
            - name: REDIS_MASTER_HOST
              value: "redis-master-service"
          volumeMounts:
            - name: redis-data
              mountPath: /data
            - name: redis-config
              mountPath: /usr/local/etc/redis/redis.conf
              subPath: redis.conf
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            exec:
              command:
                - redis-cli
                - -a
                - redis123
                - ping
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - redis-cli
                - -a
                - redis123
                - ping
            initialDelaySeconds: 5
            periodSeconds: 5
      volumes:
        - name: redis-data
          persistentVolumeClaim:
            claimName: redis-slave-pvc
        - name: redis-config
          configMap:
            name: redis-slave-config

3.3 Service configuration

# redis-services.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-master-service
  labels:
    app: redis
    role: master
spec:
  ports:
    - port: 6379
      targetPort: 6379
      protocol: TCP
  selector:
    app: redis
    role: master
  type: ClusterIP

---
apiVersion: v1
kind: Service
metadata:
  name: redis-slave-service
  labels:
    app: redis
    role: slave
spec:
  ports:
    - port: 6379
      targetPort: 6379
      protocol: TCP
  selector:
    app: redis
    role: slave
  type: ClusterIP

3.4 ConfigMap configuration

# redis-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-master-config
data:
  redis.conf: |
    port 6379
    bind 0.0.0.0
    protected-mode no
    daemonize no
    logfile ""
    loglevel notice
    save 900 1
    save 300 10
    save 60 10000
    dbfilename dump.rdb
    dir /data
    maxmemory 2gb
    maxmemory-policy allkeys-lru
    repl-diskless-sync yes
    repl-diskless-sync-delay 5
    repl-backlog-size 1mb
    repl-backlog-ttl 3600
    requirepass redis123

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-slave-config
data:
  redis.conf: |
    port 6379
    bind 0.0.0.0
    protected-mode no
    daemonize no
    logfile ""
    loglevel notice
    save 900 1
    save 300 10
    save 60 10000
    dbfilename dump.rdb
    dir /data
    maxmemory 2gb
    maxmemory-policy allkeys-lru
    replicaof redis-master-service 6379
    masterauth redis123
    replica-read-only yes
    repl-diskless-sync yes
    repl-diskless-sync-delay 5
    requirepass redis123

3.5 PersistentVolume configuration

# redis-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: redis-master-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-ssd
  hostPath:
    path: /data/redis-master

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: redis-slave-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: fast-ssd
  hostPath:
    path: /data/redis-slave

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-master-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-slave-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

4. Deployment on managed cloud platforms

4.1 Alibaba Cloud Redis

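# Create a Redis instance with the Alibaba Cloud CLI
aliyun r-kvstore CreateInstance \
  --InstanceName "redis-master" \
  --InstanceClass "redis.master.small.default" \
  --InstanceType "Redis" \
  --EngineVersion "5.0" \
  --RegionId "cn-hangzhou" \
  --ZoneId "cn-hangzhou-h" \
  --VpcId "vpc-xxx" \
  --VSwitchId "vsw-xxx" \
  --Password "Redis123456"

# Create the instance intended as the replica
# Note: this call creates a second, independent instance; it does not by itself
# replicate from redis-master. Check the instance class name and whether the
# master-replica class above already includes a replica node against the
# current Alibaba Cloud documentation.
aliyun r-kvstore CreateInstance \
  --InstanceName "redis-slave" \
  --InstanceClass "redis.slave.small.default" \
  --InstanceType "Redis" \
  --EngineVersion "5.0" \
  --RegionId "cn-hangzhou" \
  --ZoneId "cn-hangzhou-i" \
  --VpcId "vpc-xxx" \
  --VSwitchId "vsw-xxx" \
  --Password "Redis123456"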

4.2 Tencent Cloud Redis

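# Create a Redis instance with the Tencent Cloud CLI
# Note: TypeId encodes the engine version and architecture; confirm in the
# current API reference that the value matches the version intended here.
tccli redis CreateInstances \
  --ZoneId 100003 \
  --TypeId 2 \
  --MemSize 1024 \
  --GoodsNum 1 \
  --Period 1 \
  --Password "Redis123456" \
  --BillingMode 1 \
  --VpcId "vpc-xxx" \
  --SubnetId "subnet-xxx" \
  --ProjectId 0 \
  --AutoRenew 1 \
  --ProductVersion "5.0" \
  --RedisReplicasNum 1 \
  --RedisShardNum 1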

4.3 AWS ElastiCache

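# Create a Redis replication group (one primary, one replica) with the AWS CLI
# --auth-token requires in-transit encryption and a token of 16 to 128
# characters, so the token is read from REDIS_AUTH_TOKEN (an environment
# variable assumed to be set beforehand) instead of the short sample password
aws elasticache create-replication-group \
  --replication-group-id "redis-master-slave" \
  --replication-group-description "Redis 1 Master 1 Slave" \
  --cache-node-type "cache.t3.micro" \
  --engine "redis" \
  --engine-version "7.0" \
  --num-cache-clusters 2 \
  --cache-parameter-group-name "default.redis7" \
  --security-group-ids "sg-xxx" \
  --cache-subnet-group-name "redis-subnet-group" \
  --transit-encryption-enabled \
  --auth-token "$REDIS_AUTH_TOKEN"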

5. Monitoring and operations in the cloud

5.1 Prometheus scrape configuration

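# prometheus-redis.yml
# Redis itself does not serve Prometheus metrics on port 6379. The targets
# below assume one redis_exporter per node on its default port 9121; the
# exporter host names are placeholders.
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'redis-master'
    static_configs:
      - targets: ['redis-master-exporter:9121']
    metrics_path: /metrics
    scrape_interval: 10s

  - job_name: 'redis-slave'
    static_configs:
      - targets: ['redis-slave-exporter:9121']
    metrics_path: /metrics
    scrape_interval: 10s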

5.2 Grafana dashboard configuration

{
  "dashboard": {
    "title": "Redis master/replica monitoring",
    "panels": [
      {
        "title": "Redis connected clients",
        "type": "graph",
        "targets": [
          {
            "expr": "redis_connected_clients",
            "legendFormat": "{{instance}}"
          }
        ]
      },
      {
        "title": "Redis memory usage",
        "type": "graph",
        "targets": [
          {
            "expr": "redis_memory_used_bytes",
            "legendFormat": "{{instance}}"
          }
        ]
      },
      {
        "title": "Redis commands processed per second",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(redis_commands_processed_total[5m])",
            "legendFormat": "{{instance}}"
          }
        ]
      }
    ]
  }
}

5.3 Alerting rules

# redis-alerts.yml
groups:
  - name: redis
    rules:
      - alert: RedisDown
        expr: redis_up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Redis instance is down"
          description: "Redis instance {{ $labels.instance }} is down"

      - alert: RedisHighMemoryUsage
        expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis high memory usage"
          description: "Redis instance {{ $labels.instance }} memory usage is above 80%"

      - alert: RedisHighConnections
        expr: redis_connected_clients > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis high connections"
          description: "Redis instance {{ $labels.instance }} has too many connections"
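
To see what the RedisHighMemoryUsage expression evaluates before wiring it into Alertmanager, the same ratio can be computed by hand from metrics in the Prometheus text format. The sketch below reads that text on standard input; the function name and the sample values are hypothetical.

```shell
#!/bin/sh
# memory_usage_high THRESHOLD: read Prometheus text-format metrics on stdin and
# succeed if redis_memory_used_bytes / redis_memory_max_bytes exceeds THRESHOLD.
memory_usage_high() {
    awk -v threshold="$1" '
        $1 == "redis_memory_used_bytes" || index($1, "redis_memory_used_bytes{") == 1 { used = $NF }
        $1 == "redis_memory_max_bytes"  || index($1, "redis_memory_max_bytes{") == 1  { max = $NF }
        END {
            # max is 0 when maxmemory is unset; treat that as "no alert".
            if (max + 0 == 0) exit 1
            if (used / max > threshold) exit 0
            exit 1
        }'
}

# Sample metrics standing in for a scrape result (illustrative values).
sample='redis_memory_used_bytes 1800000000
redis_memory_max_bytes 2147483648'

if printf '%s\n' "$sample" | memory_usage_high 0.8; then
    echo "memory usage is above 80%"
fi
```

The explicit check for a zero maximum matters because an instance without `maxmemory` reports 0, and the alert expression would then divide by zero.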

5.4 Operations automation script

#!/bin/bash
# redis-cloud-ops.sh

# Check whether a Redis node answers PING
check_redis_status() {
    local host=$1
    local port=$2
    local password=$3

    if redis-cli -h "$host" -p "$port" -a "$password" ping > /dev/null 2>&1; then
        echo "Redis $host:$port is healthy"
        return 0
    else
        echo "Redis $host:$port is unhealthy"
        return 1
    fi
}

# Manual master/replica switchover
failover_redis() {
    local master_host=$1
    local master_port=$2
    local slave_host=$3
    local slave_port=$4
    local password=$5

    echo "Starting failover process..."

    # Check the master first
    if ! check_redis_status "$master_host" "$master_port" "$password"; then
        echo "Master is down, promoting slave to master..."

        # Promote the replica to master
        redis-cli -h "$slave_host" -p "$slave_port" -a "$password" replicaof no one

        echo "Slave promoted to master successfully"
        return 0
    else
        echo "Master is healthy, no failover needed"
        return 1
    fi
}

# Data backup
backup_redis() {
    local host=$1
    local port=$2
    local password=$3
    local backup_dir=$4

    echo "Starting backup process..."

    # Create the backup directory
    mkdir -p "$backup_dir"

    # Record the time of the last successful save, then trigger BGSAVE
    local last_save
    last_save=$(redis-cli -h "$host" -p "$port" -a "$password" lastsave)
    redis-cli -h "$host" -p "$port" -a "$password" bgsave

    # Wait until LASTSAVE changes, which means the background save has finished
    # (the original compared LASTSAVE with itself and therefore never detected completion)
    while [ "$(redis-cli -h "$host" -p "$port" -a "$password" lastsave)" -eq "$last_save" ]; do
        sleep 1
    done

    # Copy the snapshot file
    scp "$host:/data/dump.rdb" "$backup_dir/dump-$(date +%Y%m%d-%H%M%S).rdb"

    echo "Backup completed: $backup_dir"
}

# Entry point
main() {
    case $1 in
        "check")
            check_redis_status "$2" "$3" "$4"
            ;;
        "failover")
            failover_redis "$2" "$3" "$4" "$5" "$6"
            ;;
        "backup")
            backup_redis "$2" "$3" "$4" "$5"
            ;;
        *)
            echo "Usage: $0 {check|failover|backup} [args...]"
            exit 1
            ;;
    esac
}

main "$@"
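
The backup function writes a new timestamped `dump-*.rdb` file on every run and never removes old ones. A retention step could be added along the following lines; the function name `prune_backups` and the keep-the-newest-N policy are assumptions, not part of the original script.

```shell
#!/bin/sh
# prune_backups DIR KEEP: delete all but the KEEP newest dump-*.rdb files in DIR.
# The timestamp in the file name (dump-YYYYmmdd-HHMMSS.rdb) sorts
# lexicographically, so a reverse sort lists the newest backups first.
prune_backups() {
    dir=$1
    keep=$2
    ls -1 "$dir" | grep '^dump-.*\.rdb$' | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r name; do
        rm -f -- "$dir/$name"
    done
}
```

Calling `prune_backups "$backup_dir" 7` after a successful backup would keep the seven most recent snapshots.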

6. Security configuration in the cloud

6.1 Network security

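# redis-network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: redis-network-policy
spec:
  podSelector:
    matchLabels:
      app: redis
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        # Application pods may connect to Redis
        - podSelector:
            matchLabels:
              app: application
        # Redis pods must reach each other for replication
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    # Allow DNS lookups so the replica can resolve redis-master-service
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53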

6.2 Access control

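# Redis ACL configuration (commands issued in redis-cli)
# Create an application user with read and write access
ACL SETUSER app-user on >app-password ~* &* +@read +@write

# Create a read-only user
ACL SETUSER read-user on >read-password ~* &* +@read

# Create an administrator
ACL SETUSER admin-user on >admin-password ~* &* +@all

# Persist the ACL rules (requires the aclfile directive to be set in redis.conf)
ACL SAVE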

6.3 Encryption in transit

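# TLS configuration
# Generate a self-signed certificate and key (openssl prompts for the subject fields)
openssl req -x509 -newkey rsa:4096 -keyout redis.key -out redis.crt -days 365 -nodes

# Redis TLS settings (redis.conf)
tls-port 6380
tls-cert-file /etc/ssl/redis.crt
tls-key-file /etc/ssl/redis.key
# CA bundle used to verify peers; the command above does not create ca.crt,
# so with the self-signed certificate that certificate would have to be
# installed under this path
tls-ca-cert-file /etc/ssl/ca.crt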

7. Performance tuning

7.1 Cloud resource tuning

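# Resource requests and limits (fragment of the container spec)
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1000m"

# HPA configuration
# Note: all replicas of the redis-slave Deployment share the redis-slave-pvc
# claim; ReadWriteOnce permits that only for pods scheduled on the same node
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: redis-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: redis-slave
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70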
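
The replica count that the HPA aims for follows the formula in the Kubernetes documentation: desired = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. The sketch below reproduces only that arithmetic for the 70% target used here; the function name is hypothetical, and the real controller additionally applies a tolerance band and stabilization windows that are left out.

```shell
#!/bin/sh
# hpa_desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Prints ceil(CURRENT_REPLICAS * CURRENT_UTIL / TARGET_UTIL), clamped to [MIN, MAX].
hpa_desired_replicas() {
    desired=$(( ($1 * $2 + $3 - 1) / $3 ))   # integer ceiling of the ratio
    if [ "$desired" -lt "$4" ]; then desired=$4; fi
    if [ "$desired" -gt "$5" ]; then desired=$5; fi
    echo "$desired"
}

hpa_desired_replicas 1 140 70 1 3   # prints 2
hpa_desired_replicas 2 200 70 1 3   # prints 3 (capped at maxReplicas)
```

With one replica running at 140% of its CPU request, the HPA would therefore ask for two replicas.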

7.2 Network tuning

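# Network tuning (kernel parameters, for example in /etc/sysctl.conf)
# Larger socket buffers
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 262144
net.core.wmem_default = 262144

# TCP tuning (bbr requires kernel support for that congestion control module)
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_congestion_control = bbr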

7.3 Storage tuning

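# StorageClass configuration
# The iops and throughput parameters for gp3 are options of the AWS EBS CSI
# driver, so its provisioner name is used here
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true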

8. Failure handling and recovery

8.1 Failure detection for cloud services

#!/bin/bash
# cloud-fault-detection.sh

# Query the state of the managed Redis service
check_cloud_service() {
    local service=$1
    local region=$2

    case $service in
        "aws")
            aws elasticache describe-cache-clusters --region "$region"
            ;;
        "aliyun")
            aliyun r-kvstore DescribeInstances --RegionId "$region"
            ;;
        "tencent")
            tccli redis DescribeInstances --region "$region"
            ;;
    esac
}

# Automatic recovery by restarting the instance
auto_recovery() {
    local service=$1
    local instance_id=$2

    echo "Starting auto recovery for $service instance $instance_id"

    case $service in
        "aws")
            # reboot-cache-cluster also needs the node IDs; a Redis cache
            # cluster has a single node with ID 0001
            aws elasticache reboot-cache-cluster --cache-cluster-id "$instance_id" \
                --cache-node-ids-to-reboot 0001
            ;;
        "aliyun")
            aliyun r-kvstore RestartInstance --InstanceId "$instance_id"
            ;;
        "tencent")
            tccli redis RestartInstance --InstanceId "$instance_id"
            ;;
    esac
}

8.2 Data recovery

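# Data restore script
restore_redis_data() {
    local backup_file=$1
    local target_host=$2
    local target_port=$3
    local password=$4

    echo "Restoring data from $backup_file to $target_host:$target_port"

    # Stop the target Redis without writing a new snapshot
    redis-cli -h "$target_host" -p "$target_port" -a "$password" shutdown nosave

    # Copy the backup file into the data directory of the target
    scp "$backup_file" "$target_host:/data/dump.rdb"

    # Start Redis on the target host (assumes SSH access, as the scp step does)
    ssh "$target_host" "redis-server /etc/redis/redis.conf --daemonize yes"

    echo "Data restore completed"
}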
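
Since the restore overwrites the live `dump.rdb`, it may be worth rejecting obviously wrong files before the target is shut down. RDB snapshots begin with the magic string `REDIS`, so a minimal sanity check could look like the sketch below; the function name is hypothetical, and the check confirms only the header, not the integrity of the whole file.

```shell
#!/bin/sh
# looks_like_rdb FILE: succeed if FILE exists and starts with the RDB magic
# string "REDIS". This is a header check only, not a full validation.
looks_like_rdb() {
    [ -f "$1" ] || return 1
    magic=$(dd if="$1" bs=5 count=1 2>/dev/null)
    [ "$magic" = "REDIS" ]
}
```

A guard such as `looks_like_rdb "$backup_file" || { echo "not an RDB file"; return 1; }` at the top of `restore_redis_data` would stop the restore before any service is touched.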

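9. Summary of best practices

9.1 Principles for cloud deployment

  1. High availability first: keep the service continuously available
  2. Elastic scaling: support dynamic resource adjustment
  3. Security and reliability: strengthen security protections
  4. Comprehensive monitoring: monitor the full state of the deployment
  5. Automated operations: reduce manual intervention

9.2 Advantages of the one-master, one-replica architecture

  • Simple and reliable: the architecture is simple and easy to maintain
  • Moderate cost: cheaper than cluster mode
  • Good performance: read/write splitting improves throughput
  • Failure recovery: the replica can take over as master, automatically when Sentinel or a managed service handles promotion
  • Data safety: a second copy of the data protects against the loss of one node

9.3 Recommendations for cloud deployment

  • Choose a suitable cloud platform: select according to business requirements
  • Size resources sensibly: avoid wasting resources
  • Strengthen monitoring and alerting: detect and handle problems promptly
  • Back up data regularly: protect against data loss
  • Tune the network configuration: improve network performance

With a sound Redis cloud deployment and a well-designed one-master, one-replica architecture, it is possible to build a stable, high-performance Redis system in the cloud that meets the needs of enterprise applications.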