Episode 344: SpringBoot Distributed Tracing in Practice: TraceId Log Tracing, Microservice Trace Analysis, and a Complete Enterprise Monitoring Solution
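1. Distributed Tracing Overview
1.1 What a TraceId Is For
Core concepts of distributed tracing:
TraceId:
- A globally unique tracing ID
- Identifies the entire request chain
- Passed along across services
SpanId:
- Identifies one unit of work inside a single service
- Carries the timing of the request within that service
- Spans nest to form parent/child relationships
Benefits:
- Fast problem localization
- Performance bottleneck analysis
- Mapping service dependencies
- Aggregated log queries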
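The relationship between TraceId and SpanId described above can be sketched in a few lines of plain Java. `SpanSketch` is a hypothetical illustration type, not a class from Sleuth, Brave, or this article's code; it only shows that every span in one request chain shares the TraceId while each nested span gets its own SpanId and points back at its parent.

```java
import java.util.UUID;

/** Hypothetical illustration type; not part of Sleuth, Brave, or the article's code. */
final class SpanSketch {
    final String traceId;      // shared by every span in one request chain
    final String spanId;       // unique per unit of work
    final String parentSpanId; // null for the root span

    private SpanSketch(String traceId, String spanId, String parentSpanId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentSpanId = parentSpanId;
    }

    private static String newId() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    /** Starts a new trace: fresh TraceId, no parent. */
    static SpanSketch root() {
        return new SpanSketch(newId(), newId(), null);
    }

    /** Starts nested work: same TraceId, new SpanId, parent set to this span. */
    SpanSketch child() {
        return new SpanSketch(traceId, newId(), spanId);
    }
}
```

Calling `SpanSketch.root().child()` yields a span whose `traceId` equals the root's and whose `parentSpanId` equals the root's `spanId`, which is the nesting a tracing UI reconstructs.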
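1.2 Comparing Implementation Approaches
Tracing options:
Option 1: Manual MDC implementation
- Pros: simple and lightweight
- Cons: the TraceId has to be propagated by hand
- Best for: monolithic applications
Option 2: Spring Cloud Sleuth
- Pros: automatic integration, Zipkin support
- Cons: relatively heavyweight
- Best for: microservice architectures
Option 3: SkyWalking
- Pros: performance monitoring, APM
- Cons: requires additional components
- Best for: production environments
Option 4: Pinpoint
- Pros: bytecode instrumentation
- Cons: relatively invasive
- Best for: Java applications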
2. Basic MDC Implementation
2.1 Basic MDC Usage
```java
package com.example.util;

import org.slf4j.MDC;

import java.util.UUID;

/** Thin wrapper around SLF4J's MDC for storing the current TraceId. */
public class TraceIdUtil {

    private static final String TRACE_ID = "traceId";

    /** Generates a 32-character ID: a random UUID with the hyphens removed. */
    public static String generateTraceId() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    /** Binds the TraceId to the current thread. */
    public static void setTraceId(String traceId) {
        MDC.put(TRACE_ID, traceId);
    }

    /** Returns the TraceId bound to the current thread, or null if none is set. */
    public static String getTraceId() {
        return MDC.get(TRACE_ID);
    }

    /** Removes only the TraceId entry. */
    public static void clearTraceId() {
        MDC.remove(TRACE_ID);
    }

    /** Stores an arbitrary key/value pair in the MDC. */
    public static void put(String key, String value) {
        MDC.put(key, value);
    }

    public static String get(String key) {
        return MDC.get(key);
    }

    /** Removes every MDC entry for the current thread. */
    public static void clear() {
        MDC.clear();
    }
}
```
2.2 Logging Configuration
```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <conversionRule conversionWord="clr" converterClass="org.springframework.boot.logging.logback.ColorConverter" />
    <conversionRule conversionWord="wex" converterClass="org.springframework.boot.logging.logback.WhitespaceThrowableProxyConverter" />
    <conversionRule conversionWord="wEx" converterClass="org.springframework.boot.logging.logback.ExtendedWhitespaceThrowableProxyConverter" />

    <!-- %X{traceId} reads the "traceId" key from the MDC -->
    <property name="CONSOLE_LOG_PATTERN" value="${CONSOLE_LOG_PATTERN:-%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(%5p) %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr([%X{traceId}]){cyan} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n%wEx}"/>
    <property name="FILE_LOG_PATTERN" value="${FILE_LOG_PATTERN:-%d{yyyy-MM-dd HH:mm:ss.SSS} %5p ${PID:- } --- [%t] %X{traceId} %-40.40logger{39} : %m%n%wEx}"/>

    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>${CONSOLE_LOG_PATTERN}</pattern>
            <charset>UTF-8</charset>
        </encoder>
    </appender>

    <appender name="INFO_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>logs/app-info.log</file>
        <encoder>
            <pattern>${FILE_LOG_PATTERN}</pattern>
            <charset>UTF-8</charset>
        </encoder>
        <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
            <fileNamePattern>logs/app-info.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
            <maxFileSize>100MB</maxFileSize>
            <maxHistory>30</maxHistory>
            <totalSizeCap>10GB</totalSizeCap>
        </rollingPolicy>
    </appender>

    <root level="INFO">
        <appender-ref ref="CONSOLE"/>
        <appender-ref ref="INFO_FILE"/>
    </root>
</configuration>
```
2.3 Interceptor Implementation
```java
package com.example.interceptor;

import com.example.util.TraceIdUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.web.servlet.HandlerInterceptor;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class TraceInterceptor implements HandlerInterceptor {

    private static final Logger logger = LoggerFactory.getLogger(TraceInterceptor.class);

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response,
                             Object handler) throws Exception {
        // Reuse the upstream TraceId if the caller sent one; otherwise start a new trace
        String traceId = request.getHeader("X-Trace-Id");
        if (traceId == null || traceId.isEmpty()) {
            traceId = TraceIdUtil.generateTraceId();
        }
        TraceIdUtil.setTraceId(traceId);
        // Echo the TraceId so the caller can correlate its own logs
        response.setHeader("X-Trace-Id", traceId);
        logger.info("Request started with TraceId: {}", traceId);
        return true;
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response,
                                Object handler, Exception ex) throws Exception {
        String traceId = TraceIdUtil.getTraceId();
        logger.info("Request completed with TraceId: {}", traceId);
        // Servlet threads are pooled, so clear the ID to keep it out of the next request
        TraceIdUtil.clearTraceId();
    }
}
```
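The interceptor above accepts whatever arrives in `X-Trace-Id`, then writes it to the logs and echoes it in the response header. One possible hardening step, sketched here under assumptions, is to validate the incoming value first. `TraceIdResolver` and its accepted format (1 to 64 letters, digits, or hyphens) are hypothetical choices for illustration, not something the article or any library prescribes.

```java
import java.util.UUID;
import java.util.regex.Pattern;

/** Hypothetical helper; the accepted format below is an assumption, not a standard. */
final class TraceIdResolver {
    // Assumed policy: 1 to 64 characters, letters, digits, or hyphens only
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9-]{1,64}");

    private TraceIdResolver() {
    }

    /** Returns the incoming ID when it looks safe, otherwise a freshly generated one. */
    static String resolve(String headerValue) {
        if (headerValue != null && SAFE_ID.matcher(headerValue).matches()) {
            return headerValue;
        }
        // Same generation rule as TraceIdUtil: a random UUID without hyphens
        return UUID.randomUUID().toString().replace("-", "");
    }
}
```

In `preHandle`, `TraceIdResolver.resolve(request.getHeader("X-Trace-Id"))` could stand in for the null-or-empty check, so a header containing line breaks or oversized input is replaced rather than logged.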
2.4 Interceptor Configuration
```java
package com.example.config;

import com.example.interceptor.TraceInterceptor;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class WebMvcConfig implements WebMvcConfigurer {

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        // Trace every request except health checks and the error page
        registry.addInterceptor(new TraceInterceptor())
                .addPathPatterns("/**")
                .excludePathPatterns(
                        "/actuator/**",
                        "/health",
                        "/error"
                );
    }
}
```
2.5 Passing the TraceId Through Feign Clients
```java
package com.example.feign;

import com.example.util.TraceIdUtil;
import feign.RequestInterceptor;
import feign.RequestTemplate;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TraceFeignInterceptor implements RequestInterceptor {

    private static final Logger logger = LoggerFactory.getLogger(TraceFeignInterceptor.class);

    @Override
    public void apply(RequestTemplate template) {
        // Forward the current thread's TraceId to the downstream service
        String traceId = TraceIdUtil.getTraceId();
        if (traceId != null && !traceId.isEmpty()) {
            template.header("X-Trace-Id", traceId);
            logger.debug("TraceId added to Feign request: {}", traceId);
        }
    }
}
```
2.6 Feign Client Configuration
```java
package com.example.config;

import com.example.feign.TraceFeignInterceptor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FeignConfig {

    // Registering the interceptor as a bean applies it to the Feign clients in this context
    @Bean
    public TraceFeignInterceptor traceFeignInterceptor() {
        return new TraceFeignInterceptor();
    }
}
```
3. Spring Cloud Sleuth Integration
3.1 Maven Dependencies
```xml
<dependencies>
    <!-- Sleuth: automatic trace and span instrumentation -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-sleuth</artifactId>
    </dependency>
    <!-- Reports spans to Zipkin -->
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-sleuth-zipkin</artifactId>
    </dependency>
    <!-- Message transports for sending spans: RabbitMQ or Kafka -->
    <dependency>
        <groupId>org.springframework.amqp</groupId>
        <artifactId>spring-rabbit</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.kafka</groupId>
        <artifactId>spring-kafka</artifactId>
    </dependency>
</dependencies>
```
3.2 Configuring application.yml
```yaml
spring:
  application:
    name: user-service
  sleuth:
    sampler:
      probability: 1.0          # sample every request
    web:
      skip-pattern: /actuator/health|/actuator/info
  zipkin:
    base-url: http://localhost:9411
    enabled: true
    sender:
      type: rabbit              # send spans over RabbitMQ instead of HTTP
    rabbitmq:
      queue: zipkin             # queue the Zipkin server consumes from
  rabbitmq:
    host: localhost
    port: 5672
    username: guest
    password: guest
    virtual-host: /
```
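The `probability: 1.0` setting above means every trace is recorded; lower values trade completeness for overhead. The idea of probability sampling can be sketched as follows. `ProbabilitySamplerSketch` is a hypothetical class and is not how Sleuth or Brave implement their samplers; it assumes the decision is derived from a hash of the TraceId so that the same ID always gets the same answer.

```java
/** Hypothetical illustration of probability sampling; not Sleuth's or Brave's sampler. */
final class ProbabilitySamplerSketch {
    private static final int BUCKETS = 10_000;
    private final int threshold;

    ProbabilitySamplerSketch(double probability) {
        // The negated form also rejects NaN
        if (!(probability >= 0.0 && probability <= 1.0)) {
            throw new IllegalArgumentException("probability must be between 0.0 and 1.0");
        }
        this.threshold = (int) Math.round(probability * BUCKETS);
    }

    /** Maps the TraceId to a bucket in [0, BUCKETS) and samples the lowest `threshold` buckets. */
    boolean isSampled(String traceId) {
        int bucket = Math.floorMod(traceId.hashCode(), BUCKETS);
        return bucket < threshold;
    }
}
```

With a probability of 1.0 every bucket falls under the threshold, and with 0.0 none does. How evenly intermediate values sample depends on how well the IDs' hash codes spread across buckets.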
3.3 Custom TraceId Format
```java
package com.example.config;

import brave.Tracing;
import brave.propagation.B3Propagation;
import brave.propagation.ExtraFieldPropagation;
import brave.propagation.Propagation;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@ConditionalOnClass(Tracing.class)
public class CustomTraceConfiguration {

    // Propagates the listed headers alongside the standard B3 trace headers.
    // Note: newer Brave releases deprecate ExtraFieldPropagation in favor of BaggagePropagation.
    @Bean
    public Propagation.Factory customPropagationFactory() {
        return ExtraFieldPropagation.newFactory(
                B3Propagation.FACTORY,
                "X-Request-Id",
                "X-User-Id",
                "X-Session-Id"
        );
    }
}
```
3.4 Custom Span Information
```java
package com.example.interceptor;

import brave.Span;
import brave.Tracer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Must be a Spring bean (and registered as one) for @Autowired to take effect
@Component
public class CustomSpanInterceptor implements HandlerInterceptor {

    @Autowired
    private Tracer tracer;

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response,
                             Object handler) throws Exception {
        Span currentSpan = tracer.currentSpan();
        if (currentSpan != null) {
            // Tag values must not be null, so skip the userId tag for anonymous requests
            String userId = getUserId(request);
            if (userId != null) {
                currentSpan.tag("userId", userId);
            }
            currentSpan.tag("uri", request.getRequestURI());
            currentSpan.tag("method", request.getMethod());
        }
        return true;
    }

    private String getUserId(HttpServletRequest request) {
        return (String) request.getAttribute("userId");
    }
}
```
4. End-to-End Log Tracing
4.1 Global Exception Handling
```java
package com.example.handler;

import brave.Span;
import brave.Tracer;
import com.example.util.TraceIdUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
public class GlobalExceptionHandler {

    private static final Logger logger = LoggerFactory.getLogger(GlobalExceptionHandler.class);

    @Autowired
    private Tracer tracer;

    // Result is the application's own response wrapper; its definition is not shown in this article
    @ExceptionHandler(Exception.class)
    public Result handleException(Exception e) {
        String traceId = TraceIdUtil.getTraceId();
        logger.error("Exception occurred with TraceId: {}", traceId, e);

        // Mark the current span as failed so the error shows up in the trace
        Span currentSpan = tracer.currentSpan();
        if (currentSpan != null) {
            currentSpan.tag("error", "true");
            // getMessage() may be null; tag values must not be
            currentSpan.tag("error.message", String.valueOf(e.getMessage()));
        }
        // Return the TraceId so the client can quote it when reporting the problem
        return Result.error("System error", traceId);
    }
}
```
4.2 Request Logging with AOP
```java
package com.example.aop;

import com.example.util.TraceIdUtil;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

import java.util.Arrays;

@Aspect
@Component
public class RequestLogAspect {

    private static final Logger logger = LoggerFactory.getLogger(RequestLogAspect.class);

    // Wraps every handler method annotated with one of the HTTP mapping annotations
    @Around("@annotation(org.springframework.web.bind.annotation.GetMapping) || " +
            "@annotation(org.springframework.web.bind.annotation.PostMapping) || " +
            "@annotation(org.springframework.web.bind.annotation.PutMapping) || " +
            "@annotation(org.springframework.web.bind.annotation.DeleteMapping)")
    public Object logRequest(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.currentTimeMillis();
        String className = joinPoint.getTarget().getClass().getName();
        String methodName = joinPoint.getSignature().getName();
        String traceId = TraceIdUtil.getTraceId();

        logger.info("=== Request Start ===");
        logger.info("Class: {}", className);
        logger.info("Method: {}", methodName);
        logger.info("TraceId: {}", traceId);
        // Passing the Object[] directly would be treated as varargs and print only the first argument
        logger.info("Args: {}", Arrays.toString(joinPoint.getArgs()));

        try {
            Object result = joinPoint.proceed();
            long duration = System.currentTimeMillis() - start;
            logger.info("=== Request End ===");
            logger.info("Duration: {}ms", duration);
            logger.info("Result: {}", result);
            return result;
        } catch (Exception e) {
            logger.error("Request failed with TraceId: {}", traceId, e);
            throw e;
        }
    }
}
```
4.3 Async Task Support
```java
package com.example.config;

import brave.Tracing;
import brave.propagation.CurrentTraceContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskDecorator;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig {

    @Bean
    public ThreadPoolTaskExecutor asyncExecutor(Tracing tracing) {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("async-");
        // The decorator only takes effect if set before initialization;
        // Spring initializes the bean after this method returns
        executor.setTaskDecorator(new TraceContextDecorator(tracing));
        return executor;
    }

    static class TraceContextDecorator implements TaskDecorator {

        private final CurrentTraceContext context;

        TraceContextDecorator(Tracing tracing) {
            this.context = tracing.currentTraceContext();
        }

        @Override
        public Runnable decorate(Runnable runnable) {
            // wrap() captures the submitting thread's trace context here
            // and reopens it as a scope on the worker thread
            return context.wrap(runnable);
        }
    }
}
```
5. Log Aggregation with ELK
5.1 Logstash Configuration
```
input {
  tcp {
    port => 5044
    codec => json_lines
  }
}

filter {
  if [traceId] {
    mutate {
      add_field => { "trace_id" => "%{traceId}" }
    }
  }
  if [level] {
    mutate {
      add_field => { "log_level" => "%{level}" }
    }
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "app-logs-%{+YYYY.MM.dd}"
  }
}
```
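The `tcp` input above uses the `json_lines` codec, and the filters look for `traceId`, `level`, and `timestamp` fields, so each event has to arrive as one JSON object per line. In practice a JSON encoder for the logging framework would normally produce this. The sketch below only illustrates the shape of one such line using the JDK alone; `JsonLogLineSketch` and the `message` field name are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical illustration of one json_lines event; not a production encoder. */
final class JsonLogLineSketch {
    // Same pattern the Logstash date filter is configured to parse
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    private JsonLogLineSketch() {
    }

    /** Escapes quotes, backslashes, and control characters so the event stays on one line. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds a single-line JSON object with the fields the Logstash filters expect. */
    static String format(LocalDateTime time, String level, String traceId, String message) {
        return "{\"timestamp\":\"" + TS.format(time)
                + "\",\"level\":\"" + escape(level)
                + "\",\"traceId\":\"" + escape(traceId)
                + "\",\"message\":\"" + escape(message) + "\"}";
    }
}
```

Escaping matters here because a raw newline inside a message would split one event into two under a line-delimited codec.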
5.2 Unifying the Log Format
```java
package com.example.util;

import com.alibaba.fastjson.JSON;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

/** Helpers that write business, performance, and error logs in one consistent format. */
public class LogUtils {

    /** Logs a business event with its payload serialized as JSON. */
    public static void logBusiness(String module, String action, Object data) {
        Logger logger = LoggerFactory.getLogger("BUSINESS_LOG");
        String traceId = MDC.get("traceId");
        String json = JSON.toJSONString(data);
        logger.info("Module: {}, Action: {}, TraceId: {}, Data: {}",
                module, action, traceId, json);
    }

    /** Logs how long an operation took, in milliseconds. */
    public static void logPerformance(String operation, long duration) {
        Logger logger = LoggerFactory.getLogger("PERFORMANCE_LOG");
        String traceId = MDC.get("traceId");
        logger.info("Operation: {}, Duration: {}ms, TraceId: {}",
                operation, duration, traceId);
    }

    /** Logs an error; the trailing Throwable makes SLF4J print the stack trace. */
    public static void logError(String module, String action, Throwable e) {
        Logger logger = LoggerFactory.getLogger("ERROR_LOG");
        String traceId = MDC.get("traceId");
        logger.error("Module: {}, Action: {}, TraceId: {}, Error: {}",
                module, action, traceId, e.getMessage(), e);
    }
}
```
6. Worked Examples
6.1 User Service Example
```java
package com.example.service;

import com.example.util.LogUtils;
import com.example.util.TraceIdUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class UserService {

    private static final Logger logger = LoggerFactory.getLogger(UserService.class);

    @Autowired
    private UserRepository userRepository;

    public User getUserById(Long id) {
        String traceId = TraceIdUtil.getTraceId();
        logger.info("Getting user by id: {}, TraceId: {}", id, traceId);

        long start = System.currentTimeMillis();
        try {
            User user = userRepository.findById(id);
            long duration = System.currentTimeMillis() - start;
            logger.info("User found: {}, Duration: {}ms, TraceId: {}", user, duration, traceId);
            LogUtils.logBusiness("User", "GetUser", user);
            return user;
        } catch (Exception e) {
            logger.error("Error getting user, TraceId: {}", traceId, e);
            LogUtils.logError("User", "GetUser", e);
            throw e;
        }
    }
}
```
6.2 Call Chain Example
```java
package com.example.controller;

import com.example.feign.UserFeignClient;
import com.example.service.OrderService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/orders")
public class OrderController {

    private static final Logger logger = LoggerFactory.getLogger(OrderController.class);

    @Autowired
    private OrderService orderService;

    @Autowired
    private UserFeignClient userFeignClient;

    @GetMapping("/{id}")
    public Order getOrder(@PathVariable Long id) {
        logger.info("Getting order: {}", id);
        Order order = orderService.getOrderById(id);
        // The Feign call carries the same TraceId to the user service
        User user = userFeignClient.getUser(order.getUserId());
        logger.info("Order retrieved: {}, User: {}", order, user);
        return order;
    }
}
```
7. Monitoring and Alerting
7.1 Log Monitoring
```java
package com.example.monitor;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Component
public class LogMonitor {

    private static final Logger logger = LoggerFactory.getLogger(LogMonitor.class);

    /** Warns when a request takes longer than one second. */
    public void monitorSlowRequest(String uri, long duration) {
        if (duration > 1000) {
            logger.warn("Slow request detected: URI: {}, Duration: {}ms", uri, duration);
        }
    }

    /** Logs an error when more than 10% of requests failed. */
    public void monitorErrorRate(String service, int errorCount, int totalCount) {
        if (totalCount <= 0) {
            return; // no traffic, nothing to evaluate
        }
        double errorRate = (double) errorCount / totalCount;
        if (errorRate > 0.1) {
            // SLF4J placeholders take no format specifiers, so format the percentage first
            logger.error("High error rate: Service: {}, Rate: {}",
                    service, String.format("%.2f%%", errorRate * 100));
        }
    }
}
```
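`monitorErrorRate` takes an error count and a total count, but the article does not show where those numbers come from. One possible source, sketched here as an assumption, is a small thread-safe counter that request-handling code updates and a scheduled job reads. `ErrorRateCounterSketch` is a hypothetical class.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical counter that could feed monitorErrorRate; not part of the article's code. */
final class ErrorRateCounterSketch {
    // LongAdder keeps contention low when many request threads update the counts
    private final LongAdder total = new LongAdder();
    private final LongAdder errors = new LongAdder();

    /** Records the outcome of one request. */
    void record(boolean failed) {
        total.increment();
        if (failed) {
            errors.increment();
        }
    }

    /** Returns errors divided by total, or 0.0 when nothing has been recorded. */
    double errorRate() {
        long t = total.sum();
        return t == 0 ? 0.0 : (double) errors.sum() / t;
    }

    /** Starts a new measurement window. */
    void reset() {
        total.reset();
        errors.reset();
    }
}
```

A scheduled task could read the counts once per window, pass them on, and call `reset()`. Updates that race with the reset may be lost, which is usually acceptable for alerting.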
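8. Best Practices
8.1 Best Practice Checklist
Distributed tracing best practices:
1. TraceId management:
- Use one TraceId generation rule everywhere
- Output the TraceId in every log line
- Pass the TraceId across services
2. Logging conventions:
- Use a unified log format
- Include the TraceId, timestamp, and log level
- Record key business information
3. Performance:
- Set a sensible sampling rate
- Avoid excessive logging
- Use asynchronous logging
4. Monitoring and alerting:
- Monitor slow requests
- Monitor the error rate
- Aggregate and analyze logs
5. Security:
- Avoid logging sensitive information
- Keep business information out of the TraceId
- Mask sensitive data in logs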
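The log masking item in the checklist can be sketched with a regular expression applied before a value is logged. `LogMaskerSketch` is hypothetical, and the pattern assumes the sensitive value is an 11-digit mobile number starting with 1; real masking rules depend on the data an application actually handles.

```java
import java.util.regex.Pattern;

/** Hypothetical masking helper; the phone-number format is an assumption for illustration. */
final class LogMaskerSketch {
    // An 11-digit number starting with 1 that is not part of a longer digit run;
    // groups 1 and 2 keep the first three and last four digits
    private static final Pattern MOBILE =
            Pattern.compile("(?<!\\d)(1\\d{2})\\d{4}(\\d{4})(?!\\d)");

    private LogMaskerSketch() {
    }

    /** Replaces the middle four digits of each matching number with asterisks. */
    static String maskMobile(String text) {
        return MOBILE.matcher(text).replaceAll("$1****$2");
    }
}
```

For example, `LogMaskerSketch.maskMobile("phone=13812345678")` returns `phone=138****5678`, while a 12-digit number is left untouched because of the lookaround guards.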
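8.2 Solving Common Problems
```java
import org.slf4j.MDC;
import org.springframework.core.task.TaskDecorator;

import java.util.Map;

// Copies the submitting thread's MDC onto the worker thread,
// so the TraceId is not lost when work moves to a thread pool
public class TraceContextTaskDecorator implements TaskDecorator {

    @Override
    public Runnable decorate(Runnable runnable) {
        // Captured on the submitting thread
        Map<String, String> context = MDC.getCopyOfContextMap();
        return () -> {
            // Runs on the worker thread
            if (context != null) {
                MDC.setContextMap(context);
            }
            try {
                runnable.run();
            } finally {
                // Pooled threads are reused, so never leave the context behind
                MDC.clear();
            }
        };
    }
}
```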
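The decorator above relies on a capture-then-restore pattern: read the context on the submitting thread, then install it on the worker thread for the duration of the task. The same pattern can be tried without any logging library by letting a plain `ThreadLocal` stand in for the MDC. `ContextPropagationSketch` and its `TRACE_ID` field are hypothetical names for this illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical demo of capture-then-restore; a ThreadLocal stands in for the MDC. */
final class ContextPropagationSketch {
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    private ContextPropagationSketch() {
    }

    /** Captures the caller's value now and installs it on whichever thread runs the task. */
    static Runnable wrap(Runnable task) {
        String captured = TRACE_ID.get(); // evaluated on the submitting thread
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                task.run();
            } finally {
                // Put back whatever the running thread had before the task
                if (previous == null) {
                    TRACE_ID.remove();
                } else {
                    TRACE_ID.set(previous);
                }
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            TRACE_ID.set("trace-123");
            String[] seen = new String[2];
            pool.submit(() -> { seen[0] = TRACE_ID.get(); }).get();       // not wrapped
            pool.submit(wrap(() -> { seen[1] = TRACE_ID.get(); })).get(); // wrapped
            System.out.println(seen[0] + " / " + seen[1]); // prints "null / trace-123"
        } finally {
            pool.shutdown();
            TRACE_ID.remove();
        }
    }
}
```

The unwrapped task sees no TraceId because thread-local values belong to the thread that set them, which is why every hand-off to another thread needs an explicit copy.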
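9. Summary
This article covered the key points of implementing distributed tracing in SpringBoot:
Core points
- MDC implementation: manage the TraceId manually, together with an interceptor
- Spring Cloud Sleuth: automatic integration with Zipkin support
- Log configuration: a unified format that outputs the TraceId
- Cross-service propagation: Feign passes the TraceId in a request header
- Async task support: a TaskDecorator propagates the context
Technical points
- TraceId generation: UUID, globally unique
- MDC: thread-local storage
- Sleuth: automatic tracing and sampling
- Zipkin: distributed tracing
- ELK: log aggregation and analysis
Practical recommendations
- Standardize how the TraceId is generated and propagated
- Set a sensible sampling rate to limit the performance impact
- Keep log conventions consistent and always output the TraceId
- Monitor continuously and analyze slow requests and error rates
- Use ELK to aggregate and analyze logs
With distributed tracing in place, problems can be located quickly and performance optimized.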