业务系统日志追踪

一、遇到的问题

开发中经常会打印业务日志,这些日志在排查问题或跟踪调用流程时很是有用,不少业务日志并无考虑排查问题时的便利性,看似都记录了日志,一旦出现了线上问题,排查起来特别麻烦,不少时候又不方便让用户重现问题,形成解决问题周期长甚至是没法解决问题;若是日志记录的好,能够根据关键字搜索得出一条调用链日志,顺着这个日志链条就能够看出程序的执行全过程了。html

二、如何记录日志

固然,应用、环境、ip、日志时间、日志级别、线程、输入输出参数…这些都是必要便于排错的日志信息就很少说了,重点说下如何记录跟踪一次调用流程的日志记录方法。git

一种方法是:打印日志时候指明业务线,好比支付的流程里,全部步骤业务日志统一前缀 "支付流程:userId:",那么排查问题时候就能够根据关键字和userId来快速获得某个userid在支付流程里的全部轨迹。github

可是这种方法有以下问题:web

一、全部log统一前缀实行起来不太容易api

二、同一个userid的log可能会很是多,并不能按request来归类app

三、若是涉及到了公共方法调用(同时被多个业务流程调用),统一前缀很难实施dom

……异步

另外一种方法是给每一个request生成惟一的标识,给打印log的地方埋点这个标识,根据这个标识能够自动造成log链路。分布式

 

本身埋点方式

 

使用filter(filter使用传送门)给每一个request生成惟一标识,在log里带上这个标识,将一个request的log经过惟一标识串成一串。 ide

@WebFilter(filterName = "myFilter",urlPatterns = "/*")
public class LogFilter implements Filter {
    @Override
    public void init(FilterConfig filterConfig) throws ServletException {

    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        String replace = UUID.randomUUID().toString().replace("-", "");
        LogHolder.setLogId(replace);
        LogUtil.info("thread:"+Thread.currentThread().getName() + ",uuid : " + replace+",url:"+((HttpServletRequest) request).getRequestURI());
        chain.doFilter(request, response);
        LogHolder.remove();
    }

    @Override
    public void destroy() {

    }
}
View Code

LogUtil:

@Slf4j
public class LogUtil {

    public static void info(String content, Object... args) {
        log.info(String.format("LogFlowId: %s , %s", LogHolder.getLogId(), content), args);
    }

    public static void warn(String content, Object... args) {
        log.info(String.format("LogFlowId: %s , %s", LogHolder.getLogId(), content), args);
    }

    public static void error(String content, Exception ex) {
        log.error(String.format("LogFlowId: %s , %s", LogHolder.getLogId(), content), ex); 
  }
}

业务场景: 

    @RequestMapping("/log")
    public void writelog() throws InterruptedException {
        LogUtil.info("request begin");
        test();
        LogUtil.info("request end");
    }

    private void test() throws InterruptedException {
        long l = System.currentTimeMillis();
        Thread.sleep(1000);
        LogUtil.info("操做耗时:" + (System.currentTimeMillis() - l) + " ms");
    }
View Code

输出日志以下:

能够根据任意一步的logFlowId检索出整个request生命周期内的全部log。

对于通常的状况均可以实现,可是若是程序里使用了线程池会不会有线程复用的问题?来验证一下:

业务代码修改以下: 

    @RequestMapping("/log")
    public void writelog() {
        LogUtil.info("request begin");
        ExecuterServiceUtil.run(() -> test());
        LogUtil.info("request end");
    }

    public void test() {
        long l = System.currentTimeMillis();
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        LogUtil.info("操做耗时:" + (System.currentTimeMillis() - l) + " ms");
    }
View Code

获取不到设置在ThreadLocal里的logFlowId了,这是由于线程池从新开辟了子线程,子线程不能访问主线程设置的logFlowId,可使用 InheritableThreadLocal 来解决问题。

修改LogHolder为:

public class LogHolder {
    private LogHolder() {
    }

    public static final ThreadLocal<String> LOG_ID = new InheritableThreadLocal<>();

    public static String getLogId() {
        return LOG_ID.get();
    }

    public static void setLogId(String logId) {
        LOG_ID.set(logId);
    }

    public static void remove() {
        LOG_ID.remove();
    }
}
View Code

再次运行发现logFlowId有值了,可是却重复了,很明显是由于线程复用形成的问题。

最终解决办法:

一、获取主线程设置的logFlowId

二、执行前设置到子线程里

三、用完后清除子线程变量

输出日志以下所示:

若是是分布式应用跟踪,能够经过header透传方式,将logFlowId传入下一个应用里。

附录

demo

使用MDC

使用拦截器

使用前(preHandle)生成并设置log_trace_id,在使用完(afterCompletion)清除log_trace_id

public class LogInterceptor implements HandlerInterceptor {

    private static final String DEMO_LOG_TRACE_ID = "demo_log_trace_id";
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        MDC.put(DEMO_LOG_TRACE_ID, UUIdUtil.getUUID());
        return true;
    }

    @Override
    public void postHandle(HttpServletRequest request, HttpServletResponse response, Object handler, ModelAndView modelAndView) throws Exception {
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) throws Exception {
        MDC.remove(DEMO_LOG_TRACE_ID);
    }
}

一样的,若是使用线程池,会有相似的问题。咱们继承线程池方法,稍做修改

public class ThreadExecutorPoolUtils extends ThreadPoolTaskExecutor {

    /**
     * 全部线程都会委托给这个execute方法,在这个方法中咱们把父线程的MDC内容赋值给子线程
     * https://logback.qos.ch/manual/mdc.html#managedThreads
     *
     * @param runnable
     */
    @Override
    public void execute(Runnable runnable) {
        // 获取父线程MDC中的内容,必须在run方法以前,不然等异步线程执行的时候有可能MDC里面的值已经被清空了,这个时候就会返回null
        Map<String, String> context = MDC.getCopyOfContextMap();
        super.execute(() -> run(runnable, context));
    }

    @Override
    public Future<?> submit(Runnable runnable) {
        // 获取父线程MDC中的内容,必须在run方法以前,不然等异步线程执行的时候有可能MDC里面的值已经被清空了,这个时候就会返回null
        Map<String, String> context = MDC.getCopyOfContextMap();
       return super.submit(()->run(runnable, context));
    }

    @Override
    public <T> Future<T> submit(Callable<T> callable) {
        // 获取父线程MDC中的内容,必须在run方法以前,不然等异步线程执行的时候有可能MDC里面的值已经被清空了,这个时候就会返回null
        Map<String, String> context = MDC.getCopyOfContextMap();
        return super.submit(() -> call(callable, context));
    }

    /**
     * 子线程委托的执行方法
     *
     * @param runnable {@link Runnable}
     * @param context  父线程MDC内容
     */
    private void run(Runnable runnable, Map<String, String> context) {
        // 将父线程的MDC内容传给子线程
        MDC.setContextMap(context);
        try {
            // 执行异步操做
            runnable.run();
        } finally {
            // 清空MDC内容
            MDC.clear();
        }
    }

    /**
     * 子线程委托的执行方法
     *
     * @param callable {@link Callable}
     * @param context  父线程MDC内容
     */
    private <T> T call(Callable<T> callable, Map<String, String> context) throws Exception {
        // 将父线程的MDC内容传给子线程
        MDC.setContextMap(context);
        try {
            // 执行异步操做
            return callable.call();
        } finally {
            // 清空子线程MDC内容,防止线程复用问题
            MDC.clear();
        }
    }
}
View Code

logback-xml配置

<pattern>[%d{yyyy-MM-dd HH:mm:ss.SSS}] [%thread] [%level] [%X{demo_log_trace_id}] %logger:%L x:\(%X\) - %msg%n</pattern>

输出日志格式

2019-02-02 10:27:14.279 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.279 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.280 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.280 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.280 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.280 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.280 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.281 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.282 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.282 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.282 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.282 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.282 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.255 [INFO][Test-Thread-Pool4]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool4 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.260 [INFO][Test-Thread-Pool2]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool2 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.282 [INFO][Test-Thread-Pool4]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool4 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
2019-02-02 10:27:14.282 [INFO][Test-Thread-Pool3]:demo.web.api.DemoController [lambda$mutilLog$0:69] [d4676669641a4c6ab94f841d77c2b18e] Test-Thread-Pool3 runs >>>>>, activeAccount:4, corePoolSize:4, maxPoolSize:5, maxPoolSize:Test-Thread-Group, poolSize:4
相关文章
相关标签/搜索