[Spring Cloud] 7 Spring Cloud Sleuth

Date: 2019-11-08

Spring Cloud Sleuth

Distributed tracing

Spring Cloud Sleuth is the distributed tracing solution for Spring Cloud.

7.1 Terminology

Spring Cloud Sleuth borrows its terminology from Dapper.

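Span: The basic unit of work. For example, sending an RPC is a new span, as is sending the response to an RPC. A span is identified by a unique 64-bit ID for the span and another 64-bit ID for the trace it belongs to. Spans also carry other data, such as a description, timestamps, key-value tags, the ID of the span that caused them, and process IDs (normally IP addresses). Spans are started and stopped, and they keep track of their timing information. Starts and stops come in pairs: once you create a span, you must stop it at some point in the future. Tip: The initial span that starts a trace is called the root span. Its span id is normally also used as the trace id.
Trace: A set of spans forming a tree-like structure. For example, in a distributed big-data store, each request may form one trace.
Annotation: Used to record the time at which an event occurred. Some core annotations mark the start and end of a request:
- cs: Client Sent. The client has made a request. This annotation marks the beginning of the span.
- sr: Server Received. The server side received the request and starts processing it. Subtracting the cs timestamp from this timestamp gives the network latency.
- ss: Server Sent. The server has finished processing the request and the response is sent back to the client. Subtracting the sr timestamp from this timestamp gives the time the server needed to process the request.
- cr: Client Received. Marks the end of the span. The client has successfully received the response from the server. Subtracting the cs timestamp from this timestamp gives the total time the client waited for the response.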
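
The three subtractions described above can be written down directly. The following is only a minimal sketch: the class RpcTimings and its method names are hypothetical and are not part of the Sleuth API; they just hold the four timestamps of one RPC span.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical holder for the four core annotation timestamps of one RPC span. */
final class RpcTimings {
    private final Instant cs; // Client Sent
    private final Instant sr; // Server Received
    private final Instant ss; // Server Sent
    private final Instant cr; // Client Received

    RpcTimings(Instant cs, Instant sr, Instant ss, Instant cr) {
        this.cs = cs;
        this.sr = sr;
        this.ss = ss;
        this.cr = cr;
    }

    /** sr - cs: time the request spent on the network. */
    Duration networkLatency() {
        return Duration.between(cs, sr);
    }

    /** ss - sr: time the server needed to process the request. */
    Duration serverProcessingTime() {
        return Duration.between(sr, ss);
    }

    /** cr - cs: total time the client waited for the response. */
    Duration totalResponseTime() {
        return Duration.between(cs, cr);
    }
}
```

Note that sr - cs compares timestamps taken on two different machines, so clock skew between client and server can distort that value; ss - sr and cr - cs each use a single clock.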

The following figure visualizes the concepts of span and trace:

Each color denotes a span (7 spans, from A to G). Each of them carries data such as:

Trace Id = X
Span Id = D
Client Sent

This means that the span has Trace Id X and Span Id D, and that the recorded event is Client Sent.

The parent-child relationship of these spans is shown in the following figure:

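7.2 Purpose

The following sections use the example from the figure above.

7.2.1 Distributed tracing with Zipkin

The example contains 7 spans in total. In the trace list in Zipkin, you see this number:

However, when you open a particular trace, you see 4 spans:

Note: In the trace view, some spans appear merged. This means that when 2 spans are sent to Zipkin, one carrying the Server Received and Server Sent annotations and the other the Client Received and Client Sent annotations, they are presented as a single span.

Why do 7 spans show up as only 4?

- 1 span comes from http:/start. It has the Server Received (SR) and Server Sent (SS) annotations.
- 2 spans come from the RPC call from service1 to service2 on the http:/foo endpoint. They carry the Client Sent (CS) and Client Received (CR) annotations on the service1 side, and the Server Received (SR) and Server Sent (SS) annotations on the service2 side. Physically there are 2 spans, but logically they form 1 span for one RPC call.
- 2 spans come from the RPC call from service2 to service3 on the http:/bar endpoint. They carry the Client Sent (CS) and Client Received (CR) annotations on the service2 side, and the Server Received (SR) and Server Sent (SS) annotations on the service3 side. Physically there are 2 spans, but logically they form 1 span for one RPC call.
- 2 spans come from the RPC call from service2 to service4 on the http:/baz endpoint. They carry the Client Sent (CS) and Client Received (CR) annotations on the service2 side, and the Server Received (SR) and Server Sent (SS) annotations on the service4 side. Physically there are 2 spans, but logically they form 1 span for one RPC call.

Counting the physical spans therefore gives 1 from http:/start, 2 from service1 calling service2, 2 from service2 calling service3, and 2 from service2 calling service4: 7 spans in total.

Logically there are 4 spans: 1 for the external request to service1 and 3 for the RPC calls.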
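
The counting above can be reproduced with a small sketch. It assumes that the client half and the server half of one RPC are reported under the same span id, which is what lets the trace view merge them. The class LogicalSpanCount, the Reported type and the ids A to D are hypothetical and chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: 7 reported span halves collapse into 4 logical spans when grouped by span id. */
final class LogicalSpanCount {

    /** One span as reported by one service (hypothetical record type). */
    static final class Reported {
        final String service;
        final String spanId;

        Reported(String service, String spanId) {
            this.service = service;
            this.spanId = spanId;
        }
    }

    /** The example from the text, with made-up span ids A to D. */
    static List<Reported> example() {
        return Arrays.asList(
                new Reported("service1", "A"),  // http:/start (SR, SS)
                new Reported("service1", "B"),  // http:/foo, client side (CS, CR)
                new Reported("service2", "B"),  // http:/foo, server side (SR, SS)
                new Reported("service2", "C"),  // http:/bar, client side
                new Reported("service3", "C"),  // http:/bar, server side
                new Reported("service2", "D"),  // http:/baz, client side
                new Reported("service4", "D")); // http:/baz, server side
    }

    /** Number of distinct span ids, i.e. the number of logical spans. */
    static int countLogical(List<Reported> reported) {
        Set<String> ids = new HashSet<>();
        for (Reported r : reported) {
            ids.add(r.spanId);
        }
        return ids.size();
    }
}
```

For the example data, the list holds 7 reported entries and countLogical returns 4, matching the numbers in the text.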

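7.2.2 Visualizing errors

Zipkin can show errors in a trace. When an exception is thrown and not caught, Zipkin automatically displays the affected span in a different color. In the list of traces, a trace shown in red means that an exception was thrown. The following figure shows such an error:

If you click on one of the spans, you see the following:

As you can see, the error is shown clearly.

7.2.3 Live examples

Click the image below to see a live example:

Click the dependency icon to see the following:

7.2.4 Log correlation

When you grep the application logs for a trace id, for example 2485ec27856c56f4, you get the following: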

service1.log:2016-02-26 11:15:47.561  INFO [service1,2485ec27856c56f4,2485ec27856c56f4,true] 68058 --- [nio-8081-exec-1] i.s.c.sleuth.docs.service1.Application   : Hello from service1. Calling service2
service2.log:2016-02-26 11:15:47.710  INFO [service2,2485ec27856c56f4,9aa10ee6fbde75fa,true] 68059 --- [nio-8082-exec-1] i.s.c.sleuth.docs.service2.Application   : Hello from service2. Calling service3 and then service4
service3.log:2016-02-26 11:15:47.895  INFO [service3,2485ec27856c56f4,1210be13194bfe5,true] 68060 --- [nio-8083-exec-1] i.s.c.sleuth.docs.service3.Application   : Hello from service3
service2.log:2016-02-26 11:15:47.924  INFO [service2,2485ec27856c56f4,9aa10ee6fbde75fa,true] 68059 --- [nio-8082-exec-1] i.s.c.sleuth.docs.service2.Application   : Got response from service3 [Hello from service3]
service4.log:2016-02-26 11:15:48.134  INFO [service4,2485ec27856c56f4,1b1845262ffba49d,true] 68061 --- [nio-8084-exec-1] i.s.c.sleuth.docs.service4.Application   : Hello from service4
service2.log:2016-02-26 11:15:48.156  INFO [service2,2485ec27856c56f4,9aa10ee6fbde75fa,true] 68059 --- [nio-8082-exec-1] i.s.c.sleuth.docs.service2.Application   : Got response from service4 [Hello from service4]
service1.log:2016-02-26 11:15:48.182  INFO [service1,2485ec27856c56f4,2485ec27856c56f4,true] 68058 --- [nio-8081-exec-1] i.s.c.sleuth.docs.service1.Application   : Got response from service2 [Hello from service2, response from service3 [Hello from service3] and from service4 [Hello from service4]]

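If you use a log aggregation tool such as Kibana or Splunk, you can order the events in the sequence in which they occurred. In Kibana, for example, it looks like this:

If you want to use Logstash, here is a Grok pattern for it: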

filter {
       # pattern matching logback pattern
       grok {
              match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
       }
}

Note: If you want to use Grok with logs from Cloud Foundry, use the following pattern instead:

filter {
       # pattern matching logback pattern
       grok {
              match => { "message" => "(?m)OUT\s+%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:severity}\s+\[%{DATA:service},%{DATA:trace},%{DATA:span},%{DATA:exportable}\]\s+%{DATA:pid}---\s+\[%{DATA:thread}\]\s+%{DATA:class}\s+:\s+%{GREEDYDATA:rest}" }
       }
}
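
To check the field layout without a Logstash setup, the same bracketed fields can be pulled out with a plain regular expression. This is a minimal sketch, not part of Sleuth or Logstash: the class SleuthLogLine and its field names are hypothetical, and the expression covers only the timestamp, the level and the [service,trace,span,exportable] block of the log lines shown earlier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: extracts timestamp, level and the [service,trace,span,exportable] block of a log line. */
final class SleuthLogLine {

    // timestamp, whitespace, level, whitespace, then four comma-separated fields in brackets
    private static final Pattern FIELDS = Pattern.compile(
            "(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3})\\s+(\\w+)\\s+"
            + "\\[([^,\\]]*),([^,\\]]*),([^,\\]]*),([^,\\]]*)\\]");

    final String timestamp;
    final String severity;
    final String service;
    final String trace;
    final String span;
    final String exportable;

    private SleuthLogLine(Matcher m) {
        this.timestamp = m.group(1);
        this.severity = m.group(2);
        this.service = m.group(3);
        this.trace = m.group(4);
        this.span = m.group(5);
        this.exportable = m.group(6);
    }

    /** Returns the parsed fields, or null when the line does not contain the expected block. */
    static SleuthLogLine parse(String line) {
        Matcher m = FIELDS.matcher(line);
        return m.find() ? new SleuthLogLine(m) : null;
    }
}
```

Because parse uses find(), a prefix such as the file name that grep prints before each line does not get in the way.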

7.2.5 JSON Logback with Logstash

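Often you do not want to store logs in a plain text file but in a JSON file that Logstash can pick up directly. To do so, add the dependencies below.

Dependencies setup

- Ensure that Logback is on the classpath (ch.qos.logback:logback-core)
- Add the Logstash Logback encoder: net.logstash.logback:logstash-logback-encoder:4.6

Logback setup

The following example Logback configuration (file named logback-spring.xml):

- logs information from the application in JSON format to the build/${spring.application.name}.json file
- defines two additional appenders: console and a standard log file
- uses the same logging pattern as the one in the previous section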

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
	<include resource="org/springframework/boot/logging/logback/defaults.xml"/>
	
	<springProperty scope="context" name="springAppName" source="spring.application.name"/>
	<!-- Example for logging into the build folder of your project -->
	<property name="LOG_FILE" value="${BUILD_FOLDER:-build}/${springAppName}"/>

	<property name="CONSOLE_LOG_PATTERN"
			  value="%clr(%d{yyyy-MM-dd HH:mm:ss.SSS}){faint} %clr(${LOG_LEVEL_PATTERN:-%5p}) %clr([${springAppName:-},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-B3-ParentSpanId:-},%X{X-Span-Export:-}]){yellow} %clr(${PID:- }){magenta} %clr(---){faint} %clr([%15.15t]){faint} %clr(%-40.40logger{39}){cyan} %clr(:){faint} %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}"/>

	<!-- Appender to log to console -->
	<appender name="console" class="ch.qos.logback.core.ConsoleAppender">
		<filter class="ch.qos.logback.classic.filter.ThresholdFilter">
			<!-- Minimum logging level to be presented in the console logs-->
			<level>DEBUG</level>
		</filter>
		<encoder>
			<pattern>${CONSOLE_LOG_PATTERN}</pattern>
			<charset>utf8</charset>
		</encoder>
	</appender>

	<!-- Appender to log to file -->
	<appender name="flatfile" class="ch.qos.logback.core.rolling.RollingFileAppender">
		<file>${LOG_FILE}</file>
		<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
			<fileNamePattern>${LOG_FILE}.%d{yyyy-MM-dd}.gz</fileNamePattern>
			<maxHistory>7</maxHistory>
		</rollingPolicy>
		<encoder>
			<pattern>${CONSOLE_LOG_PATTERN}</pattern>
			<charset>utf8</charset>
		</encoder>
	</appender>
	
	<!-- Appender to log to file in a JSON format -->
	<appender name="logstash" class="ch.qos.logback.core.rolling.RollingFileAppender">
		<file>${LOG_FILE}.json</file>
		<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
			<fileNamePattern>${LOG_FILE}.json.%d{yyyy-MM-dd}.gz</fileNamePattern>
			<maxHistory>7</maxHistory>
		</rollingPolicy>
		<encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
			<providers>
				<timestamp>
					<timeZone>UTC</timeZone>
				</timestamp>
				<pattern>
					<pattern>
						{
						"severity": "%level",
						"service": "${springAppName:-}",
						"trace": "%X{X-B3-TraceId:-}",
						"span": "%X{X-B3-SpanId:-}",
						"parent": "%X{X-B3-ParentSpanId:-}",
						"exportable": "%X{X-Span-Export:-}",
						"pid": "${PID:-}",
						"thread": "%thread",
						"class": "%logger{40}",
						"rest": "%message"
						}
					</pattern>
				</pattern>
			</providers>
		</encoder>
	</appender>
	
	<root level="INFO">
		<appender-ref ref="console"/>
		<appender-ref ref="logstash"/>
		<!--<appender-ref ref="flatfile"/>-->
	</root>
</configuration>

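Note: If you use a custom logback-spring.xml, set spring.application.name in the bootstrap file instead of the application property file. Otherwise the custom Logback configuration does not read the property correctly.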

7.3 Adding to the project

7.3.1 Only Sleuth (log correlation)

If you want to use only Spring Cloud Sleuth without the Zipkin integration, add just the spring-cloud-starter-sleuth dependency to your project.

Maven

<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-sleuth</artifactId>
   </dependency>

(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-starter-sleuth dependency.

Gradle

dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies { (2)
    compile "org.springframework.cloud:spring-cloud-starter-sleuth"
}

(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-starter-sleuth dependency.

7.3.2 Sleuth with Zipkin via HTTP

If you want both Sleuth and Zipkin over HTTP, add the spring-cloud-starter-zipkin dependency:

Maven

<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-zipkin</artifactId>
   </dependency>

(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-starter-zipkin dependency.

Gradle

dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies { (2)
    compile "org.springframework.cloud:spring-cloud-starter-zipkin"
}

(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-starter-zipkin dependency.

7.3.3 Sleuth with Zipkin via Spring Cloud Stream

If you want both Sleuth and Zipkin over Spring Cloud Stream, add the spring-cloud-sleuth-stream dependency:

Maven

<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-sleuth-stream</artifactId>
   </dependency>
   <dependency> (3)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-sleuth</artifactId>
   </dependency>
   <!-- EXAMPLE FOR RABBIT BINDING -->
   <dependency> (4)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-stream-binder-rabbit</artifactId>
   </dependency>

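(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-sleuth-stream dependency.
(3) Add the spring-cloud-starter-sleuth dependency.
(4) Add a Spring Cloud Stream binder (the example uses the Rabbit binder).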

Gradle

dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies {
    compile "org.springframework.cloud:spring-cloud-sleuth-stream" (2)
    compile "org.springframework.cloud:spring-cloud-starter-sleuth" (3)
    // Example for Rabbit binding
    compile "org.springframework.cloud:spring-cloud-stream-binder-rabbit" (4)
}

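(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-sleuth-stream dependency.
(3) Add the spring-cloud-starter-sleuth dependency.
(4) Add a Spring Cloud Stream binder (the example uses the Rabbit binder).

7.3.4 Spring Cloud Sleuth Stream Zipkin Collector

If you want to start a Spring Cloud Sleuth Stream Zipkin collector, add the spring-cloud-sleuth-zipkin-stream dependency: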

Maven

<dependencyManagement> (1)
         <dependencies>
             <dependency>
                 <groupId>org.springframework.cloud</groupId>
                 <artifactId>spring-cloud-dependencies</artifactId>
                 <version>Brixton.RELEASE</version>
                 <type>pom</type>
                 <scope>import</scope>
             </dependency>
         </dependencies>
   </dependencyManagement>

   <dependency> (2)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-sleuth-zipkin-stream</artifactId>
   </dependency>
   <dependency> (3)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-starter-sleuth</artifactId>
   </dependency>
   <!-- EXAMPLE FOR RABBIT BINDING -->
   <dependency> (4)
       <groupId>org.springframework.cloud</groupId>
       <artifactId>spring-cloud-stream-binder-rabbit</artifactId>
   </dependency>

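(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-sleuth-zipkin-stream dependency.
(3) Add the spring-cloud-starter-sleuth dependency.
(4) Add a Spring Cloud Stream binder (the example uses the Rabbit binder).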

Gradle

dependencyManagement { (1)
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
    }
}

dependencies {
    compile "org.springframework.cloud:spring-cloud-sleuth-zipkin-stream" (2)
    compile "org.springframework.cloud:spring-cloud-starter-sleuth" (3)
    // Example for Rabbit binding
    compile "org.springframework.cloud:spring-cloud-stream-binder-rabbit" (4)
}

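(1) Dependency versions are managed through the Spring BOM.
(2) Add the spring-cloud-sleuth-zipkin-stream dependency.
(3) Add the spring-cloud-starter-sleuth dependency.
(4) Add a Spring Cloud Stream binder (the example uses the Rabbit binder).

Then annotate your main class with @EnableZipkinStreamServer: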

package example;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.sleuth.zipkin.stream.EnableZipkinStreamServer;

@SpringBootApplication
@EnableZipkinStreamServer
public class ZipkinStreamServerApplication {

	public static void main(String[] args) throws Exception {
		SpringApplication.run(ZipkinStreamServerApplication.class, args);
	}

}

7.4 Additional resources

For an introduction to Spring Cloud Sleuth and Zipkin, watch the video by Marcin Grzejszczak.

7.5 Features

Adds trace and span ids to the Slf4J MDC, so that you can extract all the logs belonging to a given trace or span. For example:

2016-02-02 15:30:57.902  INFO [bar,6bfd228dc00d216b,6bfd228dc00d216b,false] 23030 --- [nio-8081-exec-3] ...
2016-02-02 15:30:58.372 ERROR [bar,6bfd228dc00d216b,6bfd228dc00d216b,false] 23030 --- [nio-8081-exec-3] ...
2016-02-02 15:31:01.936  INFO [bar,46ab0d418373cbc9,46ab0d418373cbc9,false] 23030 --- [nio-8081-exec-4] ...

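Note the [appname,traceId,spanId,exportable] entries from the MDC:

- spanId: the id of a specific operation
- appname: the name of the application in which the operation took place
- traceId: the id of the trace that contains the span
- exportable: whether the span is sent to Zipkin

Provides an abstraction over common distributed tracing data models: traces, spans, annotations, and key-value annotations. It is loosely based on HTrace, but compatible with Zipkin (Dapper).
Records timing information for later analysis. With Sleuth you can quickly pinpoint the causes of latency in your applications. Sleuth is written so that it does not log too much and does not add significant overhead. It:
- propagates structural data about the call graph in-band, and the rest out-of-band
- includes opinionated instrumentation of layers such as HTTP
- includes a sampling policy to manage data volume
- can report to a Zipkin system for query and visualization
Instruments common ingress and egress points of Spring applications: servlet filter, async endpoints, rest template, scheduled actions, message channels, Zuul filters, Feign client, and so on.
Includes default logic for joining a trace across HTTP or messaging boundaries. For example, HTTP propagation works through Zipkin-compatible request headers. This propagation logic can be customized or extended through SpanInjector and SpanExtractor implementations.
Provides simple metrics of accepted and dropped spans.
If spring-cloud-sleuth-zipkin is on the classpath, the application generates and collects Zipkin-compatible traces. By default it sends them over HTTP to a Zipkin server on localhost (port 9411). You can change this address with spring.zipkin.baseUrl.
If spring-cloud-sleuth-stream is on the classpath, the application generates and collects traces through Spring Cloud Stream. The application automatically becomes a producer of tracer messages, which are sent to the configured message broker (for example RabbitMQ, Kafka, Redis).

Important: If you use Zipkin or Stream, configure the fraction of spans that is exported with spring.sleuth.sampler.percentage (default 0.1, that is, 10%). Otherwise it may look as though Sleuth is losing spans, when in fact they are simply not sampled.

Note: The SLF4J MDC is always set, and Logback users immediately see the trace and span ids in the logs, as in the example above. Other logging systems have to configure their own format to get the same result. The default logging.pattern.level is set to %clr(%5p) %clr([${spring.application.name:},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}]){yellow} (this is a Spring Boot feature for Logback users). This means that when you use SLF4J you do not need to configure this format by hand; it is applied automatically.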

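7.6 Sampling

In distributed tracing the data volume can be very high, so sampling is important (you usually do not need to export every action that takes place). Spring Cloud Sleuth has a Sampler strategy, and you can implement it to take control of the sampling algorithm. Samplers do not stop span (correlation) ids from being generated, but they do affect whether tags and events are attached and exported. By default you get a strategy that continues to trace if a span is already active, but new spans are always marked as non-exportable.

If all your applications run with this sampler, you see complete traces in the logs, but not necessarily in a remote store. For testing, the default is often enough, and it is probably all you need if you only use the logs (for example, with an ELK aggregator). If you export span data to Zipkin or Spring Cloud Stream, AlwaysSampler exports everything and PercentageBasedSampler samples a fixed fraction of spans; choose whichever fits your needs.

Note: When you use spring-cloud-sleuth-zipkin or spring-cloud-sleuth-stream, PercentageBasedSampler is the default. Configure it with spring.sleuth.sampler.percentage; the value is between 0.0 and 1.0.

To use a different sampler, define a bean: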

@Bean
public Sampler defaultSampler() {
	return new AlwaysSampler();
}
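
To make the idea of exporting a fixed fraction of spans concrete, here is a small self-contained sketch. It is not the implementation of PercentageBasedSampler: the class FixedFractionSampler, its bucket count and the modulo-based decision are assumptions made for illustration only.

```java
/** Sketch of a fixed-fraction sampling decision; not Sleuth's PercentageBasedSampler. */
final class FixedFractionSampler {

    private static final long BUCKETS = 10_000L;

    // number of buckets (out of BUCKETS) whose traces are exported
    private final long sampledBuckets;

    FixedFractionSampler(double percentage) {
        // the negated form also rejects NaN
        if (!(percentage >= 0.0 && percentage <= 1.0)) {
            throw new IllegalArgumentException("percentage must be within [0.0, 1.0]");
        }
        this.sampledBuckets = Math.round(percentage * BUCKETS);
    }

    /** Decides from the trace id alone, so every span of one trace gets the same answer. */
    boolean isExportable(long traceId) {
        return Math.floorMod(traceId, BUCKETS) < sampledBuckets;
    }
}
```

With a percentage of 0.1, exactly 1000 of any 10000 consecutive trace ids are marked exportable. Deriving the decision from the trace id is a design choice of this sketch; it keeps a trace either fully exported or fully skipped.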

7.7 Instrumentation

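Spring Cloud Sleuth instruments all your Spring applications automatically, so you do not need to do anything to activate it. The instrumentation is added with whatever mechanism suits the technology stack: for example, a servlet web application uses a Filter, and Spring Integration uses ChannelInterceptors.

You can customize the keys used in span tags. To limit the volume of span data, by default an HTTP request is tagged with only a handful of metadata, such as the status code, the host and the URL. You can add request headers by listing their names in spring.sleuth.keys.http.headers.

Note: Tags are only collected and exported if the Sampler allows it. By default it does not, so there is no danger of accidentally collecting too much data without configuring something.

Note: Currently the instrumentation in Spring Cloud Sleuth is eager: it actively tries to pass the tracing context between threads, and timing events are captured whether or not the data is exported. This may change to a lazy approach in the future.

7.8 Span lifecycle

You can perform the following operations on a span through the org.springframework.cloud.sleuth.Tracer interface:

- start: when you start a span, its name is assigned and the start timestamp is recorded.
- close: the span is finished (its end timestamp is recorded) and, if it is eligible, it is exported to Zipkin. The span is also removed from the current thread.
- continue: a new span instance is created as a copy of the span it continues.
- detach: the span is not stopped or closed; it is only removed from the current thread.
- create with explicit parent: a new span is created with an explicitly set parent span.

Tip: Usually you do not need to call this API directly. Spring creates the Tracer instance for you, and you only need to autowire it.

7.8.1 Creating and closing spans

You can create spans manually with the Tracer interface: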

// Start a span. If there was a span present in this thread it will become
// the `newSpan`'s parent.
Span newSpan = this.tracer.createSpan("calculateTax");
try {
	// ...
	// You can tag a span
	this.tracer.addTag("taxValue", taxValue);
	// ...
	// You can log an event on a span
	newSpan.logEvent("taxCalculated");
} finally {
	// Once done remember to close the span. This will allow collecting
	// the span to send it to Zipkin
	this.tracer.close(newSpan);
}

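This example shows how to create a span instance manually. If a span is already present in the current thread, it becomes the parent of the new span.

Important: Always clean up after you create a span! If you want the span to be sent to Zipkin, do not forget to close it.

7.8.2 Continuing spans

Sometimes you do not want to create a new span but to continue an existing one. For example:

- AOP: if a span was already created before an aspect was reached, you may not want to create another one.
- Hystrix: executing a Hystrix command is most likely a logical part of the current processing, so you generally do not need a new span.

The continued span instance is equal to the span it continues: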

Span continuedSpan = this.tracer.continueSpan(spanToContinue);
assertThat(continuedSpan).isEqualTo(spanToContinue);

To continue a span, use the Tracer interface:

// let's assume that we're in a thread Y and we've received
// the `initialSpan` from thread X
Span continuedSpan = this.tracer.continueSpan(initialSpan);
try {
	// ...
	// You can tag a span
	this.tracer.addTag("taxValue", taxValue);
	// ...
	// You can log an event on a span
	continuedSpan.logEvent("taxCalculated");
} finally {
	// Once done remember to detach the span. That way you'll
	// safely remove it from the current thread without closing it
	this.tracer.detach(continuedSpan);
}

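Important: Always clean up after you continue a span! Do not forget to detach it when you are done. Suppose a span was started in thread X and is waiting for threads Y and Z to finish. The spans in threads Y and Z should be detached at the end of their work; the results are then collected when thread X closes the span.

7.8.3 Creating spans with an explicit parent

Sometimes you want to start a new span and explicitly set its parent. For example, a span already exists in one thread, and you start another thread in which you want a new span that tracks its execution separately. The createSpan method of the Tracer interface that takes a parent span covers this case, for example: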

// let's assume that we're in a thread Y and we've received
// the `initialSpan` from thread X. `initialSpan` will be the parent
// of the `newSpan`
Span newSpan = this.tracer.createSpan("calculateCommission", initialSpan);
try {
	// ...
	// You can tag a span
	this.tracer.addTag("commissionValue", commissionValue);
	// ...
	// You can log an event on a span
	newSpan.logEvent("commissionCalculated");
} finally {
	// Once done remember to close the span. This will allow collecting
	// the span to send it to Zipkin. The tags and events set on the
	// newSpan will not be present on the parent
	this.tracer.close(newSpan);
}

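Important: Again, do not forget to close the span. Otherwise you will see many warnings in the logs. Worse, a span that is not closed is never collected by Zipkin.

7.9 Naming spans

Picking a span name is not a trivial task. The name should describe an operation, and it should be low cardinality (for example, it should not include ids).

Because much of the instrumentation is automatic, many span names are generated by convention:

- controller-method-name: when a request is received by a controller method named controllerMethodName
- async: for asynchronous operations done through wrapped Callable and Runnable instances
- @Scheduled: annotated methods use the simple name of the class

For asynchronous processing you can also provide an explicit name.

7.9.1 @SpanName annotation

You can name the span explicitly with the @SpanName annotation.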

@SpanName("calculateTax")
class TaxCountingRunnable implements Runnable {

	@Override public void run() {
		// perform logic
	}
}

In this case, when the task is processed in the following manner:

Runnable runnable = new TraceRunnable(tracer, spanNamer, new TaxCountingRunnable());
Future<?> future = executorService.submit(runnable);
// ... some additional logic ...
future.get();

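the span is named calculateTax.

7.9.2 toString() method

It is fairly rare to create a separate class for a Runnable or Callable; typically an anonymous instance is used, and such classes cannot be annotated. For that reason, when no @SpanName annotation is present, Sleuth checks whether the class overrides the toString() method.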

Runnable runnable = new TraceRunnable(tracer, spanNamer, new Runnable() {
	@Override public void run() {
		// perform logic
	}

	@Override public String toString() {
		return "calculateTax";
	}
});
Future<?> future = executorService.submit(runnable);
// ... some additional logic ...
future.get();

This also creates a span named calculateTax.
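
The lookup order described in 7.9.1 and 7.9.2 (annotation first, then an overridden toString(), then a default) can be sketched as follows. This is not Sleuth's code: the annotation Named is a local stand-in for @SpanName, declared only to keep the sketch self-contained, and the class SpanNameLookup and its fallback parameter are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Sketch of the naming order: annotation, then overridden toString(), then a fallback. */
final class SpanNameLookup {

    /** Local stand-in for Sleuth's @SpanName (hypothetical, for this sketch only). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Named {
        String value();
    }

    /** Example task that carries its span name in the annotation. */
    @Named("calculateTax")
    static final class TaxCountingRunnable implements Runnable {
        @Override
        public void run() {
            // perform logic
        }
    }

    static String nameFor(Object task, String fallback) {
        Named named = task.getClass().getAnnotation(Named.class);
        if (named != null) {
            return named.value();
        }
        if (overridesToString(task.getClass())) {
            return task.toString();
        }
        return fallback;
    }

    /** True when toString() is declared somewhere below java.lang.Object. */
    private static boolean overridesToString(Class<?> type) {
        try {
            return type.getMethod("toString").getDeclaringClass() != Object.class;
        } catch (NoSuchMethodException e) {
            return false; // not expected: every class inherits toString()
        }
    }
}
```

A lambda or an anonymous class without toString() ends up with the fallback name, which corresponds to the generic async name mentioned in 7.9.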

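7.10 Customizations

With SpanInjector and SpanExtractor you can customize how spans are created and propagated.

Tracing information is passed between processes in two ways:

- through Spring Integration
- through HTTP

Span ids are extracted from Zipkin-compatible headers (either Message or HTTP headers) to start or join an existing trace. Trace information is injected into every outbound request so that the next hop can continue the trace.

7.10.1 Spring Integration

For Spring Integration, two beans that work on Message and MessageBuilder are responsible for extracting and injecting the tracing information: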

@Bean
public SpanExtractor<Message> messagingSpanExtractor() {
    ...
}

@Bean
public SpanInjector<MessageBuilder> messagingSpanInjector() {
    ...
}

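You can override them by providing your own implementation and adding the @Primary annotation to your bean definition.

7.10.2 HTTP

For HTTP, a bean that works on HttpServletRequest is responsible for building the tracing information: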

@Bean
public SpanExtractor<HttpServletRequest> httpServletRequestSpanExtractor() {
    ...
}

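You can override it by providing your own implementation and adding the @Primary annotation to your bean definition.

7.10.3 Example

Assume that, instead of the standard Zipkin-compatible HTTP header names, you use:

- correlationId for the trace id
- mySpanId for the span id

The SpanExtractor then looks like this: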

static class CustomHttpServletRequestSpanExtractor
		implements SpanExtractor<HttpServletRequest> {

	@Override
	public Span joinTrace(HttpServletRequest carrier) {
		long traceId = Span.hexToId(carrier.getHeader("correlationId"));
		long spanId = Span.hexToId(carrier.getHeader("mySpanId"));
		// extract all necessary headers
		Span.SpanBuilder builder = Span.builder().traceId(traceId).spanId(spanId);
		// build rest of the Span
		return builder.build();
	}
}

You can then register it like this:

@Bean
@Primary
SpanExtractor<HttpServletRequest> customHttpServletRequestSpanExtractor() {
	return new CustomHttpServletRequestSpanExtractor();
}

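For security reasons, Spring Cloud Sleuth does not add trace or span headers to the HTTP response. If you do need them, you can write a custom SpanInjector that injects the headers into the response and register a Servlet filter that uses it: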

static class CustomHttpServletResponseSpanInjector
		implements SpanInjector<HttpServletResponse> {

	@Override
	public void inject(Span span, HttpServletResponse carrier) {
		carrier.addHeader(Span.TRACE_ID_NAME, span.traceIdString());
		carrier.addHeader(Span.SPAN_ID_NAME, Span.idToHex(span.getSpanId()));
	}
}

static class HttpResponseInjectingTraceFilter extends GenericFilterBean {

	private final Tracer tracer;
	private final SpanInjector<HttpServletResponse> spanInjector;

	public HttpResponseInjectingTraceFilter(Tracer tracer, SpanInjector<HttpServletResponse> spanInjector) {
		this.tracer = tracer;
		this.spanInjector = spanInjector;
	}

	@Override
	public void doFilter(ServletRequest request, ServletResponse servletResponse, FilterChain filterChain) throws IOException, ServletException {
		HttpServletResponse response = (HttpServletResponse) servletResponse;
		Span currentSpan = this.tracer.getCurrentSpan();
		this.spanInjector.inject(currentSpan, response);
		filterChain.doFilter(request, response);
	}
}

You can then register them like this:

@Bean
SpanInjector<HttpServletResponse> customHttpServletResponseSpanInjector() {
	return new CustomHttpServletResponseSpanInjector();
}

@Bean
HttpResponseInjectingTraceFilter responseInjectingTraceFilter(Tracer tracer) {
	return new HttpResponseInjectingTraceFilter(tracer, customHttpServletResponseSpanInjector());
}

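7.10.4 Custom SA tag in Zipkin

Sometimes you want to create a span manually to track a call to an external service. To do so, create a span with the peer.service tag, whose value is the name of the service you want to call. The following example wraps a call to a Redis service in such a span: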

org.springframework.cloud.sleuth.Span newSpan = tracer.createSpan("redis");
try {
	newSpan.tag("redis.op", "get");
	newSpan.tag("lc", "redis");
	newSpan.logEvent(org.springframework.cloud.sleuth.Span.CLIENT_SEND);
	// call redis service e.g
	// return (SomeObj) redisTemplate.opsForHash().get("MYHASH", someObjKey);
} finally {
	newSpan.tag("peer.service", "redisService");
	newSpan.tag("peer.ipv4", "1.2.3.4");
	newSpan.tag("peer.port", "1234");
	newSpan.logEvent(org.springframework.cloud.sleuth.Span.CLIENT_RECV);
	tracer.close(newSpan);
}

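Important: Do not add both the peer.service tag and the SA tag! Set only peer.service.

7.10.5 Custom service name

By default, Sleuth assumes that a span sent to Zipkin should use the value of spring.application.name as its service name. That is not always what you want: sometimes you need to set a different service name explicitly for all spans coming from an application. To do so, set the following property (the example uses the service name foo):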

spring.zipkin.service.name: foo

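7.10.6 Host locator

To define the host that corresponds to a particular span, Sleuth has to resolve the host name and port. By default they are taken from the server properties. If those are not set, Sleuth tries to retrieve the host name from the network interfaces.

If you have service discovery enabled and prefer to take the host address from the instance registered in the service registry, set this property: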

spring.zipkin.locator.discovery.enabled: true

7.11 Span Data as Messages

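Once you include the spring-cloud-sleuth-stream dependency and add a channel binder implementation (for example spring-cloud-starter-stream-rabbit or spring-cloud-starter-stream-kafka), you can accumulate span data and send it over Spring Cloud Stream. The application then automatically becomes a producer of messages whose payload type is Spans.

7.11.1 Zipkin Consumer

A dedicated annotation sets up a message consumer that pushes the span data into a Zipkin SpanStore. For example: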

@SpringBootApplication
@EnableZipkinStreamServer
public class Consumer {
	public static void main(String[] args) {
		SpringApplication.run(Consumer.class, args);
	}
}

Span data is then forwarded to Zipkin through Spring Cloud Stream. If you also want the UI, add the following dependency:

<groupId>io.zipkin.java</groupId>
<artifactId>zipkin-autoconfigure-ui</artifactId>

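You then have a Zipkin server, by default on port 9411.

The default SpanStore is in-memory. You can use MySQL instead by adding the spring-boot-starter-jdbc dependency and configuring it as follows: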

spring:
  rabbitmq:
    host: ${RABBIT_HOST:localhost}
  datasource:
    schema: classpath:/mysql.sql
    url: jdbc:mysql://${MYSQL_HOST:localhost}/test
    username: root
    password: root
# Switch this on to create the schema on startup:
    initialize: true
    continueOnError: true
  sleuth:
    enabled: false
zipkin:
  storage:
    type: mysql

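Note: @EnableZipkinStreamServer is also annotated with @EnableZipkinServer, so the process also exposes the standard Zipkin server endpoints for collecting spans over HTTP and for querying them in the Zipkin Web UI.

7.11.2 Custom Consumer

A custom consumer is also easy to implement: use spring-cloud-sleuth-stream and bind to the SleuthSink. For example: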

@EnableBinding(SleuthSink.class)
@SpringBootApplication(exclude = SleuthStreamAutoConfiguration.class)
@MessageEndpoint
public class Consumer {

    @ServiceActivator(inputChannel = SleuthSink.INPUT)
    public void sink(Spans input) throws Exception {
        // ... process spans
    }
}

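Note: The example above explicitly excludes SleuthStreamAutoConfiguration, so the application itself does not send messages. This is optional; in practice you can keep the auto-configuration if you need it.

7.12 Metrics

Currently Spring Cloud Sleuth registers only very simple span metrics. Using the Spring Boot metrics support, it counts accepted and dropped spans. Each time a span is sent to Zipkin, the accepted count increases; when there is an error, the dropped count increases.

7.13 Integrations

7.13.1 Runnable and Callable

If you wrap your logic in a Runnable or Callable, it is enough to wrap those instances in their Sleuth counterparts: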

Runnable runnable = new Runnable() {
	@Override
	public void run() {
		// do some work
	}

	@Override
	public String toString() {
		return "spanNameFromToStringMethod";
	}
};
// Manual `TraceRunnable` creation with explicit "calculateTax" Span name
Runnable traceRunnable = new TraceRunnable(tracer, spanNamer, runnable, "calculateTax");
// Wrapping `Runnable` with `Tracer`. The Span name will be taken either from the
// `@SpanName` annotation or from `toString` method
Runnable traceRunnableFromTracer = tracer.wrap(runnable);

Callable<String> callable = new Callable<String>() {
	@Override
	public String call() throws Exception {
		return someLogic();
	}

	@Override
	public String toString() {
		return "spanNameFromToStringMethod";
	}
};
// Manual `TraceCallable` creation with explicit "calculateTax" Span name
Callable<String> traceCallable = new TraceCallable<>(tracer, spanNamer, callable, "calculateTax");
// Wrapping `Callable` with `Tracer`. The Span name will be taken either from the
// `@SpanName` annotation or from `toString` method
Callable<String> traceCallableFromTracer = tracer.wrap(callable);

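This way a new span is created and closed for each execution.

7.13.2 Hystrix

7.13.2.1 Custom Concurrency Strategy

Sleuth registers a custom HystrixConcurrencyStrategy that wraps all Callable instances in their Sleuth counterpart, TraceCallable. The strategy either starts or continues a span, depending on whether tracing was already in progress before the Hystrix command was called. To disable the strategy, set spring.sleuth.hystrix.strategy.enabled to false.

7.13.2.2 Manual Command setting

Assume that you have the following HystrixCommand: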

HystrixCommand<String> hystrixCommand = new HystrixCommand<String>(setter) {
	@Override
	protected String run() throws Exception {
		return someLogic();
	}
};

To trace it, wrap the same logic in the Sleuth version of HystrixCommand, TraceCommand:

TraceCommand<String> traceCommand = new TraceCommand<String>(tracer, traceKeys, setter) {
	@Override
	public String doRun() throws Exception {
		return someLogic();
	}
};

7.13.3 RxJava

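Sleuth registers a custom RxJavaSchedulersHook that wraps all Action0 instances in their Sleuth counterpart, TraceAction. The hook either starts or continues a span, depending on whether tracing was already in progress before the action was scheduled. To disable the hook, set spring.sleuth.rxjava.schedulers.hook.enabled to false.

You can define a list of regular expressions for the names of threads that should not be traced. Provide them as a comma-separated list in the spring.sleuth.rxjava.schedulers.ignoredthreads property.

7.13.4 HTTP integration

This feature is controlled by the spring.sleuth.web.enabled property; set it to false to disable it.

7.13.4.1 HTTP Filter

TraceFilter traces all inbound requests. The span name is http: followed by the request path. For example, for a request to /foo/bar the span name is http:/foo/bar. With the spring.sleuth.web.skipPattern property you can configure a URI pattern for requests that should not be traced. If ManagementServerProperties is on the classpath, its contextPath value is also excluded from tracing.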
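
The naming rule and the skip pattern can be illustrated with a few lines of plain Java. This is a sketch under assumptions, not TraceFilter itself: the class HttpSpanNaming is hypothetical, the sample pattern is made up, and treating the pattern as a full match against the request path is an assumption of this example.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Sketch: derives "http:" + path as the span name unless the path matches a skip pattern. */
final class HttpSpanNaming {

    private final Pattern skipPattern;

    HttpSpanNaming(String skipRegex) {
        this.skipPattern = Pattern.compile(skipRegex);
    }

    /** Empty when the request should not be traced, otherwise the span name. */
    Optional<String> spanNameFor(String requestPath) {
        if (skipPattern.matcher(requestPath).matches()) {
            return Optional.empty();
        }
        return Optional.of("http:" + requestPath);
    }
}
```

For example, with the made-up pattern /health|/info|.*\.css, a request to /foo/bar yields the name http:/foo/bar, while /health and /static/site.css are skipped.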

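7.13.4.2 HandlerInterceptor

For finer control over the span name, you can use TraceHandlerInterceptor, which either wraps an existing HandlerInterceptor or is added directly to the list of existing HandlerInterceptors. TraceHandlerInterceptor adds a special request attribute to the HttpServletRequest. If TraceFilter does not find this attribute, it creates an additional "fallback" span so that the tracing information is complete.

7.13.4.3 Async Servlet support

If your controller returns a Callable or a WebAsyncTask, Spring Cloud Sleuth continues the existing span instead of creating a new one.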

7.13.5 HTTP client integration

7.13.5.1 Synchronous Rest Template

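Important: A traced version of the AsyncRestTemplate bean is registered out of the box. If you have your own bean, wrap it in a TraceAsyncRestTemplate. The best solution is to customize only the ClientHttpRequestFactory and/or the AsyncClientHttpRequestFactory. If you have your own AsyncRestTemplate and do not wrap it, your calls are not traced.

The custom instrumentation creates and closes spans when requests are sent and received. You can customize the ClientHttpRequestFactory and the AsyncClientHttpRequestFactory by registering your own beans. Remember to use tracing-compatible implementations (for example, do not forget to wrap a ThreadPoolTaskScheduler in a TraceAsyncListenableTaskExecutor).

Example of custom request factories: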

@EnableAutoConfiguration
@Configuration
public static class TestConfiguration {

	@Bean
	ClientHttpRequestFactory mySyncClientFactory() {
		return new MySyncClientHttpRequestFactory();
	}

	@Bean
	AsyncClientHttpRequestFactory myAsyncClientFactory() {
		return new MyAsyncClientHttpRequestFactory();
	}
}

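To disable the AsyncRestTemplate features, set spring.sleuth.web.async.client.enabled to false.

To disable the default TraceAsyncClientHttpRequestFactoryWrapper, set spring.sleuth.web.async.client.factory.enabled to false.

If you do not want an AsyncRestTemplate to be created at all, set spring.sleuth.web.async.client.template.enabled to false.

7.13.6 Feign

By default, Spring Cloud Sleuth integrates with Feign through TraceFeignClientAutoConfiguration. To disable it entirely, set spring.sleuth.feign.enabled to false. If you do so, no Feign-related instrumentation takes place.

Part of the Feign instrumentation is done through a FeignBeanPostProcessor. You can disable it by setting spring.sleuth.feign.processor.enabled to false. Spring Cloud Sleuth then does not instrument any of your custom Feign components; all the default instrumentation, however, remains in place.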

7.13.7 Asynchronous communication

7.13.7.1 @Async annotated methods

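Spring Cloud Sleuth instruments async-related components so that tracing information is passed between threads. To disable this behaviour, set spring.sleuth.async.enabled to false.

If you annotate a method with @Async, a new span is created automatically with the following characteristics:

- the span name is the name of the annotated method
- the span is tagged with the class name and the method name of the annotated method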

7.13.7.2 @Scheduled annotated methods

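Spring Cloud Sleuth instruments the execution of scheduled methods so that tracing information is passed between threads. To disable this behaviour, set spring.sleuth.scheduled.enabled to false.

If you annotate a method with @Scheduled, a new span is created automatically with the following characteristics:

- the span name is the name of the annotated method
- the span is tagged with the class name and the method name of the annotated method

To skip span creation for some @Scheduled annotated classes, set spring.sleuth.scheduled.skipPattern to a regular expression that matches those classes.

Tip: If you use spring-cloud-sleuth-stream and spring-cloud-netflix-hystrix-stream together, a span is created for each Hystrix metrics task and sent to Zipkin. This may not be what you want. You can prevent it with the following setting: spring.sleuth.scheduled.skipPattern=org.springframework.cloud.netflix.hystrix.stream.HystrixStreamTask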

7.13.7.3 Executor, ExecutorService and ScheduledExecutorService

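Sleuth provides LazyTraceExecutor, TraceableExecutorService and TraceableScheduledExecutorService. These implementations create a new span each time a task is submitted, invoked or scheduled.

The following example shows how to pass tracing information with TraceableExecutorService when working with CompletableFuture: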

CompletableFuture<Long> completableFuture = CompletableFuture.supplyAsync(() -> {
	// perform some logic
	return 1_000_000L;
}, new TraceableExecutorService(executorService,
		// 'calculateTax' explicitly names the span - this param is optional
		tracer, traceKeys, spanNamer, "calculateTax"));

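7.13.8 Messaging

Spring Cloud Sleuth integrates with Spring Integration and creates spans for publish and subscribe events. To disable the Spring Integration instrumentation, set spring.sleuth.integration.enabled to false.

Up to version 1.0.4, Spring Cloud Sleuth sent invalid tracing headers when using messaging. Those headers had the same names as the ones sent over HTTP (with - in the name). For backwards compatibility, starting from 1.0.4 both valid and invalid headers are sent. The deprecated headers will be removed in Spring Cloud Sleuth 1.1.

Since 1.0.4 you can use the spring.sleuth.integration.patterns property to specify which message channels are traced. By default all message channels are traced.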

7.13.9 Zuul

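Sleuth registers Zuul filters that propagate the tracing information (the request headers are enriched with tracing data). To disable the Zuul support, set spring.sleuth.zuul.enabled to false.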

7.14 Running examples

Some examples are deployed to Pivotal Web Services. You can find them at the following links: