Analyzing the Performance of an Anycast CDN

ABSTRACT

Content delivery networks must balance a number of trade-offs when deciding how to direct a client to a CDN server. Whereas DNS-based redirection requires a complex global traffic manager, anycast depends on BGP to direct a client to a CDN front-end. Anycast is simple to operate, scalable, and naturally resilient to DDoS attacks. This simplicity, however, comes at the cost of precise control of client redirection. We examine the performance implications of using anycast in a global, latency-sensitive CDN. We analyze millions of client-side measurements from the Bing search service to capture anycast versus unicast performance to nearby front-ends. We find that anycast usually performs well despite the lack of precise control, but that it directs roughly 20% of clients to a suboptimal front-end. We also show that the performance of these clients can be improved through a simple history-based prediction scheme.


Keywords

Anycast; CDN; Measurement


1. INTRODUCTION 

Content delivery networks are a critical part of Internet infrastructure. CDNs deploy front-end servers around the world and direct clients to nearby, available front-ends to reduce bandwidth, improve performance, and maintain reliability. We will focus on a CDN architecture which directs the client to a nearby front-end, which terminates the client's TCP connection and relays requests to a backend server in a data center. The key challenge for a CDN is to map each client to the right front-end. For latency-sensitive services such as search results, CDNs try to reduce the client-perceived latency by mapping the client to a nearby front-end.


CDNs can use several mechanisms to direct the client to a front-end. The two most popular mechanisms are DNS and anycast. DNS-based redirection was pioneered by Akamai. It offers fine-grained and near-real-time control over client-front-end mapping, but requires considerable investment in infrastructure and operations.


Some newer CDNs like CloudFlare rely on anycast [1], announcing the same IP address(es) from multiple locations, leaving the client-front-end mapping at the mercy of Internet routing protocols. Anycast offers only minimal control over client-front-end mapping and is performance-agnostic by design. However, it is easy and cheap to deploy an anycast-based CDN: it requires no infrastructure investment beyond deploying the front-ends themselves. The anycast approach has been shown to be quite robust in practice.


In this paper, we aim to answer the questions: Does anycast direct clients to nearby front-ends? What is the performance impact of poor redirection, if any? To study these questions, we use data from Bing's anycast-based CDN [23]. We instrumented the search stack so that a small fraction of search response pages carry a JavaScript beacon. After the search results display, the JavaScript measures latency to four front-ends: one selected by anycast, and three nearby ones that the JavaScript targets. We compare these latencies to understand anycast performance and determine potential gains from deploying a DNS solution.


Our results paint a mixed picture of anycast performance. For most clients, anycast performs well despite the lack of centralized control. However, anycast directs around 20% of clients to a suboptimal front-end. When anycast does not direct a client to the best front-end, we find that the client usually still lands on a nearby alternative front-end. We demonstrate that the anycast inefficiencies are stable enough that we can use a simple prediction scheme to drive DNS redirection for clients underserved by anycast, improving performance for 15%-20% of clients. Like any such study, our specific conclusions are closely tied to the current front-end deployment of the CDN we measure. However, as the first study of this kind that we are aware of, the results reveal important insights about CDN performance, demonstrating that anycast delivers optimal performance for most clients.


2. CLIENT REDIRECTION 

A CDN can direct a client to a front-end in multiple ways.

DNS: The client will fetch a CDN-hosted resource via a hostname that belongs to the CDN. The client's local DNS resolver (LDNS), typically configured by the client's ISP, will receive the DNS request to resolve the hostname and forward it to the CDN's authoritative nameserver. The CDN makes a performance-based decision about what IP address to return based on which LDNS forwarded the request. DNS redirection allows relatively precise control to redirect clients on small timescales by using small DNS cache TTL values.


Since a CDN must make decisions at the granularity of an LDNS rather than a client, DNS-based redirection faces some challenges. An LDNS may be distant from the clients that it serves or may serve clients distributed over a large geographic region, such that there is no good single redirection choice an authoritative resolver can make. This situation is very common with public DNS resolvers such as Google Public DNS and OpenDNS, which serve large, geographically disparate sets of clients [17]. A proposed solution to this issue is the EDNS client-subnet-prefix standard (ECS), which allows a portion of the client's actual IP address to be forwarded to the authoritative resolver, allowing per-prefix redirection decisions [21].

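To make the granularity difference concrete, here is a minimal sketch of the redirection decision, assuming a hypothetical geolocateFrontEnd helper that maps an IP address or prefix to the best front-end for it; with ECS the authoritative server can key its choice on the client's own prefix, while without ECS it can key only on the LDNS address:

```typescript
// Illustrative sketch only: geolocateFrontEnd is a hypothetical helper
// that maps an IP address or prefix to the best front-end for it.
interface DnsQuery {
  ldnsIp: string;           // address of the resolver forwarding the query
  ecsClientPrefix?: string; // e.g. "203.0.113.0/24" when ECS is present
}

function chooseFrontEnd(
  q: DnsQuery,
  geolocateFrontEnd: (ipOrPrefix: string) => string
): string {
  // With ECS the decision is made per client prefix; without it, every
  // client behind this LDNS receives the same answer.
  const key = q.ecsClientPrefix ?? q.ldnsIp;
  return geolocateFrontEnd(key);
}
```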

Anycast: Anycast is a routing strategy where the same IP address is announced from many locations throughout the world. BGP then routes clients to one front-end location based on BGP's notion of best path. Because anycast defers client redirection to Internet routing, it offers operational simplicity. Anycast has an advantage over DNS-based redirection in that each client redirection is handled independently, avoiding the LDNS problems described above.


Anycast has some well-known challenges. First, anycast is unaware of network performance, just as BGP is, so it does not react to changes in network quality along a path. Second, anycast is unaware of server load. If a particular front-end becomes overloaded, it is difficult to gradually direct traffic away from that front-end, although there has been recent progress in this area [23]. Simply withdrawing the route to take that front-end offline can lead to cascading overload of nearby front-ends. Third, anycast routing changes can cause ongoing TCP sessions to terminate and need to be restarted. In the context of the Web, which is dominated by short flows, this does not appear to be an issue in practice [31, 23]. Many companies, including Cloudflare, CacheFly, Edgecast, and Microsoft, run successful anycast-based CDNs.


Other Redirection Mechanisms: Whereas anycast and DNS direct a client to a front-end before the client initiates a request, the response from a front-end can also direct the client to a different server for other resources, using, for example, HTTP 3xx status codes or the manifest-based redirection common for video [4]. These schemes add extra RTTs, and thus are not suitable for latency-sensitive Web services such as search. We do not consider them further in this paper.


3. METHODOLOGY 

Our goal is to answer two questions: 1) How effective is anycast in directing clients to nearby front-ends? And 2) How does anycast performance compare against the more traditional DNS-based unicast redirection scheme? We experiment with Bing's anycast-based CDN to answer these questions. The CDN has dozens of front-end locations around the world, all within the same Microsoft-operated autonomous system. We use measurements from real clients to Bing CDN front-ends using anycast and unicast. In § 4, we compare the size of this CDN to others and show how close clients are to the front-ends.


3.1 Routing Configuration
All test front-end locations have both anycast and unicast IP addresses.
Anycast: Bing is currently an anycast CDN. All production search traffic is currently served using anycast from all front-ends.
Unicast: We also assign each front-end location a unique /24 prefix which does not serve production traffic. Only the routers at the closest peering point to that front-end announce the prefix, forcing traffic to the prefix to ingress near the front-end rather than entering Microsoft’s backbone at a different location and traversing the backbone to reach the front-end. This routing configuration allows the best head-to-head comparison between unicast and anycast redirection, as anycast traffic ingressing at a particular peering point will also go to the closest front-end.

3.2 Data Sets

We use both passive and active measurements in our study, as discussed below.


3.2.1 Passive Measurements
Bing server logs provide detailed information about client requests for each search query. For our analysis we use the client IP address, location, and which front-end was used during a particular request. This data set was collected during the first week of April 2015 and represents many millions of queries.


3.2.2 Active Measurements 

To actively measure CDN performance from the client, we inject a JavaScript beacon into a small fraction of Bing Search results. After the results page has completely loaded, the beacon instructs the client to fetch four test URLs. These URLs trigger a set of DNS queries to our authoritative DNS infrastructure. The DNS query results are randomized front-end IPs for measurement diversity, which we discuss more in § 3.3.


The beacon measures the latency to these front-ends by downloading the resources pointed to by the URLs, and reports the results to a backend infrastructure. Our authoritative DNS servers also push their query logs to the backend storage. Each test URL has a globally unique identifier, allowing us to join HTTP results from the client side with DNS results from the server side [34].

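A sketch of how such a join might look, assuming hypothetical record shapes in which the globally unique test identifier appears both in the client-side HTTP result and in the server-side DNS log:

```typescript
// Sketch with hypothetical record shapes; field names are illustrative.
interface HttpResult { testId: string; clientIp: string; latencyMs: number; }
interface DnsLogEntry { testId: string; ldnsIp: string; frontEnd: string; }

function joinByTestId(
  httpResults: HttpResult[],
  dnsLog: DnsLogEntry[]
): Array<HttpResult & DnsLogEntry> {
  // Index the DNS log by the globally unique test identifier.
  const dnsById = new Map<string, DnsLogEntry>();
  for (const e of dnsLog) dnsById.set(e.testId, e);

  // Keep only HTTP results whose identifier also appears in the DNS log.
  const joined: Array<HttpResult & DnsLogEntry> = [];
  for (const h of httpResults) {
    const d = dnsById.get(h.testId);
    if (d) joined.push({ ...h, ...d });
  }
  return joined;
}
```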

The JavaScript beacon implements two techniques to improve the quality of measurements. First, to remove the impact of DNS lookup from our measurements, we first issue a warm-up request so that the subsequent test will use the cached DNS response. While DNS latency may be responsible for some aspects of poor Web-browsing performance [5], in this work we are focusing on the performance of paths between client and front-ends. We set TTLs longer than the duration of the beacon. Second, using JavaScript to measure the elapsed time between the start and end of a fetch is known not to be a precise measurement of performance [32], whereas the W3C Resource Timing API [29] provides access to accurate resource download timing information from compliant Web browsers. The beacon first records latency using the primitive timings. Upon completion, if the browser supports the Resource Timing API, the beacon substitutes the more accurate values.

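The following is a minimal sketch of these two techniques as they might appear in a beacon; this is not Bing's actual code, the test URL is a placeholder, and the timing logic is simplified:

```typescript
// Minimal beacon sketch: DNS warm-up plus Resource Timing fallback.
async function measureFrontEnd(testUrl: string): Promise<number> {
  // 1) Warm-up request: primes the browser/OS DNS cache so the timed
  //    fetch below excludes DNS lookup latency.
  await fetch(testUrl, { cache: "no-store" });

  // 2) Primitive timing: wall-clock time around the timed fetch. A unique
  //    query string keeps the HTTP cache out of the measurement while the
  //    DNS answer for the hostname stays cached.
  const timedUrl = `${testUrl}?id=${crypto.randomUUID()}`;
  const start = performance.now();
  await fetch(timedUrl, { cache: "no-store" });
  let elapsed = performance.now() - start;

  // 3) If the W3C Resource Timing API reports this resource, substitute
  //    its more accurate download timing for the primitive value.
  const absolute = new URL(timedUrl, location.href).href;
  const [entry] = performance.getEntriesByName(
    absolute
  ) as PerformanceResourceTiming[];
  if (entry && entry.responseEnd > 0) {
    elapsed = entry.responseEnd - entry.startTime;
  }
  return elapsed;
}
```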

We study measurements collected from many millions of search queries over March and April 2015. We aggregated client IP addresses from measurements into /24 prefixes because they tend to be localized [27]. To reflect that the number of queries per /24 is heavily skewed across prefixes [35], for both the passive and active measurements, we present some of our results weighting the /24s by the number of queries from the prefix in our corresponding measurements.

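A small sketch of this aggregation and query-volume weighting, assuming IPv4 addresses and illustrative record shapes:

```typescript
// Map an IPv4 address to its covering /24 prefix,
// e.g. "203.0.113.57" -> "203.0.113.0/24".
function toSlash24(ip: string): string {
  const [a, b, c] = ip.split(".");
  return `${a}.${b}.${c}.0/24`;
}

interface Measurement { clientIp: string; latencyMs: number; }

// Group per-client latencies by /24 prefix.
function aggregateBySlash24(ms: Measurement[]): Map<string, number[]> {
  const byPrefix = new Map<string, number[]>();
  for (const m of ms) {
    const prefix = toSlash24(m.clientIp);
    const list = byPrefix.get(prefix) ?? [];
    list.push(m.latencyMs);
    byPrefix.set(prefix, list);
  }
  return byPrefix;
}

// Query weighting: each /24 counts proportionally to its query volume
// rather than once, reflecting the skew across prefixes.
function weightedFractionBelow(
  perPrefixValue: Map<string, number>, // e.g. median latency per /24
  queryCount: Map<string, number>,     // queries observed per /24
  threshold: number
): number {
  let total = 0, below = 0;
  for (const [prefix, value] of perPrefixValue) {
    const w = queryCount.get(prefix) ?? 0;
    total += w;
    if (value < threshold) below += w;
  }
  return total > 0 ? below / total : 0;
}
```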

3.3 Choice of Front-ends to Measure 

The main goal of our measurements is to compare the performance achieved by anycast with the performance achieved by directing clients to their best-performing front-end. Measuring from each client to every front-end would introduce too much overhead, but we cannot know a priori which front-end is the best choice for a given client at a given point in time.


We use three mechanisms to balance measurement overhead with measurement accuracy, in terms of uncovering the best-performing choices and obtaining sufficient measurements to them. First, for each LDNS, we consider only the ten closest front-ends to the LDNS (based on geolocation data) as candidates to consider returning to the clients of that LDNS. Recent work has shown that LDNS is a good approximation of client location: excluding 8% of demand from public resolvers, only 11-12% of demand comes from clients who are further than 500 km from their LDNS [17]. In Figure 1, we will show that our geolocation data is sufficiently accurate that the best front-ends for the clients are generally within that set.


Second, to further reduce overhead, each beacon makes only four measurements to front-ends: (a) a measurement to the front-end selected by anycast routing; (b) a measurement to the front-end judged to be geographically closest to the LDNS; and (c-d) measurements to two front-ends randomly selected from the other nine candidates, with the likelihood of a front-end being selected weighted by distance from the client LDNS (e.g., we return the 3rd closest front-end with higher probability than the 4th closest front-end). Third, for most of our analysis, we aggregate measurements by /24 and consider distributions of performance to a front-end, so our analysis is robust even if not every client measures to the best front-end every time.

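A sketch of this selection logic follows; the inverse-distance weighting shown is an assumption for illustration, as the paper does not specify the exact weighting function:

```typescript
// Sketch of beacon target selection; inverse-distance weighting is an
// illustrative assumption, not the CDN's documented weighting function.
interface FrontEnd { id: string; distanceKmFromLdns: number; }

// From the ten front-ends closest to the client's LDNS, pick the targets a
// beacon measures besides the anycast-selected one: the geographically
// closest front-end, plus two drawn from the remaining nine with
// probability decreasing with distance from the LDNS.
function pickBeaconTargets(candidates: FrontEnd[]): FrontEnd[] {
  const sorted = [...candidates].sort(
    (a, b) => a.distanceKmFromLdns - b.distanceKmFromLdns
  );
  const picked: FrontEnd[] = [sorted[0]]; // (b): closest to the LDNS
  const rest = sorted.slice(1);

  for (let i = 0; i < 2 && rest.length > 0; i++) { // (c-d): weighted picks
    // Nearer front-ends get larger weights (guard against zero distance).
    const weights = rest.map((f) => 1 / Math.max(f.distanceKmFromLdns, 1));
    const total = weights.reduce((s, w) => s + w, 0);
    let r = Math.random() * total;
    let idx = 0;
    for (; idx < rest.length - 1; idx++) {
      r -= weights[idx];
      if (r <= 0) break;
    }
    picked.push(rest.splice(idx, 1)[0]); // remove so picks are distinct
  }
  return picked;
}
```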

To partially validate our approach, Figure 1 shows the distribution of minimum observed latency from a client /24 to a front-end. The labeled Nth line includes latency measurements from the nearest N front-ends to the LDNS. The results show decreasing latency as we initially include more front-ends, but we see little decrease after adding five front-ends per prefix, for example. So we do not expect that minimum latencies would improve for many prefixes if we measured to more than the nearest ten front-ends that we include in our beacon measurements.


4. CDN SIZE AND GEO-DISTRIBUTION 

The results in this paper are specific to Bing's anycast CDN deployment. In this section we characterize the size of the deployment, showing that our deployment is of a similar scale (a few dozen front-end server locations) to most other CDNs, and in particular most anycast CDNs, although it is one of the largest deployments within that rough scale. We then measure what the distribution of these dozens of front-end locations yields in terms of the distance from clients to the nearest front-ends. Our characterization of the performance of this CDN is an important first step towards understanding anycast performance. An interesting direction for future work is to understand how to extend these performance results to CDNs with different numbers and locations of servers and with different interdomain connectivity [18].


We compare our CDN to others based on the number of server locations, which is one factor impacting CDN and anycast performance. We examine 21 CDNs and content providers for which there is publicly available data [3]. Four CDNs are extreme outliers. ChinaNetCenter and ChinaCache each have over 100 locations in China. Previous research found Google to have over 1000 locations worldwide [16], and Akamai is generally known to have over 1000 as well [17]. While this scale of deployment is often the popular image of a CDN, it is in fact the exception. Ignoring the large Chinese deployments, the next largest CDNs we found public data for are CDNetworks (161 locations) and SkyparkCDN (119 locations). The remaining 17 CDNs we examined (including ChinaNetCenter's and ChinaCache's deployments outside of China) have between 17 locations (CDNify) and 62 locations (Level3). In terms of number of locations and regional coverage, the Bing CDN is most similar to Level3 and MaxCDN. Well-known CDNs with smaller deployments include Amazon CloudFront (37 locations), CacheFly (41 locations), CloudFlare (43 locations) and EdgeCast (31 locations). CloudFlare, CacheFly, and EdgeCast are anycast CDNs.


To give some perspective on the density of front-end distribution, Figure 2 shows the distance from clients to their nearest front-ends, weighted by client Bing query volumes. The median distance to the nearest front-end is 280 km, to the second nearest is 700 km, and to the fourth nearest is 1300 km.


5. ANYCAST PERFORMANCE

We use measurements to estimate the performance penalty anycast pays in exchange for simple operation. Figure 3 is based on millions of measurements, collected over a period of a few days, and inspired us to take on this project.


As explained in § 3, each execution of the JavaScript beacon yields four measurements: one to the front-end that anycast selects, and three to nearby unicast front-ends. For each request, we find the latency difference between anycast and the lowest-latency unicast front-end. Figure 3 shows the fraction of requests where anycast performance is slower than the best of the three unicast front-ends. Most of the time, in most regions, anycast does well, performing as well as the best of the three nearby unicast front-ends. However, anycast is at least 25ms slower for 20% of requests, and just below 10% of anycast measurements are 100ms or more slower than the best unicast for the client.

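A sketch of the per-request comparison behind Figure 3, assuming each beacon run yields one anycast latency and three unicast latencies in milliseconds (names are illustrative):

```typescript
// One beacon run: an anycast latency plus three unicast latencies.
interface BeaconRun { anycastMs: number; unicastMs: [number, number, number]; }

// Fraction of requests where anycast is at least thresholdMs slower than
// the best of the three measured unicast front-ends.
function fractionAnycastSlower(runs: BeaconRun[], thresholdMs: number): number {
  let slower = 0;
  for (const run of runs) {
    const bestUnicast = Math.min(...run.unicastMs);
    if (run.anycastMs - bestUnicast >= thresholdMs) slower++;
  }
  return runs.length > 0 ? slower / runs.length : 0;
}

// On the paper's dataset, fractionAnycastSlower(runs, 25) would be roughly
// 0.2 and fractionAnycastSlower(runs, 100) just below 0.1.
```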

This graph suggests possible benefits in using DNS-based redirection for some clients, with anycast for the rest. Note that this is not an upper bound: to derive that, we would have to poll all front-ends in each beacon execution, which is too much overhead. There is also no guarantee that a deployed DNS-based redirection system will be able to achieve the performance improvement seen in Figure 3; to do so, the DNS-based redirection system would have to be practically clairvoyant. Nonetheless, this result was sufficiently tantalizing for us to study anycast performance in more detail, and seek ways to improve it.


Examples of poor anycast routes: A challenge in understanding anycast performance is figuring out why clients are being directed to distant or poorly performing front-ends. To troubleshoot, we used the RIPE Atlas [2] testbed, a network of over 8000 probes predominantly hosted in home networks. We issued traceroutes from Atlas probes hosted within the same ISP-metro area pairs where we have observed clients with poor performance. We observe in our analysis that many instances fall into one of two cases: 1) BGP's lack of insight into the underlying topology causes anycast to make suboptimal choices, and 2) intradomain routing policies of ISPs select remote peering points with our network.


In one interesting example, a client was roughly the same distance from two border routers announcing the anycast route. Anycast chose to route towards router A. However, internally in our network, router B is very close to a front-end C, whereas router A has a longer intradomain route to the nearest front-end, front-end D. With anycast, there is no way to communicate [39] this internal topology information in a BGP announcement.


Several other examples included cases where a client is near a front-end but the ISP's internal policy chooses to hand off traffic at a distant peering point. Microsoft intradomain policy then directs the client's request to the front-end nearest to the peering point, not to the client. Examples we observed of this were an ISP carrying traffic from a client in Denver to Phoenix and another carrying traffic from Moscow to Stockholm. In both cases, direct peering was present at each source city.


Intrigued by these sorts of case studies, we sought to understand anycast performance quantitatively. The first question we ask is whether anycast performance is poor simply because it occasionally directs clients to front-ends that are geographically far away, as was the case when clients in Moscow went to Stockholm.


Does anycast direct clients to nearby front-ends? In a large CDN with presence in major metro areas around the world, most ISPs will see BGP announcements for front-ends from a number of different locations. If peering among these points is uniform, then the ISP's least-cost path from a client to a front-end will often be the geographically closest. Since anycast is not load- or latency-aware, geographic proximity is a good indicator of expected performance.


Figure 4 shows the distribution of the distance from client to anycast front-end for all clients in one day of production Bing traffic. One line weights clients by query volume. Anycast is shown to perform 5-10% better at all percentiles when accounting for more active clients. We see that about 82% of clients are directed to a front-end within 2000 km, while 87% of client volume is within 2000 km.
