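Indexes are fundamental to database design, and they tell the developer who uses the database a great deal about the designer's intentions. Unfortunately, indexes are too often added as an afterthought when performance problems appear. Here at last is a simple series of articles that should bring any database professional rapidly "up to speed" with them.

Level 1 of the Stairway to SQL Server Indexes introduced SQL Server indexes in general and nonclustered indexes in particular. As our first case study, we demonstrated the potential benefit of an index when retrieving a single row from a table. In this level, we continue our investigation of nonclustered indexes, examining their contribution to good query performance in cases that go beyond retrieving a single row from a table.

As will be the case in most of these levels, we introduce a small amount of theory, examine some index internals to help explain the theory, and then execute some queries. These queries are executed with and without indexes, and with performance reporting statistics turned on, so that we can see the impact of the indexes.

We will use the subset of tables from the AdventureWorks database that we used in Level 1, concentrating on the Contact table throughout this level. We will use just one index, the FullName index from Level 1, to illustrate our points. To make sure that we control the indexing on the Contact table, we will make two copies of the table in the dbo schema and build the FullName index on only one of them. This gives us a controlled environment: two copies of the table, one with a single nonclustered index and one without any indexes.

Note: All T-SQL code shown in this Stairway level can be downloaded at the bottom of the article.

The code in Listing 2.1 makes the copies of the Person.Contact table. We can rerun this batch any time we wish to start with a "clean slate".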
-- Drop the copy that will carry the index, if it exists, so the batch can be rerun.
IF EXISTS (
    SELECT *
    FROM sys.tables
    WHERE OBJECT_ID = OBJECT_ID('dbo.Contacts_index'))
    DROP TABLE dbo.Contacts_index;
GO
-- Drop the copy that will stay without indexes, if it exists.
IF EXISTS (
    SELECT *
    FROM sys.tables
    WHERE OBJECT_ID = OBJECT_ID('dbo.Contacts_noindex'))
    DROP TABLE dbo.Contacts_noindex;
GO
-- Create two identical copies of Person.Contact in the dbo schema.
SELECT * INTO dbo.Contacts_index
FROM Person.Contact;
SELECT * INTO dbo.Contacts_noindex
FROM Person.Contact;
Listing 2.1: Making copies of the Person.Contact table

A fragment of the Contact table is shown here:
ContactID  FirstName  MiddleName  LastName   EmailAddress
1288       Laura      F           Norman     laura1@adventure-works.com
651        Michael                Patten     michael20@adventure-works.com
1652       Isabella   R           James      isabella6@adventure-works.com
1015       David      R           Campbell   david8@adventure-works.com
1379       Balagane               Swaminath  balaganesan0@adventure-works.c
742        Steve                  Schmidt    steve3@adventure-works.com
1743       Shannon    C           Guo        shannon16@adventure-works.com
1106       John       Y           Chen       john2@adventure-works.com
1470       Blaine                 Dockter    blaine1@adventure-works.com
833        Clarence   R.          Tatman     clarence0@adventure-works.com
1834       Heather    M           Wu         heather6@adventure-works.com
1197       Denise     H           Smith      denise0@adventure-works.com
560        Jennifer   J.          Maxham     jennifer1@adventure-works.com
1561       Ido                    Ben-Sacha  ido1@adventure-works.com
924        Becky      R.          Waters     becky0@adventure-works.com
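
Before any index is built, it can be worth confirming that the two copies are the same size and that neither carries an index yet. The following is a minimal sketch, assuming the batch in Listing 2.1 has just been run in AdventureWorks; the column aliases are illustrative only. SELECT ... INTO copies the rows but not the source table's indexes, so both copies should be reported as heaps.

```sql
-- Row counts of the two copies; both should match Person.Contact.
SELECT 'Contacts_index' AS TableName, COUNT(*) AS RowCnt
FROM dbo.Contacts_index
UNION ALL
SELECT 'Contacts_noindex', COUNT(*)
FROM dbo.Contacts_noindex;

-- Indexes currently defined on each copy.
-- A table without a clustered index appears as one row of type HEAP.
SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name                   AS IndexName,
       i.type_desc              AS IndexType
FROM sys.indexes AS i
WHERE i.object_id IN (OBJECT_ID('dbo.Contacts_index'),
                      OBJECT_ID('dbo.Contacts_noindex'));
```

Running the second query again after the index is created should show an additional NONCLUSTERED row for dbo.Contacts_index only.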

Nonclustered Index Entries

The following statement creates our FullName nonclustered index on the Contacts_index table.
CREATE INDEX FullName
ON dbo.Contacts_index
( LastName, FirstName );
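Listing 2.2: Creating a nonclustered index

Remember that a nonclustered index stores the index keys in sequence, along with a bookmark that is used to access the actual data in the table. You can think of the bookmark as a kind of pointer. Future levels will describe the bookmark, its form, and its use in more detail.

A fragment of the FullName index is shown here, consisting of the last name and first name as key columns, plus the bookmark: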
:--- Search Key Columns        :      Bookmark
Russell        Zachary         =>
Ruth           Andy            =>
Ruth           Andy            =>
Ryan           David           =>
Ryan           Justin          =>
Sabella        Deanna          =>
Sackstede      Lane            =>
Sackstede      Lane            =>
Saddow         Peter           =>
Sai            Cindy           =>
Sai            Kaitlin         =>
Sai            Manuel          =>
Salah          Tamer           =>
Salanki        Ajay            =>
Salavaria      Sharon          =>
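
The ordering of this fragment can be approximated with an ordinary query. This is only a sketch: it returns key values in the same (LastName, FirstName) sequence that the index stores them, but the bookmark is internal to SQL Server and is not exposed as a column. The starting value 'Russell' and the row limit are chosen here purely to resemble the fragment.

```sql
-- List key values in FullName index order, starting at last name 'Russell'.
-- Duplicate names (for example two contacts with the same first and last name)
-- appear as separate rows, just as they are separate index entries.
SELECT TOP (15) LastName, FirstName
FROM dbo.Contacts_index
WHERE LastName >= 'Russell'
ORDER BY LastName, FirstName;
```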
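
Every entry contains the index key columns and the bookmark value. In addition, a SQL Server nonclustered index entry has some header information for internal use only, and may contain some optional data values. Both of these will be covered in later levels; neither is important at this point for a basic understanding of nonclustered indexes.

For now, all we need to know is that the key value enables SQL Server to find the appropriate index entry or entries, and that the entry's bookmark value enables SQL Server to access the corresponding data row in the table.

Benefiting from Index Entries Being in Sequence

The entries of an index are sequenced by their index key values, so SQL Server can quickly traverse the entries in either direction. A scan of sequenced entries can start from the beginning of the index, the end of the index, or any entry within the index.

Thus, if a request asks for all contacts whose last name begins with the letter "S" (WHERE LastName LIKE 'S%'), SQL Server can quickly navigate to the first "S" entry ("Sabella, Deanna") and then traverse the index, using the bookmarks to access the rows, until it reaches the first "T" entry. At that point it knows that it has retrieved all the "S" entries.

The above request would execute even faster if all the selected columns were in the index. Thus, if we issued: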
SELECT FirstName, LastName
FROM Contact
WHERE LastName LIKE 'S%';
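SQL Server can quickly navigate to the first "S" entry and then traverse the index entries, ignoring the bookmarks and retrieving the data values directly from the index entries, until it reaches the first "T" entry. In relational database terminology, the index has "covered" the query.

Any SQL operator that benefits from sequenced data can benefit from an index. This includes ORDER BY, GROUP BY, DISTINCT, UNION (but not UNION ALL), and JOIN ... ON.

For example, if a request asks for a count of contacts by last name, SQL Server can begin counting at the first entry and proceed down the index. Every time the value of the last name changes, SQL Server outputs the current count and begins a new count. As with the previous request, this is a covered query; SQL Server accesses only the index and ignores the table completely.

Note the importance of the left-to-right sequence of the key columns. Our index is very helpful if a request asks for everyone whose last name is "Ashton", but it is of little or no help if the request is for everyone whose first name is "Ashton".

Testing Some Sample Queries

If you want to execute the test queries that follow, make sure that you have run the script that creates both versions of the new Contact table, dbo.Contacts_index and dbo.Contacts_noindex, and the script that creates the LastName, FirstName index on dbo.Contacts_index.

To verify the assertions in the previous section, we turn on the same performance statistics that we used in Level 1 and run some queries, with and without an index.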
SET STATISTICS io ON
SET STATISTICS time ON
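
With statistics switched on, the two claims made above about sequenced entries can be tried directly. The queries below are a sketch, assuming the dbo.Contacts_index table and the FullName index created earlier; the comments state what the text predicts, and the plan the optimizer actually chooses may vary with the SQL Server version and the data.

```sql
-- Count of contacts by last name. LastName is the leading key column,
-- so the query is covered and the entries arrive already grouped.
SELECT LastName, COUNT(*) AS Contacts
FROM dbo.Contacts_index
GROUP BY LastName;

-- Predicate on the leading key column: SQL Server can navigate
-- straight to the 'Ashton' entries.
SELECT LastName, FirstName
FROM dbo.Contacts_index
WHERE LastName = 'Ashton';

-- Predicate on the second key column only: matching entries are scattered
-- across the index, so at best the whole index has to be scanned.
SELECT LastName, FirstName
FROM dbo.Contacts_index
WHERE FirstName = 'Ashton';
```

Comparing the logical reads reported for the last two statements gives a rough measure of how much the key column order matters.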
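Because the Contact table in the AdventureWorks database has only 19,972 rows, it is difficult to obtain meaningful values from STATISTICS TIME. Most of our queries would show a CPU time of 0, so we do not show the output from STATISTICS TIME; we show only the output from STATISTICS IO, which reflects the number of pages that had to be read. These values allow us to compare queries in a relative sense, to determine which queries perform better under which indexes. If you want a larger table for more realistic timing tests, a script that builds a million-row version of the Contact table is available with this article. All of the discussion that follows assumes that you are using the standard 19,972-row table.

Testing a Covered Query

Our first query is one that will be covered by the index: it retrieves a limited set of columns for all contacts whose last name begins with "S". Query execution information is given in Table 2.1.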
SQL           | SELECT FirstName, LastName
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'S%'
Without Index | (2130 row(s) affected)
              | Table 'Contacts_noindex'. Scan count 1, logical reads 568.
With Index    | (2130 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 14.
Index Impact  | IO reduced from 568 reads to 14 reads.
Comments      | An index that covers the query is a good thing to have. Without an index, the entire table is scanned to find the rows. The "2130 rows" statistic indicates that "S" is a popular initial letter for last names, occurring in ten percent of all contacts.
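Table 2.1: Execution results when running a covered query

Testing a Non-Covered Query

Next, we modify our query to request the same rows as before, but to include columns that are not in the index. Query execution information is given in Table 2.2.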
SQL           | SELECT *
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'S%'
Without Index | Same as previous query. (Because it is a table scan).
With Index    | (2130 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 568.
Index Impact  | No impact at all.
Comments      | The index was never used during the execution of the query! SQL Server decided that jumping from an index entry to the corresponding row in the table 2130 times (once for each row) was more work than scanning the entire table of 19,972 rows to find the 2130 rows that it needed.
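Table 2.2: Execution results when running a non-covered query

Testing a Non-Covered Query That Is More Selective

This time, we make our query more selective; that is, we narrow down the number of rows being requested. This increases the likelihood that the index will benefit the query. Query execution information is given in Table 2.3.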
SQL           | SELECT *
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'Ste%'
Without Index | Same as previous query. (Because it is a table scan).
With Index    | (107 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 111.
Index Impact  | IO reduced from 568 reads to 111 reads.
Comments      | SQL Server accessed the 107 "Ste%" entries, all of which are located consecutively within the index. Each entry's bookmark was then used to retrieve the corresponding row. The rows are not located consecutively within the table. The index benefitted this query, but not as much as it benefitted the first query, the "covered" query, especially in terms of the number of IOs required to retrieve each row. You might expect that reading 107 index entries plus 107 rows would require 107 + 107 reads. The reason why only 111 reads were required will be covered in a later level. For now, we will say that very few of the reads were used to access the index entries; most were used to access the rows. Since the previous query, which requested 2130 rows, did not benefit from the index, and this query, which requested 107 rows, did benefit from it, you might also wonder "where does the tipping point lie?" The calculations behind SQL Server's decision will also be covered in a future level.
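Table 2.3: Execution results when running a more selective non-covered query

Testing a Covered Aggregate Query

Our last sample query is an aggregate query; that is, a query that involves counts, totals, averages, and so on. In this case, it is a query that tells us the extent to which names are duplicated within the Contact table. Part of the result looks like this: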
Steel       Merrill   1
Steele      Joan      1
Steele      Laura     2
Steelman    Shanay    1
Steen       Heidi     2
Stefani     Stefano   1
Steiner     Alan      1

Query execution information is given in Table 2.4.
SQL           | SELECT LastName, FirstName, COUNT(*) as 'Contacts'
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'Ste%'
              | GROUP BY LastName, FirstName
Without Index | Same as previous query. (Because it is a table scan).
With Index    | (104 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 4.
Index Impact  | IO reduced from 568 reads to 4 reads.
Comments      | All the information needed by the query is in the index; and it is in the index in the ideal sequence for calculating the counts. All the "last name begins with 'Ste'" entries are consecutive within the index; and within that group, all the entries for a single FirstName / LastName value are grouped together. No accessing of the table was required; nor was any sorting of intermediate results needed. Again, an index that covers the query is a good thing to have.
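Table 2.4: Execution results when running a covered aggregate query

Testing a Non-Covered Aggregate Query

If we change the query to include columns that are not in the index, we get the performance results shown in Table 2.5.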
SQL           | SELECT LastName, FirstName, MiddleName, COUNT(*) as 'Contacts'
              | FROM dbo.Contacts  -- execute with both Contacts_noindex and Contacts_index
              | WHERE LastName LIKE 'Ste%'
              | GROUP BY LastName, FirstName, MiddleName
Without Index | Same as previous query. (Because it is a table scan).
With Index    | (105 row(s) affected)
              | Table 'Contacts_index'. Scan count 1, logical reads 111.
Index Impact  | IO reduced from 568 reads to 111 reads; same as the previous non-covered query.
Comments      | Intermediate work done while processing the query does not always appear in the statistics. Techniques that use memory or tempdb to sort and merge data are examples of this. In reality, the benefit of an index may be greater than that shown by the statistics.
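
The tables in this level list each query against a placeholder name, dbo.Contacts, with a note to run it against both copies. A sketch of how the first two comparisons might be scripted is shown here. The final statement is not one of the original tests: it uses a table hint to force the FullName index on the non-covered query, as one possible way to observe the cost of the plan that SQL Server rejected. No read counts are quoted for it, since they have not been measured here.

```sql
SET STATISTICS IO ON;

-- Covered query (Table 2.1), run against each copy in turn.
SELECT FirstName, LastName FROM dbo.Contacts_noindex WHERE LastName LIKE 'S%';
SELECT FirstName, LastName FROM dbo.Contacts_index   WHERE LastName LIKE 'S%';

-- Non-covered query (Table 2.2), run against each copy in turn.
SELECT * FROM dbo.Contacts_noindex WHERE LastName LIKE 'S%';
SELECT * FROM dbo.Contacts_index   WHERE LastName LIKE 'S%';

-- Illustrative addition: force the nonclustered index for the non-covered
-- query and compare its logical reads with the table scan above.
SELECT *
FROM dbo.Contacts_index WITH (INDEX(FullName))
WHERE LastName LIKE 'S%';

SET STATISTICS IO OFF;
```

The same pattern applies to the 'Ste%' queries and the two aggregate queries: substitute the table name and compare the logical reads reported in the Messages output.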