SQL Server 2005, wide indexes, computed columns, and sargable queries
In my database, assume we have a table defined as follows:
CREATE TABLE [Chemical](
[ChemicalId] int NOT NULL IDENTITY(1,1) PRIMARY KEY,
[Name] nvarchar(max) NOT NULL,
[Description] nvarchar(max) NULL
)
The value for Name can be very large, so we must use nvarchar(max). We want to create an index on this column, but nvarchar(max) is not supported as an index key column.
So we create the following computed column and associated index based upon it:
ALTER TABLE [Chemical]
ADD [Name_Indexable] AS LEFT([Name], 20)
CREATE INDEX [IX_Name]
ON [Chemical]([Name_Indexable])
INCLUDE([Name])
The index will not be unique but we can enforce uniqueness via a trigger.
If we perform the following query, the execution plan results in an index scan, which is not what we want:
SELECT [ChemicalId], [Name], [Description]
FROM [Chemical]
WHERE [Name]='[1,1''-Bicyclohexyl]-2-carboxylic acid, 4'',5-dihydroxy-2'',3-dimethyl-5'',6-bis[(1-oxo-2-propen-1-yl)oxy]-, methyl ester'
However, if we modify the query to make it "sargable," then the execution plan results in an index seek, which is what we want:
SELECT [ChemicalId], [Name], [Description]
FROM [Chemical]
WHERE [Name_Indexable]='[1,1''-Bicyclohexyl]-' AND [Name]='[1,1''-Bicyclohexyl]-2-carboxylic acid, 4'',5-dihydroxy-2'',3-dimethyl-5'',6-bis[(1-oxo-2-propen-1-yl)oxy]-, methyl ester'
Is this a good solution if we control the format of all queries executed against the database via our middle tier? Is there a better way? Is this a major kludge? Should we be using full-text indexing?
IMHO, yes, I think this is a bad approach. If you know that the first 20 characters will be unique, then they should be a first-class column with a unique constraint. If you want better searching on the Name column, then full-text search is the right way to go. If you want to ensure that the nvarchar(max) column is unique, then create a computed column that generates a hash of the value and place a unique constraint on that.
ALTER TABLE [Chemical]
ADD [NameHash] AS HASHBYTES('SHA1', [Name])
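The hash column only enforces uniqueness once a unique index or constraint sits on it. A minimal sketch, assuming the NameHash computed column above already exists; the index name UX_Chemical_NameHash is made up for illustration. One caveat worth checking first: on SQL Server 2005, HASHBYTES accepts at most 8000 bytes of input (4000 nvarchar characters), so this only works if no Name is longer than that.

```sql
-- Assumes [NameHash] was added as a computed column as shown above.
-- UX_Chemical_NameHash is a hypothetical index name.
-- HASHBYTES returns varbinary(8000), so SQL Server is likely to warn that
-- the key could exceed the 900-byte limit; a SHA1 hash is only 20 bytes,
-- so the limit is never reached in practice.
CREATE UNIQUE INDEX [UX_Chemical_NameHash]
ON [Chemical]([NameHash]);
```

With this in place, inserting a second row with a byte-for-byte identical Name should fail with a duplicate key error instead of requiring a trigger.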
ADDITION
Given our discussion, if your searches are always going to be on an exact match, then you could hash your search parameter and compare it to NameHash above. However, the catch is that the match must be an exact match (i.e., case-sensitive).
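A lookup by hash could look roughly like the following. This is a sketch, not a tested implementation: @SearchName is a hypothetical parameter standing in for whatever the middle tier passes, and it must be nvarchar so that its bytes hash the same way as the nvarchar Name column.

```sql
-- @SearchName is a hypothetical parameter supplied by the middle tier.
-- It must be nvarchar: hashing a varchar value yields a different result.
DECLARE @SearchName nvarchar(4000);
SET @SearchName = N'[1,1''-Bicyclohexyl]-2-carboxylic acid, 4'',5-dihydroxy-2'',3-dimethyl-5'',6-bis[(1-oxo-2-propen-1-yl)oxy]-, methyl ester';

SELECT [ChemicalId], [Name], [Description]
FROM [Chemical]
WHERE [NameHash] = HASHBYTES('SHA1', @SearchName)  -- can seek on an index over NameHash
  AND [Name] = @SearchName;                        -- guards against hash collisions
```

Because the hash is computed over the raw bytes, a search term that differs only in case or trailing spaces will not match, even if the column's collation is case-insensitive.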
I still contend that FTS will be your best bet. Even though there is overhead in breaking up your text into words, FTS is the tool best designed for doing searches against large amounts of text. The longer your search criteria, the more exact the match will be and the faster the search.
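For reference, setting up full-text search on Name might look like the sketch below. The catalog name ChemicalCatalog and the key index name PK_Chemical are assumptions: the table in the question lets SQL Server generate the primary key's name, so the real name would have to be looked up and substituted. How well the word breaker handles punctuation-heavy chemical names is also something to test rather than assume.

```sql
-- ChemicalCatalog is a hypothetical catalog name.
CREATE FULLTEXT CATALOG [ChemicalCatalog];
GO

-- KEY INDEX must name a unique, single-column, non-nullable index.
-- PK_Chemical is a placeholder for the actual primary key index name.
CREATE FULLTEXT INDEX ON [Chemical]([Name])
KEY INDEX [PK_Chemical]
ON [ChemicalCatalog];
GO

-- Word-based search; results appear only after the index has been populated.
SELECT [ChemicalId], [Name], [Description]
FROM [Chemical]
WHERE CONTAINS([Name], N'"Bicyclohexyl" AND "carboxylic"');
```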
Your index is on Name_Indexable, not on Name. Since Name_Indexable is generated from a function involving Name rather than being the Name column itself, the optimizer won't automatically use the index when your WHERE clause references only Name. You must search on Name_Indexable in order to use the index. Since you have a middle tier, your best bet is probably to provide a function that searches on Name_Indexable if the given name is <= 20 characters, and otherwise searches on both.
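One way to package that for the middle tier is a stored procedure. This is only a sketch: the procedure name dbo.FindChemicalByName and the parameter @Name are made up here. It always filters on both columns, which gives the same result as the two-branch version, since for names of 20 characters or fewer the two predicates test the same value.

```sql
-- dbo.FindChemicalByName and @Name are hypothetical names.
CREATE PROCEDURE dbo.FindChemicalByName
    @Name nvarchar(max)
AS
BEGIN
    SET NOCOUNT ON;

    SELECT [ChemicalId], [Name], [Description]
    FROM [Chemical]
    WHERE [Name_Indexable] = LEFT(@Name, 20)  -- lets the optimizer seek on IX_Name
      AND [Name] = @Name;                     -- filters out other names sharing the prefix
END
```

Callers would then run something like EXEC dbo.FindChemicalByName @Name = N'...' and never need to know about the prefix column.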
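Make the Name_Index column a persisted computed column and the primary key, and enforce uniqueness by appending ChemicalId rather than relying on a trigger.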
CREATE TABLE dbo.[Chemical]
([ChemicalId] int NOT NULL IDENTITY(1,1),
[Name] Nvarchar(max) NOT NULL,
[Description] Nvarchar(max) NOT NULL,
[Name_Index] AS (CONVERT(VARCHAR(20), LEFT([Name], 20)) + CONVERT(VARCHAR(20), [ChemicalId])) PERSISTED PRIMARY KEY);