微信公众号搜"智元新知"关注
微信扫一扫可直接关注哦!

sql-server – 多语句TVF与内联TVF性能

比较 Palindrome question上的一些答案(仅限10k用户,因为我删除了答案),我的结果令人困惑.

我提出了一个multi-statement,schema-bound TVF,我认为它比运行标准功能更快.我也认为多语句TVF会被“内联”,虽然我错了,但你会在下面看到.这个问题是关于这两种风格的TVF的性能差异.首先,您需要查看代码.

这是多语句TVF:

IF OBJECT_ID('dbo.Ispalindrome') IS NOT NULL
DROP FUNCTION dbo.Ispalindrome;
GO
CREATE FUNCTION dbo.Ispalindrome
(
    @Word NVARCHAR(500)
) 
RETURNS @t TABLE
(
    Ispalindrome BIT NOT NULL
)
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @Ispalindrome BIT;
    DECLARE @LeftChunk NVARCHAR(250);
    DECLARE @RightChunk NVARCHAR(250);
    DECLARE @StrLen INT;
    DECLARE @Pos INT;
    SET @RightChunk = '';
    SET @Ispalindrome = 0;
    SET @StrLen = LEN(@Word) / 2;
    IF @StrLen % 2 = 1 SET @StrLen = @StrLen - 1;
    SET @Pos = LEN(@Word);
    SET @LeftChunk = LEFT(@Word,@StrLen);
    WHILE @Pos > (LEN(@Word) - @StrLen)
    BEGIN
        SET @RightChunk = @RightChunk + SUBSTRING(@Word,@Pos,1)
        SET @Pos = @Pos - 1;
    END
    IF @LeftChunk = @RightChunk SET @Ispalindrome = 1;
    INSERT INTO @t VALUES (@Ispalindrome);
    RETURN
END
GO

内联TVF:

IF OBJECT_ID('dbo.InlineIspalindrome') IS NOT NULL
DROP FUNCTION dbo.InlineIspalindrome;
GO
CREATE FUNCTION dbo.InlineIspalindrome
(
    @Word NVARCHAR(500)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
    WITH Nums AS
    (
      SELECT
        N = number
      FROM
        dbo.Numbers
    )
    SELECT
      Ispalindrome =
        CASE
          WHEN EXISTS
          (
            SELECT N
            FROM Nums
            WHERE N <= L / 2
              AND SUBSTRING(S,N,1) <> SUBSTRING(S,1 + L - N,1)
          )
          THEN 0
          ELSE 1
        END
    FROM
      (SELECT LTRIM(RTRIM(@Word)),LEN(@Word)) AS v (S,L)
);
GO

上述函数中的Numbers表定义为:

CREATE TABLE dbo.Numbers
(
    Number INT NOT NULL 
);

注意:数字表没有任何索引,也没有主键,并且包含1,000,000行.

试验台临时表:

IF OBJECT_ID('tempdb.dbo.#Words') IS NOT NULL
DROP TABLE #Words;
GO
CREATE TABLE #Words 
(
    Word VARCHAR(500) NOT NULL
);

INSERT INTO #Words(Word) 
SELECT o.name + REVERSE(w.name)
FROM sys.objects o
CROSS APPLY (
    SELECT o.name
    FROM sys.objects o
) w;

在我的测试系统上,上面的INSERT导致将16900行插入到#Words表中.

为了测试这两种变化,我设置了STATISTICS IO,TIME ON;并使用以下内容

SELECT w.Word,p.Ispalindrome
FROM #Words w
    CROSS APPLY dbo.Ispalindrome(w.Word) p
ORDER BY w.Word;


SELECT w.Word,p.Ispalindrome
FROM #Words w
    CROSS APPLY dbo.InlineIspalindrome(w.Word) p
ORDER BY w.Word;

我期望InlineIspalindrome版本明显更快,但是以下结果不支持该假设.

多语句TVF:

Table ‘#A1CE04C3’. Scan count 16896,logical reads 16900,physical reads 0,read-ahead reads 0,lob logical reads 0,lob physical reads 0,lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0,logical reads 0,lob read-ahead reads 0.
Table ‘#Words’. Scan count 1,logical reads 88,lob read-ahead reads 0.

sql Server Execution Times:
cpu time = 1700 ms,elapsed time = 2022 ms.
sql Server parse and compile time:
cpu time = 0 ms,elapsed time = 0 ms.

内联TVF:

Table ‘Numbers’. Scan count 1,logical reads 1272030,lob read-ahead reads 0.

sql Server Execution Times:
cpu time = 137874 ms,elapsed time = 139415 ms.
sql Server parse and compile time:
cpu time = 0 ms,elapsed time = 0 ms.

执行计划如下:

在这种情况下,为什么内联变量比多语句变量慢得多?

在回应@AaronBertrand的评论时,我修改了dbo.InlineIspalindrome函数,以限制CTE返回的行与输入字的长度相匹配:

CREATE FUNCTION dbo.InlineIspalindrome
(
    @Word NVARCHAR(500)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
    WITH Nums AS
    (
      SELECT
        N = number
      FROM
        dbo.Numbers
      WHERE 
        number <= LEN(@Word)
    )
    SELECT
      Ispalindrome =
        CASE
          WHEN EXISTS
          (
            SELECT N
            FROM Nums
            WHERE N <= L / 2
              AND SUBSTRING(S,L)
);

正如@MartinSmith建议的那样,我已经在dbo.Numbers表中添加一个主键和聚簇索引,这肯定有助于并且更接近人们期望在生产环境中看到的内容.

现在重新运行上面的测试会产生以下统计信息:

交叉申请dbo.Ispalindrome(w.Word)p:

(17424 row(s) affected)
Table ‘#B1104853’. Scan count 17420,logical reads 17424,logical reads 90,lob read-ahead reads 0.

sql Server Execution Times:
cpu time = 1763 ms,elapsed time = 2192 ms.

dbo.FunctionIspalindrome(w.Word):

(17424 row(s) affected)
Table ‘Worktable’. Scan count 0,lob read-ahead reads 0.

sql Server Execution Times:
cpu time = 328 ms,elapsed time = 424 ms.

交叉申请dbo.InlineIspalindrome(w.Word)p:

(17424 row(s) affected)
Table ‘Numbers’. Scan count 1,logical reads 237100,lob read-ahead reads 0.

sql Server Execution Times:
cpu time = 17737 ms,elapsed time = 17946 ms.

我在sql Server 2012 SP3,v11.0.6020,Developer Edition上测试了这个.

这是我的数字表的定义,主键和聚集索引:

CREATE TABLE dbo.Numbers
(
    Number INT NOT NULL 
        CONSTRAINT PK_Numbers
        PRIMARY KEY CLUSTERED
);

;WITH n AS
(
    SELECT v.n 
    FROM (
        VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)
        ) v(n)
)
INSERT INTO dbo.Numbers(Number)
SELECT ROW_NUMBER() OVER (ORDER BY n1.n)
FROM n n1,n n2,n n3,n n4,n n5,n n6;

解决方法

您的数字表是一个堆,每次都可能被完全扫描.

在Number上添加一个聚簇主键,并使用forceseek提示尝试以下操作以获得所需的搜索.

据我所知,这个提示是必需的,因为sql Server只估计该表的27%将与谓词匹配(< = 30%和<>减少到27%).因此,它只需要读取3-4行才能找到匹配的行,并且可以退出半连接.因此,扫描选项的成本非常低廉.但事实上,如果确实存在任何回文,那么它将不得不阅读整个表格,因此这不是一个好的计划.

CREATE FUNCTION dbo.InlineIspalindrome
(
    @Word NVARCHAR(500)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
    WITH Nums AS
    (
      SELECT
        N = number
      FROM
        dbo.Numbers WITH(FORCESEEK)
    )
    SELECT
      Ispalindrome =
        CASE
          WHEN EXISTS
          (
            SELECT N
            FROM Nums
            WHERE N <= L / 2
              AND SUBSTRING(S,L)
);
GO

随着这些变化,它飞了我(需要228ms)

版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。

相关推荐