LeetCode 28: Implement strStr()

Date: 2019-12-20

爱写bug (ID: icodebugs)
Author: 爱写bug

Implement the strStr() function.

Given a haystack string and a needle string, return the index (starting from 0) of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example 1:

Input: haystack = "hello", needle = "ll"
Output: 2

Example 2:

Input: haystack = "aaaaa", needle = "bba"
Output: -1

Clarification:

What should we return when needle is an empty string? This is a great question to ask during an interview.

For the purpose of this problem, we will return 0 when needle is an empty string. This is consistent with C's strstr() and Java's indexOf().
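
The empty-needle convention can be checked directly against Java's String.indexOf. The following is a minimal sketch; the class name EmptyNeedleDemo and its strStr helper are hypothetical names used only for illustration, and the helper simply delegates to the standard library.

```java
class EmptyNeedleDemo {
    // Hypothetical helper: delegates to String.indexOf to show the convention.
    static int strStr(String haystack, String needle) {
        return haystack.indexOf(needle);
    }

    public static void main(String[] args) {
        System.out.println(strStr("hello", ""));    // empty needle: prints 0
        System.out.println(strStr("hello", "ll"));  // Example 1: prints 2
        System.out.println(strStr("aaaaa", "bba")); // Example 2: prints -1
    }
}
```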

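Approach (Java):

Brute force:

Complexity: time O(n*m) in the worst case, where n is the length of haystack and m is the length of needle; space O(1)

Scan string a from its first index, comparing each character with the first character of string b: a[i] == b[0]. When that is true, enter the inner loop and compare a, starting at character i+j, with character j of b: a[i+j] == b[j].

Code: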

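class Solution {
    public int strStr(String haystack, String needle) {
        // An empty needle matches at index 0 by convention.
        if (needle.isEmpty()) return 0;
        int haystackLen = haystack.length(), needleLen = needle.length();
        char firstChar = needle.charAt(0);

        // Only start positions that leave room for the whole needle are tried.
        for (int i = 0; i <= haystackLen - needleLen; i++) {
            if (haystack.charAt(i) == firstChar) {
                // First character matched; compare the remaining characters one by one.
                int j = 1;
                for (; j < needleLen; j++) {
                    if (haystack.charAt(i + j) != needle.charAt(j)) break;
                }
                if (j == needleLen) return i;
            }
        }
        return -1;
    }
}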
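
To see why the brute-force scan can cost on the order of n*m character comparisons, it helps to count them on an adversarial input. The sketch below is an illustration only: the class BruteForceWorstCase and its helpers are hypothetical names, and the scan is a plain naive search that counts every comparison, not the exact solution above.

```java
class BruteForceWorstCase {
    // Builds a string made of `count` copies of character c.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int k = 0; k < count; k++) sb.append(c);
        return sb.toString();
    }

    // Counts the character comparisons a naive scan performs.
    static long countComparisons(String haystack, String needle) {
        long comparisons = 0;
        int n = haystack.length(), m = needle.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m) {
                comparisons++;
                if (haystack.charAt(i + j) != needle.charAt(j)) break;
                j++;
            }
            if (j == m) break; // full match found, stop scanning
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // Every window matches 99 characters and fails only on the last one.
        String haystack = repeat('a', 1000);
        String needle = repeat('a', 99) + "b";
        System.out.println(countComparisons(haystack, needle)); // (1000 - 100 + 1) * 100 = 90100
    }
}
```

With haystack made of 1000 'a' characters and a needle of 99 'a' characters followed by 'b', each of the 901 windows costs 100 comparisons, which is the product behaviour that KMP avoids.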

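KMP algorithm:

Complexity: time O(n+m), space O(m)

The following figures are cited to aid understanding (image source: http://www.javashuo.com/article/p-ncuqifhj-eo.html ):

Note: in the figures, the string haystack is "BBC ABCDAB ABCDABCDABDE" and the pattern needle is "ABCDABD".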

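Step 1: start matching.

Step 2: matching reaches the first identical character.

Step 3: the two strings are compared character by character, moving right, until the pattern character D fails to match the space character, which ends this round.

Step 4: matching restarts, but not from the character B that follows the position found in step 2, because the six characters before the space have already matched the first six characters of the pattern. That is, the two characters AB before the pattern's C are the same as the two characters AB before the space, so the two strings can resume directly by comparing the space with the character C.

As the figures show, this skips the six haystack characters ABCDAB and the two needle characters AB in one move. The idea behind the optimization is clear.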
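
The jump in step 4 can be reproduced numerically with the classic partial-match table, where each entry is the length of the longest proper prefix that is also a suffix. The sketch below is an illustration under that assumption: the class PartialMatchTableDemo and the method buildLps are hypothetical names, and this unoptimized table is a different (though related) representation from the next array used in the article's code.

```java
import java.util.Arrays;

class PartialMatchTableDemo {
    // lps[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] buildLps(String pattern) {
        int[] lps = new int[pattern.length()];
        int len = 0; // length of the current matching prefix
        int i = 1;
        while (i < pattern.length()) {
            if (pattern.charAt(i) == pattern.charAt(len)) {
                lps[i++] = ++len;   // extend the matching prefix
            } else if (len > 0) {
                len = lps[len - 1]; // fall back to a shorter prefix
            } else {
                lps[i++] = 0;       // no prefix matches here
            }
        }
        return lps;
    }

    public static void main(String[] args) {
        int[] lps = buildLps("ABCDABD");
        System.out.println(Arrays.toString(lps)); // [0, 0, 0, 0, 1, 2, 0]

        int matched = 6;                 // "ABCDAB" matched before the space
        int resume = lps[matched - 1];   // 2: resume at needle index 2, the character C
        int shift = matched - resume;    // 4: the pattern slides 4 positions right
        System.out.println(resume + " " + shift); // 2 4
    }
}
```

After six matched characters the table entry is 2, so the pattern slides four positions and the comparison resumes at the needle's C against the space, as described in step 4.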

Code:

class Solution {
    public int strStr(String haystack, String needle) {
        if(needle.equals("")) return 0;
        int[] next = new int[needle.length()];
        getNext(next, needle);// build the next array
        // i is the pointer into the text haystack, j is the pointer into the pattern needle
        int i = 0, j = 0;
        while(i < haystack.length() && j < needle.length()){
            // j == -1 means the mismatch was at the pattern's first character: advance i and restart the pattern at 0
            if(j == -1 || haystack.charAt(i) == needle.charAt(j)){
                // on a match, advance both pointers and compare the next characters
                i++;j++;
            } else {
                j = next[j];// on a mismatch, fall back to next[j] to avoid re-comparing characters already matched
            }
        }
        return j == needle.length() ? i - j : -1;
    }

    private void getNext(int[] next, String needle){
        // k is the end of the matching prefix, which is also the length of the match
        // j is the end of the suffix being matched
        int k = -1, j = 0;
        next[0] = -1;
        while(j < needle.length() - 1){
            // k == -1: nothing matches, so restart the common prefix/suffix length
            // equal characters: extend the previous common prefix/suffix length by 1 and move on
            if (k == -1 || needle.charAt(j) == needle.charAt(k)){
                k++;j++;
                if (needle.charAt(j) != needle.charAt(k))
                    next[j] = k;
                else
                    // p[j] == p[next[j]] must be avoided: falling back to k would repeat the same failing comparison, so reuse next[k] instead
                    next[j] = next[k];
            } else {
                // otherwise shorten the prefix to next[k]
                k = next[k];
            }
        }
    }
}

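Summary:

The direction of the KMP optimization is clear; the main difficulty lies in computing and understanding the next array. KMP is not the focus of this article. For a deeper treatment, this post is recommended: http://www.javashuo.com/article/p-ncuqifhj-eo.html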

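There is also the Sunday algorithm. It aligns the pattern with a window of the source string of the same length and, when the window does not match, inspects the source character immediately after that window. Its central idea is:

If that character does not occur in the pattern, shift right by pattern length + 1, moving past the character. (No source substring of the same length that contains the character can match.)

If that character does occur in the pattern, shift by the distance from its rightmost occurrence in the pattern to the end of the pattern, plus 1.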

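For example, when the haystack window "BBC ABC" is aligned with the needle "ABCDABD", the space character in the haystack never occurs in the needle, so no substring of the same length that contains the space can match, and the alignments covering it can be skipped outright.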
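
The two shift rules can be turned into a short search routine. The following is a minimal sketch, assuming the shift is always decided by the haystack character immediately after the current window; the class name SundaySearch and the shift map are hypothetical names chosen for illustration, not code from the original article.

```java
import java.util.HashMap;
import java.util.Map;

class SundaySearch {
    static int strStr(String haystack, String needle) {
        int n = haystack.length(), m = needle.length();
        if (m == 0) return 0;

        // shift.get(c) = distance from the rightmost c in needle to its end, plus 1.
        // Later occurrences overwrite earlier ones, so the rightmost one wins.
        Map<Character, Integer> shift = new HashMap<>();
        for (int k = 0; k < m; k++) {
            shift.put(needle.charAt(k), m - k);
        }

        int i = 0; // start of the current window in haystack
        while (i + m <= n) {
            // Compare the window with the needle character by character.
            int j = 0;
            while (j < m && haystack.charAt(i + j) == needle.charAt(j)) {
                j++;
            }
            if (j == m) return i;

            // No character follows the last possible window: stop.
            if (i + m >= n) break;

            // Characters absent from the needle shift by m + 1.
            char after = haystack.charAt(i + m);
            i += shift.getOrDefault(after, m + 1);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(strStr("BBC ABCDAB ABCDABCDABDE", "ABCDABD")); // 15
        System.out.println(strStr("hello", "ll"));                        // 2
        System.out.println(strStr("aaaaa", "bba"));                       // -1
    }
}
```

A hash map keeps the sketch independent of the character set; for plain ASCII input an int array indexed by character code would serve the same purpose.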

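Python3:

Note: the two approaches above work in any language and differ only in syntax, so they are not repeated in Python 3. Only a few Python-specific shortcuts are shown here.

Use the built-in str.find() method to get the result directly.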

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        return haystack.find(needle)

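Description of the find() method

find() checks whether the string contains the substring str. If beg (start) and end are given, the search is limited to that range. If the substring is found, the index where it starts in the string is returned; otherwise -1 is returned. With the default range, an empty substring returns 0.

Syntax
str.find(str, beg=0, end=len(string))
Parameters

str -- the substring to search for

beg -- start index, defaults to 0.

end -- end index, defaults to the length of the string.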

Using Python 3 string slicing:

class Solution:
    def strStr(self, haystack: str, needle: str) -> int:
        for i in range(len(haystack)-len(needle)+1):
            if haystack[i:i+len(needle)]==needle:# compare the slice of needle's length starting at i
                return i
        return -1

Note: Chapter 32 of Introduction to Algorithms, "String Matching", is devoted entirely to this topic.
