Searching for subsequences using regular expressions | C++

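I want to search my string for sequences of 0s that start and end with a 1. For example,

for 100001 the function should print: 100001
for 1000101 the function should print: 10001 and 101

I tried to do this with a regular expression, but my code does not manage it.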

#include <iostream>
#include <regex>



int main(int argc,char * argv[]){

     std::string number(argv[1]);
     std::regex searchedPattern("1?[0]+1");

     std::smatch sMatch;

     std::regex_search(number,sMatch,searchedPattern);

     for(auto& x : sMatch){
         std::cout << x << std::endl;
     }

     return 0;
}

The commands I use to compile and run the code on Linux (Ubuntu 18.04):

g++ Cpp_Version.cpp -std=c++14 -o exec
./exec 1000101

g++ version:

g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The output is:

10001

I suspect my pattern is wrong. Any ideas on how to improve it?

Solution

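std::regex_search does not find all matches; it stops at the first one. Use std::sregex_iterator instead. Its documentation states (emphasis mine):

On construction, and on every increment, it calls std::regex_search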

#include <cstdlib>  // EXIT_FAILURE, EXIT_SUCCESS
#include <iostream> // std::cout, std::cerr
#include <iterator> // std::distance
#include <regex>    // std::regex, std::smatch, std::regex_search, std::sregex_iterator
#include <string>   // std::string

int main(int argc,char **argv) {
    if (argc < 2) {
        std::cerr << "./a.out 1000101" << std::endl;
        return EXIT_FAILURE;
    }
    std::string n{argv[1]};
    std::regex p{"(?=(1[0]+1))"};
    std::smatch m;
    if (false == std::regex_search(n,m,p)) {
        std::cerr << "regex_search has no match!" << std::endl;
        return EXIT_FAILURE;
    }
    std::cout << "regex_search found " << m.size() << " matches! But this is misleading...\n";
    for (const auto & field : m) {
        const auto begin = std::distance(n.cbegin(),field.first);
        const auto end = begin + std::distance(field.first,field.second);
        std::cout
            << "[" << begin << "," << end << "]\t"
            << field << "\n";
    }
    std::cout << "Unfortunately `sregex_iterator` can't tell you how many matches.\n";
    for (std::sregex_iterator it{n.cbegin(),n.cend(),p},end{}; it != end; ++it) {
        m = *it;
        // m[0] is the match of the lookahead itself. It is always empty, but the lookahead is what allows overlapping matches.
        // m[1] is the capture group that holds your actual pattern.
        for (const auto & field : m) {
            const auto begin = std::distance(n.cbegin(),field.first);
            const auto end = begin + std::distance(field.first,field.second);
            std::cout
                << "[" << begin << "," << end << "]\t"
                << field << "\n";
        }
    }
    return EXIT_SUCCESS;
}

Here is the output:

$ g++ --version
g++ (GCC) 10.2.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ g++ -std=c++20 -O2 -Wall -pedantic example.cpp && ./a.out 1000100101
regex_search found 2 matches! But this is misleading...
[0,0]
[0,5]   10001
Unfortunately `sregex_iterator` can't tell you how many matches.
[0,5]   10001
[4,4]
[4,8]   1001
[7,7]
[7,10]  101