A method that extracts a list of URLs matching a pattern

Keywords: method, url, list, pattern extraction | Updated: 2023-09-27 18:19:01

I am trying to write a method that uses Regex to extract a list of URLs matching the pattern @"http://(www\.)?([^\.]+)\.com". So far I have this:

public static List<string> Test(string url)
{
    const string pattern = @"http://(www\.)?([^\.]+)\.com";
    List<string> res = new List<string>();
    MatchCollection myMatches = Regex.Matches(url, pattern);
    foreach (Match currentMatch in myMatches)
    {
        // which statement belongs here?
    }
    return res;
}

The main question is which statement I should use inside the foreach loop:

res.Add(currentMatch.Groups.ToString());

or

res.Add(currentMatch.Value);

Thanks!


You only need the Value of each Match. In your code, you should use:

res.Add(currentMatch.Value);

Or use LINQ directly:

res = Regex.Matches(url, pattern).Cast<Match>()
           .Select(p => p.Value)
           .ToList();

res.Add(currentMatch.Groups.ToString()); would add the type name System.Text.RegularExpressions.GroupCollection rather than the matched text, so it looks like you did not test it.
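To make the difference concrete, the sketch below returns the three candidate strings side by side. GroupDemo and Describe are hypothetical names used only for this illustration. If you want just the part between the optional "www." and ".com", index into Groups instead of calling ToString() on the collection.

```csharp
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Same pattern as in the question.
    private const string Pattern = @"http://(www\.)?([^\.]+)\.com";

    // Returns { whole match, Groups.ToString(), second capture group },
    // or an empty array when the input contains no match.
    public static string[] Describe(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success)
            return new string[0];

        return new[]
        {
            m.Value,              // the full matched text
            m.Groups.ToString(),  // only the type name of the collection
            m.Groups[2].Value     // the text captured by ([^\.]+)
        };
    }
}
```

For the input "see http://www.example.com now", the three entries should be http://www.example.com, System.Text.RegularExpressions.GroupCollection, and example.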

How many matches do you expect from the url parameter?

I would write it like this:

static readonly Regex _domainMatcher = new Regex(@"http://(www\.)?([^\.]+)\.com", RegexOptions.Compiled);
public static bool IsValidDomain(string url)
{
    return _domainMatcher.Match(url).Success;
}

or

public static string ExtractDomain(string url)
{ 
    var match = _domainMatcher.Match(url);
    if(match.Success)
        return match.Value;
    else
        return string.Empty;
}

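Because the parameter is named url, it should hold a single URL.

If the input can contain more than one, and you want to extract every domain that matches the pattern: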

public static IEnumerable<string> ExtractDomains(string data)
{
    var result = new List<string>();
    var match = _domainMatcher.Match(data);
    while (match.Success)
    {
        result.Add(match.Value);
        match = match.NextMatch();
    }
    return result;
}

Note the return type IEnumerable<> rather than List<>, because the caller does not need to modify the result.

Or this lazy variant:

public static IEnumerable<string> ExtractDomains(string data)
{
    var match = _domainMatcher.Match(data);
    while (match.Success)
    {
        yield return match.Value;
        match = match.NextMatch();
    }
}
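The practical effect of the lazy variant is that the method body only runs while the caller enumerates, so a caller that needs just a few matches can stop early. A minimal sketch, assuming a hypothetical wrapper class LazyDomains and a hypothetical helper FirstMatches:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LazyDomains
{
    private static readonly Regex DomainMatcher =
        new Regex(@"http://(www\.)?([^\.]+)\.com", RegexOptions.Compiled);

    // Iterator method: nothing in the body executes until the result is enumerated.
    public static IEnumerable<string> ExtractDomains(string data)
    {
        var match = DomainMatcher.Match(data);
        while (match.Success)
        {
            yield return match.Value;
            match = match.NextMatch();
        }
    }

    // Hypothetical helper: returns at most 'count' matches from the lazy sequence.
    public static List<string> FirstMatches(string data, int count)
    {
        return ExtractDomains(data).Take(count).ToList();
    }
}
```

One trade-off to keep in mind: because execution is deferred, an invalid argument such as a null string raises its exception when the sequence is first enumerated, not when ExtractDomains is called.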