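Parsing a YouTube URL with comma-separated video IDs using Regex
Keywords: CSV, video, ID, url, parsing, YouTube, Regex | Updated: 2023-09-27 18:03:17
I am trying to extract YouTube video IDs that are passed in comma-separated form.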
url: http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io
Expected output:
ClcKC_U_7fM, ujmoYyEyDP8, cwRFjWdxeRQ, Z4BKV121mP4, T241s7O7-Io
I tried the following:
Regex regexPattern = new Regex(@"""[^""\r\n]*""|'[^'\r\n]*'|[^,\r\n]*");
Match matchResults = regexPattern.Match(url);
while (matchResults.Success)
{
    Console.WriteLine(matchResults.Value);
    matchResults = matchResults.NextMatch();
}
Output:
http://www.youtube.com/watch?v=ClcKC_U_7fM
ujmoYyEyDP8
cwRFjWdxeRQ
Z4BKV121mP4
T241s7O7-Io
I also tried another approach:
var regex = new Regex(@"(?:.+?)?(?:\\/v\\/|watch\\/|\\?v=|\\&v=|youtu\\.be\\/|\\/v=|^youtu\\.be\\/)([a-zA-Z0-9_-]{11})+");
foreach (Match match in regex.Matches(url))
{
    //Console.WriteLine(match);
    // Skip groups that still contain the URL prefix and print only the captured ID.
    foreach (var groupdata in match.Groups.Cast<Group>().Where(groupdata =>
        !groupdata.ToString().StartsWith("http://") &&
        !groupdata.ToString().StartsWith("https://") &&
        !groupdata.ToString().StartsWith("youtu") &&
        !groupdata.ToString().StartsWith("www.")))
    {
        Console.WriteLine(groupdata.ToString());
    }
}
Output:
ClcKC_U_7fM
Any ideas on how to get the following result?
ClcKC_U_7fM, ujmoYyEyDP8, cwRFjWdxeRQ, Z4BKV121mP4, T241s7O7-Io
Update
I forgot to mention that the URL can follow several patterns:
1) http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io
2) http://www.youtube.com/embed/watch?feature=player_embedded&v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io
3) http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io&feature=related
Thanks
Using the original code the OP posted, replace the regular expression with (?<=(v=)|,)[^(,|&)]*
string url = "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io";
// Match any run of characters that is preceded by "v=" or "," and stops at the next ',' or '&'.
Regex regexPattern = new Regex("(?<=(v=)|,)[^(,|&)]*");
Match matchResults = regexPattern.Match(url);
while (matchResults.Success)
{
    Console.WriteLine(matchResults.Value);
    matchResults = matchResults.NextMatch();
}
Output:
ClcKC_U_7fM
ujmoYyEyDP8
cwRFjWdxeRQ
Z4BKV121mP4
T241s7O7-Io
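The same pattern can be checked against all three URL shapes from the update and joined into the comma-separated form the question asks for. This is only a minimal sketch: the class name `VideoIdRegexDemo` and the helper `ExtractIds` are hypothetical names introduced for illustration, and the pattern is the one from the answer above, unchanged.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class VideoIdRegexDemo
{
    // Pattern from the answer above: text preceded by "v=" or ",",
    // running up to the next ',' or '&' (or the end of the string).
    private static readonly Regex IdPattern = new Regex("(?<=(v=)|,)[^(,|&)]*");

    // Hypothetical helper: returns the matched IDs joined with ", ".
    public static string ExtractIds(string url)
    {
        var ids = IdPattern.Matches(url)
            .Cast<Match>()
            .Select(m => m.Value)
            .Where(v => v.Length > 0); // ignore empty matches, e.g. after a trailing comma
        return string.Join(", ", ids);
    }

    public static void Main()
    {
        string[] urls =
        {
            "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io",
            "http://www.youtube.com/embed/watch?feature=player_embedded&v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io",
            "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io&feature=related"
        };
        foreach (string url in urls)
        {
            Console.WriteLine(ExtractIds(url));
        }
    }
}
```

Note that the lookbehind accepts any comma, so a comma elsewhere in the URL (for example in another query parameter) would also start a match; the sketch assumes URLs shaped like the three listed in the question.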
It may be easier to first split the URL at the = sign and then use a simple string split on the commas.
string url = "http://www.youtube.com/embed/watch?feature=player_embedded&v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io";
// Separate the query string from the rest of the URL, then break it into key=value pairs.
string[] s1 = url.Split('?');
string[] queries = s1[1].Split('&');
foreach (string query in queries)
{
    // Only the "v" parameter carries the video IDs.
    if (query.ToLower().StartsWith("v="))
    {
        string[] s2 = query.Split('=');
        string[] s3 = s2[1].Split(',');
        foreach (string s in s3)
        {
            Console.WriteLine(s);
        }
        break;
    }
}
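The snippet above throws if the URL has no ? in it, because s1[1] does not exist. A slightly more defensive variant of the same splitting idea could let System.Uri isolate the query string first. This is a sketch under assumptions, not a definitive implementation: `VideoIdQueryDemo` and `GetVideoIds` are hypothetical names, and it assumes the IDs always arrive in a parameter named v.

```csharp
using System;
using System.Collections.Generic;

public static class VideoIdQueryDemo
{
    // Hypothetical helper: returns the comma-separated values of the "v" query parameter,
    // or an empty list if the URL is not absolute or has no "v" parameter.
    public static List<string> GetVideoIds(string url)
    {
        var result = new List<string>();
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri))
        {
            return result;
        }

        // uri.Query includes the leading '?', for example "?feature=...&v=...".
        string query = uri.Query.TrimStart('?');
        foreach (string pair in query.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int eq = pair.IndexOf('=');
            if (eq <= 0)
            {
                continue; // no key, or no '=' at all
            }

            string key = pair.Substring(0, eq);
            if (!key.Equals("v", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            // Decode percent-escapes (such as %2C for a comma) before splitting.
            string value = Uri.UnescapeDataString(pair.Substring(eq + 1));
            result.AddRange(value.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));
            break;
        }
        return result;
    }

    public static void Main()
    {
        string url = "http://www.youtube.com/watch?v=ClcKC_U_7fM,ujmoYyEyDP8,cwRFjWdxeRQ,Z4BKV121mP4,T241s7O7-Io&feature=related";
        Console.WriteLine(string.Join(", ", GetVideoIds(url)));
    }
}
```

Matching the key exactly (rather than with StartsWith) avoids picking up other parameters whose names merely begin with v.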