The most appropriate way to get an absolute URL from a scraped URL

Keywords: url, way, get, scrape | Updated: 2023-09-27 18:17:54

Assume the root URL is the following:

http://www.monstermmorpg.com
Here are several example URLs as they appear in the scraped pages, followed by the target each one should resolve to:
url1: http://www.monstermmorpg.com/
url2: http://www.monstermmorpg.com/Register#21312
url3: Register#21312
url4: /Register
url5: Register
url6: /Register?news=true&news2=true
// There may be more forms that point to the same URL, but I don't have a full list at the moment.

I need a function that, with the help of the root URL, turns the URLs above into the following:
url1: http://www.monstermmorpg.com
url2: http://www.monstermmorpg.com/Register
url3: http://www.monstermmorpg.com/Register
url4: http://www.monstermmorpg.com/Register
url5: http://www.monstermmorpg.com/Register
url6: http://www.monstermmorpg.com/Register?news=true&news2=true

There is the approach below, but I don't think it is sufficient. Is there a better way?

C# .NET 4.5 WPF application
Uri baseUri = new Uri("http://www.contoso.com");
Uri myUri = new Uri(baseUri, "catalog/shownew.htm?date=today");
Console.WriteLine(myUri.AbsoluteUri);


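using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        var baseUrl = "http://www.monstermmorpg.com";
        var urls = new string[] {
            "http://www.monstermmorpg.com/",
            "http://www.monstermmorpg.com/Register#21312",
            "Register#21312",
            "/Register",
            "Register",
            "/Register?news=true&news2=true" };
        var absoluteUrls = new List<string>();
        foreach (var url in urls)
        {
            Uri uri;
            if (url.StartsWith("http", StringComparison.OrdinalIgnoreCase))
            {
                // Already absolute: parse it as-is.
                uri = new Uri(url);
            }
            else
            {
                // Relative: make sure it starts with "/" and append it to the root.
                var urlWithSlash = url.StartsWith("/") ? url : "/" + url;
                uri = new Uri(baseUrl + urlWithSlash);
            }
            // PathAndQuery never contains the #fragment. A bare root path ("/")
            // is dropped so that url1 has no trailing slash.
            var pathAndQuery = uri.PathAndQuery == "/" ? "" : uri.PathAndQuery;
            // GetLeftPart(UriPartial.Authority) returns scheme + host (+ port),
            // so the result keeps its "http://" prefix.
            absoluteUrls.Add(uri.GetLeftPart(UriPartial.Authority) + pathAndQuery);
        }
        // Now absoluteUrls contains
        //url1: http://www.monstermmorpg.com
        //url2: http://www.monstermmorpg.com/Register
        //url3: http://www.monstermmorpg.com/Register
        //url4: http://www.monstermmorpg.com/Register
        //url5: http://www.monstermmorpg.com/Register
        //url6: http://www.monstermmorpg.com/Register?news=true&news2=true
        foreach (var absoluteUrl in absoluteUrls)
            Console.WriteLine(absoluteUrl);
    }
}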
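
As an alternative to the manual string checks, the `Uri(Uri, string)` idea from the question can be combined with `Uri.GetLeftPart` to strip the fragment. The following is only a minimal sketch: `UrlNormalizer` and `ToAbsolute` are hypothetical names that do not appear in the original post, and returning `null` for input that cannot be parsed is an assumption about what a scraper would want.

```csharp
using System;

public static class UrlNormalizer
{
    // Hypothetical helper (not from the original post): resolves a scraped
    // URL against rootUrl and drops the #fragment.
    public static string ToAbsolute(string rootUrl, string scrapedUrl)
    {
        var baseUri = new Uri(rootUrl);

        // TryCreate resolves relative references against baseUri and leaves
        // absolute URLs unchanged; it returns false instead of throwing.
        Uri resolved;
        if (!Uri.TryCreate(baseUri, scrapedUrl, out resolved))
            return null; // assumption: the caller skips unparsable input

        // A bare root resolves to "http://host/"; return it without the
        // trailing slash so it matches url1 in the question.
        if (resolved.PathAndQuery == "/")
            return resolved.GetLeftPart(UriPartial.Authority);

        // UriPartial.Query keeps scheme, authority, path and query,
        // and leaves out the fragment.
        return resolved.GetLeftPart(UriPartial.Query);
    }
}

public static class Program
{
    public static void Main()
    {
        var root = "http://www.monstermmorpg.com";
        var urls = new[] {
            "http://www.monstermmorpg.com/",
            "http://www.monstermmorpg.com/Register#21312",
            "Register#21312",
            "/Register",
            "Register",
            "/Register?news=true&news2=true" };

        foreach (var url in urls)
            Console.WriteLine(UrlNormalizer.ToAbsolute(root, url));
    }
}
```

Because resolution is left to `Uri`, forms such as `../Register` or `./Register` should also be handled without extra string checks. Note that absolute URLs with other schemes (for example `mailto:` links) pass through unchanged, so a real scraper may want to filter on `resolved.Scheme` as well.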