Problem with HttpWebRequest (503)

Keywords: problem, HttpWebRequest | Updated: 2023-09-27 17:59:54

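I am using HttpWebRequest to fetch results from Google, going through proxies to get the data. There is a strange problem: for some queries it returns data, while for others it throws the exception "The remote server returned an error: (503) Server Unavailable." One might think the proxy is bad, but when you configure the same proxy in Internet Explorer and open Google, the page loads with no 503 error. HttpWebRequest, however, fails on certain queries. For example, if you request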

http://www.google.com/search?q=site:http://www.yahoo.com 

it throws the exception, whereas if you request

http://www.google.com/search?q=info:http://www.yahoo.com

it works.

My code so far is:

HttpWebRequest request = (HttpWebRequest)WebRequest.Create(file);
request.ProtocolVersion = HttpVersion.Version11;
request.Method = "GET";
request.KeepAlive = false;
request.ContentType = "text/html";
request.Timeout = 1000000000;
request.ReadWriteTimeout = 1000000000;
request.UseDefaultCredentials = true;
request.Credentials = CredentialCache.DefaultCredentials;
Uri newUri = new Uri("http://" + proxy[selectedProxy].ProxyAddress.Trim() + "/");
WebProxy myProxy = new WebProxy();
myProxy.Credentials = CredentialCache.DefaultCredentials;
myProxy.Address = newUri;
request.Proxy = myProxy;
WebResponse response = request.GetResponse();
// System.Threading.Thread.Sleep(Delay);
StreamReader reader = null;
string data = null;
reader = new StreamReader(response.GetResponseStream());
data = reader.ReadToEnd();

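You are getting the "sorry, you look like a spam bot" message, and you need to enter a CAPTCHA to continue, or change proxies. When the server answers with a 503, HttpWebRequest.GetResponse throws a WebException instead of returning the response, so by default you never see the page content, although a browser doing the same request shows it to you.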

That's weird. Maybe it is a URL encoding issue. Try the following, which should take care of encoding everything properly:

using System;
using System.Net;
using System.Web; // HttpUtility; requires a reference to the System.Web assembly
class Program
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            // Placeholder proxy address; substitute your own.
            var newUri = new Uri("http://proxy.foo.com/");
            var myProxy = new WebProxy();
            myProxy.Credentials = CredentialCache.DefaultCredentials;
            myProxy.Address = newUri;
            client.Proxy = myProxy;
            // The collection returned by ParseQueryString URL-encodes its values in ToString().
            var query = HttpUtility.ParseQueryString(string.Empty);
            query["q"] = "info:http://www.yahoo.com";
            var url = new UriBuilder("http://www.google.com/search");
            url.Query = query.ToString();
            Console.WriteLine(client.DownloadString(url.ToString()));
        }
    }
}
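
To make the encoding step visible, here is a minimal sketch that builds the same kind of URL with Uri.EscapeDataString, which avoids the System.Web dependency. The class name SearchUrl and the method Build are hypothetical, chosen only for illustration; whether encoding is actually the cause of the 503 is not established in this thread.

```csharp
using System;

static class SearchUrl
{
    // Hypothetical helper: percent-encodes the query text so that the ':'
    // and '/' inside it cannot be mistaken for URL structure.
    public static string Build(string query)
    {
        return "http://www.google.com/search?q=" + Uri.EscapeDataString(query);
    }
}
```

Calling SearchUrl.Build("site:http://www.yahoo.com") yields http://www.google.com/search?q=site%3Ahttp%3A%2F%2Fwww.yahoo.com, which can be passed to WebRequest.Create or WebClient.DownloadString.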

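It depends on how often you send queries to Google from the same IP address. If you send queries too quickly, Google will block that IP address. When this happens, Google returns a 503 error and redirects to its "sorry" page.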
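
One way to act on this is to slow down after each block. The sketch below computes a wait time that doubles with every consecutive 503; the class name Throttle, the method DelayFor, and the numbers (5 seconds base, 300 seconds cap) are assumptions made for illustration, not limits published by Google.

```csharp
using System;

static class Throttle
{
    // Hypothetical policy: start at baseSeconds and double the wait after
    // each consecutive 503, never exceeding maxSeconds.
    public static TimeSpan DelayFor(int consecutiveFailures,
                                    double baseSeconds = 5,
                                    double maxSeconds = 300)
    {
        double seconds = baseSeconds * Math.Pow(2, Math.Max(0, consecutiveFailures));
        return TimeSpan.FromSeconds(Math.Min(seconds, maxSeconds));
    }
}
```

Before each request you could call System.Threading.Thread.Sleep(Throttle.DelayFor(failures)), resetting failures to 0 after a successful response and incrementing it after a 503.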

Do this:

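try
{
    response = (HttpWebResponse)webRequest.GetResponse();
}
catch (WebException ex)
{
    // ex.Response is null when no HTTP response arrived (e.g. the proxy is unreachable).
    if (ex.Response == null) throw;
    using (var sr = new StreamReader(ex.Response.GetResponseStream()))
    {
        // Body of the error page sent along with the 503.
        var html = sr.ReadToEnd();
    }
}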

When debugging, inspect the value of the html variable. You will see that it is an HTML page in which you are expected to fill in a CAPTCHA code.
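
Building on that, a caller may want to know programmatically that it has been blocked so it can switch proxies. The following is a rough sketch under that assumption; BlockDetector, IsBlocked, and Fetch are hypothetical names, and treating every 503 as a block is a simplification.

```csharp
using System;
using System.IO;
using System.Net;

static class BlockDetector
{
    // Treats 503 as "blocked", since that is the status the answers above
    // associate with Google's CAPTCHA page.
    public static bool IsBlocked(HttpStatusCode status)
    {
        return status == HttpStatusCode.ServiceUnavailable;
    }

    // Fetches url through the given proxy and returns the body. For an HTTP
    // error status the error page body is returned instead, and 'blocked'
    // reports whether that status was 503.
    public static string Fetch(string url, IWebProxy proxy, out bool blocked)
    {
        blocked = false;
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Proxy = proxy;
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException ex)
        {
            var errorResponse = ex.Response as HttpWebResponse;
            if (errorResponse == null) throw; // no HTTP response at all
            using (errorResponse)
            using (var reader = new StreamReader(errorResponse.GetResponseStream()))
            {
                blocked = IsBlocked(errorResponse.StatusCode);
                return reader.ReadToEnd();
            }
        }
    }
}
```

When blocked comes back true, the caller can move on to the next proxy in its list or wait before retrying.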