Problem with HttpWebRequest (503)
Keywords: problem, HttpWebRequest | Updated: 2023-09-27 17:59:54
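I am using HttpWebRequest to fetch results from Google, and I go through a proxy to get the data. There is a strange problem: some queries return data, while others throw the exception "The remote server returned an error: (503) Server Unavailable."
One might think the proxy is bad, but when I configure the same proxy in Internet Explorer and open Google, the page loads with no 503 error. HttpWebRequest, however, fails on certain queries. For example, requesting
http://www.google.com/search?q=site:http://www.yahoo.com
throws the exception, whereas requesting
http://www.google.com/search?q=info:http://www.yahoo.com
works.
My code so far is: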
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(file);
request.ProtocolVersion = HttpVersion.Version11;
request.Method = "GET";
request.KeepAlive = false;
request.ContentType = "text/html";
request.Timeout = 1000000000;
request.ReadWriteTimeout = 1000000000;
request.UseDefaultCredentials = true;
request.Credentials = CredentialCache.DefaultCredentials;
Uri newUri = new Uri("http://" + proxy[selectedProxy].ProxyAddress.Trim() + "/");
WebProxy myProxy = new WebProxy();
myProxy.Credentials = CredentialCache.DefaultCredentials;
myProxy.Address = newUri;
request.Proxy = myProxy;
WebResponse response = request.GetResponse();
// System.Threading.Thread.Sleep(Delay);
StreamReader reader = null;
string data = null;
reader = new StreamReader(response.GetResponseStream());
data = reader.ReadToEnd();
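You are getting a "Sorry, you appear to be a spam bot" page and need to enter a captcha to continue, or switch to a different proxy. When the server answers with 503, GetResponse() throws a WebException instead of returning the response, so you do not get the page content by default, although a browser making the same request displays that content.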
That is strange. It may be a URL-encoding issue. Try the following, which should take care of encoding everything correctly:
using System;
using System.Net;
using System.Web; // HttpUtility lives here; requires a reference to System.Web

class Program
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            // Placeholder proxy address; replace with the real proxy.
            var newUri = new Uri("http://proxy.foo.com/");
            var myProxy = new WebProxy();
            myProxy.Credentials = CredentialCache.DefaultCredentials;
            myProxy.Address = newUri;
            client.Proxy = myProxy;

            // The collection returned by ParseQueryString URL-encodes values in ToString().
            var query = HttpUtility.ParseQueryString(string.Empty);
            query["q"] = "info:http://www.yahoo.com";

            var url = new UriBuilder("http://www.google.com/search");
            url.Query = query.ToString();

            Console.WriteLine(client.DownloadString(url.ToString()));
        }
    }
}
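To make the encoding step concrete, here is a minimal sketch that builds the same kind of URL without System.Web. The helper name BuildSearchUrl is hypothetical and only for illustration; Uri.EscapeDataString is the standard call that percent-encodes the colon and slashes inside the query value. Whether encoding changes how Google responds is not established in this thread.

```csharp
using System;

static class SearchUrl
{
    // Hypothetical helper: builds a Google search URL with the q value percent-encoded.
    public static string BuildSearchUrl(string query)
    {
        return "http://www.google.com/search?q=" + Uri.EscapeDataString(query);
    }
}

class EncodingDemo
{
    static void Main()
    {
        // The ':' and '/' characters in the value become %3A and %2F.
        Console.WriteLine(SearchUrl.BuildSearchUrl("site:http://www.yahoo.com"));
        // prints http://www.google.com/search?q=site%3Ahttp%3A%2F%2Fwww.yahoo.com
    }
}
```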
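It depends on how often you send queries to Google from the same IP address. If you send queries too quickly, Google blocks that IP address. When this happens, Google returns a 503 error and redirects to its "sorry" page.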
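One way to keep the request rate down is to enforce a minimum gap between requests sent through the same proxy. The sketch below is only an illustration: the Throttle class, its method names, and the 10-second interval are assumptions, and the thread does not say what rate Google tolerates.

```csharp
using System;
using System.Threading;

// Hypothetical helper: enforces a minimum gap between consecutive requests.
class Throttle
{
    private readonly TimeSpan minInterval;
    private DateTime? lastSentUtc;

    public Throttle(TimeSpan minInterval)
    {
        this.minInterval = minInterval;
    }

    // Records the time at which a request was sent.
    public void MarkSent(DateTime nowUtc)
    {
        lastSentUtc = nowUtc;
    }

    // Returns how long to wait before the next request may be sent.
    public TimeSpan DelayNeeded(DateTime nowUtc)
    {
        if (lastSentUtc == null)
            return TimeSpan.Zero;
        TimeSpan elapsed = nowUtc - lastSentUtc.Value;
        return elapsed >= minInterval ? TimeSpan.Zero : minInterval - elapsed;
    }

    // Sleeps until the minimum gap has passed, then records the send time.
    public void WaitTurn()
    {
        TimeSpan delay = DelayNeeded(DateTime.UtcNow);
        if (delay > TimeSpan.Zero)
            Thread.Sleep(delay);
        MarkSent(DateTime.UtcNow);
    }
}

class ThrottleDemo
{
    static void Main()
    {
        // Assumed interval; tune it for your own situation.
        var throttle = new Throttle(TimeSpan.FromSeconds(10));
        throttle.WaitTurn(); // first call returns immediately
        Console.WriteLine("Send the first request here.");
    }
}
```

Calling WaitTurn() before each GetResponse() serves the same purpose as the commented-out Thread.Sleep(Delay) in the question, but only sleeps for the part of the interval that has not already elapsed.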
To see what Google actually returned, catch the exception and read the response body:
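try
{
    response = (HttpWebResponse)webRequest.GetResponse();
}
catch (WebException ex)
{
    // ex.Response is null when no HTTP response arrived (timeout, DNS failure, ...).
    if (ex.Response == null)
        throw;

    using (var sr = new StreamReader(ex.Response.GetResponseStream()))
    {
        // For a 503 this is the body of the error page the server sent.
        var html = sr.ReadToEnd();
    }
}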
While debugging, inspect the value of the html variable. You will see that it is an HTML page asking you to fill in a captcha code.
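Putting the pieces together, here is a hedged sketch of a helper that returns both the status code and the body, whether or not the request succeeded. The names Fetcher, Fetch, and LooksBlocked are hypothetical, and treating every 503 as a block is an assumed policy rather than something the thread confirms.

```csharp
using System;
using System.IO;
using System.Net;

static class Fetcher
{
    // Hypothetical helper: returns status code and body even for error statuses such as 503.
    public static Tuple<HttpStatusCode, string> Fetch(HttpWebRequest request)
    {
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
                return Tuple.Create(response.StatusCode, ReadBody(response));
        }
        catch (WebException ex)
        {
            var errorResponse = ex.Response as HttpWebResponse;
            if (errorResponse == null)
                throw; // no HTTP response at all (timeout, DNS failure, ...)

            using (errorResponse)
                return Tuple.Create(errorResponse.StatusCode, ReadBody(errorResponse));
        }
    }

    private static string ReadBody(WebResponse response)
    {
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
            return reader.ReadToEnd();
    }

    // Assumed policy: a 503 means the current IP or proxy has been blocked.
    public static bool LooksBlocked(HttpStatusCode status)
    {
        return status == HttpStatusCode.ServiceUnavailable;
    }
}

class FetchDemo
{
    static void Main(string[] args)
    {
        // Placeholder URL; pass the real search URL as the first argument.
        string url = args.Length > 0 ? args[0] : "http://www.example.com/";
        var request = (HttpWebRequest)WebRequest.Create(url);

        var result = Fetcher.Fetch(request);
        Console.WriteLine((int)result.Item1);
        if (Fetcher.LooksBlocked(result.Item1))
            Console.WriteLine("Blocked: switch to another proxy or slow down.");
    }
}
```

Returning the status alongside the body lets the caller decide what to do with a 503 (rotate the proxy, wait, or show the captcha page) instead of losing the page content to the exception.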