How to specify that HTTP status code 304 (NotModified) is not an error condition in the Amazon S3 GetObject API
Updated: 2023-09-27 18:10:52
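Background
I am trying to use S3 as an "infinitely large" caching layer for some "fairly" static XML documents. I want to make sure that the client application (which will run on thousands of machines simultaneously and request the XML documents multiple times per hour) only downloads these XML documents if their content has changed since the last time the client application downloaded them.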
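Approach
On Amazon S3, we can use the HTTP ETag for this. By default, the ETag of an Amazon S3 object is set to the MD5 hash of that object.
We can then specify the MD5 hash of the XML document in the GetObjectRequest.ETagToNotMatch property. This ensures that when we make the AmazonS3.GetObject call (or, in my case, the asynchronous versions AmazonS3.BeginGetObject and AmazonS3.EndGetObject), if the requested document has the same MD5 hash as the one contained in GetObjectRequest.ETagToNotMatch, S3 automatically returns HTTP status code 304 (NotModified) and the actual content of the XML document is not downloaded.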
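As a minimal sketch of the client-side half of this approach, the snippet below computes the MD5 hex digest of a locally cached document so it can be passed as the ETag to compare against. The names ETagHelper, ComputeMd5Hex, and MatchesETag are hypothetical and not part of the AWS SDK. It assumes the object's ETag is a plain MD5 digest, which does not hold for objects uploaded via multipart upload, and it strips the double quotes that ETag header values usually carry.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; not part of the AWS SDK.
public static class ETagHelper
{
    // Returns the lowercase hexadecimal MD5 digest of the given bytes.
    public static string ComputeMd5Hex(byte[] content)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(content);
            var sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }

    // Compares local content against an ETag value.
    // Assumption: the ETag is a plain MD5 digest, possibly wrapped in double quotes.
    public static bool MatchesETag(byte[] content, string etag)
    {
        if (etag == null)
        {
            return false;
        }
        string normalized = etag.Trim('"');
        return string.Equals(ComputeMd5Hex(content), normalized,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

The value returned by ComputeMd5Hex is what would be assigned to GetObjectRequest.ETagToNotMatch before issuing the request.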
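Problem
The problem, however, is that when calling AmazonS3.GetObject (or its asynchronous equivalent), the Amazon .NET API actually treats HTTP status code 304 (NotModified) as an error, retries the GET request three times, and then finally throws Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3.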
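Obviously, I could change this implementation to call AmazonS3.GetObjectMetaData, compare the ETags, and then call AmazonS3.GetObject if they do not match, but then there are two requests to S3 instead of one whenever the file is stale. I would prefer a single request regardless of whether the XML document needs to be downloaded.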
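To make the cost of that metadata-first workaround concrete, here is a rough sketch. IObjectStore, TwoStepDownloader, and CountingStore are hypothetical stand-ins written for illustration, not AWS SDK types; GetETag plays the role of the metadata request and GetContent the role of the full download.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the two S3 calls; not an AWS SDK type.
public interface IObjectStore
{
    string GetETag(string key);     // metadata-only request
    byte[] GetContent(string key);  // full download
}

public static class TwoStepDownloader
{
    // Returns null when the cached ETag still matches, otherwise the fresh content.
    public static byte[] DownloadIfChanged(IObjectStore store, string key, string cachedETag)
    {
        string currentETag = store.GetETag(key);  // request 1, always issued
        if (string.Equals(currentETag, cachedETag, StringComparison.Ordinal))
        {
            return null;                          // cached copy is still current
        }
        return store.GetContent(key);             // request 2, only when stale
    }
}

// In-memory fake that counts how many requests were made.
public sealed class CountingStore : IObjectStore
{
    private readonly Dictionary<string, Tuple<string, byte[]>> _objects =
        new Dictionary<string, Tuple<string, byte[]>>();

    public int RequestCount { get; private set; }

    public void Put(string key, string etag, byte[] content)
    {
        _objects[key] = Tuple.Create(etag, content);
    }

    public string GetETag(string key)
    {
        RequestCount++;
        return _objects[key].Item1;
    }

    public byte[] GetContent(string key)
    {
        RequestCount++;
        return _objects[key].Item2;
    }
}
```

With this shape, an up-to-date document costs one request and a stale one costs two, which is exactly the overhead a single conditional GET would avoid.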
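Any ideas? Is this a bug, or am I missing something? Is there some way to reduce the number of retries to one and "handle" the exception (although that route feels "yucky" to me)?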
Implementation
I am using the AWS SDK for .NET (version 1.3.14).
Here is my implementation (trimmed slightly to keep it short):
public Task<GetObjectResponse> DownloadString(string key, string etag = null) {
    var request = new GetObjectRequest { Key = key, BucketName = Bucket };
    if (etag != null) {
        request.ETagToNotMatch = etag;
    }
    var task = Task<GetObjectResponse>.Factory.FromAsync(_s3Client.BeginGetObject, _s3Client.EndGetObject, request, null);
    return task;
}
I then call it like this:
var dlTask = s3Manager.DownloadString("new one", "d7db7bc318d6eb9222d728747879b52e");
var responseTasks = new[]
{
    dlTask.ContinueWith(x => _log.Error("Error downloading string.", x.Exception), TaskContinuationOptions.OnlyOnFaulted),
    dlTask.ContinueWith(x => _log.Warn("Downloading string was cancelled."), TaskContinuationOptions.OnlyOnCanceled),
    dlTask.ContinueWith(x => _log.Info(string.Format("Done with download: {0}", x.Result.ETag)), TaskContinuationOptions.OnlyOnRanToCompletion)
};
try {
    Task.WaitAny(responseTasks);
} catch (AggregateException aex) {
    _log.Error("Error while processing download string.", aex);
}
_log.Info("Exiting...");
This produces the following log output:
2011-10-11 13:21:20,376 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.6140812.
2011-10-11 13:21:20,385 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:20,789 [11] INFO Amazon.S3.AmazonS3Client - Retry number 1 for request GetObject.
2011-10-11 13:21:22,329 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.1400356.
2011-10-11 13:21:22,329 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:23,929 [11] INFO Amazon.S3.AmazonS3Client - Retry number 2 for request GetObject.
2011-10-11 13:21:26,508 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:00.9790314.
2011-10-11 13:21:26,508 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:32,908 [11] INFO Amazon.S3.AmazonS3Client - Retry number 3 for request GetObject.
2011-10-11 13:21:40,604 [11] INFO Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.2950718.
2011-10-11 13:21:40,605 [11] INFO Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:40,621 [11] ERROR Amazon.S3.AmazonS3Client - Error for GetResponse
Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
2011-10-11 13:21:40,635 [10] INFO Example.Program - Exiting...
2011-10-11 13:21:40,638 [19] ERROR Example.Program - Error downloading string.
System.AggregateException: One or more errors occurred. ---> Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)
--- End of inner exception stack trace ---
---> (Inner Exception #0) Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)<---
I also posted this question on the Amazon developer forums and received a reply from an official AWS employee:
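After investigating, we understand the problem, but we are looking for feedback on how best to handle it.
The first approach is to have GetObjectResponse indicate that the object was not returned, or to set the output stream to null. This would be cleaner code, but it does create a slight breaking change for anyone relying on an exception being thrown, albeit after 3 retries. It would also be inconsistent with the CopyObject operation, which does throw an exception without all the crazy retries.
The other option is for us to throw an exception similar to CopyObject, which keeps us consistent with no breaking changes, but is more difficult to code against.
If anybody has opinions on how to handle this, please reply to this thread.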
Norm
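To illustrate what "more difficult to code against" could look like under the second option in the reply, here is a rough sketch of calling code that translates a 304 exception into a "not modified" result. StorageException and ConditionalGet are hypothetical names invented for this example; they are not AWS SDK types, and the sketch only assumes that the thrown exception exposes the HTTP status code.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for an SDK exception that carries the HTTP status; not an AWS SDK type.
public class StorageException : Exception
{
    public HttpStatusCode StatusCode { get; private set; }

    public StorageException(HttpStatusCode statusCode, string message)
        : base(message)
    {
        StatusCode = statusCode;
    }
}

public static class ConditionalGet
{
    // Runs the supplied download delegate.
    // Returns true with the content when the download succeeds,
    // false with null content when the failure was a 304 (NotModified),
    // and rethrows every other exception unchanged.
    public static bool TryDownload(Func<byte[]> download, out byte[] content)
    {
        try
        {
            content = download();
            return true;
        }
        catch (StorageException ex)
        {
            if (ex.StatusCode != HttpStatusCode.NotModified)
            {
                throw;
            }
            content = null;
            return false;
        }
    }
}
```

The first option in the reply would remove the need for this wrapper, since the caller could simply check the response instead of catching an exception.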
I have added my thoughts to the thread. If anyone else is interested in participating, here is the link:
AmazonS3.GetObject treats HTTP 304 (NotModified) as an error. How to allow it?
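Note: once Amazon resolves this issue, I will update my answer to reflect the outcome.
Update (2012-01-24): Still waiting for further information from Amazon.
Update (2018-12-06): This issue was fixed in 2013 in AWS SDK 1.5.20: https://forums.aws.amazon.com/thread.jspa?threadID=77995&tstart=0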