DeflateStream.ReadAsync (.NET 4.5 System.IO.Compression): number of bytes read
Updated: 2023-09-27 18:36:31
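While converting some older C# code to use async, I started seeing problems caused by differences in the return values of DeflateStream's Read() and ReadAsync() methods.
I assumed that moving from synchronous code like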
bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);
to its equivalent asynchronous version
bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);
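would always return the same value.
See the updated code added at the bottom of the question, which uses the stream the correct way and so makes the initial question moot.
I found that after a number of iterations this no longer holds, which in my case caused random errors in the converted application.
Am I missing something here?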
Below is a simple repro (a console application) in which the Assert fails for me in the ReadAsync method at iteration #412, giving output like this:
....
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync #412 - 453 bytes read
---- DEBUG ASSERTION FAILED ----
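My question is: why does the DeflateStream.ReadAsync method return 453 bytes at this point?
Note: this only happens with certain input strings. The long run of StringBuilder calls in CreateProblemDataString was the best way I could think of to build the string for this post.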
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Threading.Tasks;

class Program
{
    static byte[] DataAsByteArray;
    static int uncompressedSize;

    static void Main(string[] args)
    {
        string problemDataString = CreateProblemDataString();
        DataAsByteArray = Encoding.ASCII.GetBytes(problemDataString);
        uncompressedSize = DataAsByteArray.Length;
        MemoryStream memoryStream = new MemoryStream();
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Compress, true))
        {
            for (int i = 0; i < 1000; i++)
            {
                deflateStream.Write(DataAsByteArray, 0, uncompressedSize);
            }
        }
        // now read it back synchronously
        Read(memoryStream);
        // now read it back asynchronously
        Task retval = ReadAsync(memoryStream);
        retval.Wait();
    }

    static void Read(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = deflateStream.Read(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
                // assumes every call fills the whole buffer
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    static async Task ReadAsync(MemoryStream memoryStream)
    {
        memoryStream.Position = 0;
        using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
        {
            byte[] buffer = new byte[uncompressedSize];
            int bytesRead = -1;
            int i = 0;
            while (bytesRead > 0 || bytesRead == -1)
            {
                bytesRead = await deflateStream.ReadAsync(buffer, 0, uncompressedSize);
                System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
                // assumes every call fills the whole buffer; this is the assertion that fails
                System.Diagnostics.Debug.Assert(bytesRead == 0 || bytesRead == uncompressedSize);
                i++;
            }
        }
    }

    /// <summary>
    /// This is one of the strings of data that was causing issues.
    /// </summary>
    /// <returns></returns>
    static string CreateProblemDataString()
    {
        StringBuilder sb = new StringBuilder();
        sb.Append("0601051081 ");
        sb.Append(" ");
        sb.Append(" 225021 0300420");
        sb.Append("34056064070072076361102 13115016017");
        sb.Append("5 192 230237260250 2722");
        sb.Append("73280296 326329332 34535535");
        sb.Append("7 3 ");
        sb.Append(" 4");
        sb.Append(" ");
        sb.Append(" 50");
        sb.Append("6020009 030034045 063071076 360102 13");
        sb.Append("1152176160170 208206 23023726025825027227328");
        sb.Append("2283285 320321333335341355357 622005009 0");
        sb.Append("34053 060070 361096 130151176174178172208");
        sb.Append("210198 235237257258256275276280290293 3293");
        sb.Append("30334 344348350 ");
        sb.Append(" ");
        sb.Append(" ");
        sb.Append(" ");
        sb.Append(" 225020012014 046042044034061");
        sb.Append("075078 361098 131152176160170 208195210 230");
        sb.Append("231260257258271272283306 331332336 3443483");
        sb.Append("54 29 ");
        sb.Append(" ");
        sb.Append(" 2");
        sb.Append("5 29 06 0");
        sb.Append("1 178 17");
        sb.Append("4 205 2");
        sb.Append("05 195 2");
        sb.Append("31 231 23");
        sb.Append("7 01 01 0");
        sb.Append("2 260 26");
        sb.Append("2 274 2");
        sb.Append("72 274 01 01 0");
        sb.Append("3 1 5 3 6 43 52 ");
        return sb.ToString();
    }
}
Updated code that reads the stream into the buffer correctly
The output now looks like this:
...
ReadAsync #410 - 2055 bytes read
ReadAsync #411 - 2055 bytes read
ReadAsync PARTIAL #412 - 453 bytes read, offset for next read = 453
ReadAsync #412 - 1602 bytes read
ReadAsync #413 - 2055 bytes read
...
static void Read(MemoryStream memoryStream)
{
    memoryStream.Position = 0;
    using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
    {
        byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
        int bytesRead; // number of bytes read from Read operation
        int offset = 0; // offset for writing into buffer
        int i = -1; // counter to track iteration #
        while ((bytesRead = deflateStream.Read(buffer, offset, uncompressedSize - offset)) > 0)
        {
            offset += bytesRead; // offset in buffer for results of next reading
            System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
            if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
            {
                offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                i++; // increment counter that tracks iteration #
                System.Diagnostics.Debug.WriteLine("Read #{0} - {1} bytes read", i, bytesRead);
            }
            else // buffer still not full
            {
                System.Diagnostics.Debug.WriteLine("Read PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i + 1, bytesRead, offset);
            }
        }
    }
}

static async Task ReadAsync(MemoryStream memoryStream)
{
    memoryStream.Position = 0;
    using (DeflateStream deflateStream = new DeflateStream(memoryStream, CompressionMode.Decompress, true))
    {
        byte[] buffer = new byte[uncompressedSize]; // buffer to hold known fixed size record.
        int bytesRead; // number of bytes read from Read operation
        int offset = 0; // offset for writing into buffer
        int i = -1; // counter to track iteration #
        while ((bytesRead = await deflateStream.ReadAsync(buffer, offset, uncompressedSize - offset)) > 0)
        {
            offset += bytesRead; // offset in buffer for results of next reading
            System.Diagnostics.Debug.Assert(offset <= uncompressedSize, "should never happen - because would mean more bytes read than requested.");
            if (offset == uncompressedSize) // buffer full, complete fixed size record in buffer.
            {
                offset = 0; // buffer is now filled, next read to start at beginning of buffer again.
                i++; // increment counter that tracks iteration #
                System.Diagnostics.Debug.WriteLine("ReadAsync #{0} - {1} bytes read", i, bytesRead);
            }
            else // buffer still not full
            {
                System.Diagnostics.Debug.WriteLine("ReadAsync PARTIAL #{0} - {1} bytes read, offset for next read = {2}", i + 1, bytesRead, offset);
            }
        }
    }
}
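Damien's comment is exactly right. However, your mistake is common enough that, IMHO, this question deserves an actual answer, if for no other reason than to help others who make the same mistake find the answer more easily.
So, to be clear:
As with every stream-oriented I/O method in .NET that takes a byte[] buffer and returns the number of bytes read, the only assumptions you can make about that number are:
- The number will not be larger than the maximum number of bytes you asked to read (i.e. the count passed to the method).
- The number will be non-negative, and as long as you asked for at least one byte and there is actually data left to read, it will be greater than 0 (0 is returned when you reach the end of the stream).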
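When reading with any of these methods, you cannot even count on the same method always returning the same number of bytes (this depends on context: in some scenarios it is in fact deterministic, but you still should not rely on it), and there is no guarantee that different methods, even ones reading from the same source, will always return the same number of bytes as one another.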
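To make that contract concrete, here is a minimal sketch of a stream that deliberately returns short reads. `ChunkedReadStream` and `ShortReadDemo` are hypothetical names invented for this illustration, not part of .NET, and the 4-byte cap is arbitrary. The wrapper is a perfectly legal `Stream`, yet any caller that expects a full buffer from every call will break on it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical wrapper (not a .NET type): forwards reads to an inner stream
// but never returns more than maxChunk bytes per call.
public sealed class ChunkedReadStream : Stream
{
    private readonly Stream _inner;
    private readonly int _maxChunk;

    public ChunkedReadStream(Stream inner, int maxChunk)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        if (maxChunk <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunk));
        _inner = inner;
        _maxChunk = maxChunk;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // Returning fewer bytes than requested is allowed by the Stream contract.
        return _inner.Read(buffer, offset, Math.Min(count, _maxChunk));
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

public static class ShortReadDemo
{
    // Records what each Read call returns, up to and including the final 0.
    public static List<int> ReadSizes(Stream source, int request)
    {
        var sizes = new List<int>();
        var buffer = new byte[request];
        int n;
        do
        {
            n = source.Read(buffer, 0, request);
            sizes.Add(n);
        } while (n > 0);
        return sizes;
    }
}
```

Calling `ShortReadDemo.ReadSizes(new ChunkedReadStream(new MemoryStream(new byte[10]), 4), 10)` yields 4, 4, 2, 0 even though 10 bytes were requested each time. Every value satisfies both guarantees above, and none equals the requested count.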
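It is up to the caller to read the bytes as a stream, taking into account the return value that gives the number of bytes read by each call, and to reassemble those bytes in whatever way suits that particular byte stream.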
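For fixed-size records like the ones in the question, that reassembly can be wrapped in a small helper. The following is only a sketch: `StreamReadHelpers.ReadFullyAsync` is a hypothetical name, not a framework method, and it assumes a short final record should be reported to the caller rather than thrown as an error.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamReadHelpers
{
    // Hypothetical helper (not a .NET API): keeps calling ReadAsync until
    // `count` bytes have been read or the stream ends.
    // Returns the number of bytes actually placed in the buffer, which is
    // less than `count` only if the end of the stream was reached first.
    public static async Task<int> ReadFullyAsync(Stream stream, byte[] buffer, int offset, int count)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));

        int total = 0;
        while (total < count)
        {
            // Each call may return anywhere from 1 to (count - total) bytes.
            int n = await stream.ReadAsync(buffer, offset + total, count - total).ConfigureAwait(false);
            if (n == 0)
            {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }
}
```

With such a helper, the loop in the question could be written as `while (await StreamReadHelpers.ReadFullyAsync(deflateStream, buffer, 0, uncompressedSize) == uncompressedSize) { /* process one record */ }`. On .NET 7 and later, the built-in `Stream.ReadExactly` and `Stream.ReadAtLeast` methods (and their async counterparts) cover the same need.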
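Note that when dealing with Stream objects, you can use the Stream.CopyTo() method. Of course, it only copies to another Stream object. But in many scenarios the destination can be used without treating it as a Stream. For example, you may just want to write the data out as a file, or you may want to copy it to a MemoryStream and then use the MemoryStream.ToArray() method to turn it into a byte array (which you can then access without worrying about how many bytes were read in any given read operation; by the time you have the array, all of them have been read :)).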
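As a rough sketch of the CopyTo() approach applied to the DeflateStream from the question: `DeflateHelpers.DecompressToArray` is a hypothetical helper name, and the sketch assumes the whole decompressed payload fits comfortably in memory.

```csharp
using System.IO;
using System.IO.Compression;

public static class DeflateHelpers
{
    // Hypothetical helper: inflates everything from `compressed` (starting at
    // its current position) into a single byte array. CopyTo runs the
    // read loop internally, so short reads never reach the caller.
    public static byte[] DecompressToArray(Stream compressed)
    {
        // leaveOpen: true so the caller keeps ownership of the source stream.
        using (var deflate = new DeflateStream(compressed, CompressionMode.Decompress, true))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

In the question's program this would be used as `memoryStream.Position = 0; byte[] all = DeflateHelpers.DecompressToArray(memoryStream);`, after which record `i` starts at index `i * uncompressedSize`. In async code, `Stream.CopyToAsync` can be awaited in place of `CopyTo`.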