In C#, how can I get the file position of each text line in an output file (as it is being written)?
Keywords: file, position, output, text | Updated: 2023-09-27 18:28:10
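When I write text lines to a temporary file, I want to store the file position of each line so that I can access the lines randomly later.
This does not work: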
StreamWriter outFile;
{ ...
fPos[i++] = outFile.BaseStream.Position;
}
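The fPos values in the loop above are 0, 0, 0, ..., 1024, 1024, ..., 2048, 2048, and so on; that is, they are the physical positions, which only advance when the writer's buffer is flushed to the underlying stream.
The following does work, but it defeats the buffering: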
{ ...
outFile.Flush();
fPos[i++] = outFile.BaseStream.Position;
}
Is there a better way to get the (logical) file position without giving up buffering?
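If the data is small enough, you can have the StreamWriter write to a MemoryStream. Flush after each line, and read the memory stream's position before writing each line. Once all the data has been written to the memory stream, copy it to the file.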
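A minimal sketch of the in-memory approach, assuming UTF-8 output without a byte order mark so that the first line starts at offset 0. The class name `InMemoryLineIndexer` and the method `WriteLines` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
static class InMemoryLineIndexer
{
    // Writes all lines to 'path' and returns the byte offset at which each line starts.
    public static List<long> WriteLines(string path, IEnumerable<string> lines)
    {
        var positions = new List<long>();
        using (var ms = new MemoryStream())
        {
            // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
            using (var writer = new StreamWriter(ms, new UTF8Encoding(false), 1024, leaveOpen: true))
            {
                foreach (var line in lines)
                {
                    positions.Add(ms.Position); // logical position before this line
                    writer.WriteLine(line);
                    writer.Flush();             // only moves chars into memory, no disk I/O
                }
            }
            using (var file = File.Create(path))
            {
                ms.WriteTo(file);               // a single copy to disk at the end
            }
        }
        return positions;
    }
}
```

Flushing into a MemoryStream is cheap compared with flushing to disk, which is the point of this variant; the cost is holding the whole output in memory.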
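If the data is larger than memory, you can use the MemoryStream as a buffer and flush it to the disk file whenever it fills up. This takes a bit of bookkeeping, since each recorded position must be offset by the number of bytes already written to disk, but it is not too bad.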
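The bookkeeping could look roughly like the sketch below. It assumes UTF-8 without a byte order mark; `ChunkedLineIndexer`, `WriteLines`, and the `chunkBytes` threshold are hypothetical names and values chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original answer.
static class ChunkedLineIndexer
{
    // Writes lines to 'path' through a MemoryStream that is emptied to disk
    // whenever it reaches 'chunkBytes'. Returns the byte offset of each line.
    public static List<long> WriteLines(string path, IEnumerable<string> lines, int chunkBytes = 64 * 1024)
    {
        var positions = new List<long>();
        long baseOffset = 0; // bytes already copied to the file

        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (var ms = new MemoryStream())
        using (var writer = new StreamWriter(ms, new UTF8Encoding(false)))
        {
            foreach (var line in lines)
            {
                positions.Add(baseOffset + ms.Position);
                writer.WriteLine(line);
                writer.Flush(); // pushes chars into the MemoryStream only

                if (ms.Length >= chunkBytes)
                {
                    ms.WriteTo(file);        // copy the full chunk to disk
                    baseOffset += ms.Length;
                    ms.SetLength(0);         // empties the buffer and resets Position
                }
            }
            ms.WriteTo(file); // remaining bytes of the last partial chunk
        }
        return positions;
    }
}
```

Each recorded position is the number of bytes already on disk plus the position inside the current in-memory chunk.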
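Slightly more involved is to derive your own class from BufferedStream that counts the number of bytes written to it. Open the stream writer with a buffer size of 0 and have it write to the buffered stream. You can then query the buffered stream for the number of bytes written. That stream, of course, writes to the file stream. So it would look something like this: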
using (var fs = new FileStream(...))
{
    using (var buff = new CustomBufferedStream(fs))
    {
        using (var writer = new StreamWriter(buff, encoding))
        {
            // ... do your output here
        }
    }
}
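You may not be able to set the StreamWriter buffer size to 0. If that is the case, you will have to flush after every line, and you may have to override your custom stream's Flush method so that it does not automatically flush through to the disk file. I do not remember whether StreamWriter.Flush automatically calls BaseStream.Flush.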
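A rough sketch of the counting-stream idea. Two deviations from the description above are assumptions on my part: BufferedStream is a sealed class in .NET, so this version wraps a stream instead of deriving from it, and because StreamWriter.Flush() does, as far as I know, flush the underlying stream as well, the wrapper's Flush is deliberately a no-op. `CountingStream` and `CountingDemo` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical write-only pass-through stream that counts the bytes written to it.
sealed class CountingStream : Stream
{
    private readonly Stream _inner;
    public long BytesWritten { get; private set; }

    public CountingStream(Stream inner) { _inner = inner; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _inner.Write(buffer, offset, count);
        BytesWritten += count;
    }

    // No-op on purpose: a flush from the writer stops here and never reaches the disk.
    public override void Flush() { }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { return BytesWritten; }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose(); // disposing the inner stream flushes it to disk
        base.Dispose(disposing);
    }
}

static class CountingDemo
{
    // Writes lines to 'path' and returns the byte offset at which each line starts.
    public static List<long> WriteLines(string path, IEnumerable<string> lines)
    {
        var positions = new List<long>();
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (var buffered = new BufferedStream(fs, 64 * 1024))
        using (var counter = new CountingStream(buffered))
        using (var writer = new StreamWriter(counter, new UTF8Encoding(false)))
        {
            foreach (var line in lines)
            {
                positions.Add(counter.BytesWritten);
                writer.WriteLine(line);
                writer.Flush(); // encodes into the counter; disk buffering stays intact
            }
        }
        return positions;
    }
}
```

The per-line Flush only pushes encoded bytes into the BufferedStream, so the file itself is still written in large blocks.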
I cannot remember where this came from, but it has worked well for me in the past.
I replaced the reader with a writer:
using System;
using System.IO;
using System.Text;

/// <summary>
/// A stream writer that allows its current position to be recorded and
/// changed. This is not generally possible for stream writers, because
/// of the use of buffered writes.
/// This writer can be used only for text content. Theoretically it can
/// process any encoding, but it was checked only for ANSI and UTF8. If you
/// want to use this class with any other encoding, please check it.
/// NOT SAFE FOR ASYNC WRITES.
/// </summary>
public class PositionableStreamWriter : StreamWriter
{
    /// <summary>
    /// Our best-guess counted position.
    /// </summary>
    private long _position;

    /// <summary>
    /// Guards against e.g. calling "Write(char[] buffer, int index, int count)"
    /// as part of the implementation of "Write(string value)", which would cause
    /// double counting.
    /// </summary>
    private bool _guardRecursion;

    public PositionableStreamWriter(string fileName, bool append, Encoding enc)
        : base(fileName, append, enc)
    {
        CommonConstruct();
    }

    public PositionableStreamWriter(Stream stream, Encoding enc)
        : base(stream, enc)
    {
        CommonConstruct();
    }
    /// <summary>
    /// Checks for a preamble; the encoding may not have one at all.
    /// </summary>
    private bool CheckPreamble(long lengthBeforeFlush)
    {
        byte[] preamble = this.Encoding.GetPreamble();
        if (this.BaseStream.Length >= preamble.Length)
        {
            if (lengthBeforeFlush == 0)
                return true;
            else // we would love to read, but base stream is write-only.
                return true; // just assume a preamble is there.
        }
        return false;
    }

    /// <summary>
    /// Use this property to get and set the real position in the file.
    /// The position reported by BaseStream may be wrong because of buffering.
    /// </summary>
    public long Position
    {
        get { return _position; }
        set
        {
            this.Flush();
            ((Stream)base.BaseStream).Seek(value, SeekOrigin.Begin);
            _position = value;
        }
    }
    public override void Write(char[] buffer)
    {
        if (buffer != null && !_guardRecursion)
            _position += Encoding.GetByteCount(buffer);
        CallBase(() => base.Write(buffer));
    }

    public override void Write(char[] buffer, int index, int count)
    {
        if (buffer != null && !_guardRecursion)
            _position += Encoding.GetByteCount(buffer, index, count);
        CallBase(() => base.Write(buffer, index, count));
    }

    public override void Write(string value)
    {
        if (value != null && !_guardRecursion)
            _position += Encoding.GetByteCount(value);
        CallBase(() => base.Write(value));
    }

    public override void WriteLine()
    {
        if (!_guardRecursion)
            _position += Encoding.GetByteCount(NewLine);
        CallBase(() => base.WriteLine());
    }

    public override void WriteLine(char[] buffer)
    {
        // A null buffer still writes the line terminator, so count it either way.
        if (!_guardRecursion)
            _position += (buffer == null ? 0 : Encoding.GetByteCount(buffer)) + Encoding.GetByteCount(NewLine);
        CallBase(() => base.WriteLine(buffer));
    }

    public override void WriteLine(char[] buffer, int index, int count)
    {
        if (buffer != null && !_guardRecursion)
            _position += Encoding.GetByteCount(buffer, index, count) + Encoding.GetByteCount(NewLine);
        CallBase(() => base.WriteLine(buffer, index, count));
    }

    public override void WriteLine(string value)
    {
        // A null string still writes the line terminator, so count it either way.
        if (!_guardRecursion)
            _position += Encoding.GetByteCount(value ?? string.Empty) + Encoding.GetByteCount(NewLine);
        CallBase(() => base.WriteLine(value));
    }
    private void CallBase(Action callBase)
    {
        if (_guardRecursion)
            callBase();
        else
        {
            try
            {
                _guardRecursion = true;
                callBase();
            }
            finally
            {
                _guardRecursion = false;
            }
        }
    }

    private void CommonConstruct()
    {
        var lengthAtConstruction = BaseStream.Length;
        if (lengthAtConstruction == 0)
            Flush(); // this should force writing the preamble if a preamble is being written.
        // Account for the preamble (byte order mark) at the start of the file, which identifies the encoding.
        if (CheckPreamble(lengthAtConstruction))
        {
            _position = this.Encoding.GetPreamble().Length;
        }
    }
}
Usage:
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        var pos = new List<long>();
        // Record the position before and after every write.
        using (var writer = new PositionableStreamWriter("tst.txt", false, Encoding.Unicode))
        {
            pos.Add(writer.Position);
            writer.Write("abcde");
            pos.Add(writer.Position);
            writer.WriteLine("Nope");
            pos.Add(writer.Position);
            writer.WriteLine("Something");
            pos.Add(writer.Position);
            writer.WriteLine("Another thing");
            pos.Add(writer.Position);
        }
        // Read each segment back by seeking to the recorded positions.
        using (var stream = File.Open("tst.txt", FileMode.Open))
        using (var reader = new BinaryReader(stream))
        {
            for (int i = 0; i < pos.Count - 1; i++)
            {
                stream.Position = pos[i];
                var len = (int)(pos[i + 1] - pos[i]);
                var buf = reader.ReadBytes(len);
                Console.Write(Encoding.Unicode.GetString(buf));
            }
        }
    }
}
I would create a subclass that counts the characters written. Reportedly, a class derived from StreamWriter needs, at minimum, to override only void Write(Char):
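class StreamWriterWithPosition : StreamWriter
{
    private long pos = 0; // bytes counted so far
    public StreamWriterWithPosition(Stream stream, Encoding enc) : base(stream, enc) { } // added so the sketch compiles
    public long Position { get { return pos; } } // hypothetical accessor for the count
    // Caveat: StreamWriter overrides Write(string) and similar overloads directly, so only Write(char) calls are counted.
    public override void Write(char c) { pos += Encoding.GetByteCount(c.ToString()); base.Write(c); }
}
(Untested!)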