Can't store files in PostgreSQL as large objects and then read them back
Updated: 2023-09-27 18:26:30
I'm trying to write and read large objects in a PostgreSQL 9.4.x database using Npgsql v3.0.4.0. To that end, I implemented a method that stores a local file in the database as a large object, as follows:
public static async Task<uint> InsertLargeObjectFileToDB(string theFilePath)
{
    // connecting to DB
    string connstring = MakeDatabaseConnectionString();
    // make a connection object
    NpgsqlConnection Conn = new NpgsqlConnection(connstring);
    try
    {
        await OpenDatabaseConnection(Conn); // open database connection
    }
    catch (Exception Ex)
    {
        throw (Ex);
    }
    uint oid; // to store object ID number
    try
    {
        // Reading and writing Large Objects requires the use of a transaction
        using (FileStream fs = new FileStream(theFilePath, FileMode.Open))
        {
            using (var transaction = Conn.BeginTransaction())
            {
                // Retrieve a Large Object Manager for this connection
                var manager = new NpgsqlLargeObjectManager(Conn);
                // Create a new empty file, returning the identifier to later access it
                oid = manager.Create();
                using (var DbStream = manager.OpenReadWrite(oid))
                {
                    long theFileSize = GetFileSizeInBytes(theFilePath);
                    StreamReader sr = new StreamReader(fs);
                    byte[] buffer = new byte[1024 * 1024];
                    while (sr.BaseStream.Position < theFileSize)
                    {
                        await fs.ReadAsync(buffer, 0, buffer.Length);
                        await DbStream.WriteAsync(buffer, 0, buffer.Length);
                    }
                }
                transaction.Commit();
                return oid;
            }
        }
    }
    catch // any error
    {
        // exception
        Exception ex = new Exception();
        ex.Data.Add(ex.Data.Count, "some error message");
        throw ex;
    }
}
I then implemented another method that reads a large object back and stores it in a randomly named file in the temporary directory, as follows:
public static async Task<string> GetLargeObjectFileFromDB(uint oid)
{
    // connecting to DB
    string connstring = MakeDatabaseConnectionString();
    // make a connection object
    NpgsqlConnection Conn = new NpgsqlConnection(connstring);
    try
    {
        await OpenDatabaseConnection(Conn); // open database connection
    }
    catch (Exception Ex)
    {
        throw (Ex);
    }
    // getting a temporary file name from the system to use it to store the fetched file
    string TempFileName = GetRandomFileNameFromSystem();
    try
    {
        using (FileStream LocalStream = new FileStream(TempFileName, FileMode.Create))
        {
            using (var transaction = Conn.BeginTransaction())
            {
                // create a Large Object Manager for this connection
                var DbLargeObjectManager = new NpgsqlLargeObjectManager(Conn);
                using (var DbStream = await DbLargeObjectManager.OpenReadAsync(oid))
                {
                    byte[] buffer = new byte[1024 * 1024];
                    // get the length of the database object
                    long LengthOfDbObject = DbStream.Length;
                    while (DbStream.Position < LengthOfDbObject)
                    {
                        // read from the database to buffer
                        await DbStream.ReadAsync(buffer, 0, buffer.Length);
                        // write from buffer to local file
                        await LocalStream.WriteAsync(buffer, 0, buffer.Length);
                    }
                }
                transaction.Commit();
                return TempFileName;
            }
        }
    }
    catch // any error
    {
        // exception
        Exception ex = new Exception();
        ex.Data.Add(ex.Data.Count, "Error inserting object in database");
        throw ex;
    }
}
As you can see, I write asynchronously throughout. The problem: I tested these two methods by writing a 6 MB file to the database, but when I read it back, the returned file was about 400 KB larger, and (of course) the MD5 hashes didn't match. Note that no exceptions were thrown. Here is the test, if you're interested:
private async void button3_Click(object sender, EventArgs e)
{
    listBox1.Items.Clear();
    // getting the MD5 hash of the source file
    string FirstMd5Hash = GetMd5OfFile(tbSourceFile.Text);
    // performance measurement ##########################################
    DateTime dt1 = new DateTime(DateTime.Now.Ticks);
    listBox1.Items.Add("Uploading file to database");
    // storing that file into database
    uint oid = await InsertLargeObjectFileToDB(tbSourceFile.Text);
    // performance measurement #########################################################
    DateTime dt2 = new DateTime(DateTime.Now.Ticks);
    TimeSpan ts = new TimeSpan(dt2.Ticks - dt1.Ticks);
    listBox1.Items.Add("Large object (oid = " + oid + ") inserted in " + ts.Seconds + "." + ts.Milliseconds + " seconds");
    // performance measurement ##########################################
    dt1 = new DateTime(DateTime.Now.Ticks);
    listBox1.Items.Add("reading file back from the database");
    // get that object back from the database into a temporary file
    string ReturnedFileName = await PostgresqlLargeObject.GetLargeObjectFileFromDB(oid);
    // performance measurement #########################################################
    dt2 = new DateTime(DateTime.Now.Ticks);
    ts = new TimeSpan(dt2.Ticks - dt1.Ticks);
    listBox1.Items.Add("reading done in " + ts.Seconds + "." + ts.Milliseconds + " seconds");
    // calculate md5 of that file
    string SecondMd5Hash = GetMd5OfFile(ReturnedFileName);
    // compare the 2 hashes
    if (FirstMd5Hash == SecondMd5Hash)
    {
        listBox1.Items.Add("the hashes match. MD5 = " + FirstMd5Hash);
    }
    else
    {
        listBox1.Items.Add("failed with oid = " + oid);
        tbFileBack.Text = ReturnedFileName;
    }
}
What is going wrong?
I have solved the problem. Taking Emil's answer into account, it turned out that you have to read asynchronously and then write synchronously. I don't know why, but this code works:
using (FileStream LocalStream = new FileStream(TempFileName, FileMode.Create))
{
    using (var transaction = Conn.BeginTransaction())
    {
        // create a Large Object Manager for this connection
        var DbLargeObjectManager = new NpgsqlLargeObjectManager(Conn);
        using (var DbStream = await DbLargeObjectManager.OpenReadAsync(oid))
        {
            byte[] buffer = new byte[262144]; // 256 KB
            // query the database stream length
            long DatabaseStreamLength = DbStream.Length;
            while (DbStream.Position < DatabaseStreamLength)
            {
                // read from the database to buffer (async)
                int bufferByteCount = await DbStream.ReadAsync(buffer, 0, buffer.Length);
                // write from buffer to local file (sync)
                LocalStream.Write(buffer, 0, bufferByteCount);
            }
        }
        transaction.Commit();
    }
}
When executing `await <stream>.WriteAsync(buffer, 0, buffer.Length);`, you should instead pass the number of bytes the preceding read actually read (the value it returns).
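The same fix applies to the upload loop in `InsertLargeObjectFileToDB`, which has the identical bug: it always writes `buffer.Length` bytes even when `ReadAsync` filled less of the buffer. A sketch of the corrected pattern (assuming the `fs` and `DbStream` streams from the method above); a read returning 0 signals end of stream, so the file-size check and the `StreamReader` are no longer needed:

```csharp
// Write only as many bytes as each ReadAsync actually returned.
byte[] buffer = new byte[262144]; // 256 KB
int bytesRead;
while ((bytesRead = await fs.ReadAsync(buffer, 0, buffer.Length)) > 0)
{
    await DbStream.WriteAsync(buffer, 0, bytesRead);
}

// Alternatively, let the framework do the chunked copy for you:
await fs.CopyToAsync(DbStream);
```

`Stream.CopyToAsync` performs exactly this read-count bookkeeping internally, so it is the simplest way to avoid this class of bug in both directions.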