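Parsing a text file that contains literals in C#
Keywords: text, file | Updated: 2023-09-27 18:22:31
I am having trouble parsing a text file that contains literals.
Here is a sample of the text file I am parsing: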
BT /F1 9 Tf 53.8646616541353 441 Td ( Voucher AADA Trans. Prods CDE TRX Payment) Tj ET
BT /F1 9 Tf 53.8646616541353 432 Td ( Number Num Date WH ID Name Name # Year Inv. # CD Due Date Qty Price Disct % Amount Due) Tj ET
BT /F1 9 Tf 53.8646616541353 423 Td (--------- ---- ---------- -- ------ -------- ----- -- ---- ---------- -- ---------- ------ ------- ------- ------- ------------) Tj ET
BT /F1 9 Tf 53.8646616541353 414 Td ( 21812539 09/30/2015 NA 29264 Symante SUMME 52 2015 1735247 RM 09/30/2015 2 $15.00 50.0000 100.0% 15.00 ) Tj ET
BT /F1 9 Tf 53.8646616541353 405 Td ( 21827266 10/01/2015 NA 29264 Symante SUMME 52 2015 1735966 RE 10/01/2015 1 $15.00 50.0000 100.0% '(7.50')) Tj ET
BT /F1 9 Tf 53.8646616541353 396 Td ( 21832628 10/02/2015 NA 29264 Symante SUMME 52 2015 1736174 RM 10/02/2015 1 $15.00 50.0000 100.0% 7.50 ) Tj ET
BT /F1 9 Tf 53.8646616541353 387 Td ( 21838251 10/02/2015 NA 29264 Symante SUMME 52 2015 1736429 RE 10/02/2015 1 $15.00 50.0000 100.0% '(7.50')) Tj ET
BT /F1 9 Tf 53.8646616541353 378 Td ( 21841821 10/03/2015 NA 29264 Symante SUMME 52 2015 1736583 RM 10/03/2015 1 $15.00 50.0000 100.0% 7.50 ) Tj ET
BT /F1 9 Tf 53.8646616541353 369 Td ( 21874851 10/08/2015 NA 29264 Symante SUMME 52 2015 1738192 RE 10/08/2015 1 $15.00 50.0000 100.0% '(7.50')) Tj ET
BT /F1 9 Tf 53.8646616541353 360 Td ( 21879328 10/09/2015 NA 29264 Symante SUMME 52 2015 1738389 RM 10/09/2015 1 $15.00 50.0000 100.0% 7.50 ) Tj ET
BT /F1 9 Tf 53.8646616541353 351 Td ( 21933007 10/16/2015 NA 29264 Symante SUMME 52 2015 0000531968 SK 10/16/2015 1 $15.00 50.0000 100.0% '(7.50')) Tj ET
BT /F1 9 Tf 53.8646616541353 342 Td ( -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 333 Td ( Sub Total: '($1,650.00')) Tj ET
BT /F1 9 Tf 53.8646616541353 324 Td ( -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 315 Td ( 21827466 10/02/2015 NA 57629 0000531284 PO 10/02/2015 0 100.0% '(1500.00')) Tj ET
BT /F1 9 Tf 53.8646616541353 306 Td ( -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 297 Td ( Sub Total: '($1,500.00')) Tj ET
BT /F1 9 Tf 53.8646616541353 288 Td ( -------------) Tj ET
BT /F1 9 Tf 53.8646616541353 279 Td ( 21663952 09/02/2015 SN 57629 Zeal '(I') 61-SE 61 2015 0000529704 IN 11/01/2015 2443 $14.95 50.0000 100.0% 11111.43 ) Tj ET
BT /F1 9 Tf 53.8646616541353 270 Td ( 21663953 09/02/2015 SN 57629 Zeal '(I') 61-SE 61 2015 0000529704 SP 11/01/2015 2443 $14.95 50.0000 100.0% '(200.33')) Tj ET
BT /F1 9 Tf 53.8646616541353 261 Td ( 21699656 09/09/2015 S2 57629 Zeal '(I') 61-SE 61 2015 0000530025 IN 11/08/2015 449 $14.95 50.0000 100.0% 1156.28 ) Tj ET
BT /F1 9 Tf 53.8646616541353 252 Td ( 21699657 09/09/2015 S2 57629 Zeal '(I') 61-SE 61 2015 0000530025 SP 11/08/2015 449 $14.95 50.0000 100.0% '(36.82')) Tj ET
BT /F1 9 Tf 53.8646616541353 243 Td ( 21699658 09/09/2015 SL 57629 Zeal '(I') 61-SE 61 2015 0000530025 IN 11/08/2015 1320 $14.95 50.0000 100.0% 1111.00 ) Tj ET
BT /F1 9 Tf 53.8646616541353 234 Td ( 21699659 09/09/2015 SL 57629 Zeal '(I') 61-SE 61 2015 0000530025 SP 11/08/2015 1320 $14.95 50.0000 100.0% '(108.24')) Tj ET
BT /F1 9 Tf 53.8646616541353 225 Td ( 21736996 09/16/2015 S1 57629 Zeal '(I') 61-SE 61 2015 0000530390 IN 11/15/2015 1016 $14.95 50.0000 100.0% 1111.60 ) Tj ET
BT /F1 9 Tf 53.8646616541353 216 Td ( 21736997 09/16/2015 S1 57629 Zeal '(I') 61-SE 61 2015 0000530390 SP 11/15/2015 1016 $14.95 50.0000 100.0% '(83.31')) Tj ET
BT /F1 9 Tf 53.8646616541353 207 Td ( 21808378 09/29/2015 NA 57629 Zeal '(I') 61-SE 61 2015 1735086 RE 09/29/2015 8 $14.95 50.0000 100.0% '(59.80')) Tj ET
BT /F1 9 Tf 53.8646616541353 198 Td ( 21838252 10/02/2015 NA 57629 Zeal '(I') 61-SE 61 2015 1736429 RE 10/02/2015 1 $14.95 50.0000 100.0% '(7.48')) Tj ET
BT /F1 9 Tf 53.8646616541353 189 Td ( 21874852 10/08/2015 NA 57629 Zeal '(I') 61-SE 61 2015 1738192 RE 10/08/2015 4 $14.95 50.0000 100.0% '(29.90')) Tj ET
BT /F1 9 Tf 53.8646616541353 180 Td (
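If you look at line 20, the product name is Zeal (I). Negative numbers (the last column, Amount Due) are also enclosed in parentheses.
I am parsing the text file line by line, but when I try
line.Replace(@"'(", "");
it does not seem to work. I have never come across these literals in a file before, so I am not sure how to handle them. Apart from this, I am almost done with the parsing.
The way I am doing it is very straightforward: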
string line;
int count = 0; // to be removed. Used in testing to cap count.
while ((line = reader.ReadLine()) != null)
{
    if (count <= 10)
    {
        if (line.Length > 170 && line.Length < 200)
        {
            if (!ContainsAny(line))
            {
                line.Replace(@"'(", "");
                indexStart = line.IndexOf("Td (") + 4;
                col0 = line.Substring(indexStart, 9);
                col1 = line.Substring(indexStart + 10, 4);
                col2 = line.Substring(indexStart + 15, 10);
                col3 = line.Substring(indexStart + 26, 2);
                col4 = line.Substring(indexStart + 29, 6);
                col5 = line.Substring(indexStart + 36, 8);
                col6 = line.Substring(indexStart + 45, 5);
                col7 = line.Substring(indexStart + 51, 2);
                col8 = line.Substring(indexStart + 54, 4);
                col9 = line.Substring(indexStart + 59, 10);
                col10 = line.Substring(indexStart + 70, 2);
                col11 = line.Substring(indexStart + 73, 10);
                col12 = line.Substring(indexStart + 84, 6);
                col13 = line.Substring(indexStart + 91, 7).Replace("$", "");
                col14 = line.Substring(indexStart + 99, 7);
                col15 = line.Substring(indexStart + 107, 7).Replace("%", "");
                col16 = line.Substring(indexStart + 115, 12);
                MessageBox.Show(string.Format("{0}; {1}; {2}; {3}; {4}; {5}; {6}; {7}; {8}; {9}; {10}; {11}; {12}; {13}; {14}; {15}; {16};", col0, col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16));
                //writer.WriteLine(lineOut);
                count += 1; // to be removed. Used in testing to cap count.
            }
        }
    }
}
The result I get when I write to the file is:
21841821 10/03/2015 NA 29264 Symante SUMME 52 2015 1736583 RM 10/03/2015 1 15 50 100 7.5
21874851 10/08/2015 NA 29264 Symante SUMME 52 2015 1738192 RE 10/08/2015 1 15 50 100 -7.5
21879328 10/09/2015 NA 29264 Symante SUMME 52 2015 1738389 RM 10/09/2015 1 15 50 100 7.5
21933007 10/16/2015 NA 29264 Symante SUMME 52 2015 531968 SK 10/16/2015 1 15 50 100 -7.5
21827466 10/02/2015 NA 57629 531284 PO 10/02/2015 0 100 -4500
21663952 09/02/2015 SN 57629 Zeal '(I ) 61- E 1 20 5 00005297 4 N 11/01/20 5 24 3 14. 5 50.00 0 100. 18261.40%
21663953 09/02/2015 SN 57629 Zeal '(I ) 61- E 1 20 5 00005297 4 P 11/01/20 5 24 3 14. 5 50.00 0 100. -200.00%
21699656 09/09/2015 S2 57629 Zeal '(I ) 61- E 1 20 5 00005300 5 N 11/08/20 5 4 9 14. 5 50.00 0 100. 3356.20%
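The call
line.Replace(@"'(", "");
does not modify the string. It only returns a new string with the replacement applied, and that return value is being discarded. You should write: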
line = line.Replace(@"'(", "");
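Check the documentation for String.Replace:
Returns a new string in which all occurrences of a specified string in the current instance are replaced with another specified string.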
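To make the point about immutability concrete, here is a minimal sketch. The sample value is a shortened fragment of one of the data lines in the question, and the variable names are only illustrative.

```csharp
using System;

// Strings in .NET are immutable: Replace never changes the receiver,
// it returns a new string that the caller has to keep.
string line = "100.0% '(7.50')";

line.Replace(@"'(", "");   // return value discarded, so line is unchanged
Console.WriteLine(line);   // still contains '(

// Assigning the result keeps the change. Calls can be chained because
// each Replace returns a new string.
string cleaned = line.Replace(@"'(", "").Replace(@"')", "");
Console.WriteLine(cleaned); // 100.0% 7.50
```

The same applies to other string methods such as Trim, Substring, and ToUpper: they all return a new string.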
You need to use:
line = line.Replace(@"'(", "");
It looks like you are writing more code than you actually need.
// requires: using System.IO; using System.Linq;
var allLines = File.ReadAllLines(@"C:\myfile.text");
var correctedLines = allLines.Select(l => l.Replace(@"'(", "").Replace(@"')", ""));
// now use correctedLines in your code
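One thing that may be worth checking: replacing "'(" with an empty string removes the parenthesis as well as the escape character, which shortens the line and could shift the fixed Substring offsets used in the question. The sketch below is an alternative that is not taken from the answers above. It assumes the columns line up on the unescaped text, so it drops only the escape character, then lets the number parser treat a parenthesized amount as negative. The helper names Unescape and ParseAmount are hypothetical.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: drop the character that escapes each parenthesis
// but keep the parenthesis itself, so "Zeal '(I')" becomes "Zeal (I)".
static string Unescape(string s) => s.Replace(@"'(", "(").Replace(@"')", ")");

// Hypothetical helper: parse an amount such as "(7.50)" or "($1,650.00)".
// NumberStyles.Currency includes AllowParentheses (parentheses mean negative),
// AllowThousands and AllowCurrencySymbol. The format info is built by hand
// (an assumption for this sketch) so that "$" is the currency symbol
// regardless of the machine's culture settings.
static decimal ParseAmount(string s)
{
    var format = new NumberFormatInfo { CurrencySymbol = "$" };
    return decimal.Parse(s.Trim(), NumberStyles.Currency, format);
}

string text = Unescape("Zeal '(I') 61-SE");
Console.WriteLine(text); // Zeal (I) 61-SE

decimal due = ParseAmount(Unescape("'(7.50')"));
decimal subtotal = ParseAmount(Unescape("'($1,650.00')"));
decimal positive = ParseAmount(" 15.00 ");
Console.WriteLine(due);
Console.WriteLine(subtotal);
Console.WriteLine(positive);
```

With this approach the amount column can be passed to ParseAmount as extracted, without stripping "$" or the parentheses by hand first.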