Parsing a CSV with optional columns using FileHelpers

Keywords: csv, FileHelpers | Updated: 2023-09-27 18:11:35

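I am using FileHelpers 3.1.5 to parse a CSV file. My problem is that the CSV file has to support many optional columns, and I have not found a way to configure FileHelpers for this.

Here is an example: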

[DelimitedRecord(";")]
[IgnoreEmptyLines]
public class TestRecord
{
    //Mandatory
    [FieldNotEmpty]
    public string A;
    [FieldOptional]
    public string B;
    [FieldOptional]
    public string C;
}

I would like to be able to handle data like this:

A;C
TestA1;TestC1
TestA2;TestC1

But when I parse it, "TestC1" ends up in records[1].B instead of records[1].C:

var engine = new FileHelperEngine<TestRecord>();
var records = engine.ReadFile("TestAC.csv");
string column = records[1].C;
Assert.IsTrue(column.Equals("TestC1"));  //Fails, returns ""
column = records[1].B;
Assert.IsTrue(column.Equals("TestC1"));  //True, but that was not what I wanted

Thanks for any suggestions!

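Tested with FileHelpers 3.2.5.

To make the FileHelpers engine recognize your columns correctly, you have to dynamically remove the fields that are not present in the file. The code below is based on yours with a few additions, and was run from a console program: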

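        // Requires: using System; using System.Diagnostics; using System.Linq; using FileHelpers;
        string tempFile = System.IO.Path.GetTempFileName();
        // Regular (non-verbatim) string so that \r\n becomes real line breaks in the file
        System.IO.File.WriteAllText(tempFile, "A;C\r\nTestA1;TestC1\r\nTestA2;TestC1");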
        var engine = new FileHelperEngine<TestRecord>();
        var records = engine.ReadFile(tempFile, 1);
        // Get the header text from the file
        var headerFile = engine.HeaderText.Replace("\r", "").Replace("\n", "");
        // Get the header from the engine record layout
        var headerFields = engine.GetFileHeader();
        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        string column = records[0].C;
        Debug.Assert("TestC1".Equals(column), "Test 1 - Column C does not equal 'TestC1'");  // Fails: C is empty
        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        column = records[0].B;
        Debug.Assert(!"TestC1".Equals(column), "Test 1 - Column B does equal 'TestC1'");  // Fails: B holds "TestC1" because fields are matched by position
        // Create a new engine otherwise we get some random error from Dynamic.Assign once we start removing fields
        // which is presumably because we have called ReadFile() before hand.
        engine = new FileHelperEngine<TestRecord>();
        if (headerFile != headerFields)
        {
            var fieldHeaders = engine.Options.FieldsNames;
            var fileHeaders = headerFile.Split(';').ToList();
            // Loop through all the record layout fields and remove those not found in the file header
            for (int index = fieldHeaders.Length - 1; index >= 0; index--)
                if (!fileHeaders.Contains(fieldHeaders[index]))
                    engine.Options.RemoveField(fieldHeaders[index]);
        }
        headerFields = engine.GetFileHeader();
        Debug.Assert(headerFile == headerFields);
        var records2 = engine.ReadFile(tempFile);
        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        column = records2[0].C;
        Debug.Assert("TestC1".Equals(column), "Test 2 - Column C does not equal 'TestC1'");  // Passes now: C holds "TestC1"
        // Test fixed string against column as column could be null and Debug.Assert can't use .Equals on a null object!
        column = records2[0].B;
        Debug.Assert(!"TestC1".Equals(column), "Test 2 - Column B does equal 'TestC1'");  // Passes now: field B was removed from the layout
        Console.WriteLine("Seems to be OK now!");
        Console.ReadLine();

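Note: one important thing I found is that in the current version, 3.2.5, removing fields after the engine has already read the first line of the file makes the engine blow up! That is why the code above creates a second engine before removing fields.

I also added the IgnoreFirst() attribute to your class so that the engine skips the header line and stores the ignored text in engine.HeaderText. That gives the following class: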

    [DelimitedRecord(";")]
    [IgnoreEmptyLines]
    [IgnoreFirst()]
    public class TestRecord
    {
        //Mandatory
        [FieldNotEmpty]
        public string A;
        [FieldOptional]
        public string B;
        [FieldOptional]
        public string C;
    }
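
As a variation on the code above, the header line could be read straight from the file before any engine is created, so only one engine instance is needed and no fields are removed after a read. The following is a minimal sketch under that assumption. `OptionalColumnReader`, `FieldsMissingFromHeader` and `Read` are hypothetical names, not part of FileHelpers; the only FileHelpers calls used are the ones already shown above (`Options.FieldsNames`, `Options.RemoveField`, `ReadFile`), and it assumes the file has at least a header line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using FileHelpers;

[DelimitedRecord(";")]
[IgnoreEmptyLines]
[IgnoreFirst]
public class TestRecord
{
    //Mandatory
    [FieldNotEmpty]
    public string A;
    [FieldOptional]
    public string B;
    [FieldOptional]
    public string C;
}

// Hypothetical helper class, not part of FileHelpers.
public static class OptionalColumnReader
{
    // Returns the layout field names that do not appear in the file's header line.
    public static List<string> FieldsMissingFromHeader(
        IEnumerable<string> layoutFields, string headerLine, char delimiter = ';')
    {
        var fileHeaders = headerLine.Split(delimiter).Select(h => h.Trim()).ToList();
        return layoutFields.Where(name => !fileHeaders.Contains(name)).ToList();
    }

    public static TestRecord[] Read(string path)
    {
        // Read only the first line; throws if the file is empty.
        string headerLine = File.ReadLines(path).First();

        var engine = new FileHelperEngine<TestRecord>();

        // Remove unused fields BEFORE the engine reads anything.
        foreach (string name in FieldsMissingFromHeader(engine.Options.FieldsNames, headerLine))
            engine.Options.RemoveField(name);

        return engine.ReadFile(path);
    }
}
```

Calling `OptionalColumnReader.Read("TestAC.csv")` on the sample file from the question should then fill `C` from the second column, as in the second half of the console code above. This sketch matches header names exactly and does not handle quoted headers.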

I think you should decorate your columns with titles, like this:

[DelimitedRecord(";")]
[IgnoreEmptyLines]
public class TestRecord
{
    //Mandatory
    [FieldNotEmpty, FieldOrder(0), FieldTitle("A")]
    public string A;
    [FieldOptional, FieldOrder(1), FieldTitle("B")]
    public string B;
    [FieldOptional, FieldOrder(2), FieldTitle("C")]
    public string C;
}

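This way the runtime knows what the column names are and parses them accordingly. Otherwise all it knows is that the file has two columns, and it needs an extra semicolon to keep the positions aligned. So the following data would work with your original setup: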

A;;C
TestA1;;TestC1
TestA2;;TestC1

The FieldTitle approach only works with FileHelpers v2, because v3 no longer has FieldTitle.
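
To make the difference between positional and name-based matching concrete, here is a minimal sketch in plain C# that does not use FileHelpers at all. It reads the header line, maps each column name to its position, and fills the fields by name. `HeaderMappedParser` and `Parse` are hypothetical names made up for this illustration, and the splitting is deliberately naive (no quoted fields, no escaped delimiters).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TestRecord
{
    public string A;   // mandatory
    public string B;   // optional
    public string C;   // optional
}

// Hypothetical helper for illustration only; not part of FileHelpers.
public static class HeaderMappedParser
{
    public static List<TestRecord> Parse(IEnumerable<string> lines, char delimiter = ';')
    {
        // Skip empty lines, similar in spirit to [IgnoreEmptyLines].
        var nonEmpty = lines.Where(l => !string.IsNullOrWhiteSpace(l)).ToList();
        var records = new List<TestRecord>();
        if (nonEmpty.Count == 0) return records;

        // Map each header name to its column position in this particular file.
        string[] headers = nonEmpty[0].Split(delimiter);
        var indexOf = new Dictionary<string, int>();
        for (int i = 0; i < headers.Length; i++)
            indexOf[headers[i].Trim()] = i;

        if (!indexOf.ContainsKey("A"))
            throw new FormatException("Mandatory column 'A' is missing from the header.");

        foreach (string line in nonEmpty.Skip(1))
        {
            string[] cells = line.Split(delimiter);

            // Look a value up by column name; absent columns yield null.
            string Get(string name) =>
                indexOf.TryGetValue(name, out int col) && col < cells.Length ? cells[col] : null;

            records.Add(new TestRecord { A = Get("A"), B = Get("B"), C = Get("C") });
        }
        return records;
    }
}
```

With the lines `A;C`, `TestA1;TestC1`, `TestA2;TestC1` as input, the first record has `A` set to `TestA1`, `C` set to `TestC1`, and `B` left as null, which is the behavior the question asks for.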