导入Excel工作表并使用松散耦合验证导入的数据

本文关键字:导入 耦合 验证 数据 Excel 工作 | 更新日期: 2023-09-27 18:24:50

我正在尝试开发一个模块,该模块将读取excel表(可能也来自其他数据源,因此应该是松散耦合的),并将它们转换为实体以便保存。

逻辑是这样的:

  1. excel工作表可以是不同的格式,例如excel工作表中的列名可能不同,因此我的系统需要能够将不同的字段映射到我的实体
  2. 现在,我将假设上面定义的格式是相同的,并且现在是硬编码的,而不是在配置映射UI上设置后动态地来自数据库
  3. 数据甚至需要在映射之前进行验证。因此,我应该能够预先针对某些内容进行验证。我们没有使用类似XSD或其他东西,所以我应该根据我用作导入模板的对象结构来验证它

问题是,我把一些东西放在一起,但我没有说我喜欢我所做的我的问题是如何改进下面的代码,使其更加模块化,并解决验证问题

下面的代码是一个模型,预计不会工作,只是为了查看设计的一些结构

这是我到目前为止想出的代码,我意识到有一件事我需要提高我的设计模式技能,但现在我需要你的帮助,如果你能帮助我的话:

//The Controller, a placeholder
class UploadController
{
    //Somewhere here we call appropriate class and methods in order to convert
    //excel sheet to dataset
}

在我们使用MVC控制器上传文件后,可能会有不同的控制器专门用于导入某些行为,在本例中,我将上传与个人相关的表,

interface IDataImporter
{
    void Import(DataSet dataset);
}

//除了PersonImporter,我们还可以使用许多其他进口商类PersonImporter:IDataImporter{//我们将数据集划分为适当的数据表,并调用所有IImportActions//与人员数据导入相关//我们在这里调用插入DataContext的数据库函数,因为这样//我们可以做更少的数据库往返。

public string PersonTableName {get;set;}
public string DemographicsTableName {get;set;}
public Import(Dataset dataset)
{
    CreatePerson();
    CreateDemograhics();
}
//We put different things in different methods to clear the field. High cohesion.
private void CreatePerson(DataSet dataset)
{   
    var personDataTable = GetDataTable(dataset,PersonTableName);
    IImportAction addOrUpdatePerson = new AddOrUpdatePerson();
    addOrUpdatePerson.MapEntity(personDataTable);
}
private void CreateDemograhics(DataSet dataset)
{
    var demographicsDataTable = GetDataTable(dataset,DemographicsTableName);
    IImportAction demoAction = new AddOrUpdateDemographic(demographicsDataTable);
    demoAction.MapEntity();
}
private DataTable GetDataTable(DataSet dataset, string tableName)
{
    return dataset.Tables[tableName];
}

}

我有IDataImporter和专门的混凝土类PersonImporter。然而,我不确定到目前为止它看起来是否不错,因为事情应该是稳固的,所以在项目周期的后期基本上很容易扩展,这将是未来改进的基础,让我们继续前进:

IImportActions是魔术最常发生的地方。我没有设计基于表的东西,而是开发基于行为的东西,这样就可以调用它们中的任何一个,以在更模块化的模型中导入东西。例如,一张表可能有两个不同的动作。

interface IImportAction
{
    void MapEntity(DataTable table);
}
//A sample import action, AddOrUpdatePerson
class AddOrUpdatePerson : IImportAction
{
    //Consider using default values as well?
    public string FirstName {get;set;}
    public string LastName {get;set;}
    public string EmployeeId {get;set;}
    public string Email {get;set;}
    public void MapEntity(DataTable table)
    {
        //Each action is producing its own data context since they use
        //different actions.
        using(var dataContext = new DataContext())
        {
            foreach(DataRow row in table.Rows)
            {
                if(!emailValidate(row[Email]))
                {
                    LoggingService.LogWarning(emailValidate.ValidationMessage);
                }
                var person = new Person(){
                    FirstName = row[FirstName],
                    LastName = row[LastName],
                    EmployeeId = row[EmployeeId],
                    Email = row[Email]
                };
                dataContext.SaveObject(person);
            }
            dataContext.SaveChangesToDatabase();
        }
    }   
}
class AddOrUpdateDemographic: IImportAction
{
    static string Name {get;set;}
    static string EmployeeId {get;set;}
    //So here for example, we will need to save dataContext first before passing it in 
    //to get the PersonId from Person (we're assuming that we need PersonId for Demograhics)    
    public void MapEntity(DataTable table)
    {
        using(var dataContext = new DataCOntext())
        {
            foreach(DataRow row in table.Rows)
            {
                var demograhic = new Demographic(){
                    Name = row[Name],
                    PersonId = dataContext.People.First(t => t.EmployeeId = int.Parse(row["EmpId"]))
                };
                dataContext.SaveObject(person);
            }
            dataContext.SaveChangesToDatabase();
        }
    }
}

还有验证,不幸的是,我最擅长的就是验证。验证需要易于扩展和松散耦合,而且我需要能够提前调用此验证,而不是添加所有内容。

public static class ValidationFactory
{
    public static Lazy<IFieldValidation> PhoneValidation = new Lazy<IFieldValidation>(()=>new PhoneNumberValidation());
    public static Lazy<IFieldValidation> EmailValidation = new Lazy<IFieldValidation>(()=>new EmailValidation());
    //etc.
}
interface IFieldValidation
{
    string ValidationMesage{get;set;}
    bool Validate(object value);
}
class PhoneNumberValidation : IFieldValidation
{
    public string ValidationMesage{get;set;}
    public bool Validate(object value)
    {   
        var validated = true; //lets say...
        var innerValue = (string) value;
        //validate innerValue using Regex or something
        //if validation fails, then set ValidationMessage propert for logging.
        return validated;
    }
}
class EmailValidation : IFieldValidation
{
    public string ValidationMesage{get;set;}
    public bool Validate(object value)
    {   
        var validated = true; //lets say...
        var innerValue = (string) value;
        //validate innerValue using Regex or something
        //if validation fails, then set ValidationMessage propert for logging.
        return validated;
    }
}

导入Excel工作表并使用松散耦合验证导入的数据

我在一个项目上也做过同样的事情。不同的是,我不必导入Excel工作表,而是导入CSV文件。我创建了一个CSVValueProvider。因此,CSV数据自动绑定到我的IEnumerable模型。

至于验证,我认为遍历所有行和单元格并逐一验证它们不是很有效,尤其是当CSV文件有数千条记录时。因此,我所做的是创建了一些验证方法,这些方法逐列而不是逐行地检查CSV数据,并对每列执行linq查询,返回包含无效数据的单元格的行号。然后,将无效的行号/列名添加到ModelState中。

更新:

以下是我所做的。。。

CSVReader类:

// A class that can read and parse the data in a CSV file.
public class CSVReader
{
    // Regex expression that's used to parse the data in a line of a CSV file
    private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
    // String array to hold the headers (column names)
    private string[] _headers;
    // List of string arrays to hold the data in the CSV file. Each string array in the list represents one line (row).
    private List<string[]> _rows;
    // The StreamReader class that's used to read the CSV file.
    private StreamReader _reader;
    public CSVReader(StreamReader reader)
    {
        _reader = reader;
        Parse();
    }
    // Reads and parses the data from the CSV file
    private void Parse()
    {
        _rows = new List<string[]>();
        string[] row;
        int rowNumber = 1;
        var headerLine = "RowNumber," + _reader.ReadLine();
        _headers = GetEscapedSVs(headerLine);
        rowNumber++;
        while (!_reader.EndOfStream)
        {
            var line = rowNumber + "," + _reader.ReadLine();
            row = GetEscapedSVs(line);
            _rows.Add(row);
            rowNumber++;
        }
        _reader.Close();
    }
    private string[] GetEscapedSVs(string data)
    {
        if (!data.EndsWith(","))
            data = data + ",";
        return GetEscapedSVs(data, ",", "'"");
    }
    // Parses each row by using the given separator and escape characters
    private string[] GetEscapedSVs(string data, string separator, string escape)
    {
        string[] result = null;
        int priorMatchIndex = 0;
        MatchCollection matches = Regex.Matches(data, string.Format(ESCAPE_SPLIT_REGEX, separator, escape));
        // Skip empty rows...
        if (matches.Count > 0) 
        {
            result = new string[matches.Count];
            for (int index = 0; index <= result.Length - 2; index++)
            {
                result[index] = data.Substring(priorMatchIndex, matches[index].Groups["Separator"].Index - priorMatchIndex);
                priorMatchIndex = matches[index].Groups["Separator"].Index + separator.Length;
            }
            result[result.Length - 1] = data.Substring(priorMatchIndex, data.Length - priorMatchIndex - 1);
            for (int index = 0; index <= result.Length - 1; index++)
            {
                if (Regex.IsMatch(result[index], string.Format("^{0}.*[^{0}]{0}$", escape))) 
                    result[index] = result[index].Substring(1, result[index].Length - 2);
                result[index] = result[index].Replace(escape + escape, escape);
                if (result[index] == null || result[index] == escape) 
                    result[index] = "";
            }
        }
        return result;
    }
    // Returns the number of rows
    public int RowCount
    {
        get
        {
            if (_rows == null)
                return 0;
            return _rows.Count;
        }
    }
    // Returns the number of headers (columns)
    public int HeaderCount
    {
        get
        {
            if (_headers == null)
                return 0;
            return _headers.Length;
        }
    }
    // Returns the value in a given column name and row index
    public object GetValue(string columnName, int rowIndex)
    {
        if (rowIndex >= _rows.Count)
        {
            return null;
        }
        var row = _rows[rowIndex];
        int colIndex = GetColumnIndex(columnName);
        if (colIndex == -1 || colIndex >= row.Length)
        {
            return null;
        }
        var value = row[colIndex];
        return value;
    }
    // Returns the column index of the provided column name
    public int GetColumnIndex(string columnName)
    {
        int index = -1;
        for (int i = 0; i < _headers.Length; i++)
        {
            if (_headers[i].Replace(" ","").Equals(columnName, StringComparison.CurrentCultureIgnoreCase))
            {
                index = i;
                return index;
            }
        }
        return index;
    }
}

CSV值提供者工厂类别:

public class CSVValueProviderFactory : ValueProviderFactory
{
    public override IValueProvider GetValueProvider(ControllerContext controllerContext)
    {
        var uploadedFiles = controllerContext.HttpContext.Request.Files;
        if (uploadedFiles.Count > 0)
        {
            var file = uploadedFiles[0];
            var extension = file.FileName.Split('.').Last();
            if (extension.Equals("csv", StringComparison.CurrentCultureIgnoreCase))
            {
                if (file.ContentLength > 0)
                {
                    var stream = file.InputStream;
                    var csvReader = new CSVReader(new StreamReader(stream, Encoding.Default, true));
                    return new CSVValueProvider(controllerContext, csvReader);
                }
            }
        }
        return null;
    }
}

CSV值提供程序类:

// Represents a value provider for the data in an uploaded CSV file.
public class CSVValueProvider : IValueProvider
{
    private CSVReader _csvReader;
    public CSVValueProvider(ControllerContext controllerContext, CSVReader csvReader)
    {
        if (controllerContext == null)
        {
            throw new ArgumentNullException("controllerContext");
        }
        if (csvReader == null)
        {
            throw new ArgumentNullException("csvReader");
        }
        _csvReader = csvReader;
    }
    public bool ContainsPrefix(string prefix)
    {
        if (prefix.Contains('[') && prefix.Contains(']'))
        {
            if (prefix.Contains('.'))
            {
                var header = prefix.Split('.').Last();
                if (_csvReader.GetColumnIndex(header) == -1)
                {
                    return false;
                }
            }
            int index = int.Parse(prefix.Split('[').Last().Split(']').First());
            if (index >= _csvReader.RowCount)
            {
                return false;
            }
        }
        return true;
    }
    public ValueProviderResult GetValue(string key)
    {
        if (!key.Contains('[') || !key.Contains(']') || !key.Contains('.'))
        {
            return null;
        }
        object value = null;
        var header = key.Split('.').Last();
        int index = int.Parse(key.Split('[').Last().Split(']').First());
        value = _csvReader.GetValue(header, index);
        if (value == null)
        {
            return null;
        }
        return new ValueProviderResult(value, value.ToString(), CultureInfo.CurrentCulture);
    }
}

对于验证,正如我之前提到的,我认为使用DataAnnotation属性进行验证是无效的。对于具有数千行的CSV文件,逐行验证数据将花费很长时间。因此,我决定在模型绑定完成后验证控制器中的数据。我还应该提到,我需要根据数据库中的一些数据验证CSV文件中的数据。如果你只需要验证电子邮件地址或电话号码之类的东西,你还可以使用DataAnnotation。

以下是验证电子邮件地址列的示例方法:

private void ValidateEmailAddress(IEnumerable<CSVViewModel> csvData)
{
    var invalidRows = csvData.Where(d => ValidEmail(d.EmailAddress) == false).ToList();
    foreach (var invalidRow in invalidRows)
    {
        var key = string.Format("csvData[{0}].{1}", invalidRow.RowNumber - 2, "EmailAddress");
        ModelState.AddModelError(key, "Invalid Email Address"); 
    }        
}
private static bool ValidEmail(string email)
{
    if(email == "")
        return false;
    else
        return new System.Text.RegularExpressions.Regex(@"^['w-'.]+@(['w-]+'.)+['w-]{2,6}$").IsMatch(email);
}

更新2:

对于使用DataAnnotaion的验证,您只需在CSVViewModel中使用DataAnnotation属性,如下所示(CSVViewModel是您的CSV数据将在控制器操作中绑定到的类):

public class CSVViewModel
{
    // User proper names for your CSV columns, these are just examples...   
    [Required]
    public int Column1 { get; set; } 
    [Required]
    [StringLength(30)]
    public string Column2 { get; set; }
}