导入Excel工作表并使用松散耦合验证导入的数据
本文关键字:导入 耦合 验证 数据 Excel 工作 | 更新日期: 2023-09-27 18:24:50
我正在尝试开发一个模块,该模块将读取excel表(可能也来自其他数据源,因此应该是松散耦合的),并将它们转换为实体以便保存。
逻辑是这样的:
- excel工作表可以是不同的格式,例如excel工作表中的列名可能不同,因此我的系统需要能够将不同的字段映射到我的实体
- 现在,我将假设上面定义的格式是相同的,并且现在是硬编码的,而不是在配置映射UI上设置后动态地来自数据库
- 数据甚至需要在映射之前进行验证。因此,我应该能够预先针对某些内容进行验证。我们没有使用类似XSD或其他东西,所以我应该根据我用作导入模板的对象结构来验证它
问题是,我把一些东西放在一起,但我没有说我喜欢我所做的我的问题是如何改进下面的代码,使其更加模块化,并解决验证问题
下面的代码是一个模型,预计不会工作,只是为了查看设计的一些结构
这是我到目前为止想出的代码,我意识到有一件事我需要提高我的设计模式技能,但现在我需要你的帮助,如果你能帮助我的话:
//The Controller, a placeholder
class UploadController
{
//Somewhere here we call appropriate class and methods in order to convert
//excel sheet to dataset
}
在我们使用MVC控制器上传文件后,可能会有不同的控制器专门用于导入某些行为,在本例中,我将上传与个人相关的表,
interface IDataImporter
{
void Import(DataSet dataset);
}
//除了PersonImporter,我们还可以使用许多其他进口商类PersonImporter:IDataImporter{//我们将数据集划分为适当的数据表,并调用所有IImportActions//与人员数据导入相关//我们在这里调用插入DataContext的数据库函数,因为这样//我们可以做更少的数据库往返。
public string PersonTableName {get;set;}
public string DemographicsTableName {get;set;}
public Import(Dataset dataset)
{
CreatePerson();
CreateDemograhics();
}
//We put different things in different methods to clear the field. High cohesion.
private void CreatePerson(DataSet dataset)
{
var personDataTable = GetDataTable(dataset,PersonTableName);
IImportAction addOrUpdatePerson = new AddOrUpdatePerson();
addOrUpdatePerson.MapEntity(personDataTable);
}
private void CreateDemograhics(DataSet dataset)
{
var demographicsDataTable = GetDataTable(dataset,DemographicsTableName);
IImportAction demoAction = new AddOrUpdateDemographic(demographicsDataTable);
demoAction.MapEntity();
}
private DataTable GetDataTable(DataSet dataset, string tableName)
{
return dataset.Tables[tableName];
}
}
我有IDataImporter
和专门的混凝土类PersonImporter
。然而,我不确定到目前为止它看起来是否不错,因为事情应该是稳固的,所以在项目周期的后期基本上很容易扩展,这将是未来改进的基础,让我们继续前进:
IImportActions
是魔术最常发生的地方。我没有设计基于表的东西,而是开发基于行为的东西,这样就可以调用它们中的任何一个,以在更模块化的模型中导入东西。例如,一张表可能有两个不同的动作。
interface IImportAction
{
void MapEntity(DataTable table);
}
//A sample import action, AddOrUpdatePerson
class AddOrUpdatePerson : IImportAction
{
//Consider using default values as well?
public string FirstName {get;set;}
public string LastName {get;set;}
public string EmployeeId {get;set;}
public string Email {get;set;}
public void MapEntity(DataTable table)
{
//Each action is producing its own data context since they use
//different actions.
using(var dataContext = new DataContext())
{
foreach(DataRow row in table.Rows)
{
if(!emailValidate(row[Email]))
{
LoggingService.LogWarning(emailValidate.ValidationMessage);
}
var person = new Person(){
FirstName = row[FirstName],
LastName = row[LastName],
EmployeeId = row[EmployeeId],
Email = row[Email]
};
dataContext.SaveObject(person);
}
dataContext.SaveChangesToDatabase();
}
}
}
class AddOrUpdateDemographic: IImportAction
{
static string Name {get;set;}
static string EmployeeId {get;set;}
//So here for example, we will need to save dataContext first before passing it in
//to get the PersonId from Person (we're assuming that we need PersonId for Demograhics)
public void MapEntity(DataTable table)
{
using(var dataContext = new DataCOntext())
{
foreach(DataRow row in table.Rows)
{
var demograhic = new Demographic(){
Name = row[Name],
PersonId = dataContext.People.First(t => t.EmployeeId = int.Parse(row["EmpId"]))
};
dataContext.SaveObject(person);
}
dataContext.SaveChangesToDatabase();
}
}
}
还有验证,不幸的是,我最擅长的就是验证。验证需要易于扩展和松散耦合,而且我需要能够提前调用此验证,而不是添加所有内容。
public static class ValidationFactory
{
public static Lazy<IFieldValidation> PhoneValidation = new Lazy<IFieldValidation>(()=>new PhoneNumberValidation());
public static Lazy<IFieldValidation> EmailValidation = new Lazy<IFieldValidation>(()=>new EmailValidation());
//etc.
}
interface IFieldValidation
{
string ValidationMesage{get;set;}
bool Validate(object value);
}
class PhoneNumberValidation : IFieldValidation
{
public string ValidationMesage{get;set;}
public bool Validate(object value)
{
var validated = true; //lets say...
var innerValue = (string) value;
//validate innerValue using Regex or something
//if validation fails, then set ValidationMessage propert for logging.
return validated;
}
}
class EmailValidation : IFieldValidation
{
public string ValidationMesage{get;set;}
public bool Validate(object value)
{
var validated = true; //lets say...
var innerValue = (string) value;
//validate innerValue using Regex or something
//if validation fails, then set ValidationMessage propert for logging.
return validated;
}
}
我在一个项目上也做过同样的事情。不同的是,我不必导入Excel工作表,而是导入CSV文件。我创建了一个CSVValueProvider。因此,CSV数据自动绑定到我的IEnumerable模型。
至于验证,我认为遍历所有行和单元格并逐一验证它们不是很有效,尤其是当CSV文件有数千条记录时。因此,我所做的是创建了一些验证方法,这些方法逐列而不是逐行地检查CSV数据,并对每列执行linq查询,返回包含无效数据的单元格的行号。然后,将无效的行号/列名添加到ModelState中。
更新:
以下是我所做的。。。
CSVReader类:
// A class that can read and parse the data in a CSV file.
public class CSVReader
{
// Regex expression that's used to parse the data in a line of a CSV file
private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
// String array to hold the headers (column names)
private string[] _headers;
// List of string arrays to hold the data in the CSV file. Each string array in the list represents one line (row).
private List<string[]> _rows;
// The StreamReader class that's used to read the CSV file.
private StreamReader _reader;
public CSVReader(StreamReader reader)
{
_reader = reader;
Parse();
}
// Reads and parses the data from the CSV file
private void Parse()
{
_rows = new List<string[]>();
string[] row;
int rowNumber = 1;
var headerLine = "RowNumber," + _reader.ReadLine();
_headers = GetEscapedSVs(headerLine);
rowNumber++;
while (!_reader.EndOfStream)
{
var line = rowNumber + "," + _reader.ReadLine();
row = GetEscapedSVs(line);
_rows.Add(row);
rowNumber++;
}
_reader.Close();
}
private string[] GetEscapedSVs(string data)
{
if (!data.EndsWith(","))
data = data + ",";
return GetEscapedSVs(data, ",", "'"");
}
// Parses each row by using the given separator and escape characters
private string[] GetEscapedSVs(string data, string separator, string escape)
{
string[] result = null;
int priorMatchIndex = 0;
MatchCollection matches = Regex.Matches(data, string.Format(ESCAPE_SPLIT_REGEX, separator, escape));
// Skip empty rows...
if (matches.Count > 0)
{
result = new string[matches.Count];
for (int index = 0; index <= result.Length - 2; index++)
{
result[index] = data.Substring(priorMatchIndex, matches[index].Groups["Separator"].Index - priorMatchIndex);
priorMatchIndex = matches[index].Groups["Separator"].Index + separator.Length;
}
result[result.Length - 1] = data.Substring(priorMatchIndex, data.Length - priorMatchIndex - 1);
for (int index = 0; index <= result.Length - 1; index++)
{
if (Regex.IsMatch(result[index], string.Format("^{0}.*[^{0}]{0}$", escape)))
result[index] = result[index].Substring(1, result[index].Length - 2);
result[index] = result[index].Replace(escape + escape, escape);
if (result[index] == null || result[index] == escape)
result[index] = "";
}
}
return result;
}
// Returns the number of rows
public int RowCount
{
get
{
if (_rows == null)
return 0;
return _rows.Count;
}
}
// Returns the number of headers (columns)
public int HeaderCount
{
get
{
if (_headers == null)
return 0;
return _headers.Length;
}
}
// Returns the value in a given column name and row index
public object GetValue(string columnName, int rowIndex)
{
if (rowIndex >= _rows.Count)
{
return null;
}
var row = _rows[rowIndex];
int colIndex = GetColumnIndex(columnName);
if (colIndex == -1 || colIndex >= row.Length)
{
return null;
}
var value = row[colIndex];
return value;
}
// Returns the column index of the provided column name
public int GetColumnIndex(string columnName)
{
int index = -1;
for (int i = 0; i < _headers.Length; i++)
{
if (_headers[i].Replace(" ","").Equals(columnName, StringComparison.CurrentCultureIgnoreCase))
{
index = i;
return index;
}
}
return index;
}
}
CSV值提供者工厂类别:
public class CSVValueProviderFactory : ValueProviderFactory
{
public override IValueProvider GetValueProvider(ControllerContext controllerContext)
{
var uploadedFiles = controllerContext.HttpContext.Request.Files;
if (uploadedFiles.Count > 0)
{
var file = uploadedFiles[0];
var extension = file.FileName.Split('.').Last();
if (extension.Equals("csv", StringComparison.CurrentCultureIgnoreCase))
{
if (file.ContentLength > 0)
{
var stream = file.InputStream;
var csvReader = new CSVReader(new StreamReader(stream, Encoding.Default, true));
return new CSVValueProvider(controllerContext, csvReader);
}
}
}
return null;
}
}
CSV值提供程序类:
// Represents a value provider for the data in an uploaded CSV file.
public class CSVValueProvider : IValueProvider
{
private CSVReader _csvReader;
public CSVValueProvider(ControllerContext controllerContext, CSVReader csvReader)
{
if (controllerContext == null)
{
throw new ArgumentNullException("controllerContext");
}
if (csvReader == null)
{
throw new ArgumentNullException("csvReader");
}
_csvReader = csvReader;
}
public bool ContainsPrefix(string prefix)
{
if (prefix.Contains('[') && prefix.Contains(']'))
{
if (prefix.Contains('.'))
{
var header = prefix.Split('.').Last();
if (_csvReader.GetColumnIndex(header) == -1)
{
return false;
}
}
int index = int.Parse(prefix.Split('[').Last().Split(']').First());
if (index >= _csvReader.RowCount)
{
return false;
}
}
return true;
}
public ValueProviderResult GetValue(string key)
{
if (!key.Contains('[') || !key.Contains(']') || !key.Contains('.'))
{
return null;
}
object value = null;
var header = key.Split('.').Last();
int index = int.Parse(key.Split('[').Last().Split(']').First());
value = _csvReader.GetValue(header, index);
if (value == null)
{
return null;
}
return new ValueProviderResult(value, value.ToString(), CultureInfo.CurrentCulture);
}
}
对于验证,正如我之前提到的,我认为使用DataAnnotation属性进行验证是无效的。对于具有数千行的CSV文件,逐行验证数据将花费很长时间。因此,我决定在模型绑定完成后验证控制器中的数据。我还应该提到,我需要根据数据库中的一些数据验证CSV文件中的数据。如果你只需要验证电子邮件地址或电话号码之类的东西,你还可以使用DataAnnotation。
以下是验证电子邮件地址列的示例方法:
private void ValidateEmailAddress(IEnumerable<CSVViewModel> csvData)
{
var invalidRows = csvData.Where(d => ValidEmail(d.EmailAddress) == false).ToList();
foreach (var invalidRow in invalidRows)
{
var key = string.Format("csvData[{0}].{1}", invalidRow.RowNumber - 2, "EmailAddress");
ModelState.AddModelError(key, "Invalid Email Address");
}
}
private static bool ValidEmail(string email)
{
if(email == "")
return false;
else
return new System.Text.RegularExpressions.Regex(@"^['w-'.]+@(['w-]+'.)+['w-]{2,6}$").IsMatch(email);
}
更新2:
对于使用DataAnnotaion的验证,您只需在CSVViewModel中使用DataAnnotation属性,如下所示(CSVViewModel是您的CSV数据将在控制器操作中绑定到的类):
public class CSVViewModel
{
// User proper names for your CSV columns, these are just examples...
[Required]
public int Column1 { get; set; }
[Required]
[StringLength(30)]
public string Column2 { get; set; }
}