Code to collapse duplicate and near-duplicate records
Updated: 2023-09-27 18:14:34
I have a list of models of this type:
public class TourDude {
    public int Id { get; set; }
    public string Name { get; set; }
}
And here is my list:
public IEnumerable<TourDude> GetAllGuides {
    get {
        List<TourDude> guides = new List<TourDude>();
        guides.Add(new TourDude() { Name = "Dave Et", Id = 1 });
        guides.Add(new TourDude() { Name = "Dave Eton", Id = 1 });
        guides.Add(new TourDude() { Name = "Dave EtZ5", Id = 1 });
        guides.Add(new TourDude() { Name = "Danial Maze A", Id = 2 });
        guides.Add(new TourDude() { Name = "Danial Maze B", Id = 2 });
        guides.Add(new TourDude() { Name = "Danial", Id = 3 });
        return guides;
    }
}
I want to retrieve these records:
{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze", Id = 2 }
{ Name = "Danial", Id = 3 }
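The goal is to collapse duplicates and near-duplicates (identified by a shared Id), using the shortest value the names in a group have in common as the Name.
Where do I start? Is there a single LINQ query that can do this for me? Do I need to write an equality comparer?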
Edit 1:
var result = from x in GetAllGuides
             group x.Name by x.Id into g
             select new TourDude {
                 Test = Exts.LongestCommonPrefix(g),
                 Id = g.Key,
             };
IEnumerable<IEnumerable<char>> test = result.First().Test;
string str = test.First().ToString();
If you want to group the entries by Id and then find the longest common prefix of the Names within each group, you can do it like this:
var result = from x in guides
             group x.Name by x.Id into g
             select new TourDude
             {
                 Name = LongestCommonPrefix(g),
                 Id = g.Key,
             };
The longest common prefix is found using the algorithm from here.
Result:
{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze ", Id = 2 }
{ Name = "Danial", Id = 3 }
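static string LongestCommonPrefix(IEnumerable<string> xs)
{
    // Transpose() is not part of System.Linq; it is assumed to be an
    // extension method that turns the strings into per-position columns.
    return new string(xs
        .Transpose()
        // Keep columns while every character in the column is the same.
        .TakeWhile(s => s.All(d => d == s.First()))
        .Select(s => s.First())
        .ToArray());
}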
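Since Transpose() is not a standard LINQ operator, here is a minimal sketch of the same longest-common-prefix idea that relies only on System.Linq. The class name PrefixSketch is hypothetical and used only for illustration; this is one possible way to write it, not the code from the linked answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixSketch
{
    // Hypothetical helper: longest common prefix without Transpose().
    public static string LongestCommonPrefix(IEnumerable<string> xs)
    {
        var items = xs.ToList();
        if (items.Count == 0) return string.Empty;

        // The prefix can be no longer than the shortest string.
        int limit = items.Min(s => s.Length);
        string first = items[0];

        // Extend the prefix while every string agrees at this position.
        int length = 0;
        while (length < limit && items.All(s => s[length] == first[length]))
        {
            length++;
        }
        return first.Substring(0, length);
    }
}
```

For the group with Id = 2 the prefix still ends in a space ("Danial Maze "), so calling TrimEnd() on the result, as in PrefixSketch.LongestCommonPrefix(g).TrimEnd(), would give the "Danial Maze" asked for in the question.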
I was able to achieve this by grouping the records on Id and then selecting the first record from each group after ordering by Name length:
var result = GetAllGuides.GroupBy(td => td.Id)
    .Select(g => g.OrderBy(td => td.Name.Length).First());
foreach (var dude in result)
{
    Console.WriteLine("{{Name = {0}, Id = {1}}}", dude.Name, dude.Id);
}
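To make it easier to check what the shortest-name approach returns for the sample data, here is a small self-contained sketch of it. The ShortestNameSketch class and the ShortestPerId method are hypothetical names introduced for illustration; the query itself is the one shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TourDude
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ShortestNameSketch
{
    // Hypothetical helper: keep one record per Id, choosing the shortest Name.
    public static List<TourDude> ShortestPerId(IEnumerable<TourDude> guides)
    {
        return guides
            .GroupBy(td => td.Id)
            // OrderBy is a stable sort, so names of equal length keep
            // their original relative order and First() picks the earlier one.
            .Select(g => g.OrderBy(td => td.Name.Length).First())
            .ToList();
    }
}
```

With the list from the question, the group with Id = 2 yields "Danial Maze A" rather than "Danial Maze", because both names in that group have the same length and neither is the bare prefix. This approach therefore matches the requested output only when the shortest name in each group is itself the desired value.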