Code to collapse duplicate and semi-duplicate records

Keywords: records, code, semi-duplicate, collapse | Updated: 2023-09-27 18:14:34

I have a list of models of this type:

public class TourDude {
    public int Id { get; set; }
    public string Name { get; set; }
}

This is my list:

    public IEnumerable<TourDude> GetAllGuides {
        get {
            List<TourDude> guides = new List<TourDude>();
            guides.Add(new TourDude() { Name = "Dave Et", Id = 1 });
            guides.Add(new TourDude() { Name = "Dave Eton", Id = 1 });
            guides.Add(new TourDude() { Name = "Dave EtZ5", Id = 1 });
            guides.Add(new TourDude() { Name = "Danial Maze A", Id = 2 });
            guides.Add(new TourDude() { Name = "Danial Maze B", Id = 2 });
            guides.Add(new TourDude() { Name = "Danial", Id = 3 });
            return guides;
        }
    }

I want to retrieve these records:

{ Name = "Dave Et", Id = 1 } 
{ Name = "Danial Maze", Id = 2 }
{ Name = "Danial", Id = 3 }

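The goal is mainly to collapse duplicates and near-duplicates (identified by Id), taking the shortest value the names in each group have in common as the Name.

Where do I start? Is there a single LINQ query that can do this for me? Do I need to write an equality comparer?

Edit 1: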

        var result = from x in GetAllGuides
                     group x.Name by x.Id into g
                     select new TourDude {
                         Test = Exts.LongestCommonPrefix(g),
                         Id = g.Key,
                     };
        IEnumerable<IEnumerable<char>> test = result.First().Test;
        string str = test.First().ToString();


If you want to group the entries by Id and then find the longest common prefix of the Name values within each group, you can do it like this:

var result = from x in guides
             group x.Name by x.Id into g
             select new TourDude
             {
                 Name = LongestCommonPrefix(g),
                 Id = g.Key,
             };

LongestCommonPrefix uses the algorithm from here to find the longest common prefix; its code follows the result.

Result:

{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze ", Id = 2 }
{ Name = "Danial", Id = 3 }

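// Returns the longest prefix shared by every string in xs.
// Note: Transpose() is not part of System.Linq. It is assumed here to be an
// extension method that turns the strings into columns of characters and
// stops at the end of the shortest string.
static string LongestCommonPrefix(IEnumerable<string> xs)
{
    return new string(xs
        .Transpose()
        .TakeWhile(s => s.All(d => d == s.First()))  // keep columns whose characters all match
        .Select(s => s.First())                      // one character per matching column
        .ToArray());
}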
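
Because Transpose() has to come from somewhere else, a version that needs only System.Linq may be easier to drop in. The following is a minimal sketch, not the algorithm the answer links to: the class name PrefixSketch and the method Collapse are hypothetical, and the TrimEnd() call is an added assumption, used only so that "Danial Maze " becomes the "Danial Maze" the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TourDude
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class PrefixSketch
{
    // Longest common prefix by direct index comparison (no Transpose needed).
    public static string LongestCommonPrefix(IEnumerable<string> xs)
    {
        List<string> items = xs.ToList();
        if (items.Count == 0) return string.Empty;

        // The prefix can never be longer than the shortest string.
        int limit = items.Min(s => s.Length);
        string first = items[0];
        int length = 0;

        // Advance while every string has the same character at this position.
        while (length < limit && items.All(s => s[length] == first[length]))
        {
            length++;
        }
        return first.Substring(0, length);
    }

    // Groups names by Id and collapses each group to its common prefix.
    public static List<TourDude> Collapse(IEnumerable<TourDude> guides)
    {
        return guides
            .GroupBy(g => g.Id, g => g.Name)
            .Select(g => new TourDude
            {
                Id = g.Key,
                // Assumption: trailing whitespace in the prefix is not wanted.
                Name = LongestCommonPrefix(g).TrimEnd(),
            })
            .ToList();
    }
}
```

Calling PrefixSketch.Collapse(GetAllGuides) on the sample list should give "Dave Et" for Id 1, "Danial Maze" for Id 2 and "Danial" for Id 3. Without the TrimEnd() call, the Id 2 name keeps its trailing space, as in the result shown above.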

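I can do this by grouping the records by Id and then selecting the first record from each group when ordered by Name length. Note that this returns an existing record rather than a common prefix, so for Id 2 it yields "Danial Maze A" (the two names are the same length and OrderBy is a stable sort), not "Danial Maze":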

var result = GetAllGuides.GroupBy(td => td.Id)
    .Select(g => g.OrderBy(td => td.Name.Length).First());
foreach (var dude in result)
{
    Console.WriteLine("{{Name = {0}, Id = {1}}}", dude.Name, dude.Id);
}
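
If sorting each group just to take its first element feels heavy, the same pick-the-shortest idea can be sketched with Enumerable.MinBy, which is available in System.Linq from .NET 6 onward. This is an alternative sketch rather than part of the answer above; the class name ShortestNameSketch and the method ShortestPerId are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TourDude
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class ShortestNameSketch
{
    // Returns, for each Id, one record whose Name is the shortest in its group.
    public static List<TourDude> ShortestPerId(IEnumerable<TourDude> guides)
    {
        return guides
            .GroupBy(td => td.Id)
            // MinBy (.NET 6+) scans each group once instead of sorting it.
            .Select(g => g.MinBy(td => td.Name.Length))
            .ToList();
    }
}
```

As with the OrderBy version, this selects an existing record, so it only matches the desired output when one of the names in a group is itself the common prefix.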