Converting data from a flat array into a hierarchical structure

Keywords: hierarchical, structure, convert, array, data, flat | Updated: 2023-09-27 18:35:21

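I want to convert data from a flat list into a hierarchical structure. How can I do this in a readable way that still performs acceptably, and are there any .NET libraries I can take advantage of? I believe this is called a "facet" in some terminology (in this case, the facet is the industry).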

public class Company
{        
    public int CompanyId { get; set; }
    public string CompanyName { get; set; }
    public Industry Industry { get; set; }
}
public class Industry
{
    public int IndustryId { get; set; }
    public string IndustryName { get; set; }
    public int? ParentIndustryId { get; set; }
    public Industry ParentIndustry { get; set; }
    public ICollection<Industry> ChildIndustries { get; set; }
}

Now suppose I have a List<Company> and I want to convert it into a List<IndustryNode>:

//Hierarchical data structure
public class IndustryNode
{
    public string IndustryName{ get; set; }
    public double Hits { get; set; }
    public IndustryNode[] ChildIndustryNodes{ get; set; }
}

The resulting object should therefore look like this once serialized:

{
    IndustryName: "Industry",
    ChildIndustryNodes: [
        {
            IndustryName: "Energy",
            ChildIndustryNodes: [
                {
                    IndustryName: "Energy Equipment & Services",
                    ChildIndustryNodes: [
                        { IndustryName: "Oil & Gas Drilling", Hits: 8 },
                        { IndustryName: "Oil & Gas Equipment & Services", Hits: 4 }
                    ]
                },
                {
                    IndustryName: "Oil & Gas",
                    ChildIndustryNodes: [
                        { IndustryName: "Integrated Oil & Gas", Hits: 13 },
                        { IndustryName: "Oil & Gas Exploration & Production", Hits: 5 },
                        { IndustryName: "Oil & Gas Refining & Marketing & Transporation", Hits: 22 }
                    ]
                }
            ]
        },
        {
            IndustryName: "Materials",
            ChildIndustryNodes: [
                {
                    IndustryName: "Chemicals",
                    ChildIndustryNodes: [
                        { IndustryName: "Commodity Chemicals", Hits: 24 },
                        { IndustryName: "Diversified Chemicals", Hits: 66 },
                        { IndustryName: "Fertilizers & Agricultural Chemicals", Hits: 22 },
                        { IndustryName: "Industrial Gases", Hits: 11 },
                        { IndustryName: "Specialty Chemicals", Hits: 43 }
                    ]
                }
            ]
        }
    ]
}

Here "Hits" is the number of companies that belong to that group.

To clarify, I need to convert a List<Company> into a List<IndustryNode>, not to serialize a List<IndustryNode>.


Try this:

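    // Requires: using System.Collections.Generic; using System.Linq;
    // Assumes ParentIndustry/ChildIndustries are populated and that each
    // industry is represented by a single shared object instance.

    // Yields the industry itself followed by each of its ancestors, so that a
    // company is counted toward its own industry and every group above it.
    private static IEnumerable<Industry> GetSelfAndAncestors(Industry ind)
    {
        for (var current = ind; current != null; current = current.ParentIndustry)
        {
            yield return current;
        }
    }

    // Builds the child nodes of an industry, skipping children with no companies.
    private static IndustryNode[] GetChildIndustries(Industry i)
    {
        return (i.ChildIndustries ?? new List<Industry>())
            .Where(ii => counts.ContainsKey(ii))
            .Select(ii => new IndustryNode()
            {
                IndustryName = ii.IndustryName,
                Hits = counts[ii],
                ChildIndustryNodes = GetChildIndustries(ii)
            }).ToArray();
    }

    private static Dictionary<Industry, int> counts;
    static void Main(string[] args)
    {
        List<Company> companies = new List<Company>();
        //...
        var allIndustries = companies.SelectMany(c => GetSelfAndAncestors(c.Industry)).ToList();
        HashSet<Industry> distinctInd = new HashSet<Industry>(allIndustries);
        counts = distinctInd.ToDictionary(e => e, e => allIndustries.Count(i => i == e));
        // Top-level nodes are the industries without a parent.
        List<IndustryNode> listTop = distinctInd.Where(i => i.ParentIndustry == null)
                        .Select(i => new IndustryNode()
                                {
                                    ChildIndustryNodes = GetChildIndustries(i),
                                    Hits = counts[i],
                                    IndustryName = i.IndustryName
                                })
                        .ToList();
    }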

Untested.
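
On the performance side of the question: the counts dictionary above calls Count once per distinct industry, which rescans the whole list each time. A single pass with GroupBy may be cheaper on large lists. The following is only a sketch under a few assumptions that are not in the original: industries are keyed by IndustryId (assumed unique), a company counts toward its own industry and every ancestor, and the names IndustryCounts, SelfAndAncestors and CountByIndustryId are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed-down versions of the question's classes, just enough for the sketch.
public class Industry
{
    public int IndustryId { get; set; }
    public string IndustryName { get; set; }
    public Industry ParentIndustry { get; set; }
}

public class Company
{
    public int CompanyId { get; set; }
    public Industry Industry { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class IndustryCounts
{
    // Walks from an industry up through its ancestors (assumes no cycles).
    public static IEnumerable<Industry> SelfAndAncestors(Industry industry)
    {
        for (var current = industry; current != null; current = current.ParentIndustry)
        {
            yield return current;
        }
    }

    // One pass over the companies: each company adds one hit to its own
    // industry and to every ancestor of that industry.
    public static Dictionary<int, int> CountByIndustryId(IEnumerable<Company> companies)
    {
        return companies
            .SelectMany(c => SelfAndAncestors(c.Industry))
            .GroupBy(i => i.IndustryId)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

If only the leaf counts are wanted, as in the sample output in the question, grouping directly on c.Industry.IndustryId without the ancestor walk would be enough.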

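You are looking for a serializer. MSFT has one that ships with VS, but I prefer Newtonsoft's, which is free. The MSFT documentation and examples are here, and the Newtonsoft documentation is here.

Newtonsoft is free, simple, and fast.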

Try using a JSON serializer for this. Your data structure looks fine to me; this is only a matter of serialization.

var industryNodeInstance = LoadIndustryNodeInstance();
var json = new JavaScriptSerializer().Serialize(industryNodeInstance);
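
For comparison, here is a minimal sketch of the same serialization step using System.Text.Json, which ships with .NET Core 3.0 and later. It is not what this answer uses; the IndustryJson class and its ToJson method are hypothetical names chosen for the example.

```csharp
using System.Text.Json;

public class IndustryNode
{
    public string IndustryName { get; set; }
    public double Hits { get; set; }
    public IndustryNode[] ChildIndustryNodes { get; set; }
}

// Hypothetical wrapper, only here to keep the example self-contained.
public static class IndustryJson
{
    // Serializes a node and its descendants as indented JSON.
    public static string ToJson(IndustryNode root)
    {
        var options = new JsonSerializerOptions { WriteIndented = true };
        return JsonSerializer.Serialize(root, options);
    }
}
```

Note that the default encoder escapes some characters, so a name such as "Oil & Gas" is written with the ampersand as \u0026; it still round-trips to the same string when parsed.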

If you want to choose between serializers, see: http://www.servicestack.net/benchmarks/#burningmonk-benchmarks

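The LoadIndustryNodeInstance method:

  • Build a List<Industry>

  • Convert it into IndustryTree = List<IndustryNode>

  • Implement tree methods such as traversal. Try looking at tree data structures in C#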
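
The traversal step in the list above could look roughly like the following. This is a sketch of a depth-first, pre-order walk over IndustryNode; the IndustryTree class and the Traverse and TotalHits methods are hypothetical names, not something from the question or from a library.

```csharp
using System;
using System.Collections.Generic;

public class IndustryNode
{
    public string IndustryName { get; set; }
    public double Hits { get; set; }
    public IndustryNode[] ChildIndustryNodes { get; set; }
}

// Hypothetical helper for walking the tree.
public static class IndustryTree
{
    // Depth-first, pre-order walk: yields each node together with its depth.
    // A null child array is treated as "no children".
    public static IEnumerable<(IndustryNode Node, int Depth)> Traverse(IndustryNode root, int depth = 0)
    {
        yield return (root, depth);
        foreach (var child in root.ChildIndustryNodes ?? Array.Empty<IndustryNode>())
        {
            foreach (var item in Traverse(child, depth + 1))
            {
                yield return item;
            }
        }
    }

    // Sums Hits over a node and all of its descendants.
    public static double TotalHits(IndustryNode root)
    {
        double total = 0;
        foreach (var (node, _) in Traverse(root))
        {
            total += node.Hits;
        }
        return total;
    }
}
```

The depth value makes it easy to print an indented outline, for example with new string(' ', 2 * depth) + node.IndustryName.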

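Here is a rough sketch that may get you on your way. I create a map/dictionary index and populate it from the list of companies. We then extract the top-level nodes from the index. Note that there may be edge cases (for example, this index may need to be partially populated up front, because your companies never seem to reference the top-most nodes, so those nodes would have to be filled in some other way).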

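// Requires: using System.Collections.Generic; using System.Linq;
// Nodes indexed by industry name. The parent name is tracked separately
// because IndustryNode itself has no parent reference.
Dictionary<string, IndustryNode> index = new Dictionary<string, IndustryNode>();
Dictionary<string, string> parentNames = new Dictionary<string, string>();

public void Insert(Company company)
{
    Industry industry = company.Industry;
    if (index.ContainsKey(industry.IndustryName))
    {
        index[industry.IndustryName].Hits++;
    }
    else
    {
        IndustryNode node = new IndustryNode
        {
            IndustryName = industry.IndustryName,
            Hits = 1,
            ChildIndustryNodes = new IndustryNode[0]
        };
        index[node.IndustryName] = node;
        parentNames[node.IndustryName] = industry.ParentIndustry?.IndustryName;
        if (industry.ParentIndustry != null && index.ContainsKey(industry.ParentIndustry.IndustryName))
        {
            // ChildIndustryNodes is an array, so append by building a new array.
            IndustryNode parent = index[industry.ParentIndustry.IndustryName];
            parent.ChildIndustryNodes = parent.ChildIndustryNodes.Concat(new[] { node }).ToArray();
        }
    }
}

// After every company has been inserted, the top-level nodes are those without a parent.
List<IndustryNode> topLevelNodes = index
    .Where(kvp => parentNames[kvp.Key] == null)
    .Select(kvp => kvp.Value)
    .ToList();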
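
One possible way to deal with the edge case mentioned above (parent industries that no company references directly) is to create any missing ancestor nodes on demand while inserting. The following is a sketch, not a tested implementation. It assumes industry names are unique and that the parent chain has no cycles; IndustryIndex, Entry, GetOrAdd and ToNodes are hypothetical names introduced for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed-down versions of the question's classes.
public class Industry
{
    public string IndustryName { get; set; }
    public Industry ParentIndustry { get; set; }
}

public class Company
{
    public Industry Industry { get; set; }
}

public class IndustryNode
{
    public string IndustryName { get; set; }
    public double Hits { get; set; }
    public IndustryNode[] ChildIndustryNodes { get; set; }
}

// Hypothetical helper, not part of the original answer.
public class IndustryIndex
{
    // Mutable working node; converted to IndustryNode at the end.
    private class Entry
    {
        public string Name;
        public double Hits;
        public List<Entry> Children = new List<Entry>();
    }

    private readonly Dictionary<string, Entry> index = new Dictionary<string, Entry>();
    private readonly List<Entry> roots = new List<Entry>();

    // Returns the entry for an industry, creating it and any missing ancestors first.
    private Entry GetOrAdd(Industry industry)
    {
        if (index.TryGetValue(industry.IndustryName, out Entry existing))
        {
            return existing;
        }
        var entry = new Entry { Name = industry.IndustryName };
        index[entry.Name] = entry;
        if (industry.ParentIndustry == null)
        {
            roots.Add(entry);
        }
        else
        {
            GetOrAdd(industry.ParentIndustry).Children.Add(entry);
        }
        return entry;
    }

    // Counts the company toward its own industry only; ancestors keep 0 hits.
    public void Insert(Company company)
    {
        GetOrAdd(company.Industry).Hits++;
    }

    // Converts the working entries into the IndustryNode shape from the question.
    public List<IndustryNode> ToNodes()
    {
        return roots.Select(ToNode).ToList();
    }

    private static IndustryNode ToNode(Entry e)
    {
        return new IndustryNode
        {
            IndustryName = e.Name,
            Hits = e.Hits,
            ChildIndustryNodes = e.Children.Select(ToNode).ToArray()
        };
    }
}
```

Because a child is linked to its parent at the moment it is first created, the order in which companies are inserted no longer matters.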