Finding differences between two lists
Keywords: find, list, two | Updated: 2023-09-27 18:29:48
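I am looking for a good way to find the differences between two lists.
Here is the problem:
Both lists contain strings in which the first 3 numbers/characters (delimited by *) form a unique key, followed by the text (String = "key1*key2*key3*text").
Here is an example string:
AA1*1D*4*The quick brown fox*****CC*3456321234543~
where "*AA1*1D*4*" is the unique key.
List1: "index1*index2*index3", "index2*index2*index3", "index3*index2*index3"
List2: "index2*index2*index3", "index1*index2*index3",
I need to match the indexes in both lists and compare them.
If all 3 indexes from one list match the 3 indexes from the other list, I need to track both string entries in the new list.
If a set of indexes in one list does not appear in the other list, I need to track the one side and keep an empty entry on the other side (#4 in the example above).
Return the list.
This is what I have so far, but I am struggling a bit here: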
// Keep only the entries that differ between the two lists.
List<String> Base = baseListCopy.Except(resultListCopy, StringComparer.InvariantCultureIgnoreCase).ToList();
List<String> Result = resultListCopy.Except(baseListCopy, StringComparer.InvariantCultureIgnoreCase).ToList();
// Container for non-matching blocks, so they can be output later.
List<String[]> blocksComparison = new List<String[]>();
// Only handles the case where both reports have the same number of blocks.
if ((Result.Count > 0 || Base.Count > 0) && (Result.Count == Base.Count))
{
    foreach (String S in Result)
    {
        String[] sArr = S.Split('*');
        foreach (String B in Base)
        {
            String[] bArr = B.Split('*');
            if (sArr[0].Equals(bArr[0]) && sArr[1].Equals(bArr[1]) && sArr[2].Equals(bArr[2]) && sArr[3].Equals(bArr[3]))
            {
                String[] NA = new String[2]; // keep results
                NA[0] = B; // [0] for base
                NA[1] = S; // [1] for result
                blocksComparison.Add(NA);
                break;
            }
        }
    }
}
Can you suggest a good algorithm for this process?
Thanks
You can use a HashSet.
Create a HashSet for List1. Remember that index1*index2*index3 is different from index3*index2*index1.
Now iterate through the second list.
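// Create a HashSet for List1 (list1 and list2 stand for the two List<string> inputs).
var set = new HashSet<string>(list1);
var common = new List<string>();
foreach (string s in list2)
{
    if (set.Contains(s))
        common.Add(s); // Add it to the new list.
}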
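The loop above matches whole strings. If only the first three `*`-delimited fields should count as the key, as the question describes, one possible variation is to store the keys rather than the full strings in the set. This is a minimal sketch, not part of the original answer: `KeyMatchSketch`, `KeyOf`, and `MatchByKey` are hypothetical names, and it assumes an ordinal (case-sensitive) key comparison.

```csharp
using System;
using System.Collections.Generic;

static class KeyMatchSketch
{
    // Hypothetical helper: returns the first three '*'-delimited fields, joined back with '*'.
    public static string KeyOf(string line)
    {
        // Split into at most 4 parts so the trailing text (which may contain '*') stays in one piece.
        string[] parts = line.Split(new[] { '*' }, 4);
        return string.Join("*", parts, 0, Math.Min(3, parts.Length));
    }

    // Returns the entries of list2 whose key also occurs in list1.
    public static List<string> MatchByKey(IEnumerable<string> list1, IEnumerable<string> list2)
    {
        // Assumption: keys are compared case-sensitively.
        var keys = new HashSet<string>(StringComparer.Ordinal);
        foreach (string s in list1)
            keys.Add(KeyOf(s));

        var matches = new List<string>();
        foreach (string s in list2)
        {
            if (keys.Contains(KeyOf(s)))
                matches.Add(s);
        }
        return matches;
    }
}
```

Calling `KeyMatchSketch.MatchByKey(list1, list2)` would then return the List2 entries that share a key with some List1 entry, regardless of the text after the third `*`.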
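If I understand your question correctly, you want to compare the elements by their "key" prefix rather than by the whole string content. If so, implementing a custom equality comparer will let you easily leverage the LINQ set algorithms.
This program...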
using System;
using System.Collections.Generic;
using System.Linq;

class EqCmp : IEqualityComparer<string> {
    public bool Equals(string x, string y) {
        return GetKey(x).SequenceEqual(GetKey(y));
    }
    public int GetHashCode(string obj) {
        // Using Sum could cause OverflowException.
        return GetKey(obj).Aggregate(0, (sum, subkey) => sum + subkey.GetHashCode());
    }
    static IEnumerable<string> GetKey(string line) {
        // If we just split to 3 strings, the last one could exceed the key, so we split to 4.
        // This is not the most efficient way, but is simple.
        return line.Split(new[] { '*' }, 4).Take(3);
    }
}
class Program {
    static void Main(string[] args) {
        var l1 = new List<string> {
            "index1*index1*index1*some text",
            "index1*index1*index2*some text ** test test test",
            "index1*index2*index1*some text",
            "index1*index2*index2*some text",
            "index2*index1*index1*some text"
        };
        var l2 = new List<string> {
            "index1*index1*index2*some text ** test test test",
            "index2*index1*index1*some text",
            "index2*index1*index2*some text"
        };
        var eq = new EqCmp();
        Console.WriteLine("Elements that are both in l1 and l2:");
        foreach (var line in l1.Intersect(l2, eq))
            Console.WriteLine(line);
        Console.WriteLine("\nElements that are in l1 but not in l2:");
        foreach (var line in l1.Except(l2, eq))
            Console.WriteLine(line);
        // Etc...
    }
}
...prints the following result:
Elements that are both in l1 and l2:
index1*index1*index2*some text ** test test test
index2*index1*index1*some text
Elements that are in l1 but not in l2:
index1*index1*index1*some text
index1*index2*index1*some text
index1*index2*index2*some text
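The set operations above report which entries match, but the question also asks for paired output: both entries when the keys match, and an empty entry on the side where a key is missing. One way this could be sketched is a dictionary lookup by key followed by a second pass for the leftovers. This is an illustration under stated assumptions, not the answerer's code: `PairingSketch`, `KeyOf`, and `Pair` are hypothetical names, keys are assumed unique within the second list, and the comparison is ordinal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PairingSketch
{
    // Hypothetical helper: the first three '*'-delimited fields form the key.
    static string KeyOf(string line)
    {
        return string.Join("*", line.Split(new[] { '*' }, 4).Take(3));
    }

    // Returns { baseEntry, resultEntry } pairs; a missing side is an empty string.
    public static List<string[]> Pair(IEnumerable<string> baseList, IEnumerable<string> resultList)
    {
        // Materialize once so the result sequence is not enumerated twice.
        List<string> results = resultList.ToList();

        // Assumption: keys are unique within resultList; ToDictionary throws on duplicates.
        Dictionary<string, string> resultByKey =
            results.ToDictionary(s => KeyOf(s), StringComparer.Ordinal);

        var pairs = new List<string[]>();
        var seenKeys = new HashSet<string>(StringComparer.Ordinal);

        // First pass: every base entry, paired with its match or with an empty entry.
        foreach (string b in baseList)
        {
            string key = KeyOf(b);
            seenKeys.Add(key);
            string match;
            pairs.Add(new[] { b, resultByKey.TryGetValue(key, out match) ? match : "" });
        }

        // Second pass: result entries whose key never appeared in the base list.
        foreach (string r in results)
        {
            if (!seenKeys.Contains(KeyOf(r)))
                pairs.Add(new[] { "", r });
        }
        return pairs;
    }
}
```

Each returned array follows the `[0] for base, [1] for result` layout used in the question's `blocksComparison` list. If the case-insensitive matching from the question's `Except` calls is wanted, `StringComparer.OrdinalIgnoreCase` could be passed in both places instead.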
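// Java; requires: import java.util.*;
List<String> one = new ArrayList<>();
List<String> two = new ArrayList<>();
List<String> three = new ArrayList<>();
Map<String, Integer> intersect = new HashMap<>();
// Count how often each string occurs in the first list.
for (String index : one)
{
    intersect.merge(index, 1, Integer::sum);
}
// Collect the strings of the second list that also occur in the first.
for (String index : two)
{
    if (intersect.containsKey(index))
    {
        three.add(index);
    }
}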