How to get the line numbers on which each word occurs (not the count)
Updated: 2023-09-27 18:27:59
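I am trying to build a concordance. I need to get the line numbers on which each word occurs. To do that, I decided to read each line, check whether a word from the dictionary is in that line, and if so, record the line's index and add it to a list. But if the same word appears twice in one line, I don't need that line's index shown twice. What is the best way to do this?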
    public static void Main(string[] args)
    {
        Dictionary<string, int> concordanceDictionary = new Dictionary<string, int>();
        string[] lines = File.ReadAllLines(@"C:\Text.txt");
        string myText =
            File.ReadAllText(path: @"C:\Text.txt").ToLower();
        string[] words = SplitWords(myText);
        foreach (var word in words)
        {
            int i = 1;
            if (!concordanceDictionary.ContainsKey(word))
            {
                concordanceDictionary.Add(word, i);
            }
            else
            {
                concordanceDictionary[word]++;
            }
        }
        var list = concordanceDictionary.Keys.ToList();
        list.Sort();
        foreach (var key in list)
        {
            List<int> numberOfLine = new List<int>();
            foreach (var line in lines)
            {
                if (line.Contains(key))
                {
                    int m = IndexOf(line);
                    numberOfLine.Add(m);
                }
                Console.WriteLine("{0}.........: {1}....{2}", key, concordanceDictionary[key], numberOfLine);
            }
        }
    }

    static string[] SplitWords(string s)
    {
        return Regex.Split(s, @"\W+");
    }
}
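The problem is here:
    int m = IndexOf(line);
    numberOfLine.Add(m);
How do I get the line numbers for each word in the dictionary?
This follows on from my previous post (about holding the number of strings). It has some ways to solve this, but I am too new to C# and there are things I don't understand. I would appreciate a more detailed explanation.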
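This kind of problem becomes much easier if you create your own class instead of forcing the use case into built-in objects and/or juggling separate collections whose indexes have to be kept in sync.
I would start by making a WordInfo object: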
public class WordInfo
{
    public WordInfo(string word, int firstLineNumber)
    {
        this.Word = word;
        this.WordCount = 1;
        this.LineNumbers = new List<int>() { firstLineNumber };
    }

    public string Word { get; set; }
    public int WordCount { get; set; }
    public List<int> LineNumbers { get; set; }
}
Then you can make concordanceDictionary a Dictionary<string, WordInfo>:
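Dictionary<string, WordInfo> concordanceDictionary = new Dictionary<string, WordInfo>();
int i = 1; // 1-based number of the line being processed
foreach (var line in File.ReadLines(@"C:\Text.txt"))
{
    // Lower-case the line before splitting; a string[] has no ToLower().
    foreach (string word in SplitWords(line.ToLower()))
    {
        // Regex.Split can return empty strings at the start or end of a line.
        if (word.Length == 0)
        {
            continue;
        }
        if (!concordanceDictionary.ContainsKey(word))
        {
            // First occurrence: the constructor records the count and the line.
            concordanceDictionary.Add(word, new WordInfo(word, i));
        }
        else
        {
            concordanceDictionary[word].WordCount++;
            // Record each line number only once per word.
            if (!concordanceDictionary[word].LineNumbers.Contains(i))
            {
                concordanceDictionary[word].LineNumbers.Add(i);
            }
        }
    }
    i++;
}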
Then, if you still want to sort the WordInfo objects:
List<WordInfo> sortedWordInfos = concordanceDictionary.Values.OrderByDescending(a => a.WordCount).ToList();
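To show how the pieces above fit together, here is a minimal end-to-end sketch. It makes a few assumptions that are not in the answer: the helper name `Build` and the class name `ConcordanceDemo` are hypothetical, the lines come from an in-memory array instead of a file, the words are sorted alphabetically as in the question, and the duplicate check compares only against the last recorded line number (which is enough because lines are visited in order). It also joins the line numbers with `string.Join`, since passing a `List<int>` straight to `Console.WriteLine` prints the type name, not the numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class WordInfo
{
    public WordInfo(string word, int firstLineNumber)
    {
        Word = word;
        WordCount = 1;
        LineNumbers = new List<int> { firstLineNumber };
    }

    public string Word { get; }
    public int WordCount { get; set; }
    public List<int> LineNumbers { get; }
}

public static class ConcordanceDemo
{
    // Hypothetical helper: builds the concordance from lines already in memory.
    public static Dictionary<string, WordInfo> Build(IEnumerable<string> lines)
    {
        var concordance = new Dictionary<string, WordInfo>();
        int lineNumber = 1;
        foreach (string line in lines)
        {
            foreach (string word in Regex.Split(line.ToLower(), @"\W+"))
            {
                if (word.Length == 0) continue; // Regex.Split can yield empty strings

                if (!concordance.TryGetValue(word, out WordInfo info))
                {
                    concordance.Add(word, new WordInfo(word, lineNumber));
                    continue;
                }

                info.WordCount++;
                // Lines are visited in order, so a repeat on this line
                // is always the last entry in the list.
                if (info.LineNumbers[info.LineNumbers.Count - 1] != lineNumber)
                {
                    info.LineNumbers.Add(lineNumber);
                }
            }
            lineNumber++;
        }
        return concordance;
    }

    public static void Main()
    {
        string[] sampleLines = { "This is line one and", "this is line two, two" };
        foreach (WordInfo info in Build(sampleLines).Values
                     .OrderBy(w => w.Word, StringComparer.Ordinal))
        {
            Console.WriteLine("{0}.........: {1}....{2}",
                info.Word, info.WordCount, string.Join(", ", info.LineNumbers));
        }
    }
}
```

With the sample lines, "two" is counted twice but its line list contains only line 2, which is the behaviour the question asks for.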
Alternatively, it may be easier to build a list of occurrences from the text first and then use it to fill in the index:
var lines = new[] { "This is line one and", "this is line two" };
var concordance = new[] { "and", "line", "this", "two", "foo" }
    .Select(x => x.ToLower()).ToList();
var occurrences = lines
    .SelectMany((line, i) =>
        line.ToLower()      // lower-case first so "This" and "this" count as one word
            .Split(' ')
            .Distinct()     // each word at most once per line
            .Select(w => new { name = w, index = i + 1 }))  // 1-based line number
    .ToLookup(w => w.name, w => w.index);
var all = concordance
    .Select(w => w + " : " + string.Join(", ", occurrences[w]));
foreach (var r in all)  // one output line per concordance word
    Console.WriteLine(r);
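The key piece that makes sure each word in a line is recorded only once is the Distinct call. Also note how the index is obtained inside the Select or SelectMany call.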
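The two points above can be isolated in a small sketch: the overload of `SelectMany` whose lambda takes `(element, index)` supplies the zero-based position of each line, and `Distinct` drops repeated words within a line. The names `IndexedSelectDemo` and `LineNumbersByWord` are hypothetical, and splitting on spaces with `RemoveEmptyEntries` is an assumption made to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexedSelectDemo
{
    // Hypothetical helper: maps each lower-cased word to the 1-based
    // numbers of the lines it appears on, each line at most once.
    public static ILookup<string, int> LineNumbersByWord(IEnumerable<string> lines)
    {
        return lines
            .SelectMany((line, i) => line
                .ToLower()
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Distinct()                                   // one entry per word per line
                .Select(word => new { Word = word, LineNumber = i + 1 }))
            .ToLookup(x => x.Word, x => x.LineNumber);
    }

    public static void Main()
    {
        var lookup = LineNumbersByWord(new[] { "one two two", "two three" });
        Console.WriteLine(string.Join(", ", lookup["two"]));  // prints "1, 2"
        Console.WriteLine(lookup["missing"].Count());         // prints "0"
    }
}
```

A lookup returns an empty sequence for a key it does not contain, which is why a concordance word such as "foo" that never occurs in the text simply yields an empty list of line numbers and no exception.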