Choosing the best sentence among N candidate sentences in C#

Keywords: best, sentence, select | Updated: 2023-09-27 18:15:30

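I have a text file whose content is arranged as sentences, one per line. Each sentence is repeated three times in a row. The only difference between the sentences in a group is how many tags they contain. The tags are [Loc], [Time], and [PER]. Consider the following example.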

Match between [Loc]India[/Loc] and [Loc]Seri Lanka[/Loc] will start at [Time]12 o'clock[/TIME]  
Match between [Loc]India[/Loc] and [Loc]Seri Lanka[/Loc] will start at 12 o'clock   
Match between [Loc]India[/Loc] and Seri Lanka will start at [Time]12 o'clock[/TIME]  
[PER]Dhoni[/PER] will lead Indian Team  
[PER]Dhoni[/PER] will lead [PER]Indian Team[/PER]  
Dhoni  will lead Indian Team  

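My goal is to select, from each group of sentences, the one that has the most tags. For example, in the first group, sentence 1 has three tags in total ([Loc], [Loc], and [Time]); likewise, in the second group it is sentence 2.
I tried using StreamReader, but I could not work out how to skip over sentences.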

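If the tags really do appear literally in the sentences, you can simply count them.

(I wrote this in VB.NET, but the C# version is essentially the same.)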

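Dim sen1 As String = "Match between [Loc]India[/Loc] and [Loc]Seri Lanka[/Loc] will start at [Time]12 o'clock[/TIME]"
' Each closing tag contains exactly one "/", so n tags split into n + 1 pieces.
' This assumes "/" never appears in the sentence text itself.
Dim split As String() = sen1.Split("/"c)
Dim senCount As Integer = split.Length - 1

Now you know that sentence 1 has 3 tags.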

You can even turn it into a function like this:

Private Function tagCount(ByVal thisSentence As String) As Integer
     ' n closing tags contain n "/" characters, so Split yields n + 1 pieces.
     Dim split As String() = thisSentence.Split("/"c)
     Dim senCount As Integer = split.Length - 1
     Return senCount
End Function
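
Since the question asks about C#, here is a rough C# port of the function above. The class name TagCounter and both method names are made up for illustration. CountBySlash mirrors the VB.NET version; CountClosingTags is a stricter variant of my own that assumes every tag is closed with something like [/Loc], so a stray "/" in the text is not counted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagCounter
{
    // Matches a closing tag such as [/Loc], [/TIME] or [/PER].
    private static readonly Regex ClosingTag = new Regex(@"\[/\w+\]");

    // Direct port of the VB.NET function: n closing tags contain n slashes,
    // so Split produces n + 1 pieces. Assumes "/" appears only inside tags.
    public static int CountBySlash(string sentence)
    {
        return sentence.Split('/').Length - 1;
    }

    // Stricter variant (an assumption, not part of the original answer):
    // counts only well-formed closing tags.
    public static int CountClosingTags(string sentence)
    {
        return ClosingTag.Matches(sentence).Count;
    }
}
```

For the sample line "[PER]Dhoni[/PER] will lead [PER]Indian Team[/PER]" both methods return 2. They differ only when the text contains a "/" outside a tag.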

Then loop over the sentences and determine which sentence in each group has the most tags.
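
The loop described above could look roughly like this in C#. This is only a sketch: BestSentencePicker, CountTags, and PickBest are hypothetical names, the group size of 3 comes from the question, and the file name in the usage comment is a placeholder. It assumes "/" occurs only in closing tags, skips blank lines, and silently drops a trailing group that has fewer than three sentences.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class BestSentencePicker
{
    // Counts tags by counting "/" characters (one per closing tag).
    public static int CountTags(string sentence)
    {
        return sentence.Count(c => c == '/');
    }

    // Walks the lines in consecutive groups of groupSize and keeps the line
    // with the most tags from each group. OrderByDescending is a stable sort,
    // so the earliest line wins when two lines tie.
    public static List<string> PickBest(IEnumerable<string> lines, int groupSize = 3)
    {
        var best = new List<string>();
        var group = new List<string>();
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue; // ignore blank lines
            group.Add(line.Trim());
            if (group.Count == groupSize)
            {
                best.Add(group.OrderByDescending(s => CountTags(s)).First());
                group.Clear();
            }
        }
        return best;
    }

    // Usage (placeholder path):
    //   foreach (var s in PickBest(File.ReadLines("sentences.txt")))
    //       Console.WriteLine(s);
}
```

File.ReadLines streams the file one line at a time, so there is no need to drive a StreamReader by hand or to skip lines explicitly; the grouping takes care of it. On the six sample lines from the question this picks the first sentence of the first group and the second sentence of the second group.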