用c#从mysql数据库中检索Word评分

本文关键字:检索 Word 评分 数据库 mysql | 更新日期: 2023-09-27 18:04:27

我想从数据库中检索一个词的分数然后我将对段落做出判断这是正面段落还是负面段落
数据库文件格式如下所示。其中某关键词有正负分

Word                Pos_Score           Neg_Score
Able                .324                .834
Country             .987                .213
Love                .378                .734 
agree               .546                .123
industry            .289                .714
guests              .874                .471

段落将是这样的。

I agree with you.  It seems an intelligent tourist industry allows its guests to either immerse fully, in part, or not, depending upon the guest.  That is why the ugly American charges have always confused me.  

现在我将段落的每个单词与数据库文件进行比较,如果在数据库文件中找到单词,那么我将检索单词的Pos_Score和Neg_Score分数,这些分数将存储在变量中,当整个段落将在末尾进行比较时,Pos_Score将单独添加,Neg_Score将单独添加。这就是结果。
代码我尝试的是这个

    private void button1_Click(object sender, EventArgs e)
            {
                string MyConString = "server=localhost;" +
                   "database=sentiwornet;" + "password=zia;" +
                   "User Id=root;";
                MySqlConnection connection = new MySqlConnection(MyConString);
                MySqlCommand command = connection.CreateCommand();
                MySqlDataReader Reader;
                StreamReader reader = new StreamReader("D:''input.txt");
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    string[] parts = line.Split(' ');
                    foreach (string part in parts)
                    {
                        command.CommandText = "SELECT Pos_Score FROM score WHERE Word = 'part'";
                        command.CommandText = "SELECT Neg_Score FROM score WHERE Word = 'part'";
                        //var 
                        connection.Open();
                        Reader = command.ExecuteReader();
                    }
                }
            }

用c#从mysql数据库中检索Word评分

首先,这个查询的效率非常低。相反,如果您的段落足够小,我将执行数据库内的所有连接,方法是将参数作为csv列表传入,然后转换为SQL中的表。下面的函数将做到这一点(由http://codebank.wordpress.com/2007/03/06/simple-sql-csv-to-table-2/):

提供)

注意:你需要去掉所有的标点符号,使用string.Replace(new[] { '.', ',' ... etc })

同样,我的代码也有可能不能完全满足你的要求——它甚至无法编译——但这正是编程的乐趣所在。这就给了你一个关于如何解决一个相当复杂的问题的大致思路。

编辑:我刚刚意识到你正在使用MySql。这段代码适用于MSSQL——我从来没有在CLR中使用过MySql,所以我不知道是否所有的类都是等价的。你可能需要回到你以前做的事情。

CSV到List

Create Function dbo.fn_CSVToTable (@CSVList Varchar(MAX))
Returns @Table Table (ColumnData Varchar(50))
As
Begin
If right(@CSVList, 1) <> ','
Select @CSVList = @CSVList + ','
Declare @Pos    Smallint,
@OldPos Smallint
Select  @Pos    = 1,
@OldPos = 1
While   @Pos < Len(@CSVList)
Begin
Select  @Pos = CharIndex(',', @CSVList, @OldPos)
Insert into @Table
Select  LTrim(RTrim(SubString(@CSVList, @OldPos, @Pos - @OldPos))) Col001
Select  @OldPos = @Pos + 1
End
Return
End
<<h3> SQL过程/h3>
CREATE PROCEDURE dbo.spGetWordScores (@csv varchar(MAX))
AS
select POS_SCORE, NEG_SCORE, WORD from score
inner join dbo.fn_CSVToTable(@csv) input
    on input.ColumnData = score.WORD

新的c#代码

var MyConString = "server=localhost;" +
               "database=sentiwornet;" + "password=zia;" +
               "User Id=root;";
var connection = new MySqlConnection(MyConString);
//Each line in the array will probably be one paragraph.
var fileLines = File.ReadAllLines("D:''input.txt");
foreach (var line in fileLines)
{
        //Format your line into words by removing punctuation. I'm not going to bother
        //with that code because it is trivial.
        //var csv = line.Split(' ');
        var command = connection.CreateCommand();
                    command.CommandText = "exec spGetWordScores";
                    command.Parameters.AddWithValue("@csv", csv);
        var ds = command.ExecuteDataSet();
        //Now you have a DataSet with your word scores. do with them what you will.
}

有用的扩展方法

public static class Extensions
{
    public static DataSet ExecuteDataSet(this SqlCommand command)
    {
        using (SqlDataAdapter da = new SqlDataAdapter(command)) {
        DataSet ds = new DataSet();
        // Fill the DataSet using default values for DataTable names, etc
        da.Fill(ds);
        return ds;
        }
    }
}

来回访问数据库会降低性能。最好编写一个存储过程,它接受输入字符串,将其拆分并计算分数——这样所有的处理都将在一台机器上进行,您将节省大量的时间,因为不需要传递部分结果。