Azure Table Storage query returns data from the wrong partition

Updated: 2023-09-27 18:32:29

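I'm using Azure Table Storage and trying to iterate over a table quickly.

I must be doing something wrong, but for the life of me I can't see what.

When I specify just one partition, I end up with results from multiple partitions. That is, if I constrain the query with only "pkey1", it returns 1000 results for "pkey1" and then 325 results for "pkey2".

I have no idea how this can happen.

Here is the code I'm using: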

private CloudTableClient _client;
private string _tableName;
private class QueryState
{
    public CloudTableQuery<T> Ctq;
    public Action<IEnumerable<T>> Populator;
    public ManualResetEvent Mre;
    public string Pkey;
    public QueryState(CloudTableQuery<T> ctq, Action<IEnumerable<T>> populator, ManualResetEvent mre, string pkey)
    {
        Populator = populator;
        Ctq = ctq;
        Mre = mre;
        Pkey = pkey;
    }
}
public void ParallelQueryWithClause(Action<IEnumerable<T>> populator, string[] partitionKeys)
{
    List<ManualResetEvent> mre = new List<ManualResetEvent>();
    foreach (string pKey in partitionKeys)
    {
        //_retry.Go(tsc =>
        //    {
        TableServiceContext tsc = _client.GetDataServiceContext();
        ManualResetEvent m = new ManualResetEvent(false);
        mre.Add(m);
        CloudTableQuery<T> query = tsc.CreateQuery<T>(_tableName).Where(e => e.PartitionKey == pKey).AsTableServiceQuery<T>();
        Action<IAsyncResult> act = null;
        act = result =>
        {
            int retries = 0;
            while (retries++ < 5)
            {
                try
                {
                    QueryState qsInternal = result.AsyncState as QueryState;
                    CloudTableQuery<T> ctq = qsInternal.Ctq;
                    ResultSegment<T> seg = ctq.EndExecuteSegmented(result);
                    if (seg.Results.Count() > 0)
                        populator(seg.Results);
                    if (seg.ContinuationToken != null)
                    {
                        ctq.BeginExecuteSegmented(seg.ContinuationToken, iasync => act(iasync), qsInternal);
                    }
                    else
                    {
                        m.Set();
                    }
                    break;
                }
                catch (Exception ex)
                {
                    Logger.LogError(ex);
                }
            }
        };
        query.BeginExecuteSegmented(iasync => act(iasync), new QueryState(query, populator, m, pKey));
        //});
    }
    ManualResetEvent.WaitAll(mre.ToArray());
}

And the sample calling code:

AzureTableStorage<ProductEntity> _ats = new AzureTableStorage<ProductEntity>("Products");
string[] partitions = new string[] { "pkey1" };
Dictionary<string, int> cntr = new Dictionary<string, int>();
_ats.ParallelQueryWithClause(p =>
{
    lock (cntr)
    {
        foreach (ProductEntity pe in p)
        {
            if (cntr.ContainsKey(pe.PartitionKey))
                cntr[pe.PartitionKey]++;
            else
                cntr.Add(pe.PartitionKey, 1);
        }
    }
}, partitions);

Hopefully this makes sense and someone can help!


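You may be running into a modified closure: the lambda in the query captures the loop variable pKey, and that variable's value changes after the lambda is created. See http://marlongrech.wordpress.com/2010/06/02/closures-in-c-can-be-evil/ for background. The query would then run with pKey set to null, which effectively applies no filter on pKey and returns every value in the table.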

Try replacing

foreach (string pKey in partitionKeys)

with

foreach (string pKeyTmp in partitionKeys)
{
   string pKey = pKeyTmp;
   // rest of the loop body unchanged
}
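
To see the capture behavior in isolation, here is a minimal sketch that does not touch Azure at all. The class and method names (ClosureCaptureDemo, SharedCapture, PerIterationCopy) are made up for illustration. Note that from C# 5 onward a foreach loop variable is already a fresh variable on each iteration, so the first method deliberately uses a variable declared outside a for loop to reproduce the shared-variable situation on any compiler version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the code in the question.
public static class ClosureCaptureDemo
{
    // Every lambda closes over the SAME variable, so when the lambdas
    // are finally invoked they all see its last assigned value.
    public static List<string> SharedCapture(string[] keys)
    {
        var readers = new List<Func<string>>();
        string current = "";                 // declared once, outside the loop
        for (int i = 0; i < keys.Length; i++)
        {
            current = keys[i];
            readers.Add(() => current);      // captures the variable, not its value
        }
        return readers.Select(r => r()).ToList();
    }

    // Copying into a variable declared inside the loop body gives each
    // lambda its own variable, which is what the suggested fix does.
    public static List<string> PerIterationCopy(string[] keys)
    {
        var readers = new List<Func<string>>();
        foreach (string keyTmp in keys)
        {
            string key = keyTmp;             // fresh variable per iteration
            readers.Add(() => key);
        }
        return readers.Select(r => r()).ToList();
    }

    public static void Main()
    {
        var keys = new[] { "pkey1", "pkey2" };
        Console.WriteLine(string.Join(",", SharedCapture(keys)));    // pkey2,pkey2
        Console.WriteLine(string.Join(",", PerIterationCopy(keys))); // pkey1,pkey2
    }
}
```

If the local copy does not change what your query returns, the capture was probably not the cause in your build, and the continuation-token handling would be the next place I'd look.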