Techniques for preventing accidental re-enumeration of an IEnumerable

Keywords: enumeration, IEnumerable, techniques, accidental re-enumeration | Updated: 2023-09-27 18:22:49

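I just spent some time scratching my head over a breakpoint that seemed to be hit, as if by magic, twice at the same spot inside an enumerator.

It turned out the bug was a simple oversight: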

    protected override void Extract()
    {
        LogGettingOffers();
        var offerIds = CakeMarketingUtility.OfferIds(advertiserId);
        LogExtractingClicks(offerIds);
        foreach (var offerId in offerIds)
        {
            int rowCount;
            var clicks = RetryUtility.Retry(3, 10000, new[] { typeof(Exception) }, () =>
            {
                return CakeMarketingUtility.EnumerateClicks(dateRange, advertiserId, offerId);
            });
            foreach (var clickBatch in clicks.InBatches(1000))
            {
                LogExtractedClicks(offerId, clickBatch);
                // SHOULD BE clickBatch, NOT clicks
                Add(clicks);
            }
        }
        End();
    }

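This made me wonder what kind of precautions one could take when writing code to guard against a mistake like this.

Note that I'm not sure going down this road even makes sense. Perhaps the answer is "don't write incorrect code", and I'm willing to accept that.

Here is the actual code that produces the results: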

    public static IEnumerable<Click> EnumerateClicks(DateRange dateRange, int advertiserId, int offerId)
    {
        // initialize to start at the first row
        int startAtRow = 1;
        // hard code an upper limit for the max number of rows to be returned in one call
        int rowLimitForOneCall = 5000;
        bool done = false;
        int total = 0;
        while (!done)
        {
            Logger.Info("Extracted a total of {0} rows, checking for more, starting at row {1}..", total, startAtRow);
            // prepare the request
            var request = new ClicksRequest
            {
                start_date = dateRange.FromDate.ToString("MM/dd/yyyy"),
                end_date = dateRange.ToDate.ToString("MM/dd/yyyy"),
                advertiser_id = advertiserId,
                offer_id = offerId,
                row_limit = rowLimitForOneCall,
                start_at_row = startAtRow
            };
            // create the client, call the service and check the response
            var client = new ClicksClient();
            var response = client.Clicks(request);
            if (!response.Success)
            {
                throw new Exception("ClicksClient failed");
            }
            // update the running total
            total += response.RowCount;
            // return result
            foreach (var click in response.Clicks)
                yield return click;
            // update stopping condition for loop
            done = (response.RowCount < rowLimitForOneCall);
            // increment start row for next iteration
            startAtRow += rowLimitForOneCall;
        }
        Logger.Info("Extracted a total of {0}, done.", total);
    }
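As a side note on why the breakpoint is hit twice: the body of an iterator method (one that uses `yield return`) runs again from the top for every new enumeration, so passing `clicks` to `Add` inside the loop restarts the whole extraction. The following is only a minimal sketch of that behavior; `DeferredDemo`, `Numbers`, and `Runs` are hypothetical names standing in for `EnumerateClicks`, and no service call is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the iterator body has started running.
    public static int Runs;

    // Hypothetical stand-in for EnumerateClicks: yields three numbers.
    public static IEnumerable<int> Numbers()
    {
        // Runs on the first MoveNext of each new enumerator,
        // not when Numbers() itself is called.
        Runs++;
        for (int i = 1; i <= 3; i++)
            yield return i;
    }

    // Returns the number of times the iterator body ran.
    public static int Run()
    {
        Runs = 0;
        IEnumerable<int> numbers = Numbers(); // nothing has executed yet
        int sum = numbers.Sum();              // first enumeration
        int count = numbers.Count();          // second enumeration re-runs the body
        Console.WriteLine("sum={0}, count={1}, runs={2}", sum, count, Runs);
        return Runs;
    }
}
```

In the question's code the same thing happens with a paged web service call in place of the counter, which is why the mistake is expensive as well as confusing.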

Techniques for preventing accidental re-enumeration of an IEnumerable

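For this particular problem, I'd say the solution is "don't write incorrect code". Especially when the results can be produced without changing any state (for example, when enumerating the elements of a list), I think it is fine to create multiple enumerators from any enumerable.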
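To illustrate the point about stateless enumeration: once a sequence has been copied into a list, enumerating it again only reads memory and does not touch the original source. This is a sketch under assumptions; `SnapshotDemo`, `FetchRows`, and `Fetches` are hypothetical names, and the "expensive" source is simulated with a counter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    // Counts how many times the simulated source was enumerated.
    public static int Fetches;

    // Hypothetical expensive source; each enumeration counts as one fetch.
    public static IEnumerable<string> FetchRows()
    {
        Fetches++;
        yield return "a";
        yield return "b";
    }

    // Returns how many times the source was actually enumerated.
    public static int Run()
    {
        Fetches = 0;
        List<string> rows = FetchRows().ToList(); // the only enumeration of the source
        int size = rows.Count;                    // reads the list, no fetch
        string joined = string.Join(",", rows);   // enumerates the list, no fetch
        Console.WriteLine("{0} rows: {1}", size, joined);
        return Fetches;
    }
}
```

The trade-off is that `ToList` buffers everything in memory, which may not suit a paged extraction like the one in the question.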

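You could create an IEnumerable wrapper that ensures GetEnumerator is only called once, but what if you legitimately need to call it twice? What you really want is to catch the bug, not to catch enumerables that are enumerated more than once, and that is not something you can easily put into a software solution.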

Perhaps the problem is that clickBatch and clicks have the same type, so the compiler cannot tell the two apart.
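One way to act on that observation is to give batches a type of their own, so that passing the whole sequence where a batch is expected fails to compile. The sketch below is an assumption about how that might look, not the code from the question: `Batch<T>`, `InTypedBatches`, and `BatchSink` are hypothetical names, and `InTypedBatches` stands in for the question's `InBatches`, whose implementation is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: deliberately not an IEnumerable<T>,
// so it cannot be confused with the full sequence.
public sealed class Batch<T>
{
    public IReadOnlyList<T> Items { get; }
    public Batch(IReadOnlyList<T> items) { Items = items; }
}

public static class BatchExtensions
{
    // Hypothetical stand-in for InBatches that yields Batch<T> values.
    public static IEnumerable<Batch<T>> InTypedBatches<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        var buffer = new List<T>(size);
        foreach (var item in source)
        {
            buffer.Add(item);
            if (buffer.Count == size)
            {
                yield return new Batch<T>(buffer);
                buffer = new List<T>(size); // fresh list so earlier batches stay intact
            }
        }
        if (buffer.Count > 0)
            yield return new Batch<T>(buffer); // final partial batch
    }
}

// Hypothetical consumer playing the role of Add in the question.
public class BatchSink
{
    public int Added;
    public void Add(Batch<int> batch) { Added += batch.Items.Count; }
}

public static class TypedBatchDemo
{
    public static int Run()
    {
        IEnumerable<int> clicks = Enumerable.Range(1, 5);
        var sink = new BatchSink();
        foreach (var clickBatch in clicks.InTypedBatches(2))
        {
            sink.Add(clickBatch);  // compiles
            // sink.Add(clicks);   // would not compile: IEnumerable<int> is not a Batch<int>
        }
        return sink.Added;
    }
}
```

The cost is an extra type and a change to the `Add` signature, so this only helps when you control both sides of the call.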

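Sometimes I need to make sure that an enumerable I expose is only enumerated once. For example: when returning streamed data that I can only read once, or the results of a very expensive query.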

Try the following extension class:

    using System.Collections.Generic;
    using System.Linq;

    public static class Extensions
    {
        public static IEnumerable<T> SingleEnumeration<T>(this IEnumerable<T> source)
        {
            return new SingleEnumerator<T>(source);
        }
    }

    // Wraps a sequence so that only the first GetEnumerator call reaches the source.
    public class SingleEnumerator<T> : IEnumerable<T>
    {
        private IEnumerable<T> source;

        public SingleEnumerator(IEnumerable<T> source)
        {
            this.source = source;
        }

        public IEnumerator<T> GetEnumerator()
        {
            // return an empty stream if called twice (or throw)
            if (source == null)
                return Enumerable.Empty<T>().GetEnumerator();
            // return the actual stream and forget the source
            var result = source.GetEnumerator();
            source = null;
            return result;
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            // same single-use rule as the generic overload
            return GetEnumerator();
        }
    }
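The comments above mention throwing as an alternative to returning an empty stream. Silently returning nothing can hide exactly the kind of bug described in the question, so a throwing variant may be preferable while debugging. The following is a sketch of that option under assumptions: `OnceOnlyEnumerable<T>` is a hypothetical name, the `Interlocked.Exchange` call is an addition so that two threads cannot both obtain the source, and the code uses a throw expression, so it assumes C# 7 or later.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical variant of the wrapper that fails loudly on a second enumeration.
public sealed class OnceOnlyEnumerable<T> : IEnumerable<T>
{
    private IEnumerable<T> source;

    public OnceOnlyEnumerable(IEnumerable<T> source)
    {
        this.source = source ?? throw new ArgumentNullException(nameof(source));
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Atomically take the source so that only one caller can ever obtain it.
        var taken = Interlocked.Exchange(ref source, null);
        if (taken == null)
            throw new InvalidOperationException("This sequence can only be enumerated once.");
        return taken.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Applied to the question's loop, wrapping `clicks` this way would make the mistaken `Add(clicks)` throw on the first batch, provided `Add` actually enumerates its argument.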