Redis - Tracking and querying hit counts within a given datetime range

Last updated: 2023-09-27 18:36:14

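I have many different items, and I want to track the number of hits for each one, then query each item's hit count within a given datetime range, down to the second.

So I started storing the hits in sorted sets, one sorted set per second (Unix epoch time), for example: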

zincrby ItemCount:1346742000 1 item1
zincrby ItemCount:1346742000 1 item2
zincrby ItemCount:1346742001 1 item1
zincrby ItemCount:1346742005 1 item9

Now, to get the total hits per item in a given date range:

1. Given a start datetime and an end datetime:
   Calculate the range of epoch values that fall within that range.
2. Generate the key name of each sorted set from the epoch values, for example:
   ItemCount:1346742001, ItemCount:1346742002, ItemCount:1346742003
3. Use a union store to aggregate the values from the different sorted sets:
   ZUNIONSTORE _item_count <numkeys> KEYS....
4. To get the final results out:
   ZRANGE _item_count 0 -1 WITHSCORES
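
Steps 1 and 2 can be sketched in Python roughly as follows. This is only an illustration, not the asker's actual code: the helper name item_count_keys is hypothetical, and it assumes timezone-aware datetimes and a range that is inclusive at both ends.

```python
from datetime import datetime, timezone


def item_count_keys(start, end, prefix="ItemCount"):
    """Return one sorted-set key name per second in [start, end] (inclusive).

    `start` and `end` are assumed to be timezone-aware datetimes.
    """
    first = int(start.timestamp())
    last = int(end.timestamp())
    return [f"{prefix}:{epoch}" for epoch in range(first, last + 1)]


# Three seconds starting at epoch 1346742001.
start = datetime.fromtimestamp(1346742001, tz=timezone.utc)
end = datetime.fromtimestamp(1346742003, tz=timezone.utc)
keys = item_count_keys(start, end)
print(keys)

# Number of per-second keys a 30-day range would need.
keys_per_month = 30 * 86400
print(keys_per_month)
```

The list grows by one key per second, so the number of arguments passed to the union command scales directly with the length of the queried range.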

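So it kind of works, but I run into problems with a large date range such as one month: the number of key names computed in steps 1 and 2 runs into the millions (86400 epoch values per day). With that many keys the ZUNIONSTORE command fails and the socket breaks. Looping through and generating that many keys also takes a while.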

How can I design this more efficiently in Redis while still keeping the tracking granularity down to seconds, rather than minutes or days?


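Yes, you should avoid large unions of sorted sets. There is a nice trick you can use, assuming you know the maximum number of hits an item can get per second.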

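Keep one sorted set per item, with the timestamp as both the score and the member.
However, if you are not the first client to write the score for that second, increment the score by 1/max_predicted_hits_per_second instead. That way the fractional part of the score is always hits/max_predicted_hits_per_second, and you can still run range queries on the timestamps.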
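
To make the encoding concrete, here is a small pure-Python sketch of the arithmetic, with no Redis involved. The names MAX_HITS_PER_SECOND, score_after and hits_from_score are hypothetical, and the limit of 1000 is only an assumed value.

```python
MAX_HITS_PER_SECOND = 1000  # assumed upper bound on hits per item per second


def score_after(timestamp, hits):
    """Score stored for `timestamp` once `hits` hits have been recorded."""
    return timestamp + hits / MAX_HITS_PER_SECOND


def hits_from_score(timestamp, score):
    """Recover the hit count from the fractional part of the score."""
    # round() absorbs the small floating-point error in the fraction.
    return round((score - timestamp) * MAX_HITS_PER_SECOND)


ts = 1346742000
score = score_after(ts, 3)  # integer part: the second, fraction: 3/1000
print(hits_from_score(ts, score))  # 3
```

The scheme only holds while the count stays strictly below the assumed maximum: at 1000 hits the score would reach the next whole second.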

Suppose max_predicted_hits_per_second is 1000. What we do is this (Python example):

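import time
from redis import Redis

redis = Redis()  # assumes a reachable Redis server and redis-py 3.0 or later

# 1. Make sure only one client adds the actual timestamp,
#    by doing SETNX on a temporary key for this item and this second.
now = int(time.time())
ts_key = 'item_ts:%s:%s' % (itemId, now)
rc = redis.setnx(ts_key, now)

# Just the count part.
val = float(1) / 1000
if rc:  # we are the first to increment this second
    val += now
    # We won't need the marker anymore soon, assuming all clients have the same clock.
    redis.expire(ts_key, 10)

# 2. Increment the count; the member is the timestamp itself.
#    redis-py 3.0+ takes zincrby(name, amount, value).
redis.zincrby('item_counts:%s' % itemId, val, now)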

Querying a range then looks like this:

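# With withscores=True, zrangebyscore returns (member, score) pairs;
# each member is the timestamp of one second.
counts = redis.zrangebyscore('item_counts:%s' % itemId, minTime, maxTime + 0.999, withscores=True)
total = 0
for value, score in counts:
    # The fractional part of the score is hits / 1000; round away float error.
    count = round((score - int(value)) * 1000)
    total += count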