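I want to evaluate how my Windows Azure Table Storage queries scale. To do this, I set up a simple test environment in which I can increase the amount of data in my table and measure the execution time of the queries. Based on the measurements, I want to define a cost function that can be used to estimate the performance of future queries.
I evaluated the following queries (numbered 1 to 4 in the order listed):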
- Query with PartitionKey and RowKey
- Query with PartitionKey and an attribute
- Query with PartitionKey and two RowKeys
- Query with PartitionKey and two attributes
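For the last two queries, I tested the following two filter patterns (referred to below as .1 and .2, so query 3.2 is query 3 written in the second pattern; for query 4 the RowKey comparisons are replaced by attribute comparisons):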
- PartitionKey == "..." && (RowKey == "..." || RowKey == "...")
- (PartitionKey == "..." && RowKey == "...") || (PartitionKey == "..." && RowKey == "...")
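For reference, here is a minimal sketch of how the two patterns look as OData `$filter` strings, which is the form in which the Table service receives a filter over REST. The helper names (`eq`, `pattern_1`, `pattern_2`) are hypothetical and exist only for this illustration; only the `eq`, `and` and `or` operators are taken from the OData filter syntax.

```python
from typing import Sequence


def eq(field: str, value: str) -> str:
    """Return an OData equality clause; single quotes in the value are doubled."""
    escaped = value.replace("'", "''")
    return f"{field} eq '{escaped}'"


def pattern_1(partition: str, field: str, values: Sequence[str]) -> str:
    """PartitionKey == p && (field == v1 || field == v2 || ...)"""
    alternatives = " or ".join(eq(field, v) for v in values)
    return f"{eq('PartitionKey', partition)} and ({alternatives})"


def pattern_2(partition: str, field: str, values: Sequence[str]) -> str:
    """(PartitionKey == p && field == v1) || (PartitionKey == p && field == v2) || ..."""
    clauses = [f"({eq('PartitionKey', partition)} and {eq(field, v)})" for v in values]
    return " or ".join(clauses)
```

Calling `pattern_1("p1", "RowKey", ["r1", "r2"])` gives the filter for query 3.1 and `pattern_2` with the same arguments gives query 3.2; passing an attribute name instead of `"RowKey"` gives the query 4 variants.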
To minimize transfer latency, I ran the tests from an Azure instance. From the measurements, I can see that:
- query 1 is extremely fast (not surprisingly, as the table is indexed on those fields): about 10-15 ms with about 150,000 entries in the table.
- query 2 requires a partition scan, so the execution time increases linearly with the amount of stored data.
- query 3.1 performs almost exactly like query 2. So this query is also executed with a full partition scan, which seems a bit odd to me.
- query 4.1 is a bit more than two times slower than query 3.1. So it seems to be evaluated with two partition scans.
- and finally, queries 3.2 and 4.2 perform almost exactly 4 times slower than query 2.
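Since the scan-based queries grow linearly with the stored data, the cost function I have in mind could be sketched as a straight-line fit. This is only an assumption about the shape of the model: the names `fit_linear_cost` and `make_cost_function` are hypothetical, and the `factor` parameter is just a way to express the observed ratios (roughly 2 for query 4.1, roughly 4 for queries 3.2 and 4.2) relative to a single partition scan.

```python
from typing import Callable, Sequence, Tuple


def fit_linear_cost(counts: Sequence[float], times_ms: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of times_ms ~ fixed_ms + per_entity_ms * counts."""
    n = len(counts)
    if n < 2 or n != len(times_ms):
        raise ValueError("need at least two (count, time) pairs of equal length")
    mean_x = sum(counts) / n
    mean_y = sum(times_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in counts)
    if sxx == 0:
        raise ValueError("entity counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(counts, times_ms))
    per_entity_ms = sxy / sxx
    fixed_ms = mean_y - per_entity_ms * mean_x
    return fixed_ms, per_entity_ms


def make_cost_function(fixed_ms: float, per_entity_ms: float,
                       factor: float = 1.0) -> Callable[[float], float]:
    """Return a function estimating query time in ms for a given partition size.

    `factor` scales the single-scan estimate (assumption: slower query shapes
    behave like a constant multiple of one partition scan).
    """
    def estimate(entity_count: float) -> float:
        return factor * (fixed_ms + per_entity_ms * entity_count)
    return estimate
```

The fit would be fed with the measured (entry count, execution time) pairs for query 2; the resulting function can then be reused with a different `factor` for the other query shapes.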
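Can you explain the internals of the query/filter interpreter? Even if we accept that query 3.1 needs a partition scan, query 4.1 could be evaluated with the same logic (and in the same time). Queries 3.2 and 4.2 are a mystery to me. Any pointers on those?
Obviously, the whole point of this is that I would like to fetch several distinct elements within a single query, to minimize cost without losing performance. But it seems that issuing a separate query for each element (with the Task Parallel Library) is the only really fast solution. What is the accepted way of doing this?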
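To make the "one query per element, in parallel" approach concrete, here is a minimal sketch using Python threads in place of the Task Parallel Library. The function `fetch_many` and the `point_query` callable are hypothetical: `point_query` stands for whatever client call performs a single PartitionKey + RowKey lookup (query 1), and no real storage SDK is used here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Optional

Entity = Dict[str, str]
# Hypothetical signature: (partition_key, row_key) -> entity, or None if not found.
PointQuery = Callable[[str, str], Optional[Entity]]


def fetch_many(point_query: PointQuery, partition_key: str,
               row_keys: Iterable[str], max_workers: int = 8) -> List[Entity]:
    """Run one point query per row key concurrently; skip keys that are not found."""
    keys = list(row_keys)
    if not keys:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(keys))) as pool:
        # pool.map yields results in the same order as the input keys.
        results = pool.map(lambda rk: point_query(partition_key, rk), keys)
        return [entity for entity in results if entity is not None]


if __name__ == "__main__":
    # In-memory stand-in for a table, only to show the call shape.
    table = {("p1", "r1"): {"RowKey": "r1"}, ("p1", "r2"): {"RowKey": "r2"}}
    print(fetch_many(lambda pk, rk: table.get((pk, rk)), "p1", ["r1", "r2", "r3"]))
```

Each lookup stays an indexed point query, so latency should stay close to that of query 1, but every element becomes its own request, which is exactly the cost I was hoping to avoid.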