Goals:
- filter and aggregate as fast as possible
- GROUP BY
Design approach:
NOT top-down
Design based on hardware capabilities
- we will do GROUP BY in memory
- will put all data in a hash table
- if the hash table is large, it will not fit in the CPU's L3 cache
- if the GROUP BY key values are not distributed locally, we get an L3 cache miss for every row in the table
- an L3 cache miss has 70..100 ns latency
- how many keys per second can we process?
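A rough back-of-the-envelope answer to the last question. This assumes (my assumptions, not stated in the notes) one L3 miss per row, a single thread, and no overlap between outstanding memory requests; the function name is hypothetical.

```python
def keys_per_second(miss_latency_ns):
    """Upper bound on keys processed per second if every key costs one cache miss."""
    return 1e9 / miss_latency_ns  # 1e9 nanoseconds in one second


for latency in (70, 100):
    rate = keys_per_second(latency)
    print(f"{latency} ns per miss -> {rate / 1e6:.1f} million keys/s")
```

Under these assumptions the bound is roughly 10 to 14 million keys per second per thread. Real CPUs can overlap several memory requests, so this is an estimate to reason with, not a measurement.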
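Design from the hardware up: memory, CPU, cache. Approach the problem from the low level instead of handling it only at the periphery, purely in software.
To solve a problem, split it by scenario; different scenarios call for different solutions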
- Hash Table
- memcpy
- there is even a specialized version for small data sizes, memcpySmallAllowReadWriteOverflow15
- new algorithms are welcome; pick whichever performs best in practice
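The idea behind a small-copy specialization like the one named above can be simulated as follows. This is only an illustration in Python, not the real C++ function: the name copy_small_allow_overflow15, the 16-byte chunk size, and the padding check are my assumptions, based on what the "Overflow15" suffix suggests (the caller guarantees 15 spare bytes after both buffers, so the copy never needs a tail loop).

```python
CHUNK = 16  # assumed chunk width in bytes (one SSE register)


def copy_small_allow_overflow15(dst, dst_off, src, src_off, n):
    """Copy n bytes in whole 16-byte chunks.

    May read and write up to 15 bytes past the requested range, so both
    buffers must be padded. Returns the number of bytes actually touched.
    """
    copied = 0
    while copied < n:
        s, d = src_off + copied, dst_off + copied
        if s + CHUNK > len(src) or d + CHUNK > len(dst):
            raise ValueError("buffers need 15 bytes of padding")
        dst[d:d + CHUNK] = src[s:s + CHUNK]  # always a full chunk, no tail handling
        copied += CHUNK
    return copied


src = bytearray(b"hello") + bytearray(15)  # 5 payload bytes + 15 bytes of padding
dst = bytearray(len(src))
touched = copy_small_allow_overflow15(dst, 0, src, 0, 5)
print(bytes(dst[:5]), touched)
```

The trade-off is that every caller must keep padded buffers, in exchange for a copy loop with no branches for the last few bytes.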
Different data sizes get different implementations
quantileTiming
uniqCombined
- small: flat array
- medium: hash table
- very large: HyperLogLog
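A minimal sketch of the three tiers above, to show how one counter can upgrade its representation as the data grows. This is not ClickHouse's uniqCombined code: the class name TieredUniq, the thresholds, the register count, and the use of blake2b as the hash are all hypothetical choices for illustration.

```python
import hashlib
import math


class TieredUniq:
    """Distinct counter: flat array -> hash set -> HyperLogLog."""

    def __init__(self, array_limit=16, set_limit=4096, precision=12):
        self.array_limit = array_limit
        self.set_limit = set_limit
        self.p = precision
        self.m = 1 << precision   # number of HyperLogLog registers
        self.small = []           # tier 1: flat array, linear scan
        self.medium = None        # tier 2: hash set, exact
        self.registers = None     # tier 3: HyperLogLog, approximate

    @property
    def tier(self):
        if self.registers is not None:
            return "hyperloglog"
        return "hash table" if self.medium is not None else "flat array"

    @staticmethod
    def _hash64(value):
        digest = hashlib.blake2b(repr(value).encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def _hll_add(self, value):
        h = self._hash64(value)
        idx = h >> (64 - self.p)                      # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # position of the first 1 bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def add(self, value):
        if self.registers is not None:
            self._hll_add(value)
        elif self.medium is not None:
            self.medium.add(value)
            if len(self.medium) > self.set_limit:     # upgrade: set -> HyperLogLog
                self.registers = [0] * self.m
                for v in self.medium:
                    self._hll_add(v)
                self.medium = None
        else:
            if value not in self.small:
                self.small.append(value)
            if len(self.small) > self.array_limit:    # upgrade: array -> set
                self.medium = set(self.small)
                self.small = None

    def count(self):
        if self.registers is not None:
            m = self.m
            alpha = 0.7213 / (1 + 1.079 / m)
            raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
            zeros = self.registers.count(0)
            if raw <= 2.5 * m and zeros:              # small-range correction
                return round(m * math.log(m / zeros))
            return round(raw)
        if self.medium is not None:
            return len(self.medium)
        return len(self.small)
```

The first two tiers are exact; only the last one trades accuracy for bounded memory, with an expected relative error of about 1.04 / sqrt(4096), or roughly 1.6%, at this register count.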
- keep in mind low-level details when designing your system
- design based on hardware capabilities
- choose data structures and abstractions based on the needs of the task
- provide specializations for special cases
- try the new, “best” algorithms that you read about yesterday
- choose the algorithm at runtime based on statistics
- benchmark on real datasets
- test for performance regressions in CI
- measure and observe everything
- even in production environment
- and rewrite code all the time
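One way the "choose the algorithm at runtime based on statistics" point could look in practice is sketched below. The notes do not say how the choice is made; the RuntimeSelector class, its warm-up rule, and picking the variant with the lowest mean time are my assumptions.

```python
import time


class RuntimeSelector:
    """Run each variant a few times, then prefer the one with the lowest mean time."""

    def __init__(self, variants, warmup=3):
        self.variants = dict(variants)  # name -> callable
        self.warmup = warmup
        self.total = {name: 0.0 for name in self.variants}
        self.runs = {name: 0 for name in self.variants}

    def choose(self):
        # Explore: every variant gets `warmup` measured runs first.
        for name, runs in self.runs.items():
            if runs < self.warmup:
                return name
        # Exploit: pick the variant with the lowest average elapsed time.
        return min(self.runs, key=lambda n: self.total[n] / self.runs[n])

    def record(self, name, elapsed):
        self.total[name] += elapsed
        self.runs[name] += 1

    def run(self, *args):
        name = self.choose()
        start = time.perf_counter()
        result = self.variants[name](*args)
        self.record(name, time.perf_counter() - start)
        return result


selector = RuntimeSelector({
    "builtin": sorted,
    "via_list_sort": lambda xs: (lambda ys: (ys.sort(), ys)[1])(list(xs)),
})
print(selector.run([3, 1, 2]))
```

Because timings are recorded on every call, the same statistics can also feed the "measure and observe everything" point, for example by exporting total and runs as metrics.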
Hardware-based design is a major point of emphasis
merge tree: periodically merges fragmented files
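A toy sketch of the periodic merge idea, assuming each fragment is a sorted run of keys held in memory. The function name merge_parts, the "merge the two smallest parts first" policy, and the max_parts limit are hypothetical; a real engine works on files and has its own merge selection rules.

```python
import heapq


def merge_parts(parts, max_parts=3):
    """Merge the smallest sorted parts until at most max_parts remain."""
    max_parts = max(1, max_parts)
    parts = sorted(parts, key=len)
    while len(parts) > max_parts:
        a, b = parts[0], parts[1]
        merged = list(heapq.merge(a, b))  # linear-time merge of two sorted runs
        parts = sorted(parts[2:] + [merged], key=len)
    return parts


fragments = [[1, 5], [2], [3, 4], [0, 9, 10], [7]]
print(merge_parts(fragments, max_parts=2))
```

Merging small parts first keeps the amount of rewritten data low, and fewer, larger sorted parts mean fewer files to touch per query.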
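A storage/compute separation mindset?
At the very beginning it was purely about solving the GROUP BY problem
Algorithms come first and abstraction second: performance matters most, and generality was not an initial consideration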