Golang Distributed Data Store | Devil King's Blog

Timbala

“A distributed system is a model in which components located on networked computers communicate and coordinate their actions by passing messages.”

Requirements:

Sharding
Replication
High availability and throughput for data ingestion

OpenTSDB

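The work is split into several milestones:

Single node: can store and query data
Multiple nodes: sharding and replication, plus manual rebalancing
anti-entropy?
Exploratory work: NUMA, data/cache locality, SSDs, and so on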

In the end it comes down to a few points:

Coordination
- keep coordination to a minimum
- avoid coordination bottlenecks
Indexing
- each node knows what data it holds
- Consistent view; knows where each piece of data should reside
On-disk storage format
- Log-structured merge
- LevelDB
- RocksDB
- LMDB
- B-trees and b-tries (bitwise trie structure) for indexes
- Locality-preserving hashes
Cluster membership
- which nodes are in the cluster
- could be static; would dynamic be better?
- stop using a node once it is dead
Data placement (replication/sharding)
- Consistent hashing
- only about 1/n of the data should be displaced/relocated when a single node fails (n = number of nodes)
- partition key
Failure modes

For cluster membership: HashiCorp's memberlist
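
To make the cluster membership points above concrete, here is a minimal hand-rolled sketch. It does not use memberlist or its API; the `Membership` type and its methods are hypothetical names invented for illustration. It only shows the two ideas from the notes: a dead node stops being used, and every node should be able to derive the same ordered view of the cluster.

```go
package main

import (
	"sort"
	"sync"
)

// Membership is a hypothetical, minimal view of which nodes are usable.
// A real system would feed it from a gossip or failure-detection layer.
type Membership struct {
	mu    sync.RWMutex
	alive map[string]bool // node name -> currently considered alive
}

// NewMembership starts with a static list of nodes, all assumed alive.
func NewMembership(nodes ...string) *Membership {
	m := &Membership{alive: make(map[string]bool)}
	for _, n := range nodes {
		m.alive[n] = true
	}
	return m
}

// Join adds a node, or revives one that was marked dead.
func (m *Membership) Join(node string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.alive[node] = true
}

// MarkDead records that a node must no longer be used.
func (m *Membership) MarkDead(node string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.alive[node] = false
}

// Alive returns the usable nodes sorted by name, so that two nodes with
// the same membership information produce the same ordered list.
func (m *Membership) Alive() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make([]string, 0, len(m.alive))
	for n, ok := range m.alive {
		if ok {
			out = append(out, n)
		}
	}
	sort.Strings(out)
	return out
}
```

For example, `NewMembership("c", "a", "b")` followed by `MarkDead("b")` leaves `Alive()` returning `a` and `c`, in that order. Sorting is one simple way to get a consistent view; it assumes all nodes eventually agree on who is alive.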

Consistent hashing (jump consistent hash):

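// Hash maps key to a bucket in [0, numBuckets) using jump consistent hash.
// numBuckets must be at least 1; otherwise the result is -1.
func Hash(key uint64, numBuckets int) int32 {
    var b int64 = -1 // last bucket the key jumped to
    var j int64      // next candidate bucket
    for j < int64(numBuckets) {
        b = j
        // Advance a 64-bit linear congruential generator seeded by key.
        key = key*2862933555777941757 + 1
        // Jump ahead to the next bucket; the top 31 bits of key decide how far.
        j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
    }
    return int32(b)
}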
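
The hash only returns a bucket number, so something still has to turn a partition key into actual nodes. The sketch below is one possible way to do that, not necessarily what Timbala does: `NodesFor`, the use of FNV-1a to turn a string key into a `uint64`, and the "next nodes in sorted order" replica rule are all assumptions made for illustration. `jumpHash` is a copy of `Hash` above so that the block compiles on its own.

```go
package main

import (
	"hash/fnv"
	"sort"
)

// jumpHash is the same function as Hash above, repeated for self-containment.
func jumpHash(key uint64, numBuckets int) int32 {
	var b int64 = -1
	var j int64
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

// NodesFor returns the nodes that should hold partitionKey: a primary picked
// by jump hash, followed by the next replicas-1 nodes in sorted order.
// The replica rule is an assumption of this sketch.
func NodesFor(partitionKey string, nodes []string, replicas int) []string {
	if len(nodes) == 0 || replicas <= 0 {
		return nil
	}
	// Sort a copy so that every caller numbers the buckets the same way
	// and the caller's slice is left untouched.
	sorted := append([]string(nil), nodes...)
	sort.Strings(sorted)
	if replicas > len(sorted) {
		replicas = len(sorted)
	}

	h := fnv.New64a()
	h.Write([]byte(partitionKey)) // Write on a hash.Hash never returns an error
	primary := int(jumpHash(h.Sum64(), len(sorted)))

	out := make([]string, 0, replicas)
	for i := 0; i < replicas; i++ {
		out = append(out, sorted[(primary+i)%len(sorted)])
	}
	return out
}
```

Jump hash assumes buckets are numbered 0 to n-1 and that a new bucket is appended at the end. Sorting node names, as done here, gives every node the same numbering, but a new node whose name sorts into the middle renumbers the nodes after it and can move more data than the ideal 1/n.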

The testing here is quite interesting:

Unit tests
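- data distribution tests: check how evenly data is spread across nodes
- data displacement tests: test how data is relocated
- data displacement failure: handling of a failed relocation
- jump hash gotcha: when a node joins the cluster, the jump hash setup on all nodes has to be adjusted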
Acceptance tests
Integration tests
Benchmarking
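
The data distribution and data displacement unit tests listed above could be sketched roughly as follows. This is an illustration under assumptions, not the project's actual test code: `Distribution`, `MaxSkew`, and `Displaced` are hypothetical helpers, sequential integers stand in for real keys, and `jumpHash` is a copy of the `Hash` function shown earlier.

```go
package main

// jumpHash is the same function as Hash shown earlier, repeated so that
// this sketch compiles on its own.
func jumpHash(key uint64, numBuckets int) int32 {
	var b int64 = -1
	var j int64
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

// Distribution hashes the keys 0..numKeys-1 and counts keys per bucket.
// numBuckets must be at least 1.
func Distribution(numKeys, numBuckets int) []int {
	counts := make([]int, numBuckets)
	for k := 0; k < numKeys; k++ {
		counts[jumpHash(uint64(k), numBuckets)]++
	}
	return counts
}

// MaxSkew returns the largest relative deviation of any bucket from the
// mean bucket size (0 means perfectly even).
func MaxSkew(counts []int) float64 {
	total := 0
	for _, c := range counts {
		total += c
	}
	if len(counts) == 0 || total == 0 {
		return 0
	}
	mean := float64(total) / float64(len(counts))
	worst := 0.0
	for _, c := range counts {
		d := (float64(c) - mean) / mean
		if d < 0 {
			d = -d
		}
		if d > worst {
			worst = d
		}
	}
	return worst
}

// Displaced grows the cluster from n to n+1 buckets and reports how many
// keys changed bucket, and how many of those landed on the new bucket n.
func Displaced(numKeys, n int) (moved, movedToNew int) {
	for k := 0; k < numKeys; k++ {
		before := jumpHash(uint64(k), n)
		after := jumpHash(uint64(k), n+1)
		if before != after {
			moved++
			if int(after) == n {
				movedToNew++
			}
		}
	}
	return moved, movedToNew
}
```

A distribution test would assert that `MaxSkew(Distribution(numKeys, numBuckets))` stays under some tolerance chosen by the author. A displacement test would assert that every moved key lands on the new bucket (`moved == movedToNew`) and that `moved` is roughly `numKeys/(n+1)`.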