DevilKing's blog

Golang Distributed Data Store

Link to the original article

Timbala

“A distributed system is a model in which components located on networked computers communicate and coordinate their actions by passing messages.”

Requirements:

  • Sharding
  • Replication
  • High availability and throughput for data ingestion

OpenTSDB

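The work is split into several milestones:

  • Single node: data can be stored and queried
  • Multiple nodes: sharding, replication, and manual rebalancing
  • Anti-entropy?
  • Research topics: NUMA, data/cache locality, SSDs, and so on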

In the end, the design comes down to a few points:

  • Coordination
    • keep coordination to a minimum
    • avoid coordination bottlenecks
  • Indexing
    • each node knows what data is
    • Consistent view; knows where each piece of data should reside
  • On-disk storage format
    • Log-structured merge
    • LevelDB
    • RocksDB
    • LMDB
    • B-trees and b-tries (bitwise trie structure) for indexes
    • Locality-preserving hashes
  • Cluster membership
    • which nodes are in the cluster
    • could be static; would dynamic be better?
    • stop using a node once it is dead
  • Data placement (replication/sharding)
    • Consistent hashing
    • only about 1/n of the data should be displaced/relocated when a single node fails (n being the number of nodes)
    • partition key
  • Failure modes

HashiCorp's memberlist (a Go library for gossip-based cluster membership and failure detection)

Consistent hashing (the jump consistent hash algorithm):

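    // Hash maps key to a bucket in [0, numBuckets) using jump consistent hash.
    // It returns -1 if numBuckets <= 0.
    func Hash(key uint64, numBuckets int) int32 {
        var b int64 = -1 // last bucket the key jumped to
        var j int64      // next candidate bucket
        for j < int64(numBuckets) {
            b = j
            // Advance a 64-bit linear congruential generator seeded by the key.
            key = key*2862933555777941757 + 1
            // Jump ahead by a pseudo-random distance derived from the top bits.
            j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
        }
        return int32(b)
    }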
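As a usage sketch, the function can be combined with a string partition key by first hashing the key to a `uint64`. The `NodeFor` helper, the node names, and the choice of FNV-1a are assumptions made for illustration, not something the original notes specify. Because jump hash only returns a bucket index, the sketch also assumes every node holds the node list in the same order.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Hash is the jump consistent hash function from the notes.
func Hash(key uint64, numBuckets int) int32 {
	var b int64 = -1
	var j int64
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

// NodeFor is a hypothetical helper: it hashes a string partition key with
// FNV-1a and uses the jump hash result as an index into nodes. The slice
// must be ordered identically on every node. It returns "" for an empty list.
func NodeFor(partitionKey string, nodes []string) string {
	if len(nodes) == 0 {
		return ""
	}
	h := fnv.New64a()
	h.Write([]byte(partitionKey))
	return nodes[Hash(h.Sum64(), len(nodes))]
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	for _, key := range []string{"cpu.load", "mem.used", "disk.free"} {
		fmt.Printf("%s -> %s\n", key, NodeFor(key, nodes))
	}
}
```

The same key always maps to the same node for a given node list, which is what lets any node route a write without asking a coordinator.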

The testing approach here is quite interesting:

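  • Unit tests
    • data distribution tests: check how evenly data is spread across nodes
    • data displacement tests: check how much data migrates when the cluster changes
    • data displacement failure: how a failed migration is handled
    • jump hash gotcha: when a node joins the cluster, the jump hash mapping on every node has to be adjusted
  • Acceptance tests
  • Integration tests
  • Benchmarking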
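The distribution and displacement checks listed under unit tests can be sketched roughly as follows. The helper names (`RandomKeys`, `Distribution`, `Displaced`), the key count, and the bucket counts are assumptions chosen for illustration. They are not taken from Timbala's test suite.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Hash is the jump consistent hash function from the notes.
func Hash(key uint64, numBuckets int) int32 {
	var b int64 = -1
	var j int64
	for j < int64(numBuckets) {
		b = j
		key = key*2862933555777941757 + 1
		j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
	}
	return int32(b)
}

// RandomKeys returns n pseudo-random keys from a fixed seed (hypothetical helper).
func RandomKeys(n int, seed int64) []uint64 {
	rng := rand.New(rand.NewSource(seed))
	keys := make([]uint64, n)
	for i := range keys {
		keys[i] = rng.Uint64()
	}
	return keys
}

// Distribution counts how many keys land in each of numBuckets buckets.
func Distribution(keys []uint64, numBuckets int) []int {
	counts := make([]int, numBuckets)
	for _, k := range keys {
		counts[Hash(k, numBuckets)]++
	}
	return counts
}

// Displaced counts keys whose bucket differs between the two cluster sizes.
func Displaced(keys []uint64, from, to int) int {
	moved := 0
	for _, k := range keys {
		if Hash(k, from) != Hash(k, to) {
			moved++
		}
	}
	return moved
}

func main() {
	keys := RandomKeys(100000, 1)
	fmt.Println("keys per bucket:", Distribution(keys, 10))

	moved := Displaced(keys, 10, 11)
	fmt.Printf("moved when growing 10 -> 11 buckets: %d (%.1f%%)\n",
		moved, 100*float64(moved)/float64(len(keys)))
}
```

With jump hash, growing from 10 to 11 buckets should move roughly 1/11 of the keys, and every key that moves lands in the new bucket. That property is what keeps relocation close to the 1/n target.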