DevilKing's blog

Google Spanner

Posted on 2017-02-15 In spanner

Spanner: Google’s Globally-Distributed Database

Agenda:

Overview
Property
Implements

Spanner

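Advantages of Spanner

A distributed, multi-version database

ACID support
SQL-like queries
Schematized tables
Semi-relational

NewSQL seems to come up here, in contrast to relational databases and NoSQL.

Lock-free read transactions
External consistency model
Supports replication and 2PC
Real-time aspects? TrueTime, a global time API?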

Handling One Million Requests with Go

Posted on 2017-02-12 In golang

Original article link

Agenda:

The Problem
The Solution
Conclusion

Problem

Receive a large volume of POST requests carrying JSON data and upload that data to S3.

The initial approach was a worker-tier architecture:

Sidekiq
Resque
DelayedJob
Elasticbeanstalk Worker Tier
RabbitMQ
and so on…

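With this approach the system splits into two clusters: one handles the JSON requests and the other uploads the data to S3.

With Go, however, those two clusters can be reduced to two methods inside a single service.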

Solution

goroutines

The approach is goroutines, but do not use them the naive way:

// Go through each payload and queue items individually to be posted to S3
for _, payload := range content.Payloads {
    go payload.UploadToS3()   // <----- DON'T DO THIS
}

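Because requests have a very short lifetime, we use a channel instead. A channel is, in effect, an in-memory message queue.

The problem that follows is the buffer: it reaches its limit easily, and you cannot control how fast it fills up.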

We have decided to utilize a common pattern when using Go channels, in order to create a 2-tier channel system, one for queuing jobs and another to control how many workers operate on the JobQueue concurrently.

The relevant data structure is:

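type Worker struct {
	WorkerPool chan chan Job // shared pool where idle workers register their job channel
	JobChannel chan Job      // this worker's private job channel
	quit       chan bool     // signals the worker to stop
}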

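First, several workers are started and the dispatcher begins running. The dispatcher tries to obtain an available worker and then hands the job to that worker. Inside the worker, the job arrives on its JobChannel and the worker performs the upload to S3.

The key code is: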

for {
	select {
	case job := <-JobQueue:
		// a job request has been received
		go func(job Job) {
			// try to obtain a worker job channel that is available.
			// this will block until a worker is idle
			jobChannel := <-d.WorkerPool

			// dispatch the job to the worker job channel
			jobChannel <- job
		}(job)
	}
}

The code above is the dispatcher.

// Start method starts the run loop for the worker, listening for a quit channel in
// case we need to stop it
func (w Worker) Start() {
	go func() {
		for {
			// register the current worker into the worker queue.
			w.WorkerPool <- w.JobChannel

			select {
			case job := <-w.JobChannel:
				// we have received a work request.
				if err := job.Payload.UploadToS3(); err != nil {
					log.Errorf("Error uploading to S3: %s", err.Error())
				}

			case <-w.quit:
				// we have received a signal to stop
				return
			}
		}
	}()
}

The code above is the worker's internal loop.

The result: the number of servers dropped from 100 to 20.
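The dispatcher and worker fragments above can be combined into one small runnable program. This is only a sketch under assumptions: the `handle` callback stands in for `job.Payload.UploadToS3()`, the `Job` type is reduced to an ID, `RunJobs` is a hypothetical helper, and the dispatcher hands jobs over inline rather than spawning a goroutine per job as the original does.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of work; the original carries an S3 payload.
type Job struct{ ID int }

// Worker registers its private JobChannel in the shared WorkerPool when idle.
type Worker struct {
	WorkerPool chan chan Job
	JobChannel chan Job
	quit       chan bool
	handle     func(Job) // stand-in for job.Payload.UploadToS3()
}

// Start runs the worker loop until a value arrives on quit.
func (w Worker) Start() {
	go func() {
		for {
			// The pool is buffered to the number of workers, so this never blocks.
			w.WorkerPool <- w.JobChannel
			select {
			case job := <-w.JobChannel:
				w.handle(job)
			case <-w.quit:
				return
			}
		}
	}()
}

// dispatch waits for an idle worker and hands it the next job.
func dispatch(jobQueue <-chan Job, pool chan chan Job) {
	for job := range jobQueue {
		jobChannel := <-pool // blocks until a worker is idle
		jobChannel <- job
	}
}

// RunJobs pushes numJobs jobs through numWorkers workers and returns how many
// were handled. numWorkers must be at least 1.
func RunJobs(numJobs, numWorkers int) int {
	pool := make(chan chan Job, numWorkers)
	jobQueue := make(chan Job, numJobs)

	var (
		mu        sync.Mutex
		processed int
		wg        sync.WaitGroup
	)
	wg.Add(numJobs)
	handle := func(Job) {
		mu.Lock()
		processed++
		mu.Unlock()
		wg.Done()
	}

	workers := make([]Worker, numWorkers)
	for i := range workers {
		workers[i] = Worker{pool, make(chan Job), make(chan bool), handle}
		workers[i].Start()
	}
	go dispatch(jobQueue, pool)

	for i := 0; i < numJobs; i++ {
		jobQueue <- Job{ID: i}
	}
	close(jobQueue)
	wg.Wait()

	// Stop every worker once all jobs are done.
	for _, w := range workers {
		w.quit <- true
	}
	mu.Lock()
	defer mu.Unlock()
	return processed
}

func main() {
	fmt.Println(RunJobs(100, 4)) // prints 100
}
```

The number of workers caps how many uploads run at once, which is the control the naive `go payload.UploadToS3()` version lacks.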

Conclusion

Simplicity always wins in my book. We could have designed a complex system with many queues, background workers, complex deployments, but instead we decided to leverage the power of Elasticbeanstalk auto-scaling and the efficiency and simple approach to concurrency that Golang provides us out of the box

The conveniences a language provides may serve better than bringing in all sorts of other complex systems.

There is always the right tool for the job

Running Away Is Shameful but Useful

Posted on 2017-02-12 In weekly

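Simplicity, equality, and candor as principles for dealing with the world

Calm, and touching the heart

This week's work:

Switching upload traffic over to production: 60% has been moved so far, and it looks fine
Tried installing hindsight
Revenue scraping and repair of the related data
A first draft of the OKRs, partly self-defined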

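Not finished this week:

Organizing the related documentation
Fixing some bugs in the ARPU calculation

Plan for next week:

Confirm the OKRs
Start filling in the documentation
Hand over the channel-monitoring code and fix some bugs
Move uploads to 100% and observe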

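What I learned this week:

Some knowledge of lua_sandbox
Some considerations around hindsight

I am starting to write technical articles as a way of summarizing, beginning with reading notes and then drawing on my own project experience.

I also need to gradually pick my earlier open-source work back up; the open-source goal has not changed.

Other scheduled work should also move forward step by step.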

hindsight

Posted on 2017-02-09 In log

hindsight, the replacement for hekad, has become the new favorite for log processing.

Agenda:

Project background
install guide
use tips
performance

Project background

heka is written in Go and handles log processing. Compared with logstash it uses less CPU, and it supports many kinds of plugins (written in Lua or Go).

heka vs logstash

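In practice, CPU usage dropped from 400% to around 100%.

logstash uses grok regular expressions to filter out the relevant information.

heka also uses regular expressions, supplemented by Lua scripts.

In terms of barrier to entry, logstash is easier because regex matching is fairly simple; heka needs plugin support as well as Lua scripts.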

heka vs hindsight

hindsight builds on lua_sandbox, supplemented by lua_sandbox_extensions, to cover the various log-processing requirements.

The state of Go(1.8)

Posted on 2017-02-06 In golang

Changes since Go 1.7

The Language
The Standard Library
The Runtime
The Tooling
The Community

The Language

Conversion rules

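A value can be explicitly converted between struct types that have the same sequence of fields and corresponding fields with the same type; as of Go 1.8, field tags are ignored for this check.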
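A minimal sketch of the conversion rule, assuming two hypothetical struct types, `apiUser` and `dbUser`, that differ only in their field tags:

```go
package main

import "fmt"

// apiUser and dbUser are hypothetical types with identical field names,
// types, and order; only the struct tags differ.
type apiUser struct {
	Name string `json:"name"`
	Age  int    `json:"age"`
}

type dbUser struct {
	Name string `db:"user_name"`
	Age  int    `db:"user_age"`
}

// toDB converts directly; before Go 1.8 the differing tags made this a compile error.
func toDB(u apiUser) dbUser {
	return dbUser(u)
}

func main() {
	fmt.Println(toDB(apiUser{Name: "gopher", Age: 7})) // prints {gopher 7}
}
```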

Ports to other platforms

Support for 32-bit MIPS on Linux, both big-endian and little-endian

ARM is supported as well

Tool

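A fix for context: go tool fix can rewrite imports of golang.org/x/net/context to the standard library context package.

go tool fix -diff -force=context state-of-go/tools/gofix.go

vet statically checks the code for suspicious constructs:

go vet state-of-go/tools/govet.go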

SSA everywhere?

Runtime

Detection of concurrent map accesses

On concurrent reads and writes to a map:

Does this mean the runtime hides the problem of concurrent map access, or that it detects it?
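To the question above: the runtime detects the misuse on a best-effort basis and crashes the program with a diagnosis; it does not make the access safe. The map still has to be protected by the program itself. A minimal sketch, assuming a hypothetical `counter` type guarded by a `sync.Mutex`:

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical wrapper that guards a map with a mutex.
type counter struct {
	mu sync.Mutex
	m  map[string]int
}

// inc increments one key while holding the lock.
func (c *counter) inc(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key]++
}

// countConcurrently increments the same key from several goroutines
// and returns the final value.
func countConcurrently(goroutines, perGoroutine int) int {
	c := &counter{m: make(map[string]int)}
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				c.inc("hits")
			}
		}()
	}
	wg.Wait()
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.m["hits"]
}

func main() {
	fmt.Println(countConcurrently(8, 1000)) // prints 8000
}
```

Without the lock in `inc`, the same program may abort with a fatal error about concurrent map writes.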

bench use mutex contention profile

On mutexes: wide protection versus narrow protection

GC history in tweets

defer and cgo are faster

The Standard Library

sort

sort.Sort(byName(p))
sort.Sort(byAge(p))

Sorting by a field is now supported natively, with no need to define Len, Less, and Swap as the calls above require.

This is only available through sort.Slice.
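A minimal sketch of the sort.Slice form, assuming a hypothetical `person` type with `Name` and `Age` fields:

```go
package main

import (
	"fmt"
	"sort"
)

// person is a hypothetical record type used only for this illustration.
type person struct {
	Name string
	Age  int
}

// sortByAge sorts in place using a less function instead of a sort.Interface type.
func sortByAge(p []person) {
	sort.Slice(p, func(i, j int) bool { return p[i].Age < p[j].Age })
}

func main() {
	p := []person{{"Carol", 41}, {"Alice", 29}, {"Bob", 35}}
	sortByAge(p)
	fmt.Println(p) // prints [{Alice 29} {Bob 35} {Carol 41}]
}
```

Sorting by name instead only requires a different less function, for example one comparing `p[i].Name < p[j].Name`.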

plugins

go build -buildmode=plugin

This compiles the package into a .so file that another program can then load, using Lookup to find the functions and variables it exports.

http shutdown

A Shutdown mechanism has been added to http.Server.

Does this make it possible to know promptly whether the http.Server has exited?
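On the question above: once Shutdown is called, Serve returns http.ErrServerClosed, so the caller can tell that the server stopped on purpose. A minimal sketch, assuming a hypothetical helper `serveAndShutdown` that listens on a random local port and uses a placeholder handler:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serveAndShutdown starts a server, shuts it down gracefully, and reports
// whether Serve returned http.ErrServerClosed.
func serveAndShutdown() (closedCleanly bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0 picks any free port
	if err != nil {
		return false, err
	}
	srv := &http.Server{Handler: http.NotFoundHandler()}

	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	// Give in-flight requests up to five seconds to finish.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return false, err
	}
	return <-serveErr == http.ErrServerClosed, nil
}

func main() {
	ok, err := serveAndShutdown()
	fmt.Println(ok, err)
}
```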

HTTP/2

Server push: is it done through the http.ResponseWriter, using the http.Pusher interface?
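In Go 1.8 a handler can type-assert its http.ResponseWriter to http.Pusher; the assertion succeeds only when the connection supports push. A minimal sketch, in which the asset path `/static/app.css`, the `indexHandler` name, and the `recordingPusher` test double are all assumptions made for illustration:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// indexHandler pushes a stylesheet when the connection supports it,
// then writes the page that references that stylesheet.
func indexHandler(w http.ResponseWriter, r *http.Request) {
	if p, ok := w.(http.Pusher); ok {
		// Push is best effort; on error the client simply requests the asset itself.
		_ = p.Push("/static/app.css", nil)
	}
	fmt.Fprint(w, `<link rel="stylesheet" href="/static/app.css">`)
}

// recordingPusher is a hypothetical test double that records push targets.
type recordingPusher struct {
	*httptest.ResponseRecorder
	pushed []string
}

func (p *recordingPusher) Push(target string, opts *http.PushOptions) error {
	p.pushed = append(p.pushed, target)
	return nil
}

// pushedTargets runs the handler against the test double and returns what it pushed.
func pushedTargets() []string {
	w := &recordingPusher{ResponseRecorder: httptest.NewRecorder()}
	indexHandler(w, httptest.NewRequest("GET", "/", nil))
	return w.pushed
}

func main() {
	fmt.Println(pushedTargets()) // prints [/static/app.css]
}
```

Over plain HTTP/1.1 the type assertion fails and the handler just serves the page.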