gqlxj1987's Blog

Handling One Million Requests with Go


Link to the original article

Agenda:

Problem

Receive a large volume of POST requests carrying JSON data, and upload that data to S3.

The initial approach was a worker-tier architecture.

With that approach, the system splits into two clusters: one handles the JSON requests, and the other uploads the data to S3.

With Go, however, those two clusters can be collapsed into two methods inside a single service.

Solution

goroutines

The approach is to use goroutines, but never in the naive way:

// Go through each payload and queue items individually to be posted to S3
for _, payload := range content.Payloads {
    go payload.UploadToS3()   // <----- DON'T DO THIS
}

Because each request has a very short lifecycle, we switch to a channel, which works much like an in-memory message queue.

The problem that follows is the buffer: it reaches its limit easily, and you cannot control how quickly it fills up.
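To make the buffer limit concrete, here is a minimal sketch of a bounded channel used as an in-memory queue. The `Payload` shape, the `tryEnqueue` helper, and the buffer size of 2 are assumptions chosen for illustration and do not come from the original article. A plain send would block once the buffer is full; the non-blocking send here just makes that moment visible.

```go
package main

import "fmt"

// Payload is a hypothetical stand-in for the decoded JSON body.
type Payload struct {
	ID int
}

// tryEnqueue attempts a non-blocking send and reports false when the
// channel's buffer is already full.
func tryEnqueue(queue chan Payload, p Payload) bool {
	select {
	case queue <- p:
		return true
	default:
		return false
	}
}

func main() {
	// A deliberately tiny buffer so the limit is easy to hit.
	queue := make(chan Payload, 2)

	// Nothing drains the queue, so the third send finds the buffer full.
	for i := 1; i <= 3; i++ {
		ok := tryEnqueue(queue, Payload{ID: i})
		fmt.Printf("payload %d enqueued: %v\n", i, ok)
	}
}
```

When producers outpace the consumer, a larger buffer only delays the point at which sends start to block or fail.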

We have decided to utilize a common pattern when using Go channels, in order to create a 2-tier channel system, one for queuing jobs and another to control how many workers operate on the JobQueue concurrently.

The relevant data structure is:

type Worker struct {
	WorkerPool chan chan Job
	JobChannel chan Job
	quit       chan bool
}
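The `chan chan Job` field can be puzzling at first, so here is a small sketch of how a channel of channels might be used. The `Payload` and `Job` shapes, the `NewWorker` constructor, and the `handOff` helper are assumptions for illustration; the post does not show them. The idea is that an idle worker pushes its own `JobChannel` into the shared pool, and whoever pulls it out can send a job straight to that worker.

```go
package main

import "fmt"

// Payload and Job are hypothetical shapes; the post does not define them.
type Payload struct{ Name string }

type Job struct{ Payload Payload }

type Worker struct {
	WorkerPool chan chan Job // shared pool where idle workers advertise themselves
	JobChannel chan Job      // this worker's private inbox
	quit       chan bool     // signals the worker to stop
}

// NewWorker is an assumed constructor that wires a worker to a shared pool.
// The inbox is buffered by one only so this single-goroutine demo cannot block.
func NewWorker(workerPool chan chan Job) Worker {
	return Worker{
		WorkerPool: workerPool,
		JobChannel: make(chan Job, 1),
		quit:       make(chan bool),
	}
}

// handOff takes one idle worker's inbox out of the pool, sends the job to it,
// and returns the inbox that was used.
func handOff(pool chan chan Job, job Job) chan Job {
	inbox := <-pool
	inbox <- job
	return inbox
}

func main() {
	pool := make(chan chan Job, 1)
	w := NewWorker(pool)

	// The worker announces that it is idle by registering its inbox.
	w.WorkerPool <- w.JobChannel

	inbox := handOff(pool, Job{Payload: Payload{Name: "example.json"}})
	job := <-inbox
	fmt.Println(inbox == w.JobChannel, job.Payload.Name) // prints "true example.json"
}
```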

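First, a number of workers are started, and then the dispatcher begins its loop. For each incoming job, the dispatcher first tries to obtain an available worker and then hands the job to that worker. The worker in turn receives the job through its JobChannel and performs the upload to S3.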

The key code is as follows:

for {
	select {
	case job := <-JobQueue:
		// a job request has been received
		go func(job Job) {
			// try to obtain a worker job channel that is available.
			// this will block until a worker is idle
			jobChannel := <-d.WorkerPool

			// dispatch the job to the worker job channel
			jobChannel <- job
		}(job)
	}
}

The code above is the dispatcher loop.

// Start method starts the run loop for the worker, listening for a quit channel in
// case we need to stop it
func (w Worker) Start() {
	go func() {
		for {
			// register the current worker into the worker queue.
			w.WorkerPool <- w.JobChannel

			select {
			case job := <-w.JobChannel:
				// we have received a work request.
				if err := job.Payload.UploadToS3(); err != nil {
					log.Errorf("Error uploading to S3: %s", err.Error())
				}

			case <-w.quit:
				// we have received a signal to stop
				return
			}
		}
	}()
}

The code above is the worker's internal loop.
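To see how the dispatcher and worker loops fit together, here is a self-contained sketch that can be run end to end. Several pieces are assumptions rather than the article's code: the `Dispatcher` struct and `NewDispatcher`, the `handle` hook that stands in for `UploadToS3`, the `processAll` helper, and the use of `range` over the job queue instead of `for`/`select`.

```go
package main

import (
	"fmt"
	"sync"
)

// Payload and Job are hypothetical shapes; the real upload is replaced by a hook.
type Payload struct{ ID int }

type Job struct{ Payload Payload }

type Worker struct {
	WorkerPool chan chan Job
	JobChannel chan Job
	quit       chan bool
	handle     func(Job) // assumed hook standing in for job.Payload.UploadToS3()
}

// Start runs the worker loop: register as idle, then wait for a job or a quit signal.
func (w Worker) Start() {
	go func() {
		for {
			w.WorkerPool <- w.JobChannel
			select {
			case job := <-w.JobChannel:
				w.handle(job)
			case <-w.quit:
				return
			}
		}
	}()
}

// Dispatcher is an assumed type that owns the pool of idle workers.
type Dispatcher struct {
	WorkerPool chan chan Job
	JobQueue   chan Job
	maxWorkers int
}

func NewDispatcher(maxWorkers int, jobQueue chan Job) *Dispatcher {
	return &Dispatcher{
		WorkerPool: make(chan chan Job, maxWorkers),
		JobQueue:   jobQueue,
		maxWorkers: maxWorkers,
	}
}

// Run starts maxWorkers workers and then the dispatch loop.
func (d *Dispatcher) Run(handle func(Job)) {
	for i := 0; i < d.maxWorkers; i++ {
		w := Worker{
			WorkerPool: d.WorkerPool,
			JobChannel: make(chan Job),
			quit:       make(chan bool),
			handle:     handle,
		}
		w.Start()
	}
	go d.dispatch()
}

// dispatch hands each queued job to the next idle worker.
func (d *Dispatcher) dispatch() {
	for job := range d.JobQueue {
		go func(job Job) {
			jobChannel := <-d.WorkerPool // blocks until a worker is idle
			jobChannel <- job
		}(job)
	}
}

// processAll pushes n jobs through maxWorkers workers and returns how many were handled.
func processAll(n, maxWorkers int) int {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		handled int
	)
	jobQueue := make(chan Job, n)
	wg.Add(n)
	NewDispatcher(maxWorkers, jobQueue).Run(func(Job) {
		mu.Lock()
		handled++
		mu.Unlock()
		wg.Done()
	})
	for i := 0; i < n; i++ {
		jobQueue <- Job{Payload: Payload{ID: i}}
	}
	wg.Wait()
	return handled
}

func main() {
	fmt.Println("jobs handled:", processAll(10, 3)) // prints "jobs handled: 10"
}
```

The worker count caps how many uploads run at once, while the job queue absorbs bursts. In this sketch the workers are never stopped, which is fine for a short demo but would need a shutdown path in a real service.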

The result: the number of servers dropped from 100 to 20.

Conclusion

Simplicity always wins in my book. We could have designed a complex system with many queues, background workers, complex deployments, but instead we decided to leverage the power of Elasticbeanstalk auto-scaling and the efficiency and simple approach to concurrency that Golang provides us out of the box.

The conveniences a language provides may well be better than bringing in all sorts of other complex systems.

There is always the right tool for the job

