DevilKing's blog


Running a Keras model in Golang

Original article

Why run it in Golang?

  • Current infrastructure is already running Kubernetes / Docker containers, and Golang produces extremely small, efficient binaries
  • Web frameworks for Go are much faster than the Python ones
  • The team aren't necessarily data scientists working in Python; they work in Go, so there's no need to switch languages
  • Pushing data internally using gRPC for faster communication between microservices

Binary Classification in Keras

# Use TF to save the graph model instead of Keras save model to load it in Golang
builder = tf.saved_model.builder.SavedModelBuilder("myModel")
# Tag the model, required for Go
builder.add_meta_graph_and_variables(sess, ["myTag"])
builder.save()
sess.close()

Export with the SavedModel approach (via TensorFlow's SavedModelBuilder rather than Keras's own save), so the graph can be loaded from Go.

Loading and running the model in Go

package main

import (
	"fmt"

	tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

func main() {
	// replace myModel and myTag with the appropriate exported names in the chestrays-keras-binary-classification.ipynb
	model, err := tf.LoadSavedModel("myModel", []string{"myTag"}, nil)
	if err != nil {
		fmt.Printf("Error loading saved model: %s\n", err.Error())
		return
	}
	defer model.Session.Close()

	tensor, _ := tf.NewTensor([1][250][250][3]float32{})

	result, err := model.Session.Run(
		map[tf.Output]*tf.Tensor{
			model.Graph.Operation("inputLayer_input").Output(0): tensor, // Replace this with your input layer name
		},
		[]tf.Output{
			model.Graph.Operation("inferenceLayer/Sigmoid").Output(0), // Replace this with your output layer name
		},
		nil,
	)
	if err != nil {
		fmt.Printf("Error running the session with input, err: %s\n", err.Error())
		return
	}

	fmt.Printf("Result value: %v\n", result[0].Value())
}

The tensor we input is in the shape [batch size][width][height][channels].

The equivalent Python version:

%%time
from keras.preprocessing import image
from keras.models import load_model
import numpy as np
model = load_model("model.h5")
img = np.zeros((1,250,250,3))
x = np.vstack([img]) # just append to this if we have more than one image.
classes = model.predict_classes(x)
print(classes)

It is worth comparing the timings of the two versions.

Performance

Recall the model was:

  • 3x3x32 Convolutional Layer
  • 3x3x32 Convolutional Layer
  • 2x2 Max Pool Layer
  • 64 Node Fully Connected Layer with Dropout
  • 1 Sigmoid Output Layer

For Python:

  • CPU: ~2.72s to warm up and run one inference, ~0.049s for each inference after
  • GPU: ~3.52s to warm up and run one inference, ~0.009s for each inference after
  • Saved model size (HDF5): 242MB

For Go:

  • CPU: ~0.255s to warm up and run one inference, ~0.045s for each inference after
  • GPU: N/A
  • Saved model size (protobuf binaries): 236MB

The takeaway: use Go to serve up your models in production.

It seems this would run even better on Kubernetes.

The point is to handle the predict step in Go once the model has been trained and is essentially stable.