My own view is that the PyTorch ecosystem is still not as complete as TensorFlow/Keras, but in NLP research PyTorch is trending toward relatively more adoption than other frameworks and other research areas.
Flux: A New Approach to System Intuition
Target: understand in real time what effect a variable is having on a subset of request traffic during a Chaos Experiment.
We require a tool that can give us this holistic understanding of traffic as it flows through our complex, distributed system.
Requirements:
- Realtime data
- Data on the volume, latency, and health of requests
- Insight into traffic at network edge
- The ability to drill into IPC traffic
- Dependency information about the microservices as requests travel through the system
Pain Suit: As a microservice experiences failure, the corresponding electrodes cause a painful sensation.
Mainly aimed at the traffic-monitoring part?
traffic failover?
Controlling where the traffic flows:
The inter-region traffic from victim to savior increases while the savior region scales up. At that point, we switch DNS to point to the savior region. For about 10 seconds you see traffic to the victim region die down as DNS propagates. At this point, about 56 seconds in, nearly all of the victim region’s traffic is now pointing to the savior region
CNN architectures
Why did CNN models beat traditional computer-vision methods?
The traditional image-classification pipeline involves two modules:
Feature extraction
Extract higher-level features from the raw pixels; commonly used hand-crafted features include GIST, HOG, SIFT, LBP, etc.
Classification
Commonly used classification models include SVM, logistic regression (LR), random forests, and decision trees.
One big problem with this pipeline is that the feature extraction cannot be adjusted according to the images and their labels. The best the traditional pipeline could do was to use several different feature extractors and combine them to obtain a better feature.
The philosophy behind deep learning is to build no hard-coded feature extractor. It integrates the feature-extraction and classification modules into a single system that learns to extract features from the images and classifies them based on labeled data. Such an integrated system is the multilayer perceptron, a neural network composed of multiple densely connected layers of neurons.
Two characteristics of CNN models: weight sharing between neurons and sparse connections in the convolutional layers.
To understand the design philosophy behind CNNs, you might ask: what is the goal?
Accuracy and computational cost.
AlexNet
AlexNet uses the ReLU activation function rather than the Tanh or Sigmoid activations of earlier neural networks:
f(x) = max(0, x)
The advantages are faster training and avoidance of the vanishing-gradient problem.
Another feature is that it reduces overfitting by adding a Dropout layer after each fully connected layer. A Dropout layer randomly zeroes the activations of the neurons in the current layer with a fixed probability.
VGG16
One improvement of VGG16 over AlexNet is to replace the large kernels (11x11, 5x5) with stacks of consecutive 3x3 convolutions.
For a given receptive field (the local region of the input image that affects an output), stacking small kernels is better than using one large kernel: the multiple nonlinear layers add depth, which lets the network learn more complex patterns, and at a lower cost (fewer parameters).
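A quick sanity check of the parameter claim (assuming C channels in and out, ignoring biases): two stacked 3x3 convolutions cover the same 5x5 receptive field as a single 5x5 convolution but cost 2 x 9 x C^2 = 18C^2 weights versus 25C^2, and three stacked 3x3 convolutions match a 7x7 receptive field with 27C^2 versus 49C^2 weights, while also inserting extra nonlinearities.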
VGG-D uses a block structure: convolutions of the same kernel size are repeated several times in a row to extract more complex and more expressive features.
- GoogLeNet/Inception
Based on the idea that most activations in a deep network are either unnecessary (zero) or redundant because of correlations between them. The most efficient deep-network architecture would therefore have sparse connections between activations, meaning the 512 output feature maps do not need to be connected to all 512 input feature maps.
It introduces a module called Inception, which approximates a sparse CNN with a dense construction. Another feature is the use of a bottleneck layer (in practice a 1x1 convolution) to reduce the amount of computation.
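Rough arithmetic for why the bottleneck helps (the channel counts here are assumed for illustration): mapping 256 channels to 256 with a direct 3x3 convolution costs 256 x 256 x 9 ≈ 590K multiplies per output position, while a 256 -> 64 1x1, a 3x3 on 64 channels, then a 64 -> 256 1x1 cost 256x64 + 64x64x9 + 64x256 ≈ 70K, roughly 8.5x cheaper.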
Another special design is that the fully connected layers after the last convolutional layer are replaced with a global average pooling layer, where global pooling means taking the mean over each entire 2D feature map.
- ResNet
From the above, accuracy should increase along with network depth (keeping overfitting in mind). But one problem with added depth concerns the signal that updates the parameters: since gradients propagate from back to front, increasing the depth makes the gradients at the early layers very small.
This means those layers essentially stop learning, which is the vanishing-gradient problem. A second problem with deep networks is training itself: a deeper network means a larger parameter space and a harder optimization problem, so naively increasing depth can actually produce a higher training error.
ResNet introduces the idea of learning a residual: instead of fitting the desired mapping H(x) directly, the stacked layers fit F(x) = H(x) - x, and a shortcut connection adds x back so that the block outputs F(x) + x, letting gradients flow to the early layers through the shortcut.
Millisecond-level real-time ranking
Golang Tips
Go data structures: slices
Differences between C arrays and Go arrays:
- in C an array name is essentially an alias for (a pointer to) its first element; in Go an array name denotes the whole value
- C arrays are passed to functions as a pointer, while Go passes arrays by value (the whole array is copied)
- in C an array cannot be copied with ar1 = ar2 unless ar1 and ar2 are pointers, but this works in Go
- heap-allocated C arrays must be freed manually; Go supports garbage collection
Differences between a Go array and a slice:
- a Go slice is a three-word data structure: <pointer, length, capacity>
- a slice is a view over data stored in a contiguous block of memory
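As a sketch, that three-word header mirrors what the Go runtime itself stores for a slice (the struct below is illustrative, modeled on runtime/slice.go):

package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader is an illustrative copy of the runtime's internal slice layout.
type sliceHeader struct {
	array unsafe.Pointer // pointer to the backing array
	len   int            // elements currently in use
	cap   int            // size of the backing array from `array` onward
}

func main() {
	s := make([]int, 2, 5)
	fmt.Println(unsafe.Sizeof(s)) // 24 on 64-bit: three 8-byte words
}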
nil slice: var slice []int
empty slice: slice := make([]int, 0) or slice := []int{}
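A minimal example of the difference: a nil slice and an empty slice compare differently against nil, but behave identically under len and append:

package main

import "fmt"

func main() {
	var nilSlice []int           // nil slice: header is <nil, 0, 0>
	emptySlice := make([]int, 0) // empty slice: valid pointer, zero length
	alsoEmpty := []int{}

	fmt.Println(nilSlice == nil, emptySlice == nil, alsoEmpty == nil) // true false false
	fmt.Println(len(nilSlice), len(emptySlice), len(alsoEmpty))       // 0 0 0
}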
growing a slice -> like a Java list there is a growth factor: the runtime roughly doubles the capacity while the slice is small and grows it by about 25% once it is large (the exact thresholds are an implementation detail)
growing a slice with append
- the append function takes a source slice, appends the values, and returns a new slice
- append always increases the length of the new slice, but the capacity may or may not increase
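A small sketch that makes the growth visible (the exact capacities printed are a runtime detail and vary across Go versions):

package main

import "fmt"

func main() {
	var s []int
	prevCap := cap(s)
	for i := 0; i < 100; i++ {
		s = append(s, i)
		if cap(s) != prevCap { // a capacity change means append allocated a bigger backing array
			fmt.Printf("len=%d cap=%d\n", len(s), cap(s))
			prevCap = cap(s)
		}
	}
}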
slice append: the third index
- the third index of a slice expression restricts the capacity
- slice := source[2:3:4]
- by setting capacity == length (e.g. source[2:3:3]), the next append is forced to detach from the source backing array and create its own backing array
- this technique is used in scenarios where we want to modify or append to the new slice without changing the source backing array
func main() {
	source := []int{1, 2, 3, 4, 5, 6}
	slice := source[2:3:3]    // length 1, capacity 1: capacity == length
	slice = append(slice, 10) // capacity is full, so append allocates a new backing array
	fmt.Println(source)
}

the result:
[1 2 3 4 5 6]
notice the detach operation: source is printed unchanged; without the third index, append would have overwritten source[3]
but what is the meaning of the capacity here? for source[low:high:max] the capacity is max - low, so source[2:3:4] has length 1 and capacity 2
passing slices to functions
since only the slice header (pointer to the backing array, length, capacity) is passed, this is very efficient: whether the backing array holds ten elements or one million, only 24 bytes are copied into the function
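A short example of the consequence: element writes inside the callee are visible to the caller, because the copied header still points at the same backing array:

package main

import "fmt"

// double receives a copy of the 24-byte slice header; the pointer inside
// still refers to the caller's backing array.
func double(nums []int) {
	for i := range nums {
		nums[i] *= 2
	}
}

func main() {
	s := []int{1, 2, 3}
	double(s)
	fmt.Println(s) // [2 4 6]: the writes went through the shared backing array
}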
RFC: Apache Beam Go SDK design
Apache Beam is an advanced, unified programming model for implementing batch and streaming data-processing jobs that run on any execution engine.
weak points of Go for this model:
- no generics
- no function or method overloading
- no inheritance
- limited reflection and serialization support
- no annotation support
strong points of Go:
- first-class functions
- full type reflection
- multiple return values
- and more
key design points
- natively-typed DoFns and other user functions
- weakly-typed PTransforms that capture arity natively
- static type checking at pipeline construction time
- KV is implicit: we use multiple arguments and return values to represent unfolded KVs for DoFns
- side input forms
- simulated generic types: we achieve some of the effect of generics by introducing special 'universal' types T, U, ... X, Y, Z over interface{}
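A sketch of how these points look in user code (the import path is the current Beam Go SDK's; the exact API here may differ from the RFC draft):

package main

import (
	"strings"

	"github.com/apache/beam/sdks/v2/go/pkg/beam"
)

// A DoFn is a plain, natively-typed Go function: element types come from
// the signature, and KV<string,int> is unfolded into two return values.
func splitFn(line string, emit func(string)) {
	for _, w := range strings.Fields(line) {
		emit(w)
	}
}

func pairFn(word string) (string, int) {
	return word, 1
}

func buildPipeline() *beam.Pipeline {
	p := beam.NewPipeline()
	s := p.Root()
	lines := beam.Create(s, "a b", "b c")  // PCollection<string>
	words := beam.ParDo(s, splitFn, lines) // PCollection<string>
	beam.ParDo(s, pairFn, words)           // PCollection<KV<string,int>>
	return p // type checking happens as the graph is constructed
}

func main() { _ = buildPipeline() }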
error handling
examples
model representation
- Pipeline
- Runner
- PCollection
- Coder
- DoFn and other user functions
Transforms
- Impulse
- Create
- ParDo family
- GroupByKey
- Flatten
- Combine
- Partition