Go data structures: slices
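diff between C arrays and Go arrays
- in C, an array name decays to a pointer to its first element in most expressions; in Go, an array is a value and its name denotes the whole array
- C arrays are passed to functions as a pointer, but Go passes arrays by value (the whole array is copied)
- in C, an array cannot be copied with ar1 = ar2 (if ar1 and ar2 are pointers, the assignment copies only the pointer); in Go, ar1 = ar2 copies every element
- a heap-allocated C array must be freed explicitly; Go is garbage collected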
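A minimal sketch of the Go side of this comparison. The function names zeroFirst and zeroFirstPtr are made up for illustration; the point is only that assignment and parameter passing copy a Go array unless a pointer is used explicitly.

```go
package main

import "fmt"

// zeroFirst receives a copy of the array, so the caller's array is untouched.
func zeroFirst(a [3]int) [3]int {
	a[0] = 0
	return a
}

// zeroFirstPtr receives a pointer, so it modifies the caller's array.
func zeroFirstPtr(a *[3]int) {
	a[0] = 0
}

func main() {
	ar1 := [3]int{1, 2, 3}
	ar2 := ar1 // copies all three elements
	ar2[1] = 20
	fmt.Println(ar1, ar2) // [1 2 3] [1 20 3]

	changed := zeroFirst(ar1)
	fmt.Println(ar1, changed) // [1 2 3] [0 2 3]

	zeroFirstPtr(&ar1)
	fmt.Println(ar1) // [0 2 3]
}
```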
diff between go array and slice
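- a Go array has a fixed length that is part of its type; a Go slice is a 3-word data structure <pointer, length, capacity> that describes a section of a backing array
a slice is a view over a contiguous block of memory (its backing array)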
nil slice: var slice []int
empty slice: slice := make([]int, 0) or slice := []int{}
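A small sketch of how the nil and empty forms differ in practice. The helper describe is hypothetical and only formats the three properties of interest.

```go
package main

import "fmt"

// describe reports whether s is nil, plus its length and capacity.
func describe(s []int) string {
	return fmt.Sprintf("nil=%t len=%d cap=%d", s == nil, len(s), cap(s))
}

func main() {
	var nilSlice []int          // nil slice: no backing array
	emptyMake := make([]int, 0) // empty but non-nil
	emptyLit := []int{}         // empty but non-nil

	fmt.Println(describe(nilSlice))  // nil=true len=0 cap=0
	fmt.Println(describe(emptyMake)) // nil=false len=0 cap=0
	fmt.Println(describe(emptyLit))  // nil=false len=0 cap=0

	// append and len work on a nil slice just as on an empty one.
	nilSlice = append(nilSlice, 1)
	fmt.Println(describe(nilSlice))
}
```

All three forms can be ranged over and appended to; the difference mostly shows up in comparisons with nil.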
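growing a slice -> like a Java ArrayList, a new backing array is allocated when the capacity runs out. is there a fixed growth factor? no, the growth strategy is an implementation detail of the runtime and is not guaranteed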
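growing a slice
- the append function takes a source slice, appends values to it, and returns the resulting slice, which may or may not share the source's backing array
- append increases the length by the number of values appended; the capacity increases only when the existing capacity is too small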
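The two points above can be observed with a short sketch. appendOne is a hypothetical helper, and the exact capacity chosen after a reallocation depends on the Go runtime, so the code only reports whether the capacity changed.

```go
package main

import "fmt"

// appendOne appends v to s and reports whether the capacity changed,
// which happens only when append had to allocate a new backing array.
func appendOne(s []int, v int) (out []int, grew bool) {
	out = append(s, v)
	return out, cap(out) != cap(s)
}

func main() {
	s := make([]int, 0, 2) // length 0, capacity 2
	for i := 1; i <= 4; i++ {
		var grew bool
		s, grew = appendOne(s, i)
		// The length goes up by one each time; grew is true only on the
		// third append, when the initial capacity of 2 is exhausted.
		fmt.Printf("len=%d cap=%d grew=%t\n", len(s), cap(s), grew)
	}
}
```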
slice append - third index
- the third index of a slice expression restricts the capacity
- slice := source[2:3:4] (length 3-2 = 1, capacity 4-2 = 2)
- by setting capacity == length (for example source[2:3:3]), the next append to the new slice is forced to allocate its own backing array, which detaches it from the source backing array
- this technique is used when we want to append to the new slice without overwriting elements of the source backing array; writes to existing elements made before that append still go through to the source
func main() {
the result:
[1 2 3 4 5 6]
notice the detach operation
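but what does the capacity mean? it is the number of elements between the slice's first element and the end of its backing array, i.e. how far the slice can grow before append must allocate a new array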
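A sketch of the three-index technique, assuming a five-element source slice. appendAfter is a hypothetical helper that slices with an explicit capacity limit and then appends one value.

```go
package main

import "fmt"

// appendAfter appends v to source[low:high:limit] and returns the result.
// When limit == high the sub-slice is full, so append must allocate.
func appendAfter(source []int, low, high, limit, v int) []int {
	return append(source[low:high:limit], v)
}

func main() {
	source := []int{1, 2, 3, 4, 5}

	// len 1, cap 1: append allocates a new backing array, source is untouched.
	own := appendAfter(source, 2, 3, 3, 40)
	fmt.Println(source, own) // [1 2 3 4 5] [3 40]

	// len 1, cap 3: append reuses the source array and overwrites source[3].
	shared := appendAfter(source, 2, 3, 5, 40)
	fmt.Println(source, shared) // [1 2 3 40 5] [3 40]
}
```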
passing slices to functions
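a slice is passed by value, but the value is only the 3-word header <pointer, length, capacity>, so this is very efficient. whether the backing array holds 10 elements or one million, only 24 bytes (on a 64-bit platform) are passed to the function. the callee shares the backing array, so it can modify elements, but changes to its copy of the length and capacity are not seen by the caller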
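A short sketch of what passing a slice to a function does and does not share. headerSize, double and grow are hypothetical helpers; the 24-byte figure assumes a 64-bit platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSize returns the size of a slice value itself, not of its elements.
func headerSize() uintptr {
	var s []int
	return unsafe.Sizeof(s)
}

// double modifies elements in place through the shared backing array.
func double(s []int) {
	for i := range s {
		s[i] *= 2
	}
}

// grow appends to its own copy of the header and returns the new length;
// the caller's slice keeps its original length.
func grow(s []int) int {
	s = append(s, 99)
	return len(s)
}

func main() {
	fmt.Println(headerSize()) // 24 on 64-bit platforms, regardless of slice length

	s := []int{1, 2, 3}
	double(s)
	fmt.Println(s) // [2 4 6]

	fmt.Println(grow(s), len(s)) // 4 3
}
```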
RFC: Apache Beam Go SDK design
Apache Beam is an advanced unified programming model for implementing batch and streaming data processing jobs that run on any execution engine
weak points of Go for this design:
- no generics (true when the RFC was written; Go gained type parameters in 1.18)
- no function or method overloading
- no inheritance
- limited reflection and serialization support
- no annotation support
strong points of Go:
- first-class functions
- full type reflection
- multiple return values
- and more
key design points
natively-typed DoFns and other user functions
weakly-typed PTransforms that capture arity natively
static type checking at pipeline construction time
- KV is implicit. we use multiple arguments and return tuples to represent unfolded KV for DoFns
- side input forms.
- simulated generic types. we achieve some of the effect of generics by introducing special ‘universal’ types T,U,… X,Y,Z over interface{}
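To make these design points concrete, here is a plain-Go sketch that does not use the Beam SDK. Every name in it (checkChain, wordLen, formatKV, swapAny) is hypothetical. It assumes only the standard reflect package and shows how ordinary typed functions, with a KV pair unfolded into two return values or two arguments, could be checked against each other before anything runs. interface{} stands in for the 'universal' types.

```go
package main

import (
	"fmt"
	"reflect"
)

// checkChain is a hypothetical helper, not a Beam API. It returns an error
// unless each return type of producer is assignable to the parameter of
// consumer at the same position.
func checkChain(producer, consumer interface{}) error {
	p, c := reflect.TypeOf(producer), reflect.TypeOf(consumer)
	if p == nil || c == nil || p.Kind() != reflect.Func || c.Kind() != reflect.Func {
		return fmt.Errorf("both arguments must be functions")
	}
	if p.NumOut() != c.NumIn() {
		return fmt.Errorf("arity mismatch: %d outputs vs %d inputs", p.NumOut(), c.NumIn())
	}
	for i := 0; i < p.NumOut(); i++ {
		if !p.Out(i).AssignableTo(c.In(i)) {
			return fmt.Errorf("position %d: %v is not assignable to %v", i, p.Out(i), c.In(i))
		}
	}
	return nil
}

// wordLen emits an unfolded KV pair (word, length) as two return values.
func wordLen(word string) (string, int) { return word, len(word) }

// formatKV consumes the pair as two arguments.
func formatKV(word string, n int) string { return fmt.Sprintf("%s:%d", word, n) }

// swapAny uses interface{} in the role of a universal type: it accepts any pair.
func swapAny(k, v interface{}) (interface{}, interface{}) { return v, k }

func main() {
	fmt.Println(checkChain(wordLen, formatKV))  // <nil>
	fmt.Println(checkChain(wordLen, swapAny))   // <nil>
	fmt.Println(checkChain(formatKV, formatKV)) // arity mismatch: 1 outputs vs 2 inputs
}
```

The check runs when the functions are wired together, not when data flows through them, which is the sense in which type errors can surface at construction time.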
error handling
examples
model representation
- Pipeline
- Runner
- PCollection
- Coder
- DoFn and other user functions
Transforms
- Impulse
- Create
- ParDo family
- GroupByKey
- Flatten
- Combine
- Partition