#2 README #29

Merged
merged 3 commits into from
Jan 17, 2022
6 changes: 3 additions & 3 deletions README.ja.md
@@ -1,6 +1,6 @@
# gcf

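gcf is a library that provides various collection operations using Generics.
gcf (Go Collection Framework) is a library that provides various collection operations using Generics.
By performing operations on collections through a common interface, operations can be composed easily.

## Motivation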
@@ -50,7 +50,7 @@ r := itb.ToSlice(itb)

- Go 1.18 RC (or Beta)

Since Generics are used, version 1.18 is required, even though it is currently in Beta.
Since gcf uses Generics, version 1.18 is required to use it, even though that version is currently in Beta.
A container environment set up for use with vscode is also available if you do not want to install Go 1.18 in your local environment.
(See under [.devcontainer](https://github.com/meian/gcf/tree/main/.devcontainer))

@@ -73,7 +73,7 @@ gcf is designed to chain processing using the `Iterator` pattern

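Each gcf function has the `Iterable[T]` interface, which only has the ability to generate an `Iterator[T]` via `Iterator()`.
`Iterator[T]` moves to the next element obtainable from the collection with `MoveNext()` and gets the element at the current position with `Current()`.
Operations are composed through `Iterable[T]`, and state is held only in the `Iterator[T]` generated from it, with the aim of making the generated operations easier to reuse
Operations are composed through `Iterable[T]`, and state is held only in the `Iterator[T]` generated from it, which makes the generated operations easy to reuse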

### MoveNext + Current

139 changes: 132 additions & 7 deletions README.md
@@ -1,19 +1,144 @@
# gcf

gcf is a collection framework that supports generics on golang.
gcf (Go Collection Framework) is a library that provides various collection operations using Generics.
By operating on collections through a common interface, you can easily compose operations.

# Concept
## Motivation

TODO: write about iterator and so on.
I wanted functions that make it easy to compose and use processing in Go, instead of writing it with basic syntax such as `for` and `if`.

# Installation
Until now, it was difficult to provide the same processing for multiple types through a single interface. Generics support in Go 1.18 made this much easier to implement, so gcf was built as a library.

Install gcf with `go get` on your go.mod managed directory.
## Example

Take as an example the process of extracting only the odd numbers from the elements of a slice and returning each of them multiplied by 3.

```golang
func Odd3(s []int) []int {
	var r []int
	for _, v := range s {
		// Skip even numbers
		if v%2 == 0 {
			continue
		}
		// Keep the odd number multiplied by 3
		r = append(r, v*3)
	}
	return r
}
```

With gcf, the same processing can be written as follows.

```golang
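// var s []int

itb := gcf.FromSlice(s)
// Keep only odd numbers (v%2 != 0 also matches negative odd numbers)
itb = gcf.Filter(itb, func(v int) bool {
	return v%2 != 0
})
// Multiply each remaining number by 3
itb = gcf.Map(itb, func(v int) int {
	return v * 3
})

// Get the processing result as a slice
r := gcf.ToSlice(itb)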
```

This example is only meant to briefly show how to use gcf.
Replacing inline processing with gcf significantly reduces performance, so rewriting processing that is easy to implement and maintain inline is not recommended.

## Environment

- Go 1.18 RC (or Beta)

gcf uses Generics, so Go 1.18 is required, even though that version is currently in Beta.
A container environment for vscode is also available if you do not want to install Go 1.18 in your local environment.
(See [.devcontainer](https://github.com/meian/gcf/tree/main/.devcontainer))

## Installation

Install by running `go get` in a directory managed by a Go module.

```bash
go get -d github.com/meian/gcf
```

# Performance
## Design

### Implemented with Iterator

gcf is designed to compose processing with the `Iterator` pattern.
Some processes may allocate memory internally, but most avoid unnecessary memory allocations during processing.

### Iterable + Iterator

Each function returns `Iterable[T]`, which only has the ability to generate an `Iterator[T]` via `Iterator()`.
`Iterator[T]` moves to the next element with `MoveNext()` and gets the current element with `Current()`.
Functions are composed through `Iterable[T]`, and state is kept only in the `Iterator[T]`, which makes the resulting composition easy to reuse.
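
The split between a stateless `Iterable[T]` and a stateful `Iterator[T]` can be sketched as follows. This is a minimal, self-contained illustration and not gcf's actual source: the interface shapes are assumed from the description above, the `FromSlice`, `Filter`, and `ToSlice` functions are re-implemented locally, and the `sliceIterable`, `sliceIterator`, `filterIterable`, and `filterIterator` types are hypothetical names.

```go
package main

import "fmt"

// Iterator holds the iteration state (shape assumed from the text above).
type Iterator[T any] interface {
	MoveNext() bool
	Current() T
}

// Iterable holds no iteration state; it only creates a fresh Iterator.
type Iterable[T any] interface {
	Iterator() Iterator[T]
}

// sliceIterable is a hypothetical slice-backed Iterable.
type sliceIterable[T any] struct{ s []T }

func (itb sliceIterable[T]) Iterator() Iterator[T] {
	return &sliceIterator[T]{s: itb.s, i: -1}
}

// sliceIterator keeps the position; i is -1 before the first MoveNext.
type sliceIterator[T any] struct {
	s []T
	i int
}

func (it *sliceIterator[T]) MoveNext() bool {
	if it.i+1 >= len(it.s) {
		it.i = len(it.s) // park past the end
		return false
	}
	it.i++
	return true
}

func (it *sliceIterator[T]) Current() T {
	var zero T
	if it.i < 0 || it.i >= len(it.s) {
		return zero
	}
	return it.s[it.i]
}

// filterIterable wraps another Iterable and stores only the predicate.
type filterIterable[T any] struct {
	src  Iterable[T]
	pred func(T) bool
}

func (itb filterIterable[T]) Iterator() Iterator[T] {
	return &filterIterator[T]{src: itb.src.Iterator(), pred: itb.pred}
}

type filterIterator[T any] struct {
	src  Iterator[T]
	pred func(T) bool
}

// MoveNext advances the source until an element satisfies the predicate.
func (it *filterIterator[T]) MoveNext() bool {
	for it.src.MoveNext() {
		if it.pred(it.src.Current()) {
			return true
		}
	}
	return false
}

func (it *filterIterator[T]) Current() T { return it.src.Current() }

func FromSlice[T any](s []T) Iterable[T] { return sliceIterable[T]{s: s} }

func Filter[T any](itb Iterable[T], pred func(T) bool) Iterable[T] {
	return filterIterable[T]{src: itb, pred: pred}
}

// ToSlice drains a new Iterator into a slice.
func ToSlice[T any](itb Iterable[T]) []T {
	var r []T
	for it := itb.Iterator(); it.MoveNext(); {
		r = append(r, it.Current())
	}
	return r
}

func main() {
	odd := Filter(FromSlice([]int{1, 2, 3, 4, 5}), func(v int) bool { return v%2 != 0 })
	// The same composition is consumed twice; each call gets its own Iterator.
	fmt.Println(ToSlice(odd), ToSlice(odd))
}
```

Because `odd` stores only the source and the predicate, consuming it a second time starts from a fresh `Iterator` and yields the same elements again.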

### MoveNext + Current

Iterator implementations often provide a pair of functions: `HasNext()` to check whether a next element exists, and `Next()` to move to the next element and return it.
In gcf, `MoveNext()` moves to the next element and returns whether the move succeeded, and `Current()` returns the current element.
We chose this because being able to get the current value multiple times without changing state seemed more useful than being able to check for the next element without changing state.
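
The difference between the two styles can be illustrated with a small sketch. Both iterator types below (`moveNextIter` and `hasNextIter`) are hypothetical and fixed to `int` for brevity; they are not part of gcf and only show which call changes state in each style.

```go
package main

import "fmt"

// moveNextIter follows the MoveNext + Current style (hypothetical type).
type moveNextIter struct {
	s []int
	i int // index of the current element; -1 before the first MoveNext
}

func newMoveNextIter(s []int) *moveNextIter { return &moveNextIter{s: s, i: -1} }

// MoveNext advances the position and reports whether an element is available.
func (it *moveNextIter) MoveNext() bool {
	if it.i+1 >= len(it.s) {
		return false
	}
	it.i++
	return true
}

// Current returns the element at the current position without changing state.
func (it *moveNextIter) Current() int {
	if it.i < 0 {
		return 0 // no element yet; this sketch returns the zero value
	}
	return it.s[it.i]
}

// hasNextIter follows the HasNext + Next style (hypothetical type).
type hasNextIter struct {
	s []int
	i int // index of the element that Next will return
}

// HasNext checks for a next element without changing state.
func (it *hasNextIter) HasNext() bool { return it.i < len(it.s) }

// Next returns the next element and advances, so each call yields a different value.
func (it *hasNextIter) Next() int {
	v := it.s[it.i]
	it.i++
	return v
}

func main() {
	a := newMoveNextIter([]int{10, 20})
	for a.MoveNext() {
		fmt.Println(a.Current(), a.Current()) // same value both times
	}
	b := &hasNextIter{s: []int{10, 20}}
	for b.HasNext() {
		fmt.Println(b.Next()) // reading the value also advances
	}
}
```

With `MoveNext()` + `Current()`, several consumers can read the current value between moves; with `HasNext()` + `Next()`, a caller has to store the value itself if it is needed more than once.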

### Top-level functions

Collection libraries in other languages often define operations as methods so that they can be chained, but gcf implements them as top-level functions.
This is because Go generics do not allow type parameters to be declared at the method level, so some functions cannot be provided as methods, and providing only some functions as methods would make the API inconsistent.
If a future version of Go allows type parameters at the method level, we will consider providing method-chain functions.
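
The restriction can be seen in a small sketch. `Seq` is a hypothetical container used only for this illustration (it is not a gcf type): a filter can be written as a method because it reuses the receiver's type parameter, while a map that changes the element type needs a new type parameter and therefore has to be a top-level function in Go 1.18.

```go
package main

import (
	"fmt"
	"strconv"
)

// Seq is a hypothetical generic collection used only for this illustration.
type Seq[T any] struct{ items []T }

// Filter works as a method because it needs no type parameter other than T.
func (s Seq[T]) Filter(pred func(T) bool) Seq[T] {
	var out []T
	for _, v := range s.items {
		if pred(v) {
			out = append(out, v)
		}
	}
	return Seq[T]{items: out}
}

// A Map method would need its own type parameter R for the result type,
// which Go 1.18 does not allow, so a declaration like this does not compile:
//
//	func (s Seq[T]) Map[R any](f func(T) R) Seq[R]

// Map as a top-level function can declare both T and R.
func Map[T, R any](s Seq[T], f func(T) R) Seq[R] {
	out := make([]R, 0, len(s.items))
	for _, v := range s.items {
		out = append(out, f(v))
	}
	return Seq[R]{items: out}
}

func main() {
	s := Seq[int]{items: []int{1, 2, 3}}
	odd := s.Filter(func(v int) bool { return v%2 != 0 })
	strs := Map(odd, strconv.Itoa) // T=int and R=string are inferred
	fmt.Println(strs.items)
}
```

Mixing the two forms, as this sketch does, is exactly the inconsistency described above, which is why providing every operation as a top-level function keeps the API uniform.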

## Performance

The performance of gcf has the following characteristics.

- Processing time is proportional to the number of elements in the collection and the number of operations combined.
- Overwhelmingly slower than inline processing (about 70 times)
- About 4 times slower than function calls without allocation
- Overwhelmingly faster than channel processing (about 60 times)

Because the library processes elements by repeated iteration, it is not recommended for processing where speed is critical.

Please refer to the [Benchmark README](bench/README.md) for details.

## Function

### Implemented

The following functions are implemented.
See function comments for feature details.
Some functions do not have comments yet; they will be added in the future.

- `FromSlice`
  - Also implemented as an immutable version, `FromSliceImmutable`.
- `Filter`
- `Map`
- `Concat`
- `Repeat`
- `RepeatIterable`

### To Be

TODO: write about benchmarks.
- `Range`
  - Specify from, to, step and return the values in order
  - Only numeric types will be provided
- `Reverse`
  - Returns elements in reverse order
- `Distinct`
  - Returns unique elements
- `FlatMap`
  - Maps and returns a flattened iterator
- `Sort`
  - Sorts collections
- `OrderBy`
  - Sorts collections by specified criteria
- `Take`
  - Returns only the specified number of elements from the beginning
- `Last`
  - Returns only the specified number of elements from the end
- `Skip`
  - Excludes the specified number of elements from the beginning
- `SkipLast`
  - Excludes the specified number of elements from the end
- channel function
  - Create Iterable from channel
  - Get the result of Iterable on channel
42 changes: 42 additions & 0 deletions bench/README.md
@@ -0,0 +1,42 @@
# Benchmark

To evaluate the performance of this library, we measured the following benchmarks for `gcf.Filter`.

## [`BenchmarkFilter_Volumes`](filter_test.go#L10)

By varying the number of elements and the number of filters, this benchmark measures how processing time changes with the amount of data and the number of operations.
See [filter-volumes.txt](filter-volumes.txt) for the results.

The measurements confirmed that processing time is roughly proportional to the amount of data and the number of operations.

## [`BenchmarkFilter_Compare`](filter_test.go#L38)

This benchmark runs the same logic with different implementation methods to measure the speed of `gcf.Filter` relative to the other implementations.
The logic is as follows.

- The Iterable source is the numbers from 1 to 100
- Find the elements whose remainders when divided by 13, 11, and 7 are all non-zero

The benchmark targets are as follows.

- filter
  - Calculates each remainder with `gcf.Filter`
  - Allocation occurs only when the Filter is generated
- if-func
  - Evaluates each remainder in an `if` condition through an external function
  - No allocation occurs
- if-inline
  - Evaluates each remainder directly in an `if` condition
  - No allocation occurs
- chan
  - Calculates each remainder over channels
  - Allocation occurs only for the channels and possibly goroutines
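
To make the non-gcf targets concrete, the following sketch shows one way the `if-inline`, `if-func`, and `chan` variants could be written for the logic above. The actual benchmark code lives in `filter_test.go` and is not reproduced here, so the function names (`countInline`, `countFunc`, `countChan`, `notMultipleOf`) and the choice to count matching elements are assumptions made for illustration.

```go
package main

import "fmt"

// notMultipleOf is a hypothetical external function for the if-func variant.
func notMultipleOf(v, n int) bool { return v%n != 0 }

// countInline evaluates each remainder directly in the if condition.
func countInline(n int) int {
	c := 0
	for v := 1; v <= n; v++ {
		if v%13 != 0 && v%11 != 0 && v%7 != 0 {
			c++
		}
	}
	return c
}

// countFunc evaluates each remainder through an external function.
func countFunc(n int) int {
	c := 0
	for v := 1; v <= n; v++ {
		if notMultipleOf(v, 13) && notMultipleOf(v, 11) && notMultipleOf(v, 7) {
			c++
		}
	}
	return c
}

// countChan runs one goroutine per filter stage, connected by unbuffered channels.
func countChan(n int) int {
	stage := func(in <-chan int, d int) <-chan int {
		out := make(chan int)
		go func() {
			defer close(out)
			for v := range in {
				if v%d != 0 {
					out <- v
				}
			}
		}()
		return out
	}
	src := make(chan int)
	go func() {
		defer close(src)
		for v := 1; v <= n; v++ {
			src <- v
		}
	}()
	c := 0
	for range stage(stage(stage(src, 13), 11), 7) {
		c++
	}
	return c
}

func main() {
	// All three variants apply the same logic to the numbers 1 to 100.
	fmt.Println(countInline(100), countFunc(100), countChan(100))
}
```

All three functions compute the same result; they differ only in how each remainder check is dispatched, which is the cost the comparison is meant to isolate.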

Please refer to [filter-compare-8.txt](filter-compare-8.txt) for results.

The measurements show that `gcf.Filter` is far too slow to be compared with inline evaluation, and takes about 4 times as long as evaluation through a function.
On the other hand, its processing time is overwhelmingly shorter than when using channels.

The results when only 1 CPU core is used are shown in [filter-compare-1.txt](filter-compare-1.txt).
They confirm that `gcf.Filter` is almost unaffected by the number of CPU cores.
(I'm not sure why the channel implementation takes longer with more CPU cores.)