LSTM support #28

deveshjawla · 2020-11-30T00:54:14Z

Hi Jonathan,
While trying to understand the core of AlphaZero.jl, I ran into a question about the input to the neural network. Looking at src/learning.jl, I believe the network receives a batch of inputs, but I couldn't figure out what exactly that input is, specifically the data=(W, X, A, P, V) tuple used as training input. Could you tell me what each component is?

jonathan-laurent · 2020-11-30T12:02:38Z

You can look at the following comment in src/learning.jl:

# A samples collection is represented on the learning side as a (W, X, A, P, V)
# tuple. Each component is a `Float32` tensor whose last dimension corresponds
# to the sample index. Writing `n` the number of samples and `a` the total
# number of actions:
# - W (size 1×n) contains the samples weights
# - X (size …×n) contains the board representations
# - A (size a×n) contains the action masks (values are either 0 or 1)
# - P (size a×n) contains the recorded MCTS policies
# - V (size 1×n) contains the recorded values
# Note that the weight of a sample is computed as an increasing
# function of its `n` field.
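
To make the shapes in that comment concrete, here is a minimal sketch in plain Julia that builds a dummy (W, X, A, P, V) tuple. The sizes `n`, `a`, and `board_dims`, the random contents, and the value range are assumptions chosen only for illustration; the real board encoding is game-specific and is not given in the comment.

```julia
# Hypothetical sizes, chosen only for illustration.
n = 4                    # number of samples in the batch
a = 9                    # total number of actions (e.g. one per cell of a 3×3 board)
board_dims = (3, 3, 3)   # assumed board encoding; the real shape depends on the game

W = ones(Float32, 1, n)                    # sample weights (here all equal)
X = rand(Float32, board_dims..., n)        # board representations
A = Float32.(rand(Bool, a, n))             # action masks, entries are 0 or 1
P = rand(Float32, a, n) .* A               # policy mass only on unmasked actions
P ./= max.(sum(P, dims=1), eps(Float32))   # normalize each column, guarding against all-zero masks
V = rand(Float32, 1, n) .* 2f0 .- 1f0      # dummy values, assumed to lie in [-1, 1]

data = (W, X, A, P, V)

# The last dimension of every component indexes the sample.
@assert all(size(t)[end] == n for t in data)
```

Each component is a `Float32` array, and slicing any of them along the last dimension gives the data for a single sample.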

Also, for more on sample weights, see https://jonathan-laurent.github.io/AlphaZero.jl/dev/reference/params/#AlphaZero.SamplesWeighingPolicy.

deveshjawla · 2020-12-02T01:20:43Z

Hey man, thanks for the input. I had already read this, but I had somehow forgotten that Flux can take a tuple consisting of input and output data for the neural network. Your answer definitely helped me take a closer look. :D
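
The tuple-as-batch idea can be sketched in plain Julia without Flux: each batch is a tuple that gets splatted into the loss function's arguments. Everything below is hypothetical, including the `predict` stand-in for a network and the `weighted_value_loss` function; it is not AlphaZero.jl's actual loss, only an illustration of how a (W, X, A, P, V) tuple can be consumed.

```julia
# Dummy batch with assumed sizes (n samples, a actions, 3×3×3 board encoding).
n, a = 4, 9
W = ones(Float32, 1, n)           # sample weights
X = zeros(Float32, 3, 3, 3, n)    # board representations
A = ones(Float32, a, n)           # action masks (all actions allowed)
P = fill(1f0 / a, a, n)           # uniform target policies
V = ones(Float32, 1, n)           # target values

# Hypothetical stand-in for a network: returns a (policy, value) pair for a batch.
predict(X) = (fill(1f0 / a, a, size(X)[end]), zeros(Float32, 1, size(X)[end]))

# Hypothetical loss: weighted mean squared error on the value head only.
function weighted_value_loss(W, X, A, P, V)
    _, Vhat = predict(X)
    return sum(W .* (Vhat .- V) .^ 2) / sum(W)
end

# A dataset is a collection of tuples; each tuple is splatted into the loss.
batches = [(W, X, A, P, V)]
losses = [weighted_value_loss(batch...) for batch in batches]
```

Because the loss receives the components positionally, the order of the tuple has to match the order of the loss function's parameters.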

deveshjawla closed this as completed Dec 2, 2020
