- 38

## Learning Opportunities

This puzzle can be solved using the following concepts. Practice using these concepts and improve your skills.

## Statement

## Goal

Assume we are dealing with (nondeterministic) one-player game. To find an optimal sequence of movements we could use Monte Carlo Tree Search algorithm (https://en.wikipedia.org/wiki/Monte_Carlo_tree_search).

Thus, we perform a number of so-called playouts, and gradually build an MCTS tree that will help us choosing statistically best choices for each turn of the game. A playout is a sequence of moves reaching the game tree leaf, so it has assigned a true score. It consists of two parts: the beginning, which is selected by the algorithm using the UCT formula; and the remaining part which is usually a random sequence of movements.

In this puzzle, we are given a list of playouts (encoded as words, where each letter is a single move) with assigned scores, that should be used to build an MCTS tree. After building a tree, the task is to return the sequence of moves, reaching the MCTS tree leaf, that will be chosen using UCT policy given exploration/exploitation constant

For given node

`N`visited

`N.v`times, according to the UCB1 formula we should choose a child

`M`that maximizes the value given by:

`M.s/M.v + C*sqrt(ln(N.v)/M.v)`, where

`M.v`is number of visits in node

`M`and

`M.s`is sum of scores obtained for this node (so the first component of the sum is average score for node

`M`).

**Final remarks:**

- Note that this puzzle differs form the real-life-scenario where the playouts are not given, but they are also computed using UCT+random policies.

- In standard implementations you are forced to choose an unexplored move if such exists. Here we assume that after reading the playout data we do not have such moves in the non-leaf nodes of the MCTS tree.

- A tie-breaking rules when comparing UCT values is the ordering on letters (i.e. smaller letter should be chosen).

**Example explanation:**

- Reading

- Reading

- Finally, reading

- Choosing move from the root based on the UCB1 formula will favor move

- As there are no further nodes in MCTS tree along that paths, the 1-move sequence

Input

**Line 1:**2 space-separated values:

an integer

`N`- the number of performed playouts

a real number

`C`- the constant C (the exploration parameter)

**Next**Sequence of movements performed in this playout, followed by a space, followed by the playout's result

`N`lines:Output

Sequence of movements that will be chosen in the MCTS tree using UCB1 selection.

Constraints

0 <

0 < playout length < 50

1 < branching factor < 10

-100.1 < score (playout's result) < 100.1

`N`< 5000 < playout length < 50

1 < branching factor < 10

-100.1 < score (playout's result) < 100.1

Example

Input

3 0.1 baa 30 ab 20 bbb 4

Output

a

A higher resolution is required to access the IDE

Join the CodinGame community on Discord to chat about puzzle contributions, challenges, streams, blog articles - all that good stuff!

JOIN US ON DISCORD Online Participants