diff --git a/README.md b/README.md
index c207592..b159e57 100644
--- a/README.md
+++ b/README.md
@@ -172,6 +172,7 @@ If you aren't able to access any paper on this list, please [try using Sci-Hub](
 - [Theory of Mind for Multi-agent Coordination in Hanabi](http://fse.studenttheses.ub.rug.nl/id/eprint/28327) (thesis)
 - [The Hanabi challenge: From Artificial Teams to Mixed Human-Machine Teams](http://oru.diva-portal.org/smash/record.jsf?pid=diva2%3A1691114&dswid=-1981) (thesis)
 - [A Graphical User Interface For The Hanabi Challenge Benchmark](http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-94615) (thesis)
+- [Analysis of Symmetry and Conventions in Off-Belief Learning (OBL) in Hanabi](https://fanpu.io/blog/2022/symmetry-and-conventions-in-obl-hanabi/) (blogPost)
 
 # Hearthstone
 - [Mapping Hearthstone Deck Spaces through MAP-Elites with Sliding Boundaries](http://arxiv.org/abs/1904.10656) (journalArticle)
diff --git a/boardgame-research.rdf b/boardgame-research.rdf
index 10a6553..ccb92ae 100644
--- a/boardgame-research.rdf
+++ b/boardgame-research.rdf
@@ -9289,6 +9289,45 @@ guaranteed decent high score. The algorithm got a lowest score of 79 and a
 1
 text/html
+
+ blogPost
+
+
+
+
+
+
+
+ William Zhang
+
+
+
+
+ Fan Pu Zeng
+
+
+
+
+
+ Analysis of Symmetry and Conventions in Off-Belief Learning (OBL) in Hanabi
+ We investigate if policies learnt by agents using the Off-Belief Learning (OBL) algorithm in the multi-player cooperative game Hanabi in the zero-shot coordination (ZSC) context are invariant across symmetries of the game, and if any conventions formed during training are arbitrary or natural. We do this by a convention analysis on the action matrix of what the agent does, introduce a novel technique called the Intervention Analysis to estimate if the actions taken by the policies learnt are equivalent between isomorphisms of the same game state, and finally evaluate if our observed results also hold in a simplified version of Hanabi which we call Mini-Hanabi.
+
+
+ https://fanpu.io/blog/2022/symmetry-and-conventions-in-obl-hanabi/
+
+
+
+
+ attachment
+ Analysis_Of_Symmetry_And_Conventions_In_Off_Belief_Learning_In_Hanabi.pdf
+
+
+ https://fanpu.io/assets/research/Analysis_Of_Symmetry_And_Conventions_In_Off_Belief_Learning_In_Hanabi.pdf
+
+
+ 2023-03-01 11:40:51
+ 3
+ 2048
@@ -9402,6 +9441,7 @@ guaranteed decent high score. The algorithm got a lowest score of 79 and a
+ Hearthstone