Can intelligent agents learn to cooperate? We’ll look at this question in two different settings, which interest us for different reasons. The first is Dec-POMDPs (decentralized POMDPs), where the agents share a common objective but cooperation is still nontrivial because of incomplete information; cooperative inverse reinforcement learning (CIRL) is a prime example. Dec-POMDPs often admit a trivial solution in which the agents agree in advance on a communication scheme, but such pre-coordination isn’t available for human-machine cooperation. We’ll look at a proposed solution, zero-shot coordination, in the context of Hanabi.
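
For concreteness, here is a minimal sketch of the Dec-POMDP formalism, using the standard notation from the literature (this is not taken from the slides):

\[
\mathcal{M} = \langle I,\ S,\ \{A_i\}_{i \in I},\ T,\ R,\ \{\Omega_i\}_{i \in I},\ O,\ \gamma \rangle
\]

where \(I\) is the set of agents, \(S\) the state space, \(A_i\) agent \(i\)'s action space, \(T(s' \mid s, a)\) the transition distribution under the joint action \(a = (a_1, \dots, a_n)\), \(R(s, a)\) a single reward shared by all agents, \(\Omega_i\) agent \(i\)'s observation space with joint observation distribution \(O(o \mid s', a)\), and \(\gamma\) the discount factor. Each agent must act on its own action-observation history rather than the full state, which is why a shared reward alone does not make coordination easy.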

Next we’ll look at cooperation between agents with incompatible preferences. The most obvious way cooperation can arise in this setting is through reciprocity; we’ll see this in the context of LOLA (Learning with Opponent-Learning Awareness).
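
To make reciprocity-through-learning concrete, here is the LOLA update in its first-order form, sketched in the notation of the LOLA paper (\(V^i\) is agent \(i\)'s expected return, \(\theta_i\) its policy parameters, and \(\delta, \eta\) are step sizes; none of this is taken from the slides). A naive learner updates

\[
\theta_1 \leftarrow \theta_1 + \delta\, \nabla_{\theta_1} V^1(\theta_1, \theta_2),
\]

whereas a LOLA learner differentiates through one anticipated learning step of its opponent, \(\Delta\theta_2 = \eta\, \nabla_{\theta_2} V^2(\theta_1, \theta_2)\), optimizing \(V^1(\theta_1, \theta_2 + \Delta\theta_2)\), which to first order gives

\[
\theta_1 \leftarrow \theta_1 + \delta\, \nabla_{\theta_1} V^1 + \delta\eta\, \big(\nabla_{\theta_2} V^1\big)^{\top} \nabla_{\theta_1} \nabla_{\theta_2} V^2 .
\]

The extra shaping term is what allows LOLA agents to learn reciprocal, tit-for-tat-like strategies in the iterated prisoner's dilemma rather than converging to mutual defection.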

Slides

L04_Cooperation.pdf

Student Presentation 1 (zero-shot coordination)

Student Presentation 2 (LOLA)

Alignment Agendas

Paper Presentations

Supplemental Readings