Games LLMs Play!!!

🔬 In a fascinating experiment inspired by the Iterated Prisoner’s Dilemma (a game theory model where cooperation, betrayal, and long-term planning collide), researchers pitted top-tier AI models from OpenAI, Google (Gemini), and Anthropic (Claude) against each other and against established strategies.
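For anyone new to the Iterated Prisoner’s Dilemma, here is a minimal sketch of a single match in Python. The payoff values (5, 3, 1, 0) are the classic textbook ones and are an assumption, not the numbers from the experiment; the names play_match, tit_for_tat, and always_defect are illustrative only.

```python
# Assumed payoffs (classic textbook values, not taken from the experiment):
# both cooperate -> 3 each; both defect -> 1 each;
# a lone defector gets 5 and the exploited cooperator gets 0.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(own_history, opp_history):
    """Defect on every round, regardless of history."""
    return "D"

def play_match(strategy_a, strategy_b, rounds=10):
    """Play a fixed number of rounds and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play_match(tit_for_tat, tit_for_tat))    # (30, 30)
print(play_match(tit_for_tat, always_defect))  # (9, 14)
```

Two cooperators end up far ahead of a defector paired with a retaliator, which is the tension the whole experiment plays with.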

The findings revealed that LLMs are highly competitive, displaying distinctive and persistent personalities - aka “strategic fingerprints”.

The personalities?

1. OpenAI (GPT series): The Idealist Diplomat

  • Always ready to cooperate. Tends to trust first, ask questions later.
  • In friendly environments? Thrives.
  • In hostile ones? Gets exploited badly.
  • Think of it as the “let’s all hold hands and sing kumbaya” agent.

2. Google Gemini: The Calculated Realist

  • Ruthlessly strategic.
  • Knows when to cooperate… and when to stab you in the back (figuratively).
  • Adapts like a chameleon in a kaleidoscope.
  • If AI had a Machiavelli fan club, Gemini would be president.

3. Anthropic (Claude): The Forgiving Strategist

  • Highly cooperative but not naive.
  • Will forgive past betrayals if it sees hope for future harmony.
  • Mid-game diplomacy goals.
  • Imagine your wise friend who still believes in second chances.

Each model developed “evolutionary update rules” where only the fittest strategies survived and multiplied.
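To make “survived and multiplied” concrete, here is a rough sketch of one common evolutionary update, a discrete replicator step in which each strategy’s population share grows in proportion to its average score. This is an illustrative assumption, not necessarily the rule the researchers used; the function name evolve and the payoff numbers are hypothetical, and the sketch assumes all payoffs are positive.

```python
def evolve(shares, payoff, generations=1):
    """Discrete replicator update.

    shares: dict mapping strategy name -> fraction of the population.
    payoff: payoff[a][b] is the score strategy a earns against strategy b.
    Assumes positive payoffs so the mean fitness is never zero.
    """
    for _ in range(generations):
        # Fitness = expected score against a randomly drawn opponent.
        fitness = {
            a: sum(payoff[a][b] * shares[b] for b in shares) for a in shares
        }
        mean_fitness = sum(fitness[a] * shares[a] for a in shares)
        # Above-average strategies gain share; below-average ones shrink.
        shares = {a: shares[a] * fitness[a] / mean_fitness for a in shares}
    return shares

# Hypothetical 10-round match scores: a retaliating cooperator vs. a pure defector.
payoff = {
    "tit_for_tat": {"tit_for_tat": 30, "always_defect": 9},
    "always_defect": {"tit_for_tat": 14, "always_defect": 10},
}
start = {"tit_for_tat": 0.5, "always_defect": 0.5}
print(evolve(start, payoff, generations=5))
```

With these made-up numbers the cooperative strategy takes over generation by generation, because it scores so well against copies of itself.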

One critical factor shaped every decision: how likely the game would end after each round, otherwise known as the “shadow of the future.”

👉 Why does this matter?

Because the longer the shadow of the future (i.e., the more likely the game continues), the more incentive there is to cooperate: building trust pays off. But if the game could end at any moment, short-term exploitation becomes the dominant strategy.
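A back-of-the-envelope way to see this, sketched in Python under stated assumptions: the payoffs T=5, R=3, P=1 are the classic textbook values (not from the experiment), and the opponent is assumed to punish a single defection by defecting forever (a “grim trigger”). The function names are illustrative only.

```python
def expected_rounds(end_prob):
    """Average game length when the game ends with probability end_prob after each round."""
    return 1.0 / end_prob

def cooperation_pays(end_prob, T=5, R=3, P=1):
    """Compare cooperating forever with defecting once against a grim-trigger opponent.

    T: payoff for defecting on a cooperator, R: mutual cooperation,
    P: mutual defection. All values are assumed, not from the experiment.
    """
    continue_prob = 1.0 - end_prob
    # Cooperate every round: R now, plus R for every future round that happens.
    cooperate_value = R / (1.0 - continue_prob)
    # Defect now: T once, then mutual defection (P) for the rest of the game.
    defect_value = T + continue_prob * P / (1.0 - continue_prob)
    return cooperate_value >= defect_value

print(expected_rounds(0.25))   # 4.0
print(cooperation_pays(0.10))  # True: long shadow, trust pays off
print(cooperation_pays(0.75))  # False: short shadow, exploitation wins
```

With these assumed payoffs the tipping point is a 50% chance of the game continuing, i.e. (T - R) / (T - P) = 0.5.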

And guess what?

  • Gemini excelled in high-risk, short-horizon games, ruthlessly defecting when necessary.
  • OpenAI thrived in long-term cooperative settings, staying idealistically cooperative even when exploited.
  • Claude surprised everyone by being both forgiving and effective, adapting well to fluctuating conditions.

🔥 So?

As LLMs evolve, we may see:

  • Smarter negotiation bots
  • Adaptive cybersecurity agents
  • AI diplomats in simulated geopolitics, maybe?
