One Trillion and One Nights

An experiment using LLMs to procedurally generate browser-based JRPGs

Jan 22, 2025

I spent four years, from 2017 to 2021, working at Unity Technologies on the now sadly defunct ML-Agents project. During my time there, I thought a lot about the potential role that AI might one day be able to play in game development. My colleagues and I were interested in how deep neural networks could enable game developers to realize their visions in less time-consuming and less expensive ways than current technology allowed. Unfortunately, those of us working on AI for games at the time were just a little bit too early. Fast forward to 2025 and the development of large language models and diffusion models has started to make some of the dreams we had back in 2017 a reality. It has also justifiably sparked fierce debates about what role AI should play in any fundamentally creative endeavor. We always imagined that tools like these would empower game designers, programmers, and artists, but there is also a worry that they might do the opposite.

Last summer I finished my two-year postdoc at Microsoft Research. In that time I had largely focused on computational neuroscience, and wanted to re-immerse myself in the forefront of AI research. I decided that the best way to do that would be to spend some free time playing around to discover what was now possible when bringing LLMs into game development. I was interested in how these tools might enable new kinds of procedural game experiences where the player can have a central role in defining what the game they play looks and feels like. One of my personal favorite game genres is the Japanese-style Role Playing Game (JRPG). All the way back in middle school some of the first games I ever created were JRPGs that I programmed on my TI-84 calculator. It felt fitting to me to return to the genre for this experiment.

I was interested in whether it would be possible to use an LLM to generate a whole game world, including a rich cast of characters, diverse settings, and a compelling narrative, in a procedural way based on the player’s input. Outside of using ChatGPT and GitHub Copilot, I didn’t really have much experience working directly with LLM APIs and integrating them into software, so this was a genuinely open question for me. As I started the project, I decided that I wanted to emulate the old JRPGs that I enjoyed playing as a child, which were quite linear. Rather than create a choose-your-own-adventure procedural world (which would be exciting in itself, but has been done by many others already), I decided to focus on allowing the player to define up-front the kind of game world, protagonist, and story they’d like to play through. I call the project that resulted from this experiment One Trillion and One Nights after the classic compilation of Arabic stories.

Today I am open sourcing One Trillion and One Nights with a simple MIT license. It was a fun and educational experience working on this over the past six months. I learned a lot about the current state of LLMs and generative AI more broadly, both in terms of their strengths and weaknesses. Although I was often impressed by a plot point or character personality that the model would come up with, I can say confidently that AI will not be replacing game designers, artists, or writers anytime soon. Still, I think there is a lot that can be learned from attempts like this to see how far we can get by handing over as much as possible to an LLM. I hope that playing the resulting game(s) that this project can generate will be fun for some. I also hope that the open source code can be useful for people who want to design procedural games with generative AI and are looking for a starting point.

The project is available on GitHub at this link. In the rest of this article I outline some of the design choices I made, show off a few screenshots demonstrating the kind of games that can be generated, and finally reflect a bit on the process as a whole.

Game Design

Before even getting into design, it is worth talking a bit about the basic game development choices I made. Given that Python has the most robust support for LLM APIs, I decided to use it as the backend for the game as opposed to Unity or some other established game engine. The frontend was written in JavaScript and runs in a web browser. I used a simple WebSocket to communicate the game state between the backend and frontend. I knew I wanted the game to be simple and entirely 2D, so I didn’t need all the things a full game engine could offer anyway. Because of this design choice, the game is easy to run in a web browser, but unfortunately a bit difficult to host online. This was a trade-off that I decided to make in order to make prototyping as easy as possible given my existing skill set, which certainly does not reside in web development.
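To make the backend-to-frontend exchange concrete, here is a minimal sketch of how game state could be packed into JSON text frames for a WebSocket. It is an illustration under assumptions, not the project's actual protocol: the GameState fields, the message "type" values, and the encode_state and decode_action helpers are all hypothetical names. It uses only the Python standard library and leaves out the socket itself.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class GameState:
    """Hypothetical snapshot the backend would push to the browser."""
    screen: str                      # e.g. "battle", "shop", "travel"
    text: str                        # narration or dialogue to display
    options: List[str] = field(default_factory=list)  # menu choices


def encode_state(state: GameState) -> str:
    """Serialize a state snapshot into a JSON text frame."""
    return json.dumps({"type": "state", "payload": asdict(state)})


def decode_action(frame: str) -> str:
    """Parse a frame sent by the frontend and return the chosen option."""
    message = json.loads(frame)
    if message.get("type") != "action":
        raise ValueError(f"unexpected message type: {message.get('type')!r}")
    return message["payload"]["choice"]
```

Keeping every message a small, self-describing JSON object means the browser only has to render whatever the latest state frame says, which suits a menu-driven game.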

The procedural aspects of each game result from a series of questions that the player is prompted to answer when starting the application up. These seed questions ask the player what kind of setting, protagonist, and story they would like the game to have. The player is given both multiple-choice questions and open-ended questions which they can answer however they like. It is through these open-ended questions that the player can come up with ideas that have probably never been seen in a JRPG, such as a game in which a cyborg owl detective has to stop an evil kangaroo from taking over the moon (at least I have not played such a game before).

Once the player has answered the seed questions, the LLM uses those answers to generate the game world, protagonist, and story according to a series of prompts which I crafted by hand. These prompts include a set of guidelines for what constitutes a JRPG. Because generation happens in real time, it is done hierarchically so that the player can get into the game as quickly as possible. As an example: the story is generated first as a paragraph-long synopsis, then as a paragraph for each of the five separate chapters, then as a sequence of events in each chapter. The same happens for the game world, which is first defined for the entire world, then for individual locations. The same goes for characters, shops, enemies, weapons, spells, items, and all other unique aspects of a given game. In each game, practically every proper noun is generated by an LLM in a way that was somehow influenced by the player’s original answers to the seed questions.
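The hierarchical story generation described above can be sketched roughly as follows. This is a simplified illustration, not the project's code: the LLM type is a hypothetical stand-in for any function that takes a prompt and returns text, and the prompt wording and function names are invented for the example.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns text.
LLM = Callable[[str], str]

NUM_CHAPTERS = 5  # the article describes five chapters per game


def generate_synopsis(llm: LLM, seed_answers: Dict[str, str]) -> str:
    """Top level: one paragraph for the whole story, from the seed answers."""
    seeds = "; ".join(f"{k}: {v}" for k, v in seed_answers.items())
    return llm(f"Write a one-paragraph JRPG synopsis. Player answers: {seeds}")


def generate_chapter_summaries(llm: LLM, synopsis: str) -> List[str]:
    """Middle level: one paragraph per chapter, conditioned on the synopsis."""
    return [
        llm(f"Synopsis: {synopsis}\nSummarize chapter {i} of {NUM_CHAPTERS}.")
        for i in range(1, NUM_CHAPTERS + 1)
    ]


def generate_events(llm: LLM, synopsis: str, chapter_summary: str) -> List[str]:
    """Bottom level: events for one chapter, requested only when needed."""
    raw = llm(
        f"Synopsis: {synopsis}\nChapter: {chapter_summary}\n"
        "List the events of this chapter, one per line."
    )
    return [line.strip() for line in raw.splitlines() if line.strip()]
```

Because each level only depends on the one above it, the detailed events for later chapters could in principle be requested once the player reaches them, which is one way a hierarchy like this can shorten the wait before play begins.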

The actual gameplay consists of the player navigating various menus for travel, battle, shopping, and character interaction. I opted for this approach over a 2D overworld in order to avoid the need to generate sprites, an area in which diffusion models are still not very consistent. Within the context of a menu-based game, there were ample opportunities to work in procedural elements. Thanks to the LLM operating behind the scenes, it is possible for each interaction between characters to be unique. For example, the player might enter a shop and have a discussion with the shopkeeper that is specific to the particular point in the story that the player is at, regardless of when in the story they actually met the shopkeeper. This also enables battles to be dynamic. For example, when a player casts a fire spell on a goblin, the text might describe the way in which the goblin reacts to the fire, based on how much damage it does.
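As a rough illustration of how battle narration could depend on damage, the sketch below turns a damage number into a coarse severity label and folds it into the prompt sent to the LLM. Everything here is an assumption made for the example: the thresholds are arbitrary, and damage_severity and battle_prompt are hypothetical helpers rather than functions from the project.

```python
def damage_severity(damage: int, max_hp: int) -> str:
    """Bucket damage as a share of the target's maximum HP (arbitrary thresholds)."""
    if max_hp <= 0:
        raise ValueError("max_hp must be positive")
    share = damage / max_hp
    if share >= 0.5:
        return "devastating"
    if share >= 0.2:
        return "solid"
    return "glancing"


def battle_prompt(caster: str, spell: str, target: str, damage: int, max_hp: int) -> str:
    """Build the text that would be sent to the LLM to narrate one action."""
    severity = damage_severity(damage, max_hp)
    return (
        f"{caster} casts {spell} on {target}, dealing a {severity} hit "
        f"({damage} of {max_hp} HP). Describe the target's reaction in one sentence."
    )
```

The game logic stays deterministic (the damage is computed by ordinary code), while the LLM is only asked to dress the result in language that fits the moment.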

Over the course of the six months that I worked on the project, I found myself often having to make decisions about the structure and scaffolding of the gameplay itself. What I ended up settling on is relatively simple, especially by the standards of any commercial game. It is a kind of minimum viable product when it comes to JRPGs. In each of the five chapters of the game, the player navigates through a series of locations, each of which is a town, a field, or a dungeon. In towns they can visit shops and inns where they can buy items and equipment or rest. In fields there are enemies and treasure chests. And in each dungeon there is a boss that must be defeated. Along the way the player will meet and recruit allies to join their party, as well as discover new items and spells to acquire. Each of the generated games is pretty simple and can be completed in an hour or two. My hope is that the replayability lies in the ability to create an endless number of new game worlds, each of which is meaningfully different from the others.
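The structure described above (chapters made of towns, fields, and dungeons, with a boss in each dungeon) maps naturally onto a few small data types. The sketch below is one possible modeling, not the project's actual classes: the names are hypothetical, and the rule that a chapter is complete once all of its dungeon bosses are defeated is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set


class LocationKind(Enum):
    TOWN = "town"        # shops and inns
    FIELD = "field"      # enemies and treasure chests
    DUNGEON = "dungeon"  # contains a boss fight


@dataclass
class Location:
    name: str
    kind: LocationKind
    boss: Optional[str] = None  # only meaningful for dungeons

    def __post_init__(self) -> None:
        if self.kind is LocationKind.DUNGEON and self.boss is None:
            raise ValueError("a dungeon needs a boss")
        if self.kind is not LocationKind.DUNGEON and self.boss is not None:
            raise ValueError("only dungeons have a boss")


@dataclass
class Chapter:
    number: int
    locations: List[Location] = field(default_factory=list)

    def is_complete(self, defeated_bosses: Set[str]) -> bool:
        """Assumed rule: complete once every dungeon boss is defeated."""
        bosses = {loc.boss for loc in self.locations if loc.boss is not None}
        return bosses <= defeated_bosses
```

With a fixed scaffold like this, the LLM only has to fill in names and descriptions, while the rules of progression stay in ordinary code.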

Two Examples

To give a sense of what kinds of game worlds are possible, I took a few screenshots from two different generated games. The first is a comedy about a chicken detective trying to solve a mystery in a futuristic cyberpunk world. The second is a more traditional fantasy story about a fairy princess trying to save her land from a plague. Both were generated in only a couple of minutes by Claude 3.5 Sonnet for a few cents each.

Cyberpunk Chicken Game

A dialogue window showing the main character talking to himself. The tone of the story was seeded to be humorous, and here we see the LLM’s attempt at humor.

A battle screen in which the player’s party of two characters faces off against an enemy.

Description of a spell cast by the player against the enemy. The language used conforms to the fact that the character casting the spell is a cyborg chicken.

Fairy Princess Game

The navigation menu for the game. Players can choose where to move using the cardinal direction buttons.

Locations in the world contain randomly placed treasure chests with consumable items.

The player can choose dialogue options when trying to recruit NPCs to join their party.

A Few Reflections

Good game design is difficult. I have incredible respect for designers who can craft game experiences which balance being engaging and challenging while also telling a compelling story. Working on this project reminded me of just how much care goes into crafting a game like this. It also made clear just how far LLMs are from being able to supplant any true craftsperson in this domain. That said, current LLMs can be useful sounding boards for design ideas and can help to refine them. At least with current technology, however, I don’t see an LLM designing a good game with novel gameplay elements from scratch by itself. Of course, given the rate of advancement in this area, that will likely change in the next few years.

There are also limitations to the quality of the storylines that current LLMs can produce. I found that each model would have its own biases about what it thought constituted a JRPG. For example, there was a point in development where a certain model would insist on including ley lines as a prominent plot element in the story, regardless of how incongruous doing so would be to the rest of the game world and narrative. I also found that LLMs had certain phrases which they seemed to insist on incorporating wherever possible. “Whispers” and “echoes” were two such terms that found their way into almost every piece of narration or dialogue at one point. Although it was amusing, this repetition of phrases points to a rigidity in these models, where “JRPG narrative” causes a kind of mode collapse to certain stereotyped phrases and plot points. Again, this is something that I imagine will be less of an issue in the coming years, but right now it is quite noticeable.

Another issue that I wrestled with was the speed of the models. Right now there is a very real trade-off between models that are fast, such as Gemini 1.5 Flash, and models which display greater intelligence, such as Claude 3.5 Sonnet. For many use cases, waiting a few seconds to a minute is not a problem, especially when the output from the LLM might have taken minutes for a person to generate themselves. The problem is that having to wait a few seconds every time you take any action in the game significantly slows down the momentum and flow of the experience. In the end, I decided to set the default LLM to GPT-4o, as it seemed to provide the best trade-off between speed and quality. This is something that can be easily changed in the configuration. An LLM would need to generate a complete response in a matter of milliseconds before the average player would find the wait acceptable. I am fairly certain that in the next couple of years we will get there, but we aren’t there yet.

Finally, I grappled a lot with the question of how much control to give the player over how the narrative unfolds in the game. As I stated above, my original goal was to create a game inspired by traditional JRPGs, which were largely linear affairs. That being said, there are countless points in the game where player choice could have had a more significant impact. For example, given that the locations in the game are procedurally generated, the game could give the player the ability to choose where they travel to next. Depending on where they go, the story could unfold in dramatically different ways. This is of course the more “western” RPG approach taken by Dungeons & Dragons, not to mention countless other games. The key to making this kind of game work would be to ensure that the story that unfolds remains compelling at the macro level. JRPGs have a number of narrative tropes which are easy to parody given how frequently they recur, but those tropes are often included because they ultimately enable a satisfying overall narrative experience. I look forward to the inevitable day when developers start to figure out how to strike the balance between player freedom and a compelling story in procedural games like this. Smarter LLMs will help, but having a game designer with a really strong sense of what they want will likely be even more critical.

Code Release

Developing a game like this could go on almost indefinitely, and I knew that there were other very different endeavors I wanted to pursue in 2025. With that in mind, I set myself a strict deadline of December 31st, 2024 to finish working on the project. I was able to include all of the features that I wanted, but the game itself is relatively bare bones, and admittedly not much balancing was done to the gameplay. Furthermore, there are likely bugs and other issues that I wasn’t able to find or address before the end of the year. If people bring these issues up on GitHub, I am open to helping to resolve them. I am also happy to accept pull requests that contain fixes or other minor improvements to the project’s codebase. Although I don’t expect this to be a great game, I do want it to at least be functional enough to be of use educationally and somewhat entertaining.

For bigger changes or additions to the project, I encourage people to fork the project and continue developing it as they see fit. I am releasing the code open source with an MIT license. With this license you could try to make a game to sell, but given the quality of what currently gets generated, I doubt anyone would actually pay to play one of these right now anyway. That said, I want this to be something that serves an educational or informative role, not a commercial one. If people do decide to build on it substantially, especially as it relates to integrating LLMs into the gameplay in compelling new ways, please let me know. I’d love to hear what new ideas people are dreaming up, and if someone does make a commercial game out of this, I’d be curious to play it myself.

The field of AI is moving incredibly quickly. What was state of the art merely six months ago is in many cases already hopelessly out of date. I expect that this project as it currently exists will face a similar fate. On the one hand, it will be possible to swap in newer LLMs in the future, which will likely significantly improve the quality of the narrative and dialogue that the game generates. I think that this has the potential to extend the value of a project like this somewhat. On the other hand, with models like OpenAI’s o3 coming down the pipeline in the coming months, I also imagine that the process of game design and development itself is about to be shaken up in a more fundamental way. It may not be long before LLMs are involved at a deeper level, tweaking and balancing the design of various gameplay elements in a game in real time.

There is a lot of justified fear and anger at the ways in which generative AI is starting to be used in game development, especially when it is being used to replace creative talent (and often with a final product that is shoddier than anything a professional human would create). At the same time, I think there is great potential in this technology as well, especially in the hands of passionate developers and smaller teams whose visions currently far exceed the resources they have at their disposal. With each new generation of LLMs and other AI models, the barrier to entry on the technical side of development is quickly approaching zero. In the future, I can imagine a world in which the limiting factor will be creativity rather than engineering ability. I can’t wait to see (and play) what people make.