Another gotcha is forgetting to await. It is easy to do even with TypeScript, because a Promise<void> can be ignored and a Promise<boolean> is treated as truthy, without warning from the compiler.
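Here is a minimal sketch of both mistakes. The function names are made up for illustration. Recent compiler versions (TypeScript 4.3 onwards, with strictNullChecks) do flag a bare Promise used directly as an `if` condition, so the sketch uses Boolean() to show the underlying truthiness. A dropped Promise<void> still compiles without complaint.

```typescript
const events: string[] = [];

// Hypothetical stand-in for any async write, such as saving to a database.
async function saveScore(): Promise<void> {
  await Promise.resolve();
  events.push("saved");
}

// Hypothetical stand-in for any async check.
async function isGameFull(playerCount: number): Promise<boolean> {
  return playerCount >= 4;
}

async function forgotAwait(): Promise<void> {
  saveScore(); // compiles, but nothing waits for the save to finish
  events.push("replied");
}

async function didAwait(): Promise<void> {
  await saveScore();
  events.push("replied");
}

// A Promise is an object, so it is truthy whatever it later resolves to.
// Here the game is not full, yet the un-awaited result still reads as true.
const pendingIsTruthy = Boolean(isGameFull(0));

async function demo(): Promise<{ forgot: string[]; awaited: string[] }> {
  await forgotAwait();
  // Give the stray save time to finish before inspecting the order.
  await new Promise<void>((resolve) => setTimeout(resolve, 0));
  const forgot = events.splice(0); // copy and clear
  await didAwait();
  const awaited = events.splice(0);
  return { forgot, awaited };
}

demo().then((result) => {
  console.log(pendingIsTruthy); // true
  console.log(result.forgot);   // [ 'replied', 'saved' ]
  console.log(result.awaited);  // [ 'saved', 'replied' ]
});
```

A lint rule that rejects floating promises is the usual safety net for the first case.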
The method above finds a running Game instance. Clients connect via WebSocket to a running instance, and they all need to connect to the same one! When I brought down the server to test resilience, I noticed that when the clients auto-reconnected they’d sometimes see different game instances. The clients have exponential backoff, but not randomised exponential backoff, so their reconnects all arrived together.
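For reference, randomised backoff can be as small as the sketch below. This is not the client code from this project. The function name, the 500 ms base and the 30 second cap are all assumptions, and the approach shown is the "full jitter" variant, where each client waits a random time between zero and the exponential ceiling.

```typescript
// Delay before reconnect attempt number `attempt` (0 for the first retry).
// `random` is injectable so the function can be tested deterministically.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 500,   // assumed base delay
  capMs: number = 30000,  // assumed upper bound
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Spread clients across the whole window instead of firing together.
  return Math.floor(random() * ceiling);
}

console.log(reconnectDelayMs(3, 500, 30000, () => 0.5)); // 2000
```

Spreading the reconnects makes the race described next less likely to show up, but it does not remove it.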
Looking at the code, the sequence is:
- Read the Map of Game Code to Game at line 86
- Asynchronous “await” for data access at line 89
- Asynchronous “await” as the game loads data at line 92
- Write to the Map of Game Code to Game at line 93
This is a classic Read-Modify-Write race condition. It’s so fundamental that I spent some time with my kids this morning demonstrating it: they had to read a number off a piece of paper, run to the other end of the house, do a calculation, then come back and replace whatever was on the paper with their result. (They’d asked me what I was doing, so I thought I’d explain.)
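The race can be reproduced in a few lines. This is a stripped-down sketch rather than the real method: the RacyGame type, the racyGames Map and the 10 ms timer standing in for the database read are all invented for the demonstration.

```typescript
interface RacyGame {
  code: string;
}

const racyGames = new Map<string, RacyGame>();

async function findGameRacy(code: string): Promise<RacyGame> {
  const existing = racyGames.get(code); // read
  if (existing) return existing;
  // Stand-in for the awaited data access. Other callers run during this gap.
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  const game: RacyGame = { code };      // modify
  racyGames.set(code, game);            // write, possibly replacing another caller's game
  return game;
}

// Two clients reconnecting at the same moment.
Promise.all([findGameRacy("ABC123"), findGameRacy("ABC123")]).then(([a, b]) => {
  console.log(a === b); // false: each client got its own instance
});
```

Both callers read an empty Map before either has written to it, so each builds a separate game and the second write replaces the first.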
We need to write to the Map in the same synchronous block as we read from it, but we don’t know the answer at that point! The way I solve this at work on the client is to store a Promise of the work. Normally that covers the case where the user, or a page script, needs the data before my server data load has finished. With a Promise it doesn’t matter whether the answer is known or still pending when it is accessed. It also allows the Map to be written immediately.
It still feels counter-intuitive on first scan that the delete() is earlier in the code than the set(). You have to remember that the asynchronous work with a remote database read at the top will happen later.
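A minimal sketch of the pattern, assuming a Map keyed by game code. The names (LoadedGame, pendingGames, loadGame, findGame) are illustrative, not the ones from the real code, and the loader is a timer standing in for the database read and game setup.

```typescript
interface LoadedGame {
  code: string;
}

// The Map holds Promises, so it can be written before the answer is known.
const pendingGames = new Map<string, Promise<LoadedGame>>();

// Hypothetical loader standing in for the remote read and the game's own loading.
async function loadGame(code: string): Promise<LoadedGame> {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return { code };
}

function findGame(code: string): Promise<LoadedGame> {
  const existing = pendingGames.get(code);
  if (existing) return existing;

  const pending = loadGame(code).catch((err) => {
    // Runs later, and only if loading fails: forget the entry so a retry can start afresh.
    pendingGames.delete(code);
    throw err;
  });

  // Runs now, in the same synchronous block as the get() above.
  pendingGames.set(code, pending);
  return pending;
}

const first = findGame("ABC123");
const second = findGame("ABC123");
console.log(first === second); // true: both callers share one pending load
```

Note that findGame itself is not async. There is no await between the read and the write, so no other caller can slip in between them.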
Express Sessions are similarly fiddly when route handlers are asynchronous. It’s hard to know what the state of the Session is by the time that you reach the end of the asynchronous function, and as for interaction with WebSockets!
I modified my Session to allow a single browser instance to enter the game as multiple Players. This meant that the Session held an array of Player IDs, and that gave us another Read-Modify-Write race condition.
I have since reduced my Session to just a User ID. This is created on login and never touched again! It is vital that the User ID is set before any asynchronous work.
I maintain the relationship between User and Player using a Redis Set. This allows Redis to maintain integrity as I perform Add, Remove and List operations.
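The sketch below shows why a Set helps: each operation is a single command, so the application never reads the list, changes it and writes it back. The SetStore interface is hypothetical, with methods named after the Redis commands SADD, SREM and SMEMBERS, and an in-memory stand-in replaces the real client so the example runs without a server. The key naming scheme is also an assumption.

```typescript
// Hypothetical minimal interface, named after the Redis commands it mirrors.
interface SetStore {
  sadd(key: string, member: string): Promise<number>;
  srem(key: string, member: string): Promise<number>;
  smembers(key: string): Promise<string[]>;
}

// In-memory stand-in so the sketch runs without a Redis server.
class InMemorySetStore implements SetStore {
  private sets = new Map<string, Set<string>>();

  async sadd(key: string, member: string): Promise<number> {
    let set = this.sets.get(key);
    if (!set) {
      set = new Set<string>();
      this.sets.set(key, set);
    }
    if (set.has(member)) return 0; // already present, nothing added
    set.add(member);
    return 1;
  }

  async srem(key: string, member: string): Promise<number> {
    const set = this.sets.get(key);
    return set && set.delete(member) ? 1 : 0;
  }

  async smembers(key: string): Promise<string[]> {
    const set = this.sets.get(key);
    return set ? Array.from(set) : [];
  }
}

// Assumed key layout: one Set of Player IDs per User.
const playersKey = (userId: string): string => `user:${userId}:players`;

async function demoPlayers(store: SetStore): Promise<string[]> {
  const key = playersKey("u1");
  // Concurrent adds, including a duplicate, cannot overwrite each other.
  await Promise.all([store.sadd(key, "p1"), store.sadd(key, "p2"), store.sadd(key, "p1")]);
  await store.srem(key, "p2");
  return store.smembers(key);
}

demoPlayers(new InMemorySetStore()).then((players) => {
  console.log(players); // [ 'p1' ]
});
```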
At the moment a User has many Players and a Player has zero or one Game. The two connect when the user’s browser creates the WebSocket connection. I could change this to create the Player later, but the code has moved in an agile way and I need to prioritise other things. A Player may change their display details or their current Game. This opens up another Read-Modify-Write race, but it is mitigated by the client serialising its access. If I move Player Name and Icon to User, then Player becomes a simple joining object and there is no race condition between them. The cost is less flexibility when testing, or when playing as multiple Cub Scouts on the same browser.
Web Sockets and Sessions
There is sample code online for accessing Sessions from Web Sockets. TypeScript complains bitterly about it, because Express methods are being called with things that are not HTTP Requests and Responses. It just rings alarm bells, and the problem with asynchronous work is a worry here too.
I’ve used a JSON Web Token (JWT) to solve this. It’s an extra round trip. The client makes a GET request to produce a short-lived JWT, which it then sends as its first message to the Web Socket Server. This token contains the Player ID and the Game ID. It is signed by a method which first verifies that the User ID (via the Express Session’s signed Cookie) owns that Player.
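To show the shape of the token, here is a hand-rolled HS256 sketch using only Node’s built-in crypto module. It is not the code from this project, and for anything real a maintained JWT library is the safer choice. The function names, the claim names and the 30 second lifetime are assumptions. The ownership check against the Session is left to the caller of issueSocketToken.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface SocketClaims {
  playerId: string;
  gameId: string;
  exp: number; // expiry, in seconds since the epoch
}

const toBase64Url = (text: string): string => Buffer.from(text, "utf8").toString("base64url");

function hmacSha256(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

// Call this only after checking that the logged-in User owns the Player.
function issueSocketToken(playerId: string, gameId: string, secret: string, nowSec: number, ttlSec: number = 30): string {
  const header = toBase64Url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const claims: SocketClaims = { playerId, gameId, exp: nowSec + ttlSec };
  const payload = toBase64Url(JSON.stringify(claims));
  return `${header}.${payload}.${hmacSha256(`${header}.${payload}`, secret)}`;
}

// Run on the first WebSocket message. Returns null for anything untrustworthy.
function verifySocketToken(token: string, secret: string, nowSec: number): SocketClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = Buffer.from(hmacSha256(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as SocketClaims;
  return claims.exp > nowSec ? claims : null;
}

const demoToken = issueSocketToken("player-1", "GAME01", "demo-secret", 1000);
console.log(verifySocketToken(demoToken, "demo-secret", 1010)?.playerId); // player-1
console.log(verifySocketToken(demoToken, "demo-secret", 2000));           // null (expired)
```

Because the token carries its own expiry and signature, the WebSocket server never needs to touch the Express Session.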
The system uses UUIDs for Player and User. These are difficult to guess. Game uses a Base36 number, which is easier for a human to type but still gives a sparse space to search in. I could use short-lived Redis keys to start rate limiting access if I need to, or if this ever becomes Big I’d have to look at Denial of Service prevention as a front-end service.
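As an illustration of how such a code might be produced, here is a small sketch. The six-character length and the use of Math.random are my assumptions for the example, not details of the real system. Six Base36 characters give 36^6 = 2,176,782,336 possible codes, so a handful of live games is very sparse.

```typescript
// Generates a fixed-length Base36 code, e.g. for reading out loud or typing in.
// `random` is injectable so the function can be tested deterministically.
// Math.random is not cryptographically secure. That is acceptable here only
// because the code is a lookup key, not a secret.
function randomGameCode(length: number = 6, random: () => number = Math.random): string {
  const space = 36 ** length;
  return Math.floor(random() * space)
    .toString(36)
    .toUpperCase()
    .padStart(length, "0");
}

console.log(randomGameCode(6, () => 0.5)); // I00000
console.log(parseInt("I00000", 36));       // 1088391168
```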
How does it scale? Most access is now a simple REST API backed by Redis, so it should scale as Redis scales and I can add server nodes. Games do take memory whenever there is at least one connected Player. They are shelved back to the database when no players are connected, and deleted when finished with. I think I can scale by encoding a Server ID in the game code used to access the game, or even by mapping this in Redis and returning the information to the client. Then it’s a matter of routing the WebSocket connections to the right node by picking out the right part of the request URL. Either that, or explore Redis Message Queues. At the moment it will have no issue handling 20 Cub Scouts on a private home-hosted server that is alive only for the duration of the meeting.
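The first of those ideas could look something like the sketch below. Nothing like this exists in the code yet. The node addresses are placeholders, and using a single leading Base36 character for the Server ID (which limits it to 36 nodes) is purely an assumption for illustration.

```typescript
// Hypothetical node addresses. A real deployment would load these from configuration.
const gameNodes: string[] = ["ws://node-0.example", "ws://node-1.example"];

// Prefix the game number with one Base36 character identifying the owning server.
function makeRoutedGameCode(serverId: number, gameNumber: number): string {
  return (serverId.toString(36) + gameNumber.toString(36)).toUpperCase();
}

// Pick the node from the first character. Returns undefined for unknown servers.
function nodeForGameCode(code: string): string | undefined {
  const serverId = parseInt(code.charAt(0), 36);
  return gameNodes[serverId];
}

const routedCode = makeRoutedGameCode(1, 46655);
console.log(routedCode);                  // 1ZZZ
console.log(nodeForGameCode(routedCode)); // ws://node-1.example
```

A proxy in front of the nodes could apply the same lookup to the game code in the request URL.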