#gamedesign


It's been a very busy week with hardly any time to even read my #GameDesign book. Hopefully there'll be more time soon.

Just now I read about elegance in game design being a good thing. One idea that strikes me is that simple things that offer many ways to play are especially interesting. It reminds me of teaching my 4-year-old to play chess.

The Leaderboard Illusion

arxiv.org/abs/2504.20879

arXiv.org: The Leaderboard Illusion

Measuring progress is fundamental to the advancement of any scientific field. As benchmarks play an increasingly central role, they also grow more susceptible to distortion. Chatbot Arena has emerged as the go-to leaderboard for ranking the most capable AI systems. Yet, in this work we identify systematic issues that have resulted in a distorted playing field. We find that undisclosed private testing practices benefit a handful of providers who are able to test multiple variants before public release and retract scores if desired. We establish that the ability of these providers to choose the best score leads to biased Arena scores due to selective disclosure of performance results. At an extreme, we identify 27 private LLM variants tested by Meta in the lead-up to the Llama-4 release. We also establish that proprietary closed models are sampled at higher rates (number of battles) and have fewer models removed from the arena than open-weight and open-source alternatives. Both these policies lead to large data access asymmetries over time. Providers like Google and OpenAI have received an estimated 19.2% and 20.4% of all data on the arena, respectively. In contrast, a combined 83 open-weight models have only received an estimated 29.7% of the total data. We show that access to Chatbot Arena data yields substantial benefits; even limited additional data can result in relative performance gains of up to 112% on the arena distribution, based on our conservative estimates. Together, these dynamics result in overfitting to Arena-specific dynamics rather than general model quality. The Arena builds on the substantial efforts of both the organizers and an open community that maintains this valuable evaluation platform. We offer actionable recommendations to reform the Chatbot Arena's evaluation framework and promote fairer, more transparent benchmarking for the field.
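The selective-disclosure effect the abstract describes can be illustrated with a toy simulation. This is only a sketch under invented assumptions, not the paper's method: every variant is given the same true skill, each private test adds Gaussian noise, and only the best of N scores is published. The skill value, noise level, and function names are hypothetical; the only figure taken from the abstract is the count of 27 variants.

```python
import random
import statistics


def best_of_n(true_skill, noise_sd, n, rng):
    """Publish only the highest of n noisy scores for equally skilled variants."""
    return max(rng.gauss(true_skill, noise_sd) for _ in range(n))


def mean_reported(n_variants, trials=2000, true_skill=1200.0, noise_sd=15.0, seed=0):
    """Average published score over many simulated releases.

    true_skill and noise_sd are illustrative values, not numbers from the paper.
    """
    rng = random.Random(seed)
    return statistics.fmean(
        best_of_n(true_skill, noise_sd, n_variants, rng) for _ in range(trials)
    )


single = mean_reported(1)    # one variant tested, score always published
best27 = mean_reported(27)   # 27 variants tested, only the best published

print(f"single submission: {single:.1f}")
print(f"best of 27:        {best27:.1f}")
```

Under these assumptions the best-of-27 average lands well above the true skill even though no variant is actually better, which is the bias that selective disclosure introduces.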

Working further on the UI layout. Each zone will be foldable and scalable. I'm also thinking of adding screens to the left and right that you can swipe to (or switch to with buttons), depending on context and utility. Imagine a smartphone screen with numerous desktops, each with its own widgets. The central area is reserved for mouse navigation and target picking, so it won't be covered.

🌸 Ready for a delightful afternoon of IRL connection and conversation in VanCity?

Join me for a Tea Social where we dive into all things #UXDesign, Marketing Psychology, #ProductManagement, and Game Dev!

Chat with Fernanda Nauata, #GameDesign expert, this Sunday, May 11th, from 1:00 PM to 3:00 PM PDT.

Let's sip tea, share stories, and build our amazing community!
You can check out all the details and RSVP below.

lu.ma/5so4marf

We've been so focused on game development that we haven't had time to upload new images to our gallery. But good news: we just uploaded a new folder for our higher-tier Patrons to enjoy!

Access to these previews is part of the Chloe tier and above, but only if you don't mind spoiling parts of the story.

As we mentioned, only Chloe-level Patrons (and higher) can check out some exclusive sketches from the upcoming release. Just a heads-up: there are massive spoilers in these images! You might even spot some unfamiliar faces... but please, don't ask — just enjoy!

Gallery -> thsd.fun/0220

The 0.22 update is planned for mid-May, though it might take a little longer. As you’ll see in the gallery, this is shaping up to be a big release with a ton of new images. What’s there now is just a bit more than half of what's coming, so there’s still plenty of work ahead!

⭐ Favorite |🔁 Reply | 🔃 Boost to support the development of our game!

Should ammo be an inventory space taker?

Taking it should be a risk/reward decision (die = lost ammo), but is using up an inventory slot too much punishment? I'm toying with one free stack of each type, but that's going to be a weird "oh, NOW the ammo takes space?" player discovery.

But I don't love the idea of being able to hoover up ammo and bring home enough for WW4.

Thoughts?
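One way to reason about the "one free stack of each type" idea is to write the rule down and try a few loadouts. The sketch below is purely illustrative: the stack size, the ammo names, and the `slots_used` function are all made up for the example and do not come from the game.

```python
from math import ceil

STACK_SIZE = 30  # hypothetical rounds per stack


def slots_used(ammo_counts, stack_size=STACK_SIZE, free_stacks=1):
    """Count inventory slots taken by ammo.

    The first `free_stacks` stacks of each ammo type cost nothing;
    every further stack of that type takes one slot.
    """
    total = 0
    for count in ammo_counts.values():
        stacks = ceil(count / stack_size)
        total += max(0, stacks - free_stacks)
    return total


# A light loadout costs no slots; hoarding one type starts to cost space.
light = slots_used({"9mm": 30, "shells": 10})
hoard = slots_used({"9mm": 90, "shells": 10})
strict = slots_used({"9mm": 30}, free_stacks=0)

print(light, hoard, strict)
```

Setting `free_stacks=0` gives the stricter variant where all ammo takes space, so both options can be compared on the same loadouts.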