No Animals Were Harmed: An Insight into Parameter Testing for World of Warships

About a year ago, I was approached by our studio director, who asked me, “Oleg, by the way, what are the World of Warships client performance parameters today?” The answer at the time was, “Well, no more than 15 fps for any configuration, sorry.”

In those days, we had to do a lot of optimization, as Warships ran smoothly only on higher-end PCs. So, we decided to initiate stress tests to resolve this.

For starters, we had to consider which configurations were the most prevalent among players.

We thoroughly analyzed the stats that were kindly provided by our World of Tanks and Warplanes colleagues and other resources. Based on the gathered data, we arranged our test assembly using four of the most popular PC configurations.

The test setup before our recent renovation:

[Image 1]

We frequently changed the internal components with the goal of surveying a wide distribution of PC configurations, both low- and high-end. We considered the fact that many players launch our other titles on a heterogeneous assortment of computing technology — including old laptops, desktop towers, and toaster ovens. Also, we were sure to remember that players may use a variety of operating systems and antivirus software.

All this information brought us to four configurations, connected to two monitors via KVM (keyboard-video-mouse) switches, like some Frankenstein monster of Warships badassery. It grew sentient and hungry for more power, so we also used our own workstations and two entire gaming rooms to ensure adequate testing conditions.

Testing: A How-To

All that time spent building this assembly was a good investment, and we thought hard about which tests to run on it. The beginning was quite easy: we sat down, launched the game, and tracked the frames per second. One of us created a training room, invited people, and before long, we were deeply engrossed in modern combat simulation:

[Image 2]
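
Incidentally, the number we were watching is nothing exotic: frames rendered per second of wall-clock time. Here is a minimal Python sketch of such a counter, purely illustrative and not the game’s actual instrumentation:

    import time

    class FpsCounter:
        """Estimate frames per second over one-second windows (illustrative)."""

        def __init__(self):
            self.frames = 0
            self.window_start = time.perf_counter()
            self.fps = 0.0

        def tick(self):
            """Call once per rendered frame; refreshes self.fps about once a second."""
            self.frames += 1
            elapsed = time.perf_counter() - self.window_start
            if elapsed >= 1.0:
                self.fps = self.frames / elapsed
                self.frames = 0
                self.window_start += elapsed

A render loop would call tick() after presenting each frame and log the current value of fps wherever convenient.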

Later on, we realized that this approach was imperfect, due to an extensive number of innovations continually being added. Want a breakdown of the issues we encountered?

  •  The manual test turned out to be very time-consuming
  •  We had no way to capture the exact in-game commands and actions our players performed
  •  The test accuracy was quite low, due to a wide variety of possible outcomes and unpredictable behavior for players online
  •  On the surface, we got only a raw number for FPS – without any idea what was actually happening “inside”
  •  The monkey got very cranky

Tracking Success

To track the necessary data, we stuck to a familiar World of Tanks feature: replay recording. The testing process also changed a bit: now it takes only one person to record a maximum-intensity battle against bots. Afterward, the replay is played back on all the test PCs, gathering stats along the way.
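
To give a feel for the per-machine step, here is a rough Python sketch: play a recorded battle back through the client, then summarize an FPS log. The executable name, the replay-as-an-argument convention, and the fps.log file are assumptions made for illustration, not our actual tooling:

    import statistics
    import subprocess

    def run_replay(client_exe, replay_path, fps_log="fps.log"):
        """Play one replay to completion, then summarize its FPS samples."""
        subprocess.run([client_exe, replay_path], check=True)  # blocks until the battle ends
        with open(fps_log) as fh:
            samples = sorted(float(line) for line in fh if line.strip())
        return {
            "replay": replay_path,
            "mean_fps": statistics.mean(samples),
            "min_fps": samples[0],
            "p1_fps": samples[len(samples) // 100],  # rough 1st percentile
        }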

This approach turned out to be ideal for comparing several PC configurations and game versions, as all the battle actions were perfectly duplicated across the testing machines. One of the first missions we undertook was to check the modified ship models. To do it, we recorded a replay on a baseline version of the game, added the new ships, recorded another replay, and then played both back on the test configurations. That’s how we obtained the data.
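
Whether the two builds share one replay or each needs its own recording, the comparison in the end boils down to the summaries. Continuing the hypothetical sketch above:

    def fps_change_percent(baseline, candidate):
        """Percent change in mean FPS between two builds' replay summaries."""
        return 100.0 * (candidate["mean_fps"] - baseline["mean_fps"]) / baseline["mean_fps"]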

Alas, there are no ideal instruments for such tests, so when we decided to compare two versions with different server parts, the replays failed and crashed the game. Thus, we had to make separate replays for each version. Naturally, this was where another issue showed up: even though we could reproduce our actions pretty closely, the bots in those days were completely uncontrollable. So we had to compare replays with completely different scenarios, which of course required a huge number of them to be processed. As you might suspect, that extra effort pays off: increasing the number of measurements improves the overall quality of the tests by decreasing the dispersion of the values obtained.

In other words, only by having enough data readily available can we draw valid conclusions about the current state of the game, and because the tests were based on replays with varied scenarios, many data points needed to be recorded. How were they obtained? Was it really necessary to spend a lot of time recording them manually? We needed a solution, and the monkey needed rest…
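
The statistics here are the familiar kind: with independent replays, the scatter of the estimated mean FPS shrinks roughly with the square root of the number of replays. A quick simulation with made-up numbers (not our real measurements) shows the effect:

    import random
    import statistics

    random.seed(42)

    def spread_of_mean(n_replays, trials=1000):
        """Std. deviation of the estimated mean FPS across simulated test runs."""
        means = []
        for _ in range(trials):
            # pretend each replay yields a mean FPS around 30 with heavy scatter
            run = [random.gauss(30.0, 5.0) for _ in range(n_replays)]
            means.append(statistics.mean(run))
        return statistics.stdev(means)

    for n in (5, 20, 80):
        print(n, "replays -> spread of the mean is about", round(spread_of_mean(n), 2), "FPS")

Quadrupling the number of replays roughly halves the spread, which is why tests built on varied scenarios demand so many recordings.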

Playtests

Every single day, Lesta Studio performs several big playtests. Anyone willing may attend and try out the new gameplay elements and ship models. Every PC records the battles played. After each test, specially trained people send these replays to a dedicated game server belonging to our performance test team, where they are stored and replayed when necessary. Playtests allow us to perform simultaneous tests for a large number of game situations.
Besides the fact that participants prefer different types of ships, all of them also have varying tactics and play styles. Along with these eccentricities, players keep surprising us by playing in the least anticipated manner. For instance, we witnessed a battle where an aircraft carrier attempted to ram the destroyers attacking her. And there are players who like to spend a battle switching between cameras to watch the shells in flight.

An average playtest provides us with 70–150 replay files featuring any and all fight patterns possible. We sort them by map and latency, prepare .cfg files, and proceed to the next important step.
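
That sorting step is easy to script. Here is a sketch of the idea, where the read_map_name helper and the filename convention are hypothetical stand-ins for reading the replay’s real metadata:

    import shutil
    from pathlib import Path

    def read_map_name(replay):
        """Hypothetical: pretend the map name is the last '_'-separated token
        of the file name; in reality it lives inside the replay's metadata."""
        return replay.stem.split("_")[-1] or "unknown"

    def sort_replays(inbox, outbox):
        """File each playtest replay into a per-map folder for the mass test."""
        for replay in Path(inbox).glob("*.wowsreplay"):
            target = Path(outbox) / read_map_name(replay)
            target.mkdir(parents=True, exist_ok=True)
            shutil.move(str(replay), str(target / replay.name))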

Mass Performance Test – No Animals Harmed

[Image 3]

Once we have enough replays, we can run really massive tests of the client and bear witness to the real performance of the game across all possible situations. The system created by our stress-test specialists automatically launches the replays on the desired test stations.
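
We can’t show the real system here, but its shape is simple: fan the replay set out across the stations and pull the results back. Below is a minimal sketch over SSH, with the station names, the run_replay.py runner, and the stats.json path all invented for illustration:

    import subprocess
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    STATIONS = ["test-pc-low", "test-pc-mid", "test-pc-high", "test-pc-top"]  # hypothetical

    def run_station_queue(station, replays):
        """One station plays every replay in sequence and ships its stats back."""
        for replay in replays:
            subprocess.run(["ssh", station, "python", "run_replay.py", replay], check=True)
            subprocess.run(["scp", station + ":stats.json",
                            "results/" + station + "-" + Path(replay).name + ".json"],
                           check=True)

    def mass_test(replays):
        """All stations chew through the same replay list in parallel."""
        with ThreadPoolExecutor(max_workers=len(STATIONS)) as pool:
            jobs = [pool.submit(run_station_queue, s, replays) for s in STATIONS]
        for job in jobs:
            job.result()  # surface any per-station failure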

After long hours of testing, we get statistics on what happens in the game. We do our best to track down everything we can get our hands on! First, we ensure that the profiler works, as this tool is responsible for logging game performance; it helps us locate problems in the client. There are also a couple of other tools running alongside; these check the client systems every second and send this information to the server.
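
Those side tools boil down to a once-a-second sampler. A rough equivalent using the third-party psutil library, with the ingest URL made up for the example:

    import json
    import time
    import urllib.request

    import psutil  # third-party: pip install psutil

    STATS_URL = "http://perf-stats.example.internal/ingest"  # hypothetical endpoint

    def snapshot():
        """One per-second reading of the client machine's vital signs."""
        return {
            "ts": time.time(),
            "cpu_percent": psutil.cpu_percent(interval=None),
            "ram_percent": psutil.virtual_memory().percent,
            "disk_read_bytes": psutil.disk_io_counters().read_bytes,
        }

    def run_sampler():
        while True:
            req = urllib.request.Request(
                STATS_URL,
                data=json.dumps(snapshot()).encode(),
                headers={"Content-Type": "application/json"},
            )
            urllib.request.urlopen(req)  # fire the reading at the server
            time.sleep(1.0)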

30 FPS

It is worth mentioning the importance of this value and where it comes from. Some things have scarcely changed since the days of magnetic tape and film reels: a moving picture is still made in the same old manner, with the video card drawing frames that replace one another. This has to happen at least 30 times per second to get a smooth picture. If the number is lower, the gameplay becomes uncomfortable. This is what we call lag. :)
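
Put differently, a target frame rate is a per-frame time budget:

    def frame_budget_ms(target_fps):
        """Milliseconds each frame may take to hold the target frame rate."""
        return 1000.0 / target_fps

    print(frame_budget_ms(30))  # ~33.3 ms per frame: miss it and the picture stutters
    print(frame_budget_ms(15))  # ~66.7 ms: where we started, and why it felt so rough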

[Image 4]

Every single frame in the game may be influenced by a vast number of operations, including water, ground, cloud, and ship rendering effects. And let’s not forget explosions, fires, tracers, computing and obtaining physics data from the server, sending internal data, etc. Whew!

All these events occur simultaneously, processed by both the CPU and video card. Not only do we need to make everything work efficiently, we also have to ensure that both units never have to wait for each other.

Today, while taking a look at the newest upcoming version, I was caught by our director once again, and he asked me the same question as before. This time, I had a decent answer. Thanks to the work of many departments, countless innovations, and performance improvements, many PC configurations can now enjoy World of Warships at the target FPS. Even weaker PCs and laptops now provide a comfortable gaming experience.
