The data shows: Play Aim Lab and you get better, play more and you get even better.
While time spent practicing a game should make you better at that game, Aim Lab’s AI also adjusts the difficulty level for you while you play, depending on your performance. This helps each player reach their optimal conditions for learning. Maybe you’ve been using Aim Lab and you want to know if it really works, or how much people can improve. Fair enough.
Each dot on a plot represents a group of users who played a certain amount of Aim Lab in December. In this graph, the arrow points to the dot representing percent improvement in kills per second for spidershot, for users who played between 101 and 125 total Aim Lab tasks (each task is a 60-second bout of AI-assisted training) in December. You’ll see that users in that group improved by about 10% in kills per second.
The trend line (line cutting across the dots) on each plot shows the overall relationship between the total number of tasks played in one month (x-axis or bottom of plot) and percent improvement (y-axis or left side of plot).
Here are two similar graphs where microshot and spidershot data are plotted separately and within each of these tasks, accuracy, kills per second, and reaction time measurements are plotted separately.
The plots show:
- The effect of practice on accuracy is not as drastic as it is on kills per second or reaction time. That’s OK: if someone is improving in Aim Lab, tasks automatically become more difficult in response, so there’s a tradeoff. If your kills per second and reaction time are improving, you’ll see more targets and/or smaller targets. So even maintaining the same level of accuracy while kills per second and reaction time improve is a good thing.
- Trend lines are above the zero mark on the y-axis. This means, overall, the change from starting score to end score for users was an improvement.
- Trend lines slope upward: This means, on average, the more tasks a user played in December, the greater improvement they saw.
In other words, play Aim Lab and you get better, play more and you get even better.
I took a look at this by starting with data from the entire month of December for 20 Aim Lab tasks from the 83,000 users who were active that month. Then, for each user, I calculated total tasks played for the month. So every 60-second task of strafetrack, spidershot, detection, snipershot, etc. = 1 task played.
I calculated for microshot and spidershot separately, the percent change in accuracy, kills per second, and reaction time performance for each user over the course of the month. For example, if someone started out at 1 kill per second in early December and ended up at 1.2 kills per second by the end of the month, they improved by 20%.
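The percent-change calculation above can be sketched in a few lines of Python (my own illustration, not the actual analysis code):

```python
def percent_change(start, end):
    """Percent change from a starting score to an ending score."""
    return (end - start) / start * 100.0

# The example from the text: 1 kill per second at the start of December,
# 1.2 kills per second by the end -- roughly a 20% improvement.
improvement = percent_change(1.0, 1.2)
```

Note that for reaction time, a *decrease* is the improvement, so its sign would be interpreted in the opposite direction.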
After removing outliers, I grouped users based on the total number of tasks they played over the course of December. Everyone who played between 50 and 60 tasks is in one bin, everyone who played between 61 and 70 tasks is in a second bin, and so on. I didn’t include users who played fewer than 50 tasks, because that doesn’t give us enough information to see whether they’re truly improving.
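A minimal sketch of that binning step (the exact bin edges in the post differ slightly; this assumes uniform 10-task bins for illustration):

```python
def bin_label(total_tasks, bin_width=10, min_tasks=50):
    """Return the lower edge of the bin a user falls into, or None
    if the user played too few tasks to be included."""
    if total_tasks < min_tasks:
        return None
    # Round down to the nearest bin edge, e.g. 57 -> 50, 63 -> 60.
    return total_tasks // bin_width * bin_width
```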
Then I plotted the percent change in performance for each bin against the total number of tasks. I analyzed performance for spidershot and microshot only, because they are played the most often and so give us the most data.
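The trend line on each plot is just a straight-line fit; here is a minimal ordinary least-squares sketch (an assumption on my part — the post doesn’t say which fitting routine was used):

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for a trend line."""
    n = len(xs)
    mx = sum(xs) / n  # mean of x values (tasks played)
    my = sum(ys) / n  # mean of y values (percent improvement)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept
```

A positive slope is what the bullets above describe: more tasks played, more improvement.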
You might already have some questions…
Q: This shows that if you play Aim Lab a lot, you get better at Aim Lab, but do you get better at other titles?
A: In an earlier Aim Lab blog post, we describe our experiment showing that while practice itself can lead to improvement, the feedback from Aim Lab (for example, showing you how your performance differs by screen location) leads to even greater improvement. Sensitivity matching options in Aim Lab are designed to let you transfer these skill improvements to your favorite titles. In addition, training in Aim Lab covers many game-agnostic skills: better reaction time, better accuracy, and decreased screen location bias are good no matter what you play next. Users tell us it’s helping them a lot, and we’re working on ways to combine external gaming data with Aim Lab data for individual players without being creepy.
Q: Other than time spent practicing, what else might affect my performance in Aim Lab?
A: Other analyses we’ve done show a big difference in performance by mode (training, assessment, speed, ultimate, etc.) and weapon type. Also, you all own different gaming rigs, which could affect performance.
Q: So why didn’t you account for weapon type or task mode in this graph?
A: Logical suggestion – we’ll get to that in another post.
Q: Why do you get rid of outliers? What if someone is just really really good or bad?
A: Outliers are exactly what they sound like: not a good representation of the group, and they can have an effect on results way out of proportion to the effect one user should have. In some cases, they’re not real data at all — someone could stand up to get a snack in the middle of playing and then appear to have a round with the reaction time of a sloth.
Q: What if I played a bunch in November, but not much in December? If you’re only looking at December data, the dots might be in the wrong place.
A: Yup. This was not a sliding window analysis. Still, those are nice trend lines because we have a ton of data from a ton of users. Thank you, Aim Lab users. Scroll back up and look at the nice trend lines.
Q: Doesn’t your starting performance level affect how much you can improve?
A: Absolutely. Another set of graphs showed that the lower your performance when starting off, the more you improve. Not a surprise. But those graphs are a lot harder to look at and not everyone likes graphs. The trend lines in the plots included here average all of that out.
Q: Some of those graphs look like a curved line would fit the trend better than a straight line.
A: Yes. These graphs were made to inform the largest number of readers possible and straight trend lines are better for that.
For each user, for microshot and spidershot tasks separately, I calculated a start score as the median of the 3rd through 7th tasks recorded in December and an end score as the median of the last 5 tasks recorded in December. To get the median, you order a set of numbers from lowest to highest and take the middle value. This is similar to taking the average of a set of values, but it is robust to extremes. For example, the median of 1, 2, 3, 4, 100 is 3, but the average of 1, 2, 3, 4, 100 is 22.
The same user playing a given task 5 times in a row will probably get 5 slightly different scores. Plus, when someone starts Aim Lab, they may need a couple of tries to use it correctly.
So, I ignored the first two tries on each task and then took the median of the 3rd through 7th tasks recorded to get a starting score that would be a more accurate reflection of a user’s starting ability and the median of the last 5 to get the end score.
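Putting those two steps together, a minimal sketch of the start/end score calculation (my own illustration of the procedure described above):

```python
from statistics import median

def start_end_scores(scores):
    """Start score = median of the 3rd-7th recorded tasks (skipping the
    first two warm-up attempts); end score = median of the last 5 tasks.
    Returns None if the user hasn't recorded at least 7 tasks."""
    if len(scores) < 7:
        return None
    start = median(scores[2:7])  # 3rd through 7th tasks (0-indexed slice)
    end = median(scores[-5:])    # last 5 tasks
    return start, end
```

The median example from the text falls out directly: if the 3rd through 7th scores were 1, 2, 3, 4, 100, the start score would be 3.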
For each user, if kills per second, reaction time, or accuracy was way off-base, making it an outlier (within either the highest 2.5% or the lowest 2.5% of values for all users), data from that task was removed. Similarly, if a user was at the extreme low or high end of the distribution for total number of tasks played over the course of the month, all of that user’s data were removed. No offense. About 5,600 users remained after this filtering.
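That percentile trim can be sketched like so (the post doesn’t specify the exact percentile method, so this rank-based cutoff is an assumption):

```python
def trim_outliers(values, trim=0.025):
    """Keep only values between the 2.5th and 97.5th percentiles,
    dropping the most extreme 2.5% at each end."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(n * trim)]                  # lower cutoff value
    hi = ordered[min(int(n * (1 - trim)), n - 1)]  # upper cutoff value
    return [v for v in values if lo <= v <= hi]
```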