Wonderstar Analytics
Player Attributes (21/03/2014 04:49)
"Do not be blinded by the Q." - Spinner
Last week I set you guys a small challenge, while I finished up Season 111's Match Analytics: To guess how important each attribute was for performance in a match, all else being equal. Of two otherwise-identical midfielders, would the one with higher tackling or higher speed perform better over the course of a season? Does an extra perception point on your attacker help him more or less in a game than an extra heading point? And do older players really perform better than younger ones with similar stats? I think this is an interesting question to ask the community, because when you play this game you quickly realise that the simple formulae for quality in different positions are an imperfect measure of how good a player actually is. I'm talking of course about these:
Figure 1: Player Quality Formulae for each position
To work out the quality of a player, you'd find his position, multiply each of his stats by the numbers given in the table, and add them together. But of course, this means that a midfielder for example could gain 10 heading and 20 perception and his quality value wouldn't change. Yet he'd certainly play better in a match - he'd be more likely to score at corners, play more intelligent passes, make successful interceptions and so on. So can we come up with a better set of formulae?
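As a rough illustration of that recipe, here is a minimal Python sketch. The weights are invented for the example (the real ones are in the Figure 1 table), and `MIDFIELDER_WEIGHTS` and `quality` are hypothetical names. The only thing taken from the text is that heading and perception carry zero weight for a midfielder.

```python
# Hypothetical weights for illustration only; the real values are in Figure 1.
# What matters here is that heading and perception are weighted at zero.
MIDFIELDER_WEIGHTS = {
    "passing": 0.4, "tackling": 0.2, "speed": 0.2,
    "shooting": 0.1, "stamina": 0.1,
    "heading": 0.0, "perception": 0.0, "keeping": 0.0,
}


def quality(stats, weights):
    """Multiply each stat by its weight for the position and add them up."""
    return sum(weights.get(name, 0.0) * value for name, value in stats.items())


# A made-up midfielder, before and after gaining 10 heading and 20 perception.
before = {
    "passing": 80, "tackling": 70, "speed": 75, "shooting": 60,
    "stamina": 85, "heading": 40, "perception": 50, "keeping": 10,
}
after = dict(before, heading=before["heading"] + 10,
             perception=before["perception"] + 20)

q_before = quality(before, MIDFIELDER_WEIGHTS)
q_after = quality(after, MIDFIELDER_WEIGHTS)
print(q_before, q_after)  # the two values are identical
```

Any attribute with a zero weight is invisible to the formula, however much it might matter in a match.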
Crowdsourcing A New Formula
One approach is to use the "wisdom of the crowds" - how important do managers in the game feel each attribute is? Let's say you:
- Persuade roughly a dozen managers from 9 different leagues to estimate how important each attribute is, and
- Those managers have 353 full ManagerLeague seasons under their belt and a combined national-level trophy haul of 28 pots, and
- You mix everything all together
Then you might get something that looks a bit like this:
Figure 2: Attribute Importance for each position, Crowdsourced
So that's an alternative based on the combined wisdom and intuition of a bunch of experienced managers. I think it's valuable - so how does it differ from the standard Q formulae? Well, the first thing that pops out is how much more balanced these formulae are: Each position relies on a wider range of attributes, and is less dependent on the primary stat than the Q formulae imply. Perception is rated highly for every position, which is not that surprising given Spinner's advice that Speed and Perception are now more important in the latest version of the sim. Stamina is held to be moderately useful for every position. Older players are generally considered to play better than younger ones. You could do worse than use this as your guide to player ability, instead of Q. But could you do even better?
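The "mix everything all together" step is not spelled out, so the following is only a sketch under the assumption that it means a plain average of each manager's estimates, rescaled so each position adds up to 100%. The survey numbers and the name `crowd_average` are hypothetical.

```python
from statistics import mean

# Hypothetical survey answers for one position: each manager splits
# 100 points across the attributes they think matter.
estimates = [
    {"keeping": 50, "perception": 20, "speed": 15, "age": 15},
    {"keeping": 60, "perception": 15, "speed": 15, "age": 10},
    {"keeping": 40, "perception": 25, "speed": 15, "age": 20},
]


def crowd_average(estimates):
    """Average each attribute across managers, then rescale to sum to 100."""
    attributes = sorted({name for e in estimates for name in e})
    averaged = {a: mean(e.get(a, 0) for e in estimates) for a in attributes}
    total = sum(averaged.values())
    return {a: 100 * value / total for a, value in averaged.items()}


crowd = crowd_average(estimates)
for name, share in crowd.items():
    print(f"{name}: {share:.1f}%")
```

An attribute a manager leaves out is treated as a zero from that manager, which is one of several reasonable choices.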
Measuring A New Formula
Our second approach then is to look at a large number of players in similar teams, and see how strongly associated each attribute is with their performance over the course of a season. There are some more details in the previous post, but roughly speaking we could:
- Find a large number of players - say 5000 - who all play in similar strength leagues, and
- Record their attributes, age, and average season performance, and
- Run a big multiple linear regression for each position to see how strongly each attribute predicts performance
Then, you'd get something like this instead:
Figure 3: Attribute Importance for each position, Calculated from Multiple Linear Regression
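For anyone curious what "a big multiple linear regression" involves, here is a self-contained sketch on synthetic data. Everything in it is an assumption for illustration: the players are randomly generated, `TRUE_WEIGHTS` is made up, and turning the fitted slopes into percentage shares by dividing by their sum is just one plausible way to get figures comparable to the Q table. It is not the actual analysis code behind Figure 3.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least squares with an intercept, via the normal equations X'X b = X'y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic players: attributes drawn at random, performance built from
# made-up weights plus noise, so we know what the regression should recover.
random.seed(0)
ATTRIBUTES = ["keeping", "perception", "speed", "passing"]
TRUE_WEIGHTS = [0.50, 0.20, 0.15, 0.15]
rows = [[random.uniform(30, 99) for _ in ATTRIBUTES] for _ in range(5000)]
y = [sum(w * v for w, v in zip(TRUE_WEIGHTS, r)) + random.gauss(0, 2.0)
     for r in rows]

coef = ols(rows, y)   # coef[0] is the intercept
slopes = coef[1:]     # one slope per attribute
shares = {a: 100 * s / sum(slopes) for a, s in zip(ATTRIBUTES, slopes)}
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

With 5000 synthetic players the recovered slopes land very close to the weights used to generate the data, which is the sense in which a large sample makes the estimates stable.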
So that all looks OK, Goalkeepers pretty much what we expected, Defenders look reasonable, midfielders fine, attackers... hang on a minute. Yes, I imagine one stat in particular stands out as being a bit... unexpected. I'll step through each position in turn, and address that in a bit more detail when it comes up.
But apart from that, the strengths of different attributes actually make quite a bit of sense. For each position, the regression formula in Figure 3 is closer to the crowdsourced estimates in Figure 2 than it is to the simple Q table. In other words, the intuition players have about which attributes are over or underrated actually seems to be quite accurate, and is corroborated by the regression. What's more, more experienced managers guessed attribute strengths that were closer to the regression estimates in Figure 3: Both the amount of experience a manager had (number of ML seasons played) and the number of national-level trophies they won correlated with their accuracy in the contest. Intuition and objective measurement lead you in roughly the same direction, and more experienced intuition matches up even more closely to what we measure.
So whether you look at pure intuition or a completely objective analysis, you get quite a similar story: More attributes are important than just the 20 in the Q formulae, Perception is indeed important for every player, primary stats are overrated, Age helps (for certain positions at least). Having said that, there are also some pretty big differences, so let's go through things position by position. As we go I'll discuss why I think certain attributes are important for each position, using my own understanding of the sim, as well as reasons why we might see differences across the three methods. The way we're going to do this is look at all three methods for assessing attributes in each position: The original Q formula, the crowdsourced intuition, and the objective regression.
I've also included a simple average of the three for each position. The reason I do this is because each method has different strengths and weaknesses, and in such cases you normally end up with more information when you combine data from different places.
So for example, the Q formula is the only one that comes directly from Spinner. But the sim has been tweaked since these formulae were designed, the formula is very simple, and it ignores a lot of attributes for each position.
Meanwhile, the crowdsourced data reflects the average intuition of managers currently playing the game, so it's more up to date and also incorporates what people have learned from reading Spinner's announcements or the forums. On the other hand, it's based very closely on the Q formula, perhaps too closely - so if the Q formula isn't that accurate, the crowdsourced estimates might not be either. That goes for 'myths' and misunderstandings too - especially if they've been included in a help blog somewhere, people are likely to read them and accept them, without necessarily questioning how true they are. And intuition is pretty useful, but it's subject to all of the brain's usual weaknesses - including a tendency to overinterpret noisy data for example.
The regression meanwhile 'looks at' a lot more data than the average human manager, and in a rigorous, carefully designed way. But it's limited by the data you put into the analysis, which in this case is all league games around Q80. If tactics and attributes act differently at other Q levels for example, that might affect the results. And of course they're not exact - roughly speaking, each estimate is probably within about 4% of the true underlying value.
A simple average of the three helps to smooth some of this out. Of course, you can average things in any way you wish - I'd be inclined to weight the regression data more highly than the other two for example, since the Q and crowdsourced methods are so similar to each other.
But you might have a deep distrust of maths, and decide to completely ignore the regression results. That's all fine! Gathering the data and laying it out gives us all a chance to weigh things up ourselves and use our own judgment. So, let's take a look at each position.
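If you want to try your own blend of the three sources, a sketch along these lines would do it. The percentages are placeholders rather than the real figures, and `combine` is a hypothetical helper. Equal weights reproduce the simple average, while doubling the regression's weight is one way to lean on it more.

```python
def combine(sources, weights=None):
    """Weighted average of several attribute tables (equal weights by default)."""
    names = list(sources)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    attributes = sorted({a for table in sources.values() for a in table})
    return {
        a: sum(weights[n] * sources[n].get(a, 0.0) for n in names) / total_weight
        for a in attributes
    }


# Placeholder percentages for one position; not the values from the figures.
sources = {
    "q_formula": {"keeping": 60, "perception": 20, "speed": 20},
    "crowd": {"keeping": 50, "perception": 20, "speed": 30},
    "regression": {"keeping": 40, "perception": 20, "speed": 40},
}

simple = combine(sources)
weighted = combine(sources, {"q_formula": 1, "crowd": 1, "regression": 2})
print(simple)
print(weighted)
```

Setting a source's weight to zero drops it entirely, which covers the "ignore the regression" option too.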
Goalkeepers
Here's a summary of the goalkeeper formulae from our three sources: The original Q formula, crowdsourced intuition, and objective regression:
Figure 4: Important Attributes for Goalkeepers, estimated from three sources
First thing to note: They're all quite similar I'd say, and that's probably because the goalkeeper is the most specialised and simple position on the field. There are only a few things they're ever called on to do - saving shots, claiming crosses, stopping penalties, and passing the ball back out to start a new attack. I've not really seen goalkeepers committing fouls, running out of the box to make clearances, dribbling round an attacker, dealing with backpasses, coming up for corners in stoppage time or anything else like that - the sim keeps things pretty uncomplicated.
Keeping is clearly the most important attribute, which makes sense given that the most important part of the job is stopping goalbound efforts. Perception is next, and it's quite possibly used when running out to claim a cross, saving a penalty, and possibly to avoid giving away corners by 'saving' a shot that is going wide. Most managers seem to think the Q formula slightly overrates perception here, and the regression agrees. Speed can probably be thought of as 'reflexes' for goalkeepers and may be important for making saves, as well as claiming crosses.
The regression starts telling a different story when it comes to the last three attributes though. Passing doesn't seem to affect performance in any meaningful way - perhaps the success of the long passes goalkeepers tend to make is determined more by the receiving player - and Stamina also has no noticeable effect on performance. It's perhaps a good time to remind ourselves that we're only discussing effects on match performance - so even if stamina doesn't help in-game, it obviously adds some value to the player by letting him play more matches between rests.
Finally, the most interesting thing revealed by the regression is how important Age seems to be.
Though several managers already buy into the importance of playing veterans, and we all know that the average lineup age affects your team's chances, the regression suggests that the age of a player is in and of itself a much bigger driver of individual performance than commonly thought. There are a couple of slightly complicated caveats to bear in mind here. Firstly, attributes were measured at the end of a season. An older player gains less over a season, so his end-of-season attribute values will be close to his average values for the season, whereas a younger player with the same end-of-season attributes may have actually had lower values (hence lower performance) earlier in the season, before catching up thanks to gains. This could cause age to seem about 2-3% more important than it actually is. Secondly, age is on a different scale to everything else: 17-40ish, rather than 30ish-99. This means that while gaining one year in age is about as important as gaining one point in perception, over a whole sample of players age will actually account for a little less of the variation in performance than you might expect, simply because ages aren't as different between players. For example, two players might have a speed difference of 6 on average, and an age difference of 3. Even though each age point is more important than each speed point, it's the difference in speed there that has the biggest overall effect (6x15% vs 3x20%). Finally, it may well be that it's experience which actually improves performance, and that age is just a good estimate of that, so be cautious about filling your team with veterans who've been sitting on the bench all their career. Overall though, age is still accounting for somewhere between 10% and 20% of performance, which is hard to ignore.
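The speed-versus-age comparison above (6x15% vs 3x20%) can be worked through in a couple of lines. The per-point values and typical gaps are the ones quoted in the text, while the function name `typical_effect` is just a hypothetical label.

```python
def typical_effect(value_per_point, typical_gap):
    """Overall effect of an attribute = value of one point x typical gap."""
    return value_per_point * typical_gap


# Figures from the example above: speed is worth 15% per point with a typical
# gap of 6 points; age is worth 20% per point with a typical gap of 3 years.
speed_effect = typical_effect(0.15, 6)
age_effect = typical_effect(0.20, 3)

print(f"speed: {speed_effect:.0%}, age: {age_effect:.0%}")
print("speed matters more overall:", speed_effect > age_effect)
```

So each age point is worth more than each speed point, but the typical speed gap between two players contributes about one and a half times as much as the typical age gap.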
Defenders
Here's a summary of the defender formulae from our three sources: The original Q formula, crowdsourced intuition, and objective regression:
Figure 5: Important Attributes for Defenders, estimated from three sources
There's a little more difference between the three methods for defenders. First off, intuition agrees with the regression - Tackling is important, but overrated at 36%. Defenders tackle quite a lot during a game, but they're also required to make a lot of blocks, interceptions and clearances, which might rely a bit more on some of the other stats. Passing is also a little overrated, but still quite useful, while Speed and Perception - likely important for making interceptions, and, in the latter case, offsides - are if anything slightly underrated. On the other hand, Heading doesn't seem to be as important as we thought, perhaps because defensive headers don't need to be accurate, and aerial duels also rely on speed and perception. Stamina is also heavily overrated in terms of match performance, and the estimate from the regression is not significant (a rule of thumb for the regression values: Any values below 5% should be considered a little shaky, and anything over 10% rock solid). Notice how stamina and heading have similar patterns in Figure 3, i.e. nothing for GK, low for Def/Mid and high for attackers? It could be that stamina acts as strength in aerial duels, and is just a less important characteristic than heading for these situations. Shooting doesn't predict performance in defenders, though it's quite likely that the true value is a couple of percent or so - I have seen my defenders shoot, it's just very uncommon. Finally, Age is once again a surprisingly important factor. All the same caveats apply as discussed above for goalkeepers, but it does seem that a veteran defender should play better on average than a rookie with similar stats.
Midfielders
Here's a summary of the midfielder formulae from our three sources: The original Q formula, crowdsourced intuition, and objective regression:
Figure 6: Important Attributes for Midfielders, estimated from three sources
First and foremost, midfielders have to be flexible. Almost every stat seems to be useful, though interestingly the distribution from the regression is a little less balanced than the crowdsourced intuition. In fact, Passing and Speed are by some way the most important stats when it comes to performance. I'd guess this was a slight overestimate however. It is possible that midfielders with high values here tend to get put on the wings more - where they should see more of the ball, and thus get higher performance. That could probably account for a few percent of each if it's happening regularly. The same effect in reverse could be shaving a couple of percentage points off the apparent value of Stamina, Tackling and Shooting, so although they all seem to be heavily overrated, bear that in mind. The basic Q formula infamously neglects Perception and Heading, and the regression agrees with intuition that these are actually worth somewhere in the region of 5% and 10% respectively. Of course, many managers might pick a player with low heading and high perception to take free kicks and corners, so when you take that performance bonus into account it's possible heading is actually a couple of percent more useful than the regression implies, and perception less important. I've had a working theory for a while that the transfer market has overcompensated a little and actually overvalues perception these days, so it's interesting that the regression also pegs perception as being less crucial than most managers believe. Finally, Age contributes to performance again, though apparently not as much as for defenders or goalkeepers.
Attackers
Yeah, I don't blame you if you skipped forward to read this one. What the heck is going on here?
Figure 7: Important Attributes for Attackers, estimated from three sources
OK, before addressing the stat that is staring you in the face and convincing you the analysis must have gone haywire, let's quickly go over the others, because there's actually a lot going on here that's different across the three methods. First, Attack is the only position where Stamina seems to be important. Attackers make a lot of headers - just check how many crosses, corners and free-kicks happen in a game - so if stamina represents a player's strength and helps in those aerial duels, it kind of makes sense that an attacker is going to rely on it more than a defender or midfielder. There are fewer attackers on the field, so each gets more aerial duels per game, and the result probably makes a bigger difference for their performance rating since it often ends in a shot on target or a goal, rather than just a clearance. And about those Headers - more important than Shooting? Really? Before running this analysis I was pretty sure that shooting was indeed worth twice as much as heading, because in match reports you get lines like "Striker Jim shoots!" about twice as often as "A great header by Striker Jim!". But if you've read the match reports carefully, you'll have noticed that crosses end up in "shots" from strikers quite a lot, not always headers as you might expect. So what if the word "shoot" is just a bit misleading, and chances from crosses are using the heading stat regardless of the test? Then you'd see something more like the 17%/20% split given by the regression. I'm not completely convinced that's the case yet, but it would be a big deal if it turned out heading abilities were actually more important than shooting. Talking of more important than shooting, Perception seems to be strongly underrated for strikers. This may be a recent thing - the offside trap has become quite a popular and effective tactic in later versions of the sim, and so the need for high-perception strikers to stay onside has become correspondingly greater. 
On the other hand, the fact that Speed shows no relationship to performance is odd. The first conclusion I'd draw is that it means speed isn't helpful for breaking the offside trap - perhaps it used to be and just got downgraded in the current version of the sim. It should be important for getting to the ball first - in rebounds, long balls, misplaced passes - but perhaps these rely more heavily on perception, and/or strikers just aren't involved in middle-of-the-park speed duels to reach long balls as often as midfielders are. Finally, it could be mildly important for receiving certain passes such as through balls, but because set pieces have become so prevalent in the sim it's ended up not being as often-used as Spinner originally intended. Passing I think is easier to explain. Exactly how many times a match do your attackers get called on to make a meaningful pass? No, kick-offs don't count. I have never really understood why passing was rated as important for my forwards, virtually every time one of them gets the ball he has a crack at goal. It just doesn't happen often enough to be worth 9% of their value. On the other hand, they're just tackling all the time, aren't they? Okay. Tackles are the most important stat for a striker. More important than perception, shooting, age, passing and stamina. Combined. According to the regression. So there's an easy 'out' here, especially if you're disinclined to trust the analysis - it could just be wrong. It could be noise. I might have made a mistake somewhere. But I'm not sure it's that easy to dismiss. First off - I could definitely have made a mistake somewhere, human error happens all the time. But I've double checked and triple checked and if I had made an error, it's not really clear why it would just give me a weird result for tackles in particular. Perhaps it's noise? Maybe just a handful of strikers happen to be huge overperformers, and completely coincidentally have high tackling. 
Well sure, except that the analysis is waaaay too overpowered to be that susceptible to noise or outliers. Because there are almost 1000 players - around 20,000 matches - that 33% is pretty solid. It could be out by 3 or 4%, sure, and I definitely think it's more likely to be an overestimate than an underestimate, but 30% off? Not a chance. Put it this way - the regression is 99.5% certain that the value is somewhere north of 25%. It isn't noise.
So if the effect is real, what could explain it? Well, I think we need to consider the possibility that the numbers are right, and that tackling really is critical for a striker. Now, obviously attackers aren't going around tackling all the time. So perhaps tackling is standing in for something else.
The Grand Unified Belizio Theory of Tackling / The Witterings Of A Madman
Delete as appropriate.
Do you remember reading Bulldog's help guide when you started the game? You might recall a few lines of advice from Spinner, faithfully recorded on that page. One of them provides the quotation that begins this monster of a blog post. But it's another one that caught my eye:
"If you see certain players trying to perform "solo-raids" over and over, know that they are a bit special, and would do well with some extra speed and tackling to assist them in their attempts." - Spinner
It's commonly understood that Spinner is referring here to players with high flair, a hidden attribute. The 'solo-raids' look a little like this in the match reports:
Dave Defender with the ball now, trying to outsmart Mike Midfielder.
This is great football! Running past him using his speed and technique!
Note that the match reports often give you clues (deliberately) about what attributes or teamstats are helping a player out in a given situation. And Spinner has often stressed that it's possible to learn a lot from the match reports. Here, we're specifically told that the player uses his "speed and technique" to dribble the ball past his opponent. Speed, sure, but we don't have an attribute for technique, right? That's why I'm wondering if the Spinner quote is important - it tells us that flair players benefit particularly from Speed and Tackling. So does tackling correspond to technique, or ball control? Well... I recently ran a quick analysis on my own flair players. The success rate of their dribbling was predicted by 3 factors: The number of times they attempted to dribble per match (which I take to be a measure of their underlying "flair" attribute), their speed... and their tackling. And tackling was more strongly predictive than speed. Small sample size of course, but interesting. And remember. ManagerLeague uses only 8 visible player attributes. Every time someone suggests increasing this, Spinner is adamant that it isn't going to change - he likes the relative simplicity of it. So imagine you're Spinner, and you're on iteration 9 of your sim, which is 3 times as long and complex as when you started it. You're trying to make your match situations as realistic as you possibly can, but without introducing new stats. What's an elegant alternative? To recycle stats, and make them represent a broader range of skills. So if you want to make controlling the ball a key role for a striker, maybe you might use the otherwise useless tackling stat to code for that... So that's the theory. Apologies if someone's already aired it, but I have never read it on the blogs or forums here: Tackling = Ball Control. 
If that's the case, it might be able to explain why attackers benefit so much from a higher value: Quite often, when they receive a cross or a pass, the striker miscontrols the ball. If performance ratings are heavily penalised for doing so, and of course get a big boost if instead the attacker gets a shot off, and control is being tested on every attacking reception, and only highlighted when it fails... that might just be enough to explain why tackling is so strongly predictive of performance.
One extra little bit of evidence in favour. Remember back in this early blog post, where I discussed how playing in certain positions might affect what attributes you improved? We concluded that although it was mostly random, players did seem to gain more in attributes that they should be using during a match. Well, if you look back to the distribution of attribute gains by position (the second table), which attribute do you think gains fastest in attackers?
- Belizio
This blogger owns the team The Wonderstars. (TEAM:154471) |