In the last instalment of this series, we covered the best attacking central defensive midfielders in the MLS. However, since there are different styles in the MLS, it makes sense to also cover the defensive-minded central defensive midfielders.
In this data analysis, we will use statistics to assess the central defensive midfielders in the MLS and work out the best defensive-minded one in the league.
Setting the Guidelines with PCA
Earlier in this series, one of the troubles I encountered was how difficult it is to identify different styles from the myriad statistics we have available. For some players and some positions, using statistics to find the best player is relatively easy. For wingers, attacking statistics can generally lead us to the best players, such as Leroy Sané of Manchester City, and to up-and-coming ones, such as Nicolas Pépé of Arsenal. However, for positions such as central defensive midfield, using statistics of any kind quickly becomes very difficult, as these players have various components to their game.
Making scatterplots of various metrics not only becomes monotonous but can also lead us to miss some players simply because of that monotony.
In light of this, Principal Component Analysis (PCA) becomes very helpful. A PCA is simply a guide that can tell us, mathematically, the different types of players that exist in a bigger pool of players. In doing so, it also tells us which indicators correspond most closely with each type of player.
For example, players like Bayern Munich’s Thiago Alcantara and Chelsea’s Mateo Kovačić would be characterised as press-resistant, attacking central defensive midfielders. The statistics corresponding to their mould would include metrics like dribbles, progressive passes, and through balls.
I applied this PCA to central defensive midfielders in the MLS who had played more than 20 games, and here are the results.
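For readers who want to try something similar, here is a minimal sketch of how such a PCA could be run. It is not the exact code behind this article: the metric names are placeholders, the player data is randomly generated rather than real MLS data, and the choice to standardise each metric before the decomposition is an assumption. Only the "more than 20 games" filter and the four player types come from the analysis itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-90 metrics; placeholders, not the article's actual metric list.
metrics = ["tackles", "interceptions", "aerial_duels_won",
           "progressive_passes", "dribbles", "through_balls"]

# Synthetic stand-in data: 60 made-up players (NOT real MLS statistics).
n_players = 60
games_played = rng.integers(5, 35, size=n_players)
stats = rng.normal(size=(n_players, len(metrics)))


def run_pca(X, n_components):
    """PCA via SVD on standardised columns (an assumed preprocessing step)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)      # share of variance per component
    components = Vt[:n_components]       # rows: components, columns: metric loadings
    scores = Z @ components.T            # each player's position on each component
    return components, scores, explained[:n_components]


# Keep only players with more than 20 games, as in the article.
eligible = games_played > 20
components, scores, explained = run_pca(stats[eligible], n_components=4)

# List the metrics that weigh most heavily on each component ("player type").
for i, row in enumerate(components):
    top = np.argsort(-np.abs(row))[:3]
    print(f"Type {i + 1}:", [metrics[j] for j in top])
```

Each row of `components` describes one "type" of player through the metrics that load most strongly on it. Note that the sign of a component is arbitrary, so a type should be read from which metrics move together rather than from the raw sign.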

Here we see the four types of midfielders that are most common in the MLS. Since our focus is on defensive midfielders, we want to look for the type of midfielder that correlates highly with defensive statistics. This also lets us find other statistics that do not seem related at first but that, on mathematical inspection, are positively correlated.
In this heatmap, the bluer the colour, the greater the correlation, while shades of red indicate the opposite.
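A heatmap of this kind can be sketched as follows. This is an illustration under stated assumptions, not the code used for the figure: the data is again synthetic, the metric names are placeholders, the cells are taken to be the correlation between each metric and each player type's score, and red is assumed to mean negative correlation.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# Hypothetical metric names and synthetic data (NOT real MLS statistics).
metrics = ["tackles", "interceptions", "aerial_duels_won",
           "progressive_passes", "dribbles", "through_balls"]
stats = rng.normal(size=(40, len(metrics)))


def metric_component_correlations(X, n_components):
    """Correlation of every metric with every principal-component score."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    k = X.shape[1]
    # corrcoef stacks metrics and scores as variables; keep the metric-vs-score block.
    full = np.corrcoef(Z, scores, rowvar=False)
    return full[:k, k:]


def plot_heatmap(corr, metric_names):
    """Draw the correlations with a red-to-blue scale (blue = positive)."""
    fig, ax = plt.subplots(figsize=(6, 4))
    im = ax.imshow(corr, cmap="RdBu", vmin=-1, vmax=1, aspect="auto")
    ax.set_xticks(np.arange(corr.shape[1]))
    ax.set_xticklabels([f"Type {i + 1}" for i in range(corr.shape[1])])
    ax.set_yticks(np.arange(len(metric_names)))
    ax.set_yticklabels(metric_names)
    fig.colorbar(im, ax=ax, label="correlation")
    fig.tight_layout()
    return fig


corr = metric_component_correlations(stats, n_components=4)
fig = plot_heatmap(corr, metrics)
```

Calling `fig.savefig("heatmap.png")` would write the image to disk. Fixing the colour scale to the range from -1 to 1 keeps white at zero correlation, so the depth of blue or red can be compared across columns.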





