Thursday 8 October 2020

Tournament Correlations Added to Golf Predictor

I am happy to announce a new tournament correlation capability for Golf Predictor. This relatively computationally intensive capability is not available to users directly, but I can run it to identify the tournaments with the most similar (good) results in the same GP Season (calendar year) on the same tour (European/US PGA Tour). For example, this may provide insights into an upcoming tournament by comparing the previous year's results for that event with the results of the other tour events in the same year. Obviously, this is more useful for events held on the same course as the previous year. The functionality can compare either the entire fields or just matching top twenty results. The latter is much more useful, as it may identify potential good performers in the upcoming event.


When comparing the entire field of the selected event to each of the others, if the same golfer has played in both and his results are similar, those results are tagged as correlated. The degree of similarity can be adjusted, but the default is plus/minus ten places, e.g. if player A finished 23rd in the selected tournament, any result of his from 13th to 33rd in any of the other tournaments will be designated as correlated. For top twenty correlations, the results in both compared events obviously have to be in the top twenty.
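The tagging rule above can be sketched in a few lines of Python. This is only an illustration and not the actual Golf Predictor code: the function name and parameters are hypothetical, and it assumes that in top twenty mode a result counts as correlated whenever both finishes are inside the top twenty, without applying the plus/minus window as well. Ties and missed cuts are ignored.

```python
from typing import Optional


def is_correlated(selected_pos: int, other_pos: int,
                  tolerance: int = 10,
                  top_n: Optional[int] = None) -> bool:
    """Return True if two finishing positions count as correlated.

    Hypothetical helper for illustration only.
    """
    if top_n is not None:
        # Top-N mode (assumption): both results must be inside the top N.
        return selected_pos <= top_n and other_pos <= top_n
    # Whole-field mode: positions must be within +/- tolerance places.
    return abs(selected_pos - other_pos) <= tolerance


# Player A finished 23rd in the selected tournament.
print(is_correlated(23, 13))            # True, 13th is inside the window
print(is_correlated(23, 34))            # False, 34th is outside it
print(is_correlated(5, 18, top_n=20))   # True, both are top twenty finishes
```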

When each tournament has been compared to the selected event, the number of correlated results is totalled and expressed as a percentage of the total number of matches found (i.e. golfers who played in both events). For example, if fifty golfers are found that played in both compared events and, of those, fifteen had similar results in both, the correlation factor for those events would be 30%. High correlation percentages are easier to come by when there are fewer matches, so only tournaments with a significant number of matches should count. In my experiments, with an appropriate lower limit on the number of matches, the highest correlation percentages seem to be approximately 25-40%. In events with over 120 golfers, the minimum number of matches should be set to around fifty for better results in overall correlations. For top twenty correlations, there should be a lower minimum number of matches (e.g. 12) and a similar minimum correlation percentage to overall correlations. In both cases, the minimum number of matches and the minimum correlation percentage for correlated events can be specified in the system.
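As a rough illustration of the arithmetic, here is a minimal Python sketch of the whole-field calculation. It is not the actual Golf Predictor code: the function names, the dictionary layout (golfer name mapped to finishing position) and the default thresholds are all assumptions made for the example.

```python
from typing import Dict, Tuple


def correlation_percentage(selected: Dict[str, int],
                           other: Dict[str, int],
                           tolerance: int = 10) -> Tuple[int, float]:
    """Return (matches, percentage) for two events.

    Each argument maps a golfer's name to his finishing position.
    Hypothetical helper for illustration only.
    """
    # A "match" is a golfer who played in both events.
    common = selected.keys() & other.keys()
    matches = len(common)
    if matches == 0:
        return 0, 0.0
    # Count the matches whose positions are within +/- tolerance places.
    correlated = sum(
        1 for golfer in common
        if abs(selected[golfer] - other[golfer]) <= tolerance
    )
    return matches, 100.0 * correlated / matches


def is_correlated_event(matches: int, percentage: float,
                        min_matches: int = 50,
                        min_percentage: float = 25.0) -> bool:
    """Apply the two user-specified thresholds (assumed defaults)."""
    return matches >= min_matches and percentage >= min_percentage


# Fifty golfers played both events; fifteen had similar results.
selected = {f"golfer{i}": 30 for i in range(50)}
other = {f"golfer{i}": (30 if i < 15 else 60) for i in range(50)}
matches, pct = correlation_percentage(selected, other)
print(matches, pct)                       # 50 30.0
print(is_correlated_event(matches, pct))  # True
```

Passing a lower `min_matches` (e.g. 12) would correspond to the looser limit suggested for top twenty correlations.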


Once the correlations are found, the system can then examine the field for the upcoming event and find those playing who performed well (i.e. finished in the top twenty) in each correlated event this year (or, if the event hasn't been held yet or isn't being held at all, in last year's event). The lists of players for each correlated event are then amalgamated and sorted by frequency to find the players with the most matches. This should tell us which golfers are most likely to perform well. For example, for the 2020 Shriners Hospitals for Children Open, the golfers with the most matches in correlated results found by the system are:

Bryson DeChambeau, 4

Abraham Ancer, 4 

Hideki Matsuyama, 3 

Chez Reavie, 3 

Patrick Cantlay, 3
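The amalgamation step that produces a list like the one above can be sketched as follows. This is a minimal Python illustration, not the actual Golf Predictor code: the function name and data layout are hypothetical, the golfers and events in the usage example are placeholders rather than real results, and breaking ties alphabetically is my own choice for the sketch.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def rank_by_matches(field: Iterable[str],
                    top_finishers_by_event: Dict[str, Iterable[str]]
                    ) -> List[Tuple[str, int]]:
    """Count, for each golfer in the upcoming field, how many correlated
    events he finished in the top twenty of, most matches first.

    Hypothetical helper for illustration only.
    """
    field_set = set(field)
    counts: Counter = Counter()
    for finishers in top_finishers_by_event.values():
        # Only golfers who are actually playing the upcoming event count.
        counts.update(field_set & set(finishers))
    # Sort by number of matches (descending), then by name for stability.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Placeholder data, not real results.
field = ["Golfer A", "Golfer B", "Golfer C"]
top_twenties = {
    "Event 1": ["Golfer A", "Golfer B", "Golfer X"],
    "Event 2": ["Golfer A", "Golfer C"],
    "Event 3": ["Golfer A", "Golfer B"],
}
print(rank_by_matches(field, top_twenties))
# [('Golfer A', 3), ('Golfer B', 2), ('Golfer C', 1)]
```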


The correlated events for this tournament, using top twenty correlations, a minimum of 12 matches and a minimum correlation percentage of 25%, are:


Genesis Invitational

Players Championship

US PGA Championship

3M Open

Safeway Open

Travelers Championship

Open Championship

RBC Heritage 

There may be a case for only including correlated events that were held on the same course in the last two years. Doing so in the above case would exclude the two majors and reduce the matches for all the golfers above by one, except for DeChambeau. It would also bring other golfers with two matches into the equation, including Frittelli, Simpson, Hahn, Dahmen, Homa, English, Redman, Streelman and Zhang.


Hopefully, this new correlated tournaments functionality will be useful for most upcoming events. I will probably run it for most events that are held on the same course as the previous year and post the results either on this blog or on social media. Just another way to make Golf Predictor a little better!


