The data analytics landscape in Ireland

What does the data analytics landscape look like in Ireland?

Check out the results of our survey and see whether you won the €100 voucher.

 

A few weeks ago, we here at Idiro Analytics decided to run a survey on the data analytics landscape of Ireland. Our hope was to get a better insight into the data analytics field here in Ireland and gain a better understanding of where Ireland stands in the global analytics industry.

As with any field, it can be too easy to only focus on your own area and what works for you, but having an overview of what other people in your industry are doing can be extremely valuable. Even something as simple as seeing tools being used by others in your industry could lead you to explore options you might not be aware of.

With data analytics and Big Data being such popular buzzwords for marketers to throw around, we thought it would make sense to gather facts directly from those working in the industry.

We received a great response to our survey from the Irish-based data analytics community and we would like to thank everyone who completed it.

Here is the summary of the survey results:

We’ll start with the job title, and it’s interesting to see here that there are a lot of people working with data who don’t categorise themselves as being in a traditional data analyst / scientist role.

When we asked what types of business problems respondents were solving with data, we found that a huge 69% were using data for marketing and sales activities.

Although marketing and sales analytics might not be the only focus, it’s clear that a lot of companies in Ireland are finding value in that area of analytics. That reflects Idiro’s own experience - most of the analytics work we do is about helping our customers do better marketing or sales.

So are some industries doing more advanced analytics than others? It seems so - we found that the highest number of analysts working on predictive analytics projects are in these three industries:

Now let’s look at the analytics tools. As you’d expect, there’s a massive number of different software tools in use by the analytics community in Ireland.

As we can see, the top three analytics software tools in Ireland are Excel, R and MS SQL Server.

There are big differences according to the industry the analyst works in, for example:

Finance/Accounting

Utilities

We have an interesting breakdown by industry, but the table is far too big to show here, so contact us if you’d like to see it.

Next, we looked at the type of work being done (reporting, insights / analysis, modelling), and split by the different job titles:


 

 

 

Idiro researcher invited to speak at MACSI 10

 

The importance of academic research has never been underestimated here at Idiro Analytics. Encouraging our analysts to explore new and innovative technologies and techniques when solving data problems has always been a part of Idiro’s company culture.

Bridging the gap between academic research and industry is an area Idiro are very proud to be involved in. With this in mind, we're happy to report that our colleague Davide Cellai was invited to be a speaker in the workshop to celebrate 10 years of MACSI.

This is Davide’s report from the event:

"Last week I participated, as an invited speaker, in the workshop to celebrate the 10 years of MACSI. MACSI (Mathematics Applications Consortium for Science and Industry) is a consortium based at the University of Limerick that promotes collaboration between applied mathematicians and industry.

MACSI was founded in 2006 with the largest single grant ever awarded to mathematics in Ireland, and since then it has been quite a special point of reference in the country for mathematicians interested in industrial applications. I worked at MACSI for more than four years, so I knew the people there very well. MACSI engages in industrial collaborations at both the national and international level. People in the consortium work hard to improve products and processes for the industrial partners and provide advanced training in mathematical modelling to students and researchers.

Davide Cellai speaking at MACSI

Idiro and MACSI have a long-standing collaboration. Indeed, Idiro is always keen to collaborate with academia. As we continuously improve our models and expand our range of services, we often employ cutting-edge research to meet those challenges. In 2014, when I was still in MACSI, I won a Science Foundation Ireland Industry Fellowship, a grant that gave me the opportunity to move to Idiro and apply Network Science models to the problem of predicting telecommunication churn, using one of the datasets available in the company. This work was so successful that I was later hired by Idiro as a Senior Data & Analytics Architect.

In my talk, I illustrated the model (called the m-exposure model) that I developed during the Fellowship and the outcome of this work. While the m-exposure model was designed for portout churn, we then developed a similar model for expiry churn. Both models are now part of Idiro's suite of SNA tools for churn prediction.

Finally, I presented some new ideas and challenges that we would like to pursue in the near future.

There was a lot of interest in my talk. I got both great questions and great feedback (and lots of compliments) in the following hours. Some of the scientists in the audience were particularly interested in Idiro's future projects. Hopefully, we will get some good ideas for designing our next products.

The workshop was also very interesting in its own right. I listened to some great, forward-looking talks. In one of them, Professor Wil Schilders compared the benefits of faster computer hardware with those of faster algorithms. His point was that the latter are actually more interesting. In other words, it's often better to have a new algorithm running on an old computer than an old algorithm running on a new computer. Other talks covered how science can improve society, from the elimination of tropical diseases to the exact calculation of delay times after a road incident. I was delighted to be invited to such a great event."

Davide Cellai at MACSI

Major International Telco chooses Red Sqirl

 

One of the first commercial users of the Red Sqirl analytics platform for Big Data is a multinational telco, which has deployed Red Sqirl in two countries and is using it to deliver analytics on its Hadoop platform. This customer has asked to remain anonymous.  

Red Sqirl is a flexible drag-and-drop Big Data analytics platform with a unique open architecture. Red Sqirl makes it easy for analysts and data scientists to analyse the data on your Hadoop platform.  

The problem

 

This multinational telco operates on four continents worldwide. Unfortunately, for historical reasons these operating companies use a wide variety of different database systems and analytical platforms.  

As data becomes an increasingly important asset for organisations, many companies are taking steps to maximise the value of their data. In addition, most large companies now have access to many new types of data - for example, social media posts.

To exploit synergies across the worldwide organisation, this multinational telco decided to standardise database platforms across the group. In order to best meet the challenges of storing and using ‘big data’, the company chose Hadoop as its standard database platform.  Hadoop is currently being rolled out across the company’s operations.  

The company now faces the challenge of migrating existing code onto Hadoop, and allowing asset re-use and swapping across business units who have multiple historic platforms and assets.  Moreover, the Hadoop ecosystem does not contain a ready-made data analytics module.  The market leading traditional data analytics software platforms are not designed for the Hadoop ecosystem and tend to be inefficient when analysing Hadoop data. The telco searched for a native Hadoop analytics platform with an easy-to-use graphical interface.  In addition, because of the wide variety of requirements, the platform had to be highly flexible and cost-effective for a worldwide rollout.

Solution chosen: Red Sqirl

 

Following a thorough technical / usability evaluation of a number of analytics platforms, the company agreed a contract to trial Red Sqirl - initially in the company’s head office and in one operating company.  

Red Sqirl exceeds the telco’s requirements as set out above. Of particular interest to this company is Red Sqirl's unique capability for sharing, via the Red Sqirl Analytics Store.  Moreover, the Red Sqirl development team has deep experience of telco analytics.  Red Sqirl is already proven in predicting telco churn, as shown in the workflow below.

A Red Sqirl screenshot showing a telco churn modelling workflow
Telco churn modelling in Red Sqirl - a workflow


This telecoms operator now has the confidence that their analytics solutions are scalable, deployable and easy to use.

Implementation

 

Red Sqirl was installed in both the head office and the country telco, which took a matter of minutes in each case. Training workshops were held in both locations, to ensure that analysts could get the most out of the Red Sqirl platform. Red Sqirl’s drag-and-drop interface is similar to that of most GUI analytical tools, so the training was quickly completed.  Feedback from students was very positive - as the graph below shows, students scored the training very highly.

Student feedback on Red Sqirl training course


The company then needed to migrate analytical assets from legacy infrastructure to Red Sqirl.  The Red Sqirl development team showed the way by taking a particular analytical model that had been developed in another language and quickly porting it onto Red Sqirl.

The company is now using Red Sqirl as its primary analytics platform in the country in question - and early results are promising.  

A sentiment analysis of a Premier League game: Man United vs Arsenal

 

One of the biggest rivalries in the English Premier League is the one between Manchester United and Arsenal. The rivalry has cooled in recent years since the retirement of Alex Ferguson, but this is still a hugely anticipated match between two of the world's biggest clubs.

One of the major outlets for fans of these two clubs to voice their opinions and feelings about upcoming games is Twitter. With a Twitter fanbase of 9.31M for Man United and 8.48M for Arsenal, it can be a great source of data about how the fans are feeling in the build-up to a game and their emotions in the hours after it.

On the 19th of November 2016, Man United and Arsenal went head to head in a game that, in all honesty, won't go down in history as being a great game. Although both teams are battling to stay in the running for the title, the game ended with just a 1-1 draw. Nothing all that exciting for the Twitter followers to remark on (with the exception of the equalising goal).

But we can still see some interesting results when we analyse the tweets being posted about the game by doing a sentiment analysis. A sentiment analysis essentially takes a piece of text and assigns emotions to the specific words being used. That overall text can then be determined to be positive or negative and we can work out the specific emotions being expressed. We can then plot these emotions on a graph and examine how they change over time.

For example, the tweet below can be classed as negative overall. Each of the words used (“abysmal”, “gutless” etc.) can be grouped into specific emotions, which helps us understand the feelings being expressed in the tweet.

screen-shot-2016-11-24-at-15-44-41
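To make the idea concrete, here is a minimal sketch of a word-list approach to sentiment analysis. It is not the workflow used for this article: the tiny lexicon, the emotion labels attached to each word and the example sentence are all made up for illustration, and real lexicons contain thousands of entries.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: word -> (polarity, emotions).
# These entries are invented for illustration only.
LEXICON = {
    "abysmal": ("negative", ["sadness", "disgust"]),
    "gutless": ("negative", ["anger", "disgust"]),
    "brilliant": ("positive", ["joy", "trust"]),
    "equaliser": ("positive", ["surprise", "joy"]),
}

def analyse(text):
    """Return (overall polarity, Counter of emotions) for one piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    polarity = 0
    emotions = Counter()
    for word in words:
        if word in LEXICON:
            label, word_emotions = LEXICON[word]
            polarity += 1 if label == "positive" else -1
            emotions.update(word_emotions)
    if polarity > 0:
        overall = "positive"
    elif polarity < 0:
        overall = "negative"
    else:
        overall = "neutral"
    return overall, emotions

# A made-up sentence, not the tweet shown in the article.
overall, emotions = analyse("Abysmal, gutless display from the lads today")
print(overall, dict(emotions))
```

Text with no words from the lexicon comes back as neutral, which is one reason a real analysis needs a much larger word list.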

Below is a graph of the results. As you can see, tweets about this game started growing strongly an hour or two before the match, peaked towards the end of the match, and declined steadily until ten hours after the match. 

red-sqirl-premier-league-sentiment-2
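A chart like this comes from counting emotions per time bucket. The sketch below shows one simple way to do that, assuming each tweet has already been tagged with a single emotion. The timestamps and emotions are invented, and `emotions_per_hour` is a hypothetical helper rather than part of Red Sqirl.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Made-up (timestamp, emotion) pairs standing in for tagged tweets.
tagged_tweets = [
    (datetime(2016, 11, 19, 12, 5), "anticipation"),
    (datetime(2016, 11, 19, 12, 40), "trust"),
    (datetime(2016, 11, 19, 14, 10), "surprise"),
    (datetime(2016, 11, 19, 14, 15), "surprise"),
    (datetime(2016, 11, 19, 14, 50), "anger"),
]

def emotions_per_hour(tweets):
    """Group tagged tweets by the hour they were posted and count each emotion."""
    buckets = defaultdict(Counter)
    for posted_at, emotion in tweets:
        hour = posted_at.replace(minute=0, second=0, microsecond=0)
        buckets[hour][emotion] += 1
    return dict(buckets)

series = emotions_per_hour(tagged_tweets)
for hour in sorted(series):
    print(hour.strftime("%H:%M"), dict(series[hour]))
```

Plotting one line per emotion from `series` gives the kind of over-time view described above.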

Two interesting points worth highlighting from the results are the levels of surprise and trust:

red-sqirl-premier-league-surprise-2

Looking at the surprise, we can see a clear spike towards the end of the game, most likely caused when Olivier Giroud scored the equalising goal in the 89th minute.

screen-shot-2016-11-24-at-16-20-08

Analysing the trust is quite interesting. A huge number of people tweeting felt a lot of trust before the game kicked off. It then drops slightly after kickoff but starts to rise again halfway through the game. Possibly at half-time, with the score at 0-0, the fans felt it was still all to play for.

It's easy to see the potential use cases for a system such as this: a complex analysis of a large, constantly updating dataset, scheduled to run at predefined intervals. For example, we’ve used this previously to explore the sentiment on the US presidential election.

The difficulty with an analytical project like this is setting it up. Building a data pipeline that goes from gathering the data, to building the analysis workflow, to scheduling that workflow to run periodically and then to displaying the results usually takes a lot of expertise and overhead. However, Idiro Analytics have developed a tool called Red Sqirl which can perform each of these steps in one intuitive interface.

Modern sports and data analytics now go hand-in-hand; it'd be hard to imagine a professional sports organisation that isn't using data analytics in some form. And with data becoming more easily obtainable, many more opportunities are opening up. With the right tools, data analytics can be accessible to a lot more people.

Red Sqirl

Red Sqirl is a flexible drag-and-drop Big Data analytics platform with a unique open architecture.

Red Sqirl makes it easy for your analysts and data scientists to analyse the data you hold on your Hadoop platform.

For more information visit RedSqirl.com, and for a guide on how to build the entire process of analysing Twitter data using Red Sqirl, as outlined above, please read our detailed guide.

Title image courtesy of Premier League ©

What do the Irish think about Trump and Clinton?

 

Comparing Irish people's opinions to the rest of the world

 

It’s everyone's favourite subject right now: the US election. Unless you've somehow avoided consuming any form of media over the last few months, you'll no doubt have been exposed to a lot of opinions and "facts" about the two front runners for the US election, Hillary Clinton and Donald Trump.

With such a media overload, it's hard not to form our own ideas about who should be elected and who shouldn't. It's a strange phenomenon, the world being so invested in an election for a nation we have no vote in. The American people will vote for an American president, and yet the rest of the world seems to feel like we're involved in the decision.

With this in mind, we here at Idiro Analytics decided to get a clearer understanding of the opinions of people here in Ireland surrounding the election. Do the opinions of the Irish people differ from those of the rest of the world?

To do this, we chose Twitter as our source of public opinion. We gathered thousands of tweets posted about the election over a 24-hour period in the days leading up to the election and ran a sentiment analysis on them.

This means we were able to break down each tweet and work out the sentiment (overall feelings) being expressed by analysing the types of words being used in each tweet. From this, we can then chart if the majority of tweets being posted about both Clinton and Trump are positive or negative and the general feelings behind each one.

First, let’s look at the sentiment for both Clinton and Trump worldwide:
idiro-analytics-what-irish-think-us-election-wh

idiro-analytics-what-irish-think-us-election-wt

One interesting note from the two charts above is the huge difference in the number of tweets being posted about each person. The number of people tweeting about Trump is over three times higher than the number of people tweeting about Hillary.

If we break this down further into just positive and negative sentiments, we can see that the majority of Tweets being posted worldwide about both Clinton and Trump are negative.

idiro-analytics-what-irish-think-us-election-pnh

idiro-analytics-what-irish-think-us-election-pnt
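Because the two candidates attract very different volumes of tweets, raw counts are hard to compare, so it helps to convert them into shares. The sketch below shows that step under the assumption that each tweet already carries a positive, negative or neutral label. The labels are invented and do not reproduce the figures in the charts.

```python
from collections import Counter

# Made-up tweet labels; the real analysis covered thousands of tweets.
labels = {
    "Clinton": ["negative", "positive", "negative", "neutral"],
    "Trump": ["negative"] * 7 + ["positive"] * 4 + ["neutral"] * 1,
}

def sentiment_shares(tweet_labels):
    """Turn a list of per-tweet labels into the share of each label."""
    counts = Counter(tweet_labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

shares = {name: sentiment_shares(tweets) for name, tweets in labels.items()}
for name, share in shares.items():
    print(name, {label: f"{value:.0%}" for label, value in share.items()})
```

The same normalisation makes a small sample, such as tweets from one country, comparable with a much larger worldwide one.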

 

Now let's look at the sentiment of Irish people towards the two. (Note that in order to get a large enough sample to analyse, we used tweets posted by people in Ireland over the 4 to 6 days leading up to the election.)

idiro-analytics-what-irish-think-us-election-hti

From looking at the chart above, it's strange that even after all we've read about Trump over the past year, we're still surprised by him.

idiro-analytics-what-irish-think-us-election-hi

idiro-analytics-what-irish-think-us-election-ti

Although it’s not by a huge amount, we can see that the sentiment towards Hillary in Ireland is positive compared to the negative worldwide sentiment towards her, whereas Trump is still negative.

Lastly, let’s combine the worldwide sentiment for both Hillary and Trump versus the sentiment towards them in Ireland.

idiro-analytics-what-irish-think-us-election-hwi

idiro-analytics-what-irish-think-us-election-twi

From these last two charts we can see that the Irish people have a little more fear and anger about the future than the rest of the world. Is there something we know that they don't?

 


 

About Idiro
Based in Dublin, Ireland, Idiro Analytics is an award-winning provider of analytics to businesses around the world.

For an overview of Idiro’s analytics services, see our homepage www.idiro.com

Media contact information
Simon Rees, Clients & Marketing Director, Idiro Analytics.

simon.rees@idiro.com

+353 1671 9036

 



red sqirl
The data analytics work for this article was performed using Red Sqirl. From within Red Sqirl, we were able to build a data pipeline that gathered thousands of tweets, sorted each tweet, ran multiple different analysis steps on the data and output the results into visualisations in real time. Visit the Red Sqirl website for more details.

 


 

Can Ireland beat the All Blacks?  Irish people really believe they can

 

There's an excited buzz in the air for Irish rugby supporters right now. Tomorrow, once again, Ireland will take on the All Blacks in an attempt to end a run of 27 defeats (and 1 draw) against what is unquestionably the greatest rugby nation in the world.

You might think: what's the big deal? This isn't a major tournament, there are no records to be broken, it's being played in a country that doesn't have a love for the sport, and nothing is really on the line. This is just one more attempt to end a 111-year losing streak against the All Blacks.

And yet, the Irish never really look at things like that. When faced with impossible odds in any sport, when everyone has all but written us off, the Irish supporters always have the mindset of 'yeah but what if...'


But what if, they're just underestimating us

But what if, they slip up

But what if, they get overwhelmed by the Irish supporters

But what if ...

As such a small nation, we'll always be considered to be punching above our weight. But when it comes to rugby, we really do stand proudly up there with the best in the world. We truly believe that if everything goes right, we can beat any team.

The emotions felt for the Irish team are difficult to put down on paper. People talk of the mood in the Irish camp, the atmosphere around the stadium before a game and the emotions of the supporters, but it's not really something that can be drawn on a chart.

But what if there was a way?

If we were to use a data analytics technique called sentiment analysis, could we understand the overall emotions being felt about the game tomorrow?

Sentiment analysis is essentially taking a piece of text and by looking at the words being used, determining if that piece of text is overall positive, negative or neutral. Where this can become interesting is if we were to apply it to something like Twitter.

If we apply this technique to all of the tweets being posted by people in Ireland about the Ireland vs All Blacks game, could we get a picture of the overall feelings towards the game?

So, looking at the weeks leading up to the game, we took all of the tweets from people talking about Ireland vs the All Blacks in Ireland and performed a sentiment analysis on them. We also decided to do the same with all of the tweets coming out of New Zealand about the game and we were able to plot them on the charts below.

screen-shot-2016-11-04-at-18-35-50

screen-shot-2016-11-04-at-18-35-32

 

What these two charts show us is the overall emotions being felt by people in both countries about the game tomorrow. Now, we know this is far from definitive fact: the charts only show the feelings of the people talking about the game on Twitter, and that’s only going to be a fraction of the overall supporters. Interestingly, though, this would include any journalists posting about the game, the people that others look to when forming their opinions.

Looking at the results, we can see that a high proportion of the tweets from both countries have a sentiment of 'anticipation'. That may seem obvious, but it goes to show that the technique works.

The next highest sentiment from both countries is a feeling of 'trust'. Again, this may seem obvious for the people in New Zealand: of course they have trust in their team, it's the All Blacks. But this does bring us back to the point we made earlier: even when up against a team we’ve never been able to beat, the Irish people still trust that we can win.

Another interesting point to take from these charts is that, for New Zealand, there does seem to be some fear creeping in. We know and the All Blacks know that Ireland are a good team, and there is the potential that they can actually win this game. What may also be weighing on the minds of the New Zealand supporters is that all of the pressure is on them. If Ireland lose, then they've lost to the best, but the All Blacks are the ones on the winning streak.

One last analysis we did was to get the overall feeling towards Joe Schmidt. With him committing his future to Ireland, we wanted to see what the Irish people thought. As you can see from the chart below, we have complete trust in him.

screen-shot-2016-11-04-at-18-35-50

COME ON IRELAND!

 

Where Red Sqirl lies in the Big Data landscape

 

Today, Big Data is the platform of choice for storing, exploring, visualising and modelling your data.
 

To get where it is today, Big Data has gone through a number of distinct generations, each one advancing on the one before it. The first generation simply gave us the ability to analyse petabytes of data with tools like MapReduce. The next generation gave us more responsive tools for analysing data, such as Spark, Impala and PrestoDB. Now, the third generation sees the emergence of tools for moving data around, such as Kafka and Kudu.
 

The Data Lake & Real Time Analytics

These tools have changed the way that data is analysed and how data solutions are now built. The emergence of Big Data has brought with it many new concepts, one of them being the Data Lake.

A Data Lake is a massive enterprise-wide data repository to which analysts can contribute, and from which they can cherry-pick the data they need, in a format best suited to the data. The Data Lake looks to solve the problem of data silos, eliminating dozens of independently managed data collections and creating one combined data collection. Data lakes have become essential to Big Data projects due to an increasing demand for data to be accessible and agile.

Another term becoming popular right now is real-time analytics. Essentially, it means triggering an event that fulfils a prerequisite in real time. The term 'real-time' is misleading, though, as you can’t actually analyse Big Data in real time, only act on it. Rather than analysing an entire base, real-time analytics relies on intelligently interacting with parts of the data lake in order to perform actions on a user-by-user basis.

Real-time analytics relies heavily on periodic batch analyses to continuously evaluate the impact of new data and to let the action taken for each user evolve over time. Without these batch processes, the analytics would not be working from up-to-date data.
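The split between batch analysis and real-time action can be sketched in a few lines. This is only an illustration of the pattern: the usage data, the scoring rule and the names `batch_score` and `on_event` are assumptions made up for the example.

```python
# Made-up usage data: user id -> calls made on each of the last few days.
usage_history = {
    "user_a": [12, 10, 3, 1],
    "user_b": [5, 6, 7, 8],
}

def batch_score(history):
    """Periodic batch job: flag users whose latest usage is under half their average."""
    scores = {}
    for user, calls in history.items():
        average = sum(calls) / len(calls)
        scores[user] = calls[-1] < 0.5 * average
    return scores

def on_event(user, scores):
    """Real-time step: no analysis here, only a lookup and an action."""
    if scores.get(user, False):
        return "send retention offer"
    return "no action"

scores = batch_score(usage_history)  # would be re-run on a schedule
print(on_event("user_a", scores))
print(on_event("user_b", scores))
```

The heavy work sits in the batch job, so the real-time step stays cheap, and its behaviour changes each time the batch scores are refreshed.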
 

Data Pipelines

The key to any Big Data analytics job, and to an up-to-date Big Data warehouse, is these periodic background processes. If they are done well, a huge range of services can be built from them.

For example, you can perform ad-hoc analyses easily, and you can maintain analytics jobs and upgrade or update them quickly. The method for creating these background processes is known as building data pipelines. Building data pipelines is an essential part of analysing data using Big Data techniques.

It's for this reason that the most popular Hadoop distributions - Cloudera, Hortonworks and MapR - all include a tool for periodic processing: Apache Oozie.

Apache Oozie is the tool that triggers processes based on time and data availability. Oozie supports any data format and language, and is fault tolerant. Apache Oozie is, however, very difficult to use, as there is a lot of overhead between implementing a process to run once and running it on a regular basis.
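The idea of triggering on both time and data availability can be shown with a small sketch. This is plain Python written for illustration and does not use or reproduce Oozie's own configuration. The function `should_run` and the dataset names are hypothetical.

```python
from datetime import datetime, timedelta

def should_run(now, last_run, frequency, available_inputs, required_inputs):
    """Run only when the schedule says so AND all input data has arrived."""
    due = last_run is None or now - last_run >= frequency
    data_ready = set(required_inputs) <= set(available_inputs)
    return due and data_ready

now = datetime(2016, 11, 20, 6, 0)
last_run = datetime(2016, 11, 19, 6, 0)
daily = timedelta(days=1)

# Hypothetical dataset names.
required = ["tweets/2016-11-19"]
print(should_run(now, last_run, daily, {"tweets/2016-11-19"}, required))
print(should_run(now, last_run, daily, set(), required))
```

A job that is due but whose input has not landed yet simply waits, which is what keeps a periodic pipeline from processing incomplete data.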
 

So we built Red Sqirl

Red Sqirl is a drag & drop analytics tool which can also build Oozie workflows in the background. With Red Sqirl you can build, deploy and maintain data pipelines more easily than ever before, using an intuitive drag and drop interface.

 

The counties with the most dangerous roads in Ireland ahead of the bank holiday weekend

On Bank Holiday weekends we’re used to reading about people being killed on Irish roads. But which counties have the most dangerous roads?


Although Dublin and Cork have had the highest number of fatalities, does that mean they have the most dangerous roads in the country or do other factors need to be taken into account?

As with most bank holiday weekends, there is a heightened risk when driving over the next few days. This can mainly be attributed to higher volumes of traffic, as many people visit family and friends and in doing so undertake long road journeys.

The Road Safety Authority (RSA) have issued statements about taking extra care on the roads this weekend. And by looking at the numbers over the last 20 years, it's clear the RSA are succeeding in their goal to make our roads safer. Even though the number of fatalities this year is higher than in the same period in 2015, the overall trend is that our roads are becoming safer.

 

Idiro-analytics-Irish-dangerous-roads-fatalities-2015

 

In the interest of improving the safety of Irish roads, we here at Idiro Analytics wanted to shed some light on some of the details of road safety statistics that can usually be overlooked or misinterpreted, leading to the wrong conclusions.

In 2014 and 2015 the number of road fatalities in Ireland was 193 and 166 respectively. By studying the charts below, it's easy to see how the assumption can be made that the two most dangerous counties for road accidents are Dublin and Cork. However, these figures don't show the full story, because there are a lot of other variables to take into consideration.

 

Idiro-analytics-Irish-dangerous-roads-fatalities

 

Other details that need to be taken into account are:

  • The length of road in each county
  • The number of vehicles on the roads
  • The average distance travelled
  • The total population sizes

 

Below you can see details for each of these variables (note: these are summarised tables and do not contain all of the information):

 

Idiro-analytics-Irish-dangerous-roads-total-length-roads-population-in-ireland

 

When analysing all the information, we can see a clear picture starting to form. Although Dublin and Cork may at first glance seem to have the most hazardous roads and will be noticed more in the national press, it is Longford and Monaghan that rank 1 and 2 respectively for the most dangerous roads.

 

Idiro-analytics-Irish-dangerous-roads-fatalities-per-km

 

Both Longford and Monaghan have low populations, low road lengths, a low number of vehicles on the road and a low average distance travelled, but they were found to have a high proportional fatality rate averaged over 2014 and 2015.

  • Longford and Monaghan: 2 fatalities per 10,000 vehicles
  • Longford and Monaghan: 3 fatalities per 300 million km travelled
  • Dublin: 0 fatalities per 10,000 vehicles
  • Dublin: 1 fatality per 300 million km travelled
  • Cork: 1 fatality per 10,000 vehicles
  • Cork: 1 fatality per 300 million km travelled
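Rates like these come from dividing fatalities by a measure of exposure and scaling to a convenient unit. The sketch below shows the arithmetic only. The two counties and all of their input numbers are hypothetical and are not the figures behind this article.

```python
def rate_per(fatalities, exposure, unit):
    """Fatalities per `unit` of exposure (vehicles, km travelled, and so on)."""
    return fatalities / exposure * unit

# Hypothetical inputs for two imaginary counties, NOT the article's data.
counties = {
    "Small county": {"fatalities": 5, "vehicles": 25_000, "km_travelled": 450_000_000},
    "Large county": {"fatalities": 20, "vehicles": 500_000, "km_travelled": 8_000_000_000},
}

for name, c in counties.items():
    per_vehicles = rate_per(c["fatalities"], c["vehicles"], 10_000)
    per_km = rate_per(c["fatalities"], c["km_travelled"], 300_000_000)
    print(f"{name}: {per_vehicles:.1f} per 10,000 vehicles, {per_km:.2f} per 300 million km")
```

In this made-up example the large county has four times as many fatalities, yet the small county has the higher rate on both measures, which is the same effect described above for Longford and Monaghan.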

 

Looking for a cause


With this in mind, we can now try to work out some of the possible causes and determine areas that may need further investigation.

Access to public transport could be one possible factor. Both Longford and Monaghan have a low number of public service vehicles (buses and taxis) per km per head of population.

Idiro-analytics-Irish-dangerous-roads-public-transport

Another factor leading to these insights could be found in a recent road surface survey carried out by the Department of Transport, Tourism and Sport (DTTAS) and the National Roads Authority (NRA) in 2011/2012.

The survey found that although Longford and Monaghan rank low among counties needing 'routine maintenance', 'surface restoration' and 'road reconstruction', the two counties rank 1 and 2 for needing resealing & restoration of skid resistance.

Idiro-analytics-Irish-dangerous-roads-maintenance

We all know, from the information given to us from the RSA, that on a bank holiday weekend we need to be extra careful when travelling. And we also know that a lot of lives have been lost on the roads in both Dublin and Cork, a higher number than any other county.


But, one thing we need to be aware of is that although the number of road deaths in those two counties is high, they would not have the most dangerous roads in the country. Per km the roads in both Co Longford & Co Monaghan pose a greater risk and extra care needs to be taken.


Therefore, be careful out on the roads this weekend & especially so in Cos Longford & Monaghan.

In order to make this article more accessible, we've only included summaries of the overall data that we analysed. But, we invite anybody who finds an interest in these figures to contact us if you have questions or would like to discuss any part in detail. We'll be happy to discuss the findings with the hope that the information can lead to safer Irish roads.


About Idiro
Based in Dublin, Ireland, Idiro Analytics is an award-winning provider of analytics to businesses around the world.

For an overview of Idiro’s analytics services, see our homepage www.idiro.com

Media contact information
Simon Rees, Clients & Marketing Director, Idiro Analytics.

simon.rees@idiro.com

+353 1534 30 34

Mayo still to beat Dublin by one point in the All-Ireland replay

 

In the two weeks since an All-Ireland final that featured two own goals and a draw nobody predicted, a trend seems to be forming among GAA supporters: Dublin underperformed and Mayo may have missed their chance. The consensus is that, for the replay, Dublin will ‘click into gear’, play ‘their’ game and come out on top.

But is this really the case?

Most seemed to think the 2016 title belonged to Dublin before they even stepped into Croke Park on that rainy Sunday afternoon, but by looking at the numbers, we found that the pundits’ confidence was unfounded.

If you read our last article, you'll have seen that Dublin were not our predicted winners of the final: we had Mayo to win by one point. And although we admit that predicting the exact score of a game isn't really possible, we ended up being pretty close.

We had set ourselves the challenge of working out a model to predict who would be the 2016 champions (and get an edge on the bookmakers). We did this by looking at the information available to us on both the Dublin and Mayo teams’ performances over time. Key areas included the goal difference between Mayo and Dublin, the point differences between them, regular differences in finals, average goals and points that season, and the differences between average goals/points that season and in the finals.

We came up with the prediction of Mayo to win by just one point.

Now, even if we were to analyse every single data point and statistic since the GAA was formed in 1884, we still wouldn't have been able to predict two own goals in an All-Ireland final. But given that our prediction went against the general opinion of Dublin being favourites, and that the game ended up being so close, we thought it might be worth trying again.

The difference, this time, is that we now have more data to work with. Not only have both teams played another game that we can factor into our original predictive model, but we now have more data on how each team performs against each other.

 

Results of previous fixtures

 
Over the past four years, Dublin and Mayo have now played each other four times. The particular details of those matches play a key role in predicting the outcome of this Saturday's match, with a higher weighting on the most recent games as they are the most relevant to each team's current form.
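
To illustrate what "a higher weighting on the most recent games" can mean in practice, here is a minimal sketch of a recency-weighted average margin. It is not the model we used; the function name, the decay factor and the example margins are all invented purely for illustration.

```python
def recency_weighted_margin(margins, decay=0.5):
    """Weighted average of score margins, with the newest game weighted most.

    margins: margins in points (positive = first team ahead), oldest first.
    decay:   hypothetical factor in (0, 1]; each step back in time multiplies
             a game's weight by this value.
    """
    if not margins:
        raise ValueError("need at least one margin")
    n = len(margins)
    # The newest game (last in the list) gets weight 1, the one before it
    # gets `decay`, the one before that `decay ** 2`, and so on.
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * m for w, m in zip(weights, margins)) / sum(weights)


# Made-up margins for four meetings, oldest first (NOT the real results).
example_margins = [2, -2, 4, 0]
weighted = recency_weighted_margin(example_margins, decay=0.5)
plain = sum(example_margins) / len(example_margins)
print(f"plain average: {plain:+.2f}, recency-weighted: {weighted:+.2f}")
```

With a decay of 1.0 every game counts equally and the result is the plain average; smaller values lean more heavily on the latest meetings.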

 

GAA Football All-Ireland Senior Championship final 2016

 
With these extra details in mind, we were able to refine our original prediction and develop a new one.

 

The Idiro Analytics official prediction for the All-Ireland final replay

Mayo 1:13 - 1:12 Dublin

Mayo to win by just one point
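
For readers less familiar with Gaelic football scoring: a goal is worth three points, so a goals:points scoreline has to be converted into a total before margins can be compared. The short sketch below does that conversion for the prediction above; the helper name is just illustrative.

```python
GOAL_VALUE = 3  # in Gaelic football a goal is worth three points


def total_points(score):
    """Convert a 'goals:points' string such as '1:13' into total points."""
    goals, points = (int(part) for part in score.split(":"))
    return goals * GOAL_VALUE + points


mayo = total_points("1:13")
dublin = total_points("1:12")
print(f"Mayo {mayo} - {dublin} Dublin, margin {mayo - dublin}")
# prints: Mayo 16 - 15 Dublin, margin 1
```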

 

Now there’s no doubt that the weather had a major effect on the performance of both teams on that error-filled Sunday two weeks ago. With the weather forecast to be a lot milder this weekend, we should see a much-improved display by both teams. Looking at the numbers, though, we still stand by our original analysis that these two teams are more evenly matched than people seem to think.

 

Analysis performed by Eduards Vanags

Mayo to beat Dublin by one point

 

All the Dubs and Mayo people will give you an answer, of course, but can data analytics predict the outcome of this weekend's All-Ireland final, or more impressive yet, the score?

Predicting the outcome of a single game is a difficult task; predicting the winner of a league competition would be a much safer bet.

In a league competition, teams play a lot of games, which diminishes the impact of individual losses on their overall performance. Although they may falter a few times, underperform or fail to capitalise on chances over the course of a season, it is usually the best teams who come out on top. But in a knockout competition, anything can happen. That is an argument for why Leicester City winning the English Premier League is a bigger achievement than Portugal winning the Euros. In the knockout competition, Portugal were crowned champions after winning only four games, only one of those in 90 minutes and one on penalties. How much of that success was down to luck? And if it had been a league-style competition, would they still have won?

But let’s say we want to work out who’s going to win the All-Ireland final this weekend, Dublin or Mayo. Is it even possible? The short answer: no! But let's give it a shot anyway.

Now, some other sports (e.g. soccer) have the luxury of huge pools of data and statistics. With such sports, we can base predictions of who will win games in the Euros and the World Cup heavily on player performance rankings, and on comparisons of how teams performed against the same opponents. But the GAA isn’t quite there yet in terms of individual player data. Also, the way the league is structured means that Mayo and Dublin rarely come up against the same teams on a regular basis (over the last four years, Dublin have played Kerry just twice and Mayo have also played Kerry twice).

What we do have to work with is the performance of both teams over time. Our data analysts broke this down and looked into key areas such as the goal difference between Mayo and Dublin, the point differences between them, regular differences in finals, average goals and points that season, and the differences between average goals/points that season and in the finals.

By excluding any emotional bias and purely looking at the history and current form of both teams, Idiro Analytics have calculated a prediction of:
 

Mayo 1:15 - 2:11 Dublin

Mayo to beat Dublin by one point

 
Again, predicting the result of a single game is definitely not an exact science. That’s especially true with such a fast-paced, high-scoring sport, where one misplaced pass or slip could sway the game one way or the other. But interestingly, by focusing only on the numbers and not on the emotional elements of the game, our prediction goes against the general consensus that Dublin have the edge on Mayo.

If you were to base your opinion of who will win purely on the odds set by the bookmakers, you might be led to believe that Dublin are eight times more likely to win. But the thing to keep in mind here is the relative number of people making the bets. The population of Dublin is roughly ten times that of Mayo, and with matches like this many punters bet with their hearts, not their heads, meaning the odds may look disproportionate. Another thing to remember is that bookmakers set the odds solely with the intention of making a profit no matter who wins. So although Dublin may look like they have this all wrapped up, that might not be the case.
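
As a rough illustration of how to read bookmaker prices, the sketch below turns decimal odds into implied probabilities and then rescales them to strip out the bookmaker's built-in margin (often called the overround). The odds used are placeholder values, not the prices actually quoted for the final, and simple proportional rescaling is only one common way of removing the margin.

```python
def implied_probabilities(decimal_odds):
    """Map each outcome's decimal odds to its raw implied probability (1 / odds)."""
    return {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}


def remove_overround(raw):
    """Rescale raw implied probabilities proportionally so that they sum to 1."""
    total = sum(raw.values())
    return {outcome: p / total for outcome, p in raw.items()}


# Placeholder decimal odds for a three-way market (hypothetical values).
odds = {"Dublin": 1.40, "Draw": 9.00, "Mayo": 3.50}

raw = implied_probabilities(odds)
overround = sum(raw.values()) - 1.0  # the margin built into the prices
fair = remove_overround(raw)

print(f"overround: {overround:.1%}")
for outcome, p in fair.items():
    print(f"{outcome}: {p:.1%}")
```

Because the raw probabilities add up to more than 100%, the bookmaker expects a profit whichever side wins, which is why quoted odds should not be read as a neutral forecast.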

Our predictive model has Mayo to win by a margin of one point, which at first glance may not seem like such a big deal considering how evenly matched these two counties are (by looking at their results over the last number of years).

But for Mayo to be so close to Dublin really is a major achievement, again when we take into consideration the relative populations of each county.

According to the Irish Sports Council’s most recent monitor report, the percentage of people actively playing Gaelic football in Connacht is 3.7%, whereas in County Dublin it’s just 0.6%. But once population size is taken into account, the Dublin team could potentially choose from roughly 8070 active players, while Mayo have only 4014.

Now, if Mayo had the same population as Dublin (1 345 000 people) and an active player percentage of 3.7%, they would have a pool of roughly 50 000 players to choose from, compared to Dublin's 8070.
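
The arithmetic behind these figures is simple enough to check. The sketch below uses only the numbers quoted above (a population of 1 345 000 and participation rates of 0.6% and 3.7%); the function and variable names are just illustrative.

```python
def player_pool(population, participation_rate):
    """Estimated number of active players: population times the share who play."""
    return round(population * participation_rate)


DUBLIN_POPULATION = 1_345_000  # figure quoted in the article

dublin_pool = player_pool(DUBLIN_POPULATION, 0.006)        # Dublin's 0.6% rate
hypothetical_pool = player_pool(DUBLIN_POPULATION, 0.037)  # Connacht's 3.7% rate

print(dublin_pool)        # 8070
print(hypothetical_pool)  # 49765, i.e. roughly 50 000
```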

The Mayo players will know that, going by the history, it’s too close to call. But given how well they’ve played despite the disproportionate advantage Dublin have in terms of population, they may just feel they deserve it more.

Dublin supporters might not want to be too confident.

 
