The data analytics landscape in Ireland

What does the data analytics landscape look like in Ireland?

Check out the results of our survey and see whether you won the €100 voucher.

 

A few weeks ago, we here at Idiro Analytics decided to run a survey on the data analytics landscape of Ireland. Our hope was to get a better insight into the data analytics field here in Ireland and gain a better understanding of where Ireland stands in the global analytics industry.

As with any field, it can be too easy to only focus on your own area and what works for you, but having an overview of what other people in your industry are doing can be extremely valuable. Even something as simple as seeing tools being used by others in your industry could lead you to explore options you might not be aware of.

With data analytics and Big Data being such popular buzzwords for marketers to throw around, we thought it would make sense to gather facts directly from those working in the industry.

We received a great response to our survey from the Irish-based data analytics community and we would like to thank everyone who completed it.

Here is the summary of the survey results:

We’ll start with the job title, and it’s interesting to see here that there are a lot of people working with data who don’t categorise themselves as being in a traditional data analyst / scientist role.

When we asked what types of business problems respondents were solving with data, we found that 69% were using data for marketing and sales activities.

Although marketing and sales analytics might not be the only focus, it’s clear that a lot of companies in Ireland are finding value in that area of analytics. That reflects Idiro’s own experience - most of the analytics work we do for our customers is about helping them do better marketing or sales.

So are some industries doing more advanced analytics than others? It seems so - we found that the highest number of analysts working on predictive analytics projects are in these three industries:

Now let’s look at the analytics tools. As you’d expect, there’s a massive number of different software tools in use by the analytics community in Ireland.

As we can see, the top three analytics software tools in Ireland are Excel, R and MS SQL Server.

There are big differences according to the industry the analyst works in, for example:

Finance/Accounting

Utilities

We have an interesting breakdown by industry, but the table is far too big to show here, so contact us if you’d like to see it.

Next, we looked at the type of work being done (reporting, insights / analysis, modelling), and split by the different job titles:


 

 

 

Idiro researcher invited to speak at MACSI 10

 

The importance of academic research has never been underestimated here at Idiro Analytics. Encouraging our analysts to explore new and innovative technologies and techniques when solving data problems has always been a part of Idiro’s company culture.

Bridging the gap between academic research and industry is an area Idiro are very proud to be involved in. With this in mind, we're happy to report that our colleague Davide Cellai was invited to be a speaker in the workshop to celebrate 10 years of MACSI.

This is Davide’s report from the event:

"Last week I participated, as an invited speaker, in the workshop to celebrate the 10 years of MACSI. MACSI (Mathematics Applications Consortium for Science and Industry) is a consortium based at the University of Limerick that promotes collaboration between applied mathematicians and industry.

MACSI was founded in 2006 with the largest single grant ever awarded to mathematics in Ireland, and since then it has been quite a special point of reference in the country for mathematicians interested in industrial applications. I worked at MACSI for more than four years and I know the people there very well. MACSI engages in industrial collaborations at both the national and international level. People in the consortium work hard to improve products and processes for the industrial partners and provide advanced training in mathematical modelling to students and researchers.

Davide Cellai speaking at MACSI

Idiro and MACSI have a long-standing collaboration. Indeed, Idiro is always keen to collaborate with academia. As we continuously improve our models and expand our range of services, we often employ cutting-edge research to meet those challenges. In 2014, when I was still in MACSI, I won a Science Foundation Ireland Industry Fellowship, a grant that gave me the opportunity to move to Idiro and apply Network Science models to the problem of predicting telecommunication churn, using one of the datasets available in the company. This work was so successful that I was later hired by Idiro as a Senior Data & Analytics Architect.

In my talk, I illustrated the model (called the m-exposure model) that I developed during the Fellowship and the outcome of this work. While the m-exposure model was designed for portout churn, we then developed a similar model for expiry churn. Both models are now part of Idiro's suite of SNA tools for churn prediction.

Finally, I presented some new ideas and challenges that we would like to pursue in the near future.

There was a lot of interest in my talk. I got both great questions and great feedback (and lots of compliments) in the following hours. Some of the scientists in the audience were particularly interested in Idiro's future projects. Hopefully, we will get some good ideas for designing our next products.

The workshop was also very interesting in its own right. I listened to some great forward-looking talks. In one of them, Professor Wil Schilders compared the benefits of faster computer hardware with those of faster algorithms. His point was that the latter were actually more interesting. In other words, it's often better to have a new algorithm running on an old computer than an old algorithm running on a new computer. Other talks covered how science can improve society, from the elimination of tropical diseases to the exact calculation of delay times after a road incident. I was delighted to be invited to such a great event."

Davide Cellai at MACSI

A sentiment analysis of a Premier League game: Man United vs Arsenal

 

One of the biggest rivalries in the English Premier League is the one between Manchester United and Arsenal. It has been less intense in recent years, since the retirement of Alex Ferguson, but it is still a hugely anticipated match between two of the world's biggest clubs.

One of the major outlets for fans of these two clubs to voice their opinions and feelings about upcoming games is Twitter. With a Twitter fanbase of 9.31M for Man United and 8.48M for Arsenal, it can be a great source of data about how the fans are feeling in the build-up to a game and their emotions in the hours after it.

On the 19th of November 2016, Man United and Arsenal went head to head in a game that, in all honesty, won't go down in history as a great one. Although both teams are battling to stay in the running for the title, the game ended in just a 1-1 draw. There was nothing all that exciting for the Twitter followers to remark on (with the exception of the equalising goal).

But we can still see some interesting results when we analyse the tweets being posted about the game by doing a sentiment analysis. A sentiment analysis essentially takes a piece of text and assigns emotions to the specific words being used. That overall text can then be determined to be positive or negative and we can work out the specific emotions being expressed. We can then plot these emotions on a graph and examine how they change over time.
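As a rough illustration of the word-level approach described above, here is a minimal Python sketch. The tiny lexicon, the emotion labels attached to each word and the helper name `score_tweet` are all assumptions made for this example; they are not the lexicon or the workflow used in our analysis.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: word -> (polarity, emotions). Illustrative only.
LEXICON = {
    "abysmal": ("negative", ["anger", "sadness"]),
    "gutless": ("negative", ["anger", "disgust"]),
    "brilliant": ("positive", ["joy", "trust"]),
    "equaliser": ("positive", ["surprise"]),
}


def score_tweet(text):
    """Count emotions and decide the overall polarity from the words in a tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    emotions = Counter()
    balance = 0  # number of positive words minus number of negative words
    for word in words:
        if word in LEXICON:
            polarity, word_emotions = LEXICON[word]
            balance += 1 if polarity == "positive" else -1
            emotions.update(word_emotions)
    if balance > 0:
        overall = "positive"
    elif balance < 0:
        overall = "negative"
    else:
        overall = "neutral"  # no lexicon words, or positives and negatives cancel out
    return overall, dict(emotions)
```

Running `score_tweet` over every tweet and grouping the emotion counts by time window would give the kind of series that can be plotted over the course of a match.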

For example, the tweet below can be classed as negative overall. Each of the words used (“abysmal”, “gutless” etc.) can be grouped into specific emotions, which helps us understand the feelings being expressed in the tweet.

screen-shot-2016-11-24-at-15-44-41

Below is a graph of the results. As you can see, tweets about this game started growing strongly an hour or two before the match, peaked towards the end of the match, and declined steadily until ten hours after the match. 

red-sqirl-premier-league-sentiment-2

Two interesting points worth highlighting from the results are the levels of surprise and trust:

red-sqirl-premier-league-surprise-2

Looking at the surprise, we can see a clear spike towards the end of the game, most likely caused when Olivier Giroud scored the equalising goal in the 89th minute.

screen-shot-2016-11-24-at-16-20-08

Analysing the trust is quite interesting. A huge number of people tweeting felt a lot of trust before the game kicked off. It then drops slightly after kickoff but starts to rise again halfway through the game. Possibly at halftime, with the score at 0-0, the fans felt it was still all to play for.

It's easy to see the potential use cases for a system such as this: a complex analysis of a large, constantly updating dataset, scheduled to run at predefined intervals. For example, we’ve used this previously to explore the sentiment on the US presidential election.

The difficulty with an analytical project like this is setting it up. Building the data pipeline, from gathering the data, to building the analysis workflow, to scheduling that workflow to run periodically and then displaying the results, usually takes a lot of expertise and overhead. However, Idiro Analytics have developed a tool called Red Sqirl which can perform each of these steps in one intuitive interface.

Modern sports and data analytics now go hand-in-hand; it'd be hard to imagine a professional sports organisation that isn't utilising data analytics in some form. And with data becoming more easily obtainable, so many more opportunities are opening up. With the right tools, data analytics can be accessible to a lot more people.

Red Sqirl

Red Sqirl is a flexible drag-and-drop Big Data analytics platform with a unique open architecture.

Red Sqirl makes it easy for your analysts and data scientists to analyse the data you hold on your Hadoop platform.

For more information visit RedSqirl.com, and for a guide on how to build the entire process of analysing Twitter data using Red Sqirl, as outlined above, please read our detailed guide.

Title image courtesy of Premier League ©

What do the Irish think about Trump and Clinton?

 

Comparing Irish people's opinions to the rest of the world

 

It’s everyone's favourite subject right now: the US election. Unless you've somehow avoided consuming any form of media over the last few months, you'll no doubt have been exposed to a lot of opinions and "facts" about the two front runners for the US election, Hillary Clinton and Donald Trump.

With such a media overload, it's hard not to form our own ideas about who should be elected and who shouldn't. It's a strange phenomenon, the world being so invested in an election for a nation we have no vote in. The American people will vote for an American president, and yet the rest of the world seems to feel like we're involved in the decision.

With this in mind, we here at Idiro Analytics decided to get a clearer understanding of the opinions of people here in Ireland surrounding the election. Do the opinions of the Irish people differ from those of the rest of the world?

To do this, we chose Twitter as our source of public opinion. We gathered thousands of tweets posted about the election over a 24-hour period in the days leading up to the election and ran a sentiment analysis on them.

This means we were able to break down each tweet and work out the sentiment (overall feelings) being expressed by analysing the types of words being used in each tweet. From this, we can then chart if the majority of tweets being posted about both Clinton and Trump are positive or negative and the general feelings behind each one.
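To make the charting step concrete, here is a minimal Python sketch of how labelled tweets could be tallied into positive and negative shares per candidate. The function name `polarity_share` and the sample labels are made up for illustration; they are not our data or our pipeline.

```python
from collections import Counter, defaultdict


def polarity_share(labelled_tweets):
    """Return, per subject, the percentage of tweets carrying each polarity label."""
    counts = defaultdict(Counter)
    for subject, polarity in labelled_tweets:
        counts[subject][polarity] += 1
    shares = {}
    for subject, tally in counts.items():
        total = sum(tally.values())
        shares[subject] = {label: 100.0 * n / total for label, n in tally.items()}
    return shares


# Made-up labels, purely to show the shape of the calculation.
sample = [
    ("Clinton", "negative"),
    ("Clinton", "positive"),
    ("Trump", "negative"),
    ("Trump", "negative"),
    ("Trump", "positive"),
    ("Trump", "negative"),
]
shares = polarity_share(sample)
```

Working in percentages rather than raw counts keeps the two candidates comparable even when one attracts far more tweets than the other.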

First, let’s look at the sentiment for both Clinton and Trump worldwide:
idiro-analytics-what-irish-think-us-election-wh

idiro-analytics-what-irish-think-us-election-wt

One interesting note from the two charts above is the huge difference in the number of tweets being posted about each person. The number of people tweeting about Trump is over three times higher than the number of people tweeting about Hillary.

If we break this down further into just positive and negative sentiments, we can see that the majority of tweets being posted worldwide about both Clinton and Trump are negative.

idiro-analytics-what-irish-think-us-election-pnh

idiro-analytics-what-irish-think-us-election-pnt

 

Now let's look at the sentiment of Irish people towards the two. (Note that in order to get a large enough sample to analyse, we used tweets posted by people in Ireland over the 4 to 6 days leading up to the election.)

idiro-analytics-what-irish-think-us-election-hti

From looking at the chart above, it's strange that even after all we've read about Trump over the past year, we're still surprised by him.

idiro-analytics-what-irish-think-us-election-hi

idiro-analytics-what-irish-think-us-election-ti

Although it’s not by a huge amount, we can see that the sentiment towards Hillary in Ireland is positive compared to the negative worldwide sentiment towards her, whereas Trump is still negative.

Lastly, let’s combine the worldwide sentiment for both Hillary and Trump versus the sentiment towards them in Ireland.

idiro-analytics-what-irish-think-us-election-hwi

idiro-analytics-what-irish-think-us-election-twi

From these last two charts we can see that the Irish people have a little more fear and anger about the future than the rest of the world. Is there something we know that they don't?

 


 

About Idiro
Based in Dublin, Ireland, Idiro Analytics is an award-winning provider of analytics to businesses around the world.

For an overview of Idiro’s analytics services, see our homepage www.idiro.com

Media contact information
Simon Rees, Clients & Marketing Director, Idiro Analytics.

simon.rees@idiro.com

+353 1671 9036

 



red sqirl
The data analytics work for this article was performed using Red Sqirl. From within Red Sqirl, we were able to build a data pipeline that gathered thousands of tweets, sorted each tweet, ran multiple different analysis steps on the data and output the results into visualisations in real time. Visit the Red Sqirl website for more details.

 


 

Can Ireland beat the All Blacks? Irish people really believe they can

 

There's an excited buzz in the air for Irish rugby supporters right now. Tomorrow, once again, Ireland will take on the All Blacks in an attempt to break a run of 27 losses (and 1 draw) against unquestionably the greatest rugby nation in the world.

And you might think: what's the big deal? This isn't a major tournament, there are no records to be broken, it's being played in a country that doesn't have a love for the sport, and nothing is really on the line. This is just one more attempt in a long 111-year losing streak to the All Blacks.

And yet, the Irish never really look at things like that. When faced with impossible odds in any sport, when everyone has all but written us off, the Irish supporters always have the mindset of 'yeah but what if...'


But what if, they're just underestimating us

But what if, they slip up

But what if, they get overwhelmed by the Irish supporters

But what if ...

As such a small nation, we'll always be considered to be punching above our weight. But when it comes to rugby, we really do stand proudly up there with the best in the world. We truly believe that if everything goes right, we can beat any team.

The emotions felt for the Irish team are difficult to put down on paper. People talk of the mood in the Irish camp, the atmosphere around the stadium before a game and the emotions of the supporters, but it's not really something that can be drawn on a chart.

But what if there was a way?

If we were to use a data analytics technique called sentiment analysis, could we understand the overall emotions being felt about the game tomorrow?

Sentiment analysis is essentially taking a piece of text and by looking at the words being used, determining if that piece of text is overall positive, negative or neutral. Where this can become interesting is if we were to apply it to something like Twitter.

If we apply this technique to all of the tweets being posted by people in Ireland about the Ireland vs All Blacks game, could we get a picture of the overall feelings towards the game?

So, looking at the weeks leading up to the game, we took all of the tweets from people talking about Ireland vs the All Blacks in Ireland and performed a sentiment analysis on them. We also decided to do the same with all of the tweets coming out of New Zealand about the game and we were able to plot them on the charts below.

screen-shot-2016-11-04-at-18-35-50

screen-shot-2016-11-04-at-18-35-32

 

What these two charts show us is the overall emotions being felt by the people in both countries about the game tomorrow. Now, we know this is far from definitive fact. The charts only show the feelings of the people talking on Twitter about the game, and that’s only going to be a fraction of the overall supporters. Interestingly, though, this would include any journalists posting about the game, the people that others look to when forming opinions.

From looking at the results, we can see that a high proportion of the tweets from both countries have a sentiment of 'anticipation', which may seem obvious, but just goes to prove the concept of this technique.

The next highest sentiment from both countries is a feeling of 'trust'. Again, this may seem obvious for the people in New Zealand; of course they would have trust in their team, it's the All Blacks. But this does bring us back to the point we made earlier: when up against a team we’ve never been able to beat, the Irish people still have trust that we can win.

Another interesting point to take from these charts is that, for New Zealand, there does seem to be some fear creeping in. We know, and the All Blacks know, that Ireland are a good team and that they could actually win this game. And what may be weighing on the minds of the New Zealand supporters is that all of the pressure is on them. If Ireland lose, then they've lost to the best, but the All Blacks are the ones on the winning streak.

One last analysis we did was to get the overall feeling towards Joe Schmidt. With him committing his future to Ireland, we wanted to see what the Irish people thought. And as you can see from the chart below, we have complete trust in him.

screen-shot-2016-11-04-at-18-35-50

COME ON IRELAND!

 

Where Red Sqirl lies in the Big Data landscape

 

Today, Big Data is the platform of choice for storing, exploring, visualising and modelling your data.
 

To get where it is today, Big Data has gone through a number of distinct generations, each one advancing on the one before it. The first generation simply gave us the ability to analyse petabytes of data with tools like MapReduce. The next generation gave us more responsive tools for analysing data, such as Spark, Impala and PrestoDB. Now, this third generation sees the emergence of tools for moving data around, such as Kafka and Kudu.
 

The Data Lake & Real Time Analytics

These tools have changed the way that data is analysed and how data solutions are now built. The emergence of Big Data has brought with it many new concepts, one of them being the Data Lake.

A Data Lake is a massive enterprise-wide data repository to which analysts can contribute, and from which they can cherry-pick the data they need, in a format best suited to the data. The Data Lake looks to solve the problem of data silos, eliminating dozens of independently managed data collections and creating one combined data collection. Data lakes have become essential to Big Data projects due to an increasing demand for data to be accessible and agile.

Another term becoming popular right now is real-time analytics. Essentially, it means triggering an event that fulfils a prerequisite in real time. The term real-time is misleading, though, as you can’t actually analyse Big Data in real time; you can only act on it. Rather than analysing an entire base, real-time analytics relies on intelligently interacting with parts of the data lake in order to perform actions on a user-by-user basis.

Real-time analytics relies heavily on periodic batch analyses to continuously evaluate the impact of new data and to make the behaviour of each user's action evolve over time. Without these batch processes, the analytics would not be performed on up-to-date data.
 

Data Pipelines

The key to any Big Data analytics job and an up to date Big Data warehouse are these periodic background processes, and if done well, a huge range of services can be built from them.

For example, you can perform ad-hoc analyses easily, and you can maintain analytics jobs and upgrade or update them quickly. The method for creating these background processes is known as building data pipelines. Building data pipelines is an essential part of analysing data using Big Data techniques.

It's for this reason that the most popular Hadoop distributions (Cloudera, Hortonworks and MapR) all include a tool for periodic processing: Apache Oozie.

Apache Oozie is the tool that triggers processes based on time and data availability. Oozie supports any data format and language, and is fault tolerant. Apache Oozie is, however, very difficult to use, as there is a lot of overhead between implementing a process to run once and running it on a regular basis.
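The idea of triggering a job on both time and data availability can be sketched in a few lines of Python. To be clear, this is not Oozie's API or configuration format; the function name `should_run`, its parameters and the convention of checking that input paths exist are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from pathlib import Path


def should_run(now, last_run, interval, required_inputs):
    """Return True when the interval has elapsed and every required input exists."""
    # Time condition: never run before, or enough time has passed since the last run.
    due = last_run is None or now - last_run >= interval
    # Data condition: every input path the job depends on is present.
    data_ready = all(Path(p).exists() for p in required_inputs)
    return due and data_ready
```

A scheduler would call a check like this on a loop and launch the workflow whenever it returns `True`. Handling retries, late data and failures on top of that is where much of the overhead mentioned above comes from.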
 

So we built Red Sqirl

Red Sqirl is a drag & drop analytics tool which can also build Oozie workflows in the background. With Red Sqirl you can build, deploy and maintain data pipelines more easily than ever before using an intuitive drag and drop interface.

 

The counties with the most dangerous roads in Ireland ahead of the bank holiday weekend

On Bank Holiday weekends we’re used to reading about people being killed on Irish roads. But which counties have the most dangerous roads?


Although Dublin and Cork have had the highest number of fatalities, does that mean they have the most dangerous roads in the country or do other factors need to be taken into account?

As with most bank holiday weekends, there is a heightened risk of driving over the next few days. This can mainly be attributed to higher volumes of traffic as many people visit family and friends and in doing so, undertake long road journeys.

The Road Safety Authority (RSA) have issued statements about taking extra care on the roads this weekend. And by looking at the numbers over the last 20 years, it's clear the RSA are succeeding in their goal to make our roads safer. Even though the number of fatalities this year is higher than in the same time period in 2015, the overall trend is that our roads are becoming safer.

 

Idiro-analytics-Irish-dangerous-roads-fatalities-2015

 

In the interest of improving the safety of Irish roads, we here at Idiro Analytics wanted to shed some light on some of the details of road safety statistics that can usually be overlooked or misinterpreted, leading to the wrong conclusions.

In 2014 and 2015 the numbers of road fatalities in Ireland were 193 and 166 respectively. By studying the charts below, it's easy to see how the assumption can be made that the two most dangerous counties for road accidents are Dublin and Cork. However, these figures don't show the full story, because there are a lot of other variables to take into consideration.

 

Idiro-analytics-Irish-dangerous-roads-fatalities

 

Other details that need to be taken into account are:

  • The length of road in each county
  • The number of vehicles on the roads
  • The average distance travelled
  • The total population sizes

 

Below you can see details for each of these variables (note: these are summarised tables and do not contain all of the information):

 

Idiro-analytics-Irish-dangerous-roads-total-length-roads-population-in-ireland

 

When analysing all the information, we can see a clear picture starting to form. Although Dublin and Cork may at first glance seem to have the most hazardous roads and will be noticed more in the national press, it is Longford and Monaghan that rank 1 and 2 respectively for the most dangerous roads.

 

Idiro-analytics-Irish-dangerous-roads-fatalities-per-km

 

Both Longford and Monaghan have low populations, short road lengths, a low number of vehicles on the road and a low average distance travelled, but it was found that they have a high proportional fatality rate averaged over 2014 and 2015.

  • Longford and Monaghan: 2 fatalities per 10,000 vehicles
  • Longford and Monaghan: 3 fatalities per 300 million km travelled
  • Dublin: 0 fatalities per 10,000 vehicles
  • Dublin: 1 fatality per 300 million km travelled
  • Cork: 1 fatality per 10,000 vehicles
  • Cork: 1 fatality per 300 million km travelled
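Rates like the ones above come from dividing each county's fatalities by a measure of exposure. A minimal Python sketch of that normalisation follows; the function name `rate_per` and the input figures are invented for illustration and are not the data behind our charts.

```python
def rate_per(fatalities, exposure, per):
    """Fatalities per `per` units of exposure (vehicles, km travelled, ...)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return fatalities * per / exposure


# Hypothetical counties: (fatalities, registered vehicles, km travelled).
counties = {
    "Small county": (4, 20_000, 400_000_000),
    "Large county": (20, 500_000, 9_000_000_000),
}

for name, (deaths, vehicles, km) in counties.items():
    per_vehicles = rate_per(deaths, vehicles, 10_000)
    per_km = rate_per(deaths, km, 300_000_000)
    print(f"{name}: {per_vehicles:.1f} per 10,000 vehicles, "
          f"{per_km:.1f} per 300 million km")
```

In this made-up example the large county has five times as many deaths but a much lower rate on both measures, which is the same pattern that separates raw counts from risk in the figures above.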

 

Looking for a cause


With this in mind, we can now try to work out some of the possible causes and determine areas that may need further investigation.

Access to public transport could be one possible factor. Both Longford and Monaghan have a low number of public service vehicles (buses and taxis) per km per head of population.

Idiro-analytics-Irish-dangerous-roads-public-transport

Another possible factor can be found in a road surface survey carried out by the Department of Transport, Tourism and Sport (DTTAS) and the National Roads Authority (NRA) in 2011/2012.

The survey found that although Longford and Monaghan rank low among counties needing 'routine maintenance', 'surface restoration' and 'road reconstruction', the two counties rank 1 and 2 for needing resealing & restoration of skid resistance.

Idiro-analytics-Irish-dangerous-roads-maintenance

We all know, from the information given to us from the RSA, that on a bank holiday weekend we need to be extra careful when travelling. And we also know that a lot of lives have been lost on the roads in both Dublin and Cork, a higher number than any other county.


But, one thing we need to be aware of is that although the number of road deaths in those two counties is high, they would not have the most dangerous roads in the country. Per km the roads in both Co Longford & Co Monaghan pose a greater risk and extra care needs to be taken.


Therefore, be careful out on the roads this weekend & especially so in Cos Longford & Monaghan.

In order to make this article more accessible, we've only included summaries of the overall data that we analysed. But we invite anybody who is interested in these figures to contact us with questions or to discuss any part in detail. We'll be happy to discuss the findings in the hope that the information can lead to safer Irish roads.


About Idiro
Based in Dublin, Ireland, Idiro Analytics is an award-winning provider of analytics to businesses around the world.

For an overview of Idiro’s analytics services, see our homepage www.idiro.com

Media contact information
Simon Rees, Clients & Marketing Director, Idiro Analytics.

simon.rees@idiro.com

+353 1534 30 34

Mayo still to beat Dublin by one point in the All-Ireland replay

 

In the two weeks following an All-Ireland final containing 2 own goals and a draw that nobody predicted, a trend seems to be forming among GAA supporters: Dublin underperformed and Mayo may have missed their chance. The consensus is that for the replay, Dublin will ‘click into gear’, play ‘their’ game and come out on top.

But is this really the case?

Most seemed to think the 2016 title belonged to Dublin before they even stepped into Croke Park on that rainy Sunday afternoon, but by looking at the numbers, we found that the pundits’ confidence was unfounded.

If you read our last article, you'll have seen that Dublin were not the predicted winners of the final; we had Mayo to win by one point. And although we admit predicting the exact score of a game isn't really possible, we ended up being pretty close.

We had set ourselves the challenge of working out a model to predict who would be the 2016 champions (and get an edge on the bookmakers). We did this by looking at the information available to us on both the Dublin and Mayo teams’ performances over time. The key areas were goal difference between Mayo/Dublin, point differences between them, regular differences in finals, average goals and points that season, differences between average goals/points that season and the finals etc.

We came up with the prediction of Mayo to win by just one point.

Now, even if we were to analyse every single data point and statistic since the GAA was formed in 1884, we still wouldn't have been able to predict two own goals in an All-Ireland final. But given that our prediction went against the general opinion of Dublin being favourites, and that the game ended up being so close, we thought it might be worth trying again.

The difference, this time, is that we now have more data to work with. Not only have both teams played another game that we can factor into our original predictive model, but we now have more data on how each team performs against each other.

 

Results of previous fixtures

 
idiro-all-ireland-2016-prediction-results
Over the past four years, Dublin and Mayo have now played each other four times. The particular details of those matches play a key role in predicting the outcome of this Saturday's match, with a higher weighting on the most recent games as they are the most relevant to each team's current form.
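To illustrate what a higher weighting on recent games can look like, here is a minimal Python sketch of a recency-weighted average. The decay factor, the function name `recency_weighted_mean` and the margins are assumptions made for this example; the weighting scheme used in our actual model is not described in this article.

```python
def recency_weighted_mean(values, decay=0.5):
    """Weighted mean of `values` (oldest first); each game counts `decay` times the one after it."""
    if not values:
        raise ValueError("need at least one value")
    n = len(values)
    # The most recent game gets weight 1, the one before it `decay`, then decay**2, ...
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical point margins (team A minus team B), oldest game first.
margins = [-3, 0, 2, 0]
weighted = recency_weighted_mean(margins)
plain = sum(margins) / len(margins)
```

With these made-up margins the plain average is negative while the recency-weighted one is positive, because the heavy defeat is the oldest game and counts for least.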

 

GAA Football All-Ireland Senior Championship final 2016

 
idiro-all-ireland-2016-prediction-stats3
With these extra details in mind, we were able to refine our original prediction and develop a new one.

 

The Idiro Analytics official prediction for the All-Ireland final replay

Mayo 1:13 - 1:12 Dublin

Mayo to win by just one point

 

Now there’s no doubt that the weather did have a major effect on the performance of both teams on that error-filled Sunday two weeks ago. But with the weather forecast to be a lot milder this weekend, we should see a much-improved display by both teams. Looking at the numbers, though, we still stand by our original analysis that these two teams are more evenly matched than people seem to think.

 

Analysis performed by Eduards Vanags

Mayo to beat Dublin by one point

 

All the Dubs and Mayo people will give you an answer, of course, but can data analytics predict the outcome of this weekend's All-Ireland final, or more impressive yet, the score?

Predicting the outcome of a single game is a difficult task; predicting the winner of a league competition would be a much safer bet.

In a league competition, teams play a lot of games, which diminishes the impact of individual losses on their overall performance. And although they may falter a few times, underperform, fail to capitalise on chances etc. over the course of a season, it is usually the best teams who come out on top. But in a knockout competition, anything can happen. That is an argument for why Leicester City winning the English Premier League is a bigger achievement than Portugal winning the Euros. In the knockout competition, Portugal were crowned champions by winning only four games, only one of those in 90 minutes and one on penalties. How much of that success was down to luck? And if it had been a league-style competition, would they still have won?

But let’s say we want to work out who’s going to win the All-Ireland final this weekend, Dublin or Mayo. Is it even possible? The short answer: no! But let's give it a shot anyway.

Now, some other sports (e.g. soccer) have the luxury of huge pools of data and statistics. With such sports, predictions for who will win games in the Euros and the World Cup can put huge weightings on player performance rankings and on comparisons of how teams performed against the same opponents. But the GAA isn’t quite there yet in terms of individual player data. Also, the way the league is structured means that Mayo and Dublin rarely come up against the same teams on a regular basis (over the last four years, Dublin have played Kerry just twice and Mayo have also played Kerry twice).

What we do have to work with is the performance of both teams over time. Our data analysts broke this down and looked into key areas such as goal difference between Mayo/Dublin, point differences between them, regular differences in finals, average goals and points that season, differences between average goals/points that season and the finals etc.

idiro-results

By excluding any emotional bias and purely looking at the history and current form of both teams, Idiro Analytics have calculated a prediction of:
 

Mayo 1:15 - 2:11 Dublin

Mayo to beat Dublin by one point
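
For readers less familiar with GAA scorelines, here is a minimal Python sketch of how a goals:points score turns into a winning margin. The function name is our own, chosen for illustration; the only rule it relies on is that a goal is worth three points.

```python
def total_points(score: str) -> int:
    """Convert a 'goals:points' scoreline such as '1:15' into a points total."""
    goals, points = (int(part) for part in score.split(":"))
    return 3 * goals + points  # a goal is worth three points


mayo = total_points("1:15")    # 1 goal + 15 points = 18
dublin = total_points("2:11")  # 2 goals + 11 points = 17
margin = mayo - dublin

print(f"Mayo {mayo} - {dublin} Dublin, margin {margin}")
# prints "Mayo 18 - 17 Dublin, margin 1"
```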

 
Again, predicting the result of a single game is definitely not an exact science. That’s especially true with such a fast-paced, high-scoring sport, where one misplaced pass or slip could sway the game one way or the other. But interestingly, by focusing only on the numbers and not the emotional elements of the game, our prediction seems to go against the general consensus of Dublin having the edge on Mayo.

If you were to base your opinion on who would win by just looking at the odds set by the bookmakers, you may be led to believe that Dublin are 8 times more likely to win. But the thing to keep in mind here is the relative number of people making the bets. The population of Dublin is roughly ten times that of Mayo, and with matches like this, many punters bet with their hearts, not their heads, meaning the odds may look disproportionate. Another thing to remember here is that bookmakers set the odds solely with the intention of making a profit no matter who wins. So although Dublin may look like they have this all wrapped up, that might not be the case.

Idiro-analytics-population-dublin-mayo

Our predictive model has Mayo to win by a margin of one point, which at first glance may not seem like such a big deal considering how evenly matched these two counties are (by looking at their results over the last number of years).

But for Mayo to be so close to Dublin really is a major achievement, again when we take into consideration the relative populations of each county.

According to the most recent Irish Sports Council’s monitor report, the percentage of people actively playing Gaelic football in Connacht is 3.7%, whereas in Dublin county it’s just 0.6%. But adjusting for population size, the number of active players the Dublin team could potentially choose from is roughly 8070 with Mayo only having 4014 players.

actively-playing-gaelic-football

Now, if Mayo had the same population as Dublin (1 345 000 people), with an active player percentage of 3.7%, they would have a pool of players to choose from of 50 000, compared to Dublin's 8070.
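
The player-pool arithmetic above can be reproduced with a short Python sketch. The variable and function names are ours, and applying the Connacht participation rate to Mayo is the same simplifying assumption made in the text.

```python
DUBLIN_POPULATION = 1_345_000
DUBLIN_PLAYING_RATE = 0.006    # 0.6% actively playing Gaelic football
CONNACHT_PLAYING_RATE = 0.037  # 3.7%, used here as a stand-in for Mayo


def player_pool(population: int, playing_rate: float) -> int:
    """Estimate the number of active players in a population."""
    return round(population * playing_rate)


dublin_pool = player_pool(DUBLIN_POPULATION, DUBLIN_PLAYING_RATE)
# Mayo's participation rate applied to Dublin's population
hypothetical_mayo_pool = player_pool(DUBLIN_POPULATION, CONNACHT_PLAYING_RATE)

print(dublin_pool)             # 8070
print(hypothetical_mayo_pool)  # 49765, i.e. roughly 50 000
print(round(hypothetical_mayo_pool / dublin_pool, 1))  # about six times larger
```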

The Mayo players will know that looking at the history it’s too close to call, but looking at how well they’ve played given the disproportionate advantage Dublin have in terms of population, they may just feel they deserve it more.

Dublin supporters might not want to be too confident.

 


The benefits of playing Pokemon Go


 


 

There are plenty of articles about Pokemon Go around the internet, some outlining the frustrations of trying to go about your day without having to sidestep someone staring at their phone, others showing the "mass hysteria" caused by the sighting of a Charizard, and many more about the accidents people have gotten themselves into while playing.

But few people are discussing why this game has become such a huge success, or embracing it for what it is: a fun trending topic with potential benefits.

So to understand this rise in popularity let's cast our minds back to January 2014, and the sudden rise of another game, Flappy Bird. If you're unaware, Flappy Bird was a side-scroller game where the player tries to control a small bird through obstacles using very limited controls.

flappy-bird

The idea of Flappy Bird becoming one of the most successful mobile games of that time defied all logic for professional game designers. It had little to no original game mechanics or design, and it was even openly criticised for plagiarism from other game designers. Yet it took off to become the most downloaded free game in the iOS App Store when it was released.

Others have written about the addictive nature of Flappy Bird and how people liked that it was fresh and simple when so many games were becoming overly complicated. But put simply, it became popular because everyone's friends were playing it.

 


And now it's the same story with Pokemon Go.

Pokemon Go is essentially a simplified version of the game Ingress (both games were created by Niantic), but designed to appeal to diehard Pokemon fans. Pokemon Go and Ingress have a lot of the same features and use the same game mechanics, but one just has some Pokemon thrown into the mix. For Pokemon Go, though, things have escalated dramatically.

Pokemon Go started to build momentum: a few people started playing, and then more. It got noticed and talked about on sites like Reddit, and the more it was talked about, the more successful it became.

Every once in a while something like this happens, and it doesn't necessarily need to be a game. Think of the rise of Tinder (100 million downloads as of March 2016). For people interested in dating apps, there were plenty of options available, but Tinder became so popular because everyone knew someone who was talking about it.

It becomes the thing to be a part of, to stay in the loop, to stay cool. If everyone playing Pokemon Go was a huge Pokemon fan, they would have already been playing one of the many other Pokemon games out there.

What makes Pokemon Go interesting, compared to other mobile games/apps that have had their moment in the sun, is that non-players can't avoid interacting with players. Step outside and look around and you'll most likely spot someone playing the game.

People may complain, but then again, people tend to complain about everything. The game brings people out to interact with real world places, making it more difficult for others to ignore. But anything that encourages exercise and gets people of all ages to get on their feet and move around is a great thing.

So, being data analysts (and nerds), we wanted to encourage the exercising aspect of Pokemon Go by working out specific numbers to help players understand the physical benefits of playing. We wanted to help players justify their Pokemon hunting habit by having solid data to back up the 'it's good for you to keep playing' argument.

So let's break it down and look at what you would need to do in order to get to level 20 in the game. Of course, people can go higher than level 20, but we’ll set that as a nice target for now.

 


 

First, let's break down some of the numbers:

 

idiro-analytics-pokemon-go-table1

 


 

To reach level 20, you would need to gain 210,000 experience points overall.

 

idiro-analytics-pokemon-go-image

 

 

Reaching that goal by focusing only on catching Pokemon would involve one of the following:

 

idiro-analytics-pokemon-go-bonus-image

Catching 1909 Pokemon with a curveball bonus
Catching 1909 Pokemon with a Nice! throw bonus
Catching 1400 Pokemon with a Great! throw bonus
Catching 1050 Pokemon with an Excellent! throw bonus
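
The four counts above follow from dividing the 210,000 XP target by the XP earned per catch. Here is a minimal Python sketch of that division. The per-catch values (100 XP for a catch, plus 10, 10, 50 and 100 XP for the bonuses) are our assumption, inferred from the counts themselves rather than read from the table, and the names are illustrative.

```python
TARGET_XP = 210_000   # XP needed to reach level 20
BASE_CATCH_XP = 100   # assumed XP for a plain catch

# Assumed bonus XP added on top of the base catch
BONUS_XP = {
    "curveball": 10,
    "nice": 10,
    "great": 50,
    "excellent": 100,
}


def catches_needed(bonus: str) -> int:
    """Catches needed to reach the target, rounded to the nearest catch."""
    xp_per_catch = BASE_CATCH_XP + BONUS_XP[bonus]
    return round(TARGET_XP / xp_per_catch)


for bonus in BONUS_XP:
    print(f"{bonus}: {catches_needed(bonus)} catches")
```

Under these assumptions, 1909 is a rounded figure: 210,000 / 110 is about 1909.1, so strictly speaking a 1910th catch would be needed to cross the line.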
 
 

 
 

But from the chart above, we know that catching Pokemon isn't the only way to gain experience points (XP). 

Another way is to incubate eggs in order to build up your XP. If you have an egg and place it into an incubator, you can hatch that egg and earn XP. The egg will hatch after you travel 5 km, and the speed at which you travel that 5 km will determine how quickly you earn that XP.

 

idiro-analytics-pokemon-go-eggs-table

Walking 1 egg (5 km) - you can earn 8 XP per min
Jogging 1 egg (5 km) - you can earn 15 XP per min
Running 1 egg (5 km) - you can earn 24 XP per min

 


 

We decided to run an experiment in order to get a benchmark. We started our experiment at beginner level 5 and played Pokemon Go for 94 minutes in a city centre.

These were our results:

idiro-analytics-pokemon-go-experiment

 


 
 
So if we take these numbers as the base, in order to reach level 20 we would need to play Pokemon Go for a total of 67 hours, travelling a distance of 202 km and catching 562 Pokemon along the way.

In other words, that means playing Pokemon Go for just less than three days straight without stopping, and travelling the distance of the London marathon almost five times. That's not so bad, right?

In terms of calories burned while walking that distance (we'll assume we won't be running those three days), we would burn 27,259 calories*
*The number of calories burned here is calculated based on our particular weight and average speed walking.
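
The comparisons above are simple unit conversions, sketched below in Python as a sanity check. The marathon distance of 42.195 km is the standard one; the variable names are ours, and the calories-per-kilometre figure only holds for our particular weight and walking speed.

```python
HOURS_PLAYED = 67
DISTANCE_KM = 202
CALORIES_BURNED = 27_259
MARATHON_KM = 42.195  # standard marathon distance

days = HOURS_PLAYED / 24                   # just less than three days
marathons = DISTANCE_KM / MARATHON_KM      # almost five marathons
avg_speed_kmh = DISTANCE_KM / HOURS_PLAYED  # a slow walking pace
calories_per_km = CALORIES_BURNED / DISTANCE_KM

print(f"{days:.2f} days, {marathons:.2f} marathons")
print(f"{avg_speed_kmh:.2f} km/h, {calories_per_km:.1f} calories per km")
```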
 
 

 
 
idiro-analytics-pokemon-go-experiment-table-2 

Other less fun activities we would need to do in order to burn that many calories would be*:

 idiro-analytics-pokemon-go-experiment-table-3 

 


 

Now, our numbers here are based on our particular experiment; we would need a larger dataset in order to get more solid results. But there is no reason why you can’t use this as a baseline reference when arguing with friends and family about whether playing Pokemon Go is a waste of time.

We've also run a more complex experiment using the data analytics tool Red Sqirl. We used advanced predictive data analytics techniques to work out where Pokemon will appear in the game. You can read more about this experiment here on hack.guides()