Analysis of NCT test results helps car buyers choose wisely

Today Idiro has published a data visualisation dashboard (here) allowing you to explore the 2016 NCT test results. 

Update 21/8/17: The Sunday Independent ran a story yesterday on Idiro's dashboard - see here http://www.independent.ie/life/motoring/car-reviews/put-your-car-to-the-test-36049372.html 

Update 30/8/17: Our dashboard has been picked up by many other media including RTÉ: https://www.rte.ie/lifestyle/living/2017/0830/900977-is-this-the-worst-car-model-for-nct/

About this dashboard and the data

This interactive report is published by Idiro Analytics as a demonstration of our data visualisation capability. The data used is published by RSA.ie and covers NCT tests conducted in 2016. The data covers the first NCT test that each car underwent in 2016 - data has not been published on retests for cars that fail.

The data is filtered to show only the twenty most popular makes of vehicle tested in 2016 and, for each of these, only models with at least 1,000 tests carried out. This was necessary because showing every make and model of car tested would make the dashboard unusable. As a result, of the 1.4 million tests carried out in 2016, 1.1 million are represented in this dashboard.
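For readers who download the full dataset, the filter described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the dashboard: the field names (`make`, `model`, `tests`) and the sample figures are invented, and the real RSA file may be laid out differently.

```python
from collections import Counter

def filter_for_dashboard(rows, top_n_makes=20, min_model_tests=1000):
    """Keep models with enough tests that belong to the most-tested makes."""
    tests_per_make = Counter()
    for row in rows:
        tests_per_make[row["make"]] += row["tests"]
    popular_makes = {make for make, _ in tests_per_make.most_common(top_n_makes)}
    return [
        row for row in rows
        if row["make"] in popular_makes and row["tests"] >= min_model_tests
    ]

# Invented sample data: one row per make/model with its number of tests.
sample_rows = [
    {"make": "Toyota", "model": "Corolla", "tests": 5000},
    {"make": "Toyota", "model": "Rare Model", "tests": 400},
    {"make": "Ford", "model": "Focus", "tests": 4500},
    {"make": "Lada", "model": "Niva", "tests": 1200},
]

# Use the top 2 makes here so that the tiny sample shows both rules at work.
kept = filter_for_dashboard(sample_rows, top_n_makes=2, min_model_tests=1000)
total_tests = sum(row["tests"] for row in sample_rows)
kept_tests = sum(row["tests"] for row in kept)
print([row["model"] for row in kept])
print(f"{kept_tests} of {total_tests} tests represented")
```

In the sample, the rare Toyota model drops out because it has too few tests, and the Lada drops out because its make is not among the most popular, even though the model itself passes the 1,000-test threshold.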

If you wish to examine the data further or want to look at makes / models of cars that are not shown in this analysis, the full dataset is available for download at RSA.ie.

How to use this dashboard

Pro tip: To reset all your filters and return to the original screen, click your browser’s refresh button.

Pro tip: To compare different car brands, hold CTRL and click the makes you want to compare in the bubble chart “Overall Popularity & Passing % of Model”.

 

The RSA provides the ‘Year of Car’ of each car tested.  You can filter by the age of cars using this slider. 

If you want to see cars from 2010 or before that were tested in 2016, you simply drag the end buttons in the slider (at the top) over to the desired year.

As the year changes, so do the interactive charts. This bubble chart shows the 20 most popular makes of car with a 2010 or older registration that were tested in 2016.

The size of each bubble indicates how popular that make is, and its colour indicates the first-time pass rate - from deep blue (high pass rate) to deep red (low pass rate).

For example, we can see below that for vehicles registered in 2010 or earlier, Toyota is the most popular make and had a high pass rate in 2016.

In the next graphic to the right, ‘First time pass rate by model’, you’ll see the different Toyota models that were tested.

In the next table, ‘Model Popular Model & Age’, you’ll see the most popular Toyota models and the age range of the cars tested.

Although the Prius has the highest pass rate of the Toyota models, it isn’t the most popular model; the Corolla is.  As you scroll down the dashboard, you can see the failure rate of each car tested in the NCT and what caused the failures.

 

Using the ‘First Time Failure By Year On’ drop-down menu, you can filter by the category that cars failed on, or choose total to show all the categories together.  In the image below, the blue bars indicate the number of cars tested in each year and the red line shows the failure rate as a percentage. Remember that pass/fail thresholds can vary according to the age of the car.

As you scroll across, you’ll see the ‘First Time Failure By Category’ table. This table shows each category on which the NCT tests each car.

This graph displays what caused Toyota cars to fail their NCT. 

 

 

Here's the link to the dashboard: https://public.tableau.com/profile/idiro.analytics#!/vizhome/NCT2016Top20MakesResults/NCT2016-20MostPopularMakes

To contact Idiro about this blog post or about how Idiro's analytics can help your business, drop us an email at info@idiro.com.  To download the source data from RSA.ie, click here.  

 

 

Big data – will it solve your marketing problems?

As ever, Tom Fishburne has a point.  Increasingly, organisations are turning to their data to improve decision-making and improve commercial results - but buying big data infrastructure won't solve your marketing problems.  In many ways, installing the big data infrastructure is the easy bit.  The real challenge, as Idiro has found time and time again, is turning all that data into money.  For this you need people with the BI and analytics skills to mine all that newly-available data for dashboards, insights and predictions.  And of course the organisation needs to be ready to change - to try new ways of using data to drive commercial activity - and it needs to be prepared to fail.  Samuel Beckett said:

'Ever tried. Ever failed. No matter. Try again. Fail again. Fail better.'

With the right analytics partner, the journey to excellence in data-driven marketing should be a lot easier than Beckett paints it - but nevertheless, it takes skill and a ruthless focus on the results.  However, the results from using your organisation's data to drive its business are nearly always well worth the effort.

Analysis shows that cash is king in the Irish housing market

As the housing market in Ireland is heating up again, we examined the trends in cash purchases of housing from 2010 to 2016. Non-mortgaged home purchases jumped from €347 million in 2010 to approximately €6 billion in 2016, an increase of more than 1,600% over that period.
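As a quick check on the percentage quoted above, the arithmetic can be reproduced as follows. The two figures are the ones from this post; because the 2016 value is approximate, the result should be read as approximate too.

```python
# Figures quoted above, in millions of euro; the 2016 value is approximate.
cash_2010 = 347
cash_2016 = 6_000

increase_pct = (cash_2016 - cash_2010) / cash_2010 * 100
print(f"Increase 2010-2016: {increase_pct:,.0f}%")  # prints 1,629%
```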

Housing sales 2010-2016: cash vs. mortgages

To see an interactive version of this graph, click on the image above

The proportion of cash transactions for housing in Ireland peaked in 2013 and has been on a downward trajectory since then. However, it is still 44 percentage points higher than where it was in 2010.

Cash sales as a percentage of housing transactions have increased massively since 2010, and have recently declined slightly.

To see an interactive version of this graph, click on the image above

There can be significant externalities created by an influx of cash within the property sector. It can lead to an increase in home prices and push median-income home buyers out of the property market, widening the affordability gap. It should be noted that although cash sales have long been a feature of the Irish housing market, the proportion of institutional and international investors has increased recently. Finding data on investments by international property players such as Blackstone within the Irish real estate market would require further research. It is well understood that credit cannot compete with cash. As long as the housing market in Ireland is dominated by cash buyers, whole classes of renters are likely to be priced out of their dream of owning a home.

Idiro staff and friends launch Databeers Dublin with sell-out first event

'Databeers' is a series of talks that started in 2014 in Madrid, and then spread to Barcelona, London and many other cities.  Databeers aims to bring together data experts from industry and academia in a fun way: short talks, pitched at a level accessible to a wide audience, and beer.  The format of Databeers talks is unlike any other: three or four talks, each seven minutes (yes, seven minutes!) long, on any aspect of data, free beer, and of course free entry.  

A group of friends working at Idiro (Julie, Davide & Simon) heard about Databeers, and thought it would be good to bring Databeers to Dublin.   We recruited three more people - Francesca, Aonghus and Stefano - and formed the organising team.  The team found a fine venue (thank you Bank of Ireland), a beer sponsor (many thanks, Estrella Damm) and three great speakers (thanks Krithika, Cathal and Patricia).  We launched the event and sold out in 3 days!  On the night we had a full house, three great talks and we managed to avoid the hassle of undrunk beer to keep till next time...

The Databeers Dublin team: L-R: Francesca, Stefano, Simon, Davide, Julie and Aonghus.

The next event is already being planned, so keep an eye on @DatabeersDub for updates!

 

Major International Telco chooses Red Sqirl

One of the first commercial users of the Red Sqirl analytics platform for Big Data is a multinational telco, which has deployed Red Sqirl in two countries and is using it to deliver analytics on its Hadoop platform. This customer has asked to remain anonymous.  

Red Sqirl is a flexible drag-and-drop Big Data analytics platform with a unique open architecture. Red Sqirl makes it easy for analysts and data scientists to analyse the data on your Hadoop platform.  

The problem

 

This multinational telco operates on four continents. Unfortunately, for historical reasons its operating companies use a wide variety of database systems and analytical platforms.  

As data becomes an increasingly important asset for organisations, many companies are taking steps to maximise the value of their data. In addition, most large companies now have access to many new types of data - for example, social media posts.

To exploit synergies across the worldwide organisation, this multinational telco decided to standardise database platforms across the group. In order to best meet the challenges of storing and using ‘big data’, the company chose Hadoop as its standard database platform.  Hadoop is currently being rolled out across the company’s operations.  

The company now faces the challenge of migrating existing code onto Hadoop, and allowing asset re-use and swapping across business units that have multiple historic platforms and assets.  Moreover, the Hadoop ecosystem does not contain a ready-made data analytics module.  The market-leading traditional data analytics software platforms are not designed for the Hadoop ecosystem and tend to be inefficient when analysing Hadoop data. The telco therefore searched for a native Hadoop analytics platform with an easy-to-use graphical interface.  In addition, because of the wide variety of requirements, the platform had to be highly flexible and cost-effective for a worldwide rollout.

Solution chosen: Red Sqirl

 

Following a thorough technical / usability evaluation of a number of analytics platforms, the company agreed a contract to trial Red Sqirl - initially in the company’s head office and in one operating company.  

Red Sqirl exceeds the telco’s requirements as set out above. Of particular interest to this company is Red Sqirl's unique capability for sharing, via the Red Sqirl Analytics Store.  Moreover, the Red Sqirl development team has deep experience of telco analytics.  Red Sqirl is already proven in predicting telco churn, as shown in the workflow below.

Telco churn modelling in Red Sqirl - a workflow


This telecoms operator now has the confidence that their analytics solutions are scalable, deployable and easy to use.

Implementation

 

Red Sqirl was installed in both the head office and the country telco, which took a matter of minutes in each case. Training workshops were held in both locations, to ensure that analysts could get the most out of the Red Sqirl platform. Red Sqirl’s drag-and-drop interface is similar to that of most GUI analytical tools, so the training was quickly completed.  Feedback from students was very positive - as the graph below shows, students scored the training very highly.

Student feedback on Red Sqirl training course


The company then needed to migrate analytical assets from legacy infrastructure to Red Sqirl.  The Red Sqirl development team showed the way by taking a particular analytical model that had been developed in another language and quickly porting it onto Red Sqirl.

The company is now using Red Sqirl as its primary analytics platform in the country in question - and early results are promising.  

Three and a half degrees of separation?

Last month a team of researchers at Facebook posted an article where they update the "mean degree of separation" of Facebook users.  You have most probably heard of the "Six Degrees of Separation" legend: between you and me, as between anyone in the world, there is a chain of acquaintances that connect us; this chain is at most 6 steps long. In other words, you know somebody, who knows somebody, ..., who knows me! Apparently, this idea dates back to Frigyes Karinthy, a Hungarian writer from the first half of the 20th century, but it was then investigated by social scientists and, with the arrival of social networks and Big Data, people have started using online social networks to test it experimentally.

In 2011, researchers at Cornell, the Università Degli Studi di Milano, and Facebook computed the mean degree of separation across the 721 million people using Facebook at the time and found that it was 3.74. Here the separation is defined in terms of intermediate individuals between a given pair, instead of the number of steps. The news is that Facebook users grew to 1.59 billion and the mean degree of separation shrank to 3.57. If you visit the page, a fast algorithm calculates your own mean degree of separation.

However, one may ask how representative Facebook is of real-world social acquaintances.  Maintaining a real-world social relationship is expensive in terms of time and energy, while Facebook "friendship" comes almost for free.  Therefore, one can argue that Facebook connectedness overestimates the real connectedness of individuals.

It is reasonable to believe that some of those links are so weak as to be meaningless. Also, social contacts change with time, while one may expect that most Facebook users don't regularly prune their inactive links.

Finally, albeit large, the Facebook world is still a sample of the whole of humankind, and it is certainly not a random sample.  To mention just a few reasons, access to the Internet in Africa is still much more difficult than on the other continents (even if things are changing fast), elderly people are under-represented on Facebook, etc.

As a result, it is possible that the Facebook sample has a lower mean degree of separation than the world population as a whole.  But it is still a very large sample.

All these remarks are quite intuitive, but the network scientist Duncan Watts has criticized both the work and the approach, highlighting the counter-intuitive behaviour of so-called "small-world networks".  In a famous 1998 paper, he and Steven Strogatz proposed a very simple model showing that adding a few "shortcuts" to a network (links connecting random pairs of individuals) quickly reduces the shortest path length (i.e. the degree of separation) and, quite interestingly, that the path length doesn't get much shorter if you keep adding random links after this initial drop.  The argument, therefore, is that the world already became small decades ago, and it's quite unlikely to shrink much further.
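To get a feel for the shortcut effect, here is a minimal sketch in plain Python. It is not the exact model from the 1998 paper (which rewires existing links); for simplicity it starts from a ring in which everyone knows only their nearest neighbours and adds random links on top. The network size, the number of neighbours and the numbers of shortcuts are arbitrary choices for illustration.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Link each node to its k nearest neighbours on each side of a ring."""
    graph = {node: set() for node in range(n)}
    for node in range(n):
        for offset in range(1, k + 1):
            neighbour = (node + offset) % n
            graph[node].add(neighbour)
            graph[neighbour].add(node)
    return graph

def add_shortcuts(graph, count, rng):
    """Add `count` links between random pairs of nodes not yet linked."""
    nodes = list(graph)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in graph[a]:
            graph[a].add(b)
            graph[b].add(a)
            added += 1

def mean_path_length(graph):
    """Average shortest-path length over all pairs, via breadth-first search."""
    total = 0
    pairs = 0
    for source in graph:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in graph[current]:
                if neighbour not in distance:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
        pairs += len(distance) - 1
    return total / pairs

if __name__ == "__main__":
    rng = random.Random(42)          # fixed seed so runs are repeatable
    network = ring_lattice(200, 2)   # 200 people, 4 acquaintances each
    shortcuts_so_far = 0
    for target in (0, 5, 10, 20, 40, 80, 160):
        add_shortcuts(network, target - shortcuts_so_far, rng)
        shortcuts_so_far = target
        print(f"{target:3d} shortcuts: mean path length {mean_path_length(network):.2f}")
```

Running it prints the mean path length as shortcuts accumulate, so the size of each successive drop can be compared directly. Subtracting one from a path length gives the number of intermediate individuals, which is how the Facebook figures above are defined.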

This, in my opinion, raises the question of whether a mathematical model could quantitatively fit the measured reduction in Facebook's mean degree of separation from 2011 to 2016. It could help us to better understand which topological features of the network are relevant in the process.

Anyway, there are many more details in a real-world social network that still need to be understood besides the degrees of separation.  In a recent paper, the sociologist Robin Dunbar has investigated the relationship between the number of "friends" reported on Facebook and the number personally perceived by individuals.  In some earlier papers, he and his co-workers had identified progressively self-contained layers of closeness in human acquaintances.  They showed that, typically, human layers of social closeness contain approximately 5, 15, 50 and 150 individuals, plus two external layers of 500 and 1500 alters.  In his recent paper and references therein, Dunbar shows that online social networks can realistically approximate not only the 150-friends layer but also the two innermost layers.  He also claims that the number of online contacts is usually not larger than the number of offline contacts.

Indeed, there is a minority of users who report a larger set of friends online than offline, but it is hypothesised that those extra online connections are weak acquaintances, which in online social networks typically cannot be distinguished from close friends.  This can only be seen by investigating the traffic among individuals, for example by counting the direct posts on Facebook or replies on Twitter.

Besides online social networks, mobile phone networks are a valid alternative for measuring social acquaintances.  Phones are widespread even in areas with low Internet connectivity, communication often carries a cost, and it is possible to measure the traffic between individuals.

At Idiro we investigate these problems every day and are able to detect layers of social acquaintance to improve digital marketing strategies and provide value for our customers.

Idiro researcher holds seminar on Social Network Analysis in telecommunications

Academic research has always been important to Idiro – that’s how we became world leaders in the application of Social Network Analysis (SNA) techniques to telecommunications business problems.  It is also why our CEO is on the board of CeADAR, the Irish Centre for Applied Data Analytics Research.

So we are happy to report that our colleague Davide Cellai, who has been working on advanced uses of SNA in solving telecommunications business problems, this week gave a seminar at the University of Aalto, Finland on his advanced SNA churn research.  Davide says:

“Last Thursday I was invited to give a seminar at the University of Aalto, near Helsinki. This week, I have been hosted by the group of Jari Saramaki and Kimmo Kaski, who were possibly the first researchers to focus a research group on social network analysis applied to telecommunications, several years ago. I presented our work on port-out churn, plus some percolation models of robustness of infrastructures I have been involved with in recent years.  Here are some examples:

Social Network Analysis in telecommunications - research by Idiro Analytics
Distribution of the fraction of time two subscribers spend speaking to each other in an interval of 12 weeks

In order to profile the type of connection between pairs of subscribers, we calculate the amount of time two individuals spend at peak (i.e. during working hours) and off-peak time of the day. For each pair, we can evaluate if conversations occur mainly at peak or off-peak time and give a score accordingly. In the first figure, we show the distribution of this score over all the phone calls that occurred in a period of 12 weeks. We can see two strong peaks, representing calls occurring exclusively off-peak and at peak times, respectively. We also show an enlargement of the central zone, with a broad hill representing pairs of subscribers who speak both at peak and off-peak time. This polarization led us to identify three layers of acquaintances: peak, off-peak, and mixed peak/off-peak.

Probability that a subscriber churns as a function of churning subscribers in her network of social acquaintances

This plot shows the probability that a subscriber churns after m_out of her friends have churned in a recent time interval. Generally speaking, we can see that a subscriber with more churning friends is more likely to churn, as the red crosses tend to grow with m_out. Moreover, we can see that the probability of churning is higher for the mixed peak/off-peak layer (purple boxes), meaning that at this level of relationships, churning propagates more easily.

The seminar participants were quite interested in the way we could identify types of social acquaintance based on the time of calls. They also suggested that measuring exposure to churn in terms of duration, instead of number of churners, would improve the sensitivity of the method. In particular, Jari Saramaki, who also has experience in the data analytics industry, envisions that machine learning methods should be fed with this kind of insightful social network information to produce the best lift.

The week I have spent here has been very useful. For example, researchers have shown me a method to identify families that doesn’t use any community-finding algorithm, and a way to map a temporal network into a network of events that can be treated with a known formalism. Another post-doc is working with psychiatrists to detect the onset of a mental disorder in the pattern of social activities of a person. Finally, it has also been interesting to hear that a few students or post-docs are starting a company. Best of luck, and thank you Aalto!”
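For readers who want to experiment with the two ideas Davide describes, here is a rough Python sketch. It is not Idiro's production method: the score used here (the share of a pair's talk time that falls at peak), the rule for assigning layers and the sample figures are all assumptions made for illustration.

```python
from collections import defaultdict

def peak_share(peak_seconds, off_peak_seconds):
    """Fraction of a pair's total talk time that happens at peak time."""
    return peak_seconds / (peak_seconds + off_peak_seconds)

def layer(share):
    """Assign a pair to one of three layers (hypothetical rule)."""
    if share == 1.0:
        return "peak"
    if share == 0.0:
        return "off-peak"
    return "mixed"

def churn_probability_by_exposure(subscribers):
    """Estimate P(churn) for each m_out, the number of churned friends.

    `subscribers` is an iterable of (m_out, churned) pairs.
    """
    counts = defaultdict(lambda: [0, 0])   # m_out -> [subscribers, churners]
    for m_out, churned in subscribers:
        counts[m_out][0] += 1
        counts[m_out][1] += int(churned)
    return {m: churners / total for m, (total, churners) in sorted(counts.items())}

# Invented talk times in seconds: (peak, off-peak) for three pairs.
pairs = {"A-B": (600, 0), "A-C": (0, 300), "B-C": (100, 300)}
for name, (peak, off_peak) in pairs.items():
    print(name, layer(peak_share(peak, off_peak)))

# Invented observations: (number of churned friends, did the subscriber churn?)
observations = [(0, False), (0, False), (0, True), (0, False),
                (1, True), (1, False), (2, True), (2, True)]
print(churn_probability_by_exposure(observations))
```

In practice one would compute the churn probabilities separately for each layer, which is what allows the layers to be compared as in the second plot.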

Idiro gratefully acknowledges the support of Science Foundation Ireland in this work.  Idiro works with telecommunications companies across the world, helping them with customer retention, customer acquisition etc.  To learn more about Idiro’s work on Social Network Analysis in telecommunications, or to find out how Idiro helps telcos to get better marketing results through our SNA models, contact our experts.

Idiro Analytics – proud to announce the launch of our new website

Idiro Analytics website logo

Here at Idiro Analytics we’re very excited to announce the launch of our newly redesigned website, which has gone live today at www.idiro.com.

Following our name change earlier this year, we wanted to provide our customers, and potential new customers, with a new website that more accurately represents Idiro Analytics, what we do, and how we have grown from where we began back in 2003.

We have carefully designed a new website layout that offers a better user experience and will help our customers find all the information they need about Idiro, and answer any questions they may have.

We are also extremely proud of the experience and expertise of our staff and we wanted to showcase this. Our company page now shows more information about our company, and the people in charge, who are always driving Idiro Analytics forward.   

Our CEO, Aidan Connolly, said:

“I am very pleased to introduce our new Idiro Analytics website. This new website reflects where Idiro is today and our company’s evolution over the past twelve years. We work hard to always stay ahead in an ever-changing technology world,  and this new website is another representation of that work. On behalf of everyone here in Idiro Analytics, I’d like to take this opportunity to thank you for your continued support.”

Why you should not buy an Irish lottery ticket until the weekend

Irish lotto

Buying an Irish lottery ticket for the weekend draw? Hold on a minute.

Here’s why you should not buy an Irish lottery ticket until the weekend.  This month, changes take effect in Ireland’s national lottery game: two numbers are being added, so we now pick from 47 numbers. The odds of each row winning will now be just under one in eleven million. Expect more rollover draws, bigger wins and fewer winners.  And if you buy a Saturday ticket on the previous Sunday, you have a bigger chance of being murdered before the draw than you have of winning the big prize.

Bear with me.  There were 52 murders in Ireland last year. Therefore, the overall odds of being murdered in any given week are one in 4.5 million (one person per week, out of Ireland’s population of circa 4.5 million).  There were 196 road deaths last year, giving you average weekly odds of being killed on the roads of one in 1.2 million.  Let’s not talk about how many die of heart disease…

Let’s say you buy a 2-line ticket on a Sunday for the following weekend lottery draw.  You have a higher chance of being murdered (all things being equal) by the weekend than you have of your numbers coming up for the big prize on Saturday.  Furthermore, you have a much higher chance of dying on Ireland’s roads than of winning.

And worse, if you win the lottery, someone else may have the same numbers, meaning you have to share the prize money, reducing the benefit. If you die next week, however, it doesn’t matter how many others share your fate – there is no upside to dying in company.

Here’s the good news: your chance of being murdered or dying on the roads before the draw falls steadily as you get closer to Saturday – but your odds of winning Saturday’s lottery stay the same, at about eleven million to one per line.
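If you'd like to check the sums, here is a small Python sketch of the reasoning. It assumes six numbers are picked from the 47 (the post only gives the 47), that a ticket has two lines as in the example above, and that the risk of dying is spread evenly over the days of the week, which is of course a simplification.

```python
from math import comb

POPULATION = 4_500_000          # circa, as quoted above
MURDERS_PER_YEAR = 52
ROAD_DEATHS_PER_YEAR = 196
WEEKS_PER_YEAR = 52

combinations = comb(47, 6)      # 10,737,573 possible rows
p_win_two_lines = 2 / combinations

p_murdered_week = MURDERS_PER_YEAR / WEEKS_PER_YEAR / POPULATION
p_road_death_week = ROAD_DEATHS_PER_YEAR / WEEKS_PER_YEAR / POPULATION

def one_in(probability):
    """Express a probability as 'one in N'."""
    return round(1 / probability)

print(f"Jackpot, one line:   1 in {combinations:,}")
print(f"Jackpot, two lines:  1 in {one_in(p_win_two_lines):,}")
print(f"Murdered in a week:  1 in {one_in(p_murdered_week):,}")
print(f"Road death in week:  1 in {one_in(p_road_death_week):,}")

# The risk of dying before the draw shrinks as the draw gets closer;
# the chance of winning does not change.
for days_left in range(7, 0, -1):
    p_murdered = p_murdered_week * days_left / 7
    more_likely = "murder" if p_murdered > p_win_two_lines else "jackpot"
    print(f"{days_left} day(s) before the draw: {more_likely} is more likely")
```

The loop shows where the balance tips for a two-line ticket under these assumptions.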

So keep your Saturday lotto draw money in your pocket, at least till the weekend!

p.s. your chance of winning the big Euromillions prize is one in 116,531,800 – so buy that ticket as close to the draw as you can. Good luck!
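The Euromillions figure can be checked in the same way. The sketch below assumes the game format of five main numbers from 50 plus two Lucky Stars from 11, which is not stated in this post.

```python
from math import comb

# Assumed format: 5 main numbers from 50, plus 2 Lucky Stars from 11.
euromillions_combinations = comb(50, 5) * comb(11, 2)
print(f"1 in {euromillions_combinations:,}")  # prints 1 in 116,531,800
```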