How We Do It

Idiro’s social network analytics platform relies on complex processing to summarize, model and evaluate the behaviour of individuals through their communication patterns and observed actions. The principal challenge Idiro faces in performing this analysis is the sheer size of the input data. Social networks comprising many hundreds of millions of actors are not unknown, with each actor often having dozens of links that together form complex, large-scale communication graphs. Recording, parsing and evaluating these graphs would be time-consuming using current off-the-shelf data mining tools, and would likely become infeasible given the processing and memory limitations of even current enterprise-grade servers. To overcome the challenge of scale, Idiro uses a Hadoop-based distributed processing environment to process even the largest social networks. We have developed a number of bespoke data mining tools and algorithms specific to social network analysis, and we leverage cluster-based processing to increase throughput. In addition, Idiro gives customers the option of using multiple graph-partitioning, clustering, link-traversal and community-specific predictive models, suited to particular domains and specific needs.

Currently Idiro works with very large mobile network operators in Europe, America and Asia, who provide us with detailed data for periodic analysis, e.g. every week, month or quarter. The volume of data typically required for analysis has driven Idiro’s adoption of the Hadoop cluster framework, to gain the advantages of a distributed processing environment in handling issues of scale, file throughput and distributed graph processing. Hadoop offers linear scalability on commodity hardware servers, meaning that growth in the subscriber base, or a requirement to reduce processing times, can be addressed by adding more hardware to the existing cluster. In addition, Idiro offers customers the possibility of processing truly vast social networks that, because of their size, cannot be stored and processed in memory on single-server installations. Lastly, the Idiro Hadoop framework supports many open-source, widely supported plug-ins, such as Oozie, Mahout, Giraph and Hive, to enable enterprise-grade performance, scheduling, data manipulation and fail-over for limited capital outlay.

Usually, our customers provide link-level and action data associated with their subscribers, most often as a database table or flat-file output. This input data is loaded into the Idiro platform, then adjusted, summarized and processed according to a customer-specified schedule. The output generally takes the form of probabilistic scores associated with subscribers over one or more actions (or combinations thereof), which can be loaded back automatically into the customer’s data warehouse. Idiro provides consultancy services on top of the basic analysis, to enable our customers to make full use of our results within campaigns or subscriber clustering. Our technology roadmap includes more complex graph analysis algorithms, agent-based analytics, a real-time query interface, and interactive graph visualization and query functionality.