Analyze mobile data usage patterns to advise customers on how to lower their subscription costs. Segment customers, identify those who are about to churn and the likely reasons why, and estimate the time to churn to see whether intervention is possible.
A mobile virtual network operator in North America wanted to understand their customers better. First, they were interested in how customers use mobile data in detail. How do different apps use foreground and background data, what are typical usage patterns, and how can data usage be optimized using available settings? A large group of volunteers had all of their app-level usage data collected over several months, which gave us a unique and comprehensive data set to work with. In a subsequent project, they wanted to better understand why and when customers churn. This is a problem with an interesting twist: you have to explicitly handle censoring in the data. In the long term, all customers eventually churn; for those who are still active, you simply have not observed it yet. From a business perspective, you would also like to know how much time is left until the predicted cancellation, so you can try to influence the decision where possible.
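To illustrate why censoring matters, here is a minimal sketch of the Kaplan-Meier estimator, a standard way to estimate retention over time when some customers have not churned yet. It is not the model built in the project. The function name, the tenure values, and the churn flags are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Estimate the survival curve from right-censored tenures.

    durations: tenure of each customer (e.g. in months)
    observed:  1 if the customer churned at that tenure, 0 if the
               customer was still active when the data was cut off
    Returns a list of (time, estimated share still subscribed).
    """
    churned = Counter(t for t, e in zip(durations, observed) if e)
    leaving = Counter(durations)  # churned or censored at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = churned.get(t, 0)
        if d:
            # Only customers still under observation count as "at risk".
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical data: eight customers, four of them still active (censored).
durations = [2, 3, 3, 5, 6, 6, 8, 8]
observed = [1, 1, 0, 1, 0, 1, 0, 0]

for month, share in kaplan_meier(durations, observed):
    print(f"month {month}: {share:.2f} still subscribed")
```

Censored customers stay in the at-risk count until their last observed month and then drop out without counting as churned. Averaging only the tenures of customers who already churned would ignore them entirely and understate how long customers typically stay.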
We analyzed customers' data usage at a per-device and per-app level, totaling hundreds of millions of data points. We first reconstructed device state and per-application traffic, and separated background from foreground data usage, using log files of the monitoring app and accounting for connectivity changes as a user moves around. We created hourly, daily and weekly summary statistics, converting UTC timestamps to local time at the device's actual location. The data exploration and analysis were done entirely in (large) memory, as PostgreSQL turned out to be too slow for this purpose. In a second-phase project, we investigated customer churn. For this purpose, we combined geospatial location data with usage, billing, and customer support data to segment customers. We then developed a proof-of-concept churn model, as well as an interpretability layer to identify which attributes of an individual customer the churn model considers predictive. We built a dashboard application to allow interactive exploration of the data and model predictions, even by non-technical users. We also created a survival model that estimates how soon a customer will churn and correctly accounts for the bias introduced by the right-censored nature of the data.
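The daily summaries mentioned above depend on assigning each record to the correct local day before summing. The following is a minimal sketch of that step, not the project's actual pipeline. The record layout, the device ID, and the assumption that each record already carries a UTC offset in minutes (resolved from the device's location) are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def daily_usage(records):
    """Sum bytes per (device, local date, foreground/background).

    Each record is a hypothetical tuple:
    (utc_timestamp_iso, utc_offset_minutes, device_id, is_foreground, n_bytes)
    """
    totals = defaultdict(int)
    for ts, offset_min, device, is_foreground, n_bytes in records:
        # Timestamps are logged in UTC; attach the zone explicitly.
        utc = datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)
        # Shift to the device's local time so usage lands on the right day.
        local = utc.astimezone(timezone(timedelta(minutes=offset_min)))
        kind = "foreground" if is_foreground else "background"
        totals[(device, local.date().isoformat(), kind)] += n_bytes
    return dict(totals)


# Hypothetical log excerpt for one device at UTC-5.
records = [
    ("2019-03-01T18:00:00", -300, "dev-1", True, 5000),
    ("2019-03-02T03:30:00", -300, "dev-1", False, 1200),
    ("2019-03-02T04:10:00", -300, "dev-1", False, 800),
    ("2019-03-02T06:00:00", -300, "dev-1", True, 300),
]

for key, total in sorted(daily_usage(records).items()):
    print(key, total)
```

Without the shift to local time, the two late-evening background transfers would be attributed to the following day. A fixed offset keeps the sketch self-contained; handling daylight saving time properly would call for named time zones, for example through the standard `zoneinfo` module.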
The investigation of data usage patterns led to interesting insights into how apps - at that point in time - handled data transfers depending on network connectivity. These insights could be used in blog posts to inform users about simple changes in application settings that would reduce their monthly data bill. The customer segmentation and churn modeling helped answer questions such as whether and how signal coverage issues influence customer retention.