Direct-to-Consumer Brand

Model how likely customers are to purchase, given data across multiple engagement channels. Uncover pain points of competing products in the marketplace, and understand latent needs of consumers to guide product development.

Quality outdoor furniture is not a casual purchase. Customers are likely to do extensive research and eventually buy weeks, or even months, after they first arrive on your website. You will encounter the same customers on multiple devices, often without being able to connect those touch points back to the same person: the first interaction might be on a mobile device, arriving at your website via an ad on Instagram, while the eventual purchase happens on their partner's desktop. Can customers who are likely to purchase be identified early on? Can you make a probabilistic assumption that you are seeing the same household interacting with your website? Which typical pain points do reviews of competing products reveal, and can you build better products with a superior customer service experience? Can you inform product development by identifying unmet needs, using open source data?

We built a purchase probability model pipeline based on scikit-learn and XGBoost, enriched with geolocation-linked census data and a SHAP interpretability layer, to identify customers with a high purchase probability. The model can be deployed into production via Docker and Ray Serve. Product reviews served as input for topic modeling and sentiment analysis to uncover common themes in both positive and negative reviews. We created a custom training data set, using zero-shot topic modeling as a filtering mechanism for 600+ GB of text data. The filtered data was used to adapt a pre-trained GPT-2 natural language generation model and to build a tool that helps both to ideate product descriptions and to discover latent consumer needs.
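
To illustrate the filtering step, here is a minimal sketch of a streaming filter that keeps only documents scoring above a threshold for at least one candidate topic. It is not the project's actual code: the keyword-based scorer is a deliberately simple stand-in for a zero-shot classifier, and the topic labels, keywords, threshold, and function names (keyword_score, filter_corpus) are all assumptions made for this example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical topic labels with hand-picked keywords; purely illustrative.
TOPIC_KEYWORDS: Dict[str, List[str]] = {
    "outdoor furniture": ["patio", "chair", "table", "cushion", "teak"],
    "weather resistance": ["rain", "rust", "fade", "sun", "waterproof"],
}

Scorer = Callable[[str, List[str]], Dict[str, float]]


def keyword_score(text: str, labels: List[str]) -> Dict[str, float]:
    """Stand-in scorer: share of a label's keywords found in the text.

    A zero-shot model would replace this function and return one
    relevance score per candidate label (assumed interface).
    """
    words = set(text.lower().split())
    return {
        label: sum(kw in words for kw in TOPIC_KEYWORDS[label])
        / len(TOPIC_KEYWORDS[label])
        for label in labels
    }


def filter_corpus(
    docs: Iterable[str],
    score_fn: Scorer,
    labels: List[str],
    threshold: float = 0.4,  # assumed cut-off, would need tuning
) -> Iterator[str]:
    """Lazily yield documents whose best topic score reaches the threshold.

    Working on an iterator means the corpus never has to fit in memory.
    """
    for doc in docs:
        scores = score_fn(doc, labels)
        if scores and max(scores.values()) >= threshold:
            yield doc


sample_docs = [
    "The teak chair and table survived a full season on our patio",
    "Quarterly earnings beat analyst expectations again",
    "After one summer of sun and rain the frame started to rust",
]

kept = list(filter_corpus(sample_docs, keyword_score, list(TOPIC_KEYWORDS)))
for doc in kept:
    print(doc)
```

Because the scorer is passed in as a function, the same loop could wrap a real zero-shot model without changing the filtering logic. The generator design matters more than the scorer here: at hundreds of gigabytes, documents have to be processed as a stream.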

One of the most important variables driving model predictions, according to our interpretability layer, could be influenced directly in production. This led to an immediate lift in conversions, even without the full model being deployed. We were able to identify both typical pain points and desired properties for products in the space, providing valuable inputs for creative design and the product roadmap.