πŸ“Š Business Context


What is it about?
  • The delay between purchases is the average number of days between two orders for each customer.
  • We define the delay between purchases with following hypothesis:
    • This delay is computed for all customers that have at least 2 orders, even if the number is not significant when the customer made only 2 orders.
    • We consider all orders available, whatever the channel (Ecommerce, physical stores, etc.).
What are the main use cases?
  • Reactivation > I want to reactivate customers for which the time since last purchase is higher than the delay between purchases.
  • Analysis > I want to analyse the distribution of delay between purchases, and perhaps see if some products categories or first order channels have impact on it.
What business impact can I expect?
  • Impact on: Retention, LTV
  • Level of impact: ⭐️⭐️️

↔️ Inputs & Outputs


Data input
You will need
  • A Customers dataset with following attributes
    • customer_id: unique identifier of the customer who made the purchase.
  • An orders dataset with at least the following attributes :
    • order_id: unique identifier of the transaction.
    • order_date: date at which the purchase was made.
    • customer_id: unique identifier of the customer who made the purchase.
User Settings
  • There is no setting for this scoring.
Expected output
  • The output of this recipe is a new Dataset containing a row for each of your contacts who made at least one purchase.
  • In the Dataset table, you will have a new attribute avg_delay_between_orders :
    • For all customers who made a single purchase, this attribute will have a NULL value.
    • For customers who made at least two purchases, this attribute will contain the rounded average number of days between purchases.
  • You can then make a join between your new Dataset and your existing Contacts Dataset to map your avg_delay_between_orders

πŸ–₯️ Implementation in Octolis


1. Prepare your input data
Make sure that you have built an Dataset containing all your customers, and another one with all your orders, with all the attributes needed.
In case you have several order Sources (e.g. eShop & PoS) you need to group all sources into an orders master Dataset.
2. Apply the SQL Template β€œDelay between purchases”
  • Create a new Dataset using SQL expert mode.
  • Copy / past the following SQL template.
  • After pasting the template, make sure to replace the column names in the query with the names of your actual columns.
    • contact_masterid : column in your orders Dataset that contains the unique identifier of the contact who made the order.
    • created_at : column that contains the date of each order.
    • total_amount : column that contains the amont spent by the customer for each order.
    • category : column that contains the type of each order (usually in Octolis : online / offline / abandoned_cart).
    • completed : column that indicates if the order has been completed or not (sometimes usefull for online orders that can be stored in DB before being paid).
  • Add your delay between purchases Dataset as a source of your Contacts Dataset, map the needed fields to import them into your Contacts Dataset.
sql
/* Delay between purchases SQL Template */ WITH with_previous_date AS (SELECT "contact_id", "created_at"::TIMESTAMP, "category", (LAG("created_at"::TIMESTAMP, 1) OVER (PARTITION BY "contact_id" ORDER BY "created_at"::TIMESTAMP)) AS "previous_order_date", (LAG("created_at"::TIMESTAMP, 1) OVER (PARTITION BY "contact_id", "category" ORDER BY "created_at"::TIMESTAMP)) AS "previous_order_date_per_category" FROM raw.orders /* Optional filters depending on your dataset */ WHERE "total_amount" > 0 AND "completed" IS TRUE) SELECT "contact_id", ROUND(AVG(DATE_PART('day', "created_at" - "previous_order_date"))) AS "avg_delay_between_orders", ROUND(AVG(DATE_PART('day', "created_at" - "previous_order_date_per_category")) FILTER (WHERE "category" = 'online')) AS "avg_delay_between_orders_online", ROUND(AVG(DATE_PART('day', "created_at" - "previous_order_date_per_category")) FILTER (WHERE "category" = 'offline')) AS "avg_delay_between_orders_offline" FROM with_previous_date GROUP BY "contact_id";