March 24, 2021

Using Transactional Payment Data to Predict Real-World Footfall



Overview

As consumer behaviour becomes increasingly trackable through digital payments, transactional datasets have emerged as a powerful proxy for physical foot traffic, especially when transactions are linked to specific locations, times, and merchant types. Arest's dataset, which captures rich, anonymised point-of-sale (POS) activity, offers a granular view of where and when consumers are active, without requiring personal data.

Traditional approaches to targeting often rely on aggregated demographic data or limited purchase history. However, merchant-level transactional data—derived from card payment ecosystems—offers a near-real-time, high-resolution view into consumer spending patterns across geographies and merchant types.

This article examines how such datasets, even in anonymised or aggregated form, can be used to design, model, and evaluate promotional offers and targeted advertising strategies.

The dataset in question consists of anonymised transactional records summarised by merchant. Each entry captures the key metrics described in the next section.

Key Data Fields Enabling Footfall Prediction

From the provided sample, the following attributes are critical for modelling footfall:

transactionDate - High-precision timestamp of each purchase.

merchantAddress, merchantCity, merchantPostalCode - Geolocation data that allows transactions to be mapped to physical store locations.

mcc, mccDesc - Merchant Category Codes that identify the type of venue (e.g. grocery store, taxi, supermarket).

transactionAmount - Acts as a signal of engagement and consumption behaviour.

terminalId - Enables differentiation between multiple terminals at the same or nearby locations (useful in malls or chains).

This structure enables a rich basis for behavioural inference without the need for personally identifiable information (PII), thus preserving user privacy.

Methodological Framework

Customer Segmentation and Behavioural Clustering

Using unsupervised learning techniques (e.g., K-means, DBSCAN), merchants or consumer segments can be clustered based on spending patterns. Features such as revenue-to-footfall ratio, frequency of low/high spend visits, or city-level density inform the design of location-sensitive offers.
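As a rough illustration of the clustering step, the sketch below runs a plain K-means (Lloyd's algorithm) over two standardised merchant features. The merchant labels, feature values, and starting centroids are all hypothetical, and the standard library is used only to keep the example self-contained; in practice a library such as scikit-learn would be the more usual choice.

```python
import math

# Hypothetical merchant features: (revenue-to-footfall ratio, weekly transaction count).
merchants = {
    "M1": (4.5, 900.0),
    "M2": (5.0, 1000.0),
    "M3": (5.5, 950.0),
    "M4": (60.0, 80.0),
    "M5": (65.0, 100.0),
    "M6": (70.0, 90.0),
}

def standardise(points):
    """Rescale each feature to zero mean and unit variance so neither dominates."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # guard against zero spread
        for c, m in zip(cols, means)
    ]
    return [tuple((v - m) / s for v, m, s in zip(p, means, stds)) for p in points]

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm from the given starting centroids; returns one label per point."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

scaled = standardise(list(merchants.values()))
labels = kmeans(scaled, centroids=[scaled[0], scaled[3]])
clusters = dict(zip(merchants, labels))
```

With this toy data the high-volume, low-ticket merchants (M1 to M3) end up in one cluster and the low-volume, high-ticket merchants (M4 to M6) in the other, which is the kind of split that could inform different offer designs.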

Uplift and Promotion Response Modelling

To assess promotional effectiveness, uplift modelling can be employed. By training models to predict differential outcomes between treated (exposed to promotions) and control groups, one can quantify the causal impact of advertising efforts. Techniques include:

  • Two-model approach (treatment vs. control),
  • Uplift trees or causal forests,
  • Counterfactual prediction using matching methods.
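
The two-model approach can be sketched in its simplest possible form: one "model" fitted on treated observations, one on controls, and uplift taken as the difference between their predictions. Here each model is just a per-segment mean, which is a deliberate simplification; the segments and figures are hypothetical, and the estimate is only causal if promotion exposure was assigned at random.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (merchant segment, exposed to promotion?, weekly transactions).
observations = [
    ("cafe", True, 130), ("cafe", True, 150),
    ("cafe", False, 100), ("cafe", False, 120),
    ("grocery", True, 510), ("grocery", True, 490),
    ("grocery", False, 500), ("grocery", False, 480),
]

def fit_segment_means(rows):
    """A deliberately simple 'model': the mean outcome per segment."""
    by_segment = defaultdict(list)
    for segment, outcome in rows:
        by_segment[segment].append(outcome)
    return {segment: mean(values) for segment, values in by_segment.items()}

# Fit one model on the treated group and one on the control group.
treated_model = fit_segment_means((s, y) for s, t, y in observations if t)
control_model = fit_segment_means((s, y) for s, t, y in observations if not t)

# Estimated uplift per segment = treated prediction minus control prediction.
uplift = {
    s: treated_model[s] - control_model[s]
    for s in treated_model
    if s in control_model
}
```

In this toy example the estimated uplift is 30 weekly transactions for cafes and 10 for grocery stores, suggesting the promotion would be better spent on the former.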

Forecasting and Simulation

Time series methods (e.g. ARIMA, Prophet, or LSTM networks) can be applied to forecast future footfall or revenue under varied promotion intensities. Such models can simulate the introduction of discounts or loyalty rewards, projecting their economic impact.
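To make the simulation idea concrete, the sketch below swaps the models named above for a much simpler seasonal-naive baseline (the average of each weekday across past weeks) and then applies an assumed multiplicative uplift on promoted days. The footfall series and the 10% uplift are hypothetical; a real study would estimate the uplift rather than assume it.

```python
from statistics import mean

# Hypothetical daily footfall for two weeks, Monday first.
history = [120, 115, 118, 125, 160, 210, 190,
           124, 117, 122, 127, 164, 214, 194]

def seasonal_naive_forecast(series, season=7):
    """Forecast the next season as the average of each position across past seasons."""
    return [mean(series[i::season]) for i in range(season)]

def simulate_promotion(baseline, uplift_by_day):
    """Apply an assumed multiplicative uplift to selected days of the baseline."""
    return [value * uplift_by_day.get(day, 1.0) for day, value in enumerate(baseline)]

baseline = seasonal_naive_forecast(history)
# Assumption: a 10% uplift on Tuesday (index 1) and Wednesday (index 2).
scenario = simulate_promotion(baseline, {1: 1.10, 2: 1.10})
extra_visits = sum(scenario) - sum(baseline)
```

Comparing `scenario` with `baseline` gives the projected additional visits for the week, which can then be multiplied by an average spend to approximate the economic impact.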

Personalised Targeting

Though this dataset lacks user-level identifiers, aggregated consumer behaviour by merchant location or type can still drive intelligent ad delivery. For example, combining transaction timing and spend data allows targeting ads during off-peak hours or directing promotions to nearby geographic zones with underutilised traffic potential.
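
One way to identify off-peak windows for ad delivery is to flag hours whose footfall falls well below the location's average hour. The sketch below assumes a hypothetical hourly profile and an arbitrary threshold of 60% of the mean; both would need tuning against real data.

```python
from statistics import mean

# Hypothetical average transactions per opening hour for one merchant location.
hourly_footfall = {9: 40, 10: 55, 11: 70, 12: 140, 13: 150,
                   14: 60, 15: 35, 16: 30, 17: 90, 18: 110}

def off_peak_hours(profile, threshold=0.6):
    """Return hours whose footfall is below `threshold` times the average hour."""
    cutoff = threshold * mean(list(profile.values()))
    return sorted(hour for hour, count in profile.items() if count < cutoff)

quiet = off_peak_hours(hourly_footfall)
```

For this profile the quiet hours are 9, 15, and 16, which would be the candidate slots for off-peak promotions.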


The following chart visualises an anonymised subset of merchant data, plotting Revenue (bars) and Footfall (line) across different merchants. This dual-axis approach illustrates how spend and visit volumes vary significantly even within the same sector:

(Note: Real merchant names have been pseudonymised for privacy and compliance purposes.)

Modelling Footfall

By aggregating this data across time and geography, one can infer:

  • Volume of unique payment events per location over time (e.g. by hour, day, week).
  • Merchant-level visit frequency, especially when comparing stores of the same brand in different locations.
  • Temporal patterns, such as rush hours, weekend peaks, or the effect of holidays on movement.
  • Surrounding behaviour, such as high transaction volumes at taxis or convenience stores near high-traffic venues (stadiums, transit hubs, shopping centres).
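
The first of these aggregations can be sketched as a simple count of payment events per location and hour. The sample rows below are hypothetical, the field names follow the attributes listed earlier, and the ISO 8601 timestamp format is an assumption about how transactionDate is stored.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows using the field names described earlier.
transactions = [
    {"transactionDate": "2021-03-01T08:15:00", "merchantPostalCode": "EC1A", "terminalId": "T1"},
    {"transactionDate": "2021-03-01T08:47:00", "merchantPostalCode": "EC1A", "terminalId": "T1"},
    {"transactionDate": "2021-03-01T12:05:00", "merchantPostalCode": "EC1A", "terminalId": "T2"},
    {"transactionDate": "2021-03-06T12:30:00", "merchantPostalCode": "SW1A", "terminalId": "T9"},
]

def footfall_by_hour(rows):
    """Count payment events per (postal code, date, hour) bucket."""
    counts = Counter()
    for row in rows:
        ts = datetime.fromisoformat(row["transactionDate"])
        counts[(row["merchantPostalCode"], ts.date().isoformat(), ts.hour)] += 1
    return counts

def weekend_share(rows):
    """Fraction of payment events that fall on a Saturday or Sunday."""
    weekend = sum(
        1 for row in rows
        if datetime.fromisoformat(row["transactionDate"]).weekday() >= 5
    )
    return weekend / len(rows)

hourly = footfall_by_hour(transactions)
share = weekend_share(transactions)
```

The same pattern extends to daily or weekly buckets by changing the key, and to per-terminal counts by adding terminalId to it.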

Practical Applications

Retail Strategy Optimisation

Retail chains and franchises can benchmark their performance by comparing transaction density and spend across locations. This allows for promotion tailoring per outlet and detection of anomalous or underperforming branches.
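
A minimal benchmarking sketch, assuming hypothetical weekly figures for five outlets of one chain: it computes average spend per transaction for each outlet and flags any outlet whose transaction count falls below half the chain median. The 50% cut-off is an illustrative assumption, not a recommended threshold.

```python
from statistics import median

# Hypothetical weekly figures per outlet of one chain: (transactions, revenue).
outlets = {
    "outlet_A": (1200, 18000.0),
    "outlet_B": (1150, 17500.0),
    "outlet_C": (1300, 19000.0),
    "outlet_D": (400, 5200.0),
    "outlet_E": (1250, 18800.0),
}

def underperformers(data, min_ratio=0.5):
    """Flag outlets whose transaction count is below `min_ratio` of the chain median."""
    typical = median(count for count, _ in data.values())
    return sorted(
        name for name, (count, _) in data.items() if count < min_ratio * typical
    )

# Average spend per transaction, a simple revenue-to-footfall benchmark.
average_spend = {name: revenue / count for name, (count, revenue) in outlets.items()}
flagged = underperformers(outlets)
```

Using the median rather than the mean keeps the benchmark from being dragged down by the very outlets it is meant to detect.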

Real-Time Deal Targeting

With integration into digital wallets or payment platforms, merchant-level transaction data can be used to trigger time-sensitive promotions (e.g., lunch offers when footfall dips below expected levels).
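
The trigger logic itself can be very small. The sketch below assumes that expected hourly footfall is already available (for instance from a historical average) and fires an offer when observed footfall drops below 80% of it; the function name, the figures, and the tolerance are all hypothetical.

```python
def should_trigger_offer(observed, expected, tolerance=0.8):
    """True when observed footfall falls below `tolerance` times the expected level."""
    return observed < tolerance * expected

# Hypothetical expected and observed lunchtime transactions per hour.
expected_by_hour = {12: 140, 13: 150}
observed_by_hour = {12: 135, 13: 95}

triggers = [
    hour for hour in sorted(expected_by_hour)
    if should_trigger_offer(observed_by_hour[hour], expected_by_hour[hour])
]
```

Here only the 13:00 slot qualifies, since 95 transactions is below the 120 that the tolerance allows, while 12:00 stays within its expected range.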

Campaign ROI Evaluation

Advertisers and marketing teams can use this data to track the revenue impact of campaigns at a local or regional level, enabling granular ROI analysis and budget reallocation.
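
One common way to approximate campaign impact from regional revenue is a difference-in-differences comparison: the revenue change in the campaign region minus the change in a comparable region without the campaign. This technique is offered here as one possible approach, and the figures are hypothetical; the ROI is computed on revenue rather than profit, which is a simplifying assumption.

```python
def incremental_revenue(treated_before, treated_after, control_before, control_after):
    """Difference-in-differences: change in campaign region minus change in comparison region."""
    return (treated_after - treated_before) - (control_after - control_before)

def campaign_roi(incremental, cost):
    """Return on investment as net incremental revenue divided by campaign cost."""
    return (incremental - cost) / cost

# Hypothetical monthly revenue before and after a campaign.
lift = incremental_revenue(
    treated_before=50_000, treated_after=58_000,
    control_before=40_000, control_after=42_000,
)
roi = campaign_roi(lift, cost=2_000)
```

In this example the campaign region grew by 8,000 against 2,000 in the comparison region, giving 6,000 of incremental revenue and an ROI of 2.0 on a 2,000 spend.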

Ethical and Compliance Considerations

While the data is anonymised, ethical handling remains paramount. All modelling efforts must comply with GDPR and data minimisation principles. Moreover, interpretability and transparency of AI models are essential when automating marketing decisions.

Conclusion

Merchant-level transactional data, when modelled effectively, serves as a powerful proxy for consumer intent and response. By leveraging this data for promotion modelling and targeted advertising, organisations can not only improve marketing efficiency but also offer more relevant, timely, and engaging consumer experiences. Future work could explore the integration of temporal and external datasets (e.g. weather, local events) to enhance model precision further.