FAQ
Q: What is the panel?
A: It is Huq’s own panel, collected on a first-party basis: we control both the distribution of the SDK that collects users’ anonymised location data (and WiFi data, which we do not actively use here) and the data pipeline that processes that data.
Additional information: Different devices may run different versions of Huq’s SDK, and both the SDK version and the OS version can affect how frequently data is collected. We apply the following filters so that these differences in collection frequency do not drastically affect the end result:
Filter for SDKs whose geo-triggered data collection behaviour contributes at least 80% of the observations of the month;
Filter for devices that have generated at least 100 observations from an SDK with geo-triggered data collection behaviour.
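As an illustration of the device-level filter (at least 100 geo-triggered observations per device), here is a minimal sketch. The record layout of `(device_id, is_geo_triggered)` pairs and the function name are assumptions made for this example, not the actual pipeline.

```python
from collections import Counter

MIN_GEO_OBSERVATIONS = 100  # threshold stated in the FAQ


def devices_passing_filter(observations, min_obs=MIN_GEO_OBSERVATIONS):
    """Return the ids of devices with at least `min_obs` geo-triggered observations.

    `observations` is an iterable of (device_id, is_geo_triggered) pairs.
    This layout is hypothetical and only serves the example.
    """
    counts = Counter(device_id for device_id, is_geo in observations if is_geo)
    return {device_id for device_id, n in counts.items() if n >= min_obs}


# Toy data: device "b" has 99 geo-triggered observations and is dropped.
observations = (
    [("a", True)] * 120
    + [("b", True)] * 99
    + [("b", False)] * 50
    + [("c", True)] * 100
)
kept = devices_passing_filter(observations)
```

Observations that are not geo-triggered do not count towards the threshold in this sketch.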
Q: What is the metric being reported?
A: It is visit duration per panel capita: the sum of the durations of all detected visits by anyone in the panel, divided by the number of people in the panel.
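The definition above amounts to a single division. The sketch below assumes durations are expressed in minutes and that the panel size is the count of people in the panel for the same period; the function name and the units are illustrative assumptions.

```python
def visit_duration_per_capita(visit_durations, panel_size):
    """Sum of detected visit durations divided by the number of people in the panel.

    `visit_durations` holds one duration per detected visit (minutes are assumed
    here); `panel_size` counts everyone in the panel, including people with no visit.
    """
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    return sum(visit_durations) / panel_size


# Three detected visits (30, 45 and 15 minutes) in a panel of 1,000 people.
metric = visit_duration_per_capita([30, 45, 15], panel_size=1000)
```

Because the denominator is the whole panel rather than the number of visitors, the metric reflects both how many people visit and how long they stay.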
Q: How big is the panel?
A: It varies over time: the panel size ranges from 40K to 200K across different time periods. The point estimate of visit duration per panel capita should not be affected by the panel size unless the panel size falls to very low numbers.
As a point of reference, Mintel only has a panel of 2K.
Q: Is there any statistical method applied?
A: None at the moment; these are point estimates. However, we are developing a statistical method based on block bootstrapping, or a variant of bootstrapping that is applicable to time-series panel data like ours.
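For readers unfamiliar with block bootstrapping, the sketch below shows one common variant, the moving block bootstrap, applied to the mean of a single daily series. It is not the method under development: the block length, the percentile interval and the treatment of the data as one time series (rather than a panel) are all assumptions chosen for illustration.

```python
import random


def block_bootstrap_ci(series, block_len, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for the mean using a moving block bootstrap.

    Contiguous blocks of `block_len` points are drawn with replacement and
    concatenated, which preserves short-range autocorrelation within each block.
    """
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    rng = random.Random(seed)
    n_starts = n - block_len + 1   # number of possible block start positions
    n_blocks = -(-n // block_len)  # ceil(n / block_len)
    means = []
    for _ in range(n_resamples):
        resample = []
        for _ in range(n_blocks):
            start = rng.randrange(n_starts)
            resample.extend(series[start:start + block_len])
        # Trim the concatenated blocks back to the original length.
        means.append(sum(resample[:n]) / n)
    means.sort()
    lower = means[int(alpha / 2 * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy daily series with a weekly pattern; block_len=7 keeps whole weeks together.
daily_metric = [float(10 + (day % 7)) for day in range(28)]
lower, upper = block_bootstrap_ci(daily_metric, block_len=7)
```

Resampling blocks instead of individual days is what distinguishes this from an ordinary bootstrap, which would break the dependence between consecutive days.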
Q: Are there any confidence bounds?
A: See above.
Q: Is there any bias?
A: There is definitely sampling bias in our dataset. We have quantified the geographic sampling bias based on home location by comparing the panel against the census, which is a more authoritative source on home location. Having quantified this bias, we remove it by up-sampling the undersampled regions and down-sampling the oversampled regions.
The regions used to quantify sampling bias are UK counties: metropolitan counties, counties, or their equivalents in Wales, Scotland and Northern Ireland. For London, all the boroughs are grouped into one region.
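One way to express the up-sampling and down-sampling described above is a per-region factor equal to the census share divided by the panel share. The sketch below computes such factors. The region names and counts are made up, and whether the actual pipeline resamples devices or applies weights is not stated here, so treat this as an assumption.

```python
def region_weights(panel_counts, census_counts):
    """Return census_share / panel_share for each region.

    A factor above 1 means the region is undersampled (up-sample it);
    a factor below 1 means it is oversampled (down-sample it).
    """
    panel_total = sum(panel_counts.values())
    census_total = sum(census_counts.values())
    weights = {}
    for region, n_panel in panel_counts.items():
        if n_panel == 0:
            raise ValueError(f"no panel members in region {region!r}")
        panel_share = n_panel / panel_total
        census_share = census_counts[region] / census_total
        weights[region] = census_share / panel_share
    return weights


# Hypothetical regions: "A" is oversampled, "B" is undersampled.
panel = {"A": 600, "B": 400}
census = {"A": 500_000, "B": 500_000}
weights = region_weights(panel, census)
```

After applying these factors, the panel's regional distribution matches the census distribution by construction.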
Q: What’s the release schedule of this product?
A: The daily signal is released with a 7-day lag.
Q: What is counted as a visit?
A: We classify a visit as a continued stay within a drawn polygon of the brand in question. The visit starts when an accurate observation is recorded within the polygon and ends as soon as the next accurate observation falls outside it. We allow some degree of inaccuracy because location data is never fully accurate.
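The start and end rule above can be sketched as a single pass over a device's time-ordered observations. Several details below are assumptions made for illustration: the `Observation` fields, the 50 m accuracy threshold, the pre-computed `inside` flag standing in for the polygon test, and the choice to end the visit at the timestamp of the first accurate outside observation.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    timestamp: float   # seconds since some reference point
    inside: bool       # result of a point-in-polygon test done elsewhere
    accuracy_m: float  # reported horizontal accuracy in metres


MAX_ACCURACY_M = 50.0  # hypothetical cut-off for an "accurate" observation


def detect_visits(observations, max_accuracy_m=MAX_ACCURACY_M):
    """Return (start, end) timestamp pairs for each detected visit."""
    visits = []
    start = None
    for obs in sorted(observations, key=lambda o: o.timestamp):
        if obs.accuracy_m > max_accuracy_m:
            continue  # inaccurate observations neither start nor end a visit
        if obs.inside and start is None:
            start = obs.timestamp  # first accurate observation inside the polygon
        elif not obs.inside and start is not None:
            visits.append((start, obs.timestamp))  # next accurate observation outside
            start = None
    return visits


track = [
    Observation(0, False, 10),
    Observation(60, True, 10),
    Observation(120, True, 15),
    Observation(150, False, 500),  # too inaccurate, ignored
    Observation(300, False, 10),
]
visits = detect_visits(track)
```

In this sketch a visit that is still open at the end of the data is not emitted; how such cases are handled in practice is not covered here.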
Q: What have you done to filter out employees working at their stores?
A: We have applied a simple time-based filter: if a person stays in a store for 3+ hours consistently, we treat them as an employee.
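A minimal sketch of such a time-based filter follows. It assumes visits are `(start, end)` pairs in seconds and applies the 3-hour threshold to each visit individually; how "consistently" is evaluated across repeated visits is not specified above, so that part is left out.

```python
EMPLOYEE_THRESHOLD_SECONDS = 3 * 60 * 60  # the 3-hour threshold from the FAQ


def drop_employee_visits(visits, threshold=EMPLOYEE_THRESHOLD_SECONDS):
    """Keep only visits shorter than `threshold` seconds.

    `visits` is a list of (start, end) timestamp pairs in seconds;
    this layout is an assumption made for the example.
    """
    return [(start, end) for start, end in visits if end - start < threshold]


visits = [
    (0, 1_800),     # 30 minutes: kept
    (0, 10_800),    # exactly 3 hours: dropped
    (100, 20_000),  # about 5.5 hours: dropped
]
customer_visits = drop_employee_visits(visits)
```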