TABS Analytics Blog

The Flaws in the NRF Retail Sales Estimate for Black Friday 2014

NRF Data 2014

 

With all of the increased activity behind Thanksgiving and Black Friday sales, it was quite a shock when the National Retail Federation (NRF) released its annual report for the four-day period starting on Thanksgiving Thursday and ending the following Sunday. After all, we were seeing a period of more retail hours, more retailer promotions and more aggressive discounts, yet the NRF estimated that sales fell a dramatic 11%. The NRF attributed the decline to drops in both shopper count (-5%) and spending per shopper (-6%).
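As a quick check on how the two reported components combine into the headline number, here is a minimal sketch. It assumes total spending is simply shopper count times spending per shopper; the function name `combined_change` is illustrative and not part of any NRF material.

```python
# Year-over-year changes as reported in the NRF release quoted above.
shopper_count_change = -0.05      # -5% shoppers
spend_per_shopper_change = -0.06  # -6% spending per shopper


def combined_change(*changes):
    """Compound several fractional changes into one overall change.

    Assumes the total is the product of the components
    (total spend = shoppers x spend per shopper).
    """
    total = 1.0
    for change in changes:
        total *= 1.0 + change
    return total - 1.0


total_change = combined_change(shopper_count_change, spend_per_shopper_change)
print(f"Combined change in total spending: {total_change:.1%}")
```

Compounded, the two declines come to a little under 11%, which matches the headline figure after rounding.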

This Black Friday weekend was built up as particularly important for breaking the pattern of sluggish consumer spending, and was even believed to be "make or break" for some of the weaker retailers such as Sears Holdings, JC Penney and Radio Shack. The 11% decline, then, was quite a slap to all of those in the industry and the media who had invested so much in this season.

While a few in the media took the time to corroborate these dour results, most simply took them and ran with them. Between the press release on Sunday afternoon and the opening NYSE bell on Monday morning, there were already plenty of articles on the failure of promotions to revitalize consumer spending, the likelihood of profit reductions due to the extended hours and deeper promotions, and the general insanity of all of these promotions.

We now see a counterinsurgency of analysts, myself included, who are challenging this estimate and trying to bring some reality back into the picture. After all, there were real consequences to the NRF estimate. Virtually every major publicly traded retailer, both online and bricks & mortar, saw a significant drop in its share price. It is also likely that several boardrooms were buzzing about what actions were needed to address these weaknesses in the market.

For over a decade the NRF has surveyed more than 4,000 respondents aged 18+ about their shopping behavior on Thursday and Friday and their expected behavior on Saturday and Sunday. They are asked whether they shopped, where they shopped, and how much they spent. These results are then aggregated, summarized, analyzed and released in a very tight window; 48 hours is hardly enough time to carefully vet such an extensive and high-profile consumer survey. Flaws and shortcuts are inevitable. Here, then, is a list of some of the larger flaws in the NRF estimates derived from this survey:

  • The results were not corroborated against other data sources. Despite what other skeptics of the results contend, there is nothing inherently flawed in the survey method used by NRF. We at TABS Group use a similar methodology for our Annual Vitamin Study. What is lacking in the NRF research, though, is any acknowledgement of, or adjustment for, the fact that the results conflicted with other, reliable estimates, particularly in the area of online sales. The problems occur when survey data is used as the sole data source for an estimate. Results must align with external sources of higher credibility and accuracy.
  • The methodology is bad at forecasting actual weekend sales. The graph above comes from High Frequency Economics. It shows that actual year-over-year results for the past five years have always been in the +2% to +5% range. Note, however, the wildly variable shifts in the NRF estimates over that time: from +18% in 2011 to -11% in 2014.
  • Their own data contradicts their conclusion that earlier promotions hurt Black Friday weekend sales. In 2013, 31% of the respondents said they had done 10% of their shopping or less. In 2014 that number was 33%. On the high end, 2013 saw 22% of the shoppers reporting that 75% of their shopping was done. That number came down to 19% in 2014. So shoppers report that they have done less, not more, of their holiday shopping so far, which refutes the notion that earlier deals pulled sales forward.
  • Their data lacks "internal validity." Checking internal validity means comparing distinct buyer groups within the results to see whether we can spot any data anomalies. Example #1: As many men shopped as women (both at 55%). This contradicts every other piece of shopping research we have ever seen, all of which shows that women are much more likely to shop.
  • Lack of internal validity Example #2: According to the NRF, shopping incidence among ages 18-34 was an extremely high 74%, which was 37% higher than the 54% registered among ages 35-54. Shoppers aged 35-54 are significantly heavier shoppers than those aged 18-34, and that behavior is accentuated during the holidays. Data from the US Census shows that almost 1/3 of people in the younger age group still live with their parents, so it is inconceivable that they are the real "go-getters" for getting their holiday shopping out of the way, as their gift-giving requirements are much lower.
  • Lack of internal validity Example #3: Using NRF's own data on store count and shopper count, Department Stores grabbed about 47% more shoppers than Discounters such as Walmart, Target and Costco (67.9MM vs. 46.2MM) with 31% fewer stores (4,612 vs. 6,639). Given the dominant market share that Walmart and Costco have in most of the areas where they compete, this is another stat that does not stand the test of common sense.
  • They punted on adjusting their forecast for the total season despite their results from Black Friday weekend. While it is true that many other days ahead will be heavier shopping days than the Thanksgiving weekend, an 11% year-over-year decline is material and should definitely affect the forecast, even if only modestly. The fact that the +4.1% growth forecast for the season remains absolutely unchanged is clear evidence that even the NRF doesn't place a lot of credence in the -11% figure.
  • Price and transaction sizes are highly inaccurate when measured by survey. It is plausible that the NRF figure for traffic (-5%) is directionally accurate, as declining traffic has been a trend in retail prior to this weekend. However, stating a -6% change in transaction size is reckless. Almost no researcher would place high reliance on a consumer's self-reported aggregate level of spending over an extended period.
  • No accounting for the 10-17 age group. Goodness knows there is enough breathless information written about the remarkable spending power of teens in every other part of the year, but they are notably absent from this study. Are we to believe that shopping incidence is 0% among ages 10-17 and then spikes to 74% among ages 18-34? There is likely a material amount of purchasing done by this age group that should be accounted for in the study.
  • Not clear the sample is representative. Given all of the validation problems cited, I assume that the sample is not representative, but that cannot be determined from the data provided by the NRF. No sample sizes are provided for the demographic groups, so we can't confirm whether or not the sample is representative of the 18+ population.
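The internal validity comparisons in the list above are simple ratios, and they can be reproduced from the quoted figures. The sketch below is only a back-of-the-envelope check: the variable names are illustrative, and the shoppers-per-store comparison is a derived figure of my own, not one published by the NRF.

```python
# Figures as quoted in the list above (NRF data cited in this post).
incidence_18_34 = 0.74          # shopping incidence, ages 18-34
incidence_35_54 = 0.54          # shopping incidence, ages 35-54
dept_shoppers_mm = 67.9         # Department Store shoppers, millions
disc_shoppers_mm = 46.2         # Discounter shoppers, millions
dept_stores = 4612              # Department Store count
disc_stores = 6639              # Discounter store count


def pct_diff(a, b):
    """Relative difference of a versus b, as a fraction (0.10 means a is 10% higher)."""
    return a / b - 1.0


age_gap = pct_diff(incidence_18_34, incidence_35_54)
shopper_gap = pct_diff(dept_shoppers_mm, disc_shoppers_mm)
store_gap = pct_diff(dept_stores, disc_stores)

# Derived check (an assumption of this sketch): implied shoppers per store.
dept_per_store = dept_shoppers_mm * 1e6 / dept_stores
disc_per_store = disc_shoppers_mm * 1e6 / disc_stores
per_store_ratio = dept_per_store / disc_per_store

print(f"18-34 incidence vs. 35-54:           {age_gap:+.0%}")
print(f"Department Store shoppers vs. Disc.: {shopper_gap:+.0%}")
print(f"Department Store count vs. Disc.:    {store_gap:+.0%}")
print(f"Shoppers per store, Dept. vs. Disc.: {per_store_ratio:.1f}x")
```

With the rounded figures as quoted, the shopper gap comes out at about 47%. Taken at face value, the numbers imply that the average department store drew roughly twice as many shoppers over the weekend as the average discounter, which is the kind of result that should trigger a second look before release.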

So in summary, if this study is going to continue to have such a significant influence on the business landscape, major steps will need to be taken to clean up the research. A few suggestions: 1) take an extra day before releasing the results to conduct the necessary validation work, 2) balance the sample to the US population, 3) report only traffic and penetration, and omit any information about transaction amounts, 4) release additional data to facilitate deeper validation and review, 5) include ages 10-17 in the research to determine if they are a meaningful source of sales, and 6) provide the necessary qualifiers and caveats to the research instead of treating it as hard and fast truth.