Time Series Analysis and Web Scraping for a Restaurant
Skills
Web Scraping, BeautifulSoup, Time Series, Prophet, Pandas, Data Visualization, Plotly, Seaborn
Objective
Here’s a fun one, cross-posted with my brother’s site. I got a project from a large restaurant ($2M+ in annual sales) in the San Diego area (Lemon Grove) to analyze their sales records, find areas where they might be able to bring in more customers, and predict revenue, since that correlates with how much food they have to order. I was only given total daily revenue numbers for 6 years (this restaurant never closes), so I took it upon myself to see what other data I could scrape using BeautifulSoup and include in the analysis.
Results
We can predict daily sales with a reasonable mean absolute percentage error (MAPE) of 11%.
Super Bowl Sunday has a much larger effect than originally thought. One way to win customers back is to put the game on a TV and advertise Super Bowl-style food for the day. The day costs the restaurant a measurable $2,000+ per year, which a new TV, a one-time cost of around $500, could help offset.
By exploring the outliers, I also found that Mexican Mother's Day has an even larger effect than Mother's Day in the US. The restaurant is in a predominantly Hispanic area, so it would do well to cater to the holiday and switch to holiday pricing.
Credit use is increasing over time.
Each inch of rain essentially cuts $500 off sales for the day.
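For reference, mean absolute percentage error averages the absolute forecast error as a fraction of actual sales. Here's a minimal sketch of the calculation. The function name `mape` and the sales figures are made up for illustration and are not the restaurant's data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a percentage."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    # Each term is the absolute error relative to that day's actual sales.
    # Dividing by actual sales is safe only when no day has zero revenue.
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical daily sales in dollars (illustrative only).
actual_sales = [5000.0, 6000.0, 8000.0, 4000.0]
predicted_sales = [4500.0, 6600.0, 7200.0, 4400.0]

# Every forecast here is off by 10% of the actual value,
# so the result is about 10.
example_mape = mape(actual_sales, predicted_sales)
print(f"MAPE: {example_mape:.1f}%")
```

Because the restaurant never closes, there are no zero-revenue days to break the division, which is part of why MAPE is a convenient metric for this data.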
Note: I’d skip to Time Series Analysis with Prophet for the cool stuff. Prophet is essentially a package for automated curve fitting: it fits an additive model of trend, seasonality, and holiday effects, similar in spirit to the Holt-Winters technique. The non-stationary analysis is mostly me playing around with visualization packages to validate that Prophet works. Here’s the GitHub for the weather scraping code.
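To give a rough idea of how the holiday and rain findings could be fed into Prophet, here's a hedged sketch. It is not the code from this project. The helper names (`build_holidays`, `fit_and_forecast`), the `rain_inches` column, and the choice of years are all assumptions for illustration. It assumes the `prophet` package (older releases were published as `fbprophet`) and a daily sales frame with Prophet's standard `ds` and `y` columns.

```python
import pandas as pd

# Super Bowl game dates, which have to be looked up year by year.
# The years covered here are an assumption, not the project's actual range.
SUPER_BOWL_DATES = {
    2014: "2014-02-02",
    2015: "2015-02-01",
    2016: "2016-02-07",
    2017: "2017-02-05",
    2018: "2018-02-04",
    2019: "2019-02-03",
}


def build_holidays(years):
    """Build a holiday table with the 'holiday' and 'ds' columns Prophet expects."""
    rows = []
    for year in years:
        if year in SUPER_BOWL_DATES:
            rows.append({"holiday": "super_bowl", "ds": SUPER_BOWL_DATES[year]})
        # Mexican Mother's Day falls on May 10 every year.
        rows.append({"holiday": "mexican_mothers_day", "ds": f"{year}-05-10"})
    holidays = pd.DataFrame(rows)
    holidays["ds"] = pd.to_datetime(holidays["ds"])
    return holidays


def fit_and_forecast(sales, holidays, future_rain):
    """Fit Prophet with holidays and a rain regressor, then forecast.

    sales:       DataFrame with columns ds, y, rain_inches (history)
    future_rain: DataFrame with columns ds, rain_inches (dates to forecast)
    """
    from prophet import Prophet  # imported lazily; requires the prophet package

    model = Prophet(holidays=holidays)
    model.add_regressor("rain_inches")  # hypothetical column name
    model.fit(sales)

    # Extra regressors must be present for every date being predicted.
    future = pd.concat(
        [sales[["ds", "rain_inches"]], future_rain[["ds", "rain_inches"]]],
        ignore_index=True,
    )
    return model.predict(future)


holidays = build_holidays([2018, 2019])
print(holidays)
```

One practical catch with a rain regressor is that forecasting future sales then requires a rain forecast for those dates too, so the sales prediction can only be as good as the weather prediction it depends on.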