Identifying High Profit Movies

If I were a movie studio looking to fund some movie projects, would I be better off funding smaller, low-budget films? Or would I still be better off going for the next Avatar?

It’s no secret that the Hollywood movie industry is driven by blockbuster hits. It’s getting to the point where people are predicting doom and gloom over inflating movie budgets, especially after the recent wave of summer blockbusters with “disappointing” box office performance. It’s easy to see why Hollywood is so fixated on the summer blockbuster if you look at a scatter plot of box office revenue versus budget for movies from the past decade:

blockbuster scatter

The dotted line is the break-even point, which is generally accepted to be twice a movie’s production budget. Obviously the biggest hits of recent years, like The Avengers and Avatar, bring in billions in revenue, but I wondered just how many of these movies are languishing below that break-even line.

probability breakeven

Not looking good. Only a little over 50% of movies in the past decade (2002 to 2013) made enough revenue to reach the break-even line. Movie studios can’t afford to keep losing money on huge, expensive blockbuster projects, so I began to wonder if it was possible to identify a less risky population of movies. If we separate the “risky” high-budget movies from the lower-budget movies (<$10 million), the picture still doesn’t improve much:

Movies with a production budget under $10 million are only about 6% more likely to reach that break-even point.
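
As a rough sketch of how that comparison can be computed, the snippet below applies the twice-the-budget break-even rule and splits movies at the $10 million mark. The movie tuples are made-up placeholder numbers, not the data behind the charts, and the function names are my own for illustration.

```python
# Hypothetical (budget, revenue) pairs in dollars; NOT the real dataset.
movies = [
    (5_000_000, 40_000_000),
    (8_000_000, 9_000_000),
    (2_000_000, 4_000_000),
    (150_000_000, 500_000_000),
    (200_000_000, 250_000_000),
    (60_000_000, 90_000_000),
]

LOW_BUDGET_CUTOFF = 10_000_000  # the <$10 million split used in the post


def breaks_even(budget, revenue):
    """A movie breaks even once revenue reaches twice its production budget."""
    return revenue >= 2 * budget


def break_even_share(rows):
    """Fraction of (budget, revenue) rows that reach the break-even line."""
    if not rows:
        return float("nan")
    return sum(breaks_even(b, r) for b, r in rows) / len(rows)


low = [m for m in movies if m[0] < LOW_BUDGET_CUTOFF]
high = [m for m in movies if m[0] >= LOW_BUDGET_CUTOFF]

print(f"low budget:  {break_even_share(low):.0%} break even")
print(f"high budget: {break_even_share(high):.0%} break even")
```

Swapping in real budget and revenue figures for the placeholder list would give the two percentages being compared here.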

I also wondered if genre had an effect on movie success. Plotting average movie revenue and budget for each movie genre in the two groups paints a very interesting picture:


Not surprisingly, the Action and Adventure genres have the highest average revenue among the high-budget movies. They also have the highest average budgets. Among the low-budget movies, Horror and Romantic Comedy immediately jump out as having amazing revenue returns. Another way of looking at the same data is to plot average and median profitability (as defined by the break-even line):


Here, profit ratio is defined as revenue minus twice the budget, divided by the budget. On average, low-budget horror movies are by far the most profitable movies when normalized by production budget. If we look at the median values to avoid the effects of blockbuster hits that might skew the average, low-budget horror is still one of the top-performing genres (second only to concert). These charts also show some very risky categories, such as high-budget documentaries and westerns, which generally don’t return your production investment.
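
To make the mean-versus-median point concrete, here is a small sketch. It assumes the profit ratio means (revenue - 2 × budget) / budget, which is my reading of the definition above, and it uses invented numbers rather than the real movie data.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical (genre, budget, revenue) rows in millions of dollars; NOT real data.
movies = [
    ("Horror", 1, 50),
    ("Horror", 5, 8),
    ("Horror", 3, 9),
    ("Drama", 8, 20),
    ("Drama", 6, 6),
    ("Drama", 4, 10),
]


def profit_ratio(budget, revenue):
    """(revenue - 2 * budget) / budget: zero sits exactly on the break-even line."""
    return (revenue - 2 * budget) / budget


# Collect the profit ratios of every movie under its genre.
ratios = defaultdict(list)
for genre, budget, revenue in movies:
    ratios[genre].append(profit_ratio(budget, revenue))

for genre, values in ratios.items():
    print(f"{genre}: mean={mean(values):.2f}, median={median(values):.2f}")
```

In this toy data a single runaway horror hit pulls the mean far above the median, which is the kind of skew the median chart is meant to guard against.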

As an exercise in linear regression, I fit a Ridge Regression model in scikit-learn to the low-budget movie data using budget and several categorical features (movie rating, critic/audience scores, and genre):


With a model R² score of 0.34878, the fit isn’t the greatest, so I would hesitate to use this model for predicting movie success. However, looking at the coefficient values, we can see some interesting correlations. A larger positive coefficient means that, according to the model, that feature is associated with higher movie revenue. Once again the Horror genre correlates well with revenue, and the R rating jumps out as well.
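
For readers who haven’t used it, a fit of this kind might look roughly like the sketch below. It is not the original analysis: the rows are invented, the categorical features are reduced to two hand-made 0/1 columns, and the score is computed on the training data, which may differ from how the R² above was measured.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical rows: budget ($M), is_R_rated, is_horror -> revenue ($M). NOT real data.
feature_names = ["budget", "rating_R", "genre_Horror"]
X = np.array([
    [1.0, 1, 1],
    [5.0, 1, 1],
    [3.0, 0, 0],
    [8.0, 0, 0],
    [6.0, 1, 0],
    [4.0, 0, 1],
    [2.0, 1, 1],
    [9.0, 0, 0],
])
y = np.array([50.0, 30.0, 4.0, 12.0, 10.0, 15.0, 40.0, 9.0])

# alpha is the L2 penalty strength; 1.0 is scikit-learn's default.
model = Ridge(alpha=1.0)
model.fit(X, y)

print(f"R^2 on the training data: {model.score(X, y):.3f}")
for name, coef in zip(feature_names, model.coef_):
    print(f"{name}: {coef:+.2f}")
```

The printed coefficients are what gets compared across features. Since budget is in dollars and the other columns are 0/1 flags, the features would normally need to be put on a comparable scale before reading much into the relative sizes.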

I wasn’t the only person having difficulty building a predictive model. Other people in Metis Data Science also reported problems building an accurate predictor for movie success with the data we had. They had tried things ranging from genre combinations to different release windows during the year.

If we look closer at the low-budget end of the box office scatter plot, we can see that the horror genre just dominates the space. With few exceptions, all the hits in this space are horror films.

So what are the main takeaways?

  • Building a linear predictor for movie success isn’t always clear-cut. I suspect a linear model just isn’t a good fit for skewed data like movie revenue, where a few outliers perform far better than everything else.
  • Low-budget horror movies are extremely profitable. They don’t return as much raw revenue as something like Avatar, but being cheap to make means you can always churn out more sequels or try again if you don’t succeed.
  • The majority of the successful horror movies are R-rated, so if you’re going that route, make sure to include the gore in your film.
Written on November 4, 2014