Methods
Assembling a larger sample of movies
Members of our lab have studied many quantitative aspects of movies, incrementally increasing the sample size as we have progressed. Much of this is discussed and reviewed in Cutting (2016a). Cutting et al. (2010) analyzed the shot-duration patterns of 150 English-language, feature-length, popular movies – ten each for 15 years evenly divisible by five (e.g. 1935, 1940, …, 2000, 2005). We sampled across genres and from among the most popular of these release years. Subsequently, we expanded that sample to include ten similarly chosen movies from 1915, 1920, 1925, and 1930, and ten from 2010 and 2015.
For other purposes, we had replaced ten of these movies that were longer than 2.5 h. These are thought to have different narrative properties than those under that limit (Thompson, 1999). The alternates were ten with more standard durations from the same genres and release years. Nevertheless, here we have included both the originals and the ten alternates. We also added two from Cutting, DeLong, and Brunick (2011). This aggregation, so far, yields 222 movies released over a century, 1915 to 2015. A listing of 210 of these is given in Cutting (2016b), ten more can be found in the supplementary material to Cutting et al. (2010), and two in Cutting, DeLong and Brunick (2011).
To these we added 75 separate feature-length movies made for children and explored by Brunick (2014). Three per year, these were released between 1985 and 2008 and were the highest grossing G-rated theater or direct-to-DVD releases. Two of these films overlapped with the previous aggregate, yielding a total of 73 different movies. The children’s movies have remarkably similar shot-pattern characteristics to the movies made for adolescents and adults for the same period (Brunick & Cutting: Pace and appearance in movies made for children and adults, in preparation), which provides a list of those movies.
In sum, we now had a grand total of 295 English-language, feature-length movies, almost twice that of Cutting et al. (2010). Many analyses below, however, are done on 263 movies, and some on 180, 48, and 24. Thus, the statistical power for determining effects – where α = 0.05, and d = 0.80 – is 0.99+, 0.99+, 0.99+, 0.77, and 0.46, respectively, for samples of 295, 263, 180, 48, and 24 movies. The median effect size reported in this article is d = 0.72.
Measuring time-series power spectra
With one exception, we followed the methods used by Cutting et al. (2010). We created a vector consisting of the linear sequence of shot durations in each movie. All movies in this sample had between 188 and 3235 shots, as determined in previous research. The movies were between 49 and 204 min in duration. The Academy of Motion Picture Arts and Sciences defines a feature-length film as one lasting at least 40 min. However, except for the silent movies (1915–1925, mean duration = 81 min) and the children’s movies (mean duration = 89 min) in this sample, the mean duration of the other feature movies is quite constant at about 110 min from 1930 to 2015. Again, the values in these shot-duration arrays were then normalized for each movie.
The next step entailed Fourier analysis; this was accomplished by fitting phase-shifted sine waves to successive and successively larger segments (windows) of the shot vector. The lengths of these shot windows were powers of two – 2 shots, 4 shots, 8, 16, 32, 64, 128, and up to 256 shots. We fit travelling windows of each size along the length of the shot vector. That is, for example, for segment lengths of eight shots we fit the normalized durations of shots 1 to 8, then 2 to 9, then 3 to 10, then 4 to 11, and so forth through n-7 to n, where n is the number of shots in the movie. We then averaged these separate fits, calculating mean power.
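The travelling-window procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the reported analyses: the function names are hypothetical, the fit of a phase-shifted sine wave with one cycle per window is computed as the corresponding discrete Fourier coefficient, and the power scaling (squared magnitude divided by window length) is an assumed convention.

```python
import cmath
import math

def normalize(durations):
    """Convert shot durations to z-scores (mean 0, standard deviation 1)."""
    n = len(durations)
    mean = sum(durations) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in durations) / n)
    return [(x - mean) / sd for x in durations]

def sine_fit_power(segment):
    """Power of the best-fitting phase-shifted sine with one cycle per window.

    Assumption: the least-squares sine fit is taken as the first
    discrete Fourier coefficient of the segment.
    """
    w = len(segment)
    coeff = sum(x * cmath.exp(-2j * math.pi * t / w) for t, x in enumerate(segment))
    return abs(coeff) ** 2 / w

def mean_power_by_window(shots, sizes=(2, 4, 8, 16, 32, 64, 128, 256)):
    """Average the window power over every start position (1..w, 2..w+1, ...)."""
    n = len(shots)
    result = {}
    for w in sizes:
        if w > n:
            continue  # the window does not fit in this shot vector
        powers = [sine_fit_power(shots[s:s + w]) for s in range(n - w + 1)]
        result[w] = sum(powers) / len(powers)
    return result
```

For a movie of n shots, a window of size w contributes n − w + 1 fits to the average, so longer movies contribute more fits at every window size.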
Cutting et al. (2010) had extended their analyses to 2^m shots, where 2^m is the largest power of 2 that is less than the n shots in each movie. They then fit these data with a hybrid model that measured both white noise (or random noise, which has a flat spectrum, and is referred to as 1/f^0) and colored noise. Their assumption, and that of Gilden (2001; see also Wijnants et al., 2009), was that all such signals have a background of random (white) noise and that a fractal-like pattern should be estimated as emerging in the context of that background. Importantly, for the colored-noise part of the model, we varied alpha (α) in 1/f^α until the simultaneous combination of colored noise and white noise best fit the data. All obtained values of alpha for these movies were in the range of 0.0–1.54, with a mean of 0.55 and a standard deviation of 0.25. Some of these fits are shown in Fig. 3 here and others were shown in Fig. 3 of Cutting et al. (2010).
If the size of the shot sample inherently increases the exponent alpha, as suggested in previous discussions and research (Cutting, 2014c; DeLong, 2015; Salt, 2010), this might be because the number of samples at each window size increases, and averaging over the larger number of samples reduces statistical variability, yielding smoother and more reliable functions. To explore this possibility, we truncated the power analysis after travelling windows of 2^8 (or 256) shots but analyzed the shot vector out to n, its last shot. Thus, the larger the n the more the averages should smooth the results. In addition, we analyzed only those movies with at least 512 shots. This latter criterion reduced the sample to 263 movies.
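The idea of varying alpha until a mixture of colored and white noise best fits the spectrum can be illustrated with a simple grid search. This is a sketch only: the parameterization (a white-noise proportion between 0 and 1, a free vertical offset in log units, and a root-mean-squared deviation criterion) and all function names are assumptions made for illustration, not the fitting routine used for the reported values.

```python
import math

def hybrid_model(wavelengths, alpha, white):
    """Log10 power of a mixture of colored and white noise.

    Colored noise follows 1/f^alpha, i.e. wavelength**alpha because
    wavelength = 1/f; white noise is flat. `white` is the assumed
    mixing proportion of the flat component.
    """
    return [math.log10((1.0 - white) * w ** alpha + white) for w in wavelengths]

def fit_hybrid(wavelengths, log_power):
    """Grid-search alpha (0 to 2) and the white proportion (0 to 0.95).

    Returns (alpha, white, rmsd) for the combination with the smallest
    root-mean-squared deviation after removing a constant offset.
    """
    best = (None, None, float("inf"))
    n = len(log_power)
    for i in range(201):
        alpha = i / 100
        for j in range(20):
            white = j / 20
            model = hybrid_model(wavelengths, alpha, white)
            # The overall level of the spectrum is a free parameter.
            offset = sum(d - m for d, m in zip(log_power, model)) / n
            rmsd = math.sqrt(
                sum((d - m - offset) ** 2 for d, m in zip(log_power, model)) / n
            )
            if rmsd < best[2]:
                best = (alpha, white, rmsd)
    return best
```

On a log scale this mixture is slightly curved, because the flat component matters most at short wavelengths where the colored component has little power.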
Results
Slopes and individual movies
Figure 3 shows the data and model fits for nine movies. The Lion King (Allers & Minkoff, 1994), a movie of 1202 shots, provides a framework for the display of the others. By convention and as in Fig. 3, the travelling window sizes (wavelengths, or 1/frequencies) appear on the abscissa in descending order (256 to 2 shots). These are plotted against the relative log power values on the ordinate. The data are shown by a thicker blue line. The model fit (combining white and colored noise) is shown by a thinner red line. Notice that several model fits are slightly curved, as they should be with a log-scaled mixture of white noise (a flat function) and colored noise (a sloped function). The influence of the white noise would diminish with greater wavelengths and greater power. The slope of the colored noise fit (α in 1/f^α) for The Lion King is 0.54, about halfway between a true fractal (1/f^1) and white noise (1/f^0).
Given this backdrop, a full range of data and model fits from eight other movies are also shown in Fig. 3. Notice that the slopes (the values of alpha in 1/f^α) are near 1.0 for the leftmost pair of movies (Back to the Future, Zemeckis, 1985, and Mission: Impossible – Rogue Nation, McQuarrie, 2015), near 0.67 for the next two movies (Inside Out, Docter & Del Carmen, 2015, and Harry Potter and the Deathly Hallows, Part 1, Yates, 2010), near 0.33 for the third pair (Bells of St. Mary’s, McCarey, 1945, and Apollo 13, Howard, 1995), and near zero for the rightmost pair (Return of the Pink Panther, Edwards, 1975, and Asphalt Jungle, Huston, 1955). Notice, too, that the more recent movies are generally to the left and that they also generally have more shots.
Thus, the results for these nine movies set up the pattern for both effects – that more recent movies have a steeper slope, in line with the results of Cutting et al. (2010), but they also have more shots. And a third effect is that across all movies the increase in the number of shots is correlated with the improvement of the hybrid model fits (adjusted R^2 = 0.05, t(261) = −3.81, p = 0.0002, d = 0.47). The mean root-mean-squared deviation for movies with about 500 shots is about 0.20, whereas that for those with about 2000 shots is about 0.10. Notice that the fit for Return of the Pink Panther is particularly poor.
Expectations and the patterns of slopes across movies
Cutting et al. (2010) reported that the pattern of slopes among the earlier movies (from 1935 to about 1960) was relatively flat and varied and that the pattern for the later movies (about 1960 to 2005) increased over time with less variation. Cutting et al. also reported that the linear increase across the whole set, 1935 to 2005, was also reliable, but not as compelling. With the movies added to the beginning of the release year distribution (1915, 1920, 1925, and 1930) and to its end (2010 and 2015) it was difficult to know what we should predict. More critically, however, the addition of the children’s movies increased the density of movies between 1985 and 2008, roughly the time frame of the sharpest increase in slope, which was the central and emphasized finding of Cutting et al. The results are shown in Fig. 4a.
Based on previous results, we looked for both linear and quadratic trends. To be clear, the data are quite noisy, which is the main reason for waiting eight years to update Cutting et al. (2010) until we could explore many films over a longer period of time. An increasing linear trend was modest (adjusted R^2 = 0.017, t(261) = 2.34, p = 0.02, d = 0.30), but the quadratic trend shown in the figure was more robust (adjusted R^2 = 0.087, t(260) = 4.60, p < 0.0001, d = 0.57).
Again, the quadratic trend bottoms out at about 1960, a result that would appear to reinforce the division of popular movies into those of the Hollywood Studio era and those that came later. However, the left-hand side of the trend has little statistical support. Although the apparent decline in slopes from 1915 to 1955 looks impressive, it is not by itself reliable (adjusted R^2 = 0.045, t(60) = −1.68, p = 0.098). Thus, the quadratic function fails the two-lines test (Simonsohn, 2017) – dividing the distribution between falling and rising segments, and testing for the significance of both linear trends. Nonetheless, our interest had originally been focused on the period from 1960 onwards.
Importantly for the argument presented in Cutting et al. (2010), the linear trend of the subsample from 1960 to 2015 was also quite strong (adjusted R^2 = 0.11, t(200) = 5.10, p < 0.0001, d = 0.72). Thus far, then, our evidence extends the results of Cutting et al. (2010).
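The logic of testing a falling and a rising segment separately can be sketched as below. This is a deliberately simplified stand-in for the two-lines test, not Simonsohn's published procedure: the breakpoint is simply passed in rather than estimated, each segment gets its own ordinary least-squares line, and the function names are hypothetical. A U-shaped trend is supported only if the left slope is reliably negative and the right slope reliably positive.

```python
import math

def ols_slope_t(x, y):
    """Return (slope, t statistic, degrees of freedom) for y regressed on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se = math.sqrt(sse / df / sxx)  # standard error of the slope
    return slope, slope / se, df

def two_segment_slopes(x, y, breakpoint):
    """Fit separate lines to the data at or below and at or above the breakpoint.

    Assumption: observations exactly at the breakpoint enter both segments.
    """
    left = [(xi, yi) for xi, yi in zip(x, y) if xi <= breakpoint]
    right = [(xi, yi) for xi, yi in zip(x, y) if xi >= breakpoint]
    fit_left = ols_slope_t([p[0] for p in left], [p[1] for p in left])
    fit_right = ols_slope_t([p[0] for p in right], [p[1] for p in right])
    return fit_left, fit_right
```

Each returned t statistic would then be compared with the critical value for its degrees of freedom; both segments must pass for the U shape to be accepted.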
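The effect sizes reported next to the t statistics in this section appear consistent with the common conversion d = 2|t|/√df. That conversion is an assumption inferred from the reported numbers, not a step documented in the text, and the function name below is hypothetical. A small sketch makes the arithmetic explicit:

```python
import math

def cohens_d_from_t(t, df):
    """Approximate Cohen's d from a t statistic and its degrees of freedom.

    Assumed conversion: d = 2 * |t| / sqrt(df).
    """
    return 2.0 * abs(t) / math.sqrt(df)

# Values taken from the statistics reported in the text.
reported = [(-3.81, 261), (4.60, 260), (5.10, 200)]
converted = [round(cohens_d_from_t(t, df), 2) for t, df in reported]
```

Because the published t values are themselves rounded, the conversion reproduces some reported effect sizes only to within about 0.01.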
Slopes and shot-sample size
In exploration of the effects of shot number (sample size), Fig. 4c shows the scatterplot of the same slope values against the number of shots in each movie. The regression trend, with its 95% confidence interval, is quite strong (adjusted R^2 = 0.15, t(261) = 6.78, p < 0.0001, d = 0.84), replicating Cutting (2014c). As one can see, the mean slope (alpha value) for movies with only about 500 shots is near 0.4, but for those with 3000 shots is near 1.0. Clearly, as was seen in the individual movie data of Fig. 3, both release year and shot number are contenders in accounting for the data.
Using two predictors of slope, the quadratic regression values from the release-year data of each movie and the linear regression values for the number of shots, we find that shot number is a stronger predictor (t(260) = 4.84, p < 0.0001, d = 0.60) than is release year (t(260) = 2.49, p = 0.014, d = 0.30). Indeed, in stepwise regression, we find that entering the number of shots first accounts for 15% of the variance and the addition of the quadratic values adds only 2%, whereas entering the quadratic values first yields 9% of the variance, but the addition of shots adds another 8%. Thus, it is clear that the number of shots, not release year, is the more potent cause of the increase in slope.
Moreover, and again, since the pattern in the post-1960 movies was most critical to the conclusions of Cutting et al. (2010), we could simply assess the linear effects of release year and shot number on derived slope in those movies from 1965 to 2015. Together these account for 24% of the variance in the data, but the effect of shot number is again substantial (t(189) = 5.89, p < 0.0001, d = 0.86), whereas that of release year is not (t(189) = 1.89, p = 0.06). Clearly, the evolution towards a 1/f^1, or fractal, structure in the shot patterns of movies is reflected in more shots per movie in these data than in release years.
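The order-of-entry comparison above amounts to asking how much variance a second predictor adds once the first is in the model. A minimal sketch follows, using the standard two-predictor formula for R^2 based on pairwise correlations. It computes unadjusted R^2 (the values reported in the text are adjusted), and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r2_two_predictors(y, x1, x2):
    """Unadjusted R^2 of y regressed on x1 and x2 together."""
    r1 = pearson(x1, y)
    r2 = pearson(x2, y)
    r12 = pearson(x1, x2)
    return (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)

def incremental_r2(y, first, second):
    """Return (R^2 for `first` alone, extra R^2 gained by adding `second`)."""
    base = pearson(first, y) ** 2
    full = r2_two_predictors(y, first, second)
    return base, full - base
```

Calling `incremental_r2` twice with the predictors swapped gives the two orders of entry. When the predictors are correlated, as shot number and release year are, whichever enters first absorbs the shared variance.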
Why do longer shot vectors garner higher slope values? Again, one reason might be a smoothing of the data through the averaging of more samples. As can be seen in Fig. 3, the fits of the hybrid model to the data seem to get better as the number of shots increases (right to left). On the other hand, one might have assumed that the mean slope estimates would remain roughly the same across movies with different numbers of shots, but with decreased variance (not increased slope) as the number of shots per movie increased. This possibility is one rationale behind the simulations in Study 3.
Long- and short-range dependence
An important issue emerges from the broader literature in the context of these data and analyses. This concerns long-range dependence, also called – and seemingly in a deliberate ploy to confuse psychologists – “memory.” The idea comes from hydrology and originally concerned the cadence of the build-up of runoff from rainstorms throughout a watershed as the water approached a dam on a large river (Hurst, Black, & Simaika, 1965). Over the subsequent decades this idea was then applied to many time-domain self-similar processes, even brain states (Tagliazucchi et al., 2013).
To be concrete, the implication of this idea for the results of Study 1 and those of Cutting et al. (2010) is as follows: they claimed that there are long-range relationships among the shot durations. But that claim may be suspect. In particular, the relatively high power in the long-wavelength results of The Lion King, seen at the left-hand side of Fig. 3, suggests that, among others, there are correlations among the shot durations at lags of 256 shots that are due to long-range processes underlying the data. As it turns out, however, this need not be the case. Short-range dependence (local correlations) can lead to effects that look like long-range processes are at work (Karagiannis, Faloutsos, & Riedl, 2002; Wagenmakers, Farrell & Ratcliff, 2005).
This is a known problem and an active research area, and it has been addressed in many different venues (see DeLong, 2015) – for example, Karagiannis et al. (2002) in telecommunications research and Wagenmakers et al. (2005; Farrell, Wagenmakers, & Ratcliff, 2006) in response to Gilden (2001) and his study of reaction times. Both sets of authors offered solutions. Wagenmakers et al. suggested testing the difference in autoregressive (AR) model results [ARFIMA(1,d,1) - ARMA(1,1)] on each data vector. The first model has a component (d for dimension, not effect size) that could measure long-range dependence, but the second model does not. However, Gilden (2009) questioned this approach on grounds of model flexibility and the overfitting of data.
On the other hand, Karagiannis et al. (2002) tested many AR indices and endorsed the Whittle estimator, which takes on values of 0 for white noise, near 0.5 for pink noise, and about 1.0 for brown noise. Named for work by Peter Whittle (1951), a New Zealand/Finnish mathematician, the Whittle estimator was found to be the most robust in detecting long-range dependence provided that the data are not periodic (Karagiannis et al., 2002; Stadnitski, 2012b), which the movie data are not. We have employed the exact local Whittle estimator (Shimotsu & Phillips, 2005), a further improvement. The Whittle is typically used to estimate the fractional (non-white) noise dimension (d) underlying time-series data for autoregression models. Once the nature of this parameter is estimated, other patterns in the data can be explored (Footnote 3).
Given the possible contamination of long-term dependence measures by short-term processes and following earlier simulations by DeLong (2015), it occurred to us that the power-spectrum slope values calculated in Study 1 might not be the best estimates of long-range dependence and, hence, fractality, in the movie data. Thus, it seemed prudent to re-measure all of our movies with the exact local Whittle estimator.
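To make the estimator concrete, the sketch below implements the ordinary local Whittle estimator. This is the simpler precursor of the exact local Whittle estimator of Shimotsu and Phillips (2005), not the exact version employed here. The bandwidth rule (m = n^0.65), the search grid for d, and the function names are all assumptions chosen for illustration.

```python
import cmath
import math
import random

def low_frequency_periodogram(x, m):
    """Periodogram of the mean-centered series at the first m Fourier frequencies."""
    n = len(x)
    mean = sum(x) / n
    out = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        coeff = sum((xt - mean) * cmath.exp(-1j * lam * t) for t, xt in enumerate(x))
        out.append((lam, abs(coeff) ** 2 / (2.0 * math.pi * n)))
    return out

def local_whittle(x, m=None):
    """Estimate d by minimizing the local Whittle objective over a grid.

    Objective: log(mean(lam**(2d) * I(lam))) - 2d * mean(log(lam)).
    Assumptions: bandwidth m = n**0.65 and a grid from -0.5 to 1.5.
    """
    n = len(x)
    if m is None:
        m = int(n ** 0.65)
    pg = low_frequency_periodogram(x, m)
    mean_log_lam = sum(math.log(lam) for lam, _ in pg) / m
    best_d, best_obj = None, float("inf")
    for i in range(-50, 151):
        d = i / 100
        g = sum(lam ** (2.0 * d) * power for lam, power in pg) / m
        obj = math.log(g) - 2.0 * d * mean_log_lam
        if obj < best_obj:
            best_d, best_obj = d, obj
    return best_d

# Illustration with simulated series (not movie data).
random.seed(0)
white = [random.gauss(0.0, 1.0) for _ in range(512)]
walk = []
total = 0.0
for step in white:
    total += step
    walk.append(total)  # cumulative sum of white noise: a random walk (brown noise)
d_white = local_whittle(white)
d_walk = local_whittle(walk)
```

Only the lowest m frequencies enter the estimate, which is what makes the method local: short-range structure at higher frequencies is left out of the objective rather than modeled.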