How has hyperdrive technology in Star Wars changed over time? Are there differences by trilogy? We analyze data from the Star Wars API (SWAPI).

Setup

Load the tidyverse, a collection of R packages that share a common philosophy and are designed to work together. Load rwars, which accesses SWAPI. Load additional packages for visualizing the data.

library(tidyverse)
library(rwars)
library(forcats)
library(ggrepel)
library(ggthemes)

Using the Tidyverse

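What share of the ships in each movie are starships? We will use the rwars package to access the data and the tidyverse to tidy it. The output will have one row per film and one column per metric. SWAPI classifies craft with hyperdrive capability as starships and craft without it as vehicles, so ships = starships + vehicles, and ratio is the percentage of ships with hyperdrive. We also define a new label vector, trilogies, and use findInterval() to assign each film to a trilogy by episode number.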

trilogies <- c(
  "Prequels: Episode I-III", 
  "Originals: Episode IV-VI", 
  "Sequels: Episode VII"
  )
films <- rwars::get_all_films()$results
results <- tibble(
  title = map_chr(films, "title"),
  episode = map_dbl(films, "episode_id"),
  starships = map_dbl(films, ~length(.x$starships)),
  vehicles = map_dbl(films, ~length(.x$vehicles)),
  planets = map_dbl(films, ~length(.x$planets))
  ) %>% 
  mutate(ships = vehicles + starships) %>%
  mutate(ratio = starships / ships * 100) %>% 
  mutate(Trilogy = trilogies[findInterval(episode, c(1,4,7))])
results
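The shape of this pipeline can be tried without calling SWAPI. The sketch below is a minimal illustration in base R, assuming a hypothetical mock_films list that mimics the nested structure the pipeline above reads from; the film titles, craft names, and counts are invented placeholders, not SWAPI data. vapply() stands in for map_chr() and map_dbl().

```r
# Hypothetical stand-in for the nested film list (placeholder values only).
mock_films <- list(
  list(title = "Film A", episode_id = 4,
       starships = list("s1", "s2", "s3"), vehicles = list("v1")),
  list(title = "Film B", episode_id = 1,
       starships = list("s1"), vehicles = list("v1", "v2", "v3")),
  list(title = "Film C", episode_id = 7,
       starships = list("s1", "s2"), vehicles = list())
)

trilogies <- c(
  "Prequels: Episode I-III",
  "Originals: Episode IV-VI",
  "Sequels: Episode VII"
)

# One typed value per film, as map_chr() / map_dbl() do in the pipeline above.
mock_results <- data.frame(
  title     = vapply(mock_films, function(f) f$title, character(1)),
  episode   = vapply(mock_films, function(f) f$episode_id, numeric(1)),
  starships = vapply(mock_films, function(f) length(f$starships), integer(1)),
  vehicles  = vapply(mock_films, function(f) length(f$vehicles), integer(1)),
  stringsAsFactors = FALSE
)

# Total craft and the percentage of them that have hyperdrive.
mock_results$ships <- mock_results$vehicles + mock_results$starships
mock_results$ratio <- mock_results$starships / mock_results$ships * 100

# findInterval() reports which bin (starting at episode 1, 4 or 7) each
# episode falls into; that index picks the matching trilogy label.
mock_results$Trilogy <- trilogies[findInterval(mock_results$episode, c(1, 4, 7))]
mock_results
```

Because findInterval() works on the episode number rather than on row order, the films do not need to be sorted before the trilogy labels are assigned.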

Visualization

We will visually compare the number of vehicles with hyperdrive (starships) to the total number of vehicles (starships + vehicles) to determine whether there are trends over time or by trilogy.

results %>%
  ggplot(aes(ships, starships)) +
  geom_point(aes(color = Trilogy)) +
  theme_fivethirtyeight() +
  geom_smooth(method = "lm") +
  geom_text(aes(label = title), vjust = -1, size = 2.5) +
  labs(
    title = "Hyperdrive Correlations",
    subtitle = "The Number of Ships with Hyperdrive Capability"
  )

There is a strong correlation between the number of ships with hyperdrive and the total number of ships. Notice that the number of ships increases within each trilogy. Expect more ships in Episode VIII: The Last Jedi.

ggplot(results, aes(reorder(title, episode), ratio)) +
  # geom_col() draws bars whose heights are the values in `ratio`
  geom_col(aes(fill = Trilogy)) +
  labs(
    title = "The Rise of Hyperdrive",
    subtitle = "Percentage of Ships with Hyperdrive Capability"
  ) +
  scale_y_continuous(labels = function(x) paste0(x, "%")) +
  theme_fivethirtyeight() +
  # the bars map Trilogy to fill, so the fill scale is the one that applies
  scale_fill_fivethirtyeight() +
  theme(
    axis.text.x = element_text(angle = 35, vjust = 0.9, hjust = 0.9)
  )

The data show a positive trend in the percentage of ships with hyperdrive capability. Notice that 100% of the ships in The Force Awakens had hyperdrive. What will the percentage be for The Last Jedi?

Model

Based on our visual inspection, we will build a simple linear model that predicts the number of ships with hyperdrive.

starship_model <- lm(starships ~ ships, data = results)
coef_ships <- coef(starship_model)['ships']
summary(starship_model)

Call:
lm(formula = starships ~ ships, data = results)

Residuals:
      1       2       3       4       5       6       7 
 1.2740 -1.3325 -1.7260 -0.5865  1.6675  0.9215 -0.2180 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  1.31637    1.30877   1.006  0.36068   
ships        0.45081    0.07858   5.737  0.00225 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.442 on 5 degrees of freedom
Multiple R-squared:  0.8681,    Adjusted R-squared:  0.8418 
F-statistic: 32.92 on 1 and 5 DF,  p-value: 0.002254

The model indicates that each additional ship is associated with about 0.45 more ships with hyperdrive capability. In other words, the number of ships with hyperdrive is roughly half of all ships plus one. The fitted intercept is 1.32, but with p = 0.36 it is not significantly different from zero.
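To see how close the "half of all ships plus one" shorthand is to the fitted line, the coefficients can be plugged in by hand. This is a small sketch, assuming the estimates printed in the summary above; the ship totals of 10 and 20 are hypothetical inputs chosen only for illustration.

```r
# Coefficients copied from the summary() output above.
intercept <- 1.31637
slope     <- 0.45081

# Hypothetical ship totals, chosen only for illustration.
total_ships <- c(10, 20)

fitted_line   <- intercept + slope * total_ships  # fitted model
rule_of_thumb <- total_ships / 2 + 1              # "half of all ships plus one"

data.frame(total_ships, fitted_line, rule_of_thumb)
```

For these two totals the shorthand sits slightly above the fitted line, and the gap widens as the total grows because 0.5 is larger than the estimated slope.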

Insights and predictions

There is a strong correlation between the total number of ships and the number of ships with hyperdrive (R-squared = 0.87 across the seven films). The model predicts that the number of ships with hyperdrive is roughly half of all ships plus one.
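The strength of that correlation can be read off the model summary. For a linear model with a single predictor, the correlation coefficient equals the square root of R-squared, taking the sign of the slope. A minimal sketch, assuming the values printed in the summary above:

```r
# Values copied from the model summary shown earlier.
r_squared <- 0.8681
slope     <- 0.45081

# With one predictor, |r| = sqrt(R^2); the sign follows the slope.
r <- sign(slope) * sqrt(r_squared)
round(r, 2)
```

This gives a correlation of about 0.93, though with only seven films the estimate is necessarily rough.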

These data indicate an increased emphasis on hyperdrive from one trilogy to the next in episode order. However, it is important to note that the trilogies were made out of order: the originals (Episodes IV-VI) were released before the prequels (Episodes I-III). In production order, then, the percentage of ships with hyperdrive actually decreased from the originals to the prequels.

We predict that Episode VIII will have more ships overall than Episode VII, and that it will have a very high percentage of ships with hyperdrive.
