Word Frequencies

The video that accompanies this notebook is available at https://ucdavis.box.com/v/sts-205-notebook-2.

In this notebook we will be analyzing and visualizing word frequencies in the State of the Union addresses. Let’s begin by looking at the metadata table we created last week.

Working with data frames

library(tidyverse)
Registered S3 methods overwritten by 'dbplyr':
  method         from
  print.tbl_lazy     
  print.tbl_sql      
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✓ ggplot2 3.3.3     ✓ purrr   0.3.4
✓ tibble  3.0.6     ✓ dplyr   1.0.4
✓ tidyr   1.1.2     ✓ stringr 1.4.0
✓ readr   1.4.0     ✓ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
sotu_meta <- read_csv("sotu_metadata.csv")

── Column specification ────────────────────────────────────────────────────────
cols(
  year = col_double(),
  pres = col_character(),
  use_last = col_logical()
)
head(sotu_meta)

In general, there are two ways to work with data frames in R. Base R allows you to access any column of a data frame by specifying the name of the data frame followed by a dollar sign and the name of the column.

sotu_meta$year
  [1] 1790.0 1790.5 1791.0 1792.0 1793.0 1794.0 1795.0 1796.0 1797.0 1798.0
 [11] 1799.0 1800.0 1801.0 1802.0 1803.0 1804.0 1805.0 1806.0 1807.0 1808.0
 [21] 1809.0 1810.0 1811.0 1812.0 1813.0 1814.0 1815.0 1816.0 1817.0 1818.0
 [31] 1819.0 1820.0 1821.0 1822.0 1823.0 1824.0 1825.0 1826.0 1827.0 1828.0
 [41] 1829.0 1830.0 1831.0 1832.0 1833.0 1834.0 1835.0 1836.0 1837.0 1838.0
 [51] 1839.0 1840.0 1841.0 1842.0 1843.0 1844.0 1845.0 1846.0 1847.0 1848.0
 [61] 1849.0 1850.0 1851.0 1852.0 1853.0 1854.0 1855.0 1856.0 1857.0 1858.0
 [71] 1859.0 1860.0 1861.0 1862.0 1863.0 1864.0 1865.0 1866.0 1867.0 1868.0
 [81] 1869.0 1870.0 1871.0 1872.0 1873.0 1874.0 1875.0 1876.0 1877.0 1878.0
 [91] 1879.0 1880.0 1881.0 1882.0 1883.0 1884.0 1885.0 1886.0 1887.0 1888.0
[101] 1889.0 1890.0 1891.0 1892.0 1893.0 1894.0 1895.0 1896.0 1897.0 1898.0
[111] 1899.0 1900.0 1901.0 1902.0 1903.0 1904.0 1905.0 1906.0 1907.0 1908.0
[121] 1909.0 1910.0 1911.0 1912.0 1913.0 1914.0 1915.0 1916.0 1917.0 1918.0
[131] 1919.0 1920.0 1921.0 1922.0 1923.0 1924.0 1925.0 1926.0 1927.0 1928.0
[141] 1929.0 1930.0 1931.0 1932.0 1934.0 1935.0 1936.0 1937.0 1938.0 1939.0
[151] 1940.0 1941.0 1942.0 1943.0 1944.0 1945.0 1946.0 1947.0 1948.0 1949.0
[161] 1950.0 1951.0 1952.0 1953.0 1953.5 1954.0 1955.0 1956.0 1957.0 1958.0
[171] 1959.0 1960.0 1961.0 1961.5 1962.0 1963.0 1964.0 1965.0 1966.0 1967.0
[181] 1968.0 1969.0 1970.0 1971.0 1972.0 1973.0 1974.0 1975.0 1976.0 1977.0
[191] 1978.0 1979.0 1980.0 1981.0 1982.0 1983.0 1984.0 1985.0 1986.0 1987.0
[201] 1988.0 1989.0 1990.0 1991.0 1992.0 1993.0 1994.0 1995.0 1996.0 1997.0
[211] 1998.0 1999.0 2000.0 2001.0 2001.5 2002.0 2003.0 2004.0 2005.0 2006.0
[221] 2007.0 2008.0 2009.0 2010.0 2011.0 2012.0 2013.0 2014.0 2015.0 2016.0
[231] 2017.0 2018.0 2019.0

You can also use functions to summarize the columns. Here we use the table() function to see how many addresses were given by each president.

table(sotu_meta$pres)

      Abraham Lincoln        Andrew Jackson        Andrew Johnson 
                    4                     8                     4 
         Barack Obama     Benjamin Harrison       Calvin Coolidge 
                    8                     4                     6 
    Chester A. Arthur       Donald J. Trump  Dwight D. Eisenhower 
                    4                     3                     9 
Franklin D. Roosevelt       Franklin Pierce      George H.W. Bush 
                   12                     4                     4 
       George W. Bush     George Washington        Gerald R. Ford 
                    9                     8                     3 
     Grover Cleveland       Harry S. Truman        Herbert Hoover 
                    8                     8                     4 
       James Buchanan         James Madison          James Monroe 
                    4                     8                     8 
           James Polk          Jimmy Carter            John Adams 
                    4                     4                     4 
      John F. Kennedy     John Quincy Adams            John Tyler 
                    3                     4                     4 
    Lyndon B. Johnson      Martin van Buren      Millard Fillmore 
                    6                     4                     3 
        Richard Nixon         Ronald Reagan   Rutherford B. Hayes 
                    5                     7                     4 
   Theodore Roosevelt      Thomas Jefferson      Ulysses S. Grant 
                    8                     8                     8 
       Warren Harding       William H. Taft    William J. Clinton 
                    2                     4                     8 
     William McKinley        Woodrow Wilson        Zachary Taylor 
                    4                     8                     1 

You can subset a data frame by either rows or columns using square brackets.

sotu_meta[1:2, ]  #First two rows
sotu_meta[, 1:2] #First two columns
sotu_meta[1:2, 1:2] #First two rows and columns
sotu_meta[sotu_meta$pres == "George Washington", ] #Rows where pres is George Washington
sotu_meta[sotu_meta$use_last, ] #Rows where use_last is TRUE
sotu_meta[sotu_meta$year %in% 1800:1899, ] #Addresses given between 1800 and 1899

You can add a column by naming it and assigning it a value. Here we will add a column with the president’s last name.

sotu_meta$pres_last <- str_extract(sotu_meta$pres, "[A-Z][a-z]+$")
head(sotu_meta)

The dplyr package allows you to manipulate data frames without using the dollar sign notation, which often makes your code cleaner. The dplyr package is part of the tidyverse, so we don’t need to install it separately. The dplyr function select() allows you to select columns by name.

select(sotu_meta, year, pres)

You can also use pipes with dplyr functions:

sotu_meta <- sotu_meta %>% select(-pres_last)
head(sotu_meta)

In the code above, I changed the sotu_meta data frame to eliminate the pres_last column. The dplyr function filter() allows you to select rows by given criteria.

sotu_meta %>% filter(pres == "George Washington")

The dyplyr function arrange() lets you sort the data frame by the value of a given column.

arrange(sotu_meta, -year)
sotu_meta %>% arrange(-year)

The dplyr function mutate() lets you change the value of a column or create a new column.

sotu_meta %>% mutate(pres_last = str_extract(pres, "[A-Z][a-z]+$"))

Now we are going to read in the text of the addresses and convert them to a vector where each element is a separate address, just as we did last week.

sotu <- readLines("stateoftheunion1790-2019.txt") %>% 
          str_c(collapse = " ") %>% 
          str_split("\\*\\*\\* ") %>% unlist

The sotu_meta table has 233 rows and the sotu vector has 233 elements, so we can make the sotu vector a column of the sotu_meta table.

sotu_meta$text <- sotu
head(sotu_meta)

Now the text of each address is in the text column of the sotu_meta table. We want to get rid of the first paragraph of every address, since that is only metadata, and we want to get rid of the last paragraph of every address where use_last is FALSE. The code below replaces the first paragraph with nothing, using the same regular expression we used last week to extract the first paragraph.

sotu_meta <- sotu_meta %>% mutate(text = str_replace(text, "[^<]+<p> ", ""))
head(sotu_meta)
sotu_meta <- sotu_meta %>% mutate(remove = ifelse(use_last, "", str_extract(text, " <p>[^<]+$")),
                                  text = ifelse(use_last, text, str_replace(text, " <p>[^<]+$", "")))
sotu_meta$remove
  [1] ""                                                                                                                                                                                                                                          
  [2] " <p> GO. WASHINGTON "                                                                                                                                                                                                                      
  [3] " <p> GO. WASHINGTON "                                                                                                                                                                                                                      
  [4] " <p> GO. WASHINGTON "                                                                                                                                                                                                                      
  [5] " <p> GO. WASHINGTON "                                                                                                                                                                                                                      
  [6] " <p> GO. WASHINGTON "                                                                                                                                                                                                                      
  [7] " <p> GO. WASHINGTON "                                                                                                                                                                                                                      
  [8] " <p> GO. WASHINGTON "                                                                                                                                                                                                                      
  [9] ""                                                                                                                                                                                                                                          
 [10] ""                                                                                                                                                                                                                                          
 [11] ""                                                                                                                                                                                                                                          
 [12] ""                                                                                                                                                                                                                                          
 [13] ""                                                                                                                                                                                                                                          
 [14] " <p> TH. JEFFERSON "                                                                                                                                                                                                                       
 [15] " <p> TH. JEFFERSON "                                                                                                                                                                                                                       
 [16] " <p> TH. JEFFERSON "                                                                                                                                                                                                                       
 [17] " <p> TH. JEFFERSON "                                                                                                                                                                                                                       
 [18] " <p> TH. JEFFERSON "                                                                                                                                                                                                                       
 [19] " <p> TH. JEFFERSON "                                                                                                                                                                                                                       
 [20] " <p> TH. JEFFERSON "                                                                                                                                                                                                                       
 [21] ""                                                                                                                                                                                                                                          
 [22] ""                                                                                                                                                                                                                                          
 [23] ""                                                                                                                                                                                                                                          
 [24] ""                                                                                                                                                                                                                                          
 [25] ""                                                                                                                                                                                                                                          
 [26] ""                                                                                                                                                                                                                                          
 [27] ""                                                                                                                                                                                                                                          
 [28] ""                                                                                                                                                                                                                                          
 [29] ""                                                                                                                                                                                                                                          
 [30] ""                                                                                                                                                                                                                                          
 [31] ""                                                                                                                                                                                                                                          
 [32] ""                                                                                                                                                                                                                                          
 [33] ""                                                                                                                                                                                                                                          
 [34] ""                                                                                                                                                                                                                                          
 [35] ""                                                                                                                                                                                                                                          
 [36] ""                                                                                                                                                                                                                                          
 [37] " <p> JOHN QUINCY ADAMS "                                                                                                                                                                                                                   
 [38] " <p> JOHN QUINCY ADAMS "                                                                                                                                                                                                                   
 [39] " <p> JOHN QUINCY ADAMS "                                                                                                                                                                                                                   
 [40] " <p> JOHN QUINCY ADAMS "                                                                                                                                                                                                                   
 [41] ""                                                                                                                                                                                                                                          
 [42] ""                                                                                                                                                                                                                                          
 [43] ""                                                                                                                                                                                                                                          
 [44] ""                                                                                                                                                                                                                                          
 [45] ""                                                                                                                                                                                                                                          
 [46] ""                                                                                                                                                                                                                                          
 [47] ""                                                                                                                                                                                                                                          
 [48] ""                                                                                                                                                                                                                                          
 [49] ""                                                                                                                                                                                                                                          
 [50] " <p> M. VAN BUREN "                                                                                                                                                                                                                        
 [51] " <p> M. VAN BUREN "                                                                                                                                                                                                                        
 [52] " <p> M. VAN BUREN "                                                                                                                                                                                                                        
 [53] ""                                                                                                                                                                                                                                          
 [54] ""                                                                                                                                                                                                                                          
 [55] ""                                                                                                                                                                                                                                          
 [56] ""                                                                                                                                                                                                                                          
 [57] " <p> JAMES K. POLK "                                                                                                                                                                                                                       
 [58] " <p> JAMES K. POLK "                                                                                                                                                                                                                       
 [59] " <p> JAMES K. POLK "                                                                                                                                                                                                                       
 [60] " <p> JAMES K. POLK "                                                                                                                                                                                                                       
 [61] " <p> Z. TAYLOR. "                                                                                                                                                                                                                          
 [62] ""                                                                                                                                                                                                                                          
 [63] ""                                                                                                                                                                                                                                          
 [64] ""                                                                                                                                                                                                                                          
 [65] ""                                                                                                                                                                                                                                          
 [66] ""                                                                                                                                                                                                                                          
 [67] ""                                                                                                                                                                                                                                          
 [68] ""                                                                                                                                                                                                                                          
 [69] ""                                                                                                                                                                                                                                          
 [70] ""                                                                                                                                                                                                                                          
 [71] ""                                                                                                                                                                                                                                          
 [72] ""                                                                                                                                                                                                                                          
 [73] ""                                                                                                                                                                                                                                          
 [74] ""                                                                                                                                                                                                                                          
 [75] ""                                                                                                                                                                                                                                          
 [76] ""                                                                                                                                                                                                                                          
 [77] ""                                                                                                                                                                                                                                          
 [78] ""                                                                                                                                                                                                                                          
 [79] ""                                                                                                                                                                                                                                          
 [80] ""                                                                                                                                                                                                                                          
 [81] " <p> U. S. GRANT "                                                                                                                                                                                                                         
 [82] " <p> U. S. GRANT "                                                                                                                                                                                                                         
 [83] " <p> U. S. GRANT "                                                                                                                                                                                                                         
 [84] ""                                                                                                                                                                                                                                          
 [85] " <p> U. S. GRANT "                                                                                                                                                                                                                         
 [86] ""                                                                                                                                                                                                                                          
 [87] " <p> U. S. GRANT "                                                                                                                                                                                                                         
 [88] " <p> U. S. GRANT "                                                                                                                                                                                                                         
 [89] ""                                                                                                                                                                                                                                          
 [90] " <p> R. B. HAYES "                                                                                                                                                                                                                         
 [91] ""                                                                                                                                                                                                                                          
 [92] ""                                                                                                                                                                                                                                          
 [93] ""                                                                                                                                                                                                                                          
 [94] ""                                                                                                                                                                                                                                          
 [95] ""                                                                                                                                                                                                                                          
 [96] ""                                                                                                                                                                                                                                          
 [97] ""                                                                                                                                                                                                                                          
 [98] ""                                                                                                                                                                                                                                          
 [99] ""                                                                                                                                                                                                                                          
[100] ""                                                                                                                                                                                                                                          
[101] " <p> BENJ. HARRISON "                                                                                                                                                                                                                      
[102] ""                                                                                                                                                                                                                                          
[103] " <p> BENJ. HARRISON "                                                                                                                                                                                                                      
[104] " <p> BENJ. HARRISON "                                                                                                                                                                                                                      
[105] " <p> GROVER CLEVELAND "                                                                                                                                                                                                                    
[106] " <p> GROVER CLEVELAND "                                                                                                                                                                                                                    
[107] " <p> GROVER CLEVELAND "                                                                                                                                                                                                                    
[108] " <p> GROVER CLEVELAND "                                                                                                                                                                                                                    
[109] ""                                                                                                                                                                                                                                          
[110] ""                                                                                                                                                                                                                                          
[111] ""                                                                                                                                                                                                                                          
[112] ""                                                                                                                                                                                                                                          
[113] ""                                                                                                                                                                                                                                          
[114] ""                                                                                                                                                                                                                                          
[115] ""                                                                                                                                                                                                                                          
[116] ""                                                                                                                                                                                                                                          
[117] ""                                                                                                                                                                                                                                          
[118] ""                                                                                                                                                                                                                                          
[119] ""                                                                                                                                                                                                                                          
[120] " <p> Tuesday, December 8, 1908. "                                                                                                                                                                                                          
[121] ""                                                                                                                                                                                                                                          
[122] ""                                                                                                                                                                                                                                          
[123] ""                                                                                                                                                                                                                                          
[124] ""                                                                                                                                                                                                                                          
[125] ""                                                                                                                                                                                                                                          
[126] ""                                                                                                                                                                                                                                          
[127] ""                                                                                                                                                                                                                                          
[128] ""                                                                                                                                                                                                                                          
[129] ""                                                                                                                                                                                                                                          
[130] ""                                                                                                                                                                                                                                          
[131] ""                                                                                                                                                                                                                                          
[132] ""                                                                                                                                                                                                                                          
[133] ""                                                                                                                                                                                                                                          
[134] ""                                                                                                                                                                                                                                          
[135] ""                                                                                                                                                                                                                                          
[136] ""                                                                                                                                                                                                                                          
[137] ""                                                                                                                                                                                                                                          
[138] ""                                                                                                                                                                                                                                          
[139] ""                                                                                                                                                                                                                                          
[140] ""                                                                                                                                                                                                                                          
[141] " <p> December 3, 1929 "                                                                                                                                                                                                                    
[142] " <p> December 2, 1930 "                                                                                                                                                                                                                    
[143] " <p> December 8, 1931 "                                                                                                                                                                                                                    
[144] " <p> HERBERT HOOVER The White House, December 6, 1932. "                                                                                                                                                                                   
[145] ""                                                                                                                                                                                                                                          
[146] ""                                                                                                                                                                                                                                          
[147] ""                                                                                                                                                                                                                                          
[148] ""                                                                                                                                                                                                                                          
[149] ""                                                                                                                                                                                                                                          
[150] ""                                                                                                                                                                                                                                          
[151] ""                                                                                                                                                                                                                                          
[152] ""                                                                                                                                                                                                                                          
[153] ""                                                                                                                                                                                                                                          
[154] ""                                                                                                                                                                                                                                          
[155] ""                                                                                                                                                                                                                                          
[156] ""                                                                                                                                                                                                                                          
[157] ""                                                                                                                                                                                                                                          
[158] ""                                                                                                                                                                                                                                          
[159] ""                                                                                                                                                                                                                                          
[160] ""                                                                                                                                                                                                                                          
[161] ""                                                                                                                                                                                                                                          
[162] ""                                                                                                                                                                                                                                          
[163] ""                                                                                                                                                                                                                                          
[164] ""                                                                                                                                                                                                                                          
[165] ""                                                                                                                                                                                                                                          
[166] ""                                                                                                                                                                                                                                          
[167] ""                                                                                                                                                                                                                                          
[168] ""                                                                                                                                                                                                                                          
[169] ""                                                                                                                                                                                                                                          
[170] ""                                                                                                                                                                                                                                          
[171] ""                                                                                                                                                                                                                                          
[172] ""                                                                                                                                                                                                                                          
[173] ""                                                                                                                                                                                                                                          
[174] ""                                                                                                                                                                                                                                          
[175] ""                                                                                                                                                                                                                                          
[176] ""                                                                                                                                                                                                                                          
[177] ""                                                                                                                                                                                                                                          
[178] ""                                                                                                                                                                                                                                          
[179] ""                                                                                                                                                                                                                                          
[180] ""                                                                                                                                                                                                                                          
[181] ""                                                                                                                                                                                                                                          
[182] ""                                                                                                                                                                                                                                          
[183] ""                                                                                                                                                                                                                                          
[184] ""                                                                                                                                                                                                                                          
[185] ""                                                                                                                                                                                                                                          
[186] " <p> February 2, 1973. "                                                                                                                                                                                                                   
[187] ""                                                                                                                                                                                                                                          
[188] ""                                                                                                                                                                                                                                          
[189] ""                                                                                                                                                                                                                                          
[190] ""                                                                                                                                                                                                                                          
[191] ""                                                                                                                                                                                                                                          
[192] ""                                                                                                                                                                                                                                          
[193] ""                                                                                                                                                                                                                                          
[194] ""                                                                                                                                                                                                                                          
[195] " <p> NOTE: The President spoke at 9 p.m. in the House Chamber at the Capitol. He was introduced by Thomas P. O'Neill, Jr., Speaker of the House of Representatives. The address was broadcast live on nationwide radio and television. "   
[196] " <p> NOTE: The President spoke at 9:03 p.m. in the House Chamber of the Capitol. He was introduced by Thomas P. O'Neill, Jr., Speaker of the House of Representatives. The address was broadcast live on nationwide radio and television. "
[197] " <p> NOTE: The President spoke at 9:02 p.m. in the House Chamber of the Capitol. He was introduced by Thomas P. O'Neill, Jr., Speaker of the House of Representatives. The address was broadcast live on nationwide radio and television. "
[198] " <p> NOTE: The President spoke at 9:05 p.m. in the House Chamber of the Capitol. He was introduced by Thomas P. O'Neill, Jr., Speaker of the House of Representatives. The address was broadcast live on nationwide radio and television. "
[199] " <p> NOTE: The President spoke at 8:04 p.m. in the House Chamber of the Capitol. He was introduced by Thomas P. O'Neill, Jr., Speaker of the House of Representatives. The address was broadcast live on nationwide radio and television. "
[200] " <p> NOTE: The President spoke at 9:03 p.m. in the House Chamber of the Capitol. He was introduced by Jim Wright, Speaker of the House of Representatives. The address was broadcast live on nationwide radio and television. "            
[201] " <p> NOTE: The President spoke at 9:07 p.m. in the House Chamber of the Capitol. He was introduced by Jim Wright, Speaker of the House of Representatives. The address was broadcast live on nationwide radio and television. "            
[202] ""                                                                                                                                                                                                                                          
[203] ""                                                                                                                                                                                                                                          
[204] ""                                                                                                                                                                                                                                          
[205] ""                                                                                                                                                                                                                                          
[206] ""                                                                                                                                                                                                                                          
[207] ""                                                                                                                                                                                                                                          
[208] ""                                                                                                                                                                                                                                          
[209] ""                                                                                                                                                                                                                                          
[210] ""                                                                                                                                                                                                                                          
[211] ""                                                                                                                                                                                                                                          
[212] ""                                                                                                                                                                                                                                          
[213] ""                                                                                                                                                                                                                                          
[214] " <p> GEORGE W. BUSH. THE WHITE HOUSE, February 27, 2001. "                                                                                                                                                                                 
[215] " <p> GEORGE W. BUSH. THE WHITE HOUSE, September 20, 2001. "                                                                                                                                                                                
[216] ""                                                                                                                                                                                                                                          
[217] ""                                                                                                                                                                                                                                          
[218] ""                                                                                                                                                                                                                                          
[219] ""                                                                                                                                                                                                                                          
[220] " <p> [Applause, the Members rising.] "                                                                                                                                                                                                     
[221] " <p> [Applause, the Members rising.] "                                                                                                                                                                                                     
[222] ""                                                                                                                                                                                                                                          
[223] ""                                                                                                                                                                                                                                          
[224] ""                                                                                                                                                                                                                                          
[225] ""                                                                                                                                                                                                                                          
[226] ""                                                                                                                                                                                                                                          
[227] ""                                                                                                                                                                                                                                          
[228] ""                                                                                                                                                                                                                                          
[229] ""                                                                                                                                                                                                                                          
[230] ""                                                                                                                                                                                                                                          
[231] ""                                                                                                                                                                                                                                          
[232] " <p> THE WHITE HOUSE, January 30, 2018. "                                                                                                                                                                                                  
[233] ""                                                                                                                                                                                                                                          

Now that we have what we want (a data frame with the text of the State of the Unions and the metadata), we are going to write a function so that we can do this all in one step any time we want.

Making new functions

A function is just an algorithm that we make up ourselves and want to re-use. To give an example, we can make a function called do_math() that will take any number, multiply it by two, and then add five.

do_math <- function(donuts) {
  donuts*2 + 5
}

do_math(5)
[1] 15
#Because all R operations are vectorized, we can also pass a vector to the function
do_math(c(1, 3, 5, 7))
[1]  7 11 15 19

The function to make the State of the Union table won’t take any arguments, it will just make the table of State of the Union addresses and return it.

make_sotu <- function() {
  sotu_meta <- read_csv("sotu_metadata.csv")
  sotu_meta$text <- unlist(str_split(str_c(readLines("stateoftheUnion1790-2019.txt"), collapse = " "), "\\*\\*\\* "))
  sotu_meta <- sotu_meta %>% mutate(text = str_replace(text, "[^<]+<p> ", "")) %>%
                             mutate(text = ifelse(use_last, text, str_replace(text, " <p>[^<]+$", "")))
  return(sotu_meta)
}

Copy this function into a new R script called functions.r. The code below clears out your working environment, runs the functions.r script, and makes the sotu data frame.

rm(list = ls())
source("functions.r")
sotu <- make_sotu()

── Column specification ────────────────────────────────────────────────────────
cols(
  year = col_double(),
  pres = col_character(),
  use_last = col_logical()
)
head(sotu)

We won’t be using the paragraph tags today, so we can remove them from the text column of sotu.

#Eliminate paragraph tags
sotu$text <- str_replace_all(sotu$text, "<p>", "")

Tokenizing

The first step in calculating word frequencies is tokenizing our text, which means splitting it up into units. The tidytext package offers useful tools for tokenizing.

#install.packages("tidytext")
library(tidytext)

For word frequencies, our units will be words. We will use the unnest_tokens() function. The default unit of unnest_tokens() is the word, so we only need to specify two arguments: the name of the new column that will contain the words (“gram”) and the name of the column that contains the text we want to tokenize (“text”).

sotu_words <- sotu %>% unnest_tokens(gram, text)
head(sotu_words)

As you can see, the unnest_tokens() function also removed all punctuation and converted all letters to lower case. We now have one row for each word in each address.

We can visualize the number of words in each year with the ggplot() function. This is a very powerful and flexible function that we can use for all kinds of visualizations, so it is worth taking some time to understand how it works. You may also want to consult the ggplot2 cheat sheet. The ggplot2 package is part of the tidyverse package, so you don’t need to install it separately.

At its most basic, ggplot() takes two arguments: the data frame you want to visualize, and a mapping of aesthetics (aes()). Here we are saying that we want to graph the sotu_words data frame with the year column on the x-axis.

ggplot(sotu_words, aes(x = year))

When we run this, we get a blank graph. Now we need to add a layer telling ggplot() what kind of geometry we want to use to visualize our data. In this case, we will use geom_bar(), which is a vertical bar. The height of the bar is the number of rows with the given x-axis value (year). So what we will see here is the number of words in each year.

ggplot(sotu_words, aes(x = year)) + geom_bar()

We can use color to indicate which president gave each speech. Since color will map on to an element of the data (the pres column), it goes in the aes() function.

ggplot(sotu_words, aes(x = year, fill = pres)) + geom_bar()

The legend is quite large and really unnecessary, since there was only one president in each year, so we can use a theme() layer to remove it.

ggplot(sotu_words, aes(x = year, fill = pres)) + geom_bar() + theme(legend.position = "none")

We can use the layers scale_x_continuous() and scale_y_continuous() to control the appearance of the x- and y-axes, and the labs() layers to label the axes and add a title to the graph.

ggplot(sotu_words, aes(x = year, fill = pres)) + 
      geom_bar() + theme(legend.position = "none") +
      scale_x_continuous(breaks = seq(1790, 2030, 20)) +
      scale_y_continuous(labels = scales::comma) +
      labs(x = "Year", y = "Words", title = "State of the Union Addresses, 1790-2019: Number of Words")

Now let’s get a sense of what the words are. We can count the number of times each word appears with the count() function and view the top words with the sort = TRUE argument.

sotu_words %>% count(gram, sort = TRUE) 

It may not be surprising to find that the most common words are short words that don’t carry much meaning. In text mining, these are known as stopwords. Depending on what you are doing, you may or may not want to remove them. For example, these words are often very useful for author attribution. In this case, however, we will remove them. The tidytext package provides a list of stopwords from three lexicons (SMART, snowball, and onix). The code below prints each lexicon.

data(stop_words)
head(stop_words)
for (l in 1:length(unique(stop_words$lexicon))) {
  lex <- unique(stop_words$lexicon)[l]
  print(lex)
  print(stop_words$word[stop_words$lexicon == lex])
}
[1] "SMART"
  [1] "a"             "a's"           "able"          "about"        
  [5] "above"         "according"     "accordingly"   "across"       
  [9] "actually"      "after"         "afterwards"    "again"        
 [13] "against"       "ain't"         "all"           "allow"        
 [17] "allows"        "almost"        "alone"         "along"        
 [21] "already"       "also"          "although"      "always"       
 [25] "am"            "among"         "amongst"       "an"           
 [29] "and"           "another"       "any"           "anybody"      
 [33] "anyhow"        "anyone"        "anything"      "anyway"       
 [37] "anyways"       "anywhere"      "apart"         "appear"       
 [41] "appreciate"    "appropriate"   "are"           "aren't"       
 [45] "around"        "as"            "aside"         "ask"          
 [49] "asking"        "associated"    "at"            "available"    
 [53] "away"          "awfully"       "b"             "be"           
 [57] "became"        "because"       "become"        "becomes"      
 [61] "becoming"      "been"          "before"        "beforehand"   
 [65] "behind"        "being"         "believe"       "below"        
 [69] "beside"        "besides"       "best"          "better"       
 [73] "between"       "beyond"        "both"          "brief"        
 [77] "but"           "by"            "c"             "c'mon"        
 [81] "c's"           "came"          "can"           "can't"        
 [85] "cannot"        "cant"          "cause"         "causes"       
 [89] "certain"       "certainly"     "changes"       "clearly"      
 [93] "co"            "com"           "come"          "comes"        
 [97] "concerning"    "consequently"  "consider"      "considering"  
[101] "contain"       "containing"    "contains"      "corresponding"
[105] "could"         "couldn't"      "course"        "currently"    
[109] "d"             "definitely"    "described"     "despite"      
[113] "did"           "didn't"        "different"     "do"           
[117] "does"          "doesn't"       "doing"         "don't"        
[121] "done"          "down"          "downwards"     "during"       
[125] "e"             "each"          "edu"           "eg"           
[129] "eight"         "either"        "else"          "elsewhere"    
[133] "enough"        "entirely"      "especially"    "et"           
[137] "etc"           "even"          "ever"          "every"        
[141] "everybody"     "everyone"      "everything"    "everywhere"   
[145] "ex"            "exactly"       "example"       "except"       
[149] "f"             "far"           "few"           "fifth"        
[153] "first"         "five"          "followed"      "following"    
[157] "follows"       "for"           "former"        "formerly"     
[161] "forth"         "four"          "from"          "further"      
[165] "furthermore"   "g"             "get"           "gets"         
[169] "getting"       "given"         "gives"         "go"           
[173] "goes"          "going"         "gone"          "got"          
[177] "gotten"        "greetings"     "h"             "had"          
[181] "hadn't"        "happens"       "hardly"        "has"          
[185] "hasn't"        "have"          "haven't"       "having"       
[189] "he"            "he's"          "hello"         "help"         
[193] "hence"         "her"           "here"          "here's"       
[197] "hereafter"     "hereby"        "herein"        "hereupon"     
[201] "hers"          "herself"       "hi"            "him"          
[205] "himself"       "his"           "hither"        "hopefully"    
[209] "how"           "howbeit"       "however"       "i"            
[213] "i'd"           "i'll"          "i'm"           "i've"         
[217] "ie"            "if"            "ignored"       "immediate"    
[221] "in"            "inasmuch"      "inc"           "indeed"       
[225] "indicate"      "indicated"     "indicates"     "inner"        
[229] "insofar"       "instead"       "into"          "inward"       
[233] "is"            "isn't"         "it"            "it'd"         
[237] "it'll"         "it's"          "its"           "itself"       
[241] "j"             "just"          "k"             "keep"         
[245] "keeps"         "kept"          "know"          "knows"        
[249] "known"         "l"             "last"          "lately"       
[253] "later"         "latter"        "latterly"      "least"        
[257] "less"          "lest"          "let"           "let's"        
[261] "like"          "liked"         "likely"        "little"       
[265] "look"          "looking"       "looks"         "ltd"          
[269] "m"             "mainly"        "many"          "may"          
[273] "maybe"         "me"            "mean"          "meanwhile"    
[277] "merely"        "might"         "more"          "moreover"     
[281] "most"          "mostly"        "much"          "must"         
[285] "my"            "myself"        "n"             "name"         
[289] "namely"        "nd"            "near"          "nearly"       
[293] "necessary"     "need"          "needs"         "neither"      
[297] "never"         "nevertheless"  "new"           "next"         
[301] "nine"          "no"            "nobody"        "non"          
[305] "none"          "noone"         "nor"           "normally"     
[309] "not"           "nothing"       "novel"         "now"          
[313] "nowhere"       "o"             "obviously"     "of"           
[317] "off"           "often"         "oh"            "ok"           
[321] "okay"          "old"           "on"            "once"         
[325] "one"           "ones"          "only"          "onto"         
[329] "or"            "other"         "others"        "otherwise"    
[333] "ought"         "our"           "ours"          "ourselves"    
[337] "out"           "outside"       "over"          "overall"      
[341] "own"           "p"             "particular"    "particularly" 
[345] "per"           "perhaps"       "placed"        "please"       
[349] "plus"          "possible"      "presumably"    "probably"     
[353] "provides"      "q"             "que"           "quite"        
[357] "qv"            "r"             "rather"        "rd"           
[361] "re"            "really"        "reasonably"    "regarding"    
[365] "regardless"    "regards"       "relatively"    "respectively" 
[369] "right"         "s"             "said"          "same"         
[373] "saw"           "say"           "saying"        "says"         
[377] "second"        "secondly"      "see"           "seeing"       
[381] "seem"          "seemed"        "seeming"       "seems"        
[385] "seen"          "self"          "selves"        "sensible"     
[389] "sent"          "serious"       "seriously"     "seven"        
[393] "several"       "shall"         "she"           "should"       
[397] "shouldn't"     "since"         "six"           "so"           
[401] "some"          "somebody"      "somehow"       "someone"      
[405] "something"     "sometime"      "sometimes"     "somewhat"     
[409] "somewhere"     "soon"          "sorry"         "specified"    
[413] "specify"       "specifying"    "still"         "sub"          
[417] "such"          "sup"           "sure"          "t"            
[421] "t's"           "take"          "taken"         "tell"         
[425] "tends"         "th"            "than"          "thank"        
[429] "thanks"        "thanx"         "that"          "that's"       
[433] "thats"         "the"           "their"         "theirs"       
[437] "them"          "themselves"    "then"          "thence"       
[441] "there"         "there's"       "thereafter"    "thereby"      
[445] "therefore"     "therein"       "theres"        "thereupon"    
[449] "these"         "they"          "they'd"        "they'll"      
[453] "they're"       "they've"       "think"         "third"        
[457] "this"          "thorough"      "thoroughly"    "those"        
[461] "though"        "three"         "through"       "throughout"   
[465] "thru"          "thus"          "to"            "together"     
[469] "too"           "took"          "toward"        "towards"      
[473] "tried"         "tries"         "truly"         "try"          
[477] "trying"        "twice"         "two"           "u"            
[481] "un"            "under"         "unfortunately" "unless"       
[485] "unlikely"      "until"         "unto"          "up"           
[489] "upon"          "us"            "use"           "used"         
[493] "useful"        "uses"          "using"         "usually"      
[497] "uucp"          "v"             "value"         "various"      
[501] "very"          "via"           "viz"           "vs"           
[505] "w"             "want"          "wants"         "was"          
[509] "wasn't"        "way"           "we"            "we'd"         
[513] "we'll"         "we're"         "we've"         "welcome"      
[517] "well"          "went"          "were"          "weren't"      
[521] "what"          "what's"        "whatever"      "when"         
[525] "whence"        "whenever"      "where"         "where's"      
[529] "whereafter"    "whereas"       "whereby"       "wherein"      
[533] "whereupon"     "wherever"      "whether"       "which"        
[537] "while"         "whither"       "who"           "who's"        
[541] "whoever"       "whole"         "whom"          "whose"        
[545] "why"           "will"          "willing"       "wish"         
[549] "with"          "within"        "without"       "won't"        
[553] "wonder"        "would"         "would"         "wouldn't"     
[557] "x"             "y"             "yes"           "yet"          
[561] "you"           "you'd"         "you'll"        "you're"       
[565] "you've"        "your"          "yours"         "yourself"     
[569] "yourselves"    "z"             "zero"         
[1] "snowball"
  [1] "i"          "me"         "my"         "myself"     "we"        
  [6] "our"        "ours"       "ourselves"  "you"        "your"      
 [11] "yours"      "yourself"   "yourselves" "he"         "him"       
 [16] "his"        "himself"    "she"        "her"        "hers"      
 [21] "herself"    "it"         "its"        "itself"     "they"      
 [26] "them"       "their"      "theirs"     "themselves" "what"      
 [31] "which"      "who"        "whom"       "this"       "that"      
 [36] "these"      "those"      "am"         "is"         "are"       
 [41] "was"        "were"       "be"         "been"       "being"     
 [46] "have"       "has"        "had"        "having"     "do"        
 [51] "does"       "did"        "doing"      "would"      "should"    
 [56] "could"      "ought"      "i'm"        "you're"     "he's"      
 [61] "she's"      "it's"       "we're"      "they're"    "i've"      
 [66] "you've"     "we've"      "they've"    "i'd"        "you'd"     
 [71] "he'd"       "she'd"      "we'd"       "they'd"     "i'll"      
 [76] "you'll"     "he'll"      "she'll"     "we'll"      "they'll"   
 [81] "isn't"      "aren't"     "wasn't"     "weren't"    "hasn't"    
 [86] "haven't"    "hadn't"     "doesn't"    "don't"      "didn't"    
 [91] "won't"      "wouldn't"   "shan't"     "shouldn't"  "can't"     
 [96] "cannot"     "couldn't"   "mustn't"    "let's"      "that's"    
[101] "who's"      "what's"     "here's"     "there's"    "when's"    
[106] "where's"    "why's"      "how's"      "a"          "an"        
[111] "the"        "and"        "but"        "if"         "or"        
[116] "because"    "as"         "until"      "while"      "of"        
[121] "at"         "by"         "for"        "with"       "about"     
[126] "against"    "between"    "into"       "through"    "during"    
[131] "before"     "after"      "above"      "below"      "to"        
[136] "from"       "up"         "down"       "in"         "out"       
[141] "on"         "off"        "over"       "under"      "again"     
[146] "further"    "then"       "once"       "here"       "there"     
[151] "when"       "where"      "why"        "how"        "all"       
[156] "any"        "both"       "each"       "few"        "more"      
[161] "most"       "other"      "some"       "such"       "no"        
[166] "nor"        "not"        "only"       "own"        "same"      
[171] "so"         "than"       "too"        "very"      
[1] "onix"
  [1] "a"           "about"       "above"       "across"      "after"      
  [6] "again"       "against"     "all"         "almost"      "alone"      
 [11] "along"       "already"     "also"        "although"    "always"     
 [16] "among"       "an"          "and"         "another"     "any"        
 [21] "anybody"     "anyone"      "anything"    "anywhere"    "are"        
 [26] "area"        "areas"       "around"      "as"          "ask"        
 [31] "asked"       "asking"      "asks"        "at"          "away"       
 [36] "back"        "backed"      "backing"     "backs"       "be"         
 [41] "became"      "because"     "become"      "becomes"     "been"       
 [46] "before"      "began"       "behind"      "being"       "beings"     
 [51] "best"        "better"      "between"     "big"         "both"       
 [56] "but"         "by"          "came"        "can"         "cannot"     
 [61] "case"        "cases"       "certain"     "certainly"   "clear"      
 [66] "clearly"     "come"        "could"       "did"         "differ"     
 [71] "different"   "differently" "do"          "does"        "done"       
 [76] "down"        "down"        "downed"      "downing"     "downs"      
 [81] "during"      "each"        "early"       "either"      "end"        
 [86] "ended"       "ending"      "ends"        "enough"      "even"       
 [91] "evenly"      "ever"        "every"       "everybody"   "everyone"   
 [96] "everything"  "everywhere"  "face"        "faces"       "fact"       
[101] "facts"       "far"         "felt"        "few"         "find"       
[106] "finds"       "first"       "for"         "four"        "from"       
[111] "full"        "fully"       "further"     "furthered"   "furthering" 
[116] "furthers"    "gave"        "general"     "generally"   "get"        
[121] "gets"        "give"        "given"       "gives"       "go"         
[126] "going"       "good"        "goods"       "got"         "great"      
[131] "greater"     "greatest"    "group"       "grouped"     "grouping"   
[136] "groups"      "had"         "has"         "have"        "having"     
[141] "he"          "her"         "here"        "herself"     "high"       
[146] "high"        "high"        "higher"      "highest"     "him"        
[151] "himself"     "his"         "how"         "however"     "i"          
[156] "if"          "important"   "in"          "interest"    "interested" 
[161] "interesting" "interests"   "into"        "is"          "it"         
[166] "its"         "itself"      "just"        "keep"        "keeps"      
[171] "kind"        "knew"        "know"        "known"       "knows"      
[176] "large"       "largely"     "last"        "later"       "latest"     
[181] "least"       "less"        "let"         "lets"        "like"       
[186] "likely"      "long"        "longer"      "longest"     "made"       
[191] "make"        "making"      "man"         "many"        "may"        
[196] "me"          "member"      "members"     "men"         "might"      
[201] "more"        "most"        "mostly"      "mr"          "mrs"        
[206] "much"        "must"        "my"          "myself"      "necessary"  
[211] "need"        "needed"      "needing"     "needs"       "never"      
[216] "new"         "new"         "newer"       "newest"      "next"       
[221] "no"          "nobody"      "non"         "noone"       "not"        
[226] "nothing"     "now"         "nowhere"     "number"      "numbers"    
[231] "of"          "off"         "often"       "old"         "older"      
[236] "oldest"      "on"          "once"        "one"         "only"       
[241] "open"        "opened"      "opening"     "opens"       "or"         
[246] "order"       "ordered"     "ordering"    "orders"      "other"      
[251] "others"      "our"         "out"         "over"        "part"       
[256] "parted"      "parting"     "parts"       "per"         "perhaps"    
[261] "place"       "places"      "point"       "pointed"     "pointing"   
[266] "points"      "possible"    "present"     "presented"   "presenting" 
[271] "presents"    "problem"     "problems"    "put"         "puts"       
[276] "quite"       "rather"      "really"      "right"       "right"      
[281] "room"        "rooms"       "said"        "same"        "saw"        
[286] "say"         "says"        "second"      "seconds"     "see"        
[291] "seem"        "seemed"      "seeming"     "seems"       "sees"       
[296] "several"     "shall"       "she"         "should"      "show"       
[301] "showed"      "showing"     "shows"       "side"        "sides"      
[306] "since"       "small"       "smaller"     "smallest"    "some"       
[311] "somebody"    "someone"     "something"   "somewhere"   "state"      
[316] "states"      "still"       "still"       "such"        "sure"       
[321] "take"        "taken"       "than"        "that"        "the"        
[326] "their"       "them"        "then"        "there"       "therefore"  
[331] "these"       "they"        "thing"       "things"      "think"      
[336] "thinks"      "this"        "those"       "though"      "thought"    
[341] "thoughts"    "three"       "through"     "thus"        "to"         
[346] "today"       "together"    "too"         "took"        "toward"     
[351] "turn"        "turned"      "turning"     "turns"       "two"        
[356] "under"       "until"       "up"          "upon"        "us"         
[361] "use"         "used"        "uses"        "very"        "want"       
[366] "wanted"      "wanting"     "wants"       "was"         "way"        
[371] "ways"        "we"          "well"        "wells"       "went"       
[376] "were"        "what"        "when"        "where"       "whether"    
[381] "which"       "while"       "who"         "whole"       "whose"      
[386] "why"         "will"        "with"        "within"      "without"    
[391] "work"        "worked"      "working"     "works"       "would"      
[396] "year"        "years"       "yet"         "you"         "young"      
[401] "younger"     "youngest"    "your"        "yours"      

You can use the filter() function to remove stopwords. I also want to take out tokens with digits in them. Since tokenizing and removing stopwords is something I will want to do again in the future, I’m going to write a function for it.

sotu_tokenize_words <- function() {
  sotu_words <- sotu %>% mutate(text = str_replace_all(text, "<p>", "")) %>% unnest_tokens(gram, text) %>%
                  filter(!gram %in% stop_words$word & !str_detect(gram, "[:digit:]"))
  return(sotu_words)
}
rm(list = ls())
source("functions.r")
sotu <- make_sotu()

── Column specification ────────────────────────────────────────────────────────
cols(
  year = col_double(),
  pres = col_character(),
  use_last = col_logical()
)
sotu_words <- sotu_tokenize_words()

Now we can see the most frequent non-stop words across all of the State of the Union addresses.

sotu_words %>% count(gram, sort = TRUE)

Word frequency over time

There are many different ways that we can analyze word frequency over time. One way is to graph the frequency of the overall top 25 (or any other number of) words in each year.

We can get the overall top 25 words with the top_n() function:

tops <- (sotu_words %>% count(gram) %>% top_n(25, n))$gram
tops
 [1] "act"        "america"    "american"   "citizens"   "congress"  
 [6] "country"    "department" "federal"    "foreign"    "government"
[11] "law"        "nation"     "national"   "nations"    "peace"     
[16] "people"     "power"      "public"     "service"    "subject"   
[21] "system"     "time"       "united"     "war"        "world"     

Now we select only the rows of sotu_words that refer to these words.

sotu_words %>% filter(gram %in% tops)

We can use the group_by() function together with the count() function to see how many times each of these words appear in each State of the Union address.

sotu_words %>% filter(gram %in% tops) %>% group_by(year) %>% count(gram)

We can easily compare from year to year by graphing the results.

sotu_words %>% filter(gram %in% tops) %>% group_by(year) %>% count(gram) %>%
  ggplot(aes(x = year, y = gram, size = n)) + geom_point() + theme(legend.position = "none")

If we want to color the dots by president, we need to add pres to our group_by() function and to the aes() function.

sotu_words %>% filter(gram %in% tops) %>% group_by(year, pres) %>% count(gram) %>%
  ggplot(aes(x = year, y = gram, size = n, color = pres)) + 
    geom_point() + theme(legend.position = "none")

Now we can generalize everything we just did into a function.

all_top_grams <- function(dataset, number) {
  tops <- (dataset %>% count(gram) %>% top_n(number))$gram
  dataset %>% filter(gram %in% tops) %>% group_by(year, pres) %>% count(gram) %>%
    ggplot(aes(x = year, y = gram, size = n, color = pres)) + 
      geom_point() + theme(legend.position = "none")
}
all_top_grams(sotu_words, 10)
Selecting by n

When we graph raw counts, we find that the longer addresses use all of the words more often (see Carter, 1980). We can standardize the counts by dividing each one by the total number of (non-stop) words in each address. Below I’m going to revise the all_top_grams() function to standardize the word count.

all_top_grams <- function(dataset, number) {
  tops <- (dataset %>% count(gram) %>% top_n(number))$gram
  dataset %>% group_by(year, pres) %>% count(gram) %>% 
    mutate(pct = n/sum(n)) %>% filter(gram %in% tops) %>%
    ggplot(aes(x = year, y = gram, size = pct, color = pres)) + 
      geom_point() + theme(legend.position = "none")
}
all_top_grams(sotu_words, 10)
Selecting by n

all_top_grams(sotu_words, 25)
Selecting by n

Another option would be to consider the top 5 (or any other number of) words used by each president.

top_grams_pres <- function(dataset, number) {
  tops <- (dataset %>% group_by(pres) %>% count(gram) %>% top_n(number))$gram
  dataset %>% group_by(year, pres) %>% count(gram) %>% 
    mutate(pct = n/sum(n)) %>% filter(gram %in% tops) %>%
    ggplot(aes(x = year, y = gram, size = pct, color = pres)) + 
    geom_point() + theme(legend.position = "none")
}
top_grams_pres(sotu_words, 5)
Selecting by n

Another way to visualize changes in word usage over time is to make separate graphs of the top words for each president.

sotu_words %>% group_by(pres) %>% count(gram) %>% mutate(pct = n/sum(n)) %>% top_n(5, pct) %>%
  ggplot(aes(x = gram, y = pct, fill = pres)) + geom_col() +
    coord_flip() + facet_wrap(vars(pres), scale = "free_y") + theme(legend.position = "none")

This graph is too big to see in our notebook, but we can export it to a .png file.

png("words_by_pres.png", height = 20, width = 30, units = "in", res = 200)
  sotu_words %>% group_by(pres) %>% count(gram) %>% mutate(pct = n/sum(n)) %>% top_n(5) %>%
    ggplot(aes(x = gram, y = pct, fill = pres)) + geom_col() +
      coord_flip() + facet_wrap(vars(pres), scale = "free_y") + theme(legend.position = "none")
Selecting by pct
dev.off()
null device 
          1 

This is pretty, but the alphabetical ordering of presidents is not conducive to analyzing change over time. What we need to do is convert the pres variable to a factor to preserve the chronological order. A factor is a special data type that stores strings as numbers. We can’t manipulate factors the way we would manipulate regular character variables, and we also can’t do math with them.

table(factor(sotu_words$pres))

      Abraham Lincoln        Andrew Jackson        Andrew Johnson 
                10066                 30626                 14491 
         Barack Obama     Benjamin Harrison       Calvin Coolidge 
                19777                 20290                 20092 
    Chester A. Arthur       Donald J. Trump  Dwight D. Eisenhower 
                 7999                  6313                 23634 
Franklin D. Roosevelt       Franklin Pierce      George H.W. Bush 
                17934                 16010                  6428 
       George W. Bush     George Washington        Gerald R. Ford 
                18101                  6180                  5626 
     Grover Cleveland       Harry S. Truman        Herbert Hoover 
                42412                 26853                 10480 
       James Buchanan         James Madison          James Monroe 
                21020                  8277                 14925 
           James Polk          Jimmy Carter            John Adams 
                27014                 20584                  2739 
      John F. Kennedy     John Quincy Adams            John Tyler 
                 7147                 11333                 12442 
    Lyndon B. Johnson      Martin van Buren      Millard Fillmore 
                10725                 16934                 11752 
        Richard Nixon         Ronald Reagan   Rutherford B. Hayes 
                 6874                 12819                 13413 
   Theodore Roosevelt      Thomas Jefferson      Ulysses S. Grant 
                58242                  7360                 24428 
       Warren Harding       William H. Taft    William J. Clinton 
                 4529                 27370                 21834 
     William McKinley        Woodrow Wilson        Zachary Taylor 
                26986                 11902                  2909 
table(fct_inorder(factor(sotu_words$pres)))

    George Washington            John Adams      Thomas Jefferson 
                 6180                  2739                  7360 
        James Madison          James Monroe     John Quincy Adams 
                 8277                 14925                 11333 
       Andrew Jackson      Martin van Buren            John Tyler 
                30626                 16934                 12442 
           James Polk        Zachary Taylor      Millard Fillmore 
                27014                  2909                 11752 
      Franklin Pierce        James Buchanan       Abraham Lincoln 
                16010                 21020                 10066 
       Andrew Johnson      Ulysses S. Grant   Rutherford B. Hayes 
                14491                 24428                 13413 
    Chester A. Arthur      Grover Cleveland     Benjamin Harrison 
                 7999                 42412                 20290 
     William McKinley    Theodore Roosevelt       William H. Taft 
                26986                 58242                 27370 
       Woodrow Wilson        Warren Harding       Calvin Coolidge 
                11902                  4529                 20092 
       Herbert Hoover Franklin D. Roosevelt       Harry S. Truman 
                10480                 17934                 26853 
 Dwight D. Eisenhower       John F. Kennedy     Lyndon B. Johnson 
                23634                  7147                 10725 
        Richard Nixon        Gerald R. Ford          Jimmy Carter 
                 6874                  5626                 20584 
        Ronald Reagan      George H.W. Bush    William J. Clinton 
                12819                  6428                 21834 
       George W. Bush          Barack Obama       Donald J. Trump 
                18101                 19777                  6313 

We can see when we make a table of the new president column that the presidents are listed in chronological order rather than alphabetical order. Now let’s make a general function to create the words_by_pres.png chart.

grams_by_pres <- function(dataset, filename, number) {
  graph <- dataset %>% mutate(president = fct_inorder(factor(pres))) %>%
           group_by(president) %>% count(gram) %>% mutate(pct = n/sum(n)) %>% top_n(number) %>%
              ggplot(aes(x = gram, y = pct, fill = president)) + geom_col() +
              coord_flip() + facet_wrap(vars(president), scale = "free_y") + theme(legend.position = "none")
  png(filename, height = 20, width = 30, units = "in", res = 200)
    print(graph)
  dev.off()
}
grams_by_pres(sotu_words, "words_by_pres2.png", 10)
Selecting by pct
null device 
          1 

LS0tCnRpdGxlOiAiTm90ZWJvb2sgMiAtIFdvcmQgRnJlcXVlbmNpZXMiCmF1dGhvcjogIkVtaWx5IEtsYW5jaGVyIE1lcmNoYW50IgpkYXRlOiAiU1RTIDIwNSIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQojIFdvcmQgRnJlcXVlbmNpZXMKVGhlIHZpZGVvIHRoYXQgYWNjb21wYW5pZXMgdGhpcyBub3RlYm9vayBpcyBhdmFpbGFibGUgYXQgaHR0cHM6Ly91Y2RhdmlzLmJveC5jb20vdi9zdHMtMjA1LW5vdGVib29rLTIuCgpJbiB0aGlzIG5vdGVib29rIHdlIHdpbGwgYmUgYW5hbHl6aW5nIGFuZCB2aXN1YWxpemluZyB3b3JkIGZyZXF1ZW5jaWVzIGluIHRoZSBTdGF0ZSBvZiB0aGUgVW5pb24gYWRkcmVzc2VzLiBMZXQncyBiZWdpbiBieSBsb29raW5nIGF0IHRoZSBtZXRhZGF0YSB0YWJsZSB3ZSBjcmVhdGVkIGxhc3Qgd2Vlay4KCiMjIFdvcmtpbmcgd2l0aCBkYXRhIGZyYW1lcwpgYGB7cn0KbGlicmFyeSh0aWR5dmVyc2UpCnNvdHVfbWV0YSA8LSByZWFkX2Nzdigic290dV9tZXRhZGF0YS5jc3YiKQpoZWFkKHNvdHVfbWV0YSkKYGBgCkluIGdlbmVyYWwsIHRoZXJlIGFyZSB0d28gd2F5cyB0byB3b3JrIHdpdGggZGF0YSBmcmFtZXMgaW4gUi4gQmFzZSBSIGFsbG93cyB5b3UgdG8gYWNjZXNzIGFueSBjb2x1bW4gb2YgYSBkYXRhIGZyYW1lIGJ5IHNwZWNpZnlpbmcgdGhlIG5hbWUgb2YgdGhlIGRhdGEgZnJhbWUgZm9sbG93ZWQgYnkgYSBkb2xsYXIgc2lnbiBhbmQgdGhlIG5hbWUgb2YgdGhlIGNvbHVtbi4KYGBge3J9CnNvdHVfbWV0YSR5ZWFyCmBgYApZb3UgY2FuIGFsc28gdXNlIGZ1bmN0aW9ucyB0byBzdW1tYXJpemUgdGhlIGNvbHVtbnMuIEhlcmUgd2UgdXNlIHRoZSBgdGFibGUoKWAgZnVuY3Rpb24gdG8gc2VlIGhvdyBtYW55IGFkZHJlc3NlcyB3ZXJlIGdpdmVuIGJ5IGVhY2ggcHJlc2lkZW50LgpgYGB7cn0KdGFibGUoc290dV9tZXRhJHByZXMpCmBgYApZb3UgY2FuIHN1YnNldCBhIGRhdGEgZnJhbWUgYnkgZWl0aGVyIHJvd3Mgb3IgY29sdW1ucyB1c2luZyBzcXVhcmUgYnJhY2tldHMuCmBgYHtyfQpzb3R1X21ldGFbMToyLCBdICAjRmlyc3QgdHdvIHJvd3MKc290dV9tZXRhWywgMToyXSAjRmlyc3QgdHdvIGNvbHVtbnMKc290dV9tZXRhWzE6MiwgMToyXSAjRmlyc3QgdHdvIHJvd3MgYW5kIGNvbHVtbnMKc290dV9tZXRhW3NvdHVfbWV0YSRwcmVzID09ICJHZW9yZ2UgV2FzaGluZ3RvbiIsIF0gI1Jvd3Mgd2hlcmUgcHJlcyBpcyBHZW9yZ2UgV2FzaGluZ3Rvbgpzb3R1X21ldGFbc290dV9tZXRhJHVzZV9sYXN0LCBdICNSb3dzIHdoZXJlIHVzZV9sYXN0IGlzIFRSVUUKc290dV9tZXRhW3NvdHVfbWV0YSR5ZWFyICVpbiUgMTgwMDoxODk5LCBdICNBZGRyZXNzZXMgZ2l2ZW4gYmV0d2VlbiAxODAwIGFuZCAxODk5CmBgYAoKWW91IGNhbiAgYWRkIGEgY29sdW1uIGJ5IG5hbWluZyBpdCBhbmQgYXNzaWduaW5nIGl0IGEgdmFsdWUuIEhlcmUgd2Ugd2lsbCBhZGQgYSBjb2x1bW4gd2l0aCB0aGUgcHJlc2lkZW50J3MgbGFzdCBuYW1lLgpgYGB7cn0Kc290dV9tZXRhJHByZXNfbGFzdCA8LSBzdHJfZXh0cmFjdChzb3R1X21ldGEkcHJlcywgIltBLVpdW2Etel0rJCIpCmhlYWQoc290dV9tZXRhKQpgYGAKVGhlIFtkcGx5cl0oaHR0cHM6Ly9yc3R1ZGlvLmNvbS93cC1jb250ZW50L3VwbG9hZHMvMjAxNS8wMi9kYXRhLXdyYW5nbGluZy1jaGVhdHNoZWV0LnBkZikgcGFja2FnZSBhbGxvd3MgeW91IHRvIG1hbmlwdWxhdGUgZGF0YSBmcmFtZXMgd2l0aG91dCB1c2luZyB0aGUgZG9sbGFyIHNpZ24gbm90YXRpb24sIHdoaWNoIG9mdGVuIG1ha2VzIHlvdXIgY29kZSBjbGVhbmVyLiBUaGUgYGRwbHlyYCBwYWNrYWdlIGlzIHBhcnQgb2YgdGhlIGB0aWR5dmVyc2VgLCBzbyB3ZSBkb24ndCBuZWVkIHRvIGluc3RhbGwgaXQgc2VwYXJhdGVseS4gVGhlIGBkcGx5cmAgZnVuY3Rpb24gYHNlbGVjdCgpYCBhbGxvd3MgeW91IHRvIHNlbGVjdCBjb2x1bW5zIGJ5IG5hbWUuCmBgYHtyfQpzZWxlY3Qoc290dV9tZXRhLCB5ZWFyLCBwcmVzKQpgYGAKWW91IGNhbiBhbHNvIHVzZSBwaXBlcyB3aXRoIGBkcGx5cmAgZnVuY3Rpb25zOgpgYGB7cn0Kc290dV9tZXRhIDwtIHNvdHVfbWV0YSAlPiUgc2VsZWN0KC1wcmVzX2xhc3QpCmhlYWQoc290dV9tZXRhKQpgYGAKSW4gdGhlIGNvZGUgYWJvdmUsIEkgY2hhbmdlZCB0aGUgYHNvdHVfbWV0YWAgZGF0YSBmcmFtZSB0byBlbGltaW5hdGUgdGhlIGBwcmVzX2xhc3RgIGNvbHVtbi4gVGhlIGBkcGx5cmAgZnVuY3Rpb24gYGZpbHRlcigpYCBhbGxvd3MgeW91IHRvIHNlbGVjdCByb3dzIGJ5IGdpdmVuIGNyaXRlcmlhLgpgYGB7cn0Kc290dV9tZXRhICU+JSBmaWx0ZXIocHJlcyA9PSAiR2VvcmdlIFdhc2hpbmd0b24iKQpgYGAKVGhlIGBkeXBseXJgIGZ1bmN0aW9uIGBhcnJhbmdlKClgIGxldHMgeW91IHNvcnQgdGhlIGRhdGEgZnJhbWUgYnkgdGhlIHZhbHVlIG9mIGEgZ2l2ZW4gY29sdW1uLgpgYGB7cn0KYXJyYW5nZShzb3R1X21ldGEsIC15ZWFyKQpzb3R1X21ldGEgJT4lIGFycmFuZ2UoLXllYXIpCmBgYApUaGUgYGRwbHlyYCBmdW5jdGlvbiBgbXV0YXRlKClgIGxldHMgeW91IGNoYW5nZSB0aGUgdmFsdWUgb2YgYSBjb2x1bW4gb3IgY3JlYXRlIGEgbmV3IGNvbHVtbi4KYGBge3J9CnNvdHVfbWV0YSAlPiUgbXV0YXRlKHByZXNfbGFzdCA9IHN0cl9leHRyYWN0KHByZXMsICJbQS1aXVthLXpdKyQiKSkKYGBgCgpOb3cgd2UgYXJlIGdvaW5nIHRvIHJlYWQgaW4gdGhlIHRleHQgb2YgdGhlIGFkZHJlc3NlcyBhbmQgY29udmVydCB0aGVtIHRvIGEgdmVjdG9yIHdoZXJlIGVhY2ggZWxlbWVudCBpcyBhIHNlcGFyYXRlIGFkZHJlc3MsIGp1c3QgYXMgd2UgZGlkIGxhc3Qgd2Vlay4KYGBge3J9CnNvdHUgPC0gcmVhZExpbmVzKCJzdGF0ZW9mdGhldW5pb24xNzkwLTIwMTkudHh0IikgJT4lIAogICAgICAgICAgc3RyX2MoY29sbGFwc2UgPSAiICIpICU+JSAKICAgICAgICAgIHN0cl9zcGxpdCgiXFwqXFwqXFwqICIpICU+JSB1bmxpc3QKYGBgClRoZSBgc290dV9tZXRhYCB0YWJsZSBoYXMgMjMzIHJvd3MgYW5kIHRoZSBgc290dWAgdmVjdG9yIGhhcyAyMzMgZWxlbWVudHMsIHNvIHdlIGNhbiBtYWtlIHRoZSBgc290dWAgdmVjdG9yIGEgY29sdW1uIG9mIHRoZSBgc290dV9tZXRhYCB0YWJsZS4KYGBge3J9CnNvdHVfbWV0YSR0ZXh0IDwtIHNvdHUKaGVhZChzb3R1X21ldGEpCmBgYApOb3cgdGhlIHRleHQgb2YgZWFjaCBhZGRyZXNzIGlzIGluIHRoZSBgdGV4dGAgY29sdW1uIG9mIHRoZSBgc290dV9tZXRhYCB0YWJsZS4gV2Ugd2FudCB0byBnZXQgcmlkIG9mIHRoZSBmaXJzdCBwYXJhZ3JhcGggb2YgZXZlcnkgYWRkcmVzcywgc2luY2UgdGhhdCBpcyBvbmx5IG1ldGFkYXRhLCBhbmQgd2Ugd2FudCB0byBnZXQgcmlkIG9mIHRoZSBsYXN0IHBhcmFncmFwaCBvZiBldmVyeSBhZGRyZXNzIHdoZXJlIGB1c2VfbGFzdGAgaXMgRkFMU0UuIFRoZSBjb2RlIGJlbG93IHJlcGxhY2VzIHRoZSBmaXJzdCBwYXJhZ3JhcGggd2l0aCBub3RoaW5nLCB1c2luZyB0aGUgc2FtZSByZWd1bGFyIGV4cHJlc3Npb24gd2UgdXNlZCBsYXN0IHdlZWsgdG8gZXh0cmFjdCB0aGUgZmlyc3QgcGFyYWdyYXBoLgpgYGB7cn0Kc290dV9tZXRhIDwtIHNvdHVfbWV0YSAlPiUgbXV0YXRlKHRleHQgPSBzdHJfcmVwbGFjZSh0ZXh0LCAiW148XSs8cD4gIiwgIiIpKQpoZWFkKHNvdHVfbWV0YSkKYGBgCgpgYGB7cn0Kc290dV9tZXRhIDwtIHNvdHVfbWV0YSAlPiUgbXV0YXRlKHJlbW92ZSA9IGlmZWxzZSh1c2VfbGFzdCwgIiIsIHN0cl9leHRyYWN0KHRleHQsICIgPHA+W148XSskIikpLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgdGV4dCA9IGlmZWxzZSh1c2VfbGFzdCwgdGV4dCwgc3RyX3JlcGxhY2UodGV4dCwgIiA8cD5bXjxdKyQiLCAiIikpKQpzb3R1X21ldGEkcmVtb3ZlCmBgYApOb3cgdGhhdCB3ZSBoYXZlIHdoYXQgd2Ugd2FudCAoYSBkYXRhIGZyYW1lIHdpdGggdGhlIHRleHQgb2YgdGhlIFN0YXRlIG9mIHRoZSBVbmlvbnMgYW5kIHRoZSBtZXRhZGF0YSksIHdlIGFyZSBnb2luZyB0byB3cml0ZSBhIGZ1bmN0aW9uIHNvIHRoYXQgd2UgY2FuIGRvIHRoaXMgYWxsIGluIG9uZSBzdGVwIGFueSB0aW1lIHdlIHdhbnQuCgojIyBNYWtpbmcgbmV3IGZ1bmN0aW9ucwpBIGZ1bmN0aW9uIGlzIGp1c3QgYW4gYWxnb3JpdGhtIHRoYXQgd2UgbWFrZSB1cCBvdXJzZWx2ZXMgYW5kIHdhbnQgdG8gcmUtdXNlLiBUbyBnaXZlIGFuIGV4YW1wbGUsIHdlIGNhbiBtYWtlIGEgZnVuY3Rpb24gY2FsbGVkIGBkb19tYXRoKClgIHRoYXQgd2lsbCB0YWtlIGFueSBudW1iZXIsIG11bHRpcGx5IGl0IGJ5IHR3bywgYW5kIHRoZW4gYWRkIGZpdmUuCmBgYHtyfQpkb19tYXRoIDwtIGZ1bmN0aW9uKGRvbnV0cykgewogIGRvbnV0cyoyICsgNQp9Cgpkb19tYXRoKDUpCiNCZWNhdXNlIGFsbCBSIG9wZXJhdGlvbnMgYXJlIHZlY3Rvcml6ZWQsIHdlIGNhbiBhbHNvIHBhc3MgYSB2ZWN0b3IgdG8gdGhlIGZ1bmN0aW9uCmRvX21hdGgoYygxLCAzLCA1LCA3KSkKYGBgClRoZSBmdW5jdGlvbiB0byBtYWtlIHRoZSBTdGF0ZSBvZiB0aGUgVW5pb24gdGFibGUgd29uJ3QgdGFrZSBhbnkgYXJndW1lbnRzLCBpdCB3aWxsIGp1c3QgbWFrZSB0aGUgdGFibGUgb2YgU3RhdGUgb2YgdGhlIFVuaW9uIGFkZHJlc3NlcyBhbmQgcmV0dXJuIGl0LgpgYGB7cn0KbWFrZV9zb3R1IDwtIGZ1bmN0aW9uKCkgewogIHNvdHVfbWV0YSA8LSByZWFkX2Nzdigic290dV9tZXRhZGF0YS5jc3YiKQogIHNvdHVfbWV0YSR0ZXh0IDwtIHVubGlzdChzdHJfc3BsaXQoc3RyX2MocmVhZExpbmVzKCJzdGF0ZW9mdGhlVW5pb24xNzkwLTIwMTkudHh0IiksIGNvbGxhcHNlID0gIiAiKSwgIlxcKlxcKlxcKiAiKSkKICBzb3R1X21ldGEgPC0gc290dV9tZXRhICU+JSBtdXRhdGUodGV4dCA9IHN0cl9yZXBsYWNlKHRleHQsICJbXjxdKzxwPiAiLCAiIikpICU+JQogICAgICAgICAgICAgICAgICAgICAgICAgICAgIG11dGF0ZSh0ZXh0ID0gaWZlbHNlKHVzZV9sYXN0LCB0ZXh0LCBzdHJfcmVwbGFjZSh0ZXh0LCAiIDxwPltePF0rJCIsICIiKSkpCiAgcmV0dXJuKHNvdHVfbWV0YSkKfQpgYGAKQ29weSB0aGlzIGZ1bmN0aW9uIGludG8gYSBuZXcgUiBzY3JpcHQgY2FsbGVkIGBmdW5jdGlvbnMucmAuIFRoZSBjb2RlIGJlbG93IGNsZWFycyBvdXQgeW91ciB3b3JraW5nIGVudmlyb25tZW50LCBydW5zIHRoZSBgZnVuY3Rpb25zLnJgIHNjcmlwdCwgYW5kIG1ha2VzIHRoZSBgc290dWAgZGF0YSBmcmFtZS4KYGBge3J9CnJtKGxpc3QgPSBscygpKQpzb3VyY2UoImZ1bmN0aW9ucy5yIikKc290dSA8LSBtYWtlX3NvdHUoKQpoZWFkKHNvdHUpCmBgYAoKV2Ugd29uJ3QgYmUgdXNpbmcgdGhlIHBhcmFncmFwaCB0YWdzIHRvZGF5LCBzbyB3ZSBjYW4gcmVtb3ZlIHRoZW0gZnJvbSB0aGUgYHRleHRgIGNvbHVtbiBvZiBgc290dWAuCmBgYHtyfQojRWxpbWluYXRlIHBhcmFncmFwaCB0YWdzCnNvdHUkdGV4dCA8LSBzdHJfcmVwbGFjZV9hbGwoc290dSR0ZXh0LCAiPHA+IiwgIiIpCmBgYAoKIyMgVG9rZW5pemluZwpUaGUgZmlyc3Qgc3RlcCBpbiBjYWxjdWxhdGluZyB3b3JkIGZyZXF1ZW5jaWVzIGlzIHRva2VuaXppbmcgb3VyIHRleHQsIHdoaWNoIG1lYW5zIHNwbGl0dGluZyBpdCB1cCBpbnRvIHVuaXRzLiBUaGUgYHRpZHl0ZXh0YCBwYWNrYWdlIG9mZmVycyB1c2VmdWwgdG9vbHMgZm9yIHRva2VuaXppbmcuCmBgYHtyfQojaW5zdGFsbC5wYWNrYWdlcygidGlkeXRleHQiKQpsaWJyYXJ5KHRpZHl0ZXh0KQpgYGAKRm9yIHdvcmQgZnJlcXVlbmNpZXMsIG91ciB1bml0cyB3aWxsIGJlIHdvcmRzLiBXZSB3aWxsIHVzZSB0aGUgYHVubmVzdF90b2tlbnMoKWAgZnVuY3Rpb24uIFRoZSBkZWZhdWx0IHVuaXQgb2YgYHVubmVzdF90b2tlbnMoKWAgaXMgdGhlIHdvcmQsIHNvIHdlIG9ubHkgbmVlZCB0byBzcGVjaWZ5IHR3byBhcmd1bWVudHM6IHRoZSBuYW1lIG9mIHRoZSBuZXcgY29sdW1uIHRoYXQgd2lsbCBjb250YWluIHRoZSB3b3JkcyAoImdyYW0iKSBhbmQgdGhlIG5hbWUgb2YgdGhlIGNvbHVtbiB0aGF0IGNvbnRhaW5zIHRoZSB0ZXh0IHdlIHdhbnQgdG8gdG9rZW5pemUgKCJ0ZXh0IikuCmBgYHtyfQpzb3R1X3dvcmRzIDwtIHNvdHUgJT4lIHVubmVzdF90b2tlbnMoZ3JhbSwgdGV4dCkKaGVhZChzb3R1X3dvcmRzKQpgYGAKCkFzIHlvdSBjYW4gc2VlLCB0aGUgYHVubmVzdF90b2tlbnMoKWAgZnVuY3Rpb24gYWxzbyByZW1vdmVkIGFsbCBwdW5jdHVhdGlvbiBhbmQgY29udmVydGVkIGFsbCBsZXR0ZXJzIHRvIGxvd2VyIGNhc2UuIFdlIG5vdyBoYXZlIG9uZSByb3cgZm9yIGVhY2ggd29yZCBpbiBlYWNoIGFkZHJlc3MuIAoKV2UgY2FuIHZpc3VhbGl6ZSB0aGUgbnVtYmVyIG9mIHdvcmRzIGluIGVhY2ggeWVhciB3aXRoIHRoZSBgZ2dwbG90KClgIGZ1bmN0aW9uLiBUaGlzIGlzIGEgdmVyeSBwb3dlcmZ1bCBhbmQgZmxleGlibGUgZnVuY3Rpb24gdGhhdCB3ZSBjYW4gdXNlIGZvciBhbGwga2luZHMgb2YgdmlzdWFsaXphdGlvbnMsIHNvIGl0IGlzIHdvcnRoIHRha2luZyBzb21lIHRpbWUgdG8gdW5kZXJzdGFuZCBob3cgaXQgd29ya3MuIFlvdSBtYXkgYWxzbyB3YW50IHRvIGNvbnN1bHQgdGhlIFtnZ3Bsb3QyIGNoZWF0IHNoZWV0XShodHRwczovL3JzdHVkaW8uY29tL3dwLWNvbnRlbnQvdXBsb2Fkcy8yMDE1LzAzL2dncGxvdDItY2hlYXRzaGVldC5wZGYpLiBUaGUgYGdncGxvdDJgIHBhY2thZ2UgaXMgcGFydCBvZiB0aGUgYHRpZHl2ZXJzZWAgcGFja2FnZSwgc28geW91IGRvbid0IG5lZWQgdG8gaW5zdGFsbCBpdCBzZXBhcmF0ZWx5LgoKQXQgaXRzIG1vc3QgYmFzaWMsIGBnZ3Bsb3QoKWAgdGFrZXMgdHdvIGFyZ3VtZW50czogdGhlIGRhdGEgZnJhbWUgeW91IHdhbnQgdG8gdmlzdWFsaXplLCBhbmQgYSBtYXBwaW5nIG9mIGFlc3RoZXRpY3MgKGBhZXMoKWApLiBIZXJlIHdlIGFyZSBzYXlpbmcgdGhhdCB3ZSB3YW50IHRvIGdyYXBoIHRoZSBgc290dV93b3Jkc2AgZGF0YSBmcmFtZSB3aXRoIHRoZSBgeWVhcmAgY29sdW1uIG9uIHRoZSB4LWF4aXMuCmBgYHtyfQpnZ3Bsb3Qoc290dV93b3JkcywgYWVzKHggPSB5ZWFyKSkKYGBgCldoZW4gd2UgcnVuIHRoaXMsIHdlIGdldCBhIGJsYW5rIGdyYXBoLiBOb3cgd2UgbmVlZCB0byBhZGQgYSBsYXllciB0ZWxsaW5nIGBnZ3Bsb3QoKWAgd2hhdCBraW5kIG9mIGdlb21ldHJ5IHdlIHdhbnQgdG8gdXNlIHRvIHZpc3VhbGl6ZSBvdXIgZGF0YS4gSW4gdGhpcyBjYXNlLCB3ZSB3aWxsIHVzZSBgZ2VvbV9iYXIoKWAsIHdoaWNoIGlzIGEgdmVydGljYWwgYmFyLiBUaGUgaGVpZ2h0IG9mIHRoZSBiYXIgaXMgdGhlIG51bWJlciBvZiByb3dzIHdpdGggdGhlIGdpdmVuIHgtYXhpcyB2YWx1ZSAoYHllYXJgKS4gU28gd2hhdCB3ZSB3aWxsIHNlZSBoZXJlIGlzIHRoZSBudW1iZXIgb2Ygd29yZHMgaW4gZWFjaCB5ZWFyLgpgYGB7cn0KZ2dwbG90KHNvdHVfd29yZHMsIGFlcyh4ID0geWVhcikpICsgZ2VvbV9iYXIoKQpgYGAKV2UgY2FuIHVzZSBjb2xvciB0byBpbmRpY2F0ZSB3aGljaCBwcmVzaWRlbnQgZ2F2ZSBlYWNoIHNwZWVjaC4gU2luY2UgY29sb3Igd2lsbCBtYXAgb24gdG8gYW4gZWxlbWVudCBvZiB0aGUgZGF0YSAodGhlIGBwcmVzYCBjb2x1bW4pLCBpdCBnb2VzIGluIHRoZSBgYWVzKClgIGZ1bmN0aW9uLgpgYGB7cn0KZ2dwbG90KHNvdHVfd29yZHMsIGFlcyh4ID0geWVhciwgZmlsbCA9IHByZXMpKSArIGdlb21fYmFyKCkKYGBgClRoZSBsZWdlbmQgaXMgcXVpdGUgbGFyZ2UgYW5kIHJlYWxseSB1bm5lY2Vzc2FyeSwgc2luY2UgdGhlcmUgd2FzIG9ubHkgb25lIHByZXNpZGVudCBpbiBlYWNoIHllYXIsIHNvIHdlIGNhbiB1c2UgYSBgdGhlbWUoKWAgbGF5ZXIgdG8gcmVtb3ZlIGl0LgpgYGB7cn0KZ2dwbG90KHNvdHVfd29yZHMsIGFlcyh4ID0geWVhciwgZmlsbCA9IHByZXMpKSArIGdlb21fYmFyKCkgKyB0aGVtZShsZWdlbmQucG9zaXRpb24gPSAibm9uZSIpCmBgYApXZSBjYW4gdXNlIHRoZSBsYXllcnMgYHNjYWxlX3hfY29udGludW91cygpYCBhbmQgYHNjYWxlX3lfY29udGludW91cygpYCB0byBjb250cm9sIHRoZSBhcHBlYXJhbmNlIG9mIHRoZSB4LSBhbmQgeS1heGVzLCBhbmQgdGhlIGBsYWJzKClgIGxheWVycyB0byBsYWJlbCB0aGUgYXhlcyBhbmQgYWRkIGEgdGl0bGUgdG8gdGhlIGdyYXBoLgpgYGB7cn0KZ2dwbG90KHNvdHVfd29yZHMsIGFlcyh4ID0geWVhciwgZmlsbCA9IHByZXMpKSArIAogICAgICBnZW9tX2JhcigpICsgdGhlbWUobGVnZW5kLnBvc2l0aW9uID0gIm5vbmUiKSArCiAgICAgIHNjYWxlX3hfY29udGludW91cyhicmVha3MgPSBzZXEoMTc5MCwgMjAzMCwgMjApKSArCiAgICAgIHNjYWxlX3lfY29udGludW91cyhsYWJlbHMgPSBzY2FsZXM6OmNvbW1hKSArCiAgICAgIGxhYnMoeCA9ICJZZWFyIiwgeSA9ICJXb3JkcyIsIHRpdGxlID0gIlN0YXRlIG9mIHRoZSBVbmlvbiBBZGRyZXNzZXMsIDE3OTAtMjAxOTogTnVtYmVyIG9mIFdvcmRzIikKYGBgCk5vdyBsZXQncyBnZXQgYSBzZW5zZSBvZiB3aGF0IHRoZSB3b3JkcyBhcmUuIFdlIGNhbiBjb3VudCB0aGUgbnVtYmVyIG9mIHRpbWVzIGVhY2ggd29yZCBhcHBlYXJzIHdpdGggdGhlIGBjb3VudCgpYCBmdW5jdGlvbiBhbmQgdmlldyB0aGUgdG9wIHdvcmRzIHdpdGggdGhlIGBzb3J0ID0gVFJVRWAgYXJndW1lbnQuCmBgYHtyfQpzb3R1X3dvcmRzICU+JSBjb3VudChncmFtLCBzb3J0ID0gVFJVRSkgCmBgYApJdCBtYXkgbm90IGJlIHN1cnByaXNpbmcgdG8gZmluZCB0aGF0IHRoZSBtb3N0IGNvbW1vbiB3b3JkcyBhcmUgc2hvcnQgd29yZHMgdGhhdCBkb24ndCBjYXJyeSBtdWNoIG1lYW5pbmcuIEluIHRleHQgbWluaW5nLCB0aGVzZSBhcmUga25vd24gYXMgc3RvcHdvcmRzLiBEZXBlbmRpbmcgb24gd2hhdCB5b3UgYXJlIGRvaW5nLCB5b3UgbWF5IG9yIG1heSBub3Qgd2FudCB0byByZW1vdmUgdGhlbS4gRm9yIGV4YW1wbGUsIHRoZXNlIHdvcmRzIGFyZSBvZnRlbiB2ZXJ5IHVzZWZ1bCBmb3IgYXV0aG9yIGF0dHJpYnV0aW9uLiBJbiB0aGlzIGNhc2UsIGhvd2V2ZXIsIHdlIHdpbGwgcmVtb3ZlIHRoZW0uIFRoZSBgdGlkeXRleHRgIHBhY2thZ2UgcHJvdmlkZXMgYSBsaXN0IG9mIHN0b3B3b3JkcyBmcm9tIHRocmVlIGxleGljb25zIChTTUFSVCwgc25vd2JhbGwsIGFuZCBvbml4KS4gVGhlIGNvZGUgYmVsb3cgcHJpbnRzIGVhY2ggbGV4aWNvbi4KYGBge3J9CmRhdGEoc3RvcF93b3JkcykKaGVhZChzdG9wX3dvcmRzKQpmb3IgKGwgaW4gMTpsZW5ndGgodW5pcXVlKHN0b3Bfd29yZHMkbGV4aWNvbikpKSB7CiAgbGV4IDwtIHVuaXF1ZShzdG9wX3dvcmRzJGxleGljb24pW2xdCiAgcHJpbnQobGV4KQogIHByaW50KHN0b3Bfd29yZHMkd29yZFtzdG9wX3dvcmRzJGxleGljb24gPT0gbGV4XSkKfQpgYGAKCllvdSBjYW4gdXNlIHRoZSBgZmlsdGVyKClgIGZ1bmN0aW9uIHRvIHJlbW92ZSBzdG9wd29yZHMuIEkgYWxzbyB3YW50IHRvIHRha2Ugb3V0IHRva2VucyB3aXRoIGRpZ2l0cyBpbiB0aGVtLiBTaW5jZSB0b2tlbml6aW5nIGFuZCByZW1vdmluZyBzdG9wd29yZHMgaXMgc29tZXRoaW5nIEkgd2lsbCB3YW50IHRvIGRvIGFnYWluIGluIHRoZSBmdXR1cmUsIEknbSBnb2luZyB0byB3cml0ZSBhIGZ1bmN0aW9uIGZvciBpdC4KYGBge3J9CnNvdHVfdG9rZW5pemVfd29yZHMgPC0gZnVuY3Rpb24oKSB7CiAgc290dV93b3JkcyA8LSBzb3R1ICU+JSBtdXRhdGUodGV4dCA9IHN0cl9yZXBsYWNlX2FsbCh0ZXh0LCAiPHA+IiwgIiIpKSAlPiUgdW5uZXN0X3Rva2VucyhncmFtLCB0ZXh0KSAlPiUKICAgICAgICAgICAgICAgICAgZmlsdGVyKCFncmFtICVpbiUgc3RvcF93b3JkcyR3b3JkICYgIXN0cl9kZXRlY3QoZ3JhbSwgIls6ZGlnaXQ6XSIpKQogIHJldHVybihzb3R1X3dvcmRzKQp9CmBgYAoKYGBge3J9CnJtKGxpc3QgPSBscygpKQpzb3VyY2UoImZ1bmN0aW9ucy5yIikKc290dSA8LSBtYWtlX3NvdHUoKQpzb3R1X3dvcmRzIDwtIHNvdHVfdG9rZW5pemVfd29yZHMoKQpgYGAKTm93IHdlIGNhbiBzZWUgdGhlIG1vc3QgZnJlcXVlbnQgbm9uLXN0b3Agd29yZHMgYWNyb3NzIGFsbCBvZiB0aGUgU3RhdGUgb2YgdGhlIFVuaW9uIGFkZHJlc3Nlcy4KYGBge3J9CnNvdHVfd29yZHMgJT4lIGNvdW50KGdyYW0sIHNvcnQgPSBUUlVFKQpgYGAKIyMgV29yZCBmcmVxdWVuY3kgb3ZlciB0aW1lClRoZXJlIGFyZSBtYW55IGRpZmZlcmVudCB3YXlzIHRoYXQgd2UgY2FuIGFuYWx5emUgd29yZCBmcmVxdWVuY3kgb3ZlciB0aW1lLiBPbmUgd2F5IGlzIHRvIGdyYXBoIHRoZSBmcmVxdWVuY3kgb2YgdGhlIG92ZXJhbGwgdG9wIDI1IChvciBhbnkgb3RoZXIgbnVtYmVyIG9mKSB3b3JkcyBpbiBlYWNoIHllYXIuCgpXZSBjYW4gZ2V0IHRoZSBvdmVyYWxsIHRvcCAyNSB3b3JkcyB3aXRoIHRoZSBgdG9wX24oKWAgZnVuY3Rpb246CmBgYHtyfQp0b3BzIDwtIChzb3R1X3dvcmRzICU+JSBjb3VudChncmFtKSAlPiUgdG9wX24oMjUsIG4pKSRncmFtCnRvcHMKYGBgCk5vdyB3ZSBzZWxlY3Qgb25seSB0aGUgcm93cyBvZiBgc290dV93b3Jkc2AgdGhhdCByZWZlciB0byB0aGVzZSB3b3Jkcy4KYGBge3J9CnNvdHVfd29yZHMgJT4lIGZpbHRlcihncmFtICVpbiUgdG9wcykKYGBgCldlIGNhbiB1c2UgdGhlIGBncm91cF9ieSgpYCBmdW5jdGlvbiB0b2dldGhlciB3aXRoIHRoZSBgY291bnQoKWAgZnVuY3Rpb24gdG8gc2VlIGhvdyBtYW55IHRpbWVzIGVhY2ggb2YgdGhlc2Ugd29yZHMgYXBwZWFyIGluIGVhY2ggU3RhdGUgb2YgdGhlIFVuaW9uIGFkZHJlc3MuCmBgYHtyfQpzb3R1X3dvcmRzICU+JSBmaWx0ZXIoZ3JhbSAlaW4lIHRvcHMpICU+JSBncm91cF9ieSh5ZWFyKSAlPiUgY291bnQoZ3JhbSkKYGBgCldlIGNhbiBlYXNpbHkgY29tcGFyZSBmcm9tIHllYXIgdG8geWVhciBieSBncmFwaGluZyB0aGUgcmVzdWx0cy4KYGBge3J9CnNvdHVfd29yZHMgJT4lIGZpbHRlcihncmFtICVpbiUgdG9wcykgJT4lIGdyb3VwX2J5KHllYXIpICU+JSBjb3VudChncmFtKSAlPiUKICBnZ3Bsb3QoYWVzKHggPSB5ZWFyLCB5ID0gZ3JhbSwgc2l6ZSA9IG4pKSArIGdlb21fcG9pbnQoKSArIHRoZW1lKGxlZ2VuZC5wb3NpdGlvbiA9ICJub25lIikKYGBgCklmIHdlIHdhbnQgdG8gY29sb3IgdGhlIGRvdHMgYnkgcHJlc2lkZW50LCB3ZSBuZWVkIHRvIGFkZCBgcHJlc2AgdG8gb3VyIGBncm91cF9ieSgpYCBmdW5jdGlvbiBhbmQgdG8gdGhlIGBhZXMoKWAgZnVuY3Rpb24uCmBgYHtyfQpzb3R1X3dvcmRzICU+JSBmaWx0ZXIoZ3JhbSAlaW4lIHRvcHMpICU+JSBncm91cF9ieSh5ZWFyLCBwcmVzKSAlPiUgY291bnQoZ3JhbSkgJT4lCiAgZ2dwbG90KGFlcyh4ID0geWVhciwgeSA9IGdyYW0sIHNpemUgPSBuLCBjb2xvciA9IHByZXMpKSArIAogICAgZ2VvbV9wb2ludCgpICsgdGhlbWUobGVnZW5kLnBvc2l0aW9uID0gIm5vbmUiKQpgYGAKTm93IHdlIGNhbiBnZW5lcmFsaXplIGV2ZXJ5dGhpbmcgd2UganVzdCBkaWQgaW50byBhIGZ1bmN0aW9uLgpgYGB7cn0KYWxsX3RvcF9ncmFtcyA8LSBmdW5jdGlvbihkYXRhc2V0LCBudW1iZXIpIHsKICB0b3BzIDwtIChkYXRhc2V0ICU+JSBjb3VudChncmFtKSAlPiUgdG9wX24obnVtYmVyKSkkZ3JhbQogIGRhdGFzZXQgJT4lIGZpbHRlcihncmFtICVpbiUgdG9wcykgJT4lIGdyb3VwX2J5KHllYXIsIHByZXMpICU+JSBjb3VudChncmFtKSAlPiUKICAgIGdncGxvdChhZXMoeCA9IHllYXIsIHkgPSBncmFtLCBzaXplID0gbiwgY29sb3IgPSBwcmVzKSkgKyAKICAgICAgZ2VvbV9wb2ludCgpICsgdGhlbWUobGVnZW5kLnBvc2l0aW9uID0gIm5vbmUiKQp9CmFsbF90b3BfZ3JhbXMoc290dV93b3JkcywgMTApCmBgYAoKV2hlbiB3ZSBncmFwaCByYXcgY291bnRzLCB3ZSBmaW5kIHRoYXQgdGhlIGxvbmdlciBhZGRyZXNzZXMgdXNlIGFsbCBvZiB0aGUgd29yZHMgbW9yZSBvZnRlbiAoc2VlIENhcnRlciwgMTk4MCkuIFdlIGNhbiBzdGFuZGFyZGl6ZSB0aGUgY291bnRzIGJ5IGRpdmlkaW5nIGVhY2ggb25lIGJ5IHRoZSB0b3RhbCBudW1iZXIgb2YgKG5vbi1zdG9wKSB3b3JkcyBpbiBlYWNoIGFkZHJlc3MuIEJlbG93IEknbSBnb2luZyB0byByZXZpc2UgdGhlIGBhbGxfdG9wX2dyYW1zKClgIGZ1bmN0aW9uIHRvIHN0YW5kYXJkaXplIHRoZSB3b3JkIGNvdW50LgpgYGB7cn0KYWxsX3RvcF9ncmFtcyA8LSBmdW5jdGlvbihkYXRhc2V0LCBudW1iZXIpIHsKICB0b3BzIDwtIChkYXRhc2V0ICU+JSBjb3VudChncmFtKSAlPiUgdG9wX24obnVtYmVyKSkkZ3JhbQogIGRhdGFzZXQgJT4lIGdyb3VwX2J5KHllYXIsIHByZXMpICU+JSBjb3VudChncmFtKSAlPiUgCiAgICBtdXRhdGUocGN0ID0gbi9zdW0obikpICU+JSBmaWx0ZXIoZ3JhbSAlaW4lIHRvcHMpICU+JQogICAgZ2dwbG90KGFlcyh4ID0geWVhciwgeSA9IGdyYW0sIHNpemUgPSBwY3QsIGNvbG9yID0gcHJlcykpICsgCiAgICAgIGdlb21fcG9pbnQoKSArIHRoZW1lKGxlZ2VuZC5wb3NpdGlvbiA9ICJub25lIikKfQphbGxfdG9wX2dyYW1zKHNvdHVfd29yZHMsIDEwKQphbGxfdG9wX2dyYW1zKHNvdHVfd29yZHMsIDI1KQpgYGAKCgpBbm90aGVyIG9wdGlvbiB3b3VsZCBiZSB0byBjb25zaWRlciB0aGUgdG9wIDUgKG9yIGFueSBvdGhlciBudW1iZXIgb2YpIHdvcmRzIHVzZWQgYnkgZWFjaCBwcmVzaWRlbnQuIApgYGB7cn0KdG9wX2dyYW1zX3ByZXMgPC0gZnVuY3Rpb24oZGF0YXNldCwgbnVtYmVyKSB7CiAgdG9wcyA8LSAoZGF0YXNldCAlPiUgZ3JvdXBfYnkocHJlcykgJT4lIGNvdW50KGdyYW0pICU+JSB0b3BfbihudW1iZXIpKSRncmFtCiAgZGF0YXNldCAlPiUgZ3JvdXBfYnkoeWVhciwgcHJlcykgJT4lIGNvdW50KGdyYW0pICU+JSAKICAgIG11dGF0ZShwY3QgPSBuL3N1bShuKSkgJT4lIGZpbHRlcihncmFtICVpbiUgdG9wcykgJT4lCiAgICBnZ3Bsb3QoYWVzKHggPSB5ZWFyLCB5ID0gZ3JhbSwgc2l6ZSA9IHBjdCwgY29sb3IgPSBwcmVzKSkgKyAKICAgIGdlb21fcG9pbnQoKSArIHRoZW1lKGxlZ2VuZC5wb3NpdGlvbiA9ICJub25lIikKfQp0b3BfZ3JhbXNfcHJlcyhzb3R1X3dvcmRzLCA1KQpgYGAKCkFub3RoZXIgd2F5IHRvIHZpc3VhbGl6ZSBjaGFuZ2VzIGluIHdvcmQgdXNhZ2Ugb3ZlciB0aW1lIGlzIHRvIG1ha2Ugc2VwYXJhdGUgZ3JhcGhzIG9mIHRoZSB0b3Agd29yZHMgZm9yIGVhY2ggcHJlc2lkZW50LgoKYGBge3J9CnNvdHVfd29yZHMgJT4lIGdyb3VwX2J5KHByZXMpICU+JSBjb3VudChncmFtKSAlPiUgbXV0YXRlKHBjdCA9IG4vc3VtKG4pKSAlPiUgdG9wX24oNSwgcGN0KSAlPiUKICBnZ3Bsb3QoYWVzKHggPSBncmFtLCB5ID0gcGN0LCBmaWxsID0gcHJlcykpICsgZ2VvbV9jb2woKSArCiAgICBjb29yZF9mbGlwKCkgKyBmYWNldF93cmFwKHZhcnMocHJlcyksIHNjYWxlID0gImZyZWVfeSIpICsgdGhlbWUobGVnZW5kLnBvc2l0aW9uID0gIm5vbmUiKQpgYGAKVGhpcyBncmFwaCBpcyB0b28gYmlnIHRvIHNlZSBpbiBvdXIgbm90ZWJvb2ssIGJ1dCB3ZSBjYW4gZXhwb3J0IGl0IHRvIGEgYC5wbmdgIGZpbGUuCmBgYHtyfQpwbmcoIndvcmRzX2J5X3ByZXMucG5nIiwgaGVpZ2h0ID0gMjAsIHdpZHRoID0gMzAsIHVuaXRzID0gImluIiwgcmVzID0gMjAwKQogIHNvdHVfd29yZHMgJT4lIGdyb3VwX2J5KHByZXMpICU+JSBjb3VudChncmFtKSAlPiUgbXV0YXRlKHBjdCA9IG4vc3VtKG4pKSAlPiUgdG9wX24oNSkgJT4lCiAgICBnZ3Bsb3QoYWVzKHggPSBncmFtLCB5ID0gcGN0LCBmaWxsID0gcHJlcykpICsgZ2VvbV9jb2woKSArCiAgICAgIGNvb3JkX2ZsaXAoKSArIGZhY2V0X3dyYXAodmFycyhwcmVzKSwgc2NhbGUgPSAiZnJlZV95IikgKyB0aGVtZShsZWdlbmQucG9zaXRpb24gPSAibm9uZSIpCmRldi5vZmYoKQpgYGAKIVtdKHdvcmRzX2J5X3ByZXMucG5nKQoKVGhpcyBpcyBwcmV0dHksIGJ1dCB0aGUgYWxwaGFiZXRpY2FsIG9yZGVyaW5nIG9mIHByZXNpZGVudHMgaXMgbm90IGNvbmR1Y2l2ZSB0byBhbmFseXppbmcgY2hhbmdlIG92ZXIgdGltZS4gV2hhdCB3ZSBuZWVkIHRvIGRvIGlzIGNvbnZlcnQgdGhlIGBwcmVzYCB2YXJpYWJsZSB0byBhICoqZmFjdG9yKiogdG8gcHJlc2VydmUgdGhlIGNocm9ub2xvZ2ljYWwgb3JkZXIuIEEgZmFjdG9yIGlzIGEgc3BlY2lhbCBkYXRhIHR5cGUgdGhhdCBzdG9yZXMgc3RyaW5ncyBhcyBudW1iZXJzLiBXZSBjYW4ndCBtYW5pcHVsYXRlIGZhY3RvcnMgdGhlIHdheSB3ZSB3b3VsZCBtYW5pcHVsYXRlIHJlZ3VsYXIgY2hhcmFjdGVyIHZhcmlhYmxlcywgYW5kIHdlIGFsc28gY2FuJ3QgZG8gbWF0aCB3aXRoIHRoZW0uCmBgYHtyfQp0YWJsZShmYWN0b3Ioc290dV93b3JkcyRwcmVzKSkKdGFibGUoZmN0X2lub3JkZXIoZmFjdG9yKHNvdHVfd29yZHMkcHJlcykpKQpgYGAKV2UgY2FuIHNlZSB3aGVuIHdlIG1ha2UgYSB0YWJsZSBvZiB0aGUgbmV3IGBwcmVzaWRlbnRgIGNvbHVtbiB0aGF0IHRoZSBwcmVzaWRlbnRzIGFyZSBsaXN0ZWQgaW4gY2hyb25vbG9naWNhbCBvcmRlciByYXRoZXIgdGhhbiBhbHBoYWJldGljYWwgb3JkZXIuIE5vdyBsZXQncyBtYWtlIGEgZ2VuZXJhbCBmdW5jdGlvbiB0byBjcmVhdGUgdGhlIGB3b3Jkc19ieV9wcmVzLnBuZ2AgY2hhcnQuCmBgYHtyfQpncmFtc19ieV9wcmVzIDwtIGZ1bmN0aW9uKGRhdGFzZXQsIGZpbGVuYW1lLCBudW1iZXIpIHsKICBncmFwaCA8LSBkYXRhc2V0ICU+JSBtdXRhdGUocHJlc2lkZW50ID0gZmN0X2lub3JkZXIoZmFjdG9yKHByZXMpKSkgJT4lCiAgICAgICAgICAgZ3JvdXBfYnkocHJlc2lkZW50KSAlPiUgY291bnQoZ3JhbSkgJT4lIG11dGF0ZShwY3QgPSBuL3N1bShuKSkgJT4lIHRvcF9uKG51bWJlcikgJT4lCiAgICAgICAgICAgICAgZ2dwbG90KGFlcyh4ID0gZ3JhbSwgeSA9IHBjdCwgZmlsbCA9IHByZXNpZGVudCkpICsgZ2VvbV9jb2woKSArCiAgICAgICAgICAgICAgY29vcmRfZmxpcCgpICsgZmFjZXRfd3JhcCh2YXJzKHByZXNpZGVudCksIHNjYWxlID0gImZyZWVfeSIpICsgdGhlbWUobGVnZW5kLnBvc2l0aW9uID0gIm5vbmUiKQogIHBuZyhmaWxlbmFtZSwgaGVpZ2h0ID0gMjAsIHdpZHRoID0gMzAsIHVuaXRzID0gImluIiwgcmVzID0gMjAwKQogICAgcHJpbnQoZ3JhcGgpCiAgZGV2Lm9mZigpCn0KZ3JhbXNfYnlfcHJlcyhzb3R1X3dvcmRzLCAid29yZHNfYnlfcHJlczIucG5nIiwgMTApCmBgYAohW10od29yZHNfYnlfcHJlczIucG5nKQ==