<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Locke Data Blog</title>
    <link>https://itsalocke.com/blog/</link>
    <description>Recent content on Locke Data Blog</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-gb</language>
    <copyright>&lt;a rel=&#34;license&#34; href=&#34;http://creativecommons.org/licenses/by-nc-sa/4.0/&#34;&gt;&lt;img alt=&#34;Creative Commons License&#34; style=&#34;border-width:0&#34; src=&#34;https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png&#34; /&gt;&lt;/a&gt;&lt;br /&gt;This work is licensed under a &lt;a rel=&#34;license&#34; href=&#34;http://creativecommons.org/licenses/by-nc-sa/4.0/&#34;&gt;Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License&lt;/a&gt;.</copyright>
    <lastBuildDate>Fri, 14 Dec 2018 00:00:00 +0000</lastBuildDate>
    
        <atom:link href="https://itsalocke.com/blog/index.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
      <title>Gift ideas for the R lovers</title>
      <link>https://itsalocke.com/blog/gift-ideas-for-the-r-lovers/</link>
      <pubDate>Fri, 14 Dec 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/gift-ideas-for-the-r-lovers/</guid>
      <description>
        
        

&lt;p&gt;Are you looking for gift ideas for the R addicts in your life or the next guest speaker at your user group? Here are a few ideas, in four categories! Not all of them feature our company, we promise. ;-)&lt;/p&gt;

&lt;h1 id=&#34;learning-and-using-r&#34;&gt;Learning and using R&lt;/h1&gt;

&lt;p&gt;Is your dear one just starting to learn R? You might help them by gifting them a book, or even a couple of them! We suggest the R fundamentals series by Locke Data&amp;rsquo;s Steph Locke, &lt;a href=&#34;https://www.amazon.com/gp/product/1999842618/ref=dbs_a_def_rwt_bibl_vppi_i2&#34;&gt;Volume 1, Working with R&lt;/a&gt; and &lt;a href=&#34;https://www.amazon.com/gp/product/1979699933/ref=dbs_a_def_rwt_bibl_vppi_i1&#34;&gt;Volume 2, Data Manipulation in R&lt;/a&gt;, both of them available as paperback and Kindle versions. Other useful books for R users of all levels include &lt;a href=&#34;http://bestprogrammingbooks.com/9-r-popular-o-reilly-books/&#34;&gt;R books by O&amp;rsquo;Reilly&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;Got all my &lt;a href=&#34;https://twitter.com/hashtag/rstats?src=hash&amp;amp;ref_src=twsrc%5Etfw&#34;&gt;#rstats&lt;/a&gt; cheat sheets printed and laminated to pass around during R-Lab to help the students this semester. &lt;a href=&#34;https://twitter.com/hashtag/tidyverse?src=hash&amp;amp;ref_src=twsrc%5Etfw&#34;&gt;#tidyverse&lt;/a&gt; &lt;a href=&#34;https://t.co/AcxIDfj0i9&#34;&gt;pic.twitter.com/AcxIDfj0i9&lt;/a&gt;&lt;/p&gt;&amp;mdash; Dylan McDowell (@dylanjm_ds) &lt;a href=&#34;https://twitter.com/dylanjm_ds/status/1040644353331326976?ref_src=twsrc%5Etfw&#34;&gt;September 14, 2018&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;


&lt;p&gt;If the R user in your life uses RStudio packages, they might appreciate using &lt;a href=&#34;https://www.rstudio.com/resources/cheatsheets/&#34;&gt;RStudio cheatsheets&lt;/a&gt; about data manipulation, dates and times, Keras, etc. The cheatsheets are available for free, and you could print and laminate a few of them as a gift!&lt;/p&gt;

&lt;h1 id=&#34;becoming-a-legit-package-developer&#34;&gt;Becoming a legit package developer&lt;/h1&gt;

&lt;p&gt;Is the person you want to surprise a package developer? How about increasing their confidence and joy in their package by gifting them a voucher for a hex logo creation by &lt;a href=&#34;https://twitter.com/LockeCreatives&#34;&gt;Locke Creatives&lt;/a&gt;? Oz Locke has designed the hex logos for RDogLadies, &lt;a href=&#34;https://github.com/ropensci/visdat#visdat-&#34;&gt;&lt;code&gt;visdat&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;https://github.com/njtierney/naniar#naniar-&#34;&gt;&lt;code&gt;naniar&lt;/code&gt;&lt;/a&gt;, among others. Reach out to Oz on Twitter if you&amp;rsquo;d like to know more!&lt;/p&gt;

&lt;p&gt;As a complement, you could then offer to print the created hex logos as stickers.&lt;/p&gt;

&lt;h1 id=&#34;decorating-one-s-office&#34;&gt;Decorating one&amp;rsquo;s office&lt;/h1&gt;

&lt;p&gt;Why not gift R art to hang on the walls? We like&lt;/p&gt;

&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;personifyr::sf [be a space wrangler]. &lt;a href=&#34;https://twitter.com/hashtag/rstats?src=hash&amp;amp;ref_src=twsrc%5Etfw&#34;&gt;#rstats&lt;/a&gt; &lt;a href=&#34;https://t.co/HehHARwM3E&#34;&gt;pic.twitter.com/HehHARwM3E&lt;/a&gt;&lt;/p&gt;&amp;mdash; Allison Horst (@allison_horst) &lt;a href=&#34;https://twitter.com/allison_horst/status/1071456081308614656?ref_src=twsrc%5Etfw&#34;&gt;December 8, 2018&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;


&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Drawings by the very talented Allison Horst, which you could have printed and framed. She does not seem to have a shop yet, but you could contact her!&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;This nifty &lt;a href=&#34;https://www.etsy.com/fr/listing/554706505/ceci-nest-pas-une-pipe-au-point-de-croix&#34;&gt;&lt;code&gt;magrittr&lt;/code&gt; cross-stitch found on Etsy&lt;/a&gt;!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&#34;getting-out-of-one-s-office&#34;&gt;Getting out of one&amp;rsquo;s office&lt;/h1&gt;

&lt;p&gt;Loving R does not mean working at one&amp;rsquo;s desk all the time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The R community likes hex shaped things. You could gift a hex shaped baking tin, or other hex shaped objects, with material to personalize them, or you could personalize them yourself. E.g. see these gorgeous coasters!&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;I recently acquired some cork and a soldering iron, and learned how to make coasters by burning patterns into them. So even though I don&amp;#39;t have any actual hex stickers, I now have these! &lt;a href=&#34;https://t.co/g8ddXvAH3S&#34;&gt;pic.twitter.com/g8ddXvAH3S&lt;/a&gt;&lt;/p&gt;&amp;mdash; Kim Cressman (@swmpkim) &lt;a href=&#34;https://twitter.com/swmpkim/status/1065295487136382977?ref_src=twsrc%5Etfw&#34;&gt;November 21, 2018&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;


&lt;ul&gt;
&lt;li&gt;Have you heard of the SatRday conferences? They&amp;rsquo;re community-led, low-cost, one-day R conferences. You could buy a ticket to your dear one&amp;rsquo;s local satRday conference, or even make a trip out of it if the conference is not that local. Check out the &lt;a href=&#34;https://satrdays.org/events/&#34;&gt;SatRdays calendar&lt;/a&gt;. There&amp;rsquo;s one planned in &lt;a href=&#34;https://paris2019.satrdays.org/&#34;&gt;Paris&lt;/a&gt; on February the 23rd!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That&amp;rsquo;s it! It might be a little bit late for the holidays depending on the ones you celebrate, but occasions to give presents will surely happen again soon! Please share your other gift ideas for R lovers in the comments!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>covrpage, more information on unit testing</title>
      <link>https://itsalocke.com/blog/covrpage-more-information-on-unit-testing/</link>
      <pubDate>Mon, 10 Dec 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/covrpage-more-information-on-unit-testing/</guid>
      <description>
        
        

&lt;p&gt;In this post, we shall explore the first R package that received &lt;a href=&#34;https://itsalocke.com/blog/package-support-offer/&#34;&gt;Locke Data&amp;rsquo;s new support&lt;/a&gt;, &lt;a href=&#34;https://yonicd.github.io/covrpage/&#34;&gt;&lt;code&gt;covrpage&lt;/code&gt;&lt;/a&gt; by Jonathan Sidi! With this nifty package you can better communicate the unit testing completeness and goodness of your package!&lt;/p&gt;

&lt;h1 id=&#34;what-s-covrpage&#34;&gt;What&amp;rsquo;s &lt;code&gt;covrpage&lt;/code&gt;?&lt;/h1&gt;

&lt;p&gt;Trust is earned not &lt;del&gt;inherited&lt;/del&gt; importedFrom. Now that you&amp;rsquo;ve built a cool package, you want potential users to trust it so that they might adopt it. So how can you build trust in your software? Unit testing is one of the components building trustworthiness of your package. Imagine you&amp;rsquo;re at the point where you&amp;rsquo;ve tested most lines of your code with thorough assertions, including checks of edge cases. Proof of that hard work will be a high test coverage, that potential users of your package might notice thanks to a bright green coverage badge.&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/covrpage_readme_badge.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Green coverage badge for the rfm package&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;But how would they know your tests are thorough? That&amp;rsquo;s what &lt;code&gt;covrpage&lt;/code&gt; helps you with, by creating a summary report of your tests that goes beyond the coverage percentage.&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/covrpage_tests_readme.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Tests summary report for the rfm package&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;This way, potential users can see at a glance how good the unit testing of your package is. This report can be used as a README for the tests folder, as well as a vignette in the &lt;code&gt;pkgdown&lt;/code&gt; website and/or CRAN page of a package.&lt;/p&gt;

&lt;h2 id=&#34;is-the-covrpage-report-only-for-users&#34;&gt;Is the &lt;code&gt;covrpage&lt;/code&gt; report only for users?&lt;/h2&gt;

&lt;p&gt;No, it can also inform your work on your package, by helping you track progress of the unit tests you&amp;rsquo;re working on, and it can show to potential &lt;em&gt;contributors&lt;/em&gt; where help is needed.&lt;/p&gt;

&lt;h1 id=&#34;how-do-i-publish-the-covrpage-report&#34;&gt;How do I publish the &lt;code&gt;covrpage&lt;/code&gt; report?&lt;/h1&gt;

&lt;p&gt;There are two places where you can keep the &lt;code&gt;covrpage&lt;/code&gt; report, and it&amp;rsquo;s advised to use both since they will get seen by different readers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A README for the tests/ folder, which is the original report location. &lt;code&gt;covrpage::covrpage()&lt;/code&gt; sets it up. Target audience: users or collaborators browsing the GitHub repo of your package, possibly guided there by a badge in the main README.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;A vignette, that&amp;rsquo;ll get inserted into the &lt;code&gt;pkgdown&lt;/code&gt; website of your package, and the &lt;a href=&#34;https://cran.r-project.org/web/packages/texPreview/index.html&#34;&gt;CRAN page&lt;/a&gt; if/when your package is released on CRAN. &lt;code&gt;covrpage::use_covrpage_vignette()&lt;/code&gt; sets it up. Target audience: users reading the rendered documentation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In both cases, you can also ensure the report stays up-to-date by having it deployed from Travis every time you push to your repository.&lt;/p&gt;
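
&lt;p&gt;As a rough sketch of the two setups listed above, assuming &lt;code&gt;covrpage&lt;/code&gt; is already installed and that you run the calls from the root of a package that has &lt;code&gt;testthat&lt;/code&gt; tests (both are assumptions on our side), the two functions named in the list could be called like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Run from the package root. Default arguments are assumed here;
# check the covrpage documentation for the options that fit your package.
covrpage::covrpage()               # report as a README in the tests/ folder
covrpage::use_covrpage_vignette()  # report as a vignette&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;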

&lt;h1 id=&#34;can-i-get-a-covrpage-report-for-any-package&#34;&gt;Can I get a &lt;code&gt;covrpage&lt;/code&gt; report for any package?&lt;/h1&gt;

&lt;p&gt;See a new package on GitHub that you would like to use, but need to feel more comfortable before &lt;em&gt;committing&lt;/em&gt; to it? You can run &lt;code&gt;covrpage_snapshot()&lt;/code&gt; to create a report in a sterile environment, without affecting your &lt;code&gt;.libPaths&lt;/code&gt;, and make a more informed decision about whether to install the package.&lt;/p&gt;

&lt;p&gt;For more information see the &lt;a href=&#34;https://yonicd.github.io/covrpage/articles/snapshots.html&#34;&gt;snapshots&lt;/a&gt; vignette.&lt;/p&gt;

&lt;h1 id=&#34;how-do-i-learn-more-about-covrpage&#34;&gt;How do I learn more about &lt;code&gt;covrpage&lt;/code&gt;?&lt;/h1&gt;

&lt;p&gt;To read more about getting started with &lt;code&gt;covrpage&lt;/code&gt; in your own package in only a few lines of code, we recommend checking out the &lt;a href=&#34;https://yonicd.github.io/covrpage/articles/get-started.html&#34;&gt;&amp;ldquo;get started&amp;rdquo; vignette&lt;/a&gt;. It explains in more detail how to set up the Travis deploy, mentions which functions power the &lt;code&gt;covrpage&lt;/code&gt; report, and gives more motivation for using &lt;code&gt;covrpage&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;And to learn how the information provided by &lt;code&gt;covrpage&lt;/code&gt; should be read, read the &lt;a href=&#34;https://yonicd.github.io/covrpage/articles/how-to-read-covrpage-report.html&#34;&gt;&amp;ldquo;How to read the &lt;code&gt;covrpage&lt;/code&gt; report&amp;rdquo; vignette&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&#34;what-did-covrpage-gain-from-locke-data-s-help&#34;&gt;What did &lt;code&gt;covrpage&lt;/code&gt; gain from Locke Data&amp;rsquo;s help?&lt;/h1&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/covrpageLogo.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;covrpage logo&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;As &lt;a href=&#34;https://itsalocke.com/blog/package-support-offer/&#34;&gt;announced recently&lt;/a&gt;, we at Locke Data will help a package a month get more widely adopted. In the case of &lt;code&gt;covrpage&lt;/code&gt;, our main input was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;the creation of a nice logo by &lt;a href=&#34;https://twitter.com/LockeCreatives&#34;&gt;Oz Locke&lt;/a&gt;. The logo highlights that &lt;code&gt;covrpage&lt;/code&gt; extends the &lt;code&gt;covr&lt;/code&gt; and &lt;code&gt;testthat&lt;/code&gt; packages by helping the user inspect their results.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;the improvement of documentation, in particular vignettes and the README, to make it clearer why potential users should care about &lt;code&gt;covrpage&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;exploration of Travis deploy of the report via the &lt;code&gt;tic&lt;/code&gt; package, to make the setup smoother.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thanks Jonathan for letting us get involved with your package, and good luck with future development! And if &lt;em&gt;you&lt;/em&gt; think your broadly applicable package could benefit from our support in getting it into shape, then &lt;a href=&#34;https://airtable.com/shrH3z9fQIbEJzPUn&#34;&gt;apply now&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Project planning with plotly</title>
      <link>https://itsalocke.com/blog/project-planning-with-plotly/</link>
      <pubDate>Mon, 26 Nov 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/project-planning-with-plotly/</guid>
      <description>
        
        

&lt;p&gt;Something a little different today for a quick chat about my latest
project and why I’m finding the &lt;code&gt;plotly&lt;/code&gt; package so helpful!&lt;/p&gt;

&lt;p&gt;Are you like me and physically can’t function unless you’ve got a to-do
list in front of you? Well even if you’re not, imagine my pain while I’m
wearing my non-Locke Data hat and trying to plan out the final year of
my PhD thesis!&lt;/p&gt;

&lt;p&gt;I needed something that updated easily, something visual and something
to keep my supervisors in the know. I’ve previously made gantt charts
using LaTeX but found it ridiculously clunky to get working and decided
there had to be a better way. And if I could include interactivity then
all the better, which is how I discovered &lt;code&gt;plotly&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&#34;plotly&#34;&gt;Plotly&lt;/h2&gt;

&lt;p&gt;The plotly package makes interactive, publication-quality graphs online.
It is built on the plotly.js JavaScript graphing library and is really versatile! You
can make graphs, maps, 3D plots and, as I’m about to explain, gantt
charts.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;require&lt;/span&gt;(plotly)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&#34;step-by-step-gantt-chart&#34;&gt;Step by Step Gantt Chart&lt;/h2&gt;

&lt;p&gt;Load the other packages you need for generating the gantt chart&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(RColorBrewer)
&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(readxl)
&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(widgetframe)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Setting fonts isn’t necessary as there are defaults, but it’s always
nice to know how to do it should you want to.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;f &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(
  family &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Courier New, monospace&amp;#34;&lt;/span&gt;,
  size &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;18&lt;/span&gt;,
  color &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;#7f7f7f&amp;#34;&lt;/span&gt;
)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Set a title for the X axis. Uncomment the Y axis if you’d like to label
that too :)&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;x &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(
  title &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Date&amp;#34;&lt;/span&gt;,
  titlefont &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; f
)
&lt;span style=&#34;color:#75715e&#34;&gt;# y &amp;lt;- list(&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#   title = &amp;#34;Task Number&amp;#34;,&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#   titlefont = f&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;# ) &lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&#34;read-in-your-data&#34;&gt;Read in your data&lt;/h3&gt;

&lt;p&gt;To make this gantt chart work, you need a &lt;code&gt;start&lt;/code&gt; column with the date you intend
to start each task - we’ll make sure this is formatted in a minute. You
also need a &lt;code&gt;Duration&lt;/code&gt; column to set the length of the bar
for each task (this is in days in my case!). If you want to group and
colour code your tasks, it’s the &lt;code&gt;Chapter&lt;/code&gt; column which does this.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;Task&lt;/code&gt; and &lt;code&gt;Progress&lt;/code&gt; columns aren’t necessary, but they are good
for labelling and keeping yourself accountable, and an easy way to share
updates with people who are checking up on your
progress!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;df &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; read_xlsx(path &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;../../../../Book_thesis/00_Time_Plan/timeplan.xlsx&amp;#34;&lt;/span&gt;, sheet &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;) 
&lt;span style=&#34;color:#66d9ef&#34;&gt;head&lt;/span&gt;(df)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 5
##   Task                       start               Duration Chapter Progress
##   &amp;lt;chr&amp;gt;                      &amp;lt;dttm&amp;gt;                 &amp;lt;dbl&amp;gt; &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;   
## 1 Review literature review … 2018-07-03 00:00:00        2 2       Complet…
## 2 2.1 Energy on a larger sc… 2018-07-05 00:00:00        5 2       Complet…
## 3 2.1 Energy in a UK context 2018-07-12 00:00:00        7 2       Complet…
## 4 2.2 Traditional collectio… 2018-07-23 00:00:00        3 2       Complet…
## 5 2.2 New technologies and … 2018-07-26 00:00:00        3 2       Complet…
## 6 2.3 Data Bias and Represe… 2018-07-31 00:00:00        3 2       Complet…
&lt;/code&gt;&lt;/pre&gt;
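
&lt;p&gt;If you don’t have a spreadsheet to hand, a small data frame with the same five columns works just as well. The sketch below is a made-up stand-in for my file (the tasks, dates and the name &lt;code&gt;df_demo&lt;/code&gt; are invented for illustration). Assign it to &lt;code&gt;df&lt;/code&gt; instead of calling &lt;code&gt;read_xlsx()&lt;/code&gt; if you want to follow along.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical example data, not my real time plan
df_demo &amp;lt;- data.frame(
  Task     = c(&amp;#34;Review literature&amp;#34;, &amp;#34;Draft methods&amp;#34;, &amp;#34;Run analysis&amp;#34;),
  start    = as.Date(c(&amp;#34;2018-07-03&amp;#34;, &amp;#34;2018-07-05&amp;#34;, &amp;#34;2018-07-12&amp;#34;)),
  Duration = c(2, 5, 7),  # length of each task in days
  Chapter  = c(&amp;#34;2&amp;#34;, &amp;#34;3&amp;#34;, &amp;#34;4&amp;#34;),
  Progress = c(&amp;#34;Complete&amp;#34;, &amp;#34;In progress&amp;#34;, &amp;#34;Not started&amp;#34;),
  stringsAsFactors = FALSE
)
str(df_demo)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;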

&lt;p&gt;Make sure your date column is correctly formatted&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;start &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;as.Date&lt;/span&gt;(df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;start, format&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;%Y-%m-%d&amp;#34;&lt;/span&gt;) &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next generate a list of colours. This corresponds to my &lt;code&gt;Chapter&lt;/code&gt;
column, so change this accordingly. It creates a list assigning a colour
to each group of
tasks&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;cols &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; brewer.pal(&lt;span style=&#34;color:#66d9ef&#34;&gt;length&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;unique&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;factor&lt;/span&gt;(df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;Chapter))), name &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Set3&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#75715e&#34;&gt;# Generate a list of colours that are as long as your group of tasks&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;# This won&amp;#39;t work if column has blanks&lt;/span&gt;

df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;color &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;factor&lt;/span&gt;(df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;Chapter, labels &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; cols) &lt;span style=&#34;color:#75715e&#34;&gt;# Attach these colours as factors to each group of tasks&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
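&lt;p&gt;To see what those two lines do, here is a small sketch on a made-up vector of chapters (&lt;code&gt;chapters&lt;/code&gt; and &lt;code&gt;colour&lt;/code&gt; are names invented for this example). Each distinct chapter becomes a factor level, and the levels are relabelled with the palette colours in order. Note that &lt;code&gt;brewer.pal()&lt;/code&gt; returns at least three colours and the &lt;code&gt;Set3&lt;/code&gt; palette has at most twelve, so this approach assumes between three and twelve groups.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(RColorBrewer)

chapters &amp;lt;- c(&amp;#34;2&amp;#34;, &amp;#34;2&amp;#34;, &amp;#34;3&amp;#34;, &amp;#34;4&amp;#34;, &amp;#34;3&amp;#34;)  # invented example values

# One colour per distinct chapter
cols &amp;lt;- brewer.pal(length(unique(chapters)), name = &amp;#34;Set3&amp;#34;)

# Relabel the factor levels (&amp;#34;2&amp;#34;, &amp;#34;3&amp;#34;, &amp;#34;4&amp;#34;) with the colours, in order
colour &amp;lt;- factor(chapters, labels = cols)

# Rows from the same chapter now share the same colour
data.frame(chapters, colour = as.character(colour))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;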
&lt;p&gt;The next chunk of code generates a date line updated automatically based
on your computer&amp;rsquo;s system date.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# annotation&lt;/span&gt;
a &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(text &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Today&amp;#39;s date&amp;#34;&lt;/span&gt;,
          x &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;Sys.Date&lt;/span&gt;(),
          y &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1.02&lt;/span&gt;,
          xref &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;x&amp;#39;&lt;/span&gt;,
          yref &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;paper&amp;#39;&lt;/span&gt;,
          xanchor &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;left&amp;#39;&lt;/span&gt;,
          showarrow &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;FALSE&lt;/span&gt;
)

&lt;span style=&#34;color:#75715e&#34;&gt;# use shapes to create a line&lt;/span&gt;
l &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(type &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;line&amp;#39;&lt;/span&gt;,
          x0 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;Sys.Date&lt;/span&gt;(),
          x1 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;Sys.Date&lt;/span&gt;(),
          y0 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;,
          y1 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,
          xref &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;x&amp;#39;&lt;/span&gt;,
          yref &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;paper&amp;#39;&lt;/span&gt;,
          line &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(color &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;black&amp;#39;&lt;/span&gt;,
                      width &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0.7&lt;/span&gt;)
)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Build your plot combining all the elements you generated earlier. Change
the &lt;code&gt;text&lt;/code&gt; section to personalise your chart&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;p &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; plot_ly()
&lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt;(i &lt;span style=&#34;color:#66d9ef&#34;&gt;in&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;nrow&lt;/span&gt;(df) &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;)){
  p &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; add_trace(p,
                 x &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;start[i], df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;start[i] &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;Duration[i]), 
                 y &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(i, i), 
                 mode &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;lines&amp;#34;&lt;/span&gt;,
                 line &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(color &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;color[i], width &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;20&lt;/span&gt;),
                 showlegend &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; F,
                 hoverinfo &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;text&amp;#34;&lt;/span&gt;,
                 text &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;paste&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Task: &amp;#34;&lt;/span&gt;, df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;Task[i], &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&amp;lt;br&amp;gt;&amp;#34;&lt;/span&gt;,
                              &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Duration: &amp;#34;&lt;/span&gt;, df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;Duration[i], &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;days&amp;lt;br&amp;gt;&amp;#34;&lt;/span&gt;,
                              &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Chapter: &amp;#34;&lt;/span&gt;, df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;Chapter[i], &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&amp;lt;br&amp;gt;&amp;#34;&lt;/span&gt;,
                              &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Status: &amp;#34;&lt;/span&gt;, df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;Progress[i]),
                 evaluate &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; T
  )
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
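&lt;p&gt;Each bar runs from the task’s start date to its start date plus its duration. Adding a number to a &lt;code&gt;Date&lt;/code&gt; in R shifts it by that many days, which is why &lt;code&gt;Duration&lt;/code&gt; needs to be in days. A tiny sketch with the first task from the table above (the variable names are just for illustration):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;start    &amp;lt;- as.Date(&amp;#34;2018-07-03&amp;#34;)
duration &amp;lt;- 2  # days

end &amp;lt;- start + duration
format(end)  # &amp;#34;2018-07-05&amp;#34;

# The two x values plotly draws the bar between
c(start, end)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;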
&lt;p&gt;Now simply plot your chart and add axis labels, your title and ‘Today’s
date’ - The plotly package will generate an interactive chart that you
can zoom, hover and share!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;p &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
  layout(xaxis &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; x, 
         &lt;span style=&#34;color:#75715e&#34;&gt;# yaxis = y,&lt;/span&gt;
         title &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Thesis Schedule&amp;#34;&lt;/span&gt;,
         annotations &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; a,
         shapes &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; l)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;iframe id=&#34;serviceFrameSend&#34; src=&#34;../img/p.html&#34; width=&#34;1000&#34; height=&#34;1000&#34; frameborder=&#34;0&#34;&gt;&lt;/iframe&gt;
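
&lt;p&gt;To share the chart as a standalone page, one option is to save it as an HTML file. This is a sketch assuming the &lt;code&gt;htmlwidgets&lt;/code&gt; package is installed. The toy plot &lt;code&gt;p_demo&lt;/code&gt; and the temporary output path are placeholders, so swap in your own chart and file name.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(plotly)

# Toy one-bar chart standing in for the gantt chart built above
p_demo &amp;lt;- plot_ly(
  x = as.Date(c(&amp;#34;2018-07-03&amp;#34;, &amp;#34;2018-07-05&amp;#34;)),
  y = c(1, 1),
  type = &amp;#34;scatter&amp;#34;,
  mode = &amp;#34;lines&amp;#34;
)

out_file &amp;lt;- file.path(tempdir(), &amp;#34;p.html&amp;#34;)

# selfcontained = FALSE keeps the JavaScript in a folder next to the file,
# which avoids needing pandoc
htmlwidgets::saveWidget(p_demo, out_file, selfcontained = FALSE)
file.exists(out_file)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;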

&lt;p&gt;Ta-Dah! Now all I have to do is stick to it…&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>namer, Automatic Labelling of R Markdown Chunks</title>
      <link>https://itsalocke.com/blog/namer-automatic-labelling-of-r-markdown-chunks/</link>
      <pubDate>Wed, 31 Oct 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/namer-automatic-labelling-of-r-markdown-chunks/</guid>
      <description>
        
        

&lt;p&gt;We&amp;rsquo;ve just &lt;a href=&#34;https://cran.r-project.org/web/packages/namer/index.html&#34;&gt;released a sweet package&lt;/a&gt; to save you from the hassle of unnamed chunks in R Markdown! &lt;code&gt;namer&lt;/code&gt; will name all your chunks, so you can debug quickly in future. More details in this post!&lt;/p&gt;

&lt;h1 id=&#34;why-name-your-r-markdown-chunks&#34;&gt;Why name your R Markdown chunks?&lt;/h1&gt;

&lt;p&gt;When writing R Markdown documents, be it a single report or a whole book based on dozens of documents, it&amp;rsquo;s crucial to name your R Markdown chunks. Informative names can help when navigating your files and will be used as informative filenames for figures generated in chunks. But even when not informative, chunk labels are very important! Imagine you&amp;rsquo;re compiling a whole book, and oops, a bug appears in&amp;hellip; unnamed-chunk-566. How do you find the culprit?! That name you get in the error message is &lt;em&gt;not&lt;/em&gt; written in any file!&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/namer-oh-no.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Sad chibi Stef being told by R that there is an error in unnamed-chunk-12&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Or, since caches are named after chunks, if you delete one chunk in a document full of unnamed chunks, all your caches become useless. The horror!&lt;/p&gt;

&lt;p&gt;One good piece of advice would be to always name your chunks. That&amp;rsquo;s what we say when teaching R Markdown, and what&amp;rsquo;s promoted &lt;a href=&#34;https://masalmon.eu/2017/08/08/chunkpets/&#34;&gt;in this blog post of mine featuring puppy photographs&lt;/a&gt;! But, sadly, we don&amp;rsquo;t always follow best practice, do we?&lt;/p&gt;

&lt;h1 id=&#34;what-does-namer-do&#34;&gt;What does &lt;code&gt;namer&lt;/code&gt; do?&lt;/h1&gt;

&lt;p&gt;Luckily, thanks to a brilliant idea of Steph&amp;rsquo;s, &lt;code&gt;namer&lt;/code&gt; is now here to save your (vegan) bacon! This nifty package will name your chunks for you when you&amp;rsquo;ve (willingly or unwillingly) forgotten to do so! For any R Markdown document, &lt;code&gt;namer&lt;/code&gt; creates and saves chunk labels based on the filename stripped of its extension. Therefore, all chunks of &amp;ldquo;my-fantastic-report.Rmd&amp;rdquo; get named &amp;ldquo;my-fantastic-report-1&amp;rdquo;, &amp;ldquo;my-fantastic-report-2&amp;rdquo;, and so on. Voilà!&lt;/p&gt;
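
&lt;p&gt;To make the naming scheme concrete, here is a minimal sketch in base R. It is not &lt;code&gt;namer&lt;/code&gt;&amp;rsquo;s actual code: &lt;code&gt;label_chunks()&lt;/code&gt; is a hypothetical helper that only shows how labels can be derived from a filename and a chunk count.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper, for illustration only (not part of namer)
label_chunks &amp;lt;- function(path, n_chunks) {
  # Drop the folder and the extension to get the label prefix
  prefix &amp;lt;- tools::file_path_sans_ext(basename(path))
  # Append a running number to the prefix
  paste(prefix, seq_len(n_chunks), sep = &amp;#34;-&amp;#34;)
}

label_chunks(&amp;#34;reports/my-fantastic-report.Rmd&amp;#34;, 3)
# &amp;#34;my-fantastic-report-1&amp;#34; &amp;#34;my-fantastic-report-2&amp;#34; &amp;#34;my-fantastic-report-3&amp;#34;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;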


&lt;figure &gt;
    
        &lt;img src=&#34;../img/namer-yaaasss.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Happy chibi Stef being told by R that there is an error in the chunk labelled test-12&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;In practice, &lt;code&gt;namer&lt;/code&gt; provides this service for a single document or for a whole folder at once, and it ships with an RStudio add-in for naming the chunks of the current active document.&lt;/p&gt;

&lt;p&gt;The screenshot below shows &lt;a href=&#34;https://github.com/lockedata/pres-datascience/pull/1&#34;&gt;a real-life example&lt;/a&gt;: the result of running &lt;code&gt;namer::name_dir_chunks(&amp;#34;pres&amp;#34;)&lt;/code&gt;. In each of the files in the &amp;ldquo;pres&amp;rdquo; directory, it labelled chunks using the filename and numbers.&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/namer1.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Difference between two R Markdown reports before and after using namer to label chunks&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;h1 id=&#34;how-to-use-namer&#34;&gt;How to use &lt;code&gt;namer&lt;/code&gt;?&lt;/h1&gt;

&lt;p&gt;First, install &lt;code&gt;namer&lt;/code&gt; from CRAN.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;install.packages(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;namer&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then, to name the chunks of a report saved under &amp;ldquo;reports/my-report.Rmd&amp;rdquo;, run&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;namer&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;name_chunks(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;reports/my-report.Rmd&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here comes a warning from us&amp;hellip; When using &lt;code&gt;namer&lt;/code&gt;, please check the edits before pushing them to your code base. Such automatic chunk labelling is best paired with &lt;a href=&#34;http://happygitwithr.com/&#34;&gt;version control&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;If you want to label the chunks of all the R Markdown documents of your &amp;ldquo;reports&amp;rdquo; folder, use &lt;code&gt;namer::name_dir_chunks()&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;namer&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;name_dir_chunks(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;reports&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now, as life is sometimes a bit more complicated, you might need to &lt;em&gt;unname&lt;/em&gt; all chunks of a report before re-naming them, e.g. if you used &lt;code&gt;namer&lt;/code&gt; once and then added many chunks, or if you don&amp;rsquo;t like the naming scheme you had been using. We have a function for that too! It will unname all chunks except the setup chunk!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;namer&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;unname_all_chunks(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;reports/my-report.Rmd&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
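&lt;p&gt;Put together, a full re-labelling pass could look like the sketch below. It only chains the two functions shown above; the path is a placeholder for your own report, and it is safest to run this on a file that is under version control.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Placeholder path: replace it with your own R Markdown file
report &amp;lt;- &amp;#34;reports/my-report.Rmd&amp;#34;

# 1. Remove the existing labels (the setup chunk keeps its name)
namer::unname_all_chunks(report)

# 2. Label every chunk again, based on the filename
namer::name_chunks(report)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;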
&lt;p&gt;Last but not least, we have a minimal RStudio addin shipped with the package, to name chunks of any R Markdown file. It allows you to select that file thanks to &lt;a href=&#34;https://github.com/lockedata/namer/pull/14&#34;&gt;our very first Hacktoberfest contributor&lt;/a&gt;, &lt;a href=&#34;https://github.com/ellisvalentiner&#34;&gt;Ellis Valentiner&lt;/a&gt;! Read more about Hacktoberfest at Locke Data &lt;a href=&#34;https://itsalocke.com/blog/up-your-open-source-game-with-hacktoberfest-at-locke-data/&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&#34;future-plans-for-even-more-nifty-automatic-naming&#34;&gt;Future plans for even more nifty automatic naming&lt;/h1&gt;

&lt;p&gt;Like &lt;a href=&#34;https://itsalocke.com/oss/packages/&#34;&gt;all our packages&lt;/a&gt;, &lt;code&gt;namer&lt;/code&gt; is &lt;a href=&#34;https://github.com/lockedata/namer&#34;&gt;developed in the open on GitHub&lt;/a&gt; so you can go read what we&amp;rsquo;re planning for the future. In particular, we&amp;rsquo;re aiming to extend the RStudio addin to make it possible to select any file or folder, not just the active document.&lt;/p&gt;

&lt;p&gt;Furthermore, if you&amp;rsquo;re keen to get involved in open-source development, we&amp;rsquo;d be glad to mentor you: &lt;a href=&#34;https://github.com/lockedata/namer/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted+%3Araised_hand%3A%22&#34;&gt;check out the issues that&amp;rsquo;d welcome external help&lt;/a&gt;! As a side-note, if you&amp;rsquo;re into programmatic change or assessment of chunk options, you might be interested in &lt;a href=&#34;https://github.com/ropenscilabs/tinkr&#34;&gt;the &lt;code&gt;tinkr&lt;/code&gt; package&lt;/a&gt;, which provides a function to transform R Markdown files into XML documents, and another one to save them back as R Markdown.&lt;/p&gt;

&lt;p&gt;In the meantime, go write awesome R Markdown documents, now that the fear of unnamed chunks no longer looms over you!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Packages for Testing your R Package</title>
      <link>https://itsalocke.com/blog/packages-for-testing-your-r-package/</link>
      <pubDate>Mon, 22 Oct 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/packages-for-testing-your-r-package/</guid>
      <description>
        
        

&lt;p&gt;Testing your R package is crucial, and thankfully it only gets easier with time, thanks to experience&amp;hellip; and awesome packages helping you set up and improve tests! In this post, we offer a roundup of packages for testing R packages, first in a section about general testing setup, and then in a section about testing &amp;ldquo;peculiar&amp;rdquo; stuff.&lt;/p&gt;

&lt;h1 id=&#34;general-package-testing-infrastructure&#34;&gt;General package testing infrastructure&lt;/h1&gt;

&lt;h2 id=&#34;create-tests&#34;&gt;Create tests&lt;/h2&gt;

&lt;p&gt;If you&amp;rsquo;re brand-new to unit testing your R package, I&amp;rsquo;d recommend reading &lt;a href=&#34;http://r-pkgs.had.co.nz/tests.html&#34;&gt;this chapter from Hadley Wickham&amp;rsquo;s book about R packages&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;There&amp;rsquo;s an R package called &lt;code&gt;RUnit&lt;/code&gt; for unit testing, but throughout this post we&amp;rsquo;ll mention resources around the &lt;a href=&#34;https://github.com/r-lib/testthat&#34;&gt;&lt;code&gt;testthat&lt;/code&gt; package&lt;/a&gt; since it&amp;rsquo;s the one we use in our packages, and arguably the most popular one. &lt;code&gt;testthat&lt;/code&gt; is great! Don&amp;rsquo;t hesitate to read its docs again if you started using it a while ago, since the &lt;a href=&#34;https://www.tidyverse.org/articles/2017/12/testthat-2-0-0/&#34;&gt;latest major release&lt;/a&gt; added the &lt;a href=&#34;http://testthat.r-lib.org/reference/teardown.html&#34;&gt;&lt;code&gt;setup()&lt;/code&gt; and &lt;code&gt;teardown()&lt;/code&gt; functions&lt;/a&gt; to run code before and after all tests, which is very handy.&lt;/p&gt;

&lt;p&gt;To set up testing in an existing package, i.e. to create the test folder and add &lt;code&gt;testthat&lt;/code&gt; as a dependency, run &lt;a href=&#34;http://usethis.r-lib.org/reference/use_testthat.html&#34;&gt;&lt;code&gt;usethis::use_testthat()&lt;/code&gt;&lt;/a&gt;. In our WIP &lt;a href=&#34;https://github.com/lockedata/pRojects&#34;&gt;&lt;code&gt;pRojects&lt;/code&gt; package&lt;/a&gt;, we set up the tests directory for you so you don&amp;rsquo;t forget. Then, in any case, add new tests for a function using &lt;code&gt;usethis::use_test()&lt;/code&gt;.&lt;/p&gt;
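
&lt;p&gt;For readers who have never seen one, a test file might look like the minimal sketch below. The function &lt;code&gt;add_one()&lt;/code&gt; and the file name are made up for illustration; in a real package the function would live in the &lt;code&gt;R/&lt;/code&gt; folder rather than in the test file.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# tests/testthat/test-add_one.R (hypothetical file name)
library(testthat)

# Hypothetical function under test, defined here to keep the sketch self-contained
add_one &amp;lt;- function(x) {
  x + 1
}

test_that(&amp;#34;add_one() increments its input&amp;#34;, {
  expect_equal(add_one(1), 2)
  expect_equal(add_one(c(1, 2)), c(2, 3))
})

test_that(&amp;#34;add_one() fails on non-numeric input&amp;#34;, {
  expect_error(add_one(&amp;#34;a&amp;#34;))
})&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;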

&lt;p&gt;The &lt;a href=&#34;https://github.com/s-fleck/testthis&#34;&gt;&lt;code&gt;testthis&lt;/code&gt; package&lt;/a&gt; might help make your testing workflow even smoother. In particular, &lt;code&gt;test_this()&lt;/code&gt; &amp;ldquo;reloads the package and runs tests associated with the currently open R script file&amp;rdquo;, and there&amp;rsquo;s also a function for opening the test file associated with the current R script. &lt;em&gt;Edit: as of version 2.0.0 &lt;code&gt;devtools&lt;/code&gt; itself features &lt;a href=&#34;https://www.tidyverse.org/articles/2018/10/devtools-2-0-0/#testing-single-files&#34;&gt;functions for testing single files&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&#34;assess-your-tests&#34;&gt;Assess your tests&lt;/h2&gt;

&lt;p&gt;To get a sense of how good your tests are, check these out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://github.com/r-lib/covr&#34;&gt;&lt;code&gt;covr&lt;/code&gt;&lt;/a&gt; computes the &lt;em&gt;test coverage&lt;/em&gt; i.e. the percentage of lines of code that are covered by tests. &lt;code&gt;covr&lt;/code&gt; allows skipping some lines. If run on Travis or Appveyor, for instance, it can send a report to online coverage tools such as CodeCov or Coveralls, allowing you to visualize the coverage. At Locke Data we mostly don&amp;rsquo;t run &lt;code&gt;covr&lt;/code&gt; locally but instead have it run on Travis. To set that up, run &lt;a href=&#34;http://usethis.r-lib.org/reference/ci.html&#34;&gt;&lt;code&gt;usethis::use_coverage()&lt;/code&gt;&lt;/a&gt;.
Here is the &lt;a href=&#34;https://coveralls.io/github/lockedata/HIBPwned?branch=master&#34;&gt;Coveralls report for &lt;code&gt;HIBPwned&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;https://github.com/lockedata/hibpwned#hibpwned&#34;&gt;the corresponding badge&lt;/a&gt;. See a &lt;a href=&#34;https://codecov.io/github/ropensci/Ropenaq?branch=master&#34;&gt;CodeCov report for comparison&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://github.com/yonicd/covrpage&#34;&gt;&lt;code&gt;covrpage&lt;/code&gt;&lt;/a&gt; creates a detailed coverage report that can serve as a README for your test folder. We&amp;rsquo;ve done that &lt;a href=&#34;https://github.com/lockedata/HIBPwned/tree/master/tests#tests-and-coverage&#34;&gt;for &lt;code&gt;HIBPwned&lt;/code&gt;&lt;/a&gt;, so now, without clicking through to a coverage report, you can see the tests associated with each context thanks to the &amp;ldquo;detailed test results&amp;rdquo;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
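
&lt;p&gt;If you do want a quick look at coverage on your own machine, a minimal sketch could be the following. It assumes that your working directory is the root of the package and that &lt;code&gt;covr&lt;/code&gt; is installed.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Assumption: the working directory is the package root
cov &amp;lt;- covr::package_coverage()

# Overall percentage of lines covered by tests
covr::percent_coverage(cov)

# Interactive report showing which lines are (not) covered
covr::report(cov)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;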

&lt;h1 id=&#34;test-all-the-things&#34;&gt;Test all the things&lt;/h1&gt;

&lt;p&gt;Now, sometimes you might encounter cases of things that you don&amp;rsquo;t quite know how to test. Here&amp;rsquo;s a small list, but please comment about anything we&amp;rsquo;ve forgotten!&lt;/p&gt;

&lt;h2 id=&#34;mocking&#34;&gt;Mocking&lt;/h2&gt;

&lt;p&gt;Sometimes you need to test whether your package works as expected &amp;ldquo;if something happens&amp;rdquo; or &amp;ldquo;if a thing has this value&amp;rdquo;, and you can&amp;rsquo;t rely on arguments. E.g. what happens if the environment variable &lt;code&gt;GITHUB_PAT&lt;/code&gt; doesn&amp;rsquo;t exist, or if a dependency isn&amp;rsquo;t installed? In such cases, what you might be after is &lt;em&gt;mocking&lt;/em&gt;. The &lt;code&gt;testthat&lt;/code&gt; package itself has a &lt;code&gt;with_mock()&lt;/code&gt; function, but it&amp;rsquo;s now recommended to use the &lt;a href=&#34;https://github.com/jfiksel/mockery&#34;&gt;&lt;code&gt;mockery&lt;/code&gt;&lt;/a&gt; or &lt;a href=&#34;https://github.com/krlmlr/mockr&#34;&gt;&lt;code&gt;mockr&lt;/code&gt; packages&lt;/a&gt; instead.&lt;/p&gt;
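
&lt;p&gt;As an illustration of the missing &lt;code&gt;GITHUB_PAT&lt;/code&gt; case, here is a minimal sketch using &lt;code&gt;mockery::stub()&lt;/code&gt;. The function &lt;code&gt;get_pat()&lt;/code&gt; and its error message are hypothetical; they stand in for whatever code of yours reads the environment variable.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(testthat)
library(mockery)

# Hypothetical function under test: it needs a GitHub token to work
get_pat &amp;lt;- function() {
  pat &amp;lt;- Sys.getenv(&amp;#34;GITHUB_PAT&amp;#34;)
  if (identical(pat, &amp;#34;&amp;#34;)) {
    stop(&amp;#34;No GITHUB_PAT found&amp;#34;)
  }
  pat
}

test_that(&amp;#34;get_pat() errors when GITHUB_PAT is unset&amp;#34;, {
  # Within get_pat(), replace Sys.getenv() by a stub returning an empty string
  stub(get_pat, &amp;#34;Sys.getenv&amp;#34;, &amp;#34;&amp;#34;)
  expect_error(get_pat(), &amp;#34;No GITHUB_PAT found&amp;#34;)
})&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The stub only affects &lt;code&gt;get_pat()&lt;/code&gt; inside that test, so your real environment variable stays untouched.&lt;/p&gt;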

&lt;h2 id=&#34;webmocking&#34;&gt;Webmocking&lt;/h2&gt;

&lt;p&gt;If the mocking you need to perform is e.g. mimicking a 404 result from an API, or saving a web API response and replaying it so you don&amp;rsquo;t have to re-query the API at each test, you can use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;webmockr&lt;/code&gt; &lt;a href=&#34;https://itsalocke.com/blog/some-web-api-package-development-lessons-from-hibpwned/&#34;&gt;like we did for &lt;code&gt;HIBPwned&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://github.com/ropensci/vcr&#34;&gt;&lt;code&gt;vcr&lt;/code&gt;&lt;/a&gt; and &lt;code&gt;webmockr&lt;/code&gt; together, see &lt;code&gt;vcr&lt;/code&gt; docs. It works for both &lt;code&gt;crul&lt;/code&gt; and &lt;code&gt;httr&lt;/code&gt;. &lt;code&gt;vcr&lt;/code&gt; docs include a list of packages using &lt;code&gt;vcr&lt;/code&gt; for testing in the wild.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/web/packages/httptest/index.html&#34;&gt;&lt;code&gt;httptest&lt;/code&gt;&lt;/a&gt; for &lt;code&gt;httr&lt;/code&gt; only.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;test-plot-outputs&#34;&gt;Test plot outputs&lt;/h2&gt;

&lt;p&gt;You can test that your plot outputs haven&amp;rsquo;t changed by using &lt;a href=&#34;https://github.com/lionel-/vdiffr&#34;&gt;&lt;code&gt;vdiffr&lt;/code&gt;&lt;/a&gt;. To set things up you need to run &lt;code&gt;vdiffr::manage_cases()&lt;/code&gt;.&lt;/p&gt;
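
&lt;p&gt;A visual test could look like the minimal sketch below. It assumes your package draws its plots with &lt;code&gt;ggplot2&lt;/code&gt;; the plot and its title are made up for illustration.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(testthat)

test_that(&amp;#34;the mpg histogram has not changed&amp;#34;, {
  # Hypothetical plot: in a real package it would come from one of your functions
  p &amp;lt;- ggplot2::ggplot(mtcars, ggplot2::aes(mpg)) +
    ggplot2::geom_histogram(bins = 10)

  # Compares the plot to the reference figure stored under this title
  vdiffr::expect_doppelganger(&amp;#34;mpg histogram&amp;#34;, p)
})&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;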

&lt;h2 id=&#34;test-shiny-apps&#34;&gt;Test Shiny apps&lt;/h2&gt;

&lt;p&gt;Fear not, there&amp;rsquo;s a whole package dedicated to helping you test Shiny apps! Check out &lt;a href=&#34;https://github.com/rstudio/shinytest&#34;&gt;&lt;code&gt;shinytest&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&#34;test-rstudio-add-ins&#34;&gt;Test RStudio add-ins&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;remedy&lt;/code&gt; package by ThinkR has &lt;a href=&#34;https://github.com/ThinkR-open/remedy/tree/master/tests&#34;&gt;tests for its add-ins&lt;/a&gt;. See in particular the &lt;a href=&#34;https://github.com/ThinkR-open/remedy/blob/89bac0d2c5b692f1d394f7f3706ad824fdf649aa/tests/testthat/helper-functions.R#L14&#34;&gt;&lt;code&gt;scratch_file()&lt;/code&gt; function&lt;/a&gt;. The tests need to be run only when RStudio is available so a helper defines a &lt;a href=&#34;https://github.com/ThinkR-open/remedy/blob/master/tests/testthat/helper-functions.R#L1&#34;&gt;&lt;code&gt;skip_if_not_rstudio()&lt;/code&gt; function&lt;/a&gt;. Thanks to &lt;a href=&#34;https://colinfay.me/&#34;&gt;Colin Fay&lt;/a&gt; and &lt;a href=&#34;https://github.com/yonicd/&#34;&gt;Jonathan Sidi&lt;/a&gt; for showing it to me.&lt;/p&gt;

&lt;h2 id=&#34;test-htmlwidgets&#34;&gt;Test htmlwidgets?&lt;/h2&gt;

&lt;p&gt;I&amp;rsquo;ll be honest, I haven&amp;rsquo;t seen examples of this in the wild, which is not surprising given I don&amp;rsquo;t use htmlwidgets a lot. Still, worth mentioning are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://github.com/schloerke/viztest&#34;&gt;&lt;code&gt;viztest&lt;/code&gt;&lt;/a&gt; tests htmlwidgets based on screenshots.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;the idea to test the content of the html before or after interactions. &lt;a href=&#34;https://github.com/cpsievert/rdom&#34;&gt;&lt;code&gt;rdom&lt;/code&gt;&lt;/a&gt; can be a part of such a workflow: &lt;code&gt;rdom&lt;/code&gt; + &lt;code&gt;xml2&lt;/code&gt; to scrape the result + &lt;code&gt;testthat&lt;/code&gt; of course. Thanks to &lt;a href=&#34;https://github.com/davidgohel&#34;&gt;David Gohel&lt;/a&gt; for telling me this!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;test-interactive-behavior&#34;&gt;Test interactive behavior?&lt;/h2&gt;

&lt;p&gt;This is another topic I haven&amp;rsquo;t totally figured out, but I like using &lt;code&gt;usethis&lt;/code&gt; tests as a reference, be it the &lt;a href=&#34;https://github.com/r-lib/usethis/tree/master/tests/manual&#34;&gt;manual/ folder&lt;/a&gt; or the &lt;a href=&#34;https://github.com/r-lib/usethis/tree/master/tests/testthat&#34;&gt;testthat/ folder&lt;/a&gt;. &lt;a href=&#34;https://github.com/s-fleck/testthis/tree/master/tests/testthat&#34;&gt;&lt;code&gt;testthis&lt;/code&gt; tests&lt;/a&gt; might also be an inspiration.&lt;/p&gt;

&lt;h1 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h1&gt;

&lt;p&gt;The ecosystem for R package testing is getting more and more complete and automatic, which is very exciting. To &lt;em&gt;deploy&lt;/em&gt; your tests, you&amp;rsquo;ll need to learn about &lt;a href=&#34;https://ropensci.github.io/dev_guide/ci.html&#34;&gt;continuous integration&lt;/a&gt;, which thankfully is also an area with &lt;a href=&#34;https://github.com/ropenscilabs/travis&#34;&gt;exciting&lt;/a&gt; &lt;a href=&#34;https://github.com/ropenscilabs/tic&#34;&gt;developments&lt;/a&gt;. Maybe a subject for another post&amp;hellip; In the meantime, feel free to tell us about your favourite resources for testing packages!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Package support offer</title>
      <link>https://itsalocke.com/blog/package-support-offer/</link>
      <pubDate>Mon, 15 Oct 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/package-support-offer/</guid>
      <description>
        
        &lt;p&gt;The R community and the package ecosystem are awesome, but it can be difficult to sustain your R packages when you only have so much free time. To make a stellar package you&amp;rsquo;ve got to keep on top of the issues, make great documentation, have that all-important hex sticker, and generally have good quality code. This all takes time, time you don&amp;rsquo;t always have. We would like to help.&lt;/p&gt;

&lt;p&gt;Once per month, we&amp;rsquo;ll select (from applicants) one package to set up for success. These packages will have general utility to a broad part of the community but could benefit from some dedicated time to give them the boost they need to be more widely adopted.&lt;/p&gt;

&lt;p&gt;For the selected packages we will then work with the package author to ensure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The GitHub repo is set up well, and all existing issues are reviewed to make sure they&amp;rsquo;re well formed, tagged, and prioritised. (3 hours of Locke Data time)&lt;/li&gt;
&lt;li&gt;The package implements best practices in terms of the code, tests, and documentation. (5 hours of Locke Data time)&lt;/li&gt;
&lt;li&gt;The package is easy to market with a pkgdown site and a logo. (Up to a day of Locke Creatives time)&lt;/li&gt;
&lt;li&gt;The package starts getting coverage through a showcase on our blog. (Up to 3 hours of Locke Data time)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In return you&amp;rsquo;ll list us as sponsors in the DESCRIPTION using the &lt;code&gt;fnd&lt;/code&gt; tag and list our support on the README (and therefore the pkgdown). We&amp;rsquo;ll also tell folks we helped you.&lt;/p&gt;

&lt;p&gt;If you think your package fits the bill (i.e. broadly applicable) and you could benefit from our support getting your package into shape then &lt;a href=&#34;https://airtable.com/shrH3z9fQIbEJzPUn&#34;&gt;apply now&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;PS We can also offer this service commercially, and provide CRAN submission and ongoing maintenance options too if you&amp;rsquo;re a company that just wants to get stuff live.&lt;/em&gt;&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>cransays - Follow your R Package Journey to CRANterbury with our Dashboard!</title>
      <link>https://itsalocke.com/blog/cransays---follow-your-r-package-journey-to-cranterbury-with-our-dashboard/</link>
      <pubDate>Thu, 11 Oct 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/cransays---follow-your-r-package-journey-to-cranterbury-with-our-dashboard/</guid>
      <description>
        
        

&lt;p&gt;We at &lt;a href=&#34;https://github.com/lockedata&#34;&gt;Locke Data&lt;/a&gt; maintain a few R packages that we&amp;rsquo;ve submitted to CRAN to help increase their userbase. After running &lt;code&gt;devtools::release()&lt;/code&gt; and clicking the link in a confirmation email&amp;hellip; what remains is &lt;em&gt;waiting&lt;/em&gt;. Inspired by our experience, we&amp;rsquo;ve created &lt;a href=&#34;https://cransays.itsalocke.com/articles/dashboard.html&#34;&gt;a dashboard&lt;/a&gt; to help other package maintainers follow their package&amp;rsquo;s journey to CRANterbury. Read more about its making in this post!&lt;/p&gt;

&lt;h1 id=&#34;why-create-the-cransays-dashboard&#34;&gt;Why create the cransays dashboard?&lt;/h1&gt;

&lt;p&gt;Sometimes, depending on the workload of CRAN volunteers, it can take a while before a package makes its way onto CRAN. The number of submissions has increased a lot within the last few years! As per &lt;a href=&#34;https://cran.r-project.org/web/packages/policies.html#Submission&#34;&gt;CRAN policies&lt;/a&gt;, &amp;ldquo;You can check that the submission was received by looking at &lt;a href=&#34;ftp://CRAN.R-project.org/incoming/&#34;&gt;ftp://CRAN.R-project.org/incoming/&lt;/a&gt;.&amp;rdquo; The content of that ftp server is divided into various subfolders, making the search for one&amp;rsquo;s package pretty tricky.&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/cran-incoming.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Part of the content of CRAN incoming ftp server&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;After having to do that a few days in a row for checking on one of our submissions, we decided a dashboard showing all packages and their submission stage in one go would be quite useful, so we made it happen!&lt;/p&gt;

&lt;h1 id=&#34;how-we-made-the-cransays-dashboard&#34;&gt;How did we make the cransays dashboard?&lt;/h1&gt;

&lt;p&gt;In this section you&amp;rsquo;ll discover how we create a snapshot of the CRAN incoming state, format it, and then deploy it to the interwebs every hour!&lt;/p&gt;

&lt;h2 id=&#34;creating-a-snapshot-of-cran-incoming&#34;&gt;Creating a snapshot of CRAN incoming&lt;/h2&gt;

&lt;p&gt;First of all, we wrote a function &lt;code&gt;take_snapshot()&lt;/code&gt; to create a snapshot of the ftp server. Technically, we query the different ftp folders using &lt;code&gt;curl&lt;/code&gt; and &lt;code&gt;utils::read.delim()&lt;/code&gt;, munging strings and datetimes along the way. Our original code was adapted from &lt;a href=&#34;https://github.com/edgararuiz/cran-stages/&#34;&gt;an insightful repo by Edgar Ruiz at RStudio&lt;/a&gt;, which is a good read about the different stages of submission.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;take_snapshot()&lt;/code&gt; function was improved thanks to the contributions of &lt;a href=&#34;https://mitchelloharawild.com/&#34;&gt;Mitchell O&amp;rsquo;Hara-Wild&lt;/a&gt; and &lt;a href=&#34;https://www.normalesup.org/~hgruson/&#34;&gt;Hugo Gruson&lt;/a&gt;. Receiving PRs so rapidly after opening &lt;a href=&#34;https://itsalocke.com/blog/up-your-open-source-game-with-hacktoberfest-at-locke-data/&#34;&gt;hacktoberfest issues&lt;/a&gt; was really appreciated!&lt;/p&gt;

&lt;h2 id=&#34;presenting-the-snapshot&#34;&gt;Presenting the snapshot&lt;/h2&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/cransays.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;cransays dashboard&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Then, the dashboard itself is the &lt;code&gt;data.frame&lt;/code&gt; returned by &lt;code&gt;take_snapshot()&lt;/code&gt; formatted by the &lt;code&gt;DT&lt;/code&gt; package. The formatting of that &lt;code&gt;datatable&lt;/code&gt; was improved thanks to external contributions again! Thank you, &lt;a href=&#34;https://www.normalesup.org/~hgruson/&#34;&gt;Hugo Gruson&lt;/a&gt; and &lt;a href=&#34;https://www.jimhester.com&#34;&gt;Jim Hester&lt;/a&gt;!&lt;/p&gt;
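
&lt;p&gt;In spirit, that step boils down to something like the sketch below. The data frame is a made-up stand-in for the output of &lt;code&gt;take_snapshot()&lt;/code&gt;: its column names and values are assumptions, and the real dashboard uses more formatting than this.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Made-up snapshot, standing in for the data.frame returned by take_snapshot()
snapshot &amp;lt;- data.frame(
  package = c(&amp;#34;pkgA&amp;#34;, &amp;#34;pkgB&amp;#34;),
  version = c(&amp;#34;0.1.0&amp;#34;, &amp;#34;1.2.3&amp;#34;),
  folder = c(&amp;#34;inspect&amp;#34;, &amp;#34;pretest&amp;#34;),
  stringsAsFactors = FALSE
)

# Render it as an interactive, searchable HTML table
DT::datatable(snapshot, rownames = FALSE)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;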

&lt;h2 id=&#34;packaging-it-all-together&#34;&gt;Packaging it all together&lt;/h2&gt;

&lt;p&gt;Find the source for our dashboard &lt;a href=&#34;https://github.com/lockedata/cransays&#34;&gt;here&lt;/a&gt;. &lt;code&gt;cransays&lt;/code&gt; is structured as a package, because&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Packages are cool, duh!&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;It helps with managing dependencies thanks to DESCRIPTION.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;It allowed us to reap the benefits of &lt;a href=&#34;https://github.com/r-lib/pkgdown&#34;&gt;&lt;code&gt;pkgdown&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;https://github.com/ropenscilabs/travis&#34;&gt;&lt;code&gt;travis&lt;/code&gt;&lt;/a&gt; + &lt;a href=&#34;https://github.com/ropenscilabs/tic&#34;&gt;&lt;code&gt;tic&lt;/code&gt;&lt;/a&gt;. We wrote the &lt;code&gt;DT&lt;/code&gt; code inside a vignette &amp;ndash; &lt;a href=&#34;https://usethis.r-lib.org/reference/use_vignette.html&#34;&gt;&lt;code&gt;usethis::use_vignette()&lt;/code&gt;&lt;/a&gt; for the win. Then, with the two lines below, we were able to deploy a website for the package from Travis! Every push to the master branch of our repo means &lt;code&gt;pkgdown::build_site()&lt;/code&gt; is run on Travis and the result is pushed to a gh-pages branch!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# install.packages(&amp;#34;usethis&amp;#34;)&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;# remotes::install_github(&amp;#34;ropenscilabs/tic&amp;#34;)&lt;/span&gt;

usethis&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;use_pkgdown() &lt;span style=&#34;color:#75715e&#34;&gt;# creates the pkgdown config file&lt;/span&gt;
travis&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;use_tic() &lt;span style=&#34;color:#75715e&#34;&gt;# creates boilerplate deploy code, lets you browse GitHub and Travis to create tokens.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We added a few lines to both &lt;a href=&#34;https://github.com/lockedata/cransays/blob/master/_pkgdown.yml#L2&#34;&gt;_pkgdown.yml&lt;/a&gt; and &lt;a href=&#34;https://github.com/lockedata/cransays/blob/master/.travis.yml#L6&#34;&gt;.travis.yml&lt;/a&gt; to be able to use &lt;a href=&#34;https://github.com/lockedatapublished/lockedatapkg&#34;&gt;our own &lt;code&gt;pkgdown&lt;/code&gt; style&lt;/a&gt;. Steph wrote &lt;a href=&#34;https://itsalocke.com/blog/automated-documentation-hosting-on-github-via-travis-ci/&#34;&gt;posts about deploying docs from Travis in case you want to know more about the process&lt;/a&gt;, but when using rOpenSci&amp;rsquo;s packages &lt;code&gt;tic&lt;/code&gt; and &lt;code&gt;travis&lt;/code&gt;, you don&amp;rsquo;t even need to think about it much.&lt;/p&gt;

&lt;p&gt;Thanks to this &lt;code&gt;pkgdown&lt;/code&gt;-Travis setup, you can find our dashboard &lt;a href=&#34;https://cransays.itsalocke.com/articles/dashboard.html&#34;&gt;here&lt;/a&gt; with &lt;a href=&#34;https://cransays.itsalocke.com/index.html&#34;&gt;a homepage with more information&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&#34;updating-it-regularly&#34;&gt;Updating it regularly!&lt;/h2&gt;

&lt;p&gt;Steph took care of the deploy via Netlify to a subdomain of itsalocke.com, otherwise we could have used the GitHub pages hosting from the gh-pages branch. Steph also created a Zapier zap so that there&amp;rsquo;d be a Travis build triggered every hour. It means that our dashboard is updated at least every hour, which seemed &lt;a href=&#34;https://en.wikipedia.org/wiki/Lagom&#34;&gt;&lt;em&gt;lagom&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&#34;sit-back-and-enjoy-or-improve-the-project&#34;&gt;Sit back and enjoy, or improve the project?&lt;/h1&gt;

&lt;p&gt;Now, we&amp;rsquo;re able to follow our submissions using our nifty dashboard, and you can do that too! We&amp;rsquo;ve also been thinking about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;getting metadata about each submission by unpacking them and using the &lt;a href=&#34;https://github.com/r-lib/desc&#34;&gt;&lt;code&gt;desc&lt;/code&gt; package&lt;/a&gt; and &lt;a href=&#34;https://github.com/r-lib/pkgload&#34;&gt;&lt;code&gt;pkgload::parse_ns_file()&lt;/code&gt;&lt;/a&gt;,&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;and saving the snapshots over time, very similar to &lt;a href=&#34;https://github.com/edgararuiz/cran-stages/&#34;&gt;what the RStudio folks did in the spring&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stay tuned via &lt;a href=&#34;https://github.com/lockedata/cransays/issues&#34;&gt;&lt;code&gt;cransays&lt;/code&gt;&amp;rsquo; issue tracker&lt;/a&gt;! In the meantime, happy R package development, and good luck with your CRAN submissions!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Processing complicated package outputs</title>
      <link>https://itsalocke.com/blog/processing-complicated-package-outputs/</link>
      <pubDate>Tue, 09 Oct 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/processing-complicated-package-outputs/</guid>
      <description>
        
        

&lt;p&gt;Sometimes packages have functions that don&amp;rsquo;t do the things the way you want them to do them and you have to either re-build the function, or work with it as-is and add code around it to solve your issue.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;ve had to do this recently with the &lt;code&gt;googleway&lt;/code&gt; package and its &lt;code&gt;google_distance()&lt;/code&gt; function, so I wanted to take you through, step by step, how I wrote code to go from a single-value function to a function that handles many inputs and returns 4 rows per input. I won&amp;rsquo;t be dwelling on how to write a function specifically, just showing you the workflow I often go through.&lt;/p&gt;

&lt;h2 id=&#34;requirements&#34;&gt;Requirements&lt;/h2&gt;

&lt;p&gt;Key functionality we&amp;rsquo;ll need today is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;googleway for providing the base function&lt;/li&gt;
&lt;li&gt;the tidyverse, namely purrr and dplyr, for lots of the data manipulation&lt;/li&gt;
&lt;li&gt;memoise for caching requests so we spend less cash&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(memoise)
&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(googleway)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&#34;google-distance&#34;&gt;Google Distance&lt;/h2&gt;

&lt;p&gt;To calculate distances we can use the &lt;a href=&#34;https://developers.google.com/maps/documentation/distance-matrix/start&#34;&gt;google distance API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You need an API key to use it. Note that this service does not have a free tier; however, it costs ~$5 per 1,000 requests and a trial of Google Cloud is available.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;AIzaSyBIeuWMWWweyv1SAoAxcY1IZ-2nuErFQY8&amp;#34;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then we need to prep our desired information.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;office &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;E14 5EU&amp;#34;&lt;/span&gt;
monday_9am &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;as.POSIXct&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2018-12-03 09:00&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&#34;handling-google-distance&#34;&gt;Handling &lt;code&gt;google_distance()&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;We&amp;rsquo;re calling the API with just a single address at a time, so we need to do a bit of prep here to make it work with lots of addresses.&lt;/p&gt;

&lt;p&gt;For starters, we can use the memoise package to cache results so if we send the same address multiple times it doesn&amp;rsquo;t need to go back to the API. Phew, since that API costs money to call!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;google_distance_deduped &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; memoise&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;memoise(google_distance)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
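&lt;p&gt;To see the caching behaviour without spending any API credit, here is a minimal sketch using a stand-in function. &lt;code&gt;slow_lookup()&lt;/code&gt; and the &lt;code&gt;calls&lt;/code&gt; counter are made up for illustration and are not part of &lt;code&gt;googleway&lt;/code&gt;; only &lt;code&gt;memoise::memoise()&lt;/code&gt; is the real call. By default the cache lives in memory, so it only lasts for the current R session.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(memoise)

calls = 0

# Hypothetical stand-in for an expensive API request
slow_lookup = function(postcode) {
  calls &amp;lt;&amp;lt;- calls + 1  # count how often the real work happens
  paste(&amp;#34;result for&amp;#34;, postcode)
}

lookup_cached = memoise::memoise(slow_lookup)

lookup_cached(&amp;#34;SE3 8UQ&amp;#34;)
lookup_cached(&amp;#34;SE3 8UQ&amp;#34;)  # same argument: served from the cache
calls                      # the underlying function ran only once&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;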
&lt;p&gt;Giving this a go with a single example, let&amp;rsquo;s see what Google gives us:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;SE3 8UQ&amp;#34;&lt;/span&gt;

example_1 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; google_distance_deduped(from,
                        office,
                        mode &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;transit&amp;#34;&lt;/span&gt;,
                        arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; monday_9am,
                        key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;key)

example_1&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## $destination_addresses
## [1] &amp;quot;Canary Wharf, London E14 5EU, UK&amp;quot;
## 
## $origin_addresses
## [1] &amp;quot;Shooters Hill Rd, London SE3 8UQ, UK&amp;quot;
## 
## $rows
##                          elements
## 1 5.7 km, 5653, 37 mins, 2221, OK
## 
## $status
## [1] &amp;quot;OK&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Wrapping the function in purrr&amp;rsquo;s &lt;code&gt;possibly()&lt;/code&gt; means that if a call throws an error, we get a default value back (here, &lt;code&gt;&amp;quot;Fail&amp;quot;&lt;/code&gt;) instead of breaking everything, so we won&amp;rsquo;t have to start all over again.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;google_distance_try &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;possibly(google_distance_deduped, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Fail&amp;#34;&lt;/span&gt;)

from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;NA&lt;/span&gt;)

example_2 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; google_distance_try(from,
                        office,
                        mode &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;transit&amp;#34;&lt;/span&gt;,
                        arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; monday_9am,
                        key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;key)

example_2&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Fail&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
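
&lt;p&gt;If you want to try &lt;code&gt;possibly()&lt;/code&gt; without an API key, the same idea can be sketched with a base function. &lt;code&gt;safe_log&lt;/code&gt; is just an illustrative name; &lt;code&gt;log()&lt;/code&gt; stands in for the API call.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(purrr)

# possibly() returns a wrapped function that yields `otherwise` on error
safe_log = possibly(log, otherwise = NA_real_)

safe_log(100)     # works as usual
safe_log(&amp;#34;oops&amp;#34;)  # log() would error here; the wrapper returns NA instead

# Inside map(), one bad input no longer aborts the whole run
map(list(1, &amp;#34;oops&amp;#34;, 10), safe_log)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;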

&lt;p&gt;Then to make the function work over multiple addresses, we need to change it slightly. The &lt;code&gt;map()&lt;/code&gt; function will iterate over all the addresses.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;google_distance_loop &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(x,&lt;span style=&#34;color:#66d9ef&#34;&gt;...&lt;/span&gt;){
    purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map(x, google_distance_try,&lt;span style=&#34;color:#66d9ef&#34;&gt;...&lt;/span&gt;)
}

from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;rep&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;SE3 8UQ&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;)

example_3 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; google_distance_loop(from,
                        office,
                        mode &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;transit&amp;#34;&lt;/span&gt;,
                        arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; monday_9am,
                        key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;key)

example_3&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [[1]]
## [[1]]$destination_addresses
## [1] &amp;quot;Canary Wharf, London E14 5EU, UK&amp;quot;
## 
## [[1]]$origin_addresses
## [1] &amp;quot;Shooters Hill Rd, London SE3 8UQ, UK&amp;quot;
## 
## [[1]]$rows
##                          elements
## 1 5.7 km, 5653, 37 mins, 2221, OK
## 
## [[1]]$status
## [1] &amp;quot;OK&amp;quot;
## 
## 
## [[2]]
## [[2]]$destination_addresses
## [1] &amp;quot;Canary Wharf, London E14 5EU, UK&amp;quot;
## 
## [[2]]$origin_addresses
## [1] &amp;quot;Shooters Hill Rd, London SE3 8UQ, UK&amp;quot;
## 
## [[2]]$rows
##                          elements
## 1 5.7 km, 5653, 37 mins, 2221, OK
## 
## [[2]]$status
## [1] &amp;quot;OK&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So our code is working over multiple cases and handling bad inputs pretty well, but how do we get something meaningful out of it? Looking at the data, the &lt;code&gt;rows&lt;/code&gt; element holds a nested table that contains the actual response.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;example_1 &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;rows&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;elements&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## $destination_addresses
## NULL
## 
## $origin_addresses
## NULL
## 
## $rows
## NULL
## 
## $status
## NULL
&lt;/code&gt;&lt;/pre&gt;

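&lt;p&gt;Everything comes back &lt;code&gt;NULL&lt;/code&gt;. That isn&amp;rsquo;t because there is no public transport route (the transit call above returned 37 mins); it&amp;rsquo;s because &lt;code&gt;example_1&lt;/code&gt; is a single response, so &lt;code&gt;map()&lt;/code&gt; walks over its four top-level elements rather than over a list of responses. Using the output of &lt;code&gt;google_distance_loop()&lt;/code&gt;, which is a list of responses, the same extraction returns a result. Here it is for another way of getting there:&lt;/p&gt;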
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;SE3 8UQ&amp;#34;&lt;/span&gt;
example_4 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; google_distance_loop(from,
                        office,
                        mode &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;walking&amp;#34;&lt;/span&gt;,
                        arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; monday_9am,
                        key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;key)

example_4 &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;rows&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;elements&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  flatten() &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;duration&amp;#34;&lt;/span&gt;) &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [[1]]
##             text value
## 1 1 hour 20 mins  4777
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When a commute is possible, we get a response back that includes the number of seconds it might take someone to travel to work for 9am on a Monday.&lt;/p&gt;

&lt;p&gt;First of all, we&amp;rsquo;ll need to reliably extract this information from a batch of responses. This takes multiple steps due to the way the API gives us info.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;SE3 8UQ&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;NA&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;SE3 8UQ&amp;#34;&lt;/span&gt;)

google_distance_tbl &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(x, &lt;span style=&#34;color:#66d9ef&#34;&gt;...&lt;/span&gt;) {
  google_distance_loop(x,&lt;span style=&#34;color:#66d9ef&#34;&gt;...&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;rows&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;elements&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  flatten() &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map(&lt;span style=&#34;color:#66d9ef&#34;&gt;unclass&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  map_df(flatten) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
  &lt;span style=&#34;color:#66d9ef&#34;&gt;cbind&lt;/span&gt;(x)
}
  
example_5 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; google_distance_tbl(from,
                        office,
                        mode &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;walking&amp;#34;&lt;/span&gt;,
                        arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; monday_9am,
                        key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;key)
example_5  &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;##             text value    status       x
## 1 1 hour 20 mins  4777        OK SE3 8UQ
## 2           &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;
## 3 1 hour 20 mins  4777        OK SE3 8UQ
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So now we&amp;rsquo;re going to need to ask about the different travel modes for each address to find out the range of values, in order to cope with &amp;ldquo;ZERO_RESULTS&amp;rdquo; records, where no route exists for a given mode. The &lt;code&gt;google_distance_all()&lt;/code&gt; function defined next finds out how long it&amp;rsquo;ll take someone to drive, walk, cycle, or use public transport to travel between two points.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;SE3 8UQ&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;NA&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;SE3 8UQ&amp;#34;&lt;/span&gt;)

google_distance_all &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(x, office, arrival_time, key, &lt;span style=&#34;color:#66d9ef&#34;&gt;...&lt;/span&gt;) {
  
  interested_in &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;expand.grid&lt;/span&gt;(from&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;x, 
     mode&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;driving&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;walking&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bicycling&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;transit&amp;#34;&lt;/span&gt;), 
      stringsAsFactors &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;FALSE&lt;/span&gt;)

map2_df(interested_in&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;from,interested_in&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;mode&lt;/span&gt;, 
     &lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt;mutate(
       google_distance_tbl(&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;x, office, mode&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;y,
                        arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; arrival_time,
                        key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;key),
       from&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;x, mode&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;y)
)
}

example_6 &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; google_distance_all(from, office, monday_9am, key)

example_6&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;##              text value    status       x    from      mode
## 1         12 mins   739        OK SE3 8UQ SE3 8UQ   driving
## 2            &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;    &amp;lt;NA&amp;gt;   driving
## 3         12 mins   739        OK SE3 8UQ SE3 8UQ   driving
## 4  1 hour 20 mins  4777        OK SE3 8UQ SE3 8UQ   walking
## 5            &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;    &amp;lt;NA&amp;gt;   walking
## 6  1 hour 20 mins  4777        OK SE3 8UQ SE3 8UQ   walking
## 7         32 mins  1930        OK SE3 8UQ SE3 8UQ bicycling
## 8            &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;    &amp;lt;NA&amp;gt; bicycling
## 9         32 mins  1930        OK SE3 8UQ SE3 8UQ bicycling
## 10        37 mins  2221        OK SE3 8UQ SE3 8UQ   transit
## 11           &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;    &amp;lt;NA&amp;gt;   transit
## 12        37 mins  2221        OK SE3 8UQ SE3 8UQ   transit
&lt;/code&gt;&lt;/pre&gt;
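
&lt;p&gt;The key step in &lt;code&gt;google_distance_all()&lt;/code&gt; is building every origin/mode combination and then iterating over the rows. A small sketch with no API calls shows the shape of it; the second postcode and the &lt;code&gt;paste()&lt;/code&gt; stand-in are purely illustrative.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(purrr)

origins = c(&amp;#34;SE3 8UQ&amp;#34;, &amp;#34;E1 6AN&amp;#34;)  # second postcode is an arbitrary example

# One row per origin/mode pair; the first column varies fastest
combos = expand.grid(from = origins,
                     mode = c(&amp;#34;driving&amp;#34;, &amp;#34;walking&amp;#34;),
                     stringsAsFactors = FALSE)
nrow(combos)  # 2 origins x 2 modes

# map2_chr() walks the two columns in parallel, one call per row
labels = map2_chr(combos$from, combos$mode, ~paste(.x, &amp;#34;by&amp;#34;, .y))
labels&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;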

&lt;p&gt;Having this many functions, though, clutters things up and makes it difficult to refactor and improve things. We should consolidate all the functionality into one big function.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39; Get distance data between two points based on all the travel mode options. Works for many origin points.&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39;&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39; @param x A vector of origins in address or postcode format&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39; @param dest A single destination in address or postcode format&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39; @param arrival_time A POSIXct datetime that folks need to arrive by&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39; @param key A google distance API key&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39; @param ... Currently unused; reserved for additional options to `google_distance()`&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39;&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;#&amp;#39; @return Data.frame containing (typically) 4 rows per input element&lt;/span&gt;

google_distance_all &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(x, dest, arrival_time, key, &lt;span style=&#34;color:#66d9ef&#34;&gt;...&lt;/span&gt;){
  
  &lt;span style=&#34;color:#75715e&#34;&gt;# simple hygiene stuff&lt;/span&gt;
  gd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;possibly(
    memoise&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;memoise(
      google_distance)
    , &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Fail&amp;#34;&lt;/span&gt;
  )
  
  &lt;span style=&#34;color:#75715e&#34;&gt;# Prep dataset&lt;/span&gt;
   interested_in &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;expand.grid&lt;/span&gt;(from&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;x, 
     mode&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;driving&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;walking&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bicycling&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;transit&amp;#34;&lt;/span&gt;), 
      stringsAsFactors &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;FALSE&lt;/span&gt;)
   &lt;span style=&#34;color:#75715e&#34;&gt;# Perform google_distance calls for all combos&lt;/span&gt;
  purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map2(interested_in&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;from,interested_in&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;mode&lt;/span&gt;, 
     &lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt;gd(&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;x, dest, mode&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;y,
                        arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; arrival_time,
                        key&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;key)
  ) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
    &lt;span style=&#34;color:#75715e&#34;&gt;# Extract relevant section&lt;/span&gt;
    purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;rows&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
    purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;elements&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
    purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;flatten() &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
    &lt;span style=&#34;color:#75715e&#34;&gt;# Simplify the data.frames&lt;/span&gt;
    purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map(&lt;span style=&#34;color:#66d9ef&#34;&gt;unclass&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
    purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map_df(purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;flatten) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt; 
    &lt;span style=&#34;color:#75715e&#34;&gt;# Add original lookup values&lt;/span&gt;
    &lt;span style=&#34;color:#66d9ef&#34;&gt;cbind&lt;/span&gt;(interested_in)
}

results &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; google_distance_all(
  from,
  office,
  arrival_time &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; monday_9am,
  key &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; key
)

results&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;##              text value    status    from      mode
## 1         12 mins   739        OK SE3 8UQ   driving
## 2            &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;   driving
## 3         12 mins   739        OK SE3 8UQ   driving
## 4  1 hour 20 mins  4777        OK SE3 8UQ   walking
## 5            &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;   walking
## 6  1 hour 20 mins  4777        OK SE3 8UQ   walking
## 7         32 mins  1930        OK SE3 8UQ bicycling
## 8            &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt; bicycling
## 9         32 mins  1930        OK SE3 8UQ bicycling
## 10        37 mins  2221        OK SE3 8UQ   transit
## 11           &amp;lt;NA&amp;gt;    NA NOT_FOUND    &amp;lt;NA&amp;gt;   transit
## 12        37 mins  2221        OK SE3 8UQ   transit
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I will undoubtedly want to do some cleaning after this and there&amp;rsquo;s certainly room for improvement on the function but this is a good starting point for getting some data to work with. The iterative way I build functions means I can try to solve a bit at a time &amp;ndash; hopefully this will help you when you&amp;rsquo;re faced with needing to build your own functions.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Tidyverse &#39;Starts_with&#39; in M/Power Query</title>
      <link>https://itsalocke.com/blog/tidyverse-starts_with-in-m/power-query/</link>
      <pubDate>Mon, 08 Oct 2018 15:13:39 +0100</pubDate>
      
      <guid>https://itsalocke.com/blog/tidyverse-starts_with-in-m/power-query/</guid>
      <description>
        
        

&lt;p&gt;As a heavy &lt;code&gt;R&lt;/code&gt; and &lt;code&gt;Tidyverse&lt;/code&gt; user, I&amp;rsquo;ve been playing with Microsoft&amp;rsquo;s &lt;code&gt;m&lt;/code&gt;/Power Query language, included in Excel and Power BI, from that perspective: looking for the functions to make my life easier, developing small code pipelines for my processing and trying to get a smooth, clear and maintainable data manipulation process in place.&lt;/p&gt;

&lt;h2 id=&#34;the-problem&#34;&gt;The Problem&lt;/h2&gt;

&lt;p&gt;In Power BI I have data generated from an API call to HubSpot, which delivers &lt;code&gt;json&lt;/code&gt; that is flattened, as the first step of the process, into a table with hundreds of columns. These columns have a pretty regular naming convention, in a form similar to this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;client_notified_timestamp
client_notified_source
client_notified_sourceid
client_notified_value
client_responded_timestamp
client_responded_source
client_responded_sourceid
client_responded_value
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The general rule is that the variable is encoded in the first part of the column name string, and that the columns with &lt;code&gt;[variable]_value&lt;/code&gt; hold the actual value while the other three columns (&lt;code&gt;[variable]_source&lt;/code&gt;, &lt;code&gt;[variable]_sourceid&lt;/code&gt; and &lt;code&gt;[variable]_timestamp&lt;/code&gt;) contain metadata we don&amp;rsquo;t really need here.&lt;/p&gt;

&lt;h2 id=&#34;the-target&#34;&gt;The Target&lt;/h2&gt;

&lt;p&gt;If I were using R to do this job (which &lt;em&gt;technically&lt;/em&gt; I could, but which wasn&amp;rsquo;t practical given the context the Power BI file is going to be used in), I could use the tidyverse to do this pretty simply:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;dataset &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    select(&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt;ends_with(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;_source&amp;#34;&lt;/span&gt;),&lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt;ends_with(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;_sourceid&amp;#34;&lt;/span&gt;)))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Anything that ends with &lt;code&gt;&amp;quot;_source&amp;quot;&lt;/code&gt; or &lt;code&gt;&amp;quot;_sourceid&amp;quot;&lt;/code&gt; gets dropped, everything else remains. A nice compact, maintainable and clear expression of a &amp;lsquo;rule&amp;rsquo; of processing.&lt;/p&gt;
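
&lt;p&gt;As a quick check of that rule, here is a minimal sketch on a toy table. The column values are invented for illustration; only the column names follow the convention described earlier.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(dplyr)

# Toy table following the naming convention (values are made up)
dataset = tibble(
  client_notified_timestamp = 1538000000,
  client_notified_source    = &amp;#34;API&amp;#34;,
  client_notified_sourceid  = &amp;#34;abc123&amp;#34;,
  client_notified_value     = &amp;#34;yes&amp;#34;
)

kept = dataset %&amp;gt;%
  select(-ends_with(&amp;#34;_source&amp;#34;), -ends_with(&amp;#34;_sourceid&amp;#34;))

names(kept)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that the &lt;code&gt;_timestamp&lt;/code&gt; column survives, since the rule only targets the two source suffixes.&lt;/p&gt;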

&lt;h2 id=&#34;the-solution&#34;&gt;The Solution&lt;/h2&gt;

&lt;p&gt;This is the solution I used:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-m&#34; data-lang=&#34;m&#34;&gt;let
    Source = ...,
    rawData = Source{[tableId=&amp;#34;myData&amp;#34;]}[Data],
    removeSources = Table.RemoveColumns(rawData, List.Select(Table.ColumnNames(rawData), each Text.EndsWith(_, &amp;#34;Source ID&amp;#34;) or Text.EndsWith(_, &amp;#34;Source&amp;#34;)))
in
    removeSources&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This code block sources &lt;code&gt;rawData&lt;/code&gt; and &amp;lsquo;lists&amp;rsquo; the columns matching my requirements (&lt;code&gt;&amp;quot;_source&amp;quot;&lt;/code&gt; and &lt;code&gt;&amp;quot;_sourceid&amp;quot;&lt;/code&gt;) using the logical condition &lt;code&gt;each Text.EndsWith(_, &amp;quot;Source ID&amp;quot;) or Text.EndsWith(_, &amp;quot;Source&amp;quot;)&lt;/code&gt; on the column names returned from &lt;code&gt;Table.ColumnNames(rawData)&lt;/code&gt; feeding into &lt;code&gt;List.Select(...)&lt;/code&gt;. This list is the second argument to the function &lt;code&gt;Table.RemoveColumns(...)&lt;/code&gt;, which is operating on the &lt;code&gt;rawData&lt;/code&gt; again, to finally return only the columns I want.&lt;/p&gt;
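
&lt;p&gt;For readers more at home in R, the same &amp;lsquo;list the names, then remove them&amp;rsquo; logic can be mirrored step by step in base R. This is only a rough analogue to show the shape of the &lt;code&gt;m&lt;/code&gt; expression; the toy data frame and variable names are illustrative.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Toy data frame; column names and values are illustrative
raw_data = data.frame(
  client_notified_value    = &amp;#34;yes&amp;#34;,
  client_notified_source   = &amp;#34;API&amp;#34;,
  client_notified_sourceid = &amp;#34;abc123&amp;#34;
)

# Roughly Table.ColumnNames(rawData)
col_names = names(raw_data)

# Roughly List.Select(..., each Text.EndsWith(...) or Text.EndsWith(...))
to_drop = col_names[endsWith(col_names, &amp;#34;_sourceid&amp;#34;) |
                    endsWith(col_names, &amp;#34;_source&amp;#34;)]

# Roughly Table.RemoveColumns(rawData, ...)
remove_sources = raw_data[, setdiff(col_names, to_drop), drop = FALSE]

names(remove_sources)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;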

&lt;h2 id=&#34;the-observations&#34;&gt;The Observations&lt;/h2&gt;

&lt;p&gt;This generally suits the requirements: &lt;em&gt;relatively&lt;/em&gt; readable functions, multiple logical conditions operating on the column names that &amp;lsquo;select&amp;rsquo; which I want returned in the next step.&lt;/p&gt;

&lt;p&gt;It is admittedly a little more verbose than the &lt;code&gt;R&lt;/code&gt; I had in mind, and right now I&amp;rsquo;m not sure if that&amp;rsquo;s me or just the language. There is some repetition in specifying &lt;code&gt;rawData&lt;/code&gt; in multiple places, for which I haven&amp;rsquo;t found a shorthand, if there is one. Parts of it seem only &amp;lsquo;functional-ish&amp;rsquo;? The construction of &lt;code&gt;each Text.EndsWith(_, &amp;quot;Source ID&amp;quot;) or Text.EndsWith(_, &amp;quot;Source&amp;quot;)&lt;/code&gt; is pretty object-oriented. Without wanting to sound insulting, maybe &lt;code&gt;m&lt;/code&gt; is only &amp;lsquo;semi-functional&amp;rsquo; in the technical definition of the term?&lt;/p&gt;

&lt;h2 id=&#34;the-caveat&#34;&gt;The Caveat&lt;/h2&gt;

&lt;p&gt;This is the first &lt;code&gt;m&lt;/code&gt; code I&amp;rsquo;ve really written and my knee-jerk first impressions. I&amp;rsquo;m sure there is a lot more to this language that I have yet to understand and maybe even come to appreciate.&lt;/p&gt;

&lt;h2 id=&#34;the-conclusion&#34;&gt;The Conclusion&lt;/h2&gt;

&lt;p&gt;Despite these observations I wouldn&amp;rsquo;t discount the potential of &lt;code&gt;m&lt;/code&gt;/Power Query. While many Microsoft tools let you use R baked in, it&amp;rsquo;s only baked in to the point where you can guarantee &lt;code&gt;R&lt;/code&gt; is installed on the machine, and it&amp;rsquo;s an undeniable fact of data that we have to work with Excel and Power BI in many situations. I&amp;rsquo;m actually quite looking forward to working with this not-quite-familiar &amp;lsquo;functional-ish data language&amp;rsquo; in the future. When it&amp;rsquo;s the tool for the job at least :)&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Speed Up With Microsoft</title>
      <link>https://itsalocke.com/blog/speed-up-with-microsoft/</link>
      <pubDate>Thu, 04 Oct 2018 13:19:59 +0100</pubDate>
      
      <guid>https://itsalocke.com/blog/speed-up-with-microsoft/</guid>
      <description>
        
        

&lt;p&gt;People use R for lots of reasons: &amp;ldquo;It&amp;rsquo;s great for the models I need&amp;rdquo;, &amp;ldquo;I
like the functional approach&amp;rdquo;, &amp;ldquo;It&amp;rsquo;s the tool I&amp;rsquo;m most comfortable
with&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;People don&amp;rsquo;t use R for these reasons: &amp;ldquo;I have a favourite processor
core, I don&amp;rsquo;t want to use the others&amp;rdquo;, &amp;ldquo;I love how my memory needs to
fit all my data&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;What if I told you that you didn&amp;rsquo;t need to worry about that any more?&lt;/p&gt;

&lt;h2 id=&#34;multi-threaded-r&#34;&gt;Multi-threaded R&lt;/h2&gt;

&lt;p&gt;Microsoft have their own version of &lt;code&gt;R&lt;/code&gt; called &lt;a href=&#34;https://docs.microsoft.com/en-us/machine-learning-server/r-client/what-is-microsoft-r-client&#34;&gt;Microsoft R
Client&lt;/a&gt;.
It has a bunch of high-tech, whiz-bang features, but we&amp;rsquo;re going to
focus on one: multi-threaded calculations. Vanilla &lt;code&gt;R&lt;/code&gt; is
single-threaded. This means any calculations are done sequentially, one
at a time, which leaves the extra cores in most modern laptops sitting
idle. To get set up with Microsoft R Client, &lt;a href=&#34;https://docs.microsoft.com/en-us/machine-learning-server/r-client/install-on-windows&#34;&gt;follow the
install
instructions&lt;/a&gt;.
Once you&amp;rsquo;ve got Microsoft R Client installed, you will need to make sure
it&amp;rsquo;s the version of &lt;code&gt;R&lt;/code&gt; that is active in your session. If you are using
RStudio this is easy to do by going to
&lt;code&gt;Tools &amp;gt; Global Options &amp;gt; General &amp;gt; R Version&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&#34;../img/R-options-rclient.PNG&#34; alt=&#34;&#34; /&gt;&lt;/p&gt;

&lt;p&gt;You will probably be asked to restart RStudio, so close it and open
it back up, then we can run the code.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;R version 3.4.3 (2017-11-30) -- &amp;quot;Kite-Eating Tree&amp;quot;
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type &#39;license()&#39; or &#39;licence()&#39; for distribution details.

R is a collaborative project with many contributors.
Type &#39;contributors()&#39; for more information and
&#39;citation()&#39; on how to cite R or R packages in publications.

Type &#39;demo()&#39; for some demos, &#39;help()&#39; for on-line help, or
&#39;help.start()&#39; for an HTML browser interface to help.
Type &#39;q()&#39; to quit R.

Microsoft R Open 3.4.3
The enhanced R distribution from Microsoft
Microsoft packages Copyright (C) 2018 Microsoft

Loading Microsoft R Client packages, version 3.4.3.0097. 
Microsoft R Client limits some functions to available memory.
See: https://go.microsoft.com/fwlink/?linkid=799476 for information
about additional features.

Type &#39;readme()&#39; for release notes, privacy() for privacy policy, or
&#39;RevoLicense()&#39; for licensing information.

Using the Intel MKL for parallel mathematical computing(using 2 cores).
Default CRAN mirror snapshot taken on 2018-01-01.
See: https://mran.microsoft.com/.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This startup message will appear in the console, and it contains some
information that you might need to think about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Right now Microsoft R Client is lagging
behind the current &lt;code&gt;R&lt;/code&gt; version, and is based on version 3.4 of &lt;code&gt;R&lt;/code&gt;,
not 3.5. This means your default package libraries will not be
shared between the installations if you are running &lt;code&gt;R&lt;/code&gt; 3.5.&lt;/li&gt;
&lt;li&gt;It&amp;rsquo;s using a snapshot of &lt;code&gt;CRAN&lt;/code&gt; called &lt;code&gt;MRAN&lt;/code&gt; to source packages by
default. 90% of the time it will operate just as you expect, but
because it takes a &amp;lsquo;snapshot&amp;rsquo; of packages, newer features and
changes that have hit &lt;code&gt;CRAN&lt;/code&gt; may not be in the version of the
package you are grabbing.

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;RevoScaleR&lt;/code&gt; should already be installed by default in Microsoft
R Client, and &lt;code&gt;ggplot2&lt;/code&gt; and &lt;code&gt;dplyr&lt;/code&gt; probably will be too.
The other two packages used below, &lt;code&gt;microbenchmark&lt;/code&gt; and
&lt;code&gt;lockeutils&lt;/code&gt;, you will probably have to install yourself.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Intel MKL will have scanned your system on install and attempted to
work out how many cores your processor has. Here it&amp;rsquo;s identified 2
on my old Lenovo Yoga. This is where the speed boost will come from.&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- --&gt;

&lt;pre&gt;&lt;code&gt;knitr::opts_chunk$set(echo = TRUE)
library(microbenchmark)
library(RevoScaleR)
library(ggplot2)
library(lockeutils)
theme_set(theme_ld() + theme(axis.title.x = element_text(vjust = -1)))
library(dplyr)
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&#34;test-data&#34;&gt;Test Data&lt;/h3&gt;

&lt;p&gt;Here we make a set of example data: three data frames of random numbers
drawn from various normal distributions. The data frames have different
numbers of rows: 500,000, 1,000,000 and 5,000,000.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;set.seed(9000)
# Build a data frame of n rows with three normally distributed columns
sample_df &amp;lt;- function(n) {
  data.frame(
    col_1 = rnorm(n, mean = 11, sd = 0.5),
    col_2 = rnorm(n, mean = 6, sd = 1),
    col_3 = rnorm(n, mean = 3, sd = 0.75)
  )
}

df_500k &amp;lt;- sample_df(500000)
df_1m &amp;lt;- sample_df(1000000)
df_5m &amp;lt;- sample_df(5000000)
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&#34;benchmarking&#34;&gt;Benchmarking&lt;/h3&gt;

&lt;p&gt;Running benchmarks in &lt;code&gt;R&lt;/code&gt; is easy with the &lt;code&gt;microbenchmark&lt;/code&gt; package. The
package tries to be as accurate as possible in measuring the time for
each of its runs, and also allows you to easily compare different
approaches and specify the number of repetitions. I&amp;rsquo;ve decided to test each
of the 3 data sets with each of 2 different linear modelling functions,
giving 6 different groups of results. Each group will be run 10 times
for a total of 60 runs.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;res &amp;lt;- microbenchmark(
  lm_50k = lm(col_1 ~ col_2 + col_3, data = df_500k),
  rxLM_50k = rxLinMod(col_1 ~ col_2 + col_3, data = df_500k, reportProgress = 0),
  lm_1m = lm(col_1 ~ col_2 + col_3, data = df_1m),
  rxLM_1m = rxLinMod(col_1 ~ col_2 + col_3, data = df_1m, reportProgress = 0),
  lm_5m = lm(col_1 ~ col_2 + col_3, data = df_5m),
  rxLM_5m = rxLinMod(col_1 ~ col_2 + col_3, data = df_5m, reportProgress = 0),
  times = 10
  )
&lt;/code&gt;&lt;/pre&gt;

&lt;h3 id=&#34;results&#34;&gt;Results&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;res&lt;/code&gt; object is of class &lt;code&gt;microbenchmark&lt;/code&gt;, and has a plotting method
that can be used via &lt;code&gt;ggplot2::autoplot()&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;autoplot(res) +  
  labs(title = &amp;quot;Violin plot of model run durations&amp;quot;, 
       subtitle = &amp;quot;`rxLinMod()` vs `lm()`&amp;quot;,
       caption = &amp;quot;Microsoft R Client 3.4.3, 2 cores&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&#34;../img/plot-1.png&#34; alt=&#34;&#34; /&gt;&lt;/p&gt;

&lt;p&gt;In each of these tests, we can see that the &lt;code&gt;RevoScaleR::rxLinMod()&lt;/code&gt;
function outperforms the base &lt;code&gt;lm()&lt;/code&gt; by a large margin. Note the log
scale for &lt;code&gt;Time [milliseconds]&lt;/code&gt;! Also note that the &lt;code&gt;_50k&lt;/code&gt;
labels refer to the 500,000-row data frame, &lt;code&gt;df_500k&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;knitr::kable(summary(res))
&lt;/code&gt;&lt;/pre&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;expr&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;min&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;lq&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;mean&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;median&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;uq&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;max&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;neval&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;lm_50k&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1106.2070&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1141.3168&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1740.0966&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1340.5601&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1474.8239&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4679.1277&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;rxLM_50k&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;115.3501&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;120.7544&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;137.2811&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;136.9788&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;147.7357&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;171.9592&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;lm_1m&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2203.7005&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2890.8738&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3018.6052&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3072.3086&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3340.6342&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3934.8914&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;rxLM_1m&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;184.8220&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;199.1018&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;223.0184&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;214.2320&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;223.2292&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;307.1239&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;lm_5m&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;14897.7473&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;15450.7170&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;16295.3558&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;16183.4236&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;16720.8177&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;18837.9140&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;rxLM_5m&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;704.0234&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;805.8069&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1108.2574&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;965.9696&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1044.1400&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2084.2737&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
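
&lt;p&gt;To put a number on &amp;lsquo;a large margin&amp;rsquo;, here is a small sketch that divides the median &lt;code&gt;lm()&lt;/code&gt; time by the median &lt;code&gt;rxLinMod()&lt;/code&gt; time for each data size. The values are copied by hand from the table above (in milliseconds), and the &lt;code&gt;medians&lt;/code&gt; and &lt;code&gt;speed_up&lt;/code&gt; names are just for illustration.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Median run times in milliseconds, copied from the summary table
# (the 500k row corresponds to the lm_50k / rxLM_50k labels)
medians &amp;lt;- data.frame(
  rows     = c(&amp;quot;500k&amp;quot;, &amp;quot;1m&amp;quot;, &amp;quot;5m&amp;quot;),
  lm       = c(1340.5601, 3072.3086, 16183.4236),
  rxLinMod = c(136.9788, 214.2320, 965.9696)
)

# Speed-up factor: how many times faster rxLinMod() was than lm()
medians$speed_up &amp;lt;- round(medians$lm / medians$rxLinMod, 1)
medians
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;On these numbers the speed-up works out at roughly 10, 14 and 17 times for the three data sizes, so the gap widens as the data grows.&lt;/p&gt;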

&lt;h3 id=&#34;but-how&#34;&gt;But how?&lt;/h3&gt;

&lt;p&gt;It does this because the &lt;em&gt;Microsoft R Client&lt;/em&gt; uses &lt;em&gt;Intel MKL for
parallel mathematical computing&lt;/em&gt;. This allows the &lt;code&gt;RevoScaleR&lt;/code&gt; package
to implement a &amp;lsquo;parallelised algorithm&amp;rsquo; to solve the linear regression
using the BLAS and LAPACK FORTRAN libraries. &lt;a href=&#34;https://mran.microsoft.com/documents/rro/multithread&#34;&gt;More technical details are
available in the
docs&lt;/a&gt;.&lt;/p&gt;
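
&lt;p&gt;If you want to check how many threads the MKL is using in your own session, something like the following should work. This is a sketch that assumes the &lt;code&gt;RevoUtilsMath&lt;/code&gt; package bundled with Microsoft&amp;rsquo;s &lt;code&gt;R&lt;/code&gt; distributions is present; the &lt;code&gt;mkl_threads&lt;/code&gt; name is just for illustration, and on vanilla &lt;code&gt;R&lt;/code&gt; the code simply falls back to &lt;code&gt;NA&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# RevoUtilsMath only ships with Microsoft R Open / Microsoft R Client
mkl_threads &amp;lt;- if (requireNamespace(&amp;quot;RevoUtilsMath&amp;quot;, quietly = TRUE)) {
  RevoUtilsMath::getMKLthreads()
} else {
  NA_integer_  # not a Microsoft R session, so no MKL thread count to report
}
mkl_threads
&lt;/code&gt;&lt;/pre&gt;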

&lt;h2 id=&#34;what-about-the-out-of-memory-part&#34;&gt;What about the out of memory part?&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;RevoScaleR&lt;/code&gt; and Microsoft R Client can help with that, though the
solution requires a little more involvement. The short version is that
it can leverage &lt;em&gt;distributed data sources&lt;/em&gt; as a backend, such as Spark,
Hadoop and SQL Server. This allows the data to be spread across many
&lt;em&gt;compute nodes&lt;/em&gt; that are managed by software called &amp;lsquo;Microsoft Machine
Learning Server&amp;rsquo;. &lt;a href=&#34;https://docs.microsoft.com/en-us/machine-learning-server/r/concept-what-is-revoscaler&#34;&gt;This is a good overview of the
ideas&lt;/a&gt;
on the main site, which also holds all the documentation.&lt;/p&gt;

&lt;h2 id=&#34;would-you-like-to-know-more&#34;&gt;Would you like to know more?&lt;/h2&gt;

&lt;p&gt;Locke Data are developing a course that goes into depth with this
technology from an &lt;code&gt;R&lt;/code&gt; context and &lt;a href=&#34;../../training/onlinetraining/&#34;&gt;are releasing it at the end of
October&lt;/a&gt;. Please come and join us
(virtually!) for some hands-on learning and detailed tutorials.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Up your open source game with Hacktoberfest at Locke Data!</title>
      <link>https://itsalocke.com/blog/up-your-open-source-game-with-hacktoberfest-at-locke-data/</link>
      <pubDate>Mon, 01 Oct 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/up-your-open-source-game-with-hacktoberfest-at-locke-data/</guid>
      <description>
        
        &lt;p&gt;How awesome is open source software? Quite awesome in our opinion! Locke Data maintains several open source repos &lt;a href=&#34;https://github.com/lockedata&#34;&gt;on GitHub&lt;/a&gt;, in particular &lt;a href=&#34;https://github.com/search?q=topic%3Ar-package+org%3Alockedata+fork%3Atrue&#34;&gt;R packages&lt;/a&gt;, and we&amp;rsquo;d like you to join in the fun! This month, we&amp;rsquo;re taking part in Hacktoberfest and will do our best to mentor you through your first open source contributions if you wish!&lt;/p&gt;

&lt;p&gt;Hacktoberfest is a month-long event celebrating open source software. If you&amp;rsquo;re an open source newbie, it&amp;rsquo;s the perfect occasion to start participating in open source development! All you need to do is sign up &lt;a href=&#34;https://hacktoberfest.digitalocean.com/&#34;&gt;at the Hacktoberfest website&lt;/a&gt;, and then look for &amp;ldquo;hacktoberfest&amp;rdquo;-labelled issues on GitHub to see where your help is needed. &lt;a href=&#34;https://blog.github.com/2018-09-24-hacktoberfest-is-back-and-celebrating-its-fifth-year/&#34;&gt;More general details are in the GitHub blog post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We at Locke Data have labelled &lt;a href=&#34;https://github.com/search?q=user%3Alockedata+label%3Ahacktoberfest&amp;amp;state=open&amp;amp;type=Issues&#34;&gt;a few issues&lt;/a&gt; (follow &lt;a href=&#34;https://github.com/search?q=user%3Alockedata+label%3Ahacktoberfest&amp;amp;state=open&amp;amp;type=Issues&#34;&gt;the link&lt;/a&gt;) in our package repos and would be glad to have you as a contributor. All our repos are nice places with a code of conduct!&lt;/p&gt;

&lt;p&gt;If you choose to contribute to one of our repositories, we&amp;rsquo;ll offer pointers as needed. Ellen is the team member in charge of managing our Hacktoberfest efforts, but we&amp;rsquo;ll all cheer! Have fun! If you need extra help or pointers, or just want to know more, feel free to reach out to us on Twitter @LockeData and give us a follow while you&amp;rsquo;re over there - we&amp;rsquo;ll be talking about Hacktoberfest for the whole month!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Functions and Packages</title>
      <link>https://itsalocke.com/blog/functions-and-packages/</link>
      <pubDate>Sat, 29 Sep 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/functions-and-packages/</guid>
      <description>
        
        

&lt;p&gt;We’re done with the basics of handling data in R. Now we want to know
how to make sense of it. We know what kind of data it is, we know how to
look at column names, dimensions and the like. If you’re trying to add
value to this data however, that very often isn’t enough, so here’s a
look at using the tools available to you to start figuring out how to do
what you want.&lt;/p&gt;


&lt;div style=&#34;position: relative; padding-bottom: 56.25%; padding-top: 30px; height: 0; overflow: hidden;&#34;&gt;
  &lt;iframe src=&#34;//www.youtube.com/embed/Kq93ADb9Ii0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%;&#34; allowfullscreen frameborder=&#34;0&#34; title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
 &lt;/div&gt;


&lt;h1 id=&#34;r-packages&#34;&gt;R packages&lt;/h1&gt;

&lt;p&gt;An R package is a bundle of functions and/or datasets. It extends the
capabilities that the “base” and “recommended” R packages have. By using
packages we can do data manipulation in a variety of ways, produce all
sorts of awesome charts, generate books like this, use other languages
like Python and JavaScript, and of course, do all sorts of data
analysis.&lt;/p&gt;

&lt;h2 id=&#34;installing-packages&#34;&gt;Installing packages&lt;/h2&gt;

&lt;p&gt;Once you’ve identified a package that contains functions or data you’re
interested in using, you need to get the package onto your machine.&lt;/p&gt;

&lt;p&gt;To get the package, you can use an R function or you can use the Install
button on the Packages tab.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;install.packages(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;datasauRus&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you need to install a number of packages, &lt;code&gt;install.packages()&lt;/code&gt; takes
a vector of package names.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;install.packages(&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;datasauRus&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tidyverse&amp;#34;&lt;/span&gt;))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Updating packages involves re-running &lt;code&gt;install.packages()&lt;/code&gt; and it’s
usually easier to trigger this by using the Update button on the
Packages tab and selecting all the packages you want to update.&lt;/p&gt;

&lt;h3 id=&#34;installing-from-github-and-other-sources&#34;&gt;Installing from GitHub and other sources&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;install.packages()&lt;/code&gt; function works with CRAN, CRAN mirrors, and
CRAN-like repositories.&lt;/p&gt;

&lt;p&gt;If you want to install BioConductor packages, there are some helper
scripts available from the BioConductor website,
&lt;a href=&#34;http://www.bioconductor.org/install/&#34;&gt;bioconductor.org&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Other package sources, such as GitHub, will involve building packages
before they can be installed. If you’re on Windows, this means you need
an additional piece of software called
&lt;a href=&#34;http://cran.r-project.org/bin/windows/Rtools/&#34;&gt;Rtools&lt;/a&gt;. The other handy
thing you’ll need is the package &lt;code&gt;devtools&lt;/code&gt; (available from CRAN).
&lt;code&gt;devtools&lt;/code&gt; provides a number of functions designed to make it easier to
install from GitHub, BitBucket, and other sources.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(devtools)
install_github(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;lockedata/pRojects&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h2 id=&#34;recommended-packages&#34;&gt;Recommended packages&lt;/h2&gt;

&lt;p&gt;Here are my recommended packages – look out for books and blogposts on
these in the future!&lt;/p&gt;

&lt;h3 id=&#34;tidyverse&#34;&gt;tidyverse&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;tidyverse&lt;/code&gt; is a suite of packages designed to make your life
easier. It’s well worth installing and many of the packages in this
recommendations section are part of the &lt;code&gt;tidyverse&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;install.packages(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tidyverse&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;h3 id=&#34;getting-data-in-and-out-of-r&#34;&gt;Getting data in and out of R&lt;/h3&gt;

&lt;p&gt;The following packages can be used to get data into, and out of R:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Working with databases, you can use the &lt;code&gt;DBI&lt;/code&gt; package and its
companion &lt;code&gt;odbc&lt;/code&gt; to connect to most databases&lt;/li&gt;
&lt;li&gt;To get data from web pages, you can use &lt;code&gt;rvest&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;To work with APIs, you can use &lt;code&gt;httr&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;To work with CSVs, you can use &lt;code&gt;readr&lt;/code&gt; or &lt;code&gt;data.table&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;To work with SPSS, SAS, and Stata files, use &lt;code&gt;haven&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&#34;data-manipulation&#34;&gt;Data manipulation&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;tidyverse&lt;/code&gt; contains great packages for data manipulation including
&lt;code&gt;dplyr&lt;/code&gt; and &lt;code&gt;purrr&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Additionally, a favourite data manipulation package of mine is
&lt;code&gt;data.table&lt;/code&gt;. &lt;code&gt;data.table&lt;/code&gt; tends to have a bit of a steeper learning
curve than the &lt;code&gt;tidyverse&lt;/code&gt; but it’s phenomenal for brevity and
performance.&lt;/p&gt;

&lt;h3 id=&#34;data-visualisation&#34;&gt;Data visualisation&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;For static graphics &lt;code&gt;ggplot2&lt;/code&gt; is fantastic - it adds a sensible
vocabulary to help you construct charts with ease&lt;/li&gt;
&lt;li&gt;&lt;code&gt;plotly&lt;/code&gt; helps you build interactive charts from scratch or make
&lt;code&gt;ggplot2&lt;/code&gt; charts interactive&lt;/li&gt;
&lt;li&gt;&lt;code&gt;leaflet&lt;/code&gt; is a great maps package&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ggraph&lt;/code&gt; helps you build effective network diagrams&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&#34;data-science&#34;&gt;Data science&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;caret&lt;/code&gt; is an interface package to many model algorithms and has a
raft of insanely useful features itself&lt;/li&gt;
&lt;li&gt;&lt;code&gt;broom&lt;/code&gt; takes outputs from model functions and makes them into nice
data.frames&lt;/li&gt;
&lt;li&gt;&lt;code&gt;modelr&lt;/code&gt; helps build samples and supplement result sets&lt;/li&gt;
&lt;li&gt;&lt;code&gt;reticulate&lt;/code&gt; is a package for talking to Python and, therefore,
enables you to work with any deep learning framework that is based
in Python. &lt;code&gt;tensorflow&lt;/code&gt; is a package based on &lt;code&gt;reticulate&lt;/code&gt; and
allows you to work with tensorflow in R&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sparklyr&lt;/code&gt; allows you to run and work with Spark processes on your R
data&lt;/li&gt;
&lt;li&gt;&lt;code&gt;h2o&lt;/code&gt; is a package for working with H2O, a super nifty machine
learning platform&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&#34;presenting-results&#34;&gt;Presenting results&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;rmarkdown&lt;/code&gt; is the core package for combining text and code and
being able to produce outputs like HTML pages, PDFs, and Word
documents&lt;/li&gt;
&lt;li&gt;&lt;code&gt;bookdown&lt;/code&gt; facilitates books like this&lt;/li&gt;
&lt;li&gt;&lt;code&gt;revealjs&lt;/code&gt; allows you to make slide decks using &lt;code&gt;rmarkdown&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;flexdashboard&lt;/code&gt; and &lt;code&gt;shiny&lt;/code&gt; allow you to make interactive, reactive
dashboards and other analytical apps&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&#34;finding-packages&#34;&gt;Finding packages&lt;/h3&gt;

&lt;p&gt;As well as using online search facilities like
&lt;a href=&#34;http://cran.r-project.org/search.html&#34;&gt;CRAN&lt;/a&gt; and
&lt;a href=&#34;http://rdrr.io&#34;&gt;rdrr.io&lt;/a&gt; for packages, there are some handy packages
that help you find other packages!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ctv&lt;/code&gt; allows you to get all the packages in a given &lt;a href=&#34;http://cran.r-project.org/web/views/&#34;&gt;CRAN task
view&lt;/a&gt;; task views are maintained
lists of packages for various tasks&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sos&lt;/code&gt; allows you to search for packages and functions that match a
keyword&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&#34;loading-packages&#34;&gt;Loading packages&lt;/h2&gt;

&lt;p&gt;To make functions and data from a package available to use, we need to
run the &lt;code&gt;library()&lt;/code&gt; function.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;utils&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;library()&lt;/code&gt; function accepts a vector of length 1, so you need to
perform multiple calls to the function to load up multiple packages.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;utils&amp;#34;&lt;/span&gt;)
&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;stats&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
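&lt;p&gt;If you have a longer list of packages, one possible shortcut (a sketch, not the only way) is to loop over a character vector of names. The &lt;code&gt;character.only = TRUE&lt;/code&gt; argument tells &lt;code&gt;library()&lt;/code&gt; to treat its input as a string holding the package name; the &lt;code&gt;pkgs&lt;/code&gt; variable is just an illustrative name.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Packages to load; both ship with R, so nothing needs installing
pkgs &amp;lt;- c(&amp;#34;utils&amp;#34;, &amp;#34;stats&amp;#34;)

# Call library() once per name; invisible() hides the list lapply() returns
invisible(lapply(pkgs, library, character.only = TRUE))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;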
&lt;p&gt;Once a package is loaded, you can then use any of its functions.&lt;/p&gt;

&lt;p&gt;You can find what functions are available in a package by looking at
its help page.&lt;/p&gt;

&lt;p&gt;Alternatively, you can type the package’s name and hit Tab. This
auto-completes the package’s name, adds two colons (&lt;code&gt;::&lt;/code&gt;) and then shows
the list of available functions for that package. The double colon trick
is very helpful for when you want to browse package functionality, e.g.
&lt;code&gt;utils::find()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Any function in R can be prefixed with its package name and the double
colon (&lt;code&gt;::&lt;/code&gt;) - this is great for telling people where
functions are coming from and for tracking dependencies in long scripts.
It is also really useful when you have two packages loaded that might
have a function of the same name. This is because the order the packages
are loaded in dictates which one gets overridden: the package loaded
last masks same-named functions from packages loaded earlier.&lt;/p&gt;
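
&lt;p&gt;Here is a minimal sketch of that masking behaviour using only what ships with R. Instead of loading two clashing packages, it defines a throwaway function called &lt;code&gt;median()&lt;/code&gt; (purely for illustration) and shows that the prefixed call still reaches the &lt;code&gt;stats&lt;/code&gt; version.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;x &amp;lt;- c(2, 4, 6)

# Explicit namespace: always the median() from the stats package
stats::median(x)

# A function of our own with the same name masks the stats version...
median &amp;lt;- function(x) &amp;#34;masked!&amp;#34;
median(x)

# ...but the prefixed call still reaches the original
stats::median(x)

# Remove our version so median() behaves normally again
rm(median)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;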

&lt;h2 id=&#34;learning-how-to-use-a-package&#34;&gt;Learning how to use a package&lt;/h2&gt;

&lt;p&gt;R documentation is some of the best out there.&lt;/p&gt;

&lt;p&gt;Yes, I will complain about the impenetrable statistical jargon some
package authors use, but the CRAN gatekeepers require that packages
generally have a really high standard of documentation.&lt;/p&gt;

&lt;p&gt;Every function you use will have a help page associated with it. This
page usually contains a description, shows what parameters the function
has, what those parameters are, and most importantly, there are usually
examples.&lt;/p&gt;

&lt;p&gt;To navigate to the help page of an individual function in an R package
you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hit F1 on a function name in a script&lt;/li&gt;
&lt;li&gt;Type &lt;code&gt;?fnName&lt;/code&gt; and send to the console (use &lt;code&gt;??fnName&lt;/code&gt; instead to search the help of all installed packages)&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;?&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;mean&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Search in the Help tab&lt;/li&gt;
&lt;li&gt;Use the &lt;code&gt;help()&lt;/code&gt; function to open up the package’s index page and
navigate to the relevant function&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;help(package&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;utils&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;Find the relevant package in the Packages tab and click on it.
Scroll through the index that opens up on the Help page to find the
right function&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As well as the function level documentation, good packages also provide
a higher level of documentation that covers workflows using the
packages, how to extend package functionality, or outlines any
methodologies or research that led to the package.&lt;/p&gt;

&lt;p&gt;These pieces of documentation are called &lt;strong&gt;vignettes&lt;/strong&gt;. They are
accessible on the package’s index page or you can use the function
&lt;code&gt;vignette()&lt;/code&gt; to read them.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;vignette(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;multi&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
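&lt;p&gt;If you don’t know the name of a vignette yet, you can ask for a listing first. This sketch uses the &lt;code&gt;grid&lt;/code&gt; package simply because it ships with R; the &lt;code&gt;grid_vignettes&lt;/code&gt; name is only for illustration.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Calling vignette() without a topic lists what is available
# for the given package instead of opening a document
grid_vignettes &amp;lt;- vignette(package = &amp;#34;grid&amp;#34;)

# Printing the result shows the vignette names and titles
grid_vignettes&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;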
&lt;h1 id=&#34;using-functions-in-packages&#34;&gt;Using functions in packages&lt;/h1&gt;

&lt;p&gt;In previous sections we’ve seen R &lt;strong&gt;functions&lt;/strong&gt; that are used on objects
to perform some activity. Functions seen so far include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;class()&lt;/code&gt; and &lt;code&gt;is.*()&lt;/code&gt; functions for checking datatypes&lt;/li&gt;
&lt;li&gt;&lt;code&gt;as.*&lt;/code&gt; for converting to datatypes&lt;/li&gt;
&lt;li&gt;&lt;code&gt;length()&lt;/code&gt; and &lt;code&gt;names()&lt;/code&gt; for metadata&lt;/li&gt;
&lt;li&gt;&lt;code&gt;head()&lt;/code&gt; and &lt;code&gt;tail()&lt;/code&gt; for getting a small number of elements from an
object&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ncol()&lt;/code&gt;, &lt;code&gt;nrow()&lt;/code&gt;, &lt;code&gt;colnames()&lt;/code&gt;, and &lt;code&gt;rownames()&lt;/code&gt; for getting
data.frame metadata&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Sys.Date()&lt;/code&gt; and &lt;code&gt;Sys.time()&lt;/code&gt; for getting current date-time values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is a huge range of functions out there, whether available in R
straight away, or from adding extra functionality.&lt;/p&gt;

&lt;p&gt;Understanding how functions work and being able to use them correctly
will help you learn and use R effectively.&lt;/p&gt;

&lt;h2 id=&#34;using-a-function&#34;&gt;Using a function&lt;/h2&gt;

&lt;p&gt;A function does some computation on an object. The use of a function
consists of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A function’s name&lt;/li&gt;
&lt;li&gt;Parentheses&lt;/li&gt;
&lt;li&gt;0 or more inputs&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each input is provided to an &lt;strong&gt;argument&lt;/strong&gt; or parameter within a
function.&lt;/p&gt;

&lt;p&gt;These arguments have names, although you don’t often need to provide the
names.&lt;/p&gt;

&lt;p&gt;You can find out what arguments a function takes by using the code
completion and its help snippet, or by searching for the function in
the RStudio Help tab.&lt;/p&gt;

&lt;p&gt;When you’re inside the brackets of a function you can get the list of
available arguments and auto-complete them.&lt;/p&gt;
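
&lt;p&gt;If you are working outside RStudio, base R can list a function’s
arguments too. Here is a minimal sketch using &lt;code&gt;args()&lt;/code&gt; and &lt;code&gt;formals()&lt;/code&gt;,
two base R functions that are not covered elsewhere in this post:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Print the argument list of rnorm(), including default values
args(rnorm)

# Get just the argument names as a character vector
names(formals(rnorm))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The second call returns the names &lt;code&gt;n&lt;/code&gt;, &lt;code&gt;mean&lt;/code&gt; and &lt;code&gt;sd&lt;/code&gt;.&lt;/p&gt;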

&lt;h2 id=&#34;examining-functions&#34;&gt;Examining functions&lt;/h2&gt;

&lt;p&gt;One of the niftiest things about R is being able to see the code for a
function. You can examine how many functions work by just typing their
name without any parentheses.&lt;/p&gt;

&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;Sys.Date&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## function () 
## as.Date(as.POSIXlt(Sys.time()))
## &amp;lt;bytecode: 0x10df91748&amp;gt;
## &amp;lt;environment: namespace:base&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first line(s) show how the arguments are specified. Subsequent lines
show the code and the final lines starting with &lt;code&gt;&amp;lt;&lt;/code&gt; can be mostly
ignored.&lt;/p&gt;

&lt;h2 id=&#34;function-input-patterns&#34;&gt;Function input patterns&lt;/h2&gt;

&lt;p&gt;Functions tend to conform to certain patterns of inputs.&lt;/p&gt;

&lt;h3 id=&#34;no-inputs&#34;&gt;No inputs&lt;/h3&gt;

&lt;p&gt;Some functions don’t require the user to provide info and so they don’t
have any arguments. &lt;code&gt;Sys.Date()&lt;/code&gt; and similar functions do not need user
input because the functions provide information about the system.&lt;/p&gt;

&lt;p&gt;Looking at the function definition above, we can see that there are no
arguments specified in the first line.&lt;/p&gt;

&lt;h3 id=&#34;single-inputs&#34;&gt;Single inputs&lt;/h3&gt;

&lt;p&gt;Other functions only have a single allowed input. &lt;code&gt;length()&lt;/code&gt; returns the
length of an object so it only allows you to provide it with an object.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;length&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## function (x)  .Primitive(&amp;quot;length&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We can see in this definition that the function takes the argument &lt;code&gt;x&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id=&#34;many-inputs&#34;&gt;Many inputs&lt;/h3&gt;

&lt;p&gt;Some functions have multiple inputs, although not all of them are
necessarily &lt;strong&gt;mandatory&lt;/strong&gt;. &lt;code&gt;head()&lt;/code&gt; and &lt;code&gt;tail()&lt;/code&gt; have been used so far
with only a single input but they take an optional argument as to how
many elements should be returned.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;head&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;letters&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;a&amp;quot; &amp;quot;b&amp;quot; &amp;quot;c&amp;quot; &amp;quot;d&amp;quot; &amp;quot;e&amp;quot; &amp;quot;f&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;head&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;letters&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;a&amp;quot; &amp;quot;b&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;rnorm()&lt;/code&gt; function allows us to generate a vector of values from a
normal distribution. We can tell it how many values we need (&lt;code&gt;n&lt;/code&gt;), and
we can optionally provide the mean (&lt;code&gt;mean&lt;/code&gt;) and standard deviation
(&lt;code&gt;sd&lt;/code&gt;) to describe the Normal curve that values should be selected from.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## function (n, mean = 0, sd = 1) 
## .Call(C_rnorm, n, mean, sd)
## &amp;lt;bytecode: 0x10ded12a8&amp;gt;
## &amp;lt;environment: namespace:stats&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Looking at how &lt;code&gt;rnorm&lt;/code&gt; is specified we can see that we’re expected to
provide &lt;code&gt;n&lt;/code&gt;, but &lt;code&gt;mean&lt;/code&gt; and &lt;code&gt;sd&lt;/code&gt; are given values of 0 and 1
respectively by default.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm(n&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] -1.4818734 -1.0309718  1.4056332  0.4328255 -0.6992250
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm(n &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;, mean &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;, sd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1]  6.759442  8.453144  8.035506 12.549855 11.781241
&lt;/code&gt;&lt;/pre&gt;
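&lt;p&gt;Because &lt;code&gt;rnorm()&lt;/code&gt; defines its arguments in the order &lt;code&gt;n&lt;/code&gt;, &lt;code&gt;mean&lt;/code&gt;,
&lt;code&gt;sd&lt;/code&gt;, the same call can be written without names. This is a small sketch of that.
&lt;code&gt;set.seed()&lt;/code&gt; is only there so that both calls draw the same random numbers, and the
variable names are made up for the example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Same seed before each call, so the random draws are comparable
set.seed(42)
named &amp;lt;- rnorm(n = 5, mean = 10, sd = 2)

set.seed(42)
positional &amp;lt;- rnorm(5, 10, 2)

# Values are matched to n, mean and sd by position
identical(named, positional)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;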

&lt;h3 id=&#34;unlimited-inputs&#34;&gt;Unlimited inputs&lt;/h3&gt;

&lt;p&gt;Other functions can take an unlimited number of input values. Functions
like &lt;code&gt;sum()&lt;/code&gt; will sum the values from a number of objects.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;sum&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## function (..., na.rm = FALSE)  .Primitive(&amp;quot;sum&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The ellipsis (&lt;code&gt;...&lt;/code&gt;) is used to denote when the user can provide any
number of values.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;sum&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;9&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;pi&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 54.14159
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&#34;naming-arguments&#34;&gt;Naming arguments&lt;/h2&gt;

&lt;p&gt;Every input provided to a function is associated with an argument.&lt;/p&gt;

&lt;p&gt;Each argument must have a name. Even functions that allow unlimited
inputs assign these inputs to a name. Behind the scenes, they get put
into a list object and the list gets called &lt;code&gt;...&lt;/code&gt; (or ellipsis).&lt;/p&gt;
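
&lt;p&gt;To make that concrete, here is a rough sketch of a function that collects
its &lt;code&gt;...&lt;/code&gt; inputs into a list. &lt;code&gt;count_inputs()&lt;/code&gt; is a made-up helper for
illustration, not a base R function:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# count_inputs() is a hypothetical helper, written only for this example
count_inputs &amp;lt;- function(...) {
  inputs &amp;lt;- list(...)  # capture everything passed via ... as a list
  length(inputs)       # one list element per input object
}

count_inputs(1:3, 1:9, pi)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 3
&lt;/code&gt;&lt;/pre&gt;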

&lt;p&gt;There are some typical names for arguments that take your data object.
These include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;x&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;data&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;.data&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;df&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don’t usually have to provide the argument names, just put things in
the relevant places in the function. Sometimes though, you &lt;em&gt;will&lt;/em&gt; need
to use argument names.&lt;/p&gt;

&lt;p&gt;Here are my rules of thumb for knowing when you need to name names:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You’re using the arguments in an order that is different from the
function author’s intended order (you might be skipping some
arguments as the default values are fine or you might just prefer a
different order)&lt;/li&gt;
&lt;li&gt;The arguments you want to specify show up after the &lt;code&gt;...&lt;/code&gt; in a
function’s argument list&lt;/li&gt;
&lt;li&gt;You want to give a specific name to a value in a &lt;code&gt;...&lt;/code&gt; argument&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We can provide names for clarity or so we can use arguments out of order
if we prefer to.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm(n &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;, mean &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;, sd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 10.775754 10.104590  7.055771 14.544010  9.135413
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm(mean &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;, sd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;, n &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1]  9.625624  8.345576  8.794756  8.914196 10.417960
&lt;/code&gt;&lt;/pre&gt;
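&lt;p&gt;The other rules of thumb can be sketched in the same way. The example
below skips &lt;code&gt;mean&lt;/code&gt; by naming &lt;code&gt;sd&lt;/code&gt;, and then gives names to values passed
through the &lt;code&gt;...&lt;/code&gt; of &lt;code&gt;c()&lt;/code&gt;. The &lt;code&gt;scores&lt;/code&gt; vector and its names are
invented for the example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Skip mean (it keeps its default of 0) by naming sd
rnorm(5, sd = 2)

# Names given to values in ... are kept as the names of the elements
scores &amp;lt;- c(maths = 70, art = 85)
names(scores)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Without the name, &lt;code&gt;rnorm(5, 2)&lt;/code&gt; would treat the 2 as the mean, because
&lt;code&gt;mean&lt;/code&gt; is the second argument.&lt;/p&gt;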

&lt;p&gt;A common behaviour change that you’ll need to work with is how missing
(&lt;code&gt;NA&lt;/code&gt;) values get handled. Functions that allow you to change this
behaviour usually have an argument with a name like &lt;code&gt;na.rm&lt;/code&gt;,
&lt;code&gt;na.omit&lt;/code&gt;, or &lt;code&gt;na.action&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;sum&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;NA&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] NA
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;sum&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;NA&lt;/span&gt;, na.rm &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;TRUE&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 15
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In the &lt;code&gt;sum()&lt;/code&gt; example, I used the &lt;code&gt;na.rm&lt;/code&gt; argument’s name. This is
because otherwise the &lt;code&gt;TRUE&lt;/code&gt; would be considered part of the values
being passed for summing. Without the name, the value gets considered as
part of the &lt;code&gt;...&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;sum&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;NA&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;TRUE&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] NA
&lt;/code&gt;&lt;/pre&gt;
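&lt;p&gt;To check that an unnamed &lt;code&gt;TRUE&lt;/code&gt; really is being added up, here is a
quick sketch that leaves the &lt;code&gt;NA&lt;/code&gt; out. R converts &lt;code&gt;TRUE&lt;/code&gt; to 1 when summing,
so the total goes up by one:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# TRUE is treated as one more value to sum, not as na.rm
sum(1:5, TRUE)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 16
&lt;/code&gt;&lt;/pre&gt;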

&lt;p&gt;A function will sometimes have &lt;code&gt;...&lt;/code&gt; at the end of its list of
arguments when it passes those extra values on to other functions, which
may have their own optional arguments and default values.&lt;/p&gt;

&lt;p&gt;For instance the &lt;code&gt;predict()&lt;/code&gt; function allows us to take a model we’ve
built and apply it to some new data.&lt;/p&gt;

&lt;p&gt;It works for many different types of model and these different models
expect different types of inputs. Some models expect data.frames, others
expect time series data, etc.&lt;/p&gt;

&lt;p&gt;There are lots of potential variations; the only thing that is mandatory
is the model object.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;predict&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## function (object, ...) 
## UseMethod(&amp;quot;predict&amp;quot;)
## &amp;lt;bytecode: 0x103e21638&amp;gt;
## &amp;lt;environment: namespace:stats&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;predict()&lt;/code&gt; function then determines what type of model object
you’ve provided it and passes the model, and any other values you
provided, to the relevant function, returning back the results.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;linearMod&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;lm(Sepal.Length&lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;, data&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;iris)

predict(linearMod, iris[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,])&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;##        1 
## 5.004788
&lt;/code&gt;&lt;/pre&gt;
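&lt;p&gt;You can imitate this hand-off with a small function of your own. The
sketch below is an illustration only: &lt;code&gt;describe()&lt;/code&gt; and its two methods are invented
for this example and are not part of R. It follows the same pattern as
&lt;code&gt;predict()&lt;/code&gt;, where &lt;code&gt;UseMethod()&lt;/code&gt; picks a function based on the class of the object:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# describe() is a hypothetical function, defined in the same way as predict()
describe &amp;lt;- function(object, ...) {
  UseMethod(&amp;#34;describe&amp;#34;)
}

# Used when the object has the class &amp;#34;lm&amp;#34;
describe.lm &amp;lt;- function(object, ...) {
  &amp;#34;a linear model&amp;#34;
}

# Used for any class without its own describe method
describe.default &amp;lt;- function(object, ...) {
  &amp;#34;something else&amp;#34;
}

linearMod &amp;lt;- lm(Sepal.Length ~ ., data = iris)

class(linearMod)     # &amp;#34;lm&amp;#34;, which is what UseMethod() looks at
describe(linearMod)  # handled by describe.lm()
describe(letters)    # handled by describe.default()&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Running &lt;code&gt;methods(predict)&lt;/code&gt; lists the model-specific versions of
&lt;code&gt;predict()&lt;/code&gt; that are available in your session.&lt;/p&gt;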

&lt;p&gt;And so, very quickly before I summarise all that for you: that’s the
very basics of R covered, but look out soon for a couple of posts on
making R work for you, R projects (a very good habit) and a GitHub 101!&lt;/p&gt;

&lt;h2 id=&#34;summary&#34;&gt;Summary&lt;/h2&gt;

&lt;p&gt;R uses functions as the means of performing operations.&lt;/p&gt;

&lt;p&gt;Functions can take 0 or more arguments. Some arguments are mandatory,
others are optional because they have default values, and an ellipsis
(&lt;code&gt;...&lt;/code&gt;) lets a function accept any number of extra inputs.&lt;/p&gt;

&lt;p&gt;You can use argument names to provide arguments in a different order
from that defined by the function author, or to set arguments that come
after an ellipsis (&lt;code&gt;...&lt;/code&gt;) in a function’s argument list.&lt;/p&gt;

&lt;p&gt;R packages bundle functionality and/or data.&lt;/p&gt;

&lt;p&gt;You can install packages from the central public repository (CRAN) via
&lt;code&gt;install.packages()&lt;/code&gt; or install them from GitHub with the package
&lt;code&gt;devtools&lt;/code&gt;. R packages contain documentation that helps you understand
how functions work and how the package overall works.&lt;/p&gt;

&lt;p&gt;When you want to make use of functionality from a package you can either
load all of a package’s functionality by using the &lt;code&gt;library()&lt;/code&gt; function
or refer to a specific function by prefixing the function with the
package name and two colons (&lt;code&gt;::&lt;/code&gt;) e.g. &lt;code&gt;utils::help(&amp;quot;mean&amp;quot;)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;There are many packages out there for different activities and
domain-specific types of analysis. Use online search facilities like
&lt;a href=&#34;http://rdrr.io&#34;&gt;rdrr.io&lt;/a&gt; or &lt;a href=&#34;http://cran.r-project.org/web/views/&#34;&gt;CRAN task
views&lt;/a&gt; to find ones specific to
your requirements.&lt;/p&gt;

&lt;p&gt;As usual, all the code from this instalment’s video is included below in
one fell swoop.&lt;/p&gt;

&lt;h2 id=&#34;video-code&#34;&gt;Video code&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;install.packages(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tidyverse&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tidyverse&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;help(package&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;dplyr&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;and then&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;?&lt;/span&gt;bind_rows&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;one &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; mtcars[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;4&lt;/span&gt;, ]
two &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; mtcars[&lt;span style=&#34;color:#ae81ff&#34;&gt;11&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;14&lt;/span&gt;, ]

&lt;span style=&#34;color:#75715e&#34;&gt;# You can supply data frames as arguments:&lt;/span&gt;
bind_rows(one, two) &lt;span style=&#34;color:#f92672&#34;&gt;-&amp;gt;&lt;/span&gt; THREE&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;Sys.Date&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;length&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;length&lt;/span&gt;(THREE)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;head&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;head&lt;/span&gt;(THREE, &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div 
class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm(n&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;rnorm(n&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;, mean &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;, sd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;7&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;#Have a read of the blog post here to find out why we sometimes use the argument names inside the brackets!&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;sum&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;sum&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span 
style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;9&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;7&lt;/span&gt;, &lt;span style=&#34;color:#66d9ef&#34;&gt;pi&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;predict&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;linearMod&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;lm(Sepal.Length&lt;span style=&#34;color:#f92672&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;, data&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;iris)
&lt;span style=&#34;color:#75715e&#34;&gt;# logisticMod&amp;lt;-glm(Species~., data=iris, family=binomial)&lt;/span&gt;

predict(linearMod, iris[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,])&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Happy coding :) Ellen!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Cosmos DB for Data Science</title>
      <link>https://itsalocke.com/blog/cosmos-db-for-data-science/</link>
      <pubDate>Fri, 07 Sep 2018 10:54:45 +0100</pubDate>
      
      <guid>https://itsalocke.com/blog/cosmos-db-for-data-science/</guid>
      <description>
        
        

&lt;p&gt;Cosmos DB is a snazzy new(ish) Microsoft Azure product. I was able to go to Microsoft&amp;rsquo;s office in London for three days of training on the database service, which was really well structured and well run, with a lot of knowledgeable Microsoft bods around to pass on what they know. This post will pull out some key features and benefits of the service, and then discuss how it fits into a data scientist&amp;rsquo;s role.&lt;/p&gt;

&lt;h1 id=&#34;what-is-cosmosdb&#34;&gt;What is CosmosDB?&lt;/h1&gt;

&lt;p&gt;From a pure and simple database perspective, Cosmos DB is a &amp;lsquo;Document Database&amp;rsquo;. This means it is a &lt;em&gt;non-relational&lt;/em&gt; store of &lt;em&gt;JSON documents&lt;/em&gt;. MongoDB and CouchDB are examples of other Document Databases, but CosmosDB opens a whole new universe of potential.&lt;/p&gt;

&lt;p&gt;Two quotes from Microsoft demonstrate this pretty effectively.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;A globally distributed, massively scalable, multi-model database service&amp;rdquo;&lt;/p&gt;

&lt;p&gt;&amp;ldquo;A fully-managed globally distributed database service built to guarantee extremely low latency and massive scale for modern apps&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Microsoft seems to me to be targeting a few specific markets with this product. Many of the case studies revolved around IoT, specifically the increasing importance of data from car telemetry systems, web and mobile apps, gaming and retail.&lt;/p&gt;

&lt;p&gt;These applications are highly suited to the technology for a number of reasons:&lt;/p&gt;

&lt;h2 id=&#34;semi-structured-data&#34;&gt;Semi-structured data&lt;/h2&gt;

&lt;p&gt;Imagine you have a website where you sell phones, which consumers may want to compare in relevant ways. For example, your phone data may look like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;{
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;phone1&amp;#34;&lt;/span&gt;:{
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;brand&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Nokia&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;model&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;3310&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;year&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2000&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;cpu&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;MAD2WD1&amp;#34;&lt;/span&gt;
    },
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;phone2&amp;#34;&lt;/span&gt;:{
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;brand&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Nokia&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;model&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;3310&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;year&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2017&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;soc&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;MT6260&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;camera&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2mp&amp;#34;&lt;/span&gt;
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Two Nokia 3310s, separated by 17 years. Some properties are identical (&lt;code&gt;brand&lt;/code&gt; and &lt;code&gt;model&lt;/code&gt;), one has a different value (&lt;code&gt;year&lt;/code&gt;), and some are present in one document but not the other (&lt;code&gt;cpu&lt;/code&gt; and &lt;code&gt;soc&lt;/code&gt;). The two documents also have different lengths, because &lt;code&gt;phone2&lt;/code&gt; has a &lt;code&gt;camera&lt;/code&gt; resolution value.&lt;/p&gt;

&lt;p&gt;Imagine a representation of this in a relational database. If each phone were a record, each would have a few &lt;code&gt;NULL&lt;/code&gt; values. Further, if you were making a schema for this database in 2000, could you have predicted the ubiquity of phone cameras? I doubt it, which would have meant a potentially awkward schema update later.&lt;/p&gt;
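
&lt;p&gt;As a rough illustration in R, here is what happens when the two phone documents are forced into one table. This is only a sketch of the idea, it doesn&amp;rsquo;t touch Cosmos DB at all, and the variable names are made up for the example. Every field that a phone lacks has to be padded with &lt;code&gt;NA&lt;/code&gt;, R&amp;rsquo;s counterpart to a database &lt;code&gt;NULL&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# The two documents from the JSON above, as nested lists
phones &amp;lt;- list(
  phone1 = list(brand = &amp;#34;Nokia&amp;#34;, model = &amp;#34;3310&amp;#34;, year = &amp;#34;2000&amp;#34;,
                cpu = &amp;#34;MAD2WD1&amp;#34;),
  phone2 = list(brand = &amp;#34;Nokia&amp;#34;, model = &amp;#34;3310&amp;#34;, year = &amp;#34;2017&amp;#34;,
                soc = &amp;#34;MT6260&amp;#34;, camera = &amp;#34;2mp&amp;#34;)
)

# A table needs one fixed set of columns: every field seen in any document
fields &amp;lt;- unique(unlist(lapply(phones, names)))

rows &amp;lt;- lapply(phones, function(p) {
  vals &amp;lt;- p[fields]      # fields this phone lacks come back as NULL
  names(vals) &amp;lt;- fields
  vals[vapply(vals, is.null, logical(1))] &amp;lt;- NA_character_  # pad the gaps
  as.data.frame(vals, stringsAsFactors = FALSE)
})

phone_table &amp;lt;- do.call(rbind, rows)

dim(phone_table)         # 2 rows, 6 columns
sum(is.na(phone_table))  # 3 padded cells&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Two phones already need six columns and three padded cells, and every new kind of feature would add another column to every row.&lt;/p&gt;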

&lt;p&gt;The retail market hits this all the time as products are developed, but so does the car industry, as new models are released with improved features: hybrid power &amp;gt; electric power &amp;gt; self-driving &amp;gt; ??hover cars??&lt;/p&gt;

&lt;p&gt;&lt;div style=&#34;width:100%;height:0;padding-bottom:57%;position:relative;&#34;&gt;&lt;iframe src=&#34;https://giphy.com/embed/WT40jXYyhIcww&#34; width=&#34;100%&#34; height=&#34;100%&#34; style=&#34;position:absolute&#34; frameBorder=&#34;0&#34; class=&#34;giphy-embed&#34; allowFullScreen&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;p&gt;It&amp;rsquo;s 2018. I was robbed. &lt;a href=&#34;https://giphy.com/gifs/back-to-the-future-WT40jXYyhIcww&#34;&gt;via GIPHY&lt;/a&gt;&lt;/p&gt;&lt;/p&gt;

&lt;p&gt;CosmosDB&amp;rsquo;s solution is to make the product &amp;lsquo;schemaless&amp;rsquo;. Writes won&amp;rsquo;t fail because the schema isn&amp;rsquo;t met: there isn&amp;rsquo;t one to validate against. Data is automatically parsed and indexed. The default is for all data to be indexed.&lt;/p&gt;

&lt;h2 id=&#34;write-optimised&#34;&gt;Write optimised&lt;/h2&gt;

&lt;p&gt;Internet of Things telemetry data is a growing source of data, and &lt;a href=&#34;https://en.wikipedia.org/wiki/Telemetry#Transportation&#34;&gt;automobile telemetry&lt;/a&gt; is of particularly high value, and high scale. CosmosDB is a &lt;a href=&#34;https://www.ascent.tech/wp-content/uploads/documents/microsoft/cosmos-db/cosmos-db.pdf&#34;&gt;&amp;lsquo;write-optimised&amp;rsquo;&lt;/a&gt; data store. It has been engineered to write high volumes, fast, every time.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://docs.microsoft.com/en-us/azure/cosmos-db/faq#what-happens-with-respect-to-various-config-settings-for-keyspace-creation-like-simplenetwork&#34;&gt;All writes are always durably quorum committed in any region where you write while providing performance guarantees.&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Some other document databases are similarly noted for their &amp;lsquo;fast&amp;rsquo; writes; however, these tend to rely on &lt;em&gt;in-memory caching&lt;/em&gt;. When someone starts driving their car, many thousands of documents (records) hit your database &lt;em&gt;really fast&lt;/em&gt;. In-memory caching solutions in other databases rely on that stream of data eventually ending: until it does, the data is held in memory before being written to disk. If the car is driven for five minutes, the cache may well hold all the data, which can then be written to disk later. If it&amp;rsquo;s a three-hour, cross-country trip, though, it&amp;rsquo;s either going to be really expensive to hold that data, or some data will be lost to buffer overflow: the data literally can&amp;rsquo;t fit into memory. CosmosDB doesn&amp;rsquo;t rely on this (risky) business, but still maintains supremely fast performance, which is backed by some &lt;a href=&#34;https://azure.microsoft.com/en-us/support/legal/sla/cosmos-db/v1_1/&#34;&gt;pretty big SLAs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;div style=&#34;width:100%;height:0;padding-bottom:53%;position:relative;&#34;&gt;&lt;iframe src=&#34;https://giphy.com/embed/6QOQeB0enXhXq&#34; width=&#34;100%&#34; height=&#34;100%&#34; style=&#34;position:absolute&#34; frameBorder=&#34;0&#34; class=&#34;giphy-embed&#34; allowFullScreen&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;p&gt;&amp;ldquo;Oh my God, it&amp;rsquo;s full of documents!&amp;rdquo; &lt;a href=&#34;https://giphy.com/gifs/movie-sci-fi-2001-6QOQeB0enXhXq&#34;&gt;via GIPHY&lt;/a&gt;&lt;/p&gt;&lt;/p&gt;

&lt;h2 id=&#34;turn-key-distribution&#34;&gt;Turn-key distribution&lt;/h2&gt;

&lt;p&gt;A distributed database is one that is run across multiple machines, nearly always via a cloud provider. This enables data to be closer to its users. With a global audience this becomes really important at scale, because the distance the data needs to travel from device to database is one of the factors in determining write speed. Imagine your video game has just launched new DLC, and you are expecting very bursty traffic for a few days as all your users dive into the new maps. You can initialise some more clones of your data sets to decrease latency and serve the increase in traffic volume.&lt;/p&gt;

&lt;p&gt;If that sounds a little complicated, the &amp;lsquo;turn-key&amp;rsquo; nature should help calm your mind.&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/cosmosdb-geo-distribution.gif&#34; alt=&#34;Turn-key geo-distribution in cosmosDB is really this easy&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;p&gt;
        Turn-key geo-distribution in cosmosDB is really this easy
        
            
        
        &lt;/p&gt; 
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Data storage compliance is another consideration. Some data (such as UK medical data) might not be inherently suited to this architecture. However, CosmosDB is a &amp;lsquo;Foundational (Ring 0)&amp;rsquo; Azure service, so it is available in all Azure regions, including their &amp;lsquo;Sovereign&amp;rsquo; and &amp;lsquo;&lt;a href=&#34;https://azure.microsoft.com/en-gb/global-infrastructure/government/&#34;&gt;Government&lt;/a&gt;&amp;rsquo; regions. This allows compliance control over where, and in what way, the data is stored.&lt;/p&gt;

&lt;p&gt;&lt;div style=&#34;width:100%;height:0;padding-bottom:48%;position:relative;&#34;&gt;&lt;iframe src=&#34;https://giphy.com/embed/ZyGTx7DbVmHDy&#34; width=&#34;100%&#34; height=&#34;100%&#34; style=&#34;position:absolute&#34; frameBorder=&#34;0&#34; class=&#34;giphy-embed&#34; allowFullScreen&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;p&gt;A database army. And I must say, one of the finest we&amp;rsquo;ve ever created &lt;a href=&#34;https://giphy.com/gifs/clone-ZyGTx7DbVmHDy&#34;&gt;via GIPHY&lt;/a&gt;&lt;/p&gt;&lt;/p&gt;

&lt;h2 id=&#34;multi-model&#34;&gt;Multi-model&lt;/h2&gt;

&lt;p&gt;So how do you communicate with this weird alien database? In the creation step for a CosmosDB account you can select an API (and resultant data model) from the following list:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL&lt;/li&gt;
&lt;li&gt;MongoDB&lt;/li&gt;
&lt;li&gt;Cassandra&lt;/li&gt;
&lt;li&gt;Azure Table&lt;/li&gt;
&lt;li&gt;Gremlin (graph)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CosmosDB&amp;rsquo;s data model is then set relative to this selection, however &lt;a href=&#34;https://vincentlauzon.com/2017/09/10/hacking-changing-cosmos-db-portal-experience-from-graph-to-sql/&#34;&gt;the query methods can be modified without impacting the underlying data structure&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The data models include &lt;code&gt;key:value&lt;/code&gt;, &lt;code&gt;column:family&lt;/code&gt;, &lt;code&gt;document&lt;/code&gt; and &lt;code&gt;graph&lt;/code&gt; via its use of an &lt;a href=&#34;https://azure.microsoft.com/en-gb/blog/a-technical-overview-of-azure-cosmos-db/&#34;&gt;atom-record-sequence type system in the database engine&lt;/a&gt;. In practical terms this means that you can read in data in a graph form from a data source, but still query it using SQL in a &amp;lsquo;relational&amp;rsquo; way. Your social network app may be suited to writing data as a directed network, but maybe your business intelligence team is highly invested in a SQL toolset. CosmosDB should help to bridge this gap.&lt;/p&gt;

&lt;p&gt;&lt;div style=&#34;width:100%;height:0;padding-bottom:75%;position:relative;&#34;&gt;&lt;iframe src=&#34;https://giphy.com/embed/5fLgDwo63DQcg&#34; width=&#34;100%&#34; height=&#34;100%&#34; style=&#34;position:absolute&#34; frameBorder=&#34;0&#34; class=&#34;giphy-embed&#34; allowFullScreen&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;p&gt;SQL and Graph at Tenegra! &lt;a href=&#34;https://giphy.com/gifs/eyes-temba-5fLgDwo63DQcg&#34;&gt;via GIPHY&lt;/a&gt;&lt;/p&gt;&lt;/p&gt;

&lt;h1 id=&#34;gotchas&#34;&gt;Gotchas&lt;/h1&gt;

&lt;h2 id=&#34;join&#34;&gt;&lt;code&gt;JOIN&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;In relational DBs &lt;code&gt;JOIN&lt;/code&gt; works in a familiar and intuitive way, but with this non-relational data structure what does a &lt;code&gt;JOIN&lt;/code&gt; mean, and what can we use it for?&lt;/p&gt;

&lt;p&gt;Firstly, JSON allows for arrays to be included in a document.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;[
  {
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;droid&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB8&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;New Republic&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;series&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;films&amp;#34;&lt;/span&gt;: [
      &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Force Awakens&amp;#34;&lt;/span&gt;,
      &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Last Jedi&amp;#34;&lt;/span&gt;
    ]
  },
  {
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;droid&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB9E&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;First Order&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;series&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB&amp;#34;&lt;/span&gt;,
    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;films&amp;#34;&lt;/span&gt;: [
      &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Last Jedi&amp;#34;&lt;/span&gt;
    ]
  }
]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This can be queried through SQL like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;SELECT&lt;/span&gt;
    droids.droid,
    droids.affiliation,
    films
&lt;span style=&#34;color:#66d9ef&#34;&gt;FROM&lt;/span&gt;
    droids
&lt;span style=&#34;color:#66d9ef&#34;&gt;JOIN&lt;/span&gt;
    films &lt;span style=&#34;color:#66d9ef&#34;&gt;in&lt;/span&gt; droids.films&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Which will return an output like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;[
    {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;droid&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB8&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;New Republic&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;films&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Force Awakens&amp;#34;&lt;/span&gt;
    },
    {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;droid&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB8&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;New Republic&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;films&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Last Jedi&amp;#34;&lt;/span&gt;
    },
    {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;droid&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB9E&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;First Order&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;films&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Last Jedi&amp;#34;&lt;/span&gt;
    }
]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The query has paired each document with the items in its own &lt;code&gt;films&lt;/code&gt; array, and returned the pairs as a &lt;em&gt;single&lt;/em&gt; &lt;em&gt;flattened&lt;/em&gt; result set. Here there is no &lt;code&gt;INNER&lt;/code&gt;, &lt;code&gt;LEFT&lt;/code&gt;, or &lt;code&gt;OUTER&lt;/code&gt; to do subsetting. This is achieved with the &lt;code&gt;WHERE&lt;/code&gt; clause.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;SELECT&lt;/span&gt;
    droids.droid,
    droids.affiliation,
    films
&lt;span style=&#34;color:#66d9ef&#34;&gt;FROM&lt;/span&gt;
    droids
&lt;span style=&#34;color:#66d9ef&#34;&gt;JOIN&lt;/span&gt;
    films &lt;span style=&#34;color:#66d9ef&#34;&gt;in&lt;/span&gt; droids.films
&lt;span style=&#34;color:#66d9ef&#34;&gt;WHERE&lt;/span&gt;
    films &lt;span style=&#34;color:#66d9ef&#34;&gt;IN&lt;/span&gt; (&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Last Jedi&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;[
    {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;droid&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB8&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;New Republic&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;films&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Last Jedi&amp;#34;&lt;/span&gt;
    },
    {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;droid&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;BB9E&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;First Order&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;films&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;The Last Jedi&amp;#34;&lt;/span&gt;
    }
]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;JOIN&lt;/code&gt; does just that, and only that. Filtering is done in the &lt;code&gt;WHERE&lt;/code&gt; clause.&lt;/p&gt;
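
&lt;p&gt;If it helps to see the same idea outside the database, here is a minimal R sketch that mimics the flattening locally. It works on a hypothetical in-memory copy of the two droid documents and is only an analogy for the behaviour described above, not how CosmosDB executes the query.&lt;/p&gt;

```r
# Hypothetical in-memory copy of the two droid documents
droids = list(
  list(droid = "BB8", affiliation = "New Republic",
       films = c("The Force Awakens", "The Last Jedi")),
  list(droid = "BB9E", affiliation = "First Order",
       films = "The Last Jedi")
)

# One row per droid-film pair, like JOIN films IN droids.films
flattened = do.call(rbind, lapply(droids, function(d) {
  data.frame(droid = d$droid, affiliation = d$affiliation,
             films = d$films, stringsAsFactors = FALSE)
}))

# Subsetting afterwards plays the role of the WHERE clause
last_jedi = flattened[flattened$films == "The Last Jedi", ]
```

&lt;p&gt;Here &lt;code&gt;flattened&lt;/code&gt; has three rows and &lt;code&gt;last_jedi&lt;/code&gt; has two, matching the two result sets above.&lt;/p&gt;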

&lt;h2 id=&#34;queries-and-indexing&#34;&gt;Queries and Indexing&lt;/h2&gt;

&lt;p&gt;CosmosDB automatically indexes all data as it comes in by default. For high-volume data this might be problematic, leading to longer write times, so how can this be mitigated? The index can be applied selectively through an &amp;lsquo;index policy&amp;rsquo;. The trade-off, however, is that any value excluded from the index will no longer be queryable.&lt;/p&gt;

&lt;p&gt;For instance if your data has information like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;[
    {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Jedi&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;character_name&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Luke Skywalker&amp;#34;&lt;/span&gt;
    },
    {
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;affiliation&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Sith&amp;#34;&lt;/span&gt;,
        &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;character_name&amp;#34;&lt;/span&gt;:&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Snoke&amp;#34;&lt;/span&gt;
    }
]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And &lt;code&gt;character_name&lt;/code&gt; was excluded by your indexing policy, then&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;SELECT&lt;/span&gt;
    force_users.affiliation,
    force_users.character_name
&lt;span style=&#34;color:#66d9ef&#34;&gt;FROM&lt;/span&gt;
    force_users
&lt;span style=&#34;color:#66d9ef&#34;&gt;WHERE&lt;/span&gt;
    force_users.character_name &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Luke Skywalker&amp;#34;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;would error. It wouldn&amp;rsquo;t know where Luke was!&lt;/p&gt;

&lt;h1 id=&#34;how-should-a-data-scientist-use-cosmosdb&#34;&gt;How should a data scientist use CosmosDB?&lt;/h1&gt;

&lt;h2 id=&#34;connection&#34;&gt;Connection&lt;/h2&gt;

&lt;p&gt;To make a simple connection to the database from an R session you can use the &lt;a href=&#34;https://docs.microsoft.com/en-us/azure/cosmos-db/odbc-driver&#34;&gt;ODBC driver supplied by Microsoft&lt;/a&gt;. Once this has been set up you can use the &lt;a href=&#34;https://db.rstudio.com/odbc/&#34;&gt;&lt;code&gt;odbc&lt;/code&gt; and &lt;code&gt;DBI&lt;/code&gt; packages&lt;/a&gt; to run queries against the connection in R, which is even easier using &lt;a href=&#34;https://db.rstudio.com/rstudio/connections/&#34;&gt;RStudio&amp;rsquo;s tools&lt;/a&gt;. However, the connection may be only part of the challenge if you are interested in using a large-scale data source.&lt;/p&gt;
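
&lt;p&gt;A minimal sketch of what that might look like, assuming the ODBC driver is installed and a data source has already been configured. The DSN name &lt;code&gt;CosmosDB&lt;/code&gt; and the wrapper &lt;code&gt;query_cosmos()&lt;/code&gt; are hypothetical names chosen for this example.&lt;/p&gt;

```r
# Sketch only: "CosmosDB" is a hypothetical DSN configured for the ODBC driver
query_cosmos = function(sql, dsn = "CosmosDB") {
  con = DBI::dbConnect(odbc::odbc(), dsn = dsn)
  on.exit(DBI::dbDisconnect(con))  # always close the connection afterwards
  DBI::dbGetQuery(con, sql)        # returns the results as a data frame
}

# Example call (needs a working DSN, so it is left commented out):
# query_cosmos("SELECT droids.droid, droids.affiliation FROM droids")
```

&lt;p&gt;Wrapping the connection in a function keeps the clean-up in one place, which matters more once queries are run repeatedly.&lt;/p&gt;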

&lt;h2 id=&#34;computing&#34;&gt;Computing&lt;/h2&gt;

&lt;p&gt;With such effortless scaling of data collection and storage, data scientists (and Business Analysts, Management Information Specialists, etc.) might see this as a simple solution to scale issues. However, remember that this is a &amp;lsquo;&lt;em&gt;write&lt;/em&gt; optimised, data &lt;em&gt;storage&lt;/em&gt; layer&amp;rsquo;. For analytics workstreams, what we should frequently be looking for is a data &lt;em&gt;computation&lt;/em&gt; layer. Our workflows can be very &lt;em&gt;memory&lt;/em&gt; intensive, though luckily there are solutions.&lt;/p&gt;

&lt;h3 id=&#34;hdinsights&#34;&gt;HDInsight&lt;/h3&gt;

&lt;p&gt;Microsoft have a number of implementations of Apache Spark tied into their Azure platform. &lt;a href=&#34;https://azure.microsoft.com/en-gb/services/hdinsight/&#34;&gt;HDInsight&lt;/a&gt; is a fully managed analytics service, built on Apache Spark (and others), and &lt;a href=&#34;https://azure.microsoft.com/en-gb/services/hdinsight/r-server/&#34;&gt;is easy to plumb into an R interface&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&#34;databricks&#34;&gt;Databricks&lt;/h3&gt;

&lt;p&gt;During the training, many examples of scaled-out analysis workflows in practice used the &lt;a href=&#34;https://docs.databricks.com/spark/latest/data-sources/azure/cosmosdb-connector.html&#34;&gt;Databricks&lt;/a&gt; service, a proprietary fork of Apache Spark. This can then be used in an &lt;a href=&#34;https://docs.azuredatabricks.net/spark/latest/sparkr/index.html&#34;&gt;R session via &lt;code&gt;SparkR&lt;/code&gt; and &lt;code&gt;sparklyr&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&#34;normal-apache-spark&#34;&gt;Normal Apache Spark&lt;/h3&gt;

&lt;p&gt;Of course, as Spark is so widely supported, there is nothing to stop you setting up your own standard Spark cluster and connecting to it in the traditional way.&lt;/p&gt;

&lt;h1 id=&#34;cosmic-power&#34;&gt;Cosmic Power&lt;/h1&gt;

&lt;p&gt;CosmosDB is an exciting piece of technology. It is distributed, scalable, schema-less, multi-model and multi-lingual, and for many applications I see this as a real Cosmic Power. It&amp;rsquo;s not for everything, but where it works it&amp;rsquo;s not only a powerful tool, but one that seems more straightforward to implement than many of its competitors, and one that&amp;rsquo;s also got Microsoft&amp;rsquo;s focus on enterprise application baked into the foundations. Enjoy your exploration of the stars.&lt;/p&gt;

&lt;p&gt;&lt;div style=&#34;width:100%;height:0;padding-bottom:68%;position:relative;&#34;&gt;&lt;iframe src=&#34;https://giphy.com/embed/pVql7tC6cSXEk&#34; width=&#34;100%&#34; height=&#34;100%&#34; style=&#34;position:absolute&#34; frameBorder=&#34;0&#34; class=&#34;giphy-embed&#34; allowFullScreen&gt;&lt;/iframe&gt;&lt;/div&gt;&lt;p&gt;&lt;a href=&#34;https://giphy.com/gifs/cosmos-carl-sagan-pVql7tC6cSXEk&#34;&gt;via GIPHY&lt;/a&gt;&lt;/p&gt;&lt;/p&gt;

&lt;p&gt;If this article has sparked a curiosity in this astronomical technology, make sure to have a look at &lt;a href=&#34;../../training/onlinetraining/&#34;&gt;our upcoming online training course&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>R Objects</title>
      <link>https://itsalocke.com/blog/r-objects/</link>
      <pubDate>Fri, 24 Aug 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/r-objects/</guid>
      <description>
        
        

&lt;h1 id=&#34;r-objects&#34;&gt;R objects&lt;/h1&gt;

&lt;p&gt;To quickly recap, so far we&amp;rsquo;ve just worked with some single values to get to grips with how some of the various operations work. Of course, we rarely work with a single value! If we did, we could just use a calculator.&lt;/p&gt;

&lt;p&gt;In this instalment you&amp;rsquo;ll get to grips with some different ways of storing data and how to manipulate your datasets in the &amp;ldquo;traditional&amp;rdquo; way. This will help you understand a lot of code written in the past, and will equip you to understand data manipulation of tabular data.&lt;/p&gt;

&lt;p&gt;Get a cuppa and settle in to this session&amp;rsquo;s video, then have a play with the code yourself and read along with the blog! As always, get in touch with any questions on Twitter using @LockeData!&lt;/p&gt;


&lt;div style=&#34;position: relative; padding-bottom: 56.25%; padding-top: 30px; height: 0; overflow: hidden;&#34;&gt;
  &lt;iframe src=&#34;//www.youtube.com/embed/vqSJYXoiJZ4s&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%;&#34; allowfullscreen frameborder=&#34;0&#34; title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
 &lt;/div&gt;


&lt;h2 id=&#34;storing-values&#34;&gt;Storing values&lt;/h2&gt;

&lt;p&gt;When we were performing operations, we got some values output to the console. One of the key principles in writing code is Don&amp;rsquo;t Repeat Yourself (DRY) so we need to know how we can avoid repeating ourselves in R. One of the ways you can do that is to store a value for use later.&lt;/p&gt;

&lt;p&gt;In R, we can store values by &lt;strong&gt;assigning&lt;/strong&gt; them a name. This makes a &lt;strong&gt;variable&lt;/strong&gt; or &lt;strong&gt;object&lt;/strong&gt;. We can do this with a few different operators, but the traditional operator is a &lt;code&gt;&amp;lt;-&lt;/code&gt;[1]. The format for assigning a value is &lt;code&gt;nameofthing &amp;lt;- value&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;my_variable &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;
my_variable&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 16
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Valid names for a variable include upper-case letters, lower-case letters, numbers anywhere but the beginning, periods (&lt;code&gt;.&lt;/code&gt;), and underscores (&lt;code&gt;_&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;There are a number of different competing conventions for how you name variables. The most common conventions are shown below. I have no strong feelings for any system and only ask that you pick one and stick with it within a single script. Whatever you do, don&amp;rsquo;t forget names are case sensitive!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;myfirstvariable &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;
myFirstVariable &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;
MyFirstVariable &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;
my_first_variable &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;
my.first.variable &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can create names breaking the rules governing valid names by placing the rule breaking name between two back-ticks (`). I don&amp;rsquo;t recommend you do this with variables you&amp;rsquo;ll create, but you&amp;rsquo;ll often end up with names that break conventions when importing data, especially when you import from spreadsheets.&lt;/p&gt;
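
&lt;p&gt;A small sketch of how back-ticks behave in practice. The names &lt;code&gt;my variable&lt;/code&gt;, &lt;code&gt;sales&lt;/code&gt; and &lt;code&gt;2018 total&lt;/code&gt; are made up for the example, with the data frame standing in for something imported from a spreadsheet.&lt;/p&gt;

```r
# A name containing a space is not valid on its own, so wrap it in back-ticks
`my variable` = 5 + 3
`my variable` * 2

# Imported data often has names like this; back-ticks let you reach the column.
# check.names = FALSE stops R from tidying the name into a valid one.
sales = data.frame(`2018 total` = c(10, 20), check.names = FALSE)
names(sales)
sum(sales$`2018 total`)
```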

&lt;h2 id=&#34;vectors&#34;&gt;Vectors&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;vector&lt;/strong&gt; is a collection of values that hold the same datatype. It is &lt;strong&gt;one-dimensional&lt;/strong&gt; in that none of the &lt;strong&gt;elements&lt;/strong&gt; in the collection correspond to other values like they might in a table of values.&lt;/p&gt;

&lt;p&gt;A single value is actually a vector of &lt;strong&gt;length&lt;/strong&gt; 1.&lt;/p&gt;

&lt;p&gt;When I introduced the colon (&lt;code&gt;:&lt;/code&gt;) as a means of generating a sequence, we were in fact generating a vector where each element was a number in the sequence. The vector has a length which is as long as the number of values generated by the sequence.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] -1  0  1
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Another way of producing a vector is to use the combine function (&lt;code&gt;c()&lt;/code&gt;). This is great for combining a number of disparate character strings into a vector.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;red&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;yellow&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;blue&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;red&amp;quot;    &amp;quot;yellow&amp;quot; &amp;quot;blue&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A single value is still a vector. What we see when we use the &lt;code&gt;c()&lt;/code&gt; function is that we&amp;rsquo;re combining vectors. As a result we can also use it on longer vectors.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;8&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 1 2 3 2 1 5 6 7 8
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;When we combine values into a single vector, R will change everything to the same datatype using some conversions.&lt;/p&gt;
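
&lt;p&gt;A quick sketch of those conversions in action, using made-up values. Roughly speaking, R picks the most flexible datatype present: logical values give way to integers, integers to numerics, and everything gives way to character strings.&lt;/p&gt;

```r
mixed_text = c(1, "a", TRUE)  # numbers and logicals become character strings
mixed_num  = c(1L, 2.5)       # integers are promoted to numeric
mixed_lgl  = c(TRUE, 2L)      # TRUE becomes the integer 1

class(mixed_text)  # "character"
class(mixed_num)   # "numeric"
class(mixed_lgl)   # "integer"
```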

&lt;h2 id=&#34;getting-information-about-vectors&#34;&gt;Getting information about vectors&lt;/h2&gt;

&lt;p&gt;Our &lt;code&gt;class()&lt;/code&gt; function will still work with a vector with a length greater than 1 to get you its datatype.&lt;/p&gt;

&lt;p&gt;Let&amp;rsquo;s look at a sequence of numbers and one of the built-in vectors that contains the alphabet.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;class&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;integer&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;LETTERS&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;##  [1] &amp;quot;A&amp;quot; &amp;quot;B&amp;quot; &amp;quot;C&amp;quot; &amp;quot;D&amp;quot; &amp;quot;E&amp;quot; &amp;quot;F&amp;quot; &amp;quot;G&amp;quot; &amp;quot;H&amp;quot; &amp;quot;I&amp;quot; &amp;quot;J&amp;quot; &amp;quot;K&amp;quot; &amp;quot;L&amp;quot; &amp;quot;M&amp;quot; &amp;quot;N&amp;quot; &amp;quot;O&amp;quot; &amp;quot;P&amp;quot; &amp;quot;Q&amp;quot;
## [18] &amp;quot;R&amp;quot; &amp;quot;S&amp;quot; &amp;quot;T&amp;quot; &amp;quot;U&amp;quot; &amp;quot;V&amp;quot; &amp;quot;W&amp;quot; &amp;quot;X&amp;quot; &amp;quot;Y&amp;quot; &amp;quot;Z&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;class&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;LETTERS&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;character&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;We can use the &lt;code&gt;length()&lt;/code&gt; function to find out the number of elements in a vector.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;length&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;pi&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 1
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;length&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;LETTERS&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 26
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;To extract the names of values in a vector, we can use the &lt;code&gt;names()&lt;/code&gt; function.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;steph&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(Steph&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;forename&amp;#34;&lt;/span&gt;, Locke&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;surname&amp;#34;&lt;/span&gt;)
&lt;span style=&#34;color:#66d9ef&#34;&gt;names&lt;/span&gt;(steph)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Steph&amp;quot; &amp;quot;Locke&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&#34;calculations-on-multiple-vectors&#34;&gt;Calculations on multiple vectors&lt;/h2&gt;

&lt;p&gt;When we perform calculations on two vectors, R will try to perform the operation for each set of elements. This is an &lt;strong&gt;element-wise&lt;/strong&gt; or &lt;strong&gt;pair-wise&lt;/strong&gt; calculation methodology.&lt;/p&gt;

&lt;p&gt;In SQL, it&amp;rsquo;s equivalent to where you might write &lt;code&gt;colA*colB&lt;/code&gt; and you&amp;rsquo;ll get the answer calculated for every row in the table. In Excel, it&amp;rsquo;s equivalent to a Fill Down of multiplying two values on the same row.&lt;/p&gt;

&lt;p&gt;Let&amp;rsquo;s look at how this works in practice in R.&lt;/p&gt;

&lt;p&gt;We have two vectors, each containing two elements.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;vecA &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
vecB &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 1 2

## [1] 2 3
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If we want to multiply the two vectors by each other, R will match each element in the first vector with its counterpart in the second and multiply the two values together to make a new element.&lt;/p&gt;
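&lt;p&gt;As a minimal sketch of that pairing, using the same two vectors as above, the multiplication gives one result per pair of elements:&lt;/p&gt;

```r
vecA <- 1:2
vecB <- 2:3

# 1 * 2 and 2 * 3, matched by position
vecA * vecB
## [1] 2 6
```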

&lt;p&gt;R can also stretch a shorter vector to the same length as a longer one by repeating its values, which is known as &lt;strong&gt;recycling&lt;/strong&gt;, so calculations work for mismatched vector sizes too. For recycling to work cleanly, the longer vector&amp;rsquo;s length must be an exact multiple of the shorter one&amp;rsquo;s; otherwise R raises a warning.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two vectors of the same length divide by the other&amp;rsquo;s length exactly one time and won&amp;rsquo;t need to recycle&lt;/li&gt;
&lt;li&gt;A vector of length one always cleanly divides any other vector&amp;rsquo;s length and so will be recycled&lt;/li&gt;
&lt;li&gt;A vector of length 2 will divide any vector with an even length and so will be recycled in those cases, but it cannot recycle cleanly for odd length vectors&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;
&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;4&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;##  [1]  2  4  6  8 10 12 14 16 18 20

##  [1]  2  6  6 12 10 18 14 24 18 30

## [1] &amp;quot;longer object length is not a multiple of shorter object length&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
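&lt;p&gt;To see what recycling amounts to, it can help to write the repetition out by hand. The sketch below uses base R&amp;rsquo;s &lt;code&gt;rep()&lt;/code&gt; to stretch the shorter vector; it illustrates the idea rather than describing R&amp;rsquo;s internals. It also shows that the mismatched case still returns values, with the message arriving as a warning.&lt;/p&gt;

```r
# Recycling 2:3 against 1:10 matches repeating it out to length 10
recycled <- 1:10 * 2:3
explicit <- 1:10 * rep(2:3, length.out = 10)
identical(recycled, explicit)
## [1] TRUE

# With lengths 10 and 3, R still recycles but raises a warning;
# suppressWarnings() is used here only to show the values returned
suppressWarnings(1:10 * 2:4)
##  [1]  2  6 12  8 15 24 14 24 36 20
```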

&lt;p&gt;Vector recycling is useful and dangerous &amp;ndash; it can help you make elegant code or give you unexpected results. Especially when starting out, I recommend you make your vectors either the same length or length 1.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Proceed with caution&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&#34;bitwise&#34;&gt;Bitwise&lt;/h3&gt;

&lt;p&gt;The logical operators we covered earlier work in a pairwise fashion. They&amp;rsquo;ll return a vector of the same length as the longest one used in your logical statement.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;a&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;
b&lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] FALSE  TRUE

## [1] TRUE TRUE
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Making logical statements returns vectors with a logical datatype.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;a&lt;span style=&#34;color:#f92672&#34;&gt;&amp;amp;&lt;/span&gt;b&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] FALSE  TRUE
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;a&lt;span style=&#34;color:#f92672&#34;&gt;|&lt;/span&gt;b&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE TRUE
&lt;/code&gt;&lt;/pre&gt;

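&lt;p&gt;Occasionally, you expect to only be operating on a single pair of values and want to enforce that R should only do the calculation on the first pair. Here we call this a &lt;strong&gt;bitwise&lt;/strong&gt; AND (&lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;) or OR (&lt;code&gt;||&lt;/code&gt;); R&amp;rsquo;s own help pages describe them as the longer forms of the logical operators, distinct from integer bit functions such as &lt;code&gt;bitwAnd()&lt;/code&gt;.&lt;/p&gt;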

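&lt;p&gt;A bitwise logical statement will only do the check for the first elements in the vectors and ignore all the others. This is the behaviour of older versions of R; since R 4.3.0, giving &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; or &lt;code&gt;||&lt;/code&gt; a vector longer than one is an error.&lt;/p&gt;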
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;a&lt;span style=&#34;color:#f92672&#34;&gt;&amp;amp;&amp;amp;&lt;/span&gt;b&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] FALSE
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;a&lt;span style=&#34;color:#f92672&#34;&gt;||&lt;/span&gt;b&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Use bitwise operators with extreme care!&lt;/p&gt;
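&lt;p&gt;One place the double operators are commonly used is inside &lt;code&gt;if()&lt;/code&gt;, which expects a single &lt;code&gt;TRUE&lt;/code&gt; or &lt;code&gt;FALSE&lt;/code&gt;. The sketch below, built on a made-up vector &lt;code&gt;x&lt;/code&gt;, collapses each comparison to one value with &lt;code&gt;all()&lt;/code&gt; before combining, so each side of &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; has length one. It also shows that the double operators stop evaluating as soon as the answer is known.&lt;/p&gt;

```r
x <- c(3, 7, 12)  # hypothetical values for illustration

# Pairwise: one TRUE/FALSE per element
x > 5 & x < 10
## [1] FALSE  TRUE FALSE

# Single values on each side, suitable for if()
if (is.numeric(x) && all(x > 0)) {
  message("x is numeric and all positive")
}

# The right-hand side is never evaluated because the left is FALSE
FALSE && stop("this is not reached")
## [1] FALSE
```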

&lt;p&gt;And there we have it. It&amp;rsquo;s fair to say that you&amp;rsquo;ve pretty much covered the basics of data handling in R now, so next time we&amp;rsquo;ll have a look at some packages and functions. These are the bits where somebody else has done the hard work for you!&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

&lt;p&gt;Ellen :)&lt;/p&gt;

&lt;p&gt;As always, here&amp;rsquo;s the video code to take away and play.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;my_first_var &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5+3&lt;/span&gt;

my_first_var &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;my.other.variable &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;7&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;4&lt;/span&gt;

&lt;span style=&#34;color:#ae81ff&#34;&gt;9-5&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;-&amp;gt;&lt;/span&gt; YetAnotherVariable&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;rm&lt;/span&gt;(my_first_var)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;dont.delete.me &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;please don&amp;#39;t!&amp;#34;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
      </description>
    </item>
    
    <item>
      <title>A glass shattering book draw with gganimate</title>
      <link>https://itsalocke.com/blog/a-glass-shattering-book-draw-with-gganimate/</link>
      <pubDate>Wed, 01 Aug 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/a-glass-shattering-book-draw-with-gganimate/</guid>
      <description>
        
        

&lt;p&gt;It&amp;rsquo;s time for a Twitter book draw again: every month, a random Locke Data Twitter follower wins an excellent data science book! This month&amp;rsquo;s book was &lt;a href=&#34;http://geni.us/mathdestruction&#34;&gt;Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy&lt;/a&gt;. The animation I chose to create was inspired by the idea of &lt;em&gt;destruction&lt;/em&gt; and by my wanting to try out the fantastic new API of the &lt;code&gt;gganimate&lt;/code&gt; package, and a very fast new gif encoder, &lt;code&gt;gifski&lt;/code&gt;.&lt;/p&gt;

&lt;h1 id=&#34;choosing-an-animation-concept&#34;&gt;Choosing an animation concept&lt;/h1&gt;

&lt;p&gt;I am neither a designer nor an artist, but I like imagining new animations to announce the book winner. I want them to be &lt;strong&gt;fun&lt;/strong&gt;, and &lt;strong&gt;useful&lt;/strong&gt; by illustrating the use of some nifty R tools. That&amp;rsquo;s a good way for me and you to learn new R skills! I&amp;rsquo;ve reported on &lt;a href=&#34;https://itsalocke.com/blog/a-crystal-clear-book-draw/&#34;&gt;my efforts with &lt;code&gt;magick&lt;/code&gt;, an R package for image manipulation, to create a crystal ball for chibi Steph&lt;/a&gt;, and &lt;a href=&#34;https://itsalocke.com/blog/a-particles-arly-fun-book-draw/&#34;&gt;with &lt;code&gt;particles&lt;/code&gt;, an R package for simulating, well, &lt;em&gt;particles&lt;/em&gt;, to move followers&amp;rsquo; names around&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;This month I thought about what destruction meant to me, and got the rather simple idea to have glass shatter, thus revealing the winner&amp;rsquo;s name.&lt;/p&gt;

&lt;h1 id=&#34;planning-the-my-animation-implementation&#34;&gt;Planning my animation implementation&lt;/h1&gt;

&lt;p&gt;Feel free to skip over this section and go to the code directly, unless you want to know more about my search and process to write said code!&lt;/p&gt;

&lt;p&gt;I wondered how I could shatter glass, i.e. cut a blue square into small pieces and make these pieces fall, and somehow ended up thinking that I could draw random points and then use their &lt;a href=&#34;https://en.wikipedia.org/wiki/Voronoi_diagram&#34;&gt;Voronoi tiles&lt;/a&gt; as fragments. I&amp;rsquo;m not sure glass shattering really looks like this, but I decided against experimenting with my windows.&lt;/p&gt;

&lt;p&gt;I wasn&amp;rsquo;t too sure how to make and use the fragments though. I re-read &lt;a href=&#34;https://www.data-imaginist.com/2016/data-driven-x-mas-card/&#34;&gt;Thomas Lin Pedersen&amp;rsquo;s excellent post about making a Christmas card using Voronoi tessellation&lt;/a&gt; but as nice and well-explained as it was, it wasn&amp;rsquo;t exactly what I was after. I set out to find a way to create Voronoi tiles as polygons and then to shift at least some of them downwards to create the animation.&lt;/p&gt;

&lt;p&gt;I tried using the &lt;a href=&#34;https://cran.r-project.org/web/packages/deldir/index.html&#34;&gt;&lt;code&gt;deldir&lt;/code&gt; package&lt;/a&gt; a bit to create a Voronoi tessellation out of random points, but its output didn&amp;rsquo;t make me too happy since it was &lt;em&gt;segments&lt;/em&gt; rather than polygons. Re-creating polygons from them didn&amp;rsquo;t sound too easy. I ditched this idea and googled keywords such as Voronoi, polygons and R, and then &lt;code&gt;sf&lt;/code&gt;, the geospatial package, since the first results indicated Voronoi tessellation was well supported for mapping stuff. I found this very useful &lt;a href=&#34;https://stackoverflow.com/questions/45719790/create-voronoi-polygon-with-simple-feature-in-r&#34;&gt;Stack Overflow post&lt;/a&gt; that made me &lt;a href=&#34;https://r-spatial.github.io/sf/reference/geos_unary.html&#34;&gt;adapt an example from &lt;code&gt;sf&lt;/code&gt; documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Once I had the polygons, the rest was animation as usual. Well, not really, since the new &lt;code&gt;gganimate&lt;/code&gt; API by Thomas Lin Pedersen is different, and so powerful! I haven&amp;rsquo;t watched &lt;a href=&#34;https://www.youtube.com/watch?v=21ZWDrTukEs&#34;&gt;Thomas&amp;rsquo; useR! keynote talk about The Grammar of animation&lt;/a&gt; yet, but I look forward to it since I admire both his packages and his &lt;a href=&#34;https://www.data-imaginist.com/&#34;&gt;blog writing&lt;/a&gt;. I only use very basic &lt;code&gt;gganimate&lt;/code&gt; stuff here. A good intro to its grammar can be found at &lt;a href=&#34;https://github.com/thomasp85/gganimate/tree/master#gganimate-&#34;&gt;https://github.com/thomasp85/gganimate/tree/master#gganimate-&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I was also glad to read that &lt;code&gt;gifski&lt;/code&gt; is now on CRAN for all your gif making needs (it&amp;rsquo;s actually the default gif renderer of &lt;code&gt;gganimate&lt;/code&gt;). This package by &lt;a href=&#34;https://github.com/jeroen&#34;&gt;Jeroen Ooms&lt;/a&gt; is not only a wrapper to the fastest gif renderer around, which is cool enough in itself, but also the first CRAN package that interfaces with a Rust library! Read more &lt;a href=&#34;https://ropensci.org/technotes/2018/07/23/gifski-release/&#34;&gt;about &lt;code&gt;gifski&lt;/code&gt; in this tech note&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&#34;writing-the-actual-animation-code&#34;&gt;Writing the actual animation code&lt;/h1&gt;

&lt;p&gt;The first part of the code consisted of drawing random points and using them as a basis for a Voronoi tessellation inside a box.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# adapted from https://r-spatial.github.io/sf/reference/geos_unary.html&lt;/span&gt;
lims &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;)
&lt;span style=&#34;color:#75715e&#34;&gt;# sample 200 random points (400 coordinates)&lt;/span&gt;
&lt;span style=&#34;color:#75715e&#34;&gt;# with a fixed seed so the result is reproducible&lt;/span&gt;
&lt;span style=&#34;color:#66d9ef&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;42&lt;/span&gt;)
points &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; sf&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;st_multipoint(&lt;span style=&#34;color:#66d9ef&#34;&gt;matrix&lt;/span&gt;(runif(n &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;400&lt;/span&gt;, lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;],
                                         lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;]), ncol &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;))

&lt;span style=&#34;color:#75715e&#34;&gt;# get a square box that&amp;#39;ll be the area of our animation &lt;/span&gt;
box &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; sf&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;st_polygon(&lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;rbind&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;], lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;]),
                             &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;], lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;]),
                             &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;], lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;]),
                             &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;], lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;]),
                             &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;], lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;]))))

&lt;span style=&#34;color:#75715e&#34;&gt;# Get the Voronoi polygons of the points&lt;/span&gt;
v &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; sf&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;st_sfc(sf&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;st_voronoi(points, sf&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;st_sfc(box)))
&lt;span style=&#34;color:#75715e&#34;&gt;# Keep only the part of them inside the box&lt;/span&gt;
voronoi &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; sf&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;st_intersection(sf&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;st_cast(v), box)

&lt;span style=&#34;color:#75715e&#34;&gt;# Get a data.frame with the polygons&lt;/span&gt;
get_df_from_polygon &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(polygon, index){
  mat &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;as.matrix&lt;/span&gt;(polygon)
  tibble&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;tibble(x &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; mat[,&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;],
                 y &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; mat[,&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;],
                 tile &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; index)
}

df &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map2_df(voronoi, &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;length&lt;/span&gt;(voronoi),
                     get_df_from_polygon)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The animation will consist of two parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The fragments appearing, by drawing their borders in successive shades from the background blue to white.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;The fragments falling.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The code below creates the PNGs corresponding to the first part of the animation.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;fs&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;dir_create(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;august_frames&amp;#34;&lt;/span&gt;)

plot_one &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(step, df){
  cols &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; colorRampPalette(&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;#2165B6&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;))(&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;)
  ggplot(df)  &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    geom_polygon(aes(x &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; x, y &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; y, group &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; tile),
                 fill &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;#2165B6&amp;#34;&lt;/span&gt;, col &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; cols[step]) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    theme_void()

  outfil &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;file.path&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;august_frames&amp;#34;&lt;/span&gt;,
                      &lt;span style=&#34;color:#66d9ef&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;plot1_%02d.png&amp;#34;&lt;/span&gt;, step))
  ggsave(outfil, width &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;6.7&lt;/span&gt;, height &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;6.7&lt;/span&gt;)
  magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_read(outfil) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_resize(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;480x480&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_write(outfil)


}


purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;walk(&lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt;, plot_one, df)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
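&lt;p&gt;Two small pieces of &lt;code&gt;plot_one()&lt;/code&gt; are worth looking at on their own: the colour ramp that fades the borders in, and the zero-padded file names. The following is a standalone sketch using only base R; the step numbers passed to &lt;code&gt;sprintf()&lt;/code&gt; are picked for illustration.&lt;/p&gt;

```r
# Ten colours running from the background blue to white
cols <- colorRampPalette(c("#2165B6", "white"))(10)
length(cols)
## [1] 10

# %02d pads the step number to two digits
sprintf("plot1_%02d.png", c(1, 10))
## [1] "plot1_01.png" "plot1_10.png"
```

&lt;p&gt;The padding means that sorting the file names alphabetically gives the same order as the steps, which matters if the frames are later collected by name.&lt;/p&gt;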
&lt;p&gt;Then I made the &lt;code&gt;data.frame&lt;/code&gt; a bit bigger by adding a second state with most tiles fallen to the bottom. I first create a variable &lt;code&gt;border&lt;/code&gt; that indicates whether the tile touches the sides or top, because magically these tiles won&amp;rsquo;t fall.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;df &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;group_by(df, tile) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
  dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;mutate(border &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;any&lt;/span&gt;(x &lt;span style=&#34;color:#f92672&#34;&gt;%in%&lt;/span&gt; lims) &lt;span style=&#34;color:#f92672&#34;&gt;|&lt;/span&gt;
                  &lt;span style=&#34;color:#66d9ef&#34;&gt;any&lt;/span&gt;(y &lt;span style=&#34;color:#f92672&#34;&gt;==&lt;/span&gt; lims[&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;]))

&lt;span style=&#34;color:#75715e&#34;&gt;# Define the second state with tiles fallen&lt;/span&gt;
df2  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; df
&lt;span style=&#34;color:#75715e&#34;&gt;# a bit random but it does look ok!&lt;/span&gt;

df2 &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;group_by(df2, tile) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
  dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;mutate(y &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;ifelse&lt;/span&gt;(&lt;span style=&#34;color:#f92672&#34;&gt;!&lt;/span&gt;border,
                           &lt;span style=&#34;color:#ae81ff&#34;&gt;-2&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;min&lt;/span&gt;(y) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; y,
                           y)) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
 dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;ungroup()

df&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;frame &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;
df2&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;frame &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;
dfall &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;bind_rows(df, df2)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
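&lt;p&gt;To make the &lt;code&gt;border&lt;/code&gt; flag and the falling shift easier to follow, here is a toy version of the same two steps. The two triangular tiles are made up for illustration, and the steps are combined into a single &lt;code&gt;mutate()&lt;/code&gt; for brevity; the logic is otherwise the same as above. A tile that doesn&amp;rsquo;t touch the sides or top is moved down so that its lowest point sits at &lt;code&gt;y = -2&lt;/code&gt;, below the bottom edge of the box.&lt;/p&gt;

```r
library(dplyr)

lims <- c(0, 10)

# Two hypothetical triangular tiles: tile 1 touches the left side (x = 0),
# tile 2 sits in the middle of the box
toy <- data.frame(
  x    = c(0, 2, 1, 4, 6, 5),
  y    = c(1, 1, 3, 4, 4, 6),
  tile = c(1, 1, 1, 2, 2, 2)
)

fallen <- toy %>%
  group_by(tile) %>%
  mutate(border = any(x %in% lims) | any(y == lims[2]),
         # shift non-border tiles so their lowest point lands at y = -2
         y = ifelse(!border, -2 - min(y) + y, y)) %>%
  ungroup()

fallen$y
## [1]  1  1  3 -2 -2  0
```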
&lt;p&gt;Next came animating the second part with &lt;code&gt;gganimate&lt;/code&gt;, after creating a fake winner (sorry Steph, you won&amp;rsquo;t get to gift yourself the book). Because we&amp;rsquo;ll need to combine these frames with the PNGs from the first part, we don&amp;rsquo;t use &lt;code&gt;gganimate&lt;/code&gt;&amp;rsquo;s built-in gif or video renderers, but instead save PNGs.&lt;/p&gt;

&lt;p&gt;Important note: the latest version of &lt;code&gt;gganimate&lt;/code&gt; isn&amp;rsquo;t on CRAN yet; follow the instructions in its &lt;a href=&#34;https://github.com/thomasp85/gganimate&#34;&gt;GitHub repository&lt;/a&gt; or download a built source version of the package &lt;a href=&#34;https://ci.appveyor.com/project/thomasp85/gganimate/build/artifacts&#34;&gt;from Appveyor&lt;/a&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;winner &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;list&lt;/span&gt;(name &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Steph Locke&amp;#34;&lt;/span&gt;))
p &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; ggplot(dfall) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    annotate(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;text&amp;#34;&lt;/span&gt;, label &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; winner&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;name,
             x &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;, y &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;,
             size &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;12&lt;/span&gt;, family &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Roboto&amp;#34;&lt;/span&gt;,
             col &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;#E8830C&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    geom_polygon(aes(x &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; x, y &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; y, group &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; tile),
                 fill &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;#2165B6&amp;#34;&lt;/span&gt;, col &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    &lt;span style=&#34;color:#75715e&#34;&gt;# Here comes the gganimate code&lt;/span&gt;
    &lt;span style=&#34;color:#75715e&#34;&gt;# create transition, with the variable frame&lt;/span&gt;
    transition_states(
      frame,
      transition_length &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,
      state_length &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;,
      wrap &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;FALSE&lt;/span&gt;
    ) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    &lt;span style=&#34;color:#75715e&#34;&gt;# use an ease that makes tiles fall faster&lt;/span&gt;
    &lt;span style=&#34;color:#75715e&#34;&gt;# at the end of the transition&lt;/span&gt;
    ease_aes(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;exponential-in&amp;#39;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    theme_void() &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt;
    coord_cartesian(xlim &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; lims, ylim &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; lims)

  &lt;span style=&#34;color:#75715e&#34;&gt;# create and save frames&lt;/span&gt;

   animate(p,
   renderer &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; file_renderer(dir &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;august_frames&amp;#34;&lt;/span&gt;,
                            prefix &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;plot2_&amp;#34;&lt;/span&gt;) )&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now we can use all the PNGs that are in the &amp;ldquo;august_frames&amp;rdquo; folder, with &lt;code&gt;gifski&lt;/code&gt;!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;images &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;sort&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;as.character&lt;/span&gt;(
   fs&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;dir_ls(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;august_frames&amp;#34;&lt;/span&gt;)))

gifski&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;gifski(images,
                gif_file &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; path,
                delay &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0.1&lt;/span&gt;,
                width &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;480&lt;/span&gt;,
                height &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;480&lt;/span&gt;)

&lt;span style=&#34;color:#75715e&#34;&gt;# clean&lt;/span&gt;
 fs&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;dir_delete(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;august_frames&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And voilà, a glass shattering animation!&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/destruction.gif&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Glass shattering book winner reveal&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;h1 id=&#34;conclusion-and-resources-round-up&#34;&gt;Conclusion and resources round-up&lt;/h1&gt;

&lt;p&gt;In this post I explained how I created an animation of glass shattering.&lt;/p&gt;

&lt;p&gt;R packages that are important to know for producing animations are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;gganimate&lt;/code&gt;, which now implements the Grammar of animation, like &lt;code&gt;ggplot2&lt;/code&gt; implements the Grammar of graphics! I&amp;rsquo;d recommend watching &lt;a href=&#34;https://www.youtube.com/watch?v=21ZWDrTukEs&#34;&gt;Thomas Lin Pedersen&amp;rsquo;s useR! keynote talk about The Grammar of animation&lt;/a&gt; and following the impressive activity of the &lt;a href=&#34;https://github.com/thomasp85/gganimate&#34;&gt;&lt;code&gt;gganimate&lt;/code&gt; GitHub repo&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;code&gt;gifski&lt;/code&gt; for GIF rendering. It&amp;rsquo;s used under the hood by &lt;code&gt;gganimate&lt;/code&gt;, but you might need it to combine PNGs yourself if you tweak animations a bit more, like we did here. You can read this &lt;a href=&#34;https://ropensci.org/technotes/2018/07/23/gifski-release/&#34;&gt;tech note about &lt;code&gt;gifski&lt;/code&gt; by Jeroen Ooms&lt;/a&gt; and check out the &lt;a href=&#34;https://github.com/r-rust&#34;&gt;r-rust GitHub organization&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Likewise, if you want to tweak some frames, it&amp;rsquo;ll be useful to know a bit about &lt;code&gt;magick&lt;/code&gt;, an R package for image manipulation that wraps &lt;a href=&#34;https://www.imagemagick.org/Magick++/STL.html&#34;&gt;ImageMagick&lt;/a&gt; and is developed at &lt;a href=&#34;https://ropensci.org/&#34;&gt;rOpenSci&lt;/a&gt; by &lt;a href=&#34;https://github.com/jeroen&#34;&gt;Jeroen Ooms&lt;/a&gt;. This package has a &lt;a href=&#34;https://cran.r-project.org/web/packages/magick/vignettes/intro.html&#34;&gt;good vignette&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Have fun animating R visualizations! And don&amp;rsquo;t forget that next month a random Locke Data follower will again win a great book: follow &lt;a href=&#34;https://twitter.com/LockeData&#34;&gt;Locke Data on Twitter&lt;/a&gt; to be in with a chance of winning!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Learn to R blog series - Operators and Objects</title>
      <link>https://itsalocke.com/blog/learn-to-r-blog-series---operators-and-objects/</link>
      <pubDate>Thu, 19 Jul 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/learn-to-r-blog-series---operators-and-objects/</guid>
      <description>
        
        

&lt;h1 id=&#34;basic-operations&#34;&gt;Basic operations&lt;/h1&gt;

&lt;p&gt;Now that we have some datatypes, we can start learning what we can do with them.&lt;/p&gt;

&lt;p&gt;This week&amp;rsquo;s video whisks over the basic operators - you know what plus and minus do, right? Then we look at some other less common operators and recap it all below.&lt;/p&gt;


&lt;div style=&#34;position: relative; padding-bottom: 56.25%; padding-top: 30px; height: 0; overflow: hidden;&#34;&gt;
  &lt;iframe src=&#34;//www.youtube.com/embed/RI3dD7QegfE&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%;&#34; allowfullscreen frameborder=&#34;0&#34; title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
 &lt;/div&gt;


&lt;p&gt;Pay special attention to &lt;code&gt;all.equal()&lt;/code&gt;, there&amp;rsquo;s a reason I bang on about it!&lt;/p&gt;

&lt;h2 id=&#34;maths&#34;&gt;Maths&lt;/h2&gt;

&lt;p&gt;In R, we have our common &lt;strong&gt;operators&lt;/strong&gt; that you&amp;rsquo;re probably used to if you&amp;rsquo;ve performed calculations on computers before.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Subtract&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 - 4&lt;/code&gt; = 1&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Add&lt;/td&gt;
&lt;td&gt;+&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 + 4&lt;/code&gt; = 9&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Multiply&lt;/td&gt;
&lt;td&gt;*&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 * 4&lt;/code&gt; = 20&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Divide&lt;/td&gt;
&lt;td&gt;/&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 / 4&lt;/code&gt; = 1.25&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Raise to the power&lt;/td&gt;
&lt;td&gt;^&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 ^ 4&lt;/code&gt; = 625&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;R adheres to &lt;strong&gt;BODMAS&lt;/strong&gt;[1] so you can construct safe calculations that combine operators in reliable ways.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Basic sequence&lt;/td&gt;
&lt;td&gt;:&lt;/td&gt;
&lt;td&gt;&lt;code&gt;1:3&lt;/code&gt; = 1, 2, 3&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Integer division&lt;/td&gt;
&lt;td&gt;%/%&lt;/td&gt;
&lt;td&gt;&lt;code&gt;9 %/% 4&lt;/code&gt; = 2&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Modulus&lt;/td&gt;
&lt;td&gt;%%&lt;/td&gt;
&lt;td&gt;&lt;code&gt;9 %% 4&lt;/code&gt; = 1&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The colon (&lt;code&gt;:&lt;/code&gt;) is a snazzy way of generating a sequence of numbers that step by 1. You specify a beginning number and an end number and R will produce all the whole numbers including and between the two numbers. This even works for negative numbers or producing descending values.&lt;/p&gt;
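
&lt;p&gt;As a quick sketch of the three operators in the table above, you could try the following at the console. The particular numbers are just illustrative picks, and the comments show what R prints.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# The colon builds a sequence of whole numbers that steps by 1
1:3       # 1 2 3
-2:2      # -2 -1 0 1 2 (negative start)
5:1       # 5 4 3 2 1 (descending)

# Integer division keeps the whole part, modulus keeps the remainder
9 %/% 4   # 2
9 %% 4    # 1&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Taken together, &lt;code&gt;%/%&lt;/code&gt; and &lt;code&gt;%%&lt;/code&gt; split a division into its two parts: 4 goes into 9 twice, with 1 left over.&lt;/p&gt;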

&lt;h2 id=&#34;comparison&#34;&gt;Comparison&lt;/h2&gt;

&lt;p&gt;The next important thing to know about is how to write comparisons: ways of looking at two or more things and finding out if they&amp;rsquo;re the same, or different.&lt;/p&gt;

&lt;h3 id=&#34;common-operators&#34;&gt;Common operators&lt;/h3&gt;

&lt;p&gt;The less thans and greater thans are symbols that are in pretty much every language for comparisons, but the test to see if two values are the same or not can often vary across languages.&lt;/p&gt;

&lt;h3 id=&#34;summary&#34;&gt;Summary&lt;/h3&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Less than (lt)&lt;/td&gt;
&lt;td&gt;&amp;lt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 &amp;lt; 5&lt;/code&gt; = FALSE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;lt or equal to&lt;/td&gt;
&lt;td&gt;&amp;lt;=&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 &amp;lt;= 5&lt;/code&gt; = TRUE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Greater than (gt)&lt;/td&gt;
&lt;td&gt;&amp;gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 &amp;gt; 5&lt;/code&gt; = FALSE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;gt or equal to&lt;/td&gt;
&lt;td&gt;&amp;gt;=&lt;/td&gt;
&lt;td&gt;&lt;code&gt;5 &amp;gt;= 5&lt;/code&gt; = TRUE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Exactly equal&lt;/td&gt;
&lt;td&gt;==&lt;/td&gt;
&lt;td&gt;&lt;code&gt;(0.5 - 0.3) == (0.3 - 0.1)&lt;/code&gt; is FALSE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Exactly equal&lt;/td&gt;
&lt;td&gt;==&lt;/td&gt;
&lt;td&gt;&lt;code&gt;2 == 2&lt;/code&gt; is TRUE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Not equal&lt;/td&gt;
&lt;td&gt;!=&lt;/td&gt;
&lt;td&gt;&lt;code&gt;(0.5 - 0.3) != (0.3 - 0.1)&lt;/code&gt; is TRUE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Not equal&lt;/td&gt;
&lt;td&gt;!=&lt;/td&gt;
&lt;td&gt;&lt;code&gt;2 != 2&lt;/code&gt; is FALSE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Equal&lt;/td&gt;
&lt;td&gt;all.equal()&lt;/td&gt;
&lt;td&gt;&lt;code&gt;all.equal(0.5-0.3,0.3-0.1)&lt;/code&gt; is TRUE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;In&lt;/td&gt;
&lt;td&gt;%in%&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;quot;Red&amp;quot; %in% c(&amp;quot;Blue&amp;quot;,&amp;quot;Red&amp;quot;)&lt;/code&gt; is TRUE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
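
&lt;p&gt;To see why &lt;code&gt;all.equal()&lt;/code&gt; matters, here is a small sketch comparing it with &lt;code&gt;==&lt;/code&gt; on the decimal example from the table. Wrapping the call in &lt;code&gt;isTRUE()&lt;/code&gt; is an addition not covered above: when the values differ, &lt;code&gt;all.equal()&lt;/code&gt; returns a description of the difference rather than FALSE, so &lt;code&gt;isTRUE()&lt;/code&gt; turns the result into a plain TRUE or FALSE.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Decimals are stored approximately, so exact equality can surprise you
(0.5 - 0.3) == (0.3 - 0.1)         # FALSE

# all.equal() allows for a tiny tolerance
all.equal(0.5 - 0.3, 0.3 - 0.1)    # TRUE

# For values that really differ, all.equal() returns text, not FALSE,
# so wrap it in isTRUE() when you need a yes/no answer
isTRUE(all.equal(1, 1.1))          # FALSE

# %in% checks membership in a set of values
&amp;#34;Red&amp;#34; %in% c(&amp;#34;Blue&amp;#34;, &amp;#34;Red&amp;#34;)       # TRUE&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;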

&lt;h2 id=&#34;logic&#34;&gt;Logic&lt;/h2&gt;

&lt;p&gt;Once we can do a single check, we inevitably want to do multiple checks at the same time.&lt;/p&gt;

&lt;p&gt;To combine multiple checks, we can use &lt;em&gt;logical operators&lt;/em&gt;.&lt;/p&gt;

&lt;h3 id=&#34;common-operators-1&#34;&gt;Common operators&lt;/h3&gt;

&lt;p&gt;The ampersand (&lt;code&gt;&amp;amp;&lt;/code&gt;) allows us to combine two checks to do an AND check, which is &amp;ldquo;are both things true?&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;The pipe, or bar (&lt;code&gt;|&lt;/code&gt;)[2] allows us to do an OR check, which is &amp;ldquo;is either of these things true?&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;The exclamation point (&lt;code&gt;!&lt;/code&gt;) allows us to perform a NOT check, by negating or swapping a check&amp;rsquo;s result. This allows you to say things like &amp;ldquo;is this check true and that check not true?&amp;rdquo;.&lt;/p&gt;

&lt;h3 id=&#34;other-operators&#34;&gt;Other operators&lt;/h3&gt;

&lt;p&gt;Less commonly, there are other logical checks you might want to perform.&lt;/p&gt;

&lt;p&gt;We can do an XOR, where one and only one of two values being checked is true.&lt;/p&gt;

&lt;h3 id=&#34;summary-1&#34;&gt;Summary&lt;/h3&gt;

&lt;p&gt;We can produce sophisticated checks from a few simple building blocks. This will come in very handy down the line when doing things like filtering datasets or creating new fields in your data.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;

&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Not&lt;/td&gt;
&lt;td&gt;!&lt;/td&gt;
&lt;td&gt;&lt;code&gt;!TRUE&lt;/code&gt; is FALSE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;And&lt;/td&gt;
&lt;td&gt;&amp;amp;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TRUE &amp;amp; FALSE&lt;/code&gt; is FALSE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;And&lt;/td&gt;
&lt;td&gt;&amp;amp;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;c(TRUE,TRUE) &amp;amp; c(FALSE,TRUE)&lt;/code&gt; is FALSE, TRUE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Or&lt;/td&gt;
&lt;td&gt;|&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TRUE | FALSE&lt;/code&gt; is TRUE&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td&gt;Xor&lt;/td&gt;
&lt;td&gt;xor()&lt;/td&gt;
&lt;td&gt;&lt;code&gt;xor(TRUE,FALSE)&lt;/code&gt; is TRUE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
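
&lt;p&gt;Here is a short sketch of the logical operators in action. The variable &lt;code&gt;x&lt;/code&gt; and its value are made up for illustration; the last line combines a comparison with a NOT check, in the spirit of &amp;ldquo;is this check true and that check not true?&amp;rdquo;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;!TRUE                             # FALSE
TRUE &amp;amp; FALSE                      # FALSE
TRUE | FALSE                      # TRUE
xor(TRUE, FALSE)                  # TRUE
xor(TRUE, TRUE)                   # FALSE, because both are true

# The operators work element by element on longer inputs
c(TRUE, TRUE) &amp;amp; c(FALSE, TRUE)    # FALSE TRUE

# Combining checks: is x bigger than 5 and not even?
x &amp;lt;- 7                            # illustrative value
x &amp;gt; 5 &amp;amp; !(x %% 2 == 0)            # TRUE&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;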

&lt;h2 id=&#34;play-time&#34;&gt;Play time!&lt;/h2&gt;

&lt;p&gt;This basic operations section has hopefully taught you how to manipulate values and construct comparisons. These are important building blocks in data analysis, and whilst we&amp;rsquo;ve been working with only a single value at a time, in the next section we&amp;rsquo;ll see how it works with more data.&lt;/p&gt;

&lt;p&gt;Now take the code below and run wild&amp;hellip;or let your turtle run wild at least! Use what you&amp;rsquo;ve learned here and see what you can do with operators - I&amp;rsquo;ve given you some blog posts and guides with example code in - take this and alter it so that the numbers are sums, or even better come up with your own!&lt;/p&gt;

&lt;p&gt;&lt;a href=&#34;https://www.r-bloggers.com/four-simple-turtle-graphs-to-play-with-kids/&#34;&gt;Four simple turtle graphs&lt;/a&gt;.&lt;br /&gt;
&lt;a href=&#34;https://cran.r-project.org/web/packages/TurtleGraphics/vignettes/TurtleGraphics.pdf&#34;&gt;A guide to the TurtleGraphics package for R&lt;/a&gt;. This one has some great tricky examples at the end!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;install.packages(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;TurtleGraphics&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#f92672&#34;&gt;library&lt;/span&gt;(TurtleGraphics)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## Loading required package: grid
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;turtle_init() &lt;span style=&#34;color:#75715e&#34;&gt;# This starts off the turtle&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;turtle_down() &lt;span style=&#34;color:#75715e&#34;&gt;# This means that when you feed instructions to your turtle, it leaves a mark.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Use &amp;ldquo;turtle_left()&amp;rdquo; and place an angle (in degrees) between the brackets to turn your turtle. The same applies for &amp;ldquo;turtle_right()&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;Use &amp;ldquo;turtle_forward()&amp;rdquo; and &amp;ldquo;turtle_backward()&amp;rdquo; to move him around and draw something. In between the brackets should be a number specifying the number of &amp;ldquo;steps&amp;rdquo; the turtle should take.&lt;/p&gt;

&lt;p&gt;Here&amp;rsquo;s the twist. Use what you&amp;rsquo;ve learned today with the basic maths operations and create your drawing using sums instead of numbers. We&amp;rsquo;d love to see your efforts so tweet us a picture to @LockeData!&lt;/p&gt;
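
&lt;p&gt;If you&amp;rsquo;d like a starting point, a square drawn with sums might look something like this. It&amp;rsquo;s only a sketch: it assumes the TurtleGraphics package is installed, and the numbers are arbitrary picks that each work out to 20 steps or a 90 degree turn.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(TurtleGraphics)

turtle_init()               # start a fresh turtle
turtle_down()               # make sure the turtle leaves a mark

turtle_forward(10 + 10)     # 20 steps
turtle_right(45 * 2)        # turn 90 degrees
turtle_forward(40 / 2)      # 20 steps
turtle_right(100 - 10)      # turn 90 degrees
turtle_forward(4 * 5)       # 20 steps
turtle_right(9 ^ 2 + 9)     # turn 90 degrees
turtle_forward(25 - 5)      # 20 steps, back where we started&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;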

&lt;p&gt;Ellen :)&lt;/p&gt;

&lt;p&gt;P.S. My video code is below if you want to take it away with you!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 6
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;100&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;/&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;4&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 25
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;25&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;9&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 225
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;6&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;^&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 36
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;7&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 25
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;7&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] 43
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;6&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;4&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;3.5&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;3.5&lt;/span&gt; &lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;12&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;12&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;9&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] TRUE
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;24&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;27&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;## [1] FALSE
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;[1] Brackets, Other, Division, Multiplication, Addition, Subtraction. Note that in some countries it&amp;rsquo;s BEDMAS, where the E stands for Exponents, which is a special case of Other.&lt;/p&gt;

&lt;p&gt;[2] Getting this symbol can be painful as it varies substantially by keyboard, so apologies if it takes you a while to hunt this symbol down.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Harmonizing and emojifying our GitHub issue trackers</title>
      <link>https://itsalocke.com/blog/harmonizing-and-emojifying-our-github-issue-trackers/</link>
      <pubDate>Thu, 12 Jul 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/harmonizing-and-emojifying-our-github-issue-trackers/</guid>
      <description>
        
        

&lt;p&gt;Part of Locke Data&amp;rsquo;s mission is sharing R knowledge and tooling with the world for free. If you have a look at &lt;a href=&#34;https://github.com/lockedata/&#34;&gt;our GitHub account&lt;/a&gt;, you&amp;rsquo;ll see we&amp;rsquo;ve pinned six of our package repos. Furthermore, to make it easier to find all the R stuff we&amp;rsquo;ve packaged up, we&amp;rsquo;ve added an &amp;ldquo;r-package&amp;rdquo; repo topic to all packages: find them all &lt;a href=&#34;https://github.com/search?q=topic%3Ar-package+org%3Alockedata+fork%3Atrue&#34;&gt;via this URL&lt;/a&gt;. Adding such repo topics isn&amp;rsquo;t the only harmonization effort we&amp;rsquo;ve made to make it easier to maintain and promote our package suite. In this post, &lt;a href=&#34;https://twitter.com/i_steves/status/1017111900893696003&#34;&gt;at the request&lt;/a&gt; of &lt;a href=&#34;https://github.com/isteves&#34;&gt;Irene Steves&lt;/a&gt;, we shall explain why and how we semi-automatically harmonized and emojified our issue trackers with the help of GitHub&amp;rsquo;s V3 API and the &lt;code&gt;gh&lt;/code&gt; package!&lt;/p&gt;

&lt;h1 id=&#34;emojify-what-a-short-intro-to-github-issue-tracker-organization&#34;&gt;Emojify what? A short intro to GitHub issue tracker organization&lt;/h1&gt;

&lt;p&gt;If you&amp;rsquo;re a bit familiar with software development on GitHub, either because you develop your own packages or sometimes browse the repos of other packages, you&amp;rsquo;ll know each repository has an issue tracker. That&amp;rsquo;s the place where bugs and feature requests are gathered. Both repo members and external contributors can add stuff to the list! Often, the issue tracker is your personal to-do list because no one else cares yet, and you can rightly get very excited over any new issue!&lt;/p&gt;

&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;Yaaayy!! I built a thing and somebody is using it :)&lt;br&gt;&lt;br&gt;I guess this is what popular OSS library authors feel when they get an issue, right?&lt;a href=&#34;https://t.co/2WyqcOkElW&#34;&gt;https://t.co/2WyqcOkElW&lt;/a&gt;&lt;/p&gt;&amp;mdash; El esposo de Daniela 💉💉💉 (@g3rv4) &lt;a href=&#34;https://twitter.com/g3rv4/status/1016789214325813248?ref_src=twsrc%5Etfw&#34;&gt;July 10, 2018&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;


&lt;p&gt;But as your repo gets a bit more popular, or as you yourself take notes of a lot of stuff, your to-do list gets very long and hard to make sense of! GitHub provides tools for organizing it. Of particular interest are &lt;a href=&#34;https://help.github.com/articles/about-milestones/&#34;&gt;&lt;em&gt;milestones&lt;/em&gt;&lt;/a&gt;, which are collections of issues corresponding to, say, how you imagine each future release to be, and &lt;a href=&#34;https://help.github.com/articles/about-labels/&#34;&gt;&lt;em&gt;labels&lt;/em&gt;&lt;/a&gt;, which indicate what each issue is about; so you can have issues related to &lt;em&gt;docs&lt;/em&gt;. Labels can also indicate the level needed to resolve an issue, and whether you&amp;rsquo;d welcome external contributors; see &lt;a href=&#34;https://help.github.com/articles/helping-new-contributors-find-your-project-with-labels/&#34;&gt;this GitHub article&lt;/a&gt;. Labels are great: for instance, you can decide to work only on docs one day and filter the issues related to that.&lt;/p&gt;

&lt;h1 id=&#34;why-harmonizing-sets-of-issue-labels-over-an-organization&#34;&gt;Why harmonize sets of issue labels across an organization?&lt;/h1&gt;

&lt;p&gt;As mentioned above, Locke Data maintains a few packages now. Each of them used to have its own set of issue labels. We decided to make them all similar for two reasons. First, it&amp;rsquo;s nice to speak the same language over different projects within our GitHub organization, making it easier to transition from one repo to the other. Second, we wanted to be able to eventually make use of tools such as &lt;a href=&#34;https://github.com/jimhester/tidyversedashboard&#34;&gt;Jim Hester&amp;rsquo;s &lt;code&gt;tidyversedashboard&lt;/code&gt; package&lt;/a&gt;, and thought the &lt;a href=&#34;https://connect.rstudioservices.com/jimhester/tidyverse_dashboard/tidyverse_dashboard.html#open-issues&#34;&gt;issue tab&lt;/a&gt; would be handier if all repos had similar labels.&lt;/p&gt;

&lt;p&gt;Therefore, we set off to decide on a set of issue labels!&lt;/p&gt;

&lt;h1 id=&#34;why-emojifying-issue-labels&#34;&gt;Why emojify issue labels?&lt;/h1&gt;

&lt;p&gt;Maybe one reason we decided to add emojis to our set of issue labels is simply that we could! You&amp;rsquo;ve actually only been able to do that &lt;a href=&#34;https://blog.github.com/2018-02-22-label-improvements-emoji-descriptions-and-more/&#34;&gt;since February 22nd this year&lt;/a&gt;! Other reasons include that it&amp;rsquo;s fun, and that emojis make it easier to differentiate issue labels at a glance. Colours are good, more on that later, but not everyone can differentiate them anyway.&lt;/p&gt;

&lt;h1 id=&#34;how-to-update-issue-label-sets-in-practice&#34;&gt;How to update issue label sets in practice?&lt;/h1&gt;

&lt;p&gt;We updated the issue labels more or less automatically, making the most of GitHub&amp;rsquo;s APIs and of related R packages! The two actual scripts live &lt;a href=&#34;https://github.com/lockedata/lockedev/blob/master/inst/legacy_code/harmonize_labels.R&#34;&gt;here&lt;/a&gt; and &lt;a href=&#34;https://github.com/lockedata/lockedev/blob/master/inst/legacy_code/brand_labels.R&#34;&gt;here&lt;/a&gt;. In this post, we write up some general lessons, updated to make your life easier because the API has evolved a bit since we worked on our labels.&lt;/p&gt;

&lt;p&gt;We first got the list of all package repos using the experimental &lt;a href=&#34;https://github.com/ropenscilabs/ghrecipes&#34;&gt;&lt;code&gt;ghrecipes&lt;/code&gt; package&lt;/a&gt; that interacts with GitHub V4 API and is developed in &lt;a href=&#34;https://ropensci.org/&#34;&gt;rOpenSci&lt;/a&gt;&amp;rsquo;s ropenscilabs GitHub organization. &lt;code&gt;ghrecipes::is_package_repo&lt;/code&gt; returns TRUE if a repo has DESCRIPTION and NAMESPACE files and R and man folders. Maybe you already have such a list of your repos handy!&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;repos &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; ghrecipes&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;get_repos(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;lockedata&amp;#34;&lt;/span&gt;)
repos &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;mutate(repos, 
                       name &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; stringr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;str_remove(name,
                                                  &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;lockedata\\/&amp;#34;&lt;/span&gt;))
&lt;span style=&#34;color:#75715e&#34;&gt;# are packages&lt;/span&gt;
repos&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;is_pkg &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map_lgl(repos&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;name,
                               ghrecipes&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;is_package_repo,
                               owner &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;lockedata&amp;#34;&lt;/span&gt;)

repos &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;filter(repos, is_pkg, &lt;span style=&#34;color:#f92672&#34;&gt;!&lt;/span&gt;is_fork)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once we had that list, we switched to using GitHub&amp;rsquo;s V3 API via the &lt;a href=&#34;https://github.com/r-lib/gh&#34;&gt;&lt;code&gt;gh&lt;/code&gt; package&lt;/a&gt; developed in &lt;a href=&#34;https://www.rstudio.com/&#34;&gt;RStudio&lt;/a&gt;&amp;rsquo;s &lt;code&gt;r-lib&lt;/code&gt; organization.&lt;/p&gt;

&lt;p&gt;At the time we updated our labels, we couldn&amp;rsquo;t update them directly. Now you can, so there is no need to delete and re-create labels at the risk of losing your labelling (&lt;em&gt;we&lt;/em&gt; didn&amp;rsquo;t lose anything though). Note that in general, when messing with your issue trackers automatically, you should definitely test your code on a single repo first: with such power you could, for instance, unlabel all issues at once, which would be quite bad!&lt;/p&gt;

&lt;p&gt;With the current endpoints here&amp;rsquo;s what you should do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href=&#34;http://happygitwithr.com/github-pat.html#how-do-you-authenticate-yourself&#34;&gt;Take care of your authentication&lt;/a&gt; taking into account your rights over the repos and choosing the scope of the token wisely.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;a href=&#34;https://developer.github.com/v3/issues/labels/#list-all-labels-for-this-repository&#34;&gt;List all labels over the repos&lt;/a&gt;. It&amp;rsquo;d look like this.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;gh&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;gh(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;GET /repos/:owner/:repo/labels&amp;#34;&lt;/span&gt;, owner &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;lockedata&amp;#34;&lt;/span&gt;, repo &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;optiRum&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
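&lt;p&gt;To build a table of labels across every repo rather than a single one, the call above could be wrapped in a small function and applied to each repo name. This is only a sketch: &lt;code&gt;label_names&lt;/code&gt; and &lt;code&gt;repo_labels&lt;/code&gt; are hypothetical helpers (not part of &lt;code&gt;gh&lt;/code&gt;), and &lt;code&gt;.limit = Inf&lt;/code&gt; is there on the assumption that some repos have more labels than fit on one page of results.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: pull the label names out of a gh response,
# which is a list with one element per label.
label_names &amp;lt;- function(res) {
  vapply(res, function(label) label[[&amp;#34;name&amp;#34;]], character(1))
}

# Hypothetical wrapper around the call above, returning one row per label.
repo_labels &amp;lt;- function(repo, owner = &amp;#34;lockedata&amp;#34;) {
  res &amp;lt;- gh::gh(&amp;#34;GET /repos/:owner/:repo/labels&amp;#34;,
                owner = owner, repo = repo, .limit = Inf)
  found &amp;lt;- label_names(res)
  data.frame(repo = rep(repo, length(found)), name = found,
             stringsAsFactors = FALSE)
}

# With the repos table from above (needs a GitHub token):
# labels &amp;lt;- do.call(rbind, lapply(repos$name, repo_labels))

# Offline check with a made-up response of the same shape
fake &amp;lt;- list(list(name = &amp;#34;bug&amp;#34;, color = &amp;#34;d73a4a&amp;#34;),
             list(name = &amp;#34;wontfix&amp;#34;, color = &amp;#34;ffffff&amp;#34;))
label_names(fake)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;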
&lt;ul&gt;
&lt;li&gt;After doing that, define what each old label is going to become. When defining our new labels, we &lt;a href=&#34;https://www.webpagefx.com/tools/emoji-cheat-sheet/&#34;&gt;used this cheatsheet of emojis for GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# dput(sort(unique(labels$name)))&lt;/span&gt;
new_labels &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; tibble&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;tibble(old &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bug&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;duplicate&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;enhancement&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;first-timers-only&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;good first issue&amp;#34;&lt;/span&gt;, 
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;help wanted&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;invalid&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;question&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;up-for-grabs&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;wontfix&amp;#34;&lt;/span&gt;
),
                             new &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bug :bug:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;duplicate :dancers:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;enhancement :sparkles:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;good first issue :hatching_chick:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;good first issue :hatching_chick:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;help wanted :raised_hand:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;wontfix :see_no_evil:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;question :question:&amp;#34;&lt;/span&gt;,
                                     &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;help wanted :raised_hand:&amp;#34;&lt;/span&gt;,
&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;wontfix :see_no_evil:&amp;#34;&lt;/span&gt;))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You&amp;rsquo;ll also need to choose colours. We first used random colours sampled thanks to &lt;a href=&#34;https://github.com/ropensci/charlatan&#34;&gt;&lt;code&gt;charlatan::ch_hex_color()&lt;/code&gt;&lt;/a&gt; but also chose to use official Locke Data colours for a few of the labels.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Then, using this correspondence table and your favorite functional programming / looping method (&lt;a href=&#34;https://github.com/tidyverse/purrr&#34;&gt;&lt;code&gt;purrr&lt;/code&gt;&lt;/a&gt; in our case, get started with &lt;a href=&#34;https://github.com/jenniferthompson/RLadiesIntroToPurrr&#34;&gt;this slidedeck&lt;/a&gt;), use that fantastic &lt;a href=&#34;https://developer.github.com/v3/issues/labels/#update-a-label&#34;&gt;endpoint&lt;/a&gt; that didn&amp;rsquo;t exist when we did that (or that we missed?). Here&amp;rsquo;s how it&amp;rsquo;d look for a single label, identified by its current name, and repo.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;gh&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;gh(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;PATCH /repos/:owner/:repo/labels/:current_name&amp;#34;&lt;/span&gt;,
       owner &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;ropenscilabs&amp;#34;&lt;/span&gt;, repo &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;ghrecipes&amp;#34;&lt;/span&gt;,
       current_name &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;enhancement&amp;#34;&lt;/span&gt;,
       name &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;feature :sparkles:&amp;#34;&lt;/span&gt;,
       color &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;ffa07a&amp;#34;&lt;/span&gt;,
       description &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;stuff that&amp;#39;d be nice to add&amp;#34;&lt;/span&gt;,
       &lt;span style=&#34;color:#ae81ff&#34;&gt;.&lt;/span&gt;send_headers &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(Accept &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;application/vnd.github.symmetra-preview+json&amp;#34;&lt;/span&gt;))&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Notice that the color has to be a hex code, minus the &amp;ldquo;#&amp;rdquo;, and also notice the &lt;code&gt;.send_headers&lt;/code&gt; parameter. I&amp;rsquo;ve tested the above chunk for real and it worked. Depending on the current state of your repos, you might want to &lt;a href=&#34;https://developer.github.com/v3/issues/labels/#create-a-label&#34;&gt;create labels from scratch&lt;/a&gt; after &lt;a href=&#34;https://developer.github.com/v3/issues/labels/#delete-a-label&#34;&gt;deleting labels&lt;/a&gt;. But &lt;em&gt;updating&lt;/em&gt; labels is good because it means that if the labels are already used, the labelled issues won&amp;rsquo;t lose their labelling.&lt;/p&gt;
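&lt;p&gt;Putting the pieces together for all repos and all labels could look roughly like the sketch below. It is not the code we ran: &lt;code&gt;update_label&lt;/code&gt; and &lt;code&gt;update_all&lt;/code&gt; are hypothetical helpers, the toy &lt;code&gt;repo_names&lt;/code&gt; and &lt;code&gt;label_map&lt;/code&gt; stand in for the repo list and correspondence table, and the &lt;code&gt;color&lt;/code&gt; column is an assumption, filled here with base R random hex codes instead of &lt;code&gt;charlatan&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Toy stand-ins for the repo names and the correspondence table
repo_names &amp;lt;- c(&amp;#34;optiRum&amp;#34;, &amp;#34;datasauRus&amp;#34;)
label_map &amp;lt;- data.frame(
  old = c(&amp;#34;bug&amp;#34;, &amp;#34;first-timers-only&amp;#34;, &amp;#34;wontfix&amp;#34;),
  new = c(&amp;#34;bug :bug:&amp;#34;, &amp;#34;good first issue :hatching_chick:&amp;#34;,
          &amp;#34;wontfix :see_no_evil:&amp;#34;),
  stringsAsFactors = FALSE
)

# Assumption: one random colour per distinct new label,
# as a six-digit hex code without the leading hash
set.seed(42)
new_names &amp;lt;- unique(label_map$new)
palette &amp;lt;- setNames(
  sprintf(&amp;#34;%06x&amp;#34;, sample.int(16^6, length(new_names)) - 1L),
  new_names
)
label_map$color &amp;lt;- unname(palette[label_map$new])

# Cross join: one row per (repo, old label) pair
todo &amp;lt;- merge(data.frame(repo = repo_names, stringsAsFactors = FALSE),
              label_map, by = NULL)

# Hypothetical helper wrapping the PATCH call shown above
update_label &amp;lt;- function(owner, repo, old, new, color) {
  gh::gh(&amp;#34;PATCH /repos/:owner/:repo/labels/:current_name&amp;#34;,
         owner = owner, repo = repo, current_name = old,
         name = new, color = color,
         .send_headers = c(Accept = &amp;#34;application/vnd.github.symmetra-preview+json&amp;#34;))
}

# Nothing is sent until this is called. possibly() lets the loop
# continue when a request fails, e.g. when a repo lacks a label.
update_all &amp;lt;- function(todo, owner) {
  safe_update &amp;lt;- purrr::possibly(update_label, otherwise = NULL)
  purrr::pwalk(todo, function(repo, old, new, color) {
    safe_update(owner, repo, old, new, color)
  })
}

# Try a single row first:
# update_all(todo[1, ], owner = &amp;#34;lockedata&amp;#34;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Keeping the API call inside a function means the table of planned changes can be inspected before anything is sent. Note that &lt;code&gt;possibly()&lt;/code&gt; silences every error, including authentication problems, so it is worth running one unwrapped call first.&lt;/p&gt;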

&lt;h1 id=&#34;now-enjoy-labels-by-working-on-issues&#34;&gt;Now, enjoy labels by working on issues!&lt;/h1&gt;

&lt;p&gt;Admire the set of labels we can now use in each of our package repos:&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/2018-07-12-datasaurus2.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Locke Data issue labels&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Here&amp;rsquo;s for instance the &lt;a href=&#34;https://github.com/lockedata/datasauRus/issues&#34;&gt;&lt;code&gt;datasauRus&lt;/code&gt; issue tracker&lt;/a&gt; whose pretty labels &lt;a href=&#34;https://twitter.com/i_steves/status/1017095491824373761&#34;&gt;inspired Irene Steves&lt;/a&gt;.&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/2018-07-12-datasaurus.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;datasauRus issue tracker&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Now, it&amp;rsquo;s time for us to get cracking on some of our repo issue trackers! Have fun with your own labelling!&lt;/p&gt;

&lt;p&gt;Edited to add: &lt;a href=&#34;http://usethis.r-lib.org/reference/use_github_labels.html&#34;&gt;&lt;code&gt;usethis::use_github_labels&lt;/code&gt;&lt;/a&gt; might be of interest, although it doesn&amp;rsquo;t update the names of existing labels. It is probably particularly useful for setup. Also follow &lt;a href=&#34;https://github.com/r-lib/usethis/issues/290&#34;&gt;this issue thread&lt;/a&gt; including a &lt;a href=&#34;https://github.com/r-lib/usethis/issues/290#issuecomment-368983746&#34;&gt;prototype of an issue label set by Mara Averick, with emojis&lt;/a&gt;.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>SatRdays Cardiff</title>
      <link>https://itsalocke.com/blog/satrdays-cardiff/</link>
      <pubDate>Wed, 04 Jul 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/satrdays-cardiff/</guid>
      <description>
        
        

&lt;p&gt;Hey again lovely readers! This blog post is a very special one indeed: you get to hear about our recent great day out at &lt;a href=&#34;//cardiff2018.satrdays.org&#34;&gt;SatRdays in Cardiff&lt;/a&gt; not once, not twice, but five times, from each of our team members&amp;rsquo; perspectives! I think it&amp;rsquo;s fair to say that it was a very different experience for each of us - from seasoned conference attendees like Steph and Maëlle, Amy who had never presented before, sponsorship newbie Oz and then Ellen somewhere in between, we all had very different (but great) takeaways from the day! Read on for laughs, nerves, proud moments and a quick cuddle too!&lt;/p&gt;

&lt;h2 id=&#34;ellen&#34;&gt;Ellen&lt;/h2&gt;

&lt;p&gt;Friday afternoon began with a hot and sticky drive, a 2 hour delay on the M6 and a rumbling tummy, but I made it to Cardiff just in time to catch a drink with a few of the Cardiff R user group! I was welcomed warmly by all, but most warmly of all by Steph&amp;rsquo;s dogs, Obi and Leia, with whom I was spending the night to make an early start on Saturday morning.&lt;/p&gt;

&lt;p&gt;We arrived at the venue at Cardiff University so early that even the coffee shop wasn&amp;rsquo;t open yet - but whilst we were setting up the Locke Data stall, complete with life-size chibi (head not to scale!) and freebies, some pastries and coffee arrived to start/save the day. The morning workshops got off to a great start, and I went and sat in on Steph&amp;rsquo;s Tidy Data Science workshop. I wanted to see Steph present in person for the first time, and I just knew there would be some great nuggets of info in there for me to take away. I felt like I was being let in on a naughty secret when she told us about the &lt;code&gt;skimr&lt;/code&gt; and &lt;code&gt;DataExplorer&lt;/code&gt; packages! Everybody should have those in their back pocket!&lt;/p&gt;

&lt;p&gt;Lunch arrived and all too soon it was time for me to present. After watching Amy present and handle her questions like a practised pro I was up! This was my longest presentation to date, but I was looking forward to it and unusually for me, didn&amp;rsquo;t get any last minute jitters. I was so chuffed with how well received it was - a few people had seen me present the problem in Manchester a while back, and said how pleased they were that I had found a way around it. The questions I received were tricky but engaging and provided plenty of food for thought for future work - so maybe watch this space and see if I end up going along to SatRdays Amsterdam in September to present anything I come up with between now and then?&lt;/p&gt;

&lt;p&gt;Ellen :)&lt;/p&gt;

&lt;h2 id=&#34;amy&#34;&gt;Amy&lt;/h2&gt;

&lt;p&gt;Saturday began like any other day for me, other than the nerves of speaking at my first SatRdays Cardiff event! I woke up, got ready (kids included) and bid farewell to my family. Racked with nerves about doing my first talk, I strolled through Cardiff trying to enjoy the lovely weather. I arrived at the event to see some familiar and friendly faces from Team Locke, and some newer faces for people making the journey. The morning was very enjoyable, sitting at our booth while the workshops took place, allowing me some time to mentally prepare to give my afternoon talk.&lt;/p&gt;

&lt;p&gt;As lunch time arrived, so did my family! Steph is very keen on making speaking easier and more accessible to women (especially us mothers)! We had a quick bite to eat, and the kids loved playing with life-size Steph Chibi! Then it was time! We all strolled over to the rooms where talks were being held, and I could feel the nerves building as we got closer. The room was quite intimidating for a first time speaker, looking up and seeing row after row of seats - hoping they wouldn&amp;rsquo;t be filled! We had a few technical issues in getting the laptops and presentation all set up, while I listened to chants from my 3 year old calling &amp;ldquo;Mummy, Mummy!&amp;rdquo;, cheering me on with his dad. Just as I was about to begin my talk, my daughter thought this would be a perfect time to toddle on over for a quick cuddle. After I got the motherly duties out of the way, my talk began.&lt;/p&gt;

&lt;p&gt;The room wasn&amp;rsquo;t overfilled, but this didn&amp;rsquo;t stop me being nervous: I forgot my words a little and had to skip a slide. However, once I got into the swing of it, things came a little more naturally (even though nerves were still on high). Overall it was a very nerve-racking but enjoyable experience, and I got to talk about one of my favourite tools (Airtable), and received plenty of encouragement from my family and co-workers. A big thank you to Dave for asking questions, to try and prompt more information on the topic, but also for organising this amazing event (not forgetting the bottle of wine for helping with the organisation)!&lt;/p&gt;

&lt;p&gt;P.S. If you weren&amp;rsquo;t there on the day, I wrote &lt;a href=&#34;https://itsalocke.com/blog/how-to-use-an-r-interface-with-airtable-api/&#34;&gt;a blog post&lt;/a&gt; on this topic too, which inspired my talk!&lt;/p&gt;

&lt;p&gt;Amy :/&lt;/p&gt;

&lt;h2 id=&#34;steph&#34;&gt;Steph&lt;/h2&gt;

&lt;p&gt;The Cardiff satRday was an incredibly proud moment for me on many fronts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I&amp;rsquo;ve handed the reins of the &lt;a href=&#34;https://www.meetup.com/cardiff-r-user-group&#34;&gt;Cardiff R UG&lt;/a&gt; over to Dave and Paul and they ran their first conference. They did this with panache and everyone I&amp;rsquo;ve talked to about it has gushed over the awesome job they did! They had over 75% of speakers coming from under-represented groups, they had a mentoring program for new speakers, and were able to make T&amp;amp;E available for many speakers &amp;ndash; in short they aced it!&lt;/li&gt;
&lt;li&gt;This was the first UK satRday event and we&amp;rsquo;re really starting to gain traction on these events with &lt;a href=&#34;https://satrdays.org/events/&#34;&gt;8 more already in the pipeline&lt;/a&gt;. As I head up the central team on satRdays it&amp;rsquo;s incredibly gratifying to see folks delivering and participating in satRdays.&lt;/li&gt;
&lt;li&gt;This was the first event Locke Data made a big commitment to sponsor. We invested a huge amount of time and money getting all the things made. Oz, in particular, did a vast amount of preparation and then stayed on the booth when we were all moving about doing talks and catching up with folks. Four of us spoke at the event and everyone did a great job &amp;ndash; especially Amy who did her first ever talk! Seeing our core team in the flesh, sharing their knowledge and making the world a better place, really made me feel proud of what we&amp;rsquo;re growing here.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I&amp;rsquo;m so thankful that people have chosen to take up some of my banners and do things way better than I ever conceived was possible. Thank you all for making me feel on top of the world this week!&lt;/p&gt;

&lt;p&gt;Steph :D&lt;/p&gt;

&lt;h2 id=&#34;maëlle&#34;&gt;Maëlle&lt;/h2&gt;

&lt;p&gt;I&amp;rsquo;m so glad I could attend satRday Cardiff! It was my second satRday this year, since I got to keynote at satRday Cape Town in March. I came all the way from France, well, almost: I actually used the conference as an opportunity to visit family and friends in London with my husband and baby, and only travelled to Cardiff for the day. I was able to assess how welcoming the organisers were: my train tickets from London to Cardiff and back were reimbursed, and a lactation room was put at my disposal. Steph co-founded satRdays (and maintains a &lt;a href=&#34;https://github.com/stephlocke/awesome-organiser-resources&#34;&gt;list of awesome organiser resources&lt;/a&gt;), and one feels her touch in such efforts.&lt;/p&gt;

&lt;p&gt;Unsurprisingly, a big highlight of the conference was meeting my remote co-workers! I was able to pose for a photo with Steph&amp;hellip;&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/2018-06-26-chibi.jpg&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Maëlle and Chibi Steph&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Or was it her chibi? Hard to tell, honestly. In any case, it was great to spend time with Team Locke Data! I&amp;rsquo;m also thankful to have met other people I only interact with online in different spheres of the R community (R-Ladies, rOpenSci, #rstats Twitter). Everyone was as nice as online!&lt;/p&gt;

&lt;p&gt;As regards talks, I gave one about rOpenSci&amp;rsquo;s package onboarding system, with more info in &lt;a href=&#34;https://maelle.github.io/satrday_cardiff/slides#1&#34;&gt;the slidedeck&lt;/a&gt;. Nujcharee Haswell gave an enthusiastic and informative &lt;a href=&#34;https://docs.google.com/presentation/d/1OcbH-1a5fMEjGG0BWFjo5aCGetqgw6dIUnmAaROVP1c/edit&#34;&gt;intro to &lt;code&gt;tidytext&lt;/code&gt;&lt;/a&gt; just before my talk. Later, I attended the session where both Ellen and Amy were speakers!&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/2018-06-26-amy.jpg&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Amy&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Amy was a bit nervous, but even so, her passion for Airtable was obvious! I can only be impressed when thinking that she started learning R a few months ago, and is now able to explain to other R users how to leverage Airtable from R.&lt;/p&gt;


&lt;figure &gt;
    
        &lt;img src=&#34;../img/2018-06-26-ellen.jpg&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Ellen&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Ellen&amp;rsquo;s talk included both really interesting info about energy consumption and its analysis and real talk about the research process that made me laugh (although it probably wasn&amp;rsquo;t very funny at the time!).&lt;/p&gt;

&lt;p&gt;I live-tweeted all day long and made a &lt;a href=&#34;https://masalmon.eu/2018/06/26/storrrify-satrdaycdf-2018/&#34;&gt;storrry out of my tweets on my own blog&lt;/a&gt;. I went &amp;ldquo;home&amp;rdquo; to London very happy, and can only recommend attending a satRday near you if you can!&lt;/p&gt;

&lt;p&gt;Maëlle :o) (why am I the only one with a nose?)&lt;/p&gt;

&lt;h2 id=&#34;oz&#34;&gt;Oz&lt;/h2&gt;

&lt;p&gt;I&amp;rsquo;ve never run a sponsorship stall before, and needless to say I was very nervous, but the attendees at SatRdays made the experience a breeze and a delight. I got to chat with no end of lovely people, and spread the word about Locke Data, the community work Steph does, and even satRdays itself. A lot of work went into making all the banners, signs, and that cardboard cutout chibi, but it was worth it to see so many people getting a laugh from them (and Amy&amp;rsquo;s kids running around with stickers and booklets!)&lt;/p&gt;

&lt;p&gt;The event was a hit from where I was standing. Though small in scope, it was perfectly formed, and handled with a level of skill and organisation I&amp;rsquo;ve seen seasoned conference organisers fail to match. The food looked lovely, and the catering staff were very friendly and helpful. All in all it was a wonderful day, and I know Locke Data will be back again next time.&lt;/p&gt;

&lt;p&gt;Oz ( ⋂‿⋂’)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;And so there we have it! A day to remember for the whole team, for all the right reasons. The next SatRday is already in the diary in Amsterdam on September 1st. Have we inspired you to get involved? The &lt;a href=&#34;http://amsterdam2018.satrdays.org&#34;&gt;Call for Speakers&lt;/a&gt; is open now!&lt;/strong&gt;&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>A crystal clear book draw</title>
      <link>https://itsalocke.com/blog/a-crystal-clear-book-draw/</link>
      <pubDate>Fri, 01 Jun 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/a-crystal-clear-book-draw/</guid>
      <description>
        
        

&lt;p&gt;As you might know, every month, a random Locke Data Twitter follower wins an excellent data science book! This month&amp;rsquo;s gift was &lt;a href=&#34;http://geni.us/introtostatslearning&#34;&gt;&amp;ldquo;An Introduction to Statistical Learning: with Applications in R&amp;rdquo;&lt;/a&gt;, a classic and useful textbook. In this post I&amp;rsquo;ll give you some &lt;code&gt;magick&lt;/code&gt;-al tips from behind-the-scenes of this month&amp;rsquo;s winner announcement. It&amp;rsquo;ll feature learning from my mistakes, and reading from a crystal ball&amp;hellip; or more seriously, image manipulation in R!&lt;/p&gt;

&lt;h1 id=&#34;an-introduction-to-learning-from-my-mistakes&#34;&gt;An Introduction to Learning from my Mistakes&lt;/h1&gt;

&lt;p&gt;In this month&amp;rsquo;s instalment, I corrected two of my previous mistakes, here is what I learnt&amp;hellip;&lt;/p&gt;

&lt;h2 id=&#34;package-your-code-right-away&#34;&gt;Package your code right away!&lt;/h2&gt;

&lt;p&gt;This book draw was the fourth one I was in charge of. The first three times I simply (or so I thought) put an R script and pictures in a GitHub repo, merely listing the dependencies in a DESCRIPTION file&amp;hellip; how suboptimal! This month I transformed &lt;a href=&#34;https://github.com/lockedata/twitterbookdraw&#34;&gt;the repo&lt;/a&gt; into a package, moving the old scripts to a subfolder after naming them correctly in order to no longer have one called&amp;hellip; &amp;ldquo;code.R&amp;rdquo;, and creating a function to draw the winner.&lt;/p&gt;

&lt;p&gt;Now the repo looks good which is nice but I can&amp;rsquo;t but regret not having organized it right away. I&amp;rsquo;d have saved time and energy and my pride. Steph actually maintains a nifty package helping with proper project setup, &lt;a href=&#34;https://github.com/lockedata/pRojects&#34;&gt;&lt;code&gt;pRojects&lt;/code&gt;&lt;/a&gt;, that&amp;rsquo;ll soon get some love and will probably be featured on this blog. Stay tuned!&lt;/p&gt;

&lt;h2 id=&#34;automate-what-can-be-automated&#34;&gt;Automate what can be automated&lt;/h2&gt;

&lt;p&gt;As I confessed &lt;a href=&#34;https://itsalocke.com/blog/a-particles-arly-fun-book-draw/&#34;&gt;last month&lt;/a&gt;, in the last announcement tweet the winner&amp;rsquo;s Twitter handle was different in the tweet text and the gif. Now it won&amp;rsquo;t happen ever again because I wrote an &lt;code&gt;announce_winner&lt;/code&gt; function, so the workflow is now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Draw the winner via &lt;code&gt;twitterbookdraw::draw_winner()&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Automatically create tweet text using that winner via &lt;code&gt;twitterbookdraw::announce_winner()&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Automatically create visualization using that winner via  &lt;code&gt;twitterbookdraw::show_winner()&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can witness this &lt;a href=&#34;https://github.com/lockedata/twitterbookdraw#2018-06-01&#34;&gt;in &lt;code&gt;twitterbookdraw&lt;/code&gt; README&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Writing &lt;code&gt;announce_winner&lt;/code&gt; was quite pleasing thanks to the nice &lt;a href=&#34;https://github.com/tidyverse/glue&#34;&gt;&lt;code&gt;glue&lt;/code&gt; package&lt;/a&gt; by RStudio&amp;rsquo;s &lt;a href=&#34;http://www.jimhester.com/&#34;&gt;Jim Hester&lt;/a&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;announce_winner &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(winner, book, book_url){
  glue&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;glue(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;This month\&amp;#39;s winner is {winner$name} (@{winner$screen_name})! DM us to receive &amp;#34;{book}&amp;#34;!
&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;             \n{book_url}
&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;
&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;             \n#datascience&amp;#39;&lt;/span&gt;)
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
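&lt;p&gt;The reason the handle mismatch can&amp;rsquo;t recur is that the text is now interpolated from the same winner object that feeds the gif. Here is a minimal illustration of that idea with a made-up winner; the list and its values are hypothetical, and the real object may well be a data frame row with more fields.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Made-up winner with the two fields the template reads
winner &amp;lt;- list(name = &amp;#34;Jane Doe&amp;#34;, screen_name = &amp;#34;jane_doe_example&amp;#34;)

# Same interpolation idea as announce_winner(), shortened
tweet &amp;lt;- glue::glue(&amp;#34;The winner is {winner$name} (@{winner$screen_name})!&amp;#34;)

# The handle in the text always comes from the winner object
grepl(winner$screen_name, tweet, fixed = TRUE)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;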
&lt;h1 id=&#34;an-introduction-to-crystal-ball-reading&#34;&gt;An Introduction to Crystal Ball Reading&lt;/h1&gt;

&lt;p&gt;I try to vary the viz used each month to keep it interesting for Twitter readers but also for anyone interested in playing with the code. Now that the GitHub repo is tidier, finding previous months&amp;rsquo; scripts for inspiration has gotten easier! But still, blogging a few tips is probably the best way to digest that information.&lt;/p&gt;

&lt;p&gt;This month, the challenge was to create a crystal ball that&amp;rsquo;d progressively reveal the Twitter avatar of the winner, as if chibi Steph were reading it to announce the winner.&lt;/p&gt;

&lt;p&gt;I prepared this gif using the &lt;a href=&#34;https://cran.r-project.org/web/packages/magick/vignettes/intro.html&#34;&gt;&lt;code&gt;magick&lt;/code&gt; package&lt;/a&gt; for image manipulation, developed by &lt;a href=&#34;https://github.com/jeroen&#34;&gt;Jeroen Ooms&lt;/a&gt; at &lt;a href=&#34;https://ropensci.org/&#34;&gt;rOpenSci&lt;/a&gt;. It wraps the famous image manipulation library ImageMagick, allowing the user to use it in pipelines within R. Using &lt;code&gt;magick&lt;/code&gt; is in my opinion great geometry and logic training: what size and shape should each piece have, which one should you put below/above the other, etc. So on top of scripting your image manipulation to make it reproducible, you get to play!&lt;/p&gt;
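&lt;p&gt;If you have never used &lt;code&gt;magick&lt;/code&gt; in a pipeline, here is a tiny warm-up that is not part of the gif code. The sizes and the white border are arbitrary choices for illustration, and the pipe is loaded from &lt;code&gt;magrittr&lt;/code&gt; explicitly.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(magrittr)

# start from a plain blue canvas
canvas &amp;lt;- magick::image_blank(width = 400, height = 300, color = &amp;#34;#2165B6&amp;#34;)

# each step takes an image and returns a new one, so steps chain naturally
framed &amp;lt;- canvas %&amp;gt;%
  magick::image_border(color = &amp;#34;white&amp;#34;, geometry = &amp;#34;10x10&amp;#34;) %&amp;gt;%
  magick::image_scale(&amp;#34;200&amp;#34;) # resize to a width of 200 pixels

# check the resulting dimensions
magick::image_info(framed)[, c(&amp;#34;width&amp;#34;, &amp;#34;height&amp;#34;)]&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;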

&lt;h2 id=&#34;step-1-create-a-face-in-hole-picture-prop&#34;&gt;Step 1: create a face-in-hole picture prop&lt;/h2&gt;

&lt;p&gt;I designed the gif skeleton as a face-in-hole photo prop, where the hole was the crystal ball.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;june_background &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(){
  &lt;span style=&#34;color:#75715e&#34;&gt;# create a blue rectangle as background&lt;/span&gt;
  background &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt;  magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_blank(&lt;span style=&#34;color:#ae81ff&#34;&gt;400&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;300&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;#2165B6&amp;#39;&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# activate it as a plot background&lt;/span&gt;
  img &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_draw(background)
  &lt;span style=&#34;color:#75715e&#34;&gt;# add stars using base plot!&lt;/span&gt;
  points(runif(n &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;42&lt;/span&gt;, min &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;, max &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;400&lt;/span&gt;),
         runif(n &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;42&lt;/span&gt;, min &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;, max &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;300&lt;/span&gt;),
         cex &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; runif(&lt;span style=&#34;color:#ae81ff&#34;&gt;42&lt;/span&gt;, min &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;, max &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;4&lt;/span&gt;),
         col &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;,
         pch &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;8&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# add ball support using base plot as well&lt;/span&gt;
  rect(&lt;span style=&#34;color:#ae81ff&#34;&gt;250&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;250&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;350&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;150&lt;/span&gt;,
       border &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;, lwd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;,
       col &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# add crystal ball, filled with an arbitrary placeholder color&lt;/span&gt;
  symbols(&lt;span style=&#34;color:#ae81ff&#34;&gt;300&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;150&lt;/span&gt;, circles &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;50&lt;/span&gt;,
          fg &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;, inches &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;, add &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;TRUE&lt;/span&gt;,
          bg &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;#B657A3&amp;#34;&lt;/span&gt;, lwd &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;2&lt;/span&gt;)

  dev.off()

  &lt;span style=&#34;color:#75715e&#34;&gt;# get and resize wizard chibi&lt;/span&gt;
  chibi &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_read(&lt;span style=&#34;color:#66d9ef&#34;&gt;system.file&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;assets/wizard_steph.png&amp;#34;&lt;/span&gt;,
                                          package&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;twitterbookdraw&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_resize(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;200x200&amp;#34;&lt;/span&gt;)


  &lt;span style=&#34;color:#75715e&#34;&gt;# add chibi on background&lt;/span&gt;
  img &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_composite(img, chibi, offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;+10+50&amp;#34;&lt;/span&gt;,
                                 operator &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Over&amp;#34;&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# make the crystal ball transparent by replacing its color&lt;/span&gt;
  &lt;span style=&#34;color:#75715e&#34;&gt;# by transparency. the hole is done!&lt;/span&gt;
  img &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_transparent(img, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;#B657A3&amp;#34;&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# return face-in-hole photo prop&lt;/span&gt;
  img
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;figure &gt;
    
        &lt;img src=&#34;../img/2018-06-01-faceinhole.png&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Face in hole photo prop&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;h1 id=&#34;step-2-add-the-winner-s-face-behind-the-hole-and-an-evolving-opacity-veil&#34;&gt;Step 2: add the winner&amp;rsquo;s face behind the hole and an evolving opacity veil&lt;/h1&gt;

&lt;p&gt;Now, I needed to get the winner&amp;rsquo;s face behind that photo prop! For this I first downloaded the winner&amp;rsquo;s avatar from Twitter, using the winner&amp;rsquo;s information previously drawn.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;color:#75715e&#34;&gt;# winner &amp;lt;- twitterbookdraw::draw_winner()&lt;/span&gt;
winner_face &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_read(winner&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;profile_image_url) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_resize(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;150x150&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then I wrote a helper function placing the winner&amp;rsquo;s face behind the photo prop and behind a veil of a given opacity.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;june_colorized_frame &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(opacity, winner_face, background){

  &lt;span style=&#34;color:#75715e&#34;&gt;# add opacity veil in front of the winner&amp;#39;s avatar&lt;/span&gt;
  winner_face &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; winner_face &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_colorize(opacity &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; opacity,
                           color &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;black&amp;#34;&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# put the winner&amp;#39;s avatar on a white rectangle&lt;/span&gt;
  &lt;span style=&#34;color:#75715e&#34;&gt;# which will help put the avatar right behind the hole&lt;/span&gt;
  winner_face &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_blank(&lt;span style=&#34;color:#ae81ff&#34;&gt;400&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;300&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;white&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_composite(winner_face, offset &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;+220+80&amp;#34;&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# put the carefully placed and veiled winner&amp;#39;s avatar&lt;/span&gt;
  &lt;span style=&#34;color:#75715e&#34;&gt;# behind the photo prop&lt;/span&gt;
  magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_composite(winner_face, background)
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After this, I created a sequence of evolving opacity (from very dark to lighter and to very dark again, before the big reveal of the avatar, because crystal balls blink, don&amp;rsquo;t they?).&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;show_june_winner &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt;(winner, path &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;june.gif&amp;#34;&lt;/span&gt;){
  &lt;span style=&#34;color:#75715e&#34;&gt;# get winner&amp;#39;s avatar&lt;/span&gt;
  winner_face &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_read(winner&lt;span style=&#34;color:#f92672&#34;&gt;$&lt;/span&gt;profile_image_url) &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_resize(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;150x150&amp;#34;&lt;/span&gt;)

  &lt;span style=&#34;color:#75715e&#34;&gt;# create a sequence of varying opacities&lt;/span&gt;
  &lt;span style=&#34;color:#75715e&#34;&gt;# and for each opacity create a frame&lt;/span&gt;
  &lt;span style=&#34;color:#75715e&#34;&gt;# then join and animate frames&lt;/span&gt;
  frames &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; purrr&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;map(&lt;span style=&#34;color:#66d9ef&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#66d9ef&#34;&gt;seq&lt;/span&gt;(from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;100&lt;/span&gt;, to &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;50&lt;/span&gt;, by &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;-5&lt;/span&gt;),
                         &lt;span style=&#34;color:#66d9ef&#34;&gt;seq&lt;/span&gt;(from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;50&lt;/span&gt;, to &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;100&lt;/span&gt;, by &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;),
                         &lt;span style=&#34;color:#66d9ef&#34;&gt;seq&lt;/span&gt;(from &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;100&lt;/span&gt;, to &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;50&lt;/span&gt;, by &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;-5&lt;/span&gt;),
               &lt;span style=&#34;color:#66d9ef&#34;&gt;rep&lt;/span&gt;(&lt;span style=&#34;color:#ae81ff&#34;&gt;0&lt;/span&gt;, &lt;span style=&#34;color:#ae81ff&#34;&gt;5&lt;/span&gt;)),
             june_colorized_frame,
             winner_face &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; winner_face,
             background &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; june_background())

  frames &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_join() &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_animate() &lt;span style=&#34;color:#f92672&#34;&gt;%&amp;gt;%&lt;/span&gt;
    magick&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;image_write(path)

}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Imagine the winner were Amy, &lt;a href=&#34;https://twitter.com/AmyMcDougall96&#34;&gt;Locke Data&amp;rsquo;s operation manager&lt;/a&gt; (post written before the actual draw&amp;hellip; Locke Data team members actually can&amp;rsquo;t win, we&amp;rsquo;d re-draw!):&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;winner &lt;span style=&#34;color:#f92672&#34;&gt;&amp;lt;-&lt;/span&gt; rtweet&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;search_users(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;AmyMcDougall96&amp;#34;&lt;/span&gt;, n &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#ae81ff&#34;&gt;1&lt;/span&gt;)
twitterbookdraw&lt;span style=&#34;color:#f92672&#34;&gt;::&lt;/span&gt;show_june_winner(winner, path &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;2018-06-01-winner.gif&amp;#34;&lt;/span&gt;)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;figure &gt;
    
        &lt;img src=&#34;../img/2018-06-01-winner.gif&#34; /&gt;
    
    
    &lt;figcaption&gt;
        &lt;h4&gt;Winner gif&lt;/h4&gt;
        
    &lt;/figcaption&gt;
    
&lt;/figure&gt;


&lt;p&gt;Not too shabby, good crystal ball reading skills Wizard Steph!&lt;/p&gt;

&lt;h1 id=&#34;future-plans&#34;&gt;Future plans?&lt;/h1&gt;

&lt;p&gt;Next month, a random Locke Data follower will again win a great book: follow &lt;a href=&#34;https://twitter.com/LockeData&#34;&gt;Locke Data on Twitter&lt;/a&gt; for a chance! And do not hesitate to ping us there if this post inspired you to play with magick!&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Python and Tidyverse</title>
      <link>https://itsalocke.com/blog/python-and-tidyverse/</link>
      <pubDate>Fri, 01 Jun 2018 00:00:00 +0000</pubDate>
      
      <guid>https://itsalocke.com/blog/python-and-tidyverse/</guid>
      <description>
        
        

&lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;One of the great things about the R world is the tidyverse, a
collection of R packages that are easy for beginners to learn and
provide a consistent data manipulation and visualisation space. These
tools have proved so valuable that many of them have been ported to
Python, so we thought we should write an introduction to the tidyverse
for Python users.&lt;/p&gt;

&lt;h2 id=&#34;what-is-tidyverse&#34;&gt;What is tidyverse?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&#34;https://www.tidyverse.org/&#34;&gt;Tidyverse&lt;/a&gt; is an opinionated collection of
R packages designed for data science. All packages share an underlying
design philosophy, grammar, and data structures. The core R tidyverse
packages are: ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr and
forcats.&lt;/p&gt;

&lt;h2 id=&#34;python-implementation-of-dplyr&#34;&gt;Python implementation of dplyr&lt;/h2&gt;

&lt;p&gt;The tidyverse package &lt;a href=&#34;https://dplyr.tidyverse.org/&#34;&gt;dplyr&lt;/a&gt; is a grammar
of data manipulation, providing a consistent set of verbs that help you
solve the most common data manipulation challenges. Here are some of the
functions dplyr provides that are commonly used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;mutate() - adds new variables that are functions of existing
variables&lt;/li&gt;
&lt;li&gt;select() - picks variables based on their names.&lt;/li&gt;
&lt;li&gt;filter() - picks cases based on their values.&lt;/li&gt;
&lt;li&gt;summarise() - reduces multiple values down to a single summary.&lt;/li&gt;
&lt;li&gt;arrange() - changes the ordering of the rows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&#34;https://github.com/dodger487/dplython&#34;&gt;Dplython&lt;/a&gt; is a Python
implementation of dplyr which can be installed using pip and the
following command:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install dplython&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Instructions on how to use pip to install Python packages can be found
&lt;a href=&#34;https://packaging.python.org/tutorials/installing-packages/&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&#34;https://github.com/dodger487/dplython&#34;&gt;Dplython&lt;/a&gt; README provides
some clear examples of how the package can be used. Below is a summary
of the common functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;select() - used to get specific columns of the data-frame.&lt;/li&gt;
&lt;li&gt;sift() - used to keep only the rows whose values meet a condition
(the counterpart of dplyr&amp;rsquo;s filter()).&lt;/li&gt;
&lt;li&gt;sample_n() and sample_frac() - used to provide a random sample of
rows from the data-frame.&lt;/li&gt;
&lt;li&gt;arrange() - used to sort results.&lt;/li&gt;
&lt;li&gt;mutate() - used to create new columns based on existing columns.&lt;/li&gt;
&lt;/ul&gt;
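
&lt;p&gt;To make these verbs concrete without installing anything, here is a rough plain-Python sketch of what each one does. It uses four of the &amp;lsquo;diamonds&amp;rsquo; rows printed further down in this post, stored as a list of dictionaries. The helper functions borrow the Dplython verb names, but they are hypothetical stand-ins written for illustration, not Dplython&amp;rsquo;s own code.&lt;/p&gt;

```python
# Toy stand-ins for the Dplython verbs, written for illustration only.
# They operate on a list of dicts rather than a pandas data-frame.

rows = [
    {"carat": 0.23, "cut": "Ideal", "price": 326, "depth": 61.5},
    {"carat": 0.21, "cut": "Premium", "price": 326, "depth": 59.8},
    {"carat": 0.23, "cut": "Good", "price": 327, "depth": 56.9},
    {"carat": 0.29, "cut": "Premium", "price": 334, "depth": 62.4},
]


def select(data, *columns):
    """Keep only the named columns."""
    return [{name: row[name] for name in columns} for row in data]


def sift(data, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in data if predicate(row)]


def arrange(data, column):
    """Sort the rows by one column, smallest value first."""
    return sorted(data, key=lambda row: row[column])


def mutate(data, **new_columns):
    """Add columns computed from each existing row."""
    return [{**row, **{name: fn(row) for name, fn in new_columns.items()}}
            for row in data]


premium = sift(rows, lambda row: row["cut"] == "Premium")
by_depth = arrange(premium, "depth")
small = select(by_depth, "carat", "price")
with_ratio = mutate(small, price_per_carat=lambda row: row["price"] / row["carat"])

for row in with_ratio:
    print(row)
```

&lt;p&gt;Each helper returns a new list and leaves its input untouched, which is what allows the steps to be chained one after another.&lt;/p&gt;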

&lt;p&gt;For more functions and example code visit the Dplython
&lt;a href=&#34;https://github.com/dodger487/dplython&#34;&gt;README&lt;/a&gt; page.&lt;/p&gt;

&lt;p&gt;At the bottom of the README, Dplython is compared with
&lt;a href=&#34;https://pythonhosted.org/pandas-ply/&#34;&gt;pandas-ply&lt;/a&gt;, another
Python implementation of dplyr.&lt;/p&gt;

&lt;p&gt;Dplython comes with a sample data-set called &amp;lsquo;diamonds&amp;rsquo;. Here are some
basic examples of how to use Dplython.&lt;/p&gt;

&lt;p&gt;Import Python packages and the &amp;lsquo;diamonds&amp;rsquo; data-frame:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import pandas
from dplython import (DplyFrame, X, diamonds, select, sift, sample_n,
    sample_frac, head, arrange, mutate, group_by, summarize, DelayFunction) 
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Create a new data-frame by selecting columns of the &amp;lsquo;diamonds&amp;rsquo;
data-frame:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;diamondsSmall = diamonds &amp;gt;&amp;gt; select(X.carat, X.cut, X.price, X.color, X.clarity, X.depth, X.table)
&lt;/code&gt;&lt;/pre&gt;
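
&lt;p&gt;The &lt;code&gt;&amp;gt;&amp;gt;&lt;/code&gt; used here is Python&amp;rsquo;s right-shift operator, which a library can overload so that it behaves like a pipe. As a rough sketch of the general technique (an assumption for illustration, not Dplython&amp;rsquo;s actual source), a small wrapper class can implement &lt;code&gt;__rrshift__&lt;/code&gt; so that writing the data on the left and a verb on the right calls the verb on the data. The &lt;code&gt;Verb&lt;/code&gt; class and the &lt;code&gt;head&lt;/code&gt; and &lt;code&gt;double&lt;/code&gt; helpers below are hypothetical names.&lt;/p&gt;

```python
class Verb:
    """Wrap a function so it can sit on the right-hand side of >>."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __rrshift__(self, data):
        # Python evaluates `data >> verb` by calling verb.__rrshift__(data)
        # when `data` itself does not know how to shift by a Verb.
        return self.func(data, *self.args)


def head(n):
    """Return a Verb that keeps the first n items."""
    return Verb(lambda data, count: data[:count], n)


def double():
    """Return a Verb that doubles every item."""
    return Verb(lambda data: [item * 2 for item in data])


prices = [326, 326, 327, 334]
result = prices >> head(3) >> double()
print(result)  # [652, 652, 654]
```

&lt;p&gt;Because the operator is evaluated left to right, each verb receives the output of the one before it, which is what makes the chain read like a dplyr pipeline.&lt;/p&gt;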

&lt;p&gt;Display the top 4 rows of the &amp;lsquo;diamondsSmall&amp;rsquo; data-frame:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print(diamondsSmall &amp;gt;&amp;gt; head(4)) 

##    carat      cut  price color clarity  depth  table
## 0   0.23    Ideal    326     E     SI2   61.5   55.0
## 1   0.21  Premium    326     E     SI1   59.8   61.0
## 2   0.23     Good    327     E     VS1   56.9   65.0
## 3   0.29  Premium    334     I     VS2   62.4   58.0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Filter the data-frame for rows where the price is higher than 18,000 and
the carat less than 1.2 and sort them by depth:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print((diamondsSmall &amp;gt;&amp;gt; sift(X.price &amp;gt; 18000, X.carat &amp;lt; 1.2) &amp;gt;&amp;gt; arrange(X.depth)))

##        carat        cut  price color clarity  depth  table
## 27455   1.14  Very Good  18112     D      IF   59.1   58.0
## 27457   1.07  Very Good  18114     D      IF   60.9   58.0
## 27530   1.07    Premium  18279     D      IF   60.9   58.0
## 27635   1.04  Very Good  18542     D      IF   61.3   56.0
## 27507   1.09  Very Good  18231     D      IF   61.7   58.0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Take a random sample of 5 rows from the data-frame:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print(diamondsSmall &amp;gt;&amp;gt; sample_n(5))

##        carat        cut  price color clarity  depth  table
## 320     0.71       Good   2801     F     VS2   57.8   60.0
## 9813    0.91    Premium   4670     H     VS1   61.8   54.0
## 11795   1.18  Very Good   5088     E     SI2   62.5   60.0
## 11845   0.95  Very Good   5101     D     SI1   63.7   55.0
## 11552   1.17      Ideal   5032     F     SI1   63.0   54.0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Add a column to the data-frame containing the rounded value of &amp;lsquo;carat&amp;rsquo;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print((diamondsSmall &amp;gt;&amp;gt; mutate(carat_bin=X.carat.round()) &amp;gt;&amp;gt;  sample_n(5)))

##        carat        cut  price color clarity  depth  table  carat_bin
## 11883   0.99  Very Good   5112     F     SI1   62.5   58.0        1.0
## 45123   0.77       Fair   1651     D     SI2   65.1   63.0        1.0
## 51630   0.31    Premium    544     E     SI1   59.2   60.0        0.0
## 49382   0.51  Very Good   2102     G      IF   62.6   56.0        1.0
## 18296   1.54  Very Good   7437     I     SI2   63.3   60.0        2.0
&lt;/code&gt;&lt;/pre&gt;
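
&lt;p&gt;One detail worth knowing about rounding: Python&amp;rsquo;s built-in &lt;code&gt;round()&lt;/code&gt; sends exact halves to the nearest even number rather than always up, and to the best of our knowledge the pandas &lt;code&gt;round()&lt;/code&gt; method used above follows the same rule. A quick check with the built-in, using the carat values from the output above plus two halfway cases added for illustration:&lt;/p&gt;

```python
# Carat values from the sample output above, plus two exact halves
values = [0.99, 0.77, 0.31, 0.51, 1.54, 0.5, 2.5]

# Built-in round() with no second argument returns the nearest integer,
# resolving exact ties towards the even neighbour.
rounded = [round(value) for value in values]
print(rounded)  # [1, 1, 0, 1, 2, 0, 2]
```

&lt;p&gt;The first five results agree with the carat_bin column shown above; only the two exact halves behave differently from &amp;ldquo;always round up&amp;rdquo;.&lt;/p&gt;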

&lt;h2 id=&#34;python-implementation-of-ggplot2&#34;&gt;Python implementation of ggplot2&lt;/h2&gt;

&lt;p&gt;The tidyverse package &lt;a href=&#34;http://ggplot2.tidyverse.org/&#34;&gt;ggplot2&lt;/a&gt; is a
system for declaratively creating graphics, based on The Grammar of
Graphics. You provide the data, tell ggplot2 how to map variables to
aesthetics, what graphical primitives to use, and it takes care of the
details.&lt;/p&gt;

&lt;p&gt;A Python port of ggplot2 has long been requested and there are now a few
Python implementations of it; &lt;a href=&#34;http://plotnine.readthedocs.io&#34;&gt;Plotnine&lt;/a&gt;
is the one we will explore here. Plotting with a grammar is powerful: it
makes custom (and otherwise complex) plots easy to think about and
create, while simple plots remain simple.&lt;/p&gt;

&lt;p&gt;Plotnine can be installed using pip:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install plotnine&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Plotnine splits plotting into three distinct parts which are data,
aesthetics and layers. The data step adds the data to the graph, the
aesthetics (aes) step adds visual attributes and the layers step creates
the objects on a plot. Multiple aesthetics and layers functions can be
added to a Plotnine graph.&lt;/p&gt;
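
&lt;p&gt;One way to picture how these parts are combined with &lt;code&gt;+&lt;/code&gt;: each addition returns a new plot description with one more component attached. The toy classes below are hypothetical and are not how Plotnine is implemented; they only mimic the data, aesthetics and layers idea.&lt;/p&gt;

```python
class Plot:
    """A toy plot description: some data, an aesthetic mapping and layers."""

    def __init__(self, data, mapping=None, layers=()):
        self.data = data
        self.mapping = dict(mapping or {})
        self.layers = tuple(layers)

    def __add__(self, part):
        # Each component knows how to attach itself and return a new Plot
        return part.add_to(self)


class Aes:
    """Map plot roles such as x, y or color to column names."""

    def __init__(self, **mapping):
        self.mapping = mapping

    def add_to(self, plot):
        merged = {**plot.mapping, **self.mapping}
        return Plot(plot.data, merged, plot.layers)


class Layer:
    """A named drawing instruction, for example 'point' or 'smooth'."""

    def __init__(self, kind):
        self.kind = kind

    def add_to(self, plot):
        return Plot(plot.data, plot.mapping, plot.layers + (self.kind,))


data = [{"carat": 0.23, "price": 326, "cut": "Ideal"}]
plot = Plot(data) + Aes(x="carat", y="price") + Aes(color="cut") + Layer("point")

print(plot.mapping)  # {'x': 'carat', 'y': 'price', 'color': 'cut'}
print(plot.layers)   # ('point',)
```

&lt;p&gt;Adding a second &lt;code&gt;Aes&lt;/code&gt; merges into the existing mapping instead of replacing it, which mirrors how more than one aesthetics step can be added to the same graph.&lt;/p&gt;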

&lt;p&gt;If you are a Python user accustomed to Matplotlib, a Grammar of
Graphics plotting tool can take some getting used to, partly because of the
&lt;a href=&#34;https://goo.gl/QVf76X&#34;&gt;difference in philosophy&lt;/a&gt;. Plotnine provides
some
&lt;a href=&#34;http://plotnine.readthedocs.io/en/stable/tutorials.html&#34;&gt;tutorials&lt;/a&gt; to
help you get to grips with the package, and there is also the
&lt;a href=&#34;https://github.com/has2k1/plotnine&#34;&gt;Plotnine README&lt;/a&gt;. However, if you
are new to Grammar of Graphics plotting, this highly recommended
&lt;a href=&#34;https://goo.gl/y1GBRu&#34;&gt;kaggle notebook for Plotnine&lt;/a&gt; is probably the
best place to start.&lt;/p&gt;

&lt;p&gt;Here are some examples of how to use plotnine to visualize data from the
&amp;lsquo;diamonds&amp;rsquo; data-frame that comes with Dplython.&lt;/p&gt;

&lt;p&gt;Import Python packages, the &amp;lsquo;diamonds&amp;rsquo; data-frame and create a sample
data-frame:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import warnings; warnings.filterwarnings(&amp;quot;ignore&amp;quot;) # hide Python warnings 
import pandas
import dplython
from plotnine import *
diamondsSample = dplython.diamonds &amp;gt;&amp;gt; dplython.sample_n(5000)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Create a scatter plot of &amp;lsquo;carat&amp;rsquo; vs &amp;lsquo;price&amp;rsquo;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print(ggplot(diamondsSample) # diamondsSample is the data  
 + aes(&#39;carat&#39;, &#39;price&#39;) # plot &#39;carat&#39; vs &#39;price&#39;
 + geom_point() # display the results as a scatter plot
 )

## &amp;lt;ggplot: (41012744)&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&#34;2018-05-04-python-tidyverse_files/figure-markdown_strict/unnamed-chunk-8-1.png&#34; alt=&#34;&#34; /&gt;&lt;/p&gt;

&lt;p&gt;Add additional layers e.g. a line of best fit:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print(ggplot(diamondsSample)  
 + aes(&#39;carat&#39;, &#39;price&#39;) 
 + stat_smooth() # add a line of best fit
 + geom_point()) 

## &amp;lt;ggplot: (-9223372036813567705)&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&#34;2018-05-04-python-tidyverse_files/figure-markdown_strict/unnamed-chunk-9-1.png&#34; alt=&#34;&#34; /&gt;&lt;/p&gt;

&lt;p&gt;Add another aesthetic, here the data is coloured by the &amp;lsquo;cut&amp;rsquo; variable:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print(ggplot(diamondsSample)
 + aes(&#39;carat&#39;, &#39;price&#39;)
 + aes(color=&#39;cut&#39;) # colour the data by the variable cut and create a legend
 + geom_point())

## &amp;lt;ggplot: (-9223372036816020904)&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&#34;2018-05-04-python-tidyverse_files/figure-markdown_strict/unnamed-chunk-10-1.png&#34; alt=&#34;&#34; /&gt;&lt;/p&gt;

&lt;p&gt;Split the data into separate graphs, one for each value of &amp;lsquo;color&amp;rsquo;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;print(ggplot(diamondsSample)
 + aes(&#39;carat&#39;, &#39;price&#39;)
 + aes(color=&#39;cut&#39;)
 + facet_wrap(&#39;color&#39;) # separate the data by &#39;color&#39; and plot each group separately
 + geom_point())

## &amp;lt;ggplot: (64014519)&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&#34;2018-05-04-python-tidyverse_files/figure-markdown_strict/unnamed-chunk-11-1.png&#34; alt=&#34;&#34; /&gt;&lt;/p&gt;

&lt;p&gt;This &lt;a href=&#34;https://goo.gl/rdlJSa&#34;&gt;article&lt;/a&gt; compares a variety of alternative
plotting packages for Python.&lt;/p&gt;

&lt;h2 id=&#34;next-steps&#34;&gt;Next steps&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Read the documents that are linked in this blog post.&lt;/li&gt;
&lt;li&gt;Learn the basics of &lt;a href=&#34;https://pandas.pydata.org/&#34;&gt;Pandas&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Use Dplython and Plotnine to practice data manipulation &amp;amp;
visualization. For example, complete some of the exercises at
&lt;a href=&#34;https://www.kaggle.com/&#34;&gt;kaggle&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do you know of other good Python implementations of tidyverse? If so, let
us know about them!&lt;/p&gt;

      </description>
    </item>
    
  </channel>
</rss>
