A brief history
Having used R for the past couple of years, and having spent the first few months struggling with it, I wanted to give the presentation that I would have wanted to see at the beginning. Not one about random bagging and a bunch of other stats, but one about the best ways to do the fundamentals:
- connecting to my database
- performing data manipulations, summaries and updates
- charting my data
- producing reports
A few packages cover these awesomely and are much better than base R, so whilst I was tackling a massive stats project, the things that cost me the most time and stress were things I could have avoided with ease!
So my intro to R takes people through the things I wish I’d been taken through, making those first few months of R pleasant, happy times!
Here is the chronology of my presentation to date:
- 25/01/2014 – Initial prototype – Cardiff User Group
- 22/03/2014 – 1st commit – SQLSaturday Exeter
- 03/07/2014 – UAT – SQLMidlands user group
- 19/07/2014 – Deployed to live – SQLBits
- 25/08/2014 – Minor update – PASS BI VC
- 13/10/2014 – Stress testing – SQLRelay
- 20/01/2015 – Rewrite – London Business Analytics Group
Change log (date ascending order)
- My first major iteration was done in the R presentation format that RStudio offered at the time – it was a markdownesque variant that produced a LaTeX slide deck. I included a number of not-too-great charts and primarily focused on why you should use R instead of some BI tools, outlining a lot of the features. 40 slides. Lots of people willing to give the R puns a go.
- My SQLMidlands iteration had a modular basis and used rmarkdown. It was significantly trimmed compared to the first version. By trying to cover less, I was able to spend more time explaining how the code worked. Also 40 slides, but with lots of white slides to give me an opportunity to pause and reflect with people.
- I trimmed still further for the SQLBits iteration after SQLMidlands, shoehorned in some crummy examples of joins at the last minute, and made the charts much nicer. 33 slides. By far the most responsive audience to date!
- I didn’t really change much between Bits and Relay – partly because it was good enough and I was busy, and partly because I wanted to practise my speaking on a known deck.
- My latest iteration had a significant overhaul. I rebuilt it entirely from scratch, focusing on the “Scenario”, and took people through things as a workflow. I cut out more extraneous stuff and focused on making the content bigger (one of the weaknesses identified during Relay) by splitting the code from the tables & charts. I also went for more reproducibility and tried to push people to the GitHub repository to emphasise the play-along nature of the task. 32 slides.
I’m already starting to think of ways I can improve it further, so if you saw my session at the London Business Analytics Group, why not let me know what you think I could improve?