R is an open source language, with its first stable release (version 1.0.0) in 2000, that’s ideal for data wrangling and data science. It has connectors to pretty much every data source under the sun, allows you to wrangle data like nobody’s business, build pretty much every type of model ever thought up, and visualise it in all the niftiest ways.
There are currently more than eleven thousand extensions (referred to as packages) to R in the core ecosystem (which is a fancy word for the collected bits and pieces of R!) and two and a half thousand packages in the genomics ecosystem.
We're also seeing emerging ecosystems and paradigms within CRAN. The tidyverse is one such ecosystem, focussed primarily on analysing tabular data, and it will be used extensively in future works.
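To give a flavour of what analysing tabular data in the tidyverse looks like, here is a minimal sketch. It assumes the dplyr package (one of the tidyverse packages) is already installed, and it uses R's built-in mtcars dataset purely for illustration; the column names come from that dataset, not from anything specific to this text.

```r
# Assumption: dplyr is installed, e.g. via install.packages("dplyr")
library(dplyr)

# Summarise the built-in mtcars table: one row per cylinder count
by_cyl <- mtcars %>%
  group_by(cyl) %>%                 # split the rows by number of cylinders
  summarise(
    mean_mpg = mean(mpg),           # average fuel economy within each group
    cars     = n()                  # how many cars fall into each group
  )

print(by_cyl)
```

Each step reads as a verb applied to a table, which is a large part of why this style has caught on for data wrangling.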
The core ecosystem is CRAN, the Comprehensive R Archive Network. CRAN is maintained by some great people who put in place a large number of quality gates that an R package must adhere to in order to be made widely available. They then host these packages and do great things like daily re-runs of all package tests to ensure packages are still working. CRAN is the default source of packages for most R users.
If you use RStudio, you’ll use a mirror of CRAN hosted by RStudio by default. There are a number of these mirrors scattered over the globe to help reduce the load on the central servers. You can use another one of these mirrors, or even set up your own internal CRAN.
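Switching mirrors is a one-liner. The sketch below sets the repository for the current session only; the URL is the cloud mirror listed on the CRAN mirrors page and is picked here just as an example, so substitute whichever mirror (or internal CRAN address) suits you.

```r
# Point this R session at a specific CRAN mirror.
# The URL is an illustrative choice; any valid mirror URL works here.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Check which repository install.packages() will now use
cran_repo <- getOption("repos")[["CRAN"]]
print(cran_repo)

# Packages would then be fetched from that mirror, for example:
# install.packages("dplyr")
```

To make the choice stick across sessions, the same options() call can go in your .Rprofile file.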
R is a great language for doing data analysis, data science, and more. It has its quirks but the community around it is huge and is making R easier to adopt every day.
R as a programming language is brilliant at its core competencies – statistics and data visualisation. It’s also a great “glue” language, by which I mean that you can use it to perform computations in many different languages and combine the results smoothly. As a result, R enables you to be an effective data wrangler, data scientist, and/or data visualisation practitioner.
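As a small taste of those core competencies, here is a sketch that fits a simple linear model and plots it using nothing but base R. The built-in mtcars dataset and the choice of variables (fuel economy against weight) are just for illustration.

```r
# Fit a linear regression: miles per gallon as a function of car weight
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient estimates, standard errors and fit statistics
print(summary(fit))

# Scatter plot of the raw data with the fitted line drawn on top
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy against weight")
abline(fit, col = "red")
```

Both the statistics and the visualisation here ship with R itself, with no extra packages needed.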
Data wrangling, the ‘tidyverse’ and finding your way around RStudio.