Data types

R data types

You should already have a new file in your project, created for your homework last week. Open it up now so you can copy in bits of code and try them out as we go along!

When we think of different bits of data, some of it might be numbers, text, dates, and more. R has its own set of these data types. Before we get into the data types, let’s see how we can get R to tell us what something is.

R uses functions to take some inputs and return an output. A function is basically an inbuilt bit of code that you can call on to do things, optionally passing it some data to work with. The function that takes a value and returns its data type is class().

Try these three examples - what classes do you get? Can you see why they are different?

class(1)
## [1] "numeric"
class(1.1)
## [1] "numeric"
class("1")
## [1] "character"

You can use this class() function if you’re ever unsure what data type something is. This is great for when you’re getting unexpected results!

Numbers

Numbers are split into a few different types: numeric (decimal numbers), integer (whole numbers), and complex (numbers with an imaginary part).
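As a quick sketch of the distinction, class() reports a different type for each of the values below. The L suffix and the i suffix are standard R notation rather than something introduced earlier in this section, so treat them as assumptions for now.

```r
# Each value below is a number, but R stores them as different types.
class(1.5)    # a decimal number
## [1] "numeric"
class(1L)     # the L suffix asks R for an integer
## [1] "integer"
class(2i)     # the i suffix marks an imaginary number
## [1] "complex"
```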

Converting to numbers

The functions as.numeric() and as.integer() allow you to convert something stored as text into a number.

These functions will give you some red text as a warning if you attempt to convert something to a number that can’t be safely converted. They will still attempt the conversion, but return missings (NA) instead of actual values.
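A small sketch of both cases: text that holds a number converts cleanly, while text that doesn't produces a warning and an NA. The example strings are made up for illustration.

```r
as.numeric("1.5")
## [1] 1.5
as.integer("3")
## [1] 3
# "three" cannot be read as a number, so R warns and returns NA
as.numeric("three")
## Warning: NAs introduced by coercion
## [1] NA
```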

Checking numbers

You can write checks to see if something is numeric, or an integer, with is.numeric() or is.integer().

We could also use class() here and inspect the result. (You might recall that class(1) had the result of “numeric”. R was not by default considering 1 as an integer for the purpose of the class() function.)
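The checks might look something like the sketch below. The 1L notation (an L suffix to ask for an integer) is standard R but isn't introduced above, so treat it as an assumption here.

```r
is.numeric(1)
## [1] TRUE
is.integer(1)      # 1 on its own is stored as a decimal number
## [1] FALSE
is.integer(1L)     # the L suffix makes it an integer
## [1] TRUE
is.numeric(1L)     # integers still count as numeric
## [1] TRUE
is.numeric("1")    # text is not numeric, even if it looks like a number
## [1] FALSE
```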

Special numbers

As well as i to denote imaginary numbers, there are some additional symbols you might encounter or want to use.
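As an illustration, and assuming the symbols meant here include Inf, -Inf, NaN and pi (all built into base R), this is roughly how they turn up:

```r
1 / 0       # dividing by zero gives infinity rather than an error
## [1] Inf
-1 / 0
## [1] -Inf
0 / 0       # "not a number"
## [1] NaN
pi
## [1] 3.141593
class(Inf)  # these special values are still numeric
## [1] "numeric"
```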

Text

Text, also known as strings, is split up into two core types: characters and factors.

We are going to focus on characters.

In R, you can’t just type some text as it will be construed as an object or function name. To delimit a string you can use speech marks (") or apostrophes (') at the beginning and end of it to show where it starts and ends. These are the text delimiters in R.

Note you can’t mix the two delimiters on one string e.g. "red', but you can use them together so that you can have speech marks or apostrophes inside a string e.g. 'They said "Read this"' or "It's mine now".

If you need to have both inside a string you can escape the ones on the inside of the string to say they don’t count as text delimiters. To escape a delimiter you can use a backslash (\) e.g. "They said \"Read this\"".
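Here is a short sketch of the delimiter rules in action. The object name both is made up for the example. Note that printing a string shows the escapes, whereas cat() shows the text as it would be read.

```r
'They said "Read this"'
## [1] "They said \"Read this\""
"It's mine now"
## [1] "It's mine now"
# Escaping lets both kinds of mark appear in one string
both <- "They said \"It's mine now\""
cat(both)
## They said "It's mine now"
nchar("\"")    # the backslash is not counted as part of the text
## [1] 1
```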

Beware the copy and paste here - sometimes this can really mess your code up, and if it’s not working for no obvious reason, try tidying up all your speech marks and apostrophes.

Converting to strings

Converting to characters and factors is the same as working with numbers. You swap “numeric” for “character” or “factor” and you’re done!

Add another line to your script and try it out.
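If you want something to compare your attempt against, a minimal sketch might look like this (the values are arbitrary):

```r
as.character(1.5)
## [1] "1.5"
class(as.character(1.5))
## [1] "character"
as.factor("red")
## [1] red
## Levels: red
class(as.factor("red"))
## [1] "factor"
```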

Similarly to checking numbers, the same rule applies to checking strings - I’m not going to give it away - have a go!

Logical values

Whilst we’ve been testing our data types, we’ve created a lot of logical or boolean values. Boolean values are TRUE and FALSE. R is case-sensitive so these have to be typed in upper case, otherwise they mean something different.
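A quick sketch of where logical values come from, and of the case-sensitivity point. The last line is left commented out because it would stop your script with an error.

```r
class(TRUE)
## [1] "logical"
1 > 2              # comparisons produce logical values
## [1] FALSE
is.numeric(1)      # so do the is.* checks used earlier
## [1] TRUE
# Lower-case true is read as an object name, so unless you have made an
# object called true, this fails with "object 'true' not found":
# class(true)
```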

I bet you can totally guess how to convert and check logicals. Add it to your script!

Is your cup of tea full? Is it still hot? You’re going to need it for this next bit because dates can be a stinker! I suggest you get up, stretch and refill before you carry on!

Deep breath

Dates

Dates are one of the hardest parts of programming! This is a very brief introduction to dates so if you want more (and there’s lots) - get searching!

Dates in R split into dates without a time (Date) and two types of date-time (POSIXct and POSIXlt).

You might be looking at the two POSIX times and thinking to yourself “ZOMG how am I meant to choose?”. Most people use the POSIXct format[5], which is the default for many of R’s functions.

Converting to dates

You can convert to dates and date-times with as.Date(), as.POSIXct(), and as.POSIXlt(). Ideally, you’ll provide a string with the date(time) in an ISO 8601 format e.g. “YYYY-MM-DD hh:mm”.
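As a sketch, here are the three conversions applied to made-up ISO 8601 strings. The object names d and dt are just for illustration, and class() is used to show what each function returns.

```r
d <- as.Date("2017-12-31")
class(d)
## [1] "Date"
# No time zone is given here, so the device's time zone is used
dt <- as.POSIXct("2017-12-31 23:59")
class(dt)
## [1] "POSIXct" "POSIXt"
class(as.POSIXlt("2017-12-31 23:59"))
## [1] "POSIXlt" "POSIXt"
```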

Note that R assumes a time zone based on my device when I’ve not provided one. It’s prudent to set the time zone so that the results of your code don’t change based on where or when the code is run[6].
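One way to pin the time zone down is the tz argument of as.POSIXct(). The sketch below assumes "UTC" is the zone you want, and the object name fixed is made up for the example.

```r
fixed <- as.POSIXct("2017-12-31 23:59", tz = "UTC")
format(fixed, "%Y-%m-%d %H:%M %Z")
## [1] "2017-12-31 23:59 UTC"
attr(fixed, "tzone")   # the time zone is stored with the value
## [1] "UTC"
```

With the time zone stated, this gives the same result whichever device runs it.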

Checking dates

Unfortunately, R does not provide is.* functions for checking whether something is a date-time type without extending its functionality. We have to use class() as a consequence.

class(as.Date("2017-12-31"))
## [1] "Date"
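class() gives you text that you then have to compare yourself. As a sketch of one alternative, base R’s general-purpose inherits() function does that comparison for you and returns TRUE or FALSE. It isn’t a date-specific check, and the object names here are made up.

```r
d <- as.Date("2017-12-31")
inherits(d, "Date")
## [1] TRUE
# POSIXct and POSIXlt values both carry the "POSIXt" class
dt <- as.POSIXct("2017-12-31 23:59", tz = "UTC")
inherits(dt, "POSIXt")
## [1] TRUE
inherits("2017-12-31", "Date")   # this is still just text
## [1] FALSE
```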

Getting dates and times

R has some functions for getting current date-time values[7].

Sys.Date()
## [1] "2018-01-04"
Sys.time()
## [1] "2018-01-04 14:58:59 GMT"
Sys.timezone()
## [1] "Europe/London"

annnnnd breathe out

Missings

Every data type has an NA, an identifier for a missing value.

If you use an NA in an object it will take on the data type used in the object. You can, however, make NAs directly.

NA
## [1] NA
NA_integer_
## [1] NA
NA_character_
## [1] NA
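To illustrate the first point, that an NA takes on the data type of the object it sits in, here is a small sketch with two made-up objects:

```r
numbers <- c(1.5, NA)
class(numbers)
## [1] "numeric"
words <- c("a", NA)
class(words)
## [1] "character"
```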

Checking NAs

You can check what data type an NA is, using the class() function. Add this to your script now.

Want to check if something is NA? You know how to check things by now! Add another line to your script, just to be sure.

Summary

There are a few more data types out in the wild but numbers, strings, booleans, and dates are the core types you’ll encounter.

There are normally as.* and is.* functions for converting to a data type or checking if something is a given data type. You can use class() to uncover the data type too.

Keen beans, your time is now. Homework.