SSH tunnels on Windows for R
Recently I’ve had to get to grips with SSH tunnels. SSH tunnels are really useful for maintaining remote network integrity and work in a secure fashion. It is, however, a pain to open PuTTY and log in all the time, mainly because I couldn’t script it in R! It’s been a trial, but like most things it turned out to be pretty simple in the end so I thought I’d share it with you.
I generated a key in Rstudio first using openSSH. I’d put this key on GitHub and a few other places so I didn’t want to change it. To use the PuTTY tools, you have to use a different key format than the one openSSH creates so I used pretty much this same method to convert my existing key.
You need to run puttygen.exe, generate a key with a passphrase, and store the public version of the key on the remote machine. DigitalOcean have a really good article on generating keys and storing them that I recommend you follow.
Authenticate keys on startup
To avoid interactivity of any sort in my R scripts, I need some way of storing the passphrase for my key. I could generate a key without a password but that’s just a major security flaw I would prefer not to introduce. So, how can I get my keys to be pre-authenticated by me?
To tackle this persistence, it turns out our buddy PuTTY has a friend called
pageant.exe. Pageant allows us to stash passphrases for our keys whilst pageant is running – but it will lose keys when you reboot.
The most important things in these two articles were:
- In the Run dialog type shell:startup and create a shortcut to your Pageant.exe
- Edit your shortcut to start in your .ssh directory so you can have an easier time referencing multiple key files
Use plink to build your SSH tunnel
Whilst down the rabbit hole, I discovered just in passing via a beanstalk article that there’s actually been a command line interface for PuTTY called
plink. D’oh! This changed the whole direction of the solution to what I present throughout.
plink.exe as the command line interface for PuTTY we can then connect to our remote network using the key pre-authenticated via pageant. As a consequence, we can now use the
shell() command in R to use plink. We can then connect to our database using the standard Postgres driver.
library(RPostgreSQL) drv <- dbDriver("PostgreSQL") cmd<- paste0( "plink ", # use key and run in background process " -i ../.ssh/id_rsa -N -batch -ssh", # port forwarding " -L 5432:127.0.0.1:5432" # location of db " firstname.lastname@example.org" ) shell( cmd, wait=FALSE) conn <- dbConnect( drv, host = "127.0.0.1", port=5432, dbname="mydb" ) dbListTables(conn)
I went down the rabbit hole!
When picking up new (for you) technologies it can become an exercise in going down the rabbit hole – a reference to Alice in Wonderland where you’re trapped, confused, and running around looking for a way out.
Whilst going through this process I looked at the following:
These would be really cool to get working somehow. It would enable a Powershell script to run on startup that could securely retrieve my key passphrase out of the encrypted credentials database on Windows and pass them to
ssh-agent so that I could work in an openSSH environment. It was, however, a PITA, with some bugs and limitations due to new process spawns, versions of Posh-Git and more. Plus, the more moving parts, the more unlikely something is to work.
So this is a relatively simple method but not all that simple, and I still have to think about how I can make this sensibly work on different OS in order to be able to do the usual testing and continuous integration practices.
I know more about SSH than I ever wished to so if you’re struggling, give me a shout and I’ll do my best to help. If you’re much further along than me, why not comment on how you’ve solved this challenge – I’d love to hear better ways of doing this!
Image from cat-o-morphism