Tuesday, October 11, 2011

Connecting to Postgres from R on Windows 7 64bit



Here is some quick advice on how to get up and running with R + Postgres on Windows 7 64bit. There are several possibilities, but most have fatal flaws. Here is what worked for me:

Add jvm.dll to your PATH

rJava, the R<->Java bridge, will need jvm.dll, but R will have trouble finding that DLL. It resides in a folder like

C:\Program Files\Java\jdk1.6.0_25\jre\bin\server

or 

C:\Program Files\Java\jre6\jre\bin\client

Wherever yours is, add that directory to your windows PATH variable. (Windows -> "Path" -> "Edit environment variables to for your account" -> PATH -> edit the value.)

You may already have Java on your PATH. If so you should find the client/server directory in the same Java "home" dir as the one already on your PATH.


To be safe, make sure your architectures match.If you have Java in "Program Files", it is 64-bit, so you ought to run R64. If you have Java in "Program Files (x86)", that's 32-bit, so you use plain 32-bit R.

Re-launch R from the Start Menu


If R is running, quit.

From the Start Menu , Start R / RGUI, RStudio. This is very important, to make R pick up your PATH changes.

 

Install rJava 0.9.2.

Earlier versions do not work!  Mirrors are not up-to-date, so go to the source at www.rforge.net: http://www.rforge.net/rJava/files/. Note the advice there
Please use
install.packages('rJava',,'http://www.rforge.net/')
to install.”
That is almost correct. This actually works:

install.packages('rJava', .libPaths()[1], 'http://www.rforge.net/')

Watch the punctuation! The mysterious “.libPaths()[1],” just tells R to install the package in the primary library directory. For some reason, leaving the value blank doesn’t work, even though it should default.

Install RpgSQL.

install.packages('RpgSQL')

 

Set Classpath. 

Set a classpath to use for pgSQL:

jdbcClasspath = "C:\\Program Files (x86)\\PostgreSQL\\pgJDBC\\postgresql-8.4-702.jdbc4.jar"
);
Do NOT use .jaddClassPath(). The pgSQL method ignores .jclassPath()

 

Create a DBI driver.

The Examples code in help("pgSQL") has errors (as does most R documentation, sadly). Here is how to set up your driver instance:
myPgSQL = pgSQL(driverClass='org.postgresql.Driver', 
                jdbcClasspath, 
                identifier.quote="\"");

Make a Connection.

connection <- dbConnect(myPgSQL,
  user = "your_username",  
  password = "your_password", 
  dbname = "your_db_name", 
  host = "your_db_host", 
  port = 5432
);

Query!

dbGetQuery(connection, "select datname from pg_database");
dbGetQuery(connection, 
  "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'");
# dbGetQuery returns a Data Frame, very nice.  
result = dbGetQuery(connection, "select * from markets"); 
View(result);  
head(result, n=1)  # Show 1 just row of data
head(result$name, n=1)  # Show 1 just row of data
help("pgSQL")    ; # To see more example queries
dbDisconnect(con); # When you are done.
 
 
Let me know how it goes for you!
 

No comments:

Post a Comment