Apache Zeppelin: a multi-language, multi-purpose notebook
Apache Zeppelin: a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and …
With SparkR installed successfully, let's try reading data from PostgreSQL:
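    # Point R at the local Spark installation and load the bundled SparkR package
    Sys.setenv(SPARK_HOME = "/Users/steven/Applications/spark2")
    .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
    library(SparkR)

    # Maven coordinate of the PostgreSQL JDBC driver, fetched when the session starts
    d.pg <- "org.postgresql:postgresql:9.4.1209.jre7"
    sc <- sparkR.session(master = "local[2]", sparkPackages = c(d.pg))

    # Connection URL (credentials passed as URL parameters) and driver class
    url <- "jdbc:postgresql://localhost:5432/steven?user=postgres&password="
    driver <- "org.postgresql.Driver"

    # read.jdbc() takes the URL and table name; driver is passed as a connection property
    df.pg <- read.jdbc(url = url, tableName = "public.mtcars", driver = driver)
    printSchema(df.pg)
    collect(df.pg)

    # Optional: register the table as a temporary view and query it with SQL
    # createOrReplaceTempView(df.pg, "mtcars")
    # sql("select * from mtcars")

    sparkR.session.stop()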
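As the code shows, the two key pieces are the sparkPackages argument and the read.jdbc call. With minor changes, the same code also works for MySQL and any other database that supports JDBC connections.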
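
As a rough sketch of what "minor changes" could look like for MySQL, the version below swaps in the Connector/J package coordinate, driver class, and URL scheme. The connector version, host, port, database name, and credentials are all assumptions for illustration, and read_jdbc_table is a hypothetical helper, not something from the original post. It assumes SPARK_HOME and .libPaths() are set up as in the PostgreSQL example.

```r
# Hypothetical MySQL settings: the connector version, host, port,
# database name, and credentials are assumptions for illustration.
mysql <- list(
  package = "mysql:mysql-connector-java:5.1.39",  # Maven coordinate of Connector/J
  driver  = "com.mysql.jdbc.Driver",              # driver class in Connector/J 5.x
  url     = "jdbc:mysql://localhost:3306/steven"
)

# Hypothetical helper: start a local session with the driver package
# and read one table over JDBC. Needs SparkR on the library path.
read_jdbc_table <- function(cfg, table, user, password) {
  SparkR::sparkR.session(master = "local[2]", sparkPackages = cfg$package)
  # user and password are passed as JDBC connection properties
  SparkR::read.jdbc(cfg$url, table, driver = cfg$driver,
                    user = user, password = password)
}

# Usage (requires a running MySQL server with an mtcars table):
# df.my <- read_jdbc_table(mysql, "mtcars", user = "root", password = "")
# SparkR::printSchema(df.my)
# SparkR::sparkR.session.stop()
```

Only the three values in the list differ from the PostgreSQL version; here the credentials are passed as named connection properties rather than embedded in the URL, which read.jdbc also accepts.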