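Data Mining with R and PostgreSQL: Association Rules
In an earlier post I implemented a simplified association rule algorithm as a PostgreSQL function. Today I try the apriori algorithm from "arules", R's association rule package.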
- Connect to the database and read the data

    library(RPostgreSQL)
    # Load the PostgreSQL driver and open a connection
    drv <- dbDriver("PostgreSQL")
    con <- dbConnect(drv, user='postgres', dbname='steven', password='', host='127.0.0.1')
    # Send the query, then fetch all rows (n=-1) into a data.frame
    rs <- dbSendQuery(con,"select customer_id,brand from trans;")
    results <- fetch(rs,n=-1)
    # Release the result set once the rows have been fetched
    dbClearResult(rs)

  The structure of the trans table and some sample rows are shown below:

    CREATE TABLE public.maoye (
        customer_id text NULL,
        last_brand text NULL,
        ...
    );

    customer_id,brand
    090534,匡威
    090534,阿迪达斯
    090534,森达
    090573,他她
    090573,匡威
    ...
- Install and load the arules package

    install.packages("arules")
    library("arules")
- Convert the data.frame to transactions

    # Group brands by customer_id: each customer becomes one transaction
    transactions <- as(
      split(as.vector(results$brand), as.vector(results$customer_id)),
      "transactions"
    )
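To see what the split() call hands to arules, here is a small sketch in base R only. The data frame `toy` is a hypothetical stand-in built from the sample rows shown earlier, not the real query result, and the unique() step is an assumption about how repeated purchases of one brand should be handled.

```r
# Hypothetical stand-in for `results`, built from the sample rows above
toy <- data.frame(
  customer_id = c("090534", "090534", "090534", "090573", "090573"),
  brand       = c("匡威", "阿迪达斯", "森达", "他她", "匡威"),
  stringsAsFactors = FALSE
)

# split() groups the brand vector by customer_id, giving a named list
# with one character vector of brands per customer
baskets <- split(toy$brand, toy$customer_id)

# Assumption: drop repeated brands within a customer so that each
# basket behaves like a set of items
baskets <- lapply(baskets, unique)

print(names(baskets))    # the customer ids
print(lengths(baskets))  # number of distinct brands per customer
```

A named list of this shape is what gets coerced to the "transactions" class in the step above.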
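- Run the analysis (set support, confidence, and other parameters)

    # minlen = 2 and maxlen = 2 restrict the output to rules with exactly
    # two items in total, i.e. one brand on each side of the rule
    rules <- apriori(transactions,
                     parameter = list(support = 0.1, confidence = 0.6,
                                      maxlen = 2, minlen = 2))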
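For intuition about the support and confidence thresholds, both measures can be computed by hand for a single candidate rule. This is a base-R sketch on a hypothetical four-customer list (the brand letters A, B, C and the helper `support_of` are placeholders made up for illustration), and it is not how arules computes them internally.

```r
# Hypothetical baskets: one character vector of brands per customer
baskets <- list(
  c1 = c("A", "B"),
  c2 = c("A", "B", "C"),
  c3 = c("A", "C"),
  c4 = c("B")
)
n <- length(baskets)

# Fraction of baskets that contain every item in `items`
support_of <- function(items) {
  sum(vapply(baskets, function(b) all(items %in% b), logical(1))) / n
}

# Candidate rule {A} => {B}
# support: share of all baskets that contain both A and B
supp_rule <- support_of(c("A", "B"))
# confidence: among the baskets that contain A, the share that also contain B
conf_rule <- supp_rule / support_of("A")

cat("support:", supp_rule, "confidence:", conf_rule, "\n")
```

With support = 0.1 and confidence = 0.6 as thresholds, this toy rule would pass both, since two of the four baskets contain A and B together and two of the three baskets with A also contain B.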
- Inspect the results, or store them in the database

    # Inspect the rules
    inspect(rules)
    # Convert the rules to a data.frame and write it to the database
    associations <- as(rules, "data.frame")
    dbWriteTable(con, c("public", "arules"), associations)