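Data Mining with R and PostgreSQL: Association Rules
In an earlier post I implemented a simplified association rule algorithm with PostgreSQL functions. Today I try the apriori algorithm from "arules", R's association rules package.
- Connect to the database and read the data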
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, user='postgres', dbname='steven', password='', host='127.0.0.1')
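rs <- dbSendQuery(con, "select customer_id, brand from trans;")
results <- fetch(rs, n = -1)  # n = -1 fetches all remaining rows
dbClearResult(rs)             # release the result set once the rows are fetched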
The structure of the trans table and some sample rows are shown below:
CREATE TABLE public.trans (
 customer_id text NULL,
 brand text NULL,
...
);
customer_id,brand
090534,匡威
090534,阿迪达斯
090534,森达
090573,他她
090573,匡威
...
- Install and load the arules package
install.packages("arules")
library("arules")
- Convert the data.frame to transactions
transactions <- as( split(as.vector(results$brand), as.vector(results$customer_id)), "transactions" )
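To see what the split() call hands to as(..., "transactions"), here is a minimal base R sketch on a toy data frame. The rows, customer ids and brand spellings are hypothetical stand-ins for `results`, not real data. It also drops repeated (customer_id, brand) pairs first, on the assumption that a customer can appear with the same brand more than once; depending on the arules version, duplicated items inside one transaction may trigger a warning or an error during coercion.

```r
# Toy stand-in for `results` (hypothetical rows, not real data)
toy <- data.frame(
  customer_id = c("c1", "c1", "c1", "c2", "c2", "c2"),
  brand       = c("converse", "adidas", "converse", "tata", "converse", "tata"),
  stringsAsFactors = FALSE
)

# Drop repeated (customer_id, brand) pairs before grouping
toy <- unique(toy)

# One character vector of brands per customer: the list shape that
# as(..., "transactions") expects
baskets <- split(toy$brand, toy$customer_id)
print(baskets)
```

Each list element is one customer's basket, and the element names become the transaction ids.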
- Run the analysis (set parameters such as support and confidence)
rules <- apriori(transactions, parameter = list(support = 0.1, confidence = 0.6, maxlen = 2, minlen=2))
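With minlen = 2 and maxlen = 2, only rules with exactly one item on each side are kept, and a rule must appear in at least 10% of transactions (support) and hold in at least 60% of the transactions that contain its left-hand side (confidence). The sketch below computes these measures by hand in base R for a single rule, just to show what the thresholds mean. The baskets and the helper `support_of` are hypothetical and are not part of arules.

```r
# Hypothetical baskets, one character vector per customer
baskets <- list(
  c1 = c("converse", "adidas"),
  c2 = c("converse", "tata"),
  c3 = c("converse", "adidas", "senda"),
  c4 = c("tata")
)

# Fraction of baskets that contain every item in `items`
support_of <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Rule {adidas} => {converse}
supp_rule <- support_of(c("adidas", "converse"), baskets)  # share of baskets with both
conf_rule <- supp_rule / support_of("adidas", baskets)      # among baskets with adidas
lift_rule <- conf_rule / support_of("converse", baskets)    # relative to converse overall

print(c(support = supp_rule, confidence = conf_rule, lift = lift_rule))
```

On this toy data the rule would pass both thresholds used above, since its support is 0.5 and its confidence is 1.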
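- Inspect the results, or store them in the database
# Inspect the rules
inspect(rules)
# Convert the rules to a data.frame and write it to the database
associations <- as(rules, "data.frame")
dbWriteTable(con, c("public", "arules"), associations)
# Close the connection when done
dbDisconnect(con)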