Stress-Testing Association Rules in R with arules

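The previous post covered how to mine association rules with R and PostgreSQL. This time I stress-test the Apriori algorithm on real data (system configuration: Windows 2008 64-bit, SSD, 128 GB RAM), with 620 items and 163,763 transactions. Minimum confidence and minimum support are both set to 0.00001 (values this low have no practical meaning), with minlen=2 and maxlen=5. The run outputs more than 350 million rules, and the resulting rules object is reported to occupy 16.6 GB.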
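For reference, a call with these parameters might look like the sketch below. The real dataset is not available here, so the `transactions` object is built from a hypothetical toy sample (20 made-up brand names, random baskets); only the `parameter` list mirrors the settings described above. The transaction count is kept at 163,763 so that the absolute support count matches the one in the log.

```r
library(arules)

# Hypothetical stand-in for the real data: random baskets over 20 made-up items.
set.seed(42)
items <- paste0("brand_", 1:20)
baskets <- lapply(seq_len(163763), function(i) {
  sample(items, size = sample(2:6, 1))  # 2 to 6 distinct items per basket
})
transactions <- as(baskets, "transactions")

# Same thresholds as in the stress test: very low support and confidence,
# rules containing between 2 and 5 items in total.
rules <- apriori(
  transactions,
  parameter = list(
    support    = 1e-05,
    confidence = 1e-05,
    minlen     = 2,
    maxlen     = 5,
    target     = "rules"
  )
)

length(rules)                               # number of rules found
print(object.size(rules), units = "auto")   # memory held by the rules object
```

On the toy data this finishes quickly because there are only 20 items; with 620 items the same settings explode. Raising `support`, as the arules warning recommends, is the usual way to keep the output manageable.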

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target   ext
      1e-05    0.1    1 none FALSE            TRUE   1e-05      2      5  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 1 

Warning in apriori(transactions, parameter = list(support = 1e-05, confidence = 1e-05,  :
  You chose a very low absolute support count of 1. You might run out of memory! Increase minimum support.

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[620 item(s), 163763 transaction(s)] done [0.06s].
sorting and recoding items ... [614 item(s)] done [0.01s].
creating transaction tree ... done [0.07s].
checking subsets of size 1 2 3 4 5 done [37.09s].
writing ... [350487111 rule(s)] done [91.73s].
creating S4 object  ... done [137.37s].

 

Next, I increased maxlen to 6. The run failed with an out-of-memory error:

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target   ext
      1e-05    0.1    1 none FALSE            TRUE   1e-05      2      6  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 1 

Warning in apriori(transactions, parameter = list(support = 1e-05, confidence = 1e-05,  :
  You chose a very low absolute support count of 1. You might run out of memory! Increase minimum support.

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[620 item(s), 163763 transaction(s)] done [0.06s].
sorting and recoding items ... [614 item(s)] done [0.01s].
creating transaction tree ... done [0.08s].
checking subsets of size 1 2 3 4 5 6 done [85.17s].
writing ... 
Error in apriori(transactions, parameter = list(support = 1e-05, confidence = 1e-05,  : 
  not enough memory. Increase minimum support!
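A rough back-of-the-envelope calculation, using only the figures reported above, hints at why one extra level is enough to exhaust 128 GB. The sketch assumes that "16.6G" means GiB and that the absolute support count is obtained by truncating support × transactions (this matches the count of 1 in the log, but it is an assumption about how arules rounds). The candidate ratio is a combinatorial upper bound, not the actual number of frequent itemsets.

```r
n_transactions <- 163763
n_items        <- 614        # items remaining after "sorting and recoding"
min_support    <- 1e-05

# 1e-05 * 163763 = 1.63763; truncated, this gives the count of 1 shown in the log
# (assumption: the count is truncated rather than rounded up).
abs_support <- floor(min_support * n_transactions)

# Memory per rule in the successful maxlen = 5 run
# (assumption: "16.6G" is 16.6 GiB).
n_rules        <- 350487111
rules_bytes    <- 16.6 * 1024^3
bytes_per_rule <- rules_bytes / n_rules

# Upper bound on how much the candidate space grows from size 5 to size 6:
# choose(n, 6) / choose(n, 5) = (n - 5) / 6
growth <- choose(n_items, 6) / choose(n_items, 5)

abs_support
bytes_per_rule
growth
```

With a support count of 1, every itemset that occurs even once is frequent, so nothing is pruned. At roughly 50 bytes per rule, even a small fraction of that worst-case growth factor would push the result far past the available memory.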

 

I ran the same dataset through SAS Enterprise Miner Workstation 13.2, and it also failed. The code is as follows:

libname datapath "E:\lib\mba\data";

data mba;
set datapath.customer_brands;
run;

/******************** Association analysis ****************/

proc dmdb batch data=mba out=dmassoc dmdbcat=catassoc;
id customer_id ;
class brand(desc);
run;
proc assoc data=mba dmdbcat=catassoc
out=datassoc(label='Output from Proc Assoc')
items=6 support=1;
cust customer_id;
target brand;
run;
