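Multithreaded Microsoft R Open or single-threaded R: which would you choose?

Enough chatter; here are the benchmark results first.

Elapsed time (lower is better)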

| Test | R-3.3.1 (1 thread) | Microsoft R Open-3.3.1 (1 thread) | Microsoft R Open-3.3.1 (4 threads) | Microsoft R Open-3.3.1 (8 threads) |
|---|---|---|---|---|
| Matrix multiplication | 215.204 | 18.399 | 4.971 | 2.724 |
| Cholesky factorization | 31.390 | 3.248 | 0.906 | 0.584 |
| QR decomposition | 6.299 | 4.080 | 4.045 | 4.611 |
| Singular value decomposition | 71.764 | 12.886 | 4.162 | 3.144 |
| Principal component analysis | 60.527 | 12.603 | 4.918 | 4.094 |
| Linear discriminant analysis | 14.809 | 4.989 | 3.369 | 3.125 |

[Figure: elapsed-time]

Performance (higher is better)

| Test | R-3.3.1 (1 thread) | Microsoft R Open-3.3.1 (1 thread) | Microsoft R Open-3.3.1 (4 threads) | Microsoft R Open-3.3.1 (8 threads) |
|---|---|---|---|---|
| Matrix multiplication | 1 | 11.70 | 43.29 | 79.00 |
| Cholesky factorization | 1 | 9.66 | 34.65 | 53.75 |
| QR decomposition | 1 | 1.54 | 1.56 | 1.37 |
| Singular value decomposition | 1 | 5.57 | 17.24 | 22.83 |
| Principal component analysis | 1 | 4.80 | 12.31 | 14.78 |
| Linear discriminant analysis | 1 | 2.97 | 4.40 | 4.74 |
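
The performance figures above are consistent with dividing the single-threaded R-3.3.1 elapsed time by each configuration's elapsed time. The sketch below recomputes them from the elapsed-time numbers in base R. It is my own reconstruction from the published figures, not the source of urbanekPerformance; the short column names are made up for illustration.

```r
# Elapsed times copied from the first table (rows: tests, columns: configurations).
# The column names are hypothetical shorthand, not names used by version.compare.
elapsed <- matrix(
  c(215.204, 18.399, 4.971, 2.724,
     31.390,  3.248, 0.906, 0.584,
      6.299,  4.080, 4.045, 4.611,
     71.764, 12.886, 4.162, 3.144,
     60.527, 12.603, 4.918, 4.094,
     14.809,  4.989, 3.369, 3.125),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("Matrix multiplication", "Cholesky factorization", "QR decomposition",
      "Singular value decomposition", "Principal component analysis",
      "Linear discriminant analysis"),
    c("R_1", "MRO_1", "MRO_4", "MRO_8")
  )
)

# Relative performance: baseline (first column) time divided by each time.
# The baseline vector is recycled down every column, so each row is divided
# into its own baseline.
relative <- round(elapsed[, 1] / elapsed, 2)
print(relative)
```

Read this way, QR decomposition is the outlier: its ratio barely moves between 1, 4 and 8 threads.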

Having seen these tempting results, let's install it and run the benchmark ourselves.

Installation

#Debian/Ubuntu
wget https://mran.revolutionanalytics.com/install/mro/3.3.1/microsoft-r-open-3.3.1.tar.gz
tar xf microsoft-r-open-3.3.1.tar.gz
cd microsoft-r-open
./install.sh

#Windows
Download and install https://mran.revolutionanalytics.com/install/mro/3.3.1/microsoft-r-open-3.3.1.msi

Testing

library("version.compare")
library("knitr")

# Scale factor passed through to the benchmark
scale.factor <- 1.0

# Locate the Rscript executables of every installed R 3.3.1 build
r <- switch(Sys.info()[["sysname"]],
            Linux = {
              rscript <- findRscript()

              # Ask each Rscript for its version and keep only the 3.3.1 ones
              rv <- version.time(rscript, {
                as.character(getRversion())
              })
              idx <- which(unlist(rv$results) == "3.3.1")
              rscript[idx]
            },
            Windows = findRscript(version = "3.3.1.*x64")
)

# Run the benchmark for each R build with 1, 4 and 8 threads
test.results <- RevoMultiBenchmark(rVersions = r,
                                   threads = c(1, 4, 8),
                                   scale.factor = scale.factor)

# Elapsed time: table and plot
kable(test.results)
plot(test.results, theme_size = 8, main = "Elapsed time")

# Relative performance: table and plot
kable(urbanekPerformance(test.results), digits = 2)
plot(urbanekPerformance(test.results), theme_size = 8, main = "Relative Performance")
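
If you only want a quick feel for the numbers before installing version.compare, a rough timing can be done in base R with system.time. The sketch below is not the benchmark that RevoMultiBenchmark runs: the matrix size, the operations chosen and the thread count are all my own assumptions, and the call to setMKLthreads from the RevoUtilsMath package is only attempted when that package is present (as I understand it, that is the case on Microsoft R Open).

```r
# A rough, self-contained timing sketch; NOT the version.compare benchmark.
set.seed(1)
n <- 300          # hypothetical matrix size, chosen to finish quickly
threads <- 4      # hypothetical thread count

# Assumption: on Microsoft R Open, RevoUtilsMath provides setMKLthreads().
# On plain R the package is absent and this step is skipped.
if (requireNamespace("RevoUtilsMath", quietly = TRUE)) {
  RevoUtilsMath::setMKLthreads(threads)
}

a <- matrix(rnorm(n * n), n, n)

# Return the wall-clock ("elapsed") time of calling f()
time_op <- function(f) unname(system.time(f())["elapsed"])

timings <- c(
  matmul   = time_op(function() a %*% a),
  cholesky = time_op(function() chol(crossprod(a))),
  qr       = time_op(function() qr(a)),
  svd      = time_op(function() svd(a))
)
print(timings)
```

Running the same script under both R and Microsoft R Open gives a crude comparison; the absolute numbers will depend on the machine and on n.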

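Known issues

  • Only 64-bit platforms are supported
  • Windows 2008 is not supported. More precisely, installation fails on some machines (both Windows 2008 and Windows 7) for reasons unknown. Copying the installed files over from a machine where installation succeeded does let it run, but I have not tested this any further
  • On newer Debian/Ubuntu releases libpng12.so.0 is missing; this one is solved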
    unable to load shared object '/usr/lib64/microsoft-r/3.3/lib64/R/modules//R_X11.so':
      libpng12.so.0: cannot open shared object file: No such file or directory
    
    #Workaround: symlink a newer libpng.so in its place; I am on Debian testing and this worked for me
    ln -s /usr/lib/x86_64-linux-gnu/libpng.so /usr/lib64/microsoft-r/3.3/lib64/R/lib/libpng12.so.0

     

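  • MRO is not faster for every function (QR decomposition in the tables above gains almost nothing from extra threads)
  • On Windows the installation path is not found, so configure it manually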
    Error in readRegistry(ptn, maxdepth = 3) : 
      Registry key 'SOFTWARE\Revolution' not found
    #Change this code
    Windows = findRscript(version = "3.3.1.*x64")
    #to (adjust the paths to match your actual installation)
    Windows = c("D:\\Program Files\\Microsoft\\MRO-3.3.1\\bin\\x64\\Rscript.exe"
    ,"C:\\Program Files\\R\\R-3.3.1\\bin\\x64\\Rscript.exe")
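
Since hard-coded paths are easy to get wrong, it may help to check them before handing them to the benchmark. The sketch below filters a candidate list down to the files that actually exist; the variable names are made up for illustration, and the two paths are simply the ones from the workaround above, so replace them with your own.

```r
# Candidate Rscript locations (taken from the workaround above; adjust as needed).
candidates <- c(
  "D:\\Program Files\\Microsoft\\MRO-3.3.1\\bin\\x64\\Rscript.exe",
  "C:\\Program Files\\R\\R-3.3.1\\bin\\x64\\Rscript.exe"
)

# Keep only the paths that exist on this machine
rscripts <- candidates[file.exists(candidates)]

if (length(rscripts) == 0) {
  warning("None of the candidate Rscript paths exist; check the install locations.")
} else {
  message("Using: ", paste(rscripts, collapse = ", "))
}
```

The resulting rscripts vector can then take the place of the hard-coded vector in the Windows branch.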

My test results

[Figures: mro003, mro004, mro005, mro006_win7, mro007_win7, mro008_win7, mro009_win7, mro010_win7_rstudio, mro011_win7]

How to choose

If the packages that Microsoft R Open provides are enough for your work, use it; if not, fall back to R.

 

References

https://github.com/andrie/version.compare/blob/master/inst/doc/version.compare.html