Summary of encoding issues when importing and exporting data in SAS

The default (session) encoding of SAS is determined at startup by the configuration file that is loaded (the NLS settings).
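One way to check the current session encoding is sketched below. It uses PROC OPTIONS and the automatic macro variable SYSENCODING; the label text in the %PUT statement is only illustrative.

```sas
/* Print the value of the ENCODING system option to the log */
proc options option=encoding;
run;

/* The automatic macro variable SYSENCODING holds the session encoding name */
%put Session encoding: &sysencoding;
```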

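The session encoding cannot be changed after startup. Attempting to set the ENCODING option in a running session produces the following warning: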

WARNING 30-12: SAS option ENCODING is valid only at startup of the SAS System. The SAS option is ignored.
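As an illustration, an OPTIONS statement like the one below is the kind of command that triggers this warning when submitted in a running session. The value 'utf-8' is an assumption chosen for the example; the original command is not shown in the source.

```sas
/* Submitted in a running session: SAS ignores the option and writes WARNING 30-12 */
options encoding='utf-8';

/* To actually change the session encoding, supply the option at startup instead,
   for example on the command line (-encoding utf-8) or in the SAS configuration
   file. The exact mechanism depends on the installation. */
```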

Therefore, when importing or exporting data, we can instead specify the encoding of the file being read or written.

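For example, suppose the CSV file to be imported is encoded in UTF-8 and its variables are in Chinese. Declaring the file's encoding on import lets SAS read it correctly.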
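A minimal sketch of such an import follows. The file path, the fileref csvin, and the output data set work.imported are hypothetical names. The sketch also assumes the session encoding can represent Chinese characters; otherwise the transcoded values may be lost or garbled.

```sas
/* Hypothetical path; ENCODING= declares how the external file is encoded */
filename csvin 'C:\data\input_utf8.csv' encoding='utf-8';

/* Allow variable names that are not plain ASCII identifiers, such as Chinese names */
options validvarname=any;

proc import datafile=csvin
    out=work.imported
    dbms=csv
    replace;
    getnames=yes;  /* take variable names from the first row */
run;
```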

Likewise, a data set can be exported to a UTF-8 encoded file.
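The export direction can be sketched the same way, again by putting ENCODING= on the FILENAME statement. The path, the fileref csvout, and the data set work.imported are hypothetical.

```sas
/* Hypothetical path; the output file will be written in UTF-8 */
filename csvout 'C:\data\output_utf8.csv' encoding='utf-8';

proc export data=work.imported
    outfile=csvout
    dbms=csv
    replace;
run;
```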

This example creates a SAS data set from an external file. The external file’s encoding is in UTF-8, and the current SAS session encoding is Wlatin1. By default, SAS assumes that the external file is in the same encoding as the session encoding, which causes the character data to be written to the new SAS data set incorrectly.
To tell SAS what encoding to use when reading the external file, specify the ENCODING= option. When you tell SAS that the external file is in UTF-8, SAS then transcodes the external file from UTF-8 to the current session encoding when writing to the new SAS data set. Therefore, the data is written to the new data set correctly in Wlatin1.
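The DATA step described above might look like the following sketch. The path, the fileref extfile, the data set name, and the variables are all hypothetical; the relevant part is the ENCODING= option on the INFILE statement.

```sas
/* Hypothetical external file that is encoded in UTF-8 */
filename extfile 'C:\data\external_utf8.txt';

data work.new;
    /* Tell SAS the file is UTF-8 so it transcodes to the session encoding */
    infile extfile encoding='utf-8';
    /* Hypothetical variables, read with simple list input */
    input make $ model $ year;
run;
```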
If no encoding is specified, SAS assumes that imported and exported files use the same encoding as the session.
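In addition, an encoding can be specified for SAS data sets themselves.
For example, the encoding of an existing SAS data set can be converted.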
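One possible way to do this, as a sketch rather than the only method, is to rewrite the data set with the ENCODING= data set option on the output. The library paths and the names src, dst, and mydata are hypothetical.

```sas
/* Hypothetical source and target libraries */
libname src 'C:\data\src';
libname dst 'C:\data\dst';

/* Write a copy of the data set whose encoding is UTF-8 */
data dst.mydata(encoding='utf-8');
    set src.mydata;
run;

/* Check the result: PROC CONTENTS reports the data set encoding */
proc contents data=dst.mydata;
run;
```

Because multi-byte encodings such as UTF-8 can need more bytes per character, character variables may have to be lengthened to avoid truncation after conversion.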

 
