1 LENGTH: sets the storage length of a character variable;
LENGTHC: returns the storage size of a character string
data chars1;
run;
Here storage_length is 3.
data chars2;
run;
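Here storage_length is 7. So when you use LENGTH, pay attention to where the statement appears: it only takes effect if it comes before the variable is first used.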
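The bodies of the two DATA steps are not shown above. A minimal sketch of what they might look like, assuming a variable named string that is assigned 'abc' and a PUT statement that writes the result to the log (these details are assumptions, not the original code):

```sas
/* Sketch: LENGTH statement placed AFTER the first assignment */
data chars1;
   string = 'abc';                    /* the length is fixed at 3 here        */
   length string $ 7;                 /* too late: SAS keeps the length of 3  */
   storage_length = lengthc(string);
   put storage_length=;               /* writes storage_length=3 to the log   */
run;

/* Sketch: LENGTH statement placed BEFORE the first assignment */
data chars2;
   length string $ 7;                 /* the length is fixed at 7 here        */
   string = 'abc';
   storage_length = lengthc(string);
   put storage_length=;               /* writes storage_length=7 to the log   */
run;
```

In the first step SAS also writes a warning to the log saying that the length of the character variable has already been set.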
2 COMPBL: converts multiple consecutive blanks into a single blank
data multiple;
datalines;
Ron Cody
89 Lazy Brook Road
Flemington NJ 08822
Bill Brown
28 Cathy Street
North City NY 11518
;
title "Listing of Data Set MULTIPLE";
proc print data=multiple noobs;
run;
Result:
Ron Cody 89 Lazy Brook Road Flemington NJ 08822
Bill Brown 28 Cathy Street North City NY 11518
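The INPUT and assignment statements of this step are missing from the listing above, and the extra blanks in the data lines have been lost. A minimal sketch of the idea, assuming a single hypothetical variable address and made-up spacing in the data lines:

```sas
/* Sketch only: variable name, width, and spacing are assumptions */
data multiple_sketch;
   infile datalines truncover;
   input address $40.;                /* read the line with its extra blanks  */
   address = compbl(address);         /* collapse each run of blanks to one   */
datalines;
89     Lazy   Brook Road
28 Cathy      Street
;

proc print data=multiple_sketch noobs;
run;
```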
3 COMPRESS: removes specified characters from a string.
data phone;
datalines;
(908)235-4490
(201) 555-77 99
;
title "Listing of Data Set PHONE";
proc print data=phone noobs;
run;
Result:
Listing of Data Set PHONE
(908)235-4490 (908)235-4490 9082354490
(201) 555-77 99 (201)555-7799 2015557799
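The three columns in the listing are consistent with the original value, the value with blanks removed, and the value with parentheses, hyphens, and blanks removed. A minimal sketch that would produce such output, assuming the variable names phone, phone1, and phone2 (the names are not given in the source):

```sas
/* Sketch only: variable names are assumptions */
data phone;
   input phone $ 1-15;
   phone1 = compress(phone);          /* no second argument: removes blanks   */
   phone2 = compress(phone,'(-) ');   /* removes ( ) - and blanks             */
datalines;
(908)235-4490
(201) 555-77 99
;
```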
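4 VERIFY: checks whether a string contains any character that is not in the verify string, and returns the position of the first such character (0 if there is none).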
data verify;
datalines;
001 acbed
002 abxde
003 12cce
004 abc e
;
title "Listing of Data Set VERIFY";
proc print data=verify noobs;
run;
Result:
id answer position
001 acbed 0
002 abxde 3
003 12cce 1
004 abc e 4
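The listed positions are what VERIFY returns when the valid characters are a through e. A minimal sketch of the missing statements, assuming the column layout of the data lines and the verify string 'abcde' (inferred from the output, not shown in the source):

```sas
/* Sketch only: column positions and the verify string are assumptions */
data verify;
   input @1 id $3. @5 answer $5.;
   position = verify(answer,'abcde'); /* first character not in 'abcde'       */
datalines;
001 acbed
002 abxde
003 12cce
004 abc e
;
```

The embedded blank in the last record counts as an invalid character, which is why its position is 4.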
Take care when using VERIFY: if the string being checked contains blanks, the result can be surprising:
data trailing;
run;
Here POS=4, because SAS pads string with trailing blanks up to its storage length. To solve this problem, use the TRIM function:
pos = verify(trim(string),'abcde');
Now POS=0.
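The body of the trailing step is not shown. A minimal sketch of both cases side by side, assuming string is declared with a length of 10 and holds 'abc' (the length and the two result names are assumptions):

```sas
/* Sketch only: the declared length and variable names are assumptions */
data trailing;
   length string $ 10;
   string = 'abc';                               /* stored as 'abc' + 7 blanks */
   pos_padded  = verify(string,'abcde');         /* 4: the first padding blank */
   pos_trimmed = verify(trim(string),'abcde');   /* 0: blanks removed first    */
   put pos_padded= pos_trimmed=;
run;
```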
5 SUBSTR: extracts part of a longer string.
data pieces_parts;
datalines;
NYXXXX123
NJ1234567
;
title "Listing of Data Set PIECES_PARTS";
proc print data= pieces_parts noobs;
run;
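The statements that call SUBSTR are missing from this step. A minimal sketch, assuming the first two characters of each record are a state code and the last three are a number (the variable names and positions are assumptions):

```sas
/* Sketch only: variable names and positions are assumptions */
data pieces_parts;
   input id $ 1-9;
   length state $ 2;                       /* otherwise state inherits id's length */
   state = substr(id,1,2);                 /* 2 characters starting at position 1  */
   num   = input(substr(id,7,3),3.);       /* characters 7-9, converted to numeric */
datalines;
NYXXXX123
NJ1234567
;
```

Declaring the length of state first matters because, without it, the result of SUBSTR takes the length of its first argument.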
Another use of SUBSTR: on the left of the equal sign, it assigns a value to selected characters of a string:
data pressure;
datalines;
120 80 180 92 200 110
;
title "Listing of Data Set PRESSURE";
proc print data=pressure noobs;
run;
Here we set the fourth character of sbp_chk and dbp_chk to *.
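The step body is not shown, so the condition under which the asterisk is written is unknown. A minimal sketch, assuming numeric readings sbp and dbp and hypothetical cutoffs of 160 and 90:

```sas
/* Sketch only: the cutoffs 160 and 90 are assumptions */
data pressure;
   input sbp dbp @@;                       /* several pairs per data line      */
   length sbp_chk dbp_chk $ 4;
   sbp_chk = put(sbp,3.);                  /* numeric to character, 3 columns  */
   dbp_chk = put(dbp,3.);
   if sbp gt 160 then substr(sbp_chk,4,1) = '*';   /* overwrite character 4    */
   if dbp gt 90  then substr(dbp_chk,4,1) = '*';
datalines;
120 80 180 92 200 110
;
```

Used on the left of the equal sign, SUBSTR replaces only the selected characters and leaves the rest of the variable unchanged.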
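6 SCAN: extracts words or short strings from a longer string:
Syntax: SCAN(char_var,n,'list-of-delimiters'); n selects the nth word of char_var. If char_var contains fewer than n words, the result is blank; if n is negative, SCAN counts words from right to left.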
data parse;
datalines;
this line,contains!five.words
abcdefghijkl xxx yyy
;
title "Listing of Data Set PARSE";
proc print data=parse noobs;
run;
Result:
Listing of Data Set PARSE
piece1 piece2 piece3 piece4 piece5
this line contains five words
abcdefghij xxx yyy
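The loop that fills piece1 through piece5 is missing from the step. A minimal sketch that is consistent with the listing, assuming the delimiters are comma, period, exclamation mark, and blank, and that each piece has a length of 10 (which would explain the truncated abcdefghij):

```sas
/* Sketch only: the delimiter list and the length of 10 are assumptions */
data parse;
   input long_str $ 1-80;
   array pieces[5] $ 10 piece1-piece5;
   do i = 1 to 5;
      pieces[i] = scan(long_str,i,',.! ');  /* ith word, or blank if none */
   end;
   drop long_str i;
datalines;
this line,contains!five.words
abcdefghijkl xxx yyy
;
```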
SCAN: getting the last word of a string:
data first_last;
datalines;
Jeff W. Snoker (908)782-4382
Raymond Albert (732)235-4444
Alfred Edward Newman (800)123-4321
Steven J. Foster (201)567-9876
Jose Romerez (516)593-2377
;
title "Names and Phone Numbers in Alphabetical Order (by Last Name)";
proc report data=first_last nowd;
run;
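The statements that extract the last name are not shown. A minimal sketch of the negative-count form of SCAN, assuming a hypothetical variable last_name and a pipe-delimited layout chosen here only to keep the example self-contained:

```sas
/* Sketch only: the delimiter layout and variable names are assumptions */
data first_last_sketch;
   length name $ 30 last_name $ 15;
   infile datalines dsd dlm='|';
   input name $ phone : $13.;
   last_name = scan(name,-1,' ');          /* -1 = first word from the right */
datalines;
Jeff W. Snoker|(908)782-4382
Raymond Albert|(732)235-4444
Alfred Edward Newman|(800)123-4321
;

proc sort data=first_last_sketch;
   by last_name;
run;
```

Counting from the right avoids having to know whether a name has a middle name or initial.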
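8 INDEX: returns the position of the second argument (a string) within the first argument (a string), or 0 if it is not found
INDEXC: returns the first position in the first argument at which any character of the second argument occurs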
data locate;
datalines;
abcxyz1234
1234567890
abcx1y2z39
abczzzxyz3
;
title "Listing of Data Set LOCATE";
proc print data=locate noobs;
run;
Result:
obs first first_c
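Neither the function calls nor the output rows survive in the listing above. A minimal sketch, assuming the search string is 'xyz' for INDEX and the characters x, y, z for INDEXC (an assumption based on the data lines):

```sas
/* Sketch only: the search arguments are assumptions */
data locate;
   input string $10.;
   first   = index(string,'xyz');          /* position of the substring 'xyz'   */
   first_c = indexc(string,'xyz');         /* first position of x, y, or z      */
datalines;
abcxyz1234
1234567890
abcx1y2z39
abczzzxyz3
;
```

Under these assumptions, first and first_c would be 4 and 4, 0 and 0, 0 and 4, and 7 and 4 for the four records.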
9 UPCASE: converts all letters to uppercase
LOWCASE: converts all letters to lowercase
data up_down;
datalines;
M f P p D 1 2
m f m F M 3 4
;
data upper;
run;
title "Listing of Data Set UPPER";
proc print data=upper noobs;
run;
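Both step bodies are missing. A minimal sketch that uppercases every character variable at once, assuming five one-character variables a through e followed by two numeric variables x and y (all names are assumptions):

```sas
/* Sketch only: variable names are assumptions */
data up_down;
   length a b c d e $ 1;
   input a b c d e x y;
datalines;
M f P p D 1 2
m f m F M 3 4
;

data upper;
   set up_down;
   array all_c[*] _character_;             /* every character variable       */
   do i = 1 to dim(all_c);
      all_c[i] = upcase(all_c[i]);
   end;
   drop i;
run;
```

Replacing UPCASE with LOWCASE in the loop would convert the values to lowercase instead.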
10 PROPCASE: capitalizes the first letter of each word and converts the remaining letters to lowercase
data proper;
datalines;
rOn coDY
the tall and the short
the "%$#@!" escape
;
title "Listing of Data Set PROPER";
proc print data=proper noobs;
run;
Result:
Listing of Data Set PROPER
name
Ron Cody
The Tall And The Short
The "%$#@!" Escape
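The INPUT statement and the PROPCASE call are missing from the step. A minimal sketch, assuming a single variable name with a width of 40:

```sas
/* Sketch only: the variable width is an assumption */
data proper_sketch;
   infile datalines truncover;
   input name $40.;
   name = propcase(name);                  /* first letter up, rest down     */
datalines;
rOn coDY
the tall and the short
;
```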
11 TRANWRD: replaces every occurrence of one substring with another, for example Road with Rd.
Syntax: TRANWRD(char_var,'find_str','replace_str');
data convert;
datalines;
89 Lazy Brook Road
123 River Rd.
12 Main Street
;
title "Listing of Data Set CONVERT";
proc print data=convert;
run;
Result:
OBS ADDRESS
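The TRANWRD calls themselves are missing from the step. A minimal sketch, assuming the goal is to abbreviate Street and Road (the exact replacements are assumptions):

```sas
/* Sketch only: the find and replace strings are assumptions */
data convert;
   infile datalines truncover;
   input address $20.;
   address = tranwrd(address,'Street','St.');
   address = tranwrd(address,'Road','Rd.');
datalines;
89 Lazy Brook Road
123 River Rd.
12 Main Street
;
```

TRANWRD is case-sensitive, so 'road' in lowercase would not match 'Road' in the data.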
13 SPEDIS: fuzzy comparison. It returns 0 if the two strings are identical; otherwise, the less similar the strings, the larger the value. Syntax: SPEDIS(string1,string2);
data compare;
datalines;
same same
same sam
firstletter xirstletter
lastletter lastlettex
receipt reciept
;
title "Listing of Data Set COMPARE";
proc print data=compare noobs;
run;
Result:
Listing of Data Set COMPARE
string1 string2 points
same same 0
same sam 8
firstletter xirstletter 18
lastletter lastlettex 10
receipt reciept 7
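The statements that compute points are missing from the step. A minimal sketch, assuming both strings are read with list input and given a length of 15 so the longer words are not truncated:

```sas
/* Sketch only: the declared length is an assumption */
data compare;
   length string1 string2 $ 15;            /* default of 8 would truncate    */
   input string1 $ string2 $;
   points = spedis(string1,string2);       /* 0 means an exact match         */
datalines;
same same
same sam
firstletter xirstletter
lastletter lastlettex
receipt reciept
;
```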
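14 The ANY functions: return the position of the first character that belongs to a given class (0 if none is found)
ANYALNUM: first letter or digit
ANYALPHA: first letter
ANYDIGIT: first digit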
data find_alpha_digit;
datalines;
no digits here
the 3 and 4
123 456 789
;
proc print data=find_alpha_digit noobs;
run;
Result:
string first_alpha first_digit
no digits here 1 0
the 3 and 4 1 5
123 456 789 0 1
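The column names in the listing suggest that ANYALPHA and ANYDIGIT were applied to each string. A minimal sketch under that assumption:

```sas
/* Sketch only: the choice of functions is inferred from the column names */
data find_alpha_digit;
   infile datalines truncover;
   input string $20.;
   first_alpha = anyalpha(string);         /* position of the first letter   */
   first_digit = anydigit(string);         /* position of the first digit    */
datalines;
no digits here
the 3 and 4
123 456 789
;
```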
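15 The NOT functions: return the position of the first character that does not belong to a given class (0 if none is found)
NOTALNUM: first character that is not a letter or digit
NOTALPHA: first character that is not a letter
NOTDIGIT: first character that is not a digit
NOTPUNCT: first character that is not a punctuation character
NOTSPACE: first character that is not a white-space character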
data data_cleaning;
datalines;
abcdefg
1234567
abc123
1234abcd
;
title "Listing of Data Set DATA_CLEANING";
proc print data=data_cleaning noobs;
run;
Result:
string only_alpha only_digit
abcdefg 0 1
1234567 1 0
abc123 4 1
1234abcd 1 5
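The listing is consistent with NOTALPHA and NOTDIGIT applied to the trimmed string. A minimal sketch under that assumption (the TRIM call is there so the padding blanks are not reported as invalid characters):

```sas
/* Sketch only: the choice of functions is inferred from the column names */
data data_cleaning;
   infile datalines truncover;
   input string $20.;
   only_alpha = notalpha(trim(string));    /* 0 if the value is all letters  */
   only_digit = notdigit(trim(string));    /* 0 if the value is all digits   */
datalines;
abcdefg
1234567
abc123
1234abcd
;
```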
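16 CATS, CATX: concatenate strings.
Strings can be concatenated with the "||" or "!!" operator, but the advantage of these two functions is that they automatically remove the leading and trailing blanks of each string.
Syntax:
CATS(string1,string2<,...stringn>);
CATX(separator,string1,string2<,...stringn>); CATX inserts a user-defined separator between the concatenated strings.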
data join_up;
run;
title "Listing of Data Set JOIN_UP";
proc print data=join_up noobs;
run;
Result:
Listing of Data Set JOIN_UP
string1 string2 string3 cats catx
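The assignments in this step and the output rows are missing. A minimal sketch, assuming three hypothetical input values with leading and trailing blanks and '***' as the separator:

```sas
/* Sketch only: the values and the separator are assumptions */
data join_up;
   length cats $ 6 catx $ 17;
   string1 = 'ABC   ';
   string2 = '   XYZ   ';
   string3 = '12345';
   cats = cats(string1,string2);                /* 'ABCXYZ'                   */
   catx = catx('***',string1,string2,string3);  /* 'ABC***XYZ***12345'        */
run;
```

Declaring the lengths of the result variables first keeps them from defaulting to a much larger size.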
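17 LENGTH, LENGTHN, LENGTHC: string length
LENGTH: returns the length of a string (not counting trailing blanks)
LENGTHN: returns the length of a string (not counting trailing blanks)
LENGTHC: returns the storage size of a string
The difference between LENGTH and LENGTHN: for a blank string, LENGTH returns 1, while LENGTHN returns 0.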
data how_long;
run;
title "Listing of Data Set HOW_LONG";
proc print data=how_long noobs;
run;
Result:
Listing of Data Set HOW_LONG
one two three length_one lengthn_one lengthc_one length_two
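The assignments and the output rows are missing from the listing. A minimal sketch, assuming hypothetical values for one, two, and three, with two being a blank string:

```sas
/* Sketch only: the three values are assumptions */
data how_long;
   one   = 'ABC   ';                       /* 3 characters + 3 trailing blanks */
   two   = ' ';                            /* a blank (missing) value          */
   three = 'ABC   XYZ';
   length_one  = length(one);              /* 3 */
   lengthn_one = lengthn(one);             /* 3 */
   lengthc_one = lengthc(one);             /* 6 */
   length_two  = length(two);              /* 1 */
   lengthn_two = lengthn(two);             /* 0 */
   lengthc_two = lengthc(two);             /* 1 */
run;
```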
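18 COMPARE: compares two strings and returns 0 if they are equal
Syntax: COMPARE(string1, string2 <,'modifiers'>)
You can supply one or more modifiers, for example i to ignore case, l to remove leading blanks, or a colon (:) to truncate the longer string to the length of the shorter one.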
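No example accompanies COMPARE in the text. A minimal sketch of how the modifiers change the outcome, using made-up strings and variable names:

```sas
/* Sketch only: strings and variable names are made up for illustration */
data compare_sketch;
   exact     = compare('AbC','  abc');        /* nonzero: case and blanks differ */
   no_case   = compare('AbC','abc','i');      /* 0: case ignored                 */
   case_lead = compare('AbC','  abc','il');   /* 0: case and leading blanks      */
   truncated = compare('abc','abcxyz',':');   /* 0: longer string cut to 3       */
   put exact= no_case= case_lead= truncated=;
run;
```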
19 STRIP: removes leading and trailing blanks
LEFT: left-aligns a string by moving its leading blanks to the end
TRIM: removes trailing blanks
if strip(string) = 'abc' then result = 'yes';
if left(trim(string)) = 'abc' then result = 'yes';
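20 COUNT: counts the occurrences of a substring in a string
COUNTC: counts the occurrences in a string of any of the characters in a list
Syntax: count(string,find_string<,'modifiers'>)
The modifiers can be i, to ignore case, or t, to trim trailing blanks from string and find_string.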
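No example accompanies COUNT and COUNTC in the text. A minimal sketch with a made-up string and variable names:

```sas
/* Sketch only: the string and variable names are made up for illustration */
data count_sketch;
   string   = 'The quick brown the';
   n_the    = count(string,'the');         /* 1: 'The' differs in case       */
   n_the_i  = count(string,'the','i');     /* 2: case ignored                */
   n_vowels = countc(string,'aeiou');      /* 5: e, u, i, o, e               */
   put n_the= n_the_i= n_vowels=;
run;
```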