Chapter 5 – SQL Functions, Math, and Coalesce
Section I – Overview
As with most every computer programming language, SQL supports the use of functions to manipulate data. There are two types of functions in SQL; Those that produce strings and those that produce numbers. However, it’s important to realize that functions in SAS SQL can be different than functions in other database systems, so please be careful when trying to use queries in other systems.
Section II – Strings
Concatenation
Concatenation is the joining of multiple values into a single value. This can be done by combining any kind of string and from any source (columns, literals, etc…). In SAS, concatenation is performed by placing two vertical bars (||) between the strings you wish to join. The vertical bar character is created by holding the Shift key with the backslash key.
In this example, we combine two columns and a string literal into a single output column. Pay careful attention to the output.
Proc SQL;
Create Table Example01 as
Select ID || “‘s weather station is ” || WeatherStationID as TestString
From Chapt05.Obj_Recorders;
Quit;
Trim
In the previous example, you’ll notice the output column was not aesthetically pleasing. This is because the ID column is of character length 14, but not every value in that column has 14 characters. When this occurs, SAS will pad the end of the string with ANSI spaces. SQL comes with a function that allows us to remove those spaces: TRIM. TRIM removes both leading and trailing spaces when the value we are attempting to trim is non-missing. However, if the value is missing, the TRIM function adds a space.
Proc SQL;
Create Table Example02 as
Select TRIM(ID) || “‘s weather station is ” || WeatherStationID as TestString
From Chapt05.Obj_Recorders;
Quit;
Compress
There are also times you’ll want to remove EVERY space from a value. In this instance, the COMPRESS function should be used.
Proc SQL;
Create Table Example03 as
Select “ Let’s remove some spaces!” as String1,
Compress(“ Let’s remove some spaces!”) as CompressedString
From Chapt05.Obj_Recorders;
Quit;
Upper, Lower, PropCase
We still haven’t prettied our result up as much as we’d like. All of the ID values are fully capitalized. For some results, this is perfectly acceptable, but we’re still going to run it through the ringers.
Proc SQL;
Create Table Example04 as
Select TRIM(Upper(ID))||”‘s weather station is “||WeatherStationID as UpTestString,
TRIM(Lower(ID))||”‘s weather station is “||WeatherStationID as LowTestString,
TRIM(Propcase(ID))||”‘s weather station is “||WeatherStationID as PropTestString
From Chapt05.Obj_Recorders;
Quit;
Translate
In some cases, there may be characters that you’d like to remove. For example, you may want to replace every underscore with a space or dash. In this instance, you’d use the TRANSLATE function. To use the TRANSLATE function you must supply three arguments: The string to translate, the replacement value, and the characters you wish to replace. Keep in mind you can place as many characters as you like in the third argument, but only one in the first two.
Proc SQL;
Create Table Example05 as
Select ID,
Translate(ID, ” “, “_,.*#$%”) as FixedID
From Chapt05.Obj_Recorders;
Quit;
Substring
The substring function, SUBSTR() is one of the most powerful string functions available in SAS. It allows you to only select a portion of a string for an output value. The SUBSTR function can be executed by passing three arguments to function properly: A string to be processed, the index value to start, and the length to substring. It can also be executed by passing the string to be processed, and the index value to start. Remember in SAS, all strings are 1-based indexes (first character has index = 1).
Proc SQL;
Create Table Example06 as
Select Substr(MV90, 1, 1) as MV90Char,
Substr(MV90, 2) as MV90All
From Chapt05.Obj_Recorders;
Quit;
Put
The PUT() function allows us to convert any numeric value to a string in a given format. This is particularly effective when using DateTime values. The PUT function requires two arguments to work, the numeric value to be passed, and the output format.
Proc SQL;
Create Table Example07 as
Select Interval,
“Z3 shows as ” || Put(Interval, z3.) as Test
From Chapt05.Obj_Recorders;
Quit;
Scan
The SCAN() function will select the n-th item from a delimited string. The string can be delimited by any character. To properly use the SCAN function you must provide three values: the string to process, the n-th iteration, and the delimiter.
Proc SQL;
Create Table Example08 as
Select “CSV,ZIP,XLS” as FileTypes,
Scan(“CSV,ZIP,XLS”, 2, “,”) as Get2ndFile
From Chapt05.Obj_Recorders;
Quit;
Length
The LENGTH() function will return the number of characters in a string.
Proc SQL;
Create Table Example09 as
Select ID,
Length(ID) as Len
From Chapt05.Obj_Recorders;
Quit;
Soundex
As previously touched on in the last chapter, the Soundex is used by compters to determine what a certain word “sounds like” in order com compare strings regardless of misspellings. For more information on exactly what a Soundex is, please consult wikipedia.
Proc SQL;
Create Table Example10 as
Select ID,
Soundex(ID) as Sndx
From Chapt05.Obj_Recorders;
Quit;
Example
Now let’s put it all together for a fun little example:
Proc SQL;
Create Table Example11 as
Select PropCase(Trim(ID))||”is located in “||State
||”with an interval of “||put(Interval,z2.)||” minutes.” as String
From Chapt05.Obj_Recorders
Where Soundex(Upper(ID))=Soundex(Upper(Translate(“CFE_Redford”, ” “, “_-”)));
Quit;
Section III – Math
Basic Operators
The basic operators of mathematics are simple and thus we will not delve too much into them. In the example below, pay more attention to the operators than anything else, specifically the ** (raised to) and the SQRT (square root) operators. SQRT is actually a function, but here we treat it as an operator.
Proc SQL;
Create Table Example12 as
Select Interval+Interval as Add,
Interval-Interval as Subtract,
Interval*Interval as Multiply,
Interval/Interval as Divide,
Interval**2 as Squared,
Sqrt(Interval) as SquareRoot
From Chapt05.Obj_Recorders;
Quit;
Simple Functions
SQL comes with some simple functions that do not require groupings to work. Those are the ABS, COS, EXP, PI, SIN, and TAN functions. In the example below we demonstrate those functions.
Proc SQL;
Create Table Example13 as
Select Distinct
Abs(-12) as AbsoluteValue,
EXP(1) as e,
SIN(3.14159/4) as Sine,
COS(3.14159/4) as Cosine,
TAN(3.14159/4) as Tangent
From Chapt05.Obj_Recorders;
Quit;
Sum, Max, and Min
Three important functions — SUM, MAX and MIN — have dual roles in SQL. For this chapter we’re only going to focus on their iterative use. In the example below, we use these functions to create output by working across columns or literals. Please keep in mind that if you have a missing value you may get different results.
Proc SQL;
Create Table Example14 as
Select Distinct
SUM(1, 2, 3) as Addition,
MAX(StartDate, LastUploadDate, LastFinaledDate) as MaxDate format=mmddyy10.,
MIN(StartDate, LastUploadDate, LastFinaledDate) as MinDate format=mmddyy10.
From Chapt05.Obj_Recorders;
Quit;
Quotient and Modulo
The fundamentals of algebra allow for a fantastic set of functions in SQL. The INT() and MOD() functions are powerful tools for the SQL developer. They allow you to do mathematical manipulations on divisors and remainders of divisions. The INT() function will produce the quotient of a division, and the MOD() function will produce the remainder of the division. There are some rules to notice when it comes to modulo mathematics. Every modulo statement is written as XModY=R in the real world. In SQL it is written as Mod(x,y).
When performing a MOD calculation, remember that the remainders for X Mod Y = {0, 1, …Y-1}
Proc SQL;
Create Table Example15 as
Select Distinct
Int(6/7) as Quot67,
Int(13/7) as Quot137,
Mod(6,7) as Mod67,
Mod(13,7) as Mod137,
Mod(13+6,7) as Mod197,
Mod(Mod(13,7)+Mod(6,7),7) as ModCheck
From Chapt05.Obj_Recorders;
Quit;
From the example above it’s important to notice that regardless of how we arrive at X, the remainder that is derived can be summed to create a new remainder whether we are adding remainders or Xs. The MOD function can be a huge help when trying to iterate across groups of rows, such as when performing a task on certain hours of the day.
Section IV – Coalesce and Missing Values
Coalesce
The Coalesce function is another powerful function that allows us to deal with missing values. The Coalesce function works by creating a priority array, where each value in the function is given less weight than the preceding value. However, the only time the Coalesce function will return a lower weighted value is if the previous is a missing value.
Proc SQL;
Create Table Example16 as
Select ID,
Coalesce(Premise, “NoPremise”) as PremiseFilled,
Coalesce(Premise, “000000000″) as PremiseZerod
From Chapt05.Obj_Recorders;
Quit;