"Last night I lay in bed looking up at the stars in the sky and I thought to myself, where the heck is the ceiling."
Here is a great SAS tip that requires a bit of background explanation, but it's worth it!
The tip is about how the data step works. When you read in a data set with the set statement like below, SAS processes every column and row of the input file. If the data set is long and wide it will take SAS a long time to finish!
Data Output;
Set Input;
Run;
If you were only interested in a few columns, you might you be tempted to do something like this:
Data Output (keep = Branch Line Date);
Set Input;
Run;
Which will create an output data set with just the 3 columns, Branch Line and Date. However! using this method means that SAS still has to read in every column and row from the Input data set. Instead it's far better to do this:
Data Output;
Set Input (keep = Branch Line Date);
Run;
Here we are telling SAS to only read in the three columns Branch Line and Date from the input data set. Every other column will be ignored so the process of reading in the Input data set will be far more efficient and quick!
But what if we only want a subset of Branches, Lines or Dates?
Say we just want to see the Lines in Branch 308, we could to the following:
data Output;
set Input (keep = Branch Line Date);
where Branch = 308;
run;
Once the Input data set has been processed, every record of the Branch column would then be re-checked to see if the branch was equal to 308, as demanded by the Where statement.
But we know that the Set statement reads in every column and row of the input data set, so why not check if the Branch is equal to 308 at the same time to prevent double work?
data Output;
set Input (keep = Branch Line Date where = (Branch = 308) );
run;
By using Where in the Set statement, you can drastically improve the efficiency of the data step.
Additionally, if you have multiple conditions on the data you want to bring in, make sure the first condition is most unlikely, the set statement will stop processing a record if a condition is not met and move on to the next one. The earlier you can stop processing a record you don't want, the less work SAS will have to do!
data Output;
set Input (keep = Branch Line Date where = ( Date = '24Dec2010'd and Branch = 308 ) );
run;
There are plenty of other things you can do in the Set statement. I'd recommend you do as much here as possible. Enjoy!
data Output;
set Input (Keep = Branch Line Date Rename=(Line = Product) where = (Branch = 308) );
run;
Wednesday, 21 September 2011
Evaluating calculations in open code
"I love salad"
If you want to calculate a value outside of a datastep then use the following code.
%let Mark = %SysEvalF(6 + 6);
%put &Mark;
Log should output 12!
If you want to calculate a value outside of a datastep then use the following code.
%let Mark = %SysEvalF(6 + 6);
%put &Mark;
Log should output 12!
Friday, 16 September 2011
Exponential Smoothing in SAS
"It's Not funny!"
A very nice Chinese lady once asked me "Can you save the Kennedy Space Center?" I replied with the only response possible for a true SAS Professional.. An explanation of how to forecast using exponential smoothing models within Base SAS!
I guess this is why I still live with my parents.
I guess this is why I still live with my parents.
/* How to do Smoothing in SAS*/
There are 5 steps:
1. Sort your data so they are in line-date order
2. Specify the type of smoothing model you want with proc hpfesmspec
3. Select the smoothing model(s) you want to use with proc hpfselect
4. Create a forecast with your selected smoothing model(s) using proc hpfengine
5. Pray.
proc sort data=frozen_sample out=frozenSampleSort;
by customer linenum dow date;
run;
There are many ways to Smooth. Some popular techniques are Simple, Double and Winters. The basic smoothing technique is called ‘Simple’ in SAS. proc hpfesmspec is where we define the type of smoothing we want to use.
This is also the place where we can tell SAS to automatically choose the best smoothing weights for us, or used a fixed weight.
proc hpfesmspec repository = work.model *where the model will be saved;
specname = Simple; *the name of your model;
esm method = Simple; *the smoothing model;
run;
Add this code after the 'esm method = Simple' to use a fixed smoothing weight:
levelrest = (0,1) levelparm = "insert smoothing weight here" noest
- Select the model(s) you want to use.
Now that you’ve physically saved your model(s) the next step is to load them into memory (kind of like what happens with macro variables). You do this by creating a virtual folder, and putting all the models you want to use into it. In the next step you will point to this virtual folder as the location that contains all the models you want to use.
proc hpfselect repository = work.model *where the model(s) were saved;
name = myselect; *Name of the virtual folder;
spec simple; *The name of the model(s) you want to use;
run;
- Create a forecast with your selected model, Finally!
hpfengine is the step that actually generates the forecast. There are many options, a few important ones have been included.
proc hpfengine data = frozenSampleSort *sorted dataset;
outfor=WeekForecast *save just the forecast;
outest=WeekEst *save just the smoothing weights;
outstat=WeekStat *save just the statistics;
out=WeekData *save everything;
repository = work.model *location of models;
globalselection = myselect *location of virtual folder;
back = 0 *hides last x actual from forecast;
lead = 4 ; *forecasts y periods ahead;
by customer linenum dow; *by variable;
id date interval=day; *time between observations;
forecast weekdemand; *forecast variable;
run;
5. Hu Yu Hai Ding? Tai Ni Po Ni.
Do let me know if you have any questions, until next time SAS fans!
Subscribe to:
Posts (Atom)