Introduction
Smoothing algorithms are applied to time series data either to produce smoothed data that shows the trend or to forecast future values for what-if analysis. A time series is a sequence of historical observations indexed by date, time, or timestamp. A moving average is a time series technique of the same kind in which the past observations are weighted equally. For some analyses, however (price movements or stock market movements, for example), the most recent data should carry the most weight. Exponential smoothing handles this by assigning weights that decrease exponentially as observations get older.
Prerequisites:
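To make the difference in weighting concrete, here is a minimal Python sketch that lists the weights each method gives to the most recent observations. The function names are hypothetical and chosen only for illustration; the exponential weights follow from expanding the smoothing recurrence, where the k-th most recent observation receives the weight α(1 − α)^k.

```python
def moving_average_weights(window):
    """Equal weights used by a simple moving average over `window` points."""
    return [1.0 / window] * window


def exponential_weights(alpha, count):
    """Weights alpha * (1 - alpha)**k, with k = 0 for the most recent observation."""
    return [alpha * (1.0 - alpha) ** k for k in range(count)]


# Every observation counts equally in a moving average.
print(moving_average_weights(4))

# With exponential smoothing, each older observation counts for less.
print(exponential_weights(0.5, 4))
```

With a smoothing factor of 0.5, each step back in time halves the weight, whereas the moving average treats all four points alike.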
- No missing/null data
- Only numeric data can be smoothed
Algorithm
Let $S_t$ be the smoothed value for the $t$-th time period and $x_0, x_1, \dots, x_n$ be the time series. Mathematically,
$S_1 = x_0$ (no smoothed value can be computed for the first entry in the time series).
$S_t = \alpha x_{t-1} + (1 - \alpha) S_{t-1}$, where $\alpha$ is the smoothing factor, often quoted as a percentage ($0 < \alpha < 1$). As $\alpha$ tends to 1, the weight given to the historical values decreases and tends to 0; as $\alpha$ tends to 0, the most recent observation contributes almost nothing.
Consider, as an example, the price trend of a particular material over a range of dates.
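The recurrence can be sketched in a few lines of Python. This is only an illustration of the formula as stated here, not the PAL implementation; the function name `exponential_smooth` is an assumption, and the sketch expects a complete series with no missing values.

```python
def exponential_smooth(values, alpha):
    """Return the smoothed values S_0 .. S_(n-1) for the observations x_0 .. x_(n-1).

    S_0 is None (undefined), S_1 = x_0, and for t >= 2
    S_t = alpha * x_(t-1) + (1 - alpha) * S_(t-1).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")

    smoothed = []
    for t in range(len(values)):
        if t == 0:
            smoothed.append(None)  # no smoothed value for the first entry
        elif t == 1:
            smoothed.append(float(values[0]))  # S_1 = x_0
        else:
            smoothed.append(alpha * values[t - 1] + (1.0 - alpha) * smoothed[t - 1])
    return smoothed


print(exponential_smooth([100, 95, 110, 110], alpha=0.5))
```

Each smoothed value depends only on the previous observation and the previous smoothed value, so the series can be processed in a single pass.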
DAY | PRICE |
2-JUNE-2014 | 100 |
3-JUNE-2014 | 95 |
4-JUNE-2014 | 110 |
5-JUNE-2014 | 110 |
6-JUNE-2014 | 98 |
7-JUNE-2014 | Holiday |
8-JUNE-2014 | Holiday |
9-JUNE-2014 | 105 |
10-JUNE-2014 | 118 |
No data is available for 7 June and 8 June, but the algorithm does not allow null or missing entries. In this case, when the observation for an entry is not available, the smoothed value of that entry is used as its observation.
Now let us convert the input table into a time series and apply the smoothing algorithm manually with a smoothing factor of 50% (0.5). The first DAY is taken as the base date (time 0).
DAY | Time | PRICE | Smoothed Value (St) |
2-JUNE-2014 | 0 | 100 | |
3-JUNE-2014 | 1 | 95 | 100 |
4-JUNE-2014 | 2 | 110 | |
5-JUNE-2014 | 3 | 110 | |
6-JUNE-2014 | 4 | 98 | |
7-JUNE-2014 | 5 | Holiday | |
8-JUNE-2014 | 6 | Holiday | |
9-JUNE-2014 | 7 | 105 | |
10-JUNE-2014 | 8 | 118 | |
Calculation
- S(0) is null
- S(1) = 0.5 * 100 (which is Xt-1) + 0.5 * 100 = 100
- S(2) = 0.5 * 95 + 0.5 * 100 = 97.5
- S(3) = 0.5 * 110 + 0.5 * 97.5 = 103.75
- S(4) = 0.5 * 110 + 0.5 * 103.75 = 106.875
- S(5) = 0.5 * 98 + 0.5 * 106.875 = 102.4375
- S(6) = 0.5 * 102.4375 (the previous time series value is not available, so the smoothed value is used instead) + 0.5 * 102.4375 = 102.4375
- The same process continues for the remaining entries, and values can be forecast for up to n time series entries.
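The manual calculation above, including the substitution for the two holidays, can be reproduced with a short Python sketch. The function name `smooth_with_gaps` and the use of `None` to mark a missing day are assumptions made for this illustration; the values for S(7) and S(8) come from applying the same rules and are not listed in the steps above.

```python
def smooth_with_gaps(prices, alpha):
    """Exponentially smooth a series in which missing observations are None.

    A missing observation is replaced by the smoothed value of the same
    entry, following the substitution rule described in this post.
    """
    if prices and prices[0] is None:
        raise ValueError("the first observation must be present")

    smoothed = []
    for t in range(len(prices)):
        if t == 0:
            smoothed.append(None)  # S(0) is null
            continue
        previous_price = prices[t - 1]
        previous_smoothed = smoothed[t - 1]
        if previous_price is None:
            # Holiday: fall back to the smoothed value of that entry.
            previous_price = previous_smoothed
        if previous_smoothed is None:
            smoothed.append(float(previous_price))  # S(1) = x(0)
        else:
            smoothed.append(alpha * previous_price + (1.0 - alpha) * previous_smoothed)
    return smoothed


# Prices for 2 June to 10 June 2014; None marks the two holidays.
PRICES = [100, 95, 110, 110, 98, None, None, 105, 118]
for t, value in enumerate(smooth_with_gaps(PRICES, alpha=0.5)):
    print(t, value)
```

Because a missing observation is replaced by the smoothed value of its own entry, the smoothed series simply stays flat across the gap until the next real price arrives.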
Graph simulated with smoothed data in Microsoft Excel
PAL Implementation
(The code snippet below is adapted from the source code in the SAP HANA PAL document, page 193.)
CREATE SCHEMA PAL_TRY;
SET SCHEMA PAL_TRY;

-- Note: the signature table PAL_SINGLESMOOTH_PDATA_TBL and the table types
-- PAL_SINGLESMOOTH_DATA_T and PAL_SINGLESMOOTH_RESULT_T are not created in this
-- snippet; they are assumed to exist already, as set up in the PAL document.

-- Drop the existing procedure, if any
CALL SYSTEM.AFL_WRAPPER_ERASER('SINGLESMOOTH_TEST_PROC');

-- Create the procedure
CALL SYSTEM.AFL_WRAPPER_GENERATOR('SINGLESMOOTH_TEST_PROC','AFLPAL','SINGLESMOOTH',PAL_SINGLESMOOTH_PDATA_TBL);

CREATE LOCAL TEMPORARY COLUMN TABLE #PAL_CONTROL_TBL ("NAME" VARCHAR(100), "INTARGS" INT, "DOUBLEARGS" DOUBLE, "STRINGARGS" VARCHAR(100));

-- RAW_DATA_COL: column where the data is available
INSERT INTO #PAL_CONTROL_TBL VALUES ('RAW_DATA_COL',1,NULL,NULL);
-- ALPHA: smoothing factor
INSERT INTO #PAL_CONTROL_TBL VALUES ('ALPHA', NULL,0.5,NULL);
-- FORECAST_NUM: forecast the next 100 values (includes future values)
INSERT INTO #PAL_CONTROL_TBL VALUES ('FORECAST_NUM',100, NULL,NULL);
-- STARTTIME: ID starts from 0
INSERT INTO #PAL_CONTROL_TBL VALUES ('STARTTIME',0, NULL,NULL);

CREATE COLUMN TABLE PAL_SINGLESMOOTH_DATA_TBL LIKE PAL_SINGLESMOOTH_DATA_T;

-- Load the test data (IDs 5 and 6 are the holidays and have no rows)
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (0,100.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (1,95.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (2,110.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (3,110.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (4,98.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (7,105.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (8,118.0);

CREATE COLUMN TABLE PAL_SINGLESMOOTH_RESULT_TBL LIKE PAL_SINGLESMOOTH_RESULT_T;

-- Execute the procedure
CALL _SYS_AFL.SINGLESMOOTH_TEST_PROC(PAL_SINGLESMOOTH_DATA_TBL, "#PAL_CONTROL_TBL", PAL_SINGLESMOOTH_RESULT_TBL) WITH OVERVIEW;

SELECT * FROM PAL_SINGLESMOOTH_RESULT_TBL;
The output table PAL_SINGLESMOOTH_RESULT_TBL contains the smoothed data, which can be used for analysis or presentation.
Regards
Sreehari V Pillai