Channel: SCN : Blog List - SAP HANA Developer Center

HANA Predictive Analysis – Single Exponential Smoothing


Introduction

Smoothing algorithms are used on time series data either to produce smoothed data that presents the trend, or to forecast the series for what-if analysis. Time series data are sequential observations recorded against a series of dates, times or timestamps. Moving average analysis is the same kind of time series analysis, with the past observations equally weighted. For certain analyses (price movement or stock market movement), however, the most recent data should carry the most weight. Exponential smoothing assigns weights that decrease exponentially as the observations get older.
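To make the difference in weighting concrete, here is a minimal Python sketch. It assumes the standard formulation in which the k-th most recent observation gets the weight α(1 − α)^k; the function names are made up for illustration and are not part of PAL.

```python
def moving_average_weights(window):
    """Equal weights: each of the last `window` observations counts the same."""
    return [1.0 / window] * window


def exponential_weights(alpha, count):
    """Weight alpha * (1 - alpha)**k for the k-th most recent observation."""
    return [alpha * (1.0 - alpha) ** k for k in range(count)]


print(moving_average_weights(4))    # [0.25, 0.25, 0.25, 0.25]
print(exponential_weights(0.5, 4))  # [0.5, 0.25, 0.125, 0.0625]
```

With α = 0.5 each older observation counts half as much as the one after it, whereas the moving average treats all four alike.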

 

Prerequisites:


  1. No missing/null data
  2. Only numeric data can be smoothed

 

Algorithm

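Let S_t be the smoothed value for the t-th time period and x_t (t = 0, 1, 2, ..., n) be the time series. Mathematically,

S_1 = x_0 (no smoothed value can be computed for the first entry in the time series, so S_0 is undefined).

S_t = α·x_{t−1} + (1 − α)·S_{t−1} for t ≥ 2, where α is the smoothing factor, often quoted as a percentage (mathematically 0 < α < 1). As α tends to 0, the weight given to the most recent observation tends to 0 and the smoothed series changes very slowly; as α tends to 1, the smoothed value follows the latest observation closely.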
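The recursion above can be sketched in a few lines of Python. This is only an illustration of the formula for a series without gaps; the function name and the choice to return None for the undefined first entry are assumptions made here, not PAL behaviour.

```python
def single_exponential_smoothing(values, alpha):
    """Return [S_0, S_1, ..., S_n] for the observations x_0 .. x_(n-1).

    S_0 is None (no smoothed value for the first entry), S_1 = x_0, and
    S_t = alpha * x_(t-1) + (1 - alpha) * S_(t-1) for t >= 2.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    smoothed = [None]
    for x_prev in values:
        if smoothed[-1] is None:
            smoothed.append(float(x_prev))  # seed: S_1 = x_0
        else:
            smoothed.append(alpha * x_prev + (1.0 - alpha) * smoothed[-1])
    return smoothed


print(single_exponential_smoothing([100, 95, 110, 110, 98], 0.5))
# [None, 100.0, 97.5, 103.75, 106.875, 102.4375]
```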

Let us consider the price trend of a particular material over a range of dates.

 

DAY             PRICE
2-JUNE-2014     100
3-JUNE-2014     95
4-JUNE-2014     110
5-JUNE-2014     110
6-JUNE-2014     98
7-JUNE-2014     Holiday
8-JUNE-2014     Holiday
9-JUNE-2014     105
10-JUNE-2014    118

 

There is no data available for 7 June and 8 June, but the algorithm does not allow null or missing entries. In this case, when an observation is not available, the smoothed value of the same entry is used in its place (x_t is taken to be S_t for those entries).
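The substitution rule for the holidays can be sketched in Python as follows. Marking a missing price with None and the name smooth_with_gaps are assumptions made for this illustration; the sketch mirrors the manual rule described here and is not a statement about how PAL treats gaps internally.

```python
def smooth_with_gaps(prices, alpha):
    """Smooth a series in which None marks a missing observation.

    Returns [S_0, S_1, ..., S_n]; S_0 is None and S_1 = x_0.
    A missing x_t is replaced by S_t, the smoothed value of the same entry.
    """
    if not prices or prices[0] is None:
        raise ValueError("the first observation is needed to seed S_1")
    smoothed = [None, float(prices[0])]
    for x_prev in prices[1:]:
        s_prev = smoothed[-1]
        if x_prev is None:
            x_prev = s_prev  # gap: fall back to the smoothed value
        smoothed.append(alpha * x_prev + (1.0 - alpha) * s_prev)
    return smoothed


# Prices for 2 June to 10 June 2014; 7 and 8 June are holidays.
prices = [100, 95, 110, 110, 98, None, None, 105, 118]
for time_index, value in enumerate(smooth_with_gaps(prices, 0.5)):
    print(time_index, value)
```

Because a missing price is replaced by the smoothed value itself, the smoothed series simply stays flat across the two holidays. The last printed entry (index 9) is the smoothed value for the entry following the final observation.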

Now, let us convert the input table into a time series and manually apply the smoothing algorithm with a smoothing factor of 50% (0.5). Consider the first DAY as the base date.

 

DAY             Time    PRICE      Smoothed Value (S_t)
2-JUNE-2014     0       100
3-JUNE-2014     1       95         100
4-JUNE-2014     2       110        97.5
5-JUNE-2014     3       110        103.75
6-JUNE-2014     4       98         106.875
7-JUNE-2014     5       Holiday    102.4375
8-JUNE-2014     6       Holiday    102.4375
9-JUNE-2014     7       105        102.4375
10-JUNE-2014    8       118        103.71875

 

Calculation

 

  1. S(0) is null.
  2. S(1) = 0.5 * 100 (which is x(t−1)) + 0.5 * 100 = 100
  3. S(2) = 0.5 * 95 + 0.5 * 100 = 97.5
  4. S(3) = 0.5 * 110 + 0.5 * 97.5 = 103.75
  5. S(4) = 0.5 * 110 + 0.5 * 103.75 = 106.875
  6. S(5) = 0.5 * 98 + 0.5 * 106.875 = 102.4375
  7. S(6) = 0.5 * 102.4375 (the previous time series value is not available, hence the smoothed value is used) + 0.5 * 102.4375 = 102.4375
  8. The same process continues for the remaining entries, and the series can be forecast for any number n of further time series entries.
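As a rough sketch of how the same process carries on past the last observation, the Python snippet below applies the substitution rule from the calculation to future entries. The function name and its arguments are invented for illustration, and nothing here is claimed about the values PAL itself returns for forecast entries.

```python
def flat_forecast(last_smoothed, last_observation, alpha, steps):
    """Extend the smoothed series `steps` entries beyond the available data."""
    forecasts = []
    s_prev, x_prev = last_smoothed, last_observation
    for _ in range(steps):
        s_next = alpha * x_prev + (1.0 - alpha) * s_prev
        forecasts.append(s_next)
        # No new observation exists, so the smoothed value stands in for it.
        s_prev, x_prev = s_next, s_next
    return forecasts


# S(8) = 103.71875 and the last price is 118 (10 June), with alpha = 0.5.
print(flat_forecast(103.71875, 118, 0.5, 3))
# [110.859375, 110.859375, 110.859375]
```

Once the observations run out, every further entry repeats the last smoothed value, which is why a single exponential smoothing forecast is a flat line.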

Figure: Graph simulated with smoothed data in Microsoft Excel

 

PAL Implementation


(The code snippet below is adapted from the sample source code in the SAP HANA PAL document, page 193.)


CREATE SCHEMA PAL_TRY;
SET SCHEMA PAL_TRY;
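--Assumed setup, not shown in the original snippet: the table types and the
--signature table that the wrapper generator and the LIKE clauses below refer to.
--Column names follow the usual PAL sample layout and may differ in your release.
CREATE TYPE PAL_SINGLESMOOTH_DATA_T AS TABLE ("ID" INT, "RAWDATA" DOUBLE);
CREATE TYPE PAL_CONTROL_T AS TABLE ("NAME" VARCHAR(100), "INTARGS" INT, "DOUBLEARGS" DOUBLE, "STRINGARGS" VARCHAR(100));
CREATE TYPE PAL_SINGLESMOOTH_RESULT_T AS TABLE ("TIME" INT, "OUTPUT" DOUBLE);
CREATE COLUMN TABLE PAL_SINGLESMOOTH_PDATA_TBL ("ID" INT, "TYPENAME" VARCHAR(100), "DIRECTION" VARCHAR(100));
INSERT INTO PAL_SINGLESMOOTH_PDATA_TBL VALUES (1,'PAL_TRY.PAL_SINGLESMOOTH_DATA_T','in');
INSERT INTO PAL_SINGLESMOOTH_PDATA_TBL VALUES (2,'PAL_TRY.PAL_CONTROL_T','in');
INSERT INTO PAL_SINGLESMOOTH_PDATA_TBL VALUES (3,'PAL_TRY.PAL_SINGLESMOOTH_RESULT_T','out');
--Dropping the existing procedure, if any
CALL SYSTEM.AFL_WRAPPER_ERASER('SINGLESMOOTH_TEST_PROC');
--Creating the procedure
CALL SYSTEM.AFL_WRAPPER_GENERATOR('SINGLESMOOTH_TEST_PROC','AFLPAL','SINGLESMOOTH',PAL_SINGLESMOOTH_PDATA_TBL);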
CREATE LOCAL TEMPORARY COLUMN TABLE #PAL_CONTROL_TBL ("NAME" VARCHAR(100),
"INTARGS" INT, "DOUBLEARGS" DOUBLE, "STRINGARGS" VARCHAR(100));
--RAW_DATA_COL : column where the data is available
INSERT INTO #PAL_CONTROL_TBL VALUES ('RAW_DATA_COL',1,NULL,NULL);
--Alpha value 
INSERT INTO #PAL_CONTROL_TBL VALUES ('ALPHA', NULL,0.5,NULL);
--FORECAST_NUM : forecast the next 100 values (includes future values)
INSERT INTO #PAL_CONTROL_TBL VALUES ('FORECAST_NUM',100, NULL,NULL);
--STARTTIME : ID starts from 0
INSERT INTO #PAL_CONTROL_TBL VALUES ('STARTTIME',0, NULL,NULL);
CREATE COLUMN TABLE PAL_SINGLESMOOTH_DATA_TBL LIKE PAL_SINGLESMOOTH_DATA_T ;
--Loading the test Data
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (0,100.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (1,95.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (2,110.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (3,110.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (4,98.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (7,105.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (8,118.0);
CREATE COLUMN TABLE PAL_SINGLESMOOTH_RESULT_TBL LIKE PAL_SINGLESMOOTH_RESULT_T;
--Executing the procedure
CALL _SYS_AFL.SINGLESMOOTH_TEST_PROC(PAL_SINGLESMOOTH_DATA_TBL,"#PAL_CONTROL_TBL", PAL_SINGLESMOOTH_RESULT_TBL) WITH OVERVIEW;
--Reading the smoothed values
SELECT * FROM PAL_SINGLESMOOTH_RESULT_TBL;

The output table PAL_SINGLESMOOTH_RESULT_TBL will contain the smoothed data, which can be used for analysis or presentation.

 

Regards

Sreehari V Pillai

