Quantcast
Channel: SCN : Blog List - SAP HANA Developer Center
Viewing all articles
Browse latest Browse all 676

How smart is Smart Data Access on HANA SP7?

$
0
0

Background

We feel SDA has great potential in a number of use cases, however we wanted to see how it would perform when combining local and remote data sources, which is key for any practical use. For this particular blog, I'm using both HANA for Local(BW) & Remote(Sidecar) sandboxes that are both on revision 72, both sitting on the same physical rack but on different servers. I may follow on with a Local HANA & a remote Oracle db if I folks are interested and we can get the drivers to work for connecting to an 11g db.

 

Setting up SDA

Not going to go over the actual setup in this blog, as it's been well documented on saphana and in other blogs; some I've noted below..

 

Smart Data Access: Connecting multiple SAP HANA... | SAP HANA

Smart Data Access with HADOOP  HIVE  & IMPALA

 

To Confirm:

(Local & Remote) are both HANA Rev72

 

select database_name as remote_db, substr(version,0,7) as remote_db_version from sys.m_database

local_db_revision.PNG||db_revision.PNG

 

Demo Objective:

Find the Equipment Status from the "CV_EQUIP_GEN_NEW_STATUS" calculation view on HANA Sidecar HT2 (contains SLT replicated tables from ECC) for a small range of Equipment Numbers from the Equipment Master table "/BI0/QEQUIPMENT" in BS1 (BW on HANA).

 

Step 1: Add Virtual Table

 

Navigate to _SYS_BIC (where all activated content resides)

Provisioning.PNG

Scroll to find the desired view, right click and Add as Virtual Table.

Virtual_Table.PNG

Create Virtual Table in a local Schema we created for SDA Testing.

 

SaveVirtualTable.PNG

 

Let's execute some queries..

 

Query 1

Query on BS1 to remote calc view through virtual table with some filters added.

 

select * from "SDA_TEST"."Zhana_CV_EQUIP_GEN_NEW_STATUS" r_cv
where r_cv.equnr like '6XR39%' -- equip range
and r_cv.datbi = '99991231' -- active records only

 

What's being Passed to the Remote db? You can check using..

 

  1. Highlighting the Select statement and doing a Explain Plan or Visualize Plan, I'd recommend to a Explain Plan minimum just in case you don't get the remote query you expect, e.g. In this case we want to avoid materializing the entire calc view on the remote db and bringing back the entire result set.
  2. For this demo and what I believe is the most accurate way is to query SQL Cache on the remote database to find the actual sql statement being executed.

 

 

There's a lot more fields of interest in m_sql_plan_cache, just selecting a couple of fields for illustration.

 

 

select STATEMENT_STRING,TOTAL_CURSOR_DURATION,TOTAL_EXECUTION_TIME,TOTAL_EXECUTION_FETCH_TIME,
TOTAL_PREPARATION_TIME,TOTAL_RESULT_RECORD_COUNT
from m_sql_plan_cache
where upper(statement_string) like '%CV_EQUIP_GEN_NEW_STATUS%'
and user_name = 'SMART_BS1' -- User name from the SDA connection details
order by last_execution_timestamp desc

 

I'll execute each statement 3 times so we can average these by dividing by 3.

(Note the equipment range record count is 10,224/3 = 3408)

 

 

Qry1_results.PNG

 

This is confirmed by BS1 results window


Statement 'select * from "SDA_TEST"."Zhana_CV_EQUIP_GEN_NEW_STATUS" r_cv where r_cv.equnr like '6XR39%' -- ...'

successfully executed in 481 ms 263 µs  (server processing time: 352 ms 300 µs)

Fetched 3408 row(s) in 2.048 seconds (server processing time: 9 ms 802 µs)

 

So we can see that the filtering is being successfully passed to the remote query as expected.

 

Qry1_SQL_String.PNG

 

Query 2

Query same range, but this time apply filters to local BW equipment master table and join local to remote table in query.

 

select le.equipment, re.stat
from sapbs1."/BI0/QEQUIPMENT" le, "SDA_TEST"."Zhana_CV_EQUIP_GEN_NEW_STATUS" re
where le.equipment like '6XR39%'
and le.equipment = re.equnr -- inner join on equipment
and le.dateto = '99991231' -- active records only
and re.datbi = '99991231' -- active records only

SQL Plan cache results

 

Qry2_results.PNG

 

Statement String,

 

Qry2_SQL_String.PNG

 

SDA was smart enough to apply the equipment range filter to the remote calc view, even though I had applied that filter to the local db table.

 

 

Query 3

(2 local tables & remote join to Calc View)

 

Create a local column table and insert the same equipment numbers range as previously used.

 

create column table sda_test.temp_equip_list (EQUIPMENT NVARCHAR(18) );
insert into sda_test.temp_equip_list
select equipment from sapbs1."/BI0/QEQUIPMENT" where equipment like '6XR39%';

Now include the new column table in the same query.

 

select lt.equipment, re.stat
from sda_test.temp_equip_list lt, sapbs1."/BI0/QEQUIPMENT" le, "SDA_TEST"."Zhana_CV_EQUIP_GEN_NEW_STATUS" re
where lt.equipment = le.equipment
and le.equipment = re.equnr
and le.dateto = '99991231'
and re.datbi = '99991231'

I was a bit weary of this one, so let's check the explain plan ahead of executing.

 

Qry3_ExplainPlan.PNG

 

 

That's not good, the remote query has lost the crucial equipment filter and is now only filtering out the in-active records. I would have expected some remote cache or remote join type in this scenario. Let's let it roll to confirm..

 

Query hangs and eventually after a couple of cancel attempts, gives a rollback error on the local db.

 

Checking the remote to check the damage..

Qry3_results.PNG

 

Ugghh, not good at all. it bombed after 56million rows, the remote calculation view can return up to 80 million active records. The sql statement string reflects the same as the explain plan above.

 

Query 4

Let's simplify this a bit, let's just see if having one local table and one remote table helps the remote cache and remote query join kick in. So created a new virtual table called Zhana_EQBS (4 million rows x 12 columns) and join to local sda_test.temp_equip_list (50K rows x 1 column)

 

select lt.equipment, re.b_werk plant, re.b_lager sloc
from sda_test.temp_equip_list lt, "SDA_TEST"."Zhana_EQBS" re
where lt.equipment = re.equnr

 

While the response time is quite impressive, all 4 million rows are still fetched from the remote query

(Note, again executed same query 3 times)

Qry4_SQL.PNG

Qry4_SQL_String.PNG

 

Observations in Summary:

  • So based on the results above, we have to be quite selective on how we use SDA as it comes out of the box in rev 72. It has potential and there may be improvements on the way.
  • I've been able to create a calculation view based on a virtual table and expose it to BW through an external view, but I haven't been able to create a Attribute or Analytical view based on virtual tables as all virtual tables get IS_COLUMN attribute set to false.
  • There's a bug in Studio (again rev 72) where you can't filter on the remote schema/table. It only brings up the local tables, which is quite painful if you have a lot of activated content in _SYS_BIC for example.
  • There are parameters available under configuration that may be able to help, so I'm going to open a message and engage SAP to see if we can improve the efficiency on how smart data access is suppose to work. I'm not sure if statistics have a part to play in the query optimization, but again hoping SAP folks can help out here.

Viewing all articles
Browse latest Browse all 676

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>