One of the really neat features of HANA is Smart Data Access (SDA). It lets you create virtual tables that point at tables in other databases and query that data from HANA alongside your own. Lately at work I have been playing with SDA tables on a SQL Server database. This database is not owned by IT, so we are not going to integrate it into our HANA environment, but sometimes the business wants to include its data in what we build for them. I added about seven tables that the business had created, holding manually entered data about our products, and joined them together in a scripted calculation view. I then joined that view with some BW tables in another calculation view. Querying that second view took about 9 seconds. Hmmm, not very fast.
Being a curious person and a relative beginner with HANA, I decided to try an experiment. I recreated all of the SDA tables used in that query as column tables in HANA and inserted all of the data from the SDA tables into the new ones. I then created an attribute view with the same joins as the original calculation view on the SDA tables. I swapped the new attribute view in for the SDA calculation view and reran the query. It finished in about 1 second. Much better.
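The copy step is simple enough to sketch. What follows is only an illustration, not what I actually ran: it uses Python's built-in sqlite3 module as a stand-in for both databases, and the table names, columns, and the reload_local_copy helper are all made up for the example. In HANA the source would be the SDA virtual table and the target a regular column table, but the pattern of emptying the local copy and refilling it from the source is the same.

```python
import sqlite3


def reload_local_copy(conn, source_table, local_table):
    """Empty local_table, refill it from source_table, return the new row count.

    Hypothetical helper: table names are trusted identifiers, not user input.
    """
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(f"DELETE FROM {local_table}")
        conn.execute(f"INSERT INTO {local_table} SELECT * FROM {source_table}")
    return conn.execute(f"SELECT COUNT(*) FROM {local_table}").fetchone()[0]


# Stand-in setup: one in-memory database plays both the remote and local side.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE remote_product_dim (product_id INTEGER PRIMARY KEY, category TEXT)"
)
conn.execute(
    "CREATE TABLE local_product_dim (product_id INTEGER PRIMARY KEY, category TEXT)"
)
conn.executemany(
    "INSERT INTO remote_product_dim VALUES (?, ?)",
    [(1, "Widgets"), (2, "Gadgets"), (3, "Gizmos")],
)
conn.commit()

rows_loaded = reload_local_copy(conn, "remote_product_dim", "local_product_dim")
print(rows_loaded)  # 3
```

Because the helper clears the local table before loading, running it again gives the same result instead of duplicate rows, which is what you want from a copy that gets refreshed on a schedule.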
I opened up a SQL console, ran a select * against each view, and compared the total cost in the query plans. The cost for the calculation view on the SDA tables was much higher than the cost for the attribute view on the tables residing in HANA. Unfortunately I couldn't view the visual plan (something about a stack overflow exception... lame!), but the results spoke loudly enough that I didn't really need to see it.
The lesson here is that SDA tables serve a purpose and should be saved for the use cases that fit them. If you need to connect to Hadoop or some other very large data source and don't want to spend a large portion of your HANA memory on that data, SDA is your tool. But in my case, these were dimension tables that totaled about 15,000 rows and took up a tiny amount of memory. I am much better off creating these tables in HANA and truncating and loading them every night with a Data Services job that would run in a couple of seconds. The query time I save more than makes up for the time it takes to load the tables. That brings me back to the title of this blog. Just because I can use SDA doesn't mean I should.
Does anyone else have stories from HANA where just because you can doesn't mean you should?