Channel: SCN : Blog List - SAP HANA Developer Center

Accelerating Data File Import into the SAP HANA Database


This article looks at how to accelerate importing CSV files into SAP HANA and identifies the factors and methods that affect import speed.

Hardware factors

The maximum import speed of SAP HANA is bounded by the hardware configuration. No matter what we tune at the software level, hardware remains the most important factor. Three hardware factors affect the maximum import speed.

  • Disk type

Because importing into SAP HANA always involves writing the transaction log and the delta log, disk write speed is significant. Using SSDs for both the log area and the data area of SAP HANA is recommended.

  • Number of CPU cores

SAP HANA can make full use of multiple cores to import data, so the number of CPU cores largely determines the import speed.

  • Size of memory

SAP HANA is an in-memory database whose data stays in memory. If memory is not large enough, importing data will cause a memory shortage, and HANA will then unload other data that has not been used recently, which slows down the import. In addition, while the CSV files are being read, the operating system's file cache grows rapidly. When the cache becomes too large, the operating system releases some of it, and this also disturbs the import. Based on my experiments, I recommend having free memory of about twice the size of the CSV files.
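The memory rule of thumb above can be checked before starting an import. The following is only a minimal sketch: the function name `has_import_headroom` and the sample sizes are made up for illustration, and the factor of 2 is simply the recommendation from my experiments, not a documented SAP HANA limit.

```python
from typing import Sequence

GIB = 1024 ** 3  # bytes in one gibibyte


def has_import_headroom(free_memory_bytes: int,
                        csv_sizes_bytes: Sequence[int],
                        factor: float = 2.0) -> bool:
    """Return True when free memory is at least `factor` times the total CSV size."""
    required = factor * sum(csv_sizes_bytes)
    return free_memory_bytes >= required


# Hypothetical numbers: 256 GiB free, two files of 40 GiB and 60 GiB (200 GiB needed).
print(has_import_headroom(256 * GIB, [40 * GIB, 60 * GIB]))
# A single 150 GiB file would need 300 GiB of free memory.
print(has_import_headroom(256 * GIB, [150 * GIB]))
```

How the free memory figure is obtained (operating system tools or HANA monitoring views) is left open here on purpose.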

Import file factors

For the CSV files themselves, the following factors affect import speed:

  • The correct format of importing files

If a CSV file contains data that does not match the table's format, every batch that contains such data is not imported into the database. This reduces the import speed.

  • Size of csv file

The CSV file needs to be large enough for SAP HANA to import it with multiple threads.
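To see why a few malformed rows are expensive, it helps to count how many batches they would spoil before running the import. The sketch below is a simplification: it only checks the number of columns per row, not data types, and the function name `count_rejected_batches`, the sample data, and the batch size are hypothetical.

```python
import csv
import io


def count_rejected_batches(csv_text, expected_columns, batch_size):
    """Count batches that contain at least one row with the wrong column count."""
    reader = csv.reader(io.StringIO(csv_text))
    bad_batches = set()
    for row_index, row in enumerate(reader):
        if len(row) != expected_columns:
            # The whole batch this row belongs to would be rejected.
            bad_batches.add(row_index // batch_size)
    return len(bad_batches)


# Six rows for a three-column table; rows 2 and 5 are malformed.
sample = "1,a,10\n2,b\n3,c,30\n4,d,40\n5,e,50,extra\n6,f,60\n"
print(count_rejected_batches(sample, expected_columns=3, batch_size=2))
```

With a batch size of 2, the two bad rows fall into two different batches, so four of the six rows would be lost even though only two are wrong.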

SAP HANA factors

In SAP HANA, data is stored not only in memory but also on disk and in log files. To reach the maximum import speed, we need to give up some settings that exist for data safety.

Partition

Partitioning a table helps increase the degree of parallelism. In my experiments, hash partitioning was the best partitioning method, and a numeric column was the best type of partitioning column.
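As an illustration, a hash-partitioned column table on a numeric column could be created with a statement like the one generated below. This is a sketch under assumptions: the table name `"DEMO"."SALES"`, its columns, and the partition count of 16 are invented, and the helper `hash_partition_ddl` only builds the SQL text without executing anything.

```python
def hash_partition_ddl(table, columns, partition_column, partitions):
    """Build a CREATE COLUMN TABLE statement with hash partitioning."""
    names = [name for name, _ in columns]
    if partition_column not in names:
        raise ValueError(f"unknown partition column: {partition_column}")
    column_defs = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns)
    return (
        f"CREATE COLUMN TABLE {table} ({column_defs}) "
        f'PARTITION BY HASH ("{partition_column}") PARTITIONS {partitions}'
    )


# Hypothetical table; the numeric ID column is used as the partitioning column.
ddl = hash_partition_ddl(
    '"DEMO"."SALES"',
    [("ID", "INTEGER"), ("CUSTOMER", "NVARCHAR(50)"), ("AMOUNT", "DECIMAL(15,2)")],
    partition_column="ID",
    partitions=16,
)
print(ddl)
```

If the table has a primary key, the partitioning column normally has to be part of it, so the example leaves the key out to stay small.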

Auto merge

By default, imported data is first stored in the delta area of the table, and the delta area is then merged into the main area automatically. The merge does not itself import any data, but it runs alongside the import and competes with it for resources. So we can disable auto merge to ensure that no merge operation runs during the import.
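One possible ordering is to switch auto merge off, run the import, merge once by hand, and switch auto merge back on. The sketch below only assembles that sequence as SQL text. The helper name `import_without_automerge` and the table name are hypothetical, and the manual `MERGE DELTA OF` step afterwards is my assumption about a sensible cleanup, not something measured in the experiments.

```python
def import_without_automerge(table, import_statement):
    """Return the statements in the order they would be executed."""
    return [
        f"ALTER TABLE {table} DISABLE AUTOMERGE",
        import_statement,                        # the actual IMPORT statement
        f"MERGE DELTA OF {table}",               # one manual merge after loading
        f"ALTER TABLE {table} ENABLE AUTOMERGE",
    ]


steps = import_without_automerge('"DEMO"."SALES"', "<import statement goes here>")
for statement in steps:
    print(statement)
```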

Delta log

For the column store, SAP HANA writes the delta log to disk while importing data, which reduces the import speed. Our aim is only to get the data into memory, whereas the delta log exists to make sure that imported data is not lost. So we can disable the delta log to improve the import speed.
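Because disabling the delta log removes a safety net, it is worth pairing the "off" statement with the matching "on" statement. A minimal sketch follows, with the helper name `delta_log_statements` and the table name invented for illustration. The trailing savepoint is my assumption about how to get the loaded data persisted once logging is back on, so please check it against the SAP HANA documentation for your revision.

```python
def delta_log_statements(table):
    """Return (before, after) statement lists to wrap around an import."""
    before = [f"ALTER TABLE {table} DISABLE DELTA LOG"]
    after = [
        f"ALTER TABLE {table} ENABLE DELTA LOG",
        "ALTER SYSTEM SAVEPOINT",  # assumption: persist the data loaded without logging
    ]
    return before, after


before, after = delta_log_statements('"DEMO"."SALES"')
print(before)
print(after)
```

If the system fails between the two halves, the rows imported in that window may be lost, which is exactly the trade-off described above.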

Number of threads

To make full use of multiple cores, we can set the number of threads used for the import. In my experiments, setting the number of threads equal to the number of CPU cores gave the best result.

Number of tuples in a batch

SAP HANA imports data in batches. We can set the number of tuples in a batch.
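Both settings are passed as options of the import statement. The sketch below builds such a statement with the thread count set to the number of cores of the HANA server. The helper name `build_import_statement`, the file path, the table name, and the batch size of 10000 are hypothetical values chosen for illustration, not results from the experiments.

```python
def build_import_statement(csv_path, table, server_cores, batch_size):
    """Build an IMPORT FROM CSV FILE statement with THREADS and BATCH options."""
    if server_cores < 1 or batch_size < 1:
        raise ValueError("server_cores and batch_size must be positive")
    escaped_path = csv_path.replace("'", "''")  # escape quotes inside the SQL literal
    return (
        f"IMPORT FROM CSV FILE '{escaped_path}' INTO {table} "
        f"WITH THREADS {server_cores} BATCH {batch_size}"
    )


# server_cores is the core count of the HANA host, not of the client machine.
statement = build_import_statement("/data/sales.csv", '"DEMO"."SALES"', 16, 10000)
print(statement)
```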

Summary

With two different hardware configurations, we measured different import speeds.

Hardware configuration                           Importing speed
CPU: 16 cores, Memory: 256GB, Disk type: SSD     100M/s
CPU: 80 cores, Memory: 1TB, Disk type: SSD       308.8M/s
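The two measurements can be put side by side with a little arithmetic. The sketch below uses only the figures reported above, in the same M/s unit. The per-core view is my own way of reading the results, and since memory size also differs between the two machines, the comparison should be taken as a rough indication only.

```python
# Measured import speed (M/s) keyed by number of CPU cores, as reported above.
results = {16: 100.0, 80: 308.8}

per_core = {cores: speed / cores for cores, speed in results.items()}
core_ratio = 80 / 16                      # how many times more cores
speed_ratio = results[80] / results[16]   # how many times faster

for cores in sorted(per_core):
    print(f"{cores} cores: {per_core[cores]:.2f} M/s per core")
print(f"cores x{core_ratio:.1f}, speed x{speed_ratio:.2f}")
```

Five times as many cores gave roughly three times the speed, so in these two runs the import speed grew with the core count but less than proportionally.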

 

Flow chart of the steps for accelerating the import (figure: 1.jpg).

