Misconceptions in the Hana context
The startup time until the SAP system is available can exceed 30 to 60 minutes because all data has to be loaded into main memory: yes, loading all data into main memory takes some time, but this is no different for an AnyDB.
An AnyDB also needs time to fill its buffer. That usually happens when the data is first accessed, and the data stays in the buffer until the LRU (Least Recently Used) algorithm kicks in and displaces it.
At every startup, Hana loads the complete row store into RAM; after that, the system is immediately available. The startup process in brief:
- Opening the data files;
- Reading the information of the last successful savepoint (mapping of logical pages to physical pages in the data files and loading the list of open transactions);
- Loading the Row Store (about five minutes per 100 GB, depending on the I/O subsystem);
- Replaying the redo logs;
- Rolling back the transactions that were not committed;
- Writing a savepoint;
- Loading the column tables marked for preload;
- "Lazy load" of the remaining tables (asynchronous reloading of the column tables that were already loaded before the restart) - see the query sketch after this list.
The test system is a BW on Hana on IBM Power: the database size is 40 GB, the Row Store 6 GB; the start process takes about 60 seconds, the stop process about 75 seconds.
In the second run, a 5 GB column table (REPOSRC) is added and marked for preload with the SQL statement alter table REPOSRC preload all. Again the start process took about 60 seconds and the stop process about 75 seconds.
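For reference, the preload flag used in this test can be set and removed per table; a short sketch (preload none reverts to the default lazy-load behaviour):
-- mark the table so that its columns are loaded directly after startup
alter table REPOSRC preload all;
-- revert to the default behaviour (load on first access)
alter table REPOSRC preload none;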
Why didn't the startup process become significantly longer even though there is more data to load?
Since SPS7, preloading, together with the reloading of the tables, takes place asynchronously, directly after the startup of the Hana DB has completed.
This way the system is immediately available again without having to wait for the column-oriented tables to be loaded. If you want to test how long it takes to load all tables into RAM, you can do so with the script loadAllTables.py (location: /usr/sap/HDB/SYS/exe/hdb/python_support/), executed as sidadm: python ./loadAllTables.py -user=System -password= -address= -port=3xx15 -namespace=
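If only individual tables are of interest rather than the whole database, loading and unloading can also be triggered manually with SQL; a minimal sketch, again using REPOSRC as the example table:
-- force all columns of the table into memory
load REPOSRC all;
-- displace the table from memory again
unload REPOSRC;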
Statistics are no longer needed with Hana; statistics collection runs no longer have to be scheduled: partially correct. For column-oriented tables the statement is correct: no special collection runs are needed, because the optimizer learns the value distribution very quickly from the dictionary.
For the row store, statistics are generated automatically as soon as they are needed (on the fly). This means they do not have to be scheduled via collection runs. It is currently not officially documented how these statistics can be influenced (e.g. sample size, manual statistics runs, etc.).
Backup
A restore always needs logs for a consistent recovery: wrong. Hana data backups are based on a snapshot technique, i.e. a completely frozen state of the database defined by the log position at the time the backup is executed.
The data backup is therefore in a consistent state without any log. Of course the logs are still needed for rolling forward, e.g. for a point-in-time recovery or to recover to the last possible state before a failure.
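To illustrate the two cases - restoring from the data backup alone versus rolling forward through the logs - the recovery statements look roughly as follows; this is only a sketch: such statements are normally generated by the Hana Studio recovery wizard, can only run against a stopped database, and the backup prefix and timestamp are placeholders:
-- recovery from the data backup alone, without applying any logs
RECOVER DATA USING FILE ('COMPLETE_DATA_BACKUP') CLEAR LOG;
-- point-in-time recovery that rolls forward through the log backups
RECOVER DATABASE UNTIL TIMESTAMP '2015-01-01 12:00:00';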
Backup catalog: catalog information is stored in a file like Oracle's *.anf, which is absolutely needed for a recovery: not quite. The backup catalog is backed up with every data and log backup!
It is not a normal, readable file, and a recovery can take place even without this original file from the backup (see SAP Note 1812057, reconstructing the backup catalog with hdbbackupdiag).
The backed-up catalog can be found in the backup location (for backup-to-disk) or in the backup set of a third-party backup tool and is recognizable by the name log_backup_0_0_0_0..
The catalog contains all the information needed for a recovery, for example which logs are required for which point in time or which files belong to which backup set.
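Besides the Studio, the catalog can also be read with SQL via the monitoring views M_BACKUP_CATALOG and M_BACKUP_CATALOG_FILES; a sketch under the assumption that the column selection matches the installed revision (the BACKUP_ID is a placeholder):
-- the most recent entries in the backup catalog
SELECT BACKUP_ID, ENTRY_TYPE_NAME, SYS_START_TIME, STATE_NAME FROM M_BACKUP_CATALOG ORDER BY SYS_START_TIME DESC;
-- the files that belong to one backup set
SELECT SOURCE_TYPE_NAME, DESTINATION_PATH, BACKUP_SIZE FROM M_BACKUP_CATALOG_FILES WHERE BACKUP_ID = 1234567890;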
If backups are physically deleted at disk, VTL or tape level, the backup catalog still contains these now invalid entries. Currently there is no built-in automatism that cleans this up.
How big is this catalog in your system? You can check this yourself: the Hana Studio backup editor gives an overview if you display all backups including the log backups.
If the catalog is larger than 20 MB, you should take care of housekeeping, because, as already mentioned, it is backed up with every backup - and that means more than 200 times a day! 200 times 20 MB times 3 (for a 3-system landscape) already amounts to 12,000 MB per day.
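Both the size check and the housekeeping can be done with SQL as well; a hedged sketch (it assumes that catalog backups appear with source type 'catalog' in M_BACKUP_CATALOG_FILES; the BACKUP_ID is a placeholder, and COMPLETE additionally deletes the corresponding backup-to-disk files):
-- approximate size of the current backup catalog in MB
SELECT ROUND(MAX(BACKUP_SIZE)/1024/1024, 1) AS CATALOG_MB FROM M_BACKUP_CATALOG_FILES WHERE SOURCE_TYPE_NAME = 'catalog';
-- housekeeping: remove all catalog entries older than the given backup
BACKUP CATALOG DELETE ALL BEFORE BACKUP_ID 1234567890 COMPLETE;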
The result of the sizing report must be doubled: the results of the current SAP sizing reports are final and do not have to be doubled again, as older documentation may still suggest.
A BW scale-up solution can serve as an example, i.e. master and slave nodes are located on one server. According to SAP recommendations, a scale-out approach in the BW environment consists of one master node, which carries the transactional load, and at least two slave nodes, which are responsible for reporting.
The SAP main memory sizing consists of a static and a dynamic part. The static part covers indexes as well as column and row data, i.e. the sum of the user data.
The dynamic part covers temporary memory for reporting (OLAP BW queries), delta merges, sorting and grouping; this memory is released again as soon as the operation is complete.
For example: the Row Store with 53 GB times 2 equals 106 GB; the column store of the master with 11 GB times 2 equals 21 GB (rounded) plus the column data of the slave nodes with 67 GB times 2 equals 135 GB (rounded) results in a column store total of 156 GB. Add about 50 GB of caches and services per server, which ultimately adds up to 106 + 156 + 50 = 312 GB.
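To get a feeling for the static portion of an existing system, the current memory footprint of row and column store can be read from the monitoring views; a minimal sketch, assuming the size columns of M_RS_TABLES and M_CS_TABLES (values in bytes) are available in the installed revision:
-- current row store footprint in GB (used fixed plus variable part)
SELECT ROUND(SUM(USED_FIXED_PART_SIZE + USED_VARIABLE_PART_SIZE)/1024/1024/1024, 2) AS ROW_STORE_GB FROM M_RS_TABLES;
-- current column store footprint in GB per host
SELECT HOST, ROUND(SUM(MEMORY_SIZE_IN_TOTAL)/1024/1024/1024, 2) AS COLUMN_STORE_GB FROM M_CS_TABLES GROUP BY HOST;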