Regular Pages and HugePages

This section gives a general picture of memory access in virtual memory systems and of how pages are referenced.
When a process works with a piece of memory, the pages it uses are referenced in a page table local to that process. The entries in this table in turn reference the system-wide page table, which holds the references to the actual physical memory addresses. So, in principle, a user-mode process (e.g. an Oracle process) follows its local page table to the system page table and from there reaches physical memory through virtual addresses. As you can see below, it is also possible (and, because of the SGA, very common for Oracle RDBMS) for two different O/S processes to point to the same entry in the system-wide page table.

When HugePages are in play, the usual page tables are still used. The basic difference is that the entries in both the process page table and the system page table carry attributes describing huge pages, so any page in a page table can be either a huge page or a regular page. The following diagram illustrates 4096K hugepages, but it would look the same for any huge page size.

Some HugePages Facts/Features

HugePages can be allocated on the fly, but in practice they should be reserved during system startup. Otherwise the allocation may fail, because by then most memory is already mapped in 4K pages and too fragmented to supply contiguous huge pages.
HugePage sizes vary from 2MB to 256MB depending on kernel version and hardware architecture (see the related section below).
Once reserved, HugePages are not reserved or released again after system startup unless a system administrator intervenes by changing the HugePages configuration (i.e. the number of pages available or the pool size).
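
Because a runtime request can be satisfied only in part, it helps to compare the number of pages requested with the number the kernel actually reserved. The following is a minimal sketch, assuming a Linux system where the pool is resized as root with sysctl -w vm.nr_hugepages=N and read back from /proc/sys/vm/nr_hugepages. The function name hugepages_shortfall is hypothetical and not part of any Oracle or kernel tooling.

```shell
# Print how many huge pages the kernel failed to reserve.
# $1 = pages requested, $2 = pages actually reserved
hugepages_shortfall() {
    requested=$1
    actual=$2
    if [ "$actual" -ge "$requested" ]; then
        echo 0
    else
        echo $((requested - actual))
    fi
}

# Typical use (needs root, so shown as comments only):
#   sysctl -w vm.nr_hugepages=33792
#   hugepages_shortfall 33792 "$(cat /proc/sys/vm/nr_hugepages)"
```

A non-zero result after a runtime resize suggests the pool should be reserved at boot through /etc/sysctl.conf instead.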

HugePages and Oracle 11g Automatic Memory Management (AMM)

AMM and HugePages are not compatible. AMM must be disabled on 11g before HugePages can be used. See hugepage in 11g for further information.

Configuring HugePages

[oracle@db-36 ~]$ cat /etc/sysctl.conf |grep nr_hugepages
vm.nr_hugepages=33792

Rule of thumb: vm.nr_hugepages >= SGA / 2MB. For example, with SGA = 64GB, vm.nr_hugepages >= 32768.
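
The rule above is a ceiling division of the SGA size by the huge page size. A small sketch of that arithmetic follows, assuming the SGA size is known in bytes and the page size is the Hugepagesize value (in kB) from /proc/meminfo. The function name min_hugepages is made up for illustration.

```shell
# Smallest number of huge pages that covers an SGA.
# $1 = SGA size in bytes, $2 = huge page size in kB
min_hugepages() {
    sga_bytes=$1
    page_bytes=$(( $2 * 1024 ))
    # Integer ceiling division: round up when the SGA is not an
    # exact multiple of the huge page size.
    echo $(( (sga_bytes + page_bytes - 1) / page_bytes ))
}

# 64GB SGA with 2MB huge pages
min_hugepages $((64 * 1024 * 1024 * 1024)) 2048    # prints 32768
```

This is only a lower bound; the value configured on this system (33792) leaves some headroom above it.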

Configuring limits.conf

cat /etc/security/limits.conf

oracle soft nofile 131072
oracle hard nofile 131072
oracle soft nproc 131072
oracle hard nproc 131072
oracle soft core unlimited
oracle hard core unlimited
oracle soft memlock 69206016 --> must be greater than the SGA (value is in KB)
oracle hard memlock 69206016 --> must be greater than the SGA (value is in KB)
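
The memlock figure used here equals the size of the whole HugePages pool expressed in KB: 33792 pages x 2048 kB = 69206016 kB (66GB), which exceeds a 64GB SGA. A sketch of that calculation follows, assuming memlock is given in KB as limits.conf expects. The function name memlock_kb_for_pool is hypothetical.

```shell
# memlock value (in kB) large enough to lock an entire HugePages pool.
# $1 = number of huge pages, $2 = huge page size in kB
memlock_kb_for_pool() {
    echo $(( $1 * $2 ))
}

memlock_kb_for_pool 33792 2048    # prints 69206016
```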


[oracle@db-36 ~]$ more /proc/meminfo |grep -i HugePage
HugePages_Total: 33792
HugePages_Free: 998
HugePages_Rsvd: 38
Hugepagesize: 2048 kB

HugePages_Free is far below HugePages_Total, which shows that the HugePages are already in use.
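
To read these counters at a glance, the pages in use can be derived as HugePages_Total minus HugePages_Free, while HugePages_Rsvd counts pages that are promised to a process but not yet touched. The sketch below assumes the standard /proc/meminfo field names shown above. The function name hugepages_usage and the optional file argument are illustrative additions.

```shell
# Summarise HugePages usage from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo)
hugepages_usage() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        /^HugePages_Rsvd:/  { rsvd = $2 }
        /^Hugepagesize:/    { size_kb = $2 }
        END {
            used = total - free
            printf "used=%d pages (%d MB), reserved but not yet touched=%d pages\n", used, used * size_kb / 1024, rsvd
        }
    ' "${1:-/proc/meminfo}"
}

# Usage on a Linux host:
#   hugepages_usage
```

With the figures above this reports 32794 pages (65588 MB) in use and 38 pages reserved.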

Script: computes the number of HugePages the system needs

#!/bin/bash
#
# hugepages_settings.sh
#
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
#
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
#
# This script is provided by Doc ID 401749.1 from My Oracle Support
# http://support.oracle.com

# Welcome text
echo "
This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please make sure
that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m

Press Enter to proceed..."

read

# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`

# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk '{print $2}'`

# Initialize the counter
NUM_PG=0

# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | awk '{print $5}' | grep "[0-9][0-9]*"`
do
    MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
    if [ $MIN_PG -gt 0 ]; then
        NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
    fi
done

RES_BYTES=`echo "$NUM_PG * $HPG_SZ * 1024" | bc -q`

# An SGA less than 100MB does not make sense
# Bail out if that is the case
if [ $RES_BYTES -lt 100000000 ]; then
    echo "***********"
    echo "** ERROR **"
    echo "***********"
    echo "Sorry! There are not enough total of shared memory segments allocated for
HugePages configuration. HugePages can only be used for shared memory segments
that you can list by command:

# ipcs -m

of a size that can match an Oracle Database SGA. Please make sure that:
* Oracle Database instance is up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not configured"
    exit 1
fi

# Finish with results
case $KERN in
    '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
           echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
    '2.6') echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
    *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac

# End
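
The core of the script is its per-segment rounding: each shared memory segment contributes floor(bytes / huge page size) pages plus one extra page, and segments smaller than one huge page are ignored. The sketch below isolates that step so it can be tried without a running instance. It is not part of the Oracle script: it uses shell arithmetic instead of bc, takes the segment sizes as arguments rather than reading them from ipcs -m, and the function name pages_for_segments is hypothetical.

```shell
# Total huge pages needed for a list of shared memory segments.
# $1 = huge page size in kB; remaining arguments = segment sizes in bytes
pages_for_segments() {
    hpg_kb=$1
    shift
    num_pg=0
    for seg_bytes in "$@"; do
        # Whole huge pages that fit in this segment
        min_pg=$(( seg_bytes / (hpg_kb * 1024) ))
        # Segments smaller than one huge page are skipped;
        # larger ones get one extra page as a safety margin.
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

# One 64GB segment and one 1MB segment, 2MB huge pages
pages_for_segments 2048 68719476736 1048576    # prints 32769
```

The extra page per segment is why the recommendation comes out slightly above a plain SGA / 2MB division.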


Example:

[oracle@db-36 ~]$ sh page.sh

This script is provided by Doc ID 401749.1 from My Oracle Support
(http://support.oracle.com) where it is intended to compute values for
the recommended HugePages/HugeTLB configuration for the current shared
memory segments. Before proceeding with the execution please make sure
that:
* Oracle Database instance(s) are up and running
* Oracle Database 11g Automatic Memory Management (AMM) is not setup
(See Doc ID 749851.1)
* The shared memory segments can be listed by command:
# ipcs -m

Press Enter to proceed...

Recommended setting: vm.nr_hugepages = 32835
[oracle@db-36 ~]$ cat /etc/sysctl.conf |grep vm.nr_hugepages
vm.nr_hugepages=33792

As we can see, the configured value (33792) is above the recommended 32835, so our HugePages setting is reasonable.
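
The margin between the configured and the recommended value can be expressed in pages and in MB. A brief sketch, with the hypothetical function name hugepages_headroom and the 2048 kB page size taken from the /proc/meminfo output earlier:

```shell
# Spare huge pages beyond the recommendation.
# $1 = configured pages, $2 = recommended pages, $3 = huge page size in kB
hugepages_headroom() {
    spare=$(( $1 - $2 ))
    echo "$spare pages ($(( spare * $3 / 1024 )) MB)"
}

hugepages_headroom 33792 32835 2048    # prints 957 pages (1914 MB)
```

A negative result would mean the pool is too small for the shared memory segments currently allocated.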