[转帖]关于大内存页面 transparent_hugepage

关于,内存,页面,transparent,hugepage · 浏览次数 : 0

小编点评

**Transparent Huge Pages (THP)** **What is THP?** THP (Transparent Huge Pages) is a kernel feature that allows the kernel to allocate memory in large chunks (2MB to 1GB) directly to applications instead of using traditional 4KB pages. This reduces memory fragmentation and improves performance. **Benefits of THP:** * **Memory efficiency:** THP allows applications to allocate memory in large chunks, which can significantly reduce memory fragmentation and improve performance. * **Performance:** By reducing memory fragmentation, THP can improve memory access times, resulting in faster performance. * **Supports anonymous memory regions:** THP is suitable for managing anonymous memory regions, which are not supported by traditional 4KB page allocation. **How THP Works:** The kernel uses a khugepaged thread to dynamically allocate and deallocate memory in large chunks. This thread scans all memory regions and replaces any 4KB page allocation requests with THP allocations when possible. **Limitations of THP:** * **Memory availability:** THP requires continuous memory space to be available for allocation. * **Performance impact:** THP can slightly impact performance, as it involves additional memory management overhead. * **RAC compatibility:** THP is not supported in RAC environments, as it can lead to node restarts. **Optimizing THP Usage:** * Use `posix_memalign()` to ensure that large allocations are aligned to 2MB boundaries. * Consider disabling Transparent Huge Pages for applications that do not require memory allocation in large chunks. **Conclusion:** THP is a useful kernel feature for memory management in Linux systems with large memory allocations. However, it has some limitations and performance implications. By optimizing its usage and considering the limitations, you can maximize the benefits of THP while avoiding its drawbacks.

正文

https://www.cnblogs.com/kevinlucky/p/9955486.html

 

Transparent Huge Pages (THP) are enabled by default in RHEL 6 for all applications. The kernel attempts to allocate hugepages whenever possible and any Linux process will receive 2MB pages if the mmap region is 2MB naturally aligned. The main kernel address space itself is mapped with hugepages, reducing TLB pressure from kernel code. For general information on Hugepages, see: What are Huge Pages and what are the advantages of using them?

The kernel will always attempt to satisfy a memory allocation using hugepages. If no hugepages are available (due to non availability of physically continuous memory for example) the kernel will fall back to the regular 4KB pages. THP are also swappable (unlike hugetlbfs). This is achieved by breaking the huge page to smaller 4KB pages, which are then swapped out normally.

But to use hugepages effectively, the kernel must find physically continuous areas of memory big enough to satisfy the request, and also properly aligned. For this, a khugepaged kernel thread has been added. This thread will occasionally attempt to substitute smaller pages being used currently with a hugepage allocation, thus maximizing THP usage.

In userland, no modifications to the applications are necessary (hence transparent). But there are ways to optimize its use. For applications that want to use hugepages, use of posix_memalign() can also help ensure that large allocations are aligned to huge page (2MB) boundaries.

Also, THP is only enabled for anonymous memory regions. There are plans to add support for tmpfs and page cache. THP tunables are found in the /sys tree under /sys/kernel/mm/redhat_transparent_hugepage.

一般而言,内存管理的最小块级单位叫做page,一个page是4096bytes,1M的内存会有256个page,1GB的话就会有256,000个page。CPU通过内置的内存管理单元维护着page表记录。

正常来说,有两种方式来增加内存可以管理的内存大小:

1.增大硬件内存管理单元的大小。

2.增大page的大小。

第一个方法不是很现实,现代的硬件内存管理单元最多只支持数百到上千的page表记录,并且,对于数百万page表记录的维护算法必将与目前的数百条记录的维护算法大不相同才能保证性能,目前的解决办法是,如果一个程序所需内存page数量超过了内存管理单元的处理大小,操作系统会采用软件管理的内存管理单元,但这会使程序运行的速度变慢。

从redhat 6(centos,sl,ol)开始,操作系统开始支持 Huge Pages,也就是大页。

简单来说, Huge Pages就是大小为2M到1GB的内存page,主要用于管理数千兆的内存,比如1GB的page对于1TB的内存来说是相对比较合适的。

THP(Transparent Huge Pages)是一个使管理Huge Pages自动化的抽象层。

目前需要注意的是,由于实现方式问题,THP会造成内存锁影响性能,尤其是在程序不是专门为大内内存页开发的时候,简单介绍如下:

操作系统后台有一个叫做khugepaged的进程,它会一直扫描所有进程占用的内存,在可能的情况下会把4kpage交换为Huge Pages,在这个过程中,对于操作的内存的各种分配活动都需要各种内存锁,直接影响程序的内存访问性能,并且,这个过程对于应用是透明的,在应用层面不可控制,对于专门为4k page优化的程序来说,可能会造成随机的性能下降现象。

小知识点:

1:从RedHat 6, OEL 6, SLES 11 and UEK2 kernels 开始,系统缺省会启用 Transparent HugePages :用来提高内存管理的性能透明大页(Transparent HugePages )和之前版本中的大页功能上类似。主要的区别是:Transparent HugePages 可以实时配置,不需要重启才能生效配置;

2:Transparent Huge Pages在32位的RHEL 6中是不支持的。

Transparent Huge Pages are not available on the 32-bit version of RHEL 6.

3: ORACLE官方不建议我们使用RedHat 6, OEL 6, SLES 11 and UEK2 kernels 时的开启透明大页(Transparent HugePages ), 因为透明大页(Transparent HugePages ) 存在一些问题:

        1.在RAC环境下 透明大页(Transparent HugePages )会导致异常节点重启,和性能问题;

        2.在单机环境中,透明大页(Transparent HugePages ) 也会导致一些异常的性能问题;

Transparent HugePages memory is enabled by default with Red Hat Enterprise Linux 6, SUSE Linux Enterprise Server 11, and Oracle Linux 6 with earlier releases of Oracle Linux Unbreakable Enterprise Kernel 2 (UEK2) kernels. Transparent HugePages memory is disabled in later releases of Oracle Linux UEK2 kernels.Transparent HugePages can cause memory allocation delays during runtime. To avoid performance issues, Oracle recommends that you disable Transparent HugePages on all Oracle Database servers. Oracle recommends that you instead use standard HugePages for enhanced performance.Transparent HugePages memory differs from standard HugePages memory because the kernel khugepaged thread allocates memory dynamically during runtime. Standard HugePages memory is pre-allocated at startup, and does not change during runtime.

Starting with RedHat 6, OEL 6, SLES 11 and UEK2 kernels, Transparent HugePages are implemented and enabled (default) in an attempt to improve the memory management. Transparent HugePages are similar to the HugePages that have been available in previous Linux releases. The main difference is that the Transparent HugePages are set up dynamically at run time by the khugepaged thread in kernel while the regular HugePages had to be preallocated at the boot up time. Because Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC, Oracle strongly advises to disable the use of Transparent HugePages. In addition, Transparent Hugepages may cause problems even in a single-instance database environment with unexpected performance problems or delays. As such, Oracle recommends disabling Transparent HugePages on all Database servers running Oracle.

4:安装Vertica Analytic Database时也必须关闭透明大页功能。

 

 

RHEL6优化了内存申请的效率,而且在某些场景下对KVM的性能有明显提升:http://www.linux-kvm.org/wiki/images/9/9e/2010-forum-thp.pdf。

Hadoop是个高密集型内存运算系统,这个改动似乎给它带来了副作用。理论上运算型Java程序应该更多的使用用户态CPU才对,Cloudera官方也推荐关闭THP。于是参考一些文章作了调整:

http://structureddata.org/2012/06/18/linux-6-transparent-huge-pages-and-hadoop-workloads/

 

效果很明显,大概12:05分的时候操作的,系统态占用基本消失了。文件Cache使用上升、机器负载下降。

除了手动修改运行时参数之外,还可以修改 /etc/grub.conf 里内核的启动参数,追加“transparent_hugepage=never”(此选项只对 /sys/kernel/mm/redhat_transparent_hugepage/enabled 有效)。

与[转帖]关于大内存页面 transparent_hugepage相似的内容:

[转帖]关于大内存页面 transparent_hugepage

https://www.cnblogs.com/kevinlucky/p/9955486.html Transparent Huge Pages (THP) are enabled by default in RHEL 6 for all applications. The kernel attem

[转帖]删除数据后的Redis内存占用率为什么还是很高?

https://zhuanlan.zhihu.com/p/490569316 有时候Redis明明做了数据删除,数据量已经不大了,但是使用top命令的时候,还会发现Redis占用了很多内存? PS:关于 Redis的高并发及高可用,到底该如何保证?可以参考下这个帖子:httss://http://z

[转帖]关于字节序(大小端)的一点想法

https://www.zhihu.com/people/bei-ji-85/posts 今天在一个技术群里有人问起来了,当时有一些讨论(不完全都是我个人的观点),整理一下: 为什么网络字节序(多数情况下)是大端? 早年设备的缓存很小,先接收高字节能快速的判断报文信息:包长度(需要准备多大缓存)、地

[转帖]关于iostat的问题,svctm数据不可信

使用FIO对磁盘进行压力测试,使用1个线程对磁盘进行随机读,设置单次read的数据块分别为128KB和1M,数据如下: (1)单次IO数据块为128KB (2)单次IO数据块为1M 从上面的数据可以看出,当单次IO的数据块变大,服务时间svctm反而变短,这明显不符合常规认知。 查阅到fio的相关资

[转帖]Redis命令DEL与UNLINK的区别,如何正确删除大Key!

https://www.itxm.cn/post/47824.html 背景 在这篇文章中做过使用del命令删除大key的实验,结果是del命令随着key的增大,主线程阻塞的时间就越长。 这与之前看redis5.0.8版本的代码中关于多线程删除操作的感官不符,于是决定先查看redis关于删除操作的代

[转帖]ss 输出格式说明

ss 命令输出详解ss 全名socket statistics,是iproute2中的一员ss已经替代netstat,大热于江湖。但是关于ss命令输出的内容,是什么意思呢? [root@test]# ss -s Total: 26437 (kernel 27730) TCP: 31961 (esta

【转帖】【漏洞提示】MySQL8.0.29因重大bug官网已下架

前阵子,MySQL官网已经将 MySQL 8.0.29版本下架。据悉下架原因是由于MySQL 8.0.29 存在关于InnoDB解释器的重大Bug。而最新版本 8.0.30及以上的版本已修复此漏洞。各大镜像站也已经移除了 8.0.29 的下载。大家可根据自身项目实际情况进行升级。如果是现有版本使用的

[转帖]关于统信UOS操作系统版本介绍

https://blog.csdn.net/qq43748322/article/details/120196200 当下信创产业发展的如火如荼,今天聊聊统信操作系统UOS 相比较于其它国内品牌操作系统,统信UOS的版本、分支比较多,下面为大家详细说说各UOS版本 目前统信UOS系统主要分为桌面版和

[转帖]关于华为产品生命周期

关于企业级产品都有EOL里程碑,因些需要考虑对已购产品、业务的生命周期进行升级、迁移、替换等统筹规划。另外如果遇到产品、业务整体出售,还需要评估对现有资产的影响等不可控因素。 今天聊聊华为产品的生命周期,点击查看原文 华为产品生命周期关键里程碑: 华为软件版本生命周期关键里程碑: 点击查询华为产品生

[转帖]关于SRE方法论的一些笔记

写在前面 阿里系列有一本《云原生操作系统Kubernetes》中作者在前言里讲到Google开源的Kubernetes和《SRE Google运维解密》这本书是剑法和气功的关系换句话讲Kubernetes是术,SRE Google运维解密是道作为云原生基础设施的Kubernetes小伙伴么应该多少有