Linux Memory Management: The Slab Mechanism (Overview)
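The earlier articles analyzed and summarized the code piece by piece, so each part should now be familiar. This article is a final summary of the slab mechanism in the Linux kernel.
The buddy system algorithm uses the page as its basic memory unit, which suits requests for large blocks of memory. For small requests, for example a few dozen or a few hundred bytes, the kernel uses the slab mechanism.
The slab allocator groups objects into caches. Each cache is a "reserve" of objects of the same type. The main memory area that holds a cache is divided into multiple slabs, and each slab consists of one or more contiguous pages. These pages contain both allocated objects and free objects.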
1. The cache manager
The cache manager is the kmem_cache structure, shown below:
/*
 * struct kmem_cache
 *
 * manages a cache.
 */
struct kmem_cache {
/* 1) per-cpu data, touched during every alloc/free */
    struct array_cache *array[NR_CPUS]; /* per-CPU local caches */
/* 2) Cache tunables. Protected by cache_chain_mutex */
    unsigned int batchcount;
    unsigned int limit;
    unsigned int shared;
    unsigned int buffer_size;       /* size of each object in a slab */
    u32 reciprocal_buffer_size;     /* reciprocal of the object size */
/* 3) touched by every alloc & free from the backend */
    unsigned int flags;             /* constant flags */
    unsigned int num;               /* # of objs per slab */
/* 4) cache_grow/shrink */
    /* order of pgs per slab (2^n) */
    unsigned int gfporder;
    /* force GFP flags, e.g. GFP_DMA */
    gfp_t gfpflags;
    size_t colour;                  /* cache colouring range: number of colours */
    unsigned int colour_off;        /* colour offset: size of one colouring step */
    struct kmem_cache *slabp_cache;
    unsigned int slab_size;         /* size of the slab management area: struct slab plus the kmem_bufctl_t array */
    unsigned int dflags;            /* dynamic flags */
    /* constructor func */
    void (*ctor)(void *obj);
/* 5) cache creation/removal */
    const char *name;
    struct list_head next;
/* 6) statistics */
#ifdef CONFIG_DEBUG_SLAB
    unsigned long num_active;
    unsigned long num_allocations;
    unsigned long high_mark;
    unsigned long grown;
    unsigned long reaped;
    unsigned long errors;
    unsigned long max_freeable;
    unsigned long node_allocs;
    unsigned long node_frees;
    unsigned long node_overflow;
    atomic_t allochit;              /* cache hit count, updated on allocation */
    atomic_t allocmiss;             /* cache miss count, updated on allocation */
    atomic_t freehit;
    atomic_t freemiss;
    /*
     * If debugging is enabled, then the allocator can add additional
     * fields and/or padding to every object. buffer_size contains the total
     * object size including these internal fields, the following two
     * variables contain the offset to the user object and its size.
     */
    int obj_offset;
    int obj_size;
#endif /* CONFIG_DEBUG_SLAB */
    /*
     * We put nodelists[] at the end of kmem_cache, because we want to size
     * this array to nr_node_ids slots instead of MAX_NUMNODES
     * (see kmem_cache_init())
     * We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
     * is statically defined, so we reserve the max number of nodes.
     */
    struct kmem_list3 *nodelists[MAX_NUMNODES];
    /*
     * Do not add fields after nodelists[]
     */
};
As we saw during initialization, three caches are reserved for allocating cache objects (kmem_cache), three-list structures (kmem_list3), and local cache objects (array_cache). The remaining caches are general-purpose data caches. The overall structure is shown in the figure below.
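Objects handed out by kmalloc belong to different caches according to their size: 32, 64, 128, and so on. Each size has two cache nodes, one for DMA and one for normal allocation. Objects allocated through kmalloc are called general-purpose data objects.
General-purpose caches are therefore partitioned by size: objects of different structure types end up in the same general cache as long as their sizes fall into the same class. A dedicated cache, by contrast, is one the system creates for a specific structure, such as struct file, and every object in it comes from that same structure.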
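The size-based partitioning described above can be sketched in user space. This is only an illustration under stated assumptions: the helper names size_class_for and cache_slot, the power-of-two classes from 32 bytes up to an assumed 128 KiB ceiling, and the "even slot = normal, odd slot = DMA" layout are all hypothetical and are not the kernel's actual size table or lookup code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size classes: 32, 64, 128, ..., MAX_CLASS (powers of two). */
#define MIN_CLASS 32u
#define MAX_CLASS (128u * 1024u)

/* Hypothetical helper: smallest size class that can hold `size` bytes.
 * Returns 0 when the request is larger than the biggest class. */
static size_t size_class_for(size_t size)
{
    size_t cls = MIN_CLASS;

    while (cls < size) {
        if (cls >= MAX_CLASS)
            return 0;
        cls <<= 1;
    }
    return cls;
}

/* Hypothetical layout: every class owns two cache slots,
 * an even one for normal allocations and an odd one for DMA. */
static unsigned cache_slot(size_t cls, int dma)
{
    unsigned idx = 0;

    /* cls must be a valid power-of-two class */
    assert(cls >= MIN_CLASS && (cls & (cls - 1)) == 0);
    for (size_t s = MIN_CLASS; s < cls; s <<= 1)
        idx++;
    return 2 * idx + (dma ? 1u : 0u);
}
```

Under these assumptions, a 100-byte request and a 128-byte request both resolve to the 128-byte class, so two unrelated structures of those sizes would share one general cache, which is the point made in the paragraph above.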
2. The slab manager
The slab structure is as follows:
/*
 * struct slab
 *
 * Manages the objs in a slab. Placed either at the beginning of mem allocated
 * for a slab, or allocated from an general cache.
 * Slabs are chained into three list: fully used, partial, fully free slabs.
 */
struct slab {
    struct list_head list;
    /*
     * Offset of the first object within the slab's pages. For an on-slab
     * (internal) manager, colouroff covers both the colouring area and the
     * space taken by the management object; for an off-slab (external)
     * manager, it covers only the colouring area.
     */
    unsigned long colouroff;
    void *s_mem;            /* virtual address of the first object, including colour offset */
    unsigned int inuse;     /* num of objs active in slab */
    kmem_bufctl_t free;     /* index of the first free object */
    unsigned short nodeid;
};
The overall framework of the slab manager, and the relationship between the slab manager, its objects, and its pages, was already summarized clearly in the earlier article on slab creation.
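To make the roles of the inuse and free fields more concrete, here is a small user-space model of the index-based free list that follows the slab manager (the kmem_bufctl_t array). It is a sketch, not kernel code: struct toy_slab, the toy_slab_* functions, the fixed OBJS_PER_SLAB of 4, and the BUFCTL_END value are assumptions made for illustration, and the array is embedded in the struct only for simplicity.

```c
#include <assert.h>

#define OBJS_PER_SLAB 4u            /* assumed; the kernel computes this per cache */
#define BUFCTL_END    0xFFFFFFFFu   /* assumed marker for the end of the free chain */

typedef unsigned int bufctl_t;      /* stands in for kmem_bufctl_t */

struct toy_slab {
    unsigned int inuse;             /* number of allocated objects */
    bufctl_t free;                  /* index of the first free object */
    bufctl_t bufctl[OBJS_PER_SLAB]; /* bufctl[i] = next free index after i */
};

/* Chain all objects together: 0 -> 1 -> 2 -> ... -> END. */
static void toy_slab_init(struct toy_slab *s)
{
    s->inuse = 0;
    s->free = 0;
    for (bufctl_t i = 0; i < OBJS_PER_SLAB; i++)
        s->bufctl[i] = i + 1;
    s->bufctl[OBJS_PER_SLAB - 1] = BUFCTL_END;
}

/* Pop the first free index; returns BUFCTL_END when the slab is full. */
static bufctl_t toy_slab_alloc(struct toy_slab *s)
{
    bufctl_t idx = s->free;

    if (idx == BUFCTL_END)
        return BUFCTL_END;
    s->free = s->bufctl[idx];
    s->inuse++;
    return idx;
}

/* Push an index back onto the head of the free chain. */
static void toy_slab_free(struct toy_slab *s, bufctl_t idx)
{
    assert(idx < OBJS_PER_SLAB && s->inuse > 0);
    s->bufctl[idx] = s->free;
    s->free = idx;
    s->inuse--;
}
```

In this model an object's address would be s_mem plus the returned index times the object size. A slab with inuse equal to 0 belongs on the fully free list, one with inuse equal to OBJS_PER_SLAB on the fully used list, and anything in between on the partial list.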
3. Slab colouring
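When the CPU accesses memory, the cache line it uses is selected by a group of low-order address bits. For example, with a 32-byte cache line, these are the bits starting at bit 5. Memory addresses that are far apart are therefore still mapped to the same cache line if those bits are identical. A slab cache stores objects of the same size. Without a colouring area, objects in the same cache that sit at the same offset inside different slabs have identical low-order address bits and map to the same cache line, as shown in the figure.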
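As a result, accessing objects that conflict on a cache line causes cache misses: data is repeatedly swapped between that cache line and memory while other cache lines may sit idle, which seriously hurts cache efficiency. The fix is to use a colouring area so that objects in different slabs start at different offsets, which avoids the cache line conflicts.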
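The conflict can be shown with a little arithmetic. The sketch below assumes a toy direct-mapped cache with 32-byte lines and 128 sets (4 KiB in total); the geometry and the helper name cache_set_of are assumptions chosen for illustration, not a description of any particular CPU.

```c
#include <stdint.h>

/* Assumed toy cache geometry: 32-byte lines (bits 0-4 are the byte offset),
 * 128 sets selected by the bits starting at bit 5. */
#define LINE_SHIFT 5u
#define NUM_SETS   128u

/* Which cache set (line slot) an address maps to in this toy cache. */
static unsigned cache_set_of(uintptr_t addr)
{
    return (unsigned)((addr >> LINE_SHIFT) % NUM_SETS);
}
```

With this geometry, two page-aligned slabs at 0x10000 and 0x11000 place the object at in-slab offset 0x40 on the same set (set 2 in both cases). Shifting the second slab's objects by one 32-byte colour moves that object to 0x11060, which maps to set 3, so the two objects no longer compete.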
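Colouring appears to solve the problem nicely, but in practice it does not solve it completely. When there are only a few slabs, colouring works well. When there are many slabs, the colours wrap around and are reused, so cache line conflicts still occur.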
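The wrap-around can be modelled in a few lines. This is a simplified sketch: struct toy_cache and next_colour_offset are hypothetical names, and the next-colour counter is kept directly in the toy cache for brevity rather than wherever the kernel keeps it. The field names colour and colour_off follow the kmem_cache structure shown earlier.

```c
/* Toy model of how each new slab is given a colour offset. */
struct toy_cache {
    unsigned int colour;       /* number of distinct colours */
    unsigned int colour_off;   /* bytes per colour step */
    unsigned int colour_next;  /* colour to hand to the next new slab */
};

/* Return the byte offset for the next slab and advance the colour,
 * wrapping back to 0 once all colours have been used. */
static unsigned int next_colour_offset(struct toy_cache *c)
{
    unsigned int offset = c->colour_next;

    if (++c->colour_next >= c->colour)
        c->colour_next = 0;
    return offset * c->colour_off;
}
```

With colour set to 3 and colour_off set to 32, successive slabs receive offsets 0, 32, 64, and then 0 again. The fourth slab therefore lines up with the first one, which is exactly the residual conflict described above.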