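Relationship with XLOG: the Multixact module emits an XLOG record whenever a new offsets or members page is initialized to zeroes, and another XLOG record whenever a new MultixactId is defined. This lets pg completely rebuild, during XLOG replay, the data entered since the last checkpoint. Because of this, pg does not have to follow the usual "write WAL before data" rule; the only correctness requirement is that all dirty OFFSET and MEMBER pages (the pages of the two SLRU structures mentioned above) are flushed and synced to disk before a checkpoint is considered complete. If a page does reach disk ahead of its corresponding WAL records, it will be forcibly zeroed before it is used anyway. Therefore pg does not need to tag in-memory pages with LSN information; the existing synchronization is already sufficient.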
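To make the replay argument concrete, here is a minimal sketch of why a page that reached disk "too early" is harmless. It is not pg's real WAL record format: the record kinds REC_ZERO_PAGE and REC_CREATE_ID, the Record struct, and the tiny PAGE_ENTRIES value are all hypothetical names invented for illustration. The only point is that replay zeroes the page first and then re-applies every id definition, so stale page contents never survive.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_ENTRIES 8   /* hypothetical: offset entries per page */

/* Hypothetical record kinds, standing in for the two kinds of XLOG record. */
typedef enum { REC_ZERO_PAGE, REC_CREATE_ID } RecKind;

typedef struct
{
    RecKind  kind;
    uint32_t mxid;    /* REC_CREATE_ID: the id being defined */
    uint32_t offset;  /* REC_CREATE_ID: where its member list starts */
    int      pageno;  /* REC_ZERO_PAGE: the page being initialized */
} Record;

/* Replay the records in log order against an array of offset pages. */
static void
replay(uint32_t pages[][PAGE_ENTRIES], const Record *log, int nrecs)
{
    for (int i = 0; i < nrecs; i++)
    {
        const Record *r = &log[i];

        if (r->kind == REC_ZERO_PAGE)
        {
            /* Whatever was on disk for this page is thrown away here. */
            memset(pages[r->pageno], 0, sizeof(pages[r->pageno]));
        }
        else
        {
            /* Re-create the entry exactly as it was first written. */
            pages[r->mxid / PAGE_ENTRIES][r->mxid % PAGE_ENTRIES] = r->offset;
        }
    }
}
```

Starting from pages filled with arbitrary bytes, replaying a zero-page record for page 1 followed by the id definitions that live on it leaves page 1 with exactly the logged contents, which is why no per-page LSN check is needed in this scheme.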






The call chain main()->…->PostmasterMain()->…->reset_shared()->CreateSharedMemoryAndSemaphores()->…->MultixactShmemInit() initializes the Multixact data structures MultixactOffsetCtl, MultixactMemberCtl, MultixactState and so on. They are used to manage and cache, in memory, the Multixact log files (the files stored under "data/pg_multixact/offsets" and "data/pg_multixact/members").

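MultixactShmemInit()->SimpleLruInit()->ShmemInitStruct(): inside it, hash_search() looks up "MultixactOffset Ctl" in the hash index "ShmemIndex". If it is not there, a HashElement and a ShmemIndexEnt (the entry) are allocated for "MultixactOffset Ctl" in ShmemIndex, and the name "MultixactOffset Ctl" is written into the entry. Back in ShmemInitStruct(), ShmemAlloc() is then called to allocate space in shared memory for the "MultixactOffset Ctl" structures (see the "Multixact structures diagram" below); the entry's location member (the entry here is the ShmemIndexEnt variable) is set to point at that space and its size member records the size of the space. Finally control returns to MultixactShmemInit(). MultixactOffsetCtl, used as a SlruCtlData * (it is a macro for the address of the static SlruCtlData variable MultixactOffsetCtlData), now refers to a structure whose shared member points at the start of the memory allocated in shmem for "MultixactOffset Ctl", and the remaining SlruCtlData members are filled in.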

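Next comes MultixactShmemInit()->SimpleLruInit()->ShmemInitStruct() again: inside it, hash_search() looks up "MultixactMember Ctl" in the hash index "ShmemIndex". If it is not there, a HashElement and a ShmemIndexEnt (the entry) are allocated for "MultixactMember Ctl" in ShmemIndex, and the name "MultixactMember Ctl" is written into the entry. Back in ShmemInitStruct(), ShmemAlloc() is then called to allocate space in shared memory for the "MultixactMember Ctl" structures (see the "Multixact structures diagram" below); the entry's location member is set to point at that space and its size member records the size of the space. Finally control returns to MultixactShmemInit(). MultixactMemberCtl, used as a SlruCtlData * (it is a macro for the address of the static SlruCtlData variable MultixactMemberCtlData), now refers to a structure whose shared member points at the start of the memory allocated in shmem for "MultixactMember Ctl", and the remaining SlruCtlData members are filled in.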

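Then ShmemInitStruct() is called once more: inside it, hash_search() looks up "Shared Multixact State" in the hash index "ShmemIndex". If it is not there, a HashElement and a ShmemIndexEnt (the entry) are allocated for "Shared Multixact State" in ShmemIndex, and the name "Shared Multixact State" is written into the entry. Back in ShmemInitStruct(), ShmemAlloc() is then called to allocate space in shared memory for the "Shared Multixact State" structures (see the "Multixact structures diagram" below); the entry's location member is set to point at that space and its size member records the size of the space. Finally control returns to MultixactShmemInit(), where the static global variable MultixactState, of type MultixactStateData *, is pointed at the MultixactStateData instance. Its start address is the start of the memory allocated in shmem for "Shared Multixact State", and the members of the MultixactStateData structure are then initialized.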
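The three passes above all follow the same "look up by name, allocate on first use" pattern. The following is a rough sketch of that pattern only, not the real ShmemInitStruct(): it swaps the ShmemIndex hash table for a linear scan over a small array, and uses a static byte array with a bump pointer in place of the shared memory segment and ShmemAlloc(). The names IndexEnt, toy_alloc and toy_init_struct are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16
#define ARENA_SIZE  4096

/* Toy stand-in for ShmemIndexEnt: name -> (location, size). */
typedef struct
{
    char   key[48];
    void  *location;
    size_t size;
} IndexEnt;

static IndexEnt index_tab[MAX_ENTRIES];
static int      index_used = 0;

/* Toy stand-in for the shared memory segment. */
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Bump allocator standing in for ShmemAlloc(); returns NULL when full. */
static void *
toy_alloc(size_t size)
{
    size_t aligned = (size + 7) & ~(size_t) 7;

    if (arena_used + aligned > ARENA_SIZE)
        return NULL;
    void *p = arena + arena_used;
    arena_used += aligned;
    return p;
}

/* Find the named structure, or allocate and register it on first use. */
static void *
toy_init_struct(const char *name, size_t size, bool *found)
{
    for (int i = 0; i < index_used; i++)
    {
        if (strcmp(index_tab[i].key, name) == 0)
        {
            *found = true;              /* someone already created it */
            return index_tab[i].location;
        }
    }

    *found = false;
    if (index_used == MAX_ENTRIES || strlen(name) >= sizeof(index_tab[0].key))
        return NULL;

    void *p = toy_alloc(size);
    if (p == NULL)
        return NULL;

    /* Record the name, where the structure lives, and how big it is. */
    strcpy(index_tab[index_used].key, name);
    index_tab[index_used].location = p;
    index_tab[index_used].size = size;
    index_used++;
    return p;
}
```

Calling toy_init_struct() twice with the same name returns the same address and sets found to true on the second call, which is what lets a caller decide whether it still has to initialize the structure's members.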


static MT_LOCAL SlruCtlData MultixactOffsetCtlData;
static MT_LOCAL SlruCtlData MultixactMemberCtlData;

#define MultixactOffsetCtl (&MultixactOffsetCtlData)
#define MultixactMemberCtl (&MultixactMemberCtlData)

typedef struct SlruCtlData
{
    Slrushared shared;

    /*
     * This flag tells whether to fsync writes (true for pg_clog, false for
     * pg_subtrans).
     */
    bool do_fsync;

    /*
     * Decide which of two page numbers is "older" for truncation purposes. We
     * need to use comparison of TransactionIds here in order to do the right
     * thing with wraparound XID arithmetic.
     */
    bool (*PagePrecedes)(int, int);

    /*
     * Dir is set during SimpleLruInit and does not change thereafter. Since
     * it's always the same, it doesn't need to be in shared memory.
     */
    char Dir[64];
} SlruCtlData;

typedef SlruCtlData *SlruCtl;
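The PagePrecedes comment is easier to follow with a small example of wraparound-aware comparison. This is only a sketch under assumptions: IDS_PER_PAGE is a made-up constant, the helper names are invented here, and the real callback pg installs for each SLRU is not reproduced. It also relies on the usual two's-complement behaviour of converting an unsigned difference to int32_t, which mainstream compilers provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IDS_PER_PAGE 2048u   /* hypothetical: ids stored on one page */

/*
 * Modulo-2^32 comparison: a is "older" than b when it lies less than half
 * the id space behind b. A plain a < b would give the wrong answer once the
 * counter has wrapped around.
 */
static bool
id_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/* Same shape as the PagePrecedes member: bool (*)(int, int). */
static bool
page_precedes(int page1, int page2)
{
    /* Compare the first id on each page rather than the raw page numbers. */
    uint32_t id1 = (uint32_t) page1 * IDS_PER_PAGE;
    uint32_t id2 = (uint32_t) page2 * IDS_PER_PAGE;

    return id_precedes(id1, id2);
}
```

With these definitions the very last page of the id space counts as older than page 0, because page 0 is where the counter lands right after wrapping.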


/*
 * Shared-memory state
 */
typedef struct SlrusharedData
{
    LWLockId ControlLock;

    /* Number of buffers managed by this SLRU structure */
    int num_slots;

    /*
     * Arrays holding info for each buffer slot. Page number is undefined
     * when status is EMPTY, as is page_lru_count.
     */
    char **page_buffer;
    SlruPageStatus *page_status;
    bool *page_dirty;
    int *page_number;
    int *page_lru_count;
    LWLockId *buffer_locks;

    /*
     * We mark a page "most recently used" by setting
     *     page_lru_count[slotno] = ++cur_lru_count;
     * The oldest page is therefore the one with the highest value of
     *     cur_lru_count - page_lru_count[slotno]
     * The counts will eventually wrap around, but this calculation still
     * works as long as no page's age exceeds INT_MAX counts.
     */
    int cur_lru_count;

    /*
     * latest_page_number is the page number of the current end of the log;
     * this is not critical data, since we use it only to avoid swapping out
     * the latest page.
     */
    int latest_page_number;
} SlrusharedData;

typedef SlrusharedData *Slrushared;
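The cur_lru_count bookkeeping described in the comment can be tried out in isolation. The sketch below keeps only the two counters and assumes a fixed NUM_SLOTS; it ignores empty slots, pages with I/O in progress, the protection of latest_page_number, and counter wraparound, all of which real victim selection would have to handle. The function names are invented for this example.

```c
#include <assert.h>

#define NUM_SLOTS 4          /* hypothetical number of buffer slots */

static int page_lru_count[NUM_SLOTS];
static int cur_lru_count = 0;

/* Mark a slot "most recently used", exactly as the comment describes. */
static void
mark_recently_used(int slotno)
{
    assert(slotno >= 0 && slotno < NUM_SLOTS);
    page_lru_count[slotno] = ++cur_lru_count;
}

/* Pick the slot with the largest age, i.e. the least recently used one. */
static int
select_lru_slot(void)
{
    int best_slot = 0;
    int best_age = -1;

    for (int slotno = 0; slotno < NUM_SLOTS; slotno++)
    {
        int age = cur_lru_count - page_lru_count[slotno];

        if (age > best_age)
        {
            best_age = age;
            best_slot = slotno;
        }
    }
    return best_slot;
}
```

Touching slots 0, 1, 2, 3 and then 0 again leaves slot 1 with the largest age, so it is the one that would be evicted next.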

static MultixactStateData *MultixactState;

typedef struct MultixactStateData
{
    /* next-to-be-assigned MultixactId */
    MultixactId nextMXact;

    /* next-to-be-assigned offset */
    MultixactOffset nextOffset;

    /* the Offset SLRU area was last truncated at this MultixactId */
    MultixactId lastTruncationPoint;

    /*
     * Per-backend data starts here. We have two arrays stored in the area
     * immediately following the MultixactStateData struct. Each is indexed by
     * BackendId. (Note: valid BackendIds run from 1 to MaxBackends; element
     * zero of each array is never used.)
     *
     * OldestMemberMXactId[k] is the oldest MultixactId each backend's current
     * transaction(s) could possibly be a member of, or InvalidMultixactId
     * when the backend has no live transaction that could possibly be a
     * member of a Multixact. Each backend sets its entry to the current
     * nextMXact counter just before first acquiring a shared lock in a given
     * transaction, and clears it at transaction end. (This works because only
     * during or after acquiring a shared lock could an XID possibly become a
     * member of a Multixact, and that Multixact would have to be created
     * during or after the lock acquisition.)
     *
     * OldestVisibleMXactId[k] is the oldest MultixactId each backend's
     * current transaction(s) think is potentially live, or InvalidMultixactId
     * when not in a transaction or not in a transaction that's paid any
     * attention to Multixacts yet. This is computed when first needed in a
     * given transaction, and cleared at transaction end. We can compute it
     * as the minimum of the valid OldestMemberMXactId[] entries at the time
     * we compute it (using nextMXact if none are valid). Each backend is
     * required not to attempt to access any SLRU data for MultixactIds older
     * than its own OldestVisibleMXactId[] setting; this is necessary because
     * the checkpointer could truncate away such data at any instant.
     *
     * The checkpointer can compute the safe truncation point as the oldest
     * valid value among all the OldestMemberMXactId[] and
     * OldestVisibleMXactId[] entries, or nextMXact if none are valid.
     * Clearly, it is not possible for any later-computed OldestVisibleMXactId
     * value to be older than this, and so there is no risk of truncating data
     * that is still needed.
     */
    MultixactId perBackendXactIds[1]; /* VARIABLE LENGTH ARRAY */
} MultixactStateData;
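The last paragraph of that comment describes how the safe truncation point is derived. Here is a small sketch of the computation, under stated assumptions: the two per-backend arrays are passed in as plain parameters instead of being located after the struct, MxId and INVALID_MXID (taken to be 0) are stand-in names, and the function is not pg's own routine. Ids are compared modulo 2^32 so the result stays correct across wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MxId;                 /* stand-in for MultixactId */
#define INVALID_MXID ((MxId) 0)        /* assumed "no value" marker */

/* Wraparound-aware "a is older than b". */
static bool
mx_precedes(MxId a, MxId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Oldest valid value among both per-backend arrays, or next_mxact if no
 * entry is valid. Arrays are indexed 1..max_backends; element 0 is unused.
 */
static MxId
oldest_needed(const MxId *oldest_member, const MxId *oldest_visible,
              int max_backends, MxId next_mxact)
{
    MxId oldest = next_mxact;

    for (int k = 1; k <= max_backends; k++)
    {
        if (oldest_member[k] != INVALID_MXID &&
            mx_precedes(oldest_member[k], oldest))
            oldest = oldest_member[k];

        if (oldest_visible[k] != INVALID_MXID &&
            mx_precedes(oldest_visible[k], oldest))
            oldest = oldest_visible[k];
    }
    return oldest;
}
```

Everything strictly older than the returned id can be truncated: no backend has registered interest in it, and any OldestVisibleMXactId computed later cannot be older than this value.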

Now let's look at how the "MultixactOffset Ctl", "MultixactMember Ctl" and "Shared Multixact State" structures are laid out in memory once initialization is complete:




