
MIT6.828 | Lab 2: Memory Management - Part 2: Virtual Memory


First, a review of the x86 protected-mode memory management architecture: segment translation and page translation.

Exercise 2. Look at chapters 5 and 6 of the Intel 80386 Reference Manual, if you haven't done so already. Read the sections about page translation and page-based protection closely (5.2 and 6.4). We recommend that you also skim the sections about segmentation; while JOS uses the paging hardware for virtual memory and protection, segment translation and segment-based protection cannot be disabled on the x86, so you will need a basic understanding of it.


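Virtual, linear, and physical addresses

In x86 terminology, a virtual address consists of a segment selector and an offset within the segment. A linear address is what you get after segment translation but before page translation. A physical address is what you finally get after both segment and page translation, and is what ultimately goes out on the hardware bus to RAM.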

           Selector  +--------------+         +-----------+
          ---------->|              |         |           |
                     | Segmentation |         |  Paging   |
Software             |              |-------->|           |---------->  RAM
            Offset   |  Mechanism   |         | Mechanism |
          ---------->|              |         |           |
                     +--------------+         +-----------+
            Virtual                   Linear                Physical

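A C pointer is the offset component of a virtual address. In boot/boot.S, we installed a Global Descriptor Table (GDT) that sets every segment base address to 0 and every segment limit to 0xffffffff. The selector therefore has no effect, and the linear address always equals the offset of the virtual address. In lab 3 we will have to interact with segmentation a little more to set up privilege levels, but for memory translation we can ignore segmentation throughout the JOS labs and focus solely on page translation.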
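To make the flat-segment argument concrete, here is a minimal sketch of segment translation. The struct `seg_sketch` and the function `seg_translate` are hypothetical names for illustration only; a real descriptor carries more fields (type, privilege level, granularity), and a limit violation raises a fault rather than tripping an assertion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified segment descriptor: base and limit only. */
struct seg_sketch {
	uint32_t base;
	uint32_t limit;
};

/* Segment translation: linear address = segment base + offset. */
static uint32_t
seg_translate(struct seg_sketch seg, uint32_t offset)
{
	/* Stand-in for the hardware limit check. */
	assert(offset <= seg.limit);
	return seg.base + offset;
}

/* The flat segment described above: base 0, limit 0xffffffff. */
static const struct seg_sketch flat_seg = { 0x00000000u, 0xffffffffu };
```

With `flat_seg`, `seg_translate` returns its offset unchanged, which is why a C pointer value can be read directly as a linear address in JOS.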

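Recall that in part 3 of lab 1 we installed a simple page table so that the kernel could run at its link address of 0xf0100000, even though it is actually loaded in physical memory just above the ROM BIOS at 0x00100000. That page table mapped only 4MB of memory. In the virtual address space layout you will set up for JOS in this lab, we expand this to map the first 256MB of physical memory starting at virtual address 0xf0000000, and to map a number of other regions of the virtual address space.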

Exercise 3. While GDB can only access QEMU's memory by virtual address, it's often useful to be able to inspect physical memory while setting up virtual memory. Review the QEMU monitor commands from the lab tools guide, especially the xp command, which lets you inspect physical memory. To access the QEMU monitor, press Ctrl-a c in the terminal (the same binding returns to the serial console).

Use the xp command in the QEMU monitor and the x command in GDB to inspect memory at corresponding physical and virtual addresses and make sure you see the same data.

Our patched version of QEMU provides an info pg command that may also prove useful: it shows a compact but detailed representation of the current page tables, including all mapped memory ranges, permissions, and flags. Stock QEMU also provides an info mem command that shows an overview of which ranges of virtual addresses are mapped and with what permissions.

Note: in my setup, which has no graphical interface, the method given above failed to open the monitor.

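From code executing on the CPU, once we are in protected mode (which we entered first thing in boot/boot.S), there is no way to use a linear or physical address directly. All memory references are interpreted as virtual addresses and translated by the MMU, which means all pointers in C are virtual addresses.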

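The JOS kernel often needs to manipulate addresses as opaque values or as integers, without dereferencing them, for example in the physical memory allocator. Sometimes these are virtual addresses, and sometimes they are physical addresses. To help document the code, the JOS source distinguishes the two cases: the type uintptr_t represents opaque virtual addresses, and physaddr_t represents physical addresses. Both types are really just synonyms for 32-bit integers (uint32_t), so the compiler won't stop you from assigning one type to the other! Since they are integer types (not pointers), they cannot be dereferenced directly.

The JOS kernel can dereference a uintptr_t by first casting it to a pointer type. In contrast, the kernel can't sensibly dereference a physical address, since the MMU translates all memory references. If you cast a physaddr_t to a pointer and dereference it, you may be able to load and store to the resulting address (the hardware will interpret it as a virtual address), but you probably won't get the memory location you intended.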

Summary:

C type        Address type
T*            Virtual
uintptr_t     Virtual
physaddr_t    Physical

Question

  1. Assuming that the following JOS kernel code is correct, what type should variable x have, uintptr_t or physaddr_t?
	mystery_t x;
	char* value = return_a_pointer();
	*value = 10;
	x = (mystery_t) value;

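value is a pointer and is dereferenced, so it holds a virtual address. The value stored into x is therefore a virtual address, and x should have type uintptr_t.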

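The JOS kernel sometimes needs to read or modify memory for which it knows only the physical address. For example, adding a mapping to a page table may require allocating physical memory to store a page directory and then initializing that memory. However, the kernel cannot bypass virtual address translation and thus cannot directly load and store to physical addresses. One reason JOS remaps all of physical memory, starting from physical address 0, at virtual address 0xf0000000 is to help the kernel read and write memory for which it knows only the physical address. To translate a physical address into a virtual address that the kernel can actually read and write, the kernel must add 0xf0000000 to the physical address to find its corresponding virtual address in the remapped region. You should use KADDR(pa) to do that addition.

The JOS kernel also sometimes needs to find a physical address given the virtual address of the memory in which a kernel data structure is stored. Kernel global variables and memory allocated by boot_alloc() are in the region where the kernel was loaded, starting at 0xf0000000, the very region where we mapped all of physical memory. Thus, to turn a virtual address in this region into a physical address, the kernel can simply subtract 0xf0000000. You should use PADDR(va) to do that subtraction.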
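The two conversions are plain offset arithmetic, which a small host-side sketch can show. The names `kaddr_sketch`, `paddr_sketch`, `va32_t` and `pa32_t` are hypothetical stand-ins (the host's own `uintptr_t` may be 64 bits wide), and the range checks are simplified assumptions: the real macros check against the amount of installed memory and panic with a message.

```c
#include <assert.h>
#include <stdint.h>

#define KERNBASE 0xf0000000u	/* start of the remapped region */

typedef uint32_t va32_t;	/* stand-in for JOS's 32-bit uintptr_t */
typedef uint32_t pa32_t;	/* stand-in for JOS's physaddr_t */

/* Physical -> kernel virtual: add KERNBASE, as KADDR(pa) does. */
static va32_t
kaddr_sketch(pa32_t pa)
{
	/* Simplified check: only the first 256MB is remapped. */
	assert(pa < 0x10000000u);
	return pa + KERNBASE;
}

/* Kernel virtual -> physical: subtract KERNBASE, as PADDR(va) does. */
static pa32_t
paddr_sketch(va32_t va)
{
	/* Only addresses inside the remapped region can be converted. */
	assert(va >= KERNBASE);
	return va - KERNBASE;
}
```

For example, the kernel's load address 0x00100000 corresponds to its link address 0xf0100000, and the conversion round-trips for any address in the remapped region.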

Reference counting

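In future labs you will often have the same physical page mapped at multiple virtual addresses simultaneously (or in the address spaces of multiple environments). You will keep a count of the number of references to each physical page in the pp_ref field of the struct PageInfo corresponding to the physical page. When this count goes to zero for a physical page, that page can be freed because it is no longer used. In general, this count should equal the number of times the physical page appears below UTOP in all page tables (the mappings above UTOP are mostly set up at boot time by the kernel and should never be freed, so there is no need to reference count them). We'll also use it to keep track of the number of pointers we keep to the page directory pages and, in turn, of the number of references the page directories have to page table pages.

Be careful when using page_alloc. The page it returns always has a reference count of 0, so pp_ref should be incremented as soon as you've done something with the returned page (such as inserting it into a page table). Sometimes this is handled by other functions (for example, page_insert), and sometimes the function calling page_alloc must do it directly.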
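The life cycle of a reference count can be modeled on the host with a toy free list. This is only a sketch: `page_sketch`, `alloc_sketch`, `free_sketch` and `decref_sketch` are hypothetical names that loosely mirror struct PageInfo, page_alloc, page_free and page_decref, and no real memory is managed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct PageInfo. */
struct page_sketch {
	struct page_sketch *link;	/* next page on the free list */
	uint16_t ref;			/* plays the role of pp_ref */
};

static struct page_sketch *free_list;

/* Pop a page off the free list; its count is still 0 on return. */
static struct page_sketch *
alloc_sketch(void)
{
	struct page_sketch *pp = free_list;
	if (pp == NULL)
		return NULL;
	free_list = pp->link;
	pp->link = NULL;
	return pp;	/* the caller is responsible for ref++ */
}

/* Return a page to the free list; only legal once unreferenced. */
static void
free_sketch(struct page_sketch *pp)
{
	assert(pp->ref == 0);
	pp->link = free_list;
	free_list = pp;
}

/* Drop one reference and free the page when the count hits 0. */
static void
decref_sketch(struct page_sketch *pp)
{
	if (--pp->ref == 0)
		free_sketch(pp);
}
```

A page mapped twice needs two `decref_sketch` calls before it reappears on the free list, which is the invariant the lab's check functions test for.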

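Page Table Management

Now you'll write a set of routines to manage page tables: to insert and remove linear-to-physical mappings, and to create page table pages when needed.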

Exercise 4. In the file kern/pmap.c, you must implement code for the following functions.

        pgdir_walk()
        boot_map_region()
        page_lookup()
        page_remove()
        page_insert()

check_page(), called from mem_init(), tests your page table management routines. You should make sure it reports success before proceeding.

pgdir_walk()

This function returns a pointer to the page table entry (PTE) for a given linear address.

// Given 'pgdir', a pointer to a page directory, pgdir_walk returns
// a pointer to the page table entry (PTE) for linear address 'va'.
// This requires walking the two-level page table structure.
//
// The relevant page table page might not exist yet.
// If this is true, and create == false, then pgdir_walk returns NULL.
// Otherwise, pgdir_walk allocates a new page table page with page_alloc.
//    - If the allocation fails, pgdir_walk returns NULL.
//    - Otherwise, the new page's reference count is incremented,
//	the page is cleared,
//	and pgdir_walk returns a pointer into the new page table page.
//
// Hint 1: you can turn a PageInfo * into the physical address of the
// page it refers to with page2pa() from kern/pmap.h.
//
// Hint 2: the x86 MMU checks permission bits in both the page directory
// and the page table, so it's safe to leave permissions in the page
// directory more permissive than strictly necessary.
//
// Hint 3: look at inc/mmu.h for useful macros that manipulate page
// table and page directory entries.
//
pte_t *
pgdir_walk(pde_t *pgdir, const void *va, int create)
{
	// PDX: page directory index from linear address
	uint32_t pd_idx = PDX(va);
	// PTE_P: Present bit. Test with logical '!'; a bitwise '~' would
	// always be nonzero here.
	if (!(pgdir[pd_idx] & PTE_P)) {
		// The relevant page table page does not exist
		if (!create) {
			return NULL;
		}
		// Allocate a new page table page; ALLOC_ZERO clears it
		struct PageInfo *pg = page_alloc(ALLOC_ZERO);
		if (!pg) return NULL; // the allocation failed
		// page_alloc returns pages with pp_ref == 0, so count the
		// page directory's reference manually
		pg->pp_ref++;
		// Permissive PDE bits are safe (Hint 2): the PTE is checked too
		pgdir[pd_idx] = page2pa(pg) | PTE_P | PTE_U | PTE_W;
	}
	// PTX: page table index from linear address
	uint32_t pt_idx = PTX(va);
	// KADDR: physical address -> kernel virtual address. KADDR yields a
	// void *, so cast before indexing to step in units of pte_t.
	pte_t *pt = (pte_t *)KADDR(PTE_ADDR(pgdir[pd_idx]));
	return &pt[pt_idx];
}
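The index arithmetic behind the two-level walk can be checked in isolation. The sketch below assumes the standard x86 32-bit split (10 bits of directory index, 10 bits of table index, 12 bits of page offset); the `_SKETCH` macros and `pgaddr_sketch` are hypothetical names that mirror what PDX, PTX, PGOFF and PGADDR in inc/mmu.h compute.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 31..22: index into the page directory. */
#define PDX_SKETCH(la)   ((((uint32_t)(la)) >> 22) & 0x3FFu)
/* Bits 21..12: index into the page table. */
#define PTX_SKETCH(la)   ((((uint32_t)(la)) >> 12) & 0x3FFu)
/* Bits 11..0: offset within the 4KB page. */
#define PGOFF_SKETCH(la) (((uint32_t)(la)) & 0xFFFu)

/* Rebuild a linear address from its three components. */
static uint32_t
pgaddr_sketch(uint32_t pdx, uint32_t ptx, uint32_t off)
{
	assert(pdx < 1024 && ptx < 1024 && off < 4096);
	return (pdx << 22) | (ptx << 12) | off;
}
```

For the kernel link address 0xf0100000 this gives directory index 960, table index 256 and offset 0, so pgdir_walk looks at `pgdir[960]` and then at entry 256 of the page table it points to.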

boot_map_region()

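Maps the virtual address range [va, va+size) to the physical address range [pa, pa+size).

As the comment suggests, we can use the pgdir_walk written above to get a pointer to the page table entry, then store the physical address combined with the permission bits into that entry.

Note that this is a static mapping, so it does not change the reference count.

The key point is that dereferencing the page table entry pointer yields the entry itself, which holds the corresponding physical address together with the permission bits.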

//
// Map [va, va+size) of virtual address space to physical [pa, pa+size)
// in the page table rooted at pgdir.  Size is a multiple of PGSIZE, and
// va and pa are both page-aligned.
// Use permission bits perm|PTE_P for the entries.
//
// This function is only intended to set up the ``static'' mappings
// above UTOP. As such, it should *not* change the pp_ref field on the
// mapped pages.
//
// Hint: the TA solution uses pgdir_walk
static void
boot_map_region(pde_t *pgdir, uintptr_t va, size_t size, physaddr_t pa, int perm)
{
	// Fill this function in
	pte_t *pt;
	uint32_t offset = 0;
	while (offset < size) {
		// Pointer to the page table entry (PTE) for linear address 'va'
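		pt = pgdir_walk(pgdir, (void *)va, 1);
		// pgdir_walk returns NULL if no page table could be allocated
		if (!pt)
			panic("boot_map_region: out of memory");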
		// Use permission bits perm|PTE_P for the entries.
		*pt = pa | perm | PTE_P;
		// va and pa are both page-aligned
		pa += PGSIZE;
		va += PGSIZE;
		// Size is a multiple of PGSIZE
		offset += PGSIZE;
	}
}
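How an entry is packed, and how many entries a region needs, can be sketched on the host. The flag values below match the x86 present, writable and user bits; `make_pte`, `PTE_ADDR_SKETCH` and `pages_needed` are hypothetical helper names, not part of JOS.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096u
#define PTE_P  0x001u	/* present */
#define PTE_W  0x002u	/* writable */
#define PTE_U  0x004u	/* user-accessible */

/* Upper 20 bits of an entry: the physical page address. */
#define PTE_ADDR_SKETCH(pte) (((uint32_t)(pte)) & ~0xFFFu)

/* Pack a page-aligned physical address and permission bits. */
static uint32_t
make_pte(uint32_t pa, uint32_t perm)
{
	assert(pa % PGSIZE == 0);		/* low 12 bits must be free */
	assert((perm & ~0xFFFu) == 0);	/* perm lives in the low 12 bits */
	return pa | perm | PTE_P;
}

/* Number of loop iterations for a size that is a multiple of PGSIZE. */
static uint32_t
pages_needed(uint32_t size)
{
	assert(size % PGSIZE == 0);
	return size / PGSIZE;
}
```

Because the address is page-aligned, the OR never disturbs the address bits, and masking off the low 12 bits recovers it exactly. Mapping the 256MB region at 0xf0000000 takes 65536 entries, spread across 64 page tables of 1024 entries each.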

page_lookup()

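Following the hint, pgdir_walk gives a pointer to the relevant page table entry, and pa2page converts the physical address stored in that entry into the corresponding struct PageInfo.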

//
// Return the page mapped at virtual address 'va'.
// If pte_store is not zero, then we store in it the address
// of the pte for this page.  This is used by page_remove and
// can be used to verify page permissions for syscall arguments,
// but should not be used by most callers.
//
// Return NULL if there is no page mapped at va.
//
// Hint: the TA solution uses pgdir_walk and pa2page.
//
struct PageInfo *
page_lookup(pde_t *pgdir, void *va, pte_t **pte_store)
{
	// Fill this function in
	// Address to the page table entry (PTE) for linear address 'va'
	pte_t *pte = pgdir_walk(pgdir, va, 0);
	// used by page_remove
	if (pte_store != 0) {
		// Store in it the address of the pte for this page
		*pte_store = pte;
	}
	// check if there is any page mapped at va
	if (pte != NULL && (*pte & PTE_P)) {
		// PTE_ADDR: address in page table or page directory entry
		return pa2page(PTE_ADDR(*pte));
	}
	return NULL;
}

page_remove()

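Unmaps the virtual address va, which involves the following:

  • The reference count is decremented
  • If the reference count drops to 0, the physical page must be freed
  • The page table entry for va is set to 0 (if it exists)
  • If an entry was removed from the page table, the TLB must be invalidated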
//
// Unmaps the physical page at virtual address 'va'.
// If there is no physical page at that address, silently does nothing.
//
// Details:
//   - The ref count on the physical page should decrement.
//   - The physical page should be freed if the refcount reaches 0.
//   - The pg table entry corresponding to 'va' should be set to 0.
//     (if such a PTE exists)
//   - The TLB must be invalidated if you remove an entry from
//     the page table.
//
// Hint: The TA solution is implemented using page_lookup,
// 	tlb_invalidate, and page_decref.
//
void
page_remove(pde_t *pgdir, void *va)
{
	// Fill this function in
	pte_t *pte;
	// the page mapped at virtual address 'va'
	struct PageInfo *page = page_lookup(pgdir, va, &pte);
	if (page) {
		// decrease the ref count; the physical page is freed automatically if it reaches 0
		page_decref(page);
		// set the pg table entry corresponding to 'va' to 0
		*pte = 0;
		// Invalidate a TLB entry
		tlb_invalidate(pgdir, va);
	}
}

page_insert()

Maps the physical page pp at virtual address va, setting the low 12 bits of the page table entry to perm|PTE_P.

//
// Map the physical page 'pp' at virtual address 'va'.
// The permissions (the low 12 bits) of the page table entry
// should be set to 'perm|PTE_P'.
//
// Requirements
//   - If there is already a page mapped at 'va', it should be page_remove()d.
//   - If necessary, on demand, a page table should be allocated and inserted
//     into 'pgdir'.
//   - pp->pp_ref should be incremented if the insertion succeeds.
//   - The TLB must be invalidated if a page was formerly present at 'va'.
//
// Corner-case hint: Make sure to consider what happens when the same
// pp is re-inserted at the same virtual address in the same pgdir.
// However, try not to distinguish this case in your code, as this
// frequently leads to subtle bugs; there's an elegant way to handle
// everything in one code path.
//
// RETURNS:
//   0 on success
//   -E_NO_MEM, if page table couldn't be allocated
//
// Hint: The TA solution is implemented using pgdir_walk, page_remove,
// and page2pa.
//
int
page_insert(pde_t *pgdir, struct PageInfo *pp, void *va, int perm)
{
	// Fill this function in
	// Pointer to the page table entry (PTE) for linear address 'va';
	// create == 1 allocates a page table on demand and inserts it into 'pgdir'.
	pte_t *pte = pgdir_walk(pgdir, va, 1);
	if (!pte) {	// page table couldn't be allocated
		return -E_NO_MEM;
	}
	if (*pte & PTE_P) {	// a page is already mapped at 'va'
		// PTE_ADDR: address in page table or page directory entry
		// page2pa: struct PageInfo * -> physical address
		if (PTE_ADDR(*pte) == page2pa(pp)) {
			// Same page re-inserted: flush the stale TLB entry
			tlb_invalidate(pgdir, va);
			// Drop the old reference directly; page_decref() would
			// free the page if the count reached 0
			pp->pp_ref--;
		} else {
			// A different page is mapped at 'va': page_remove() it
			page_remove(pgdir, va);
		}
	}
	// pp->pp_ref should be incremented if the insertion succeeds.
	pp->pp_ref++;
	// The permissions (the low 12 bits) of the page table entry are 'perm|PTE_P'.
	*pte = page2pa(pp) | perm | PTE_P;
	return 0;
}
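The corner-case hint mentions a single code path. One commonly used ordering, shown here as an alternative to the explicit branch above and not as the author's method, is to take the new reference before removing the old mapping. The toy model below tracks a single mapping slot; `pg_sketch`, `slot`, `remove_sketch` and `insert_sketch` are hypothetical names, and the TLB is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page: a reference count and a flag set when it would be freed. */
struct pg_sketch {
	int ref;
	int freed;
};

/* A single "PTE": the page currently mapped at one va, or NULL. */
static struct pg_sketch *slot;

/* Models page_remove: drop the mapping and its reference. */
static void
remove_sketch(void)
{
	if (slot == NULL)
		return;
	if (--slot->ref == 0)
		slot->freed = 1;
	slot = NULL;
}

/* Models page_insert with the increment-first ordering. */
static void
insert_sketch(struct pg_sketch *pp)
{
	assert(pp != NULL);
	pp->ref++;		/* take the new reference first */
	remove_sketch();	/* then drop whatever was mapped before */
	slot = pp;
}
```

When the same page is re-inserted, its count briefly rises to 2 and falls back to 1, so it is never freed in between. A different page already in the slot loses its reference and is freed as usual.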

After compiling and running, the tests pass:

qemu-system-i386 -nographic -drive file=obj/kern/kernel.img,index=0,media=disk,format=raw -serial mon:stdio -gdb tcp::26000 -D qemu.log
6828 decimal is  octal! 
Physical memory: 131072K available, base = 640K, extended = 130432K
boot_alloc memory at <f011a000>
Next free memory at <f011b000>
boot_alloc memory at <f011b000>
Next free memory at <f015b000> 
kern end page:347
check_page_free_list() succeeded!
check_page_alloc() succeeded! 
check_page() succeeded!
check_kern_pgdir() succeeded! 
check_page_free_list() succeeded! 
check_page_installed_pgdir() succeeded!
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
K>