Wednesday, July 16, 2014

Page Faults Explained

Hello everyone!

In this post, I'm going to do my best to go in-depth on page faults while still speaking plain English. There are many page fault articles out there, but the ones I've found either pick up from an imaginary somewhere (i.e. a rushed explanation that seems to begin and end abruptly), are incomplete, or assume you already have at least basic knowledge of Windows' memory manager, paging, page faults, etc. Recently, thanks very much to Pavel Yosifovich, I have a better understanding of page faults and would like, as always, to share my knowledge as a whole.

-- I would like to note that while double-checking myself during the making of this post, wherever I wasn't flat out correct, I was either incorrect or ended up learning way more than I thought I knew. This is one of the things I love most about making blog posts, and learning in general.

--------------------

First off, before even diving into page faults themselves, and especially since we want to do this the right way, we need to understand a few things (well, many things).

Disclaimer: I am not going to go extremely in-depth regarding Windows' memory manager (as that would take forever and a half/my knowledge is solely my knowledge), and if you are interested in that, Mark Russinovich has done a brilliant article over at TechNet, as well as many others all across the web if you do some digging (or check the reference links below). I am merely laying the groundwork for the understanding and explanation of page faults and nothing more.

If you ask me personally, Windows' memory manager (and memory management in general throughout the operating system) is one of the most complicated and in-depth parts of Windows internals. It's daunting yet extremely fascinating at the same time, as one extremely in-depth piece leads to another. It seems endless, and I highly recommend spending time reading into the memory management specifics throughout Windows, as it's truly fascinating.

Physical Memory

Physical memory is by far one of the most important resources, and one we must absolutely understand. Among many things, the memory manager within Windows is responsible for the data of all currently active processes, drivers, and the operating system. Even today as of this blog post, the operating system itself accesses more code/data than can actually fit in physical memory. As the brilliant Mark Russinovich puts it, think of physical memory as a window into the code/data used over time.

Now that this is known, we can understand that the amount of physical memory present on the system affects performance greatly, because if the data/code needed for a process or the operating system itself is not directly available in physical memory, it must be brought in (paged-in) from the disk which is quite the performance hit.

One of the reasons it's very important to understand physical memory before virtual memory (or in general) is because physical memory contributes to the virtual memory system limit, which interestingly enough is roughly the size of physical memory plus any page files configured on the operating system.
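
To put a number on that, here's a minimal sketch of the "physical memory plus page files" estimate. The sizes are made-up example values and `commit_limit` is just a name I picked for illustration; the real limit Windows reports will differ somewhat.

```python
# Rough estimate only: physical memory plus every configured page file.
# The sizes used below are hypothetical example values, not measurements.
GIB = 1024 ** 3


def commit_limit(physical_bytes, pagefile_bytes):
    """Approximate the virtual memory system limit in bytes."""
    return physical_bytes + sum(pagefile_bytes)


# 8 GiB of RAM with two page files of 4 GiB and 2 GiB.
limit = commit_limit(8 * GIB, [4 * GIB, 2 * GIB])
print(limit // GIB, "GiB")
```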

We can view the layout of physical memory with Meminfo. Do note that you'll need to run the program manually from an elevated command prompt. You can see the path I chose in the screenshot below, as well as the layout itself.


If you run meminfo.exe on its own, it will display the different parameters you can use. In our case, meminfo.exe -r will run Meminfo and display the valid physical memory ranges that are detected.

If you're interested in going further regarding physical memory consumption device-wise, you can use Device Manager to check what addresses devices are occupying. The image below is a simple snippet of my personal system's memory consumption as an example.


We can also take a look at the physical memory limits of Windows 7 and 8 as an example.







As we can see, the physical memory limits on the client operating systems increase drastically on x64, yet remain the same on x86. The x86 physical limit has remained 4 GB since Windows XP as far as the client operating systems go. The usual explanation is that a 32-bit address (32 address lines) can only reach addresses 0x00000000 to 0xFFFFFFFF (totaling 4 GB). Strictly speaking, PAE lets x86 processors address more physical memory than that, so the 4 GB cap on the 32-bit client editions is one that Windows enforces itself rather than a hard limit of the hardware.
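
The 4 GB figure falls straight out of the arithmetic. A quick sketch, assuming nothing beyond a plain 32-bit address:

```python
ADDRESS_BITS = 32

# Every distinct bit pattern is one byte address.
addressable_bytes = 2 ** ADDRESS_BITS
highest_address = addressable_bytes - 1

print(hex(highest_address))             # top of the 32-bit range
print(addressable_bytes // 1024 ** 3)   # size of the range in GB
```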

--------------------

Virtual Memory

Now that we understand some of the fundamentals behind physical memory, we can go ahead and discuss virtual memory. Do you have your cup of coffee? Good, you're going to need it. It's very important to first understand that virtual memory is a completely different entity than physical memory, although they both work together hand-in-hand.

An extremely important thing to note at this point is that a process does not equal (=/=) a program, and the same goes for a thread. For example, if you hear a user say "My Firefox (32 bit) process is running according to Task Manager", that's actually not correct. Processes do not run, threads run. A process is solely a set of resources used to execute a program, and consists of a private virtual address space (where memory is allocated), an executable used to start the application (.exe), a table of handles to various kernel objects, a security context (otherwise known as an access token), and one (or possibly more) threads that execute the code.

With all of the above said, virtual memory (among many things, at least) is a technique of Windows' memory manager that maps the memory addresses used by a program, namely virtual addresses, into physical addresses. In layman's terms, virtual memory exists to separate a program's view of memory from physical memory, so the operating system can then decide whether to keep that program's code/data in RAM or out on disk.

Let's break this down further! When you run a program, it generates addresses in the following ways:

  • Load instruction
  • Store instruction
  • Fetching an instruction

There's an absolutely phenomenal article on the first two (see Load/Store Instructions in the references). In short, the first two create data addresses, and the third creates instruction addresses. It's very important to know that RAM cannot distinguish between the two, and simply sees them as addresses. Addresses generated by programs are considered virtual, therefore they need to be translated to physical addresses. How does this happen? Good question! This all occurs through address translation hardware in the CPU known as the memory management unit (MMU), which works from tables that the kernel sets up.

Because the MMU translates virtual addresses to physical ones, the operating system can go ahead and create a virtual address space that allows programs to reference more memory than is actually physically available by using the disk. This is one of the main benefits of virtual memory, aside from memory protection.


Thanks to Mike from BrokenThorn for the above image!

All of the above now finally leads into paging and page faults, which will be discussed below.

--------------------

Paging

Paging is very important in many ways, mainly because it allows the operating system to virtualize memory without worrying about segmentation. Instead of splitting up an address space into three logical segments, it's split up into fixed-size units known as pages.

-- A page is a sequence of N bytes where N is a power of 2.

Page sizes are typically somewhere between 4 KB and 64 KB, although larger pages exist as well. On x86 and x64 Windows, the standard page size is 4 KB.
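
Since N is a power of 2, splitting an address into "which page" and "where in the page" is just a shift and a mask. Here's a small sketch assuming 4 KB pages; `split_address` is a name I made up for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


def split_address(virtual_address, page_size=PAGE_SIZE):
    """Return (page number, offset within the page) for an address."""
    # The shift/mask trick only works when the page size is a power of 2.
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page size must be a power of 2")
    shift = page_size.bit_length() - 1          # 12 for 4 KB pages
    page_number = virtual_address >> shift      # high bits select the page
    offset = virtual_address & (page_size - 1)  # low bits locate the byte
    return page_number, offset


page, offset = split_address(0x00073070)
print(hex(page), hex(offset))
```

The offset is carried over untouched during translation; only the page number gets swapped for a page frame number.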

--------------------

Page Table/Disk Map

Now that we understand pages/paging, every address space on the system has two things associated with it:

1. Page Table - Identifies which/what pages are in physical memory.

2. Disk Map - Identifies where all the pages are on the disk.

Both of these describe an entire address space.

In an effort to make my own content this time as opposed to using pre-created images (inspired by P.J. Denning and Steve Coile), I have created the image below.


Regarding the above image, the fields are as follows:

P - Presence flag

U - Used flag

M - Modified flag

F - Page frame

A - Disk address

With the above now known, if the P flag is set, this implies that the page is currently in physical memory (RAM). The F field gives its location in memory, and is the number of the page frame in which the page is located.

If however the P flag is not set (not in physical memory), the address mapper will throw a page fault if the process in question attempts to reference the page. If this occurs, the page fault handler will use the disk map to locate the page on the disk, and finally swap it (or page it) in. This is only a very brief explanation of the page fault process, and I will expand on page faults below.
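
To make the P/U/M/F/A fields a little more concrete, here's a toy sketch of a page table and disk map. It is only a model of the description above, not how Windows actually lays these structures out, and every name in it (`PageTableEntry`, `translate`, `handle_fault`, the example addresses) is something I made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # assumed page size for this sketch


@dataclass
class PageTableEntry:
    present: bool = False        # P: page is currently in physical memory
    used: bool = False           # U: page has been referenced
    modified: bool = False       # M: page has been written to
    frame: Optional[int] = None  # F: page frame holding the page, if present


class PageFault(Exception):
    """Raised when a referenced page does not have its P flag set."""


def translate(page_table, virtual_address, write=False):
    """Map a virtual address to a physical one, or raise PageFault."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table[page]
    if not entry.present:
        raise PageFault(page)
    entry.used = True
    if write:
        entry.modified = True
    return entry.frame * PAGE_SIZE + offset


def handle_fault(page_table, disk_map, page, free_frame):
    """Page in: look up the disk address, then mark the page present."""
    disk_address = disk_map[page]  # A: where the page lives on disk
    # A real handler would now read PAGE_SIZE bytes from disk_address
    # into free_frame; this sketch only updates the bookkeeping.
    entry = page_table[page]
    entry.frame = free_frame
    entry.present = True
    return disk_address


page_table = {0: PageTableEntry(present=True, frame=5), 1: PageTableEntry()}
disk_map = {0: 0x8000, 1: 0x9000}

print(hex(translate(page_table, 0x0010)))   # page 0 is resident
try:
    translate(page_table, 0x1004)           # page 1 is not, so this faults
except PageFault as fault:
    handle_fault(page_table, disk_map, fault.args[0], free_frame=7)
print(hex(translate(page_table, 0x1004)))   # the retry succeeds after page-in
```

Note that the faulting access is simply retried once the page is present, which is why the program never notices anything beyond the delay.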

--------------------

Page Fault

Finally! We get to the page fault, what we've been waiting for. I described above a very basic page fault process, but it's a lot cooler/more interesting than that! In its basic definition of course, a page fault occurs when a program attempts to access a page that is not currently mapped in physical memory (RAM). When the page has to be brought in from the disk, this is known as a hard fault. It's absolutely imperative you understand the difference between a hard fault and a soft fault, which I will discuss below.

Hard Fault - A hard fault (otherwise known/referred to as a major fault) is what most people mean when they say page fault, and you'll see it under that name in Resource Monitor on newer versions of Windows (afaik Vista and later). Hard faults are expensive because of the process the page fault handler must follow when one occurs. For example, if the page is not loaded into memory at the time a program references its address, the page fault handler needs to find a location to load it into. This location is either a free page in memory, or a page in memory that is currently in use.
 
If it's the latter and the page is in use by a pre-existing process, the operating system needs to spend time writing out the data in that page, and mark it as not being loaded into memory. Once this is done, it is now a free location and can be used to read the data for the new page into memory, add an entry for its location to the page table the MMU uses, and finally of course indicate that it is now successfully loaded into memory.
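
The steps above can be sketched in a few lines. Two things here are my own assumptions and not something the description specifies: the victim is picked first-in-first-out (real replacement policies are much smarter), and the old page is only written out if it was modified. The `Memory` class and its fields are made up for illustration.

```python
from collections import deque


class Memory:
    """Toy model of finding a location for a page during a hard fault."""

    def __init__(self, frame_count):
        self.free_frames = list(range(frame_count))
        self.resident = deque()  # (page, frame) pairs, oldest first
        self.dirty = set()       # pages modified since they were loaded
        self.written_back = []   # pages that had to be written to disk

    def hard_fault(self, page):
        if self.free_frames:
            # Easy case: a free page in memory is available.
            frame = self.free_frames.pop(0)
        else:
            # Otherwise take a page that is in use (FIFO is an assumption).
            victim, frame = self.resident.popleft()
            if victim in self.dirty:
                self.written_back.append(victim)  # write its data out first
                self.dirty.discard(victim)
            # Leaving self.resident marks the victim as no longer loaded.
        # Reading the new page from disk into the frame is omitted here;
        # recording the pair stands in for adding the page table entry.
        self.resident.append((page, frame))
        return frame


memory = Memory(frame_count=2)
memory.hard_fault(10)   # uses free frame 0
memory.hard_fault(11)   # uses free frame 1
memory.dirty.add(10)    # pretend page 10 was written to
memory.hard_fault(12)   # no free frames left: page 10 is written out, frame 0 reused
print(memory.written_back, list(memory.resident))
```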

Below is an image representation of the entire process outlined above.


Soft Fault - Entirely different from a hard fault, a soft fault is when the page is already in physical memory, but the MMU (as we discussed above) has not yet marked it as being loaded in memory. This is sometimes/also referred to as a minor fault, as the solution is simple and needs no disk access (i.e. make the operating system create an entry for the page, have the MMU point to that page in memory, and finally of course indicate that it is now successfully loaded into memory).

With all of this said, you can imagine why page faults/hard faults are an extremely expensive process. It's also imperative that you understand that having to unnecessarily access the disk is very slow. Anyone who has ever had a system experiencing frequent page faults for whatever reason can truly attest to how much their system slows to a crawl.

Why is this? Well, since you now understand what actually occurs during a page fault behind the scenes, we can imagine how ridiculously taxing this is on the disk. It doesn't help that accessing the disk is slow in general, but to have to constantly do it is very bad. This is a good time to discuss the main pros & cons of backing virtual memory with the disk. It's more like pro & con, really.

Pros - Very easy to get a lot of disk space for a small cost.

Cons - Slow!

A processor's register can be accessed in about a nanosecond, cache in 5 nanoseconds, and RAM in approximately 100 nanoseconds. With this said, a hard disk access takes milliseconds, which is at least a million times slower than a register. If you are constantly/frequently having to go through the process of a page fault, you can truly imagine now how slow it is.
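
Here's a back-of-the-envelope sketch of what those numbers mean in practice. The register, cache, and RAM figures are the ones above; the 10 ms disk figure is my own assumption for a mechanical drive, and the formula is the simplified textbook one that ignores everything except the disk read.

```python
# Latencies in nanoseconds. The disk figure is an assumed 10 ms seek/read.
LATENCY_NS = {"register": 1, "cache": 5, "ram": 100, "disk": 10_000_000}


def slowdown(slow, fast):
    """How many times slower one level is than another."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


def effective_access_ns(hard_fault_rate):
    """Average memory access time when a fraction of accesses hard fault."""
    return ((1 - hard_fault_rate) * LATENCY_NS["ram"]
            + hard_fault_rate * LATENCY_NS["disk"])


print(slowdown("disk", "register"))
print(slowdown("disk", "ram"))
print(effective_access_ns(0.001))
```

Even with only one access in a thousand going to disk, the average access comes out around 100 times slower than RAM alone under these assumptions.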

--------------------

Page Fault - Continued

To go a bit more behind the scenes in regards to page faults, what actually happens when a page fault occurs is that the thread that was running is placed into a wait state until the operating system's page fault handler can go through the page fault process outlined above. This is done through an interrupt that halts (remember, wait state) the current program.
 
The instruction that attempted to access a page that was either invalid or nonresident (i.e. not in physical memory) fails and throws an exception that generates the interrupt discussed above. Before discussing anything further, we must first discuss why an exception is thrown. Quite simply, an exception is thrown because the CPU has no idea what page files are, etc., and only knows how to work with memory.
 
At this point, one of two things happens:
 
1. An Interrupt Service Routine (ISR) determines that the address is in fact valid, but is not resident (not in physical memory). The operating system then treats this as a page fault and goes through the page fault process outlined above. Once the page fault process is successfully completed, the program picks up right where it left off like nothing happened.
 
or
 
2. An Interrupt Service Routine (ISR) determines that the address is in fact invalid, and then throws an exception known as an access violation. Remember above how we discussed hard (major)/soft (minor) faults? This is specifically known as an invalid fault. In this case, as opposed to following the page fault process outlined above, the operating system does not attempt to access the memory as it's a null/bad address, and simply terminates the executing program in question.
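
The two outcomes (plus the no-fault case) boil down to a simple decision. A minimal sketch, with sets of page numbers standing in for whatever the memory manager really consults; `classify_access` and `Outcome` are names I made up.

```python
from enum import Enum


class Outcome(Enum):
    NO_FAULT = "page is resident, access proceeds"
    PAGE_IN = "valid but nonresident, page it in and resume"
    ACCESS_VIOLATION = "invalid address, access violation"


def classify_access(page, valid_pages, resident_pages):
    """Decide what happens when a thread touches the given page."""
    if page not in valid_pages:
        return Outcome.ACCESS_VIOLATION  # case 2 above
    if page not in resident_pages:
        return Outcome.PAGE_IN           # case 1 above
    return Outcome.NO_FAULT


valid = {1, 2, 3}   # pages that belong to the process
resident = {1}      # pages currently in physical memory

for page in (1, 2, 9):
    print(page, classify_access(page, valid, resident).name)
```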

With the above said, this may dispel the common misconception that 'frequent page faults are okay'. Frequent page faults are not okay, but that is not to say that page faults aren't a normal operation of the operating system. On a fully functioning machine (regarding both hardware and software), you will experience page faults on a very small scale, for example due to some programs simply requiring more memory. If you are experiencing a very large number of page faults (hard faults/sec), you have a problem, and you can most certainly tell because your system has likely slowed to a snail's pace.

As far as some of the potential issues go when it comes to frequent page faults on a system:
 
  • Insufficient RAM (physical memory).
  • Faulty RAM.
  • Need to tailor the pagefile to your system's specific needs.
  • etc

--------------------

Page Fault - BSOD

Now that we understand what happens when a page fault occurs in user-mode, it's also imperative we understand what happens when an exception such as an access violation is thrown in kernel-mode. As you may or may not be able to tell from the way I started this, when an exception such as an access violation occurs in kernel-mode, this results in a bug check (Blue Screen of Death, BSOD).
 
Why? Well, remember we discussed that when an access violation occurs, for example, the program is terminated. What if we're in kernel-mode and the instruction involved in the violation belongs to a device driver? "Uh oh" is exactly what happens. Luckily I have a crash dump from just the other day!
 
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)

This indicates that a system thread generated an exception which the error handler did not catch.

BugCheck 1000007E, {ffffffffc0000005, fffff88004a5d62a, fffff880035af908, fffff880035af160}

The 1st argument is the exception code that wasn't handled by the error handler. In this case, ffffffffc0000005 is an NTSTATUS code. Kernel-mode drivers use NTSTATUS types for return values. ffffffffc0000005's NTSTATUS value is 0xc0000005 (otherwise known as an access violation).

Using NTSTATUS Values
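
If you want to see how ffffffffc0000005 becomes 0xc0000005, here's a small sketch. The argument is a 32-bit NTSTATUS that has been sign-extended to 64 bits, so masking off the upper half recovers it. The field split follows the documented NTSTATUS layout (2-bit severity, 12-bit facility, 16-bit code); the function names are my own.

```python
def ntstatus_from_bugcheck_arg(arg):
    """Keep the low 32 bits of a sign-extended 64-bit bug check argument."""
    return arg & 0xFFFFFFFF


def split_ntstatus(status):
    """Return (severity, facility, code) from a 32-bit NTSTATUS value."""
    severity = status >> 30            # 0 success, 1 info, 2 warning, 3 error
    facility = (status >> 16) & 0xFFF  # which component defined the status
    code = status & 0xFFFF             # the status code itself
    return severity, facility, code


status = ntstatus_from_bugcheck_arg(0xFFFFFFFFC0000005)
print(hex(status), split_ntstatus(status))
```

Severity 3 means error, and code 5 in facility 0 is the access violation we're after.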

The 2nd argument is the memory address at which the exception occurred. In our case, this was fffff88004a5d62a.
 
The 3rd argument is the address of the exception record. In our case, this was fffff880035af908 which we can run .exr on to show exception record information.

1: kd> .exr 0xfffff880035af908
ExceptionAddress: fffff88004a5d62a (igdkmd64+0x000000000003862a)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000073070
Attempt to read from address 0000000000073070

Just by looking at the address of the attempted read, we can assume it's not a null address (because it's not zero), so it must simply be invalid. If interested, you can confirm this by running !pte address, which will display the page table entry (PTE) and page directory entry (PDE) for the specified address. This is not a kernel dump, so running it in my case wouldn't yield any beneficial results.

The 4th argument is the context record address. In our case, this was fffff880035af160 which we can run .cxr on to show the context record.

1: kd> .cxr 0xfffff880035af160
rax=0000000000073000 rbx=fffffa8006299040 rcx=fffffa800637d540
rdx=00000000008dfaf0 rsi=fffffa800637d540 rdi=fffff88004a5d3c0
rip=fffff88004a5d62a rsp=fffff880035afb40 rbp=fffffa8006356b20
 r8=0000000000000000  r9=0000000000000000 r10=0000000000000018
r11=fffff880035afb60 r12=fffffa8006356b20 r13=fffffa800661cc30
r14=0000000000000038 r15=fffff88000f15fe0
iopl=0         nv up ei pl nz na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00210206
igdkmd64+0x3862a:
fffff880`04a5d62a ff5070          call    qword ptr [rax+70h] ds:002b:00000000`00073070=????????????????

This shows us the context that was saved at the time of the exception. It contains the CPU registers, the instruction we failed on, the bad address, etc. First off, the exception (access violation) occurred because igdkmd64.sys (Intel Graphics driver) referenced invalid memory. Regarding the instruction we failed on, we were calling through a pointer read from [rax+70h]. The rax register in our case was 0000000000073000, so the read was from 0000000000073070, which is invalid. The debugger shows ???????????????? for the contents of that address because it could not be read, therefore the box bug checked.
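
The numbers in the dump line up if you do the arithmetic by hand. A quick sketch using only values from the output above; the module base is my own inference from the igdkmd64+0x3862a notation, assuming the offset is relative to where the driver image starts.

```python
rax = 0x0000000000073000
displacement = 0x70                 # from "call qword ptr [rax+70h]"
read_address = rax + displacement   # address the CPU tried to read

rip = 0xFFFFF88004A5D62A            # faulting instruction (2nd bug check argument)
module_offset = 0x3862A             # from "igdkmd64+0x3862a"
module_base = rip - module_offset   # inferred start of igdkmd64 in memory

print(hex(read_address))
print(hex(module_base))
```

The first value matches Parameter[1] from .exr, which is a nice sanity check that we're reading the instruction correctly.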

We can see it from another perspective by disassembling the rip register:

1: kd> u @rip
igdkmd64+0x3862a:
fffff880`04a5d62a ff5070          call    qword ptr [rax+70h]
fffff880`04a5d62d 488b442420      mov     rax,qword ptr [rsp+20h]
fffff880`04a5d632 8b8034010000    mov     eax,dword ptr [rax+134h]
fffff880`04a5d638 c1e813          shr     eax,13h
fffff880`04a5d63b 83e001          and     eax,1
fffff880`04a5d63e 85c0            test    eax,eax
fffff880`04a5d640 0f84ab000000    je      igdkmd64+0x386f1 (fffff880`04a5d6f1)
fffff880`04a5d646 488b442478      mov     rax,qword ptr [rsp+78h]

--------------------

And that's that! I really hope you enjoyed reading, and I imagine there will be many edits/additions to be made as time goes by. For now though, at this moment, I am happy with it.

References/Links

- Pavel Yosifovich's Windows Internals 1/2.
- The Basics of Page Faults.
- Windows Memory Management (Written by: Pankaj Garg).
- Pushing the Limits of Windows: Physical/Virtual Memory.
- So What Is A Page Fault?
- HP OpenVMS Systems Documentation.
- Virtual Address Space and Physical Storage.
- Everything You Need To Know To Start Programming 64-Bit Windows Systems.
- Virtual Memory.
- Physical Memory Structures.
- How to virtualize memory without segments.
- Load/Store Instructions.
- Operating Systems Development - Virtual Memory (by Mike, 2008).
- Implementation of swapping in virtual memory.
- Page fault handling (image).
