Tuesday, April 22, 2014

0xBE - Journaling file system & Log File Service

For the first time today I came upon NTFS Journaling in a crash dump, so I thought I'd go ahead and write a blog post about it!

Thread/post here.


First off, before going into the specific scenario, let's talk about what a journaling file system is. Essentially, a journaling file system (JFS from here on) keeps track of pending changes in what is known as a journal. This journal is generally a circular log located in a dedicated area of the file system. It's very important to note that changes are recorded in the journal before they are committed to the main file system and written through to disk.

With that said, one question may remain: "Why do we even go through this process in the first place?" In the simplest terms, it is done to maintain data integrity. If a crash, hang, etc. occurs, the JFS has a log it can use to repair any data that was left corrupt or incomplete. Changes that were fully recorded in the journal can be replayed to the location they would have been stored in had the system not been unexpectedly interrupted, and partially completed changes can be rolled back, which returns the file system to a consistent state.
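
To make the "journal first, file system second" idea concrete, here is a minimal sketch in Python. It is a toy model, not how NTFS implements its log: the class name `JournaledStore`, its methods, and the `crash_before_apply` flag are all made up for illustration, and it only replays (redoes) changes, whereas real NTFS logging is considerably more involved.

```python
class JournaledStore:
    """Toy key/value store that journals each change before applying it.

    Hypothetical illustration only; not the NTFS on-disk format.
    """

    def __init__(self):
        self.journal = []  # stands in for the on-disk journal
        self.data = {}     # stands in for the main file system

    def write(self, key, value, crash_before_apply=False):
        # Step 1: record the intended change in the journal first.
        self.journal.append((key, value))
        # Step 2: only then apply it to the main store.
        if crash_before_apply:
            return  # simulate an interruption between the two steps
        self.data[key] = value

    def recover(self):
        # Replay every journaled change. Re-applying a change that
        # already made it to the main store is harmless.
        for key, value in self.journal:
            self.data[key] = value


store = JournaledStore()
store.write("a.txt", "hello")
store.write("b.txt", "world", crash_before_apply=True)

before = dict(store.data)  # state right after the simulated crash
store.recover()
after = dict(store.data)   # state after replaying the journal
```

After the simulated crash only `a.txt` is present in the main store; replaying the journal brings `b.txt` back as well.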

Now, you may be wondering how we actually communicate and/or work with something like this, and that's where NTFS.sys comes in. With NTFS.sys, we have a series of kernel-mode routines (which I will display below in my analysis) that are used to access the log file. This log file is specifically divided into two regions:

1. LFS Restart Area.

2. Infinite Logging Area.

Here's a diagram from http://www.ntfs.com/transaction.htm:

NTFS.sys calls the LFS (Log File Service) to read and write the Restart Area. In the diagram, we can see the two areas mentioned above. You may notice that the LFS Restart Area holds two copies. This is done so that if one copy is corrupt or inaccessible, the second is still available.

If we take a look at the other side, we can see the Logging Area, which as I mentioned above is circular (which is where 'infinite' comes from). New records are added to the log file until it reaches full capacity, at which point the LFS waits for pending writes to complete and then frees up space for new records.
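
The circular behavior can be sketched in a few lines of Python. This is a hedged illustration under my own assumptions: `CircularLog`, `append`, and `release_through` are hypothetical names, and the sequence numbers are a simplification of what a real log service tracks.

```python
class CircularLog:
    """Fixed-size log whose slots are reused once old records are released.

    Hypothetical illustration only; not the LFS implementation.
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.next_lsn = 0    # sequence number of the next record
        self.oldest_lsn = 0  # oldest record that is still needed

    def append(self, record):
        if self.next_lsn - self.oldest_lsn == len(self.slots):
            raise RuntimeError("log full: nothing can be reclaimed yet")
        # Wrap around to the start of the area when the end is reached.
        self.slots[self.next_lsn % len(self.slots)] = record
        self.next_lsn += 1
        return self.next_lsn - 1

    def release_through(self, lsn):
        # The changes described by records up to and including `lsn`
        # are safely on disk, so their slots may be overwritten.
        self.oldest_lsn = max(self.oldest_lsn, min(lsn + 1, self.next_lsn))


log = CircularLog(3)
for record in ("a", "b", "c"):
    log.append(record)

try:
    log.append("d")
    full = False
except RuntimeError:
    full = True  # no space until an old record is released

log.release_through(0)  # record "a" is no longer needed
lsn = log.append("d")   # reuses the slot that held "a"
```

The log never grows; it only looks infinite because released slots are handed out again.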

For what we're discussing here, that's about all we need to know. If you'd like to know more, I suggest reading the link above.


Great, so now we have some pretty decent knowledge of how a JFS, the LFS, and NTFS work together to protect data integrity. Let's now take a look at the crash I dealt with earlier:

An attempt was made to write to readonly memory.  The guilty driver is on the
stack trace (and is typically the current instruction pointer).
When possible, the guilty driver's name (Unicode string) is printed on
the bugcheck screen and saved in KiBugCheckDriver.
Arg1: fffff9800a472010, Virtual address for the attempted write.
Arg2: 80e0000057391121, PTE contents.
Arg3: fffff880031b5d50, (reserved)
Arg4: 000000000000000b, (reserved)

Let's go ahead and take a look at that 2nd parameter:

3: kd> !pte 80e0000057391121
                                           VA 80e0000057391121
PXE at FFFFF6FB7DBED000    PPE at FFFFF6FB7DA00008    PDE at FFFFF6FB400015C8    PTE at FFFFF680002B9C88
contains 0070000138B42867  contains 60D00000A3408867  contains 0000000000000000
pfn 138b42    ---DA--UWEV  pfn a3408     ---DA--UWEV  not valid

WARNING: noncanonical VA, accesses will fault !
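
Both the warning and the raw value of Arg2 can be checked by hand. The sketch below assumes the standard x86-64 4-level paging layout (bit 0 valid, bit 1 writable, bit 2 user, bit 5 accessed, bit 6 dirty, bit 8 global, bits 12 to 51 page frame number, bit 63 no-execute) and 48-bit canonical addresses. The helper names `decode_pte` and `is_canonical` are my own, not debugger commands.

```python
def decode_pte(pte):
    """Decode hardware bits of an x86-64 PTE (assumed 4-level paging layout)."""
    return {
        "valid": bool(pte & (1 << 0)),
        "writable": bool(pte & (1 << 1)),
        "user": bool(pte & (1 << 2)),
        "accessed": bool(pte & (1 << 5)),
        "dirty": bool(pte & (1 << 6)),
        "global": bool(pte & (1 << 8)),
        "no_execute": bool(pte & (1 << 63)),
        "pfn": (pte >> 12) & ((1 << 40) - 1),  # bits 12..51
    }


def is_canonical(va):
    """A 48-bit canonical address has bits 63..47 all equal."""
    top = va >> 47
    return top == 0 or top == 0x1FFFF


arg1 = 0xFFFFF9800A472010  # virtual address of the attempted write
arg2 = 0x80E0000057391121  # PTE contents

pte = decode_pte(arg2)
arg1_canonical = is_canonical(arg1)
arg2_canonical = is_canonical(arg2)
```

Decoded this way, Arg2 describes a valid page with the writable bit clear, while only Arg1 passes the canonical check. Arg2 is PTE contents rather than an address, so it is not expected to look like one.
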
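The warning above needs a caveat: !pte expects a virtual address, but here it was given the PTE contents from Arg2, so the "noncanonical VA" message describes that value rather than the faulting address in Arg1 (fffff9800a472010). Read as PTE contents, the low bits of Arg2 (0x121) show a valid page with the write bit clear, i.e. a read-only mapping, which is exactly what this bug check reports. Let's take a look at the call stack: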

3: kd> k
Child-SP          RetAddr           Call Site
fffff880`031b5be8 fffff800`02ef37c6 nt!KeBugCheckEx
fffff880`031b5bf0 fffff800`02e73cee nt! ?? ::FNODOBFM::`string'+0x44cde
fffff880`031b5d50 fffff880`012fcd0e nt!KiPageFault+0x16e
fffff880`031b5ee0 fffff880`01303be5 Ntfs!LfsWriteLogRecordIntoLogPage+0x1ee <--- As the log record is being written into the log page, the write triggers a page fault.
fffff880`031b5f80 fffff880`012ff536 Ntfs!LfsWrite+0x145 <--- Writing to the LFS.
fffff880`031b6040 fffff880`013002ef Ntfs!NtfsWriteLog+0x466 <--- Preparing to call the LFS to write to the log.
fffff880`031b6290 fffff880`013013ad Ntfs!NtfsChangeAttributeValue+0x34f <--- Changing the value of an NTFS attribute (NTFS stores a file's metadata and data as attributes in its MFT record).
fffff880`031b6480 fffff880`012cea70 Ntfs!NtfsUpdateStandardInformation+0x26b <--- Updating the file's standard information attribute (timestamps, flags, etc.).
fffff880`031b6590 fffff880`012cf41d Ntfs!NtfsCommonFlushBuffers+0x1f0 <--- Common routine that handles the buffer flush.
fffff880`031b6670 fffff800`0331ed26 Ntfs!NtfsFsdFlushBuffers+0x10d <--- File System Driver (FSD) entry point for the buffer flush request.
fffff880`031b66e0 fffff880`01041bcf nt!IovCallDriver+0x566
fffff880`031b6740 fffff880`010406df fltmgr!FltpLegacyProcessingAfterPreCallbacksCompleted+0x24f
fffff880`031b67d0 fffff800`0331ed26 fltmgr!FltpDispatch+0xcf
fffff880`031b6830 fffff800`0317f17b nt!IovCallDriver+0x566
fffff880`031b6890 fffff800`03113ea1 nt!IopSynchronousServiceTail+0xfb
fffff880`031b6900 fffff800`02e74e53 nt!NtFlushBuffersFile+0x171
fffff880`031b6990 fffff800`02e71410 nt!KiSystemServiceCopyEnd+0x13
fffff880`031b6b28 fffff800`03114c5f nt!KiServiceLinkage
fffff880`031b6b30 fffff800`03114a20 nt!CmpFileFlush+0x3f
fffff880`031b6b70 fffff800`03114caa nt!HvWriteDirtyDataToHive+0xe0
fffff880`031b6be0 fffff800`03105bbf nt!HvOptimizedSyncHive+0x32
fffff880`031b6c10 fffff800`03105d25 nt!CmpDoFlushNextHive+0x197
fffff880`031b6c70 fffff800`02e7f261 nt!CmpLazyFlushWorker+0xa5
fffff880`031b6cb0 fffff800`031122ea nt!ExpWorkerThread+0x111
fffff880`031b6d40 fffff800`02e668e6 nt!PspSystemThreadStartup+0x5a
fffff880`031b6d80 00000000`00000000 nt!KxStartSystemThread+0x16
As noted above, bug check 0xBE indicates an attempt to write to read-only memory. The write in question came from this frame: Ntfs!LfsWriteLogRecordIntoLogPage+0x1ee. So, why did NTFS.sys attempt to write to read-only memory and trigger a page fault? In almost all cases, you will not see a Microsoft system driver accessing invalid or read-only memory on its own, so when one does, something else is usually at fault, which is why I moved on to checking the disk.

I had the user run Chkdsk, which found and corrected errors but reported no bad sectors. I also recommended running SeaTools for DOS, so I will report back when I have any new info.


References I used to learn about JFS, LFS, etc: