Wednesday, January 29, 2014

LogMeIn Hamachi Windows 8 and 8.1 Bug

(Last updated 2/5/2014 - Update #7) - Updates are scattered throughout the blog post, so I'd recommend using CTRL+F "Update #numberhere" without the quotations of course to get to the update you'd like to read.

To those reading currently experiencing the 0x44 BSOD caused by LogMeIn Hamachi, the only current workaround until LogMeIn implements a fix and/or releases a stable version is to fully uninstall LogMeIn Hamachi.

As of 2/5/2014, a LogMeIn update has been released that addresses this bug. Be sure to go ahead and download and/or update to the latest version if yours doesn't automatically:


Thanks to timbadia @ Sysnative for the screenshot!

--------------------------------

At this point, Harry (x BlueRobot) and I have analyzed this to the point where we can do no more as we are only volunteer analysts with an interest in helping people. We've found consistency thanks to everyone who submitted kernel-dumps. If you would like to do your part in making sure this gets solved, please go ahead and submit a support ticket (HERE) and in the 'Please describe your issue or question' field, if you could include a link to this blog post so the LogMeIn team can see what the problem is, etc, so this can be solved as fast as possible for its users, that would be fantastic.

Keep reading below if you'd like to learn more about the specifics and internals of this crash! : )

--------------------------------

So, what's the problem?

Well, first off, the bug check seems to consistently be MULTIPLE_IRP_COMPLETE_REQUESTS (44).

-- A driver has called IoCompleteRequest to ask that an IRP be completed, but the packet has already been completed.
When dealing with the first thread, the first thing I did was straight away take a look at the call stack and saw this:




Very simple call stack, really.

1. We have Hamdrv.sys (LogMeIn Hamachi Virtual Miniport Driver).

2.  nt! ?? ::FNODOBFM::`string'+0x2a9ab - TO MY KNOWLEDGE, the debugger (WingDbg) is slightly confused about symbol names in NTDLL due to the binary being reorganized into function chunks. The functions are no longer contiguous in memory. Hot code paths are clustered together with hot code paths of other functions. “Cold” code paths are moved elsewhere. That way you save on paging I/O by maximizing the amount of relative data on each code page.

Essentially, to my understanding, when a sequence of code is compiled, it will occupy a single contiguous chunk of memory. With this said however, the optimizer can spread the executable code all over the place, replacing the inline code with a jump to some other memory location. As said above, this is done for two main reasons (maybe more, but I only understand two):

A. Code size - If there's a common code sequence, make a copy and have everything jump to that copy.

B. Locality of reference - If there's a code path that is taken quite often, putting that code together into a single location can possibly improve performance. As far as how this improves performance, to my knowledge it's mostly cache purposed. For example, a TLB (Translation lookaside buffer).

With this said, if the optimizer moves the code to a different (random) location, what happens to the information in the PDB (Program Database)?

When this happens, to my knowledge, FunctionName+Offset no longer equals FunctionAddress+Offset, therefore the output of information in the debugger isn't correct. In these specific cases, the code is moved to a location (which is random, to my knowledge) and the closest symbolic name is a string in the image. When this happens, the debugger (WinDbg) uses the string as a best guess for the return address on the stack.

To further debug this, I believe I would need to set breakpoints and then disassemble a specific routine and take a look at the control flaws (which since this is not happening to me, I cannot do). Also, as much as I am interested in doing something like this, requesting the average computer user on a forum such as MS Answers to set a breakpoint(s) for knowledge related purposes is going a little far. It's just another reason for me to set up a guinea pig system.

3. Bug check itself.

-------------

At first I thought it may have been the same user just making a new thread under a separate account possibly because they didn't want to follow my advice, but:



 












^^ Different models entirely, however the OS environment remained consistent (Windows 8.1). With this said, possibly a bug exclusive to 8.1.

-------------

At first glance of the original dump I saw, I figured there may have been security software installed on the system in question that was causing conflicts with remote access, etc. I took a look at the loaded modules list and noted McAfee, nothing more. I informed the user to remove McAfee as it may be causing conflicts, but the same exact issue and bug check persisted after McAfee's removal. With this said, it does not appear to be caused by anything network and/or security related, although I could be wrong.

As per usual, there doesn't appear to be a simple way to contact LogMeIn if you don't have an account, and their customer support seems to have vanished entirely since Jan 21st (their move away from being FREE). With this said, not sure how to bring this up. I'll keep my eyes peeled for any other of the same bug check.

-------------

Update #1 - A few thread examples:

1. Blue Screen Error - Bugcheck - Microsoft Community

2. Windows 8.1 Error : MULTIPLE IRP COMPLETE REQUESTS. - Microsoft Community

3. BSOD Win 8.1 Multiple Complete IRP Request

4.
BLUE SCREEN - MULTIPLE IRP COMPLETE REQUESTS

-- All are the same OS environment (Windows 8.1). I was thinking of providing a link for each 0x44 BSOD thread I come across, but that would be quite a bit. I may do it eventually, just not now.

I've also received notice from my good friend and Microsoft MVP zigzag3143 that he has seen quite a number of these popping up very recently. At this point, it's very evident there is a bug with the software itself and Windows 8.1 as an OS, and that this is very likely NOT a user specific or system specific issue.

Update #2 - I discovered that LogMeIn has a ticket system (here), so I've gone ahead and reported this issue. I hope to hear from them soon!

Update #3 - I have not received a response from LogMeIn since submitting a ticket, and I probably won't. In other news, I received a kernel-dump from a very generous user after I requested if it's possible they allow their system to crash with the system set to generate kernel dumps as opposed to minidumps. This way, I would be able to take a closer look and get more information. Here is what I was able to see:

First off, the call stack is interesting:

7: kd> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffff880`031a9d48 fffff803`d802e6f2 : 00000000`00000044 fffffa80`0ea3f910 00000000`00000f7a 00000000`00000000 : nt!KeBugCheckEx
fffff880`031a9d50 fffff880`1a801f0a : 00000000`00000000 fffff880`1ca7ac00 fffffa80`0c7852c0 fffffa80`0c7852c0 : nt! ?? ::FNODOBFM::`string'+0xb318
fffff880`031a9e30 fffff880`1a803059 : fffffa80`0c7852c0 fffff880`031a9f00 fffff880`031a9f80 fffffa80`0000004a : Hamdrv+0x1f0a
fffff880`031a9e80 fffff880`1a802a56 : fffff880`031aa680 fffffa80`00000001 fffffa80`09686b90 fffffa80`09686cf0 : Hamdrv+0x3059
fffff880`031aa540 fffff880`01e7dcc9 : fffffa80`0b3f51a0 fffffa80`09686b90 00000000`00000000 00000000`00000000 : Hamdrv+0x2a56
fffff880`031aa570 fffff880`01e7c166 : fffffa80`09686b90 00000000`00000000 fffffa80`0c4dd7f0 fffffa80`0c4dd7f0 : ndis!ndisMSendNBLToMiniport+0xc9
fffff880`031aa5f0 fffff880`02305230 : fffffa80`09686b90 00000000`00000001 fffffa80`09686b90 fffffa80`0c4dd7f0 : ndis!ndisFilterSendNetBufferLists+0xd6
fffff880`031aa630 fffff880`0230540f : 00000000`00000000 fffffa80`09686b90 00000000`00000000 fffffa80`0c4dd7f0 : wfplwfs!L2NdisFSendNetBufferLists+0x80
fffff880`031aa670 fffff880`01e7dd8d : 00000000`00000000 fffffa80`09686b90 00000000`00000000 00000000`00000000 : wfplwfs!LwfLowerSendNetBufferLists+0x9f
fffff880`031aa6e0 fffff803`d80cb9a6 : 00000000`00000202 fffff803`d877abb9 fffff880`04714970 00000000`000007ff : ndis!ndisDataPathExpandStackCallback+0x31
fffff880`031aa720 fffff803`d80ce405 : fffff880`01e7dd5c fffff880`031aa910 00000000`00000000 fffff803`d82a8a08 : nt!KeExpandKernelStackAndCalloutInternal+0xe6
fffff880`031aa820 fffff880`01e7c05f : fffff8a0`046587f0 fffff803`d828d409 00000000`00000001 fffff803`d828c200 : nt!KeExpandKernelStackAndCalloutEx+0x25
fffff880`031aa860 fffff880`01e7c166 : 00000000`00000000 fffffa80`094f5bf0 fffffa80`08ff9970 00000000`00000000 : ndis!ndisInvokeNextSendHandler+0x2de
fffff880`031aa970 fffff880`04796997 : fffffa80`09686b90 00000000`00000022 fffffa80`0c5aebf0 fffff880`031aaa39 : ndis!ndisFilterSendNetBufferLists+0xd6
fffff880`031aa9b0 fffff880`0479bbc5 : fffffa80`0e44950e fffffa80`0e449500 fffffa80`0000004a fffffa80`00000014 : bdfndisf6+0x3997
fffff880`031aaaa0 fffff803`d80c4b77 : fffffa80`095061a0 00000000`0020b553 00000000`00000000 fffff880`0479b934 : bdfndisf6+0x8bc5
fffff880`031aab10 fffff803`d80b42a1 : fffff803`d82a2110 fffffa80`0850a040 fffff803`d80c4b18 fffff803`d807fc00 : nt!IopProcessWorkItem+0x5f
fffff880`031aab80 fffff803`d8048fd9 : 9212ecd9`41a9baff 00000000`00000080 fffff803`d80b4160 fffffa80`0850a040 : nt!ExpWorkerThread+0x142
fffff880`031aac10 fffff803`d80fd7e6 : fffff880`009cb180 fffffa80`0850a040 fffff880`009d6f40 fffffa80`0849f740 : nt!PspSystemThreadStartup+0x59
fffff880`031aac60 00000000`00000000 : fffff880`031ab000 fffff880`031a5000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16
Aside from the various Network Driver Interface Specification and Windows Filtering Platform routines, we can see two BitDefender Firewall calls. As we ascend up the stack, we of course eventually see our three LogMeIn Hamachi Virtual Miniport Driver calls, and then the bug check.

What's really interesting to me is that BitDefender's firewall turned up in the stack. Normally, if I saw this first as opposed to a minidump, I would assume that BitDefender's firewall was causing conflicts, but it has shown in various scenarios that the user would remove their security software and or firewall and the issue continued, or they didn't have any installed whatsoever to cause conflict in the first place.

So, was it called maybe for reasons other than potential conflict? Possibly just noting that it had to deal with BitDefender before beginning the Network Driver Interface Specification routines?

Also, there's definitely a trend in the packet being marked Pending as opposed to Completed:

7: kd> dt nt!_IRP fffffa800ea3f910
   +0x000 Type             : 0n-12272
   +0x002 Size             : 0xc5a
   +0x004 AllocationProcessorNumber : 0xfa80
   +0x006 Reserved         : 0xffff
   +0x008 MdlAddress       : (null)
   +0x010 Flags            : 0x60900
   +0x018 AssociatedIrp    : <unnamed-tag>
   +0x020 ThreadListEntry  : _LIST_ENTRY [ 0xfffffa80`0ea3f930 - 0xfffffa80`0ea3f930 ]
   +0x030 IoStatus         : _IO_STATUS_BLOCK
   +0x040 RequestorMode    : 1 ''
   +0x041 PendingReturned  : 0x1 ''
   +0x042 StackCount       : 1 ''
   +0x043 CurrentLocation  : 3 ''
   +0x044 Cancel           : 0 ''
   +0x045 CancelIrql       : 0 ''
   +0x046 ApcEnvironment   : 0 ''
   +0x047 AllocationFlags  : 0x4 ''
   +0x048 UserIosb         : 0x00000000`01b12c30 _IO_STATUS_BLOCK
   +0x050 UserEvent        : 0xfffffa80`09017860 _KEVENT
   +0x058 Overlay          : <unnamed-tag>
   +0x068 CancelRoutine    : 0xfffff880`1a80188c     void  +0
   +0x070 UserBuffer       : (null)
   +0x078 Tail             : <unnamed-tag>

The IoStatus also consistently appears to be IO_STATUS_BLOCK, which the driver sets the IRP's I/O status block to indicate the final status of an I/O request before calling IoCompleteRequest for the IRP. An I/O status block serves two purposes:


  • It provides a higher-level driver's IoCompletion routine a way of determining whether the service worked when the IRP is completed.
  • It provides more information about why the service either worked or did not work.

As far as I know, this is the completion status, either STATUS_SUCCESS if the requested operation was completed successfully as you mentioned earlier, or an informational, warning, or error STATUS_XXX value. This is where I am stuck, I cannot find where the error, warning, etc, is?

I'm having a fellow BSOD and Windows Internals Expert have a look at the kernel-dump I was provided. I will add more updates regarding it when I get them.


 -------------

Update #4 - Fellow BSOD, Windows Internals Expert, and good friend Harry has taken a look at the kernel-dump I provided him and some more interesting information was discovered. What's below likely won't make sense to the average user, but that's okay because it's more-so for knowledge related purposes and just some extra poking around. The initial temporary workaround still stands, which is to simply uninstall LogMeIn Hamachi until the LogMeIn developers issue a fix or patch.

7: kd> dt nt!_IO_STATUS_BLOCK fffffa800ea3f910
   +0x000 Status           : 0n207278096
   +0x000 Pointer          : 0xfffffa80`0c5ad010 Void
   +0x008 Information      : 0

Status field layout can be found below.

7: kd> ? 0n207278096
Evaluate expression: 207278096 = 00000000`0c5ad010

I've evaluated the value to hexiadecimal value, to try and decipher what it could mean. It seems that it may have returned NT_ERROR. I managed to find this within the Ntdef.h Header file, I apologise if anyone has difficulty reading my dark theme for Visual Studio.

7: kd> .formats 0n207278096
Evaluate expression:
  Hex:     00000000`0c5ad010
  Decimal: 207278096
  Octal:   0000000000001426550020
  Binary:  00000000 00000000 00000000 00000000 00001100 01011010 11010000 00010000 <-- 0 starts here
  Chars:   .....Z..
  Time:    Tue Jul 27 02:14:56 1976
  Float:   low 1.68567e-031 high 0
  Double:  1.02409e-315

Additional Information from the WDK -

When testing a return value, you should use one of the following system-supplied macros (defined in Ntdef.h):

NT_SUCCESS(Status) Evaluates to TRUE if the return value specified by Status is a success type (0 − 0x3FFFFFFF) or an informational type (0x40000000 − 0x7FFFFFFF).

NT_INFORMATION(Status) Evaluates to TRUE if the return value specified by Status is an informational type (0x40000000 − 0x7FFFFFFF).

NT_WARNING(Status) Evaluates to TRUE if the return value specified by Status is a warning type (0x80000000 − 0xBFFFFFFF).

NT_ERROR(Status) Evaluates to TRUE if the return value specified by Status is an error type (0xC0000000 - 0xFFFFFFFF).

7: kd> !irp fffffa800ea3f910 1
Irp is active with 1 stacks 3 is current (= 0xfffffa800c440c10)
 No Mdl: No System Buffer: Thread fffffa800c716900:  Irp is completed.  Pending has been returned
Flags = 00060900
ThreadListEntry.Flink = fffffa800ea3f930
ThreadListEntry.Blink = fffffa800ea3f930
IoStatus.Status = 00000000
IoStatus.Information = 00000118
RequestorMode = 00000001
Cancel = 00
CancelIrql = 0
ApcEnvironment = 00
UserIosb = 01b12c30
UserEvent = fffffa8009017860
Overlay.AsynchronousParameters.UserApcRoutine = 00000000
Overlay.AsynchronousParameters.UserApcContext = 01b12c30
Overlay.AllocationSize = 00000000 - 00000000
CancelRoutine = fffff8801a80188c   Hamdrv
UserBuffer = 00000000
&Tail.Overlay.DeviceQueueEntry = fffffa800ea3f988
Tail.Overlay.Thread = fffffa800c716900
Tail.Overlay.AuxiliaryBuffer = fffff803d8567998
Tail.Overlay.ListEntry.Flink = 00000000
Tail.Overlay.ListEntry.Blink = 00000000
Tail.Overlay.CurrentStackLocation = fffffa800c440c10
Tail.Overlay.OriginalFileObject = 00000000
Tail.Apc = 00580012
Tail.CompletionKey = 00580012
     cmd  flg cl Device   File     Completion-Context
 [  3, 0]   0  0 fffffa8009ff6060 00000000 00000000-00000000  
           \Driver\hamachi
            Args: 00000000 00000000 00000000 00000000

The 3 is the Major Function Code for IRP_MJ_READ and the Minor Function Code is IRP_MN_NORMAL from my understanding.

7: kd> !devstack fffffa8009ff6060
  !DevObj   !DrvObj            !DevExt   ObjectName
> fffffa8009ff6060  \Driver\hamachi    fffffa8009ff61b0  HamachiTap.C6AE2999-2529-481A-A4BA-C3ED805229CC

7: kd> dt nt!_IO_STACK_LOCATION 0xfffffa800c440c10
   +0x000 MajorFunction    : 0x5 ''
   +0x001 MinorFunction    : 0 ''
   +0x002 Flags            : 0xd8 ''
   +0x003 Control          : 0 ''
   +0x008 Parameters       : <unnamed-tag>
   +0x028 DeviceObject     : (null)
   +0x030 FileObject       : (null)
   +0x038 CompletionRoutine : (null)
   +0x040 Context          : (null)

The data structure shows the current stack location of the IRP. Since IRPs can be handled by different Driver Objects through Driver Stacks, and therefore may be a indication of the different driver handling a different IRP.

7: kd> !drvobj fffffa800968ee60
Driver object (fffffa800968ee60) is for:
 \Driver\hamachi
Driver Extension List: (id , addr)
(4e4d4944 fffffa800ef88bb0)
Device Object list:
fffffa8009ff6060  fffffa800b3f5050 

7: kd> !devobj fffffa800b3f5050
Device object (fffffa800b3f5050) is for:
 NDMP24 \Driver\hamachi DriverObject fffffa800968ee60
Current Irp 00000000 RefCount 0 Type 00000017 Flags 00002050
Dacl fffff9a100265f51 DevExt fffffa800b3f51a0 DevObjExt fffffa800b3f6688
ExtensionFlags (0x00000800)  DOE_DEFAULT_SD_PRESENT
Characteristics (0x00000100)  FILE_DEVICE_SECURE_OPEN
AttachedTo (Lower) fffffa800e6bc7f0 \Driver\PnpManager
Device queue is not busy.

7: kd> !devstack fffffa800b3f5050
  !DevObj   !DrvObj            !DevExt   ObjectName
> fffffa800b3f5050  \Driver\hamachi    fffffa800b3f51a0  NDMP24 <-- Network Data Management Protocol?
  fffffa800e6bc7f0  \Driver\PnpManager 00000000  0000006a
!DevNode fffffa8008e37010 :
  DeviceInst is "ROOT\NET\0000"
  ServiceName is "Hamachi"

The NDMP24 may explain the network related calls in the Call Stack for the thread which was responsible for the IRP.

7: kd> !thread fffffa800c716900
THREAD fffffa800c716900  Cid 1c40.1da0  Teb: 000007f7fc3b4000 Win32Thread: fffff901007ffb90 RUNNING on processor 1
IRP List:
    fffffa800f678980: (0006,0118) Flags: 00060000  Mdl: 00000000 <-- Npfs
    fffffa800ea04ee0: (0006,0118) Flags: 00060000  Mdl: 00000000 <-- Npfs
    fffffa800f5f7520: (0006,01f0) Flags: 00060900  Mdl: 00000000 <-- nsiproxy
    fffffa800c52a520: (0006,01f0) Flags: 00060800  Mdl: 00000000 <-- nsiproxy
Not impersonating
DeviceMap                 fffff8a00000c280
Owning Process            fffffa800c4c8940       Image:         hamachi-2.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      864481         Ticks: 0
Context Switch Count      43034          IdealProcessor: 0            
UserTime                  00:00:00.125
KernelTime                00:00:00.140
Win32 Start Address 0x000007fd00a44aa0
Stack Init fffff8801ce29c90 Current fffff8801ce29050
Base fffff8801ce2a000 Limit fffff8801ce24000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
fffff880`1ce29438 fffff880`00b77115 : fffffa80`0cc3b840 fffffa80`093c9270 00000000`00000000 fffffa80`08531e00 : nt!ObReferenceObjectByHandle
fffff880`1ce29440 fffff880`00b590a7 : fffff880`1ce29918 00000000`00000320 00000000`00000000 fffffa80`0cb9e620 : afd!AfdEnumNetworkEvents+0x75
fffff880`1ce294e0 fffff803`d844b47a : fffffa80`0c549610 00000000`00000000 00000000`000003a8 00000000`00000114 : afd!AfdFastIoDeviceControl+0x1097
fffff880`1ce29890 fffff803`d8460a76 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0x3d9
fffff880`1ce29a20 fffff803`d8075453 : fffffa80`0c716900 00000000`01f9fb18 fffff880`1ce29aa8 00000000`01f9fb40 : nt!NtDeviceIoControlFile+0x56
fffff880`1ce29a90 000007fd`03242c5a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`1ce29b00)
00000000`01f9fcb8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x000007fd`03242c5a

7: kd> !irp fffffa800f5f7520
Irp is active with 2 stacks 2 is current (= 0xfffffa800f5f7638)
 No Mdl: No System Buffer: Thread fffffa800c716900:  Irp stack trace.
     cmd  flg cl Device   File     Completion-Context
 [  0, 0]   0  0 00000000 00000000 00000000-00000000  

            Args: 00000000 00000000 00000000 00000000
>[  3, 0]   0  1 fffffa800caf9260 fffffa8008c53d10 00000000-00000000    pending
           \FileSystem\Npfs
            Args: 00002000 00000000 00000000 00000000

Npfs is a Windows driver, but I'm correct, isn't the driver attempting to complete the IRP packet again? I don't think I may be correct though, if the IRP is targeted at a different Device Object.

-------------

So, that was Harry's analysis of the 0x44 kernel-dump. As discovered above, Npfs.sys appears to be sitting at status PENDING (not complete). Npfs.sys is the Named Pipe File System service. It enables the exchange of data among multiple paths, whether the exchange is for the same application or for two or more different applications at a time.

Is this the reason the crash is occurring? Maybe, we cannot say because one kernel-dump is not enough as there needs to be a check for consistency. If we receive more kernel-dumps, we can check and see whether or not Npfs.sys is consistent in its status PENDING. If it is, it appears the developers will need to iron that out and everything will be solid.

-------------

Update #5 - Wow, today I woke up to multiple emails from various people either providing their kernel-dumps, or asking if I'd like theirs. This is absolutely incredible, I am ecstatic. I'd like to extend a HUGE thank you to every single one of you. At the end of it all, I will be absolutely sure to give credit to every single one of you by your name, etc. Let's get started!

David Laverdiere II's kernel-dump:

This is the first kernel-dump I received, so it's only fair I use it in the first major analysis to check for consistency : )

First off, let's take a look at the call stack just to see what's going on:

1: kd> kv
Child-SP          RetAddr           : Args to Child                                                           : Call Site
ffffd000`3a7bb818 fffff802`ef991b4b : 00000000`00000044 ffffe000`07e7e2a0 00000000`00000f8a 00000000`00000000 : nt!KeBugCheckEx
ffffd000`3a7bb820 fffff800`045c1f0a : 00000000`00000000 ffffd000`3ac72f00 ffffe000`088a1340 ffffe000`088a1340 : nt! ?? ::FNODOBFM::`string'+0x2a9ab
ffffd000`3a7bb930 fffff800`045c3059 : ffffe000`088a1340 ffffd000`3a7bba02 ffffd000`3a7bba80 ffffe000`00000054 : Hamdrv+0x1f0a
ffffd000`3a7bb980 fffff800`045c2a56 : 00000000`00000000 ffffe000`04b02bb0 ffffe000`04b02bb0 ffffe000`04b02d10 : Hamdrv+0x3059
ffffd000`3a7bc040 fffff800`010b7b14 : 00000000`00000000 ffffd000`3a7bc170 00000000`00000000 00000000`00000002 : Hamdrv+0x2a56
ffffd000`3a7bc070 fffff800`012e6115 : ffffe000`062ed430 ffffe000`04b02bb0 ffffe000`00000000 00000000`00000017 : ndis!NdisSendNetBufferLists+0x554
ffffd000`3a7bc250 fffff800`012e2c54 : fffff800`0144a370 00000000`00000000 ffffd000`3a7bc400 ffffe000`000086dd : tcpip!IppFragmentPackets+0x4e5
ffffd000`3a7bc3a0 fffff800`012a2475 : fffff800`0144a370 ffffe000`0657b000 ffffd000`00000000 00000000`00000000 : tcpip!IppDispatchSendPacketHelper+0x94
ffffd000`3a7bc530 fffff800`012e1b02 : 00000000`00000000 00000000`00000017 ffffd000`3a7bca40 ffffe000`02839498 : tcpip!IppPacketizeDatagrams+0x2d5
ffffd000`3a7bc6d0 fffff800`0129c5f2 : 00000000`00000000 ffffe000`00cf1204 fffff800`0144a370 ffffe000`06cd7630 : tcpip!IppSendDatagramsCommon+0x4a2
ffffd000`3a7bc8b0 fffff800`0128ed2e : 00000000`00000000 00000000`000035ce 00000000`0000eb14 00000000`0000080a : tcpip!IpNlpSendDatagrams+0x42
ffffd000`3a7bc8f0 fffff800`012ea33a : ffffe000`0808c100 00000000`00000002 ffffe000`0576a2e8 fffff800`0144a370 : tcpip!UdpSendMessagesOnPathCreation+0x42e
ffffd000`3a7bcd10 fffff800`0128f154 : 00000000`00000000 ffffe000`06214010 ffffe000`05a1a480 00000000`00000000 : tcpip!UdpSendMessages+0x1da
ffffd000`3a7bd140 fffff802`ef8c13f9 : fffff800`00c45b74 fffff800`012edef5 fffff800`00c1dfd8 ffffe000`0169de40 : tcpip!UdpTlProviderSendMessagesCalloutRoutine+0x15
ffffd000`3a7bd170 fffff800`0128e7ac : fffff800`0128f140 ffffd000`3a7bd2f0 ffffe000`04d18d00 00000000`00000000 : nt!KeExpandKernelStackAndCalloutInternal+0xe9
ffffd000`3a7bd2c0 fffff800`00c47629 : ffffe000`0576a280 ffffd000`3a7bdb80 ffffe000`0576a280 ffffe000`0576a280 : tcpip!UdpTlProviderSendMessages+0x6c
ffffd000`3a7bd340 fffff800`00c2f0ca : ffffe000`007d5950 ffffd000`3a7bd908 ffffe000`01467d00 00000000`00000000 : afd!AfdFastDatagramSend+0x579
ffffd000`3a7bd500 fffff802`efbb3f97 : ffffe000`01467d00 00000000`00000000 ffffe000`01467d00 ffffd000`20607180 : afd!AfdFastIoDeviceControl+0x10bb
ffffd000`3a7bd870 fffff802`efbb4d7a : e0000131`daf054d3 0000000c`001f0003 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0x3f7
ffffd000`3a7bda20 fffff802`ef9624b3 : 00000000`00000000 ffffd000`3a7bdad8 00000000`00000001 000000a4`ef92f600 : nt!NtDeviceIoControlFile+0x56
ffffd000`3a7bda90 00007ffb`c0dc65ea : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ ffffd000`3a7bdb00)
000000a4`ef92eff8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffb`c0dc65ea
From this call stack, we can see various networking related calls and routines.

1. For example, very early on in the stack we have two afd.sys routine calls, which is the Ancillary Function Driver. Ancillary Function Driver (AFD) is the pilot auxiliary function, a Microsoft Windows service, responsible to the BSD socket function. Windows (see Winsock), the most famous application that uses sockets is Apache.

In the Windows GUI, setting AFD is via the Device Manager (devmgmt.msc) and not in services and applications (services.msc). It is displayed in the list of Non-Plug and Play drivers, as the Environment network support (AFD AFD Networking Support Environment).


2. As we move up the call stack, we go into many tcpip.sys (TCP/IP Protocol driver) routines. This is self explanatory.

3. We then have one single ndis.sys (Network Driver Interface Specification driver) routine call. The Network Driver Interface Specification (NDIS) is an application programming interface (API) for network interface cards (NICs). The NDIS forms the Logical Link Control (LLC) sublayer, which is the upper sublayer of the OSI data link layer (layer 2). Therefore, the NDIS acts as the interface between the Media Access Control (MAC) sublayer, which is the lower sublayer of the data link layer, and the network layer (layer 3). 

The NDIS is a library of functions often referred to as a "wrapper" that hides the underlying complexity of the NIC hardware and serves as a standard interface for level 3 network protocol drivers and hardware level MAC drivers. Another common LLC is the Open Data-Link Interface (ODI).

4. We then of course have our three HamDrv.sys (LogMeIn Hamachi Virtual Miniport) driver calls.

5. Finally, we then have our inevitable bug check itself.

Moving on, something very interesting in David's kernel-dump is this: 

ADDITIONAL_DEBUG_TEXT:  USB\VID_04F2&PID_B374
What does this mean? Well, this is a USB device. VID = Vendor ID, and PID =  Product ID. As we can see here in this case, the VID = 04F2 and the PID = B374. As of right now, this doesn't really tell us anything other than a bunch of numbers which isn't very helpful. However, these numbers are actually very important!

If we run them against a VID/PID database, we see that the 04F2 VID = Chicony Electronics Co., Ltd. Great, so we have a vendor! Now, what about the product? Unfortunately, there doesn't seem to be anything in a database for PID B374. With this said, I went to Google and hoped for the best. With very little searches available, the best guess would be to assume that this is a Chicony webcam and/or camera based device. I will have to ask David. Moving on...

1: kd> dt nt!_IO_STATUS_BLOCK ffffe00007e7e2a0
   +0x000 Status           : 0n18350086
   +0x000 Pointer          : 0x00000001`01180006 Void
   +0x008 Information      : 0xffffe000`00b69b10
 Now that we have this, let's check the STATUS field layout:

1: kd> ? 0n18350086
Evaluate expression: 18350086 = 00000000`01180006
 Let's go forward more:

1: kd> .formats 0n18350086
Evaluate expression:
  Hex:     00000000`01180006
  Decimal: 18350086
  Octal:   0000000000000106000006
  Binary:  00000000 00000000 00000000 00000000 00000001 00011000 00000000 00000110
  Chars:   ........
  Time:    Sat Aug 01 05:14:46 1970
  Float:   low 2.7918e-038 high 0
  Double:  9.06615e-317
 Assuming I am correct, '00000000`01180006' falls under:

NT_SUCCESS(Status) Evaluates to TRUE if the return value specified by Status is a success type (0 − 0x3FFFFFFF) or an informational type (0x40000000 − 0x7FFFFFFF). This is different than our first kernel as the first kernel returned NT_ERROR as opposed to NT_SUCCESS).
Moving forward, let's take a closer look at the IRP:

1: kd> !irp ffffe00007e7e2a0 1
Irp is active with 1 stacks 3 is current (= 0xffffe00000d39070)
 Mdl=ffffe00000b69b10: No System Buffer: Thread fffff802ef8af540:  Irp is completed.  Pending has been returned
Flags = 00060900
ThreadListEntry.Flink = ffffe000016fb590
ThreadListEntry.Blink = ffffe000076866d8
IoStatus.Status = 00000000
IoStatus.Information = 00000054
RequestorMode = 00000001
Cancel = 00
CancelIrql = 0
ApcEnvironment = 00
UserIosb = 01471f40
UserEvent = ffffe00001136790
Overlay.AsynchronousParameters.UserApcRoutine = 00000000
Overlay.AsynchronousParameters.UserApcContext = 01471f40
Overlay.AllocationSize = 00000000 - 00000000
CancelRoutine = fffff800045c188c   Hamdrv
UserBuffer = 00000000
&Tail.Overlay.DeviceQueueEntry = ffffe00007e7e318
Tail.Overlay.Thread = fffff802ef8af540
Tail.Overlay.AuxiliaryBuffer = fffff802efdd78dc
Tail.Overlay.ListEntry.Flink = 00000000
Tail.Overlay.ListEntry.Blink = 00000000
Tail.Overlay.CurrentStackLocation = ffffe00000d39070
Tail.Overlay.OriginalFileObject = 00000000
Tail.Apc = 00580012
Tail.CompletionKey = 00580012
     cmd  flg cl Device   File     Completion-Context
 [  3, 0]   0  0 ffffe000029bf900 00000000 00000000-00000000   
           \Driver\hamachi
            Args: 00000000 00000000 00000000 00000000
Great, we're consistent here. The IRP is completed and the pending has returned. We also appear to have the same # (3) which appears to be the Major Function Code for IRP_MJ_READ and the same for the Minor Function Code, which is IRP_MN_NORMAL. Let's go ahead and take 'ffffe000029bf900' and look into it:

1: kd> !devstack ffffe000029bf900
  !DevObj   !DrvObj            !DevExt   ObjectName
> ffffe000029bf900  \Driver\hamachi    ffffe000029bfa50  HamachiTap.FBA77CC0-00D0-459A-AEA1-63052984BC34
Very nice, more Hamachi. Let's now go ahead and take 'ffffe00000d39070' (CurrentStackLocation) and look into it:

1: kd> dt nt!_IO_STACK_LOCATION 0xffffe00000d39070
   +0x000 MajorFunction    : 0x5 ''
   +0x001 MinorFunction    : 0 ''
   +0x002 Flags            : 0xd8 ''
   +0x003 Control          : 0 ''
   +0x008 Parameters       : <unnamed-tag>
   +0x028 DeviceObject     : (null)
   +0x030 FileObject       : (null)
   +0x038 CompletionRoutine : (null)
   +0x040 Context          : (null)
Yes, more consistency! Remember, the data structure shows the current stack location of the IRP. Since IRPs can be handled by different Driver Objects through Driver Stacks, and therefore may be a indication of the different driver handling a different IRP. Moving on...

1: kd> !drvobj ffffe00005c66230 f
Driver object (ffffe00005c66230) is for:
 \Driver\hamachi
Driver Extension List: (id , addr)
(4e4d4944 ffffe000010b5810) 
Device Object list:
ffffe000029bf900  ffffe000075cf050
Let's take  'ffffe000075cf050' (part of the Device Object list) and look into it:

1: kd> !devobj ffffe000075cf050
Device object (ffffe000075cf050) is for:
 NDMP18 \Driver\hamachi DriverObject ffffe00005c66230
Current Irp 00000000 RefCount 0 Type 00000017 Flags 00002050
Dacl ffffc10100d9bf51 DevExt ffffe000075cf1a0 DevObjExt ffffe000075d0720
ExtensionFlags (0x00000800)  DOE_DEFAULT_SD_PRESENT
Characteristics (0x00000100)  FILE_DEVICE_SECURE_OPEN
AttachedTo (Lower) ffffe00007098e50 \Driver\PnpManager
Device queue is not busy.
More consistency! Let's keep going...

1: kd> !devstack ffffe000075cf050
  !DevObj   !DrvObj            !DevExt   ObjectName
> ffffe000075cf050  \Driver\hamachi    ffffe000075cf1a0  NDMP18
  ffffe00007098e50  \Driver\PnpManager 00000000  000000a4
!DevNode ffffe00000c15d30 :
  DeviceInst is "ROOT\NET\0000"
  ServiceName is "Hamachi"
More and more consistency! Also, if you've been noticing, I have been highlighting NDMP(n) as we've been going through this. The reason for this is NDMP(n) = Network Data Management Protocol, which is a protocol invented by the NetApp and Legato companies, meant to transport data between network attached storage (NAS) devices and backup devices. This removes the need for transporting the data through the backup server itself, thus enhancing speed and removing load from the backup server.

This is likely why we see so many network related routines in the call stack which was inevitably responsible for the IRP.

Let's now go ahead and run a !thread on the thread in which the IRP completed:


1: kd> !thread fffff802ef8af540
fffff802ef8af540 is not a thread object, interpreting as stack value...
TYPE mismatch for thread object at fffff802ef8af540
Oh no! We've reached a very unfortunate dead end in David's kernel-dump. After debugging for over an hour, I was not about to give up. I went ahead and opened up another kernel-dump (provided by Maximilian Kless), and quickly got to where we are. The thread extension worked this time!

2: kd> !thread ffffe00004edd880
THREAD ffffe00004edd880  Cid 08a0.0968  Teb: 00007ff63f615000 Win32Thread: fffff901400f86d0 RUNNING on processor 3
IRP List:
    ffffe00005a625b0: (0006,0118) Flags: 00060000  Mdl: 00000000
    ffffe00005a2c1e0: (0006,0118) Flags: 00060000  Mdl: 00000000
    ffffe00002904cc0: (0006,01f0) Flags: 00060900  Mdl: 00000000
    ffffe000025cc270: (0006,01f0) Flags: 00060800  Mdl: 00000000
Not impersonating
DeviceMap                 ffffc0000000c250
Owning Process            ffffe00004bd0900       Image:         hamachi-2.exe
Attached Process          N/A            Image:         N/A
Wait Start TickCount      227813         Ticks: 0
Context Switch Count      8289           IdealProcessor: 0            
UserTime                  00:00:00.281
KernelTime                00:00:00.406
Win32 Start Address 0x00007ffe21c281b0
Stack Init ffffd00035a46c90 Current ffffd00035a45fe0
Base ffffd00035a47000 Limit ffffd00035a41000 Call 0
Priority 8 BasePriority 8 UnusualBoost 0 ForegroundBoost 0 IoPriority 2 PagePriority 5
Child-SP          RetAddr           : Args to Child                                                           : Call Site
00000000`01f8f7e0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ff6`40361752
At this point, I am honestly jumping up and down in my chair. We see a huge consistency here, and we have one more step to go. Let's finally go ahead and run an !irp on the 3rd value in the IRP list:

2: kd> !irp ffffe00002904cc0
Irp is active with 2 stacks 2 is current (= 0xffffe00002904dd8)
 No Mdl: No System Buffer: Thread ffffe00004edd880:  Irp stack trace. 
     cmd  flg cl Device   File     Completion-Context
 [  0, 0]   0  0 00000000 00000000 00000000-00000000   

            Args: 00000000 00000000 00000000 00000000
>[  3, 0]   0  1 ffffe00001f37060 ffffe00005de5f20 00000000-00000000    pending
           \FileSystem\Npfs
            Args: 00002000 00000000 00000000 00000000
And that's it, we have consistency! Npfs.sys marked as status PENDING in TWO kernel dumps on TWO ENTIRELY different systems. Also, even though David's kernel-dump didn't show us a !thread, it showed alarming consistency in regards to many other things seen in the dump itself.

Now, with all of this found, I will bring it up to Harry and my fellow analysts, and see if we can explain what role Npfs.sys has in these crashes. More updates to come! I will of course also take a look at the other kernel-dumps I was provided. Busy busy day.

Update #6 - This is not an analysis update, mostly because the analysis is finished. From this point on, it's of course up to LogMeIn to examine their driver code and see why things like Npfs.sys remain PENDING, etc.

Aside from receiving any new kernel-dumps from any kind individuals to quickly check for consistency, Harry (x BlueRobot) and I have analyzed this to the point where we can do no more as we are only volunteer analysts with an interest in helping people. This update is instead directed more towards how much of a huge thank you I want to extend.

First off, I simply couldn't have done a large amount of this without Harry's help. As much as I appreciate all of the recognition I am receiving for this, Harry deserves it just as much as I do. It would have taken me a very long time and it would have been very stressful. Harry provided tons of documentation and debugging related information, and both of us have learned a very large amount from this. So far I have worked on two bugs with him (HP-Envy-700-074 BIOS bug) and now this. So far, both have been extremely successful.

I'd also like to thank ZigZag3143 for not only submitting tickets and remaining in contact with LogMeIn, and providing up to date threads across various communities displaying the sheer number of people having this problem, but for being a good friend as always and lending a hand.

Next, I'd like to of course extend an extremely large thanks to everyone around the world who has sent me their kernel-dumps simply because they wanted to be a part of this and help contribute to solving the issue. I am going to be making a list of the people who contributed their kernel-dumps, by name. As I am still receiving (or waiting to receive) kernel-dumps from various people who've contacted me, it will likely be an ongoing list.

Essentially, what I am trying to convey here is that this was not just me, it was a huge team effort. I had tons of help from Harry, and extremely kind people from all around the world. I truly appreciate situations like this more than anything in life, because they are very important. As an analyst with a huge passion in debugging, this was not only a beautiful learning experience, but I got to work with a lot of extremely great people. It truly goes to show that when people work together, you can essentially solve any problem.

Update #7 - It appears as I stated at the very top that this bug has been SOLVED! Again, a huge thank you to everyone involved, and to LogMeIn for fixing it promptly for its customers.

18 comments:

  1. Hi Patrick,

    Thanks for all the advice. I was using Logmein and Logmein Hamachi and have used them both for years. I was transitioning to TeamViewer which I do agree seems to be superior. I wanted to leave Hamachi since the VPN does connect my two systems at two different places together. Once I removed Hamatchi from my laptop I stopped receiving these BSOD's and random reboots, last night was the first time in about 2 weeks this laptop (Lenova Yoga2 Pro) did not reboot. But I do have almost the same setup on my Desktop (i7 ASROCK Mobo I built) and it did not reboot or have any trouble at all. Do you think Hamatchi has to react with some other software specific to my laptop to experience this or is my desktop somehow immune? Just a thought and trying to make you aware of the whole situation. Do you have any substitutes you recommend for Hamachi?? Thanks again for your help.

    ReplyDelete
    Replies
    1. Hi Gary,

      It's very possible that there is some sort of very specific conflict, but as of right now, I do not know if that's the case. I have only received one kernel-dump from a user that was afflicted with the 0x44 Hamachi BSOD's, and from it I was able to extract some interesting information regarding their Bitdefender firewall having presence in the call stack, etc.

      The thing is though, in many cases in which a user had security software installed, even after removal, the issues persisted. With that said, it's likely not a conflict caused by anything security related, but more-so as we suspect at this time a bug in Hamachi itself. Regardless, at this time, I cannot prove otherwise as I don't have sufficient evidence yet. If I was given kernel-dumps from every user experiencing the 0x44 Hamachi crashes, I would be able to check for consistency, but I don't.

      I'm having a fellow BSOD and Windows Internals Expert have a look at the kernel-dump I was provided. I will add more updates regarding it when I get them.

      Regards,

      Patrick

      Delete
  2. This is... more than I can handle haha. I marvel at your understanding of what appears as extremely complex gibberish to me! I use Hamachi somewhat regularly (or.. did) and noticed it updated itself today. A few hours went by and I got the whole MULTIPLE_IRP_COMPLETE_REQUESTS BSOD dealio, to which I Google'd and found the troubleshooting threads you listed. Uninstalled Hamachi and renamed the driver, haven't had any problems since. Was there some kind of.. well what you call "kernel-dump".. created automatically as this happened, stored somewhere on my computer? I just ask in case you were interested in obtaining more to look at. But most of all, I just wanted to say how impressed I am of the people like you who understand this stuff at such a level haha. Also, thanks for reporting the issue to LogMeIn; hopefully it is fixed sooner than later. Have a good one!

    ReplyDelete
  3. Hi David,

    First off, thank you for your kind words! They are very much appreciated. Regarding your kernel-dump question, I am very interested if a kernel-dump was generated. You'll want to navigate to C:\Windows and check the Windows directory for a file named MEMORY.DMP. If it's there, you'll want to upload that file anywhere I can download it such as Skydrive, Mediafire, Google Drive, etc... whatever is convenient for you. After that, you can email it to me at patjbarker@gmail.com

    If MEMORY.DMP is not listed in the Windows directory, your system is likely not set to generate kernel-dumps, but minidumps instead (unfortunately don't contain as much info). If that's the case, it's fully up to you whether or not you'd like to set it to kernel-dump: http://msdn.microsoft.com/en-us/library/windows/hardware/ff542953%28v=vs.85%29.aspx

    If you do, you'd of course have to reinstall LogMeIn Hamachi, crash, and then proceed to uninstalling it again. You'd be doing this for research purposes, and again, it's optional. You don't have to if you don't want to.

    Regards,

    Patrick

    ReplyDelete
  4. Hello Patrick,
    Had I know you needed more evidence I would have contacted you sooner.
    I have also had this problem with Hamachi and my windows 8.1 I'm sure I still have all the minidumps from all the times it crashed. I've been following you for the past several days now waiting and hoping for you to find a solution. I'm rooting for you to help solve this problem, I'd be willing to install hamachi again if I wasn't busy at college and had time to afford a crash. Maybe this upcoming weekend if you still need more evidence I can try and install it again and send you the kernel dump.
    Best of Luck
    Bill

    ReplyDelete
    Replies
    1. Hi Bill,

      First off, it's nice to know you've been following the situation as it has progressed, so thanks! In the past ~24 hrs I've received multiple kernel-dumps from people, and it has been a huge help. If you'd like to send yours just to further prove the consistency, by all means go ahead and do so if that's something you don't mind doing. It's definitely appreciated!

      Regards,

      Patrick

      Delete
  5. Patrick, when do you think we should start sending e-mails to LogMeIn, if you haven't done so already?

    Harry (x BlueRobot)

    ReplyDelete
  6. I have a few dumps. I keep Hamachi installed just kill the process most of the time. Let me know how to send over to you.

    ReplyDelete
    Replies
    1. Hi Jason,

      First off, thank you. Are they kernel-dumps? If so, great. If not, please set your system to generate kernel-dumps - http://msdn.microsoft.com/en-us/library/windows/hardware/ff542953%28v=vs.85%29.aspx

      Afterwards, you'll want to navigate to C:\Windows and check the Windows directory for a file named MEMORY.DMP. If it's there, you'll want to upload that file anywhere I can download it such as Skydrive, Mediafire, Google Drive, etc... whatever is convenient for you. After that, you can email it to me at patjbarker@gmail.com

      Regards,

      Patrick

      Delete
  7. There is a new update from LogMeIn (2.2.0.130) which states "This release fixes a bug that caused system instability on Window 8 and Windows 2012."

    Just thought I would add that info here as well, for the benefit of those not on the sysnative forum.


    ~Darshan

    ReplyDelete
  8. This is not clear if new version solves the problem. Description does not say much (( We will try it to see if it crashes.

    ReplyDelete
    Replies
    1. Hi Aleksei,

      You are correct, it is not a very detailed description, however since this update I have not seen any other threads about the 0x44 BSOD and/or received emails, etc. I believe it is fixed. I don't use the software, or I'd comment myself. I may even install it temporarily to test.

      Regards,

      Patrick

      Delete
    2. I have not had a crash in two days related to it since patch. I will update/upload dumps if it occurs but it is looking ok right now. I was crashing 3-5x a day before.

      Delete
    3. I have installed the patch today and have not had a BSOD since

      Delete
  9. I have had my machine crash still after the update. Really hope they address this.

    ReplyDelete
  10. Logmein Hamachi is good. All logmein products are good, no doubt about that. However, I need guaranteed attendance for my business. Hence, I use RHUB Remote support servers. It works well.

    ReplyDelete
  11. I need help deleting LogMeIn Hamachi. It won't uninstall, delete or anything. I don't have an account for them so I can't send them a support request or even ask them a question. HELP!

    ReplyDelete
    Replies
    1. Hi Sarah,

      Please try uninstalling it via Safe Mode. It should work there as it won't be loaded, etc.

      Regards,

      Patrick

      Delete