[LIBHOOK] makes the xen virtual machine hang #1740

Open
carttam opened this issue Nov 8, 2023 · 8 comments

@carttam

carttam commented Nov 8, 2023

Hi, I ran Drakvuf with the Procmon and Apimon plugins on a Windows 7 SP1 virtual machine, using a malware sample that I found on MalwareBazaar. A long while after the default browser (IE) opened, the Xen virtual machine hung and froze, and even the xl destroy command did not work, so I had to kill the QEMU process to force it to stop.
xl list result:

Name                                        ID   Mem VCPUs     State    Time(s)
Domain-0                                     0  8191     4     r-----    4045.7
(null)                                      21    20     2     --p--d     218.6

Here are the stderr logs from around the time the malware executed, for both runs:
trace1
trace2

1699362400.234058 [LIBHOOK] creating return hook
1699362400.234193 Breakpoint VA 0x7fef4fb5075 -> PA 0x6d454075
1699362400.234300 Copied trapped page to new location
1699362400.234318 Activating remapped gfns in the altp2m views!
1699362400.234395 		Trap added @ PA 0x6d454075 RPA 0xff373075 Page 447572 for GetSystemMetrics.
1699362400.234416 [LIBHOOK] return hook OK
1699362400.234431 Switching altp2m and to singlestep on vcpu 1
1699362400.234593 		Trap added @ PA 0x27059d3 RPA 0xff00e9d3 Page 9989 for NtProtectVirtualMemory ret.
1699362400.234649 Switching altp2m and to singlestep on vcpu 2
1699362400.234826 [LIBHOOK] destroying return hook...
1699362400.234839 Removing breakpoint trap from 0x6d454075.
1699362400.234919 Removed memtrap for GFN 0x6d454 in altp2m view 1
1699362400.234932 Removed memtrap for GFN 0xff373 in altp2m view 1
1699362400.235305 [USERHOOK] DLL 928!7feff000000 is already hooked
1699362400.235328 Removing breakpoint trap from 0x27059d3.
1699362400.238128 [LIBHOOK] creating return hook
1699362400.238208 Breakpoint VA 0x7fefd7333d0 -> PA 0x2911c3d0
1699362400.238363 Copied trapped page to new location
1699362400.238395 Activating remapped gfns in the altp2m views!
1699362400.238511 		Trap added @ PA 0x2911c3d0 RPA 0xff1f73d0 Page 168220 for LdrGetProcedureAddress.
1699362400.238549 [LIBHOOK] return hook OK
1699362400.238569 Switching altp2m and to singlestep on vcpu 2
1699362400.238701 Pre mem cb with vCPU 2 @ 0x2becb0c4 in view 1: r--
1699362400.238735 Switching to altp2m view 0 on vCPU 2 and waiting for post_mem cb
1699362400.238876 Post mem cb @ 0x2becb0c4 vCPU 2 altp2m 0
1699362400.239177 [LIBHOOK] destroying return hook...
1699362400.239261 Removing breakpoint trap from 0x2911c3d0.
1699362400.239352 Removed memtrap for GFN 0x2911c in altp2m view 1
1699362400.239376 Removed memtrap for GFN 0xff1f7 in altp2m view 1
1699362400.241604 		Trap added @ PA 0x27059d3 RPA 0xff00e9d3 Page 9989 for NtProtectVirtualMemory ret.
1699362400.241659 Switching altp2m and to singlestep on vcpu 2
1699362400.241762 Switching altp2m and to singlestep on vcpu 1
1699362400.242005 [USERHOOK] DLL 928!7feff000000 is already hooked
1699362400.242029 Removing breakpoint trap from 0x27059d3.
1699362400.242275 [LIBHOOK] creating return hook
1699362400.242380 Breakpoint VA 0x7fef6f8b8ca -> PA 0x130c18ca
1699362400.242444 Copied trapped page to new location
1699362400.242458 Activating remapped gfns in the altp2m views!
1699362400.242534 		Trap added @ PA 0x130c18ca RPA 0xff29c8ca Page 78017 for RegOpenKeyExA.
1699362400.242564 [LIBHOOK] return hook OK
1699362400.242577 Switching altp2m and to singlestep on vcpu 2
1699362400.242706 [LIBHOOK] creating return hook
1699362400.242807 Breakpoint VA 0x7feff61d6c3 -> PA 0x214416c3
1699362400.242878 		Trap added @ PA 0x214416c3 RPA 0xff2016c3 Page 136257 for RegOpenKeyExA.
1699362400.242907 [LIBHOOK] return hook OK
1699362400.242916 [LIBHOOK] destroying return hook...
1699362400.242928 Removing breakpoint trap from 0x130c18ca.
1699362400.242985 Removed memtrap for GFN 0x130c1 in altp2m view 1
1699362400.243007 Removed memtrap for GFN 0xff29c in altp2m view 1
1699362400.243025 Switching altp2m and to singlestep on vcpu 2
1699362400.243056 [LIBHOOK] creating return hook
1699362400.243103 Breakpoint VA 0x7fef6f8b98e -> PA 0x130c198e
1699362400.243153 Copied trapped page to new location
1699362400.243171 Activating remapped gfns in the altp2m views!
1699362400.243240 		Trap added @ PA 0x130c198e RPA 0xff29c98e Page 78017 for RegOpenKeyExW.
1699362400.243260 [LIBHOOK] return hook OK
1699362400.243271 Switching altp2m and to singlestep on vcpu 1
1699362400.244567 Pre mem cb with vCPU 2 @ 0x2becb110 in view 1: r--
1699362400.244583 Switching to altp2m view 0 on vCPU 2 and waiting for post_mem cb
1699362400.244711 Post mem cb @ 0x2becb110 vCPU 2 altp2m 0
1699362400.245242 [LIBHOOK] destroying return hook...
1699362400.245269 Removing breakpoint trap from 0x214416c3.
1699362400.245401 [LIBHOOK] creating return hook
1699362400.245481 Breakpoint VA 0x7feff042bc5 -> PA 0x175c7bc5
1699362400.245537 Copied trapped page to new location
1699362400.245554 Activating remapped gfns in the altp2m views!
1699362400.245626 		Trap added @ PA 0x175c7bc5 RPA 0xff308bc5 Page 95687 for RegQueryValueExA.
1699362400.245655 [LIBHOOK] return hook OK
1699362400.245668 Switching altp2m and to singlestep on vcpu 2
1699362400.245783 [LIBHOOK] creating return hook
1699362400.245846 Breakpoint VA 0x7feff62407d -> PA 0x2853807d
1699362400.245929 		Trap added @ PA 0x2853807d RPA 0xff20207d Page 165176 for RegQueryValueExA.
1699362400.245952 [LIBHOOK] return hook OK
1699362400.245960 [LIBHOOK] destroying return hook...
1699362400.245971 Removing breakpoint trap from 0x175c7bc5.
1699362400.246024 Removed memtrap for GFN 0x175c7 in altp2m view 1
1699362400.246039 Removed memtrap for GFN 0xff308 in altp2m view 1
1699362400.246063 Switching altp2m and to singlestep on vcpu 2
1699367808.333386 [LIBHOOK] creating return hook
1699367808.333455 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.333520 Copied trapped page to new location
1699367808.333547 Activating remapped gfns in the altp2m views!
1699367808.333615 		Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.333654 [LIBHOOK] return hook OK
1699367808.333675 Switching altp2m and to singlestep on vcpu 0
1699367808.333684 Pre mem cb with vCPU 1 @ 0x42812404 in view 1: r--
1699367808.333699 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.333822 [LIBHOOK] destroying return hook...
1699367808.333852 Removing breakpoint trap from 0x749bb7fe.
1699367808.333927 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.333958 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.334010 Post mem cb @ 0x42812404 vCPU 1 altp2m 0
1699367808.334727 [LIBHOOK] creating return hook
1699367808.334834 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.334925 Copied trapped page to new location
1699367808.334953 Activating remapped gfns in the altp2m views!
1699367808.335023 		Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.335055 [LIBHOOK] return hook OK
1699367808.335079 Switching altp2m and to singlestep on vcpu 0
1699367808.335088 Pre mem cb with vCPU 1 @ 0x42812408 in view 1: r--
1699367808.335105 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.335219 [LIBHOOK] destroying return hook...
1699367808.335241 Removing breakpoint trap from 0x749bb7fe.
1699367808.335318 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.335341 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.335384 Post mem cb @ 0x42812408 vCPU 1 altp2m 0
1699367808.336017 [LIBHOOK] creating return hook
1699367808.336130 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.336221 Copied trapped page to new location
1699367808.336254 Activating remapped gfns in the altp2m views!
1699367808.336338 		Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.336378 [LIBHOOK] return hook OK
1699367808.336424 Switching altp2m and to singlestep on vcpu 0
1699367808.336610 [LIBHOOK] destroying return hook...
1699367808.336647 Removing breakpoint trap from 0x749bb7fe.
1699367808.336720 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.336753 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.337614 [LIBHOOK] creating return hook
1699367808.337686 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.337756 Copied trapped page to new location
1699367808.337784 Activating remapped gfns in the altp2m views!
1699367808.337854 		Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.337895 [LIBHOOK] return hook OK
1699367808.337908 Switching altp2m and to singlestep on vcpu 0
1699367808.337917 Pre mem cb with vCPU 1 @ 0x4281240c in view 1: r--
1699367808.337934 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.338123 [LIBHOOK] destroying return hook...
1699367808.338150 Removing breakpoint trap from 0x749bb7fe.
1699367808.338228 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.338255 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.338295 Post mem cb @ 0x4281240c vCPU 1 altp2m 0
1699367808.339208 [LIBHOOK] creating return hook
1699367808.339279 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.339346 Copied trapped page to new location
1699367808.339373 Activating remapped gfns in the altp2m views!
1699367808.339452 		Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.339483 [LIBHOOK] return hook OK
1699367808.339495 Switching altp2m and to singlestep on vcpu 0
1699367808.339621 [LIBHOOK] destroying return hook...
1699367808.339653 Removing breakpoint trap from 0x749bb7fe.
1699367808.339716 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.339747 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.340169 [LIBHOOK] creating return hook
1699367808.340238 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.340288 Copied trapped page to new location
1699367808.340296 Activating remapped gfns in the altp2m views!
1699367808.340366 		Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.340396 [LIBHOOK] return hook OK
1699367808.340407 Switching altp2m and to singlestep on vcpu 0
1699367808.340430 Pre mem cb with vCPU 1 @ 0x42812410 in view 1: r--
1699367808.340446 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.340558 [LIBHOOK] destroying return hook...
1699367808.340589 Removing breakpoint trap from 0x749bb7fe.
1699367808.340657 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.340688 Removed memtrap for GFN 0xff0c2 in altp2m view 1
1699367808.340722 Post mem cb @ 0x42812410 vCPU 1 altp2m 0
1699367808.341184 [LIBHOOK] creating return hook
1699367808.341252 Breakpoint VA 0x741977fe -> PA 0x749bb7fe
1699367808.341313 Copied trapped page to new location
1699367808.341340 Activating remapped gfns in the altp2m views!
1699367808.341425 		Trap added @ PA 0x749bb7fe RPA 0xff0c27fe Page 477627 for FindNextFileW.
1699367808.341455 [LIBHOOK] return hook OK
1699367808.341466 Switching altp2m and to singlestep on vcpu 0
1699367808.341475 Pre mem cb with vCPU 1 @ 0x42812414 in view 1: r--
1699367808.341491 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.341620 Post mem cb @ 0x42812414 vCPU 1 altp2m 0
1699367808.341895 Pre mem cb with vCPU 1 @ 0x42812418 in view 1: r--
1699367808.341930 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342036 Post mem cb @ 0x42812418 vCPU 1 altp2m 0
1699367808.342155 Pre mem cb with vCPU 1 @ 0x4281241c in view 1: r--
1699367808.342190 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342303 Post mem cb @ 0x4281241c vCPU 1 altp2m 0
1699367808.342422 Pre mem cb with vCPU 1 @ 0x42812420 in view 1: r--
1699367808.342456 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342558 Post mem cb @ 0x42812420 vCPU 1 altp2m 0
1699367808.342690 Pre mem cb with vCPU 1 @ 0x42812424 in view 1: r--
1699367808.342725 Switching to altp2m view 0 on vCPU 1 and waiting for post_mem cb
1699367808.342778 [LIBHOOK] destroying return hook...
1699367808.342808 Removing breakpoint trap from 0x749bb7fe.
1699367808.342880 Removed memtrap for GFN 0x749bb in altp2m view 1
1699367808.342907 Removed memtrap for GFN 0xff0c2 in altp2m view 1
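
One way to look for a stalled memory-access event in logs like the ones above is to pair each "Pre mem cb" line with the "Post mem cb" line for the same vCPU and address. The following is a minimal sketch in C, not part of DRAKVUF: the struct and function names (memcb_state, memcb_feed_line, memcb_open_count, memcb_scan) are hypothetical, and the line formats are assumed to be exactly those shown in the excerpt. It also counts "Removed memtrap" lines that appear while a Pre is still open, since that interleaving is visible in the lines timestamped 1699367808.

```c
#include <stdio.h>
#include <string.h>

#define MAX_VCPUS 16

/* Hypothetical helper state for log analysis; not a DRAKVUF structure. */
struct memcb_state {
    /* Address of the open "Pre mem cb" per vCPU; 0 means none is open
     * (simplification: a real address of 0 would be missed). */
    unsigned long long pending[MAX_VCPUS];
    /* "Removed memtrap" lines seen while at least one Pre was open. */
    unsigned int removals_while_pending;
};

/* Return the number of vCPUs that still have an open "Pre mem cb". */
unsigned int memcb_open_count(const struct memcb_state *st)
{
    unsigned int n = 0;
    for (unsigned int i = 0; i < MAX_VCPUS; i++)
        if (st->pending[i])
            n++;
    return n;
}

/* Feed one stderr line; lines of other kinds are ignored. */
void memcb_feed_line(struct memcb_state *st, const char *line)
{
    unsigned long long addr;
    unsigned int vcpu;
    const char *p;

    if ((p = strstr(line, "Pre mem cb with vCPU ")) != NULL) {
        if (sscanf(p, "Pre mem cb with vCPU %u @ 0x%llx", &vcpu, &addr) == 2 &&
            vcpu < MAX_VCPUS)
            st->pending[vcpu] = addr;
    } else if ((p = strstr(line, "Post mem cb @ ")) != NULL) {
        /* Close the entry only if vCPU and address both match. */
        if (sscanf(p, "Post mem cb @ 0x%llx vCPU %u", &addr, &vcpu) == 2 &&
            vcpu < MAX_VCPUS && st->pending[vcpu] == addr)
            st->pending[vcpu] = 0;
    } else if (strstr(line, "Removed memtrap for GFN ") != NULL) {
        if (memcb_open_count(st) > 0)
            st->removals_while_pending++;
    }
}

/* Scan a whole log stream; returns the number of vCPUs left waiting. */
unsigned int memcb_scan(FILE *fp, struct memcb_state *st)
{
    char line[1024];

    memset(st, 0, sizeof(*st));
    while (fgets(line, sizeof(line), fp) != NULL)
        memcb_feed_line(st, line);
    return memcb_open_count(st);
}
```

To use it, call memcb_scan(stdin, &st) from a small main and print every non-zero st.pending[i]. In the excerpt above, the last "Pre mem cb" (vCPU 1 @ 0x42812424) is followed by trap removals but no "Post mem cb", so it would be reported as still open. That may simply be where the excerpt ends, and the earlier interleaved cases did complete, so this only narrows down where to look.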
@ubersandro

ubersandro commented Nov 10, 2023

Hello, I am writing here because I was about to open a new issue, but we may have the same problem. I am experiencing domain freezes while running Codemon to monitor the whole userspace on Windows 10 20H1, and my output looks very similar to the one above. It happens only sometimes and, at the moment, I cannot reproduce the error reliably. I suspect a problem in event handling: my guess is that some event is not handled correctly because removing and adding events is not atomic, and the domain stays suspended during singlestepping. I have no idea how to verify that this is the case, though.
Thanks in advance for the help,
Alessandro

@tklengyel
Owner

Debugging that type of error is really difficult. What may help is to verify whether this is a new issue or whether you had the same problem with older versions. If it only happens with a newer version, then some recent change might have broken the logic, which should be easier to fix. If it happens with older versions as well, then the logic was already broken and it's much harder to figure out why.

@ubersandro

ubersandro commented Nov 14, 2023

OK, I would like to try to debug it, but I am not yet very proficient with Xen. As I said, my suspicion is that event management is somehow broken. Maybe by going through the vm_event interface and dumping events I could figure out what makes my domU hang, checking which event is not handled by the libvmi + drakvuf + codemon stack. Alternatively, I could write a more concise stress test for memaccess events to try to understand what is wrong. Do you have any advice for me, Tamas?

@carttam
Author

carttam commented Nov 14, 2023

Debugging that type of error is really difficult. What may help is to verify whether this is a new issue or whether you had the same problem with older versions. If it only happens with a newer version, then some recent change might have broken the logic, which should be easier to fix. If it happens with older versions as well, then the logic was already broken and it's much harder to figure out why.

I tested version 1.0 and the problem is present there as well. With PRINT_DEBUG enabled, I noticed that the output of the callback event and the event struct looked similar to those of the previous calls. Despite many tests, I could not find any condition under which this error occurs.
I also noticed that if a previous run froze after a ReturnHook in the chrome.exe process and I then restrict monitoring to that process only (running Drakvuf with -C --context-process chrome.exe), Xen does not freeze.
I tried to reproduce a state like this by deliberately breaking the code, for example by changing the event output value or by changing event->interrupt_event.reinject or drakvuf->in_callback. All of these made Drakvuf itself crash, and Xen did not freeze.
Has such a problem happened before? Or do you know what could cause it?
Thank you for your great project. I hope this can be solved.

@carttam
Author

carttam commented Nov 15, 2023

At last, I was able to make Xen freeze at the very beginning of the execution by commenting out the following parts of the code.

drakvuf/src/libdrakvuf/vmi.c

Lines 1144 to 1145 in 67477d0

remove_trap(drakvuf, &container->breakpoint.guard);
remove_trap(drakvuf, &container->breakpoint.guard2);

drakvuf/src/libdrakvuf/vmi.c

Lines 1515 to 1525 in 67477d0

if ( !inject_trap_mem(drakvuf, &container->breakpoint.guard, 0) )
{
    PRINT_DEBUG("[IDX] Failed to create guard trap for the breakpoint!\n");
    goto err_exit;
}
if ( !inject_trap_mem(drakvuf, &container->breakpoint.guard2, 1) )
{
    PRINT_DEBUG("[IDX] Failed to create guard2 trap for the breakpoint!\n");
    goto err_exit;
}

@Amnpardaz-Hypervisor

Hello,
After many tests, I realized that the problem arises from the vmi_slat_change_gfn calls that change the GFN mapping to ~(addr_t)0. I still don't know why this happens.
Anyway, using the vmi_set_mem_event function to change the access level to VMI_MEMACCESS_N instead solved the problem.

drakvuf/src/libdrakvuf/vmi.c

Lines 1184 to 1198 in 1859dc9

if ( !traps_on_gfn )
{
    if ( VMI_FAILURE == vmi_slat_change_gfn(vmi, drakvuf->altp2m_idrx, container->breakpoint.guard3.memaccess.gfn, ~(addr_t)0))
    {
        fprintf(stderr, "Critical error in removing int3, guard3 wasn't removed\n");
        drakvuf->interrupted = -1;
        break;
    }
    if ( VMI_FAILURE == vmi_slat_change_gfn(vmi, drakvuf->altp2m_idrx, container->breakpoint.guard4.memaccess.gfn, ~(addr_t)0))
    {
        fprintf(stderr, "Critical error in removing int3, guard4 wasn't removed\n");
        drakvuf->interrupted = -1;
        break;
    }
}

drakvuf/src/libdrakvuf/vmi.c

Lines 1216 to 1229 in 1859dc9

if ( VMI_SUCCESS == vmi_slat_change_gfn(vmi, drakvuf->altp2m_idx, container->memaccess.gfn, ~(addr_t)0))
{
    PRINT_DEBUG("Removed memtrap for GFN 0x%lx in altp2m view %u\n",
        container->memaccess.gfn, drakvuf->altp2m_idx);
    struct remapped_gfn* remapped_gfn = (struct remapped_gfn*)g_hash_table_lookup(drakvuf->remapped_gfns,
            GSIZE_TO_POINTER(container->memaccess.gfn));
    if ( remapped_gfn )
        remapped_gfn->active = 0;
    g_hash_table_remove(drakvuf->memaccess_lookup_trap, trap);
    g_hash_table_remove(drakvuf->memaccess_lookup_gfn, GSIZE_TO_POINTER(container->memaccess.gfn));
}

For example: vmi_set_mem_event(vmi, container->memaccess.gfn, VMI_MEMACCESS_N, drakvuf->altp2m_idx)
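
For reference, the value passed as the new GFN in the snippets quoted above is ~(addr_t)0, which has every bit set and is not the same as 0. The sketch below illustrates that, together with the address arithmetic that relates the PA, GFN and Page values in the stderr log. It is only an illustration under stated assumptions: addr_t is declared here as a 64-bit unsigned stand-in for LibVMI's type, 4 KiB pages are assumed, and pa_to_gfn and unmap_sentinel are hypothetical helper names.

```c
#include <stdint.h>

/* Assumption: stand-in for LibVMI's addr_t, taken to be 64-bit unsigned. */
typedef uint64_t addr_t;

/* Assumption: 4 KiB guest pages, so the low 12 bits are the page offset. */
#define PAGE_SHIFT_4K 12

/* Guest frame number that contains a guest physical address. */
addr_t pa_to_gfn(addr_t pa)
{
    return pa >> PAGE_SHIFT_4K;
}

/* The value the quoted calls pass as the new GFN: all bits set, not 0. */
addr_t unmap_sentinel(void)
{
    return ~(addr_t)0;
}
```

With these definitions, pa_to_gfn(0x749bb7fe) gives 0x749bb, which is 477627 in decimal and matches both "Page 477627" and "Removed memtrap for GFN 0x749bb" in the log, while unmap_sentinel() equals UINT64_MAX.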

@tklengyel
Owner

Yea, don't do that. That disables the core functionality of DRAKVUF and it makes the breakpoints detectable by the guest.

@yuno-x

yuno-x commented Mar 16, 2024

I always encounter the same problem when I use DRAKVUF's apimon plugin.
The qemu-xen log shows the following memory-related error, and qemu-xen hangs.
This happens with every recent version.

$ cat /var/log/xen/qemu-dm-*.log
VNC server running on :::5900
Locked DMA mapping while invalidating mapcache! 0000000000000eff -> 0x7f42f34f72e0 is present
qemu-system-i386: terminating on signal 1 from pid 24521 (xl)
