When we perform a syscall, we ask the kernel to do something on our behalf. On Windows NT 6.0 and later, these system calls are implemented in ntoskrnl.exe and win32k.sys and dispatched through the service tables KiServiceTable and W32pServiceTable respectively.

We can use syscalls to bridge the user-kernel gap not covered by public APIs, though many seemingly opaque routines are little more than renamed syscalls, often with some error checking on top. The majority of syscalls are trampolined via ntdll. Let's disassemble an arbitrary x86 syscall stub.

Syscall Disasm

The routine first places the syscall index into EAX. It then does something peculiar. Because a 32-bit process is not native on 64-bit Windows, the call is routed through a translation layer, Wow64SystemServiceCall. Instead of containing the expected SYSENTER instruction, this layer jumps into 64-bit mode and issues a SYSCALL from there.
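To make the stub layout concrete, here is a minimal sketch that pulls the three interesting fields out of such a stub: the syscall index, the call target and the number of bytes popped on return. The struct and function names are made up for illustration, and the code assumes a little-endian host and the exact mov/mov/call/retn layout described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fields recovered from a 15-byte WoW64 syscall stub (hypothetical helper type).
struct StubInfo {
    std::uint32_t syscall_id;   // immediate of `mov eax, imm32`
    std::uint32_t call_target;  // immediate of `mov edx, imm32`
    std::uint16_t stack_bytes;  // immediate of `retn imm16`
};

// Returns true and fills `out` if the bytes match
//   B8 imm32 / BA imm32 / FF D2 / C2 imm16
// and false for anything else (too short, hooked, different layout).
bool decode_wow64_stub(const std::uint8_t* p, std::size_t len, StubInfo& out)
{
    if (len < 15)
        return false;
    if (p[0] != 0xB8 || p[5] != 0xBA || p[10] != 0xFF || p[11] != 0xD2 || p[12] != 0xC2)
        return false;

    // memcpy avoids unaligned, type-punned loads; assumes a little-endian host
    std::memcpy(&out.syscall_id,  p + 1,  sizeof(out.syscall_id));
    std::memcpy(&out.call_target, p + 6,  sizeof(out.call_target));
    std::memcpy(&out.stack_bytes, p + 13, sizeof(out.stack_bytes));
    return true;
}
```

A failed decode is a useful signal in itself: if the first bytes of an export no longer look like this, something has probably patched it.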

Wow64 Translation

We want to retrieve the address of this function, place the syscall id in EAX, dynamically fix the stack and return. By not calling the ntdll export, we sidestep any hooks, filters and callbacks installed on it and ship the call straight to the kernel. We can do this by allocating executable memory, writing the stub into it and calling it on the go.
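"Fixing the stack" here just means telling `retn` how many argument bytes to pop, since a stdcall callee cleans up after itself. A rough sketch of that computation follows. The helper names are hypothetical, and it assumes the usual x86 rule that every argument occupies a whole number of 4-byte stack slots.

```cpp
#include <cstddef>
#include <cstdint>

// Round one argument up to a whole number of 4-byte stack slots.
constexpr std::size_t stack_slot_bytes(std::size_t size)
{
    return (size + 3) & ~static_cast<std::size_t>(3);
}

// Total bytes the stub has to pop with `retn imm16`.
// The trailing `+ 0` keeps the fold expression valid for an empty pack.
template <typename... Args>
constexpr std::uint16_t stdcall_arg_bytes()
{
    return static_cast<std::uint16_t>((stack_slot_bytes(sizeof(Args)) + ... + 0));
}

static_assert(stdcall_arg_bytes<>() == 0, "no arguments, nothing to pop");
static_assert(stdcall_arg_bytes<std::uint32_t, std::uint32_t>() == 8, "two 4-byte slots");
static_assert(stdcall_arg_bytes<std::uint8_t, std::uint64_t>() == 12, "1 byte rounds up to 4, plus 8");
```

For syscalls whose arguments are all 32-bit values or pointers, this is the same as a plain sum of `sizeof`. The rounding only matters once a narrower type sneaks into the pack.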

u64 nt_wow64_transition(u8* nt_base)
{
  // 1. vv vv vv vv
  // BA A0 A0 31 4B       mov     edx, 4B31A0A0h
  // FF D2                call    edx

  // 2.    vv vv vv vv
  // FF 25 24 22 3B 4B    jmp      ds:_Wow64Transition
  return *uti::ida_sig<u64*>("BA ?? ?? ?? ?? FF D2 C2 34 00").ref(1).ref(2);
}

template<u32 id, typename... arg_t>
u32 nt_make_call(arg_t... args)
{
  u8 call_stub[15] = {
    0xb8, 0x00, 0x00, 0x00, 0x00,  // mov eax, ?? ?? ?? ??
    0xba, 0x00, 0x00, 0x00, 0x00,  // mov edx, ?? ?? ?? ??
    0xff, 0xd2,                    // call edx
    0xc2, 0x00, 0x00               // retn ?? ??
  };

  // RWX scratch memory for the stub; VirtualAlloc returns void*, so cast for byte access
  auto mem_alloc = (u8*)VirtualAlloc(0, sizeof(call_stub), MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
  if (!mem_alloc)
    return STATUS_UNSUCCESSFUL;

  *(u32*)(&call_stub[0x1]) = id;
  *(u32*)(&call_stub[0x6]) = (u32)nt_wow64_transition((u8*)LoadLibraryA("ntdll.dll")); // get wow64 transition
  *(u16*)(&call_stub[0xd]) = (u16)(sizeof(arg_t) + ... + 0); // bytes popped by retn; "+ 0" handles an empty pack

  for (size_t i = 0; i < sizeof(call_stub); i++) mem_alloc[i] = call_stub[i]; // write stub to RWX alloc

  // call the stub inside the allocation as a stdcall function
  auto ret_status = ((u32(__stdcall*)(arg_t...))mem_alloc)(args...);

  // MEM_RELEASE requires a size of 0 and must not be combined with MEM_DECOMMIT
  if (!VirtualFree(mem_alloc, 0, MEM_RELEASE))
    return STATUS_UNSUCCESSFUL;

  return ret_status;
}
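The stub assembly inside nt_make_call can also be split into a small pure function, which makes the byte layout easy to check without executing anything. The following is only a sketch: `build_wow64_stub` is a hypothetical name, the values passed to it are up to the caller, and a little-endian host is assumed.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Assemble the 15-byte stub:
//   mov eax, syscall_id / mov edx, transition / call edx / retn arg_bytes
std::array<std::uint8_t, 15> build_wow64_stub(std::uint32_t syscall_id,
                                              std::uint32_t transition,
                                              std::uint16_t arg_bytes)
{
    std::array<std::uint8_t, 15> stub = {{
        0xB8, 0x00, 0x00, 0x00, 0x00,  // mov eax, imm32
        0xBA, 0x00, 0x00, 0x00, 0x00,  // mov edx, imm32
        0xFF, 0xD2,                    // call edx
        0xC2, 0x00, 0x00               // retn imm16
    }};

    // memcpy instead of pointer casts: no unaligned or type-punned stores
    std::memcpy(&stub[1],  &syscall_id, sizeof(syscall_id));
    std::memcpy(&stub[6],  &transition, sizeof(transition));
    std::memcpy(&stub[13], &arg_bytes,  sizeof(arg_bytes));
    return stub;
}
```

Keeping the assembly step free of side effects means the only parts that need Windows are the allocation, the copy and the call itself.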

With this method, we are able to issue syscalls on the go without going through the ntdll trampolines. I leave further analysis of Wow64Transition as an exercise to the reader.