Updating PE file imports on process start

When we need to change the PE file imports, we can either modify the binary file in the file system or apply the updates after it has been loaded into memory. In this post, I will focus on the latter approach, showing you the moments in the process lifetime when such changes are possible. We will end up with a small app capable of updating imports in newly started remote processes.

What we will be modifying

Let’s begin with some basics of the PE file structure. Typically, the data about PE file imports resides in the .idata section. To understand how this data is laid out, we need to read the image import directory (IMAGE_DIRECTORY_ENTRY_IMPORT) in the NT Optional Header. In this directory, we will find an array of IMAGE_IMPORT_DESCRIPTOR structures:

typedef struct _IMAGE_IMPORT_DESCRIPTOR { 
    union { 
        DWORD   Characteristics;            // 0 for terminating null import descriptor 
        DWORD   OriginalFirstThunk;         // RVA to original unbound IAT (PIMAGE_THUNK_DATA) 
    } DUMMYUNIONNAME; 
    DWORD   TimeDateStamp;                  // 0 if not bound, 
                                            // -1 if bound, and real date\time stamp 
                                            //     in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND) 
                                            // O.W. date/time stamp of DLL bound to (Old BIND) 
    DWORD   ForwarderChain;                 // -1 if no forwarders 
    DWORD   Name; 
    DWORD   FirstThunk;                     // RVA to IAT (if bound this IAT has actual addresses) 
} IMAGE_IMPORT_DESCRIPTOR;

The Name field points to the name of the imported DLL, and the OriginalFirstThunk and FirstThunk fields point to arrays of IMAGE_THUNK_DATA, which hold information about the functions imported from a given library. All those field values are relative virtual addresses (RVAs), that is, offsets from the image base address after the image has been loaded into memory. A thunk can represent either an import by ordinal (IMAGE_ORDINAL_FLAG is set) or an import by name (the thunk holds an RVA of an IMAGE_IMPORT_BY_NAME structure). It is important to note that each thunk array must end with a zeroed thunk. You may be wondering why there are two thunk arrays per DLL. Initially, they hold the same values, but once the imports are resolved, the loader overwrites the values in the FirstThunk array with the actual addresses of the resolved functions. The thunks for all the imports are usually the first bytes of the .idata section, and they are also referenced by the import address table (IAT) directory (I highly recommend downloading PE 102 by @corkami, a beautiful and very readable diagram of the PE file format).
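To make the ordinal-versus-name distinction concrete, here is a minimal sketch of how a raw thunk value could be decoded. The ThunkDecoder class and its Decode method are names invented for this example (they are not part of importando); the flag constants mirror IMAGE_ORDINAL_FLAG32 and IMAGE_ORDINAL_FLAG64 from winnt.h.

```csharp
static class ThunkDecoder
{
    // The highest bit of a thunk marks an import by ordinal
    // (IMAGE_ORDINAL_FLAG32 / IMAGE_ORDINAL_FLAG64 in winnt.h).
    const ulong OrdinalFlag32 = 0x80000000UL;
    const ulong OrdinalFlag64 = 0x8000000000000000UL;

    // Returns (true, ordinal) for an import by ordinal, or (false, rva) where
    // rva is the RVA of an IMAGE_IMPORT_BY_NAME structure. A zeroed thunk,
    // which terminates the array, comes back as (false, 0).
    public static (bool IsOrdinal, uint Value) Decode(ulong thunk, bool is64bit)
    {
        ulong flag = is64bit ? OrdinalFlag64 : OrdinalFlag32;

        return (thunk & flag) != 0
            ? (true, (uint)(thunk & 0xFFFF))         // ordinals occupy the low 16 bits
            : (false, (uint)(thunk & 0x7FFFFFFF));   // the RVA occupies the low 31 bits
    }
}
```

For a 32-bit image, a thunk value of 0x80000005 would decode to an import by ordinal 5, while 0x00002F40 would decode to an RVA of the hint/name entry.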

Depending on what we want to achieve, we may modify either only the resolved thunk arrays or the whole descriptors array. For example, redirecting one function to another when both functions belong to already loaded DLLs can be achieved by simply overwriting the corresponding resolved address in a thunk array. However, injecting a new DLL into the process or adding a new function import to an existing DLL requires changes to the descriptors array. That is the case I will mainly focus on in this post.

The application we are going to develop, named importando, accepts a list of tasks (-i) to perform on the remote process image:

  • -i test.dll!TestMethod or test.dll#1 to inject a function by name or by ordinal 
  • -i test.dll!TestMethod:test2.dll!TestMethod to replace a given imported function with a different one 
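The two task formats above could be parsed along the following lines. This is only a sketch of the syntax as described in the list, not the parser from the importando repository; the type and method names (ImportTarget, NamedImport, OrdinalImport, ImportTaskParser) are hypothetical.

```csharp
#nullable enable
using System;

abstract record ImportTarget(string DllName);
record NamedImport(string DllName, string FunctionName) : ImportTarget(DllName);
record OrdinalImport(string DllName, uint Ordinal) : ImportTarget(DllName);

static class ImportTaskParser
{
    // Parses "test.dll!TestMethod" (by name) or "test.dll#1" (by ordinal).
    public static ImportTarget ParseTarget(string text)
    {
        int bang = text.IndexOf('!');
        if (bang > 0 && bang < text.Length - 1)
        {
            return new NamedImport(text[..bang], text[(bang + 1)..]);
        }

        int hash = text.IndexOf('#');
        if (hash > 0 && uint.TryParse(text[(hash + 1)..], out uint ordinal))
        {
            return new OrdinalImport(text[..hash], ordinal);
        }

        throw new FormatException($"Invalid import definition: '{text}'");
    }

    // Parses the value of a single -i option. ForwardTo is null for a plain
    // injection and set for the "from:to" replacement form.
    // Assumption: DLL names are given without drive-qualified paths,
    // so a colon can only be the separator.
    public static (ImportTarget Import, ImportTarget? ForwardTo) ParseTask(string arg)
    {
        string[] parts = arg.Split(':');
        return parts.Length switch
        {
            1 => (ParseTarget(parts[0]), null),
            2 => (ParseTarget(parts[0]), ParseTarget(parts[1])),
            _ => throw new FormatException($"Invalid task: '{arg}'")
        };
    }
}
```

With this sketch, "test.dll#1" yields an OrdinalImport with a null ForwardTo, and "test.dll!TestMethod:test2.dll!TestMethod" yields a pair of NamedImport values.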

Let’s now examine our options for implementing those tasks.

Updating a suspended process

A common approach is to start the process as suspended (CREATE_SUSPENDED flag) and modify the imports table before the first thread resumes execution. Unfortunately, when the CreateProcess function returns, the loader has already resolved the imported function addresses (ntdll!LdrStateInit equals 2). Therefore, this approach will not work if we need to fix an incorrect import definition (for example, a wrong DLL or function name). However, as the loader has not yet reached completion (LdrStateInit is not 3), we may still perform some actions on the import directory. For example, we can inject a new DLL into the process (as DetoursCreateProcessWithDlls does). We may also overwrite the addresses of the resolved functions. When the main thread resumes (ResumeThread), the loader will finish its work and the code will execute with our changes applied.
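The create-suspended, patch, resume sequence can be sketched with plain P/Invoke declarations as shown below. This is an illustration only: importando itself uses CsWin32-generated bindings and the debugger API instead, the SuspendedProcess class and the patchImports callback are hypothetical names, and the actual import patching is left to the caller.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

static class SuspendedProcess
{
    const uint CREATE_SUSPENDED = 0x00000004;

    [StructLayout(LayoutKind.Sequential)]
    struct STARTUPINFO
    {
        public int cb;
        public IntPtr lpReserved, lpDesktop, lpTitle;
        public int dwX, dwY, dwXSize, dwYSize, dwXCountChars, dwYCountChars, dwFillAttribute, dwFlags;
        public short wShowWindow, cbReserved2;
        public IntPtr lpReserved2, hStdInput, hStdOutput, hStdError;
    }

    [StructLayout(LayoutKind.Sequential)]
    struct PROCESS_INFORMATION
    {
        public IntPtr hProcess, hThread;
        public uint dwProcessId, dwThreadId;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    static extern bool CreateProcessW(string lpApplicationName, StringBuilder lpCommandLine,
        IntPtr lpProcessAttributes, IntPtr lpThreadAttributes, bool bInheritHandles,
        uint dwCreationFlags, IntPtr lpEnvironment, string lpCurrentDirectory,
        ref STARTUPINFO lpStartupInfo, out PROCESS_INFORMATION lpProcessInformation);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern uint ResumeThread(IntPtr hThread);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool CloseHandle(IntPtr hObject);

    // Starts the process suspended, lets the caller patch its memory through
    // the process handle, and then resumes the main thread.
    // Note: if patchImports throws, the child process is left suspended.
    public static void Run(string commandLine, Action<IntPtr> patchImports)
    {
        var startupInfo = new STARTUPINFO { cb = Marshal.SizeOf<STARTUPINFO>() };

        // CreateProcessW may modify the command line buffer, hence the StringBuilder
        if (!CreateProcessW(null, new StringBuilder(commandLine), IntPtr.Zero, IntPtr.Zero,
                false, CREATE_SUSPENDED, IntPtr.Zero, null, ref startupInfo, out var processInfo))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error(), "CreateProcess error");
        }

        try
        {
            patchImports(processInfo.hProcess);

            // ResumeThread returns (DWORD)-1 on failure
            if (ResumeThread(processInfo.hThread) == uint.MaxValue)
            {
                throw new Win32Exception(Marshal.GetLastWin32Error(), "ResumeThread error");
            }
        }
        finally
        {
            CloseHandle(processInfo.hThread);
            CloseHandle(processInfo.hProcess);
        }
    }
}
```

A call such as SuspendedProcess.Run("notepad.exe", handle => { /* patch here */ }) would give the callback a process handle it can pass to functions like ReadProcessMemory and WriteProcessMemory before any application code runs.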

Updating a process from a debugger

If we need earlier access to the executable’s import directory data, we can resort to the debugger API. Running a process under a debugger gives us a few more chances to apply import modifications. The first interesting debug event is CREATE_PROCESS_DEBUG_EVENT. When the debugger receives it, the loader has not yet started resolving the dependencies, but the executable image is already loaded into memory. That is a perfect moment for fixing problems that cause critical loader errors, for example, the infamous “entry not found” error.

The next interesting event is EXCEPTION_DEBUG_EVENT with ExceptionRecord.ExceptionCode equal to STATUS_BREAKPOINT (if we are debugging a 32-bit process with a 64-bit debugger, we should skip the first STATUS_BREAKPOINT and wait for STATUS_WX86_BREAKPOINT instead). It is the initial process breakpoint, triggered by the loader when it is in a state very similar to the one in an initially suspended process, described in the previous section (so LdrStateInit equals 2). Finally, the debugger also receives LOAD_DLL_DEBUG_EVENT for each loaded DLL before the loader starts resolving its dependencies. Thus, in the handler of this event, we can fix issues in the import directories of the dependent libraries.

I also recorded a YouTube video where I present how you may make those fixes manually in WinDbg. It could be helpful to better visualize the steps we will perform in the importando code.

Implementing importando (in C#)

As you remember from the first section, our goal is to support both import redirects and new import injections. If you are wondering why importando, I thought it sounds nice and the name describes what we will be doing: import and override (it happens in the reverse order, but it is just a nitpick 😊). As we want to support all types of modifications to the import directory, the logical choice is to use the debugging API. Thanks to CsWin32, writing a native debugger in C# is not a very demanding task. Here is the debugger loop with the few events importando uses:

HANDLE processHandle = HANDLE.Null;
nuint imageBase = 0;
bool is64bit = false;
bool isWow64 = false;

ModuleImport[] originalImports = [];
ModuleImport[] newImports = [];

while (!cts.Token.IsCancellationRequested)
{
    if (WaitForDebugEvent(1000) is { } debugEvent)
    {
        switch (debugEvent.dwDebugEventCode)
        {
            case DEBUG_EVENT_CODE.CREATE_PROCESS_DEBUG_EVENT:
                {
                    logger.WriteLine($"CreateProcess: {debugEvent.dwProcessId}");

                    Debug.Assert(pid == debugEvent.dwProcessId);
                    var createProcessInfo = debugEvent.u.CreateProcessInfo;

                    // we are closing hFile handle after we finish reading the image data
                    using var pereader = new PEReader(new FileStream(
                        new SafeFileHandle(createProcessInfo.hFile, true), FileAccess.Read));

                    processHandle = createProcessInfo.hProcess;
                    is64bit = pereader.Is64Bit();
                    isWow64 = Environment.Is64BitProcess && !is64bit;
                    unsafe { imageBase = (nuint)createProcessInfo.lpBaseOfImage; }

                    (originalImports, newImports) = UpdateProcessImports(processHandle,
                        pereader, imageBase, importUpdates, forwards);
                }
                break;

            case DEBUG_EVENT_CODE.EXCEPTION_DEBUG_EVENT:
                if (debugEvent.u.Exception.ExceptionRecord.ExceptionCode == (
                    isWow64 ? NTSTATUS.STATUS_WX86_BREAKPOINT : NTSTATUS.STATUS_BREAKPOINT))
                {
                    // first breakpoint exception is the process breakpoint - it happens when loader finished its initial
                    // work and thunks are resolved
                    Debug.Assert(imageBase != 0 && !processHandle.IsNull);
                    UpdateForwardedImports(processHandle, is64bit, imageBase, originalImports, newImports, forwards);
                    cts.Cancel();
                }
                else
                {
                    logger.WriteLine($"Unexpected exception: {debugEvent.u.Exception.ExceptionRecord.ExceptionCode.Value:x}");
                }
                break;

            case DEBUG_EVENT_CODE.EXIT_PROCESS_DEBUG_EVENT:
                cts.Cancel();
                break;
            default:
                break;
        }

        if (!PInvoke.ContinueDebugEvent(debugEvent.dwProcessId,
            debugEvent.dwThreadId, NTSTATUS.DBG_EXCEPTION_NOT_HANDLED))
        {
            throw new Win32Exception(Marshal.GetLastPInvokeError(), $"{nameof(PInvoke.ContinueDebugEvent)} error");
        }
    }
}

I will mention that again later, but the full source code is available in the importando GitHub repository. In the post, I will rather focus on the crucial pieces of the solution, so please refer to the code in the repository in case you would like to check the skipped parts.

I also created a few wrapping record classes for the parsed import data. Using native structures would have been an option, but I wanted to make them more C#-friendly and also architecture-agnostic.

interface IFunctionImport { }

record FunctionImportByName(uint Rva, ushort Hint, string FunctionName) : IFunctionImport;

record FunctionImportByOrdinal(uint Ordinal) : IFunctionImport;

record NullImport : IFunctionImport;

record FunctionThunk(IFunctionImport Import);

record ModuleImport(string DllName, uint DllNameRva, uint OriginalFirstThunkRva,
    uint FirstThunkRva, FunctionThunk[] FirstThunks);
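To give an idea of where the data for such records could come from, the sketch below walks the import descriptors of an image with PEReader from System.Reflection.PortableExecutable. It is not the PEImports code from the repository: ImportLister and ReadImportDescriptors are names made up for this example, and the sketch collects only the DLL names and thunk array RVAs, leaving the parsing of the thunks themselves aside.

```csharp
using System.Collections.Generic;
using System.Reflection.PortableExecutable;
using System.Text;

static class ImportLister
{
    // IMAGE_IMPORT_DESCRIPTOR consists of five DWORD fields
    const int DescriptorSize = 20;

    public static List<(string DllName, uint OriginalFirstThunkRva, uint FirstThunkRva)>
        ReadImportDescriptors(PEReader pe)
    {
        var result = new List<(string, uint, uint)>();

        var header = pe.PEHeaders.PEHeader;
        if (header is null || header.ImportTableDirectory.RelativeVirtualAddress == 0)
        {
            return result; // the image has no import directory
        }

        var reader = pe.GetSectionData(header.ImportTableDirectory.RelativeVirtualAddress).GetReader();
        while (reader.RemainingBytes >= DescriptorSize)
        {
            uint originalFirstThunk = reader.ReadUInt32();
            reader.ReadUInt32(); // TimeDateStamp (not needed here)
            reader.ReadUInt32(); // ForwarderChain (not needed here)
            uint nameRva = reader.ReadUInt32();
            uint firstThunk = reader.ReadUInt32();

            if (nameRva == 0 && firstThunk == 0)
            {
                break; // terminating null descriptor
            }
            result.Add((ReadAsciiString(pe, (int)nameRva), originalFirstThunk, firstThunk));
        }
        return result;
    }

    // Reads a zero-terminated ASCII string located at the given RVA
    static string ReadAsciiString(PEReader pe, int rva)
    {
        var reader = pe.GetSectionData(rva).GetReader();
        var text = new StringBuilder();
        while (reader.RemainingBytes > 0)
        {
            byte b = reader.ReadByte();
            if (b == 0)
            {
                break;
            }
            text.Append((char)b);
        }
        return text.ToString();
    }
}
```

The method expects a PEReader opened over the image file, for example new PEReader(File.OpenRead(path)), which is the same way the debugger loop above creates its reader from the hFile handle.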

The handler of CREATE_PROCESS_DEBUG_EVENT, or rather the UpdateProcessImports function, reads the existing imports (PEImports.ReadModuleImports), prepares new import descriptors with thunk arrays for the updated ones (PEImports.PrepareNewModuleImports), and saves them in the remote process memory (PEImports.UpdateImportsDirectory). By the way, the PEReader class is a great helper for parsing PE structures. We also need to update the imports data directory in the NT optional header, as it should point to our new import descriptors (UpdatePEDirectory):

static (ModuleImport[] OriginalImports, ModuleImport[] NewImports) UpdateProcessImports(HANDLE processHandle,
    PEReader imageReader, nuint imageBase, ImportUpdate[] importUpdates, (string ForwardFrom, string ForwardTo)[] forwards)
{
    var existingImports = PEImports.ReadModuleImports(imageReader);

    var newImports = PEImports.PrepareNewModuleImports(existingImports, importUpdates, forwards);

    var is64bit = imageReader.Is64Bit();
    var (importDirRva, importDirSize) = PEImports.UpdateImportsDirectory(processHandle, is64bit, imageBase, newImports);

    nuint dataDirectoriesRva = (nuint)(imageReader.PEHeaders.PEHeaderStartOffset +
        (is64bit ? Marshal.OffsetOf<IMAGE_OPTIONAL_HEADER64>("DataDirectory") : Marshal.OffsetOf<IMAGE_OPTIONAL_HEADER32>("DataDirectory")));

    UpdatePEDirectory(dataDirectoriesRva, IMAGE_DIRECTORY_ENTRY.IMAGE_DIRECTORY_ENTRY_IMPORT, importDirRva, importDirSize);
    UpdatePEDirectory(dataDirectoriesRva, IMAGE_DIRECTORY_ENTRY.IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT, 0, 0);

    return (existingImports, newImports);
}
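The body of UpdatePEDirectory is not shown here, but the address arithmetic it depends on is simple enough to sketch. The DataDirectoryLayout class below is a hypothetical helper, not code from the repository; the constants follow the winnt.h layouts of IMAGE_OPTIONAL_HEADER32, IMAGE_OPTIONAL_HEADER64, and IMAGE_DATA_DIRECTORY.

```csharp
static class DataDirectoryLayout
{
    // Offset of the DataDirectory array inside the optional header
    const int DataDirectoryOffset32 = 96;   // IMAGE_OPTIONAL_HEADER32
    const int DataDirectoryOffset64 = 112;  // IMAGE_OPTIONAL_HEADER64

    // IMAGE_DATA_DIRECTORY: DWORD VirtualAddress followed by DWORD Size
    const int EntrySize = 8;

    public const int ImportIndex = 1;        // IMAGE_DIRECTORY_ENTRY_IMPORT
    public const int BoundImportIndex = 11;  // IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT

    // Returns the offset (from the image start) of the given data directory entry.
    // optionalHeaderOffset is where the optional header begins in the image,
    // for example PEHeaders.PEHeaderStartOffset.
    public static int EntryOffset(int optionalHeaderOffset, bool is64bit, int directoryIndex)
    {
        int arrayOffset = is64bit ? DataDirectoryOffset64 : DataDirectoryOffset32;
        return optionalHeaderOffset + arrayOffset + directoryIndex * EntrySize;
    }
}
```

Writing the new RVA and size as two consecutive DWORDs at imageBase plus this offset is what redirects the loader to the new descriptors. Zeroing the bound import entry in the same way presumably keeps the loader from relying on stale binding data.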

Because most RVAs in the PE file headers are 4 bytes long (DWORD), PEImports.UpdateImportsDirectory needs to allocate memory for the new imports close enough to the image base address for their offsets to fit in those 4 bytes. To achieve that, I ported the FindAndAllocateNearBase function from the Detours library to C#.
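The constraint itself can be expressed as a small check. The sketch below is not the ported FindAndAllocateNearBase function (which also has to search for a free memory region); it only shows, under a hypothetical RvaRange.TryGetRva name, when an allocated block is addressable through a 32-bit RVA.

```csharp
static class RvaRange
{
    // Returns true and the RVA of the block when the whole block
    // [address, address + size) lies at or above the image base and
    // every byte of it can be expressed as a 32-bit offset from that base.
    public static bool TryGetRva(ulong imageBase, ulong address, ulong size, out uint rva)
    {
        rva = 0;
        if (address < imageBase)
        {
            return false; // RVAs are unsigned, so memory below the base is unreachable
        }

        ulong offset = address - imageBase;
        const ulong limit = 0x1_0000_0000UL; // 4 GB
        if (offset >= limit || size > limit - offset)
        {
            return false; // the block starts or ends beyond the 32-bit range
        }

        rva = (uint)offset;
        return true;
    }
}
```

For an image loaded at 0x140000000, a block allocated at 0x140010000 gets the RVA 0x10000, while a block at 0x240000000 is rejected because it is exactly 4 GB above the base.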

To better show you what importando is doing, I drew a simple picture of the memory layout after an example forward of shell32.dll!StrCpyNW to shlwapi.dll!StrCpyNW. Notice the new import descriptors table (separate from the original one) that the import directory points to. Importando also needed to create new thunk arrays for imports requiring an update (in this case, shell32.dll) and for new ones (shlwapi.dll), but it reused the existing thunks for unmodified imports (user32.dll).

Once we have updated the import directories, we are ready to resume the process execution and wait for the loader to perform its initial work.

This leads us to the next stop in the debugger loop, the EXCEPTION_DEBUG_EVENT handler. The name of the event may be a little misleading, as it is triggered not only when code in the remote process throws an exception, but also when it hits a breakpoint. The Windows loader triggers a breakpoint (STATUS_BREAKPOINT) when it detects that a debugger is attached to the starting process. In the WOW64 context (when a 64-bit debugger debugs a 32-bit application), there are actually two initial breakpoints, STATUS_BREAKPOINT and STATUS_WX86_BREAKPOINT, and it is the latter that interests us. At this point, the loader has resolved all the addresses in the thunk arrays from the new imports directory. However, we are not done yet, as the old thunks still hold RVAs (unresolved addresses). We need to update them because those thunks are referenced by the application code. And here comes the last step in our coding journey, the UpdateForwardedImports function:

static void UpdateForwardedImports(HANDLE processHandle, bool is64bit, nuint imageBase,
    ModuleImport[] originalImports, ModuleImport[] newImports, (string ForwardFrom, string ForwardTo)[] forwards)
{
    int thunkSize = is64bit ? Marshal.SizeOf<IMAGE_THUNK_DATA64>() : Marshal.SizeOf<IMAGE_THUNK_DATA32>();

    uint GetThunkRva(ModuleImport[] moduleImports, string importName)
    { /* ... */  }

    void CopyThunkValues(uint fromRva, uint toRva)
    { /* ... */  }

    foreach ((string forwardFrom, string forwardTo) in forwards)
    {
        var originalThunkRva = GetThunkRva(originalImports, forwardFrom);
        var newThunkRva = GetThunkRva(newImports, forwardTo);

        if (originalThunkRva != 0 && newThunkRva != 0)
        {
            // new thunk should be resolved by now, so we may copy its value to the original place
            // that could be referenced by application code
            CopyThunkValues(newThunkRva, originalThunkRva);
        }
        else
        {
            Console.WriteLine($"WARNING: could not find import {forwardFrom} or {forwardTo}");
        }
    }
}

We may now continue debugging or detach from the remote process (that’s what importando does) and let it run freely. Our job is done, and we should see the new imports in the modules list of our target application.

Importando’s source code and binaries are available in its GitHub repository.
