How Ansible impersonates users on Windows

Recently, I hit an interesting error during a deployment orchestrated by Ansible. One of the deployment steps was to execute a custom .NET application. Unfortunately, the application was failing on each run with an ACCESS DENIED error. After collecting the stack trace, I found that the failing code was ProtectedData.Protect(messageBytes, null, DataProtectionScope.CurrentUser), that is, a call to the Data Protection API (DPAPI). To pinpoint the problem, I created a simple playbook:

- hosts: all
  gather_facts: no
  vars:
    ansible_user: testu
    ansible_connection: winrm
    ansible_winrm_transport: basic
    ansible_winrm_server_cert_validation: ignore
  tasks:
    - win_shell: |
        Add-Type -AssemblyName "System.Security"; \
        [System.Security.Cryptography.ProtectedData]::Protect([System.Text.Encoding]::GetEncoding(
            "UTF-8").GetBytes("test12345"), $null, [System.Security.Cryptography.DataProtectionScope]::CurrentUser)
      args:
        executable: powershell
      register: output

    - debug:
        var: output

When I ran it, I got the following error:

fatal: [192.168.0.30]: FAILED! => {"changed": true, "cmd": "Add-Type -AssemblyName \"System.Security\"; [System.Security.Cryptography.ProtectedData]::Protect([System.Text.Encoding]::GetEncoding(\n    \"UTF-8\").GetBytes(\"test\"), $null, [System.Security.Cryptography.DataProtectionScope]::CurrentUser)", "delta": "0:00:00.807970", "end": "2020-05-04 11:34:29.469908", "msg": "non-zero return code", "rc": 1, "start": "2020-05-04 11:34:28.661938", "stderr": "Exception calling \"Protect\" with \"3\" argument(s): \"Access is denied.\r\n\"\r\nAt line:1 char:107\r\n+ ... .Security\"; [System.Security.Cryptography.ProtectedData]::Protect([Sy ...\r\n+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException\r\n    + FullyQualifiedErrorId : CryptographicException", "stderr_lines": ["Exception calling \"Protect\" with \"3\" argument(s): \"Access is denied.", "\"", "At line:1 char:107", "+ ... .Security\"; [System.Security.Cryptography.ProtectedData]::Protect([Sy ...", "+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~", "    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException", "    + FullyQualifiedErrorId : CryptographicException"], "stdout": "", "stdout_lines": []}

The workaround to make it always work was to use the Ansible become parameters:

...
  tasks:
    - win_shell: |
        Add-Type -AssemblyName "System.Security"; \
        [System.Security.Cryptography.ProtectedData]::Protect([System.Text.Encoding]::GetEncoding(
            "UTF-8").GetBytes("test12345"), $null, [System.Security.Cryptography.DataProtectionScope]::CurrentUser)
      args:
        executable: powershell
      become_method: runas
      become_user: testu
      become: yes
      register: output
...

Interestingly, the original playbook succeeds if the testu user has signed in to the remote system interactively (for example, by opening an RDP session) and encrypted something with DPAPI before running the script.

That only made me more curious about what was happening here. I hope it made you curious too 🙂

What happens when you encrypt data with DPAPI

When you call CryptProtectData (or its managed wrapper ProtectedData.Protect), internally you are connecting to an RPC endpoint protected_storage exposed by the Lsass process. The procedure s_SSCryptProtectData, implemented in the dpapisrv.dll library, encrypts the data using the user’s master key. The master key is encrypted, and to decrypt it, Lsass needs a hash of the user’s password. The decryption process involves multiple steps, and if you are interested in its details, have a look at this post.
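For reference, a minimal Windows-only sketch of a full DPAPI round trip through the managed wrapper. It assumes Windows PowerShell and a session in which Lsass can decrypt the user's master key; in the Network logon case described in this post, the Protect call is the one expected to throw "Access is denied".

```powershell
# Windows-only sketch: DPAPI round trip with the CurrentUser scope.
Add-Type -AssemblyName "System.Security"

$scope = [System.Security.Cryptography.DataProtectionScope]::CurrentUser
$plain = [System.Text.Encoding]::UTF8.GetBytes("test12345")

# Encrypt with the current user's master key (no optional entropy)
$blob = [System.Security.Cryptography.ProtectedData]::Protect($plain, $null, $scope)

# Decrypt the blob again and turn the bytes back into text
$roundTrip = [System.Security.Cryptography.ProtectedData]::Unprotect($blob, $null, $scope)
$text = [System.Text.Encoding]::UTF8.GetString($roundTrip)
$text
```

The encrypted blob differs between calls, so only the decrypted text is comparable with the input.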

Examining the impersonation code

Before we dive into the Ansible impersonation code, I highly recommend checking the Ansible documentation on this subject as it is exceptional and covers all the authentication cases. In this post, I describe only my case, in which I do not specify the become_user password. However, after reading the referenced code, you should have no problem understanding the other scenarios as well.

Four C# files contain the impersonation code, with the most important one being Ansible.Become.cs. Become flags define what type of access token Ansible creates for a given user session. Get-BecomeFlags contains the logic of the flags parser and handles the interaction with the C# code.

A side note: while playing with the exec wrapper, I discovered an interesting environment variable: ANSIBLE_EXEC_DEBUG. You may set its value to a path of a file where you want Ansible to write its logs. They might reveal some details on how Ansible executes your commands.

For my case, the logic of the become_wrapper could be expressed in the following PowerShell commands:

PS> Import-Module -Name $pwd\Ansible.ModuleUtils.AddType.psm1
PS> $cs = [System.IO.File]::ReadAllText("$pwd\Ansible.Become.cs"), [System.IO.File]::ReadAllText("$pwd\Ansible.Process.cs"), [System.IO.File]::ReadAllText("$pwd\Ansible.AccessToken.cs")
PS> Add-CSharpType -References $cs -IncludeDebugInfo -CompileSymbols @("TRACE")

PS> [Ansible.Become.BecomeUtil]::CreateProcessAsUser("testu", [NullString]::Value, "powershell.exe -NonInteractive -NoProfile -ExecutionPolicy Bypass -EncodedCommand QQBkAGQALQBU...MAZQByACkA")

StandardOut
-----------
1...

The stripped base64 string is the encoded version of the commands I had in my Ansible playbook:

PS> [Text.Encoding]::Unicode.GetString([Convert]::FromBase64String("QQBkAGQALQBU...MAZQByACkA"))

Add-Type -AssemblyName "System.Security";[System.Security.Cryptography.ProtectedData]::Protect([System.Text.Encoding]::GetEncoding("UTF-8").GetBytes("test12345"), $null, [System.Security.Cryptography.DataProtectionScope]::CurrentUser)
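The opposite direction may be useful when reproducing such a call by hand. The sketch below shows how a string for -EncodedCommand can be built: PowerShell expects Base64 over the UTF-16LE bytes of the script text. The $command value is a hypothetical placeholder, not the command from the playbook.

```powershell
# Hypothetical short command used only for illustration
$command = 'Write-Output "test12345"'

# -EncodedCommand expects Base64 over the UTF-16LE bytes of the script text
$bytes = [System.Text.Encoding]::Unicode.GetBytes($command)
$encoded = [Convert]::ToBase64String($bytes)

# Decoding with the same encoding returns the original text
$decoded = [System.Text.Encoding]::Unicode.GetString([Convert]::FromBase64String($encoded))
$decoded -eq $command
```

The resulting $encoded value can then be passed as powershell.exe -EncodedCommand $encoded.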

The CreateProcessAsUser method internally calls GetUserTokens to create an elevated and a regular token (or only one if no elevation is available/required). As I specify neither a password nor a logon type, my code will eventually call GetS4UTokenForUser. S4U, or “Service for User”, is a solution that allows services to obtain a logon for a user without providing the user’s credentials. To use S4U, services call the LsaLogonUser method, passing a KERB_S4U_LOGON structure as the AuthenticationInformation parameter. Of course, not all services can impersonate users. Firstly, the service must have the “Act as part of the operating system” privilege (SeTcbPrivilege). Secondly, it must register itself as a logon application (LsaRegisterLogonProcess). So how does Ansible achieve that? It simply tries to “steal” (duplicate ;)) a token from one of the privileged processes by executing GetPrimaryTokenForUser(new SecurityIdentifier("S-1-5-18"), new List<string>() { "SeTcbPrivilege" }). As the code of this method is not very long and is well documented, let me cite it here (GPL 3.0 license):

private static SafeNativeHandle GetPrimaryTokenForUser(SecurityIdentifier sid, List<string> requiredPrivileges = null)
{
    // According to CreateProcessWithTokenW we require a token with
    //  TOKEN_QUERY, TOKEN_DUPLICATE and TOKEN_ASSIGN_PRIMARY
    // Also add in TOKEN_IMPERSONATE so we can get an impersonated token
    TokenAccessLevels dwAccess = TokenAccessLevels.Query |
        TokenAccessLevels.Duplicate |
        TokenAccessLevels.AssignPrimary |
        TokenAccessLevels.Impersonate;
    foreach (SafeNativeHandle hToken in TokenUtil.EnumerateUserTokens(sid, dwAccess))
    {
        // Filter out any Network logon tokens, using become with that is useless when S4U
        // can give us a Batch logon
        NativeHelpers.SECURITY_LOGON_TYPE tokenLogonType = GetTokenLogonType(hToken);
        if (tokenLogonType == NativeHelpers.SECURITY_LOGON_TYPE.Network)
            continue;
        // Check that the required privileges are on the token
        if (requiredPrivileges != null)
        {
            List<string> actualPrivileges = TokenUtil.GetTokenPrivileges(hToken).Select(x => x.Name).ToList();
            int missing = requiredPrivileges.Where(x => !actualPrivileges.Contains(x)).Count();
            if (missing > 0)
                continue;
        }
        // Duplicate the token to convert it to a primary token with the access level required.
        try
        {
            return TokenUtil.DuplicateToken(hToken, TokenAccessLevels.MaximumAllowed, SecurityImpersonationLevel.Anonymous,
                TokenType.Primary);
        }
        catch (Process.Win32Exception)
        {
            continue;
        }
    }
    return null;
}


public static IEnumerable<SafeNativeHandle> EnumerateUserTokens(SecurityIdentifier sid,
    TokenAccessLevels access = TokenAccessLevels.Query)
{
    foreach (System.Diagnostics.Process process in System.Diagnostics.Process.GetProcesses())
    {
        // We always need the Query access level so we can query the TokenUser
        using (process)
        using (SafeNativeHandle hToken = TryOpenAccessToken(process, access | TokenAccessLevels.Query))
        {
            if (hToken == null)
                continue;
            if (!sid.Equals(GetTokenUser(hToken)))
                continue;
            yield return hToken;
        }
    }
}

private static SafeNativeHandle TryOpenAccessToken(System.Diagnostics.Process process, TokenAccessLevels access)
{
    try
    {
        using (SafeNativeHandle hProcess = OpenProcess(process.Id, ProcessAccessFlags.QueryInformation, false))
            return OpenProcessToken(hProcess, access);
    }
    catch (Win32Exception)
    {
        return null;
    }
}

Once Ansible obtains the SYSTEM token, it can register itself as a logon application and finally call LsaLogonUser to obtain the impersonation token (GetS4UTokenForUser). With the right token, it can execute CreateProcessWithTokenW and start the process in the desired user’s context.
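To see what the token search is looking for, the following Windows-only sketch checks that S-1-5-18 is the well-known LocalSystem SID and whether the current token holds SeTcbPrivilege. It is only an inspection aid, not part of the Ansible code; in a regular user session the last command is expected to print nothing.

```powershell
# Windows-only sketch: inspect the SID and the privilege mentioned above.
$sid = New-Object System.Security.Principal.SecurityIdentifier "S-1-5-18"

# True when the SID is the well-known LocalSystem account
$isSystem = $sid.IsWellKnown([System.Security.Principal.WellKnownSidType]::LocalSystemSid)
$isSystem

# Prints a line only if the current token holds SeTcbPrivilege
whoami /priv | Select-String -SimpleMatch "SeTcbPrivilege"
```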

Playing with the access tokens using TokenViewer

Now that we have reached this point, it may be worth playing a bit more with Windows tokens and trying to reproduce the initial Access Denied error. For this purpose, I slightly modified the TokenViewer tool developed by James Forshaw (Google). You may find the code of my version in my blog repository.

Let’s run TokenViewer as the SYSTEM user. That should give us SeTcbPrivilege, which is necessary to create impersonation tokens: psexec -s -i TokenViewer.exe. Next, let’s create an access token for the Network logon type:

On the Groups tab, the NT AUTHORITY\NETWORK group should be listed. Now, let’s try to encrypt the “Hello World!” text with DPAPI on the Operations tab. We should receive an Access Denied error:

Leave the Token window open, move to the main window, and create a token for the Batch logon type. This is the token Ansible creates in the “become mode”. The Groups tab should have the NT AUTHORITY\BATCH group enabled, and DPAPI encryption should work. Don’t close this window, and move back to the previous token window. DPAPI will work there now too.

I am not familiar enough with Lsass to explain in detail what is happening here. However, I assume that the DPAPI problem is caused by the fact that Lsass does not cache user credentials when the user signs in with the NETWORK logon type (probably for performance reasons). Therefore, the lsasrv!LsapGetCredentials method fails when DPAPI calls it to retrieve the hash of the user’s password, which it needs to decrypt the master key. Interestingly, if we open another session for a given user (for example, an interactive one) and call DPAPI to encrypt/decrypt some data, the user’s master key lands in the cache (lsasrv!g_MasterKeyCacheList). DPAPI searches this cache (dpapisrv!SearchMasterKeyCache) before calling LsapGetCredentials. That explains why our second call to DPAPI succeeded in the NETWORK logon session.
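To check which kind of logon session a command actually runs in (for example, from a win_shell task with and without become), a small Windows-only sketch like the one below may help. It relies on the well-known SIDs S-1-5-2 (NETWORK), S-1-5-3 (BATCH) and S-1-5-4 (INTERACTIVE) that Windows adds to a token depending on the logon type; the variable names are only illustrative.

```powershell
# Windows-only sketch: report which logon-type groups the current token carries.
$identity = [System.Security.Principal.WindowsIdentity]::GetCurrent()
$groupSids = @($identity.Groups | ForEach-Object { $_.Value })

# Well-known SIDs that Windows adds to a token depending on the logon type
$logonTypeSids = [ordered]@{
    "S-1-5-2" = "NT AUTHORITY\NETWORK"
    "S-1-5-3" = "NT AUTHORITY\BATCH"
    "S-1-5-4" = "NT AUTHORITY\INTERACTIVE"
}

# Keep only the logon-type groups present in the current token
$found = @($logonTypeSids.Keys | Where-Object { $groupSids -contains $_ } |
    ForEach-Object { $logonTypeSids[$_] })
$found
```

Under the assumption made above, a session reporting only NT AUTHORITY\NETWORK is the one in which the DPAPI call is likely to fail unless the master key is already cached.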

Debug notes

By Sebastian Solnica