Consider an application written in a higher-level language like Python, NodeJS, or C#. This application must handle sensitive data such as banking credentials, credit card data, health information, or network passwords. The application developers have already hardened the application against malicious users and are confident that it is not vulnerable to database injections, account takeovers, or other remote critical-risk vulnerabilities. Furthermore, the application avoids storing sensitive data on the filesystem or sending it over network connections to other components.
Despite all of this hardening, however, the developers have overlooked a critical attack surface on this hypothetical application. It does not take any precautions to secure the data while storing it in memory. As a result of this lack of memory protections, a local attacker may be able to compromise the sensitive data by dumping the process memory.
Once an attacker has local access, it may seem “too late” for any defense to matter. This is not an unreasonable conclusion; a skilled local attacker with ample time will most likely be able to overcome any defensive control on the compromised machine. However, in practice, threat actors do not always have unlimited time before they are detected and locked out. Alternatively, they may lack the skills or resources necessary to overcome a particular control, even if that control is vulnerable in theory. For these reasons, local defenses still serve as adequate controls and are worth implementing when security is a top priority. In the event of a network breach, the memory protection techniques discussed below may buy the incident response team enough time to quarantine the affected device and lock out the actor before a damaging compromise occurs.
Most higher-level programming languages are memory-managed and feature garbage collection (GC). GC is a memory recovery feature that automatically deallocates objects when they are no longer needed, which relieves the programmer of the burden of memory management.
Unfortunately, this also restricts the programmer from having total control over how and where their objects appear in memory. The GC process may involve moving data or copying it in memory without the application’s knowledge. Hence, a given data string may appear multiple times in memory and at unpredictable locations. Worst of all, these copies do not exist from the application’s perspective, so once they occur, the developer cannot natively design solutions to remove them from memory.
Adding to the complexity of this problem, most higher-level languages employ the concept of immutability. Immutable objects cannot truly be written to, and any operations that appear to modify them generate new copies of the data. This leaves the original copy in memory with no reference to reach it from the application. The `string` data type is immutable in most higher-level languages, including NodeJS, C#, and Python. Unfortunately, this data type often stores sensitive information in the real world.
The examples below address the above problems in a C# .NET application while minimizing the exposure of sensitive data in memory. Although we have chosen a .NET application for this paper, the same general solutions should apply to any higher-level programming language.
We begin with the following application:
To simulate an “attack”, we used Procdump to dump the application’s memory at various stopping points (marked with comments in each code example). The string `TOP SECRET DATA` served as a placeholder for sensitive data, and we inspected each memory dump for the placeholder string using HxD Hex Editor. As Figure 1 shows, the placeholder data is present in numerous memory locations. This includes a partial copy for each time the immutable string was “modified” inside the `for` loop.
Figure 1: An example Procdump output showing the sensitive data in the starting application’s memory.
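The inspection step can also be scripted rather than performed by hand in a hex editor. The following is a minimal C++ sketch, not the tooling used to produce the figures. It counts occurrences of a placeholder in a dump that has been read into memory, checking both the ASCII form and the UTF-16LE form, since .NET stores `string` and `char` data as UTF-16. The helper names (`count_occurrences`, `as_ascii`, `as_utf16le`, `read_file`) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Count every position at which `needle` occurs in `haystack`.
std::size_t count_occurrences(const Bytes& haystack, const Bytes& needle) {
    if (needle.empty() || haystack.size() < needle.size()) return 0;
    std::size_t count = 0;
    auto it = haystack.begin();
    while (true) {
        it = std::search(it, haystack.end(), needle.begin(), needle.end());
        if (it == haystack.end()) break;
        ++count;
        ++it;  // resume one byte past the start of this match
    }
    return count;
}

// The placeholder as raw single-byte characters.
Bytes as_ascii(const std::string& s) { return Bytes(s.begin(), s.end()); }

// The same ASCII text widened to UTF-16LE: each byte followed by 0x00.
Bytes as_utf16le(const std::string& s) {
    Bytes out;
    out.reserve(s.size() * 2);
    for (unsigned char c : s) {
        out.push_back(c);
        out.push_back(0);
    }
    return out;
}

// Read a whole dump file into memory (fine for small dumps; an assumption
// of this sketch, since large dumps would need chunked reading).
Bytes read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return Bytes(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}
```

For example, `count_occurrences(read_file("app.dmp"), as_utf16le("TOP SECRET DATA"))` would report how many wide-character copies a dump contains, where `app.dmp` is an assumed file name.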
In C# (and most other programming languages), arrays are mutable and do not create additional copies when modified. A developer might try to reduce data exposure in memory by storing sensitive data in character arrays instead of strings and clearing out each array after use. This would prevent untrackable copies from stacking up due to immutability. The modified application would read as follows:
Unfortunately, the GC complicates this approach. If the GC compacts the block of memory containing the array, it may create additional “copies” of the array elsewhere in memory while leaving the original bytes in place, as Figure 2 shows.
Figure 2: The data remains in memory even though the application cleared out the array.
However, this is a step in the right direction. The next iteration of the application will retain the ability to clear data from memory in a way that circumvents the GC.
To circumvent the GC, the application must contain a portion of code that is not subject to memory management. The most reliable way to do this is to write an external library in C++ that handles all sensitive application data. C++ allows direct manipulation of memory and requires the programmer to manage the allocation of new objects and deallocation of old ones. For memory security, this will allow the application to erase all sensitive data from memory, ensuring that no additional copies from garbage collection or immutability remain.
To ensure this, the library must have the following properties:
The resulting DLL will be an “area” of the application that can safely hold sensitive data without worrying about garbage collection or immutability. The next iteration of the application exhibits this:
A Note: If the reader is unfamiliar with writing custom DLLs and integrating them into a higher-level project, the following resources may be helpful:
Figures 3 and 4 show how this custom DLL holds “Top Secret Data” in memory while processing it and then allows the data to be cleared.
Figure 3: Secret data stored in memory while being “processed”.
Figure 4: Once cleared, the placeholder string no longer appears in memory.
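The load, process, and clear cycle shown in Figures 3 and 4 might be sketched on the native side as follows. This is a minimal illustration under stated assumptions, not the DLL used for the figures: `SecretBuffer`, `secure_wipe`, `secret_load`, and `secret_free` are hypothetical names. On Windows, the exported functions would additionally need `__declspec(dllexport)`, and the Win32 `SecureZeroMemory` function could replace the portable wipe loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Zero a buffer through a volatile pointer so the compiler cannot discard
// the writes as dead stores ahead of a free.
inline void secure_wipe(void* p, std::size_t n) {
    volatile unsigned char* v = static_cast<volatile unsigned char*>(p);
    while (n--) *v++ = 0;
}

// Owns exactly one native copy of a secret and wipes it before release.
// Copying is disabled so the secret cannot be duplicated by accident.
class SecretBuffer {
public:
    SecretBuffer(const char* data, std::size_t len)
        : buf_(new char[len]), len_(len) {
        std::memcpy(buf_.get(), data, len);
    }
    ~SecretBuffer() { clear(); }
    SecretBuffer(const SecretBuffer&) = delete;
    SecretBuffer& operator=(const SecretBuffer&) = delete;

    // Overwrite the secret with zeros, then release the allocation.
    void clear() {
        if (buf_) {
            secure_wipe(buf_.get(), len_);
            buf_.reset();
            len_ = 0;
        }
    }
    bool empty() const { return !buf_; }
    std::size_t size() const { return len_; }
    const char* data() const { return buf_.get(); }

private:
    std::unique_ptr<char[]> buf_;
    std::size_t len_;
};

// Hypothetical C ABI that a managed caller could bind to.
extern "C" {
SecretBuffer* secret_load(const char* data, std::size_t len) {
    return new SecretBuffer(data, len);
}
void secret_free(SecretBuffer* s) { delete s; }  // the destructor wipes first
}
```

A sketch like this only controls the native copy. Whatever buffer the managed caller used to pass the secret in still lives on the managed heap and would need to be handled separately.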
For most applications, circumventing the GC and clearing sensitive data arrays after use will serve as effective controls against a local attacker. However, if the application must keep the sensitive data in memory for long periods, an attacker can perform a timed memory dump to extract the sensitive data. For such applications, this may constitute an unacceptable risk of exposure. The next iteration of the application will include protections against timed memory dumps.
A Binary String (BSTR), similar to a C-style character pointer, is also available to .NET applications through the interop APIs. Binary Strings are mutable and are held at a fixed location outside the managed heap (effectively “pinned”), which is not subjected to the GC. Although a Binary String could safely handle sensitive data, we chose the DLL method to make this research more applicable to other programming languages that do not have the equivalent of a Binary String.
The Binary Strings approach also lacks a clean distinction in the code base between memory “safe” and “unsafe” code. If the secret data must pass through other functions or methods, ensuring the application never loads the data into an unpinned form is difficult. This is especially true when using external libraries. Writing a DLL provides a logistically separate application area to handle sensitive data safely.
Binary Strings are not the only object that can be pinned in memory in .NET applications. Pinned instances of any objects can be created in .NET, removing them from the GC’s scope. In theory, there is nothing wrong with this approach, and in practice, it may be a better solution when dealing with complicated objects that cannot be easily converted into a set of arrays. However, the same note of caution mentioned for Binary Strings applies here.
We now address the issue where an attacker can perform a memory dump during the timeframe that the application’s secret data exists in memory. The longer an application must hold secret data in memory, the more likely this type of attack is to succeed. Obfuscating the data in memory until the moment the user needs it would improve the previous DLL. A good obfuscation technique should render the obfuscated output indistinguishable from random bytes so that it is difficult to spot in a large memory dump. Obfuscation is not a silver bullet, but it narrows the window in which a dump captures cleartext.
Several obfuscation techniques exist, and the correct choice depends on how the application must process the secret data.
DISCLAIMER: I used the `hash-library` C++ library for hashing in the example below. I chose this library because it was lightweight and easy to use. I did not research the cryptographical correctness of this library. Although I have no reason to believe this library is insecure, do not assume it is safe simply because I included it here. Proper cryptography is difficult to get right, which is why developers should always perform thorough research on cryptography libraries before including them in a production environment.
Hashing is an ideal obfuscation technique for applications that only need to perform comparison operations on the secret data. By using a secure hashing algorithm, the application can hash all secret data after loading it into the DLL and delete the cleartext data from memory. To perform a comparison, the application will hash the incoming data and compare the resulting hashes.
Note that salts are not an option with this approach, as using salts would cause the resulting hashes of identical cleartext inputs to be different. A universal salt for all inputs defeats the purpose of salting hashes.
The next iteration of the application will use a SHA256 hash to obfuscate all sensitive data. Once hashed, the cleartext will be completely removed from memory, as follows:
Note that the above code assumes comparisons with the secret data that usually return `false`. Figures 5-8 walk us through the process of obfuscating under these conditions. For situations where the comparison operation would return `true` (e.g., logins), the `input` parameter should have similar protections.
Figure 5: The secret is loaded into memory but not obfuscated.
Figure 6: The secret is obfuscated. Its cleartext is no longer present.
Figure 7: However, the `sha256` hash of the secret (`6d217f6863def0c80c84eb0447b59d6c9ae0b0955d2b5079cb0012e0bdafe621`) is present.
Figure 8: Once the secret is cleared, the hash is also removed from memory.
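The sequence in Figures 5-8 (load, hash, drop the cleartext, compare by hash, clear) might be sketched as follows. This is an illustration under stated assumptions rather than the listing behind the figures: it inlines a small SHA-256 routine so the block is self-contained instead of using `hash-library`, that routine has not been vetted for production use, and `HashedSecret`, `secure_wipe`, and `matches` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Digest = std::array<std::uint8_t, 32>;

// Zero a buffer through a volatile pointer so the writes are not elided.
inline void secure_wipe(void* p, std::size_t n) {
    volatile unsigned char* v = static_cast<volatile unsigned char*>(p);
    while (n--) *v++ = 0;
}

static inline std::uint32_t rotr(std::uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }

// Illustrative, unvetted SHA-256 (FIPS 180-4); prefer a vetted library in production.
Digest sha256(const std::uint8_t* data, std::size_t len) {
    static const std::uint32_t K[64] = {
        0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
        0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
        0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
        0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
        0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
        0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
        0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
        0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2};
    std::uint32_t h[8] = {0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
                          0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19};
    // Allocate the padded message once so no reallocation strands a cleartext copy.
    std::vector<std::uint8_t> msg(((len + 9 + 63) / 64) * 64, 0);
    std::memcpy(msg.data(), data, len);
    msg[len] = 0x80;
    const std::uint64_t bits = static_cast<std::uint64_t>(len) * 8;
    for (int i = 0; i < 8; ++i)  // big-endian bit length in the last 8 bytes
        msg[msg.size() - 1 - i] = static_cast<std::uint8_t>(bits >> (8 * i));
    std::uint32_t w[64];
    for (std::size_t off = 0; off < msg.size(); off += 64) {
        for (int i = 0; i < 16; ++i) {
            const std::uint8_t* b = &msg[off + 4 * i];
            w[i] = (std::uint32_t(b[0]) << 24) | (std::uint32_t(b[1]) << 16) |
                   (std::uint32_t(b[2]) << 8) | std::uint32_t(b[3]);
        }
        for (int i = 16; i < 64; ++i) {
            std::uint32_t s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3);
            std::uint32_t s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10);
            w[i] = w[i - 16] + s0 + w[i - 7] + s1;
        }
        std::uint32_t v[8];  // working variables a..h
        std::memcpy(v, h, sizeof(v));
        for (int i = 0; i < 64; ++i) {
            std::uint32_t t1 = v[7] + (rotr(v[4], 6) ^ rotr(v[4], 11) ^ rotr(v[4], 25)) +
                               ((v[4] & v[5]) ^ (~v[4] & v[6])) + K[i] + w[i];
            std::uint32_t t2 = (rotr(v[0], 2) ^ rotr(v[0], 13) ^ rotr(v[0], 22)) +
                               ((v[0] & v[1]) ^ (v[0] & v[2]) ^ (v[1] & v[2]));
            for (int j = 7; j > 0; --j) v[j] = v[j - 1];
            v[4] += t1;
            v[0] = t1 + t2;
        }
        for (int i = 0; i < 8; ++i) h[i] += v[i];
    }
    Digest out;
    for (int i = 0; i < 32; ++i)
        out[i] = static_cast<std::uint8_t>(h[i / 4] >> (24 - 8 * (i % 4)));
    secure_wipe(msg.data(), msg.size());  // the padded copy and schedule held cleartext
    secure_wipe(w, sizeof(w));
    return out;
}

// Keeps only the digest of a secret and wipes the caller's cleartext buffer.
class HashedSecret {
public:
    HashedSecret(char* secret, std::size_t len)
        : digest_(sha256(reinterpret_cast<const std::uint8_t*>(secret), len)) {
        secure_wipe(secret, len);
    }
    ~HashedSecret() { secure_wipe(digest_.data(), digest_.size()); }

    // Hash the candidate and compare all 32 bytes without an early exit.
    bool matches(const char* input, std::size_t len) const {
        Digest d = sha256(reinterpret_cast<const std::uint8_t*>(input), len);
        unsigned diff = 0;
        for (std::size_t i = 0; i < d.size(); ++i) diff |= static_cast<unsigned>(d[i] ^ digest_[i]);
        secure_wipe(d.data(), d.size());
        return diff == 0;
    }

private:
    Digest digest_;
};
```

In this sketch, `matches` does not wipe the candidate `input`; where a match is the expected outcome, the caller would want to wipe that buffer as well.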
The main advantage of using hashing for an obfuscation method is its irreversibility. An attacker accessing the application’s memory would need to crack each hash to recover the sensitive data.
The downside to using a hashing obfuscation is usability. Namely, the only operation the application can meaningfully perform with other input data is a comparison. If the application must perform selective reads or writes, then hashing would be inefficient and ineffective. The last iteration of the application will use an obfuscation technique that allows for more complicated operations.
More complicated operations require a bidirectional obfuscation mechanism. In this iteration of our example application, the DLL will encrypt all sensitive data when not actively in use. Before an operation, the DLL will decrypt the information, operate, and re-encrypt the information.
Unlike with salting in the hashing scenario, an initialization vector is an option when developers have chosen encryption as their obfuscation technique. This is because all data will be decrypted before use, so there is no need for identical cleartexts to have identical ciphertexts.
Although a developer could use any secure encryption method, the following example uses the Microsoft Data Protection API (DPAPI):
Figures 9-12 show the obfuscation process that occurs via our example application.
Figure 9: The secret is loaded into memory before the encryption obfuscation occurs.
Figure 10: The secret is encrypted in memory. The cleartext secret does not appear anywhere in the memory dump.
Figure 11: The cleartext string reappears in memory while the application performs an operation.
Figure 12: No residual instances of the secret remain in memory after the application finishes using a secret.
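The decrypt, operate, and re-encrypt cycle in Figures 9-12 can be expressed as a scoped pattern in which cleartext exists only for the duration of a callback. The sketch below is a hedged illustration of that pattern only. It does not call DPAPI: a random XOR pad kept in process memory stands in for the cipher so that the block stays portable and self-contained, and that stand-in should not be mistaken for a vetted encryption scheme. `MaskedSecret`, `with_cleartext`, and `last_chars` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Zero a buffer through a volatile pointer so the writes are not elided.
inline void secure_wipe(void* p, std::size_t n) {
    volatile unsigned char* v = static_cast<volatile unsigned char*>(p);
    while (n--) *v++ = 0;
}

// Holds a secret masked at rest; cleartext exists only inside with_cleartext().
// ASSUMPTION: the XOR pad is a stand-in for a real cipher such as DPAPI.
class MaskedSecret {
public:
    MaskedSecret(char* secret, std::size_t len) : data_(len), pad_(len) {
        std::random_device rd;  // demo-grade randomness, not a vetted key source
        for (std::size_t i = 0; i < len; ++i) {
            pad_[i] = static_cast<std::uint8_t>(rd());
            data_[i] = static_cast<std::uint8_t>(static_cast<std::uint8_t>(secret[i]) ^ pad_[i]);
        }
        secure_wipe(secret, len);  // drop the caller's cleartext copy
    }
    ~MaskedSecret() {
        secure_wipe(data_.data(), data_.size());
        secure_wipe(pad_.data(), pad_.size());
    }
    MaskedSecret(const MaskedSecret&) = delete;
    MaskedSecret& operator=(const MaskedSecret&) = delete;

    // Unmask in place, run the operation, then re-mask even if `op` throws.
    void with_cleartext(const std::function<void(const char*, std::size_t)>& op) {
        Unmasked guard(*this);
        op(reinterpret_cast<const char*>(data_.data()), data_.size());
    }

private:
    void toggle() {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] ^= pad_[i];
    }
    // RAII guard: cleartext on construction, masked again on destruction.
    struct Unmasked {
        explicit Unmasked(MaskedSecret& s) : s_(s) { s_.toggle(); }
        ~Unmasked() { s_.toggle(); }
        MaskedSecret& s_;
    };
    std::vector<std::uint8_t> data_, pad_;
};

// Example operation: copy out only the last n characters of the secret.
// The returned fragment is cleartext and becomes the caller's responsibility.
std::string last_chars(MaskedSecret& s, std::size_t n) {
    std::string out;
    s.with_cleartext([&](const char* p, std::size_t len) {
        const std::size_t take = n < len ? n : len;
        out.assign(p + (len - take), take);
    });
    return out;
}
```

Unmasking in place avoids creating a second cleartext buffer, and the guard keeps the exposure window as short as the callback itself. Because the pad lives in the same process, this stand-in corresponds to the case where the key is also present in memory.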
The advantage of encryption is bi-directionality. This provides greater usability while still ensuring some measure of security. An attacker must find the application’s secret key before recovering sensitive data from a memory dump.
The downside of encryption is that the obfuscation’s security now rests on keeping the secret key away from the attacker. DPAPI on Windows partially mitigates this risk by handing the responsibility of key management to the OS, but securing the secret key can be very difficult in other situations. If the cleartext secret key is also in memory, bypassing the encryption is merely an extra step for the threat actor to perform. However, even this can be effective, as an attacker without advanced knowledge of the application’s internal functioning may not be able to spot the secret key in a large memory dump or distinguish encrypted data from random bytes.
A second downside to encryption is that the secret must be exposed in cleartext during a particular window of time, as it was in Figure 11.
.NET includes a `SecureString` class that takes in a series of characters and stores them as an encrypted, pinned array. At first glance, this sounds ideal because:
However, despite these potential benefits, `SecureString` suffers from usability issues that often render it useless. `SecureString` does not contain native properties or methods to decrypt the stored data. Instead, it must be marshaled into another object type (usually a Binary String). Unfortunately, this brings back the problem of holding cleartext secrets in memory. This is why Microsoft actively cautions against using `SecureString` in new projects.
The above controls will adequately protect the majority of desktop and web applications and increase the difficulty for a local attacker to recover sensitive data from memory. However, kernel-level protections may be worth considering for applications that require maximum security.
On Windows, an application can run as a “Protected Process” or “Protected Process Light” to (among other features) prevent local users and processes from accessing its memory. This is how many anti-malware services prevent tampering attempts from malicious processes. If a requesting process does not have a sufficient protection level, the OS will reject its request to access the target process memory, regardless of the requester’s user permissions. While not bulletproof, this can be a useful defensive layer to ward off attacks from privileged malware or local users.
The application must be signed by a valid Authenticode certificate from Windows or another trusted CA to run as a protected process, as Figure 13 shows. For more information, see the Microsoft Documentation.
Figure 13: An example Protected Process Light (Windows Defender). Note that the memory dump is rejected even with an administrator-level terminal.
Another kernel-level approach involves the Win32 API `OpenProcess` function. Invoking this function returns a handle to a target process, which the caller can use to access that process’s memory. By default, this function fails for callers with insufficient privileges, but developers can also write a kernel driver that rejects even privileged calls to `OpenProcess`. As with Protected Processes, a signing requirement applies: a trusted CA must sign all kernel drivers. Such a driver would similarly block memory dumps attempted by local users or processes.
Securing an application’s memory is a complex problem with no complete solution, as a local attacker with sufficient time and resources will usually be able to overcome most local defensive measures. In practice, however, memory protections can serve as sufficient controls to deter an adversary until they are identified and removed. Memory protections are part of a defense-in-depth strategy and can prevent a network breach from turning into a full asset compromise.
The post Safeguarding Memory in Higher-Level Programming Languages appeared first on Praetorian.
*** This is a Security Bloggers Network syndicated blog from Blog – Praetorian authored by emmaline. Read the original post at: https://www.praetorian.com/blog/safeguarding-memory-in-higher-level-programming-languages/