Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

1Learning Outcomes

In this section, we focus on implementing generic swap. Recall our goals for generic functions from the introduction:

  1. Generics should work regardless of argument type.

  2. Generics should work by accessing blocks of memory, regardless of the data type stored in those blocks.

We translate these high-level ideas into pseudocode for generic swap:

void swap(void *ptr1, void *ptr2) {
  // 1. store a copy of data1 in temporary storage

  // 2. copy data2 to location of data1

  // 3. copy data in temporary storage to location of data2
}

The above parameters approximately achieve the first of our goals for generic functions. By declaring ptr1 and ptr2 as generic pointers with void *, we effectively assume there is some data at the locations that ptr1 and ptr2 point to that need to be swapped. (We will see later that some adjustments to this function signature are necessary.)

What about the function body? Our swap_int function function from earlier uses the dereference operator (*) to read and write data to and from the memory pointed to by ptr1 and ptr2. We will see in this section that dereferencing generic pointers will not work, and we must use explore other operations for accessing memory.

2You cannot dereference void *

It is natural to want to write a generic swap function following the pattern of swap_int, swap_short, and swap_string from earlier. However, the following code will not work:

1
2
3
4
5
6
/* produces compiler errors */
void swap_faulty(void *ptr1, void *ptr2) {
  void temp = *ptr1;
  *ptr1 = *ptr2;
  *ptr2 = temp;
}

When we attempt to compile with gcc, the compiler is not happy:

$ gcc -c -o swap.o swap.c -Wall -std=c99 -g
swap.c: In function ‘swap_faulty’:
swap.c:10:8: error: variable or field ‘temp’ declared void
   10 |   void temp = *ptr1;
      |        ^~~~
swap.c:10:15: warning: dereferencing ‘void *’ pointer
   10 |   void temp = *ptr1;
      |               ^~~~~
swap.c:10:15: error: void value not ignored as it ought to be
   10 |   void temp = *ptr1;
      |               ^
swap.c:11:3: warning: dereferencing ‘void *’ pointer
   11 |   *ptr1 = *ptr2;
      |   ^~~~~
swap.c:11:11: warning: dereferencing ‘void *’ pointer
   11 |   *ptr1 = *ptr2;
      |           ^~~~~
swap.c:11:9: error: invalid use of void expression
   11 |   *ptr1 = *ptr2;
      |         ^
swap.c:12:3: warning: dereferencing ‘void *’ pointer
   12 |   *ptr2 = temp;
      |   ^~~~~
swap.c:12:9: error: invalid use of void expression
   12 |   *ptr2 = temp;
      |         ^

There are two main reasons this code won’t work. First, we cannot declare untyped variables, so a declaration like void temp; errors. Second, dereferencing void * pointers does not yield anything usable.

Consider what it means to dereference a pointer, say, declared and initialized as int32_t *p = ...; pointer means. This means we know that the p is an address of an int32_t value, and dereferencing with *p accesses those 4 bytes of memory. This allows the compiler to translate later statements like (*p) + 4 into integer arithmetic, _because it knows that *p is an int32_t-typed value.

3Generic memory access with memcpy, memmove

Generics cannot use the dereference operator (*) because the type of the data pointed to by generic pointers is unknown. Instead, generics use two functions from the C standard library: memcpy and memmove.

void *memcpy(void *dest, const void *src, size_t n); 
void *memmove(void *dest, const void *src, size_t n); 

These string.h generic functions copy n bytes from memory area src to memory area dest. In other words, they allow read/write access to data in memory using a void * pointer!

4Generic Swap

Let’s consider how the generic memcpy will replace dereferencing in one of the statements in our swap_int function function:

// in swap_int:
// ptr1, ptr2 are both declared int *
*ptr1 = *ptr2; 

This statement is simply a memcpy—it copies the bytes of the int at ptr2 into the corresponding bytes at ptr1!

// equivalent to swap_int's *ptr1 = *ptr2;
memcpy(ptr1, ptr2, sizeof(int));

memcpy requires knowing how many bytes to copy from a source to a destination. Recall from an earlier chapter that with pointer parameters, functions will not know how much data is pointed to. We must therefore update our pseudocode to add an additional size parameter:

void swap(void *ptr1, void *ptr2, size_t nbytes) {
  // 1. store a copy of data1 in temporary storage

  // 2. copy data2 to location of data1

  // 3. copy data in temporary storage to location of data2
}

4.1Implementation

Finally, we are ready to implement our generic swap function, swap!

Generic swap function

1
2
3
4
5
6
7
8
9
10
11
void swap(void *ptr1, void *ptr2, size_t nbytes) {
  // 1. store a copy of data1 in temporary storage
  char temp[nbytes];
  memcpy(temp, ptr1, nbytes);

  // 2. copy data2 to location of data1
  memcpy(ptr1, ptr2, nbytes);

  // 3. copy data in temporary storage to location of data2
  memcpy(ptr2, temp, nbytes);
}

The swap function above uses memcpy and therefore assumes that ptr1 points to nbytes of memory and ptr2 points to nbytes of memory, and that there is no overlap in these two memory areas.

Now, let’s use this new generic swap to swap two ints:

1
2
3
int data1 = 22;
int data2 = 61; 
swap(&data1, &data2, sizeof(data1));

The function call looks near-identical to our non-generic call to swap_int! Just like before, we pass in the locations of the two values we want to swap. The only modification is that we now additionally pass in the size of the value(s), which we assume to be identical.

The below slidedeck traces through swap in action using a toy example. Initially, data1 stores 22 at address 0x100, and data2 stores 61 at address 0x104. Assume sizeof(int) is 4.

5Application: swap_ends

Finally, let’s consider generics that operate on a generic array of values. We will need to do pointer arithmetic!

We would like to use the swap function to write a function swap_ends, that swaps the first and last elements in an array.

The code below will update the array arr as shown in Figure 1:

int main() {
  ...
  int32_t arr[] = {1, 2, 3, 4, 5};
  int32_t n = sizeof(arr)/sizeof(arr[0]);
  swap_ends(arr, n, sizeof(arr[0]); // to implement
  ...
}
"TODO"

Figure 1:swap_ends swaps the elements 1 and 5 in the arr array.

The function swap_ends has the following inputs and output:

void swap_ends(void *arr, size_t nelems, size_t nbytes);

The last two parameters specify (1) how large the block of memory (“array”) at arr is and (2) how to access sequential elements within the block. Combined, nelems and nbytes should help us access the last element in the array.

Answer: Let’s consider what swap does. It takes two pointers and swaps nbytes at those positions. To swap the ends of the array in Figure 1, we would like to pass in 0x100 and 0x104 to swap.

Option C does this explicitly:

The final implementation of swap_ends:

void swap_ends(void *arr, size_t nelems, size_t nbytes) {
    swap(arr, (char *) arr + (nelems - 1) * nbytes, nbytes); 
}

6More Generics in string.h

The string.h header in the C standard library contains string functions and functions that handle memory, like memcpy, memmove, and memset.

At first glance, you may then find the header name a bit misleading. However, as we have seen, memory handling functions operate byte-by-byte. Furthermore, by definition strings are byte arrays that happen to be null-terminated! Because byte arrays can be accessed with char * pointers, many implementations of string functions simply assume char * operands.

For example, the glibc implementation of strncpy uses memset and memcpy. From the Linux man pages:

... at most n bytes of src are copied. Warning: If there is no null byte among
the first n bytes of src, the string placed in dest will not be 
null-terminated. If the length of src is less than n, strncpy() writes 
additional null bytes to dest to ensure that a total of n bytes are written.

...[returns] a pointer to the destination string dest.

Here is a simplified version of the full implementation in glibc:

1
2
3
4
5
6
char *strncpy(char *dest, const char *src, size_t n) {
  size_t size = strnlen(src, n); // max(strlen(src), n)
  if (size != n) 
    memset(dest + size, '\0', n – size);
  return memcpy(dest, src, size);
}
Footnotes
  1. Some implementations of memmove actually employ temporary storage (like in C99), which risks running out of memory. Read more on StackOverflow.