Learning Objectives

  • Be able to draw a name, offset, and size table (NOS).
  • Be able to write assembly code based on a NOS table.
  • Be able to access elements in a C++ array from assembly.
  • Understand how ordering fields in a structure can affect its size.
  • Understand the term padding.
  • Understand when and where padding is applied.

Video


Structures

There's nothing magical about a structure. It is just a way to put a structure (hence the name) of memory. Memory is just a sequence of 0s and 1s. We, through assembly and C++, put some meaning behind those 0s and 1s. Try opening a music file with Microsoft Word. Doesn't work does it? This is because Word has a different take on the sequences of 0s and 1s than a music player.

The only "difficult" portion about structures is the concept of padding. The memory controller is a logic unit in the CPU that is responsible for reading and writing to main memory (RAM). However, it isn't a perfect unit. It reads in groups of blocks, and they are structured in rows and columns. Below is a diagram to show memory elements in RAM (simplified, of course).

Memory elements in DRAM are laid out in rows and columns.

The memory address is split into two sections, one section specifies the row and the other specifies the column. So, the internals of DRAM make it much more efficient to read an entire row. Looking at the 2D array below, the memory controller must sequence two reads to read bits 7-10 whereas it only needs one read to read bits 0-7. Notice that bits 7-10 are in two different rows. The bits we want to read are colored red in the figure below.

Read blocks in red spans two rows

C++ knows about this, so it tries to move the data elements in a structure into a position where the memory controller has less work to do, and hence, it is more efficient. C++ will waste memory space for efficiency, and this behavior can be configured by the programmer.


Padding

Adding wasted bits to move a field into a row is called padding. C++ will look at each field and apply the formula offset % size == 0. That is, the offset of each field must be aligned (a multiple) of its size. For example, take a look at the structure below.

struct MyStruct {
   char i;
   int j;
   long k;
};

The structure above looks fairly simple. If I asked you to give me the size of this structure, what would you say? Here's the of the size of the structure from C++.

Size of MyStruct = 16 bytes.

If we added all of the fields together, we'd have a 1-byte character, a 4-byte integer, and an 8-byte long. That's only 13 bytes! So, what happened? The answer is that C++ padded the fields to satisfy the formula offset % size == 0. The char starts at offset 0 (it's at the top of the structure), and is 1 byte. Therefore, the next byte available is 1 (byte 0 is being taken by the char). Then we get to the integer. It's 4 bytes. However, 1 % 4 == 1. That is, the remainder of 1 divided by 4 is 1, NOT 0. Therefore, C++ will start adding wasted bytes to satisfy this equation. 2 % 4 == 2, 3 % 4 == 3, 4 % 4 == 0. There we go. For the integer field, the next available offset we can use after padding is 4. So, now we've added 3 wasted bytes. The integer takes bytes 4, 5, 6, and 7. So, the next byte available is offset 8. A long is 8 bytes, and 8 % 8 == 0, so there are no wasted bytes. The long therefore takes bytes 8, 9, 10, 11, 12, 13, 14, and 15. If we add bytes 0 through 15, that gives us a total of 16 bytes, which is why the size of the structure comes out to be 16 bytes.

End Padding

A structure can also contain padding at the end of a structure. This is because you can make an array out of any structure in C++. The formula for this is that the size of the structure % the largest field size must == 0. So, when totaling a structure, you must consider padding in-between fields, and the padding at the end of the last field.


The NOS Table

In assembly, we don't have any "structures", and we are required to know where to go to read and write values in memory. Therefore, it is incumbent upon us to know where C++ has padded certain places in a structure. To make this easier, I recommend using a NOS table, which stands for Name, Offset, Size. Let's draw a NOS table for our example structure above.

NameOffsetSize
char i01
int j44
long k88
TOTAL SIZE16
An example of a NOS table.

With this table, we can now easily dereference members of this structure. Let's take an example:

# long AddAll(MyStruct *s);
AddAll:
     # $v0 AddAll($a0)
     lb $t0, 0($a0)
     lw $t1, 4($a0)
     ld $t2, 8($a0)
     add $v0, $t0, $t1
     add $v0, $v0, $t2
     jr $ra

In the code above, $a0 contains a memory address (remember, pointers are memory addresses). We use the NOS table to dereference i, j, and k. Notice I specify the offsets as 0, 4, and 8, which I get from the second column of the table. Then I use lb, lw, and ld (doubleword), which I get from the third column of the table. The NOS table makes it much easier to write assembly from C++ code. Recall that I recommend always writing your logic in C++ FIRST, then translating to assembly. The NOS table is just one more tool to help you do this.


Arrays

Finding elements of an array is fairly straightforward, and is the main reason why we have 0-based indexing for arrays in C++. The index is merely an offset multiplied by the size of each element and then added back into the array's base address. When we pass arrays to a function, we pass the memory address of the first element. C++ does the "magic" of calculating where to find each element. We use the following formula for arrays:

$$\text{address} = \text{base} + \text{offset}\times\text{size}$$

That is, we take the base and add to it the offset times the size of each element. Let's see an example.

int main() {
    int array[100];
    array[10] = 33;
}

All we do in the code above is allocate 100 integers on the stack (it's local storage). To know how many bytes that is we multiply \(100\times 4=400\). So, we allocate 400 bytes. Then we need to put the value 33 into array[10]. Using the formula \(\text{address} = \text{base}+\text{offset}\times\text{size}\), we can get the index 10. In assembly, it'd look like the following:

main:
   addi $sp, $sp, -400  # 400 bytes for the array (still must be a multiple of 8!)
   addi $a0, $sp, 40    # array[10] starts at byte 40 ($a0 = $sp + 10 x 4)
   li   $t0, 33
   sw   $t0, 0($a0)
   addi $sp, $sp, 400
   jr   $ra

We can also use a little bit of math to return a variable index:

short get_element(short array[], int element) {
    return array[element];
}

Recall that arrays are passed as the memory address of the first element. We can use this in assembly. The following assembly implements get_element above.

get_element:
   # $v0 get_element($a0, $a1)
   # $a0 = short array[]
   # $a1 = int element
   # First scale $a1 by 2 bytes
   sll  $t0, $a1, 1     # shifting left by 1 is the same as multiplying by 2
   add  $t0, $t0, $a0   # add the element to the base memory address
   lh   $v0, 0($t0)     # load 2 bytes (halfword) from array + element x 2
   jr   $ra

Most data sizes are powers of 2: 1, 2, 4, 8, etc. So, we can use shifting. Shifting left by 0 is the same as multiplying by 1, shifting left by 1 is the same as multiplying by 2, shifting left by 2 is the same as multiplying by 4, and shifting left by 3 is the same as multiplying by 8. This is useful since shifting is the easiest way to scale a value by a data size. This, of course, won't work for arrays of structures whose sizes are not powers of 2.


Array of Structures

With an array we just multiply the index by the size of an individual element. The same happens with a structure, however remember, we need a NOS table to help us with the size of each structure. Take the following example.

struct MyStruct {
    int i;
    char j;
};
MyStruct array[10];

Drawing our NOS table, we need to find each individual element and the size of the entire structure.

NameOffsetSize
int i04
char j41
TOTAL SIZE8
MyStruct NOS table

We can see that each element of the array will need to be scaled by 8. Why 8? Remember that when calculating the size of the entire structure, we need to make sure that total_size % size_of_biggest_field == 0. In our case, the biggest field is 4. After an integer and a character make 5, 5 % 8 == 5, so we need to pad by three bytes. (5 + 3) = 8 % 8 == 0.

So, now we need to move by 8 bytes to get to each individual element. When we get to those elements, we need to add 4 to dereference char j or add 0 to dereference int i.

struct MyStruct {
   int i;
   char j;
};
MyStruct array[10];
int add_values(MyStruct array[], int element) {
    return array[element].i + array[element].j
}

Writing this in assembly, we need to know the total size of MyStruct, which is 8, and the offset of elements i and j, which are 0 and 4, respectively.

add_values:
   # $v0 add_values($a0, $a1)
   # $a0 = array[]
   # $a1 = element
   # First scale the element
   sll $a1, $a1, 3    # Shifting left by 3 multiplies by 8
   add $a0, $a0, $a1
   lw  $t0, 0($a0)    # load offset 0 (int i)
   lb  $t1, 4($a0)    # load offset 4 (char j)
   add $v0, $t0, $t1
   jr  $ra

Recall that lb will sign-extend to widen what is loaded from memory into a 32-bit register. If we didn't want this behavior, recall that we would use lbu.