Color Of Code

Alignment and byte ordering

Definitions

Alignment: the requirement or preference that an object be stored at addresses that are multiples of a given integer, the alignment parameter.

Byte ordering: the way the bytes of a larger object (a 32-bit integer, for example) are laid out in memory. This is often referred to as endianness.

Introduction

Usually you do not have to care much about alignment or byte ordering. The compiler assembles members into structures or classes, and you access their memory transparently by using their names in your favorite programming language. The first contact with byte ordering usually arises when one has to manipulate bytes in a buffer directly, for example when performing some transformation on bitmaps. Why are the relative positions of the R/G/B bytes not the expected ones on Intel PCs? It also comes up when one starts to transfer data over sockets and runs into issues with host versus network byte ordering.

This byte twiddling is not much fun, but when it comes to interoperability between systems or compilers, or to manipulating data in buffers, you will need this knowledge to get the job done properly.

When interoperability is required, the two things that can cause you trouble are:

  • Finding the position of the objects in memory (alignment may introduce padding that shifts the objects in memory)
  • Knowing how to load the data from memory (this depends on how the individual bytes of the data are ordered)

Byte ordering

The endianness or byte ordering depends on the CPU and memory architecture only. On Intel/AMD PCs you will face little endian systems - not to be confused with little indians! ;-). But there are also big endian systems, switchable systems, and even mixed or middle endian ones. Very often, the CPU does not simply read one single byte from memory but a whole block of 4 bytes or more. The way the CPU data lines are connected to the memory chips determines the endianness. On 32-bit PCs the two lowest bits of the address lines are usually not connected, so the CPU can only address blocks of 4 bytes. Refer to the Wikipedia article on endianness for more details. It is pretty clear and shows an example of a 32-bit value stored in memory, 0x0A0B0C0D: a little endian system stores the bytes as 0D 0C 0B 0A at increasing addresses, a big endian system as 0A 0B 0C 0D.

What matters for programming:

  • Convert data to network byte ordering, independently of your system, using the appropriate conversion functions. In C, for example, these are the htons, htonl, ntohs and ntohl functions.
  • Take care when creating files that are to be exchanged between little and big endian systems. To keep them readable, define the format and specify the byte ordering to be used for it.
  • For plain text files containing Unicode data, for example XML, write the BOM (Byte Order Mark) so that the file can be read by all parsers on all systems.

Alignment

Now we come to the really bad news. How structures and variables are aligned or positioned in memory depends on the compiler and the compiler settings. The C standard does not specify much regarding alignment either. There are a few rules, such as:

  • a pointer to void has the same representation and alignment requirements as a pointer to char

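Useful operators and macros are:

  • sizeof, returns the size of a type
  • alignof, returns the alignment of a type (the test program below uses the compiler extension __alignof, which reports the preferred alignment)
  • offsetof, a macro that returns the offset of a member relative to the beginning of the structure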

Example of source code to check your compiler's behavior in C++:

alignment.cpp

#include <iostream>
#include <iomanip>

#ifdef __GNUC__
#define __int64 long long
#endif

// Include the same structure definitions four times,
// once for each packing value.
#pragma pack(push, 1)
#define ALIGNMENT 1
#include "structs.h"
#undef ALIGNMENT
#pragma pack(pop)

#pragma pack(push, 2)
#define ALIGNMENT 2
#include "structs.h"
#undef ALIGNMENT
#pragma pack(pop)

#pragma pack(push, 4)
#define ALIGNMENT 4
#include "structs.h"
#undef ALIGNMENT
#pragma pack(pop)

#pragma pack(push, 8)
#define ALIGNMENT 8
#include "structs.h"
#undef ALIGNMENT
#pragma pack(pop)

// Print the size and alignment of type T.
template <typename T>
void print(char const* name)
{
	std::cout << std::setw(10) << name
	          << " sizeof = " << sizeof (T)
	          << " alignof = " << __alignof (T) << std::endl;
}

#define DUMP_PARAMS(sname) \
	print<sname>(#sname)

#define DUMP_PARAMS2(sname) \
	print<sname>("")

#define DUMP_SIZE(sname) \
	std::cout << "alignment" << std::endl; \
	std::cout << " 1 "; DUMP_PARAMS2(a1_##sname); \
	std::cout << " 2 "; DUMP_PARAMS2(a2_##sname); \
	std::cout << " 4 "; DUMP_PARAMS2(a4_##sname); \
	std::cout << " 8 "; DUMP_PARAMS2(a8_##sname);

int main()
{
	DUMP_PARAMS(char);
	DUMP_PARAMS(short);
	DUMP_PARAMS(int);
	DUMP_PARAMS(long);
	DUMP_PARAMS(__int64);
	DUMP_PARAMS(float);
	DUMP_PARAMS(double);
	DUMP_PARAMS(long double);
	DUMP_PARAMS(void*);
	std::cout << "------------------" << std::endl;
	std::cout << "--- sizes for structure with char:" << std::endl;
	DUMP_SIZE(char_st);
	std::cout << "--- sizes for structure with char+char:" << std::endl;
	DUMP_SIZE(charchar_st);
	std::cout << "--- sizes for structure with char+short:" << std::endl;
	DUMP_SIZE(charshort_st);
	std::cout << "--- sizes for structure with char+int:" << std::endl;
	DUMP_SIZE(charint_st);
	std::cout << "--- sizes for structure with char+int64:" << std::endl;
	DUMP_SIZE(charint64_st);
	std::cout << "--- sizes for structure with int64+char:" << std::endl;
	DUMP_SIZE(int64char_st);
	std::cout << "--- sizes for structure with short+short:" << std::endl;
	DUMP_SIZE(shortshort_st);
	std::cout << "--- sizes for structure with char+short+int:" << std::endl;
	DUMP_SIZE(charshortint_st);
	std::cout << "--- sizes for structure with char_st+char_st:" << std::endl;
	DUMP_SIZE(char_st_char_st_st);
	return 0;
}

structs.h

#if ALIGNMENT == 1
	#define PREFIX(name)  a1##_##name
#elif ALIGNMENT == 2
	#define PREFIX(name)  a2##_##name
#elif ALIGNMENT == 4
	#define PREFIX(name)  a4##_##name
#elif ALIGNMENT == 8
	#define PREFIX(name)  a8##_##name
#endif

struct PREFIX(char_st)
{
	char c;
};

struct PREFIX(charchar_st)
{
	char c1;
	char c2;
};

struct PREFIX(charshort_st)
{
	char c;
	short s;
};

struct PREFIX(charint_st)
{
	char c;
	int i;
};

struct PREFIX(charint64_st)
{
	char c;
	__int64 i;
};

struct PREFIX(int64char_st)
{
	__int64 i;
	char c;
};

struct PREFIX(shortshort_st)
{
	short s1;
	short s2;
};

struct PREFIX(charshortint_st)
{
	char c;
	short s;
	int i;
};

struct PREFIX(char_st_char_st_st)
{
	PREFIX(char_st) st1;
	PREFIX(char_st) st2;
};

#undef PREFIX

Compile

On Linux with g++:

> g++ -o alignment alignment.cpp

Results

On a 32-bit Linux system with g++ 4.4.5, the program prints the following:

       char sizeof =  1 alignof = 1
      short sizeof =  2 alignof = 2
        int sizeof =  4 alignof = 4
       long sizeof =  4 alignof = 4
    __int64 sizeof =  8 alignof = 8
      float sizeof =  4 alignof = 4
     double sizeof =  8 alignof = 8
long double sizeof = 12 alignof = 4
      void* sizeof =  4 alignof = 4

Note that the reported alignment of a long long (int64) is 8 bytes. On the same system, however, the alignment of a structure containing a long long is only 4, even with a specified packing (pragma pack) of 8! This is reported to be intentional behavior of GCC, to comply with the 32-bit ABI. One interesting case is the structure holding a char followed by a long long (charint64_st):

alignment
 1             sizeof =  9 alignof = 1
 2             sizeof = 10 alignof = 2
 4             sizeof = 12 alignof = 4
 8             sizeof = 12 alignof = 4

The behavior of a Microsoft compiler differs in this case on a 32-bit system:

alignment
 1             sizeof =  9 alignof = 1
 2             sizeof = 10 alignof = 2
 4             sizeof = 12 alignof = 4
 8             sizeof = 16 alignof = 8

This shows that the best way to create truly interoperable structures is to insert the pad bytes yourself, so that every member is aligned explicitly and nothing is left to the compiler. Even so, taking existing structures from 32-bit applications and recompiling them on 64-bit systems can shift the members to different offsets. These portability issues can be really tricky to detect.

Interesting links