Correctness of misaligned access in C++

From what I have read, misaligned access means mostly two things:

  • you may get a performance loss
  • you will lose atomicity of loads and stores that aligned access has
    Supposing that performance is not an issue and what I want from software is correctness, how bad is misaligned access? My understanding is that the x86 CPU will handle such accesses correctly but may have to do additional work to fetch the data.

    What led me to ask this question was compiling my code with -fsanitize=undefined. I got many errors about misaligned stores/loads. I am not sure whether this is an issue because:

  • the stores are performed only during data preparation which is a single-threaded process, so I am not concerned about loss of atomicity
  • the loads are performed in a multithreaded process where many threads (four or more) access the data, but the data is not modified by any of them (held in a const uint8_t* variable)
    The reason the accesses are not aligned is that the const uint8_t* array contains bytes from many different types (uint8_t, uint16_t, uint32_t, uint64_t, and int64_t).

    I am sure that no load goes outside the bounds of the allocated uint8_t array (e.g. the program never loads a uint64_t from an address that points to the last one, two, or three bytes of the allocated memory block), and I am sure that my accesses are all correct, only misaligned.

    Another thing I read is that such loads may break the strict aliasing rules, but the code compiles without a single warning with -Wstrict-aliasing -Werror (which I enabled long ago).

    Should I pad the data in the uint8_t array to ensure accesses are aligned, or may I safely ignore the warnings?


    There are platforms that don't support unaligned access at all (you will get a crash). There are also platforms where unaligned access is supported in general, but some instructions still require an aligned address. For example, ARM has the LDRD instruction, which needs an aligned memory address, and unfortunately the compiler is free to use it. Usually, though, there is a compiler extension that tells the compiler a pointer may be unaligned, so it won't use LDRD.

    On platforms that do support unaligned access, the penalties you mentioned apply.

    I recommend using memcpy. It works on all platforms, and compilers nowadays are good at optimizing it (so you typically won't get memcpy calls, but fast mov instructions).
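
    A minimal sketch of the memcpy approach, assuming the buffer holds values in the host's own byte order. The helper names load_unaligned and store_unaligned are made up for this illustration and do not come from any library:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: read a T from a byte pointer that may be misaligned.
// memcpy has no alignment requirement, so this avoids the undefined behavior
// of dereferencing a reinterpret_cast'ed pointer.
template <typename T>
T load_unaligned(const std::uint8_t* p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be read with memcpy");
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// Hypothetical helper: write a T to a byte pointer that may be misaligned.
template <typename T>
void store_unaligned(std::uint8_t* p, T value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be written with memcpy");
    std::memcpy(p, &value, sizeof(T));
}
```

    A load such as load_unaligned<std::uint64_t>(data + offset) would then replace the cast-and-dereference pattern. The caller is still responsible for making sure that offset + sizeof(T) stays within the buffer.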


    The main problem isn't performance or atomicity, it is correctness. Misaligned accesses invoke undefined behavior according to the C and C++ standards, so you can't rely on any particular outcome. It may work, or it may crash. Or it may work at first and stop working some time later. This is the essence of the error messages you get. You may choose to ignore the errors if you know that it will always work for you, but since you asked for such errors to be flagged by using the corresponding compiler switch, it is only reasonable to strive to avoid them, especially if you are not absolutely sure that your code will stay on this platform forever. Furthermore, how do you know it will always work for you, even on the same platform?

    From what you write, it seems that the data is written by the same machine that reads it later, only the threads are different. If so, you should attempt to write the data in a properly aligned way, i.e. use padding where appropriate. You may be able to get help from the compiler by packaging the data in a properly defined struct rather than an unstructured buffer. This will also give you more type safety.
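
    As a rough sketch of the struct idea: the Record type and its field names below are invented for illustration, since the question does not show the real data layout. The compiler inserts whatever padding each member needs, and the static_asserts document that assumption:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical record; the real fields would come from your data format.
// Members are ordered from largest to smallest to keep padding small.
struct Record {
    std::uint64_t id;
    std::int64_t  offset;
    std::uint32_t length;
    std::uint16_t kind;
    std::uint8_t  flags;
};

static_assert(std::is_standard_layout<Record>::value,
              "offsetof requires a standard-layout type");
static_assert(offsetof(Record, offset) % alignof(std::int64_t) == 0,
              "offset is aligned within the struct");
static_assert(offsetof(Record, length) % alignof(std::uint32_t) == 0,
              "length is aligned within the struct");
static_assert(offsetof(Record, kind) % alignof(std::uint16_t) == 0,
              "kind is aligned within the struct");
static_assert(sizeof(Record) % alignof(Record) == 0,
              "array elements stay aligned");
```

    An array of Record then keeps every member aligned, provided the storage itself is suitably aligned for Record, which is the case for a std::vector<Record> or a new Record[n] allocation.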

    Otherwise you would have to worry about more than just alignment. For example, you would also need to take endianness into account. In that case you are probably writing a kind of external data record that might end up on a different machine. What you are looking for is a machine-neutral external data representation. You can define one yourself or, better, use one of the several standard representations that have been invented for RPC, which has the advantage that you can find libraries to do the reading and writing.
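
    If you do define the representation yourself, one common way to handle endianness and alignment at the same time is to assemble each value byte by byte. This is only a sketch that assumes you pick little-endian as the external byte order. The names read_u32_le and write_u32_le are hypothetical:

```cpp
#include <cstdint>

// Hypothetical helper: read a 32-bit value stored in little-endian order.
// Only single bytes are accessed, so the address needs no alignment, and
// the result is the same regardless of the host's byte order.
inline std::uint32_t read_u32_le(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: write a 32-bit value in little-endian order.
inline void write_u32_le(std::uint8_t* p, std::uint32_t v) {
    p[0] = static_cast<std::uint8_t>(v);
    p[1] = static_cast<std::uint8_t>(v >> 8);
    p[2] = static_cast<std::uint8_t>(v >> 16);
    p[3] = static_cast<std::uint8_t>(v >> 24);
}
```

    The same pattern extends to the 16-bit and 64-bit types by changing the number of bytes and shifts.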
