
So I now understand that struct assignment is a combination of memcpy() (which apparently works because memory is allocated contiguously for structure variables) and some optimization by the compiler.

But what happens if you use struct assignments when the structure definition contains a pointer?

struct S 
{
   char * p;
};

char array[]="hi"; 
struct S s1, s2;
s1.p = array;
s2 = s1;

Based on the above code: since s1.p == s2.p, there seems to be an overlap in what &s1 and &s2 point to, so shouldn't there be some sort of Undefined Behaviour (UB), given that struct assignment employs memcpy() under the hood?

However, https://stackoverflow.com/a/2302357/10701114 (the code example in that answer differs slightly from mine) states:

Now the pointers of both structs point to the same block of memory - the compiler does not copy the pointed to data.

This is completely at odds with my assumptions. Not only does such a struct assignment work without invoking UB, but the memcpy() hidden inside the struct assignment also doesn't work the way I expected: the compiler doesn't copy the pointed-to data and instead leaves both pointers pointing to the same memory, as the quote says.

To reiterate my question: what happens if you use struct assignments when the structure definition contains a pointer? Why am I wrong in my assumptions?

Leon
  • Note that there is no "`memcpy` and optimization stuff by the compiler" involved, it's actually the other way around: if the compiler sees that you're calling memcpy for a small struct, it will often remove the call to `memcpy` and just inline everything using registers or by writing literals directly to the destination (for example, note how the compiler will just [write a single integer in the "small" case](https://godbolt.org/z/HpL--y)). – vgru Feb 02 '20 at 03:57
  • I kind of get what you're saying about the "writing literal.." part but what do "inline" and "register" mean? – Leon Feb 02 '20 at 04:06
  • No `memcpy`, just [C11 Standard - 6.3.2.1 Lvalues, arrays, and function designators(p2)](http://port70.net/~nsz/c/c11/n1570.html#6.3.2.1p2). *"...an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue); this is called lvalue conversion."* and [(p3)](http://port70.net/~nsz/c/c11/n1570.html#6.3.2.1p3) *"...an expression that has type ''array of type'' is converted to an expression with type ''pointer to type'' that points to the initial element of the array object and is not an lvalue."* – David C. Rankin Feb 02 '20 at 04:12
  • @Leon: [inlining a function](https://en.wikipedia.org/wiki/Inline_function) means substituting the function call with the entire function code. After placing its body inside the caller function, the compiler is able to do additional optimizations (e.g. [this example](https://godbolt.org/z/PrHgi-) just returns the final sum). With some standard functions (like `memcpy`), it's surprising how far the compiler goes ([there is no call to strlen here](https://godbolt.org/z/njE3Hs)). A [register](https://en.wikipedia.org/wiki/Processor_register) is basically a fast storage location inside a CPU. – vgru Feb 02 '20 at 04:31

1 Answer


When you assign one struct to another, it copies the contents of each member of the struct, regardless of the type.

In this case, the struct contains a char *, so the value of that pointer is copied from one struct to the other. Because p is the only member, doing s2 = s1 has the same effect as doing s2.p = s1.p. Now both pointers contain the same value, meaning they point to the same thing.

There is nothing that says two pointers can't contain the same value.

dbush
  • But when you assign one struct to another, you are using `memcpy()` right? If the locations pointed to by the source (i.e. `&s1`) and destination (i.e. `&s2`) parameters of `memcpy()` overlap, it should cause UB. In this case, there is overlap right as `s1.p==s2.p`? – Leon Feb 02 '20 at 03:41
  • @Leon There is no overlap because `s1` and `s2` are separate variables with their own memory. They contain the same *values*. That's it. – dbush Feb 02 '20 at 03:44
  • @Leon: `s.p` is a pointer, the pointer is copied, not the array. Whenever you assign a value to a variable, you are copying its entire contents. Whenever you pass a variable to a function through a parameter, you are copying its entire contents. Variables of type `int`, `struct foo` or `int**` are all the same in this regard, their *contents* are copied. – vgru Feb 02 '20 at 03:44
  • Hmm so is it like this: `&s1` and `&s2` point to different memory BUT it just so happens that portions of the different memory( the pointer `p` in structure definition) point to the same thing. So `&s1` and `&s2` themselves don't overlap? – Leon Feb 02 '20 at 03:46
  • @Leon Correct. It would be the same if the structs contained an `int` instead of a `char *`. – dbush Feb 02 '20 at 03:47
  • Yes, `struct S s1, s2;` creates two separate variables, they don't overlap. Note that the *contents* of a `char *` field is an address, this address is copied. If you called a function and passed `s1` to it (e.g. `do_something(s1);`), the function would get its separate copy of `s1`. – vgru Feb 02 '20 at 03:47