Not My Type
In a previous post, I walked through a few basic examples of type systems in managed languages.
Below is a quick example of how types behave in the C programming language.
#include <stdio.h>

int main(void)
{
    int y = 5 + "2"; /* Line 5 */
    printf("%d\n", y); /* Line 6 */
    char *x = 3 + "2"; /* Line 7 */
    printf("%d\n", x); /* Line 8 */
}
/* Compiler warnings
The compiler issues the below warnings (not errors)
but it compiles the program and it can be executed
"warning: adding 'int' to a string does
not append to the string [-Wstring-plus-int]" Line 5
"use array indexing to silence this warning" Line 5
"warning: incompatible pointer to integer conversion initializing
'int' with an expression of type 'char *' [-Wint-conversion]" Line 5
"warning: adding 'int' to a string does not append
to the string [-Wstring-plus-int]" Line 7
"use array indexing to silence this warning" Line 7
"warning: format specifies type 'int' but the argument
has type 'char *' [-Wformat]" Line 8 */
/* Output: (Undefined Behavior)
170446759
170446757
*/
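“2” is a string literal. In C, a string literal is stored as an array of characters with static storage duration (the compiler arranges the storage, i.e. it is “statically allocated”). Each character in the array has a size of one byte, and the array ends with a terminating null character, so “2” occupies two bytes: '2' followed by '\0'.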
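The array layout described above can be checked directly. This is a minimal sketch; the function names `literal_size` and `literal_char_at` are hypothetical helpers introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: size in bytes of the array behind the literal "2".
   The array holds the character '2' plus the terminating '\0'. */
size_t literal_size(void)
{
    return sizeof "2";
}

/* Hypothetical helper: a string literal can be indexed like any other
   char array. Only i == 0 and i == 1 are in bounds for "2". */
char literal_char_at(size_t i)
{
    return "2"[i];
}
```

Under these assumptions, `literal_size()` returns 2, `literal_char_at(0)` returns '2', and `literal_char_at(1)` returns the terminating '\0'.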
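C interprets the expressions on line 5 and line 7 as pointer arithmetic. Each one produces an address that is 5 bytes (line 5) or 3 bytes (line 7) past the start of the string “2”. Because “2” is only two bytes long, both addresses lie beyond the end of the array. On line 7 that address is stored in the pointer x. On line 5 it is converted to an integer and stored in y, which is what triggers the -Wint-conversion warning.
C treats the “+” as pointer arithmetic between the integer operand on the left and the pointer operand on the right: in this expression “2” is converted to a pointer to its first character. The integer is scaled by the size of the pointed-to type (a character is one byte), so the result is 5 times one byte past the start of “2”, or 3 times one byte past the start of “2”. This is consistent with the two values in the output differing by 2.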
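For contrast, the same "integer + string" arithmetic is well defined when the result stays inside the array (or lands exactly one past its end). The sketch below assumes a hypothetical helper named `skip_chars`; it is not part of the original program.

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer n characters past the start of s.
   The caller must keep n no larger than the length of the string,
   otherwise the behavior is undefined, as in the program above. */
const char *skip_chars(const char *s, size_t n)
{
    return n + s; /* same address as s + n, or &s[n] */
}
```

With this helper, `skip_chars("hello", 2)` points at the text "llo", and `skip_chars("2", 1)` points at the terminating '\0' of “2”. Anything past `skip_chars("2", 2)` falls outside the array, which is the situation on lines 5 and 7.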
In summary:
- C is statically and weakly typed.
- In this example the compiler reports the type problems as warnings, not errors (stricter settings such as -Werror turn warnings into errors).
- When an integer is added to a string literal (“integer” + “string”):
  - C does not attempt to implicitly convert the integer to a string or the string to an integer.
  - Instead, C treats the expression as pointer arithmetic: the result is an address computed from the string’s starting address, the integer value, and the size of a character.
  - Here the resulting address lies outside the string’s array, so the behavior is undefined.
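Since C performs no implicit conversion here, either conversion has to be requested explicitly. The following is a rough sketch using the standard library functions `strtol` and `snprintf`; the wrapper names `add_as_numbers` and `concat_as_text` are hypothetical, and error handling (invalid digits, overflow, truncation) is left out for brevity.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse s as a base-10 integer, then add n. */
long add_as_numbers(long n, const char *s)
{
    return n + strtol(s, NULL, 10);
}

/* Hypothetical helper: write the decimal text of n followed by s into buf.
   Returns what snprintf returns: the length the full result would need. */
int concat_as_text(char *buf, size_t size, int n, const char *s)
{
    return snprintf(buf, size, "%d%s", n, s);
}
```

Under these assumptions, `add_as_numbers(5, "2")` yields 7, and `concat_as_text(buf, sizeof buf, 5, "2")` leaves the string "52" in `buf` when the buffer is large enough.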