
Section 9. Structs, Enums, and Unions

9.1: What is the difference between an enum and a series of preprocessor #defines?

At the present time, there is little difference. Although many people might have wished otherwise, the ANSI standard says that enumerations may be freely intermixed with integral types, without errors. (If such intermixing were disallowed without explicit casts, judicious use of enums could catch certain programming errors.)

Some advantages of enums are that the numeric values are automatically assigned, that a debugger may be able to display the symbolic values when enum variables are examined, and that they obey block scope. (A compiler may also generate nonfatal warnings when enums and ints are indiscriminately mixed, since doing so can still be considered bad style even though it is not strictly illegal.) A disadvantage is that the programmer has little control over the size (or over those nonfatal warnings).
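
The points above can be illustrated with a short sketch. The names used here (color, RED, COLOR_RED, next_color, and so on) are hypothetical and chosen only for the example.

```c
#include <assert.h>

/* Preprocessor constants: values are assigned by hand, and the names
   are gone before the compiler proper (or a debugger) sees the code. */
#define COLOR_RED   0
#define COLOR_GREEN 1
#define COLOR_BLUE  2

/* Enumeration: the values 0, 1, 2 are assigned automatically. */
enum color { RED, GREEN, BLUE };

/* Enums mix freely with integral types; no cast is required. */
int next_color(enum color c)
{
	int n;
	enum color wrapped;

	assert(c >= RED && c <= BLUE);
	n = c + 1;          /* enum used in int arithmetic */
	wrapped = n % 3;    /* int assigned back to an enum */
	return wrapped;     /* enum returned as an int */
}

/* Enumeration constants obey block scope; a #define would not. */
int local_scope_demo(void)
{
	enum { LIMIT = 10 };    /* visible only inside this function */
	return LIMIT;
}
```

Some compilers may warn about the mixing in next_color; as noted above, it is nevertheless legal C.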

References: K&R II Sec. 2.3 p. 39, Sec. A4.2 p. 196; H&S Sec. 5.5 p. 100; ANSI Secs. 3.1.2.5, 3.5.2, 3.5.2.2.

9.2: I heard that structures could be assigned to variables and passed to and from functions, but K&R I says not.

What K&R I said was that the restrictions on struct operations would be lifted in a forthcoming version of the compiler, and in fact struct assignment and passing were fully functional in Ritchie's compiler even as K&R I was being published. Although a few early C compilers lacked struct assignment, all modern compilers support it, and it is part of the ANSI C standard, so there should be no reluctance to use it.
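
As a minimal sketch of the three operations in question (assignment, passing, and returning), assuming a hypothetical struct point type and a hypothetical translate function:

```c
#include <assert.h>

struct point { int x, y; };   /* hypothetical example type */

/* A struct passed by value and a struct returned by value. */
struct point translate(struct point p, int dx, int dy)
{
	p.x += dx;   /* modifies the callee's local copy only */
	p.y += dy;
	return p;
}

int struct_ops_demo(void)
{
	struct point a = { 1, 2 };
	struct point b;

	b = a;                      /* struct assignment copies every member */
	b = translate(b, 10, 20);   /* pass and return by value */

	assert(a.x == 1 && a.y == 2);   /* the original is untouched */
	return b.x + b.y;               /* 11 + 22 */
}
```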

References: K&R I Sec. 6.2 p. 121; K&R II Sec. 6.2 p. 129; H&S Sec. 5.6.2 p. 103; ANSI Secs. 3.1.2.5, 3.2.2.1, 3.3.16.

9.3: How does struct passing and returning work?

When structures are passed as arguments to functions, the entire struct is typically pushed on the stack, using as many words as are required. (Programmers often choose to use pointers to structures instead, precisely to avoid this overhead.)

Structures are often returned from functions in a location pointed to by an extra, compiler-supplied "hidden" argument to the function. Some older compilers used a special, static location for structure returns, although this made struct-valued functions nonreentrant, which ANSI C disallows.
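
The trade-off can be seen in source form with a sketch like the following. The record type and the three functions are hypothetical; make_record simply writes out by hand the kind of "hidden" result pointer a compiler might supply for a struct-valued function, and is not a description of any particular compiler's calling convention.

```c
#include <assert.h>

struct record {             /* hypothetical, deliberately bulky type */
	int  id;
	char payload[1024];
};

/* By value: the whole struct (over 1K here) is copied for the call. */
int id_by_value(struct record r)
{
	r.id = -1;              /* affects only the callee's copy */
	return r.id;
}

/* By pointer: only an address is passed; const documents that the
   callee does not modify the caller's object. */
int id_by_pointer(const struct record *r)
{
	assert(r != 0);
	return r->id;
}

/* The "hidden argument" made explicit: the caller supplies the
   location that receives the result. */
void make_record(struct record *result, int id)
{
	assert(result != 0);
	result->id = id;
	result->payload[0] = '\0';
}
```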

References: ANSI Sec. 2.2.3 p. 13.

9.4: The following program works correctly, but it dumps core after it finishes. Why?

	struct list
		{
		char *item;
		struct list *next;
		}

	/* Here is the main program. */

	main(argc, argv)
	...

A missing semicolon causes the compiler to believe that main returns a structure. (The connection is hard to see because of the intervening comment.) Since struct-valued functions are usually implemented by adding a hidden return pointer, the generated code for main() tries to accept three arguments, although only two are passed (in this case, by the C start-up code). See also question 17.21.

References: CT&P Sec. 2.3 pp. 21-2.

9.5: Why can't you compare structs?

There is no reasonable way for a compiler to implement struct comparison which is consistent with C's low-level flavor. A byte-by-byte comparison could be invalidated by random bits present in unused "holes" in the structure (such padding is used to keep the alignment of later fields correct; see questions 9.10 and 9.11). A field-by-field comparison would require unacceptable amounts of repetitive, in-line code for large structures.

If you want to compare two structures, you must write your own function to do so. C++ would let you arrange for the == operator to map to your function.
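
A hand-written comparison function might look like the following sketch. The employee type and the name employee_equal are hypothetical. Because the function compares member by member, it never looks at padding bytes, and it compares the string member only up to its terminating '\0'.

```c
#include <assert.h>
#include <string.h>

struct employee {           /* hypothetical example type */
	char   name[20];
	int    id;
	double salary;
};

/* Returns nonzero if the two structures are equal, member by member. */
int employee_equal(const struct employee *a, const struct employee *b)
{
	assert(a != 0 && b != 0);
	return strcmp(a->name, b->name) == 0
	    && a->id == b->id
	    && a->salary == b->salary;
}
```

Two structures can compare equal under such a function even when memcmp reports a difference, for example when the unused tail of the name array holds different leftover bytes.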

References: K&R II Sec. 6.2 p. 129; H&S Sec. 5.6.2 p. 103; ANSI Rationale Sec. 3.3.9 p. 47.

9.6: How can I read/write structs from/to data files?

It is relatively straightforward to write a struct out using fwrite:

	fwrite((char *)&somestruct, sizeof(somestruct), 1, fp);

and a corresponding fread invocation can read it back in. However, data files so written will not be very portable (see questions 9.11 and 17.3). Note also that on many systems you must use the "b" flag when fopening the files.
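
A write followed by a matching read might be sketched as follows. The sample type and the roundtrip function are hypothetical, and the stream is assumed to have been opened for binary update (as tmpfile() does, or fopen with a mode such as "wb+"). Under ANSI C, fwrite and fread take void pointers, so the (char *) cast is not needed there.

```c
#include <assert.h>
#include <stdio.h>

struct sample {             /* hypothetical record type */
	int    id;
	double value;
};

/* Writes one struct to fp, rewinds, and reads it back into *out.
   Returns 1 on success, 0 on any I/O failure. */
int roundtrip(FILE *fp, const struct sample *in, struct sample *out)
{
	assert(fp != NULL && in != NULL && out != NULL);

	if (fwrite(in, sizeof *in, 1, fp) != 1)
		return 0;
	rewind(fp);
	if (fread(out, sizeof *out, 1, fp) != 1)
		return 0;
	return 1;
}
```

The file produced this way reflects the writing machine's byte order, type sizes, and padding, which is why it is unlikely to be readable elsewhere.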

9.7: I came across some code that declared a structure like this:

	struct name
		{
		int namelen;
		char name[1];
		};

and then did some tricky allocation to make the name array act like it had several elements. Is this legal and/or portable?

This technique is popular, although Dennis Ritchie has called it "unwarranted chumminess with the C implementation." An ANSI Interpretation Ruling has deemed it (more precisely, access beyond the declared size of the name field) to be not strictly conforming, although a thorough treatment of the arguments surrounding the legality of the technique is beyond the scope of this list. It seems, however, to be portable to all known implementations. (Compilers which check array bounds carefully might issue warnings.)

To be on the safe side, it may be preferable to declare the variable-size element very large, rather than very small; in the case of the above example:

	...
	char name[MAXSIZE];
	...

where MAXSIZE is larger than any name which will be stored. (The trick so modified is said to be in conformance with the Standard.)
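
For reference, the "tricky allocation" for the original name[1] form usually looks something like this sketch. The function name makename is hypothetical, and the code relies on exactly the access beyond the declared array size that is discussed above, so it is shown to explain the technique rather than to recommend it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct name {
	int  namelen;
	char name[1];   /* treated as namelen + 1 chars via over-allocation */
};

/* Allocates a struct name with room for a copy of str.
   Returns NULL if malloc fails; the caller frees the result. */
struct name *makename(const char *str)
{
	size_t len;
	struct name *ret;

	assert(str != NULL);
	len = strlen(str);
	/* sizeof(struct name) already covers name[0], which pays for
	   the terminating '\0' */
	ret = malloc(sizeof(struct name) + len);
	if (ret != NULL) {
		ret->namelen = (int)len;
		strcpy(ret->name, str);   /* writes past the declared size */
	}
	return ret;
}
```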

References: ANSI Rationale Sec. 3.5.4.2 pp. 54-5.

9.8: How can I determine the byte offset of a field within a structure?

ANSI C defines the offsetof macro, which should be used if available; see <stddef.h>. If you don't have it, a suggested implementation is

	#define offsetof(type, mem) ((size_t) \
		((char *)&((type *) 0)->mem - (char *)((type *) 0)))

This implementation is not 100% portable; some compilers may legitimately refuse to accept it.

See the next question for a usage hint.

References: ANSI Sec. 4.1.5, Rationale Sec. 3.5.4.2 p. 55.

9.9: How can I access structure fields by name at run time?

Build a table of names and offsets, using the offsetof() macro. The offset of field b in struct a is

	offsetb = offsetof(struct a, b);

If structp is a pointer to an instance of this structure, and b is an int field with offset as computed above, b's value can be set indirectly with

	*(int *)((char *)structp + offsetb) = value;
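
Putting the pieces together, a table-driven version might look like the following sketch. The fieldinfo type, the fields table, and set_int_field are hypothetical names, and the table assumes every listed field is an int; a fuller version would also record each field's type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct a {                  /* hypothetical struct with int fields */
	int b;
	int c;
};

struct fieldinfo {
	const char *name;
	size_t      offset;
};

/* Table of names and offsets, built with offsetof(). */
static const struct fieldinfo fields[] = {
	{ "b", offsetof(struct a, b) },
	{ "c", offsetof(struct a, c) }
};

/* Sets the int field called name in *structp to value.
   Returns 1 on success, 0 if the name is not in the table. */
int set_int_field(struct a *structp, const char *name, int value)
{
	size_t i;

	assert(structp != NULL && name != NULL);
	for (i = 0; i < sizeof fields / sizeof fields[0]; i++) {
		if (strcmp(fields[i].name, name) == 0) {
			*(int *)((char *)structp + fields[i].offset) = value;
			return 1;
		}
	}
	return 0;
}
```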

9.10: Why does sizeof report a larger size than I expect for a structure type, as if there were padding at the end?

Structures may have this padding (as well as internal padding; see also question 9.5), so that alignment properties will be preserved when an array of contiguous structures is allocated.
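
The effect can be observed with a sketch like this one. The padded type and both functions are hypothetical, and the amount of padding is implementation-defined: it is often 7 bytes where double is 8 bytes wide and 8-byte aligned, but it may be smaller, or zero, elsewhere.

```c
#include <assert.h>
#include <stddef.h>

struct padded {             /* hypothetical example type */
	double d;
	char   c;
};

/* Bytes in struct padded not occupied by either member. */
size_t padding_bytes(void)
{
	return sizeof(struct padded) - (sizeof(double) + sizeof(char));
}

/* In an array, each element begins exactly sizeof(struct padded)
   bytes after the previous one, so the trailing padding is what
   keeps every d member properly aligned. */
size_t element_stride(void)
{
	struct padded arr[2];
	size_t stride = (size_t)((char *)&arr[1] - (char *)&arr[0]);

	assert(stride == sizeof(struct padded));
	return stride;
}
```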

9.11: My compiler is leaving holes in structures, which is wasting space and preventing "binary" I/O to external data files. Can I turn off the padding, or otherwise control the alignment of structs?

Your compiler may provide an extension to give you this control (perhaps a #pragma), but there is no standard method. See also question 17.3.

9.12: Can I initialize unions?

ANSI Standard C allows an initializer for the first member of a union. There is no standard way of initializing the other members (nor, under a pre-ANSI compiler, is there generally any way of initializing any of them).
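
A brief sketch, using a hypothetical union number type: the brace-enclosed initializer applies to the first member, and any other member has to be set by ordinary assignment afterwards.

```c
#include <assert.h>

union number {              /* hypothetical example type */
	int    i;               /* first member: the one an initializer sets */
	double d;
};

/* The initializer applies to the first member, i. */
static union number n = { 42 };

int first_member(void)
{
	return n.i;
}

/* Any other member is set by assignment, not by the initializer. */
double set_other_member(double value)
{
	union number u = { 0 };   /* initializes u.i */

	assert(u.i == 0);
	u.d = value;              /* d is now the member holding a value */
	return u.d;
}
```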

9.13: How can I pass constant values to routines which accept struct arguments?

C has no way of generating anonymous struct values. You will have to use a temporary struct variable.
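
The temporary-variable approach might be sketched as follows. The point type and the manhattan and call_with_constant functions are hypothetical.

```c
#include <assert.h>

struct point { int x, y; };   /* hypothetical example type */

/* Hypothetical routine that accepts a struct argument by value. */
int manhattan(struct point p)
{
	return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}

int call_with_constant(void)
{
	/* A named, initialized temporary stands in for the "constant". */
	static const struct point tmp = { 3, -4 };
	int d = manhattan(tmp);

	assert(d >= 0);
	return d;
}
```

Declaring the temporary static const means it is initialized once and cannot be modified by accident; a plain automatic variable would serve equally well.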

