C# - The Boring Parts: Casts and Type Conversions
Casts are confusing in C#.
The syntax itself is simple: (Bar)foo
casts the value foo to the type Bar. But this means two very different things for reference types and for value types.
For numeric value types, a cast performs a data conversion which may lead to data loss. For reference types, a cast does not convert anything; it just performs a runtime check that the types are compatible. Since the syntax is the same in both cases, this difference in behavior is unexpected.
Let's see how it differs:
Casts for reference types
Every object has an actual type. But any variable or parameter referencing an object also has a type, which might be the same as the actual type but could also be a more general type.
Let's say we have this type hierarchy:
      [Control]
       |     |
[Button]   [TextBox]
Control is either a superclass or an interface, and it has two subtypes: Button and TextBox. So we can have this code:
Button b = new Button();
Control c = b;
Line 1 creates an object with the exact type of Button. Line 2 assigns a reference to this object to the variable c.
The variables b and c now reference the same object, which has the actual type “Button”. But the variable c has the more general type Control.
A variable of one type can hold a reference to an object with a different actual type as long as the types are compatible.
So there is an implicit cast from Button to Control when assigning to c. This is allowed and does not require the cast syntax, since it is guaranteed to succeed, because a Button is guaranteed to be a Control.
We can check the actual type at runtime:
Console.Write(c.GetType().Name); // Writes “Button”
So GetType() always returns the actual type of the object, even if the variable holding the reference has a different type.
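The following is a minimal sketch of the upcast and the GetType() check as a complete program. The empty Control, Button and TextBox classes are hypothetical stand-ins for the hierarchy above, not the real UI classes, and ActualTypeName is a helper invented for the illustration.

```csharp
using System;

// Hypothetical stand-ins for the hierarchy in the text (assumption).
class Control { }
class Button : Control { }
class TextBox : Control { }

static class UpcastDemo
{
    // The parameter has the general type Control, but GetType()
    // reports the actual type of the object passed in.
    public static string ActualTypeName(Control c) => c.GetType().Name;

    static void Main()
    {
        Button b = new Button();
        Control c = b; // implicit upcast, no cast syntax needed

        Console.WriteLine(ActualTypeName(c));             // Button
        Console.WriteLine(ActualTypeName(new TextBox())); // TextBox
    }
}
```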
But we cannot just assign back to a more specific type:
Button d = c; // TYPE ERROR
Because c has type Control, it cannot safely be assigned to d, since there is no guarantee it is a Button. It could be a TextBox! So the compiler will not allow the assignment. But in this case we know better than the compiler – we know that c is guaranteed to be a Button. So we use the cast syntax to indicate we know what we are doing:
Button d = (Button)c;
The compiler accepts this, but C# is still paranoid enough to check at runtime, when the cast is executed, that the actual type really is a Button. If not, it throws an InvalidCastException.
The important point is that the object in question is not altered in any way by these casts.
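As a rough sketch of both points, the program below tries the downcast on two different objects. It reuses hypothetical empty classes for the hierarchy, and CanCastToButton is a helper made up for this example. When the cast succeeds, ReferenceEquals confirms that the result is the very same object; when it fails, the runtime check throws.

```csharp
using System;

// Hypothetical stand-ins for the hierarchy in the text (assumption).
class Control { }
class Button : Control { }
class TextBox : Control { }

static class DowncastDemo
{
    // Attempts the downcast and reports whether it succeeded.
    public static bool CanCastToButton(Control c)
    {
        try
        {
            Button b = (Button)c; // the runtime type check happens here
            // A successful cast yields a reference to the same, unmodified object.
            return ReferenceEquals(b, c);
        }
        catch (InvalidCastException)
        {
            // The actual type was not Button (or a subtype of it).
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(CanCastToButton(new Button()));  // True
        Console.WriteLine(CanCastToButton(new TextBox())); // False
    }
}
```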
Between classes, casting is only allowed upwards and downwards in the type hierarchy. Any other cast is forbidden.
For example:
Button b = new Button();
TextBox c = (TextBox)b;
Even with the explicit cast, the compiler will not allow this, since this would always fail at runtime anyway.
So there are three kinds of casts:
- Safe cast: a cast “upwards” in the hierarchy does not require the cast syntax.
- Unsafe cast: a cast “downwards” in the hierarchy requires the cast syntax and may fail at runtime if the object has a different actual type than expected.
- Illegal cast: any cast which is not directly up or down (e.g. a cast to a sibling class) is disallowed by the compiler.
Casts for value types
Value types do not have the distinction between variable type and actual type. The type of the variable or parameter is always the exact type of the value. When a value is assigned to a variable of a different value type, a conversion into the target type happens.
A numeric type has a numeric range. If the target type has the same or a larger numeric range, the conversion is safe. But if the target has a smaller range, data loss happens if the value is outside the range of the target. Therefore the compiler only allows implicit conversion to larger ranges. E.g.:
int x = 17;
long y = x; // OK, larger range. int is 32 bits and long is 64 bits
short z = x; // Compile error: short has a smaller range
If we know what we are doing and are prepared to deal with the consequences, we can use the cast syntax to tell the compiler to go through with the conversion anyway:
short z = (short)x;
But if the value is outside the range of the target type, bizarre results may follow:
int x = -27;
ulong y = (ulong)x;
The value of y is now 18446744073709551589. Surprised? This is a consequence of how negative integers are stored (two's complement): the bit pattern is kept and simply reinterpreted as an unsigned number, giving 2^64 - 27. Suffice to say you should be very sure you know what you are doing when casting numeric value types!
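To make the wrap-around concrete, here is a small sketch with a few out-of-range values. The class and method names are invented for the illustration, and the unchecked keyword is used only to make the default behavior explicit, so the result does not depend on compiler settings.

```csharp
using System;

static class NarrowingDemo
{
    // unchecked spells out the default: bits that do not fit are silently dropped.
    public static short ToShort(int value) => unchecked((short)value);
    public static ulong ToULong(int value) => unchecked((ulong)value);

    static void Main()
    {
        Console.WriteLine(ToShort(17));    // 17 (in range, value preserved)
        Console.WriteLine(ToShort(40000)); // -25536 (40000 - 65536)
        Console.WriteLine(ToShort(70000)); // 4464 (70000 - 65536)

        ulong y = ToULong(-27);
        Console.WriteLine(y);               // 18446744073709551589
        Console.WriteLine(y.ToString("X")); // FFFFFFFFFFFFFFE5
    }
}
```

The hexadecimal output shows the sign-extended bit pattern of -27 read back as an unsigned 64-bit number.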
If you want a runtime safety check, you can use the “checked” keyword:
ulong y = checked( (ulong)x );
The cast will then perform a check and, if the value does not fit, throw an OverflowException with the message “Arithmetic operation resulted in an overflow”. This is arguably saner behavior (because seriously, who could ever want the current behavior?) but since it has performance implications it is not the default.
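One way this might be wrapped up is sketched below. TryToULong is a hypothetical helper, not a framework method; it only shows the checked cast succeeding for a value in range and throwing for a negative one.

```csharp
using System;

static class CheckedDemo
{
    // Hypothetical helper: returns false instead of letting the exception escape.
    public static bool TryToULong(int value, out ulong result)
    {
        try
        {
            result = checked((ulong)value); // throws if value is negative
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    static void Main()
    {
        bool ok = TryToULong(17, out ulong converted);
        Console.WriteLine(ok);        // True
        Console.WriteLine(converted); // 17

        Console.WriteLine(TryToULong(-27, out _)); // False
    }
}
```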
A further caveat is that you can have a loss of precision without an explicit cast. A long can be implicitly converted to float because the numerical range of float is larger. But some precision may be lost.
Example:
long l = 20000001;
float f = l;
Console.WriteLine(f); // Writes 20000000
Note the last digit is lost! So even without a cast, you can have information loss. But it will not change the general magnitude of the number like the unsafe casts may.
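A quick way to see what was lost is to convert back and compare. The sketch below assumes a current .NET runtime, and RoundTrip is a name made up for the example. A float has a 24-bit significand, so whole numbers above 2^24 = 16777216 can no longer all be represented exactly.

```csharp
using System;

static class PrecisionDemo
{
    // long -> float is implicit; float -> long requires the cast syntax.
    public static long RoundTrip(long value)
    {
        float f = value; // no cast needed, yet precision may be lost here
        return (long)f;
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip(16777216)); // 16777216 (2^24 is still exact)
        Console.WriteLine(RoundTrip(16777217)); // 16777216 (rounded to a representable value)
        Console.WriteLine(RoundTrip(20000001)); // 20000000
    }
}
```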
Summary of differences
Effect on the value: Casting reference types will not alter the actual object in any way. Casting numeric types will convert the values, possibly leading to data loss.
Safety: A reference type cast is always verified at runtime. A value type cast is not verified at runtime by default and may lead to data loss or data corruption. (Although the checked keyword can be applied to perform a runtime check.)
Allowed casts: For reference types, the type hierarchy determines which casts are safe, unsafe and invalid. For numeric value types this is determined by the numerical ranges, which are independent of the type hierarchy. (Indeed all value types are siblings in the type hierarchy as subtypes of ValueType, but this does not affect casting semantics.)