What's New in C# 4.0
Feature Categories
Microsoft breaks the new features into the following four categories
so I will maintain the pattern:
- Named and Optional Parameters
- Dynamic Support
- Variance
- COM Interop
Conventions
Some of the examples assume the following
classes are defined:
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}
public class Customer : Person
{
    public int CustomerId { get; set; }
    public void Process() { ... }
}
public class SalesRep : Person
{
    public int SalesRepId { get; set; }
    public void SellStuff() { ... }
}
Named and Optional Parameters
We'll start off with one of the easier
features to explain. In fact, if you have ever used Visual Basic, then you are
probably already familiar with it.
Optional Parameters
Support for optional parameters allows
you to give a method parameter a default value so that you do not have to
specify it every time you call the method. This comes in handy when you have
overloaded methods that are chained together.
The Old Way
public void Process( string data )
{
    Process( data, false );
}
public void Process( string data, bool ignoreWS )
{
    Process( data, ignoreWS, null );
}
public void Process( string data, bool ignoreWS, ArrayList moreData )
{
    // Actual work done here
}
The reason for overloading Process in this way is to avoid always having to
include "false, null" in the third method call. Suppose 99% of the time there
will not be 'moreData' provided. It seems ridiculous to type and pass null so
many times.
// These 3 calls are equivalent
Process( "foo", false, null );
Process( "foo", false );
Process( "foo" );
The New Way
public void Process( string data, bool ignoreWS = false, ArrayList moreData = null )
{
    // Actual work done here
}
// Note: data must always be provided because it does not have a default value
Now we have one method instead of three, but the three ways we called Process above are still valid and still equivalent.
ArrayList myArrayList = new ArrayList();
Process( "foo" ); // valid
Process( "foo", true ); // valid
Process( "foo", false, myArrayList ); // valid
Process( "foo", myArrayList ); // Invalid! See next section
Awesome, one less thing VB programmers
can brag about having to themselves. I haven't mentioned it up to this point,
but Microsoft has explicitly declared that VB and C# will be
"co-evolving" so the number of disparate features is guaranteed to
shrink over time. I would like to think this will render the VB vs. C# question
moot, but I'm sure people will still find a way to argue about it. ;-)
Named Parameters
In the last example, we saw that the
following call was invalid:
Process( "foo", myArrayList ); // Invalid!
But if the Boolean ignoreWS is optional, why can't we just omit it? One reason
is readability and maintainability, but the main one is that it can become
impossible to know which parameter you are specifying. If you had two
parameters of the same type, or if one of the parameters was "object" or some
other base class or interface, the compiler would not know which parameter you
meant. Imagine a method with ten optional parameters and you give it a single
ArrayList. Since an ArrayList is also an object, an IList, and an IEnumerable,
it is impossible to determine how to use it. Yes, the compiler could just pick
the first valid option for each parameter (or a more complex system could be
used), but this would become impossible for people to maintain and would cause
countless programming mistakes.
Named parameters provide the solution:
ArrayList myArrayList = new ArrayList();
Process( "foo", true ); // valid, moreData omitted
Process( "foo", true, myArrayList ); // valid
Process( "foo", moreData: myArrayList); // valid, ignoreWS omitted
Process( "foo", moreData: myArrayList, ignoreWS: false ); // valid, but silly
As long as a parameter has a default value, it can be omitted, and you can just supply the parameters you want via their name. Note in the second line above, the 'true' value for ignoreWS did not have to be named since it is the next logical parameter.
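To see the rules above in one place, here is a minimal sketch. The Process method in it is a hypothetical stand-in for the one in this article: instead of doing real work, it returns a string describing the arguments it received, so the effect of each call is visible.

```csharp
using System;
using System.Collections;

public static class NamedOptionalDemo
{
    // Hypothetical stand-in for Process: reports which values it received.
    public static string Process(string data, bool ignoreWS = false, ArrayList moreData = null)
    {
        string count = (moreData == null) ? "null" : moreData.Count.ToString();
        return data + "|" + ignoreWS + "|" + count;
    }

    public static void Main()
    {
        ArrayList extra = new ArrayList { 1, 2, 3 };

        Console.WriteLine(Process("foo"));                                  // foo|False|null
        Console.WriteLine(Process("foo", true));                            // foo|True|null
        Console.WriteLine(Process("foo", moreData: extra));                 // foo|False|3
        Console.WriteLine(Process("foo", moreData: extra, ignoreWS: true)); // foo|True|3
    }
}
```

Note that the positional argument ("foo") comes first and the named arguments follow it; the named ones can then appear in any order.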
Dynamic Support
public object GetCustomer()
{
    Customer cust = new Customer();
    ...
    return cust;
}
...
Customer cust = GetCustomer() as Customer;
if( cust != null )
{
    cust.FirstName = "foo";
}
Note the GetCustomer method returns object instead of Customer. Code like this is frustrating because you know it returns a Customer; it always has and it always will. Unfortunately, the coder chose to return object and you can't change it because it modifies the public contract and could potentially break legacy software.
Another instance in which you will be
dealing with an object that you know is another type is Reflection.
Type myType = typeof( Customer );
ConstructorInfo consInfo = myType.GetConstructor(new Type[]{});
object cust = consInfo.Invoke(new object[]{});
((Customer)cust).FirstName = "foo";
Because Reflection can act on any type, ConstructorInfo.Invoke() must return object. Like the first example, this forces you to cast the object. Now, consider the situation where you can't, or don't want to, cast the object. Perhaps, the code author is always changing the name of the type or creating different versions (e.g., 'Customer2'), but the properties and methods stay the same. The examples above assume you, as the programmer, have knowledge of what the true type is. What if you didn't? What if you had to use Reflection to find and invoke methods? What if the object being returned was coming from IronPython, JavaScript, COM, or some other non-statically typed environment?
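For comparison, this is roughly what the "find it with Reflection" approach looks like when you only know the member name. It is a sketch under assumptions: Customer2 and the SetFirstName helper are hypothetical names made up for this illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical type whose name keeps changing while its members stay the same.
public class Customer2
{
    public string FirstName { get; set; }
}

public static class ReflectionDemo
{
    // Sets FirstName on any object that has such a property, without a cast,
    // then reads the value back to show that the assignment took effect.
    public static string SetFirstName(object target, string value)
    {
        PropertyInfo prop = target.GetType().GetProperty("FirstName");
        if (prop == null)
        {
            throw new InvalidOperationException("No FirstName property found.");
        }
        prop.SetValue(target, value, null);
        return (string)prop.GetValue(target, null);
    }

    public static void Main()
    {
        object cust = new Customer2();
        Console.WriteLine(SetFirstName(cust, "foo")); // foo
    }
}
```

It works, but every member access turns into several lines of lookup code, and a misspelled member name only shows up at run time.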
Enter 'dynamic'
The dynamic keyword is new to C# 4.0, and is used to tell the compiler that a variable's type can change or that it is not known until runtime. Think of it as being able to interact with an Object without having to cast it.
dynamic cust = GetCustomer();
cust.FirstName = "foo"; // works as expected
cust.Process(); // works as expected
cust.MissingMethod(); // No method found!
Notice we did not need to cast cust or declare it as type Customer. Because we declared it dynamic, the runtime takes over and then searches and sets the FirstName property for us. Now, of course, when you are using a dynamic variable, you are giving up compiler type checking. This means the call cust.MissingMethod() will compile and not fail until runtime. The result of this operation is a RuntimeBinderException because MissingMethod is not defined on the Customer class.
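If you want to try the failure case yourself, the following self-contained sketch catches the exception. The MissingMethodThrows helper and the trimmed-down Customer class are assumptions made for this illustration; RuntimeBinderException lives in the Microsoft.CSharp.RuntimeBinder namespace.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public class Customer
{
    public string FirstName { get; set; }
}

public static class DynamicDemo
{
    // Returns object on purpose, mirroring the GetCustomer method above.
    public static object GetCustomer()
    {
        return new Customer();
    }

    // Hypothetical helper: reports whether calling MissingMethod() fails at run time.
    public static bool MissingMethodThrows()
    {
        dynamic cust = GetCustomer();
        cust.FirstName = "foo";   // resolved at run time, no cast needed
        try
        {
            cust.MissingMethod(); // compiles, but Customer has no such member
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(MissingMethodThrows()); // True
    }
}
```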
The example above shows how dynamic works when calling methods and properties. Another powerful (and potentially dangerous) feature is being able to reuse variables for different types of data. I'm sure the Python, Ruby, and Perl programmers out there can think of a million ways to take advantage of this, but I've been using C# so long that it just feels "wrong" to me.
dynamic foo = 123;
foo = "bar";
OK, so you most likely will not be writing code like the above very often. There may be times, however, when variable reuse can come in handy or clean up a dirty piece of legacy code. One simple case I run into often is constantly having to cast between decimal and double.
decimal foo = GetDecimalValue();
foo = foo / 2.5; // Does not compile
foo = Math.Sqrt(foo); // Does not compile
string bar = foo.ToString("c");
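The second line does not compile because 2.5 is typed as a double, and line 3 does not compile because Math.Sqrt expects a double. Obviously, all you have to do is cast and/or change your variable type. It is tempting to reach for dynamic instead, but be aware that the runtime binder applies the same C# conversion rules to the runtime types, so mixing decimal and double still fails; the error just moves from compile time to run time. Variable reuse with dynamic does work where an implicit conversion exists, for example when the value starts out as an int.
dynamic foo = GetDecimalValue(); // still returns a decimal
foo = foo / 2.5; // compiles, but throws RuntimeBinderException (decimal / double)
foo = Math.Sqrt(foo); // would also throw: no implicit decimal-to-double conversion
string bar = foo.ToString("c");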
Update
When you use the dynamic keyword, you
are invoking the new Dynamic Language Runtime libraries (DLR) in the .NET
framework. Also, when possible, you should always cast your objects and take
advantage of type checking. The examples above were meant to show how dynamic
works and how you can create an example to test it.
We have learned that if the object you
declared as dynamic is a plain CLR object, Reflection will be used to locate
members and not the DLR.
Switching Between
Static and Dynamic
It should be apparent that 'switching'
an object from being statically typed to dynamic is easy. After all, how hard
is it to 'lose' information? Well, it turns out that going from dynamic to
static is just as easy.
Customer cust = new Customer();
dynamic dynCust = cust; // static to dynamic, easy enough
dynCust.FirstName = "foo";
Customer newCustRef = dynCust; // works because dynCust is a Customer
Person person = dynCust; // works because Customer inherits from Person
SalesRep rep = dynCust; // throws RuntimeBinderException
Note that in the example above, no
matter how many different ways we reference it, we only have one Customer
object (cust).
Functions
When you return something from a
dynamic function call, indexer, etc., the result is always dynamic. Note that
you can, of course, cast the result to a known type, but the object still
starts out dynamic.
dynamic cust = GetCustomer();
string first = cust.FirstName; // conversion occurs
dynamic id = cust.CustomerId; // no conversion
object last = cust.LastName; // conversion occurs
There are, of course, a few missing features when it comes to dynamic types. Among them are:
- Extension methods are not supported
- Anonymous functions cannot be used as parameters
We will have to wait for the final version to see what other features get added or removed.
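Both limitations have a simple workaround: go back to a static type before calling the extension method, and give the lambda a delegate type first. The sketch below illustrates this; the helper names (FirstViaDynamicThrows, CountEvens) are hypothetical and exist only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicLimitsDemo
{
    // Extension methods are not found by the runtime binder, so calling
    // First() on a dynamic List<int> fails at run time.
    public static bool FirstViaDynamicThrows()
    {
        dynamic numbers = new List<int> { 1, 2, 3, 4 };
        try
        {
            numbers.First();
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;
        }
    }

    // Workaround: store the lambda in a typed delegate and cast the dynamic
    // value to a static type, so the call is bound at compile time again.
    public static int CountEvens()
    {
        dynamic numbers = new List<int> { 1, 2, 3, 4 };
        Func<int, bool> isEven = n => n % 2 == 0;
        IEnumerable<int> typed = (IEnumerable<int>)numbers;
        return typed.Where(isEven).Count();
    }

    public static void Main()
    {
        Console.WriteLine(FirstViaDynamicThrows()); // True
        Console.WriteLine(CountEvens());            // 2
    }
}
```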
Variance
OK, a quick quiz. Is the following
legal in .NET?
// Example stolen from the whitepaper ;-)
IList<string> strings = new List<string>();
IList<object> objects = strings;
I think most of us, at first, would answer
'yes' because a string is an object. But the question we should be asking
ourselves is: Is a -list- of strings a -list- of objects? To take it further:
Is a -strongly typed- list of strings a -strongly typed- list of objects? When
phrased that way, it's easier to understand why the answer to the question is
'no'. If the above example was legal, that means the following line would
compile:
objects.Add(123);
Oops, we just inserted the integer value 123 into a List<string>. Remember,
the list contents were never copied; we simply have two references to the same
list. There is, however, a case in which this conversion should be allowed: if
the list is read-only, then we should be allowed to view the contents any
(type legal) way we want.
Co and Contra Variance
Within the type system of a programming language, a type conversion operator is:
- covariant if it preserves the ordering, ≤, of types, which orders types from more specific to more generic;
- contravariant if it reverses this ordering, which orders types from more generic to more specific;
- invariant if neither of these apply.
C# assignment is, of course, covariant in this sense, meaning a Customer is a Person and can always be referenced as one. There are lots of discussions on this topic, and I will not cover it here. The changes in C# 4.0 only involve generic interfaces and delegates in situations like the example above. In order to support covariance and contravariance, generic interfaces are going to be given 'input' and 'output' sides. So, to make the example above legal, IList would have to be declared in the following manner:
public interface IList<out T> : ICollection<T>, IEnumerable<T>, IEnumerable
{
    ...
}
Notice the use of the out keyword. This is essentially saying that IList is
read-only and that it is safe to refer to an IList<string> as an
IList<object>. Now, of course, IList is not going to be defined this way; it
must support having items added to it. A better example to consider is
IEnumerable, which should be, and is, read-only.
public interface IEnumerable<out T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}
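Because IEnumerable only hands items out, the quiz from the start of this section becomes legal as soon as the read-only interface is used in place of IList. A minimal sketch (the CountObjects helper is a hypothetical name used only here):

```csharp
using System;
using System.Collections.Generic;

public static class ReadOnlyViewDemo
{
    // Views a List<string> through IEnumerable<object> and counts the items.
    public static int CountObjects()
    {
        List<string> strings = new List<string> { "a", "b" };

        // Legal from C# 4.0 on: nothing can be added through this reference,
        // so the List<string> can never end up holding a non-string.
        IEnumerable<object> objects = strings;

        int count = 0;
        foreach (object item in objects)
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountObjects()); // 2
    }
}
```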
Using out to basically mean 'read only' is straightforward, but when is using
the in keyword to make something 'write only' useful? Well, it becomes useful
in situations where a generic argument is expected and only used internally by
the method. IComparer is the canonical example.
public interface IComparer<in T>
{
    int Compare(T left, T right);
}
As you can see, we can't get back an item of type T. Even though the Compare method could potentially act on the left and right arguments, it is kept within the method so it is a 'black hole' to clients that use the interface.
To continue the example above, this means that an IComparer<object> can be used in the place of an IComparer<string>. The C# 4.0 whitepaper sums the reason up nicely: 'If a comparer can compare any two objects, it can certainly also compare two strings'. This is counter-intuitive (or maybe contra-intuitive) because if a method expects a string, you can't give it an object.
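Here is that idea as a runnable sketch. TextLengthComparer is a hypothetical comparer invented for this illustration: it can order any two objects, so it can be handed to code that expects an IComparer<string>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders any two objects by the length of their text.
public class TextLengthComparer : IComparer<object>
{
    public int Compare(object left, object right)
    {
        return left.ToString().Length.CompareTo(right.ToString().Length);
    }
}

public static class ContravarianceDemo
{
    // Sorts strings with a comparer that was written for plain objects.
    public static string Shortest()
    {
        List<string> words = new List<string> { "pear", "fig", "banana" };

        // Legal from C# 4.0 on because IComparer is declared with 'in T'.
        IComparer<string> comparer = new TextLengthComparer();

        words.Sort(comparer);
        return words[0];
    }

    public static void Main()
    {
        Console.WriteLine(Shortest()); // fig
    }
}
```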
Putting it Together
OK, comparing strings and objects is great, but I think a somewhat realistic
example might help clarify how the new variance keywords are used. This first
example demonstrates the effects of the redefined IEnumerable interface in C#
4.0. In .NET 3.5, line 3 below does not compile and fails with an error along
the lines of 'cannot convert List<Customer> to IEnumerable<Person>'. As stated
above, this seems 'wrong' because a Customer is a Person. In .NET 4.0, however,
this exact same code compiles without any changes because IEnumerable is now
defined with the out modifier.
MyInterface<Customer> customers = new MyClass<Customer>();
List<Person> people = new List<Person>();
people.AddRange(customers.GetAllTs()); // no in 3.5, yes in 4.0
people.Add(customers.GetAllTs()[0]); // yes in both
...
interface MyInterface<T>
{
    List<T> GetAllTs();
}
public class MyClass<T> : MyInterface<T>
{
    public List<T> GetAllTs()
    {
        return _data;
    }
    private List<T> _data = new List<T>();
}
This next example demonstrates how you can take advantage of the out keyword.
In .NET 3.5, line 3 compiles, but line 4 does not, failing with the same
'cannot convert' error. To make this work in .NET 4.0, simply change the
declaration of MyInterface to interface MyInterface<out T>. Notice that in
line 4, T is Person, but we are passing the Customer version of the class and
interface.
MyInterface<Person> people = new MyClass<Person>();
MyInterface<Customer> customers = new MyClass<Customer>();
FooClass<Person>.GetThirdItem(people);
FooClass<Person>.GetThirdItem(customers);
...
public class FooClass<T>
{
    public static T GetThirdItem(MyInterface<T> foo)
    {
        return foo.GetItemAt(2);
    }
}
public interface MyInterface<out T>
{
    T GetItemAt(int index);
}
public class MyClass<T> : MyInterface<T>
{
    public T GetItemAt(int index)
    {
        return _data[index];
    }
    private List<T> _data = new List<T>();
}
This final example demonstrates the wacky logic of contravariance. Notice that
we put a SalesRep 'inside' our Person interface. This isn't a problem because a
SalesRep is a Person. Where it gets interesting is when we pass the
MyInterface<Person> to FooClass<Customer>. In essence, we have 'inserted' a
SalesRep into an interface declared to work with only Customers! In .NET 3.5,
line 5 does not compile, as expected. By adding the in keyword to our interface
declaration in .NET 4.0, everything works fine because we are 'agreeing' to
treat everything as a Person internally and not expose the internal data (which
might be that SalesRep).
MyInterface<Customer> customer = new MyClass<Customer>();
MyInterface<Person> person = new MyClass<Person>();
person.SetItem(new SalesRep());
FooClass<Customer>.Process(customer);
FooClass<Customer>.Process(person);
...
public class FooClass<T>
{
    public static void Process(MyInterface<T> obj)
    {
    }
}
public interface MyInterface<in T>
{
    void SetItem(T obj);
    void Copy(T obj);
}
public class MyClass<T> : MyInterface<T>
{
    public void SetItem(T obj)
    {
        _item = obj;
    }
    private T _item;
    public void Copy(T obj)
    {
    }
}
COM Interop
This is by far the area in which I have the least experience; however, I'm sure we have all had to interact with Microsoft Office at one point and make calls like this:
// Code simplified for this example
using Microsoft.Office.Interop;
using Microsoft.Office.Interop.Word;
object foo = "MyFile.txt";
object bar = Missing.Value;
object optional = Missing.Value;
Document doc = (Document)Application.GetDocument(ref foo, ref bar, ref optional);
doc.CheckSpelling(ref optional, ref optional, ref optional, ref optional);
There are (at least) three problems with the code above. First, you have to declare all your variables as objects and pass them with the ref keyword. Second, you can't omit parameters and must also pass the Missing.Value even if you are not using the parameter. And third, behind the scenes, you are using huge (in file size) interop assemblies just to make one method call.
C# 4.0 will allow you to write the code above in a much simpler form that ends up looking almost exactly like 'normal' C# code. This is accomplished by using some of the features already discussed; namely dynamic support and optional parameters.
// Again, simplified for example.
using Microsoft.Office.Interop.Word;
var doc = Application.GetDocument("MyFile.txt");
doc.CheckSpelling();
Behind the scenes, only the interop code you are actually using will be
included with your application. This will cut down on application size
tremendously.