What could go wrong if we call an overridable method from within a constructor?
If you’ve used static code analysis tools then you may have come across a warning instructing you not to call virtual methods in constructors.
This particular warning is one that should not be ignored, and nor should it be suppressed without good reason. It is of particular relevance to those moving to C# from some other object oriented languages which do not behave in quite the same way (more on that later).
This is part of a series of posts featuring simple but informative code samples.
You can find a bit of background information and a list of the other posts in the series here.
Consider a simple example:
```csharp
using System;

class Program
{
    static void Main(string[] args)
    {
        var test = new DerivedClass(42);
    }
}

abstract class BaseClass
{
    protected BaseClass()
    {
        // Virtual call from a constructor: this is what CA2214 warns about.
        this.SaySomething();
    }

    protected virtual void SaySomething()
    {
        Console.WriteLine("Base class has nothing to say");
    }
}

class DerivedClass : BaseClass
{
    private int Number;

    public DerivedClass(int number)
    {
        this.Number = number;
    }

    protected override void SaySomething()
    {
        Console.WriteLine(String.Format("Derived class says Number = {0}", Number));
    }
}
```
If you haven’t seen this before, please scroll back up and take a moment to read through the code and decide what you think the output will be, before reading on.
When I gave this and other similar examples to people as part of a test, what surprised me was not that people got it wrong, but the variety of different incorrect answers.
This example is rather contrived, but hopefully illustrates the kind of potential problems that could arise, especially if you were writing the derived class without visibility of the base class source code.
If we compile this code and run the static code analysis in Visual Studio, we receive the warning CA2214: Do not call overridable methods in constructors
The message detail (and the build output window) contains the following:
```text
warning CA2214: Microsoft.Usage : 'BaseClass.BaseClass()' contains a call chain that results in a call to a virtual method defined by the class. Review the following call stack for unintended consequences:
warning CA2214: BaseClass..ctor()
warning CA2214: BaseClass.SaySomething():Void
warning CA2214: Microsoft.Usage : 'DerivedClass.DerivedClass()' contains a call chain that results in a call to a virtual method defined by the class. Review the following call stack for unintended consequences:
warning CA2214: DerivedClass..ctor()
warning CA2214: BaseClass.SaySomething():Void

Code Analysis Complete -- 0 error(s), 2 warning(s)
```
Now, did you read through the source code and work out what you think the output would be?
If your answer included `Base class has nothing to say`, then I’m afraid you’re wrong, and this might be a good opportunity to brush up on virtual members and what happens when they are called.
The next most common incorrect answer is probably: `Derived class says Number = 42`
However, the actual output from this program is: `Derived class says Number = 0`
When we create the new instance of DerivedClass, the parameterless constructor in BaseClass is called first. This in turn calls the virtual method `SaySomething()`.
When BaseClass calls `this.SaySomething()`, the call to the virtual method is directed to the overridden `SaySomething()` in DerivedClass.
This in turn prints the value of `Number` to the console output.
However, at this point the code in DerivedClass’s constructor has not yet executed, and so the value of `Number` is equal to `default(int)`, i.e. zero.
The value of `Number` in DerivedClass then gets set to the value we passed in to the constructor, but too late to affect the output.
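One way to avoid the problem is sketched below. It assumes the base class can be changed so that its constructor makes no virtual calls, and that the virtual call can wait until construction has finished. The `Create` factory method and the `Describe` method are hypothetical names introduced for this illustration; they are not part of the original example.

```csharp
using System;

abstract class BaseClass
{
    // The constructor no longer calls any virtual members.
    protected BaseClass() { }

    // Hypothetical virtual hook that derived classes override.
    protected virtual string Describe()
    {
        return "Base class has nothing to say";
    }

    // Non-virtual entry point, meant to be called only after construction.
    public void SaySomething()
    {
        Console.WriteLine(Describe());
    }
}

class DerivedClass : BaseClass
{
    private readonly int Number;

    // Private constructor: callers go through Create instead.
    private DerivedClass(int number)
    {
        this.Number = number;
    }

    // Hypothetical factory: every constructor in the chain has finished
    // before the virtual call is made.
    public static DerivedClass Create(int number)
    {
        var instance = new DerivedClass(number);
        instance.SaySomething();
        return instance;
    }

    protected override string Describe()
    {
        return String.Format("Derived class says Number = {0}", Number);
    }
}

class Program
{
    static void Main(string[] args)
    {
        DerivedClass.Create(42); // prints "Derived class says Number = 42"
    }
}
```

The trade-off is that callers can no longer use `new DerivedClass(42)` directly, which may or may not be acceptable depending on how the class is consumed.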
If you are working on a new project from the outset and have no exceptional reason to violate this code analysis rule (or on an existing project with no current CA2214 warnings), then I would suggest adding CA2214 to the list of warnings that you ask Visual Studio to treat as errors.
Personally, when I have the luxury of beginning a project from scratch, I set Visual Studio to treat all warnings as errors on Release builds. If I encounter a warning for which there is a justifiable reason and the code cannot be written another way, then I add in-code suppression attributes with detailed justification notes.
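As a rough illustration of what such an in-code suppression might look like, the sketch below applies `SuppressMessageAttribute` from `System.Diagnostics.CodeAnalysis` to the constructor. The category and check ID follow the warning text shown earlier. The justification text and the `StatelessDerivedClass` type are hypothetical, chosen only to show a case where the virtual call does not read uninitialized state.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

abstract class BaseClass
{
    // Note: SuppressMessageAttribute is conditional on the CODE_ANALYSIS symbol,
    // so it is only compiled into the assembly when that symbol is defined.
    [SuppressMessage("Microsoft.Usage", "CA2214:DoNotCallOverridableMethodsInConstructors",
        Justification = "Hypothetical: every override lives in this assembly and reads no instance state.")]
    protected BaseClass()
    {
        this.SaySomething();
    }

    protected virtual void SaySomething()
    {
        Console.WriteLine("Base class has nothing to say");
    }
}

// Hypothetical derived class whose override does not depend on any fields.
class StatelessDerivedClass : BaseClass
{
    protected override void SaySomething()
    {
        Console.WriteLine("Derived class has no state to report");
    }
}

class Program
{
    static void Main(string[] args)
    {
        new StatelessDerivedClass(); // prints "Derived class has no state to report"
    }
}
```

A suppression like this is only as good as its justification: if someone later adds an override that reads a field, the warning will no longer be there to catch it.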
Returning to the code example given at the beginning, we can build in Debug mode and use reflection together with a few more console output lines in order to see the execution path of the code from the program output. (You could also step through line by line in the debugger, but this gives something to show here.)
I have also added two additional calls to the SaySomething method, one in the constructor for DerivedClass, and one to the BaseClass implementation from SaySomething in DerivedClass. This allows us to see the before and after effect more clearly.
```csharp
using System;
using System.Diagnostics;
using System.Reflection;

class Program
{
    static void Main(string[] args)
    {
        var test = new DerivedClass(42);
    }
}

abstract class BaseClass
{
    protected BaseClass()
    {
        var currentMethod = MethodInfo.GetCurrentMethod();
        var caller = (new StackFrame(1)).GetMethod();
        Console.WriteLine(String.Format("{0}.{1}() was called from {2}.{3}() while initializing {4}"
            , currentMethod.DeclaringType.Name
            , currentMethod.Name
            , caller.DeclaringType.Name
            , caller.Name
            , this.GetType().Name));

        this.SaySomething();
    }

    protected virtual void SaySomething()
    {
        var currentMethod = MethodInfo.GetCurrentMethod();
        var caller = (new StackFrame(1)).GetMethod();
        Console.WriteLine(String.Format("{0}.{1}() was called from {2}.{3}()"
            , currentMethod.DeclaringType.Name
            , currentMethod.Name
            , caller.DeclaringType.Name
            , caller.Name));

        Console.WriteLine(String.Format("{0} has nothing to say", this.GetType().Name));
    }
}

class DerivedClass : BaseClass
{
    private int Number;

    public DerivedClass(int number)
    {
        this.Number = number;

        var currentMethod = MethodInfo.GetCurrentMethod();
        var caller = (new StackFrame(1)).GetMethod();
        Console.WriteLine(String.Format("{0}.{1}() was called from {2}.{3}() while initializing {4}"
            , currentMethod.DeclaringType.Name
            , currentMethod.Name
            , caller.DeclaringType.Name
            , caller.Name
            , this.GetType().Name));

        // Additional call to SaySomething, which was not present in the first example
        this.SaySomething();
    }

    protected override void SaySomething()
    {
        var currentMethod = MethodInfo.GetCurrentMethod();
        var caller = (new StackFrame(1)).GetMethod();
        Console.WriteLine(String.Format("{0}.{1}() was called from {2}.{3}()"
            , currentMethod.DeclaringType.Name
            , currentMethod.Name
            , caller.DeclaringType.Name
            , caller.Name));

        Console.WriteLine(String.Format("{0} says Number = {1}", this.GetType().Name, Number));

        // Additional call to SaySomething in BaseClass, which was not present in the first example
        base.SaySomething();
    }
}
```
When we run the above code, the output to the console is:
```text
BaseClass..ctor() was called from DerivedClass..ctor() while initializing DerivedClass
DerivedClass.SaySomething() was called from BaseClass..ctor()
DerivedClass says Number = 0
BaseClass.SaySomething() was called from DerivedClass.SaySomething()
DerivedClass has nothing to say

DerivedClass..ctor() was called from Program.Main() while initializing DerivedClass
DerivedClass.SaySomething() was called from DerivedClass..ctor()
DerivedClass says Number = 42
BaseClass.SaySomething() was called from DerivedClass.SaySomething()
DerivedClass has nothing to say
```
Earlier I mentioned that not all object oriented languages behave the same way. If you find yourself having to switch from one to another, then this code sample provides a good way of demonstrating the differences in object initialization strategies in each language.
In C#, when we create an instance of a derived class, the object is always considered to be an instance of the derived class (as illustrated in the output above by the value of `this.GetType().Name` that is output from the base class constructor).
Even when the code is executing the base class constructor and has not yet arrived at the derived class constructor, if the base class uses the object reference `this`, the type of that object is that of the derived class, and a call to a virtual method which is overridden in the derived class will execute the override rather than the base class method.
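A related detail is what state the override can actually see at that point. In C#, the derived class's field initializers run before the base class constructor, whereas assignments in the derived constructor body run after it. The minimal sketch below shows the difference; the field names `fromInitializer` and `fromConstructor` and the `Describe` method are hypothetical and exist only for this illustration.

```csharp
using System;

abstract class BaseClass
{
    protected BaseClass()
    {
        // Deliberately makes the virtual call that CA2214 warns about,
        // to show what the override can see at this point.
        Console.WriteLine(Describe());
    }

    protected abstract string Describe();
}

class DerivedClass : BaseClass
{
    private int fromInitializer = 7; // field initializer: runs before the BaseClass constructor
    private int fromConstructor;     // assigned in the constructor body: runs after it

    public DerivedClass()
    {
        fromConstructor = 42;
    }

    protected override string Describe()
    {
        return String.Format("{0}: initializer = {1}, constructor = {2}",
            GetType().Name, fromInitializer, fromConstructor);
    }
}

class Program
{
    static void Main(string[] args)
    {
        new DerivedClass(); // prints "DerivedClass: initializer = 7, constructor = 0"
    }
}
```

So the object already reports itself as `DerivedClass` and already has its initializer values, but anything assigned in the derived constructor body is still at its default.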
C++ behaves differently.
When the new object is being instantiated and the base class constructor is being executed (i.e. prior to execution of the derived class constructor), if we use the pointer `this` (or implicitly do so by accessing an instance method or field), then the object it points to at that time appears and behaves as an instance of the base class.
Only once we have begun to execute the derived class constructor does the `this` pointer begin to reference an instance of the derived class. It is as if the class begins its life as one type and then morphs into the more specialised type as the initialization runs down the chain of inheritance.
We can observe this, and the effect on virtual methods called from a constructor, by converting the previous C# code to C++.
(Note that for the sake of brevity, preprocessor directives have been omitted. My test environment for this was a Win32 console application with “stdafx.h” and “typeinfo.h” included.)
```cpp
class BaseClass abstract
{
protected:
    virtual void SaySomething();
    BaseClass();
public:
    ~BaseClass() { }
};
```
```cpp
BaseClass::BaseClass()
{
    printf_s("Entered constructor for %s\r\n", typeid(*this).name());
    SaySomething();
}

void BaseClass::SaySomething()
{
    printf_s("%s has nothing to say\r\n", typeid(*this).name());
}
```
```cpp
class DerivedClass : public BaseClass
{
private:
    int Number;
protected:
    virtual void SaySomething() override;
public:
    DerivedClass(int number);
    ~DerivedClass() { }
};
```
```cpp
DerivedClass::DerivedClass(int number)
{
    printf_s("Entered constructor for %s\r\n", typeid(*this).name());
    Number = number;
    SaySomething();
}

void DerivedClass::SaySomething()
{
    printf_s("%s says Number = %d\r\n", typeid(*this).name(), Number);
    BaseClass::SaySomething();
}
```
```cpp
int _tmain(int argc, _TCHAR* argv[])
{
    DerivedClass *test = new DerivedClass(42);
    return 0;
}
```
When we run this and look at the output with runtime type information, then the difference in behaviour is clear:
```text
Entered constructor for class BaseClass
class BaseClass has nothing to say

Entered constructor for class DerivedClass
class DerivedClass says Number = 42
class DerivedClass has nothing to say
```
While the derived class appears to behave in exactly the same way as it does in C#, the base class under initialization does not.
Comparing the two, we see that in C# the value of `this.GetType().Name` is “DerivedClass” when called from the BaseClass constructor during initialization of a DerivedClass instance.
By contrast in C++, `typeid(*this).name()` in the BaseClass implementation of SaySomething() returns “class BaseClass” at first when called from the BaseClass constructor, but later returns “class DerivedClass” once the DerivedClass constructor has begun to execute.
The output also clearly illustrates that (unlike C#) the call to the virtual method during the BaseClass constructor results in a direct call to `BaseClass::SaySomething()`, which makes sense because the object has effectively not yet become an instance of DerivedClass.
—
Thanks for reading. Hopefully you found something of interest.
If you have something to add, a mistake to point out, or would just like to let me know what you thought, then please feel free to add a comment using the box below or to contact me on Twitter or Google+.
Great article and very useful.
The only thing missing is a resolution showing how to write the equivalent of the example properly.
Great article. C++ actually has a special convention for handling just these situations with ultimate control. They are called Initialization lists and it is a list of base class constructors (and the arguments you want to pass to them) placed just after the method definition. Here is a better explanation:
http://www.cprogramming.com/tutorial/initialization-lists-c++.html
C# and VB both have TERRIBLE OOP patterns IMHO and this is a prime example of why. Until something like this is implemented they can’t support multiple inheritance even if they wanted to, BTW.