Why open-closed principle is important

The SOLID design principles are something that every software developer should know, especially if the developer mainly uses the object-oriented programming paradigm.

SOLID is an acronym. For those who aren’t familiar with the term, this is what it stands for:

  • Single responsibility principle
  • Open-closed principle
  • Liskov substitution principle
  • Interface segregation principle
  • Dependency inversion principle

In this article, we will cover the second principle from this acronym: the open-closed principle. We will do so by looking at some examples in C#. However, the general concepts are applicable to any other object-oriented language.

What is open-closed principle

The open-closed principle states that every atomic unit of code, such as a class, module or function, should be open for extension but closed for modification. The principle was coined in 1988 by Bertrand Meyer.

Essentially, this means that, once written, a unit of code should not be changed unless errors are detected in it. However, it should also be written in such a way that additional functionality can be attached to it in the future if requirements change or expand. This can be achieved with common features of object-oriented programming, such as inheritance and abstraction.
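As a minimal sketch of what "open for extension, closed for modification" can look like in C#, consider the following. All of the names here (ITextRule, TrimRule, UpperCaseRule, TextPipeline) are hypothetical and are not part of the application discussed later in this article. New behaviour is added by writing another ITextRule implementation, while TextPipeline itself never has to be edited.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction: every rule transforms a piece of text.
public interface ITextRule
{
    string Apply(string text);
}

public sealed class TrimRule : ITextRule
{
    public string Apply(string text) => text.Trim();
}

// Added later as a new class; nothing that already existed had to change.
public sealed class UpperCaseRule : ITextRule
{
    public string Apply(string text) => text.ToUpperInvariant();
}

// Closed for modification: it depends only on the ITextRule abstraction.
public sealed class TextPipeline
{
    private readonly IReadOnlyList<ITextRule> rules;

    public TextPipeline(IEnumerable<ITextRule> rules)
    {
        this.rules = rules.ToList();
    }

    // Applies every rule in the order in which it was supplied.
    public string Run(string text)
    {
        foreach (var rule in rules)
            text = rule.Apply(text);

        return text;
    }
}
```

With these definitions, a pipeline built from a TrimRule followed by an UpperCaseRule turns "  hello " into "HELLO", and supporting another transformation only requires one more class.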

Unlike the single responsibility principle, which almost everyone agrees with, the open-closed principle has its critics. It’s almost always obvious how the single responsibility principle can be implemented and what benefits it will provide. However, trying to foresee where the requirements may change in the future, and designing your classes so that most of those changes can be met without altering existing code, is often seen as unnecessary overhead that doesn’t provide any benefits.

The way software is written has also moved on quite a bit since 1988. Back then, deploying a new software version was tedious, long and expensive, whereas many of today’s systems take minutes to deploy. The whole process can be done on demand with a click of a mouse.

And with agile software development practices being adopted everywhere in the industry, requirements change all the time. Quite often, the changes are radical. Whole sections of code get removed and replaced on a regular basis. So, why design for adherence to the open-closed principle if the component that you are writing is quite likely to be ditched very soon?

Although these objections are valid, the open-closed principle still provides its benefits. Here is an example.

Implementing open-closed principle in C#

In my previous article, I have given a C# example of single responsibility principle. We have written an application that reads the text from any text file and converts it into HTML by enclosing every paragraph in P tags.

After all of the refactoring, we have ended up with three classes:

FileProcessor class that reads the input file and saves the output into an HTML file:

using System.IO;

namespace TextToHtmlConvertor
{
    public class FileProcessor
    {
        private readonly string fullFilePath;

        public FileProcessor(string fullFilePath)
        {
            this.fullFilePath = fullFilePath;
        }

        public string ReadAllText()
        {
            return System.Web.HttpUtility.HtmlEncode(File.ReadAllText(fullFilePath));
        }

        public void WriteToFile(string text)
        {
            // Keep the directory and file name, swapping only the extension for ".html".
            // Unlike manual concatenation, this also works for a bare file name with no directory part.
            var outputFilePath = Path.ChangeExtension(fullFilePath, ".html");

            using (StreamWriter file =
            new StreamWriter(outputFilePath))
            {
                file.Write(text);
            }
        }
    }
}

TextProcessor class that processes the text from the input file:

using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace TextToHtmlConvertor
{
    public class TextProcessor
    {
        private const string openingParagraphTag = "<p>";
        private const string closingParagraphTag = "</p>";

        private readonly FileProcessor fileProcessor;

        public TextProcessor(FileProcessor fileProcessor)
        {
            this.fileProcessor = fileProcessor;
        }

        public void ConvertText()
        {
            var inputText = fileProcessor.ReadAllText();

            var paragraphs = Regex.Split(inputText, @"(\r\n?|\n)")
                              .Where(p => p.Any(char.IsLetterOrDigit));

            var sb = new StringBuilder();

            foreach (var paragraph in paragraphs)
            {
                if (paragraph.Length == 0)
                    continue;

                sb.AppendLine(openingParagraphTag + paragraph + closingParagraphTag);
            }

            sb.AppendLine("<br/>");
            fileProcessor.WriteToFile(sb.ToString());
        }
    }
}

And Program class that serves as an entry point into the application and executes the logical steps in the correct sequence:

using System;

namespace TextToHtmlConvertor
{
    public class Program
    {
        static void Main()
        {
            try
            {
                Console.WriteLine("Please specify the file to convert to HTML.");
                var fullFilePath = Console.ReadLine();

                var fileProcessor = new FileProcessor(fullFilePath);
                var textProcessor = new TextProcessor(fileProcessor);

                textProcessor.ConvertText();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }

            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }
}

So far, so good. The application is doing exactly what the requirements say and every element of the application serves its own purpose. But we know that HTML doesn’t just consist of paragraphs, right?

So, while we are only being asked to read paragraphs and apply HTML formatting to them, it’s not difficult to imagine that we may be asked in the future to expand the functionality to be able to produce much richer HTML output.

In this case, we will have no choice but to rewrite our code. And although the impact of these changes in such a small application would be negligible, what if we had to do it in a much larger application?

We would definitely need to rewrite our unit tests that cover the class, which we may not have enough time to do. So, if we had good code coverage to start with, a tight deadline to deliver new requirements may force us to ditch a few unit tests, and therefore increase the risk of accidentally introducing defects.

What if we had existing services calling into our software that aren’t part of the same code repository? What if we don’t even know those exist? Now, some of these may break due to receiving unexpected results and we may not find out about it until it all has been deployed into production.

So, to prevent these things from happening, we can refactor our code as follows.

Our TextProcessor class will become this:

using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace TextToHtmlConvertor
{
    public class TextProcessor
    {
        private const string openingParagraphTag = "<p>";
        private const string closingParagraphTag = "</p>";

        public virtual string ConvertText(string inputText)
        {
            var paragraphs = Regex.Split(inputText, @"(\r\n?|\n)")
                              .Where(p => p.Any(char.IsLetterOrDigit));

            var sb = new StringBuilder();

            foreach (var paragraph in paragraphs)
            {
                if (paragraph.Length == 0)
                    continue;

                sb.AppendLine(openingParagraphTag + paragraph + closingParagraphTag);
            }

            sb.AppendLine("<br/>");

            return sb.ToString();
        }
    }
}

We have now completely separated the file-processing logic from it. The main method of the class, ConvertText(), now takes the input text as a parameter and returns the formatted output text. Otherwise, the logic inside of it is the same as it was before. All it does is split the input text into paragraphs and enclose each of them in P tags. To allow us to extend this functionality if requirements ever change, the method was made virtual.

Our Program class is now this:

using System;

namespace TextToHtmlConvertor
{
    class Program
    {
        static void Main()
        {
            try
            {
                Console.WriteLine("Please specify the file to convert to HTML.");
                var fullFilePath = Console.ReadLine();
                var fileProcessor = new FileProcessor(fullFilePath);


                var textProcessor = new TextProcessor();

                var inputText = fileProcessor.ReadAllText();
                var outputText = textProcessor.ConvertText(inputText);
                fileProcessor.WriteToFile(outputText);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }

            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }
}

We are now calling FileProcessor methods from within this class. But otherwise, the output will be exactly the same.

Now, one day, we are told that our application needs to be able to recognize Markdown (MD) emphasis markers in the text, which include bold, italic and strikethrough. These will be converted into their equivalent HTML markup.

So, in order to do this, all you have to do is add another class that inherits from TextProcessor. We’ll call it MdTextProcessor:

using System.Collections.Generic;

namespace TextToHtmlConvertor
{
    public class MdTextProcessor : TextProcessor
    {
        private readonly Dictionary<string, (string, string)> tagsToReplace;

        public MdTextProcessor(Dictionary<string, (string, string)> tagsToReplace)
        {
            this.tagsToReplace = tagsToReplace;
        }

        public override string ConvertText(string inputText)
        {
            var processedText = base.ConvertText(inputText);

            foreach (var key in tagsToReplace.Keys)
            {
                var replacementTags = tagsToReplace[key];

                if (CountStringOccurrences(processedText, key) % 2 == 0)
                    processedText = ApplyTagReplacement(processedText, key, replacementTags.Item1, replacementTags.Item2);
            }

            return processedText;
        }

        private int CountStringOccurrences(string text, string pattern)
        {
            int count = 0;
            int currentIndex = 0;
            while ((currentIndex = text.IndexOf(pattern, currentIndex)) != -1)
            {
                currentIndex += pattern.Length;
                count++;
            }
            return count;
        }

        private string ApplyTagReplacement(string text, string inputTag, string outputOpeningTag, string outputClosingTag)
        {
            int count = 0;
            int currentIndex = 0;

            while ((currentIndex = text.IndexOf(inputTag, currentIndex)) != -1)
            {
                count++;

                if (count % 2 != 0)
                {
                    var prepend = outputOpeningTag;
                    text = text.Insert(currentIndex, prepend);
                    currentIndex += prepend.Length + inputTag.Length;
                }
                else
                {
                    var append = outputClosingTag;
                    text = text.Insert(currentIndex, append);
                    currentIndex += append.Length + inputTag.Length;
                }
            }

            return text.Replace(inputTag, string.Empty);
        }
    }
}

In its constructor, the class receives a dictionary of tuples containing two string values. The key in the dictionary is the Markdown emphasis marker, while the value contains the opening HTML tag and the closing HTML tag. The code inside the overridden ConvertText() method calls the original ConvertText() method from its base class and then looks up all instances of each emphasis marker in the text. It then checks that the number of those is even (otherwise the Markdown content would be incorrectly formatted) and replaces them with the opening and closing HTML tags.

Now, our Program file will look like this:

using System;
using System.Collections.Generic;

namespace TextToHtmlConvertor
{
    class Program
    {
        static void Main()
        {
            try
            {
                Console.WriteLine("Please specify the file to convert to HTML.");
                var fullFilePath = Console.ReadLine();

                var fileProcessor = new FileProcessor(fullFilePath);

                var tagsToReplace = new Dictionary<string, (string, string)>
                {
                    { "**", ("<strong>", "</strong>") },
                    { "*", ("<em>", "</em>") },
                    { "~~", ("<del>", "</del>") }
                };

                var textProcessor = new MdTextProcessor(tagsToReplace);

                var inputText = fileProcessor.ReadAllText();
                var outputText = textProcessor.ConvertText(inputText);
                fileProcessor.WriteToFile(outputText);
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }

            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }
    }
}

The dictionary is something we pass into MdTextProcessor from the outside, so we needed to initialize it here. And now our textProcessor variable is of type MdTextProcessor rather than TextProcessor. The rest of the code has remained unchanged.

So, if we had any existing unit tests on the ConvertText() method of the TextProcessor class, they would not be affected at all. Likewise, if any external application uses TextProcessor, it will keep working after our code update just like it did before. We have therefore added new capabilities without breaking any of the existing functionality.
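One way to picture this is a small regression check, sketched here without any particular test framework. TextProcessor is repeated in condensed form only so that the snippet compiles on its own. FooterTextProcessor and TextProcessorChecks are hypothetical names, with FooterTextProcessor standing in for any subclass such as MdTextProcessor.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

// Condensed copy of the refactored TextProcessor, with the same behaviour.
public class TextProcessor
{
    public virtual string ConvertText(string inputText)
    {
        var paragraphs = Regex.Split(inputText, @"(\r\n?|\n)")
                              .Where(p => p.Any(char.IsLetterOrDigit));

        var sb = new StringBuilder();
        foreach (var paragraph in paragraphs)
            sb.AppendLine("<p>" + paragraph + "</p>");

        sb.AppendLine("<br/>");
        return sb.ToString();
    }
}

// Hypothetical extension: new behaviour lives in a new class.
public class FooterTextProcessor : TextProcessor
{
    public override string ConvertText(string inputText)
        => base.ConvertText(inputText) + "<hr/>";
}

public static class TextProcessorChecks
{
    // Returns true when the base class still produces its original output,
    // regardless of how many subclasses have been added.
    public static bool BaseOutputIsUnchanged()
    {
        var nl = Environment.NewLine;
        var expected = "<p>First</p>" + nl + "<p>Second</p>" + nl + "<br/>" + nl;

        return new TextProcessor().ConvertText("First\nSecond") == expected;
    }
}
```

Because the subclass only adds to what the base class returns, a check like BaseOutputIsUnchanged() keeps passing no matter which extensions are introduced later.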

This is also another example of how we can future-proof our code. The special text markers that our application must recognize may change, so we have turned them into easily changeable data. When they do change, MdTextProcessor doesn’t have to be altered.

Also, although we could simply store a single opening HTML tag as the value in the dictionary and create the closing tag on the fly by inserting a slash character into it, we have defined the opening and closing tags explicitly. What if the requirements state that we need to add various attributes to the opening HTML tags? What if certain keys correspond to nested tags? It would be difficult to foresee all possible scenarios beforehand, so the easiest thing to do is to make both tags explicit, which covers any such scenario.
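To illustrate both points, here is a hedged sketch of how new requirements could be met by editing only the data. MarkerSketch and ReplacePairs are hypothetical names, and ReplacePairs is a simplified stand-in for ApplyTagReplacement rather than the code shown earlier. The class attribute and the backtick marker are made-up requirements used purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MarkerSketch
{
    // Simplified stand-in for ApplyTagReplacement: swaps each pair of markers
    // for an opening and a closing tag, leaving unbalanced markers untouched.
    public static string ReplacePairs(string text, string marker, (string Open, string Close) tags)
    {
        var parts = text.Split(new[] { marker }, StringSplitOptions.None);

        // An even number of markers always produces an odd number of parts.
        if (parts.Length % 2 == 0)
            return text;

        var sb = new StringBuilder(parts[0]);
        for (var i = 1; i < parts.Length; i++)
            sb.Append(i % 2 == 1 ? tags.Open : tags.Close).Append(parts[i]);

        return sb.ToString();
    }

    public static string Convert(string text)
    {
        // Only this data changes when the requirements change.
        var tagsToReplace = new Dictionary<string, (string, string)>
        {
            // Opening tag carrying an attribute, which a derived closing tag could not express.
            { "**", ("<strong class=\"important\">", "</strong>") },
            { "~~", ("<del>", "</del>") },
            // Newly supported marker, added without touching ReplacePairs.
            { "`", ("<code>", "</code>") }
        };

        foreach (var entry in tagsToReplace)
            text = ReplacePairs(text, entry.Key, entry.Value);

        return text;
    }
}
```

The markers in this sketch were deliberately chosen so that none of them is a substring of another. With overlapping markers such as "*" and "**", the order in which the entries are processed starts to matter.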

Conclusion

The open-closed principle is worth knowing, as it can substantially reduce the impact of changes to your application code. If your software is designed with this principle in mind, extending one of your code components is far less likely to force changes in other components or to require an impact assessment on external applications.

However, unlike the single responsibility principle, which should be followed almost like a law, there are situations where applying the open-closed principle has more cons than pros. For example, when designing a component, you have to think about potential future changes to the requirements. This is sometimes counterproductive, especially when your code is quite likely to be radically restructured at some point.

So, while you still need to spend some time thinking about what new functionality may be added to your new class in the future, use common sense. Considering the most obvious changes is often sufficient.

Also, although adding new public methods to an existing class without modifying the existing methods would, strictly speaking, violate the open-closed principle, it will not cause the most common problems that the principle is designed to address. So, in most cases, it is completely fine to do so instead of creating even more classes that expand your inheritance hierarchy.
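As a rough sketch of that trade-off, the class below keeps ConvertText() exactly as it was and gains one extra public method. ConvertTextWithHeading is a hypothetical addition that does not appear in the application above.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public class TextProcessor
{
    // Existing method: its signature and behaviour are left untouched,
    // so current callers and tests are not affected.
    public virtual string ConvertText(string inputText)
    {
        var paragraphs = Regex.Split(inputText, @"(\r\n?|\n)")
                              .Where(p => p.Any(char.IsLetterOrDigit));

        var sb = new StringBuilder();
        foreach (var paragraph in paragraphs)
            sb.AppendLine("<p>" + paragraph + "</p>");

        sb.AppendLine("<br/>");
        return sb.ToString();
    }

    // Hypothetical addition: a new public method built on top of the existing one.
    public string ConvertTextWithHeading(string heading, string inputText)
        => "<h1>" + heading + "</h1>" + Environment.NewLine + ConvertText(inputText);
}
```

Callers that only know about ConvertText() keep getting the same output, while new callers can opt in to the extra method.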

However, in this case, if your code is intended to be used by external software, always make sure that you increment the version of your library. If you don’t, updated external software that relies on the new methods will break if it accidentally receives the old build of the library carrying the same version number. That said, any software development team should have a versioning strategy in place, and most of them do, so this problem is expected to be rare.