Code duplication goes against the software engineering best practice of code reusability. Some of the major disadvantages of code duplication are the following:
- It increases the number of lines of code (LOC), which bloats the codebase and can affect build times and binary size.
- Extra unit tests must be written to cover each duplicate method in order to maintain good coverage.
- A single change has to be made in multiple files, which increases the maintenance cost.
- It reflects poorly on the quality of the software team's work.
Different types of Duplication
- Different Methods but with Identical LOC
- Same Method with Identical LOC but in different class files
- Identical set of LOC in multiple methods
- Similar LOC
This article addresses the first three categories of code duplication.
Duplication Elimination Procedure
Here are some common methods for removing these duplications.
Different Methods but with Identical LOC & Same Method with Identical LOC but in different class files
Example: methods m1() and m2(), which contain identical LOC.
Method-1 : Delete and Redirect approach
- Maintain one method (e.g. m1()) which will be used throughout the software.
- Delete the contents of the other methods (e.g. m2()) and replace them with a call to the maintained method, m1().
[code]
public void m1() {
    // LOC
}

public void m2() {
    // Content removed and replaced with a call to m1()
    m1();
}

public void m3() {
    // Content removed and replaced with a call to m1()
    m1();
}
[/code]
Method-2 : Delete and Replace approach
- Maintain one method (e.g. m1()) which will be used throughout the software.
- Delete all other identical methods (e.g. delete m2(), m3(), etc., which are all identical to m1()).
- Identify the code locations from which the deleted methods are referenced and replace those references with the single remaining method (e.g. all calls to m2() and m3() must be replaced with m1()).
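The result of deleting the duplicates and repointing their callers might look like the minimal sketch below. The class names `Greeter` and `Caller`, the method `greetThreeTimes()`, and the `String` return type are hypothetical and chosen only to make the example runnable; only the names m1(), m2(), and m3() come from the article.

```java
// Hypothetical example: m2() and m3() were exact copies of m1() and have been deleted.
class Greeter {
    // The single method kept after the duplicates were removed.
    static String m1() {
        return "Hello";
    }
}

class Caller {
    // Before the refactoring this method called m1(), m2() and m3() in turn.
    // Every call to a deleted duplicate now points to m1().
    static String greetThreeTimes() {
        return Greeter.m1() + " " + Greeter.m1() + " " + Greeter.m1();
    }
}

class DeleteAndReplaceDemo {
    public static void main(String[] args) {
        System.out.println(Caller.greetThreeTimes());
    }
}
```

Compared with the redirect approach, this leaves no thin wrapper methods behind, but every call site has to be found and updated.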
Identical set of LOC in multiple methods
- Identify a less complex method that contains the identical code and make sure it has unit tests with good coverage, e.g. m1().
- Create a new method and copy all the identical LOC into it, e.g. mn().
- Check whether these LOC use any parameter or attribute that was part of the parent method; if so, add it to the new method's signature. For example, if the LOC in mn() reference an amount parameter, redefine the method signature as mn(int amount).
- Replace the LOC in the parent method with a call to the new method, passing the relevant parameters, e.g. mn(100).
- Run all the unit tests for the parent method (e.g. m1()) and make sure they all pass.
- Now apply step 4 and step 5 (replace the LOC with a call to the new method, then run the unit tests) to the other methods sharing the same identical LOC. For example, if m2() and m3() also contain the LOC that were moved from m1() to mn(int amount), delete those LOC from m2() and m3() and replace them with a call to mn().
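The steps above can be sketched as follows. This is only an illustration: the class name `Billing`, the fee calculation inside mn(int amount), and the return types are assumptions made so the example runs. The article itself only names m1(), m2(), m3(), and mn(int amount).

```java
// Hypothetical example: m1(), m2() and m3() each used to contain the same
// fee-calculation lines. Those lines now live in the extracted method mn(int amount).
class Billing {
    // Extracted method holding the formerly duplicated LOC. The value the
    // lines depended on is passed in through the method signature.
    static int mn(int amount) {
        int fee = amount / 10; // illustrative stand-in for the duplicated logic
        return amount + fee;
    }

    // Parent methods: the duplicated LOC are replaced with a call to mn().
    static int m1() {
        return mn(100);
    }

    static int m2() {
        return mn(200);
    }

    static int m3() {
        // Code that was unique to m3() stays in m3().
        return mn(300) * 2;
    }
}

class ExtractMethodDemo {
    public static void main(String[] args) {
        System.out.println(Billing.m1());
        System.out.println(Billing.m2());
        System.out.println(Billing.m3());
    }
}
```

Starting with the simplest, best-tested parent method keeps the risk low: its existing unit tests confirm that the extracted method behaves the same before the other methods are switched over.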
Where to create the new methods
In the elimination approaches above, we create a new method. Where to maintain this new method depends on the nature of the method, but here are some generic guidelines.
- If it is a common method, such as date formatting, it can be maintained in a library or utility class that can be used by all classes.
- If it is a method in a derived class, move it to the base class.
- If they are methods in two different classes, check the feasibility of introducing a base class. If a base class is not meaningful, consider moving the method to a utility class.
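As a rough illustration of these guidelines, the sketch below places a date-formatting helper in a utility class and a method shared by two classes in a common base class. All names (`DateUtil`, `Account`, `SavingsAccount`, `CurrentAccount`, `describe`) and the `dd/MM/yyyy` pattern are hypothetical and not taken from the article.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Guideline 1: a common helper such as date formatting lives in a utility class.
final class DateUtil {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/yyyy");

    private DateUtil() {
        // Utility class, not meant to be instantiated.
    }

    static String format(LocalDate date) {
        return date.format(FORMAT);
    }
}

// Guidelines 2 and 3: a method that was duplicated in two related classes
// is moved into a base class that both classes extend.
abstract class Account {
    String describe(String owner, LocalDate opened) {
        return owner + " (opened " + DateUtil.format(opened) + ")";
    }
}

class SavingsAccount extends Account { }

class CurrentAccount extends Account { }

class PlacementDemo {
    public static void main(String[] args) {
        LocalDate opened = LocalDate.of(2024, 1, 5);
        System.out.println(new SavingsAccount().describe("Ann", opened));
        System.out.println(new CurrentAccount().describe("Bob", opened));
    }
}
```

A base class only makes sense here because both account types are genuinely kinds of account; for unrelated classes, the utility class is usually the safer home.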
There is one more kind of code duplication: Similar LOC. These lines are not identical, but their behavior is similar. This will be addressed in a separate article.