Unraveling Code Clones in Programmable Logic Controller (PLC) Software

Code cloning is a common practice in software development, particularly in industrial automation where Programmable Logic Controller (PLC) software is developed using IEC 61131-3 Structured Text (ST) and C/C++. While cloning facilitates rapid development, it also introduces maintenance challenges. This study explores the nature of code clones in PLC software, extending an existing detection tool to support ST. The findings highlight the prevalence of clones, differences between C/C++ and ST cloning patterns, and the necessity of specialized clone management strategies. By adopting automated tools and refactoring approaches, software teams can improve maintainability and reduce technical debt.


Introduction

Code cloning—the practice of copying and pasting code fragments—is widespread in software development. While often seen as an anti-pattern due to its impact on maintainability and correctness, cloning is also a pragmatic approach to reusing logic efficiently. This is particularly true in industrial automation, where Programmable Logic Controller (PLC) software is frequently built using cloned code. However, managing these clones effectively remains a challenge due to limited tool support for detecting and tracking them.

A recent study investigated code clones in a real-world PLC software system written in IEC 61131-3 Structured Text (ST) and C/C++. The researchers extended an existing clone detection tool to support Structured Text, shedding light on how clones manifest in industrial automation and how they can be managed to improve software quality.

Understanding Code Clones

Code clones fall into four primary categories:

  • Type 1 (Exact Clones): Identical code except for whitespace and comments.
  • Type 2 (Parameterized Clones): Syntactically identical structure, but with variations in identifiers, literals, or formatting.
  • Type 3 (Near-Miss Clones): Copied fragments with further changes, such as inserted, deleted, or modified statements.
  • Type 4 (Semantic Clones): Functionally identical but syntactically different.
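
The difference between Type 1 and Type 2 clones can be made concrete with a small sketch. The Python below is only an illustration, not the tool used in the study: the keyword set, the token pattern, and the function names (tokenize, abstract, classify_pair) are assumptions chosen for this example, and the Structured Text fragments are invented. It strips comments and whitespace to test for a Type 1 match, then replaces identifiers and numeric literals with placeholders to test for a Type 2 match. Type 3 and Type 4 clones need similarity or semantic analysis and are out of scope here.

```python
import re

# Assumption: a tiny, illustrative subset of Structured Text keywords.
# A real detector would use a complete lexer for the language.
ST_KEYWORDS = {"IF", "THEN", "ELSE", "ELSIF", "END_IF",
               "AND", "OR", "NOT", "TRUE", "FALSE"}

# Identifiers, numbers, a few multi-character operators, then any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|:=|<=|>=|<>|\S")


def tokenize(code: str) -> list[str]:
    """Drop (* ... *) and // comments, then split the code into tokens."""
    code = re.sub(r"\(\*.*?\*\)", " ", code, flags=re.DOTALL)
    code = re.sub(r"//[^\n]*", " ", code)
    return TOKEN_RE.findall(code)


def abstract(tokens: list[str]) -> list[str]:
    """Replace identifiers with ID and numeric literals with LIT."""
    out = []
    for tok in tokens:
        if tok.upper() in ST_KEYWORDS:
            out.append(tok.upper())
        elif tok[0].isalpha() or tok[0] == "_":
            out.append("ID")
        elif tok[0].isdigit():
            out.append("LIT")
        else:
            out.append(tok)
    return out


def classify_pair(a: str, b: str) -> str:
    """Classify two fragments as a Type 1 match, a Type 2 match, or neither."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if tokens_a == tokens_b:
        return "type-1"
    if abstract(tokens_a) == abstract(tokens_b):
        return "type-2"
    return "no type-1/2 match"


# Invented fragments for illustration only.
original = "IF temp > 80 THEN (* overheat *)\n  alarm := TRUE;\nEND_IF"
reformatted = "IF temp > 80 THEN\n    alarm := TRUE;   // overheat\nEND_IF"
renamed = "IF pressure > 120 THEN\n  valve_open := TRUE;\nEND_IF"

print(classify_pair(original, reformatted))
print(classify_pair(original, renamed))
```

The first pair differs only in layout and comments, so it matches before any abstraction; the second pair matches only after identifiers and literals are replaced by placeholders.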

While cloning can speed up development, it introduces risks such as higher maintenance effort and inconsistent updates, where a fix is applied to one copy but not to the others. In industrial automation, clones often emerge from hardware-specific modifications, time constraints, and the limited availability or use of programming abstractions such as inheritance or polymorphism in IEC 61131-3 languages.

The Research Approach

To understand cloning in PLC software, researchers analyzed a 191,000-line software system used in industrial machinery. This system evolved over multiple iterations, incorporating features for different machine types, leading to significant code duplication.

The study used Simian, a widely used clone detection tool, extended to support Structured Text. The extended tool analyzed the codebase and identified duplication at three levels:

  • Within a file (simple reuse of logic)
  • Between files (similar logic across different modules)
  • Across different versions (evolution-driven duplication)
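
The first two levels can be illustrated with a simple line-based detector. The sketch below is not Simian's implementation and makes several simplifying assumptions: it only normalizes whitespace, it uses a fixed window of consecutive non-blank lines (the hypothetical min_lines parameter), and the file names and contents are invented. It indexes every window of lines and reports windows that occur more than once, together with whether the copies sit in one file or in several.

```python
from collections import defaultdict


def normalize(line: str) -> str:
    """Collapse whitespace so that indentation differences are ignored."""
    return " ".join(line.split())


def find_duplicate_blocks(files: dict[str, str], min_lines: int = 3) -> dict:
    """Map each repeated window of `min_lines` non-blank lines to its locations.

    Each location is a (file name, 1-based starting line number) pair.
    """
    index = defaultdict(list)
    for name, text in files.items():
        numbered = [(i, normalize(line))
                    for i, line in enumerate(text.splitlines(), start=1)]
        numbered = [(i, line) for i, line in numbered if line]  # skip blank lines
        for start in range(len(numbered) - min_lines + 1):
            window = tuple(line for _, line in numbered[start:start + min_lines])
            index[window].append((name, numbered[start][0]))
    return {window: locs for window, locs in index.items() if len(locs) > 1}


def scope(locations: list) -> str:
    """Report whether all copies of a block are in one file or in several."""
    names = {name for name, _ in locations}
    return "within file" if len(names) == 1 else "between files"


# Invented files for illustration only.
files = {
    "axis.st": "x := 1;\ny := 2;\nz := 3;\n\nq := 0;\nx := 1;\ny := 2;\nz := 3;\n",
    "pump.st": "IF run THEN\n  m := 1;\n  n := 2;\nEND_IF\n",
    "fan.st": "IF run THEN\n m := 1;\n n := 2;\nEND_IF\n",
}

for window, locations in find_duplicate_blocks(files).items():
    print(scope(locations), locations)
```

The third level, duplication across versions, could be approximated with the same function by passing files from two snapshots of the system under distinct names, for example with a version prefix in each key.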

Key Findings

  1. Cloning is common in PLC software

    • Both C/C++ and Structured Text exhibited high levels of duplication.
    • Over 133,000 lines of duplicated code were detected across the codebase.

  2. Structured Text requires specialized analysis

    • Standard clone detection tools are limited in analyzing PLC languages.
    • Custom adaptations improved the ability to detect and categorize clones effectively.

  3. Clone types differ across programming languages

    • C/C++ contained more structural clones, where entire headers or function definitions were copied.
    • Structured Text had more aspect-related clones, involving repeated control logic (e.g., debugging, authentication, or error handling).

  4. Clones impact maintainability but are not inherently bad

    • Some clones are beneficial, serving as a lightweight way to manage product variability.
    • Others create technical debt, requiring better tracking and refactoring.

Managing Code Clones Effectively

To mitigate the negative effects of cloning while retaining its benefits, software teams should:

  • Use automated clone detection tools with language-specific adaptations.
  • Track and manage clones over time, ensuring consistent updates.
  • Refactor where necessary, replacing repetitive logic with reusable modules.
  • Balance cloning with modular design principles, reducing unnecessary duplication while maintaining flexibility.
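
Tracking clones for consistent updates can be sketched as a comparison between two versions of the code. The example below is an illustrative assumption rather than a method described in the study: it presumes that clone groups have already been identified (for instance by a detection tool), that each fragment has a stable identifier, and that the text of every fragment is available for both versions. All names and fragments are hypothetical. A group is flagged when some of its members changed between versions while others stayed the same, which is the typical signature of a fix applied to only part of a clone group.

```python
import hashlib


def fingerprint(fragment: str) -> str:
    """Hash a fragment after removing blank lines and collapsing whitespace."""
    lines = [" ".join(line.split()) for line in fragment.splitlines() if line.strip()]
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def inconsistent_groups(groups: dict[str, list[str]],
                        old: dict[str, str],
                        new: dict[str, str]) -> dict[str, list[str]]:
    """Return, per clone group, the members left unchanged while siblings changed.

    groups maps a group name to fragment ids; old and new map fragment ids
    to the fragment text in the previous and current version.
    """
    flagged = {}
    for group, members in groups.items():
        changed = {m for m in members if fingerprint(old[m]) != fingerprint(new[m])}
        if changed and len(changed) < len(members):
            flagged[group] = sorted(set(members) - changed)
    return flagged


# Invented clone groups and fragments for illustration only.
groups = {
    "limit_check": ["pump.st:10", "fan.st:22"],
    "init": ["axis.st:1", "gripper.st:1"],
}
old = {
    "pump.st:10": "IF level > 90 THEN\n  stop := TRUE;\nEND_IF",
    "fan.st:22": "IF level > 90 THEN\n  stop := TRUE;\nEND_IF",
    "axis.st:1": "ready := FALSE;",
    "gripper.st:1": "ready := FALSE;",
}
new = dict(old)
new["pump.st:10"] = "IF level >= 90 THEN\n  stop := TRUE;\nEND_IF"  # fixed in one copy only

print(inconsistent_groups(groups, old, new))
```

A flagged group is a prompt for review, not proof of a defect: the copies may have diverged deliberately, for example to serve different machine variants.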

Conclusion

As PLC software continues to grow in complexity, understanding and managing code clones is critical for maintaining software quality. This study demonstrates that adapting clone detection tools to support industrial programming languages can significantly improve clone analysis and management.

For teams working with PLC software, investing in clone management strategies can lead to better maintainability, reduced technical debt, and more scalable automation solutions.

References and images available in the original research paper.
