Abstract: This paper investigates how generative artificial intelligence (AI), particularly large language models (LLMs), can be applied to reverse engineering and code reconstruction. Traditional reverse-engineering techniques such as disassembly and static analysis are time-consuming, demand deep expertise, and often fail to recover high-level semantics. With the rise of generative models such as GPT-4, CodeBERT, and AlphaCode, there is a growing opportunity to automate the reconstruction of source code from binaries or legacy languages. The study explores a methodology for training AI on code and binary datasets, outlines the design of an AI-powered tool for modernizing legacy codebases, and identifies key applications in software maintenance, cybersecurity, and digital preservation. It also examines hybrid approaches that combine symbolic execution with machine learning. The paper concludes by addressing the principal challenges of binary-to-code transformation, including hallucination, lossy translation, and dataset scarcity, and suggests future directions for scalable dataset creation, model interpretability, and domain-specific fine-tuning.
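To make the hybrid approach mentioned above concrete, the following is a minimal sketch of a two-stage binary-to-code pipeline: a static/symbolic stage recovers per-function structure from a binary, and a generative model proposes high-level source for each recovered function. All names here (RecoveredFunction, static_stage, generative_stage, CodeModel, reconstruct) are hypothetical placeholders for illustration, not APIs described in the paper or taken from any specific tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass
class RecoveredFunction:
    """Output of the static/symbolic stage for one binary function."""
    name: str                 # symbol or synthetic label, e.g. "sub_401000"
    assembly: str             # lifted instruction listing
    call_targets: List[str]   # callees recovered by control-flow analysis


class CodeModel(Protocol):
    """Any generative model exposing a text-in / text-out interface."""
    def generate(self, prompt: str) -> str: ...


def static_stage(binary_path: str) -> List[RecoveredFunction]:
    # Placeholder: a real pipeline would disassemble the binary and
    # recover functions and control flow here. A canned example is
    # returned so the sketch runs end to end.
    return [RecoveredFunction(
        name="sub_401000",
        assembly="push ebp\nmov ebp, esp\n...",
        call_targets=["printf"],
    )]


def generative_stage(fn: RecoveredFunction, model: CodeModel) -> str:
    # Prompt the model with the lifted assembly plus recovered context
    # and ask for equivalent high-level source.
    prompt = (
        "Reconstruct readable C source for the following function.\n"
        f"Function: {fn.name}\nCalls: {', '.join(fn.call_targets)}\n"
        f"Assembly:\n{fn.assembly}\n"
    )
    return model.generate(prompt)


def reconstruct(binary_path: str, model: CodeModel) -> Dict[str, str]:
    """Run both stages and map each recovered function to proposed source."""
    return {fn.name: generative_stage(fn, model)
            for fn in static_stage(binary_path)}
```

In such a design, the symbolic/static stage constrains the problem (function boundaries, call graph), while the LLM supplies the high-level semantics that disassembly alone cannot recover; hallucination and lossy translation, noted above as open challenges, would need to be checked by validation passes not shown in this sketch.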