Abstract: This paper investigates how generative artificial intelligence (AI), particularly large language models (LLMs), can be applied to reverse engineering and code reconstruction. Traditional reverse-engineering techniques such as disassembly and static analysis are time-consuming, demand deep expertise, and often fail to recover high-level semantics. With the rise of generative models such as GPT-4, CodeBERT, and AlphaCode, there is a growing opportunity to automate the reconstruction of source code from binaries or legacy languages. The study explores a methodology for training AI on code and binary datasets, outlines the design of an AI-powered tool for modernizing legacy codebases, and identifies key applications in software maintenance, cybersecurity, and digital preservation. It also examines hybrid approaches that combine symbolic execution with machine learning. The paper concludes by addressing the principal challenges of binary-to-code transformation, including hallucination, lossy translation, and dataset scarcity, and suggests future directions for scalable dataset creation, model interpretability, and domain-specific fine-tuning.
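To make the hybrid approach mentioned above concrete, the following is a minimal sketch of a two-stage binary-to-code pipeline: a static/symbolic stage recovers per-function structure from a binary, and a generative model proposes high-level source for each recovered function. All names here (RecoveredFunction, static_stage, generative_stage, CodeModel, reconstruct) are hypothetical placeholders for illustration, not APIs described in the paper or taken from any specific tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass
class RecoveredFunction:
    """Output of the static/symbolic stage for one binary function."""
    name: str                 # symbol or synthetic label, e.g. "sub_401000"
    assembly: str             # lifted instruction listing
    call_targets: List[str]   # callees recovered by control-flow analysis


class CodeModel(Protocol):
    """Any generative model exposing a text-in / text-out interface."""
    def generate(self, prompt: str) -> str: ...


def static_stage(binary_path: str) -> List[RecoveredFunction]:
    # Placeholder: a real pipeline would disassemble the binary and
    # recover functions and control flow here. A canned example is
    # returned so the sketch runs end to end.
    return [RecoveredFunction(
        name="sub_401000",
        assembly="push ebp\nmov ebp, esp\n...",
        call_targets=["printf"],
    )]


def generative_stage(fn: RecoveredFunction, model: CodeModel) -> str:
    # Prompt the model with the lifted assembly plus recovered context
    # and ask for equivalent high-level source.
    prompt = (
        "Reconstruct readable C source for the following function.\n"
        f"Function: {fn.name}\nCalls: {', '.join(fn.call_targets)}\n"
        f"Assembly:\n{fn.assembly}\n"
    )
    return model.generate(prompt)


def reconstruct(binary_path: str, model: CodeModel) -> Dict[str, str]:
    """Run both stages and map each recovered function to proposed source."""
    return {fn.name: generative_stage(fn, model)
            for fn in static_stage(binary_path)}
```

In such a design, the symbolic/static stage constrains the problem (function boundaries, call graph), while the LLM supplies the high-level semantics that disassembly alone cannot recover; hallucination and lossy translation, noted above as open challenges, would need to be checked by validation passes not shown in this sketch.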