Napoli, Romani, Gatteschi and Schifanella, ‘Light and Shadows of Smart Contract Development with LLMs’

ABSTRACT
Smart contract development remains almost inaccessible to non-expert developers despite blockchain technology's growing adoption across industries. This paper evaluates the potential of Large Language Models (LLMs) for automated smart contract generation from legal agreements. The work systematically assesses the capabilities of four leading commercial LLMs – gpt-4-turbo (OpenAI), claude-3.5-sonnet (Anthropic), mistral-large (MistralAI), and gemini-1.5-pro (Google) – across a diverse range of legal agreements of varying complexity. The evaluation framework consists of an in-depth evaluation of structured code patterns – typical of smart contracts – to provide nuanced insights into model performance. The results reveal a performance hierarchy, with claude-3.5-sonnet and gpt-4-turbo consistently outperforming mistral-large and gemini-1.5-pro, particularly when handling complex agreements such as mortgage note agreements and property sales agreements. A non-linear relationship between contract complexity and model performance has been observed, with even top-performing models showing significant degradation when processing intricate legal structures. While achieving syntactic correctness has become increasingly feasible, ensuring functional completeness and security remains challenging, as evidenced by high-impact vulnerabilities detected across all generated smart contracts. This work contributes to the growing discourse on LLM applications in blockchain technology by providing empirical evidence of current capabilities and limitations, establishing a robust foundation for future research in AI-assisted smart contract development.

Napoli, Emanuele Antonio and Romani, Noemi and Gatteschi, Valentina and Schifanella, Claudio, Light and Shadows of Smart Contract Development with LLMs.
