Memory Leak Analysis on ESP32

Client


Client: Swatch Group

Service: Application Support

Website: [Swatch Group]

Issue


Connecting via OPC UA (OPC Unified Architecture) to the production system became impossible after multiple OPC UA clients connected. Consequently, the production system had to be manually restarted to regain OPC UA access and thus access production data.

Analysis


After four consecutive connections to the ESP32's OPC UA server, the service would shut down, rendering it unavailable. Connections could come from various clients and/or machines, and the service outage occurred in the same manner each time. A system reboot was required to allow a new connection.

Proposed Solution


The OPC UA service shut down because FreeRTOS terminated the "opcua_task" task when data memory (DRAM) ran out. Two defects let memory accumulate. First, each session was created with the wrong time base, so it never expired. Second, after a sudden network disconnection (pulling the network cable or closing the OPC UA client without TCP connection termination signals), the TCP session was only partially closed and continued to hold memory. An aggravating factor was that the ESP32 system configuration was not optimized, which left little data memory available.
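To illustrate how a time base mismatch can keep a session alive, here is a minimal sketch in C. It is not the project's code: `session_t`, `seconds_to_ms` and `session_expired` are hypothetical names, and the choice of milliseconds as the common unit is an assumption. The source does not say which units were confused. As an example of the failure mode, a 60 second timeout stored in microseconds but compared against a millisecond clock would take more than 16 hours to expire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical session record: every time value uses milliseconds. */
typedef struct {
    uint64_t last_activity_ms; /* timestamp of the last client request */
    uint64_t timeout_ms;       /* lifetime granted to the session */
} session_t;

/* Convert a timeout given in seconds to the common millisecond base. */
uint64_t seconds_to_ms(uint32_t seconds)
{
    return (uint64_t)seconds * 1000u;
}

/* Return true once the session has been idle for at least its timeout.
 * Assumes now_ms comes from a monotonic clock in the same unit. */
bool session_expired(const session_t *s, uint64_t now_ms)
{
    assert(s != NULL);
    return (now_ms - s->last_activity_ms) >= s->timeout_ms;
}
```

Keeping the unit in every field and parameter name makes a mismatch visible at the call site, which is the main point of the sketch.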

Implementation:

  • Correct the time base when creating OPC UA sessions.
  • Enable the TCP SO_LINGER option (CONFIG_LWIP_SO_LINGER in the sdkconfig file) so that the connection is terminated and its memory released as soon as the server requests a session closure.
  • Optimize memory footprint by configuring ESP32 system options. [Refer to Espressif's article on data memory optimization.](https://blog.espressif.com/esp32-memory-analysis-case-study-eacc75fe5431)

Optimizations in the sdkconfig file:

CONFIG_FREERTOS_UNICORE=y
CONFIG_MBEDTLS_ASYMMETRIC_CONTENT_LEN=y
CONFIG_MBEDTLS_SSL_OUT_CONTENT_LEN=2048
CONFIG_MBEDTLS_DYNAMIC_BUFFER=y
CONFIG_MBEDTLS_DYNAMIC_FREE_PEER_CERT=y
CONFIG_MBEDTLS_DYNAMIC_FREE_CONFIG_DATA=y
CONFIG_ESP32_ALLOW_RTC_FAST_MEM_AS_HEAP=y
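The SO_LINGER item in the implementation list has two parts under a BSD-style socket API: support is enabled at build time (in ESP-IDF, `CONFIG_LWIP_SO_LINGER=y`), and the option is then set on each socket. The sketch below shows the second part. It assumes the server has access to the accepted socket descriptor; `enable_abortive_close` is a hypothetical helper, not the project's code. On a host the header is `<sys/socket.h>`, while an ESP-IDF build would typically include `lwip/sockets.h` instead.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: make close() drop the connection at once instead of
 * keeping it around to flush pending data. Returns 0 on success, -1 on error. */
int enable_abortive_close(int fd)
{
    struct linger lin;

    assert(fd >= 0);   /* a negative descriptor is a programming error */
    lin.l_onoff = 1;   /* turn SO_LINGER on */
    lin.l_linger = 0;  /* zero timeout: reset the connection on close() */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lin, sizeof(lin));
}
```

A zero linger time discards any unsent data, so it fits a server that must reclaim memory quickly better than one that must guarantee delivery of its last bytes.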

Additionally, configuring OPC UA clients with short timeouts releases sessions faster after a sudden network disconnection. The following configuration can serve as an example:

OPC UA Client Configuration:
  • Secure Channel Timeout: 10 minutes
  • Session Timeout: 1 minute

Results


Testing on a prototype, with dozens of connection and disconnection cycles and multiple simultaneous connections, revealed no access issues. Monitoring of available data memory also showed that all resources allocated to a connection were fully released when it closed.
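A memory check of this kind can be sketched as follows. This is an illustration, not the test harness that was used: `heap_recovered`, `assert_no_leak`, `free_dram_bytes` and the tolerance parameter are hypothetical. The only ESP-IDF call is `heap_caps_get_free_size`, compiled only when building for the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h"
/* On the target, read the free internal byte-addressable heap. */
size_t free_dram_bytes(void)
{
    return heap_caps_get_free_size(MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
}
#endif

/* True when free memory after a cycle is back to its earlier level,
 * allowing `tolerance` bytes of slack for allocator fragmentation. */
bool heap_recovered(size_t before, size_t after, size_t tolerance)
{
    return before <= after || (before - after) <= tolerance;
}

/* Run one connect/disconnect cycle supplied by the caller and fail
 * loudly if free memory has not recovered afterwards. */
void assert_no_leak(size_t (*read_free)(void), void (*run_cycle)(void),
                    size_t tolerance)
{
    size_t before = read_free();
    run_cycle();
    size_t after = read_free();
    assert(heap_recovered(before, after, tolerance));
}
```

On the ESP32, `free_dram_bytes` would be passed as `read_free`. Repeating the cycle many times helps separate a real leak from one-time allocations made on the first connection.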

Conclusion


It is important to understand the target system when integrating applications. In our case, the memory leak was due neither to the application nor to the target system; rather, frequent user connections and disconnections had not been anticipated. IP stacks such as lwIP do not release resources immediately when a TCP session closes, so the system must be adapted to release them more quickly in order to accept new connections.