Recently a customer, let's call her Alice, asked me about designing a bootloader for her STM32L072RZT6 [FLASH=192K, EEPROM=6K, and RAM=20K].
Based on some work I had shown on Hackster.io, she asked whether an update downloaded over HTTP could be placed into a second region of flash called APP2 and eventually run from there if the update succeeded.
She also mentioned that her current application size was 93K.
Based on this information, a couple of things jumped out at me. First, the lack of a Vector Table Offset Register (VTOR) in the Cortex-M0 she selected would make it difficult to run an application at any offset other than 0x0 (the default starting address on Cortex-M devices). Second, her application was already almost half the size of the available flash, which would prevent the use of internal flash as a staging area and require adding an external flash to her design.
I said that her proposed bootloader design was reasonable in principle. After all, I implemented it in several Hackster.io projects: the internal flash is split into two sections, and the empty section is used to store and run an application. The program can be downloaded to, and run from, either section. It is a very robust design, but it has some prerequisites that must be met.
In her case, the application is already almost half the size of the flash: 192/2 = 96K. This would leave very little room for application improvement or enhancement. Plus, the bootloader itself will need one erase sector, which is 4K on that part, effectively leaving her (192-4)/2 = 94K per section. That is only about 1K more than her existing 93K application, so in practice there is probably not enough space to hold it.
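The split-flash arithmetic above can be checked with a few lines of C. This is only a sketch: the function and macro names are my own for illustration, and the sizes are the figures quoted in this post, in K.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes in K, taken from the figures quoted in the post. */
#define FLASH_TOTAL_K   192u
#define BOOT_SECTOR_K     4u
#define APP_SIZE_K       93u

/* Space available to each of the two application sections once the
 * bootloader's erase sector has been set aside. */
static uint32_t slot_size_k(uint32_t flash_k, uint32_t boot_k)
{
    assert(boot_k <= flash_k);
    return (flash_k - boot_k) / 2u;
}

/* Room left for growth in one section; a negative value means the
 * application does not fit at all. */
static int32_t slot_headroom_k(uint32_t flash_k, uint32_t boot_k,
                               uint32_t app_k)
{
    return (int32_t)slot_size_k(flash_k, boot_k) - (int32_t)app_k;
}
```

Calling slot_headroom_k(FLASH_TOTAL_K, BOOT_SECTOR_K, APP_SIZE_K) gives the headroom per section on paper, before any alignment to sector boundaries or future growth is taken into account.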
Then I explained that to execute an application that is not at the start of flash (address 0x0800 0000 on ST Micro MCUs), the MCU needs to support the Vector Table Offset Register (VTOR), which allows interrupt and exception vectors to be located at arbitrary locations in flash, such as the APP1 or APP2 sections. The STM32L0 line (Cortex-M0) does not support the VTOR as far as I can tell, so applications must be located at the base of flash.
In these situations, it is still possible to update the firmware, but the downloaded firmware has to be staged either in another area of the internal flash or on an external flash. I proposed that the main application include a small routine that is copied into RAM and executed from there to copy the new application code over the existing application. There would be no need for a separate bootloader application, and she could still use an appropriate OTA firmware download method, such as a UART interface to HTTP commands on a cellular modem or a WiFi module.
In the end, I suggested that the most realizable and appropriate OTA update method for her STM32L0 part would be to download the firmware image to an external flash connected over SPI, then copy a section of her (bootloader) application to RAM. That RAM-resident code copies the downloaded image from the external flash back to the internal flash, overwriting her current application. When complete, it reboots. Most failures during the application update are recoverable because the bootloader portion of the application resides in the first 4K and is the last sector to be updated.
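The important detail in the procedure above is the write order: every sector except the first is updated, and the first 4K (holding the bootloader routines) goes last. Here is a minimal host-side sketch of that ordering. It is not the real installer: install_image and SECTOR_SIZE are hypothetical names, and plain memory buffers stand in for the SPI flash read and the internal flash erase/program calls that would run from RAM on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 4096u  /* erase sector size quoted in the post */

/* Copy a staged image over the "internal flash" buffer one sector at a
 * time. Sectors 1..N-1 are written first and sector 0 last, so the old
 * bootloader routines survive until the rest of the image is in place.
 * If 'order' is not NULL, the sector index of each write is recorded.
 * Returns the number of sectors written. */
static uint32_t install_image(uint8_t *internal, const uint8_t *staged,
                              uint32_t image_len, uint32_t *order)
{
    assert(internal != NULL && staged != NULL && image_len > 0u);

    uint32_t sectors = (image_len + SECTOR_SIZE - 1u) / SECTOR_SIZE;
    uint32_t written = 0u;

    for (uint32_t i = 1u; i <= sectors; i++) {
        /* The final iteration wraps around to sector 0. */
        uint32_t sector = (i == sectors) ? 0u : i;
        uint32_t offset = sector * SECTOR_SIZE;
        uint32_t len = image_len - offset;
        if (len > SECTOR_SIZE) {
            len = SECTOR_SIZE;
        }

        /* On the target: erase the sector, then program it from the
         * data read out of the external flash. */
        memcpy(internal + offset, staged + offset, len);

        if (order != NULL) {
            order[written] = sector;
        }
        written++;
    }
    return written;
}
```

If power is lost partway through, sector 0 still contains the old startup routines, which can find the staged image again on the next boot and restart the copy. The window in which sector 0 itself is being rewritten remains the one unprotected step.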
A few routines comprising a "bootloader" sit within the first 4K of flash. This special "bootloader" startup routine runs before main(), or as the first function in main(), every time the application starts. It checks for a download in the external flash and, if one is found, starts the copy procedure described above again.
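One way the startup check could decide whether a download is waiting is sketched below. Everything here is an assumption for illustration: the header layout, the STAGED_MAGIC value, and the function names are hypothetical, and a CRC-32 stands in for the integrity check. A CRC only catches corruption, so it does not replace the signature verification a secure design would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header stored in front of the staged image. */
typedef struct {
    uint32_t magic;   /* marks the staging area as holding an image */
    uint32_t length;  /* image length in bytes */
    uint32_t crc;     /* CRC-32 of the image bytes */
} staged_header_t;

#define STAGED_MAGIC 0x55504454u  /* arbitrary illustrative value */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), small enough to
 * live in the first 4K alongside the other startup routines. */
static uint32_t image_crc32(const uint8_t *data, uint32_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (uint32_t i = 0u; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Returns true only if the staging area holds a complete, uncorrupted
 * image that is small enough to fit in internal flash. */
static bool update_pending(const staged_header_t *hdr,
                           const uint8_t *payload, uint32_t max_len)
{
    assert(hdr != NULL && payload != NULL);

    if (hdr->magic != STAGED_MAGIC) {
        return false;
    }
    if (hdr->length == 0u || hdr->length > max_len) {
        return false;
    }
    return image_crc32(payload, hdr->length) == hdr->crc;
}
```

On the target, the header and payload would be read from the external flash over SPI rather than passed in as pointers, and a true result would hand control to the RAM-resident copy routine.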
For Cortex-M0 parts without a VTOR, her options were limited. There are some vector table "jump" designs that manually construct special jump tables to land an interrupt on an application's table located further up into flash.
The design that I have proposed, which combines the bootloader/update function into the main application, requires no special jump table construction and allows the bootloader code itself to be updated along with the main application (though this is not necessary).
The only capability a VTOR would add to this design is to allow separation of the bootloader (which I call firmware image manager) from the application. This could allow for a slightly more robust design where the bootloader is never touched during an update process and is always available to recover failed application copy/write attempts.
However, with IoT bootloaders needing to be moderately complex to handle security features such as firmware image signature verification, these modern bootloaders may themselves need to be updated in the field. Plus, the combined size of a separate bootloader and application would be larger than that of a monolithic design like the one I've suggested here, because the shared security libraries are duplicated.
Based on this assessment, and experience designing and implementing secure OTA firmware update systems for IoT devices, I am rolling out a new firmware product that handles in-field over-the-air firmware updates the same way for Cortex-M0/M3/M4x. This product also has patching capability where only the difference between versions needs to be downloaded to the device. This can support seriously small update images.
I call this type of bootloader/OTA update design Self Replicating Firmware. One application has everything it needs to update itself - by reaching out to update sources to download signed patch data, reconstructing a signed update image, and performing the installation of the reconstructed firmware update image.
I will point out that ARM Mbed is making good progress toward a production-class secure IoT firmware update system. If you'd like to set that up with your Mbed project, I can help with that too.