Re: Interrupted even if interrupts are disabled?!

Posted: Mon Jul 20, 2015 3:55 pm
by jcmvbkbc
rhaenel wrote: Before I overwrote the vecbase, it was 0x40100000. So to test my newly installed vecbase, I'm just jumping to the original locations (default handlers) here (to make sure it's not my handlers that screw up here). That should work, shouldn't it?

Hmm, you're right. How does it crash? Can you check that the value written to vecbase is the same as the one you read from vecbase afterwards?

Re: Interrupted even if interrupts are disabled?!

Posted: Tue Jul 21, 2015 2:45 pm
by rhaenel
That was the golden hint, thank you!

For whatever reason, the low 7 bits of the vecbase must be zero; at least, they get zeroed out when it is written with wsr.vecbase. In other words, the vector table has to be 128-byte aligned. So that was easy to fix, and now the thing runs!
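
The read-back check suggested above can be modelled on a host machine. This is only a sketch: it assumes, based purely on the behaviour reported in this post, that a write clears the low 7 bits of the value. `vecbase_after_write` and `vecbase_is_aligned` are made-up helper names, not SDK functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption taken from the post: the low 7 bits written to vecbase
   read back as zero, i.e. the table must be 128-byte aligned. */
#define VECBASE_ALIGN 128u

/* Hypothetical model of what a read of vecbase returns after writing addr. */
static uint32_t vecbase_after_write(uint32_t addr)
{
    return addr & ~(uint32_t)(VECBASE_ALIGN - 1u);
}

/* True if the table address survives the write unchanged. */
static bool vecbase_is_aligned(uint32_t addr)
{
    return vecbase_after_write(addr) == addr;
}
```

Under this model the original value 0x40100000 passes the check, while a table placed at, say, 0x40100010 would read back as 0x40100000, so the vectors would be fetched from 16 bytes before the intended table.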

So, here is the full code to 'get rid' of the NMI and do critical timing stuff. Hope this turns out to be useful for someone else:

Code:
void time_critical_function()     // DO NOT USE ICACHE_FLASH_ATTR HERE!!!
{
   void *nmi_isr;

   // no interrupts during sending data to guarantee timing
   taskDISABLE_INTERRUPTS();

   __asm__ __volatile__ (
         "j function_entry\n"

         ".align 128\n"
         "vecbase_mod:\n"
         "nop\n"

         ".align 16\n"
         "debug_exception_mod:\n"
         "nop\n"

         ".align 16\n"
         "nmi_exception_mod:\n"
         "wsr.excsave3 a0\n"         // preserve original a0 register
         "rsr.epc3 a0\n"            // get PC from before interrupt
         "jx a0\n"               // and jump back there

         ".align 16\n"            //xxxx
         "kernel_exception_mod:\n"
         "nop\n"

         ".align 16\n"            //xxxx
         "nop\n"

         ".align 16\n"
         "user_exception_mod:\n"
         "nop\n"

         ".align 16\n"            //xxxx
         "nop\n"

         ".align 16\n"
         "double_exception_mod:\n"
         "nop\n"

         ".align 16\n"            //xxxx
         "nop\n"

         ".align 16\n"
         "panic_exception_mod:\n"
         "nop\n"

         "function_entry:\n"
         "rsr.vecbase %0\n"
         "movi a2, vecbase_mod\n"
         "wsr.vecbase a2\n"

         : "=r" (nmi_isr)
         :
         : "a2", "memory"
   );

   // do the time critical stuff here
   // ....................
   // ....................
   // ....................
   // ....................

   __asm__ __volatile__ (
         "wsr.vecbase %0\n"            // restore original vecbase
         "rsr.ps a2\n"               // check if we're running under exception
         "bbci a2, %2, function_end\n"   // indicated by EXCM bit in the PS register
         "rsr.excsave3 a0\n"            // restore original a0
         "movi a2, function_end\n"
         "wsr.epc3 a2\n"               // generate return vector when original NMI returns
         "addi %0, %0, %1\n"            // add offset to vecbase to get NMI ISR
         "jx %0\n"                  // and jump to the original NMI ISR
         "function_end:\n"
         :
         : "r" (nmi_isr), "i" (XCHAL_NMI_VECOFS), "i" (PS_EXCM_SHIFT)
         : "a2", "memory"
   );

   // re-enable interrupts
   taskENABLE_INTERRUPTS();

}
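
For reference, the layout that the `.align` directives in the first block produce, and the decision made in the second block, can be written out in plain C. This is a host-side sketch, not target code. The offsets are read off the listing above (every stub is shorter than 16 bytes, so each `.align 16` moves on to the next 16-byte slot). The value 4 for the EXCM bit position is an assumption about what PS_EXCM_SHIFT expands to, and the helper names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot offsets relative to vecbase_mod, derived from the listing above.
   The unnamed "xxxx" padding slots sit at 0x40, 0x60 and 0x80. */
enum {
    DEBUG_OFS  = 0x10,
    NMI_OFS    = 0x20, /* assumed to equal XCHAL_NMI_VECOFS */
    KERNEL_OFS = 0x30,
    USER_OFS   = 0x50,
    DOUBLE_OFS = 0x70,
    PANIC_OFS  = 0x90
};

#define EXCM_SHIFT 4u /* assumption: stand-in for PS_EXCM_SHIFT */

/* Mirrors "bbci a2, %2, function_end": if the bit is set, an NMI was
   taken during the critical section and the real handler must still run. */
static bool nmi_pending(uint32_t ps)
{
    return ((ps >> EXCM_SHIFT) & 1u) != 0u;
}

/* Mirrors "addi %0, %0, %1": address of the original NMI handler. */
static uint32_t original_nmi_isr(uint32_t saved_vecbase)
{
    return saved_vecbase + NMI_OFS;
}
```

With the saved vecbase of 0x40100000 mentioned earlier in the thread, `original_nmi_isr` gives 0x40100020 as the address the restore path jumps to.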


:D :D :D

Re: Interrupted even if interrupts are disabled?!

Posted: Wed Aug 05, 2015 11:58 pm
by projectgus
rhaenel wrote: So, here is the full code to 'get rid' of the NMI and do critical timing stuff. Hope this turns out to be useful for someone else:


Hi rhaenel,

Thanks for posting this, it was extremely useful. I had assumed the NMI was pretty non-maskable until I saw these posts!

I did a few experiments of my own last week while working on esp-open-rtos, trying to resolve a problem with dropped packets while interrupts were disabled. Disabling the NMI didn't resolve my problem: it seems frames are ACKed at the WiFi layer even if the NMI is off, whereas I was hoping they'd be dropped, thereby triggering retries at the WiFi layer. However, I did learn a couple of other things that I can share:

- This may be obvious to some, but I didn't notice at first that the code above clobbers a0 if an NMI comes in during the "time critical" period. So you can't call any functions as part of the "time critical" stuff, and even simple C may be dangerous depending on register allocation.

The safest thing is probably to use one __asm__ block for the whole thing (disable->time critical->enable) and list "a0" in the block's clobbered registers so that gcc saves and restores it correctly.

- Depending on how long the NMI is disabled, the ESP may still lock up after the NMI is restored. What seems to happen is that a sanity check in the lower MAC layer fails during the NMI, and you'll see "lmac.c 576" or "wdev.c 1166" printed, followed by a watchdog reset. I think this happens if too many WiFi frames are received while the NMI is disabled: some radio-facing buffer overflows and the system can't recover.

On a quiet WiFi network it seems you can disable the NMI for approximately 50000-100000 instructions; on a busy WiFi network (simulated by ping flooding the ESP) it's more like 7500. So it's still a lot of instructions to play with, depending on what you need to do. I don't know how that translates to actual times. :)

- I found a way to disable the NMI without clobbering a0, because there is a function, wDev_MacTim1Arm, that triggers the NMI. This function is normally used to schedule the NMI watchdog every 20ms, but when called with delay=1 it triggers the NMI (setting delay=0 races the timer count and sometimes misses the NMI).

I get the same results calling wDev_MacTim1Arm as with the above code, except that the time critical code can call functions, because no registers need to be reserved. This makes it possible to write disableNMI() and enableNMI() functions that could be called from multiple places.
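
As a rough answer to the timing question in the second point, instruction counts can be turned into time once a clock rate and a cycles-per-instruction figure are assumed. The sketch below is meant to be called with the 80 MHz default CPU clock and exactly one cycle per instruction, which is optimistic (loads, branches and anything fetched from flash cost more). `instructions_to_us` is a made-up helper, not part of any SDK.

```c
#include <stdint.h>

/* Convert an instruction count to microseconds.
   cpu_mhz and cycles_per_instr are assumptions supplied by the caller. */
static double instructions_to_us(uint32_t instructions, uint32_t cpu_mhz,
                                 double cycles_per_instr)
{
    return (double)instructions * cycles_per_instr / (double)cpu_mhz;
}
```

Under those assumptions, `instructions_to_us(7500, 80, 1.0)` comes to about 94 microseconds, and the 50000-100000 range comes to roughly 625-1250 microseconds. Treat these as lower bounds on the real durations.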

Here's a quick test, adapted from the above:
Code:
IRAM void timeCriticalTest()
{
   void *nmi_isr;

   vPortEnterCritical();

   __asm__ __volatile__ (
         "j function_entry\n"

         ".align 128\n"
         "vecbase_mod:\n"
         "nop\n"

         ".align 16\n"
         "debug_exception_mod:\n"
         "nop\n"

         ".align 16\n"
         "nmi_exception_mod:\n"
         "rfi 3\n"

         "function_entry:\n"
         "rsr.vecbase %0\n"
         "movi a2, vecbase_mod\n"
         "wsr.vecbase a2\n"

         : "=r" (nmi_isr)
         :
         : "a2", "memory"
   );

   /* Roughly calibrated delay loop.
      Count (approx x100 instructions) on first line lets you modify delay duration: */
   /* 1000 = ESP will crash with quiet WiFi */
   /* 500 = ESP seems stable on quiet WiFi, but instant crash on ping flood */
   /* 100 = crashes under ping flood */
   /* 75 =  couldn't make it crash */
   __asm__ __volatile__ (
       "movi a2, 500\n"
       "movi a3, 1\n"
       "loop_top:\n"
       ".rept 100\n"
       "or a3, a3, a3\n"
       ".endr\n"
       "sub a2, a2, a3\n"
       "bnez a2, loop_top\n"
       :
       :
       : "a2", "a3"
       );

   __asm__ __volatile__ (
         "wsr.vecbase %0\n"            // restore original vecbase
         :
         : "r" (nmi_isr)
         : "memory"
   );

   sdk_wDev_MacTim1Arm(1);

   vPortExitCritical();
}


(The above is code for esp-open-rtos, which is based on RTOS SDK 0.9.9, so it will need some changes to work with the IoT SDK. The wDev_MacTim1Arm() function may also be different, missing, or renamed in the IoT SDK.)
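
The "approx x100 instructions" figure in the delay loop comment can be made a little more precise by counting what the loop executes. A small host-side sketch, assuming the assembler emits exactly the instructions as written; `delay_loop_instructions` is a made-up helper:

```c
#include <stdint.h>

#define REPT_COUNT 100u /* ".rept 100" copies of "or a3, a3, a3" */

/* Instructions executed by the delay loop for a given start count:
   two movi for setup, then (100 or + sub + bnez) per iteration. */
static uint32_t delay_loop_instructions(uint32_t count)
{
    return 2u + count * (REPT_COUNT + 2u);
}
```

A count of 500 works out to 51002 instructions and a count of 75 to 7652, which lines up with the 50000 and 7500 figures quoted in the post.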

One last thing: I saw your post on bbs.espressif asking for ways to run a timer in the NMI. Foogod at the esp8266-re project has found a way to do this as well; take a look at wDev_MacTimArm & wDev_MacTimSetFunc. In theory these give you a way to run a timer handler in the NMI at a specified timeout, and they aren't used for anything else in the 0.9.9 RTOS SDK.

Hope some of this is of interest to someone!

Re: Interrupted even if interrupts are disabled?!

Posted: Sat Aug 08, 2015 12:38 pm
by tve
Thanks for the great explanations!