Timing Analysis #519
Comments
@manuelbl what a great test tool! Thanks for the careful work! I assume the LoRa device you are testing is using an ESP32 or some other device with a 100 ppm or better system clock, rather than the Murata module (which has a 4000 ppm system clock). No errors that I can see; I came to the same conclusions about timing by analysis, and by my somewhat different approach using rwc_nst_test. I can also tell that you are testing with HEAD (3.0.99.10 or later), based on the timing. You have experimentally confirmed exactly the timing that I designed after finding the results summarized at #483 (comment). The key finding was that, in noisy environments, the packet loss rate was much higher if I delayed the start-of-RX by several symbols into the transmit preamble, for example: This doesn't reproduce, by the way, with pristine environments, but was pretty reproducible in an unintentionally noisy environment. The comments in the mbed code referenced in #438 (https://github.com/ARMmbed/mbed-os/pull/8822/files) suggest that Semtech was having problems with similar things; they effectively moved their low-speed window into one of the nulls of the picture above. Based on this, and the observed data integrity problems, I moved the window earlier than Semtech. Keeping the window short (in terms of RX syms) is generally good if you can get away with it; the reason Semtech makes it longer is because of the inaccuracy in the millisecond timer you mention. (You have to open up the window if you don't know when it's happening.) It's a power thing. However, we have to also take into account clock accuracy. 100 ppm clocks are typical, but 4000 ppm clocks are not uncommon (stm32l0 HSI clocks, for example). This complicates join accept timing, which may be delayed six seconds. I did calculations with a 100 ppm clock, and I found that my nominal calculations worked out (I hope someone will double check) but with a 4000 ppm clock, the timing blows out (even on the lmic) exactly as described. 
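To put the clock-accuracy numbers above in perspective, here is a minimal sketch of the arithmetic. It is illustrative only and not taken from the LMIC source; the function names are hypothetical, and the 1 s case is included on the assumption that it matches the usual RX1 delay. The symbol times passed in are those of SF7 and SF12 at 125 kHz.

```python
# Illustrative arithmetic only; names are hypothetical, not LMIC APIs.

def clock_drift_us(delay_s: float, ppm: float) -> float:
    """Worst-case timing error (in µs) accumulated over delay_s seconds
    by a clock whose accuracy is +/- ppm parts per million."""
    return delay_s * ppm  # seconds * 1e-6 * ppm, expressed in µs


def drift_in_symbols(delay_s: float, ppm: float, t_sym_us: float) -> float:
    """The same error expressed in LoRa symbols of length t_sym_us."""
    return clock_drift_us(delay_s, ppm) / t_sym_us


SF7_SYMBOL_US = 1024.0    # 2**7 / 125 kHz
SF12_SYMBOL_US = 32768.0  # 2**12 / 125 kHz

for ppm in (100, 4000):
    for delay_s in (1, 6):  # 1 s: assumed RX1 delay; 6 s: delayed join accept
        drift = clock_drift_us(delay_s, ppm)
        print(f"{ppm:>4} ppm, {delay_s} s: {drift:>7.0f} µs "
              f"= {drift_in_symbols(delay_s, ppm, SF7_SYMBOL_US):6.2f} symbols at SF7, "
              f"{drift_in_symbols(delay_s, ppm, SF12_SYMBOL_US):5.2f} symbols at SF12")
```

With a 100 ppm clock the six-second error stays below one SF7 symbol, while a 4000 ppm clock is off by 24 ms, which is many SF7 symbols and is consistent with the remark that the timing "blows out".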
Please try using If you open the window too late, you won't get enough preamble. I experimentally verified that, using an SX1276 to transmit, the window closes hard 4 symbols after the start of transmit. But the SX1276 is documented as transmitting 10 symbols of preamble by default; and other places state that it requires 6 symbols of preamble to sync up. (So this matches.) Here's a picture: The documentation available is confusing and contradictory, and I couldn't get a good timing reference from the RWC5020, but I believe that the gateways transmit either 8 or 10 symbols of preamble. (I assume that for downlink they'd reduce this if they could since downlink time is relatively precious.) So my formulas assume that the window is only 8 symbols, and that means we need to land somewhere in the first two symbols. In fact, because of my experiments, I determined that the radios I have performed the best if the RX window was already open when the preamble appeared, possibly because the PLLs have stabilized. That biased me to the beginning of the window. I also subtracted 2 ms because it takes 2 ms for the power amps and analog circuits to turn on (again, this isn't well documented, but I found lots of anecdotal evidence); and in the picture above, you can see that integrity starts to go down as you get close to the right hand side of the window (as a starting time). Another relevant fact: it actually doesn't matter much if you're a little early, because the hard stop is the end of the preamble. If the SX1276 starts seeing a preamble during the RXSYMS timeout period, it will say "I have a preamble" and will go on to receive the packet. I didn't experimentally measure how many symbols of overlap you have to have, but I believe it is 4 symbols. 
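The reasoning above (an 8-symbol preamble, roughly 6 symbols needed to sync, and about 2 ms of receiver turn-on time) can be turned into a rough estimate of the latest moment the RX command may be issued. This is a simplified sketch under exactly those assumptions; the function name and the model are hypothetical and do not reproduce the LMIC's actual formula.

```python
# Simplified model; the default values are the figures quoted in the comment above.

def latest_rx_start_us(sf: int,
                       preamble_syms: int = 8,
                       sync_syms: int = 6,
                       rx_turn_on_us: float = 2000.0,
                       bw_hz: int = 125_000) -> float:
    """Latest time (µs, relative to the start of the preamble) at which
    the RX command can be issued and still leave sync_syms of preamble.
    A negative value means the receiver must be started before the preamble."""
    t_sym_us = (2 ** sf) * 1e6 / bw_hz
    slack_us = (preamble_syms - sync_syms) * t_sym_us
    return slack_us - rx_turn_on_us


for sf in range(7, 13):
    print(f"SF{sf}: latest start {latest_rx_start_us(sf):9.0f} µs after preamble start")
```

At SF7 the two symbols of slack are almost entirely consumed by the assumed 2 ms turn-on time, whereas at SF12 there are tens of milliseconds to spare.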
If you aim to land at time t, and the absolute value of your clock error is e (0 being perfect), then you might land at So the LMIC, on the assumption that it might be slow moves up Then, on the assumption that it might be fast, it extends the rx window (in units of rxsyms) until it is sure that there is an overlap. The Semtech and mbed formulas try to "reduce power" by opening the rx window as late as they can (so as to waste less power during preamble time). The fact is, however, that RX downlinks for Class A devices are rare; so it doesn't matter how late you open the window for a successful rx, it matters how long you open the windows for failed rx. |
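The two-step adjustment described above (start earlier in case the clock is slow, then lengthen the window in case it is fast) can be sketched as follows. The formulas in the original comment were not preserved, so this assumes the simplest model: aiming for time t with relative clock error e means landing somewhere in [t(1 - e), t(1 + e)]. All names are hypothetical and this is not the LMIC implementation.

```python
import math

# Assumed model: landing time lies in [t*(1 - e), t*(1 + e)].

def landing_interval_us(t_us: float, ppm: float) -> tuple[float, float]:
    """Earliest and latest actual landing time when aiming for t_us."""
    e = ppm * 1e-6
    return t_us * (1 - e), t_us * (1 + e)


def extra_rx_symbols(t_us: float, ppm: float, t_sym_us: float) -> int:
    """Whole symbols by which the RX window must grow so that the earliest
    and the latest landing time are both covered."""
    early, late = landing_interval_us(t_us, ppm)
    return math.ceil((late - early) / t_sym_us)


# SF7 at 125 kHz has a symbol time of 1024 µs.
print(extra_rx_symbols(1_000_000, 100, 1024.0))   # 1 s delay, 100 ppm
print(extra_rx_symbols(6_000_000, 4000, 1024.0))  # 6 s delay, 4000 ppm
```

The cost of a poor clock shows up in the second case: the window grows by dozens of symbols, and that extra receive time is paid on every uplink that gets no downlink.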
By the way,
I don't agree with this completely, based on my experiments and based on my analysis above. The margins have to be chosen so that worstcase(T early) plus (number of rx syms) overlap the preamble enough to trigger a reception. This is not necessarily symmetrical when you consider the empirical fact that the more preamble you see, the better; and when you consider that there's a fixed (roughly 2ms) time after you start the radio before it actually is receiving reliably. |
I've tried to come up with a model that takes radio wake up time etc. into account such that 0 margin at the start and at the end is the extreme edge case that still works. However, you have correctly pointed out two things that are not correctly covered in my model:
I will refine my model accordingly. Regarding item 2: You are saying that the radio startup time is roughly 2ms. Do you have any evidence that the timeout starts ticking down before the receiver is really able to detect preamble symbols, or that it is only able to detect full symbols? Or how did you come up with 2ms? Other data is contradictory: I've measured 310µs. SemTech uses 1ms knowing that they will be off by 0.5ms on average. The datasheet shows a figure with a receiver startup time (from standby) and specifies it as TS_FS (60µs) + TS_RE (71µs @ 125kHz) = 131µs only... (BTW: For the newer SX126x chip, SemTech uses a much longer wake-up time of 8ms!) And I was wondering why you go to such great lengths to have precise timing and compensate for clock drift when it could easily be mitigated with a slightly longer RX window. Is it to reduce power consumption? |
Well, to some degree I'm following the existing practice; believe it or not, I'm trying not to impose my own design philosophy on the IBM-originated framework. Indeed, the theory is that you want the RX window as short as possible, because you pay the full cost of the RX window twice on each uplink (unless there's a downlink). |
I looked at the various other implementations and applied "judgement informed by experience". I could easily measure it with my setup, but I wanted to allow for noise, analog settling, etc. I also wanted the receiver (in the 100ppm case) to be spun up before the TX preamble starts, because of laboratory evidence that it helps. That could just be an artifact of the physical design of the Murata module, etc. I believe I also have to leave time for powering up the TCXO on the Murata module. Fundamentally, I suspect 8ms is closer to physical reality for low-noise high gain analog front ends than a few hundred microseconds; but I'm old, and maybe things are much better than I expect. |
The final thing that needs to be done to make the LMIC pass full certification is to get FSK working properly. Perhaps your tool will do that. I need to find a BluePill, or port your code to the STM32L0. @manuelbl does the bluepill's CPU have multiple SPI busses? We could use a second SPI port in parallel and capture the input traffic from the SX1276. (I'm already scheming on adapting your tool to help with the bringup of the SX1262 port of the LMIC.) |
Yes, the BluePill has two SPI buses. The code should work with any STM32F103C based board. In fact, I have been using three different ones. The effort to port it to the STM32L0 is probably bigger as it would use a different set of pins and buses and needs other clocks and interrupts. |
Well, I ordered a couple of BluePill devices from Amazon, we'll see how it goes. Thanks again for this work. |
Thanks for this great tool! PS: I can't believe that other LoRaWAN stacks did such precise timing analyses, e.g. those stacks found on Chinese STM32 chips with an AT-command UART interface. |
There are two things you can look at in the output:
The current LMIC timing assumes that asymmetric margins are advantageous. So far I have assumed otherwise. It depends on the behavior of the SX127x chip. I'm currently running some additional tests (a STM32 MCU controlling two RFM95W modules (sender and receiver) that are half a breadboard apart...). I want to learn more about it (ramp-up time, behavior if two or three preamble symbols are detected and the time-out occurs etc.). I'm close to having results and might come back with a refined model soon. The current LMIC timing also uses rather narrow margins, much narrower than the Semtech software. If you want to increase them, use |
I set up a probing environment with an STM32 BluePill and an ESP32 HeltecV2 board, running my multitasking paxcounter application with MCCI LMIC (current HEAD) on core1. Smooth join and payload communication with SF7. Below is the timing log. I am not fully sure yet how to interpret it, can you help? A first peek shows huge correction values, but low jitter.
|
@cyberman54 -- it depends on what you mean by "optimum". This shows that the LMIC is starting earlier than @manuelbl's code expects. His code's expectation is based on Semtech's published recommendations; LMIC is tuned for best likelihood of reception based on empirical tests. It definitely uses more battery power in the case of lots of downlinks at low data rates (high spreading factors); but in my testing it has better PER. The window timing (for battery power) only has significant effect at low data rates, because the preamble is very long, and you must have the receiver on for longer. But if you're not downlinking a lot, there is no difference between the LMIC timing and the Semtech reference code timing; both listen for a certain number of symbols and then time out. Same number of symbols ==> same power. Furthermore, receiver-on power is roughly 10% of the transmitter-on power. Since there's at least one transmit per receive, the timing of the RX window has at most a 2nd-order (10%) effect on battery life; and probably much less, because the preamble time is only a small fraction of the overall receive message time. |
@terrillmoore thanks for the explanation. I'm aware that RX timing is not the big battery drainer, but that's not the reason why I am trying to get a closer look at timing. I'm still struggling with RX problems in my application, like (too) many join retries and sometimes lost downlink payload. My goal is to make sure that there are no LMIC timing issues caused by running the LMIC stack in my multitasking ESP32 environment. |
Some more samples for SF12:
|
@cyberman54 sorry for being so pedantic! (I was really writing for the people who find this later with web search.) It strikes me that you should look at the received packet RSSI and SNR. I fixed the calculations for that in recent builds, so I think they're more meaningful. It also strikes me that you could use SPI or a serial port to copy the credentials to a second device, using something like the rwc_nst_test sketch, but close to your main device -- and then you could display downlinks independent of timing. You'd basically have to get the downlink channel to the second device in time, and trigger a continuous read at the appropriate spreading factor, etc. Indeed, it strikes me that we could probably hack your device to do a continuous read rather than a windowed read, for test purposes. But the thing to do would be side-by-side testing, one windowed, one continuous, and see if the data shows up at all. If it's not on the air, we're not going to receive it. |
At the Things Conference last month, I heard that LimeSDR can be used for LoRa packet sniffing. I don't know if it's true or not, but this might be the best way to find out whether this is an LMIC problem or a "no data" problem. In any case, @cyberman54, the downlink problems you're seeing really deserve their own ticket; I'm not sure we should be chasing them here. |
I'm still working on a more refined timing model and trying to figure out how much time the ramp-up takes, how much time is spent in post-processing the received packet until the interrupt is triggered, and whether the timeout can be extended and still raise the timeout interrupt. I hope to have some results within a week (if I can sort out the difference I currently see between two setups). Can you leave it open for a few more days?
…On Sun, Feb 9, 2020 at 9:09 PM Terry Moore ***@***.***> wrote:
At the Things Conference last month, I heard that LimeSDR can be used for
LoRa packet sniffing. I don't know if it's true or not, but this might be
the best way to find out whether this is an LMIC problem or a "no data"
problem. In any case, @cyberman54 <https://github.com/cyberman54>, how
should we proceed with this ticket?
|
Of course, happy to. Just doing a sweep on things and pinging people. |
Thanks for this hint. I will order a Lime Mini SDR to test. |
I got a Lime Mini SDR. Any hints / suggestions what software to use for Lora timing analysis? |
I'm unlikely to find time to continue with the analysis. Therefore, I'm closing it. |
I finally had time to complete the SX127x Probe, run sufficient tests, compare LMIC and Semtech code and come up with some findings.
Probe
The SX127x probe is a STM32 MCU that I can connect to a SX127x chip/board in parallel to the MCU that controls it. It observes the SPI communication (MCU to SX127x only) and the DIO0 and DIO1 pins. Based on the observations, it can measure and analyze many timing parameters – in particular the RX windows that are so essential for joining and downlink packets. The output (on a UART TX pin) looks something like this:
The interesting parameters are the computed margins at the start and at the end. Each margin measures by how many µs the MCU (or the gateway) could be early or late and the transmitted packet would still be successfully received. Except for power consumption, the bigger the margin, the better. For the best robustness, the margins at the start and end should be equally long. The conceptual model is described in more detail in the README.
Findings
Timing accuracy
The first finding was that the LMIC code is very repeatable in executing an action at a certain time, within about 2µs. The SemTech code (https://github.com/Lora-net/LoRaMac-node) is only accurate within 1ms, about 500 times worse. This is not surprising as all time values are rounded to 1ms internally. As it turns out, it's still sufficiently precise.
RX window offset and length
LMIC and SemTech use two different formulas to determine when the RX window should start and how long it should be. I've simplified the formulas somewhat (no rounding, no drift). At the core they are:
tSymbol is the length of a symbol. It depends on the data rate.
tWakeUp is the time the SX127x needs to wake up (process the command to start receiving, stabilize the frequency etc.). It is set to 1ms. According to my observations, the wake up time is about 300µs.
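For reference, the dependence of tSymbol on the data rate follows the standard LoRa relation tSymbol = 2^SF / BW. A minimal sketch, assuming the EU868 data rates DR0 to DR5 (SF12 down to SF7 at 125 kHz bandwidth); the function and table names are hypothetical:

```python
# tSymbol = 2**SF / BW (standard LoRa relation).

def symbol_time_ms(sf: int, bw_hz: int = 125_000) -> float:
    """Length of one LoRa symbol in milliseconds."""
    return (2 ** sf) * 1000 / bw_hz


# Assumed EU868 mapping of data rate to spreading factor at 125 kHz.
EU868_SF = {0: 12, 1: 11, 2: 10, 3: 9, 4: 8, 5: 7}

for dr, sf in EU868_SF.items():
    print(f"DR{dr} (SF{sf}): tSymbol = {symbol_time_ms(sf):.3f} ms")
```

The symbol time doubles with each step in spreading factor, from about 1 ms at SF7 to about 33 ms at SF12, which is why a margin expressed in symbols and a margin expressed in milliseconds behave so differently across data rates.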
The SemTech window is obviously much longer. Independent of the data rate, they have a total margin of about 40ms while LMIC has only 2ms.
The purpose of the SemTech formula for the window offset isn't obvious at first sight. It has been chosen such that the margins at the start and at the end are equally long. So it seems that SemTech's timing is based on the same conceptual model.
The effective values for the different data rates (for EU868) are shown below. They use the more complex formulas with rounding. And they match what I have observed by hooking up the probe to a circuit running with the respective software.
SemTech
The slight difference between the length of the start and end margin is due to rounding and due to a more precise model regarding wake up. But given the 1ms resolution of SemTech's code and the 20ms margin, it's not relevant.
LMIC
The LMIC margins are asymmetric and differ considerably depending on the data rate. For higher data rates, the margin at the end is low.
Conclusions
Even though the LMIC margins are narrow, it still worked. But it could make sense to adopt a formula similar to SemTech's, i.e. a more or less fixed and symmetric margin independent of the data rate. It would probably increase robustness without sacrificing low power consumption.
@terrillmoore I'm interested in your feedback. Is the analysis understandable? Do you see any obvious errors?