Forum Replies Created
Hi Neil,
Thanks for your good explanations and ideas. As you’ve described, the Mayfly->MMW system in its current form would need additional engineering effort to be able to support a high-reliability communication system with built-in buffering, retries, etc.
At the moment, I’m trying to understand why our Mayfly systems’ reliability has dropped recently, apparently since we’ve implemented code changes. We have loggers that have been in the field for 9 months, sampling every 5 minutes, with no data missing from MMW. Beginning in January, we loaded new firmware onto a handful of those boards (or have simply swapped new, reprogrammed boards in their place) in order to take advantage of improvements that were made to ModularSensors since the boards were first deployed last summer/fall. Since installing that new firmware, a number of these Mayfly stations have begun to have multiple missing datapoints per day, and in some cases, missing data for hours before resuming.
So what I’ve attempted to do is to reestablish a known, working, baseline state to see if I can identify any bug that I might have introduced. I began by cleaning up my development environment (removing all PlatformIO libraries from global storage), creating a new PlatformIO project from the logging_to_MMW example code, and making minimal changes (setting UUIDs, station identifier, etc.). The errors and missing data I’ve described have been the outcome of this testing so far. So I’m a bit stumped as to why I’ve been unable to get back to the higher reliability that we seem to have had previously.
It seems to me that, in the case of the 504 response codes, the MMW REST endpoint is receiving the messages (since it is returning a response) but, perhaps for reasons internal to the server, is failing to save them to its database.
Thanks for reading. I’m kind of a one-man show as far as writing and testing this code, so I’m grateful for any and all suggestions!
Best,
Matt
P.S. I have to mention the irony that, when I went to have a look at your ModularSensors repo, GitHub spun for a while and then eventually returned a 504 Gateway Time-out page. 🙂 It has since recovered.
Thank you, Neil!
Here is more information, from a test I ran overnight using a 2-minute sampling interval, from 04-01 15:32 to 04-02 08:08 MST (UTC-7:00):
- 498 total samples taken, of which 7 failed to get inserted into the MMW database.
- 55 sample events received a Response Code 504.
- 1 event received RC 400.
- The remainder received RC 201 (Successfully created).
Here are the timestamps, in MST, of the 7 missing sample events, along with their corresponding Response Codes or error messages:
- 16:00 (RC 504)
- 18:28 (GPRS connection failed.)
- 18:30 (RC 504)
- 23:22 (RC 504)
- 23:48 (RC 504)
- 03:48 (RC 400)
- 07:48 (RC 504)
I’ll attach my code and the log file from this test run. The site name on MMW is TU_BOISE.
I’m curious whether anyone else may have received 504 responses from the server at similar times, and whether others are seeing missing data points in MMW. It’s difficult (or impossible) to detect these 504 errors at the Mayfly unless it’s connected to Serial Monitor, as mine is during testing, but you might see missing points on MMW if you download a CSV of your data.
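For anyone who wants to check a downloaded file for gaps, here is a minimal Python sketch of the idea. It assumes the timestamps have already been parsed from the CSV and that they fall exactly on the sampling schedule. The function name `find_missing` and the sample data (including the year) are hypothetical and only for illustration.

```python
from datetime import datetime, timedelta


def find_missing(received, start, end, interval):
    """Return every expected timestamp in [start, end] that is not in `received`."""
    seen = set(received)
    missing = []
    expected = start
    while expected <= end:
        if expected not in seen:
            missing.append(expected)
        expected += interval
    return missing


# Hypothetical example data: a 2-minute schedule with one point dropped.
start = datetime(2020, 4, 1, 15, 56)
end = datetime(2020, 4, 1, 16, 4)
interval = timedelta(minutes=2)
received = [
    datetime(2020, 4, 1, 15, 56),
    datetime(2020, 4, 1, 15, 58),
    datetime(2020, 4, 1, 16, 2),
    datetime(2020, 4, 1, 16, 4),
]

for ts in find_missing(received, start, end, interval):
    print("missing:", ts.isoformat())
```

If the logged timestamps drift by a few seconds, they would need to be rounded to the nearest interval before the comparison.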
Or perhaps there’s logging on the MMW server that could correlate to these gaps? Thanks for any insights that anyone can provide!
Matt
I’ve been getting a lot of 504 (Gateway Timeout) HTTP responses to the POST messages sent to MMW. I ran a test today with my Mayfly sampling every 2 minutes, and out of 111 samples, I got 34 successful (201) response codes and 76 response codes of 504. I also got one 502-Bad Gateway, and that datapoint didn’t make it up to MMW; the rest did. My log file is attached.
Is anyone else getting these, and do we know what causes them?
And it’s down again. 8:54am MDT. I’ll also post on GitHub.
Thanks Heather!
Hi Sara,
Today I started to shift my workflow to having all dependencies local to a given project as you suggested. I’m wondering whether it’s even necessary to clone the ModularSensors dependencies into the project’s .pio\libdeps path? If I create a new PlatformIO project and modify its .ini file to include:
lib_ldf_mode = deep+
lib_ignore = RTCZero
lib_deps =
    EnviroDIY_ModularSensors
    StreamDebugger

… then PIO populates a number of dependencies into .pio\libdeps\mayfly\* … It’s not everything from here, but it’s enough to let the project successfully build. [edit 3/21: I ran pio lib update, and it fully populates the dependencies here.]
I have not seen this dependency population happen before, and I’m wondering whether this is new behavior? I updated my PIO core to the latest, 4.3.1, today.
Another question, if you don’t mind: Do you normally specify the version of ModularSensors in your ini when you’re preparing code for a particular Mayfly (as opposed to when you’re doing development work on the library itself)? Initially I was specifying version, simply because I had started from one of the examples, and its ini specified EnviroDIY_ModularSensors@0.23.16. Or do you just leave off the version number and let it grab the latest?
Thanks!
Matt
The new SIM card from Verizon works! I first did some testing using XCTU, and I can set the Carrier Profile (CP) either to 3 (Verizon) or to 0 (Auto-detect), and it registers and works successfully. (Not surprisingly, it never registers when CP is set to 2 (AT&T), and I also presume that setting CP=0 in an AT&T-only coverage area would not work.)
Back on the Mayfly, I changed the APN value to ‘vzwinternet’, and it works there too. (The APN value of ‘-‘ will also work; this is the default setting in XCTU.)
Best,
Matt
Thanks Sara. I managed to get the forbidden list displayed, and cleared. Not certain what I was doing wrong before.
If I remain in the Console and continue querying, the forbidden list will get repopulated after 15-30 seconds. The response is:
+CRSM: 144,0,"130184FFFFFFFFFFFFFFFFFF"
OK
So, assuming that value decodes to MCC=311 and MNC=480, that’s Verizon, and it appears that Verizon just won’t accept the Hologram SIM.
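The decoding assumed here can be checked with a short sketch. Each entry in the forbidden PLMN list is 3 bytes with the digits stored nibble-swapped, and an unused slot is all F. The Python below is a minimal illustration of that layout, not vendor code; the function names `decode_plmn` and `decode_fplmn` are made up for this example.

```python
def decode_plmn(entry_hex):
    """Decode one 3-byte PLMN entry (6 hex chars) into (MCC, MNC); None if unused."""
    h = entry_hex.upper()
    if h == "FFFFFF":
        return None
    # Byte 1 = MCC digit 2 | MCC digit 1
    # Byte 2 = MNC digit 3 | MCC digit 3  (MNC digit 3 is F for a 2-digit MNC)
    # Byte 3 = MNC digit 2 | MNC digit 1
    mcc = h[1] + h[0] + h[3]
    mnc = h[5] + h[4]
    if h[2] != "F":
        mnc += h[2]
    return mcc, mnc


def decode_fplmn(data_hex):
    """Split the data field of a +CRSM response into entries and decode the used ones."""
    entries = []
    for i in range(0, len(data_hex) - 5, 6):
        plmn = decode_plmn(data_hex[i:i + 6])
        if plmn is not None:
            entries.append(plmn)
    return entries


# First entry from the response above, followed by three empty slots.
print(decode_fplmn("130184" + "FFFFFF" * 3))
```

For the first entry, 130184, this gives MCC 311 and MNC 480, which matches the reading above.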
I’ve ordered a couple of Verizon SIMs from DigiKey, so we’ll see what happens with those. AT&T just won’t work at this particular monitoring site. We haven’t tried T-Mobile, but I assume its coverage is the same as (or less than) AT&T.
Matt
More info: It appears that the command is being sent to the modem at every comma; I had been pasting the entire command into the Console window and getting the 3 errors. If I try typing it by hand, as soon as I type this comma: at+crsm=176,
…it immediately returns ERROR.
I’m unable to query the FPLMN list. I did it successfully last week, so not sure what’s up now.
When I issue the command, I get 3 errors:
at+crsm=176,28539,0,0,12
ERROR
ERROR
ERROR
I thought that the last time I got it to work, I was in bypass mode and airplane mode. This time, I’ve tried bypass and transparent modes and have tried it in airplane mode and in normal mode (AM=0). All return 3 errors. I’m using the Digi TH dev board and XCTU.
I also get an error (just 1) when I issue
at+umnoprof?
Thanks for any insights!
Matt