Background:
- Five CR1000s hooked up to RF-401 modules with Yagi antennas
- All CR1000s have unique PakBus IDs
- 'Base station' hooked to a RavenXT modem
- Setup was functional for 4 months: we were able to remotely connect and retrieve data from all stations
Problem:
- Started happening 4 days ago.
- Upon opening the IP port to the base station (e.g. connecting to a station, or pulling data from a station), one station (not the base) appears to transmit the same packets continuously. Using LogTool, we see the following transactions being transmitted over and over:
"2016-03-30 9:59:44 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 234"
"2016-03-30 9:59:45 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 135"
"2016-03-30 9:59:45 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 77"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 230"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 62"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 82"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 246"
"2016-03-30 9:59:48 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 98"
"2016-03-30 9:59:48 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 135"
"2016-03-30 9:59:48 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 83"
The PakBus ID of the station in question is 8.
- We can no longer connect to the station in question remotely.
- We can connect to the other stations and retrieve data
- The station no longer shows up as a neighbor of the base station (even though it is transmitting the above packets)
Does anyone know what is going on here?
The "Please Wait" message is transmitted by the datalogger in response to a pending transaction (such as a clock check or get table defs) while the datalogger is in a state where it cannot immediately respond to the command. In LoggerNet, this has the effect of pushing the transaction timeout further ahead, so as long as the datalogger keeps sending these messages in a timely manner, LoggerNet will happily wait for the datalogger to finally send the response message it is waiting for.
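To illustrate the timeout behaviour described above, here is a minimal Python sketch. It is not LoggerNet's actual implementation: the function name `transaction_outcome`, the event labels, and the rule that each Please Wait simply restarts a fixed timeout are all assumptions made for illustration.

```python
def transaction_outcome(events, timeout):
    """Simulate one pending transaction that starts at t = 0.

    events:  list of (time_in_seconds, kind) sorted by time, where kind is
             "please_wait" or "response" (hypothetical labels).
    timeout: seconds the client waits after the start, or after the most
             recent Please Wait, before giving up (assumed fixed).
    Returns ("completed", t) or ("timed_out", deadline).
    """
    deadline = timeout
    for t, kind in events:
        if t > deadline:
            # Nothing arrived in time, so the client already gave up.
            return ("timed_out", deadline)
        if kind == "response":
            return ("completed", t)
        if kind == "please_wait":
            # Assumption: each Please Wait pushes the deadline ahead.
            deadline = t + timeout
    # No response ever arrived; the client gives up at the last deadline.
    return ("timed_out", deadline)


# A logger that is busy for 12 s but keeps sending Please Wait every 4 s.
busy = [(4, "please_wait"), (8, "please_wait"), (12, "response")]
print(transaction_outcome(busy, timeout=5))

# The same 12 s delay without any Please Wait messages.
silent = [(12, "response")]
print(transaction_outcome(silent, timeout=5))
```

Under these assumptions the first transaction completes at 12 s, while the second is abandoned at 5 s. A logger that never stops sending Please Wait would keep the transaction open indefinitely, which matches the symptom reported in this thread.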
The most common situation in which the datalogger will be in this state is when the datalogger is compiling its program and allocating associated resources. If the program has one or more cardout() instructions, this can take a long time indeed as the datalogger needs to create the file(s) associated with cardout() and these files could be megabytes or gigabytes long. I have seen cases where it takes multiple minutes for this to happen but I have never seen a case where the datalogger requires days to finish!
As noted above, it is unusual that the datalogger would continue to issue this Please Wait command, and also at that frequency. As you mention, the PakBus address is 8, and the 4094 is LoggerNet's address. So the datalogger is continually telling LoggerNet to wait. The developers here tell me that the datalogger would normally respond about every four seconds, if it was "busy" doing something and was issuing a Please Wait Command.
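To compare the rate in the posted log excerpt with the roughly four-second interval mentioned above, a short Python sketch like the following could be used. It assumes every log line has the same comma-separated layout as the excerpt; the helper names `parse_row` and `message_rate` are illustrative and not part of LogTool or LoggerNet.

```python
import csv
from datetime import datetime

# The ten lines quoted in the original post.
SAMPLE_LOG = '''"2016-03-30 9:59:44 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 234"
"2016-03-30 9:59:45 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 135"
"2016-03-30 9:59:45 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 77"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 230"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 62"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 82"
"2016-03-30 9:59:47 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 246"
"2016-03-30 9:59:48 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 98"
"2016-03-30 9:59:48 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 135"
"2016-03-30 9:59:48 AM","PakBusPort","S","received message","src: 8","dest: 4094","proto: BMP5","type: 0xa1","tran: 83"'''


def parse_row(row):
    """Return (timestamp, fields) for one CSV row of the log."""
    timestamp = datetime.strptime(row[0], "%Y-%m-%d %I:%M:%S %p")
    # Columns from the fifth onward look like "key: value".
    fields = dict(item.split(": ", 1) for item in row[4:] if ": " in item)
    return timestamp, fields


def message_rate(log_text, src, msg_type):
    """Count matching messages and return (count, mean gap in seconds)."""
    times = []
    for row in csv.reader(log_text.splitlines()):
        if len(row) < 5:
            continue  # skip blank or malformed lines
        timestamp, fields = parse_row(row)
        if fields.get("src") == src and fields.get("type") == msg_type:
            times.append(timestamp)
    if len(times) < 2:
        return len(times), None
    span = (times[-1] - times[0]).total_seconds()
    return len(times), span / (len(times) - 1)


count, mean_gap = message_rate(SAMPLE_LOG, src="8", msg_type="0xa1")
print(f"{count} messages, mean gap {mean_gap:.2f} s")
```

For the ten lines quoted, this gives ten messages over four seconds, a mean gap of about 0.44 s. That is far more frequent than one message every four seconds. Because the log timestamps have one-second resolution, the figure is only approximate.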
Do you know what OS is in the datalogger? Do you have a complete set of logs so that we may take a look (comms, trans, io, and state logs)? If you have logs, please zip & send them to support at campbellsci dot com and ask that they be forwarded to me.
My assumption is that power-cycling the datalogger would likely get it out of this weird state, but it would be good to take a look at the logs and see what is going on. Also, with each OS release we fix bugs, so it could be something that exists in the OS but has since been fixed.
Dana W.
Thank you for the responses.
Following what jtrauntvein said, it seems to be relevant to mention that all stations have CF cards installed on them and the program is scheduled to write a record every 1 hour to the CF card. The cards were replaced about 14 days ago, and the station that has the mentioned issue had been reporting (paraphrasing) "Card Error. Error creating new files" in the Status Table. However, despite this error, this issue did not exist for roughly 14 days, and no changes were made to any stations since the time this error was encountered.
The CR1000s are running either OS version 18 (most likely) or 25. I have been reluctant to attempt any major updates (such as the OS) while the loggers are deployed.
As requested, I have emailed the files to the address provided.
It would be good to update OSes when you can. We have fairly recently (within the past couple of OSes) fixed some issues that were related to cards. And if you are running an OS as old as 18, there have been many issues fixed.
If you have direct access to the dataloggers, DevConfig will let you create a backup configuration file (Backup menu). So you can create a backup, load the new OS, and then send the backup to the datalogger to restore all settings, files, etc. It (hopefully) makes the process a little less painful.
If you must update the datalogger remotely, keep in mind that all settings in your datalogger WILL BE RESET! An OS load from LoggerNet's Connect window does not always reset the device to the defaults. However, in this instance it will. From the Rev Text on our site:
I am out of the office until Monday, but will check any logs you send at that time.
Best,
Dana W.