I'm running LoggerNet for Linux 4.2.1 with 5 configured loggers. All use TCP/IP for data collection. All loggers are configured to collect data every hour at xx:04 with 1 primary retry after 5 minutes. No secondary retries.
Every few days, the data collection of some loggers blocks and no more data is collected for these loggers. I observed this behavior for 2 of the 5 loggers so far. Only a restart of LoggerNet seems to help.
The same loggers a collected by LoggerNet for Windows at the same time (with different collection times configured) without any problems, so it seems to be an issue of LoggerNet for Linux.
I can provide full logs if required. Below I'm providing filtered logs for the logger "SCB2". The last successful data collection was at 13.3.2015 at 23:04, from then on, no data collection for SCB2 was successful anymore.
I think the message "Should get off line because of a parent request" is rather strange.
Also, the time gap of 2 minutes between the these to line is suspicious:
"2015-03-14 00:04:00.011","IPPort_2SCB2","Dev::onNextCommand","Opening"
"2015-03-14 00:06:07.201","IPPort_2SCB2","Dev::sesBegin","0xf2a25ef0"
Related to this issue is also thread http://www.campbellsci.com/forum/messages.cfm?threadid=6BE4A5E4-2755-4E16-9E94E7A33CA609B2
Thank you for your help,
Marcel
~~~~
Filtered state log:
"2015-03-13 23:04:00.008","IMIS_2SCB2","Dev::sesBegin","0xf2a48aa8"
"2015-03-13 23:04:00.008","IMIS_2SCB2","Dev::cmdAdd","Classic::CmdA","2"
"2015-03-13 23:04:00.010","IPPort_2SCB2","Dev::reqDevice","Requesting device: IMIS_2SCB2"
"2015-03-13 23:04:00.011","IPPort_2SCB2","Dev::onNextCommand","Opening"
"2015-03-13 23:04:01.627","IPPort_2SCB2","Dev::sesBegin","0xf2a54000"
"2015-03-13 23:04:01.627","IPPort_2SCB2","Dev","Going on-line"
"2015-03-13 23:04:01.627","IMIS_2SCB2","Dev::reqDevResp","IPPort_2SCB2","IMIS_2SCB2","success"
"2015-03-13 23:04:01.627","IMIS_2SCB2","Dev::sesBegin","0xf4d317d8"
"2015-03-13 23:04:01.627","IMIS_2SCB2","Dev","Going on-line"
"2015-03-13 23:04:01.627","IMIS_2SCB2","Dev::onNextCommand","Executing command","Classic::CmdA","2"
"2015-03-13 23:04:07.207","IMIS_2SCB2","Dev::cmdAdd","Classic::CommandG","2"
"2015-03-13 23:04:07.207","IMIS_2SCB2","Dev::cmdFinished","Classic::CmdA"
"2015-03-13 23:04:07.207","IMIS_2SCB2","Dev::onNextCommand","Executing command","Classic::CommandG","2"
"2015-03-13 23:04:09.287","IMIS_2SCB2","Dev::cmdAdd","Classic::CommandF","2"
"2015-03-13 23:04:09.287","IMIS_2SCB2","Dev::cmdFinished","Classic::CommandG"
"2015-03-13 23:04:09.287","IMIS_2SCB2","Dev::onNextCommand","Executing command","Classic::CommandF","2"
"2015-03-13 23:04:10.993","IMIS_2SCB2","Dev::cmdAdd","Classic::CommandF","2"
"2015-03-13 23:04:10.993","IMIS_2SCB2","Dev::cmdFinished","Classic::CommandF"
"2015-03-13 23:04:10.993","IMIS_2SCB2","Dev::onNextCommand","Executing command","Classic::CommandF","2"
"2015-03-13 23:04:13.328","IMIS_2SCB2","Dev::cmdAdd","Classic::CommandF","2"
"2015-03-13 23:04:13.328","IMIS_2SCB2","Dev::cmdFinished","Classic::CommandF"
"2015-03-13 23:04:13.328","IMIS_2SCB2","Dev::onNextCommand","Executing command","Classic::CommandF","2"
"2015-03-13 23:04:14.967","IMIS_2SCB2","Dev::cmdFinished","Classic::CommandF"
"2015-03-13 23:04:14.967","IMIS_2SCB2","DevHelpers::HangupDelaySession","Hangup delay: 10"
"2015-03-13 23:04:14.968","IMIS_2SCB2","Dev::sesEnd","0xf2a48aa8"
"2015-03-13 23:04:14.973","IMIS_2SCB2","DevHelpers::HangupDelaySession","post completion"
"2015-03-13 23:04:14.973","IMIS_2SCB2","Dev::sesEnd","0xf4d317d8"
"2015-03-13 23:04:14.973","IMIS_2SCB2","Dev","Hangup delay complete received, no sessions left"
"2015-03-13 23:04:14.973","IMIS_2SCB2","Dev::onNextCommand","Ending"
"2015-03-13 23:04:19.327","IPPort_2SCB2","Dev::relDevice","IMIS_2SCB2"
"2015-03-13 23:04:19.327","IMIS_2SCB2","Dev::cmdFinished","Classic::CmdEnd"
"2015-03-13 23:04:19.327","IMIS_2SCB2","Dev::sesEnd","0"
"2015-03-13 23:04:19.327","IMIS_2SCB2","Dev::goingOffLine","Reset off-line counter"
"2015-03-13 23:04:19.327","IPPort_2SCB2","Dev","undialing complete"
"2015-03-13 23:04:19.327","IPPort_2SCB2","DevHelpers::HangupDelaySession","Hangup delay: 10"
"2015-03-13 23:04:19.333","IPPort_2SCB2","DevHelpers::HangupDelaySession","post completion"
"2015-03-13 23:04:19.333","IPPort_2SCB2","Dev::sesEnd","0xf2a54000"
"2015-03-13 23:04:19.333","IPPort_2SCB2","Dev","Hangup delay complete received, no sessions left"
"2015-03-13 23:04:19.333","IPPort_2SCB2","Dev::onNextCommand","Ending"
"2015-03-13 23:04:19.427","IPPort_2SCB2","Dev::cmdFinished","Comm::Root::end_command_type"
"2015-03-13 23:04:19.427","IPPort_2SCB2","Dev::sesEnd","0"
"2015-03-13 23:04:19.427","IPPort_2SCB2","Dev::goingOffLine","Reset off-line counter"
"2015-03-13 23:04:19.427","IPPort_2SCB2","Root::goingOffLine","Callback delay armed"
"2015-03-13 23:04:20.426","IPPort_2SCB2","Comm::Root::onOneShotFired","Callback delay"
"2015-03-14 00:04:00.008","IMIS_2SCB2","Dev::sesBegin","0xf4d7d098"
"2015-03-14 00:04:00.008","IMIS_2SCB2","Dev::cmdAdd","Classic::CmdA","2"
"2015-03-14 00:04:00.010","IPPort_2SCB2","Dev::reqDevice","Requesting device: IMIS_2SCB2"
"2015-03-14 00:04:00.011","IPPort_2SCB2","Dev::onNextCommand","Opening"
"2015-03-14 00:06:07.201","IPPort_2SCB2","Dev::sesBegin","0xf2a25ef0"
"2015-03-14 00:06:07.201","IPPort_2SCB2","Dev","Going on-line"
"2015-03-14 00:06:07.202","IMIS_2SCB2","Dev::reqDevResp","IPPort_2SCB2","IMIS_2SCB2","success"
"2015-03-14 00:06:07.202","IMIS_2SCB2","Dev::sesBegin","0xf2a53818"
"2015-03-14 00:06:07.202","IMIS_2SCB2","Dev","Going on-line"
"2015-03-14 00:06:07.202","IMIS_2SCB2","Dev::onNextCommand","Executing command","Classic::CmdA","2"
"2015-03-14 00:07:07.174","IMIS_2SCB2","Dev::sesEnd","0xf2a53818"
"2015-03-14 00:07:07.174","IPPort_2SCB2","Dev::sesEnd","0xf2a25ef0"
"2015-03-14 00:07:07.174","IMIS_2SCB2","Dev::cmdFinished","Classic::CmdA"
"2015-03-14 00:07:07.175","IMIS_2SCB2","Dev::sesEnd","0xf4d7d098"
"2015-03-14 00:07:07.175","IMIS_2SCB2","Dev::shouldGetOffLine","Should get off line because of a parent request"
"2015-03-14 00:07:07.175","IMIS_2SCB2","Dev::onNextCommand","Ending"
"2015-03-14 00:07:07.175","IPPort_2SCB2","Dev::relDevice","IMIS_2SCB2"
"2015-03-14 00:07:07.175","IMIS_2SCB2","Dev::goingOffLine","Reset off-line counter"
"2015-03-14 00:07:07.175","IPPort_2SCB2","Dev","undialing complete"
"2015-03-14 00:07:07.175","IPPort_2SCB2","Dev::on_undial_complete","there is no hangup delay session"
"2015-03-14 00:07:07.175","IPPort_2SCB2","Dev::shouldGetOffLine","Should get off line because of a parent request"
"2015-03-14 00:07:07.175","IPPort_2SCB2","Dev::onNextCommand","Ending"
"2015-03-14 00:07:07.175","IPPort_2SCB2","Dev::cmdFinished","Comm::Root::end_command_type"
"2015-03-14 00:07:07.176","IPPort_2SCB2","Dev::sesEnd","0"
"2015-03-14 00:07:07.176","IPPort_2SCB2","Dev::goingOffLine","Reset off-line counter"
"2015-03-14 00:07:07.176","IPPort_2SCB2","Root::goingOffLine","Callback delay armed"
"2015-03-14 00:07:08.175","IPPort_2SCB2","Comm::Root::onOneShotFired","Callback delay"
"2015-03-14 00:12:07.175","IMIS_2SCB2","Dev::sesBegin","0xf2a1efb8"
"2015-03-14 00:12:07.175","IMIS_2SCB2","Dev::cmdAdd","Classic::CmdA","2"
"2015-03-14 00:12:07.175","IPPort_2SCB2","Dev::reqDevice","Requesting device: IMIS_2SCB2"
"2015-03-14 00:12:07.175","IPPort_2SCB2","Dev::onNextCommand","Opening"
"2015-03-14 00:14:14.369","IPPort_2SCB2","Dev::sesBegin","0xf2a48ec0"
"2015-03-14 00:14:14.370","IPPort_2SCB2","Dev","Going on-line"
"2015-03-14 00:14:14.370","IMIS_2SCB2","Dev::reqDevResp","IPPort_2SCB2","IMIS_2SCB2","success"
"2015-03-14 00:14:14.370","IMIS_2SCB2","Dev::sesBegin","0xf4ddfd80"
"2015-03-14 00:14:14.370","IMIS_2SCB2","Dev","Going on-line"
"2015-03-14 00:14:14.370","IMIS_2SCB2","Dev::onNextCommand","Executing command","Classic::CmdA","2"
"2015-03-14 00:15:14.345","IMIS_2SCB2","Dev::sesEnd","0xf4ddfd80"
"2015-03-14 00:15:14.345","IPPort_2SCB2","Dev::sesEnd","0xf2a48ec0"
"2015-03-14 00:15:14.346","IMIS_2SCB2","Dev::cmdFinished","Classic::CmdA"
"2015-03-14 00:15:14.346","IMIS_2SCB2","Dev::sesEnd","0xf2a1efb8"
"2015-03-14 00:15:14.347","IMIS_2SCB2","Dev::shouldGetOffLine","Should get off line because of a parent request"
"2015-03-14 00:15:14.347","IMIS_2SCB2","Dev::onNextCommand","Ending"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev::relDevice","IMIS_2SCB2"
"2015-03-14 00:15:14.347","IMIS_2SCB2","Dev::goingOffLine","Reset off-line counter"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev","undialing complete"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev::on_undial_complete","there is no hangup delay session"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev::shouldGetOffLine","Should get off line because of a parent request"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev::onNextCommand","Ending"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev::cmdFinished","Comm::Root::end_command_type"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev::sesEnd","0"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Dev::goingOffLine","Reset off-line counter"
"2015-03-14 00:15:14.347","IPPort_2SCB2","Root::goingOffLine","Callback delay armed"
"2015-03-14 00:15:15.345","IPPort_2SCB2","Comm::Root::onOneShotFired","Callback delay"
Filtered transaction log:
"2015-03-13 23:04:00.006","IMIS_2SCB2","188","Next time to poll","2015-03-14 00:04:00"
"2015-03-13 23:04:00.006","IMIS_2SCB2","96","Scheduled poll started"
"2015-03-13 23:04:00.008","IMIS_2SCB2","92","Collect area poll started","final_storage_1"
"2015-03-13 23:04:13.328","IMIS_2SCB2","93","Final storage values received","final_storage_1","99"
"2015-03-13 23:04:13.328","IMIS_2SCB2","144","data file opened","final_storage_1","/home/dataflow/Spools/Dispatcher/2SCB2.dat"
"2015-03-13 23:04:14.967","IMIS_2SCB2","93","Final storage values received","final_storage_1","5"
"2015-03-13 23:04:14.967","IMIS_2SCB2","145","data file closed","final_storage_1","/home/dataflow/Spools/Dispatcher/2SCB2.dat"
"2015-03-13 23:04:14.967","IMIS_2SCB2","94","Collect area poll complete","final_storage_1","104","104"
"2015-03-13 23:04:14.968","IMIS_2SCB2","97","Scheduled polling complete"
"2015-03-13 23:04:14.968","IMIS_2SCB2","188","Next time to poll","2015-03-14 00:04:00"
"2015-03-14 00:04:00.006","IMIS_2SCB2","188","Next time to poll","2015-03-14 01:04:00"
"2015-03-14 00:04:00.006","IMIS_2SCB2","96","Scheduled poll started"
"2015-03-14 00:04:00.008","IMIS_2SCB2","92","Collect area poll started","final_storage_1"
"2015-03-14 00:07:07.174","IMIS_2SCB2","95","Collect area poll failed","final_storage_1","0","0"
"2015-03-14 00:07:07.174","IMIS_2SCB2","98","Scheduled polling failed"
"2015-03-14 00:07:07.174","IMIS_2SCB2","59","Collection schedule: primary retry"
"2015-03-14 00:07:07.174","IMIS_2SCB2","188","Next time to poll","2015-03-14 00:12:07.174708"
"2015-03-14 00:07:07.175","IMIS_2SCB2","188","Next time to poll","2015-03-14 00:12:07.174708"
"2015-03-14 00:12:07.174","IMIS_2SCB2","188","Next time to poll","2015-03-14 00:17:07.174708"
"2015-03-14 00:12:07.175","IMIS_2SCB2","96","Scheduled poll started"
"2015-03-14 00:12:07.175","IMIS_2SCB2","92","Collect area poll started","final_storage_1"
"2015-03-14 00:15:14.345","IMIS_2SCB2","95","Collect area poll failed","final_storage_1","0","0"
"2015-03-14 00:15:14.346","IMIS_2SCB2","98","Scheduled polling failed"
"2015-03-14 00:15:14.346","IMIS_2SCB2","62","Primary retry collection attempt failed"
"2015-03-14 00:15:14.346","IMIS_2SCB2","130","Last primary retry failed","0"
"2015-03-14 00:15:14.346","IMIS_2SCB2","58","Collection schedule: normal"
"2015-03-14 00:15:14.346","IMIS_2SCB2","188","Next time to poll","2015-03-14 01:04:00"
"2015-03-14 00:15:14.346","IMIS_2SCB2","188","Next time to poll","2015-03-14 01:04:00"
Filtered comm log:
"2015-03-13 23:04:00.011","IPPort_2SCB2","S","opening","41798749388.wsl.ch,6785"
"2015-03-13 23:04:01.626","IPPort_2SCB2","S","Provider opened"
"2015-03-13 23:04:01.627","IPPort_2SCB2","S","Device dialed"
"2015-03-13 23:04:07.207","IMIS_2SCB2","S","Classic::CmdA"
"2015-03-13 23:04:09.287","IMIS_2SCB2","S","Classic::CommandG"
"2015-03-13 23:04:10.993","IMIS_2SCB2","S","Classic::CommandF"
"2015-03-13 23:04:13.327","IMIS_2SCB2","S","Classic::CommandF"
"2015-03-13 23:04:14.967","IMIS_2SCB2","S","Classic::CommandF"
"2015-03-13 23:04:19.427","IPPort_2SCB2","S","Provider closed"
"2015-03-14 00:04:00.011","IPPort_2SCB2","S","opening","41798749388.wsl.ch,6785"
"2015-03-14 00:06:07.201","IPPort_2SCB2","S","Provider opened"
"2015-03-14 00:06:07.202","IPPort_2SCB2","S","Device dialed"
"2015-03-14 00:07:07.173","IMIS_2SCB2","F","Classic::CmdA"
"2015-03-14 00:07:07.175","IPPort_2SCB2","S","Provider closed"
"2015-03-14 00:12:07.175","IPPort_2SCB2","S","opening","41798749388.wsl.ch,6785"
"2015-03-14 00:14:14.369","IPPort_2SCB2","S","Provider opened"
"2015-03-14 00:14:14.370","IPPort_2SCB2","S","Device dialed"
"2015-03-14 00:15:14.345","IMIS_2SCB2","F","Classic::CmdA"
"2015-03-14 00:15:14.347","IPPort_2SCB2","S","Provider closed"
io$IPPort_2SCB2.log:
2015-03-13 23:04:01.627
B r e a k --------------------------------------------------
23:04:01.627 T 0d .
23:04:01.965 T 0d .
23:04:02.305 T 0d .
23:04:02.649 T 0d .
23:04:05.267 R 0d 0a 2a 0d 0a 2a ..*..*
23:04:05.297 T 31 41 0d 1A.
23:04:07.206 R 31 41 0d 0a 20 52 2b 31 30 32 32 35 2e 20 46 2b 1A.. R+10225. F+
23:04:07.206 R 34 38 38 31 32 2e 20 56 35 20 41 31 20 4c 2b 31 48812. V5 A1 L+1
23:04:07.206 R 30 32 32 35 2e 20 45 30 30 20 30 30 20 30 30 20 0225. E00 00 00
23:04:07.206 R 4d 30 32 35 36 20 20 20 42 2b 33 2e 32 37 37 32 M0256 B+3.2772
23:04:07.206 R 20 43 33 31 38 36 20 20 0d 0a 2a C3186 ..*
23:04:07.208 T 31 30 31 31 37 47 0d 10117G.
23:04:09.287 R 31 30 31 31 37 47 0d 0a 20 41 31 20 4c 2b 31 30 10117G.. A1 L+10
23:04:09.287 R 31 31 37 20 43 30 39 39 30 20 20 0d 0a 2a 117 C0990 ..*
23:04:09.288 T 34 46 0d 4F.
23:04:10.992 R 34 46 0d 0a 6b b8 60 00 46 af eb 14 55 92 4F..k.`.F...U.
23:04:10.993 T 39 39 46 0d 99F.
23:04:13.327 R 39 39 46 0d 0a fc 96 27 08 44 b0 07 df 00 48 08 99F....'.D....H.
23:04:13.327 R b6 05 fe 60 00 60 00 60 00 60 00 fc 64 27 08 44 ...`.`.`.`..d'.D
23:04:13.327 R b0 07 df 00 48 08 b8 65 95 69 8c 29 ca eb 7f 23 ....H..e.i.)...#
23:04:13.327 R 46 65 cf 27 5b 1b 57 60 00 49 c4 53 88 23 e8 60 Fe.'[.W`.I.S.#.`
23:04:13.327 R ce e1 67 e1 33 e3 7e c3 c6 fd 91 27 08 44 b0 07 ..g.3.~....'.D..
23:04:13.327 R df 00 48 08 b8 66 0e 17 c5 44 f5 44 f4 ee a6 63 ..H..f...D.D...c
23:04:13.327 R e8 60 00 04 df 6b b8 60 00 45 73 eb 1e fc 96 27 .`...k.`.Es....'
23:04:13.327 R 08 44 b0 07 df 00 48 08 fc 05 fe 60 00 60 00 60 .D....H....`.`.`
23:04:13.327 R 00 60 00 fc 64 27 08 44 b0 07 df 00 48 08 fe 65 .`..d'.D....H..e
23:04:13.327 R dd 69 8c 2a 37 eb 77 23 47 66 11 27 5e 1b 57 60 .i.*7.w#Gf.'^.W`
23:04:13.327 R 00 49 c4 53 88 23 e8 60 ce e1 67 e1 33 e3 80 c3 .I.S.#.`..g.3...
23:04:13.327 R fa fd 91 27 08 44 b0 07 df 00 48 08 fe 66 0e 17 ...'.D....H..f..
23:04:13.327 R c5 44 f3 44 f2 ef 60 63 e8 60 00 92 23 .D.D..`c.`..#
23:04:13.329 T 35 46 0d 5F.
23:04:14.966 R 35 46 0d 0a 04 df 6b b8 60 00 48 97 eb db 39 b2 5F....k.`.H...9.
23:04:14.973 T 45 E
23:04:17.226 R 45 E
23:04:17.227 T 0d .
23:04:19.226 R 0d 0a ..
1, 6, 18
2015-03-13 23:04:19.327
B r e a k --------------------------------------------------
Going off line
B r e a k --------------------------------------------------
1, 6, 18
2015-03-14 00:06:07.201
B r e a k --------------------------------------------------
1, 6, 18
2015-03-14 00:07:07.175
B r e a k --------------------------------------------------
Going off line
B r e a k --------------------------------------------------
1, 6, 18
2015-03-14 00:14:14.370
B r e a k --------------------------------------------------
1, 6, 18
2015-03-14 00:15:14.347
B r e a k --------------------------------------------------
Going off line
B r e a k --------------------------------------------------
1, 6, 18
2015-03-14 01:06:07.329
B r e a k --------------------------------------------------
1, 6, 18
What is the PakBus address of (the pakbus port used by) your LoggerNet for Linux server ? Is it different than the default setting of 4094 ? If you have both LN for Windows and LN for Linux set at 4094 for PB address then you are certain to have problems. Each LoggerNet server/pb port that accesses a datalogger (whether Windows or Linux) should have a unique PakBus address.
See section "4.2.8.1.2 PakBus Settings" in the LoggerNet manual, accessed in the SetupScreen utility's Tools menu.
https://s.campbellsci.com/documents/us/manuals/loggernet.pdf
Thank you for your answer. The pakbus port were identical. I changed the one of the LN Linux. However, only one of the 5 loggers configured on LN Linux is a pakbus logger. The logger "SCB2", on which the data collection stucked, is a CR10x.
Don't pay too much attention to the state log. It is intended to help troubleshoot problems internal to the LoggerNet daemon. The "should get off-line because of parent request messages" are happening because of communication errors. The pattern emerges when you look at the time in question in the comms and low level logs.
After your cut-off time, it is taking over two minutes between the time when LoggerNet attempts to first establish the connection with the remote modem and the time when the final TCP acknowledgement comes from the remote modem. Further, it looks like the TCP connection is failing when LoggerNet attempts to first send characters on the newly established link. The fact that nothing is printed in the low level log after your cut-off time also indicates that there is something wrong with the link.
I agree that there's an issue with the link. As at the same time another LN instance (Windows version) is able to collect the data just fine, I think the issue is inside LN Linux. Also that everything is just fine after a restart of LN Linux is indicating that something got stuck inside LN Linux. Is there a way to increase the amount of log messages to get more information what is going on?
It happened again. Same logger as last time is not reachable anymore. Interesting: The last successful data collection has the same timestamp as last time: 23:04 hours. Maybe a coincidence.
This time i tried using the Connect Screen to connect to the logger which can't be reached anymore. When I click "connect", LoggerNet tries to connect forever, there's no timeout or error message like usually. One more indicator that there's something's wrong within LoggerNet.
How do we proceed to make some progress on this issue? It's there anything I can provide to help or investigate?
It happened again. The data collection of 2 loggers blocked. I can connect to those loggers via telnet, another windows LoggerNet instance is collecting them without any problems. So it MUST be a problem within LoggerNet for Linux.
I think the whole issue was caused by low memory on the machine running LoggerNet for Linux.
Now I'm sure the issue was related due to low memory. Since the machine is running on 2 GB instead of 0.5 GB there's no more trouble at all.
Cheers,
Marcel