RepetierServer killed
Hi,
I'm seeing random kills of the RepetierServer process. I'm running it on an Intel NUC mini computer with Ubuntu Server. There are 20 printers connected, each through a Raspberry Pi running ser2net, and that part works pretty well as far as I can see. From time to time the RepetierServer service gets killed, but I can't figure out why. I've monitored CPU and RAM usage with sar, and everything seems stable. The NUC doesn't reboot or anything. The only unusual thing was one printer that didn't have its USB connector plugged in and was not disabled, so it kept logging 'error: Reading serial conection failed: End of file. Closing connection.' Could that be a reason for a crash? Does anyone have a clue where to look for an error or cause like this?
Thanks in advance,
Christophe
Syslog
Jan 23 22:12:39 toadi3dprinters systemd[1]: RepetierServer.service: Main process exited, code=killed, status=6/ABRT
Jan 23 22:12:39 toadi3dprinters systemd[1]: RepetierServer.service: Failed with result 'signal'.
Jan 23 22:12:39 toadi3dprinters systemd[1]: RepetierServer.service: Service has no hold-off time (RestartSec=0), scheduling restart.
Jan 23 22:12:39 toadi3dprinters systemd[1]: RepetierServer.service: Scheduled restart job, restart counter is at 1.
Jan 23 22:12:39 toadi3dprinters systemd[1]: Stopped Repetier-Server 3D Printer Server.
Jan 23 22:12:39 toadi3dprinters systemd[1]: Starting Repetier-Server 3D Printer Server...
Jan 23 22:12:39 toadi3dprinters systemd[1]: Started Repetier-Server 3D Printer Server.
RepetierServer
2020-01-23 22:12:31: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:32: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:34: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:35: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:37: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:39: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:39: Start logging...
2020-01-23 22:12:39: Webdirectory: /usr/local/Repetier-Server/www/
2020-01-23 22:12:39: Storage directory: /var/lib/Repetier-Server/
2020-01-23 22:12:39: Configuration file: /usr/local/Repetier-Server/etc/RepetierServer.xml
2020-01-23 22:12:39: Directory for temporary files: /tmp/
2020-01-23 22:12:39: Reading firmware data ...
2020-01-23 22:12:39: Starting Network ...
2020-01-23 22:12:39: Active features:4095
2020-01-23 22:12:39: Reading printer configurations ...
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_0016.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_0016
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_16.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_16
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_12.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_12
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_13.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_13
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_006.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_006
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/Prusa_004.xml
2020-01-23 22:12:39: Starting printjob manager thread for Prusa_004
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_0014.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_0014
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/Prusa_001.xml
2020-01-23 22:12:39: Starting printjob manager thread for Prusa_001
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_0017.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_0017
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_011.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_011
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_0015.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_0015
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_3.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_3
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/Prusa_003.xml
2020-01-23 22:12:39: Starting printjob manager thread for Prusa_003
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_11.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_11
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/Prusa_002.xml
2020-01-23 22:12:39: Starting printjob manager thread for Prusa_002
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_15.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_15
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_10.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_10
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_9.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_9
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_001.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_001
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/RD_Printer_1.xml
2020-01-23 22:12:39: Starting printjob manager thread for RD_Printer_1
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_003.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_003
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_0013.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_0013
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_0012.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_0012
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_004.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_004
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/Prusa_005.xml
2020-01-23 22:12:39: Starting printjob manager thread for Prusa_005
2020-01-23 22:12:39: Reading printer config /var/lib/Repetier-Server/configs/CR_14.xml
2020-01-23 22:12:39: Starting printjob manager thread for CR_14
2020-01-23 22:12:39: Starting printer threads ...
2020-01-23 22:12:39: Starting printer thread for CR_16
2020-01-23 22:12:39: Starting printer thread for CR_006
2020-01-23 22:12:39: Starting printer thread for CR_0016
2020-01-23 22:12:39: Starting printer thread for Prusa_001
2020-01-23 22:12:39: Starting printer thread for Prusa_004
2020-01-23 22:12:39: Starting printer thread for CR_011
2020-01-23 22:12:39: Starting printer thread for CR_0015
2020-01-23 22:12:39: Starting printer thread for Prusa_003
2020-01-23 22:12:39: Starting printer thread for CR_13
2020-01-23 22:12:39: Starting printer thread for CR_15
2020-01-23 22:12:39: Starting printer thread for CR_10
2020-01-23 22:12:39: Starting printer thread for CR_12
2020-01-23 22:12:39: Starting printer thread for RD_Printer_1
2020-01-23 22:12:39: Starting printer thread for CR_11
2020-01-23 22:12:39: Starting printer thread for CR_0013
2020-01-23 22:12:39: Starting printer thread for CR_004
2020-01-23 22:12:39: Starting printer thread for CR_0017
2020-01-23 22:12:39: Starting printer thread for Prusa_005
2020-01-23 22:12:39: Starting printer thread for CR_0014
2020-01-23 22:12:39: Starting printer thread for CR_3
2020-01-23 22:12:39: Starting printer thread for CR_9
2020-01-23 22:12:39: Starting printer thread for CR_003
2020-01-23 22:12:39: Starting printer thread for Prusa_002
2020-01-23 22:12:39: Starting printer thread for CR_0012
2020-01-23 22:12:39: Starting printer thread for CR_14
2020-01-23 22:12:39: Starting printer thread for CR_001
2020-01-23 22:12:39: Starting work dispatcher subsystem ...
2020-01-23 22:12:39: Starting user database ...
2020-01-23 22:12:39: Importing projects ...
2020-01-23 22:12:39: Initializing LUA ...
2020-01-23 22:12:39: Register LUA cloud services
2020-01-23 22:12:39: add G-Code-Renderer
2020-01-23 22:12:39: LUA initalization finished.
2020-01-23 22:12:39: Work dispatcher thread started.
2020-01-23 22:12:39: Internal work dispatcher thread started.
2020-01-23 22:12:39: Starting web server ...
2020-01-23 22:12:39: Webserver started.
2020-01-23 22:12:40: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:40: Connection started: Creality 20
2020-01-23 22:12:40: Connection started: Creality 04
2020-01-23 22:12:40: Connection started: Creality 17
2020-01-23 22:12:40: Connection started: Creality 18
2020-01-23 22:12:40: Connection started: Creality 02
2020-01-23 22:12:40: Connection started: Creality 01
2020-01-23 22:12:40: Connection started: Creality 03
2020-01-23 22:12:41: Connection started: Creality 14
2020-01-23 22:12:41: Connection started: Creality 11
2020-01-23 22:12:41: Connection started: Creality 05
2020-01-23 22:12:41: Connection started: Creality 19
2020-01-23 22:12:41: Connection started: Creality 09
2020-01-23 22:12:41: Connection started: Creality 13
2020-01-23 22:12:41: Connection started: Creality 10
2020-01-23 22:12:41: Connection started: Creality 07
2020-01-23 22:12:41: Connection started: Creality 08
2020-01-23 22:12:41: Connection started: Creality 06
2020-01-23 22:12:41: Connection started: Creality 15
2020-01-23 22:12:41: Connection started: Creality 12
2020-01-23 22:12:41: Connection started: Creality 16
2020-01-23 22:12:41: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:12:42: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:44: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:12:45: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:46: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:12:48: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:50: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:12:52: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:54: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:12:56: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:57: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:12:58: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:12:59: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:00: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:02: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:03: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:05: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:06: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:08: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:10: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:12: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:13: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:14: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:16: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:17: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:18: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:20: error: Reading serial conection failed: Connection reset by peer. Closing connection.
2020-01-23 22:13:21: error: Reading serial conection failed: End of file. Closing connection.
2020-01-23 22:13:23: error: Reading serial conection failed: Connection reset by peer. Closing connection.
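To see what the server logged right around each restart, the log above can be summarized with a short script. This is only a sketch: it assumes every entry has the "YYYY-MM-DD HH:MM:SS: message" shape shown above and that a "Start logging..." entry marks a fresh server start. The function name summarize and the sample list are made up for illustration.

```python
import re
from collections import Counter

# One log entry, e.g. "2020-01-23 22:12:39: Start logging..."
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")

def summarize(lines):
    """Return (restart_times, error_counts) for an iterable of log lines."""
    restarts = []        # timestamps of "Start logging..." entries
    errors = Counter()   # how often each distinct error message occurs
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue     # skip anything that is not a timestamped entry
        stamp, message = match.groups()
        if message.startswith("Start logging"):
            restarts.append(stamp)
        elif message.startswith("error:"):
            errors[message] += 1
    return restarts, errors

# Hypothetical excerpt in the same format as the log above.
sample = [
    "2020-01-23 22:12:37: error: Reading serial conection failed: End of file. Closing connection.",
    "2020-01-23 22:12:39: error: Reading serial conection failed: End of file. Closing connection.",
    "2020-01-23 22:12:39: Start logging...",
    "2020-01-23 22:12:41: error: Reading serial conection failed: Connection reset by peer. Closing connection.",
]

restarts, errors = summarize(sample)
print("restarts:", restarts)
for message, count in errors.most_common():
    print(count, message)
```

Running it over the real server.log (for example with open(path) as the lines argument) would show whether the same error always precedes a restart or whether it is just background noise from the unplugged printer.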
Comments
First, make sure you are running 0.93.1, since all causes of hangs/crashes known so far are fixed there.
Then follow https://www.repetier-server.com/knowledgebase/debugging-crashes-hangs-on-linux/ with one deviation: connect the debugger and hit continue. Since your problem is a crash and not a hang, the server needs to be running in gdb so that you get a full backtrace at the moment the crash happens. From that I can see where in the source code the crash happens, and hopefully how the problem can arise.
It is important that the console running gdb stays open, so it is best to open it on the NUC itself, which hopefully has a monitor and keyboard attached.
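For context on why this counts as a crash: the syslog excerpt in the first post reports code=killed, status=6/ABRT, meaning the main process ended on SIGABRT. That signal is normally raised by the process itself (for example through abort() after a failed assertion or an uncaught C++ exception), not by the kernel's out-of-memory killer, which uses SIGKILL. A minimal sketch for pulling such events out of syslog follows; it assumes the line layout shown in that excerpt, and the function name find_kills and the sample list are hypothetical.

```python
import re

# systemd line as seen in the syslog excerpt, e.g.
# "Jan 23 22:12:39 host systemd[1]: X.service: Main process exited, code=killed, status=6/ABRT"
KILLED = re.compile(
    r"^(?P<when>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ systemd\[\d+\]: "
    r"(?P<unit>\S+): Main process exited, code=killed, "
    r"status=(?P<num>\d+)/(?P<name>\w+)"
)

def find_kills(lines):
    """Return (timestamp, unit, signal name, signal number) per kill event."""
    events = []
    for line in lines:
        match = KILLED.match(line)
        if match:
            events.append((
                match["when"],
                match["unit"],
                "SIG" + match["name"],   # systemd prints the name without "SIG"
                int(match["num"]),
            ))
    return events

# Two lines copied from the syslog excerpt in the first post.
sample = [
    "Jan 23 22:12:39 toadi3dprinters systemd[1]: RepetierServer.service: "
    "Main process exited, code=killed, status=6/ABRT",
    "Jan 23 22:12:39 toadi3dprinters systemd[1]: RepetierServer.service: "
    "Failed with result 'signal'.",
]

for event in find_kills(sample):
    print(event)
```

Collecting these timestamps over a few weeks gives the crash times to compare against the gdb backtraces and server.log.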
I have updated the server to 0.93.1 and ran it under gdb in a screen session. I added 20 virtual printers (our real printers are all printing from SD card again for now) and started virtual prints on all of them. The gcodes have more than 100 hours of printing time. The server crashed again after 2 days, which is sooner than before. The gdb log: https://pastebin.com/sYGNGvsv
The server.log doesn't show anything strange.
It does look like it has something to do with a virtual printer, so I hope this isn't a crash caused by the virtual printers rather than the actual crash we've been having. We could try adding the real printers one by one, but we can't afford to lose many prints at this point, so it's a bit hard to test.
I hope you can already find something useful in it.
Thanks for debugging with me!
Any update on this?
Thx,
Christophe
Did you do anything, or just run the prints? Was any particular view open in the browser while it ran? Was Monitor running as well? That can cause access from a different thread when the server fulfills queries from the browser. Maybe I should try more browser instances to increase the likelihood of the problem.
It did happen very fast with the virtual printers, though; with our real printers it always took some weeks until a crash happened.
Thanks for debugging with me! Printing from SD cards again is currently a huge pain, since we have improved gcodes/models every week. Looking forward to working with your software again.