Pi slows down and web server not responding

Hi,
Just had this issue, maybe more than once now. Just got up after printing overnight and the Repetier-Server web interface was not responding and the print was not finished. I thought it had locked up completely (I have had this before), but the printer is still printing, though it looks like it only gets a command every 10 seconds or more. Any ideas?
About to restart :(
Thanks

Comments

  • Also ssh would not connect due to timeout.
  • Without analysis I cannot say much here. The question is: are you running out of memory, so that it swaps and slows everything down? If so, which software increased memory usage to the limit? These are the main things to check to isolate the cause.

    The fact that it is still working and ssh times out shows that, apart from the slowdown, everything seems to be ok. The typical reason is the high memory usage mentioned above, which normally does not happen. Is the network normally rock solid? Many or frequent connections might increase memory usage. Another cause seen in the past is Chromium growing over time.
  • Thanks. Network is normally solid (I have decent Ubiquiti wifi etc.), and I could use a wired connection as the Pi is pretty close to a switch. I am also running the touch display. Hard to analyse where the issue is if I cannot get into the Pi to check anything. I normally have my PC connected to the web interface and also my mobile most of the time. Any advice you can give will be appreciated.
    Thanks again
  • If it happens frequently I would suggest keeping an ssh connection open with PuTTY and running
    top
     in it to see memory and CPU usage. That should give you an early warning when one of the components starts using more memory, and even if you miss it you still have the last update on screen to see what was using memory/CPU. That allows you to narrow down the problem.
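If the ssh window gets closed or scrolls away, the same idea can be kept in a file. This is only a rough sketch, not anything shipped with Repetier-Server: it reads `/proc/<pid>/status` on Linux and appends the biggest memory users to a log once a minute, so the last line written before the slowdown survives a reboot. The file name `memlog.txt`, the interval and all function names are made up for the example.

```python
import os
import time


def read_rss_kb(pid):
    """Return (process name, resident memory in kB), or None if unavailable."""
    name, rss = None, None
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("Name:"):
                    name = line.split(None, 1)[1].strip()
                elif line.startswith("VmRSS:"):
                    rss = int(line.split()[1])  # value is reported in kB
    except (FileNotFoundError, ProcessLookupError, PermissionError):
        return None  # process vanished or is not readable
    if name is None or rss is None:
        return None  # kernel threads have no VmRSS line
    return name, rss


def top_memory_users(count=5):
    """Return the `count` largest processes as (rss_kb, pid, name) tuples."""
    rows = []
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            info = read_rss_kb(int(entry))
            if info:
                rows.append((info[1], entry, info[0]))
    rows.sort(reverse=True)
    return rows[:count]


def format_snapshot(rows, timestamp):
    """Format one log line such as '<time> name[pid]=123MB ...'."""
    parts = [f"{name}[{pid}]={rss_kb // 1024}MB" for rss_kb, pid, name in rows]
    return f"{timestamp} " + " ".join(parts)


def log_forever(path="memlog.txt", interval_s=60):
    """Append a snapshot every `interval_s` seconds until interrupted."""
    while True:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(path, "a") as f:
            f.write(format_snapshot(top_memory_users(), stamp) + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the line to disk before things slow down
        time.sleep(interval_s)

# To use it, call log_forever() at the end of the script and leave it running.
```

Started before a long print, the last few lines of the log should show which process was growing, much like the frozen top screen would.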
  • Thanks, already had it running and just saw this. It did it again overnight; the attached top screenshot is the final frozen screen. Once one printer gave up it looks like the other carried on (it is still printing, back at normal speed). The print has an obvious mark where the issue occurred. The Pi ran out of swap (screenshot attached). It was printing a 90M and a 30M file (2 printers). Does this mean I need to keep an eye on the file size, or run more Pis? Or is there a way to fix it? I am not very familiar with Linux.
  • Ok, that is quite clear. Repetier-Server uses 83% memory instead of the 2-5% it normally uses.

    So the main question would be what are you doing that others not having the problem don't do.

    If no one is connected and watching (let's leave the touchscreen aside here) I'm quite sure it would not happen. As soon as you have an open connection the server buffers unread messages etc. Normally this is no issue, since connected devices read the data and we can then delete it, freeing memory.

    First question is are you using 0.91.2? For older versions I know ways to get that error but all known ways are closed at the moment.

    If you are on the latest version I'd like to ask you to do a little debug session. Add a virtual printer for printing (assuming it will happen there as well) or just select a virtual port for your printer. That way you do not need to waste filament, as the test will ruin the print.

    Follow these instructions, install gdb first.

    https://www.repetier-server.com/knowledgebase/debugging-crashes-hangs-on-linux/

    Once you see memory rise to, let's say, over 5%, start gdb, attach to the process, create a full backtrace and send it to me. Then I can see what each thread is doing. Hopefully that will give the final hint as to where it hangs or where we accumulate memory.

    That is exactly what I do if I manage to get a similar error and most of the times it is possible to see what is going wrong.
  • Thanks. I will try this. Some more info in case it helps: Repetier is normally using 9 to 13% CPU, with 124 to 140M memory free (while printing these 2 models), always with 1 web connection open on my PC plus the touch screen. It jumps to 15 or 16% CPU when I use it on my phone as well, but seems to settle. At the time of the issue the PC was connected but the phone was not. Webcams were not showing on the web interface. An hour or more into the prints the CPU usage was stable at 9 to 13%, except when uploading gcode from S3D directly by script. It would be at 100% while rendering, and free memory fell to 30 to 40M, but it did not touch the swap at all. So sometime overnight it just gobbles up everything while I am asleep.

  • Also yes, using 0.91.2
  • Sorry, one last thing: how do I add a virtual printer? Should I add 2 to replicate the issue using the same files etc.? I can do this tonight and run it overnight, as it only ever happens some time into the print

  • A virtual printer is just a printer where you select VirtualCartesian or VirtualDelta as port, that is all.

    CPU usage sounds good. If you have more users connected it increases a bit. Uploading g-codes also makes the server parse and render them, so CPU will be at 100%.

    Free memory is not what top shows as free; that number is misleading. You need to add buffers/cache to it. Linux uses all otherwise unused memory to cache files it has read, for better speed. But if memory is needed it gives that back immediately, so it counts as free memory.

    The problem with debugging this is that you need to start gdb when memory starts increasing but before the system starts to swap. Once it swaps you cannot open the debugger and analyse. I think it is a problem where memory slowly increases over time, so looking from time to time should suffice. As soon as top shows > 8% for the server I think you can start the debugger.
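The two numbers discussed above (memory that is really available, and the server's share of total memory) can be worked out from `/proc/meminfo`. A small sketch under some assumptions: the function names are invented for the example, the sample values are made up for a 1 GB Pi, and the 8% default is simply the threshold suggested above. Newer kernels also report `MemAvailable`, which is used when present; otherwise free + buffers + cached is taken as an approximation.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def available_kb(meminfo):
    """Memory usable by programs: kernel estimate, else free + buffers + cached."""
    if "MemAvailable" in meminfo:
        return meminfo["MemAvailable"]
    return meminfo["MemFree"] + meminfo.get("Buffers", 0) + meminfo.get("Cached", 0)


def mem_percent(rss_kb, meminfo):
    """Share of physical memory used by one process, like the %MEM column in top."""
    return 100.0 * rss_kb / meminfo["MemTotal"]


def should_attach_debugger(rss_kb, meminfo, threshold_percent=8.0):
    """True once the process has grown past the threshold but is not yet huge."""
    return mem_percent(rss_kb, meminfo) > threshold_percent


# Made-up sample values, not taken from the Pi in this thread.
SAMPLE = """MemTotal:         948304 kB
MemFree:          131072 kB
Buffers:           40960 kB
Cached:           204800 kB
"""

info = parse_meminfo(SAMPLE)
print(available_kb(info))                   # 376832, not just the 131072 shown as free
print(should_attach_debugger(84000, info))  # True, roughly 8.9% of total
```

On the Pi itself the text would come from `open("/proc/meminfo").read()` and the resident size from the `VmRSS` line of the server's `/proc/<pid>/status`.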
  • thanks. I will start it as soon as I get home from work and hope it is ready to start debugging before bed :)

  • FYI it hit 8.9% mem so I started the debugger and hit c (hope that's what you need). Need to sleep now. Hope it gives us something. Looks like it will perhaps happen at the same point in the prints, but that's a guess
  • ok so this time it hit 75.7% memory. Still running; one printer has finished, the other is still going as it is a long print. Not sure what you need me to send. The debugger is still running after 'c'. I can obviously send the gcode files. What else do you need, please, and how do I get it? Thanks for the help
  • No, that is not what the docs say. NEVER run c in the debugger. You cannot stop after it.

    You need to do exactly what the doc says. Run the print normally, and once you see memory increasing start the debugger and attach. Do not start the server. All I need is the output of

    thread apply all bt
    at that point, to see which threads are running and might not be freeing memory any more. Since this takes time the print will no longer be good, which is why I suggested the virtual printer. With a real print you might pause in the server so the head moves away, then do the debugger steps and continue the rest of the print with "c" in the debugger. But stopping again with Ctrl+C does not work: the server listens for that signal and will crash.
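One way to avoid typing anything at the gdb prompt at all is to let gdb run that single command in batch mode and exit. This is a sketch and not taken from the linked knowledge base article, so the article's own steps take priority if they differ. It uses the standard gdb options `-p` (attach to a pid), `-batch` and `-ex`; the output file name `backtrace.txt` and the function names are made up, the pid has to be looked up first (for example in top), and attaching will most likely need root.

```python
import subprocess


def build_gdb_command(pid):
    """gdb command line: attach to `pid`, print all thread backtraces, then exit."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid, out_path="backtrace.txt"):
    """Run gdb once and save its output to `out_path`; returns gdb's exit code."""
    result = subprocess.run(build_gdb_command(pid), capture_output=True, text=True)
    with open(out_path, "w") as f:
        f.write(result.stdout)
    return result.returncode
```

Calling `capture_backtrace(1234)` with the server's real pid in place of 1234 should leave the thread list in `backtrace.txt`, ready to send. gdb normally detaches from an attached process when it exits, but the server is stopped while the backtrace is collected, so the advice about using a virtual printer still applies.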
  • Thanks, I will try again. It was running virtual as suggested, with two virtual printers to simulate both jobs that were running. I have sent the server log via email. It shows wifi issues and missed pings, but I am not sure if that is due to the memory usage or the cause of it.