Repetier-Server stops working during print

My (overclocked) Raspberry Pi B+ runs Repetier-Server 0.60.3, and the server tends to stop sending commands to the printer during a print.

In the beginning (the first 3 weeks) everything worked, but now it aborts every print that runs longer than about 2 hours.
I changed nothing; this started on its own.

The last print was a 13 h print and it stopped in the last 40 minutes, 180 meters of filament wasted...
This has happened more than once.
After that I reinstalled the Raspberry Pi from scratch with a new micro SD card, but it still aborts prints at a random position.

I changed the LAN and USB cables of the server, but nothing changed. The Pi is not overloaded either; its CPU usage is always below 20% during a print.


After the server has ruined a print, I can stop the job in the interface and start a new one, and the server works again for 1-2 hours until it stops again, freezing at some position on the print object. The extruder and the bed stay powered, but no move commands reach the printer any longer.
The server sends commands to the printer more and more slowly until the printer begins to stutter (waiting for the next command) and finally stops.

What can I do? This is an annoying issue, especially because it stops at a random position. I don't want to waste more filament on this.


Comments

  • As you are printing long prints, you should check your free disk space. If the disk fills up, strange things happen.

    Assuming it is not a full disk, it would be interesting to see the part of the log where it happened to see what is going on. I haven't had this problem, so I'm not sure why it happens suddenly. Also check free memory when it happens. The server should never use a lot of memory when running, and usage should stay fairly constant. All memory-intensive computations are outsourced to external programs.

    One last thing you could test, if you have Repetier-Firmware on your printer, is sending the job after sending

    M111 S24

    Then the printer only acknowledges commands but does not execute them. So a 13 h job can be sent in about 30 minutes, just to put the server through its paces and see if the error is still there without wasting plastic. Since it is a communication problem, I think that should work.

    Did I understand correctly that stopping the print was enough to get everything working again?
  • Thanks for your reply, and yes, you're right, a simple restart fixes the problem until it shows up again.

    I tested it, and once it stopped at the same layer as the time before, but mostly it's random.

    The Pi runs on an 8 GB card, which means it only has 2.7 GB of free disk space left after a clean installation with Raspbian.
    (2.7 GB is what the server's interface displays; I have not checked the real free space in Raspbian yet.)


    I will test a bigger card so it gets more free disk space. In the beginning (when it worked fine) it had 3 GB of free space.


    Where can I find the relevant log? Is it the one from the last print in the interface, or do you mean a different log?

  • 2.7 GB is OK. I meant that if you are close to 1 MB free, writes to the log may fail, delaying everything.

    The log is the print log you can download from printer menu -> Logs for the last print.

    Apart from the log, there is the question of memory consumption when it happens. If the Pi starts swapping for some reason, it may also get slow. You can use free to see free memory and top for CPU and memory usage. On a Pi I would normally disable swapping altogether. Swapping means many writes to flash, which is not good anyway, and it slows everything down.
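
The checks suggested above (free disk space, free memory, swap in use) can also be collected by a small script and logged while a print runs. A minimal Python sketch, assuming a Linux system such as Raspbian where /proc/meminfo is available; the function names and the report layout are made up for illustration and are not part of Repetier-Server.

```python
import shutil


def parse_meminfo(text):
    """Return the numeric fields of /proc/meminfo as a dict (values in kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def resource_report(path=".", meminfo_text=None):
    """Collect free disk space, free memory and swap usage in one dict.

    meminfo_text can be passed in for testing; by default the real
    /proc/meminfo is read (Linux only).
    """
    if meminfo_text is None:
        with open("/proc/meminfo") as handle:
            meminfo_text = handle.read()
    mem = parse_meminfo(meminfo_text)
    disk = shutil.disk_usage(path)
    return {
        "disk_free_mb": disk.free // (1024 * 1024),
        "mem_free_kb": mem.get("MemFree", 0),
        "swap_used_kb": mem.get("SwapTotal", 0) - mem.get("SwapFree", 0),
    }


if __name__ == "__main__":
    sample = "MemFree: 61000 kB\nSwapTotal: 102396 kB\nSwapFree: 100000 kB\n"
    print(resource_report(".", sample))
```

A swap_used_kb value that keeps growing during a print would point to the swapping problem described above; a disk_free_mb value close to zero would point to failing log writes.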
  • OK, now I put the 8 GB SD card from the B+ into a Raspberry Pi 2, without changing anything.

    I'm going to do a test print, and if it stops I will check the log and the resources of the Pi 2.


    Last night I wanted to print from a PC, and everything went fine until I switched off my desk lamp.
    5 seconds later the printer got connection problems and stopped.

    The lamp stands next to the USB cable and has a built-in transformer for its low-voltage halogen light.
    I was able to reproduce the connection loss just by switching the lamp on and off.

    After I changed the USB cable again, the effect was no longer reproducible.

    But the Pi also stopped without the lamp being switched on or off... Maybe it is some other electrical interference, or the PSU itself. It's not a PC PSU; it's one of those LED lamp PSUs (which are not as good as a PC PSU in some cases).

    I will be back with an update if something happens.


  • It happened again.

    At first it came to a short stop during the print, 1 hour after I started, but the printer continued until about 50% of the part; then the server just stopped sending movement commands to the printer.

    I paused the print and was able to move the printer, so it looks like it has not lost its connection, and it was not out of RAM.


    But by accident I did not take the Raspberry Pi 2 when I wanted to change the server hardware; it was a second B+... Now I run the server on a real Pi 2 and will try again.

  • The log shows nothing special. Happy printing until it starts only polling temperatures at 16:23:20. Is that where you hit pause, or did you pause after that? It is important to give very exact timings in combination with logs so I can check what happened when event x occurred.

    There is one more command that could help with debugging. When it happens (don't pause), send in the console

    @debugcon

    This should give something like

    Connection status: Buffered:126 manualCommands: 1 jobCommands: 4997

    Here jobCommands shows that the server has read many commands waiting to be sent. If it stops sending a print, it should only be because that buffer is empty, and that normally only happens at the end of the print.

    You can send this command at any time to see the buffer status of the server connection.
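
When many of these connection status lines pile up in a log, a script can pull out the numbers instead of reading them by eye. A minimal Python sketch, assuming the line format shown above; the function names and the "three identical reports in a row" stall heuristic are illustrative assumptions, not anything the server does itself.

```python
import re

# Matches e.g. "Connection status: Buffered:126 manualCommands: 1 jobCommands: 4997"
STATUS_RE = re.compile(
    r"Connection status:\s*Buffered:\s*(\d+)\s+"
    r"manualCommands:\s*(\d+)\s+jobCommands:\s*(\d+)"
)


def parse_status(line):
    """Return the three counters of a status line, or None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    buffered, manual, job = map(int, match.groups())
    return {"buffered": buffered, "manual": manual, "job": job}


def looks_stalled(statuses, window=3):
    """Heuristic: job commands are waiting, but the buffer fill has not
    changed over the last `window` reports."""
    if len(statuses) < window:
        return False
    recent = statuses[-window:]
    same_fill = len({s["buffered"] for s in recent}) == 1
    return same_fill and all(s["job"] > 0 for s in recent)


if __name__ == "__main__":
    line = "Connection status: Buffered:126 manualCommands: 1 jobCommands: 4997"
    print(parse_status(line))
```

A constant buffer fill with a large jobCommands count means the server has work queued but is not getting it out, which is the situation this thread is trying to explain.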
  • edited July 2015
    This time it happened within the first 5 layers.

    I was here and checked the resources of the Pi 2:

    RAM was at 61 MB of 1024 and CPU usage at 26%.

    The printer slowed down, began to stutter, and paused for about half a second between every move.
    It did this for about 20-30 seconds, then it drew one line at normal speed, and after that it stopped.

    I watched the console. It was not a connection loss, because the printer was still responsive to my commands;
    the server just stopped sending movement commands.

    Pause / unpause did not help.



    It looks like the printer's last command before it stops is the M204.

    Edit: the printer now stutters from the first layer. No smooth moves any more, only stutter, but manual commands from me are executed totally normally, fast and responsive.

  • edited July 2015
    I paused only after it had already stopped moving.

    I also tested my second printer on the server and it worked fine; then I tried the first one, same issue...
    I disconnected the USB cable and connected it to another port on the Pi. Now it prints, let's see for how long.


    The command you mentioned above tells me this during the print:

    19:37:07: Connection status: Buffered:41 manualCommands: 1 jobCommands: 4999


    It always displays values like this for now...

    Edit: OK, it stopped, and I did not press pause this time. Instead, the @debugcon command tells me this:

    19:44:17: ok T:218.8 /220.0 B:60.1 /60.0 T0:218.8 /220.0 @:127 B@:0
    19:44:18: N5646 M105
    19:44:18: ok T:219.3 /220.0 B:59.9 /60.0 T0:219.3 /220.0 @:127 B@:0
    19:44:19: Connection status: Buffered:15 manualCommands: 1 jobCommands: 5000
    19:44:19: N5647 M105
    19:44:19: ok T:218.9 /220.0 B:59.9 /60.0 T0:218.9 /220.0 @:127 B@:0
    19:44:20: N5648 M105
    19:44:20: ok T:219.7 /220.0 B:60.0 /60.0 T0:219.7 /220.0 @:0 B@:0
    19:44:21: Connection status: Buffered:15 manualCommands: 1 jobCommands: 5000
    19:44:21: N5649 M105
    19:44:21: ok T:220.0 /220.0 B:59.9 /60.0 T0:220.0 /220.0 @:0 B@:127
    19:44:22: N5650 M105
    19:44:22: ok T:219.6 /220.0 B:59.8 /60.0 T0:219.6 /220.0 @:127 B@:127
    19:44:23: N5651 M105
    19:44:23: ok T:220.0 /220.0 B:59.7 /60.0 T0:220.0 /220.0 @:0 B@:127
    19:44:24: Connection status: Buffered:15 manualCommands: 1 jobCommands: 5000
    19:44:24: N5652 M105
    19:44:24: ok T:220.1 /220.0 B:59.7 /60.0 T0:220.1 /220.0 @:0 B@:127


    I will do the next print with Repetier-Host and test if it does the same.
  • It did not do the same; the printer finished a 10 h print with Repetier-Host. So the printer's hardware is OK.
  • Looks strange. On the one hand the server shows it has many commands buffered. The M105 responses show that manual commands go through, so it is not the printer that blocks. So the question is why the server does not send the job commands although it is allowed to and should know they are there. I will analyse the code and report back when I find out how this can happen.
  • edited July 2015
    I noticed the following error lines in Repetier-Host during the print:


    05:25:32.257 : Drucke Layer 263 von 264
    05:25:56.833 : Error:No Line Number with checksum, Last Line: 180269
    05:25:56.833 : echo:Unknown command: "0"
    05:25:56.833 : echo:Active Extruder: 0
    05:25:56.833 : Error:Line Number is not Last Line Number+1, Last Line: 180269
    05:25:56.833 : Resend: 180270
    05:25:56.848 : Error:Line Number is not Last Line Number+1, Last Line: 180269
    05:25:56.848 : Resend: 180270
    05:25:57.472 : Error:Line Number is not Last Line Number+1, Last Line: 180272
    05:25:57.472 : Resend: 180273
    05:25:57.628 : Error:Line Number is not Last Line Number+1, Last Line: 180273
    05:25:57.628 : Resend: 180274
    05:26:23.704 : Error:No Line Number with checksum, Last Line: 180440
    05:26:23.704 : echo:Unknown command: "0"
    05:26:23.704 : echo:Unknown command: "ok"
    05:26:23.704 : Error:Line Number is not Last Line Number+1, Last Line: 180440
    05:26:23.704 : Resend: 180441
    05:26:23.720 : Error:Line Number is not Last Line Number+1, Last Line: 180440
    05:26:23.720 : Resend: 180441



    It produces a lot of these messages, but the print went fine.
    It stopped throwing these errors after I powered the printer off and reconnected.


    I have made an image of the server; if it helps, I can upload the file somewhere for you. But the only thing I did was download a new Raspberry Pi image (I took the NOOBS image), update it, and then install Repetier-Server and Chromium, nothing else.

    I also ordered a new RAMPS 1.4, an Arduino Mega, and new stepper drivers. The USB port on the Arduino is not really tight; maybe this has something to do with it.
  • The com errors are OK. With the server you also had com errors from time to time. But they got detected and the print continued as it should.

    As I said, there must be a condition that prevents sending job commands although they are there. When I know how it could happen, I will make a special version for you to test, since you are the one who gets the problem.

    The NOOBS image offers many OS versions; which one did you use? I always use the Raspbian image, but with NOOBS it had quite limited free disk space that was not expandable.
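
The "No Line Number with checksum" and "Resend" messages in the Repetier-Host log above come from the framing that hosts add to each command: a line number (N...) in front and a checksum after a "*", computed as the XOR of all bytes before the "*". A minimal Python sketch of that framing and of a simplified firmware-side check; the function names and the returned strings are hypothetical, and real firmware handles more cases than this.

```python
def checksum(payload):
    """XOR of all bytes of the text that precedes the '*'."""
    value = 0
    for byte in payload.encode("ascii"):
        value ^= byte
    return value


def frame(line_number, command):
    """Add line number and checksum, e.g. frame(3, 'T0') gives 'N3 T0*57'."""
    payload = f"N{line_number} {command}"
    return f"{payload}*{checksum(payload)}"


def verify(framed, last_line):
    """Simplified receiver check: returns 'ok' or the reason for a resend."""
    payload, sep, cs_text = framed.rpartition("*")
    if not sep or not cs_text.strip().isdigit():
        return "no checksum"
    if checksum(payload) != int(cs_text):
        return "checksum mismatch"
    words = payload.split()
    first = words[0] if words else ""
    if not (first.startswith("N") and first[1:].isdigit()):
        return "no line number"
    if int(first[1:]) != last_line + 1:
        # Out of sequence: ask for the line that should have come next.
        return f"resend {last_line + 1}"
    return "ok"


if __name__ == "__main__":
    print(frame(3, "T0"))
    print(verify("N3 T0*57", last_line=2))
```

A byte corrupted on the USB line changes the XOR, so the receiver rejects the line and asks for it again. That is why such errors are harmless as long as the resend gets through.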
  • I took this one: Version:
    1.4.1
    Release date:2015-05-11

    It is the latest build on their website for the Raspberry Pi, not the lite version. I installed only Raspbian.

    But I could try the Raspbian image without NOOBS for testing.

    The printer is currently printing the next 10 h print... I have to wait.
  • Just an idea of mine: the server is powered by a PSU with 1.2 A. Maybe this PSU is faulty and the power is not always stable. I will replace it with a 2 A unit; maybe it helps somehow.
  • edited July 2015
    I have an interesting update:

    I started a 10 h job, watched it for 2 hours, and then left home. I came back 4 hours later and the printer had done fine while I was away (Repetier-Host, not the server). Then I did some exercise beside the printer, push-ups etc., within hearing range of the printer...

    Suddenly I heard stuttering from the printer. It began to pause for about 500 ms during its infill lines, while the perimeters were printed at normal speed. I wanted to rescue the print and touched the motors and the stepper drivers on the board. The extruder motor was at 60-70 degrees Celsius and its stepper driver even hotter. I tried to reduce the current, but then the motor did not have enough force to push the filament in during the infill, so I turned the potentiometer back up until the motor had enough force again.

    The printer continued stuttering between every single line and produced small blobs... Then I took a 12 V fan, hooked it up to the extruder output D10 on the RAMPS board, and positioned it beside the RAMPS... I waited a bit... and the stuttering stopped, even though the fan does not really cool things down: it is far too weak and small for this job, and it only stands beside the RAMPS, because I cannot reach the board during prints; it sits below the moving heated bed.


    But now it prints smoothly again. At least the last 10 layers were printed normally, and before I attached the fan it stuttered on every layer during the infill.

    I guess the stepper driver went into pulse mode because it was too hot?

    I had already been thinking in this direction. Yesterday I let the printer cool down for an hour and then tried a new print via the server, but it immediately began to stutter, even without heat.

    So I thought it could not be a temperature issue... Now I think it may have something to do with heat after all.

    What do you think?

    Edit: 10 minutes later it stopped again, this time for 5 seconds during the infill. I saw some com errors in the log, and 10 seconds later the printer continued printing...

    This is really strange.
  • It is strange to get similar behaviour with different computers and programs. That is right.

    You might get stuttering when you have several errors in a row. If the buffer runs empty while resending missed lines, the printer will stop and restart.

    It is not a temperature problem of the stepper drivers. When they get too hot they stop working and you get missed steps and misalignments. You would surely see that. As long as you are not losing the connection I think the board is working well, but cooling is always good to let the board live a bit longer, especially if the drivers get hot.

    Meanwhile I have made a new server for your pi:

    I haven't seen anything that would really explain what happens, so I improved the debugging output. First,
    @debugcon now also returns the 2 pause settings used internally. If one of them becomes true, your print will stop. One of them is shared with manual commands, which still worked, so this only leaves a requested communication stop from your firmware, which I do not think is coming.

    I have also added
    @explaincom
    which toggles a quite chatty mode on/off. It shows when the server tries to send a command and whether it fails. Failing will occur frequently, simply because the buffers are not empty, so that alone is not an error as long as it succeeds from time to time. Use this only when you get the errors, or the log will explode. Hopefully one of the two gives some insight into the source.

    I have also made some fixes for things that might become a problem, but I do not think these are your problem.

  • There is one chance that it is a firmware problem. Manual commands rank above job commands, and every second the server sends one. If the firmware hangs and only sends an ok every second, there would be no room to receive new moves. But your logs show exactly one second between M105 commands, with no visible delay from the firmware, which makes this a bit unlikely unless the firmware happens to wait exactly one second.
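
The starvation scenario described above can be played through with a toy model. A minimal Python sketch, assuming that every ok from the firmware frees exactly one slot and that a manual M105 is queued once per second and always wins over job commands; the function name and the numbers are made up for illustration and do not reflect how the server is actually implemented.

```python
from collections import deque


def job_commands_sent(seconds, oks_per_second):
    """Count how many job commands get out when manual commands have priority."""
    manual = deque()
    job = deque(f"G1 X{i}" for i in range(1000))
    sent = 0
    for _ in range(seconds):
        manual.append("M105")  # temperature poll, queued once per second
        for _ in range(oks_per_second):  # each ok frees one slot
            if manual:
                manual.popleft()  # manual commands rank above job commands
            elif job:
                job.popleft()
                sent += 1
    return sent


if __name__ == "__main__":
    print(job_commands_sent(10, 1))  # firmware answers only once per second
    print(job_commands_sent(10, 5))  # firmware answers five times per second
```

With one ok per second every free slot is taken by the temperature poll and no move is ever sent; with more oks per second the job commands get through again.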
  • edited July 2015
    OK, I installed the new server and did a test print. It stuttered from the first layer, at the point where it normally should start the infill. The perimeter worked, the infill stuttered horribly, so I played with the new commands...

    The log says that 9 out of 10 commands failed (busy), then one succeeds, but the next couple of commands fail again.

    It did not really stop; I stopped the print after some minutes. You can see the point in the log where the printer stopped receiving commands.



    At first I tested the commands while it was heating up. Later I used
    @debugcon at line 1177 at 20:38:54; this was when it began to stutter. Some seconds later I sent @explaincom while it was stuttering.

    I switched it on and off, but the printer stuttered until I stopped the job.

    To me, while I was watching the log scroll, it looked like there was a clear break when the command flow slowed down to a speed that forced the printer to wait. The command flow was very slow from the point where it began to stutter.



    On this job it stuttered from the first second; not even the skirt was printed smoothly. It seems like no command was getting through, and this led to a full stop of the printer.
  • OK, we are making progress. The log with @explaincom shows clearly that the server would like to send new commands, but the free buffer space does not allow it. I have a bit of a problem with the time resolution now, since I cannot see the lag between M105 and ok and between other commands and ok.

    But I'm wondering what board you have and whether ping-pong is enabled. If it is enabled (good), what did you enter as buffer size? Most printers will allow 127 bytes (except the Due). If your buffer is too small you might run into this issue with ping-pong disabled, although I cannot say how that should be possible. M105 should send ok immediately.

    I will try to increase the log time resolution to milliseconds so we can see which commands the ok responses belong to.

    It would help to upgrade your Marlin firmware to either Repetier or the latest Marlin development version. Both report line numbers in ok, so we can associate it with the command that gets freed. That also allows freeing commands when the corresponding "ok" was missed once, which may be exactly what happened in your case: the server missed an ok and never freed the buffer space, and then suddenly only a short command fits into it, but not the longer ones, and the print stops. In that case ping-pong would also solve the problem, and for every missed ok you get a 40 second (timeout) wait instead.
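
The buffer accounting described above can be made concrete with a small model. A minimal Python sketch, assuming the host counts the bytes of every line it has sent but not yet seen acknowledged against a 127-byte receive buffer; the class SendWindow and its methods are hypothetical and only illustrate the idea, they are not the server's code.

```python
from collections import deque


class SendWindow:
    """Host-side estimate of how full the printer's receive buffer is."""

    def __init__(self, capacity=127):
        self.capacity = capacity
        self.in_flight = deque()  # (line_number, size) sent, not yet acknowledged

    @property
    def used(self):
        return sum(size for _, size in self.in_flight)

    def try_send(self, line_number, line):
        """Send only if the line (plus newline) still fits into the buffer."""
        size = len(line) + 1
        if self.used + size > self.capacity:
            return False
        self.in_flight.append((line_number, size))
        return True

    def on_ok(self, line_number=None):
        """Plain 'ok': free the oldest line. 'ok N<n>': free everything up to n."""
        if line_number is None:
            if self.in_flight:
                self.in_flight.popleft()
            return
        while self.in_flight and self.in_flight[0][0] <= line_number:
            self.in_flight.popleft()


if __name__ == "__main__":
    window = SendWindow()
    for n in (1, 2, 3):
        window.try_send(n, "G" * 39)  # three 40-byte lines fill 120 of 127 bytes
    window.on_ok()
    window.on_ok()  # the third ok got lost on the wire
    print(window.used)  # the host still believes 40 bytes are occupied
```

With plain ok responses, every lost ok leaves bytes booked forever, so the usable window shrinks until only short commands fit. With line numbers in the ok, a later acknowledgement such as on_ok(3) clears everything up to that line and repairs the accounting.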
  • edited July 2015
    The buffer is set to 127, but the working prints with Repetier-Host are done without ping-pong mode; it is disabled all the time. It just displays the oks after I enable the ACK switch in the console window of the web interface.

    The firmware is the latest Marlin. At first I wanted Repetier-Firmware, but after struggling with the auto-leveling for days I took Marlin and got it working.
    It would be easier to adopt the Marlin dev branch because the settings are hopefully the same.

    I'm going to update it to the dev branch later.

    I started a short test with ping-pong enabled on the server... For now it prints normally... I have to leave home now and hope I don't have to clean up a mess when I come back.

    Edit: the printer still prints normally; maybe the ping-pong mode fixed it... I noticed the buffer of the printer is now always empty and every command sent is a success (maybe because the buffer is never full). It seems to work, but why does this printer struggle without ping-pong mode when the other printer with the same firmware and hardware does not?

    And why does it need ping-pong mode on the server while it works without it on Repetier-Host?

  • The buffer in ping-pong mode is always empty, because that is how ping-pong works. That is how the protocol was originally designed. But due to latency the throughput is not very good with it, which is why we invented the buffered system.

    I'm not sure why buffered printing is a problem in your case. I use buffered mode all the time, but with Repetier-Firmware and line numbers, which allow fixing a missed "ok". If that really is the problem, it should also be gone with dev Marlin, where you get ok N8883.
  • edited July 2015
    So, if something uses the buffer and the printer messes up the "ok", then the server waits until the timeout before sending the next command, right?

    Ping-pong simply disables the buffer, and the printer needs a direct flow of commands, e.g. the next command before the last one has been executed, to keep things smooth.

    The question remains why the hardware does not use the buffer as intended. Why does it not send the ok immediately? It seems there is a delay sometimes, and this messes it up.

    In the logs of the failed prints there is always more than one ok in a row right before the stuttering or stop appears.
    This could mean it skipped or missed a command. One time it skipped the skirt: it just did not print a skirt and began with the part...

    It runs on its second RAMPS 1.4 now. The first one I used destroyed a MOSFET on the board and an MK2B heated bed right after I connected it... I saw smoke. It was wrongly or badly soldered, so I bought a new RAMPS and a new heated bed.
    Maybe the Arduino hardware was damaged during this failure... and now its buffer does not work as intended?

    In about 5-6 hours, when this job is done, I will try the Marlin dev firmware as you said, and if I still have time (and motivation) after that, I will try Repetier-Firmware again.

    By the way: why is there no command in Repetier-Firmware that can be used to adjust the Z offset,
    like M555 Z in Marlin? It would be nice to adjust the height without compiling and flashing every time.

    The RepRap wiki says the command M555 does nothing for Marlin, but this is not true... It sets the offset of the probe or an axis, which is very useful for calibrating a printer.

  • edited July 2015
    I have now tried the dev branch of Marlin and Repetier-Firmware, with the following results:

    With Marlin I had problems with the auto-leveling, even with the same settings as the stable branch. It always drove the nozzle into the negative, so I was not able to do a test print, but it did not stutter on the first layer with ping-pong off.

    Repetier-Firmware lost the connection frequently, even during some manual moves; the printer froze and needed a reconnect. This happened with Repetier-Host and with the server every few minutes.

    Now it runs the stable Marlin again...

    EDIT: while I used the Marlin dev branch firmware, I checked the @debugcon command and noticed it always displayed 120 for the buffer. The setting was 127.
    Now I changed the buffer setting in the server interface/printer settings to 120 and disabled ping-pong mode...
    I started a print with the stable branch Marlin and it seems to print smoothly now.

    It's not finished yet, but the last 5 times I tried this with this firmware it stuttered immediately. I'm not sure how this makes a difference, because I already tried 63 before without any change in the stutter behaviour.


  • The printer did not finish its job. While I was sleeping it stopped at 86% of the print and stayed where it was for hours; this means the timeout was reached and it did not accept a new command after the timeout.

    This printer really annoys me; it has made a problem out of every single step since I built it...
  • Sounds like auto-leveling is also hard for others. It is really a difficult field with all the different setups.

    The buffer value returned by @debugcon is how much is currently used, not the capacity. 120 is pretty good. A new command would be more than 7 bytes, so it would not fit.

    When you say connection lost, do you mean really disconnected? Firmware should not be able to disconnect at all; that is a connection between USB converters. All the firmware could influence is not responding. BTW: What printer board are you using? It sounds like the connection is so bad that sooner or later you get problems from transfer errors. That is also why it works at the beginning and then stops working. This can happen with bad power units or bad cables. I had users where switching the light on/off caused a connection to crash.
  • edited July 2015
    The auto-level issue on this printer exists because its hotend tip can be pushed up, and that triggers the Z probe, which is also the Z min endstop.
    So the probe/hotend tip is always in the negative when it gets pushed up. I have to tell the printer to move Z up by more than 1 mm after homing, and before and after the auto-level process.

    This case seems not to have been considered in the auto-level design, which normally uses a dedicated probe beside the hotend.

    I simply cannot tell the printer: move 1 mm up after homing, keep this during auto-level, AND add this 1 mm to the measured Z height after auto-level. I also cannot adjust the Z endstop; it is fixed in place by this horrible design...


    Connection loss means I'm not able to send the printer new commands. Repetier-Host said "x commands waiting", but the only way to get the printer to respond was to reset it with the reset button.
    With Repetier-Firmware this happened only during manual moves, especially after I sent G32.

    The printer runs on a RAMPS 1.4, but this is the second board and the first one did the same.

    I gave the printer all the cables from its neighbour printer standing beside it, and on that one everything works... always. It also has a RAMPS 1.4, 3 years old and completely soldered by myself.

    The bad printer runs on an LED PSU, the other has an ATX PSU from a PC. I could test it with the ATX PSU to make sure the PSU is OK.
  • In the host's manual control there is an ok button at the bottom. When it stops/stutters, try whether pressing it 3 times helps. If it does, the problem is definitely caused by swallowed ok responses.

    Interesting to have 2 similar printers with only one having problems. I guess since the server separates printers, one could get stuck while the other prints normally.

    Switching the power unit sounds like a good idea. LED PSUs do not need to be stable, so it is a bit of a gamble.

    In Repetier-Firmware you can set how far to go back on home (if you homed with nozzle support). There is even a homing sequence in the new 0.93 that heats up the nozzle automatically before homing, since a cold nozzle can have a PLA drop changing the signal height.
  • edited July 2015
    Yes, I played with the back-after-home setting, and it worked for the homing, but as soon as I did an auto-level it forgot the additional Z height and ground over the bed during homing and after the measuring procedure. It thinks 0 is -3.0, so it is about 3 mm too deep in Z, which makes a print impossible.

    If I try to print, it stays down at a negative height and I don't know how to fix this. It also measures negative Z heights at some probing points, but this is just wrong... It does not know where the real Z 0 is, so it misinterprets the measuring results.

    The PLA blob on the nozzle becomes a problem if I heat the nozzle before the auto-level procedure, because the PLA always oozes, and this ooze stays on the bed and empties the extruder. When the print then starts, the extruder needs too much time to fill with filament again, and it is difficult to set the extra extrusion at the beginning of a print correctly.

    It is best (in my case) to clean the nozzle, home, then heat up during the auto-level procedure, and then let the nozzle rest on the bed and wait until it starts printing.


    One of the printers is 3 years old and was soldered by myself, at a time when a RAMPS cost more than 100 euros. Now it costs only 20 euros presoldered. I guess there are quality issues with the hardware; the new printer is the one with problems and the old one just works.

    I will switch the PSUs later. If it still does not work, I will just use ping-pong mode for this printer until it has printed its successor.

  • edited July 2015
    OK, I did a test print with the ATX PSU and it took just 2 layers for it to begin to stutter again. They were short stutters caused by many com errors in a row; then it continued 3 seconds later. So the PSU is probably not the problem.

    The postman brought me the new Arduino and a new RAMPS a few minutes ago. The only thing that has not been replaced so far is the Arduino Mega.
    Do you think the Arduino could be damaged and cause issues like this?
  • No, I think the Arduino is OK. It is doing everything correctly until you get your problems. And even then everything moves as it should, only with pauses and stops. The only thing on the Arduino I'm not sure about is the communication chip. Since you have 2 Arduinos, you could of course swap them together with the USB cable. Then the new one has most of the components of the working one.

    Regarding auto-leveling, you need to start 5 mm above the bed, so that Z probe distance + negative correction from nozzle-bed distance is still positive. That should do the trick. But I will see. In August I get a new printer with the same Z probe and homing procedure, so I can see if it works for me.