Problem with error recovery on comms
I have been setting up a Monoprice Maker Ultimate 2 (actually a Weedo F150S). There is very little documentation on the board, but it is ATMega 2560 based and runs Marlin. It seems to work OK at 115,200 baud with no ping pong, except for long prints. Setting the buffer to 63 did not help. However, setting it lower (48) worked; I'm not sure if that's the highest possible number. Two things appear to happen. First, once the line number has 5 digits, the line number plus the checksum make the line too long for the buffer.
Second, the resend only occurs once. So you get a wild orgy of line numbers not matching. At first, I thought it might be M105 driving the line number up, but looking at the logs, that doesn't seem to be the case. The host just keeps pushing more lines after one resend.
< 17:33:41.721: N14408 G0 F7200 X64.829 Y68.789
> 17:33:41.731: ok
< 17:33:41.731: N14409 G1 F1800 X101.015 Y104.975 E1038.84601
> 17:33:41.833: Error:checksum mismatch, Last Line: 14408
> 17:33:41.834: Resend: 14409
> 17:33:41.844: ok
< 17:33:41.844: Resend: N14409 G1 F1800 X101.015 Y104.975 E1038.84601
> 17:33:41.857: Error:checksum mismatch, Last Line: 14408
> 17:33:41.857: Resend: 14409
> 17:33:41.869: ok
< 17:33:41.869: N14410 G0 F7200 X101.078 Y105.038
> 17:33:41.879: Error:Line Number is not Last Line Number+1, Last Line: 14408
> 17:33:41.883: Resend: 14409
> 17:33:41.894: ok
< 17:33:41.894: N14411 G1 F1800 X101.882 Y105.842 E1038.86492
> 17:33:41.905: Error:Line Number is not Last Line Number+1, Last Line: 14408
> 17:33:41.908: Resend: 14409
> 17:33:41.919: ok
< 17:33:41.919: N14412 G0 F7200 X102.369 Y105.764
> 17:33:41.930: Error:Line Number is not Last Line Number+1, Last Line: 14408
Comments
In your example, N14409 G1 F1800 X101.015 Y104.975 E1038.84601 is 45 bytes, and adding *123 for the checksum takes it over the set limit, which is why you get the problem. You need at least a 63 byte buffer; I think 127 is also correct. If that does not work, enable ping-pong to throttle the speed and keep only one command in the queue.
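To check a line against a buffer limit yourself, here is a minimal Python sketch. It assumes the usual RepRap/Marlin framing, where the checksum is the XOR of every byte before the `*`. The helper names `frame` and `fits` are made up for illustration, and the exact spacing the host puts before the `*` may differ.

```python
def frame(line_number, command):
    """Return the command as it would go on the wire: N<num> <cmd>*<checksum>."""
    body = f"N{line_number} {command}"
    checksum = 0
    for byte in body.encode("ascii"):
        checksum ^= byte  # XOR of all bytes before the '*'
    return f"{body}*{checksum}"


def fits(framed, buffer_size):
    """True if the framed line plus its trailing newline fits in buffer_size bytes."""
    return len(framed) + 1 <= buffer_size


framed = frame(14409, "G1 F1800 X101.015 Y104.975 E1038.84601")
print(framed, len(framed), fits(framed, 63))
```

The body alone is 45 bytes, so the framed line is 47 to 49 bytes depending on how many digits the checksum has, before the newline is counted.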
If I set the buffer lower it does work. I haven't dug into the code to theorize about why because I agree, I figured I had to have at least the size of the longest string. However, empirically, the longer buffer lengths fail each time and the short one works fine.
However, either way, the error recovery doesn't seem to be effective. Again, that's with the longer buffer sizes (63, 127) with or without ping pong.
N14409 G1 F1800 X101.015 Y104.975 E1038.84601
The line number varies but doesn't matter other than it probably has to be a "high" number to get to the number of bytes that triggers the problem.
With buffer set to 63 or 127 with or without ping pong, this line will trigger a checksum error while streaming the GCODE. Setting the buffer to 48 will allow it to work. I know that 48 is smaller than the line size, so I can't explain that. This seems to be a quirk in the firmware. Perhaps it buffers a line and then doesn't have enough left for the next line.
However, the issue I think that affects Server is that despite being asked to resend that line repeatedly, it only resends it once. Then it continues sending new lines despite repeated requests to resend the original line.
I don't think it would help me to fix this because I think the board will constantly overwrite the buffer and then get a checksum error. But, if you did have two legitimate checksum errors in a row, I think this is broken behavior.
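To make the expected behavior concrete, here is a rough Python sketch of a sender that rewinds on every resend request, not just the first. This is not how the host or Server is actually implemented; `stream` and `FlakyFirmware` are hypothetical names, the firmware is a toy stand-in, and it only models one line in flight at a time (ping-pong style).

```python
def stream(lines, firmware):
    """Send numbered lines, rewinding to the requested line on each resend."""
    next_n = 1
    sent_log = []
    while next_n <= len(lines):
        sent_log.append(next_n)
        reply = firmware(next_n, lines[next_n - 1])
        if reply is None:
            next_n += 1      # plain "ok": move on to the next line
        else:
            next_n = reply   # "Resend: <reply>": go back, however often it is asked
    return sent_log


class FlakyFirmware:
    """Toy firmware that rejects one line a fixed number of times."""

    def __init__(self, bad_line, failures):
        self.last = 0
        self.bad_line = bad_line
        self.failures = failures
        self.accepted = []

    def __call__(self, n, text):
        if n != self.last + 1:
            return self.last + 1   # line number is not last line number + 1
        if n == self.bad_line and self.failures > 0:
            self.failures -= 1
            return n               # simulated checksum mismatch
        self.last = n
        self.accepted.append(n)
        return None


fw = FlakyFirmware(bad_line=3, failures=2)
print(stream(["G1 X1", "G1 X2", "G1 X3", "G1 X4", "G1 X5"], fw), fw.accepted)
```

With two checksum failures in a row on line 3, the sender offers line 3 three times and the firmware still ends up with lines 1 to 5 in order, which is what the log in the original post does not show.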
I'll investigate some more and I am happy to send you the STL, the gcode and/or the entire log. However, I think the log in the original post shows it all.
< 17:33:17.971: N14192 M105
Here's a little of another similar run:
And another:
The good thing is that I also saw a solution. I don't know which slicer you are using, but yours is creating exceptionally long lines because speed changes (F) are included in the motion lines (not an error, only a problem with small buffers). These are all the lines causing the checksum errors. If you slice with PrusaSlicer you will get the F numbers on a separate line, so the longest line will be 6 chars shorter.
Another thing you can do is use relative extrusion coordinates. Then E1038.84601 becomes E1.84601, which reduces the size by another 3 bytes. That might well be enough to prevent the long line.
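As a quick illustration of the savings, here is a small Python sketch. `split_feedrate` is a hypothetical helper that mimics what a slicer does when it puts F on its own line; it is not something the host or PrusaSlicer exposes, and the relative-extrusion line simply reuses the E1.84601 example value from above.

```python
def split_feedrate(command):
    """Move the F word of a motion command onto its own line."""
    words = command.split()
    feed = [w for w in words[1:] if w.startswith("F")]
    rest = [w for w in words[1:] if not w.startswith("F")]
    if not feed or not rest:
        return [command]  # nothing to split
    return [f"{words[0]} {feed[0]}", " ".join([words[0]] + rest)]


original = "N14409 G1 F1800 X101.015 Y104.975 E1038.84601"
no_feedrate = "N14409 G1 X101.015 Y104.975 E1038.84601"
relative_e = "N14409 G1 X101.015 Y104.975 E1.84601"

print(len(original), len(no_feedrate), len(relative_e))
print(split_feedrate("G1 F1800 X101.015 Y104.975 E1038.84601"))
```

The three lengths come out as 45, 39 and 36 bytes before the checksum is appended, so both changes together save 9 bytes on this line.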
One problem might be very long prints. The line number N is only reset to 10000 when it reaches 10000000, so you can get 3 more digits from that. That means the gain from relative extrusion can be eaten up by line numbers. 10 million is normally never reached, but you do reach more than the number of lines in the gcode, because line numbers are only reset at connection start and we also query temperatures every second. So just moving F to a separate line and using relative extrusion (better anyway in terms of precision) would remove the buffer size problem.
The best solution, of course, would be a firmware update with bigger buffers like normal firmwares have.