[0.75.1] Alternative server (Connectivity) issue

My setup: PC -> Alix Board -> Arduino.
Alix Board: Repetier Server 
PC: Alternative Server for FFMPEG and model rendering

The first day after connecting it, everything worked great: fast rendering and ffmpeg conversion of webcam shots.
A day later, neither of those works anymore. Rendering keeps showing the "Rendering" image, and Recordings shows "Converting" all the time.
I have tried to remove and re-enter the API and IP of the PC, which Alix detects as the Repetier-Renderer and Video Converter services.
But there is still no rendering and no ffmpeg conversion.

The server log file (Alix side) shows the following:
2016-05-12 19:45:07: Job created: /var/lib/Repetier-Server/printer/MP20/models/00000016_1x carrello_sotto_netfabb_mm.u
2016-05-12 19:45:07: finish job /var/lib/Repetier-Server/printer/MP20/models/00000016_1x carrello_sotto_netfabb_mm.u
2016-05-12 19:45:07: Uploaded  size:8337417
2016-05-12 19:45:07: Updating info for /var/lib/Repetier-Server/printer/MP20/models/00000016_1x carrello_sotto_netfabb_mm.g printer MP20
2016-05-12 19:45:07: line 141
2016-05-12 21:34:20: Client closed connection unexpectedly
2016-05-12 21:44:49: Job created: /var/lib/Repetier-Server/printer/MP20/jobs/00000003_2x laterale_netfabb_mm.u
2016-05-12 21:44:50: finish job /var/lib/Repetier-Server/printer/MP20/jobs/00000003_2x laterale_netfabb_mm.u
2016-05-12 21:44:50: Added timelapse 20160512T214450_2x laterale_netfabb_mm
2016-05-12 22:10:49: Client closed connection unexpectedly
2016-05-12 22:10:51: Client closed connection unexpectedly
2016-05-12 22:36:06: Call of rs:toJsonObject without table
2016-05-12 22:40:19: Call of rs:toJsonObject without table

On the PC side, there is no entry for any job from Alix.

Comments

  • Hmm, after today's reboot, both services came online again.
    Is there some limitation, like the Alternative Server MUST be online all the time?
    Maybe it stops working if I reboot the Alternative Server or something like that?
  • It checks every minute whether the alternative server is up and running. If so, it will use it. Once it starts doing something on the remote, it waits for it to finish. In this case, if I interpret it correctly, you have uploaded a file and are waiting for the models to get rendered. But there are no messages about this.

    I'm a bit confused about "Added timelapse 20160512T214450_2x laterale_netfabb_mm", which seems to come directly after uploading 00000003_2x laterale_netfabb_mm.

    If it happens again, please check whether the PC side is still running. I will also have a look into the logic. After all, quite a complex list of steps is involved, and if one blocks, everything blocks. So that should not happen.

    One thing that easily causes problems is a full hard disk, so always watch out that you have enough space.
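
For anyone who wants to do the "is the PC side still running" check by hand, here is a minimal sketch of a reachability probe. It is not Repetier-Server's actual check, only an illustration of the idea "probe the remote, use it if it answers, otherwise work locally". The names `is_reachable` and `choose_target` are made up for this example, and the host and port you pass in are whatever your own alternative server listens on.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def choose_target(remote_up: bool) -> str:
    """Pick where the work should run (hypothetical decision helper)."""
    return "remote" if remote_up else "local"
```

Calling `choose_target(is_reachable(pc_ip, pc_port))` from the Alix side shows quickly whether the PC answers at all. Note that a successful TCP connect only proves the port is open, not that the worker behind it is making progress.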
  • The same happened once more. But after rebooting Alix, everything went back to normal (both rendering and video).
    Storage is not the issue; there is at least 500 MB of free space on Alix.
    On the Alix board, an external USB flash drive is mounted and linked as the snapshot folder, so JPEGs and video files are stored on a large medium.
  • So the Alix worker process seems to wait forever. Do you know if image creation was the part that did not finish? I saw the log entry "Call of rs:toJsonObject without table", which can only happen in image rendering.
  • Hmm, it could be. I often had to stop printing by resetting the Arduino, e.g. if the first layer did not stick to the bed.
  • Do you upload directly to the print queue? That would trigger a separate image rendering, and killing the job before rendering is finished deletes it, and then the image gets delivered. Maybe that is the case that triggers the problem, although in that case the image should just get deleted, as the mapping object was deleted in the meantime.
  • No, the upload is done via Repetier-Host. But maybe Repetier-Host is the reason for this. Sometimes, after I send a print to the server, Repetier-Host freezes and Linux kills it after a few moments.
  • That does not matter. If you click the print button in Repetier-Host, it does exactly this.

    A host freeze will disconnect the websocket and should not disturb the server in any way. The ping will not arrive, and a timeout closes the connection.
  • I might have found the situation in which it happens.

    For some reason both Repetier servers (PC and Alix) crashed after running the following:
    /usr/bin/ffmpeg -loglevel error -f concat -i /var/lib/Repetier-Server/printer/MP20/timelapse/20160515T142404_3D_test_V3_0/input.txt -i /var/lib/Repetier-Server/database/watermark.png -c:v libx264 -filter_complex [0:v][1:v] overlay=10:10 -threads 1 -r 30 -pix_fmt yuv420p -profile:v baseline -level 3.0 -movflags +faststart -b:v 1000k -y /var/lib/Repetier-Server/printer/MP20/timelapse/20160515T142404_3D_test_V3_0/video.mp4

    That was the last log entry on the Alix server. There was no log entry at that time on the PC server.
    ffmpeg on Alix fails because Alix does not support ffmpeg correctly:
    [libx264 @ 0x9edf440] your cpu does not support SSE1, but x264 was compiled with asm
    [libx264 @ 0x9edf440] to run x264, recompile without asm (configure --disable-asm)
    Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height

    After restarting both the PC and Alix, the issue from this topic is present again.
    Is there any debug log to find out why ffmpeg crashed both Repetier servers?
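
The libx264 message quoted above says the CPU lacks SSE while x264 was built with assembly optimizations. On Linux this can be confirmed before starting a conversion by looking at the `flags` line in `/proc/cpuinfo`. A small sketch follows; `cpu_flags` and `supports_sse` are hypothetical helper names, not part of Repetier-Server or ffmpeg.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports_sse(cpuinfo_text: str) -> bool:
    """True if the 'sse' flag is listed, which the asm build of x264 asks for."""
    return "sse" in cpu_flags(cpuinfo_text)
```

On the board itself this would be called as `supports_sse(open("/proc/cpuinfo").read())`. If it returns False, local libx264 encoding will fail exactly as in the message above unless x264 is rebuilt with `--disable-asm`, so the remote converter is the only working path.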
  • Mhh. ffmpeg is external software, so a crash there cannot crash the server directly. Just to be sure about the definition: a crash for me is when the software does not run at all. If it hangs and does no further worker jobs like rendering, this is not a crash but rather a deadlock.

    Normal exceptions during processing should show
    "Uncatched error in video conversion:" in the server log.
    If conversion on the remote fails, it should show "video conversion on remote failed, converting locally", which would then cause a new error on your Alix, and work should continue.

    I found one possible deadlock while reviewing the upload code. If uploading one file fails, the Alix server would hang in an endless loop, causing the deadlock for the worker processes. I just fixed it for the next release. Since video conversion sends quite a lot of data, the probability of a failure is much higher here. This also means shutting down the server would not work, as the process never finishes, and you would need to kill the server process to get it working again.

    So the bug I found sounds much like what you describe.
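
To illustrate the kind of fix described above (this is only a sketch under assumptions, not the actual Repetier-Server code): an upload loop that gives up after a fixed number of attempts cannot hang forever, and the caller can then fall back to local conversion. All function names and the `max_attempts` value are hypothetical; only the log text is taken from this thread.

```python
from typing import Callable, Iterable


def upload_all(files: Iterable[str],
               upload_one: Callable[[str], bool],
               max_attempts: int = 3) -> bool:
    """Upload every file; return False as soon as one file fails max_attempts times."""
    for name in files:
        for _ in range(max_attempts):
            if upload_one(name):
                break  # this file is done, continue with the next one
        else:
            # No attempt succeeded: stop instead of retrying endlessly.
            return False
    return True


def convert(files: Iterable[str],
            upload_one: Callable[[str], bool],
            convert_remote: Callable[[], bool],
            convert_local: Callable[[], None],
            log: Callable[[str], None] = print) -> str:
    """Try the remote converter first and fall back to local conversion on failure."""
    if upload_all(files, upload_one) and convert_remote():
        return "remote"
    log("video conversion on remote failed, converting locally")
    convert_local()
    return "local"
```

The point of the bounded inner loop is that a single broken transfer ends in a defined state (fallback and a log line) rather than blocking every later worker job.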
  • Great. I will test it with the next release.
    And yes, usually I have to reboot Alix to get the server up and running again.