s3 remote download speeds
-
@frank-s said in s3 remote download speeds:
So I continue to test s3 remote. I have noticed that when I upload a backup it is able to completely saturate my upload bandwidth (40Mbps), however when I download (restore) I am lucky to get 20Mbps. I wondered if Wasabi was throttling my downloads so I installed Wasabi explorer on a windows box and tried downloading a large folder. Immediately it was much faster. Then I found the option for increasing the threads and I was able to saturate my download bandwidth (350Mbps). Can I ask how many threads the s3 remote uses when downloading and is there a way to tweak its performance?
Thanks.

When uploading, we can upload file chunks in parallel (16 by default), but when downloading we must rebuild the VHD in order, which limits the concurrency. We made the choice to optimize backup rather than restore, since there are more backups than restores.
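For illustration only - this is not XO's actual code. The parallel-upload pattern described above looks roughly like the AWS SDK v3 multipart helper below, where `queueSize: 16` mirrors the default concurrency mentioned; the bucket, key, paths and endpoint are placeholders.

```ts
// Sketch only - not XO code. Parallel multipart upload with the AWS SDK v3 helper.
import { createReadStream } from "node:fs";
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";

const client = new S3Client({
  region: "us-east-1",
  endpoint: "https://s3.wasabisys.com", // Wasabi's S3-compatible endpoint (placeholder config)
});

const upload = new Upload({
  client,
  params: {
    Bucket: "my-backup-bucket",
    Key: "backups/vm-disk.vhd",
    Body: createReadStream("/tmp/vm-disk.vhd"),
  },
  queueSize: 16,              // number of parts uploaded in parallel
  partSize: 16 * 1024 * 1024, // 16 MiB per part
});

await upload.done();
```

Because 16 parts are in flight at once, the upload side can keep the link busy even when individual requests stall, which is why uploads saturate the bandwidth while the in-order restore stream does not.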
-
@florent said in s3 remote download speeds:
When uploading, we can upload file chunks in parallel (16 by default), but when downloading we must rebuild the VHD in order, which limits the concurrency. We made the choice to optimize backup rather than restore, since there are more backups than restores.
Makes sense. Is this documented anywhere, for the record, for when I forget it in a month... ha
-
@florent said in s3 remote download speeds:
We made the choice to optimize backup rather than restore, since there are more backups than restores.
I see. I understand that there would be more backups than restores - in an ideal world there would be no restores.
Yet in the real world, when a full restore is needed it is likely to be needed desperately. In this case I would probably prioritise restore over backup - especially where deltas are used, as these would be smaller anyway...
-
@frank-s That is a good argument.
It would be possible to do parallel downloading (and restore) in two cases: downloading to a local file and then loading it into XCP-ng (but each VHD can be 2TB), or with SMAPIv3 as we imagine it, which would allow writing parts of a file instead of the full VHD at once.
We'll check if there is a more clever solution, maybe by reading ahead a few blocks to ensure smooth stream generation.
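A rough sketch of that read-ahead idea, not XO's implementation: the block size, window size and function name below are invented for illustration. The next few ranged GETs are kept in flight in parallel while blocks are still written to the destination strictly in order.

```ts
// Sketch only - not XO code. Assumes AWS SDK v3 and Node.js streams.
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { Writable } from "node:stream";

async function downloadInOrderWithReadAhead(
  client: S3Client,
  bucket: string,
  key: string,
  totalSize: number,
  destination: Writable,
  blockSize = 16 * 1024 * 1024, // 16 MiB per ranged GET (illustrative)
  readAhead = 8                 // blocks kept in flight ahead of the writer
): Promise<void> {
  const blockCount = Math.ceil(totalSize / blockSize);

  const fetchBlock = async (index: number): Promise<Uint8Array> => {
    const start = index * blockSize;
    const end = Math.min(start + blockSize, totalSize) - 1;
    const { Body } = await client.send(
      new GetObjectCommand({ Bucket: bucket, Key: key, Range: `bytes=${start}-${end}` })
    );
    // In Node.js the Body is a stream; buffer this one block entirely in memory.
    return Body!.transformToByteArray();
  };

  // Prime the read-ahead window: these requests run in parallel.
  const pending: Promise<Uint8Array>[] = [];
  for (let i = 0; i < Math.min(readAhead, blockCount); i++) {
    pending.push(fetchBlock(i));
  }

  for (let i = 0; i < blockCount; i++) {
    // Blocks are consumed strictly in order, but the next `readAhead`
    // requests are already in flight, keeping the link busy.
    const block = await (pending.shift() as Promise<Uint8Array>);
    const next = i + readAhead;
    if (next < blockCount) pending.push(fetchBlock(next)); // refill the window
    if (!destination.write(block)) {
      await new Promise<void>(resolve => destination.once("drain", resolve));
    }
  }
  destination.end();
}
```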
-
@florent Thank you. I am a mere user - you devs are light years ahead of me, so I wouldn't know what to do with SMAPIv3...
I guess I could probably download an entire folder from S3 using third-party tools with parallel processing and then make a local remote from it to restore VMs, but this would be a two-stage process. I can't help but feel that there is a more elegant solution.
-
The two-stage process means we break one of our current advantages: XO doesn't need additional space that depends on the VM size.
But we'll have to go that way when handling archiving, like AWS Glacier or backup to tape, so it's not a hard no.
I am leaning toward an intermediate solution, where we read a few hundred MBytes of data into memory.
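To put a number on "a few hundred MBytes": in a read-ahead scheme like the sketch above, the peak buffer is roughly the block size times the number of blocks in flight. The values below are illustrative, not actual XO settings.

```ts
// Illustrative sizing only, not XO configuration.
const blockSize = 16 * 1024 * 1024;                 // 16 MiB per ranged GET
const blocksInFlight = 16;                          // read-ahead window
const peakBufferBytes = blockSize * blocksInFlight; // at most ~256 MiB held in memory
```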
-
@florent Thank you so much for looking into this. The ability to saturate the download link will make a big difference - especially for on-premises servers.
-
@florent Hi, have you been able to make any progress with this?
Thanks.
-
@frank-s said in s3 remote download speeds:
@florent Hi, Have you been able to make any progress with this?
Thanks.

Not for now, but it's in our backlog, and I have some major work to do on the S3 code in the next quarter.
-
@florent Thank you for the update.