I am currently building a Svelte app where a core piece of functionality is letting users upload large (around 20 GB) files to an S3 bucket. I am using a form with a file input for this, and the process works fine on files up to about 2 GB (which is suspiciously close to the free RAM on my machine once all the Docker containers are running) but fails on anything larger. Does anyone know a way to process form inputs as a stream rather than loading them entirely into memory?
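To make the question concrete, this untested sketch is roughly what I mean by "processing as a stream" (readFileInChunks and onChunk are just made-up names; the only API I am leaning on is Blob/File .stream(), which returns a ReadableStream of byte chunks):

// Rough, untested sketch of what I mean by "processing as a stream".
// File extends Blob, and Blob.stream() yields Uint8Array chunks,
// so the whole file never has to sit in memory at once.
const readFileInChunks = async (
  file: File,
  onChunk: (chunk: Uint8Array) => Promise<void>,
) => {
  const reader = file.stream().getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    await onChunk(value); // e.g. append to the buffer for the current part
  }
};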
Here are the relevant parts of the code.

Form:
<section class="upload">
  <form on:submit|preventDefault={handleUpload}>
    <div class="file-input-container">
      <Input
        type="file"
        name="file"
        accept="image/*,video/*"
        labelText="Find a file to upload"
        error={errors?.file}
      />
      <div class="button-container">
        <button type="submit" class="btn btn-sm btn-primary" disabled={submitting}>
          Upload
        </button>
      </div>
    </div>
  </form>
</section>
Input component (relevant branch only):
{#if type == "file"}
  <input
    class="file-input file-input-bordered {error ? 'input-error' : ''}"
    {type}
    {placeholder}
    {name}
    {value}
    {accept}
  />
Upload handler:
const handleUpload = async (e: any) => {
  const formData = new FormData(e.target as HTMLFormElement);
  const data = Object.fromEntries(formData);
  const file = data.file as File;
  filename = file.name;
  try {
    if (!file || file.size < 1) {
      console.log("No file selected");
      errors = { file: "No file selected" };
      return;
    }
    submitting = true;
    const { name: key, size, type } = file;
    const { id, file_type, uploadId } = await createLargeFile(file, updateProgress);
    const updateExperienceDto: ExperienceUploadDto = {
      fileId: id,
    };
    const updateExperienceResponse = await fetch(
      `/api/experience/origin/${$page.params.id}?type=threeDoF`,
      {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(updateExperienceDto),
      },
    );
    if (!updateExperienceResponse.ok) {
      console.log(updateExperienceResponse.statusText);
      return;
    }
    uploadComplete = true;
    submitting = false;
    await goto(`/experience/${$page.params.id}`);
  } catch (error) {
    console.log(`Error in handleUpload on / route: ${error}`);
    errors = { file: "Error uploading file" };
  }
};
Multipart S3 upload function:
const partSizeMb = 100;

export const createLargeFile = async (
  file: File,
  updateProgress: (progress: number) => void,
) => {
  const { name: key, size, type } = file;
  const partSize = partSizeMb * 1024 * 1024; // 100 MB
  const partCount = Math.ceil(size / partSize);

  const createFileDto: CreateMultipartFileDto = {
    name: key,
    type: type,
    partCount,
  };
  const createFileHeaders = { "Content-Type": "application/json" };
  const createFileResponse = await fetch("/api/file/multipart", {
    method: "POST",
    body: JSON.stringify(createFileDto),
    headers: createFileHeaders,
  });
  if (!createFileResponse.ok) {
    // TODO check error
    throw new Error(createFileResponse.statusText);
  }
  const response_json: FileWithMultipartUploadUrlsEntity = await createFileResponse.json();
  const { id, type: file_type, signedUploadUrls, uploadId } = response_json;

  const reader = new FileReader();
  const lastIndex = signedUploadUrls.length - 1;
  const partPercent = 100 / partCount;

  const promise = new Promise<void>((resolve, reject) => {
    // Upload the file part by part once FileReader has read the whole thing
    reader.onloadend = async () => {
      const uploadPromises = signedUploadUrls.map(
        async (element: string, index: number) => {
          console.log("uploading part", index + 1, "of", partCount);
          console.log(index !== lastIndex ? partSize : size - index * partSize);
          const response = await fetch(element, {
            method: "PUT",
            body:
              index !== lastIndex
                ? reader?.result?.slice(index * partSize, (index + 1) * partSize)
                : reader?.result?.slice(index * partSize),
            headers: {
              "Content-Type": type,
              "Content-Length": `${index !== lastIndex ? partSize : size - index * partSize}`,
            },
          });
          updateProgress(partPercent);
          return response;
        },
      );

      const uploadResults = [];
      for (let i = 0; i < uploadPromises.length; i++) {
        uploadResults.push(await uploadPromises[i]);
      }
      for (const uploadResult of uploadResults) {
        if (!uploadResult.ok) {
          // TODO check error
          throw new Error(uploadResult.statusText);
        }
      }

      const parts = uploadResults.map((element, index) => ({
        ETag: element.headers.get("etag"),
        PartNumber: index + 1,
      }));
      const finishMultipartUploadDto: FinishMultipartFileUploadDto = {
        uploadId,
        parts,
        id: id,
        type: file_type,
      };
      await fetch(`/api/file/multipart/${id}`, {
        method: "PUT",
        body: JSON.stringify(finishMultipartUploadDto),
        headers: createFileHeaders,
      });
      resolve();
    };
    reader.readAsArrayBuffer(file);
  });

  await promise;
  // TODO: deal with this without a timeout
  await new Promise((r) => setTimeout(r, 1000));
  return { id, file_type, uploadId };
};
The way I am doing things, FileReader appears to load the entire file into memory. When it silently fails on my 12 GB sample file, it returns an empty buffer, and my program uploads 120 zero-byte chunks to S3, leading to a failed upload.
I want to find a way to read the file from the file system as a stream and process it a few chunks at a time.
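For what it's worth, this is the kind of change I imagine inside createLargeFile: an untested sketch (uploadParts is a hypothetical helper) that drops FileReader entirely and passes file.slice(...) blobs straight to fetch, on the assumption that the browser can stream a Blob body from disk instead of buffering the whole file.

// Untested sketch: upload each part as a Blob slice instead of slicing one
// giant ArrayBuffer. file.slice() only creates a reference to a byte range,
// so (as far as I understand) nothing is read into memory until fetch sends it.
const uploadParts = async (
  file: File,
  signedUploadUrls: string[],
  partSize: number,
  updateProgress: (progress: number) => void,
) => {
  const responses: Response[] = [];
  for (let index = 0; index < signedUploadUrls.length; index++) {
    const start = index * partSize;
    const end = Math.min(start + partSize, file.size);
    const response = await fetch(signedUploadUrls[index], {
      method: "PUT",
      body: file.slice(start, end), // Blob, not an in-memory buffer
      headers: { "Content-Type": file.type },
    });
    if (!response.ok) throw new Error(response.statusText);
    updateProgress(100 / signedUploadUrls.length);
    responses.push(response);
  }
  // ETags for completing the multipart upload, same as before
  return responses.map((r, i) => ({
    ETag: r.headers.get("etag"),
    PartNumber: i + 1,
  }));
};

Is that assumption about Blob bodies sound, or is there a better pattern for feeding 100 MB parts to the presigned URLs?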