I am working with JavaScript to generate a file hash value for unique file identification. Kindly check the code below for the hash generation mechanism, which works well.
<script type="text/javascript">
// Reference: https://code.google.com/p/crypto-js/#MD5
function handleFileSelect(evt)
{
var files = evt.target.files; // FileList object
// Loop through the FileList and render image files as thumbnails.
for (var i = 0, f; f = files[i]; i++)
{
var reader = new FileReader();
// Closure to capture the file information.
reader.onload = (function(theFile)
{
return function(e)
{
var span = document.createElement('span');
var test = e.target.result;
//var hash = hex_md5(test);
var hash = CryptoJS.MD5(test);
var elem = document.getElementById("hashValue");
elem.value = hash;
};
})(f);
// Read in the file as a binary string.
reader.readAsBinaryString(f);
}
}
document.getElementById('videoupload').addEventListener('change', handleFileSelect, false);
</script>
However, I am facing a problem when generating the hash value for large files, as the browser crashes on the client side.
Hashing works well up to 30 MB, but if I try to upload anything larger than that, the browser crashes.
My Question is:
Can I generate a hash value for part of a file, rather than reading the whole large file and crashing? If yes, how can I do that with FileReader?
Can I specify a number of bytes, such as the first 2000 characters of a file, to generate the hash value instead of reading the entire large file?
I hope the above two solutions will work for both larger and smaller files. Are there any other options?
My Fiddle Demo
- Can I generate a hash value for part of a file, rather than reading the whole large file and crashing? If yes, how can I do that with FileReader?
Yes, you can do that and it is called Progressive Hashing.
var md5 = CryptoJS.algo.MD5.create();
md5.update("file part 1");
md5.update("file part 2");
md5.update("file part 3");
var hash = md5.finalize();
- Can I specify a number of bytes, such as the first 2000 characters of a file, to generate the hash value instead of reading the entire large file?
There's an HTML5Rocks article on how one can use File.slice to pass a sliced file to the FileReader:
var blob = file.slice(startingByte, endingByte);
reader.readAsArrayBuffer(blob);
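As a hedged sketch of the same idea with newer APIs: Blob.prototype.arrayBuffer() is a Promise-based alternative to FileReader.readAsArrayBuffer(), which makes walking a file chunk by chunk, strictly one after another, almost trivial (forEachChunk is a made-up helper name, not a standard API):

```javascript
// Walk a Blob/File in fixed-size slices, sequentially. Each await
// completes before the next slice is read, which is exactly the
// ordering a progressive hash needs.
async function forEachChunk(blob, chunkSize, handleChunk) {
  for (let start = 0; start < blob.size; start += chunkSize) {
    const slice = blob.slice(start, Math.min(start + chunkSize, blob.size));
    await handleChunk(await slice.arrayBuffer()); // e.g. md5.update(...)
  }
}
```

This was not an option at the time of the original answer, which is why the full solution below synchronizes the reads manually with a callback-based series helper.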
Full solution
I have combined both. The tricky part was to synchronize the file reading, because FileReader.readAsArrayBuffer()
is asynchronous. I've written a small series
function which is modeled after the series
function of async.js. The chunks have to be processed one after the other, because there is no way to get at the internal state of CryptoJS's hashing function.
Additionally, CryptoJS doesn't understand what an ArrayBuffer
is, so it has to be converted to its native data representation, which is the so-called WordArray:
function arrayBufferToWordArray(ab) {
var i8a = new Uint8Array(ab);
var a = [];
for (var i = 0; i < i8a.length; i += 4) {
a.push(i8a[i] << 24 | i8a[i + 1] << 16 | i8a[i + 2] << 8 | i8a[i + 3]);
}
return CryptoJS.lib.WordArray.create(a, i8a.length);
}
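The shifts pack four bytes into one big-endian 32-bit word, which is CryptoJS's internal layout. A small standalone check of just that packing (plain JS, no CryptoJS dependency; bytesToWords is an illustrative name, not part of any library):

```javascript
// Pack bytes into big-endian 32-bit words, mirroring what
// arrayBufferToWordArray does before handing data to CryptoJS.
// Out-of-range reads on a short tail (e.g. bytes[i + 1] past the end)
// yield undefined, which the shift operators coerce to 0, so trailing
// partial words still pack correctly.
function bytesToWords(bytes) {
  const words = [];
  for (let i = 0; i < bytes.length; i += 4) {
    words.push(bytes[i] << 24 | bytes[i + 1] << 16 | bytes[i + 2] << 8 | bytes[i + 3]);
  }
  return words;
}

console.log(bytesToWords(new Uint8Array([0x12, 0x34, 0x56, 0x78]))); // [ 305419896 ], i.e. [ 0x12345678 ]
```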
The other thing is that hashing is a synchronous operation where there is no yield
to continue execution elsewhere. Because of this, the browser will freeze, since JavaScript is single-threaded. The solution is to use Web Workers to offload the hashing to a different thread, so that the UI thread stays responsive.
Web workers expect a script file URL in their constructor, so I used this solution by Rob W to get an inline script.
function series(tasks, done){
if(!tasks || tasks.length === 0) {
done();
} else {
tasks[0](function(){
series(tasks.slice(1), done);
});
}
}
function webWorkerOnMessage(e){
if (e.data.type === "create") {
md5 = CryptoJS.algo.MD5.create();
postMessage({type: "create"});
} else if (e.data.type === "update") {
function arrayBufferToWordArray(ab) {
var i8a = new Uint8Array(ab);
var a = [];
for (var i = 0; i < i8a.length; i += 4) {
a.push(i8a[i] << 24 | i8a[i + 1] << 16 | i8a[i + 2] << 8 | i8a[i + 3]);
}
return CryptoJS.lib.WordArray.create(a, i8a.length);
}
md5.update(arrayBufferToWordArray(e.data.chunk));
postMessage({type: "update"});
} else if (e.data.type === "finish") {
postMessage({type: "finish", hash: ""+md5.finalize()});
}
}
// URL.createObjectURL
window.URL = window.URL || window.webkitURL;
// "Server response", used in all examples
var response =
"importScripts('https://cdn.rawgit.com/CryptoStore/crypto-js/3.1.2/build/rollups/md5.js');"+
"var md5;"+
"self.onmessage = "+webWorkerOnMessage.toString();
var blob;
try {
blob = new Blob([response], {type: 'application/javascript'});
} catch (e) { // Backwards-compatibility
window.BlobBuilder = window.BlobBuilder || window.WebKitBlobBuilder || window.MozBlobBuilder;
blob = new BlobBuilder();
blob.append(response);
blob = blob.getBlob();
}
var worker = new Worker(URL.createObjectURL(blob));
var files = evt.target.files; // FileList object
var chunksize = 1000000; // the chunk size doesn't make a difference
var i = 0,
f = files[i],
chunks = Math.ceil(f.size / chunksize),
chunkTasks = [],
startTime = (new Date()).getTime();
worker.onmessage = function(e) {
// create callback
for(var j = 0; j < chunks; j++){
(function(j, f){
chunkTasks.push(function(next){
var blob = f.slice(j * chunksize, Math.min((j+1) * chunksize, f.size));
var reader = new FileReader();
reader.onload = function(e) {
var chunk = e.target.result;
worker.onmessage = function(e) {
// update callback
document.getElementById('num').innerHTML = ""+(j+1)+"/"+chunks;
next();
};
worker.postMessage({type: "update", chunk: chunk});
};
reader.readAsArrayBuffer(blob);
});
})(j, f);
}
series(chunkTasks, function(){
var elem = document.getElementById("hashValueSplit");
var telem = document.getElementById("time");
worker.onmessage = function(e) {
// finish callback
elem.value = e.data.hash;
telem.innerHTML = "in " + Math.ceil(((new Date()).getTime() - startTime) / 1000) + " seconds";
};
worker.postMessage({type: "finish"});
});
// blocking way ahead...
if (document.getElementById("singleHash").checked) {
var reader = new FileReader();
// Closure to capture the file information.
reader.onloadend = (function(theFile) {
function arrayBufferToWordArray(ab) {
var i8a = new Uint8Array(ab);
var a = [];
for (var i = 0; i < i8a.length; i += 4) {
a.push(i8a[i] << 24 | i8a[i + 1] << 16 | i8a[i + 2] << 8 | i8a[i + 3]);
}
return CryptoJS.lib.WordArray.create(a, i8a.length);
}
return function(e) {
var test = e.target.result;
var hash = CryptoJS.MD5(arrayBufferToWordArray(test));
//var hash = "none";
var elem = document.getElementById("hashValue");
elem.value = hash;
};
})(f);
// Read in the file as an ArrayBuffer.
reader.readAsArrayBuffer(f);
}
};
worker.postMessage({type: "create"});
The DEMO seems to work for big files, but it takes quite a lot of time. Maybe this could be improved with a faster MD5 implementation; it took around 23 minutes to hash a 3 GB file.
This answer of mine shows an example without web workers, for SHA-256.