Loading large amount of data into memory - most efficient way to do this?

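I have a web-based documentation searching/viewing system that I'm developing for a client. Part of this system is a search feature that lets the client look up a term contained in the documentation. I have the necessary search data files created, but a lot of data needs to be loaded, and it takes anywhere from 8 to 20 seconds to load all of it. The data is split into 40-100 files, depending on which documentation needs to be searched. Each file is anywhere from 40 to 350 KB.

Also, this application must be able to run on the local file system as well as through a web server.

When the webpage loads, I can generate a list of the search data files I need to load. The entire list must be loaded before the webpage can be considered functional.

With that preface out of the way, let's look at how I'm doing it now.

After I know that the entire webpage is loaded, I call a loadData() function.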

function loadData(){
    // Log the start time, then begin loading the files one by one.
    var d = new Date();
    var curr_min = d.getMinutes();
    var curr_sec = d.getSeconds();
    var curr_mil = d.getMilliseconds();
    console.log("test.js started background loading, time is: " + curr_min + ":" + curr_sec + ":" + curr_mil);
    recursiveCall();
}

function recursiveCall(){
    if(file_array.length > 0){
        // Load the next file after a 1 ms pause; getScript calls back here when done.
        var string = file_array.pop();
        setTimeout(function(){$.getScript(string,recursiveCall);},1);
    }
    else{
        // No files left: log the finish time.
        var d = new Date();
        var curr_min = d.getMinutes();
        var curr_sec = d.getSeconds();
        var curr_mil = d.getMilliseconds();
        console.log("test.js stopped background loading, time is: " + curr_min + ":" + curr_sec + ":" + curr_mil);
    }
}

What this does is process an array of files sequentially, taking a 1 ms break between files. This helps prevent the browser from being completely locked up during the loading process, but the browser still tends to get bogged down by loading the data. Each of the files that I'm loading looks like this:

AddToBookData(0,[0,1,2,3,4,5,6,7,8]);
AddToBookData(1,[0,1,2,3,4,5,6,7,8]);
AddToBookData(2,[0,1,2,3,4,5,6,7,8]);

Each line is a function call that adds data to an array. The "AddToBookData" function simply does the following:

    function AddToBookData(index1,value1){
         BookData[BookIndex].push([index1,value1]);
    }

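This is the existing system. After all the data has loaded, "AddToBookData" may have been called 100,000+ times.

I figured that was pretty inefficient, so I wrote a script that takes the test.js file containing all the function calls above and converts it into one giant array equal to the data structure that BookData is creating. Instead of making all the function calls that the old system did, I simply do the following: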

var test_array = [..........(data structure I need).......];
BookData[BookIndex] = test_array;

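I expected to see a performance increase because I had removed all the function calls above, but this method actually takes slightly more time to create the exact same data structure. I should note that "test_array" holds slightly over 90,000 elements in my real-world test.

Both methods of loading the data seem to perform roughly the same. This surprised me, since I expected the second method to take almost no time, given that the data structure is built beforehand.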
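To make the comparison concrete, here is a minimal sketch of the two approaches. `BookData`, `BookIndex`, and `AddToBookData` mirror the names above; `buildWithCalls`, `buildPrebuilt`, and the tiny three-row sample are made up for illustration. It only checks that both approaches yield the same structure and is not a benchmark.

```javascript
var BookData = [];
var BookIndex = 0;

// Approach 1: each data file is a series of AddToBookData(...) calls.
function AddToBookData(index1, value1) {
  BookData[BookIndex].push([index1, value1]);
}

// Hypothetical helper: simulates executing one data file of function calls.
function buildWithCalls(rowCount) {
  BookData[BookIndex] = [];
  for (var i = 0; i < rowCount; i++) {
    AddToBookData(i, [0, 1, 2, 3, 4, 5, 6, 7, 8]);
  }
  return BookData[BookIndex];
}

// Approach 2: the data file holds one pre-built array literal.
// Hypothetical helper: stands in for a file containing `var test_array = [...]`.
function buildPrebuilt() {
  var test_array = [
    [0, [0, 1, 2, 3, 4, 5, 6, 7, 8]],
    [1, [0, 1, 2, 3, 4, 5, 6, 7, 8]],
    [2, [0, 1, 2, 3, 4, 5, 6, 7, 8]]
  ];
  BookData[BookIndex] = test_array;
  return BookData[BookIndex];
}

// Serialize both results so they can be compared structurally.
var viaCalls = JSON.stringify(buildWithCalls(3));
var viaLiteral = JSON.stringify(buildPrebuilt());
console.log(viaCalls === viaLiteral);
```

In both cases the browser still has to parse the source text and allocate every inner array, which may be part of why the two approaches cost about the same.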

Please advise.

Best Answer

It looks like there are two basic areas for optimising the data loading, which can be considered and tackled separately:

  1. Downloading the data from the server. Rather than one large file, you should gain wins from parallel loads of multiple smaller files. Experiment with the number of simultaneous loads, bearing in mind browser limits and the diminishing returns of having too many parallel connections. See my parallel vs sequential experiments on jsfiddle, but bear in mind that the results will vary due to the vagaries of pulling the test data from github. You're best off testing with your own data under more tightly controlled conditions.
  2. Building your data structure as efficiently as possible. Your result looks like a multi-dimensional array; this interesting article on JavaScript array performance may give you some ideas for experimentation in this area.
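As a rough illustration of the first point, the sketch below caps the number of simultaneous loads. The names `loadAllWithLimit` and `fakeLoad` are hypothetical, and `fakeLoad` merely simulates a download with a timer; in a real page `loadOne` would be whatever actually fetches a file and returns a Promise. The limit of 2 is an arbitrary starting value to experiment with.

```javascript
// Loads every URL with at most `limit` requests in flight at once.
// `loadOne(url)` must return a Promise; it is a placeholder for the real loader.
function loadAllWithLimit(urls, limit, loadOne) {
  var results = new Array(urls.length);
  var next = 0;

  // Each "lane" loads one file, then picks up the next pending file.
  function runLane() {
    if (next >= urls.length) {
      return Promise.resolve();
    }
    var i = next++;
    return loadOne(urls[i]).then(function (value) {
      results[i] = value; // keep results in the original order
      return runLane();
    });
  }

  var lanes = [];
  for (var n = 0; n < Math.min(limit, urls.length); n++) {
    lanes.push(runLane());
  }
  return Promise.all(lanes).then(function () {
    return results;
  });
}

// Stand-in loader for demonstration only: resolves after a short delay.
function fakeLoad(url) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve("loaded " + url); }, 5);
  });
}

var demo = loadAllWithLimit(["a.js", "b.js", "c.js", "d.js", "e.js"], 2, fakeLoad);
demo.then(function (results) {
  console.log(results.length + " files loaded");
});
```

Timing the same file list with different `limit` values is one way to find where extra parallel connections stop helping.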

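However, I'm not sure how much you'll gain from optimising the data loading alone. To solve the actual problem with your application (the browser locking up for too long), have you considered options such as the following?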

Using Web Workers

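Web Workers might not be supported by all of your target browsers, but they should prevent the main browser thread from locking up while the data is processed.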
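A minimal sketch of how this could look with the existing data file format, under several assumptions: `workerBody`, `startBookWorker`, and `onRows` are hypothetical names, the worker is created from a Blob so no extra file is needed, and the data files are reachable by absolute URL. Whether workers are permitted when the page is opened from the local file system varies by browser, so that case would need testing.

```javascript
// Code that runs inside the worker. `scope` is the worker's global object (`self`).
function workerBody(scope) {
  scope.onmessage = function (event) {
    var rows = [];
    // The data files call a global AddToBookData, so define one here that
    // collects rows locally instead of touching the page's BookData.
    scope.AddToBookData = function (index1, value1) {
      rows.push([index1, value1]);
    };
    scope.importScripts(event.data.url); // runs the file's AddToBookData(...) calls
    scope.postMessage({ url: event.data.url, rows: rows });
  };
}

// Main-thread side (browser only): builds the worker from workerBody's source
// and asks it to load each file. `onRows` is a hypothetical callback.
function startBookWorker(fileUrls, onRows) {
  var source = "(" + workerBody.toString() + ")(self);";
  var blob = new Blob([source], { type: "text/javascript" });
  var worker = new Worker(URL.createObjectURL(blob));
  worker.onmessage = function (event) {
    onRows(event.data.url, event.data.rows);
  };
  fileUrls.forEach(function (url) {
    // Relative paths would resolve against the blob: URL, so send absolute URLs.
    worker.postMessage({ url: new URL(url, document.baseURI).href });
  });
  return worker;
}
```

The page would call something like `startBookWorker(file_array, callback)` and append the received rows to `BookData[BookIndex]` in the callback. Posting the rows back to the page copies them, which has its own cost, so this is worth measuring rather than assuming it is faster overall.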

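For browsers without workers, you could consider increasing the setTimeout interval slightly to give the browser time to service the user as well as your JS. This will actually make things slightly slower, but it may increase user happiness when combined with the next point.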
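One way to make that interval easy to tune is to pass it in as a parameter. The sketch below is a hypothetical variant of the sequential loader from the question: `loadSequentially`, `pauseMs`, and `fakeLoad` are illustrative names, and `fakeLoad` only records the file name instead of fetching anything. Unlike the original, it loads files in list order rather than from the end.

```javascript
// Loads files one at a time, pausing `pauseMs` milliseconds before each load
// so the browser can respond to the user in between.
// `loadOne(file, done)` must call `done` when the file has finished loading.
function loadSequentially(files, pauseMs, loadOne, onDone) {
  var remaining = files.slice(); // copy so the caller's array is left untouched

  function next() {
    if (remaining.length === 0) {
      onDone();
      return;
    }
    var file = remaining.shift();
    setTimeout(function () {
      loadOne(file, next);
    }, pauseMs);
  }

  next();
}

// Stand-in loader for demonstration only: records the name and finishes at once.
var loadedFiles = [];
var allDone = false;
function fakeLoad(file, done) {
  loadedFiles.push(file);
  done();
}

loadSequentially(["a.js", "b.js", "c.js"], 5, fakeLoad, function () {
  allDone = true;
  console.log("loaded " + loadedFiles.length + " files");
});
```

With jQuery, `loadOne` could be `function (file, done) { $.getScript(file, done); }`, and the pause can then be raised from 1 ms to whatever value keeps the page responsive.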

Providing feedback of progress

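For both worker-capable and worker-deficient browsers, take some time to update the page with a progress indicator. You know how many files you have left to load, so progress should be fairly consistent, even though things may actually run slightly slower.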
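Because the number of files is known up front, the progress calculation itself is simple. The following sketch uses hypothetical names (`createProgressReporter`, `render`); here `render` just logs, whereas in the page it would update a progress bar element.

```javascript
// Returns a function to call once per loaded file; it reports percent complete.
// `render` is a hypothetical callback that would update the progress display.
function createProgressReporter(totalFiles, render) {
  var loaded = 0;
  return function fileLoaded() {
    loaded++;
    var percent = Math.round((loaded / totalFiles) * 100);
    render(percent, loaded, totalFiles);
    return percent;
  };
}

var report = createProgressReporter(4, function (percent, loaded, total) {
  console.log("Loaded " + loaded + " of " + total + " files (" + percent + "%)");
});
report(); // first file finished
report(); // second file finished
```

Calling the reporter from the per-file callback keeps the display in step with the loading loop.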

Lazy Loading

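As suggested in a comment: if Google Instant can search the entire web as we type, is it really not possible to have the server return a file with all locations of the search keyword within the current book? That file should be much smaller and faster to load than the locations of all words within the book, which is what I assume you are currently trying to load as quickly as you can.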

Answers

I tested three methods of loading the same 9,000,000-point data set in a browser (version 3.64).

1) Stephen's GetJSON method
2) My function-based push method
3) My pre-processed array appending method

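I ran my tests two ways. The first iteration imported 100 files, each containing 10,000 rows of data, with each row holding 9 data elements: [0,1,2,3,4,5,6,7,8]

For the second iteration I tried combining the files, so that I was importing 1 file with 9 million data points.

This is far larger than the data set I'll be using, but it helps demonstrate the speed of the various import methods.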

Method         Separate files (s)   Combined file (s)

JSON           34                   34
FUNC-BASED     17.5                 24
ARRAY-BASED    23                   46

Interesting results, to say the least. I closed the browser after loading each webpage and ran each test 4 times to minimize the effect of network traffic and variation (the tests ran across a network, using a file server). The numbers shown are averages, although the individual runs differed by only a second or two at most.




