Scrape web page and retrieve javascript variables

I need to scrape a web page that has a JavaScript array embedded in an inline script, for example:

<script>
    var videos = new Array();
    videos[0] = "http://myvideos.com/video1.mov";
    videos[1] = ....
    ....
</script>

What is the easiest way to approach this, so that I end up with a PHP array of these videos?

Edit: All videos are .mov extension.

Best answer

This is a bit more complicated, but it picks up only the links that actually appear in the form videos[0] = "http://myvideos.com/video1.mov";

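// $original holds the page source, e.g. $original = file_get_contents($url);
// Collapse line breaks so the pattern can match across lines.
$tmp = str_replace(array("\r", "\n"), ' ', $original);
// Capture the run of "videos[N] = ..." assignments inside the <script> block.
$pattern = '/<script>\s+var videos.*?((\s*videos\[\d+\] = .http:\/\/.*?;\s*?)+)(.*?)<\/script>/';
$a = preg_match_all($pattern, $tmp, $matches);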
unset($tmp);

if (!$a) die("no matches");

$pattern = '/videos\[\d+\] = /';
$matches=preg_split($pattern,$matches[1][0]);

$final=array();
while(sizeof($matches)>0) {
  $match=trim(array_shift($matches));
  if ($match == '') continue;
  $final[]=substr($match,1,-2);
}
unset($matches);

print_r($final);
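
The same result can also be reached in a single pass by capturing the index and the URL of each assignment. The following is a minimal sketch, not the answerer's code: the helper name extract_videos and the sample $html string are hypothetical, and it assumes every entry is written as videos[N] = "..." or videos[N] = '...' with a quoted value.

```php
<?php
// Hypothetical helper: returns the URLs keyed by their JavaScript array index.
function extract_videos(string $html): array
{
    // Capture the numeric index, the opening quote, and the quoted value;
    // \2 requires the closing quote to match the opening one.
    $pattern = '/videos\[(\d+)\]\s*=\s*(["\'])(.*?)\2\s*;/';
    $videos = array();
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $videos[(int) $m[1]] = $m[3];
        }
    }
    return $videos;
}

// Sample input standing in for the downloaded page (assumption).
$html = <<<'HTML'
<script>
    var videos = new Array();
    videos[0] = "http://myvideos.com/video1.mov";
    videos[1] = 'http://myvideos.com/video2.mov';
</script>
HTML;

print_r(extract_videos($html));
```

Keeping the JavaScript index as the PHP array key preserves the original order even if the assignments appear out of sequence in the page.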

After feedback from the OP, here is a simplified version:

$original=file_get_contents($url);
$pattern = '/http:\/\/.*?\.mov/';
$a=preg_match_all($pattern,$original,$matches);
if (!$a) die("no matches");
print_r($matches[0]);

Answers

You can scrape this by reading the page with file_get_contents and then retrieving the URLs with a regex. This is the simplest way I know, especially if you know the file extensions of your videos. Example:

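<?php
// Download the page (requires allow_url_fopen to be enabled).
$file = file_get_contents('http://google.com');
// Match http:// URLs whose host name ends in .fr or .com.
$pattern = '/http:\/\/([a-zA-Z0-9.-]+\.(?:fr|com))/i';
preg_match_all($pattern, $file, $matches);
var_dump($matches);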
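
When the file extension is known, the pattern can be built around it instead of around the domain. This is only a sketch under assumptions: the helper name extract_urls_by_extension and the $sample string are hypothetical, and it assumes the URLs contain no whitespace, quotes, or angle brackets.

```php
<?php
// Hypothetical helper: collect every http(s) URL that ends in the given extension.
function extract_urls_by_extension(string $html, string $ext): array
{
    // preg_quote escapes the extension so it is matched literally.
    $pattern = '/https?:\/\/[^\s"\'<>]+?\.' . preg_quote($ext, '/') . '\b/i';
    preg_match_all($pattern, $html, $matches);
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($matches[0]));
}

// Sample input standing in for the page source (assumption).
$sample = 'videos[0] = "http://myvideos.com/video1.mov"; videos[1] = "http://myvideos.com/video2.mov";';
print_r(extract_urls_by_extension($sample, 'mov'));
```

Excluding quotes and whitespace from the URL body keeps a match from running across two neighbouring assignments, which a bare .*? can do when a non-matching URL precedes a matching one.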



