PhantomJS and getting modified DOM

I'm developing a tool that needs to download a web page from a third-party server, execute it as a browser would, and then parse the HTML. What I struggle with is that the tool needs to parse the HTML after all the JavaScript has executed and the DOM has been modified. I'm trying to use PhantomJS for this purpose, and it works on small snippets (just a tiny HTML document with external JavaScript that adds some nodes to the DOM), but when I do the same with a real site (http://www.dba.dk/) I'm not getting the final HTML with all the modifications made by the JS code.

I really need help on this as I have been stuck with it for more than a week.

My PhantomJS code is simple:

// phantom.state is empty on the first pass and set once a page load has been requested.
if (phantom.state.length === 0) {
     if (phantom.args.length === 0) {
             console.log('Usage: test.js <some URL>');
             phantom.exit();
     } else {
             var address = phantom.args[0];
             phantom.state = Date.now().toString();
             phantom.viewportSize = { width: 1280, height: 800 };
             phantom.open(address);
     }
} else {
     // Second pass: the page has been loaded (or the load has failed).
     var elapsed = Date.now() - new Date().setTime(phantom.state);
     if (phantom.loadStatus === 'success') {
             if (!first_time) {
                     var first_time = true;
                     if (!document.addEventListener) {
                             console.log('Not SUPPORTED!');
                     }
                     phantom.render('result.png');
                     // Dump the current DOM as markup.
                     var markup = document.documentElement.innerHTML;
                     console.log(markup);
                     phantom.exit();
             }
     } else {
             console.log('FAIL to load the address');
             phantom.exit();
     }
}

The HTML dumped to the console doesn't contain the dynamically generated content.
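For readers on a later PhantomJS release, the same flow could be sketched with the `webpage` module and a small polling helper that waits until a script-generated node shows up before dumping the markup. This is only a sketch under assumptions: it presumes a PhantomJS version that ships the `webpage` and `system` modules, `waitFor` is a hypothetical helper written here for illustration, the `#listing` selector is a placeholder for whatever node the target page's scripts create, and the 10-second timeout is arbitrary.

```javascript
// Hypothetical helper: polls `condition` every `intervalMs` milliseconds until it
// returns a truthy value or `timeoutMs` has elapsed, then calls `done(ok)` once.
function waitFor(condition, done, timeoutMs, intervalMs) {
    var start = Date.now();
    var timer = setInterval(function () {
        var ok = !!condition();
        if (ok || Date.now() - start >= timeoutMs) {
            clearInterval(timer);
            done(ok);
        }
    }, intervalMs || 100);
}

// The part below only runs under PhantomJS; elsewhere the guard skips it.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var address = require('system').args[1];
    page.viewportSize = { width: 1280, height: 800 };
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('FAIL to load the address');
            phantom.exit(1);
            return;
        }
        waitFor(function () {
            // Assumption: '#listing' stands in for a node added by the page's scripts.
            return page.evaluate(function () {
                return document.querySelector('#listing') !== null;
            });
        }, function (ok) {
            // Dump the markup whether or not the node appeared; signal which via exit code.
            console.log(page.content);
            phantom.exit(ok ? 0 : 1);
        }, 10000);
    });
}
```

Polling for a specific node rather than sleeping for a fixed time keeps the script from exiting early on slow pages, but it only helps when the missing content is a timing issue.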

Best answer

The problem was the Flash plugin: the pages were detecting its absence. Once the plugin was loaded correctly, the problem was gone.
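To check whether a page running inside PhantomJS can see Flash at all, something along the following lines might help. It is a sketch under assumptions: `hasFlash` is a hypothetical helper, the MIME-type lookup is one common way pages tested for Flash and is not taken from the site in the question, and the PhantomJS part presumes a release that provides the `webpage` and `system` modules.

```javascript
// Hypothetical helper: reports whether a navigator-like object advertises an
// enabled Flash plugin. Falls back to the global `navigator` when called
// without an argument, which is how it runs inside the page context.
function hasFlash(nav) {
    nav = nav || navigator;
    var types = nav.mimeTypes || {};
    var mime = types['application/x-shockwave-flash'];
    return !!(mime && mime.enabledPlugin);
}

// The part below only runs under PhantomJS; elsewhere the guard skips it.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var address = require('system').args[1];
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('FAIL to load the address');
            phantom.exit(1);
            return;
        }
        // page.evaluate runs the function inside the loaded page, so
        // `navigator` there is the page's own navigator object.
        var flash = page.evaluate(hasFlash);
        console.log('Flash visible to the page: ' + flash);
        phantom.exit();
    });
}
```

If this prints `false`, scripts that branch on Flash support would likely take their fallback path, which could explain markup that differs from what a desktop browser shows.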
