How do I extract web wiki pages, which are password protected?

I wish to download a few password-protected wiki pages, along with the sub-links on those pages. I have the user name and password and can access the pages from a normal browser. Since I want to save these pages to my local drive for later reference, I am using wget to fetch them:

wget --http-user=USER --http-password=PASS http://mywiki.mydomain.com/myproject

But the above is not working: it asks for the password again. Is there a better way to do this, without the system repeatedly prompting for the password? Also, what is the best option for fetching all the links and sub-links on a particular page and storing them in a single folder?
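For the second part of the question, wget's recursive options can pull a page and its sub-links into one local folder. A sketch (the depth `2` and folder name `mywiki` are placeholders; the flags themselves are standard wget options):

```shell
# Fetch the page and its sub-links two levels deep, rewriting links
# for offline viewing, without ascending above the starting path,
# and saving everything under ./mywiki
wget --http-user=USER --http-password=PASS \
     --recursive --level=2 \
     --convert-links --page-requisites \
     --no-parent \
     --directory-prefix=mywiki \
     http://mywiki.mydomain.com/myproject
```

Note that `--http-user`/`--http-password` only cover HTTP basic auth, so this will still fail if the site uses a form-based login.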

Update: The actual page I am trying to access is behind an HTTPS gateway, and its certificate is not getting validated. Is there any way to get through this?

mysystem-dsktp ~ $ wget --http-user=USER --http-password=PASS https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f
--2010-01-24 18:09:21--  https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f
Resolving secure.site.mydomain.com... 124.123.23.12, 124.123.23.267, 124.123.102.191, ...
Connecting to secure.site.mydomain.com|124.123.23.12|:443... connected.
ERROR: cannot verify secure.site.mydomain.com's certificate, issued by `/C=US/O=Equifax/OU=Equifax Secure Certificate Authority':
  Unable to locally verify the issuer's authority.
To connect to secure.site.mydomain.com insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.

I also tried the --no-check-certificate option, but it does not work: with it I only get the login page, not the actual page I requested.

Best answer

Could you try it like this?

wget http://USER:PASSWD@mywiki.mydomain.com/myproject
Other answers

It seems you're trying to access a page secured by a login form.

You could use the --no-check-certificate option and follow the suggestions in this forum thread: Can't log in with wget.
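Since the gateway redirects to a /login URL, HTTP basic-auth flags won't help; a common approach is to POST the login form once, save the session cookie, and reuse it for subsequent requests. A sketch, assuming the form's field names are `username` and `password` and that it posts back to the /login URL (inspect the login page's HTML for the real field names and form action):

```shell
# Step 1: submit the login form and save the session cookie.
# "username"/"password" and the action URL are assumptions --
# check the actual <form> on the login page.
wget --no-check-certificate \
     --save-cookies cookies.txt --keep-session-cookies \
     --post-data 'username=USER&password=PASS' \
     -O /dev/null \
     'https://secure.site.mydomain.com/login'

# Step 2: fetch the wiki page, sending the saved cookie.
wget --no-check-certificate \
     --load-cookies cookies.txt \
     http://mywiki.mydomain.com/myproject
```

`--keep-session-cookies` matters here: many login systems issue session cookies with no expiry, which wget would otherwise discard when saving cookies.txt.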





Related questions

Limit WGET ing by timer, how to do this?

Sorry for my English (I'm Russian). I save an MJPEG stream from an IP camera with wget: wget -O 2010-01-12_01.mjpeg http://172.16.1.220:8070/video.mjpg I need to limit saving to an hour (every hour is another file ...

Why does WGET return 2 error messages before succeeding?

I am using a script to pull down some XML data from an authentication-required URL with wget. In doing so, my script produces the following output for each URL accessed (IPs and hostnames changed to ...

How to grab live text from a URL?

I'm trying to grab all data (text) coming from a URL which is constantly sending text. I tried using PHP, but that would mean having the script running the whole time, which it isn't really made for (I ...

Confirmation of Successful HTTP Download in Python

Is there an easy and reliable way to confirm that a web download completed successfully, using Python or wget [for large files]? I want to make sure the file downloaded in its entirety ...

Controlling wget with PHP

I'm writing a command-line PHP console script to watch for new URLs and launch (large) downloads for a client's project I am working on. The client is currently manually downloading them with a ...

store wget link into a database (php)

I'm trying to find a solution to automatically download .flv links every day from a website using wget, and to store all the links in a database to stream them on my website (all in PHP). How to do ...
