Find the bad links
  • Time: 2012-05-28 14:16:41
  • Tags: bash, shell

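I have a list of about 6k links. I need to go through each link and check whether the page it leads to contains specific words.

What is the easiest way to do this?

Best answer

A quick and dirty solution: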

#!/bin/bash
# Write to found.lst every link whose page contains a word from words.lst
while IFS= read -r link ; do
    wget -qO- "$link" | grep -qiFf words.lst - && echo "$link"
done < links.lst > found.lst

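The links should be saved in links.lst, one link per line. The words should be saved in words.lst, one word per line. The matching links are written to found.lst.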
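To see what the grep flags in that pipeline do without touching the network, here is a minimal offline sketch. The helper name page_matches and the sample data are made up for illustration; the temporary file stands in for words.lst and the printf output stands in for a downloaded page.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the text on stdin contains any word
# listed (one per line) in the file given as the first argument.
page_matches() {
    # -q: no output, exit status only; -i: ignore case;
    # -F: treat patterns as fixed strings; -f FILE: read patterns from FILE
    grep -qiFf "$1" -
}

# Sample data standing in for words.lst and for a downloaded page
words_file=$(mktemp)
printf '%s\n' 'error' 'not found' > "$words_file"

if printf '%s\n' '<h1>404 Not Found</h1>' | page_matches "$words_file"; then
    hit_result="match"
else
    hit_result="no match"
fi

if printf '%s\n' '<h1>Welcome</h1>' | page_matches "$words_file"; then
    miss_result="match"
else
    miss_result="no match"
fi

echo "first page: $hit_result"
echo "second page: $miss_result"
rm -f "$words_file"
```

One thing to watch for: an empty line in the word list is an empty pattern, which matches every line, so a stray blank line in words.lst would make every non-empty page count as a hit.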

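Other answers

I made one for you.

Create a file named words.txt containing the words to check, separated by spaces.

Create a file named links.url containing the list of URLs to check, one per line.

Create a file named raperer.sh containing the following script: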

#!/bin/bash

# A file with a list of URLs, one per line
LINKS_FILE="links.url"
# A file with a list of words separated by spaces
WORDS_FILE="words.txt"

HTTP_CLIENT="/usr/bin/wget -O - "

rm -f /tmp/temp.html
while IFS= read -r link
do
        # Download the page
        echo "--"
        echo "Scanning link: $link"
        $HTTP_CLIENT "$link" > /tmp/temp.html
        if [ $? -ne 0 ]
        then
                echo "## Problem downloading resource $link" 1>&2
                continue
        fi

        # Check each word; move on to the next link at the first match
        for word in $(cat "$WORDS_FILE")
        do
                echo "Checking for the word \"$word\"..."
                if grep -qi -- "$word" /tmp/temp.html
                then
                        echo "** The word $word was found in the URI \"$link\""
                        continue 2
                fi
        done
        echo "** No words found in \"$link\""
        echo "--"
        echo
done < "$LINKS_FILE"
rm -f /tmp/temp.html

Run the script.

You can write a script that visits each URL and then checks whether the words appear on those pages.

Not the fastest way, but a start:

#!/bin/bash

while IFS= read -r url
do
    content=$(wget -q -O - "$url")

    # and here you can check
    # if there are matches in $content

done < "links.txt"
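The check left open in the comments above could be filled in along these lines. This is only a sketch: the function name content_has_word and the sample words are hypothetical, and the demonstration uses a hard-coded string in place of a real download so that it runs offline.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if $1 (page content) contains any of the
# remaining arguments as a case-insensitive fixed string.
content_has_word() {
    local content=$1 word
    shift
    for word in "$@"; do
        if grep -qiF -- "$word" <<< "$content"; then
            return 0
        fi
    done
    return 1
}

# Offline demonstration with a stand-in for the downloaded page
content='<html><body>Page Not Found</body></html>'
if content_has_word "$content" "error" "not found"; then
    demo="match"
else
    demo="no match"
fi
echo "$demo"
```

Inside the loop, the call would go right after the content=$(...) line, for example: content_has_word "$content" "error" "not found" && echo "$url". Holding the whole page in a shell variable is convenient for 6k links of ordinary size, though piping wget straight into grep avoids the intermediate copy.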



