Crawling not working on Windows 2008

We installed a new MOSS 2007 farm in a Windows 2008 SP2 environment, with SQL 2008. The configuration is 1 index server, 1 front end (FE) and 1 server with SQL 2008, all on ESX 4.0. Every service that needs one runs under a dedicated user account, so search has its own dedicated user.

The installation went well and we found no problems. We installed MOSS SP1 from an ISO and then upgraded WSS and MOSS to SP2. We also installed the Italian language pack and patched it to SP2.

We created a new SSP. We created a web application and created a root website under it.

The problem is that we can't make crawling work in any way. It seems that the crawler cannot reach the web application that we want to crawl. In the event viewer on the index server we get this error when we try to crawl it:

The start address <h..p://name.domain.it:81> cannot be crawled.

Context: Application 'SSP1', Catalog 'Portal_Content'

Details:

The object was not found. (0x80041201)

The crawl log in the search administration pages only says:

h..p://name.domani.it:81
The object was not found. (The item was deleted because it was either not found or the crawler was denied access to it.)

The site is fully accessible from everywhere using either the farm admin user or the search user that the service runs as. It is also fully accessible from the index server and does not seem to have any problem. Inside the web application we created a root site collection with a couple of files.

The farm log simply says... nothing! When we start a full crawl of the site, it runs for a second and then we get the errors quoted above. But the farm's log says nothing.

Any suggestion or help is really appreciated, since we are losing a lot of time on this and really have no idea what's wrong with this farm.

Answer

This sounds like the well-known DisableLoopbackCheck problem. Try the workarounds in this article on the MS support site.
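For anyone who cannot reach the article: the workaround usually associated with this name is a DWORD registry value, DisableLoopbackCheck, set to 1 under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa on the server that requests its own host name (here, probably the index server). I am assuming that is what the linked article describes. As a rough sketch, the Python below only builds the text of a .reg file for that value; `build_reg_file` and the output file name are hypothetical names chosen for illustration.

```python
# Sketch only: builds a .reg file that sets DisableLoopbackCheck.
# Assumption: the workaround is the DisableLoopbackCheck DWORD under the
# Lsa key below. The function and file names are hypothetical.

LSA_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa"


def build_reg_file(disable_loopback_check: bool = True) -> str:
    """Return the text of a .reg file that sets DisableLoopbackCheck."""
    value = 1 if disable_loopback_check else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{LSA_KEY}]",
        f'"DisableLoopbackCheck"=dword:{value:08x}',
        "",
    ]
    # .reg files conventionally use CRLF line endings.
    return "\r\n".join(lines)


if __name__ == "__main__":
    # newline="" keeps the CRLF endings exactly as built above.
    with open("disable_loopback_check.reg", "w", newline="") as fh:
        fh.write(build_reg_file())
```

Import the generated file on the affected server as an administrator (or set the value by hand in regedit), then restart the server and run a full crawl again. Turning the check off is the blunt option. Listing only the affected host names in BackConnectionHostNames is generally considered the safer choice for production, so treat this as a quick test rather than a final configuration.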




