bdy wrote:
>
> If I use lwp-rget to retrieve a Web site, will it retrieve new pages
> added that may not be linked to?
>
> For example, the site www.123.com is composed of 10 pages, each of
> which is accessible through links on the site.
>
> But, two pages are added in the span of three days.
> www.123.com/fourteen-five.jsp
> and www.123.com/eight-nine.html.
>
> fourteen-five.jsp isn't accessible through any links on the site, but
> eight-nine.html is.
>
> Will lwp-rget be able to find both pages in addition to the 10
> original pages?
lwp-rget fetches only the page you specify, plus any pages linked from the
pages it has already read. It cannot discover a page that nothing links to, so
it will find eight-nine.html but not fourteen-five.jsp, unless you request that
URL explicitly.

Fetching an address without specifying a full URL usually returns either the
default web page for that address or a directory listing (or a 404 error), so

lwp-rget www.123.com

is usually the same as

lwp-rget http://www.123.com/index.htm

(assuming index.htm is the server's default page), and you will get the
index.htm file, all the resources that file links to, all that they in turn
link to, and so on.
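
To make the link-following idea concrete, here is a rough Python sketch. It is
not lwp-rget's actual code, just a breadth-first walk over a made-up link graph.
The SITE table, the /one.html page and the reachable() helper are assumptions
for illustration; only the two added page names come from the question.

```python
from collections import deque

# Hypothetical link graph standing in for www.123.com: page -> pages it links to.
# /one.html is invented for the example; a real site would have more pages.
SITE = {
    "/index.htm": ["/one.html", "/eight-nine.html"],
    "/one.html": ["/index.htm"],
    "/eight-nine.html": [],
    "/fourteen-five.jsp": [],  # exists on the server, but nothing links to it
}


def reachable(start, links):
    """Return every page reachable from start by following links breadth-first."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:  # skip pages already visited (handles link cycles)
                seen.add(target)
                queue.append(target)
    return seen


found = reachable("/index.htm", SITE)
print(sorted(found))
```

Starting from /index.htm, the walk reaches eight-nine.html because a page links
to it, but never sees fourteen-five.jsp. A crawler that only follows links has
no way to learn that URL exists.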
HTH,
Rob