WEBNEW(1)                                                    WEBNEW(1)

NAME
       webnew - Retrieve modification times of HTTP documents

SYNOPSIS
       webnew [-PRVadinrvx] [-A username:password] [-c type] [-e email]
              [-t title] URL

DESCRIPTION
       webnew produces a listing of URLs (web documents) sorted by the
       last modification time as reported by the HTTP server. By default
       it writes an HTML 2.0 document to standard output.

       The URL on the command line is used as a starting point. By
       default the URLs to include in the listing are extracted from the
       document specified by the URL. For a recursive search of URLs to
       include, see the -R and -r options.

OPTIONS
       -A username:password
              Use the provided username and password with basic
              authentication. This is only needed for password-protected
              documents.

       -P     Do not use proxies to access the documents. By default
              proxy definitions are taken from the standard environment
              variables.

       -R     Become a "robot" and turn on -r. To restrict the retrieval
              of documents, you can use a "/robots.txt" file on your
              server (the user agent name for webnew is "webnew").

       -V     Print the version of webnew and exit.

       -a     Use the text of the first anchor found pointing to each URL
              as the anchor text in the produced listing. The default is
              to prefer the title specified in the document. Using this
              option considerably speeds up non-recursive listings, as
              the individual documents are not retrieved at all.

       -c type
              Specify a regular expression to match against the
              content-type of documents included in the listing. The
              default is "text".

       -d     Output a trace of the stack of URLs to retrieve.
              Automatically turns on -v.

       -e email
              Use the given email address in the HTTP requests. Also
              causes a <LINK REV=MADE> tag to be included in the HTML
              output.

       -i     Only output the unordered URL items. This produces HTML
              that should not be served as a standalone document. It is
              intended for including the output inside another HTML
              file.

       -n     Report URLs for which no modification date was retrieved.

       -r     Use the specified URL as the initial URL to include in the
              listing. Then retrieve that document and extract URLs from
              it to be further included and retrieved. Only URLs
              beginning with the initial URL will be retrieved (to avoid
              infinite listings). This is very useful for completely
              automatic "what's new" listings.

       -t title
              Set the title and top-level heading to the given text. The
              default title is "What's new".

       -v     Show the URLs of retrieved documents and their modification
              times (if reported by the server). If a URL was not
              searched for more links, the reason is reported in
              parentheses.

       -x     Exclude pointers to the home page of webnew from the
              output. If you use this option, please make sure you
              provide a pointer to the home page in some other fashion.
              The URL for webnew is http://www.tac.nyc.ny.us/kim/webnew/
              and it will always contain a pointer to the most recent
              version of the software as well as installation and use
              instructions.

EXAMPLES
       mv new.html old.html
       webnew -a http://www.tac.nyc.ny.us/kim/old.html > new.html

       webnew -r http://www.tac.nyc.ny.us/kim/ > new.html

BUGS
       No known bugs.

AUTHOR
       Kimmo Suominen <kim at tac.nyc.ny.us>

SEE ALSO
       urlget(1)

       Please read the document "A Standard for Robot Exclusion" for more
       information on restricting robots.
       http://www.robotstxt.org/wc/norobots.html

TAC 1.2                        6 May 1996                            1
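
The sorting and scoping behaviour described above can be illustrated with a short Python sketch. This is not webnew's own code: the function names (parse_last_modified, in_scope, sort_listing) are hypothetical, and the newest-first ordering is an assumption, since this page does not say which direction the listing is sorted in. The sketch takes (URL, Last-Modified header) pairs that have already been fetched, sorts the dated ones, and sets aside the URLs without a usable date, roughly what -n would report. The prefix check mirrors the -r rule that only URLs beginning with the initial URL are retrieved.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def parse_last_modified(header):
    """Return an aware datetime for a Last-Modified value, or None."""
    if not header:
        return None
    try:
        parsed = parsedate_to_datetime(header)
    except (TypeError, ValueError):
        # Unparseable header: treat it as "no modification date".
        return None
    if parsed.tzinfo is None:
        # Assumption: a date without a usable zone is taken as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def in_scope(url, start_url):
    """Mimic the -r restriction: keep only URLs beginning with start_url."""
    return url.startswith(start_url)


def sort_listing(entries, newest_first=True):
    """Split (url, header) pairs into date-sorted URLs and undated URLs.

    newest_first=True is an assumption about the listing order.
    """
    dated, undated = [], []
    for url, header in entries:
        when = parse_last_modified(header)
        if when is None:
            undated.append(url)
        else:
            dated.append((when, url))
    dated.sort(key=lambda pair: pair[0], reverse=newest_first)
    return [url for _, url in dated], undated
```

Called with a list such as [("http://example.org/a.html", "Mon, 06 May 1996 10:00:00 GMT"), ("http://example.org/b.html", None)] (placeholder URLs), sort_listing returns the dated URLs in order together with the undated ones. A real implementation would also need to fetch the headers and honour "/robots.txt", which this sketch leaves out.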