Add clickthecity.com Metro Manila Movie Guide; note: huge site
barrygonzaga
site_samples/palmsized/inq7-mobile.site
2005-08-08
3-level inq7.net site
barrygonzaga
site_samples/regional_philippines/
2005-08-08
inq7.site, pdi.site: replace pdi.site with inq7.site
barrygonzaga
site_samples/linux/gwn.site
2005-08-08
add logo imageurl; update authoremail
barrygonzaga
site_samples/business/businessweek.site
2005-08-08
Reflect web site title, update author email
barrygonzaga
site_samples/palmsized/
2005-08-08
ny_times.site, salon.site: remove nonworking site
akkana
lib/Sitescooper/Main.pm
2005-07-06
Add << ^^ >> links at end of story as well as beginning
akkana
site_samples/
2005-07-06
lib/layouts.site, humor/jon_carroll.site, news/wired_news/wired_news_politics.site, opinion/salon.site, science/new_scientist_news.site, tech/newsforge.site: Some updates for sites that have changed.
akkana
site_samples/regional_boston/bostonglobe.site
2005-07-06
New site: Boston Globe City & Region sections. From Bruce Zohn
akkana
site_samples/
2005-07-06
news/USNews.site, news/newsweek_intl.site, tech/pcmag_images.site: Updates from BoonNam Goh
akkana
site_samples/science/new_scientist_news.site
2005-01-26
Changes to track the recent site changes
akkana
site_samples/regional_israel/
2005-01-17
haaretz.site, jpost-columns.site, jpost-international.site, jpost-israel.site, jpost-me.site, jpost-opinion.site: David Resnick: Jerusalem Post and Haaretz site files
B. M. Sleight: minor changes to pick up ask.slashdot.org and it.slashdot.org
akkana
site_samples/weblog/kevin_sites.site
2005-01-05
New site from Delmer Wells: Kevin's War Blog
akkana
site_samples/tech/pcmag_images.site
2005-01-05
Goh Boon Nam: Update to track site changes and grab images better
akkana
site_samples/business/the_economist.site
2005-01-05
Goh Boon Nam: Remove subscription-only pages, which cause problems for Plucker
akkana
site_samples/
2005-01-05
humor/dave_barry.site, linux/debian_weekly_news.site, news/wired_news/wired_news_tech.site, tech/newsforge.site, tech/the_register.site, weblog/riverbend.site: Updates to track changes in the web sites
boondocks.site, doonesbury.site, tedrall.site: New comics from Ignatz Sol
akkana
site_samples/
2004-04-25
news/newsweek_intl.site, tech/pcmag_images.site: Updates from Goh Boon Nam
akkana
site_samples/humor/dave_barry.site
2004-04-25
Update from Alan Hoyle: fix story start, end, headline
cwerner
site_samples/opinion/pulpit.site
2004-04-23
New site for Bob Cringely's weekly column: The Pulpit. This is the same site scooped by i_cringely.site, except that the old i_cringely site did a 2-level scoop that attempted to get a set of columns, whereas the new one gets a single column and only on Fridays. The old one can probably be removed, but I didn't want to mess with it in case someone is relying on it.
Improved support for isiloXC:
1. Added a new param to sitescooper.cf, "ISiloDefaultIxlFile", that points to an .ixl file in the file system. This means that users can change the iSiloX options by using the iSiloX GUI tool to create a new document, change all the options, then save as a .ixl file. The
and
tags of the document are stripped and replaced by sitescooper, but the rest is used for generating the isilox pdb. More details are given in the comments in sitescooper.cf. The most likely common use for this is to allow the users of -isilox to specify global settings for things like image depth, color, inclusion, dithering etc., and perhaps for category too.
2. Added a new site param called "ExtraISiloIxlTags", to allow ixl settings specific to a site. Updated doc/site_params.html, so see this for more details. This is a little different in that the user has to specify a set of top-level tags for the .ixl file. These get appended to the generated file, thus overriding the defaults (or overriding the global options if the new config param is used). This takes advantage of the fact that isilox tolerates the tags appearing more than once by simply taking the last tag and ignoring earlier copies (or at least its xml parser does).
So you can set general options in your .ixl file and override specific options in the .site files. The fact that you have to override the whole tag such as
means that you can't override, say, bitdepth separately from dithering, but it's still pretty powerful. And simpler and more durable (i.e. resistant to changes in isilox) than adding a bunch of new site params.
Modified Files: sitescooper/sitescooper.cf, sitescooper/doc/site_params.html, sitescooper/lib/Sitescooper/Main.pm, sitescooper/lib/Sitescooper/SCF.pm
Added Files: sitescooper/default_isilox.ixl
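The "last tag wins" override described in this entry can be illustrated with a minimal Python sketch. It is not sitescooper's Perl code and not iSiloX's parser: the function name `effective_tags` and the tag and attribute names (`ImageOptions`, `Category`, `depth`, `dither`) are hypothetical, chosen only to show why a site-level tag replaces the global one as a whole.

```python
import xml.etree.ElementTree as ET


def effective_tags(global_tags: str, site_tags: str) -> dict:
    """Map each top-level tag name to the attributes of its last occurrence."""
    # Wrap the fragments in a dummy root so they parse as one document.
    root = ET.fromstring("<ixl>" + global_tags + site_tags + "</ixl>")
    chosen = {}
    for element in root:
        # A later duplicate replaces the earlier one wholesale.
        chosen[element.tag] = dict(element.attrib)
    return chosen


# Hypothetical tag and attribute names, for illustration only.
GLOBAL_TAGS = '<ImageOptions depth="2" dither="yes"/><Category name="News"/>'
SITE_TAGS = '<ImageOptions depth="4"/>'

result = effective_tags(GLOBAL_TAGS, SITE_TAGS)
print(result["ImageOptions"])  # {'depth': '4'}
print(result["Category"])      # {'name': 'News'}
```

In this sketch the site-level tag changes the depth but also loses the global dither setting, because the whole tag is replaced rather than merged attribute by attribute.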
jmason
lib/Sitescooper/
2004-02-19
Robot.pm, StoryURLProcessor.pm: some glitches in RSS output fixed; now does not search for sub-stories after html_to_text conversion
jmason
site_samples/science/new_scientist_news.site
2004-02-18
New Scientist News site updated
akkana
site_samples/
2004-02-16
cinema/ebert_1min.site, cinema/roger_ebert.site, humor/dave_barry.site: Contributions from Alan Hoyle, alanh at email.unc.edu
jmason
lib/Sitescooper/
2004-02-13
Main.pm, SCF.pm: added patch from Robert Fuhge, robert.fuhge.at.epost.de, to assign categories to Plucker documents using the Category: line in the site file
jmason
site_samples/tech/risks.site
2004-02-13
updated risks.site to use new 'mobile device' rendering
akkana
site_samples/business/the_economist.site
2004-02-11
The Economist, from BoonNam Goh
akkana
site_samples/news/
2004-02-11
newsweek.site, newsweek_intl.site: Newsweek updates from BoonNam Goh
jmason
site_samples/security/
2004-02-07
crypto_gram.site, crypto_gram.site: cryptogram site fixed
jmason
lib/Sitescooper/Robot.pm
2004-01-31
handle undef headlines
jmason
lib/Sitescooper/Robot.pm
2004-01-31
oops; RSS output headline was not being HTML-encoded correctly
akkana
site_samples/
2003-11-15
tech/computer_world.site, news/newsweek_intl.site: Contributions from BoonNam Goh
barrygonzaga
site_samples/linux/gwn.site
2003-11-04
add Gentoo Weekly News
akkana
site_samples/
2003-10-31
news/Newsweek.site, news/NewsweekIntl.site, regional_israel/jpost.site: Remove inconsistently named files
akkana
site_samples/news/
2003-10-31
newsweek.site, newsweek_intl.site: Newsweek, from Goh Boon Nam
akkana
site_samples/regional_israel/jerusalem_post.site
2003-10-31
Jerusalem Post, from David Resnick
akkana
site_samples/tech/wiredmag.site
2003-10-31
Previous commit only got one specific date. So I've substituted my own Wired site file, which doesn't get entire stories yet, but it does get Wired every day.
akkana
site_samples/tech/wiredmag.site
2003-10-31
One issue of Wired Magazine, from richard_html2pdb at yahoo dot com
akkana
site_samples/tech/pcmag_images.site
2003-10-31
Update from Goh Boon Nam: Get full-sized images
akkana
site_samples/news/
2003-10-31
Newsweek.site, NewsweekIntl.site: Newsweek updates (US and Intl) from BoonNam Goh
akkana
site_samples/regional_israel/jpost.site
2003-10-31
Jerusalem Post, from David Resnick
akkana
site_samples/news/
2003-10-29
Newsweek.site, USNews.site: New sites contributed by BoonNam Goh
cinema/ebert_answer_man.site, cinema/ebert_features.site, cinema/ebert_great_movies.site, cinema/roger_ebert.site, opinion/nro.site: updated sites from John Straw
jmason
site_samples/regional_germany/
2003-06-10
de_cert.site, de_cyberkino.site, de_gazette.site, de_heise_mobil.site, de_heise_tp.site, de_heute.site, de_pdassi_news.site, de_pdassi_software.site, de_spiegel.site, de_stern.site, de_tagesschau.site, de_teltarif.site, de_tvspielfilm.site, mobile2day.site, palmfaq_de.site, pda_debitel_net.site, windows2000faq.site, zdnet_news.site, bundesregierung.site: a whole lot of new regional_germany sites from Stefan Schwingeler
business/hottips.site, linux/linuxplaza.site, opinion/feed.site, regional_germany/de_spiegel.site, regional_north_carolina/weather24_raleigh.site: more dead sites pruned
jmason
site_samples/
2002-01-18
languages/aspwire.site, languages/news_perl_org.site, languages/perlmonth.site, languages/sqlwire.site, languages/vbwire.site, opinion/simson_garfinkel.site, tech/sendmail_net.site: removed lots of dead sites
bsd/openbsd_journal.site, palm/palminfocenter.site, palmsized/cnn.site, palmsized/ny_times_handheld.site, palmsized/the_register.site: site files from Barry Dexter A. Gonzaga
Guardian site updated by Stewart C. Russell (stewart /at/ ref.collins.co.uk)
jmason
site_samples/business/businessweek.site
2001-09-06
oops, forgot busweek
jmason
site_samples/
2001-09-05
palm/pdalive.site, palmsized/ny_times.site, palmsized/salon.site, news/gallup_poll.site, palm/palminfocenter.site: added sites from Barry Dexter A. Gonzaga
jmason
site_samples/regional_denmark/politiken.site
2001-08-27
added Politiken site from Claus Hindsgaul
jmason
lib/Sitescooper/UserAgent.pm
2001-08-20
fixed http auth support
jmason
site_samples/regional_toronto/
2001-08-18
globe_and_mail_columnists.site, globe_and_mail_national.site, globe_and_mail_thearts.site, globe_and_mail_toronto.site: globe+mail sites updated by Michael Graham (magog@the-wire.com)
jmason
site_samples/regional_california/
2001-08-17
la_times.site, latimes_nat.site, latimes_oc.site, la_times/la_times_frontpage.site, la_times/latimes_local.site, la_times/latimes_nat.site, la_times/latimes_oc.site, la_times/latimes_science.site, la_times/latimes_tech.site, la_times/latimes_world.site: added new LA Times sites from Mark Beckman (mbeckman at jps.net), and reorganized them into a directory
business/cnn_financial.site, news/cnn_mobile.site, science/sciam.site, sport/cnn_sports.site: added SciAm site from Marko, and some CNN sites from David's PODS system translated by Marko
jmason
lib/Sitescooper/Main.pm
2001-06-28
added support for escaped-hashes in site files from Jeff Hecker