[Crossfire-wiki] [Crossfire DokuWiki] page added: user:mhoram:scripts:wiki_orphans

no-reply_wiki at metalforge.org
Wed Dec 20 13:49:27 CST 2006


A page in your DokuWiki was added or changed. Here are the details:

Date        : 2006/12/20 13:49
User        : mhoram
Edit Summary: created

This is a Perl script that walks the Crossfire DokuWiki index, parses the links on every page, and reports any orphaned pages: those that no other DokuWiki page links to.  (In DokuWiki parlance, they have no backlinks.)  Unless these pages are under heavy construction, they should probably be linked from somewhere so they can be found, or deleted if no longer needed.

This was a fairly quick hack, so it doesn't have all the error checking that a finished script should, but I'm out of time for today, and it does appear to work.  Thanks to [[user:Rednaxela]] for the idea.
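
The orphan check described above comes down to counting backlinks. The following is only a minimal sketch of that idea on an in-memory link graph, not the script itself: the page names and the ''find_orphans'' helper are made up for illustration, and no wiki is contacted.

<code perl>
use strict;
use warnings;

# Hypothetical link graph: each page maps to the pages it links to.
my %links_from = (
  'start'     => [ 'guide', 'faq' ],
  'guide'     => [ 'start', 'faq' ],
  'faq'       => [ 'faq' ],        # a self-link, which should not count
  'old_draft' => [ 'start' ],
);

# Return the sorted list of pages that no *other* page links to.
sub find_orphans {
  my( $graph ) = @_;
  my %backlinks = map { $_ => 0 } keys %$graph;
  for my $from ( keys %$graph ){
    for my $to ( @{ $graph->{$from} } ){
      next if $to eq $from;        # ignore self-links
      $backlinks{$to}++;
    }
  }
  my @orphans = sort grep { !$backlinks{$_} } keys %backlinks;
  return @orphans;
}

print "$_\n" for find_orphans( \%links_from );
</code>

In this made-up graph only ''old_draft'' has no incoming links, so it is the only page printed.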

===== Requirements =====
  * Perl
  * LWP::Simple (a Perl module)
  * A reasonably fast internet connection.  It takes a few minutes over my wireless connection, so it would probably take hours over dialup.
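
One way to check the module requirement before running the script is sketched below. The ''module_available'' helper is hypothetical, and the suggested ''cpan'' command assumes the standard CPAN client is installed; your system's package manager may be a better choice.

<code perl>
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub module_available {
  my( $module ) = @_;
  ( my $file = "$module.pm" ) =~ s{::}{/}g;   # LWP::Simple -> LWP/Simple.pm
  return eval { require $file; 1 } ? 1 : 0;
}

for my $module ( 'LWP::Simple' ){
  print "$module: ",
    ( module_available($module) ? "installed" : "missing (try: cpan $module)" ),
    "\n";
}
</code>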

===== Code =====
<code perl>
#!/usr/bin/perl
use strict;
use warnings;

# This script walks the index tree of the Crossfire dokuwiki
# and counts the number of links to each page.  Those with zero
# links are reported as orphans.

use LWP::Simple;
use Data::Dumper;   # just in case I need it

my $base_url  = 'http://wiki.metalforge.net';
my $base_path = '/doku.php';
my %page;       # stores page paths, whether they've been followed, and a count
my %did_index;  # stores index directories that have been expanded
my $DEBUG = 0;  # set to true to get a lot of stuff on stderr

check_link($base_url.$base_path."/start?do=index");

for my $link (sort keys %page){
  next if $link =~ /^\Q$base_path\E/;
  print "$link\n" unless $page{$link}->{count};
}

sub check_link {
  my $path = shift;
  debug("Checking link $path");
  my $index_text = get( $path );
  unless( defined $index_text ){
    # LWP::Simple's get returns undef on failure
    warn "Could not fetch $path\n";
    return;
  }
  $index_text =~ s{^.+?wikipage start}{}s;
  my(@index_links) = $index_text =~ m{"\Q$base_path\E/([^"]+?)"}g;
  for my $index_link (@index_links){
    if( $index_link =~ m{\?idx=(.+)$} ){
      # this is an index directory, recurse through it once
      unless( $did_index{$index_link} ){
        $did_index{$index_link} = 1;
        check_link("$base_url$base_path/$index_link");
      }
    } else {
      # this is a page, parse it and mark it if it hasn't been done yet
      unless( $page{$index_link}->{done} ){
        debug("Getting $base_url$base_path/$index_link");
        my $text = get("$base_url$base_path/$index_link");
        unless( defined $text ){
          warn "Could not fetch $base_url$base_path/$index_link\n";
          next;
        }
        $text =~ s{^.+?wikipage start}{}s;
        my( @links ) = $text =~ m{"\Q$base_path\E/([^"]+?)"}g;
        for my $link (@links){
          $link =~ s/\#.*$//;   # drop any #section fragment
          unless( $index_link eq $link ){
            $page{$link}->{count}++;
            debug( "Incremented count for $link from $index_link to $page{$link}->{count}");
          }
        }
        $page{$index_link}->{done} = 1;
      }
    }
  }
}


sub debug {
  print STDERR @_, "\n" if $DEBUG;
}
</code>
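
To see what the link-extraction step does in isolation, here is a small sketch that runs the same kind of regular expressions over a made-up HTML snippet. The HTML and the ''extract_links'' helper are assumptions for illustration; the markup DokuWiki really emits may differ.

<code perl>
use strict;
use warnings;

my $base_path = '/doku.php';

# Made-up HTML standing in for a fetched wiki page.
my $html = <<'HTML';
<html><head><link href="/doku.php/start" /></head>
<!-- wikipage start -->
<a href="/doku.php/guide#install">Install</a>
<a href="/doku.php/faq">FAQ</a>
<a href="http://example.org/">external</a>
HTML

# Hypothetical helper: return the wiki-internal link targets in $text.
sub extract_links {
  my( $text, $path ) = @_;
  $text =~ s{^.+?wikipage start}{}s;   # drop everything before the page body
  # \Q...\E keeps the '.' in doku.php literal
  my @links = $text =~ m{"\Q$path\E/([^"]+?)"}g;
  s/\#.*$// for @links;                # strip #section fragments
  return @links;
}

print join( ', ', extract_links( $html, $base_path ) ), "\n";
</code>

Only the two internal links after the ''wikipage start'' marker are kept: the header link is cut off with the rest of the preamble, and the external link does not match the base path.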

===== Notes & Comments =====
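
On the error-checking gap mentioned in the introduction: LWP::Simple's ''get'' returns undef when a fetch fails. A possible retrying wrapper is sketched below. The ''fetch_page'' name, the retry count, and the injectable fetcher are all assumptions for illustration rather than part of the script; the fetcher is a code reference so that the sketch can be tried without a network.

<code perl>
use strict;
use warnings;

# Hypothetical helper: try to fetch $url up to $tries times and return the
# text, or undef (after warning) if every attempt fails.
# $fetch defaults to a wrapper around LWP::Simple::get.
sub fetch_page {
  my( $url, $fetch, $tries ) = @_;
  $fetch ||= sub { require LWP::Simple; LWP::Simple::get( $_[0] ) };
  $tries ||= 3;
  for my $attempt ( 1 .. $tries ){
    my $text = $fetch->( $url );
    return $text if defined $text;
    warn "Attempt $attempt of $tries failed for $url\n";
  }
  return;
}

# Offline demonstration with a stub that fails once, then succeeds.
my $n = 0;
my $page = fetch_page( 'http://wiki.metalforge.net/doku.php/start',
                       sub { $n++ ? "<html>...</html>" : undef } );
print defined $page ? "fetched after $n tries\n" : "gave up\n";
</code>

Callers would still need to check the return value for undef before running any regular expressions over it.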

===== References =====
  * [[..|Mhoram's page]]

IP-Address  : 206.71.197.56
Old Revision: none
New Revision: http://wiki.metalforge.net/doku.php/user:mhoram:scripts:wiki_orphans

-- 
This mail was generated by DokuWiki at
http://wiki.metalforge.net/



