Web Robot
We need a web robot to grab specific information from the web.
We believe that a correct classification of web robots is:
- browser / web grabber
- web robot
- intelligent web robot
A browser / web grabber takes a single HTTP page, addressed as
URL/HREF?PARAMS. The module browses through the content and grabs some information.
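As a minimal sketch of such a grabber, the script below fetches one page and grabs the link targets from its content. The helper names fetch_page and grab_links are hypothetical (they do not come from this topic), HTTP::Tiny is assumed to be available (it ships with Perl 5.14 and later), and the regular expression is only a rough stand-in for a real HTML parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;    # core module since Perl 5.14

# Hypothetical helper: fetch a single page and return its body.
sub fetch_page {
    my ($url) = @_;
    my $response = HTTP::Tiny->new(timeout => 10)->get($url);
    die "Can't fetch $url: $response->{status} $response->{reason}\n"
        unless $response->{success};
    return $response->{content};
}

# Hypothetical helper: grab the value of every quoted href attribute.
# A regex is enough for a sketch; a real HTML parser is more robust.
sub grab_links {
    my ($html) = @_;
    my @links;
    while ($html =~ /href\s*=\s*(["'])(.*?)\1/gis) {
        push @links, $2;
    }
    return @links;
}

# An inline sample page keeps the example offline. For a live page use:
#   my $html = fetch_page('http://example.com/');
my $html = '<a href="http://example.com/a?x=1">A</a> <a href=\'b.html\'>B</a>';
print "$_\n" for grab_links($html);
```

With the inline sample, the script prints the two link targets, one per line. Unquoted href values are not matched by this pattern.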
To get the html content we
#!/usr/bin/perl -w
use strict;
# We have some input text content
my $input= "Ralf Schaer";
# This will be our text content output
my $output= "";
#
# Forking open: the parent reads from FROM whatever the child writes to STDOUT
#
my $pid= open(FROM, "-|");
die "FATAL ERROR: Can't open the pipe FROM $! \n" unless defined($pid);
if ( $pid ) {
    # Parent: collect the filtered content line by line
    while ( <FROM> ) {
        chomp($_);
        $output.= $_;
    }
    close(FROM);
}
else {
    #
    # Child: we pipe the input content into a sed pipe chain
    #
    open(PARSER, "| sed -n -f my.sed | sed -n -f second.sed")
        or die "FATAL ERROR: Can't open the pipe PARSER $! \n";
    print PARSER $input;
    close(PARSER);
    # The child is done; exit with success status
    exit(0);
}
# We check the parsed content
#
print "The output is $output\n";
1;
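The forking-open pattern used in the script can also be wrapped in a reusable function. The following is only a sketch under some assumptions: filter_through is a hypothetical name, the filter command is passed as a list so that no shell is involved, and tr stands in for the sed chain because my.sed and second.sed are not shown here. It assumes a Unix-like system.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();    # for POSIX::_exit in the child

# Hypothetical helper: send $input through an external filter command
# and return everything the filter writes to its STDOUT.
sub filter_through {
    my ($input, @cmd) = @_;

    # Forking open: in the parent, $from reads the child's STDOUT.
    my $pid = open(my $from, '-|');
    die "Can't fork: $!\n" unless defined $pid;

    if ($pid) {
        # Parent: slurp the whole filtered output at once.
        local $/;
        my $output = <$from>;
        close($from);
        return defined $output ? $output : '';
    }

    # Child: start the filter; it inherits our STDOUT (the pipe to the parent).
    open(my $to, '|-', @cmd) or do {
        warn "Can't start @cmd: $!\n";
        POSIX::_exit(1);
    };
    print {$to} $input;
    close($to);          # waits for the filter to finish
    POSIX::_exit(0);     # leave without running the parent's END blocks
}

# Example: upper-case the input with tr (assumed to be installed).
print filter_through("Ralf Schaer\n", 'tr', 'a-z', 'A-Z');
```

Passing the command as a list avoids shell quoting problems. A multi-stage chain like the one in the script above would need either one call per stage or a shell command string.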
--
TWikiAdminUser - 2009-11-12