main.pl

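The main script is triggered by the WATCHDOG.pl script. It starts PhantomJS on the WebDriver port passed as its first argument, crawls the main pages, and then crawls the search results for each keyword in its list.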

Line 6: starts PhantomJS

Line 11: keywords

Line 15: main page crawl

Line 22: keyword crawl
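
The two crawl steps listed above come down to generating two sets of URLs and handing each one to download.pl. The following is only a sketch of that URL generation, kept separate from the running of commands. The helper names `url_escape`, `main_page_urls` and `search_urls` are hypothetical, and the percent-encoding of keywords is an addition that the script itself does not do.

```perl
use strict;
use warnings;

# Percent-encode everything except unreserved URL characters.
# Assumption: keywords are plain ASCII / byte strings.
sub url_escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# URLs for the main page crawl: /new/1 .. /new/$pages.
sub main_page_urls {
    my ($pages) = @_;
    return map { "http://www.example.com/new/$_" } 1 .. $pages;
}

# URLs for the keyword crawl: one search URL per keyword and page.
sub search_urls {
    my ($words, $pages) = @_;
    my @urls;
    for my $word (@$words) {
        for my $page (1 .. $pages) {
            push @urls, 'http://www.example.com/search.php?what='
                      . url_escape($word) . "&page=$page";
        }
    }
    return @urls;
}

# Same ranges as the script: 6 main pages, 1 results page per keyword.
my @urls = (main_page_urls(6), search_urls(['keyword1', 'keyword2'], 1));
print "$_\n" for @urls;
```

Building the list first makes it easy to print and check the URLs before any download is started.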

#!/usr/bin/perl
use strict;
use warnings;
$|=0;

system('./phantomjs-1.9.7-linux-x86_64/bin/phantomjs --webdriver='.$ARGV[0].' >> /dev/null 2>&1 &');sleep(4);
print 'working '.$$.' '.(time()-$^T).$/;



my @words = ("keyword1","keyword2");

my $run ="";

for my $page (1..6) {
    $run = "./download.pl 'http://www.example.com/new/$page' '$ARGV[0]'";
    print $run.$/;
    system($run." && sleep 1"); sleep(10);
}
for my $word (@words) {
    # one results page per keyword
    for my $page (1..1) {
        $run = "./download.pl 'http://www.example.com/search.php?what=".$word."&page=$page' '$ARGV[0]'";
        print $run.$/;
        system($run." && sleep 1"); sleep(10);
    }
}

system("ps -e -o pid,args -dd | egrep -e '--webdriver=$ARGV[0]' | grep -v egrep | awk '{print \$1}' | xargs kill -s 9");
print "normal exit !".$/;
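
The cleanup at the end of the script finds PhantomJS again by searching the process list. One possible alternative, shown here only as a sketch, is to start the process with fork and exec so that its PID is known from the beginning and can be killed directly. The helper names `start_background` and `stop_background` are hypothetical and are not part of the script above.

```perl
use strict;
use warnings;
use POSIX ();

# Start a command in the background and return its PID (hypothetical helper).
sub start_background {
    my @cmd = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: discard output, like the '>> /dev/null 2>&1' redirect, then exec.
        open STDOUT, '>', '/dev/null' or POSIX::_exit(1);
        open STDERR, '>', '/dev/null' or POSIX::_exit(1);
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }
    return $pid;
}

# Kill the process and reap it so that no zombie is left behind.
# Returns the wait status ($?) of the stopped process.
sub stop_background {
    my ($pid) = @_;
    kill 'KILL', $pid;
    waitpid($pid, 0);
    return $?;
}

# Possible use in main.pl (path and flag taken from the script above):
# my $pid = start_background('./phantomjs-1.9.7-linux-x86_64/bin/phantomjs',
#                            "--webdriver=$ARGV[0]");
# ... crawl ...
# stop_background($pid);
```

Because the command is passed as a list, no shell is involved, so the port argument needs no quoting and no ps or grep pipeline is needed to find the process again.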

 
