Perl

Perl (Practical Extraction and Report Language) is a free, high-level, interpreted, dynamic programming language created by Larry Wall in 1987. Perl is very good for shell scripting and regular expressions. On GNU/Linux it is usually installed by default; on Windows you have to install a distribution such as ActivePerl. The language comes as Perl 5 and Perl 6; I primarily use Perl 5, which is still being developed. The symbol of the Perl language is the camel.

 

Variable

Perl marks the type of a variable with a sigil: $ for scalars, @ for arrays and % for hashes.

A scalar can hold a number or a string, so you can store both in the same variable without converting. To join two scalars as strings, concatenate them with “.”; to add two numbers, use “+”. To access an element of an array use “$name[0]”; the index runs from 0 to the last element. “$#name” gives the index of the last element, so “$name[$#name]” (or “$name[-1]”) is the last element itself. For manipulating a Perl array, look at this post. You can also choose from a big list of default variables.
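A minimal sketch of the points above. The variable names and values are made up for illustration.

```perl
use strict;
use warnings;

my $count = 3;                          # a scalar holding a number
my $label = 'apples';                   # a scalar holding a string
my @name  = ('Anna', 'Ben', 'Carl');    # an array
my %age   = (Anna => 31, Ben => 27);    # a hash

my $text = $count . ' ' . $label;       # "." concatenates strings
my $sum  = $count + 2;                  # "+" adds numbers

my $first      = $name[0];              # first element
my $last_index = $#name;                # index of the last element
my $last       = $name[$#name];         # last element, same as $name[-1]

print "$text, $sum, $first, $last, $age{Ben}\n";
```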

 

Quoting

Be careful with quoting: double quotes interpolate variables, single quotes do not.

Here name1 is “1” and name2 is the literal text “$a”.
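The original listing is not preserved here, so the following is only a sketch of what it probably looked like. The names $name1 and $name2 follow the sentence above; the assignment to $a is an assumption.

```perl
use strict;
use warnings;

# $a is normally reserved for sort blocks; it is used here only
# to match the text.
my $a = 1;

my $name1 = "$a";   # double quotes interpolate the variable
my $name2 = '$a';   # single quotes keep the text as it is

print "name1=$name1 name2=$name2\n";
```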

 

Print

A simple “Hello World!” example:
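A minimal version could look like this. The variable $greeting is only there for illustration; printing the string directly works just as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $greeting = "Hello World!\n";
print $greeting;
```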

 

Comparison

To compare scalars, use the operators from this table; which column you need depends on whether you compare the values as numbers or as strings:

comparison        number  string
equal             ==      eq
not equal         !=      ne
less than         <       lt
greater than      >       gt
less or equal     <=      le
greater or equal  >=      ge
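The difference between the two columns shows up as soon as the same values are compared both ways. A short sketch with made-up values:

```perl
use strict;
use warnings;

# The same values compared as numbers and as strings
my $numeric = (10 == 10.0)     ? 'equal' : 'not equal';
my $string  = ('10' eq '10.0') ? 'equal' : 'not equal';

# Numerically 9 is smaller than 10, but as text "9" sorts after "1"
my $num_less = (9 < 10)      ? 1 : 0;
my $str_less = ('9' lt '10') ? 1 : 0;

print "$numeric, $string, $num_less, $str_less\n";
```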

 

Loops

In Perl there is more than one way to write a loop:

The first one runs faster and is faster to write, but the second gives you more possibilities to change the counter.

Do-While loop:

While loop:
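The loop listings are not preserved here. As a sketch, and assuming the "first" and "second" form above mean the range loop and the C-style loop, the four variants could look like this:

```perl
use strict;
use warnings;

my @out;

# for over a fixed range
for my $i (1 .. 3) {
    push @out, "range:$i";
}

# C-style for: start, condition and step can be chosen freely
for (my $i = 0; $i < 6; $i += 2) {
    push @out, "cstyle:$i";
}

# do-while: the body runs at least once
my $n = 0;
do {
    push @out, "do:$n";
    $n++;
} while ($n < 2);

# while: the condition is checked before the body runs
my $m = 2;
while ($m > 0) {
    push @out, "while:$m";
    $m--;
}

print join(' ', @out), "\n";
```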

 

Regex

Regular expressions are very well implemented in Perl:
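A small sketch of matching with capture groups and of substitution. The input string is made up for illustration.

```perl
use strict;
use warnings;

my $line = 'Date: 2024-03-15';

# Match: in list context the capture groups are returned
my ($year, $month, $day) = $line =~ /(\d{4})-(\d{2})-(\d{2})/;

# Substitute on a copy, reusing the groups in the replacement
(my $changed = $line) =~ s/(\d{4})-(\d{2})-(\d{2})/$3.$2.$1/;

print "$year $month $day\n$changed\n";
```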

 

Optimizing

Perl is an interpreted language. If you need more performance you can inline code in another language such as C, or take a look at my Benchmarks or at “write fast code in Perl”.

 

If you want to see some examples and solutions from me, look into my category “Perl”, or at some special tricks in Perl.

XPATH

XPath (XML Path Language) is a query language from the W3C; it is used to select nodes in an XML document.

Full Path

In this example I start with “/”, which means we start at the root, and then I navigate from element to element.

 

Anywhere

In this example I use “//” to search the full document for all a nodes. (You can also use “//” after a path, as in “/html/body/div//a”, to find all a elements anywhere below /html/body/div.)

 

Attributes

You can use “@” to find all elements whose class attribute is the string “secondarary”.

 

For example, you could select all hrefs where the attribute contains a string:

Or select all hrefs where the text of the link contains a string:

 

Array

If you select all elements you get an array; in the example above I select the first element from my result array. You should avoid this. It is better to navigate via the class or ID, because if the page changes a bit, a position-based XPath doesn’t work any more.

 

Those are some basics. If you need something special, take a look on the web or on my page, for example at web crawling in PHP or Perl.

Write fast code in Perl

In this article I want to show how to write fast Perl code from the first to the last line. You can use it to speed up existing programs, but it’s better to get it right from the beginning.

 

Turn off things you don’t need

In Perl it is important to use debug output, but only while developing. After your code is written and tested you can disable it with a constant, as in Perl strict Benchmark or Perl use output only for debugging. For this, take a look at Perl debug output and Perl debug Benchmark; it is a simple way to switch debugging and other features on and off. When the constant is false, the interpreter knows at compile time that the condition can never be true and removes the statement completely, which is faster than an if on a variable. Take a look at Perl print Benchmark to see how it works; there you can also see that you should not modify the stack. In the same way the interpreter can calculate constant expressions before they are needed, as in Perl constant Benchmark. Another such feature is bignum: it is useful because it gives you better precision, but it costs a bit more time. For more, look at Perl Bignum Benchmark.
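A minimal sketch of the constant switch described above. The names DEBUG and debug_log are assumptions chosen for illustration, not taken from the linked posts.

```perl
use strict;
use warnings;

use constant DEBUG => 0;    # set to 1 while developing

sub debug_log {
    my ($msg) = @_;
    print STDERR "DEBUG: $msg\n";
}

my $sum = 0;
for my $n (1 .. 5) {
    # DEBUG is a compile-time constant; when it is 0 this statement
    # is dropped before the program runs.
    debug_log("adding $n") if DEBUG;
    $sum += $n;
}

print "sum = $sum\n";
```

With a plain variable such as $debug instead of the constant, the condition would be checked on every pass through the loop.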

 

Don’t modify the Stack

Modifying the stack with pop, push, shift and unshift is hard work for Perl. This matters for function arguments and for arrays: try to access the items directly instead of shifting them off, as in Perl shift iterate Benchmark.
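As a sketch of "direct access", with made-up names: the sub reads its arguments straight from @_, and the loop walks the array by index, so nothing is shifted off.

```perl
use strict;
use warnings;

# Read the arguments directly from @_ instead of shifting them off
sub add_direct {
    return $_[0] + $_[1];
}

# Walk the array by index; the array itself is left untouched
my @items = (10, 20, 30);
my $total = 0;
for my $i (0 .. $#items) {
    $total += $items[$i];
}

print add_direct(2, 3), " $total\n";
```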

 

Use pre-increment over post-increment

Let’s have a look at the post-increment operator: it stores the value in a temporary, then increments the original and returns the stored value. The pre-increment only has to increment and return the result, so it can be faster. For a benchmark, look at Perl increment Benchmark.

 

References

References are useful if you have big data and want to access it in a sub, because only the reference has to be passed and not the data. But be careful: if you just have scalars you could waste time, as in Perl reference vs. handing over. Another point is that arrays are smaller and faster than hashes, so use them if you can: Perl array vs. hash handing over Benchmark. In my last test, however, I got the best result with prototypes.
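A short sketch of handing over a reference instead of the data. The sub name sum_ref and the sample array are made up for illustration.

```perl
use strict;
use warnings;

# Takes an array reference, so only one scalar is passed to the sub
sub sum_ref {
    my ($list) = @_;
    my $sum = 0;
    $sum += $_ for @$list;
    return $sum;
}

my @big    = (1 .. 1000);
my $result = sum_ref(\@big);    # pass \@big, not a copy of the list

print "$result\n";
```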

 

Replace and Regex

For a regex that contains variables you should use the o flag or qr//, so that the pattern is not compiled again on every use; the price is that the pattern is interpolated only once, as in Perl replace benchmark. For replacing single characters you should prefer y// (also written tr//) over s//, because it is faster. To see the differences between replace tools, look at Perl replace vs. sed, awk and bash.
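A sketch of both ideas with made-up data: a pattern compiled once with qr// and reused, and a character replacement done with tr// instead of s//.

```perl
use strict;
use warnings;

my $word = 'perl';
my $re   = qr/\b\Q$word\E\b/i;    # $word is interpolated once, here

my @lines = ('I like Perl', 'no match here', 'perl again');
my @hits  = grep { $_ =~ $re } @lines;

# tr// (alias y//) maps characters one to one without the regex engine
(my $upper = 'hello world') =~ tr/a-z/A-Z/;

print scalar(@hits), " $upper\n";
```

Note that tr// only translates single characters; it is not a general replacement for s//.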

 

Choose a multiplication over a division

In most cases you can use a multiplication instead of a division (for example $x * 0.5 instead of $x / 2) and save time, as in Perl multiplication vs. division Benchmark.

 

Use map

If you want to alter an array and can use map, then use it; it speeds up your script: Perl map vs. for Benchmark.
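A minimal sketch with made-up data, showing the same transformation written with map and with an explicit loop:

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);

# map builds the new list in one expression
my @squares = map { $_ * $_ } @numbers;

# the equivalent explicit loop, for comparison
my @squares_loop;
for my $n (@numbers) {
    push @squares_loop, $n * $n;
}

print "@squares\n";
```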

 

Loop with fixed range

If you don’t have to modify the loop counter inside the loop and don’t need another increment, you should use for with a fixed range, as in Perl loop Benchmark.

 

Remove duplicate entries

To remove duplicate entries from an array, look at this: Perl remove double entries in an array Benchmark.
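The linked benchmark is not reproduced here. One common way to do it, shown only as a sketch, is to remember the values already seen in a hash:

```perl
use strict;
use warnings;

my @list = qw(a b a c b d);

# %seen counts each value; grep keeps a value only the first time
my %seen;
my @unique = grep { !$seen{$_}++ } @list;

print "@unique\n";
```

The order of first appearance is kept.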

 

Make it by hand

The last step is to do it by hand, with Inline C or a preprocessor, or directly in Perl as in this example: Perl grep Benchmark, where I used a manual way instead of grep.

Perl strict Benchmark

This is an example benchmark about turning off things you don’t need in Perl:

Our Result:

Maybe a good solution is to use it to check your code and then to comment it out, perhaps with a debug variable; for this, take a look at Perl Debug Output.

(This is just an example; don’t turn off use strict.)

Perl random password generator

This is a little example of how to generate passwords with digits, lower-case letters, upper-case letters and some special characters in Perl:
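The original script is not preserved here. A minimal sketch could look like this; the sub name random_password and the choice of special characters are assumptions. Note that rand is not a cryptographically secure generator.

```perl
use strict;
use warnings;

# Pool of characters: digits, lower case, upper case, some specials
my @chars = ('0' .. '9', 'a' .. 'z', 'A' .. 'Z', qw(! ? % & + - _ =));

sub random_password {
    my ($length) = @_;
    # pick one random character from the pool for every position
    return join '', map { $chars[ int rand @chars ] } 1 .. $length;
}

print random_password(12), "\n" for 1 .. 3;
```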

Some random passwords:

Perl trim decimal number

This is a function to trim a decimal number in Perl. The function does not round the number; it just takes a substring from the decimal point onwards:

Result looks like this:

We see that missing positions are filled with 0 and longer numbers are trimmed.
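The original function is not preserved here. The following sketch shows the described behaviour; the name trim_decimal and its parameters are assumptions.

```perl
use strict;
use warnings;

# Keep exactly $places digits after the decimal point: extra digits
# are cut off without rounding, missing digits are filled with 0.
sub trim_decimal {
    my ($number, $places) = @_;
    my ($int, $frac) = split /\./, "$number", 2;
    $frac = '' unless defined $frac;
    $frac = substr($frac . '0' x $places, 0, $places);
    return $places > 0 ? "$int.$frac" : $int;
}

print trim_decimal('3.14159', 2), "\n";
print trim_decimal('2.5', 3), "\n";
print trim_decimal('7', 2), "\n";
```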

Perl sub Benchmark

This is a Benchmark about the subroutines in Perl:

Result:

We see the normal way is the best solution.

Perl random number

This is a slightly more complex solution to generate a random number in Perl:

It has no fixed maximum value, but normally the highest value is 99999.

 

 

Perl bignum Benchmark

This is a benchmark in Perl about bignum:

Our result:

We see the cost is about 12% performance, so if it’s not really necessary we should turn this feature off.

Perl default defined variables – store and restore

This is a little script to store all default defined variables from Perl (Linux) into a scalar and then to restore everything as it was before the store function.

It is important to include my Data-Dumper!

If you want to store your data in the internal arrays, run this:

  • scalars for regex and program flow are stored in “@_Course”
  • scalars for the environment are stored in “@_Environment”
  • hashes and arrays are not stored because they are global or have a lower scope

Only variables that are writable get stored.

Then restore your saved variables:

You can see whether you changed any default variable with (1 = changed):

Or you can print all stored variables:

(Important to include my dumper.)

 


Scalars for regex and program flow:

$_ – the default variable; contains the current value in a loop

$& – contains the text matched by the last regex
$` – contains the text to the left of the last match
$' – contains the text to the right of the last match

$1,$2,.. – contain the matching groups of the regex
$+ – contains the last group matched by a regex

$. – contains the current line number of the input file

$/ – sets the input record separator

$\ – sets the output record separator, added after every print

$, – sets the output field separator, printed between the comma-separated arguments of print

$" – sets the list separator, put between the elements of an array that is interpolated into a double-quoted string

$| – output autoflush, disables the output buffer. 0=off, 1=on
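A short sketch of some of these variables in action. The input string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = 'key=value;rest';
my ($before, $match, $after, $group);
if ($text =~ /=(\w+)/) {
    # copy the match variables right away, the next match overwrites them
    ($before, $match, $after, $group) = ($`, $&, $', $1);
}

my @list = (1, 2, 3);
my $joined;
{
    local $" = '-';     # list separator, changed only inside this block
    $joined = "@list";
}

print "$before|$match|$after|$group|$joined\n";
```

Using local keeps the change to the special variable limited to the enclosing block.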

 


Format:

$% – current page number
$= – number of lines per page
$- – number of lines left on the page
$~ – name of the current format
$^ – name of the current top-of-page format
$: – characters after which a string may be broken to fill continuation lines
$^L – string that is output between pages (form feed)
$^A – accumulator for formatted lines

 


Scalars for the environment:
$? – contains the status of the last pipe close, backtick command or child process

$! – contains the system error of the last failed system or library call

$@ – contains an empty string if the last eval worked, and if not it contains the error

$$ – process id
$< – real user id
$> – effective user id
$( – real group id
$) – effective group id

$0 – current program name

$] – current Perl version as a number ($^V is the more modern representation)

$^D – current debugging flags, set by running your program with the -D parameter
$^F – highest system file descriptor
$^H – contains compile-time hints set by use strict and other pragmas
$^I – contains the in-place edit extension; undefined if the in-place flag is not set
$^M – emergency memory buffer that Perl can use if no more RAM is available
$^O – contains the name of the current operating system

$^P – contains the current debugging status
$^S – the state of the Perl interpreter: true inside an eval, false otherwise, undefined while code is still being compiled
$^T – contains the start time of the script

$^V – current Perl interpreter version

$^W – contains the warning switch from the -w parameter, 1 if set, else 0
$^X – contains the path of the current Perl interpreter
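A sketch of the two error variables and a few of the informational ones. The path is hypothetical and is assumed not to exist on the system.

```perl
use strict;
use warnings;

# $@ holds the error of the last eval, or '' if it worked
my $ok  = eval { die "boom\n"; 1 };
my $err = $@;

# $! holds the system error of the last failed call
my $path   = '/no/such/dir/no-such-file';   # hypothetical, assumed missing
my $opened = open(my $fh, '<', $path);
my $reason = $opened ? '' : "$!";           # copy at once, $! changes easily

print "eval failed: $err";
print "open failed: $reason\n" unless $opened;
print "pid $$ on $^O, script $0\n";
```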

 


Hashes and arrays:
@_ – contains the parameters passed to the current function

@ARGV – contains the command-line parameters of your Perl script

@INC – paths where Perl looks for modules

@ISA – contains the list of base classes of a package (object orientation)
@EXPORT – symbols a module exports by default
@EXPORT_OK – symbols a module exports only on request
%ENV – contains the environment variables from your system

%SIG – contains the signal handlers, used to communicate with other processes

 

If you want the list as a comment block, use this: