GNU/Linux

In this article I write about the Linux kernel and the GNU Project; together they are called GNU/Linux. It is a Unix-like operating system that is open source and in most cases POSIX-compliant. The kernel was first released in 1991 by Linus Torvalds.

 

GNU/Linux is often used for servers. There is a lot of open source software for work, a very powerful shell, and Perl is almost always available on the system. It also typically needs fewer resources than Microsoft systems.

 

I regularly use Debian or Mint. Debian is more stable and better suited to servers, and you can make a minimal installation with or without a GUI. Mint is aimed more at desktop users and has a good-looking desktop.

 

For more, take a look at the GNU/Linux Bash category.

Perl

Perl (Practical Extraction and Report Language) is a free, high-level, interpreted, dynamic programming language written by Larry Wall in 1987. Perl is very good for shell scripting and regular expressions. On GNU/Linux it is installed by default; on Windows you have to install it yourself, for example with ActivePerl. The language family includes Perl 5 and Perl 6 (now called Raku); I primarily use Perl 5, which is still being developed. The symbol of the Perl language is the camel.

 

Variable

$ is used for scalars, @ for arrays and % for hashes, for example:
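The original example is not preserved here, so the following is only a minimal sketch of the three sigils. The variable names and values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical names and values, chosen only for illustration.
my $name  = "Larry";                         # scalar: one single value
my @names = ("Larry", "Linus");              # array: an ordered list
my %year  = (Perl => 1987, Linux => 1991);   # hash: key/value pairs

print "$name\n";         # the scalar itself
print "$names[1]\n";     # one array element is a scalar, so it is read with $
print "$year{Perl}\n";   # one hash value is a scalar as well
```

Note that a single element of an array or hash is read with $, because the element itself is a scalar.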

A scalar can hold a number or a string, and Perl converts between the two as needed without an explicit conversion. To join two strings you concatenate them with “.”, but to add two numbers you use “+”. To access an element of an array use “$name[0]”; the index runs from 0 to the last element, and “$#name” gives you the index of the last element (so the last element itself is “$name[$#name]”). For manipulating a Perl array look at this post. You can also choose from a big list of predefined variables.
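A short sketch of the points above, with made-up values: “.” concatenates, “+” adds, and “$#name” is the last index, not the last element.

```perl
use strict;
use warnings;

my $text = "1" . "2";   # "." concatenates: the string "12"
my $sum  = "1" + "2";   # "+" adds numerically: the number 3

my @name       = ("a", "b", "c");
my $first      = $name[0];        # indices start at 0
my $last_index = $#name;          # 2, the index of the last element
my $last       = $name[$#name];   # "c"; $name[-1] returns the same element

print "$text $sum $first $last_index $last\n";
```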

 

Quoting

Be careful with quoting: double quotes interpolate scalars, single quotes do not.

With $a set to 1, the double-quoted name1 is “1” and the single-quoted name2 is “$a”.
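A minimal sketch of what such an example could look like, assuming $a was set to 1; the original listing is not preserved, so the exact assignments are my reconstruction.

```perl
use strict;
use warnings;

my $a = 1;           # assumed starting value

my $name1 = "$a";    # double quotes interpolate: the string "1"
my $name2 = '$a';    # single quotes do not: the literal text $a

print "$name1\n";
print "$name2\n";
```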

 

Print

A simple “Hello World!” example:
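The original listing is missing, so here is a minimal version; the variable name is only for illustration.

```perl
use strict;
use warnings;

my $greeting = "Hello World!";
print "$greeting\n";   # print does not add a newline by itself
```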

 

Comparison

To compare scalars, use the operators from this table:

comparison         number   string
equal              ==       eq
not equal          !=       ne
less than          <        lt
greater than       >        gt
less or equal      <=       le
greater or equal   >=       ge
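Choosing the wrong column of the table gives a different answer, as this small sketch with made-up values shows:

```perl
use strict;
use warnings;

# Numeric comparison: 10 is greater than 9.
my $num_cmp = (10 > 9) ? "true" : "false";

# String comparison works character by character: "1" sorts before "9".
my $str_cmp = ("10" gt "9") ? "true" : "false";

# The same two values are equal as numbers but not as strings.
my $same_number = ("1.0" == "1") ? "true" : "false";
my $same_string = ("1.0" eq "1") ? "true" : "false";

print "$num_cmp $str_cmp $same_number $same_string\n";
```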

 

Loops

In Perl there is more than one way to write a loop:

The first one is faster to run and faster to write, but the second gives you more possibilities to change how the loop behaves.
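The two listings are not preserved. My assumption is that the first was a for over a fixed range and the second a C-style for, so this sketch shows those two forms:

```perl
use strict;
use warnings;

# Assumed "first" form: iterate over a fixed range.
my @range_seen;
for my $i (1 .. 5) {
    push @range_seen, $i;
}

# Assumed "second" form: C-style loop, where start, condition
# and step can all be changed (here the step is 2).
my @c_style_seen;
for (my $i = 1; $i <= 5; $i += 2) {
    push @c_style_seen, $i;
}

print "@range_seen\n";
print "@c_style_seen\n";
```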

Do-While loop:

While loop:
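A small sketch of both forms with made-up counters. The difference is that do-while checks the condition after the body, so the body runs at least once.

```perl
use strict;
use warnings;

my $n = 0;

# do-while: the body runs once before the condition is checked.
my $do_runs = 0;
do {
    $do_runs++;
} while ($n > 0);

# while: the condition is checked first, so the body never runs here.
my $while_runs = 0;
while ($n > 0) {
    $while_runs++;
}

print "do-while ran $do_runs time(s), while ran $while_runs time(s)\n";
```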

 

Regex

Regular expressions are very well supported in Perl:
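The original example is missing; the following sketch, with an invented input line, shows the three most common uses: matching with a capture, substituting, and splitting.

```perl
use strict;
use warnings;

my $line = "The kernel was released in 1991";

# Match and capture: $1 holds what the parentheses matched.
my $year;
if ($line =~ /(\d{4})/) {
    $year = $1;
}

# Substitute on a copy so that $line stays unchanged.
(my $changed = $line) =~ s/kernel/Linux kernel/;

# Split on whitespace.
my @words = split /\s+/, $line;

print "$year\n$changed\n", scalar(@words), " words\n";
```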

 

Optimizing

Perl is an interpreted language, so if you need more performance you can use inline code such as C, or take a look at my Benchmarks or at write fast code in Perl.

 

If you want to see some examples and solutions from me, look into my Category – Perl, or at some special tricks in Perl.

XPATH

XPath (XML Path Language) is a query language from the W3C; it is used to select nodes in an XML document.

Full Path

In this example I start with “/”, which says that we start at the root, and then I navigate from element to element.
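The original expression is not preserved. As a sketch, here is a full path evaluated from Perl with XML::LibXML, which is a CPAN module and not part of core Perl; the sample document and its URLs are invented.

```perl
use strict;
use warnings;
use XML::LibXML;   # CPAN module, has to be installed separately

# Hypothetical sample document.
my $xml = <<'XML';
<html>
  <body>
    <div>
      <a href="https://example.org/perl">Perl</a>
      <p><a href="https://example.org/linux">Linux</a></p>
    </div>
  </body>
</html>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Full path: start at the root and go down one element at a time.
# Only the a that is a direct child of div is selected.
my @links = $doc->findnodes('/html/body/div/a');
print $_->textContent, "\n" for @links;
```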

 

Anywhere

In this example I use “//” to search the whole document and find all a nodes. (You can also use “//” after a “/” path, like “/html/body/div//a”, to find all a elements anywhere inside html body div.)
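A sketch of both variants, again using XML::LibXML from CPAN and an invented document:

```perl
use strict;
use warnings;
use XML::LibXML;   # CPAN module, has to be installed separately

# Hypothetical sample document with one link outside the div.
my $xml = <<'XML';
<html>
  <body>
    <a href="/top">Top</a>
    <div>
      <a href="/perl">Perl</a>
      <p><a href="/linux">Linux</a></p>
    </div>
  </body>
</html>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# "//a" finds a elements at any depth in the whole document.
my @all_links = $doc->findnodes('//a');

# "/html/body/div//a" finds a elements at any depth below that div only.
my @div_links = $doc->findnodes('/html/body/div//a');

print scalar(@all_links), " links in total, ",
      scalar(@div_links), " inside the div\n";
```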

 

Attributes

You can use “@” to find all elements whose class attribute is the string “secondary”.

 

For example, you can select all hrefs whose attribute value contains a string:

Or select all hrefs where the link text contains a string:
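The original expressions are missing, so this is my reading of the three attribute cases, sketched with XML::LibXML (CPAN) on an invented document:

```perl
use strict;
use warnings;
use XML::LibXML;   # CPAN module, has to be installed separately

# Hypothetical sample document.
my $xml = <<'XML';
<div>
  <a href="https://example.org/perl-tips" class="secondary">Tricks</a>
  <a href="https://example.org/bash" class="primary">Linux shell</a>
  <a href="https://example.org/perl-regex" class="primary">Regex</a>
</div>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Elements whose class attribute is exactly "secondary".
my @secondary = $doc->findnodes('//a[@class="secondary"]');

# href attributes whose value contains "perl".
my @href_hits = map { $_->value }
                $doc->findnodes('//a[contains(@href, "perl")]/@href');

# href attributes of links whose text contains "Linux".
my @text_hits = map { $_->value }
                $doc->findnodes('//a[contains(text(), "Linux")]/@href');

print "$_\n" for @href_hits, @text_hits;
```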

 

Array

If you select all elements you get an array; in the example above I select the first element from my result array. You should avoid this: it is better to navigate via the class or ID, because if the page changes a bit the XPath no longer works.

 

Those are some basics; if you need something special, take a look on the web or on my page, for example at Web crawling in PHP or Perl.

Regular expression

A regular expression is a search pattern. It is used very often in the programming language Perl, but also in other programming languages and in text editors.

 

Meta characters

char   meaning
^      defines the start of the matching string
$      defines the end of the matching string
.      matches any character except newline
*      matches the preceding item 0 or more times
+      matches the preceding item 1 or more times
?      matches the preceding item 0 or 1 times
{}     matches exactly the given number or range of times
|      logical or operator
()     makes a group and stores the result
[]     makes a character class

 

Matches

\t   tabulator
\n   new line
\r   carriage return (CR)
\w   matches a-z, A-Z, 0-9 and “_”
\W   matches any character except a-z, A-Z, 0-9 and “_”
\s   matches whitespace such as space, tab and newline
\S   matches any character except space, tab and newline
\d   matches 0-9
\D   matches any character except 0-9

 

Examples

This matches every string that starts with an “a”.

This pattern matches every string that ends with an “a”.

This regex would match “schools” and “school”.

This would match every string with a length between 3 and 4.

This would also match “schools” and “school”.

On the string “this is an 'test'.” the pattern would store “test”.

This regex matches every combination of “a”, “c” and “o”, like the word “coca”.
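The patterns themselves are not preserved. The following Perl sketch uses patterns that fit each description; they are my reconstruction, not necessarily the ones originally shown, and check is a small helper invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether $text matches the pattern $re.
sub check {
    my ($text, $re) = @_;
    return $text =~ $re ? "match" : "no match";
}

print check("apple",   qr/^a/),              "\n";   # starts with an "a"
print check("coca",    qr/a$/),              "\n";   # ends with an "a"
print check("school",  qr/^schools?$/),      "\n";   # the "s" is optional
print check("abcd",    qr/^.{3,4}$/),        "\n";   # length 3 to 4
print check("abcde",   qr/^.{3,4}$/),        "\n";   # too long
print check("schools", qr/^(school|schools)$/), "\n"; # same result with "|"
print check("coca",    qr/^[aco]+$/),        "\n";   # only a, c and o

# The group in parentheses stores what it matched.
my ($stored) = "this is an 'test'." =~ /'(.*)'/;
print "$stored\n";
```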

 

For testing regexes you can use the JavaScript Regex tester.

Write fast code in Perl

In this article I want to show how to write fast Perl code from the first to the last line. You can use it to speed up existing programs, but it is better to get it right from the beginning.

 

Turn off things you don’t need

In Perl it is important to use debug output, but only during development. Once your code is written and tested you can disable it with a constant, as in Perl strict Benchmark or Perl use output only for debugging. For this, take a look at Perl debug output and Perl debug Benchmark; it is a simple way to switch debugging and other features off and on. The interpreter knows that the condition can never be true and removes the code completely, which is faster than an if that is evaluated at run time. Take a look at Perl print Benchmark to see how it works; there you can also see that you should not modify the stack.

Another benefit of constants is that the interpreter can calculate things before they are needed, as in Perl constant Benchmark. Another thing is bignum: it is useful because it gives you better precision, but it costs a bit more time. For more, look at Perl Bignum Benchmark.
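A minimal sketch of the constant technique; DEBUG, work and @log are names invented for this example.

```perl
use strict;
use warnings;
use constant DEBUG => 0;   # set to 1 while developing

my @log;

sub work {
    my ($n) = @_;
    # DEBUG is a compile-time constant, so while it is 0 this
    # statement is removed when the code is compiled.
    push @log, "work($n) called" if DEBUG;
    return $n * 2;
}

my $result = work(21);
print "$result\n";
```

A normal variable such as $debug would have to be checked on every call instead.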

 

Don’t modify the Stack

Modifying the stack with pop, push, shift and unshift is expensive for Perl. This matters for function arguments and for arrays: try to access the items directly instead of shifting them off, as in Perl shift iterate Benchmark.
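A sketch of what direct access can look like; the sub names are hypothetical, and how much time it saves depends on your Perl version, so it is worth measuring.

```perl
use strict;
use warnings;

# Shifts the arguments off @_ one by one.
sub add_shift {
    my $x = shift;
    my $y = shift;
    return $x + $y;
}

# Reads the arguments directly from @_ without modifying it.
sub add_direct {
    return $_[0] + $_[1];
}

# Iterate by index instead of shifting elements off the array;
# @items is still complete afterwards.
my @items = (1, 2, 3, 4);
my $total = 0;
$total += $items[$_] for 0 .. $#items;

print add_shift(2, 3), " ", add_direct(2, 3), " $total\n";
```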

 

Use pre-increment over post-increment

Let's have a look at the post-increment operator: it stores the value in a temporary variable, then increments the original value and returns the stored value. The pre-increment only has to increment and return the result, so it can be faster. For a benchmark look at Perl increment Benchmark.
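The difference in behaviour, which is where the extra work comes from, in a small sketch with made-up values. Whether it is measurably faster on your Perl is best checked with the core Benchmark module.

```perl
use strict;
use warnings;

my $i    = 5;
my $post = $i++;   # returns the old value: $post is 5, $i is now 6

my $j   = 5;
my $pre = ++$j;    # returns the new value: $pre is 6, $j is 6

print "$post $i $pre $j\n";
```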

 

References

References are useful if you have big data that you want to access in a sub, because then only the reference has to be passed and not the data. But be careful: if you just have scalars you can waste time, as in Perl reference vs. handing over. Another point is that arrays are smaller and faster than hashes, so use them if you can: Perl array vs. hash handing over Benchmark. In my last test, however, I got the best result with prototypes.
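A sketch of passing an array by reference; sum_ref and @big are names invented for the example.

```perl
use strict;
use warnings;

# Takes a reference to an array, so the elements are not copied.
sub sum_ref {
    my ($data) = @_;
    my $total = 0;
    $total += $_ for @$data;
    return $total;
}

my @big   = (1 .. 1000);
my $total = sum_ref(\@big);   # the backslash creates the reference

print "$total\n";
```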

 

Replace and Regex

For regexes that contain variables you should use the o flag or qr// so that the regex is not compiled again on every use; with the o flag, however, later changes to the interpolated variable are ignored, as in Perl replace benchmark. For replacing single characters you should prefer y// over s//, because it is faster. To see the difference in replacing, look at Perl replace vs. sed, awk and bash.
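A sketch of both techniques with invented data: qr// builds the pattern once, and y// (a synonym for tr//) replaces character by character without using the regex engine.

```perl
use strict;
use warnings;

# Compile the pattern once and reuse it for every line.
my $word = "perl";
my $re   = qr/\b\Q$word\E\b/i;

my @lines = ("I use Perl", "I use Bash", "perl is fast");
my @hits  = grep { $_ =~ $re } @lines;

# y// maps each character in the first list to the one at the
# same position in the second list.
(my $dna = "AACCGT") =~ y/ACGT/TGCA/;

print scalar(@hits), " hits, $dna\n";
```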

 

Choose a multiplication over a division

In most cases you can use a multiplication instead of a division and save time, as in Perl multiplication vs. division Benchmark.
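A small sketch with made-up values: the factor is calculated once, and the loop then only multiplies.

```perl
use strict;
use warnings;

my @values = (10, 20, 30);

# Division on every element.
my @by_division = map { $_ / 4 } @values;

# Calculate the factor once, then multiply.
my $factor = 1 / 4;
my @by_multiplication = map { $_ * $factor } @values;

print "@by_division\n@by_multiplication\n";
```

Here 1/4 can be stored exactly as a floating point number; with other divisors the last digits of the two results may differ slightly.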

 

Use map

If you want to alter an array and can use map, then use it; it speeds up your script: Perl map vs. for Benchmark.
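The same transformation written both ways, as a sketch with invented data:

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4);

# With a for loop and push.
my @doubled_for;
for my $n (@numbers) {
    push @doubled_for, $n * 2;
}

# With map: one expression builds the whole new list.
my @doubled_map = map { $_ * 2 } @numbers;

print "@doubled_for\n@doubled_map\n";
```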

 

Loop with fixed range

If you don't have to modify the loop variable inside the loop and don't need a different increment, you should use for with a fixed range, as in Perl loop Benchmark.

 

Remove double entries

To remove duplicate entries from an array look at this: Perl remove double entries in an array Benchmark.
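One common way to do this, shown only as a sketch (the linked benchmark may compare other variants), is a hash that remembers what was already seen:

```perl
use strict;
use warnings;

my @entries = qw(a b a c b);

# $seen{$_}++ is 0 (false) the first time a value appears,
# so grep keeps only the first occurrence of each value.
my %seen;
my @unique = grep { !$seen{$_}++ } @entries;

print "@unique\n";
```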

 

Make it by hand

The last step is to do it by hand, for example with Inline C or a preprocessor, or you can do it in Perl as in this example: Perl grep Benchmark, where I used a manual way instead of grep.
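A sketch of one case where a manual loop can beat grep, assuming only the first hit is needed; the data and names are invented and this is not necessarily what the linked benchmark measures.

```perl
use strict;
use warnings;

my @haystack = (3, 8, 15, 4, 23);

# grep always looks at every element of the list.
my @found_grep = grep { $_ > 10 } @haystack;

# The manual loop can stop as soon as the first hit is found.
my $first_big;
for my $item (@haystack) {
    if ($item > 10) {
        $first_big = $item;
        last;
    }
}

print "@found_grep\n$first_big\n";
```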