Open sharing content

These articles are available under Creative Commons license BY-SA-3.0

Archive for October, 2011

You can download “Perl Scripting for SQLite”.
It’s a PDF of the slides I made for “Linux Day 2011” in my hometown.
Have fun!

Library for WWW in Perl (LWP)

On Debian-based Linux distributions (such as Ubuntu) you can install all the Perl modules for the web (LWP, URI, URL, HTTP…) at once:

:~$ sudo apt-get install libwww-perl
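
To check that the installation worked, one quick test is to ask Perl to load the module from the shell. This is only a sketch: check_lwp is a name I made up for the example, and it assumes perl is on your PATH.

```shell
# check_lwp is a hypothetical helper name, not part of LWP itself.
check_lwp() {
    # -MLWP loads the module; the command fails if perl or LWP is missing.
    if perl -MLWP -e 1 2>/dev/null; then
        perl -MLWP -e 'print "LWP version $LWP::VERSION is installed\n"'
    else
        echo "LWP is not installed (or perl was not found)"
    fi
}

check_lwp
```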

LWP is the most used Perl module for accessing data on the web.

LWP::Simple is a module for fetching documents over HTTP. Its functions don’t support cookies or authorization, don’t support setting header lines in the HTTP request, and generally don’t support reading header lines in the HTTP response (most notably the full HTTP error message, in case of an error). To get at all those features, you’ll have to use LWP::UserAgent.

LWP::UserAgent is a class for “virtual browsers,” which you use for performing requests, and HTTP::Response is a class for the responses (or error messages) that you get back from those requests.

There are two objects involved: $browser, which holds an object of the class LWP::UserAgent, and the $response object, which is of the class HTTP::Response. You really need only one browser object per program, but every time you make a request you get back a response object, which has some interesting attributes:

$response->is_success : true if the request succeeded, false otherwise.

$response->status_line : the HTTP status line (like “404 Not Found”), which is hopefully informative when a request fails.

$response->content_type : a MIME content type like “text/html”, “image/gif”, “application/xml”, and so on.

$response->content : the actual content of the response. If the response is HTML, that’s where the HTML source will be; if it’s a GIF, then $response->content will be the binary GIF data.

Enabling Cookies

A default LWP::UserAgent object acts like a browser with its cookies support turned off.
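You can activate cookies by setting its cookie_jar attribute:

$browser->cookie_jar({});

Passing an empty hash reference creates a temporary, in-memory cookie jar. To keep cookies between runs, cookie_jar also accepts an HTTP::Cookies object that loads cookies from a file and saves them back to it (subclasses such as HTTP::Cookies::Netscape can read cookie files written by browsers).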

The following script takes a URL from the command line and prints the content of the corresponding web page both to the screen and to a new file called “code.html” (created by running the script).


#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# $browser is an instance of the LWP::UserAgent class
my $browser = LWP::UserAgent->new;
my $url = $ARGV[0] or die "Usage: $0 URL\n";   # the URL is passed on the command line
my $response = $browser->get($url);

die "Can't get $url: ", $response->status_line, "\n"
    unless $response->is_success;

# check that the content is HTML
die "Hey, I was expecting HTML, not ", $response->content_type, "\n"
    unless $response->content_type eq 'text/html';

# decoded_content returns the body as Perl characters, so encode on output
my $html = $response->decoded_content;
binmode STDOUT, ':encoding(UTF-8)';

# print the content to the console
print "Page content:\n";
print $html;

# print the content to a new file (an existing code.html is overwritten)
open(my $page, '>:encoding(UTF-8)', 'code.html')
    or die "Can't open code.html: $!\n";
print $page $html;
close($page);

# regular expression: search for a string in the content
if ($html =~ m/perl/i) {
    print "\n\nThis page is about Perl!\n\n";
} else {
    print "\n\nNo content about Perl!\n\n";
}

Filesystem: the files and directories (or folders), and also the method used to store data on the hard drive (such as the ext3 filesystem).

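– Windows keeps all the important system files in a single directory under C:\ (typically C:\Windows)
– Linux follows the lead of its UNIX predecessors and spreads its system files across several standard directories
– Windows and Linux setups are both logical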

✓ Linux uses a forward slash (/) between directories, not the backslash (\) that Windows uses. So, the file yum.conf in the directory etc is /etc/yum.conf.

✓ Files and directories can have names up to 255 characters long on most Linux filesystems (such as ext3), and these names can contain underscores (_), dashes (-), and dots (.) anywhere within. So my.big.file or my.big_file or my-big-file are all valid filenames.

✓ Upper- and lowercase matter. They have to match exactly. The files yum.conf and Yum.conf are not the same as far as Linux is concerned.
Linux is case-sensitive — it pays attention to the case of each character. Windows, on the other hand, is case-insensitive.

✓ The same filesystem can span multiple partitions, hard drives, and media (such as CD-ROM drives). You just keep going down through
subdirectories, not having to care whether something is on disk A, B, or whatever.
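
The naming rules above can be tried out in a scratch directory. What follows is a minimal sketch, assuming a typical case-sensitive Linux filesystem; the workdir variable and the file names are just examples, and getconf reports whatever name-length limit applies to the filesystem you run it on.

```shell
# Scratch directory, so nothing on the real system is touched.
workdir=$(mktemp -d)

# Forward slashes separate directories: this creates etc/ and a file inside it.
mkdir -p "$workdir/etc"
touch "$workdir/etc/yum.conf"

# Case matters: on a case-sensitive filesystem this is a second, different file.
touch "$workdir/etc/Yum.conf"
ls "$workdir/etc"

# Underscores, dashes, and dots are all allowed in names.
touch "$workdir/my.big.file" "$workdir/my.big_file" "$workdir/my-big-file"

# Ask the system for the longest file name allowed in this directory.
getconf NAME_MAX "$workdir"
```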

Everything in the Linux filesystem is relative to the root directory — not to be confused with the system Administrator, who is the root user. The root directory is referred to as /, and it is the filesystem’s home base — a doorway into all your files. As such, it contains a relatively predictable set of subdirectories. Each distribution varies slightly in terms of what it puts in the root directory. More or less you can find the following directories.

/bin   : Essential commands that everyone needs to use at any time.*
/boot  : The information that boots the machine, including your kernel.*
/dev   : The device files that represent the hardware your system needs to interface with.*
/etc  : The configuration files for your system.*
/home  : The home directories for each of your users.
/lib  : The libraries, or the code that many programs (and the kernel) use.*
/media  : A spot where you add temporary media, such as floppy disks and  CD-ROMs; not all distributions have this directory.
/mnt  :  A spot where you add extra filesystem components such as networked drives and items you aren’t permanently adding to your filesystem but that aren’t as temporary as CD-ROMs and floppies.
/opt   : The location that some people decide to use (and some programs want to use) for installing new software packages, such as word processors and office suites.
/proc   : Current settings for your kernel (operating system).*
/root   : The superuser’s (root user’s) home directory.
/sbin   : The commands the system Administrator needs access to.*
/srv   : Data for your system’s services (the programs that run in the background).*
/sys   : Kernel information about your hardware.*
/tmp   : The place where everyone and everything stores temporary files.
/usr   : A complex hierarchy of additional programs and files.
/var   : The data that changes frequently, such as log files and your mail.
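
Since each distribution varies slightly, a quick way to see which of these directories your own system has is a small loop. This is just a sketch; check_root_dirs is a name invented for the example.

```shell
# check_root_dirs is a hypothetical helper name used only in this example.
check_root_dirs() {
    for dir in /bin /boot /dev /etc /home /lib /media /mnt /opt \
               /proc /root /sbin /srv /sys /tmp /usr /var; do
        if [ -d "$dir" ]; then
            echo "$dir: present"
        else
            echo "$dir: not present on this system"
        fi
    done
}

check_root_dirs
```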

Check out my slides, an introduction to the GNU/Linux OS, in PDF format!

There is no .exe equivalent in Linux. Executables are denoted by file permissions, not extensions. In directories such as /etc, many files have no extension at all: because a file is in /etc, it is assumed to be a configuration (ASCII text) file.

Ex. “RELEASE NOTE” is a perfectly valid name for a file, even without an extension (remember that names are case-sensitive).
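
To see that permissions, not extensions, make a file executable, here is a small sketch. The script name hello is made up and deliberately has no extension, and the example assumes the temporary directory is not mounted with the noexec option.

```shell
workdir=$(mktemp -d)
script="$workdir/hello"   # hypothetical name, deliberately without an extension

# Write a tiny shell script into the file.
printf '#!/bin/sh\necho "hello from a file with no extension"\n' > "$script"

# A freshly created file has no execute permission.
[ -x "$script" ] || echo "not executable yet"

# Setting the execute bit is what turns it into a program.
chmod +x "$script"
"$script"
```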

The following list shows the most common file extensions on Linux:

.a   : a static library ;
.au    : an audio file ;
.bin :    a) a binary image of a CD (usually a .cue file is also included); b) represents that the file is binary and is meant to be executed ;
.bz2 :    A file compressed using bzip2 ;
.c :    A C source file ;
.conf :  A configuration file. System-wide config files reside in /etc while any user-specific configuration will be somewhere in the user’s home directory ;
.cpp :  A C++ source file ;
.deb :  a Debian Package;
.diff :   A file containing instructions to apply a patch from a base version to another version of a single file or a project (such as the linux kernel);
.dsc:   a Debian Source information file ;
.ebuild : Bash script used to install programs through the portage system. Especially prevalent on Gentoo systems;
.el :  Emacs Lisp code file;
.elc :  Compiled Emacs Lisp code file;
.gif :    a graphical or image file;
.h :    a C or C++ header file;
.html/.htm :    an HTML file;
.iso :    An image (copy) of a CD-ROM or DVD in the ISO-9660 filesystem format;
.jpg :    a graphical or image file, such as a photo or artwork;
.ko :    The kernel module extension for 2.6.x and later kernels;
.la :    A file created by libtool to aid in using the library;
.lo :    The intermediate file of a library that is being compiled;
.lock :    A lock file that prevents the use of another file;
.log :    a system or program’s log file;
.m4 :    M4 macro code file;
.o :    1) The intermediate file of a program that is being compiled ; 2) The kernel module extension for a 2.4 series kernel ; 3)a program object file;
.pdf :    an electronic image of a document;
.php :     a PHP script;
.pid :    Some programs write their process ID into a file with this extension;
.pl :    a Perl script;
.png :    a graphical or image file;
.ps :    a PostScript file; formatted for printing;
.py :    a Python script;
.rpm :    an RPM package, used by distributions such as Fedora as part of their package management system;
.s :    An assembly source code file;
.sh :    a shell script;
.so :     a Shared Object, which is a shared library. This is the equivalent form of a Windows DLL file;
.src  :    A source code file. Written in plain text, a source file must be compiled to be used;
.sfs :    Squashfs filesystem used in the SFS Technology;
.tar.bz2, .tbz2, .tar.gz :     a compressed tar archive (tarball);
.tcl :    a TCL script;
.tgz :     a compressed tar archive (same as .tar.gz). This may also denote a Slackware binary or source package;
.txt :    a plain ASCII text file;
.xbm :    an XWindows Bitmap image;
.xpm :     an image file;
.xcf.gz, xcf :  A GIMP image (native image format of the GIMP);
.xwd :    a screenshot or image of a window taken with xwd;
.zip :    a file in ZIP format, a popular file compression format;
.wav :    an audio file.
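
Because an extension is only a naming convention, a script that wants to guess a file’s type has to look at the name itself. Here is a minimal sketch using a shell case statement; describe_extension is a name I invented, and it only covers a handful of the extensions listed above.

```shell
# describe_extension is a hypothetical helper; it only inspects the file name.
describe_extension() {
    case "$1" in
        *.tar.gz|*.tgz)    echo "gzip-compressed tarball" ;;
        *.tar.bz2|*.tbz2)  echo "bzip2-compressed tarball" ;;
        *.conf)            echo "configuration file" ;;
        *.sh)              echo "shell script" ;;
        *.pl)              echo "Perl script" ;;
        *.py)              echo "Python script" ;;
        *.so)              echo "shared library" ;;
        *.*)               echo "an extension not covered by this sketch" ;;
        *)                 echo "no extension" ;;
    esac
}

describe_extension backup.tar.gz
describe_extension yum.conf
describe_extension "RELEASE NOTE"
```

The longer patterns (such as *.tar.gz) come first, so they win over shorter ones that would also match.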

Although rar and zip files are supported, Linux has its own archive file extensions too.
When you’re looking for software or when you need to save yourself some space, you can find files with the following extensions:

A tarball is a bunch of files (and possibly directories) packaged together in a .tar file and compressed using the gzip utility; the tarball then has the .tar.gz extension.

.tar : A bunch of files bundled together
.tar.bz2 : A tarball compressed with bzip2 (a .tar file inside a .bz2 file).
.tar.gz : A traditional tarball, which is a .tar file inside a .gz file.

Program (shell command): tar, bzip2, gunzip, gzip.
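
As a rough illustration of the formats above, this sketch bundles a small directory into a .tar, makes a gzip-compressed tarball from it, then lists and unpacks it. The project directory and notes.txt are made-up names, and everything happens in a temporary directory.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/project"
echo "hello" > "$workdir/project/notes.txt"

# .tar: files bundled together, no compression (-c create, -f archive file).
tar -cf "$workdir/project.tar" -C "$workdir" project

# .tar.gz: the same bundle compressed with gzip (-z).
tar -czf "$workdir/project.tar.gz" -C "$workdir" project

# List the contents of the tarball without unpacking it (-t).
tar -tzf "$workdir/project.tar.gz"

# Unpack it into another directory (-x extract, -C target directory).
mkdir -p "$workdir/restored"
tar -xzf "$workdir/project.tar.gz" -C "$workdir/restored"
cat "$workdir/restored/project/notes.txt"
```

For a .tar.bz2 the commands are the same with -j in place of -z, provided bzip2 is installed.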

Other archive extensions are: .deb, and .rpm

.deb : All the files related to an application bundled together using a Debian-specific format, used in Debian, Ubuntu and gOS.
Program (shell command): dpkg, apt-get.

.rpm : All the files related to a single application bundled together using a format designed by Red Hat and used in Fedora and openSUSE.
Program (shell command): rpm, yum, zypper (for openSUSE distributions).
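
Which of these tools you have depends on your distribution. A small sketch to find out is shown below; list_package_tools is a name made up for the example, and zypper is the openSUSE command.

```shell
# list_package_tools is a hypothetical helper name used only here.
list_package_tools() {
    for tool in dpkg apt-get rpm yum zypper; do
        # command -v succeeds only if the shell can find the program.
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: available"
        else
            echo "$tool: not found"
        fi
    done
}

list_package_tools
```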