Thursday, January 28, 2010

Twitter style

Styling My Twitter संजीव लिनुक्स (San-Linux)

Perl UTF8 to DEC

This reads the XML lines loaded in @xmld and collects them in $jxmld, with every UTF-8 character converted to a decimal numeric character reference (&#NNNN;).
It is meant for systems that cannot store UTF-8 character sets.

foreach my $line (@xmld)
{
    # Convert repeatedly: one pass can miss byte sequences that only line
    # up after an earlier replacement. Give up after five passes.
    my $loopc = 0;
    while ($line =~ /[\x{80}-\x{FFFF}]/ || $line =~ /\d{3}_=_/) {
        $line = utf8todec($line);
        $loopc++;
        last if $loopc > 4;
    }

    # Turn any leftover "NNN_=_" markers into numeric character references.
    $line =~ s/(\d{3,})_=_/&#$1;/g;

    # Append the converted line to the output string.
    if ($line ne "") {
        if ( $jxmld !~ /\s$/ && $line !~ /.\s/ && $jxmld ne "" ) {
            $jxmld .= " $line";
        } else {
            $jxmld .= $line;
        }
    }
}

sub utf8todec
{
    my $u_st = shift;
    my (@u_ar, $u_c1, $u_c2, $u_c3, $u_cs, $u_ch);

    # Tag every non-ASCII character with its ordinal, e.g. "224_=_".
    $u_st =~ s/([\x{80}-\x{FFFF}])/ord($1).'_=_'/gse;

    # Three-byte sequences (lead byte 224-239) become one code point.
    if (@u_ar = ($u_st =~ m/\d{3}_=_\d{3}_=_\d{3}_=_/g)) {
        foreach $u_cs (@u_ar) {
            if ($u_cs =~ m/(\d{3})_=_(\d{3})_=_(\d{3})_=_/) {
                ($u_c1, $u_c2, $u_c3) = ($1, $2, $3);
                if ($u_c1 >= 224 && $u_c1 <= 239) {
                    $u_ch = ($u_c1-224)*64*64 + ($u_c2-128)*64 + ($u_c3-128);
                    $u_st =~ s/\Q$u_cs\E/&#$u_ch;/g;
                }
            }
        }
    }

    # Two-byte sequences (lead byte 192-223) become one code point.
    if (@u_ar = ($u_st =~ m/\d{3}_=_\d{3}_=_/g)) {
        foreach $u_cs (@u_ar) {
            if ($u_cs =~ m/(\d{3})_=_(\d{3})_=_/) {
                ($u_c1, $u_c2) = ($1, $2);
                if ($u_c1 >= 192 && $u_c1 <= 223) {
                    $u_ch = ($u_c1-192)*64 + ($u_c2-128);
                    $u_st =~ s/\Q$u_cs\E/&#$u_ch;/g;
                }
            }
        }
    }
    return $u_st;
}
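
To check the byte arithmetic by hand, here is a small shell sketch of the same two formulas. The function name utf8_bytes_to_dec is made up for this illustration and is not part of the Perl script; it takes the decimal byte values as arguments and only covers the two- and three-byte cases handled above.

```shell
# utf8_bytes_to_dec B1 B2 [B3]: combine decimal UTF-8 byte values into the
# decimal code point, using the same arithmetic as utf8todec.
# (Hypothetical helper, for illustration only.)
utf8_bytes_to_dec() {
    if [ "$#" -eq 3 ] && [ "$1" -ge 224 ] && [ "$1" -le 239 ]; then
        # Three-byte sequence: lead byte 224-239.
        echo $(( ($1 - 224) * 64 * 64 + ($2 - 128) * 64 + ($3 - 128) ))
    elif [ "$#" -eq 2 ] && [ "$1" -ge 192 ] && [ "$1" -le 223 ]; then
        # Two-byte sequence: lead byte 192-223.
        echo $(( ($1 - 192) * 64 + ($2 - 128) ))
    else
        echo "unsupported byte sequence: $*" >&2
        return 1
    fi
}

utf8_bytes_to_dec 224 164 184   # bytes of स (U+0938), prints 2360
utf8_bytes_to_dec 195 169       # bytes of é (U+00E9), prints 233
```

So स would end up in the XML as &#2360; and é as &#233;.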

Tuesday, January 19, 2010

BASH

for loops: 1 to 10 (using seq)

for i in $(seq 1 10); do
    echo "$i"
done

for loops: 1 to 10 (using brace expansion)

for i in {1..10}; do
    echo "$i"
done

for loops: A to Z

for i in {A..Z}; do
    echo "$i"
done
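
One thing to watch with brace expansion: bash expands the braces before it expands variables, so {1..$n} does not produce a range. When the upper bound is in a variable, seq or an explicit counter does the job. Below is a minimal sketch with a counter; the function name count_to is just an example, and it uses a plain while loop so it should also run in other POSIX shells.

```shell
# count_to N: print the numbers 1..N, one per line.
# (Hypothetical helper; N comes from a variable, so {1..N} cannot be used.)
count_to() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        echo "$i"
        i=$((i + 1))
    done
}

count_to 3
```

In bash itself, for ((i = 1; i <= n; i++)) is another way to write the same loop.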

Monday, January 18, 2010

File differences

diff -BNarq

The options I find most useful for file differences:
-b  --ignore-space-change  Ignore changes in the amount of white space.
-w  --ignore-all-space     Ignore all white space.
-B  --ignore-blank-lines   Ignore changes whose lines are all blank.
-a  --text                 Treat all files as text.
-r  --recursive            Recursively compare any subdirectories found.
-N  --new-file             Treat absent files as empty.
-q  --brief                Output only whether files differ.


If the sources are checked out from svn,
I remove all the .svn directories before doing the diff (on a copy, since this strips the working-copy metadata), issuing
find . -type d -name ".svn" -prune -exec rm -frv '{}' \;
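
As an alternative to deleting the .svn directories, GNU diff can skip them with -x (--exclude). The sketch below assumes GNU diff; the function name compare_trees and the file names in the demo are made up for the example.

```shell
# compare_trees DIR1 DIR2: recursive brief diff that skips .svn directories.
# (Hypothetical helper; assumes GNU diff for the -x option.)
compare_trees() {
    diff -BNarq -x .svn "$1" "$2"
}

# Demo on two throwaway trees that differ in one file and in .svn metadata.
demo=$(mktemp -d)
mkdir -p "$demo/old/.svn" "$demo/new/.svn"
echo "one"    > "$demo/old/file.txt"
echo "two"    > "$demo/new/file.txt"
echo "meta-a" > "$demo/old/.svn/entries"
echo "meta-b" > "$demo/new/.svn/entries"

# diff exits with status 1 when the trees differ, hence the "|| true".
result=$(compare_trees "$demo/old" "$demo/new" || true)
echo "$result"
rm -rf "$demo"
```

Only file.txt should be reported as differing; the entries files under .svn are ignored.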

SED Tips

Sed: Search text and display

sed -n '/404 Not Found/,/405 Method Not Allowed/p' rfc2616.txt

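This searches rfc2616.txt for a line matching 404 Not Found and prints every line from there through the next line matching 405 Method Not Allowed. The range can match more than once: in the output below, the first two lines come from the table of contents and the rest from section 10.4.5.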

source: http://www.faqs.org/rfcs/rfc2616.txt

sed -n '/404 Not Found/,/405 Method Not Allowed/p' rfc2616.txt

10.4.5 404 Not Found ...........................................66
10.4.6 405 Method Not Allowed ..................................66
10.4.5 404 Not Found

The server has not found anything matching the Request-URI. No
indication is given of whether the condition is temporary or
permanent. The 410 (Gone) status code SHOULD be used if the server
knows, through some internally configurable mechanism, that an old
resource is permanently unavailable and has no forwarding address.
This status code is commonly used when the server does not wish to
reveal exactly why the request has been refused, or when no other
response is applicable.

10.4.6 405 Method Not Allowed
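
Because the range starts again every time the first pattern matches, the same command can print several blocks, as with the table of contents above. A small sketch on a made-up sample file (the BEGIN/END markers and variable names are only for illustration) shows this, plus one way to stop after the first block:

```shell
# Build a throwaway sample file with two BEGIN..END blocks.
sample=$(mktemp)
cat > "$sample" <<'EOF'
intro
BEGIN first
body one
END first
between
BEGIN second
body two
END second
EOF

# Prints both blocks: the range restarts at every line matching BEGIN.
all_ranges=$(sed -n '/BEGIN/,/END/p' "$sample")

# Prints only the first block: quit as soon as its END line is printed.
first_range=$(sed -n '/BEGIN/,/END/{p;/END/q;}' "$sample")

echo "$all_ranges"
echo "---"
echo "$first_range"
rm -f "$sample"
```

The lines intro and between are never printed, since they fall outside every range.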