Find UTF-8 byte order marks

In a templating application I just ran into ugly “ï»¿” characters at the beginning of the text. They are caused by the UTF-8 byte order mark (BOM), the byte sequence 0xEF 0xBB 0xBF. As this was not the only file that contained the BOM, I ran a search:

find . \( -iname '*.css' -o -iname '*.html' -o -iname '*.js' -o -iname '*.pm' -o -iname '*.pl' -o -iname '*.xml' \) -print0 | xargs -0 grep -l $'\xEF\xBB\xBF'
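
Note that the grep above matches the three bytes anywhere in a file. A stricter check looks only at the first three bytes. The following is a minimal sketch, assuming bash plus the usual head, od and tr tools; has_bom is a helper name made up for this example:

```shell
#!/usr/bin/env bash
# has_bom FILE: succeed if FILE starts with the UTF-8 BOM bytes EF BB BF.
# (has_bom is an illustrative name, not something from the original search.)
has_bom() {
    # Read only the first three bytes and compare them as hex text,
    # so no binary data ends up in a shell variable.
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Possible use: print every HTML file in the current directory that has a BOM.
#   for f in *.html; do has_bom "$f" && printf '%s\n' "$f"; done
```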

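To remove the BOM I followed the approach suggested at http://stackoverflow.com/questions/204765/elegant-way-to-search-for-utf-8-files-with-bom using sed. The parentheses matter here: without them, -exec would only apply to the last -iname test. The 1 in front of the s command limits the substitution to the first line: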

find . \( -iname '*.css' -o -iname '*.html' -o -iname '*.js' -o -iname '*.pm' -o -iname '*.pl' -o -iname '*.xml' \) -exec sed -i.bak '1s/^\xEF\xBB\xBF//' {} \; -exec rm {}.bak \;
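
The \xEF escapes in the sed expression are a GNU sed feature. Where that is not available, the same effect can be had by copying the file from the fourth byte onwards. This is only a sketch under that assumption; strip_bom is a made-up helper name:

```shell
#!/usr/bin/env bash
# strip_bom FILE: remove a leading UTF-8 BOM (EF BB BF) from FILE, if present.
# (strip_bom is an illustrative name, not part of the Stack Overflow answer.)
strip_bom() {
    local file="$1" tmp
    # Leave files alone whose first three bytes are not the BOM.
    if [ "$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')" != "efbbbf" ]; then
        return 0
    fi
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    # tail -c +4 outputs everything starting at byte 4, i.e. after the BOM.
    # Writing back with cat keeps the original file's owner and permissions.
    if tail -c +4 "$file" > "$tmp" && cat "$tmp" > "$file"; then
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Possible use with find:
#   find . -iname '*.css' -print0 | while IFS= read -r -d '' f; do strip_bom "$f"; done
```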

Tada, no more ugly BOMs!

Perl -e in Windows batch and Linux shell scripts

Recently I had to extract a partial string from a space-separated list of names in a loop within a script. The script had to exist in two versions, one for Windows batch and one for the Linux shell. Looping through one such list is quite easy. In Windows:

@echo off
SETLOCAL
set WEBSITES=Test1 Test2 Test3

FOR %%A IN (%WEBSITES%) DO (
echo WebsiteName=%%A
)

ENDLOCAL
GOTO :EOF

and in Linux shell:

WEBSITES="Test1 Test2 Test3"

for WEBSITE_ALIAS in $WEBSITES ; do
echo "Website= $WEBSITE_ALIAS ..."
done

But then a second list of strings came into play, containing the domain names of the website aliases:

WEBSITES="Test1 Test2 Test3"
DOMAINNAMES="www.test1.lan www.test2.lan www.test3.lan"

In a conventional programming language I would just use a for-loop with an index variable and use that variable to access both arrays within one loop. In batch and shell scripting this turned out to be quite tricky. My solution was a small inline Perl script. For Windows:

set WEBSITE_UNIT=Unit4
set WEBSITES=Test1 Test2 Test3
set DOMAINNAMES=www.test1.lan www.test2.lan www.test3.lan
perl -e "use strict; die(\"argv mismatch!\") if !@ARGV or scalar(@ARGV) < 2; my @Websites = split(/[\s,;|]/, $ARGV[0]); my @Domains = split(/[\s,;|]/, $ARGV[1]); die(\"number of aliases differs from domain names!\") if scalar(@Websites) != scalar(@Domains); for(my $i=0; $i<scalar(@Websites); $i++) { system('perl dosomething.pl -user /Root/'.$ENV{'WEBSITE_UNIT'}.'/admin -passwd admin -servername '.$ENV{'SERVER_NAME'}.' -alias '.$Websites[$i].''); system('perl doanotherthing.pl -user /Root/'.$ENV{'WEBSITE_UNIT'}.'/admin -passwd admin -alias '.$Websites[$i].' DomainName=\"'.$Domains[$i].'\"'); }" "%WEBSITES%" "%DOMAINNAMES%"

And for Linux:

WEBSITE_UNIT=Unit4
WEBSITES="Test1 Test2 Test3"
DOMAINNAMES="www.test1.lan www.test2.lan www.test3.lan"
perl -e 'use strict; die("argv mismatch!") if !@ARGV or scalar(@ARGV) < 2; my @Websites = split(/[\s,;|]/, $ARGV[0]); my @Domains = split(/[\s,;|]/, $ARGV[1]); die("number of aliases differs from domain names!") if scalar(@Websites) != scalar(@Domains); for(my $i=0; $i<scalar(@Websites); $i++) { system("perl dosomething.pl -user /Root/'${WEBSITE_UNIT}'/admin -passwd admin -servername ".$ENV{"SERVER_NAME"}." -alias ".$Websites[$i].""); system("perl doanotherthing.pl -user /Root/'${WEBSITE_UNIT}'/admin -passwd admin -alias ".$Websites[$i]." DomainName=\"".$Domains[$i]."\""); }' "$WEBSITES" "$DOMAINNAMES"

Notice the different handling of the ticks and quotes and the different access to external parameters. In Windows it makes no difference for Perl’s %ENV hash whether I access real environment variables or local variables set by the batch script, because every variable created with set is an environment variable. Not so under Linux: $ENV only gives me my system-wide exported environment variable SERVER_NAME, but not my script’s local WEBSITE_UNIT variable, because a plain shell variable is not passed on to child processes unless it is exported. When using perl -e on Windows I had to wrap the Perl code in double quotes, whereas in the Linux shell I needed single ticks, which can NOT be used inside the Perl code, not even when escaped as \'. In my script the single ticks are “reserved” for closing the quoted code so that shell variables can be inserted anywhere.
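
The difference between local and exported variables on Linux can be seen without Perl. The following is a small sketch using the variable names from above; the value of SERVER_NAME and the helper child_sees are made up for this demonstration:

```shell
#!/usr/bin/env bash
# Start from a clean slate in case either name is already in the environment.
unset WEBSITE_UNIT SERVER_NAME

WEBSITE_UNIT="Unit4"                    # plain shell variable, stays inside this shell
export SERVER_NAME="www.example.lan"    # exported (hypothetical value), inherited by children

# child_sees NAME: show what a child process finds under NAME in its environment.
# printenv is an external program, so it only sees exported variables.
child_sees() {
    printenv "$1" || echo "<unset>"
}

child_sees WEBSITE_UNIT    # prints: <unset>
child_sees SERVER_NAME     # prints: www.example.lan

# Two ways to hand WEBSITE_UNIT to a child such as perl:
#   export WEBSITE_UNIT                            (for all later commands)
#   WEBSITE_UNIT="$WEBSITE_UNIT" perl -e '...'     (for this one command only)
```

With either of the two variants the Linux one-liner could read $ENV{"WEBSITE_UNIT"} just like the Windows version does.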

Disable auto word wrap in nano

Quite annoying when editing source code via console in nano is the automatic word wrap, which often screws up compilation of the edited file. To disable the word wrap, just edit the .nanorc in your home directory and add:

set nowrap

You can also do this when starting nano:

nano -w <file>
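
If the setting has to be rolled out to several accounts, the line can be appended by a script. This is a minimal sketch, assuming bash; add_nanorc_option is a made-up helper name, and the file defaults to ~/.nanorc:

```shell
#!/usr/bin/env bash
# add_nanorc_option OPTION [FILE]: append "set OPTION" to FILE (default: ~/.nanorc)
# unless exactly that line is already present.
# (add_nanorc_option is an illustrative name, not a nano feature.)
add_nanorc_option() {
    local line="set $1"
    local rc="${2:-$HOME/.nanorc}"
    touch "$rc"    # create the file if it does not exist yet
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$rc" || printf '%s\n' "$line" >> "$rc"
}

# Possible use:
#   add_nanorc_option nowrap
```

Running it twice leaves only one copy of the line in the file.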

Portable virtual boxes

I’ve been looking for an alternative to VMware and came across Sun’s VirtualBox. I also found the portable version, which is very cool if you want to set up a bunch of virtual machines and share them with your colleagues. I installed CentOS 5, a free Linux distribution very similar to Red Hat. To use the “Guest Additions” that come with VirtualBox, you first need to set up the kernel headers in the guest.

Furthermore, if you want to make a copy of a virtual machine drive you created with VirtualBox, you can use CloneVDI. Also very nice is the not-so-obvious possibility to map network ports of the virtual machine to your local computer using the VBoxManage tool. For instance, I forwarded the VM’s ports 22 (SSH), 80 (HTTP) and 443 (HTTPS) to my local computer’s ports 2222, 2280 and 22443:

VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_ssh/Protocol" TCP
VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_ssh/GuestPort" 22
VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_ssh/HostPort" 2222

VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_www/Protocol" TCP
VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_www/GuestPort" 80
VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_www/HostPort" 2280

VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_ssl/Protocol" TCP
VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_ssl/GuestPort" 443
VBoxManage setextradata "CentOS5" "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guest_ssl/HostPort" 22443
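
Since every rule consists of the same three commands, they can be generated. The following sketch, assuming bash, only prints the commands so they can be checked first; forward_port_cmds is a made-up helper name, and the pcnet device path is taken as-is from the commands above:

```shell
#!/usr/bin/env bash
# forward_port_cmds VM RULE GUEST_PORT HOST_PORT
# Prints (does not run) the three setextradata commands for one TCP forwarding rule.
# (forward_port_cmds is an illustrative name, not part of VirtualBox.)
forward_port_cmds() {
    local vm="$1" rule="$2" guest_port="$3" host_port="$4"
    local base="VBoxInternal/Devices/pcnet/0/LUN#0/Config/$rule"
    printf 'VBoxManage setextradata "%s" "%s/Protocol" TCP\n' "$vm" "$base"
    printf 'VBoxManage setextradata "%s" "%s/GuestPort" %s\n' "$vm" "$base" "$guest_port"
    printf 'VBoxManage setextradata "%s" "%s/HostPort" %s\n' "$vm" "$base" "$host_port"
}

# Possible use, reproducing the three rules from above:
#   forward_port_cmds CentOS5 guest_ssh 22 2222
#   forward_port_cmds CentOS5 guest_www 80 2280
#   forward_port_cmds CentOS5 guest_ssl 443 22443
# Append "| sh" to a call to actually apply the printed commands.
```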

If you’re taking your first steps in CentOS, like I did, you might find it helpful that the basic configuration screen which appears right after the installation can be re-run with the setup command. And if you are coming from Debian (apt-get ..) or SuSE (yast2), be aware that the installer for repository packages in Red Hat and CentOS is called yum.

Default editor for midnight commander

Mcedit isn’t everyone’s choice. It’s quite hard to copy & paste text with mcedit, so I wanted nano to do the F4 key’s job. You can easily change the editor separately for each user by setting the EDITOR variable:

export EDITOR="nano"

In addition you have to disable the internal mcedit editor by editing Midnight Commander’s ini file in the user’s home directory, ~/.mc/ini:

use_internal_edit=0
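
The ini change can also be scripted. This is a minimal sketch, assuming bash and a sed that accepts -i.bak; use_external_editor is a made-up helper name, and it only changes an existing use_internal_edit line rather than adding a missing one. It is probably best run while mc is closed, as mc may write its settings back on exit:

```shell
#!/usr/bin/env bash
# use_external_editor [INI]: set use_internal_edit=0 in the given mc ini file
# (default: ~/.mc/ini). A backup copy named INI.bak is left behind.
# (use_external_editor is an illustrative name, not an mc command.)
use_external_editor() {
    local ini="${1:-$HOME/.mc/ini}"
    # Refuse to guess where the key belongs if it is not there yet.
    if ! grep -q '^use_internal_edit=' "$ini"; then
        echo "use_internal_edit not found in $ini" >&2
        return 1
    fi
    sed -i.bak 's/^use_internal_edit=.*/use_internal_edit=0/' "$ini"
}

# Possible use:
#   export EDITOR="nano"
#   use_external_editor
```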