Sunday, November 24, 2013

Checking environment variable in shell scripts

Found a couple of ways; here is the common one :

My test scripts :


if [ -n $JJ ] ; then
    echo "-n1"
fi

if [ -z $JJ ] ; then
    echo "-z1"
fi

if [ -n $JJ ]; then
    echo "-n2"
fi

if [ -z $JJ ] ; then
    echo "-z2"
fi

if [ -n $JJ ]; then
    echo "-n3"
fi

if [ -z $JJ ]; then
    echo "-z3"
fi

The result is ...


-z tests if a string is empty.
-n tests if a string is not empty.

What about when the string is undefined? The result really confused me. I have checked the environment variables; I don't have one with the name "JJ".
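The confusion comes down to quoting; a minimal sketch (JJ deliberately unset):

```shell
unset JJ

# Unquoted, [ -n $JJ ] expands to [ -n ], which tests whether the
# literal string "-n" is non-empty -- always true, even when JJ is unset.
if [ -n $JJ ]; then echo "unquoted -n: non-empty"; fi

# Quoted, the test sees an actual empty string and behaves as expected.
if [ -n "$JJ" ]; then echo "quoted -n: non-empty"; fi
if [ -z "$JJ" ]; then echo "quoted -z: empty"; fi
```

So the quoted -z form is the safe check for an empty-or-unset variable.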

Reference :

I have kept this in draft for a long, long time due to the -n result. So.... a quick conclusion is: use -z. Use -n at your own risk. To revisit this next time, if I still remember. :P

"Rename" a user in Linux system

I tried this in Ubuntu 12.04 LTS. Run these commands as root in a terminal.

To change a user name.

usermod -l <new_user_id> <old_user_id>

To change the home directory

usermod -m -d /home/<new_user_id> <new_user_id>

The above 2 can be combined into one command.

usermod -l <new_user_id> -m -d /home/<new_user_id> <old_user_id>

* I didn't try this myself though. :P

Next, change the content of the following files, replacing <old_user_id> with <new_user_id>.


What is next? Yes, the user name. This can be changed via User Accounts. Click on the people icon at the upper right and open the User Accounts GUI. The name can be changed via that interface.

After logging out, you'll see the <new_user_id> account is ready to be used. :)

Note, if you are seeing an error saying it is unable to change the user id, you'll need to log in to the system as root directly, not become root via sudo su.

Reference : 

This is useful when you are working on a VM cloned from your peer. :P

Ah.. nearly forgot the last step. After you relogin, you'll see the group name remains <old_user_id>. Run this command to change the group name. :)

groupmod -n <new_user_id> <old_user_id>


Tuesday, September 3, 2013

Create cdb file using perl

Creating a cdb file in Perl is quite easy when you have the hash ready.

See example below.

use strict;

use CDB_File;

my %t = (
            'a' => 'apple',
            'b' => 'banana',
);

CDB_File::create(%t, "<cdb file name>", "<temporary file name>");

If you have the key-value pairs in a file, you can parse through it and insert into the cdb either by building the hash first, or pair by pair.

Say your data file contains this.


Perl code for getting this flat file data into cdb :

use strict;

use CDB_File;

my $cdb = CDB_File->new("<cdb file name>", "<temporary file name>");

open (DATA, "<data file name>") or die "unable to open data file";

while (<DATA>) {
    chomp;
    my ($key, $value) = split /,/;
    $cdb->insert($key, $value);
}

$cdb->finish;


Monday, September 2, 2013

Something about object files

This is from a long lost track. I used to deal with object files in my previous job. That was like ages ago. Is VCS still compiling the design into object files to run simulation? Anyway, thanks to Synopsys team that helped me a lot in improving my debugging skills. :)

Some simple steps to create a "test case" here. Let's create a "Hello world!" object file.

#include <stdio.h>

int main() {
    printf("Hello world!\n");
    return 0;
}

Compile it. Note that gcc -c produces a true object (.o) file, while the command below links straight through to an executable; it is the latter that the commands further down are run against.

gcc <source code file name> -o <object file name>

To enable debug mode, compile with the -g option. With debug mode, you can set breakpoints when running the program in gdb.

OK, back to the story.

These are a few commands that they taught me to investigate about the object files.

To get the list of the functions in the object file.

strings <object file>

This one is not obvious in my test case. The strings command actually lists all the strings in the file, not just function names. Anyway, it was handy during debugging back then.

To list the shared library dependencies of the file and the paths they resolve to.

ldd <object file>

To list the symbols in the file along with their sizes and addresses. I added the format option for better readability of the output. :P

nm --format sysv <object file>

Then, I googled and tried one more command. This command "decodes" (disassembles) the object file into assembly language! My favourite language during Uni-time. :D

objdump -d <object file>

The full result is too long to include, so I am putting a small part of it here. :)

Friday, August 23, 2013

Some sharings on Oracle database and MySQL database

Recently I have been getting involved a lot in database-related tasks. My previous jobs mostly dealt with gdbm and MySQL. This "new" job has me dealing a lot with Oracle database and cdb. :)

I actually never bothered to get to know the special tables in the database. Today I needed to run a tiny query on a production server to find out the number of records in some of the tables. Imagine a database with thousands of tables. I can't be doing

select count(*) from <table name>

for all of the table names. In addition, I don't know the list of tables of interest, only the pattern of the table names.

Luckily, found this post :

There is a special view that holds the table information, namely ALL_TABLES. To get the list of tables in the database and the number of rows in each (as of the last statistics gathering), run this sql statement in sqlplus.

select table_name, num_rows from all_tables;


For MySQL :

select table_name, table_rows from information_schema.tables;

If you want to filter by table name, please note that Oracle stores table names in upper case by default, and the comparison is case-sensitive.

To get the history of last queries, Oracle database seems a lot easier.

For Oracle, to select sql run for the past hour :

select cast(sql_fulltext as varchar(<number of characters>)) from v$sqlarea where last_active_time > (sysdate-1/24);

The field sql_fulltext is of clob type, thus the cast helps to display the "real" full text. :)

For MySQL, you'll need to turn on the general log by starting the MySQL service with this option :

--general_log=1


By default, this option is turned off. If you do not know where the log file is located, you can actually find it from the database.

select general_log_file from information_schema.global_variables;

Please note that you are not able to change the file path or name by updating this field. Instead, you can specify it when starting the MySQL service with this option :

--general_log_file=<file path and name>
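Rather than command-line options, both settings can also live in the MySQL configuration file; a sketch (the path is just an example):

```
[mysqld]
general_log = 1
general_log_file = /var/log/mysql/general.log
```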

That's all for now. I am going to continue studying the MySQL equivalent of Oracle's flashback before the next sharing on this topic. :)

Wednesday, August 7, 2013

Accessing data in cdb file using Perl

See this earlier post on cdb files.

Never knew this could be so easy; thanks to the contributors, see here.

To install the library, use cpan. Note: this must be run as root, as it needs root access to create new files/directories.

sudo cpan
install CDB_File

Say you have a cdb file called abc.cdb with a key summer. Here's how you can get the value of the key summer.

use strict;

use CDB_File;

my $cdb = "abc.cdb";
my %hash;

tie %hash, 'CDB_File', $cdb or
    die("unable to tie to cdb file");

my $value = $hash{'summer'};
print "$value\n";

Please note, if a key is not found, it will return undef.

Without the CDB_File library, the tie will not work, and will have this error message :

Can't locate object method "TIEHASH" via package "CDB_File"

So, have fun! :)

Sunday, April 21, 2013

Sending picture file in an email

It is easy to attach a picture in an email, from the user's perspective, of course. But what is actually sent over the Internet to the recipient server?

There are 2 ways a picture can be transmitted in an email: as an inline picture, or as an attachment. Before they are transferred, they go through a conversion, a binary-to-text encoding called Base64.

Say the bytes of the image start with FF D8 FF; the translation process goes as below :

Original bytes                 FF       D8       FF
Regroup the bits               111111   111101   100011   111111
Associated decimal             63       61       35       63
From the Base64 index table    /        9        j        /
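The worked example can be verified in a shell; the base64 tool performs exactly this translation (the three bytes written as octal escapes):

```shell
# FF D8 FF in octal is \377\330\377; base64 turns it into "/9j/".
printf '\377\330\377' | base64
# prints: /9j/
```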

Inline image

To send as an inline, the email message shall have the following attributes.

Content-Type: image/gif; name="file name"
Content-Transfer-Encoding: base64
X-Attachment-Id: image-id
Content-ID: <image-id>

Followed by the image data in Base64. The attributes and the image data are placed within the content type boundary.

In the email content, the image is referred to with

<img src="cid:image-id">

Attachment image

For an image as an attachment (for download), the email message attributes differ from the inline image.

Content-Type: image/gif; name="file name"
Content-Disposition: attachment; filename="file name"
Content-Transfer-Encoding: base64
X-Attachment-Id: image-id

Similar to inline image, the message is followed by the image binaries in Base64 format, and is enclosed within the content type boundary.

The Content-Disposition: attachment header (together with the absence of Content-ID) is basically the attribute that differentiates an attachment image from an inline image.

Saturday, April 20, 2013

Character set of a html document

To continue with the previous post on character encoding, my actual topic of interest is how the browser detects which character set is used before rendering the page, when it is not specified in the html header.

Below is my "test code", and I saved them into 4 different file formats (the additional one is Unicode).


Note, there's no character set or doctype specified in the html header, so it is rendered in quirks mode.

I tested in 2 different browsers, IE 9 and Firefox 20, and surprisingly got 2 different results. I am using document.charset and document.characterSet for IE and FF respectively to check the character encoding of the document.

File format            IE9           FF20
ANSI                   big5          windows-1252
Unicode                unicode       UTF-16
Unicode (big endian)   unicodeFEFF   UTF-16
UTF-8                  utf-8         UTF-8

FF20 shows mojibake characters for the ANSI file; it only displays correctly when the browser's encoding is changed to Chinese Traditional (Big5). The rest are rendered fine (readable) by default.

So, I added the charset attribute in meta tag section. I also added the 4.01 strict doctype.

<meta http-equiv="content-type" content="text/html; charset=<charset>">


File format            charset   IE9           FF20
ANSI                   big5      big5          big5
Unicode                UTF-8     unicode       UTF-16
Unicode (big endian)   UTF-8     unicodeFEFF   UTF-16
UTF-8                  UTF-8     utf-8         UTF-8

Except for the ANSI file in FF20, which changed to big5, the rest keep the same encoding. However, for the Unicode (big endian) file in IE9, the following warning message is observed :

HTML1114: Codepage unicodeFEFF from (UNICODE byte order mark) overrides conflicting codepage utf-8 from (META tag)

There is no selection to change the page encoding to UTF-16 in FF, and there is no selection of unicodeFEFF in IE. I have no idea (yet?) why the document character set is returning those results.

From the above results, the recommended format to save an html document in is UTF-8, at least if we are using characters outside the US-ASCII character set.

Friday, April 19, 2013

Character Encoding

My study object is the Chinese character "一".

I am using Notepad in Window 7 to save in different formats, namely ANSI, Unicode and UTF-8. My system locale is Chinese (Traditional, Taiwan). I am using Traditional Chinese Google IME as input method.

In simple words, encoding is representing "something" in some "notation", and decoding is recovering the "something" from that "notation". The notation I am referring to here is the binary representation.

I downloaded Hex Edit to observe the differences of the "encoding" or format (used in Notepad).


ANSI

ANSI usually refers to the ASCII character set plus an extended character set. However, such a character set contains only 256 characters and cannot cover Chinese characters. Sometimes, when saving Chinese text in Notepad, it prompts to save in another format; sometimes, however, I am able to save it in ANSI format anyway. Searching for the mystery encoding behind this, I found out it is actually saved in CNS 11643-1986. "一" in this case is represented by A4 40, which is the first character at the start of hanzi level 1. (The reference web sites are too complicated to use.)
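As a quick check (assuming iconv is available; the Big5 table, which shares this layout for hanzi level 1, maps the character to the same two bytes):

```shell
# "一" is E4 B8 80 in UTF-8 (written here as octal escapes); convert to Big5
# and dump the resulting bytes.
printf '\344\270\200' | iconv -f UTF-8 -t BIG5 | od -An -tx1
# the dump shows: a4 40
```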

Unicode (Big Endian)

A quick reference : for my test case, Notepad saves "一" as the CJK unified ideograph 4E 00. However, as I observed in the hex editor, the file actually contains FE FF 4E 00. FE FF is the byte order mark, the character U+FEFF ("zero width no-break space"). It occurs at the start of every file saved as Unicode.


UTF-8

Well, this seems like the "most popular" one; at least in one of our research projects on Japanese-related encoding, it was preferred to have the data saved as UTF-8, for easy translation to other encodings. It is a variable-width encoding which can go up to 4 bytes to represent a single character (6 in the original design). As with the Unicode files, the binary representation starts with a byte order mark, EF BB BF ("zero width no-break space"), followed by E4 B8 80, which is "一".
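The two Notepad formats line up byte for byte: re-encoding the exact bytes from the Unicode (big endian) file, FE FF 4E 00, into UTF-8 yields the UTF-8 BOM followed by E4 B8 80 (a sketch assuming iconv is available):

```shell
# FE FF (BOM) + 4E 00 ("一") in UTF-16 big endian, re-encoded as UTF-8.
printf '\376\377\116\000' | iconv -f UTF-16BE -t UTF-8 | od -An -tx1
# the dump shows: ef bb bf e4 b8 80
```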

I read somewhere online that Unicode is a character set, while UTF-8 is one of its character encodings. So Unicode assigns each character a number, and UTF-8 encodes that number into bytes? Guess that is the case.

Thursday, March 7, 2013



rsyslog

apt-get install rsyslog

The configuration file is at


Basic syntax

To define log format.

$template <format name>,"<format string>"

To filter and select destination.

if <filter logic> then <destination>;<format name>

For the above, it must be on one line, or split with the line separator "\".

Example :
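A minimal sketch in the legacy syntax: send everything logged by a program named "test" to its own file in a custom format (the template name, program name and path here are made up):

```
$template TestFormat,"%timegenerated% %hostname% %syslogtag%%msg%\n"
if $programname == 'test' then /var/log/test.log;TestFormat
```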

For the properties used in the template format and in the filter, please check out the online manual :

My testing :

These are the messages directed to the destination specified in the example :

Some additional notes.

If you are seeing messages trailed by #012, put the line below into your configuration file. I did not encounter this in Ubuntu; however, I saw it in Debian.

$EscapeControlCharactersOnReceive off

If you see that your log claims messages are being repeated, as in the image below :

Your configuration file most likely has $RepeatedMsgReduction turned on. Comment out that line with #.

# $RepeatedMsgReduction on

Check this out!

Sunday, January 27, 2013

syslog-ng on Ubuntu


apt-get install syslog-ng

The configuration file is at


Some basic syntax

To define a log destination that prints on all terminals. This is normally available in the default configuration file.

destination <identifier> { pipe("/dev/xconsole"); };

To define a file where log should be directed to.

destination <identifier> { file("<file name in full path>"); };

To format the log, you can use template in your destination.

destination <identifier> { file("<file name in full path>" template("$ISODATE:$MESSAGE")); };

If you would like to fully format the logline, you can use $MSGONLY. However, please remember to put a newline character at the end of the template.

Filters can be set based on facility, priority, program name, keyword matching, etc. You can refer to this :

filter <identifier> { facility(<facility>); };

You can also put logical operator in the filter.

filter <identifier> { program("test") and level(err); };

After you have set the destination and filter, you can start to configure the logging redirection.

Example I used in my system.
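Tying the pieces together, a minimal sketch (the identifiers and path are made up; s_src is the source name commonly defined in Ubuntu's default configuration):

```
destination d_test { file("/var/log/test.log"); };
filter f_test { program("test") and level(err); };
log { source(s_src); filter(f_test); destination(d_test); };
```

A line such as logger -p user.err -t test "boom" should then land in /var/log/test.log.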

How do I test it? You can simply use the logger command.

If you do not specify a priority in the logger command, it is assumed to be notice level.

Another finding from my experiment. In the configuration file, there's an option to let the system create the destination file; see the options {}; section. It creates the file only on first use. If you "accidentally" remove the created log file, it will not be recreated until the syslog-ng service is restarted.

Have fun!