Regex For Date Format of d MM, YYYY

Below regex will match date format of 4 March, 1981 or 4 march, 1981. Click here to try the example.

\b([1-9]|[12][0-9]|3[01]) (?:(J|j)anuary|(F|f)ebruary|(M|m)arch|(A|a)pril|(M|m)ay|(J|j)une|(J|j)uly|(O|o)ctober|(N|n)ovember|(D|d)ecember?)(\,) (?:19[7-9]\d|2\d{3})(?=\D|$)

Let me explain a bit on how this pattern translated into regex matching. The starting token \b is a word boundry matching, it just a safeguard token to check the starting of word, new-line or puntuation marks.

The group matching () token after word boundary token is to matches day in dates from 1 to 31 of month. Inside it’s a 3 character set token matching [] with a different digit pattern matching through conditional OR logic.

Taking from (?:[1-9]|[12][0-9]|3[01]) the [1-9] will match single digit number from 1 to 9, if first condition of character token was not met it will move to second logic whereby [12][0-9] pattern will look for 2 digit number. It will match either 10 to 19 or 20 to 29 range of number. The first part of token will look for digit number 1 or 2 and it will be concantenate with the adjacent character set token where it look for number from 0 to 9.

If you were using character token eg; [123abc] then it will look for character 1, 2, 3, a, b & c in sequence order.

For the month name matching is more straighforward and it will use a lot of OR condition together with group token. Lets take a look at first part of pattern (J|j)anuary where it will search for month of ‘January’ or ‘january’. If this particular month string is not exist it will move to next token of another month matching. You can also match a month shortform name by changing a pattern to (J|j)(anuary|an).

The pattern (?:19[7-9]\d|2\d{3}) is been used as a strategy to match a range of year from 1970 to 2999. The first part of pattern will look for matches year from 1970 to 1999. The last token \d to match any single digit number. You can change you pattern to 19\d\d as well so it can search from year 1900. Using pattern 19[7-9][0-9] will give the same result as in example. The second part of year matching after OR condition will look for digit number 2 with any 3 digit preceeding denoted by \d{3} pattern.

The preceeding matching token will look for quantified similar token specified before it. So if we use a{3} it will match aaa, thus from our example the pattern \d{3} will matches digit number 000 through 999.

The last token matching (?=\D|$) is to detect whether its reach end of string/line. Token (?= is look-ahead group token where it will look for group matching without including it in the matching result or we can say it’s assertion matching. Token \D will look for non digit an token $ generally used for matching end of line/string.

We can use non-capturing group token (?: if we are not expecting any return value. At some point the token (?: and can be used interchangeably with lookahead token (?= depend on the matching strategy. On the example, if we change non-capturing group token (?: infront of (J|j)anury with lookahead token (?= the whole string will be not matche as it will not consume back the group into the matching stream. But if we can change the pattern (?=\D|$) to (?:\D|$) at the end of pattern as it will give the same result and we not expect to consume back the result into matching stream.

Please drop a comment or suggestion below if you find any part of this article can be improve or some information might not be accurate.

Java Virtual Machine Monitoring

Programming and deploying application in Java Application Container require an extensive knowledge in managing and monitoring the JVM (Java Virtual Machine). Below is a few of the great resource to start with.

Using JConsole

Monitoring & Managing Tomcat

Troubleshooting Connection Problems in JConsole

Download Whole Site Using WGET #2

This is improved version on cloning or downloading the whole website to your local storage. You can refer here for previous version of this article.

wget http://example.com --domains example.com --recursive --page-requisites --no-clobber --html-extension --convert-links

Download Whole Site Using WGET

The WGet is a C base program to retrieve a web contents where it popularly used on OS platform such as Linux and others. It’s part of GNU and distributed under GNU General Public License. It has a very robust feature including recursive web page download like a crawler. By using a proper program arguments we can download a whole site by using the following command.

wget -E -H -k -K -p http://the.site.com

The -E arguments to tell te program that it supposed to save HTML and CSS document using proper extension. Where -H to makesure that all file were download even it were hosted on foreign host or domain.

The flag -k -K will convert all the link in HTML to locally downloaded file. The -p will download all the requisite file including images and others to makesure the page is properly working.

Recent Update!

Howto Configure CoovaChilli Listening On Multiple Interfaces

Creat a VLAN interface using ifcfg-vlan filename with following content. The value of IPADDR and NETWORK must be reflect your current VLAN setup.

DEVICE=vlan 
IPADDR=192.168.100.1 
NETWORK=192.168.100.0 
NETMASK=255.255.255.0 
USERCTL=no 
BOOTPROTO=none 
ONBOOT=yes

Edit the physical ethernet file, the eth0 were use as example and this value must reflect your current setting accordingly.

DEVICE=eth0
USERCTL=no
ONBOOT=yes 
MASTER=vlan 
SLAVE=yes 
BOOTPROTO=none

Following line must be add to modprobe.conf file to enable the VLAN device probing by the Linux OS.

alias vlan bonding options vlan mode=3 miimon=100

Change to below value in Chilli config file to fully enable the VLAN networking.

HS_LANIF=vlan