Regex Trainer in Java
As you may know, regex is about string parsing. Parsing may look like a simple job, but it is not.
Writing a parsing program is easy; in my experience, maintaining one is very hard. Hand-made
parsers written in C++ tend to crash (buffer overflow) on certain inputs. In Java, the same out-of-bounds
access throws a RuntimeException (for example, StringIndexOutOfBoundsException). Both programs are unreliable, and the C++ program is also insecure.
Hand-made parsers are often built on assumptions such as "the string
cannot be longer than 256 chars" or "no empty string can be given to the parser". You may also think C++ is faster than
Java at string parsing, because Java strings are immutable and hand-written parsing code changes the string repeatedly, by
writing separators or trimming.
Well, regex works somewhat differently.
You define a parsing pattern, and this pattern is searched for in your input string. You can then
extract the matched parts you need from the input string. These parsing patterns are called "regular expressions", or
"regex" for short. Before using a pattern, you have to compile it within your program. Once compiled, it can be reused as often as you need
without compiling it again, so your program performs well.
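The compile-once idea can be sketched as follows. This is only an illustration, not the trainer's own code: the class name CompileOnce, the helper containsMatch and the digit pattern [0-9]+ are assumptions made for the example. Only Pattern.compile, Pattern.matcher and Matcher.find come from the standard java.util.regex package.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; the name and the pattern are illustrative only.
class CompileOnce {
    // Compiled a single time when the class is loaded, then shared by every call.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Returns true if at least one run of digits occurs anywhere in the input.
    static boolean containsMatch(String input) {
        // Creating a Matcher reuses the compiled pattern; nothing is recompiled here.
        Matcher m = DIGITS.matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        String[] inputs = {"12:45", "3", "as:ac", ":"};
        for (String s : inputs) {
            System.out.println(s + " -> " + containsMatch(s));
        }
    }
}
```

Keeping the compiled Pattern in a static final field is one common way to avoid recompiling it on every call; a Pattern object is immutable and can be shared, while each Matcher is created per input.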
If you want to practice regex, you can use my free regex trainer.
It is launched with Java Web Start, which is included in the
Java Runtime Environment (JRE).
You can also download the sources. To compile them you will need at least Apache Ant.
The user interface was made with NetBeans.

In this sample you can see a pattern that matches all groups of three numbers ending in one of the letters a, b or c. The program
also matches all of these three-number groups.
How do you use regex? Here are some samples.
Input string:
12:45
3
as:ac
:12
13:
:
We want to find all sequences of two characters, then a colon, then another two characters. The needed regex is:
..\:..
Running the program, we find:
as:ac
12:45
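In plain Java, the same search could look roughly like the sketch below. It assumes the sample input is a single string whose lines are separated by newline characters; the class name FindAllMatches and the helper findAll are hypothetical names chosen for the example. Because "." does not match a line terminator by default, a match never spans two lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the trainer's sources.
class FindAllMatches {
    // Collects every non-overlapping match of regex in input, in input order.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {          // advance to the next match, if any
            found.add(m.group());   // group() with no argument is the whole match
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumption: the sample input lines are joined with '\n'.
        String input = "12:45\n3\nas:ac\n:12\n13:\n:";
        // In a Java string literal the regex ..\:.. is written with a doubled backslash.
        for (String match : findAll("..\\:..", input)) {
            System.out.println(match);
        }
    }
}
```

The matches are returned in the order they occur in the input, so this sketch prints 12:45 before as:ac. The backslash before the colon is not required, since the colon has no special meaning there, but it is harmless.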
What if we want to find all sequences ending in a digit?
The needed regex is:
..\:.[0-9]
The result is:
12:45
What if we want to extract the two sequences separated by the colon?
The needed regex is:
(..)\:(.[0-9])
The result is:
12:45 12 45
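Extracting the two parts in Java can be sketched with capture groups. As before, the class name ExtractGroups and the helper firstMatchGroups are hypothetical, and the input is assumed to be one newline-separated string. Matcher.group(0) is the whole match, and group(1) and group(2) are the two parenthesized parts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; not part of the trainer's sources.
class ExtractGroups {
    private static final Pattern PAIR = Pattern.compile("(..)\\:(.[0-9])");

    // Returns {whole match, group 1, group 2} for the first match, or null if there is none.
    static String[] firstMatchGroups(String input) {
        Matcher m = PAIR.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Assumption: the sample input lines are joined with '\n'.
        String[] g = firstMatchGroups("12:45\n3\nas:ac\n:12\n13:\n:");
        if (g != null) {
            System.out.println(g[0] + " " + g[1] + " " + g[2]); // prints "12:45 12 45"
        }
    }
}
```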
Of course, this is not all that regex can do. See the Java documentation
to find out how regex can help you reach your goal.
To do on this project:
- Java code generator for the tested regex.
- Applet presentation
- Standalone Java JAR presentation
- Test in Java 6.0