Lecture 15: String Tokenization

String Tokenization
• What is String Tokenization?
• The StringTokenizer class
• Examples
What is String Tokenization?
• So far we have been reading our input one value at a time.
• Sometimes it is more natural to read a group of input values at a time.
• For example, when reading student records from a text file, it is natural to read a whole record at a time:
"995432 Al-Suhaim Adil 3.5"
• The readLine() method of the BufferedReader class reads a whole line of input as a single String object.
• The problem is: how do we break this String object into individual pieces, known as tokens?
"995432"
"Al-Suhaim Adil"
"3.5"
• This process is what string tokenization is about.
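As a minimal sketch of this idea, the code below splits the sample record with java.util.StringTokenizer. The class name RecordTokens and the helper split are hypothetical. Splitting on a space yields four tokens, because the name itself contains a space. To obtain the three tokens listed above, the delimiter must not occur inside the name; the second call assumes the fields are separated by tab characters, which the slides do not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class RecordTokens {
    // Hypothetical helper: collects every token of line, split on the given delimiter characters.
    static List<String> split(String line, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Space as the delimiter: the name is split into two tokens.
        System.out.println(split("995432 Al-Suhaim Adil 3.5", " "));
        // Assumption: tab-separated fields, so the name stays a single token.
        System.out.println(split("995432\tAl-Suhaim Adil\t3.5", "\t"));
    }
}
```

The first call prints [995432, Al-Suhaim, Adil, 3.5] and the second prints [995432, Al-Suhaim Adil, 3.5].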
The StringTokenizer Class
• The StringTokenizer class, of the java.util package, is used to break a String object into individual tokens.
• It has the following constructors:

Constructor: StringTokenizer(String str)
Function: Creates a StringTokenizer object that uses white space characters as delimiters.

Constructor: StringTokenizer(String str, String delimiters)
Function: Creates a StringTokenizer object that uses the characters in delimiters as separators.

Constructor: StringTokenizer(String str, String delimiters, boolean returnTokens)
Function: Creates a StringTokenizer object that uses the characters in delimiters as separators and, if returnTokens is true, also returns the separators as tokens.
StringTokenizer Methods
• The following are the main methods of the StringTokenizer class:

Method: String nextToken() throws NoSuchElementException
Function: Returns the next token as a string from this StringTokenizer object. Throws an exception if there are no more tokens.

Method: int countTokens()
Function: Returns the count of tokens in this StringTokenizer object that are not yet processed by nextToken(). Initially this is the count of all tokens.

Method: boolean hasMoreTokens()
Function: Returns true if there are more tokens not yet processed by nextToken().
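The behaviour of these methods can be observed with a short sketch. The class TokenizerMethodsDemo and its two helpers are hypothetical names introduced for illustration; only nextToken(), countTokens() and hasMoreTokens() belong to StringTokenizer.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class TokenizerMethodsDemo {
    // Hypothetical helper: records countTokens() just before each call to nextToken().
    static int[] remainingCounts(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text);
        int[] counts = new int[tokenizer.countTokens()];
        for (int i = 0; i < counts.length; i++) {
            counts[i] = tokenizer.countTokens(); // tokens not yet consumed
            tokenizer.nextToken();
        }
        return counts;
    }

    // Hypothetical helper: reports whether nextToken() throws once all tokens are consumed.
    static boolean throwsWhenExhausted(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            tokenizer.nextToken();
        }
        try {
            tokenizer.nextToken();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(remainingCounts("a b c"))); // [3, 2, 1]
        System.out.println(throwsWhenExhausted("a b c"));              // true
    }
}
```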
How to apply the methods
• To break a string into tokens, first, a StringTokenizer object is created:
    String myString = "I like Java very much";
    StringTokenizer tokenizer = new StringTokenizer(myString);
• Then either of the following loops can be used to process the tokens:
    while (tokenizer.hasMoreTokens()) {
        String token = tokenizer.nextToken();
        // process token
    }
or
    int tokenCount = tokenizer.countTokens();
    for (int k = 1; k <= tokenCount; k++) {
        String token = tokenizer.nextToken();
        // process token
    }
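One detail of the counting loop is worth a sketch: the count is stored in a variable before the loop starts. Because countTokens() reports only the tokens not yet consumed, calling it in the loop condition makes the loop stop early. The class CountTokensPitfall and its method names are hypothetical.

```java
import java.util.StringTokenizer;

class CountTokensPitfall {
    // Correct: the count is read once, before any token is consumed.
    static int visitWithSavedCount(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text);
        int tokenCount = tokenizer.countTokens();
        int visited = 0;
        for (int k = 1; k <= tokenCount; k++) {
            tokenizer.nextToken();
            visited++;
        }
        return visited;
    }

    // Pitfall: countTokens() shrinks as tokens are consumed, so the loop ends too soon.
    static int visitWithLiveCount(String text) {
        StringTokenizer tokenizer = new StringTokenizer(text);
        int visited = 0;
        for (int k = 1; k <= tokenizer.countTokens(); k++) {
            tokenizer.nextToken();
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        String myString = "I like Java very much";
        System.out.println(visitWithSavedCount(myString)); // 5
        System.out.println(visitWithLiveCount(myString));  // 3
    }
}
```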
Example 1
• The following program reads grades from the keyboard and finds the average. The grades are read in one line.
    import java.io.*;
    import java.util.StringTokenizer;

    public class TokenizerExample1 {
        public static void main(String[] args) throws IOException {
            BufferedReader stdin = new BufferedReader(
                new InputStreamReader(System.in));
            System.out.print("Enter grades in one line: ");
            String inputLine = stdin.readLine();
            StringTokenizer tokenizer = new StringTokenizer(inputLine);
            // Save the count before the loop: countTokens() shrinks as tokens are consumed.
            int count = tokenizer.countTokens();
            double sum = 0;
            while (tokenizer.hasMoreTokens())
                sum += Double.parseDouble(tokenizer.nextToken());
            // If no grades were entered, count is 0 and the result is NaN.
            System.out.println("\nThe average = " + sum / count);
        }
    }
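The averaging step can also be isolated in a method so that it can be checked without keyboard input. This is only a sketch: the class GradeAverage and the method average are hypothetical, and returning 0.0 for a line with no grades is an assumption made here, not something the lecture prescribes.

```java
import java.util.StringTokenizer;

class GradeAverage {
    // Hypothetical helper: average of the white-space-separated numbers in line.
    // Assumption: a line without any grades yields 0.0 instead of NaN.
    static double average(String line) {
        StringTokenizer tokenizer = new StringTokenizer(line);
        int count = tokenizer.countTokens(); // read before any token is consumed
        if (count == 0) {
            return 0.0;
        }
        double sum = 0;
        while (tokenizer.hasMoreTokens()) {
            sum += Double.parseDouble(tokenizer.nextToken());
        }
        return sum / count;
    }

    public static void main(String[] args) {
        System.out.println(average("80 90 100")); // 90.0
    }
}
```

A token that is not a valid number still makes Double.parseDouble throw a NumberFormatException, just as in the original program.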
Example 2
• This example shows how to use the second constructor of the StringTokenizer class.
• It tokenizes the words in a string, such that the punctuation characters following the words are not appended to the resulting tokens.
    import java.util.StringTokenizer;

    public class TokenizerExample2 {
        public static void main(String[] args) {
            String inputLine =
                "Hi there, do you like Java? I do;very much.";
            StringTokenizer tokenizer =
                new StringTokenizer(inputLine, ",.?;:! \t\r\n");
            while (tokenizer.hasMoreTokens())
                System.out.println(tokenizer.nextToken());
        }
    }
Output:
Hi
there
do
you
like
Java
I
do
very
much
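For contrast, the sketch below tokenizes the same sentence twice: once with the first constructor, which splits on white space only, and once with the delimiter string from Example 2. The class DelimiterComparison and the helper tokens are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterComparison {
    // Hypothetical helper: drains a tokenizer into a list.
    static List<String> tokens(StringTokenizer tokenizer) {
        List<String> result = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        String inputLine = "Hi there, do you like Java? I do;very much.";
        // White space only: punctuation stays attached and "do;very" is one token.
        System.out.println(tokens(new StringTokenizer(inputLine)));
        // Punctuation added to the delimiters: ten clean words.
        System.out.println(tokens(new StringTokenizer(inputLine, ",.?;:! \t\r\n")));
    }
}
```

The first line printed is [Hi, there,, do, you, like, Java?, I, do;very, much.], which has nine tokens; the second is [Hi, there, do, you, like, Java, I, do, very, much].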
Example 3
• This example shows how to use the third constructor of the StringTokenizer class.
• It tokenizes an arithmetic expression based on the operators and returns both the operands and the operators as tokens.
    import java.util.StringTokenizer;

    public class TokenizerExample3 {
        public static void main(String[] args) {
            String inputLine = "(2+5)/(10-1)";
            StringTokenizer tokenizer =
                new StringTokenizer(inputLine, "+-*/()", true);
            while (tokenizer.hasMoreTokens())
                System.out.println(tokenizer.nextToken());
        }
    }
Output:
(
2
+
5
)
/
(
10
-
1
)
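The effect of the boolean argument can be seen by tokenizing the same expression with both values. This is a minimal sketch; the class ExpressionTokens, the constant OPERATORS and the method tokenize are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ExpressionTokens {
    // Delimiter characters: the arithmetic operators and parentheses.
    static final String OPERATORS = "+-*/()";

    // Hypothetical helper: tokenizes expression; delimiters are kept as tokens only if returnDelims is true.
    static List<String> tokenize(String expression, boolean returnDelims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(expression, OPERATORS, returnDelims);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("(2+5)/(10-1)", true));  // [(, 2, +, 5, ), /, (, 10, -, 1, )]
        System.out.println(tokenize("(2+5)/(10-1)", false)); // [2, 5, 10, 1]
    }
}
```

With false, only the operands survive, which is the same result the two-argument constructor would give.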