Liang Chapter 5 (3) - UCD School of Computer Science and

Chapter 7: The String class
We’ll go through some of this quickly!
Strings as objects
• Strings are objects. Each String is an instance of
the class String
• They can be constructed thus:
• String s = new String("Hi mom!");
• Strings are so common, Java provides a handier
way of creating them:
• String s = "Hi mom!";
• Strings have methods: loads of them!!!
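As a small illustration of the two ways of creating a String, the sketch below builds the same text with the constructor and with the literal shorthand, then calls a couple of String methods on it. The class name StringCreation and the helper sameContents are made up for this example.

```java
public class StringCreation {
    // Hypothetical helper: builds the same text both ways and
    // reports whether the two Strings hold the same characters.
    static boolean sameContents() {
        String viaConstructor = new String("Hi mom!"); // explicit constructor call
        String viaLiteral = "Hi mom!";                 // handier literal form
        return viaConstructor.equals(viaLiteral);
    }

    public static void main(String[] args) {
        String s = "Hi mom!";
        System.out.println(s.length());      // number of characters in s
        System.out.println(s.toUpperCase()); // a new String in upper case
        System.out.println(sameContents());
    }
}
```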
Some methods for Strings
String s1 = "harry";
String s2 = "harold";
if ( s1.equals(s2) )
    System.out.println("Strings are the same");
if ( s1.compareTo(s2) < 0 )
    System.out.println(s1 + "...." + s2);
else
    System.out.println(s2 + "...." + s1);
Comparing Strings
• s1.compareTo("hi mom!")
• returns a value < 0 if s1 is ordered
before "hi mom!"
• returns a value > 0 if "hi mom!" is ordered
before s1
• returns 0 if they are the same
• Ordering is called lexicographic order
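The three cases for compareTo can be tried out with a sketch like the one below. The class name CompareDemo, the helper describe and the sample strings "hi dad!" and "hi pop!" are assumptions made for illustration. Only the sign of the result is tested, since that is all the rules above rely on.

```java
public class CompareDemo {
    // Hypothetical helper: turns the sign of compareTo into a word.
    static String describe(String a, String b) {
        int result = a.compareTo(b);
        if (result < 0) {
            return "before";   // a is ordered before b
        } else if (result > 0) {
            return "after";    // b is ordered before a
        } else {
            return "same";     // a and b hold the same characters
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("hi dad!", "hi mom!"));
        System.out.println(describe("hi pop!", "hi mom!"));
        System.out.println(describe("hi mom!", "hi mom!"));
    }
}
```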
Quick exercise
• Write a program which reads 2 strings and
writes them out in lexicographic order,
smallest first
• Can you figure out which characters are
ordered before which?
• Is this the same as the telephone book?
String concatenation
• String s1 = "Hi" + " mom";
• String s2 = "your lucky number is "
+ number;
• String s3 = s1.concat(s2);
• String s4 = s1 + s2; // same result as s3
A number will be first turned into a String, then concatenated.
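A minimal sketch of concatenation with a number follows. The class name ConcatDemo and the helper luckyMessage are invented here, and the message text is slightly adjusted (a leading comma) so the joined result reads well. The last two lines of main show that + works left to right, so brackets matter when you want arithmetic done first.

```java
public class ConcatDemo {
    // Hypothetical helper: joins two Strings, one of which
    // was built from an int.
    static String luckyMessage(int number) {
        String s1 = "Hi" + " mom";
        String s2 = ", your lucky number is " + number; // int becomes a String first
        return s1.concat(s2);                           // same as s1 + s2
    }

    public static void main(String[] args) {
        System.out.println(luckyMessage(7));
        System.out.println("sum: " + 1 + 2);   // prints sum: 12
        System.out.println("sum: " + (1 + 2)); // prints sum: 3
    }
}
```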
Substrings
• A String is stored as an array of characters:
character:  H  i  _  M  o  m  !
index:      0  1  2  3  4  5  6
• public String substring(int beginIndex)
• public String substring(int beginIndex, int endIndex)
Some things to remember
• The index of the first character is 0, not 1
• substring(a, b) returns the characters from
position a to position b-1, not b!!!!
• substring(a) returns the characters from
position a to the end of the String
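The index rules above can be checked against the string "Hi Mom!" with a short sketch. The class name SubstringDemo and the helper slice are made up for this example. Note that substring(a, b) always returns b - a characters.

```java
public class SubstringDemo {
    // Hypothetical helper: returns the characters of s from
    // position a up to position b-1.
    static String slice(String s, int a, int b) {
        return s.substring(a, b);
    }

    public static void main(String[] args) {
        String s = "Hi Mom!";
        System.out.println(slice(s, 0, 2)); // positions 0 and 1
        System.out.println(slice(s, 3, 6)); // positions 3, 4 and 5
        System.out.println(s.substring(3)); // position 3 to the end
    }
}
```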
Other String methods
• return the character at a given index:
– public char charAt(int index)
• get the length of a String:
– public int length()
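charAt and length are often used together in a loop that visits every character. The sketch below counts how often one character occurs. The class name CharDemo and the helper countChar are assumptions for illustration, not part of the String class.

```java
public class CharDemo {
    // Hypothetical helper: counts how many times target appears in s.
    static int countChar(String s, char target) {
        int count = 0;
        // valid indexes run from 0 to s.length() - 1
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChar("Hi mom!", 'm'));
        System.out.println(countChar("Hi Mom!", 'm')); // 'M' and 'm' are different characters
    }
}
```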
Processing string contents:
StringTokenizer
All the strings we’ve seen before have been short (a word or two).
To process long strings (such as sentences) we need to be able to
split up strings into their parts (words, numbers, etc.).
The parts of a sentence are called tokens.
“This is a string with 9 tokens in it.”
(Figure: each of the nine words in the sentence is labelled "token".)
How do we recognise tokens? They are separated by delimiters
(in the sentence above, blank spaces).
java.util.StringTokenizer
// This program uses a StringTokenizer object to split a sentence
// into words and print each word on a different line.
import java.util.StringTokenizer;

public class TestTokenizer {
    public static void main(String[] args) {
        String test = "This is a test string.";
        StringTokenizer testTokenizer = new StringTokenizer(test);
        // 'testTokenizer' is an object that will give
        // us the consecutive tokens in the String 'test'.
        while (testTokenizer.hasMoreTokens()) {
            System.out.println(testTokenizer.nextToken());
        }
        // The testTokenizer object has methods called
        // hasMoreTokens() and nextToken(), which tell
        // us whether there are more tokens left in the string
        // test, and give us the next token from that string.
    }
}
Using StringTokenizer
To tokenize a String (e.g. split it into words), we
create a new StringTokenizer object, giving the StringTokenizer
constructor the string we want to split up:
    String test = "This is a test string.";
    StringTokenizer testTokenizer = new StringTokenizer(test);
The StringTokenizer object now has inside it the string to tokenize.
We get the next token by asking that object for nextToken(). The
object looks in the string it was given at construction, and returns the
next token to us.
We can find out how many tokens are left in our string by asking
that object for countTokens(). Called before any nextToken(), this is
the total number of tokens in the string; each nextToken() call reduces
the count by one.
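A quick way to see what countTokens() reports is to call it before and after taking a token. In the sketch below the class name CountTokensDemo and the helper countsBeforeAndAfter are made up for illustration. countTokens() counts the tokens that have not yet been handed out, so the number drops as nextToken() is called.

```java
import java.util.StringTokenizer;

public class CountTokensDemo {
    // Hypothetical helper: returns the token count before and
    // after one call to nextToken(). Assumes text has at least
    // one token; otherwise nextToken() throws NoSuchElementException.
    static int[] countsBeforeAndAfter(String text) {
        StringTokenizer st = new StringTokenizer(text);
        int before = st.countTokens(); // nothing consumed yet: the total
        st.nextToken();                // consume the first token
        int after = st.countTokens();  // tokens still remaining
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = countsBeforeAndAfter("This is a string with 9 tokens in it.");
        System.out.println(counts[0] + " tokens, then " + counts[1]);
    }
}
```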
Delimiters for StringTokenizer
By default, a StringTokenizer object splits up a String using blank
spaces, tabs ('\t'), new lines ('\n'), returns ('\r') and form feeds ('\f') as delimiters.
We can use different delimiters, by giving a String containing
the delimiters we want to use as an argument to nextToken():
    nextToken(" ,\t\n\r");
uses spaces, commas, tabs, newlines and returns as delimiters.
The new delimiter set stays in effect for later calls to nextToken() as well.
We can also specify the delimiters we want to use when we
construct our StringTokenizer object:
    StringTokenizer st = new StringTokenizer(test, " ,\t\n\r");
If we want the delimiters to be returned as tokens, we specify
that in the constructor as well, by passing true as a third
argument (normally they’re not returned).
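The effect of custom delimiters, and of asking for the delimiters back, can be seen in a sketch like this one. The class name DelimiterDemo, the helper split and the sample strings are invented for illustration. The helper uses the three-argument StringTokenizer constructor, whose last argument says whether delimiters are returned as tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterDemo {
    // Hypothetical helper: collects every token of text into a list,
    // using delims as the delimiter characters.
    static List<String> split(String text, String delims, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(text, delims, returnDelims);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // space and comma both separate tokens
        System.out.println(split("red,green blue", " ,", false));
        // with returnDelims set to true, each comma comes back as its own token
        System.out.println(split("a,b", ",", true));
    }
}
```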
UML for StringTokenizer
UML means Unified Modelling Language; it’s a way of
summarising object oriented programs quickly.
Read the first bit of Liang, Appendix G (p. 903), which
explains UML. Here’s the UML for StringTokenizer:
StringTokenizer
+countTokens(): int
+hasMoreTokens(): boolean
+nextToken(): String
+nextToken(delim: String): String
The + means “publicly accessible method”. “: int” means this
method returns an integer.