Characters and Strings

Characters
• In Java, a char is a primitive type that can hold
one single character
• A character can be:
– A letter or digit
– A punctuation mark
– A space, tab, newline, or other whitespace
– A control character
• Control characters are holdovers from the days of teletypes
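As a rough illustration of these categories, the sketch below sorts a few char values using the standard java.lang.Character tests. The class name CharKinds and the method describe are made up for this example, and punctuation simply falls into "other" here.

```java
// Illustrative sketch: CharKinds and describe are hypothetical names.
class CharKinds {
    // Returns a rough category for one char, using java.lang.Character tests.
    static String describe(char c) {
        if (Character.isLetter(c)) return "letter";
        if (Character.isDigit(c)) return "digit";
        if (Character.isWhitespace(c)) return "whitespace";
        if (Character.isISOControl(c)) return "control";
        return "other"; // punctuation and symbols end up here
    }

    public static void main(String[] args) {
        char[] samples = { 'a', '5', '?', ' ', '\t', '\b' };
        for (char c : samples) {
            // Print the numeric code, since whitespace and control
            // characters are invisible when printed directly.
            System.out.println((int) c + " -> " + describe(c));
        }
    }
}
```

Tab counts as whitespace here only because that test runs before the control-character test.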
char literals
• A char literal is written between single quotes:
'a'
'A'
'5'
'?'
' '
• Some characters cannot be typed directly and must
be written as an “escape sequence”:
– Tab is '\t'
– Newline is '\n'
• Some characters must be escaped to prevent
ambiguity:
– Single quote is '\'' (quote-backslash-quote-quote)
– Backslash is '\\'
Additional character literals
\n newline
\t tab
\b backspace
\r return
\f form feed
\\ backslash
\' single quote
\" double quote
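To see what these escapes stand for, the sketch below prints the numeric code of each one. EscapeDemo and codeOf are names invented for this example.

```java
// Illustrative sketch: EscapeDemo and codeOf are hypothetical names.
class EscapeDemo {
    // A char widens to an int, which gives its numeric code.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println("newline:      " + codeOf('\n')); // 10
        System.out.println("tab:          " + codeOf('\t')); // 9
        System.out.println("backspace:    " + codeOf('\b')); // 8
        System.out.println("return:       " + codeOf('\r')); // 13
        System.out.println("form feed:    " + codeOf('\f')); // 12
        System.out.println("backslash:    " + codeOf('\\')); // 92
        System.out.println("single quote: " + codeOf('\'')); // 39
        System.out.println("double quote: " + codeOf('"'));  // 34

        // The same escapes work inside string literals.
        System.out.println("Name\tValue");
    }
}
```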
Character encodings
• A character is represented as a pattern of bits
• The number of characters that can be represented
depends on the number of bits used
• For a long time, ASCII (American Standard Code for
Information Interchange) has been used
• ASCII is a seven-bit code (allows 128 characters)
• ASCII is barely enough for English
– Omits many useful characters:
¢½ç“”
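A small sketch of the seven-bit limit: casting a char to int shows its code, and anything with a code of 128 or more is outside ASCII. The names AsciiDemo and isAscii are hypothetical.

```java
// Illustrative sketch: AsciiDemo and isAscii are hypothetical names.
class AsciiDemo {
    // Seven bits allow 128 values, so ASCII codes run from 0 through 127.
    static boolean isAscii(char c) {
        return c < 128;
    }

    public static void main(String[] args) {
        System.out.println((int) 'A');         // 65
        System.out.println((int) 'a');         // 97
        System.out.println((int) '5');         // 53
        System.out.println(isAscii('A'));      // true
        System.out.println(isAscii('\u00E7')); // false: c with cedilla has code 231
    }
}
```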
Unicode
• Unicode was originally a 16-bit (two-byte) standard
designed to replace ASCII; it has since grown beyond
16 bits, so a few characters need two Java chars
• “Unicode provides a unique number for every
character, no matter what the platform, no matter
what the program, no matter what the language.”
• Java uses Unicode to represent characters
– You should know that Java uses Unicode, but
– Except for having these extra characters available, it
seldom makes any difference to how you program
Unicode character literals
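• Any character with a code from 0 to 255 (which
includes all of ASCII) can be written as an octal
escape from \0 to \377
• Any Unicode character in the 16-bit range can be
written as a hexadecimal escape between \u0000 and
\uFFFF; characters beyond that range need a pair
of such escapes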
• Since there are over 64000 possible Unicode
characters, the list occupies an entire book
– This makes it hard to look up characters
• Unicode letters in any alphabet can be used
in identifiers
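The escapes above can be tried out in a short sketch. UnicodeLiterals and its method names are invented for this example, and the Greek letter pi is just one arbitrary non-Latin letter.

```java
// Illustrative sketch: UnicodeLiterals and its method names are hypothetical.
class UnicodeLiterals {
    // Octal escape: octal 101 is decimal 65, the code for 'A'.
    static char octalA() {
        return '\101';
    }

    // Hexadecimal escape: hexadecimal 0041 is also decimal 65.
    static char hexA() {
        return '\u0041';
    }

    // Hexadecimal 00E7 is the letter c with a cedilla, which is outside ASCII.
    static char cedilla() {
        return '\u00E7';
    }

    public static void main(String[] args) {
        System.out.println(octalA() == 'A');  // true
        System.out.println(hexA() == 'A');    // true
        System.out.println((int) cedilla());  // 231
        // Letters from other alphabets are allowed at the start of an identifier.
        System.out.println(Character.isJavaIdentifierStart('\u03C0')); // true (Greek pi)
    }
}
```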
Glyphs and fonts
• A glyph is the printed representation of a character
• For example, the letter ‘A’ can be represented by
any of the glyphs
A A A A A (each shown in a different typeface)
• A font is a collection of glyphs
• Unicode describes characters, not glyphs
Strings
• A String is a kind of object, and obeys all the rules
for objects
• In addition, there is extra syntax for string literals
and string concatenation
• A string is made up of zero or more characters
• The string containing zero characters is called the
empty string
String literals
• A string literal consists of zero or more characters
enclosed in double quotes
"" "Hello" "This is a String literal."
• To put a double quote character inside a string, it
must be backslashed:
"\"Wait,\" he said, \"Don't go!\""
• Inside a string, a single quote character does not
need to be backslashed (but it can be)
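A quick sketch of these quoting rules, using the example string from the slide. QuoteDemo and quoted are hypothetical names.

```java
// Illustrative sketch: QuoteDemo and quoted are hypothetical names.
class QuoteDemo {
    // Double quotes inside the literal are backslashed; the single quote is not.
    static String quoted() {
        return "\"Wait,\" he said, \"Don't go!\"";
    }

    public static void main(String[] args) {
        System.out.println(quoted());
        // Backslashing a single quote inside a string is allowed but changes nothing.
        System.out.println("Don't".equals("Don\'t")); // true
        // The empty string literal has zero characters.
        System.out.println("".length()); // 0
    }
}
```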
String concatenation
• Strings can be concatenated (put together) with
the + operator
"Hello, " + name + "!"
• Anything “added” to a String is converted to a
string and concatenated
• Concatenation is done left to right:
"abc" + 3 + 5 gives "abc35"
3 + 5 + "abc" gives "8abc"
3 + (5 + "abc") gives "35abc"
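The three cases above can be checked directly. ConcatDemo and its method names are made up for this sketch.

```java
// Illustrative sketch: ConcatDemo and its method names are hypothetical.
class ConcatDemo {
    // "abc" + 3 is evaluated first, giving "abc3", then 5 is appended.
    static String stringFirst() {
        return "abc" + 3 + 5;
    }

    // 3 + 5 is ordinary integer addition, giving 8, then "abc" is appended.
    static String numbersFirst() {
        return 3 + 5 + "abc";
    }

    // Parentheses force 5 + "abc" first, giving "5abc", then 3 is prepended.
    static String parenthesized() {
        return 3 + (5 + "abc");
    }

    public static void main(String[] args) {
        System.out.println(stringFirst());   // abc35
        System.out.println(numbersFirst());  // 8abc
        System.out.println(parenthesized()); // 35abc
    }
}
```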
Newlines
• The character '\n' represents a newline
• When “printing” to the screen, you can go to a new
line by printing a newline character
• You can also go to a new line by using
System.out.println with no argument or with one
argument
• When writing to a file, you should avoid \n and use
println instead, because the line separator differs
between platforms
– I’ll explain this when we talk about file I/O
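A minimal sketch of the ways to start a new line on the screen. NewlineDemo and twoLines are hypothetical names; System.lineSeparator() is the standard call that returns the platform's own line separator.

```java
// Illustrative sketch: NewlineDemo and twoLines are hypothetical names.
class NewlineDemo {
    // A newline character can sit in the middle of a string.
    static String twoLines() {
        return "first" + '\n' + "second";
    }

    public static void main(String[] args) {
        System.out.print(twoLines());  // prints two lines, no newline at the end
        System.out.print('\n');        // ends the second line explicitly
        System.out.println("third");   // println ends the line for you
        System.out.println();          // no argument: just ends a line (prints a blank one)
        // The platform's own separator, which println uses.
        System.out.print("fourth" + System.lineSeparator());
    }
}
```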
System.out.print and println
• System.out.println can be called with no
arguments (parameters), or with one argument
• System.out.print is called with one argument
• The argument may be any of the 8 primitive types
• The argument may be any object
• Java can print any object, but it doesn’t always do
a good job
– Java does a good job printing Strings
– Java typically does a poor job printing types you define
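A short sketch of print and println with different argument types. PrintDemo and shown are names invented here; shown relies on String.valueOf, which gives the same text that print and println display for a value.

```java
// Illustrative sketch: PrintDemo and shown are hypothetical names.
class PrintDemo {
    // Returns the text that print or println would display for a value.
    static String shown(Object value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.print("print stays on the line, ");
        System.out.println("println ends it");
        System.out.println(42);    // an int
        System.out.println(3.5);   // a double
        System.out.println(true);  // a boolean
        System.out.println('c');   // a char
        System.out.println();      // no argument: just ends a line (prints a blank one)
        // An object without its own toString prints as a class name,
        // an @ sign, and a hash code in hexadecimal.
        System.out.println(new Object());
    }
}
```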
Printing your objects
• In any class, you can define the following instance method:
public String toString() { ... }
• This method can return any string you choose
• If you have an instance x, you can get its string
representation by calling x.toString()
• If you define your toString() method exactly as above, it
will be used by System.out.print and System.out.println
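As a sketch of this, the hypothetical Point class below defines toString, so printing a Point shows its coordinates. The class and its fields are assumptions made up for the example.

```java
// Illustrative sketch: Point is a hypothetical class for this example.
class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Same signature as on the slide; @Override asks the compiler to check it.
    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println(p);            // prints (3, 4)
        System.out.println(p.toString()); // same text, called explicitly
        System.out.println("p = " + p);   // concatenation also uses toString
    }
}
```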
Constructing a String
• You can construct a string by writing it as a literal:
"This is special syntax to construct a String."
• Since a string is an object, you could construct it
with new:
new String("This also constructs a String.")
• But using new for constructing a string is foolish,
because you have to write the string as a literal to
pass it in to the constructor
– You’re doing the same work twice!
String methods
• This is only a sampling of string methods
• All are called as: myString.method(params)
– length() -- the number of characters in the String
– charAt(index) -- the character at (integer) position index,
where index is between 0 and length-1
– equals(anotherString) -- equality test (because ==
doesn’t do quite what you expect)
• Don’t learn every String method (there are dozens)
unless you use them a lot--instead, learn to use the API!
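The three methods listed above can be sketched as follows. StringMethodsDemo and count are hypothetical names; the new String(...) call is used only to get a second object with the same characters, so that equals and == give different answers.

```java
// Illustrative sketch: StringMethodsDemo and count are hypothetical names.
class StringMethodsDemo {
    // Counts occurrences of target in s, visiting indexes 0 through length-1.
    static int count(String s, char target) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String word = "Hello";
        System.out.println(word.length());                  // 5
        System.out.println(word.charAt(0));                 // H
        System.out.println(word.charAt(word.length() - 1)); // o
        System.out.println(count("banana", 'a'));           // 3

        // Same characters, but a separate object.
        String copy = new String("Hello");
        System.out.println(word.equals(copy)); // true: same contents
        System.out.println(word == copy);      // false: different objects
    }
}
```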
Vocabulary
• escape sequence -- a code sequence for a character,
beginning with a backslash
• ASCII -- a 7-bit standard for encoding characters
• Unicode -- a standard that gives every character a unique number (originally 16-bit; a Java char is 16 bits)
• glyph -- the printed representation of a character
• font -- a collection of glyphs
• empty string -- a string containing no characters
• concatenate -- to join strings together
The End