Strings are sequences of characters. Characters are elements of a finite set used to represent the symbols we use in written communication. ASCII: 7-bit 0-127. ASCII is still not complete obsolete. Your laptop keyboard is based on ASCII characters. In England, ASCII doesn't have a hashtag. #. Instead, it's the British pound symbol. In my life experience, I remember before ASCII had ^. When I started programming, ^ was an up-arrow with a stem. You sometimes see the caret used in describing control characters. ^C : means someone hit CTRL-C. File systems use 8-bit characters are bytes. And so, there have several competing extensions to ASCII where the first 128 are ASCII characters and the last 128 are other stuff that varied from platform, from machine to machine, and sometimes with different pieces of software on the same machine. IBM used EBCDIC, a bizarre 8-bit character set. The PC Revolution, Apples, IBM PC's, Commodores, TRS-80's, they all had their own characters sets that may or may not have resembled ASCII in some way. Now, in an attempt to standardize the world, we have Unicode. Unicode has over a million different characters...and counting. Unicode is the character set of Java, at least internally. Some modern languages use Unicode characters as part of the language itself. The exact character set is unimportant for string algorithms. Just understand that there is a character set with a finite number of characters. For the purposes of this class, the character set I'll use is the 26-character set of capital letters. The first algorithm we'll look at is String Sorting. The fastest possible sort algorithm runs in n lg n time. Radix sort runs even faster. Let's say I have a set of strings, all of which are three characters long. CAT DOG BAT ARE EMU EAR RUN FUN SUN SON TON Here's the idea: You stable sort by the last character in the string, then by the second-to-the-last character, and so forth, and finally by the first character in the string. Stable sort: preserve the relative order of the strings in the event of a tie. Radix Sort: Start with the last character and tally how many of each you see. E(1) G(1) N(5) R(1) T(2) U(1) 1 1 5 1 2 1 Now make a running sum of this tally. 1 2 2 8 10 11 This means that there is one that comes at or before E, two that come at or before G, 7 that come at or before N and so forth. 0. ARE 1. DOG 2. RUN 3. FUN 4. SUN 5. SON 6. TON 7. EAR 8. CAT 9. BAT 10. EMU 0 1 2 7 8 10 E G N R T U 3 1 3 1 3 A M O R U