object --+
|
Word
Represents a word in a body of text. Each Word has a main part and a trailing part. The main part is processed according to the flags in the current WordState to improve its presentation to the user through speech or another output device; the trailing part is left unprocessed. The value of WordDef determines which characters belong to the main and trailing parts of each word. The available constants are defined in AEConstants.
Characters in the ignore list are treated as blanks. A POR (point of regard) can be associated with a Word to indicate its context in a larger body of text.
Callables may be specified as observers for the characters processed in the main and trailing parts of each Word. An observer must take four parameters: this Word instance, the WordState in use, the current character, and the list of all characters in the main or trailing part of the word. The observer should return the character to be added. The list may be modified in place to affect the final contents of the word.
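
As a minimal sketch of the observer contract described above, the function below takes the four parameters and returns the character to add. The name `mask_digits` and the small driver loop are hypothetical; this page does not show how an observer is registered on a Word, so the loop only imitates a call per character.

```python
def mask_digits(word, state, char, chars):
    """Hypothetical observer: replace every digit with '#'.

    word  -- the Word instance being built (unused in this sketch)
    state -- the WordState in use (unused in this sketch)
    char  -- the character currently being processed
    chars -- list of characters collected so far for this part; it may be
             modified in place to change the final contents of the word
    """
    if char.isdigit():
        return '#'
    return char


# Stand-in driver for illustration only (an assumption): a real Word would
# invoke the observer itself while parsing text.
chars = []
for ch in 'ab12':
    chars.append(mask_digits(None, None, ch, chars))
masked = ''.join(chars)
```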

| Type | Name | Description |
|---|---|---|
| integer | curr_repeat | Indicates a character should be considered a repeat iff this value > MaxRepeat. |
| boolean | has_main | Has at least one main character been parsed? |
| string | last_char | Last character appended to this Word |
| boolean | main_done | Is the main_part complete? |
| callable | main_ob | Function to invoke for each character in the main part of a word |
| list | main_part | Part of this Word that will receive extra preparation for output |
| boolean | more | Are there likely more Words after this one in the text source where this Word originated? |
| POR | por | Point of regard indicating where this Word originated |
| list | source_word | Original text of this Word without any preparation for output applied |
| WordState | state | Settings that determine the definition of a Word and how it is prepared for output |
| boolean | trail_done | Is the trail_part complete? |
| callable | trail_ob | Function to invoke for each character in the trailing part of a word |
| list | trail_part | Part of the word that will receive little preparation for output |
Stores the WordState and initializes all instance variables.
|
Compares this Word to the one provided based on their PORs and content. If their source_words and PORs are the same, they are considered equal.
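
The comparison rule can be sketched as follows. `WordSketch` is a hypothetical stand-in reduced to the two fields being compared, and a tuple is used in place of a real POR purely as an assumption for illustration.

```python
class WordSketch(object):
    """Hypothetical stand-in for Word, reduced to the two compared fields."""

    def __init__(self, source_word, por):
        self.source_word = list(source_word)  # original characters
        self.por = por                        # any object supporting ==

    def __eq__(self, other):
        if not isinstance(other, WordSketch):
            return NotImplemented
        # Equal only when both the source text and the POR match.
        return (self.source_word == other.source_word
                and self.por == other.por)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


# Same text at the same position compares equal; same text elsewhere does not.
same = WordSketch('hello', ('doc', 0)) == WordSketch('hello', ('doc', 0))
moved = WordSketch('hello', ('doc', 0)) == WordSketch('hello', ('doc', 6))
```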
|
Gets this Word as a unicode string.
|
Gets this Word as a non-unicode string.
|
Determines if the given character should be considered a part of the main part of this word or not based on the definition of the word given by WordState.
|
Replaces the main part of the word with the given string.
|
Gets the POR associated with the start of this Word.
|
Determines if the given character is blank or ignored.
|
Determines if the given character is a letter in the current locale.
|
Determines if the given character is a number in the current locale.
|
Determines if the given character is a punctuation mark.
|
Determines if the given character is a symbol.
|
Determines if the given character is a vowel. Relies on a translator to list all vowels in the current locale.
|
Determines if the given character is an upper case letter.
|
Gets the unicode hex value for a character sans the 0x prefix.
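
One way to obtain such a value in Python is sketched below. The function name `char_hex` is hypothetical, and lowercase digits without zero padding are an assumption; the documented method may format the value differently.

```python
def char_hex(char):
    """Return the code point of a single character as hex without '0x'."""
    return format(ord(char), 'x')


latin_a = char_hex('A')        # code point U+0041
euro = char_hex('\u20ac')      # code point U+20AC
```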
|
Gets the Unicode name of the character, one of the strings listed at http://unicode.org/charts/charindex.html. If the character cannot be determined from the given string, returns an empty string. Note that these names are not localized.
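
A comparable lookup is available in Python's standard `unicodedata` module. The sketch below is not necessarily how this method is implemented; `char_name` is a hypothetical name, and falling back to an empty string mirrors the behavior described above.

```python
import unicodedata


def char_name(char):
    """Return the Unicode name of char, or '' if none can be determined."""
    try:
        return unicodedata.name(char)
    except (ValueError, TypeError):
        # ValueError: the character has no name (e.g. control characters).
        # TypeError: the argument is not a single character.
        return ''


name_a = char_name('a')
name_newline = char_name('\n')
name_multi = char_name('ab')
```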
|
Gets a localized description of the given character. The most detailed description for a character is returned so that, for instance, 'e' is described as a vowel and not just a letter.
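
The "most detailed description wins" rule amounts to testing the narrowest category first. The sketch below assumes an English vowel set and unlocalized labels; the names `VOWELS` and `describe` and the label strings are hypothetical, not the descriptions this method actually returns.

```python
# Assumption: English vowels. The documented class relies on a translator
# to supply the vowels for the current locale.
VOWELS = frozenset('aeiou')


def describe(char):
    """Return the most specific category label for a single character."""
    if char.isspace():
        return 'blank'
    if char.lower() in VOWELS:   # narrower than 'letter', so tested first
        return 'vowel'
    if char.isalpha():
        return 'letter'
    if char.isdigit():
        return 'number'
    return 'symbol'


labels = [describe(c) for c in 'et7 ']
```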
|
Gets the unprocessed text of this word as it was seen in the original text source.
|
Gets the length of the unprocessed source text of this Word.
|
Gets the length of the processed main part of this Word.
|
Makes a guess as to whether or not there are more Words in the body of text from which this word originated. This guess is based on whether or not the last chunk passed to append was processed in full.
|
Returns whether this Word has a character repeated more than the maximum allowed number of repetitions.
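
A straightforward version of this check counts runs of consecutive identical characters. This sketch counts exactly, whereas the class's own curr_repeat counter is described as approximate. The name `has_repeat` is hypothetical, and treating the run length as the value compared against the limit is an assumption.

```python
def has_repeat(text, max_repeat):
    """Return True if any character occurs more than max_repeat times in a row."""
    run = 0
    last = None
    for ch in text:
        # Extend the current run, or start a new run of length 1.
        run = run + 1 if ch == last else 1
        last = ch
        if run > max_repeat:
            return True
    return False


three_in_a_row = has_repeat('aaab', 2)
two_in_a_row = has_repeat('aab', 2)
```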
|
Returns whether this Word contains an uppercase letter.
|
Returns whether this Word contains a vowel.
|
Returns whether this Word consists entirely of capital letters.
|
Returns whether this Word consists entirely of numbers.
|
Returns whether this Word consists entirely of blanks.
|
Parses the given chunk of text for characters that should be added to the main_part or trail_part of this Word. If this word has neither main_done nor trail_done set, all main characters, as determined by _isMainChar, up to the first non-main character are added to the main part of this word. When the first non-main character is encountered, main_done is set. If this word has main_done set and trail_done unset, all non-main characters are added to the trail part of this word. When another main character is encountered after main_done is set, trail_done is set and the remainder of the given chunk is returned unprocessed to be added to another Word. Once trail_done is set, no further text can be appended to this Word.
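
The parsing rules above describe a small state machine, sketched here. `ParsedWord` and `append_chunk` are hypothetical names, and `is_main_char` stands in for _isMainChar. Returning an empty string when the whole chunk is consumed, and returning the chunk untouched once trail_done is set, are assumptions.

```python
class ParsedWord(object):
    """Hypothetical stand-in holding only the parsing state."""

    def __init__(self):
        self.main_part = []
        self.trail_part = []
        self.main_done = False
        self.trail_done = False


def append_chunk(word, chunk, is_main_char):
    """Feed chunk into word and return the text left unprocessed."""
    if word.trail_done:
        return chunk  # the word is complete and accepts nothing more
    for i, ch in enumerate(chunk):
        if not word.main_done:
            if is_main_char(ch):
                word.main_part.append(ch)
                continue
            word.main_done = True  # first non-main character ends the main part
        if is_main_char(ch):
            # A main character after the main part belongs to the next word.
            word.trail_done = True
            return chunk[i:]
        word.trail_part.append(ch)
    return ''


first = ParsedWord()
rest = append_chunk(first, 'hello, world', str.isalnum)
```

Here `first` ends up with `hello` as its main part and `, ` as its trailing part, while `rest` holds `world`, ready to be fed into the next word.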
|
Adds the given character to the source_word. If Caps is unset, makes the character lowercase. If CapExpand is set and the character is a capital letter, or NumExpand is set and the character is a number, inserts a space in main_part. Finally, inserts the possibly lowercased character in main_part.
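
These steps can be sketched as a single function. The keyword arguments stand in for the Caps, CapExpand and NumExpand settings of WordState; the function name and the defaults are hypothetical. Testing the original character for capitalization, rather than the lowercased one, is an assumption.

```python
def append_main(source_word, main_part, char,
                caps=True, cap_expand=False, num_expand=False):
    """Record char in source_word and add its prepared form to main_part."""
    source_word.append(char)                    # source keeps the raw character
    prepared = char if caps else char.lower()
    if (cap_expand and char.isupper()) or (num_expand and char.isdigit()):
        main_part.append(' ')                   # separate capitals or digits
    main_part.append(prepared)


source, main = [], []
for ch in 'getPOR':
    append_main(source, main, ch, cap_expand=True)
expanded = ''.join(main)
```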
|
Adds the given character to the source_word. If the character is a blank, inserts a space in trail_part, else inserts the character.
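
The trailing part receives much lighter treatment, as this sketch shows. `append_trail` is a hypothetical name, and `str.isspace` is only a placeholder for the class's own blank test, which also honors the ignore list.

```python
def append_trail(source_word, trail_part, char, is_blank=str.isspace):
    """Record char in source_word; blanks become a plain space in trail_part."""
    source_word.append(char)
    trail_part.append(' ' if is_blank(char) else char)


source, trail = [], []
for ch in ',\t!':
    append_trail(source, trail, ch)
normalized = ''.join(trail)
```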
|
curr_repeat: Indicates a character should be considered a repeat iff this value > MaxRepeat. It is not the exact number of repetitions of a character, as it is optimized for speed, not accuracy.
|
| Generated by Epydoc 3.0beta1 on Mon Jun 4 15:33:27 2007 | http://epydoc.sourceforge.net |