LSR Web UI Specification

Supporting Rich Document Browsing and Accessible Rich Internet Applications

Authors: Peter Parente, Scott Haeger
Addresses: pparent@us.ibm.com, scott@bashautomation.com
Date: 2007-05-22
Revision: 880
Status: working draft
Copyright: Copyright © 2007 IBM Corporation under the BSD License
Source: http://svn.gnome.org/svn/lsr/trunk/doc/web/

Abstract

The purpose of this document is to describe the Web browsing experience for Linux Screen Reader (LSR) users. The behavior of LSR in common Web browsing situations, the facilities provided by LSR for Rich Document Browsing (RDB), and the presentation of Accessible Rich Internet Applications (ARIAs) are all specified. The description assumes AT-SPI support by the Web browser equivalent to that planned for Firefox 3.0.

The extent to which the interaction described in this document can be supported in LSR depends on the browser's AT-SPI implementation. Regardless of how the implementation deviates, this document always describes the desired solution.

1   Terminology

The following terms and phrases are used throughout this document. They are defined here for clarity:

accessible
AT-SPI object representing a visual object or region on the screen.
chrome
Web browser graphical user interface (GUI) excluding rendered Web content.
document frame
Container accessible separating accessibles for rendered Web content from accessibles representing the chrome of the browser. The document frame itself may contain Web content (e.g. text).
editable accessible

Accessible that can become the focus and may be directly changed by user input. For instance, a text area is an editable accessible because the user may change its value by typing text. A combobox is an editable accessible because the user may select a new value using the arrow keys. A link, on the other hand, is not considered editable because the user may only activate it, not change its value directly. The same is true of a button.

An editable accessible is a specialization of a focusable accessible. That is, an editable accessible is always focusable, but a focusable accessible is not necessarily editable.

The reason for this specialization will become clear in the specification of Rich Document Browsing features.

focus (noun)
State of one and only one accessible indicating it will receive the next input event.
focus (verb)
To cause an accessible to become the focus by tabbing to it, clicking it, hovering over it, etc. and hence move the user point of regard to that accessible as a result.
focusable accessible
Accessible that can become the focus. For instance, a link, a form text area, and the address bar are all focusable.
item

The definition of an item in LSR depends on the context in which it is used. In a document, an item refers to one of the following:

  1. A segment of static text bounded by the extents of a visible line, by a focusable accessible, or by a non-text accessible.
  2. A focusable or non-text accessible embedded in the document.
  3. A descendant of a container accessible such as a table cell, list item, tree item, etc.

As an example, consider navigation through the following HTML structure:

This is a paragraph with a link to IBM in it followed by more text.

If all of this text fits on one visual line, the items in this structure are "This is a paragraph with a", "link to IBM in it", and "followed by more text."
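The item rules above can be illustrated with a minimal sketch. It assumes a visual line has already been broken into runs of static text, focusable accessibles, and non-text accessibles; the Run class and the kind labels are hypothetical and are not part of LSR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """Hypothetical piece of one visual line."""
    text: str
    kind: str  # "text", "focusable", or "nontext" (illustrative labels)


def split_into_items(line: List[Run]) -> List[str]:
    """Split the runs on one visual line into items."""
    items: List[str] = []
    pending = ""  # static text gathered since the last boundary
    for run in line:
        if run.kind == "text":
            pending += run.text
            continue
        # A focusable or non-text accessible bounds the static text before it
        # and is an item in its own right.
        if pending.strip():
            items.append(pending.strip())
        pending = ""
        items.append(run.text.strip())
    if pending.strip():
        items.append(pending.strip())
    return items


paragraph = [
    Run("This is a paragraph with a ", "text"),
    Run("link to IBM in it", "focusable"),
    Run(" followed by more text.", "text"),
]
items = split_into_items(paragraph)
```

For the example paragraph, the sketch yields the same three items listed above.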

point of regard (POR)
Portion of the user interface to which the user is attending at present. More exactly, it is the accessible or part of an accessible toward which the next LSR query or command will be directed.
review (verb)
To move the point of regard using the LSR review keys.
tab (noun)
Notebook tab containing a document frame. Also referred to as a page tab.
tab (verb)
To move focus to another accessible by pressing the Tab or Shift-Tab key.
window frame
Top-level accessible containing both the chrome and the document frame.

2   Basic Browser Behavior

This section details how LSR responds to basic user behaviors in a Web browser such as tabbing through a document, refreshing a page, following a link, raising the window frame to the foreground after switching away, and so on. The behaviors described in this section are independent of the features defined for Rich Document Browsing and Accessible Rich Internet Applications.

2.1   Activating the Window Frame

If the window frame has never been activated previously, LSR responds to the focus event provided by the browser. LSR considers the user's point of regard to be at that location.

Example

The user runs Firefox. The Firefox window appears with the caret in the address bar. The user presses Alt-Tab to switch to a console window. The user runs LSR from the console. The user presses Alt-Tab again to switch back to Firefox. The Firefox window is raised to the foreground. LSR reports information about the address bar and sets the point of regard to the position of the caret in the bar.

If the window frame has been activated previously, and is now returning to the foreground, LSR restores the user's last point of regard before the window was deactivated. LSR then reports the restored point of regard. LSR may need to ignore the events fired by the Web browser when it is reactivated to satisfy this requirement.

Example

The user is editing the text in the Firefox search box. The user presses Alt-Tab to switch to another application. The user presses Alt-Tab again to return to Firefox. LSR reports the search box and sets the point of regard to the position of the caret in the box.

Example

The user tabs to a link in the document frame. The user presses Alt-Tab to switch to another application. The user presses Alt-Tab again to return to Firefox. LSR reports the link and sets the point of regard to the start of the link text.

Critical Example

The user is editing the text in the Firefox search box. The user reviews into the document and stops on a line of static text. The user presses Alt-Tab to switch to another application. The user presses Alt-Tab again to return to Firefox. LSR announces the static text and sets the POR to the start of that line.

2.2   Tabbing to the Document Frame

When the user tabs to the document frame, the document frame gains focus. If the document frame accessible starts with human readable text, LSR reads it and sets the point of regard to the first character in that text. If the document frame starts with an embed character, LSR recursively descends into the embedded objects until one is found containing text, reports it, and sets the POR to that location.
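The recursive descent described above might be sketched as follows. This is an illustration only: the Node class is hypothetical, and the sketch assumes that the n-th embed character in an accessible's text corresponds to its n-th child, which may not match how a given AT-SPI implementation exposes embedded objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional

EMBED = "\ufffc"  # object replacement character marking an embedded child


@dataclass
class Node:
    """Hypothetical stand-in for an accessible with text and children."""
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def first_text_node(node: Node) -> Optional[Node]:
    """Return the first accessible, in reading order, with readable text."""
    child_iter = iter(node.children)
    for ch in node.text:
        if ch == EMBED:
            # Assumption: the n-th embed character maps to the n-th child.
            child = next(child_iter, None)
            if child is not None:
                found = first_text_node(child)
                if found is not None:
                    return found
        elif not ch.isspace():
            # Human readable text appears here before any embedded object.
            return node
    return None


frame = Node(EMBED, [Node(EMBED, [Node("Welcome to the page")])])
target = first_text_node(frame)
```

The POR would then be set to the first character of the returned accessible's text.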

Critical Example

The user tabs to the document frame itself. The user presses Alt-Tab to switch to another application. The user presses Alt-Tab again to return to Firefox. If the document frame accessible starts with human readable text, LSR reads it and sets the point of regard to the first character in that text. If the document frame starts with an embed character, LSR recursively descends into the embedded objects until one is found containing text, reports it, and sets the POR to that location.

2.3   Tabbing to a Document Focusable

When the user tabs to a focusable accessible within the document frame, LSR announces that accessible as is appropriate for its type. LSR sets the point of regard to the appropriate part of the accessible.

Example

The user presses Tab to move focus to the next link on a Web page. LSR announces the link role and text, and sets the POR to the start of the link.

Example

The user presses Tab to move focus to a text area on a Web page. LSR announces the text area label, role, and line of text at the caret, and sets the POR to the caret location.

2.4   Tabbing to a Chrome Focusable

When the user tabs to a focusable accessible within the chrome, LSR announces that accessible as is appropriate for its type. LSR sets the point of regard to the appropriate part of the accessible.

Example

The user presses Alt-D to move to the address bar. LSR announces the text of the entry box and sets the POR to the caret location in the entry field.

2.5   Activating and Deactivating Menus

When the user activates a menu bar or context menu, LSR responds to the focus events from the menu. When the menu is dismissed, the last point of regard active before the menu appeared is restored as the current point of regard. Again, LSR may need to ignore the events sent by the browser to properly restore the point of regard when the menu closes and satisfy this requirement.

Example

The focus and POR are on a link in a document frame. The user presses Alt-F to show the File menu. LSR announces the menu name and selected menu item. The user presses Escape to dismiss the menu. LSR responds to the focus event on the link by announcing its name and setting the point of regard to that accessible.

Critical Example

The user tabs to a link, and then reviews into the static text following it. The focus is now on the link, but the POR is in the static text. The user presses Alt-E to show the Edit menu. LSR announces the menu name and selected menu item. The user presses Escape to dismiss the menu. LSR ignores the focus event on the link, announces the line of static text last reviewed by the user, and sets the POR to its previous location in the static text.

2.6   Switching Document Tabs

When the user activates a new document tab, LSR always announces the name of the newly activated tab. If the tab was never before visited, LSR announces the focus event fired by the browser and sets the point of regard to its location.

Example

The user presses Ctrl-T to create a new tab. LSR says untitled tab, responds to the focus event (typically on the address bar), and sets the POR to the focus location.

If the tab was previously visited, and the last POR reviewed while the tab was active falls in its corresponding document frame, LSR announces that point of regard and sets it as the current POR.

Critical Example

The first page tab is active. The user presses Alt-D to move focus to the address bar. The user presses Alt-2 to switch to the second tab, which he had been browsing earlier. LSR reports the title of the second tab, then announces the last item browsed in that tab. LSR then sets the current POR to that item.

If the tab was previously visited, but the last POR reviewed while the tab was active does not reside in the document frame, LSR announces any focus event fired by the browser and sets the point of regard to its location. If no event is fired, the POR remains where it is and no additional announcement is made.

Critical Example

The user browses the document under the first tab. He presses Alt-D to switch to the address bar. LSR announces the content of the address bar and sets the POR to the caret location. He then creates a new tab using Ctrl-T. LSR announces the new tab and announces the focus event on the address bar, if one is fired. LSR sets the POR to that location also. The user switches back to the first tab by pressing Alt-1. LSR says the name of the first tab and announces the focus event on the address bar, if one is fired.
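The three tab-switching cases in this section can be summarized in a short decision sketch. The TabState record and its field names are hypothetical; the sketch only models which announcements are made, not how LSR stores the point of regard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TabState:
    """Hypothetical per-tab record kept by the screen reader."""
    name: str
    last_por: Optional[str] = None   # last item reviewed while the tab was active
    por_in_document: bool = False    # True if last_por lies in the document frame


def on_tab_activated(tab: TabState, focus_event: Optional[str]) -> List[str]:
    """Return the announcements made when a tab becomes active."""
    spoken = [tab.name]  # the tab name is always announced first
    if tab.last_por is not None and tab.por_in_document:
        # Previously visited, POR was in the document: restore and announce it.
        spoken.append(tab.last_por)
    elif focus_event is not None:
        # New tab, or last POR was in the chrome: follow the browser focus.
        spoken.append(focus_event)
    # Otherwise no event was fired: nothing further is announced.
    return spoken


new_tab = on_tab_activated(TabState("untitled tab"), "address bar")
second_tab = on_tab_activated(
    TabState("Tab 2", last_por="heading News", por_in_document=True), "address bar"
)
quiet = on_tab_activated(TabState("Tab 1", last_por="address bar"), None)
```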

2.7   Loading a Document

When a new page starts to load in the active tab, LSR announces that a page is starting to load by default. As progress is made loading the page, LSR reports the percent complete in 10% increments by default. The progress value is announced when the next increment is surpassed. When the page finishes loading, LSR announces a summary of the page and reads the entire page by default. It then sets the point of regard to an initial location in the document.

The FirstPOR setting dictates where the initial POR rests: on the source of the first focus event fired by the browser or at the top of the document. It defaults to following the browser focus.

Example

Assume all settings are at their defaults.

The user activates a link. LSR announces a page is loading. The browser fires progress events at 9%, 25%, and 50%. LSR announces 20% and 50% when they are surpassed. The browser completes loading of the page. LSR announces a page summary, follows the browser focus event, and reads the entire page.
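The progress arithmetic in this example can be checked with a small sketch. The function name is hypothetical; the step argument plays the role of the PageProgress setting, with 0 meaning off, and an increment is treated as surpassed once the reported value reaches it.

```python
from typing import Iterable, List


def progress_announcements(events: Iterable[int], step: int = 10) -> List[int]:
    """Return the percentages announced for a series of progress events."""
    if step <= 0:
        return []  # a step of 0% turns progress announcements off
    announced: List[int] = []
    last_mark = 0
    for percent in events:
        # Highest multiple of the step reached by this event.
        mark = (percent // step) * step
        if mark > last_mark:
            announced.append(mark)
            last_mark = mark
    return announced


marks = progress_announcements([9, 25, 50])
```

With the events from the example (9%, 25%, 50%), only the 20% and 50% marks are announced, because the 9% event does not reach the first increment.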

The user may configure whether or not LSR announces when a page starts to load using the PageStart option. The user may modify the PageProgress setting to change the resolution of progress announcements.

Example

The user configures progress announcements to 0% intervals and turns off load start announcements in the LSR settings dialog. The user configures LSR to put the POR at the start of the document when a page finishes loading. The user activates a link. The page loads in its entirety. LSR reports a summary of the loaded page and puts the point of regard on the first item in the document frame. LSR then reads the entire page.

What LSR announces when a foreground page load completes is configurable with the PageComplete setting. This option has numerous possible values defining what LSR announces when a page is ready:

Read title
Reads the title of the new page.
Read summary
Reads the title of the new page, and then gives a summary of the page. (See the Rich Document Browsing Overview section).
Read all
Reads the entire page starting at the location of the first POR.
Read title and all
Reads the title of the new page, and continues reading the entire page starting at the location of the first POR.
Read summary and all
Reads a summary of the new page, and continues reading the entire page starting at the location of the first POR.
Read nothing
Does nothing when a new page finishes loading.
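As an illustration, the PageComplete values could be mapped to an ordered list of announcements. The mapping follows the definitions above literally; the action tokens ("title", "summary", "all") and the function name are hypothetical.

```python
from typing import Dict, List

# Ordered announcements for each PageComplete value, as defined above.
PAGE_COMPLETE_ACTIONS: Dict[str, List[str]] = {
    "Read title": ["title"],
    "Read summary": ["title", "summary"],
    "Read all": ["all"],
    "Read title and all": ["title", "all"],
    "Read summary and all": ["summary", "all"],
    "Read nothing": [],
}


def page_complete_output(setting: str) -> List[str]:
    """Return what is read when a foreground page finishes loading."""
    # An unknown setting value raises KeyError rather than guessing.
    return list(PAGE_COMPLETE_ACTIONS[setting])


default_output = page_complete_output("Read summary and all")
```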

Example

The user keeps the configuration from the previous example, except he instructs LSR to read only the page title when it finishes and to set the point of regard to the first focused item. The user then activates a link leading to a Web application. The page loads in its entirety. LSR reports the title of the loaded page and puts the point of regard on the focused item. LSR announces the focus as it would in a desktop application.

When a new page starts to load in a background tab, LSR follows the same logic as if the page were active. However, a separate set of options named BackPageStart, BackPageProgress, and BackPageComplete controls the verbosity of LSR when responding to load events from background tabs. The BackPageComplete setting only lists the Read title and Read nothing values, unlike its foreground counterpart.

When the active page continually refreshes without direct action from the user, LSR avoids moving the POR to the start of the document and avoids announcing the refresh progress by default. The setting FilterReloads controls whether LSR announces the page load start, progress, and completion in this special case.

Critical Example

The user visits a Web page with a meta refresh timer in the header. Every 30 seconds, the page reloads itself. LSR keeps the POR fixed at its current location, regardless of what events the browser fires. The user enables announcements of auto-refresh progress in the settings dialog. On the next refresh, LSR reports progress according to the start, progress interval, and completion settings.

Note

The usability of this feature will likely benefit from the logic that dictates how ARIA live regions are handled. The specification above will be extended as the Live regions section is written.

For more details about the types and values of all settings defined here, see the Browser Settings Reference at the end of this major section.

2.8   Downloading Files

When the download manager window is in the foreground, LSR reports progress on all downloads in 10% increments by default. LSR says the name of the file on the changed progress bar followed by its percentage. When a download completes, LSR announces the completion including the name of the downloaded file.

Example

The user starts downloading a large file to disk. The Web browser download manager window becomes the foreground window. It contains two progress bars. The first progress bar increases by 5%. The second bar increases by 35%. LSR announces the name of the file on the second bar and the 30% mark.

Progress announcements, the start download announcement, and the download complete announcement are configurable by three settings: FileProgress, FileStart and FileComplete. See the Browser Settings Reference section for details.

When the download manager window is in the background, LSR follows the same logic as if the window were in the foreground. However, a separate, identical set of options controls the verbosity of LSR when responding to progress events in the background. See the Browser Settings Reference section for details.

Note

The usability of this feature will likely benefit from the logic that dictates how ARIA live regions are handled. The specification above will be extended as the Live regions section is written.

2.9   Using Typeahead Find

When the user initiates typeahead find, LSR reports the label on the typeahead box and its current content (usually blank). As the user types, LSR reports the content of the search box as if it were any other text entry field. After reporting the entered text, LSR also announces the item containing the current match for the search term, if one exists.

Example

The user presses Ctrl-F to give focus to the typeahead find box. The user presses t. The current document pane scrolls to show the next t and selects it. If character echo is enabled, LSR says t echoing what was typed. LSR reports the item in which the matched t appears.

The user next types h. The document again updates to show the next match for th. If character echo is enabled, LSR says h. LSR reports the item in which the matched th appears.

The user then presses Backspace. The document updates the selection on the current word so that only t is selected. LSR says backspace h and nothing else.

If no match exists for the current term, LSR announces the lack of a match once. Entering additional characters or removing existing characters that do not result in a new match does not result in a response from LSR. Only once a match is found do responses resume.

Example

The user presses Ctrl-F to give focus to the typeahead find box. The user presses t. The current document pane scrolls to show the next t and selects it. If character echo is enabled, LSR says t echoing what was typed. LSR reports the item in which the matched t appears.

The user next types z. The browser removes the selection from the document because there is no longer a match. If character echo is enabled, LSR says z echoing what was typed. LSR says no match.

The user next types q. The browser does nothing because a match is impossible in this state. If character echo is enabled, LSR says q echoing what was typed.

The user then presses Backspace twice. The document reselects the t in the last matched word. LSR says backspace q, backspace z, and then the item in which the matched t appears.
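The rule that a missing match is announced only once can be sketched as a small state machine. This is a simplified illustration: the function name is hypothetical, character echo is not modeled, and the sketch does not cover the earlier case where a Backspace leaves the selection within the same match and LSR stays silent.

```python
from typing import Iterable, List, Optional


def find_responses(matches: Iterable[Optional[str]]) -> List[Optional[str]]:
    """Map each change of the search term to the response LSR gives.

    Each input element is the item containing the current match, or None
    when the term has no match. None in the output means no response.
    """
    responses: List[Optional[str]] = []
    in_no_match = False
    for item in matches:
        if item is not None:
            responses.append(item)        # a match: report its item
            in_no_match = False
        elif not in_no_match:
            responses.append("no match")  # first failure: announce once
            in_no_match = True
        else:
            responses.append(None)        # still failing: stay silent
    return responses


# Terms typed in the example above: t, tz, tzq, then Backspace twice.
matches = ["line with tea", None, None, None, "line with tea"]
responses = find_responses(matches)
```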

When the user activates the go to next match or go to previous match functions in the browser, LSR announces the item containing the new match.

Example

The user finishes typing a successful search query. The user presses Ctrl-G to navigate to the next match. The document moves the selection to the next match. LSR says the item containing the new match. The user presses Ctrl-Shift-G to go to the previous match. Again, LSR says the item containing the new match.

When the user dismisses the typeahead search bar, either by a key press or by inaction, LSR sets the POR to the active match. If no match exists, LSR sets the POR to its original location on the page.

Example

The user presses Escape to dismiss the search bar after a successful search. LSR announces the item containing the last match again and sets the POR to the start of the matched word.

Example

The user presses Escape to dismiss the search bar after an unsuccessful search. LSR announces the item that was active before the search began and sets the point of regard to its pre-search location.

Note

This section may be extended with information about how LSR should respond to the Highlight all feature in Firefox. If all matches can be quickly and easily detected, it may be worthwhile to support a read all feature over all matches.

Note

There are actually two typeahead bars in Firefox. One is shown when Ctrl-F is pressed. The other is displayed when / is pressed. The above specification should be supported in either case.

2.10   Browser Settings Reference

For reference, all of the LSR settings mentioned in the preceding sections are collated here. The first column gives the programmatic name of the setting. The second gives its type. The third states its default value. The fourth gives additional notes. The human readable name and description are to be decided by the implementor.

Table: Browser behavior settings
Name              Type     Default value         Notes
FirstPOR          choice   Focus                 See Loading a Document for value definitions.
PageProgress      percent  10%                   Steps by 1%. 0% means off.
PageStart         boolean  True
PageComplete      choice   Read summary and all  See Loading a Document for value definitions.
BackPageProgress  percent  0%                    Steps by 1%. 0% means off.
BackPageStart     boolean  False
BackPageComplete  choice   Read title            See Loading a Document for value definitions.
FileProgress      percent  10%                   Steps by 1%. 0% means off.
FileStart         boolean  True
FileComplete      boolean  True
BackFileProgress  percent  10%                   Steps by 1%. 0% means off.
BackFileStart     boolean  True
BackFileComplete  boolean  True
FilterReloads     boolean  True                  When False, obey page foreground and background settings on pages that automatically reload.

3   Rich Document Browsing

This section describes how LSR supports Rich Document Browsing (RDB). Essentially, this term implies that the screen reader provides multiple reading modes for dealing with the diverse types of content in a Web document, navigation hotkeys for quickly moving through sets of elements, navigation lists showing all elements of a given type, and context queries for quickly learning about the structure and content of a document. Together, these tools help the user tackle complex documents available on the Web today.

Throughout this section, default key bindings are specified according to the criteria stated in the LSR User Interface Specification under the section on Keyboard User Interface Considerations.

Note

The function names in the tables in this section are representative of the names of the tasks registered in the Firefox Perk, but are not guaranteed to be the actual names used. For instance, tasks first defined in the Firefox Perk are always prefixed with firefox. Functions listed with both next and previous bindings, for example, may actually be implemented as two tasks with the words next and previous in their names.

3.1   Reading Modes

Reading modes define the order and manner in which content is traversed. LSR supports two major reading modes with functions bound to the arrow keys:

  1. Document mode
  2. Widget mode

LSR also supports two minor reading modes with functions bound to the arrow keys with the Control modifier:

  1. Row/column mode
  2. Cell mode

By default, LSR automatically switches to the most specific major reading mode when the point of regard changes. The AutoMode setting controls this feature. The user may turn automatic switching on and off by toggling this option.

If the new POR lies on or within a data table or grid, LSR also activates one of the two minor modes. By default, LSR enables the keys for row/column mode. The user may configure the TableMode option to change this default.

See RDB Settings for details about both AutoMode and TableMode.

Table: Automatic reading mode transitions
New point of regard      Major mode  Minor mode
Non-editable, non-table  Document    None
Non-editable, table      Document    TableMode
Editable, non-table      Widget      None
Editable, table          Widget      TableMode
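The automatic transitions in the table above reduce to two independent decisions, as the following sketch shows. The function name and the string labels are hypothetical; table_mode stands for the current value of the TableMode setting.

```python
from typing import Optional, Tuple


def automatic_modes(
    editable: bool, in_table: bool, table_mode: str = "Row/column"
) -> Tuple[str, Optional[str]]:
    """Return the (major, minor) reading modes for a new point of regard."""
    # The major mode depends only on whether the accessible is editable.
    major = "Widget" if editable else "Document"
    # A minor mode is active only inside a data table or grid.
    minor = table_mode if in_table else None
    return major, minor


plain_text = automatic_modes(editable=False, in_table=False)
grid_cell = automatic_modes(editable=True, in_table=True, table_mode="Cell")
```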

Regardless of the AutoMode and TableMode values, the user may manually toggle between reading modes at any time using the change mode hotkey.

Table: Mode changes
Function     Binding     Notes
change mode  Ctrl-Space

The result of this function is dependent on the context of the point of regard. It may affect either the major mode, the minor mode, both, or neither.

Table: Manual reading mode changes
Point of regard          Major     Minor    New major  New minor
Non-editable, non-table  Document  None     Document   None
Non-editable, table      Document  Row/col  Document   Cell
                         Document  Cell     Document   Row/col
Editable, non-table      Widget    None     Document   None
Editable, table          Widget    None     Document   TableMode

Any time the mode changes, automatically or manually, LSR announces the name of the new mode.

3.1.1   Document Major Mode

The basic document reading mode supports navigation by character, word, and item. Navigation follows an in-order traversal scheme according to the accessibility hierarchy exposed by the Web browser. The starting boundary of the traversal falls on the first item in the document frame, including the document frame itself. The ending boundary is the last item, of the last child accessible, of the last branch of the document structure under the document frame.
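The traversal order can be sketched as a depth-first walk that visits each accessible before its children. This reading of the traversal scheme is an assumption, and the Accessible class here is a hypothetical stand-in, not the AT-SPI type.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Accessible:
    """Hypothetical node in the accessibility hierarchy."""
    name: str
    children: List["Accessible"] = field(default_factory=list)


def document_order(root: Accessible) -> Iterator[Accessible]:
    """Yield the document frame, then every descendant in reading order."""
    yield root
    for child in root.children:
        yield from document_order(child)


frame = Accessible("document frame", [
    Accessible("heading"),
    Accessible("list", [Accessible("item 1"), Accessible("item 2")]),
])
order = [acc.name for acc in document_order(frame)]
```

The first element is the document frame itself, and the last is the last item of the last child of the last branch, matching the boundaries stated above.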

The first column of the table below lists the document navigation functions. The second column states the default key bindings which activate each function.

Table: Document mode navigation
Function              Binding     Notes
review next item      Down        Alt-Shift-O always works
review previous item  Up          Alt-Shift-U always works
review next word      Ctrl-Right  Alt-Shift-L always works
review previous word  Ctrl-Left   Alt-Shift-J always works
review next char      Right       Alt-Shift-. always works
review previous char  Left        Alt-Shift-M always works
go to item start      Home        First char of item, report text passed
go to item end        End         Last char of item, report text passed
go to document start  Ctrl-Home   Document frame, report item
go to document end    Ctrl-End    Last char of last item in the frame, report item

Document Mode Settings

Two settings affect document mode. First, AutoMode controls whether the reading mode changes from document mode to widget mode when certain accessibles are encountered. Second, BoundReview dictates whether the review commands can navigate out of the current document frame or not. See RDB Settings for the definition of these settings.

Document Items

Document items meet the definition of the item given in the Terminology section at the start of this document.

Navigation by word and character within an item is allowed. Navigation across two items is allowed according to the wrapping setting in the ReviewPerk.

Items containing no text are handled according to the skipping setting in the ReviewPerk. This setting dictates which of the following three actions is performed when an item with no text content is reached:

  1. The point of regard stops on the item.
  2. The point of regard skips the item.
  3. The point of regard skips the item, but the role of the item is announced in passing.
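A minimal sketch of the three skipping behaviors follows. The setting values "stop", "skip", and "report" are illustrative names, the function is hypothetical, and announcing the role when stopping on an empty item is an assumption.

```python
from typing import List, Tuple

Item = Tuple[str, str]  # (role, text); empty text means no text content


def review_next(items: List[Item], start: int, skipping: str) -> Tuple[int, List[str]]:
    """Move the POR to the next item and return (new index, announcements)."""
    spoken: List[str] = []
    index = start + 1
    while index < len(items):
        role, text = items[index]
        if text or skipping == "stop":
            # Stop here: read the text, or the role if there is no text.
            spoken.append(text if text else role)
            return index, spoken
        if skipping == "report":
            spoken.append(role)  # announce the skipped role in passing
        index += 1
    return start, spoken  # ran off the end: the POR does not move


items = [("paragraph", "Intro"), ("image", ""), ("separator", ""), ("link", "Home")]
stopped = review_next(items, 0, "stop")
skipped = review_next(items, 0, "skip")
reported = review_next(items, 0, "report")
```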

When any of the basic Perk functions are invoked, the operation is performed as expected on the item at the current point of regard.

Output

When the point of regard is moved to a new item, LSR speaks an announcement equivalent to read item details. When the point of regard is moved to a new word or character, LSR reads that word or character as expected. Depending on the review skipping setting, LSR may speak the roles of skipped accessibles too.

When the BoundReview setting is enabled, LSR also speaks an error message at the start or end of the document. If the point of regard is at the start of the traversal, and the user tries to navigate previous, LSR speaks document start. If the point of regard is at the end of the traversal, and the user tries to navigate next, LSR speaks document end.

When the point of regard is moved to a new item, LSR sends the text of the new item to the Braille display. When the point of regard is moved to a new word or character, LSR updates the display so that the start of the word or the character falls in the first usable cell on the device.

When the point of regard is moved to a new item, LSR moves the magnifier to the focal point of that item. When the point of regard is moved to a new word or character, LSR updates the magnifier so that it is focused on the start of the word or the character.

Example

The point of regard and application focus are on the document frame. The user navigates into the document frame using the document navigation keys. The user traverses three lines of static text. The user then invokes the pointer to focus command. The POR returns to the application focus (i.e. the document frame) and LSR announces the new POR location as is the standard behavior for this command.

Example

The user navigates by item to a link in a document. The user then presses the keys to activate the read text color command. LSR reports the foreground and background color names for the text at the point of regard as well as their numeric values. This report is the standard behavior for this command.

3.1.2   Widget Major Mode

The widget reading mode supports navigating within editable form controls or ARIA widgets. This mode is essentially a pass-through, in that it allows nearly all keystrokes to reach the widget at the POR. LSR then responds to the events fired by the widget as a result of the input. The boundary for entering and exiting widget mode is any editable accessible or part of an editable accessible.

Note

Only the minimal set of keys required to enact a mode change are required to remain active in this mode, namely those that provide the user with an avenue for getting out of this mode. However, as many of the other navigation hotkeys as possible should remain active in this mode for the sake of consistency.

All LSR output in this mode should be the same as the output LSR gives when the user is interacting with an equivalent desktop widget.

3.1.3   Table Minor Modes

The table reading modes support navigation by row, column, and cell in static HTML data tables. Navigation follows a spatial traversal scheme according to the layout of the table. The starting boundary of the traversal is the table container itself. The ending boundary of the traversal lies within the last cell of the table.

Two minor modes control the traversal order within the table. The split into two modes exists to account for different tasks when navigating tables with spanned cells.

Important

Table minor modes are never active for tables used strictly for layout. They are only available for data tables and grids.

Row/Column Minor Mode

Row/column mode supports the navigation of table cells by row and column index, ignoring cell spanning. Navigation to the left from the left-most column or to the right from the right-most column causes LSR to announce first column or last column, respectively, by default. Navigation up from the top row or down from the bottom row causes LSR to report first row or last row, respectively, by default. See Table Mode Settings for details on how settings affect these default behaviors.

The first column of the table below lists the table navigation functions. The second column states the default key bindings which activate that function.

Table: Row/column mode navigation
Function            Binding         Notes
next column         Ctrl-Alt-Right
previous column     Ctrl-Alt-Left
next row            Ctrl-Alt-Down
previous row        Ctrl-Alt-Up
go to table start   Ctrl-Home       First press goes to top-left row/column. Second press invokes go to document start.
go to table end     Ctrl-End        First press goes to bottom-right row/column. Second press invokes go to document end.
go to row start     Home            First press invokes go to item start. Second press goes to first column, same row.
go to row end       End             First press invokes go to item end. Second press goes to last column, same row.
go to column start  Ctrl-Alt-Home   First row, same column
go to column end    Ctrl-Alt-End    Last row, same column

Cell Minor Mode

Cell mode supports the navigation of table cells respecting all spanning. Navigation to the left from a left-most cell or right from a right-most cell causes LSR to announce first cell or last cell by default. Navigation up from a top cell or down from a bottom cell causes LSR to report first cell or last cell by default. See Table Mode Settings for details on how settings affect these default behaviors.

The first column of the table below lists the table navigation functions. The second column states the default key bindings which activate that function.

Table: Cell mode navigation
Function               Binding         Notes
next row cell          Ctrl-Alt-Right
previous row cell      Ctrl-Alt-Left
next column cell       Ctrl-Alt-Down
previous column cell   Ctrl-Alt-Up
go to table start      Ctrl-Home       First press goes to top-left cell. Second press invokes go to document start.
go to table end        Ctrl-End        First press goes to bottom-right cell. Second press invokes go to document end.
go to row start        Home            First press invokes go to item start. Second press goes to first cell, same row.
go to row end          End             First press invokes go to item end. Second press goes to last cell, same row.
go to column start     Ctrl-Alt-Home   First cell, same column
go to column end       Ctrl-Alt-End    Last cell, same column
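One way to picture cell mode is as movement over a grid in which a spanned cell occupies several grid positions. The Python sketch below is illustrative only: the grid representation (one identifier per grid position, repeated for spanned cells) and the function name are assumptions, not the LSR data model.

```python
# Hypothetical grid: each entry names the cell occupying that position.
# Cell "a" spans two columns; cell "b" spans two rows.
GRID = [
    ["a", "a", "b"],
    ["c", "d", "b"],
]


def step(grid, row, col, d_row, d_col):
    """Move from (row, col) to the next distinct cell in one direction.

    Positions belonging to the current (spanned) cell are skipped. Returns
    (row, col, message); message is None on success or the boundary
    announcement when no further cell exists in that direction.
    """
    current = grid[row][col]
    r, c = row + d_row, col + d_col
    while 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        if grid[r][c] != current:
            return r, c, None
        r, c = r + d_row, c + d_col
    message = "first cell" if (d_row < 0 or d_col < 0) else "last cell"
    return row, col, message
```

Here step(GRID, 0, 0, 0, 1) skips the second half of cell a and lands on cell b. The sketch returns the first grid position reached in the new cell; a real implementation would presumably normalize that to the start of the cell.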

Table Mode Settings

The TableMode setting controls which minor mode is activated by default when the point of regard enters a table. BoundReview dictates whether the review commands can navigate out of the current table or not. TableWrap dictates whether navigation left of the left-most column or right of the right-most column causes the POR to wrap or not. See the RDB Settings section for details.

Table Cell Items

A simple table cell contains one and only one item (e.g. a line of text). Navigation to such a cell causes LSR to announce its contents.

Example

Assume the skipping option for the review keys is set to report.

The user navigates from one table cell to another containing a single number. LSR reports the role of table cell, the role of the paragraph in the table cell, and then the first and only item in the table cell: the number. The POR rests at the start of the number in the table cell.

Table cells may contain any kind of content, including, but not limited to, paragraphs, lists, images, links, tables, and combinations of the preceding. Navigation over content in table cells containing more than one item is accomplished using the regular document navigation keys. The minor mode keys do not conflict with or deactivate the document mode keys.

Output

When the point of regard is moved into a table for the first time, LSR speaks an announcement equivalent to read item details. If the POR is on the table container, the announcement includes the table role and the table name. If the POR is on a cell within the table, this announcement potentially includes the table role, the table cell role, cell row header, cell column header, cell offsets, and table size. Navigation within the table produces similar results, minus table size and repeat roles. Speech about navigation within a cell follows the output spec from the Document Major Mode section.

When the BoundReview setting is enabled, LSR also speaks an error message at the start and end of the table. If the point of regard is at the start of the traversal, and the user tries to navigate previous, LSR speaks table start. If the point of regard is at the end of the traversal, and the user tries to navigate next, LSR speaks table end.

When the point of regard is moved into or within a table, LSR sends the text of the item at the current POR to the Braille display. Navigation within a cell produces Braille output according to the Document Major Mode output specification.

When the point of regard is moved into or within a table, LSR moves the magnifier to the focal point of the item at the POR. When the POR is on the table container, the focal point should be the center of the table. Navigation within a cell moves the magnifier according to the Document Major Mode output specification.

Note

Output about static HTML tables should mimic output from interactive tables in other applications as closely as possible. The goal is to make the experience of browsing static tables using passive review (i.e. LSR navigates, reports what is at the POR) consistent with browsing dynamic tables which fire focus and descendant events (i.e. LSR responds to events).

Critical Example

Assume the skipping option for the review keys is set to report, the minor mode default is row/column, and BoundReview is enabled. Consider the following table:

  7:00 PM 7:30 PM 8:00 PM 8:30 PM
ABC Lost Season finale Grey's Anatomy
NBC The Office Date Line
FOX The Simpsons Family Guy|American Dad
HBO The Sopranos

The point of regard is on the table container. The user presses Down. LSR sets the POR to the left-most column of the cell starting with the text Lost. LSR announces the role of the table cell, the first item in the cell (Lost), the row header (ABC), the column header (7:00 PM), and the index in the table (1,1).

The user presses Down again. LSR moves the point of regard to the start of the second line in the same table cell. LSR reads the line (Season Finale). No other announcement is made.

The user presses Down a third time. LSR moves the POR to the start of the first item in the next cell in the same row. LSR announces the role of the table cell, the first item in the cell (Grey's Anatomy), the column header (8:00 PM), and the index in the table (1,3).

The user now presses Ctrl-Down. LSR moves the point of regard one row down. LSR announces the item in this cell (Date Line), the row header (NBC), and the index in the table (2,3).

The user next presses Ctrl-Left. LSR moves the point of regard one column to the left. LSR announces the column header (7:30 PM) and the index (2,2). The content of the cell is not announced again.

The user presses Ctrl-Left again. LSR moves the point of regard one more cell to the left, and announces the text (The Office), the column header (7:00 PM), and the index (2,1).

The user then presses Ctrl-Space. LSR announces the user is now browsing by cells. The user presses Ctrl-Down twice in quick succession. LSR moves the POR first to the cell containing the text The Simpsons and then immediately to the cell containing the text The Sopranos. The speech for the intermediate navigation is clipped. The user hears the text in the final cell (The Sopranos), the row header (HBO), and the index (4, 1).

The user now presses Ctrl-Right. LSR says last cell and leaves the POR where it is.

Note

How LSR will treat header cells, as part of the content or not, is undecided. This section will be extended as that discussion continues.

3.4   Orientation

Orientation queries help the user understand where he or she is, where he or she can go next, and where he or she has been. Some of these commands give the user an awareness of where they are currently browsing in relation to the document as a whole, while others provide a glimpse of the overall structure and content of the document.

3.4.1   Context

Context queries remind the user of where he or she is browsing in and across documents. They range from simple announcements of the current page title and URL to reports of where the point of regard is located in terms of the structure of a Web document.

The first column of the table below lists the queries supported. The second column states the default key bindings which activate the queries with respect to the current POR.

Table: Context queries
Function          Binding       Notes
where am i        Alt-Shift-1   Replaces first part of cycle with additional info
read page title   CapsLock-Y    First in cycle
read page url     CapsLock-Y    Second in cycle

The Where am I? query extends the function of the same name provided by the more basic Perks packaged with LSR. This document specific implementation reports the following information:

  1. All information typically reported by read item details, as in the first segment of the announcement for the basic where am i query.
  2. Index out of total similar accessibles.
     1. For headings, the total should include headings at any level.
     2. For form controls, the report should count form controls of all types within the current form.
     3. For containers, this announcement is omitted as the next one will account for it.
  3. Index out of total similar container accessibles.
     1. For tables, forms, image maps, lists, and so forth, LSR should report the total number of such containers on the page. This announcement should be made if the POR is on a container or in a container.
     2. For accessibles that are not containers or are not contained in interesting containers, this announcement is omitted.
  4. Percentage position. An indication of how far from the start of the document mode traversal the POR is located. The percentage is not based on the percentage the visible page has scrolled, as the page may contain columns which cause the POR to jump from visible bottom to top numerous times during page reading.
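The index and percentage parts of the report can be sketched in Python as follows. This is an illustration under stated assumptions: the document is modeled as a flat list of (role, name) pairs in document mode traversal order, the percentage is computed from the item index rather than from any finer-grained offset, and the container announcement is left out. None of the names below are LSR API.

```python
def where_am_i(items, por):
    """Build a simplified Where am I? report.

    items is a list of (role, name) pairs in traversal order and por is the
    index of the item at the point of regard. Both are hypothetical
    stand-ins for the accessible hierarchy.
    """
    role, name = items[por]
    # Positions of all items sharing the role of the item at the POR
    similar = [i for i, (r, _) in enumerate(items) if r == role]
    index = similar.index(por) + 1
    # Distance from the start of the traversal, not the scroll position
    percent = round(100 * por / max(len(items) - 1, 1))
    return [
        role,
        name,
        f"{role} {index} of {len(similar)} in document",
        f"{percent}% down",
    ]
```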

Example

The user presses CapsLock-Y. LSR announces the title of the current page. The user presses CapsLock-Y again without any intervening query. LSR announces the URL of the current page.
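The cycling behavior of a key bound to several queries can be sketched as a small state machine. The class below is a hypothetical illustration, not LSR code; in particular, wrapping back to the first query after the last one is an assumption.

```python
class QueryCycle:
    """Cycle through queries bound to one key (illustrative only)."""

    def __init__(self, queries):
        self.queries = list(queries)
        self.position = 0

    def press(self):
        """Return the query to run and advance to the next one."""
        query = self.queries[self.position]
        self.position = (self.position + 1) % len(self.queries)
        return query

    def reset(self):
        """Call when any intervening query or command occurs."""
        self.position = 0
```

With the queries read page title and read page url, two consecutive presses return the title query and then the URL query, while a reset in between returns the title query twice.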

Example

The user navigates to the Google home page. The point of regard is on the sign in link. The user presses Alt-Shift-1. LSR announces the following about the current POR:

  1. link
  2. sign in
  3. link 1 of 14 in document
  4. 40% down

LSR continues to announce the ancestor chain as it does for the basic Where am I? query on the next press of Alt-Shift-1.

Example

The user navigates to the Google home page. The point of regard is on the search box. The user presses Alt-Shift-1. LSR announces the following about the current POR:

  1. entry
  2. blank
  3. form control 1 of 3
  4. in container form 1 of 1 in document
  5. 40% down

LSR continues to announce the ancestor chain as it does for the basic Where am I? query on the next press of Alt-Shift-1.

3.4.2   Overview

Overview queries give the user a sense of the structure of an entire Web document. The purpose of such reports is to give the user a feeling for an entire page, a glimpse of sorts, without forcing the user to navigate through the entire document to learn its layout.

The first column of the table below lists the queries supported. The second column states the default key bindings which activate the queries with respect to the current POR.

Table: Overview queries
Function            Binding      Notes
read page summary   CapsLock-Y   Third in cycle

The read page summary query should report the following information when activated:

  1. Total number of headings, forms, tables, visited links, and unvisited links in the document.
  2. Language of the document, if provided on the document frame accessible.

Example

The user is browsing the Google home page. The user presses CapsLock-Y three times in a row. For the third announcement, LSR reports the following:

  1. 0 headings
  2. 1 form
  3. 0 tables
  4. 0 visited links
  5. 11 unvisited links
  6. US English document
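A minimal Python sketch of how such a summary might be assembled is given below. It assumes the page has already been reduced to a flat list of role strings, with visited link and unvisited link treated as distinct roles; both the representation and the function name are hypothetical.

```python
from collections import Counter

SUMMARY_ROLES = ("heading", "form", "table", "visited link", "unvisited link")


def page_summary(roles, language=None):
    """Return the page summary announcements for a list of role strings."""
    counts = Counter(roles)
    report = []
    for role in SUMMARY_ROLES:
        n = counts[role]
        noun = role if n == 1 else role + "s"
        report.append(f"{n} {noun}")
    # The language is reported only if the document frame provides it
    if language:
        report.append(f"{language} document")
    return report
```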

3.4.3   Preview

Preview queries give the user a sense of the content that lies ahead before he or she navigates there. Such information has the potential to reduce unnecessary exploration of uninteresting pages.

The first column of the table below lists the queries supported. The second column states the default key bindings which activate the queries with respect to the current POR.

Table: Preview queries
Function            Binding      Notes
read link preview   CapsLock-K

The read link preview query should report the following information when activated. Note that not all information will be available for all links. For example, only the type of link will be available for links that trigger Javascript.

  1. Type of link such as Javascript, mailto, non-HTTP protocols (e.g. irc://, ftp://), etc.
  2. The name of the target file, if it can be determined from the URL.
  3. Location of the link target compared with the current point of regard reported as one of the following: on the same page, on same site (domain), on a different site (domain). A difference in machine name in the DNS query should not constitute a change of domain.
  4. Size of the target, if it is a non-HTML file and can be determined.

Caution!

The read link preview query is dangerous on some links. If it blindly queries all links, it can accidentally trigger Javascript actions, form submissions, server side state changes, etc. This feature must be overly cautious in the types of links it decides to follow, erring on the side of safety. When in doubt, the function should only announce what it can about the link target without actually opening the target URL.

Example

The point of regard is on a link to a file named download.tar.gz located on an FTP site under the same domain. The user presses CapsLock-K. LSR reports the following information:

  1. FTP link to download.tar.gz
  2. same site
  3. 1 megabyte

Example

The point of regard is on a link to an anchor elsewhere on the current page. The link target only names the anchor, not the entire page URL. The user presses CapsLock-K. LSR announces same page.

Example

The point of regard is on a link to a page under a different domain. The user presses CapsLock-K. LSR announces the following:

  1. HTTP link
  2. different site

Example

The point of regard is on a Javascript link. The user presses CapsLock-K. LSR announces Javascript link.
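The parts of the link preview that can be derived from the URL alone, without opening the target, can be sketched with the Python standard library. The function names are hypothetical, and the site comparison is a deliberately naive assumption: it treats the last two labels of the host name as the domain, which ignores machine names as required but is wrong for suffixes such as co.uk. The target size is not reported because the sketch never contacts the server.

```python
from posixpath import basename
from urllib.parse import urljoin, urlparse


def site_of(host):
    """Naive assumption: the site is the last two labels of the host."""
    return ".".join(host.lower().split(".")[-2:])


def link_preview(page_url, href):
    """Describe a link using only its URL; the target is never opened."""
    page = urlparse(page_url)
    target = urlparse(urljoin(page_url, href))
    if target.scheme in ("javascript", "mailto"):
        label = "Javascript" if target.scheme == "javascript" else "mailto"
        return [f"{label} link"]
    # A target differing only in its fragment is an anchor on this page
    if target._replace(fragment="") == page._replace(fragment=""):
        return ["same page"]
    name = basename(target.path)
    kind = f"{target.scheme.upper()} link"
    report = [f"{kind} to {name}" if name else kind]
    same = site_of(target.hostname or "") == site_of(page.hostname or "")
    report.append("same site" if same else "different site")
    return report
```

For a page at http://www.foobar.com/index.html, the href ftp://ftp.foobar.com/pub/download.tar.gz (a hypothetical address) yields FTP link to download.tar.gz followed by same site.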

3.5   RDB Settings

For reference, all of the LSR settings mentioned in the preceding sections are collated here. The first column gives the programmatic name of the setting. The second gives its type. The third states its default value. The human readable name and description are to be decided by the implementor.

Table: Rich Document Browsing settings
Name          Type                    Default value   Notes
AutoMode      boolean                 True            Enables automatic transitioning from one mode to another
TableMode     choice                  Row/col         Determines the default minor mode for tables
TableWrap     boolean                 False           Enables row end wrapping in tables
BoundReview   boolean                 True            Holds the POR inside a region so that the mode cannot change
TextBlock     unbound integer range   50              Minimum threshold for go to contiguous text navigation

4   Accessible Rich Internet Applications

This section describes how LSR supports Accessible Rich Internet Applications (ARIA), dynamic Web content with additional markup supporting accessibility. The W3C ARIA Roadmap, the ARIA Roles, and the ARIA States and Properties documents define three key concepts. Support for the first, landmark roles, is specified in the previous section under Navigation by Landmark. The remaining two, ARIA widgets and live regions, are treated in the following subsections. With support for these two technologies implemented in LSR, users with visual impairments can enjoy usable access to Web applications.

Again, the default key bindings in this section are specified according to the criteria stated in the LSR User Interface Specification under the section on Keyboard User Interface Considerations.

4.1   Widgets

ARIA widgets are formed from standard, structural (X)HTML elements repurposed for use as interactive controls using Cascading Style Sheets (CSS) and a scripting language (e.g. Javascript). The structural elements are marked with role, state, and property information which the Web browser interprets and maps to the platform accessibility API. In other words, the Web browser takes responsibility for making ARIA widgets appear to be normal GUI widgets through the accessibility API. LSR, in general, need only be prepared to deal with arbitrarily complex widgets embedded in a Web page.

When LSR encounters an ARIA widget, it should enter widget mode, as described in the previous section under Widget Major Mode. In this mode, LSR should respond to widget roles, states, and events as it would for a similar widget on the desktop. Any inconsistencies between the behavior of LSR with a desktop widget and an ARIA widget should be minimized in favor of the design specified by the basic UI spec.

Example

Assume the AutoMode flag is True.

The user presses T to navigate to the next table on the page. The next table has the focusable state. LSR moves the POR to the table container and enters widget mode. The user navigates around the table using the arrow keys. The user then presses CapsLock-Up to move the POR to the table container and then CapsLock-Right to move the POR one element past the table to a static paragraph. LSR enters document mode.

Example

Assume the AutoMode flag is True.

The user presses D to navigate to the next embedded object. The next embed is an interactive tree table control. LSR moves the POR to the tree table container and enters widget mode. The user presses CapsLock-Right to immediately navigate out of the tree table to the link immediately past the control. LSR enters document mode.

Some information about ARIA widget properties may not fit into the strictly defined attributes of the platform accessibility API. This extra information may be exposed via weakly defined accessible object attributes (i.e. string name/value pairs). LSR should use this information in formulating its reports to the user whenever possible.

Note

The Mozilla wiki has pages listing all document, object, and text attributes currently exposed and to be exposed by Firefox via AT-SPI. As an example, two properties of interest are sort and setsize.

Tip

The LSR Task.Tools API may be accounting for some of these properties already. In other cases, existing API methods should be extended to support them. Avoid doing in a Perk what should be done in the scripting API for standard properties.

4.2   Live Regions

Live regions denote sources of meaningful changes outside the point of regard. Such changes may be triggered by user actions (e.g. pressing a button) or real-world events (e.g. rising stock prices, new weather alerts). Live region markup distinguishes these purposeful changes from unimportant event "noise". In another sense, live regions attempt to support the mapping of changes in the periphery of the high-bandwidth, visual display to notifications in lower-bandwidth mediums such as audio and haptics.

The number of potential designs for rendering live region changes is staggering. Consider a user running LSR with magnification and speech browsing a page containing three live regions firing events at varying levels of politeness. Some possibilities for the user experience include:

  • keeping the magnifier zoomer at the focus while announcing live region changes in speech
  • moving the magnifier zoomer to a changed live region when the politeness is rude
  • creating additional zoom regions over live regions when the politeness level is assertive or rude
  • using auditory icons to indicate changes to polite regions and speech to indicate changes in assertive or rude regions
  • speaking assertive live region changes concurrently with other speech

This document focuses on one, basic design for live regions using a single stream of speech. The design described covers all concepts of the live region W3 specification, but not all possible rendering configurations. More advanced designs may follow, either as improvements to the existing implementation or separate extensions in their own right.

4.2.1   Events

LSR determines changes in live regions by monitoring text and children change events at and outside the point of regard. When such an event is received, LSR inspects the source and its ancestor chain for the presence of the live attribute. If such an attribute exists, LSR processes the message as a live region change.
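The ancestor chain inspection can be sketched in Python as a simple upward walk. The Accessible class here is a hypothetical stand-in holding only a parent reference and a dictionary of attributes; it does not reflect the AT-SPI or LSR object model.

```python
class Accessible:
    """Minimal stand-in for an accessible (illustrative only)."""

    def __init__(self, attrs=None, parent=None):
        self.attrs = attrs or {}
        self.parent = parent


def find_attribute(source, name):
    """Return (accessible, value) for the first occurrence of an attribute.

    The search starts at the event source and walks up the ancestor chain.
    (None, None) is returned when no accessible carries the attribute.
    """
    node = source
    while node is not None:
        if name in node.attrs:
            return node, node.attrs[name]
        node = node.parent
    return None, None
```

An event is then treated as a live region change only when find_attribute(source, "live") returns an accessible.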

4.2.2   Relevance

LSR next looks for the first occurrence of the relevant attribute in the chain of ancestor accessibles starting with the one marked as live. The value or absence of this property helps LSR decide whether to continue processing the event as a live region change.

additions
Continue processing if the event denotes the addition of one or more descendant accessibles.
removals
Continue processing if the event denotes the removal of one or more descendant accessibles.
text
Continue processing if the event indicates a change of text in the live region accessible or any of its descendant accessibles.
all
Continue processing for any child addition, child removal, or text event within the live region or its descendants.

If more than one of these values is present for the relevant attribute, LSR applies the logical OR operation. If the attribute is missing entirely, LSR considers the value to be text additions.
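The relevance test reduces to set membership, as the following Python sketch shows. It assumes that the attribute value is a space-separated list of tokens, that events have already been classified as additions, removals, or text, and that the default of text additions means the two tokens text and additions. The function name is hypothetical.

```python
# Assumption: "text additions" is read as the two tokens below.
DEFAULT_RELEVANT = "additions text"


def is_relevant(relevant, event_kind):
    """Decide whether to keep processing an event as a live region change.

    relevant is the attribute value, or None when the attribute is missing.
    event_kind is one of "additions", "removals", or "text".
    """
    tokens = set((relevant or DEFAULT_RELEVANT).split())
    # Multiple tokens combine with logical OR; "all" matches everything
    return "all" in tokens or event_kind in tokens
```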

4.2.3   Politeness

After establishing relevance, LSR inspects the value of the live attribute to determine when an announcement should be made.

off
Make no report unless the special controls condition described in the Relations section is met.
polite
Queue the message for later announcement when all active and queued speech has ceased. Once LSR starts outputting a polite announcement, any other report, except another polite announcement, may interrupt it. If interrupted, a polite announcement is never repeated.
assertive
Clear the queue of all polite messages. Stop active speech if it is from a polite region. Queue the new message for announcement when all active speech has ceased. Once LSR starts outputting an assertive announcement, only rude announcements and responses to user input may interrupt it. All other live region announcements are queued. If interrupted, an assertive announcement is never queued again.
rude
Clear the queue of all assertive and polite messages. Stop all active speech. Start speaking the new message immediately. Once LSR starts outputting a rude announcement, only user input may interrupt it. All other live region announcements are queued behind it. If interrupted, a rude report is pushed to the front of the queue and announced again following a timed backoff procedure. The queue is not emptied when re-queuing.
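The queuing rules above can be modeled with a small Python sketch. It is a simplification under several assumptions: a single global queue shared by all regions, one active message at a time, and a new rude message queued behind an already active rude message rather than interrupting it. Interruption by user input and the timed backoff for rude messages are not modeled, and none of the names are LSR API.

```python
RANK = {"polite": 0, "assertive": 1, "rude": 2}


class LiveQueue:
    """Single-stream model of live region announcements (illustrative)."""

    def __init__(self):
        self.active = None  # (level, message) being spoken, or None
        self.queue = []     # pending (level, message) pairs

    def post(self, level, message):
        if level not in RANK:  # e.g. "off": make no report
            return
        # Clear queued messages that are more polite than the new one
        self.queue = [m for m in self.queue if RANK[m[0]] >= RANK[level]]
        self.queue.append((level, message))
        # Stop active speech only if it is more polite than the new message
        if self.active and RANK[self.active[0]] < RANK[level]:
            self.active = None
        if self.active is None:
            self.speech_finished()

    def speech_finished(self):
        """Start the next queued message, if any."""
        self.active = self.queue.pop(0) if self.queue else None
```

Stopped messages are simply dropped, which matches the rule that interrupted polite and assertive announcements are never repeated.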

To do

  • How do we handle multiple regions? Do interruptions apply globally? From the same region only? Does source affect queuing? Can we rely on source as a good indicator of a real widget/region in the general case (e.g. individual chat messages marked as live)?
  • How to override to avoid overly rude pages?
  • Settings?

4.2.4   Atomicity

Finally, LSR checks for the first occurrence of the atomic attribute on the chain of ancestor accessibles to determine what should be announced.

true
All content under the live region accessible is included in the report.
false
Only the changed content is included in the report.

If the atomic attribute is not in the ancestor chain, LSR considers the value to be false.

4.2.5   States

If the busy state is present on the source of the event or one of its ancestors, the announcement of the change is delayed until the state is cleared. The announcement remains at the front of the queue of announcements, unless cleared by a less polite announcement according to the rules indicated in the Politeness section.

Note

In practice, when the busy state is not used, LSR may need to delay for some amount of time before trying to announce addition and removal changes to atomic regions. If a delay is not introduced, the accessible hierarchy might incorrectly represent the soon-to-be stable state of the live region (i.e. children are still being added or removed).

4.2.6   Relations

If the labelled by relation is present on a live region accessible, the label should be prefixed to any announcement. The label name assists users in distinguishing one live region from another on a page having multiple regions. It also gives context to the announcement.

By default, live regions with live=off are not automatically announced. This behavior changes, however, when the point of regard is on an accessible specifying a controls relation to a live region that is off. If the user performs an action at such a POR and the controlled live region fires a change event as a result, that event is treated as a polite change. The relevant and atomic properties are processed as in the normal case. The goal in this situation is to inform the user of the effect of his or her action outside of the current POR.
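The two relation rules can be expressed as a pair of small Python functions. The names and parameters are hypothetical; the sketch assumes the caller has already determined whether the accessible at the POR controls the region and whether the change followed a user action there.

```python
def effective_politeness(live, por_controls_region=False, user_acted=False):
    """Return the politeness level to apply to a change event.

    A region that is off is promoted to polite only when the accessible at
    the POR controls it and the change follows a user action at that POR.
    """
    if live == "off" and por_controls_region and user_acted:
        return "polite"
    return live


def with_label(label, message):
    """Prefix the labelled by name, if any, to an announcement."""
    return f"{label} {message}" if label else message
```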

Other context information is announced as appropriate for the role of the live region. Table row and column headers and container names are of particular interest when the change occurs in a new table or container context.

4.3   Web Application Scripting

The registrar component in LSR manages user interface elements and user profiles. The LSR core loads Perks associated with the active profile when an application is seen for the first time. Some Perks may load for all applications while others may load for a particular application. This core feature pairing Perks with applications is the crux of LSR scripting. It is limited, however, to applications running on the desktop denoted by accessibles supporting the Application interface. Web applications do not fall into this category as they are merely a collection of accessibles within the Web browser document frame.

Nevertheless, the ability to script large, complex Web applications is desirable. It is possible to provide this feature within the Web browser Perk itself, since it can load and unload additional Perks at will. There are three requirements:

  1. A method for uniquely identifying Web applications
  2. A mechanism for associating Perks with Web applications
  3. A mechanism for loading and unloading Web app Perks

4.3.1   Application Identifiers

Two methods for identifying applications exist. The preferred method relies on the presence of a container accessible with role application and a domain unique XML ID mapped to the id attribute on that accessible. This approach allows for more than one Web application within a page.

Example

A company hosts a webmail application at http://www.foobar.com/webmail. They identify this application using the following XHTML template.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="http://www.w3.org/StyleSheets/TR/base"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"
       xmlns:role="http://www.w3.org/2005/01/wai-rdf/GUIRoleTaxonomy#"
       xmlns:aaa="http://www.w3.org/2005/07/aaa">
  <head>
  <title>DHTML Slider</title>
  </head>
  <body>
    <div role="role:application" id="foobar.com/foomail">
    ...
    </div>
  </body>
</html>

A Perk for LSR is paired with this application using the string foobar.com/foomail.

The fallback method relies on the domain plus path components of the URL on the current Web document to identify all interactive content on a page as a single Web application. This method requires all content on a page to be scripted as one, large application.

Example

An organization hosts a news reader application at http://www.bazbot.com/apps/news/app.cgi. When a user visits the page, parameters to the CGI application are specified as part of the URL. For instance, http://www.bazbot.com/apps/news/app.cgi?login=joe&refresh=4. No application role or ID is available on the page generated by the CGI script. An LSR Perk is paired with this application using the string www.bazbot.com/apps/news/app.cgi.
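Deriving the fallback identifier from a URL amounts to keeping the domain and path and discarding the scheme, query, and fragment. A minimal Python sketch using the standard library follows; the function name is hypothetical.

```python
from urllib.parse import urlparse


def fallback_app_id(url):
    """Return the domain plus path components of a URL."""
    parts = urlparse(url)
    # netloc also carries any port or user information present in the URL
    return parts.netloc + parts.path
```

Applied to http://www.bazbot.com/apps/news/app.cgi?login=joe&refresh=4, the function returns www.bazbot.com/apps/news/app.cgi.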

Important

There is no recommendation for the formatting of the value in the id attribute field today. One possibility is to use the website domain followed by a unique identifier for the application within that domain.

This section will be updated with the final recommendation from the W3C.

4.3.2   Associating Web Perks

Perks for Web applications are managed by the registrar component just like any other Perk. Such Perks are paired with their Web application identifiers when they are associated with a user profile.

Example

The user wants to associate a Perk for GMail with the profile named user. The domain unique ID for GMail happens to be google.com/gmail. The user enters the following at the console.

lsr -a GMailPerk.py -p user --app=google.com/gmail

4.3.3   Loading Web Perks

LSR loads Perks for all Web applications in a document frame when the page finishes loading. The main browser Perk first contacts the registrar to load Perks paired with the domain plus path components of the current URL. The Web browser Perk then attempts to collect all accessibles with role application on the page.
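The lookup order can be sketched in Python as follows. The registrar is modeled as a plain dictionary from application identifier to Perk names, which is an assumption made for illustration; the real registrar interface is not shown in this document, and the Perk names in the usage note are invented.

```python
def perks_to_load(registry, url_id, app_ids):
    """Return the Perk names to load for a page.

    Perks keyed to the domain plus path of the URL come first, followed by
    Perks keyed to each application accessible found on the page.
    Identifiers with no paired Perk are skipped.
    """
    names = list(registry.get(url_id, []))
    for app_id in app_ids:
        names.extend(registry.get(app_id, []))
    return names
```

With a registry pairing www.bazbot.com/apps/news/app.cgi with a hypothetical NewsPerk and foobar.com/foomail with a hypothetical FooMailPerk, a page matching both identifiers loads NewsPerk and then FooMailPerk.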

Perks keyed to the URL of a page have the ability to handle all events coming from their respective document frame. Their named tasks are available at all times when the point of regard is in the document frame.

Perks keyed to particular application accessibles within a document frame have the ability to handle all events coming from the subtree of that application. Their named tasks are available only when the point of regard is in that subtree.

The semantics of the event layers tier and background are slightly different for Web application Perks than for standard desktop Perks. The tier layer contains events coming from unfocused accessibles within a given Web application. Both the Web browser window and the page tab containing the application must be in the foreground for the events to be placed on the tier layer.

The background layer contains events coming from accessibles within a given Web application. Events are placed on the background layer if either the Web browser window or the page tab containing the application is not in the foreground.
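The layer rules for Web application Perks can be summarized in a short Python sketch. The function and its boolean parameters are hypothetical. The focus result for events from the focused accessible is an assumption, since this section only defines the tier and background layers.

```python
def event_layer(source_focused, window_foreground, tab_foreground):
    """Classify an event from a Web application accessible.

    Returns "background" unless both the browser window and the page tab
    are in the foreground. Otherwise returns "tier" for unfocused sources
    and, by assumption, "focus" for the focused source.
    """
    if not (window_foreground and tab_foreground):
        return "background"
    return "focus" if source_focused else "tier"
```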

All Perks for applications on a given page are unloaded when the content in the document frame undergoes a complete refresh or the document frame is destroyed.

5   Ideas

This section is an unordered list of ideas which may or may not make it into the formal specification. They are here for brainstorming purposes.

6   Revision History

2007-05-22 Peter Parente <pparent@us.ibm.com>

2007-05-14 Peter Parente <pparent@us.ibm.com>

2007-05-11 Peter Parente <pparent@us.ibm.com>

2007-05-09 Peter Parente <pparent@us.ibm.com>

2007-05-09 Peter Parente <pparent@us.ibm.com>

2007-05-07 Peter Parente <pparent@us.ibm.com>

2007-05-04 Peter Parente <pparent@us.ibm.com>

2007-05-02 Peter Parente <pparent@us.ibm.com>

2007-05-01 Peter Parente <pparent@us.ibm.com>

2007-04-30 Peter Parente <pparent@us.ibm.com>

2007-04-30 Peter Parente <pparent@us.ibm.com>

2007-04-30 Peter Parente <pparent@us.ibm.com>

2007-04-26 Peter Parente <pparent@us.ibm.com>

2007-04-25 Peter Parente <pparent@us.ibm.com>

2007-04-24 Peter Parente <pparent@us.ibm.com>

2007-04-23 Peter Parente <pparent@us.ibm.com>

2007-04-19 Peter Parente <pparent@us.ibm.com>

2007-04-19 Peter Parente <pparent@us.ibm.com>

2007-04-19 Peter Parente <pparent@us.ibm.com>

2007-04-18 Peter Parente <pparent@us.ibm.com>

2007-04-13 Peter Parente <pparent@us.ibm.com>

2007-04-13 Peter Parente <pparent@us.ibm.com>

2007-04-12 Peter Parente <pparent@us.ibm.com>

2007-04-11 Peter Parente <pparent@us.ibm.com>

2007-04-07 Peter Parente <pparent@us.ibm.com>