MIT CSAIL Research Abstracts

Research Abstracts - 2007

Web Page Readability Enhancement

Chen-Hsiang (Jones) Yu & Robert C. Miller

Problem Statement

The Web has become an important medium for delivering information, but users differ in how they prefer to consume it. We have observed that users looking for information on the Web are hampered by unnecessary distractions and obstacles, such as advertisements, unsuitable presentation, and poorly organized layout. Our research addresses this problem by providing an automatic way for end users to customize Web pages on the client side.

Related Work

Providing better personalization on the Web has been a research topic since the 1990s, and many Web sites already offer some form of personalization, such as the Google personalized homepage, Google Reader, and personalized news sites. Some tools let end users adjust Web pages to improve readability, such as browser commands that change the font size and browser extensions that remove or recolor sections of a page [1]. However, these tools require manual actions on the content, such as pressing keys. Furthermore, the framed object may not be the area the user is actually interested in. Some studies investigate how to segment Web pages at the semantic level, but none of them focuses on readability enhancement at the user interface level [2,3,4].

Our Approach

We try to enhance Web page readability in two ways. The first is an automatic scaling mode that reduces the user's burden when reading content. The second is to eliminate distractions on the page. For example, a user browsing CNN news may want interesting content enlarged and unnecessary content or advertisements shrunk or dimmed. In our preliminary study, we designed scripts that run under the Chickenfoot extension and combine automatic scaling, contrast mapping, and other techniques to customize the Web page on the client side (Fig. 1). As Fig. 1 shows, when the user reads a CNN news article and hovers the mouse over the page, our script detects the group of content the user is interested in and enlarges it. All surrounding content is then dimmed by making it partially transparent.

Fig. 1. Running a script under the Chickenfoot extension to customize a CNN news page.
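
The hover behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Chickenfoot script itself: the page is modeled as a simplified tree instead of a real DOM, and the `Node` class, the 200-character grouping threshold, the 1.5 scale factor, and the 0.3 opacity are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursive equality on the tree
class Node:
    """Simplified stand-in for a DOM element (hypothetical model)."""
    name: str                       # assumed unique within the page
    text_len: int = 0               # characters of text directly in this node
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def subtree(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants."""
    yield node
    for child in node.children:
        yield from subtree(child)


def total_text(node: Node) -> int:
    """Total text length in the node's subtree."""
    return sum(n.text_len for n in subtree(node))


def find_group(hovered: Node, min_chars: int = 200) -> Node:
    """Walk up from the hovered node until the subtree holds enough text
    to count as a content group (assumed heuristic)."""
    node = hovered
    while node.parent is not None and total_text(node) < min_chars:
        node = node.parent
    return node


def style_page(root: Node, hovered: Node,
               scale: float = 1.5, dim_opacity: float = 0.3) -> Dict[str, dict]:
    """Return a style per node: enlarge the hovered group, dim everything else."""
    group = find_group(hovered)
    inside = {id(n) for n in subtree(group)}
    # Ancestors stay opaque, because dimming a container would also dim the group.
    ancestors = set()
    node = group.parent
    while node is not None:
        ancestors.add(id(node))
        node = node.parent

    styles: Dict[str, dict] = {}
    for n in subtree(root):
        if n is group:
            styles[n.name] = {"scale": scale, "opacity": 1.0}
        elif id(n) in inside or id(n) in ancestors:
            styles[n.name] = {"scale": 1.0, "opacity": 1.0}
        else:
            styles[n.name] = {"scale": 1.0, "opacity": dim_opacity}
    return styles


# Usage: a toy news page with a header, an article of two paragraphs, and an ad.
body = Node("body")
header = body.add(Node("header", text_len=30))
article = body.add(Node("article"))
p1 = article.add(Node("p1", text_len=150))
p2 = article.add(Node("p2", text_len=180))
ad = body.add(Node("ad", text_len=20))

styles = style_page(body, hovered=p1)
```

In this toy page, hovering over the first paragraph selects the whole article as the group, because the paragraph alone falls below the assumed text threshold. A real script would apply the resulting values through CSS and would need an extra stop condition so that hovering over a short element does not select the entire page.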

We are also developing a new extension, WebPer, that integrates our ideas on the client side to provide better personal service for all Web pages, including ad blocking, content adjustment, and personal layout. For example, to reduce distractions, the user can get a customized page with smaller images and advertisements by pressing button 1, and restore the original images by pressing button 2 (Fig. 2). This feature has two processing phases: image classification and image transformation. We divide the images on a Web page into four categories: interface images, branding images, content images, and advertisements. We are designing algorithms to classify images and to apply better transformations to them.

Fig. 2. Transformation on a Web page to eliminate distractions.
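
The two phases, classification followed by transformation, can be illustrated with a small sketch. The classification rules here are placeholder heuristics assumed for the example (a few common banner sizes, a 32-pixel icon limit, and a "logo" filename check), not the algorithms under design for WebPer, and the `Image` class, shrink divisors, and example URLs are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# A few common banner dimensions, used only as an illustrative ad signal.
AD_SIZES = {(728, 90), (300, 250), (160, 600), (468, 60)}

# How much each category is shrunk when the user presses "button 1" (assumed values).
SHRINK_DIVISOR = {
    "advertisement": 4,
    "content": 2,
    "branding": 1,   # left unchanged
    "interface": 1,  # left unchanged
}


@dataclass
class Image:
    """Simplified stand-in for an <img> element (hypothetical model)."""
    src: str
    width: int
    height: int


def classify(img: Image) -> str:
    """Phase 1: assign one of the four categories using placeholder rules."""
    if (img.width, img.height) in AD_SIZES:
        return "advertisement"
    if img.width <= 32 and img.height <= 32:
        return "interface"          # icons, bullets, spacers
    if "logo" in img.src.lower():
        return "branding"
    return "content"


def shrink(img: Image) -> Tuple[int, int]:
    """Phase 2, button 1: return the reduced display size for the image."""
    divisor = SHRINK_DIVISOR[classify(img)]
    return max(1, img.width // divisor), max(1, img.height // divisor)


def restore(img: Image) -> Tuple[int, int]:
    """Phase 2, button 2: return the original display size."""
    return img.width, img.height


# Usage with made-up images.
banner = Image("http://ads.example.com/banner.gif", 728, 90)
photo = Image("http://news.example.com/photos/story.jpg", 600, 400)
bullet = Image("http://news.example.com/img/bullet.gif", 10, 10)
logo = Image("http://news.example.com/img/site-logo.png", 200, 60)

for img in (banner, photo, bullet, logo):
    print(classify(img), shrink(img), restore(img))
```

Keeping classification separate from transformation means the per-category treatment can change without touching the classifier, which matches the two-phase split described above.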

Acknowledgements

This work was supported by Quanta Computer as part of the TParty project. Any opinions, findings, conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the sponsor.

References

[1] Aardvark extension: http://karmatics.com/aardvark/

[2] Cai, D., Yu, S., Wen, J.-R., and Ma, W.-Y. Extracting Content Structure for Web Pages based on Visual Representation. In The Fifth Asia Pacific Web Conference (APWeb2003), pp. 406-417, 2003.

[3] Chen, J., Zhou, B., Shi, J., Zhang, H.J., and Qiu, F. Function-Based Object Model Towards Website Adaptation. In Proceedings of the 10th International World Wide Web Conference (WWW 2001), pp. 587-596, 2001.

[4] Song, R.H., Liu, H.F., Wen, J.R., and Ma, W.Y. Learning Block Importance Models for Web Pages. In Proceedings of the 13th International World Wide Web Conference (WWW 2004), pp. 203-211, 2004.

Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu