- Using Perl with HAL -

Using Perl - Page 1

The Goal

HAL already integrates limited access to Internet data. HAL can tell you the weather forecast, sports scores, TV listings, and traffic conditions.

However, these items all come from proprietary Internet sources, and HAL users cannot change them.

Users have long asked for a way to pull other web data into a form HAL can use. This document explains how to do that, in a way anyone can follow.

Speaking Text

HAL Version 2.0 offers a new feature that HAL users have long requested. It is a simple feature, but one that was missing until now: the ability to invoke a plain-text file and have HAL's Text-to-Speech engine read the file aloud.

Until Version 2.0, the only way to have HAL read text was to program that text as an action item for either a rule or a macro.

This opens the door to having HAL speak all sorts of data. Once we are able to do this, we will discuss ways to make that data available to other functions.
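
As a rough illustration of where such a file might come from, here is a minimal Perl sketch that writes a sentence into a plain-text file. The file name hal_speech.txt and the helper write_speech_file are hypothetical placeholders, not anything HAL requires, and how HAL is pointed at the file is not shown here.

```perl
use strict;
use warnings;

# Write the text HAL should speak into a plain-text file.
# The path is a placeholder chosen for this sketch.
sub write_speech_file {
    my ($path, $text) = @_;
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    print $fh $text, "\n";
    close($fh) or die "Cannot close $path: $!";
    return $path;
}

write_speech_file('hal_speech.txt', 'Good morning. You have three new messages.');
```

Any program that can produce a file like this one can, in principle, supply HAL with something to say.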

Running a Program

HAL has long had the ability to invoke an external program. Version 2 adds the ability to pass command-line parameters to that program as well. This makes it a little easier to invoke specific programs, although even without it we could still perform the functions we need through DOS batch files.

What’s a Screen Scraper?

Web pages are really just text, with some graphics to make them pretty. The text part is formatted in a structure called HTML. HTML essentially litters the text with tokens like <B> and </B> and so on, which tell a browser how to display it. You can see what this looks like by going to a simple text web page (e.g. http://www.deco-group-partners.com) and selecting View|Source from your browser’s menu.

A program does not have to be a browser to read web pages and make sense of them. Any tool that can manipulate text in a structured manner can do so. A screen scraper is merely a program that reads an HTML page, searches the text, pulls out the data we want, and ignores the rest.
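
The idea can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: the page content in $html is made up, the helper names bold_items and strip_tags are hypothetical, and a real scraper would first download the page rather than use a hard-coded string. Plain regular expressions like these are fine for a simple page but are easily confused by more complicated HTML.

```perl
use strict;
use warnings;

# Hypothetical page content; a real scraper would download this text first.
my $html = '<html><body><b>Forecast:</b> Sunny, high of <b>72</b> degrees.</body></html>';

# Return the text found inside each <b>...</b> pair (case-insensitive).
sub bold_items {
    my ($page) = @_;
    my @items = $page =~ m{<b>(.*?)</b>}gis;
    return @items;
}

# Remove every tag, leaving only the text a person would see.
sub strip_tags {
    my ($page) = @_;
    (my $text = $page) =~ s/<[^>]*>//g;
    return $text;
}

print join(', ', bold_items($html)), "\n";   # prints "Forecast:, 72"
print strip_tags($html), "\n";               # prints "Forecast: Sunny, high of 72 degrees."
```

The first helper keeps only the data we asked for; the second throws away the markup and keeps everything else.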

Tools to Operate

With the new capabilities in HAL, it quickly becomes obvious that we can have HAL find and speak any data on the web. We just need a few more tools to bring it all together.

The first tool we need is something to manipulate the text and HTML tags on the web pages we are interested in. Most computer languages these days have extensive operators for processing text. Almost any language can meet this requirement.

The second tool we need is a screen scraper. Ideally, our screen scraper can grab a web page, stuff it into an array or other data structure, and pass it over to our program to parse out the data we are interested in.

We can manipulate bare HTML data with any language, but a tool that deals more directly with HTML tags is highly desirable, because it saves us from reinventing the wheel.

Sidebar

Downloading Perl

Perl sources and documentation are available from CPAN. For the Windows version, you can download the package at http://www.activestate.com.

Select Languages | Downloads, then select ActivePerl | download. Select the MSI file for Windows, ActivePerl-5.6.1.635-MSWin32-x86.msi. It is currently an 8.6 MB download. Save the file in a safe spot, then double-click it to install.
