Pocket Reader OCR pen
Harriet Bazley on the latest gadget from CJE Micros
The ingenuity of the Siemens Pocket Reader would appeal to James Bond himself, not to speak of the gadget-minded among the rest of us! This is a highly miniaturised scanner little larger than the average fluorescent highlighter pen, specifically designed to recognise and store ordinary printed text for transfer to a personal computer. Back in the 1960s, when computers first learned to read the printed word, OCR (Optical Character Recognition) was something that only happened to cheques, and machine-readable text required the use of a font so highly stylised as to be all but indecipherable to the merely human eye.
Computer technology has advanced by leaps and bounds since that era, and OCR programs have kept pace; Sleuth, the major RISC OS contender, first came out in 1993 in the days when its PC equivalents cost over £500, and has been under development ever since! But the basic drawback of all such software remains the same: the requirement to supply the input text in the form of a large scanned sprite.
An A4-size flat-bed scanner has never been essential; but it certainly helps, as does having large amounts of spare disc space and RAM in your computer to deal with the resulting images. The idea behind the Pocket Reader, however, is to skip over this stage of the process altogether, transferring letters on the page directly into ASCII codes for the computer. After all, the desired input is text, and the desired output is text, so why waste space storing high-resolution pictures in order to convert the one to the other?
The Pocket Reader itself is a wedge-shaped, tapering unit approximately the size of a small plastic ruler: six inches long by 1.5 inches wide, or 160mm × 40mm. It is, however, considerably thicker and heavier, varying from an inch (20mm) in thickness at the back, where the battery compartment is located, down to approximately half an inch at the thinnest point. The specification states that it weighs 110 grams including batteries (2 AAA size), about the same as a Walkman or a box of five floppy discs. In other words, it's very light compared to almost any other piece of electronic equipment, but it is just a little too heavy to carry comfortably in a breast pocket.
The scanner head is housed in the tapered end, along with a rubber roller that is used to detect the start and speed of the scan. Scanning takes place at 400dpi, and the data is read into flash EPROM, thus ensuring that it will be retained even while changing the batteries. The exact size of the pen's memory is nowhere stated (possibly because in the era of 128Mb RAM requirements it might sound very low?) but the oft-quoted figure is 20 pages of A4 text. Frankly, in my experimentations I have never come anywhere near even half-filling the pen's memory; I can't see this limit as being a problem for anyone unless they were planning to spend, say, a week taking notes in a university library with no means of downloading them at the end of the day. Apart from anything else, it would take rather a long time to scan in 20 A4 pages line by line.
Estimated battery life is at least 20 hours. Every time you switch the Pocket Reader on, exit the configuration menu, or wipe the pen's memory remotely via the serial link, a display is flashed up showing the remaining power and free memory. The power level is shown as a sequence of three bars, and so far I haven't even used up enough power to decrease the rating below full. The pen times out automatically after two minutes, which seems about right. The only times that I've discovered that it has gone off unexpectedly, I really had put it down and then forgotten I was using it. In any case, since the power has no effect on the data stored, you can simply switch it on again and continue from where you left off.
The display is a simple LCD screen that can hold 24 characters, each consisting of a grid of 5 × 8 pixels: higher resolution than a pocket calculator, but lower than the RISC OS system font. On the whole it is perfectly readable, even where accented characters are concerned.
Operating the Pocket Reader
The technique required to operate the pen, while explained clearly in the manual, I found somewhat non-intuitive. Because the leading edge of the Pocket Reader is tapered, it is tempting to hold it at a comfortable angle and rest one's hand on the page, as one would with a pen. However, this technique is almost fatal to producing reliable results.... It is very important to hold the scanner vertically to the paper; I found that it helped to try to think of the device as a highlighter rather than a pen. Those who are accustomed to writing with a Biro rather than a fountain-pen may find this mental leap less essential!
The other important factor proved to be a steady hand. If the scanner is allowed to waver up and down as it travels over the line of text, the results will be mediocre at best, and the most likely outcome is for the "line lost" message to flash up, indicating that the OCR routines could make neither head nor tail of the shapes detected. This is not a serious problem, since the line can simply be rescanned, provided of course that you notice in time. There is no audible warning, and it is a good idea at least to glance at the display after scanning each line in order to check that all is well.
I was impressed by the fact that it appears to cope seamlessly with both serif and sans-serif typefaces, including oblique and emboldened text; even true italics (a separate, almost handwritten serif style, as opposed to the simple slanted text that is oblique) can be read fairly well, although I discovered accidentally that angling the pen slightly in the direction of the slope of the text made an enormous difference here! Unfortunately, it copes very badly indeed with monospaced type, such as printed program listings or the output from StrongED.
It does also have a perplexing habit (in any font) of falsely interpreting letters as capitals when they are not; when, indeed, to the human eye the two forms bear little resemblance to each other. I could understand it if the OCR were confusing "o" with "O", say; but to confound "q" with "Q" or "i" with "I" is very odd.
The scan head is quite wide, in order to enable it to read text of up to 16pt size. Initially I attempted to align the bottom edge of the cutaway area in the tip of the pen (which marks the position of the actual scan head itself) with the bottom of the text I was scanning; however, this proved to give poor text recognition.
There is a fine ridge in the plastic marking the centre of this cutaway area (see arrow on image above) and this would appear to be intended as a guide. The best results are obtained by using this ridge to keep the pen aligned with the centre of the strip of text being scanned. Using this method, even when the text is small enough for two or three lines to fit within the scan area, the central line of text will be isolated and read accurately. However, this guide ridge is nowhere mentioned in the manual.
On the other hand, a clear plastic ruler is supplied, and I assumed that this item was intended as some kind of jig over which the OCR pen could be fitted in order to make it easier to scan in a straight line. So far as I can tell, though, it is simply a ruler; and in practice, given the rounded edges of the pen and the need to align the text with the centre and not the edge of the scanner, I found it almost totally useless.
The Pocket Reader has five buttons along its lower edge.
The right-hand button is the power switch. This is very well designed; it has to be held in for at least a second before the pen is activated, and even then the scanner will not operate until the rubber roller above the scan head is depressed, making it almost impossible to switch the pen on by accident. A single touch on this button is sufficient to switch it off again. Successful power-up is indicated by a reassuring flash of the red scanning light. The LCD window briefly displays the proportion of free memory and battery power remaining, and then switches into its normal display showing the current contents of the pen's memory, with the cursor initially placed at the end of the text.
The central rocker switch controls the cursor, as indicated by the two arrows. This allows you to scroll backwards through the text held in the pen, although newly-scanned data is always added at the end of the current text rather than at the cursor position. This is perhaps the weakest point of the Pocket Reader. Only a few characters can fit on the screen, scrolling is slow, and yet due to the LCD technology used it is almost impossible to read the data while the display is moving; attempting to find the start of a given section is a stop-start process involving holding the button down for long periods and then trying to work out where you have got to.
The Pocket Reader remembers each individual scanned section as a separate line (and by default downloads them as such), but it displays these soft line endings in the LCD screen as an ordinary space, and it does not distinguish between them when scrolling. The Return button on the far left can be used to insert a hard carriage return to mark the end of a paragraph. These do show up on the display when scrolling, and double-clicking on the cursor switch will take you to the beginning of the next paragraph. The Return button is conveniently positioned to be operated by the thumb of the hand in which one is holding the pen without needing to shift one's grip, a feature which I appreciate. In my experience, however, for the double-click to work the two clicks have to be extremely close together; generally speaking I end up advancing the cursor by several characters before I manage to register a successful double-click.
The × button on the right controls the Delete function, which erases all the characters between the current position of the cursor and the end of the text. The main use of this function is to rescan the last line of text entered in an attempt to improve the level of character recognition; unfortunately there is no quick or easy way to do this at the moment. You have to scroll back manually to the start of the line, which has to be identified visually by comparison with the printed page; unless of course it happens to be the first line of the paragraph, in which case you can double-click on the cursor.... Given the combination of slow, unclear scrolling and the small number of characters visible in the LCD display (like the miniaturisation of controls, an inherent problem for any pocket-sized electronic device), I would have simply altered the default action of this button to be "delete last line of text". Multiple presses would still allow the deletion of larger blocks, almost certainly more quickly than the current, more flexible, interface; and although the current interface makes it theoretically possible to delete half a line and rescan it, in practice it is very hard to position the pen to an accuracy of better than two or three characters. In any case, the new half-line would be registered as a separate scan, and displayed split as such when downloaded to the computer. At the moment it is arguably faster to scan problem lines twice, and then try to remember to delete the less accurate of the two later on the computer, than it is to use the delete function of the pen.
Finally, the F button on the left controls the Pocket Reader's configured language; its CMOS RAM settings, if you like. Pressing this button briefly will indicate the language it is currently configured to scan. This controls the character set against which the scanned patterns are checked. For example, if you configure the pen to scan in English, any accented characters in the text will tend to be interpreted as poor-quality versions of totally different characters which just happen to be more or less that shape: "Comte de la Fère" is rendered as "Comte de la FJre". Conversely, configuring to a foreign language when not actually scanning a text in that language will cause poor-quality characters to be interpreted falsely as accents: when configured for German, "town bridge" came out as "tüwn bridge". It is possible to scan single sections in the middle of a long document in a different language from the rest, if required, and so far as I could judge the level of recognition of the special characters for each language supported (English, German, French, Italian and Spanish) is very high. The pen appears to use the standard Latin-1 alphabet to represent high-ASCII characters, so there is no problem with transfer to RISC OS.
Holding the F button down for three seconds or more will bring up the configuration menu. The first item here allows you to alter the language being scanned for, as discussed above; this makes it quick and easy to alter this option. The second item allows you to change the language used for the text interface of the pen itself; for example, changing this to "Français" will alter the labels on the status screen from "text" and "batt" to "texte" and "pile" and the menu item from "menu language" to "langue du menu". Obviously one wouldn't need to change this very often.... The final option is to erase the entire contents of the pen's memory. It takes several clicks to navigate down here, and you are asked to confirm the operation before any data is actually deleted; so the chances of doing this by mistake are (fortunately) practically nil.
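To illustrate the Latin-1 point, here is a minimal Python sketch, not part of the supplied software. It assumes the downloaded text arrives as raw bytes with one byte per character, which is what a Latin-1 representation implies; the function name is hypothetical.

```python
def decode_pen_text(raw: bytes) -> str:
    """Decode bytes as ISO 8859-1 (Latin-1): each byte maps to one character."""
    return raw.decode("latin-1")


# In Latin-1 the byte 0xE8 is a lowercase e with a grave accent.
sample = b"Comte de la F\xe8re"
print(decode_pen_text(sample))
```

Because Latin-1 assigns a character to every byte value, this decoding can never fail; a wrong language setting on the pen shows up as wrong letters, not as a transfer error.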
The RISC OS software
The Pocket Reader is supplied by the manufacturer with a disc of Windows software. However, when purchased from a RISC OS dealer, a second floppy disc is supplied, containing all the software needed for use with our computers: the main PReader application, a copy of the SerialDev application containing routines to read input from the serial port, in case you do not already have this resource (I didn't), and a third program, PRInput, the function of which is nowhere explained (of which more later).
When you first run the program, clicking on the program icon (which, since it represents the interface to a device, loads to the left of the iconbar according to RISC OS convention) attempts to connect to the pen, which must be switched on. (The latter point may sound obvious; but since the pen automatically powers down when not in use, even when it is already plugged in one usually has to remember to switch it back on.)
Once connection is successfully made, the icon is no longer greyed-out and the main window will open. This window displays the contents of the pen's memory (or, in this case, "No text captured yet") and a toolbar to the left which allows you to manipulate the contents of that window.
A click on the top icon on the toolbar will start the download, and the text beneath the icon will change to read "Fetching". It is worth noting that while the connection is active none of the controls on the pen itself will function: you cannot scroll or delete text, or even bring up the menu. However, all these functions can be controlled from the computer end via the PReader software.
The main window and toolbar
By default, the text is displayed line by line as it is received; this gives you an idea of its progress. Once it has finished downloading, the pane at the bottom of the window shows data on the total size of the text in lines and characters, and a confidence rating that reflects the quality of the scan as determined by the Pocket Reader: in other words, how confident the OCR routines were that they had recognised the characters correctly! The help file states "Somewhere around 70% is usual", but I obviously haven't got the technique right yet; I find about 60% to be good going. Note that this doesn't mean that about 40% of the letters are wrong (in practice only about 2% are), simply that only 60% are recognised at the highest confidence level. The others are (usually accurate) guesses.
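The distinction between the confidence rating and the true error rate can be sketched in a few lines of Python. This is an illustration only: the numeric levels (0 to 3, with 3 as the highest) and the function name are assumptions, and the real software's internal representation is not documented.

```python
def confidence_rating(levels, top_level=3):
    """Percentage of characters recognised at the highest confidence level.

    `levels` holds one hypothetical confidence level per scanned character.
    """
    if not levels:
        return 0.0
    at_top = sum(1 for level in levels if level == top_level)
    return 100 * at_top / len(levels)


# Five characters, three of them at the top level: a 60% rating,
# even if every one of the five was in fact read correctly.
print(confidence_rating([3, 3, 3, 2, 0]))
```

On this reading, a 60% rating says nothing directly about how many characters are wrong; it only counts how many the pen was sure about.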
The software uses colour to indicate the reliability of each letter, ranging from red for dubious through green and blue to black for almost certainly correct, and attempts to display each line of text at its correct relative size. When scanning, the pen recognises four ranges of text size, from large (16pt) to very small (8pt), and this data is also interpreted and displayed by the computer.
However, while these options are very clever, they can make the text much harder to read. Using coloured text in a 16-colour mode is a particularly bad idea, but reproducing text that was scanned at a very small type size (e.g. from a magazine) can also be awkward. The third and fourth icons on the toolbar thus allow you to switch off both the colour and size options as required.
In practice, as can be seen from the screenshot, the size data is usually useless. For some reason the Pocket Reader tends to report consecutive lines of text in an identical font as being in radically different sizes! The coloured reliability data also seems of little practical import. Looking closely, for example, at the first word of the screenshot reveals it as "Plany" ("Many"); the "a", marked as least reliable, is in fact correct. The "P", marked with a higher confidence, and the "l" next to it, marked as very high confidence, are both totally wrong. Further down the page, the "M" in "Mayor" is rendered as "lVl"; but again, all three letters form a mistake, while the red "a" next door is correct.
The bottom two icons are greyed-out unless the mouse has been used to highlight a selection of text in the window. (One disconcerting fact is that you can only select whole lines of text at a time, and highlighting thus only takes place when the mouse is moved vertically. Before I accidentally discovered this I spent several frustrating minutes apparently unable to select anything at all, since I was moving the mouse across the lines of text horizontally from start to end in the usual RISC OS fashion. It might be less confusing if horizontal movement over the window automatically selected that entire line instead.)
The crossed-out text icon deletes the selection, while the icon showing a red I-caret sends the selected text, one character at a time, into the application that currently has the input focus. Note that transfer in this manner is very slow; in order to transfer more than a short line or so, it is easier to use the savebox to save out a selection as a text file, where practical.
The savebox can be brought up either by clicking MENU over the window (as in this screenshot) or by selecting the floppy disc icon on the toolbar. There are two main options: "Join sections", which will remove all the line endings, saving the text in whole paragraphs as defined by the paragraph marks manually entered during scanning, and "Selection", which saves only the section of text selected in the main window. PReader nominally allows you to save in four different formats, as shown here. In fact, the only extra formatting to be saved is the font sizes; and since these are switched on and then off again for every line (even when "Join sections" is selected), the effect, in whatever format, is reminiscent of the HTML output by Microsoft's Web page creation tools. Since the font sizes for consecutive lines are generally falsely varied in any case, I have found no use for this feature.
Other submenu options
The View submenu provides an alternative method of controlling the colour ("Show confidence") and size ("Show size") options, plus a third control option ("Draw during fetch") not available from the toolbar. Deselecting this option means that the main window will not be updated to show the contents of the pen's memory until the whole text has been fetched, instead of displaying it line by line as it goes. This ought to provide a speed increase; in practice it seems to have little effect, even under RISC OS 3.1.
The Delete submenu corresponds to the role of the "Delete selection" icon, but also provides the facility to delete the entire contents of the window before downloading the results of a new scan. Note that if you don't delete the window contents, subsequent downloads will simply be appended to the end of the previous text; and that once the text has been downloaded, changes made to the window display will have no effect on the contents held in the flash EPROM of the Pocket Reader itself. This can only be erased using the "Erase memory" option from the iconbar menu (see below).
Finally, Disconnect (or Connect if offline) duplicates the same entry on the iconbar menu.
PReader not only downloads and displays the contents of the Pocket Reader's memory, but also gives access to the pen's internal configuration options (the F button) via a RISC OS-style menu tree.
"Configure Device" allows you to set the languages used for display and assumed while scanning, while "Erase memory" corresponds to the pen's "Erase all data" option.
"Device information" shows the remaining memory and power as a percentage display rather than as a series of LCD blocks. (I'm a little surprised that the battery status still appears to be at 100% despite several weeks of test usage....)
The Settings option controls the settings of the PReader software itself. Here you can configure whether the confidence colours and text line sizes are displayed by default or not, and which colours are assigned to which confidence level. You can also configure what point size the four internal font size levels are translated into for display and formatted save purposes.
The lower half of this window allows you to configure the serial port (you are unlikely to need to change this from the supplied defaults...) and to alter a couple of other options. You can configure Draw during fetch (the third option on the View menu) on or off by default here; more significantly, you can also set the line separator to be substituted at the end of each scanned line when the Join text save option is used.
Five alternatives are offered: None, Tab, Comma, Space and Newline. None and Space appear to be identical in effect: the lines are joined by converting the newline characters into a single space, as !Edit does when you select "Format text". Newline substitutes each newline character with... another newline character, causing "Join text" to have no effect at all. This would appear to be another redundant option! Tab and Comma allow you to scan tables as TSV and CSV files respectively, by making a separate scan for each item and inserting a manual paragraph break at the end of each line. This is displayed in the window as a single-column list, but when saved using the "Join text" and Tab options, it becomes a TSV file with three fields per line, which can be imported into any suitable application (e.g. Impression, Fireworkz) to create a table.
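The joining behaviour described above can be sketched in Python as follows. This is a reconstruction from the observed behaviour, not PReader's actual code; the function name, the data layout (a list of paragraphs, each a list of scanned sections) and the sample table contents are all hypothetical.

```python
# Separator substituted for each soft line ending when sections are joined.
# "None" is mapped to a space because the two options behaved identically.
SEPARATORS = {
    "None": " ",
    "Space": " ",
    "Tab": "\t",
    "Comma": ",",
    "Newline": "\n",
}


def join_sections(paragraphs, separator="Space"):
    """Join the scanned sections of each paragraph with the chosen separator.

    Hard returns (paragraph ends) always become a newline.
    """
    sep = SEPARATORS[separator]
    return "\n".join(sep.join(sections) for sections in paragraphs)


# Each table cell is a separate scan; each row ends with a manual Return.
table = [["Item", "Qty", "Price"], ["Pen", "1", "99"]]
print(join_sections(table, "Tab"))
```

With the Newline separator the output has one section per line throughout, which is why that option is indistinguishable from not joining at all.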
If you simply scan some text, plug in the pen, run PReader, save the result and quit the program, you are unlikely to encounter any problems. However, I did find that having the pen connected for long periods while writing the review showed up a few annoying features and inconsistencies.
The most serious problem was that second and subsequent attempts to download from the Pocket Reader, after I had erased the memory and/or deleted the text in the window, could sometimes fail after a few seconds with an error: "time out during read" or "illegal byte count during read". In many cases the same data had been downloaded in an earlier attempt without any trouble.
Two problems that annoyed me under RISC OS 3.1, but probably wouldn't affect RISC PC owners, related to mode change. I frequently switch between high- and low-resolution modes, and when changing to a smaller screen mode, the titlebar of the main window often disappeared off the top of the screen so that it was no longer selectable. Fortunately the window has an adjust size icon; clicking on this will force the window back on-screen. Secondly, if the program starts up in a low-resolution mode and I subsequently switch to a high-resolution mode to take screenshots, or vice versa, not only does the text in the window become seriously distorted on mode change (StrongHelp also suffers from this problem) but reopening the window does not (unlike StrongHelp) cause it to redraw correctly. In fact, there then seems to be no way, other than quitting and restarting the program in the new screen mode, to make it readable again.
A more minor annoyance is that when PReader is first run, clicking on the iconbar icon when it reads "Offline" causes it to connect to the Pocket Reader. However, if the program later times out and automatically disconnects to save battery power, or if you manually disconnect it (e.g. to go off and do another scan), subsequent clicks on the iconbar when it reads "Offline" make no attempt to connect. Instead, they open the main window with a new "connect" icon at the top of the toolbar in place of the previous "start download" icon. I am not quite sure what the point of this is, but the inconsistency has caught me out on several occasions when I clicked on the iconbar and wondered why nothing was happening.
Finally, the mysterious PRInput program present on the distribution disc.... My initial experimentation led me to the conclusion that this program had no effect other than to wipe the contents of the Pocket Reader's memory! It had no !Help file, nor had I seen any mention of it in the documentation for PReader or the advertisement that had originally brought the OCR Pen to my attention, and the Info window simply described it as "Special host utility for Pocker Reader". As a result, I did not bother to copy the application off the floppy disc and totally dismissed it. It was only from a posting on Usenet announcing an upgrade to the software that I actually happened to discover what it did.
PRInput is apparently designed for frequent entry of small quantities of data (a task which, as we have seen, can cause errors in PReader); for example, scanning title data into a series of records in a database. Every time you click on its icon it will download the entire contents of the pen's memory to the current caret position and wipe the data it has just downloaded in preparation for the next entry. Of course, if the application that currently has the input focus does not accept text data, nothing will happen other than the wiping of the pen's memory... hence my confusion.
Applications should not be distributed without documentation in this way. PD libraries are full of undocumented programs that disappeared into obscurity without ever acquiring a userbase, simply because it was too much trouble to work out how to use them; and this certainly shouldn't apply to commercial applications.
One of the reasons I wanted to buy the OCR Pen was in the hopes that it would save me from endlessly retyping extracts from newspaper articles, printed submissions etc. when producing a local newsletter. Now, due to long and thankless years of practice I am a reasonably fast and accurate copy-typist; so, in order to see if using the Pocket Reader would actually be more efficient, I decided to stage a time trial between myself and the machine. I first scanned and then typed up manually a page of a standard hardback novel, selecting it for the serif typeface and matt paper surface that seem to suit the scanner best. The page contained 284 words.
Scanning in the 35 lines of text (several of which had to be deleted and rescanned due to very poor character recognition) took 4 minutes. Downloading it into the computer took another 20 seconds using the default settings for the PReader application; this could have been speeded up slightly, to 15 seconds, by disabling the "Draw during fetch", "Show confidence" and "Show size" options. The software reported an estimated 60% accuracy rating, as normal, indicating no unusual problems. I then went through correcting the errors by hand (for speed, since a standard spell-checker is of little use with scanned text). This took me a further 3 minutes, during which time I corrected 35 mistakes: a scanning accuracy rate of just under 98%, or on average one error per line of text. This represents a fair to high quality scan, judging by my general experience so far.
It thus took me about eight minutes in total accurately to transfer one page of a book to the screen using the Pocket Reader.
Manual copy-typing of the same text took 8 minutes, twice as long as the scanning itself. At 284 words, this equates to a typing speed of exactly 35.5 words per minute. Again, experience suggests that this is a fair representation of my normal typing speed. However, I then spent less than 2 minutes making corrections, finding only 17 mistakes. It was also noticeable that the mistakes were of a different nature; mainly transposed letters (normally picked up by a spell-checker), but also missing punctuation marks, and a case where one word had actually been substituted for another ("that" for "this"). The accuracy of text transfer in this case was 98·9%, an improvement of only about 1%. However, this does correspond to a halving of the number of errors!
A second test was performed on a two-column magazine article, using glossy paper and a sans-serif font. The short lines actually contributed to better character recognition in this case: no lines had to be rescanned, and it took only 3 minutes 50 seconds to scan 338 words, split over 73 lines of source text, with a 62% accuracy rating. As before, transfer took about 20 seconds. However, I then had to spend 5 minutes 30 seconds in correcting no fewer than 56 errors! A high proportion of these were caused by confusion between lowercase "l" and uppercase "L"; unfortunately, as the text also contained many genuine "L"s, a global search and replace was not practical.
Typing in the same article by hand took 8.5 minutes, and I then spent 2 minutes 10 seconds correcting the 22 errors which had crept in. Most of these were, as usual, transposed or omitted characters, but there were two major discrepancies which only a careful proof-reader would pick up: I had substituted "£70 a year" for "£70 per year" and "with quite so many" for "with so many". Shading the original meaning in this manner is a classic problem with hand transcription!
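The arithmetic behind these figures can be checked with a short Python sketch. The word count, timings and error counts are taken from the text; the character count of the page is not stated anywhere, so the sketch infers it from the quoted 98·9% figure, and that inferred value is an assumption rather than a measurement.

```python
def words_per_minute(words, minutes):
    """Typing speed in words per minute."""
    return words / minutes


def implied_characters(errors, accuracy):
    """Character count implied by an error count and an accuracy percentage."""
    return errors / (1 - accuracy / 100)


def accuracy_percent(errors, characters):
    """Percentage of characters transferred correctly."""
    return 100 * (1 - errors / characters)


typing_speed = words_per_minute(284, 8)        # 284 words in 8 minutes
page_chars = implied_characters(17, 98.9)      # inferred, roughly 1,500 characters
scan_accuracy = accuracy_percent(35, page_chars)
error_ratio = 17 / 35                          # typed errors relative to scanned errors

print(typing_speed, round(page_chars), round(scan_accuracy, 1), round(error_ratio, 2))
```

On the inferred page length, 35 scanning errors give an accuracy a little below 98%, which agrees with the figure quoted for the scan, and the typed text has roughly half as many errors.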
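Where a global search and replace is too blunt, one alternative is to flag only the suspicious words for manual review. The Python sketch below is a suggestion of mine, not something offered by PReader: it lists words containing a capital letter directly after a lowercase one, a pattern that is rare in ordinary English but typical of this kind of misreading. The function name is hypothetical.

```python
import re

# A word containing a lowercase letter immediately followed by a capital.
MIDWORD_CAPITAL = re.compile(r"\b\w*[a-z][A-Z]\w*\b")


def suspicious_words(text):
    """Return the words that contain a capital letter in mid-word."""
    return MIDWORD_CAPITAL.findall(text)


print(suspicious_words("enjoYIng the fuLl fruits of London"))
```

Genuine names such as "McDonald" would also be flagged, and a misread capital at the start of a word would not, so this narrows the proof-reading rather than replacing it.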
Conclusions from test results
This experiment suggests that scanning in printed text via the OCR Pen can offer a considerable speed increase over the old-fashioned method. On the other hand, it is certainly less accurate, so a high proportion of the time saved will be taken up in making extra corrections to the resulting text. The nature of the errors is also different; on the whole, they will be harder to correct, since the suggestions offered by spell-checkers are optimised for human rather than for OCR error, and some letter combinations can become severely mangled. I am not sure that without the original text to hand for comparison I should have recognised "eHort" in place of "effort", for example, or "Plany" for "Many", though "Decernber" and "vvalking" are fairly obvious.
As the table above demonstrates, my experience was that correcting the scanned text took as long as scanning it in the first place. It should be noted, however, that I am both an experienced proof-reader and a competent typist.
Perhaps the ultimate test of the Pocket Reader came a month after I bought it, on the weekend that the next newsletter fell due. Would it make my life easier? Would it save me time? Well... in practice, yes and no.
I used the pen to scan in one item: an extract from a speech by President Eisenhower, reproduced in the columns of the Guardian. Unfortunately, whether because of the poor quality of the newsprint, because of the vertical rules between the columns, which were interpreted as extra characters at the end of each line, or simply because the paper was not entirely crisp and flat (maybe I should have ironed it?), the quality of the scan was unusually low, as reflected in the accuracy rating: 53%.
healing the wars vvounds, of clothing I!
and feeding and hovlsing the IBeedy, of
perfecting ajust I)olitical life, of enjoYIng !
the fruits of their own freetoil.
In this case, the corrections required were so frequent and so major in nature that it might well have been quicker to have typed the 420-word extract in myself, and would almost certainly have been a less frustrating and laborious task. As for the rest of the newsletter submissions... alas, there was no possibility of passing those under the scanner. I fear no OCR routines have yet been developed to cope with hand-corrected pencil manuscript!