This repository has been archived by the owner on May 15, 2020. It is now read-only.
Object-orientated interface to aspell's pipe mode.

UrsaDK/MetaSpell

MetaSpell

Status: ARCHIVED - Fully functional, but missing tests
Version: 2.0.79

NAME

Umka::MetaSpell - object-orientated interface to aspell's pipe mode

SYNOPSIS

The following is an example of the way this module can be used from within a custom script. This example will start a new aspell process using a specific language with predefined encoding, switch to a quiet, html mode and spell check a custom block of text:

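    use Umka::MetaSpell;

    # Start aspell from an explicit path.
    my $Spell = new Umka::MetaSpell( '/usr/local/bin/aspell' );

    # Choose the language and the input encoding.
    $Spell -> setOption( 'lang', 'en' );
    $Spell -> setOption( 'encoding', 'utf-8' );

    # Switch to terse output and to the html filter.
    $Spell -> setMode( 'quiet' );
    $Spell -> setMode( 'html' );

    # $my_html holds the text to be checked.
    my @error = $Spell -> checkText( $my_html );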

DESCRIPTION

When I searched CPAN for a solution to my spell checking problem, I found that most released modules deal only with plain text, and only in ASCII or Latin-1 encoding. Out of the box, aspell not only supports a vast number of languages in various encodings, it is also capable of spell checking a number of different file formats (txt, html, man ...). Until the release of this module, this functionality was not available to the Perl community.

This module is designed to provide an object-orientated interface to aspell's pipe mode. It is largely compatible with "ispell -a" mode (see Incompatibilities section for more information) and allows a user to start and control an aspell process while sending multiple text blocks for it to process. Output of the program is then collected and processed. This way, multiple blocks of text can be spell checked within a single aspell process, saving precious system resources.
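
As an illustration of the output that has to be collected and processed, the following is a minimal sketch of parsing "ispell -a" style result lines. It is not taken from this module: parse_result_line is a hypothetical helper, and the sample lines are hand-written rather than captured from a real aspell run.

```perl
use strict;
use warnings;

# Parse one line of "ispell -a" style output into a hash reference.
# Returns undef for lines that do not report a misspelling.
sub parse_result_line {
    my ($line) = @_;

    # "& word count offset: sugg1, sugg2, ..."  (misspelled, with suggestions)
    if ( $line =~ /^& (\S+) (\d+) (\d+): (.*)$/ ) {
        return { word => $1, offset => $3, suggestions => $4 };
    }

    # "# word offset"  (misspelled, no suggestions)
    if ( $line =~ /^# (\S+) (\d+)$/ ) {
        return { word => $1, offset => $2, suggestions => '' };
    }

    return;    # "*" (correct word), banner line, or blank line
}

# Hand-written sample output (assumption, not a real aspell transcript).
my @sample = (
    '@(#) International Ispell Version 3.1.20 (but really Aspell 0.60.4)',
    '*',
    '& worlld 3 14: world, would, worlds',
    '# xqzv 21',
    '',
);

for my $line (@sample) {
    my $error = parse_result_line($line) or next;
    printf "%s at %d: %s\n",
        $error->{word}, $error->{offset}, $error->{suggestions};
}
```

The hash keys mirror the ones documented for checkLine() below; how the module itself parses the output may differ.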

Requirements

The following is a list of modules required by Umka::MetaSpell. Please note that version numbers indicate the version of a module this package was built with. With minor tweaking you should be able to get Umka::MetaSpell to run with older versions of the same modules.

    Aspell      0.60.4  interactive spell checker (binary required)
    IPC::Open2  1.02    Open a process for both reading and writing
    Carp        1.03    Throw exceptions outside current package

Installation

Currently this module is not distributed as part of the CPAN archive, so installing it is not as simple as running install Umka::MetaSpell from the CPAN shell. However, it is not much harder than that either.

First, make sure that you have successfully installed everything listed in the Requirements section. Then complete the installation by copying the module to a location where perl can find it.

The list of directories perl searches when you use or require a file is shown by running perl -V. Directories can be added to that list with the use lib '/path/to/module' pragma within a script.
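
A short sketch of the use lib approach follows. The path is the placeholder from the text above, and the sketch assumes the module file is saved as Umka/MetaSpell.pm beneath it, which is where perl looks for a package named Umka::MetaSpell.

```perl
use strict;
use warnings;

# '/path/to/module' is a placeholder; replace it with the directory
# that contains Umka/MetaSpell.pm.
use lib '/path/to/module';

# Print the search path so that the new entry can be confirmed.
print "$_\n" for @INC;
```

Once the path is in place, a use Umka::MetaSpell; line after the pragma will load the module.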

Incompatibilities

Please note that this module is designed to be compatible with both the aspell and ispell command line utilities. However, since aspell extends ispell's functionality by providing a proprietary interface to some of its functions, some of the methods defined in this module will only work with aspell.

The following is a list of aspell only methods defined by this module:

    setOption       defines a new value of aspell only options
    getOption       retrieves current value of aspell only options
    getWords        returns a list of all words in dictionaries

METHODS

Publicly Available

The following is a list of publicly available methods, their arguments and return values. Any change to the syntax of these methods would result in a change to the minor version of the library.

new( command, arguments )

Creates and returns a new Umka::MetaSpell object.

command (required; string)

Path (absolute or relative) to the location of the aspell binary.

    new Umka::MetaSpell( '/usr/local/bin/aspell' );

arguments (optional; string or array)

A list of arguments to be passed to the aspell process upon its invocation. Please note that each new argument should be passed as a separate element of the array. If only one argument is passed then it is possible to pass it as a string.

    new Umka::MetaSpell( 'aspell', '--personal=/path/to/dict' );
    new Umka::MetaSpell( 'aspell', '--lang=en', '--encoding=utf-8' );

end( )

This method closes all file handles to the current aspell process and then waits for that process to exit. It returns the pid of the deceased process, or -1 if there is no such child process. For more information on the returned values, see the waitpid manual page.

    $Spell -> end();
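
The two return values can be seen with plain IPC::Open2 and waitpid. This is a sketch of the general pattern only, not the code of end(): the child process here is the running perl interpreter acting as a stand-in for aspell, and all variable names are illustrative.

```perl
use strict;
use warnings;
use IPC::Open2;

# Stand-in child process: the current perl interpreter echoing its input.
my $pid = open2( my $from_child, my $to_child,
    $^X, '-e', 'print while <STDIN>' );

# Closing the handles lets the child see end of input and exit.
close $to_child;
close $from_child;

my $reaped = waitpid( $pid, 0 );    # pid of the deceased process
my $again  = waitpid( $pid, 0 );    # -1, the child has already been reaped

print "first waitpid: $reaped, second waitpid: $again\n";
```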

setMode( mode )

This method is used to modify aspell's behaviour by allowing a user to switch current spell checking mode. Thus, all further input will be checked according to the syntax of the new mode.

mode (required; string)

A valid mode will cause Aspell to parse future input according to the syntax of that formatter. For more information about the different modes, see the aspell manual page. Examples of modes are: html, email, url, tex.

In addition to all modes supported by aspell this method supports the following custom modes: quiet - enter terse mode, verbose - exit terse mode, default - enter the default mode.

    $Spell -> setMode( 'html' );
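
For orientation, aspell's pipe mode defines single-character commands for these switches: "!" enters terse mode, "%" exits it, "-" enters the default mode and "+mode" enters a named mode. The sketch below maps the mode names listed above onto those commands. The mode_command helper is hypothetical, and whether the module builds its commands this way is an assumption.

```perl
use strict;
use warnings;

# Custom mode names from this document and the aspell pipe commands
# they correspond to.
my %CUSTOM_MODE = (
    quiet   => '!',    # enter terse mode
    verbose => '%',    # exit terse mode
    default => '-',    # enter the default mode
);

# Hypothetical helper: return the line to send to aspell for a mode name.
sub mode_command {
    my ($mode) = @_;
    return exists $CUSTOM_MODE{$mode} ? $CUSTOM_MODE{$mode} : "+$mode";
}

print mode_command($_), "\n" for qw( quiet html default );
```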

setOption( option, value )

This method changes Aspell-specific options. It is provided in addition to the setMode() method, which is designed for Ispell compatibility. This method always returns 1.

option (required; string)

Defines a name of the configuration option to be modified by the supplied value. See aspell manual page for more information.

    $Spell -> setOption( 'repl', '/path/to/replacement/list' );

value (required; string)

Defines a new value for the option supplied by option.

    $Spell -> setOption( 'personal', '/path/to/personal/dict' );

getOption( option )

This method retrieves Aspell-specific options. It is provided in addition to the setMode() method, which is designed for Ispell compatibility. This method always returns 1.

option (required; string)

Defines the name of the configuration option to be retrieved by the module. See the aspell manual page for more information.

    $Spell -> getOption( 'dict-dir' );

addWord( word, mode )

This method adds a word to a dictionary using a specific mode. If no mode is specified, the word is added to the session dictionary. This method always returns 1.

word (required; string)

A word to be added to dictionary.

    $Spell -> addWord( 'Umka' );

mode (optional; string)

The following modes are supported: session - accept the word for the current session but leave it out of the dictionary, personal - insert the all-lowercase version of the word in the personal dictionary, asis - add the word to the personal dictionary exactly as supplied.

    $Spell -> addWord( 'Umka', 'session' );
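
The three modes line up with the word-adding commands of aspell's pipe mode: "@word" accepts a word for the session, "&word" stores its all-lowercase form in the personal dictionary and "*word" stores it as given. The sketch below shows that mapping. The add_word_command helper is hypothetical, and the module's own implementation may differ.

```perl
use strict;
use warnings;

# Prefix characters defined by aspell's pipe mode for adding words.
my %ADD_PREFIX = (
    session  => '@',    # accept for this session only
    personal => '&',    # all-lowercase form into the personal dictionary
    asis     => '*',    # word as given into the personal dictionary
);

# Hypothetical helper: build the line to send for addWord( word, mode ).
sub add_word_command {
    my ( $word, $mode ) = @_;
    $mode = 'session' unless defined $mode;    # documented default
    die "unknown mode '$mode'\n" unless exists $ADD_PREFIX{$mode};
    return $ADD_PREFIX{$mode} . $word;
}

print add_word_command('Umka'), "\n";
print add_word_command( 'Umka', 'asis' ), "\n";
```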

getWords( mode )

This method allows retrieval of the complete list of words from either a current session dictionary or a personal word list (depending on the supplied mode). Method returns an array of words stored in a dictionary.

mode (required; string)

The following modes are supported: session - list words from current session dictionary, personal - list words from personal dictionary, all - list words from both personal and session dictionaries. If a word appears in both dictionaries then it will be listed twice.

    $Spell -> getWords( 'all' );

saveWords( )

This method takes no arguments and saves the user's personal word list / dictionary file. This method always returns 1.

    $Spell -> saveWords();

checkLine( line )

This method allows a user to spell check a single line of text. Upon a successful check an array of anonymous hashes will be returned. Each hash corresponds to one spelling error and lists the following information: offset - the column in which the misspelled word was found, word - the word that was misspelled, suggestions - a comma separated list of possible suggestions.

line (required; string)

A line of text to be spell checked. All newline characters (\n) are removed from the supplied line before it is passed to the spell checker.

    $Spell -> checkLine( "Goodbye cruel worlld \n"
            ."I'm leving you today \n"
            ."Godbye, goodbye, goodbye" );

checkText( text )

This method allows a user to spell check a block of text. Upon a successful check an array of anonymous hashes will be returned. Each hash corresponds to one spelling error and lists the following information: line - the line number on which the misspelling occurred, offset - the column in which the misspelled word was found, word - the word that was misspelled, suggestions - a comma separated list of possible suggestions.

text (required; string)

A block of text to be spell checked. All newline characters (\n) in the supplied text will be preserved during the check.

    $Spell -> checkText( "Goodbye cruel worlld \n"
            ."I'm leving you today \n"
            ."Goobye, goodbye, goodbye." );

Internal Access

This module has no internal methods.

SEE ALSO

For further information about all possible options used by Aspell, their default values and how they can be set, please see the aspell manual page or the project's webpage (http://aspell.sourceforge.net/man-html/).

Text::Aspell, Text::SpellCheck, Text::SpellChecker