motazsaad/process-arabic-text


Pre-process Arabic Text

Pre-process Arabic text: remove diacritics, punctuation, and repeated characters.

Usage:

Usage: clean_arabic_text.py [-h] -i INFILE -o OUTFILE

Pre-process arabic text (remove diacritics, punctuations, and repeating
characters).

optional arguments:
  -h, --help            show this help message and exit
  -i INFILE, --infile INFILE
                        input file.
  -o OUTFILE, --outfile OUTFILE
                        out file.
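The help text above is what Python's argparse module prints, so the command-line interface can be approximated with a small parser. This is a minimal sketch rather than the repository's actual code: the function name build_parser is hypothetical, and both options are marked required only because the usage line shows them without square brackets.

```python
import argparse


def build_parser():
    """Build a parser whose options mirror the help text (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="clean_arabic_text.py",
        description="Pre-process arabic text (remove diacritics, "
                    "punctuations, and repeating characters).",
    )
    # Assumption: both flags are required, as the usage line suggests.
    parser.add_argument("-i", "--infile", required=True, help="input file.")
    parser.add_argument("-o", "--outfile", required=True, help="out file.")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-i", "infile.txt", "-o", "outfile.txt"])
print(args.infile, args.outfile)
```

Passing an explicit list to parse_args keeps the sketch runnable without real command-line arguments.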

Example

python clean_arabic_text.py -i infile.txt -o outfile.txt
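To illustrate what the three cleaning steps could look like, here is a minimal sketch. It is not taken from the repository: the names clean_text and clean_file, the exact character sets, and the choice to collapse only runs of three or more identical characters (so that legitimately doubled letters survive) are all assumptions.

```python
import re
import string

# Assumption: "diacritics" means the tashkeel marks U+064B to U+0652.
# Tatweel (U+0640), an elongation character, is also stripped here.
ARABIC_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")

# Assumption: ASCII punctuation plus the Arabic comma, semicolon,
# and question mark count as punctuation.
ARABIC_PUNCTUATION = "\u060C\u061B\u061F"
PUNCT_TABLE = str.maketrans("", "", string.punctuation + ARABIC_PUNCTUATION)

# Assumption: a run of three or more identical characters is noise.
REPEATED_CHARS = re.compile(r"(.)\1{2,}")


def clean_text(text):
    """Remove diacritics and punctuation, then collapse long character runs."""
    text = ARABIC_DIACRITICS.sub("", text)
    text = text.translate(PUNCT_TABLE)
    text = REPEATED_CHARS.sub(r"\1", text)
    return text


def clean_file(infile, outfile):
    """Clean infile line by line and write the result to outfile (UTF-8)."""
    with open(infile, encoding="utf-8") as src, \
            open(outfile, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(clean_text(line))


print(clean_text("hello, world!!!"))  # hello world
```

The threshold of three is a design choice: collapsing every doubled character would also alter words that legitimately contain the same letter twice in a row.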

How to contribute

Contributions that improve the code are welcome. Please follow the steps below.

  1. Fork the project.
  2. Modify the code and test it to make sure it works.
  3. Make a pull request.

If you need assistance, please consult the GitHub Help documentation.
