Abstract: In this study we consider several approaches for reducing the entropy of Arabic text files. First, we apply a source mapping technique to produce a binary file, and we test the performance of this mapping using both Huffman and arithmetic coding. Next, we implement a file splitting technique, proposed by the authors, for reducing the nth-order entropy of text files. The technique splits the binary file into several subfiles, each containing one or more bits from each codeword of the mapped binary file. Conventional compression techniques are then applied to these subfiles individually, on a bit-wise rather than a character-wise basis, to achieve better compression ratios. The technique was applied to Arabic text files, and a considerable reduction in their entropy was achieved. Applying Huffman as well as arithmetic coding to the binary encoded files showed promising results.
Abdel-Rahman M. Jaradat, Mansour I. Irshid and Talha T. Nassar, 2006. Entropy Reduction of Arabic Text Files. Asian Journal of Information Technology, 5: 578-583.
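The splitting step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the source mapping, so the frequency-ranked fixed-width code used here is an assumption, and the function names (map_to_codewords, split_into_subfiles, block_entropy) are hypothetical. The entropy estimate over non-overlapping n-bit blocks is likewise only one possible reading of "nth-order entropy" and may differ from the definition used in the paper.

```python
import math
from collections import Counter


def map_to_codewords(text, width):
    """Give each distinct character a fixed-width binary codeword.

    Assumption (not from the paper): characters are ranked by frequency
    and the rank is written as a `width`-bit binary number.
    """
    ranked = [ch for ch, _ in Counter(text).most_common()]
    if len(ranked) > 2 ** width:
        raise ValueError("width is too small for this alphabet")
    table = {ch: format(i, f"0{width}b") for i, ch in enumerate(ranked)}
    return [table[ch] for ch in text]


def split_into_subfiles(codewords, width, bits_per_subfile=1):
    """Build subfiles, each holding the same bit positions of every codeword."""
    return [
        "".join(cw[start:start + bits_per_subfile] for cw in codewords)
        for start in range(0, width, bits_per_subfile)
    ]


def block_entropy(bits, n=1):
    """Empirical entropy per binary symbol over non-overlapping n-bit blocks."""
    blocks = [bits[i:i + n] for i in range(0, len(bits) - n + 1, n)]
    total = len(blocks)
    counts = Counter(blocks)
    h = -sum(c / total * math.log2(c / total) for c in counts.values())
    return h / n


if __name__ == "__main__":
    sample = "السلام عليكم ورحمة الله"  # illustrative text only
    width = 5
    codewords = map_to_codewords(sample, width)
    for k, sub in enumerate(split_into_subfiles(codewords, width)):
        print(k, sub, round(block_entropy(sub), 3))
```

Each subfile can then be passed separately to a conventional coder such as Huffman or arithmetic coding; subfiles drawn from rarely used high-order bit positions tend to be strongly biased toward one symbol, which is where an entropy reduction would be expected under this mapping.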