Comparative frequency analysis of Arabic texts

This article is also available in Arabic [Id: ar00003 عربي], accessible through the articles page.

  • The Quran. The text was downloaded from tanzil.net. See the QSS page for details on how the data were prepared.
     
  • Others. These sources are drawn from:
    • The first seven volumes of the series البداية والنهاية (The Beginning and The End) by Ibn Katheer. Together, these seven volumes span 2,855 pages, with 1,096,047 words and 4,326,031 letters.
    • The sirah book الرحيق المختوم (The Sealed Nectar) by Almubarakfuri; sirah means the life of Prophet Mohammad صلى الله عليه وسلم. The book spans 284 pages, with 134,662 words and 553,740 letters.
    • The book تحفة العروسين (The Masterpiece for the Bride) by Ash-shuri. The book spans 239 pages, with 66,550 words and 242,361 letters.

    Collectively, these sources total 3,378 pages, 1,297,259 words, and 5,122,132 letters.
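The collective totals can be checked by summing the per-book figures quoted in the list. The following is only a small arithmetic sketch; the dictionary keys and variable names are illustrative and not part of the original analysis.

```python
# Per-book figures as quoted in the list above: (pages, words, letters).
# The keys are informal labels chosen for this sketch.
sources = {
    "The Beginning and The End (vols. 1-7)": (2855, 1096047, 4326031),
    "The Sealed Nectar": (284, 134662, 553740),
    "The Masterpiece for the Bride": (239, 66550, 242361),
}

# Sum each column (pages, words, letters) across the three books.
total_pages, total_words, total_letters = (
    sum(column) for column in zip(*sources.values())
)

print(total_pages, total_words, total_letters)
# → 3378 1297259 5122132
```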

See the Arabic alphabet frequency analysis document for more coverage of the topic. Computations were conducted using Intellyze. From the table and figure shown above, it is concluded that either source may be used as a reference for the frequency of Arabic letters in general Arabic text.
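For readers who want to reproduce this kind of comparison on their own texts, here is a minimal Python sketch. It is not the Intellyze computation, whose details are not described here. It assumes that a "letter" is any character in the basic Arabic Unicode ranges U+0621 to U+063A and U+0641 to U+064A, so diacritics, the tatweel, digits, and punctuation are ignored, and that hamza and alef variants are counted as separate letters. The function names `letter_frequencies` and `max_abs_difference` are hypothetical.

```python
from collections import Counter


def is_arabic_letter(ch):
    """Assumption: count only the basic Arabic letters, U+0621..U+063A and
    U+0641..U+064A. This skips the tatweel (U+0640) and the diacritics."""
    return "\u0621" <= ch <= "\u063A" or "\u0641" <= ch <= "\u064A"


def letter_frequencies(text):
    """Return a dict mapping each Arabic letter to its relative frequency."""
    counts = Counter(ch for ch in text if is_arabic_letter(ch))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: n / total for letter, n in counts.items()}


def max_abs_difference(freq_a, freq_b):
    """Largest absolute gap in relative frequency over all letters.
    A small value suggests the two texts have similar letter profiles."""
    letters = set(freq_a) | set(freq_b)
    return max(
        (abs(freq_a.get(l, 0.0) - freq_b.get(l, 0.0)) for l in letters),
        default=0.0,
    )


# Tiny illustration with two short phrases; real use would load full texts.
sample_a = "بسم الله"
sample_b = "الحمد لله"
freq_a = letter_frequencies(sample_a)
freq_b = letter_frequencies(sample_b)
print(max_abs_difference(freq_a, freq_b))
```

The maximum absolute difference is just one simple way to compare two frequency profiles; other measures would serve equally well, and the source does not state which measure, if any, was used.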