Discussions

Too many shoddy programs claim to calculate the word count of whatever text the user enters, and they almost always have serious flaws.

My program is different: it doesn't count a double space as an extra word, and it ignores everything in parentheses (it was designed for calculating the word counts of research papers, where in-text citations never count towards the word count requirement).

www.kobrascorner.com/tech/wordcount.php

Known issue: Nested parentheses (which I've never seen in a research paper) don't work properly. This means that "I hate cats (they smell (and give me allergies) and their fur gets everywhere) and love dogs." yields a word count of 13 instead of 6, because only the innermost parenthetical is stripped. Until I'm coerced into finding a new regex pattern to use, replace any nested parentheses with square brackets.
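
One possible workaround, sketched in Python rather than the PHP the tool is written in: keep the pattern quoted later in the thread, \([^()]+\), but apply it repeatedly until nothing changes, so each pass peels off one level of nesting. The function names are made up for illustration, and counting by whitespace-separated tokens is a simplifying assumption, not the tool's actual logic.

```python
import re

# Pattern quoted in the thread: one (...) group that contains no other parens.
INNERMOST = re.compile(r"\([^()]+\)")


def strip_parentheticals(text: str) -> str:
    """Remove parenthesised spans, nested ones included, by repeated passes."""
    while True:
        stripped = INNERMOST.sub("", text)
        if stripped == text:  # nothing left to remove
            return stripped
        text = stripped


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (a simplification of the real tool)."""
    return len(text.split())


sentence = ("I hate cats (they smell (and give me allergies) and their fur "
            "gets everywhere) and love dogs.")

single_pass = count_words(INNERMOST.sub("", sentence))
repeated = count_words(strip_parentheticals(sentence))
print(single_pass, repeated)  # 13 6
```

A single pass reproduces the count of 13 described above; repeating the substitution brings it down to 6.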


User Comments

  1. MadameX
    Kobra, I'm going to take a look at this later and link from my writing blog--I get tons of search traffic on phrases like "how many words are there in 5 double-spaced pages" or "10 pages is how many words?"
    1. voodooKobra
      So, how did it go? Does it work to your satisfaction?
  2. PetLvr
    I just tried a few WordPress posts.

    For my last one:

    WordPress 2.6.2 claims it was 902 words
    Leprakaun Plugin claimed it was 896 words
    The Kobra Calculation claimed it was 910 words
    1. voodooKobra
      If you have nested parentheses, it will throw mine off. Otherwise, I need to test it further.

      www.petlvr.com/blog/2008/10/why-you-should-never-buy-a-pet-at-a-pet-store-a...

      My algorithm says 900 words, regardless of whether or not I disable the in-text citation filter.
  3. josephgelb
    I agree, though I never went to college and have never counted words.
  4. flamingpoodle
    Oh, right. Everything in parentheses is likely to be a reference, so you don't really need to count it.

    You could check for numbers between the parentheses, because a reference would have a date in the last four characters before the closing parenthesis (at least in the Harvard referencing style). That way, you could still add comments or the like in parentheses and have them counted - although that could be done with a dash like this anyway - but that's a pretty cool word counter.
    1. voodooKobra
      Yeah. I'm doing MLA style.

      The regular expression I'm using is just \([^()]+\)

      I could probably make a better one, but this does what I want it to do. Something like \([0-9A-Za-z-,\.!? ]*?[0-9]{1,4}\) might be a better choice, since it only matches parentheticals that end in one to four digits.
  5. voodooKobra
    Too late to edit my previous post: I added a Beta feature that will allow you to see how the script interprets each "word" in case it's counting junk as a word, or ignoring a valid word. The only one-character "words" the script allows are "I" and "a." If I'm forgetting any, please let me know.

    EDIT: Single-digit numbers are supposed to be spelled out ("seven") in MLA format, but I'm going to count them anyway. is_numeric() is my friend.
    1. sayzlim
      Kobra, what do you use to make that application? Pure PHP?
    2. voodooKobra
      Yep. I've been playing around with PHP for about 6 years.
    3. sayzlim
      OMG, I think I've fallen too far behind if we're about the same age.
      But your app is cool.
    4. voodooKobra
      Thanks. I need to brush up on regular expressions so I can make this work better.
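
The two patterns from comment 4 behave differently on ordinary parenthetical asides, which a rough Python comparison makes visible. This is only a sketch: the sample sentence is invented, the hyphen in the stricter pattern is escaped for clarity, and counting by whitespace-separated tokens is an assumption rather than the tool's actual method.

```python
import re

# Broad pattern from the thread: any parenthetical without nested parens.
ANY_PARENS = re.compile(r"\([^()]+\)")
# Stricter pattern from the thread (hyphen escaped here): only parentheticals
# whose last characters before ")" are one to four digits.
CITATION_ONLY = re.compile(r"\([0-9A-Za-z\-,.!? ]*?[0-9]{1,4}\)")

# Invented sample: one MLA-style citation and one ordinary aside.
sample = "Dogs are loyal (Smith 42) and (or so I hear) friendly."

broad = ANY_PARENS.sub("", sample)
strict = CITATION_ONLY.sub("", sample)

print(len(broad.split()))   # 5
print(len(strict.split()))  # 9
```

The broad pattern drops both parentheticals, while the stricter one drops only the citation and keeps the aside. The stricter pattern would still strip an aside that happens to end in a number, such as "(see Figure 3)".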

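The token rule described in comment 5 (one-character "words" are limited to "I" and "a", with single digits counted anyway) could look roughly like this in Python. The tool's PHP source isn't shown in the thread, so every name below is hypothetical, and the digit test stands in for PHP's is_numeric() only for the single-character case.

```python
import string

# Hypothetical names; the original tool is PHP and its source is not shown.
ONE_CHAR_WORDS = {"i", "a"}
PUNCTUATION = ".,;:!?\"'()[]-"


def is_word(token: str) -> bool:
    """Decide whether a whitespace-separated token counts as a word."""
    core = token.strip(PUNCTUATION)  # drop leading/trailing punctuation
    if not core:
        return False  # pure punctuation, e.g. "--"
    if len(core) > 1:
        return True
    # One character left: allow "I", "a", and single digits only.
    return core.lower() in ONE_CHAR_WORDS or core in string.digits


text = "I saw a cat & 7 dogs -- x marks it."
counted = [tok for tok in text.split() if is_word(tok)]
print(len(counted))  # 8
```

Here "&", "--", and the stray "x" are rejected, while "I", "a", and "7" are kept.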