Word Count program for College students
Posted by voodooKobra • 10/13/08
Topics: count, php, programming, regex, tools, word
There are too many shoddy programs out there that claim to calculate the word count of the text inputted by the user, but they almost always have serious flaws.
My program is different: It doesn't count two spaces as a word, and it ignores everything in parentheses (it was designed for calculating the word counts of research papers, and in-text citations never count towards the word count requirement).
www.kobrascorner.com/tech/wordcount.php
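The two behaviours described above can be sketched roughly like this. This is a minimal Python illustration, not the PHP behind the link; the function names and the sample sentence are made up, and the parenthesis pattern is only an assumption about how such a filter might work.

```python
import re

def naive_count(text):
    # Splitting on a single space turns every extra space into an empty "word".
    return len(text.split(" "))

def count_words(text):
    # Drop simple (non-nested) parenthetical spans such as in-text citations.
    without_parens = re.sub(r"\([^()]*\)", " ", text)
    # split() with no argument collapses runs of whitespace, so double
    # spaces never produce empty tokens.
    return len(without_parens.split())

sample = "Dogs are loyal  (Smith 2008) and friendly."
print(naive_count(sample))  # 8: counts the double space and the citation
print(count_words(sample))  # 5
```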
Known issue: Nested parentheses (which I've never seen in a research paper) don't work properly. This means that "I hate cats (they smell (and give me allergies) and their fur gets everywhere) and love dogs." would yield a word count of 13 instead of 6. Until I'm coerced into finding a new regex pattern to use, replace any nested parentheses with square brackets.
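For anyone curious why nesting trips up a regex, here is a small Python sketch. The pattern the script actually uses isn't shown in this thread, so the one below is an assumption; it just happens to reproduce the 13 from the example when applied once. Repeating the substitution until nothing changes is one possible workaround.

```python
import re

# Assumed pattern: a parenthesised span containing no other parentheses,
# i.e. only the innermost pair matches.
INNERMOST = re.compile(r"\([^()]*\)")

def strip_once(text):
    # One pass removes only the innermost pair and leaves the outer pair behind.
    return INNERMOST.sub(" ", text)

def strip_nested(text):
    # Repeat the substitution until the text stops changing, peeling
    # nested parentheses from the inside out.
    previous = None
    while previous != text:
        previous, text = text, INNERMOST.sub(" ", text)
    return text

sentence = ("I hate cats (they smell (and give me allergies) and their fur "
            "gets everywhere) and love dogs.")
print(len(strip_once(sentence).split()))    # 13
print(len(strip_nested(sentence).split()))  # 6
```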
User Comments
-
Kobra, I'm going to take a look at this later and link from my writing blog--I get tons of search traffic on phrases like "how many words are there in 5 double-spaced pages" or "10 pages is how many words?"
-
I just tried a few WordPress posts. For my last one:
WordPress 2.6.2 claims it was 902 words
Leprakaun Plugin claimed it was 896 words
The Kobra Calculation claimed it was 910 words
-
If you have nested parentheses, it will throw mine off. Otherwise, I need to test it further.
www.petlvr.com/blog/2008/10/why-you-should-never-buy-a-pet-at-a-pet-store-a...
My algorithm says 900 words, regardless of whether or not I disable the in-text citation filter.
-
Oh, right. Everything in parentheses is likely to be a reference, so you don't really need to count it.
You could check for numbers between parentheses, because a reference would have a year as the last 4 characters between the parentheses (at least in the Harvard reference method). That way, you can still add comments or the like between parentheses - which could be done with a dash like this anyway - but that's a pretty cool word counter.
-
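The date check suggested above might look something like this in Python. It's only a sketch under one assumption: a citation is any parenthetical whose last four characters are digits, as in "(Smith 2008)". The names are hypothetical, and a citation ending in a page number, like "(Smith 2008, p. 4)", would slip through.

```python
import re

# Assumption: a citation is a non-nested parenthetical ending in four digits.
CITATION = re.compile(r"\([^()]*\d{4}\)")

def count_without_citations(text):
    # Remove only the citation-like spans; ordinary parenthetical
    # comments stay in the text and are counted.
    return len(CITATION.sub(" ", text).split())

sample = "Cats are clean (Smith 2008) and quiet (mostly)."
print(count_without_citations(sample))  # 6: "(mostly)" is still counted
```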
Too late to edit my previous post: I added a beta feature that will allow you to see how the script interprets each "word" in case it's counting junk as a word or ignoring a valid word. The only one-character "words" the script allows are "I" and "a". If I'm forgetting any, please let me know.
EDIT: Single-digit numbers are supposed to be spelled out ("seven") in MLA format, but I'm going to count them anyway. is_numeric() is my friend.
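Here's a rough Python sketch of what that per-word view could look like. The real script is PHP and its exact rules aren't posted, so treat every name here as hypothetical. It assumes punctuation is trimmed before checking, that capital "A" is accepted alongside "I" and "a", and it uses str.isdigit() as a stand-in for PHP's is_numeric() on single digits.

```python
# Assumption: the allowed one-letter words are "I" and "a", plus "A"
# at the start of a sentence.
ONE_LETTER_WORDS = {"I", "a", "A"}
PUNCTUATION = ".,;:!?\"'()[]"

def is_word(token):
    core = token.strip(PUNCTUATION)
    if not core:
        return False  # pure punctuation such as "..." is junk
    if len(core) == 1:
        # Single characters count only if they are an allowed word or a digit.
        return core in ONE_LETTER_WORDS or core.isdigit()
    return True

def explain(text):
    # Pair every whitespace-separated token with the counted/ignored verdict.
    return [(token, is_word(token)) for token in text.split()]

def count_words(text):
    return sum(1 for _, counted in explain(text) if counted)

sample = "I ate a - 7 b"
for token, counted in explain(sample):
    print(token, "counted" if counted else "ignored")
print(count_words(sample))  # 4
```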