I Before E, Except After C – Not a rule, but a joke | Evan Fosmark

When I was in high school, I always had fun finding exceptions to the “I before e, except after c” rule. I found over a hundred on my own back then just by searching around the Internet. Today I decided to take a better approach and wrote up a small script in Python to do the work for me. All in all, it found 533 exceptions to this rule.

Click here to view the list of exceptions.

Yeah, that’s quite a few. If you’re interested in how I did it, continue reading. Don’t worry, I won’t be sad if you came here just for the list.

The Rule, Broken Down

  1. If the word contains “ei”, it must be in the form of “cei” or make a long-A sound.
  2. If the word contains “ie”, it cannot be in the form of “cie”.
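To make the two parts concrete, here is a minimal sketch of the rule as a yes/no check on a single word. The function name `follows_rule` and the `LONG_A` pattern are made up for illustration, and the long-A test is only a rough approximation based on a few common spellings, so treat it as a sketch rather than a complete checker.

```python
import re

# Hypothetical helper for illustration; not part of the script further down.
# Assumption: a handful of spellings stand in for the "long-A sound" test.
LONG_A = re.compile(r'eig[hn]|vei[nl]|rein')


def follows_rule(word):
    """Return True if `word` obeys both parts of the rule as stated above."""
    word = word.lower()

    # Part 2: "ie" must not show up as "cie"
    if 'cie' in word:
        return False

    # Part 1: blank out every allowed "ei" spelling ("cei" or long-A).
    # A "#" placeholder keeps the neighbouring letters from joining up.
    remaining = LONG_A.sub('#', word.replace('cei', '#'))

    # Any "ei" still left is neither "cei" nor a long-A spelling
    return 'ei' not in remaining


for word in ["receive", "neighbor", "believe", "science", "weird"]:
    print(word, follows_rule(word))
```

With these approximations, “receive”, “neighbor”, and “believe” pass, while “science” and “weird” count as exceptions.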

The Tools Used

  1. The Python Programming Language
  2. Regular Expressions
  3. A file containing all English words [ get here ]

The Code

Writing up the code is pretty simple. Basically, all it does is open the dictionary file and check each word against the rule. If a word doesn’t follow the rule, the script logs it in the output file.

import re

# Regexes for removal of the valid "ei" cases
bad_ei_regexes = [

      # Valid case by default
      re.compile('cei'),

      # Valid cases with a long-A sound
      re.compile('^(z|p)ei'),
      re.compile('(?<![cs])hei'),
      re.compile('rein'),
      re.compile('vei[nl]'),
      re.compile('eig(h|n|e)')
]

# Open the input & output file streams; "with" closes them when we're done
with open("dictionary.txt", "r") as input_handle, \
        open("i_before_e.txt", "w") as output_handle:

    # Iterate through all English words (one per line)
    for line in input_handle:

        # I comes before e, even when after a c
        if 'cie' in line:
            output_handle.write(line)
            continue

        # Remove all of the valid "ei" cases from a working copy,
        # so the word itself is written out unmodified
        remaining = line
        for regexc in bad_ei_regexes:
            remaining = regexc.sub('', remaining)

        # E comes before I, without a C, and without long-A sound
        if 'ei' in remaining:
            output_handle.write(line)

Pretty simple, eh? We let Python and regular expressions do all of the work!

Bonus: Words that have Qs that aren’t followed by Us

Scrabble players rejoice! I just compiled a list of words that contain a Q that isn’t followed by a U. I did it in the same way that I found exceptions to the “I before E, except after C” rule. This was just for fun. I don’t even play Scrabble.

Click here to view the list. (There are 29 words total)
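The script for this list isn't shown here, so what follows is only a minimal sketch of one way to run the same kind of check, assuming a list of words with one word per entry. The names `Q_WITHOUT_U` and `words_with_q_without_u` are made up for this example. The regex uses a negative lookahead, so a Q at the very end of a word counts too.

```python
import re

# A "q" that is not immediately followed by a "u" (also matches a trailing "q")
Q_WITHOUT_U = re.compile(r'q(?!u)', re.IGNORECASE)


def words_with_q_without_u(words):
    """Return the words that contain a Q not followed by a U."""
    result = []
    for word in words:
        word = word.strip()  # drop the newline when reading from a file
        if Q_WITHOUT_U.search(word):
            result.append(word)
    return result


sample = ["queen", "qat", "tranq", "quiz", "qintar", "faqir"]
print(words_with_q_without_u(sample))
# prints ['qat', 'tranq', 'qintar', 'faqir']
```

To run it over the whole dictionary, pass the open file object instead of `sample`, since iterating over a file yields one line at a time.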

 

 

8 Comments

  1. Drewu wrote,

    Hey man, nice script :) You planning on including the “eigh” in it also?

  2. Evan wrote,

    Hey, Drew! Long time, no see. And yep, I just updated it – thanks for that. :)

    I also made sure to filter out other parts that cause the long-A sound.

  3. Drewu wrote,

    Better. I think there is a final step to this though :P What percentage of words that have ie or ei follow the rule? This would be the true validator of the rule’s accuracy :)

  4. Evan wrote,

    Ah, so that we can compare the number of invalid words to the number of valid words?

  5. Drewu wrote,

    Indeed. A simple word count would probably do it. Thought about doing it myself :P

  6. Alysse wrote,

    This is brilliant! I love it!

  7. over9000 wrote,

    Your tarring my MIND APART EVAN!! I can’t even remember how to spell face hole words right anymore! God no; it’s like your some sort of evil elder god from beyond!

  8. Paul T wrote,

    came across this and thought it was pretty interesting, but thought you might like to know that there are OVER 1600 exceptions to the rule and the list is growing!
