Word frequency

A few weeks ago, I submitted an application for an online editorial job. The ad stated that the company uses US English style, so I double-checked my application for any US usage I could incorporate. I was able to include search engine optimization, but the only Honours was part of the official name of my linguistics degree, so that had to stay. I then thought about serial commas, which I don’t usually use. (They have their uses, but if in doubt, leave it out.) I searched for and, and was surprised to find 63 ands in a 938-word document, or 6.71% of the total.

And is the third or fifth most common word in English, depending on which list you consult. One site gives its frequency as 2.67%, which means I used it about two and a half times as often as average. I could avoid almost all of those ands. I could write:

I hold qualifications in linguistics. I hold qualifications in teaching English to speakers of other languages. I hold qualifications in classical music. I have worked as a legal publishing editor. I have worked as a magazine subeditor. I have worked as an English language teacher.

But it is more natural to write:

I hold qualifications in linguistics, teaching English to speakers of other languages and classical music, and have worked as a legal publishing editor, magazine subeditor and English language teacher.

Three ands in 29 words is just over 10%, without being particularly noticeable. 
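As a rough check, that count can be reproduced with a few lines of Python. This is only a sketch: the function name word_share and the tokenizing rule (a run of letters, optionally joined by apostrophes or hyphens, counts as one word) are assumptions for illustration, not how the original count was made.

```python
import re


def word_share(text, target):
    """Return (occurrences of target, total words, percentage share)."""
    # Assumption: a "word" is a run of letters, optionally joined by
    # apostrophes or hyphens; case is ignored.
    words = re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())
    hits = sum(1 for w in words if w == target.lower())
    return hits, len(words), 100 * hits / len(words)


sentence = (
    "I hold qualifications in linguistics, teaching English to speakers "
    "of other languages and classical music, and have worked as a legal "
    "publishing editor, magazine subeditor and English language teacher."
)

hits, total, share = word_share(sentence, "and")
print(f"{hits} of {total} words = {share:.2f}%")  # 3 of 29 words = 10.34%
```

A different tokenizing rule (for example, splitting hyphenated words) would change the total for other texts, so any percentage depends on how a word is defined.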

Oxford Dictionaries gives the 10 most common English words as:

the, be (including are, am, is, were, was, being, been), to, of, and, a (not including an), in, that, have (including has, had), I

Those 10 words occurred in my letter with the following frequencies, from most to least common:

and 63 out of 938 = 6.71%
a 30 (3.19%) + an 6 (0.63%) = 36 = 3.83%
I 31 = 3.30%
of 29 = 3.09%
the 28 = 2.98%
to 26 = 2.77%
in 21 = 2.23%
be 2 + am 8 + is 2 + are 1 + was 2 = 15 = 1.59%
have 10 + has 3 + had 1 = 14 = 1.49%
that 2 = 0.21%

These 10 words (or 15 if you count an, am, is, are and was separately) together occur 265 times, or 28.25% of the total. 
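The tally above can be checked with a short Python sketch. The counts are copied from the list; the grouping of forms under a headword follows the list, while the names counts, percent and group_totals are assumptions for illustration. Rounded to two decimal places, a few figures come out 0.01 higher than those quoted (6.72% rather than 6.71% for and), which suggests the quoted figures were cut off after two decimals rather than rounded; that is an inference, so the sketch offers both options.

```python
import math

TOTAL_WORDS = 938  # length of the letter, as given above

# Counts copied from the list above, grouped under each headword.
counts = {
    "and": {"and": 63},
    "a": {"a": 30, "an": 6},
    "I": {"I": 31},
    "of": {"of": 29},
    "the": {"the": 28},
    "to": {"to": 26},
    "in": {"in": 21},
    "be": {"be": 2, "am": 8, "is": 2, "are": 1, "was": 2},
    "have": {"have": 10, "has": 3, "had": 1},
    "that": {"that": 2},
}


def percent(n, total=TOTAL_WORDS, truncate=False):
    """Share of total as a percentage with two decimal places."""
    value = 100 * n / total
    if truncate:
        # Assumption: cut off after two decimals instead of rounding.
        return math.floor(value * 100) / 100
    return round(value, 2)


# Add up the forms of each headword, then all ten headwords together.
group_totals = {head: sum(forms.values()) for head, forms in counts.items()}
top_ten = sum(group_totals.values())

for head, n in sorted(group_totals.items(), key=lambda kv: -kv[1]):
    print(f"{head:5} {n:3} {percent(n):5.2f}%")
print(f"total {top_ten} = {percent(top_ten):.2f}%")
```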

Other grammar words I used multiple times are: 

as 11, my 11, with 11, for 10, about 6, at 6, on 5, any 4, from 3, me 3, or 3, some 3, then 3, which 3, you 3, after 2, also 2, he 2, his 2, most 2, into 2, so 2, their 2, through 2

So I used the top 10 (or 15) words and those next 24 words 265 + 103 = 368 times, or 39.23% of the total. Almost two out of every five words in the letter are one of those. Not surprisingly, almost all of those words are among the 100 most common English words, the only exception being through.
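The combined share can be verified in the same way. In this sketch the counts are copied from the list of other grammar words, and the figure of 265 is the top-10 tally already given; the variable names are assumptions for illustration.

```python
TOTAL_WORDS = 938    # length of the letter
TOP_TEN_COUNT = 265  # the ten headwords and their forms, as tallied above

# The 24 other grammar words and their counts, as listed above.
other_grammar = {
    "as": 11, "my": 11, "with": 11, "for": 10, "about": 6, "at": 6,
    "on": 5, "any": 4, "from": 3, "me": 3, "or": 3, "some": 3,
    "then": 3, "which": 3, "you": 3, "after": 2, "also": 2, "he": 2,
    "his": 2, "most": 2, "into": 2, "so": 2, "their": 2, "through": 2,
}

other_count = sum(other_grammar.values())
covered = TOP_TEN_COUNT + other_count
coverage = 100 * covered / TOTAL_WORDS

print(len(other_grammar), other_count, covered, f"{coverage:.2f}%")
# 24 103 368 39.23%
```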

On the other hand, the top content words I used multiple times are:

editor 5 editors editorial 3 subeditor subeditorial 3 edit editing 4
work 6 worked 4 working 2
English 10 
language 7 languages 2 language-related
music 6 
writing 3 writer write
publishing 4 publishers
student 2 students 3
use uses (n/v) 5 
(and many others 4, 3 or 2 times each).

Of those, the most common in English generally are work (the 11th most common noun and the 30th most common verb) and use (the 67th most common noun and the 21st most common verb). In general, we use more content words fewer times each, and fewer grammar words more times each.

Unfortunately, I didn’t get the job. Whether that was because of the glaring grammatical error I later discovered in the second paragraph (the result of adding text at the refining stage), I’ll never know. The company did actually bother to inform me, which is more than many companies do.

