How To Use the Regex Dictionary

What is the Regex Dictionary?

The Regex Dictionary is a searchable online dictionary that returns matches based on strings rather than whole words. For example, whereas a search at most online dictionaries for "dog" will return only the definitions for that word, a search of the Regex Dictionary will return any word that includes the string "dog" (e.g., Adjectives: dogged, dogmatic; Nouns: doggerel, dogwood, etc.). It can also limit its search to certain parts of speech (e.g., all adjectives ending in -ly) and filter the results based on a second regular expression (e.g., those not beginning with un).
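
The search-then-filter idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entry list, the part-of-speech labels and the `search` function are hypothetical, and Python's `re` module stands in for the Perl engine the site uses.

```python
import re

# Hypothetical sample entries as (word, part of speech); not the real database.
ENTRIES = [
    ("unfriendly", "adjective"),
    ("lovely", "adjective"),
    ("quickly", "adverb"),
    ("dogged", "adjective"),
    ("costly", "adjective"),
]

def search(pattern, pos=None, exclude=None):
    """Return words matching `pattern`, optionally limited to one part of
    speech and with matches of a second pattern filtered out."""
    found = []
    for word, word_pos in ENTRIES:
        if pos is not None and word_pos != pos:
            continue  # wrong part of speech
        if not re.search(pattern, word):
            continue  # first regular expression does not match
        if exclude is not None and re.search(exclude, word):
            continue  # second regular expression filters this word out
        found.append(word)
    return found

# All adjectives ending in -ly, except those beginning with un.
print(search(r"ly$", pos="adjective", exclude=r"^un"))  # ['lovely', 'costly']
```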



How does it work?

If you are familiar with Perl, then you can simply search for matches by entering any valid regular expression. There are two special characters: $v represents the character set [aeiouAEIOU] and $c, [bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ].
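
One plausible way to picture the two aliases is as a plain text substitution performed before the pattern is compiled. The sketch below assumes exactly that; the `expand` helper is hypothetical and is not taken from the site's own code.

```python
import re

# The two aliases exactly as the text defines them.
ALIASES = {
    "$v": "[aeiouAEIOU]",
    "$c": "[bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ]",
}

def expand(query):
    """Replace every $v and $c in a query with its character class."""
    for alias, char_class in ALIASES.items():
        query = query.replace(alias, char_class)
    return query

pattern = expand("^c$vt$")
print(pattern)                            # ^c[aeiouAEIOU]t$
print(bool(re.search(pattern, "cat")))    # True
print(bool(re.search(pattern, "coat")))   # False
```

A "$" that is not followed by v or c is left alone, so it keeps its usual meaning as the end-of-word anchor.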

If you are new to the world of regular expressions, then the easiest way to learn the Regex Dictionary is by example. Here is a step-by-step tutorial:


Basic usage I: Matching words

In the following examples, we'll simply search for a string in all grammatical categories without filtering the results.

  1. A search for the string "cat" means "Find any word containing these three consecutive letters". Hundreds of matches are returned: "catastrophic", "alley cat", "scathingly", "catch", etc.
    Note that the string " cat" (there is a space character in front of the c) will only match compound words like "self catering" and "dairy cattle"; spaces count!
  2. We can limit the matches to all words ending in "cat" by putting the special character "$" after our search string. A search for "cat$" returns only 5 matches: "alley cat", "cat", "siamese cat", "wildcat" and "cat" (verb).
  3. We can limit the matches to all words beginning in "cat" by putting the special character "^" before our search string. A search for "^cat" produces more than 50 matches: "catholic", "category", "cattle", "catch", etc.
  4. Finally, as is logical, we can limit our results to the word "cat" alone by typing in "^cat$".
  5. As mentioned earlier, with the Regex Dictionary you can use the special character $v to match any vowel and the character $c to match any consonant. So if we search for the string "^c$vt$", we'll match all words that begin with c, end in t and have one vowel in the middle: cat, cot and cut.
  6. Let's change our previous search to allow any consonant at the end of the word, substituting the t in our string with the special character $c ("^c$v$c$"). This means "Match any word beginning with c, followed by one vowel, and ending in any single consonant". Now we get a couple dozen matches, including cab, cod and cub.
  7. Finally, we'll search for words beginning in c, followed by two vowels ($v$v) and ending in any single consonant ("^c$v$v$c$"). The matches include cool, coal, coin and coax.
    Note that if we remove the initial ^, we'll match any word ending in "c + 2 vowels + 1 consonant" (official, raincoat, scoot, etc.). If we remove only the final $, we'll match any word beginning "c + 2 vowels + 1 consonant" (caught up, courtship, cease, etc.). Finally, removing both the beginning and ending restrictions, we can also match uncousinly and musicianship.
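
The effect of the two anchors can be checked outside the dictionary as well. The following is a minimal sketch that assumes Python's `re` module behaves like the site's Perl engine for these simple patterns; the short word list is a sample taken from the examples above, and the `matches` helper is hypothetical.

```python
import re

# A few words quoted in the examples above; the real database is far larger.
WORDS = ["catastrophic", "alley cat", "scathingly", "catch", "wildcat", "cat"]

def matches(pattern):
    """Return the sample words in which `pattern` is found."""
    return [w for w in WORDS if re.search(pattern, w)]

print(matches("cat"))    # unanchored: all six sample words
print(matches("cat$"))   # ['alley cat', 'wildcat', 'cat']
print(matches("^cat"))   # ['catastrophic', 'catch', 'cat']
print(matches("^cat$"))  # ['cat']
print(matches(" cat"))   # ['alley cat'] (the leading space counts)
```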

Basic usage II: Character classes

In the following examples, we'll continue searching for a string in all grammatical categories without filtering the results.

  1. The special character "." represents any character at all, including a space, a hyphen or an accented letter (like the é in fiancé). Thus the string "fianc." will match both "affianced" and "fiancé". To get a list of all four-letter words, we can simply type in the string "^....$".
  2. The special character "\w" represents any letter, digit or underscore. So if we want to know how many eleven-letter words are in the database, without including either hyphenated or compound words (e.g., "accident-prone" or "hound dog"), we can type "^", "\w" eleven times and "$" ("^\w\w\w\w\w\w\w\w\w\w\w$"). (Fortunately there is an easier way to do this, as we'll see in the section on quantities.) Note that without the beginning "^" and ending "$", the string would match any word of eleven letters or more.
  3. We can search for groups of characters using square brackets ("[]"). In fact, the special characters mentioned in the previous section, $v and $c, are simply convenient aliases for the character classes [aeiouAEIOU] and [bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ]. A search for the string "^[bcr]$vt$" will return any word beginning with either b or c or r, followed by a single vowel and ending with t. Matches include but, cat, rot and bit.
  4. Within the brackets of a character class, the hyphen (-) is special: it specifies a range, so that if you want to match all lower-case letters you can type [a-z] rather than the cumbersome [abcdefghijklmnopqrstuvwxyz]. You can make a hyphen part of your character class by placing it first or last; for example, the string "[- ]new$" will match both "brand new" and "brand-new".
  5. You can negate a character class by putting the character "^" right after the first bracket; for example, if you want to match any letter except a or o or u, you'd use "[^aou]". Thus "^g[^aou][m-z]$" will match any word beginning in g, followed by any letter except a, o or u and ending in any letter from m to z. Matches include gem, gym and get.

    Note: Remember that the character "^" has two special meanings. At the beginning of a string, it means "The word must begin with the following letter" (ex.: "^c" matches all words that begin with c). At the beginning of a character class, it negates all the characters within the brackets (ex.: "^[^aeiouy]" matches any word that does not begin with a, e, i, o, u or y).
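
The character-class rules above can be tried on a small sample. This is a sketch only: it assumes Python's `re` module treats these classes the same way as the site's Perl engine, and the word list (including "ice" and "x-ray") and the `matches` helper are hypothetical illustrations rather than entries known to be in the database.

```python
import re

# Hypothetical sample words chosen to exercise each rule.
WORDS = ["brand new", "brand-new", "but", "cat", "rot", "ice", "catch", "x-ray"]

def matches(pattern):
    """Return the sample words in which `pattern` is found."""
    return [w for w in WORDS if re.search(pattern, w)]

# "." accepts any character, so the hyphen in "x-ray" is allowed.
print(matches(r"^.....$"))           # ['catch', 'x-ray']
# "\w" accepts only letters, digits and underscores, so "x-ray" drops out.
print(matches(r"^\w\w\w\w\w$"))      # ['catch']
# A hyphen placed first inside the brackets is an ordinary character.
print(matches(r"[- ]new$"))          # ['brand new', 'brand-new']
# One of b, c or r, then a single vowel, then t.
print(matches(r"^[bcr][aeiou]t$"))   # ['but', 'cat', 'rot']
# Outer "^" anchors the start; inner "^" negates the class.
print(matches(r"^[^aeiouy]"))        # every sample word except 'ice'
```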


Basic usage III: Matching this or that

  1. The special character "|" allows us to search for two or more alternatives. For example, the string "hand|mind" will return every entry containing either of these two strings (exs.: absentminded, handicapped, reminder and on the other hand), and the string "hand|mind|eye" will include all these as well as cross-eyed and eyedrops.
  2. We can search for any word ending "handed" or "minded" by using parentheses to group our alternatives: "(hand|mind)ed$". This string matches empty-handed and narrow-minded, among others.
  3. Some other examples: typing "^(g|q)u" will match all words beginning in either g or q and followed by u. The string "(care|fear|harm)(ful|less)" matches careful, careless, fearful, fearless, harmful and harmless.
  4. Note too that groupings can be nested: the somewhat imposing "t(ian|en(d|t))$" means "Match any word ending in either tian, tend or tent"; for example, "dalmatian", "content" and "extend".

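The role of the parentheses is easiest to see by running the same alternatives with and without them. The sketch below assumes Python's `re` module handles alternation and grouping like the site's Perl engine; the word list is a sample drawn from the examples above, and the `matches` helper is hypothetical.

```python
import re

# Sample words taken from the examples in this section.
WORDS = ["absentminded", "handicapped", "reminder", "empty-handed",
         "narrow-minded", "careful", "harmless", "dalmatian", "content", "extend"]

def matches(pattern):
    """Return the sample words in which `pattern` is found."""
    return [w for w in WORDS if re.search(pattern, w)]

# Plain alternation: "hand" or "mind" anywhere in the word.
print(matches(r"hand|mind"))
# Grouped: either alternative, followed by "ed" at the end of the word.
print(matches(r"(hand|mind)ed$"))   # ['absentminded', 'empty-handed', 'narrow-minded']
# Ungrouped, "|" splits the whole pattern: "hand" anywhere OR "minded" at the end.
print(matches(r"hand|minded$"))     # also returns 'handicapped'
# Two groups side by side, then a nested group.
print(matches(r"(care|fear|harm)(ful|less)"))   # ['careful', 'harmless']
print(matches(r"t(ian|en(d|t))$"))              # ['dalmatian', 'content', 'extend']
```

Without the parentheses the "|" applies to everything on either side of it, which is why "handicapped" slips into the ungrouped search.
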
Basic usage IV: Quantities

  1. The special character "+" means "Match the preceding character one or more times". Curly brackets after a character mean "Match this number of the previous character", so the eleven-letter search from the section on character classes can be shortened to "^\w{11}$".
  2. Recall the earlier search "(hand|mind)ed$". If we wish to include words ending in "eyed" as well, using the string "(hand|mind|eye)ed$" won't work, because in this case we would be trying to match words ending in "eyeed". We can solve this by using the special character "?" after the second "e" of "eye": "(hand|mind|eye?)ed$". The "?" means "Match the preceding character zero or one time", effectively making the second "e" of "eye" optional and thus matching words like eagle-eyed and hackneyed.
  3. The following string will match any word beginning with "cat" or containing the string "dog" anywhere: "(^cat|dog)".
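
The quantity characters can be compared side by side on a small sample. As before, this is a sketch that assumes Python's `re` module agrees with the site's Perl engine on these patterns; the word list is a sample (with "catastrophe" added as a hypothetical eleven-letter entry) and the `matches` helper is hypothetical.

```python
import re

# Sample words; "catastrophe" is a hypothetical eleven-letter entry.
WORDS = ["eagle-eyed", "hackneyed", "empty-handed", "catastrophe",
         "accident-prone", "hound dog"]

def matches(pattern):
    """Return the sample words in which `pattern` is found."""
    return [w for w in WORDS if re.search(pattern, w)]

# "{11}" is shorthand for writing "\w" eleven times in a row.
long_form = "^" + r"\w" * 11 + "$"
print(matches(r"^\w{11}$") == matches(long_form))  # True
print(matches(r"^\w{11}$"))                        # ['catastrophe']

# "+" means one or more: whole words with no space or hyphen.
print(matches(r"^\w+$"))                           # ['hackneyed', 'catastrophe']

# Without "?" the pattern demands "eyeed"; with it, the second "e" is optional.
print(matches(r"(hand|mind|eye)ed$"))              # ['empty-handed']
print(matches(r"(hand|mind|eye?)ed$"))             # ['eagle-eyed', 'hackneyed', 'empty-handed']

# "cat" at the beginning of the word, or "dog" anywhere.
print(matches(r"(^cat|dog)"))                      # ['catastrophe', 'hound dog']
```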

What are "regexes"?

"Regex" is short for "regular expression". The Regex Dictionary makes use of Perl's regular expressions in its searches, which permits one to find patterns in words. Perhaps calling it a "dictionary" is a misnomer, since it doesn't give definitions.