Friday, April 22, 2011

Search Strategies: Ancestry – Part 2

This is part of a series entitled Search Strategies.  Each article will feature a different database and various ways to conduct effective searches.  Some databases may have multi-part articles.

Search Strategies:  Ancestry – Part 1 showed you how to set up the search screen that I personally use, which gives you more control and usually better results.  Part 2 will discuss various tips for conducting actual searches with this setup using Soundex or wildcard searches.

Soundex

Before I get into a discussion about wildcards, I want to talk about Soundex and its limitations.  Many of you already know what Soundex is and how it works.  If you are not familiar with this or want to know more, Ancestry’s Help does a pretty good job explaining it and showing you how to convert a name into this code.

While Soundex is all well and good, there are some limitations and problems, in my opinion.  First, the first character in the code is based on the first letter of the name (e.g., Tarr = T-600).  But sound is the focus here, so when a name is misread and misindexed as Farr, the T-600 Soundex code does me no good, because the code for Farr is F-600.

The other problem is that a code can have so many different variants that are nothing like what you are looking for. For example, a Soundex search for John Tarr in Illinois produced 248 results for the 1920 census, including the following:  Tayor, Trey, Try, Tarro, Toher, Troy, True, Toouriy, Tria, Towery, Terry, Tauer, Thayer, Their, Therry, Theiry, Thore, Thorne, Threw, Thuraw, Toor, Torew, Torri, Tower, Tredwell, Tree, Trehey, and Trow. 

Some of these don’t even come close to sounding like Tarr, but because of the value assigned to the letters, they all share the same code.  For this reason, I personally do not use this method unless I am at a complete dead end.

And finally, depending on the various spellings of a particular name, there could be more than one Soundex code that needs to be searched.  For example, the name Schwartz can be spelled with or without the “t,” which yields two different codes (the “c” and the “h” have no bearing on the code in this case):

  • Schwartz/Swartz/Shwartz (and each variant with an “e” at the end) = S-632
  • Schwarz/Swarz/Shwarz (and each variant with an “e” at the end) = S-620
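For anyone who wants to check these codes by hand, here is a minimal Python sketch of the standard American Soundex rules as I understand them. The function name `soundex` and the hyphenated output format are my own choices for illustration, and this is not Ancestry's code.

```python
# Letter groups that share a Soundex digit; vowels, y, h and w have no digit.
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a code such as 'T-600' (first letter plus three digits)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")  # the first letter's own digit also counts
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of same-coded letters
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeat after a vowel is coded again
    return first.upper() + "-" + "".join(digits)[:3].ljust(3, "0")


for surname in ["Tarr", "Farr", "Schwartz", "Swartz", "Schwarz", "Swarz"]:
    print(surname, soundex(surname))
```

Running it shows the two problems side by side: Tarr and Farr land in different codes because of one misread letter, and the Schwartz spellings split into two codes depending on the “t.”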

So how can we overcome these limitations and problems with Soundex?  Simply put—use wildcards!

Wildcards

There are just too many variables at play with Soundex, which is why I prefer the use of wildcards.  This for me is the fun part, and the chance to tap into my creative mind to find those pesky ancestors.  It’s a puzzle in itself, finding the right combination of letters and wildcards to have better control over your search results.  And you may not be aware, but Ancestry changed the way they handle wildcards, probably a year or so ago, giving us even more flexibility in conducting these types of searches (before, you couldn’t start with a wildcard in the first position and now you can).

All of the following examples are from my personal experience and I hope they illustrate ways in which you can implement this solution into your own research. 

But before we get started, here’s a quick definition of what wildcards are.  In many cases, these same characters work across websites that perform search functions; however, they can vary and may have different limiting criteria.  The following is what Ancestry allows:

  • A question mark (?) is used to replace one letter (e.g., use franc?s for Frances and Francis).
  • An asterisk (*) is used to represent a string of letters (e.g., use fran* for Frances, Francis, Frank, Fran, etc.).  The string could be null, in that fran* will return Fran in addition to the others.
  • A wildcard can be used in any position and multiple wildcards can be used within the same name.  However, at least three other characters have to be used as well.
  • Wildcards can be used for both the first and last name, even in the same search.
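If it helps to see these rules spelled out, the following Python sketch translates a wildcard string into a regular expression. The helper names `wildcard_to_regex` and `matches` are hypothetical, and the sketch only models the rules listed above; how Ancestry actually implements its search is unknown to me.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate ? (exactly one letter) and * (zero or more letters)."""
    literals = [c for c in pattern if c not in "?*"]
    if len(literals) < 3:
        # Models the "at least three other characters" rule described above.
        raise ValueError("need at least three non-wildcard characters")
    parts = []
    for c in pattern:
        if c == "?":
            parts.append("[a-z]")
        elif c == "*":
            parts.append("[a-z]*")
        else:
            parts.append(re.escape(c))
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern: str, name: str) -> bool:
    """True when the whole name fits the wildcard string."""
    return wildcard_to_regex(pattern).fullmatch(name) is not None


print(matches("franc?s", "Frances"), matches("franc?s", "Frank"))
print(matches("fran*", "Fran"), matches("fran*", "Francesca"))
```

A string such as fr* raises an error in this sketch because it contains only two letters besides the wildcard.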

If Ancestry doesn’t like your usage, it will notify you and provide suggestions for using wildcards.

The first example I want to highlight is the Schwartz example mentioned above.  Because this would require you to perform multiple searches (either an exact using all the variations, or a Soundex using two different codes), I use the following wildcard search that incorporates all the variables:

     s*war*z*

The first asterisk allows the search to find anything that starts with an “s” and may include other letters before the “w.”  The second asterisk will pick up anything that may have letters after the “r.”  The last asterisk will pick up anything that may have letters after the “z.”  Just because an asterisk is used, it doesn’t mean that the search will only pick up names with strings to fill the blank…it will also pick up the name “swarz” which is what is left after the asterisks are removed.
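Here is a small Python check of that string against the Schwartz spellings, using the standard library's `fnmatch` module, whose ? and * behave like the wildcards described here. One caveat: `fnmatch` lets * match any character at all, including spaces and apostrophes, so treat it as a stand-in rather than a copy of Ancestry's behavior.

```python
from fnmatch import fnmatchcase

PATTERN = "s*war*z*"

variants = ["Schwartz", "Swartz", "Shwartz", "Schwarz",
            "Swarz", "Shwarz", "Schwartze", "Schwarze"]


def hits(pattern, names):
    """Return the names that fit the wildcard string, ignoring case."""
    return [n for n in names if fnmatchcase(n.lower(), pattern)]


matched = hits(PATTERN, variants)
print(matched)                                # every spelling in the list
print(hits(PATTERN, ["Stewart", "Sawyer"]))   # names without "war...z" drop out
```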

Let’s illustrate the difference between Soundex and Wildcard search returns.  Using the two different variants with Soundex and the one wildcard search, here’s what the 1900 census for this surname with a location of Illinois reveals:

  • Soundex – Schwartz (5,633) + Schwarz (8,759) = 14,392
  • Wildcard – s*war*z* = 4,757

Another name I’ve been working on is Haacke.  There are many variables for this as well (one “a,” no “e,” no “k” or any combination thereof), although the variables are covered by one Soundex code.  Here’s the wildcard string I’ve been using:

     h*ac*

This helps me pick up names that have either one or two of the letter “a” and those missing the “k” and/or the “e.”

Using the same criteria of a location of Illinois and the 1900 census, the following are the search results:

  • Soundex – Haacke = 34,616
  • Wildcard – h*ac* = 6,308

As you can see, there is a huge difference in the number of returns you get.  Although the wildcard results are still numerous, they’re certainly more manageable than the Soundex results.  Also keep in mind that the search was done on surname only and a location of Illinois.  Adding a first name will surely bring those numbers down.  I also have some tips for narrowing down search results specifically for censuses (this will be covered in Part 3).
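To get a feel for why the Soundex count runs so much higher, the sketch below compares the two approaches on a short list of surnames. The list is hand-picked for illustration and is not census data, and the `soundex` helper is a compact version of the standard American rules, not Ancestry's code.

```python
from fnmatch import fnmatchcase

# Soundex digit for each coded letter (vowels, y, h and w have none).
DIGITS = {c: d
          for d, group in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1)
          for c in group}


def soundex(name):
    name = name.lower()
    out, prev = [], DIGITS.get(name[0])
    for ch in name[1:]:
        if ch in "hw":
            continue  # h and w are skipped without breaking a run
        code = DIGITS.get(ch)
        if code and code != prev:
            out.append(str(code))
        prev = code
    return name[0].upper() + "-" + "".join(out)[:3].ljust(3, "0")


# Illustrative surnames only, chosen by hand.
SAMPLE = ["Haacke", "Hacke", "Haack", "Hack",
          "Hayes", "Hicks", "Hawk", "Hogg", "House"]

soundex_hits = [n for n in SAMPLE if soundex(n) == soundex("Haacke")]
wildcard_hits = [n for n in SAMPLE if fnmatchcase(n.lower(), "h*ac*")]

print(len(soundex_hits), soundex_hits)
print(len(wildcard_hits), wildcard_hits)
```

Every name in this sample shares the Haacke code, while the wildcard string keeps only the spellings that contain “ac.”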

Following are some more examples:

Searching For Variations | Wildcard String
Boone / Boon | boon*
Leppin / Leppen / Leppon / Lippin / Lippen / Lippon | l?p*n
Miserentino / Miserintino / Miserendino / Miserindino | miser*ino
Norton / Naughton | n*ton
Rodgers / Rogers | ro*gers
Bachmann / Bachman / Backmann / Backman | bac?man*
John / Johan / Johann / Johnny / Johnnie | joh*n*
Francis / Frances / Frank / Franny / Fran / Francesca | fran*
Francis / Frances (only) | franc?s
Caroline / Carolyn / Carolina | carol*n*
Kathrine / Katherine / Catherine / Cathrine (or those ending in an “a”) | ?ath*rin*
Kathleen / Cathleen | ?athleen
All the Katherine and Kathleen variants combined | ?ath*
Hulda / Huldah / Hilda / Holda | h?lda*
Jesse / Jessie / Jess | jess*
Elizabeth / Eliza | eliza*
Ann / Anne / Anna | ann*
Marion / Marian | mari?n
Solomon / Solmon / Salomon / Salmon | s?l*mon
Phebe / Phoebe | ph*be
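Before running a wildcard string against a large index, it can be worth confirming that it really covers every spelling you have in mind. The sketch below does that for a handful of the strings above, again using Python's `fnmatch` as a stand-in for the ? and * rules; the `misses` helper and the `TABLE` name are hypothetical.

```python
from fnmatch import fnmatchcase

# A few wildcard strings and the spellings each one is meant to catch.
TABLE = {
    "l?p*n": ["Leppin", "Leppen", "Leppon", "Lippin", "Lippen", "Lippon"],
    "miser*ino": ["Miserentino", "Miserintino", "Miserendino", "Miserindino"],
    "joh*n*": ["John", "Johan", "Johann", "Johnny", "Johnnie"],
    "?ath*rin*": ["Kathrine", "Katherine", "Catherine", "Cathrine", "Katherina"],
    "s?l*mon": ["Solomon", "Solmon", "Salomon", "Salmon"],
    "ph*be": ["Phebe", "Phoebe"],
}


def misses(pattern, names):
    """Return the spellings the wildcard string would fail to pick up."""
    return [n for n in names if not fnmatchcase(n.lower(), pattern)]


report = {pattern: misses(pattern, names) for pattern, names in TABLE.items()}
for pattern, missed in report.items():
    print(pattern, "misses:", missed)
```

An empty list next to a string means it covers every spelling listed for it. The same helper also confirms the narrower strings: `misses("franc?s", ["Frank"])` reports Frank as missed, which is the point of the “(only)” row.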

Tip:  If you have an O prefix, add an asterisk after the “o” and it will pick up those indexed with and without an apostrophe.  Unfortunately, in order to search for prefixes (O, Mc, Mac) that may or may not include a space, you’ll have to run a search with and without a space in order to pick each type up in the index (an asterisk will not pick the space up—maybe someday it will—hint, hint).

Another Tip:  Use wildcards to not only represent variants of spelling, but also in terms of what’s legible and what’s not.  Characters can be misread and therefore misindexed, such as “a” and “o” or “T” and “F” or “p” and “f”.
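As a rough sketch of that idea, the Python function below swaps easily misread letters for question marks and refuses to build a string that leaves fewer than three real letters. The name `mask_misreads` is hypothetical, and which letters count as easily misread is for you to decide based on the handwriting.

```python
def mask_misreads(name: str, confusable: str) -> str:
    """Replace every letter found in `confusable` with a ? wildcard."""
    risky = set(confusable.lower())
    pattern = "".join("?" if c in risky else c for c in name.lower())
    literals = sum(1 for c in pattern if c != "?")
    if literals < 3:
        # Mirrors the rule that a search needs at least three non-wildcard characters.
        raise ValueError(f"{pattern!r} leaves fewer than three letters")
    return pattern


print(mask_misreads("Tarr", "tf"))   # covers a T misread as F
print(mask_misreads("Phebe", "pf"))  # covers a p misread as f
```

Masking both the “T” and the “a” in Tarr would leave only two letters, so in that case you would run one search per suspect letter instead.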

While I can’t cover every possible wildcard search in this post, this should give you a better idea of how and when to use wildcards to maximize your search.

To Be Continued…

That’s it for now.  In Part 3 we’ll look at targeting your search results.

3 comments:

Mary said...

Thanks for the tip on checking the exact match box and going for wild cards instead. I had always been afraid I might miss something, but, you're right, you do get some weird inclusions.

Looking forward to the next installment,

Mary

Sheri Fenley said...

Julie this is great information. Thanks for sharing and I am looking forward to the next installment.

Kenneth W. Spangler said...

Julie,
Thanks for these posts. I'm sure that they will help me out in my future searches. I didn't know about the ? wildcard!

  © Copyright 2008~2013. All rights reserved.
