Generating a prompt for a research agent for a locality guide
UPDATE: Impressive but imperfect; if you test on a place you know well, you’ll see the cracks more quickly.
As a follow-up to yesterday’s first glimpse at the new research agents, here is a prompt chain (a sequence or workflow of AI chats to produce something useful) to generate a locality guide (hat tip to two researchers, Mary and Simon, for quickly highlighting this use case).
A locality guide is your personalized genealogical roadmap, detailing key resources, records, and historical insights specific to a particular geographic area, helping you efficiently navigate your ancestral research in that location. The new class of research agents such as OpenAI’s Deep Research combine strong language models, reasoning, and internet access. Though folks have used chatbots in the past to generate locality guides, the new research agents offer the promise of improved results. (In addition to the concern about hallucinated results, the issue of source selection is still undefined.)
As is often the case, there are many ways one might use AI to generate a locality guide. The simplest might be single prompt along the lines of, “Generate a locality guide for Ashe County, North Carolina.” One could even prompt for a short list of the five or ten most useful resources.
On the other hand, a meaningful process of building a locality guide will work, too. The process illustrated here is to step the model through an understanding of locality guides in general, then an introduction to a specific locality (here, Ashe County, NC, my ancestral homeland), then generating a research prompt to assign to the research agent; the research agent then assembles the locality guide as instructed from the longer, AI-generated assignment. This process generates a more comprehensive locality guide.
What you see here is a several-step process. Most of the steps took just a few seconds to complete; the research step, however, took over 14 minutes while OpenAI’s Deep Research consulted 45 sources and generated the locality guide. This guide was prompted and generated using one model, GPT-4.5, though using different models for different steps will be an option for folks with access to different models and research agents.
- You can read the whole ChatGPT GPT-4.5 and Deep Research chat here:
https://chatgpt.com/share/67c21419-6434-8004-9945-ab4d0d274106. - You can download and read the whole 35-page Ashe County, NC Locality Guide here:
https://drive.google.com/file/d/1U4uJtHR4rmYvXc7Zs2yaezSHHucn7hqx/view?usp=sharing
And here are the prompts used to generate the guide. Remember, the purpose of breaking-down the report generation into several steps is to use the output response of a first prompt to get a better result in the following prompts.
First prompt:
<prompt>
Describe a 'locality guide' in a sentence, as if you were a professional genealogist speaking with new family historians.
</prompt>
This first prompt enters into the context window a clear understanding of a locality guide; you can edit the result before moving forward if you wish to further steer the process.
The next prompt is:
<prompt>
List must-have resources for a locality guide.
</prompt>
If you want to attempt a more comprehensive guide, you may want to re-phrase this prompt. It is also interesting to see how different models respond to this prompt. Here was the response I received from GPT-4.5:

The next prompt introduces the specific location under consideration as a form of confirmation and reinforcement.
<prompt>
Describe Ashe County, North Carolina, in a sentence, as if you were a professional genealogist speaking with new family historians.
</prompt>
To which the model responded:
Ashe County, nestled in the mountainous northwestern corner of North Carolina along the Virginia and Tennessee borders, offers genealogists rich opportunities through its extensive records of early Appalachian families, historical land grants, and a strong heritage of Scots-Irish, English, and German settlers.
OpenAI’s ChatGPT GPT-4.5
Where the Magic Happens
Up to this point, we have been “setting the table,” so to speak, preparing the background for our work. I love the image of Bob Ross teaching painting as a metaphor for prompt engineering; crafting the context of an LLM chat is a bit like laying down the background of a painting. To this point, we have been building up the “context window” (the “working memory” of the language model). By explicitly prompting the model to respond with focused information, we have been crafting the language-environment, the verbal background, in which the model will next respond (real geeks can read more about LLM ‘latent space’ and ‘neural networks’). The next two prompts do the heavy lifting; the next prompt crafts a research assignment for the research agent, and prompt following that actually instructs the research agent, Deep Research.
Here is the assignment prompt. I’m a bit proud of this one, not having seen anyone else doing this:
<prompt>
You are the assignment editor for a large publishing platform; craft a research assignment for an internet-enabled LLM, an agentic (iterative) research model; the research product will be a resource for many later writing projects, from genealogical reports to history essays, to social media posts; the topic is: Locality Guide for Ashe County, North Carolina; wrap your assignment in <prompt> tags.
</prompt>
Researchers may appreciate that this prompt can be adapted to any subject by changing the topic. The response from this prompt will become the assignment passed along to the research agent.
Here, then, is the research assignment we have generated; it will also be the prompt we submit to the research agent:
<prompt>
You are tasked with conducting comprehensive iterative research to develop an in-depth **Locality Guide for Ashe County, North Carolina**. Your research should thoroughly identify, document, and annotate essential resources across various genealogical and historical categories. The collected information will serve as foundational raw material for future genealogical reports, historical essays, articles, and social media content.
**Research Objectives:**
1. **Historical Overview:**
Provide a brief but authoritative historical overview of Ashe County, highlighting key settlement patterns, historical events, migrations, cultural groups, boundary changes, and significant historical periods.
2. **Record Types and Availability:**
For each type of genealogically relevant record listed below, identify primary repositories, online databases, collections, access limitations, date ranges, gaps, and any unique features:
- Vital Records (birth, marriage, death)
- Census Records (Federal and State)
- Probate and Court Records
- Land, Property, and Deed Records
- Military Records (Revolutionary War through modern eras)
- Church and Cemetery Records (major denominations, congregations, burial grounds)
- Newspapers, Obituaries, and Local Publications
- Immigration, Migration, and Naturalization Records
- Specialized Records (Tax Lists, School Records, Voting Records)
3. **Repositories and Organizations:**
Compile a comprehensive list of archives, libraries, genealogical/historical societies, and other institutions holding primary or secondary records related to Ashe County. Provide addresses, websites, contact information, hours of operation, and guidance on accessing their holdings.
4. **Maps and Geographic Resources:**
Identify significant maps (historical and contemporary) showing township boundaries, migration routes, geographic landmarks, historical sites, and boundary changes affecting Ashe County research.
5. **Online and Digital Resources:**
Curate and summarize online databases, digitized collections, and free or subscription-based websites specific to Ashe County genealogy and history, with annotations on ease-of-use, coverage, quality, and reliability.
6. **Local Historical Context:**
Summarize notable families, community traditions, ethnic groups, economic activities, and local industries historically significant to Ashe County that might aid genealogical or historical writing.
**Deliverable Expectations:**
- Structure your research clearly by the categories above, providing detailed notes, hyperlinks to primary online resources, clearly cited sources, and dates of last updates or access where appropriate.
- Highlight areas where resources are scarce or records gaps exist, and suggest alternative strategies or locations for overcoming research obstacles.
- Prioritize accuracy, completeness, and ease of understanding, ensuring the guide can serve as a trustworthy reference for diverse audiences, from amateur family historians to professional researchers.
Your research will proceed iteratively, beginning broadly and then progressively deepening to refine, correct, and expand findings based on your evaluation and analysis of the gathered data.
</prompt>
The locality guide that OpenAI’s Deep Research generated was about 33 pages. The research agent consulted 45 sources over 14 minutes while generating the report. The model returned linked source citations, albeit in its default format; further work may explore citation formatting options. (Again, an acknowledged and on-going concern is which sources were consulted, and which were not.) After the report was generated, I prompted the model to write an Introduction and a succinct Table of Contents; I did this afterward so that it would mirror the then-created report (rather than before generating the report, which would have guided the report writing).

You can read the full 35-page report here. As always, you are allowed, invited, and encouraged to use and modify these prompts to fit your own needs. And keep in mind that not all research agents are equal; your results will vary widely depending on the model that you use. And today, OpenAI’s Deep Research is the best-in-class; Google’s Gemini 1.5 Pro with Deep Research is gaining a reputation for (unsurprisingly) consulting more resources, while xAI’s Grok DeepSearch’s strength is access to real-time results (breaking news, etc.).
This work is released by Steve Little under a Creative Commons BY-NC 4.0 License.