Building Long-Tail Keyword Lists for Programmatic SEO

This article was co-written with 

For most marketers, getting traffic via SEO is a game of finding keywords with good volume and not too much difficulty, then outsourcing the writing to some low-cost country. However, when you look at companies that grew primarily through organic traffic, it’s almost never down to this strategy.

https://review.firstround.com/drive-growth-by-picking-the-right-lane-a-customer-acquisition-playbook-for-consumer-startups 

Very few of the big winners in the content space grew via blogging: the majority use some form of programmatic SEO. This means building a platform for machine-generated and/or user-generated content, rather than doing all the writing in-house. For example, Thumbtack asks its tradespeople questions during onboarding that are designed to get unique content out of them, so the pages will rank.

https://www.slideshare.net/bernardjhuang/programmatic-seo-bernard-huang-500-startups-distro-dojo

Most of us don’t have millions of users willing to write for us, so we rely more heavily on machine-generated content to fill our pages. For example, I worked on a programmatic SEO project for a client in the salary negotiation space, where we scraped a list of all the recruiters that work in the tech industry from LinkedIn and built out pages for them on our website.

The key insight we got from talking to our customers was that they often Googled the name of any recruiter that reached out to them, in order to check if they were ‘legit’. So we programmatically built over 11,000 pages, one for each recruiter as well as category pages for “recruiters at [company]” and “tech recruiters in [city]”.
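To make the page structure concrete, here is a minimal Python sketch of how one page per recruiter, plus the two category page types, could be derived from a list of records. The sample records, field names, and URL paths are hypothetical placeholders, not the actual data or site structure from the project.

```python
import re
from collections import defaultdict

# Hypothetical sample records; the real project used a much larger list.
recruiters = [
    {"name": "Jane Doe", "company": "Acme Corp", "city": "San Francisco"},
    {"name": "John Smith", "company": "Acme Corp", "city": "New York"},
    {"name": "Ana Lopez", "company": "Globex", "city": "San Francisco"},
]


def slugify(text):
    """Lowercase the text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_page_index(records):
    """Return a dict mapping URL path -> list of recruiter names on that page."""
    pages = defaultdict(list)
    for r in records:
        # One profile page per recruiter.
        pages[f"/recruiters/{slugify(r['name'])}"].append(r["name"])
        # Category page: "recruiters at [company]".
        pages[f"/recruiters-at/{slugify(r['company'])}"].append(r["name"])
        # Category page: "tech recruiters in [city]".
        pages[f"/tech-recruiters-in/{slugify(r['city'])}"].append(r["name"])
    return dict(pages)


page_index = build_page_index(recruiters)
for path, names in sorted(page_index.items()):
    print(path, "->", ", ".join(names))
```

Three records produce seven pages here (three profiles, two companies, two cities), which is why category pages scale so cheaply once the underlying list exists.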

https://www.saxifrage.xyz/post/content-marketing 

The resulting uplift in SEO traffic was enormous, driving tens of thousands of visits a month once we had built up enough domain authority (we had a viral campaign in June, resulting in 2,000+ backlinks, after which all our pages started ranking). This recruiter database became an asset that brought in a reliable volume of traffic and leads every week, and helped the company survive through COVID.

The key here is that we didn’t have to write any of those 11,000 pages ourselves, or even pay someone to write them! We used simple rules-based templates, filling the pages with content from each recruiter’s LinkedIn information. Machine-generated content has become significantly easier to produce with advancements in AI, so today it’d be trivial to build something even better by getting a tool like byword.ai to write all the content for you.
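As a rough illustration of what a rules-based template can look like, the Python sketch below fills a sentence template from one record. The template wording, the field names, and the sample record are all assumptions made for this example, not the templates we actually used.

```python
from string import Template

# Hypothetical template wording; a real page would use more fields and variants.
PROFILE_TEMPLATE = Template(
    "$name is a $title at $company, based in $city. "
    "$first_name has worked in recruiting since $start_year."
)


def render_profile(record):
    """Fill the template from one record, or return None if a field is missing."""
    try:
        fields = dict(record, first_name=record["name"].split()[0])
        return PROFILE_TEMPLATE.substitute(fields)
    except (KeyError, IndexError):
        # A missing field would leave a hole in the copy, so skip the page instead.
        return None


# Made-up record for illustration only.
sample = {
    "name": "Jane Doe",
    "title": "Technical Recruiter",
    "company": "Acme Corp",
    "city": "San Francisco",
    "start_year": 2015,
}
print(render_profile(sample))
```

Returning None for incomplete records is one possible rule: it keeps thin or broken pages out of the site rather than publishing a sentence with gaps in it.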

Long-Tail Keyword Research

Programmatic SEO is a completely different game to blogging. You don’t hand-select keywords with high volume and low difficulty: often the keywords you’re ranking for don’t even show up as having traffic in Ahrefs and other keyword tools! These tools are notoriously unreliable for low-volume “long tail” keywords, because their sample of internet traffic is too small to pick up on them.

Rather than being a disadvantage, this is a huge advantage! If every other marketer with an Ahrefs license is targeting the keywords they know have high volume and low difficulty, those keywords will quickly become saturated. Keywords that don’t show up, by contrast, fly under the radar and get ignored, which keeps them low competition and easy to rank for. Individually they might not contribute much traffic, but in aggregate the long tail is often cited as adding up to over 70% of all search traffic, thanks to the power law distribution we always see in search.
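To see why a long tail of tiny keywords can outweigh the head terms, here is a toy Python calculation. It assumes search volume follows a simple Zipf-style power law (volume proportional to 1 / rank), which is an illustrative assumption rather than a measurement of real search data; the keyword count and head sizes are arbitrary.

```python
def tail_share(total_keywords, head_size, exponent=1.0):
    """Share of total volume held by keywords ranked below the top `head_size`.

    Assumes volume at rank r is proportional to 1 / r**exponent (Zipf-like).
    """
    weights = [1 / rank ** exponent for rank in range(1, total_keywords + 1)]
    return sum(weights[head_size:]) / sum(weights)


# Arbitrary sizes chosen only to show the shape of the curve.
for head in (10, 100, 1000):
    share = tail_share(total_keywords=100_000, head_size=head)
    print(f"Keywords outside the top {head}: {share:.0%} of total volume")
```

The exact percentages depend entirely on the assumed exponent and the number of keywords, so treat the output as a way to build intuition rather than as a forecast.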

The key to finding a suitable category of keywords is having a deep understanding of search intent, user behavior, and how people talk about the topic at hand. You can often find examples in talking to customers that you would never pick up from a keyword tool, just like we did with our recruiter website. You can also rely on your domain knowledge to come up with these niche keyword ideas.

I’ll give you an example from the travel space, where I did some work for a flight comparison tool. Travel is a highly competitive industry, and you can’t expect to rank on “cheap flights” (head term) or “cheap flights to barcelona” (mid term) without a serious budget and lots of patience. However, there’s far less competition for airport codes, like “cheap flights to BCN”, because individually most don’t get much traffic. Importantly, there’s a long list of these airport codes, so having a page for every code can add up to a lot of search traffic in aggregate.
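Turning a list like this into page targets is mostly string formatting. The Python sketch below expands a keyword pattern over a handful of airport codes; the URL scheme and title format are hypothetical, and a real build would load the full list of codes from a data file instead of a hard-coded dict.

```python
# A few IATA codes for illustration; a full list would cover thousands of airports.
AIRPORTS = {
    "BCN": "Barcelona",
    "LHR": "London Heathrow",
    "JFK": "New York JFK",
}


def build_keyword_targets(airports, pattern="cheap flights to {code}"):
    """Return one target (keyword, URL path, page title) per airport code."""
    targets = []
    for code, city in sorted(airports.items()):
        targets.append({
            "keyword": pattern.format(code=code),
            "url": f"/flights/{code.lower()}",  # hypothetical URL scheme
            "title": f"Cheap flights to {city} ({code})",
        })
    return targets


targets = build_keyword_targets(AIRPORTS)
for target in targets:
    print(target["keyword"], "->", target["url"])
```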

The other consideration that makes this work is that the category must be something people who buy your product would be interested in, otherwise the traffic won’t be worth anything. So a salary negotiation service ranking on recruiter names, or a cheap flight provider ranking on airport codes, both make sense, because anyone searching for these terms may also be interested in their product.

However, if you were a finance app and started ranking on “barbeque recipes” (a real example!), it’s likely that traffic would be worthless commercially. Just because people are interested in barbeque food, it doesn’t mean that they’ll be interested in your financial products. Yes, technically they buy the food with money, but that’s too weak a signal to qualify a valuable customer. There’s no reason why barbeque lovers would be any more interested in your product than the average person. More relevant would be pages listing foreign exchange rates, or stock price movements.

How to Find Ideas for Programmatic SEO

Sometimes you’re just lucky enough to have an idea come to you in a customer interview, or you invent something based on your domain knowledge, but what can you do to actively generate new ideas for programmatic SEO? The process I’ve found most useful is something I call “meme mapping”: labeling keywords by attribute and type, in order to spot patterns that lead to good programmatic SEO ideas.

Meme mapping works like this:

  a) Build a database of relevant keywords
  b) Use inductive coding to tag keywords
  c) Investigate any useful patterns you find

This process is great for situations where you don’t have a good idea and need a relatively certain bet that if you put in the work, you’ll arrive at something you can use. Here’s a brief run-through of how it works:

a) Build a database of relevant keywords

Start with the main head term for your industry, and plug it into a keyword tool like Ahrefs. Pick whichever keyword has the most commercial relevance: the one you would love to rank for, but can’t because of high competition. Instead of searching for a head term, you can also just look at what keywords your major competitors are ranking for and start there.

Example: for a high-end fashion retailer you might choose just the word “fashion”, for which Vogue ranks number 1, and difficulty is 93/100.
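If you export the results, a few lines of Python are enough to turn them into a working keyword database. The sketch below is only an assumption about what such an export might look like: the column names (keyword, volume, difficulty) are hypothetical, and apart from the 93/100 difficulty mentioned above, the rows and numbers are made-up placeholders rather than real keyword data.

```python
import csv
import io

# Made-up rows standing in for a keyword tool export. Column names and
# volumes are placeholders, not real Ahrefs data.
SAMPLE_EXPORT = """keyword,volume,difficulty
fashion,100000,93
80s fashion,5000,40
90s fashion,6000,42
mens fashion,9000,55
1920s fashion,800,20
vogue magazine,3000,70
"""


def load_keywords(csv_file, must_contain):
    """Read keyword rows, keep those containing the head term, sort by volume."""
    rows = []
    for row in csv.DictReader(csv_file):
        keyword = row["keyword"].strip().lower()
        if must_contain in keyword:
            rows.append({
                "keyword": keyword,
                "volume": int(row["volume"]),
                "difficulty": int(row["difficulty"]),
            })
    rows.sort(key=lambda r: r["volume"], reverse=True)
    return rows


keywords = load_keywords(io.StringIO(SAMPLE_EXPORT), must_contain="fashion")
for kw in keywords:
    print(kw["keyword"], kw["volume"], kw["difficulty"])
```

In practice you would pass an opened export file instead of the inline sample, and adjust the column names to whatever your tool actually produces.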

b) Use inductive coding to tag keywords

Next, go through and start looking for different ‘types’ of keywords that emerge. You can either just eyeball it if you’re only casually looking, or you can export this keyword data into Excel / GSheets / Notion, and specifically label what you find. It’s called inductive coding because it’s a ground-up approach where you derive your labels (codes) from the data, without preconceived notions about what they might be.

Example: we can clearly see that there are a surprising number of “[decade] fashion” keywords here, and suspect there’s a long list of less popular decades.
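Deriving the codes is a manual, read-the-data job, but once a few have emerged you can apply them across the whole list automatically. The Python sketch below assumes a small hand-built codebook of regular expressions; the code names and patterns are examples invented for this illustration, and keywords that match nothing are the ones worth reading again for new codes.

```python
import re

# Codes derived by reading the keyword list first; these patterns are examples.
CODEBOOK = {
    # Matches "80s fashion" as well as "1920s fashion".
    "decade": re.compile(r"\b(?:\d{2}|\d{4})s\b"),
    # Matches "mens fashion", "women fashion", "kids fashion".
    "audience": re.compile(r"\b(?:mens?|womens?|kids)\b"),
}


def tag_keyword(keyword, codebook=CODEBOOK):
    """Return the list of codes whose pattern matches the keyword."""
    text = keyword.lower()
    return [code for code, pattern in codebook.items() if pattern.search(text)]


for kw in ["80s fashion", "1920s mens fashion", "fashion week"]:
    print(kw, "->", tag_keyword(kw) or "untagged")
```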

c) Investigate any useful patterns you find

Now that you have either a database of tags, or an idea of what types of keywords are suitable, you can switch to active investigation of any patterns that emerge. Go back over your list to check for any miscategorized keywords that should match this label, and go searching for more examples as well as any evidence they might make a good target.

Example: there are definitely certain decades that are more popular than others (thanks to movies?) but also plenty of long tail traffic for early (and current) decades.
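One simple way to investigate a pattern is to total up the keywords and volume behind each tag, then expand the promising pattern into candidates you haven’t seen yet. The Python sketch below does both for the decade example; the tagged rows and their volumes are made-up placeholders, and the range of decades is an arbitrary assumption.

```python
from collections import defaultdict

# Made-up tagged rows; volumes are placeholders, not real keyword data.
tagged = [
    {"keyword": "80s fashion", "volume": 5000, "tags": ["decade"]},
    {"keyword": "90s fashion", "volume": 6000, "tags": ["decade"]},
    {"keyword": "1920s fashion", "volume": 800, "tags": ["decade"]},
    {"keyword": "mens fashion", "volume": 9000, "tags": ["audience"]},
    {"keyword": "fashion week", "volume": 7000, "tags": []},
]


def summarize_by_tag(rows):
    """Count keywords and sum volume for each tag ("untagged" if none)."""
    summary = defaultdict(lambda: {"keywords": 0, "total_volume": 0})
    for row in rows:
        for tag in row["tags"] or ["untagged"]:
            summary[tag]["keywords"] += 1
            summary[tag]["total_volume"] += row["volume"]
    return dict(summary)


def expand_decades(pattern="{decade}s fashion", start=1900, end=2020):
    """Generate a candidate keyword for every decade from start to end."""
    return [pattern.format(decade=year) for year in range(start, end + 1, 10)]


summary = summarize_by_tag(tagged)
candidates = expand_decades()
print(summary["decade"])
print(candidates)
```

The candidate list can then be checked against search results or your keyword tool to see which decades already have strong pages and which are wide open.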

Conclusion

Meme mapping is a productive way to force the kind of insights you’d otherwise only stumble on over time. Throughout your career you develop domain expertise, having seen all sorts of patterns in your work or in talking to customers. You can speed up that process intentionally by actively reviewing and labeling samples until patterns begin to emerge, and you spot a good opportunity. If you’re interested in learning more, it’s covered in my book, “Marketing Memetics: Reliable Brand Performance Through Reverse-Engineering Creativity”.

Once you have noticed a recurring pattern in what users search for, you can take advantage of it using programmatic SEO. Generating thousands of pages for long tail keywords using machine-generated or user-generated content will offer you the sort of scale you need to become one of the big winners in content, or just to do more with fewer resources. With advancements in AI writing tools, the ability to execute on programmatic SEO just got much easier, and it’s only a matter of time before these niches start getting saturated.
