
How to Block AI Crawlers and Bots on Your WordPress Site

Sean Doyle
April 24, 2026

WordPress sites attract more automated traffic than most owners realize. On a typical day, a significant share of requests hitting your server come not from human visitors but from bots, crawlers, and automated agents. Most are harmless. Some are not. And a growing category, AI training crawlers, sits in a gray area worth understanding before you decide how to handle it.

This article covers what AI crawlers and bad bots actually are, how to identify them on your site, and how to block them effectively using Botcrawl Bot Blocker.

What Are AI Crawlers?

AI crawlers are automated programs that systematically browse the web to collect content for training large language models and other AI systems. Companies like OpenAI, Google, Apple, Meta, and dozens of smaller AI labs run their own crawlers. They harvest text, images, and structured data from public websites at scale.

Unlike traditional search engine bots, which index your content to send you traffic, AI training crawlers take your content and use it to build commercial products. You get nothing in return. No traffic. No attribution. No compensation.

Common AI crawlers include GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended, CCBot (Common Crawl), Bytespider (ByteDance), and many others. The list grows every month.

What Are Bad Bots?

Beyond AI crawlers, WordPress sites face a range of other automated threats. Bad bots include scrapers that steal your product descriptions and blog content to republish elsewhere, vulnerability scanners probing for outdated plugins and themes, credential stuffing bots attempting brute-force logins, and spam bots flooding your comment sections and contact forms.

These bots consume server resources, inflate your bandwidth usage, skew your analytics, and in some cases actively exploit your site. A site under heavy bot traffic can experience slower load times and higher hosting costs with no corresponding benefit.

How to Identify Bot Traffic on Your Site

Before you block anything, it helps to know what you are dealing with. A few reliable signals of heavy bot traffic include unusually high page views with very low session durations, traffic spikes with no referral source, repeated requests to the same URLs in your server logs, and a high volume of requests from data center IP ranges.

Your server access logs are the most accurate source. Look for user agent strings that identify known crawlers. Many AI crawlers identify themselves honestly. Others do not, which is part of the problem.
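As a starting point, you can count crawler hits directly from your access log. The sketch below assumes the common Apache/nginx combined log format, where the user agent is the last double-quoted field; the crawler names match those listed earlier in this article.

```python
# Minimal sketch: count known AI crawler user agents in access log lines.
# Assumes Apache/nginx combined log format (user agent is the last
# double-quoted field on each line).
import re
from collections import Counter

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "Google-Extended", "CCBot", "Bytespider"]

def count_ai_crawlers(log_lines):
    """Return a Counter of AI crawler hits found in the log lines."""
    hits = Counter()
    for line in log_lines:
        # Grab the last quoted field, which holds the user agent string.
        quoted = re.findall(r'"([^"]*)"', line)
        agent = quoted[-1] if quoted else ""
        for bot in AI_CRAWLERS:
            if bot.lower() in agent.lower():
                hits[bot] += 1
    return hits

sample = [
    '1.2.3.4 - - [24/Apr/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '5.6.7.8 - - [24/Apr/2026:10:00:01 +0000] "GET /blog HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]
print(count_ai_crawlers(sample))  # Counter({'GPTBot': 1})
```

Pointing this at a real log file (one line per entry) gives you a quick sense of which self-identifying crawlers are hitting your site and how often.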

How to Block AI Crawlers and Bad Bots

There are several approaches to blocking unwanted bot traffic on WordPress. Each has tradeoffs.

robots.txt

The robots.txt file lets you instruct crawlers not to access certain parts of your site. You can add disallow rules for specific bots by user agent. For example, adding User-agent: GPTBot followed by Disallow: / tells OpenAI's crawler to stay off your site entirely.
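The rules above look like this in practice. The user agent tokens shown are the ones these vendors publish for their crawlers; add or remove groups as needed.

```
# robots.txt — block common AI training crawlers site-wide
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /
```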

The limitation is that robots.txt is advisory. Well-behaved crawlers respect it. Malicious scrapers and many bad bots ignore it entirely.

Server-Level Blocking

For more reliable enforcement, you can block bots at the server level using .htaccess rules on Apache or nginx configuration blocks. This stops requests before they reach WordPress at all. It is effective but requires manual maintenance as new bots emerge.
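On Apache, a user-agent block might look like the following. This is a sketch assuming mod_rewrite is enabled; the bot list is illustrative and needs updating as new crawlers appear.

```
# .htaccess — return 403 to selected AI crawler user agents (Apache, mod_rewrite)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (GPTBot|ClaudeBot|CCBot|Bytespider) [NC]
RewriteRule .* - [F,L]
```

The [NC] flag makes the match case-insensitive, and [F] sends a 403 Forbidden before WordPress loads. Note that this only catches bots that identify themselves honestly in the user agent header.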

Using Botcrawl Bot Blocker

Botcrawl Bot Blocker handles all of this from inside WordPress without requiring server access or manual configuration. It maintains a continuously updated list of known AI crawlers, scrapers, and bad bots, and blocks them automatically based on user agent, IP reputation, and behavioral signals.

When a request comes in, Botcrawl Bot Blocker checks it against its ruleset before WordPress processes it. Blocked bots receive a 403 response and never touch your content. Legitimate visitors and search engines pass through without any impact on their experience.
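Conceptually, that check follows a simple pattern. The sketch below is an illustration of the general approach, not Botcrawl Bot Blocker's actual code; the agent lists are placeholders.

```python
# Illustrative only (not the plugin's actual implementation): check a
# request's user agent against an allowlist and a blocklist, deciding
# the HTTP status before the application processes the request.
BLOCKED_AGENTS = ["GPTBot", "ClaudeBot", "CCBot", "Bytespider"]
ALLOWED_AGENTS = ["Googlebot", "Bingbot"]  # verified search crawlers

def check_request(user_agent: str) -> int:
    """Return the HTTP status this request should receive: 403 or 200."""
    agent = user_agent.lower()
    # Whitelisted search engine crawlers are never blocked.
    if any(ok.lower() in agent for ok in ALLOWED_AGENTS):
        return 200
    # Known bad or AI training crawlers get a 403 Forbidden.
    if any(bad.lower() in agent for bad in BLOCKED_AGENTS):
        return 403
    return 200

print(check_request("Mozilla/5.0 (compatible; GPTBot/1.0)"))    # 403
print(check_request("Mozilla/5.0 (compatible; Googlebot/2.1)")) # 200
```

The allowlist is checked first, which mirrors the design described below: verified search engine crawlers pass regardless of other rules.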

Key features include a real-time block log so you can see exactly what is being stopped, granular controls to allow or deny specific bots by name, rate limiting to catch bots that rotate user agents, and a regularly updated bot signature database maintained by the Sera team.

Setup takes under two minutes. Install the plugin, activate it, and the default ruleset starts blocking known bad actors immediately. You can review the block log from your WordPress dashboard and adjust rules as needed.

What About Legitimate Crawlers?

Not all bots should be blocked. Googlebot, Bingbot, and other search engine crawlers are essential for your SEO. Botcrawl Bot Blocker ships with a whitelist of verified search engine crawlers that are never blocked, regardless of your other settings. You can also add custom whitelist entries for any crawler you want to allow.

Protecting Your Content Going Forward

The volume of AI crawler traffic is only going to increase. As more companies build AI products that depend on web-scraped training data, the pressure on individual site owners grows. Blocking these crawlers now is a reasonable step to protect the content you have invested time and resources into creating.

Botcrawl Bot Blocker gives you a practical, low-maintenance way to do that from within WordPress. You do not need to touch server configuration files or maintain your own bot lists. The plugin handles it and keeps the ruleset current as the bot landscape evolves.

You can find Botcrawl Bot Blocker in the Sera plugin catalog along with documentation covering all configuration options.

Tags: AI Crawlers, Bot Blocking, WordPress Security, robots.txt, Server Performance, Bad Bots
Written by
Sean Doyle

Founder of Sera.guru and developer of the Sera plugin suite for WordPress.