How to See How Many Times ChatGPT Hits Your WordPress Site

If you run a WordPress blog today, you are almost certainly being crawled by OpenAI’s OAI-SearchBot, the crawler that indexes public pages for ChatGPT’s search features. Many site owners do not realize how often their content is fetched, or which posts the crawler requests most.

In my case, after reviewing my server logs, I discovered that ChatGPT was crawling my site far more aggressively than expected. This article explains how you can check the same on your own WordPress installation using only your Apache logs and a few command‑line tools. No plugins or PHP changes required.


1. Confirm You Are Running Apache

First, verify that Apache is your active web server:

ps aux | grep -E "nginx|apache|httpd"

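If the output includes a line such as /usr/sbin/apache2 -k start, Apache is running. Ignore the line for the grep command itself, which also matches the pattern.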
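As a minimal sketch of the same check without the stray grep match, the helper below classifies a list of process names. The function name detect_web_server is hypothetical, and it assumes the process is named apache2, httpd, or nginx, which may not hold on every distribution.

```shell
# Hypothetical helper: reads one process name per line on stdin and
# reports which web server, if any, appears in the list.
detect_web_server() {
    awk '
        /^(apache2|httpd)$/ { apache = 1 }
        /^nginx$/           { nginx = 1 }
        END {
            if (apache) print "apache"
            if (nginx)  print "nginx"
            if (!apache && !nginx) print "none"
        }'
}

# Usage (prints process names only, so grep never matches itself):
#   ps -eo comm= | detect_web_server
```

Because it matches whole process names, a command line that merely mentions "apache" is not counted.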


2. Locate Your Apache Log Files

On Debian and Ubuntu, Apache access logs are usually in:

/var/log/apache2/access.log
/var/log/apache2/access.log.1
/var/log/apache2/access.log.X.gz

To list them:

ls -lh /var/log/apache2/
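Any count taken from these files only covers the period the logs retain, so it helps to know where each file starts. The sketch below prints the first timestamp in each log. It assumes GNU gzip (where zcat -f passes uncompressed files through unchanged) and the default Apache log layout with the timestamp in the fourth field; log_start_times is a hypothetical name.

```shell
# Hypothetical helper: print the first timestamp found in each log file.
# Assumes the timestamp is field 4, e.g. [05/Jun/2025:10:00:00
log_start_times() {
    for f in "$@"; do
        [ -r "$f" ] || continue   # skip missing or unreadable files
        first=$(zcat -f "$f" 2>/dev/null | head -n 1 | awk '{ print substr($4, 2) }')
        printf '%s  %s\n' "${first:-empty}" "$f"
    done
}

# Usage:
#   log_start_times /var/log/apache2/access.log*
```

Reading /var/log/apache2 often requires root or membership in the adm group, so you may need sudo.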


3. Count How Many Times ChatGPT (OpenAI SearchBot) Hit Your Site

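Use these commands to count OpenAI hits in the current log, the previous log, and the compressed archives. Matching on "openai" catches any log line that mentions OpenAI, so the count covers all of OpenAI’s crawlers (OAI-SearchBot, GPTBot, ChatGPT-User) rather than OAI-SearchBot alone:

grep -i "openai" /var/log/apache2/access.log 2>/dev/null | wc -l && \
grep -i "openai" /var/log/apache2/access.log.1 2>/dev/null | wc -l && \
zgrep -i "openai" /var/log/apache2/access.log*.gz 2>/dev/null | wc -l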

To calculate a combined total:

echo $(( \
  $(grep -i "openai" /var/log/apache2/access.log 2>/dev/null | wc -l) + \
  $(grep -i "openai" /var/log/apache2/access.log.1 2>/dev/null | wc -l) + \
  $(zgrep -i "openai" /var/log/apache2/access.log*.gz 2>/dev/null | wc -l) \
))

Example output:

146
382
2687
3215

The first three numbers are the per-file counts and the last is the combined total. Here, OpenAI’s crawlers made 3,215 requests over the period the logs cover.
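Since a match on "openai" lumps several crawlers together, a per-agent breakdown can be useful. The following is a sketch under stated assumptions: the three agent names are the ones OpenAI documents at the time of writing, zcat -f is GNU gzip, and the function names are hypothetical.

```shell
# Hypothetical helper: total lines mentioning "openai" across plain and
# gzipped logs in a single pipeline.
count_openai_hits() {
    zcat -f "$@" 2>/dev/null | grep -ci "openai"
}

# Hypothetical helper: hits per OpenAI user agent, busiest first.
# A line is counted once per agent name it contains, so a request whose
# URL or referrer also contains one of these names would be overcounted.
count_openai_agents() {
    zcat -f "$@" 2>/dev/null \
        | grep -oE 'GPTBot|OAI-SearchBot|ChatGPT-User' \
        | sort | uniq -c | sort -nr
}

# Usage:
#   count_openai_hits   /var/log/apache2/access.log*
#   count_openai_agents /var/log/apache2/access.log*
```

If OAI-SearchBot is the only crawler you care about, the per-agent figure is the one to compare against the per-article counts in the next step.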


4. Count How Many Hits Each WordPress Article Received

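This command scans all Apache logs, isolates OAI-SearchBot traffic, extracts WordPress article URLs, and counts hits per post. It calls gawk explicitly because the three-argument form of match() is a GNU awk extension; on Debian and Ubuntu the default awk is often mawk, which rejects it (install GNU awk with sudo apt install gawk if needed):

(
  grep -h -i "OAI-SearchBot" /var/log/apache2/access.log /var/log/apache2/access.log.1 2>/dev/null
  zgrep -h -i "OAI-SearchBot" /var/log/apache2/access.log*.gz 2>/dev/null
) | gawk '
{
    # Pull the path out of the quoted request line, e.g. "GET /path HTTP/1.1"
    if (match($0, /"(GET|HEAD|POST) ([^ ]+) HTTP\/[0-9.]+"/, m)) {
        path = m[2]
        # Drop any query string so variants of one URL are counted together
        sub(/\?.*/, "", path)

        # Keep only date-based post permalinks: /index.php/YYYY/MM/...
        if (path ~ /^\/index\.php\/[0-9]{4}\/[0-9]{2}\//) {
            # Skip feeds, embeds, and other non-article endpoints
            if (path ~ /(\/feed\/|\/oembed\/|\/wp-json\/|\/xmlrpc\.php|\/wp-login\.php|\/robots\.txt|\/favicon\.ico)/)
                next
            counts[path]++
        }
    }
}
END {
    # Print one line per post: hit count, then path
    for (p in counts)
        printf "%7d  %s\n", counts[p], p
}
' | sort -nr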

Example output:

  482 /index.php/2025/06/05/setting-up-navidrome-to-stream-your-azure-backed-music-collection/
  441 /index.php/2025/06/04/setting-up-jellyfin-to-stream-your-azure-backed-media-collection/
  406 /index.php/2025/05/19/how-to-mount-azure-files-on-linux-and-sync-data-with-rsync/
  391 /index.php/2021/03/06/technical-debt-what-is-it-and-how-to-detect-and-prevent-it/
  354 /index.php/2025/08/22/cutting-azure-file-share-costs-cheaper-azure-alternatives-and-how-to-migrate/
  327 /index.php/2020/06/27/building-a-raspberry-pi-ai-assistant-using-azure-and-ibm-cloud/
  301 /index.php/2025/07/04/ai-powered-recruitment-assistant-with-ollama/
  289 /index.php/2024/06/27/offline-ai-image-captioning/
  275 /index.php/2021/11/26/extending-powerfx-to-physical-world/
  262 /index.php/2020/05/29/tweeting-from-a-raspberry-pi-using-azure-speech/

This shows that OAI-SearchBot:

  • Fetched several articles more than 400 times each
  • Fetched certain technical tutorials more heavily than general posts
  • Requested older Raspberry Pi and AI‑related tutorials hundreds of times

This pattern is consistent with AI search systems favoring deep technical content.


5. Adjust for Pretty Permalinks (If You Don’t Use index.php)

If your URLs look like:

/2025/06/05/article-title/

Change the matching line from:

if (path ~ /^\/index\.php\/[0-9]{4}\/[0-9]{2}\//) {

to:

if (path ~ /^\/[0-9]{4}\/[0-9]{2}\//) {
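To avoid editing the pattern by hand, the per-post count can take the permalink prefix as an argument. This is a sketch rather than a drop-in replacement: count_posts is a hypothetical name, it assumes the default combined log format (request path in the seventh whitespace-separated field), it assumes GNU gzip for zcat -f, and it skips only feeds and embeds. It avoids gawk-only features, so it should also run under mawk.

```shell
# Hypothetical helper: count OAI-SearchBot hits per post.
#   $1      literal permalink prefix ("/index.php", or "" for pretty permalinks)
#   $2...   log files, plain or gzipped
count_posts() {
    prefix=$1
    shift
    zcat -f "$@" 2>/dev/null | grep -i "OAI-SearchBot" | awk -v prefix="$prefix" '
    {
        path = $7                      # request path in combined log format
        sub(/\?.*/, "", path)          # drop the query string
        # Require the literal prefix at the start of the path, if one was given
        if (prefix != "" && index(path, prefix) != 1) next
        rest = substr(path, length(prefix) + 1)
        # Keep date-based permalinks (/YYYY/MM/...) and skip feeds and embeds
        if (rest ~ /^\/[0-9][0-9][0-9][0-9]\/[0-9][0-9]\// && rest !~ /\/feed\/|\/oembed\//)
            counts[path]++
    }
    END {
        for (p in counts)
            printf "%7d  %s\n", counts[p], p
    }' | sort -nr
}

# Usage:
#   count_posts "/index.php" /var/log/apache2/access.log*
#   count_posts ""           /var/log/apache2/access.log*
```

The prefix is compared as a literal string, so the dot in /index.php needs no escaping.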


6. Why This Matters

By analyzing how ChatGPT hits your site, you can learn:

  • Which posts ChatGPT finds most valuable
  • How often your content is refreshed in AI search indexes
  • Whether older posts maintain long‑term AI interest
  • Which categories and topics resonate with AI systems

In my case, the logs showed that articles about media streaming, Azure integration, and technical automation were fetched hundreds of times, while lighter posts received far fewer visits. This kind of data helps guide future content strategy.
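To get a rough sense of how often the crawler returns, the same logs can be grouped by day. The sketch below counts OAI-SearchBot requests per calendar day. It assumes the default Apache timestamp format ([05/Jun/2025:10:00:00 +0000]) and GNU gzip for zcat -f; hits_per_day is a hypothetical name.

```shell
# Hypothetical helper: OAI-SearchBot requests per day, listed in the
# order the days first appear in the input files.
hits_per_day() {
    zcat -f "$@" 2>/dev/null | grep -i "OAI-SearchBot" | awk '
    match($0, /\[[0-9]+\/[A-Za-z]+\/[0-9]+/) {
        # Strip the leading "[" to leave DD/Mon/YYYY
        day = substr($0, RSTART + 1, RLENGTH - 1)
        if (!(day in counts)) order[++n] = day
        counts[day]++
    }
    END {
        for (i = 1; i <= n; i++)
            printf "%7d  %s\n", counts[order[i]], order[i]
    }'
}

# Usage (list the oldest file first for chronological output):
#   hits_per_day /var/log/apache2/access.log.1 /var/log/apache2/access.log
```

A shell glob such as access.log* expands in lexical order, not date order, so name the files explicitly if the sequence matters.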


Final Thoughts

AI‑powered crawlers are becoming one of the most influential ways content is discovered across the web. Understanding how ChatGPT interacts with your WordPress site helps you stay ahead, refine your posts, and learn what the next generation of search engines actually values.

