Tag Index Backlog
Tag pages (/tag/{slug}/) are auto-generated by _data/process_data.py. A tag page
with fewer than 3 conferences gets both googlebot_noindex: true and
sitemap: false written into its front matter.
- googlebot_noindex: true keeps the page out of Google (we intentionally block thin/low-quality pages from Google to protect site quality signals; Bing ignores this directive).
- sitemap: false removes thin tag pages from the sitemap, which also drops them from the IndexNow submission (submit_indexnow.py reads the sitemap), so thin tags are hidden from Bing too until they have real content.
This is unlike hotels and workshops, which carry googlebot_noindex but stay in the
sitemap on purpose: Bing indexes them well and they are valuable long-tail pages, so we
keep feeding Bing while blocking Google. Thin tag pages (1-2 conferences) aren’t worth
indexing anywhere yet, so they get sitemap: false as well.
This re-indexes itself — no manual step
process_data.py rewrites every tag page from scratch on each run
(process_data.py ~line 580-616). The googlebot_noindex + sitemap: false lines are
only emitted when tag_conf_counts[tag] < 3. So the moment a tag reaches 3 confirmed
conferences and you re-run cd _data && python process_data.py, both lines are simply
not written: the page re-enters the sitemap (so Bing/IndexNow picks it up) and becomes
indexable by Google again. You never edit tag/*.md by hand.
The threshold lives in one place, in process_data.py: if tag_conf_counts.get(tag, 0) < 3:
Change the 3 there if the policy changes.
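The rule above can be sketched roughly as follows. This is a minimal illustration and not the code in process_data.py: the function name tag_front_matter, the THIN_TAG_THRESHOLD constant, and the title field are assumptions made for the example. Only the two conditional keys and the < 3 comparison come from this document.

```python
THIN_TAG_THRESHOLD = 3  # hypothetical name; process_data.py uses the literal 3


def tag_front_matter(tag, tag_conf_counts):
    """Return front-matter lines for one tag page (illustrative only)."""
    lines = ["---", f"title: {tag}"]  # 'title' is an assumed field
    # Thin tags get both directives; tags at or above the threshold get neither.
    if tag_conf_counts.get(tag, 0) < THIN_TAG_THRESHOLD:
        lines.append("googlebot_noindex: true")
        lines.append("sitemap: false")
    lines.append("---")
    return lines


if __name__ == "__main__":
    print("\n".join(tag_front_matter("Edge AI", {"Edge AI": 2})))
```

Because the page is rebuilt on every run, passing a count of 3 or more to the same function simply leaves both lines out, which is why no manual cleanup is needed.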
How to use this list when adding conferences
When the task is “add conferences to flesh out the thin tags,” prefer real, confirmed conferences (official source required — see CLAUDE.md “Researching Conference Data”) that legitimately carry one of the tags below. Adding a conference with a tag that already exists on other entries will keep tag usage consistent.
Priority order: tags 1 conference away from indexing come first (lowest effort, highest SEO payoff), then tags 2 away.
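The priority order can be derived from per-tag counts with a small helper. This is a sketch under assumptions: the name backlog_by_distance is hypothetical, and the counts in the example are illustrative rather than read from conferences.csv.

```python
from collections import defaultdict

THRESHOLD = 3  # tag pages below this many conferences are not indexed


def backlog_by_distance(counts, threshold=THRESHOLD):
    """Group under-threshold tags by how many conferences they still need."""
    buckets = defaultdict(list)
    for tag, n in counts.items():
        if 0 < n < threshold:
            buckets[threshold - n].append(tag)
    # Smallest distance first; tags sorted alphabetically within each bucket.
    return {d: sorted(tags) for d, tags in sorted(buckets.items())}


if __name__ == "__main__":
    example = {"Audio": 2, "Cloud": 1, "Edge AI": 2, "Robotics": 5}  # illustrative counts
    for distance, tags in backlog_by_distance(example).items():
        print(f"{distance} away:", ", ".join(tags))
```

Tags in the 1-away bucket are the ones to work on first, matching the ordering of the two lists below.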
1 conference away (2 conferences now → add 1 to index)
- Audio
- Autonomous Systems
- Collective Intelligence
- Crowdsourcing
- Edge AI
- Film
- Game Development
- Knowledge Graphs
- Knowledge Representation
- Middleware
- Personalization
- Recommender Systems
- Scheduling
- Search
- Security Research
- Semantic Web
- Social Networks
- Supercomputing
- Telecommunications
- Urban Computing
2 conferences away (1 conference now → add 2 to index)
- Autonomous Agents
- Cloud
- Document Analysis
- Fog Computing
- Human-Computer Interaction
- Humanities
- Language Resources
- Linguistics
- Linux
- Mechatronics
- Metrics
- Mobile Security
- Multi-Agent Systems
- Network Security
- Neuroscience
- Performance
- Quality of Service
- Wireless
- Wireless Security
Refreshing this list
Counts drift as conferences are added. To regenerate the under-3 list:
cd _data && python -c "
import csv, re
from collections import Counter
counts = Counter()
with open('conferences.csv', encoding='utf-8', newline='') as f:
    for row in csv.DictReader(f):
        for tag in re.split(r'\s*,\s*', (row.get('tags') or '').strip()):
            if tag.strip():
                counts[tag.strip()] += 1
for c, t in sorted((c, t) for t, c in counts.items() if c < 3):
    print(c, '|', t)
"
Last refreshed: 2026-06-07 (39 tags under the threshold).