CLAUDE.md - Work & Wander Project Standards

Work & Wander (workwander.tech) is a Jekyll static site that helps attendees find conferences by location, venue, and year. Conference data is stored in CSVs and converted to YAML via a Python script.

Data Architecture

The pipeline: CSV -> YAML -> Jekyll

  1. Edit the CSV files in _data/
  2. Run cd _data && python process_data.py to regenerate all YAML files and Markdown pages
  3. Jekyll reads the YAML files from _data/conferences/, _data/venues/, etc.

Never edit the generated YAML files directly. Changes will be overwritten the next time process_data.py runs. The CSVs are the source of truth.

File structure

_data/
  conferences.csv       # source of truth for all conferences
  cities.csv            # source of truth for cities
  venues.csv            # source of truth for venues
  hotels.csv            # source of truth for hotels
  workshops.csv         # source of truth for workshops
  process_data.py       # generates all YAML + MD pages from CSVs

_data/conferences/      # GENERATED - do not edit
_data/cities/           # GENERATED - do not edit
_data/venues/           # GENERATED - do not edit
_data/hotels/           # GENERATED - do not edit
_data/workshops/        # GENERATED - do not edit

conference/             # GENERATED - do not edit
city/                   # GENERATED - do not edit
venue/                  # GENERATED - do not edit
hotel/                  # GENERATED - do not edit
workshop/               # GENERATED - do not edit

CSV column reference

conferences.csv (17 columns): id_conference, name, slug, year, location, id_city, venue_name, id_venue, cfp_url, rank, submission_deadline, notification_date, conference_date_start, conference_date_end, tags, excerpt, description

cities.csv: id_city, name, country, continent, display_name, coordinates, description, excerpt

venues.csv: name, id_venue, city_display_name, id_city, address, coordinates, venue_url, description

hotels.csv: slug, name, address, city_display_name, id_city, coordinates, star_rating, amenities, description, affiliatelink

Hotel descriptions

Hotel descriptions should be 1-2 sentences and must include who the hotel is best suited for. This helps conference attendees quickly decide if a property fits their situation. End with the practical fit signal.

Examples by traveler type:

  • "Historic grand hotel with a full spa and multiple dining venues — best for attendees who want a splurge-worthy stay steps from the venue." (luxury)
  • "All-suite hotel with complimentary hot breakfast and an indoor pool — a solid choice for attendees on a mid-range budget or traveling with family." (families / value)
  • "Budget-friendly chain hotel closest to the campus with reliable WiFi and a 24-hour front desk — ideal for grad students or cost-conscious attendees." (budget / students)
  • "Boutique design hotel popular with younger tech crowds, with a rooftop bar and co-working lounge." (younger / startup crowd)
  • "Reliable business hotel with large rooms and a full fitness center, well-suited for industry attendees combining the conference with client meetings." (business travelers)

Star rating is a guide but the description should reinforce the positioning — a 3-star with free breakfast and great location can be the best pick for many attendees.

workshops.csv: id_workshop, slug, year, title, name, id_conference, external_url, description

Workshop URLs

The external_url field should always be populated — never left empty — so the “visit the workshop website” link on the workshop page always goes somewhere useful.

  • If the workshop has its own website, use that.
  • If not, use the parent conference’s central workshop listing page as a fallback.

Known fallback URLs:

  • CHI workshops page: https://chi2026.acm.org/workshops/accepted/
  • ICLR workshops page: https://iclr.cc/Conferences/2026/Schedule?type=Workshop
  • CVPR workshops page: https://cvpr.thecvf.com/Conferences/2026/Workshops

When adding workshops for a new conference, find its workshops listing URL first and use it as the fallback for any workshop entries that don’t yet have individual sites. Individual URLs can be updated later in workshops.csv when they go live.
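
The fallback rule can be sketched in Python as below. This is an illustration only: the helper name `resolve_external_url`, the `FALLBACK_URLS` mapping, and keying the mapping by conference slug are assumptions, not part of process_data.py. The URLs are the known fallbacks listed above and are specific to the 2026 editions.

```python
# Hypothetical helper for illustration; not part of process_data.py.
# Keys are conference slugs (an assumption); values are the known
# 2026 workshop listing pages from this document.
FALLBACK_URLS = {
    "chi": "https://chi2026.acm.org/workshops/accepted/",
    "iclr": "https://iclr.cc/Conferences/2026/Schedule?type=Workshop",
    "cvpr": "https://cvpr.thecvf.com/Conferences/2026/Workshops",
}


def resolve_external_url(external_url: str, conference_slug: str) -> str:
    """Return the workshop's own URL, or the parent conference's listing page."""
    url = (external_url or "").strip()
    if url:
        return url
    if conference_slug not in FALLBACK_URLS:
        # Refuse to return an empty URL: the field must always be populated.
        raise ValueError(f"no fallback workshop URL known for {conference_slug!r}")
    return FALLBACK_URLS[conference_slug]
```

Raising on an unknown slug, rather than returning an empty string, keeps an empty external_url from slipping into workshops.csv unnoticed.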

Data Standards

IDs

  • All IDs use lowercase snake_case with no spaces, hyphens, or special characters
  • id_conference: {slug}_{year} e.g. aaai_2027
  • slug: short lowercase name e.g. aaai, acm_mm
  • id_city: lowercase city name e.g. montreal, rio_de_janeiro
  • id_venue: descriptive lowercase e.g. montreal_convention_centre, venetian_macao
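
A minimal sketch of how these ID rules could be checked before running process_data.py. The function names are hypothetical, and rejecting leading, trailing, or doubled underscores is an assumption that goes slightly beyond the rules as written.

```python
import re

# Lowercase letters and digits, joined by single underscores.
# (Disallowing leading/trailing/double underscores is an assumption.)
ID_PATTERN = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def is_valid_id(value: str) -> bool:
    """True if the value is lowercase snake_case with no other characters."""
    return ID_PATTERN.fullmatch(value) is not None


def expected_conference_id(slug: str, year: int) -> str:
    """Build id_conference from slug and year, e.g. aaai + 2027 -> aaai_2027."""
    return f"{slug}_{year}"
```

A row whose id_conference differs from `expected_conference_id(slug, year)` is worth a second look before committing.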

Dates

  • All dates must be ISO 8601 format: YYYY-MM-DD (e.g. 2027-02-16)
  • US format (M/D/YYYY) will break the JavaScript date parser on the listing page and show conferences under “Unscheduled / Unknown date”
  • Leave date fields empty if genuinely unknown - do not guess
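
The date rule could be enforced with a check along these lines. This is a sketch, not code from process_data.py, and `check_date_field` is a hypothetical name. It accepts an empty field (unknown date) or a real calendar date written with zero-padded YYYY-MM-DD.

```python
import re
from datetime import date

# Exactly four digits, two digits, two digits, separated by hyphens.
ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def check_date_field(value: str) -> bool:
    """True for an empty field or a valid calendar date in YYYY-MM-DD form."""
    if value == "":
        return True  # unknown dates are left empty, not guessed
    match = ISO_DATE.fullmatch(value)
    if match is None:
        return False  # e.g. 2/16/2027 or 2027-2-16
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects impossible dates such as Feb 30
    except ValueError:
        return False
    return True
```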

Tags

  • Comma-separated in the CSV: AI, Machine Learning, Robotics
  • Used for the tag filter on the conference listing page
  • Keep tags consistent with existing ones where possible
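
To help keep tags consistent, a small check like the following could flag tags that do not match an existing one exactly. Both function names are hypothetical, and the set of known tags is assumed to be collected from the existing rows of conferences.csv.

```python
def parse_tags(cell: str) -> list[str]:
    """Split a comma-separated tags cell into trimmed, non-empty tags."""
    return [tag.strip() for tag in cell.split(",") if tag.strip()]


def unknown_tags(cell: str, known: set[str]) -> list[str]:
    """Return tags in the cell that are not already in the known set."""
    return [tag for tag in parse_tags(cell) if tag not in known]
```

The comparison is case-sensitive on purpose, so `machine learning` is reported when the existing tag is `Machine Learning`.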

CORE Rankings

  • Valid values: A*, A, B, C
  • Source: CORE Portal
  • Leave blank if not ranked (e.g. industry events, IATED conferences)
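
As a sketch, the rank column could be validated like this. `is_valid_rank` is a hypothetical helper; the allowed values are the ones listed above, plus an empty string for unranked events.

```python
VALID_RANKS = {"A*", "A", "B", "C"}


def is_valid_rank(value: str) -> bool:
    """True for a blank rank (unranked) or one of the CORE values."""
    return value == "" or value in VALID_RANKS
```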

Descriptions

  • Plain prose preferred - match the style of existing entries (see aaai_2026.yml as a reference)
  • No markdown headings (##, ###) in descriptions
  • No em dashes (U+2014) or en dashes (U+2013); use plain hyphens or rewrite the sentence
  • No AI-writing patterns: “Who Should Submit / Attend / Why It Matters” sections
  • **bold** for emphasis is fine; it renders via markdownify in the conference page template
  • Keep descriptions factual and concise
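
Two of these rules are mechanical enough to check automatically. The sketch below is an assumption about how that could look: `lint_description` is a hypothetical helper, and it covers only markdown headings and em/en dashes, not the style rules that need human judgment.

```python
import re

# A markdown heading: up to three spaces, one to six #, then whitespace.
HEADING = re.compile(r"^[ ]{0,3}#{1,6}\s", re.MULTILINE)
DASHES = {"\u2014": "em dash", "\u2013": "en dash"}


def lint_description(text: str) -> list[str]:
    """Return a list of rule violations found in a description."""
    problems = []
    if HEADING.search(text):
        problems.append("markdown heading")
    for char, label in DASHES.items():
        if char in text:
            problems.append(label)
    return problems
```

Bold markers such as `**leading**` are not flagged, since bold is allowed.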

Excerpts

  • 1-2 sentences, used for SEO meta description
  • Should be readable standalone without seeing the full page

Coordinates

  • Format: [latitude, longitude] as a Python list literal (it’s parsed with ast.literal_eval)
  • Used for map display on venue and conference pages
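
Since the field is parsed with ast.literal_eval, a coordinates cell can be checked the same way before it reaches the generator. The sketch below is illustrative: `parse_coordinates` is a hypothetical name, and the latitude/longitude range checks are an added assumption meant to catch swapped values.

```python
import ast


def parse_coordinates(cell: str) -> tuple[float, float]:
    """Parse '[latitude, longitude]' and return (latitude, longitude)."""
    value = ast.literal_eval(cell)  # raises ValueError/SyntaxError on bad input
    if not isinstance(value, list) or len(value) != 2:
        raise ValueError(f"expected [latitude, longitude], got {cell!r}")
    lat, lon = value
    for number in (lat, lon):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(number, bool) or not isinstance(number, (int, float)):
            raise ValueError(f"coordinates must be numbers, got {cell!r}")
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range in {cell!r}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range in {cell!r}")
    return float(lat), float(lon)
```

For example, `parse_coordinates("[45.5017, -73.5673]")` returns the pair for Montreal, while a cell with latitude and longitude swapped will often fail the latitude range check.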

Adding New Conferences

Step 1: Add prerequisite cities and venues

If the city doesn’t exist in cities.csv, add it first. Required fields: id_city, name, country, continent, display_name, coordinates, description.

If the venue doesn’t exist in venues.csv, add it. Required fields: name, id_venue, city_display_name, id_city, address, coordinates, description.

Step 2: Add the conference row to conferences.csv

Required fields: id_conference, name, slug, year, tags, description

Fill in as many optional fields as are confirmed: location, id_city, venue_name, id_venue, cfp_url, rank, submission_deadline, notification_date, conference_date_start, conference_date_end, excerpt
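
A quick way to confirm that a new row has every required field is sketched below. `missing_required` is a hypothetical helper, and the row is assumed to be a dict such as the ones produced by Python's csv.DictReader.

```python
REQUIRED_CONFERENCE_FIELDS = (
    "id_conference", "name", "slug", "year", "tags", "description",
)


def missing_required(row, required=REQUIRED_CONFERENCE_FIELDS):
    """Return the required field names that are absent or blank in the row."""
    return [field for field in required if not (row.get(field) or "").strip()]
```

An empty list means the row is complete; optional fields such as cfp_url are ignored.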

Step 3: Run process_data.py

cd _data && python process_data.py

This regenerates all YAML files and creates the conference/city/venue Markdown pages.

Step 4: Verify

Check that the generated YAML in _data/conferences/{id_conference}.yml looks correct, especially is_latest (set automatically based on whether it’s the most recent year for that slug) and the date formats.
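
To know what value of is_latest to expect, the rule can be reproduced by hand. The sketch below illustrates the stated rule (most recent year per slug) and is not the actual code in process_data.py; `expected_is_latest` is a hypothetical name.

```python
def expected_is_latest(rows):
    """Map each id_conference to True if its year is the newest for its slug."""
    latest_year = {}
    for row in rows:
        year = int(row["year"])
        slug = row["slug"]
        if slug not in latest_year or year > latest_year[slug]:
            latest_year[slug] = year
    return {
        row["id_conference"]: int(row["year"]) == latest_year[row["slug"]]
        for row in rows
    }
```

Note that adding a newer edition flips is_latest to false on the previous edition, so both generated files are worth checking.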

Researching Conference Data

Where to find information

  • Official conference website: Always the primary source. Search {conference name} {year} site: or just {conference acronym} {year} conference.
  • CORE portal: core.edu.au/conference-portal for rankings
  • conf.researchr.org: Tracks many ACM/IEEE conferences and their upcoming editions
  • sigplan.org, sigcomm.org, sigchi.org: ACM SIG sites often list future venues years in advance

Confidence levels

Only add data you can confirm from an official source. Common patterns:

  • Conference dates and venue: announced 12-18 months ahead on the official site
  • CFP deadlines: usually announced 6-9 months before the conference
  • Some conferences (esp. SOFSEM, INTED, smaller venues) don’t announce details until 6 months out

What to leave blank

Leave fields empty rather than guessing. The site handles missing data gracefully:

  • Missing venue: shows “Venue info not available”
  • Missing dates: conference goes into “Unscheduled / Unknown date” section on the listing
  • Missing CFP deadline: the Important Dates section simply doesn’t render

Common Pitfalls

Encoding corruption (mojibake)

If CSV data is opened and saved in Excel/Google Sheets without UTF-8 encoding, special characters get corrupted. Symptoms: Ã¨ instead of è, and a three-character sequence beginning with â€ instead of an en or em dash.

Fix with Ruby. Write the script to a .rb file rather than using a heredoc in the terminal, which is unreliable under the Windows IBM437 code page:

# Read raw bytes and label them as UTF-8 so the \u patterns below can match.
content = File.binread('_data/conferences.csv').force_encoding('utf-8')
content.gsub!("\r\n", "\n")               # normalize Windows line endings
content.gsub!("\u00E2\u20AC\u201C", "-")  # en-dash mojibake
content.gsub!("\u00E2\u20AC\u201D", "-")  # em-dash mojibake
content.gsub!("\u00C3\u00A8", "\u00E8")   # e-grave (è)
content.gsub!("\u00C3\u00A9", "\u00E9")   # e-acute (é)
# Write in binary mode so no newline translation happens on Windows.
File.open('_data/conferences.csv', 'wb') { |f| f.write(content) }
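
Before running a fix, it can help to find which lines are affected. The Python sketch below only detects; it does not modify the file. `find_mojibake` is a hypothetical helper, and the marker list is an assumption covering just the sequences handled by the Ruby script above.

```python
# Markers for the mojibake handled above: e-grave, e-acute, and the
# two-character prefix shared by en-dash and em-dash mojibake.
MOJIBAKE_MARKERS = ("\u00c3\u00a8", "\u00c3\u00a9", "\u00e2\u20ac")


def find_mojibake(text: str) -> list[int]:
    """Return 1-based line numbers containing a known mojibake marker."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if any(marker in line for marker in MOJIBAKE_MARKERS)
    ]
```

Other accented characters corrupt into different sequences, so an empty result does not prove the file is clean.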

CSV quoting errors (Jekyll “Illegal quoting” error)

Ruby’s CSV parser is strict: a space before an opening quote is illegal. Example: , "AI, Machine Learning" - the space before " breaks it.

Fix: ensure quoted fields have no leading space after the comma.
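
A rough way to locate offending lines is to search for a comma followed by whitespace and then a double quote. The sketch below is a heuristic with a hypothetical function name; it can also flag a comma-space-quote sequence that legitimately appears inside a quoted field, so treat hits as candidates to inspect.

```python
import re

# A comma, then one or more spaces or tabs, then an opening double quote.
SPACE_BEFORE_QUOTE = re.compile(r',[ \t]+"')


def lines_with_space_before_quote(text: str) -> list[int]:
    """Return 1-based line numbers where a quote follows a comma and a space."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if SPACE_BEFORE_QUOTE.search(line)
    ]
```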

Date format corrupted by spreadsheet

Spreadsheets often convert 2027-02-16 to 2/16/2027 on open/save. Always verify dates after a spreadsheet merge. The listing page JS requires YYYY-MM-DD format.
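
If a spreadsheet has already rewritten the dates, they can be converted back with something like the sketch below. `us_to_iso` is a hypothetical helper, and it assumes the corrupted values are month-first (M/D/YYYY), as in the example above; day-first values would be converted incorrectly.

```python
import re
from datetime import date

# One or two digits, slash, one or two digits, slash, four digits.
US_DATE = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{4})")


def us_to_iso(value: str) -> str:
    """Convert M/D/YYYY to YYYY-MM-DD; return other values unchanged."""
    match = US_DATE.fullmatch(value.strip())
    if match is None:
        return value  # already ISO, empty, or some other format
    month, day, year = (int(part) for part in match.groups())
    # date() raises ValueError for impossible dates such as 13/40/2027.
    return date(year, month, day).isoformat()
```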

Merging new data

Prefer editing CSVs directly in a text editor or via script rather than via spreadsheet, to avoid encoding and date format corruption. If you must use a spreadsheet, open with explicit UTF-8 encoding and verify dates after saving.

Duplicate rows

If a conference is accidentally added twice with the same id_conference, process_data.py will overwrite the YAML with the last row’s data. Check for duplicates after any manual merge.
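
Duplicates can be found with a short check like the one below. It is a sketch with a hypothetical function name and takes the CSV content as a string; the default column name comes from the column reference above.

```python
import csv
import io
from collections import Counter


def duplicate_ids(csv_text: str, id_column: str = "id_conference") -> list[str]:
    """Return the sorted IDs that appear in more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[id_column] for row in reader)
    return sorted(value for value, count in counts.items() if count > 1)
```

To run it against the real file, read `_data/conferences.csv` with `encoding="utf-8"` and pass the contents in.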

Running the Site Locally

bundle exec jekyll serve

The site rebuilds on file changes. YAML regeneration (process_data.py) must be run separately before Jekyll picks up CSV changes.