Hayfevr.ly — scrape daily pollen readings for Austin, Texas

I created Hayfevr.ly (GH repo here) to solve a problem I was having. Two local allergy clinics and a TV news station each posted daily pollen readings on their own websites, so every morning I had to check three different sites to find out what was making me sneeze.

I wrote a web scraper using Java and Selenium that checks these websites for new readings, updates the daily pollen counts as new readings come in, and summarizes the data in one convenient place on Hayfevr.ly.
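To illustrate the "update as new readings come in" step, here is a minimal sketch of how the latest reading per source and allergen could be tracked in memory. It is not the project's actual code: the `DailyReadings` class, the `Reading` record, and the source names are hypothetical, and the Selenium scraping and MySQL storage are left out entirely.

```java
import java.time.LocalDateTime;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: class, record, and field names are assumptions,
// not taken from the Hayfevr.ly code base.
public class DailyReadings {

    /** One scraped pollen reading from one source. */
    public record Reading(String source, String allergen, int count, LocalDateTime seenAt) {}

    // Most recent reading, keyed by "source|allergen".
    private final Map<String, Reading> latest = new LinkedHashMap<>();

    private static String key(String source, String allergen) {
        return source + "|" + allergen;
    }

    /**
     * Stores the reading only if it is newer than the one already held
     * for the same source and allergen. Returns true if it was stored.
     */
    public boolean update(Reading r) {
        String k = key(r.source(), r.allergen());
        Reading old = latest.get(k);
        if (old != null && !r.seenAt().isAfter(old.seenAt())) {
            return false; // stale or duplicate reading, keep what we have
        }
        latest.put(k, r);
        return true;
    }

    /** Returns the latest reading for a source and allergen, or null if none. */
    public Reading get(String source, String allergen) {
        return latest.get(key(source, allergen));
    }
}
```

Keying on source plus allergen means that re-scraping a page that has not changed is a no-op, while a corrected or later reading from the same source replaces the earlier one.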

Tech Stack

Java and Selenium for the scraper

Tess4J (Tesseract) for OCR to extract pollen counts published as images

Java to convert colloquial names of plants into formal Latin families, genera, and species

MySQL for storing past and current-day readings

Go (a.k.a. “golang”) for generating the static web pages
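The name-conversion step listed above could be sketched as a simple lookup table. This is only an illustration under stated assumptions: the `PlantNames` class, the `Taxon` record, and the three sample entries are hypothetical, and the real project presumably covers many more plants and name variants.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: the class, record, and table contents are assumptions
// chosen for illustration, not the project's actual data.
public class PlantNames {

    /** Formal classification; species is empty when only the genus is known. */
    public record Taxon(String family, String genus, String species) {}

    private static final Map<String, Taxon> BY_COMMON_NAME = Map.of(
            "mountain cedar", new Taxon("Cupressaceae", "Juniperus", "Juniperus ashei"),
            "ragweed", new Taxon("Asteraceae", "Ambrosia", ""),
            "oak", new Taxon("Fagaceae", "Quercus", ""));

    /** Lowercases, trims, and collapses runs of whitespace so lookups are forgiving. */
    static String normalize(String commonName) {
        return commonName.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** Returns the taxon for a colloquial name, or empty if the name is unknown. */
    public static Optional<Taxon> lookup(String commonName) {
        return Optional.ofNullable(BY_COMMON_NAME.get(normalize(commonName)));
    }

    public static void main(String[] args) {
        System.out.println(lookup("  Mountain   Cedar ")); // found despite odd spacing and case
        System.out.println(lookup("unicorn grass"));       // not in the table
    }
}
```

Returning an `Optional` rather than throwing lets the scraper log an unrecognized name and carry on, which matters when the sources word their readings differently from day to day.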
