I created Hayfevr.ly (GH repo here) to solve a problem I was having. Two local allergy clinics and a TV news station each posted daily pollen readings on their own websites, so every morning I had to check three different sites to find out what was making me sneeze.
I wrote a web scraper using Java and Selenium that checks these websites for new readings, updates the daily pollen counts as new readings come in, and summarizes the data in one convenient place on Hayfevr.ly.
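The "update as new readings come in" step can be sketched roughly as follows. This is only an illustration of the idea, not the code from the repo: the class `DailyCounts`, the `Reading` fields, and the source labels are all hypothetical, and it assumes each reading is an integer count with the time it was scraped.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: these names are illustrative, not taken from the Hayfevr.ly source.
final class DailyCounts {

    /** One pollen reading from one website (fields are assumptions). */
    static final class Reading {
        final String source;        // e.g. "clinicA", "tv" (made-up labels)
        final String plant;
        final int count;
        final LocalDateTime seenAt; // when the scraper saw this reading

        Reading(String source, String plant, int count, LocalDateTime seenAt) {
            this.source = source;
            this.plant = plant;
            this.count = count;
            this.seenAt = seenAt;
        }
    }

    private final LocalDate day;
    // Latest reading per (source, plant) pair for this day.
    private final Map<String, Reading> latest = new HashMap<>();

    DailyCounts(LocalDate day) {
        this.day = day;
    }

    /** Stores the reading if it is for this day and newer than the one held; returns true when stored. */
    boolean offer(Reading r) {
        if (!r.seenAt.toLocalDate().equals(day)) {
            return false; // reading belongs to another day
        }
        String key = r.source + "|" + r.plant;
        Reading old = latest.get(key);
        if (old != null && !r.seenAt.isAfter(old.seenAt)) {
            return false; // we already hold an equally recent or newer reading
        }
        latest.put(key, r);
        return true;
    }

    /** Highest current count for a plant across all sources, or -1 if no source reported it. */
    int highest(String plant) {
        int best = -1;
        for (Reading r : latest.values()) {
            if (r.plant.equals(plant) && r.count > best) {
                best = r.count;
            }
        }
        return best;
    }
}
```

Keying on source plus plant means a corrected reading from one site replaces that site's earlier value without touching what the other two sites reported.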
Java and Selenium for the scraper
Tess4J (Tesseract) for OCR to extract pollen counts published as images
Java to convert colloquial names of plants into formal Latin families, genera, and species
MySQL for storing past and current-day readings
Go (a.k.a. “golang”) for generating the static web pages
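To give a feel for the name-conversion step, here is a minimal sketch of mapping colloquial plant names to Latin family, genus, and species. It assumes a simple in-memory lookup table; the class `PlantNames`, the `Taxon` type, and the four sample entries are hypothetical and chosen only for illustration, so the real project may well do this differently (for example, from a database table).

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: these names are illustrative, not taken from the Hayfevr.ly source.
final class PlantNames {

    /** Family, genus, and species; species is empty when a reading names only a genus. */
    static final class Taxon {
        final String family;
        final String genus;
        final String species;

        Taxon(String family, String genus, String species) {
            this.family = family;
            this.genus = genus;
            this.species = species;
        }

        @Override
        public String toString() {
            return species.isEmpty()
                    ? family + " / " + genus
                    : family + " / " + genus + " " + species;
        }
    }

    // Sample entries only; a real table would be much larger.
    private static final Map<String, Taxon> BY_COMMON_NAME = Map.of(
            "oak", new Taxon("Fagaceae", "Quercus", ""),
            "ragweed", new Taxon("Asteraceae", "Ambrosia", ""),
            "mountain cedar", new Taxon("Cupressaceae", "Juniperus", "ashei"),
            "bermuda grass", new Taxon("Poaceae", "Cynodon", "dactylon"));

    /** Trims, lower-cases, and collapses runs of whitespace so site-specific spellings match. */
    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** Returns the taxon for a colloquial name, or empty if the name is not in the table. */
    static Optional<Taxon> lookup(String colloquialName) {
        return Optional.ofNullable(BY_COMMON_NAME.get(normalize(colloquialName)));
    }
}
```

Returning an `Optional` leaves it to the caller to decide what to do with a name the table does not know, such as logging it for manual review.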