What if the websites you loved in 2014 could actually live again—but only if you were willing to break Windows, learn Python from scratch, and accidentally lock your Google account?
That’s the dark comedy at the heart of one developer’s recent project: resurrecting 216 pages from a GeoCities-era outdoor club website. But this isn’t just another nostalgic salvage mission. It’s a case study in how desperation, friendship, and sheer stubbornness can force you to build tools you never thought you’d need. Welcome to the world of HTML-to-Blogger conversion automation—where the real challenge isn’t the code. It’s keeping your sanity.
The GeoCities Ghost That Wouldn’t Stay Dead
Let’s set the scene. GeoCities shut down. A beloved outdoor club’s website got archived. The last post was April 2014. Case closed, right? Wrong. The site’s original creator—a friend who built it while learning HTML—couldn’t just let it disappear into the digital void. Neither could the developer who decided to rescue it.
Here’s where the story gets interesting: 216 pages.
Not 16. Not 26. Two hundred and sixteen.
“Doing this by hand is impossible!” the developer wrote. And they were right.
Manual recovery wasn’t an option. Neither was asking an AI to blindly process everything (that failed spectacularly on language-specific instructions). So the developer made a choice that seemed reasonable at the time: build a custom tool. What could possibly go wrong?
Everything. Absolutely everything.
The Windows Apocalypse (And Why Ubuntu Saved the Day)
First attempt: React, VS Code, Windows. A recipe for progress, or so it seemed. Then Windows became “unstable.” Then it crashed. The developer lost their product ID. The entire project evaporated from their main machine.
Most people would’ve quit.
This developer decided to install Ubuntu instead, an OS they'd barely touched, and learn Python. For the first time. While simultaneously changing the entire project's architecture from a React rebuild to a Google Blogger integration. While adjusting IME settings. While downloading Fcitx5. While sounding increasingly unhinged in their own project notes.
This is what obsession looks like. This is also what happens when you’re friends with your client.
The pivot to Blogger made sense operationally (members could now post, video uploads were possible), but it required gutting the original plan entirely. No more preserving the original design. No more React-based recreation. Instead: automated HTML cleaning, EXIF stripping, keyword extraction, location tagging, and image watermarking—all piped directly into Blogger’s API.
Sounds simple when you write it that way.
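The source doesn't show the actual cleaning code, but the first stage of that pipeline, stripping GeoCities-era presentational markup, might look something like this rough stdlib-only sketch (the tag list and function name are illustrative assumptions, not the project's real code):

```python
import re

# Hypothetical list of legacy presentational tags to strip,
# keeping their inner text intact.
LEGACY_TAGS = ("font", "center", "marquee", "blink")

def clean_html(html: str) -> str:
    """Remove legacy tags and collapse runs of whitespace."""
    for tag in LEGACY_TAGS:
        # Match both opening tags (with any attributes) and closing tags.
        html = re.sub(rf"</?{tag}\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Normalize whitespace left behind by the removed markup.
    return re.sub(r"\s+", " ", html).strip()

print(clean_html('<center><font color="red">Trip report</font></center>'))
# Trip report
```

A real cleaner would lean on a proper parser rather than regexes, but for 2000s-era tag soup a blunt pass like this gets surprisingly far.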
When 200 API Requests Happen by Accident
There’s a moment in every developer’s life when they realize they’ve made a catastrophic mistake in real time. For this developer, it was checking their email to find a Google account lock warning.
They’d accidentally sent approximately 200 Blogger API requests without a rate limiter. No timer. No safeguards. Just raw, unfiltered requests flooding Google’s infrastructure like a digital toddler pressing buttons.
The apology email was sent immediately. Google restored the account the next day. Crisis averted, dignity mildly bruised.
But this moment reveals something important about the project: it’s not a polished, well-architected solution. It’s a messy, iterative, crash-and-learn journey that involves debugging Python references you don’t understand, struggling with nesting (literally using their finger to count indent levels), and discovering virtualenv exists only after trying VirtualBox and Wine.
The Library Dependency Hell You Don’t See Coming
Here’s what separates this from typical “I built a tool” posts: the developer is brutally honest about the chaos. Library version conflicts. IME misconfiguration. Realizing that Python’s copy function doesn’t work the way you expect when instances are involved. Installing MS Office through Wine. Switching to LibreOffice. Discovering Nominatim geocoding as a last-minute solution for location tagging.
This isn’t a polished retrospective. It’s raw documentation of technical suffering.
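That copy gotcha is worth spelling out, because it bites nearly every new Python programmer: `copy.copy` duplicates the outer object but shares any nested mutable state, while `copy.deepcopy` duplicates everything. The post dict below is purely illustrative data:

```python
import copy

post = {"title": "Kayak trip", "tags": ["river", "2012"]}

shallow = copy.copy(post)        # new dict, but the inner list is shared
shallow["tags"].append("draft")  # mutates post["tags"] too!

deep = copy.deepcopy(post)       # fully independent copy
deep["tags"].append("edited")    # post["tags"] is unaffected

print(post["tags"])  # ['river', '2012', 'draft']  (the "shallow" append leaked through)
```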
The GUI development using Tkinter was particularly brutal. AI-generated base code that required complete rewrites. Intense debugging sessions. So much keyboard pressing that family members complained about the noise. That’s commitment. That’s also a sign you might be doing something wrong, but the developer got there eventually.
The tool now performs HTML cleaning (removing unnecessary tags and normalizing formatting), image processing (EXIF stripping and watermark addition), and automated metadata assignment (keywords and location information extracted from content and assigned to Blogger posts). It's functional. It's available on GitHub. It runs on Windows, macOS, and Linux, anywhere a Python environment is available.
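The keyword-extraction step isn't detailed in the source; a crude frequency-based sketch (the stopword list and function name are assumptions, not the project's actual approach) conveys the idea:

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real tool would use a fuller one.
STOPWORDS = {"the", "a", "and", "of", "to", "in", "on", "at", "we", "was", "were"}

def extract_keywords(text: str, n: int = 5) -> list[str]:
    """Return the n most frequent non-stopword words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]

print(extract_keywords(
    "We paddled the river at dawn. The river was calm and the kayaks were fast."
))
```

From there, the top keywords become Blogger post labels, and a geocoder such as Nominatim turns place names found in the text into location metadata.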
The Real Lesson Here Isn’t About Technology
This project is nominally about building a Python CLI tool that converts archived HTML into Blogger posts at scale. But the actual story is about something messier and more human: friendship, persistence, and the willingness to learn technologies you’ve never touched when stakes feel high enough.
What the developer lacks in wisdom, they compensate for with time. Lots and lots of time: debugging, breaking things, fixing them. Ubuntu, they wrote, was "wonderful," and the community around it mattered more than they'd expected.
Corporate engineers would’ve killed this project at the Windows crash. They would’ve charged money. They would’ve built it properly with tests and documentation. This developer just… kept going. Because someone they knew couldn’t let something they’d built be forgotten.
That’s not a tech story. That’s a friendship story wearing a tech costume.
Why This Matters Beyond One Nostalgic Site
Archival is a real problem. Every day, websites disappear. GeoCities is dead. Tumblr purged content. Medium’s algorithm buries old posts. GitHub repositories get deleted. We act like the internet is permanent when it’s fragile as hell.
Tools like this—messy, specific, built out of necessity rather than market demand—matter because they demonstrate that preservation can be automated at scale. Yes, Google API credentials are required. Yes, it’s technically demanding. But 216 pages that would’ve stayed archived forever are now accessible, editable, and integrated into a living platform.
The developer’s final note is telling: “Please feel free to point out any issues with the Python coding!” There’s humility there. There’s also awareness that this is version 0.98, not 1.0. Bugs probably exist. The code could be cleaner. But the site’s online. The pages are searchable. Kayaking trips from 2012 are discoverable again.
That’s not nothing.
Frequently Asked Questions
Can I use this tool for my own archived website?
Yes, but you’ll need Google API credentials and a willingness to debug. It’s designed for HTML-to-Blogger conversion specifically, but the principles scale to other platforms with some modification. The developer explicitly invites code review and bugfixes—it’s not polished enterprise software.
What if I don’t want to use Blogger?
The architecture could theoretically adapt to other platforms, but the current tool is Blogger-specific. The HTML cleaning and image processing components are reusable, but API integration would require rebuilds. File an issue on the GitHub repo or contribute a fork.
Is Python the best choice for this kind of automation?
For one person learning on the fly? Not necessarily. But it works, it’s cross-platform with proper environment setup, and the developer got it done. That’s what matters more than theoretical optimization.