Building AIDA (AI: Do Anything!): a browser AI extension born from testing GPT‑5


I wanted to test GPT‑5 in the wild—no carefully engineered spec, just a fuzzy prompt and a real deliverable. That constraint produced AIDA (AI: Do Anything!), a Chrome extension that summarizes the active page, translates it, lets you “ask the page anything,” and offers a Free Chat mode (with more features to come as they occur to me or arrive as online suggestions). The twist: I intentionally started with a vague prompt in ChatGPT’s UI.

Why? Two reasons:

  1. at T0 my ideas aren’t crisp (and sometimes what you generate on day one ends up in the bin), and
  2. ambiguity lets the model interpret requirements and often produce UI/UX or feature ideas I wouldn’t have thought of.

The result was a surprisingly complete starting point—then I hardened it in Cursor with Claude. (And yes, the name AIDA intentionally sounds a bit Italian 😉)

TL;DR: Jump to https://organizer.solutions/briefly.html to try it now.


What I built (and why)

Goal: create an extensible base app that anyone can adopt to bring GenAI to everyday browsing—no deep technical skills required. I wired OpenAI first, then Claude and Google Gemini, so people can choose their favorite provider. The philosophy is simple: democratize GenAI. You don’t need to be “super technical” to do great things, ideally at zero cost for casual use.

Current feature set includes page summarization, multi‑language translation, page‑aware Q&A, processing/job tracking, local IndexedDB storage, and a tabbed popup UI with results & processing views.


My workflow: ChatGPT ➜ Cursor ➜ back to ChatGPT

  • I used ChatGPT (GPT‑5) to bootstrap a working Manifest V3 extension from a messy prompt. UX and UI came largely from the model—honestly, 1000× faster and better than doing it all by hand.
  • The first build didn’t run as‑is. I switched to Cursor and iterated with Claude‑4 about ten times, mainly to refactor background job handling and asynchronous LLM calls. Two sessions of ~2.5h each later, I had v0.2.0 working smoothly.
  • I added a small build script (also via Cursor/Claude) that zips the extension for the Chrome Web Store and injects a trailing banner into each file with identifying metadata and a license notice.
  • I also asked ChatGPT to draft the initial EULA and Privacy Policy, which I then refined and adopted.
  • I returned to ChatGPT for branding: we iterated from early names (MindBolt/MindBlot/Briefly) to AIDA, and we co‑designed the icon through many passes (the “winking page/person” motif you see in the mosaic).
  • Because OpenAI exposes a models list endpoint, but others were historically less straightforward, I used an OpenAI Agent to research up‑to‑date model names for Claude and Gemini to power a typeahead model picker. (FWIW: today Anthropic documents /v1/models; it wasn’t part of my earliest iteration but was added in version 0.5.0.)
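Once a provider’s model list is fetched, the typeahead itself is just client-side filtering. A minimal sketch of that idea—function name and model ids here are illustrative, not AIDA’s actual code:

```typescript
// Filter a provider's model list for a typeahead picker.
// Matches case-insensitively on any substring of the model id.
function filterModels(models: string[], query: string): string[] {
  const q = query.trim().toLowerCase();
  if (q === "") return models; // empty query shows everything
  return models.filter((id) => id.toLowerCase().includes(q));
}

// Illustrative ids only; fetch each provider's models endpoint
// for the current list.
const models = ["gpt-4o", "gpt-4o-mini", "claude-3-5-sonnet", "gemini-1.5-flash"];
console.log(filterModels(models, "4o")); // → ["gpt-4o", "gpt-4o-mini"]
```

Substring matching is forgiving enough for short model ids; a fuzzy matcher would be overkill here.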


Architecture at a glance

  • Popup triggers actions and renders results.
  • Service Worker (background) extracts content from the active tab, calls the selected LLM, persists results to IndexedDB, and pushes updates back to the UI.
  • Options page stores provider, model, and API key (local only).
  • Jobs persist even if the popup closes; you can monitor them in the Processing tab.
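In Manifest V3 terms, that shape maps onto a manifest roughly like this—a trimmed, illustrative sketch, not AIDA’s real manifest:

```json
{
  "manifest_version": 3,
  "name": "AIDA (AI: Do Anything!)",
  "version": "0.5.0",
  "action": { "default_popup": "popup.html" },
  "background": { "service_worker": "background.js" },
  "options_page": "options.html",
  "permissions": ["activeTab", "scripting", "storage"],
  "host_permissions": ["https://*/*"]
}
```

The key MV3 detail is `background.service_worker`: the background page is event-driven and can be torn down at any time, which is exactly why jobs are persisted to IndexedDB rather than held in memory.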

Implementation highlights: intelligent main‑content extraction, chunking for long pages, “merge pass” for clean executive summaries, and a local results hub (copy/delete, details modal, debug tools).
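Of those highlights, chunking is the piece most worth copying. A minimal sketch of the idea (boundary choice and sizes are mine, not AIDA’s internals): split on paragraph breaks and pack paragraphs greedily up to a character budget, so each chunk fits comfortably in a model’s context window.

```typescript
// Greedily pack paragraphs into chunks of at most maxChars characters,
// splitting on blank lines so paragraphs stay intact. A single paragraph
// longer than maxChars is kept whole (a real implementation would split it).
function chunkText(text: string, maxChars: number): string[] {
  const paragraphs = text.split(/\n\s*\n/).map((p) => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current = "";
  for (const p of paragraphs) {
    if (current && current.length + p.length + 2 > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current ? current + "\n\n" + p : p;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk gets summarized independently, and the “merge pass” then summarizes the concatenated chunk summaries into one brief.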


Mini‑tutorial: get a Gemini API key (free)

Good news: Gemini has a free tier that lets you call the LLM at no cost (with obvious limits — see https://ai.google.dev/gemini-api/docs/pricing).

  1. Go to Google AI Studio and sign in (https://aistudio.google.com/app/apikey). Click Get API key.
  2. If needed, review the docs on using Gemini API keys (env vars or explicit key).
  3. Copy your key and paste it in AIDA → Settings → Provider = Google.
  4. Pick a Gemini model in the model dropdown/typeahead and you’re set.

About the free tier. Google provides a free tier with lower rate limits intended for testing; exact quotas vary by model and can change. Check the current pricing and rate‑limit pages before you rely on it. (AI Studio itself is free to use in supported countries.)

Tip: AI Studio’s Quickstart shows copy‑paste code snippets if you want to sanity‑check a request outside the extension.
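If you’d rather sanity-check from code, the request is small enough to build by hand. A sketch of the URL and payload for the `generateContent` REST endpoint—API version and model name are assumptions that may drift, so check the Quickstart for the current form:

```typescript
// Build the URL and JSON body for a Gemini generateContent call.
// The v1beta path and model name are illustrative; verify against
// Google's current REST docs before relying on them.
function buildGeminiRequest(apiKey: string, model: string, prompt: string) {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent?key=${apiKey}`,
    body: { contents: [{ parts: [{ text: prompt }] }] },
  };
}

// Usage (not executed here):
// const { url, body } = buildGeminiRequest("YOUR_KEY", "gemini-1.5-flash", "Say hello");
// await fetch(url, {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(body),
// });
```

If that round-trips, the extension’s Settings are almost certainly fine too.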


What’s in the box (v0.5.0)

  • Summarize Page. Extracts the core content, filters menus/ads/footers, chunk‑summarizes, and merges into a concise brief. Stored locally for offline reuse.
  • Translate to … 80+ languages with a searchable picker and a configurable default. Chunked translation preserves meaning and structure.
  • Ask Anything. Ask targeted questions; the model answers based on the current page’s content.
  • Free Chat. Have open‑ended conversations not tied to the current page; history is saved and can be resumed later.
  • Results & Processing tabs. Review saved outputs, see live job status, open details, copy/delete, and run debug tools (DB test, background test, debug info).
  • Task Cancellation. Cancel pending AI operations directly from the Processing tab.
  • Modal Persistence. The last opened modal (content view, Ask Anything, Free Chat) is automatically restored on reopen.
  • Settings. Choose provider (OpenAI/Claude/Gemini), select a model (with descriptions), and store API keys locally.

Observations on GPT‑5 (and subscriptions)

For my current day‑to‑day, GPT‑5 performs well enough that I didn’t change plans. Anecdotally, when I played with canceling, I was offered a 50% discount—your mileage may vary.


Lessons learned

  • Underspec at T0 is a feature. Let the model surprise you—especially for UI/UX and “obvious in hindsight” workflows.
  • Background jobs in MV3 need care. Cursor + Claude were excellent for tightening the SW lifecycle and async orchestration.
  • Multi‑provider UX matters. Keys, model lists, rate‑limit expectations, and clear error messages reduce friction for non‑technical users.
  • Generative AI agents make you far more effective, but you still need to keep your mind on every step (don’t trust blindly!).

Roadmap

  • More actions (FAQ extractor, key quotes, social captions?).
  • Domain allowlist and storage cleanup policy.
  • Smarter retry/backoff across providers.
  • Anything else you’d find useful—suggestions welcome.
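For the retry/backoff item, the usual shape is exponential backoff with jitter. A provider-agnostic sketch—constants are illustrative:

```typescript
// Delay before retry attempt n (0-based): base * 2^n, capped at capMs,
// with "full jitter" (a random fraction of the window) so many clients
// hitting the same rate limit don't retry in lockstep.
function backoffMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  const window = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * window;
}
```

Pair it with the provider’s `Retry-After` header when one is present, since that beats guessing.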

Try it

There’s a one‑page landing at https://organizer.solutions/briefly.html.
I’ll keep evolving AIDA with community input—ideas welcome.
