docs: document reference-images flow in AGENTS.md

2026-05-18 22:25:17 +08:00
parent d3be31d038
commit 54f13c1097
+20 -7
@@ -19,20 +19,29 @@ generation endpoint and serves a small vanilla HTML/JS playground.
  `noEmit: true`, so plain `bunx tsc` works too).
  - Tests / lint / formatter: none configured. If adding tests, use `bun test`.
- The server binds `0.0.0.0` (see `index.ts:6`), so it is reachable from other
+ The server binds `0.0.0.0` (see `index.ts:61`), so it is reachable from other
  hosts on the network when running locally — be mindful when entering API keys.
  ## Architecture
  - `index.ts` — the entire backend. One `Bun.serve` instance with:
  - `/` serves `index.html` via Bun's HTML import (`import index from "./index.html"`).
- - `POST /api/generate` accepts `{ baseURL, apiKey, model, prompt, size }`,
- builds an OpenAI-compatible provider with `@ai-sdk/openai-compatible`, and
- calls `generateImage` from `ai`. Images come back as base64 and are
- returned as `data:` URLs in `{ images: string[] }`.
+ - `POST /api/generate` accepts
+ `{ baseURL, apiKey, model, prompt, size, referenceImages? }`. It returns
+ `{ images: string[] }` where each entry is a `data:` URL (base64).
+ - Two code paths inside the handler:
+ 1. No `referenceImages` → uses `@ai-sdk/openai-compatible` + `generateImage`
+ from `ai`.
+ 2. `referenceImages` present → hand-rolled `multipart/form-data` POST to
+ `${baseURL}/images/edits` (see `generateWithReference`). The AI SDK
+ does not currently expose image edits for OpenAI-compatible providers,
+ so this path bypasses it on purpose. The edits endpoint is gpt-image
+ series only (see UI hint in `index.html`).
  - `index.html` — self-contained UI: inline CSS, plain DOM JS, no build step.
- Settings (baseURL, apiKey, model, size, prompt) persist in `localStorage`
- under the `aip:<field>` prefix. There is no React code despite
+ Text fields (`baseURL`, `apiKey`, `model`, `size`, `prompt`) persist in
+ `localStorage` under the `aip:<field>` prefix. Reference images are kept
+ in an in-memory `refImages` array as base64 data URLs and are **not**
+ persisted — refreshing the page drops them. There is no React code despite
  `react` / `react-dom` / `@types/react*` being in `package.json` — treat
  those deps as latent. Do not invent a React frontend unless asked.
  - No router, no DB, no auth. API key is supplied per-request by the browser
@@ -59,3 +68,7 @@ hosts on the network when running locally — be mindful when entering API keys.
  - The AI SDK image type is loose; the current handler casts to
  `{ mediaType?: string; base64?: string }`. Mirror that pattern rather than
  trusting field presence.
+ - For anything the AI SDK does not cover (e.g. image edits, masks, variations),
+ follow `generateWithReference`: build `FormData` with `Blob`s decoded from
+ the incoming data URLs and `fetch` the upstream endpoint directly with the
+ caller's `Authorization: Bearer <apiKey>`.
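
The last added bullet describes the `generateWithReference` pattern without showing it. The sketch below is one possible shape for that flow, not the code from `index.ts`, which this commit does not include. The helper names (`dataUrlToBlob`, `buildEditForm`, `postImageEdit`), the `image[]` field name, and the `reference-<n>` file names are assumptions for illustration.

```typescript
// Hypothetical helpers illustrating the pattern from AGENTS.md:
// data URL -> Blob -> FormData -> direct fetch with the caller's key.

/** Decode a base64 `data:` URL into a Blob, keeping its media type. */
function dataUrlToBlob(dataUrl: string): Blob {
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+/=]+)$/.exec(dataUrl);
  if (!match) throw new Error("expected a base64 data: URL");
  const mediaType = match[1] ?? "application/octet-stream";
  const binary = atob(match[2] ?? "");
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return new Blob([bytes], { type: mediaType });
}

/** Build the multipart body; text fields first, then one part per image. */
function buildEditForm(
  fields: { model: string; prompt: string; size?: string },
  referenceImages: string[],
): FormData {
  const form = new FormData();
  form.append("model", fields.model);
  form.append("prompt", fields.prompt);
  if (fields.size) form.append("size", fields.size);
  referenceImages.forEach((dataUrl, i) => {
    const blob = dataUrlToBlob(dataUrl);
    const ext = blob.type.split("/")[1] ?? "bin";
    // Assumption: the upstream accepts repeated `image[]` parts.
    form.append("image[]", blob, `reference-${i}.${ext}`);
  });
  return form;
}

/** POST the form to the edits endpoint using the caller-supplied key. */
async function postImageEdit(
  baseURL: string,
  apiKey: string,
  form: FormData,
): Promise<Response> {
  // No explicit Content-Type: fetch adds the multipart boundary itself.
  return fetch(`${baseURL}/images/edits`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
}
```

A handler following this pattern would call `buildEditForm` with the request's text fields and `referenceImages`, pass the result to `postImageEdit`, and then map the upstream JSON into the `{ images: string[] }` shape. Leaving `Content-Type` unset matters: a hand-written `multipart/form-data` header would omit the boundary, and the upstream would be unable to parse the body.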