The problem that makes automation tempting

A developer recently shared their setup on r/iOSProgramming: 4 languages × 8 screenshots = 32 images; add iPad and it's 64. Every UI change or copy tweak means touching all of them. They wrote a Python script that uses Fastlane for UI capture and Pillow for compositing, then open-sourced it. A reasonable solution to a real problem.

But replies immediately raised the question: why not just use Fastlane's built-in frameit? Or a browser-based tool? The thread surfaces the same tradeoffs that come up every time this topic appears. Let's go through them honestly.

Three approaches, three different tradeoffs

Approach 1: Fastlane snapshot + frameit

Fastlane is an open-source automation toolkit for iOS and Android. Its snapshot action drives the iOS Simulator through UI tests and captures screenshots in every locale you configure. Its frameit action composites those captures into device frames with optional title overlays. For a detailed breakdown of snapshot's setup cost and when it's worth it, see our fastlane screenshots guide.

The full pipeline looks like:

  1. Write UITest scripts that navigate to each screen you want to capture
  2. fastlane snapshot runs those tests on each Simulator device and language you have configured and saves raw PNGs
  3. fastlane frameit applies device frames and text overlays from a Framefile.json config
  4. fastlane deliver uploads everything to App Store Connect
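
The three command-line steps above can be chained from a small wrapper. The following is a minimal sketch, not a recommended setup: it assumes the fastlane CLI is installed and that snapshot, frameit, and deliver are already configured for the project. The run_pipeline helper and its dry_run flag are hypothetical names used for illustration only.

```python
import shlex
import subprocess

# The three fastlane commands named in the steps above, in order.
PIPELINE = (
    ("fastlane", "snapshot"),  # capture raw PNGs via UI tests
    ("fastlane", "frameit"),   # apply device frames and titles
    ("fastlane", "deliver"),   # upload to App Store Connect
)


def run_pipeline(steps=PIPELINE, dry_run=True):
    """Run each step in order and return the command lines that were used.

    With dry_run=True nothing is executed; the commands are only printed.
    Otherwise check=True makes the first failing step raise, so a failed
    capture never reaches the upload step.
    """
    lines = []
    for cmd in steps:
        line = shlex.join(cmd)
        lines.append(line)
        if dry_run:
            print(f"[dry run] {line}")
        else:
            subprocess.run(list(cmd), check=True)
    return lines


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Keeping the dry run as the default is a deliberate choice here: deliver changes what is on App Store Connect, so the upload should be something you opt into.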

Real advantages:

Real costs:

Right for: Apps with 6+ localizations, frequent UI changes, and an existing CI pipeline. The per-update time savings compound once setup cost is amortized.

Approach 2: Custom script (Python + Pillow)

The developer in the original post took Fastlane's raw screenshot output and wrote their own Python compositing layer using Pillow instead of frameit. This gives full control over the design — custom fonts, arbitrary layouts, per-locale title text from JSON config, gradient backgrounds, text shadows.

The architecture they described:

  1. Fastlane snapshot → raw locale-organized PNGs
  2. Python script reads raw PNGs + locale JSON files
  3. Pillow composites: device frame + screenshot + background + title text
  4. Output: App Store-ready PNGs per locale
  5. Fastlane deliver uploads them
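
To make the compositing step concrete, here is a minimal sketch of what such a script could look like. It is not the original poster's code. It assumes Pillow 9.1 or newer, a TrueType or OpenType font file you supply, and a 1290×2796 canvas as a placeholder size; the function names, the title band height, the margin, and the background color are all illustrative. It omits the device frame for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    width: int
    height: int
    x: int
    y: int


def place_screenshot(canvas_size, shot_size, title_band, margin):
    """Scale the screenshot to fit below the title band, centered horizontally."""
    canvas_w, canvas_h = canvas_size
    shot_w, shot_h = shot_size
    avail_w = canvas_w - 2 * margin
    avail_h = canvas_h - title_band - margin
    # Use the smaller ratio so the screenshot fits in both dimensions.
    scale = min(avail_w / shot_w, avail_h / shot_h)
    width = min(avail_w, round(shot_w * scale))
    height = min(avail_h, round(shot_h * scale))
    x = (canvas_w - width) // 2
    y = title_band + (avail_h - height) // 2
    return Placement(width, height, x, y)


def compose(shot_path, title, font_path, out_path,
            canvas_size=(1290, 2796), title_band=500, margin=100):
    """Draw a flat background, the scaled screenshot, and a centered title."""
    # Pillow is imported here so the layout helper above works without it.
    from PIL import Image, ImageDraw, ImageFont

    canvas = Image.new("RGB", canvas_size, "#1c1c2e")  # assumed background color
    shot = Image.open(shot_path).convert("RGB")
    p = place_screenshot(canvas_size, shot.size, title_band, margin)
    resized = shot.resize((p.width, p.height), Image.Resampling.LANCZOS)
    canvas.paste(resized, (p.x, p.y))

    font = ImageFont.truetype(font_path, 96)
    draw = ImageDraw.Draw(canvas)
    # anchor="mm" centers the text on the given point (needs a TrueType/OpenType font).
    draw.text((canvas_size[0] // 2, title_band // 2), title,
              font=font, fill="white", anchor="mm")
    canvas.save(out_path)
```

Separating the layout arithmetic from the drawing calls keeps the part most likely to change (sizes and margins per device) testable without rendering any images.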

Advantages over frameit: Complete design control. Can produce results that look as polished as any visual tool. Fully version-controlled — every design decision is in code.

Additional costs over frameit: You're now maintaining a custom image processing script in addition to the UI tests. Pillow's text rendering has edge cases around font hinting, kerning, and Unicode that become real problems when you're rendering Japanese or Arabic captions. One comment in the thread also flagged an App Review risk around device frames, described in the gotcha below.
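
One way to catch the text-rendering edge cases early is to flag captions that need complex text layout before rendering them. The sketch below is an illustration under stated assumptions, not part of the original script: needs_complex_layout and raqm_available are hypothetical helper names, and the check only detects right-to-left characters, which covers Arabic and Hebrew but not every shaping issue.

```python
import unicodedata


def needs_complex_layout(text):
    """Return True if the text contains right-to-left characters.

    Bidirectional classes "R" (e.g. Hebrew) and "AL" (Arabic letters)
    mark text that simple left-to-right layout will render incorrectly.
    """
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def raqm_available():
    """Report whether this Pillow build includes Raqm complex text layout."""
    # Imported lazily so the detection helper above works without Pillow.
    from PIL import features
    return features.check("raqm")
```

If a caption needs complex layout and Raqm is not available in your Pillow build, it is safer to fail the run than to ship a screenshot with unshaped or reversed text. Japanese captions pass this check; their problems tend to come from fonts that lack the required glyphs, which a script like this does not detect.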

Right for: The same situations as Fastlane, plus developers who want design control that frameit can't provide and are comfortable maintaining a small Python codebase.

One gotcha to know: Custom device frame images that closely resemble Apple's official device renders can trigger a rejection, because Apple's guidelines prohibit using Apple's own product imagery without permission. Use community-sourced device frame assets (like those in Fastlane's own frame database) rather than Apple's marketing renders.

Approach 3: Browser-based visual tool

Tools like ezscreenshots skip the code entirely. You drag in a raw screenshot, pick a background and font, type a caption, and export. For localizations, you add languages, translate captions with one click, preview each locale in the editor, and export a localized zip.

Advantages:

Costs:

Right for: Solo developers and small teams who release infrequently (monthly or less), apps with a stable UI that doesn't change every update, and developers who want polished results without investing a day in setup.

The honest comparison

                        | Fastlane               | Custom script            | Browser tool
Setup time              | 1–2 days               | 2–3 days                 | Minutes
Per-update time         | ~0 (fully automated)   | ~0 (fully automated)     | 15–30 min
Design flexibility      | Low (frameit limits)   | High (code anything)     | High (visual editor)
Live app data           | Yes (Simulator capture)| Yes (Simulator capture)  | Manual capture first
Maintenance burden      | Medium (UITest upkeep) | High (UITests + script)  | None
Localization handling   | Good (native strings)  | Good (JSON config)       | Good (AI translation)
Breaks on Xcode update  | Sometimes              | Sometimes                | Never

When automation actually pays off

The break-even point for Fastlane automation is roughly: if you'd spend more than 8 hours per year manually updating screenshots, the 1–2 day setup is worth it.

For most solo developers, that math doesn't clear until you have:

For a developer with 2–3 localizations who releases monthly with a stable UI, the manual workflow with a visual tool is faster in total time — including the time not spent debugging UITest flakiness and Xcode Simulator oddities.
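
The break-even arithmetic can be sketched in a few lines. This is a rough model under stated assumptions: one setup day is treated as 8 working hours, automated per-update time is treated as zero, and the function names are illustrative.

```python
import math

HOURS_PER_DAY = 8  # assumption: one setup day equals 8 working hours


def manual_hours_per_year(releases_per_year, minutes_per_update):
    """Total hours per year spent updating screenshots by hand."""
    return releases_per_year * minutes_per_update / 60


def updates_to_break_even(setup_days, minutes_saved_per_update):
    """Number of updates before the setup time has paid for itself.

    Returns None when automation saves no time per update.
    """
    if minutes_saved_per_update <= 0:
        return None
    setup_minutes = setup_days * HOURS_PER_DAY * 60
    return math.ceil(setup_minutes / minutes_saved_per_update)


if __name__ == "__main__":
    print(manual_hours_per_year(12, 30))   # monthly releases, 30 min each
    print(updates_to_break_even(1, 30))    # 1-day setup
    print(updates_to_break_even(2, 30))    # 2-day setup
```

With monthly releases at 30 minutes each, manual work comes to 6 hours per year, which is under the 8-hour threshold. Under the same assumptions a 1-day setup takes 16 updates to pay back and a 2-day setup takes 32, before counting any time spent on UITest upkeep.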

A hybrid approach worth considering

One pattern that works well in practice: use Fastlane for raw screenshot capture (it's reliable for this, and running snapshot without frameit is straightforward), then import those raw captures into a visual tool for compositing.

You get the "live app data" benefit of Fastlane — your screenshots always show the current, real UI — with the design flexibility and zero maintenance of a visual editor. The raw captures are just images; you drag them into ezscreenshots, your saved theme applies instantly, and you export.
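
For the capture half of the hybrid workflow, a small helper can list the raw captures per locale so you know which files to drag into the editor. This sketch assumes snapshot wrote its output into one folder per locale (for example under fastlane/screenshots, which is an assumption about your configuration) and that captures are PNG files. The function name is hypothetical.

```python
from pathlib import Path


def captures_by_locale(root):
    """Map each locale folder name to its sorted list of PNG captures.

    Folders without PNGs and loose files in the root are ignored.
    """
    root = Path(root)
    result = {}
    for locale_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        shots = sorted(locale_dir.glob("*.png"))
        if shots:
            result[locale_dir.name] = shots
    return result


if __name__ == "__main__":
    # Assumed output location; adjust to match your snapshot configuration.
    for locale, shots in captures_by_locale("fastlane/screenshots").items():
        print(f"{locale}: {len(shots)} captures")
```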

[Image: dragging a raw screenshot into the ezscreenshots editor]
Drop in a raw Simulator screenshot and your saved theme applies immediately. Background, font, and device frame are all preserved from your last session.

Screenshots without the pipeline

Drop in a raw Simulator screenshot, add your caption and background, export at the right dimensions. For most apps, this is faster than setting up Fastlane — and the results look better than frameit.


Summary