Automating Free AI Image Generation with n8n and Nano Banana

Table of Contents
- Introduction: The Buzz is Real
- Why Automate Image Generation with n8n and Nano Banana?
- Getting Started with OpenRouter’s Free API
- Part 1: Chat-Based Image Generation
- Part 2: Form-Based Editing (Nano Banana’s True Power)
- Part 3: The Code Node – Assembling the Payload
- Part 4: Simplifying the HTTP Request
- Part 5: Real-World Tests
- Popular Nano Banana Use Cases (What People Are Doing Right Now)
- Wrapping Up: Your Turn to Build
Introduction: The Buzz is Real
The AI community has been buzzing like crazy lately, and if you’ve been online at all, you’ve probably seen it too: Nano Banana is everywhere. Tutorials are exploding on YouTube, and Reddit is overflowing with wild experiments. This isn’t just another incremental update to an existing model; it feels like a genuine step-change in accessibility and capability. For developers, artists, and automation enthusiasts, this is the kind of disruption we live for.
This model is shaking things up because it’s not just fast and free—it’s also remarkably versatile, making it perfect for both playful exploration and serious creative automation. As someone who lives and breathes n8n AI workflows, I couldn’t just scroll past the endless stream of generated images. The moment I saw its potential, the gears in my head started turning. Of course, I had to build a full automation pipeline with it.
And here’s the kicker that makes this whole endeavor possible for everyone: OpenRouter offers a completely free Nano model endpoint. Yes, you read that right: free. For tinkerers, bootstrappers, and automation nerds, that’s like being handed the keys to a candy store. The usual barrier of API costs is gone, leaving nothing but pure, unadulterated creative potential. This is our chance to build, experiment, and innovate without watching a billing meter tick up.
Why Automate Image Generation with n8n and Nano Banana?
Sure, you could just type prompts into a web UI and get results back. That’s fun for the first 10 minutes. It gives you a feel for the model’s capabilities and quirks. But if you’re reading this, you’re probably like me: once you see the potential, you can’t help but think, “How can I scale this? How can I integrate this? How can I automate this?” The real power isn’t in one-off generations; it’s in building systems that create, edit, and distribute content on their own.
Here’s the approach I took, designed like levels in a video game to gradually build complexity and reusability:
- Chat-Based Image Generation: A simple, direct setup where users send a text prompt via a chat interface, and Nano spits out an image. This is the foundational “hello world” of image automation.
- Form-Based Image Editing: Level up by allowing users to upload their own image and apply AI-powered edits. This is where Nano’s multimodal power shines—changing outfits, swapping backgrounds, adjusting styles, and more.
- Telegram Integration: Take it all the way by connecting your workflow to a real-world application. Send a message or a photo in a Telegram chat, and behind the scenes, Nano + n8n work their magic to deliver a result directly to your conversation.
This tiered design isn’t just about adding complexity—it’s about building reusable foundations. Once you nail the core logic of calling the API and handling the response, adding more channels (like Slack, Discord, a custom web app, or even a WordPress media library) becomes trivial. You’re not just building a single trick; you’re building a creative engine.
Getting Started with OpenRouter’s Free API
Let’s take a quick step back and talk about how to actually call Nano Banana via an API. This is the technical heart of our automation.
Step 1. Get Your API Key
First things first, you need credentials. Head over to OpenRouter, create an account (it’s quick), and generate an API key. The free keys have rate limits, but they are incredibly generous and more than perfect for learning, testing, and running light-to-medium personal workflows. This key is your passport to the world of free image generation.
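The cURL example in the next step references an $OPENROUTER_API_KEY environment variable. If you want to try the API from your terminal before wiring it into n8n, export the key first (the value below is a placeholder, not a real key):

# Placeholder: paste the key you generated on OpenRouter
export OPENROUTER_API_KEY="your-key-goes-here"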
Step 2. Understand the API Documentation
The API documentation for the free Nano Banana model lives at:
https://openrouter.ai/google/gemini-2.5-flash-image-preview:free/api
OpenRouter’s documentation provides multiple code samples in different languages. For our purposes, just pay close attention to the cURL example, because that’s what n8n can ingest directly, saving us a ton of manual configuration time.

Example curl Command
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -d '{
    "model": "google/gemini-2.5-flash-image-preview:free",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What is in this image?" },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
            }
          }
        ]
      }
    ]
  }'
Breaking It Down
- model → This is where you specify "google/gemini-2.5-flash-image-preview:free". It tells OpenRouter exactly which model to use.
- messages → The payload is structured like a chat conversation. This is crucial because it allows for complex, multi-part inputs. The content is an array that can support both text and images simultaneously.
- Content Types:
  - "text" for your written prompt. This is where you describe what you want to see.
  - "image_url" for supplying an input image. This supports both standard HTTP URLs and, more importantly for automation, Base64 Data URIs, which let you embed an image directly into the API call.
This dual modality—the ability to handle text and images in a single call—is why people are calling it a banana. It’s fun, flexible, and splits open to reveal powerful capabilities. 🍌
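To make that dual modality concrete outside of n8n, here is a minimal Node.js sketch (assuming Node 18+ for the built-in fetch) that sends a text prompt and a local image, encoded as a Base64 Data URI, to the same chat completions endpoint. The file name input.jpg and the prompt are placeholders; the request shape mirrors the cURL sample above.

// Minimal sketch: text + image in one request (Node 18+, built-in fetch)
const fs = require("fs");

async function describeImage() {
  // Read a local image and embed it as a Base64 Data URI (placeholder file name)
  const base64 = fs.readFileSync("input.jpg").toString("base64");
  const dataUri = `data:image/jpeg;base64,${base64}`;

  const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    },
    body: JSON.stringify({
      model: "google/gemini-2.5-flash-image-preview:free",
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Describe this image in one sentence." },
            { type: "image_url", image_url: { url: dataUri } },
          ],
        },
      ],
    }),
  });

  console.log(await response.json());
}

describeImage();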
Part 1: Chat-Based Image Generation
Alright, let’s get our hands dirty and wire this up inside n8n. We’ll start with the simplest text-to-image workflow.
Workflow structure: On Chat Message → HTTP Request → Edit Fields → Convert to File

Step 1. On Chat Message Trigger
Drag the On Chat Message trigger node onto your canvas. This node acts as the entry point. Anytime a user types something into the n8n chat, this workflow will spring to life.
Step 2. HTTP Request Node
This is where we call OpenRouter. Instead of manually configuring the headers, URL, and body, we can use a massive shortcut. Click the Import cURL button, paste in the entire cURL snippet from the OpenRouter docs, and watch as n8n automatically populates all the fields for you. It’s magic.

⚠️ Important Security Tip: Do not hardcode your API key directly in the HTTP headers. This is a security risk. Always use n8n’s built-in Credentials system. Create a “Header Auth” credential, store your API key there, and reference it in the node. That way, if you share or export your workflows, your secrets stay safe and sound.
Step 3. Make the Body Dynamic
Now, we need to replace the static text prompt with the user’s actual input. Go to the request body and swap in an n8n expression; wrapping the chat message in JSON.stringify keeps the body valid even when the prompt contains quotes or line breaks. Adding the modalities field (the same field our Code Node sets in Part 3) tells OpenRouter to return an image alongside the text:
{
  "model": "google/gemini-2.5-flash-image-preview:free",
  "modalities": ["image", "text"],
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": {{ JSON.stringify($json.chatInput) }} }
      ]
    }
  ]
}

Step 4. Extracting the Image Data
The API response contains a data:image/png;base64,... Data URI. This is essentially the image file encoded as a long string of text. To isolate just the Base64 part that we need, drop in an Edit Fields node (or a “Set” node in older n8n versions) and create a new field with this expression:
{{ $json.choices[0].message.images[0].image_url.url.split(",")[1] }}
This simple JavaScript snippet splits the string at the comma and takes the second part—the raw Base64 data.
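If you want to see exactly what that expression does, here is the same split written as plain JavaScript against a trimmed-down, made-up response object (the field path mirrors the expression above; the Base64 content is truncated for readability):

// Illustrative only: a trimmed-down response shaped like the expression expects
const response = {
  choices: [
    {
      message: {
        images: [
          { image_url: { url: "data:image/png;base64,iVBORw0KGgoAAAANSUhEUg..." } }
        ]
      }
    }
  ]
};

// Split the Data URI at the comma: [0] is the "data:image/png;base64" prefix,
// [1] is the raw Base64 payload we want to keep
const base64Part = response.choices[0].message.images[0].image_url.url.split(",")[1];
console.log(base64Part); // "iVBORw0KGgoAAAANSUhEUg..."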

Step 5. Convert to a Usable File
The final step is to turn that raw data back into an actual image. Add a Convert to File node. It will take the Base64 string as input and output a binary image file. Now you’ve got something you can preview directly in the n8n editor, save to your computer, or pass along to another system.
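Under the hood this is just a Base64 decode. As a conceptual sketch only (not what n8n runs internally), the node does the equivalent of this Node.js step; the file name is arbitrary:

// Conceptual equivalent of the Convert to File step: decode Base64 into bytes
const fs = require("fs");

// The string produced by the Edit Fields node (truncated here for readability)
const base64Part = "iVBORw0KGgoAAAANSUhEUg...";
const buffer = Buffer.from(base64Part, "base64");

// Write the bytes to disk as a PNG (arbitrary file name)
fs.writeFileSync("generated-image.png", buffer);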
✅ Result: You’ve successfully built a smooth, automated text-to-image workflow inside n8n. Type a prompt, get an image. Simple, powerful, and incredibly satisfying.

Part 2: Form-Based Editing (Nano Banana’s True Power)
Now for the truly exciting part: AI-powered image editing. This is what makes Nano Banana a game-changer. It’s not just a generator; it’s a collaborator.

People are using this for all sorts of incredible transformations:
- Outfit swaps: “Take this photo of me and put me in a spacesuit.”
- Background changes: “Replace the boring office wall with a view of the Alps.”
- Style transfers: “Make this family portrait look like it was painted by Van Gogh.”
- Character creation: Turning real photos into professional, anime-style avatars.
But to do any of this, our workflow needs to handle file uploads. This is where n8n’s native form capabilities come in handy.
Why Use n8n Forms?
- Fully integrated: No need for third-party form builders. It’s all within the n8n ecosystem.
- Handles binary data: It can manage image uploads natively, without any complex workarounds.
- Easy to design: The drag-and-drop interface makes it simple to create and customize your input fields.

Designing the Form
For maximum flexibility, you only need two fields:
- Prompt (Textarea, Required): This is where the user describes the edit. Example: “Turn this kitchen photo into a medieval castle scene, keeping the cat.”
- Reference Image (File Upload, Optional): This is for the input image. It should accept common formats like JPG and PNG.
👉 Pro Tip: Making the image upload field optional is a clever design choice. It allows the very same workflow to support both text-to-image generation (when no file is uploaded) and image-to-image editing (when a file is provided). This makes your automation incredibly versatile.
Part 3: The Code Node – Assembling the Payload
This is the brain of the entire operation. The Code Node is where we’ll use a little bit of JavaScript to handle the conditional logic (is there an image or not?) and format the payload exactly how the OpenRouter API expects it.
Here’s the version I’m using, with comments to explain each part:
// n8n Code node - assemble the API payload for the Google Nano model
// Assume data is received from the previous Form node
const inputData = $input.all();
const prompt = $input.first().json["Prompt"];
const results = [];

for (const item of inputData) {
  // Get prompt text from the form
  const promptText = prompt;

  // Initialize an array to hold uploaded image data
  let uploadedImages = [];

  // Check if there is binary data (uploaded files) from the n8n Form node
  if (item.binary) {
    for (const [fieldName, binaryData] of Object.entries(item.binary)) {
      if (binaryData.mimeType && binaryData.mimeType.startsWith('image/')) {
        uploadedImages.push(binaryData);
      }
    }
  }

  // The 'content' array will hold our text and image parts
  const content = [];

  // Add the text prompt if it exists
  if (promptText && promptText.trim()) {
    content.push({ type: "text", text: promptText.trim() });
  }

  // Add any uploaded images, converting them to Base64 Data URIs
  if (uploadedImages.length > 0) {
    for (const image of uploadedImages) {
      if (image && image.data && (image.mimeType || image.mimetype)) {
        const base64Data = Buffer.isBuffer(image.data)
          ? image.data.toString('base64')
          : image.data;
        const mimeType = image.mimeType || image.mimetype;
        const imageUrl = `data:${mimeType};base64,${base64Data}`;
        content.push({ type: "image_url", image_url: { url: imageUrl } });
      }
    }
  }

  // Assemble the final API payload
  const payload = {
    model: "google/gemini-2.5-flash-image-preview:free",
    modalities: ["image", "text"],
    messages: [{ role: "user", content }]
  };

  // As a fallback, if no content was added, add a default prompt.
  if (payload.messages[0].content.length === 0) {
    payload.messages[0].content.push({
      type: "text",
      text: "Please analyze this content"
    });
  }

  // Add the fully formed payload to our results
  results.push({ json: payload });
}

// Return the results to the next node in the workflow
return results;
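To make the output tangible: for a form submission with the prompt “Make the sky purple” and one uploaded JPG, the code above would emit a single item whose json looks roughly like this (the Base64 payload is truncated; the exact values depend on your upload):

{
  "model": "google/gemini-2.5-flash-image-preview:free",
  "modalities": ["image", "text"],
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Make the sky purple" },
        {
          "type": "image_url",
          "image_url": { "url": "data:image/jpeg;base64,/9j/4AAQSkZJRg..." }
        }
      ]
    }
  ]
}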

Part 4: Simplifying the HTTP Request
Once the Code Node has done its magic, the HTTP Request node becomes ridiculously simple. Since our code has already constructed the entire JSON payload, all we need to do is pass it along.
Configure the node as follows:
- Body Content Type: JSON
- Specify Body: Using JSON
- JSON:
{{ $json }}
That’s it. This expression tells the node to take the entire JSON output from the previous Code Node and use it as the body of the API request. Clean, simple, and elegant.
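👉 Note: on some n8n versions, passing an object straight into the JSON field can end up rendered as plain text (you’ll see [object Object] in the outgoing request). If that happens, the equivalent expression below serializes the payload explicitly before it’s sent:
{{ JSON.stringify($json) }}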

Part 5: Real-World Tests
This is where it all comes together and gets fun. We can now test the full image-editing power of our form-based workflow.
Let’s try a prompt that maintains the character from an input image but changes the style and context: Prompt Example: “The same chubby cat character, wearing a tiny apron, holding a small dish towel and carefully wiping a plate. The cat looks very focused and serious but still cute. The background shows a tidy kitchen. Warm lighting, semi-realistic cartoon style, keep the character identity.”

🎉 The results are incredible. The model understood not just the literal objects but the vibe—it maintained the character’s core features while flawlessly applying the new art style and context described in the prompt. The details on the apron, the focused expression, and the warm kitchen lighting are all spot on. This is the magic of multimodal AI automation.
Popular Nano Banana Use Cases (What People Are Doing Right Now)
This model’s flexibility means the creative use cases are exploding across the internet. Here are some of the hottest and most impressive ones I’ve seen, all of which can be automated in n8n with minor tweaks to the workflow we just built. For more inspiration, browse the ready-made workflows in the n8n templates library.
Turning Screenshots into Collectible Figurines
Gamers and character designers are having a field day with this one. Take a simple screenshot of a video game character or a 2D drawing and ask Nano Banana to reimagine it as a physical object. The key is to be descriptive in your prompt.
Example Prompt: “Turn this character screenshot into a high-quality 3D vinyl figurine. Add a professional packaging box with the character’s art on it behind the figure. Place a computer monitor nearby with 3D modeling software open, showing the character’s wireframe. Put the figure on a transparent PVC base with a crystal-clear texture. The scene should be indoors with studio lighting.”
The results are often staggering. The model understands not just the literal objects (box, monitor, base) but also the context and *vibe*—it adds realistic textures, lighting, and even subtle details to the accessories and facial expressions that make the final image look like a real product photoshoot.
Instant Outfit Swaps for Social Media
Fashion influencers and content creators are using Nano Banana to generate lifestyle content at an unprecedented speed. Instead of doing multiple wardrobe changes for a photoshoot, they can take one good selfie and generate dozens of variations.
Example Prompt: “Using the attached selfie, change my blue t-shirt to a black leather jacket. Keep my face, hair, and the background exactly the same. Make the lighting on the jacket match the existing lighting in the photo.”
This is a powerful demonstration of the model’s in-painting and editing capabilities. It can isolate specific elements of an image and modify them while preserving the integrity of the surrounding areas. The ability to specify matching lighting is crucial for creating believable edits.
Radical Background Replacement
This is one of the most practical and widely used features. E-commerce sellers, real estate agents, and anyone looking to improve their photos can instantly swap out a mundane background for something more appealing.
Example Prompt: “Take the photo of the handbag and place it on a marble tabletop in a luxury boutique. The background should be slightly blurred to create a shallow depth of field. Add some soft, warm lighting coming from the side.”
What makes this so powerful is the model’s ability to realistically handle shadows and reflections. It doesn’t just cut and paste the object; it integrates it into the new environment, making the final composite far more convincing than what you could achieve with traditional photo editing tools without significant manual effort.
Whimsical Pet Transformations
On the more playful side of things, the internet is doing what it does best: putting pets in ridiculous and adorable situations. This is a great way to explore the model’s creativity and sense of humor.
Example Prompt: “Here is a photo of my golden retriever. Please dress him in shining medieval plate armor, holding a tiny sword in his mouth. Place him on a grassy hill, with a castle in the background under a dramatic sunset.”
The model excels at these kinds of imaginative fusions, blending the features of the real pet with the fantastical elements of the prompt. The results range from hilarious to surprisingly epic, and they are a testament to how well Nano Banana can interpret creative and abstract ideas.
Stylized Portraits and Avatars
Everyone wants a cool profile picture. Nano Banana can take a standard selfie and transform it into a piece of art in virtually any style imaginable. This is perfect for creating unique avatars for social media, gaming profiles, or professional websites.
Example Prompt: “Convert this selfie into the art style of a Pixar animated movie. Give me large, expressive eyes and soft, rounded features, but make sure it still looks like me. The background should be a simple, solid color.” or “Redraw this portrait in a classic Studio Ghibli anime style, with a watercolor-painted background of a field of flowers.”
The key here is the model’s ability to capture the likeness of the person while faithfully applying the artistic conventions of the requested style. It’s not just a filter; it’s a re-interpretation, and the results can be beautiful.
Rapid Product Mockups and Prototyping
For entrepreneurs and designers, this is a productivity powerhouse. You can generate product mockups, packaging designs, or promotional posters in seconds, allowing for rapid iteration and brainstorming without needing a graphic designer for initial concepts.
Example Prompt: “Create a photorealistic mockup of a coffee bag for a brand called ‘Rocket Fuel’. The bag should be matte black with a minimalist white logo of a rocket ship. Place the bag on a rustic wooden table next to a cup of coffee and some scattered coffee beans.”
This use case can save countless hours in the early stages of product development. It allows you to visualize ideas instantly, get feedback, and refine your concepts before committing to expensive design work. Paired with the chat-based workflow from Part 1, generating these mockups takes nothing more than a well-written prompt.
Wrapping Up: Your Turn to Build
Nano Banana isn’t just hype—it’s a genuinely powerful and accessible tool for creative expression and automation. When you pair it with n8n’s robust workflow capabilities, the combination becomes practically unstoppable. With the foundation we’ve built today, you can:
- Generate images from simple chat messages.
- Perform complex edits on existing photos via user-friendly forms.
- Push these capabilities into any application you use daily, like Telegram, Slack, or even your own website.
And thanks to OpenRouter’s free Nano Banana endpoint, the barrier to entry is effectively zero. You have a full-fledged creative AI playground at your fingertips where you can test, experiment, and build powerful workflows without spending a dime.
If you’re into AI, automation, or just love to play with the latest and greatest shiny tools, this is the perfect weekend project to dive into. The workflows are simple to set up, but the potential for what you can create is nearly limitless.
🔑 Final Thought: The future of creative automation isn’t about single clicks in a web UI—it’s about chaining powerful tools together to build intelligent, autonomous systems. Nano Banana gives us a fantastic and free creative model, and n8n gives us the plumbing to make it practical, scalable, and integrated into our digital lives.