Is Gemini CLI Worth The Hype?
This issue is brought to you by:
Securing AI agents is now possible with Auth0.
AI agents are reshaping digital experiences. But securing them requires rethinking identity and access controls built for a human-first world.
With a new AI coding assistant announced every other week, it’s easy to feel overwhelmed by the “paradox of choice.”
Google entered the game late with its Gemini CLI, but it made a smart move: the tool is free and completely open source, unlike earlier arrivals to the CLI game like Claude and AWS's Q Developer.
But is it just another Google freebie (and you know what they say about those...) to grab market share, or a genuinely powerful option that holds up with the rest of the pack?
I decided to push it beyond a simple game and see if it could build a production-ready, autonomous code review agent, you know, like Soham 🤣.
A "soham as a service" if you will.
Which, thankfully, is still SaaS!
The real power of AI CLI tools isn't unlocked by writing a single, perfect prompt, as so many people once believed... remember when “prompt engineering” was a thing?
Power is unlocked by solving two critical problems that most people ignore: breaking down complex work and handling security properly.
The latter is not straightforward, to put it gently.
My little experiment revealed that Gemini can perform sophisticated, parallel tasks, what the script calls “sub-agents”, but only when you give it clear, sequential instructions that explicitly ask for this parallelism.
I also discovered a way to let an AI agent access my GitHub account without giving it my passwords or secret tokens, solving a massive security flaw in most agentic workflows.
So, how can you put these learnings into action?
It comes down to thinking less like a user giving a command and more like an architect designing a system (yea, big leap but stay with me):
The big problem: AI agents are powerful, but also fragile and insecure
The dream is to have an AI agent that can handle complex, real-world tasks, like reviewing an entire codebase for bugs and security flaws. And not only that: it should be able to access that codebase in the first place, securely (imagine it accessing GitHub on your behalf, opening issues and PRs).
The problem is that these agents are often a black box: you can have all the open source code in the world, but whatever happens inside the model is black magic.
When you give them a big, complicated job, they tend to get confused, miss steps, or fail silently. Even worse, to do anything useful (like comment on a GitHub PR), they need access to your accounts. This creates a dilemma: how do you give an AI power without creating a massive security hole?
Most developers, when testing a tool like Gemini CLI, will write a long, detailed prompt and hope for the best.
To solve the access problem, they often do the unthinkable: they generate a personal access token and paste it directly into their code or environment variables, or worse, into the prompt itself.
They give the agent the keys to the kingdom and cross their fingers.
But it gets worse than just that: those keys you send in the prompt?
They're now slowly baked into the model.
Ever typed "export OPEN_AI_TOKEN=" and had an LLM complete it for you?
Yea, that's your key right there, potentially completed for someone else.
This process fails because models struggle to follow a long list of simultaneous commands.
You end up having to repeat yourself (as you would with a toddler...).
And embedding secrets directly in your code is a security nightmare of its own just waiting to happen.
It’s not a scalable or safe way to build anything serious.
A better way: divide and conquer
Instead of one giant task, I broke the problem down.
Inspired by a post on a subreddit, I wrote a prompt to see whether the tool could actually spin up separate, parallel tasks: I instructed Gemini to act as a manager for five “sub-agents.”
Each agent had one job: review a set of files and write its analysis to its own separate log.
A final step then collected these logs into a master report and cleaned the individual logs up.
This worked beautifully.
The AI followed the step-by-step logic perfectly, producing a genuinely useful, 200-line code review.
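If you want to try something similar, a prompt along these lines captures the structure (a rough sketch; the directory, file names, and log names are just illustrative):

```
You are the manager of five sub-agents.
Split the files under src/ into five groups, one group per sub-agent.
Run the sub-agents in parallel, not one after another.

Each sub-agent must:
1. Review its group of files for bugs, security issues, and code smells.
2. Write its findings to its own log file, e.g. review-agent-1.log.

When all five logs exist, merge them into a single report, CODE_REVIEW.md,
then delete the individual review-agent-*.log files.
```

The key is that every step has a single, checkable outcome (a log file exists, a report exists), so it's obvious where things went wrong if the agent quietly drops a step.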
More importantly, I solved the security problem by removing trust from the equation entirely.
Instead of giving my agent a secret token, I used an authentication service (Auth0) to act as a middleman:
- The agent requests access to GitHub.
- This triggers a browser pop-up on my machine, asking me to log in and grant permission.
- Once I approve, the service hands the agent a temporary, short-lived pass.
The agent never sees my password. I don't have to store secrets in my code. The access is temporary and secure. This is how you build autonomous systems that can be trusted to interact with the real world.
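If you're curious what that hand-off can look like in code, one common way to implement it is OAuth's device authorization flow, which Auth0 supports. Here's a minimal Python sketch: the tenant domain, client ID, and scope are placeholders, and how the resulting token maps to GitHub access depends on how your Auth0 tenant is configured, so treat it as an illustration of the pattern rather than a drop-in recipe.

```python
# Minimal sketch of the browser-approval hand-off via OAuth's device flow.
# Placeholders: the Auth0 tenant domain, client ID, and scope are made up.
import time
import requests

AUTH0_DOMAIN = "your-tenant.us.auth0.com"  # hypothetical tenant
CLIENT_ID = "YOUR_CLIENT_ID"               # hypothetical client

# 1. The agent asks Auth0 for a device code and a URL for the human to visit.
device = requests.post(
    f"https://{AUTH0_DOMAIN}/oauth/device/code",
    data={"client_id": CLIENT_ID, "scope": "openid"},
).json()
print(f"Open {device['verification_uri_complete']} and approve the request.")

# 2. The agent polls until the human approves in the browser.
while True:
    token = requests.post(
        f"https://{AUTH0_DOMAIN}/oauth/token",
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
            "device_code": device["device_code"],
            "client_id": CLIENT_ID,
        },
    ).json()
    if "access_token" in token:
        break  # approved: the agent now holds a short-lived, scoped token
    time.sleep(device.get("interval", 5))  # still pending, wait and retry

# 3. The agent uses this temporary token downstream; it never sees a password,
#    and nothing long-lived is stored in code or environment variables.
print("Got a temporary access token, expires in", token["expires_in"], "seconds")
```

The point isn't this exact flow; it's that the only thing the agent ever holds is a short-lived token that a human explicitly approved in the browser.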
While Gemini is a powerful tool, the hype can be misleading.
“It feels to me like whoever hypes these has either never coded a feature, let alone a project with these AI tools…”
- Someone on YouTube
These tools are not magic.
They are not yet ready to work without a human reviewing their output.
But by combining their raw power with smart architecture (breaking down problems and handling authentication securely), you can create something that genuinely enhances your workflow and gets you one step closer to that five-senior-developer Soham-as-a-service dream 😉.
Thank you for reading.
Feel free to reply directly with any question or feedback.
Have a great weekend!