Art Critic AI Agent


Building an autonomous AI pipeline for daily creative critique and content generation.

For 9 years straight, I’ve created a work of art every single day as part of my Everyday Project - a project rooted in discipline, experimentation, and an evolving creative process. It’s not just a habit; it’s a living archive of how ideas grow over time.

To take it even further, I built an autonomous AI agent that critiques each daily piece, turning every artwork into an experience - analyzing, narrating, and packaging it into shareable audio [Podcast] and video [YouTube/Instagram] content, all fully automated.

This is Everydays meets AI, reflection meets creation. A daily loop of making, critiquing, and sharing.

This is also a step forward in my exploration to automate parts of my life.

GitHub Repo
Workflow Architecture

The render image → text analyzing the image → text to audio [Podcasts] → combined video
Pipeline Breakdown
Stage 1 – Text & Audio Generation

  1. Input: Daily render image from my ongoing Everyday project.
  2. Perception: The agent analyzes the image using OpenAI’s GPT-4o Vision API (see the sketch after this list).
    Later replaced by local LLMs (Mistral, Llama 3 via Ollama) for a free, offline workflow.
  3. Reasoning: Generates an art critique text based on the image content.
  4. Voice: Converts the critique text to audio narration (also sketched below).
    Initially used OpenAI TTS, later integrated free alternatives like Coqui TTS and Piper TTS.
  5. Bonus: Suggests an existing artwork related to the piece being critiqued.
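As a rough sketch of how Stage 1 can be wired up in code - assuming the openai Python SDK, an OPENAI_API_KEY in the environment, and placeholder function names, voice, and paths of my own choosing - the perception/reasoning and voice steps reduce to two small functions:

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def critique_image(image_path: str, prompt: str) -> str:
    """Perception + reasoning: send the daily render to GPT-4o, return the critique."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

def synthesize_speech(text: str, out_path: str = "narration.mp3") -> str:
    """Voice: convert the critique text to audio narration via OpenAI TTS."""
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",  # placeholder voice
        input=text,
    )
    response.write_to_file(out_path)
    return out_path
```

The local-first variant keeps the same shape: the chat call goes to a vision-capable model served by Ollama instead of GPT-4o, and the narration is rendered by Coqui or Piper rather than OpenAI TTS.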


Stage 2 – Video Synthesis

  1. Combines:
    1. Daily render image (as static background)
    2. Audio narration
  2. Uses FFmpeg (open-source) to produce the final .mp4 video (see the sketch below).
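The whole composition is essentially a single FFmpeg invocation. A minimal sketch, driving a standard still-image-plus-audio recipe from Python (paths and the helper name are placeholders):

```python
import subprocess

def render_video(image_path: str, audio_path: str, out_path: str = "daily.mp4") -> str:
    """Loop the still image under the narration and encode the final .mp4."""
    subprocess.run([
        "ffmpeg", "-y",
        "-loop", "1", "-i", image_path,   # repeat the still image as video frames
        "-i", audio_path,                 # narration track
        "-c:v", "libx264", "-tune", "stillimage",
        "-c:a", "aac",
        "-pix_fmt", "yuv420p",            # broad player compatibility
        "-shortest",                      # stop when the narration ends
        out_path,
    ], check=True)
    return out_path
```

The -shortest flag keeps the video exactly as long as the audio, so each day's clip matches its narration without manual trimming.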

Prompt

“Describe and critique this artwork in detail. Also suggest an existing piece of art that is similar to this based on your analysis. Check and make sure that it is an existing artwork.”
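Chained together, the whole daily loop collapses into a few calls - a sketch reusing the hypothetical helpers from the earlier snippets, with placeholder paths:

```python
PROMPT = (
    "Describe and critique this artwork in detail. Also suggest an existing "
    "piece of art that is similar to this based on your analysis. Check and "
    "make sure that it is an existing artwork."
)

image = "renders/today.png"                      # placeholder path to the daily render

critique = critique_image(image, PROMPT)         # Stage 1: perception + reasoning
audio = synthesize_speech(critique)              # Stage 1: voice
video = render_video(image, audio, "today.mp4")  # Stage 2: video synthesis
```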


Tools & Technologies
  1. Text Generation: OpenAI GPT-4o, Mistral/Llama 3 via Ollama
  2. Text-to-Speech: OpenAI TTS, Coqui TTS, Piper TTS
  3. Video Composition: FFmpeg
  4. APIs: Meta Graph API, YouTube Data API (explored for auto-posting)
Takeaways
  1. Created a multi-modal AI agent exhibiting perception → reasoning → action.
  2. Transitioned from paid APIs to a free, local-first setup, increasing accessibility and sustainability.
  3. Open to future extensions:
    Auto-posting to platforms
    Engagement-driven feedback loops
    Fully autonomous daily outputs