Available for projects · Remote · Lagos, Nigeria

AI Model Evaluation Engineer

I test, compare, and score AI systems with structured precision — turning messy model outputs into clear, actionable quality signals that make AI products smarter and safer.

50+
Evaluation Projects
3
Models Benchmarked
95%
Accuracy Uplift
100%
Remote Ready
01
About Me
Olatunji Habeeblahi O.
AI Evaluation Engineer

From words to workflows —
to making AI tell the truth.

I’m Olatunji Habeeblahi O., an AI Model Evaluation Engineer and Automation Specialist based in Lagos, Nigeria. My path into AI didn’t begin in a lab — it started with language.

I began my career as a Technical Writer, learning how to translate complex systems into clear, structured communication. That discipline for precision — knowing exactly what something does and why it matters — turned out to be the perfect foundation for everything that followed.

From writing, I moved into AI Automation, building intelligent, self-running workflows in n8n, Zapier, and Make.com. I was designing systems that didn’t just do tasks — they made decisions, routed content, and scaled without adding manual overhead. That work deepened my curiosity about the intelligence behind the tools themselves.

That curiosity led me into Prompt Engineering — studying how language shapes model behaviour, what makes a prompt fail, and how small wording changes can produce entirely different outputs. I began testing systematically, not just intuitively.

Today, I work as an AI Evaluation Specialist. I test AI systems with structured rubrics, run head-to-head agent comparisons, identify bias and failure modes, and produce actionable feedback that engineering teams can use. I work independently and remotely, adapt quickly to new guidelines, and deliver insights that improve AI system reliability and user experience.

Career Journey
✍️
Technical Writer
Built the foundation — precision, clarity, and structured communication under ambiguity.
⚙️
AI Automation Specialist
Designed intelligent n8n workflows, API integrations, and multi-platform automation systems.
🧷
Prompt Engineer
Studied how language steers model behaviour — prompt design, testing, and refinement at scale.
🔮
AI Evaluation Specialist
Rubric-based scoring, QA testing, agent comparison, bias detection — making AI accountable.
LLM Evaluation · Rubric Design · Agent Testing · n8n Automation · QA Engineering · Bias Detection · Prompt Engineering · Technical Writing
02
Core Competencies
🧠
AI Model Evaluation
  • LLM output comparison
  • Rubric-based scoring frameworks
  • Qualitative & quantitative feedback
  • Bias & hallucination detection
  • Prompt behaviour testing
  • Edge-case & failure-mode analysis
🔮
QA & Testing
  • AI workflow validation
  • Regression testing
  • UX-focused output assessment
  • Annotation guideline interpretation
  • Scenario-based testing
  • Documentation & SOPs
⚙️
Automation & Tech
  • n8n workflow design
  • API-based system validation
  • Data integrity checks
  • Automation QA & monitoring
  • Multi-platform integration
  • Prompt segmentation per platform
📊
Analysis & Reporting
  • Structured analytical reports
  • Performance benchmarking
  • Root cause analysis
  • Actionable recommendation docs
  • Scoring framework design
  • Model behaviour mapping
✍️
Writing & Communication
  • Technical writing & documentation
  • Professional English (written + verbal)
  • Clear analytical reporting
  • Prompt & annotation guidelines
  • SOP creation
  • Stakeholder communication
🔧
Tools & Platforms
  • n8n, Zapier, Make.com
  • GPT-4, Claude, Gemini
  • HubSpot CRM (Certified)
  • Google Workspace & Airtable
  • MindStudio, Notion
  • API & webhook integrations
03
Experience
2023 — Present
Freelance · Remote
AI Automation & Evaluation Specialist

Evaluated AI-driven automation workflows for logical accuracy, consistency, and reliability. Tested system outputs against defined requirements and edge cases. Analysed AI-generated responses to identify errors, inconsistencies, and areas for improvement. Produced structured, actionable feedback to improve model behaviour and workflow quality. Designed and executed QA test cases, and documented evaluation criteria and outcomes.

LLM Evaluation · Rubric Design · n8n · QA Testing · Prompt Engineering
2021 — 2023
TeeJay Construction · Ibadan, Nigeria
Managing Partner — Operations & Analysis

Evaluated operational processes and decision outcomes to drive efficiency improvements. Analysed market and client data to guide strategic business decisions. Managed documentation, reporting, and stakeholder communication. Applied structured judgment in high-ambiguity scenarios, a discipline that translates directly into AI evaluation work.

Process Analysis · Strategic Reporting · Operations
04
Selected Work

Work that
speaks.

🧪
QA Testing · Agent Comparison
Multi-Model Agent QA & Benchmark Study

Structured head-to-head evaluation of GPT-4, Claude, and Gemini across reasoning, accuracy, and consistency.

📐
Rubric Design · LLM Scoring
LLM Response Quality Scoring Framework

Designed a multi-dimensional rubric for evaluating LLM output quality, combining criteria adapted from industry standards with original ones.
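
As a rough illustration of how a multi-dimensional rubric can be turned into a single score, here is a minimal Python sketch. The dimension names, weights, and the 1-5 rating scale are hypothetical placeholders chosen for the example; they are not the criteria used in the framework itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Dimension:
    name: str
    weight: float  # relative importance; normalised when scoring


# Hypothetical dimensions and weights, for illustration only.
RUBRIC: List[Dimension] = [
    Dimension("accuracy", 0.4),
    Dimension("relevance", 0.3),
    Dimension("clarity", 0.2),
    Dimension("safety", 0.1),
]


def score_response(
    ratings: Dict[str, int],
    rubric: List[Dimension] = RUBRIC,
    scale: Tuple[int, int] = (1, 5),  # assumed rating scale
) -> float:
    """Return a weighted score in [0, 1] from per-dimension ratings."""
    low, high = scale
    total_weight = sum(d.weight for d in rubric)
    weighted = 0.0
    for d in rubric:
        rating = ratings[d.name]  # raises KeyError if a dimension is unrated
        if not low <= rating <= high:
            raise ValueError(f"{d.name} rating {rating} is outside {low}-{high}")
        # Map the rating onto [0, 1] before applying the weight.
        weighted += d.weight * (rating - low) / (high - low)
    return weighted / total_weight


example = {"accuracy": 5, "relevance": 3, "clarity": 3, "safety": 1}
print(round(score_response(example), 2))
```

Normalising by the total weight means the weights do not have to sum to exactly 1, so a dimension can be added or re-weighted without rescaling the others.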

🔄
Automation · n8n · AI Pipeline
AI-Powered Content Curation & Publishing Automation

End-to-end n8n pipeline: RSS ingestion → AI scoring → quality filtering → rewriting → auto-publishing to X and LinkedIn.
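
The stage order above can be sketched in plain Python. This is not the n8n workflow itself: the scoring and rewriting functions are stand-in stubs where a real pipeline would call an LLM and the platform APIs, and the names `FeedItem`, `run_pipeline`, and the 0.7 threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class FeedItem:
    title: str
    summary: str


def run_pipeline(
    items: Iterable[FeedItem],
    score_fn: Callable[[FeedItem], float],
    rewrite_fn: Callable[[FeedItem], str],
    publish_fn: Callable[[str], None],
    threshold: float = 0.7,  # assumed cut-off, not the real workflow's value
) -> List[str]:
    """Score each item, drop low scorers, rewrite the rest, then publish."""
    published: List[str] = []
    for item in items:
        if score_fn(item) < threshold:
            continue  # quality filter: skip items below the threshold
        post = rewrite_fn(item)
        publish_fn(post)
        published.append(post)
    return published


# Stand-in stubs for the AI scoring and rewriting steps.
def keyword_score(item: FeedItem) -> float:
    return 1.0 if "ai" in item.title.lower().split() else 0.0


def short_rewrite(item: FeedItem) -> str:
    return f"{item.title}: {item.summary}"


outbox: List[str] = []  # stands in for the X and LinkedIn publishing step
feed = [
    FeedItem("AI evals explained", "Why rubrics matter."),
    FeedItem("Gardening tips", "Water in the morning."),
]
posts = run_pipeline(feed, keyword_score, short_rewrite, outbox.append)
print(posts)
```

Passing each stage in as a function keeps the filter logic testable on its own, without any live feed, model, or social account.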

05
Education & Certifications
2025 · Obafemi Awolowo University
B.Sc. Animal Science
Ile-Ife, Osun State, Nigeria

A degree built on empirical observation, data analysis, and systematic documentation — skills that transfer directly into structured AI evaluation work.

🤖
AI & Automation Systems
Udemy
📈
Digital Marketing Analytics
HubSpot Academy · Certified
⚙️
QA & Process Optimization
Professional Training
🎯
Customer Experience & UX Fundamentals
Online Certification

Let’s build smarter AI together.

Available for evaluation contracts, AI QA consulting, and freelance automation projects. Remote-first, professional, and ready to deliver structured results from day one.

📱
WhatsApp
🔗
LinkedIn
📍
Location
Remote · Lagos, Nigeria (UTC+1)
🟢
Availability
Open to new projects · 20+ hrs/week