Research & Publications

This page highlights my research contributions in responsible AI, human-aligned evaluation frameworks, multimodal reasoning, and intent-preserving conversational systems. I focus on bridging model behavior and real human needs, especially in emotionally sensitive or ambiguous interactions.


Google Scholar

Full publication and citation profile is available here:
https://scholar.google.com/citations?user=hUGBL5wAAAAJ


Selected Works

FairEval-Suite: Human-Aligned Evaluation of Generative Models Beyond Accuracy and Toxicity

Kriti Behl
A lightweight scoring framework that evaluates LLM outputs using three human-centric rubrics (helpfulness, relevance, clarity) with transparent toxicity signals.
FairEval-Suite is designed for researchers, engineers and safety teams who want interpretability over leaderboard metrics.

Artifacts

Impact Reveals alignment failures and reasoning collapses that are invisible in benchmark accuracy or toxicity classifiers.


Intent-Preserving Evaluation: Reasoning Failures in Language Models

AIML Bridge LMReasoning @ AAAI 2026 (Under Review)
Most LLM reasoning failures are not hallucinations or safety issues — they are failures to understand human ambiguity.
This position introduces an evaluation method built around user intent repair instead of safety refusal.

Core Ideas

Submission


Beyond Refusal: Intent-Preserving Agency for Human-Aligned AI Systems

Blue Sky Ideas Track — AAMAS 2026 (Under Review)
Agents should not behave as punitive safety moderators.
This paper argues for a paradigm shift:

Contribution Defines intent-preservation as a core competency for multi-agent systems.


FairEval-Suite: Human-Aligned Forensic Evaluation of Generative Models

WACV 2026 Workshop SAFE (Submitted)
A forensic lens on LLM behavior focusing on:

Demonstrates how rubric-based scoring exposes risks long before red-team detectors do.


Research Philosophy

Most LLM systems fail not when users are adversarial,
but when they are vulnerable, indirect, or uncertain.

Safety models treat ambiguity as danger.
Alignment systems punish confusion.
Benchmarks reward surface correctness, not empathy.

My research focuses on systems that:


Ongoing Work

VoiceVisionReasoner

Multimodal reasoning system combining:

GitHub: https://github.com/kritibehl/VoiceVisionReasoner


SpeechIntentEval

Evaluation toolkit for speech models that:

GitHub: https://github.com/kritibehl/SpeechIntentEval


Writing

I document the human impact of model decisions.

Why AI Refusals Feel Like Punishment
Medium: https://medium.com/@kriti0608/why-ai-refusals-feel-like-punishment-and-how-i-learned-to-repair-intent-instead-c7a890a7b0e8

This essay explains how “safety disclaimers” harm real users more than they protect them.


Contact

For collaborations, research opportunities, or AI safety R&D: kritibehl@gmail.com LinkedIn: https://www.linkedin.com/in/kriti-behl/ GitHub: https://github.com/kritibehl