
Conversation

@Chibionos
Contributor

Summary

Replaces hardcoded LLM model version with wildcard (*) in trace test expectations to prevent future test failures when models are updated.

Problem

The simple-local-mcp test expectations were recently updated to use gpt-4.1-mini-2025-04-14, but this approach is brittle: it will break again the next time:

  • A new model version is released
  • LLM Gateway defaults change
  • Test environments use different model configurations

Solution

Use wildcard matching for llm.model_name instead of exact version:

```diff
- "llm.model_name": "gpt-4.1-mini-2025-04-14"
+ "llm.model_name": "*"
```

Also removed exact content matching for the final response, as wording can vary slightly between models.

Benefits

  • Future-proof: won't break on model updates
  • Environment-agnostic: works regardless of which model is configured
  • Lower maintenance: no need to update test expectations when models change
  • Still validates: provider (azure), system (openai), and span structure are still checked

Implementation

The trace assertion logic (trace_assert.py) already supports wildcards:

```python
from typing import Any

def matches_value(expected_value: Any, actual_value: Any) -> bool:
    if expected_value == "*":
        return True  # Accept any value
```
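As a minimal, self-contained sketch of how such a matcher behaves (the real trace_assert.py may handle additional cases; the strict-equality fallback here is an assumption, not the repository's actual code):

```python
from typing import Any

def matches_value(expected_value: Any, actual_value: Any) -> bool:
    """Return True when the expectation accepts the actual value."""
    if expected_value == "*":
        return True  # wildcard: accept any value
    return expected_value == actual_value  # assumed fallback: strict equality

# The wildcard accepts any model name; other attributes still compare exactly.
print(matches_value("*", "gpt-4.1-mini-2025-04-14"))  # True
print(matches_value("azure", "azure"))                # True
print(matches_value("azure", "openai"))               # False
```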

Testing

This change only affects test expectations, not runtime behavior. The wildcard will accept any model name while still validating:

  • Correct span structure and hierarchy
  • Required span attributes
  • Message roles and content
  • Tool invocations
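To illustrate how a wildcard-based check can still enforce structure while ignoring the model name, here is a hypothetical attribute matcher (not taken from the repository):

```python
from typing import Any

def matches_attrs(expected: dict[str, Any], actual: dict[str, Any]) -> bool:
    # Every expected key must be present; "*" accepts any value for that key.
    return all(
        key in actual and (value == "*" or value == actual[key])
        for key, value in expected.items()
    )

expected = {"llm.provider": "azure", "llm.system": "openai", "llm.model_name": "*"}
span = {"llm.provider": "azure", "llm.system": "openai",
        "llm.model_name": "gpt-4.1-mini-2025-04-14"}
print(matches_attrs(expected, span))  # True
```

A mismatched provider or a missing required attribute would still fail the check, so the test keeps its teeth even with the model name wildcarded.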

🤖 Generated with Claude Code

Chibionos pushed a commit that referenced this pull request Jan 20, 2026
Extended the wildcard fix to all remaining test files with hardcoded
LLM model versions:
- company-research-agent: gpt-4.1-mini-2025-04-14 → "*"
- init-flow: gpt-4o-mini-2024-07-18 → "*"
- ticket-classification: gpt-4.1-mini-2025-04-14 → "*"

This ensures all trace validation tests are resilient to future LLM
Gateway model changes, preventing CI/CD failures when defaults update.

Related to: #440
@cristipufu cristipufu requested a review from ionmincu January 20, 2026 15:16
Chibi Vikram and others added 2 commits January 21, 2026 06:59
Replace hardcoded model version with wildcard to prevent test failures
when LLM Gateway defaults change. This improves long-term test stability.

Previous fix updated model to gpt-4.1-mini-2025-04-14, but this will break
again on next model update. Wildcard approach is resilient to future changes.

Changes:
- llm.model_name: "gpt-4.1-mini-2025-04-14" → "*"
- Removed exact content match (varies by model wording)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@cristipufu cristipufu force-pushed the fix/mcp-test-model-wildcard branch from b76b0f4 to 548521e Compare January 21, 2026 04:59
@cristipufu cristipufu merged commit bc5978f into main Jan 21, 2026
39 checks passed
@cristipufu cristipufu deleted the fix/mcp-test-model-wildcard branch January 21, 2026 05:06
