Tests whether a model follows explicit formatting and constraint instructions. The suite contains 12 cases covering format, length, exclusion, and style constraints.
Evaluate the model's ability to follow explicit instructions.
For each test case, the model will receive an instruction with constraints. Score based on:
Scoring rubric:
Score each response on a 0-10 scale based on exact adherence to explicit instructions provided in the prompt. Deduct points for each violated constraint.
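The deduction scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rubric does not say how many points each violation costs, so the `penalty` value of 3 is hypothetical, as are the `Constraint` structure, the `score_response` helper, and the three sample checks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Constraint:
    """One explicit instruction the response must satisfy (hypothetical structure)."""
    name: str
    check: Callable[[str], bool]  # returns True when the response complies


def score_response(response: str, constraints: List[Constraint],
                   penalty: int = 3) -> int:
    """Start at 10 and deduct `penalty` points per violated constraint.

    The penalty size is an assumption; the rubric only says to deduct
    points for each violation. The result is clamped to the 0-10 range.
    """
    violations = sum(1 for c in constraints if not c.check(response))
    return max(0, 10 - penalty * violations)


# Illustrative constraints: one each for format, length, and exclusion.
constraints = [
    Constraint("bullet_format", lambda r: all(
        line.startswith("- ") for line in r.splitlines() if line.strip())),
    Constraint("max_20_words", lambda r: len(r.split()) <= 20),
    Constraint("excludes_word_very", lambda r: "very" not in r.lower().split()),
]

compliant = "- Fast startup\n- Low memory use"
violating = "Fast startup and very low memory use"

print(score_response(compliant, constraints))  # no violations
print(score_response(violating, constraints))  # breaks format and exclusion
```

A fixed per-violation penalty keeps scores comparable across cases; a grader could instead weight constraints differently, which the rubric neither requires nor rules out.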