glazing.utils.validators
Data validation functions.
Custom validators for the glazing package.
This module provides reusable Pydantic validators and validation utilities for common patterns across all linguistic datasets.
| FUNCTION | DESCRIPTION |
|---|---|
| create_pattern_validator | Factory function to create regex pattern validators. |
| create_range_validator | Factory function to create numeric range validators. |
| validate_non_empty_string | Ensure a string is not empty or only whitespace. |
| validate_non_empty_list | Ensure a list is not empty. |
| validate_unique_list | Ensure all items in a list are unique. |
| normalize_whitespace | Normalize whitespace in strings. |

| CLASS | DESCRIPTION |
|---|---|
| PatternValidator | Reusable regex pattern validator class. |
| RangeValidator | Reusable numeric range validator class. |
Notes
These validators are designed to be used with Pydantic v2 field_validator and model_validator decorators. They provide consistent validation behavior across all dataset-specific models.
Classes
PatternValidator(pattern: str, field_name: str, flags: int = 0)
Reusable regex pattern validator.
| PARAMETER | DESCRIPTION |
|---|---|
| pattern | The regex pattern to match. TYPE: str |
| field_name | Human-readable name of the field being validated. TYPE: str |
| flags | Regex flags (e.g., re.IGNORECASE). TYPE: int |

| METHOD | DESCRIPTION |
|---|---|
| __call__ | Validate a value against the pattern. |
Examples:
>>> validator = PatternValidator(r'^[A-Z][a-z]+$', 'name')
>>> validator('John') # Returns 'John'
>>> validator('john') # Raises ValueError
Initialize the pattern validator.
Source code in src/glazing/utils/validators.py
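The callable-class pattern described above can be sketched with the standard library alone. This is a minimal stand-in, not the glazing source: the class name `PatternValidatorSketch`, the use of `re.match`, and the error message wording are all assumptions made for illustration.

```python
import re


class PatternValidatorSketch:
    """Hypothetical stand-in for PatternValidator; not the glazing source."""

    def __init__(self, pattern: str, field_name: str, flags: int = 0) -> None:
        # Compile once so repeated calls reuse the compiled pattern.
        self.pattern = re.compile(pattern, flags)
        self.field_name = field_name

    def __call__(self, value: str) -> str:
        # Assumption: the pattern is applied with re.match, so anchors such
        # as ^ and $ in the pattern decide how strict the check is.
        if not self.pattern.match(value):
            raise ValueError(f"Invalid {self.field_name}: {value!r}")
        return value


validator = PatternValidatorSketch(r"^[A-Z][a-z]+$", "name")
print(validator("John"))  # prints John

try:
    validator("john")
except ValueError as exc:
    print(exc)  # prints Invalid name: 'john'
```

Because the instance is callable and returns the value unchanged on success, it can be passed anywhere a plain `str -> str` validator function is expected.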
RangeValidator(min_value: float | None = None, max_value: float | None = None, field_name: str = 'value')
Reusable numeric range validator.
| PARAMETER | DESCRIPTION |
|---|---|
| min_value | Minimum allowed value (inclusive). TYPE: float \| None |
| max_value | Maximum allowed value (inclusive). TYPE: float \| None |
| field_name | Human-readable name of the field being validated. TYPE: str |

| METHOD | DESCRIPTION |
|---|---|
| __call__ | Validate that a value is within the range. |
Examples:
>>> validator = RangeValidator(0, 100, 'percentage')
>>> validator(50) # Returns 50
>>> validator(150) # Raises ValueError
Initialize the range validator.
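To make the inclusive-bounds behavior concrete, here is a minimal sketch of a range validator with the same constructor signature. It is not the glazing implementation: the class name `RangeValidatorSketch`, the error messages, and the treatment of `None` as "no bound on that side" are assumptions.

```python
from typing import Optional, Union

Number = Union[int, float]


class RangeValidatorSketch:
    """Hypothetical stand-in for RangeValidator; not the glazing source."""

    def __init__(
        self,
        min_value: Optional[float] = None,
        max_value: Optional[float] = None,
        field_name: str = "value",
    ) -> None:
        self.min_value = min_value
        self.max_value = max_value
        self.field_name = field_name

    def __call__(self, value: Number) -> Number:
        # Both bounds are inclusive. Assumption: a bound of None is skipped.
        if self.min_value is not None and value < self.min_value:
            raise ValueError(
                f"{self.field_name} must be >= {self.min_value}, got {value}"
            )
        if self.max_value is not None and value > self.max_value:
            raise ValueError(
                f"{self.field_name} must be <= {self.max_value}, got {value}"
            )
        return value


percentage = RangeValidatorSketch(0, 100, "percentage")
print(percentage(50))   # prints 50
print(percentage(100))  # prints 100, since the upper bound is inclusive

try:
    percentage(150)
except ValueError as exc:
    print(exc)  # prints percentage must be <= 100, got 150
```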
Functions
create_confidence_validator() -> RangeValidator
create_identifier_validator(pattern: str, field_name: str) -> PatternValidator
create_lemma_validator() -> PatternValidator
create_pattern_validator(pattern: str, field_name: str, flags: int = 0) -> Callable[[str], str]
Factory function to create a pattern validator.
| PARAMETER | DESCRIPTION |
|---|---|
| pattern | The regex pattern to match. TYPE: str |
| field_name | Human-readable name of the field. TYPE: str |
| flags | Regex flags. TYPE: int |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[str], str] | A validator function. |
Examples:
>>> validate_email = create_pattern_validator(
... r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$',
... 'email address'
... )
>>> validate_email('user@example.com') # Returns 'user@example.com'
create_percentage_validator() -> RangeValidator
create_range_validator(min_value: float | None = None, max_value: float | None = None, field_name: str = 'value') -> Callable[[float | int], float | int]
Factory function to create a range validator.
| PARAMETER | DESCRIPTION |
|---|---|
| min_value | Minimum allowed value. TYPE: float \| None |
| max_value | Maximum allowed value. TYPE: float \| None |
| field_name | Human-readable name of the field. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| Callable[[float \| int], float \| int] | A validator function. |
Examples:
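The factory form can be illustrated with a closure. The sketch below is an assumption about how such a factory might behave, not the glazing source; the name `create_range_validator_sketch`, the inclusive bounds, and the error message are hypothetical.

```python
from typing import Callable, Optional, Union

Number = Union[int, float]


def create_range_validator_sketch(
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
    field_name: str = "value",
) -> Callable[[Number], Number]:
    """Hypothetical stand-in for create_range_validator."""

    def validate(value: Number) -> Number:
        # Assumption: bounds are inclusive and None disables a bound.
        too_low = min_value is not None and value < min_value
        too_high = max_value is not None and value > max_value
        if too_low or too_high:
            raise ValueError(
                f"{field_name} must be between {min_value} and {max_value}, "
                f"got {value}"
            )
        return value

    return validate


validate_confidence = create_range_validator_sketch(0.0, 1.0, "confidence")
print(validate_confidence(0.85))  # prints 0.85
print([validate_confidence(s) for s in (0.0, 0.5, 1.0)])  # prints [0.0, 0.5, 1.0]
```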
create_uppercase_name_validator(field_name: str = 'name') -> PatternValidator
Create a validator for names that start with an uppercase letter, such as frame and frame element (FE) names.
normalize_whitespace(value: str) -> str
Normalize whitespace in a string.
Replaces each run of consecutive whitespace characters with a single space and strips leading and trailing whitespace.
| PARAMETER | DESCRIPTION |
|---|---|
| value | The string to normalize. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| str | The normalized string. |
Examples:
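The described behavior (collapse whitespace runs, then strip) can be reproduced with one regular-expression substitution. This is a sketch of the technique under the assumption that "whitespace" means whatever `\s` matches; the function name `normalize_whitespace_sketch` is hypothetical and the code is not the glazing source.

```python
import re


def normalize_whitespace_sketch(value: str) -> str:
    """Hypothetical stand-in for normalize_whitespace."""
    # \s+ matches runs of spaces, tabs, and newlines; each run becomes one
    # space, then leading and trailing whitespace is stripped.
    return re.sub(r"\s+", " ", value).strip()


print(normalize_whitespace_sketch("  hello \t\n  world  "))  # prints hello world
```

An equivalent formulation without `re` is `" ".join(value.split())`, which splits on the same kind of whitespace runs and rejoins with single spaces.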
validate_conditional_requirement(values: dict[str, ValueType], condition_field: str, condition_value: ValueType, required_fields: list[str]) -> dict[str, ValueType]
Validate that certain fields are present when a condition is met.
| PARAMETER | DESCRIPTION |
|---|---|
| values | The values dictionary from a Pydantic model. TYPE: dict[str, ValueType] |
| condition_field | The field to check for the condition. TYPE: str |
| condition_value | The value that triggers the requirement. TYPE: ValueType |
| required_fields | Fields that are required when the condition is met. TYPE: list[str] |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, ValueType] | The validated values. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If required fields are missing when the condition is met. |
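A conditional requirement check of this shape can be sketched as follows. The function name `validate_conditional_requirement_sketch`, the example field names (`status`, `replaced_by`), the error message, and the rule that a field counts as missing when it is absent or `None` are all assumptions for illustration, not glazing's documented behavior.

```python
from typing import Any, Dict, List


def validate_conditional_requirement_sketch(
    values: Dict[str, Any],
    condition_field: str,
    condition_value: Any,
    required_fields: List[str],
) -> Dict[str, Any]:
    """Hypothetical stand-in for validate_conditional_requirement."""
    if values.get(condition_field) == condition_value:
        # Assumption: a field is "missing" when it is absent or None.
        missing = [name for name in required_fields if values.get(name) is None]
        if missing:
            raise ValueError(
                f"{', '.join(missing)} required when "
                f"{condition_field} is {condition_value!r}"
            )
    return values


# Hypothetical field names, chosen only for the example.
ok = {"status": "active", "replaced_by": None}
print(validate_conditional_requirement_sketch(ok, "status", "deprecated", ["replaced_by"]) is ok)
# prints True

try:
    validate_conditional_requirement_sketch(
        {"status": "deprecated", "replaced_by": None},
        "status",
        "deprecated",
        ["replaced_by"],
    )
except ValueError as exc:
    print(exc)  # prints replaced_by required when status is 'deprecated'
```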
validate_mutually_exclusive(values: dict[str, ValueType], field_groups: list[list[str]], require_one: bool = False) -> dict[str, ValueType]
Validate that fields are mutually exclusive.
| PARAMETER | DESCRIPTION |
|---|---|
| values | The values dictionary from a Pydantic model. TYPE: dict[str, ValueType] |
| field_groups | Groups of field names that are mutually exclusive. TYPE: list[list[str]] |
| require_one | If True, exactly one field from each group must be set. TYPE: bool |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, ValueType] | The validated values. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If more than one field in a mutually exclusive group is set. |
Examples:
>>> values = {'source_id': '123', 'source_ids': None}
>>> validate_mutually_exclusive(values, [['source_id', 'source_ids']])
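The group-wise check can be sketched in a few lines. This is an illustrative stand-in rather than the glazing source: the name `validate_mutually_exclusive_sketch`, the error messages, and the rule that a field counts as "set" when it is present and not `None` are assumptions.

```python
from typing import Any, Dict, List


def validate_mutually_exclusive_sketch(
    values: Dict[str, Any],
    field_groups: List[List[str]],
    require_one: bool = False,
) -> Dict[str, Any]:
    """Hypothetical stand-in for validate_mutually_exclusive."""
    for group in field_groups:
        # Assumption: a field is "set" when it is present and not None.
        set_fields = [name for name in group if values.get(name) is not None]
        if len(set_fields) > 1:
            raise ValueError(
                f"Fields are mutually exclusive: {', '.join(set_fields)}"
            )
        if require_one and not set_fields:
            raise ValueError(f"Exactly one of {', '.join(group)} must be set")
    return values


values = {"source_id": "123", "source_ids": None}
print(validate_mutually_exclusive_sketch(values, [["source_id", "source_ids"]]))
# prints {'source_id': '123', 'source_ids': None}

try:
    validate_mutually_exclusive_sketch(
        {"source_id": "123", "source_ids": ["1", "2"]},
        [["source_id", "source_ids"]],
    )
except ValueError as exc:
    print(exc)  # prints Fields are mutually exclusive: source_id, source_ids
```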
validate_non_empty_list(value: list[T], field_name: str = 'list') -> list[T]
Ensure a list is not empty.
| PARAMETER | DESCRIPTION |
|---|---|
| value | The list to validate. TYPE: list[T] |
| field_name | Name of the field being validated. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| list[T] | The validated list. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the list is empty. |
validate_non_empty_string(value: str, field_name: str = 'string') -> str
Ensure a string is not empty or only whitespace.
| PARAMETER | DESCRIPTION |
|---|---|
| value | The string to validate. TYPE: str |
| field_name | Name of the field being validated. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| str | The validated string (stripped of leading/trailing whitespace). |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the string is empty or only whitespace. |
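Note that the documented return value is the stripped string, not the input as given. A minimal sketch of that contract follows; the function name `validate_non_empty_string_sketch` and the error message are assumptions, and the code is not the glazing source.

```python
def validate_non_empty_string_sketch(value: str, field_name: str = "string") -> str:
    """Hypothetical stand-in for validate_non_empty_string."""
    stripped = value.strip()
    if not stripped:
        # Covers both "" and whitespace-only input such as "   ".
        raise ValueError(f"{field_name} cannot be empty or only whitespace")
    return stripped


print(repr(validate_non_empty_string_sketch("  give  ", "lemma")))  # prints 'give'

try:
    validate_non_empty_string_sketch("   ", "lemma")
except ValueError as exc:
    print(exc)  # prints lemma cannot be empty or only whitespace
```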
validate_unique_list(value: list[T], field_name: str = 'list') -> list[T]
Ensure all items in a list are unique.
| PARAMETER | DESCRIPTION |
|---|---|
| value | The list to validate. TYPE: list[T] |
| field_name | Name of the field being validated. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| list[T] | The validated list. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the list contains duplicate items. |
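A uniqueness check like this can be sketched with a set of already-seen items. The sketch assumes list items are hashable, which the documentation above does not state; the function name `validate_unique_list_sketch`, the example field name `roles`, and the error message are likewise hypothetical.

```python
from typing import Hashable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def validate_unique_list_sketch(value: List[T], field_name: str = "list") -> List[T]:
    """Hypothetical stand-in for validate_unique_list."""
    seen = set()
    duplicates: List[T] = []
    for item in value:
        # Record each repeated item once, in order of first repetition.
        if item in seen and item not in duplicates:
            duplicates.append(item)
        seen.add(item)
    if duplicates:
        raise ValueError(f"{field_name} contains duplicate items: {duplicates}")
    return value


print(validate_unique_list_sketch(["Agent", "Theme"], "roles"))  # prints ['Agent', 'Theme']

try:
    validate_unique_list_sketch(["Agent", "Theme", "Agent"], "roles")
except ValueError as exc:
    print(exc)  # prints roles contains duplicate items: ['Agent']
```

Collecting the duplicates before raising, rather than failing on the first repeat, makes the error message more useful when a list has several repeated entries.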