glazing.wordnet.models¶

WordNet core data models.

`models` ¶

WordNet data models.

This module defines Pydantic v2 data models for WordNet 3.1 (synsets, words, senses, and relations), providing validation and type safety.

CLASS	DESCRIPTION
`Synset`	WordNet synset (set of cognitive synonyms).
`Word`	Word/lemma in a synset.
`Pointer`	Relation/pointer to another synset or word.
`VerbFrame`	Syntactic frame for a verb.
`Sense`	Word sense (word-meaning pair).
`IndexEntry`	Entry in WordNet index file.
`ExceptionEntry`	Morphological exception mapping.
`WordNetCrossRef`	Cross-reference to WordNet from other resources.

Examples:

>>> from glazing.wordnet.models import Synset, Word
>>> synset = Synset(
...     offset="00001740",
...     lex_filenum=5,
...     lex_filename="noun.animal",
...     ss_type="n",
...     words=[Word(lemma="dog", lex_id=0)],
...     pointers=[],
...     gloss="a domesticated carnivorous mammal"
... )

Classes¶

`ExceptionEntry` `pydantic-model` ¶

Bases: GlazingBaseModel

Morphological exception mapping.

ATTRIBUTE	DESCRIPTION
`inflected_form`	Inflected/irregular form. TYPE: `str`
`base_forms`	Base/lemma forms. TYPE: `list[str]`
`pos`	Part of speech. TYPE: `WordNetPOS \| None, default=None`

Examples:

>>> entry = ExceptionEntry(
...     inflected_form="geese",
...     base_forms=["goose"]
... )
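
As a rough illustration of where such entries might come from: the wndb(5) exception files list an inflected form followed by one or more base forms on each line. The helper below, `parse_exception_line`, is hypothetical and not part of glazing; it only produces plain values that could then be passed to `ExceptionEntry`.

```python
from __future__ import annotations


def parse_exception_line(line: str) -> tuple[str, list[str]]:
    """Split one exception-file line into (inflected_form, base_forms)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"Expected an inflected form and a base form: {line!r}")
    # First field is the inflected form; every remaining field is a base form.
    return fields[0], fields[1:]


inflected, bases = parse_exception_line("geese goose")
print(inflected, bases)
```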

Fields:

inflected_form (str)
base_forms (list[str])
pos (WordNetPOS | None)

Validators:

validate_forms → inflected_form, base_forms

Attributes¶

`base_forms: list[str]` `pydantic-field` ¶

Base/lemma forms

`inflected_form: str` `pydantic-field` ¶

Inflected/irregular form

`pos: WordNetPOS | None = None` `pydantic-field` ¶

Part of speech

Methods:¶

`validate_forms(v: str | list[str]) -> str | list[str]` `pydantic-validator` ¶

Validate word forms.

PARAMETER	DESCRIPTION
`v`	The value to validate. TYPE: `str \| list[str]`

RETURNS	DESCRIPTION
`str \| list[str]`	The validated value.

RAISES	DESCRIPTION
`ValueError`	If word form is invalid.

Source code in src/glazing/wordnet/models.py

@field_validator("inflected_form", "base_forms")
@classmethod
def validate_forms(cls, v: str | list[str]) -> str | list[str]:
    """Validate word forms.

    Parameters
    ----------
    v : str | list[str]
        The value to validate.

    Returns
    -------
    str | list[str]
        The validated value.

    Raises
    ------
    ValueError
        If word form is invalid.
    """
    if isinstance(v, str):
        cleaned = v.replace("_", "").replace("-", "").replace("'", "").replace(".", "")
        if not v or not cleaned.isalpha():
            msg = f"Invalid word form: {v}"
            raise ValueError(msg)
    elif isinstance(v, list):
        for form in v:
            cleaned = form.replace("_", "").replace("-", "").replace("'", "").replace(".", "")
            if not form or not cleaned.isalpha():
                msg = f"Invalid word form: {form}"
                raise ValueError(msg)
    return v
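
The rule applied by `validate_forms` can be tried outside Pydantic. The sketch below restates the same check for a single string; `is_valid_form` is a hypothetical name used only for illustration.

```python
from __future__ import annotations


def is_valid_form(form: str) -> bool:
    """Mirror the validator's rule for a single word form."""
    cleaned = form
    for ch in ("_", "-", "'", "."):
        cleaned = cleaned.replace(ch, "")
    # Empty strings and anything left non-alphabetic are rejected.
    return bool(form) and cleaned.isalpha()


for candidate in ["geese", "mother-in-law", "o'clock", "", "route66"]:
    print(repr(candidate), is_valid_form(candidate))
```

Under this rule, forms containing digits are rejected, as is a string made up only of the stripped punctuation characters.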

`IndexEntry` `pydantic-model` ¶

Bases: GlazingBaseModel

An entry in a WordNet index file.

ATTRIBUTE	DESCRIPTION
`lemma`	Word form. TYPE: `str`
`pos`	Part of speech. TYPE: `WordNetPOS`
`synset_cnt`	Number of synsets. TYPE: `int`
`p_cnt`	Number of pointer types. TYPE: `int`
`ptr_symbols`	Pointer symbols for this word. TYPE: `list[PointerSymbol]`
`sense_cnt`	Same as `synset_cnt`; a redundant field kept for compatibility with the index file format. TYPE: `int`
`tagsense_cnt`	Number of senses in semantic concordances. TYPE: `int`
`synset_offsets`	Synsets containing this word. TYPE: `list[SynsetOffset]`

Examples:

>>> entry = IndexEntry(
...     lemma="dog",
...     pos="n",
...     synset_cnt=7,
...     p_cnt=4,
...     ptr_symbols=["!", "@", "~", "#m"],
...     sense_cnt=7,
...     tagsense_cnt=6,
...     synset_offsets=["00001740", "00002084"]
... )
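
The field order above follows the index file layout in wndb(5): lemma, pos, synset_cnt, p_cnt, the pointer symbols, sense_cnt, tagsense_cnt, then the synset offsets. A minimal sketch of splitting such a line is shown below. `parse_index_line` is a hypothetical helper, not a glazing function, and the sample line is made up.

```python
from __future__ import annotations


def parse_index_line(line: str) -> dict[str, object]:
    """Split one index-file line into IndexEntry-style fields."""
    f = line.split()
    p_cnt = int(f[3])
    symbols_end = 4 + p_cnt  # pointer symbols occupy p_cnt fields
    return {
        "lemma": f[0],
        "pos": f[1],
        "synset_cnt": int(f[2]),
        "p_cnt": p_cnt,
        "ptr_symbols": f[4:symbols_end],
        "sense_cnt": int(f[symbols_end]),
        "tagsense_cnt": int(f[symbols_end + 1]),
        "synset_offsets": f[symbols_end + 2:],
    }


# Made-up line for illustration; the counts are not real WordNet data.
fields = parse_index_line("dog n 2 3 @ ~ #m 2 1 00001740 00002084")
print(fields["ptr_symbols"], fields["synset_offsets"])
```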

Fields:

lemma (str)
pos (WordNetPOS)
synset_cnt (int)
p_cnt (int)
ptr_symbols (list[PointerSymbol])
sense_cnt (int)
tagsense_cnt (int)
synset_offsets (list[SynsetOffset])

Attributes¶

`lemma: str` `pydantic-field` ¶

Word form

`p_cnt: int` `pydantic-field` ¶

Number of pointer types

`pos: WordNetPOS` `pydantic-field` ¶

Part of speech

`ptr_symbols: list[PointerSymbol]` `pydantic-field` ¶

Pointer symbols for this word

`sense_cnt: int` `pydantic-field` ¶

Same as synset_cnt

`synset_cnt: int` `pydantic-field` ¶

Number of synsets

`synset_offsets: list[SynsetOffset]` `pydantic-field` ¶

Synsets containing this word

`tagsense_cnt: int` `pydantic-field` ¶

Number of senses in semantic concordances

`Pointer` `pydantic-model` ¶

Bases: GlazingBaseModel

A relation/pointer to another synset or word.

ATTRIBUTE	DESCRIPTION
`symbol`	Relation type symbol. TYPE: `PointerSymbol`
`offset`	Target synset offset. TYPE: `SynsetOffset`
`pos`	Target part of speech. TYPE: `WordNetPOS`
`source`	Source word number (0 = entire synset). TYPE: `int`
`target`	Target word number (0 = entire synset). TYPE: `int`
`target_key`	Canonical key of the targeted synset. TYPE: `SynsetKey`

METHOD	DESCRIPTION
`is_lexical`	Check if this is a lexical (word-to-word) relation.
`is_semantic`	Check if this is a semantic (synset-to-synset) relation.

Examples:

>>> pointer = Pointer(
...     symbol="@",
...     offset="00002084",
...     pos="n",
...     source=0,
...     target=0
... )
>>> pointer.is_semantic()
True
>>> pointer.target_key
'00002084n'

Fields:

symbol (PointerSymbol)
offset (SynsetOffset)
pos (WordNetPOS)
source (int)
target (int)

Attributes¶

`offset: SynsetOffset` `pydantic-field` ¶

Target synset offset

`pos: WordNetPOS` `pydantic-field` ¶

Target part of speech

`source: int` `pydantic-field` ¶

Source word number (0 = entire synset)

`symbol: PointerSymbol` `pydantic-field` ¶

Relation type symbol

`target: int` `pydantic-field` ¶

Target word number (0 = entire synset)

`target_key: SynsetKey` `property` ¶

Canonical key of the synset this pointer targets.

Matches Synset.key of the target, including when the target is an adjective satellite: WordNet records "a" in the POS field of every pointer to an adjective, so normalization makes the two agree.

RETURNS	DESCRIPTION
`SynsetKey`	Target offset plus normalized POS.
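
The normalization described here can be sketched as a plain function. This is an assumption about the behaviour based on the description above, not the library's actual implementation, and `synset_key` is a hypothetical name.

```python
from __future__ import annotations


def synset_key(offset: str, pos: str) -> str:
    """Join an offset with its POS, folding satellite 's' into 'a'."""
    normalized = "a" if pos == "s" else pos
    return f"{offset}{normalized}"


print(synset_key("00002084", "n"))
print(synset_key("00014377", "s"))
```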

Methods:¶

`is_lexical() -> bool` ¶

Check if this is a lexical (word-to-word) relation.

RETURNS	DESCRIPTION
`bool`	True if either source or target is non-zero.

Source code in src/glazing/wordnet/models.py

def is_lexical(self) -> bool:
    """Check if this is a lexical (word-to-word) relation.

    Returns
    -------
    bool
        True if either source or target is non-zero.
    """
    return self.source != 0 or self.target != 0

`is_semantic() -> bool` ¶

Check if this is a semantic (synset-to-synset) relation.

RETURNS	DESCRIPTION
`bool`	True if both source and target are zero.

Source code in src/glazing/wordnet/models.py

def is_semantic(self) -> bool:
    """Check if this is a semantic (synset-to-synset) relation.

    Returns
    -------
    bool
        True if both source and target are zero.
    """
    return self.source == 0 and self.target == 0
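
In the wndb(5) data files, the source and target word numbers arrive as a single four-digit hexadecimal field: the first two digits give the source word and the last two the target word. The sketch below decodes that field and applies the same zero test as `is_semantic`. `split_source_target` is a hypothetical helper, and how glazing itself populates `source` and `target` is not shown on this page.

```python
from __future__ import annotations


def split_source_target(field: str) -> tuple[int, int]:
    """Decode a four-hex-digit source/target field into two word numbers."""
    if len(field) != 4:
        raise ValueError(f"Expected four hex digits: {field!r}")
    return int(field[:2], 16), int(field[2:], 16)


for field in ("0000", "0102", "0a01"):
    source, target = split_source_target(field)
    is_semantic = source == 0 and target == 0
    print(field, source, target, "semantic" if is_semantic else "lexical")
```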

`Sense` `pydantic-model` ¶

Bases: GlazingBaseModel

A word sense (word-meaning pair).

ATTRIBUTE	DESCRIPTION
`sense_key`	Unique sense identifier. TYPE: `SenseKey`
`lemma`	Word form. TYPE: `str`
`ss_type`	Synset type. TYPE: `WordNetPOS`
`lex_filenum`	Lexical file number. TYPE: `int`
`lex_id`	Lexical ID. TYPE: `LexID`
`head_word`	For adjective satellites. TYPE: `str \| None, default=None`
`head_id`	Head word lex_id. TYPE: `int \| None, default=None`
`synset_offset`	Synset containing this sense. TYPE: `SynsetOffset`
`sense_number`	Frequency-based ordering. TYPE: `SenseNumber`
`tag_count`	Semantic concordance count. TYPE: `TagCount`

METHOD	DESCRIPTION
`parse_sense_key`	Parse sense key into components.

Examples:

>>> sense = Sense(
...     sense_key="dog%1:05:00::",
...     lemma="dog",
...     ss_type="n",
...     lex_filenum=5,
...     lex_id=0,
...     synset_offset="00001740",
...     sense_number=1,
...     tag_count=15
... )
>>> components = sense.parse_sense_key()
>>> components['lemma']
'dog'

Fields:

sense_key (SenseKey)
lemma (str)
ss_type (WordNetPOS)
lex_filenum (int)
lex_id (LexID)
head_word (str | None)
head_id (int | None)
synset_offset (SynsetOffset)
sense_number (SenseNumber)
tag_count (TagCount)

Attributes¶

`head_id: int | None = None` `pydantic-field` ¶

Head word lex_id

`head_word: str | None = None` `pydantic-field` ¶

For adjective satellites

`lemma: str` `pydantic-field` ¶

Word form

`lex_filenum: int` `pydantic-field` ¶

Lexical file number

`lex_id: LexID` `pydantic-field` ¶

Lexical ID

`sense_key: SenseKey` `pydantic-field` ¶

Unique sense identifier

`sense_number: SenseNumber` `pydantic-field` ¶

Frequency-based ordering

`ss_type: WordNetPOS` `pydantic-field` ¶

Synset type

`synset_offset: SynsetOffset` `pydantic-field` ¶

Synset containing this sense

`tag_count: TagCount` `pydantic-field` ¶

Semantic concordance count

Methods:¶

`parse_sense_key() -> dict[str, str | int | None]` ¶

Parse sense key into components.

RETURNS	DESCRIPTION
`dict[str, str \| int \| None]`	Dictionary with components: lemma, ss_type, lex_filenum, lex_id, head_word, head_id.

Examples:

>>> sense = Sense(sense_key="dog%1:05:00::", ...)
>>> components = sense.parse_sense_key()
>>> components['ss_type']
1

Source code in src/glazing/wordnet/models.py

def parse_sense_key(self) -> dict[str, str | int | None]:
    """Parse sense key into components.

    Returns
    -------
    dict[str, str | int | None]
        Dictionary with components: lemma, ss_type, lex_filenum, lex_id,
        head_word, head_id.

    Examples
    --------
    >>> sense = Sense(sense_key="dog%1:05:00::", ...)
    >>> components = sense.parse_sense_key()
    >>> components['ss_type']
    1
    """
    parts = self.sense_key.split("%")
    lemma = parts[0]
    rest = parts[1].split(":")
    return {
        "lemma": lemma,
        "ss_type": int(rest[0]),
        "lex_filenum": int(rest[1]),
        "lex_id": int(rest[2]),
        "head_word": rest[3] if rest[3] else None,
        "head_id": int(rest[4]) if rest[4] else None,
    }
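
The same parsing steps also fill `head_word` and `head_id` when the key belongs to an adjective satellite. The standalone copy below shows both cases; the satellite key is illustrative and has not been checked against a particular WordNet release.

```python
from __future__ import annotations


def parse_sense_key(sense_key: str) -> dict[str, str | int | None]:
    """Standalone copy of the steps used by Sense.parse_sense_key."""
    parts = sense_key.split("%")
    rest = parts[1].split(":")
    return {
        "lemma": parts[0],
        "ss_type": int(rest[0]),
        "lex_filenum": int(rest[1]),
        "lex_id": int(rest[2]),
        # The last two fields are empty except for satellites.
        "head_word": rest[3] if rest[3] else None,
        "head_id": int(rest[4]) if rest[4] else None,
    }


print(parse_sense_key("dog%1:05:00::"))
print(parse_sense_key("abounding%5:00:00:abundant:00"))
```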

`Synset` `pydantic-model` ¶

Bases: GlazingBaseModel

A WordNet synset (set of cognitive synonyms).

ATTRIBUTE	DESCRIPTION
`offset`	8-digit identifier. TYPE: `SynsetOffset`
`lex_filenum`	Lexical file number (0-44). TYPE: `int`
`lex_filename`	Validated lexical file name. TYPE: `LexFileName`
`ss_type`	Synset type (n, v, a, r, s). TYPE: `WordNetPOS`
`words`	Words in this synset. TYPE: `list[Word]`
`pointers`	Relations to other synsets. TYPE: `list[Pointer]`
`frames`	Verb frames (verbs only). TYPE: `list[VerbFrame] \| None, default=None`
`gloss`	Definition and examples. TYPE: `str`
`key`	Canonical identifier: offset plus normalized POS. TYPE: `SynsetKey`

METHOD	DESCRIPTION
`get_lemmas`	Get all lemmas in the synset.
`get_hypernyms`	Get hypernym pointers.
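`get_hyponyms`	Get hyponym pointers.
`get_lexical_pointers`	Get lexical (word-to-word) pointers only.
`get_pointers_by_symbol`	Get pointers by relation symbol.
`get_semantic_pointers`	Get semantic (synset-to-synset) pointers only.
`has_relation`	Check if synset has a specific relation type.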

Examples:

>>> synset = Synset(
...     offset="00001740",
...     lex_filenum=5,
...     lex_filename="noun.animal",
...     ss_type="n",
...     words=[Word(lemma="dog", lex_id=0)],
...     pointers=[],
...     gloss="a domesticated carnivorous mammal"
... )
>>> synset.get_lemmas()
['dog']
>>> synset.key
'00001740n'

An adjective satellite keys under "a", matching the pointers that target it.

>>> satellite = Synset(
...     offset="00014377",
...     lex_filenum=0,
...     lex_filename="adj.all",
...     ss_type="s",
...     words=[Word(lemma="abounding", lex_id=0)],
...     pointers=[],
...     gloss="existing in abundance"
... )
>>> satellite.key
'00014377a'

Fields:

offset (SynsetOffset)
lex_filenum (int)
lex_filename (LexFileName)
ss_type (WordNetPOS)
words (list[Word])
pointers (list[Pointer])
frames (list[VerbFrame] | None)
gloss (str)

Attributes¶

`frames: list[VerbFrame] | None = None` `pydantic-field` ¶

Verb frames (verbs only)

`gloss: str` `pydantic-field` ¶

Definition and examples

`key: SynsetKey` `property` ¶

Canonical identifier for this synset.

Offsets alone do not identify a synset: they are byte offsets into a per-POS data file, so the same offset can occur in more than one file. Pairing the offset with the normalized POS gives a database-wide unique key that also matches the Pointer.target_key of any pointer referencing this synset.

RETURNS	DESCRIPTION
`SynsetKey`	Offset plus normalized POS; satellites (`"s"`) key under `"a"`.
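
To make the collision concrete, the sketch below indexes four synsets that share one offset and looks them up by offset plus normalized POS. The dictionary contents are illustrative, and `resolve` is a hypothetical helper rather than a glazing API.

```python
from __future__ import annotations

# Illustrative entries: one offset reused by four per-POS data files.
synsets_by_key = {
    "00001740n": "entity",
    "00001740v": "breathe",
    "00001740a": "able",
    "00001740r": "a_cappella",
}


def resolve(offset: str, pos: str) -> str | None:
    """Look up a pointer target by offset plus normalized POS."""
    key = offset + ("a" if pos == "s" else pos)
    return synsets_by_key.get(key)


print(resolve("00001740", "v"))
print(resolve("00001740", "s"))
```

Keying the index by offset alone would keep only one of the four entries.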

`lex_filename: LexFileName` `pydantic-field` ¶

Lexical file name

`lex_filenum: int` `pydantic-field` ¶

Lexical file number (0-44)

`offset: SynsetOffset` `pydantic-field` ¶

8-digit synset identifier

`pointers: list[Pointer]` `pydantic-field` ¶

Relations to other synsets

`ss_type: WordNetPOS` `pydantic-field` ¶

Synset type

`words: list[Word]` `pydantic-field` ¶

Words in this synset

Methods:¶

`get_hypernyms() -> list[Pointer]` ¶

Get hypernym pointers.

RETURNS	DESCRIPTION
`list[Pointer]`	Pointers with '@' symbol.

Source code in src/glazing/wordnet/models.py

def get_hypernyms(self) -> list[Pointer]:
    """Get hypernym pointers.

    Returns
    -------
    list[Pointer]
        Pointers with '@' symbol.
    """
    return [p for p in self.pointers if p.symbol == "@"]

`get_hyponyms() -> list[Pointer]` ¶

Get hyponym pointers.

RETURNS	DESCRIPTION
`list[Pointer]`	Pointers with '~' symbol.

Source code in src/glazing/wordnet/models.py

def get_hyponyms(self) -> list[Pointer]:
    """Get hyponym pointers.

    Returns
    -------
    list[Pointer]
        Pointers with '~' symbol.
    """
    return [p for p in self.pointers if p.symbol == "~"]

`get_lemmas() -> list[str]` ¶

Get all lemmas in the synset.

RETURNS	DESCRIPTION
`list[str]`	List of lemma strings.

Source code in src/glazing/wordnet/models.py

def get_lemmas(self) -> list[str]:
    """Get all lemmas in the synset.

    Returns
    -------
    list[str]
        List of lemma strings.
    """
    return [word.lemma for word in self.words]

`get_lexical_pointers() -> list[Pointer]` ¶

Get lexical (word-to-word) pointers only.

RETURNS	DESCRIPTION
`list[Pointer]`	Pointers where source!=0 or target!=0.

Source code in src/glazing/wordnet/models.py

def get_lexical_pointers(self) -> list[Pointer]:
    """Get lexical (word-to-word) pointers only.

    Returns
    -------
    list[Pointer]
        Pointers where source!=0 or target!=0.
    """
    return [p for p in self.pointers if p.is_lexical()]

`get_pointers_by_symbol(symbol: PointerSymbol) -> list[Pointer]` ¶

Get pointers by relation symbol.

PARAMETER	DESCRIPTION
`symbol`	Relation symbol to filter by. TYPE: `PointerSymbol`

RETURNS	DESCRIPTION
`list[Pointer]`	Pointers with the specified symbol.

Examples:

>>> synset = Synset(...)
>>> antonyms = synset.get_pointers_by_symbol("!")

Source code in src/glazing/wordnet/models.py

def get_pointers_by_symbol(self, symbol: PointerSymbol) -> list[Pointer]:
    """Get pointers by relation symbol.

    Parameters
    ----------
    symbol : PointerSymbol
        Relation symbol to filter by.

    Returns
    -------
    list[Pointer]
        Pointers with the specified symbol.

    Examples
    --------
    >>> synset = Synset(...)
    >>> antonyms = synset.get_pointers_by_symbol("!")
    """
    return [p for p in self.pointers if p.symbol == symbol]
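
When several relation types are needed at once, grouping the pointers in a single pass may be more convenient than filtering once per symbol. The sketch below uses `(symbol, offset)` tuples as stand-ins for `Pointer` objects; `group_by_symbol` is a hypothetical helper and the offsets are made up.

```python
from __future__ import annotations

from collections import defaultdict

# (symbol, target offset) pairs standing in for Pointer objects.
pointers = [("@", "00002084"), ("~", "00003553"), ("~", "00004258"), ("!", "00005000")]


def group_by_symbol(items: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Collect target offsets under their relation symbol in one pass."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for symbol, offset in items:
        grouped[symbol].append(offset)
    return dict(grouped)


by_symbol = group_by_symbol(pointers)
print(by_symbol["~"])
print("@" in by_symbol)
```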

`get_semantic_pointers() -> list[Pointer]` ¶

Get semantic (synset-to-synset) pointers only.

RETURNS	DESCRIPTION
`list[Pointer]`	Pointers where source=0 and target=0.

Source code in src/glazing/wordnet/models.py

def get_semantic_pointers(self) -> list[Pointer]:
    """Get semantic (synset-to-synset) pointers only.

    Returns
    -------
    list[Pointer]
        Pointers where source=0 and target=0.
    """
    return [p for p in self.pointers if p.is_semantic()]

`has_relation(symbol: PointerSymbol) -> bool` ¶

Check if synset has a specific relation type.

PARAMETER	DESCRIPTION
`symbol`	Relation symbol to check for. TYPE: `PointerSymbol`

RETURNS	DESCRIPTION
`bool`	True if synset has at least one pointer with this symbol.

Examples:

>>> synset = Synset(...)
>>> has_hypernyms = synset.has_relation("@")

Source code in src/glazing/wordnet/models.py

def has_relation(self, symbol: PointerSymbol) -> bool:
    """Check if synset has a specific relation type.

    Parameters
    ----------
    symbol : PointerSymbol
        Relation symbol to check for.

    Returns
    -------
    bool
        True if synset has at least one pointer with this symbol.

    Examples
    --------
    >>> synset = Synset(...)
    >>> has_hypernyms = synset.has_relation("@")
    """
    return any(p.symbol == symbol for p in self.pointers)

`VerbFrame` `pydantic-model` ¶

Bases: GlazingBaseModel

Syntactic frame for a verb.

ATTRIBUTE	DESCRIPTION
`frame_number`	Frame number (1-35). TYPE: `VerbFrameNumber`
`word_indices`	Word indices (0 = all words, or specific indices). TYPE: `list[int]`
`template`	Natural language frame template (e.g., "Something ----s"). TYPE: `str \| None, default=None`
`example_sentence`	Example sentence with %s placeholder for verb. TYPE: `str \| None, default=None`

Examples:

>>> frame = VerbFrame(frame_number=8, word_indices=[0])
>>> frame.frame_number
8

Fields:

frame_number (VerbFrameNumber)
word_indices (list[int])
template (str | None)
example_sentence (str | None)

Validators:

validate_word_indices → word_indices

Attributes¶

`example_sentence: str | None = None` `pydantic-field` ¶

Example sentence with %s placeholder

`frame_number: VerbFrameNumber` `pydantic-field` ¶

Frame number (1-35)

`template: str | None = None` `pydantic-field` ¶

Natural language frame template

`word_indices: list[int]` `pydantic-field` ¶

Word indices (0 = all words)

Methods:¶

`validate_word_indices(v: list[int]) -> list[int]` `pydantic-validator` ¶

Validate word indices.

PARAMETER	DESCRIPTION
`v`	The word indices to validate. TYPE: `list[int]`

RETURNS	DESCRIPTION
`list[int]`	The validated indices.

RAISES	DESCRIPTION
`ValueError`	If any index is negative.

Source code in src/glazing/wordnet/models.py

@field_validator("word_indices")
@classmethod
def validate_word_indices(cls, v: list[int]) -> list[int]:
    """Validate word indices.

    Parameters
    ----------
    v : list[int]
        The word indices to validate.

    Returns
    -------
    list[int]
        The validated indices.

    Raises
    ------
    ValueError
        If any index is negative.
    """
    for idx in v:
        if idx < 0:
            msg = f"Word index cannot be negative: {idx}"
            raise ValueError(msg)
    return v
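
One way to interpret `word_indices` once they have passed validation is sketched below. It assumes, following wndb(5), that 0 selects every word in the synset and that non-zero values are 1-based positions; `words_for_frame` is a hypothetical helper, not part of glazing.

```python
from __future__ import annotations


def words_for_frame(lemmas: list[str], word_indices: list[int]) -> list[str]:
    """Return the lemmas a frame applies to; 0 selects every lemma."""
    if 0 in word_indices:
        return list(lemmas)
    # Assumption: non-zero indices are 1-based positions, as in wndb(5).
    return [lemmas[i - 1] for i in word_indices if 1 <= i <= len(lemmas)]


lemmas = ["give", "hand", "pass"]
print(words_for_frame(lemmas, [0]))
print(words_for_frame(lemmas, [2]))
```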

`Word` `pydantic-model` ¶

Bases: GlazingBaseModel

A word/lemma in a synset.

ATTRIBUTE	DESCRIPTION
`lemma`	Word form (lowercase, underscores for spaces). TYPE: `str`
`lex_id`	Distinguishes same word in synset (0-15). TYPE: `LexID`
`sense_number`	Frequency-based sense ordering from index.sense. TYPE: `int \| None, default=None`
`tag_count`	Semantic concordance tag count. TYPE: `int, default=0`
`syntactic_marker`	Adjective syntactic-position marker from wndb(5): `p` (predicative), `a` (attributive/prenominal), or `ip` (immediately postnominal). Only adjectives carry a marker; `None` for all other parts of speech. TYPE: `AdjPosition \| None, default=None`

Examples:

>>> word = Word(lemma="dog", lex_id=0)
>>> word.lemma
'dog'
>>> word.lex_id
0
>>> Word(lemma="galore", lex_id=0, syntactic_marker="ip").syntactic_marker
'ip'
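
Free text usually has to be brought into the "lowercase, underscores for spaces" form before it can serve as a `lemma`. A minimal sketch follows; `to_lemma_form` is a hypothetical helper, and whether the result passes `validate_lemma` depends on `LEMMA_PATTERN`, which is not shown on this page.

```python
from __future__ import annotations


def to_lemma_form(text: str) -> str:
    """Lowercase the text and join whitespace-separated tokens with underscores."""
    return "_".join(text.lower().split())


print(to_lemma_form("Hot Dog"))
```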

Fields:

lemma (str)
lex_id (LexID)
sense_number (int | None)
tag_count (int)
syntactic_marker (AdjPosition | None)

Validators:

validate_lemma → lemma

Attributes¶

`lemma: str` `pydantic-field` ¶

Word form (lowercase, underscores for spaces)

`lex_id: LexID` `pydantic-field` ¶

Lexical ID distinguishing same word in synset

`sense_number: int | None = None` `pydantic-field` ¶

Frequency-based sense ordering

`syntactic_marker: AdjPosition | None = None` `pydantic-field` ¶

Adjective syntactic-position marker (p/a/ip)

`tag_count: int = 0` `pydantic-field` ¶

Semantic concordance tag count

Methods:¶

`validate_lemma(v: str) -> str` `pydantic-validator` ¶

Validate lemma format.

PARAMETER	DESCRIPTION
`v`	The lemma to validate. TYPE: `str`

RETURNS	DESCRIPTION
`str`	The validated lemma.

RAISES	DESCRIPTION
`ValueError`	If lemma format is invalid.

Source code in src/glazing/wordnet/models.py

@field_validator("lemma")
@classmethod
def validate_lemma(cls, v: str) -> str:
    """Validate lemma format.

    Parameters
    ----------
    v : str
        The lemma to validate.

    Returns
    -------
    str
        The validated lemma.

    Raises
    ------
    ValueError
        If lemma format is invalid.
    """
    if not re.match(LEMMA_PATTERN, v):
        msg = f"Invalid lemma format: {v!r}"
        raise ValueError(msg)
    return v

`WordNetCrossRef` `pydantic-model` ¶

Bases: GlazingBaseModel

Cross-reference to WordNet from other resources.

ATTRIBUTE	DESCRIPTION
`sense_key`	Stable sense identifier. TYPE: `SenseKey \| None, default=None`
`synset_offset`	Version-specific offset. TYPE: `SynsetOffset \| None, default=None`
`lemma`	Word lemma. TYPE: `str`
`pos`	Part of speech. TYPE: `WordNetPOS`
`sense_number`	Sense number for ordering. TYPE: `SenseNumber \| None, default=None`

METHOD	DESCRIPTION
`to_percentage_notation`	Convert to VerbNet percentage notation.
`from_percentage_notation`	Parse VerbNet percentage notation.
`is_valid_reference`	Check if reference has valid identifiers.
`get_primary_identifier`	Get primary identifier (sense_key preferred).

Examples:

>>> ref = WordNetCrossRef(
...     sense_key="give%2:40:00::",
...     lemma="give",
...     pos="v"
... )
>>> notation = ref.to_percentage_notation()
>>> notation
'give%2:40:00'
>>> ref.is_valid_reference()
True

Fields:

sense_key (SenseKey | None)
synset_offset (SynsetOffset | None)
lemma (str)
pos (WordNetPOS)
sense_number (SenseNumber | None)

Attributes¶

`lemma: str` `pydantic-field` ¶

Word lemma

`pos: WordNetPOS` `pydantic-field` ¶

Part of speech

`sense_key: SenseKey | None = None` `pydantic-field` ¶

Stable sense identifier

`sense_number: SenseNumber | None = None` `pydantic-field` ¶

Sense ordering

`synset_offset: SynsetOffset | None = None` `pydantic-field` ¶

Version-specific offset

Methods:¶

`from_percentage_notation(notation: str) -> WordNetCrossRef` `classmethod` ¶

Parse VerbNet percentage notation.

PARAMETER	DESCRIPTION
`notation`	Percentage notation (e.g., "give%2:40:00"). TYPE: `str`

RETURNS	DESCRIPTION
`WordNetCrossRef`	Cross-reference object.

RAISES	DESCRIPTION
`ValueError`	If notation format is invalid.

Examples:

>>> ref = WordNetCrossRef.from_percentage_notation("give%2:40:00")
>>> ref.lemma
'give'
>>> ref.pos
'v'

Source code in src/glazing/wordnet/models.py

@classmethod
def from_percentage_notation(cls, notation: str) -> WordNetCrossRef:
    """Parse VerbNet percentage notation.

    Parameters
    ----------
    notation : str
        Percentage notation (e.g., "give%2:40:00").

    Returns
    -------
    WordNetCrossRef
        Cross-reference object.

    Raises
    ------
    ValueError
        If notation format is invalid.

    Examples
    --------
    >>> ref = WordNetCrossRef.from_percentage_notation("give%2:40:00")
    >>> ref.lemma
    'give'
    >>> ref.pos
    'v'
    """
    match = re.match(r"^([a-z_-]+)%([1-5]):([0-9]{2}):([0-9]{2})$", notation)
    if not match:
        msg = f"Invalid percentage notation: {notation}"
        raise ValueError(msg)

    lemma = match.group(1)
    ss_type = int(match.group(2))
    lex_filenum = match.group(3)
    lex_id = match.group(4)

    # Map ss_type to POS
    pos_map: dict[int, WordNetPOS] = {1: "n", 2: "v", 3: "a", 4: "r", 5: "s"}
    pos = pos_map[ss_type]

    # Construct partial sense key
    sense_key = f"{lemma}%{ss_type}:{lex_filenum}:{lex_id}::"

    return cls(sense_key=sense_key, synset_offset=None, lemma=lemma, pos=pos, sense_number=None)
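
The parsing steps can be exercised without constructing a model. The sketch below reuses the regular expression and POS map from the listing above and returns plain values; `parse_notation` is a hypothetical name.

```python
from __future__ import annotations

import re

NOTATION_RE = re.compile(r"^([a-z_-]+)%([1-5]):([0-9]{2}):([0-9]{2})$")
POS_BY_SS_TYPE = {1: "n", 2: "v", 3: "a", 4: "r", 5: "s"}


def parse_notation(notation: str) -> tuple[str, str, str]:
    """Return (lemma, pos, sense_key) for a percentage notation string."""
    match = NOTATION_RE.match(notation)
    if not match:
        raise ValueError(f"Invalid percentage notation: {notation}")
    lemma, ss_type, lex_filenum, lex_id = match.groups()
    pos = POS_BY_SS_TYPE[int(ss_type)]
    # The trailing "::" leaves the head_word and head_id fields empty.
    return lemma, pos, f"{lemma}%{ss_type}:{lex_filenum}:{lex_id}::"


print(parse_notation("give%2:40:00"))
```

Because the pattern admits only lowercase letters, underscores, and hyphens in the lemma, strings whose lemma contains digits, apostrophes, or uppercase letters raise `ValueError`.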

`get_primary_identifier() -> str | None` ¶

Get primary identifier (sense_key preferred).

RETURNS	DESCRIPTION
`str \| None`	Sense key if available, otherwise synset offset.

Examples:

>>> ref = WordNetCrossRef(sense_key="give%2:40:00::", lemma="give", pos="v")
>>> ref.get_primary_identifier()
'give%2:40:00::'

Source code in src/glazing/wordnet/models.py

def get_primary_identifier(self) -> str | None:
    """Get primary identifier (sense_key preferred).

    Returns
    -------
    str | None
        Sense key if available, otherwise synset offset.

    Examples
    --------
    >>> ref = WordNetCrossRef(sense_key="give%2:40:00::", lemma="give", pos="v")
    >>> ref.get_primary_identifier()
    'give%2:40:00::'
    """
    return self.sense_key or self.synset_offset

`is_valid_reference() -> bool` ¶

Check if reference has valid identifiers.

RETURNS	DESCRIPTION
`bool`	True if has sense_key or synset_offset.

Examples:

>>> ref = WordNetCrossRef(sense_key="give%2:40:00::", lemma="give", pos="v")
>>> ref.is_valid_reference()
True

Source code in src/glazing/wordnet/models.py

def is_valid_reference(self) -> bool:
    """Check if reference has valid identifiers.

    Returns
    -------
    bool
        True if has sense_key or synset_offset.

    Examples
    --------
    >>> ref = WordNetCrossRef(sense_key="give%2:40:00::", lemma="give", pos="v")
    >>> ref.is_valid_reference()
    True
    """
    return self.sense_key is not None or self.synset_offset is not None

`to_percentage_notation() -> str` ¶

Convert to VerbNet percentage notation.

RETURNS	DESCRIPTION
`str`	Percentage notation (e.g., "give%2:40:00").

Examples:

>>> ref = WordNetCrossRef(sense_key="give%2:40:00::", lemma="give", pos="v")
>>> ref.to_percentage_notation()
'give%2:40:00'

Source code in src/glazing/wordnet/models.py

def to_percentage_notation(self) -> str:
    """Convert to VerbNet percentage notation.

    Returns
    -------
    str
        Percentage notation (e.g., "give%2:40:00").

    Examples
    --------
    >>> ref = WordNetCrossRef(sense_key="give%2:40:00::", lemma="give", pos="v")
    >>> ref.to_percentage_notation()
    'give%2:40:00'
    """
    if self.sense_key:
        # Extract components from sense key (format: lemma%ss_type:lex_filenum:lex_id::)
        parts = self.sense_key.split("%")
        if len(parts) >= 2:
            sense_part = parts[1].split(":")
            if len(sense_part) >= 3:
                return f"{self.lemma}%{sense_part[0]}:{sense_part[1]}:{sense_part[2]}"
    return ""
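
Note that the method returns an empty string rather than raising when no notation can be built, so callers may want to check for that. The standalone copy below, with the hypothetical name `to_notation`, shows both outcomes.

```python
from __future__ import annotations


def to_notation(lemma: str, sense_key: str | None) -> str:
    """Standalone copy of the conversion; '' means nothing could be built."""
    if sense_key:
        parts = sense_key.split("%")
        if len(parts) >= 2:
            sense_part = parts[1].split(":")
            if len(sense_part) >= 3:
                return f"{lemma}%{sense_part[0]}:{sense_part[1]}:{sense_part[2]}"
    return ""


print(to_notation("give", "give%2:40:00::"))
print(repr(to_notation("give", None)))
```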
