glazing.search
Unified search interface for all linguistic datasets.
This module provides a unified interface for searching FrameNet, VerbNet, WordNet, and PropBank data simultaneously. By default, UnifiedSearch loads all datasets automatically on initialization when no pre-initialized search objects are passed (see the auto_load parameter).
| CLASS | DESCRIPTION |
|---|---|
| SearchResult | Individual search result. |
| UnifiedSearch | Unified search interface across all linguistic datasets. |
| UnifiedSearchResult | Container for search results across all datasets. |

Classes
SearchResult(dataset: str, id: str, type: str, name: str, description: str, score: float)
dataclass
Individual search result.
| PARAMETER | DESCRIPTION |
|---|---|
| dataset | Source dataset name. TYPE: str |
| id | Entity identifier. TYPE: str |
| type | Entity type. TYPE: str |
| name | Entity name. TYPE: str |
| description | Entity description. TYPE: str |
| score | Relevance score. TYPE: float |
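The six fields above can be mirrored in a small stand-in dataclass to show how a list of results might be ranked by relevance. This is only a sketch: `SearchResultSketch` is not the class from glazing.search, and the identifiers, descriptions, and scores are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchResultSketch:
    """Stand-in with the same six fields as the documented SearchResult."""

    dataset: str
    id: str
    type: str
    name: str
    description: str
    score: float


# Invented sample data; real identifiers and scores will differ.
results = [
    SearchResultSketch("verbnet", "class-1", "verb_class", "give", "Sample class", 0.72),
    SearchResultSketch("framenet", "frame-1", "frame", "Giving", "Sample frame", 0.95),
    SearchResultSketch("wordnet", "synset-1", "synset", "give", "Sample synset", 0.81),
]

# Sort so that the highest relevance score comes first.
ranked = sorted(results, key=lambda result: result.score, reverse=True)
top_datasets = [result.dataset for result in ranked]
print(top_datasets)
```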
UnifiedSearch(data_dir: Path | str | None = None, framenet: FrameNetSearch | None = None, verbnet: VerbNetSearch | None = None, wordnet: WordNetSearch | None = None, propbank: PropBankSearch | None = None, auto_load: bool = True)
Unified search interface across all linguistic datasets.
Provides methods for searching FrameNet, VerbNet, WordNet, and PropBank simultaneously or individually.
| PARAMETER | DESCRIPTION |
|---|---|
| framenet | FrameNet search index. TYPE: FrameNetSearch \| None |
| verbnet | VerbNet search index. TYPE: VerbNetSearch \| None |
| wordnet | WordNet search index. TYPE: WordNetSearch \| None |
| propbank | PropBank search index. TYPE: PropBankSearch \| None |

| ATTRIBUTE | DESCRIPTION |
|---|---|
| framenet | FrameNet search interface. |
| verbnet | VerbNet search interface. |
| wordnet | WordNet search interface. |
| propbank | PropBank search interface. |

| METHOD | DESCRIPTION |
|---|---|
| by_lemma | Search all datasets by lemma. |
| by_semantic_role | Search for frames/classes with a semantic role. |
| by_semantic_predicate | Search for verb classes with a semantic predicate. |
| by_domain | Search within a specific domain. |
| get_statistics | Get statistics across all datasets. |
Examples:
>>> search = UnifiedSearch(
... framenet=FrameNetSearch(frames),
... verbnet=VerbNetSearch(classes),
... wordnet=WordNetSearch(synsets),
... propbank=PropBankSearch(framesets)
... )
>>> results = search.by_lemma("give")
Initialize unified search.
| PARAMETER | DESCRIPTION |
|---|---|
| data_dir | Directory containing converted data files. If None, uses default path from environment. TYPE: Path \| str \| None |
| framenet | Pre-initialized FrameNet search object. TYPE: FrameNetSearch \| None |
| verbnet | Pre-initialized VerbNet search object. TYPE: VerbNetSearch \| None |
| wordnet | Pre-initialized WordNet search object. TYPE: WordNetSearch \| None |
| propbank | Pre-initialized PropBank search object. TYPE: PropBankSearch \| None |
| auto_load | If True and no search objects provided, automatically loads from data_dir. TYPE: bool |
Source code in src/glazing/search.py
Functions
batch_by_lemma(lemmas: list[str], pos: str | None = None) -> dict[str, UnifiedSearchResult]
Search all datasets for multiple lemmas.
| PARAMETER | DESCRIPTION |
|---|---|
| lemmas | List of lemmas to search for. TYPE: list[str] |
| pos | Part of speech constraint. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| dict[str, UnifiedSearchResult] | Results mapped by lemma. |
by_domain(domain: str) -> UnifiedSearchResult
Search within a specific domain.
| PARAMETER | DESCRIPTION |
|---|---|
| domain | Domain name (WordNet lexical file name). TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| UnifiedSearchResult | Results from datasets that support domain search. |
by_external_resource(resource_type: ResourceType, class_name: str | None = None) -> UnifiedSearchResult
Search for entries linked to external resources.
| PARAMETER | DESCRIPTION |
|---|---|
| resource_type | Type of resource (e.g., "VerbNet", "FrameNet"). TYPE: ResourceType |
| class_name | Specific class/frame name to match. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| UnifiedSearchResult | Results from datasets with links to the resource. |
by_lemma(lemma: str, pos: str | None = None) -> UnifiedSearchResult
Search all datasets by lemma.
| PARAMETER | DESCRIPTION |
|---|---|
| lemma | Lemma to search for. TYPE: str |
| pos | Part of speech constraint (format varies by dataset). TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| UnifiedSearchResult | Results from all datasets. |
by_semantic_predicate(predicate: PredicateType) -> UnifiedSearchResult
Search for verb classes with a semantic predicate.
| PARAMETER | DESCRIPTION |
|---|---|
| predicate | Semantic predicate to search for. TYPE: PredicateType |

| RETURNS | DESCRIPTION |
|---|---|
| UnifiedSearchResult | Results from VerbNet. |
by_semantic_role(role_name: str) -> UnifiedSearchResult
Search for frames/classes with a semantic role.
| PARAMETER | DESCRIPTION |
|---|---|
| role_name | Name of semantic role (e.g., "Agent", "Theme"). TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| UnifiedSearchResult | Results from datasets that have this role. |
find_cross_references(entity_id: str, source: str, target: str) -> list[dict[str, str | float]]
Find cross-references between datasets.
| PARAMETER | DESCRIPTION |
|---|---|
| entity_id | Source entity identifier. TYPE: str |
| source | Source dataset name. TYPE: str |
| target | Target dataset name. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| list[dict] | List of cross-reference mappings. |
from_paths(framenet_path: Path | str | None = None, verbnet_path: Path | str | None = None, wordnet_synsets_path: Path | str | None = None, wordnet_senses_path: Path | str | None = None, propbank_path: Path | str | None = None) -> UnifiedSearch
classmethod
Load unified search from JSON Lines files.
| PARAMETER | DESCRIPTION |
|---|---|
| framenet_path | Path to FrameNet JSONL file. TYPE: Path \| str \| None |
| verbnet_path | Path to VerbNet JSONL file. TYPE: Path \| str \| None |
| wordnet_synsets_path | Path to WordNet synsets JSONL file. TYPE: Path \| str \| None |
| wordnet_senses_path | Path to WordNet senses JSONL file. TYPE: Path \| str \| None |
| propbank_path | Path to PropBank JSONL file. TYPE: Path \| str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| UnifiedSearch | Unified search with loaded datasets. |
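JSON Lines stores one JSON object per line, which is the file shape from_paths expects. As a rough illustration of that format (not glazing's own loader), the sketch below writes and reads a tiny file using only the standard library; `read_jsonl` and the "id" and "name" keys are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path: Path) -> list[dict]:
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records


# Write a tiny sample file; the record keys are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.jsonl"
    sample.write_text(
        '{"id": 1, "name": "Giving"}\n\n{"id": 2, "name": "Motion"}\n',
        encoding="utf-8",
    )
    records = read_jsonl(sample)

names = [record["name"] for record in records]
print(names)
```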
get_entity(entity_id: str, dataset: str) -> Frame | VerbClass | Synset | Frameset | None
Get a specific entity from a dataset.
| PARAMETER | DESCRIPTION |
|---|---|
| entity_id | Entity identifier. TYPE: str |
| dataset | Dataset name. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| Frame \| VerbClass \| Synset \| Frameset \| None | The entity if found, None otherwise. |
get_statistics() -> dict[str, dict[str, int]]
Get statistics across all datasets.
| RETURNS | DESCRIPTION |
|---|---|
| dict[str, dict[str, int]] | Statistics for each available dataset. |
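The return type is a mapping from dataset name to named integer counts. The sketch below shows one way such a payload could be summarized; the dataset keys, counter names, and numbers are all invented, since the actual keys returned by get_statistics are not listed here.

```python
# Hypothetical payload shaped like dict[str, dict[str, int]];
# the dataset and counter names are invented for illustration.
stats = {
    "framenet": {"frames": 3, "lexical_units": 12},
    "verbnet": {"classes": 2, "members": 9},
}

# Sum the counters of each dataset, then add up an overall total.
per_dataset = {name: sum(counts.values()) for name, counts in stats.items()}
grand_total = sum(per_dataset.values())

for name, total in per_dataset.items():
    print(f"{name}: {total}")
print(f"all datasets: {grand_total}")
```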
load_framenet_from_jsonl(filepath: str) -> None
Load FrameNet data from JSONL file.
load_propbank_from_jsonl(filepath: str) -> None
Load PropBank data from JSONL file.
load_verbnet_from_jsonl(filepath: str) -> None
Load VerbNet data from JSONL file.
load_wordnet_from_jsonl(synsets_path: str, _index_path: str, _pos: str) -> None
Load WordNet data from JSONL files.
search(query: str) -> list[SearchResult]
Search across all datasets with a text query.
| PARAMETER | DESCRIPTION |
|---|---|
| query | Search query text. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| list[SearchResult] | List of search results across all datasets. |
search_by_syntax(pattern: str, dataset: str | None = None, allow_wildcards: bool = True, min_confidence: float = 0.7) -> list[SearchResult]
Search by syntactic pattern with hierarchical matching.
General patterns match specific instances with full confidence.
| PARAMETER | DESCRIPTION |
|---|---|
| pattern | Syntactic pattern with optional wildcards and roles. Examples: "NP V NP" (basic transitive); "NP V PP" (matches all PP subtypes); "NP V PP.instrument" (specific PP role); "NP V NP *" (wildcard for fourth position). TYPE: str |
| dataset | Limit to specific dataset (verbnet, propbank, framenet). TYPE: str \| None |
| allow_wildcards | Whether to process wildcard elements (*). TYPE: bool |
| min_confidence | Minimum confidence score for matches (0.0-1.0). TYPE: float |

| RETURNS | DESCRIPTION |
|---|---|
| list[SearchResult] | Matching results sorted by confidence. |

Examples:
>>> search = UnifiedSearch()
>>> # Find all PP patterns
>>> results = search.search_by_syntax("NP V PP")
>>> # Find specific PP role
>>> results = search.search_by_syntax("NP V PP.instrument")
>>> # Use wildcards
>>> results = search.search_by_syntax("NP V NP *")
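To make the idea of hierarchical matching concrete, here is a toy matcher written from the description above. It is not the algorithm used by search_by_syntax: the function names are hypothetical, and the partial score of 0.5 for a specific query against a general candidate is an arbitrary assumption chosen only to show how a min_confidence cutoff would filter results.

```python
def element_confidence(query: str, candidate: str, allow_wildcards: bool = True) -> float:
    """Score one pattern element against one candidate element."""
    if allow_wildcards and query == "*":
        return 1.0  # wildcard accepts any element in this position
    if query == candidate:
        return 1.0  # exact match
    q_base, _, q_role = query.partition(".")
    c_base, _, c_role = candidate.partition(".")
    if q_base != c_base:
        return 0.0  # different phrase types never match
    if not q_role:
        return 1.0  # general query ("PP") matches specific candidate ("PP.instrument")
    if not c_role:
        return 0.5  # specific query vs general candidate: arbitrary partial score
    return 0.0  # both name a role, and the roles differ


def pattern_confidence(pattern: str, candidate: str, allow_wildcards: bool = True) -> float:
    """Score a whole pattern as the weakest of its element scores."""
    query_parts = pattern.split()
    candidate_parts = candidate.split()
    if not query_parts or len(query_parts) != len(candidate_parts):
        return 0.0
    return min(
        element_confidence(q, c, allow_wildcards)
        for q, c in zip(query_parts, candidate_parts)
    )


# Keep only candidates that reach the documented default threshold of 0.7.
candidates = ["NP V PP.instrument", "NP V PP.location", "NP V NP"]
matches = [c for c in candidates if pattern_confidence("NP V PP", c) >= 0.7]
print(matches)
```

Taking the minimum over elements means a single mismatched position rules out the whole pattern, which is one simple way to combine per-element scores; a real implementation may weight positions differently.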
search_framenet_elements(core_type: str | None = None, semantic_type: str | None = None) -> list[Frame]
Search FrameNet frames by element properties.
| PARAMETER | DESCRIPTION |
|---|---|
| core_type | "Core", "Non-Core", or "Extra-Thematic". TYPE: str \| None |
| semantic_type | Semantic type of elements. TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| list[Frame] | FrameNet frames matching criteria. |
search_propbank_args(arg_type: str | None = None, prefix: str | None = None, modifier: str | None = None, arg_number: str | None = None) -> list[Roleset]
Search PropBank rolesets by argument properties.
| PARAMETER | DESCRIPTION |
|---|---|
| arg_type | "core" or "modifier". TYPE: str \| None |
| prefix | "C" or "R" for continuation/reference. TYPE: str \| None |
| modifier | Modifier type (e.g., "LOC", "TMP"). TYPE: str \| None |
| arg_number | Specific argument number (e.g., "0", "1", "2"). TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| list[Roleset] | PropBank rolesets matching criteria. |
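The four filters correspond to the parts of a conventional PropBank argument label such as ARG0, ARGM-LOC, or R-ARG1. The sketch below splits such a label string into those parts. It is only an illustration of the naming scheme: `parse_label` is a hypothetical helper, and how glazing stores arguments internally is not shown in this page.

```python
import re

# Optional "C-" or "R-" prefix, then either a numbered/lettered core
# argument (ARG0..ARG5, ARGA) or a modifier (ARGM-LOC, ARGM-TMP, ...).
LABEL = re.compile(
    r"^(?:(?P<prefix>[CR])-)?ARG(?:(?P<number>[0-9A])|M-(?P<modifier>[A-Z]+))$"
)


def parse_label(label: str) -> dict:
    """Split a PropBank-style label into the four documented filter fields."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"not a recognized argument label: {label!r}")
    return {
        "arg_type": "modifier" if match["modifier"] else "core",
        "prefix": match["prefix"],
        "modifier": match["modifier"],
        "arg_number": match["number"],
    }


for label in ("ARG0", "ARGM-LOC", "R-ARG1", "C-ARGM-TMP"):
    print(label, parse_label(label))
```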
search_semantic_roles(role_name: str) -> list[SearchResult]
Search for semantic roles across datasets.
| PARAMETER | DESCRIPTION |
|---|---|
| role_name | Role name to search for. TYPE: str |

| RETURNS | DESCRIPTION |
|---|---|
| list[SearchResult] | List of search results for the role. |
search_verbnet_roles(optional: bool | None = None, indexed: bool | None = None, verb_specific: bool | None = None) -> list[VerbClass]
Search VerbNet classes by role properties.
| PARAMETER | DESCRIPTION |
|---|---|
| optional | Filter for optional roles. TYPE: bool \| None |
| indexed | Filter for indexed roles. TYPE: bool \| None |
| verb_specific | Filter for verb-specific roles. TYPE: bool \| None |

| RETURNS | DESCRIPTION |
|---|---|
| list[VerbClass] | VerbNet classes matching criteria. |
search_with_fuzzy(query: str, fuzzy_threshold: float = 0.8) -> list[SearchResult]
Search across all datasets with fuzzy matching.
| PARAMETER | DESCRIPTION |
|---|---|
| query | Search query text. TYPE: str |
| fuzzy_threshold | Minimum similarity score for fuzzy matches. TYPE: float |

| RETURNS | DESCRIPTION |
|---|---|
| list[SearchResult] | Search results with confidence scores. |
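A similarity threshold like fuzzy_threshold can be illustrated with the standard library's difflib. This sketch swaps in `difflib.SequenceMatcher` as the similarity measure; the metric that search_with_fuzzy actually uses is not stated here and may differ, and `fuzzy_matches` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def fuzzy_matches(query: str, candidates: list[str], fuzzy_threshold: float = 0.8):
    """Return (candidate, score) pairs whose similarity reaches the threshold."""
    scored = []
    for candidate in candidates:
        # ratio() is a similarity in [0.0, 1.0]; 1.0 means identical strings.
        score = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
        if score >= fuzzy_threshold:
            scored.append((candidate, score))
    # Best matches first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# A misspelled query still finds the intended lemma.
hits = fuzzy_matches("recieve", ["receive", "give", "take"])
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

Raising the threshold toward 1.0 narrows results to near-exact spellings, while lowering it admits looser matches at the cost of more noise.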
search_wordnet_relations(relation_type: str | None = None) -> list[Synset]
Search WordNet synsets by relation type.
| PARAMETER | DESCRIPTION |
|---|---|
| relation_type | Relation type (e.g., "hypernym", "hyponym"). TYPE: str \| None |

| RETURNS | DESCRIPTION |
|---|---|
| list[Synset] | WordNet synsets with specified relations. |
UnifiedSearchResult(frames: list[Frame], verb_classes: list[VerbClass], synsets: list[Synset], framesets: list[Frameset], rolesets: list[Roleset])
dataclass
Container for search results across all datasets.
| PARAMETER | DESCRIPTION |
|---|---|
| frames | FrameNet frames found. TYPE: list[Frame] |
| verb_classes | VerbNet verb classes found. TYPE: list[VerbClass] |
| synsets | WordNet synsets found. TYPE: list[Synset] |
| framesets | PropBank framesets found. TYPE: list[Frameset] |
| rolesets | PropBank rolesets found. TYPE: list[Roleset] |
Examples:
>>> result = UnifiedSearchResult(
... frames=[giving_frame],
... verb_classes=[give_class],
... synsets=[give_synset],
... framesets=[give_frameset],
... rolesets=[]
... )
| METHOD | DESCRIPTION |
|---|---|
| count | Get total count of all results. |
| is_empty | Check if all result lists are empty. |

Functions
count() -> int
Get total count of all results.
| RETURNS | DESCRIPTION |
|---|---|
| int | Total number of results across all datasets. |
is_empty() -> bool
Check if all result lists are empty.
| RETURNS | DESCRIPTION |
|---|---|
| bool | True if no results found in any dataset. |
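The relationship between count and is_empty can be sketched with a stand-in container. `UnifiedSearchResultSketch` below is not the class from glazing.search: it uses plain strings in place of Frame, VerbClass, and the other entity types, and it gives every field an empty-list default purely for convenience, which the documented constructor signature does not show.

```python
from dataclasses import dataclass, field


@dataclass
class UnifiedSearchResultSketch:
    """Stand-in container with the five documented result lists."""

    frames: list = field(default_factory=list)
    verb_classes: list = field(default_factory=list)
    synsets: list = field(default_factory=list)
    framesets: list = field(default_factory=list)
    rolesets: list = field(default_factory=list)

    def count(self) -> int:
        """Total number of results across all five lists."""
        groups = (self.frames, self.verb_classes, self.synsets, self.framesets, self.rolesets)
        return sum(len(group) for group in groups)

    def is_empty(self) -> bool:
        """True when every result list is empty."""
        return self.count() == 0


# Placeholder strings stand in for real entity objects.
hit = UnifiedSearchResultSketch(frames=["Giving"], verb_classes=["give"], synsets=["give"])
miss = UnifiedSearchResultSketch()
print(hit.count(), hit.is_empty(), miss.is_empty())
```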