pyrdf2vec.samplers.pagerank module
- class pyrdf2vec.samplers.pagerank.PageRankSampler(inverse=False, split=False, *, alpha=0.85)
Bases: pyrdf2vec.samplers.sampler.Sampler
PageRank node-centric sampling strategy that prioritizes walks containing the most frequent objects. Frequency is expressed by assigning a higher weight to the most frequent objects according to their PageRank ranking.
- _is_support_remote
True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.
- _pageranks
The PageRank dictionary. Defaults to {}.
- _random_state
The random state to use to keep random determinism with the sampling strategy. Defaults to None.
- _vertices_deg
The degree of the vertices. Defaults to {}.
- _visited
Tags vertices that appear at the max depth or whose children are all tagged. Defaults to an empty set.
- alpha
The damping factor for PageRank. Defaults to 0.85.
- inverse
True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split
True if the split algorithm must be used, False otherwise. Defaults to False.
- fit(kg)
Fits the sampling strategy by running PageRank on a provided KG according to the specified damping.
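To illustrate what the damping factor does, here is a minimal, self-contained sketch of PageRank by power iteration on a toy adjacency list. It is not the code that fit(kg) runs: the function name `pagerank`, the `iterations` parameter, and the toy graph are all assumptions made for the example. Only the default alpha of 0.85 comes from the class signature.

```python
from typing import Dict, List


def pagerank(
    graph: Dict[str, List[str]], alpha: float = 0.85, iterations: int = 100
) -> Dict[str, float]:
    """Power-iteration PageRank over an adjacency list (illustrative only)."""
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(ranks[node] for node in nodes if not graph.get(node))
        # With probability (1 - alpha) the random surfer jumps to any node.
        new_ranks = {node: (1.0 - alpha) / n + alpha * dangling / n for node in nodes}
        for source in nodes:
            targets = graph.get(source, [])
            for target in targets:
                # With probability alpha the surfer follows an outgoing edge.
                new_ranks[target] += alpha * ranks[source] / len(targets)
        ranks = new_ranks
    return ranks


# Hypothetical toy graph: "Carol" has the most incoming edges.
toy_graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Carol"],
    "Carol": ["Alice"],
    "Dave": ["Carol"],
}
scores = pagerank(toy_graph)
```

In this sketch the scores sum to 1 and the node with the most incoming edges ends up ranked highest, which matches the idea of giving frequently referenced objects a higher weight.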
- get_weight(hop)
Gets the weight of a hop in the Knowledge Graph.
- Parameters
hop (Tuple[Any, Any]) – The hop of a vertex, in (predicate, object) form, whose weight is requested.
- Return type
- Returns
The weight of a given hop.
- Raises
ValueError – If there is an attempt to access the weight of a hop without the sampling strategy having been trained.
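The fit-before-get_weight contract described above can be sketched with a small stand-in class. `ToySampler` is hypothetical and is not the pyrdf2vec implementation: its `fit` takes precomputed scores instead of a KG, and it assumes that the weight of a hop is the PageRank score of the hop's object.

```python
from typing import Any, Dict, Tuple


class ToySampler:
    """Hypothetical stand-in for illustration, not the pyrdf2vec class."""

    def __init__(self) -> None:
        self._pageranks: Dict[str, float] = {}

    def fit(self, pageranks: Dict[str, float]) -> None:
        # The documented fit(kg) runs PageRank on a KG; this sketch simply
        # stores scores that were computed elsewhere.
        self._pageranks = dict(pageranks)

    def get_weight(self, hop: Tuple[Any, Any]) -> float:
        # Refuse to answer until the sampler has been fitted.
        if not self._pageranks:
            raise ValueError("fit must be called before get_weight")
        # Assumption: the weight of a hop is the score of its object.
        _predicate, obj = hop
        return self._pageranks[obj]


sampler = ToySampler()

# Calling get_weight before fit raises ValueError.
try:
    sampler.get_weight(("knows", "Bob"))
    raised = False
except ValueError:
    raised = True

# After fitting, the weight of a (predicate, object) hop can be read.
sampler.fit({"Bob": 0.2, "Carol": 0.4})
weight = sampler.get_weight(("knows", "Carol"))
```

Raising early keeps an unfitted sampler from silently returning meaningless weights during walk extraction.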