pyrdf2vec.samplers package¶
Submodules¶
Module contents¶
- class pyrdf2vec.samplers.ObjFreqSampler(inverse=False, split=False)¶
Bases:
pyrdf2vec.samplers.sampler.Sampler
Object Frequency Weight is a node-centric sampling strategy that prioritizes walks containing edges with the highest-degree objects. The degree of an object is defined by the number of predicates present in its neighborhood.
- Attributes:
- _counts: The counter for vertices. Defaults to defaultdict.
- _is_support_remote: True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.
- _random_state: The random state used to keep the sampling strategy deterministic. Defaults to None.
- _vertices_deg: The degree of the vertices. Defaults to {}.
- _visited: Tags vertices that appear at the max depth or whose children are all tagged. Defaults to set.
- inverse: True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split: True if the split algorithm must be used, False otherwise. Defaults to False.
- fit(kg)¶
Fits the sampling strategy by counting the number of parent predicates present in the neighborhood of each vertex.
- get_weight(hop)¶
Gets the weight of a hop in the Knowledge Graph.
- Parameters
hop (Tuple[Any, Any]) – The hop of a vertex, in (predicate, object) form, whose weight is requested.
- Return type
- Returns
The weight of a given hop.
- Raises
ValueError – If there is an attempt to access the weight of a hop without the sampling strategy having been trained.
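As a rough illustration of the counting described above, the following standalone sketch computes an object-frequency weight over a toy list of triples. It does not use pyrdf2vec: the `triples` data and the `fit_object_frequency` and `get_weight` helpers are hypothetical names, and counting the edges that point to each object is only one assumed reading of "number of predicates present in its neighborhood".

```python
from collections import defaultdict

# Toy knowledge graph as (subject, predicate, object) triples (made-up data).
triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("alice", "livesIn", "ghent"),
    ("bob", "livesIn", "ghent"),
    ("carol", "livesIn", "ghent"),
    ("alice", "worksAt", "acme"),
]


def fit_object_frequency(triples):
    """Count, for each object, how many predicate edges point to it."""
    counts = defaultdict(int)
    for _subject, _predicate, obj in triples:
        counts[obj] += 1
    return counts


def get_weight(counts, hop):
    """Return the weight of a (predicate, object) hop."""
    if not counts:
        # Mirrors the documented behaviour: no weights before fitting.
        raise ValueError("The sampler must be fitted before weights are requested.")
    _predicate, obj = hop
    return counts[obj]


counts = fit_object_frequency(triples)
print(get_weight(counts, ("livesIn", "ghent")))  # 3
print(get_weight(counts, ("knows", "bob")))      # 2
```

In this sketch a hop leading to `ghent` outweighs a hop leading to `bob`, so a weighted sampler would favour walks through `ghent`.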
- class pyrdf2vec.samplers.ObjPredFreqSampler(inverse=False, split=False)¶
Bases:
pyrdf2vec.samplers.sampler.Sampler
Predicate-Object Frequency Weight is an edge-centric sampling strategy that prioritizes walks containing edges with the highest degree of (predicate, object) relations. The degree of such a relation is defined by the number of times the (predicate, object) relation occurs in a Knowledge Graph.
- _is_support_remote¶
True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.
- _random_state¶
The random state used to keep the sampling strategy deterministic. Defaults to None.
- _vertices_deg¶
The degree of the vertices. Defaults to {}.
- _visited¶
Tags vertices that appear at the max depth or whose children are all tagged. Defaults to set.
- inverse¶
True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split¶
True if the split algorithm must be used, False otherwise. Defaults to False.
- fit(kg)¶
Fits the sampling strategy by counting the number of occurrences of an object belonging to a subject.
- get_weight(hop)¶
Gets the weight of a hop in the Knowledge Graph.
- Parameters
hop (Tuple[Any, Any]) – The hop of a vertex, in (predicate, object) form, whose weight is requested.
- Return type
- Returns
The weight of a given hop.
- Raises
ValueError – If there is an attempt to access the weight of a hop without the sampling strategy having been trained.
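The difference from a purely object-based count can be seen in a small standalone sketch. It does not call pyrdf2vec; the `triples` data and the `fit_pred_obj_frequency` helper are hypothetical, and the weight is assumed to be the raw occurrence count of each (predicate, object) pair.

```python
from collections import Counter

# Toy (subject, predicate, object) triples (made-up data).
triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("dave", "likes", "bob"),
    ("alice", "livesIn", "ghent"),
]


def fit_pred_obj_frequency(triples):
    """Count how often each (predicate, object) pair occurs."""
    return Counter((predicate, obj) for _subject, predicate, obj in triples)


pair_counts = fit_pred_obj_frequency(triples)

# The object "bob" appears three times, but the weight depends on the pair.
print(pair_counts[("knows", "bob")])  # 2
print(pair_counts[("likes", "bob")])  # 1
```

Two hops that reach the same object can therefore receive different weights, which is what makes the strategy edge-centric rather than node-centric.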
- class pyrdf2vec.samplers.PageRankSampler(inverse=False, split=False, *, alpha=0.85)¶
Bases:
pyrdf2vec.samplers.sampler.Sampler
PageRank is a node-centric sampling strategy that prioritizes walks containing the most frequent objects. The most frequent objects receive a higher weight according to their PageRank ranking.
- _is_support_remote¶
True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.
- _pageranks¶
The PageRank dictionary. Defaults to {}.
- _random_state¶
The random state used to keep the sampling strategy deterministic. Defaults to None.
- _vertices_deg¶
The degree of the vertices. Defaults to {}.
- _visited¶
Tags vertices that appear at the max depth or whose children are all tagged. Defaults to set.
- alpha¶
The damping factor for PageRank. Defaults to 0.85.
- inverse¶
True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split¶
True if the split algorithm must be used, False otherwise. Defaults to False.
- fit(kg)¶
Fits the sampling strategy by running PageRank on a provided KG according to the specified damping.
- get_weight(hop)¶
Gets the weight of a hop in the Knowledge Graph.
- Parameters
hop (Tuple[Any, Any]) – The hop of a vertex, in (predicate, object) form, whose weight is requested.
- Return type
- Returns
The weight of a given hop.
- Raises
ValueError – If there is an attempt to access the weight of a hop without the sampling strategy having been trained.
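To make the role of the damping value concrete, here is a minimal power-iteration PageRank in plain Python. It is a sketch written independently of how the library computes its ranks: the `pagerank` function, the `iterations` parameter, and the toy `edges` are assumptions made for illustration, and predicates are ignored so that only subject-to-object links remain.

```python
def pagerank(edges, alpha=0.85, iterations=50):
    """Approximate PageRank by power iteration over directed (src, dst) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - alpha + alpha * dangling) / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += alpha * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Toy subject -> object links (made-up data).
edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "ghent"),
    ("alice", "ghent"),
]

ranks = pagerank(edges, alpha=0.85)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.3f}")
```

Under this sketch the weight of a (predicate, object) hop would simply be the rank of its object, so objects that many other vertices point to are visited more often.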
- class pyrdf2vec.samplers.PredFreqSampler(inverse=False, split=False)¶
Bases:
pyrdf2vec.samplers.sampler.Sampler
Predicate Frequency Weight is an edge-centric sampling strategy that prioritizes walks containing edges with the highest-degree predicates. The degree of a predicate is defined by the number of times the predicate occurs in a Knowledge Graph.
- _is_support_remote¶
True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.
- _random_state¶
The random state used to keep the sampling strategy deterministic. Defaults to None.
- _vertices_deg¶
The degree of the vertices. Defaults to {}.
- _visited¶
Tags vertices that appear at the max depth or whose children are all tagged. Defaults to set.
- inverse¶
True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split¶
True if the split algorithm must be used, False otherwise. Defaults to False.
- fit(kg)¶
Fits the sampling strategy by counting the number of times each predicate occurs in the Knowledge Graph.
- get_weight(hop)¶
Gets the weight of a hop in the Knowledge Graph.
- Parameters
hop (Tuple[Any, Any]) – The hop of a vertex, in (predicate, object) form, whose weight is requested.
- Return type
- Returns
The weight of a given hop.
- Raises
ValueError – If there is an attempt to access the weight of a hop without the sampling strategy having been trained.
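The sketch below shows how predicate counts could be turned into sampling probabilities for a set of candidate hops. It is a standalone illustration rather than the library's code: the `triples` data and the `hop_probabilities` helper are hypothetical, and normalising the counts so that they sum to one is an assumption about how the weights are consumed.

```python
from collections import Counter

# Toy (subject, predicate, object) triples (made-up data).
triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("alice", "livesIn", "ghent"),
    ("bob", "livesIn", "ghent"),
    ("carol", "livesIn", "ghent"),
    ("alice", "worksAt", "acme"),
]

# Fit step: count how often each predicate occurs.
pred_counts = Counter(predicate for _subject, predicate, _obj in triples)


def hop_probabilities(hops):
    """Normalise the predicate counts of the candidate hops into probabilities."""
    weights = [pred_counts[predicate] for predicate, _obj in hops]
    total = sum(weights)
    return [weight / total for weight in weights]


candidate_hops = [("knows", "bob"), ("livesIn", "ghent"), ("worksAt", "acme")]
probabilities = hop_probabilities(candidate_hops)
print(probabilities)
```

With these counts (2, 3, and 1), the hop using `livesIn` is picked half of the time.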
- class pyrdf2vec.samplers.Sampler(inverse=False, split=False)¶
Bases:
abc.ABC
Base class of the sampling strategies.
- _is_support_remote¶
True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.
- _random_state¶
The random state used to keep the sampling strategy deterministic. Defaults to None.
- _vertices_deg¶
The degree of the vertices. Defaults to {}.
- _visited¶
Tags vertices that appear at the max depth or whose children are all tagged. Defaults to set.
- inverse¶
True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split¶
True if the split algorithm must be used, False otherwise. Defaults to False.
- abstract fit(kg)¶
Fits the sampling strategy.
- Parameters
kg (KG) – The Knowledge Graph.
- Raises
SamplerNotSupported – If the sampling strategy is used with a remote Knowledge Graph that it does not support.
- Return type
- abstract get_weight(hop)¶
Gets the weight of a hop in the Knowledge Graph.
- Parameters
hop (Tuple[Any, Any]) – The hop of a vertex, in (predicate, object) form, whose weight is requested.
- Returns
The weight of a given hop.
- Raises
NotImplementedError – If this method is called without an implementation having been provided.
- get_weights(hops)¶
Gets the weights of the provided hops.
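The split between the abstract methods and `get_weights` can be pictured with a small standalone class hierarchy. This is a sketch of the pattern only, not the library's source: `BaseSampler` and `ConstantSampler` are hypothetical names, and the real `get_weights` may transform or normalise the values rather than return them unchanged.

```python
from abc import ABC, abstractmethod


class BaseSampler(ABC):
    """Hypothetical stand-in for an abstract sampling strategy."""

    @abstractmethod
    def fit(self, kg):
        """Prepare whatever statistics the strategy needs."""

    @abstractmethod
    def get_weight(self, hop):
        """Return the weight of one (predicate, object) hop."""

    def get_weights(self, hops):
        # Concrete helper built on top of the abstract get_weight.
        return [self.get_weight(hop) for hop in hops]


class ConstantSampler(BaseSampler):
    """Toy strategy that gives every hop the same weight."""

    def fit(self, kg):
        pass  # Nothing to precompute.

    def get_weight(self, hop):
        return 1


sampler = ConstantSampler()
sampler.fit(kg=None)
print(sampler.get_weights([("knows", "bob"), ("livesIn", "ghent")]))  # [1, 1]
```

Because both `fit` and `get_weight` are abstract, instantiating `BaseSampler` directly raises a `TypeError`; a subclass only needs to supply those two methods to inherit `get_weights`.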
- sample_hop(kg, walk, is_last_hop, is_reverse=False)¶
Samples an unvisited random hop in the (predicate, object) form, according to the weight of hops for a given walk.
- Parameters
kg (KG) – The Knowledge Graph.
walk (Tuple[Any, ...]) – The walk with one or several vertices.
is_last_hop (bool) – True if the next hop to be visited is the last one for the desired depth, False otherwise.
is_reverse (bool) – True to get the parent neighbors instead of the child neighbors, False otherwise. Defaults to False.
- Return type
- Returns
An unvisited hop in the (predicate, object) form.
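The idea of drawing an unvisited hop in proportion to its weight can be sketched with the standard library alone. The function below is an illustration under stated assumptions, not the method's implementation: the simplified signature (`hops`, `weights`, `visited`, `rng`) is hypothetical, it returns `None` when every hop has been visited, and it ignores the `is_last_hop` and `is_reverse` handling described above.

```python
import random


def sample_hop(hops, weights, visited, rng):
    """Pick one unvisited hop with probability proportional to its weight."""
    candidates = [
        (hop, weight) for hop, weight in zip(hops, weights) if hop not in visited
    ]
    if not candidates:
        return None  # Every hop has already been visited.
    choices, choice_weights = zip(*candidates)
    return rng.choices(choices, weights=choice_weights, k=1)[0]


# Toy candidate hops and weights (made-up data).
hops = [("knows", "bob"), ("livesIn", "ghent"), ("worksAt", "acme")]
weights = [2, 3, 1]

rng = random.Random(42)  # A seeded generator keeps the draws reproducible.
visited = set()

first = sample_hop(hops, weights, visited, rng)
visited.add(first)
second = sample_hop(hops, weights, visited, rng)
print(first, second)
```

Passing a seeded `random.Random` instance plays the role that `_random_state` is described as having: the same seed yields the same sequence of hops.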
- class pyrdf2vec.samplers.UniformSampler¶
Bases:
pyrdf2vec.samplers.sampler.Sampler
Uniform sampling strategy that assigns a uniform weight to each edge in a Knowledge Graph, in order to prioritize walks with strongly connected entities.
- _is_support_remote¶
True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to True.
- Type
- _random_state¶
The random state used to keep the sampling strategy deterministic. Defaults to None.
- _vertices_deg¶
The degree of the vertices. Defaults to {}.
- _visited¶
Tags vertices that appear at the max depth or whose children are all tagged. Defaults to set.
- inverse¶
True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split¶
True if the split algorithm must be used, False otherwise. Defaults to False.
- fit(kg)¶
Since the weights are uniform, this function does nothing.
- class pyrdf2vec.samplers.WideSampler(inverse=False, split=False)¶
Bases:
pyrdf2vec.samplers.sampler.Sampler
Wide sampling is a node-centric sampling strategy that gives priority to walks containing edges with the highest degree of predicates and objects. The degree of a predicate and an object is defined by the number of predicates and objects present in its neighborhood, and also by their number of occurrences in a Knowledge Graph.
- _is_support_remote¶
True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.
- _random_state¶
The random state used to keep the sampling strategy deterministic. Defaults to None.
- _vertices_deg¶
The degree of the vertices. Defaults to {}.
- _visited¶
Tags vertices that appear at the max depth or whose children are all tagged. Defaults to set.
- inverse¶
True if the inverse algorithm must be used, False otherwise. Defaults to False.
- split¶
True if the split algorithm must be used, False otherwise. Defaults to False.
- fit(kg)¶
Fits the sampling strategy by counting the number of available neighbors for each vertex, and also by counting the number of times each predicate and each object occurs in the Knowledge Graph.
- get_weight(hop)¶
Gets the weight of a hop in the Knowledge Graph.
- Parameters
hop (Tuple[Any, Any]) – The hop of a vertex, in (predicate, object) form, whose weight is requested.
- Return type
- Returns
The weight of a given hop.
- Raises
ValueError – If there is an attempt to access the weight of a hop without the sampling strategy having been trained.
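One way to picture a weight that mixes occurrence counts with neighborhood size is the standalone sketch below. The combining formula is an assumption chosen for illustration and should not be read as the formula used by `WideSampler`; the `triples` data and the `wide_weight` helper are likewise hypothetical.

```python
from collections import Counter

# Toy (subject, predicate, object) triples (made-up data).
triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("bob", "livesIn", "ghent"),
    ("alice", "livesIn", "ghent"),
    ("bob", "worksAt", "acme"),
]

# Fit step: occurrence counts plus the number of outgoing neighbors per vertex.
pred_counts = Counter(predicate for _subject, predicate, _obj in triples)
obj_counts = Counter(obj for _subject, _predicate, obj in triples)
neighbor_counts = Counter(subject for subject, _predicate, _obj in triples)


def wide_weight(hop):
    """Assumed formula: occurrence counts scaled by the object's neighborhood size."""
    predicate, obj = hop
    occurrences = pred_counts[predicate] + obj_counts[obj]
    return occurrences * (1 + neighbor_counts[obj])


print(wide_weight(("knows", "bob")))      # 12
print(wide_weight(("livesIn", "ghent")))  # 4
print(wide_weight(("worksAt", "acme")))   # 2
```

Here `bob` scores highest because it is both frequently referenced and has outgoing edges of its own, which matches the intent of favouring hops that open up many further neighbors.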