pyrdf2vec.samplers.frequency module

class pyrdf2vec.samplers.frequency.ObjFreqSampler(inverse=False, split=False)

Bases: pyrdf2vec.samplers.sampler.Sampler

Object Frequency Weight node-centric sampling strategy that prioritizes walks containing edges with the highest-degree objects. The degree of an object is defined by the number of predicates present in its neighborhood.

Attributes:

_counts: The counter for vertices. Defaults to defaultdict.

_is_support_remote: True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.

_random_state: The random state to use to keep random determinism with the sampling strategy. Defaults to None.

_vertices_deg: The degree of the vertices. Defaults to {}.

_visited: Tags vertices that appear at the max depth or whose children are all tagged. Defaults to an empty set.

inverse: True if the inverse algorithm must be used, False otherwise. Defaults to False.

split: True if the split algorithm must be used, False otherwise. Defaults to False.

fit(kg)

Fits the sampling strategy by counting the number of parent predicates present in the neighborhood of each vertex.

Parameters

kg (KG) – The Knowledge Graph.

Return type

None
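The counting step described for fit can be illustrated without the library. The following is a minimal sketch, assuming the Knowledge Graph is given as plain (subject, predicate, object) string triples; the TRIPLES data and the count_object_degree helper are hypothetical and are not part of pyrdf2vec.

```python
from collections import defaultdict

# Hypothetical toy graph as (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("carol", "likes", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("dave", "livesIn", "paris"),
]


def count_object_degree(triples):
    """Count, for each object, how many predicate edges point to it."""
    counts = defaultdict(int)
    for _subject, _predicate, obj in triples:
        counts[obj] += 1
    return counts


object_counts = count_object_degree(TRIPLES)
for obj, degree in sorted(object_counts.items()):
    print(obj, degree)
```

In this toy graph "bob" has the highest count, so hops that end at "bob" would be favored by an object-frequency strategy.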

get_weight(hop)

Gets the weight of a hop in the Knowledge Graph.

Parameters

hop (Tuple[Any, Any]) – The hop of a vertex in a (predicate, object) form to get the weight.

Return type

int

Returns

The weight of a given hop.

Raises

ValueError – If the weight of a hop is requested before the sampling strategy has been fitted.
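The fit-then-get_weight contract, including the ValueError raised before fitting, can be sketched as follows. ToyObjFreqSampler is a hypothetical stand-in written for illustration; it mirrors the documented behavior but is not the library's implementation, and it takes plain string triples instead of a KG object.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


class ToyObjFreqSampler:
    """Hypothetical stand-in that weights a hop by its object's count."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def fit(self, triples: Iterable[Triple]) -> None:
        # Rebuild the counts from scratch on every call.
        self._counts = {}
        for _subject, _predicate, obj in triples:
            self._counts[obj] = self._counts.get(obj, 0) + 1

    def get_weight(self, hop: Tuple[str, str]) -> int:
        # Empty counts are taken to mean that fit() has not been called.
        if not self._counts:
            raise ValueError("call fit() before get_weight()")
        _predicate, obj = hop
        return self._counts.get(obj, 0)


sampler = ToyObjFreqSampler()
sampler.fit([
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("alice", "worksAt", "acme"),
])
weight = sampler.get_weight(("knows", "bob"))
```

Treating empty counts as "not fitted" is a simplification of this sketch; a graph with no triples would also trigger the error.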

class pyrdf2vec.samplers.frequency.ObjPredFreqSampler(inverse=False, split=False)

Bases: pyrdf2vec.samplers.sampler.Sampler

Predicate-Object Frequency Weight edge-centric sampling strategy that prioritizes walks containing edges with the highest-degree (predicate, object) relations. The degree of such a relation is defined by the number of times the (predicate, object) relation occurs in the Knowledge Graph.

_counts

The counter for (predicate, object) pairs. Defaults to defaultdict.

Type

DefaultDict[Tuple[str, str], int]

_is_support_remote

True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.

_random_state

The random state to use to keep random determinism with the sampling strategy. Defaults to None.

_vertices_deg

The degree of the vertices. Defaults to {}.

_visited

Tags vertices that appear at the max depth or whose children are all tagged. Defaults to an empty set.

inverse

True if the inverse algorithm must be used, False otherwise. Defaults to False.

split

True if the split algorithm must be used, False otherwise. Defaults to False.

fit(kg)

Fits the sampling strategy by counting the number of occurrences of an object belonging to a subject.

Parameters

kg (KG) – The Knowledge Graph.

Return type

None
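Counting (predicate, object) relations, as described for this class, can be sketched with a Counter keyed by pairs. This is an illustration under the assumption that the graph is available as plain string triples; TRIPLES and pair_counts are hypothetical names, not library attributes.

```python
from collections import Counter

# Hypothetical toy graph as (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("carol", "likes", "bob"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
    ("dave", "livesIn", "paris"),
]

# Each key is a (predicate, object) pair; the subject is ignored.
pair_counts = Counter((predicate, obj) for _subject, predicate, obj in TRIPLES)

# A hop is already in (predicate, object) form, so it can be used as the key.
hop = ("knows", "bob")
hop_weight = pair_counts[hop]
```

Unlike the object-only count, ("knows", "bob") and ("likes", "bob") are tracked separately here, so two hops reaching the same object can receive different weights.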

get_weight(hop)

Gets the weight of a hop in the Knowledge Graph.

Parameters

hop (Tuple[Any, Any]) – The hop of a vertex in a (predicate, object) form to get the weight.

Return type

int

Returns

The weight of a given hop.

Raises

ValueError – If the weight of a hop is requested before the sampling strategy has been fitted.

class pyrdf2vec.samplers.frequency.PredFreqSampler(inverse=False, split=False)

Bases: pyrdf2vec.samplers.sampler.Sampler

Predicate Frequency Weight edge-centric sampling strategy that prioritizes walks containing edges with the highest-degree predicates. The degree of a predicate is defined by the number of times the predicate occurs in the Knowledge Graph.

_counts

The counter for vertices. Defaults to defaultdict.

Type

DefaultDict[str, int]

_is_support_remote

True if the sampling strategy can be used with a remote Knowledge Graph, False otherwise. Defaults to False.

_random_state

The random state to use to keep random determinism with the sampling strategy. Defaults to None.

_vertices_deg

The degree of the vertices. Defaults to {}.

_visited

Tags vertices that appear at the max depth or whose children are all tagged. Defaults to an empty set.

inverse

True if the inverse algorithm must be used, False otherwise. Defaults to False.

split

True if the split algorithm must be used, False otherwise. Defaults to False.

fit(kg)

Fits the sampling strategy by counting the number of times each predicate occurs in the Knowledge Graph.

Parameters

kg (KG) – The Knowledge Graph.

Return type

None
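Predicate counting can be sketched the same way, keyed by the predicate alone. As before, this assumes plain string triples; TRIPLES and predicate_counts are hypothetical names used only for this illustration.

```python
from collections import Counter

# Hypothetical toy graph as (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("bob", "knows", "dave"),
    ("alice", "worksAt", "acme"),
    ("dave", "livesIn", "paris"),
]

# Only the predicate of each triple contributes to the count.
predicate_counts = Counter(predicate for _subject, predicate, _obj in TRIPLES)

# For a hop in (predicate, object) form, only the first element matters.
hop = ("knows", "dave")
hop_weight = predicate_counts[hop[0]]
```

Every hop that uses the predicate "knows" gets the same weight in this sketch, regardless of which object it reaches.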

get_weight(hop)

Gets the weight of a hop in the Knowledge Graph.

Parameters

hop (Tuple[Any, Any]) – The hop of a vertex in a (predicate, object) form to get the weight.

Return type

int

Returns

The weight of a given hop.

Raises

ValueError – If the weight of a hop is requested before the sampling strategy has been fitted.
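How per-hop weights might translate into prioritized walks is not shown on this page. The sketch below is one plausible reading: weights are normalized into probabilities and a hop is drawn with a seeded random generator, which plays the role that _random_state is documented to play. The pick_hop helper and the sample weights are hypothetical and do not reproduce the library's selection code.

```python
import random
from typing import List, Sequence, Tuple

Hop = Tuple[str, str]


def pick_hop(
    hops: Sequence[Hop], weights: Sequence[int], rng: random.Random
) -> Tuple[Hop, List[float]]:
    """Draw one hop with probability proportional to its weight."""
    total = sum(weights)
    probabilities = [weight / total for weight in weights]
    chosen = rng.choices(list(hops), weights=probabilities, k=1)[0]
    return chosen, probabilities


# Hypothetical candidate hops and the weights a fitted sampler might return.
hops = [("knows", "bob"), ("worksAt", "acme"), ("livesIn", "paris")]
weights = [3, 2, 1]

# A fixed seed keeps the draw reproducible across runs.
chosen, probabilities = pick_hop(hops, weights, random.Random(42))
```

Higher-weight hops are drawn more often but lower-weight hops remain reachable, which is what separates weighted sampling from always following the single best edge.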