KGE.models.translating_based.RotatE

Classes

`LpDistance`	An implementation of negative Lp-distance.
`RotatE`	An implementation of RotatE from [sun 2019].
`SelfAdversarialNegativeSamplingLoss`	An implementation of Self Adversarial Negative Sampling Loss.
`TranslatingModel`	A base module for Translating Based Embedding Model.
`UniformStrategy`	An implementation of uniform negative sampling.

Class Inheritance Diagram

Inheritance diagram of KGE.models.translating_based.RotatE.RotatE

An implementation of RotatE

class KGE.models.translating_based.RotatE.RotatE

Bases: KGE.models.base_model.TranslatingModel.TranslatingModel

An implementation of RotatE from [sun 2019].

RotatE represents both entities and relations as embedding vectors in the complex space, and models the relation as an element-wise rotation from the head to tail:

\[\textbf{e}_h \circ \textbf{r}_r \approx \textbf{e}_t\]

where \(\textbf{e}_i, \textbf{r}_i \in \mathbb{C}^k\) are vector representations of the entities and relations, and \(\circ\) is the Hadamard (element-wise) product.
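The rotation above can be illustrated with a minimal NumPy sketch. This is not the library's code; the array names and values are assumptions chosen only for illustration.

```python
import numpy as np

# Illustrative values only; k = 2 complex dimensions.
e_h = np.array([1 + 0j, 0 + 1j])        # head embedding in C^k
theta = np.array([np.pi / 2, np.pi])    # one rotation angle per dimension
r = np.exp(1j * theta)                  # relation embedding with |r_i| = 1

# Hadamard (element-wise) product: rotates each coordinate of e_h by theta_i.
rotated = e_h * r

# A tail embedding that this relation maps the head onto.
e_t = np.array([0 + 1j, 0 - 1j])
matches = np.allclose(rotated, e_t)
```

Here the first coordinate is turned by a quarter turn and the second by a half turn, so `rotated` coincides with `e_t` up to floating-point error.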

The score of \((h,r,t)\) is:

\[f(h,r,t) = s(\textbf{e}_h \circ \textbf{r}_r, \textbf{e}_t)\]

where \(s\) is a scoring function (KGE.score) that scores how plausibly the rotated head \(\textbf{e}_h \circ \textbf{r}_r\) matches the tail \(\textbf{e}_t\).

By default, the scoring function is KGE.score.LpDistance, the negative L1-distance:

\[s(\textbf{e}_h \circ \textbf{r}_r, \textbf{e}_t) = - \left\| \textbf{e}_h \circ \textbf{r}_r - \textbf{e}_t \right\|_1\]

To use the L2-distance instead, pass score_fn=LpDistance(p=2) to __init__(). Any other scoring function can be supplied through score_fn in the same way.
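As a rough sketch of what a negative Lp-distance score computes, the hypothetical helper `neg_lp_distance` below takes the modulus of each complex coordinate of the difference. That treatment of complex values is an assumption; KGE.score.LpDistance may handle real and imaginary parts differently.

```python
import numpy as np

def neg_lp_distance(x, y, p=1):
    """Negative Lp-distance between two complex vectors (illustrative helper).

    Assumption: each coordinate contributes the modulus |x_i - y_i|.
    """
    diff = np.abs(x - y)                          # modulus of each complex difference
    return -(np.sum(diff ** p) ** (1.0 / p))

e_h = np.array([1 + 0j, 0 + 1j])
r = np.exp(1j * np.array([np.pi / 2, np.pi]))     # unit-modulus relation
e_t = np.array([0 + 1j, 0 - 1j])                  # the rotated head itself
e_other = np.array([1 + 0j, 1 + 0j])              # an unrelated tail

good = neg_lp_distance(e_h * r, e_t, p=1)         # approximately 0, the best score
bad_l1 = neg_lp_distance(e_h * r, e_other, p=1)   # strictly negative
bad_l2 = neg_lp_distance(e_h * r, e_other, p=2)   # same pair under the L2-distance
```

Scores are at most 0, and a higher (less negative) score means a more plausible triplet.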

RotatE constrains the modulus of each element of \(\textbf{r} \in \mathbb{C}^k\) to 1, i.e., each \(r_i \in \mathbb{C}\) satisfies \(\left| r_i \right| = 1\). Consequently, each \(r_i\) has the form \(e^{i\theta_{r,i}}\), a rotation by the angle \(\theta_{r,i}\) in the complex plane.
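To see why a unit-modulus coordinate acts as a pure rotation, write the \(j\)-th head coordinate in polar form, \(e_{h,j} = \left| e_{h,j} \right| e^{i\phi_{h,j}}\). Here \(j\) indexes the coordinate, \(i\) is the imaginary unit, and \(\phi_{h,j}\) is notation introduced only for this step. Then:

\[e_{h,j} \, r_j = \left| e_{h,j} \right| e^{i\phi_{h,j}} \cdot e^{i\theta_{r,j}} = \left| e_{h,j} \right| e^{i(\phi_{h,j} + \theta_{r,j})}\]

The modulus of the head coordinate is unchanged and only its phase shifts by \(\theta_{r,j}\).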

Methods

`evaluate`(eval_X, corrupt_side[, positive_X])	Evaluate triplets.
`get_rank`(x, positive_X, corrupt_side)	Get the rank of one specific triplet.
`restore_model_weights`(model_weights)	Restore the model weights.
`score_hrt`(h, r, t)	Score the triplets \((h,r,t)\).
`train`(train_X, val_X, metadata, epochs, ...)	Train the Knowledge Graph Embedding Model.

__init__(embedding_params, negative_ratio, corrupt_side, score_fn=<KGE.score.LpDistance object>, loss_fn=<KGE.loss.SelfAdversarialNegativeSamplingLoss object>, ns_strategy=<class 'KGE.ns_strategy.UniformStrategy'>, n_workers=1)

Initialize RotatE.

Parameters

embedding_params (dict) – embedding dimension parameters, must contain the key 'embedding_size' for the embedding dimension \(k\)
negative_ratio (int) – number of negative samples
corrupt_side (str) – which side to corrupt while training, can be 'h', 't', or 'h+t'
score_fn (function, optional) – scoring function, by default KGE.score.LpDistance
loss_fn (class, optional) – loss function class KGE.loss.Loss, by default KGE.loss.SelfAdversarialNegativeSamplingLoss
ns_strategy (class, optional) – negative sampling strategy, by default KGE.ns_strategy.UniformStrategy
n_workers (int, optional) – number of workers for negative sampling, by default 1
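The roles of negative_ratio and corrupt_side can be sketched with a hypothetical helper, `uniform_corrupt`, that replaces heads or tails with uniformly drawn entities. This is only an illustration under assumptions: it is not the code of KGE.ns_strategy.UniformStrategy, and in particular the per-sample coin flip used for 'h+t' is a guess at one reasonable behaviour.

```python
import numpy as np

def uniform_corrupt(triplets, n_entities, negative_ratio, corrupt_side, seed=None):
    """Hypothetical helper: corrupt each (h, r, t) row `negative_ratio` times."""
    rng = np.random.default_rng(seed)
    neg = np.repeat(triplets, negative_ratio, axis=0)   # copy, shape (n * ratio, 3)
    random_ents = rng.integers(0, n_entities, size=len(neg))
    if corrupt_side == 'h':
        replace_head = np.ones(len(neg), dtype=bool)
    elif corrupt_side == 't':
        replace_head = np.zeros(len(neg), dtype=bool)
    elif corrupt_side == 'h+t':
        replace_head = rng.random(len(neg)) < 0.5       # assumption: pick a side per sample
    else:
        raise ValueError("corrupt_side must be 'h', 't', or 'h+t'")
    # No filtering here: a drawn entity may coincide with the original one.
    neg[replace_head, 0] = random_ents[replace_head]
    neg[~replace_head, 2] = random_ents[~replace_head]
    return neg

train = np.array([[0, 0, 1], [2, 1, 3]])
negatives = uniform_corrupt(train, n_entities=4, negative_ratio=3,
                            corrupt_side='t', seed=0)
```

With corrupt_side='t', only the tail column changes, and each positive triplet yields negative_ratio corrupted rows.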

evaluate(eval_X, corrupt_side, positive_X=None)

Evaluate triplets.

Parameters

eval_X (tf.Tensor or np.ndarray) – triplets to be evaluated
corrupt_side (str) – corrupt triplets from which side, can be 'h' or 't'
positive_X (tf.Tensor or np.ndarray, optional) – positive triplets that should be filtered while generating corrupted triplets, by default None (no filter applied)

Returns

evaluation result

Return type

dict

get_rank(x, positive_X, corrupt_side)

Get the rank of one specific triplet.

Parameters

x (tf.Tensor or np.ndarray) – the triplet to rank
positive_X (tf.Tensor or np.ndarray, optional) – positive triplets that should be filtered while generating corrupted triplets, if None, no filter is applied
corrupt_side (str) – corrupt triplets from which side, can be 'h' or 't'

Returns

ranking result

Return type

int
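The effect of filtering with positive_X can be sketched as follows. The helper `filtered_rank` and its arguments are hypothetical, and the tie handling (only strictly higher scores count) is an assumption rather than the library's documented behaviour.

```python
import numpy as np

def filtered_rank(scores, true_idx, known_idx=()):
    """Rank (1 = best) of the true entity among all candidate entities.

    scores    : score of every candidate entity, higher is more plausible
    true_idx  : index of the entity in the triplet being ranked
    known_idx : other entities known to form positive triplets; they are
                masked out so they cannot push the true entity down
    """
    scores = np.asarray(scores, dtype=float).copy()
    mask = np.array([i for i in known_idx if i != true_idx], dtype=int)
    scores[mask] = -np.inf
    return int(np.sum(scores > scores[true_idx])) + 1

scores = [-0.2, -1.5, -0.1, -0.9]    # illustrative scores for 4 candidate tails
raw = filtered_rank(scores, true_idx=0)                       # entity 2 scores higher
filtered = filtered_rank(scores, true_idx=0, known_idx=[2])   # entity 2 is filtered out
```

Without the filter the true entity is ranked second; once the known positive is removed from the candidates it is ranked first.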

restore_model_weights(model_weights)

Restore the model weights.

Parameters

model_weights (dict) – dictionary of model weights to be restored

score_hrt(h, r, t)

Score the triplets \((h,r,t)\).

If h is None, score every entity as a candidate head: \((h_i, r, t)\).

If t is None, score every entity as a candidate tail: \((h, r, t_i)\).

h and t should not be None simultaneously.

Parameters

h (tf.Tensor or np.ndarray or None) – index of heads with shape (n,)
r (tf.Tensor or np.ndarray) – index of relations with shape (n,)
t (tf.Tensor or np.ndarray or None) – index of tails with shape (n,)

Returns

triplets scores with shape (n,)

Return type

tf.Tensor
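Scoring every entity as a candidate tail, as happens when t is None, can be sketched with NumPy broadcasting. The embeddings below are random placeholders, the helper `score_all_tails` is hypothetical, and the output shape (n, number of entities) is an assumption of this sketch rather than a documented property of score_hrt.

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_relations, k = 5, 2, 4

# Placeholder embeddings: complex entities and unit-modulus relations.
ent = rng.normal(size=(n_entities, k)) + 1j * rng.normal(size=(n_entities, k))
rel = np.exp(1j * rng.uniform(-np.pi, np.pi, size=(n_relations, k)))

def score_all_tails(h, r):
    """Score (h, r, t_i) for every entity t_i with the negative L1-distance."""
    rotated = ent[h] * rel[r]                        # shape (n, k)
    diff = rotated[:, None, :] - ent[None, :, :]     # shape (n, n_entities, k)
    return -np.abs(diff).sum(axis=-1)                # shape (n, n_entities)

h = np.array([0, 3])    # indices of heads, shape (n,)
r = np.array([1, 0])    # indices of relations, shape (n,)
scores = score_all_tails(h, r)
```

Each row of `scores` holds one query's score against all candidate tails, which is the input a ranking step needs.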

train(train_X, val_X, metadata, epochs, batch_size, early_stopping_rounds=None, model_weights_initial=None, restore_best_weight=True, optimizer='Adam', seed=None, log_path='./logs', log_projector=False)

Train the Knowledge Graph Embedding Model.

Parameters

train_X (np.ndarray or str) –
training triplets.

If np.ndarray, shape should be (n,3) for \((h,r,t)\) respectively.

If str, it is the path of a folder containing the training triplets as CSV files; every CSV file should have 3 columns, without a header, for \((h,r,t)\) respectively.
val_X (np.ndarray or str) –
validation triplets.

If np.ndarray, shape should be (n,3) for \((h,r,t)\) respectively.

If str, it is the path of a folder containing the validation triplets as CSV files; every CSV file should have 3 columns, without a header, for \((h,r,t)\) respectively.
metadata (dict) –
metadata for the knowledge graph data. It should have the following keys:

'ent2ind': dict, dictionary mapping each entity to its index.

'ind2ent': list, list mapping each index to its entity.

'rel2ind': dict, dictionary mapping each relation to its index.

'ind2rel': list, list mapping each index to its relation.

You can use KGE.data_utils.index_kg to index the data and get the metadata.
epochs (int) – number of epochs
batch_size (int) – number of triplets in each training batch
early_stopping_rounds (int, optional) – number of rounds that trigger early stopping, by default None (no early stopping)
model_weights_initial (dict, optional) – initial model weights with specific values, by default None
restore_best_weight (bool, optional) – restore the weights from the best iteration if early_stopping_rounds is not None, by default True
optimizer (str or tensorflow.keras.optimizers, optional) – optimizer used in training, by default 'Adam', which uses the default settings of tf.keras.optimizers.Adam
seed (int, optional) – random seed for shuffling data and initializing embeddings, by default None
log_path (str, optional) – path for TensorBoard logging, by default './logs'
log_projector (bool, optional) – project the embeddings in the TensorBoard projector tab; setting this to True writes the metadata and embedding tsv files to log_path and displays this data in the projector tab, by default False
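The metadata dictionary and the np.ndarray form of train_X described above can also be built by hand, as in the sketch below. The triplets are made up, and indexing in order of first appearance is an assumption; KGE.data_utils.index_kg is the library's own route and may order entities differently.

```python
import numpy as np

# Made-up string triplets (h, r, t) for illustration.
raw_triplets = [
    ("alice", "knows", "bob"),
    ("bob", "works_at", "acme"),
    ("alice", "works_at", "acme"),
]

# Index entities and relations in order of first appearance.
ind2ent = list(dict.fromkeys(e for h, _, t in raw_triplets for e in (h, t)))
ind2rel = list(dict.fromkeys(r for _, r, _ in raw_triplets))
ent2ind = {e: i for i, e in enumerate(ind2ent)}
rel2ind = {r: i for i, r in enumerate(ind2rel)}

metadata = {
    'ent2ind': ent2ind,
    'ind2ent': ind2ent,
    'rel2ind': rel2ind,
    'ind2rel': ind2rel,
}

# (n, 3) integer array with columns (h, r, t).
train_X = np.array(
    [[ent2ind[h], rel2ind[r], ent2ind[t]] for h, r, t in raw_triplets]
)
```

The resulting `train_X` and `metadata` have the shapes and keys that train() expects for its np.ndarray input.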