KGE.models.translating_based.TransR
Classes
LpDistancePow | An implementation of negative squared Lp-distance.
PairwiseHingeLoss | An implementation of Pairwise Hinge Loss / Margin Ranking Loss.
TransR | An implementation of TransR from [lin 2015].
TranslatingModel | A base module for Semantic Based Embedding Model.
UniformStrategy | An implementation of uniform negative sampling.
- class KGE.models.translating_based.TransR.TransR
Bases: KGE.models.base_model.TranslatingModel.TranslatingModel
An implementation of TransR from [lin 2015].
Both TransE and TransH assume that entity and relation embeddings lie in the same embedding space \(\mathbb{R}^k\). But entities and relations are fundamentally different kinds of objects, so a single shared semantic space may not be able to represent both. To address this issue, TransR models entities and relations in distinct embedding spaces, i.e., an entity space and relation spaces.
TransR represents each entity as \(\textbf{e}_i \in \mathbb{R}^k\) and each relation as \(\textbf{r}_i \in \mathbb{R}^d\); the dimensions of entity embeddings and relation embeddings are not necessarily identical. For each relation, TransR sets a projection matrix \(\textbf{M}_i \in \mathbb{R}^{k \times d}\) that projects entities from the entity space to the relation space, expecting that the projected entity embeddings can be connected by the relation embedding in the relation space:
\[ \begin{aligned}{\textbf{e}_h}_{\perp} + \textbf{r}_r &\approx {\textbf{e}_t}_{\perp}\\{\textbf{e}_h}_{\perp} &= \textbf{e}_h \textbf{M}_r\\{\textbf{e}_t}_{\perp} &= \textbf{e}_t \textbf{M}_r\end{aligned} \]
where \(\textbf{e}_i \in \mathbb{R}^k\) are vector representations of the entities, \(\textbf{r}_i \in \mathbb{R}^d\) are vector representations of the relations, and \(\textbf{M}_i \in \mathbb{R}^{k \times d}\) are the relation projection matrices.
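As a minimal sketch of the projection above, the following pure-Python snippet projects a head and a tail from a 3-dimensional entity space into a 2-dimensional relation space and computes the residual \({\textbf{e}_h}_{\perp} + \textbf{r}_r - {\textbf{e}_t}_{\perp}\). The helper names (`project`, `translation_residual`) and the toy values are illustrative assumptions and are not part of the KGE API.

```python
# Illustrative sketch only; not the KGE implementation.

def project(e, M):
    """Multiply row vector e (length k) by matrix M (k x d), giving length d."""
    k, d = len(M), len(M[0])
    if len(e) != k:
        raise ValueError("entity dimension must equal the number of rows of M")
    return [sum(e[i] * M[i][j] for i in range(k)) for j in range(d)]


def translation_residual(e_h, r, e_t, M):
    """Return e_h M + r - e_t M; close to zero for a plausible triplet."""
    h_proj = project(e_h, M)
    t_proj = project(e_t, M)
    return [hp + ri - tp for hp, ri, tp in zip(h_proj, r, t_proj)]


# Hypothetical toy values with k = 3 (entity space) and d = 2 (relation space).
M_r = [[1.0, 0.0],
       [0.0, 1.0],
       [1.0, 1.0]]
e_h = [1.0, 2.0, 0.0]
e_t = [0.5, 0.0, 1.0]
r_r = [0.5, -1.0]

residual = translation_residual(e_h, r_r, e_t, M_r)
```

With these values the projected head is `[1.0, 2.0]`, the projected tail is `[1.5, 1.0]`, and the residual is the zero vector, so the toy triplet satisfies the translation exactly.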
The score of \((h,r,t)\) is:
\[f(h,r,t) = s({\textbf{e}_h}_{\perp} + \textbf{r}_r, {\textbf{e}_t}_{\perp})\]
where \(s\) is a scoring function (KGE.score) that scores how plausibly the translation \({\textbf{e}_h}_{\perp} + \textbf{r}_r\) matches the projected tail \({\textbf{e}_t}_{\perp}\).
By default, KGE.score.LpDistancePow is used, which is the negative squared L2-distance:
\[s({\textbf{e}_h}_{\perp} + \textbf{r}_r, {\textbf{e}_t}_{\perp}) = - \left\| {\textbf{e}_h}_{\perp} + \textbf{r}_r - {\textbf{e}_t}_{\perp} \right\|_2^2\]
You can switch to the L1-distance by passing score_fn=LpDistancePow(p=1) to __init__(), or use any other score function by specifying score_fn in __init__().
If constraint=True is given in __init__(), the following constraints are applied:
\(\left\| \textbf{e}_h \right\|_2 \leq 1\) and \(\left\| \textbf{r}_r \right\|_2 \leq 1\) and \(\left\| \textbf{e}_t \right\|_2 \leq 1\)
\(\left\| \textbf{e}_h \textbf{M}_r \right\|_2 \leq 1\) and \(\left\| \textbf{e}_t \textbf{M}_r \right\|_2 \leq 1\)
Since the original TransR paper does not specify how these constraints are enforced, KGE.constraint.clip_constraint() is used here: it restricts a tensor's norm so that it does not exceed a given value, and if the norm does exceed it, the tensor is clipped to that threshold norm.
Methods
evaluate(eval_X, corrupt_side[, positive_X]) | Evaluate triplets.
get_rank(x, positive_X, corrupt_side) | Get the rank of one specific triplet.
restore_model_weights(model_weights) | Restore the model weights.
score_hrt(h, r, t) | Score the triplets \((h,r,t)\).
train(train_X, val_X, metadata, epochs, ...) | Train the Knowledge Graph Embedding Model.
- __init__(embedding_params, negative_ratio, corrupt_side, score_fn=<KGE.score.LpDistancePow object>, loss_fn=<KGE.loss.PairwiseHingeLoss object>, ns_strategy=<class 'KGE.ns_strategy.UniformStrategy'>, constraint=True, n_workers=1)
Initialize TransR.
- Parameters
embedding_params (dict) – embedding dimension parameters; should have the following keys:
  'ent_embedding_size' for entity embedding dimension \(k\)
  'rel_embedding_size' for relation embedding dimension \(d\)
negative_ratio (int) – number of negative samples
corrupt_side (str) – which side to corrupt during training, can be 'h', 't', or 'h+t'
score_fn (function, optional) – scoring function, by default KGE.score.LpDistancePow
loss_fn (class, optional) – loss function of class KGE.loss.Loss, by default KGE.loss.PairwiseHingeLoss
ns_strategy (function, optional) – negative sampling strategy, by default KGE.ns_strategy.UniformStrategy
constraint (bool, optional) – whether to apply the norm constraints, by default True
n_workers (int, optional) – number of workers for negative sampling, by default 1
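To illustrate what the `constraint` option amounts to, here is a minimal pure-Python sketch of clipping a vector to a maximum L2 norm. The function name `clip_by_norm` and its signature are assumptions made for this example; the signature of KGE.constraint.clip_constraint() is not shown on this page, and the library operates on tensors rather than Python lists.

```python
import math


def clip_by_norm(vec, max_norm=1.0):
    """Rescale vec so its L2 norm is at most max_norm; leave it unchanged otherwise."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= max_norm:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


inside = clip_by_norm([0.3, 0.4])   # norm 0.5, already within the unit ball
clipped = clip_by_norm([3.0, 4.0])  # norm 5.0, rescaled onto the unit sphere
clipped_norm = math.sqrt(sum(x * x for x in clipped))
```

Clipping preserves the direction of the embedding and only shrinks its length, which is one simple way to satisfy constraints of the form \(\left\| \textbf{e} \right\|_2 \leq 1\).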
- evaluate(eval_X, corrupt_side, positive_X=None)
Evaluate triplets.
- Parameters
eval_X (tf.Tensor or np.array) – triplets to be evaluated
corrupt_side (str) – which side of the triplets to corrupt, can be 'h' or 't'
positive_X (tf.Tensor or np.array, optional) – positive triplets that should be filtered out while generating corrupted triplets, by default None (no filter applied)
- Returns
evaluation result
- Return type
dict
- get_rank(x, positive_X, corrupt_side)
Get the rank of one specific triplet.
- Parameters
x (tf.Tensor or np.array) – the triplet to rank
positive_X (tf.Tensor or np.array, optional) – positive triplets that should be filtered out while generating corrupted triplets; if None, no filter is applied
corrupt_side (str) – which side of the triplet to corrupt, can be 'h' or 't'
- Returns
ranking result
- Return type
int
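The idea of a filtered rank can be sketched as follows: score every candidate entity on the corrupted side, drop candidates that form other known positive triplets, and count how many remaining candidates score strictly higher than the true entity. The function `filtered_rank`, the 1-based rank, and the tie handling are assumptions for this illustration and may differ from what get_rank does internally.

```python
def filtered_rank(scores, true_idx, known_positive_idx=()):
    """Return the 1-based rank of true_idx, ignoring other known positives.

    scores[i] is the score of the triplet obtained by placing entity i on the
    corrupted side; higher scores mean more plausible triplets.
    """
    true_score = scores[true_idx]
    skip = set(known_positive_idx) - {true_idx}
    better = sum(
        1 for i, s in enumerate(scores)
        if i not in skip and s > true_score
    )
    return better + 1


# Hypothetical scores for four candidate tails; the true tail is entity 3.
scores = [-0.5, -2.0, -0.1, -1.0]
raw_rank = filtered_rank(scores, true_idx=3)
# Entity 2 also forms a known positive triplet, so it is filtered out.
filt_rank = filtered_rank(scores, true_idx=3, known_positive_idx=[2])
```

Here the raw rank is 3 and the filtered rank is 2, because the higher-scoring entity 2 is itself a correct answer and should not count against the true tail.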
- restore_model_weights(model_weights)
Restore the model weights.
- Parameters
model_weights (dict) – dictionary of model weights to be restored
- score_hrt(h, r, t)
Score the triplets \((h,r,t)\).
If h is None, score all entities as candidate heads: \((h_i, r, t)\).
If t is None, score all entities as candidate tails: \((h, r, t_i)\).
h and t should not be None simultaneously.
- Parameters
h (tf.Tensor or np.ndarray or None) – index of heads with shape (n,)
r (tf.Tensor or np.ndarray) – index of relations with shape (n,)
t (tf.Tensor or np.ndarray or None) – index of tails with shape (n,)
- Returns
triplet scores with shape (n,)
- Return type
tf.Tensor
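The "score all entities" behaviour for a missing tail can be sketched in pure Python as below, using the negative squared L2-distance between the translated head and each projected candidate tail. All names (`score_all_tails`, `ent_emb`, `rel_emb`, `rel_proj`) and the toy embeddings are assumptions for this example; the sketch handles a single (h, r) pair, whereas score_hrt takes index arrays of shape (n,).

```python
def project(e, M):
    """Multiply row vector e (length k) by matrix M (k x d)."""
    return [sum(e[i] * M[i][j] for i in range(len(M))) for j in range(len(M[0]))]


def neg_sq_l2(translation, target):
    """Negative squared L2-distance between two equal-length vectors."""
    return -sum((a - b) ** 2 for a, b in zip(translation, target))


def score_all_tails(h, r, ent_emb, rel_emb, rel_proj):
    """Score (h, r, t_i) for every entity index t_i."""
    M = rel_proj[r]
    h_proj = project(ent_emb[h], M)
    translation = [a + b for a, b in zip(h_proj, rel_emb[r])]
    return [neg_sq_l2(translation, project(e_t, M)) for e_t in ent_emb]


# Hypothetical toy model: 3 entities, 1 relation, identity projection (k = d = 2).
ent_emb = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
rel_emb = [[1.0, 0.0]]
rel_proj = [[[1.0, 0.0], [0.0, 1.0]]]

scores = score_all_tails(0, 0, ent_emb, rel_emb, rel_proj)
best_tail = max(range(len(scores)), key=scores.__getitem__)
```

Entity 1 receives the highest score because its embedding coincides with the translated head; the other two candidates are each at squared distance 1.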
- train(train_X, val_X, metadata, epochs, batch_size, early_stopping_rounds=None, model_weights_initial=None, restore_best_weight=True, optimizer='Adam', seed=None, log_path='./logs', log_projector=False)
Train the Knowledge Graph Embedding Model.
- Parameters
train_X (np.ndarray or str) – training triplets.
  If np.ndarray, shape should be (n,3) for \((h,r,t)\) respectively.
  If str, training triplets should be saved under this folder path in csv format; every csv file should have 3 columns without a header, for \((h,r,t)\) respectively.
val_X (np.ndarray or str) – validation triplets.
  If np.ndarray, shape should be (n,3) for \((h,r,t)\) respectively.
  If str, validation triplets should be saved under this folder path in csv format; every csv file should have 3 columns without a header, for \((h,r,t)\) respectively.
metadata (dict) – metadata for the KG data; should have the following keys:
  'ent2ind': dict, dictionary that maps entity to index.
  'ind2ent': list, list that maps index to entity.
  'rel2ind': dict, dictionary that maps relation to index.
  'ind2rel': list, list that maps index to relation.
  KGE.data_utils.index_kg can be used to index the KG and get the metadata.
epochs (int) – number of epochs
batch_size (int) – batch size
early_stopping_rounds (int, optional) – number of rounds that trigger early stopping, by default None (no early stopping)
model_weights_initial (dict, optional) – initial model weights with specific values, by default None
restore_best_weight (bool, optional) – restore weights to the best iteration if early_stopping_rounds is not None, by default True
optimizer (str or tensorflow.keras.optimizers, optional) – optimizer applied in training, by default 'Adam', which uses the default settings of tf.keras.optimizers.Adam
seed (int, optional) – random seed for shuffling data & embedding initialization, by default None
log_path (str, optional) – path for tensorboard logging, by default './logs'
log_projector (bool, optional) – project the embeddings in the tensorboard projector tab; setting this to True writes the metadata and embedding tsv files to log_path and projects this data in the tensorboard projector tab, by default False
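To make the expected shape of `metadata` and of an `(n,3)` index array concrete, the sketch below builds the four mappings from labelled triplets and converts the triplets to indices. The helpers `build_metadata` and `index_triplets` are hypothetical stand-ins written for this example; in practice KGE.data_utils.index_kg is the documented way to obtain the metadata, and its signature and index ordering are not shown on this page.

```python
def build_metadata(triplets):
    """Build ent2ind / ind2ent / rel2ind / ind2rel from (h, r, t) label triplets.

    Indices are assigned in order of first appearance (an assumption).
    """
    ent2ind, rel2ind = {}, {}
    ind2ent, ind2rel = [], []
    for h, r, t in triplets:
        for ent in (h, t):
            if ent not in ent2ind:
                ent2ind[ent] = len(ind2ent)
                ind2ent.append(ent)
        if r not in rel2ind:
            rel2ind[r] = len(ind2rel)
            ind2rel.append(r)
    return {"ent2ind": ent2ind, "ind2ent": ind2ent,
            "rel2ind": rel2ind, "ind2rel": ind2rel}


def index_triplets(triplets, metadata):
    """Convert label triplets to rows of [head_index, relation_index, tail_index]."""
    e, r_map = metadata["ent2ind"], metadata["rel2ind"]
    return [[e[h], r_map[r], e[t]] for h, r, t in triplets]


# Hypothetical toy knowledge graph.
raw = [("alice", "knows", "bob"), ("bob", "works_at", "acme")]
metadata = build_metadata(raw)
indexed = index_triplets(raw, metadata)
```

The resulting `indexed` list has one row per triplet and three columns, matching the `(n,3)` layout described for `train_X` and `val_X`; wrapping it in `np.array(...)` would give the array form.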