import pandas as pd
Deep Hierarchical Classifiers
This module contains code to build a deep hierarchical classifier, reimplementing the approach of this paper: https://arxiv.org/ftp/arxiv/papers/2005/2005.06692.pdf
build_DHC_conditional_mask
build_DHC_conditional_mask (df_labels, label1, label2)
This is similar to build_standard_condition_mask, except that the level-1 and level-2 one-hot matrices are not concatenated.
_df_labels = pd.DataFrame({
    'col_1': [0, 0, 0, 1, 1, 2, 2, 2],
    'col_2': [0, 1, 2, 3, 4, 5, 6, 7]
})
_df_labels
| | col_1 | col_2 |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 0 | 2 |
| 3 | 1 | 3 |
| 4 | 1 | 4 |
| 5 | 2 | 5 |
| 6 | 2 | 6 |
| 7 | 2 | 7 |
l1l2_tensor = build_DHC_conditional_mask(_df_labels, 'col_1', 'col_2')
l1l2_tensor
tensor([[1., 1., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 1., 1.]])
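For intuition, the mask construction can be sketched as follows. The helper name `sketch_conditional_mask` is hypothetical and the real implementation may differ, but it reproduces the mask above (here as a NumPy array rather than a torch tensor):

```python
import numpy as np
import pandas as pd

def sketch_conditional_mask(df_labels, label1, label2):
    # One row per level-1 class, one column per level-2 class; a cell is 1
    # when that level-2 class belongs to that level-1 class.
    n1 = df_labels[label1].nunique()
    n2 = df_labels[label2].nunique()
    result = np.zeros((n1, n2), dtype=np.float32)
    pairs = df_labels[[label1, label2]].drop_duplicates()
    result[pairs[label1].to_numpy(), pairs[label2].to_numpy()] = 1.0
    return result

_df = pd.DataFrame({'col_1': [0, 0, 0, 1, 1, 2, 2, 2],
                    'col_2': [0, 1, 2, 3, 4, 5, 6, 7]})
mask = sketch_conditional_mask(_df, 'col_1', 'col_2')
```

Each row of the mask selects the level-2 classes reachable from one level-1 class, which is what the dependence loss below uses to detect inconsistent predictions.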
loss_for_DHC
loss_for_DHC (l1_repr_logits, l2_repr_logits, labels_l1, labels_l2, dhc_mask, lloss_weight=1.0, dloss_weight=0.8)
Reference: https://github.com/Ugenteraan/Deep_Hierarchical_Classification/blob/main/model/hierarchical_loss.py
| | Type | Default | Details |
|---|---|---|---|
| l1_repr_logits | | | Head 1's logit output |
| l2_repr_logits | | | Head 2's logit output |
| labels_l1 | | | True labels for head 1 |
| labels_l2 | | | True labels for head 2 |
| dhc_mask | | | A one-hot matrix mapping classes of head 1 to classes of head 2 |
| lloss_weight | float | 1.0 | Weight for the layer loss (lloss) |
| dloss_weight | float | 0.8 | Weight for the dependence loss (dloss) |
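The loss combines a per-level cross-entropy (layer loss) with a penalty for hierarchy-inconsistent predictions (dependence loss). A minimal NumPy sketch of this idea is below; the function name is hypothetical, and the 0/1 inconsistency indicator is a simplification of the exponential penalty used in the referenced repository:

```python
import numpy as np

def _softmax(x):
    # Numerically stable row-wise softmax.
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sketch_dhc_loss(l1_logits, l2_logits, labels_l1, labels_l2, dhc_mask,
                    lloss_weight=1.0, dloss_weight=0.8):
    n = l1_logits.shape[0]
    # Layer loss: standard cross-entropy at each level of the hierarchy.
    p1, p2 = _softmax(l1_logits), _softmax(l2_logits)
    lloss = (-np.log(p1[np.arange(n), labels_l1]).mean()
             - np.log(p2[np.arange(n), labels_l2]).mean())
    # Dependence loss: penalize samples whose predicted level-2 class is not
    # a child of the predicted level-1 class under dhc_mask.
    pred1, pred2 = l1_logits.argmax(1), l2_logits.argmax(1)
    dloss = (1.0 - dhc_mask[pred1, pred2]).mean()
    return lloss_weight * lloss + dloss_weight * dloss
```

With `dloss_weight=0.8` as in the signature above, a prediction pair that violates the hierarchy adds a fixed penalty on top of the usual cross-entropies, nudging both heads toward mutually consistent outputs.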
RobertaConcatHeadDHCRoot
RobertaConcatHeadDHCRoot (config, classifier_dropout=0.1, last_hidden_size=768, layer2concat=4, **kwargs)
Concatenated head for Roberta DHC Classification Model.
| | Type | Default | Details |
|---|---|---|---|
| config | | | HuggingFace model configuration |
| classifier_dropout | float | 0.1 | Dropout ratio (for the dropout layer right before the last nn.Linear) |
| last_hidden_size | int | 768 | Last hidden size (before the last nn.Linear) |
| layer2concat | int | 4 | Number of hidden layers to concatenate (counting from the top) |
| kwargs | | | |
RobertaSimpleHSCDHCSequenceClassification
RobertaSimpleHSCDHCSequenceClassification (config, dhc_mask, lloss_weight=1.0, dloss_weight=0.8, layer2concat=4, device=None)
Roberta Simple-DHC Architecture with Hidden-State-Concatenation for Sequence Classification task
| | Type | Default | Details |
|---|---|---|---|
| config | | | HuggingFace model configuration |
| dhc_mask | | | A one-hot matrix mapping classes of head 1 to classes of head 2 |
| lloss_weight | float | 1.0 | Weight for the layer loss (lloss) |
| dloss_weight | float | 0.8 | Weight for the dependence loss (dloss) |
| layer2concat | int | 4 | Number of hidden layers to concatenate (counting from the top) |
| device | NoneType | None | CPU or GPU |
RobertaHSCDHCSequenceClassification
RobertaHSCDHCSequenceClassification (config, dhc_mask, classifier_dropout=0.1, last_hidden_size=768, linear_l1_size=None, linear_l2_size=None, lloss_weight=1.0, dloss_weight=0.8, layer2concat=4, device=None)
Roberta DHC Architecture with Hidden-State-Concatenation for Sequence Classification task
| | Type | Default | Details |
|---|---|---|---|
| config | | | HuggingFace model configuration |
| dhc_mask | | | A one-hot matrix mapping classes of head 1 to classes of head 2 |
| classifier_dropout | float | 0.1 | Dropout ratio (for the dropout layer right before the last nn.Linear) |
| last_hidden_size | int | 768 | Last hidden size (before the last nn.Linear) |
| linear_l1_size | NoneType | None | Last hidden size for head 1 |
| linear_l2_size | NoneType | None | Last hidden size for head 2 |
| lloss_weight | float | 1.0 | Weight for the layer loss (lloss) |
| dloss_weight | float | 0.8 | Weight for the dependence loss (dloss) |
| layer2concat | int | 4 | Number of hidden layers to concatenate (counting from the top) |
| device | NoneType | None | CPU or GPU |