import pandas as pd
Deep Hierarchical Classifiers
This module contains code to build a deep hierarchical classifier, reimplementing the approach of this paper: https://arxiv.org/ftp/arxiv/papers/2005/2005.06692.pdf
build_DHC_conditional_mask
build_DHC_conditional_mask (df_labels, label1, label2)
This is similar to build_standard_condition_mask, except that the level-1 and level-2 one-hot matrices are not concatenated.
_df_labels = pd.DataFrame({
    'col_1': [0, 0, 0, 1, 1, 2, 2, 2],
    'col_2': [0, 1, 2, 3, 4, 5, 6, 7]
})
_df_labels
| | col_1 | col_2 |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 0 | 2 |
| 3 | 1 | 3 |
| 4 | 1 | 4 |
| 5 | 2 | 5 |
| 6 | 2 | 6 |
| 7 | 2 | 7 |
l1l2_tensor = build_DHC_conditional_mask(_df_labels, 'col_1', 'col_2')
l1l2_tensor
tensor([[1., 1., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 1., 1.]])
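For intuition, the mask construction can be sketched as follows. The helper name `sketch_conditional_mask` is hypothetical and the real implementation may differ, but it reproduces the mask above (here as a NumPy array rather than a torch tensor):

```python
import numpy as np
import pandas as pd

def sketch_conditional_mask(df_labels, label1, label2):
    # One row per level-1 class, one column per level-2 class; a cell is 1
    # when that level-2 class belongs to that level-1 class.
    n1 = df_labels[label1].nunique()
    n2 = df_labels[label2].nunique()
    result = np.zeros((n1, n2), dtype=np.float32)
    pairs = df_labels[[label1, label2]].drop_duplicates()
    result[pairs[label1].to_numpy(), pairs[label2].to_numpy()] = 1.0
    return result

_df = pd.DataFrame({'col_1': [0, 0, 0, 1, 1, 2, 2, 2],
                    'col_2': [0, 1, 2, 3, 4, 5, 6, 7]})
mask = sketch_conditional_mask(_df, 'col_1', 'col_2')
```

Each row of the mask selects the level-2 classes reachable from one level-1 class, which is what the dependence loss below uses to detect inconsistent predictions.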
loss_for_DHC
loss_for_DHC (l1_repr_logits, l2_repr_logits, labels_l1, labels_l2, dhc_mask, lloss_weight=1.0, dloss_weight=0.8)
Reference: https://github.com/Ugenteraan/Deep_Hierarchical_Classification/blob/main/model/hierarchical_loss.py
| | Type | Default | Details |
|---|---|---|---|
| l1_repr_logits | | | Head 1's logit output |
| l2_repr_logits | | | Head 2's logit output |
| labels_l1 | | | True labels for head 1 |
| labels_l2 | | | True labels for head 2 |
| dhc_mask | | | A one-hot matrix mapping classes of head 1 to classes of head 2 |
| lloss_weight | float | 1.0 | Weight for the layer loss (lloss) |
| dloss_weight | float | 0.8 | Weight for the dependence loss (dloss) |
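The loss combines a per-level cross-entropy (layer loss) with a penalty for hierarchy-inconsistent predictions (dependence loss). A minimal NumPy sketch of this idea is below; the function name is hypothetical, and the 0/1 inconsistency indicator is a simplification of the exponential penalty used in the referenced repository:

```python
import numpy as np

def _softmax(x):
    # Numerically stable row-wise softmax.
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sketch_dhc_loss(l1_logits, l2_logits, labels_l1, labels_l2, dhc_mask,
                    lloss_weight=1.0, dloss_weight=0.8):
    n = l1_logits.shape[0]
    # Layer loss: standard cross-entropy at each level of the hierarchy.
    p1, p2 = _softmax(l1_logits), _softmax(l2_logits)
    lloss = (-np.log(p1[np.arange(n), labels_l1]).mean()
             - np.log(p2[np.arange(n), labels_l2]).mean())
    # Dependence loss: penalize samples whose predicted level-2 class is not
    # a child of the predicted level-1 class under dhc_mask.
    pred1, pred2 = l1_logits.argmax(1), l2_logits.argmax(1)
    dloss = (1.0 - dhc_mask[pred1, pred2]).mean()
    return lloss_weight * lloss + dloss_weight * dloss
```

With `dloss_weight=0.8` as in the signature above, a prediction pair that violates the hierarchy adds a fixed penalty on top of the usual cross-entropies, nudging both heads toward mutually consistent outputs.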
RobertaConcatHeadDHCRoot
RobertaConcatHeadDHCRoot (config, classifier_dropout=0.1, last_hidden_size=768, layer2concat=4, **kwargs)
Concatenated head for Roberta DHC Classification Model.
| | Type | Default | Details |
|---|---|---|---|
| config | | | HuggingFace model configuration |
| classifier_dropout | float | 0.1 | Dropout ratio (for the dropout layer right before the last nn.Linear) |
| last_hidden_size | int | 768 | Last hidden size (before the last nn.Linear) |
| layer2concat | int | 4 | Number of hidden layers to concatenate (counting from the top) |
| kwargs | | | |
RobertaSimpleHSCDHCSequenceClassification
RobertaSimpleHSCDHCSequenceClassification (config, dhc_mask, lloss_weight=1.0, dloss_weight=0.8, layer2concat=4, device=None)
Roberta Simple-DHC Architecture with Hidden-State-Concatenation for Sequence Classification task
| | Type | Default | Details |
|---|---|---|---|
| config | | | HuggingFace model configuration |
| dhc_mask | | | A one-hot matrix mapping classes of head 1 to classes of head 2 |
| lloss_weight | float | 1.0 | Weight for the layer loss (lloss) |
| dloss_weight | float | 0.8 | Weight for the dependence loss (dloss) |
| layer2concat | int | 4 | Number of hidden layers to concatenate (counting from the top) |
| device | NoneType | None | CPU or GPU |
RobertaHSCDHCSequenceClassification
RobertaHSCDHCSequenceClassification (config, dhc_mask, classifier_dropout=0.1, last_hidden_size=768, linear_l1_size=None, linear_l2_size=None, lloss_weight=1.0, dloss_weight=0.8, layer2concat=4, device=None)
Roberta DHC Architecture with Hidden-State-Concatenation for Sequence Classification task
| | Type | Default | Details |
|---|---|---|---|
| config | | | HuggingFace model configuration |
| dhc_mask | | | A one-hot matrix mapping classes of head 1 to classes of head 2 |
| classifier_dropout | float | 0.1 | Dropout ratio (for the dropout layer right before the last nn.Linear) |
| last_hidden_size | int | 768 | Last hidden size (before the last nn.Linear) |
| linear_l1_size | NoneType | None | Last hidden size for head 1 |
| linear_l2_size | NoneType | None | Last hidden size for head 2 |
| lloss_weight | float | 1.0 | Weight for the layer loss (lloss) |
| dloss_weight | float | 0.8 | Weight for the dependence loss (dloss) |
| layer2concat | int | 4 | Number of hidden layers to concatenate (counting from the top) |
| device | NoneType | None | CPU or GPU |