Heterogeneous graph construction and node representation learning method of Treatise on Febrile Diseases based on graph convolutional network

Bibliographic Details
Main Authors: Junfeng YAN (Author), Zhihua WEN (Author), Beiji ZOU (Author)
Format: Book
Published: KeAi Communications Co., Ltd., 2022-12-01.
Description
Summary: Objective: To construct symptom-formula-herb heterogeneous graphs from a structured Treatise on Febrile Diseases (Shang Han Lun,《伤寒论》) dataset and to explore an optimal method for learning node attribute representations based on the graph convolutional network (GCN). Methods: Clauses containing symptoms, formulas, and herbs were extracted from Treatise on Febrile Diseases to construct symptom-formula-herb heterogeneous graphs, on which a GCN-based node representation learning method, the Traditional Chinese Medicine Graph Convolution Network (TCM-GCN), was proposed. The symptom-formula, symptom-herb, and formula-herb heterogeneous graphs were processed with the TCM-GCN to perform high-order message passing and neighbor aggregation, yielding new node representations. The resulting sum-aggregations of symptom, formula, and herb nodes lay a foundation for downstream prediction models. Results: Among node representations with multi-hot encoding, non-fusion encoding, and fusion encoding, the Precision@10, Recall@10, and F1-score@10 of the fusion encoding were 9.77%, 6.65%, and 8.30% higher, respectively, than those of the non-fusion encoding in the model's prediction experiments. Conclusion: Node representations obtained by fusion encoding achieved comparatively good results, indicating that the TCM-GCN is effective at learning node-level representations of the heterogeneous-graph-structured Treatise on Febrile Diseases dataset and can improve the performance of the downstream diagnosis model.
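The message passing and sum-aggregation described in the summary can be illustrated with a minimal sketch. This record does not give the TCM-GCN architecture, so everything below is an assumption made for illustration: the toy symptom-formula incidence matrix, the feature vectors, and the function names `sum_aggregate` and `propagate` are hypothetical, and the learned weights, nonlinearities, and normalization that a real GCN layer would normally include are omitted.

```python
import numpy as np

# Hypothetical toy data (not from the paper): rows are symptoms, columns are
# formulas; a 1 means the symptom and the formula appear in the same clause.
symptom_formula = np.array([
    [1, 0],  # symptom 0 co-occurs with formula 0
    [1, 1],  # symptom 1 co-occurs with both formulas
    [0, 1],  # symptom 2 co-occurs with formula 1
], dtype=float)

# Hypothetical initial node features (2 dimensions per node).
symptom_feats = np.array([[1, 0], [0, 1], [1, 1]], dtype=float)
formula_feats = np.array([[1, 0], [0, 1]], dtype=float)


def sum_aggregate(adjacency, neighbor_features):
    """Sum the feature vectors of each node's neighbors."""
    return adjacency @ neighbor_features


def propagate(incidence, s_feats, f_feats, num_layers=2):
    """Run num_layers rounds of sum-aggregation on a bipartite graph.

    Each round adds the summed neighbor features to every node's own
    features, so after k rounds a node has received information from
    nodes up to k hops away (high-order propagation).
    """
    for _ in range(num_layers):
        new_s = s_feats + sum_aggregate(incidence, f_feats)
        new_f = f_feats + sum_aggregate(incidence.T, s_feats)
        s_feats, f_feats = new_s, new_f
    return s_feats, f_feats


symptom_out, formula_out = propagate(symptom_formula, symptom_feats, formula_feats)
print(symptom_out)
print(formula_out)
```

With two rounds, symptom 0 receives information from symptom 1 through their shared formula, which is the sense in which the propagation is high-order. The same pattern would apply to the symptom-herb and formula-herb graphs.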
Item Description: ISSN 2589-3777
DOI: 10.1016/j.dcmed.2022.12.007