Graph Learning for Network Data

By Richard Gao
Slide 1: Title slide — 'Graph Learning for Network Data' with mentors and student listed.

Slide-1

Mentors: Dr. Xingquan Zhu and Yufei Jin

Student: Richard Gao

Slide 2: Significance slide describing misdiagnosis, drug development time, drowsy driving and food waste statistics.

Slide-2

Significance

  • About 1 in 18 American patients is misdiagnosed, 1 in 50 suffers an adverse event because of misdiagnosis, and 1 in 350 suffers permanent disability or death as a result.[1]
  • Development of novel drugs takes an average of 10-15 years.[2]
  • Drowsy driving is responsible for an estimated 6,000 fatal crashes in the US each year, while drunk driving is responsible for around 10,000 deaths per year.[3]
  • 119 billion pounds of food is wasted in the US annually. That’s around 40% of the food in America, or an astounding 130 billion meals.[4]
Slide 3: Neural Networks — citation lines and resource links shown.

Slide-3

Neural Networks

The left side features a matrix with columns labeled "Feature-1" to "Feature-n" and rows labeled "Sample-1" to "Sample-m."

Deep Learning: Deep Guide for All Your Matrix Dimensions and Calculations!, Medium, 24 Aug. 2018, https://medium.com/from-the-scratch/deep-learning-deep-guide-for-all-your-matrix-dimensions-and-calculations-415012de1568. Accessed 31 July 2023.

The right side displays a schematic diagram of a neural network with an "Input Layer," "Multiple Hidden Layers," and an "Output Layer." Each layer is composed of interconnected hexagons in various colors.

Importance of Artificial Neural Networks in Artificial Intelligence, https://www.turing.com, Accessed 27 July 2023.
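
The two figures on this slide fit together: each row of the sample-by-feature matrix is fed through the input, hidden, and output layers. Below is a minimal sketch of that forward pass in plain Python. The layer sizes, the random weights, the ReLU activation, and the toy matrix `X` are all assumptions made for illustration, not values from the presentation.

```python
import random

def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, x) for x in v]

def dense(x, weights, bias):
    """One fully connected layer: out[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative initialization only)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

def forward(sample, layers):
    """Run one sample through every layer, applying ReLU after hidden layers."""
    h = sample
    for k, (w, b) in enumerate(layers):
        h = dense(h, w, b)
        if k < len(layers) - 1:
            h = relu(h)
    return h

rng = random.Random(0)
# Hypothetical data: m = 3 samples (rows), n = 4 features (columns).
X = [[0.1, 0.5, -0.2, 0.9],
     [0.7, -0.3, 0.4, 0.0],
     [-0.6, 0.2, 0.8, 0.3]]
# Input of size 4, two hidden layers of size 5, output of size 2.
layers = [init_layer(4, 5, rng), init_layer(5, 5, rng), init_layer(5, 2, rng)]
outputs = [forward(row, layers) for row in X]
```

Each of the m samples produces one output vector, so `outputs` has one row per sample.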

Slide 4: Graph Neural Networks — brief reference to best GNN architectures and a citation.

Slide-4

Graph Neural Networks

The slide shows two diagrams of connected green and orange circles, with lines representing connections. The diagram on the left has one orange circle labeled "x2" connected to two green circles, "x5" and "x6," by highlighted lines. An orange arrow points from the left diagram to the right. The diagram on the right shows a similar structure, but the orange circle is now labeled "h2." A mathematical formula, "$$h_2 = g(x_1, x_5, x_6)$$," is written below the diagrams.

Karagiannakos, Sergios. Best Graph Neural Network Architectures: GCN, GAT, MPNN and More, 23 Sept. 2021, https://theaisummer.com/gnn-architectures/ . Accessed 27 July 2023.
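
The formula on this slide, h_2 = g(x_1, x_5, x_6), says that the new representation of node 2 is computed from the features of its neighbors. A minimal sketch follows, assuming g is a simple mean over the neighbor features. The choice of mean and the feature values are illustrative assumptions; real GNN layers typically add learned weights and a nonlinearity.

```python
def aggregate(features, neighbors, node):
    """Mean of the neighbor feature vectors (one possible choice of g)."""
    vecs = [features[n] for n in neighbors[node]]
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]

# Hypothetical 2-dimensional features for the nodes named in the formula.
features = {1: [1.0, 0.0], 2: [0.0, 0.0], 5: [0.0, 1.0], 6: [2.0, 2.0]}
# Node 2 aggregates from x_1, x_5, and x_6, following the slide's formula.
neighbors = {2: [1, 5, 6]}

h2 = aggregate(features, neighbors, 2)
```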

Slide 5: Project Goals — create methods for heterogeneous data and improve models through novel architectures.

Slide-5

Project Goals

  • Create a way to allow Graph Neural Networks to handle heterogeneous data
  • Improve current models through novel neural network architecture

The Power of Graphs in Machine Learning and Sequential Decision Making, 3 June 2019, https://graphpower.inria.fr/schedule/. Accessed 1 Aug. 2023.

Slide 6: Project Goals (continued) — homogenizing pipeline to label unlabeled data and create distribution labels.

Slide-6

Project Goals

  • Establish a homogenizing pipeline for heterogeneous data
  • Allows us to label unlabeled data using the different node types
  • Allows us to easily create distribution labels

The slide shows three diagrams of connected circles labeled (a) "Heterogeneous Graph," (b) "Metapaths," and (c) "Metapath-based Graphs." The diagrams show circles with labels like "a_1," "p_1," and "v_1," connected by lines. Diagram (a) shows connections between "Author," "Paper," and "Venue" nodes. Diagram (c) shows "PAP homogeneous subgraph" and "PA heterogeneous subgraph."

Heterogeneous Graph Neural Network Based on Metapath Subgraph Learning, Sept. 2021, Accessed 27 July 2023.
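
The metapath idea in the figure can be sketched in a few lines: following Paper-Author-Paper (PAP) links turns the heterogeneous graph into a graph that contains only paper nodes. The toy data and the helper name `pap_edges` below are hypothetical, and this is only one way a homogenizing step might look, not the project's actual pipeline.

```python
# Hypothetical toy data: which authors wrote which papers.
paper_authors = {
    "p1": {"a1", "a2"},
    "p2": {"a2"},
    "p3": {"a3"},
}

def pap_edges(paper_authors):
    """Connect two papers if they share at least one author (PAP metapath)."""
    papers = sorted(paper_authors)
    edges = set()
    for i, p in enumerate(papers):
        for q in papers[i + 1:]:
            if paper_authors[p] & paper_authors[q]:
                edges.add((p, q))
    return edges

# The result is a homogeneous graph over paper nodes only.
edges = pap_edges(paper_authors)
```

Here p1 and p2 become neighbors because both were written by a2, while p3 stays isolated.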

Slide 7: Distribution Labels — notes about carrying more information, modeling with probability transforms, and a new problem requiring distribution labels.

Slide-7

Distribution Labels

  • Carry more information than multiclass (yes/no) labels
  • Models can easily adapt to this structure through a simple probability transformation

NEW PROBLEM:

- To train our model, we must also have distribution labels

- Re-enter homogenizing pipeline

The slide contains two bar charts. The top chart is titled "Distribution Label" and shows six blue bars with varying heights, labeled "Label 1" to "Label 6" on the x-axis and "Probability" on the y-axis. The bottom chart, titled "Multiclass Label," displays three blue bars of equal height at "Label 3," "Label 5," and "Label 6." The y-axis is labeled "Yes or No."
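
To make the two ideas on this slide concrete, here is a minimal sketch. It assumes the "simple probability transformation" is a softmax over the model's raw scores, and that a distribution label can be built by normalizing per-label counts (for example, counts gathered from neighboring nodes of other types). Both assumptions, and all numbers, are illustrative.

```python
import math

def softmax(scores):
    """Turn raw model scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def counts_to_distribution(counts):
    """Normalize non-negative label counts into a distribution label."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical raw scores for Label 1 to Label 6.
pred = softmax([2.0, 1.0, 0.1, 0.0, -1.0, 0.5])
# Hypothetical counts: Label 3 seen 3 times, Labels 5 and 6 once each.
target = counts_to_distribution([0, 0, 3, 0, 1, 1])
```

A multiclass label would only record that Labels 3, 5, and 6 apply; the distribution label also records that Label 3 is three times as prominent as the other two.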

Slide 8: Improving Graph Neural Network Performance — thought process notes and a citation to Heterogeneous Graph Attention Network figure.

Slide-8

Improving Graph Neural Network Performance

Thought Process:

  • Neural Networks have improved through the introduction of more information
  • Extracting more information from pre-existing data
  • Dynamically generating correlation information instead of using static correlation

Wang, Xiao. “Figure 6.” Heterogeneous Graph Attention Network, 20 Jan. 2021, arXiv:1903.07293. Accessed 1 Aug. 2023.

The bottom section of the slide shows two scatter plots side-by-side. Both plots have clusters of colored dots (green, yellow, and blue/purple) and are labeled with numbers on the x and y axes. The clusters of dots show distinct groupings.

Slide 9: Results (On-going) — table comparing models (Non-Graph LDL, GCN LDL, Static ML-GCN, Dynamic ML-GCN) across multiple distance metrics.

Slide-9

Results (On-going)

| Model | Chebyshev Distance | Cosine Distance | Canberra Distance | Clark Distance | Intersection Distance* | KL Divergence Distance |
| --- | --- | --- | --- | --- | --- | --- |
| Non-Graph LDL | 0.36372 | 0.2401 | 2.7926 | 1.42 | 0.59305 | N/A |
| GCN LDL | 0.3136993139 | 0.1798002404 | 2.324628943 | 1.200705594 | 0.6427891961 | 0.4244690184 |
| Static ML-GCN | 0.28928456 | 0.16980088 | 2.35793181 | 1.22265781 | 0.66883566 | 0.38750959 |
| Dynamic ML-GCN | 0.297 | 0.1723 | 2.347 | 1.2186 | 0.6626 | 0.3984 |

*higher values are better

LDL - Label Distribution Learning

GCN - Graph Convolutional Network

ML-GCN - Multi-layer Graph Convolutional Network
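
For reference, the six measures in the results table can be sketched as below, using the definitions commonly given in the label distribution learning literature. These are assumptions about how the columns were computed: in particular, "Cosine Distance" is taken here as 1 minus cosine similarity, and the example distributions are made up.

```python
import math

def chebyshev(p, q):
    """Largest absolute difference over all labels."""
    return max(abs(a - b) for a, b in zip(p, q))

def cosine_distance(p, q):
    """1 minus cosine similarity (assumed definition)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm

def canberra(p, q):
    """Sum of |p - q| / (p + q), skipping labels where both are zero."""
    return sum(abs(a - b) / (a + b) for a, b in zip(p, q) if a + b > 0)

def clark(p, q):
    """Square root of the sum of (p - q)^2 / (p + q)^2."""
    return math.sqrt(
        sum((a - b) ** 2 / (a + b) ** 2 for a, b in zip(p, q) if a + b > 0)
    )

def intersection(p, q):
    """Sum of element-wise minima; higher means more similar."""
    return sum(min(a, b) for a, b in zip(p, q))

def kl_divergence(p, q, eps=1e-12):
    """Sum of p * ln(p / q); q is clipped at eps to avoid division by zero."""
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(p, q) if a > 0)

# Hypothetical true and predicted label distributions.
true = [0.5, 0.3, 0.2]
pred = [0.4, 0.4, 0.2]
scores = {
    "chebyshev": chebyshev(true, pred),
    "cosine": cosine_distance(true, pred),
    "canberra": canberra(true, pred),
    "clark": clark(true, pred),
    "intersection": intersection(true, pred),
    "kl": kl_divergence(true, pred),
}
```

All measures except intersection shrink toward 0 as the prediction approaches the true distribution, which matches the table's note that only the intersection column is better when higher.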

Slide 10: Acknowledgements thanking I-SENSE Program and mentors.

Slide-10

Acknowledgements

Thanks to the I-SENSE Program for making this summer experience possible

Thanks to Dr. Zhu and Yufei Jin for guiding me throughout the summer

Slide 11: Citations — list of sources used, including journal articles, arXiv papers and web resources.

Slide-11

Citations

[1] - Newman-Toker DE. Diagnostic Errors in the Emergency Department: A Systematic Review. Comparative Effectiveness Review No. 258. Agency for Healthcare Research and Quality; December 2022. DOI: 10.23970/AHRQEPCCER258

[2] - Hughes JP, Rees S, Kalindjian SB, Philpott KL. Principles of early drug discovery. Br J Pharmacol. 2011 Mar;162(6):1239-49. doi: 10.1111/j.1476-5381.2010.01127.x. PMID: 21091654; PMCID: PMC3058157.

[3] - “Drunk Driving vs. Drowsy Driving vs. Distracted Driving.” Meirowitz & Wasserberg, LLP, 20 June 2023, www.samndan.com/drunk-vs-drowsy-vs-distracted-driving/.

[4] - “Food Waste and Food Rescue.” Feeding America, www.feedingamerica.org/our-work/reduce-food-waste. Accessed 1 Aug. 2023.

Geng, Xin, and Rongzi Ji. “Label Distribution Learning.” 2013 IEEE 13th International Conference on Data Mining Workshops, 2013, https://doi.org/10.1109/icdmw.2013.19.

Shi, Min, et al. “MLNE: Multi-Label Network Embedding.” IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 9, Sept. 2020, pp. 3682–95. IEEE Xplore, https://doi.org/10.1109/TNNLS.2019.2945869.

Wang, Xiao, et al. Heterogeneous Graph Attention Network. arXiv:1903.07293, arXiv, 20 Jan. 2021. arXiv.org, http://arxiv.org/abs/1903.07293.

