### CS695 – Network Science: Principles and Applications

#### Fall 2016

#### Information

Instructors: Amarda Shehu, Fei Li, [amarda, lifei]\AT\gmu.edu

Place and Time: Art and Design Building, #2026, W 7:20-10:00 pm

Office Hours: ENGR #4452, W 6:20-7:19 pm (Shehu) and ENGR #5326, M 6:20-7:19 (Li)

#### Description

The objective of this course is to introduce students to complex systems and network-based treatments of such systems. Complex systems, whether living or abstract, can be represented as static or dynamic networks of many interacting components. They are typically composed of components that are much simpler in behavior or function than the system. The complex behavior of a multi-component system is an emergent property of the network that can be constructed to describe the local relationships between the components that make up the system.

Network science is a new discipline that investigates the topology and dynamics of such complex networks, aiming to better understand the behavior, function and properties of the underlying systems. Applications of network science address physical, informational, biological, cognitive, and social systems.

In this course, we will emphasize the fundamental underpinnings of network science to graph-theoretic concepts and graph algorithms. We will study algorithmic, computational, and statistical methods of network science. We will also highlight diverse applications in machine learning, robotics, communications, biology, ecology, brain science, sociology, and economics. The course will go beyond the strictly structural concepts of small-world and scale-free networks, focusing on dynamic network processes such as epidemics, synchronization, or adaptive network formation.

#### Target Audience

Target audience: Graduate students that have completed two core courses. Students are encouraged to contact the instructors for more information on what fundamental background knowledge is assumed for students to do well in this course.

#### Format

The class will emphasize active participation by students. Students can provide “mini-talks” (10 minutes) on specific topics that will be announced in class. Students should also actively participate in discussions during the lectures. Attendance is mandatory.

#### Categorization

The final project, whose topic needs to be negotiated with the instructors, determines the area satisfied by this course. Both instructors need to agree on the area based on the topic of the project.

#### Grade Breakdown

Homeworks (3): 45%

Mini-talk: 10%

Class Participation: 5%

Final Project: 40%

#### A Very Tentative and Overambitious List of Possible Topics

[1] Overview of Network Science, History, and Relation to Graph Theory (1.5 WKS)Required reading:

– What is network science? by Ulrik Brandes et al. (Editorial from a new Network Science journal)

– The architecture of complexity by Albert-László Barabási

– Why Model? by J.Epstein (why abstract modeling even in era of “big data”)

Optional reading:

– Networks in Neuroscience: Complex brain networks: graph theoretical analysis of structural and functional systems by Bullmore and Sporns

– Networks in Social Science: Network Analysis in the Social Sciences by Stephen Borgatti et al.

– Networks in Economics: Economic Networks: The New Challenges by Frank Schweitzer et al.

– Networks in Ecology: Networks in ecology by Jordi Bascompte

– Networks and the Web: Web science: an interdisciplinary approach to understanding the web by James Hendler et al.

– Networks and the Internet: Network Topologies: Inference, Modelling and Generation by Hamed Haddadi et al.

2] Network Analysis Metrics and Relation to Graph-theoretic Concepts (1.5 WKS)

Paths, components, degree distribution, clustering, degree correlations (assortativity)

Centrality metrics (and meaning across application domains)

A fast algorithm for the computation of betweenness centrality

Metrics for weighted or spatial networks

Required reading:

– Complex networks: Structure and dynamics by S.Boccaletti et al. (sections 2.1, 2.4, and 2.5)

– A Faster Algorithm for Betweenness Centrality by U.Brandes (Betweenness centrality and how to compute it efficiently)

Optional reading:

– Centrality measures in spatial networks of urban streets by P.Crucitti et al.

– The architecture of complex weighted networks by A.Barrat et al.

– Parallel Algorithms for Evaluating Centrality Indices in Real-World Networks by D.Bader and K.Madduri.

Small-world property

Scale-free property and heavy-tailed degree distributions

Hierarchy – Modularity

Network motifs

Network-centric and motif-centric algorithms for motif detection

Required reading:

– The structure and function of complex networks by M.Newman (section III-(A,B,C,E,F) and Section IV-A)

– Collective dynamics of small-world networks by D.Watts and S.Strogatz (A classic that largely started the “network science” area and transferred the concept of “small-world networks” from sociology to many other network contexts)

– Emergence of scaling in random networks by R.Albert and A.L.Barabasi (The second classic in network science — it started the “scale free networks” research thread)

– Network Motifs- Simple Building Blocks of Complex Networks by R.Milo et al. (Networks contain certain sub-graphs that are much more common than in random graphs)

**Optional reading:**

– Hierarchical organization in complex networks by E.Ravasz and A.L.Barabasi (Networks also show hierarchical organization. There are different ways to think about network hierarchies — this is just one of them)

– Scale_free networks in cell biology by R.Albert

– Small-World Networks and Functional Connectivity in Alzheimer’s Disease by C.Stam et al.

– Disrupted small-world networks in schizophrenia by Y.Liu et al.

4] Network Models (2 WKS)

Random networks (G(n,p) and generative model)

Watts-Strogatz model

Preferential attachment and its variants

Kleinberg’s duplication-based model

Optimization-based network formation models (e.g., HOT)

Required reading:

– Chapters 3, 4 and 5 of Network Science book by A-L.Barabasi (Primary readings for this week and the next.)

– The Web as a graph – measurements, models and methods by J.Kleinberg (This is the first description of the duplication-based model. It is also one of the first graph-centered studies of the WWW.)

– Heuristically Optimized Trade-offs – A New Paradigm for Power Laws in the Internet by A.Fabrikant (One of the first papers that created a connection between heavy-tailed degree distributions and optimization principles.)

Optional reading:

– Random graphs with arbitrary degree distributions and their applications by M.Newman et al. (This is probably the simplest way to analyze random graphs with a given degree distribution. It also gives resuts for directed and bipartite graphs.

– Application paper, Optional reading) A Model of Large-Scale Proteome Evolution by R.Sole et al. (This is basically the duplication-based network generation model applied in the context of proteome evolution.)

Network sampling methods

Bias in traceroute-like network sampling

Network inference based on cross-correlations

Identification of missing or spurious network edges

* 1-hour guest lecture [7] Network Mining and Machine Learning Methods (1.5 WKS)

Anomaly detection in networks

Graph summarization

Required Reading:

– Spectral analysis (I. S. Dhillon, Y. Guan, and B. Kulis. Kernel k-means, spectral clustering and normalized cuts. ACM SIGKDD pg. 556–556, 2004.)

– Inference of network evolution models

– Learning evolving networks

* 1-hour guest lecture highlighting related work by domain-expert faculty in department

Optional reading:

– Spectral analysis in social networks (P. Symeonidis and N. Mantas. Spectral clustering for link prediction in social networks with positive and negative links; Social Network Analysis and Mining, 3(4):1433–1447, 2013; Fengjiao Wang, Guan Wang, Shuyang Lin, and Philip S Yu.; Concurrent goal-oriented coclustering generation in social networks. IEEE Semantic Computing, pg. 350–357, 2015. Raghvendra Mall, Rocco Langone, and Johan AK Suykens).

Spectral analysis in big data networks (R. Mall, R. Langone, and J. AK Suykens. Kernel spectral clustering for big data networks. Entropy, 15(5):1567–1586, 2013)

Faithful sampling of networks (H. Zare, P. Shooshtari, A. Gupta, and R. R Brinkman. Data reduction for spectral clustering to analyze high throughput flow cytometry data. BMC Bioinformatics, 11 (1):403, 2010)

Spectral analysis in biological networks (A. Borg, N. Lavesson, and V. Boeva. Comparison of clustering approaches for gene expression data. SCAI, pg. 55–64, 2013; Z. Yu, L. Li, J. You, H.-S. Wong, and G. Han. Sc3: Triple spectral clustering-based consensus clustering framework for class discovery from cancer gene expression profiles. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(6):1751–1765, 2012; Y. Kluger, R. Basri, J. T. Chang, and M. Gerstein. Spectral biclustering of microarray data: coclustering genes and conditions. Genome research, 13(4):703–716, 2003; J. Ruan and W. Zhang. An efficient spectral algorithm for network community discovery and its applications to biological and social networks. In Data Mining, 2007. ICDM pg. 643–648, 2007; J. Ruan, A. K. Dean, and W. Zhang. A general co-expression network-based approach to gene expression analysis: comparison and applications. BMC Systems Biology, 4 (1):8, 2010; Y. X. R. Wang and H. Huang. Review on statistical methods for gene network reconstruction using expression data. Journal of Theoretical Biology, 362:53–61, 2014)

Percolation and network resilience to random and targeted attacks

Growth and Densification

Rewiring

Network epidemics and epidemic threshold (SI, SIS, SIR models)

Immunization strategies

Identification of major spreaders

Computational network epidemiology

Social networks and influence

Information and behavior spreading – measurements and models

Seeding strategies to maximize influence

Influence versus homophily

Synchronization on networks

Networks with capacity constraints and overload-based failures

Decentralized search on networks

Controlling networks

*40-minute guest lecture highlighting related work by domain-expert faculty at GMU [9] Additional Topics depending on time and interests of students:

(a) Coevolutionary Dynamics in Adaptive Networks

Instances of adaptive networks in practice

Coevolutionary dynamics in opinion/consensus formation

Coevolutionary dynamics in epidemics

Coevolutionary effects in population dynamics

(b) Temporal networks

Instances of interdependent networks in practice

Layered networks

Cascade phenomena in intedependent networksbr/> Temporal networks

(c) Network Formation Games Games on networks

Strategic network formation

Evolution of cooperation on social networks

Social learning on networks

*guest lectures highlighting related work by domain-expert faculty at GMU

#### Materials that facilitate homework programming assignments and final project:

Network datasets:

Mark Newmann’s

Eric Kolaczyk’s

Albert Barabasi’s

Uri Alon’s

Pajek’s

Alex Arenas’

Jon Kleinberg’s

Jure Leskovec’s

Peter Skomoroch’s

TrustLet project’s

AOL dataset: 20M web queries collected from ~650k users over three months.

Dataset for the evolution of the Internet AS ecosystem between 1998-2008

Network analysis tools:

Gephi

Infomap

Igraph

Statnet

Network Workbench

Pajek network visualization (Windows)

Jung network analysis

GraphViz

Uri Alon’s network motif detection software

Matlab’s Random Boolean Networks (RBN) toolbox

#### Information on Final Project and Milestones:

Ideally, the final project should have the potential to lead to an original research paper that addresses a question not previously addressed in published literature. Groups of two students are ideal; individual projects are also acceptable. Larger teams will need instructor approval. The instructor will work closely with every student/group during the semester. All projects will be presented in class during the last week of the semester.

Project milestones:

WK 3: project proposal

WK 6: first progress report

WK 9: second progress report

WK 14: paper due

Final’s day: in-class project presentations

#### Grade Breakdown

Quizzes: 45% (~5% each)

Homework: 10% (~5% each)

Midterm Exam: 20%

Final Exam: 25%