CS695 – Network Science: Principles and Applications
Fall 2016
Information
Instructors: Amarda Shehu, Fei Li, [amarda, lifei]\AT\gmu.edu
Place and Time: Art and Design Building, #2026, W 7:20-10:00 pm
Office Hours: ENGR #4452, W 6:20-7:19 pm (Shehu) and ENGR #5326, M 6:20-7:19 (Li)
Description
The objective of this course is to introduce students to complex systems and network-based treatments of such systems. Complex systems, whether living or abstract, can be represented as static or dynamic networks of many interacting components. They are typically composed of components that are much simpler in behavior or function than the system. The complex behavior of a multi-component system is an emergent property of the network that can be constructed to describe the local relationships between the components that make up the system.
Network science is a new discipline that investigates the topology and dynamics of such complex networks, aiming to better understand the behavior, function and properties of the underlying systems. Applications of network science address physical, informational, biological, cognitive, and social systems.
In this course, we will emphasize the fundamental underpinnings of network science to graph-theoretic concepts and graph algorithms. We will study algorithmic, computational, and statistical methods of network science. We will also highlight diverse applications in machine learning, robotics, communications, biology, ecology, brain science, sociology, and economics. The course will go beyond the strictly structural concepts of small-world and scale-free networks, focusing on dynamic network processes such as epidemics, synchronization, or adaptive network formation.
Target Audience
Target audience: Graduate students that have completed two core courses. Students are encouraged to contact the instructors for more information on what fundamental background knowledge is assumed for students to do well in this course.
Format
The class will emphasize active participation by students. Students can provide “mini-talks” (10 minutes) on specific topics that will be announced in class. Students should also actively participate in discussions during the lectures. Attendance is mandatory.
Categorization
The final project, whose topic needs to be negotiated with the instructors, determines the area satisfied by this course. Both instructors need to agree on the area based on the topic of the project.
Grade Breakdown
Homeworks (3): 45%
Mini-talk: 10%
Class Participation: 5%
Final Project: 40%
A Very Tentative and Overambitious List of Possible Topics
[1] Overview of Network Science, History, and Relation to Graph Theory (1.5 WKS)Required reading:
– What is network science? by Ulrik Brandes et al. (Editorial from a new Network Science journal)
– The architecture of complexity by Albert-László Barabási
– Why Model? by J.Epstein (why abstract modeling even in era of “big data”)
Optional reading:
– Networks in Neuroscience: Complex brain networks: graph theoretical analysis of structural and functional systems by Bullmore and Sporns
– Networks in Social Science: Network Analysis in the Social Sciences by Stephen Borgatti et al.
– Networks in Economics: Economic Networks: The New Challenges by Frank Schweitzer et al.
– Networks in Ecology: Networks in ecology by Jordi Bascompte
– Networks and the Web: Web science: an interdisciplinary approach to understanding the web by James Hendler et al.
– Networks and the Internet: Network Topologies: Inference, Modelling and Generation by Hamed Haddadi et al.
2] Network Analysis Metrics and Relation to Graph-theoretic Concepts (1.5 WKS)
Paths, components, degree distribution, clustering, degree correlations (assortativity)
Centrality metrics (and meaning across application domains)
A fast algorithm for the computation of betweenness centrality
Metrics for weighted or spatial networks
Required reading:
– Complex networks: Structure and dynamics by S.Boccaletti et al. (sections 2.1, 2.4, and 2.5)
– A Faster Algorithm for Betweenness Centrality by U.Brandes (Betweenness centrality and how to compute it efficiently)
Optional reading:
– Centrality measures in spatial networks of urban streets by P.Crucitti et al.
– The architecture of complex weighted networks by A.Barrat et al.
– Parallel Algorithms for Evaluating Centrality Indices in Real-World Networks by D.Bader and K.Madduri.
Small-world property
Scale-free property and heavy-tailed degree distributions
Hierarchy – Modularity
Network motifs
Network-centric and motif-centric algorithms for motif detection
Required reading:
– The structure and function of complex networks by M.Newman (section III-(A,B,C,E,F) and Section IV-A)
– Collective dynamics of small-world networks by D.Watts and S.Strogatz (A classic that largely started the “network science” area and transferred the concept of “small-world networks” from sociology to many other network contexts)
– Emergence of scaling in random networks by R.Albert and A.L.Barabasi (The second classic in network science — it started the “scale free networks” research thread)
– Network Motifs- Simple Building Blocks of Complex Networks by R.Milo et al. (Networks contain certain sub-graphs that are much more common than in random graphs)
Optional reading:
– Hierarchical organization in complex networks by E.Ravasz and A.L.Barabasi (Networks also show hierarchical organization. There are different ways to think about network hierarchies — this is just one of them)
– Scale_free networks in cell biology by R.Albert
– Small-World Networks and Functional Connectivity in Alzheimer’s Disease by C.Stam et al.
– Disrupted small-world networks in schizophrenia by Y.Liu et al.
4] Network Models (2 WKS)
Random networks (G(n,p) and generative model)
Watts-Strogatz model
Preferential attachment and its variants
Kleinberg’s duplication-based model
Optimization-based network formation models (e.g., HOT)
Required reading:
– Chapters 3, 4 and 5 of Network Science book by A-L.Barabasi (Primary readings for this week and the next.)
– The Web as a graph – measurements, models and methods by J.Kleinberg (This is the first description of the duplication-based model. It is also one of the first graph-centered studies of the WWW.)
– Heuristically Optimized Trade-offs – A New Paradigm for Power Laws in the Internet by A.Fabrikant (One of the first papers that created a connection between heavy-tailed degree distributions and optimization principles.)
Optional reading:
– Random graphs with arbitrary degree distributions and their applications by M.Newman et al. (This is probably the simplest way to analyze random graphs with a given degree distribution. It also gives resuts for directed and bipartite graphs.
– Application paper, Optional reading) A Model of Large-Scale Proteome Evolution by R.Sole et al. (This is basically the duplication-based network generation model applied in the context of proteome evolution.)
Network sampling methods
Bias in traceroute-like network sampling
Network inference based on cross-correlations
Identification of missing or spurious network edges
* 1-hour guest lecture [7] Network Mining and Machine Learning Methods (1.5 WKS)
Anomaly detection in networks
Graph summarization
Required Reading:
– Spectral analysis (I. S. Dhillon, Y. Guan, and B. Kulis. Kernel k-means, spectral clustering and normalized cuts. ACM SIGKDD pg. 556–556, 2004.)
– Inference of network evolution models
– Learning evolving networks
* 1-hour guest lecture highlighting related work by domain-expert faculty in department
Optional reading:
– Spectral analysis in social networks (P. Symeonidis and N. Mantas. Spectral clustering for link prediction in social networks with positive and negative links; Social Network Analysis and Mining, 3(4):1433–1447, 2013; Fengjiao Wang, Guan Wang, Shuyang Lin, and Philip S Yu.; Concurrent goal-oriented coclustering generation in social networks. IEEE Semantic Computing, pg. 350–357, 2015. Raghvendra Mall, Rocco Langone, and Johan AK Suykens).
Spectral analysis in big data networks (R. Mall, R. Langone, and J. AK Suykens. Kernel spectral clustering for big data networks. Entropy, 15(5):1567–1586, 2013)
Faithful sampling of networks (H. Zare, P. Shooshtari, A. Gupta, and R. R Brinkman. Data reduction for spectral clustering to analyze high throughput flow cytometry data. BMC Bioinformatics, 11 (1):403, 2010)
Spectral analysis in biological networks (A. Borg, N. Lavesson, and V. Boeva. Comparison of clustering approaches for gene expression data. SCAI, pg. 55–64, 2013; Z. Yu, L. Li, J. You, H.-S. Wong, and G. Han. Sc3: Triple spectral clustering-based consensus clustering framework for class discovery from cancer gene expression profiles. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(6):1751–1765, 2012; Y. Kluger, R. Basri, J. T. Chang, and M. Gerstein. Spectral biclustering of microarray data: coclustering genes and conditions. Genome research, 13(4):703–716, 2003; J. Ruan and W. Zhang. An efficient spectral algorithm for network community discovery and its applications to biological and social networks. In Data Mining, 2007. ICDM pg. 643–648, 2007; J. Ruan, A. K. Dean, and W. Zhang. A general co-expression network-based approach to gene expression analysis: comparison and applications. BMC Systems Biology, 4 (1):8, 2010; Y. X. R. Wang and H. Huang. Review on statistical methods for gene network reconstruction using expression data. Journal of Theoretical Biology, 362:53–61, 2014)
Percolation and network resilience to random and targeted attacks
Growth and Densification
Rewiring
Network epidemics and epidemic threshold (SI, SIS, SIR models)
Immunization strategies
Identification of major spreaders
Computational network epidemiology
Social networks and influence
Information and behavior spreading – measurements and models
Seeding strategies to maximize influence
Influence versus homophily
Synchronization on networks
Networks with capacity constraints and overload-based failures
Decentralized search on networks
Controlling networks
*40-minute guest lecture highlighting related work by domain-expert faculty at GMU [9] Additional Topics depending on time and interests of students:
(a) Coevolutionary Dynamics in Adaptive Networks
Instances of adaptive networks in practice
Coevolutionary dynamics in opinion/consensus formation
Coevolutionary dynamics in epidemics
Coevolutionary effects in population dynamics
(b) Temporal networks
Instances of interdependent networks in practice
Layered networks
Cascade phenomena in intedependent networksbr/> Temporal networks
(c) Network Formation Games Games on networks
Strategic network formation
Evolution of cooperation on social networks
Social learning on networks
*guest lectures highlighting related work by domain-expert faculty at GMU
Materials that facilitate homework programming assignments and final project:
Network datasets:
Mark Newmann’s
Eric Kolaczyk’s
Albert Barabasi’s
Uri Alon’s
Pajek’s
Alex Arenas’
Jon Kleinberg’s
Jure Leskovec’s
Peter Skomoroch’s
TrustLet project’s
AOL dataset: 20M web queries collected from ~650k users over three months.
Dataset for the evolution of the Internet AS ecosystem between 1998-2008
Network analysis tools:
Gephi
Infomap
Igraph
Statnet
Network Workbench
Pajek network visualization (Windows)
Jung network analysis
GraphViz
Uri Alon’s network motif detection software
Matlab’s Random Boolean Networks (RBN) toolbox
Information on Final Project and Milestones:
Ideally, the final project should have the potential to lead to an original research paper that addresses a question not previously addressed in published literature. Groups of two students are ideal; individual projects are also acceptable. Larger teams will need instructor approval. The instructor will work closely with every student/group during the semester. All projects will be presented in class during the last week of the semester.
Project milestones:
WK 3: project proposal
WK 6: first progress report
WK 9: second progress report
WK 14: paper due
Final’s day: in-class project presentations
Grade Breakdown
Quizzes: 45% (~5% each)
Homework: 10% (~5% each)
Midterm Exam: 20%
Final Exam: 25%