Lab 9, Network Analysis

Network Analysis

Course: Applied data analysis and visualization 1 for economists (ECB2ADAVE)

University: Utrecht University

Academic year: 2023/2024

Network Analysis

Elena Candellone

Contents

Part 1: to be completed at home before the lab

  • Reading a network file
  • Network description I

Part 2: during the lab

  • Network manipulation
  • Network description II
  • Network visualization
  • Centrality measures
  • Community detection

Part 1: to be completed at home before the lab

This practical is an introduction to network analysis. We cover the following topics:

  • metrics to describe a network,
  • centrality indices,
  • community detection,
  • network visualization.

We will mainly use the igraph package.

You can download the student zip including all needed files for this lab here.

Note: the completed homework has to be handed in on BlackBoard and will be graded (pass/fail, counting towards your grade for the individual assignment). The deadline is two hours before the start of your lab. The hand-in should be a PDF file. If you know how to knit PDF files, you can hand in the knitted PDF file. However, if you have not done this before, you are advised to knit to an HTML file as specified below, and then ‘print’ the file as a PDF from within your browser.

For this practical, you will need the following packages:

# install.packages("tidyverse")
library(readr)
library(tidyverse)
library(ggplot2)

# install.packages("igraph")
library(igraph)

# install.packages("RColorBrewer")
library(RColorBrewer)

# install.packages("sbm")
library(sbm)

# install.packages("fossil")
library(fossil)

set.seed(42)

We are going to use the high school temporal contacts dataset, created in the context of the SocioPatterns project.

The dataset is publicly available in the Netzschleuder repository. It contains the contacts and friendship relations between students in a high school in Marseilles, France, in December 2013. The contacts are measured in four different ways:

  • with proximity devices (see folder data/proximity),
  • through contact diaries (see folder data/diaries),
  • from reported friendships in a survey (see folder data/survey),
  • from Facebook friendships (see folder data/facebook).

Each folder contains the edge list (edges), with the source node, the target node, and additional properties (interaction strength or the time at which the interaction happened). The nodes file contains the nodes’ (students’) IDs, the class they belong to, and their gender. In this lab, we will use the proximity data, but the same analysis can be repeated for the other three datasets in a similar way.

[…] an ID, a class, and a gender of the students.

  1. Network creation: Create the network from edge_list using the function graph_from_data_frame from the igraph package. Store it in the variable g. Look up ?graph_from_data_frame to see all the arguments of this function. Create an undirected graph, and add the node properties by setting the vertices argument to the node_prop variable.

g <- graph_from_data_frame(d = edge_list, vertices = node_prop, directed = FALSE)
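As a minimal sketch of how graph_from_data_frame combines the two tables, the toy example below uses made-up data. The data frames toy_edges and toy_nodes and their column names are hypothetical stand-ins for edge_list and node_prop, not the lab's files.

```r
library(igraph)

# Hypothetical toy data (not the SocioPatterns files)
toy_edges <- data.frame(
  source = c("a", "a", "b"),
  target = c("b", "c", "c"),
  weight = c(1, 2, 1)
)
toy_nodes <- data.frame(
  id     = c("a", "b", "c", "d"),  # the first column must hold the vertex names
  class  = c("X", "X", "Y", "Y"),
  gender = c("F", "M", "F", "M")
)

toy_g <- graph_from_data_frame(d = toy_edges, vertices = toy_nodes, directed = FALSE)

V(toy_g)$class   # node attributes come from the remaining columns of toy_nodes
E(toy_g)$weight  # edge attributes come from the extra columns of toy_edges
```

Note that vertex "d" appears only in toy_nodes, so it enters the graph as an isolated node.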

Network description I

3a. Descriptive statistics: count how many students there are per class (using group_by), using the node_prop variable.

students_per_class <- node_prop %>% group_by(class) %>% summarise(count = n())

3b. Descriptive statistics: print (a) the number of nodes (function vcount), (b) the number of edges (function ecount), (c) the longest shortest path in the network (diameter, setting the argument unconnected to FALSE), (d) the average path length (mean_distance, setting the argument unconnected to FALSE), and (e) the global clustering coefficient (transitivity).

Number of vertices

vcount(g)

## [1] 329

Number of edges

ecount(g)

## [1] 188508

Diameter (maximum eccentricity)

diameter(g, unconnected = FALSE)

## [1] Inf

Average path length

mean_distance(g, unconnected = FALSE)

## [1] Inf

Clustering coefficient

transitivity(g)

## [1] 0.

3c. How do you interpret the results of the diameter and the average path length? How do you interpret the clustering coefficient?

The diameter is Inf, which means the graph is disconnected: at least one pair of nodes has no path between them. The average path length is Inf for the same reason. The clustering coefficient indicates moderate clustering.
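The effect of a disconnected graph on these measures can be sketched on a toy example. The graph below is made up for illustration; it only shows how the unconnected argument restricts the computation to pairs of nodes inside the same component.

```r
library(igraph)

# Toy graph: a path 1-2-3 plus an isolated vertex 4
toy_g <- make_empty_graph(n = 4, directed = FALSE)
toy_g <- add_edges(toy_g, c(1, 2, 2, 3))

is_connected(toy_g)                       # FALSE: vertex 4 cannot be reached

# With unconnected = TRUE, only pairs inside the same component are considered
diameter(toy_g, unconnected = TRUE)       # longest shortest path within a component
mean_distance(toy_g, unconnected = TRUE)  # average over reachable pairs only

# Transitivity: fraction of connected triples that close into triangles
transitivity(make_full_graph(3))          # a triangle is fully clustered
```

Here the reachable pairs are (1,2), (2,3), and (1,3), with distances 1, 1, and 2, so the within-component diameter is 2 and the mean distance is 4/3.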

Part 2: during the lab

Network manipulation

4a. Simplify network: the current network has isolated components (some students do not interact with anybody, maybe because they were sick). Print the number of components (function components). Identify the isolated nodes with V(g)$name[degree(g) == 0] and remove those nodes from the network (delete_vertices). Check again the number of connected components in the new network.
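A minimal sketch of this step on a made-up graph (the vertex names and edges below are hypothetical, chosen only to produce two isolated nodes):

```r
library(igraph)

# Toy graph: a triangle "a"-"b"-"c" plus two isolated vertices "d" and "e"
toy_g <- graph_from_data_frame(
  d = data.frame(from = c("a", "b", "a"), to = c("b", "c", "c")),
  vertices = data.frame(name = c("a", "b", "c", "d", "e")),
  directed = FALSE
)

components(toy_g)$no                   # number of connected components

# Isolated nodes are exactly those with degree 0
isolated <- V(toy_g)$name[degree(toy_g) == 0]
toy_clean <- delete_vertices(toy_g, isolated)

components(toy_clean)$no               # components left after the clean-up
```

Each isolated node counts as its own component, so removing them lowers the component count by the number of isolated nodes.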

4b. Simplify network: the current network has self-loops (i.e., records of a student’s proximity with themselves) and duplicated edges (because students interact several times). Print whether there are self-loops (for example with any(which_loop(g)); note that any_multiple reports duplicated edges, not self-loops) and the number of duplicated edges (function which_multiple). We will then remove self-loops (function simplify) and collapse all duplicated edges into one weighted edge (simplify(g, edge.attr.comb = "sum")). Store the new graph in the variable g_simple.
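The following sketch illustrates the checks and the collapse on a made-up multigraph. The edges and weights are hypothetical; the point is only to show how edge.attr.comb = "sum" merges the weights of duplicated edges.

```r
library(igraph)

# Toy multigraph: edge 1-2 recorded twice, edge 2-3 once, plus a self-loop on vertex 3
toy_g <- make_graph(c(1, 2, 1, 2, 2, 3, 3, 3), directed = FALSE)
E(toy_g)$weight <- c(1, 4, 2, 1)

any(which_loop(toy_g))       # is there at least one self-loop?
sum(which_multiple(toy_g))   # number of edges that repeat an earlier edge

# Drop loops and merge duplicates, summing their weights
toy_simple <- simplify(toy_g, remove.multiple = TRUE, remove.loops = TRUE,
                       edge.attr.comb = "sum")

ecount(toy_simple)
E(toy_simple)$weight
```

After simplification the two 1-2 records become a single edge whose weight is the sum of the two original weights, and the self-loop disappears.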

Network visualization

6a. Network visualization: visualize the simplified network using the function plot.

6b. This is a bit ugly, so let’s assign different colors to the school classes.

6c. Now, add a layout. We will use a “spring” algorithm for visualization, where nodes that are connected get pushed together and nodes that are not connected get pushed apart. Store the coordinates of each node (coords = layout_with_fr(g_simple)) so that the nodes are plotted in the same position in the next plots (you can experiment with different layout algorithms, see ?layout_).

6d. Make the plot prettier! Play with the plot function options (e.g., the vertex.* and edge.* arguments such as vertex.size, vertex.label, edge.width, and edge.arrow.size; see ?igraph.plotting) until you are happy with the results. Make sure you have a legend.
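Steps 6b to 6d can be sketched on a toy graph as follows. Everything here is an illustrative assumption: the ring graph, the made-up class attribute, the hcl.colors palette (the lab itself loads RColorBrewer), and the particular plot option values.

```r
library(igraph)

set.seed(42)  # layout_with_fr starts from random positions

# Hypothetical toy graph with a made-up "class" attribute
toy_g <- make_ring(6)
V(toy_g)$class <- c("X", "X", "Y", "Y", "Z", "Z")

# One colour per class, looked up by class name
classes <- sort(unique(V(toy_g)$class))
class_colors <- setNames(hcl.colors(length(classes), "Dark 3"), classes)
V(toy_g)$color <- unname(class_colors[V(toy_g)$class])

# Compute the layout once and reuse it in every later plot
coords <- layout_with_fr(toy_g)

plot(toy_g, layout = coords, vertex.size = 15, vertex.label = NA)
legend("topleft", legend = classes, fill = class_colors, bty = "n", title = "Class")
```

Storing coords and passing it as layout keeps the node positions fixed, which makes later plots (labels, communities) directly comparable.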

6e. Do students in the same class interact more? Why are there so many connections between different classes? Do you notice any pattern?

Answer:

Centrality measures

7a. Centrality measures: during the lecture we discussed different types of centrality measures, which are useful to quantify the importance of nodes in the network. The most widely used ones are degree, betweenness, closeness, and PageRank centrality. Explain each measure and how it differs from the others. Compute all the centrality measures (with the functions degree, betweenness, closeness, and page_rank). Find the most central nodes according to these measures (which(centrality == max(centrality))). You can use the pre-made function calculate_and_print_max_centrality below. Is the same node the most central node by all definitions?

  • Degree:
  • Betweenness:
  • Closeness:
  • Pagerank:

calculate_and_print_max_centrality <- function(graph, centrality_type) {
  # Calculate the specified centrality based on the type provided
  centrality <- switch(centrality_type,
    "degree" = degree(graph),
    "betweenness" = betweenness(graph),
    "closeness" = closeness(graph),
    "pagerank" = page_rank(graph)$vector,
    stop("Invalid centrality type")
  )

  # Find the node(s) with the highest centrality
  max_value <- max(centrality)
  highest_nodes <- which(centrality == max_value)

  # Print results
  cat(sprintf("Node id(s) with highest %s centrality: %s\n",
              centrality_type, toString(highest_nodes)))
}
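To see why different measures can single out different nodes, consider the made-up graph below: two triangles (A-B-C and E-F-G) connected through a bridge node D. This is an illustrative sketch, not the lab network.

```r
library(igraph)

# Two triangles joined through the bridge node D
toy_g <- graph_from_literal(A - B, B - C, A - C, C - D, D - E, E - F, F - G, E - G)

deg <- degree(toy_g)
btw <- betweenness(toy_g)
clo <- closeness(toy_g)
pr  <- page_rank(toy_g)$vector

# Most central node(s) under each definition
names(deg)[deg == max(deg)]   # most neighbours
names(btw)[btw == max(btw)]   # lies on the most shortest paths
names(clo)[clo == max(clo)]   # smallest total distance to all other nodes
names(pr)[pr == max(pr)]      # highest PageRank score
```

C and E have the most neighbours (three each), while D has only two neighbours but sits on every shortest path between the two triangles, so degree and betweenness disagree about which node is most central.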

7b. Let’s label those nodes. You can again use the plot function to create the plot and set vertex.label = labels. Also, set vertex.label=1000 to be able to see the labels.

# Conditional labeling of nodes (remove the comment)
labels <- ifelse(V(g_simple)$name %in% c("39", "318"), V(g_simple)$name, NA)
# Plot the graph with selective labeling
# Adding a legend

Community detection

8a. Community detection: we would like to detect communities in the network. Let’s start with the cluster_leiden function, which creates communities that maximize a quality metric (either CPM or modularity). Create the communities using modularity (see ?cluster_leiden) and store the results in a variable (e.g., modularity).

8b. Now, plot the network with the communities. You can visualize the network using the function plot(commdet, g_simple), where commdet should be replaced with the name of the variable in which you stored the community detection results. Remember to set layout = coords so that the nodes are always plotted in the same position.

8c. How do the communities align with the classrooms (question 6)?
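A minimal sketch of modularity-based Leiden clustering on a made-up graph (two 5-cliques joined by a single edge). The toy graph and the variable name commdet are assumptions for illustration.

```r
library(igraph)

set.seed(42)  # Leiden is a randomised algorithm

# Toy graph: two 5-cliques joined by a single edge
toy_g <- disjoint_union(make_full_graph(5), make_full_graph(5))
toy_g <- add_edges(toy_g, c(5, 6))

# Optimise modularity instead of the default CPM objective
commdet <- cluster_leiden(toy_g, objective_function = "modularity")

memb <- as.integer(membership(commdet))   # community label of each vertex
memb
modularity(toy_g, memb)                   # quality of the detected partition
```

The resulting object can be passed to plot together with the graph, for example plot(commdet, toy_g), to shade the detected communities.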

Answer:

Answer:

10b. There is a corrected-for-chance version of the Rand Index called the Adjusted Rand Index (adj.rand.index(group1, group2)). Give the definition and repeat the analysis you did for the RI. Are the results different? Which method is best?
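The difference between the two indices can be sketched in base R. The helper functions rand_index_manual and adjusted_rand_index_manual are hypothetical names written for this illustration (they are not part of fossil), and the adjusted version follows the usual contingency-table formulation.

```r
# Rand Index: share of vertex pairs on which two partitions agree
rand_index_manual <- function(group1, group2) {
  same1 <- outer(group1, group1, "==")   # TRUE if a pair shares a group in partition 1
  same2 <- outer(group2, group2, "==")   # TRUE if a pair shares a group in partition 2
  pairs <- upper.tri(same1)              # count each unordered pair once
  mean(same1[pairs] == same2[pairs])
}

# Adjusted Rand Index, computed from the contingency table of the two partitions
adjusted_rand_index_manual <- function(group1, group2) {
  tab <- table(group1, group2)
  sum_cells <- sum(choose(tab, 2))
  sum_rows  <- sum(choose(rowSums(tab), 2))
  sum_cols  <- sum(choose(colSums(tab), 2))
  expected  <- sum_rows * sum_cols / choose(length(group1), 2)  # agreement expected by chance
  max_index <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (max_index - expected)
}

g1 <- c(1, 1, 2, 2)
g2 <- c(1, 1, 1, 2)
rand_index_manual(g1, g2)            # 0.5
adjusted_rand_index_manual(g1, g2)   # 0
```

In this example the two partitions agree on half of the pairs, yet the adjusted index is 0 because that much agreement is what chance alone would produce.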

Answer:
