A graph data structure is a collection of nodes that have data and are connected to other nodes.
Let's try to understand this through an example. On facebook, everything is a node. That includes User, Photo, Album, Event, Group, Page, Comment, Story, Video, Link, Note. anything that has data is a node.
Every relationship is an edge from one node to another. Whether you post a photo, join a group, like a page, etc., a new edge is created for that relationship.
All of facebook is then a collection of these nodes and edges. This is because facebook uses a graph data structure to store its data.
More precisely, a graph is a data structure (V, E) that consists of
Graphs are commonly represented in two ways:
An adjacency matrix is a 2D array of V x V vertices. Each row and column represent a vertex.
If the value of any element a[i][j] is 1, it represents that there is an edge connecting vertex i and vertex j.
The adjacency matrix for the graph we created above is
Since it is an undirected graph, for edge (0,2), we also need to mark edge (2,0); making the adjacency matrix symmetric about the diagonal.
Edge lookup(checking if an edge exists between vertex A and vertex B) is extremely fast in adjacency matrix representation but we have to reserve space for every possible link between all vertices(V x V), so it requires more space.
An adjacency list represents a graph as an array of linked lists.
The index of the array represents a vertex and each element in its linked list represents the other vertices that form an edge with the vertex.
The adjacency list for the graph we made in the first example is as follows:
An adjacency list is efficient in terms of storage because we only need to store the values for the edges. For a graph with millions of vertices, this can mean a lot of saved space.
The most common graph operations are: