data frame
- a collection of 1-dimensional arrays, called columns, all of the same length
- a row in a data frame consists of the values of all columns at a given integer index
- retrieval
- individual columns can be retrived by name
- individual rows can be retrived by index
- storage: dataframe are stored column-wise
graph
- two way to implement its data structures:
- edge list, g
- data
- vertices :: int
- vertices are consecutive numers 1..g.vertices
- edges :: int
- eges are consecutive numers 1..g.edges
- src :: edge_index -> vertex_index
- the source of edge j is g.src[i]
- tgt :: edge_index -> vertex_index
- the target of edge j is g.tgt[i]
- where edge_index, vertex_index :: int
- vertices :: int
- invariant: length(g.src) == length(g.tgt) == g.edges
- adjacency list, g
- data
- vertices :: int
- edges :: int
- src_index :: int -> list[int]
- the list of edges starting from vertice i (with i as source) is g.src_index[i]
- tgt_index :: int -> list[int]
- the list of edges ending of vertice i (with i as target) is g.tgt_index[i]
- store the inverse images of the source and target maps
- +: allows for easy traversal (through the neighbors of any given vertex)
- -: less convenient to iterate though the edges and retrieve their sources and target
- it’s often useful to have both
src(edge’s source vertex mapping) andsrc_index(vertex’s outgoing edges mapping) fields- but the trouble now beomces bookkepping the consistency when modifying the graph
- we eventually wish to regard a graph as two interlinked data frames