Commit 7579b3f

docs(roadsAndLibraries): add README with explainer and implementation notes
1 parent a91ee97 commit 7579b3f

1 file changed

+85
-1
lines changed
  • src/algorithm_practice/Datastructure_Algorithms/Graph/roadsAndLibraries

@@ -1 +1,85 @@
1-
Coming soon...
1+
Source: https://www.hackerrank.com/challenges/torque-and-development/
2+
3+
TODO(domfarolino): revise this post.
4+
5+
This is a pretty interesting graph problem. It vexed me for a bit until I made some crucial realizations.
6+
7+
# Divide the problem into connected components
8+
9+
When starting with this problem I fumbled around quite a bit. Eventually I came to some good realizations:
10+
11+
- We'll need at least one library per connected component
12+
- In each component, there are two extremes:
13+
  - Every city in a connected component has a library
14+
  - Only one city in a connected component has a library
15+
16+
My next thought was that the naive solution would be to find all possible combinations of library/road
17+
allocations in between the extremes, which seems combinatorially explosive. For example, what if there
18+
were not the extreme `n` libraries and `0` roads in a component, but instead `n - 1` libraries and `1`
19+
road, or `n - 2` libraries and `2` roads? How many different ways can we
20+
[*choose*](https://en.wikipedia.org/wiki/Binomial_coefficient) how to allocate which cities have libraries
21+
and which cities to connect, and more importantly, does the choosing of these actually affect the cost?
22+
Determining the number of possible choices we can make when allocating libraries to cities is actually pretty
23+
easy (it's just the summation of binomial coefficients, [see here](https://math.stackexchange.com/questions/519832/)),
24+
it would just be combinatorially explosive to go through each one; was it necessary?
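
To make the size of that search space concrete, here is a minimal sketch that counts the library allocations for a component of `n` cities by summing binomial coefficients. The function name `count_library_allocations` is an illustrative assumption, not part of the original solution.

```python
from math import comb


def count_library_allocations(n: int) -> int:
    """Count the ways to choose which of n cities get a library (at least one)."""
    # Choosing exactly k library cities out of n can be done in C(n, k) ways;
    # k starts at 1 because every component needs at least one library.
    return sum(comb(n, k) for k in range(1, n + 1))


# The sum of C(n, k) for k = 1..n equals 2**n - 1, so the count doubles
# (roughly) with every additional city.
for n in (1, 5, 20):
    assert count_library_allocations(n) == 2**n - 1

print(count_library_allocations(20))  # 1048575
```

Even a 20-city component has over a million allocations, which is why enumerating them is not a practical approach.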
25+
26+
When looking at an example graph with five (once-) connected cities I realized that the allocation of libraries
27+
doesn't matter at all and won't affect the cost. (I was considering the idea that perhaps the degree of each city
28+
might have an effect on, or indicate priority of, library assignment). The allocation makes no difference as long
29+
as we don't waste a road connecting two library-bearing cities, because why would we do that?
30+
31+
[Enter a tangent]...
32+
33+
The whole reason this accidentally-connecting-two-library-bearing-cities issue came up is because I was examining a
34+
quite feasible 5-city graph with a cycle trying to allocate `3` libraries and `2` roads. I wondered if I could choose
35+
a "bad" allocation of libraries and roads, namely one that doesn't actually connect each city in the component. This is
36+
certainly possible in a graph with cycles when only dealing with `numberOfCities` resources (`3` libraries and `2` roads).
37+
38+
I was then worried about making sure my implementation would not accidentally waste a road on two
39+
library-bearing cities, and then I realized well yeah, if the allocation doesn't matter, we just have to know that
40+
some working allocation exists, and that will be the minimum total cost for such choices of the number of libraries
41+
and roads for that connected component.
42+
43+
# A connected component is at least a tree
44+
45+
The "choice" of which roads to build dissolves when you realize that the connected component by definition is at least a
46+
tree, and thus always has valid allocations of libraries and roads in the form of:
47+
48+
`N - K` libraries + `K` roads, `∀ K < N` (remember, we need at least one library).
49+
50+
This means each connected component had `N` possible solutions, and for each of the values of `K`, we needed to choose the
51+
minimum one. Going through some examples I realized the best answer always seemed to be one of the extreme allocations, namely
52+
an allocation with all `N` libraries or only `1` library. I tried to find an example where one of the middleground less
53+
extreme allocations could be more optimal, but I came to the conclusion that that will never be the case, because we greedily
54+
want to choose to employ as many of the cheapest resource (either libraries or roads) as possible. In other words, if roads were
55+
cheaper to build than libraries, and there exist the possible roads to repair to connect the entire component (the definition!),
56+
then we'd want to only build `1` library, and as many remaining roads as we'd need. We could build two libraries, and one fewer
57+
road, but that would give us the same connected result but with a higher cost, unnecessarily.
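
The cost of an allocation with `K` roads is `(N - K) * costLib + K * costRoad`, which is linear in `K`, so its minimum has to sit at one of the two ends. The sketch below checks this by brute force for small components. The helper names `allocation_costs` and `cheapest_extreme` are assumptions made for illustration.

```python
def allocation_costs(n, cost_lib, cost_road):
    """Cost of every allocation of (n - k) libraries and k roads, for k = 0..n-1."""
    return [(n - k) * cost_lib + k * cost_road for k in range(n)]


def cheapest_extreme(n, cost_lib, cost_road):
    """Cheaper of the two extremes: a library everywhere, or one library plus roads."""
    all_libraries = n * cost_lib
    one_library = cost_lib + (n - 1) * cost_road
    return min(all_libraries, one_library)


# Roads cheaper than libraries: cost falls as k grows, so one library wins.
print(allocation_costs(5, 3, 2))  # [15, 14, 13, 12, 11]
# Libraries cheaper than roads: cost rises as k grows, so all libraries win.
print(allocation_costs(5, 2, 3))  # [10, 11, 12, 13, 14]

# Brute-force check that no middle allocation ever beats both extremes.
for n in range(1, 8):
    for cost_lib in range(1, 6):
        for cost_road in range(1, 6):
            costs = allocation_costs(n, cost_lib, cost_road)
            assert min(costs) == cheapest_extreme(n, cost_lib, cost_road)
```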
58+
59+
# Implementation design
60+
61+
When thinking about the implementation, I knew the number of connected components was relevant to this problem. I also knew
62+
we could get an entire connected component (but more importantly its size) using a trivial-to-implement BFS algorithm. I figured
63+
I'd use an adjacency list to store the graph, since I wasn't going to perform any operations that a matrix would be more suited
64+
for. The necessary steps were something like this:
65+
66+
- Build the graph's adjacency list
67+
- For each connected component
68+
  - Get the size of the component
68+
  - Minimal cost of connecting this component was `min(a, b)` where:
69+
    - `a = numCities * costLib`
70+
    - `b = costLib + (numCities - 1) * costRoad`
72+
- With the minimal cost of the component in hand, add the value to the running sum, and perform the same operation for the next component.
73+
74+
Moving from component-to-component is as easy as just using BFS with some sort of global visitation store.
75+
We can try to find a connected component from each given city. The first time we run BFS, we'll mark *all* nodes in
76+
the discovered component as visited. Then in the next given city, we'll try to find another connected component *if*
77+
the city has not already been visited (does not exist as a part of an already-discovered connected component). We keep
78+
a running sum, adding to it the minimum cost required to connect a once-connected component, and eventually return the
79+
final value.
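
The approach described above can be sketched roughly as follows. This is a minimal Python illustration rather than the repository's actual implementation: the function name `roads_and_libraries`, its parameter names, and the assumption that cities are numbered `1..num_cities` with roads given as `(u, v)` pairs are all choices made for this example.

```python
from collections import deque


def roads_and_libraries(num_cities, cost_lib, cost_road, roads):
    """Minimum total cost so that every city can reach a library."""
    # Build the adjacency list (index 0 is unused; cities are assumed 1-based).
    adjacency = [[] for _ in range(num_cities + 1)]
    for u, v in roads:
        adjacency[u].append(v)
        adjacency[v].append(u)

    visited = [False] * (num_cities + 1)  # shared across all BFS runs
    total_cost = 0

    for start in range(1, num_cities + 1):
        if visited[start]:
            continue  # already part of a previously discovered component

        # BFS from this city to measure the size of its component.
        visited[start] = True
        queue = deque([start])
        component_size = 0
        while queue:
            city = queue.popleft()
            component_size += 1
            for neighbor in adjacency[city]:
                if not visited[neighbor]:
                    visited[neighbor] = True
                    queue.append(neighbor)

        # Only the two extreme allocations need to be compared.
        all_libraries = component_size * cost_lib
        one_library = cost_lib + (component_size - 1) * cost_road
        total_cost += min(all_libraries, one_library)

    return total_cost


# One 3-city cycle with cheap roads: one library plus two roads.
print(roads_and_libraries(3, 2, 1, [(1, 2), (3, 1), (2, 3)]))  # 4
```

Marking a city as visited when it is enqueued, rather than when it is dequeued, keeps each city from entering the queue more than once.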
80+
81+
Time complexity: O(n) (by marking nodes as visited, we avoid repeating ourselves)
82+
Space complexity: O(n)
83+
84+
*It should be noted that the complexity of this algorithm could easily be O(n^2) (due to edge processing in the complete
85+
graph of K~n~), however the problem description on Hackerrank specifically limits the number of edges to `n`*
