Building the Model Using NetworkX

After we obtained the pij matrix, we had the satisfaction level between each advisor and each student. Therefore, the next step is to build the proper model and match them.

we construct the model based on the linear programming of Operation Research, the raw model is mentioned in the abstract (, and we adjusted it based on the goal of this project.

First, student i is assigned to professor j, and the parameter is ranged from -400 to 14 as mentioned in previous post; xij is designed to be a binary variable (xij=1 if student is assigned to a desired advisor and 0 otherwise); parameter dj is the demand of professors. We want to maximize the total satisfaction degree of both advisees and advisors, which is the product of pij and xij ; but there are restrictions, like “each student can only have one advisor”, and each advisor has their upper bound, we are subject to; finally, the variables cannot be nonnegative. Therefore, combining all these factors together yields the following Linear Programming formulation:

Screen Shot 2018-08-12 at 11.18.45 AM

This is the linear programming formulation, but since our data contains 1615 students and 343 advisors, we need to generalize a system to do the calculation for us and coding all information. Therefore, we choose the NetworkX as the tool for final matching. NetworkX is a Python package for the matching and manipulation of the structure, dynamics, and functions of complex networks. Since we already obtained the pij matrix, the only thing left is to generate best matching, i.e., maximize the value of pij x xij.

The first thing of using NetworkX is to create a “graph”.  Imagining there is a big graph, and both students and advisors are “nodes” on the graph. “Student” nodes are on one side and “advisor” nodes are on another side. The matchings between students and advisors have “direction” because advisors have  “supply” and students have “demands”. So the direction is from advisors to students.
Screen Shot 2018-08-12 at 2.41.41 PM

In addition, each node is linked by “edge”, and each edge has its own “capacity” (in our case, number of advisor a student demanded, which is 1) and “weight” (the pij, satisfaction degree). Since the NetworkX is used to minimize the outcome, and we want to maximize the outcome, so we used it to minimize the negative satisfaction level. For example, in this case, a1 will more likely to match to s1614, since the weight is -6.75, which is smaller than -3.5. This means the pij value between a1 and s1614 is higher than that between a1 and s1.

Screen Shot 2018-08-12 at 2.55.26 PM

Before running the NetworkX, another important thing is that, for NetworkX, the demand and supply should be equal. However, in our case, we have 1615 true students and 1631 true slots (sum of demands of all advisors). So we should first equate the number of demands and supplies.

  1. The pre-matched students will have demand 0 and supplies of corresponding advisor will -1.
  2. In addition, we create a ” student dummy” to take care of the left slots. In our case, we have 1631-1615 = 16 left slots, so the “sdummy” will have a capacity of 16.
  3. Another concern from OAA is that, after the raw matching, some professors have no paired students. Thus, in order to equalize the number of students per advisor, i.e., to prevent the situation that some advisors will not match to advisees, we want that for advisors with demands < 5, they can have one empty slot; for advisors with slots >=5, they can have two empty slots. The way to do it is to have another dummy variable, “s1dummy”. The s1dummy is only a “transit” point and has demand 0, and sdummy has the same demand as before. This means that each advisor can then send one unit to “s1dummy” for free, and one unit to “sdummy” for some cost. So each advisor can have an empty slot (= flow to s1dummy) for free, and they can have one more empty slot at a cost of 1.

The code is attached below:

Screen Shot 2018-08-12 at 10.24.26 PM

Thus, we used the NetworkX successfully built the model and did the final matching. Our last step is to run the solution, and also go back and forth to check it. We will then analyze the effectiveness of solution.