core

Core functions for working with bill of materials (BOM) data.

get_sample_data

 get_sample_data ()

Return sample BOM data (df) and item attributes (df_attributes) for demonstration purposes.

df, df_attributes = get_sample_data()

At this point, we have:

  • df – Bill of Materials table (predecessor, successor, quantity)
  • df_attributes – Item attributes table (component_type, cost, etc.)

Core Functions

We create one large directed graph containing all nodes and edges from df. For node attributes, we can optionally merge data from df_attributes.


build_complete_graph

 build_complete_graph (df, df_attributes=None)

Build a directed graph (DiGraph) from the BOM DataFrame. Optionally enrich nodes with attributes from df_attributes.

G = build_complete_graph(df, df_attributes)

print("Number of nodes:", G.number_of_nodes())
print("Number of edges:", G.number_of_edges())
Number of nodes: 20
Number of edges: 21
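
A graph of this kind can be assembled directly from the two tables. The following is a minimal sketch, not the library's implementation: it assumes the column names predecessor, successor and quantity listed above, assumes that the attribute table is indexed by item id, and uses a hypothetical helper name (build_graph_sketch) together with a small made-up sample.

```python
import networkx as nx
import pandas as pd

# Made-up mini BOM that uses the column names listed earlier.
df_demo = pd.DataFrame(
    {
        "predecessor": ["PUMP_RV1", "PUMP_RV1", "VANE_ASM1", "VANE_ASM1"],
        "successor": ["MOTOR_A1", "VANE_ASM1", "VANE_001", "BEARING_01"],
        "quantity": [1, 1, 6, 2],
    }
)
# Made-up attribute table, assumed to be indexed by item id.
df_attributes_demo = pd.DataFrame(
    {"component_type": ["product", "assembly"], "cost": [100.0, 20.0]},
    index=["PUMP_RV1", "VANE_ASM1"],
)


def build_graph_sketch(df, df_attributes=None):
    """Hypothetical stand-in for build_complete_graph."""
    # Every row becomes one directed edge that carries its quantity.
    G = nx.from_pandas_edgelist(
        df,
        source="predecessor",
        target="successor",
        edge_attr="quantity",
        create_using=nx.DiGraph,
    )
    if df_attributes is not None:
        # One attribute dict per item; ids that are not in G are skipped.
        nx.set_node_attributes(G, df_attributes.to_dict(orient="index"))
    return G


G_demo = build_graph_sketch(df_demo, df_attributes_demo)
```

Items without a row in the attribute table simply end up as nodes without attributes, which is one reason to keep the attribute merge optional.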

If we want to see which assemblies or products contain a given part, we can move upwards in the graph (following edges backwards).


get_all_predecessors

 get_all_predecessors (G, node_id)

Return a list of all predecessors (ancestors) of node_id. This effectively finds all assemblies or parent items that contain node_id.

Now we can answer questions like: Where is component ‘MOTOR_A1’ installed?

get_all_predecessors(G, 'MOTOR_A1')
['PUMP_RV1', 'PUMP_RV2', 'PUMP_CL1']
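
Moving upwards in the graph corresponds to what NetworkX calls ancestors. The sketch below is one possible way to express that lookup, assuming a DiGraph whose edges point from parent to child; the helper name all_predecessors_sketch and the four demo edges are illustrative only.

```python
import networkx as nx

# Small demo graph: edges point from the parent item to the contained item.
G_demo = nx.DiGraph()
G_demo.add_edges_from(
    [
        ("PUMP_RV1", "MOTOR_A1"),
        ("PUMP_RV2", "MOTOR_A1"),
        ("PUMP_RV1", "VANE_ASM1"),
        ("VANE_ASM1", "BEARING_01"),
    ]
)


def all_predecessors_sketch(G, node_id):
    """Hypothetical helper: every item that directly or indirectly contains node_id."""
    # nx.ancestors returns a set, so we sort it to get a stable list.
    return sorted(nx.ancestors(G, node_id))


motor_parents = all_predecessors_sketch(G_demo, "MOTOR_A1")
bearing_parents = all_predecessors_sketch(G_demo, "BEARING_01")
```

In this demo graph, BEARING_01 is reported for both its assembly VANE_ASM1 and the product PUMP_RV1, because the lookup follows edges backwards over several levels.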

Similarly, we can move down the graph to find all sub-components or child items of a given product or assembly.


get_all_successors

 get_all_successors (G, node_id)

Return a list of all successors (descendants) of node_id. This effectively finds the complete set of parts that make up node_id.

Now we can answer questions like: Which components are installed in ‘PUMP_RV1’?

get_all_successors(G, 'PUMP_RV1')
['MOTOR_A1', 'CASE_STD1', 'VANE_ASM1', 'OIL_SYSTEM1', 'VANE_001', 'BEARING_01']
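
Moving downwards can be sketched as a breadth-first traversal, which lists direct children before deeper parts. This is an assumption about how such a lookup could work, not the library's code; the helper name all_successors_sketch is hypothetical, and the demo edges are the PUMP_RV1 edges that appear in this document.

```python
import networkx as nx

# The six PUMP_RV1 edges, pointing from parent to child.
G_demo = nx.DiGraph()
G_demo.add_edges_from(
    [
        ("PUMP_RV1", "MOTOR_A1"),
        ("PUMP_RV1", "CASE_STD1"),
        ("PUMP_RV1", "VANE_ASM1"),
        ("PUMP_RV1", "OIL_SYSTEM1"),
        ("VANE_ASM1", "VANE_001"),
        ("VANE_ASM1", "BEARING_01"),
    ]
)


def all_successors_sketch(G, node_id):
    """Hypothetical helper: all parts below node_id, in breadth-first order."""
    # bfs_edges yields (parent, child) pairs and reaches every node once.
    return [child for _, child in nx.bfs_edges(G, node_id)]


parts = all_successors_sketch(G_demo, "PUMP_RV1")
```

If the order does not matter, nx.descendants(G, node_id) returns the same items as a set.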

We can select a subgraph by giving a root node: the subgraph contains the root and all of its successors.


select_subg_by_root

 select_subg_by_root (G, root_id)

sub_g = select_subg_by_root(G, 'PUMP_RV1')
sub_g.nodes()
NodeView(('BEARING_01', 'VANE_001', 'CASE_STD1', 'MOTOR_A1', 'VANE_ASM1', 'PUMP_RV1', 'OIL_SYSTEM1'))
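
A subgraph selection like this can be sketched with nx.descendants and G.subgraph. The helper name select_subgraph_sketch is hypothetical, and whether the library returns a copy or a view is not stated here, so the copy below is an assumption.

```python
import networkx as nx

# Edges of two products that share some parts (taken from the examples in this document).
G_demo = nx.DiGraph()
G_demo.add_edges_from(
    [
        ("PUMP_RV1", "MOTOR_A1"),
        ("PUMP_RV1", "CASE_STD1"),
        ("PUMP_RV1", "VANE_ASM1"),
        ("PUMP_RV1", "OIL_SYSTEM1"),
        ("VANE_ASM1", "VANE_001"),
        ("VANE_ASM1", "BEARING_01"),
        ("PUMP_RV2", "MOTOR_A1"),
        ("PUMP_RV2", "CASE_STD1"),
        ("PUMP_RV2", "VANE_ASM2"),
        ("VANE_ASM2", "VANE_002"),
        ("VANE_ASM2", "BEARING_01"),
    ]
)


def select_subgraph_sketch(G, root_id):
    """Hypothetical helper: the root plus everything below it."""
    nodes = nx.descendants(G, root_id) | {root_id}
    # .copy() detaches the result, so later changes do not touch G.
    return G.subgraph(nodes).copy()


sub_demo = select_subgraph_sketch(G_demo, "PUMP_RV2")
```

Shared parts such as MOTOR_A1 stay in the subgraph, while parts that belong only to other products (here OIL_SYSTEM1) are left out.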

For a multipartite (hierarchical) layout in NetworkX, each node must have a level attribute. The level is 0 for a root product, 1 for its direct children, 2 for their children, and so on. add_levels automatically detects the root(s) (any node with no incoming edges) and assigns the levels by breadth-first search.


get_all_roots

 get_all_roots (G)

Return a list of all nodes with no incoming edges (the root nodes).


add_levels

 add_levels (G)

Assign a level attribute to every node, starting at level 0 for the root node(s).

sub_g = add_levels(sub_g)
[(n[0],n[1]['level']) for n in sub_g.nodes(data=True)]
[('BEARING_01', 2),
 ('VANE_001', 2),
 ('CASE_STD1', 1),
 ('MOTOR_A1', 1),
 ('VANE_ASM1', 1),
 ('PUMP_RV1', 0),
 ('OIL_SYSTEM1', 1)]
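
The root detection and the breadth-first level assignment described above can be sketched as follows. This is an illustration under the stated assumptions, not the library's code: the helper names get_roots_sketch and add_levels_sketch are hypothetical, and a node that can be reached by several paths receives the level of the shortest one.

```python
from collections import deque

import networkx as nx

G_demo = nx.DiGraph()
G_demo.add_edges_from(
    [
        ("PUMP_RV1", "MOTOR_A1"),
        ("PUMP_RV1", "VANE_ASM1"),
        ("VANE_ASM1", "VANE_001"),
        ("VANE_ASM1", "BEARING_01"),
    ]
)


def get_roots_sketch(G):
    """Hypothetical helper: nodes without incoming edges."""
    return [node for node, degree in G.in_degree() if degree == 0]


def add_levels_sketch(G):
    """Hypothetical helper: store the BFS distance from the nearest root as 'level'."""
    roots = get_roots_sketch(G)
    levels = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in G.successors(node):
            if child not in levels:  # first visit = shortest distance
                levels[child] = levels[node] + 1
                queue.append(child)
    nx.set_node_attributes(G, levels, "level")
    return G


G_demo = add_levels_sketch(G_demo)
demo_levels = nx.get_node_attributes(G_demo, "level")
```

Starting the queue with all roots at once means the same function works for a single product and for a graph that holds several products.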

The depth of a bill of materials, i.e. the highest level below the root, tells us how many assembly stages a product has. This can serve as a rough indicator of the depth of added value per product.

max([n[1]['level'] for n in sub_g.nodes(data=True)])
2
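
As a cross-check, the depth can also be read from the graph structure itself. The sketch below uses nx.dag_longest_path_length, which counts the edges on the longest path of a directed acyclic graph; the demo edges are the PUMP_RV1 edges from this document. Treating it as equivalent to the maximum level is an assumption that holds when every part sits at the same depth on all paths that lead to it.

```python
import networkx as nx

# PUMP_RV1 and its parts, edges pointing from parent to child.
G_demo = nx.DiGraph(
    [
        ("PUMP_RV1", "MOTOR_A1"),
        ("PUMP_RV1", "CASE_STD1"),
        ("PUMP_RV1", "VANE_ASM1"),
        ("PUMP_RV1", "OIL_SYSTEM1"),
        ("VANE_ASM1", "VANE_001"),
        ("VANE_ASM1", "BEARING_01"),
    ]
)

# Number of edges on the longest parent-to-child chain.
bom_depth = nx.dag_longest_path_length(G_demo)
```

If a part were used both directly in a product and inside one of its assemblies, a breadth-first level would record the shorter route, whereas the longest path would report the deeper one.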

Visualizing

Finally, we create a plotting function that can handle:

  • Either a multipartite layout (if levels are assigned)
  • Or a kamada_kawai layout (force-directed)

We also display edge attributes (quantity), and we allow parameter overrides (font size, node size, color, etc.) for quick styling changes.


plot_graph

 plot_graph (G, layout='multipartite', figsize=(24, 12), font_size=8,
             node_size=1000, node_color='lightblue',
             direction='top_to_bottom', label_rotation=45)

Plot a graph with specified layout and styling.

Parameter       Type   Default        Details
G                                     NetworkX graph
layout          str    multipartite   ‘multipartite’ or ‘kamada_kawai’
figsize         tuple  (24, 12)
font_size       int    8
node_size       int    1000
node_color      str    lightblue
direction       str    top_to_bottom  if layout is multipartite the direction is top to bottom, to rotate set to None
label_rotation  int    45             Rotation angle for node labels (in degrees)

plot_graph(sub_g)
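
To make the two layout options more concrete, here is a reduced sketch of such a plotting function. It is not the library's plot_graph: the name plot_graph_sketch is hypothetical, only a subset of the parameters is covered, and the y coordinate is taken straight from the level attribute so that the root ends up on top.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import networkx as nx

G_demo = nx.DiGraph()
G_demo.add_weighted_edges_from(
    [
        ("PUMP_RV1", "VANE_ASM1", 1),
        ("PUMP_RV1", "MOTOR_A1", 1),
        ("VANE_ASM1", "VANE_001", 6),
        ("VANE_ASM1", "BEARING_01", 2),
    ],
    weight="quantity",
)
nx.set_node_attributes(
    G_demo,
    {"PUMP_RV1": 0, "VANE_ASM1": 1, "MOTOR_A1": 1, "VANE_001": 2, "BEARING_01": 2},
    "level",
)


def plot_graph_sketch(G, layout="multipartite", figsize=(8, 4), font_size=8,
                      node_size=1000, node_color="lightblue"):
    """Hypothetical, reduced stand-in for plot_graph."""
    if layout == "multipartite":
        # One row per level; x comes from the layout, y from the level itself.
        raw = nx.multipartite_layout(G, subset_key="level", align="horizontal")
        pos = {n: (float(xy[0]), -G.nodes[n]["level"]) for n, xy in raw.items()}
    else:
        pos = nx.kamada_kawai_layout(G)  # force-directed; needs SciPy
    fig, ax = plt.subplots(figsize=figsize)
    nx.draw_networkx(G, pos, ax=ax, node_size=node_size,
                     node_color=node_color, font_size=font_size)
    # Show the quantity of every edge next to it.
    nx.draw_networkx_edge_labels(
        G, pos, edge_labels=nx.get_edge_attributes(G, "quantity"),
        font_size=font_size, ax=ax,
    )
    ax.set_axis_off()
    return fig, pos


fig, pos = plot_graph_sketch(G_demo)
```

Returning the figure makes it easy to save the plot, for example with fig.savefig("bom.png").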

Creating a binary matrix

We can also create a binary matrix with end products in the index and parts as columns. This is very helpful for further analysis such as similarity searches and product clustering.


create_binary_matrix

 create_binary_matrix (G, root_nodes=None)

Create a binary matrix with end products as the index and parts as columns. A 1 means that the part occurs somewhere in the product's BOM, a 0 that it does not.

b_matrix = create_binary_matrix(G)
b_matrix
parts BEARING_01 BEARING_02 CASE_CL1 CASE_SC1 CASE_STD1 CLAW_001 CLAW_ASM1 MOTOR_A1 MOTOR_B1 OIL_SYSTEM1 SCREW_001 SCREW_ASM1 VANE_001 VANE_002 VANE_ASM1 VANE_ASM2
head
PUMP_CL1 0 1 1 0 0 1 1 1 0 0 0 0 0 0 0 0
PUMP_RV1 1 0 0 0 1 0 0 1 0 1 0 0 1 0 1 0
PUMP_RV2 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1
PUMP_SC1 0 1 0 1 0 0 0 0 1 0 1 1 0 0 0 0
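
A matrix like this can be derived from the descendants of each root, and it lends itself to simple similarity measures. The sketch below is an illustration, not the library's implementation: binary_matrix_sketch and jaccard are hypothetical names, the demo graph only contains the PUMP_RV1 and PUMP_RV2 edges shown in this document, and the Jaccard index is just one possible similarity measure.

```python
import networkx as nx
import pandas as pd

G_demo = nx.DiGraph(
    [
        ("PUMP_RV1", "MOTOR_A1"),
        ("PUMP_RV1", "CASE_STD1"),
        ("PUMP_RV1", "VANE_ASM1"),
        ("PUMP_RV1", "OIL_SYSTEM1"),
        ("VANE_ASM1", "VANE_001"),
        ("VANE_ASM1", "BEARING_01"),
        ("PUMP_RV2", "MOTOR_A1"),
        ("PUMP_RV2", "CASE_STD1"),
        ("PUMP_RV2", "VANE_ASM2"),
        ("VANE_ASM2", "VANE_002"),
        ("VANE_ASM2", "BEARING_01"),
    ]
)


def binary_matrix_sketch(G, root_nodes=None):
    """Hypothetical helper: 1 if a part occurs anywhere below a root, else 0."""
    if root_nodes is None:
        root_nodes = [n for n, degree in G.in_degree() if degree == 0]
    rows = {root: {part: 1 for part in nx.descendants(G, root)} for root in root_nodes}
    matrix = pd.DataFrame.from_dict(rows, orient="index").fillna(0).astype(int)
    return matrix.sort_index().sort_index(axis=1)


def jaccard(matrix, a, b):
    """Shared parts divided by all parts used in either product."""
    row_a, row_b = matrix.loc[a].astype(bool), matrix.loc[b].astype(bool)
    return (row_a & row_b).sum() / (row_a | row_b).sum()


b_demo = binary_matrix_sketch(G_demo)
similarity = jaccard(b_demo, "PUMP_RV1", "PUMP_RV2")
```

The two products share three of the eight parts used in either of them, so the similarity is 3/8 = 0.375. The same value follows from the PUMP_RV1 and PUMP_RV2 rows of the full matrix above.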

Creating a matrix for parts utilisation

We can also create a matrix with end products in the index, parts as columns, and an edge attribute (for example quantity or cost) as the values. Like the binary matrix, it is very helpful for further analysis such as similarity searches and product clustering. We start by looking at the edges that leave a single product:

G.out_edges(['PUMP_RV1'], data=True)
OutEdgeDataView([('PUMP_RV1', 'MOTOR_A1', {'quantity': 1}), ('PUMP_RV1', 'CASE_STD1', {'quantity': 1}), ('PUMP_RV1', 'VANE_ASM1', {'quantity': 1}), ('PUMP_RV1', 'OIL_SYSTEM1', {'quantity': 1})])
elist = [[u, v, data['quantity']] for u,v,data in G.out_edges(['PUMP_RV1'], data=True)]
elist
[['PUMP_RV1', 'MOTOR_A1', 1],
 ['PUMP_RV1', 'CASE_STD1', 1],
 ['PUMP_RV1', 'VANE_ASM1', 1],
 ['PUMP_RV1', 'OIL_SYSTEM1', 1]]

get_all_successor_edges

 get_all_successor_edges (G, node_id, attr='quantity', default=None)

Return a list of all edges below node_id, that is, the outgoing edges of node_id and of all its successors. Each entry has the form [node_id, parent, child, value of attr].


get_all_predecessor_edges

 get_all_predecessor_edges (G, node_id, attr='quantity', default=None)

Return a list of all edges above node_id, that is, the incoming edges of node_id and of all its predecessors. Each entry has the form [node_id, parent, child, value of attr].

get_all_successor_edges(G, 'PUMP_RV1')
[['PUMP_RV1', 'PUMP_RV1', 'MOTOR_A1', 1],
 ['PUMP_RV1', 'PUMP_RV1', 'CASE_STD1', 1],
 ['PUMP_RV1', 'PUMP_RV1', 'VANE_ASM1', 1],
 ['PUMP_RV1', 'PUMP_RV1', 'OIL_SYSTEM1', 1],
 ['PUMP_RV1', 'VANE_ASM1', 'VANE_001', 6],
 ['PUMP_RV1', 'VANE_ASM1', 'BEARING_01', 2]]
get_all_successor_edges(G, 'BEARING_01')
[]
get_all_predecessor_edges(G, 'PUMP_RV1')
[]
get_all_predecessor_edges(G, 'BEARING_01')
[['BEARING_01', 'VANE_ASM1', 'BEARING_01', 2],
 ['BEARING_01', 'VANE_ASM2', 'BEARING_01', 2],
 ['BEARING_01', 'PUMP_RV1', 'VANE_ASM1', 1],
 ['BEARING_01', 'PUMP_RV2', 'VANE_ASM2', 1]]
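
Both edge lookups can be sketched by combining the descendant or ancestor set with the edge views of NetworkX. The helper names below are hypothetical, the demo graph holds only the PUMP_RV1 and PUMP_RV2 edges, and the order of the returned rows is not guaranteed to match the outputs above.

```python
import networkx as nx

G_demo = nx.DiGraph()
G_demo.add_weighted_edges_from(
    [
        ("PUMP_RV1", "MOTOR_A1", 1),
        ("PUMP_RV1", "CASE_STD1", 1),
        ("PUMP_RV1", "VANE_ASM1", 1),
        ("PUMP_RV1", "OIL_SYSTEM1", 1),
        ("VANE_ASM1", "VANE_001", 6),
        ("VANE_ASM1", "BEARING_01", 2),
        ("PUMP_RV2", "MOTOR_A1", 1),
        ("PUMP_RV2", "CASE_STD1", 1),
        ("PUMP_RV2", "VANE_ASM2", 1),
        ("VANE_ASM2", "VANE_002", 6),
        ("VANE_ASM2", "BEARING_01", 2),
    ],
    weight="quantity",  # store the third value as the 'quantity' edge attribute
)


def successor_edges_sketch(G, node_id, attr="quantity", default=None):
    """Hypothetical helper: rows [node_id, parent, child, attr] for all edges below node_id."""
    nodes = [node_id, *nx.descendants(G, node_id)]
    return [[node_id, u, v, value]
            for u, v, value in G.out_edges(nodes, data=attr, default=default)]


def predecessor_edges_sketch(G, node_id, attr="quantity", default=None):
    """Hypothetical helper: rows [node_id, parent, child, attr] for all edges above node_id."""
    nodes = [node_id, *nx.ancestors(G, node_id)]
    return [[node_id, u, v, value]
            for u, v, value in G.in_edges(nodes, data=attr, default=default)]


below_rv1 = successor_edges_sketch(G_demo, "PUMP_RV1")
above_bearing = predecessor_edges_sketch(G_demo, "BEARING_01")
```

Repeating node_id in the first position of every row is what later allows the rows of several products to be concatenated and pivoted by product.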
import pandas as pd

dfs = [pd.DataFrame(get_all_successor_edges(G, root)) for root in get_all_roots(G)]
dfs[:2]
[          0          1            2  3
 0  PUMP_RV1   PUMP_RV1     MOTOR_A1  1
 1  PUMP_RV1   PUMP_RV1    CASE_STD1  1
 2  PUMP_RV1   PUMP_RV1    VANE_ASM1  1
 3  PUMP_RV1   PUMP_RV1  OIL_SYSTEM1  1
 4  PUMP_RV1  VANE_ASM1     VANE_001  6
 5  PUMP_RV1  VANE_ASM1   BEARING_01  2,
           0          1           2  3
 0  PUMP_RV2   PUMP_RV2    MOTOR_A1  1
 1  PUMP_RV2   PUMP_RV2   CASE_STD1  1
 2  PUMP_RV2   PUMP_RV2   VANE_ASM2  1
 3  PUMP_RV2  VANE_ASM2    VANE_002  6
 4  PUMP_RV2  VANE_ASM2  BEARING_01  2]
final_df = pd.concat(dfs)
final_df.head()
          0          1            2  3
0  PUMP_RV1   PUMP_RV1     MOTOR_A1  1
1  PUMP_RV1   PUMP_RV1    CASE_STD1  1
2  PUMP_RV1   PUMP_RV1    VANE_ASM1  1
3  PUMP_RV1   PUMP_RV1  OIL_SYSTEM1  1
4  PUMP_RV1  VANE_ASM1     VANE_001  6
final_df.columns = ['head', 'parent', 'child', 'quantity']
final_df.head()
       head     parent        child  quantity
0  PUMP_RV1   PUMP_RV1     MOTOR_A1         1
1  PUMP_RV1   PUMP_RV1    CASE_STD1         1
2  PUMP_RV1   PUMP_RV1    VANE_ASM1         1
3  PUMP_RV1   PUMP_RV1  OIL_SYSTEM1         1
4  PUMP_RV1  VANE_ASM1     VANE_001         6
final_df.pivot_table(index='head', columns='child', values='quantity', aggfunc='sum')
child BEARING_01 BEARING_02 CASE_CL1 CASE_SC1 CASE_STD1 CLAW_001 CLAW_ASM1 MOTOR_A1 MOTOR_B1 OIL_SYSTEM1 SCREW_001 SCREW_ASM1 VANE_001 VANE_002 VANE_ASM1 VANE_ASM2
head
PUMP_CL1 NaN 4.0 1.0 NaN NaN 2.0 1.0 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
PUMP_RV1 2.0 NaN NaN NaN 1.0 NaN NaN 1.0 NaN 1.0 NaN NaN 6.0 NaN 1.0 NaN
PUMP_RV2 2.0 NaN NaN NaN 1.0 NaN NaN 1.0 NaN NaN NaN NaN NaN 6.0 NaN 1.0
PUMP_SC1 NaN 4.0 NaN 1.0 NaN NaN NaN NaN 1.0 NaN 2.0 1.0 NaN NaN NaN NaN

create_matrix

 create_matrix (G, attr='quantity', root_nodes=None)

Create a matrix with end products as the index, parts as columns, and the values of the edge attribute attr as entries.

create_matrix(G)
child BEARING_01 BEARING_02 CASE_CL1 CASE_SC1 CASE_STD1 CLAW_001 CLAW_ASM1 MOTOR_A1 MOTOR_B1 OIL_SYSTEM1 SCREW_001 SCREW_ASM1 VANE_001 VANE_002 VANE_ASM1 VANE_ASM2
head
PUMP_CL1 NaN 4.0 1.0 NaN NaN 2.0 1.0 1.0 NaN NaN NaN NaN NaN NaN NaN NaN
PUMP_RV1 2.0 NaN NaN NaN 1.0 NaN NaN 1.0 NaN 1.0 NaN NaN 6.0 NaN 1.0 NaN
PUMP_RV2 2.0 NaN NaN NaN 1.0 NaN NaN 1.0 NaN NaN NaN NaN NaN 6.0 NaN 1.0
PUMP_SC1 NaN 4.0 NaN 1.0 NaN NaN NaN NaN 1.0 NaN 2.0 1.0 NaN NaN NaN NaN
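
The individual steps shown above (collect the edges per root, concatenate, pivot) can be condensed into a single function. The version below is a sketch under assumptions: create_matrix_sketch is a hypothetical name, roots are taken to be the nodes without incoming edges, and the values are the edge attributes as stored, summed per product and part without multiplying quantities along the path.

```python
import networkx as nx
import pandas as pd

G_demo = nx.DiGraph()
G_demo.add_weighted_edges_from(
    [
        ("PUMP_RV1", "MOTOR_A1", 1),
        ("PUMP_RV1", "VANE_ASM1", 1),
        ("VANE_ASM1", "VANE_001", 6),
        ("VANE_ASM1", "BEARING_01", 2),
        ("PUMP_RV2", "MOTOR_A1", 1),
        ("PUMP_RV2", "VANE_ASM2", 1),
        ("VANE_ASM2", "VANE_002", 6),
        ("VANE_ASM2", "BEARING_01", 2),
    ],
    weight="quantity",  # store the third value as the 'quantity' edge attribute
)


def create_matrix_sketch(G, attr="quantity", root_nodes=None):
    """Hypothetical helper: end products x parts, filled with the edge attribute."""
    if root_nodes is None:
        root_nodes = [n for n, degree in G.in_degree() if degree == 0]
    rows = []
    for root in root_nodes:
        # All edges that start at the root or at one of its descendants.
        nodes = [root, *nx.descendants(G, root)]
        for parent, child, value in G.out_edges(nodes, data=attr):
            rows.append({"head": root, "parent": parent, "child": child, attr: value})
    edges = pd.DataFrame(rows, columns=["head", "parent", "child", attr])
    # Parts that a product does not use stay NaN, as in the output above.
    return edges.pivot_table(index="head", columns="child", values=attr, aggfunc="sum")


m_demo = create_matrix_sketch(G_demo)
```

For similarity measures that cannot handle missing values, m_demo.fillna(0) turns the unused parts into zeros.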