The maintenance support organization consists of many different actors and components that are geographically distributed. Actors at different locations, with different roles, will behave differently depending on their individual goals. Managers of the maintenance support organization have certain strategic goals and financial criteria, while the operators using the equipment have a different agenda, focusing on solving the task at hand while relying on the equipment to be operational. In between there are a number of tactical level maintenance managers working to allocate resources to achieve the best possible operational availability, at the lowest cost and risk.
The geographical distribution of actors and components adds complexity, since the available maintenance options and lead-times for any actor and component will depend on where they are located. There may also be limitations on the availability of information services and/or bandwidth, which will mean that not all actors have the same information upon which to base their decisions. This means that the overall performance of the maintenance organization is affected by its specific geographic distribution, and specifically by the available distribution and communication channels within the organization.
The focus in this document is on tactical level maintenance planning. As defined in Section 1.2.1, tactical level planning concerns effective utilization of available resources and optimization of maintenance policies. The purpose of an organizational model in this context is to be able to study the effect of different resource allocation and maintenance policy decisions, in terms of operational availability, cost and risk. In this section, the functionality that is required from a maintenance organization model is identified, and an organization model design is then proposed and discussed.
Most of the requirements on a maintenance support organization model have been touched upon in the introduction to this section. There are six key requirements, which can be summarized as follows:
It is worth noting that the actual physical location of equipment and other components of the organization is not required to enable the model to evaluate availability, cost and risk. It is sufficient to have information about the characteristics of the distribution and communication channels that connect the different parts of the maintenance organization. This is beneficial in cases where a customer of a maintenance support provider may not be able or willing to share the specific locations of equipment, for example in public-private partnerships in the defense industry.
An organizational level model spans multiple equipment units, as well as other resources such as spare equipment components, facilities and staff. The equipment reliability model proposed in Section 2.2 can be leveraged to provide information about equipment behavior regarding availability and risk. These are characteristics that are also required from the maintenance support organization model. The following additional maintenance organization model functionality is here identified as required to exploit the equipment reliability model capabilities in the organizational context:
A maintenance support organization can be described as a network of actors, each with their own authority to respond to their perception of the situation. Actors can consume or shift resources, such as spares, within the network depending on location and physical availability. Information is another commodity that is communicated to the different actors in the model, depending on available communication means and information sharing policies. Maintenance operations thus involve complex activities and decision-making on multiple organizational levels that are distributed in time and space. Agent-based modeling appears appropriate under these circumstances (Lim and Zhang 2003). A study of the feasibility of analyzing maintenance strategies using agent-based simulations has concluded that agent-based modeling, combined with other methods and simulation techniques, has a high potential to realistically model large and complex maintenance systems even with limited mathematical efforts (Kaegi, Mock, and Kröger 2009).
In this section, an agent-based model of the maintenance support organization that captures the aforementioned organizational structure is proposed. The problem is decomposed into different types of agents and objects that can be interconnected such that they form a representative model of the real maintenance support organization of interest. The representation of the real organization is assumed to be readily available to the tactical level planner.
The differentiation between the terms agent and object is sometimes ambiguous. The following definitions, as used by Kaegi, Mock, and Kröger (2009), are adopted in this text:
The proposed equipment reliability model presented in Section 2.2.3 has four parts: the model structure, a maintenance event model, a failure rate model and an effect of covariates model. The following object-type components constitute the connection between the organizational model proposed in this section and the proposed equipment reliability model. In fact, two parts of the equipment reliability model are implemented as part of the organization model, namely the model structure and the maintenance event model. Interfaces are provided towards the failure rate model and the effect of covariates model.
The proposed maintenance organization model can be represented as a graph, or rather, two separate graphs, layered on top of each other. One graph describes physical locations and the transportation of physical objects between locations, forming a graph structure with locations as nodes and the channels for transportation as edges, see Figure 2.9. The other layer describes the flow of information, and is discussed further below.
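As a minimal sketch of the physical layer, the graph could be held as a mapping from (upstream, downstream) location pairs to channel attributes. The location names and lead-times below are illustrative assumptions and are not taken from Figure 2.9.

```python
from collections import defaultdict

# Hypothetical locations (nodes) and transport channels (directed edges).
# The edge attribute is a lead-time in hours; all values are illustrative.
channels = {
    ("depot", "site_a"): 24,
    ("depot", "site_b"): 48,
}

# Build an adjacency list: location -> [(downstream location, lead-time)].
physical_layer = defaultdict(list)
for (upstream, downstream), lead_time in channels.items():
    physical_layer[upstream].append((downstream, lead_time))


def downstream_of(location):
    """Return the locations directly reachable from the given location."""
    return [dst for dst, _ in physical_layer.get(location, [])]


print(downstream_of("depot"))
```

The information layer could be held in a second mapping of the same shape, with databases as nodes and communication channels as edges.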
At least two types of agents can be identified in the maintenance support organization: maintainer agents, which perform equipment level maintenance, and location manager agents, which make decisions on the transportation of components and assign work to maintainer agents.
So far we have described the nodes of our physical layer graph in the form of location manager agents, each of which can have a number of associated maintainer agent workers for the performance of maintenance on equipment unit components. Two more object-type components are needed to complete the physical layer:
The information layer is the second of the two layers in the graph representation of the maintenance organization model. Database objects form nodes representing locations where information is stored, connected by communication channel objects that form the edges of the graph. Figure 2.10 illustrates the information layer graph, overlaid on the physical layer graph from Figure 2.9. To represent the information exchanged between databases, an information object is used.
The flow of information within the organization will affect, for example, the uncertainty regarding component reliabilities, and may also provide additional information upon which maintenance policy decisions can be made. For example, a location manager agent in the back of the supply chain may decide to proactively order spares rather than wait for orders from the downstream locations, based upon information available directly from the maintained equipment at the front of the supply chain. The flow of information, and representation of information available to each agent, is modeled by the following object-type components:
A means of evaluating the performance of the modeled maintenance organization over a simulated period of time is needed. In the proposed organization model, this functionality is provided through an evaluation module which is referenced throughout the organization model framework. Evaluation data is then collected during the simulation by letting any object that incurs costs notify the evaluation module, and analogously for availability effects. At the end of the simulation, the result can be evaluated by querying the evaluation module.
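The notification pattern described above might be sketched as follows. The class name EvaluationModule and its method names are hypothetical stand-ins chosen for illustration; they are not the interface of the proof of concept framework.

```python
class EvaluationModule:
    """Hypothetical stand-in for the evaluation module described above."""

    def __init__(self):
        self.total_cost = 0.0
        self.unit_hours_up = 0.0
        self.unit_hours_total = 0.0

    def notify_cost(self, amount):
        # Called by any object that incurs a cost during the simulation.
        self.total_cost += amount

    def notify_availability(self, operational, hours=1.0):
        # Called for each equipment unit and time step.
        self.unit_hours_total += hours
        if operational:
            self.unit_hours_up += hours

    def availability(self):
        # Fraction of unit-hours spent in operational status.
        if self.unit_hours_total == 0:
            return None
        return self.unit_hours_up / self.unit_hours_total


evaluation = EvaluationModule()
evaluation.notify_cost(150.0)                   # e.g. a spare taken into use
evaluation.notify_availability(True, hours=3)   # unit operational for 3 hours
evaluation.notify_availability(False, hours=1)  # unit down for 1 hour
print(evaluation.total_cost, evaluation.availability())
```

Keeping all bookkeeping in a single object means that agents and objects only need a reference to it, and the results can be queried in one place once the simulation has ended.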
The risk level is assessed by running the same simulation multiple times with different random number series driving the simulation, and then analyzing the combined result sets using well-known statistical methods. These aspects are further discussed in Section 2.4.
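The replication approach can be illustrated with a toy stand-in for the simulation. The function run_simulation, the 5% hourly downtime probability and the choice of 30 replications are assumptions made for this sketch only.

```python
import random
import statistics


def run_simulation(seed, hours=720):
    """Toy stand-in for one simulation run; returns the simulated availability."""
    rng = random.Random(seed)  # one independent random series per run
    up_hours = sum(1 for _ in range(hours) if rng.random() > 0.05)
    return up_hours / hours


# Repeat the same simulation with different random number series.
results = [run_simulation(seed) for seed in range(30)]

mean = statistics.mean(results)
stdev = statistics.stdev(results)
# The 5th percentile serves as a simple downside-risk indicator.
p5 = statistics.quantiles(results, n=20)[0]

print(f"mean={mean:.3f} stdev={stdev:.4f} p5={p5:.3f}")
```

The spread of the result set, rather than any single run, is what indicates the risk associated with a given resource allocation or policy.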
To examine the viability of the proposed modeling approach, a proof of concept implementation has been made, covering part of the proposed organization model features. The implementation consists of a framework with utilities for modeling, as well as a sample model built using that framework.
The proof of concept organization model framework has been implemented using the Python programming language. It is designed using object-oriented programming such that classes are made available for each of the implemented agent and object types in the proposed organization model. The proof of concept implementation does not cover the proposed information layer and associated objects. A brief description of each class that implements an agent or object can be found in Table 2.2.
| Agent/Object | Framework Class | Description |
|---|---|---|
| Equipment Component Object | EquipmentComponentType | Objects of this class describe common characteristics of a type of equipment component. These characteristics consist of component cost, mean time to repair and failure rate information. Since the proof of concept implementation does not have the information layer proposed in Section 2.3.2, the interfaces towards the failure rate model and effect of covariates model components of the proposed equipment reliability model are not implemented. Instead, a basic Weibull distribution is substituted to describe the failure rate, while the effect of covariates is ignored. |
| | EquipmentComponentObject | Each instance of this class represents a single physical equipment component. Each instance is associated with an instance of the equipment component type class. It also tracks how many hours the component has been in use, whether it is in operational status and whether it is new or used. The new or used tracking is employed to determine when to incur component costs. In this proof of concept implementation of the framework, component costs are assumed to be incurred as soon as a component is taken into use. The practical equivalent would be that all components are kept in consignment storage until fitted on an equipment unit. |
| Equipment Unit Object | EquipmentUnitType | Instances of this class describe common characteristics of a type of equipment unit. In this implementation those characteristics are limited to a list of equipment component type instances, and information about how many instances of each component type can be found on the represented type of equipment unit. |
| | EquipmentUnitObject | Each instance of this class represents a single physical equipment unit. Each object instance is associated with one or more equipment component objects, modeling the components of which the unit is built. It is also associated with a single equipment unit type instance, describing which kind of system the equipment unit represents. The class provides functionality for determining whether a unit is operational and whether it needs maintenance according to a given policy, as well as usage tracking functionality. A limitation in the current proof of concept framework is that equipment units are assumed to be in constant use while in a functioning state. This may be fine for some types of equipment units, while inadequate for others. |
| Location Manager Agent | LocationManagerAgent | Each instance of this class represents the maintenance decision making associated with a particular location in the maintenance organization. The typical decision making performed by this type of agent is inventory management: ordering and delivering spares to other location manager agent instances. In this proof of concept implementation, no effort is made to collate requests for and deliveries of spares. The agent is also responsible for checking the maintenance status of any equipment units associated with the location the instance represents. Outstanding maintenance gets assigned to available maintainer agents. If no unused spares are available to perform corrective maintenance, equipment components which have been replaced in preventive maintenance, but that are still functioning, will be used as spares. This means that imperfect maintenance may occur in the model. |
| Maintainer Agent | MaintainerAgent | Each instance of this class represents a maintenance engineer, able to perform maintenance on a single equipment unit object at a time. The maintainer agent is assigned a unit and a set of spares by the location manager agent. This agent then decides on the order and priority of components to be maintained. It keeps track of costs incurred due to maintenance work, and the time it takes to complete the maintenance operations of an assigned equipment unit. |
| Distribution Channel Object | DistributionChannelObject | In this proof of concept implementation of the maintenance organization model framework, each instance of this class describes a directed edge symbolizing a unidirectional transportation route from an upstream location to a downstream location. The object keeps track of lead-times and costs incurred during transport of equipment components. Multiple transports can occur in a single distribution channel instance simultaneously, and no capacity limitations are implemented. |
| Maintenance Policy | MaintenancePolicyObject | An instance of this class encapsulates the rule-set used by a location manager agent or a maintainer agent to guide their decision making. In the current proof of concept implementation, instances of this type of object are shared among the agents associated with a particular location. |
The maintenance organization model framework also provides three supporting classes associated with resource tracking and evaluation:
Full implementation details of the proof of concept organization model framework are available in Section 3.3.
This section shows how a small maintenance organization model can be built using the maintenance organization model framework presented above. The model constructed here is similar to the model presented in Figure 2.9. It consists of three location manager agents, representing one upstream spare repository with no associated equipment units, and two downstream locations that are connected to the upstream location through distribution channels. Resource tracking and evaluation are supported through the three supporting classes. The model implemented here is designed for studying the effect of different maintenance policies in the modeled organization.
The first step is to define the different types of components available to our model. In this case it is done through the creation of a new class called ComponentTypes:
Each component type is modeled using the framework class EquipmentComponentType. The first argument to the constructor is a label for the type of component. The next two are the scale and shape parameters of the Weibull failure rate distribution, which substitutes for the interface to the corresponding equipment reliability model components. The last two arguments are the mean time to repair (MTTR), specified in hours, and the component cost, respectively.
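The constructor signature described above might look as follows in use. The EquipmentComponentType class below is a stand-in written for this sketch, not the framework class itself, and its field names, the component labels and all numeric values are illustrative assumptions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EquipmentComponentType:
    """Stand-in mirroring the five constructor arguments described above."""
    label: str
    scale: float       # Weibull scale parameter, in hours
    shape: float       # Weibull shape parameter
    mttr_hours: float  # mean time to repair, in hours
    cost: float        # component cost

    def sample_time_to_failure(self, rng):
        # random.weibullvariate takes the scale first, then the shape.
        return rng.weibullvariate(self.scale, self.shape)


class ComponentTypes:
    # Illustrative values only; not taken from the original model.
    PUMP = EquipmentComponentType("pump", 2000.0, 1.5, 4.0, 1200.0)
    FILTER = EquipmentComponentType("filter", 500.0, 2.0, 1.0, 80.0)


rng = random.Random(42)
ttf = ComponentTypes.PUMP.sample_time_to_failure(rng)
print(f"sampled time to failure: {ttf:.1f} hours")
```

Collecting the type definitions as class attributes gives the rest of the model a single place from which to reference them.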
The next step is to describe the equipment unit types occurring in the model. In this case there is only one, but for consistency it is implemented in a new class called UnitTypes.
The framework class EquipmentUnitType is used to define the equipment type. It takes two arguments: a label and a list of tuples detailing the component types and the number of occurrences of each component type.
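A sketch of such a definition, assuming a stand-in EquipmentUnitType with hypothetical field names, could read as follows. Component types are represented by plain labels to keep the example short, and the unit type and counts are invented for illustration.

```python
from collections import namedtuple

# Stand-in for the framework class: a label plus (component type, count) tuples.
EquipmentUnitType = namedtuple("EquipmentUnitType", ["label", "components"])


class UnitTypes:
    # Illustrative unit type built from one pump and two filters.
    VEHICLE = EquipmentUnitType("vehicle", [("pump", 1), ("filter", 2)])


def component_count(unit_type):
    """Total number of components fitted on one unit of the given type."""
    return sum(count for _, count in unit_type.components)


print(component_count(UnitTypes.VEHICLE))
```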
Before any objects are instantiated, the evaluation score keeping and the resource tracking objects are created:
The three framework supporting classes are used and instantiated with self-explanatory variable names.
A maintenance policy object for the first downstream location is then constructed. This is done in two steps: a dictionary describing the policy is constructed and then used to instantiate the framework class MaintenancePolicyObject:
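The two-step construction might be sketched as below. The dictionary keys, the threshold values and the two query methods are assumptions made for illustration; the rule-set understood by the actual framework class is not reproduced here.

```python
class MaintenancePolicyObject:
    """Stand-in: wraps a dictionary of rules shared by the agents at a location."""

    def __init__(self, rules):
        self._rules = dict(rules)  # copy so later edits do not leak in

    def needs_preventive_replacement(self, hours_in_use):
        return hours_in_use >= self._rules["preventive_replacement_hours"]

    def needs_reorder(self, spares_in_stock):
        return spares_in_stock <= self._rules["reorder_level"]


# Step 1: describe the policy as a dictionary (hypothetical keys and values).
policy_rules = {
    "preventive_replacement_hours": 400,  # replace components at this usage
    "reorder_level": 2,                   # order spares at or below this stock
}

# Step 2: instantiate the policy object from the dictionary.
policy = MaintenancePolicyObject(policy_rules)

print(policy.needs_preventive_replacement(450), policy.needs_reorder(5))
```

Expressing the policy as data makes it straightforward to run the same model under several alternative policies.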
A set of equipment units is then constructed from newly made equipment components. This set of units will be used to associate the units with the location.
As can be seen, the factory method new in the resource tracking object g_units is used to create the equipment units.
Maintainer agents are then created:
The constructor of the framework class MaintainerAgent takes three arguments: a label, a reference to the score tracking object and the cost the maintainer agent incurs while performing a maintenance action.
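As an illustration of how these three arguments could interact, the sketch below uses stand-in classes. ScoreKeeper, the perform_maintenance method and the interpretation of the cost as an hourly rate are assumptions for this example, not confirmed details of the framework.

```python
class ScoreKeeper:
    """Hypothetical stand-in for the score tracking object."""

    def __init__(self):
        self.total_cost = 0.0

    def add_cost(self, amount):
        self.total_cost += amount


class MaintainerAgent:
    """Stand-in taking a label, a score tracking reference and a cost."""

    def __init__(self, label, score, cost_per_hour):
        self.label = label
        self.score = score
        self.cost_per_hour = cost_per_hour  # assumed here to be an hourly rate

    def perform_maintenance(self, hours):
        # Report the cost of the maintainer's time to the score tracker.
        self.score.add_cost(self.cost_per_hour * hours)


score = ScoreKeeper()
maintainers = [MaintainerAgent(f"maintainer_{i}", score, 50.0) for i in range(2)]
maintainers[0].perform_maintenance(2.5)
print(score.total_cost)
```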
The last piece missing before a location manager agent can be constructed is the initial set of spares available at the location:
The factory method new of the global resource tracking variable g_components is here used to create an initially full (as dictated by the maintenance policy) spares repository.
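A resource tracking object with a factory method named new might be sketched as follows. Only the names g_components and new come from the text; the ResourceTracker class, the component representation and the stock level are hypothetical.

```python
import itertools


class ResourceTracker:
    """Stand-in for a global resource tracking object with a factory method."""

    def __init__(self, factory):
        self._factory = factory
        self._ids = itertools.count(1)
        self.registry = {}

    def new(self, *args, **kwargs):
        # Create a resource, give it a unique id and remember it.
        resource_id = next(self._ids)
        resource = self._factory(resource_id, *args, **kwargs)
        self.registry[resource_id] = resource
        return resource


def make_component(resource_id, type_label):
    # A plain dictionary stands in for an equipment component object.
    return {"id": resource_id, "type": type_label, "hours_in_use": 0, "new": True}


g_components = ResourceTracker(make_component)

stock_level = 3  # hypothetical initial stock level dictated by the policy
spares = [g_components.new("pump") for _ in range(stock_level)]
print([spare["id"] for spare in spares])
```

Routing all creation through the tracker means that every component remains identifiable as it moves between locations and equipment units.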
Finally, the location manager object can be created using the data sets detailed above:
The last two arguments to the framework class LocationManagerAgent constructor are the setup-times required for preventive and corrective maintenance. In this model it is assumed that planned preventive maintenance has a shorter setup-time than corrective maintenance, which is needed after a breakdown while the equipment unit is being used in the field. The two remaining locations, i.e. the upstream location and the second downstream location, are created similarly.
The distribution channels are constructed using the framework class DistributionChannelObject:
The first two arguments to the constructor are a label and a reference to the score keeping object, followed by the upstream and downstream locations of the distribution channel. The last two arguments are the lead-time in hours required to transport equipment components through the channel, and the cost per hour of transportation in this channel.
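The behavior of such a channel can be sketched with a stand-in class that follows the argument order described above. The send and deliver methods, the use of a plain list as score keeper and all values are assumptions made for this example.

```python
class DistributionChannelObject:
    """Stand-in for a unidirectional channel without capacity limits."""

    def __init__(self, label, score, upstream, downstream,
                 lead_time_hours, cost_per_hour):
        self.label = label
        self.score = score  # here simply a list collecting incurred costs
        self.upstream = upstream
        self.downstream = downstream
        self.lead_time_hours = lead_time_hours
        self.cost_per_hour = cost_per_hour
        self.in_transit = []  # (arrival time, item) pairs

    def send(self, item, now):
        # Several transports may be under way at the same time.
        self.in_transit.append((now + self.lead_time_hours, item))
        self.score.append(self.lead_time_hours * self.cost_per_hour)

    def deliver(self, now):
        # Hand over every item whose lead-time has elapsed.
        arrived = [item for due, item in self.in_transit if due <= now]
        self.in_transit = [(due, item) for due, item in self.in_transit
                           if due > now]
        return arrived


costs = []
channel = DistributionChannelObject("depot_to_site_a", costs,
                                    "depot", "site_a", 24, 2.0)
channel.send("pump-1", now=0)
early = channel.deliver(now=23)    # still in transit
on_time = channel.deliver(now=24)  # lead-time has elapsed
print(early, on_time, sum(costs))
```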
The main simulation loop in this example is run for a simulation period of 30 days, i.e. 720 hours:
The chosen resolution of the simulation is 1 hour. Smaller simulation time steps yield more accurate results at the expense of computation time. In this example, all parameters have been set in whole hours, so a simulation step of 1 hour is suitable.
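A fixed-step loop of this kind might be structured as in the following sketch. The TickCounter class and its step method are placeholders for the location manager agents and distribution channels of the model, introduced only to make the example runnable.

```python
SIM_HOURS = 30 * 24  # 30 days
STEP_HOURS = 1       # simulation resolution


class TickCounter:
    """Placeholder for any agent or object updated once per time step."""

    def __init__(self):
        self.ticks = 0

    def step(self, now, dt):
        self.ticks += 1


participants = [TickCounter(), TickCounter()]

now = 0
while now < SIM_HOURS:
    # Every participant advances by one time step before the clock moves on.
    for participant in participants:
        participant.step(now, STEP_HOURS)
    now += STEP_HOURS

print(now, participants[0].ticks)
```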
The final step is to print the evaluated results from the simulation:
Which in one example run yields:
The random numbers used to simulate outcomes in the maintenance organization model framework are drawn with the random function from the random module of the Python standard library. It implements the so-called Mersenne Twister algorithm (Matsumoto and Nishimura 1998), which produces a uniformly distributed number sequence that is fully determined by its seed. This means that by setting the random seed before entering the simulation, entire simulation runs can be reconstructed for further study and post mortem analysis if required. The full source code of this organization model example can be found in Section 3.3.
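The reproducibility property can be demonstrated in a few lines. The function draw_failure_times and the Weibull parameters are illustrative assumptions; only random.seed and random.weibullvariate are actual standard library calls.

```python
import random


def draw_failure_times(seed, n=5):
    """Draw n Weibull-distributed failure times after setting the seed."""
    random.seed(seed)  # fixes the entire random number series
    return [random.weibullvariate(1000.0, 1.5) for _ in range(n)]


first = draw_failure_times(2024)
second = draw_failure_times(2024)

# The same seed reproduces exactly the same sequence of outcomes.
print(first == second)
```

Recording the seed together with the results of each run is therefore sufficient to replay that run later.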
In this section, the requirements on the maintenance organization model identified in Section 2.3.1 are revisited and compared against the proposed solution and proof of concept implementation in Section 2.3.2 and Section 2.3.3 respectively. This discussion is then followed by some comments on the applicability of the proposed modeling approach.
The first requirement, Requirement 1, identified the need for the model to represent different kinds of physical resources. Maintenance staff, equipment units and equipment components were explicitly listed. This requirement is met by the equipment component object, the equipment unit object and the maintainer agent proposed in Section 2.3.2.
Requirement 2 described the need to be able to represent maintenance policies, and the aspects of preventive replacement thresholds and inventory reorder levels were explicitly mentioned. This requirement is met by the maintenance policy object in the proposed maintenance organization model in Section 2.3.2.
Requirement 3, distribution channels, discussed the need to be able to represent means for resource transportation. The ability to represent lead-time was explicitly mentioned. This requirement is fulfilled by the distribution channel object proposed in Section 2.3.2.
Requirement 4 discussed the need to represent information, as different information may be available to different actors in the maintenance organization during a simulation. This requirement is met through two proposed objects: database objects, which encapsulate all information available at a specific point in the maintenance organization, and information objects, which describe information deltas that can be communicated between database objects using communication channel objects.
Requirement 5 identifies the need to represent different communication options available within the maintenance organization. Communication channels are provided by the communication channel object proposed in Section 2.3.2, fulfilling this requirement.
Requirement 6 describes the need for a means of evaluating model performance in terms of availability, cost and risk. This requirement is met by the inclusion of an evaluation class as proposed in Section 2.3.2 and exemplified in Section 2.3.3.
Requirement 7 concerns the interface towards the proposed equipment reliability model, and specifically the ability to include setup-costs and setup-times when bringing in an equipment unit for maintenance. This requirement is met through a combination of different effects in the location manager agent and the maintainer agent model components proposed in Section 2.3.2. First, the location manager will keep a piece of equipment needing corrective maintenance waiting until spares are available to fix the problem. This implies a dynamic setup-time. Second, the maintainer agent will need a setup-time depending on the nature of the maintenance to be performed, preventive or corrective, which will incur costs for the maintainer's time as well as adding setup-time to the maintenance action.
Requirement 8 deals with the ability to collate usage and environmental data from multiple equipment units. The purpose of the information layer in the proposed organization model is to fulfill this requirement. As such it has not been fulfilled in the proof of concept implementation, of which the information layer is not a part, but can be fulfilled using the proposed information layer objects.
Requirement 9 specifies that resource tracking must be possible to enable non-perfect maintenance. This is enabled through use of the supporting classes in the proposed framework to create a global resource tracking instance, named g_components in the proof of concept implementation, and has been shown to work in the example model provided in Section 2.3.3.
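The role of resource tracking in non-perfect maintenance can be illustrated by the spare selection rule of the location manager agent: prefer an unused spare, and otherwise fall back to a used component that is still functioning. The function pick_spare, the dictionary representation of components and the tie-breaking on hours in use are assumptions made for this sketch.

```python
def pick_spare(spares):
    """Remove and return the best available spare, or None if there is none."""
    candidates = [s for s in spares if s["operational"]]
    if not candidates:
        return None
    # New spares sort first (False < True); ties are broken on hours in use.
    best = min(candidates, key=lambda s: (not s["new"], s["hours_in_use"]))
    spares.remove(best)
    return best


spares = [
    {"id": 1, "new": False, "operational": True, "hours_in_use": 350},
    {"id": 2, "new": False, "operational": False, "hours_in_use": 410},
    {"id": 3, "new": True, "operational": True, "hours_in_use": 0},
]

first_pick = pick_spare(spares)   # the unused spare
second_pick = pick_spare(spares)  # a used but functioning component
third_pick = pick_spare(spares)   # only a failed component remains
print(first_pick["id"], second_pick["id"], third_pick)
```

Because the used component keeps its accumulated hours, fitting it does not restore the equipment unit to an as-good-as-new state, which is what makes the maintenance imperfect.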
Decision support systems need to be able to support the decision maker with accurate information, based on historical data on the equipment's previous operation, the actual status such as sensor signals and built-in tests, and expected trends or events that can be forecast with the use of simulation (Vin, A. H. C. Ng, et al. 2008). Given the proposed maintenance organization model discussed in the previous sections, a number of data requirements can be identified for this modeling approach to be applicable to an actual maintenance organization modeling scenario.
There is a conceivable risk that, with too much uncertainty going into the model, it will be perceived as behaving chaotically and thus become useless for deriving predicted outcomes. This could occur if too little is known about component reliability functions, effect of covariates, lead-times, expected repair-times, etc. The simulated outcomes of a model under these circumstances may exhibit significant variance. For example, there is no point in a simulation result that says little more than that the availability will be between 0% and 100%.