Low Carbon Logistics Optimization for Multi-depot CVRP with Backhauls-Model and Solution

CVRP (Capacitated Vehicle Routing Problems) is the integrated optimization of VRP and Bin Packing Problem (BPP), which has far-reaching practical significance, because only by taking both loading and routing into consideration can we make sure the delivery route is the most economic and the items are completely and reasonably loaded into the vehicles. In this paper, the CVRP with backhauls from multiple depots is addressed from the low carbon perspective. The problem calls for the minimization of the carbon emissions of a fleet of vehicles needed for the delivery of the items demanded by the clients. The overall problem, denoted as 2L-MDCVRPB, is NP-hard and it is very difficult to get a good performance solution in practice. We propose a quantum-behaved particle swarm optimization (QPSO) and exploration heuristic local search algorithm (EHLSA) in order to solve this model. In addition, three groups of computational experiments based on well-known benchmark instances are carried out to test the efficiency and effectiveness of the proposed model and algorithm, thereby demonstrating that the proposed method takes a short computing time to generate high quality solutions. For some instances, our algorithm can obtain new better solutions.


INTRODUCTION
It is an important, typical optimization problem of Vehicle Routing Problem (VRP), in which a fleet of vehicles will deliver the goods demanded by the customers at the minimized total carbon emissions. Considering the attributes of the fleet, traditional VRP has developed several variants, including CVRP (Capacitated Vehicle Routing Problems) which is the combination and optimization of VRP and Bin Packing Problem (BPP), HVRP (heterogeneous VRP) where customers are served by different types of vehicles [1][2][3][4]. Both VRP and bin BPP are important, typical combination optimization problems, and are frequently and independently studied in the past few decades. To resolve the VRP and BPP, numerous methodologies based on operations research and mathematical programming techniques or heuristic-based algorithms have been proposed in extant literature [5][6][7]. It mainly includes particle swarm optimization (PSO) algorithm, ant colony system (ACS) algorithm, tabu search (TS) algorithm and genetic algorithm (GA).
PSO algorithm is an intelligent optimization algorithm, proposed by Kennedy in 1995, which imitates the behavior of birds [8]. The algorithm initializes a group of random particles and uses the iterative method to make each particle find its own solution. The best position is close to the best particle in the group, so as to search the optimal solution. Ai and Kachitvichyanukul [9] proposed a PSO algorithm with multi-social structure, which is the first time to use PSO algorithm to solve vehicle routing problem with simultaneous delivery and pickup. Zhang et al. [10] also proposed a modified PSO algorithm, which used (m+1) dimensional vectors to represent a particle for the problem with m customers. In decoding, the particle is transformed into vehicle assignment matrix by scanning algorithm, and then the customer priority matrix of the same vehicle service is determined. The construction of vehicle path is based on these two matrices. This method helps to reduce the number of vehicles used. Goksal et al. [11] mentioned a hybrid algorithm based on PSO and VND for VRP. Although the calculation speed of PSO is fast, it is easy to precocious. In order to solve VRP, Montane and Galvao [12] proposed a TS algorithm with neighborhood structure composed of relocation, interchange, cross operation between paths and 2-opt operation in path. At the same time, a set of examples for solving VRP was given. Vural mentioned two GA algorithms, which are the first time to use GA to solve VRP [13]. Wade and Reimann used ACS to solve the VRP with backhauls (VRPB) [14,15]. Recently, quantum-behaved particle swarm optimization (QPSO) is improved on the basis of the PSO. It not only inherits the advantages of the example algorithm, but also has much better global search performance than particle algorithm and avoids premature convergence to the local optimal solution [16].
In addition, local search algorithm is also widely used in VRP. This algorithm is usually combined with other algorithms to solve VRP problems. Bianchessi and Righini [17] proposed to use variable neighborhood descent (VND) algorithm to solve VRP. They also combined TS algorithm with VND algorithm VRP. In the process of constructing neighborhood structure, both node exchange based and edge swapping based operations are adopted. Crispimand Brandão [18] explored a hybrid algorithm based on TS and VND to solve VRP. This is the first time that modern heuristic algorithm is used to solve VRP problem. Ropkeand Pisinger [19] represented the variants of vehicle routing problem with backhaul in a unified model, and used a large neighborhood search algorithm to solve them.
In recent years, some attention has been devoted to their combined optimization, called two-dimensional Capacitated Vehicle Routing Problem (2L-CVRP). In the 2L-CVRP clients demand sets of rectangular weighted items, while vehicles have a weight capacity and a rectangular two-dimensional loading surface. The aim is to load the items into the vehicles and deliver them to the clients, through a road network, with minimum total cost. In transportation it is often necessary to handle rectangularshaped items that cannot be stacked one on the top of the other because of their fragility, weight or large dimensions such as household appliances, delicate pieces of furniture, etc. Since Iori and Vigo proposed the 2L-CVRP, it has been explored extensively and solved by many heuristic algorithms [20][21][22][23][24][25][26][27][28][29]. For larger scale problems, Gendreau et al. [21] proposed a Tabu Search algorithm, in which the loading component of the problem was solved through heuristics, lower bounds and a truncated branch-and-bound procedure. Zachariadis et al. [22] proposed a metaheuristic algorithm which incorporated the rationale of Tabu Search and Guided Local Search based on paper, employing a guiding mechanism which could drastically diversify the search conducted by trying to eliminate low-quality features from the final solution to control the objective of the problem. For a population-based algorithm, the Ant Colony Optimization (ACO) was employed by Fuellerer, Doerner, Hartl, and Iori, and the performance was proven to be quite satisfactory [23]. The Greedy Randomized Adaptive Search Procedure combined with Evolutionary Local Search (GRASP×ELS) algorithm was proposed by Duhamel, Lacomme, Quilliot, and Toussaint, and this algorithm outperformed all previous methods and obtained new better solutions for several instances. However, only the Unrestricted 2L-CVRP problems were solved [24]. Recently, Zachariadis, Tarantilis, and Kiranoudis proposed an innovative compact meta-heuristic, named Promise Routing-Memory Packing (PRMP), which obtained excellent performance and improved Best Known Solutions for many instances [25]. Wei et al. [26] proposed a variable neighborhood search to address the routing aspect, and adopted a skyline heuristic to examine the loading constraints. The effectiveness of their approach was verified through experiments on widely used benchmark instances involving two distinct versions of loading constraints (unrestricted and sequential 2L-CVRP problem). According to the combination of the loading constraints, the 2L-CVRP mainly includes four variants, namely, 2L-Sequential Oriented Loading (2|SO|L), 2L-Sequential non-oriented (Rotated) Loading (2|SR|L), 2L-Unrestricted Oriented Loading (2|UO|L), and 2L-Unrestricted non-oriented (Rotated) Loading (2|UR|L) [22]. As far as we know, only Fuellerer et al. have studied all four variants [23].
In our paper, the 2L-CVRP is extended into a multidepot 2L-CVRP with backhauls (2L-MDCVRPB). Some of the depots both deliver and pickup all items, while some offer either delivery or pickup service. Hence, two situations were discussed: S1, all depots both deliver and pickup items; S2, all the items for each customer are delivered or picked up at only one depot. The items in this model need to satisfy the sequential constraint, which belongs to the 2|SR|L.
The remainder of this paper is organized as follows. The model development in the next section introduces the formulation notations, hypotheses and complete mathematical model of 2L-MDCVRPB. Section 3 introduces the QPSO algorithm for the solution of the MDCVRPB, and EHLSA algorithm is proposed to present the solution of the 2L loading problem. Section 4 introduces three groups of experiments based on benchmark instances including 2L-CVRP, 2L-CVRPB and 2L-MDCVRPB instances. In the last section, the key contributions of this research are discussed along with the related application suggestions.
is the set edges between the customer vertices and depot vertices. For simplicity, the set of all customers vertices V/D is denoted as V c . The length of edge ij equals the distance c ij between nodes i and j. The distance is a positive and has no directionality: c ij > 0, c ij = c ji and c ii = 0. There are m i > 0 items to be delivered or picked up for the customer i ( c i V ∈ ), and no demand from the depot, i.e. m i = 0, i D ∈ . The weight, length and width of item k for customer i are denoted as q ik , l ik and w ik (1 , respectively. Therefore, we can obtain the total Each depot l has a fleet of vehicles P l ( ) provide the distribution services. The vehicle p in depot l is denoted as l p P ∈ . For each vehicle, the carbon emissions factor per mileage is denoted as ρ; the maximum capacity is denoted as Q; the loading surface is denoted as S = L ⸱ W, with L being the length and W being the width.

With indices
represents the site of the bottom-left corner of item k to (from) customer i in vehicle p belongs to depot l. Drawing on Dominguez et al. [30], the feasible routes can be divided into linehaul routes and backhaul routes. Hence, a feasible route R pl can be viewed as the subset of Under the clustered visits assumptions, the route R pl was divided into two independent subsets, including linehaul vehicle p at depot l travels starting at node i and ending at node j; if 0 pl ij z = , then vehicle p at depot l travels starting at node j and ending at node i. Here, it is assumed that 0 The third decision variable ik θ was defined to show item k belonging to client i has been rotated ( )

Hypotheses
The 2L-MDCVRPB calls for a set of routes and the feasible packing solutions for all vehicles to operate with the minimum total carbon emissions under the following hypotheses: (1) Each route has the identified starting and ending depot, which can be the original depot or any other depot. In S1, every route must have the same starting and ending depot.
(2) To avoid split deliveries, each single vehicle can only serve to one customer.
(3) Each route should visit at least one linehaul customer.
(4) Each customer can only be visited once, i.e. all items of every customer must be placed in a single vehicle.
(5) Each vehicle should not be overloaded and the items loaded on it should not surpass the loading surface.
(6) All items belonging to every customer should totally be loaded on the vehicle's loading surface.
(7) When loading and unloading all items, they should be carried out from the rear of the vehicle, and only straight movement (one per item) is allowed.
(8) Each item has a fixed direction and can be rotated up to 90° at most. (9) The sides of each item being loaded must be parallel to those of the loading surface.
(10) Stacking or stowing items is not allowed in the vehicle.
(11) The items cannot be rearranged at the site of any customer.

Modelling
This paper presents the 2L-MDCVRPB model which includes problem definition, hypothesis and model construction. The graph theory description of the model is given in the problem definition, and some assumptions are put forward in order to make the model boundary clearer. Finally, the integer programming model of the problem is given and the meaning of each formula in the model is explained. The 2L-MDCVRPB problem can be formulated as: , , , , , { } 0,1 , , , , The objective function is (1), which means to minimize the total carbon emissions. Eq. (2) ensures that each vehicle leaves and returns to the same depot. Eq. (3) ensures that a linehaul customer is placed at the start of the route. Eq. (4) and Eq. (5) confirm that every customer is only visited once, and a vehicle leaves from the customer after the visit. Eq. (6) and Eq. (7)

DESIGN OF HYBRID HEURISTIC ALGORITHM
The proposed QPSO (Quantum-behaved Particle Swarm Optimization) and EHLSA (Exploration Heuristic Local Search Algorithm) both are hybrid algorithms for solving the 2L-MDCVRPB, in which QPSO algorithm is for routing vehicles and EHLSA algorithm is for loading all the items of the clients into the vehicle. Section 3.1 proposes the QPSO algorithm for the solution of the MDCVRPB, and it improves the solution in new decoding method and initial solution generation. In section 3.2, EHLSA algorithm is proposed to present the solution of the 2L loading problem.

Quantum-Behaved Particle Swarm Optimization
In the typical PSO algorithm, particle swarm is composed of particles, in which each particle is an independent individual. In the multidimensional search space at a certain speed, each particle's position is considered to be a better position of each problem particle. When the first iteration is carried out, the position of a particle is randomly generated and the particle's velocity is initialized to zero. In other iterations, the particle's position and speed are updated according to the following formulas: After studying the PSO algorithm convergence, Liu et al. mentioned that each particle must converge to the absorber C i of the particle, where the C i was expressed as follows [ where P i is the best particle between individuals, P g is the global best particle. According to the calculation formula of α, it shows that the role of local attractors is to attract particles randomly, and it is situated in a super rectangle with P i and P g at the two ends of the diagonal. Many represented the quantum state of the particle [27]. After that a global point was added into QPSO, which also represented Mainstream Thought or mean best position of the population [28]. The global point is the average of the Pbest positions, which is denoted as mbest. That is: The following formula can calculate the new position of a particle: , , where contraction-expansion coefficient β is used to control the convergence rate of the algorithm. There are four steps for the QPSO algorithm running. The first step is to initialize the position of particles randomly, and the P i is the initial position of particle i. In addition, the initial velocity of the particles is all zero. The second step is to get the value of the particles objective function through the decoding method. After that we can obtain the value of P i and P g of every particle using its objective function. P g is set to the best P i of all population. The last is to apply (26) and (28) to calculate the velocity and position of each particle respectively.
(1) Decoding method The existing QPSO model cannot solve the problem of 2L-MDCVRPB proposed in this paper. Therefore, this paper makes the following changes: (1) making sure the QPSO can handle the problem of picking up items with backhauls through 3D coding; (2) combining QPSO with EHLSA to solve the two dimensional loading problem. The final QPSO-EHLSA algorithm can solve the 2L-MDCVRPB problem. Assuming that there are multiple depots in the model, we design a new particle coding method to represent the relationship between the T depots, the v vehicles and the n client. This method uses a 3 n × matrix, where the first row X d shows the depots, the second row X v represents the vehicles, and the third row X t indicates the vehicles distances. For example, supposing there are two depots and ten customers. Depot1 has two vehicles and depot2 has three vehicles. Fig. 1 shows the results as follows.  Decoding example X t uses its value to adjust the routes instead of the distance, so before that, all clients who can be cut off by each vehicle must be determined, and then sorted according to the value of X t to determine every vehicle's route. This method can ensure that the client can only be severed once, thereby reducing the adjustment process of the feasible scheme. Fig. 2 shows the adjustment process. (2) Local search heuristics We will introduce the local search heuristics in this subsection. The 2-opt heuristic and inner-tour swap was an improved method, provided by Norouzi et al. [29].
The 2-opt heuristic is based on the idea of exchange, transforming one path into another, which is suitable for two trips; first, choosing a spot in each trip so that the two trips are divided into two parts. The next step is to connect the first part of the first trip to the second part of second trip. In the same way, the second part of the first trip add into the first part of second trip.
Inner-tour swap is different from 2-opt heuristic. It can be applied in a single trip. Above all, there are two clients assigned to the one trip, and then their sequences are replaced. We define two counts, including count 1 and count 2 in order to use this local search set. Count 1 is from 1 to trip length-1 that the length is the same as the number of clients in the selected tour, and count 2 is from count 1 to trip length. Next is to exchange the sequence of count 1st client and count 2nd client. If the result of the exchange reduces the cost, the change is accepted and will continue. Otherwise, the change is rejected and another pair of customers is selected for exchange through the count 1 and count 2.

Exploration Heuristic Search Algorithm
On the basis of QPSO algorithm in Section 3.1, the tours can be obtained. In addition, the clients visited by the vehicles and the vehicles in the routes can be determined. The vehicles with relevant constraints need to load all items required by the clients. Actually, there are two aspects considered in the packing algorithm. One is to identify the next loading project, the other is to identify the possible loading location. Therefore, the two methods will be provided.
(1) Determine the items order Suppose that the order of all clients' visits in a certain tour is fixed. If a client's items are being unloaded, other items in this vehicle cannot be moved. As a result, OV presents that items are sorted in descending visit order in the vehicle. After that the items order of each customer is fixed, while the order of goods belonging to the same customer is not fixed. It should be noted that the unrestricted model does not consider visit order to sort items.
There are two orders, O1 and O2, set to identify the eventual orders of one client's items in the sequential model or all clients' items in the unrestricted model. O1 and O2 are arranged in descending order according to the area l ⸱ w and length l, respectively. If there are the same bottom areas in the two items, O2 is prior to O1. Otherwise, O1 is prior to O2. Through the rule above, the loading sequence of items can be determined as follows: It should be noted that the coordinates at the front left represent the location of the item. For example, item A is located at point O with the length of 3 and the width of 2. The space occupied is a square with four points, which is point (0, 0), (0, 3), (2, 0) and (2,3). After following the rule that items should be close to the edge of the vehicle or other items, new locations list involves ( ) Fig. 3.
After that, in order to reduce the waste of the space, the most ideal location of all current items loading position should be selected in all feasible positions. We list all rules about sorting the feasible loading positions.
(1) when the fitness is 1, it means that the length and width of the item are just suitable for the feasible loading place of the vehicle provided, as shown in Fig. 4a.
(2) when the fitness is 2, it means that the width of the item is exactly equal to the feasible loading place, but for length, the item is smaller, as shown in Fig. 4b.
(3) when the fitness is 3, it shows that the length of the item is exactly equal to the feasible loading place, but for width, the item is smaller, as shown in Fig. 4c.
(4) when the fitness is 4, it shows both for width and length, the item is smaller, as shown in Fig. 4d.
(5) When the fitness is infinite, both for length and width, the item is greater than the feasible loading position.

EXPERIMENTAL VERIFICATION 4.1 Experiments on 2L-CVRP and 2L-CVRPB Instances
Our EHLSA -QPSO algorithm was tested with the 2L-CVRP instance proposed by Iori et al. [20], and the 2L-CVRPB instance introduced by Dominguez et al. [30]. In fact, the 2L-MDCVRPB can be viewed as the extension of the 2L-CVRP and 2L-CVRPB. The 2L-CVRP instance is an especial case of 2L-CVRPB where there are only linehaul customers, and the 2L-CVRPB instance is a special case of 2L-MDCVRPB with a single depot. The experiment was conducted under the loading constraints of Sequential non-oriented (Rotated) Loading (2|SR|L) and Sequential Oriented Loading (2|SO|L).
According to the results on the sequential oriented variant of the 2L-CVRP (2|SO|L), our EHLSA -QPSO algorithm converged to the Best Known Solution (BKS) faster than the contrastive algorithms. The Variable Neighborhood Search (VNS) consumed 3 times the duration, the Ant Colony Optimization (ACO) took 2.2 times, and the Biased-Randomized Large Neighbourhood Search (BR-LNS) took 4.3 times.
Next, the 2L-VRP instances were extended into new instances for the 2L-CVRPB. The extension follows the way Toth and Vigo generated VRPB instances from the classic Euclidean VRP. From the 180 2L-CVRP instances in the original sets, 540 2L-CVRPB instances were created totally, and divided into three groups by the percentage of linehaul customers (50%, 60% and 80%). 60 instances (20 in each group) were selected for the numerical experiment.
Through the experimental results, we can see that our algorithm converged to lots of best solutions for many instances. For the 60 test instances, 53 new best solutions were identified correctly. The mean optimized gap was 1.22% for the first group, 2.20% for the second group, and 3.32% for the third group. In addition, the EHLSA -QPSO algorithm outperformed the BR-LNS, which has been applied to solve the 2L-CVRPB in only one paper.

Experiments on 2L-MDCVRPB Instances
By adding a depot, the 60 selected 2L-CVRPB instances were converted to two-depot instances (2L-MDCVRPB instances), and used to verify the effect of our algorithm. The second depot was placed at a random point between the first depot and the center of customers. The numerical experiments were divided into two groups, based on whether the items should be delivered to the designated depot. In the first group, a depot was added to the 2L-CVRPB instances; in the second group, the designated depot was selected randomly after adding a depot.
The results show that the total distance of the first group was shorter than that of the second group. This is because some customers are served by the new depot nearby, eliminating the need to travel from/to the single depot. However, when the item must be sent to the designated depot, the customer is served by the designated depot rather than the nearby depot. Therefore, the mean distance in S2 was longer than that in S1. The mean computing time in S2 was also relatively long, due to the need to check the fulfilment of the constraint on the designated depot.

CONCLUSIONS
This paper probes deep into the 2L-MDCVRPB and designs an improved hybrid algorithm EHLSA -QPSO to solve the problem. As far as we know, this is the first attempt to tackle the realistic extension of the VRP with two-dimensional capacity. Moreover, item rotation is allowed in vehicle loading, which is a realistic but rarely reported assumption.
After exploring various loading constraints and problem variants, the author concluded that our EHLSA -QPSO algorithm takes a short computing time to generate high quality solutions. It has been proved that particle swarm optimization is a very efficient approach for the 2L-MDCVRPB, and the quantum-behaved can help particle swarm optimization to escape effectively from local optimum, and it can be applied to the similar problem in other areas. For the selected instances, our algorithm can find the BKS with much shorter time than contrastive algorithms. The research findings provide new insights into the 2L-MDCVRPB, and effective tool to solve the 2L-CVRP or 2L-CVRPB.
According to the results of the experiments on 2L-CVRP, 2L-CVRPB and 2L-MDCVRPB instances, the proposed algorithm in this paper has achieved better performance in both the calculation results and the calculation speed. This paper solves a scientific problem extracted from the practice of logistics distribution optimization. The main contribution is to establish a mathematical model to solve this scientific problem. Therefore, there are not many numerical experiments that have been carried out, which is the limitation of this research. In the future, we will improve and design a new algorithm with higher efficiency to solve this problem with a large number of numerical experiments.