I Introduction
With the rapid development of electronics and wireless networks, modern mobile devices (MDs) support a wide range of services. However, most real-time applications require substantial computation from the MDs. Mobile edge computing (MEC) is a promising technology to resolve this dilemma in next-generation wireless networks. In MEC, edge servers, which connect mobile terminals to a cloud server, provide high storage capacity as well as fast computation.
In a MEC architecture, both latency and energy consumption contribute to the network performance, and it is of common interest to balance these factors with optimized offloading policies. In the open literature, the authors in [1] proposed to offload tasks from a single MD to multiple computational access points (CAPs), and a weighted sum of energy and latency was minimized using convex optimization. This problem was recently extended in [2] to a scenario where multiple MDs perform offloading; the tasks were scheduled based on a queuing state in order to adapt to channel variations. Alternatively, in [3], the latency was minimized by scheduling the MEC offloading, while the energy consumption of the MDs was treated as an individual constraint.
Recently, machine learning (ML) has attracted much attention from both academia and industry, as an efficient tool to solve traditional problems in wireless communication [4]-[7]. Specifically, the authors in [4] proposed a payoff game framework to maximize the network performance through reinforcement learning. Furthermore, a deep Q-network was utilized in [5] to optimize the computational offloading without a priori knowledge of the network. Most of these methods, including those using deep neural networks (DNNs), focus on the offloading design from a perspective of long-term optimization, at the cost of complexity and robustness [6], [7]. Moreover, these methods can hardly track fast channel changes, due to their requirement of offline learning. Thus, they generally cannot be applied to real-time applications over time-varying channels, and it remains a problem of common interest to optimize offloading policies with a time-efficient method that simultaneously ensures high-quality performance.

In this work, we introduce the cross entropy (CE) approach to solve the offloading association problem, by generating multiple samples and learning the probability distribution of the elite samples. In contrast to conventional algorithms, the proposed CE learning approach can use a parallel computing architecture to reduce computational complexity, and its online learning architecture suits short-term offloading with stringent real-time requirements. Our work generalizes the CE learning approach to solve the offloading problem with low complexity. The proposed approach is promising since it can effectively replace traditional convex optimization tools.

II System Model
We consider the problem of multi-task offloading in a network with multiple CAPs, where the MD has access to all the CAPs. Each task can either be executed at the local MD or offloaded to a CAP, while a CAP serves only one task at a time. The index '0' represents the local CPU, and $\mathcal{N}=\{1,\ldots,N\}$ and $\mathcal{M}=\{0,1,\ldots,M\}$ are defined as the sets of tasks and processors (the local CPU and the $M$ CAPs), respectively. In order to indicate the offloading status, we define the policy
\[
x_{n,m}=\begin{cases}1, & \text{if task } n \text{ is executed on processor } m,\\ 0, & \text{otherwise,}\end{cases}\tag{1}
\]
and the matrix $\mathbf{X}=[x_{n,m}]_{N\times(M+1)}$ collects all the indicators. We assume that each task is executed on exactly one processor, so that
\[
\sum_{m\in\mathcal{M}} x_{n,m}=1,\qquad \forall n\in\mathcal{N}.\tag{2}
\]
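As a small illustration, the policy matrix and the feasibility constraint (2) can be sketched in Python. The list-of-lists representation and the helper name are illustrative choices, not part of the paper.

```python
# Sketch of the offloading policy in (1)-(2): x[n][m] = 1 iff task n runs
# on processor m, where column 0 is the local CPU and columns 1..M are CAPs.

def is_feasible(x):
    """Check constraint (2): each task is assigned to exactly one processor."""
    return all(sum(row) == 1 for row in x)

# Example: task 0 runs locally, task 1 offloads to CAP 2 (two CAPs + local CPU).
policy = [[1, 0, 0],
          [0, 0, 1]]
```

A policy violating (2), e.g. a task assigned twice, is rejected by the same check.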
II-A Latency
The execution latency consists of two components: transmission latency and computation time. The transmission time includes task data preparation at the MD, data transmission over the air, and received-data processing at the CAP before computation. The transmission time also depends on the achievable rate of the physical links. The uplink and downlink data rates of CAP $m$ can be written as
\[
r_m^{u}=B\log_2\!\Big(1+\frac{P^{t}h_m}{\sigma^2}\Big),\qquad
r_m^{d}=B\log_2\!\Big(1+\frac{P^{r}h_m}{\sigma^2}\Big),\tag{3}
\]
where $B$ is the channel bandwidth, $\sigma^2$ is the noise power, $P^{t}$ ($P^{r}$) is the transmitting (receiving) power, and $h_m$ is the channel gain between CAP $m$ and the MD. For the local CPU ($m=0$), $r_0^{u}$ and $r_0^{d}$ are set infinitely large, because computing at the local CPU leaves out the process of offloading. Let $d_n^{in}$ denote the input data size of task $n$ in bits, $c_n$ the computation load (number of CPU cycles required), and $d_n^{out}$ the output data size after computation. Then, for the offloaded task $n$, the computation time and the uplink and downlink transmission times are
\[
t_{n,m}^{c}=\frac{c_n}{f_m},\qquad
t_{n,m}^{u}=\frac{d_n^{in}}{r_m^{u}},\qquad
t_{n,m}^{d}=\frac{d_n^{out}}{r_m^{d}},\tag{4}
\]
where CAP $m$ serves the tasks with a fixed rate of $f_m$ cycles/sec.
In fact, a CAP can start computing either after one scheduled task is offloaded or after all of them are. Here, we consider computation that starts as soon as one task offloading completes. In this case, there is no intra-CAP overlap when evaluating the overall latency. This latency expression is simple, but the proposed algorithm remains effective for other, more general expressions. The three steps, offloading, computing and transmitting back, take place sequentially, which results in the overall latency at CAP $m$
\[
T_m=\sum_{n\in\mathcal{N}} x_{n,m}\left(t_{n,m}^{u}+t_{n,m}^{c}+t_{n,m}^{d}\right).\tag{5}
\]
Note that since all the CAPs execute their tasks in parallel, the overall delay is the maximum over the processors, given as
\[
T=\max_{m\in\mathcal{M}} T_m.\tag{6}
\]
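The latency model in (4)-(6) can be sketched as follows, assuming per-task parameters `d_in` (bits), `c` (CPU cycles) and `d_out` (bits), per-processor CPU rates `f[m]` (cycles/sec), and link rates `r_u[m]`, `r_d[m]` (bits/sec); all names are illustrative. The infinite link rates for the local CPU mirror the convention above.

```python
# Sketch of the latency model (4)-(6). Processor 0 is the local CPU, whose
# link rates are treated as infinite (no offloading step).

INF = float("inf")

def cap_latency(x, d_in, c, d_out, f, r_u, r_d, m):
    """Overall latency (5) at processor m: offload, compute and return
    happen sequentially per task, so the per-task delays add up."""
    total = 0.0
    for n in range(len(x)):
        if x[n][m]:
            total += d_in[n] / r_u[m] + c[n] / f[m] + d_out[n] / r_d[m]
    return total

def overall_latency(x, d_in, c, d_out, f, r_u, r_d):
    """Overall delay (6): processors run in parallel, so take the maximum."""
    return max(cap_latency(x, d_in, c, d_out, f, r_u, r_d, m)
               for m in range(len(f)))
```

For example, a task computed locally contributes only its computation time, while an offloaded task adds uplink and downlink terms.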
II-B Energy Consumption
The MD consumes battery energy either to compute the tasks locally or to transmit and receive the task data. The energy consumption in the two cases can be written as
\[
E^{l}=\sum_{n\in\mathcal{N}} x_{n,0}\, e_n^{c},\tag{7}
\]
\[
E^{o}=\sum_{n\in\mathcal{N}}\sum_{m=1}^{M} x_{n,m}\left(P^{t}t_{n,m}^{u}+P^{r}t_{n,m}^{d}\right),\tag{8}
\]
where $e_n^{c}$ denotes the energy for computing task $n$ locally. Then, the total energy consumption is
\[
E=E^{l}+E^{o}.\tag{9}
\]
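A matching sketch of the energy model (7)-(9), assuming the MD spends `e_loc[n]` joules computing task n locally and draws `P_t` / `P_r` watts while transmitting / receiving at rates `r_u[m]` / `r_d[m]` (bits/sec); these parameter names are illustrative.

```python
# Sketch of the MD energy model (7)-(9): local computation energy for
# tasks on processor 0, transmit + receive energy for offloaded tasks.

def total_energy(x, d_in, d_out, e_loc, P_t, P_r, r_u, r_d):
    """Total MD energy (9) = local part (7) + offloading part (8)."""
    E = 0.0
    for n in range(len(x)):
        if x[n][0]:                      # (7): task n computed locally
            E += e_loc[n]
        else:                            # (8): task n offloaded to CAP m
            m = x[n].index(1)
            E += P_t * d_in[n] / r_u[m] + P_r * d_out[n] / r_d[m]
    return E
```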
II-C Optimization Problem
Low computational latency and low energy consumption are the two main objectives of MEC. Unfortunately, these objectives cannot be minimized simultaneously, and the problem turns out to be a multi-objective optimization. We introduce the weights $\beta_t$ and $\beta_e$ to trade off the two objectives. The weighted objective can then be defined as [1]
\[
\eta(\mathbf{X})=\beta_t\, T+\beta_e\, E,\tag{10}
\]
where $T$ is defined in (6) as the maximum delay over all the processors, rather than the sum or the average.
We aim to solve the computation resource allocation under the specific situation where $\beta_t$ and $\beta_e$ are fixed. The joint minimization of both energy and latency can be formulated as
\[
\min_{\mathbf{X}}\;\eta(\mathbf{X})\quad \text{s.t.}\;\; x_{n,m}\in\{0,1\},\;\; \sum_{m\in\mathcal{M}}x_{n,m}=1,\;\forall n\in\mathcal{N}.\tag{11}
\]
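For very small instances, (11) can be solved exactly by exhaustive enumeration, which is useful as a ground-truth baseline. The sketch below assumes a caller-supplied `objective(x)` implementing (10); the exponential number of candidates, $(M+1)^N$, is exactly why BnB and learning methods are needed at scale.

```python
# A minimal exhaustive baseline for (11): enumerate all (M+1)^N feasible
# policies (each task picks one processor, so (2) holds by construction)
# and keep the one with the smallest objective.

from itertools import product

def brute_force(N, M, objective):
    best_x, best_val = None, float("inf")
    for assign in product(range(M + 1), repeat=N):   # processor of each task
        x = [[1 if m == assign[n] else 0 for m in range(M + 1)]
             for n in range(N)]
        val = objective(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val
```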
III Offloading Learning through Cross Entropy
The problem in (11) is a binary integer program, which can be optimally solved via the branch-and-bound (BnB) algorithm, albeit with exponentially large computational complexity, especially when $\mathbf{X}$ is large [7]. In future wireless networks, the number of tasks will increase and more CAPs will be involved; the BnB algorithm can then hardly satisfy the requirements of real-time applications. Besides, there are studies that attempt to solve the problem with conventional optimization methods. The most popular solution is convex relaxation, e.g., relaxing $x_{n,m}\in\{0,1\}$ to $0\le x_{n,m}\le 1$ through linear programming relaxation (LPr), or applying semidefinite relaxation (SDR) [1]. The relaxation, however, causes performance degradation compared to the BnB algorithm.
Besides the above methods, the problem in (11), with its discrete optimization variables, can be solved by a probabilistic model-based method that learns the probability of each policy $\mathbf{X}$. The CE approach is such a probability learning technique from the ML area [9], [10]. To solve (11), we propose a CE approach with adaptive sampling, namely the adaptive sampling cross entropy (ASCE) algorithm.
III-A The CE Concept
Cross entropy is closely related to the Kullback-Leibler (KL) divergence, which serves as a metric of the distance between two probability distributions. For two distributions $q$ and $f$, the CE is defined as
\[
H(q,f)=-\sum_{\mathbf{x}} q(\mathbf{x})\ln f(\mathbf{x}).\tag{12}
\]
Note that in our proposed CE-based learning method, $f$ represents a theoretically tractable distribution model that we try to learn for obtaining the optimal solutions, while $q$ is the empirical distribution which characterizes the true distribution of the optimal solutions. In machine learning, the distribution $q$ is known from observations, and the CE satisfies $H(q,f)=H(q)+D_{\mathrm{KL}}(q\|f)$, where $H(q)$ is the entropy of $q$; since $H(q)$ is fixed, minimizing the CE in (12) is equivalent to minimizing $D_{\mathrm{KL}}(q\|f)$.
Inspired by this definition of CE, a popular cost function in machine learning, we solve problem (11) via probability learning: we learn $f$ by iteratively training on samples, and then generate the optimal policy $\mathbf{X}$ according to $f$, which is close to the empirical distribution $q$.
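The identity behind this equivalence is easy to check numerically. The sketch below is a generic illustration over finite distributions (the function names are ours, not the paper's): the cross entropy $H(q,f)$ equals the entropy $H(q)$ plus the KL divergence $D_{\mathrm{KL}}(q\|f)$, so minimizing one over $f$ minimizes the other.

```python
# Numerical check of H(q, f) = H(q) + D_KL(q || f) for finite distributions.

import math

def cross_entropy(q, f):
    return -sum(qi * math.log(fi) for qi, fi in zip(q, f) if qi > 0)

def entropy(q):
    return -sum(qi * math.log(qi) for qi in q if qi > 0)

def kl(q, f):
    return sum(qi * math.log(qi / fi) for qi, fi in zip(q, f) if qi > 0)
```

With $f=q$ the KL term vanishes and the CE reduces to the entropy, its minimum.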
III-B The ASCE-based Offload Learning
For probability learning, the probability distribution function $f$ is usually parameterized by an indicator $\mathbf{p}$; e.g., $f$ can be a Gaussian distribution whose indicator contains its mean and variance [11]. Denoting $L=N(M+1)$, the indicator here is a vector of $L$ dimensions, defined as $\mathbf{p}=[p_1,\ldots,p_L]$, where $p_j$ represents the probability of $x_j=1$. With this method, we learn $f$ by learning its parameter $\mathbf{p}$. Accordingly, $\mathbf{X}$ is vectorized as $\mathbf{x}=[x_1,\ldots,x_L]$. Following the Bernoulli distribution, we have the distribution function as [13]
\[
f(\mathbf{x};\mathbf{p})=\prod_{j=1}^{L} p_j^{x_j}\left(1-p_j\right)^{1-x_j}.\tag{13}
\]
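The product-Bernoulli model in (13) can be sketched in a few lines; the function name is an illustrative choice.

```python
# Sketch of the product-Bernoulli model (13): each entry x_j of the
# vectorized policy is drawn independently with P(x_j = 1) = p_j, so the
# probability of a whole sample factorizes over the entries.

def bernoulli_pmf(x, p):
    prob = 1.0
    for xj, pj in zip(x, p):
        prob *= pj if xj == 1 else (1.0 - pj)
    return prob
```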
According to (2), one task is associated with at most one processor; thus, once a task is assigned to one CAP, its probability of being associated with any other CAP becomes zero. To reduce the redundancy of generated samples, we divide one sample, i.e., a vector of $L$ dimensions, into $M+1$ independent blocks $\mathbf{x}_0,\mathbf{x}_1,\ldots,\mathbf{x}_M$, each associated with one processor; e.g., the feasible block $\mathbf{x}_m$ indicates the assignment of the $N$ tasks to CAP $m$. Let $\mathcal{B}$ denote the set of indices of the blocks already selected in sampling, and let $\mathcal{S}$ be the set that stores the samples satisfying the constraints in each iteration. We first uniformly choose an index $m$ among the unselected blocks. To generate $\mathbf{x}_m$, we draw its entries according to the probability density function $f$: each entry is drawn according to the Bernoulli distribution with the corresponding parameter $p_j$. The indicator of the remaining blocks is then adjusted based on $\mathbf{x}_m$; that is, if a task is assigned in $\mathbf{x}_m$, its probability entries in the remaining blocks are set to zero. When the cardinality of $\mathcal{B}$ reaches $M+1$, one valid sample is generated. Note that in this way the non-feasible samples are excluded during sampling. All the valid samples are gathered in $\mathcal{S}$, and the sampling repeats until the cardinality of $\mathcal{S}$ reaches the sample size $S$.

In the proposed CE approach, the computations within each iteration can be conducted in parallel, while the iterations are implemented sequentially. As will be shown later in the simulation results, the hyper-parameters of the proposed algorithm, including $S$, can be adjusted to trade off the amount of parallel computation per iteration against the number of iterations needed for convergence. This enables a flexible trade-off between performance and latency.
Algorithm 1: ASCE-based Offload Learning Algorithm

1  Initialize: the indicator $\mathbf{p}^{(0)}$, the sample size $S$ and the number of elites $S_{elite}$.
2  for $t=1,2,\ldots$
3    while $|\mathcal{S}|<S$
4      Select an index $m$ uniformly among the unselected blocks;
5      Generate the entries of $\mathbf{x}_m$ based on $\mathbf{p}^{(t-1)}$ and update $\mathcal{B}$, $\mathcal{S}$;
6      Adjust the indicator of the remaining blocks based on $\mathbf{x}_m$;
7    end while
8    Calculate the objective $\eta$ of each sample in $\mathcal{S}$;
9    Sort the samples by their objectives;
10   Select the $S_{elite}$ samples with the minimum objectives as elites;
11   Update $\mathbf{p}^{(t)}$ according to (17);
12  end for
13  Output: the learned offloading policy $\mathbf{X}^{*}$.
Now we take the CE in (12) as the loss function: the smaller the CE is, the smaller the distance between $q$ and $f$ is. This implies
\[
\mathbf{p}^{*}=\arg\min_{\mathbf{p}}\;-\frac{1}{|\mathcal{S}_{elite}|}\sum_{\mathbf{x}\in\mathcal{S}_{elite}}\ln f(\mathbf{x};\mathbf{p}),\tag{14}
\]
where the empirical distribution takes the value $q(\mathbf{x})=1/|\mathcal{S}_{elite}|$, since each independent solution in the elite set $\mathcal{S}_{elite}$ is equally probable, with $|\mathcal{S}_{elite}|$ the cardinality of the set [9]. Regarding the problem in (14), the objective is equivalent to finding the optimal indicator $\mathbf{p}$ minimizing the CE. During the $t$-th iteration, a series of random samples, serving as candidates, are drawn according to the probability $f(\mathbf{x};\mathbf{p}^{(t-1)})$. The feasible samples generated by the adaptive sampling are then evaluated: we compute the objective of (11) for each sample and sort the samples in ascending order of their objectives.
Then, the $S_{elite}$ samples $\mathbf{x}^{[1]},\ldots,\mathbf{x}^{[S_{elite}]}$ yielding the minimum objectives are selected as elites. Now, the best indicator for the policy can be determined as
\[
\mathbf{p}^{*}=\arg\max_{\mathbf{p}}\;\sum_{s=1}^{S_{elite}}\ln f\big(\mathbf{x}^{[s]};\mathbf{p}\big),\tag{15}
\]
Using (13) and (15) and forcing the derivative with respect to each $p_j$ to zero, the saddle point can be evaluated as
\[
p_j^{*}=\frac{1}{S_{elite}}\sum_{s=1}^{S_{elite}} x_j^{[s]},\qquad j=1,\ldots,L.\tag{16}
\]
In the proposed learning algorithm, we choose the CE-based metric for updating the probability. Considering the randomness of sampling, especially when the number of samples is small, we update $\mathbf{p}$ in the $t$-th iteration not only on the basis of $\mathbf{p}^{*}$ obtained from (15) and (16), but also on the indicator learned in the previous iteration. It follows that
\[
\mathbf{p}^{(t)}=\alpha\,\mathbf{p}^{*}+(1-\alpha)\,\mathbf{p}^{(t-1)},\tag{17}
\]
where $\alpha\in(0,1]$ is the learning rate [10]. In general, the iterations of the CE-based method converge to an optimized solution of the problem [14].
The proposed algorithm is summarized in Algorithm 1. The CE approach, combined with the indicator updating mechanism, can replace conventional convex optimization methods and balance complexity against performance.
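To make the whole loop concrete, the following is a self-contained sketch of the CE iteration on a toy instance. For brevity, the block-wise adaptive sampling is replaced here by drawing each task's processor directly from the (normalized) row of the indicator, which also yields only feasible samples under (2); the generic `objective(x)`, the instance size and the hyper-parameter values are illustrative assumptions, not the paper's settings.

```python
# Sketch of Algorithm 1: sample S feasible policies, keep the S_elite best,
# re-estimate the indicator as the elite mean (16), smooth it with (17).

import random

def asce(objective, N, M, S=50, S_elite=10, iters=30, alpha=0.8, seed=0):
    rng = random.Random(seed)
    p = [[1.0 / (M + 1)] * (M + 1) for _ in range(N)]  # uniform start
    best_x, best_val = None, float("inf")
    for _ in range(iters):
        samples = []
        for _ in range(S):                  # feasible sampling under (2)
            x = []
            for n in range(N):
                # draw the processor of task n from the normalized row p[n]
                r, acc, m_sel = rng.random(), 0.0, M
                tot = sum(p[n])
                for m in range(M + 1):
                    acc += p[n][m] / tot
                    if r < acc:
                        m_sel = m
                        break
                x.append([1 if m == m_sel else 0 for m in range(M + 1)])
            samples.append((objective(x), x))
        samples.sort(key=lambda t: t[0])    # step 9: sort by objective
        elites = [x for _, x in samples[:S_elite]]
        if samples[0][0] < best_val:
            best_val, best_x = samples[0]
        for n in range(N):                  # (16)-(17): elite mean + smoothing
            for m in range(M + 1):
                mean = sum(e[n][m] for e in elites) / S_elite
                p[n][m] = alpha * mean + (1 - alpha) * p[n][m]
    return best_x, best_val
```

On a two-task, one-CAP toy objective this recovers the exhaustive-search optimum; the per-sample draws within an iteration are independent and could run in parallel, as the text notes.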
IV Simulations and Discussion
This section validates the efficiency of the proposed approach through simulations, using the same parameters as in [1]: the CPU frequency of the MD, the transmit and receive powers, the CPU frequencies of the three CAPs, and the uplink and downlink data rates are all taken from [1]. The average objective in the figures is the value of the objective in (10) averaged over a number of independent trials.
Fig. 1 shows the convergence of the proposed ASCE algorithm under various choices of its hyper-parameters. It is evident that the algorithm converges quickly, and that the average objective decreases as the sample size grows, approaching the optimum. Moreover, the average objective converges to almost the same optimal value for all the considered hyper-parameter choices. We therefore conclude that the proposed ASCE algorithm is robust to the choice of its parameters.
In Fig. 2, we compare the proposed ASCE algorithm with the LPr-based offloading algorithm in [1], BnB [7], No MEC and Full MEC, where No MEC and Full MEC mean that all the tasks are assigned to the local CPU and to CAP 1, respectively. The proposed ASCE algorithm greatly outperforms the LPr method, and it approaches the theoretically optimal solution obtained by BnB. Contrasting "Full MEC" with "No MEC", the latter is far inferior, which implies that an MD with multiple tasks can work efficiently with the assistance of MEC. Following [12], the complexity of the CE approach is far lower than that of the BnB algorithm, because the CE method with its parallel architecture optimizes the parameters within one iteration, while BnB solves for the parameters sequentially. Besides, the BnB algorithm requires much more memory for storage.
To a certain extent, the offloading policy is a trade-off between latency and energy consumption. We sweep the weights by increasing $\beta_t$ with a fixed step size, with $\beta_e$ set accordingly. As the latency plays an increasing role in the objective function, the corresponding curve presents an increasing trend. When only one CAP serves the MD, the minimized latency is much higher than in the cases with multiple CAPs. Because the minimized energy reduces to the energy consumption of all the tasks computed locally, which is the same for all configurations, the energy curve finally decreases to this minimum.
V Conclusion
In this paper, we presented an efficient computational offloading approach for a multi-tier HetMEC network. We proposed the ASCE algorithm, which occupies less memory and has lower computational complexity than traditional algorithms. The proposed algorithm performs robustly while closely approaching the optimal performance.
References
 [1] T. Q. Dinh et al., “Offloading in mobile edge computing: Task allocation and computational frequency scaling,” IEEE Trans. Commun., vol. 65, no. 8, pp. 3571–3584, Aug. 2017.
 [2] X. Lyu et al., "Energy-efficient admission of delay-sensitive tasks for mobile edge computing," IEEE Trans. Commun., vol. 66, no. 6, pp. 2603–2616, Jun. 2018.
 [3] J. Liu et al., "Delay-optimal computation task scheduling for mobile-edge computing systems," in Proc. IEEE ISIT, Barcelona, Spain, Jul. 2016.
 [4] T. Q. Dinh et al., “Learning for computation offloading in mobile edge computing,” IEEE Trans. Commun., vol. 66, no. 12, pp. 6353–6367, Dec. 2018.
 [5] X. Chen et al., “Optimized computation offloading performance in virtual edge computing systems via deep reinforcement learning,” IEEE IoT J., vol. 6, no. 3, Jun. 2019.
 [6] C. Lu et al., “MIMO channel information feedback using deep recurrent network,” IEEE Commun. Lett., vol. 23, no. 1, pp. 188–191, Jan. 2019.

 [7] C. Lu et al., "Bit-level optimized neural network for multi-antenna channel quantization," IEEE Wireless Commun. Lett., early access, 2019.
 [8] P. M. Narendra and K. Fukunaga, "A branch and bound algorithm for feature subset selection," IEEE Trans. Comput., vol. C-26, no. 9, pp. 917–922, Sept. 1977.
 [9] X. Huang et al., “Learning oriented crossentropy approach in loadbalanced HetNet,” IEEE Wireless Commun. Lett., vol. 7, no. 6, pp. 1014–1017, Dec. 2018.
 [10] P.-T. de Boer et al., "A tutorial on the cross-entropy method," Annals of Operations Research, vol. 134, no. 1, pp. 19–67, 2005.
 [11] M. Kovaleva et al., “Crossentropy method for electromagnetic optimization with constraints and mixed variables,” IEEE Trans. Antennas Propag., vol. 65, no. 10, pp. 5532–5540, Oct. 2017.
 [12] D. MacKay, Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.
 [13] A. Vinciarelli, "Role recognition in broadcast news using Bernoulli distributions," in Proc. IEEE ICME, Beijing, China, Jul. 2007.
 [14] S. Mannor et al., "The cross entropy method for fast policy search," in Proc. ICML, Washington, DC, Aug. 2003.