Performance Improvement of SIMD Processor for High-Speed End Devices in IoT Operation Based on Reversible Logic with Hybrid Adder Configuration

Reversible logic is gaining significant attention as a logic design style for modern nano and quantum computing because of its minimal impact on physical entropy. Recent advances in reversible logic allow computer design applications to use advanced quantum computing algorithms. In the literature, significant contributions have been made towards reversible logic gate structures and arithmetic units. However, there have not been many attempts to address the design of Single Instruction-Multiple Data (SIMD) processors. In this research work, a novel programmable reversible logic gate design is verified, and a reversible processor design is proposed for implementing a SIMD processor. Ripple-carry, carry-select and Kogge-Stone carry look-ahead adders are then implemented using reversible logic, and their performance is compared. The proposed reversible logic-based architecture has a minimum fan-out with a binary tree structure and minimum logic depth. The simulation results of the proposed design are obtained from Xilinx 14.5 software. From the simulated results, the computational path net delay for the 16 × 16 reversible logic with Kogge-Stone adder is 17.247 ns. Compared with a conventional 16-bit Kogge-Stone adder, the reversible logic-based 16-bit Kogge-Stone adder gives lower power and lower time delay. The effectiveness of the recommended hybrid adder, including its area use, is evaluated in terms of speed, energy and area, with attention to fast applications that require two adders with small delay and low power.


INTRODUCTION
The Internet of Things (IoT) encompasses a wide range of sensors, computation, and networking that augment our physical world. As a result, it is shaping up to be one of the most influential technologies in the modern world. IoT is affecting areas ranging from agriculture [1][2][3] and medical services [4] to manufacturing [5] by enabling pervasive data collection to support further analysis. One of IoT's key components is the sensor node, typically a small, low-power device with various sensors, wireless communication, and a modest amount of computation. As these nodes are often placed at locations without significant networking or power infrastructure, such as farms and forests, many sensor nodes are designed to be highly resource-efficient so that they can operate autonomously for long periods. For example, they may work for a long time on a small battery [6,7] and communicate through slow, low-power communication networks [8,9].
While both the data collection capabilities of IoT devices and the analysis capacity for the collected data are growing, the performance and power efficiency of wireless communication have not scaled as much because of physical constraints. Consequently, the data collection capacity of low-power IoT devices is often limited by the supported data rate and power consumption of the wireless communication module [10][11]. Network technologies for IoT devices offer a wide spectrum of choices, ranging from fast and power-hungry to slow and power-efficient. One prominent way to address the communication issue is edge mining, where the IoT nodes themselves perform some computation to reduce the amount of data to be sent. Some real-world applications have shown that various adders can reduce the data transmission requirement by more than 95% [12][13][14][15]. However, moving computation to the edge also means that the IoT nodes must have significant computation capabilities, which increases the cost and power requirements of every node.
In this research work, we present a case for a reversible logic-based Kogge-Stone carry look-ahead adder on Field-Programmable Gate Arrays (FPGAs), which significantly improves performance and power efficiency [16]. Moreover, by offloading computation originally done at a central server to power-efficient FPGAs, this approach also significantly improves the overall power efficiency of the network's computation [17,18]. While using reconfigurable hardware such as FPGAs to achieve high performance through acceleration is not a new idea [19,20], we provide an instance of a complex adder implementation and an in-depth evaluation of performance and power efficiency that incorporates wireless communication overhead.

IoT Applications
The role of IoT devices in different application areas is shown in Fig. 1. The major objectives for IoT are the creation of smart environments/spaces and self-aware things (for instance smart vehicles, cities, urban communities, buildings, energy, health, living, etc.) for environment, food, energy, mobility, digital society and health applications [22]. Advances in smart entities will encourage the development of the novel technologies needed to address the emerging challenges of public health, an ageing population, environmental protection and climate change, the conservation of energy and scarce materials, improvements to safety and security, and the continuation and growth of economic prosperity.

REVERSIBLE LOGIC GATE
Conventional logic gates are generally (n : 1) in nature, where n is the number of input signals applied and 1 denotes the single output produced by the gate. In contrast, reversible logic gates are (n, n) logic gates: the number of input signals and the number of output signals are both equal to n. In conventional logic gates, the output signals are fewer than the input signals, whereas in reversible gates the input and output signals are equal in number. Moreover, the combination of output signals uniquely determines the combination of input signals. This is the fundamental reason these (n, n) gates are called reversible logic gates [23][24][25].
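
The (n, n) property can be checked by brute force: a gate is reversible when it has as many outputs as inputs and maps distinct input combinations to distinct output combinations. The following is a minimal sketch; the Feynman (CNOT) and Toffoli gates are standard textbook examples chosen here for illustration, and the helper name `is_reversible` is hypothetical.

```python
from itertools import product

def feynman(a, b):
    """Feynman (CNOT) gate: (a, b) -> (a, a XOR b)."""
    return a, a ^ b

def toffoli(a, b, c):
    """Toffoli gate: (a, b, c) -> (a, b, c XOR (a AND b))."""
    return a, b, c ^ (a & b)

def and_gate(a, b):
    """Conventional (2 : 1) AND gate, returned as a 1-tuple for comparison."""
    return (a & b,)

def is_reversible(gate, n_inputs):
    """True when the gate has n outputs and is one-to-one on all 2**n inputs."""
    inputs = list(product((0, 1), repeat=n_inputs))
    outputs = [gate(*bits) for bits in inputs]
    same_width = all(len(out) == n_inputs for out in outputs)
    one_to_one = len(set(outputs)) == len(inputs)
    return same_width and one_to_one

print(is_reversible(feynman, 2))   # True
print(is_reversible(toffoli, 3))   # True
print(is_reversible(and_gate, 2))  # False
```

The AND gate fails both conditions: it has fewer outputs than inputs, and three different input pairs all produce 0.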

Design Constraints of Reversible Logic Gates
Delay and TRLIC are the two constraints that must be carefully controlled when constructing reversible logic circuits. A well-structured reversible logic circuit should have the following attributes:
i. The minimum number of logic gates should be used in the design.
ii. Constant inputs should be kept to a minimum.
iii. The number of garbage outputs should be kept to a minimum.
iv. Quantum cost should be kept as low as possible.
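
As a rough illustration of how these four attributes can be tallied for a candidate circuit, the sketch below summarizes a gate list. The per-gate quantum costs (CNOT 1, Toffoli 5, Peres 4, Fredkin 5) are the values used elsewhere in this paper; the function name `summarize` and the example circuit (a full adder built from two Peres gates with one constant input and two garbage outputs) are assumptions made only for this example.

```python
# Per-gate quantum costs as used in this paper's cost calculations.
QUANTUM_COST = {"CNOT": 1, "TOFFOLI": 5, "PERES": 4, "FREDKIN": 5}

def summarize(gates, constant_inputs, garbage_outputs):
    """Collect the four design attributes for a reversible circuit.

    gates            -- list of gate names, one entry per gate instance
    constant_inputs  -- number of inputs tied to a fixed 0 or 1
    garbage_outputs  -- number of outputs that are not used further
    """
    return {
        "gate_count": len(gates),
        "constant_inputs": constant_inputs,
        "garbage_outputs": garbage_outputs,
        "quantum_cost": sum(QUANTUM_COST[g] for g in gates),
    }

# Hypothetical example: a full adder made of two Peres gates.
full_adder = summarize(["PERES", "PERES"], constant_inputs=1, garbage_outputs=2)
print(full_adder)
```

Comparing two candidate designs then reduces to comparing these dictionaries field by field.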

Reversible Logic-Based Adder Design
This section presents the design of the adder topologies using reversible logic gates. In this work, the following adder structures are used: i. Ripple-carry adder. ii. Carry-select adder. iii. Kogge-Stone carry look-ahead adder.

Reversible Logic-Based Ripple-Carry Adder Design
The ripple-carry adder is built by cascading full adder (FA) blocks in series. A single full adder is responsible for adding two binary digits at each stage of the ripple-carry adder. The carry-out of one stage is fed directly to the carry-in of the succeeding stage. Although this is a simple adder and can add numbers of arbitrary bit length, it is inefficient when large bit widths are used. The most serious drawback of this adder is that the delay increases linearly with the bit length.
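
The cascading described above can be sketched in software. The example below is a behavioural model only, assuming each full adder is built from two Peres gates with one constant-0 input (a common reversible full adder construction, not necessarily the exact cell used in this work); the function names are hypothetical.

```python
def peres(a, b, c):
    """Peres gate: (a, b, c) -> (a, a XOR b, (a AND b) XOR c)."""
    return a, a ^ b, (a & b) ^ c

def reversible_full_adder(a, b, cin):
    """Full adder from two Peres gates (assumed construction).

    Returns (sum, carry_out, garbage), where garbage holds the two
    outputs that are not used further.
    """
    g1, p, ab = peres(a, b, 0)        # p = a XOR b, ab = a AND b
    g2, s, cout = peres(p, cin, ab)   # s = p XOR cin, cout = (p AND cin) XOR ab
    return s, cout, (g1, g2)

def ripple_carry_add(x, y, width=16, cin=0):
    """Add two width-bit integers; the carry ripples from stage to stage."""
    carry, total = cin, 0
    for i in range(width):
        s, carry, _ = reversible_full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry

print(ripple_carry_add(1234, 4321))    # (5555, 0)
print(ripple_carry_add(0xFFFF, 1))     # (0, 1)
```

The loop makes the linear delay visible: stage i cannot produce its sum until stage i-1 has delivered its carry.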

Reversible Logic-Based Carry-Select Adder
Carry-select adders are among the fastest adders used in many data-processing processors to perform fast arithmetic functions. The carry-select adder partitions the adder into several groups, each of which performs two additions in parallel. Two four-bit ripple-carry adders are used to generate the carry per select stage. One ripple-carry adder evaluates the carry chain assuming the carry-in is zero; the second ripple-carry adder assumes the carry-in is one. Once the carry signals are computed, the correct sum and carry-out signals are selected by a set of multiplexers. The reversible carry-select adder is implemented directly from the conventional design. To implement a 1-bit adder, this design requires four Peres gates and two Fredkin gates. A 4-bit carry-select adder constructed using reversible logic is shown in Fig. 4. RFA is a reversible full adder, shown in Fig. 5. At each stage, two RFAs are used: one for a carry-in of "0" and another for a carry-in of "1". The Fredkin gates (FG) in the circuit are used as multiplexers, and the CNOT gates at the bottom are used as buffers. The number of garbage outputs for each stage is eight. This design has a quantum cost of 26 for each stage, as it requires four Peres gates, which cost four each, and two Fredkin gates, which cost five each (4·4 + 2·5 = 26).
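
The select mechanism can be modelled behaviourally as follows. This is a sketch under stated assumptions: the full adder is taken to be two Peres gates, the Fredkin gate is used as a 2:1 multiplexer controlled by the incoming carry, and the block size of 4 follows the four-bit ripple-carry adders mentioned above. The function names are hypothetical, and the model does not reproduce the exact gate-level wiring of Fig. 4.

```python
def peres(a, b, c):
    """Peres gate: (a, b, c) -> (a, a XOR b, (a AND b) XOR c)."""
    return a, a ^ b, (a & b) ^ c

def fredkin(sel, x0, x1):
    """Fredkin gate (controlled swap); the second output acts as a 2:1 mux."""
    if sel:
        x0, x1 = x1, x0
    return sel, x0, x1

def full_adder(a, b, cin):
    """Reversible full adder from two Peres gates (assumed construction)."""
    _, p, ab = peres(a, b, 0)
    _, s, cout = peres(p, cin, ab)
    return s, cout

def ripple_block(a_bits, b_bits, cin):
    """Ripple through one block for a fixed, assumed carry-in."""
    sums, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
    return sums, carry

def carry_select_add(x, y, width=16, block=4, cin=0):
    """Carry-select addition; width must be a multiple of block."""
    result, carry = 0, cin
    for lo in range(0, width, block):
        a_bits = [(x >> i) & 1 for i in range(lo, lo + block)]
        b_bits = [(y >> i) & 1 for i in range(lo, lo + block)]
        sums0, c0 = ripple_block(a_bits, b_bits, 0)   # assumes carry-in 0
        sums1, c1 = ripple_block(a_bits, b_bits, 1)   # assumes carry-in 1
        for k, (s0, s1) in enumerate(zip(sums0, sums1)):
            _, s, _ = fredkin(carry, s0, s1)          # pick the valid sum bit
            result |= s << (lo + k)
        _, carry, _ = fredkin(carry, c0, c1)          # pick the valid carry-out
    return result, carry

# Quantum cost per stage as stated in the text: 4 Peres (4 each) + 2 Fredkin (5 each).
STAGE_QUANTUM_COST = 4 * 4 + 2 * 5

print(carry_select_add(0x1234, 0x0FFF))
print(STAGE_QUANTUM_COST)   # 26
```

Both candidate results for a block are available before the real carry arrives, so only the multiplexer delay is added per block.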

Reversible Logic-Based Carry Look-Ahead Adder
Addition is one of the most essential operations in digital systems, and the Ripple Carry Adder (RCA) is the foundation of digital adders, although it has certain obvious drawbacks. The Carry Look-ahead Adder (CLA), on the other hand, is an enhancement of the traditional RCA that overcomes RCA shortcomings such as low computing efficiency and long delay. Consequently, the CLA has become a widely used adder. This paper presents a new design of a 16-bit CLA that can improve computing efficiency and reduce energy dissipation based on reversible logic theory. Fig. 6 shows the overall structure of the proposed reversible CLA. The structure of the proposed reversible CLA is identical to that of the classical electrical circuit. However, every component had to be rebuilt with quantum gates so that the circuit has the property of reversible logic. As mentioned previously, the proposed 16-bit CLA is built from four 4-bit groups, which means that the four 4-bit CLAs are identical, and each stage's carry-out must be cascaded to the carry-in of the following stage. Hence, we are mainly concerned with rebuilding each part of the 4-bit CLA, namely the PG generation block, the carry generation block, and the sum generation block, with quantum gates.
The internal quantum circuit of the PG generation block is shown in Fig. 7. The circuit has 4 CNOT gates and 4 Toffoli gates. The function of this block is $P_i = A_i \oplus B_i$ and $G_i = A_i \cdot B_i$. The quantum cost of this block is 4·1 + 4·5 = 24.
The quantum circuit of the carry generation block is shown in Fig. 8; it is designed using 10 CNOT gates and 20 Toffoli gates. The quantum cost of this block is 10·1 + 20·5 = 110. The function of this block is to generate the carries $C_i$ to $C_{i+4}$ of the current stage from $P_i$ to $P_{i+3}$ and $G_i$ to $G_{i+3}$. The quantum circuit of the sum generation block is shown in Fig. 9; it is designed using 4 CNOT gates. The quantum cost is 4·1 = 4, and the function of this block is $S_i = C_i \oplus P_i$. In all figures, "g" denotes a garbage line.
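
The three blocks can be modelled behaviourally as below, using the standard carry look-ahead definitions (propagate as the XOR of the operand bits, generate as their AND, and each sum bit as the XOR of the incoming carry and the propagate bit). This is a functional sketch only; it does not model the CNOT/Toffoli netlists of Figs. 7 to 9, and the function names `cla4` and `cla16` are hypothetical.

```python
def cla4(a, b, cin):
    """4-bit carry look-ahead block on integers a, b in the range 0..15."""
    # PG generation block: P_i = A_i XOR B_i, G_i = A_i AND B_i
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]

    # Carry generation block: each carry is computed directly from P, G and cin:
    # C_{i+1} = G_i + P_i G_{i-1} + ... + P_i ... P_0 C_0
    c = [cin]
    for i in range(4):
        term, prod = g[i], p[i]
        for j in range(i - 1, -1, -1):
            term |= prod & g[j]
            prod &= p[j]
        term |= prod & cin
        c.append(term)

    # Sum generation block: S_i = C_i XOR P_i
    s = sum((c[i] ^ p[i]) << i for i in range(4))
    return s, c[4]

def cla16(a, b, cin=0):
    """16-bit adder from four identical 4-bit CLA blocks in cascade."""
    total, carry = 0, cin
    for k in range(4):
        s, carry = cla4((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF, carry)
        total |= s << (4 * k)
    return total, carry

print(cla16(0x0F0F, 0x00F1))   # (4096, 0), i.e. 0x1000
```

Within a block no carry waits on another carry; only the block-to-block carry is cascaded, as described above for the four 4-bit groups.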

Reversible Logic-Based Kogge-Stone Carry Look-Ahead Adder
The reversible Kogge-Stone adder (RKSA) is a parallel prefix form of the carry look-ahead adder. In the RKSA, all conventional full adders are replaced with a reversible HNG gate. The reversible Kogge-Stone adder is a fast adder design, known as a parallel prefix adder, that performs fast logical addition. It is used for wide adders because it shows the least delay among the different designs. In every vertical stage, it produces Propagate and Generate bits. The final Generate bits are produced in the last stage, and these bits are XOR-ed with the initial Propagate bits computed from the inputs to produce the sum bits. The proposed high-speed 16-bit KSA using RLG prefix adders is shown in Fig. 10. The main purpose of the design is to eliminate the large delays in carry computation, so the logic depth of the design is optimized. Designing a 16-bit reversible prefix adder using logic gates has been realized with the help of CMOS logic in FPGAs and DSPs. Basic CMOS logic implements only inverting functions, so cascading odd cells and even cells alternately has the benefit of eliminating the inverters between them. The two inputs of each stage are fed to an XOR gate and an AND gate, which together form nothing but a half adder circuit. However, using the XOR and AND gates increases the time delay and power consumption. To address this, both gates (XOR and AND) in the Kogge-Stone adder are removed and replaced with a reversible logic gate such as the Peres gate, which reduces the time delay and power consumption. The topology of the design is kept simple to ease impedance matching.
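
A behavioural sketch of the parallel prefix scheme is given below. It assumes that the XOR/AND pre-processing pair is replaced by one Peres gate per bit with a constant-0 third input, which yields the propagate and generate bits directly; the prefix stages are written with ordinary Boolean operators rather than reversible gates, so this models the dataflow and not the gate-level RKSA of Fig. 10. The function name `kogge_stone_add` is hypothetical.

```python
def peres(a, b, c):
    """Peres gate: (a, b, c) -> (a, a XOR b, (a AND b) XOR c)."""
    return a, a ^ b, (a & b) ^ c

def kogge_stone_add(x, y, width=16, cin=0):
    """Kogge-Stone parallel prefix addition of two width-bit integers."""
    # Pre-processing: Peres(a, b, 0) gives P = a XOR b and G = a AND b.
    p, g = [], []
    for i in range(width):
        _, pi, gi = peres((x >> i) & 1, (y >> i) & 1, 0)
        p.append(pi)
        g.append(gi)

    # Fold the external carry-in into the generate bit of position 0.
    gp, pp = g[:], p[:]
    gp[0] = g[0] | (p[0] & cin)

    # Prefix stages: the span doubles each stage (1, 2, 4, 8 for 16 bits).
    dist = 1
    while dist < width:
        new_g, new_p = gp[:], pp[:]
        for i in range(dist, width):
            new_g[i] = gp[i] | (pp[i] & gp[i - dist])
            new_p[i] = pp[i] & pp[i - dist]
        gp, pp = new_g, new_p
        dist *= 2

    # Post-processing: the carry into bit i is the final generate of bit i-1,
    # and each sum bit is the initial propagate XOR-ed with that carry.
    carries = [cin] + gp
    total = sum((p[i] ^ carries[i]) << i for i in range(width))
    return total, carries[width]

print(kogge_stone_add(0x7FFF, 0x0001))   # (32768, 0)
```

For 16 bits the loop runs four prefix stages, compared with sixteen sequential carry steps in a ripple-carry adder.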

Reconfigurable IoT Processor
The importance of SIMD stems from its use in many applications, whether in opening new horizons in the use of technology or in re-evaluating the traditional use of processors. In this section, we highlight a number of applications that exploit the properties of SIMD in their implementation. The advantages of SIMD for these applications stem from the need for faster execution, lower energy consumption, and lower cost. We shed light on three SIMD uses and then show how these uses have helped processors work efficiently.
Conventional SIMD processors are very effective for early visual processing because of their parallelism. However, in performing more advanced processing, they exhibit some problems, such as poor performance in global operations and a tradeoff between processing flexibility and the number of pixels. This work shows a new architecture and sample algorithms of a vision chip that can reconfigure its hardware dynamically by chaining processing elements.
A new architecture has been designed to solve the problems described above and to make a more flexible vision chip that can perform more advanced processing. It has several features that differ from conventional vision chips and SIMD processors. In particular: 1) it can calculate scalar feature values such as summations at high speed; 2) it can perform fast communication between distant PEs; and 3) the grain size of the PE and the network structure are dynamically reconfigurable. These features are achieved with only a small change to a regular SIMD image processor, but the overall performance is greatly improved. The structure of the PE is shown in Fig. 11. It is similar to that of a regular SIMD image processor with a 1-bit ALU and local memory, which performs a single-bit operation at a time. It differs, however, in that the neighbor output port for inter-PE communication consists of a latch, not a flip-flop. PEs are arranged in a two-dimensional array, and a photo detector (PD) is attached to each PE. Each PE is connected to the neighboring up, down, left, and right PEs and performs inter-PE communication.
The grid is open-ended, and the data from outside the array is always 0. A common global bus is provided for every row and column, and data is supplied to the buses from outside via shift registers. Scalar values are output as the processing result via column adders connected to the right-end PEs in an LSB-first bit-serial manner.
The SIMD reversible hybrid adder, the multiplier and the shifter are the three basic processing units in the ALU. These three units can be replicated in various widths (4, 8, and 16 bits) for computing data and can serve all the instructions of the design.
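
To illustrate what operating at different widths means for the adder, the sketch below performs lane-wise (packed) addition on a single word, with the carry chain cut at each lane boundary. The 16-bit word size, the wrap-around behaviour on lane overflow, and the function name `simd_add` are assumptions made for this example; the source does not specify how the ALU handles lane overflow.

```python
def simd_add(a, b, lane_bits, word_bits=16):
    """Lane-wise addition of two packed words.

    The word is split into word_bits // lane_bits independent lanes and
    carries do not cross lane boundaries (each lane wraps around on overflow).
    """
    if lane_bits not in (4, 8, 16) or word_bits % lane_bits != 0:
        raise ValueError("lane_bits must be 4, 8 or 16 and divide word_bits")
    mask = (1 << lane_bits) - 1
    out = 0
    for shift in range(0, word_bits, lane_bits):
        lane_sum = (((a >> shift) & mask) + ((b >> shift) & mask)) & mask
        out |= lane_sum << shift
    return out

# The same operands give different results depending on the lane width.
print(hex(simd_add(0x0FFF, 0x0001, 4)))    # 0xff0
print(hex(simd_add(0x0FFF, 0x0001, 8)))    # 0xf00
print(hex(simd_add(0x0FFF, 0x0001, 16)))   # 0x1000
```

With 4-bit lanes one instruction performs four independent additions, while with 16-bit lanes the same hardware performs a single full-width addition.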

RESULT AND DISCUSSION
This section discusses the simulation results and performance analysis of the proposed reconfigurable IoT system using reversible logic, in comparison with some existing methods. The proposed system is designed and simulated using Xilinx 14.7 software targeting a Spartan 6 xc6slx-3csg32 device. Tab. 1 presents the device utilization of existing adders. Fig. 13 presents the delay analysis of different hybrid adders with 16 and 32 bits. This comparison clearly shows that the hybrid combination of the Kogge-Stone adder and the carry look-ahead adder obtains the minimum delay compared with the other hybrid adder combinations.
Tab. 3 presents the device utilization of different reversible hybrid adders. Compared with normal adders, the hybrid adders consume more device resources, but their power consumption and delay are lower. Tab. 4 presents the analysis of different adders against the proposed reversible logic-based hybrid adders for 16-bit inputs. This comparison clearly shows that the proposed reversible logic-based hybrid adder consumes less power than the other adders. For example, the overall power consumption, delay, ADP and PDP of the reversible logic-based KSA-CLA are 15520.912 nW, 1023 ps, 524.621 and 0.0189 pJ, respectively. Tab. 5 presents the analysis of different adders against the proposed reversible logic-based hybrid adders for 32-bit inputs. This comparison clearly shows that the proposed reversible logic-based hybrid adder consumes less power than the other adders. For example, the overall power consumption, delay, ADP and PDP of the reversible logic-based KSA-CLA are 46468.29 nW, 2956 ps, 3069.31 and 0.230 pJ, respectively. Finally, the overall performance analysis of the different 16-bit adders is shown in Tab. 6.

CONCLUSION
This research aimed to define and implement a framework for reconfigurable SIMD architectures using reversible logic-based hybrid adders. Using this framework and SoC technology, a complete parallel computer has been implemented, and an application was written to test it in different aspects. The goal was to reduce development time and power consumption when implementing a SIMD system and to make it easy to evaluate the effectiveness of different applications, by constructing a reconfigurable framework in VHDL to ease this process. The resulting framework was to be implemented on an FPGA and tested for usability. The main parts of a SIMD architecture were identified as the Control Unit (CU), the Processing Elements (PE) and the Interconnection Network (ICN), and a framework was constructed with these parts as the main building blocks. The proposed SIMD processor design, particularly with the reconfigurable ALU, changed the modifiers for framing adder trees. The results demonstrate the potential to decrease delay and power. This shows that reconfigurable processor methods are well suited to IoT applications. In future work, a reversible logic-based Vedic multiplier will be used to improve the overall performance of the SIMD processor.