A NOVEL 3D NoC SCHEME FOR HIGH THROUGHPUT UNICAST AND MULTICAST ROUTING PROTOCOLS

Original scientific article

A novel 3D-NoC architecture is designed by extending the idea of lossless data compression. The proposed design shows notable gains in power efficiency and network throughput. In this scheme, the data to be transmitted is compressed at the sender, so that the data packet is reduced before transmission, and the original data is restored at the receiver by decompressing the encoded data. The Golomb-Rice algorithm is used to develop and implement the hardware encoder and decoder (codec). The results are evaluated for both unicast and multicast routing. For unicast routing, the proposed 3D-NoC improves power efficiency and throughput by 9,25 % and 17,61 % respectively. Similarly, for multicast routing, the improvements are 8,65 % and 14,66 % respectively. Furthermore, the results show that the improvement is larger for smaller bandwidths, which means that the proposed 3D-NoC works well with narrow channels.


Introduction
Long-term research in VLSI has led to the System on Chip (SoC) technology [1]. Its high level of integration makes the SoC a good candidate for multimedia processors, consumer electronics and biological applications. The various components present in an SoC are described in [2]. The functionality and performance of the SoC are increased by exploiting the benefits of three-dimensional integrated circuits (3D ICs). A conventional bus architecture is used for communication among the components of an SoC. Network on Chip (NoC), on the other hand, is emerging as a promising technology [3] to overcome the restricted access of the bus architecture in SoCs.
The latency in a 3D NoC is kept low by stacking the active layers vertically. Resources are tightly constrained, since all the hardware used in the NoC is implemented on a single chip. High scalability, reduced power consumption and low latency are some of the attractive properties of NoC. An NoC relies on a reliable switching model, a low-power routing scheme and a specific topology.

Related work
Research interest has turned towards the Network on Chip (NoC), since its performance is promising compared with the conventional bus architecture. Many works on NoC address routing, switching and topology. The high-speed and low-power performance of the NoC is dominated by the switching model, and an effective wormhole switching model is proposed in [4]. A deadlock-free routing scheme is proposed in [5]. The problem of data buffering is overcome by using a complex dynamic routing protocol [6]. The problems faced by task routing are addressed in [7]. Deflection routing techniques [8] have been proposed for output contention problems. The efficient layout and simplicity of the 3D-mesh topology [9] make it popular among NoC topologies. The network performance of the 3D-mesh topology is increased by using through-silicon vias (TSVs) [10]. The power consumed during communication is reduced by using a three-step routing algorithm [7]. So far, many works have proposed optimized NoCs in terms of routing scheme, switching model, low power, contention-free and deadlock-free operation, and topology. However, little work has addressed power reduction strategies that improve the performance of the NoC architecture.
This paper presents a lossless data compression technique based on a Golomb-Rice encoder and decoder to reduce the power consumption and increase the performance. To broaden the evaluation of the proposed system, both unicast and multicast routing are incorporated in the design. At the sender, the encoder converts the original data into a compressed packet, and at the receiver the decoder converts the compressed packet back to its original form. Lossless compression and decompression are achieved by a hardware codec developed using the Golomb-Rice algorithm [11]. The proposed codec for the 3D-NoC router is developed in Verilog HDL, and Design Vision from Synopsys is used to obtain the synthesis results. The analysis of the results shows that power efficiency and performance are improved for both unicast and multicast routing.

Proposed 3D-NoC scheme
A simple 4×4×3 mesh architecture for the proposed NoC scheme is illustrated in Fig. 1. The proposed scheme is applied to both unicast and multicast routing. As soon as a data transaction is initiated at the sender, the on-chip hardware encoder compresses the data. The Network Interface (NI) then packetizes the compressed data before transmission. The packets reach the receiver after travelling through the on-chip network. The NI on the receiver side de-packetizes the received packets, and the on-chip hardware decoder then restores the original data. The throughput of the network is thus improved by reducing the amount of packet data generated by the NI. Lossless compression is achieved by incorporating the Golomb-Rice algorithm.
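The packetization and de-packetization steps described above can be pictured with a minimal Python sketch. The flit width, the zero padding and the explicit length field are assumptions made for illustration only; the text does not specify the NI packet format, and the names `packetize` and `depacketize` are hypothetical.

```python
def packetize(bits, width):
    """Split a bit string into flits of `width` bits (assumed NI behaviour).

    The last flit is zero-padded. The payload length is returned alongside
    the flits and stands in for a hypothetical header field.
    """
    flits = [bits[i:i + width].ljust(width, "0")
             for i in range(0, len(bits), width)]
    return len(bits), flits


def depacketize(length, flits):
    """Reassemble the flits and drop the padding using the stored length."""
    return "".join(flits)[:length]


if __name__ == "__main__":
    raw = "1011001110001111" * 4      # 64-bit payload (illustrative)
    compressed = raw[:40]             # stand-in for a 40-bit encoded payload
    for width in (8, 16, 32):
        _, raw_flits = packetize(raw, width)
        _, cmp_flits = packetize(compressed, width)
        print(width, len(raw_flits), len(cmp_flits))
```

A shorter payload yields fewer flits for the same channel width, which is the effect the scheme relies on.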

Figure 1 4×4×3 mesh architecture of the proposed NoC scheme (labelled elements: TSV bundle, link, router, node, NI)

The low complexity of the Golomb-Rice algorithm makes it a good match for lossless compression in hardware. For input data n and a divisor of 2^k, the Golomb-Rice algorithm [11] binary-encodes the remainder and unary-encodes the quotient. The proposed codec is based on the Golomb-Rice algorithm rather than on other lossless compression techniques because the remainder and the quotient are easy to determine. The hardware complexity of finding them is low because the divisor is always a power of two (2^k).
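Because the divisor is a power of two, the quotient and remainder reduce to a shift and a mask. The short Python sketch below illustrates this; the function names are hypothetical, and the length formula assumes the textbook Golomb-Rice codeword (unary quotient, one stop bit, k-bit remainder) without the extra control bit used by the proposed codec.

```python
def split_power_of_two(n, k):
    """Return (quotient, remainder) of n divided by 2**k using shift and mask."""
    quotient = n >> k                  # drop the k low bits
    remainder = n & ((1 << k) - 1)     # keep only the k low bits
    return quotient, remainder


def rice_length(n, k):
    """Bits in a textbook Rice codeword: q unary bits + 1 stop bit + k bits."""
    quotient, _ = split_power_of_two(n, k)
    return quotient + 1 + k


if __name__ == "__main__":
    for n in (5, 37, 200):
        print(n, split_power_of_two(n, 3), rice_length(n, 3))
```

Small values give short codewords, while large values make the unary part grow linearly, which is why a fallback to uncompressed transmission is useful.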
The proposed encoder and decoder based on the Golomb-Rice algorithm are shown in Fig. 2a and 2b. Here a divisor of 2^k is assumed for original data of length L. In the worst case, the original data may be shorter than the encoded data. This is because, in the Golomb-Rice algorithm, the remainder is converted to fixed-length data by binary encoding, whereas the quotient is converted to variable-length data by unary encoding. If the original data is shorter than the encoded data, the original data is transmitted as it is.
The length is calculated in advance by the size prediction logic. The 'A' bit denotes whether the received encoded data is compressed or not. Because the encoded quotient has variable length, a 'T' bit is appended to the encoded data to indicate termination.
On the receiver side (i.e. the Golomb-Rice decoding stage), the 'A' bit is checked first to determine the compression state of the input data. If A=1, the data before the 'T' bit is unary-decoded, followed by binary decoding of the k bits. The original data is generated by concatenating the two decoded values. If, on the other hand, A=0, the original data is generated by directly concatenating the k-bit data after the 'A' bit and the (L−k)-bit data before the 'T' bit. The flow of data packets in unicast routing and multicast routing is illustrated in Fig. 3.
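The encoding and decoding steps can be sketched in Python as follows. This is a software illustration under stated assumptions, not the hardware codec of Fig. 2: the frame layout [A][k-bit remainder][quotient field][T], the use of '1' bits for the unary quotient with T=0 as terminator, and the default values L=8 and k=4 are all assumed here, and the function names are hypothetical.

```python
def gr_encode(value, L=8, k=4):
    """Encode one L-bit word as [A][k-bit remainder][quotient field][T]."""
    if not (0 < k < L and 0 <= value < (1 << L)):
        raise ValueError("need 0 < k < L and an L-bit value")
    q = value >> k
    r = value & ((1 << k) - 1)
    remainder_bits = format(r, f"0{k}b")
    # Size prediction: the unary form needs k + q payload bits, the raw form L.
    if k + q <= L:
        return "1" + remainder_bits + "1" * q + "0"              # A=1, compressed
    return "0" + remainder_bits + format(q, f"0{L - k}b") + "0"  # A=0, raw


def gr_decode(frame, L=8, k=4):
    """Recover the original word from a frame produced by gr_encode."""
    a_bit = frame[0]
    r = int(frame[1:1 + k], 2)              # k bits after the A bit
    if a_bit == "1":
        t_pos = frame.index("0", 1 + k)     # T bit ends the unary run
        q = t_pos - (1 + k)                 # unary decoding: count the ones
    else:
        q = int(frame[1 + k:1 + L], 2)      # (L-k)-bit field before the T bit
    return (q << k) | r


if __name__ == "__main__":
    for v in (5, 79, 80, 200):
        frame = gr_encode(v)
        print(v, frame, len(frame), gr_decode(frame))
```

In this sketch a raw frame costs two control bits on top of the L data bits, so compression only pays off when small values dominate the traffic.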

Experimental results
The 3D-NoC simulator is developed in the C programming language and is cycle accurate. It implements the hardware codec (encoder/decoder) based on the Golomb-Rice algorithm, the power-aware routing scheme, the interconnect channels, the routers and the Network Interfaces (NIs). To improve the performance of the proposed 3D-NoC simulator, several techniques such as virtual and pipelined channels, along with effective routing techniques, are employed in the target platform. The IP cores generate data randomly according to the data generation ratio (DR). The data size is selected randomly between 4 and 1024 bytes. The Golomb-Rice encoder compresses the randomly generated data, and the NI then packetizes the compressed data.
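The random traffic generation can be sketched as below. The simulator itself is written in C; this Python sketch only illustrates the idea and assumes that each of the 48 IP cores of the 4×4×3 mesh independently generates data with probability DR in every clock cycle and that sizes are drawn uniformly. The function name and parameters are hypothetical.

```python
import random

MESH_NODES = 4 * 4 * 3      # 48 IP cores in the 4x4x3 mesh


def generate_cycle(dr, rng, nodes=MESH_NODES, min_bytes=4, max_bytes=1024):
    """Return [(core_id, size_in_bytes), ...] for one simulated clock cycle.

    Assumption: every core generates data with probability `dr`, so that on
    average a fraction `dr` of the cores is active in each cycle.
    """
    events = []
    for core in range(nodes):
        if rng.random() < dr:
            events.append((core, rng.randint(min_bytes, max_bytes)))
    return events


if __name__ == "__main__":
    rng = random.Random(2024)       # fixed seed for repeatable runs
    for dr in (0.1, 0.2, 0.3, 0.4):
        print(dr, len(generate_cycle(dr, rng)))
```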
The destination node receives the compressed packet after it travels through the on-chip network. On the receiver side, the original data is recovered by reversing the process with the Golomb-Rice decoder. The throughput of the network is evaluated from the number of cycles between the generation of the data at the sender and the restoration of the original data at the receiver. To obtain accurate results, the process is continued until the data generation and diminishing values reach 5.000.000. The impact on performance is evaluated for channel bandwidths of 8, 16 and 32 bits. The 3D-NoC router and the Network Interface are designed in Verilog HDL and synthesized with Synopsys Design Vision in a 0,15 µm CMOS technology. For the proposed 3D-NoC design with the compression scheme, the critical path delay, the dynamic power consumption and the area (gate count) are 3,57 ns, 701,19 µW and 11588 respectively. For the conventional 3D-NoC architecture (without the compression scheme), the corresponding values are 3,56 ns, 697,81 µW and 10096. The power efficiency and performance of the proposed 3D-NoC are estimated from the critical path delay and the consumed power. Tab. 1 compares the throughput of the proposed 4×4×3 mesh NoC (with encoding and decoding) with that of the conventional 4×4×3 mesh NoC (without encoding and decoding) for both unicast and multicast routing. The improvement in the power efficiency of the proposed 4×4×3 mesh NoC for both unicast and multicast routing is shown in Fig. 4. Here DR stands for the data generation ratio, defined as the ratio of the number of IP cores that generate random data in one clock cycle to the total number of IP cores. TP denotes the throughput of the NoC system and CB denotes the bandwidth of the physical channel. To evaluate the power efficiency, the energy consumption of the proposed scheme (both unicast and multicast) is calculated relative to the conventional architecture, as shown in Fig. 4.
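The synthesis figures reported above can be turned into relative hardware overheads with a few lines of arithmetic. The sketch below uses only the delay, power and gate-count values given in the text; the helper name `relative_overhead` is hypothetical.

```python
def relative_overhead(proposed, conventional):
    """Relative increase of `proposed` over `conventional`, in percent."""
    return 100.0 * (proposed - conventional) / conventional


# Synthesis figures reported in the text: (proposed, conventional).
delay_ns = (3.57, 3.56)
power_uw = (701.19, 697.81)
gates = (11588, 10096)

overheads = {
    "critical path delay": relative_overhead(*delay_ns),
    "dynamic power": relative_overhead(*power_uw),
    "gate count": relative_overhead(*gates),
}

if __name__ == "__main__":
    for name, pct in overheads.items():
        print(f"{name}: +{pct:.2f} %")
```

With these inputs the overheads come to roughly 0,28 % in critical path delay, 0,48 % in dynamic power and 14,8 % in gate count.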
Power efficiency and network throughput are improved in all cases (i.e. DR = 0,1; 0,2; 0,3 and 0,4), as shown in Fig. 4 and Tab. 1. The average improvement in power efficiency and throughput for unicast routing is estimated as 9,25 % and 17,61 % respectively. Similarly, the average improvement in power efficiency and throughput for multicast routing is 8,65 % and 14,66 % respectively. In addition, the throughput improvement of the proposed system becomes larger as the bandwidth decreases, as shown in Tab. 1.
Two additional clock cycles are needed for compression and decompression (one cycle each). Taking this into account, the end-to-end packet latency is given in Tab. 2. Hence, the proposed system can be used effectively in the case of narrow bandwidth. The use of the encoder and decoder in the proposed scheme leads to a small increase in power consumption and area. This area and power overhead is negligible when weighed against the improvement in power efficiency and network throughput.
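One way to see why the fixed two-cycle codec cost weighs less on narrow channels is a simple serialization model. The sketch below is an illustrative assumption, not the latency model of the simulator: it counts one cycle per flit on a single channel plus the two codec cycles, ignores routing and contention, and uses hypothetical payload sizes.

```python
import math

CODEC_CYCLES = 2    # one cycle for compression, one for decompression


def serialization_cycles(payload_bits, channel_bits, compressed=False):
    """Cycles to push a payload through one channel, one flit per cycle."""
    flits = math.ceil(payload_bits / channel_bits)
    return flits + (CODEC_CYCLES if compressed else 0)


def saving_percent(raw_bits, compressed_bits, channel_bits):
    """Relative cycle saving of the compressed transfer, in percent."""
    raw = serialization_cycles(raw_bits, channel_bits)
    packed = serialization_cycles(compressed_bits, channel_bits, compressed=True)
    return 100.0 * (raw - packed) / raw


if __name__ == "__main__":
    # Hypothetical example: a 512-bit payload compressed to 400 bits.
    for width in (8, 16, 32):
        print(width, saving_percent(512, 400, width))
```

For this hypothetical payload the relative saving shrinks as the channel widens, which is consistent with the trend reported in Tab. 1, although the actual figures depend on the full network model.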

Conclusion
A novel 3-Dimensional Network on Chip (3D-NoC) is designed to improve the throughput of the on-chip network. This is accomplished by using an encoder and decoder based on the Golomb-Rice algorithm to compress and decompress the data. The proposed system shows an improvement in power efficiency and network throughput, with a small area overhead, compared with the conventional NoC (without encoding and decoding). So far the proposed scheme has been evaluated for the 4×4×3 mesh topology; in the future the idea can be extended to other topologies for performance evaluation.

Figure 3 Employed Routing Scheme

Table 2 End-to-End Packet Latency