



International Journal of Information Research and Review Vol. 1, Issue, 3, pp. 044-049, September, 2014



## Full Length Research Article

## THE NEW METHOD FOR THE DESIGN AND SIMULATION OF A 4-BIT ACCUMULATOR

## **Rupesh Singh, Srivastava and Mahfooz Ahmad**

Integral University Lucknow, India

| ARTICLE INFO | ABSTRACT |
|--------------|----------|
| Article History:<br>Received 24<sup>th</sup> June, 2014<br>Received in revised form 15<sup>th</sup> July, 2014<br>Accepted 03<sup>rd</sup> August, 2014<br>Published online 21<sup>st</sup> September, 2014<br><br>Keywords:<br>Complementary Metal Oxide Semiconductor (CMOS),<br>Metal Oxide Semiconductor Field Effect Transistor (MOSFET),<br>Multiplier-accumulator (MAC) | An accumulator is an adder or subtractor and a register. Sometimes these are combined with a multiplier to form a multiplier–accumulator (MAC). An incrementer adds 1 to the input bus, so this function, together with a register, can be used to negate a two's complement number. Advanced silicon (Si) complementary metal oxide semiconductor (CMOS) manufacturing processes require novel circuit design methodology to achieve high performance. The CMOS process leverages existing 120 nm silicon process advancements, but lower device breakdown voltages require new circuits to achieve low voltage operation. These allow high speed metal oxide semiconductor field effect transistor (MOSFET) logic blocks to be combined with high density, low power CMOS logic. This work presents the design of a digital accumulator operating from a 5 V supply with minimum power consumption. The architecture applies modified logic family inverter, NAND, NOR, and XOR gates in a 120 nm technology. |

Copyright © 2014 Mohd Nadeem. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **INTRODUCTION**

An accumulator is a register in which intermediate arithmetic and logic results are stored. Without a register like an accumulator, it would be necessary to write the result of each calculation (addition, multiplication, shift, etc.) to main memory, perhaps only to be read right back again for use in the next operation. Access to main memory is slower than access to a register like the accumulator because the technology used for the large main memory is slower (but cheaper) than that used for a register. Early electronic computer systems were often split into two groups, those with accumulators and those without. Modern computer systems often have multiple general purpose registers that operate as accumulators, and the term is no longer as common as it once was. However, a number of special-purpose processors still use a single accumulator for their work in order to simplify their design.

High speed digital circuits require constant technological advancement. CMOS can achieve the performance required for signal processing, and these demands continue to increase. This paper presents circuits for different logic gates and the construction of an accumulator circuit from digital logic. These circuits are integrated in CMOS to boost overall system performance.

Corresponding author: Mohd Nadeem, Integral University Lucknow, India

These techniques are evaluated through the design of a 4-bit digital accumulator. We have used 120 nm CMOS technology with high performance logic gates. This process allows a combination of high density and low power CMOS logic.

#### MOSFET

The MOSFET, or MOS transistor, is certainly the workhorse of contemporary digital design. Its major asset from a digital perspective is that the device performs very well as a switch and introduces few parasitic effects. Other important advantages are its integration density combined with a relatively "simple" manufacturing process, which make it possible to produce large and complex circuits in an economical way. The discussion concludes with an enumeration of some second-order effects and the introduction of the SPICE MOS transistor models.

The MOSFET is a four terminal device. The voltage applied to the gate terminal determines if and how much current flows between the source and the drain ports (Elguibaly, 2000). The body represents the fourth terminal of the transistor. Its function is secondary, as it only serves to modulate the device characteristics and parameters. At the most superficial level, the transistor can be considered to be a switch. When a voltage larger than a given value, called the threshold voltage V<sub>T</sub>, is applied to the gate, a conducting channel is formed between drain and source. In the presence of a voltage difference between the latter two, current flows between them. The conductivity of the channel is modulated by the gate voltage: the larger the voltage difference between gate and source, the smaller the resistance of the conducting channel and the larger the current. When the gate voltage is lower than the threshold, no such channel exists, and the switch is considered open.
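The switch view described above can be illustrated with a minimal Python sketch. The function names, the threshold value of 0.4 V, and the gain factor `k` are assumptions chosen for illustration and are not taken from this paper. The resistance expression R = 1 / (k (V_GS - V_T)) is the usual first-order approximation for small drain-source voltages.

```python
def nmos_is_on(v_gs: float, v_t: float) -> bool:
    """Switch view: a channel forms only when V_GS exceeds the threshold V_T."""
    return v_gs > v_t


def channel_resistance(v_gs: float, v_t: float, k: float) -> float:
    """First-order on-resistance R = 1 / (k * (V_GS - V_T)).

    k is a hypothetical device gain factor in A/V^2 (an assumption, not a
    value from the paper). Returns infinity when the switch is open.
    """
    if not nmos_is_on(v_gs, v_t):
        return float("inf")
    return 1.0 / (k * (v_gs - v_t))


V_T = 0.4  # assumed threshold voltage in volts (illustrative only)
K = 1e-3   # assumed gain factor in A/V^2 (illustrative only)

for v_gs in (0.2, 0.8, 1.2):
    # Below V_T the switch is open; above it, resistance falls as V_GS rises.
    print(v_gs, nmos_is_on(v_gs, V_T), channel_resistance(v_gs, V_T, K))
```

The sketch ignores second-order effects and the body terminal, so it captures only the qualitative trend stated in the text.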





#### **SPICE Models**

SPICE has built-in MOSFET models, selected by the LEVEL parameter in the model. These models have been rendered obsolete by the progression to short-channel devices and should only be used for first-order analysis. Their main properties differ; we used the LEVEL 3 SPICE model. LEVEL 3 is a semiempirical model. It relies on a mixture of analytical and empirical expressions, and uses measured device data to determine its main parameters. It works quite well for channel lengths down to 1 µm. In view of the limitations of the built-in models, we use the SPICE LEVEL 3 model in this work.

`M1 nvout nvin 0 0 nmos.1 W=0.375U L=0.25U`
`+ AD=0.24P PD=1.625U AS=0.24P PS=1.625U NRS=1 NRD=1`

`M2 nvout nvin nvdd nvdd pmos.1 W=1.125U L=0.25U`
`+ AD=0.7P PD=2.375U AS=0.7P PS=2.375U NRS=0.33 NRD=0.33`

#### **D** Flip Flop

The D flip-flop has only one input, referred to as the D input or data input, and the usual two outputs, Q and Q'. It transfers the data at the input to the output Q after a delay of one clock pulse. For this reason the input is sometimes referred to as a delay input, and the flip-flop gets the name delay (D) flip-flop (Chang *et al.*, 2004). It can easily be constructed from an S-R flip-flop by placing an inverter between S and R, such that the input of the inverter is at the S end and the output of the inverter is at the R end. The D flip-flop thereby avoids the undefined condition of the S-R flip-flop, i.e., S = R = 1. The D flip-flop is used either as a delay device or as a latch to store one bit of binary information. The truth table of the D flip-flop is given in Figure 2[a]. The structure of the D flip-flop constructed using NAND gates is shown in Figure 2. The same structure can be constructed using only NOR gates.



Figure 2. D flip-flop using NAND gates

| Input    | Output    |
|----------|-----------|
| $D_n$    | $Q_{n+1}$ |
| 0        | 0         |
| 1        | 1         |

Figure 2[a]. operation of D flip flop

Case 1. If the CLK input is low, the value of the D input has no effect, since the S and R inputs of the basic NAND flip-flop are kept at 1.

Case 2. If CLK = 1 and D = 1, NAND gate 1 produces 0, which forces the output of NAND gate 3 to 1. On the other hand, both inputs of NAND gate 2 are 1, which gives the output of gate 2 as 0. Hence, the output of NAND gate 4 is forced to 1, i.e., Q = 1, whereas both inputs of gate 5 are 1 and its output is 0, i.e., Q' = 0. Hence, when D = 1, Q = 1 after one clock pulse passes, which means the output follows D.

Case 3. If CLK = 1 and D = 0, NAND gate 1 produces 1. Hence both inputs of NAND gate 3 are 1, which gives the output of gate 3 as 0. On the other hand, D = 0 forces the output of NAND gate 2 to 1. Hence the output of NAND gate 5 is forced to 1, i.e., Q' = 1, whereas both inputs of gate 4 are 1 and its output is 0, i.e., Q = 0. Hence, when D = 0, Q = 0 after one clock pulse passes, which means the output again follows D.

Figure 3. A S-R flip-flop converted into a D flip-flop

Figure 4. The logic symbol of a D flip-flop

A simple way to construct a D flip-flop using an S-R flip-flop is shown in Figure 3. The logic symbol of a D flip-flop is shown in Figure 4. The D flip-flop is most often used in the construction of sequential circuits such as registers.
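The three cases of the NAND-gate analysis can be checked with a small Python sketch. The gate wiring below is inferred from that case analysis and is an assumption, since the figure itself is not reproduced here: gate 1 is taken to invert D, gates 2 and 3 gate D and its complement with CLK, and gates 4 and 5 form the cross-coupled output pair. The class and method names are hypothetical. The model is level-sensitive (transparent while CLK = 1) and does not attempt to represent the rising-edge schematic of Figure 6.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on 0/1 values."""
    return 0 if (a and b) else 1


class GatedDLatch:
    """Five-NAND structure inferred from Cases 1 to 3 (wiring is an assumption)."""

    def __init__(self) -> None:
        self.q, self.q_bar = 0, 1

    def apply(self, d: int, clk: int) -> int:
        g1 = nand(d, d)      # gate 1: complement of D
        g2 = nand(d, clk)    # gate 2: active-low "set", 0 only if D = CLK = 1
        g3 = nand(g1, clk)   # gate 3: active-low "reset", 0 only if D = 0, CLK = 1
        # Gates 4 and 5 are cross-coupled, so iterate until the pair settles.
        for _ in range(3):
            self.q = nand(g2, self.q_bar)   # gate 4 drives Q
            self.q_bar = nand(g3, self.q)   # gate 5 drives Q'
        return self.q


latch = GatedDLatch()
for d, clk in [(1, 1), (0, 0), (0, 1), (1, 0)]:
    print(f"D={d} CLK={clk} -> Q={latch.apply(d, clk)} Q'={latch.q_bar}")
```

With CLK = 1 the output follows D, and with CLK = 0 the previous value is held, in line with the three cases.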



Figure 5. D flip flop operation

The schematic of the DFF is shown below.



Figure 6. Schematic of a rising edge D flip flop

#### **Design of 4 bit Accumulator**

An accumulator is built with an adder whose sum can be loaded into a register, as shown in Figure 7.



Figure 7. basic block diagram of an Accumulator

Accumulators are a basic building block of most large digital logic or DSP projects. As an analogy, you can think of an up accumulator (the type used in this project) as a file cabinet. It starts out empty. If you add two, it holds the value two. If you add three more, it holds five. The block diagrams below explain how a 4-bit accumulator is made up of four 1-bit accumulators. Since we first designed and tested a 1-bit accumulator, we instantiate it four times to create a 4-bit ripple carry adder. This saves a lot of design time and is more reliable. A 4-bit accumulator consists of a 4-bit full adder and a resettable 4-bit register. Its 7 inputs are phi, {A3, A2, A1, A0}, cin, and reset. Its 5 outputs are {Q3, Q2, Q1, Q0} and cout. The adder computes the sum of {A3, A2, A1, A0}, {Q3, Q2, Q1, Q0}, and cin, and generates a sum {S3, S2, S1, S0} and a carry cout. The register samples {S3, S2, S1, S0} on the rising edge of phi and stores the result in {Q3, Q2, Q1, Q0}.
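The adder-plus-register behavior described above can be sketched behaviorally in Python. This is a minimal illustration, not the transistor-level design of this paper; the class name `Accumulator4` and its method names are hypothetical, and the rising edge of phi is represented simply by a call to `clock`.

```python
from typing import Tuple


class Accumulator4:
    """Behavioral model: a 4-bit adder feeding a resettable 4-bit register."""

    def __init__(self) -> None:
        self.q = 0  # register contents {Q3, Q2, Q1, Q0} as an integer 0..15

    def reset(self) -> None:
        """Clear the register, as the reset input does."""
        self.q = 0

    def add(self, a: int, cin: int = 0) -> Tuple[int, int]:
        """Combinational adder: returns (S, cout) for A + Q + cin."""
        total = (a & 0xF) + self.q + (cin & 1)
        return total & 0xF, total >> 4

    def clock(self, a: int, cin: int = 0) -> int:
        """Rising edge of phi: store S into Q and return the carry out."""
        s, cout = self.add(a, cin)
        self.q = s
        return cout


acc = Accumulator4()
acc.reset()
acc.clock(2)            # the cabinet now holds two
acc.clock(3)            # three more, so it holds five
print(f"{acc.q:04b}")   # prints 0101
```

Keeping `add` separate from `clock` mirrors the split between the combinational adder and the register that samples its output.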



Figure 8. Block diagram of 4 bit ripple carry adder

There are four 1-bit accumulators, with Cin and Cout chained from one to the next. {A3, A2, A1, A0} is the primary input data, together with Cin. The other input of each adder comes from the output of its flip-flop. The 4-bit adder is a ripple carry adder and can represent results from 0000 to 1111. If the result of a calculation is larger than 1111, the final Cout is set and the result wraps around, starting again from 0000. The carry logic is designed in domino style. The precharge phase occurs when phi (clk) is low, so Cout is valid only when phi (clk) is high (the evaluation phase). {Q3, Q2, Q1, Q0} is the primary result vector of this design. Buffers are included to drive the large load of the following gates.
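The carry chaining described above can be sketched at the bit level in Python. The helper names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) are hypothetical, and the sketch models only the logic function: the domino precharge and evaluation phases and the output buffers are not represented.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """One-bit full adder: returns (sum, carry out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a_bits: List[int], q_bits: List[int],
                     cin: int = 0) -> Tuple[List[int], int]:
    """Chain full adders; bit lists are LSB first, e.g. [A0, A1, A2, A3]."""
    carry = cin
    sum_bits = []
    for a, q in zip(a_bits, q_bits):
        s, carry = full_adder(a, q, carry)  # Cout of this slice is Cin of the next
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value: int, width: int = 4) -> List[int]:
    """Integer to LSB-first bit list."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: List[int]) -> int:
    """LSB-first bit list to integer."""
    return sum(b << i for i, b in enumerate(bits))


s_bits, cout = ripple_carry_add(to_bits(0b1111), to_bits(0b0001))
print(from_bits(s_bits), cout)  # 1111 + 0001 wraps to 0000 with Cout set: prints "0 1"
```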

#### 4 Bit Accumulator

The 4-bit accumulator is designed using 4 full adders and 5 DFFs. The output is obtained at the end of every clock cycle. The outputs of the first 4 DFFs are fed back to the inputs of the corresponding adders so that accumulation can take place. Notice that the carry out bit of the first adder is given to the carry in bit of the second stage, and this continues until the last stage. The carry out bit of the last adder is given to the fifth DFF. This is done to ensure that, when three of these 4-bit accumulators are pipelined, the carry bit of one stage is sent to the carry bit of the next stage in exactly one clock cycle.



Figure 9. schematic of a 4 bit accumulator

#### **Tanner Tool simulation of the Schematic**

This was done as a routine part of the design cycle. The schematic is extracted, converted to schm.sim, and tested using S-edit by executing the following test sequence.

- 1. Reset the accumulator.
- 2. Set c\_in=0 for the duration of this test.
- 3. Let c\_out be an output, and let the sum wrap around in case of an overflow (i.e. modulo-16 addition).
- 4. Set A=6 (A3=0, A2=1, A1=1, A0=0) and pulse the clock.
- 5. Set A=4 (A3=0, A2=1, A1=0, A0=0) and pulse the clock.
- 6. Set A=5 (A3=0, A2=1, A1=0, A0=1) and pulse the clock.
- 7. Set A=1 (A3=0, A2=0, A1=0, A0=1) and pulse the clock.
- 8. The final sum will be Q=0001. COUT shall remain 0 all through the test duration.

An image of the simulator waveform for this sequential addition can be found below. As expected, there was no carry out signal during the addition and the final value of Q is '0001'.

#### **Power Dissipation**

- Dynamic power dissipation: 2.5 mW.
- Static power dissipation: 1.3 mW (when the clock is held high, i.e. worst case for dynamic logic).
- Static power dissipation: 1.4 mW (when the clock is held low).

#### **Critical Path analysis**

The slowest timing path in the 4-bit accumulator is the carry chain, since the carry out of each slice is the carry in of the next slice. The signal starts at the full adder that adds Q0 and A0. This produces Cout0, which enters the next full adder's Cin. The signal then propagates through the adder for A1 and Q1. This produces Cout1, which enters the next full adder's Cin. The signal then propagates through the adder for A2 and Q2. This produces Cout2, which enters the next full adder's Cin. The signal then propagates through the adder for A3 and Q3. Finally, the signal must be latched at the register for Q3. Therefore the critical path is: full adder for A0 and Q0 → Cout0 → full adder for A1 and Q1 → Cout1 → full adder for A2 and Q2 → Cout2 → full adder for A3 and Q3 → register for Q3.
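As a rough sketch, the path traced above can be written as a sum of stage delays. The symbols are assumed names introduced here for illustration and are not quantities reported in this paper: $t_{cq}$ is the clock-to-Q delay of the register, $t_{\text{carry}}$ is the delay of one full adder to its carry output, $t_{\text{sum}}$ is the delay of the last full adder from carry in to sum, and $t_{\text{setup}}$ is the setup time of the register for Q3.

$$
t_{\text{crit}} \approx t_{cq} + 3\,t_{\text{carry}} + t_{\text{sum}} + t_{\text{setup}},
\qquad
f_{\max} \le \frac{1}{t_{\text{crit}}}
$$

The factor of three counts Cout0, Cout1, and Cout2. Under this approximation, a maximum clock frequency of 100 MHz corresponds to $t_{\text{crit}} \le 10$ ns.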

#### Setup time analysis

The setup time is defined as how late the input, in this case the sum, can arrive at the latch and still be latched by the negative edge triggered latch. The calculated SUM depends on the inputs to the SUM logic, which are A0, A1, A2, A3, and CIN. We therefore find the setup time for these inputs so that there is no loss of data for the Q and S outputs. To find the setup time for an accumulator, one input should change while the others are held constant, so that the output depends purely on one input. The test sequence was derived with that in mind: when the setup time for CIN was calculated, A0 was held at zero, and vice versa.

#### **Propagation Delay analysis**

The propagation delay can be defined as the time taken for the output to change in response to a change at the input. In this case there are also control signals, PHI and RESET, so the output behavior must be analyzed with respect to these inputs as well. It is logical to calculate the worst case propagation delays. The propagation delays for the four Q3:0 outputs have two values per output: the delay from the rising edge of clk and the delay from the rising edge of Reset. The propagation delay for Cout has seven values: the delay from the rising edge of clk, from the rising edge of Reset, and from transitions on Cin, A3, A2, A1, and A0. It can be seen that the propagation delay increases as the distance of the input signal from the output signal increases.

#### **RESULTS AND DISCUSSION**

The circuit diagrams of the different logic gates, built from MOSFETs in 120 nm technology, were designed and analyzed with the Tanner tool schematic editor. In these logic gates section ..., the input pulses have a 1 ns rise and fall time, with pulse widths and lengths of 25 ns, 50 ns, 100 ns, 200 ns, and 400 ns, and a clock pulse of 25 ns width and 50 ns length. The functionality of the 4-bit accumulator was tested as shown in the following waveform:



Figure 10. Waveform of accumulator circuit with 4 inputs A1, A2, A3, and A4, and 10 outputs q, qbar, q1, q1bar, ..., q4, q4bar

We have thus designed and analyzed a MOS based 4-bit accumulator circuit. We have also designed the full adder circuit; its schematic and its output waveform for the same input signals are shown below:



Figure 11. schematic diagram of full adder



Figure 12. output waveform of sum and carry with input A B and C



We have designed the D flip-flop circuit using logic gates in the Tanner tool schematic editor, as shown in Figure 13.

Figure 13. schematic diagram of D flip flop



Figure 14. output waveform of input pulses in D flip-flop

#### Conclusion

A 4-bit accumulator was designed and functionally tested with the Tanner tool. The critical path, setup times, and propagation delays were analyzed. The maximum clock frequency it could support was 100 MHz. In addition to the latch, the critical path, which is the ripple carry, determined this frequency.

- The critical (slowest) timing path in the circuit limits the clock frequency of the 4-bit accumulator.
- The setup time is the time for which the data inputs must be valid before the clock transition.
- The critical path is strongly dependent on circuit topology and data dependencies.

## REFERENCES

- Abdelgawad, A., Bayoumi, M., 2007. High Speed and Area-efficient Multiply Accumulate (MAC) Unit for Digital Signal Processing Applications. Proc. IEEE Int. Symp. on Circuits and Systems. New Orleans, USA, p.3199-3202. [doi:10.1109/ISCAS.2007.378152]
- Chang, C.H., Gu, J.M., Zhang, M.Y., 2004. Ultra low-voltage low-power CMOS 4-2 and 5-2 compressors for fast arithmetic circuits. IEEE Trans. Circuits Syst. I: Fundam. Theory Appl., 51(10):1985-1997. [doi:10.1109/TCSI.2004.835683]
- Chang, C.H., Gu, J.M., Zhang, M.Y., 2005. A review of 0.18µm full adder performances for tree structured arithmetic circuits. IEEE Trans. Very Large Scale Integration (VLSI) Syst., 13(6):686-695. [doi:10.1109/TVLSI.2005.848806]
- Chen, K.H., Chu, Y.S., 2007. A low-power multiplier with the spurious power suppression technique. IEEE Trans. Very Large Scale Integration (VLSI) Syst., 15(7):846-850. [doi:10.1109/TVLSI.2007.899242]
- Chong, K.S., Gwee, B.H., Chang, J.S., 2007. Low energy 16-bit Booth leapfrog array multiplier using dynamic adders. IET Proc. Circ., Devices & Syst., 1(2):170-174. [doi:10.1049/iet-cds:20060109]
- Clark, L., Hoffman, E.J., Miller, J., Biyani, M., Liao, L.Y., Strazdus, S., Morrow, M., Velarde, K.E., Yarch, M.A., 2001. An embedded 32-b microprocessor core for low-power and high-performance applications. IEEE J. Solid-State Circ., 36(11):1599-1608. [doi:10.1109/4.962279]
- Danysh, A., Tan, D., 2005. Architecture and implementation of a vector/SIMD multiply-accumulate unit. IEEE Trans. Comput., 54(3):284-293. [doi:10.1109/TC.2005.41]
- Elguibaly, F., 2000. A fast parallel multiplier-accumulator using the modified Booth algorithm. IEEE Trans. Circuits Syst. II: Analog Digital Sign. Process., 47(9):902-908. [doi:10.1109/82.868458]
- Fang, C.J., Huang, C.H., Wang, J.S., Yeh, C.W., 2002. Fast and Compact Dynamic Ripple Carry Adder Design. Proc. IEEE Asia-Pacific Conf. on ASIC. Taipei, Taiwan, p.25-28. [doi:10.1109/APASIC.2002.1031523]
- Kim, Y., Kim, L., 2001. 64-bit carry-select adder with reduced area. Electr. Lett., 37(10):614-615. [doi:10.1049/el:20010430]
