The miniaturisation of integrated circuits is bringing new problems in terms of power consumption, speed, and variability tolerance. The current synchronous designs are struggling to cope with these problems, and in consequence new optimisations or paradigms are being studied. The study of this thesis are the optimisations like clock skew for synchronous circuits and asynchronous circuits as an alternative paradigm. The performance analysis of both cases are equivalent and algorithms on graph theory for cycles have been implemented to calculate the optimum speed. Asynchronous controllers are essential for a good asynchronous design. To create a connectivity structure of controllers it is necessary to group the memory elements (registers) of the circuit into clusters. Clustering registers affects power consumption, performance, area, and variability tolerance. To produce a good clustering is a hard job because of the high number of registers and for the trade-offs of optimising all these characteristics. An initial problem in clustering of controllers is to decide how many controllers we want. A design with one cluster give us the same problems of a synchronous design, high power consumption and too much sensible on variability of temperature, voltage, manufacturing errors, etc. On the other hand, having as many controllers as registers will produce too much overhead in area for all the new logic and wires that needs to be added. It is important to have clusters as less connected as possible to design simple controllers and to minimise the impact on area. We know from benchmarks and industrial designs that the register graph is highly connected, and the controllers graph is almost complete. A variation of Min-Cut can give us a solution to optimise this property. The clustering will have an impact on performance. Grouping registers implies a lost of freedom, and optimisations like clock skew or the asynchronous circuit will be affected by this lost as a handicap to reach the maximum speed. From the placement point of view we need to have clusters where their registers are close to minimise the clock tree. The ideal solution is a partition of the space. The worst solution is to have the registers spared around. The contribution of this thesis are two clustering algorithms; A local search solution to minimise the number of connections, and a k-means implementation that combines the minimisation of the clock trees and the maximisation of performance, by using parameters to balance it. These algorithms have been implemented in the Elastix EDA tool and executed on ISCAS benchmarks and SUN Microsystems OpenSparc processor |