A 0.6µm CMOS 4Gb/s Transceiver With Data Recovery Using Oversampling
Symposium 1997 on VLSI Circuits, 1997-06-12. DOI: 10.1109/VLSIC.1997.623812 (https://doi.org/10.1109/VLSIC.1997.623812)
Chih-Kong Ken Yang, Ramin Farjad-Rad, and Mark Horowitz
Center for Integrated Systems, Stanford University, Stanford, CA 94305

ABSTRACT

A 4Gb/s serial link transmitter and receiver fabricated in the MOSIS HP 0.6µm CMOS process uses edges tapped from a PLL to multiplex (transmit) and demultiplex (receive) the data. For data recovery, the input is sampled at 3x the bit rate, and digital phase-picking logic allows very fast tracking of the bit window. With a 3.3V supply, the chip has a measured BER of <

Architecture

The architecture used to achieve 4Gb/s transmission and reception is shown in Fig. 1. Because of intrinsic process limitations, generating a 4Gb/s bit stream directly in a 0.6µm technology is impossible (the maximum ring oscillator frequency is <2GHz). The bit stream is instead generated by 8:1 multiplexing using 8 different clock phases from a 4-stage ring oscillator (Tx-PLL), so that the on-chip frequency is 1/8th the data rate. Various techniques exist for generating multiple clock phases [2], [3]; this paper uses the one discussed in [1]. Data recovery requires 1:8 demultiplexing using similar multi-phase clocks. For the 3x oversampling, 24 clocks are generated by interpolating phases from a 6-stage ring oscillator (Rx-PLL) [1].

The oversampled data is processed by a decision algorithm and simultaneously delayed, so that the decision can be applied to the appropriate samples to recover the actual data. To facilitate the digital design, the data is first re-synchronized from the multiple clock phases to a global clock (this re-synchronizing process is reversed in the transmitter). Fig. 2 shows the timing for generating the transmitted and received signals. The re-synchronizing clocks and the global clock are chosen and buffered carefully to prevent hold-time violations. The sampling and re-timing require 2 cycles of latency.
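The clock-phase arithmetic in the architecture description can be checked with a short calculation. The sketch below is only an illustration under idealized assumptions (evenly spaced phases, no jitter or skew); the function and key names are hypothetical and do not come from the paper.

```python
from fractions import Fraction

def link_timing(bit_rate_hz, mux_ratio=8, oversampling=3):
    """Ideal timing figures for a link that multiplexes with multiple clock phases."""
    bit_time_ps = Fraction(10**12, bit_rate_hz)     # width of one bit window
    clock_hz = Fraction(bit_rate_hz, mux_ratio)     # on-chip clock frequency
    rx_phases = mux_ratio * oversampling            # sampling clocks per clock cycle
    sample_spacing_ps = bit_time_ps / oversampling  # gap between adjacent samples
    return {
        "bit_time_ps": bit_time_ps,
        "clock_hz": clock_hz,
        "tx_phases": mux_ratio,
        "rx_phases": rx_phases,
        "sample_spacing_ps": sample_spacing_ps,
    }

def phase_offsets_ps(clock_hz, n_phases):
    """Offsets of n_phases evenly spaced clock edges within one clock period."""
    period_ps = Fraction(10**12) / clock_hz
    return [period_ps * k / n_phases for k in range(n_phases)]

# Figures from the text: 4Gb/s, 8:1 multiplexing, 3x oversampling.
timing = link_timing(4_000_000_000)
tx_edges = phase_offsets_ps(timing["clock_hz"], timing["tx_phases"])
rx_edges = phase_offsets_ps(timing["clock_hz"], timing["rx_phases"])
```

Under these assumptions the on-chip clock is 500MHz, each bit window is 250ps wide, and the 24 receive phases are spaced 250/3 ps (about 83ps) apart, so every third receive phase lines up with a transmit phase.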
For bit error rate (BER) testing, a 2^7-1 PRBS encoder and decoder were built on chip, as well as a scannable transmit data pattern.

Decision Algorithm

The algorithm for resolving the data from the samples depends on the channel characteristics and the application. It serves the dual purpose of determining the value and the timing of the data. The 3x oversampling was chosen as a trade-off between better sampling resolution and data recovery on one side and increased power, area, and complexity on the other. The BER for each oversampling ratio shown in Fig. 4 is calculated by averaging the BER over all possible phase positions.

To determine the data value, we can weight and sum the three samples, as in majority voting, which rejects high-frequency glitches. However, even if the cable/fiber is not bandwidth limited, the parasitic capacitance from the bank of input samplers required for the oversampling and demultiplexing, together with that of the parallel current-mode drivers for the output multiplexing, forms a significant low-pass filter near the data frequency (85ps RC). Majority voting then becomes less useful and more prone to error, because of pulse width reduction at lower SNR (Fig. 4). Instead of majority voting, this design selects the middle sample of the three. More complex filtering and averaging could be performed if the digital data had more bits per sample or >3x oversampling. In addition, the input capacitance can be compensated by predistorting the transmitter [5].

Picking the center sample requires finding and tracking the bit boundaries. This logic behaves like a digital PLL [4]. However, instead of feeding back the phase information to control the phase of the clocks, the phase information is fed forward to the delayed data to select the
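The feed-forward idea (find the bit boundary from the samples, then pick the center sample of each bit) can be illustrated in software. The sketch below is a simplified model, not the chip's logic: it assumes noiseless samples and a fixed phase, and it uses the polynomial x^7 + x^6 + 1 for the test pattern, which is an assumption since the text only gives the sequence length. All function names are hypothetical.

```python
def prbs7(n, seed=0x7F):
    """Generate n bits of a 2^7-1 PRBS (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    bits = []
    for _ in range(n):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits

def oversample(bits, phase, ratio=3):
    """Ideal oversampling: repeat each bit `ratio` times, then drop the first
    `phase` samples to model an unknown sampling phase."""
    samples = [b for b in bits for _ in range(ratio)]
    return samples[phase:]

def find_boundary(samples, ratio=3):
    """Tally transitions by sample position modulo `ratio`; the position with
    the most transitions is taken as the first sample of each bit."""
    counts = [0] * ratio
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            counts[i % ratio] += 1
    return max(range(ratio), key=counts.__getitem__)

def pick_center(samples, ratio=3):
    """Feed the detected boundary forward to select the middle sample of each bit."""
    center = (find_boundary(samples, ratio) + ratio // 2) % ratio
    return [samples[i] for i in range(center, len(samples), ratio)]

def majority_vote(samples, ratio=3):
    """Alternative decision: vote over each complete group of `ratio` samples."""
    start = find_boundary(samples, ratio)
    groups = [samples[i:i + ratio]
              for i in range(start, len(samples) - ratio + 1, ratio)]
    return [1 if 2 * sum(g) > ratio else 0 for g in groups]

tx_bits = prbs7(127)
recovered = {phase: pick_center(oversample(tx_bits, phase)) for phase in range(3)}
```

With ideal samples, both decisions recover the transmitted bits for every starting phase, apart from a possibly incomplete first bit. The two differ only once glitches or pulse narrowing are added to the samples, which this sketch does not model.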