# A NOVEL FPGA-BASED BUNCH PURITY MONITOR SYSTEM AT THE APS STORAGE RING\*

W. Eric Norum and Bingxin Yang, ANL, Argonne, IL 60439, USA

## Abstract

Bunch purity is an important source quality factor for the magnetic resonance experiments at the Advanced Photon Source. Conventional bunch-purity monitors utilizing time-to-amplitude converters are subject to dead time. We present a novel design based on a single field-programmable gate array (FPGA) that continuously processes pulses at the full speed of the detector and front-end electronics. The FPGA provides 7776 single-channel analyzers (six per rf bucket). The starting time and width of each single-channel analyzer window can be set to a resolution of 178 ps. A detector pulse arriving inside the window of a single-channel analyzer is recorded in an associated 32-bit counter. The analyzer makes no contribution to the system dead time. Two channels for each rf bucket count pulses originating from the electrons in the bucket. The other four channels on the early and late side of the bucket provide estimates of the background. A single-chip microcontroller attached to the FPGA acts as an EPICS [1] IOC to make the information in the FPGA available to the EPICS clients.

## INTRODUCTION

The bunch purity measurement system described here uses the same detector and signal conditioning electronics as an older bunch purity measurement system developed at the APS [2]. The older system is based on a time-to-amplitude converter driving a multi-channel analyzer. Among its shortcomings is an inability to acquire data from every storage ring bucket. The new system removes this restriction.

An avalanche photodiode detects photons emitted from the electron beam at a storage ring bending magnet port. A constant-fraction discriminator provides a fast NIM pulse when a photon is detected. The other inputs to the new system are the facility 44-MHz clock, which is used by the FPGA to generate the acquisition sampling clocks, and the facility P0 synchronization signal, which marks the first bunch in each storage ring turn.

## HARDWARE

The only custom-built hardware in the bunch purity measurement system is a small circuit board that contains components to buffer the timing and synchronization signals and to convert the fast NIM signal from the constant-fraction discriminator to LVTTL. The heart of the system is a Stratix II FPGA mounted on a development kit circuit board produced by Altera [3]. In addition to the FPGA chip, the board contains flash memory to configure the FPGA on power-up and expansion connectors to which daughterboards containing application-specific components may be attached. There are many other components on the board that are not used by the bunch purity monitor. Even with all these unused components, the development kit provides a very cost-effective development platform. A commercial microcontroller module [4] is connected to the development kit expansion connectors. This microcontroller acts as an input/output controller (IOC) on the facility EPICS control system. EPICS IOCs are servers in a dynamic distributed database and provide process variable information to clients on the network. The total cost of all hardware components was less than \$2000.

## OPERATION

A phase-locked loop (PLL) block in the FPGA takes in the 44-MHz clock and multiplies it to 352 MHz. As shown in Figure 1, the PLL has six outputs, which can be individually offset in steps of 22.5 degrees. The six 352-MHz clocks drive the sampling flip-flops as shown in Figure 2, giving an effective sampling frequency of greater than 2.1 GHz. The facility 44-MHz clock is derived from the rf drive, so changes in the storage ring rf frequency are tracked by the bunch purity sampling clock.
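The clock arithmetic in this paragraph can be checked with a short sketch. It uses only the nominal figures quoted in the text; the variable names are illustrative and are not taken from the firmware.

```python
from fractions import Fraction

FACILITY_CLOCK_HZ = 44_000_000     # nominal facility clock quoted in the text
PLL_MULTIPLIER = 8                 # 44 MHz x 8 = 352 MHz
PHASE_STEP_DEG = Fraction(45, 2)   # 22.5-degree PLL phase-offset step
N_SAMPLERS = 6                     # one sampling flip-flop per PLL output

sampling_clock_hz = FACILITY_CLOCK_HZ * PLL_MULTIPLIER
period_ps = Fraction(10**12, sampling_clock_hz)      # one sampling interval
phase_step_ps = period_ps * PHASE_STEP_DEG / 360     # timing resolution
steps_per_period = int(360 / PHASE_STEP_DEG)         # phase steps per interval
effective_rate_hz = sampling_clock_hz * N_SAMPLERS   # six samples per interval

print(f"sampling clock      : {sampling_clock_hz / 1e6:.0f} MHz")
print(f"sampling interval   : {float(period_ps):.1f} ps")
print(f"phase step          : {float(phase_step_ps):.1f} ps")
print(f"steps per interval  : {steps_per_period}")
print(f"effective rate      : {effective_rate_hz / 1e9:.3f} GHz")
```

One phase step is one sixteenth of the 352-MHz period, which is the origin of the 178-ps resolution quoted in the abstract.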



Figure 1: Phase-locked loop sampling clock generation.

The state of the six sampling flip-flops is read at the end of each 2.8 ns (1/352 MHz) sampling interval. These values are then checked by a rising-edge detector, and when a rising edge is detected, the counter corresponding to that subinterval and bunch number is incremented. There are 7776 counters — six for each sampling interval and one sampling interval for each of the 1296 buckets in the storage ring. Each counter is 32 bits wide.
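The edge detection and counting described above can be modelled in software. The following is a minimal behavioural sketch, not the firmware: it assumes the sample stream starts at bucket 0 (that is, just after a P0 marker), that a rising edge is a sample that is high when the previous sample was low, and that counters wrap at 32 bits. The function name and data layout are hypothetical.

```python
N_SUB = 6         # sampling flip-flops (subintervals) per sampling interval
N_BUCKETS = 1296  # storage ring buckets, one sampling interval each

def count_rising_edges(intervals, counters=None):
    """Accumulate rising edges per (bucket, subinterval).

    `intervals` is an iterable of 6-tuples of 0/1 samples, one tuple per
    sampling interval, with the first tuple belonging to bucket 0.
    """
    if counters is None:
        counters = [[0] * N_SUB for _ in range(N_BUCKETS)]
    previous = 0  # last sample of the preceding interval
    for index, samples in enumerate(intervals):
        bucket = index % N_BUCKETS  # bucket number wraps once per turn
        for sub, level in enumerate(samples):
            if level and not previous:
                # Rising edge: increment the matching 32-bit counter.
                counters[bucket][sub] = (counters[bucket][sub] + 1) & 0xFFFFFFFF
            previous = level
    return counters

# A pulse that is still high at the start of the next interval is counted once.
counters = count_rising_edges([
    (0, 0, 1, 1, 0, 0),   # edge in subinterval 2 of bucket 0
    (0, 0, 0, 0, 0, 1),   # edge in subinterval 5 of bucket 1
    (1, 0, 0, 0, 0, 0),   # same pulse still high: no new edge in bucket 2
])
```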

> T03 Beam Diagnostics and Instrumentation 1-4244-0917-9/07/\$25.00 ©2007 IEEE

<sup>\*</sup>Work supported by U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357.

06 Instrumentation, Controls, Feedback & Operational Aspects



Figure 2: Sampling flip-flops.

The six subintervals are described as 'Too early,' 'Early,' 'Good,' 'Late,' 'Too late,' and 'Unclassifiable' (could be assigned to this bucket or to the following bucket). The values of the 'Early,' 'Good,' and 'Late' counters for each bucket are added together and used to determine the bunch purity. The bucket number counter is reset to 0 on the rising edge of the facility 'P0' timing marker.

In order for the six sampling flip-flops to sample the state of the input signal at the desired intervals, the differences between the propagation delays from the input pin to the six sampling flip-flops must be as small as possible. While an exact match of the propagation delays is impossible, the differences between the delays can be minimized by carefully selecting the location of the sampling flip-flops within the FPGA. The FPGA development software allows the designer to specify the exact cell into which a given section of logic will be placed. By trial and error with different locations for the sampling flip-flops, a layout was found that has only 35 ps of skew between the propagation delays.

The other factor that determines the actual sampling instant is the propagation delay of the clock signals from the PLL, where they are generated, to the six sampling flip-flops that they drive. No arrangement of the sampling flip-flop locations within the FPGA could reduce this skew to less than 229 ps. Although this skew is large, it can be compensated for to some degree, as described later.

The timing of the six sampling instants was measured by applying an asynchronous $\approx$10-MHz signal to the input and acquiring data for about a minute. The relative number of counts in each of the six classifications provides a direct indication of the relative sizes of the time intervals between the six sampling instants. The skew between the clock signals was found to have made the 'Late' classification too short and the 'Too late' classification too long. This condition was resolved by adding one more phase delay increment to the clock used to drive the 'Too late' sampling flip-flop. Table 1 shows the original intervals, the intervals after making the clock adjustment, and the ideal intervals.

Table 1: Acquisition Bin Sizes (ps)

|              | Too early | Early | Good | Late | Too late | Unclassifiable |
|--------------|-----------|-------|------|------|----------|----------------|
| Original     | 308       | 602   | 474  | 364  | 527      | 563            |
| Final        | 308       | 602   | 474  | 541  | 349      | 563            |
| Ideal        | 355       | 532   | 532  | 532  | 355      | 532            |
| % difference | -13       | 13    | -11  | 1.7  | -1.5     | 5.9            |

The differences between the ideal sampling intervals and the actual sampling intervals are small enough to be insignificant to the operation of the monitor.
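The calibration procedure can be sketched as follows. Because the test signal is asynchronous, its edges fall uniformly across the sampling interval, so the fraction of counts in each classification times the interval length estimates that bin's width. The raw calibration counts are not given in the paper, so this sketch uses the 'Original' row of Table 1 as a stand-in for the relative counts. It also assumes that the six columns follow the order in which the classifications are listed in the text; function names are hypothetical.

```python
PERIOD_PS = 1e12 / 352e6        # one sampling interval, about 2841 ps
PHASE_STEP_PS = PERIOD_PS / 16  # one 22.5-degree PLL phase increment

LABELS = ["Too early", "Early", "Good", "Late", "Too late", "Unclassifiable"]

def bin_widths_ps(counts):
    """Estimate bin widths from counts of a uniformly distributed test signal."""
    total = sum(counts)
    return [PERIOD_PS * c / total for c in counts]

def shift_boundary(widths, left, steps=1):
    """Move the boundary between bin `left` and bin `left + 1` later in time."""
    adjusted = list(widths)
    adjusted[left] += steps * PHASE_STEP_PS
    adjusted[left + 1] -= steps * PHASE_STEP_PS
    return adjusted

# Stand-in for the measured relative counts (Table 1, 'Original' row).
widths = bin_widths_ps([308, 602, 474, 364, 527, 563])

# One extra phase increment on the boundary between 'Late' and 'Too late'.
adjusted = shift_boundary(widths, LABELS.index("Late"))

for label, before, after in zip(LABELS, widths, adjusted):
    print(f"{label:15s} {before:6.0f} ps -> {after:6.0f} ps")
```

Under these assumptions, a single phase increment moves the 'Late' and 'Too late' widths to within about 1 ps of the 'Final' row of Table 1.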

The FPGA contains 255 'M4k' memory blocks, each of which provides up to 4608 bits of memory. The bunch purity monitor firmware uses 102 of these blocks to implement the 7776 32-bit counters for the 6 classifications of 1296 buckets and another 17 to implement the 1296 32-bit counters for a 'total count' histogram. The counters for a given classification are implemented as a RAM block and a 32-bit incrementer. When a rising edge in a particular classification of a particular bucket is detected, the bucket number is used to address the RAM block for that classification. The location is read, incremented, and written back to the same location. This operation takes two cycles of the 100-MHz system clock. Thus the system can accept a sustained counting rate of 50 MHz without losing any samples. This is far above the rates encountered during normal operation.
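The read-increment-write scheme and the resulting throughput can be illustrated with a small software model. This is a sketch under stated assumptions rather than a description of the firmware: the class name is hypothetical, and wrapping at 32 bits is assumed because the text does not say how counter overflow is handled.

```python
class CounterRam:
    """Counters for one classification: a RAM block plus a 32-bit incrementer."""

    def __init__(self, n_buckets=1296, width_bits=32):
        self.mask = (1 << width_bits) - 1
        self.ram = [0] * n_buckets
        self.cycles = 0  # system clock cycles spent on updates

    def record_edge(self, bucket):
        value = self.ram[bucket]                    # cycle 1: read the location
        self.cycles += 1
        self.ram[bucket] = (value + 1) & self.mask  # cycle 2: increment, write back
        self.cycles += 1

SYSTEM_CLOCK_HZ = 100_000_000
CYCLES_PER_UPDATE = 2
max_rate_hz = SYSTEM_CLOCK_HZ // CYCLES_PER_UPDATE  # sustained counting rate

# Memory budget from the text: 17 M4k blocks per 1296-counter RAM.
blocks_per_ram = 17
blocks_used = 6 * blocks_per_ram + blocks_per_ram   # six classifications + total

good = CounterRam()
for _ in range(3):
    good.record_edge(378)
print(good.ram[378], good.cycles, max_rate_hz, blocks_used)
```

The block counts are consistent with the figures in the text: six classification RAMs at 17 blocks each give 102 blocks, and the 'total count' histogram brings the total to 119 of the 255 available.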

## STATISTICS ACCUMULATION

The FPGA counts pulses from the detector and builds a histogram for a duration of t seconds, where t is an EPICS process variable with a default value of 120. This histogram has six bins for each of the 1296 storage ring buckets, for a total of 7776 bins. The count value for each bucket is the sum of the values in the 'Early,' 'Good,' and 'Late' classifications for that bucket. A main bucket is defined as any bucket whose count exceeds a fraction f of the count in the bucket with the most counts, where f is an EPICS process variable with a default value of 0.33. The set of main buckets is designated $M_0$. For example, in the case where 24 bunches are evenly spaced around the storage ring, $M_0$ is

$$M_0 = \{m\} = \{0, 54, 108, \dots, 1242\}.$$
 (1)

The average histogram count for the main buckets is then

$$\overline{C_0} = \frac{1}{n} \sum_{m \in M_0} C_m, \quad \text{where } n = \sum_{m \in M_0} 1.$$
 (2)

The average histogram counts for the three buckets before and the six buckets after the main buckets are

$$\overline{C_i} = \frac{1}{n} \sum_{k \in M_i} C_k, \quad \text{where } M_i = \{m + i \mid m \in M_0\} \text{ and } i = -3, -2, -1, 1, \dots, 6.$$
 (3)


The average number of counts for all remaining buckets is

$$\overline{C_{\text{other}}} = \frac{\sum_{k \notin M_0 \cup \bigcup_i M_i} C_k}{1296 - 10n}.$$
 (4)

The 'bunch impurity' process variables are defined as the ratios

$$p_i = \frac{\overline{C_i}}{\overline{C_0}}, \quad \text{where } i = -3, -2, -1, 1, \dots, 6, \text{other}.$$
 (5)
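Equations (1) through (5) can be followed step by step in a short sketch. It is illustrative only: the function name is hypothetical, bucket indices are assumed to wrap modulo 1296 (the text does not say how neighbours of buckets near the ends of the index range are treated), and the denominator of Eq. (4) is used as written, which presumes that the neighbour sets do not overlap.

```python
N_BUCKETS = 1296
OFFSETS = [-3, -2, -1, 1, 2, 3, 4, 5, 6]  # three buckets before, six after

def bunch_impurities(counts, f=0.33):
    """Return (main buckets, impurity ratios) for one histogram.

    counts[k] is the sum of the 'Early', 'Good' and 'Late' bins of bucket k.
    """
    threshold = f * max(counts)
    main = [k for k, c in enumerate(counts) if c > threshold]    # set M_0
    if not main:
        raise ValueError("no main buckets found (empty histogram?)")
    n = len(main)
    c0 = sum(counts[m] for m in main) / n                        # Eq. (2)
    used = set(main)
    impurities = {}
    for i in OFFSETS:
        neighbours = [(m + i) % N_BUCKETS for m in main]         # set M_i
        used.update(neighbours)
        ci = sum(counts[k] for k in neighbours) / n              # Eq. (3)
        impurities[i] = ci / c0                                  # Eq. (5)
    other = sum(c for k, c in enumerate(counts) if k not in used)
    c_other = other / (N_BUCKETS - 10 * n)                       # Eq. (4)
    impurities["other"] = c_other / c0
    return main, impurities

# Synthetic example: 24 evenly spaced bunches, each with a weak +1 neighbour.
counts = [0] * N_BUCKETS
for m in range(0, N_BUCKETS, 54):
    counts[m] = 1_000_000
    counts[m + 1] = 3_000
main, p = bunch_impurities(counts)
print(main[:3], main[-1], p[1], p["other"])
```

The synthetic counts are invented for illustration and do not represent measured data.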

The detector and optics are adjusted to provide a rate of about 7000 counts per second for each of the filled buckets. At this rate the probability of missed counts due to multiple photons arriving at the detector is less than 0.1%.
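The order of magnitude of this figure can be checked with a simple estimate. The paper does not give the calculation, so the sketch below is one possible reading: it assumes Poisson statistics, a nominal rf frequency of 352 MHz, and that a count is missed when two or more photons from the same bucket arrive during a single passage.

```python
import math

RF_HZ = 352e6           # nominal rf frequency (assumption: exact value not stated)
N_BUCKETS = 1296
COUNT_RATE_HZ = 7000.0  # counts per second for each filled bucket

revolution_hz = RF_HZ / N_BUCKETS   # how often a given bucket passes the port
mu = COUNT_RATE_HZ / revolution_hz  # mean photons detected per passage

# Poisson probability of two or more photons in a single passage.
p_multiple = 1.0 - math.exp(-mu) * (1.0 + mu)

print(f"revolution frequency : {revolution_hz / 1e3:.1f} kHz")
print(f"mean photons/passage : {mu:.4f}")
print(f"P(two or more)       : {100 * p_multiple:.3f} %")
```

Under these assumptions the per-passage probability of a multiple-photon event is a few hundredths of a percent, consistent with the stated bound.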

To obtain better statistics and dynamic range, the IOC maintains a boxcar average of the values read from the FPGA. The boxcar length is set by an EPICS process variable and has a maximum value of 500. The length is commonly set to 30, which, when combined with a two-minute FPGA sampling period, results in bunch purity statistics based on data acquired over the past hour.
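A boxcar average of this kind can be sketched with a fixed-length queue. The class below is a hypothetical illustration of the technique, not the IOC code; it keeps the most recent histograms and averages them bin by bin.

```python
from collections import deque

class BoxcarAverage:
    """Bin-by-bin average of the most recent `length` histograms."""

    MAX_LENGTH = 500  # maximum boxcar length quoted in the text

    def __init__(self, length=30):
        if not 1 <= length <= self.MAX_LENGTH:
            raise ValueError("boxcar length must be between 1 and 500")
        self.window = deque(maxlen=length)  # oldest entry drops automatically

    def add(self, histogram):
        self.window.append(list(histogram))

    def average(self):
        n = len(self.window)
        return [sum(column) / n for column in zip(*self.window)]

boxcar = BoxcarAverage(length=2)
boxcar.add([2, 4])
boxcar.add([4, 8])
first = boxcar.average()
boxcar.add([6, 12])        # pushes out the oldest histogram
second = boxcar.average()

history_minutes = 30 * 120 / 60  # default length x default period
print(first, second, history_minutes)
```

With the common setting of 30 histograms at 120 s each, the window spans 60 minutes, matching the one-hour figure in the text.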

## RESULTS

This section presents an example of the information obtained by the bunch purity monitor. The data were acquired during a one-week period at the end of March 2007. Plots of the number of electrons in the impurity buckets over time showed that the number of electrons in most impurity buckets increased by factors of two to four, as expected from the longer lifetime of the impurity bunches. This is consistent with a stable injector that fills all neighbor buckets of the main buckets at nearly constant ratios. A notable exception was the +1 bucket, whose electron population increased 26-fold during the first 24 hours, about ten times faster than all other impurity buckets.

Figure 3 shows the recorded intensity of bucket 379 and the preceding main bucket (378) over the period in question. During this time the main bucket received 36 top-up injections. The intensity of the +1 bucket (379) remained low for 30 of these shots. The 31st shot increased the number of electrons in the +1 bucket over 40-fold. Figure 4 shows the same plots with an expanded time scale to illustrate more clearly that the increase in the +1 bucket charge occurred at the same time as the injection into bucket 378.

Note that between 9:45 and 9:50 AM counts for the main bucket increased by  $2 \times 10^5$  while counts for the +1 bucket increased by about 650. This indicates an impurity of 0.3% for the +1 bucket.

## CONCLUSIONS

The FPGA-based bunch purity monitor described in this paper is now in regular use at the APS. It has shown itself to be both reliable and capable enough that it will completely replace the older bunch purity monitor system in September 2007. The use of a commercial FPGA development kit provides a very cost-effective platform for constructing diagnostic data acquisition systems such as this.

Figure 3: Recorded bucket intensities for a 40-hour period.

Figure 4: Recorded bucket intensities for a 4-hour period.

## ACKNOWLEDGEMENTS

We thank Bob Soliday and Hairong Shang for their help setting up the data logger and SDDS programs.

## REFERENCES

- [1] Experimental Physics and Industrial Control System. http://www.aps.anl.gov/epics/.
- [2] A.H. Lumpkin, C.-Y. Yao, B.X. Yang, and T. Toellner. Bunch purity evolution during APS storage ring top-up operations. In *Proc. of PAC'03*, pages 2413–2415, Portland, Oregon, 2003.
- [3] Altera, 101 Innovation Drive, San Jose, CA 95134. Nios Development Board Stratix II Edition Reference Manual, 1.3 edition, May 2007.
- [4] Arcturus Networks Incorporated, 116 Spadina Avenue, Suite 100, Toronto, Ontario, Canada, M5V2K6. uCdimm<sup>TM</sup> ColdFire 5280/5282 Hardware/Firmware Reference Guide.
