FPGA Verilog Questions
- Would it be possible to modify the USRP FPGA Verilog code?
Yes. It is possible. The top level of the verilog source for the FPGA can be found in usrp/fpga/toplevel/usrp_std/usrp_std.v and the bulk of the code in usrp/fpga/sdr_lib/*.v.
- I would like to recompile all the verilog files for the FPGA in order to have a better understanding of the role of each module. I believe the program used to compile the files was Quartus II. Is that correct?
If you want to build the FPGA .rbf file from source (not required; we provide pre-compiled .rbf files in the usrp/fpga/rbf directory), you'll need Altera's no-cost Quartus II development tools. We're currently building with Quartus II Web Edition.
The project file is usrp/fpga/toplevel/usrp_std/usrp_std.qpf. The top level of the verilog source for the FPGA can be found in usrp/fpga/toplevel/usrp_std/usrp_std.v and the bulk of the code in usrp/fpga/sdr_lib/*.v.
- How is the data going to and coming from the AD9862 handled in the FPGA?
On the RX side of the FPGA, the ADC samples are widened as they enter the DC offset removal stage. This happens in adc_interface.v, where the input of the rx_dcoffset module is assigned as:
.adc_in({adc0[11],adc0,3'b0})
This is a 1-bit sign extension, the 12 original bits, then 3 zero bits, giving a 16-bit word. On the TX side (see usrp_std.v), the 16-bit value coming from the tx_chain module is truncated by 2 bits and sent out to the DACs.
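The widening and truncation described above can be tried in a small stand-alone simulation. This is only a sketch: the module name and the signals rx_word, tx_word and dac are hypothetical, and it assumes that "truncated by 2 bits" means the two least significant bits are dropped.

```verilog
// Illustrative only: adc_pack_demo, rx_word, tx_word and dac are hypothetical names.
module adc_pack_demo;
  reg  [11:0] adc0;     // 12-bit two's-complement ADC sample
  reg  [15:0] tx_word;  // 16-bit sample leaving the TX chain

  // RX: 1 sign bit + 12 sample bits + 3 zero bits = 16 bits
  wire [15:0] rx_word = {adc0[11], adc0, 3'b0};

  // TX: assumed truncation that keeps the 14 most significant bits
  wire [13:0] dac = tx_word[15:2];

  initial begin
    adc0 = 12'h7FF; tx_word = 16'h7FFF;   // largest positive values
    #1 $display("adc0=%h rx_word=%h tx_word=%h dac=%h", adc0, rx_word, tx_word, dac);
    adc0 = 12'h800; tx_word = 16'h8000;   // most negative values
    #1 $display("adc0=%h rx_word=%h tx_word=%h dac=%h", adc0, rx_word, tx_word, dac);
    $finish;
  end
endmodule
```

Read as a signed 16-bit number, rx_word is the ADC sample multiplied by 8, so the sign is preserved and one bit of headroom is left at the top.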
- The tx_chain.v module uses a module called phase_acc. It takes as inputs (among others) a 7bit serial address and 32 bit serial data and outputs the 32bit phase. Does anyone have any idea what that module does and what its purpose is in the tx_chain?
It is not enabled. If enabled, this code would implement digital up conversion in the FPGA, instead of using the DUC in the AD9862. N.B., this code hasn't been tested in something like a couple of years and may have suffered bit rot.
- I was looking at the serial_io.v module but I am not sure what its purpose is or what it does?
serial_io.v is only used for writing and reading the FPGA registers.
- The data arrives as a serial input (from USB bus) to the FX2 chip. Then, it gets to the FPGA as a 16bit parallel input from the GPIF bus. That input is called usb_data in the FPGA. However, upon reading the verilog code of the serial_io.v module, it seems that this module does a serial to parallel conversion of the SDI input and puts it into the 32 bit serial_data output register. I am confused as to whether the serial data from the USB is converted to parallel in the FX2 chip or whether the data sent to the FPGA is serial but a verilog module inside the FPGA converts it to parallel. So I am not sure whether the data coming from the USB to the usrp_std.v module is the 16bit input usb_data or the SDI input.
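The actual data to/from the DACs/ADCs (DUCs/DDCs) goes across the 16-bit GPIF bus, so it reaches the FPGA already in parallel form as usb_data. SDI is a separate, slow serial line that is only used for reading and writing the FPGA registers through serial_io.v. (#FIXME# the question is not fully answered)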
- I am looking at the top-level module usrp_std.v. It is my understanding that the data to be sent coming from the PC or the data received going to the PC is the "usbdata" inout. The other inouts io_tx_a, io_tx_b, io_rx_a and io_rx_b are the signals going to/coming from the daughterboards. I am still confused about the purpose of the following inputs: rx_a_a, rx_b_a, rx_a_b, rx_b_b and the following outputs: tx_a, tx_b.
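Refer to the AD9862 data sheet. These are the digital sample buses between the FPGA and the AD9862 codecs: the rx_* inputs carry ADC samples and the tx_* outputs carry DAC samples.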
- I am looking at the transmission buffer (tx_buffer.v) verilog code. So it seems that it uses two clock signals. The txclk signal is used to read from the fifo_4k while the usbclk signal is used to write to the fifo_4k. What are the clock speeds for those clock signals?
The USB clock is 48 MHz, provided by the FX2.
- One of the inputs to the usrp_std module is SDI. What is the origin of this input?
The 8051 bit-bangs the SDI data, clock, and strobe lines. They are "slow" compared to other FPGA actions. See usrp/firmware/src/usrp2/spi.c.
- The memory allocated for each of the FIFOs in tx_buffer and rx_buffer is twice their capacity, i.e. 65K bits (8192 bytes), as seen in the Quartus compilation report. Is there any specific reason?
The FIFOs are 4K lines long, with each line 16 bits wide: 4096 x 16 = 65,536 bits, or 8192 bytes.
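- In cordic.v, the 16-bit vectors xi, yi and zi are passed as inputs. Then the CORDIC constants are pre-defined. What is the statement wire [bitwidth+1:0] xi_ext = {{2{xi[bitwidth-1]}},xi}; doing? Is it just normal concatenation and assignment?
Yes. It is a plain concatenation that sign-extends xi: the MSB xi[bitwidth-1] is replicated twice and placed in front of xi, so xi_ext is bitwidth+2 bits wide and has the same signed value as xi.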
- The @posedge clock, if reset is set, then x0, y0, z0 is initialized to '0'. Next, when enable is set, z0 is assigned all bits except the first two. And the first 2 bits are used for the case statement. In this case statements, the first 2 bits are used to determine the quadrants and accordingly adjust the quadrant (angle) for xi_ext and yi_ext by 90 or 180 or 270 or 360 degree. But I am not sure. Is this correct?
Yes
- Many instances of cordic_stage are defined and they are passed input values x0, y0, z0 and constants c00 etc. I am not sure what #(bitwidth+2,zwidth-1,0) in this instance definition does?
The #( ) is used to pass parameters to the module being instantiated.
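As a minimal sketch of how this works, the example below overrides the parameters of a small module in two ways. The module delay_reg and its parameters WIDTH and RESET_VALUE are hypothetical and only serve the illustration. In #(bitwidth+2,zwidth-1,0), the three values are assigned in the same positional way, in order, to the first three parameters declared in cordic_stage.

```verilog
// Hypothetical module used only to illustrate parameter passing.
module delay_reg #(parameter WIDTH = 8, parameter RESET_VALUE = 0)
  (input clock, input reset,
   input      [WIDTH-1:0] d,
   output reg [WIDTH-1:0] q);
  always @(posedge clock)
    if (reset) q <= RESET_VALUE;
    else       q <= d;
endmodule

module param_demo
  (input clock, input reset,
   input  [17:0] d,
   output [17:0] q1,
   output [17:0] q2);
  // Positional override: values are matched to parameters in declaration
  // order, so WIDTH = 18 and RESET_VALUE = 0.
  delay_reg #(18, 0) r_positional
    (.clock(clock), .reset(reset), .d(d), .q(q1));

  // Equivalent override by name (Verilog-2001).
  delay_reg #(.WIDTH(18), .RESET_VALUE(0)) r_named
    (.clock(clock), .reset(reset), .d(d), .q(q2));
endmodule
```

The positional form is compact but depends on the declaration order inside the instantiated module; the named form is longer and survives reordering.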
- In cordic_stage.v, if reset is set then xo, y0 and zo are set to 0. And when enable is set, value of z is checked. My understanding of cordic logic is that, if say that current angle is 45, and desired is 20, then subtract from the current angle, and make corresponding adjustments on x axis and y axis variables. This statement checks if "z_is_pos" and does next operation respectively. I am not sure what xi - {{shift+1{yi[bitwidth-1]}}, yi[bitwidth-2:shift]} is actually doing. My understanding of cordic is that based on angle adjustments, operations of multiples of 2 are carried out on x-axis and y-axis variable. And {} statement are used for concatenation in verilog.
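It is sign extension. yi[bitwidth-2:shift] drops the low shift bits of yi, and the sign bit yi[bitwidth-1] is replicated shift+1 times to fill the top, so the result is still bitwidth bits wide. In other words, it is yi arithmetically shifted right by shift bits (yi divided by 2^shift), which is then subtracted from xi.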
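One way to convince yourself of what the concatenation does is to compare it with the arithmetic right shift operator on a signed value. The sketch below does that in simulation; the module name and the values of bitwidth and shift are assumptions chosen for the example.

```verilog
// Illustrative comparison; shift_demo and the parameter values are hypothetical.
module shift_demo;
  parameter bitwidth = 16;
  parameter shift    = 3;

  reg signed [bitwidth-1:0] yi;

  // Form used in the question: replicated sign bit, then the upper bits of yi
  wire [bitwidth-1:0] by_concat =
      {{shift+1{yi[bitwidth-1]}}, yi[bitwidth-2:shift]};

  // Arithmetic right shift of the signed value
  wire signed [bitwidth-1:0] by_shift = yi >>> shift;

  initial begin
    yi = -16'sd100;
    #1 $display("yi=%0d concat=%0d shift=%0d", yi, $signed(by_concat), by_shift);
    yi = 16'sd100;
    #1 $display("yi=%0d concat=%0d shift=%0d", yi, $signed(by_concat), by_shift);
    $finish;
  end
endmodule
```

For both the negative and the positive input, the two printed results should agree. Because the sign bit fills the vacated positions, a negative yi stays negative after the shift.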
- In cordic.v, xo, yo and zo are in continuous assignment mode and are updated every time a x12, y12 or z12 value changes respectively. Not exactly sure how the exact desired angle is obtained in cordic.v. The algorithm that I read made continuous adjustments on 'current angle' so that it converged to the desired angle. I don't see that step happening in cordic.v?
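Here is how it goes. Each cordic_stage checks the sign of its incoming angle z. If the angle is positive, it is reduced by that stage's constant; if the angle is negative, it is increased. The residual angle therefore gets closer to zero from one stage to the next, so the convergence happens along the chain of stages rather than in a loop.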
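The behaviour described above can be sketched as a single rotation-mode CORDIC stage. This is not the code in cordic_stage.v: the module name is hypothetical, the ports are declared signed so that >>> can stand in for the explicit sign-extension concatenation, and no extra guard bits are added against overflow.

```verilog
// Sketch of one rotation-mode CORDIC stage; not a copy of cordic_stage.v.
module cordic_stage_sketch
  #(parameter bitwidth = 16, parameter zwidth = 16, parameter shift = 1)
  (input clock, input reset, input enable,
   input  signed [bitwidth-1:0] xi,
   input  signed [bitwidth-1:0] yi,
   input  signed [zwidth-1:0]   zi,
   input  signed [zwidth-1:0]   constant,  // atan(2^-shift) in the z scaling
   output reg signed [bitwidth-1:0] xo,
   output reg signed [bitwidth-1:0] yo,
   output reg signed [zwidth-1:0]   zo);

  // Remaining angle is positive when its sign bit is clear
  wire z_is_pos = ~zi[zwidth-1];

  always @(posedge clock)
    if (reset) begin
      xo <= 0;
      yo <= 0;
      zo <= 0;
    end else if (enable) begin
      // Rotate towards the target: direction depends on the sign of z
      xo <= z_is_pos ? xi - (yi >>> shift) : xi + (yi >>> shift);
      yo <= z_is_pos ? yi + (xi >>> shift) : yi - (xi >>> shift);
      // Positive angle is reduced, negative angle is increased
      zo <= z_is_pos ? zi - constant : zi + constant;
    end
endmodule
```

Chaining such stages with shift = 0, 1, 2, ... and the matching constants gives the pipelined structure: every clock, each stage moves its sample one step closer to a zero residual angle.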
- How do we write a test bench for tx_chain.v?
To write a test bench, you need to set the interpolation rate properly and, if you want to use the CORDIC, set up the phase accumulator as well. You then strobe your samples in at a rate Fs, where Fs is a fraction of the clock rate such that Fs*interp_rate = the clock rate.
You also need to assert interpolator_strobe and sample_strobe at a consistent rate, e.g. once every 15 clock cycles. You can't assert and de-assert them at random.
You have no frequency set. I believe this frequency should be the phase difference between samples of the CORDIC for each sample that goes in, where 2^31-1 is equal to 2*PI. You probably want to see a sine wave or some filtered signal: try a pulse train, {1, 0, 1, 0, ...}, or download the math_real.v module. The test bench in test_chan_fifo_reader.v can be used to create some sines and cosines.
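A minimal stimulus sketch along these lines is shown below. It only generates the clock, a sample strobe once every 15 clock cycles, and a {1, 0, 1, 0, ...} pulse train. The tx_chain instance itself is left out because its port list is not given here, and the names tx_stimulus_sketch, INTERP, AMPLITUDE and i_in are assumptions made for the example.

```verilog
`timescale 1ns/1ps
// Stimulus sketch only: connect sample_strobe and i_in to the design under test.
module tx_stimulus_sketch;
  parameter INTERP    = 15;        // clock cycles per input sample (example value)
  parameter AMPLITUDE = 16'd1000;  // assumed height of the "1" in the pulse train

  reg        clock         = 0;
  reg        sample_strobe = 0;
  reg [15:0] i_in          = 0;
  reg        toggle        = 0;
  integer    count         = 0;

  always #5 clock = ~clock;        // free-running clock

  always @(posedge clock) begin
    if (count == INTERP-1) begin
      // One strobe every INTERP clocks, always at the same spacing
      count         <= 0;
      sample_strobe <= 1;
      toggle        <= ~toggle;
      i_in          <= toggle ? 16'd0 : AMPLITUDE;  // 1, 0, 1, 0, ...
    end else begin
      count         <= count + 1;
      sample_strobe <= 0;
    end
  end

  initial begin
    $dumpfile("tx_stimulus.vcd");
    $dumpvars(0, tx_stimulus_sketch);
    #2000 $finish;
  end
endmodule
```

With Icarus Verilog this can be compiled with iverilog and run with vvp, and the resulting tx_stimulus.vcd opened in GTKWave. An interpolator strobe would be generated in the same way, with its own fixed period.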
- Do people just burn their Verilog code to the board to test it?
For pre-synthesis RTL simulation and testing, I use Icarus Verilog under Linux and GTKWave. Note that if you plan on using Icarus Verilog, you will need one of the latest developer snapshots, as the code I wrote exposed some problems in its real value handling.
It is definitely NOT recommended you just put it into an FPGA and run with it as you really have no idea what's going on inside there. Accurate modeling can really help with your debugging time.
- I went through the mailing list and figured out that the current Verilog/VHDL code implementation occupies 95% of FPGA's resources. However, there were some mails that pointed that reducing some receiver functionalities could free some FPGA resources. How?
The header file config.vh (trunk?) controls the build configuration and is now functional. Modify it to use the file ../include/common_config_1rxhb_1tx.vh, like this:
// Uncomment this for 1 RX channel (w/ halfband) & 1 transmit channel
`include "../include/common_config_1rxhb_1tx.vh"
This will free up a lot of space on the FPGA for experimentation!
- In rx_chain.v, what are the roles of the signals sample_strobe, decimator_strobe, hb_strobe, serial_addr, serial_data, serial_strobe, debugdata and debugctrl?
sample_strobe follows the full speed samples coming from the ADC and feeds the input of the decimating CIC filter.
decimator_strobe is generated by the decimating CIC filter to feed the samples into the halfband FIR filter at the specified decimated rate. When decimation by N takes place, you get one sample every N cycles. This ratio is preserved throughout that part of the chain.
hb_strobe comes out of the halfband decimating FIR filter. This filter always decimates by 2, and the strobe is asserted when the FIR filter has an output to pass out of the RX chain.
serial_* is the interface used for register writing. If you want to modify any of the registers within the module, the simple serial interface from the FX2 -> FPGA is what changes them. In this case, they are used to modify the register for phase accumulation within the FPGA for the cordic.
debug* is just used for debugging, as the name suggests. In the current implementation in the trunk, it looks like the input to the halfband FIR filter is what is output on the debug bus. (#FIXME#)
For informational purposes, the interpolation strobes and filtering modules are found in master_control.v, cic_interp.v and strobe_gen.v. The rate register is a simple register that is updated once the serial data has been completely written (read: atomic operation). strobe_gen is a down counter: when it reaches 0, it reloads from the rate register that feeds the module.
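The reloading down counter can be sketched as follows. The port names, the 8-bit width and the module name are assumptions made for the illustration, so treat it as a model of the idea rather than a copy of strobe_gen.v.

```verilog
// Sketch of a reloadable down counter used as a strobe divider.
module strobe_gen_sketch
  (input clock, input reset, input enable,
   input [7:0] rate,       // reload value: desired period minus one
   input strobe_in,        // incoming strobe to be divided down
   output wire strobe);    // asserted once every (rate + 1) incoming strobes

  reg [7:0] counter;

  // Output fires on the incoming strobe that finds the counter at zero
  assign strobe = (counter == 0) && enable && strobe_in;

  always @(posedge clock)
    if (reset || !enable)
      counter <= 0;
    else if (strobe_in) begin
      if (counter == 0)
        counter <= rate;         // reload from the rate register
      else
        counter <= counter - 1;  // count down towards zero
    end
endmodule
```

With rate set to N-1, the counter passes through N states between reloads, so strobe is asserted on every Nth strobe_in.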
- How many clock cycles does it take the cordic module to output valid data and why is the cordic algorithm implemented (in cordic.v) fixed at 12 iterations?
It is a pipelined module which takes samples at the full rate and through 12 pipeline stages outputs the rotated vector. Why 12 was picked is probably a size issue since you create registers and flops for each stage. You can make the pipeline deeper and even variable if you want. There is a TODO in there which suggests making it a variable length - not sure if the comment means at runtime or at build time, but have a go at it.
- Can we use Mega cell Altera blocks?
Yes. For example, for blocks like the tx_usb_fifo and rx_usb_fifo, you can just tell Quartus to generate the appropriate dual-clock FIFO "megacell" for you. This is how Matt generated the fifo_2k and fifo_4k blocks that we're currently using. See the Quartus manual and the "Cyclone Device Handbook" for supported configurations.
- The following is the setting_reg module:
module setting_reg(input clock, input reset, input strobe, input wire [6:0] addr, input wire [31:0] in, output reg [31:0] out, output reg changed);
In the master_control.v module, I see that setting_reg is instantiated as follows:
setting_reg #(`FR_MASTER_CTRL) sr_mstr_ctrl(.clock(master_clk),.reset(1'b0),.strobe(serial_strobe), .addr(serial_addr),.in(serial_data),.out(master_controls));
However, in the setting_reg.v module, setting_reg has an output signal called "changed". Can anyone please tell me what will happen to that "changed" signal?
The "changed" signal is not used in most places, so the port is simply left unconnected in those instantiations.
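The sketch below shows both cases side by side: one instance that leaves "changed" unconnected and one that uses it. The setting_reg body and the parameter name my_addr are assumptions reconstructed from the port list quoted in the question, not a copy of setting_reg.v, and the wrapper module and register addresses are hypothetical.

```verilog
// Stand-alone sketch; the body and the parameter name my_addr are assumptions.
module setting_reg #(parameter my_addr = 0)
  (input clock, input reset, input strobe, input wire [6:0] addr,
   input wire [31:0] in, output reg [31:0] out, output reg changed);
  always @(posedge clock)
    if (reset) begin
      out     <= 32'd0;
      changed <= 1'b0;
    end else if (strobe & (my_addr == addr)) begin
      out     <= in;     // register write addressed to this instance
      changed <= 1'b1;   // flag the update
    end else
      changed <= 1'b0;
endmodule

module changed_demo
  (input clk, input strobe, input [6:0] addr, input [31:0] data,
   output [31:0] ctrl_a, output [31:0] ctrl_b, output b_changed);

  // "changed" is omitted from the named connection list: it stays unconnected.
  setting_reg #(4) sr_a
    (.clock(clk), .reset(1'b0), .strobe(strobe),
     .addr(addr), .in(data), .out(ctrl_a));

  // Here "changed" is connected and can trigger logic on a register update.
  setting_reg #(5) sr_b
    (.clock(clk), .reset(1'b0), .strobe(strobe),
     .addr(addr), .in(data), .out(ctrl_b), .changed(b_changed));
endmodule
```

Leaving an output out of a named port list is legal Verilog. A tool may warn about the unconnected port, and synthesis will typically remove the logic that drives only that output.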
