summaryrefslogtreecommitdiff
path: root/docs/doxygen/other/volk_guide.dox
diff options
context:
space:
mode:
authorMarc Lichtman <marcll@vt.edu>2018-11-02 00:53:52 -0400
committerMarc L <marcll@vt.edu>2019-06-07 12:45:14 -0400
commit73b074f121b0ab2ac38336916a60891c4d88d2cb (patch)
treede8f1782c897af3c278c224dcfbf96345c5c6f1d /docs/doxygen/other/volk_guide.dox
parentb6f15c59e96aa83142c47aeacd64da793dd8ba31 (diff)
docs: moved usage manual to wiki
docs: first snapshot of wiki's usage manual
Diffstat (limited to 'docs/doxygen/other/volk_guide.dox')
-rw-r--r--docs/doxygen/other/volk_guide.dox160
1 files changed, 0 insertions, 160 deletions
diff --git a/docs/doxygen/other/volk_guide.dox b/docs/doxygen/other/volk_guide.dox
deleted file mode 100644
index 1f8f128e4b..0000000000
--- a/docs/doxygen/other/volk_guide.dox
+++ /dev/null
@@ -1,160 +0,0 @@
-# Copyright (C) 2017 Free Software Foundation, Inc.
-#
-# Permission is granted to copy, distribute and/or modify this document
-# under the terms of the GNU Free Documentation License, Version 1.3
-# or any later version published by the Free Software Foundation;
-# with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
-# A copy of the license is included in the section entitled "GNU
-# Free Documentation License".
-
-/*! \page volk_guide Instructions for using VOLK in GNU Radio
-
-Note: Many blocks have already been converted to use VOLK in their calls, so
-they can also serve as examples. See the
-gr::blocks::complex_to_<type>.h files for examples of various blocks
-that make use of VOLK.
-
-\section volk_intro Introduction
-
-VOLK is the Vector-Optimized Library of Kernels. It is a library that
-contains kernels of hand-written SIMD code for different mathematical
-operations. Since each SIMD architecture can be greatly different and
-no compiler has yet come along to handle vectorization properly or
-highly efficiently, VOLK approaches the problem differently. For each
-architecture or platform that a developer wishes to vectorize for, a
-new proto-kernel is added to VOLK. At runtime, VOLK will select the
-correct proto-kernel. In this way, the users of VOLK call a kernel for
-performing the operation that is platform/architecture agnostic. This
-allows us to write portable SIMD code.
-
-VOLK kernels are always defined with a 'generic' proto-kernel, which
-is written in plain C. With the generic kernel, the kernel becomes
-portable to any platform. Kernels are then extended by adding
-proto-kernels for new platforms in which they are desired.
-
-A good example of a VOLK kernel with multiple proto-kernels defined is
-the volk_32f_s32f_multiply_32f_a. This kernel implements a scalar
-multiplication of a vector of floating point numbers (each item in the
-vector is multiplied by the same value). This kernel has the following
-proto-kernels that are defined for 'generic,' 'avx,' 'sse,' and 'neon'
-
-\code
- void volk_32f_s32f_multiply_32f_a_generic
- void volk_32f_s32f_multiply_32f_a_sse
- void volk_32f_s32f_multiply_32f_a_avx
- void volk_32f_s32f_multiply_32f_a_neon
-\endcode
-
-These proto-kernels means that on platforms with AVX support, VOLK can
-select this option or the SSE option, depending on which is faster. If
-all else fails, VOLK can fall back on the generic proto-kernel, which
-will always work.
-
-See <a
-href="http://libvolk.org</a> for details on the VOLK naming scheme.
-
-
-\section volk_alignment Setting and Using Memory Alignment Information
-
-For VOLK to work as best as possible, we want to use memory-aligned
-SIMD calls, which means we have to have some way of knowing and
-controlling the alignment of the buffers passed to gr_block's work
-function. We set the alignment requirement for SIMD aligned memory
-calls with:
-
-\code
- const int alignment_multiple =
- volk_get_alignment() / output_item_size;
- set_alignment(std::max(1,alignment_multiple));
-\endcode
-
-The VOLK function 'volk_get_alignment' provides the alignment of the
-the machine architecture. We then base the alignment on the number of
-output items required to maintain the alignment, so we divide the
-number of alignment bytes by the number of bytes in an output items
-(sizeof(float), sizeof(gr_complex), etc.). This value is then set per
-block with the 'set_alignment' function.
-
-Because the scheduler tries to optimize throughput, the number of
-items available per call to work will change and depends on the
-availability of the read and write buffers. This means that it
-sometimes cannot produce a buffer that is properly memory
-aligned. This is an inevitable consequence of the scheduler
-system. Instead of requiring alignment, the scheduler enforces the
-alignment as much as possible, and when a buffer becomes unaligned,
-the scheduler will work to correct it as much as possible. If a
-block's buffers are unaligned, then, the scheduler sets a flag to
-indicate as much so that the block can then decide what best to
-do. The next section discusses the use of the aligned/unaligned
-information in a gr_block's work function.
-
-
-\section volk_work Calling VOLK kernels in Work()
-
-The buffers passed to work/general_work in a gr_block are not
-guaranteed to be aligned, but they will mostly be aligned whenever
-possible. When not aligned, the 'is_unaligned()' flag will be set so
-the scheduler knows to try to realign the buffers. We actually make
-calls to the VOLK dispatcher, which is mainly designed to check the
-buffer alignments and call the correct version of the kernel for
-us. From the user-level view of VOLK, calling the dispatcher allows us
-to ignore the concept of aligned versus unaligned. This looks like:
-
-\code
-int
-gr_some_block::work (int noutput_items,
- gr_vector_const_void_star &input_items,
- gr_vector_void_star &output_items)
-{
- const float *in = (const float *) input_items[0];
- float *out = (float *) output_items[0];
-
- // Call the dispatcher to check alignment and call the _a or _u
- // version of the kernel.
- volk_32f_something_32f(out, in, noutput_items);
-
- return noutput_items;
-}
-\endcode
-
-
-
-\section volk_tuning Tuning VOLK Performance
-
-VOLK comes with a profiler that will build a config file for the best
-SIMD architecture for your processor. Run volk_profile that is
-installed into $PREFIX/bin. This program tests all known VOLK kernels
-for each architecture supported by the processor. When finished, it
-will write to $HOME/.volk/volk_config the best architecture for the
-VOLK function. This file is read when using a function to know the
-best version of the function to execute.
-
-\subsection volk_hand_tuning Hand-Tuning Performance
-
-If you know a particular architecture works best for your processor,
-you can specify the particular architecture to use in the VOLK
-preferences file: $HOME/.volk/volk_config
-
-The file looks like:
-
-\code
- volk_<FUNCTION_NAME> <ARCHITECTURE>
-\endcode
-
-Where the "FUNCTION_NAME" is the particular function that you want to
-over-ride the default value and "ARCHITECTURE" is the VOLK SIMD
-architecture to use (generic, sse, sse2, sse3, avx, etc.). For
-example, the following config file tells VOLK to use SSE3 for the
-aligned and unaligned versions of a function that multiplies two
-complex streams together.
-
-\code
- volk_32fc_x2_multiply_32fc_a sse3
- volk_32fc_x2_multiply_32fc_u sse3
-\endcode
-
-\b Tip: if benchmarking GNU Radio blocks, it can be useful to have a
-volk_config file that sets all architectures to 'generic' as a way to
-test the vectorized versus non-vectorized implementations.
-
-*/