
SSE & AVX Vectorization

Marchete

First AVX Code: SQRT calculation

Now that we have reviewed the requirements, autovectorization, and AVX intrinsics, we can write our first manually vectorized program. In this exercise, you need to vectorize a square root calculation over float numbers. We will explicitly use the __m256 datatype to store our floats, which reduces the overhead of loading data.

Vectorized SQRT
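A minimal sketch of what such a vectorized square root could look like, assuming GCC or Clang on an x86-64 CPU that supports AVX. The names sqrt_avx, sqrt_scalar, max_abs_diff and the AVX_FN macro are illustrative and not taken from the exercise. The data is held in __m256 arrays, so each _mm256_sqrt_ps call computes 8 square roots at once.

```cpp
#include <immintrin.h>  // AVX intrinsics: __m256, _mm256_sqrt_ps, ...
#include <cassert>
#include <cmath>        // std::sqrt, std::fabs, std::fmax
#include <cstddef>      // std::size_t

// Assumption: GCC/Clang. The target attribute enables AVX for one function,
// so the file also builds without -mavx. On other compilers, enable AVX
// through the compiler flags instead.
#if defined(__GNUC__)
#define AVX_FN __attribute__((target("avx")))
#else
#define AVX_FN
#endif

// Hypothetical kernel: each __m256 holds 8 floats, so one iteration
// produces 8 square roots.
AVX_FN void sqrt_avx(const __m256* in, __m256* out, std::size_t packs) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < packs; ++i)
        out[i] = _mm256_sqrt_ps(in[i]);
}

// Scalar reference, one float at a time.
void sqrt_scalar(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = std::sqrt(in[i]);
}

// Runs both versions on the values 0..31 and returns the largest absolute
// difference between their results.
AVX_FN float max_abs_diff() {
    constexpr std::size_t kPacks = 4, kFloats = kPacks * 8;
    float src[kFloats], expected[kFloats], got[kFloats];
    for (std::size_t i = 0; i < kFloats; ++i) src[i] = static_cast<float>(i);

    __m256 in[kPacks], out[kPacks];  // data kept directly as AVX vectors
    for (std::size_t p = 0; p < kPacks; ++p)
        in[p] = _mm256_loadu_ps(src + 8 * p);  // pack 8 floats per vector

    sqrt_avx(in, out, kPacks);
    sqrt_scalar(src, expected, kFloats);

    // Unpack the vectors back into plain floats for the comparison.
    for (std::size_t p = 0; p < kPacks; ++p)
        _mm256_storeu_ps(got + 8 * p, out[p]);

    float worst = 0.0f;
    for (std::size_t i = 0; i < kFloats; ++i)
        worst = std::fmax(worst, std::fabs(got[i] - expected[i]));
    return worst;
}
```

Calling max_abs_diff() from main is a quick sanity check that the vectorized and scalar versions agree. Packing and unpacking happen only once around the kernel, which is the point of keeping the data in __m256 form.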

Once the data is loaded, you will probably see the AVX version run up to about 7 times faster than plain sqrtf. Because a __m256 holds eight floats, the theoretical limit is an 8x speedup, but it's rarely achieved. On average, expect a speedup of between 3x and 6x.
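To check figures like these on your own machine, a rough timing harness along the following lines may help. It is only a sketch under assumptions: GCC or Clang on an x86-64 CPU with AVX, illustrative names (compare_sqrt, Timing, kPacks, kRepeats) and illustrative sizes, none of which come from the exercise. Results depend heavily on compiler flags, CPU and data size, so treat the numbers as indicative.

```cpp
#include <immintrin.h>  // AVX intrinsics
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstring>      // std::memcpy

// Assumption: GCC/Clang. Enable AVX per function and keep the timed
// functions out of line so the repeated calls are not merged away.
#if defined(__GNUC__)
#define AVX_FN __attribute__((target("avx"), noinline))
#define NOINLINE __attribute__((noinline))
#else
#define AVX_FN
#define NOINLINE
#endif

// Illustrative sizes, not taken from the original exercise.
constexpr std::size_t kPacks = 4096;         // 8 floats per pack
constexpr std::size_t kFloats = kPacks * 8;
constexpr int kRepeats = 200;

NOINLINE void sqrt_scalar(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = std::sqrt(in[i]);
}

AVX_FN void sqrt_avx(const __m256* in, __m256* out, std::size_t packs) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < packs; ++i) out[i] = _mm256_sqrt_ps(in[i]);
}

struct Timing {
    double scalar_ms;    // total wall time of the scalar runs
    double avx_ms;       // total wall time of the AVX runs
    float max_abs_diff;  // largest difference between the two outputs
};

static double ms_since(std::chrono::steady_clock::time_point start) {
    return std::chrono::duration<double, std::milli>(
        std::chrono::steady_clock::now() - start).count();
}

Timing compare_sqrt() {
    static float src[kFloats], ref[kFloats], got[kFloats];
    static __m256 in[kPacks], out[kPacks];  // data already stored as vectors

    for (std::size_t i = 0; i < kFloats; ++i) src[i] = static_cast<float>(i);
    std::memcpy(in, src, sizeof(src));      // same floats, 8 per __m256

    Timing t{};
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < kRepeats; ++r) sqrt_scalar(src, ref, kFloats);
    t.scalar_ms = ms_since(start);

    start = std::chrono::steady_clock::now();
    for (int r = 0; r < kRepeats; ++r) sqrt_avx(in, out, kPacks);
    t.avx_ms = ms_since(start);

    // Copy the vector results back to floats and compare with the reference.
    std::memcpy(got, out, sizeof(got));
    for (std::size_t i = 0; i < kFloats; ++i)
        t.max_abs_diff = std::fmax(t.max_abs_diff, std::fabs(got[i] - ref[i]));
    return t;
}
```

The measured speedup is scalar_ms divided by avx_ms. Note that the packing step is deliberately left outside the timed region, matching the "once you have the data loaded" condition above.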
