Pre SSE 4.1 floor/ceil/round functions (+modulo bonus)

Oct 23, 2013

The Steam Hardware & Software Survey for September 2013 shows that 68% of users’ CPUs support SSE4.1, leaving 32% on SSE3 or older. It also shows that 99.7% of users’ CPUs support SSE3. What this tells me is that SSE3 is where it’s at: if you want your game to run on most CPUs, you can’t use SSE4.1 intrinsic functions.

Problem is, SSE4.1 is awesome. It’s awesome for many reasons, but let’s focus on one of them: _mm_round_ps(). This intrinsic, which yields a single ‘roundps’ instruction, can compute the math functions Floor, Ceil and, as its name suggests, Round.

Well, this is great and all, but what can we do with non-SSE4.1 instructions? You can always use the standard C math.h floor(), ceil() and your own flavor of round(), but if you want to do some intensive and fast vector math in your game, SSE is your savior. If you search Google for something like ‘sse floor’, you’ll probably get a lot of wrong algorithms. Most of them don’t work for negative integer values, flooring -10 to -11 for instance. So I took some time to figure out an algorithm using only SSE3 instructions.


    #include <emmintrin.h> // SSE2 covers every intrinsic used below

    inline __m128 _mm_floor_ps2(const __m128& x){
        __m128i v0 = _mm_setzero_si128();
        __m128i v1 = _mm_cmpeq_epi32(v0,v0);                  // all bits set
        __m128i ji = _mm_srli_epi32( v1, 25);                 // 0x0000007f
        __m128 j = _mm_castsi128_ps(_mm_slli_epi32( ji, 23)); // 0x3f800000: create vector 1.0f
        __m128i i = _mm_cvttps_epi32(x);                      // truncate toward zero
        __m128 fi = _mm_cvtepi32_ps(i);
        __m128 igx = _mm_cmpgt_ps(fi, x);                     // lanes where truncation went above x
        j = _mm_and_ps(igx, j);                               // 1.0f in those lanes, 0.0f elsewhere
        return _mm_sub_ps(fi, j);
    }
    inline __m128 _mm_ceil_ps2(const __m128& x){
        __m128i v0 = _mm_setzero_si128();
        __m128i v1 = _mm_cmpeq_epi32(v0,v0);                  // all bits set
        __m128i ji = _mm_srli_epi32( v1, 25);                 // 0x0000007f
        __m128 j = _mm_castsi128_ps(_mm_slli_epi32( ji, 23)); // 0x3f800000: create vector 1.0f
        __m128i i = _mm_cvttps_epi32(x);                      // truncate toward zero
        __m128 fi = _mm_cvtepi32_ps(i);
        __m128 igx = _mm_cmplt_ps(fi, x);                     // lanes where truncation went below x
        j = _mm_and_ps(igx, j);                               // 1.0f in those lanes, 0.0f elsewhere
        return _mm_add_ps(fi, j);
    }
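
For comparison, here is a minimal sketch of the kind of broken floor mentioned earlier: truncate, then subtract 1 from every negative lane. The names `floor_naive`, `floor_fixed`, `lane0` and `check_floor_variants` are made up for this illustration, and `floor_fixed` is only a compact restatement of the compare-against-input idea, not a replacement for the functions in this post. Both assume inputs that fit in a 32-bit int.

```cpp
#include <cassert>
#include <emmintrin.h> // SSE2

// Hypothetical name: the broken approach, subtracting 1 whenever x < 0.
inline __m128 floor_naive(__m128 x) {
    __m128 t   = _mm_cvtepi32_ps(_mm_cvttps_epi32(x)); // truncate toward zero
    __m128 neg = _mm_cmplt_ps(x, _mm_setzero_ps());    // all-ones where x < 0
    return _mm_sub_ps(t, _mm_and_ps(neg, _mm_set1_ps(1.0f)));
}

// Hypothetical name: subtract 1 only where truncation ended up above x.
inline __m128 floor_fixed(__m128 x) {
    __m128 t  = _mm_cvtepi32_ps(_mm_cvttps_epi32(x));  // truncate toward zero
    __m128 gt = _mm_cmpgt_ps(t, x);                    // all-ones where t > x
    return _mm_sub_ps(t, _mm_and_ps(gt, _mm_set1_ps(1.0f)));
}

// Read the lowest lane of a vector.
inline float lane0(__m128 v) { return _mm_cvtss_f32(v); }

// Small self-check showing where the two variants disagree.
inline void check_floor_variants() {
    assert(lane0(floor_naive(_mm_set1_ps(-10.0f))) == -11.0f); // wrong
    assert(lane0(floor_fixed(_mm_set1_ps(-10.0f))) == -10.0f); // right
    assert(lane0(floor_fixed(_mm_set1_ps(-10.5f))) == -11.0f);
    assert(lane0(floor_fixed(_mm_set1_ps(10.5f)))  ==  10.0f);
}
```

The two variants agree on negative values with a fractional part; they only differ on negative values that are already integers.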

    inline __m128 _mm_round_ps2(const __m128& a){
        __m128 v0 = _mm_setzero_ps();
        __m128 v1 = _mm_cmpeq_ps(v0,v0);           // all bits set
        // shift right by 2 to get 0x3fffffff: the highest float value < 2
        __m128 vNearest2 = _mm_castsi128_ps(_mm_srli_epi32( _mm_castps_si128(v1), 2));
        __m128i i = _mm_cvttps_epi32(a);
        __m128 aTrunc = _mm_cvtepi32_ps(i);        // truncate a
        __m128 rmd = _mm_sub_ps(a, aTrunc);        // get remainder
        __m128 rmd2 = _mm_mul_ps( rmd, vNearest2); // mul remainder by near 2 will yield the needed offset
        __m128i rmd2i = _mm_cvttps_epi32(rmd2);    // after being truncated of course
        __m128 rmd2Trunc = _mm_cvtepi32_ps(rmd2i);
        __m128 r = _mm_add_ps(aTrunc, rmd2Trunc);
        return r;
    }
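
A scalar walk-through of the arithmetic in _mm_round_ps2 may make the "multiply by almost 2" trick easier to follow. This is only a sketch: `round_scalar_sketch` is a hypothetical name, and it uses std::trunc where the vector code converts to int and back, so it mirrors the math rather than the instructions.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar mirror of the rounding trick, one float at a time.
inline float round_scalar_sketch(float a) {
    const float nearTwo = std::nextafter(2.0f, 0.0f); // largest float below 2 (bits 0x3fffffff)
    float aTrunc = std::trunc(a);                     // integer part, toward zero
    float rmd    = a - aTrunc;                        // remainder in (-1, 1)
    float offset = std::trunc(rmd * nearTwo);         // -1, 0 or +1
    return aTrunc + offset;
}

// Small self-check on a few easy values.
inline void check_round_sketch() {
    assert(round_scalar_sketch(2.7f)  ==  3.0f);
    assert(round_scalar_sketch(2.3f)  ==  2.0f);
    assert(round_scalar_sketch(-2.7f) == -3.0f);
    assert(round_scalar_sketch(2.5f)  ==  2.0f); // exact half goes toward zero
}
```

Because the multiplier is just under 2, a remainder of exactly 0.5 scales to a value just under 1 and truncates to 0, so exact halves end up rounded toward zero (2.5 gives 2).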

Edit: Special thanks to obyzouth, who worked out better SSE code for the floor and ceil functions. 🙂 It does not handle NaNs and infinities, but those don’t have to be handled, they have to be eradicated. A good NaN is a non-existent NaN. You should use functions that handle them in your debug build though.
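
One possible shape for such a debug-build check is sketched here. The helper names `all_finite_ps` and `checked_input` are hypothetical, not part of any library; the idea is simply to assert that no lane is NaN or infinite before handing the vector to the fast functions.

```cpp
#include <cassert>
#include <cfloat>
#include <limits>
#include <emmintrin.h> // SSE2

// Hypothetical helper: true when all four lanes are finite (no NaN, no +/-Inf).
inline bool all_finite_ps(__m128 x) {
    const __m128 absMask = _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff));
    __m128 ax = _mm_and_ps(x, absMask);                 // clear the sign bit
    __m128 ok = _mm_cmple_ps(ax, _mm_set1_ps(FLT_MAX)); // false for NaN and for Inf
    return _mm_movemask_ps(ok) == 0xF;                  // all four lanes passed
}

// Hypothetical wrapper: asserts in debug builds, compiles to a pass-through with NDEBUG.
inline __m128 checked_input(__m128 x) {
    assert(all_finite_ps(x));
    return x;
}
```

A call site could then look like `_mm_floor_ps2(checked_input(v))`, which costs nothing once asserts are compiled out.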

As you can see, I convert to int and back to float to round the value. This will not work for values that cannot be represented by a 32-bit int. If you need these functions to handle that kind of value, you might want to reconsider using float to begin with, as it loses quite a lot of precision in those ranges. But for the sake of absolute safety, here’s what you can do:

    template< __m128 (FuncT)(const __m128&) >
    inline __m128 _mm_safeInt_ps(const __m128& a){
        __m128 v8388608 = _mm_castsi128_ps(_mm_set1_epi32(0x4b000000));            //vector with value 8388608
        __m128 aAbs = _mm_and_ps(a, _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff))); //Abs(a)
        __m128 aMask = _mm_cmpgt_ps(aAbs, v8388608);                               //if Abs(a) > 8388608
        // select a if greater than 8388608.0f, otherwise select the result of FuncT
        __m128 r = _mm_xor_ps( _mm_and_ps(aMask, a), _mm_andnot_ps(aMask, FuncT(a)) );
        return r;
    }
    ...
    //then call your functions like so:
    _mm_safeInt_ps<_mm_floor_ps2>( ... );
    _mm_safeInt_ps<_mm_ceil_ps2 >( ... );
    _mm_safeInt_ps<_mm_round_ps2>( ... );
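
To see what the guard protects against, here is a small sketch of the int round trip on its own. `trunc_roundtrip` and `check_trunc_roundtrip` are hypothetical names for this illustration. When a lane does not fit in a 32-bit int, the conversion instruction returns 0x80000000, which reads back as -2147483648.0f.

```cpp
#include <cassert>
#include <emmintrin.h> // SSE2

// Hypothetical helper: float -> int32 (truncating) -> float, lowest lane only.
inline float trunc_roundtrip(float f) {
    __m128i i = _mm_cvttps_epi32(_mm_set1_ps(f)); // out-of-range lanes become 0x80000000
    return _mm_cvtss_f32(_mm_cvtepi32_ps(i));
}

// Small self-check; volatile keeps the compiler from pre-computing the result.
inline void check_trunc_roundtrip() {
    volatile float small = -3.7f;
    volatile float big   = 3.0e9f; // larger than INT32_MAX
    assert(trunc_roundtrip(small) == -3.0f);
    assert(trunc_roundtrip(big)   == -2147483648.0f); // garbage, not 3.0e9f
}
```

Without the guard, an unprotected floor of 3.0e9f would therefore come back as a large negative number instead of the input.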

8388608 (2^23) is the lowest float value from which no fractional part can be stored, because float precision gets coarser as the magnitude grows. So floor()/ceil()/round() return their input unchanged for any number greater than or equal to that.
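
The threshold can be checked with a few lines of standard C++. This sketch uses hypothetical helper names (`gap_above`, `float_from_bits`) and measures the distance from a float to the next representable one: 0.5 just below 8388608, and 1 from 8388608 on.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: distance from f to the next representable float above it.
inline float gap_above(float f) {
    return std::nextafter(f, INFINITY) - f;
}

// Hypothetical helper: reinterpret a 32-bit pattern as a float.
inline float float_from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Small self-check of the spacing around 2^23.
inline void check_float_spacing() {
    assert(float_from_bits(0x4b000000u) == 8388608.0f); // the constant used in the guard
    assert(gap_above(8388607.0f) == 0.5f);              // halves still exist below 2^23
    assert(gap_above(8388608.0f) == 1.0f);              // only whole numbers from here on
}
```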

Bonus! Vector equivalent of fmod():

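    inline __m128 _mm_mod_ps2(const __m128& a, const __m128& aDiv){
        __m128 c = _mm_div_ps(a,aDiv);          // quotient
        __m128i i = _mm_cvttps_epi32(c);        // truncate toward zero (quotient must fit in a 32-bit int)
        __m128 cTrunc = _mm_cvtepi32_ps(i);
        __m128 base = _mm_mul_ps(cTrunc, aDiv); // largest whole multiple of aDiv toward zero
        __m128 r = _mm_sub_ps(a, base);         // remainder, same sign as a
        return r;
    }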
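
A quick way to gain confidence in the modulo is to compare it against std::fmod on values where every intermediate result is exact. The sketch below restates the same steps under the hypothetical name `mod_sketch` so that it compiles on its own; for arbitrary inputs the rounding in the division can make the last bits differ from fmod, so treat the comparison as a sanity check rather than a proof.

```cpp
#include <cassert>
#include <cmath>
#include <emmintrin.h> // SSE2

// Hypothetical standalone restatement of the vector modulo steps.
inline __m128 mod_sketch(__m128 a, __m128 d) {
    __m128 q = _mm_cvtepi32_ps(_mm_cvttps_epi32(_mm_div_ps(a, d))); // truncated quotient
    return _mm_sub_ps(a, _mm_mul_ps(q, d));                         // a - q * d
}

// Small self-check against std::fmod, lane by lane.
inline void check_mod_sketch() {
    const float a[4] = { 7.5f, -7.5f, 10.0f, 1.0f };
    const float d[4] = { 2.0f,  2.0f,  4.0f, 8.0f };
    float out[4];
    _mm_storeu_ps(out, mod_sketch(_mm_loadu_ps(a), _mm_loadu_ps(d)));
    for (int k = 0; k < 4; ++k) {
        assert(out[k] == std::fmod(a[k], d[k]));
    }
}
```

As with fmod(), the result keeps the sign of the dividend, so -7.5 modulo 2 gives -1.5.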

3 Responses to “Pre SSE 4.1 floor/ceil/round functions (+modulo bonus)”

Chuck Walbourn says:

October 24, 2013 at 6:28 pm

Have you looked at the implementations in DirectXMath? The 3.06 version that ships in the Windows 8.1 SDK / VS 2013 had some improvements to XMVectorRound, XMVectorFloor, and XMVectorCeil that work on SSE/SSE2 systems.

- StephanieRct says:
  
  October 24, 2013 at 6:51 pm
  
  Yes I looked at XMVectorFloor but the code doesn’t handle big floats correctly :/
  For instance it will floor 88607.0f to 88606.0f.
  

Dev Blog on Deep Space Settlement
