Blender Git Loki
Revision e3a7925 by Martijn Berger (master) November 22, 2013, 13:42 (GMT)
Cycles: test code for SSE 4.1 kernel and alignment for some vector types. This is mostly work towards enabling the __KERNEL_SSE__ option to start using SIMD operations for vector math operations. This SSE 4.1 kernel performs about 8% faster with that option, but overall it is still slower than without the option. WITH_CYCLES_OPTIMIZED_KERNEL_SSE41 is the CMake flag for testing this kernel. Alignment of int3, int4, float3, float4 to 16 bytes seems to give a slight 1-2% speedup on tested systems with the current kernel already, so it is enabled now.
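To illustrate what 16-byte alignment of the vector types means in practice, here is a minimal sketch. It is not the code from this commit: the type names (aligned_float4, aligned_float3, packed_float3) and the helpers (is_aligned_16, add) are hypothetical, and it uses the C++11 alignas specifier as a stand-in for whatever compiler-specific attribute the real util_types.h uses. The point it shows is that an aligned 3-component type is padded to 16 bytes, so every array element starts on a boundary that an SSE register load can use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the vector types named in the commit message.
// These are illustrative only and are not the definitions used by Cycles.
struct alignas(16) aligned_float4 { float x, y, z, w; };

// Three floats occupy 12 bytes; alignas(16) pads the struct to 16 bytes.
struct alignas(16) aligned_float3 { float x, y, z; };

// The same layout without an alignment request, for comparison.
struct packed_float3 { float x, y, z; };

static_assert(alignof(aligned_float4) == 16, "float4 stand-in must be 16-byte aligned");
static_assert(alignof(aligned_float3) == 16, "float3 stand-in must be 16-byte aligned");
static_assert(sizeof(aligned_float3) == 16, "aligned float3 stand-in is padded to 16 bytes");
static_assert(sizeof(packed_float3) == 3 * sizeof(float), "unaligned float3 has no padding");

// Returns true when the address is a multiple of 16 bytes.
inline bool is_aligned_16(const void *p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Plain component-wise addition. With 16-byte aligned operands a compiler
// is free to implement this with a single SSE add, but that is up to the
// optimizer; nothing here forces SIMD code generation.
inline aligned_float4 add(const aligned_float4 &a, const aligned_float4 &b)
{
    return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}
```

The trade-off is memory: the aligned 3-component type takes 16 bytes per element instead of 12, in exchange for every element being addressable with aligned loads and stores.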
Commit Details:
Full Hash: e3a79258d17e6cdca26120eab7a2c48c7c4d4a0f
Parent Commit: 5feb0d2
Committed By: Brecht Van Lommel
Lines Changed: +187, -6
1 Added Path:
/intern/cycles/kernel/kernel_sse41.cpp (+76, -0) (View)
10 Modified Paths:
/intern/cycles/CMakeLists.txt (+4, -0) (Diff)
/intern/cycles/device/CMakeLists.txt (+4, -0) (Diff)
/intern/cycles/device/device_cpu.cpp (+50, -0) (Diff)
/intern/cycles/kernel/CMakeLists.txt (+7, -2) (Diff)
/intern/cycles/kernel/kernel.h (+9, -0) (Diff)
/intern/cycles/SConscript (+10, -0) (Diff)
/intern/cycles/util/util_math.h (+13, -0) (Diff)
/intern/cycles/util/util_system.cpp (+5, -0) (Diff)
/intern/cycles/util/util_system.h (+1, -0) (Diff)
/intern/cycles/util/util_types.h (+8, -4) (Diff)