Revision 230c00d by Mai Lavelle March 8, 2017, 05:52 (GMT) |
Cycles: OpenCL split kernel refactor
This does a few things at once:
- Refactors host side split kernel logic into a new device agnostic class `DeviceSplitKernel`.
- Removes tile splitting, a new work pool implementation takes its place and allows as many threads as will fit in memory regardless of tile size, which can give performance gains.
- Refactors split state buffers into one buffer, as well as reduces the number of arguments passed to kernels. Means there's less code to deal with overall.
- Moves kernel logic out of OpenCL kernel files so they can later be used by other device types.
- Replaced OpenCL specific APIs with new generic versions
- Tiles can now be seen updating during rendering |
Revision 520b533 by Mai Lavelle March 8, 2017, 05:52 (GMT) |
Cycles: Add OpenCL kernel for zeroing memory buffers Transferring memory to the device was very slow and there's really no need when only zeroing a buffer. |
Revision dfd6055 by Mai Lavelle March 8, 2017, 05:52 (GMT) |
Cycles: Add more atomic operations |
Revision bc65276 by Mai Lavelle March 8, 2017, 05:52 (GMT) |
Cycles: Expose passes size to device tasks This is needed so devices can know the size of a tile buffer before any tiles are acquired. |
Revision 0f56f7a by Mai Lavelle March 8, 2017, 05:52 (GMT) |
Cycles: Allow device_memory to be used directly This is useful when there's no host side memory attached to the buffer |
March 8, 2017, 03:14 (GMT) |
OpenGL: convert editarmature_sketch to new imm mode Part of T49043, T49042 Reviewers: fclem, merwin Tags: #bf_blender_2.8 Differential Revision: https://developer.blender.org/D2548 |
Revision 696bb47 by Clément Foucault March 7, 2017, 17:09 (GMT) |
OpenGL: Converted a bit more of drawobject.c. |
Revision 9e566b0 by Sergey Sharybin March 7, 2017, 16:32 (GMT) |
Task scheduler: Add concept of suspended pools Suspended pools allow pushing a huge number of initial tasks without any threading synchronization, and hence without its overhead. This gives a ~50% speedup of cached rigid body with the file from T50027 and seems to have no negative effect in other scenes here. |
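The suspended pool idea described above can be sketched roughly as follows. This is only an illustration in Python, not Blender's C implementation: the class `SuspendedPool` and its methods `push` and `work_and_wait` are hypothetical names chosen for this sketch. While the pool is suspended, tasks are appended to a plain list with no locking, and they are handed to a synchronized queue in one pass before any worker thread exists.

```python
import queue
import threading


class SuspendedPool:
    """Hypothetical pool: tasks pushed before start are buffered without locking."""

    def __init__(self, num_threads=4):
        self.num_threads = num_threads
        self.suspended = True
        self._pending = []           # touched only by the owner thread while suspended
        self._queue = queue.Queue()  # synchronized queue used once workers run

    def push(self, func, *args):
        if self.suspended:
            # No lock and no worker wake-up: just remember the task.
            self._pending.append((func, args))
        else:
            self._queue.put((func, args))

    def work_and_wait(self):
        # Hand over all buffered tasks in one pass. No workers exist yet,
        # so these puts are uncontended.
        self.suspended = False
        for task in self._pending:
            self._queue.put(task)
        self._pending.clear()

        def worker():
            while True:
                try:
                    func, args = self._queue.get_nowait()
                except queue.Empty:
                    return
                func(*args)

        threads = [threading.Thread(target=worker) for _ in range(self.num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


# Usage: push many initial tasks cheaply, then start the workers.
results = []
results_lock = threading.Lock()


def record(i):
    with results_lock:
        results.append(i * i)


pool = SuspendedPool(num_threads=4)
for i in range(1000):
    pool.push(record, i)
pool.work_and_wait()
```

The point of the design is that the cost of synchronization is paid only after the initial batch is complete, rather than once per pushed task while workers are already competing for the queue.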
Revision 347410a by Sergey Sharybin March 7, 2017, 16:32 (GMT) |
Depsgraph: Remove workarounds from depsgraph for keeping threads alive This is something that should be done in the task scheduler instead, with local thread queues, so we handle this in a single place. |
Revision 55c2cd8 by Sergey Sharybin March 7, 2017, 16:32 (GMT) |
Task scheduler: Initial implementation of local task queues The idea is to allow some number of tasks to be pushed from a working thread to its local queue, so we can acquire some work without doing a whole mutex lock. This should allow us to remove some hacks from depsgraph which were added there to keep threads alive. |
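A minimal sketch of the local queue idea, assuming a very small per-thread queue. Everything here is hypothetical and written in Python for brevity (`Scheduler`, `push_from_worker`, `LOCAL_QUEUE_SIZE`, and the `locked_pushes` counter are illustration names, not Blender's API). A worker that pushes a follow-up task stores it in its own private queue when there is room, and only falls back to the mutex-protected shared queue otherwise.

```python
import collections
import threading

LOCAL_QUEUE_SIZE = 1  # hypothetical capacity of each worker's private queue


class Scheduler:
    def __init__(self):
        self.mutex = threading.Lock()
        self.shared = collections.deque()  # guarded by self.mutex
        self.locked_pushes = 0             # guarded by self.mutex
        self.local = threading.local()     # per-thread state

    def push(self, task):
        """Push from any thread; always takes the mutex."""
        with self.mutex:
            self.shared.append(task)
            self.locked_pushes += 1

    def push_from_worker(self, task):
        """Push from a worker thread; prefers that worker's private queue."""
        local_queue = getattr(self.local, "queue", None)
        if local_queue is not None and len(local_queue) < LOCAL_QUEUE_SIZE:
            local_queue.append(task)  # no mutex: only this thread sees the queue
        else:
            self.push(task)

    def _pop(self):
        local_queue = self.local.queue
        if local_queue:
            return local_queue.popleft()  # local work first, no mutex
        with self.mutex:
            return self.shared.popleft() if self.shared else None

    def _worker(self):
        self.local.queue = collections.deque()
        while True:
            task = self._pop()
            if task is None:
                return
            task(self)

    def run(self, num_threads=2):
        threads = [threading.Thread(target=self._worker) for _ in range(num_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()


# Usage: each task pushes one follow-up task until its chain is finished.
done = []
done_lock = threading.Lock()


def make_chain(depth):
    def task(sched):
        with done_lock:
            done.append(depth)
        if depth > 0:
            sched.push_from_worker(make_chain(depth - 1))
    return task


sched = Scheduler()
for _ in range(4):
    sched.push(make_chain(10))
sched.run(num_threads=2)
```

In this sketch only the four initial pushes go through the mutex. Every follow-up task lands in the pushing worker's private queue, which is also what keeps that worker busy instead of letting it go idle.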
Revision 2f722f1 by Sergey Sharybin March 7, 2017, 16:32 (GMT) |
Task scheduler: Use real pthread TLS to access the active thread's data This allows us to avoid TLS stored in the pool, which gives us the advantage of using the pre-allocated task pool for pools created from a non-main thread. Even on systems with slow pthread TLS this should not be a problem, because we access it only once, at pool construction time. If we want to use this more often (for example, to get rid of push_from_thread), we'll have to do much more accurate benchmarks. |
Revision a07ad02 by Sergey Sharybin March 7, 2017, 16:32 (GMT) |
Task scheduler: Refactor the way we store thread-specific data Basically, move all thread-specific data (currently only the task memory pool) from a dedicated array of taskScheduler to TaskThread. This way we can add more thread-specific data in the future with less hassle. |
Revision 9522f8a by Sergey Sharybin March 7, 2017, 16:32 (GMT) |
Task scheduler: Remove per-pool threads limit
This feature added extra complexity to task scheduling and required even more variables to be modified atomically, which resulted in the following issues:
- More complex code to maintain, which increases the risk of something going wrong when we modify the code.
- Extra barriers and/or locks during task scheduling, which cause extra threading overhead.
- Inability to use another implementation (such as TBB), even for comparison tests.
Notes about other changes: There were two places where we really had to use that limit. One of them is the single-threaded dependency graph. This will now construct a single-threaded scheduler at evaluation time. This shouldn't be a problem because it only happens when using debugging command line arguments, and the code simply doesn't run in regular Blender operation. The code is somewhat duplicated across the old and new depsgraph, but that seems OK since the old depsgraph is already gone in the 2.8 branch and I don't see where else we might want to use such a single-threaded scheduler. If we ever want to do so, we can move it to a centralized single-threaded scheduler in threads.c. OpenGL render was a bit trickier to port, but basically we use condition variables to wait for the background thread to do all the work. |
Revision 35d7812 by Aaron Carlisle March 7, 2017, 14:07 (GMT) |
Fix typo in command line arg list |
Revision af07603 by Julian Eisel March 7, 2017, 10:54 (GMT) |
Revision ca796f8 by Julian Eisel March 7, 2017, 10:23 (GMT) |
Once more T50565: Allow using planar constraints for scale manipulator |
March 7, 2017, 04:44 (GMT) |
March 7, 2017, 04:44 (GMT) |
March 7, 2017, 04:44 (GMT) |
Revision 583373f by Clément Foucault March 6, 2017, 19:57 (GMT) |
OpenGL: Object bound drawing. |