March 4, 2021, 13:52 (GMT) |
Cleanup: simplify and comment CUDA kernel configuration logic |
March 4, 2021, 13:52 (GMT) |
Cycles: increase hardcoded CUDA max paths to a more representative value. Still needs a solution to compute this automatically. |
March 4, 2021, 13:52 (GMT) |
Cycles: stop CUDA integrator queue based on number of active paths. This means indirect light now works. |
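A minimal sketch of the idea of iterating until no paths remain active, rather than running a single pass. This is not the Cycles queue code: the function names and the use of a "remaining bounces" counter per path are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: each entry holds the number of bounces a path still
// has to trace; 0 means the path has terminated.
int count_active_paths(const std::vector<int> &remaining_bounces)
{
  int active = 0;
  for (const int bounces : remaining_bounces) {
    if (bounces > 0) {
      ++active;
    }
  }
  return active;
}

// Keeps executing "kernel" iterations while at least one path is active.
// Returns the number of iterations that were needed.
int run_until_no_active_paths(std::vector<int> &remaining_bounces)
{
  int iterations = 0;
  while (count_active_paths(remaining_bounces) > 0) {
    // Stand-in for enqueueing the integrator kernels once.
    for (int &bounces : remaining_bounces) {
      if (bounces > 0) {
        --bounces;
      }
    }
    ++iterations;
  }
  return iterations;
}
```

With a single iteration only the first bounce would be traced, which is why stopping on the active path count is what lets longer (indirect) paths finish.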
March 4, 2021, 13:52 (GMT) |
Fix build using WITH_CUDA_DYNLOAD=OFF |
March 4, 2021, 13:50 (GMT) |
Cycles: Remove progressive refine option. The new integrator is intrinsically progressive. |
March 4, 2021, 13:50 (GMT) |
Cycles: Remove tile size and order settings. The integrator will choose the tile size which suits the device best. Ordering for images smaller than a big tile is not needed due to the progressive nature of rendering. Configurable ordering for big tiles seems to just complicate things, without obvious benefit. |
March 4, 2021, 13:17 (GMT) |
Fix uninitialized render tile field access in Cycles update. Was reported by valgrind, not sure whether any real problems were caused by this issue or not. Still good to fix it! blender_session.cpp does check the task of the render tile in the `do_write_update_render_tile` callback. |
March 3, 2021, 18:02 (GMT) |
Cycles: Initialize ShaderData differentials to zero. They are not yet supported, and support is marked as TODO, but it seems important to avoid uninitialized values, as they cause non-deterministic issues on certain setups. For example, GCC + E5-2699v4 were giving different results when rendering one sample of BMW27 in multiple threads. |
March 3, 2021, 17:11 (GMT) |
Cycles: Reduce overhead in CPU scheduler. Run the samples loop outside of tbb's parallel_for range. Gives about 2% speedup. |
March 3, 2021, 17:11 (GMT) |
Cycles: Implement special path trace work for CPU. Uses tbb's parallel_for to do pixel partitioning, avoiding the overhead of a per-pixel atomic. Brings render time of bmw27 from 2:19 to 2:10 (2.92 renders in 2:01). |
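A rough illustration of why range partitioning avoids a per-pixel atomic: each worker receives a contiguous pixel range up front, so no shared counter has to be incremented per pixel. This sketch uses a simple static split instead of tbb's parallel_for (which does its own range splitting); `PixelRange` and `pixel_range_for_worker` are hypothetical names, not part of Cycles or TBB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Half-open range of pixel indices [begin, end).
struct PixelRange {
  int64_t begin;
  int64_t end;
};

// Statically splits `num_pixels` into `num_workers` contiguous ranges whose
// sizes differ by at most one pixel, and returns the range of `worker`.
PixelRange pixel_range_for_worker(int64_t num_pixels, int num_workers, int worker)
{
  assert(num_pixels >= 0);
  assert(num_workers > 0);
  assert(worker >= 0 && worker < num_workers);

  const int64_t base = num_pixels / num_workers;
  const int64_t extra = num_pixels % num_workers;

  // The first `extra` workers each take one additional pixel.
  const int64_t begin = worker * base + std::min<int64_t>(worker, extra);
  const int64_t size = base + (worker < extra ? 1 : 0);
  return {begin, begin + size};
}
```

Each worker can then loop over its own range without touching shared state, whereas a per-pixel atomic counter would be contended by every thread on every pixel.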
March 3, 2021, 17:11 (GMT) |
Cycles: Move tiled path tracer work to own class. The goal is to abstract things a bit and make it possible to plug a more CPU-friendly scheduler into the path tracer. No functional changes are expected. |
March 3, 2021, 17:11 (GMT) |
Cycles: Remove hardcoded tile size. It is no longer needed since the recent fixes from Brecht. |
March 3, 2021, 17:11 (GMT) |
Cycles: Schedule more samples to path trace work. The new scheduling goes as follows: - Render 1 sample, give visual feedback, and see how long it takes to render a sample. - After that, schedule as many samples as needed so that updates are no more frequent than once a second. Currently there is no measurable difference, but it will become more visible once we are scheduling more than one sample to the GPU. |
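The scheduling rule described in this commit can be sketched roughly as follows. This is an illustrative helper under stated assumptions, not the actual Cycles code: the function name `samples_to_schedule`, its parameters, and the treatment of a missing timing measurement are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Returns how many samples to schedule next so that display updates happen
// no more often than once per `update_interval` seconds.
// `seconds_per_sample` is the measured time of one sample; a value <= 0
// means nothing has been measured yet.
int samples_to_schedule(double seconds_per_sample,
                        int samples_remaining,
                        double update_interval = 1.0)
{
  assert(samples_remaining >= 0);
  if (samples_remaining == 0) {
    return 0;
  }
  if (seconds_per_sample <= 0.0) {
    // First pass: render a single sample to give quick visual feedback
    // and to obtain a timing measurement.
    return 1;
  }
  // Round up so that the batch takes at least `update_interval` seconds.
  const double needed = std::ceil(update_interval / seconds_per_sample);
  // Compare as doubles first to avoid overflow for very fast samples.
  const int batch = (needed >= samples_remaining) ? samples_remaining :
                                                    static_cast<int>(needed);
  return std::max(1, batch);
}
```

For example, with a measured 0.25 seconds per sample the helper schedules 4 samples per update, while a sample that takes 2 seconds is still scheduled one at a time.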
March 3, 2021, 11:15 (GMT) |
Cycles: Initial implementation of tiled rendering on CUDA. Schedules more than one pixel to the CUDA device. The tile size calculation is implemented as a dedicated testable function, which should simplify tweaking the "division" algorithm in the future. The actual splitting logic is commented out and tiles are set to 256 by 256. There seems to be some memory allocation or access issue which causes the proper tile size to fail. |
March 3, 2021, 08:57 (GMT) |
Fix various CUDA rendering issues with new integrator * Mark paths outside of the total work size as terminated * Error in device_vector.copy_from_device implementation * Don't assume sort_intersections works for 0 hits * Remove feature flags from integrator state, this is to be handled with SoA selective array allocation instead. |
March 2, 2021, 13:16 (GMT) |
Fix compilation without CUDA enabled. Also ensure own headers always come first in the implementation file by putting them in their own "group" (add an empty line after the include statement). This ensures that there are no implicit dependencies in the header. |
February 26, 2021, 18:03 (GMT) |
Cycles: start of CUDA support for new integrator kernels * Very poor performance since it renders one pixel at a time, only practical to render a tiny image at this point. * No indirect light because it only does a single iteration of the kernels. |
February 26, 2021, 18:03 (GMT) |
Cleanup: move CUDA context scope and asserts out of main file |
February 26, 2021, 18:03 (GMT) |
Cleanup: clang-format .cu files |
February 26, 2021, 18:03 (GMT) |
Cycles: rename integrator kernels * Add integrator_ prefix to kernel names to distinguish from filter, displacement, etc * Add verb to every kernel name for consistency and logical grouping * Rename generate camera ray since it does more than that * Rename files from kernel_ to integrator_ consistent with other kernel file naming |
|