Blender Git Log

Blender Git "temp_custom_loop_normals" branch commits.

August 20, 2014, 17:20 (GMT)
Merge branch 'master' into temp_custom_loop_normals
August 17, 2014, 21:13 (GMT)
Much better handling of multithreading - at least 80% quicker than previous code,
more than two times quicker than without any threading...

The main idea was to group tasks into bigger chunks (currently 1024 at once), which saves
a lot of mem management (alloc, copy, etc.) and spinlock locking.
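
A minimal sketch of the chunking idea, not the branch's actual code: rather than dispatching one task per element, items are handed out in blocks of 1024, so per-task overhead is paid once per block. `ChunkFunc`, `dispatch_in_chunks` and `count_items` are hypothetical names, and the chunks run inline here where a real scheduler would queue each one as a task.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE 1024 /* Chunk size quoted in the commit message. */

/* Hypothetical callback type: processes items [start, start + count). */
typedef void (*ChunkFunc)(void *userdata, size_t start, size_t count);

/* Split `total` items into chunks of at most CHUNK_SIZE and pass each chunk
 * to `func`. A threaded version would push each chunk to a task queue
 * instead of calling `func` directly. Returns the number of chunks. */
size_t dispatch_in_chunks(size_t total, ChunkFunc func, void *userdata)
{
    size_t num_chunks = 0;
    assert(func != NULL);
    for (size_t start = 0; start < total; start += CHUNK_SIZE) {
        const size_t remaining = total - start;
        const size_t count = remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE;
        func(userdata, start, count);
        num_chunks++;
    }
    return num_chunks;
}

/* Example callback: adds the size of each chunk to a running total. */
void count_items(void *userdata, size_t start, size_t count)
{
    (void)start;
    *(size_t *)userdata += count;
}
```

With 2500 items this makes three calls (1024, 1024 and 452 items) instead of 2500 separate ones.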
August 17, 2014, 19:07 (GMT)
Better handling of tasks/parallelization, though not yet very convincing performance-wise. :/
August 17, 2014, 13:48 (GMT)
Minor changes/cleanup in BKE_mesh_normals_loop_split, and use a bitmap for sharp_verts.

There is no real performance difference between a plain bool array and a bitmap for sharp_verts,
and the bitmap should save quite a bit of mem with heavy models.
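
To illustrate the memory argument, here is a small stand-alone bitmap, assuming nothing about Blender's own bitmap helpers. The `bitmap_*` functions are hypothetical stand-ins: one bit per vertex instead of one `bool` per vertex.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned int) * CHAR_BIT)
#define BITMAP_WORDS(n) (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Bytes needed to store `num_bits` flags as a bitmap. */
size_t bitmap_size_in_bytes(size_t num_bits)
{
    return BITMAP_WORDS(num_bits) * sizeof(unsigned int);
}

/* Allocate a zero-initialized bitmap (all flags off). Caller frees it. */
unsigned int *bitmap_new(size_t num_bits)
{
    const size_t words = BITMAP_WORDS(num_bits);
    return calloc(words ? words : 1, sizeof(unsigned int));
}

/* Turn the flag at `index` on. */
void bitmap_enable(unsigned int *bitmap, size_t index)
{
    bitmap[index / BITS_PER_WORD] |= 1u << (index % BITS_PER_WORD);
}

/* Return whether the flag at `index` is on. */
bool bitmap_test(const unsigned int *bitmap, size_t index)
{
    return ((bitmap[index / BITS_PER_WORD] >> (index % BITS_PER_WORD)) & 1u) != 0;
}
```

On a platform where `bool` takes one byte, a million vertices need about 1 MB as a bool array and about 125 KB as a bitmap.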
August 17, 2014, 13:20 (GMT)
Simplify/fix code to set custom split normals.

We might lose a (very tiny) bit of performance in some cases, but this is not
a critical area, it is more important to have simple, working code. Also,
this func no longer modifies the given array of custom normals! And hopefully
weights handling is now correct in all cases.
August 17, 2014, 12:08 (GMT)
Merge branch 'master' into temp_custom_loop_normals
August 8, 2014, 17:16 (GMT)
Merge branch 'master' into temp_custom_loop_normals
August 8, 2014, 17:13 (GMT)
Various optimizations and fixes to `mesh_normals_loop_custom_set()`.

Among other things, weighting (interpolation) of custom normals in the vertex case
was broken (they were interpolated several times).
August 8, 2014, 16:23 (GMT)
Better implementation of threading for split lnor code.

Not that satisfied yet, though, this adds some complexity for not much gain in the end
(about 33% better with 100000 loops and more, on an 8-core machine - with nearly no gain after 4
workers anyway)...

I have the feeling the threadable parts of this func are chunks that are too small and need too much data
to be really efficient (since we can't parallelize the main loop, order is crucial here).
August 8, 2014, 12:33 (GMT)
Use BLI_task for some split normals code - wrong solution

Looks like tasks are not suited for many little elements
(this code is at least two times slower than the previous single-threaded one!).
And probably things like calling `BLI_stack_new()` in each task instead of
once globally do not help either. :/

Will refactor into a few worker tasks fed from a queue...
August 8, 2014, 12:19 (GMT)
Fix remaining floats for clnors data :/
August 7, 2014, 18:23 (GMT)
More optimization: Allocate all LinkNodes needed for loops in one step, instead of doing one alloc per loop.

This allocates a bit more memory than needed (since we won't use linknodes of 'sharp' loops),
but avoids calling BLI_linklist_prepend_memarena() once per 'smooth' loop.

Note: gave much less benefit than expected, in the end - guess memarena allocation is *really* fast!
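
The trade-off can be sketched roughly as follows, using plain `calloc` and a hypothetical `Node` type rather than Blender's LinkNode and memarena API. One node is reserved per loop up front, and only the 'smooth' loops actually consume one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a singly linked list node. */
typedef struct Node {
    struct Node *next;
    int loop_index;
} Node;

/* Prepend to `list` using the next free node of a preallocated pool,
 * instead of allocating a new node for each call. */
void list_prepend_from_pool(Node **list, Node *pool, size_t *pool_used, int loop_index)
{
    Node *node = &pool[(*pool_used)++];
    node->loop_index = loop_index;
    node->next = *list;
    *list = node;
}

/* Build a list of all non-sharp loops with a single allocation sized for
 * every loop. Nodes reserved for sharp loops stay unused, which is the
 * small memory overhead the commit message mentions.
 * Returns the pool (caller frees it), or NULL if allocation failed. */
Node *build_smooth_list(const int *is_sharp, size_t num_loops, Node **r_list)
{
    Node *pool = calloc(num_loops ? num_loops : 1, sizeof(Node));
    size_t used = 0;
    *r_list = NULL;
    if (pool == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < num_loops; i++) {
        if (!is_sharp[i]) {
            list_prepend_from_pool(r_list, pool, &used, (int)i);
        }
    }
    return pool;
}
```

Freeing the pool releases every node at once, so the list itself needs no per-node cleanup.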
August 7, 2014, 17:27 (GMT)
Merge branch 'master' into temp_custom_loop_normals
August 7, 2014, 17:22 (GMT)
Switch clnors data from float[2] to short[2].

Suggested by Campbell, saves memory and does not make things slower (perhaps even slightly faster),
even though it asks for a few more operations...

Note the way alpha factor is handled is a bit different now, since with shorts it's simpler to
always keep factor-like value in the 'virtual' [-1, 1] (real [-SHRT_MAX, SHRT_MAX]) range.

Code is a bit more verbose, since we do not have many math helpers for short 2d vectors,
but this is a detail that can be solved later.
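
A minimal sketch of the range mapping described above, assuming a simple clamp-and-round conversion. The helper names are hypothetical and the patch's actual conversion may differ.

```c
#include <assert.h>
#include <limits.h>

/* Map a factor in the 'virtual' [-1, 1] range to the real
 * [-SHRT_MAX, SHRT_MAX] range. Out-of-range input is clamped. */
short unit_float_to_short(float val)
{
    float scaled;
    if (val > 1.0f) {
        val = 1.0f;
    }
    if (val < -1.0f) {
        val = -1.0f;
    }
    scaled = val * (float)SHRT_MAX;
    /* Round to nearest by hand, away from zero on ties. */
    return (short)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Map a stored short back to the 'virtual' [-1, 1] range.
 * SHRT_MIN is never produced by the function above; it would map
 * to a value slightly below -1. */
float unit_short_to_float(short val)
{
    return (float)val / (float)SHRT_MAX;
}
```

Using the symmetric range keeps 0 exactly representable and makes negation safe, at the cost of leaving `SHRT_MIN` unused.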
August 6, 2014, 19:34 (GMT)
Custom Split Normals current state

This is not yet ready for a detailed review, I am rather requesting a quick skim to see whether there are any big/key issues in the current patch code/design.

Further info on concepts used for custom lnors storage can be found [here](http://wiki.blender.org/index.php/User:Mont29/Foundation/Split_Vertex_Normals#Loop_normal_edition), but to summarize, it defines for each smooth fan (i.e. group of neighbor loops sharing the same normal) an orthonormal system, one axis of which is the auto lnor, and another one is aligned with a 'reference' edge which defines the start of the smooth fan. In addition, we store two angles: alpha, the average angle between the auto lnor and all edges, and beta, the angle between the start and end edges of the fan (around auto lnor).

We then store custom lnors as a set of two factors, one applied to alpha angle (to get how much 'far away' from auto lnor custom one is), the other applied to beta angle (to get where around auto lnor custom one is).
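
One possible reading of this encoding, written as a formula. This is an illustration under assumptions, not the patch's exact math, and signs and ranges may differ. Let $\mathbf{n}$ be the auto lnor, $\mathbf{r}$ the axis aligned with the reference edge, $\mathbf{o} = \mathbf{n} \times \mathbf{r}$ the third axis of the orthonormal system, and $f_\alpha$, $f_\beta$ the two stored factors (symbol names chosen here). Treating the scaled angles as spherical coordinates around $\mathbf{n}$, the custom normal $\mathbf{c}$ would be:

$$
\alpha' = f_\alpha\,\alpha, \qquad \beta' = f_\beta\,\beta, \qquad
\mathbf{c} = \cos\alpha'\,\mathbf{n} + \sin\alpha'\left(\cos\beta'\,\mathbf{r} + \sin\beta'\,\mathbf{o}\right)
$$

With $f_\alpha = 0$ this gives $\mathbf{c} = \mathbf{n}$ whatever $f_\beta$ is, i.e. the custom normal coincides with the auto lnor.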

Since we need that data as well (like auto lnors), nearly all of this is added to the existing `BKE_mesh_normals_loop_split()` (and its BMesh version), with a small API to define lnor spaces, and convert raw custom normals to/from clnor data format (the two angle factors). Note lnor spaces also embed a representation of smooth fans (as a linklist of the relevant loops' indices).

Also note that all this overhead only takes effect when computing lnor spaces and/or using custom lnors, else current split lnors code is nearly unchanged.

I would also like some advice on the performance topic - right now, custom split normals can be up to 50 (or more) times slower than mere autosmooth in worst 'grid/sphere' cases, mostly because we can no longer skip fanning around fully smooth vertices (since we have to generate lnor spaces for those too). In a more average case, it's something like three or four times slower (really depends on the topology!).

I have two different tracks for optimization currently:

* Use real threading (BLI_thread, which could probably give a three or four times speedup with modern CPUs). Not really considering OMP, it's practically unusable on OSX and Win platforms currently (and not that performant compared to real threading anyway, afaik). I think the main loop of `BKE_mesh_normals_loop_split` (and its BMesh equivalent) would be a good candidate for that, afaics each loop's processing can be done in parallel (this is a bit tricky to analyze, so not yet 100% sure about it, though).

* Cache lnor spaces, using a similar system to current skey gsoc to detect changes in geometry or topology that would need to recompute them. Since computing those lnor spaces takes a huge part of the added processing time, this should lead to a nice gain too. But that would mean caching quite a bit of data in BMesh, Mesh and DerivedMesh...

Reviewers: campbellbarton

Differential Revision: https://developer.blender.org/D703
August 6, 2014, 19:33 (GMT)
Fix `BKE_mesh_has_custom_loop_normals` in Edit mode, get rid of some debug prints, various minor tweaks.
August 6, 2014, 19:33 (GMT)
Several enhancements:

* Add a 'has custom lnors' helper, also available from RNA.
* Add a 'factors' array to the set custom lnors func, to lerp between custom lnor and auto lnor.
* Add a 'set custom lnors from vert nors' func (also available from RNA).
* Reshuffle a bit clnor CD data handling in UI (now in Normals panel, and you can also add a layer if needed).
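
The per-loop 'factors' idea from the list above might look like the sketch below. The function name, the flat array layout and the convention that a factor of 0 keeps the auto lnor while 1 gives the custom one are all assumptions. The blended vectors are generally not unit length, so a real implementation would normalize them afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Linearly interpolate between auto and custom loop normals.
 * All normal arrays are flat, with 3 floats per loop.
 * Assumed convention: factor 0 -> auto normal, factor 1 -> custom normal.
 * The results are NOT renormalized here. */
void lerp_loop_normals(const float *auto_nors,
                       const float *custom_nors,
                       const float *factors,
                       size_t num_loops,
                       float *r_nors)
{
    assert(auto_nors && custom_nors && factors && r_nors);
    for (size_t i = 0; i < num_loops; i++) {
        const float fac = factors[i];
        for (size_t axis = 0; axis < 3; axis++) {
            const float a = auto_nors[i * 3 + axis];
            const float c = custom_nors[i * 3 + axis];
            r_nors[i * 3 + axis] = a + fac * (c - a);
        }
    }
}
```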
August 6, 2014, 19:33 (GMT)
Some cleanup, add new CD layer (CD_CUSTOMLOOPNORMAL, float[2]),
and a function to convert custom normals to internal data.

WIP
August 6, 2014, 19:33 (GMT)
Fix stupid code to get CD custom lnors data in BMesh!
August 6, 2014, 19:33 (GMT)
First basic code for supporting custom loop normals.

This only affects Object mode currently, and is more proof-of-concept than anything else.

Next step: do same thing for bmesh!
Made by: Miika Hämäläinen. Last updated: 07.11.2014 14:18. MiikaH's Pages a.k.a. MiikaHweb | 2003-2021