Transformation toolbox#

This how-to guide explores the different tools available in Mitsuba 3 to manipulate Cartesian coordinate systems. When generating datasets, researching advanced light transport algorithms, or developing new appearance models, you will quickly realize how essential these tools are, so we strongly recommend that all users go through this guide.

[1]:

import mitsuba as mi

mi.set_variant("scalar_rgb")

Frame#

The Frame3f class stores a three-dimensional orthonormal coordinate frame. It is handy whenever you need to convert vectors between different Cartesian coordinate systems.

Frame initialization#

A Frame3f can be initialized in different ways, as shown below. When given a single vector, it uses coordinate_system() to compute the other two basis vectors.

[2]:

mi.Frame3f()  # Empty frame

mi.Frame3f(
    [1, 0, 0],  # s
    [0, 1, 0],  # t
    [0, 0, 1],  # n
)

mi.Frame3f([0, 1, 0])  # n

[2]:

Frame[
  s = [1, -0, -0],
  t = [-0, 0, -1],
  n = [0, 1, 0]
]
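To illustrate how two basis vectors can be derived from a single normal, here is a minimal NumPy sketch of one well-known branchless construction (Duff et al., "Building an Orthonormal Basis, Revisited"). The function name `coordinate_system_sketch` is hypothetical, and whether it matches Mitsuba's coordinate_system() in every detail is an assumption; it does reproduce the basis printed above for n = [0, 1, 0].

```python
import math
import numpy as np


def coordinate_system_sketch(n):
    """Complete a unit vector n to an orthonormal basis (s, t, n).

    Hypothetical helper for illustration; assumes n is normalized.
    """
    nx, ny, nz = (float(c) for c in n)
    sign = math.copysign(1.0, nz)  # +1 or -1, never 0
    a = -1.0 / (sign + nz)
    b = nx * ny * a
    s = np.array([1.0 + sign * nx * nx * a, sign * b, -sign * nx])
    t = np.array([b, sign + ny * ny * a, -ny])
    return s, t


n = np.array([0.0, 1.0, 0.0])
s, t = coordinate_system_sketch(n)
print(s, t)
```

The resulting vectors satisfy cross(s, t) = n, so (s, t, n) forms a right-handed frame.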

Converting to/from local frames#

Frame3f.to_local() and Frame3f.to_world() are the main operations you will use to convert between coordinate frames. The example below uses to_local().

[3]:

frame = mi.Frame3f(
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
)

world_vector = mi.Vector3f([3, 2, 1])  # In world frame
local_vector = frame.to_local(world_vector)
local_vector

[3]:

[1.0, 2.0, 3.0]
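The arithmetic behind this conversion can be sketched in plain NumPy, assuming an orthonormal frame: going to local coordinates projects the vector onto each basis vector, and going back to world coordinates recombines the components along the basis. The helper names below mirror the Frame3f methods but are illustrative stand-ins, not Mitsuba code.

```python
import numpy as np

# Basis vectors of the frame used above
s = np.array([0.0, 0.0, 1.0])
t = np.array([0.0, 1.0, 0.0])
n = np.array([1.0, 0.0, 0.0])


def to_local(v):
    # Project the world-space vector onto each basis vector
    return np.array([np.dot(v, s), np.dot(v, t), np.dot(v, n)])


def to_world(v):
    # Recombine the local components along the basis vectors
    return v[0] * s + v[1] * t + v[2] * n


world_vector = np.array([3.0, 2.0, 1.0])
local_vector = to_local(world_vector)
round_trip = to_world(local_vector)
print(local_vector, round_trip)
```

Because the frame is orthonormal, the round trip returns the original world-space vector.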

Spherical coordinates#

Mitsuba 3 provides convenience methods to efficiently compute certain trigonometric evaluations of spherical coordinates with respect to a Frame3f. We use the naming convention that theta is the elevation and phi is the azimuth. For example, you can call Frame3f.sin_theta_2() or Frame3f.cos_phi(). As always, the full list of methods is available in the reference API.
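These helpers are efficient because, for a unit vector expressed in local coordinates, the trigonometric quantities follow directly from its components without any call to sin or cos. The sketch below shows the standard identities, assuming theta is measured from the frame's normal (the local z-axis) and phi is measured in the s-t plane. It ignores the degenerate case where the vector is parallel to the normal, which a robust implementation has to handle.

```python
import math

theta = math.radians(60.0)
phi = math.radians(30.0)

# Unit direction in local coordinates (the normal is the z-axis)
v = (
    math.sin(theta) * math.cos(phi),
    math.sin(theta) * math.sin(phi),
    math.cos(theta),
)

cos_theta = v[2]
sin_theta_2 = v[0] ** 2 + v[1] ** 2  # squared sine, no square root needed
sin_theta = math.sqrt(sin_theta_2)
cos_phi = v[0] / sin_theta  # undefined when sin_theta == 0
sin_phi = v[1] / sin_theta

print(cos_theta, sin_theta_2, cos_phi, sin_phi)
```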

Transform#

The Transform4f and Transform3f classes provide several static functions to create common transformations, such as translate, scale, rotate and look_at. These are often used for setting "to_world" object parameters in Python using load_dict(). As we will see later, these transformations can also be applied to a Vector, a Point, a Normal and even a Ray3f.

Note that all transforms are in homogeneous coordinates. Transform4f can therefore be applied to 3-dimensional objects and Transform3f to 2-dimensional objects.
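As a quick illustration of why a 3x3 matrix acts on 2-dimensional objects, the NumPy sketch below translates a 2D point with a 3x3 homogeneous matrix. The helper `apply_to_point_2d` is hypothetical and only demonstrates the math: append a 1, multiply, then divide by the last component.

```python
import numpy as np

# 3x3 homogeneous matrix that translates 2D points by (10, 20)
translate_2d = np.array(
    [
        [1.0, 0.0, 10.0],
        [0.0, 1.0, 20.0],
        [0.0, 0.0, 1.0],
    ]
)


def apply_to_point_2d(m, p):
    # Lift the point to homogeneous coordinates, transform, then project back
    x, y, w = m @ np.array([p[0], p[1], 1.0])
    return np.array([x / w, y / w])


moved = apply_to_point_2d(translate_2d, [1.0, 2.0])
print(moved)
```

The same idea with a 4x4 matrix and an appended fourth component applies to 3D points.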

The Transform4f and Transform3f objects hold both the transformation matrix and its inverse transpose. For convenience, there is also a Transform4f.inverse() method. Taken together, this makes transforming back and forth straightforward.

🗒 Note

When working with a vectorized variant of Mitsuba (e.g. llvm_ad_rgb), we often still want to work with scalar transformations, for instance when setting to_world transformations during scene loading. Mitsuba data-structure types such as Transform4f can be prefixed with Scalar to indicate that, no matter which variant of Mitsuba is enabled, the type always refers to the CPU scalar version (which can also be accessed as mitsuba.scalar_rgb.Transform4f). The same applies to all basic types (e.g. Float, UInt32) and other data-structure types (e.g. Ray3f, SurfaceInteraction3f), which can all be prefixed with Scalar.

Transform initialization#

There are several ways to instantiate a transformation object. For example, one can create a transform directly from a NumPy array, a Mitsuba matrix, or a plain Python list.

[4]:

import numpy as np

# Default constructor is identity matrix
identity = mi.Transform4f()

np_mat = np.array(
    [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
)
mi_mat = mi.Matrix3f(
    [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
)
list_mat = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Build from different types
t_from_np = mi.Transform3f(np_mat)
t_from_mi = mi.Transform3f(mi_mat)
t_from_list = mi.Transform3f(list_mat)

# Broadcasting
t_from_value = mi.Transform3f(3)  # Scaled identity matrix
t_from_row = mi.Transform3f([3, 2, 3])  # Broadcast over matrix columns
t_from_row

[4]:

[[3, 3, 3],
 [2, 2, 2],
 [3, 3, 3]]

A few static functions help construct common transformations:

Translate#

[5]:

mi.Transform4f.translate([10, 20, 30])

[5]:

[[1, 0, 0, 10],
 [0, 1, 0, 20],
 [0, 0, 1, 30],
 [0, 0, 0, 1]]

Scale#

[6]:

mi.Transform4f.scale([10, 20, 30])

[6]:

[[10, 0, 0, 0],
 [0, 20, 0, 0],
 [0, 0, 30, 0],
 [0, 0, 0, 1]]

Rotate#

[7]:

mi.Transform4f.rotate(axis=[0, 1, 0], angle=90)

[7]:

[[-4.37114e-08, 0, 1, 0],
 [0, 1, 0, 0],
 [-1, 0, -4.37114e-08, 0],
 [0, 0, 0, 1]]
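The matrix above can be cross-checked with Rodrigues' rotation formula. The NumPy sketch below is an independent illustration under the assumption that the angle is given in degrees, as the output above suggests; `rotate_sketch` is a hypothetical helper, not Mitsuba's implementation.

```python
import numpy as np


def rotate_sketch(axis, angle_deg):
    """4x4 rotation about `axis` by `angle_deg` degrees (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    theta = np.radians(angle_deg)
    # Cross-product matrix of the rotation axis
    K = np.array(
        [
            [0.0, -k[2], k[1]],
            [k[2], 0.0, -k[0]],
            [-k[1], k[0], 0.0],
        ]
    )
    R = (
        np.cos(theta) * np.eye(3)
        + np.sin(theta) * K
        + (1.0 - np.cos(theta)) * np.outer(k, k)
    )
    M = np.eye(4)
    M[:3, :3] = R
    return M


M = rotate_sketch([0, 1, 0], 90)
print(np.round(M, 6))
```

The tiny values such as -4.37114e-08 in the Mitsuba output are single-precision rounding of cos(90°), which is exactly zero in theory.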

Look at#

[8]:

mi.Transform4f.look_at(origin=[0, 0, 2], target=[0, 0, 0], up=[0, 1, 0])

[8]:

[[-1, 0, 0, 0],
 [0, 1, 0, 0],
 [0, 0, -1, 2],
 [0, 0, 0, 1]]
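One common way to build such a matrix is to stack the camera's left, up and forward axes as columns and place the origin in the last column. The NumPy sketch below follows that convention, which is an assumption chosen because it reproduces the output above; `look_at_sketch` is a hypothetical helper.

```python
import numpy as np


def normalize(v):
    return v / np.linalg.norm(v)


def look_at_sketch(origin, target, up):
    origin, target, up = (
        np.asarray(a, dtype=float) for a in (origin, target, up)
    )
    forward = normalize(target - origin)     # viewing direction
    left = normalize(np.cross(up, forward))  # perpendicular to up and forward
    new_up = np.cross(forward, left)         # re-orthogonalized up vector
    M = np.eye(4)
    M[:3, 0] = left
    M[:3, 1] = new_up
    M[:3, 2] = forward
    M[:3, 3] = origin
    return M


M = look_at_sketch(origin=[0, 0, 2], target=[0, 0, 0], up=[0, 1, 0])
print(M)
```

Read this way, the third column of the output is the viewing direction (0, 0, -1) and the last column is the camera position.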

Perspective#

The perspective projection does the following:

(1) Project camera space points onto the \(z=1\) plane, and non-linearly map \(z\)-coordinates from \([\text{near}, \text{far}]\) to \([0, 1]\).
(2) Scale \((x, y)\) such that the visible region specified by fov lies in \([-1, 1]\times [-1, 1]\).

[9]:

trafo = mi.Transform4f.perspective(fov=90, near=0.1, far=10)
print(trafo)
trafo @ mi.Point3f(2, -2, 2)

[[1, 0, 0, 0],
 [0, 1, 0, 0],
 [0, 0, 1.0101, -0.10101],
 [0, 0, 1, 0]]

[9]:

[0.9999998807907104, -0.9999998807907104, 0.9595960378646851]
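The printed matrix and the projected point can be reproduced by hand. The NumPy sketch below builds a matrix with the same entries (the formula is inferred from the output above, so treat it as an assumption) and applies it with an explicit homogeneous divide. It also checks that the near and far planes map to depths 0 and 1.

```python
import numpy as np


def perspective_sketch(fov_deg, near, far):
    recip = 1.0 / (far - near)
    cot = 1.0 / np.tan(np.radians(fov_deg / 2.0))
    return np.array(
        [
            [cot, 0.0, 0.0, 0.0],
            [0.0, cot, 0.0, 0.0],
            [0.0, 0.0, far * recip, -near * far * recip],
            [0.0, 0.0, 1.0, 0.0],
        ]
    )


def apply_to_point(m, p):
    # Homogeneous multiply followed by the perspective divide
    x, y, z, w = m @ np.append(p, 1.0)
    return np.array([x, y, z]) / w


M = perspective_sketch(fov_deg=90, near=0.1, far=10)
projected = apply_to_point(M, np.array([2.0, -2.0, 2.0]))
depth_near = apply_to_point(M, np.array([0.0, 0.0, 0.1]))[2]
depth_far = apply_to_point(M, np.array([0.0, 0.0, 10.0]))[2]
print(projected, depth_near, depth_far)
```

The point (2, -2, 2) lies on the edge of the 90° field of view, which is why its x and y coordinates land on 1 and -1.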

Orthographic#

The orthographic projection maps the \(z\)-coordinate to \([0, 1]\).

[10]:

trafo = mi.Transform4f.orthographic(near=0.1, far=10)
print(trafo)
trafo @ mi.Point3f(1, 2, 3)

[[1, 0, 0, 0],
 [0, 1, 0, 0],
 [0, 0, 0.10101, -0.010101],
 [0, 0, 0, 1]]

[10]:

[1.0, 2.0, 0.2929292917251587]
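The depth mapping here is linear: z becomes (z - near) / (far - near), while x and y are left unchanged. A minimal NumPy sketch of a matrix with the same entries as the output above (the helper name is hypothetical):

```python
import numpy as np


def orthographic_sketch(near, far):
    scale = 1.0 / (far - near)
    return np.array(
        [
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, scale, -near * scale],
            [0.0, 0.0, 0.0, 1.0],
        ]
    )


M = orthographic_sketch(near=0.1, far=10)
mapped = (M @ np.array([1.0, 2.0, 3.0, 1.0]))[:3]
print(mapped)
```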

From/to frame#

⚠️ Only available for Transform4f

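mi.Transform4f.to_frame(frame) builds a matrix whose columns are the frame's s, t and n vectors, as the output below shows. It is therefore the matrix representation of frame.to_world(): it maps coordinates expressed in the frame to the standard basis.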

[11]:

frame = mi.Frame3f(
    [0, 0, 1],
    [0, 2, 0],
    [3, 0, 0],
)
mi.Transform4f.to_frame(frame)

[11]:

[[0, 0, 3, 0],
 [0, 2, 0, 0],
 [1, 0, 0, 0],
 [0, 0, 0, 1]]
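Reading the output above column by column gives the frame's s, t and n vectors. The NumPy sketch below rebuilds that layout and shows what it implies: multiplying by the matrix forms the linear combination x·s + y·t + z·n of the basis vectors.

```python
import numpy as np

s = np.array([0.0, 0.0, 1.0])
t = np.array([0.0, 2.0, 0.0])
n = np.array([3.0, 0.0, 0.0])

# Place the basis vectors in the columns of the upper-left 3x3 block
M = np.eye(4)
M[:3, :3] = np.column_stack([s, t, n])

local = np.array([1.0, 1.0, 1.0, 0.0])  # a direction (w = 0)
world = M @ local
print(M)
print(world)
```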

Applying transforms#

The Python @ (__matmul__) operator can be used to apply Transform objects to points, vectors, normals and rays or multiply transforms with other transforms. Depending on the operand’s type, the operation has a different effect.

Vector3f: A typical matrix multiplication that ignores the homogeneous coordinate, so translations have no effect
Point3f: Adjusted matrix multiplication that takes the homogeneous coordinate into account
Normal3f: Matrix multiplication using the inverse transpose to handle non-uniform scaling of surface normals
Ray3f: Both the ray origin (mi.Point) and the ray direction (mi.Vector) are transformed with the @ operator
Transform4f: Composes the two transformations

[12]:

t = mi.Transform4f.translate([0, 1, 2])
t = t @ mi.Transform4f.scale([1, 2, 3])
v = mi.Vector3f([3, 4, 5])
p = mi.Point3f([3, 4, 5])
n = mi.Normal3f([1, 0, 0])

print(f"{t @ v=}")
print(f"{t @ p=}")
print(f"{t @ n=}")

t @ v=[3.0, 8.0, 15.0]
t @ p=[3.0, 9.0, 17.0]
t @ n=[1.0, 0.0, 0.0]
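The three results above can be reproduced with plain NumPy, which makes the per-type rules explicit. The helper functions are hypothetical stand-ins for what the @ operator does for each operand type. The last lines use a tilted normal to show why normals need the inverse transpose: it keeps them perpendicular to transformed tangent vectors under non-uniform scaling.

```python
import numpy as np

T = np.eye(4)
T[:3, 3] = [0.0, 1.0, 2.0]           # translate by (0, 1, 2)
S = np.diag([1.0, 2.0, 3.0, 1.0])    # scale by (1, 2, 3)
M = T @ S


def transform_vector(m, v):
    # Vectors ignore the translation column
    return m[:3, :3] @ v


def transform_point(m, p):
    # Points use the full homogeneous multiplication
    q = m @ np.append(p, 1.0)
    return q[:3] / q[3]


def transform_normal(m, n):
    # Normals use the inverse transpose of the linear part
    return np.linalg.inv(m[:3, :3]).T @ n


v_out = transform_vector(M, np.array([3.0, 4.0, 5.0]))
p_out = transform_point(M, np.array([3.0, 4.0, 5.0]))
n_out = transform_normal(M, np.array([1.0, 0.0, 0.0]))

# A tilted normal and a tangent perpendicular to it
tangent = transform_vector(M, np.array([0.0, 1.0, -1.0]))
normal = transform_normal(M, np.array([0.0, 1.0, 1.0]))
print(v_out, p_out, n_out, np.dot(tangent, normal))
```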

Transformation order#

Transformations in Mitsuba are applied from right to left, similar to how such operations would be written in mathematical form. This means that when multiple transformations are chained together, the net transformation is equivalent to first performing the rightmost transformation, followed by the second rightmost transformation, and so on.

In the following example, the point will first be scaled and then translated.

[13]:

S = mi.Transform4f.scale(2.0)
T = mi.Transform4f.translate([4, 0, 0])
v = mi.Point3f([1, 1, 1])

trafo = T @ S

print(trafo @ v)

[6.0, 2.0, 2.0]
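Order matters because matrix multiplication is not commutative. The NumPy sketch below uses the same scale and translation as above and compares both orderings on the point (1, 1, 1), written in homogeneous coordinates.

```python
import numpy as np

S = np.diag([2.0, 2.0, 2.0, 1.0])  # uniform scale by 2
T = np.eye(4)
T[:3, 3] = [4.0, 0.0, 0.0]         # translate by (4, 0, 0)

p = np.array([1.0, 1.0, 1.0, 1.0])  # homogeneous point

scale_then_translate = (T @ S) @ p  # rightmost (S) is applied first
translate_then_scale = (S @ T) @ p  # rightmost (T) is applied first
print(scale_then_translate[:3], translate_then_scale[:3])
```

Scaling first gives (6, 2, 2), matching the Mitsuba result. Translating first gives (10, 2, 2), because the translation offset is scaled as well.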

Chaining transforms#

For convenience, it is also possible to chain transformation initializations as follows:

[14]:

mi.Transform4f.scale(2.0).translate([1, 0, 0])

[14]:

[[2, 0, 0, 2],
 [0, 2, 0, 0],
 [0, 0, 2, 0],
 [0, 0, 0, 1]]

The code above is equivalent to:

[15]:

mi.Transform4f.scale(2.0) @ mi.Transform4f.translate([1, 0, 0])

[15]:

[[2, 0, 0, 2],
 [0, 2, 0, 0],
 [0, 0, 2, 0],
 [0, 0, 0, 1]]