Bayesian Layers#
This module provides implementations of Bayesian neural network layers designed for uncertainty modeling in deep learning architectures. These layers support different variational inference techniques and activation moment propagation.
Detailed Descriptions#
BayesianLinear#
Abstract base class for Bayesian neural network layers.
Defines core attributes like weight_mu, weight_rho, bias_mu, bias_rho.
Supports an optional bias and prior distributions for the weights.
Allows softplus transformation for standard deviation parameters.
Includes MAP mode and KL divergence computation.
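As a rough illustration of the KL term, the sketch below evaluates the closed-form KL divergence between a univariate Gaussian posterior and a Gaussian prior and sums it over parameters. It assumes a factorized Gaussian posterior and prior, and that the standard deviation is obtained from rho via softplus. The helper names `gaussian_kl` and `total_kl` are hypothetical and are not part of the module.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def gaussian_kl(mu_q: float, std_q: float, mu_p: float, std_p: float) -> float:
    # KL(N(mu_q, std_q^2) || N(mu_p, std_p^2)) in closed form.
    return (
        math.log(std_p / std_q)
        + (std_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * std_p ** 2)
        - 0.5
    )


def total_kl(mus, rhos, prior_mean: float = 0.0, prior_std: float = 1.0) -> float:
    # Sum the per-parameter KL terms (assumes independent parameters and
    # std = softplus(rho); both are assumptions of this sketch).
    return sum(
        gaussian_kl(m, softplus(r), prior_mean, prior_std)
        for m, r in zip(mus, rhos)
    )
```

The KL term is zero when posterior and prior coincide and grows as the posterior mean or scale moves away from the prior, which is what makes it usable as a regularizer.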
BBBLinear#
Implements a Bayesian layer using Bayes by Backprop:
Samples weights and biases during forward pass from learned posterior.
In MAP mode, uses only mean parameters without sampling.
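The sampling behaviour described above can be sketched for a single output unit in plain Python. This is a minimal illustration of the reparameterization w = mu + std * eps, not the layer's actual code; the function names and the `map_mode` flag are hypothetical stand-ins.

```python
import random


def sample_weight(mu: float, std: float, map_mode: bool = False, rng=random) -> float:
    # MAP mode: return the posterior mean without sampling.
    if map_mode:
        return mu
    # Reparameterization: w = mu + std * eps with eps ~ N(0, 1).
    return mu + std * rng.gauss(0.0, 1.0)


def forward(x, weight_mu, weight_std, bias_mu, bias_std, map_mode=False, rng=random):
    # One output unit: y = sum_i w_i * x_i + b, with w and b drawn afresh
    # on every call unless map_mode is set.
    w = [sample_weight(m, s, map_mode, rng) for m, s in zip(weight_mu, weight_std)]
    b = sample_weight(bias_mu, bias_std, map_mode, rng)
    return sum(wi * xi for wi, xi in zip(w, x)) + b
```

Two calls with the same input generally give different outputs in sampling mode, while MAP mode is deterministic.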
LRLinear#
Implements a Bayesian layer using the Local Reparameterization Trick:
Samples output activations instead of weights, which is more efficient and yields lower-variance estimates.
Propagates mean and variance through layers.
Supports MAP mode.
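A minimal single-unit sketch of the local reparameterization idea follows. It assumes a factorized Gaussian posterior over weights and bias, so the pre-activation is Gaussian with the mean and variance computed below. The name `lrt_forward` and its arguments are hypothetical.

```python
import math
import random


def lrt_forward(x, weight_mu, weight_std, bias_mu=0.0, bias_std=0.0,
                map_mode=False, rng=random):
    # Mean and variance of the pre-activation under independent Gaussian weights.
    out_mean = sum(m * xi for m, xi in zip(weight_mu, x)) + bias_mu
    out_var = sum((s * xi) ** 2 for s, xi in zip(weight_std, x)) + bias_std ** 2
    if map_mode:
        # MAP mode: use the mean only.
        return out_mean
    # One noise draw per output unit instead of one per weight.
    return out_mean + math.sqrt(out_var) * rng.gauss(0.0, 1.0)
```

Because the noise is drawn in activation space, the number of random draws scales with the number of outputs rather than the number of weights.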
CLTLinear#
Implements a Bayesian layer using Central Limit Theorem (CLT) approximations:
Supports ReLU and CReLU activations.
Propagates mean and variance analytically through the network.
Supports input/output layer distinctions and MAP mode.
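Analytic propagation through the linear part of such a layer can be sketched with standard moment matching. The code below assumes independent inputs and weights, each described only by a mean and a variance; whether CLTLinear uses exactly this expression is not confirmed here, and `linear_moments` is a hypothetical name.

```python
def linear_moments(x_mean, x_var, weight_mu, weight_var, bias_mu=0.0, bias_var=0.0):
    # Mean and variance of y = sum_i w_i * x_i + b for independent w_i, x_i, b.
    out_mean = bias_mu
    out_var = bias_var
    for m_x, v_x, m_w, v_w in zip(x_mean, x_var, weight_mu, weight_var):
        out_mean += m_x * m_w
        # Var[w * x] = v_x * m_w^2 + v_w * m_x^2 + v_w * v_x.
        out_var += v_x * m_w ** 2 + v_w * m_x ** 2 + v_w * v_x
    return out_mean, out_var
```

For a deterministic input (all entries of `x_var` equal to zero) the variance reduces to the sum of `weight_var * x_mean**2`, which matches the local reparameterization case.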
CLTLinearDet#
Deterministic version of CLTLinear:
Disables uncertainty modeling by removing learned standard deviations.
Overrides methods to raise errors for standard deviation and KL divergence calls.
Supports MAP mode and variance propagation accordingly.
Usage Example#
import torch
from objectrl.nets.layers.bayesian_layers import BBBLinear
layer = BBBLinear(in_features=128, out_features=64, bias=True)
x = torch.randn(32, 128)
output = layer(x)
print(output.shape) # torch.Size([32, 64])
Notes#
The map() method in BayesianLinear switches between MAP (deterministic) and sampling modes.
The KL divergence calculation sums over all parameters for regularization.
CLTLinear uses moment matching and the normal CDF/PDF to approximate nonlinear activations.
The deterministic variant CLTLinearDet raises errors if variance or KL methods are called.
Classes#
- class objectrl.nets.layers.bayesian_layers.BayesianLinear(in_features: int, out_features: int, bias: bool = True, prior_mean: float | Tensor | None = None, prior_std: float | Tensor | None = None, use_softplus: bool = False, manual_reset: bool = False, device=None, dtype=None)[source]#
Bases: ABC, Module
Abstract base class for Bayesian neural network layers.
- use_softplus#
Whether to apply softplus to std dev parameters.
- Type:
bool
- _manual_reset#
If True, keep the random state
- Type:
bool
- weight_mu#
Mean of the weight distribution.
- Type:
nn.Parameter
- weight_rho#
Rho (transformed std) of the weight distribution.
- Type:
nn.Parameter
- bias_mu#
Mean of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- bias_rho#
Rho of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- prior_mean#
Mean of the prior distribution.
- Type:
torch.Tensor | None
- prior_std#
Standard deviation of the prior distribution.
- Type:
torch.Tensor | None
- _map: bool = False#
- __init__(in_features: int, out_features: int, bias: bool = True, prior_mean: float | Tensor | None = None, prior_std: float | Tensor | None = None, use_softplus: bool = False, manual_reset: bool = False, device=None, dtype=None) None[source]#
- Parameters:
in_features (int) – Size of input features.
out_features (int) – Size of output features.
bias (bool) – Whether to include a bias term.
prior_mean (float or torch.Tensor, optional) – Prior mean.
prior_std (float or torch.Tensor, optional) – Prior std deviation.
use_softplus (bool) – If True, apply softplus to std parameters.
manual_reset (bool) – If True, keep the random state
device (torch.device, optional) – Device to use.
dtype (torch.dtype, optional) – Data type to use.
- Returns:
None
- in_features: int#
- out_features: int#
- map(on: bool = True)[source]#
Switch maximum a posteriori (MAP) mode on or off.
- Parameters:
on (bool) – If True, sets MAP mode on.
- Returns:
None
- static inv_softplus(x: Tensor) Tensor[source]#
Inverse of the softplus function.
- Parameters:
x (torch.Tensor) – Input tensor.
- Returns:
Inverse softplus tensor.
- Return type:
torch.Tensor
- static softplus(x: Tensor) Tensor[source]#
Softplus activation function.
- Parameters:
x (torch.Tensor) – Input tensor.
- Returns:
Softplus tensor.
- Return type:
torch.Tensor
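The relationship between the two static methods can be illustrated with scalar stand-ins. This is a sketch using Python's `math` module rather than tensors; the numerically stable forms below are one common way to write these functions and are not necessarily how the module implements them.

```python
import math


def softplus(x: float) -> float:
    # log(1 + exp(x)), written to avoid overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def inv_softplus(y: float) -> float:
    # Solves softplus(x) = y for y > 0, i.e. x = log(exp(y) - 1),
    # rewritten as y + log(1 - exp(-y)) for stability.
    return y + math.log(-math.expm1(-y))
```

A typical use of the inverse is to pick a desired initial standard deviation and store `inv_softplus(std)` as rho, so that `softplus(rho)` recovers it.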
- mean() tuple[Tensor, Tensor | None][source]#
- Returns:
Mean of the weight distribution and optionally bias distribution.
- Return type:
tuple
- std() tuple[Tensor, Tensor | None][source]#
- Returns:
Standard deviation of the weight distribution and optionally bias distribution.
- Return type:
tuple
- var() tuple[Tensor, Tensor | None][source]#
- Returns:
Variance of the weight distribution and optionally bias distribution.
- Return type:
tuple
- class objectrl.nets.layers.bayesian_layers.BBBLinear(in_features: int, out_features: int, bias: bool = True, prior_mean: float | Tensor | None = None, prior_std: float | Tensor | None = None, use_softplus: bool = False, manual_reset: bool = False, device=None, dtype=None)[source]#
Bases: BayesianLinear
Implements a Bayesian layer following Bayes by Backprop (Blundell et al., 2015). Samples weights and biases during the forward pass from the learned posterior distribution. In MAP mode, only the means are used.
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
bias (bool) – Whether to include a bias term.
prior_mean (float | torch.Tensor | None) – Prior mean for weights.
prior_std (float | torch.Tensor | None) – Prior standard deviation for weights.
use_softplus (bool) – Whether to apply softplus activation to std parameters.
manual_reset (bool) – If True, keep the random state
device (torch.device, optional) – Device to use for the layer.
dtype (torch.dtype, optional) – Data type for the layer parameters.
- in_features#
Number of input features.
- Type:
int
- out_features#
Number of output features.
- Type:
int
- use_softplus#
Whether to apply softplus to std dev parameters.
- Type:
bool
- _manual_reset#
If True, keep the random state
- Type:
bool
- weight_mu#
Mean of the weight distribution.
- Type:
nn.Parameter
- weight_rho#
Rho (transformed std) of the weight distribution.
- Type:
nn.Parameter
- bias_mu#
Mean of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- bias_rho#
Rho of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- prior_mean#
Mean of the prior distribution.
- Type:
torch.Tensor | None
- prior_std#
Standard deviation of the prior distribution.
- Type:
torch.Tensor | None
- class objectrl.nets.layers.bayesian_layers.LRLinear(in_features: int, out_features: int, bias: bool = True, prior_mean: float | Tensor | None = None, prior_std: float | Tensor | None = None, use_softplus: bool = False, manual_reset: bool = False, device=None, dtype=None)[source]#
Bases: BayesianLinear
Implements a Bayesian layer using the local reparameterization trick (Kingma et al., 2015). Instead of sampling weights, it samples output activations using propagated mean and variance. This is more efficient and less noisy than direct weight sampling.
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
bias (bool) – Whether to include a bias term.
prior_mean (float | torch.Tensor | None) – Prior mean for weights.
prior_std (float | torch.Tensor | None) – Prior standard deviation for weights.
use_softplus (bool) – Whether to apply softplus activation to std parameters.
device (torch.device, optional) – Device to use for the layer.
dtype (torch.dtype, optional) – Data type for the layer parameters.
- in_features#
Number of input features.
- Type:
int
- out_features#
Number of output features.
- Type:
int
- use_softplus#
Whether to apply softplus to std dev parameters.
- Type:
bool
- weight_mu#
Mean of the weight distribution.
- Type:
nn.Parameter
- weight_rho#
Rho (transformed std) of the weight distribution.
- Type:
nn.Parameter
- bias_mu#
Mean of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- bias_rho#
Rho of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- prior_mean#
Mean of the prior distribution.
- Type:
torch.Tensor | None
- prior_std#
Standard deviation of the prior distribution.
- Type:
torch.Tensor | None
- class objectrl.nets.layers.bayesian_layers.CLTLinear(*args, act: Literal['relu', 'crelu'] = 'relu', is_input: bool = False, is_output: bool = False, **kwargs)[source]#
Bases: BayesianLinear
Implements a Bayesian layer using central limit theorem (CLT) approximations (Wu et al., 2019; Haussmann, 2021). Supports ReLU and CReLU activations. During the forward pass, propagates mean and variance analytically instead of sampling.
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
bias (bool) – Whether to include a bias term.
prior_mean (float | torch.Tensor | None) – Prior mean for weights.
prior_std (float | torch.Tensor | None) – Prior standard deviation for weights.
use_softplus (bool) – Whether to apply softplus activation to std parameters.
device (torch.device, optional) – Device to use for the layer.
dtype (torch.dtype, optional) – Data type for the layer parameters.
- act#
Activation type (‘relu’ or ‘crelu’).
- Type:
str
- is_input#
Whether this is the input layer.
- Type:
bool
- is_output#
Whether this is the output layer.
- Type:
bool
- __init__(*args, act: Literal['relu', 'crelu'] = 'relu', is_input: bool = False, is_output: bool = False, **kwargs) None[source]#
Initializes the CLTLinear layer.
- Parameters:
act (Literal["relu", "crelu"]) – Activation function to use (‘relu’ or ‘crelu’).
is_input (bool) – Whether this is the input layer.
is_output (bool) – Whether this is the output layer.
- Returns:
None
- static normal_cdf(x, mu: float | Tensor = 0.0, sigma: float | Tensor = 1.0) Tensor[source]#
Computes the cumulative distribution function (CDF) of a normal distribution.
- Parameters:
x (torch.Tensor) – Input tensor.
mu (float or torch.Tensor) – Mean of the normal distribution.
sigma (float or torch.Tensor) – Standard deviation of the normal distribution.
- Returns:
CDF values for the input tensor.
- Return type:
torch.Tensor
- static normal_pdf(x, mu: float | Tensor = 0.0, sigma: float | Tensor = 1.0) Tensor[source]#
Computes the probability density function (PDF) of a normal distribution.
- Parameters:
x (torch.Tensor) – Input tensor.
mu (float or torch.Tensor) – Mean of the normal distribution.
sigma (float or torch.Tensor) – Standard deviation of the normal distribution.
- Returns:
PDF values for the input tensor.
- Return type:
torch.Tensor
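For reference, the normal CDF and PDF can be written with scalar arithmetic as below. These are stand-ins built on Python's `math.erf` and `math.exp`, mirroring the documented signatures but operating on floats rather than tensors.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    # Density of N(mu, sigma^2) at x.
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    # P(X <= x) for X ~ N(mu, sigma^2), expressed through the error function.
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```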
- static relu_moments(mu: Tensor, sigma: Tensor) tuple[Tensor, Tensor][source]#
Computes the mean and variance of the ReLU activation function.
- Parameters:
mu (torch.Tensor) – Mean of the input tensor.
sigma (torch.Tensor) – Standard deviation of the input tensor.
- Returns:
Mean and variance of the ReLU activation.
- Return type:
tuple
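The moment matching behind this method can be sketched with the standard closed-form moments of a rectified Gaussian. The scalar function below assumes the input is distributed as N(mu, sigma^2) with sigma > 0; it is an illustration of the formulas, not the module's tensor implementation.

```python
import math


def relu_moments(mu: float, sigma: float):
    # Mean and variance of max(X, 0) for X ~ N(mu, sigma^2), sigma > 0.
    a = mu / sigma
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))
    mean = mu * cdf + sigma * pdf
    second_moment = (mu * mu + sigma * sigma) * cdf + mu * sigma * pdf
    # Clamp at zero to guard against tiny negative values from rounding.
    var = max(second_moment - mean * mean, 0.0)
    return mean, var
```

For a strongly positive mean the result approaches `(mu, sigma**2)`, and for a strongly negative mean both moments approach zero, as expected for ReLU.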
- static neg_relu_moments(mu: Tensor, sigma: Tensor) tuple[Tensor, Tensor][source]#
Computes the mean and variance of the negative ReLU activation function.
- Parameters:
mu (torch.Tensor) – Mean of the input tensor.
sigma (torch.Tensor) – Standard deviation of the input tensor.
- Returns:
Mean and variance of the negative ReLU activation.
- Return type:
tuple
- static crelu_moments(mu: Tensor, sigma: Tensor) tuple[Tensor, Tensor][source]#
Computes the mean and variance of the CReLU activation function.
- Parameters:
mu (torch.Tensor) – Mean of the input tensor.
sigma (torch.Tensor) – Standard deviation of the input tensor.
- Returns:
Mean and variance of the CReLU activation.
- Return type:
tuple
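A possible reading of the CReLU moments is sketched below. It assumes the usual definition of CReLU as the concatenation of relu(x) and relu(-x), so the second half reuses the ReLU moments with the sign of the mean flipped. That definition, and the list-based layout of the result, are assumptions of this sketch rather than confirmed details of the module.

```python
import math


def relu_moments(mu: float, sigma: float):
    # Mean and variance of max(X, 0) for X ~ N(mu, sigma^2), sigma > 0.
    a = mu / sigma
    pdf = math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))
    mean = mu * cdf + sigma * pdf
    second_moment = (mu * mu + sigma * sigma) * cdf + mu * sigma * pdf
    return mean, max(second_moment - mean * mean, 0.0)


def crelu_moments(mus, sigmas):
    # Positive half: relu(X). Negative half: relu(-X), where -X ~ N(-mu, sigma^2).
    pos = [relu_moments(m, s) for m, s in zip(mus, sigmas)]
    neg = [relu_moments(-m, s) for m, s in zip(mus, sigmas)]
    # Concatenate the halves, doubling the feature dimension.
    means = [m for m, _ in pos] + [m for m, _ in neg]
    variances = [v for _, v in pos] + [v for _, v in neg]
    return means, variances
```

Only marginal means and variances are returned here; the correlation between the two halves is ignored in this sketch.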
- class objectrl.nets.layers.bayesian_layers.CLTLinearDet(*args, act: Literal['relu', 'crelu'] = 'relu', is_input: bool = False, is_output: bool = False, **kwargs)[source]#
Bases: CLTLinear
Deterministic version of CLTLinear. Disables uncertainty modeling by removing the learned standard deviation.
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
bias (bool) – Whether to include a bias term.
prior_mean (float | torch.Tensor | None) – Prior mean for weights.
prior_std (float | torch.Tensor | None) – Prior standard deviation for weights.
use_softplus (bool) – Whether to apply softplus activation to std parameters.
device (torch.device, optional) – Device to use for the layer.
dtype (torch.dtype, optional) – Data type for the layer parameters.
- in_features#
Number of input features.
- Type:
int
- out_features#
Number of output features.
- Type:
int
- use_softplus#
Whether to apply softplus to std dev parameters.
- Type:
bool
- weight_mu#
Mean of the weight distribution.
- Type:
nn.Parameter
- weight_rho#
Rho (transformed std) of the weight distribution.
- Type:
nn.Parameter
- bias_mu#
Mean of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- bias_rho#
Rho of the bias distribution (if bias=True).
- Type:
nn.Parameter | None
- prior_mean#
Mean of the prior distribution.
- Type:
torch.Tensor | None
- prior_std#
Standard deviation of the prior distribution.
- Type:
torch.Tensor | None
- __init(*args, **kwargs)#
- std() tuple[Tensor, Tensor | None][source]#
- Returns:
Standard deviation of the weight distribution and None for bias.
- Return type:
tuple