PyTorch
PyTorch follows an imperative programming model. You can compile a model with
`torch.jit.script`, which transforms Python code into TorchScript. The compiled model can be further optimized and run in an environment without a Python interpreter, and you can save the whole model (code and parameters) to run on other platforms.
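A minimal sketch of that workflow, assuming PyTorch is installed (`TinyNet` and the file name `tiny_net.pt` are illustrative, not from the source):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical two-output model used only to demonstrate scripting."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))

# Compile the Python model into TorchScript.
scripted = torch.jit.script(TinyNet())

# The scripted module runs like the original...
out = scripted(torch.randn(1, 4))
print(out.shape)  # torch.Size([1, 2])

# ...and can be serialized (code + parameters) for deployment
# in an environment without a Python interpreter.
scripted.save("tiny_net.pt")
restored = torch.jit.load("tiny_net.pt")
```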
Architecture
- Frontend: Python
- Pytorch C++ core
- Backend: cuDNN, MKL
- Hardware Interface Layer: CUDA runtime, CPU instructions
- Computing Hardware: Nvidia GPU, CPU
The Python frontend thread waits for the C++ backend thread to finish computing the result. One benefit of this design is that the Python frontend thread does not perform the actual computation, so Python's own performance has little impact on the program's overall performance.
Torch
- `torch.tensor()` - construct a tensor by supplying the exact values for each element, as a Python list of numerical literals
- `torch.arange()` - Returns a 1-D tensor with values from the interval `[start, end)`, taken with common difference `step`, beginning from `start`
- `torch.range()` - same as `torch.arange()`, except `end` is included in the values (deprecated in favor of `torch.arange()`)
- `torch.numel(x)` / `x.numel()` - Returns the total number of elements in the input tensor
- `torch.reshape(x, shape)` / `x.reshape(shape)` - Returns a tensor with the same data and number of elements as `input`, but with the specified shape
- `torch.zeros()` - Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument `size`
- `torch.ones()` - Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument `size`
- `torch.randn()` - Returns a tensor filled with random numbers from a normal distribution with mean 0 and variance 1 (also called the standard normal distribution)
- `torch.normal()` - same as `randn()`, except with customizable mean and standard deviation
- `torch.rand()` - Returns a tensor filled with random numbers from a uniform distribution on the interval `[0, 1)`
- `torch.randint()` - Returns a tensor filled with random integers generated uniformly between `low` (inclusive) and `high` (exclusive)
- `torch.exp(x)` - Returns a new tensor with the exponential of the elements of the input tensor `input`
- `torch.cat()` - Concatenates the given sequence of `tensors` in the given dimension
- `torch.sum(x)` / `x.sum()` - Returns the sum of all elements in the `input` tensor
- `torch.t(x)` - Expects `input` to be a tensor of at most 2 dimensions and transposes dimensions 0 and 1
- `torch.clone(x)` / `x.clone()` - Returns a copy of `input`, allocating new memory
- `torch.mean(x)` / `x.mean()` - Returns the mean value of all elements in the `input` tensor. Input must be floating point or complex
- `torch.cumsum(x, dim)` / `x.cumsum(dim)` - Returns the cumulative sum of elements of `input` in the dimension `dim`
- `torch.dot()` - Computes the dot product of two 1-D tensors
- `torch.mv()` - Performs a matrix-vector product of the matrix `input` and the vector `vec`
- `torch.mul()` - Multiplies `input` by `other` elementwise (Hadamard product)
- `torch.matmul()` / `@` - Matrix product of two tensors
- `torch.mm()` - same as `torch.matmul()`, but without broadcasting
- `torch.norm()` - Returns the matrix norm or vector norm of a given tensor; by default the L2 norm
- `torch.abs(x).sum()` - Returns the L1 norm
- `torch.no_grad` - Context manager that disables gradient calculation
- `torch.save` - Saves an object to a disk file
- `torch.load` - Loads an object saved with `torch.save` from a file
- `torch.squeeze(input)` - Returns a tensor with all dimensions of `input` of size 1 removed
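A few of the operations above in action (values chosen only for illustration):

```python
import torch

x = torch.arange(12, dtype=torch.float32)  # 0, 1, ..., 11
X = x.reshape(3, 4)                        # same data, new shape

print(X.numel())         # 12
print(X.sum())           # tensor(66.)
print(torch.t(X).shape)  # torch.Size([4, 3])

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([4.0, 5.0, 6.0])
print(torch.dot(a, b))    # tensor(32.)  dot product of 1-D tensors
print(torch.mul(a, b))    # tensor([ 4., 10., 18.])  elementwise (Hadamard)
print(torch.cat((a, b)))  # tensor([1., 2., 3., 4., 5., 6.])

# L2 and L1 norms of a vector
v = torch.tensor([3.0, -4.0])
print(torch.norm(v))       # tensor(5.)
print(torch.abs(v).sum())  # tensor(7.)
```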
Tips:
- To automatically infer one component of the shape, we can place a `-1` for the shape component that should be inferred automatically. For example, instead of calling `x.reshape(3, 4)`, we could have equivalently called `x.reshape(-1, 4)` or `x.reshape(3, -1)`.
- broadcasting
- saving memory
  - Running operations can cause new memory to be allocated to host results. For example, if we write `Y = X + Y`, we dereference the tensor that `Y` used to point to and instead point `Y` at the newly allocated memory.
  - Fortunately, performing in-place operations is easy. We can assign the result of an operation to a previously allocated array `Y` by using slice notation: `Y[:] = <expression>`.
- conversion to other Python objects
  - Converting to a NumPy array (`ndarray`), or vice versa, is easy. The torch tensor and NumPy array will share their underlying memory, and changing one through an in-place operation will also change the other.
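The tips above can be sketched as:

```python
import torch

X = torch.arange(12).reshape(3, 4)
# -1 lets PyTorch infer one dimension from the total element count.
assert X.reshape(-1, 4).shape == X.reshape(3, -1).shape == (3, 4)

# Broadcasting: shapes (3, 1) and (1, 2) expand to (3, 2).
a = torch.arange(3).reshape(3, 1)
b = torch.arange(2).reshape(1, 2)
print((a + b).shape)  # torch.Size([3, 2])

# Saving memory: `Y = X + Y` allocates a new tensor,
# but `Y[:] = ...` writes into Y's existing storage.
Y = torch.zeros(3, 4)
before = id(Y)
Y[:] = X + Y
assert id(Y) == before  # same Python object, same memory

# NumPy conversion shares the underlying memory.
n = X.numpy()
X[0, 0] = 99
print(n[0, 0])  # 99: the change is visible through the NumPy array
```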
Torch.Tensor
- `Tensor.shape` - Returns the size of the `self` tensor
- `Tensor.T` - Returns a view of this tensor with its dimensions reversed
- `Tensor.size()` - Returns the size of the `self` tensor
- `Tensor.requires_grad` - Is `True` if gradients need to be computed for this Tensor, `False` otherwise
- `Tensor.requires_grad_()` - Change if autograd should record operations on this tensor: sets this tensor's `requires_grad` attribute in-place
- `Tensor.grad` - This attribute is `None` by default and becomes a Tensor the first time a call to `backward()` computes gradients for `self`
- `Tensor.backward()` - Computes the gradient of the current tensor w.r.t. graph leaves
- `Tensor.detach()` - Returns a new Tensor, detached from the current graph. The result will never require gradient
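A short autograd example tying these attributes together:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x * x).sum()  # y = x1^2 + x2^2 + x3^2

assert x.requires_grad
assert x.grad is None  # no gradient until backward() is called

y.backward()     # dy/dx_i = 2 * x_i
print(x.grad)    # tensor([2., 4., 6.])

# detach() shares data but never requires gradient.
z = x.detach()
assert not z.requires_grad

# no_grad() disables gradient tracking inside the context.
with torch.no_grad():
    w = x * 2
assert not w.requires_grad
```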
Torch.nn
- `torch.nn.Parameter` - A kind of Tensor that is to be considered a module parameter
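A sketch of why wrapping a tensor in `nn.Parameter` matters (`ScaleLayer` is a hypothetical module, not from the source): parameters registered this way appear in `module.parameters()`, so optimizers can find them.

```python
import torch
import torch.nn as nn

class ScaleLayer(nn.Module):
    """Hypothetical module: multiplies its input by a learnable scalar."""
    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensor with the module, so it is
        # returned by .parameters() and receives gradients.
        self.scale = nn.Parameter(torch.tensor(2.0))

    def forward(self, x):
        return self.scale * x

layer = ScaleLayer()
params = list(layer.parameters())
print(len(params))           # 1: the scale parameter was registered
print(layer(torch.ones(3)))  # a tensor of 2.0s that tracks gradients
```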
Torch.nn.Module
- `net.state_dict()` - Returns a dictionary containing references to the whole state of the module
- `net.load_state_dict()` - Copies parameters and buffers from `state_dict` into this module and its descendants
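A typical checkpoint round trip using these two methods together with `torch.save` / `torch.load` (the architecture and file name `net.pt` are illustrative):

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# state_dict() maps parameter/buffer names to tensors.
sd = net.state_dict()
print(list(sd.keys()))  # ['0.weight', '0.bias', '2.weight', '2.bias']

# Save the state_dict, then load it into a freshly constructed
# network with the same architecture.
torch.save(sd, "net.pt")
clone = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
clone.load_state_dict(torch.load("net.pt"))

x = torch.randn(1, 4)
print(torch.equal(net(x), clone(x)))  # True: identical parameters
```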
Torch.cuda
- `torch.cuda.is_available()` - check whether a GPU is available
- `torch.cuda.get_device_name(0)` - get the name of an available GPU
- `torch.cuda.synchronize()` - wait for all kernels in all streams on a CUDA device to complete
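A common device-agnostic pattern built from these calls, which falls back to the CPU when no GPU is present:

```python
import torch

# Pick the GPU when available, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

x = torch.randn(256, 256, device=device)
y = x @ x  # on a CUDA device this kernel is launched asynchronously

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    # Kernels run asynchronously, so synchronize before e.g. timing them.
    torch.cuda.synchronize()

print(y.shape)  # torch.Size([256, 256])
```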