PyTorch pooling: going from [64, 512, 96, 32] to [64, 1024, 12, 8]

You can pool the input with PyTorch's nn.MaxPool2d or nn.AvgPool2d layers.

The following example first applies nn.MaxPool2d to an input of shape [64, 512, 96, 32] and prints the resulting shape:

import torch
import torch.nn as nn

input_tensor = torch.randn(64, 512, 96, 32)
# 8x8 window moved 8 positions at a time, so the windows do not overlap
pooling_layer = nn.MaxPool2d(kernel_size=8, stride=8)
output_tensor = pooling_layer(input_tensor)
print(output_tensor.shape)  # torch.Size([64, 512, 12, 4])

Here MaxPool2d is used with both kernel_size and stride set to 8, so each output value is the maximum over a non-overlapping 8x8 window of each channel's 2D feature map.
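
As a quick check on these shapes, the spatial output size of a pooling layer can be worked out by hand. The sketch below uses the output-size formula given in the PyTorch MaxPool2d documentation, floor((n + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1, and assumes the default ceil_mode=False. The helper name pool_out_size is made up for illustration and is not part of PyTorch.

```python
def pool_out_size(n, kernel_size, stride=None, padding=0, dilation=1):
    """Output length along one spatial dimension, assuming ceil_mode=False."""
    if stride is None:
        stride = kernel_size  # nn.MaxPool2d defaults stride to kernel_size
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

print(pool_out_size(96, kernel_size=8, stride=8))  # 12
print(pool_out_size(32, kernel_size=8, stride=8))  # 4
```

With a width of 32, a kernel and stride of 4 would give 8 columns and a kernel and stride of 2 would give 16, which is useful when choosing a window for a particular target shape.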

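Note that the output shape is [64, 512, 12, 4], not the requested [64, 1024, 12, 8], because MaxPool2d only pools the last two dimensions and leaves the first two unchanged. A reshape alone cannot fix this: a reshape must preserve the number of elements, and [64, 512, 12, 4] holds 1,572,864 values while [64, 1024, 12, 8] needs 6,291,456. One way to reach the target shape is to pool less aggressively along the width, with an 8x2 window, which gives [64, 512, 12, 16] (exactly 6,291,456 values), and then fold each pair of adjacent columns into the channel dimension, for example:

pooling_layer = nn.MaxPool2d(kernel_size=(8, 2), stride=(8, 2))
output_tensor = pooling_layer(input_tensor)  # [64, 512, 12, 16]
# Split the width of 16 into 8 positions x 2 adjacent columns
output_tensor = output_tensor.reshape(64, 512, 12, 8, 2)
# Move the factor of 2 next to the channel dimension: [64, 512, 2, 12, 8]
output_tensor = output_tensor.permute(0, 1, 4, 2, 3)
# Merge 512 x 2 into 1024 channels
output_tensor = output_tensor.reshape(64, 1024, 12, 8)
print(output_tensor.shape)  # torch.Size([64, 1024, 12, 8])

Here the pooling layer uses an 8x2 window, and reshape and permute move the extra factor of 2 in the width into the channel dimension, turning 512 channels into 1024. No pooled value is lost; the values are only rearranged, so output channel 2*c + j at column w holds pooled channel c at column 2*w + j.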
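
To see what a reshape/permute fold of this kind does to the data, it can help to run it on a tiny tensor and compare individual slices. The following is only a sketch under stated assumptions: the helper pool_and_fold_width is a hypothetical name, the 8x2 pooling window is one choice that makes the element counts match the target, and the small [2, 4, 96, 32] input stands in for the real [64, 512, 96, 32] tensor to keep the check cheap.

```python
import torch
import torch.nn as nn

def pool_and_fold_width(x, fold=2):
    """Max-pool by (8, 2), then move `fold` adjacent columns into channels."""
    n, c = x.shape[0], x.shape[1]
    y = nn.MaxPool2d(kernel_size=(8, 2), stride=(8, 2))(x)
    h, w = y.shape[2], y.shape[3]
    z = y.reshape(n, c, h, w // fold, fold)   # split width into (w // fold, fold)
    z = z.permute(0, 1, 4, 2, 3)              # bring the fold factor next to channels
    z = z.reshape(n, c * fold, h, w // fold)  # merge channels with the fold factor
    return y, z

x = torch.randn(2, 4, 96, 32)
pooled, folded = pool_and_fold_width(x)
print(pooled.shape)  # torch.Size([2, 4, 12, 16])
print(folded.shape)  # torch.Size([2, 8, 12, 8])

# Output channel 2*c + j at column w holds pooled channel c at column 2*w + j.
assert torch.equal(folded[:, 2 * 1 + 1, :, 3], pooled[:, 1, :, 2 * 3 + 1])
```

If the extra channels should instead be learned features rather than rearranged pixels, a different design (for example a convolution that changes the channel count) would be needed; pooling by itself never changes the number of channels.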
