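I am doing research on multi-task learning, and I want to extract feature maps from the head of the network for comparison. I am starting with the YOLOv8n model rather than a ResNet18 model.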
from PIL import Image
from torchvision import transforms
from ultralytics import YOLO

model = YOLO("models/yolov8n.pt")
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=0., std=1.)
])
image = Image.open("images/puppies.jpg")
image = transform(image)
image = image.unsqueeze(0)  # add a batch dimension: (1, 3, 224, 224)
feature_map = model.model.model[15](image)  # raises the RuntimeError below
feature_map.shape
Error: RuntimeError: Given groups=1, weight of size [128, 64, 3, 3], expected input[1, 3, 224, 224] to have 64 channels, but got 3 channels instead
I read up on the YOLOv8 architecture and suspect that the image I pass into the layer is still the raw, unchanged input rather than the transformed tensor that the layer expects (correct me if I'm wrong). The reason is that the code works when I extract features from a layer before any concatenation occurs. I'm pretty sure my code is wrong, but I'm not sure how to implement it correctly.
Remarks: I am referring to the YOLOv8 architecture from RangeKing.
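To illustrate the suspicion above, here is a minimal sketch of one common approach: instead of calling an intermediate layer directly on the image, run the whole network and capture that layer's output with a PyTorch forward hook. TinyNet, save_output, and the features dictionary are hypothetical names invented for this illustration, and TinyNet is only a toy stand-in with one concatenation, not the YOLOv8 architecture.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy network with one concatenation (not YOLOv8)."""

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(3, 8, 3, padding=1),    # 0: takes the raw image
            nn.Conv2d(8, 16, 3, padding=1),   # 1: takes the output of layer 0
            nn.Conv2d(24, 32, 3, padding=1),  # 2: takes concat(out0, out1)
        ])

    def forward(self, x):
        y0 = self.layers[0](x)
        y1 = self.layers[1](y0)
        return self.layers[2](torch.cat([y0, y1], dim=1))


features = {}


def save_output(name):
    """Build a hook that stores the hooked module's output under `name`."""
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook


net = TinyNet().eval()
# Calling net.layers[1](image) directly would fail with a channel mismatch,
# because that layer expects 8 input channels, not 3.
handle = net.layers[1].register_forward_hook(save_output("layer1"))

with torch.no_grad():
    out = net(torch.rand(1, 3, 224, 224))  # full forward pass triggers the hook

handle.remove()  # detach the hook once the feature map has been captured
print(features["layer1"].shape)  # torch.Size([1, 16, 224, 224])
```

The same idea should carry over to the code in the question, assuming model.model.model is an indexable container of nn.Module layers: register the hook on model.model.model[15], pass the image through the full model, and read the stored tensor afterwards. I have not verified this against a specific ultralytics version.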