Vision Transformer (ViT) Tutorial - Part 3: Pretraining, Transfer Learning & Real-World Applications
Let's start: https://hackmd.io/@husseinsheikho/vit-3
Our Telegram channels: https://t.iss.one/addlist/0f6vfFbEMdAwODBk
#VisionTransformer #TransferLearning #HuggingFace #ImageNet #FineTuning #AI #DeepLearning #ComputerVision #Transformers #ModelZoo
✨ AI for Healthcare: Fine-Tuning Google's PaliGemma 2 for Brain Tumor Detection ✨
Table of Contents: AI for Healthcare: Fine-Tuning Google's PaliGemma 2 for Brain Tumor Detection; Configuring Your Development Environment; Setup and Imports; Load the Brain Tumor Dataset; Format Dataset to PaliGemma Format; Display Train Image and Label; COCO Format BBox to…
🏷️ #FineTuning #ObjectDetection #PaliGemma2 #PEFT #QLoRA #Transformers #Tutorial #VisionLanguageModels
# Create a copy of the original image to draw on
output_img = img.copy()
# Draw a bounding box for each detected bottle
for box in bottle_boxes:
    x1, y1, x2, y2 = map(int, box)
    # Draw a green rectangle around each bottle
    cv2.rectangle(output_img, (x1, y1), (x2, y2), (0, 255, 0), 2)
# Add the final count as text on the image
summary_text = f"Bottle Count: {bottle_count}"
cv2.putText(output_img, summary_text, (20, 50),
            cv2.FONT_HERSHEY_SIMPLEX, 1.5, (0, 0, 255), 4)
# Save the resulting image
cv2.imwrite('factory_bottles_result.jpg', output_img)
print("Result image with detections has been saved as 'factory_bottles_result.jpg'")
---
Step 6: Discussion of Results and Limitations
#Discussion #Limitations #FineTuning
Result: The code successfully uses a pre-trained YOLOv8 model to identify and count standard plastic bottles in an image. The final output provides both a numerical count and a visual confirmation of the detections.
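The counting step described above can be sketched in isolation. This is a minimal illustration, not the tutorial's own code: it assumes the detector output has already been reduced to plain lists of boxes, class IDs and confidence scores. The helper name filter_boxes, the sample detections and the 0.5 threshold are hypothetical; 39 is assumed to be the index of "bottle" in the 80-class COCO label list used by the stock YOLOv8 checkpoints.

```python
BOTTLE_CLASS_ID = 39  # assumed index of "bottle" in the 80-class COCO label list


def filter_boxes(boxes_xyxy, class_ids, scores,
                 target_class=BOTTLE_CLASS_ID, min_conf=0.5):
    """Keep only boxes of target_class whose confidence is at least min_conf."""
    return [
        box
        for box, cls, score in zip(boxes_xyxy, class_ids, scores)
        if int(cls) == target_class and score >= min_conf
    ]


# Hypothetical detections: two confident bottles, one cup (class 41),
# and one bottle below the confidence threshold.
boxes = [[10, 20, 50, 120], [60, 22, 100, 118], [200, 40, 240, 90], [300, 30, 330, 110]]
class_ids = [39, 39, 41, 39]
scores = [0.91, 0.84, 0.77, 0.30]

bottle_boxes = filter_boxes(boxes, class_ids, scores)
bottle_count = len(bottle_boxes)
print(f"Bottle Count: {bottle_count}")  # prints "Bottle Count: 2"
```

With the ultralytics package, the three input lists would typically come from results[0].boxes.xyxy, results[0].boxes.cls and results[0].boxes.conf. Raising or lowering min_conf trades missed bottles against false detections.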
Limitations of Pre-trained Model:
1. Occlusion: If bottles are tightly clustered or hidden behind one another, the model might miss some, leading to an undercount.
2. Unusual Shapes: The model is trained on common bottles (from the COCO dataset). If your factory produces bottles with an unusual shape or color, the model's accuracy might decrease.
3. Environmental Factors: Poor lighting, motion blur (for example, from a fast conveyor belt), or reflections can all reduce detection performance.
How to Improve (Next Steps): For a real-world, high-accuracy industrial application, you should not rely on a generic pre-trained model. The best approach is Fine-Tuning. This involves:
1. Collecting Data: Take hundreds or thousands of pictures of your specific bottles in your actual factory environment.
2. Annotating Data: Draw bounding boxes around every bottle in those images.
3. Training: Use this custom dataset to train (or "fine-tune") the YOLOv8 model. This teaches the model exactly what to look for in your specific use case, leading to much higher accuracy and reliability.
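The annotation and training steps above can be sketched as follows. This is a rough outline under stated assumptions, not a definitive recipe: to_yolo_label is a hypothetical helper that converts one pixel-space box into the normalized "class cx cy w h" text line used by YOLO-style label files, and bottles.yaml is a placeholder name for a dataset config you would write yourself. The fine_tune function assumes the ultralytics package is installed and is defined here but not called.

```python
def to_yolo_label(box_xyxy, img_w, img_h, class_id=0):
    """Convert a pixel box (x1, y1, x2, y2) into a YOLO label line.

    YOLO label files store one object per line as
    "class_id x_center y_center width height", all normalized to [0, 1].
    """
    x1, y1, x2, y2 = box_xyxy
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def fine_tune(data_yaml="bottles.yaml", epochs=50):
    """Fine-tune a pretrained YOLOv8 checkpoint on a custom dataset.

    data_yaml and epochs are placeholder values; tune them for your data.
    """
    from ultralytics import YOLO  # imported lazily so the helper above runs without it

    model = YOLO("yolov8n.pt")  # start from pretrained weights
    return model.train(data=data_yaml, epochs=epochs, imgsz=640)


# One annotated bottle in a hypothetical 1000x800 image
label = to_yolo_label((100, 200, 300, 600), img_w=1000, img_h=800)
print(label)  # prints "0 0.200000 0.500000 0.200000 0.500000"
```

Annotation tools can usually export this label format directly, so the conversion helper matters mostly when boxes come from another source, such as the pixel coordinates drawn in Step 5.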
━━━━━━━━━━━━━━━
By: @DataScienceM ✨