Created a custom model for image captioning, trained the model, extracted image features, and generated captions for both the dataset images and some external images. Here's a step-by-step summary:
Downloaded the datasets:
flickr8k: https://www.kaggle.com/datasets/adityajn105/flickr8k
stl10: https://www.kaggle.com/datasets/jessicali9530/stl10
Extracted datasets to specific directories.
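A minimal sketch of the extraction step, assuming the Kaggle downloads are zip archives; the archive names and target directories are placeholders, not the actual paths used.

```python
import zipfile

# Placeholder archive names and target directories; adjust to your layout.
for archive, target in [("flickr8k.zip", "data/flickr8k"),
                        ("stl10.zip", "data/stl10")]:
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
```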
Created a custom CNN model for feature extraction. Compiled the model with Adam optimizer and SparseCategoricalCrossentropy loss.
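A minimal Keras sketch of such a CNN; the layer widths, the 256-dim "features" layer, and the 10-class softmax head (STL-10 has 10 classes) are assumptions, not the exact architecture used.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(input_shape=(96, 96, 3), num_classes=10):
    """Small CNN classifier whose penultimate layer doubles as a feature extractor."""
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    features = layers.Dense(256, activation="relu", name="features")(x)
    outputs = layers.Dense(num_classes, activation="softmax")(features)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss=keras.losses.SparseCategoricalCrossentropy(),
                  metrics=["accuracy"])
    return model
```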
Loaded the STL-10 dataset. Preprocessed images. Trained the model with Early Stopping and Learning Rate Reduction callbacks.
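A sketch of the training call, reusing build_cnn from the sketch above and assuming x_train / y_train are STL-10 arrays already loaded from the extracted files; the callback patience values and epoch count are illustrative.

```python
from tensorflow import keras

# Scale pixel values to [0, 1] (x_train, y_train assumed loaded beforehand).
x_train = x_train.astype("float32") / 255.0

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.2,
                                      patience=3, min_lr=1e-6),
]

model = build_cnn()
model.fit(x_train, y_train, validation_split=0.1,
          epochs=50, batch_size=64, callbacks=callbacks)
```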
Extracted features using the custom model. Saved extracted features in a pickle file.
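A sketch of feature extraction and pickling, assuming the trained model above and a dict of preprocessed Flickr8k image arrays keyed by image id; the dict name and output file name are placeholders.

```python
import pickle
from tensorflow import keras

# Reuse the trained CNN up to its "features" layer as the extractor.
extractor = keras.Model(model.input, model.get_layer("features").output)

features = {}
for image_id, image in images.items():  # images: {id: preprocessed array} (assumed)
    features[image_id] = extractor.predict(image[None, ...], verbose=0)[0]

with open("features.pkl", "wb") as f:
    pickle.dump(features, f)
```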
Loaded captions and created mappings. Cleaned captions and tokenized text.
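A sketch of the caption loading and tokenization, assuming the Flickr8k captions.txt layout of one "image,caption" pair per line with a header row; the startseq/endseq markers are the usual convention for this kind of decoder.

```python
import re
from tensorflow.keras.preprocessing.text import Tokenizer

mapping = {}
with open("data/flickr8k/captions.txt") as f:
    next(f)  # skip the header row
    for line in f:
        image_id, caption = line.strip().split(",", 1)
        image_id = image_id.split(".")[0]
        # Lowercase, strip punctuation/digits, wrap with start/end tokens.
        caption = re.sub(r"[^a-z ]", "", caption.lower())
        caption = "startseq " + " ".join(caption.split()) + " endseq"
        mapping.setdefault(image_id, []).append(caption)

all_captions = [c for caps in mapping.values() for c in caps]
tokenizer = Tokenizer()
tokenizer.fit_on_texts(all_captions)
vocab_size = len(tokenizer.word_index) + 1
max_length = max(len(c.split()) for c in all_captions)
```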
Created an encoder-decoder model using LSTM and CNN features. Trained the model with image-caption pairs.
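A sketch of a merge-style encoder-decoder, where the CNN feature vector and the partial caption (run through an embedding and an LSTM) are combined to predict the next word; the 256-unit sizes are assumptions matching the feature sketch above.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_caption_model(vocab_size, max_length, feature_dim=256):
    # Image-feature branch
    img_in = keras.Input(shape=(feature_dim,))
    img = layers.Dropout(0.4)(img_in)
    img = layers.Dense(256, activation="relu")(img)

    # Partial-caption branch
    txt_in = keras.Input(shape=(max_length,))
    txt = layers.Embedding(vocab_size, 256, mask_zero=True)(txt_in)
    txt = layers.Dropout(0.4)(txt)
    txt = layers.LSTM(256)(txt)

    # Merge both branches and predict the next word
    x = layers.add([img, txt])
    x = layers.Dense(256, activation="relu")(x)
    out = layers.Dense(vocab_size, activation="softmax")(x)

    model = keras.Model([img_in, txt_in], out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model
```

Training pairs are typically built by combining each image's feature vector with every prefix of its caption, using the following word as the target.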
Generated captions for test images. Calculated BLEU scores.
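A sketch of greedy caption generation and corpus-level BLEU, assuming the tokenizer, max_length, and caption model from the sketches above; references and predictions are assumed to be lists built over the test split.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from nltk.translate.bleu_score import corpus_bleu

def generate_caption(caption_model, tokenizer, feature, max_length):
    """Greedy decoding: repeatedly predict the next word until endseq."""
    text = "startseq"
    for _ in range(max_length):
        seq = tokenizer.texts_to_sequences([text])[0]
        seq = pad_sequences([seq], maxlen=max_length)
        probs = caption_model.predict([feature[None, :], seq], verbose=0)[0]
        word = tokenizer.index_word.get(int(np.argmax(probs)))
        if word is None or word == "endseq":
            break
        text += " " + word
    return text

# references: list of lists of tokenized reference captions per test image;
# predictions: list of tokenized generated captions (both assumed built beforehand).
bleu1 = corpus_bleu(references, predictions, weights=(1.0, 0, 0, 0))
bleu2 = corpus_bleu(references, predictions, weights=(0.5, 0.5, 0, 0))
```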
Downloaded images from URLs. Extracted features for these images. Generated captions for downloaded images.
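A sketch for captioning external images, assuming the extractor, caption model, tokenizer, and generate_caption helper from the sketches above; the URL and the 96x96 input size are placeholders matching the earlier sketches.

```python
import io
import numpy as np
import requests
from PIL import Image

url = "https://example.com/some_image.jpg"  # placeholder URL

# Download, resize to the CNN's input size, and scale to [0, 1].
img = Image.open(io.BytesIO(requests.get(url, timeout=10).content)).convert("RGB")
arr = np.asarray(img.resize((96, 96)), dtype="float32") / 255.0

feature = extractor.predict(arr[None, ...], verbose=0)[0]
print(generate_caption(caption_model, tokenizer, feature, max_length))
```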