Skip to main content

Dataset

Imagine you're preparing for a big test, and you want to do really well. To get ready, you decide to practice with some old test papers. These old tests are your study material. Now, you have two sets of old tests: one set that you use to practice and learn from, and another set that you use to check how well you're actually doing.

In computer vision, when we're teaching a computer to recognize things in pictures, we also need sets of examples. These examples are like the questions on your old test papers. We divide these examples into two groups:

Training Dataset: This is like your practice set of old test papers. It's where the computer learns from the examples. Just like you learn from solving questions and figuring out the answers, the computer learns from looking at the pictures and understanding the patterns in them.

Validation Dataset: This is like your checking set of old test papers. After practicing a lot with the training dataset, you want to see how well you've learned. So, you use the validation dataset to test yourself. Similarly, the computer uses the validation dataset to test how well it's learned to recognize things in pictures.

By using a separate validation dataset, you can see if the computer has really understood the concepts or if it's making mistakes. If the computer is doing well on the validation examples, it means it's learned correctly. If not, you might need to adjust how it's learning so it can improve.

So, just like you study with old test papers and check your understanding with a separate set of questions, in computer vision, we use a training dataset to teach the computer and a validation dataset to check how well it's learning.